OCRFeeder - A complete document layout analysis and OCR system for GNU/Linux
Please notice

OCRFeeder's official web page is http://live.gnome.org/OCRFeeder. All news will be posted there from now on, which makes the project page on Google Code deprecated.

OCRFeeder

OCRFeeder is a document layout analysis and optical character recognition system. Given a set of images, it automatically outlines their contents, distinguishes between graphics and text, and performs OCR on the latter. It generates multiple output formats, the main one being ODT. It features a complete GTK graphical user interface that allows users to correct any unrecognized characters, define or correct bounding boxes, set paragraph styles, clean the input images, import PDFs, save and load the project, export everything to multiple formats, etc.

OCRFeeder was developed as the project of the Master's Thesis in Computer Science of Joaquim Rocha.

Check the program in action here: http://www.vimeo.com/3760126

NEWS

2009/11/06: OCRFeeder v0.4 released
2009/10/16: OCRFeeder v0.3 released
2009/10/05: OCRFeeder has changed its development to Gitorious.
2009/05/10: Released first tarball version.
2009/03/18: OCRFeeder has been released today to the general public (checkout SVN). After a long wait, finally the initial commit to the public SVN.
The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but it is probably one of the most accurate open source OCR engines available. The source code will read a binary, grey or color image and output text. A TIFF reader is built in that will read uncompressed TIFF images, or libtiff can be added to read compressed images.
GNOME Office consists of a set of applications, such as a word processor, spreadsheet, presentation, graphics and database programs.
The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research although it has been used for numerous other applications including research into speech synthesis, character recognition and DNA sequencing. HTK is in use at hundreds of sites worldwide.
GOCR is an OCR (Optical Character Recognition) program, developed under the GNU General Public License. It converts scanned images of text back to text files. Joerg Schulenburg started the program, and now leads a team of developers.
Provides optical character recognition (OCR) solutions for the Vietnamese language.
OpenOffice is the leading open-source office software suite for word processing, spreadsheets, presentations, graphics, databases and more. It is available in many languages and works on all common computers. It stores all your data in an international open standard format and can also read and write files from other common office software packages.
This project is a set of Ruby language bindings for the various application development libraries included with the GNOME/GTK+ environment. This project is for GTK+2.0 or later.
Ekiga (formerly known as GnomeMeeting) is an open source softphone, video conferencing and instant messenger application for use over the Internet. It provides free audio and video calls through the Internet. It supports standard telephony features like Call Hold, Call Transfer, Call Forwarding, Call History and Call Monitoring.
The Visualization Toolkit (VTK) is an open-source, freely available software system for 3D computer graphics, image processing and visualization. VTK supports a wide variety of visualization algorithms including: scalar, vector, tensor, texture, and volumetric methods; and advanced modeling techniques such as: implicit modeling, polygon reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation.
odt2braille is a Braille extension to OpenOffice.org Writer. odt2braille enables authors to print documents to a Braille embosser and to export documents as Braille files. The Braille output is well-formatted and highly customizable.