Tesseract-ocr

The Tesseract OCR engine was one of the top three engines in the 1995 UNLV Accuracy Test. It saw little development between 1995 and 2006, but it is probably still one of the most accurate open source OCR engines available. It reads a binary, greyscale, or color image and outputs text. A built-in TIFF reader handles uncompressed TIFF images; libtiff can be added to read compressed ones.



http://code.google.com/p/tesseract-ocr/
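As a rough illustration of the image-in, text-out workflow described above, the following Python sketch wraps the Tesseract command-line tool. It assumes a `tesseract` executable is installed and on the PATH, and that it is invoked as `tesseract <image> <outputbase>`, writing its result to `<outputbase>.txt`. The helper names `build_tesseract_command` and `run_ocr` and the file names are hypothetical, chosen only for this example.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base):
    """Return the argument list for `tesseract <image> <outputbase>`."""
    return ["tesseract", str(image_path), str(output_base)]


def run_ocr(image_path, output_base):
    """Run Tesseract on one image and return the recognized text.

    Assumption: the `tesseract` executable is installed and on PATH.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image_path, output_base), check=True)
    # Tesseract appends ".txt" to the output base name it is given.
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

With an uncompressed TIFF named `page.tif` (a hypothetical file), `run_ocr("page.tif", "page")` would return the contents of `page.txt`.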




Related Products

Tessnet2

A .NET 2.0 open source OCR assembly using the Tesseract engine.


JavaOCR

Java OCR is an Optical Character Recognition algorithm based on a mean squared recognizer. This tool also includes utilities to trace and extract characters.


GOCR

GOCR is an OCR (Optical Character Recognition) program, developed under the GNU Public License. It converts scanned images of text back to text files. Joerg Schulenburg started the program, and now leads a team of developers.


OCRopus

OCRopus is an open source document analysis and OCR system featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multilingual capabilities.


PDFBox - Java PDF library

Apache PDFBox is an open source Java library for working with PDF documents. It allows creation of new PDF documents, manipulation of existing documents, and extraction of content from documents. It supports bookmarks, fonts, text extraction, encryption, PDF printing, and more.


PDFClown - PDF library

PDFClown is a PDF library that helps to generate, read, and edit PDF documents, and to split and merge them. It supports images, fonts, barcodes, bookmarks, annotations, form fields (checkbox, button, list box, etc.), compression, and text extraction.


GraphicsMagick

GraphicsMagick is the Swiss Army knife of image processing. It provides a robust and efficient collection of tools and libraries that support reading, writing, and manipulating images in over 88 major formats, including DPX, GIF, JPEG, JPEG-2000, PNG, PDF, PNM, and TIFF.


Visualization Toolkit

The Visualization Toolkit (VTK) is an open-source, freely available software system for 3D computer graphics, image processing and visualization. VTK supports a wide variety of visualization algorithms including: scalar, vector, tensor, texture, and volumetric methods; and advanced modeling techniques such as: implicit modeling, polygon reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation.


TCPDF - PHP class for generating PDF

TCPDF is a PHP class for generating PDF documents without requiring external extensions. TCPDF supports UTF-8, Unicode, RTL languages, XHTML, JavaScript, digital signatures, barcodes, and much more.


Spyder

Spyder is a Python development environment with advanced editing, interactive testing, debugging and introspection features. It is especially recommended for scientific computing thanks to NumPy (linear algebra), SciPy (signal and image processing), matplotlib (interactive 2D/3D plotting) and MayaVi’s mlab (interactive 3D visualization) support.







