pyocr - A Python wrapper for Tesseract and Cuneiform


PyOCR is an optical character recognition (OCR) tool wrapper for Python. That is, it helps you use various OCR tools from a Python program. It has been tested only on GNU/Linux systems. It should also work on similar systems (*BSD, etc.). It may or may not work on Windows, Mac OS X, etc.

https://github.com/openpaperwork/pyocr
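
As a rough illustration of what "using an OCR tool from a Python program" can look like, here is a minimal sketch. It assumes PyOCR, Pillow, and at least one backend (Tesseract or Cuneiform) are installed. The helper name `ocr_image`, the default language code "eng", and the choice of the first detected tool are assumptions made for this example, not part of PyOCR.

```python
import os
from typing import Optional


def ocr_image(path: str, lang: str = "eng") -> Optional[str]:
    """Return the text recognized in the image at `path`.

    Returns None when PyOCR, Pillow, or an OCR backend is not installed.
    `ocr_image` and its defaults are hypothetical, chosen for illustration.
    """
    # Fail early on a bad path, before touching any OCR backend.
    if not os.path.isfile(path):
        raise FileNotFoundError(path)

    try:
        # Imported lazily so a missing dependency is reported as "unavailable".
        import pyocr
        import pyocr.builders
        from PIL import Image
    except ImportError:
        return None

    # List the OCR backends PyOCR could locate on this system.
    tools = pyocr.get_available_tools()
    if not tools:
        return None

    # Assumption: simply take the first detected tool.
    tool = tools[0]
    with Image.open(path) as image:
        return tool.image_to_string(
            image, lang=lang, builder=pyocr.builders.TextBuilder()
        )
```

Calling `ocr_image("scan.png")` (a hypothetical file name) returns the recognized text as a string, or None if no OCR tool could be found.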





Related Projects

tesseract-ocr-for-php - A wrapper to work with Tesseract OCR inside PHP.


A wrapper to work with Tesseract OCR inside PHP. This library depends on Tesseract OCR, version 3.03 or later.

ruby-tesseract-ocr - A Ruby wrapper library to the tesseract-ocr API.


This wrapper binds the TessBaseAPI object through ffi-inline (which means it will work on JRuby too) and then proceeds to wrap said API in a more ruby-esque Engine class. To make this library work you need tesseract-ocr and leptonica libraries and headers and a C++ compiler.

pdfocr - Adds text to PDF files using the cuneiform OCR software


pdfocr adds an OCR text layer to scanned PDF files, allowing them to be searched. It currently depends on Ruby 1.8.7 or above, and uses ocropus, cuneiform, or tesseract for performing OCR. For more details, see the manpage.

gosseract - Go package for OCR (Optical Character Recognition) using the Tesseract C++ library


Golang OCR package using the Tesseract C++ library. Check the Dockerfile for installation details, or just try it with docker run -it --rm otiai10/gosseract.

pytesseract - A Python wrapper for Google Tesseract


Python-tesseract is an optical character recognition (OCR) tool for Python. That is, it will recognize and "read" the text embedded in images. Python-tesseract is a wrapper for Google's Tesseract-OCR Engine. It is also useful as a stand-alone invocation script for tesseract, as it can read all image types supported by the Python Imaging Library, including jpeg, png, gif, bmp, tiff, and others, whereas tesseract-ocr by default only supports tiff and bmp. Additionally, if used as a script, Python-tesseract will print the recognized text instead of writing it to a file.


android-ocr - Experimental optical character recognition app


An experimental app for Android that performs optical character recognition (OCR) on images captured using the device camera. Runs the Tesseract OCR engine using tess-two, a fork of Tesseract Tools for Android.

Tesseract-iPhone-Demo - Demo iPhone app utilizing the tesseract library for OCR


OCRDemo is a demo application that utilizes the Tesseract library (http://code.google.com/p/tesseract-ocr/) as a static library compiled under Mac OS 10.6 using the shell script found at http://robertcarlsen.net/2009/07/15/cross-compiling-for-iphone-dev-884. The program is only meant to demonstrate the OCR library and its abilities on the iPhone; it is not optimized in any way.

tess4j - Java JNA wrapper for Tesseract OCR API


A Java JNA wrapper for Tesseract OCR API. Tess4J is released and distributed under the Apache License, v2.0. The library provides optical character recognition (OCR) support for TIFF, JPEG, GIF, PNG, and BMP image formats; multi-page TIFF images; and the PDF document format.

tess-two - Fork of Tesseract Tools for Android


A fork of Tesseract Tools for Android (tesseract-android-tools) that adds some additional functions. Tesseract Tools for Android is a set of Android APIs and build files for the Tesseract OCR and Leptonica image processing libraries. The source code for these dependencies is included within the tess-two/jni folder.

rtesseract - Ruby library for working with the Tesseract OCR.


Ruby library for working with the Tesseract OCR. Note: version 1.0.0 works with Ruby 2.0 and tesseract 3.0; lower versions of rtesseract work with Ruby 1.8 and tesseract 2.0.4.

tesseract-ios - Tesseract OCR for iOS


Tesseract-ios is an Objective-C wrapper for Tesseract OCR. This project couldn't exist without Ângelo Suzuki's blog post. A lot of code came from his article.

tesseract - A .NET wrapper for tesseract-ocr


A .NET wrapper for tesseract-ocr 3.04. Since the tesseract and leptonica binaries are compiled with Visual Studio 2015, you'll need to ensure you have the Visual Studio 2015 Runtime installed.

ANPR - License plate recognition for iOS using OpenCV & Tesseract OCR Engine


Source code for a license plate recognition (ANPR) demo for iOS using OpenCV and the Tesseract OCR engine. I'm open to any pull request that can improve this project.

Tessnet2


A .NET 2.0 Open Source OCR assembly using Tesseract engine.

Pocket-OCR


Demonstration of Tesseract OCR on the iPhone. Video: http://www.youtube.com/v/MICew5-nZp4?hl=en_US&fs=1

receipt-parser - A fuzzy (supermarket) receipt parser written in Python using tesseract


This is a fuzzy receipt parser written in Python. You give it any dirty old receipt lying around and it will try its best to find the correct data for you. It started as a hackathon project. Read more about it on the trivago techblog. Also read the comments on Hacker News. There's also a talk online now if you're the visual kind of person.

Puma.NET


OCR in .NET. Puma.NET is a wrapper library for the Cognitive Technologies CuneiForm recognition engine that makes it easy to incorporate OCR functionality in any .NET Framework 2.0 (or higher) application. The API is provided through a number of simple classes.

baidu-ocr - Baidu OCR text recognition API for Node.js



Akshara Malayalam OCR


Akshara Malayalam OCR is a project to develop an OCR for printed and handwritten documents in the Malayalam language. It is inspired by similar OCR software for other languages.

JavaOCR


Java OCR is an Optical Character Recognition algorithm based on a mean squared recognizer. This tool also includes utilities to trace and extract characters.