pyocr - A Python wrapper for Tesseract and Cuneiform


PyOCR is an optical character recognition (OCR) tool wrapper for Python. That is, it helps you use various OCR tools from a Python program. It has been tested only on GNU/Linux systems. It should also work on similar systems (*BSD, etc.). It may or may not work on Windows, Mac OS X, etc.
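As a rough illustration of what using an OCR tool from a Python program can look like, here is a minimal sketch. It assumes that pyocr and Pillow are installed and that at least one OCR engine (Tesseract or Cuneiform) is available on the system. The helper names `first_available_tool` and `recognize` are made up for this example and are not part of PyOCR.

```python
def first_available_tool(tools):
    """Return the first entry of a tool list, or None if the list is empty."""
    return tools[0] if tools else None


def recognize(image_path, lang="eng"):
    """Return the text found in an image, using the first OCR tool PyOCR detects."""
    # Imports are deferred so this module still loads when pyocr or
    # Pillow is not installed (assumption: both are third-party packages).
    import pyocr
    import pyocr.builders
    from PIL import Image

    # get_available_tools() lists the OCR backends PyOCR found on this system.
    tool = first_available_tool(pyocr.get_available_tools())
    if tool is None:
        raise RuntimeError("No OCR tool found; install Tesseract or Cuneiform")

    with Image.open(image_path) as img:
        # TextBuilder asks for plain text output.
        return tool.image_to_string(
            img, lang=lang, builder=pyocr.builders.TextBuilder()
        )
```

Calling `recognize("scan.png")` would return the recognized text as a string. The file name is hypothetical, and the `lang` value must match a language installed for the underlying engine.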



Related Projects

EasyOCR - Java OCR recognition component (based on the Tesseract OCR engine). It automatically cleans up images and recognizes the content of CAPTCHA images in one integrated step.


EasyOCR is a Java OCR recognition component based on the Tesseract engine. Through a few simple APIs, Java programs can recognize the content of images. It integrates image cleanup with recognition of CAPTCHA images, bills, notes, and similar content. The EasyOCR engine supports plugin development and ETD templates, and provides a graphical ETD template design tool (EasyTemplateDesigner GUI). EasyOCR not only serves consumers, but is mainly oriented toward providing a localized development SDK for integration with C/S and B/S projects and native Android mobile projects.

tesseract-ocr-for-php - A wrapper to work with Tesseract OCR inside PHP.

  •    PHP

A wrapper to work with Tesseract OCR inside PHP. ‼️ This library depends on Tesseract OCR, version 3.03 or later.

ruby-tesseract-ocr - A Ruby wrapper library to the tesseract-ocr API.

  •    Ruby

This wrapper binds the TessBaseAPI object through ffi-inline (which means it will work on JRuby too) and then proceeds to wrap said API in a more ruby-esque Engine class. To make this library work you need tesseract-ocr and leptonica libraries and headers and a C++ compiler.

pdfocr - Adds text to PDF files using the cuneiform OCR software

  •    Ruby

pdfocr adds an OCR text layer to scanned PDF files, allowing them to be searched. It currently depends on Ruby 1.8.7 or above, and uses ocropus, cuneiform, or tesseract for performing OCR. For more details, see the manpage.

tesseract - Tesseract Open Source OCR Engine (main repository)

  •    C++

This package contains an OCR engine - libtesseract and a command line program - tesseract. The lead developer is Ray Smith. The maintainer is Zdenko Podobny. For a list of contributors see AUTHORS and GitHub's log of contributors.

gosseract - Go package for OCR (Optical Character Recognition), by using Tesseract C++ library

  •    Go

Go OCR package that uses the Tesseract C++ library. See the Dockerfile for installation details, or just try it with docker run -it --rm otiai10/gosseract.

pytesseract - A Python wrapper for Google Tesseract

  •    Python

Python-tesseract is an optical character recognition (OCR) tool for Python. That is, it will recognize and "read" the text embedded in images. Python-tesseract is a wrapper for Google's Tesseract-OCR Engine. It is also useful as a stand-alone invocation script for tesseract, as it can read all image types supported by the Python Imaging Library, including jpeg, png, gif, bmp, tiff, and others, whereas tesseract-ocr by default only supports tiff and bmp. Additionally, if used as a script, Python-tesseract will print the recognized text instead of writing it to a file.
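To make the idea of a thin wrapper around the tesseract command line concrete, here is a minimal sketch using only the Python standard library. It is not Python-tesseract's actual implementation. The function names are hypothetical, and it assumes a `tesseract` executable on the PATH that accepts `stdout` as its output base (so text is printed rather than written to a file). Unlike Python-tesseract, it does no image conversion, so the input must already be in a format tesseract can read.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for a tesseract call that prints to stdout."""
    # Using "stdout" as the output base asks tesseract to print the
    # recognized text instead of writing it to a file.
    return ["tesseract", image_path, "stdout", "-l", lang]


def image_file_to_text(image_path, lang="eng"):
    """Run tesseract on an image file and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError on failure
    )
    return result.stdout
```

Building the command in a separate function keeps the part that needs no OCR engine easy to inspect and test on its own.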

open-ocr - Run your own OCR-as-a-Service using Tesseract and Docker

  •    Go

OpenOCR makes it simple to host your own OCR REST API. The heavy lifting OCR work is handled by Tesseract OCR.

android-ocr - Experimental optical character recognition app

  •    Java

An experimental app for Android that performs optical character recognition (OCR) on images captured using the device camera. Runs the Tesseract OCR engine using tess-two, a fork of Tesseract Tools for Android.

gImageReader - A Gtk/Qt front-end to tesseract-ocr.

  •    C++

gImageReader is a simple Gtk/Qt front-end to tesseract-ocr. The steps for compiling gImageReader from source are documented in the wiki.

Tesseract-iPhone-Demo - Demo iPhone app utilizing the tesseract library for OCR

  •    C++

OCRDemo is a demo application that uses the Tesseract library as a static library, compiled under Mac OS 10.6 using a shell script. The program is only meant to demonstrate the OCR library and its abilities on the iPhone; it is not optimized in any way.

tess4j - Java JNA wrapper for Tesseract OCR API

  •    Java

Tess4J is a Java JNA wrapper for the Tesseract OCR API. It is released and distributed under the Apache License, v2.0. The library provides optical character recognition (OCR) support for TIFF, JPEG, GIF, PNG, and BMP image formats, multi-page TIFF images, and the PDF document format.

tess-two - Fork of Tesseract Tools for Android

  •    C

A fork of Tesseract Tools for Android (tesseract-android-tools) that adds some additional functions. Tesseract Tools for Android is a set of Android APIs and build files for the Tesseract OCR and Leptonica image processing libraries. The source code for these dependencies is included within the tess-two/jni folder.

rtesseract - Ruby library for working with the Tesseract OCR.

  •    Ruby

Ruby library for working with the Tesseract OCR. Attention: version 1.0.0 works with Ruby 2.0 and tesseract 3.0, while lower versions of rtesseract work with Ruby 1.8 and tesseract 2.0.4.

tesseract-ios - Tesseract OCR for iOS

  •    C++

Tesseract-ios is an Objective-C wrapper for Tesseract OCR. This project couldn't exist without Ângelo Suzuki's blog post. A lot of the code came from his article.

node-tesseract - A simple wrapper for the Tesseract OCR package

  •    Javascript

There is a hard dependency on the Tesseract project. You can find installation instructions for various platforms on the project site. For Homebrew users, the installation is quick and easy.

tesseract - A .NET wrapper for tesseract-ocr

  •    CSharp

A .NET wrapper for tesseract-ocr 3.04. Since tesseract and leptonica binaries are compiled with Visual Studio 2015 you'll need to ensure you have the Visual Studio 2015 Runtime installed.

iPhone-OCR-Tesseract-and-OpenCV - Simple academic project made using OpenCV and Tesseract

  •    Objective-C

This is a sample project created by me (@PablosPoject) and @_AJ_R for academic purposes. It uses the OpenCV framework and tutorial made by BloodAxe and some utility classes made by Aptogo. It also uses the Tesseract OCR engine to read the text processed with OpenCV. I also built a simple user interface that lets you take a photo or choose one from the library, and then apply either each individual image-processing step or all of the processing at once.

ANPR - License plate recognition for iOS using OpenCV & Tesseract OCR Engine

  •    C++

Source code for a license plate recognition (ANPR) demo for iOS using OpenCV and the Tesseract OCR engine. I'm open to any pull request that can improve this project.

captcha-break - captcha break based on opencv2, tesseract-ocr and some machine learning algorithm.

  •    C++

CAPTCHA breaking based on opencv2, tesseract-ocr, and some machine learning algorithms. The simplest approach to CAPTCHA breaking.