Displaying 1 to 20 from 93 results

paperwork - Personal document manager (Linux/Windows)

  •    Python

Paperwork is a personal document manager. It manages scanned documents and PDFs.It's designed to be easy and fast to use. The idea behind Paperwork is "scan & forget": You can just scan a new document and forget about it until the day you need it again.

pyocr - A Python wrapper for Tesseract and Cuneiform

  •    Python

PyOCR is an optical character recognition (OCR) tool wrapper for python. That is, it helps using various OCR tools from a Python program.It has been tested only on GNU/Linux systems. It should also work on similar systems (*BSD, etc). It may or may not work on Windows, MacOSX, etc.

ocrad.js - OCR in Javascript via Emscripten

  •    Javascript

As with any minor stepping stone on the road to hell relentless trajectory of Atwood's Law, I probably don't need to justify the existence of yet another "x, but now in Javascript!", but I might as well try. After all, we all would like to think that there's some ulterior motive to fulfilling that prophecy. On tablet or other touchscreen devices- of which there are quite a number of nowadays (as the New Year's Eve post, I am obliged to include conjecture about the technological zeitgeist), a library such as Ocrad.js might be used to add handwriting input in a device and operating system agnostic manner. Oftentimes, capturing the strokes and sending them over to a server to process might entail unacceptably high latency. Maybe you're working on an offline-capable note-taking app, or a browser extension which indexes all the doge memes that you stumble upon while prawling the dark corners of the internet.

tesseract-ocr-for-php - A wrapper to work with Tesseract OCR inside PHP.

  •    PHP

A wrapper to work with Tesseract OCR inside PHP. ‼️ This library depends on Tesseract OCR, version 3.03 or later.

android-ocr - Experimental optical character recognition app

  •    Java

An experimental app for Android that performs optical character recognition (OCR) on images captured using the device camera. Runs the Tesseract OCR engine using tess-two, a fork of Tesseract Tools for Android.

tess-two - Fork of Tesseract Tools for Android

  •    C

A fork of Tesseract Tools for Android (tesseract-android-tools) that adds some additional functions. Tesseract Tools for Android is a set of Android APIs and build files for the Tesseract OCR and Leptonica image processing libraries. The source code for these dependencies is included within the tess-two/jni folder.

card.io-Android-SDK - card.io provides fast, easy credit card scanning in mobile apps

  •    Java

card.io provides fast, easy credit card scanning in mobile apps. Please be sure to keep your app up to date with the latest version of the SDK. All releases follow semantic versioning.

tesseract - Tesseract Open Source OCR Engine (main repository)

  •    C++

This package contains an OCR engine - libtesseract and a command line program - tesseract. The lead developer is Ray Smith. The maintainer is Zdenko Podobny. For a list of contributors see AUTHORS and GitHub's log of contributors.

crnn - Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.

  •    Lua

This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC loss for image-based sequence recognition tasks, such as scene text recognition and OCR. For details, please refer to our paper http://arxiv.org/abs/1507.05717. UPDATE Mar 14, 2017 A Docker file has been added to the project. Thanks to @varun-suresh.

ShareX - Screen capture, file sharing and productivity tool

  •    CSharp

ShareX is a free and open source program that lets you capture or record any area of your screen and share it with a single press of a key. It also allows uploading images, text or other types of files to over 50 supported destinations you can choose from.

SwiftOCR - Fast and simple OCR library written in Swift

  •    Swift

SwiftOCR is a fast and simple OCR library written in Swift. It uses a neural network for image recognition. As of now, SwiftOCR is optimized for recognizing short, one line long alphanumeric codes (e.g. DI4C9CM). We currently support iOS and OS X. This is a really good question.

Swift-AI - The Swift machine learning library.

  •    Swift

Swift AI is a high-performance deep learning library written entirely in Swift. We currently offer support for all Apple platforms, with Linux support coming soon. Each module now contains its own documentation. We recommend that you read the docs carefully for detailed instructions on using the various components of Swift AI.

CTPN - Detecting Text in Natural Image with Connectionist Text Proposal Network

  •    Jupyter

These demo codes (with our trained model) are for text-line detection (without side-refinement part). You need a GPU. If you use CUDNN, about 1.5GB free memory is required. If you don't use CUDNN, you will need about 5GB free memory, and the testing time will slightly increase. Therefore, we strongly recommend to use CUDNN.


  •    C

GOCR is an OCR (Optical Character Recognition) program, developed under the GNU Public License. It converts scanned images of text back to text files. Joerg Schulenburg started the program, and now leads a team of developers.

node-tesseract - A simple wrapper for the Tesseract OCR package

  •    Javascript

There is a hard dependency on the Tesseract project. You can find installation instructions for various platforms on the project site. For Homebrew users, the installation is quick and easy.

OCRmyPDF - OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

  •    Python

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. For details: please consult the documentation.


  •    C++

This repository contains everything needed to build the card.io library for Android. What it does not yet contain is much in the way of documentation. 😿 So please feel free to ask any questions by creating github issues -- we'll gradually build our documentation based on the discussions there.

tensorflow-ocr - 🖺 OCR using tensorflow with attention

  •    Python

To get started with a minimal example similar to the famous MNIST try ./train_letters.py ; it automatically generates letters for all different font types from your computer in all different shapes and trains on it.


  •    Java

Java OCR is an Optical Character Recognition algorithm based on a mean squared recognizer. This tool also includes utilities to trace and extract characters.


  •    CSharp

A .NET 2.0 Open Source OCR assembly using Tesseract engine.