JavaOCR

Java OCR is an Optical Character Recognition algorithm based on a mean squared recognizer. This tool also includes utilities to trace and extract characters.
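As a rough illustration of what a mean squared recognizer does, the sketch below compares a glyph with a set of reference templates and picks the one with the smallest mean squared pixel difference. It is written in Python and is not taken from the Java OCR code base; the names `mean_squared_error`, `recognize`, `parse`, and `TEMPLATES`, as well as the 3x5 binary glyphs, are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Glyph = List[List[int]]  # rows of 0/1 pixel values (hypothetical representation)


def mean_squared_error(a: Glyph, b: Glyph) -> float:
    """Average squared pixel difference between two equally sized glyphs."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("glyphs must have the same dimensions")
    total = sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def recognize(glyph: Glyph, templates: Dict[str, Glyph]) -> Tuple[str, float]:
    """Return the template label with the lowest error, plus that error."""
    best = min(templates, key=lambda label: mean_squared_error(glyph, templates[label]))
    return best, mean_squared_error(glyph, templates[best])


def parse(rows: List[str]) -> Glyph:
    """Turn rows of '#' (ink) and '.' (background) into a 0/1 grid."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


# Hypothetical 3x5 reference glyphs; a real recognizer would learn these
# from training images.
TEMPLATES = {
    "I": parse(["###", ".#.", ".#.", ".#.", "###"]),
    "L": parse(["#..", "#..", "#..", "#..", "###"]),
    "T": parse(["###", ".#.", ".#.", ".#.", ".#."]),
}

# A "T" whose bottom pixel was lost to noise.
noisy = parse(["###", ".#.", ".#.", ".#.", "..."])
label, error = recognize(noisy, TEMPLATES)
print(label, error)  # the best label is "T"
```

A production recognizer would first trace and extract each character from the page and scale it to the template size; this sketch assumes that step has already happened.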

References:

http://javaocr.sourceforge.net/

Related Projects

GOCR


GOCR is an OCR (Optical Character Recognition) program, developed under the GNU General Public License. It converts scanned images of text back to text files. Joerg Schulenburg started the program and now leads a team of developers.

Tessnet2


A .NET 2.0 open-source OCR assembly using the Tesseract engine.

JS-OCR-demo - JavaScript optical character recognition demo


JavaScript optical character recognition demo.

pytesseract - A Python wrapper for Google Tesseract


Python-tesseract is an optical character recognition (OCR) tool for Python. That is, it will recognize and "read" the text embedded in images. Python-tesseract is a wrapper for Google's Tesseract-OCR Engine. It is also useful as a stand-alone invocation script for tesseract, as it can read all image types supported by the Python Imaging Library, including JPEG, PNG, GIF, BMP, TIFF, and others, whereas tesseract-ocr by default only supports TIFF and BMP. Additionally, if used as a script, Python-tesseract will print the recognized text instead of writing it to a file.

android-ocr - Experimental app for optical character recognition on Android.


Experimental app for optical character recognition on Android.


pyocr - A Python wrapper for Tesseract and Cuneiform


PyOCR is an optical character recognition (OCR) tool wrapper for Python. That is, it helps you use various OCR tools from a Python program. It has been tested only on GNU/Linux systems. It should also work on similar systems (*BSD, etc.). It may or may not work on Windows, Mac OS X, etc.

gosseract - Go package for OCR (Optical Character Recognition), by using Tesseract C++ library


A Go OCR package built on the Tesseract C++ library. See the Dockerfile for installation details, or try it directly with: docker run -it --rm otiai10/gosseract

mlp-character-recognition


Trains a multi-layer perceptron (MLP) neural network to perform optical character recognition (OCR).

tess4j - Java JNA wrapper for Tesseract OCR API


A Java JNA wrapper for the Tesseract OCR API. Tess4J is released and distributed under the Apache License, v2.0. The library provides optical character recognition (OCR) support for TIFF, JPEG, GIF, PNG, and BMP image formats, multi-page TIFF images, and the PDF document format.

SubExtractor


Converts subtitles from DVDs and PGS (Bluray .sup) files into Advanced Substation Alpha and SRT text format using OCR (optical character recognition).

Conjecture


Conjecture is a modular, extensible, open-source C++ framework for Optical Character Recognition (OCR). It is not a single OCR engine, but rather an extensible collection of OCR engines that can be explored, compared, extended, and modified within a unified environment.

OpenFst - Library for constructing weighted finite-state transducers


OpenFst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others.
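To make the idea of input labels, output labels, and weights concrete, here is a minimal Python sketch of a weighted transducer that corrects common OCR confusions ("O" versus "0", "l" versus "1"). It does not use the OpenFst API; the `TinyFst` class, its methods, the two context states, and all weights are hypothetical and chosen only for illustration. Weights are added along a path and the cheapest path wins (the tropical semiring convention), and epsilon transitions are not supported.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (input label, output label, weight, next state)
Arc = Tuple[str, str, float, int]


class TinyFst:
    """Toy weighted finite-state transducer without epsilon transitions."""

    def __init__(self, start: int = 0) -> None:
        self.start = start
        self.arcs: Dict[int, List[Arc]] = defaultdict(list)
        self.finals: Dict[int, float] = {}

    def add_arc(self, src: int, ilabel: str, olabel: str, weight: float, dst: int) -> None:
        self.arcs[src].append((ilabel, olabel, weight, dst))

    def set_final(self, state: int, weight: float = 0.0) -> None:
        self.finals[state] = weight

    def best_path(self, symbols: List[str]) -> Optional[Tuple[List[str], float]]:
        """Return (outputs, weight) of the cheapest accepting path, or None."""
        # frontier: state -> (accumulated weight, outputs so far)
        frontier = {self.start: (0.0, [])}
        for sym in symbols:
            nxt: Dict[int, Tuple[float, List[str]]] = {}
            for state, (weight, outputs) in frontier.items():
                for ilabel, olabel, arc_weight, dst in self.arcs[state]:
                    if ilabel != sym:
                        continue
                    cand = (weight + arc_weight, outputs + [olabel])
                    # Keep only the cheapest way of reaching each state.
                    if dst not in nxt or cand[0] < nxt[dst][0]:
                        nxt[dst] = cand
            frontier = nxt
        accepted = [(w + self.finals[s], out) for s, (w, out) in frontier.items() if s in self.finals]
        if not accepted:
            return None
        weight, outputs = min(accepted, key=lambda item: item[0])
        return outputs, weight


# State 0 = "letter context", state 1 = "digit context" (made-up model).
fst = TinyFst(start=0)
for state, same_cost, switch_cost in ((0, 0.2, 1.0), (1, 1.0, 0.2)):
    fst.add_arc(state, "l", "l", same_cost, 0)    # read "l" as a letter
    fst.add_arc(state, "l", "1", switch_cost, 1)  # read "l" as the digit 1
    fst.add_arc(state, "O", "O", same_cost, 0)    # read "O" as a letter
    fst.add_arc(state, "O", "0", switch_cost, 1)  # read "O" as the digit 0
    fst.add_arc(state, "7", "7", 0.0, 1)          # "7" is unambiguous
fst.set_final(0)
fst.set_final(1)

outputs, weight = fst.best_path(list("7Ol"))
print("".join(outputs), weight)  # corrects the reading to 701
```

Because the leading "7" moves the machine into the digit context, interpreting the following "O" and "l" as digits is cheaper than interpreting them as letters. A real library such as OpenFst additionally provides composition, determinization, minimization, and shortest-path search over much larger machines.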

Optical Character Recognition (GOCR)


This is a command line based optical character recognition program.

VietOCR


Provides optical character recognition (OCR) solutions for Vietnamese language.

Neuronal Optical Character Recognition


A tool that demonstrates the concepts behind one type of neural network, the multi-layer perceptron. It is not a real OCR system, just a small didactic application.

ocropy - Python-based OCR package using recurrent neural networks.


$ pip install -r requirements_2.txt
$ wget -nd http://www.tmbdev.net/en-default.pyrnn.gz
$ mv en-default.pyrnn.gz models/

To test the recognizer, run:

$ ./run-test

OCRopus is really a collection of document analysis programs, not a turn-key OCR system. In addition to the recognition scripts themselves, there are a number of scripts for ground truth editing and correction, measuring error rates, determining confusion matrices, etc. OCRopus commands will generally print a stack trace along

BOCRA


An Optical Character Recognition application for high quality printed text, geared towards (but not restricted to) the Bengali script.

Java OCR


Java OCR is a suite of pure Java libraries for image processing and character recognition. Its small memory footprint and lack of external dependencies make it suitable for Android development. Its modular structure makes deployment easier.

YagpoOCRUnicode C++ library


OCR C++ library. Includes: contour recognition; vectorisation; matrix letter feature recognition; automatic page segmentation and rotation detection; SS3 ASM core; XML base; web-based GUI; 99.6% printed Unicode text recognition; letter base of up to 1200 letters.

HTK - Speech Recognition Toolkit


The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research, although it has been used for numerous other applications, including research into speech synthesis, character recognition, and DNA sequencing. HTK is in use at hundreds of sites worldwide.