NAPS2 is a document scanning application with a focus on simplicity and ease of use. Scan your documents from WIA- and TWAIN-compatible scanners, organize the pages as you like, and save them as PDF, TIFF, JPEG, PNG, and other file formats. Requires .NET Framework 4.0 or higher.
https://www.naps2.com/
Tags | scanner pdf pdf-scanner twain wia tiff sane ocr pdf-search |
Implementation | CSharp |
License | GPLv2 |
Platform | Windows |
IFilter plugin for the Microsoft Indexing Service (and Sharepoint in particular) to index and search image files (including TIFF, PDF, JPEG, BMP...) using OCR technology.
ifilter indexing ocr pdf search sharepoint tiff
Paperwork is a personal document manager. It manages scanned documents and PDFs. It's designed to be easy and fast to use. The idea behind Paperwork is "scan & forget": you can just scan a new document and forget about it until the day you need it again.
document-management personal-document-system dms edms python3 ocr indexing gtk gtk3 sane pdf scanner paperwork gnome
gImageReader is a simple Gtk/Qt front-end to tesseract-ocr. The steps for compiling gImageReader from source are documented in the wiki.
qt ocr pdf-document c-plus-plus tesseract-ocr gtk hocr-documents hocr scanner
Tess4J is a Java JNA wrapper for the Tesseract OCR API. It is released and distributed under the Apache License, v2.0. The library provides optical character recognition (OCR) support for TIFF, JPEG, GIF, PNG, and BMP image formats, multi-page TIFF images, and the PDF document format.
image-processing image-library imaging
Apache PDFBox is an open source Java PDF library for working with PDF documents. It allows creation of new PDF documents, manipulation of existing documents, and extraction of content from documents. It supports bookmarks, fonts, text extraction, encryption, PDF printing, and more.
pdf text-extraction pdf-library pdf-library-dotnet pdf-library-java
pdfocr adds an OCR text layer to scanned PDF files, allowing them to be searched. It currently depends on Ruby 1.8.7 or above, and uses ocropus, cuneiform, or tesseract for performing OCR. For more details, see the manpage.
pdf ocr
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. For details, please consult the documentation.
ocr pdf image-processing
Ambar is an open-source document search and management system with automated crawling, OCR, tagging, and instant full-text search. There are two editions available: Community and Enterprise. Enterprise Edition is a full-featured document search and management system that can handle terabytes of data.
search search-engine search-in-text self-hosted ocr pdf smb dropbox
PHP, Perl, and MySQL based web interface for the Nessus security scanner and Nmap port scanner. The system presents scan results via an email notification, an HTML interface, or an exported PDF file.
Evince is a document viewer for multiple document formats. The goal of Evince is to replace the multiple document viewers that exist on the GNOME Desktop with a single simple application. Evince is specifically designed to support the following file formats: PDF, PostScript, DjVu, TIFF, DVI, XPS, and comic books (cbr, cbz, cb7, and cbt). It also offers SyncTeX support with gedit.
document-viewer pdf-viewer pdf-reader image-viewer
Virtual ImagePrinter is based on the Microsoft universal printer driver. ImagePrinter can print any printable document on your Windows system to one or more BMP, PNG, JPG, TIFF, or PDF files. It can convert Word to PDF or JPG, and convert DOC, DOCX, PDF, TXT, HTM, and RTF files to image formats. Please visit http://code-industry.net for more information.
fax2pdf provides a means to convert sets of G3 fax pages (encoded in a TIFF variant) into PDF. This can be used to convert fax pages received with, e.g., HylaFAX into a file format readily viewable with programs on most platforms.
Sioyek is a PDF viewer designed for reading research papers and technical books. It can quickly search for and open any file you have previously opened in Sioyek. It supports document search and navigation to a referenced figure or bibliography item. It also supports bookmarking pages, highlighting text, and more.
pdf pdf-viewer research-paper
Paperless is an application by Daniel Quinn and contributors that indexes your scanned documents and allows you to easily search for documents and store metadata alongside them. It performs OCR on your documents, adds selectable text to image-only documents, and adds tags, correspondents, and document types to your documents. It supports PDF documents, images, plain text files, and Office documents (Word, Excel, PowerPoint, and LibreOffice equivalents).
search machine-learning django angular ocr archiving full-text-search dms document-management-system document-management
PDF Chain is a graphical user interface for the PDF Toolkit (pdftk). The GUI supports all common features of the command line tool in a comfortable way. PDF Chain generates a command for the PDF Toolkit from the GUI settings and executes it on the system. The PDF Toolkit must therefore already be installed on the system.
pdf pdf-toolkit pdf-tools
HexaPDF is a pure Ruby library with an accompanying application for working with PDF files. It supports creating new PDF files, manipulating existing PDF files, merging multiple PDF files into one, extracting meta information, text, images, and files from PDF files, securing PDF files by encrypting them, and optimizing PDF files for smaller file size or other criteria.
pdf pdf-generation pdf-manipulation text-extraction pdf-library
PDF Search Engine is a book search engine that searches sites, forums, and message boards for PDF files. You can find and download tons of e-books, but please respect the publishers and authors if their books are copyrighted.
retro
PDF Renderer is a Java library that renders PDF documents to the screen using Java2D in a Swing panel. It can display PDFs, convert them to PNG, show a PDF in a 3D scene, and provide print preview. It does not support creating or manipulating PDFs.
pdf render pdf-library pdf-library-java
iText is a popular and widely used PDF library for generating PDF documents dynamically. It is especially useful to web developers who generate PDF documents and reports from an XML file or a database and serve them to the browser. It supports bookmarks, watermarks, encryption, form filling, and more.
pdf text-extraction pdf-library pdf-library-dotnet pdf-library-java