borb is a pure Python library to read, write, and manipulate PDF documents. It represents a PDF document as a JSON-like data structure of nested lists, dictionaries, and primitives (numbers, strings, booleans, etc.).
https://github.com/jorisschellekens/borb
Tags | pdf library sdk typesetting pdf-converter python3 pdf-conversion pdf-generation pdf-library text-extraction |
Implementation | Python |
License | AGPL |
Platform | Windows Linux MacOS |
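
To make the "JSON-like data structure" idea concrete, here is a minimal sketch that uses only built-in Python types. The key names (`Type`, `Pages`, `Kids`, `Count`, `MediaBox`) follow the PDF specification, but the exact layout and the helper `count_pages` are assumptions made for illustration; this is not borb's actual API.

```python
import json

# Hypothetical, simplified document tree built from plain Python types:
# dictionaries, lists, and primitives (numbers, strings, booleans).
document = {
    "Trailer": {
        "Root": {
            "Type": "Catalog",
            "Pages": {
                "Type": "Pages",
                "Count": 2,
                "Kids": [
                    {"Type": "Page", "MediaBox": [0, 0, 595, 842]},
                    {"Type": "Page", "MediaBox": [0, 0, 595, 842]},
                ],
            },
        },
        "Info": {"Title": "Example", "Encrypted": False},
    }
}


def count_pages(node):
    """Recursively count dictionaries whose "Type" entry is "Page"."""
    if isinstance(node, dict):
        own = 1 if node.get("Type") == "Page" else 0
        return own + sum(count_pages(value) for value in node.values())
    if isinstance(node, list):
        return sum(count_pages(item) for item in node)
    return 0  # primitives contain no pages


page_count = count_pages(document)
# Because the tree holds only JSON-compatible types, it serializes directly.
as_json = json.dumps(document, indent=2)
```

Because every node is a dictionary, a list, or a primitive, generic tree walks and JSON serialization work without any PDF-specific classes, which is the appeal of this kind of representation.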

HexaPDF is a pure Ruby library with an accompanying application for working with PDF files. It supports creating new PDF files, manipulating existing PDF files, merging multiple PDF files into one, extracting meta information, text, images, and files from PDF files, securing PDF files by encrypting them, and optimizing PDF files for smaller file size or other criteria.
Tags | pdf pdf-generation pdf-manipulation text-extraction pdf-library |

Apache PDFBox is an open source Java PDF library for working with PDF documents. This library allows creation of new PDF documents, manipulation of existing documents, and extraction of content from documents. It provides support for adding bookmarks, fonts, text extraction, encryption, PDF printing, and a lot more.
Tags | pdf text-extraction pdf-library pdf-library-dotnet pdf-library-java |

UniDoc's UniPDF (formerly unidoc) is a PDF library for Go (golang) with capabilities for creating, reading, and processing PDF files. The library is written and supported by FoxyUtils.com, where it is used to power many of the site's services. Multiple examples are provided in the example repository https://github.com/unidoc/unidoc-examples, along with documented examples on the project website.
Tags | pdf pdf-library pdf-generation pdf-document-processor text-extraction pdf-manipulation |

PDFClown is a PDF library that helps to generate, read, and edit PDF documents. It can split and merge PDF documents. It supports adding images, fonts, barcodes, bookmarks, annotations, and form fields such as checkboxes, buttons, and list boxes, as well as compression and text extraction.
Tags | pdf text-extraction pdf-library pdf-library-dotnet pdf-library-java |

iText is one of the most popular and widely used PDF libraries. It is used to generate PDF documents dynamically. It is especially useful to web developers who generate PDF documents and reports based on data from an XML file or a database and serve them to the browser. It supports adding bookmarks, watermarks, encryption, form filling, and a lot more.
Tags | pdf text-extraction pdf-library pdf-library-dotnet pdf-library-java |

A library for PDF manipulation implementing Adobe PDF standard version 1.7. This library can read PDF files and apply changes to them. It is written in .NET 2.0 using Visual Studio 2005. Both writing and parsing PDF are supported.
Tags | text-extraction pdf pdf-library pdf-library-dotnet |

Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. It can convert documents in Markdown, reStructuredText, Textile, HTML, DocBook, or LaTeX to HTML formats, word processor formats, PDF, and other markup formats.
Tags | text-extraction document-conversion document markup text-to-pdf |

jPod is a PDF manipulation and rendering framework. It provides functionality to read a document and verify it against the PDF specification. It also provides a content stream and rendering framework. It can create new documents and perform incremental updates.
Tags | pdf text-extraction pdf-library pdf-library-java |

PDFsharp is an open source .NET library that creates and processes PDF documents on the fly from any .NET language. The same drawing routines can be used to create PDF documents, draw on the screen, or send output to any printer. Neither Adobe's PDF Library nor Acrobat is required.
Tags | text-extraction pdf pdf-library pdf-library-dotnet |

PDFedit is a free open source PDF editor and a library for manipulating PDF documents. It includes a PDF manipulation library based on xpdf, a GUI, a set of command-line tools, and a PDF editor. You can use it to read, change, and extract information from a PDF file.
Tags | pdf-library pdf pdf-text-extraction pdf-viewer |

TCPDF is a PHP class for generating PDF documents without requiring external extensions. TCPDF supports UTF-8, Unicode, RTL languages, XHTML, JavaScript, digital signatures, barcodes, and much more.
Tags | text-extraction pdf pdf-library pdf-text-extraction |

PDFjet is a high performance PDF library for Java and .NET. It supports drawing points, lines, boxes, polygons, etc. It also supports Unicode text, embedded images, embedded hyperlinks, and a lot more. Its simple-to-use table class helps generate flexible reports.
Tags | pdf text-extraction pdf-library pdf-library-dotnet pdf-library-java |

docx4j is a library which helps you to work with the Office OpenXML file format as used in docx documents, pptx presentations, and xlsx spreadsheets.
Tags | document-processing document-conversion text-extraction microsoft-documents |

A Go wrapper library to convert PDF, DOC, DOCX, XML, HTML, RTF, ODT, and Pages documents and images to plain text. Note for returning users: the Go code path for this package has been moved to code.sajari.com/docconv. Follow the installation instructions to check out a version of the code in the correct place.
Tags | rtf docx xml html rtf-files docs conversion pdf pdf-converter word |

HTML to PDF converter via Chrome/Chromium. Note: It is strongly recommended that you keep Chrome running side-by-side with Node.js. There is significant overhead starting up Chrome for each PDF generation which can be easily avoided.
Tags | chrome chromium html html-pdf-chrome pdf pdf-generation typescript headless-chrome headless-chromium headless-browsers headless nodejs node-js google google-chrome pdf-generator html-pdf mac |

Prawn is a pure Ruby PDF generation library that provides a lot of functionality while trying to remain simple and reasonably performant. It supports vector drawing (lines, polygons, curves, ellipses, etc.), extensive text rendering, security features including encryption and password protection, PNG and JPG image embedding with flexible scaling options, and a lot more.
Tags | pdf pdf-generation pdf-library |

Ghostscript is a rendering and conversion engine for page description languages, including PostScript and PDF. It can convert PostScript language files to many raster formats, view them on displays, and print them on printers that don't have PostScript language capability built in.
Tags | document-conversion pdf-text-extraction text-extraction graphics pdf postscript printing |

PDFKit is a JavaScript PDF document generation library for Node and the browser that makes creating complex, multi-page, printable documents easy. It's written in CoffeeScript, but you can choose to use the API in plain old JavaScript if you like. The API embraces chainability, and includes both low level functions and abstractions for higher level functionality. The PDFKit API is designed to be simple, so generating complex documents is often as simple as a few function calls.
Tags | pdf pdf-writer pdf-generator graphics document vector |

Snappy is a PHP library allowing thumbnail, snapshot, or PDF generation from a URL or an HTML page. It uses the WebKit-based wkhtmltopdf and wkhtmltoimage.
Tags | image-generation html-to-pdf pdf-generation hacktoberfest html-to-image wkhtmltopdf |

Prawn is a nimble PDF writer for Ruby. More importantly, it’s a hackable platform that offers both high level APIs for the most common needs and low level APIs for bending the document model to accommodate special circumstances. With Prawn, you can write text, draw lines and shapes, place images anywhere on the page, and add as much color as you like. In addition, it brings a fluent API and aggressive code reuse to the printable document space.
Tags | asciidoc asciidoctor pdf pdf-generation prawn |