PDFBox - Java PDF library


Apache PDFBox is an open source Java library for working with PDF documents. It allows you to create new PDF documents, manipulate existing ones, and extract content from them. It supports bookmarks, fonts, text extraction, encryption, PDF printing and a lot more. It also has .NET support.

Form filling is one of its most important features: it can fill in forms from FDF and XFDF form data. It has command line utilities for most jobs. For example, the PDFToImage utility creates an image for every page in a PDF document.
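To make the form-data formats concrete, here is a minimal sketch in plain Java (no PDFBox calls) that builds an XFDF document, the XML counterpart of FDF, from a map of field names to values. The class name XfdfSketch, the method toXfdf and the sample field names are illustrative assumptions and are not part of PDFBox; a form-filling tool would import a file like this into a PDF form.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class XfdfSketch {

    // Escape the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Build a minimal XFDF document: one <field> element per form field.
    static String toXfdf(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<xfdf xmlns=\"http://ns.adobe.com/xfdf/\" xml:space=\"preserve\">\n");
        sb.append("  <fields>\n");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("    <field name=\"").append(escape(e.getKey())).append("\">")
              .append("<value>").append(escape(e.getValue())).append("</value>")
              .append("</field>\n");
        }
        sb.append("  </fields>\n");
        sb.append("</xfdf>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical field names; a real form defines its own.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "Ada Lovelace");
        fields.put("company", "Smith & Sons");
        System.out.print(toXfdf(fields));
    }
}
```

The sketch escapes field names and values so that characters such as & do not break the XML. It does not handle nested (hierarchical) fields, which XFDF also allows.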

PDF documents can be split into multiple documents, and multiple PDF documents can be merged into one. The Lucene search engine is integrated for full-text search.
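As a rough illustration of what splitting involves, the sketch below plans the page ranges for cutting a document into parts of a fixed maximum size. It is plain Java and deliberately uses no PDFBox API; the class name SplitPlanSketch and the method planSplit are hypothetical, and a real split would then copy each range of pages into its own output document.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitPlanSketch {

    // Return inclusive, 1-based {first, last} page ranges,
    // each covering at most pagesPerPart pages.
    static List<int[]> planSplit(int pageCount, int pagesPerPart) {
        if (pageCount < 0 || pagesPerPart < 1) {
            throw new IllegalArgumentException(
                    "pageCount must be >= 0 and pagesPerPart >= 1");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int first = 1; first <= pageCount; first += pagesPerPart) {
            // The last part may be shorter than pagesPerPart.
            int last = Math.min(first + pagesPerPart - 1, pageCount);
            ranges.add(new int[] {first, last});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 10-page document split into parts of up to 4 pages.
        for (int[] r : planSplit(10, 4)) {
            System.out.println("pages " + r[0] + "-" + r[1]);
        }
    }
}
```

Merging is the reverse operation: the pages of each input document are appended, in order, to a single output document.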

http://pdfbox.apache.org/

Related Projects

PDFClown - PDF library

  •    Java

PDFClown is a PDF library that helps to generate, read and edit PDF documents. It can split and merge PDF documents. It supports images, fonts, barcodes, bookmarks, annotations, form fields (checkbox, button, list box, etc.), compression and text extraction.

iText - Java PDF library

  •    Java

iText is a popular and widely used PDF library. It is used to generate PDF documents dynamically. It is especially useful for web developers who generate PDF documents and reports from an XML file or a database and serve them to the browser. It supports bookmarks, watermarks, encryption, form filling and a lot more.

PDFJet - PDF library for Java and .NET

  •    Java

PDFjet is a high performance PDF library for Java and .NET. It supports drawing points, lines, boxes, polygons, etc. It supports Unicode text, embedded images, embedded hyperlinks and a lot more. Its simple-to-use table class helps to generate flexible reports.

PDF Library - PDF manipulation in .NET

  •    VBNET

A library for PDF manipulation implementing the Adobe PDF standard version 1.7. It can read PDF files and apply changes to them. It is written in .NET 2.0 using Visual Studio 2005. Both writing and parsing PDF are supported.

PDFSharp - Create and process PDF in .NET

  •    CSharp

PDFsharp is the Open Source .NET library that easily creates and processes PDF documents on the fly from any .NET language. The same drawing routines can be used to create PDF documents, draw on the screen, or send output to any printer. Neither Adobe's PDF Library nor Acrobat is required.


HexaPDF - A Versatile PDF Creation and Manipulation Library for Ruby

  •    Ruby

HexaPDF is a pure Ruby library with an accompanying application for working with PDF files. It supports creating new PDF files, manipulating existing PDF files, merging multiple PDF files into one, extracting meta information, text, images and files from PDF files, securing PDF files by encrypting them, and optimizing PDF files for smaller file size or other criteria.

PDFEdit

  •    C++

PDFedit is a free, open source PDF editor and a library for manipulating PDF documents. It includes a PDF manipulation library based on xpdf, a GUI and a set of command line tools. You can use it to read, change and extract information from a PDF file.

jPod - PDF manipulating and rendering framework

  •    Java

jPod is a PDF manipulating and rendering framework. It provides functionality to read a document and verify it against the PDF specification. It also provides a content stream and rendering framework. It can create new documents and perform incremental updates.

TCPDF - PHP class for generating PDF

  •    PHP

TCPDF is a PHP class for generating PDF documents without requiring external extensions. TCPDF supports UTF-8, Unicode, RTL languages, XHTML, JavaScript, digital signatures, barcodes and much more.

Apache PDFBox - The Apache PDFBox library is an open source Java tool for working with PDF documents

  •    Java

The Apache PDFBox library is an open source Java tool for working with PDF documents.

openhtmltopdf - An HTML to PDF library for the JVM

  •    Java

Open-HTML-to-PDF is an HTML and CSS renderer written in Java. It supports Java2D and PDF output. Open-HTML-to-PDF is a fork of Flying Saucer with additional features.

PDF Renderer - renders PDF documents to the screen

  •    Java

PDF Renderer is a Java library that renders PDF documents to the screen using Java2D, into a Swing panel. It can display a PDF, convert it to PNG, show it in a 3D scene, and provide a print preview. It does not support creating or manipulating PDFs.

Pandoc - General Markup Converter

  •    Haskell

Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. It can convert documents in Markdown, reStructuredText, Textile, HTML, DocBook, or LaTeX to HTML formats, word processor formats, PDF and other markup formats.

docx4j - JAXB-based Java library for Word docx, Powerpoint pptx, and Excel xlsx files

  •    Java

docx4j is a library which helps you to work with the Office OpenXML file format as used in docx documents, pptx presentations, and xlsx spreadsheets.

OrsonPDF - A fast, lightweight PDF generator for the Java platform

  •    Java

OrsonPDF is a PDF generation library for the Java(tm) platform that allows you to create content in PDF format using the standard Java2D drawing API (Graphics2D). OrsonPDF is light-weight, fast, and has no dependencies other than the Java runtime (1.6 or later).

Prawn - Fast, Nimble PDF Generation For Ruby

  •    Ruby

Prawn is a pure Ruby PDF generation library that provides a lot of great functionality while trying to remain simple and reasonably performant. It supports vector drawing (lines, polygons, curves, ellipses, etc.), extensive text rendering, security features including encryption and password protection, PNG and JPG image embedding with flexible scaling options, and a lot more.

pdfextract - A tool and library that can extract various areas of text from a PDF, especially a scholarly article PDF

  •    Ruby

A tool and library that can extract various areas of text from a PDF, especially a scholarly article PDF. It performs structural analysis to determine column bounds, headers, footers, sections, titles and so on. It can analyse and categorise sections into reference and non-reference sections and can split reference sections into individual references. The latest version is 0.1.1. Earlier versions are far less reliable.

pdf-reader - The PDF::Reader library implements a PDF parser conforming as much as possible to the PDF specification from Adobe

  •    Ruby

The PDF::Reader library implements a PDF parser conforming as much as possible to the PDF specification from Adobe. It provides programmatic access to the contents of a PDF file with a high degree of flexibility.

pdfkit - A JavaScript PDF generation library for Node and the browser

  •    CoffeeScript

PDFKit is a PDF document generation library for Node and the browser that makes creating complex, multi-page, printable documents easy. It's written in CoffeeScript, but you can choose to use the API in plain ol' JavaScript if you like. The API embraces chainability, and includes both low level functions and abstractions for higher level functionality. The PDFKit API is designed to be simple, so generating complex documents is often as simple as a few function calls.

PDFSam - PDF Split and Merge

  •    Java

PDFsam Basic is simple, platform-independent software designed to split, merge and rotate PDF files. It can split your PDF documents (into chapters, single pages, etc.). It can merge many PDF documents or subsections of them. It also helps to visually reorder the pages of a selected PDF document.