Pandoc - General Markup Converter

  •        0

Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. It an convert documents in markdown, reStructuredText, textile, HTML, DocBook, or LaTeX to HTML formats, Word processor formats, PDF and other markup formats.

It can read [markdown] and (subsets of) [Textile], [reStructuredText], [HTML], [LaTeX], [MediaWiki markup], and [DocBook XML]; and it can write plain text, [markdown], [reStructuredText], [XHTML], [HTML 5], [LaTeX] (including [beamer] slide shows), [ConTeXt], [RTF], [DocBook XML], [OpenDocument XML], [ODT], [Word docx], [GNU Texinfo], [MediaWiki markup], [EPUB] (v2 or v3), [FictionBook2], [Textile], [groff man] pages, [Emacs Org-Mode], [AsciiDoc], and [Slidy], [Slideous], [DZSlides], or [S5] HTML slide shows. It can also produce [PDF] output on systems where LaTeX is installed.

http://johnmacfarlane.net/pandoc/

Tags
Implementation
License
Platform

   

comments powered by Disqus


Related Projects

Ghostscript - Document Rendering and Conversion


Ghostscript is a rendering and conversion engine for page description languages, including Postscript and PDF. It has ability to convert PostScript language files to many raster formats, view them on displays, and print them on printers that don't have PostScript language capability built in.

PDFBox - Java PDF library


Apache PDFBox is an open source Java PDF library for working with PDF documents. This library allows creation of new PDF documents, manipulation of existing documents and the ability to extract content from documents. It provides support for adding bookmarks, fonts, text extraction, Encryption, PDF printing and lot more.

iText - Java PDF library


iText is one of the popular and widely used PDF library. It is used to generate PDF documents dynamically. Mostly web developers will love it to generate PDF documents and reports based on data from an XML file or a database and serves it to the browser. It has support of adding bookmarks, watermarks, Encryption, Form filling and lot more.

Semantic Vectors - Creating and Searching Semantic Vector using Lucene


The Semantic Vectors package uses a Random Projection algorithm, a form of automatic semantic analysis. Other methods supported by the package include Latent Semantic Analysis (LSA) and Reflective Random Indexing. Latent Semantic Analysis (LSA) is a theory and method for extracting and representing the contextual-usage meaning of words by statistical computations applied to a large corpus of text. This library is used in semantic analysis and text mining.

Plone


Plone lets non-technical people create and maintain information using only a web browser. Perfect for web sites or intranets, Plone offers superior security without sacrificing extensibility or ease of use.

Aperture - Java framework for getting data and metadata


Aperture is a Java framework for extracting and querying full-text content and metadata from various information systems. It could crawl and extract information from File system, Websites, Mail boxes and Mail servers. It supports various file formats like Office, PDF, Zip and lot more. Metadata information is extracted from image files. Aperture has a strong focus on semantics, metadata extracted could be mapped to predefined properties.

Hydra - Distributed processing framework for search solutions


Hydra is designed to give the search solution the tools necessary to modify the data that is to be indexed in an efficient and flexible way. This is done by providing a scalable and efficient pipeline which the documents will have to pass through before being indexed into the search engine. Architecturally Hydra sits in between the search engine and the source integration.

Text Analysis Markup System


Text Analysis Markup System (TAMS) is both a system of marking documents for qualitative analysis and a series of tools for mining information based on that syntax.

TextMine


TextMine is for the Perl hacker who is grappling with the problems of managing unstructured text from various sources. You can use these text mining tools to search the Web, index text, extract entities, categorize your e-mail, and summarize documents.

PDFEdit


PDFedit is a free open source pdf editor and a library for manipulating PDF documents. It includes PDF manipulating library based on xpdf, GUI, set of command line tools and a pdf editor. You can use it to read, change and extract information from a PDF file. It is based on xpdf library.







Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.

Tag Cloud >>