Pandoc - Universal markup converter

Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. Pandoc can read Markdown, CommonMark, PHP Markdown Extra, GitHub-Flavored Markdown, MultiMarkdown, and (subsets of) Textile, reStructuredText, HTML, LaTeX, MediaWiki markup, TWiki markup, TikiWiki markup, Creole 1.0, Haddock markup, OPML, Emacs Org mode, DocBook, JATS, Muse, txt2tags, Vimwiki, EPUB, ODT, and Word docx.

Pandoc can write plain text, Markdown, CommonMark, PHP Markdown Extra, GitHub-Flavored Markdown, MultiMarkdown, reStructuredText, XHTML, HTML5, LaTeX (including beamer slide shows), ConTeXt, RTF, OPML, DocBook, JATS, OpenDocument, ODT, Word docx, GNU Texinfo, MediaWiki markup, DokuWiki markup, ZimWiki markup, Haddock markup, EPUB (v2 or v3), FictionBook2, Textile, groff man, groff ms, Emacs Org mode, AsciiDoc, InDesign ICML, TEI Simple, Muse, PowerPoint slide shows and Slidy, Slideous, DZSlides, reveal.js or S5 HTML slide shows. It can also produce PDF output on systems where LaTeX, ConTeXt, pdfroff, wkhtmltopdf, prince, or weasyprint is installed.
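
As a rough illustration of the command-line tool described above, the sketch below builds a pandoc invocation from Python. The helper names `pandoc_command` and `convert` and the file names are hypothetical; `-f`, `-t`, and `-o` are pandoc's options for input format, output format, and output file. The conversion only runs if a `pandoc` executable is found on the PATH.

```python
import shutil
import subprocess


def pandoc_command(src, dst, from_format, to_format):
    """Build the argument list for a pandoc conversion (hypothetical helper)."""
    # -f / -t select the reader and writer; -o names the output file.
    return ["pandoc", "-f", from_format, "-t", to_format, "-o", dst, src]


def convert(src, dst, from_format="markdown", to_format="html5"):
    """Run the conversion if pandoc is installed; return True when it ran."""
    if shutil.which("pandoc") is None:
        return False  # pandoc is not on the PATH, so nothing is executed
    subprocess.run(pandoc_command(src, dst, from_format, to_format), check=True)
    return True


if __name__ == "__main__":
    # The file names here are placeholders.
    print(pandoc_command("notes.md", "notes.html", "markdown", "html5"))
```

Building the argument list separately from running it makes the command easy to inspect or log before anything is executed.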

https://pandoc.org/
http://johnmacfarlane.net/pandoc
https://github.com/jgm/pandoc

Related Projects

Pandoc - General Markup Converter

  •    Haskell

Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. It can convert documents in markdown, reStructuredText, textile, HTML, DocBook, or LaTeX to HTML formats, word processor formats, PDF, and other markup formats.

Tika - A content analysis toolkit

  •    Java

Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.

markup.rocks - Pandoc based document editor and converter in your browser.

  •    Haskell

markup.rocks is a client-side app that lets you edit, preview and convert between documents written in various markup languages in your browser. Check out markup.rocks on github to view the source code, file issues and contribute.

pandoc-ruby - Ruby wrapper for Pandoc

  •    Ruby

PandocRuby is a wrapper for Pandoc, a Haskell library with command line tools for converting one markup format to another. Pandoc can convert documents from a variety of formats including markdown, reStructuredText, textile, HTML, DocBook, LaTeX, and MediaWiki markup to a variety of other formats, including markdown, reStructuredText, HTML, LaTeX, ConTeXt, PDF, RTF, DocBook XML, OpenDocument XML, ODT, GNU Texinfo, MediaWiki markup, groff man pages, HTML slide shows, EPUB, Microsoft Word docx, and more.

gitit - A wiki using HAppS, pandoc, and git

  •    Haskell

Gitit is a wiki program written in Haskell. It uses Happstack for the web server and pandoc for markup processing. Pages and uploaded files are stored in a git, darcs, or mercurial repository and may be modified either by using the VCS's command-line tools or through the wiki's web interface. By default, pandoc's extended version of markdown is used as a markup language, but reStructuredText, LaTeX, HTML, DocBook, or Emacs Org-mode markup can also be used. Pages can be exported in a number of different formats, including LaTeX, RTF, OpenOffice ODT, and MediaWiki markup. Gitit can be configured to display TeX math (using texmath) and highlighted source code (using highlighting-kate).


docx4j - JAXB-based Java library for Word docx, Powerpoint pptx, and Excel xlsx files

  •    Java

docx4j is a library which helps you to work with the Office OpenXML file format as used in docx documents, pptx presentations, and xlsx spreadsheets.

Ghostscript - Document Rendering and Conversion

  •    C

Ghostscript is a rendering and conversion engine for page description languages, including PostScript and PDF. It can convert PostScript language files to many raster formats, view them on displays, and print them on printers that don't have PostScript language capability built in.

LaTeX Helper

  •    Python

GUI to help create a LaTeX document

documents4j - Java library for converting documents into another document format

  •    Java

documents4j is a Java library for converting documents into another document format. This is achieved by delegating the conversion to any native application which understands the conversion of the given file into the desired target format.

Wiki Markup Converter

  •    

Wiki Markup Converter is a Windows graphical tool made to convert from one markup to another (e.g. DokuWiki to Markdown).

nb - CLI and local web plain text note‑taking, bookmarking, and archiving with linking, tagging, filtering, search, Git versioning & syncing, Pandoc conversion, + more, in a single portable script

  •    Shell

nb creates notes in text-based formats like Markdown, Org, and LaTeX, can work with files in any format, can import and export notes to many document formats, and can create private, password-protected encrypted notes and bookmarks. With nb, you can write notes using Vim, Emacs, VS Code, Sublime Text, and any other text editor you like, as well as terminal and GUI web browsers. nb works in any standard Linux / Unix environment, including macOS and Windows via WSL. Optional dependencies can be installed to enhance functionality, but nb works great without them.

textidote - Spelling, grammar and style checking on LaTeX documents

  •    Java

Running a grammar checker on LaTeX documents is far from simple. Since LaTeX documents contain special commands and keywords (the so-called "markup") that are not part of the "real" text, you cannot run a grammar checker directly on these files: it cannot tell the difference between markup and text. The other option is to remove all this markup, leaving only the "clear" text; however, when a grammar tool points to a problem at a specific line in this clear text, it becomes hard to retrace that location in the original LaTeX file. TeXtidote solves this problem; it can read your original LaTeX file and perform various sanity checks on it: for example, making sure that every figure is referenced in the text, enforcing the correct capitalization of titles, etc. In addition, TeXtidote can remove markup from the file and send it to the Language Tool library, which performs a verification of both spelling and grammar in a dozen languages. What is unique to TeXtidote is that it keeps track of the relative position of words between the original and the "clean" text. This means that it can translate the messages from Language Tool back to their proper location directly in your source file.
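
The position tracking described above can be illustrated with a toy sketch. This is not TeXtidote's implementation: the `MARKUP` pattern is a deliberately simplified assumption (a backslash command name or a lone brace), and the function names `strip_markup` and `to_source_offset` are hypothetical. The idea is only to show how a clean-text offset can be mapped back to the original source.

```python
import re

# Simplified notion of "markup": a backslash command name, or a lone brace.
MARKUP = re.compile(r"\\[A-Za-z]+|[{}]")


def strip_markup(source):
    """Return (clean_text, positions), where positions[i] is the index in
    `source` of the character that became clean_text[i]."""
    clean_chars = []
    positions = []
    i = 0
    while i < len(source):
        match = MARKUP.match(source, i)
        if match:
            i = match.end()  # skip markup without emitting anything
            continue
        clean_chars.append(source[i])
        positions.append(i)
        i += 1
    return "".join(clean_chars), positions


def to_source_offset(positions, clean_offset):
    """Translate an offset in the clean text back to the original source."""
    return positions[clean_offset]


if __name__ == "__main__":
    clean, positions = strip_markup(r"A \textbf{bold} word")
    # A checker reporting a problem at clean.index("word") can be
    # pointed back to the matching offset in the LaTeX source.
    print(clean, to_source_offset(positions, clean.index("word")))
```

Keeping one source index per emitted character is the simplest possible mapping; a real tool would need a much richer model of LaTeX syntax.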

borb - Library for reading, creating and manipulating PDF files in Python

  •    Python

borb is a pure Python library to read, write and manipulate PDF documents. It represents a PDF document as a JSON-like data structure of nested lists, dictionaries and primitives (numbers, strings, booleans, etc).

txt2html - plain text to HTML converter

  •    Perl

txt2html (Text to HTML converter) is a Perl program that converts plain text to HTML. It supports headings, lists, tables, simple character markup, and hyperlinking, and is highly customizable.

Txt2tags - Document generator: ONE source, MULTI targets

  •    Python

Txt2tags is a document generator. It reads a text file with minimal markup such as **bold** and //italic// and converts it to formats such as HTML, XHTML, SGML, DocBook (NEW), LaTeX, Lout, Man page, Creole (NEW), Wikipedia / MediaWiki, Google Code, Wiki, PmWiki (NEW), DokuWiki, MoinMoin, MagicPoint, PageMaker, AsciiDoc (NEW), ASCII Art (NEW), Plain text.
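
To make the "minimal markup" idea concrete, here is a toy sketch that turns the two marks mentioned above, **bold** and //italic//, into HTML. It is not Txt2tags code: the function name `to_html` and the choice of `<b>` and `<i>` tags are assumptions of this sketch, and it ignores cases such as `//` inside URLs.

```python
import html
import re


def to_html(line):
    """Convert **bold** and //italic// marks in one line to HTML tags."""
    # Escape &, <, > first so the text itself cannot inject tags.
    line = html.escape(line, quote=False)
    # Non-greedy groups keep each pair of marks separate.
    line = re.sub(r"\*\*(.+?)\*\*", r"<b>\1</b>", line)
    line = re.sub(r"//(.+?)//", r"<i>\1</i>", line)
    return line


if __name__ == "__main__":
    print(to_html("**bold** and //italic//"))
```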

haml - HTML Abstraction Markup Language - A Markup Haiku

  •    Ruby

Haml is a templating engine for HTML. It's designed to make it both easier and more pleasant to write HTML documents, by eliminating redundancy, reflecting the underlying structure that the document represents, and providing an elegant syntax that's both powerful and easy to understand.

Apache POI - Java API To Access Microsoft Document File Formats

  •    Java

APIs for manipulating various file formats based upon Office Open XML (ECMA-376) and Microsoft's OLE 2 Compound Document formats using pure Java. Apache POI is your Java Excel, Word and PowerPoint solution. We have a complete API for porting other OOXML and OLE 2 Compound Document formats and welcome others to participate.

fact-extractor - Fact Extraction from Wikipedia Text

  •    Python

Wikipedia dumps are packaged as XML documents and contain text formatted according to the MediaWiki markup syntax, with templates to be transcluded. To obtain a raw text corpus, we use the WikiExtractor, integrated in a frozen version here. Pull requests not complying with these guidelines will be ignored.

JODConverter - Automates document conversions using OpenOffice

  •    Java

JODConverter automates conversions between office document formats using OpenOffice.org or LibreOffice. Supported formats include OpenDocument, PDF, RTF, HTML, Word, Excel, PowerPoint, and Flash. It can be used as a Java library, a command line tool, or a web application.

Ultralight Markup

  •    

Ultralight Markup makes it easier for webmasters to allow safe user comments. It features a stripped-down intermediate markup language meant to bridge the gap between text entry and HTML. The project also includes an ASP.NET MVC implementation with a JavaScript editor.





