Displaying 1 to 20 from 47 results

TagSoup - SAX-compliant parser in Java

  •    Java

TagSoup, a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. TagSoup also includes a command-line processor that reads HTML files and can generate either clean HTML or well-formed XML that is a close approximation to XHTML.

Expat

  •    C

Expat is an XML parser library written in C. It is a stream-oriented parser in which an application registers handlers for things the parser might find in the XML document (like start tags).

node-xml2js - XML to JavaScript object converter.

  •    CoffeeScript

Simple XML to JavaScript object converter. It supports bi-directional conversion. Uses sax-js and xmlbuilder-js.Note: If you're looking for a full DOM parser, you probably want JSDom.

oga - Moved to https://gitlab.com/yorickpeterse/oga

  •    Ruby

NOTE: my spare time is limited which means I am unable to dedicate a lot of time on Oga. If you're interested in contributing to FOSS, please take a look at the open issues and submit a pull request to address them where possible. Oga is an XML/HTML parser written in Ruby. It provides an easy to use API for parsing, modifying and querying documents (using XPath expressions). Oga does not require system libraries such as libxml, making it easier and faster to install on various platforms. To achieve better performance Oga uses a small, native extension (C for MRI/Rubinius, Java for JRuby).




pugixml - Light-weight, simple and fast XML parser for C++ with XPath support

  •    C++

pugixml is a C++ XML processing library, which consists of a DOM-like interface with rich traversal/modification capabilities, an extremely fast XML parser which constructs the DOM tree from an XML file/buffer, and an XPath 1.0 implementation for complex data-driven tree queries. Full Unicode support is also available, with Unicode interface variants and conversions between different Unicode encodings (which happen automatically during parsing/saving). pugixml is used by a lot of projects, both open-source and proprietary, for performance and easy-to-use interface.

Fuzi - A fast & lightweight XML & HTML parser in Swift with XPath & CSS support

  •    Swift

Fuzi is based on a Swift port of Mattt Thompson's Ono(斧), using most of its low level implementaions with moderate class & interface redesign following standard Swift conventions, along with several bug fixes. Fuzi(斧子) means "axe", in homage to Ono(斧), which in turn is inspired by Nokogiri (鋸), which means "saw".

Kanna - Kanna(鉋) is an XML/HTML parser for Swift.

  •    Swift

Kanna(鉋) is an XML/HTML parser for cross-platform(macOS, iOS, tvOS, watchOS and Linux!). It was inspired by Nokogiri(鋸).

posthtml - PostHTML is a tool to transform HTML/XML with JS plugins

  •    Javascript

PostHTML is a tool for transforming HTML/XML with JS plugins. PostHTML itself is very small. It includes only a HTML parser, a HTML node tree API and a node tree stringifier. All HTML transformations are made by plugins. And these plugins are just small plain JS functions, which receive a HTML node tree, transform it, and return a modified tree.


DiDOM - Simple and fast HTML parser

  •    PHP

DiDOM - simple and fast HTML parser. The second parameter specifies if you need to load file. Default is false.

XOM - XML object model in Java

  •    Java

XOM is a new XML object model. It is a tree-based API for processing XML with Java that strives for correctness, simplicity, and performance, in that order.

TagSoup - HTML/XML parser for Haskell

  •    Haskell

TagSoup is a library for parsing HTML/XML. It supports the HTML 5 specification, and can be used to parse either well-formed XML, or unstructured and malformed HTML from the web. The library also provides useful functions to extract information from an HTML document, making it ideal for screen-scraping.

Libxml++

  •    C

libxml++ is a C++ wrapper for the libxml XML parser library.

Xerces-C++

  •    C++

Xerces-C++ is a validating XML parser written in a portable subset of C++. Xerces-C++ makes it easy to give your application the ability to read and write XML data.

Arbica

  •    C++

Arabica is an XML and HTML processing toolkit, providing SAX, DOM, XPath, and partial XSLT implementations, written in Standard C++.

Nokogiri - HTML, XML, SAX, and Reader parser with XPath and CSS selector support

  •    Ruby

Nokogiri (?) is an HTML, XML, SAX, DOM parser. Among Nokogiri's many features is the ability to search documents via XPath or CSS3 selectors, XML/HTML builder, XSLT transformer. Nokogiri parses and searches XML/HTML using native libraries (either C or Java, depending on your Ruby), which means it's fast and standards-compliant.

Piccolo

  •    Java

Piccolo is a small, extremely fast XML parser for Java. It implements the SAX 1, SAX 2.0.1, and JAXP 1.1 (SAX parsing only) interfaces as a non-validating parser and attempts to detect all XML well-formedness errors. Piccolo was developed by Yuval Oren.

TclXML

  •    TCL

The TclXML project is a collection of tools and libraries for handling XML documents with the Tcl scripting language.

lxml-python

  •    Python

lxml is a Pythonic binding for the libxml2 and libxslt libraries.

libxml-Perl

  •    Perl

Perl interface to Gnome libxml2 xml parsing and DOM library.

Libxml-Ruby

  •    C

The Libxml-Ruby project provides Ruby language bindings for the GNOME Libxml2 XML toolkit.





We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.