
TagSoup - SAX-compliant parser in Java


TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. TagSoup also includes a command-line processor that reads HTML files and can generate either clean HTML or well-formed XML that is a close approximation to XHTML.

Expat


Expat is an XML parser library written in C. It is a stream-oriented parser in which an application registers handlers for things the parser might find in the XML document (like start tags).
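The handler-registration model can be sketched with Python's standard-library binding to Expat (`xml.parsers.expat`) rather than the C API itself. The function name `collect_events` and the event tuples are illustrative assumptions, not part of Expat.

```python
import xml.parsers.expat


def collect_events(xml_text):
    """Parse xml_text and return the parse events in the order Expat reports them."""
    events = []
    parser = xml.parsers.expat.ParserCreate()

    # Each handler is a callback that Expat invokes as it streams through the input.
    def on_start(name, attrs):
        events.append(("start", name, attrs))

    def on_end(name):
        events.append(("end", name))

    def on_text(data):
        # Skip whitespace-only character data to keep the event list short.
        if data.strip():
            events.append(("text", data))

    # Register the handlers on the parser object.
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text

    # The second argument marks this as the final chunk of input.
    parser.Parse(xml_text, True)
    return events


if __name__ == "__main__":
    for event in collect_events('<note id="1"><to>Ann</to></note>'):
        print(event)
```

Because the parser is stream-oriented, no document tree is built; the application keeps only the state it records in its own handlers.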

Nokogiri - HTML, XML, SAX, and Reader parser with XPath and CSS selector support


Nokogiri (鋸) is an HTML, XML, SAX, and DOM parser. Its features include searching documents via XPath or CSS3 selectors, an XML/HTML builder, and an XSLT transformer. Nokogiri parses and searches XML/HTML using native libraries (either C or Java, depending on your Ruby), which means it's fast and standards-compliant.

XOM - XML object model in Java


XOM is a new XML object model. It is a tree-based API for processing XML with Java that strives for correctness, simplicity, and performance, in that order.

TagSoup - HTML/XML parser for Haskell


TagSoup is a library for parsing HTML/XML. It supports the HTML 5 specification, and can be used to parse either well-formed XML, or unstructured and malformed HTML from the web. The library also provides useful functions to extract information from an HTML document, making it ideal for screen-scraping.

Libxml++


libxml++ is a C++ wrapper for the libxml XML parser library.




Xerces-C++


Xerces-C++ is a validating XML parser written in a portable subset of C++. Xerces-C++ makes it easy to give your application the ability to read and write XML data.

Arabica


Arabica is an XML and HTML processing toolkit, providing SAX, DOM, XPath, and partial XSLT implementations, written in Standard C++.

Dclib - Portable C++ library


dlib is a library for developing portable applications dealing with networking, threads, graphical interfaces, data structures such as linked lists and binary search trees, linear algebra and matrix utilities, machine learning algorithms, XML and text parsing, numerical optimization, Bayesian nets, data compression routines, and many other general utilities.

Piccolo


Piccolo is a small, extremely fast XML parser for Java. It implements the SAX 1, SAX 2.0.1, and JAXP 1.1 (SAX parsing only) interfaces as a non-validating parser and attempts to detect all XML well-formedness errors. Piccolo was developed by Yuval Oren.

TclXML


The TclXML project is a collection of tools and libraries for handling XML documents with the Tcl scripting language.

lxml-python


lxml is a Pythonic binding for the libxml2 and libxslt libraries.

libxml-Perl


A Perl interface to the GNOME libxml2 XML parsing and DOM library.

Libxml-Ruby


The Libxml-Ruby project provides Ruby language bindings for the GNOME Libxml2 XML toolkit.

SAXExpat


This is a SAX for .NET parser implementation based on the popular Expat XML parser.