saxerator - A SAX-based XML parser for parsing large files into manageable chunks


Saxerator is a streaming XML-to-hash parser designed for very large XML files. It gives you Enumerable access to manageable chunks of the document, and each chunk is parsed into a JSON-like Ruby Hash for consumption.
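
The chunk-at-a-time idea described above can be sketched outside Ruby as well. The following is a minimal illustration in Python using only the standard library; it is not Saxerator's API, and the names `each_chunk` and `element_to_dict` are hypothetical helpers invented for this example. It assumes the document is a stream of repeated elements (here `entry`) that each fit comfortably in memory.

```python
import io
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Turn one element into a JSON-like structure (hypothetical helper)."""
    node = dict(elem.attrib)
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # A repeated tag collapses into a list of values.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text  # leaf element: represent it by its text alone
    if text:
        node["#text"] = text
    return node


def each_chunk(source, tag):
    """Yield one dict per <tag> element while streaming through source."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield element_to_dict(elem)
            elem.clear()  # discard the element's content once it is consumed


if __name__ == "__main__":
    sample = (
        b"<blog>"
        b"<entry id='1'><title>First</title></entry>"
        b"<entry id='2'><title>Second</title></entry>"
        b"</blog>"
    )
    for chunk in each_chunk(io.BytesIO(sample), "entry"):
        print(chunk)
```

Because `each_chunk` is a generator, the caller sees one small dictionary at a time rather than a tree for the whole file, which is the property the description above emphasizes.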



Related Projects

Nokogiri - HTML, XML, SAX, and Reader parser with XPath and CSS selector support

  •    Ruby

Nokogiri (鋸) is an HTML, XML, SAX, and DOM parser. Its many features include searching documents via XPath or CSS3 selectors, an XML/HTML builder, and an XSLT transformer. Nokogiri parses and searches XML/HTML using native libraries (either C or Java, depending on your Ruby), which means it's fast and standards-compliant.

SAX for .NET

  •    xml

SAX for .NET is the port of SAX to C#.


  •    Java

Piccolo is a small, extremely fast XML parser for Java. It implements the SAX 1, SAX 2.0.1, and JAXP 1.1 (SAX parsing only) interfaces as a non-validating parser and attempts to detect all XML well-formedness errors. Piccolo was developed by Yuval Oren.

SAX: Simple API for XML

  •    Java

SAX is a common front end for XML parsers, much as JDBC is for database access. SAX is widely used by open-source projects like Apache and by corporate users like Sun, IBM, Oracle and Microsoft. SAX was developed by the members of the XML-Dev mailing list.
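
To show what programming against a SAX front end looks like, here is a small sketch using Python's standard `xml.sax` module, which follows the same callback model. The handler class `TitleCollector` and the function `collect_titles` are hypothetical names made up for this example, and the `title` element is an assumed input shape.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collect the text of every <title> element through SAX callbacks."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # a list only while the parser is inside <title>

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None


def collect_titles(document):
    """Parse a bytes document and return the collected titles."""
    handler = TitleCollector()
    xml.sax.parseString(document, handler)
    return handler.titles


if __name__ == "__main__":
    print(collect_titles(b"<feed><title>A</title><item><title>B</title></item></feed>"))
```

The application never sees a document tree; it only reacts to events, which is why the same handler can be reused across different underlying parsers.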

TagSoup - SAX-compliant parser in Java

  •    Java

TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. TagSoup also includes a command-line processor that reads HTML files and can generate either clean HTML or well-formed XML that is a close approximation to XHTML.

xml-stream - XML stream parser based on Expat. Made for Node.

  •    Javascript

XmlStream is a Node.js XML stream parser and editor, based on node-expat (libexpat SAX-like parser binding). When working with large XML files, it is probably a bad idea to use an XML to JavaScript object converter, or simply buffer the whole document in memory. Then again, a typical SAX parser might be too low-level for some tasks (and often a real pain).


  •    C++

Xerces-C++ is a validating XML parser written in a portable subset of C++. Xerces-C++ makes it easy to give your application the ability to read and write XML data.

sax-machine - A declarative sax parsing library backed by Nokogiri.

  •    Ruby

A declarative SAX parsing library. SAX Machine can use Nokogiri, Ox, or Oga as its XML SAX handler.


  •    C++

Arabica is an XML and HTML processing toolkit, providing SAX, DOM, XPath, and partial XSLT implementations, written in Standard C++.


  •    C#

This is a SAX for .NET parser implementation based on the popular Expat XML parser.

Stack API for XML

  •    Java

StAX is an extension to the popular SAX 2 API for event-based XML parsing. It allows developers to write modular, extensible, XML document handlers while still maintaining the efficiency of SAX.

xmliter: A High Performance XML Iterator

  •    Java

The xmliter package provides an API for processing XML data that is easier to use than SAX or DOM, performs almost as well as SAX, and works with large documents that won't fit into memory using DOM.

node-expat - libexpat XML SAX parser binding for node.js

  •    Javascript

We don't emit an error event because libexpat doesn't use a callback either. Instead, check that parse() returns true. A descriptive string can be obtained via getError() to provide user feedback. Alternatively, use the Parser like a node Stream. write() will emit error events.

Streaming XML -- DOM event processing

  •    Java

This package provides an acceptable middle ground between SAX and DOM techniques for parsing XML. It provides DOM events in a SAX-like manner. Thus, the application can handle elements without storing the entire DOM tree in memory.
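
The middle ground described above, DOM nodes delivered through a SAX-like event stream, can be sketched with Python's standard `xml.dom.pulldom` module. This is not the API of the Java package listed here; `expand_matching` is a hypothetical helper, and the `order` and `item` element names are assumptions made for the example.

```python
from xml.dom import pulldom


def expand_matching(document, tag):
    """Yield a small DOM subtree for each <tag>, skipping everything else."""
    events = pulldom.parseString(document)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == tag:
            events.expandNode(node)  # build DOM children for this element only
            yield node


if __name__ == "__main__":
    sample = (
        "<orders>"
        "<order id='1'><item>pen</item></order>"
        "<order id='2'><item>ink</item></order>"
        "</orders>"
    )
    for order in expand_matching(sample, "order"):
        print(order.getAttribute("id"), order.toxml())
```

Each yielded node supports the usual DOM calls such as `getAttribute` and `getElementsByTagName`, but only the expanded element is materialized, not the full document tree.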


  •    Java

The name "SASAX" comes from "Simple API for SAX (Simple API for XML)". SASAX is a framework for easily parsing XML documents on top of SAX.

GXPARSE: XML stream parser API

  •    Java

Generic Java XML stream parser API makes it much easier to use event-based stream parsers like SAX Parser. Includes an implementation for SAX parser. Also supports recursive pattern matching.

Piccolo XML Parser for Java

  •    Java

Piccolo is the fastest SAX parser for Java, supporting SAX1, SAX2, and JAXP (SAX only). Piccolo is different from other parsers in that it was developed using parser generators. It weighs 160K including XML APIs. See for more info.

sax-js - A sax style parser for JS

  •    Javascript

A sax-style parser for XML and HTML. Designed with node in mind, but should work fine in the browser or other CommonJS implementations.

Apache Xerces for Perl XML Parser - Perl API to the Apache Xerces XML parser.

  •    Perl

Perl API to the Apache Xerces XML parser.

Apache Xerces for Java XML Parser

  •    Java

Xerces-J is a validating XML parser written in Java.