XOM - XML object model in Java


XOM is a new XML object model. It is a tree-based API for processing XML with Java that strives for correctness, simplicity, and performance, in that order.

XOM is unusual in that it is a dual streaming/tree-based API. Individual nodes in the tree can be processed while the document is still being built. This enables XOM programs to operate almost as fast as the underlying parser can supply data. You don't need to wait for the document to be completely parsed before you start working with it.

XOM is very memory efficient. If you read an entire document into memory, XOM uses as little memory as possible.

http://www.cafeconleche.org/XOM/

Related Projects

Apache Xerces for Perl XML Parser - Perl API to the Apache Xerces XML parser.

  •    Perl

Perl API to the Apache Xerces XML parser.

Apache Xerces for Java XML Parser

  •    Java

Xerces-J is a validating XML parser written in Java.

xml-stream - XML stream parser based on Expat. Made for Node.

  •    Javascript

XmlStream is a Node.js XML stream parser and editor, based on node-expat (libexpat SAX-like parser binding). When working with large XML files, it is probably a bad idea to use an XML to JavaScript object converter, or simply buffer the whole document in memory. Then again, a typical SAX parser might be too low-level for some tasks (and often a real pain).

Professional XML Parser


ProXMLParser project aims at developing a Professional XML Parser using Microsoft .NET framework.

Nokogiri - HTML, XML, SAX, and Reader parser with XPath and CSS selector support

  •    Ruby

Nokogiri (鋸) is an HTML, XML, SAX, and DOM parser. Its many features include searching documents via XPath or CSS3 selectors, an XML/HTML builder, and an XSLT transformer. Nokogiri parses and searches XML/HTML using native libraries (either C or Java, depending on your Ruby), which means it's fast and standards-compliant.


Expat

  •    C

Expat is an XML parser library written in C. It is a stream-oriented parser in which an application registers handlers for things the parser might find in the XML document (like start tags).
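The handler-registration style described above can be sketched in Java. Expat itself is a C library, so this example swaps in the JDK's built-in SAX API, which follows the same stream-oriented idea: the application supplies a handler, and the parser calls it for each start tag it encounters. The class name `StartTagCollector` and the method `startTags` are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records the name of every start tag the parser reports.
class StartTagCollector extends DefaultHandler {
    final List<String> names = new ArrayList<>();

    // Called by the parser each time it finds a start tag in the stream.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        names.add(qName);
    }

    // Parses the given XML text and returns start-tag names in document order.
    static List<String> startTags(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            StartTagCollector handler = new StartTagCollector();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            // Configuration, I/O, and well-formedness errors all end up here.
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }
}
```

Calling `StartTagCollector.startTags(xml)` returns the element names in the order the parser encountered them, without ever building a tree in memory.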

pugixml - Light-weight, simple and fast XML parser for C++ with XPath support

  •    C++

pugixml is a C++ XML processing library, which consists of a DOM-like interface with rich traversal/modification capabilities, an extremely fast XML parser which constructs the DOM tree from an XML file/buffer, and an XPath 1.0 implementation for complex data-driven tree queries. Full Unicode support is also available, with Unicode interface variants and conversions between different Unicode encodings (which happen automatically during parsing/saving). pugixml is used by many projects, both open-source and proprietary, for its performance and easy-to-use interface.

fast-xml-parser - Validate XML, parse XML to JS/JSON and vice versa, or parse XML to Nimn rapidly without C/C++ based libraries and without callbacks

  •    Javascript

This project welcomes contributors. If you have a feature you'd like to see implemented or a bug you'd like fixed, the best and fastest way to make that happen is to implement it and submit a PR. Basic knowledge of JS is sufficient. Feel free to ask for guidance. To use it from the CLI, install it globally with the -g option.

xml-rs - An XML library in Rust

  •    Rust

xml-rs is an XML library for the Rust programming language. It is heavily inspired by the Java Streaming API for XML (StAX). The library currently contains a pull parser much like the StAX event reader. It provides an iterator API, so you can leverage the features of Rust's existing iterator library.
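For readers unfamiliar with the StAX event reader that xml-rs is modeled on, here is a minimal Java sketch of pull parsing using the JDK's `javax.xml.stream` package. It does not show xml-rs itself; the class name `PullParserSketch` and the method `elementNames` are hypothetical. The point is that the application asks for the next event in a loop, rather than the parser calling back into the application.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical example class: pulls events one at a time from a StAX reader.
class PullParserSketch {
    // Returns the local names of all start elements, in document order.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            // The caller drives the loop; the parser only advances when asked.
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    names.add(event.asStartElement().getName().getLocalPart());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
        return names;
    }
}
```

Because the loop is in the caller's hands, it can stop early, skip ahead, or hand the reader to another method, which is the property an iterator-based API such as the one described above also provides.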

TagSoup - HTML/XML parser for Haskell

  •    Haskell

TagSoup is a library for parsing HTML/XML. It supports the HTML 5 specification, and can be used to parse either well-formed XML, or unstructured and malformed HTML from the web. The library also provides useful functions to extract information from an HTML document, making it ideal for screen-scraping.

Arabica

  •    C++

Arabica is an XML and HTML processing toolkit, providing SAX, DOM, XPath, and partial XSLT implementations, written in Standard C++.

RSS Parser and XML Parser for PHP 5+

  •    PHP

A full XML parser for PHP with RSS-specific parsing functions; think of it as an interface to the PHP DOM that allows easy access to your XML-based documents. It offers automatic encoding conversion to UTF-8 and array-to-XML conversion. V3 is now a commercial product.

posthtml - PostHTML is a tool to transform HTML/XML with JS plugins

  •    Javascript

PostHTML is a tool for transforming HTML/XML with JS plugins. PostHTML itself is very small. It includes only an HTML parser, an HTML node tree API, and a node tree stringifier. All HTML transformations are made by plugins. These plugins are small plain JS functions that receive an HTML node tree, transform it, and return the modified tree.

Radiance DomProfiler

  •    Java

There exist many implementations of XML parsers that create DOM. The Radiance DomProfiler parses an XML file and builds a DOM from a handful of available parsers - CRIMSON, DOM4J, JDOM, SPARTA, XOM, XERCES, XPP - to compare time taken and memory used.

swan

  •    Java

swan is a suite of Java-based tools for working with XML. The focus is on a hybridized model that blends pattern-based and event-based models for XML processing, as well as supporting the leading tree-based models (DOM, JDOM, dom4j, XOM, etc.).

Piccolo

  •    Java

Piccolo is a small, extremely fast XML parser for Java. It implements the SAX 1, SAX 2.0.1, and JAXP 1.1 (SAX parsing only) interfaces as a non-validating parser and attempts to detect all XML well-formedness errors. Piccolo was developed by Yuval Oren.

Libxml++

  •    C

libxml++ is a C++ wrapper for the libxml XML parser library.

Fuzi - A fast & lightweight XML & HTML parser in Swift with XPath & CSS support

  •    Swift

Fuzi is based on a Swift port of Mattt Thompson's Ono(斧), using most of its low-level implementations with moderate class and interface redesign following standard Swift conventions, along with several bug fixes. Fuzi(斧子) means "axe", in homage to Ono(斧), which in turn is inspired by Nokogiri (鋸), which means "saw".

TagSoup - SAX-compliant parser in Java

  •    Java

TagSoup, a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. TagSoup also includes a command-line processor that reads HTML files and can generate either clean HTML or well-formed XML that is a close approximation to XHTML.

htmlparser2 - forgiving html and xml parser

  •    Javascript

A forgiving HTML/XML/RSS parser. The parser can handle streams and provides a callback interface. A live demo of htmlparser2 is also available.





