Nokogiri - HTML, XML, SAX, and Reader parser with XPath and CSS selector support

  •        731

Nokogiri (鋸) is an HTML, XML, SAX, and DOM parser. Among its many features are the ability to search documents via XPath or CSS3 selectors, an XML/HTML builder, and an XSLT transformer. Nokogiri parses and searches XML/HTML using native libraries (either C or Java, depending on your Ruby), which means it's fast and standards-compliant.
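
To make the idea of searching a parsed document by path expression concrete, here is a minimal sketch. It uses Python's standard xml.etree.ElementTree module, which supports only a limited XPath subset, and it is not Nokogiri's API. The sample document and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this illustration.
XML = """<library>
  <book lang="en"><title>Saws</title></book>
  <book lang="ja"><title>Nokogiri</title></book>
</library>"""

root = ET.fromstring(XML)

# Child-path query: every <title> that sits directly under a <book>.
titles = [t.text for t in root.findall("./book/title")]

# Attribute predicate: only the books whose lang attribute is "ja".
japanese = [b.find("title").text for b in root.findall("./book[@lang='ja']")]

print(titles)
print(japanese)
```

A full XPath or CSS3 selector engine, such as the one Nokogiri exposes, accepts far richer expressions than the subset shown here.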

http://www.nokogiri.org/
https://github.com/sparklemotion/nokogiri

Related Projects

Arabica

  •    C++

Arabica is an XML and HTML processing toolkit, providing SAX, DOM, XPath, and partial XSLT implementations, written in Standard C++.

TagSoup - SAX-compliant parser in Java

  •    Java

TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. TagSoup also includes a command-line processor that reads HTML files and can generate either clean HTML or well-formed XML that is a close approximation to XHTML.

Xerces-C++

  •    C++

Xerces-C++ is a validating XML parser written in a portable subset of C++. Xerces-C++ makes it easy to give your application the ability to read and write XML data.

Piccolo

  •    Java

Piccolo is a small, extremely fast XML parser for Java. It implements the SAX 1, SAX 2.0.1, and JAXP 1.1 (SAX parsing only) interfaces as a non-validating parser and attempts to detect all XML well-formedness errors. Piccolo was developed by Yuval Oren.
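
For readers unfamiliar with the SAX model mentioned throughout this list, the following sketch shows an event handler and a well-formedness check. It assumes Python's standard xml.sax module rather than Piccolo or any Java API. ElementCounter, count_elements, and is_well_formed are hypothetical names chosen for this example.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Counts start-element events per tag name."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser each time an opening tag is seen.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(xml_text):
    """Parse the text and return a dict of tag name -> occurrences."""
    handler = ElementCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.counts


def is_well_formed(xml_text):
    """Return False if the parser reports a well-formedness error."""
    try:
        xml.sax.parseString(xml_text.encode("utf-8"), xml.sax.ContentHandler())
        return True
    except xml.sax.SAXParseException:
        return False
```

The handler never sees a tree; it only receives events in document order, which is what keeps SAX-style parsers small and fast.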

Apache Xerces for Perl XML Parser - Perl API to the Apache Xerces XML parser.

  •    Perl

Perl API to the Apache Xerces XML parser.


xml-stream - XML stream parser based on Expat. Made for Node.

  •    Javascript

XmlStream is a Node.js XML stream parser and editor, based on node-expat (libexpat SAX-like parser binding). When working with large XML files, it is probably a bad idea to use an XML to JavaScript object converter, or simply buffer the whole document in memory. Then again, a typical SAX parser might be too low-level for some tasks (and often a real pain).
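
The middle ground described above, between buffering a whole document and handling raw SAX events, can be sketched in Python with the standard library's ElementTree.iterparse. This is an analogous technique, not XmlStream's API. The item and name element names and the iter_item_names helper are assumptions made for this example.

```python
import io
import xml.etree.ElementTree as ET


def iter_item_names(source):
    """Yield the text of <name> inside each <item> as soon as the item closes."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "item":
            yield elem.findtext("name")
            # Drop the item's children and text now that they are processed.
            # The emptied element itself stays attached to its parent.
            elem.clear()


# Usage with an in-memory stream; a real file object works the same way.
data = io.BytesIO(
    b"<items><item><name>a</name></item><item><name>b</name></item></items>"
)
names = list(iter_item_names(data))
print(names)
```

Each item is handed over as a small, complete subtree, so the calling code stays simple while memory use stays roughly bounded by the size of one item.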

Apache Xerces for Java XML Parser

  •    Java

Xerces-J is a validating XML parser written in Java.

Libxml++

  •    C++

libxml++ is a C++ wrapper for the libxml XML parser library.

SAXExpat

  •    C#

This is a SAX for .NET parser implementation based on the popular Expat XML parser.

sax-machine - A declarative sax parsing library backed by Nokogiri.

  •    Ruby

A declarative SAX parsing library backed by Nokogiri, Ox or Oga. SAX Machine can use either nokogiri, ox or oga as XML SAX handler.

Fuzi - A fast & lightweight XML & HTML parser in Swift with XPath & CSS support

  •    Swift

Fuzi is based on a Swift port of Mattt Thompson's Ono(斧), using most of its low-level implementations with moderate class & interface redesign following standard Swift conventions, along with several bug fixes. Fuzi(斧子) means "axe", in homage to Ono(斧), which in turn is inspired by Nokogiri (鋸), which means "saw".

GXPARSE: XML stream parser API

  •    Java

A generic Java XML stream parser API that makes it much easier to use event-based stream parsers such as SAX. Includes an implementation for SAX parsers. Also supports recursive pattern matching.

node-xml2js - XML to JavaScript object converter.

  •    CoffeeScript

Simple XML to JavaScript object converter. It supports bi-directional conversion. Uses sax-js and xmlbuilder-js. Note: If you're looking for a full DOM parser, you probably want JSDom.

NQXML

  •    Ruby

NQXML is a pure Ruby implementation of a non-validating XML processor. It includes an XML tokenizer, a SAX-style streaming XML parser, a DOM-style tree parser, an XML writer, and a context-sensitive callback mechanism.

Piccolo XML Parser for Java

  •    Java

Piccolo is the fastest SAX parser for Java, supporting SAX1, SAX2, and JAXP (SAX only). Piccolo is different from other parsers in that it was developed using parser generators. It weighs 160K including XML APIs. See http://piccolo.sf.net for more info.

Erlsom

  •    Erlang

An Erlang library for XML parsing. It supports various modes of operation: as an efficient SAX parser, as a simple DOM-like parser, or as a 'data mapper'. The data mapper transforms the XML document to Erlang records, based on an XML Schema.

libxml-Perl

  •    Perl

Perl interface to the GNOME libxml2 XML parsing and DOM library.

node-expat - libexpat XML SAX parser binding for node.js

  •    Javascript

The parser does not emit an error event, because libexpat itself does not report errors through a callback. Instead, check that parse() returns true; a descriptive string can be obtained via getError() to provide user feedback. Alternatively, use the Parser like a Node stream: write() will emit error events.
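
The "check the result, then ask for a message" pattern described above can be sketched with Python's own libexpat binding, xml.parsers.expat. Note the difference: the Python binding raises ExpatError instead of returning a boolean, so the hypothetical parse_or_error helper below wraps it into a success flag plus message. This is not node-expat's API.

```python
import xml.parsers.expat
from xml.parsers.expat import ErrorString, ExpatError


def parse_or_error(xml_text):
    """Return (True, None) on success, or (False, message) on a parse error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this chunk as the final piece of input.
        parser.Parse(xml_text, True)
        return True, None
    except ExpatError as err:
        # ErrorString maps libexpat's numeric error code to a description.
        return False, ErrorString(err.code)


ok, message = parse_or_error("<a><b></a>")
if not ok:
    print("parse failed:", message)
```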

parser-lib - Collection of parsers written in JavaScript

  •    Javascript

The ParserLib CSS parser is a CSS3 SAX-inspired parser written in JavaScript. It handles standard CSS syntax as well as validation (checking of property names and values), although it is not guaranteed to thoroughly validate all possible CSS properties. The CSS parser is built for a number of different JavaScript environments. The most recently released version of the parser can be found in the dist directory when you check out the repository; run npm run build to regenerate the built files from the latest sources.

sax-js - A sax style parser for JS

  •    Javascript

A sax-style parser for XML and HTML. Designed with node in mind, but should work fine in the browser or other CommonJS implementations.