himalaya - JavaScript HTML to JSON Parser

Download himalaya.js and put it in a

Related Projects

Himalaya Data Mining Tools

  •    C++

Himalaya Tools is a suite of programs focusing on new techniques in data mining. MAFIA/SPAM mine patterns from transactional databases. SECRET is a new algorithm for scalable linear regression trees. More algorithms will be added over time.

sajson - Lightweight, extremely high-performance JSON parser for C++11

  •    C++

sajson is an extremely high-performance, in-place, DOM-style JSON parser written in C++. Originally, sajson meant Single Allocation JSON, but it now supports dynamic allocation too. sajson parses an input document into a contiguous AST structure. Unlike some other high-performance JSON parsers, the AST is efficiently queryable. Object lookups by key are O(lg N) and array indexing is O(1).
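The lookup costs quoted above can be illustrated with a small sketch. This is not sajson's code or API: it is a toy Python model that assumes object keys are stored in sorted order so a lookup can binary-search them, while arrays are plain indexable sequences. The class name `SortedObject` and its `get` method are hypothetical.

```python
from bisect import bisect_left


class SortedObject:
    """Toy JSON object: keys are kept sorted so lookups can binary-search.

    Hypothetical illustration only; not sajson's real data layout or API.
    """

    def __init__(self, pairs):
        # Sort once at "parse" time; lookups never re-sort.
        items = sorted(pairs.items(), key=lambda kv: kv[0])
        self._keys = [k for k, _ in items]
        self._values = [v for _, v in items]

    def get(self, key, default=None):
        # Binary search over the sorted keys: O(log N) comparisons.
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return default


doc = SortedObject({"name": "sajson", "lang": "C++", "tags": ["json", "parser"]})
print(doc.get("lang"))      # key lookup via binary search
print(doc.get("tags")[1])   # array indexing is a direct O(1) access
print(doc.get("missing"))   # absent keys fall back to the default
```

Sorting the keys once up front is what makes every later lookup logarithmic rather than a linear scan.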

flexmark-java - CommonMark/Markdown Java parser with source level AST

  •    Java

A Java re-implementation of a commonmark-java based parser, with an AST that reflects source elements, full source position tracking, and greater parser extensibility.

Jodd - The Unbearable Lightness of Java

  •    Java

Jodd is a developer-friendly set of Java microframeworks, tools and utilities, under 1.7 MB. Built with common sense to make things simple, but not simpler. Its features include a slick IoC container, elegant MVC framework, unique AOP engine, thin DB-object mapper, standalone transaction manager, focused validation tool, versatile HTML parsers, pages decorator, super properties, powerful BeanUtil, timeless JDateTime, easy email, many super utilities... and more.

csstree - A tool set for working with CSS including fast detailed parser, walker, generator and lexer based on W3C specs and browser implementations

  •    Javascript

CSSTree is a tool set for working with CSS, including a fast, detailed parser (string->AST), walker (AST traversal), generator (AST->string) and lexer (validation and matching) based on knowledge of the spec and browser implementations. The main goal is to be efficient and W3C spec compliant, with a focus on CSS analysis and source-to-source transformation tasks. NOTE: The project is in alpha stage since some parts need further improvement; the AST format and API are subject to change. However, it's stable enough and is used in production by packages like CSSO (CSS minifier) and SVGO (SVG optimizer).


json4s - A single AST to be used by other scala json libraries

  •    Scala

At this moment there are at least 6 JSON libraries for Scala, not counting the Java JSON libraries. All these libraries have a very similar AST. This project aims to provide a single AST to be used by other Scala JSON libraries. At this moment the approach to working with the AST has been taken from lift-json, and the native package is in fact lift-json but outside of the Lift project.

espree - An Esprima-compatible JavaScript parser

  •    Javascript

Espree started out as a fork of Esprima v1.2.2, the last stable published release of Esprima before work on ECMAScript 6 began. Espree is now built on top of Acorn, which has a modular architecture that allows extension of core functionality. The goal of Espree is to produce output that is similar to Esprima with a similar API so that it can be used in place of Esprima. The primary goal is to produce the exact same AST structure and tokens as Esprima, and that takes precedence over anything else. (The AST structure being the ESTree API with JSX extensions.) Separate from that, Espree may deviate from what Esprima outputs in terms of where and how comments are attached, as well as what additional information is available on AST nodes. That is to say, Espree may add more things to the AST nodes than Esprima does, but the overall AST structure produced will be the same.

cppast - Library to parse and work with the C++ AST

  •    C++

Library interface to the C++ AST: parse source files, synthesize entities, get documentation comments and generate code. If you're writing a tool that needs access to the C++ AST (e.g. documentation generator, reflection library, …), your only option apart from writing your own parser is to use clang. It offers three interfaces for tools, but the only one that really works for standalone applications is libclang. However, libclang has various limitations and does not expose the entire AST.

libgraphqlparser - A GraphQL query parser in C++ with C and C++ APIs

  •    C++

libgraphqlparser is a parser, implemented in C++11, for GraphQL, a query language created by Facebook for describing data requirements on complex application data models. It can be used on its own in C++ code (or in C code via the pure C API defined in the c subdirectory), or you can use it as the basis for an extension module for your favorite programming language instead of writing your own parser from scratch. The provided dump_json_ast is a simple program that reads GraphQL text on stdin and prints a JSON representation of the AST to stdout.

mark - A simple and unified notation for both object data, like JSON, and markup data, like HTML and XML

  •    Javascript

Objective Markup Notation, abbreviated as Mark Notation or just Mark, is a new unified notation for both object and markup data. The notation is a superset of what can be represented by JSON, HTML and XML, and overcomes many limitations of these popular data formats while still having a very clean syntax and simple data model. The major syntax extension Mark makes to JSON is the introduction of a Mark object. It is a JSON object extended with a type name and a list of content items, similar to an element in HTML and XML.
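The data model described above, an object with a type name, properties, and a list of content items, can be sketched roughly as follows. This is a Python illustration of the idea, not the Mark library's API or the Mark syntax itself; the names `MarkObject` and `to_html` are hypothetical, and the renderer does no escaping.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class MarkObject:
    """Hypothetical model of a Mark object: a JSON-like object plus a
    type name and an ordered list of content items."""
    type_name: str
    properties: Dict[str, Any] = field(default_factory=dict)
    contents: List[Any] = field(default_factory=list)


def to_html(node):
    """Render a MarkObject tree as an HTML-like string (no escaping)."""
    if isinstance(node, str):
        return node
    attrs = "".join(f' {k}="{v}"' for k, v in node.properties.items())
    inner = "".join(to_html(child) for child in node.contents)
    return f"<{node.type_name}{attrs}>{inner}</{node.type_name}>"


link = MarkObject("a", {"href": "https://example.com"}, ["docs"])
para = MarkObject("p", {"class": "note"}, ["See the ", link, "."])
print(to_html(para))
```

The properties play the role of a JSON object's key-value pairs, and the type name and contents are what make the same value readable as a markup element.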

ast-types - Esprima-compatible implementation of the Mozilla JS Parser API

  •    Javascript

This module provides an efficient, modular, Esprima-compatible implementation of the abstract syntax tree type hierarchy pioneered by the Mozilla Parser API. Because it understands the AST type system so thoroughly, this library is able to provide excellent node iteration and traversal mechanisms.

php-parser - PHP parser written in Go

  •    Go

This project uses the goyacc and golex libraries to parse PHP sources into an AST. It can be used to write static analysis, refactoring, metrics, and code style formatting tools, and it can dump the AST to stdout.

yacy_grid_parser - Parser Microservice for the YaCy Grid

  •    Java

The Parser is a microservice which can be deployed, e.g., using Docker. When the Parser Component is started, it searches for an MCP and connects to it. By default the local host is searched for an MCP, but you can configure one yourself. The Parser is able to read a WARC file and parse its content. The content is analyzed, and the plain text, links, images and other entities are extracted. The result is stored in a JSON Object. Calling the parser will generate a list of JSON Objects, each containing the analyzed content of one internet resource. The parser understands not only HTML but also a wide range of other document formats, including PDF, all OpenOffice and MS Office document formats and many more.

tolerant-php-parser - An early-stage PHP parser designed for IDE usage scenarios.

  •    PHP

This is an early-stage PHP parser designed, from the beginning, for IDE usage scenarios (see Design Goals for more details). There is still a ton of work to be done, so at this point, this repo mostly serves as an experiment and the start of a conversation. After you've configured your machine, you can use the parser to generate and work with the Abstract Syntax Tree (AST) via a friendly API.

pikkr - JSON parser which picks up values directly without performing tokenization in Rust

  •    Rust

Pikkr is a JSON parser, written in Rust, which picks up values directly without performing tokenization. Its implementation is based on Y. Li, N. R. Katsipoulakis, B. Chandramouli, J. Goldstein, and D. Kossmann, "Mison: A Fast JSON Parser for Data Analytics," VLDB, 2017. This JSON parser performs well when there are a limited number of different JSON structural variants in a JSON data stream or JSON collection, which is a common case in the data analytics field.

fast-xml-parser - Validate XML, Parse XML to JS/JSON and vice versa, or parse XML to Nimn rapidly without C/C++ based libraries and no callback

  •    Javascript

This project welcomes contributors. If you have a feature you'd like to see implemented or a bug you'd like fixed, the best and fastest way to make that happen is to implement it and submit a PR. Basic knowledge of JS is sufficient. Feel free to ask for any guidance. To use it from the CLI, install it globally with the -g option.

Babelfish.NET

  •    DotNet

Babelfish was created as a common framework for navigating several different node-to-node structured data sources, such as HTML, CSS, JavaScript, XML & JSON. Developed in C# .NET 3.5.

Noggit - JSON streaming parser

  •    Java

Noggit is the world's fastest streaming JSON parser for Java. It is used in Apache Solr.

proposal-binary-ast - Binary AST proposal for ECMAScript

This is the explainer document for a proposed new binary AST format for JS. Performance of applications on the web platform is becoming increasingly bottlenecked by startup (load) time. Larger amounts of JS code are transferred over the wire by more sophisticated web properties. While caching helps, these properties regularly release new code, and cold load times are very important.

jsonstreamingparser - A JSON streaming parser implementation in PHP.

  •    PHP

This is a simple, streaming parser for processing large JSON documents. Use it for parsing very large JSON documents to avoid loading the entire thing into memory, which is how just about every other JSON parser for PHP works. For more details, I've written up a longer explanation of the JSON streaming parser that talks about pros and cons vs. the standard PHP JSON parser.