node-parquet - Node.js module to access Apache Parquet format files


Parquet is a columnar storage format available to any project in the Hadoop ecosystem. This Node.js module provides native bindings to the Parquet functions from parquet-cpp. A pure JavaScript Parquet format driver (still in development) is also provided.
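To make "columnar" concrete, here is a minimal sketch that pivots row-oriented records into per-column arrays. The helper name toColumns and the sample records are hypothetical. It illustrates only the general idea of a column-oriented layout, not the Parquet encoding and not the node-parquet API, and it assumes every record has the same keys.

```javascript
// Hypothetical helper (not part of node-parquet): pivot an array of
// row objects into one array per column.
function toColumns(rows) {
  // A null-prototype object avoids clashes with inherited keys.
  const columns = Object.create(null);
  for (const row of rows) {
    for (const [key, value] of Object.entries(row)) {
      if (!(key in columns)) columns[key] = [];
      columns[key].push(value);
    }
  }
  return columns;
}

// Illustrative sample data, assumed for this sketch.
const rows = [
  { id: 1, name: 'a', price: 2.5 },
  { id: 2, name: 'b', price: 4.0 },
];

const columns = toColumns(rows);
console.log(columns.id, columns.name, columns.price);
```

Storing each column's values together is what lets a reader scan a single field without touching the others.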

https://github.com/skale-me/node-parquet

Dependencies:

hexdump-nodejs : ^0.1.0
minimist : ^1.2.0
nan : ^2.5.0
varint : ^5.0.0


Related Projects

incubator-parquet-mr - Mirror of Apache Parquet

  •    Java

Parquet is a very active project, and new features are being added quickly; below is the state as of June 2013.

Feature | In trunk | In dev | Planned | Expected release
Type-specific encoding | YES | | | 1.0
Hive integration | YES (28: https://github.com/Parquet/parquet-mr/pull/28) | | | 1.0
Pig integration | YES | …

parquet-go - Golang version of Read/Write parquet file

  •    Go

parquet-go is a pure-go implementation of reading and writing the parquet format file. Look at examples in example/.

Gaffer - A large-scale entity and relation database supporting aggregation of properties

  •    Java

Gaffer is a graph database framework. It allows the storage of very large graphs containing rich properties on the nodes and edges. Several storage options are available, including Accumulo, Hbase and Parquet. It is designed to be as flexible, scalable and extensible as possible, allowing for rapid prototyping and transition to production systems.

parquet-format - Mirror of Apache Parquet

  •    Java

Parquet is a columnar storage format that supports nested data. This provides all generated metadata code.

parquet-mr - Mirror of Apache Parquet

  •    Java

Parquet is a columnar storage format that supports nested data. This provides the java implementation.


incubator-hudi - Upserts And Incremental Processing on Big Data

  •    Java

Hoodie is an Apache Spark library that provides the ability to efficiently do incremental processing on datasets in HDFS.

Quilt - Data Engineering Infrastructure

  •    Python

With Quilt you can build, push, and install data packages. Data packages are versioned, reusable data structures that can be loaded into Python. Quilt is designed to support reproducible, auditable, and compliant workflows. Quilt consists of three source-level components: a data catalog, a data registry, and a data compiler.

spindle - Next-generation web analytics processing with Scala, Spark, and Parquet.

  •    Javascript

Spindle is Brandon Amos' 2014 summer internship project with Adobe Research and is not under active development. Analytics platforms such as Adobe Analytics are growing to process petabytes of data in real time. Delivering responsive interfaces that query this amount of data is difficult, and there are many distributed data processing technologies, such as Hadoop MapReduce, Apache Spark, Apache Drill, and Cloudera Impala, for building low-latency query systems.

sparser - Sparser: Raw Filtering for Faster Analytics over Raw Data

  •    C

This code base implements Sparser, raw filtering for faster analytics over raw data. Sparser can parse JSON, Avro, and Parquet data up to 22x faster than the state of the art. For more details, check out our paper published at VLDB 2018.

docpad - Empower your website frontends with layouts, meta-data, pre-processors (markdown, jade, coffeescript, etc

  •    CoffeeScript

Hi! I'm DocPad, I streamline the web development process and help close the gap between experts and beginners. I've been used in production by big and small companies for over a year and a half now to create plenty of amazing and powerful web sites and applications quicker than ever before. What makes me different is that instead of being a box to cram yourself into and hold you back, I'm a freeway to what you want to accomplish, just getting out of your way and allowing you to create stuff quicker than ever before without limits. Leave the redundant stuff up to me, so you can focus on the awesome stuff. Discover my features below, or skip ahead to the installation instructions to get started with a fully functional pre-made website in a few minutes from reading this.

Vespa - Yahoo's big data serving engine

  •    Java

Vespa is an engine for low-latency computation over large data sets. It stores and indexes your data such that queries, selection and processing over the data can be performed at serving time. Vespa is the serving platform for Yahoo.com, Yahoo News, Yahoo Sports, Yahoo Finance, Yahoo Gemini, and Flickr.

Kylin - Extreme OLAP Engine for Big Data

  •    Java

Apache Kylin is an open source distributed analytics engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets. It was originally contributed by eBay Inc. It is designed to reduce query latency on Hadoop for 10+ billion rows of data. It offers ANSI SQL on Hadoop and supports most ANSI SQL query functions.

Apache Tajo - A big data warehouse system on Hadoop

  •    Java

Apache Tajo is a robust big data relational and distributed data warehouse system for Apache Hadoop. Tajo is designed for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract-transform-load) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources.

email-templates - :mailbox: Create, preview, and send custom email templates for Node

  •    Javascript

Create, preview, and send custom email templates for Node.js. Highly configurable and supports automatic inline CSS, stylesheets, embedded images and fonts, and much more! Made for sending beautiful emails with Lad. NEW: v3.x is released (you'll need Node v6.4.0+); see breaking changes below. 2.x branch docs available if necessary.

node-msgpack - A space-efficient object serialization library for NodeJS

  •    Javascript

node-msgpack is an addon for NodeJS that provides an API for serializing and de-serializing JavaScript objects using the MessagePack library. The performance of this addon compared to the native JSON object isn't too bad, and the space required for serialized data is far less than JSON. node-msgpack is currently slower than the built-in JSON.stringify() and JSON.parse() methods. In recent versions of node.js, the JSON functions have been heavily optimized. node-msgpack is still more compact, and we are currently working on performance improvements. Testing shows that, over 500k iterations, msgpack.pack() is about 5x slower than JSON.stringify(), and msgpack.unpack() is about 3.5x slower than JSON.parse().
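A comparison of this kind can be sketched with a small timing loop. The harness below is an assumption for illustration, not the benchmark the node-msgpack authors ran. The timeIt helper, the sample object, and the iteration count are all hypothetical, and only the built-in JSON functions are timed so the snippet runs without extra dependencies. With node-msgpack installed, msgpack.pack() and msgpack.unpack() could be passed to the same helper.

```javascript
// Hypothetical timing helper: run fn `iterations` times and return
// the elapsed wall-clock time in milliseconds.
function timeIt(fn, iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn();
  const elapsedNs = process.hrtime.bigint() - start;
  return Number(elapsedNs) / 1e6;
}

// Illustrative payload; real results depend heavily on data shape.
const sample = { id: 42, tags: ['a', 'b'], nested: { ok: true } };
const encoded = JSON.stringify(sample);

// Smaller than the 500k iterations quoted above, to keep the run short.
const iterations = 10000;

const stringifyMs = timeIt(() => JSON.stringify(sample), iterations);
const parseMs = timeIt(() => JSON.parse(encoded), iterations);

console.log(`JSON.stringify: ${stringifyMs.toFixed(2)} ms`);
console.log(`JSON.parse:     ${parseMs.toFixed(2)} ms`);
```

Timings from a loop like this vary with the Node version, the machine, and JIT warm-up, so they are best read as rough ratios, not absolute figures.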

essential-image-optimization - Essential Image Optimization - an eBook

  •    CSS

Bring up a terminal and type node --version. Node should respond with a version at or above 0.10.x. If you require Node, go to nodejs.org and click on the big green Install button.

flamebearer - Blazing fast flame graph tool for V8 and Node 🔥

  •    Javascript

A blazing fast flame graph tool for Node and V8. Used to visualize and explore performance profiling results. Designed to produce fast, lightweight flame graphs that remain responsive even on really big input.

masteringnode - Open source eBook for nodejs - written w/ markdown, outputs to various formats (pdf, mobi, epub, html, etc)

  •    Javascript

Mastering node is an open source eBook by node hackers for node hackers. I started this as a side project and realized that I don't have time :) so go nuts, download it, build it, fork it, extend it and share it. If you come up with something you wish to contribute back, send me a pull request. Mastering node is written using the markdown files provided in ./chapters, which can then be converted to several output formats, currently including pdf, mobi, epub and of course html.

frisbee - :dog2: Modern fetch-based alternative to axios/superagent/request

  •    Javascript

tl;dr: Stripe-inspired API wrapper for WHATWG's fetch() method for making simple HTTP requests (alternative to superagent, request, axios). If you're using node-fetch, you need node-fetch@v1.5.3 to use form-data with files properly (due to https://github.com/bitinn/node-fetch/issues/102). If you experience form file upload issues, please see https://github.com/facebook/react-native/issues/7564#issuecomment-266323928.