Avro

Avro is a data serialization system. Avro provides: Rich data structures. A compact, fast, binary data format. A container file, to store persistent data. Remote procedure call (RPC). Simple integration with dynamic languages. Code generation is not required to read or write data files nor to use or implement RPC protocols. Code generation as an optional optimization, only worth implementing for statically typed languages.



http://hadoop.apache.org/avro


Bookmark and Share          3105



comments powered by Disqus


Related Products

Cascading - Data Processing Workflows on Hadoop

Cascading is a Data Processing API, Process Planner, and Process Scheduler used for defining and executing complex, scale-free, and fault tolerant data processing workflows on an Apache Hadoop cluster. It is a thin Java library and API that sits on top of Hadoop's MapReduce layer and is executed from the command line like any other Hadoop application.

Read more

Sqoop - Transfers data between Hadoop and Datastores

Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. You can use Sqoop to import data from external structured datastores into Hadoop Distributed File System or related systems like Hive and HBase. Conversely, Sqoop can be used to extract data from Hadoop and export it to external structured datastores such as relational databases and enterprise data warehouses.

Read more

MessagePack

MessagePack is a binary-based efficient object serialization library. It enables to exchange structured objects between many languages like JSON. But unlike JSON, it is very fast and small.

Read more

Hadoop Common

Apache Hadoop is a framework for running applications on large clusters built of commodity hardware. Hadoop common supports other Hadoop subprojects

Read more

Protocol Buffer

Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats.

Read more

SMSj - Java SMS library

This library allows you to send SMSes (GSM) from the Java platform. It gives you full control over the SMS including the UDH field so you can create and send EMS messages, WAP push messages and nokia smart messages (picture, ringtone etc). The API can send SMS by using a GSM phone connected to the serial port or via a SMS gateway (like Clickatell).

Read more

RSS.PY - parser and the serializer in Python

This library provides tools for working with RSS feeds as datastructures. The core is an RSS parser capable of understanding most RSS formats, and a serializer that produces RSS1.0. The RSS channel itself can be represented as any arbitrary data structure; two such structures are provided both as examples and to service common usage. This approach allows channels to be manipulated and stored ina fashion that suits both their semantics and the applications that access them.

Read more

Katta - Lucene and more in the cloud.

Katta is a scalable, failure tolerant, distributed, data storage for real time access. Katta serves large, replicated, indices as shards to serve high loads and very large data sets. These indices can be of different type. Currently implementations are available for Lucene and Hadoop mapfiles.

Read more

Cascalog - Data processing on Hadoop

Cascalog is a fully-featured data processing and querying library for Clojure. The main use cases for Cascalog are processing Big Data on top of Hadoop or doing analysis on your local computer from the Clojure REPL. Cascalog is a replacement for tools like Pig, Hive, and Cascading.

Read more

HBase - Hadoop database

HBase provides support to handle BigTable - billions of rows X millions of columns. It is a scalable, distributed, versioned, column-oriented store modeled after Google's Bigtable and runs on top of HDFS (Hadoop Distributed Filesystem). It features compression, in-memory operation per-column. Data could be replicated between the nodes. HBase is used in Facebook and Twitter.

Read more

Follow feeds Follow bestopensource on Twitter Follow bestopensource on Facebook

Enter your email address:

Delivered by FeedBurner



Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.