GoldenOrb - Scalable Graph Analysis

  •        0

GoldenOrb is a cloud-based project for massive-scale graph analysis, built upon Apache Hadoop and modeled after Google's Pregel architecture. It provides solutions to complex data problems, remove limits to innovation and contribute to the emerging ecosystem that spans all aspects of big data analysis. It enables users to run analytics on entire data sets instead of samples.

Source code:



comments powered by Disqus

Related Projects

Live Graph - Plot and explore your data in real-time

LiveGraph is a framework for real-time data visualisation, analysis and logging. It has a real time plotter that can automatically update graphs of your data while it is still being computed by your application. LiveGraph reads files in a simple CSV-style format. For applications developed in Java, LiveGraph additionally provides an API that handles all data logging and persistency issues.

Gremlin - Graph Traversal Language

Gremlin is a graph traversal language. Gremlin works over those graph databases or frameworks that implement the Blueprints property graph data model. It works beter with graph database like TinkerGraph, Neo4j, OrientDB, DEX, Rexster, and Sail RDF Stores. This language has application in the areas of graph query, analysis, and manipulation.

Java Universal Network/Graph Framework

JUNG provides a common and extendible language for the modeling, analysis, and visualization of data that can be represented as a graph or network.

Scribe - Real time log aggregation used in Facebook

Scribe is a server for aggregating log data that's streamed in real time from clients. It is designed to be scalable and reliable. It is developed and maintained by Facebook. It is designed to scale to a very large number of nodes and be robust to network and node failures. There is a scribe server running on every node in the system, configured to aggregate messages and send them to a central scribe server (or servers) in larger groups.


Framework for text mining, data integration and data analysis. Keywords: ontology and graph alignment, relation mining, warehouse, semantic database integration, bioinformatics, systems biology, microarray, Java.


Jlint will check your Java code and find bugs, inconsistencies and synchronization problems by doing data flow analysis on the code and building the lock graph. Jlint is fast, easy to learn, and requires no changes in the class files to be checked.


Valgrind is an award-winning instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail. You can also use Valgrind to build new tools.

Titan - Scalable Graph Database

Titan is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. Titan is a transactional database that can support thousands of concurrent users executing complex graph traversals. It is a native Blueprints enabled graph database and as such, it supports the full TinkerPop stack of technologies.

Scikit Learn - Machine Learning in Python

scikit-learn is a Python module for machine learning built on top of SciPy. It is simple and efficient tools for data mining and data analysis. It supports automatic classification, clustering, model selection, pre processing and lot more.

Hypertable - A high performance, scalable, distributed storage and processing system for structured

Hypertable is based on Google's Bigtable Design, which is a proven scalable design that powers hundreds of Google services. Many of the current scalable NoSQL database offerings are based on a hash table design which means that the data they manage is not kept physically ordered. Hypertable keeps data physically sorted by a primary key and it is well suited for Analytics.