Hue is a web application for interacting with Apache Hadoop. It includes a FileBrowser for accessing HDFS; a JobBrowser for accessing MapReduce jobs (MR1 and MR2/YARN); a Job Designer for creating MapReduce, Streaming, and Java jobs; an HBase Browser for exploring and modifying HBase tables and data; an Oozie app for submitting and scheduling workflows and bundles; a Pig/HBase/Sqoop2 shell; the Beeswax application for executing Hive queries; and a Search app for querying Solr and SolrCloud.


Hue - The open source Apache Hadoop UI

Cascalog is a fully-featured data processing and querying library for Clojure or Java. The main use cases for Cascalog are processing "Big Data" on top of Hadoop or doing analysis on your local computer. Cascalog is a replacement for tools like Pig, Hive, and Cascading and operates at a significantly higher level of abstraction than those tools.

Cascalog - Data processing on Hadoop

Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. You can use Sqoop to import data from external structured datastores into Hadoop Distributed File System or related systems like Hive and HBase. Conversely, Sqoop can be used to extract data from Hadoop and export it to external structured datastores such as relational databases and enterprise data warehouses.

Sqoop - Transfers data between Hadoop and Datastores
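
As a rough illustration of the import direction described above, the sketch below assembles a `sqoop import` command line in Python without running it. The JDBC URL, table name, user, and HDFS target directory are made-up placeholders, and the helper `build_sqoop_import` is hypothetical; only the `--connect`, `--username`, `--table`, `--target-dir`, and `--num-mappers` options are taken from the Sqoop 1 command-line tool.

```python
import shlex


def build_sqoop_import(jdbc_url, table, target_dir, username, num_mappers=1):
    """Return the argument list for a basic Sqoop 1 table import (hypothetical helper)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC URL of the source database
        "--username", username,            # database user (placeholder value below)
        "--table", table,                  # source table to copy
        "--target-dir", target_dir,        # HDFS directory that receives the rows
        "--num-mappers", str(num_mappers), # number of parallel map tasks
    ]


# Placeholder values, assumed for illustration only.
cmd = build_sqoop_import(
    "jdbc:mysql://db.example.com/sales", "orders", "/data/orders", "etl_user"
)
print(shlex.join(cmd))
```

On a machine with Sqoop installed, the list could be passed to `subprocess.run`; building it as a list rather than a single string avoids shell-quoting problems.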

Ranger is a framework to enable, monitor, and manage comprehensive data security across the Hadoop platform. It provides centralized security administration for managing all security-related tasks through a central UI or REST APIs, fine-grained authorization, and centralized auditing of user access within Apache Hadoop, Apache Hive, Apache HBase, and other Apache components.

Ranger - Manage Data Security across the Hadoop Platform

Cascading is a data processing API, process planner, and process scheduler used for defining and executing complex, scale-free, and fault-tolerant data processing workflows on an Apache Hadoop cluster. It is a thin Java library and API that sits on top of Hadoop's MapReduce layer and is executed from the command line like any other Hadoop application.

Cascading - Data Processing Workflows on Hadoop

The Apache Ambari project aims to make Hadoop management simpler by developing software for provisioning, managing, and monitoring Apache Hadoop clusters. Ambari provides an intuitive, easy-to-use Hadoop management web UI backed by its RESTful APIs. The Hadoop components currently supported by Ambari include HDFS, MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig, and Sqoop.

Ambari - Monitor Hadoop Cluster
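
To make the "backed by its RESTful APIs" point concrete, here is a minimal sketch that prepares, but does not send, a request listing the clusters known to an Ambari server. The host name and credentials are placeholders, the helper `ambari_clusters_request` is hypothetical, and the sketch assumes the server listens on port 8080, exposes the `/api/v1/clusters` resource, and accepts HTTP basic authentication.

```python
import base64
import urllib.request


def ambari_clusters_request(host, user, password, port=8080):
    """Build a GET request for the cluster list (hypothetical helper, nothing is sent)."""
    url = f"http://{host}:{port}/api/v1/clusters"
    # HTTP basic auth: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, method="GET")
    req.add_header("Authorization", f"Basic {token}")
    # Ambari expects this header on modifying calls; it is harmless on a GET.
    req.add_header("X-Requested-By", "ambari")
    return req


# Placeholder host and credentials, assumed for illustration only.
req = ambari_clusters_request("ambari.example.com", "admin", "admin")
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` against a reachable server would return a JSON document describing the clusters.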
