Sqoop - Transfers data between Hadoop and Datastores


Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. You can use Sqoop to import data from external structured datastores into the Hadoop Distributed File System (HDFS) or related systems like Hive and HBase. Conversely, Sqoop can extract data from Hadoop and export it to external structured datastores such as relational databases and enterprise data warehouses.

http://incubator.apache.org/sqoop/
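
The two directions described above (import into HDFS, export back to a relational database) correspond to the `sqoop import` and `sqoop export` commands. The following is a minimal sketch that only assembles the command lines in Python; it does not run Sqoop. The helper names, JDBC URL, table names, paths, and user name are hypothetical placeholders, and a real job would likely need additional options depending on the database and cluster.

```python
import shlex


def sqoop_import_args(jdbc_url, table, target_dir, username, password_file,
                      num_mappers=4):
    """Build the argument list for `sqoop import` (database table -> HDFS).

    Hypothetical helper for illustration; all values are supplied by the caller.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC connection string
        "--username", username,
        "--password-file", password_file,   # file holding the password
        "--table", table,                   # source table in the database
        "--target-dir", target_dir,         # destination directory in HDFS
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


def sqoop_export_args(jdbc_url, table, export_dir, username, password_file):
    """Build the argument list for `sqoop export` (HDFS -> database table)."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,                   # destination table in the database
        "--export-dir", export_dir,         # source directory in HDFS
    ]


if __name__ == "__main__":
    # Placeholder values, assumed for this example only.
    url = "jdbc:mysql://db.example.com/sales"
    imp = sqoop_import_args(url, "orders", "/data/orders", "etl", "/user/etl/.pw")
    exp = sqoop_export_args(url, "order_totals", "/data/order_totals", "etl",
                            "/user/etl/.pw")
    # Print shell-quoted command lines that could be reviewed before running.
    print(" ".join(shlex.quote(a) for a in imp))
    print(" ".join(shlex.quote(a) for a in exp))
```

Building the arguments as a list rather than a single string keeps each value separate, so the list can be passed to `subprocess.run` without shell quoting problems.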





Related Projects

couchbase-hadoop-plugin - A Couchbase to Hadoop (Sqoop) plugin for importing and exporting data



Apache Tajo - A big data warehouse system on Hadoop


Apache Tajo is a robust, relational, and distributed big data warehouse system for Apache Hadoop. Tajo is designed for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract-transform-load) processing on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources.

spatial-framework-for-hadoop


The __Spatial Framework for Hadoop__ allows developers and data scientists to use the Hadoop data processing system for spatial data analysis. For tools, [samples](https://github.com/Esri/gis-tools-for-hadoop/tree/master/samples), and [tutorials](https://github.com/Esri/gis-tools-for-hadoop/wiki) that use this framework, head over to [GIS Tools for Hadoop](https://github.com/Esri/gis-tools-for-hadoop).

Cascalog - Data processing on Hadoop


Cascalog is a fully-featured data processing and querying library for Clojure or Java. The main use cases for Cascalog are processing "Big Data" on top of Hadoop or doing analysis on your local computer. Cascalog is a replacement for tools like Pig, Hive, and Cascading and operates at a significantly higher level of abstraction than those tools.

Apache Hive - Data warehouse software for querying and managing large datasets


The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage.

gis-tools-for-hadoop


* [Tutorial: An Introduction for Beginners](https://github.com/Esri/gis-tools-for-hadoop/wiki/GIS-Tools-for-Hadoop-for-Beginners)
* [Tutorial: Aggregating Data Into Bins](https://github.com/Esri/gis-tools-for-hadoop/wiki/Aggregating-CSV-Data-%28Spatial-Binning%29)
* [Tutorial: Correcting your ArcGIS Projection](https://github.com/Esri/gis-tools-for-hadoop/wiki/Correcting-Projection-in-ArcGIS)
* [Updated Wiki page for the Spatial-Framework-for-Hadoop](https://github.com/Esri/spatial-framework-for-h

data-viz


Java EE 6 project for data visualization. Uses Big Data technologies (Hadoop, HDFS, HBase, Sqoop), CDI, data visualization, and PrimeFaces.

genie - Distributed Big Data Orchestration Service


Genie is a federated job orchestration engine developed by Netflix. Genie provides RESTful APIs to run a variety of big data jobs such as Hadoop, Pig, Hive, Spark, Presto, Sqoop, and more. It also provides APIs for managing the metadata of many distributed processing clusters and the commands and applications that run on them. See the official website for documentation about Genie and specific documentation for various releases.

Ambari - Monitor Hadoop Cluster


The Apache Ambari project aims to make Hadoop management simpler by developing software for provisioning, managing, and monitoring Apache Hadoop clusters. Ambari provides an intuitive, easy-to-use Hadoop management web UI backed by its RESTful APIs. The Hadoop components currently supported by Ambari include HDFS, MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig, and Sqoop.

sqoop-data-generator - Simple tool for generating data usable for Sqoop testing



jumbune - Jumbune is an open-source project to optimize both Yarn (v2) and older (v1) Hadoop based solutions


Jumbune is an open-source product built for analyzing Hadoop clusters and MapReduce jobs. It provides development and administrative insights into Hadoop-based analytical solutions. It enables users to debug, profile, monitor, and validate analytical solutions hosted on decoupled clusters.

Ranger - Manage Data Security across the Hadoop Platform


Ranger is a framework to enable, monitor, and manage comprehensive data security across the Hadoop platform. It provides centralized security administration for managing all security-related tasks in a central UI or through REST APIs, fine-grained authorization, and centralized auditing of user access within Apache Hadoop, Apache Hive, Apache HBase, and other Apache components.

Cascading - Data Processing Workflows on Hadoop


Cascading is a Data Processing API, Process Planner, and Process Scheduler used for defining and executing complex, scale-free, and fault tolerant data processing workflows on an Apache Hadoop cluster. It is a thin Java library and API that sits on top of Hadoop's MapReduce layer and is executed from the command line like any other Hadoop application.

sqoop-1.1.0hadoop21 - sqoop-1.1.0 with hadoop 0.21 (HOBBY PROJECT)



jstor-early-journal-content - Tools to extract data from JSTOR Early Journal Content Data Bundle



Hue - The open source Apache Hadoop UI


Hue is a web application for interacting with Apache Hadoop. It supports a FileBrowser for accessing HDFS, a JobBrowser for accessing MapReduce jobs (MR1/MR2-YARN), a Job Designer for creating MapReduce/Streaming/Java jobs, an HBase Browser for exploring and modifying HBase tables and data, an Oozie App for submitting and scheduling workflows and bundles, a Pig/HBase/Sqoop2 shell, a Beeswax application for executing Hive queries, and a Search app for querying Solr and Solr Cloud.

avenir - Set of predictive data mining tools based on Hadoop

cyto-bridge - Plugin to transfer data between Cytoscape and R, MATLAB, and other tools.



laola - A set of tools to extract data from MS-Word .doc files

