pocket-etl - Extensible Java library that orchestrates batched ETL (extract, transform and load) of data between services using native fluent Java to express your pipeline

  •        156

Extensible Java library that orchestrates batched ETL (extract, transform and load) of data between services using native fluent Java to express your pipeline.

https://github.com/awslabs/pocket-etl





Related Projects

awesome-etl - A curated list of awesome ETL frameworks, libraries, and software.

A curated list of notable ETL (extract, transform, load) frameworks, libraries and software. Warning: If you're already familiar with a scripting language, GUI ETL tools are not a good replacement for a well-structured application written with a scripting language. These tools lack flexibility and are a good example of the "inner-platform effect". With a large project, you will most likely run into instances where "the tool doesn't do that" and end up implementing something hacky with a script run by the GUI ETL tool. Also, the GUI can conceal complexity and the files these tools generate are impossible to code review. However, the GUI and out-of-the-box functionality can make some tasks simpler, especially for people not comfortable with writing code.

activewarehouse-etl - Extract-Transform-Load library from ActiveWarehouse

  •    Ruby

ActiveWarehouse-ETL is a Ruby Extract-Transform-Load (ETL) tool. This tool is both usable and used in production in its current form, but be aware that the project is being reorganized: a new team is taking shape, and we’re working mostly on making it easier for people to contribute first. Up-to-date documentation will only come later.

ETL - Extract, Transform, and Load data with Ruby

  •    Ruby

ETL depends on having a database connection object that must respond to #query. The mysql2 gem is a good option. You can also proxy another library using Ruby's SimpleDelegator and add a #query method if need be. The gem comes bundled with a default logger. If you'd like to write your own, just make sure that it implements #debug and #info. For more information on what is logged and when, view the logger details.

rhino-etl - Main repository is here ->

  •    CSharp

Rhino Etl is a simple extract, transform and load library for .NET. Also note that the build script assumes that you have git.exe on your path.

Apache Tajo - A big data warehouse system on Hadoop

  •    Java

Apache Tajo is a robust big data relational and distributed data warehouse system for Apache Hadoop. Tajo is designed for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract-transform-load) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources.


ratchet - A library for performing data pipeline / ETL tasks in Go.

  •    Go

The Go programming language's simplicity, execution speed, and concurrency support make it a great choice for building data pipeline systems that can perform custom ETL (Extract, Transform, Load) tasks. Ratchet is a library written 100% in Go that lets you easily build custom data pipelines by writing your own Go code. Each data processor receives data, processes it, and sends it to the next stage in the pipeline. Every data processor runs in its own goroutine, so all processing happens concurrently. Go channels connect the stages, so the syntax for sending data will be intuitive for anyone familiar with Go. All data sent and received is JSON, which provides a nice balance of flexibility and consistency.

Apache Hive - The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage

  •    Java

The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage.

ETL Framework

This ETL Framework supports dynamic configurations and centralized logging for SSIS solutions in support of minimizing ETL TCO. It consists of an ETL Framework database, SSRS reports and template SSIS packages.

Bender - Serverless ETL Framework

  •    Java

This project provides an extendable Java framework for creating serverless ETL functions on AWS Lambda. Bender handles the complex plumbing and provides the interfaces necessary to build modules for all aspects of the ETL process.

Apache Beam - Unified model for defining both batch and streaming data-parallel processing pipelines

  •    Java

Apache Beam is an open source, unified model for defining both batch and streaming data-parallel processing pipelines. Using one of the open source Beam SDKs, you build a program that defines the pipeline. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow.

kiba - Data processing & ETL framework for Ruby

  •    Ruby

If you need help, please ask your question with the tag kiba-etl on StackOverflow so that others can benefit from your contribution! I monitor this specific tag and will reply to you. Writing reliable, concise, well-tested & maintainable data-processing code is tricky.

Dynamic ETL

This project offers a framework that facilitates ETL development for loading a set of tables from a source to a destination. All you need to do is maintain the table and column mappings. The dynamicETL can help to load data between pa...

SSIS Event Log Business Intelligence

The SSIS Event Log Business Intelligence package is a complete BI project focused around SSIS Event Log data. Components include:
- 9 SSRS Reports
- ETL Data Mart
- SSIS packages to load ETL Data Mart
- Analysis Services Cube
- PerformancePoint Dashboard

RapidMiner -- Data Mining, ETL, OLAP, BI

  •    Java

No. 1 in Business Analytics: Data Mining, Predictive Analytics, ETL, Reporting, and Dashboards in one tool. 1000+ methods: data mining, business intelligence, ETL, data analysis + Weka + R, forecasting, visualization.

SSIS mojomo ETL-Framework

A modern SSIS ETL framework for executing and managing complex ETL processes. Bundled design patterns help speed up the implementation of new requirements.

Microsoft SQL Server Metadata-Driven ETL Management Studio (MDDE)

Originally an internal MSIT solution that has been released as an open source project, the Microsoft SQL Server Metadata-Driven ETL Management Studio (a.k.a. MDDE) provides a tool for rapidly generating SQL Server Integration Services (SSIS) packages from a shared central metadat

Palo ETL Server

  •    Java

Palo ETL Server is a Java-based tool for extraction, transformation and loading of mass data into the Palo OLAP Server. Palo ETL Server is one part of the Palo Suite.

SpagoBI - Business Intelligence Suite

  •    Java

SpagoBI is the only entirely open source Business Intelligence suite. It covers all the analytical areas of Business Intelligence projects, with innovative themes and engines. SpagoBI offers a wide range of entirely open source analytical tools like Reporting, OLAP, Chart, Data mining, Real-time monitoring console, ETL.

Transporter - Sync data between persistence engines, like ETL only not stodgy

  •    Go

Compose Transporter helps with database transformations from one store to another. It can also sync from one store to one or several others. Transporter allows the user to configure a number of data adaptors as sources or sinks. These can be databases, files or other resources. Data is read from the sources, converted into a message format, and then sent down to the sink, where the message is converted into a writable format for its destination. The user can also create data transformations in JavaScript, which can sit between the source and sink and manipulate or filter the message flow.

MixDEM

  •    Javascript

MixDEM is a web-based ETL tool meant for web integration, data transformation and mashup editing. It includes the MixDEM ETL Engine, built with Zend Framework, and the MixDEM GUI Editor, an AJAX IDE that enables developers to quickly and easily create applications.




