
Apache Doris - A fast MPP database for all modern analytics on big data

  •    Java

Apache Doris is a modern MPP analytical database. It provides sub-second queries and efficient real-time data analysis. Its distributed architecture supports datasets of up to 10 PB and keeps them easy to operate. Doris provides batch data loading and real-time mini-batch data loading, along with high availability, reliability, fault tolerance, and scalability. It was originally developed at Baidu under the name Palo.

Apache Tajo - A big data warehouse system on Hadoop

  •    Java

Apache Tajo is a robust, distributed, relational big data warehouse system for Apache Hadoop. Tajo is designed for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract, transform, load) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources.

Cascalog - Data processing on Hadoop

  •    Clojure

Cascalog is a fully-featured data processing and querying library for Clojure or Java. The main use cases for Cascalog are processing "Big Data" on top of Hadoop or doing analysis on your local computer. Cascalog is a replacement for tools like Pig, Hive, and Cascading and operates at a significantly higher level of abstraction than those tools.

AsterixDB - Big Data Management System (BDMS)

  •    Java

AsterixDB is a BDMS (Big Data Management System) with a rich feature set that sets it apart from other Big Data platforms. Its feature set makes it well-suited to modern needs such as web data warehousing and social data storage and analysis. It is a highly scalable data management system that can store, index, and manage semi-structured data, but it also supports a full-power query language with the expressiveness of SQL (and more).




VXQuery - Query XML Data

  •    Java

Apache VXQuery will be a standards-compliant XML Query processor implemented in Java. The focus is on evaluating queries over large amounts of XML data, specifically over large collections of relatively small XML documents. To achieve this, queries will be evaluated on a cluster of shared-nothing machines.

Elementary - Data observability platform for modern data teams that is open and transparent

  •    Python

Elementary was built out of the need to effortlessly and immediately gain visibility into the data stack, starting with tracing the actual upstream & downstream dependencies in the data warehouse, without any implementation efforts, security risks or compromises on accuracy.

Dev Lake - Data lake for Dev

  •    Go

Dev Lake brings all your DevOps data into one practical, personalized, extensible view. Ingest, analyze, and visualize data from an ever-growing list of developer tools, with our free and open source product. Dev Lake is most exciting for leaders and managers looking to make better sense of their development data, though it's useful for any developer looking to bring a more data-driven approach to their own practices. With Dev Lake you can ask your process any question, just connect and query.

Intermine - A powerful open source data warehouse system

  •    Java

A powerful open source data warehouse system. InterMine allows users to integrate diverse data sources with a minimum of effort, providing powerful web-services and an elegant web-application with minimal configuration. InterMine powers some of the largest data-warehouses in the life sciences.


SQL Parallel Boost

  •    

Compared to the single-thread approach of SQL Server itself, SQL Parallel Boost facilitates the parallel execution of any data modification operations (UPDATE, INSERT, DELETE) - making best use of all available CPU resources. This results in performance gains of up to factor...

MDX Parser, Builder, DOM and OLAP visual controls with Writeback for Silverlight

  •    CSharp

It is a component library for OLAP, .NET and Silverlight (C#).
* MDX DOM, Parser, Generator, Query Designer
* Description of supported MDX Syntax
* Dynamic Pivot Grid - Pivot Table with Writeback
* OLAP metadata choice controls

See also: http://code.google.com/p/ranet-uilibrary-olap/

IIS Log reader

  •    

The IIS Log reader library reads IIS log data into an intuitive domain model through a simple API. It is useful in ETL/SSIS applications.
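The idea of reading log lines into a domain model can be sketched as follows. This is not the library's API (which is not shown here); it is a minimal Python illustration that assumes the log uses the W3C extended format, where a `#Fields:` directive names the space-separated columns. `LogEntry` and `parse_w3c_log` are hypothetical names, and the sample lines are made up.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class LogEntry:
    """One request line; keys come from the log's own #Fields directive."""
    fields: dict

    def get(self, name: str, default: str = "-") -> str:
        return self.fields.get(name, default)


def parse_w3c_log(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield a LogEntry for each data line of a W3C extended log."""
    columns = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The directive lists the column names for the lines that follow.
            columns = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue  # other directives such as #Software or #Date
        values = line.split()
        # Skip lines that do not match the declared columns.
        if columns and len(values) == len(columns):
            yield LogEntry(dict(zip(columns, values)))


# Made-up sample input for illustration only.
sample = [
    "#Software: Microsoft Internet Information Services 7.5",
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2024-01-01 00:00:01 GET /index.html 200",
    "2024-01-01 00:00:02 GET /missing 404",
]
entries = list(parse_w3c_log(sample))
errors = [e.get("cs-uri-stem") for e in entries if e.get("sc-status") == "404"]
print(errors)
```

Keying each entry by the declared field names keeps the parser independent of which columns a given server was configured to log.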

SQL DMVStats

  •    

A SQL Server 2005 Dynamic Management View Performance Data Warehouse

Learning Management Infrastructure (LMI)

  •    

The Learning Management Infrastructure is a set of open source tools that enables school districts to better deal with data storage, data integration, reporting, managing student achievement, communicating with stakeholders, and coordinating curriculum.

xmla4js - Javascript interface for XML for Analysis

  •    Javascript

Xmla4js is a standalone JavaScript library that provides basic XML for Analysis (XML/A) capabilities, allowing JavaScript developers to access data and metadata from OLAP providers for use in rich (web) applications. XML/A is an industry-standard protocol for communicating with OLAP servers over HTTP. It defines a SOAP web service that allows clients to obtain metadata and to execute MDX (multi-dimensional expressions) queries. XML is used as the data exchange format.
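To make the protocol concrete, here is a rough sketch of the SOAP body an XML/A client sends to execute an MDX statement. It is written in Python with the standard library only and does not use or mirror the xmla4js API. The function name `build_execute_request`, the MDX text, the data source string, and the catalog name are all hypothetical; sending the request over HTTP is left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"


def build_execute_request(mdx: str, data_source: str, catalog: str) -> bytes:
    """Build an XML/A Execute envelope that carries one MDX statement."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")

    # The MDX query travels as text inside Command/Statement.
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = mdx

    # Properties tell the server which source and catalog to query.
    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    prop_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    for name, value in (
        ("DataSourceInfo", data_source),
        ("Catalog", catalog),
        ("Format", "Multidimensional"),
    ):
        ET.SubElement(prop_list, f"{{{XMLA_NS}}}{name}").text = value

    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical cube, measure, and connection names.
mdx = "SELECT [Measures].[Sales] ON COLUMNS FROM [SalesCube]"
payload = build_execute_request(mdx, "Provider=Example", "ExampleCatalog")
print(payload.decode("utf-8"))
```

A client would POST this payload to the server's XML/A endpoint and parse the SOAP response; metadata requests use a `Discover` element in place of `Execute`.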

transmart-core - Core components and documentation of the tranSMART platform

  •    Groovy

This is the repository containing the core components and documentation of the tranSMART platform, an open source data sharing and analytics platform for translational biomedical research. tranSMART is maintained by the tranSMART Foundation. Official releases can be found on the tranSMART Foundation website, and the tranSMART Foundation's development repositories can be found at https://github.com/transmart/. All the instructions on how to install, build and run a private instance of tranSMART, get set up for developing or upgrade to the latest version of tranSMART from an older version are available in the documentation. For details on contributing code changes via pull requests, see the Contributing document.

ixmp - The ix modeling platform for integrated and cross-cutting scenario analysis

  •    Python

The ix modeling platform (ixmp) is a data warehouse for high-powered scenario analysis, with interfaces to Python and R for efficient scientific workflows and effective data pre- and post-processing, and a structured database backend for version-controlled data management. This repository contains the core and application programming interfaces (API) for the ix modeling platform (ixmp), as well as a number of tutorials and examples for a generic model instance based on Dantzig's transport problem.

OLAP-cube - a hypercube of data

  •    Javascript

An OLAP cube is a multidimensional array of data you can explore and analyze. Here you will find an engine that could feed a graphic viewer. Attribute structure holds necessary information to clone a table excluding its data.
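As a rough illustration of exploring a multidimensional array of data, the sketch below models a tiny cube in Python and applies two common operations, slice and roll-up. It does not use the OLAP-cube package's API; the dimension names, the fact rows, and the helpers `roll_up` and `slice_cube` are hypothetical.

```python
from collections import defaultdict

# Hypothetical cube: each fact has coordinates on three dimensions
# and one numeric measure (a sales amount).
DIMENSIONS = ("year", "region", "product")
facts = [
    ((2023, "EU", "widget"), 10),
    ((2023, "US", "widget"), 20),
    ((2024, "EU", "gadget"), 5),
    ((2024, "EU", "widget"), 7),
]


def roll_up(rows, keep):
    """Sum the measure over every dimension that is not listed in keep."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for coords, value in rows:
        totals[tuple(coords[i] for i in idx)] += value
    return dict(totals)


def slice_cube(rows, dimension, member):
    """Keep only the facts whose coordinate on dimension equals member."""
    i = DIMENSIONS.index(dimension)
    return [(coords, value) for coords, value in rows if coords[i] == member]


by_year = roll_up(facts, ["year"])
eu_by_product = roll_up(slice_cube(facts, "region", "EU"), ["product"])
print(by_year)        # {(2023,): 30, (2024,): 12}
print(eu_by_product)  # {('widget',): 17, ('gadget',): 5}
```

Storing facts as coordinate tuples keeps the sketch short; a real engine would typically index each dimension so that slices do not scan every row.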

Transformalize - Configurable Extract, Transform, and Load

  •    CSharp

Transformalize automates moving data into data warehouses, search engines, and other value-adding systems. This section introduces <connections/>, <entities/>, and the tfl.exe command line interface.

bigquery-utils - Useful scripts, udfs, views, and other utilities for migration and data warehouse operations in BigQuery

  •    TSQL

BigQuery is a serverless, highly-scalable, and cost-effective cloud data warehouse with an in-memory BI Engine and machine learning built in. This repository provides useful utilities to assist you in migration and usage of BigQuery. All UDFs within this repository will be automatically created under the bqutil project under publicly shared datasets. Queries can then reference the shared UDFs via bqutil.<dataset>.<function>().





