Each model is built into a separate Docker image with the appropriate Python, C++, and Java/Scala runtime libraries for training or prediction. Use the same Docker image from local laptop to production to avoid dependency surprises.
machine-learning artificial-intelligence tensorflow kubernetes elasticsearch cassandra ipython spark kafka netflixoss presto airflow pipeline jupyter-notebook zeppelin docker redis neural-network gpu microservices

Alluxio (formerly known as Tachyon) is a virtual distributed storage system. It bridges the gap between computation frameworks and storage systems, enabling computation applications to connect to numerous storage systems through a common interface.
distributed-storage big-data memory-speed hadoop spark virtual-file-system presto tensorflow storage object-store

Linkis helps easily connect to various back-end computation/storage engines.
sql spark presto hive storage jdbc rest-api engine impala pyspark udf thrift-server resource-manager jobserver application-manager livy hive-table linkis context-service scriptis

Trino is a highly parallel, distributed query engine built from the ground up for efficient, low-latency analytics. It is an ANSI SQL-compliant query engine that works with BI tools such as R, Tableau, Power BI, Superset, and many others. It natively queries data in Hadoop, S3, Cassandra, MySQL, and many other sources, without the need for complex, slow, and error-prone processes for copying the data.
distributed-systems data-science sql database big-data presto hive hadoop analytics jdbc databases distributed-database query-engine datalake prestodb trino

PyHive is a collection of Python DB-API and SQLAlchemy interfaces for Presto and Hive. First install this package to register it with SQLAlchemy (see setup.py).
hive hiveserver2 presto dbapi sqlalchemy

Cube.js is an open-source analytical API platform. It is primarily used to build internal business intelligence tools or add customer-facing analytics to existing applications. Cube.js was designed to work with serverless data warehouses and query engines like Google BigQuery and AWS Athena. A multi-stage querying approach makes it suitable for handling trillions of data points. Most modern RDBMS work with Cube.js as well and can be further tuned for performance.
analytics mysql bigquery chart spark presto hive microservice serverless athena postgresql cube

These Docker images are tested by hundreds of tools and also used in the full functional test suites of various other GitHub repos. These images are all available pre-built on my DockerHub: https://hub.docker.com/u/harisekhon/.
hadoop hbase cassandra solr solrcloud kafka consul superset zookeeper apache-drill nifi docker-image dockerhub docker rabbitmq-cluster nagios-plugins spark presto rabbitmq

Presto is a powerful interactive query engine that can run SQL queries on anything (MySQL, HDFS, local files, Kafka) as long as a connector to the source exists. This is a Presto connector to Ethereum blockchain data. With this connector, one can get hands-on with Ethereum blockchain analytics without having to know the nitty-gritty of the JavaScript API.
presto prestodb ethereum ethereum-blockchain blockchain sql

MLCraft is an open-source low-code business intelligence tool and a data science workflow. MLCraft was designed to query the data from several data warehouses and run machine learning experiments. Cube.js is used as a primary query layer and makes it suitable for handling trillions of data points. It is a full-stack data science platform that provides everything you need to build, manage and automate machine learning
mysql bigquery big-data spark presto hive athena analytics clickhouse postgresql business-intelligence redshift

Node.js client library for the distributed query engine "Presto". Or add presto-client to your own package.json, and run npm install.
presto

This will connect to the Hive metastore via the Hive connector. On an N-worker-node cluster, you will have N-2 Presto worker nodes and 1 coordinator node. The setup also configures the TPCH connector, so you can run TPCH queries directly. You will see output like the following; note the IP:Port.
presto azure-hdinsight

Clone this repo. Run TPCDSDataGen.hql with the settings.hql file and set the required config variables.
benchmarking tpcds hive spark presto llap

Simple, Scalable and Futuristic API & Data Hub for Publishing and Consuming Microservices
presto restful-api configurable

Highly Available Microservice Suites, with Simple, Scalable and Futuristic API & Data Hub for Publishing and Consuming Microservices 💫 Build highly available microservices, including configurable API generation, a performance lab, and more.
presto restful-api configurable

A Presto client for the Go programming language. You need a working environment with Go installed and $GOPATH set.
presto prestodb sql big-data

Run a Presto cluster on Kubernetes. Clone this.
presto kubernetes

Clone this repo. Run TPCHDataGen.hql with the settings.hql file and set the required config variables.
benchmarking tpch hive spark presto llap

This Prometheus exporter scrapes metrics from a Presto cluster and should be installed on the cluster's coordinator server. It supports Presto version 0.177 and later.
presto prometheus