Ganglia - scalable distributed monitoring system

Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization.

Carrot2 - Search Results Clustering Engine

Carrot2 is an open source search results clustering engine. It clusters search results from various sources into small collections of documents. Carrot2 offers ready-to-use components for fetching search results from various sources, including the Yahoo API, Google API, Bing API, eTools Meta Search, Lucene, Solr, Google Desktop and more.

Hadoop Common

Apache Hadoop is a framework for running applications on large clusters built of commodity hardware. Hadoop Common supports the other Hadoop subprojects.

js-marker-clusterer - A marker clustering library for the Google Maps JavaScript API v3.

The library creates and manages per-zoom-level clusters for large numbers of markers. It targets the Google Maps JavaScript API v3.
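
To illustrate what per-zoom-level clustering means, here is a minimal Python sketch of one simple approach: project each marker to pixel coordinates at a given zoom and bucket the markers into fixed-size grid cells. The function name `cluster_markers`, the 60-pixel grid size, and the use of a plain grid are assumptions made for illustration; this is not the library's code and may differ from its actual algorithm.

```python
import math
from collections import defaultdict


def cluster_markers(markers, zoom, grid_px=60):
    """Group (lat, lng) markers into grid cells for one zoom level.

    Hypothetical helper: the Web Mercator projection and the fixed
    grid size in pixels are illustrative assumptions. Latitudes of
    exactly +/-90 degrees are not supported by this projection.
    """
    world_px = 256 * (2 ** zoom)  # world width in pixels at this zoom
    cells = defaultdict(list)
    for lat, lng in markers:
        # Project the marker to pixel coordinates (Web Mercator).
        x = (lng + 180.0) / 360.0 * world_px
        siny = math.sin(math.radians(lat))
        y = (0.5 - math.log((1 + siny) / (1 - siny)) / (4 * math.pi)) * world_px
        # Markers that land in the same grid cell form one cluster.
        cells[(int(x // grid_px), int(y // grid_px))].append((lat, lng))
    return list(cells.values())


# Two markers in Berlin and one in Paris.
markers = [(52.52, 13.405), (52.53, 13.41), (48.8566, 2.3522)]
clusters_by_zoom = {z: cluster_markers(markers, z) for z in (0, 3, 18)}
```

Because the pixel size of the world doubles with each zoom level, the same markers merge into fewer clusters when zoomed out and separate into more clusters when zoomed in.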

MySQL - World's most popular open source database

The MySQL database is the world's most popular open source database because of its consistent fast performance, high reliability and ease of use. It is used by individual Web developers as well as many of the world's largest and fastest-growing organizations to save time and money powering their high-volume Web sites.

minikube - Run Kubernetes locally

Minikube is a tool that makes it easy to run Kubernetes locally. Minikube runs a single-node Kubernetes cluster inside a VM on your laptop for users looking to try out Kubernetes or develop with it day-to-day. We also released a Debian package and Windows installer on our releases page. If you maintain a minikube package, please feel free to add it here.

Spark - Fast Cluster Computing

Apache Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write. To run programs faster, Spark offers a general execution model that can optimize arbitrary operator graphs, and supports in-memory computing, which lets it query data faster than disk-based engines like Hadoop.

Performance Co-Pilot - System Performance and Analysis Framework.

Performance Co-Pilot (PCP) provides a framework and services to support system-level performance monitoring and management. It presents a unifying abstraction for all of the performance data in a system, and many tools for interrogating, retrieving and processing that data. The distributed PCP architecture makes it especially useful for those seeking centralized monitoring of distributed processing.

Synnefo - Open source Cloud Software, Used to create massively scalable IaaS clouds

Synnefo is a complete open source cloud stack written in Python that provides Compute, Network, Image, Volume and Storage services, similar to the ones offered by AWS. Synnefo manages multiple Ganeti clusters at the backend for handling low-level VM operations and uses Archipelago to unify cloud storage. To boost 3rd-party compatibility, Synnefo exposes the OpenStack APIs to users.

Hazelcast - In-Memory Data Grid for Java

Hazelcast is a clustering and highly scalable data distribution platform for Java. It supports distributed implementations of java.util.{Queue, Set, List, Map}, java.util.concurrent.locks.Lock and java.util.concurrent.ExecutorService, distributed indexing and query support, dynamic scaling, partitioning with backups, fail-over, a web-based cluster monitoring tool and a lot more.
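
The idea of partitioning with backups can be sketched in a few lines of Python: hash each key to a partition, give each partition an owner node, and place backup copies on the following nodes. The function `assign_replicas`, the CRC32 hash, and the round-robin placement are assumptions made for illustration, not Hazelcast's actual scheme or API.

```python
import zlib


def assign_replicas(key, nodes, partition_count=271, backups=1):
    """Return (partition_id, [owner, backup, ...]) for a key.

    Hypothetical helper: hashing and placement are illustrative only.
    Assumes backups < len(nodes) so every copy lives on a different node.
    """
    partition_id = zlib.crc32(key.encode("utf-8")) % partition_count
    owner = partition_id % len(nodes)
    # The owner holds the primary copy; the next nodes hold the backups.
    replicas = [nodes[(owner + i) % len(nodes)] for i in range(backups + 1)]
    return partition_id, replicas


nodes = ["node-a", "node-b", "node-c"]
partition_id, replicas = assign_replicas("user:42", nodes)
```

If the owner node fails, a backup node already holds a copy of the partition, which is what makes fail-over possible without data loss in this simplified model.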

H2 Database - Java based Database Engine

H2 is a very fast, open source database engine. It supports the SQL and JDBC standards. It can run in embedded and server modes, and it has clustering support.

Riemann - Monitors Distributed Systems

Riemann monitors distributed systems. It aggregates events from your servers and applications with a powerful stream processing language. Send an email for every exception raised by your code. Track the latency distribution of your web app. See the top processes on any host, by memory and CPU. Combine statistics from every Riak node in your cluster and forward to Graphite.

Vector - Performance Monitoring Framework

Vector is an open source on-host performance monitoring framework which exposes hand-picked, high-resolution system and application metrics to every engineer’s browser. Having the right metrics available on demand and at a high resolution is key to understanding how a system behaves and correctly troubleshooting performance issues.

Dubbo - High-performance, Java-based, open source RPC framework

Dubbo is a high-performance, Java-based RPC framework open-sourced by Alibaba. As in many RPC systems, Dubbo is based around the idea of defining a service, specifying the methods that can be called remotely with their parameters and return types. On the server side, the server implements this interface and runs a Dubbo server to handle client calls. On the client side, the client has a stub that provides the same methods as the server.
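
The interface, server, and stub pattern described above can be sketched generically in Python. This is not Dubbo's API: `GreetingService`, `Server`, and `Stub` are hypothetical names, and the "network" is replaced by a direct function call carrying JSON strings, purely to show how a stub forwards calls.

```python
import json
from abc import ABC, abstractmethod


class GreetingService(ABC):
    """Hypothetical service interface shared by client and server."""

    @abstractmethod
    def greet(self, name: str) -> str: ...


class GreetingServiceImpl(GreetingService):
    """Server-side implementation of the interface."""

    def greet(self, name: str) -> str:
        return f"Hello, {name}"


class Server:
    """Decodes a request, invokes the implementation, encodes the reply."""

    def __init__(self, impl):
        self._impl = impl

    def handle(self, request: str) -> str:
        call = json.loads(request)
        result = getattr(self._impl, call["method"])(*call["args"])
        return json.dumps({"result": result})


class Stub:
    """Client-side stub: offers the interface's methods and forwards calls."""

    def __init__(self, interface, transport):
        self._methods = interface.__abstractmethods__
        self._transport = transport  # assumption: a callable standing in for the network

    def __getattr__(self, name):
        if name.startswith("_") or name not in self._methods:
            raise AttributeError(name)

        def remote_call(*args):
            request = json.dumps({"method": name, "args": list(args)})
            return json.loads(self._transport(request))["result"]

        return remote_call


server = Server(GreetingServiceImpl())
client = Stub(GreetingService, server.handle)
reply = client.greet("Ada")
```

The client code calls `client.greet(...)` as if it were a local object; only the stub knows that the call is serialized and handled elsewhere.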

Helix - Cluster Management Framework

Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes. It helps to perform scheduling of maintenance tasks, such as backups, garbage collection, file consolidation, index rebuilds, repartitioning of data or resources across the cluster, informing dependent systems of changes so they can react appropriately to cluster changes, throttling system tasks and changes and so on.

csync2 - cluster synchronization tool

Csync2 is a cluster synchronization tool. It can be used to keep files on multiple hosts in a cluster in sync. Csync2 can handle complex setups with much more than just 2 hosts, handle file deletions and can detect conflicts. It is expedient for HA-clusters, HPC-clusters, COWs and server farms.

Apache Geode - Distributed, In-memory Database for Scale-Out Applications

Apache Geode is a distributed, in-memory database for scale-out applications. All data is stored in memory for low latency. Performance scales linearly as nodes are added. Data is distributed automatically between nodes to optimize performance. Clusters fail over to other nodes in case of failures and rebalance the remaining resources. Geode servers can be configured to speak the memcached protocol.

JPPF - Parallelize computationally intensive tasks and execute them on a Grid

JPPF enables applications with large processing power requirements to be run on any number of computers, in order to dramatically reduce their processing time. This is done by splitting an application into smaller parts that can be executed simultaneously on different machines.
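
The split-and-execute idea can be sketched with Python's standard library. This is not JPPF's API: the helper names are hypothetical, and a local thread pool stands in for the separate machines of a grid, so the sketch shows the structure of the approach rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Split a list into at most `parts` contiguous chunks."""
    if not items:
        return []
    size = -(-len(items) // parts)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """The smaller unit of work that each worker executes."""
    return sum(x * x for x in chunk)


def run_in_parallel(items, workers=4):
    """Split the job, run the parts concurrently, combine the partial results."""
    chunks = split(items, workers)
    # Assumption: threads stand in for grid nodes in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


total = run_in_parallel(list(range(1, 101)))
```

The approach works when the parts are independent of each other, so that partial results can be computed in any order and combined at the end.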

Cascading - Data Processing Workflows on Hadoop

Cascading is a data processing API, process planner, and process scheduler used for defining and executing complex, scale-free, and fault-tolerant data processing workflows on an Apache Hadoop cluster. It is a thin Java library and API that sits on top of Hadoop's MapReduce layer and is executed from the command line like any other Hadoop application.