BeeGFS - Parallel Cluster File System

  •        243

BeeGFS (formerly FhGFS) is the leading parallel cluster file system, developed with a strong focus on performance and designed for very easy installation and management. It transparently spreads user data across multiple servers. By increasing the number of servers and disks in the system, you can simply scale performance and capacity of the file system to the level that you need, seamlessly from small clusters up to enterprise-class systems with thousands of nodes.

https://www.beegfs.io/
https://git.beegfs.io/pub

Tags
Implementation
License
Platform

   




Related Projects

Gluster Filesystem - Scalable Network Filesystem

  •    C

Gluster is a software defined distributed storage that can scale to several petabytes. It provides interfaces for object, block and file storage. It is a distributed scale-out filesystem that allows rapid provisioning of additional storage based on your storage consumption needs. It incorporates automatic failover as a primary feature.

OrangeFS - Scale-out Network File System

  •    C

OrangeFS is a scale-out network file system designed for use on high-end computing (HEC) systems that provides very high-performance access to multi-server-based disk storage, in parallel. The OrangeFS server and client are user-level code, making them very easy to install and manage. OrangeFS has optimized MPI-IO support for parallel and distributed applications, and it is leveraged in production installations and used as a research platform for distributed and parallel storage.

Ceph - Distributed Object Store

  •    C++

Ceph provides seamless access to objects using native language bindings or radosgw, a REST interface that’s compatible with applications written for S3 and Swift. Ceph’s RADOS Block Device (RBD) provides access to block device images that are striped and replicated across the entire storage cluster. Ceph provides a POSIX-compliant network file system that aims for high performance, large data storage, and maximum compatibility with legacy applications.


SeaweedFS - Simple and highly scalable distributed file system

  •    Go

SeaweedFS is a simple and highly scalable distributed file system. There are two objectives: to store billions of files! to serve the files fast! Instead of supporting full POSIX file system semantics, SeaweedFS choose to implement only a key~file mapping. Similar to the word "NoSQL", you can call it as "NoFS".

diskover - File system crawler, disk space usage, file search engine and file system analytics powered by Elasticsearch

  •    Python

diskover is an open source file system crawler and disk space usage software that uses Elasticsearch to index and manage data across heterogeneous storage systems. Using diskover, you are able to more effectively search and organize files and system administrators are able to manage storage infrastructure, efficiently provision storage, monitor and report on storage use, and effectively make decisions about new infrastructure purchases. As the amount of file data generated by business' continues to expand, the stress on expensive storage infrastructure, users and system administrators, and IT budgets continues to grow.

lcfs - LCFS Graph driver for Docker

  •    C

tl;dr: Every time you build, pull or destroy a Docker container, you are using a storage driver. Current storage drivers like Device Mapper, AUFS, and Overlay2 implement container behavior using file systems designed to run a full OS. We are open-sourcing a file system that is purpose-built for the container lifecycle. We call this new file system Layer Cloning File System (LCFS). Because it is designed only for containers, it is up to 2.5x faster to build an image and up to almost 2x faster to pull an image. We're looking forward to working with the container community to improve and expand this new tool. Layer Cloning FileSystem (LCFS) is a new filesystem purpose-built to be a Docker storage driver. All Docker images are constructed of layers using storage drivers (graph drivers) like AUFS, OverlayFS, and Device Mapper. As a design principle, LCFS focuses on layers as the first-class citizen. The LCFS filesystem operates directly on top of block devices, as opposed to merging separate filesystems. Thereby, LCFS aims to directly manage at the container image’s layer level, eliminate the overhead of having a second filesystem that then is merged, and to optimize for density.

irmin - Irmin is a distributed database that follows the same design principles as Git

  •    OCaml

Irmin is an OCaml library for building mergeable, branchable distributed data stores. Below is a simple example of setting a key and getting the value out of a Git based, filesystem-backed store.

storoom - a simple distributed storage

  •    DotNet

storoom is a simple distributed storage system, it uses key-value pair to store data. contains client/clientlib stornode, node to store data to physical storage system storoom, controller of stornode<s> or storoom<s> develop in vb.net

Hazelcast Jet - A general purpose distributed data processing engine, built on top of Hazelcast.

  •    Java

Hazelcast Jet is a distributed computing platform built for high-performance stream processing and fast batch processing. It embeds Hazelcast In-Memory Data Grid (IMDG) to provide a lightweight, simple-to-deploy package that includes scalable in-memory storage. Hazelcast Jet performs parallel execution to enable data-intensive applications to operate in near real-time.

fastdfs - FastDFS is an open source high performance distributed file system (DFS)

  •    C

FastDFS is an open source high performance distributed file system. It's major functions include: file storing, file syncing and file accessing (file uploading and file downloading), and it can resolve the high capacity and load balancing problem. FastDFS should meet the requirement of the website whose service based on files such as photo sharing site and video sharing site. FastDFS has two roles: tracker and storage. The tracker takes charge of scheduling and load balancing for file access. The storage store files and it's function is file management including: file storing, file syncing, providing file access interface. It also manage the meta data which are attributes representing as key value pair of the file. For example: width=1024, the key is "width" and the value is "1024".

Sheepdog - Distributed Storage System for QEMU

  •    C

Sheepdog is a distributed object storage system for volume and container services and manages the disks and nodes intelligently. Sheepdog features ease of use, simplicity of code and can scale out to thousands of nodes. The block level volume abstraction can be attached to QEMU virtual machines and Linux SCSI Target and supports advanced volume management features such as snapshot, cloning, and thin provisioning.

svfs - The Swift Virtual File System

  •    Go

SVFS is a Virtual File System over Openstack Swift built upon fuse. It is compatible with hubiC, OVH Public Cloud Storage and basically every endpoint using a standard Openstack Swift setup. It brings a layer of abstraction over object storage, making it as accessible and convenient as a filesystem, without being intrusive on the way your data is stored. This is not an official project of the Openstack community.

huststore - High-performance Distributed Storage

  •    C

huststore is a open source high performance distributed database system. It not only provides key-value storage service with extremely high performance, up to 100 thousand QPS, but also supports data structures like hash, set, sorted set, etc. Also, it can store binary data as value from a key-value pair, and thus can be used as an alternative of Redis.In addtion, huststore implements a distributed message queue by integrating a special HA module, features including message Push Stream, and message Publish-SubScribe, these features can be used as replacements of the corresponding features in rabbitmq and gearman.

Hypertable - A high performance, scalable, distributed storage and processing system for structured

  •    C++

Hypertable is based on Google's Bigtable Design, which is a proven scalable design that powers hundreds of Google services. Many of the current scalable NoSQL database offerings are based on a hash table design which means that the data they manage is not kept physically ordered. Hypertable keeps data physically sorted by a primary key and it is well suited for Analytics.

Alluxio - Data orchestration for analytics and machine learning in the cloud

  •    Java

Alluxio (formerly known as Tachyon) is a virtual distributed storage system. It bridges the gap between computation frameworks and storage systems, enabling computation applications to connect to numerous storage systems through a common interface.

logcabin - LogCabin is a distributed storage system built on Raft that provides a small amount of highly replicated, consistent storage

  •    C++

LogCabin is a distributed system that provides a small amount of highly replicated, consistent storage. It is a reliable place for other distributed systems to store their core metadata and is helpful in solving cluster management issues. LogCabin uses the Raft consensus algorithm internally and is actually the very first implementation of Raft. It's released under the ISC license (equivalent to BSD).Information about releases is in RELEASES.md.

EventQL - The database for large-scale event analytics

  •    C++

EventQL is a distributed, column-oriented database built for large-scale event collection and analytics. It runs super-fast SQL and MapReduce queries. Its features include Automatic partitioning, Columnar storage, Standard SQL support, Scales to petabytes, Timeseries and relational data, Fast range scans and lot more.