lakeFS - Git-like capabilities for your object storage

  •    Go

lakeFS is an open source layer that delivers resilience and manageability to object-storage based data lakes. With lakeFS you can build repeatable, atomic and versioned data lake operations - from complex ETL jobs to data science and analytics.

hadoop-crypto - Library for per-file client-side encryption in Hadoop FileSystems such as HDFS or S3.

  •    Java

Seekable Crypto is a Java library that provides the ability to seek within SeekableInputs while decrypting the underlying contents, along with utilities for generating and storing the keys used to encrypt and decrypt the data streams. An implementation of the Hadoop FileSystem is also included that uses the Seekable Crypto library to provide efficient and transparent client-side encryption for Hadoop filesystems. Currently AES/CTR/NoPadding and AES/CBC/PKCS5Padding are supported.
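
Seeking inside an encrypted stream is cheap in CTR mode because each 16-byte keystream block depends only on the key and a counter value. The Python sketch below is not taken from hadoop-crypto; it only illustrates the general arithmetic, assuming the whole 16-byte IV is treated as a big-endian counter that increments once per block. The function name `ctr_seek_position` is hypothetical.

```python
from typing import Tuple

AES_BLOCK_SIZE = 16  # bytes; fixed by the AES specification


def ctr_seek_position(initial_counter: bytes, offset: int) -> Tuple[bytes, int]:
    """Return (counter_block, skip) for decrypting from byte `offset`.

    counter_block is the counter value for the block containing `offset`;
    skip is how many keystream bytes of that block to discard.
    Assumption: the full 16-byte IV is a big-endian counter.
    """
    if len(initial_counter) != AES_BLOCK_SIZE:
        raise ValueError("initial counter must be 16 bytes")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    block_index, skip = divmod(offset, AES_BLOCK_SIZE)
    # Advance the counter by the number of whole blocks skipped, wrapping at 2**128.
    counter = (int.from_bytes(initial_counter, "big") + block_index) % (1 << 128)
    return counter.to_bytes(AES_BLOCK_SIZE, "big"), skip
```

A decryptor would initialise the cipher with the returned counter block and drop the first `skip` keystream bytes. CBC mode does not allow this shortcut in the same way, since decrypting a block requires the preceding ciphertext block.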

dynamometer - A tool for scale and performance testing of HDFS with a specific focus on the NameNode

  •    Java

Dynamometer is a tool for performance testing Hadoop's HDFS NameNode. The intent is to provide a real-world environment by initializing the NameNode against a production file system image and replaying a production workload collected, for example, from the NameNode's audit logs. This allows for replaying a workload that is not merely similar in character to the one experienced in production, but actually identical. Dynamometer launches a YARN application that starts a single NameNode and a configurable number of DataNodes, simulating an entire HDFS cluster as a single application. An additional workload job, run as a MapReduce job, accepts audit logs as input and uses the information they contain to submit matching requests to the NameNode, inducing load on the service.
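
To make the audit-log replay idea concrete, here is a minimal Python sketch of pulling the operation and path out of one audit line. It is not Dynamometer's parser. It assumes the common HDFS audit layout of tab-separated `key=value` pairs after an `FSNamesystem.audit: ` marker, and the sample line in the usage note is made up for illustration.

```python
from typing import Dict

AUDIT_MARKER = "FSNamesystem.audit: "  # assumed prefix before the key=value fields


def parse_audit_line(line: str) -> Dict[str, str]:
    """Split one audit log line into its key=value fields.

    Returns an empty dict when the line has no audit marker.
    """
    _, _, payload = line.partition(AUDIT_MARKER)
    fields: Dict[str, str] = {}
    for item in payload.strip().split("\t"):
        key, sep, value = item.partition("=")
        if sep:  # skip fragments that are not key=value pairs
            fields[key] = value
    return fields
```

For a line such as `... FSNamesystem.audit: allowed=true<TAB>ugi=hdfs (auth:SIMPLE)<TAB>ip=/127.0.0.1<TAB>cmd=listStatus<TAB>src=/user<TAB>dst=null`, a replay job could then look at `fields["cmd"]` and `fields["src"]` to decide which request to issue against the NameNode under test.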

pyhdfs - Python HDFS client

  •    Python

Because the world needs yet another way to talk to HDFS from Python. This library provides a Python client for WebHDFS. NameNode HA is supported by passing in both NameNodes. Responses are returned as nice Python classes, and any failed operation will raise some subclass of HdfsException matching the Java exception.
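
As a rough illustration of what a WebHDFS client with NameNode HA support has to do, the sketch below builds a WebHDFS REST URL and tries each NameNode in turn. It deliberately does not use or reproduce the pyhdfs API; the function names, the host names in the usage note, and the injected `fetch` callable are all hypothetical. Only the `/webhdfs/v1<path>?op=...` URL shape comes from the WebHDFS REST API.

```python
from typing import Callable, Optional, Sequence
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, path: str, op: str) -> str:
    """Build a WebHDFS REST URL for `op` on `path` (host given as "name:port")."""
    return f"http://{host}/webhdfs/v1{quote(path)}?{urlencode({'op': op})}"


def first_successful(
    hosts: Sequence[str], path: str, op: str, fetch: Callable[[str], str]
) -> str:
    """Try each NameNode in order and return the first successful response.

    `fetch` is any callable that takes a URL and returns the response body,
    raising OSError (which urllib's errors subclass) on failure.
    """
    last_error: Optional[OSError] = None
    for host in hosts:
        try:
            return fetch(webhdfs_url(host, path, op))
        except OSError as exc:
            last_error = exc  # remember the failure and try the next NameNode
    raise ConnectionError(f"no NameNode answered: {last_error}")
```

Called with `["nn1.example.com:50070", "nn2.example.com:50070"]`, `"/user/alice"` and `"LISTSTATUS"`, this requests `http://nn1.example.com:50070/webhdfs/v1/user/alice?op=LISTSTATUS` first and falls back to the second host if the first fails. A real client would also parse the JSON body and map remote exceptions to Python exceptions, which this sketch leaves out.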




sbt-hadoop-oss - An sbt plugin for publishing artifacts to HDFS.

  •    Scala

An sbt plugin for publishing artifacts to the Hadoop Distributed File System (HDFS). Add the following line to project/plugins.sbt. See the Using plugins section of the sbt documentation for more information.





