Hadoop HDFS FSImage Exporter exposes HDFS statistics to Prometheus, parsed from Hadoop HDFS FSImage file snapshots.
https://github.com/marcelmay/hadoop-hdfs-fsimage-exporter
Tags | prometheus-exporter hadoop-fsimage hadoop hdfs monitoring hdfs-metrics |
Implementation | Java |
License | Apache |
Platform | OS-Independent |
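Since the exporter publishes standard Prometheus text exposition output, a few lines of Python are enough to spot-check it outside Prometheus. This is a minimal sketch, assuming a hypothetical scrape address and an `fsimage`-prefixed metric namespace; the real host and port come from the exporter's configuration.

```python
import re
import urllib.request

# Hypothetical scrape address: the exporter's actual host/port come from
# its configuration at startup; this URL is a placeholder.
METRICS_URL = "http://localhost:7772/metrics"

# Prometheus text exposition format: "name{labels} value" per line,
# with "#"-prefixed HELP/TYPE comment lines.
LINE_RE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?P<labels>\{.*\})?\s+(?P<value>\S+)$"
)

with urllib.request.urlopen(METRICS_URL) as resp:
    body = resp.read().decode("utf-8")

for line in body.splitlines():
    if line.startswith("#") or not line.strip():
        continue  # skip HELP/TYPE comments and blank lines
    m = LINE_RE.match(line)
    # "fsimage" is an assumed metric prefix for this exporter.
    if m and m.group("name").startswith("fsimage"):
        print(m.group("name"), m.group("value"))
```

The same snippet works against the other Prometheus exporters listed below (node, Kafka, Redis), since they all speak the same exposition format; only the address and metric prefix change.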
The Apache Ambari project is aimed at making Hadoop management simpler by developing software for provisioning, managing, and monitoring Apache Hadoop clusters. Ambari provides an intuitive, easy-to-use Hadoop management web UI backed by its RESTful APIs. The set of Hadoop components currently supported by Ambari includes HDFS, MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig, and Sqoop.
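Because the web UI is backed by REST APIs, cluster state can also be read programmatically. A minimal sketch, assuming a hypothetical server address and the stock admin/admin credentials:

```python
import base64
import json
import urllib.request

# Assumed defaults: Ambari server on port 8080 with admin/admin
# credentials; adjust for a real cluster.
BASE = "http://ambari.example.com:8080/api/v1"

req = urllib.request.Request(BASE + "/clusters")
token = base64.b64encode(b"admin:admin").decode("ascii")
req.add_header("Authorization", "Basic " + token)

# List the clusters managed by this Ambari instance.
with urllib.request.urlopen(req) as resp:
    data = json.loads(resp.read().decode("utf-8"))

for item in data.get("items", []):
    print(item["Clusters"]["cluster_name"])
```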
hadoop-management hadoop-cluster hadoop-tools admin monitor

Minos is a distributed deployment and monitoring system. It was initially developed and used at Xiaomi to deploy and manage the Hadoop, HBase, and ZooKeeper clusters used in the company, and it can be easily extended to support other systems; HDFS, YARN, and Impala are supported in the current release. The command-line client tool is used to deploy and manage processes of various systems: you can use it to perform deployment tasks such as installing, (re)starting, or stopping a service. Currently the client supports ZooKeeper, HDFS, HBase, YARN, and Impala, and it can be extended to support other systems. Refer to the Using Client section to learn how to use it.
Specialised plugins for Hadoop, Big Data & NoSQL technologies, written by a former Clouderan (Cloudera was the first Hadoop Big Data vendor) and modern Hortonworks partner/consultant. Supports a wide variety of compatible enterprise monitoring systems.
nagios-plugins zookeeper hadoop hbase cloudera hbase-client jenkins travis-ci nagios-plugin hortonworks ambari cassandra elasticsearch docker kafka solr redis rabbitmq consul datastax

The Spring for Apache Hadoop project provides extensions to Spring, Spring Batch, and Spring Integration to build manageable and robust pipeline solutions around Hadoop. Spring for Apache Hadoop extends Spring Batch by providing support for reading from and writing to HDFS, running various types of Hadoop jobs (Java MapReduce, Streaming, Hive, Spark, Pig), and using HBase. An important goal is to provide excellent support for non-Java developers, so they can be productive with Spring Hadoop without having to write any Java code to use the core feature set.
Snakebite is a Python library that provides a pure Python HDFS client and a wrapper around Hadoop's minicluster. The client uses protobuf to communicate with the NameNode and comes in the form of a library and a command-line interface. Currently, the snakebite client supports most actions that involve the NameNode and reading data from DataNodes. Note: all methods that read data from a DataNode are able to check the CRC during transfer, but this is disabled by default for performance reasons, which is the opposite of the stock Hadoop client's behaviour.
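A minimal usage sketch with the snakebite library client; the NameNode host, port, and paths are assumptions, and `check_crc=True` opts back into the CRC verification that is off by default:

```python
# snakebite is historically a Python 2 library, so keep print() portable.
from __future__ import print_function
from snakebite.client import Client

# Assumed NameNode RPC endpoint; 8020 is a common default port.
client = Client("namenode.example.com", 8020, use_trash=False)

# ls() takes a list of paths and yields one dict per entry.
for entry in client.ls(["/"]):
    print(entry["path"], entry["length"])

# cat() yields one content generator per requested path; check_crc=True
# verifies checksums during transfer (disabled by default for speed).
for file_contents in client.cat(["/tmp/example.txt"], check_crc=True):
    for chunk in file_contents:
        print(chunk)
```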
hdfs python-hdfs-client

Apache Hadoop is a framework for running applications on large clusters built of commodity hardware. Hadoop Common supports the other Hadoop subprojects.
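One concrete way to run an application on the framework is Hadoop Streaming, which lets any executable that reads stdin and writes stdout act as mapper and reducer. A hypothetical word-count pair:

```python
#!/usr/bin/env python
# mapper.py: emit "word<TAB>1" for every word read from stdin.
import sys

for line in sys.stdin:
    for word in line.split():
        print("%s\t%d" % (word, 1))
```

```python
#!/usr/bin/env python
# reducer.py: sum counts per word. Hadoop Streaming delivers mapper
# output sorted by key, so identical words arrive consecutively.
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.rstrip("\n").split("\t", 1)
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print("%s\t%d" % (current_word, current_count))
        current_word, current_count = word, int(count)
if current_word is not None:
    print("%s\t%d" % (current_word, current_count))
```

The pair would be submitted with the distribution's hadoop-streaming jar (path varies by install), passing the scripts as `-mapper` and `-reducer` along with `-input` and `-output` HDFS paths.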
cluster map-reduce distributed-file-system

Apache Tajo is a robust big-data relational and distributed data warehouse system for Apache Hadoop. Tajo is designed for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract-transform-load) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources.
data-warehouse etl aggregation analytics sql-on-hadoop

Hue is a web application for interacting with Apache Hadoop. It offers a File Browser for accessing HDFS, a Job Browser for accessing MapReduce jobs (MR1/MR2-YARN), a Job Designer for creating MapReduce/Streaming/Java jobs, an HBase Browser for exploring and modifying HBase tables and data, an Oozie app for submitting and scheduling workflows and bundles, a Pig/HBase/Sqoop2 shell, the Beeswax application for executing Hive queries, and a Search app for querying Solr and Solr Cloud.
hadoop-tools hadoop-client big-data map-reduce

Kudu is a storage system for tables of structured data. Kudu provides a combination of fast inserts/updates and efficient columnar scans to enable multiple real-time analytic workloads across a single storage layer. As a new complement to HDFS and Apache HBase, Kudu gives architects the flexibility to address a wider variety of use cases without exotic workarounds.
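A sketch of the fast-insert/scan combination using the kudu-python client against an existing table; the master address, table name, and column names are assumptions, and the calls follow the kudu-python examples:

```python
import kudu

# Assumed Kudu master endpoint; 7051 is the default master RPC port.
client = kudu.connect(host="kudu-master.example.com", port=7051)
table = client.table("metrics")  # hypothetical pre-existing table

# Writes are batched in a session until flushed.
session = client.new_session()
session.apply(table.new_insert({"host": "web01", "ts": 1, "value": 42.0}))
session.apply(table.new_upsert({"host": "web01", "ts": 1, "value": 43.0}))
session.flush()

# Scans read rows back for analytic queries.
scanner = table.scanner()
scanner.open()
for row in scanner.read_all_tuples():
    print(row)
```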
analytics column-database hadoop-storage-layer real-time distributed

Apache Apex is a unified platform for big data stream and batch processing. Use cases include ingestion, ETL, real-time analytics, alerts, and real-time actions. Apex is a Hadoop-native YARN implementation and uses HDFS by default. It simplifies development and productization of Hadoop applications by reducing time to market. Key features include enterprise-grade operability with fault tolerance, state management, event-processing guarantees, no data loss, in-memory performance and scalability, and native window support. Please visit the documentation section.
HBase provides support for Bigtable-scale data: billions of rows by millions of columns. It is a scalable, distributed, versioned, column-oriented store modeled after Google's Bigtable and runs on top of HDFS (Hadoop Distributed File System). It features per-column-family compression and in-memory operation, and data can be replicated between nodes. HBase is used at Facebook and Twitter.
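A minimal sketch of the row/column-family data model using the third-party happybase client, which talks to HBase through its Thrift gateway; the host, table, and column names are assumptions:

```python
import happybase

# Assumed HBase Thrift gateway host; happybase connects over Thrift.
connection = happybase.Connection("hbase-thrift.example.com")
table = connection.table("users")  # hypothetical table

# Cells live at (row key, column family:qualifier) coordinates.
table.put(b"user-1", {b"info:name": b"Ada", b"info:email": b"ada@example.com"})

# Point lookup by row key.
row = table.row(b"user-1")
print(row[b"info:name"])

# Scans stream rows in row-key order, optionally bounded by a prefix.
for key, data in table.scan(row_prefix=b"user-"):
    print(key, data)
```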
database non-relational column-database no-sql scalable distributed distributed-database

IPython Notebook(s) demonstrating deep learning functionality. IPython Notebook(s) demonstrating scikit-learn functionality.
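The sort of minimal example such scikit-learn notebooks walk through: train a classifier on the bundled iris data and measure held-out accuracy.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the bundled iris dataset and hold out a quarter for evaluation.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a simple classifier and score it on the held-out split.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy: %.3f" % model.score(X_test, y_test))
```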
machine-learning deep-learning data-science big-data aws tensorflow theano caffe scikit-learn kaggle spark mapreduce hadoop matplotlib pandas numpy scipy keras

Prometheus exporter for hardware and OS metrics exposed by *NIX kernels, written in Go with pluggable metric collectors. The WMI exporter is recommended for Windows users.
prometheus-exporter prometheus node-metrics machine-metrics host-metrics metrics procfs system-information system-metrics

Kafka exporter for Prometheus. For other metrics from Kafka, have a look at the JMX exporter. Supports Apache Kafka version 0.10.1.0 (and later).
prometheus prometheus-exporter kafka kafka-metrics metrics

Dr. Elephant is a performance monitoring and tuning tool for Hadoop and Spark. It automatically gathers all the metrics, runs analysis on them, and presents them in a simple way for easy consumption. Its goal is to improve developer productivity and increase cluster efficiency by making it easier to tune jobs. It analyzes Hadoop and Spark jobs using a set of pluggable, configurable, rule-based heuristics that provide insights into how a job performed, and then uses the results to make suggestions about how to tune the job so it performs more efficiently. For more information on Dr. Elephant, see the project's wiki pages.
Apache Spark is an open source cluster computing system that aims to make data analytics fast, both fast to run and fast to write. To run programs faster, Spark offers a general execution model that can optimize arbitrary operator graphs and supports in-memory computing, which lets it query data faster than disk-based engines like Hadoop.
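A classic PySpark sketch of that execution model: the word-count operator graph below is built lazily and executed in memory when the output is written. The HDFS paths are assumptions.

```python
from pyspark import SparkContext

sc = SparkContext(appName="wordcount")

# Each transformation extends the operator graph; nothing runs until
# an action (saveAsTextFile) triggers execution.
counts = (sc.textFile("hdfs:///data/input.txt")     # assumed input path
            .flatMap(lambda line: line.split())
            .map(lambda word: (word, 1))
            .reduceByKey(lambda a, b: a + b))

counts.saveAsTextFile("hdfs:///data/wordcount-output")
sc.stop()
```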
cluster cluster-computing data-analytics analytics hdfs map-reduce big-data

The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage.
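A sketch of querying Hive from Python using the third-party PyHive client against a HiveServer2 endpoint; the host, database, and table are assumptions:

```python
from pyhive import hive

# Assumed HiveServer2 endpoint; 10000 is the default port.
conn = hive.connect(host="hiveserver2.example.com", port=10000,
                    database="default")
cursor = conn.cursor()

# HiveQL lets you query a large dataset in distributed storage as a table.
cursor.execute(
    "SELECT page, COUNT(*) AS hits FROM weblogs GROUP BY page LIMIT 10"
)
for page, hits in cursor.fetchall():
    print(page, hits)

cursor.close()
conn.close()
```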
Hoodie is an Apache Spark library that provides the ability to efficiently do incremental processing on datasets in HDFS.
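A hedged sketch of Hoodie's (now Apache Hudi's) Spark datasource write path, which upserts an incremental batch into a dataset on HDFS; it assumes the Hudi Spark bundle is on the classpath, and the schema, key fields, and paths are illustrative:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hoodie-upsert")
         .getOrCreate())

# Hypothetical incremental batch of records to merge into the dataset.
updates = spark.read.json("hdfs:///data/incoming/")

# Upsert: records with an existing key are updated in place instead of
# rewriting the whole dataset.
(updates.write.format("org.apache.hudi")
    .option("hoodie.table.name", "events")
    .option("hoodie.datasource.write.recordkey.field", "event_id")
    .option("hoodie.datasource.write.precombine.field", "ts")
    .option("hoodie.datasource.write.operation", "upsert")
    .mode("append")
    .save("hdfs:///data/hudi/events"))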
hadoop spark parquet analytics-database ingestion hoodie hudi columnar storage

redis_exporter is a Prometheus exporter for Redis metrics; point it at your Redis instance and adjust the host name accordingly. The project provides an example Kubernetes deployment configuration showing how to deploy the redis_exporter as a sidecar with a Redis instance.
prometheus prometheus-exporter metrics redis redis-cluster redis-exporter redis-nodes