Cabot - Self-hosted, easily-deployable monitoring and alerts service - like a lightweight PagerDuty


It provides a web interface that allows you to monitor services (e.g. "Stage Redis server", "Production ElasticSearch cluster") and send telephone, SMS or HipChat/email alerts to your on-duty team if those services start misbehaving or go down - all without writing a line of code. Best of all, you can use data that you're already pushing to Graphite/statsd to generate alerts, rather than implementing and maintaining a whole new system of data collectors. We built Cabot as a Christmas project at Arachnys because we couldn't wrap our heads around Nagios, and nothing else out there seemed to fit our use case. We're open-sourcing it in the hope that others find it useful.

https://github.com/arachnys/cabot





Related Projects

Kapacitor - Open source framework for processing, monitoring, and alerting on time series data

  •    Go

Kapacitor is an open source framework for processing, monitoring, and alerting on time series data. Kapacitor imports (stream or batch) time series data, and then transforms, analyzes, and acts on the data. It uses Telegraf to collect system metrics on your local machine and store them in InfluxDB.

Cyphon - Incident Management and Response Platform

  •    Python

Cyphon eliminates the headaches of incident management by streamlining a multitude of related tasks through a single platform. It receives, processes and triages events to provide an all-encompassing solution for your analytic workflow — aggregating data, bundling and prioritizing alerts, and empowering analysts to investigate and document incidents.

Kong - The Microservice API Gateway

  •    Lua

Kong is a cloud-native, fast, scalable, and distributed Microservice Abstraction Layer (also known as an API Gateway, API Middleware or in some cases Service Mesh). Backed by the battle-tested NGINX with a focus on high performance, Kong was made available as an open-source platform in 2015. Under active development, Kong is used in production at thousands of organizations, from startups to Global 5000 companies and government organizations.

netdata - Real-time performance monitoring, done right! https://www.netdata.cloud

  •    C

Netdata's distributed, real-time monitoring Agent collects thousands of metrics from systems, hardware, containers, and applications with zero configuration. It runs permanently on all your physical/virtual servers, containers, cloud deployments, and edge/IoT devices, and is perfectly safe to install on your systems mid-incident without any preparation. You can install Netdata on most Linux distributions (Ubuntu, Debian, CentOS, and more), container platforms (Kubernetes clusters, Docker), and many other operating systems (FreeBSD, macOS). No sudo required.

OSSEC - Host-based Intrusion Detection System

  •    C

OSSEC is a full platform to monitor and control your systems. It combines all the aspects of HIDS (host-based intrusion detection), log monitoring and SIM/SIEM in a simple, powerful and open source solution.


Riemann - Monitors Distributed Systems

  •    Clojure

Riemann monitors distributed systems. It aggregates events from your servers and applications with a powerful stream processing language. Send an email for every exception raised by your code. Track the latency distribution of your web app. See the top processes on any host, by memory and CPU. Combine statistics from every Riak node in your cluster and forward to Graphite.

icinga2 - The heart of our monitoring platform with a powerful configuration language and REST API.

  •    C++

Icinga 2 is an open source monitoring system which checks the availability of your network resources, notifies users of outages, and generates performance data for reporting. Scalable and extensible, Icinga 2 can monitor large, complex environments across multiple locations.

Performance Co-Pilot - System Performance and Analysis Framework.

  •    C

Performance Co-Pilot (PCP) provides a framework and services to support system-level performance monitoring and management. It presents a unifying abstraction for all of the performance data in a system, and many tools for interrogating, retrieving and processing that data. The distributed PCP architecture makes it especially useful for those seeking centralized monitoring of distributed processing.

Praeco - Elasticsearch alerting made simple

  •    Vue

Praeco is an alerting tool for Elasticsearch: a GUI for ElastAlert, using the ElastAlert API. It lets you interactively build alerts for your Elasticsearch data using a query builder, and helps you preview and test your alerts using historical data.

gatus - ⛑ Gatus - Automated service health dashboard

  •    Go

Gatus is a health dashboard that gives you the ability to monitor your services using HTTP, ICMP, TCP, and even DNS queries, and to evaluate the results of those queries against a list of conditions on values such as the status code, the response time, the certificate expiration, the body and many others. The icing on top is that each of these health checks can be paired with alerting via Slack, PagerDuty, Discord and even Twilio. Monitoring based on metrics alone cannot tell you that there's a problem if no clients are actively calling the endpoint: such metrics mostly rely on existing traffic, which effectively means that unless your clients are already experiencing a problem, you won't be notified.

Bosun - Time Series Alerting Framework

  •    Go

Bosun is a time series alerting framework developed by Stack Exchange. It has an expressive domain specific language for evaluating alerts and creating detailed notifications. It also lets you test your alerts against history for a faster development experience. The project also includes Scollector, a metric collection agent.

Centreon - Global IT monitoring

  •    Perl

Centreon is one of the most flexible and performant monitoring tools. It is based on Nagios, an effective open source monitoring engine. Centreon brings together the functionality essential to monitoring critical infrastructure.

Unsee - Alert dashboard for Prometheus Alertmanager

  •    Go

Alert dashboard for Prometheus Alertmanager. The Alertmanager UI is useful for browsing alerts and managing silences, but it's lacking as a dashboard tool - unsee aims to fill this gap. Starting with the 0.7.0 release it can also aggregate alerts from multiple Alertmanager instances, running either in HA mode or separately. Duplicate alerts are merged so only unique alerts are displayed. Each alert is tagged with the names of all Alertmanager instances it was found on and can be filtered based on those tags.

LibreNMS - Network monitoring system

  •    PHP

LibreNMS is an autodiscovering PHP/MySQL/SNMP based network monitoring system that supports a wide range of network hardware and operating systems including Cisco, Linux, FreeBSD, Juniper, Brocade, Foundry, HP and many more.

ceph-dash - Flask based API / dashboard for viewing a Ceph cluster's overall health status

  •    Javascript

This is a small and clean approach to providing the overall Ceph cluster health status via a RESTful JSON API as well as via a (hopefully) fancy web GUI. There are no dependencies on the existing ceph-rest-api. This WSGI application talks to the cluster directly via librados. You can find a blog entry regarding monitoring a Ceph cluster with ceph-dash on Crapworks.

Ganglia - scalable distributed monitoring system

  •    C

Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization.

securitybot - Distributed alerting for the masses!

  •    Python

Securitybot is an open-source implementation of a distributed alerting chat bot, as described in Ryan Huber's blog post. Distributed alerting improves the monitoring efficiency of your security team and can help you catch security incidents faster and more efficiently. We've tried to remove all Dropbox-isms from this code so that setting up your own instance should be fairly painless. It should be relatively easy to install the listed requirements in a virtualenv/Docker container and simply have the bot do its thing. We also provide a simple front end to dive through the database, receive API calls, and create custom alerts for the bot to reach out to people as desired. This guide runs through setting up a Securitybot instance as quickly as possible with no frills. We'll be connecting it to Slack, SQL, and Duo. Once we're done, we'll have a file that looks something like main.py.

Elastic HQ - Sleek, intuitive, and powerful ElasticSearch Management and Monitoring

  •    Javascript

ElasticHQ provides a web interface for monitoring, managing, and querying ElasticSearch instances and clusters. It supports real-time cluster monitoring, full cluster management, and management of indices, mappings, shards, aliases, and nodes. It works in your web browser, allowing you to manage and monitor your ElasticSearch clusters from anywhere at any time.

cloud-ops-sandbox - Cloud Operations Sandbox is an open source tool that helps practitioners to learn Service Reliability Engineering practices from Google and apply them on their cloud services using Cloud Operations suite of tools

  •    HTML

Cloud Operations Sandbox is an open-source tool that helps practitioners to learn Service Reliability Engineering practices from Google and apply them on their cloud services using Cloud Operations (formerly Stackdriver). It is based on Hipster Shop, a cloud-native microservices application. Google Cloud Operations Suite is a suite of tools that helps you gain full observability of your code and applications. You might want to take Cloud Operations for a "test drive" to answer the question, "Will it work for my application needs?" The most effective way to learn is by testing the tool in "real-life" conditions, but without risking a production system. With Sandbox, we provide a tool that automatically provisions a new demo cluster, which receives traffic simulating real users. Practitioners can experiment with various Cloud Operations tools to solve problems and accomplish standard SRE tasks in a sandboxed environment.





