ark - Heptio Ark is a utility for managing disaster recovery, specifically for your Kubernetes cluster resources and persistent volumes

The documentation provides a getting started guide, plus information about building from source, architecture, extending Ark, and more. If you encounter issues, review the troubleshooting docs, file an issue, or talk to us on the #ark-dr channel on the Kubernetes Slack server.



Related Projects

Barman - Backup and Recovery manager for PostgreSQL

  •    Python

Barman (Backup and Recovery Manager) is an open source administration tool for disaster recovery of PostgreSQL servers. It allows your organisation to perform remote backups of multiple servers in business-critical environments and helps DBAs during the recovery phase. Its features include backup catalogues, incremental backup, retention policies, remote backup and recovery, and archiving and compression of WAL files and backups.

velero - Backup and migrate Kubernetes applications and their persistent volumes

  •    Go

You can run Velero in clusters on a cloud provider or on-premises. For detailed information, see Compatible Storage Providers. The documentation provides a getting started guide, plus information about building from source, architecture, extending Velero, and more.


  •    Shell

Linux disaster recovery and system migration solution

sonobuoy - Heptio Sonobuoy is a diagnostic tool that makes it easier to understand the state of a Kubernetes cluster by running a set of Kubernetes conformance tests in an accessible and non-destructive manner

  •    Go

Heptio Sonobuoy is a diagnostic tool that makes it easier to understand the state of a Kubernetes cluster by running a set of Kubernetes conformance tests in an accessible and non-destructive manner. It is a customizable, extendable, and cluster-agnostic way to generate clear, informative reports about your cluster. Sonobuoy supports Kubernetes versions 1.9 and later.

gimbal - Heptio Gimbal is an ingress load balancing platform capable of routing traffic to multiple Kubernetes and OpenStack clusters

  •    Go

Heptio Gimbal is a layer-7 load balancing platform built on Kubernetes, the Envoy proxy, and Heptio's Kubernetes Ingress controller, Contour. It provides a scalable, multi-team, and API-driven ingress tier capable of routing Internet traffic to multiple upstream Kubernetes clusters and to traditional infrastructure technologies such as OpenStack. Gimbal was developed out of a joint effort between Heptio and Yahoo Japan Corporation's subsidiary, Actapio, to modernize Yahoo Japan’s infrastructure with Kubernetes, without affecting legacy investments in OpenStack.

aws-iam-authenticator - A tool to use AWS IAM credentials to authenticate to a Kubernetes cluster

  •    Go

A tool to use AWS IAM credentials to authenticate to a Kubernetes cluster. The initial work on this tool was driven by Heptio. The project receives contributions from multiple community engineers and is currently maintained by Heptio and Amazon EKS OSS Engineers. If you are an administrator running a Kubernetes cluster on AWS, you already need to manage AWS IAM credentials to provision and update the cluster. By using AWS IAM Authenticator for Kubernetes, you avoid having to manage a separate credential for Kubernetes access. AWS IAM also provides a number of useful properties, such as an out-of-band audit trail (via CloudTrail) and 2FA/MFA enforcement.

Rook - Storage Orchestration for Kubernetes

  •    Go

Rook is an open source cloud-native storage orchestrator for Kubernetes, providing the platform, framework, and support for a diverse set of storage solutions to natively integrate with cloud-native environments.

stolon - PostgreSQL cloud native High Availability and more.

  •    Go

Stolon is under active development and is used in different environments. Its on-disk format (store hierarchy and key contents) will probably change in the future to support new features. If a breaking change is needed, it will be documented in the release notes and an upgrade path will be provided. In any case, it is quite easy to reset a cluster from scratch while keeping the current master instance working and without losing any data.

WAL-E - An S3-based WAL-shipping disaster recovery and standby toolkit

  •    Python

An S3-based WAL-shipping disaster recovery and standby toolkit

wal-g - Archival and Restoration for Postgres

  •    Go

WAL-G is an archival and restoration tool for Postgres. WAL-G is the successor of WAL-E, with a number of key differences. WAL-G uses LZ4, LZMA, or Brotli compression, multiple processors, and non-exclusive base backups for Postgres. More information on the design and implementation of WAL-G can be found in the Citus Data blog post "Introducing WAL-G by Citus: Faster Disaster Recovery for Postgres".

Priam - Co-Process for backup/recovery, Token Management, and Centralized Configuration management for Cassandra

  •    Java

Priam is a process/tool that runs alongside Apache Cassandra to automate backup and recovery, token management, seed discovery, and configuration, with support for AWS environments. It supports multi-region Cassandra deployment in AWS via public IP, backup throttling, on-the-fly Snappy compression of backup data, backup of SSTables from local ephemeral disks to S3, and a lot more.

Make CD-ROM Recovery

  •    C

Make CD-ROM Recovery (mkCDrec) makes a bootable (El Torito) disaster recovery image, including backups of the Linux system to the same CD-ROM if space permits, or to a multi-volume CD-ROM set.

MS Backup Recovery


MS Backup Recovery uses an advanced, fail-resistant technique to recover Windows backup data. It can also be used to restore XP backup BKF files.

Restore, backup and recovery

  •    Ruby

RESTORE is a complete enterprise network backup and recovery solution. It is scalable to a complete backup solution for multiple workstations, servers and data centers. It operates over local area networks, wide area networks, and the Internet.

Redo Backup and Recovery


Easy rescue system with GUI tools for full system backup, bare metal recovery, partition editing, recovering deleted files, data protection, web browsing, and more. Uses partclone (like Clonezilla) with a UI like Ghost or Acronis. Runs from CD/USB.

backup-utils - GitHub Enterprise Backup Utilities

  •    Shell

This repository includes backup and recovery utilities for GitHub Enterprise. The backup utilities implement a number of advanced capabilities for backup hosts, built on top of the backup and restore features already included in GitHub Enterprise.

kubeadm-ha - Kubernetes high availability deploy based on kubeadm (for v1

  •    Smarty

kube-apiserver: exposes the Kubernetes API. It is the front end for the Kubernetes control plane. It is designed to scale horizontally, that is, it scales by deploying more instances.

etcd: is used as Kubernetes' backing store. All cluster data is stored here. Always have a backup plan for etcd's data for your Kubernetes cluster.

kube-scheduler: watches newly created pods that have no node assigned, and selects a node for them to run on.

kube-controller-manager: runs controllers, which are the background threads that handle routine tasks in the cluster. Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process.

kubelet: is the primary node agent. It watches for pods that have been assigned to its node (either by the apiserver or via a local configuration file).

kube-proxy: enables the Kubernetes service abstraction by maintaining network rules on the host and performing connection forwarding.

keepalived: configures a virtual IP address for the cluster. This virtual IP address points to k8s-master01, k8s-master02, and k8s-master03.

nginx: serves as the load balancer for the apiservers of k8s-master01, k8s-master02, and k8s-master03.

The Kubernetes services on the other nodes connect to the keepalived virtual IP address and the port exposed by nginx (16443) to communicate with the master cluster's apiservers.
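The request path described above (nodes connect to the virtual IP on the nginx port, and nginx spreads traffic over the three masters) can be sketched roughly as follows. This is only an illustration, not kubeadm-ha's actual configuration: the virtual IP is elided in the description, so a documentation-only placeholder address is used; the apiserver port 6443 and the round-robin rotation are assumptions; and the function names are hypothetical.

```python
from itertools import cycle

# Placeholder only: the real virtual IP is not given in the description,
# so a documentation-range address (RFC 5737) stands in for it.
VIRTUAL_IP = "192.0.2.10"
NGINX_PORT = 16443  # nginx port named in the description
MASTERS = ["k8s-master01", "k8s-master02", "k8s-master03"]
APISERVER_PORT = 6443  # assumption: default kube-apiserver secure port


def client_endpoint() -> str:
    """Address the other nodes would use to reach the control plane."""
    return f"https://{VIRTUAL_IP}:{NGINX_PORT}"


def round_robin_backends(masters, port):
    """Return an endless rotation over the apiserver backends.

    Assumption: requests are spread in simple round-robin order.
    """
    return cycle(f"{name}:{port}" for name in masters)


backends = round_robin_backends(MASTERS, APISERVER_PORT)
# Take four picks to show the rotation wrapping back to the first master.
first_four = [next(backends) for _ in range(4)]
```

Here `client_endpoint()` stands for the single address that worker nodes are pointed at, while `first_four` shows successive requests landing on each master in turn before wrapping around.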

Bacula - The Network Backup Solution

  •    C++

Bacula is a set of open source computer programs that permit you (or the system administrator) to manage backup, recovery, and verification of computer data across a network of computers of different kinds. Bacula is relatively easy to use and efficient, while offering many advanced storage management features that make it easy to find and recover lost or damaged files. In technical terms, it is an open source, network-based backup program.

Raigad - Co-Process for backup/recovery, Auto Deployments and Centralized Configuration management for ElasticSearch

  •    Java

Raigad is a process/tool that runs alongside Elasticsearch to automate snapshot backup and restore, tribe node deployments, publishing of Elasticsearch monitoring metrics, and configured deployments for a dedicated master/data/search approach, with support for AWS environments.

awesome-operators - A resource tracking a number of Operators out in the wild.


Operators are Kubernetes-native applications. We define native as being both managed using the Kubernetes APIs via kubectl and run on Kubernetes as containers. Operators take advantage of Kubernetes's extensibility to deliver the automation advantages of cloud services, such as provisioning, scaling, and backup/restore, while being able to run anywhere that Kubernetes can run. This list is built by the community. Have you built, or are you using, an Operator that is not listed? Please send a pull request and we will add that Operator to the list.