A curated list of awesome Chaos Engineering resources. "Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production." (Principles of Chaos Engineering website)
netflix-chaos-monkey chaos-engineering chaos-monkey simian-army site-reliability-engineering resilience chaos chaos-community chaos-testing awesome awesome-list

You can download the Pumba binary for your OS from the release page. Note: for Alpine Linux based images, you need to install the iproute2 package and create a symlink pointing to the distribution files: ln -s /usr/lib/tc /lib/tc.
docker chaos network-emulator testing-tools testing chaos-monkey chaos-testing kubernetes chaos-engineering

Gamified chaos engineering tool for Kubernetes. It is like Space Invaders, but the aliens are pods or worker nodes. With KubeInvaders you can stress a Kubernetes cluster in a fun way and check how resilient it is.
game kubernetes openshift workstation aliens chaos pods chaos-engineering kube-linter kubeinvaders kubeinvaders-container kubernetes-testing

This readme and related documentation are Work in Progress. Controller-manager: used to schedule and manage the lifecycle of CRD objects.
kubernetes microservice site-reliability-engineering cncf chaos operator cloud-native fault-injection hacktoberfest chaos-testing chaos-engineering crd chaos-experiments chaos-mesh

Proxy for simulating real-world distributed system failures to improve resilience in your applications. Muxy is a proxy that mucks with your system and application context, operating at Layers 4, 5 and 7, allowing you to simulate common failure scenarios from the perspective of an application under test, such as an API or a web application.
chaos testing proxy resilience chaos-engineering jvm

Litmus is a toolset to do cloud-native chaos engineering. Litmus provides tools to orchestrate chaos on Kubernetes to help SREs find weaknesses in their deployments. SREs use Litmus to run chaos experiments, initially in the staging environment and eventually in production, to find bugs and vulnerabilities. Fixing these weaknesses increases the resilience of the system.
kubernetes golang ansible microservices site-reliability-engineering cncf operator cloud-native fault-injection hacktoberfest litmus kubernetes-resources chaos-testing chaos-engineering crd operator-sdk chaoshub chaos-experiments

Testing distributed systems under hard failures like network partitions and instance termination is critical, but it's also important to test them under less catastrophic conditions, because these are what they most often experience. Comcast is a tool designed to simulate common network problems like latency, bandwidth restrictions, and dropped/reordered/corrupted packets.
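To make the idea of degraded rather than failed networking concrete, here is a minimal in-process sketch in Python. It is not Comcast and does not reflect how Comcast works internally; the class `FlakyLink`, the helper `delivery_ratio`, and all parameter names are hypothetical and exist only for illustration. It models two of the problems named above, added latency and dropped packets.

```python
import random
import time


class FlakyLink:
    """Toy stand-in for a degraded network link (illustrative only)."""

    def __init__(self, latency_s=0.0, loss_rate=0.0, seed=None, sleep=time.sleep):
        if not 0.0 <= loss_rate <= 1.0:
            raise ValueError("loss_rate must be between 0 and 1")
        self.latency_s = latency_s
        self.loss_rate = loss_rate
        self._rng = random.Random(seed)  # seedable for repeatable experiments
        self._sleep = sleep              # injectable so tests need not wait

    def send(self, packet):
        """Return the packet after the configured delay, or None if dropped."""
        if self._rng.random() < self.loss_rate:
            return None
        if self.latency_s > 0:
            self._sleep(self.latency_s)
        return packet


def delivery_ratio(link, n_packets):
    """Send n_packets through the link and return the fraction delivered."""
    delivered = sum(1 for i in range(n_packets) if link.send(i) is not None)
    return delivered / n_packets
```

Code under test can be pointed at such a link with, say, `FlakyLink(latency_s=0.2, loss_rate=0.05)` to see whether its timeouts and retries behave sensibly under mild degradation.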
latency chaos bandwidth packet-loss chaos-engineering golang testing testing-tools networking

This project provides a Chaos Monkey for Spring Boot applications and will try to attack your running Spring Boot App.
spring spring-boot spring-cloud chaos-engineering chaos-monkey chaos-testing chaostoolkit testing testing-tools test-framework engineering spring-cloud-netflix

Go client to the Chaos Monkey REST API that can be used to trigger and retrieve chaos events. This project was started for the purpose of controlled failure injection during GameDay events.
chaos-monkey chaos-engineering chaos-testing aws

This project provides a highly configurable Docker image of the Simian Army as a sound basis for automating chaos experiments. This example is safe to run, as Chaos Monkey will operate in dry-run mode by default. It's a good way to get a feel for the application without taking a risk.
docker chaos-monkey chaos-engineering chaos-testing

Litmus is chaos engineering for stateful workloads on Kubernetes, hopefully without learning curves. Our vision includes enabling end users to quickly specify needed scenarios using English descriptions. The primary objective of Litmus is to ensure consistent and reliable behavior of workloads running in Kubernetes. It also aims to catch hard-to-test bugs and unacceptable behaviors before users do. Litmus strives to detect real-world issues that escape unit and integration tests.
performance-testing workload-automation e2e-tests user-story kubernetes-applications containerised-tests chaos-engineering

Platform chaos is a collection of tools and SDKs that enable engineers to experiment on distributed systems built atop PaaS offerings to ensure confidence in such a system's capabilities. It does so by defining a common interface for inducing chaos, through a construct we call chaos extensions. Given this common interface, we're able to provide tooling that can schedule, start, and stop chaotic events. This project is the core SDK that enables chaos extension development using NodeJS.
nodejs azure chaos-engineering platform chaos

mizumochi is a tool to simulate unstable disk I/O for testing the stability and robustness of a system. Here, unstable means that read/write speed is slowed down. We assume mizumochi runs in a development environment alongside the target system.
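As a rough illustration of what slowed-down writes look like from the application's side, the Python sketch below throttles writes to a file-like object. It says nothing about how mizumochi itself is implemented; `SlowWriter` and its `bytes_per_s` parameter are hypothetical names chosen for this example.

```python
import io
import time


class SlowWriter:
    """Wrap a binary file-like object and throttle writes to about bytes_per_s."""

    def __init__(self, raw, bytes_per_s, sleep=time.sleep):
        if bytes_per_s <= 0:
            raise ValueError("bytes_per_s must be positive")
        self._raw = raw
        self._bytes_per_s = bytes_per_s
        self._sleep = sleep  # injectable so tests do not have to wait

    def write(self, data):
        # Pause in proportion to the payload size, then do the real write.
        self._sleep(len(data) / self._bytes_per_s)
        return self._raw.write(data)
```

Wrapping a buffer as `SlowWriter(io.BytesIO(), bytes_per_s=1024)` and writing 512 bytes would pause for roughly half a second, which is enough to expose code that assumes disk writes return immediately.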
chaos-engineering command-line-tool

This GitHub repo is for the Chaos Engineering Bootcamp.
chaos-engineering chaos

This solution walks you through a prescriptive implementation of Distributed Load Testing using AWS Fargate and Taurus. You can use it to test your services under high-stress scenarios and understand their behavior and scalability. Taurus acts as a wrapper around JMeter and allows you to generate HTTP requests in parallel, simulating a real-world scenario. This solution shows how to run Taurus in Docker containers and deploy them to Fargate clusters running in different AWS regions, so that you can simulate requests coming into your service from different geographic locations.
taurus aws-fargate docker chaos-engineering gameday aws aws-ecs performance-testing

To demonstrate the different issues and failures as well as how to fix them, I've been using the commands and resources as shown below. NOTE: whenever you see a 📄 icon, it means this is a reference to the official Kubernetes docs.
kubernetes troubleshooting debugging network storage security chaos-engineering observability distributed-tracing

You might think it must be the RPC endpoint part, which makes your business logic accessible from the network as a real service. But this is not true. By leveraging open-source RPC packages such as Apache Thrift or gRPC, this part can be extremely easy, apart from defining your service interface with some IDL.
microservice circuit-break rate-limit chaos-engineering

Which will delete, every minute, a pod in the current namespace matching the run=nginx selector. Cron expressions are based on the robfig/cron implementation.
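The selection step described above can be sketched in Python as follows. This is an assumption-laden illustration, not the project's code: it handles only equality selectors of the form `key=value`, leaves the cron scheduling out entirely, and the functions `parse_selector` and `pick_victim` as well as the pod names are hypothetical.

```python
import random


def parse_selector(selector):
    """Turn 'run=nginx,tier=web' into {'run': 'nginx', 'tier': 'web'}.

    Only equality terms (key=value) are supported in this sketch.
    """
    pairs = (item.split("=", 1) for item in selector.split(",") if item)
    return {key.strip(): value.strip() for key, value in pairs}


def pick_victim(pods, selector, rng=random):
    """Return one random pod name whose labels satisfy the selector, or None."""
    wanted = parse_selector(selector)
    matches = [
        name
        for name, labels in pods.items()
        if all(labels.get(key) == value for key, value in wanted.items())
    ]
    return rng.choice(matches) if matches else None


# Hypothetical pods in the current namespace, keyed by name.
pods = {
    "nginx-1": {"run": "nginx"},
    "nginx-2": {"run": "nginx"},
    "redis-1": {"run": "redis"},
}
print(pick_victim(pods, "run=nginx"))  # one of the two nginx pods
```

A scheduler would call `pick_victim` once per tick and delete the returned pod. Returning `None` when nothing matches keeps a mistyped selector from turning into an error loop.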
kubernetes chaos-engineering crd operator

Perses allows you to dynamically inject failure/latency at the bytecode level, without the need to add any dependency or restart/deploy the target app. Just load 2 jars in the same environment where the target JVM is running and execute java -jar perses-injector.jar <Target Application name>. Perses is designed to enable developers and QAs to easily reproduce and debug tricky production issues.
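Perses works on JVM bytecode, which has no direct equivalent here. As a loose analogy only, the Python sketch below shows the same general idea of injecting latency or failure into running code without editing the target's source, by swapping a method at runtime. The helper `inject_fault` and the class `PaymentClient` are hypothetical names invented for this example.

```python
import functools
import time


def inject_fault(owner, attr, delay_s=0.0, error=None, sleep=time.sleep):
    """Replace owner.attr with a wrapper that delays and/or raises.

    Returns a zero-argument function that restores the original attribute.
    """
    original = getattr(owner, attr)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        if delay_s > 0:
            sleep(delay_s)   # injected latency
        if error is not None:
            raise error      # injected failure
        return original(*args, **kwargs)

    setattr(owner, attr, wrapper)
    return lambda: setattr(owner, attr, original)


class PaymentClient:  # hypothetical target; its source is left untouched
    def charge(self, amount):
        return f"charged {amount}"
```

Calling `undo = inject_fault(PaymentClient, "charge", delay_s=0.5)` slows every `charge` call on existing and new instances, and calling `undo()` puts the original method back.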
chaos-engineering jvm

Provides many misbehavior cases as a service. Displays various information about the current server process.
testing nomad k8s failure-injection disaster-recovery failure failure-detection chaos-engineering chaos-testing