
awesome-chaos-engineering - A curated list of awesome Chaos Engineering resources.


Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production (definition from the Principles of Chaos Engineering website).

awesome-scalability - Scalable, Available, Stable, Performant, and Intelligent System Design Patterns


An updated and curated list of readings that illustrate best practices and patterns for building scalable, available, stable, performant, and intelligent large-scale systems. Concepts are explained through articles by prominent engineers and credible references. Case studies are taken from battle-tested systems that serve millions to billions of users. Start by understanding your problem: a scalability problem (fast for a single user but slow under heavy load) or a performance problem (slow even for a single user). Then review the design principles and see how tech companies have solved scalability and performance problems. The intelligence section is intended for those who work with data and machine learning at big (data) and deep (learning) scale.

postmortem-templates - A collection of postmortem templates


This is a collection of postmortem templates derived from various sources, such as the Site Reliability Engineering book, The Practice of Cloud System Administration book, and other online resources. The templates can be loaded automatically, so you do not have to copy and paste from the files or write out the structure by hand every time you author an incident report.