A curated list of awesome Site Reliability and Production Engineering resources.
site-reliability-engineering production availability monitoring post-mortem reliability-engineering capacity-planning service-level-agreement scalability reliability alerting on-call site-reliability postmortem incident-response sre awesome awesome-list devops observability

This workshop teaches students the concepts and tools needed to debug Node.js applications in production and post-mortem on SmartOS. It is presented as a series of short hands-on exercises. You will need access to a SmartOS instance to run this workshop.
debug debugging workshop postmortem production mdb smartos

This is a collection of postmortem templates derived from various sources, such as the Site Reliability Engineering book, The Practice of Cloud System Administration book, and other online resources. The templates can be loaded automatically, so you do not have to copy and paste from the files or write the structure by hand every time you author an incident report.
site-reliability-engineering site-reliability devops postmortem incident-reports post-mortem

Run any of the examples to generate a core dump, then use corevis to make an HTML file with analysis.
coredump mdb node.js postmortem

Calculate how much downtime should be permitted in your Service Level Agreement or Objective.
calculator devops availability site-reliability-engineering service-level-agreement slo service-level-objective service-level-indicator sla chaos-engineering postmortem site-reliability service-level
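
The arithmetic behind such a downtime calculator can be sketched in a few lines. This is a minimal illustration, not the listed project's code: the function name `allowed_downtime` and its parameters are hypothetical, and it assumes the target is given as an availability percentage over a fixed window.

```python
from datetime import timedelta


def allowed_downtime(availability_percent: float, period: timedelta) -> timedelta:
    """Return the downtime budget for an availability target over a period.

    Hypothetical helper for illustration; assumes the target is a
    percentage between 0 and 100 (for example 99.9).
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # The permitted downtime is the fraction of the period that may be unavailable.
    unavailable_fraction = 1 - availability_percent / 100
    return period * unavailable_fraction


if __name__ == "__main__":
    # A 99.9% target over a 30-day window.
    print(allowed_downtime(99.9, timedelta(days=30)))  # 0:43:12
```

A 99.9% target over 30 days therefore leaves about 43 minutes of downtime. A stricter target or a shorter window shrinks the budget proportionally.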