MTTR
MTTR stands for Mean Time To Recovery. It is a key performance indicator used in incident management and reliability engineering. MTTR is the average time it takes to restore a system or service to normal operation after a failure or incident.
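As a rough illustration of the definition above, MTTR can be computed as the total recovery time divided by the number of incidents. The sketch below assumes a hypothetical incident log of (failure time, restoration time) pairs; the sample timestamps and the function name mean_time_to_recovery are made up for the example and do not come from any particular tool.

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (failure time, service restored time) pairs.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 30)),    # 30 min
    (datetime(2024, 2, 12, 14, 0), datetime(2024, 2, 12, 15, 30)),  # 90 min
    (datetime(2024, 3, 20, 8, 15), datetime(2024, 3, 20, 8, 45)),   # 30 min
]


def mean_time_to_recovery(incidents):
    """Return the average of (restored - failed) over all incidents."""
    if not incidents:
        raise ValueError("MTTR is undefined without at least one incident")
    # Sum the per-incident recovery durations, starting from a zero timedelta.
    total = sum((restored - failed for failed, restored in incidents), timedelta())
    return total / len(incidents)


mttr = mean_time_to_recovery(incidents)
print(mttr)  # 0:50:00, i.e. (30 + 90 + 30) / 3 = 50 minutes
```

In this sketch the clock starts at the recorded failure time. Teams that start it at detection or at ticket creation instead will get a different number from the same incidents, so the measurement window should be stated alongside the metric.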
Organizations use MTTR as a metric to assess the efficiency of their incident response and resolution processes. A lower MTTR generally indicates a more effective and rapid response to incidents, minimizing downtime and impact on users. Reducing MTTR is a common goal in IT and DevOps practices, as it contributes to improved system reliability and availability.
Since failures are often caused by released changes, MTTR is also correlated with the ability to ship (and roll back) changes quickly.