Companies invest a lot of time and thought in ways to reduce risk in their systems. Minimizing that risk helps our businesses avoid disrupting the thousands of applications that rely on those systems. However, no system is perfect: accidents happen and failures occur. It's just a matter of time.
What matters is what happens after an incident. We stand to learn a lot from failure, which is why we conduct post-mortems: retrospectives on what went wrong and how to keep it from happening again. As an operations team prepares for the eventualities that may cause a system to fail, it also needs a post-mortem strategy for that plan to be effective. Keep reading for some quick tips on how to run a post-mortem for every incident.
What Makes a Good Post-Mortem?
Every company that appreciates the importance of looking back after an incident is always trying to perfect its post-mortem technique. A post-mortem is a critical part of preventing failures and outages in our systems. Keep the following in mind when you conduct a post-mortem on an incident:
The person who was on call during the incident (that is, the primary person who dealt with it) should write up a post-mortem immediately after the incident. The team should then go through the post-mortem in its next meeting. Every incident should be discussed, whether small (it could have been dangerous) or large (it resulted in a failure or outage). Here are some of the questions that should be asked during the meeting:
The post-mortem needs to show what went wrong and the order in which the team responded. Just saying that a particular user behaved unpredictably or that a process died isn't enough. Going through the entire sequence of events in order helps you learn where to look for symptoms of failure in the future.
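One way to keep the account in order is to record each symptom and response with a timestamp and sort them before the meeting. The following is a minimal sketch, not a prescribed format: the class names, the `kind` labels, and the sample incident are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class TimelineEntry:
    at: datetime  # when the event happened
    kind: str     # hypothetical labels: "symptom", "action", "resolution"
    note: str     # what was observed or done


@dataclass
class PostMortem:
    title: str
    author: str  # the person who was on call
    entries: List[TimelineEntry] = field(default_factory=list)

    def record(self, at: datetime, kind: str, note: str) -> None:
        """Add an event; entries may be recorded in any order."""
        self.entries.append(TimelineEntry(at, kind, note))

    def timeline(self) -> List[TimelineEntry]:
        """Return the events in chronological order."""
        return sorted(self.entries, key=lambda e: e.at)

    def render(self) -> str:
        """Produce a plain-text write-up for the team meeting."""
        lines = [f"Post-mortem: {self.title} (written by {self.author})"]
        for e in self.timeline():
            lines.append(f"{e.at:%H:%M} [{e.kind}] {e.note}")
        return "\n".join(lines)


# Illustrative data only; not taken from a real incident.
pm = PostMortem("API outage", "on-call engineer")
pm.record(datetime(2024, 1, 5, 10, 20), "action", "Restarted worker process")
pm.record(datetime(2024, 1, 5, 10, 5), "symptom", "Error rate rose on the API")
pm.record(datetime(2024, 1, 5, 10, 40), "resolution", "Error rate back to normal")
print(pm.render())
```

Sorting at read time rather than at write time means the on-call person can jot down events as they remember them and still hand the team a chronological account.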
A Post-Mortem Must Have an Agenda
When conducting a post-mortem, the last thing you want is a disorganized meeting that, after an hour, leaves you having learned nothing. You need an agenda; even the most relaxed meetings need one. Make sure that the agenda addresses all the issues raised by the post-mortem.
A successful outage resolution must go hand-in-hand with a comprehensive post-mortem. Give your team an opportunity to learn by making sure that everything is documented correctly. Doing so gives the company a chance to grow and makes it far less likely that the same mistakes will be repeated.