Creating a company culture that can weather failure

Do have processes that take into account that people get tired

Many incident reports include a phrase like “it was now three o’clock in the morning” followed by a decision that actually prolonged the problem, but Lambert points out that “being late at night doesn’t change the frequency of alerts.”

“Incidents caused by failures of machines and networks are not more frequent out of hours, but they are harder to respond to.” For one thing, during the day there are more people around to spot problems sooner. For another, unless you have dedicated support staff working shifts, “the person who has to deal with it has to get paged, they might be tired or distracted.”

When you review what you can learn from an incident, look at what information was available to the people working on the problem and how quickly they could get it. That lets you develop clear guidelines that keep stress, confusion or fatigue from compounding the problem.

“What can go wrong in high pressure situations is that people can essentially lose sight of the goal of fixing the problem,” warned Lambert. “You can also lose a lot of context and focus by having too many sources of information so we’ve learned to be very targeted about the information you pick.”

To avoid late-night confusion, Nather suggests that “it’s good to train until it becomes a reflex so you don’t have to think so hard about who you’re supposed to call; it comes more automatically.”

Don’t ignore technical debt

Technical debt can be the reason you fall prey to ransomware, or it can just make key processes slower and less efficient.

“Assess your assets for business criticality, level of non-compliance with security hygiene, cost to remediate, and risk to the business if the asset is compromised, and develop lower cost, lower risk mitigations while you work on the most complex infrastructure renovations,” advised Luta Security CEO Katie Moussouris.
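One way to make that assessment concrete is to score each asset on the four factors Moussouris lists and sort the results. The sketch below is only an illustration: the 1-to-5 scales, the `priority` formula (exposure divided by remediation cost) and the asset names are assumptions invented for this example, not a method she describes.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    criticality: int       # business criticality, 1 (low) to 5 (high); assumed scale
    noncompliance: int     # gap from security hygiene baseline, 1 to 5; assumed scale
    compromise_risk: int   # risk to the business if compromised, 1 to 5; assumed scale
    remediation_cost: int  # cost to remediate, 1 (cheap) to 5 (expensive); assumed scale


def priority(asset: Asset) -> float:
    """Hypothetical score: higher exposure and lower cost rank first."""
    exposure = asset.criticality * asset.noncompliance * asset.compromise_risk
    return exposure / asset.remediation_cost


def rank(assets: list[Asset]) -> list[Asset]:
    """Return assets ordered from highest to lowest priority score."""
    return sorted(assets, key=priority, reverse=True)


# Made-up inventory, purely for illustration.
assets = [
    Asset("legacy-file-server", criticality=4, noncompliance=5,
          compromise_risk=4, remediation_cost=5),
    Asset("hr-portal", criticality=3, noncompliance=2,
          compromise_risk=3, remediation_cost=1),
    Asset("build-agent", criticality=2, noncompliance=4,
          compromise_risk=2, remediation_cost=2),
]

for asset in rank(assets):
    print(f"{asset.name}: {priority(asset):.1f}")
```

With this formula, a cheap fix on a moderately exposed system can outrank an expensive renovation of a more exposed one, which loosely mirrors the advice to apply lower cost mitigations while the complex work is still under way. A real program would weight the factors to suit the business rather than multiply them blindly.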

"Then develop a plan to keep the org healthy on an ongoing basis and make sure this plan itself is also reviewed for relevance and adjusted. Much of the technical debt that built up in the first place was due to an incorrect notion that whatever is working on the network shouldn't be touched in case it breaks.”