Lessons from the CrowdStrike incident

Organizations, even those not affected by the CrowdStrike incident, should resist the temptation to attribute the IT breakdown to extraordinary circumstances.

Building cyber resilience: lessons from the CrowdStrike incident

After the dust settles from the cyber incident caused by CrowdStrike’s release of a corrupted update, many organizations will (or should) conduct a thorough post-mortem analysis to determine what impact the incident had on their business and what could be done differently going forward.

For most critical infrastructure operators and large organizations, the tried and tested cyber resilience plan will no doubt have been put into action. However, the incident, dubbed the “largest IT outage in history,” was probably something no organization, no matter how large or how compliant with cybersecurity frameworks, could have prepared for. It felt like an “Armageddon moment,” as the disruptions at major airports on Friday demonstrated.

A company can prepare for its own systems, or those of key partners, being unavailable. However, when an incident is so wide-ranging that it affects air traffic control, government transportation departments, transportation providers, the restaurants in the airport, and even the TV stations that could warn passengers of the problem, a company's preparation is likely to extend only to its own systems. Fortunately, incidents of this magnitude are rare.

Friday’s incident shows that only a small percentage of devices need to go offline to trigger a larger global incident. Microsoft confirmed that 8.5 million devices were affected – a conservative estimate would be between 0.5 and 0.75% of all PC devices.
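To put those percentages in perspective, the short sketch below works backwards from the two figures quoted above (8.5 million affected devices, estimated at 0.5 to 0.75% of all PCs) to the total number of devices they imply. The function and variable names are illustrative only, and the result is simple arithmetic on the quoted estimate, not an independently verified device count.

```python
# Figures quoted in the text: 8.5 million affected devices,
# estimated at between 0.5% and 0.75% of all PC devices.
AFFECTED_DEVICES = 8_500_000
SHARE_LOW = 0.005    # 0.5%
SHARE_HIGH = 0.0075  # 0.75%


def implied_installed_base(affected: int, share: float) -> float:
    """Return the total device count implied by an affected count and its share."""
    if not 0 < share <= 1:
        raise ValueError("share must be a fraction in (0, 1]")
    return affected / share


# A smaller share implies a larger total, so the low share gives the upper bound.
upper = implied_installed_base(AFFECTED_DEVICES, SHARE_LOW)
lower = implied_installed_base(AFFECTED_DEVICES, SHARE_HIGH)

print(f"Implied installed base: {lower / 1e9:.2f} to {upper / 1e9:.2f} billion devices")
```

In other words, the estimate assumes a worldwide base of somewhere over a billion PCs, of which fewer than one in a hundred went offline.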

However, this small percentage consists of the devices that need to be kept secure and constantly operational. They are used for critical services, which is why the companies that operate them apply security updates and patches as soon as they are available. Failure to do so can have serious consequences and cause cyber incident experts to question the organization’s reasoning and competence in dealing with cybersecurity risks.

Importance of cyber resilience plans

A detailed and comprehensive cyber resilience plan can help get your business back up and running quickly. However, in exceptional situations like this one, your business may still be unable to operate because the organizations it depends on are not as well prepared or cannot provide the necessary resources quickly enough. No business can foresee every scenario and completely eliminate the risk of business interruption.

Still, it’s important that all organizations adopt a cyber resilience plan and test it periodically to ensure it performs as expected. The plan can even be tested jointly with direct business partners, but testing on the scale of the CrowdStrike Friday incident is likely impractical. In previous blog posts, I’ve detailed the core elements of cyber resilience. Two that may help you are #ShieldsUp and these guidelines to help small businesses improve their preparedness.

The most important message from last Friday’s incident is not to skip the debriefing or attribute the incident to exceptional circumstances. Analyzing and learning from an incident will help you better manage future incidents. This analysis should also consider reliance on only a few vendors, the pitfalls of a technology monoculture environment, and the benefits of implementing technology diversity to reduce risk.

All eggs in one basket

There are several reasons why companies rely on a single vendor. One is obviously cost efficiency. Others are likely the appeal of a single-pane-of-glass approach and the desire to avoid multiple management platforms and incompatibilities between similar solutions sitting side by side. It may be time for companies to consider how proven coexistence with their competitors and diversified product choices can reduce risk and benefit customers. This could even take the form of an industry requirement or standard.

A post-incident review should also be conducted by those who were not affected by CrowdStrike Friday. You have seen the devastation that an exceptional cyber incident can cause, and while you were not affected this time, you may not be so lucky next time. So use the lessons others have learned from this incident to improve your own cyber resilience.

Finally, running technology so old that it cannot be affected by such an incident is not the way to avoid one. Over the weekend, someone alerted me to an article that said Southwest Airlines was not affected, supposedly because it uses Windows 3.1 and Windows 95, with Windows 3.1 not having been updated in over 20 years. I’m not sure whether any anti-malware products still support and protect this outdated technology. A strategy built on old technology may not give me the confidence I need to fly Southwest in the near future. Old technology is not the answer or a viable cyber resilience plan; it is a disaster waiting to happen.