CrowdStrike incident: Blue screen of death

CrowdStrike Holdings, Inc. is an American cybersecurity technology company based in Austin, Texas. It provides cloud workload protection and endpoint security, threat intelligence, and cyberattack response services. CrowdStrike partners with companies like Microsoft to deploy tools like Falcon to protect against hacker attacks and security threats.

On Friday, July 19, 2024, CrowdStrike released a configuration update for its Falcon Sensor software, which is installed on Windows PCs to detect intrusions and hacking attempts. The update was supposed to bring only minor improvements that would have been barely noticeable to customers. Instead, a logic error in the update caused many Windows computers running the Falcon Sensor to crash with the infamous Blue Screen of Death and restart repeatedly.

Impact of the incident

The CrowdStrike update incident had a profound impact, affecting an estimated 8.5 million Windows devices across businesses, public services, and other user groups.

The incident caused a significant IT outage with global reach, and disruptions to critical systems had far-reaching consequences. The downtime resulted in significant financial losses due to business interruption, lost productivity, potential fines, and the cost of recovering affected systems. According to an analysis by Parametrix, Fortune 500 companies lost an estimated $5.4 billion due to the outage (Parametrix, 2024).

In addition, confidence in CrowdStrike’s reliability was undermined, damaging both the company’s reputation and customer trust. Its stock value declined sharply: shares plunged nearly 13% in premarket trading on Friday (McKenna, 2024).

The update was not released for Mac and Linux systems, so those systems were not affected.

What affected organizations could have done better

This incident highlights the need for effective patch management, incident management and robust business continuity strategies to ensure organizations can restore normal operations after an incident.

For the affected organizations facing recovery issues, several key factors were critical:

1. Poor patch management practices

In the CrowdStrike incident, poor patch management practices played a critical role in allowing the faulty update to reach production systems. Testing patches in a controlled environment before deployment is crucial to ensure they do not cause new issues.

2. Inadequate incident response plan

Without a clearly defined incident response plan, companies experienced delayed and uncoordinated responses, which further exacerbated disruptions.

3. Inadequate BC/DR preparation

The lack of effective business continuity and disaster recovery (BC/DR) measures resulted in prolonged downtime and operational losses.

4. Untrained staff

  • Lack of training and awareness: Inadequate incident response training resulted in an ineffective response and exacerbated the impact of the outage.
  • Inability to implement mitigation strategies: Untrained personnel had difficulty implementing effective risk mitigation strategies such as system isolation and communication.

What organizations can do to prevent similar incidents

Preparation and training are key. Blue Team training programs such as Certified Network Defender (C|ND), EC-Council Certified Incident Handler (E|CIH), and EC-Council Disaster Recovery Professional (E|DRP) can help organizations be better prepared for incidents and manage the technical, administrative, and human aspects of dealing with such disasters.

1. Effective patch management

Effective patch management is a critical aspect of maintaining the security and business continuity of any IT infrastructure. It involves testing and deploying timely updates and patches to software and systems to address vulnerabilities, improve performance, and extend functionality. Employees must be trained with comprehensive defensive training programs such as the C|ND, which teaches participants patch management best practices and helps organizations mitigate security risks, avoid downtime, and ensure compliance with industry regulations.
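One common way to put "test before you deploy" into practice is a staged (ring-based) rollout: the update goes to a small test group first and only moves on if those machines stay healthy. The Python sketch below illustrates the idea under stated assumptions. The names `Ring`, `staged_rollout`, `apply_update`, and `is_healthy` are hypothetical, the hosts are made up, and the faulty update is simulated; it is not how CrowdStrike or any specific patching tool works.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Ring:
    """A group of hosts that receives an update at the same time (hypothetical)."""
    name: str
    hosts: list[str]


def staged_rollout(
    rings: list[Ring],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    max_failure_rate: float = 0.0,
) -> tuple[list[str], Optional[str]]:
    """Deploy ring by ring and halt at the first ring that fails its health gate.

    Returns the names of the rings that completed and the name of the ring
    where the rollout was halted (None if every ring passed).
    """
    completed: list[str] = []
    for ring in rings:
        for host in ring.hosts:
            apply_update(host)
        failures = sum(1 for host in ring.hosts if not is_healthy(host))
        failure_rate = failures / len(ring.hosts) if ring.hosts else 0.0
        if failure_rate > max_failure_rate:
            return completed, ring.name  # later rings are never touched
        completed.append(ring.name)
    return completed, None


# Illustrative data: a lab ring, a pilot ring, and production.
rings = [
    Ring("lab", ["lab-1", "lab-2"]),
    Ring("pilot", ["pc-1", "pc-2", "pc-3"]),
    Ring("production", ["srv-1", "srv-2"]),
]

updated: set[str] = set()


def apply_update(host: str) -> None:
    updated.add(host)


def is_healthy(host: str) -> bool:
    # Simulated faulty update: every host that received it becomes unhealthy.
    return host not in updated


completed, halted_at = staged_rollout(rings, apply_update, is_healthy)
print(completed, halted_at)
```

In this simulation the rollout stops at the lab ring, so the pilot and production hosts never receive the faulty update. In a real environment the health check would be something like a successful reboot and agent check-in, and the acceptable failure rate would be a policy decision.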

2. Comprehensive incident management and response

Organizations need to develop structured incident response plans and strategies. A well-defined incident response plan includes predefined procedures for handling software conflicts and failures, which ensures that the response is quick and coordinated. Timely detection and analysis help contain the damage and achieve faster recovery. EC-Council’s E|CIH program provides students with the knowledge, skills, and abilities to effectively prepare for, deal with, and eradicate threats and threat actors when an incident occurs. The program teaches skills to build robust incident response frameworks and best practices for incident handling.

3. Proactive risk management

By proactively assessing potential risks, organizations can identify and address them before they become serious problems. Organizations must also learn to identify and assess risks associated with updates and vendor dependencies. Training programs such as C|ND and E|DRP provide the skills needed to anticipate threats and develop and implement strategies to mitigate or eliminate risks, such as updating security protocols or conducting regular system audits.
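A simple way to make risk assessment concrete is a qualitative risk register that scores each risk as likelihood times impact and flags anything above a threshold for treatment. The sketch below assumes a 1 to 5 scale for both factors and a threshold of 12. The scale, the threshold, the risk entries, and the names `Risk` and `prioritize` are all illustrative assumptions, not values taken from the incident or from any training program.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """One entry in a hypothetical risk register (both factors on a 1-5 scale)."""
    name: str
    likelihood: int
    impact: int

    @property
    def score(self) -> int:
        # Classic qualitative scoring: likelihood multiplied by impact.
        return self.likelihood * self.impact


def prioritize(risks: list[Risk], threshold: int = 12) -> list[Risk]:
    """Return the risks at or above the threshold, highest score first."""
    ranked = sorted(risks, key=lambda r: r.score, reverse=True)
    return [r for r in ranked if r.score >= threshold]


# Illustrative entries only; real scores come from an organization's own assessment.
register = [
    Risk("Single vendor for endpoint security", likelihood=2, impact=4),
    Risk("Vendor updates applied to all hosts at once", likelihood=3, impact=5),
    Risk("No documented offline recovery procedure", likelihood=3, impact=4),
]

for risk in prioritize(register):
    print(f"{risk.score:>2}  {risk.name}")
```

With these example values, the unstaged vendor update (score 15) and the missing recovery procedure (score 12) are flagged, while the vendor dependency (score 8) falls below the threshold and would be monitored rather than treated immediately.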

4. Improved BC/DR planning

Effective business continuity and disaster recovery (BC/DR) planning helps minimize downtime and ensures that critical business functions can continue or be quickly restored, reducing the impact on operations. Organizations need robust change management processes to assess the impact of updates on critical systems. Maintaining backups and redundant systems ensures continuity even in the event of outages. Regular data backups and failover mechanisms are critical. EC-Council’s E|DRP program provides a solid understanding of business continuity and disaster recovery principles, including conducting business impact analysis, assessing risks, and developing and implementing policies, procedures, and BC/DR plans.

5. Best security practices

Following security policies such as timely patching, network segmentation, and access controls helps prevent vulnerabilities that could lead to incidents. Implementing security best practices, conducting employee training, and running exercise simulations are critical to preventing cybersecurity incidents. EC-Council’s industry-recognized, accredited, and hands-on training programs can help organizations build a solid skills base and avoid incidents like the CrowdStrike outage.

The CrowdStrike incident highlighted how a simple problem like a faulty security software update can cause widespread disruption. To prevent similar incidents, organizations must implement robust patch management, comprehensive incident response plans, proactive risk management, and enhanced business continuity and disaster recovery strategies, and must adhere to security best practices. Training programs such as EC-Council’s Certified Network Defender (C|ND), Certified Incident Handler (E|CIH), and Disaster Recovery Professional (E|DRP) can help organizations be better prepared for incidents and manage the technical, administrative, and human aspects of such crises.

References:

Parametrix. (2024, July 24). CrowdStrike to cost Fortune 500 $5.4 billion. Parametrix. https://www.parametrixinsurance.com/in-the-news/crowdstrike-to-cost-fortune-500-5-4-billion

McKenna, G. (2024, July 28). CrowdStrike shares fall as IT disruption continues. BBC News. https://www.bbc.com/news/articles/c725knvnk5zo