CrowdStrike Chaos Underscores Serious Downsides of Automatic Software Updates

Cloud technology is here to stay. That means it’s time for a serious discussion about how much control OEM software vendors have over updates.

The recent CrowdStrike outage marked a watershed moment for IT and touched off a global conversation about the downsides of automatic software updates in the cloud era.

The incident affected approximately 8.5 million Windows devices across thousands of business IT environments. Entire health systems went offline, paychecks for millions of workers were delayed, and airlines canceled an entire day’s worth of flights, while those that could still fly resorted to sharing crucial updates by hand on whiteboards.

While the media was quick to assign blame, experts say the technology industry’s shift to cloud-native modernization all but guaranteed something on CrowdStrike’s level (or worse) would eventually happen. Now that it has, the chaos it created provides a reminder that – despite the speed and convenience cloud-based systems can offer – there’s no substitute for a thorough and deliberate approach to updating software.

How did the CrowdStrike outage occur?

On July 19, 2024, CrowdStrike released an automated sensor configuration update to its Falcon sensor software, a “normal part of the sensor’s operation” that occurs “several times a day in response to novel tactics […] discovered by CrowdStrike,” according to the company’s blog.

Unfortunately, this particular update shipped with a defect that caused Falcon, which runs inside the Windows kernel, to try to read an invalid memory address, according to an article on DEV. Windows treats a fault at that level as unrecoverable and halts with the Blue Screen of Death to prevent further damage. Because the faulty update was loaded again at startup, the issue persisted through rebooting for most customers across the 8.5 million affected devices.

As systems around the world fell into a blue-screen boot loop, software that interfaced with them also began to crash, setting off a ripple of outages that many companies are still struggling to fully resolve.

CrowdStrike outage exposes IT loss of control

It would be an understatement to say a single faulty bit of code and an overlooked checking procedure should not sideline entire sectors of the global economy for a full day. But as the CrowdStrike outage shows, historic cyber events don’t always start with a bad actor or even a particularly notable mistake.

Instead, the events of July 19 point to a larger trend: the loss of control companies have experienced as they replace their on-premises software with cloud-based alternatives.

As intelligence provider GlobalData says in a Yahoo Finance article, the practice of customers vetting and testing updates before pushing them to users has faded with the move to the cloud. Instead, businesses increasingly trust software vendors to carry out automatic updates.

While the change has brought undeniable time savings, it also means customers are signing away a major responsibility, one that IT departments managed themselves until the last several years.
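To make the older practice concrete, here is a minimal Python sketch of a customer-side approval gate: a vendor update is released to users only after it has passed internal testing and sat in a test environment for a soak period. The `VendorUpdate` type, the `approved_for_users` function, and the 48-hour default are hypothetical and chosen purely for illustration; they do not correspond to any vendor's actual API or policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class VendorUpdate:
    """Hypothetical record describing one update received from a vendor."""
    name: str
    received_at: datetime
    passed_internal_tests: bool


def approved_for_users(update: VendorUpdate,
                       now: datetime,
                       soak_period: timedelta = timedelta(hours=48)) -> bool:
    """Return True only if the update passed testing AND has soaked long enough."""
    soaked_long_enough = now - update.received_at >= soak_period
    return update.passed_internal_tests and soaked_long_enough
```

The trade-off is the one the article describes: a longer soak period buys more time to catch a bad update, but also delays fixes that users may genuinely need.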

Security providers like CrowdStrike are far from the only cloud software vendors that could create problems with an errant update. Any software that can push updates without customer intervention has at least some ability to create inadvertent trouble. Throw in the potential for cyberattackers to capitalize on disrupted updates (a key facet of certain supply chain attacks), and it’s easy to see why the enterprise would be motivated to solve this lingering downside of automatic updates.

Regain control of software update testing processes

As GlobalData says in its analysis, the only realistic path forward from an event like this starts with dialogue. High-level IT leaders need to discuss how their software update/patching processes have changed since adopting cloud technology and work to find practices that put oversight and implementation back in their control.

At minimum, companies concerned about another major event should consider:

  • Whether they need every update that vendors push. Do you automatically accept every update from certain vendors? Is there a need for this beyond convenience? If you wanted more control over the implementation process, how would your company go about exerting it?
  • Potential reinforcements to the testing process. Organizations without formalized testing procedures should start defining a policy before another issue stemming from an automatic update touches down. Those with preexisting policies can use the CrowdStrike event as a reminder to revisit, revise, and improve their current rules.
  • Alternative maintenance methodologies for more established technology. Cloud-based updates aren’t the only ones with potential to cause serious trouble in an enterprise IT estate. The policy surrounding security/performance upgrades and updates for on-premises IBM and HCL products, among others, should also be carefully considered. In many cases, the older products in your estate can be secured and optimized by an independent software maintenance provider without the potential security and performance risks of updating. 2021’s Log4Shell incident, for instance, highlights how even serious problems can be addressed with preemptive hardening measures.
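One way to reinforce the testing process is a staged (ring-based) rollout, in which an update reaches a small test group first and spreads to wider groups only if the earlier ones stay healthy. The Python sketch below illustrates that idea under stated assumptions: the `staged_rollout` function, the `apply_update` callback, the device names, and the 1% failure threshold are all hypothetical, and a real deployment would rely on whatever controls your endpoint management tooling actually provides.

```python
from typing import Callable, Sequence


def staged_rollout(rings: Sequence[Sequence[str]],
                   apply_update: Callable[[str], bool],
                   max_failure_rate: float = 0.01) -> list[str]:
    """Apply an update ring by ring, halting once a ring fails too often.

    `apply_update` is a hypothetical callback that installs the update on one
    device and returns True if the device is healthy afterwards.
    Returns the devices that were updated successfully.
    """
    updated: list[str] = []
    for ring in rings:
        if not ring:
            continue  # nothing to do for an empty ring
        failures = 0
        for device in ring:
            if apply_update(device):
                updated.append(device)
            else:
                failures += 1
        if failures / len(ring) > max_failure_rate:
            break  # halt: later rings never receive the update
    return updated


# Hypothetical rings: a test lab first, then IT staff, then everyone else.
rings = [
    ["test-vm-1", "test-vm-2"],
    ["it-laptop-1", "it-laptop-2"],
    ["finance-pc-1", "finance-pc-2"],
]
```

With a faulty update, the rollout stops after the first ring, so the damage is confined to the test lab rather than reaching the whole estate.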

The cloud supports and enhances every part of modern businesses, and its strengths and convenience are too compelling to consider going back. Even so, CrowdStrike’s misstep and the cloud-dependent IT landscape that allowed it to turn into a full-blown crisis illustrate how that speed and convenience can turn into a downside without thorough testing and oversight.

Instead of assuming every update that comes your way is a good thing, give yourself time to ask plenty of questions and test plenty of outcomes. You’ll be grateful when you don’t have to individually recover thousands of PCs on a Friday afternoon.
