CrowdStrike in a nutshell: how a faulty software update took down millions of Windows PCs
A software update by cybersecurity company CrowdStrike was responsible for taking down millions of Windows PCs, some of them in critical industries.
Last Friday, reports started to come in from companies and organizations in different parts of the world that they were experiencing computer issues.
This incident affected airports, TV stations, air traffic control systems, banks, ticket purchase systems, retailers, and systems of other companies and organizations. Flights could not take off, flight tickets could not get printed, TV broadcasters went offline, hospitals and banks were affected, and numerous other industries experienced service interruptions.
The initial fear of a worldwide cyberattack turned out to be unfounded. Instead, security analysts and administrators from all over the world suggested that the issue was caused by a faulty update of security software developed and maintained by CrowdStrike.
What is CrowdStrike?
CrowdStrike is a Texas-based cybersecurity company that develops security products. It is a market leader for endpoint security products and many Fortune 500 companies and other organizations use CrowdStrike products for security.
The company's Falcon security product is an Endpoint Detection and Response (EDR) security software for devices. Updates are delivered via so-called channel files, which are pushed to connected devices automatically.
What happened on Friday and on the weekend?
Cybersecurity company CrowdStrike released a security update on Friday that auto-installed on millions of Windows PCs. This update was faulty and it caused bluescreen errors on PCs it was installed on.
While Windows PCs were affected, the issue itself was not caused by Microsoft or Windows.
Administrators could not restore access to the devices easily, which meant that critical systems remained offline. Up to the day of writing, some systems remain offline.
Workarounds were published quickly, for instance on Reddit and other forums. Microsoft published guidance on Saturday, and CrowdStrike did so on Friday already. There is also a long technical post that provides answers to common issues.
Microsoft said on Saturday that 8.5 million Windows PCs were taken offline because of the security update. It also said that this affected less than 1 percent of the entire Windows population.
However, CrowdStrike solutions are not available for home users and small businesses. This makes it a much larger incident percentage-wise, considering that only Enterprise customers could potentially use the company's security solutions.
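The percentage argument can be made concrete with a little arithmetic. The following is a minimal sketch in Python: the 8.5 million and "less than 1 percent" figures come from Microsoft's statement, while the enterprise fraction is a purely hypothetical placeholder, as no such number has been published here.

```python
# Figures reported in the article.
AFFECTED_PCS = 8_500_000   # Windows PCs taken offline, per Microsoft
MAX_OVERALL_PERCENT = 1    # "less than 1 percent" of all Windows devices

# If 8.5 million is less than 1 percent, the total must exceed this number.
min_total_pcs = AFFECTED_PCS * 100 // MAX_OVERALL_PERCENT


def share_within_segment(overall_percent: float, segment_fraction: float) -> float:
    """Percentage of a segment that is affected, assuming every affected
    device belongs to that segment."""
    if not 0 < segment_fraction <= 1:
        raise ValueError("segment_fraction must be in (0, 1]")
    return overall_percent / segment_fraction


# HYPOTHETICAL: suppose enterprise devices were a quarter of all Windows devices.
ASSUMED_ENTERPRISE_FRACTION = 0.25
print(min_total_pcs)
print(share_within_segment(1.0, ASSUMED_ENTERPRISE_FRACTION))
```

Whatever the real enterprise fraction is, the share within that segment is the overall share divided by it, so a smaller segment means a proportionally larger hit.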
Microsoft published a recovery tool on Saturday that admins could run to recover the system either from WinPE or safe mode.
On BitLocker enabled machines, it is also necessary to enter the BitLocker recovery key according to the posted instructions. This Microsoft support page may be helpful to find out where to look it up.
How could this happen?
CrowdStrike has not published a full account of the incident. The big question that is on everyone's mind, and especially on the minds of system administrators who spent many hours on Friday and possibly the weekend resolving the issue, is "how could this happen".
How could CrowdStrike release an update that was obviously faulty? How did CrowdStrike test the update before its release? How could it land automatically on more than 8 million PCs before its distribution was stopped?
These questions have not been answered by CrowdStrike up to this point.
What about you? Were you affected by CrowdStrike, e.g., as an administrator who had to repair affected Windows PCs?
Share this with your M$ enslaved minions:
“Though there’s no question that CrowdStrike’s update caused the outage, questions are being raised about whether some of the blame should be directed toward Microsoft.”
“This incident is Microsoft’s fault, not CrowdStrike’s fault,” J.J. Guy, chief executive officer of exposure management company Sevco Security Inc., told SiliconANGLE. “Yes, CrowdStrike pushed a kernel-level update that causes widespread blue screens. Yes, that should have been caught during QA and I’m sure we will get an after-action report that details why release procedures didn’t catch it. But software bugs happen. They are unavoidable — even for top-tier shops like CrowdStrike.
“This is a high-impact incident not because there was a blue screen, but because it causes repeated blue screens on reboot and [appears as of now] to require manual, command-line intervention on each box to remediate, and it’s even harder if BitLocker is enabled,” Guy added. “That is the result of poor resiliency in the Microsoft Windows operating system. Any software causing repeated failures on boot should not be automatically reloaded. We’ve got to stop crucifying CrowdStrike for one bug, when it is the OS’ behavior that is causing the repeated, systemic failures.”
– https://siliconangle.com/2024/07/21/microsoft-reveals-8-5m-windows-computers-affected-crowdstrike-outage/
You say that Linux is also produced by Microsoft, right? Because CrowdStrike also broke Debian and Rocky Linux earlier this year.
https://www.techspot.com/news/103899-crowdstrike-also-broke-debian-rocky-linux-earlier-year.html
George Kurtz – CEO of the cybersecurity company CrowdStrike, which he co-founded with Dmitri Alperovitch.
He graduated from Seton Hall University with a degree in accounting
1. CrowdStrike Update Causes Billions of Dollars in Losses by Breaking One Billion (Or more) Windows Machines (New)
2. CrowdStrike also broke Debian and Rocky Linux earlier this year (Not old)
https://www.techspot.com/news/103899-crowdstrike-also-broke-debian-rocky-linux-earlier-year.html
3. In October 2009, McAfee promoted George Kurtz to chief technology officer and executive vice president. Six months later, McAfee accidentally disrupted its customers’ operations around the world when it pushed out a software update that deleted critical Windows XP system files and caused affected systems to bluescreen and enter a boot loop. “I’m not sure any virus writer has ever developed a piece of malware that shut down as many machines as quickly as McAfee did today.
https://en.wikipedia.org/wiki/George_Kurtz
Note : This guy’s whole life has been full of failures and his only characteristic is that he is an American, someone who made one of his mistakes would never get a job again.
Insane
After the tiny update error at crowdstrike, the company is looking for community support to help them change their name. The candidate new company names being floated around are:
Idiotsstrike – we can’t protect your computer but we can surely give you a BSOD.
Trollstrike – our CEO might look like an idiot but he can actually tie his shoes.
FartStrike – if you love BSODs, buy our not-so-secure products.
DodoStrike – yeah, we know, our company stinks
I bet that some hackers will learn something bad about this big updating issue. Don’t you?
Thanks for the article! :]
It’s pretty easy how this could have happened:
a) we have a faulty update for a program, that runs in ring 0 / kernel mode
b) obviously most Fortune 500 companies don’t have the slightest clue anymore about how to professionally operate IT critical infrastructure.
c) same idiot companies update their entire infrastructure and said software with untested code.
So, such an event is to be expected (rather sooner than later).
The problem here is that there are some basic rules that pretty much everyone in IT knows (well, or at least knew some time ago):
I) nothing enters production unless tested on a QA/Test system (this can easily be automated, the degree of ‘testing’ to be performed is of course debatable and depends)
II) nothing is rolled out to all systems at once, but roll-out only takes place in a staged way.
I mean, that’s nothing new. These rules have been literally around for decades.
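The staged roll-out rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name staged_rollout, the deploy and healthy callbacks, and the stage sizes (1 percent, 10 percent, everyone) are all hypothetical and do not describe how CrowdStrike or any specific company deploys updates.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    hosts: Sequence[str],
    cumulative_fractions: Sequence[float],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
) -> List[str]:
    """Deploy to growing groups of hosts and stop after the first stage
    in which any updated host is unhealthy. Returns the hosts updated."""
    updated: List[str] = []
    start = 0
    for fraction in cumulative_fractions:
        # Each stage covers hosts up to the given cumulative fraction,
        # and always at least one new host while any remain.
        end = min(len(hosts), max(start + 1, round(len(hosts) * fraction)))
        batch = hosts[start:end]
        for host in batch:
            deploy(host)
            updated.append(host)
        if not all(healthy(host) for host in batch):
            break  # halt the roll-out, later stages are never touched
        start = end
    return updated


# Hypothetical fleet of 100 machines and an update that breaks every host.
fleet = [f"pc{i}" for i in range(100)]
hit = staged_rollout(fleet, (0.01, 0.10, 1.0), lambda h: None, lambda h: False)
print(len(hit))
```

With these made-up stage sizes, an update that crashes every machine it touches stops after the first stage instead of reaching the whole fleet.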
If SMEs don’t adhere to this, that’s kind of understandable, as nobody expects them to have any IT know-how anyway, but if even a significant part of Fortune 500 companies no longer have the slightest idea of how to securely operate an IT infrastructure, then the apocalypse, as depicted in the movie Idiocracy (2006), is a lot closer than I ever imagined.
The issue is not so much that Crowdstrike severely f*cked up. Something like this is expected to happen sooner or later. Error-free software doesn’t exist, so it’s not a matter of if, but rather when something like that happens. If it hadn’t been Crowdstrike, it could have been an MS update, or a Cisco router patch. Yes, this was the cause, but a faulty update doesn’t kill systems. To do so, the update also needs to be installed, and that’s the real issue.
So some companies failed to properly operate their infrastructure. Something like that is also to be expected .. there is always some idiot that … (you probably also know some humans of this type) …, but the sheer amount of MAJOR companies that failed, THAT’S the real issue. And many of these operate formally designated ‘critical’ infrastructure (incl. banks, transport, hospitals)
If the majority of all major companies nowadays is no longer capable of properly operating their own IT, then mankind is more or less doomed.
These so-called “Top” companies don’t even have a single Business Contingency Plan for their own businesses on what they should do when computers crash. “Top” 500? FXXK You.
Agreed. This is the real issue.
This company is really terrible, let me explain briefly why. We are a software company, and no antivirus on VirusTotal flags our software except CrowdStrike. So far everything seems normal, but when we contact this company and ask them to please remove these false warnings, they never remove them and say that grayware does not mean harmful. In short, they cheat. We even considered suing this company.
AI is still in its infancy and makes a lot of mistakes, CrowdStrike has the highest number of false positives in the market right now and the recent events are probably a result of that.
Who cares about VirusTotal?? If someone doesn’t trust your software, they shouldn’t use VirusTotal to help them make up their mind. If you have end users crying about a VirusTotal rating, you should probably not be placating that, unless it’s a very large number of users…?
Never knew about clownstrike until now….
Why don’t these critical businesses/organizations/corporations (airports, hospitals, banks, etc.) use more reputable security products like Bitdefender, Eset, etc. that are used by many consumers worldwide and should be more reliable than this unknown clownstrike AV?
My bet is that clownstrike offers cheap services, which in turn means cheap reliability as well, hence the global BSODs.
Some AV benchmark comparisons have even shown that Windows Defender has a much better detection rate than clownstrike, so if they just want to be cheapskates, why not use Windows Defender instead, which is free?
Well Crowdstrike is f*cking expensive. It’s kind of the Rolls Royce among the EDR stuff (quite similar to Splunk).
The issue is: yes, Crowdstrike f*cked up and caused this, but error-free code doesn’t exist, so the next time it’s some other update from someone else, maybe from Microsoft or whomever. So something like that HAS TO BE EXPECTED.
The real issue is that many companies simply don’t seem to know anymore how to properly operate critical IT infrastructures and simply make untested modifications to their production systems. That’s a total no-no in professional IT.
And well … Defender is nice (and it has actually a major advantage, unlike other solutions, it’s been integrated with the OS, so it doesn’t drill terrible holes into the architecture of the OS, like all other AVs/EDRs, which in the end is exactly what caused this), but it’s for end-users. You can’t run an organization with a few 10000 or 100000 clients with something like Defender. This just doesn’t fly.
“How could CrowdStrike release an update that was obviously faulty?”
How could Microsoft let one small file crash millions of Windows PCs?
Uninformed opinion. “How did Toyota let this car crash happen! How could they sell a car to someone that might drive drunk!”
Monopolies produce a dangerous infrastructure. Why does everyone put all of their eggs in one basket? CrowdStrike should shut down their bad business. It’s nothing more than a bloated AV with buzzwords attached to it. All of the corporations who were impacted should reevaluate their cybersecurity strategies and seek alternatives.
A monopoly isn’t the issue here. It is pushing code to production and missing something as trivial as a null pointer exception. That is exactly what they teach BSc CompSci students at uni not to do.
The company needs to disclose what went wrong, triage it internally and propose publicly how to prevent it in the future. It’s called transparency and accountability. It should be forced by law.
Monopoly is the root of the problem. Similar to Microsoft, they have no competitors, so they can slash expenses everywhere… Including testing. They have no restrictions and can do as they choose. The IT managers employed by these Fortune 500 companies are sheep, mindlessly following the latest buzzwords and fads from these monopolies. A complete worldwide outage is the result. It’s time they diversified their cybersecurity solutions.
I don’t understand why more companies don’t run Linux.
Oh noes… https://www.reddit.com/r/linux/comments/1e98yal/crowdstrike_falcon_struck_redhat_kernel_as_well/
As Jimmy Dore commented shortly thereafter, the first thing the CEO should have at least said was “we are sorry”.
“How could this happen?”
Diversity, equity, and inclusion.
THAT’S how this could happen.
Corporate greed (ah, let’s skip proper QA and testing, too expensive). This has nothing to do with the B/S you mentioned.
That’s the dumbest thing I’ve read today.
@El Duderino
Just you wait when the story unfolds…
Sloppy Agile programming with virtually no QA is why these things happen. This should clobber their bottom line. They have competition. So, hopefully they’ll reorganize with full QA testing in mind.
It was a test by World Economic Forum to take down the world infrastructure.
Great Reset anyone?
:-)
Per Martin in his article above:
“Microsoft said on Saturday that 8.5 million Windows PCs were taken offline because of the security update. It also said that this affected less than 1 percent of the entire Windows population.”
Great Reset anyone?
Um, no…
“1 percent of the entire Windows population” and yet it mostly targeted infrastructure, as “AgentSmith007” said (end users are not infrastructure). I know it’s fun to sound smart but this definitely had a major impact and halted a lot of things. The reason is obviously debatable, accident or not.