
Global Microsoft outage hits due to CrowdStrike Update Definitions

;-)

 
Post on difficulty in getting the systems back up

https://news.ycombinator.com/item?id=41003390

It is likely the .sys driver has had a bug for a very long time, so what they pushed out yesterday was not code that had to be tested and validated, but a small definition update that generally isn't, and that definition update triggered the existing bug.
At least this sounds more reasonable than someone pushing untested code out to millions of clients.
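To make that concrete, here is a purely illustrative toy sketch (not CrowdStrike's actual code; the file format and function names are invented for illustration) of how already-shipped parsing code can sit dormant until a data-only definition update with an unexpected shape finally exercises the bug:

```python
# Purely illustrative toy example -- NOT CrowdStrike's code. It only shows how
# parsing code that shipped long ago can sit dormant until a data-only
# "definition" update with an unexpected shape finally exercises the bug.

def load_definitions(blob: bytes) -> list[bytes]:
    """Parse a toy definition file: first byte is a declared record count,
    followed by 8-byte records."""
    declared_count = blob[0]
    records = []
    for i in range(declared_count):
        start = 1 + i * 8
        record = blob[start:start + 8]
        if len(record) < 8:
            # Latent bug path: the declared count is trusted, so a file that
            # claims more records than it actually contains walks off the end.
            # In user space that's an exception; in a kernel driver the
            # equivalent out-of-bounds access crashes the whole machine.
            raise ValueError("definition file shorter than its declared record count")
        records.append(record)
    return records

# Every definition file shipped so far happened to be well-formed, so the bug never fired:
good = bytes([2]) + b"A" * 16
print(len(load_definitions(good)))   # 2

# One malformed update then trips the years-old bug on every machine that receives it:
bad = bytes([5]) + b"A" * 16
load_definitions(bad)                # raises here; a kernel driver would bugcheck instead
```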
 
I fixed all the computers at my wife's dental office today. Heard nothing but trash about their actual IT team... and then, while walking out the door, fixed a physically broken computer that the team hadn't figured out yet. It had a bad stick of RAM.
 
Yeah. I've been disaster-proofing many of my clients' offices lately against insane incidents like this. It saves them money and means less work for me, since downtime after an incident is minimal.

Edit: Appropriate Babylon Bee article on the incident:

Entire Microsoft Network Goes Down After Greg Removes USB Device Without Clicking 'Eject' First
https://babylonbee.com/news/entire-...moves-usb-device-without-clicking-eject-first

:p
 
Sysadmins having a tough time:

https://www.reddit.com/r/sysadmin/comments/1e7j5i8/crowdstrike_what_the_f/

I am beyond pissed off right now, in fact, I'm furious.

WHY DID CROWDSTRIKE NOT TEST THIS UPDATE?

I'm going on hour 13 of trying to rip this sys file off a few thousand servers. Since Windows will not boot, we are having to mount a Windows ISO, boot from that, and remediate through the cmd prompt.

So far, several thousand Windows servers are down. Many have lost their assigned drive letter, so I am having to reassign those manually. On some, the system drive is locked and I cannot even see the volume (rarer). Running chkdsk, sfc, etc. does not work; it shows the drive is locked. In these cases we are having to do restores. Even migrating VMDKs to a new VM does not fix the issue.

This is an enormous problem that would have EASILY been found through testing. When I say easily, I mean easily. Over 80% of our Windows servers have BSOD'd due to the CrowdStrike sys file. How does something with this massive an impact not get caught during testing? And this is only our servers; the scope on our endpoints is massive as well, but luckily that's a desktop problem.

Lastly, if this issue did not cause Windows to BSOD and it would actually boot into Windows, I could automate this. I could easily script and deploy the fix. Most of our environment is VMs (~4k), so I can console in to fix them... but we do have physical servers all over the state. We are unable to iLO into some of the HPE ProLiants to resolve the issue through a console. Those will require an on-site visit.

Our team will spend tens of thousands of dollars in overtime, not to mention lost productivity. My org alone will easily lose $200k. And for what? Some ransomware or other incident? NO. Because CrowdStrike cannot even use their test environment properly and rolls out updates that literally break Windows. Unbelievable.

I'm sure I will calm down in a week or so once we are done fixing everything, but man, I will never trust CrowdStrike again. We literally just migrated to it in the last few months. I'm back at it at 7am and will work all weekend. Hopefully tomorrow I can strategize an easier way to do this, but so far, manual intervention on each server is needed. Varying symptoms/problems also make it complicated.

For the rest of you dealing with this- Good luck!

*end rant.
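For anyone wondering what the manual fix above boiled down to: the widely reported workaround was to get each machine into a recovery/WinPE environment and delete the problematic CrowdStrike channel file(s), the ones matching C-00000291*.sys under Windows\System32\drivers\CrowdStrike. Here is a rough sketch of the scripted cleanup he wishes he could have deployed, assuming the affected Windows volume is reachable from wherever the script runs (the drive letter and the use of Python are assumptions for illustration, not a vendor tool):

```python
# Rough sketch of the widely reported workaround: delete the problematic
# CrowdStrike channel file(s) so Windows can boot again. This is an
# illustration, not a vendor tool; the drive letter is an assumption and
# should point at wherever the affected Windows volume is mounted
# (e.g. from a WinPE / recovery environment).
import glob
import os

SYSTEM_DRIVE = "C:"   # adjust to the affected volume's drive letter
PATTERN = os.path.join(
    SYSTEM_DRIVE + os.sep, r"Windows\System32\drivers\CrowdStrike", "C-00000291*.sys"
)

def remove_bad_channel_files() -> int:
    removed = 0
    for path in glob.glob(PATTERN):
        print(f"Deleting {path}")
        os.remove(path)
        removed += 1
    return removed

if __name__ == "__main__":
    count = remove_bad_channel_files()
    print(f"Removed {count} file(s); reboot normally afterwards.")
```

Of course, the whole problem was that the affected machines would not boot far enough to run anything like this, which is why every box needed hands-on or console-level recovery instead of a pushed script.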
 
CrowdStrike, actually.

Microsoft Releases Recovery Tool for Windows Machines Hit By Crowdstrike Issue (theverge.com)

Posted by EditorDavid on Sunday July 21, 2024 @04:46PM from the Start-buttons dept.
The Verge reports that for machines that aren't automatically receiving CrowdStrike's newly released software fix, Microsoft has released a recovery tool that creates a bootable USB drive. Some IT admins have reported that rebooting PCs multiple times will pull down the necessary update, but for others the only route has been manually booting into Safe Mode and deleting the problematic CrowdStrike update file.

Microsoft's recovery tool makes this process less manual by booting into its Windows PE environment via USB, accessing the disk of the affected machine, and automatically deleting the problematic CrowdStrike file so the machine can boot properly. This avoids having to boot into Safe Mode or needing admin rights on the machine, because the tool simply accesses the disk without booting into the local copy of Windows. If a disk is protected by BitLocker encryption, the tool will prompt for the BitLocker recovery key and then continue to fix the CrowdStrike update.
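For reference, the sequence that tool automates is roughly what admins were already doing by hand from a WinPE command prompt: unlock the volume if BitLocker is in the way, delete the bad CrowdStrike channel file, reboot. A hedged sketch of those steps follows; the drive letter and recovery key are placeholders, and driving it from Python rather than typing the commands is purely for illustration (manage-bde is the standard Windows BitLocker command-line tool):

```python
# Rough sketch of the sequence the recovery tool automates, expressed as the
# manual steps an admin could run from a WinPE command prompt. Drive letter
# and recovery key are placeholders; this is not Microsoft's tool.
import glob
import os
import subprocess

AFFECTED_VOLUME = "D:"  # drive letter the affected Windows volume gets inside WinPE
RECOVERY_KEY = "111111-222222-..."  # 48-digit BitLocker recovery key, if the disk is encrypted

# 1. Unlock the BitLocker-protected volume with its recovery key (skip if unencrypted).
subprocess.run(
    ["manage-bde", "-unlock", AFFECTED_VOLUME, "-RecoveryPassword", RECOVERY_KEY],
    check=True,
)

# 2. Delete the problematic CrowdStrike channel file(s) from the offline volume.
pattern = os.path.join(
    AFFECTED_VOLUME + "\\", r"Windows\System32\drivers\CrowdStrike", "C-00000291*.sys"
)
for path in glob.glob(pattern):
    os.remove(path)

# 3. Remove the USB stick and reboot into the local copy of Windows as usual.
```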
 

Costs from the global outage could top $1 billion – but who pays the bill is … probably taxpayers


https://www.cnn.com/2024/07/21/business/crowdstrike-outage-cost/index.html

""If you're a lawyer for CrowdStrike, you're probably not going to enjoy the rest of your summer," said Dan Ives, a tech analyst for Wedbush Securities....

But there could be legal protections for CrowdStrike in its customer contracts to shield it from liability, according to one expert. "I would guess that the contracts protect them," said James Lewis, researcher at the Center for Strategic and International Studies...

It's also not clear how many customers CrowdStrike might lose because of Friday. Wedbush Securities' Ives estimates less than 5% of its customers might go elsewhere. "They're such an entrenched player, to move away from CrowdStrike would be a gamble," he said. It will be difficult, and not without additional costs, for many customers to switch from CrowdStrike to a competitor. But the real hit to CrowdStrike could be reputational damage that will make it difficult to win new customers... [E]ven if customers are understanding, it's likely that CrowdStrike's rivals will be seeking to use Friday's events to try to lure them away."
 
Why taxpayers?
The buck always seems to be passed on to the taxpayers,

some way, somehow,

whether it's increased up-front MSRPs, actual taxation and hidden fees, or "clever" subscription models and micro-transactional subscription pricing tiers.

Some government agencies were directly impacted, local DMVs I heard, etc.
 
"faulty crowdstrike update" should come first, this still leaves it looking like its MSs fault. so "faulty crowdstrike update causes global microsoft outage" would be more accurate.
 
I think the simplest solution would be to stop using ClowdStrike and find an alternative.
Alternatives all work the same way and all have the same access; that access is required for them to be effective against user-level malware.
 
On Windows machines, CrowdStrike's Falcon security software is a kernel module, which gives the software full access to a PC. The kernel manages memory, processes, files, and devices, and it's basically the heart of the operating system. Much of the software on a PC is typically limited to user mode, where bad code can't cause harm, but software with kernel mode access can cause catastrophic total machine failures, like what was encountered last week.

The Falcon software was not able to wreak similar havoc on Macs because Apple does not give software makers kernel access. In macOS Catalina, which came out in 2019, Apple deprecated kernel extensions and transitioned to system extensions that run in a user space instead of at a kernel level. The change made Macs more stable and more secure, adding protection against unstable software updates like the one CrowdStrike pushed out. It is not possible for Macs to have a similar failure because of the change that Apple made.
https://www.macrumors.com/2024/07/22/microsoft-blames-european-commission-for-outage/
 
Which is a completely meaningless thought process.
Sure, no one thought CrowdStrike would do this either, and yet they did, twice now actually: with Debian/Rocky Linux a few months back and now Windows.

How often has MS released patches that hosed their server OS or core functionality? Sure, it did not take down half the world, but the point is that big companies take shortcuts all the time.
 