Matrix.org homeserver grinds to a halt after RAID meltdown

Engineers wrangle 55 TB restore and traffic replay as millions of messages queue up


“Matrix has become increasingly important in recent years as public and private sector organizations seek to reduce their dependency on centralized messaging services that might not meet sovereignty or privacy requirements. The Matrix.org outage, while embarrassing, serves to highlight that a decentralized approach can protect users from whoopsies on the part of those who run the service.”

Source: https://www.theregister.com/2025/09/03/matrixorg_raid_failure/
 
Reading over that, it sounds like they have people managing hardware and databases who have no idea how to actually do so in a properly redundant manner...

The trouble started with a routine storage upgrade exercise that went badly wrong. "A whole series of things happened at exactly the wrong time in unison, which then led to the situation that we see," he said.
So they likely didn't test properly in a matching environment, used an improper RAID level for a database (going to guess RAID 5, lol, or some crap), and weren't monitoring anything...

Twits...
 
