Computer Crashes May Be Due to Forces beyond Our Solar System

Megalith

The next time your computer or smartphone glitches out, you may have something to blame other than shoddy hardware, sloppy programming, or mischievous gremlins. It turns out that space magic (i.e., electrically charged particles generated by cosmic rays) can interfere with our earthly devices at any time by altering bits of data stored in memory. In one documented case from 2008, a “bit flip” screwed up an airliner’s flight control systems and forced the plane to dive hundreds of feet down in half a minute, smashing its passengers into the ceiling.

When cosmic rays traveling at fractions of the speed of light strike the Earth’s atmosphere they create cascades of secondary particles including energetic neutrons, muons, pions and alpha particles. Millions of these particles strike your body each second. Despite their numbers, this subatomic torrent is imperceptible and has no known harmful effects on living organisms. However, a fraction of these particles carry enough energy to interfere with the operation of microelectronic circuitry. When they interact with integrated circuits, they may alter individual bits of data stored in memory. This is called a single-event upset or SEU.
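To make the effect of a single-event upset concrete, here is a minimal sketch of what one flipped bit does to a stored value. The helper names and the sample values are hypothetical illustrations, not data from any real incident.

```python
import struct

def flip_bit(value: int, bit: int) -> int:
    """Return value with one bit inverted, as a single upset would leave it."""
    return value ^ (1 << bit)

def flip_float_bit(x: float, bit: int) -> float:
    """Flip one bit in the 64-bit IEEE 754 encoding of x."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", x))
    (out,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return out

reading = 37000                    # hypothetical stored integer
print(flip_bit(reading, 0))        # low bit flipped: off by one
print(flip_bit(reading, 15))       # high bit flipped: wildly different value
print(flip_float_bit(1.0, 62))     # top exponent bit flipped
```

The same physical event can be harmless or drastic depending on which bit it hits: a low-order bit barely changes the number, while a high-order or exponent bit changes it beyond recognition.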
 
Millions of these particles strike your body each second. Despite their numbers, this subatomic torrent is imperceptible and has no known harmful effects on living organisms.

Are you forgetting cancer?
 
This was a fairly common issue on the Cisco ACE load balancer platform. They would reboot due to bit flips in the non-ECC memory they used, which caused a kernel panic.
 
Nothing new here. Back when I was working on servers for HP (10+ years ago now) we rented some particle accelerator time to test how the various parts of the system reacted to cosmic ray hits. We made sure the DRAM words were spread out over the die physically after that, as a cosmic ray hit acted like a grenade and would flip several physically adjacent bits. Getting the info from some of the DRAM vendors was like pulling teeth though. Worth it when you can get over 1TB of DRAM in a box though. There was serious consideration of going to 3x voting redundancy for some parts of the ASICs we were making too.
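For anyone unfamiliar with 3x voting redundancy (often called triple modular redundancy), here is a rough sketch of the idea, assuming three copies of the same register and a bitwise majority voter. The function name and values are made up for illustration; this is not how the HP ASICs were actually built.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant copies.

    Each output bit takes the value held by at least two of the inputs,
    so an upset confined to one copy is outvoted.
    """
    return (a & b) | (a & c) | (b & c)

good = 0b1010
upset = good ^ 0b0100                      # one copy takes a hit in bit 2
print(bin(majority_vote(good, upset, good)))
```

The voter only fails when the same bit is wrong in two copies at once, which is part of why physically separating the redundant copies matters.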
 
Wouldn't be a problem if people just bought ECC RAM.

Keep in mind, though: ECC can't fix everything. It can correct single-bit errors, but double-bit errors can typically only be detected, not repaired, so they are still fatal.

It just increases the resilience towards this sort of thing.
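A toy sketch of why that is, assuming an extended Hamming (8,4) code protecting a 4-bit value. Real ECC DIMMs typically use a wider code of the same family (72 bits protecting 64), so treat the sizes and function names here as illustrative assumptions only.

```python
def encode(nibble: int) -> list:
    """Encode 4 data bits into 8 bits: Hamming(7,4) plus overall parity."""
    code = [0] * 8                       # index 0 = overall parity bit
    code[3] = nibble & 1                 # data bits sit at positions 3, 5, 6, 7
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2          # makes the total number of 1s even
    return code

def decode(code: list) -> tuple:
    """Return (nibble, status); nibble is None if the error is uncorrectable."""
    word = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos              # XOR of positions holding a 1
    parity_odd = sum(word) % 2 == 1
    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        word[syndrome] ^= 1              # single flip; syndrome 0 = parity bit
        status = "corrected"
    else:
        return None, "uncorrectable"     # two flips: detected, not fixable
    nibble = word[3] | (word[5] << 1) | (word[6] << 2) | (word[7] << 3)
    return nibble, status

stored = encode(0b1011)
stored[5] ^= 1                           # one upset
print(decode(stored))
stored[6] ^= 1                           # a second upset in the same word
print(decode(stored))
```

One flipped bit leaves a syndrome that points at the damaged position. Two flipped bits leave a nonzero syndrome with consistent overall parity, which tells the decoder something is wrong but not where.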
 
I might turn into John Madden after that experience, yeesh.
That was a Qantas A330. Here is what the ATSB had to say about that: "An SEU was the only potential cause for the malfunctions not ruled out. All potential causes were found to be 'unlikely' or 'very unlikely,' except for an SEU. However, the Australian Transport Safety Bureau (ATSB) found it had 'insufficient evidence to estimate the likelihood' that an SEU was the cause." So we don't know it was an SEU from cosmic rays. Most aviation systems use ECC RAM, but multibit errors are possible. You could also get a bit being transferred on a bus flipped, or even a CPU register corrupted. Overall the risk is very small, but it can't realistically be eliminated. Bird strikes and terrorism are more likely.
 
You could also get a bit being transferred on a bus flipped, or even a CPU register corrupted. Overall the risk is very small, but it can't realistically be eliminated. Bird strikes and terrorism are more likely.

We're going to be in the Hudson.
 
It "Could" be...

But let's be real... most of the time it is shoddy programming, hardware or both.
 
You could also get a bit being transferred on a bus flipped, or even a CPU register corrupted. Overall the risk is very small, but it can't realistically be eliminated. Bird strikes and terrorism are more likely.

Totally. Not going to not fly because of a story...but if I was in that plane, I would very likely be less sanguine about it.
 
Well duur.

Single-event upsets are a well-known, well-documented phenomenon.
 
Back in the olden days I was required to use http://www.dtic.mil/docs/citations/AD0811723 because it modeled radiation effects. The article suggests a single particle can bring down an airplane. I doubt it.

I have my bug-out bag ready, my cellar is stocked with MREs and I rotate my water supply monthly, my vault is ready.

Someday a Nuclear Missile is going to just launch itself, it's only a matter of time.
 
That is probably getting more likely, since they are so averse to spending money on the land-based (not sub-launched) Minuteman III ICBM to make it less crappy and old.
 
Wouldn't be a problem if people just bought ECC RAM.

Keep in mind, though: ECC can't fix everything. It can correct single-bit errors, but double-bit errors can typically only be detected, not repaired, so they are still fatal.

It just increases the resilience towards this sort of thing.

In order to use ECC RAM, you have to use a motherboard that supports it. Most consumer-level boards won't even make it through POST with ECC RAM installed.
 
In order to use ECC RAM, you have to use a motherboard that supports it. Most consumer-level boards won't even make it through POST with ECC RAM installed.
But this is a flight controller that I'm assuming is a custom-made board.
Do you think they install off-the-shelf computer components on a plane?
Also, Supermicro.
 
It's too bad we don't have numbers to see what the likelihood is. I imagine it would be something like:

bad drivers / flawed update: 95% of crashes
failing power supply: 4.5% of crashes
cosmic rays: 0.5% of crashes
 
In order to use ECC RAM, you have to use a motherboard that supports it. Most consumer-level boards won't even make it through POST with ECC RAM installed.

Exactly, I'd be more likely to use it if mainstream boards supported it. It's not just as simple as people buying ECC RAM instead.

But this is a flight controller that I'm assuming is a custom-made board.
Do you think they install off-the-shelf computer components on a plane?
Also, Supermicro.

But he was referring to an above poster who made a blanket statement that it "wouldn't be a problem if people just bought ECC RAM." He wasn't referring to a plane.
 
This whole thread is about a plane... wtf?

I understand that, BUT as is the case with a lot of these threads, people start having their own conversation within it. So to recap... Zarathustra makes a comment about people just using ECC RAM, and sir-gold replies that it isn't feasible. The generic term "people" used by Zarathustra leaves open the possibility that he isn't just referring to a plane (after all, I fall into the subset of people but I have never built a plane). So your statement that this whole thread is about a plane is sort of incorrect.
 
I understand that, BUT as is the case with a lot of these threads, people start having their own conversation within it. So to recap... Zarathustra makes a comment about people just using ECC RAM, and sir-gold replies that it isn't feasible. The generic term "people" used by Zarathustra leaves open the possibility that he isn't just referring to a plane (after all, I fall into the subset of people but I have never built a plane). So your statement that this whole thread is about a plane is sort of incorrect.
He just saw my post and copied it.
Cosmic radiation changing bits is a lot more common at high altitudes than it is at ground level. For that reason, ECC is typically overkill in desktops.
 
Pheww, luckily this thread didn't take the crazy path

 