For a dual-CPU motherboard, if CPU 0 fails, does CPU 1 take over?

Happy Hopping

I know that with a dual-CPU motherboard, such as the Asus Z9PE D8, that comes with 2 CPUs, say:

CPU 0

CPU 1

if CPU 1 fails, CPU 0 will continue to function, since CPU 1 is the slave component of the SMP setup and CPU 0 is the master CPU.

So in the old days, if CPU 0 failed, you were finished even though CPU 1 was fully functional, because CPU 0 dictates CPU 1.

Now, have things changed recently? I.e., if CPU 0 fails, does CPU 1 continue to function, so that you just get slower performance but still have a functional PC?
 
If either CPU fails, I know for a fact that both Windoze and Linux will come crashing down..... hard. Catastrophic hardware failure is non-recoverable.
 
I've seen some dual socket 2011 boards boot up with two CPUs installed while one is bad; the board just acts like the bad one isn't there. However, if this happens while the system is running, then whatever you're doing is toast.
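If you want to check whether a board has quietly dropped a socket after boot, one rough way on Linux is to count the distinct physical package IDs the kernel exposes in sysfs. This is only a sketch: the function name `visible_sockets` and the `sysfs_root` parameter are made up for illustration, and it assumes a Linux system that populates `topology/physical_package_id` for each online CPU.

```python
import glob
import os


def visible_sockets(sysfs_root="/sys"):
    """Return the set of physical package (socket) IDs the kernel reports.

    sysfs_root is a hypothetical parameter so the function can be pointed
    at a fake directory tree for testing.
    """
    pattern = os.path.join(
        sysfs_root,
        "devices/system/cpu/cpu[0-9]*/topology/physical_package_id",
    )
    sockets = set()
    for path in glob.glob(pattern):
        with open(path) as f:
            sockets.add(int(f.read().strip()))
    return sockets


if __name__ == "__main__":
    found = visible_sockets()
    print(f"sockets visible to the OS: {sorted(found)}")
```

On a healthy dual-socket box you would expect two IDs; seeing only one on a board that has two CPUs installed would be consistent with the "acts like it isn't there" behaviour described above.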
 
I had the same experience with older 2-CPU systems.
 
That's good to know. I thought only a CPU 1 failure could be safely recovered from.

So you're saying there is no Master / Slave setup any more? That either CPU can function as the master or the slave?
 
On socket 2011 systems each CPU has its own PCIe endpoints, so not only will the system crash horribly, but you can no longer use any PCIe devices connected to the failing CPU on reboot.
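To see which PCIe devices would go away with a given CPU, a rough sketch on Linux is to group devices by the `numa_node` file under `/sys/bus/pci/devices`. The function name `pci_devices_by_node` and the `sysfs_root` parameter are hypothetical, and this assumes each NUMA node corresponds to one socket, which is typical for dual-socket boards but not guaranteed. A value of -1 means the kernel does not know the node.

```python
import os
from collections import defaultdict


def pci_devices_by_node(sysfs_root="/sys"):
    """Group PCI device addresses by the NUMA node the kernel reports.

    sysfs_root is a hypothetical parameter so a fake tree can be used
    for testing. Devices without a readable numa_node file are skipped.
    """
    base = os.path.join(sysfs_root, "bus/pci/devices")
    if not os.path.isdir(base):
        return {}
    by_node = defaultdict(list)
    for addr in sorted(os.listdir(base)):
        try:
            with open(os.path.join(base, addr, "numa_node")) as f:
                node = int(f.read().strip())
        except (OSError, ValueError):
            continue
        by_node[node].append(addr)
    return dict(by_node)


if __name__ == "__main__":
    for node, devices in sorted(pci_devices_by_node().items()):
        print(f"node {node}: {len(devices)} device(s)")
```

Anything listed under the node of the failing CPU is what you should expect to lose until that CPU is replaced.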
 
Socket 2011 and the current Xeons do not have these kinds of RAS (reliability, availability, serviceability) features. If a processor fails on a dual socket 2011 system, the entire system is going to be down until you fix it.

On the modern x86 side, the Xeon E7 platform does have some more advanced RAS features. If implemented, it can support CPU hot-add, CPU sparing, and handling of CPU failures. The OS needs to support this as well, though.
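As a rough illustration of the OS side on Linux, the kernel publishes which logical CPUs are possible, present, and online as range lists (for example `0-7,16-23`) under `/sys/devices/system/cpu/`. The sketch below parses that format and compares the sets; the helper names `parse_cpu_list` and `read_cpu_set` are made up for this example, and it says nothing about whether a given board or firmware actually supports hot-add or sparing.

```python
import os


def parse_cpu_list(text):
    """Parse a kernel CPU range list such as '0-3,8-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_cpu_set(name, sysfs_root="/sys"):
    """Read one of the 'possible', 'present', or 'online' CPU lists."""
    path = os.path.join(sysfs_root, "devices/system/cpu", name)
    with open(path) as f:
        return parse_cpu_list(f.read())


if __name__ == "__main__":
    try:
        present = read_cpu_set("present")
        online = read_cpu_set("online")
        print(f"present but offline CPUs: {sorted(present - online)}")
    except OSError:
        print("CPU lists not available (not a Linux sysfs system?)")
```

CPUs that are present but not online are ones the kernel knows about but is not scheduling on, which is the state a hot-added or offlined processor would show up in.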

Normally you see features like this on different architectures that run operating systems designed for them. For example, there are lots of Sun SPARC boxes running Solaris that will let you replace CPUs and tolerate CPU failures without any downtime. Other larger RISC or PowerPC platforms and UNIXes offer these kinds of features as well, but normally you don't see this on low-end x86 like socket 2011.
 
Also I would guess that CPUs are the most reliable component on modern systems if properly cooled. It is more likely that some power supply components on the mainboard or memory modules fail than CPUs. Even my Q6600 ran for years with a massive overclock and hefty overvoltage and is still running today.
 
That's true. I've been servicing PC hardware for 20+ years and have never seen a failed CPU.
 