Is my i7 970 dying or dead?

Ranma

Limp Gawd
Joined
Jan 31, 2008
Messages
173
Hi guys, long-time reader and lurker, first real post. I have an i7 970 in an Intel DX58SO board, and lately I have been getting CPU stop errors. The errors seem to appear after the unit has been up for almost 3 days. I have tried running memtest with different RAM, and the unit always reboots during the test, so I am thinking for sure the CPU is dying. Bad thing is the debug information says everything is inconclusive. I have not tried another motherboard because I do not have a spare 1366 board. :(

I have checked Intel's website and they only offer phone support for CPUs under warranty. And their phone number is long distance, not toll-free. Any help would be greatly appreciated. Thanks in advance.


Microsoft (R) Windows Debugger Version 6.11.0001.404 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\Windows\MEMORY.DMP]
Kernel Summary Dump File: Only kernel address space is available

Symbol search path is: SRV*C:\Symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows 7 Kernel Version 7601 (Service Pack 1) MP (12 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 7601.17640.amd64fre.win7sp1_gdr.110622-1506
Machine Name:
Kernel base = 0xfffff800`0300f000 PsLoadedModuleList = 0xfffff800`03254670
Debug session time: Mon Oct 10 02:44:29.528 2011 (GMT-4)
System Uptime: 2 days 0:56:07.359
Loading Kernel Symbols
...............................................................
................................................................
.......................................................
Loading User Symbols

Loading unloaded module list
..................
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 101, {10, 0, fffff880009b2180, 4}

Probably caused by : Unknown_Image ( ANALYSIS_INCONCLUSIVE )

Followup: MachineOwner
---------

0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

CLOCK_WATCHDOG_TIMEOUT (101)
An expected clock interrupt was not received on a secondary processor in an
MP system within the allocated interval. This indicates that the specified
processor is hung and not processing interrupts.
Arguments:
Arg1: 0000000000000010, Clock interrupt time out interval in nominal clock ticks.
Arg2: 0000000000000000, 0.
Arg3: fffff880009b2180, The PRCB address of the hung processor.
Arg4: 0000000000000004, 0.

Debugging Details:
------------------


BUGCHECK_STR: CLOCK_WATCHDOG_TIMEOUT_c_PROC

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

PROCESS_NAME: System

CURRENT_IRQL: d

STACK_TEXT:
fffff880`037be528 fffff800`030e38c9 : 00000000`00000101 00000000`00000010 00000000`00000000 fffff880`009b2180 : nt!KeBugCheckEx
fffff880`037be530 fffff800`03096497 : 00000000`00000000 fffff800`00000004 00000000`00002626 00000000`00000000 : nt! ?? ::FNODOBFM::`string'+0x4e2e
fffff880`037be5c0 fffff800`0360c1c0 : 00000000`00000000 fffff880`037be770 fffff800`03628460 fffff800`00000000 : nt!KeUpdateSystemTime+0x377
fffff880`037be6c0 fffff800`03088173 : 00000000`25a10cee fffff800`03628460 fffff800`03201e80 00000000`00000000 : hal!HalpRtcClockInterrupt+0x130
fffff880`037be6f0 fffff800`030917a0 : fffff800`03201e80 fffffa80`00000001 00000000`00000000 fffffa80`09bbc298 : nt!KiInterruptDispatchNoLock+0x163
fffff880`037be880 fffff800`030766f4 : 00000000`00000002 00000000`00000001 fffff880`037beba0 00000000`00002227 : nt!KeFlushMultipleRangeTb+0x260
fffff880`037be950 fffff800`03107ac5 : fffff800`032c1ac0 fffff800`00000001 00000000`00000001 fffff880`037bebb0 : nt!MiAgeWorkingSet+0x64a
fffff880`037beb00 fffff800`03076826 : 00000000`0000b020 00000000`00000000 fffffa80`00000000 00000000`00000002 : nt! ?? ::FNODOBFM::`string'+0x4d786
fffff880`037beb80 fffff800`03076cd3 : 00000000`00000008 fffff880`037bec10 00000000`00000001 fffffa80`00000000 : nt!MmWorkingSetManager+0x6e
fffff880`037bebd0 fffff800`03326fee : fffffa80`09a71040 00000000`00000080 fffffa80`099deb30 00000000`00000001 : nt!KeBalanceSetManager+0x1c3
fffff880`037bed40 fffff800`0307d5e6 : fffff880`009b2180 fffffa80`09a71040 fffff880`009bd1c0 00000000`00000000 : nt!PspSystemThreadStartup+0x5a
fffff880`037bed80 00000000`00000000 : fffff880`037bf000 fffff880`037b9000 fffff880`037be700 00000000`00000000 : nt!KxStartSystemThread+0x16


STACK_COMMAND: kb

SYMBOL_NAME: ANALYSIS_INCONCLUSIVE

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: Unknown_Module

IMAGE_NAME: Unknown_Image

DEBUG_FLR_IMAGE_TIMESTAMP: 0

FAILURE_BUCKET_ID: X64_CLOCK_WATCHDOG_TIMEOUT_c_PROC_ANALYSIS_INCONCLUSIVE

BUCKET_ID: X64_CLOCK_WATCHDOG_TIMEOUT_c_PROC_ANALYSIS_INCONCLUSIVE

Followup: MachineOwner
---------

0: kd> lmvm Unknown_Module
start end module name
 
I don't think you'll get a conclusive answer without swapping CPUs, unfortunately.
 
I'd suspect the mobo before the CPU. I've only seen 1 bad Intel CPU in years of doing repairs every day.
 
same. and it was a wtf thing too. was a Dell under warranty. seemed like prototypical mobo problems so i requested a mobo. same thing. then i requested a CPU and voila.
having NBD warranty and a Dell certification came in handy there...heh.
 
I've seen this error a few times; usually my voltage is too low. Basically, one of the cores (or threads) stopped responding; it got into a broken state and will not "wake up."

If the processor is running cool, try upping the voltage. If it's running too hot, you might open the case to help it cool off, or reseat the heatsink and check the thermal interface material. Consider turning off turbo or reducing the clock speed to help with heat.
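For anyone trying to read the dump in the first post, here is a minimal Python sketch that splits the `BugCheck 101, {...}` summary line into its four arguments. The labels for the first three arguments come from the `!analyze -v` text above; the label for the fourth (hung processor index) is an assumption based on my reading of Microsoft's bug check reference, since this debugger version only prints "0." for it. The names `parse_bugcheck` and `ARG_LABELS_0X101` are made up for illustration.

```python
import re

# Argument labels for bugcheck 0x101 (CLOCK_WATCHDOG_TIMEOUT).
# The first three follow the !analyze -v text in the first post; the fourth
# is an ASSUMPTION taken from Microsoft's bug check reference, not this dump.
ARG_LABELS_0X101 = (
    "clock interrupt time-out interval (nominal ticks)",
    "reserved (0)",
    "PRCB address of the hung processor",
    "index of the hung processor (assumed meaning)",
)


def parse_bugcheck(line):
    """Return (code, [arg1, arg2, arg3, arg4]) as integers from a summary line."""
    match = re.match(r"\s*BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line)
    if match is None:
        raise ValueError("not a BugCheck summary line: %r" % line)
    code = int(match.group(1), 16)
    # WinDbg prints every argument in hexadecimal without a 0x prefix.
    args = [int(field.strip(), 16) for field in match.group(2).split(",")]
    return code, args


if __name__ == "__main__":
    code, args = parse_bugcheck("BugCheck 101, {10, 0, fffff880009b2180, 4}")
    print("bugcheck 0x%X" % code)
    for label, value in zip(ARG_LABELS_0X101, args):
        print("  %-50s 0x%x (%d)" % (label, value, value))
```

On this dump the interval comes out as 0x10 = 16 ticks and the last argument is 4. If that argument really is the processor index, it would point at logical processor 4 of the 12 listed in the dump header as the one that stopped answering.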
 
So I know this is a dead thread, but I thought I would update everyone. After jumping through the hoops of Intel hell for warranty, I first got a DOA board from them that would only boot for a minute and then constantly cycle while turning off the USB ports. After this I informed them I wanted a replacement CPU, as the CPU would work in the old board for a day or so, sometimes less haha, and then throw a blue screen with a CPU STOP error. So finally they relented and gave me an RMA for my i7. Got it today and set up the new chip in the new board first, and wow, same thing: the board boots for a minute, then cycles and locks out the USB. AWESOME. Since I kept the old board, and made sure they noted it in the RMA field, I just put all my stuff in the old board and reinstalled Windows, and I am good to go. Been going strong for over 12 hours now with no BSOD.

The woman at Intel told me that my out-of-spec RAM is what caused my CPU to fail. I am using Mushkin Silverline 9-9-9-24 PC3-10666 RAM, and she said I MUST use 1066 MHz RAM, i.e. PC3-8500. She said that there is no way the CPU will work with RAM that is NOT the correct speed, even if you set the RAM speed in the BIOS!

"You have mentioned that you tested multiple memories; have any of these memories had a default speed of 1066 MHz with a 1.5v?
This is important. The memory needs to be able to work at 1066 MHZ and 1.5v with 9-9-9-24 of timings or CAS latency if the memory that you put on your system has different specifications than the one that are mention you will encounter a lot of issue. The processor i7-970 memory controller will force the RAM to run at 1066 MHZ 1.5v whether the memory will support it or not and that is why at this moment you get that message that memory changes.
I have talked with the engineer today about your case and they have not been able to replicate the issue.
We can do the RMA on the processor, but it the memory used with the new processor is not capable of running at 1066 MHZ and 1.5v with CAS latency 9-9-9-24, in the future you will encounter the same issues or new ones"


So does this mean I am going to need another processor in a year? Haha.
 
Well, if you'd like to trade your 970 for my 920, we can arrange that ;)

That's kind of ridiculous how it can be that picky about RAM though
 
Yeah, it makes me wonder if she even knew what she was talking about. I guess in a year's time I will see if my 970 fails again. But where am I going to find 9-9-9-24 1066 RAM? I looked and looked, and the only RAM at that speed is CL7. No one makes CL9 1066 RAM, at least that I can find. I asked at Crucial and all they said was that the board has to be able to handle the RAM; they never answered when I asked about the CPU.
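For reference, here is a rough Python sketch of the arithmetic behind the speed labels being thrown around here (1066, the "10666" class of modules, PC8500) and of what a CAS number means in nanoseconds. It assumes the usual DDR3 conventions: a 64-bit (8-byte) module bus, and an I/O clock that is half the transfer rate. The function names are made up for illustration, and this says nothing about what the i7 970's memory controller actually tolerates.

```python
def peak_bandwidth_mb_s(data_rate_mt_s, bus_bytes=8):
    """Peak module bandwidth in MB/s: transfers per second times bytes per transfer."""
    return data_rate_mt_s * bus_bytes


def cas_latency_ns(cas_cycles, data_rate_mt_s):
    """CAS latency in nanoseconds; the I/O clock runs at half the transfer rate."""
    clock_mhz = data_rate_mt_s / 2.0
    return cas_cycles * 1000.0 / clock_mhz


if __name__ == "__main__":
    # Nominal DDR3 transfer rates; 1066 and 1333 are rounded marketing numbers.
    ddr3_1066 = 1066.67
    ddr3_1333 = 1333.33

    print("DDR3-1066 peak MB/s: %.0f" % peak_bandwidth_mb_s(ddr3_1066))
    print("DDR3-1333 peak MB/s: %.0f" % peak_bandwidth_mb_s(ddr3_1333))
    print("CL9 @ 1066: %.2f ns" % cas_latency_ns(9, ddr3_1066))
    print("CL7 @ 1066: %.2f ns" % cas_latency_ns(7, ddr3_1066))
    print("CL9 @ 1333: %.2f ns" % cas_latency_ns(9, ddr3_1333))
```

By this arithmetic, CL9 at 1066 is a looser timing in absolute terms (about 16.9 ns) than either CL7 at 1066 (about 13.1 ns) or CL9 at 1333 (about 13.5 ns), which may be why nobody sells a 1066 CL9 kit.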

I am just glad that things are working again and I can get back to converting my DVD collection to Blu-ray. As for trading for a 920, haha, I dunno if I could ever go back to a quad core :p x264 encoding needs as much power as possible, and my 700+ DVDs will get done that much faster :p
 
Well I thought I would chime in and revive my dead thread.

CPU has died again. Everything is doing the exact same thing. So wonderful, sigh.
 
Were you overclocking when this happened again? Did you keep your temps in check all this time? If you were not overclocking, then it could be you have either a bad PSU or your motherboard is having problems with CPU VRMs and possibly overvolts your i7 970. Given it has taken almost 3 years for the problem to reoccur, pinning down the culprit is not so easy.

If you are still happy with the performance of your rig, I'd just get a new PSU and a cheap Xeon X5650 / X5660 / X5670 processor to stretch some more time out of your machine.
 