New 1.08 core possible multi-GPU problem!

nomad8u

I just got home from a 13 hr shift to a HUGH shock when I checked on m' babies. There's a new core (1.08) that was/is supposed to be "forced" on Monday, but it got pushed to my multi-GPU rig tonight, and what a shock. Secondary GPU production is down 46+%, from a benchmarked 3787 PPD (1:51 - 1:52 step) on a 5008 WU to 2056 PPD (3:21 step)!!! :eek: As I type this (this is the rig it's on) I'm showing 1536 PPD and a 4:30 step!
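For anyone who wants to sanity-check numbers like these: a rough sketch, assuming PPD on the same WU is inversely proportional to step time. That proportionality is an assumption for illustration, not a formula taken from the client, and the function names are made up here.

```python
def step_seconds(step: str) -> int:
    """Convert an 'm:ss' step time such as '3:21' to seconds."""
    minutes, seconds = step.split(":")
    return int(minutes) * 60 + int(seconds)


def scaled_ppd(baseline_ppd: float, baseline_step: str, new_step: str) -> float:
    """Estimate PPD at a new step time for the same WU.

    Assumes PPD is inversely proportional to step time (an assumption,
    not a documented client formula).
    """
    return baseline_ppd * step_seconds(baseline_step) / step_seconds(new_step)


# Baseline from the post: 3787 PPD at roughly a 1:51 step on a 5008 WU.
for step in ("3:21", "4:30"):
    print(step, round(scaled_ppd(3787, "1:51", step)))
```

The estimates come out within a couple of percent of the reported 2056 PPD and 1536 PPD, which suggests the PPD readings simply track the slower step times.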

I'm just getting this heads-up out there (not trying to panic anyone!!!!!) in case someone is running a similar config to mine and sees a HUGH drop in their second GPU production, so maybe you can recover faster than I will. I'm currently running the Beta 8 client and haven't upgraded to the Beta 11 client yet, and I gotta be back at work early in the AM, so this will likely sit this way until tomorrow... :mad:

Here's the link to my post on the FF in the thread about the new 1.08 core. I suggest everyone who hasn't read it and isn't aware of Beta 11 look at that thread, and at least check out the first couple of posts in the Beta 11 thread as well for how to upgrade (basically deinstall/reinstall)...

To save everyone some time and me some keystrokes, here's what I found and posted there as well as my semi detailed config...

It looks like the 1.08 core may be forced now? I just got home from work and noticed a HUGH drop in PPD on my secondary GPU in a multi card setup (specs in a minute). I took a quick look, and both cards on this system have the 1.08 core downloaded. I have been running the 1.07 core since it was first made available here. It's running 2 WUs that I have previously benchmarked on this system. I am also aware of the potential "quirks" of running multi GPU with dissimilar cards from the thread in this forum titled "Multi-GPU Quirks". I'm not going to have time to upgrade the client to Beta 11 until tomorrow, so I'll follow up then. Anyway... here's the details.

System and setup specs:
Q6600 @ 3.05
2GB DDR2 PC6400
Intel XBX2 (975X chipset)
XP32 SP2
Primary GPU (in 16x slot) 9600GT @ 667/936/1884
Secondary GPU (in 8x slot) 8800GS @ 615/800/1884
Driver 177.35
Client 6.12 beta 8
Core (previously 1.07) 1.08
Primary - 9600GT running WU 5007 @ 4039PPD (1:42 step). This is in the previously benchmarked range of 4.66PPD. Set to run locked (and verified) on core 3.
Secondary - 8800GS running 5008 WU @ 2056PPD (3:21 step). This is WAY outside the previously benchmarked range of 3787PPD / 1:51-1:52 step. Set to run locked (and verified) on core 2.
I should also mention I am running an SMP client (WU 2605) in a VM with notfred's "diskless CD" as I have been running since I first started the GPU client. It is still running normally and is locked to cores 0,1.
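As an aside on the core locking above: a small sketch of how those assignments map to affinity bitmasks, assuming the usual convention that bit n stands for core n. How the clients were actually locked is not stated in the post, so the `affinity_mask` helper and the labels are purely illustrative.

```python
def affinity_mask(cores):
    """Build an affinity bitmask where bit n is set for each core n."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


# Core assignments as described in the post (labels are illustrative).
assignments = {
    "SMP client in VM": [0, 1],
    "8800GS GPU client": [2],
    "9600GT GPU client": [3],
}

for name, cores in assignments.items():
    print(f"{name}: cores {cores} -> mask {affinity_mask(cores):#06b}")
```

With this layout each client gets its own bits, so no two clients share a core on the quad.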


I checked both GPUs' clocks via GPU-Z to ensure nothing had changed, and Task Manager showed both cores running @ 25% as normal. Following that, I paused the VM SMP client, with no change in the poor 8800GS secondary production. I then stopped and restarted the 8800GS client and verified once again that it's running poorly with the 1.08 core.

I will update to the 6.12 beta 11, but not tonight. After that I will post updated results, but I wanted to get this in here in case someone else notices similar issues. That's approximately a 46% drop in production with no other noticeable reason than a different core. I thought that was significant and wanted to share.
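The 46% figure can be checked with a couple of lines, using only the PPD numbers quoted in this post (the helper name is made up for the example):

```python
def percent_drop(before: float, after: float) -> float:
    """Return the drop from `before` to `after` as a percentage of `before`."""
    return (before - after) / before * 100


# Benchmarked PPD on the 5008 WU vs. the two readings taken tonight.
print(f"{percent_drop(3787, 2056):.1f}%")
print(f"{percent_drop(3787, 1536):.1f}%")
```

The first line works out to about 45.7%, which rounds to the 46% quoted above. The later 1536 PPD reading would be a drop of roughly 59%.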

thanks and good folding!

Again, guys/gals, not trying to sound the alarm, but it shocked the hell outta me and I wanted to get a heads-up going here ASAP just in case. BTW, in addition to the steps I took to try to correct this issue (I gotta add this in to my post there), I reset the clocks on the 8800GS and reclocked it to 601/800/1866 with no positive results.

I seriously hope this is MY problem somehow. If not, this is gonna seriously blow for anyone running dissimilar multi GPU setups. Maybe Beta 11 will fix it (fingers crossed).

 
Gratuitous bump to get this back up top before I hit the rack. Work tomorrow dontcha know..

 
Is the v1.08 core already available? If so, the performance in the above user's post is contrary to what I've been reading over the past week. FWIR, the v1.08 core will apparently have some performance enhancements. :confused:

 


Yes, it's available. If you want to run this, just delete the current FahCore_11.exe (or rename it if it's 1.07 in case) and it will download 1.08.
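If you'd rather script the rename than do it by hand, here is a minimal sketch. The client directory and the backup naming are assumptions for illustration, and it presumes the client is stopped first. Only the file name FahCore_11.exe comes from the post above.

```python
from pathlib import Path
from typing import Optional


def set_aside_core(client_dir: str,
                   core_name: str = "FahCore_11.exe",
                   tag: str = "1.07") -> Optional[Path]:
    """Rename the existing core so the client fetches a fresh copy.

    Returns the backup path, or None if no core file was found.
    The backup naming scheme is hypothetical.
    """
    core = Path(client_dir) / core_name
    if not core.is_file():
        return None
    backup = core.with_name(f"{core.stem}_{tag}{core.suffix}.bak")
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")
    core.rename(backup)
    return backup


# Example call with a hypothetical install path:
# set_aside_core(r"C:\FAH\GPU1")
```

Renaming rather than deleting keeps the old core around in case you want to put it back.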

 
I hope none of my boxen mess up. I'm on vacation, currently posting at 70mph :). I won't be home for another 9 days. :eek:
 
Yes, it's available. If you want to run this, just delete the current FahCore_11.exe (or rename it if it's 1.07 in case) and it will download 1.08.
I'm intending to reinstall some of my GPU clients later this evening, but I don't want to use the 1.08 core if I experience a corresponding performance drop. Can someone confirm or deny the results in the FCF thread above? TIA
 
I run 3 GPU2 and I updated all of them to 1.08 this morning without any noticeable performance hit. You can try it and I bet it will work fine ;)

 
I hope none of my boxen mess up. I'm on vacation, currently posting at 70mph :). I won't be home for another 9 days. :eek:

Don't worry, Kendrak, your boxen are in good shape. You are pulling away from me with all your new firepower!

 
Okeedoke. I've been able to verify it's NOT the 1.08 core causing my problem. This is a weird issue though. I think my card is "bad" in a strange way. After all the steps I've taken to verify this I'm convinced it's the card. Bottom line is I'm down 1400-1700 PPD and have no clue yet how to resolve it.

I'm aware of the issue with heterogeneous multi-GPU setups and I've proven this to myself with the testing I've done to isolate this problem. So this alone is not the issue but it does compound the problem.

Numbers prior to the problem (benchmarked across at least 7 different WU): Slot 1 GPU - 9600GT average PPD was 4073 and Slot 2 GPU - 8800GS average PPD was 3765.
When I first noticed the problem (and noticed the 1.08 core) the Slot 1 GPU was unaffected. The slot 2 GPU had dropped to 2056 PPD on a WU that had benched at 3787 PPD.

Steps taken and results:
1. Updated to the beta 11 client (still running the 1.08 core, but tried 1.07 and 1.06 as well). Same issue: Slot 2 @ 2055 PPD.
2. Rebooted and restarted clients. Same issue: Slot 2 @ 2055 PPD.
3. Removed, cleaned and reinstalled drivers (177.35) with appropriate reboots in between (w/safe mode cleaning). Same issue: Slot 2 @ 2056 PPD.
4. Swapped PCIe power cables and then tried another power supply to eliminate power as the source. Same issue: Slot 2 around 2056 PPD.
5. Swapped cards/slots. 8800GS is now the primary in Slot 1 and 9600GT the secondary in Slot 2. Still the same issue, but PPD on the 8800GS improved moderately by making it the primary card/slot. Now the 9600GT is down from normal, but that's to be expected in Slot 2.

Numbers currently: Slot 1 - 8800GS @ 2722 PPD (off by 1k-1.5k) and Slot 2 - 9600GT @ 3300 PPD (off by about 700-800 PPD, and I expected this one since it's now the secondary card).
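To put the before/after numbers side by side, here is a quick sketch that compares the current readings against the multi-WU averages quoted earlier. Using the averages as the baseline is an assumption, since the "off by" ranges above may be based on per-WU benchmarks instead.

```python
# Average PPD before the problem (benchmarked across at least 7 WUs).
baseline = {"9600GT": 4073, "8800GS": 3765}
# PPD after swapping slots (8800GS now primary, 9600GT secondary).
current = {"8800GS": 2722, "9600GT": 3300}


def deficits(before, after):
    """Return {card: (ppd_lost, percent_lost)} for each card in `after`."""
    result = {}
    for card, ppd in after.items():
        lost = before[card] - ppd
        result[card] = (lost, lost / before[card] * 100)
    return result


for card, (lost, pct) in deficits(baseline, current).items():
    print(f"{card}: down {lost} PPD ({pct:.1f}%)")
```

Against those averages the 8800GS is down 1043 PPD and the 9600GT is down 773 PPD, in line with the ranges given above.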

So I'm convinced the 8800GS is busticated and/or running crippled based on all my testing and benchmarks both on power and speed. Here's why. In addition to the PPD benches I have logged, I did power draw testing and still have the KAW hooked up.
Prior to seeing this issue, I was pulling 350-355W at the wall at full system load. I'm now pulling 322-325W with the same load. Also, watching the temps: prior to the issue, with the fan at 75%, the card ran for two weeks+ at 66-68C. When I noticed the problem and saw the power drop on the KAW, I looked at the GPU temps and it was around 55C at full tilt (this was even after I checked/redid my OC and bumped the shaders up to 1905). Ambient temps have not changed from about 26.3-26.7C. Loads on the corresponding CPU cores have not changed, looking at them with Task Mangler and Process Explorer. I've now had the fan on the 8800GS turned down to 55% for the past 4 hours and the temps won't climb over 58C. This causes me to think the card is running crippled somehow.

Looking at the data and all I've done to try and troubleshoot this, I'm convinced the card is running in some sort of a crippled/reduced output sorta way.

I can see the RMA chat chain with EVGA now...
/me: I'd like to RMA my 8800GS.
/tech: I'll be glad to help, what's the problem.
/me: It doesn't fold as fast as it did three weeks ago. I mean we're talking about a serious PPD drop here man.
/tech: Excuse me?
/me: I'm down over 1700 PPD cause my card slowed down. Here, I'll email you the data.
/pause while the spreadsheet goes through and is pondered silently on the other end.
/tech: So are you seeing any video corruption?
/me: No. I don't play games. I don't use the card for anything but number crunching.
/tech: Any Blue Screens or artifacting?
/me: No.
/tech: Any driver issue?
/me: No.
/tech: Overheating?
/me: No.
/tech: Lockups or freezing?
/me: No.
/tech: I'm sorry sir but there doesn't appear to be anything wrong.
/me: NOTHING WRONG? You call losing 1700PPD nothing wrong? Do you know how many times I've been Hard Mowed because of this?
/tech: I'm sure I wouldn't know anything about that sir. Your card appears to be working fine.
/me: OK. Would you by any chance have any links to volt modding the 8800GS?
/tech: Excuse me?
/me: Nevermind...


So I'm out of ideas at this point... If anyone knows of any utilities that can show what/how a card is running OR has any other ideas, LMK.

 
Who is Hugh and why is he shocking you?

haha... Nice catch! I think I'm gonna leave it that way as my buddy Hugh Yuengling would be rather offended if I edited it. His brother Huge Yuengling is feeling rather slighted tho.... :D


 