AMD 4P Burning and Prep for OC?

bowlinra · Jun 8, 2012

tear said:
Soon (in a day or two) I'll have a version of OC BIOS that
will allow testing XMP profile w/o OC (CL7/DDR3-1333
in your case).
If XMP w/o OC gives issues you most likely have bad
DIMMs (or something off between CPU and DIMMs).

Also, re-testing @ 225 (I'm assuming that's what gave
you errors) and checking whether you get errors at same
offsets as before would be a worthy exercise.
Consistent errors would mean DIMM issue. Inconsistent
errors suggest OC/tuning/board issue.

Alternatively, if you have enthusiast board that supports
XMP you could test your DIMMs in it.

Eagerly awaiting the updated BIOS. I've been rereading some threads and apparently missed the statement that you need better than 1333 Cas 7 memory to run over 225.

Reference link
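
tear's rule of thumb quoted above (same offsets on a re-test point to the DIMMs, shifting offsets point to OC/tuning/board) can be sketched as a quick comparison of failing addresses written down by hand from the memtest screen. This is only a rough sketch: the function name, the 50% overlap threshold, and the sample addresses are made up for illustration and come from neither memtest86+ nor this thread.

```python
def classify_memtest_errors(first_run, second_run, threshold=0.5):
    """Compare failing addresses from two memtest passes at the same settings.

    The threshold is an arbitrary illustration value, not a memtest86+ rule.
    """
    first, second = set(first_run), set(second_run)
    if not first and not second:
        return "no errors"
    # Share of all failing addresses that showed up in both runs
    overlap = len(first & second) / len(first | second)
    if overlap >= threshold:
        return "consistent: suspect DIMMs"
    return "inconsistent: suspect OC/tuning/board"


# Hypothetical addresses noted from two passes at refclock 225
run_1 = [0x0012A3F40, 0x0012A3F80, 0x01F00C210]
run_2 = [0x0012A3F40, 0x0012A3F80]
print(classify_memtest_errors(run_1, run_2))
```

Two of the three distinct addresses repeat in this made-up example, so it is reported as consistent; with real data the cutoff would be a judgment call.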

musky · Jun 8, 2012

bowlinra said:
you need better than 1333 Cas 7 memory to run over 225.

Not true at all. 1333 is fine up through somewhere around 250. It would also be fine over 250, but we would just need to switch you to a lower memory multiplier (I think the O/C BIOS can do that as it is now).

tear · Jun 8, 2012

bowlinra said:
Eagerly awaiting the updated BIOS..

Got a bit sidetracked here (work) but it should be there tonight.

bowlinra said:
I've been rereading some threads and apparently missed the statement that you need better than 1333 Cas 7 memory to run over 225.

musky said:
Not true at all. 1333 is fine up through somewhere around 250. It would also be fine over 250, but we would just need to switch you to a lower memory multiplier (I think the O/C BIOS can do that as it is now).

I've made a few clarifications in the OCNG BIOS thread.

Going above 250 w/1333 memory is uncharted territory and very much a YMMV thing (the BIOS is not designed
to support it). Not sure if there's anyone who uses it (you'd need to go through the thread, I honestly
don't remember...).

bowlinra · Jun 9, 2012

tear said:
Got a bit sidetracked here (work) but it should be there tonight.

I've made a few clarifications in the OCNG BIOS thread.

Going above 250 w/1333 memory is uncharted territory and very much a YMMV thing (the BIOS is not designed
to support it). Not sure if there's anyone who uses it (you'd need to go through the thread, I honestly
don't remember...).

Ran 224-16 for about 4 hours folding p6903 @ TPF 15:21.

Assuming I can get through this WU, what speed should I attempt next? Would I continue to append the 16 for every speed over 224, or only for certain speeds?

Cut from my first OC adventure:

I started at 240 - Posted, Booted, Locked up folding about a minute in.
Rebooted, Set to 235, shutdown, powered down, restart

tear · Jun 9, 2012

Side note -- "16" flag has virtually no effect w/DDR3-1333 memory. It's meant to enable higher memory clocks.
BTW, description of "16" flag w/frequencies 241 and above was not correct. Fixed now.

"8" flag disables use of XMP; updated OCNG post to reflect that.

Here's new BIOS for your board: http://hardforum.com/showpost.php?p=1038817959&postcount=514

I recommend testing memory in ./smocng.sh 200 configuration first (remember to download new
smocng.sh) given you've seen errors before. If test passes w/o errors you can go back to OC.

There's no good answer to your question. It's trial and error. No matter what, you want to keep temps
as low as possible (you should keep them below 50C at load).

Some 6166s run at 250 w/small voltage bump (realized by TPC) -- FYI.

Good practice is testing the OC w/dedicated test WU: http://hardforum.com/showpost.php?p=1038639726&postcount=433

bowlinra · Jun 11, 2012

tear said:
I recommend testing memory in ./smocng.sh 200 configuration first (remember to download new
smocng.sh) given you've seen errors before. If test passes w/o errors you can go back to OC.

Update:
- 224-16 failed with a rig shutdown about 4-6 hrs into folding p6903.
- Loaded ver 3 of the BIOS.
- WU was restarted at 0%
- Set for 222; failed with a rig shutdown about 4-6 hrs into folding the same WU.
- WU was lost
- Reset with "sudo ./smocng.sh reset" and powered off
- New WU p6904 has been running for 38 hrs. TPF 23:22

Way forward:
One, I guess I should be playing with a Test WU.
Two, I've completed a 30+ hr Memtest86+ run on the Vers 2 BIOS at 200 with no errors, and another 30+ hr run with errors at 225 (failing at Test #6 and #8). I'm having to run Memtest86+ off a bootable USB stick because I don't know how to run the test within Ubuntu. I don't appear to get any logging capability, and I have no idea how to match an offset to an actual memory stick. It's been frustrating, to say the least.

For good news: I'm not having any issues with CPU temps. I haven't seen anything over 42C.

402blownstroker · Jun 11, 2012

What are your HT retries? The behavior is like what I was getting when I had ht-retry issues.

Grandpa_01 · Jun 11, 2012

bowlinra said:
Update:
- 224-16 failed with a rig shutdown about 4-6 hrs into folding p6903.
- Loaded ver 3 of the BIOS.
- WU was restarted at 0%
- Set for 222; failed with a rig shutdown about 4-6 hrs into folding the same WU.
- WU was lost
- Reset with "sudo ./smocng.sh reset" and powered off
- New WU p6904 has been running for 38 hrs. TPF 23:22

Way forward:
One, I guess I should be playing with a Test WU.
Two, I've completed a 30+ hr Memtest86+ run on the Vers 2 BIOS at 200 with no errors, and another 30+ hr run with errors at 225 (failing at Test #6 and #8). I'm having to run Memtest86+ off a bootable USB stick because I don't know how to run the test within Ubuntu. I don't appear to get any logging capability, and I have no idea how to match an offset to an actual memory stick. It's been frustrating, to say the least.

For good news: I'm not having any issues with CPU temps. I haven't seen anything over 42C.

Try 226 and see what happens. The memory timings should go up to 7-7-7 instead of 6-6-6, and it might be just what you need.

Linden · Jun 11, 2012

bowlinra, Grandpa_01 is right. It worked for my system. I hit a wall at 225. Boosting the reference clock to 226 pushed the RAM's CL up to 7, allowing higher clocks.
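
One way to see why the step from 225 to 226 might matter is to convert the timings into nanoseconds. This is a back-of-the-envelope sketch that assumes the memory data rate scales linearly with the reference clock from DDR3-1333 at the stock 200 (the thread implies this but never states it), and it takes the CL6 at 225 and below versus CL7 at 226 split from Grandpa_01's post. The function names are illustrative only.

```python
STOCK_REFCLOCK_MHZ = 200.0
RATED_DATA_RATE_MTS = 1333.0  # DDR3-1333 at the stock reference clock


def data_rate(refclock_mhz):
    """Effective data rate in MT/s, ASSUMING it scales linearly with refclock."""
    return RATED_DATA_RATE_MTS * refclock_mhz / STOCK_REFCLOCK_MHZ


def cas_latency_ns(cl, rate_mts):
    """CAS latency in ns: CL memory clocks, one clock = 2000 / data rate (ns)."""
    return cl * 2000.0 / rate_mts


for refclock, cl in [(200, 7), (225, 6), (226, 7)]:
    rate = data_rate(refclock)
    print(f"refclock {refclock}: about DDR3-{rate:.0f}, "
          f"CL{cl} = {cas_latency_ns(cl, rate):.1f} ns")
```

Under those assumptions, CL6 at 225 works out to roughly 8.0 ns, noticeably tighter than the 10.5 ns of the rated CL7 at DDR3-1333, while CL7 at 226 relaxes to about 9.3 ns. That would fit a wall at 225, but it is an interpretation and not something confirmed in the thread.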

tear · Jun 11, 2012

Bowlinra, it looks like you're running into some timing tuning issue (<=225) which is,
well, odd.

You're not the only one, though. Wolf and Linden had some issues too.
Try running FAH or memtest at 226. It should make a big difference
as it pushes the timings into CL7 area.

As far as recording test results goes -- memtest86 doesn't offer logging
(per my knowledge) unless you have it use serial console (not something
I would recommend). Your best bet would be writing them down or taking
a photo (ugh) of the screen.

Also, for the purpose of analysis, I'd like to ask you (if you're fine with it) to:
1. Set the refclock to 225
2. Run: sudo TurionPowerControl -dram and capture its output
3. Use eeprog to capture SPDs of your DIMMs and e-mail them to me

re eeprog

Code:

wget http://darkswarm.org/eeprog-0.7.6-tear10.tar.gz
tar -xzf eeprog-0.7.6-tear10.tar.gz
cd eeprog-0.7.6-tear10
make
sudo make install
mkdir ~/spd
cd ~/spd
sudo eeprog-spd-dump-g34 all

Last command should create 16 SPD images.

Linden · Jun 12, 2012

tear, check your email - you now have my eeprog SPD data; Memtest will have to wait until later

bowlinra · Jun 12, 2012

402blownstroker said:
What are your HT retries? The behavior is like what I was getting when I had ht-retry issues.

Been folding for about 40 hours.

Code:

bowlinra@amd4p:~$ sudo ./ht-retries.sh
       L0S0 L1S0 L2S0 L3S0 L0S1 L1S1 L2S1 L3S1
Node 0 0000 0000 0000 0000 0000 0000 0000 0000 
Node 1 0000 0000 0000 0000 0000 0000 0000 0000 
Node 2 0000 0000 0000 0000 0000 0000 0000 0000 
Node 3 0000 0000 0000 0000 0000 0000 0000 0000 
Node 4 0000 0000 0000 0000 0000 0000 0000 0000 
Node 5 0000 0000 0000 0000 0000 0000 0000 0000 
Node 6 0000 0000 0002 0000 0000 0000 0000 0000 
Node 7 0000 0000 0000 0000 0000 0000 0000 0000 
bowlinra@amd4p:~$

I've run ht-retries several times over the last month and never seen anything but zero, so it's interesting to see this 0002. When do the counters reset? At boot? By command? Or never?
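
For anyone checking these counters regularly, the table above can be scanned for non-zero entries with a few lines of Python. This is a minimal sketch that assumes the output keeps the layout shown in this post (a header row of LxSy names followed by "Node N" rows). It makes no claim about what the columns mean or when the counters reset, and the function name is made up.

```python
import re

LINK_NAME = re.compile(r"L\dS\d")

SAMPLE = """\
       L0S0 L1S0 L2S0 L3S0 L0S1 L1S1 L2S1 L3S1
Node 5 0000 0000 0000 0000 0000 0000 0000 0000
Node 6 0000 0000 0002 0000 0000 0000 0000 0000
Node 7 0000 0000 0000 0000 0000 0000 0000 0000
"""


def nonzero_ht_retries(text):
    """Return (node, column, value) for every counter that is not all zeros."""
    header = []
    hits = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if all(LINK_NAME.fullmatch(p) for p in parts):
            header = parts  # the row of column names
        elif parts[0] == "Node" and len(parts) >= 2 and header:
            node = int(parts[1])
            for column, value in zip(header, parts[2:]):
                # Compare as text so the radix of the counter does not matter
                if value.strip("0"):
                    hits.append((node, column, value))
    return hits


print(nonzero_ht_retries(SAMPLE))
```

Shell prompt lines and blank lines are skipped, so the captured terminal output can be pasted in as-is.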

bowlinra · Jun 12, 2012

tear said:
Code:

sudo eeprog-spd-dump-g34 all

Last command should create 16 SPD images.

Ran a test under stock 200, before reboot, and got:

Code:

bowlinra@amd4p:~/spd$ sudo eeprog-spd-dump-g34 all
Checking for id head sed od tr dd dmidecode modprobe i2cset eeprog-tear ...done.
ERROR: BMC(s) found. Please disable IPMI and try again.

I'm also running the H8QGi, so I thought I was OK with IPMI. I don't need it, but couldn't find a disable option in the BIOS.

I also set the clocksource to hpet, in case that has anything to do with anything.

tjmagneto · Jun 12, 2012

bowlinra said:
Ran a test under stock 200, before reboot, and got:

Code:

bowlinra@amd4p:~/spd$ sudo eeprog-spd-dump-g34 all
Checking for id head sed od tr dd dmidecode modprobe i2cset eeprog-tear ...done.
ERROR: BMC(s) found. Please disable IPMI and try again.

I'm also running the H8QGi, so I thought I was OK with IPMI. I don't need it, but couldn't find a disable option in the BIOS.

I also set the clocksource to hpet, in case that has anything to do with anything.

With the Gi board you will need to locate the JPB1 jumper and set to disable. Worked for me.

bowlinra · Jun 12, 2012

tjmagneto said:
With the Gi board you will need to locate the JPB1 jumper and set to disable. Worked for me.

Very timely!!! Worked like a champ.

bowlinra · Jun 12, 2012

tear said:
Also, for the purpose of analysis, I'd like to ask you (if you're fine with it) to:
1. Set the refclock to 225
2. Run: sudo TurionPowerControl -dram and capture its output
3. Use eeprog to capture SPDs of your DIMMs and e-mail them to me

tear, completed, and I PM'd you the numbers for 225 & 226. Attempting to fold with 226.

bowlinra · Jun 12, 2012

Linden said:
bowlinra, Grandpa_01 is right. It worked for my system. I hit a wall at 225. Boosting the reference clock to 226 pushed the RAM's CL up to 7, allowing higher clocks.

Thanks for the idea, I had high hopes. But I woke up this morning and the rig had shut down (in under ~5 hrs).

Hopefully tear will see something in the eeprog files.

tear · Jun 12, 2012

Busy at work here a bit but here's quick note --
shutdown suggests power supply/distribution issue.

What PSU are you running? And how are your JPW* populated?

G3rg of OCN saw similar issue IIRC but I don't remember what the
resolution was.

EDIT: and another question -- what is the board resting on? what
is the clearance between the board and resting surface?

Zink · Jun 12, 2012

edit: tear is serious about helping people, I'll leave you to it.

tear · Jun 12, 2012

Pls pls pls, this genuinely deserves its own thread

tear · Jun 12, 2012

Zink, I meant your topic there.

Why not post a new thread?

bowlinra · Jun 12, 2012

tear said:
Busy at work here a bit but here's quick note --
shutdown suggests power supply/distribution issue.

What PSU are you running? And how are your JPW* populated?

G3rg of OCN saw similar issue IIRC but I don't remember what the
resolution was.

EDIT: and another question -- what is the board resting on? what
is the clearance between the board and resting surface?

The PSU is a used OCZ ModXStream Pro 700W Modular High Performance PSU (I had it sitting idle): http://www.newegg.com/Product/Product.aspx?Item=N82E16817341018&Tpk=ocz%20pro%20700w

Power is connected to the motherboard with the PSU's standard 8-pin CPU power cable; the second connector is fed by a two 4-pin Molex to 8-pin EPS adapter (both Molex plugs are attached to the same single modular cable; maybe they need to be fed from separate Molex modular cables?). The PSU is also plugged into an APC Back-UPS 1500. The motherboard is screwed down to an MDF board with #6-32 3/4in screws into brass threaded inserts, with 1/2 nylon spacers between.


tear · Jun 12, 2012

Looks like dual +12V rail, 25A each.
Not sure how these (12V1, 12V2) map to your connectors -- you need to make sure you load them as
evenly as possible.

EDIT: just opened its manual and it's pretty much useless. There's no way of knowing
for sure where 12V2 goes unless you have add'l labeling on the unit itself

I would also recommend testing with beefier and/or single-rail PSU.

Linden · Jun 12, 2012

I would test the 12v rail under load at the ATX plug with a multimeter. If that's an older ModXStream PSU, I wouldn't feel too confident without testing. (speaking from experience, here) Chances are, it's just fine, but I would check.

tear · Jun 12, 2012

Wouldn't overcurrent protection kick in before you observe voltage drop?

theGryphon · Jun 12, 2012

Yeah, that PSU could easily be your culprit. As tear said, either get a single-rail PSU or one with a clear rail distribution, so that it can feed the 24-pin and 8-pin connectors healthily. There are even 1000W+ PSUs out there where most of the juice is reserved for PCI-E connectors for video cards, etc... One blatant example is the TT Toughpower Grand series...

Btw, it's a very nice setup

Core32 · Jun 12, 2012

tear said:
Wouldn't overcurrent protection kick in before you observe voltage drop?

Not necessarily if the wiring/interconnects from the PSU are contributing to the drop.

bowlinra · Jun 12, 2012

Core32 said:
Not necessarily if the wiring/interconnects from the PSU are contributing to the drop.

I thought I had provided a good, detailed response... Thanks for all the help, folks.

To be more precise, the PSU was purchased in Nov 2009. The box claims it is "Delivering 700W of maximum output and utilizing a triple +12V rail design, the ModXStream Pro allocates power where your system needs it most."

I still have the manual, and below are the Electrical Specifications. I'm not smart enough to know whether the 6 output modules on the PSU map directly to the 6 outputs listed, but I assume the hardwired cables have to pull power from these as well. It sounds like, at a minimum, I need to connect another modular Molex cable and bridge two modular cable connections to feed my 8-pin CPU connector. And start looking for deals on PSUs; recommendations? The current fleet consists of a Corsair 1200AX, a Raidmax RX-1200AE, and a Corsair TX750W. I could swap in the Corsair TX750W, though it only has one CPU connector; the other PSUs are for GPU folders.


Linden · Jun 13, 2012

I'm not saying you have a poor PSU, just that it is possible the PSU does not, or no longer does, provide the consistently stable power necessary for overclocking.

Just because a PSU is rated for a certain amount of power does not mean that the PSU will provide clean, consistent power at all times. There are many, many factors in power delivery; the maximum wattage rating is just one of them.

bowlinra said:
the ModXStream Pro allocates power where your system needs it most

Maybe true, but remember, marketing people are highly skilled in saying nothing very impressively.

Look, it could be that your PSU is not a factor at all in limiting your system's overclocking, but it is one item that should be on the troubleshooting list.

bowlinra · Jun 13, 2012

Linden said:
I would test the 12v rail under load at the ATX plug with a multimeter. If that's an older ModXStream PSU, I wouldn't feel too confident without testing. (speaking from experience, here) Chances are, it's just fine, but I would check.

Linden, I appreciate your help. Do we have a guide or a video on how to do this test? I have a multimeter, with hopes of one day understanding how to use it.

bowlinra · Jun 13, 2012

Linden said:
Maybe true, but remember, marketing people are highly skilled in saying nothing very impressively.

Look, it could be that your PSU is not a factor at all in limiting your system's overclocking, but it is one item that should be on the troubleshooting list.

I completely agree with you...

I just found the marketing message on the box ironic and included it for the group's amusement, along with the mention of a third rail in case that changed anything.

Nathan_P · Jun 13, 2012

Some people have reported problems with the Molex adapters. There is an adapter you can buy that converts an 8-pin PCIe cable into an 8-pin EPS CPU cable. That could be worth trying, as they are inexpensive.

theGryphon · Jun 13, 2012

It's clear from here (http://www.ocztechnology.com/res/manuals/OCZMXSP500-700.pdf) that the non-modular cables are 12V1 and all modular cables are 12V2, each rail providing 25A max, and the combined limit is 46A x 12V = 552W. That is pretty low for a 700W PSU, and THAT is probably causing the problem. 552W may be just too close to what your system is trying to draw, and don't forget that aging components actually reduce the capability...

HardOCP had some issue with this unit as well:

Linden · Jun 13, 2012

bowlinra said:
Do we have a guide or a video on how to do this test? I have a multimeter, with hopes of one day understanding how to use it.

here
here

If those two guides aren't descriptive enough, an Internet search will bring up plenty more. If you feel uncomfortable testing - normal, fear of harming electronics or zapping yourself - practice with the computer powered down. When you perform the readings, have your computer running at full load. Testing at the ATX plug is often more accurate than testing at a Molex connector, although testing at the Molex is certainly easier.

bowlinra · Jun 14, 2012

theGryphon said:
It's clear from here (http://www.ocztechnology.com/res/manuals/OCZMXSP500-700.pdf) that the non-modular cables are 12V1 and all modular cables are 12V2, each rail providing 25A max, and the combined limit is 46A x 12V = 552W. That is pretty low for a 700W PSU, and THAT is probably causing the problem. 552W may be just too close to what your system is trying to draw, and don't forget that aging components actually reduce the capability...

That's pretty interesting. I think I'm going to get a bigger PSU just to eliminate the issue, and keep an eye out for a deal on an 800W or 1000W unit.

bowlinra · Jun 14, 2012

Linden said:
When you perform the readings, have your computer running at full load. Testing at the ATX plug is often more accurate than testing at a Molex connector, although testing at the Molex is certainly easier.

Linden, Good links, had no problems. Thanks

Here are my readings while folding:

Code:

3.3-volt rail spec range  3.17 to  3.43 : My 3.36 - 3.38
  5-volt rail spec range  4.80 to  5.20 : My 5.11
 12-volt rail spec range 11.52 to 12.48 : My 11.96 - 12.00

I think I'm OK on the voltage side. theGryphon has an interesting point that I may not be able to test. My Kill-A-Watt meter shows a pull of 440W - 460W vs a potential limit of 552W. Would I see an impact on the voltages if the wattage were close to or hitting the limit? Or is this a non-issue?
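
The readings in this post can be checked against the quoted spec ranges mechanically. This is a small sketch using only the numbers posted above; the dictionary layout and the function name are illustrative.

```python
# Spec ranges and measured (min, max) readings, in volts, as posted above
SPEC_RANGES = {"3.3V": (3.17, 3.43), "5V": (4.80, 5.20), "12V": (11.52, 12.48)}
READINGS = {"3.3V": (3.36, 3.38), "5V": (5.11, 5.11), "12V": (11.96, 12.00)}


def rails_within_spec(spec, readings):
    """Map each rail to True if its whole measured range sits inside the spec."""
    result = {}
    for rail, (spec_low, spec_high) in spec.items():
        seen_low, seen_high = readings[rail]
        result[rail] = spec_low <= seen_low and seen_high <= spec_high
    return result


for rail, ok in rails_within_spec(SPEC_RANGES, READINGS).items():
    print(rail, "within spec" if ok else "OUT OF SPEC")
```

A multimeter reading like this only captures the average voltage under load; it would not show short dips or a protection circuit tripping.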

Linden · Jun 14, 2012

The 3, 5, and 12v rails look good.

bowlinra said:
My Kill-A-Watt meter shows a pull of 440W - 460W vs a potential limit of 552W. Would I see an impact on the volts?

That should be enough headroom that there wouldn't be a problem.

theGryphon · Jun 14, 2012

bowlinra said:
Linden, Good links, had no problems. Thanks

Here are my readings while folding:

Code:

3.3-volt rail spec range  3.17 to  3.43 : My 3.36 - 3.38
  5-volt rail spec range  4.80 to  5.20 : My 5.11
 12-volt rail spec range 11.52 to 12.48 : My 11.96 - 12.00

I think I'm OK on the voltage side. theGryphon has an interesting point that I may not be able to test. My Kill-A-Watt meter shows a pull of 440W - 460W vs a potential limit of 552W. Would I see an impact on the voltages if the wattage were close to or hitting the limit? Or is this a non-issue?

Yeah, agreed with Linden: with those Kill-A-Watt numbers, 552W shouldn't be a problem. Don't forget, 460W is what you're drawing from the wall, so the PSU is actually providing 460 x 0.82 = 377W, assuming 82% efficiency. So, technically, 377W should be compared to 552W, and you should be fine.

My only remaining concern is whether the rails are unable to share the juice. It's got 25A on each rail, which leaves 25A to the mobo and CPUs, which is only 300W. The PSU may have over-current protection such that neither rail can go over a certain amperage. If that cut-off point is less than 32A, then there you have your problem, because 32A is 32A x 12V = 384W...
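
The arithmetic in this post can be laid out as a short calculation. This is a rough sketch using the figures quoted in the thread (46A combined and 25A per rail on +12V, 460W at the wall, an assumed 82% efficiency). It treats the whole DC load as if it sat on +12V, which overstates the 12V draw somewhat, and the function names are illustrative.

```python
RAIL_VOLTS = 12.0


def rail_watts(amps, volts=RAIL_VOLTS):
    """Power available from a rail at the given current limit."""
    return amps * volts


def dc_watts(wall_watts, efficiency):
    """DC power delivered by the PSU: wall draw times efficiency."""
    return wall_watts * efficiency


combined_limit = rail_watts(46)      # both +12V rails together
single_rail_limit = rail_watts(25)   # one +12V rail on its own
dc_load = dc_watts(460, 0.82)        # Kill-A-Watt peak, ASSUMED 82% efficiency
amps_if_one_rail = dc_load / RAIL_VOLTS

print(f"combined +12V limit: {combined_limit:.0f} W")
print(f"single rail limit:   {single_rail_limit:.0f} W")
print(f"estimated DC load:   {dc_load:.0f} W")
print(f"amps if the load landed on one rail: {amps_if_one_rail:.1f} A")
```

The estimated load sits well under the combined limit but above what a single 25A rail can supply, which is why the way the connectors are split across 12V1 and 12V2 matters.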

tear · Jun 14, 2012

If load on the rails is balanced.

I still recommend:
1. Making sure that one 8-pin EPS comes from 12V1 (non-modular) and other 8-pin EPS comes from 12V2 (modular) -- if this can't be satisfied all bets are off.
2. Testing with better suited PSU

theGryphon · Jun 14, 2012

tear said:
If load on the rails is balanced.

I still recommend:
1. Making sure that one 8-pin EPS comes from 12V1 (non-modular) and other 8-pin EPS comes from 12V2 (modular) -- if this can't be satisfied all bets are off.
2. Testing with better suited PSU

Yep, #1 would definitely help, and may solve the potential problem of over-current protection I put above.
