Supermicro H8QGi/6 and H8QGL Next Generation OC BIOS

Discussion in 'DC Guides' started by tear, Mar 3, 2012.

Thread Status:
Not open for further replies.
  1. Posidon42

    Posidon42 Limp Gawd

    Messages:
    146
    Joined:
    Dec 20, 2005
    That is a lot of retries. Something seems awry.
     
  2. tear

    tear [H]ard|DCer of the Year 2011

    Messages:
    1,567
    Joined:
    Jul 25, 2011
    Yeah... Node 5/Link 1/Sublink 0 and Node 6/Link 2/Sublink 0 have one thing in common;
    they both connect to CPU1 (just making an observation).

    In any case, dwdawg, you should first check how quickly they're rising. Run the script,
    wait 5-10 seconds, then re-run it. Post results afterwards.

    I've got a few ideas but let's see what we're dealing with...
     
  3. theGryphon

    theGryphon [H]ard|Gawd

    Messages:
    1,262
    Joined:
    Nov 21, 2011
    tear, I totally agree, it must have been a WU fluke.
    But forget about that, I've got good news: successfully booted with 233 16!!! Woohooo! :D

    Isn't it something that 232 didn't post but 233 did? I had similar things happening with Intel systems but I wasn't very hopeful to be honest. This is great! It's possible I can push further :)
    Running fah now, let's see how it goes till morning.
     
  4. tear

    tear [H]ard|DCer of the Year 2011

    Messages:
    1,567
    Joined:
    Jul 25, 2011
    I'd probably ask you whether 233 boots reliably... (power-cycle ten times -- will it POST 10 out of 10?).

    Whatever it is, it has to do with HT, it's been the nemesis of this endeavor...

    HT issues are on the investigation list so they will (hopefully) be dealt with (tentative ETA is a few weeks from now).
     
  5. tear

    tear [H]ard|DCer of the Year 2011

    Messages:
    1,567
    Joined:
    Jul 25, 2011
    You may be onto something here, mister. My non-232-booting config does boot at 233 :eek:
    I'm sensing a productive weekend ahead...

    And by the way, F9 indeed loads optimal defaults. Please pardon my distrust :)
     
    Last edited: Mar 28, 2012
  6. Vaulter98c

    Vaulter98c [H]ard|DCer of the Month - October 2009

    Messages:
    5,097
    Joined:
    May 21, 2008
    Just wanted to say thanks again tear for all you are doing, this is some amazing stuff
     
  7. dwdawg

    dwdawg [H]ard|DCer of the Month - January 2013

    Messages:
    737
    Joined:
    Feb 25, 2001
    Oh well, extra memory never hurts :)
    Think I need to reseat CPU 1?

    Node 0 0000 0000 0000 0000 001c 0000 0000 0000
    Node 1 0000 0000 0000 0000 0000 0000 0000 0000
    Node 2 0000 0000 0000 0000 0000 0000 0000 0000
    Node 3 0000 0000 0000 0000 0000 0000 0000 0000
    Node 4 0000 0000 0000 0000 0000 0000 0000 0000
    Node 5 0000 5a59 0000 0000 0000 0000 0000 0000
    Node 6 0000 0000 32ff 0000 0000 0000 0000 0000
    Node 7 0000 0000 0000 0000 0000 0000 0000 0000

    Node 0 0000 0000 0000 0000 001c 0000 0000 0000
    Node 1 0000 0000 0000 0000 0000 0000 0000 0000
    Node 2 0000 0000 0000 0000 0000 0000 0000 0000
    Node 3 0000 0000 0000 0000 0000 0000 0000 0000
    Node 4 0000 0000 0000 0000 0000 0000 0000 0000
    Node 5 0000 5a5b 0000 0000 0000 0000 0000 0000
    Node 6 0000 0000 338c 0000 0000 0000 0000 0000
    Node 7 0000 0000 0000 0000 0000 0000 0000 0000
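
A minimal sketch of how the two dumps above could be compared, assuming the fields are hexadecimal counters (values such as 5a59 suggest so). The function names are hypothetical and not part of the ht-retries script, and the time between the two runs is not stated in the post, so this reports raw increases rather than a rate.

```python
FIRST = """
Node 0 0000 0000 0000 0000 001c 0000 0000 0000
Node 1 0000 0000 0000 0000 0000 0000 0000 0000
Node 2 0000 0000 0000 0000 0000 0000 0000 0000
Node 3 0000 0000 0000 0000 0000 0000 0000 0000
Node 4 0000 0000 0000 0000 0000 0000 0000 0000
Node 5 0000 5a59 0000 0000 0000 0000 0000 0000
Node 6 0000 0000 32ff 0000 0000 0000 0000 0000
Node 7 0000 0000 0000 0000 0000 0000 0000 0000
"""

SECOND = """
Node 0 0000 0000 0000 0000 001c 0000 0000 0000
Node 1 0000 0000 0000 0000 0000 0000 0000 0000
Node 2 0000 0000 0000 0000 0000 0000 0000 0000
Node 3 0000 0000 0000 0000 0000 0000 0000 0000
Node 4 0000 0000 0000 0000 0000 0000 0000 0000
Node 5 0000 5a5b 0000 0000 0000 0000 0000 0000
Node 6 0000 0000 338c 0000 0000 0000 0000 0000
Node 7 0000 0000 0000 0000 0000 0000 0000 0000
"""


def parse_dump(text):
    """Parse lines like 'Node 5 0000 5a59 ...' into {node: [counter, ...]}."""
    counters = {}
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0] != "Node":
            continue  # skip anything that is not a node row
        counters[int(parts[1])] = [int(field, 16) for field in parts[2:]]
    return counters


def rising_counters(before, after):
    """Return (node, column, increase) for every counter that went up."""
    changes = []
    for node in sorted(before):
        pairs = zip(before[node], after.get(node, []))
        for column, (old, new) in enumerate(pairs):
            if new > old:
                changes.append((node, column, new - old))
    return changes


for node, column, increase in rising_counters(parse_dump(FIRST), parse_dump(SECOND)):
    print(f"Node {node}, column {column}: +{increase} retries")
```

Columns are counted from zero, left to right. Mapping a column to a link/sublink pair is left out on purpose, since the dump layout is not documented in this thread.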
     
  8. musky

    musky [H]ard|DCer of the Year 2012

    Messages:
    3,135
    Joined:
    Dec 14, 2009
    It probably wouldn't be a bad idea to reseat CPU1. Make sure the contacts on the bottom of the chip are clean before you reseat it.
     
  9. theGryphon

    theGryphon [H]ard|Gawd

    Messages:
    1,262
    Joined:
    Nov 21, 2011
    Haha, I'm glad I tapped into something new! :D
    My 6166s folded just fine through the night with 233 16. Stock voltages.

    The thing is, the same thing happened with 231 16: the first time it posted (after a power-cycle) and I started the client, the TPF was abnormally high at 16:37. Then I power-cycled again, and the second time around the TPF went down to an expected level at 14:52 (although it is a bit higher than what I saw with 231, I did not test it on the same frames, so a direct comparison is not valid).

    So, what I'm saying is, with higher refclocks (>230) somehow a second power-cycle is helping. I don't know why, but this has been my experience with 231 and 233.
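
To put the two TPF figures quoted above side by side, here is a small sketch that converts mm:ss to seconds and expresses the gap as a percentage. The helper names are made up for illustration, and since the frames were not the same, the result is only a rough indication.

```python
def tpf_seconds(tpf):
    """Convert a time-per-frame string like '16:37' to seconds."""
    minutes, seconds = tpf.split(":")
    return int(minutes) * 60 + int(seconds)


def slowdown_percent(slow, fast):
    """How much longer the slow TPF is, relative to the fast one, in percent."""
    return (tpf_seconds(slow) / tpf_seconds(fast) - 1) * 100


print(f"16:37 vs 14:52: {slowdown_percent('16:37', '14:52'):.1f}% longer per frame")
```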

    I'm going to power-cycle it a few more times for ya after this WU finishes ;)

    ht-retries has been just zeroes all along. From what I understand ht-retries is giving you the cumulative number of retries, right? So, if I see zeroes now, it means there were no retries over the night, correct?
     
  10. firedfly

    firedfly [H]ard|DCer of the Month - February 2012

    Messages:
    343
    Joined:
    May 17, 2011
    That is correct. The retries are cumulative since power on. If you reboot the system, the counter will not reset. It is only after a power cycle (power off, power on) that the counters are reset.
     
  11. Grandpa_01

    Grandpa_01 [H]ard|DCer of the Year 2013

    Messages:
    1,157
    Joined:
    Jun 4, 2011
    I have observed the same behavior as far as reboots / power cycles go. At higher settings, say 237 or so on 6174s, if I power-cycle the rig and check retries after it comes back up, the TPF will be slower if there are any. If I keep power-cycling until there are none (and sooner or later there will be none), the TPF drops and the rig runs with no problems. Also, over time and with each WU the TPF appears to increase until I do a power cycle, and then it drops again. (Possible memory issue?) These are just observations; in no way have they been controlled tests, and they could be pure coincidence, but it might be worth others watching to see whether they get the same or similar behavior.
     
  12. theGryphon

    theGryphon [H]ard|Gawd

    Messages:
    1,262
    Joined:
    Nov 21, 2011
    Thanks firedfly!
    Interesting, Grandpa. I don't like the idea of losing TPF over time and I'll be sure to check on that.

    tear, I'm sorry, I won't be able to power-cycle with 233 anymore. How about I give you 235? :p
    Yep, just posted 235 16, and folding away :D
     
  13. Core32

    Core32 [H]ard|Gawd

    Messages:
    1,053
    Joined:
    Mar 3, 2012
    About to finish up my 6166HE 4P install and saw this post.
    I remember reading something about the second parameter in the command, in this case the "16". What does that represent?
    And I previously saw a mention that 6166HEs can mod the voltage. What is the command for controlling that if I may ask?
    Thanks.
     
  14. theGryphon

    theGryphon [H]ard|Gawd

    Messages:
    1,262
    Joined:
    Nov 21, 2011
    I'm just a student here really, but I can tell you that 16 controls the memory base timings, and it should be used only if your memory has an XMP 1600 profile. Even so, it is not supported (per tear's instructions), and it's very much experimental from what I gather. I'm taking the risk :)

    The voltage control is done using TPC. There are three commands to issue in a row, but I'm such a noob in this, I'll let tear or another experienced user guide you.
     
  15. tear

    tear [H]ard|DCer of the Year 2011

    Messages:
    1,567
    Joined:
    Jul 25, 2011
    Yeah... but... but... I meant to do a couple more things before resorting to that.
    As it's critical that only one thing is changed at a time -- have you reseated it yet, dawg? :D
     
  16. tear

    tear [H]ard|DCer of the Year 2011

    Messages:
    1,567
    Joined:
    Jul 25, 2011
    I strongly suggest that you back up one WU (along with all client files, ofc) and keep it
    for benchmarking (make sure to enable "clock frequently has errors" option so that
    passed deadline won't cause the client to delete the WU).

    What you should also be doing is checking applied frequency after each boot
    (while experimenting) to either confirm or disprove the frequency misapplication theory.

    And, for symmetry, you should also start the bench WU several times (from scratch)
    "within" single boot ("inverse" of what you've tried so far).

    That is correct.
     
  17. tear

    tear [H]ard|DCer of the Year 2011

    Messages:
    1,567
    Joined:
    Jul 25, 2011
    Yes, well. I don't mind documenting some of the flags but I'm not sure whether
    this is the time or the place, to be honest. I don't want to contribute to existing confusion;
    will have to consult the elders on this one.

    Nor do I want this thread to become a TPC support forum.
    Check this post at AMDzone for instructions that should work
    with the most recent TPC version. For further inquiries please use the AMDzone forum
    (TPC has its thread there somewhere) or post in a new thread :D
     
  18. theGryphon

    theGryphon [H]ard|Gawd

    Messages:
    1,262
    Joined:
    Nov 21, 2011
    That frequency misapplication thing was not a theory or a claim, just a thought, and I don't think that's the case actually. Frequency has been reported correctly after every power-cycle, so I don't think there's any problem with the script or the BIOS. I think it has to do with how stressful the higher refclocks are for the system. Is there any way to diagnose this other than the ht retries?

    By the way, interestingly, I did not have any problem with 235. It's now folding at 14:32, which is around what I would expect (not a benchmark to be compared to previous numbers but it's not abnormal like 16:37, etc.).

    I'm very happy with how it's going. ht-retries is reporting full zeroes. Stock voltages. Temps 43-47C. :D

    I'm starting to think that my board is liking odd refclocks. It denied 232 but worked fine with 231, 233 and 235.
     
  19. tear

    tear [H]ard|DCer of the Year 2011

    Messages:
    1,567
    Joined:
    Jul 25, 2011
    Good. Then that's off the table. Per explanation in my post you should exclude GROMACS fluke
    as well...

    Nah. Anything 233+ would probably work. I'm looking into this.
     
  20. dwdawg

    dwdawg [H]ard|DCer of the Month - January 2013

    Messages:
    737
    Joined:
    Feb 25, 2001
    Nope. I was waiting on the word from master Yoda :D
    I figure the low overclock and frequency of retries might give you a good testbed to help figure out the other problems.

    Your humble (my wife disagrees with this) servant awaits.


     
  21. tear

    tear [H]ard|DCer of the Year 2011

    Messages:
    1,567
    Joined:
    Jul 25, 2011
    The spirits hath spoken! They're telling me... they're telling me 209 will not boot for you
    either :eek:
     
  22. tear

    tear [H]ard|DCer of the Year 2011

    Messages:
    1,567
    Joined:
    Jul 25, 2011
    Most excellent! Can you hop on IRC one evening? That would expedite the whole process.

    Certain disputes are meant not to ever take place... :D
     
  23. theGryphon

    theGryphon [H]ard|Gawd

    Messages:
    1,262
    Joined:
    Nov 21, 2011
    Who cares about 209? I've got 235 up and running :p
    You may ask the spirits about 241 though :D
     
  24. tear

    tear [H]ard|DCer of the Year 2011

    Messages:
    1,567
    Joined:
    Jul 25, 2011
    241 is a go. 242 is a no-go.

    List of frequencies to avoid (until new version comes out): 209, 232, 242, 262.
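
For anyone scripting their experiments, a minimal sketch of filtering candidate refclocks against the list above. The names are hypothetical. A value that is not on the list is only "not known to be a no-go for this BIOS version"; it says nothing about stability.

```python
# Refclocks reported in this thread as no-go until a new version comes out.
AVOID = {209, 232, 242, 262}


def candidates(start, stop, avoid=AVOID):
    """Return refclocks in [start, stop) that are not on the avoid list."""
    return [refclock for refclock in range(start, stop) if refclock not in avoid]


print(candidates(230, 244))
```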
     
  25. theGryphon

    theGryphon [H]ard|Gawd

    Messages:
    1,262
    Joined:
    Nov 21, 2011
    Wow, so you weren't kidding??! I really thought you were :eek:
    How did you find out? I mean, how does it work?
     
  26. dwdawg

    dwdawg [H]ard|DCer of the Month - January 2013

    Messages:
    737
    Joined:
    Feb 25, 2001
    I can. Tonight happens to be a good night.
    If I miss you, I'll keep trying.

    Yeah? Tell HER that! :p
     
  27. tjmagneto

    tjmagneto [H]ard DCOTM x2

    Messages:
    3,007
    Joined:
    Aug 6, 2008
    Tear has an endless supply of lab rats for testing BCLK. PETA should be knocking on the door anytime soon.
     
  28. 402blownstroker

    402blownstroker [H]ard|DCer of the Month - Nov. 2012

    Messages:
    3,156
    Joined:
    Jan 5, 2006
    LMFAO :D
     
  29. theGryphon

    theGryphon [H]ard|Gawd

    Messages:
    1,262
    Joined:
    Nov 21, 2011
    Reporting: Folding nicely at 235 since morning, zero ht retries. TPF is 14:29 on 6903.
     
  30. 4DoorGTZ

    4DoorGTZ Limp Gawd

    Messages:
    276
    Joined:
    Nov 20, 2005
    Well, I'm using 11.04 and I hope this is returning the NB speed: on a 6176 at 200 it says 1800 MHz, and all the way up to 261 it reported 2349 MHz (multiples of 9x ref clock). I'm still just playing around; had a no-boot at 262... It seems my cheap CL7 1066 that did 1333 on the stock BIOS only did 1066 at 200 and then downclocked to 800 for anything above that. Went back to my GSkill ECOs, and they do 1333 at 200 and 1066 for everything above that.
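
The NB figures in the post above fit a fixed 9x multiplier on the reference clock. A quick sketch of that arithmetic follows; the multiplier is inferred from the two reported readings, not taken from any documentation, and the function name is made up.

```python
NB_MULTIPLIER = 9  # inferred from 200 -> 1800 MHz and 261 -> 2349 MHz


def nb_clock_mhz(refclock_mhz, multiplier=NB_MULTIPLIER):
    """Expected northbridge clock for a given reference clock, in MHz."""
    return refclock_mhz * multiplier


for refclock in (200, 233, 241, 261):
    print(f"refclock {refclock} MHz -> NB {nb_clock_mhz(refclock)} MHz")
```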
     
  31. ctrlbrk

    ctrlbrk [H]Lite

    Messages:
    73
    Joined:
    Mar 6, 2011
    First, thank you tear for everything!

    Second, I noticed you guys are all running MC's, lots of 6174's etc. I am building a new 4p rig. I just want to double check before I buy the chips, that even with this bios the IL 6274's should be avoided and won't overclock like the MC 6174's?

    Mike
     
  32. 402blownstroker

    402blownstroker [H]ard|DCer of the Month - Nov. 2012

    Messages:
    3,156
    Joined:
    Jan 5, 2006
    I was just thinking.... has anyone been keeping track of what BCLK values work for which CPU/memory combos? I think this would be very useful info.

    Keith
     
  33. musky

    musky [H]ard|DCer of the Year 2012

    Messages:
    3,135
    Joined:
    Dec 14, 2009
    From the FAQ in post 3:

    Q: When will this have IL support?
    A: "When AMD makes a Family 15h CPU that doesn't suck" - tear

    So yes, avoid IL chips.
     
  34. Kendrak

    Kendrak [H]ard|DCer of the Year 2009

    Messages:
    22,871
    Joined:
    Aug 29, 2001
    The fact that almost no one on this team is running 62xx, even before OC was an option, should be a strong answer. Now with OCing as an option, 62xx isn't even really an option.

    61xx chips use less power, are better at folding, are cheaper when purchased used, and now can be OC-ed.

    62xx chips don't have much going for them in the folding world.
     
  35. ctrlbrk

    ctrlbrk [H]Lite

    Messages:
    73
    Joined:
    Mar 6, 2011
    Thanks, sorry I missed that -- I thought I read the entire thread twice :)

    Yes, I figured. But wanted to make sure. The 6274's are actually cheaper than the 6174's on ebay right now!

    Appreciate the quick responses guys.
     
  36. Jeanjean

    Jeanjean [H]Lite

    Messages:
    99
    Joined:
    Nov 22, 2011

    +1

    And is it useful to buy DDR3 1600 MHz instead of DDR3 1333 MHz?

    Can we hope for better performance?
     
  37. musky

    musky [H]ard|DCer of the Year 2012

    Messages:
    3,135
    Joined:
    Dec 14, 2009
    I'll repost this here also:

    To answer what I think you are actually asking, though: you aren't going to be able to run much over 1333 for memory speed with MC chips. The chips don't support it.
     
  38. 402blownstroker

    402blownstroker [H]ard|DCer of the Month - Nov. 2012

    Messages:
    3,156
    Joined:
    Jan 5, 2006
    At least with Supermicro boards it is better to get DDR3-1333 CL7 than DDR3-1600 memory.
     
  39. Jeanjean

    Jeanjean [H]Lite

    Messages:
    99
    Joined:
    Nov 22, 2011
    Thanks for your help . ;)
     
  40. theGryphon

    theGryphon [H]ard|Gawd

    Messages:
    1,262
    Joined:
    Nov 21, 2011
    I'm not sure if RAM plays any role in "what BCLK values work" but I'm running these Crucial sticks (http://www.newegg.com/Product/Product.aspx?Item=N82E16820148488) successfully so far. They're 1600 8-8-8, but have a 1333 7-7-7 XMP profile too. They're low profile and that's why I got them over the GSkills (I needed something to slide under my Noctua 92mm's).

    My rig folded nicely through the night at 241. No ht retries, temps are 43-47C. Average TPF is 14:10 on 6903. That's almost 500K on 6166HEs :D :knocks on wood:

    I'm not done searching for the upper limit yet, but this is just awesome already. Thank you again tear and all with a hand in this!
     