Supermicro H8QGi/6 and H8QGL Next Generation OC BIOS

Status
Not open for further replies.
Correct. The Supermicro G34 4P boards allow 1, 2, 3, or 4 CPU configurations. Previous generation Opteron boards support 1, 2, or 4 CPU configurations.
 
Thanks. That makes getting started a little easier on the bank. Now I just need to decide which 12-core Magny-Cours I wanna go with. :eek:
 
Shouldn't each cpu have all 4 ram sockets populated to get the max performance out of it?
 
This ^....
Others with more direct knowledge will probably correct me, but I believe you will see much better performance, if the opening budget will allow it, to have 4 channel memory. Start with the recommended 4 x 2GB kit that a lot of us have used. It can be bought for around $60. Then add this same kit with every CPU you add.

 
Well that's some good info, I was under the impression that each CPU only needed 2GB of RAM and that it was best to have at least Dual channel running on it instead of a single 2GB module.
 
I can run one CPU at a time, correct? So I can get the motherboard, 1GB module of DDR3 1333 CL7 (or should each CPU have 2x1GB DDR3 1333 CL7?), 850W PSU (true 850W PSU from like Antec, Corsair, Seasonic) and I'll be good to go with one CPU and add RAM/CPU as I go?
It will run and you will be able to do regular SMP. You'll need a 2nd cpu to do bigadv. And running 1 DIMM will slow things down a bunch.

Depending on the CPUs you get and how much you OC, an 850 watt PSU may be borderline. I have a Corsair AX1200, and running my 4 CPUs and 16 DIMMs at 3.0GHz it's using over 900 watts.
 
Which CPUs are you using?
 
Yes, each CPU needs 4 sticks of ram to be fully utilized. If you want more information, please create a new thread. This thread is intended for the discussion of the OC BIOS. We would be glad to help you in a new thread.
 
My 4x6180SE @ 2.875 pull a bit under 800w (normally around 780w). I'm using an AX-1200 on that system. My 4x6180SE @ 2.8125 pulls a bit less (~750w?). I've got an AX-850 on that system.

If you are using 4x6174, I doubt you will see much over 700w, if that. With a 12.5% OC, I think mine was pulling about 680w.
 
I've been having some local power company/weather issues and have had to shutdown the rigs a few times over the last couple of days.
I think this has always been the case, but the OC value does not hold through a power cycle, shutdown/reboot or a restart.
Is that expected behavior or is it just me? :eek:

 
Works fine here.

What makes you think it loses the OC? Can you show expected and not expected output
of 'dmesg | grep -o Detected.*' ?

Do you lose any other BIOS settings as well? Or just the OC?
 
Just the OC, apparently.
When I cycle, using any of the methods I list above, once booted I run the 'dmesg | grep -o Detected.*'
command and it always reports back to "1800.xxx MHz processor."
I started checking this because invariably when I rebooted I noticed a big increase in the TPFs.
That's when I noticed the fall back in frequency.
I have two rigs and they both act the same way. Both using 6166HE CPUs.
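
A minimal sketch of the check described above, assuming the kernel log still holds its "Detected ... MHz processor" line. The function names `detected_mhz` and `oc_state` and the 5 MHz tolerance are made up for illustration; 1800 is the stock figure quoted in this thread for the 6166HE.

```shell
# Extract the boot-time CPU frequency (MHz) from kernel log text on stdin.
# Prints nothing if no "Detected ... MHz processor" line is found.
detected_mhz() {
    sed -n 's/.*Detected \([0-9.][0-9.]*\) MHz processor.*/\1/p' | head -n 1
}

# Report whether a measured frequency is above the stock frequency.
# $1 = measured MHz, $2 = stock MHz, $3 = tolerance in MHz (default 5, an assumption)
oc_state() {
    awk -v m="$1" -v s="$2" -v t="${3:-5}" 'BEGIN {
        if (m > s + t) print "overclocked"; else print "stock"
    }'
}

# Typical use on the rig:
#   mhz=$(dmesg | grep -o 'Detected.*' | detected_mhz)
#   oc_state "$mhz" 1800
```

Quoting the pattern as 'Detected.*' keeps the shell from expanding it as a glob if a matching file name happens to exist in the current directory.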

 
tear's OC script is persistent through reboots, but any TPC commands you're using to change voltages etc. I think you have to redo each time you reboot.
 
Then I am doing something wrong with the OC setup command I suppose.
Yes, on the voltages. I have added the vcore set command to my rc.local file and it is set properly on boot.
 
Any ideas on what/where I can check?
As far as I can tell I have entered the command correctly and it changes the clock to the frequency expected for that cycle of power.
I used the optimal defaults setting in CMOS after flashing the BIOS.
Could I be running some command that wipes out the value?
Thanks for any help.


 
How about some suggestions on a better method to test this, other than just cycle power?
Thanks.
 
Hey, Core32.

I can think of two tests you can run.

First test:
1. Boot the machine and make sure that OC is enabled and FAH isn't running
2. Immediately after bootup run and capture results of:
Code:
modprobe nvram ; hexdump -vC /dev/nvram
3. Wait whatever amount of time you need to wait to make the issue manifest
  but do not power-cycle the box.
4. Run commands from step #2 again and capture results as well
5. Post both outputs
6. Power-cycle (by means of 'poweroff' command or 'shutdown -P now')
7. Wait a few moments and power the machine back on again
8. See if OC persisted.


Second test:
1. Boot the machine and make sure that OC is enabled and FAH isn't running
2. Run: 'sync' and, once it returns, immediately pull the plug (physically)
3. Wait a few moments and power the machine back on again
4. See if OC persisted.
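
The capture-and-compare part of the first test could be scripted roughly like this. It is only a sketch: the function names and the `nvram-<label>.txt` file naming are assumptions, and reading /dev/nvram needs root and the nvram module loaded.

```shell
# Save a labelled hexdump of the NVRAM device (or any file) for later comparison.
# $1 = label (e.g. "after-boot"), $2 = source, defaults to /dev/nvram
snapshot_nvram() {
    hexdump -vC "${2:-/dev/nvram}" > "nvram-$1.txt"
}

# Compare two labelled snapshots; prints the differing dump lines, if any.
compare_snapshots() {
    if diff "nvram-$1.txt" "nvram-$2.txt"; then
        echo "no change between $1 and $2"
    fi
}

# Steps 2 to 4 of the first test would then be (as root):
#   modprobe nvram
#   snapshot_nvram after-boot
#   ... wait ...
#   snapshot_nvram before-poweroff
#   compare_snapshots after-boot before-poweroff
```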
 
Thanks very much for the response.
I will run these tests today.
A question: Is there a method/command to verify the OC is enabled without dmesg?
I noticed that once you clear dmesg, obviously running that command again yields nothing. :D

 
Now there is :)

Cool. That works. It appears to show the OC value entered, not the actual frequency?
That's what I wanted to see anyway.

Starting the tests now but, looking at your test sequence, maybe I've been confusing in explaining the issue.
The OC never goes away, I don't believe, while the systems are running.
It only seems to get cleared when I restart the system.
 
Test # 1:
Immediately before power down command:
-H8QG6:~/fah$ dmesg | grep -o Detected.*
Detected 2340.016 MHz processor.
-H8QG6:~/fah$ sudo ../refclock
Refclock: 259.985 MHz
-H8QG6:~/fah$ sudo modprobe nvram
-H8QG6:~/fah$ sudo hexdump -vC /dev/nvram
00000000 00 00 00 30 f0 30 0e 80 02 ff ff 2f 00 ff 3f 10 |...0.0...../..?.|
00000010 00 00 3f 00 00 00 00 00 00 00 00 30 47 47 47 47 |..?........0GGGG|
00000020 06 e6 ff ff 20 bf bf 7e 0d 00 00 00 00 00 00 00 |.... ..~........|
00000030 1e 69 00 00 00 00 00 00 00 00 00 00 00 00 30 03 |.i............0.|
00000040 00 01 00 40 ee f9 00 00 00 00 00 00 00 00 00 00 |...@............|
00000050 00 80 ae 80 00 00 1f 1f ff 00 02 00 00 00 03 21 |...............!|
00000060 54 00 00 03 76 98 00 00 ba 10 32 54 76 98 d9 18 |T...v.....2Tv...|
00000070 ba 00 |..|
00000072

Immediately after power on:
-H8QG6:~/fah$ dmesg | grep -o Detected.*
Detected 1799.904 MHz processor.
-H8QG6:~/fah$ sudo ../refclock
Refclock: 200.000 MHz
-H8QG6:~/fah$ sudo modprobe nvram
-H8QG6:~/fah$ sudo hexdump -vC /dev/nvram
00000000 00 00 00 30 f0 30 0e 80 02 ff ff 2f 00 ff 3f 10 |...0.0...../..?.|
00000010 00 00 3f 00 00 00 00 00 00 00 00 30 47 47 47 47 |..?........0GGGG|
00000020 06 e6 ff ff 20 bf bf 7e 0d 00 00 00 00 00 00 00 |.... ..~........|
00000030 1e 69 00 00 00 00 00 00 00 00 00 00 00 00 30 03 |.i............0.|
00000040 00 01 00 40 ee f9 00 00 00 00 00 00 00 00 00 00 |...@............|
00000050 00 80 ae 80 00 00 1f 1f ff 00 02 00 00 00 03 21 |...............!|
00000060 54 00 00 03 76 98 00 00 ba 10 32 54 76 98 d9 18 |T...v.....2Tv...|
00000070 ba 00 |..|
00000072

As you can see, the OC did not persist. Running test 2 now.
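
For what it's worth, both captures above are consistent with the core clock being the reference clock times a fixed multiplier. A small sketch of that arithmetic (the function name `multiplier` is just for illustration):

```shell
# Ratio of detected core clock to reference clock, rounded to two decimals.
# $1 = detected core MHz (from dmesg), $2 = reference clock MHz (from refclock)
multiplier() {
    awk -v core="$1" -v ref="$2" 'BEGIN { printf "%.2f\n", core / ref }'
}

multiplier 2340.016 259.985   # before power-off
multiplier 1799.904 200.000   # after power-on
```

Both calls print 9.00, so only the reference clock changed across the power cycle (259.985 MHz back to 200.000 MHz), not the multiplier.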
 
Something seems to be writing to the NVRAM area used by OCNG. Odd, because per the
BIOS it should be unused (and it isn't used in all other configurations we've seen
to date).

Can you do one more test? Run your usual smocng.sh with your frequency, then
capture a hexdump immediately after running it. No need to power-cycle.

BTW, using CODE tag instead of QUOTE tag makes the forum use monospace font
-- much easier to read :)
 
I'm setting up a new 4p system and am planning on trying this out.

I'm new to folding though, and was wondering how to save and run a benchmark WU (6903). I didn't realize that was possible, and on my last setup I ended up not completing 2 WUs (I also didn't know the client wouldn't re-sync with a previous WU if re-installed). I of course don't want to tune on a live WU.

Thanks in advance (for this and all the other info I've gathered from this site).
 
Welcome, and we are glad that we could help! Just remember to type in team 33 when it asks for a team number ;)
 
Well, you can save the WU by saving the contents of the /work directory, but they have deadlines. So if you don't work fast, the client will drop the WU once the deadline passes.
 
Still learning :)
Hexdump right after smocng command:
Code:
-H8QG6:~/fah$ sudo ../smocng.sh 261 16
Success!
To ensure proper application, POWER-OFF the machine, then power it back on again.
-H8QG6:~/fah$ sudo modprobe nvram
-H8QG6:~/fah$ sudo hexdump -vC /dev/nvram
00000000  00 00 00 30 f0 30 0e 80  02 ff ff 2f 00 3d 10 05  |...0.0...../.=..|
00000010  00 00 f6 00 00 00 00 00  00 00 00 30 47 47 47 47  |...........0GGGG|
00000020  06 a1 ff ff 20 bf bf 7e  0d 00 00 00 00 00 00 00  |.... ..~........|
00000030  1d d9 00 00 00 00 00 00  00 00 00 00 00 00 30 03  |..............0.|
00000040  00 01 00 40 ee f9 00 00  00 00 00 00 00 00 00 00  |...@............|
00000050  00 80 ae 80 00 00 1f 1f  ff 00 02 00 00 00 03 21  |...............!|
00000060  54 00 00 03 76 98 00 00  ba 10 32 54 76 98 d9 18  |T...v.....2Tv...|
00000070  ba 00                                             |..|
00000072
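
To see exactly which bytes differ between two of these dumps, something like the following could be used. It is a sketch only: `nvram_byte_diff` is a made-up helper that expects two files holding `hexdump -vC` output, and it says nothing about what the changed bytes mean.

```shell
# List the bytes that differ between two saved "hexdump -vC" outputs.
# $1 = older dump file, $2 = newer dump file
nvram_byte_diff() {
    awk '
    # Convert a hex string to a number (plain awk has no strtonum).
    function hex(s,    i, n) {
        n = 0; s = tolower(s)
        for (i = 1; i <= length(s); i++)
            n = n * 16 + index("0123456789abcdef", substr(s, i, 1)) - 1
        return n
    }
    # Dump lines start with an 8-digit hex offset; prompts and other text are skipped.
    length($1) == 8 && $1 ~ /^[0-9a-fA-F]+$/ {
        base = hex($1)
        # Byte columns run until the ASCII column, which starts with "|".
        for (i = 2; i <= NF && substr($i, 1, 1) != "|"; i++) {
            off = base + i - 2
            if (FNR == NR)
                old[off] = $i
            else if ((off in old) && old[off] != $i)
                printf "0x%02x: %s -> %s\n", off, old[off], $i
        }
    }' "$1" "$2"
}

# Usage, assuming the two dumps were saved to these (hypothetical) files:
#   nvram_byte_diff after-power-on.txt after-smocng.txt
```

Fed the dump from Test # 1 and the dump taken right after `smocng.sh 261 16`, it reports seven changed bytes: 0x0d through 0x0f, 0x12, 0x21, 0x30 and 0x31.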
 
Create a copy of client directory.
From that directory, run ./fah6 -configonly
Enable proxy on some obscure port (e.g. 1234) to prevent network communications.
When client asks about system clock having errors, answer Yes (will prevent
client from deleting the unit when its deadline passes).
EDIT: When client asks about machine ID, set it to a number you don't normally use.
From work/ directory, delete all files but wudata_XX.dat (XX being current slot number).

Zip it up and keep in safe place.

In detail, it could look like this:
Code:
cd ~
cp -a fah fah-test
cd fah-test
./fah6 -configonly
# Go through configuration and enable proxy on some obscure port, like 1234,
# - when client asks whether to use proxy, answer Yes
# - when client asks about Proxy name, answer 127.0.0.1
# - when client asks about Proxy port, answer 1234
# - when client asks to change advanced settings, answer Yes
# - when client asks about system clock having errors, answer Yes (will prevent
#   client from deleting the unit when its deadline passes).
# - when client asks about machine ID, set it to a number you don't normally
#   use (e.g. 10)
# Once client configuration is done, continue with the following steps
cd work
mv wudata_*.dat ..
rm *
mv ../wudata_*.dat .
cd ~
tar -czf fah-test.tar.gz fah-test

When testing, use clean client copy before each test, that is, delete fah-test and untar afresh.
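
The "delete and untar afresh" step could be wrapped in a small helper, for example as below. This is a sketch: `reset_test_client` is a made-up name, and it assumes the archive was created as shown above so that it unpacks into a top-level fah-test directory.

```shell
# Recreate a clean test client from the fah-test.tar.gz archive.
# $1 = directory holding the archive and the extracted copy (defaults to $HOME)
reset_test_client() {
    dir="${1:-$HOME}"
    # Refuse to delete anything unless the archive is actually there.
    [ -f "$dir/fah-test.tar.gz" ] || { echo "no archive in $dir" >&2; return 1; }
    rm -rf "$dir/fah-test"
    tar -xzf "$dir/fah-test.tar.gz" -C "$dir"
}

# Before each benchmark run:
#   reset_test_client && cd ~/fah-test
```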
 
Core32, you did load Optimal Defaults in the BIOS, right? Did you change any other options?
I'm thinking some customization could be responsible for this....

Anyway, here's one more test:
- make sure memtest86 is installed
- run smocng.sh with your usual parameters
- power the machine off
- wait a while, then power it back on; however, do not let it boot; interrupt it in GRUB's (bootloader) menu and start memtest86 instead
- record CPU frequency value from memtest86 (base: 1800)
- power the machine off
- wait a while, then power it back on and start memtest again
- record CPU frequency value again and compare with the previous one

The idea of this test is isolating the issue to either Linux or BIOS.
 
Yes, loaded optimal defaults, but only after the initial [H] BIOS FLASH or after having to clear CMOS from a non-post. I don't normally re-do this after just changing the OC.

The only other parameter I changed was the fan profile to Balanced from Full.
I have also set optimal defaults and not changed the fan profile. Baby those stock fans do sing! :)
Thanks again. I'll run the test this evening.
 
These recommendations apply only to non-OC BIOSes (e.g. stock SM or Tyan).

OC BIOS' optimal defaults already set them to proper values so typically there's nothing
else one needs to do after loading and saving optimal defaults.
 
My assumption was that with the OC BIOS the optimal defaults set all those automatically.

Edit: I should refresh the page occasionally I guess :eek:
 
First reboot into GRUB/memtest86 after issuing the sudo ./smocng.sh 243 16 command: 2187MHz
Direct reboot into Grub/memtest86: 1800MHz

Looks like Linux is not involved.
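
The first reading matches what you'd expect if the core multiplier stays at stock and only the reference clock moves. A quick sketch of that calculation, assuming the stock values seen in this thread (1800 MHz core at a 200 MHz reference clock); `expected_core_mhz` is a made-up name:

```shell
# Expected core clock in MHz for a given reference clock, assuming the core
# multiplier stays at its stock value (stock core MHz / stock refclock MHz).
# $1 = new refclock MHz, $2 = stock core MHz (default 1800), $3 = stock refclock MHz (default 200)
expected_core_mhz() {
    awk -v ref="$1" -v core="${2:-1800}" -v base="${3:-200}" \
        'BEGIN { printf "%.0f\n", ref * core / base }'
}

expected_core_mhz 243   # compare with the first memtest86 reading
```

This prints 2187, the value memtest86 showed on the first boot; the 1800 on the second boot is the stock clock again.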
 