MC ES Setup Help

I forgot that freqcheck.sh needs a nominal frequency, so it couldn't meet your requirement.
You could get turbostat by:
Code:
wget http://stuff.mit.edu/afs/sipb/contrib/linux/tools/power/x86/turbostat/turbostat.c
cc turbostat.c -o turbostat
cp turbostat /bin
 
Thanks for passing that along.


Code:
[root@scatha ~]# ./turbostat
No invariant TSC

:confused:
 
Looks like turbostat doesn't support Opteron 6100.
OK, I have another small program for measuring the real CPU frequency (without TB). It should work for Opteron 6100:

Code:
#include <stdio.h>
#include <unistd.h>

/* Read the time-stamp counter: low 32 bits in EAX, high 32 bits in EDX. */
#define RDTSC(low,high) __asm__ __volatile__("rdtsc" : "=a" (low), "=d" (high))

unsigned long long rdtsc(void) {
        unsigned int low, high;
        RDTSC(low, high);
        return ((unsigned long long)high << 32) | low;
}

int main(void) {
        unsigned long long t1, t2;
        t1 = rdtsc();
        sleep(1);       /* ticks counted over one second ~ frequency in Hz */
        t2 = rdtsc();
        printf("The CPU frequency is: %lluHz\n", t2 - t1);
        return 0;
}

Save the code as myfreq.c, and then:
Code:
cc myfreq.c -o myfreq
./myfreq

Nice, that seems to do the trick.

Actually, I am wrong: either that is not working correctly or the frequency is not being set.

If 'tpc -l' is correct, it looks like the P-state frequencies are listed in there for each state.
 
Could you tell me what your output for this small program (./myfreq) is?

 
I overlooked that your 6100s were overclocked with tpc.
I'm sorry, this code can't work properly in that case. It just shows the original frequency from before the overclock.
(However, I think this code should work if the chips were overclocked by the OC BIOS.)
 
Does anyone have a set of 1.6GHz ES chips working at all???!?!!?! If so, what board and settings are you using?

I think I have caused permanent brain damage from banging my head against the wall :(

Here is the last sequence of commands I have tried, and I am still getting the system to reboot with NB Watchdog Timeout Errors after 10 to 20 minutes of folding.

Code:
tpc -fo 1
tpc -nbvid 0 38
tpc -set node all freq 2300 vcore 1.200
tpc -nbfid 4
tpc -fo 0

The nbfid number came from the output of 'tpc -spec'. Right now I have just a single CPU installed in socket 1, with 4 sticks of RAM.
 
That should be

sudo tpc -nbvid 0 38
sudo tpc -set node all freq 2300 vcore 1.200
sudo tpc -nbfid 4
sudo tpc -fo 1
sudo tpc -fo 0

I am not sure I would mess with -nbfid for now (what is the default, 5?). If you do change it, you have to set it, then reboot, and then run the other commands for it to take effect.
 
Well, I got things to run for more than 30 minutes last night. Actually, things ran overnight and got a couple of SMP WUs done.

Initially I was running version 2 of the NG BIOS (G60NG2.A11). I upgraded to version 3 (G69NG3.A11) and the 'Node Interleaving Request, But Not Enabled' error went away. I heard from a couple of people that they had issues with the NG BIOS and the 'NB Watchdog Timeout Error' with other ES chips, so I changed to the stock SM BIOS (H8QG62.411). It turns out this is the latest version; the NB Watchdog errors persisted and the 'Node Interleaving Request, But Not Enabled' error returned. I noticed that I had an older version of the stock BIOS (H8QG61.A11) on my flash drive, so I flashed that. The 'Node Interleaving...' error went away, but the 'NB Watchdog' error remained.

I got to thinking (I know that is dangerous) that maybe the DDR3-1600 CL8 memory was having issues, so I swapped it out for the G.Skill DDR3-1333 CL7. I think that might have helped some, as it would now fold for about 45 minutes before falling on its face.

I changed around the 'tpc' commands as follows:

Code:
./tpc -fo 1
./tpc -nbvid 37
./tpc -set node all 1700 vcore 1.1250
./tpc -nbfid 4
./tpc -fo 0

That was able to fold overnight, running with 32 cores.

Currently I have added the 4th CPU and bumped the freq to 1900. So far so 'good'....


A couple of things I would like to know:

  • What is the correct placing of the 'tpc -fo X' commands?
    - I have heard arguments both for putting it after the last setting command and for wrapping it around the setting commands.
  • Why is it not needed to adjust the NorthBridge voltage and set the active ID?
    - Or maybe I should ask what causes the 'NB Watchdog Timeout Errors'
  • How can I adjust or set the RAM timing values from 9-9-9-24 to 7-7-7-21?

Since I got things running here, I think I will put everything back together. I would really like to get them up to 2.2GHz, with 2.5GHz being the icing on the cake.
 
What is the correct placing of the 'tpc -fo X' commands?
- I have heard reason for after the last setting command and wrapped around the setting commands.
It doesn't matter when you set the machine to power state 1: you can do it first or right at the end, just before you go back to power state 0. The big thing is that you set it back to power state 0 after you are done changing everything related to the CPUs (multiplier, voltage, etc.).

Why is it not needed to adjust the NorthBridge voltage and set the active ID?
- Or maybe I should ask what causes the 'NB Watchdog Timeout Errors'
No idea - I guess I need to get an ES machine one of these days.

How can I adjust or set the RAM timing values from 9-9-9-24 to 7-7-7-21?
Using tear's bios or flashing the memory are your only options.
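
For reference, the ordering described above (switch to power state 1, change the CPU settings, switch back to power state 0) could be scripted roughly like this. It is only a sketch: the tpc arguments are copied from the commands posted earlier in the thread and have not been checked against the tpc documentation, the names apply_settings and TPC_CMD are made up for illustration, and by default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Dry run by default: TPC_CMD only echoes each command.
# Assumption: set TPC_CMD="sudo tpc" (or "./tpc") to really apply the settings.
TPC_CMD="${TPC_CMD:-echo tpc}"

# apply_settings FREQ_MHZ VCORE  (hypothetical helper, not part of tpc)
apply_settings() {
    freq="$1"
    vcore="$2"
    # Stop at the first failing step so the sequence is never half-applied silently.
    $TPC_CMD -fo 1 &&
    $TPC_CMD -set node all freq "$freq" vcore "$vcore" &&
    $TPC_CMD -fo 0      # always finish back in power state 0
}

apply_settings 1700 1.1250
```

Chaining with && means a failed step is noticed right away rather than the script carrying on to the next command.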
 
Hi.

I suppose that you bought them here : http://cgi.ebay.com/ws/eBayISAPI.dl...e=ADME:L:MAV:US:1123&tid=692589429014&guest=1

I ask because I had exactly the same problem as you: reboots with NB Watchdog Timeout errors after approximately 15 minutes of folding.

The only solution was to send them back to Korea for a refund. :(

But perhaps Tear will have a better solution for you, and I am curious about that.
 
As an eBay Associate, HardForum may earn from qualifying purchases.
No, I got them from a different source. Supposedly they work at 2.8GHz. I am guessing that they were verified to boot at 2.8GHz, but that nothing was verified under load. I will try to work with them a little bit more before I give up hope....
 
How does one flash the memory? Initially I did not have good luck with the NG BIOS, but I might try working with that again later.

After getting everything back together I processed a bit of an SMP WU at 1900MHz. After about 1 hour, I switched over to a bigbeta WU and got an 8101...... face plant around 16 minutes into processing.

Back to the drawing board :(
 
I can walk you through the process on irc, but it sounds like you have other more pressing issues than memory timings.
 
You got that right. Like having my head examined for going down the ES route rather than the retail :(
 
Did you try leaving out the NB commands to see if they're the cause of your issues? You may need to tweak them a bit.

Try this:
Code:
./tpc -fo 1
./tpc -set node all 1700 vcore 1.1250
./tpc -fo 0
 
Well, I have tried something new and I think I am getting slightly better results. Things appear to be a little more stable, but performance is not quite what I was expecting. I am running an SMP WU right now with the-kraken and DLB has kicked in. The issue is that with the '-smp 48' flag, the-kraken process is only reaching the mid-4600% range in utilization. On the other 48c box, the-kraken sits in the mid-4700% range.

Code:
[root@ancalagon bin]# ps -Leotid=,psr=,pcpu=,comm= | grep thekraken
16100  14  0.5 thekraken-FahCo
16101   0 67.9 thekraken-FahCo
16102  19  0.0 thekraken-FahCo
16103  20  0.0 thekraken-FahCo
16105   1 83.8 thekraken-FahCo
16106   2 89.4 thekraken-FahCo
16107   3 78.7 thekraken-FahCo
16108   4 93.3 thekraken-FahCo
16109   5 85.0 thekraken-FahCo
16110   6 87.8 thekraken-FahCo
16111   7 87.9 thekraken-FahCo
16112   8 85.8 thekraken-FahCo
16113   9 87.6 thekraken-FahCo
16114  10 90.5 thekraken-FahCo
16115  11 86.1 thekraken-FahCo
16116  12 91.7 thekraken-FahCo
16117  13 78.1 thekraken-FahCo
16118  14 92.1 thekraken-FahCo
16119  15 80.6 thekraken-FahCo
16120  16 87.2 thekraken-FahCo
16121  17 84.0 thekraken-FahCo
16122  18 94.1 thekraken-FahCo
16123  19 79.1 thekraken-FahCo
16124  20 93.6 thekraken-FahCo
16125  21 85.6 thekraken-FahCo
16126  22 85.5 thekraken-FahCo
16127  23 81.5 thekraken-FahCo
16128  24 93.7 thekraken-FahCo
16129  25 88.2 thekraken-FahCo
16130  26 85.9 thekraken-FahCo
16131  27 85.3 thekraken-FahCo
16132  28 92.6 thekraken-FahCo
16133  29 80.2 thekraken-FahCo
16134  30 86.7 thekraken-FahCo
16135  31 87.2 thekraken-FahCo
16136  32 94.8 thekraken-FahCo
16137  33 86.8 thekraken-FahCo
16138  34 90.0 thekraken-FahCo
16139  35 82.0 thekraken-FahCo
16140  36 91.1 thekraken-FahCo
16141  37 83.5 thekraken-FahCo
16142  38 89.0 thekraken-FahCo
16143  39 81.7 thekraken-FahCo
16144  40 88.5 thekraken-FahCo
16145  41 79.1 thekraken-FahCo
16146  42 89.0 thekraken-FahCo
16147  43 78.0 thekraken-FahCo
16148  44 92.8 thekraken-FahCo
16149  45 85.3 thekraken-FahCo
16150  46 89.4 thekraken-FahCo
16151  47 86.0 thekraken-FahCo

I see three kraken threads at 0% cpu usage. Is there anything that can be done to give them a swift kick in the ass to start working :D
 
With ht-retries, it is OK to have them as long as the count is not increasing 'quickly', correct?

Code:
[root@ancalagon bin]# ht-retries.sh 
       L0S0 L1S0 L2S0 L3S0 L0S1 L1S1 L2S1 L3S1
Node 0 0000 0000 0000 0000 0089 0000 0000 0000 
Node 1 0000 0000 0000 0000 0000 0000 0000 0000 
Node 2 0006 0000 0000 0000 0000 0000 0000 0000 
Node 3 0000 0000 0000 0000 0000 0000 0000 0000 
Node 4 0000 0000 0000 0000 0000 0000 0000 0000 
Node 5 0030 0000 0000 0000 0000 0000 0000 0000 
Node 6 0000 0000 c3ab 0000 0000 0000 0000 0000 
Node 7 0020 0004 0000 0000 0000 0000 0000 0000

The Node 6 L2S0 counter (c3ab) is increasing by about 1 every 2 to 4 seconds.
 
One every couple minutes would be OK. One every few seconds is not, and probably explains your ~4600% load.
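
To put a number on "how fast", one could note the same counter twice and convert the difference to retries per minute. A rough sketch, assuming the counters are the 4-digit (16-bit) hex values that ht-retries.sh prints and that the counter wraps at most once between samples; the function name retries_per_minute is made up for illustration.

```shell
#!/bin/sh
# retries_per_minute BEFORE AFTER SECONDS
# BEFORE/AFTER: hex counter values as shown by ht-retries.sh (e.g. c3ab).
# Assumption: 16-bit counter, so a wrap-around is handled modulo 65536.
retries_per_minute() {
    before=$(( 0x$1 ))
    after=$(( 0x$2 ))
    delta=$(( (after - before + 65536) % 65536 ))
    echo $(( delta * 60 / $3 ))
}

# Two readings taken 60 seconds apart:
retries_per_minute c3ab c3c9 60    # prints 30
```

A result around 30 per minute corresponds to the "1 every 2 seconds" case, while "one every couple minutes" would come out below 1.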
 
OK, I got the ht-retries cleaned up with all zeros, but I am still getting a utilization of around mid 4600% with 3 threads at 0 cpu utilization. Anyone have any other ideas to boost things up?
 
Watch top for a while and see if anything else is sucking up CPU time. I know early 11.04 versions of Ubuntu spawned a ton of processes that would kill performance. If you don't find anything, try to reinstall thekraken. You could also try to re-nice fah, or change the priority from low to idle in the advanced fah config - the idea would be to give fah a higher priority.

Guessing here - I really have no idea what is going on.
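
If watching top by hand gets tedious, something along these lines could list the busiest non-FAH processes. Just a sketch: the helper name top_other_consumers is made up, and filtering on the names "thekraken" and "FahCore" is an assumption about how the folding processes show up on your box.

```shell
#!/bin/sh
# top_other_consumers [N]
# Reads "pcpu comm" lines on stdin and prints the N busiest entries that are
# not folding threads (name patterns are assumptions, adjust as needed).
top_other_consumers() {
    grep -v -e thekraken -e FahCore | sort -rn | head -n "${1:-5}"
}

# Usage on Linux (procps ps):
#   ps -eo pcpu=,comm= | top_other_consumers 5
```

Keep in mind that the pcpu value from ps is averaged over the lifetime of the process, so a short burst will stand out more in top than here.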
 
402, those non-CPU-time-consuming threads are _not_ worker threads == it's perfectly normal for them not to consume CPU time.

Count all threads. There are 51 -- 48 worker threads and 3 "other" threads.

You should be worried about worker threads that are not utilizing your CPUs fully.

As musky explains, you should be looking for other (non-FAH) consumers that are
taking CPU time away from your worker threads (they should be at 95+% utilization
== something that's missing in your picture).
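
A quick way to do that count without eyeballing 51 lines is to pipe the same ps output through awk. This is a sketch only: the function name summarize_threads, the 1% cut-off used to guess which threads are the non-worker ones, and the 95% target are illustrative choices, not anything FAH or thekraken defines.

```shell
#!/bin/sh
# summarize_threads
# Reads "tid psr pcpu comm" lines on stdin and classifies thekraken threads.
summarize_threads() {
    awk -v target=95 '
        /thekraken/ {
            total++
            if ($3 < 1.0)          idle++    # likely the non-worker threads
            else if ($3 < target)  slow++    # workers below the target
            else                   ok++      # workers at or above the target
        }
        END {
            printf "threads=%d idle=%d below_target=%d at_target=%d\n",
                   total, idle, slow, ok
        }
    '
}

# Usage:
#   ps -Leo tid=,psr=,pcpu=,comm= | summarize_threads
```

On a healthy 48-core box the goal would be 3 idle threads and all 48 workers in the at_target bucket.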

Check my output out -- this should be your goal:
Code:
$ ps -Leotid=,psr=,pcpu=,comm= | grep [^]]thekraken
29413  13  0.0 thekraken-FahCo
29414   0 99.9 thekraken-FahCo
29415  19  0.0 thekraken-FahCo
29416  31  0.0 thekraken-FahCo
29417   1 99.9 thekraken-FahCo
29418   2 99.9 thekraken-FahCo
29419   3 99.9 thekraken-FahCo
29420   4 99.9 thekraken-FahCo
29421   5 99.9 thekraken-FahCo
29422   6 99.9 thekraken-FahCo
29423   7 99.9 thekraken-FahCo
29424   8 99.9 thekraken-FahCo
29425   9 99.9 thekraken-FahCo
29426  10 99.9 thekraken-FahCo
29427  11 99.9 thekraken-FahCo
29428  12 99.9 thekraken-FahCo
29429  13 99.9 thekraken-FahCo
29430  14 99.9 thekraken-FahCo
29431  15 99.9 thekraken-FahCo
29432  16 99.9 thekraken-FahCo
29433  17 99.9 thekraken-FahCo
29434  18 99.9 thekraken-FahCo
29435  19 99.9 thekraken-FahCo
29436  20 99.9 thekraken-FahCo
29437  21 99.9 thekraken-FahCo
29438  22 99.9 thekraken-FahCo
29439  23 99.9 thekraken-FahCo
29440  24 99.9 thekraken-FahCo
29441  25 99.9 thekraken-FahCo
29442  26 99.9 thekraken-FahCo
29443  27 99.9 thekraken-FahCo
29444  28 99.9 thekraken-FahCo
29445  29 99.9 thekraken-FahCo
29446  30 99.9 thekraken-FahCo
29447  31 99.9 thekraken-FahCo
29448  32 99.9 thekraken-FahCo
29449  33 99.9 thekraken-FahCo
29450  34 99.9 thekraken-FahCo
29451  35 99.9 thekraken-FahCo
29452  36 99.9 thekraken-FahCo
29453  37 99.9 thekraken-FahCo
29454  38 99.9 thekraken-FahCo
29455  39 99.9 thekraken-FahCo
29456  40 99.9 thekraken-FahCo
29457  41 99.9 thekraken-FahCo
29458  42 99.9 thekraken-FahCo
29459  43 99.9 thekraken-FahCo
29460  44 99.9 thekraken-FahCo
29461  45 99.9 thekraken-FahCo
29462  46 99.9 thekraken-FahCo
29463  47 99.9 thekraken-FahCo

W/MC ES you should also turn PowerNow _off_ in the BIOS (it's the easiest way to prevent the OS from changing P-states).
 