So is a 6m50s TPF (PPD=972k) on 8102 any good? :)

quickz (Limp Gawd, joined Jul 30, 2012, 256 messages)
Sorry for copying the title format from fastgeek, just for fun. ;)
Recently I have also been running some performance tests on a 4P E5-4650.
The best score I got is a min TPF of 6m50s for 8102, with a corresponding PPD of 972k.
That seems not bad, but it still falls short of 1M PPD.
Has anyone reached 1M PPD on a single rig before?

Code:
[16:46:58] Completed 55000 out of 250000 steps  (22%)
[16:53:48] Completed 57500 out of 250000 steps  (23%)
[17:00:41] Completed 60000 out of 250000 steps  (24%)
[17:07:31] Completed 62500 out of 250000 steps  (25%)
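
For anyone who wants to pull min/avg TPF out of FAHlog.txt without doing the subtraction by hand, here is a minimal Python sketch. It assumes the log lines look exactly like the ones above and that one frame is 1% of the total steps; the function names are made up for illustration. Partial frames (for example right after a checkpoint resume) are skipped.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as: [16:46:58] Completed 55000 out of 250000 steps  (22%)
FRAME_RE = re.compile(r"\[(\d{2}:\d{2}:\d{2})\] Completed (\d+) out of (\d+) steps")


def frame_times(log_text):
    """Return the duration in seconds of every full 1% frame found in the log."""
    entries = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            stamp = datetime.strptime(m.group(1), "%H:%M:%S")
            entries.append((stamp, int(m.group(2)), int(m.group(3))))

    durations = []
    for (t0, s0, total), (t1, s1, _) in zip(entries, entries[1:]):
        if s1 - s0 != total // 100:
            continue  # partial frame, e.g. resumed from a checkpoint
        delta = t1 - t0
        if delta < timedelta(0):
            delta += timedelta(days=1)  # timestamps wrapped past midnight
        durations.append(int(delta.total_seconds()))
    return durations


def fmt(seconds):
    """Format seconds as NmSSs, e.g. 410 -> 6m50s."""
    s = int(round(seconds))
    return f"{s // 60}m{s % 60:02d}s"


LOG = """\
[16:46:58] Completed 55000 out of 250000 steps  (22%)
[16:53:48] Completed 57500 out of 250000 steps  (23%)
[17:00:41] Completed 60000 out of 250000 steps  (24%)
[17:07:31] Completed 62500 out of 250000 steps  (25%)
"""

times = frame_times(LOG)
print("min TPF:", fmt(min(times)), " avg TPF:", fmt(sum(times) / len(times)))
```

For the four lines above this gives frames of 410, 413 and 410 seconds, so the minimum matches the 6m50s quoted.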

update:
The TPF has been improved to 6m47s (PPD=983K), see:
http://hardforum.com/showpost.php?p=1039126639&postcount=28
http://hardforum.com/showpost.php?p=1039129601&postcount=29
 
To my knowledge no one has reached 1M PPD on a single rig. Patriot got very close (before the initial BigAdv nerf, with a p6901 IIRC).
 
Nice! I'm pretty sure someone will eventually get 4 of those Intel Xeon E7-8870 ES 10-Core CPUs floating around on Ebay and hit over a mil ppd.
 
We've been close - Patriot would have had it with Harbringer, but the last bigadv points nerf went into effect before he got it running. I'm pretty sure no one around here has actually broken 1M ppd with one machine, though.
 
Yeah....I was pretty ticked at the timing of the first nuke...
To my knowledge no one here has... and if you have, it's not wise to mention it unless you want another point nuke...
 
Thanks.
I found in -alias-'s post that his 4P 4650 could get a PPD of 871k for p8102 (TPF=7:21).
I don't know if he has gotten higher scores since, but it seems my rig is more efficient. :)
I was using a customized 3.2.9 kernel in the test. I would also suggest that -alias- try newer Linux kernels; some of them might be faster.
Now I'm trying my best to see if 1M PPD is possible, which means a TPF of 6:42 for 8102.
The gap is only 8s, but it might be very hard to close.
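
A quick way to see where the 6:42 figure comes from: the numbers in this thread are consistent with PPD scaling as TPF to the power of -1.5 (more WUs per day, times a bonus that grows with the square root of how fast the WU is returned). The exponent is my assumption, not something taken from the points documentation, and the function names are hypothetical. A minimal Python sketch:

```python
def scale_ppd(ppd_ref, tpf_ref_s, tpf_new_s, exponent=1.5):
    """Estimate PPD at a new TPF from a known (PPD, TPF) pair.

    exponent=1.5 is an assumption: bonus-dominated scoring, where credit per
    WU grows with the square root of the return speed.
    """
    return ppd_ref * (tpf_ref_s / tpf_new_s) ** exponent


def tpf_for_target(ppd_ref, tpf_ref_s, ppd_target, exponent=1.5):
    """Invert scale_ppd: TPF in seconds needed to reach ppd_target."""
    return tpf_ref_s * (ppd_ref / ppd_target) ** (1.0 / exponent)


# Reference point from the first post: p8102, TPF 6m50s, 972k PPD.
REF_TPF = 6 * 60 + 50
REF_PPD = 972_000

needed = tpf_for_target(REF_PPD, REF_TPF, 1_000_000)
print(f"TPF needed for 1M PPD: {int(needed) // 60}m{needed % 60:04.1f}s")
print(f"Estimated PPD at 7m21s: {scale_ppd(REF_PPD, REF_TPF, 7 * 60 + 21):,.0f}")
```

With the 6m50s/972k reference this lands at a bit over 402 seconds for 1M PPD, i.e. the 6:42 mentioned above, and it puts a 7:21 TPF at roughly 871k, which agrees with -alias-'s result.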
 
I'm afraid that a 4P E7-8870 would not reach 1M PPD either, because the E7-8870's frequency is significantly lower, and we know that E7 is not as efficient as E5 at the same frequency.
I guess a 4P E7-8870 would probably be much slower than a 4P E5-4650, and even a little slower than a 4P E5-4640.

 
A 4P E7-8870 would be about 15% slower than my G34 4P ....and it's 15% slower than my 2011 4P
I had an e7-4870 4p folding...
 
I'd explore the possibility of using premium memory (CL6 or CL7/1600 or equivalent) and flashing
SPDs to tighter timings. This could gain you several seconds :D

I'll run some tests on my machine (still missing two 4650Ls -- ugh) and let you know.
 
Thanks, I look forward to your results with CL6 or CL7/1600 memory.
In my tests I found that the 4P E5's TPF was not as steady as the 2P one (DLB on in all cases). This might be a NUMA-related issue. For now I stop and restart the FAH client until I get a good TPF. I think numactl would be a better solution, but I have not succeeded with it yet.

 
This is pure geek porn, I feel dirty and I like it.
 
quickz, is your memory (per BIOS) running at 1600? I'm at 1333 and trying to determine why 1600 doesn't get chosen.

If it is running at 1600, can you tell me its exact model?
 
The rig is currently using 16 sticks of 1333MHz ECC REG memory, not 1600MHz. I can only access the rig remotely.
This is the result of dmidecode:

root@E5:~# dmidecode -t memory | more
# dmidecode 2.10
SMBIOS 2.7 present.
# SMBIOS implementations newer than version 2.6 are not
# fully supported by this version of dmidecode.

Handle 0x0034, DMI type 16, 23 bytes
Physical Memory Array
Location: System Board Or Motherboard
Use: System Memory
Error Correction Type: Multi-bit ECC
Maximum Capacity: 96 GB
Error Information Handle: Not Provided
Number Of Devices: 8

Handle 0x0036, DMI type 17, 34 bytes
Memory Device
Array Handle: 0x0034
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 4096 MB
Form Factor: DIMM
Set: None
Locator: P1_DIMMA1
Bank Locator: Node0_Bank0
Type: DDR3
Type Detail: None
Speed: 1333 MHz
Manufacturer: Hynix Semiconducto
Serial Number: 86100C2A
Asset Tag: Dimm0_AssetTag
Part Number: HMT351R7AFR4C-H9
Rank: 1

......
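
With 16 sticks the full dump gets long, so here is a minimal Python sketch that boils `dmidecode -t memory` output down to slot, size, speed and part number per DIMM. It assumes the text layout shown above; the function name is made up, and the sample is just the first module from the dump.

```python
WANTED = ("Locator", "Size", "Speed", "Part Number")


def dimm_summary(dmidecode_text):
    """Return one dict per 'Memory Device' block with the fields in WANTED."""
    dimms = []
    current = None
    for raw in dmidecode_text.splitlines():
        line = raw.strip()
        if line.startswith("Handle "):
            current = None            # a new DMI record starts here
        elif line == "Memory Device":
            current = {}              # this record describes one DIMM slot
            dimms.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            if key in WANTED:
                current[key] = value.strip()
    return dimms


SAMPLE = """\
Handle 0x0034, DMI type 16, 23 bytes
Physical Memory Array
Location: System Board Or Motherboard
Number Of Devices: 8

Handle 0x0036, DMI type 17, 34 bytes
Memory Device
Array Handle: 0x0034
Size: 4096 MB
Locator: P1_DIMMA1
Bank Locator: Node0_Bank0
Type: DDR3
Speed: 1333 MHz
Part Number: HMT351R7AFR4C-H9
"""

for dimm in dimm_summary(SAMPLE):
    print(dimm["Locator"], dimm["Size"], dimm["Speed"], dimm["Part Number"])
```

On a live box you would feed it the real output of `dmidecode -t memory` (run as root) instead of SAMPLE. Keep in mind the Speed field may be the module's rated speed rather than what the memory controller actually runs it at, so it is worth cross-checking against the BIOS.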
 
I remember when normal people could fold lol
 
I thought I had a pretty sweet farm going on with 8 computers, all in cases in a big pyramid. Barton cores, Pentium 4's. The power consumption was horrific. I think that was 2006.

Anyone with an i7 contributes a lot. It just doesn't seem like much when you compare it to a folding-only, ~$2000 (middle of the line) server. That is, unless you have other uses for a server; I know I don't.
 

What, servers have purposes other than folding?! Blasphemy! ;)
 
you mean like on 300 computers....

you mean like 400, single proc clients, so 2 clients per machine, 800 installs... was quite an adventure :D

even then I don't think I broke 100k PPD, can't even remember
 
After I wrote that post I have not folded a single 8102 on that rig; only 8101 has been coming down. But I have gotten down to a TPF of 9:06 on 8101, so the rig is faster now and I am looking forward to seeing an 8102 again. I do not think that a TPF faster than 6:42 is possible on my rig anyway. Making 1 million PPD on one rig must really be a folder's dream.
 
An 8P E7-8870 might easily get 1M PPD, but it's too expensive. :D
Just for testing, I have also run some old BA WUs from my disk backup.
Here are the results:
Code:
Proj 6903: TPF= 9:17, PPD=937k
Proj 6904: TPF=13:10, PPD=909k
Sad to see they also fail to reach the 1M PPD line.
Here 6904's PPD is about 3% lower (909k vs 937k). However, I remember some people getting much higher PPD from 6904 than from 6903.
Is there any trick for running 6904 more efficiently?
 
Hi tear, what is your current (min) TPF of 8101 with two 4650L on the 4P board?

 
I haven't really folded on those so I can't really tell you...
I should be able to give them a shot next weekend (as I'm traveling this week).

EDIT: Did some research on accessing on-chip I2C (meant for SPDs) -- bleeding edge
appears to be here: http://www.spinics.net/lists/linux-edac/msg01658.html

As the author explains, this is sort of a kludge (real support should be implemented
as part of the general I2C infrastructure, not EDAC); it's a good starting point, though.
 
One 6901 came down on my 4P Xeon 4650, TPF 4:33 at the best!

My best ever score for 6901 is min TPF=4m10s and avg TPF=4m11.5s. I got this after stopping and restarting the WU dozens of times.
I suggest you also try it (if you have time) to see if the TPF decreases. :D

FAHlog.txt:
[09:47:18] Project: 6901 (Run XX, Clone XX, Gen XXX)
[09:47:18]
[09:47:18] Assembly optimizations on if available.
[09:47:18] Entering M.D.
[09:47:24] Using Gromacs checkpoints
[09:47:25] Mapping NT from 64 to 64
[09:48:02] Resuming from checkpoint
[09:48:03] Verified work/wudata_02.log
[09:48:03] Verified work/wudata_02.trr
[09:48:04] Verified work/wudata_02.xtc
[09:48:04] Verified work/wudata_02.edr
[09:48:04] Completed 203400 out of 250000 steps (81%)
[09:50:56] Completed 205000 out of 250000 steps (82%)
[09:55:07] Completed 207500 out of 250000 steps (83%)
[09:59:19] Completed 210000 out of 250000 steps (84%)
[10:03:31] Completed 212500 out of 250000 steps (85%)
[10:07:42] Completed 215000 out of 250000 steps (86%)
[10:11:54] Completed 217500 out of 250000 steps (87%)
[10:16:05] Completed 220000 out of 250000 steps (88%)
[10:20:17] Completed 222500 out of 250000 steps (89%)
[10:24:28] Completed 225000 out of 250000 steps (90%)
[10:28:40] Completed 227500 out of 250000 steps (91%)
[10:32:51] Completed 230000 out of 250000 steps (92%)
[10:37:04] Completed 232500 out of 250000 steps (93%)
[10:41:16] Completed 235000 out of 250000 steps (94%)
[10:45:26] Completed 237500 out of 250000 steps (95%)
[10:49:39] Completed 240000 out of 250000 steps (96%)
[10:53:51] Completed 242500 out of 250000 steps (97%)
[10:58:02] Completed 245000 out of 250000 steps (98%)
[11:02:13] Completed 247500 out of 250000 steps (99%)
 
update1: I got another 8102 WU which was a little faster; the best (min) TPF is now 6:49 (PPD=976k).
update2: With the help of a revised thekraken, I'm now able to get a min TPF of 6:48 for 8102 (PPD=980k). The avg TPF of the first 10 frames is 6m48.75s.
FAHlog.txt:
[05:14:07] Project: 8102 (Run X, Clone XX, Gen X)
[05:14:07]
[05:14:07] Assembly optimizations on if available.
[05:14:07] Entering M.D.
[05:14:13] Using Gromacs checkpoints
[05:14:15] Mapping NT from 64 to 64
[05:14:37] Resuming from checkpoint
... ...
[05:16:47] Completed 5000 out of 250000 steps (2%)
[05:23:35] Completed 7500 out of 250000 steps (3%)
[05:30:23] Completed 10000 out of 250000 steps (4%)
[05:37:11] Completed 12500 out of 250000 steps (5%)
[05:43:59] Completed 15000 out of 250000 steps (6%)
[05:50:49] Completed 17500 out of 250000 steps (7%)
[05:57:38] Completed 20000 out of 250000 steps (8%)
[06:04:29] Completed 22500 out of 250000 steps (9%)
[06:11:17] Completed 25000 out of 250000 steps (10%)
 
update3:
With the help of the latest 3.6-rc kernel, I gained another second and pushed 8102's min TPF to 6:47. The corresponding PPD is 983k.
I think this will be the final score of this test. It's a pity that the result is still 5s slower than the TPF needed for the 1M PPD line.
 
It was said that there is a special version of the Opteron 6200 with 32M L2 and 32M L3 (the 'normal' version has only 16M L2 and 16M L3). A 4P rig armed with this CPU was reportedly running p6901 at a TPF of ~3m50s, that is, 20s faster than my 4P E5-4650!
Does anyone know anything about this mysterious 6200 CPU?
 
Today I think it must have been a fake.

I remember when that TPF was reported; at the time it was really exciting.... reality
proved otherwise.


Anyway, here's my 2P 4650L test P8101 result: 17m36s (based on one full frame, really)
Effective frequency fluctuates between 2830 and 2900 MHz.
Power draw: 300W (PC Power&Cooling Silencer 910W -- Silver).

Note that I haven't spent any time on tweaking and that cooling is bad
(slapped some old Socket 754 HSFs on it). The latter could explain
why 2900 MHz isn't holding.

I set scaling governor to "performance" and am running custom 3.5.3
kernel (was playing with it as part of SB-E I2C research). FahCores
have been wrapped with The Kraken, too.

I'll run a test 6901 next (it's small so it shouldn't take that long to get results).
Update: got it -- 8m53s.
 
Your E5-4650L is only 2.9GHz for all-core turbo? It's strange to me since my CPU supplier said it was 3.0GHz.

E5 4650L's freq info from my supplier:
Code:
C0/C1 stepping: 2.6GHz(no turbo), 3.0GHz(all-core turbo), 3.3GHz(one-core turbo)
C2 stepping: 2.6GHz(no turbo), 2.8GHz(all-core turbo), 3.1GHz(one-core turbo)

So my freq info for 4650L might be incorrect?


 
No firm idea. I've reported top value from freqcheck...
I'll take a closer look when I focus on clocks (right now I'm looking at memory).

Does your supplier have any 4650Ls available?
 
Ok, flashing does fly with 4650s:
Code:
DCT2: memory type: DDR3 frequency: 1332 MHz
Tcl=6 Trcd=7 Trp=5 Tras=20 CMD:1T Trtp=5 Trc=25* Twr=9 Trrd=5 Tcwl=7 Tfaw=40 Twtr=5 Tcke=4
Tccd=0 Twrdd=1 Twrdr=1 Twwdd=3 Twwdr=3 Trrdd=2 Trrdr=0 Tcsoe=6 Todtoe=6 Trwsr=2 Trwdd=2
Trwdr=2 Tcwladj=0 Txp=4 Txpdll=16 Trfc=92 Trefi=5200 Trefix9=45
vs before:
Code:
DCT2: memory type: DDR3 frequency: 1332 MHz
Tcl=8 Trcd=8 Trp=8 Tras=24 CMD:1T Trtp=5 Trc=32* Twr=10 Trrd=4 Tcwl=7 Tfaw=40 Twtr=5 Tcke=4
Tccd=0 Twrdd=1 Twrdr=1 Twwdd=3 Twwdr=3 Trrdd=2 Trrdr=0 Tcsoe=6 Todtoe=6 Trwsr=2 Trwdd=2
Trwdr=2 Tcwladj=0 Txp=4 Txpdll=16 Trfc=74 Trefi=5200 Trefix9=45
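
The two dumps are hard to compare by eye, so here is a minimal Python sketch that parses the `Name=value` pairs and lists only the timings that differ. The helper names are made up for illustration; the strings are the two dumps above.

```python
import re


def parse_timings(dump):
    """Turn 'Tcl=6 Trcd=7 ...' text into a {name: int} dict (trailing '*' ignored)."""
    return {name: int(value) for name, value in re.findall(r"\b(T\w+)=(\d+)", dump)}


def changed(before, after):
    """Return {name: (old, new)} for every timing whose value differs."""
    return {k: (before[k], after[k])
            for k in before if k in after and before[k] != after[k]}


AFTER = """\
Tcl=6 Trcd=7 Trp=5 Tras=20 CMD:1T Trtp=5 Trc=25* Twr=9 Trrd=5 Tcwl=7 Tfaw=40 Twtr=5 Tcke=4
Tccd=0 Twrdd=1 Twrdr=1 Twwdd=3 Twwdr=3 Trrdd=2 Trrdr=0 Tcsoe=6 Todtoe=6 Trwsr=2 Trwdd=2
Trwdr=2 Tcwladj=0 Txp=4 Txpdll=16 Trfc=92 Trefi=5200 Trefix9=45
"""

BEFORE = """\
Tcl=8 Trcd=8 Trp=8 Tras=24 CMD:1T Trtp=5 Trc=32* Twr=10 Trrd=4 Tcwl=7 Tfaw=40 Twtr=5 Tcke=4
Tccd=0 Twrdd=1 Twrdr=1 Twwdd=3 Twwdr=3 Trrdd=2 Trrdr=0 Tcsoe=6 Todtoe=6 Trwsr=2 Trwdd=2
Trwdr=2 Tcwladj=0 Txp=4 Txpdll=16 Trfc=74 Trefi=5200 Trefix9=45
"""

diff = changed(parse_timings(BEFORE), parse_timings(AFTER))
for name, (old, new) in diff.items():
    print(f"{name}: {old} -> {new}")
```

Worth noting from the diff: not everything got tighter. Trrd went from 4 to 5 and Trfc from 74 to 92, while Tcl, Trcd, Trp, Tras, Trc and Twr went down.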

Except that (based on my single test) performance actually seems to have degraded.

It's at 17m45s.

Though, again, Turbo was fluctuating, just like the last time.
Ah, and I've seen it go beyond 2900 so perhaps 3000 is the all-core Turbo after all.

Anyway, looks like this system needs more tuning -- next weekend :)
Will be getting couple of 212 EVOs to eliminate cooling as a variable as well.

I've got a TPC version that can do -dram on SB-E (has issues with node-socket mapping
but otherwise works). Flashing needs to be done in another board as we don't have I2C
support for these chips yet.
 
A bit more info:
These are G.Skill F3-12800CL6D-2GBXH modules.
Modified with d3sak (http://www.amdzone.com/phpbb3/viewtopic.php?f=521&t=138490&start=225#p218810) -m relaxed10.

At 1333 these should theoretically run at Tcl=5 but system hangs during POST if CL5
is added to CAS latency support list (so we're at odd-ish 6-7-5-20 instead of 5-7-5-20).

Also, another matter is that they're running 1333, not 1600 -- no explanation for that yet.
 
I guess the performance reduction is due to the turbo fluctuations, and it's odd to see turbo fluctuations on E5 Xeons. To my knowledge E5 Xeons should be able to hold the all-core turbo frequency without difficulty unless some core temps approach the critical temperature, and that is usually hard to reach since it's as high as ~100C (212F).
Have you checked your CPU temps during the test?

 
I messed up. Forgot to change scaling governor this time around.
I'll do more meaningful round of tests on weekend.
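
Since a forgotten governor is an easy way to ruin a test run, here is a minimal Python sketch of a pre-flight check. It assumes the usual Linux cpufreq sysfs layout (`/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor`); the function names are made up, and on a kernel without cpufreq support it simply finds nothing.

```python
from pathlib import Path


def read_governors(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU that exposes cpufreq."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(sysfs_root).glob(pattern)):
        cpu = gov_file.parent.parent.name   # e.g. "cpu0"
        governors[cpu] = gov_file.read_text().strip()
    return governors


def not_performance(governors):
    """List the CPUs whose governor is anything other than 'performance'."""
    return sorted(cpu for cpu, gov in governors.items() if gov != "performance")


govs = read_governors()
offenders = not_performance(govs)
if not govs:
    print("no cpufreq information found")
elif offenders:
    print("not on 'performance':", ", ".join(offenders))
else:
    print("all", len(govs), "CPUs are on 'performance'")
```

Running it right before starting the client would have caught the slip above. It only reads the setting; changing the governor still needs root.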

For now we know that XMP timings appear to generally work on
these chips.
 
Another folder on my team has also built a 4P E5-4650 rig, and today he finally got it working properly.
His avg TPF for 8101 is about 8m53s and min TPF is 8m51s (Run 13, Clone 9, Gen 58), which is also not bad. :cool:
 
The 4P 4650 rig mentioned in post #38 is a little faster than my test rig.
Today it got a min TPF of 6:46 for 8102 without many optimizations, and the avg TPF is about 6:49.
Here is the FAHlog.txt:

Code:
[00:51:12] Project: 8102 (Run 0, Clone 44, Gen 44) 
[00:51:12] 
[00:51:12] Assembly optimizations on if available. 
[00:51:12] Entering M.D. 
[00:51:18] Mapping NT from 64 to 64 
[00:51:21] Completed 0 out of 250000 steps (0%) 
[00:58:25] Completed 2500 out of 250000 steps (1%) 
[01:05:14] Completed 5000 out of 250000 steps (2%) 
[01:12:05] Completed 7500 out of 250000 steps (3%) 
[01:18:54] Completed 10000 out of 250000 steps (4%) 
[01:25:44] Completed 12500 out of 250000 steps (5%) 
[01:32:33] Completed 15000 out of 250000 steps (6%) 
[01:39:24] Completed 17500 out of 250000 steps (7%) 
[01:46:12] Completed 20000 out of 250000 steps (8%) 
[01:53:02] Completed 22500 out of 250000 steps (9%) 
[01:59:49] Completed 25000 out of 250000 steps (10%) 
[02:06:39] Completed 27500 out of 250000 steps (11%) 
[02:13:27] Completed 30000 out of 250000 steps (12%) 
[02:20:15] Completed 32500 out of 250000 steps (13%) 
[02:27:06] Completed 35000 out of 250000 steps (14%) 
[02:33:54] Completed 37500 out of 250000 steps (15%) 
[02:40:45] Completed 40000 out of 250000 steps (16%) 
[02:47:34] Completed 42500 out of 250000 steps (17%) 
[02:54:25] Completed 45000 out of 250000 steps (18%) 
[03:01:14] Completed 47500 out of 250000 steps (19%) 
[03:08:04] Completed 50000 out of 250000 steps (20%) 
[03:14:53] Completed 52500 out of 250000 steps (21%) 
[03:21:42] Completed 55000 out of 250000 steps (22%) 
[03:28:29] Completed 57500 out of 250000 steps (23%) 
[03:35:16] Completed 60000 out of 250000 steps (24%) 
[03:42:07] Completed 62500 out of 250000 steps (25%) 
[03:48:53] Completed 65000 out of 250000 steps (26%) 
[03:55:42] Completed 67500 out of 250000 steps (27%) 
[04:02:37] Completed 70000 out of 250000 steps (28%) 
[04:09:25] Completed 72500 out of 250000 steps (29%) 
[04:16:11] Completed 75000 out of 250000 steps (30%)
 
Update: new record for 8101, also set by the 4P 4650 rig mentioned in post #38.
min TPF=8m42s (2s faster than my old record)
avg TPF=8m45s

Code:
[18:38:29] Project: 8101 (Run 2, Clone 6, Gen 76)
... ...
[00:31:00] Completed 100000 out of 250000 steps  (40%)
[00:39:46] Completed 102500 out of 250000 steps  (41%)
[00:48:32] Completed 105000 out of 250000 steps  (42%)
[00:57:19] Completed 107500 out of 250000 steps  (43%)
[01:06:03] Completed 110000 out of 250000 steps  (44%)
[01:14:49] Completed 112500 out of 250000 steps  (45%)
[01:23:33] Completed 115000 out of 250000 steps  (46%)
[01:32:18] Completed 117500 out of 250000 steps  (47%)
[01:41:05] Completed 120000 out of 250000 steps  (48%)
[01:49:47] Completed 122500 out of 250000 steps  (49%)
[01:58:32] Completed 125000 out of 250000 steps  (50%)
[02:07:14] Completed 127500 out of 250000 steps  (51%)
 