New Linpack from Intel

If temps reach past 97C, you are probably throttling. Your tool might not be sampling things fast enough to catch the 100C readings or the throttling.

But yeah, the new AVX2 Linpack really gets Haswell running full bore. If real-world applications could use AVX2 all over the place we'd see far better performance gains than trying to overclock.
 
AVX2 instructions don't apply to every workload. Only certain workloads can actually be coded to take advantage of them.

That said, I still agree that this tool is very useful for finding the absolute upper limit of the TDP envelope for your system. Pass this, and you likely won't come across a workload in the future that causes problems. Plus you'll have a moderate safety margin under normal workloads.
 
If temps reach past 97C, you are probably throttling. Your tool might not be sampling things fast enough to catch the 100C readings or the throttling.

But yeah, the new AVX2 Linpack really gets Haswell running full bore. If real-world applications could use AVX2 all over the place we'd see far better performance gains than trying to overclock.

I used the statistics page, which records peak temperatures. However, the temps are very "spiky" when running that test, so it's definitely possible it missed some 100+C temp that only lasted for a fraction of a second. Doesn't really matter - running actual games, temps usually stay in the 60s...
 
200+ gigaflops is very impressive. I wonder if this is also taking advantage of the FMA3 instruction that was included with Haswell.
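A rough way to check the FMA question is to compare the measured number against the theoretical double-precision peak. The sketch below assumes the commonly quoted Haswell figures (8 flops per cycle per core with plain AVX, 16 with two 256-bit FMA units) and uses 4 cores at 4.4 GHz as example values, matching the run posted later in this thread. The function and constant names are made up for illustration.

```python
def peak_dp_gflops(cores, ghz, flops_per_cycle):
    """Theoretical double-precision peak: cores x clock (GHz) x flops per cycle per core."""
    return cores * ghz * flops_per_cycle

# Assumed per-core throughput figures for Haswell (not taken from this thread):
AVX_ONLY = 8    # one 4-wide add + one 4-wide multiply per cycle
AVX2_FMA = 16   # two 4-wide fused multiply-adds per cycle, 2 flops each

peak_without_fma = peak_dp_gflops(4, 4.4, AVX_ONLY)
peak_with_fma = peak_dp_gflops(4, 4.4, AVX2_FMA)

print(f"peak without FMA: {peak_without_fma:.1f} GFlops")
print(f"peak with FMA:    {peak_with_fma:.1f} GFlops")
```

Under those assumptions, a result above 200 GFlops is higher than the no-FMA ceiling, which would suggest the benchmark is indeed using FMA.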
 
There are a couple of bits in the IA32_THERM_STATUS register that keep track of whether each individual core ever reaches the throttling point. Even if throttling only occurs for a microsecond, a bit in this register will indicate that so there is no need for monitoring software to monitor this register at a super fast rate to catch thermal throttling.

RealTemp reports information from this register in its Thermal Status area. If RealTemp shows OK, that means the CPU core has not throttled since start up. If it shows LOG, that means that core throttled at least once. If it shows HOT, that means thermal throttling is in full progress. Thermal throttling usually works so well that you rarely see HOT as the CPU is maintained at just a hair under the throttling temperature.
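For anyone curious how those three states fall out of the register, here is a minimal sketch. It assumes the bit layout from Intel's documentation (bit 0 is the live thermal status, bit 1 is the sticky log bit) and only decodes a raw value that some driver has already read. It is not RealTemp's actual code, and the names are made up for illustration.

```python
# Assumed bit positions in IA32_THERM_STATUS (per Intel's documentation):
THERMAL_STATUS = 1 << 0       # core is throttling right now
THERMAL_STATUS_LOG = 1 << 1   # sticky: set once the core has throttled, stays set until cleared

def thermal_label(msr_value):
    """Map a raw IA32_THERM_STATUS value to an OK / LOG / HOT style label."""
    if msr_value & THERMAL_STATUS:
        return "HOT"
    if msr_value & THERMAL_STATUS_LOG:
        return "LOG"
    return "OK"

for raw in (0b00, 0b10, 0b11):
    print(f"{raw:#04b} -> {thermal_label(raw)}")
```

Because the log bit is sticky, polling once a second is enough to notice a throttle event that lasted only a moment.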

I took the CPU and Intel GPU in my 3570K very close to the throttling point while using the OEM heatsink and fan. I didn't have the guts to go higher on a fairly new CPU.

http://img27.imageshack.us/img27/6216/torturetest.png

My E8400 wasn't so lucky. I disconnected the CPU fan and ran Prime for a few hours just to make sure this RealTemp feature works correctly. It passed that test.

http://img11.imageshack.us/img11/276/hote8400fw5.png

Now that's what I call Prime stable. :D

RealTemp T|I Edition
http://www.overclock.net/t/1330144/realtemp-t-i-edition
 
My cores topped out at 93/98/97/89 C. 4770K @ 4.3 GHz / 1.218V, HT off (for running the test).

Just de-lidded.

Same test now: 81/81/81/81C :cool:

Same overclock, voltages and room temperature (within 1-2C).
 
Hey damn, hot temps. What TIM would you guys use when delidding for the best thermal drop? Coollaboratory Liquid Ultra, Coollaboratory Liquid Pro, or something else?
 
AVX2 instructions don't apply to every workload. Only certain workloads can actually be coded to take advantage of them.

That said, I still agree that this tool is very useful for finding the absolute upper limit of the TDP envelope for your system. Pass this, and you likely won't come across a workload in the future that causes problems. Plus you'll have a moderate safety margin under normal workloads.

Too much of a safety margin for me, I'm just gaming. Gimme that extra 200MHz.

I used the statistics page, which records peak temperatures. However, the temps are very "spiky" when running that test, so it's definitely possible it missed some 100+C temp that only lasted for a fraction of a second. Doesn't really matter - running actual games, temps usually stay in the 60s...

Yup...gaming is such a far cry from AVX2 stress testing that you'll miss out on usable overclocking headroom as far as gaming is concerned.

There are a couple of bits in the IA32_THERM_STATUS register that keep track of whether each individual core ever reaches the throttling point. Even if throttling only occurs for a microsecond, a bit in this register will indicate that so there is no need for monitoring software to monitor this register at a super fast rate to catch thermal throttling.

RealTemp reports information from this register in its Thermal Status area. If RealTemp shows OK, that means the CPU core has not throttled since start up. If it shows LOG, that means that core throttled at least once. If it shows HOT, that means thermal throttling is in full progress. Thermal throttling usually works so well that you rarely see HOT as the CPU is maintained at just a hair under the throttling temperature.

I took the CPU and Intel GPU in my 3570K very close to the throttling point while using the OEM heatsink and fan. I didn't have the guts to go higher on a fairly new CPU.

http://img27.imageshack.us/img27/6216/torturetest.png

My E8400 wasn't so lucky. I disconnected the CPU fan and ran Prime for a few hours just to make sure this RealTemp feature works correctly. It passed that test.

http://img11.imageshack.us/img11/276/hote8400fw5.png

Now that's what I call Prime stable. :D

RealTemp T|I Edition
http://www.overclock.net/t/1330144/realtemp-t-i-edition

Cool, so as long as our monitoring programs all use that register, we're good.

And those temps remind me of my 7800m graphics card on an old laptop...used to hit 112C during Portal and blue-screen.
 
Hey damn, hot temps. What TIM would you guys use when delidding for the best thermal drop? Coollaboratory Liquid Ultra, Coollaboratory Liquid Pro, or something else?

I used Coollaboratory Liquid Pro. Pretty happy with the 17C drop in the new Linpack. I've read about people getting drops of 25+C, but I guess each chip and de-lid is unique (and of course I'm using air cooling). 81C on air @ 4.3 GHz and 1.218V with this test definitely feels better than 98C :)

Does that work well for the heatsink to IHS interface too?

CLP and CLU are not compatible with aluminium. They will literally eat it. If the heatsink or water block has a copper base, it will work, but keep in mind that it's very hard to remove the stuff if you ever need to.
 
It essentially bonds itself to the metal; removing it means lapping it.
 
It works, but the improvement isn't big enough to justify the hassle of cleaning it up later. Unless you really want every last degree of performance, it's generally better to use a normal TIM between the IHS and the cooler.
 
The modified batch file assigns one thread to each of the physical cores; ergo, running with HT on is exactly the same as running with HT off, and my results prove that.

Why can't you understand anything I've written so far?

He is correct in that if HT is turned on, the virtual cores will be active and available to Windows. This means that in *theory*, Windows could assign some background processes to the virtual cores, taking some processing power away from the Linpack process running on the 4 "real" cores. Of course, those same background processes without HT enabled would still share the 4 "real" cores with the Linpack benchmark anyway. So, it doesn't matter. What does matter is the Affinity, which *does* assign Linpack to the 4 "real" cores. 1 Linpack thread per physical core. Just how Intel designed the benchmark.
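To make the affinity point concrete, here is a small sketch of how a "one thread per physical core" mask can be built. It assumes the two logical processors of each core are numbered next to each other (0 and 1 on the first core, 2 and 3 on the second, and so on), which is typical but not guaranteed. It is not taken from the modified batch file, and the function name is made up for illustration.

```python
def physical_core_mask(physical_cores, threads_per_core=2):
    """Build an affinity bitmask that selects the first logical processor of each core.

    Assumes sibling logical processors are numbered adjacently.
    """
    mask = 0
    for core in range(physical_cores):
        mask |= 1 << (core * threads_per_core)
    return mask

ht_on = physical_core_mask(4, threads_per_core=2)    # quad core with HT enabled
ht_off = physical_core_mask(4, threads_per_core=1)   # quad core with HT disabled

print(f"HT on:  {ht_on:#x} ({ht_on:08b})")
print(f"HT off: {ht_off:#x} ({ht_off:08b})")
```

Either way the benchmark ends up with four threads on four different physical cores, which is why the HT setting makes little difference to the result.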

One thing I did find is that when HT was enabled, I had to increase the VCore slightly to run this benchmark, from 1.24 to 1.25V. I guess that simply keeping HT active increases the demands on the CPU even if no application is taking advantage of it.

The modified .bat file, combined with the modified lininput_xeon64 file to limit problem size to 25000 made this much more user-friendly.

Here's my latest run at 4.4 GHz, 1.25V. Room temp around 25C.


Code:
Intel(R) Optimized LINPACK Benchmark data

Current date/time: Sun Aug 18 13:42:04 2013

CPU frequency:    4.398 GHz
Number of CPUs: -1
Number of cores: 1
Number of threads: 4

Parameters are set to:

Number of tests: 9
Number of equations to solve (problem size) : 1000  2000  3000  4000  5000  10000 15000 20000 25000
Leading dimension of array                  : 1000  2000  3000  4000  5000  10000 15000 20000 25000
Number of trials to run                     : 4     4     4     4     4     2     2     2     2    
Data alignment value (in Kbytes)            : 4     4     4     4     4     4     4     4     4    

Maximum memory requested that can be used=705536800, at the size=25000

=================== Timing linear equation system solver ===================

Size   LDA    Align. Time(s)    GFlops   Residual     Residual(norm) Check
1000   1000   4      0.006      114.0584 1.194739e-012 4.074366e-002   pass
1000   1000   4      0.005      124.8665 1.194739e-012 4.074366e-002   pass
1000   1000   4      0.006      102.9154 1.194739e-012 4.074366e-002   pass
1000   1000   4      0.006      108.3124 1.194739e-012 4.074366e-002   pass
2000   2000   4      0.051      104.0321 4.536926e-012 3.946570e-002   pass
2000   2000   4      0.057      93.1630  4.536926e-012 3.946570e-002   pass
2000   2000   4      0.043      125.5055 4.536926e-012 3.946570e-002   pass
2000   2000   4      0.058      91.7913  4.536926e-012 3.946570e-002   pass
3000   3000   4      0.151      119.1956 8.334888e-012 3.209566e-002   pass
3000   3000   4      0.153      117.6245 8.334888e-012 3.209566e-002   pass
3000   3000   4      0.141      127.7614 8.334888e-012 3.209566e-002   pass
3000   3000   4      0.175      102.7084 8.334888e-012 3.209566e-002   pass
4000   4000   4      0.371      115.1468 1.519912e-011 3.312792e-002   pass
4000   4000   4      0.416      102.5799 1.519912e-011 3.312792e-002   pass
4000   4000   4      0.294      145.3543 1.519912e-011 3.312792e-002   pass
4000   4000   4      0.308      138.5620 1.519912e-011 3.312792e-002   pass
5000   5000   4      0.561      148.5110 2.471656e-011 3.446525e-002   pass
5000   5000   4      0.509      163.8585 2.471656e-011 3.446525e-002   pass
5000   5000   4      0.554      150.5717 2.471656e-011 3.446525e-002   pass
5000   5000   4      0.557      149.5816 2.471656e-011 3.446525e-002   pass
10000  10000  4      3.766      177.0737 9.436774e-011 3.327502e-002   pass
10000  10000  4      4.015      166.0853 9.436774e-011 3.327502e-002   pass
15000  15000  4      12.906     174.3740 2.169435e-010 3.416896e-002   pass
15000  15000  4      12.091     186.1307 2.169435e-010 3.416896e-002   pass
20000  20000  4      25.191     211.7482 3.504283e-010 3.102058e-002   pass
20000  20000  4      25.593     208.4220 3.504283e-010 3.102058e-002   pass
25000  25000  4      53.778     193.7201 5.345983e-010 3.040069e-002   pass
25000  25000  4      53.765     193.7694 1.495229e-008 8.502830e-001   pass

Performance Summary (GFlops)

Size   LDA    Align.  Average  Maximal
1000   1000   4       112.5382 124.8665
2000   2000   4       103.6230 125.5055
3000   3000   4       116.8225 127.7614
4000   4000   4       125.4108 145.3543
5000   5000   4       153.1307 163.8585
10000  10000  4       171.5795 177.0737
15000  15000  4       180.2524 186.1307
20000  20000  4       210.0851 211.7482
25000  25000  4       193.7447 193.7694

Residual checks PASSED

End of tests
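
The GFlops column above can be sanity-checked from the Size and Time(s) columns. This sketch assumes the standard LINPACK operation count of 2/3·n³ + 2·n² floating-point operations for solving an n×n system. The function name is made up for illustration, and because the printed times are rounded to three decimals, the result only matches the table approximately.

```python
def linpack_gflops(n, seconds):
    """Estimate GFlops for one run, assuming the standard LINPACK operation count."""
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
    return flops / seconds / 1e9

# (size, time in seconds, GFlops as reported) taken from the table above
rows = [
    (20000, 25.191, 211.7482),
    (25000, 53.778, 193.7201),
]

for n, seconds, reported in rows:
    estimate = linpack_gflops(n, seconds)
    print(f"n={n}: estimated {estimate:.2f} GFlops, reported {reported:.2f}")
```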
 