Robberbaron and (cf)Eclipse preach the truth about A64's and memory

robberbaron

Supreme [H]ardness
Joined
Sep 6, 2004
Messages
6,101
Alright. The point of this thread is to dispel the myths, rumors, misconceptions, and insecurities people have regarding 1T vs 2T timings and memory speed. Hopefully this will mean fewer threads asking for "memory that does 1T" and fewer people worrying over the 4x double-sided DIMMs issue.

This benchmark is with a Socket 754 Clawhammer at 2.5GHz.

One setting I used was 10x250MHz 1:1. The other was 11x227 5:6 with 192MHz ram speed.


And Sciencemark 2.0:
Code:
10x250 1T
Mol Dyn        929.63
Primordia      811.37
cipher bench  1008.29
blas bench    1192.42

10x250 2T
Mol Dyn        833.31
Primordia      804.80
cipher bench   984.38
blas bench    1012.97


11x227 1T
Mol Dyn        953.12
Primordia      859.01
cipher bench  1158.53
blas bench    1199.84

11x227 2T
Mol Dyn        830.57
Primordia      850.04
cipher bench  1131.82
blas bench    1185.05
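
For anyone who wants the 1T vs 2T gap as a percentage rather than raw scores, here is a minimal sketch that works it out from the numbers above. It assumes a higher ScienceMark score is better (so a negative change is a loss); the names `SCORES`, `pct_change` and `command_rate_deltas` are made up for this illustration and are not part of any benchmark tool.

```python
# ScienceMark 2.0 scores from the table above, stored as (1T, 2T) pairs.
# Assumption: higher score is better, so a negative change means a loss.
SCORES = {
    "10x250": {
        "Mol Dyn": (929.63, 833.31),
        "Primordia": (811.37, 804.80),
        "cipher bench": (1008.29, 984.38),
        "blas bench": (1192.42, 1012.97),
    },
    "11x227": {
        "Mol Dyn": (953.12, 830.57),
        "Primordia": (859.01, 850.04),
        "cipher bench": (1158.53, 1131.82),
        "blas bench": (1199.84, 1185.05),
    },
}


def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def command_rate_deltas(scores):
    """Percent change going from 1T to 2T for every setting and bench."""
    return {
        setting: {bench: pct_change(t1, t2) for bench, (t1, t2) in benches.items()}
        for setting, benches in scores.items()
    }


for setting, benches in command_rate_deltas(SCORES).items():
    print(setting)
    for bench, delta in benches.items():
        print(f"  {bench:<14}{delta:+6.1f}%")
```

Run as-is, it prints one percentage per bench per setting, which makes it easier to see which tests barely move and which take a double-digit hit.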

So aside from memory-specific benchmarks, we see the greatest performance hit from latency and memory bandwidth in gaming and Molecular Dynamics.
In 10x250, we lose 3% Aquamark performance when switching to 2T.
In 11x227, we lose 4.5% Aquamark performance when switching to 2T. Not a very big deal if getting 2 gigs means having to use 2T.

Counter-Strike lost a relatively small amount of framerate, especially since I ran the benchmark with a +15 LOD bias, 640x480 resolution, and bare minimum detail. To give you an idea of image quality:


Yeah. Ouch. So at higher resolutions, when the GPU is actually stressed, I don't think you'll even feel the hit. Also, since many people are finding Socket 939 systems rewarding, with their dual cores, dual channel, and dual PCI-E, there will be even less of a hit there due to the almost doubled write bandwidth.


As for molecular dynamics, well, hopefully if you're going to be messing with that, your university will back you up with a nice computer anyway. ;)
 
ALL RIGHT! I'm finally done. First off, here's the raw data for those who like numbers more than silly graphs ;)

The rig that was used:

DFI Lanparty UT NF3-250gb
Mobile Athlon 64 3700+
XFX 6800GT
2x512mb Crucial Ballistix pc3200
WD 80 + 250gb HDD
assorted cooling :p


I chose to use 3 ram speeds while holding the cpu speed constant. The settings are as follows:

Setting 1
HTT/FSB = 250
CPU Multi = 11
CPU Mhz = 2750
RAM ratio = 1:1
RAM Mhz = 250
Timings = 2.5-2-2-10
HT link multi = 3

Setting 2
HTT/FSB = 229
CPU Multi = 12
CPU Mhz = 2748
RAM ratio = 1:1
RAM Mhz = 229
Timings = 2.5-2-2-10
HT link multi = 3

Setting 3
HTT/FSB = 229
CPU Multi = 12
CPU Mhz = 2748
RAM ratio = 5:6
RAM Mhz = 183
Timings = 2-2-2-10
HT link multi = 3
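
A side note on where the RAM Mhz figures come from, since 229 x 5/6 would naively give about 191 rather than 183. The sketch below assumes the commonly described Athlon 64 behaviour, which is not stated in this thread: the memory clock is the CPU clock divided by a whole-number divisor, and that divisor is the CPU multiplier divided by the RAM ratio, rounded up. The function name `effective_ram_mhz` is made up for this example.

```python
import math
from fractions import Fraction


def effective_ram_mhz(htt_mhz, cpu_multi, ratio):
    """Estimate the real RAM clock on an Athlon 64.

    Assumption (not from the thread): RAM clock = CPU clock / divisor,
    where divisor = ceil(cpu_multi / ratio) must be a whole number.
    """
    cpu_mhz = htt_mhz * cpu_multi
    divisor = math.ceil(Fraction(cpu_multi) / ratio)
    return cpu_mhz / divisor


# (HTT, CPU multi, RAM ratio) for the three settings listed above.
SETTINGS = {
    "Setting 1": (250, 11, Fraction(1, 1)),
    "Setting 2": (229, 12, Fraction(1, 1)),
    "Setting 3": (229, 12, Fraction(5, 6)),
}

for name, (htt, multi, ratio) in SETTINGS.items():
    print(f"{name}: {effective_ram_mhz(htt, multi, ratio):.1f} MHz")
```

Under that assumption, Setting 3 works out to 2748 / 15, which rounds to the 183 listed above, while the 1:1 settings simply land on the HTT speed.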

Then, as part of the quest, RB and I tried each setting with 1T and 2T, to show that the latency hit (about 10-15% for me) is insignificant. That's why the graphs have 6 sets of data for each bench.


I will order the benches from what I feel is most "synthetic" to least. First up:

Everest memory tests
Nothing to really comment on here, although I thought the difference in latency between 250x11 and 229x12 at the same timings was quite curious.
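
One possible contributor to that latency gap, offered as a rough sketch rather than a measured explanation: timings are counted in memory clock cycles, so the same 2.5-2-2-10 takes longer in nanoseconds at 229MHz than at 250MHz. The snippet converts only the CAS figure; the latency Everest reports includes a lot more than CAS, so treat this as illustration only. The helper name `cycles_to_ns` is made up here.

```python
def cycles_to_ns(cycles, ram_mhz):
    """Convert a timing given in memory clock cycles to nanoseconds."""
    return cycles * 1000.0 / ram_mhz


# (label, RAM MHz, CAS latency in cycles) for the three settings tested.
SETTINGS = [
    ("Setting 1", 250, 2.5),
    ("Setting 2", 229, 2.5),
    ("Setting 3", 183, 2.0),
]

for label, ram_mhz, cas in SETTINGS:
    print(f"{label}: CL{cas} @ {ram_mhz}MHz = {cycles_to_ns(cas, ram_mhz):.2f} ns")
```

By this arithmetic, CL2.5 at 229MHz and CL2 at 183MHz come out nearly identical in absolute terms, while CL2.5 at 250MHz is roughly a nanosecond quicker.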

1_97.png






SiSoft Sandra:
This is... very, very, very CPU dependent. Bandwidth and latency have no effect on the CPU scores (go figure! haha)

1_99.png






ScienceMark:
This is one of those benches that may or may not be useful to you, depending on what you do. For those people who run this kind of stuff, well, it's obvious: RAM speed does not have much of an effect on the benchmark.


1_100.png





3dmark05:
Self-explanatory. The final score is very GPU dependent, which is why there's a CPU score, which I have included :D
It seems somewhat sensitive to memory speed, but only when the RAM speed drops below 200MHz, where it takes a decently large hit. It's still pretty small when you consider a 17% drop from 250MHz at 1T to 183MHz at 2T.

1_94.png





Aquamark:
Again, two parts, but this time I only included GFX and CPU into the graph. Check the raw data for the final score if you care.
This scales more normally with bandwidth than 3dmark does, but doesn't dramatically drop off. Still about the same 17-18% drop from 250mhz at 1T to 183mhz at 2T.

1_95.png





SuperPI
This is known to be very latency sensitive. Seems to have lived up to its reputation on that part.

1_101.png





Cinebench
This is a useful tool for those who plan on doing rendering. Bandwidth and latency don't seem to affect it in the least.

1_96.png





Real World apps:
This one takes a bit more explaining. In Terragen, I made up and saved a world, so that I would be rendering the same thing every time. This is what I rendered.
For the music conversion, I converted the Delerium - Poem album from 256kb/s MP3 to 192kb/s Ogg Vorbis with dBpoweramp 10.1.
Both programs had a timer, and indicated how long the task took to complete. I recorded this time.

From this data in conjunction with the Cinebench results, it is obvious that rendering is very CPU based, and shows no gains from various memory speeds and latencies. Music conversion is the same way, with results that did not vary by more than 2 seconds.

1_102.png







WELL! there you have it. From this, and baron's results, it seems that the typical 'benchmarks' are the only programs that see any difference from memory speed and latency.
Games show a small difference, but nothing really amazing.
Music conversion.. yeah ;)
Rendering programs are entirely CPU limited.
 
Basically, unless you overclock hardcore or time everything you do, you won't notice any difference!
 
Borgschulze said:
Basically, unless you overclock hardcore or time everything you do, you won't notice any difference!
even if you do overclock hardcore, you might want 2t timings if it meant more headroom and then meant more performance.
 
mikelz85 said:
even if you do overclock hardcore, you might want 2t timings if it meant more headroom and then meant more performance.
Possibly, maybe I can get more out of this CPU with 2T Command Rate...
Maybe hit 300HTT stable? That would be nice.
 
or just use a ram divider. this is the purpose of this. my results will be even more dramatic than baron's, showing very very little difference between memory speeds on the arguably more limited single channel setup. if this is how small the difference is with s754, take away any bandwidth limitation and you have s939
 
I would like to see the same numbers on a fairly newer processor such as the Venice or San Diego, please. I know you guys out there love your 754's, but that's old news to me :cool: :p
 
is newark new enough? ;)
just hold out like another hour or two... benching this much takes fooorrrrreverrrr :(


also, those of you with s939 who feel like doing this, feel free to :p
 
I just want to see what the impact would be with Dual Channel RAM, the newer memory controller, SSE3 if there is an impact, etc etc.


:)
 
USMC2Hard4U said:
I just want to see what the impact would be with Dual Channel RAM, the newer memory controller, SSE3 if there is an impact, etc etc.


:)

Well the dual channel will have the most impact. Taking away the write bandwidth limitation will remove most damage from 2T I suspect.
 
Nice work guys. The thread title is sorta tacky but nice work nonetheless :p j\k.

USMC2Hard4U said:
I just want to see what the impact would be with Dual Channel RAM, the newer memory controller, SSE3 if there is an impact, etc etc.


:)

The dual channel RAM will of course have the most impact on performance, because the memory controller isn't really improved that much over the 90nm Winchester cores. In fact, in certain cases it's a little slower. SSE3 won't help at all except in the few cases where you're running an application that has been coded to use it.

(cf)Eclipse said:
not sure. i've long denounced sse1/2/3 as being pretty useless.. for amd at least. intel can actually take advantage of it with its higher clock speeds and less efficient x87 FP unit
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2330&p=5

in the bottom part. it's very very insignificant, but there :p

SSE2 can be coded to do things much faster than x87 can. AMD has a stronger x87 FPU, but that doesn't mean it's better to use x87 over SSE2. The lack of SSE2 support is just one of the reasons why the Athlon XP usually fell behind the Pentium 4 in application performance, because a lot of today's applications have been optimized for SSE2. x87 won't even be supported in the new Longhorn OS when it comes out. I can't see how SSE2 would have any kind of real impact on gaming performance that would actually matter. Its benefits far outweigh any negative effects, just like HyperThreading on the Pentium 4.
 
SSE2 is critical to Folding@home. So in certain apps it is very important.
 
*cries over wasted money on expensive tccd* very nice job tho, quite thorough. however, no one seems to go past 250mhz. I'd like to see 200mhz 2-2-2-5 vs 275mhz 2.5-3-3-7. I feel the relaxed latencies required to go over ~230 don't pay off till 260mhz+; if this were possible, I feel it would give a better result. If msi ever releases a proper bios for my mobo, perhaps I will try and run similar tests. my cpu does 2.75ghz fine, so I could do 250x11 with ram at 208, and 275x10 to take the cpu out of the question. that is, if in fact it's a bios issue and not just the cpu's inability to handle high ht.
 
arnemetis said:
*cries over wasted money on expensive tccd* very nice job tho, quite thorough. however, no one seems to go past 250mhz. I'd like to see 200mhz 2-2-2-5 vs 275mhz 2.5-3-3-7. I feel the relaxed latencies required to go over ~230 don't pay off till 260mhz+; if this were possible, I feel it would give a better result. If msi ever releases a proper bios for my mobo, perhaps I will try and run similar tests. my cpu does 2.75ghz fine, so I could do 250x11 with ram at 208, and 275x10 to take the cpu out of the question. that is, if in fact it's a bios issue and not just the cpu's inability to handle high ht.
They were just talking about 1t command rates, but when I was talking to my dad about how cool the winbond memories @ 260 2-2-2 looked, he linked me to an article on chipgeek showing that scaling the memory (actual dram frequency/latency) on the A64 doesn't change performance nearly as much as simply scaling the CPU clock speed. Obviously the on chip memory controller is why. I must say though, he still did get me 2 sticks of twinmos for my big "2 0" last week, I made him get the refurb though, was $20 cheaper, but I'm more curious to see how well it performs. Maybe I'll slowly change the attitude that refurbs are crap oc'ers, though I could end up stuck @ 220 mhz as well.

On another note, do you guys find that 2t vs. 1t. is a factor in highest dram frequency? Or does it differ from winbond to samsung?
 
mikelz85 said:
On another note, do you guys find that 2t vs. 1t. is a factor in highest dram frequency? Or does it differ from winbond to samsung?

I think with Samsung you MAY hit higher memory speeds, but this is due to the memory controller being bogged down at whatever speed you found to be a ceiling at 1T.

Any more speed you get with 2T probably wouldn't make a big enough difference.
 
I guess that would make sense. I've also read a few things here and there (nothing that I wouldn't take with a grain of salt) about rev e's having "poor" memory controllers, or at least noticeably different ones. Is this a real issue when it comes to running higher memory frequencies with higher cpu frequencies? Is it the same for all rev e processors?
 
mikelz85 said:
I guess that would make sense. I've also read a few things here and there (nothing that I wouldn't take with a grain of salt) about rev e's having "poor" memory controllers, or at least noticeably different ones. Is this a real issue when it comes to running higher memory frequencies with higher cpu frequencies? Is it the same for all rev e processors?

Apparently, issues between RevE and TCCD are bios problems on the DFI board. Up until very recently (5/10 revision I believe) the Ultra-D couldn't handle RevE chips with TCCD very well. But, once people got this bios and relearned how to tweak memory on the RevE memory controller, all was good. I think if anything, the memory controller is different.
 
so a dfi board has the same issue, that a bios update fixed? *prays msi fixes it too* so about this re-learning how to tweak the memory controller... exactly what needs to be changed? the basic settings 2.5-3-3-7 1t are understood... but in a64 tweaker I notice LOTS of different numbers, are those what you speak of? I'm anxious to get my system optimized. ok, bandwidth means nothing, but I wasted all this $, may as well use it. Thanks for the help.
 
arnemetis said:
so a dfi board has the same issue, that a bios update fixed? *prays msi fixes it too* so about this re-learning how to tweak the memory controller... exactly what needs to be changed? the basic settings 2.5-3-3-7 1t are understood... but in a64 tweaker I notice LOTS of different numbers, are those what you speak of? I'm anxious to get my system optimized. ok, bandwidth means nothing, but I wasted all this $, may as well use it. Thanks for the help.

I don't have TCCD anymore (sold it because my TwinMOS SP murdered it performance-wise), nor do I have a DFI nForce4 board or RevE chip.

However, at the http://Xtremesystems.com forum, you can find a lot of info in the Xtreme Bandwidth and AMD forums. Also, try PM'ing Centvalny (in HardForum; Dumo in XtremeSystems) since he's experienced with this issue.
 
Stickified. Good read, robberbaron and eclipse.

Try cleaning up the first post a bit though.
 
TechHead said:
Stickified. Good read, robberbaron and eclipse.

Try cleaning up the first post a bit though.

I'll be able to do that once my new system is up. All of my raw data is on a SATA harddrive, and this dell only has PATA :(
 
Hmm, good read, thanks guys. This makes me feel better about running 4 double-sided 512mb DIMMs..

I have to run 2T and a 5:6 divider. Although, I upped the HT a bit and got my memory to like 205MHz, 2-3-3-11... Not too bad for 2GB of DDR..
 
expensive ram is a waste of money for those not needing to bench competitively. One thing is for sure: more cheaper ram in your machine is 10X better than less $$$$ ram.

many of the BF2 troubles are directly due to lack of RAM. 1GB does not cut it now.
 
freeloader1969 said:
HeHe....and I bought robberbaron's board & memory :)

Yep, and I just shipped it. Oh, the memories I have with that board and ram. You're going to have a lot of fun.
 
I can get 3gb in my DFI nf3 250gb at ddr333 I can get the memory up to about 200-215mhz at 2t...

so... I am wondering if it is worth it over 2gb since it would be a pretty high 60ns or so latency.

I mean I can only get about 51 or 52ns stable with 2gb... so... really it is not even that much slower...

I do a lot of heavy modeling and other such things... I don't know... sure would be nice to turn the page file off for good.
 
mikelz85 said:
They were just talking about 1t command rates, but when I was talking to my dad about how cool the winbond memories @ 260 2-2-2 looked, he linked me to an article on chipgeek showing that scaling the memory (actual dram frequency/latency) on the A64 doesn't change performance nearly as much as simply scaling the CPU clock speed. Obviously the on chip memory controller is why. I must say though, he still did get me 2 sticks of twinmos for my big "2 0" last week, I made him get the refurb though, was $20 cheaper, but I'm more curious to see how well it performs. Maybe I'll slowly change the attitude that refurbs are crap oc'ers, though I could end up stuck @ 220 mhz as well.

On another note, do you guys find that 2t vs. 1t. is a factor in highest dram frequency? Or does it differ from winbond to samsung?

Man... U really have a cool dad... I hope my son grows up to like computers so I can be one as well :D
 
Nice comparisons, especially the second post, but I'd like to see more games tested. I accidentally traded my RAM for some lower speed RAM (and can't overclock the RAM with my mb), so that comparison gives me an idea what kind of performance I lose going from 200MHz to 166MHz.
 
I think I may have a go of it as well and make a graph of CPU speed/memory performance vs. memory frequency/memory performance, especially because I'm a tad bit down after getting my x800pro vivo refurb last night. Riva tuner says no unlock, and a bios flash to pro16 pipe bios yielded no display on post, had to reflash to get it working again, but it was $198 shipped and OC's to 550mem 533core, hopefully a voltmod and my monster $5 coolermaster 'sink sandwich yield even faster clocks.
 
Staples said:
Nice comparisons, especially the second post but I'd like to see more games tested.
one of these days i'll get fraps, go through and add some real game benches
 
robberbaron said:
I don't have TCCD anymore (sold it because my TwinMOS SP murdered it performance-wise), nor do I have a DFI nForce4 board or RevE chip.
.

So, what did you hook up with now? I'm sure it's going to be a sweet new rig. :D
 
(cf)Eclipse said:
one of these days i'll get fraps, go through and add some real game benches

You should do this! It's fun as hell too!

So does anyone want to buy a gig of TCCD off me? I'm not even using it for what it's worth because I'm running 4x512 sticks at like 2-3-3-7 or something.. 200mhz
 
bump because I want to read this later and don't want to go looking for it :).
 