GTX Titan vs. Tesla K20/K20x

Zarathustra[H]

Hey,

I've been googling my brains out trying to find a good comparison between these cards.

I know the Titan is essentially a limited K20, but I can't find details on what each can do in a way I can compare them.

Has anyone found this anywhere?

Thanks,
M
 
If you're doing CUDA stuff and can afford it, you might as well go Tesla, because they haven't been castrated like the mainstream, gamer-targeted Titan.
 
Why would you need this info?

Why do we need any information? I'm curious. :p

If you're doing CUDA stuff and can afford it, you might as well go Tesla, because they haven't been castrated like the mainstream, gamer-targeted Titan.

Oh I know that the Titan has been limited (presumably in firmware). I'm curious by how much, though.

A friend of mine is a postdoc and also runs a biomedical simulation startup on the side. He does some sort of simulations and was very curious to play with my new Titan when I told him about it.

I'm going to set up a linux boot for him so he can SSH in and run some test programs. I'm just curious how what he will experience on my Titan will compare to what he would experience with a Tesla K20...
 
You can't really compare them; there is a lot more to the K20 than just the card. Different drivers/software, etc.

As for the various applications, well, try them and find out.
 
The trick will be whether the compiler "sees" the Titan as a Tesla. If it doesn't, it will be crippled.
 
I found a ZDnet article that suggests the raw compute power has been left untouched on the Titan both in single and double precision, but that ECC and HyperQ (whatever the hell that is) have been disabled.

Not being familiar enough with compute, I don't know what the significance of this is.
 
I thought I remembered reading somewhere about a separate GPGPU mode you could change in the NVIDIA control panel that would unleash its GPGPU capability in exchange for 3D performance. I can't seem to find info on it now, though. I'm also curious what difference this makes, although personally I wouldn't use it for anything beyond F@H.
 
I thought I remembered reading somewhere about a separate GPGPU mode you could change in the NVIDIA control panel that would unleash its GPGPU capability in exchange for 3D performance. I can't seem to find info on it now, though. I'm also curious what difference this makes, although personally I wouldn't use it for anything beyond F@H.

Yeah, I remember reading the same thing; I'll try to find it and update this post. Either way, on price-to-performance, if you're looking at a card for GPGPU/folding/mining, the 7970 is a more attractive option.
 
Yeah, I remember reading the same thing; I'll try to find it and update this post. Either way, on price-to-performance, if you're looking at a card for GPGPU/folding/mining, the 7970 is a more attractive option.

Yeah, the 7970 provides A LOT of compute performance for the money.

This is true unless you need double precision, in which case the Titan blows everything this side of a Tesla board out of the water.
 
If only we could put gaming drivers on a Tesla (ewwwww), like you could back in the old days with the 6800 series. I'm sure it could be done, but I don't know how to do it.
 
Zarathustra[H];1039662572 said:
I found a ZDnet article that suggests the raw compute power has been left untouched on the Titan both in single and double precision, but that ECC and HyperQ (whatever the hell that is) have been disabled.

Not being familiar enough with compute, I don't know what the significance of this is.

Anandtech's review has a fairly extensive portion dedicated to Titan's Compute Performance:

http://www.anandtech.com/show/6774/nvidias-geforce-gtx-titan-part-2-titans-performance-unveiled/3

From what I've read of the Titan, the disabled portions relate to using the GPU in a distributed GPU environment. If you're using it in a workstation they shouldn't affect you.

In terms of price and performance, the Titan, or the eventual Quadro version of the Titan, is looking really good for a simple workstation compute card (that supports multiple monitors).
 
The main thing is the RAM is different. Teslas have ECC (error-correcting code) RAM, which is needed if you don't want any errors in your calculations when using CUDA.

A normal gamer graphics card won't bother about errors and might draw something or calculate something wrong for a second.
 
The main thing is the RAM is different. Teslas have ECC (error-correcting code) RAM, which is needed if you don't want any errors in your calculations when using CUDA.

A normal gamer graphics card won't bother about errors and might draw something or calculate something wrong for a second.

Just rambling about stuff I have no direct experience with: :D

Not sure about non-ECC causing an issue with single card users.

It's only 6 GB of RAM, and usually inside a metal case. Don't buy a computer case that isn't grounded and doesn't weigh 20 lb or more if you are running important stuff. Bit flips are caused by cosmic radiation. Two millimeters of lead does better than any normal ECC RAM.

Server arrays need ECC. It's important due to the amount of RAM there. Your risk at 6 GB of RAM isn't close to your risk with 1,000 GB of RAM.

And Tesla cards are normally run in arrays for serious crunching. A 4U holds what, 8 Tesla cards? A supercomputer holds hundreds.
 
Zarathustra[H];1039662572 said:
I found a ZDnet article that suggests the raw compute power has been left untouched on the Titan both in single and double precision, but that ECC and HyperQ (whatever the hell that is) have been disabled.

As someone else pointed out earlier with the Anand review, HyperQ isn't disabled. Neither is Dynamic Parallelism, as I've seen in other reviews.

Only certain features that are a subset of those two have been cut out.

If you don't believe Anand, Nvidia corroborates at least some of it.

https://developer.nvidia.com/ultimate-cuda-development-gpu


If you're doing CUDA stuff and can afford it, you might as well go Tesla, because they haven't been castrated like the mainstream, gamer-targeted Titan.

Titan is $1k. The minimum buy-in for a Tesla is $3.3k. Some of the features in the Tesla simply aren't needed for the scope of the workload.
 
If only we could put gaming drivers on a Tesla (ewwwww), like you could back in the old days with the 6800 series. I'm sure it could be done, but I don't know how to do it.

No, it can't be done. Let me show you why. Do you see what's missing?

[attached image: a Tesla board, with no display outputs]
 
Zarathustra[H];1039662572 said:
I found a ZDnet article that suggests the raw compute power has been left untouched on the Titan both in single and double precision, but that ECC and HyperQ (whatever the hell that is) have been disabled.

Not being familiar enough with compute, I don't know what the significance of this is.

Hyper-Q:

Hyper-Q enables multiple CPU cores to launch work on a single GPU simultaneously, thereby dramatically increasing GPU utilization and significantly reducing CPU idle times. Hyper-Q increases the total number of connections (work queues) between the host and the GK110 GPU by allowing 32 simultaneous, hardware-managed connections (compared to the single connection available with Fermi). Hyper-Q is a flexible solution that allows separate connections from multiple CUDA streams, from multiple Message Passing Interface (MPI) processes, or even from multiple threads within a process. Applications that previously encountered false serialization across tasks, thereby limiting achieved GPU utilization, can see a dramatic performance increase without changing any existing code.

taken from: http://www.nvidia.com/content/PDF/kepler/NVIDIA-Kepler-GK110-Architecture-Whitepaper.pdf
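
In code terms, the "false serialization" it's talking about is what happens when several independent chunks of work all get funneled into one hardware queue. A rough sketch of the pattern Hyper-Q helps with (standard CUDA runtime API; busy_kernel is just a made-up dummy so there's something to schedule):

Code:
#include <cstdio>
#include <cuda_runtime.h>

// Dummy kernel: spins long enough that overlap between streams is
// visible in a profiler. Purely illustrative.
__global__ void busy_kernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < 10000; ++k)
            v = v * 1.0000001f + 0.0000001f;
        data[i] = v;
    }
}

int main()
{
    const int kStreams = 8;   // Titan reportedly exposes 8 HW queues, K20/K20X up to 32
    const int n = 1 << 16;

    float *d_buf[kStreams];
    cudaStream_t streams[kStreams];

    for (int s = 0; s < kStreams; ++s) {
        cudaMalloc(&d_buf[s], n * sizeof(float));
        cudaStreamCreate(&streams[s]);
    }

    // Independent work in independent streams: with Hyper-Q these launches
    // can be scheduled concurrently instead of being serialized on one queue.
    for (int s = 0; s < kStreams; ++s)
        busy_kernel<<<(n + 255) / 256, 256, 0, streams[s]>>>(d_buf[s], n);

    cudaDeviceSynchronize();

    for (int s = 0; s < kStreams; ++s) {
        cudaStreamDestroy(streams[s]);
        cudaFree(d_buf[s]);
    }
    printf("launched work in %d streams\n", kStreams);
    return 0;
}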
 
Since I just tested this on a GTX Titan, I thought I'd post. The simpleHyperQ sample in the CUDA 5.0 SDK can do up to 8 streams at a time on the Titan, versus 32 streams on the K20/K20x. Tested this on Ubuntu 12.10 x64, MSI X79A-GD45 (8D) motherboard, NVIDIA drivers 313.18, which detect the Titan as D15U-50. Similar to Windows, nvidia-settings has a CUDA Double Precision checkbox under the PowerMizer settings of the Titan GPU to enable/disable full DP speeds.

Some other differences:
I do not believe there is any overclock support on Linux currently. However, utilities like NVIDIA Inspector and EVGA Precision X are able to overclock the card in Windows.
nvidia-smi settings of application clocks/TCC/ECC are not supported by Titan (they are supported on K20/K20x)
Tesla K20 and K20x run at PCI-E 2.0 speeds... Titan runs at PCI-E 3.0 speeds... take a look at my bandwidthTest results with the system configuration listed above: ;)

Code:
root@Tesla:/usr/local/cuda/samples/1_Utilities/bandwidthTest# ./bandwidthTest --device=0
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: D15U-50
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			11190.6

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			11802.5

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			221383.4
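
If anyone wants to see how the driver reports their own card, here's a quick deviceQuery-style sketch using standard CUDA runtime calls (nothing Titan-specific, just cudaGetDeviceProperties):

Code:
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaGetDeviceCount(&count);

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);

        // Name: the Titan above shows up as "D15U-50" with driver 313.18.
        printf("Device %d: %s\n", dev, prop.name);
        printf("  Compute capability : %d.%d\n", prop.major, prop.minor);
        printf("  Multiprocessors    : %d\n", prop.multiProcessorCount);
        printf("  Global memory      : %zu MB\n", prop.totalGlobalMem >> 20);
        printf("  ECC enabled        : %s\n", prop.ECCEnabled ? "yes" : "no");
        printf("  Memory clock       : %d kHz\n", prop.memoryClockRate);
        printf("  Memory bus width   : %d bits\n", prop.memoryBusWidth);
    }
    return 0;
}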
 
So you can unlock the Titan on Windows to get full DP speeds, just without ECC, as that is hardware-locked/not available?
 
So you can unlock the Titan on Windows to get full DP speeds, just without ECC, as that is hardware-locked/not available?

That is correct, DP can be unlocked in Linux or Windows. The support for ECC/TCC is driver-dependent... basically NVIDIA locks you out of those features, given you're buying a consumer card, to protect their market. Some in the past were able to unlock a GTX 480 into a C2050... see:

https://devtalk.nvidia.com/default/...-c2050-hack-or-unlocking-tcc-mode-on-geforce/

It might or might not be possible to do the same on Titan cards, but it's not an easy process and it's prone to bricking a card... so basically, if you need those advanced features, buy the real deal (K20).
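
If you want to sanity-check whether the DP toggle actually took effect, a rough sketch like this (timing an FP64 FMA loop against an FP32 one with CUDA events) should show FP64 at roughly 1/3 the FP32 rate with the checkbox on, versus around 1/24 with it off on GK110. The kernel sizes below are just numbers picked for illustration:

Code:
#include <cstdio>
#include <cuda_runtime.h>

// Dependent FMA loops in float and double; the ratio of their runtimes
// gives a rough idea of the DP:SP throughput ratio on the card.
__global__ void fma_f32(float *out, int iters)
{
    float a = 1.0f + threadIdx.x * 1e-7f, b = 1.000001f, c = 1e-7f;
    for (int i = 0; i < iters; ++i) a = a * b + c;
    out[blockIdx.x * blockDim.x + threadIdx.x] = a;
}

__global__ void fma_f64(double *out, int iters)
{
    double a = 1.0 + threadIdx.x * 1e-7, b = 1.000001, c = 1e-7;
    for (int i = 0; i < iters; ++i) a = a * b + c;
    out[blockIdx.x * blockDim.x + threadIdx.x] = a;
}

int main()
{
    const int blocks = 256, threads = 256, iters = 1 << 16;
    float  *d_f; double *d_d;
    cudaMalloc(&d_f, blocks * threads * sizeof(float));
    cudaMalloc(&d_d, blocks * threads * sizeof(double));

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0); cudaEventCreate(&t1);

    // Warm-up launches so timing isn't dominated by one-time overhead.
    fma_f32<<<blocks, threads>>>(d_f, iters);
    fma_f64<<<blocks, threads>>>(d_d, iters);
    cudaDeviceSynchronize();

    float ms_f32 = 0.0f, ms_f64 = 0.0f;

    cudaEventRecord(t0);
    fma_f32<<<blocks, threads>>>(d_f, iters);
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);
    cudaEventElapsedTime(&ms_f32, t0, t1);

    cudaEventRecord(t0);
    fma_f64<<<blocks, threads>>>(d_d, iters);
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);
    cudaEventElapsedTime(&ms_f64, t0, t1);

    printf("FP32: %.2f ms, FP64: %.2f ms, FP64/FP32 time ratio: %.1fx\n",
           ms_f32, ms_f64, ms_f64 / ms_f32);

    cudaEventDestroy(t0); cudaEventDestroy(t1);
    cudaFree(d_f); cudaFree(d_d);
    return 0;
}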
 
No, it can't be done. Let me show you why. Do you see what's missing?

[attached image: a Tesla board, with no display outputs]

We had someone on here a couple weeks ago saying he played BF3 on his K20X. Guess how I knew he was full of crap.
 
Since I just tested this on a GTX Titan, I thought I'd post. The simpleHyperQ sample in the CUDA 5.0 SDK can do up to 8 streams at a time on the Titan, versus 32 streams on the K20/K20x. Tested this on Ubuntu 12.10 x64, MSI X79A-GD45 (8D) motherboard, NVIDIA drivers 313.18, which detect the Titan as D15U-50. Similar to Windows, nvidia-settings has a CUDA Double Precision checkbox under the PowerMizer settings of the Titan GPU to enable/disable full DP speeds.

Some other differences:
I do not believe there is any overclock support on Linux currently. However, utilities like NVIDIA Inspector and EVGA Precision X are able to overclock the card in Windows.
nvidia-smi settings of application clocks/TCC/ECC are not supported by Titan (they are supported on K20/K20x)
Tesla K20 and K20x run at PCI-E 2.0 speeds... Titan runs at PCI-E 3.0 speeds... take a look at my bandwidthTest results with the system configuration listed above: ;)

Code:
root@Tesla:/usr/local/cuda/samples/1_Utilities/bandwidthTest# ./bandwidthTest --device=0
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: D15U-50
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			11190.6

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			11802.5

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			221383.4

Interesting! Thank you for this.

Do you know if it is possible to enable/disable full DP speeds in a headless system without X installed?
 
Some Teslas come with monitor ports for workstations; the supercomputer ones do not. The Tesla at our studio, on the big Maya workstation, has DVI ports.

You can output the ones without DVI ports to a monitor, it's just really complicated and needs two cards or a remote video interface over the network.

We can view the render output from a large render farm that has Teslas (and some ATI FirePros) on a huge monitor in the meeting room in our studio.

I may benchmark a Titan on rendering, but I still can't buy a Titan. I can tell you NVIDIA nerfs the other GeForce cards relative to the Quadro line; even slow Quadro cards kill the GeForce 680 at rendering.
 
I don't need studio grade rendering just yet. Still in school for animation. So a Titan is a great balance for work, school and play.
 
I am using my Titan for Iray and other CUDA and OpenCL things at home.
 
I get the feeling that Titans are built on broken Tesla GK110 GPUs, the ones that couldn't cut it for the higher-end cards that cost thousands of dollars more.

Seems like a good way to make some money on bad GK110s.

Is this fathomable?
 
It's called binning, and that's how Intel and AMD manufacture their chips.
 
I get the feeling that Titans are built on broken Tesla GK110 GPUs, the ones that couldn't cut it for the higher-end cards that cost thousands of dollars more.

Seems like a good way to make some money on bad GK110s.

Is this fathomable?

I think that is the EXACT reason that Titans exist. They may only be able to do 95-99% quality, and the K series needs 100%, so that 1% would otherwise force that GK110 die to be disposed of. Now they have a use.
 
I get the feeling that Titans are built on broken Tesla GK110 GPUs, the ones that couldn't cut it for the higher-end cards that cost thousands of dollars more.

Seems like a good way to make some money on bad GK110s.

Is this fathomable?

Not sure about that.
Titan doesn't seem to have anything disabled in hardware. It has the same number of SMXs and CUDA FP64 cores.
 
I think that is the EXACT reason that Titans exist. They may only be able to do 95-99% quality, and the K series needs 100%, so that 1% would otherwise force that GK110 die to be disposed of. Now they have a use.

Not sure about that.
Titan doesn't seem to have anything disabled in hardware. It has the same number of SMXs and CUDA FP64 cores.

Yeah, that's why it doesn't make sense to say "95% quality." For a device full of transistors that perform logical operations, there is no 95%. It is digital. It either works 100% to its specification or it is broken.

Most likely the OP is correct that Nvidia disabled certain compute operations to gimp the cards for certain types of supercomputer applications but still deliver a solid baseline for people just getting into GPU programming.
 
If you're talking a startup doing medical simulations, I would stick with the Tesla.

ECC memory is not a requirement for 99% (or more) of usage. Bit flips are indeed rare, and even without ECC memory, errors will generally be detected. The difference is that ECC offers single-bit error immunity, and it ensures that such errors will be corrected.

The current culture of "a server or workstation requires ECC memory and nothing else" may be unfounded, registered high-density memory aside. But for some workflows, ensuring zero tolerance towards single-bit errors is definitely the best policy. Anything to do with financials is the obvious example, but many or even most medical, engineering, disaster, etc. simulations certainly apply as well.

Many times a small, undetected error (no matter how rare) can be greatly compounded over the length of a long-running simulation. Other times, the risk of having to start over after an error is detected would be unacceptable or costly.

Anyway, your friend will be able to best determine how error tolerant their workflow(s) need to be.
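
As a toy illustration of that compounding point (plain host-side code, made-up numbers, obviously not anyone's real simulation): flip one mantissa bit in the starting value of a simple chaotic recurrence and compare against the clean run.

Code:
#include <cstdio>
#include <cstring>
#include <cstdint>

// Flip one bit of a double's bit pattern (bits 0..51 are the mantissa).
static double flip_bit(double x, int bit)
{
    uint64_t u;
    std::memcpy(&u, &x, sizeof u);
    u ^= (uint64_t(1) << bit);
    std::memcpy(&x, &u, sizeof u);
    return x;
}

int main()
{
    // Toy chaotic recurrence standing in for "a long-running simulation":
    // x <- 3.9 * x * (1 - x). Any tiny perturbation grows exponentially.
    double clean   = 0.25;
    double flipped = flip_bit(0.25, 20);   // single flipped mantissa bit

    printf("initial difference: %.3e\n", flipped - clean);
    for (int step = 1; step <= 60; ++step) {
        clean   = 3.9 * clean   * (1.0 - clean);
        flipped = 3.9 * flipped * (1.0 - flipped);
        if (step % 15 == 0)
            printf("step %2d: clean=%.6f flipped=%.6f diff=%.3e\n",
                   step, clean, flipped, flipped - clean);
    }
    return 0;
}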
 
If you're talking a startup doing medical simulations, I would stick with the Tesla.

ECC memory is not a requirement for 99% (or more) of usage. Bit flips are indeed rare, and even without ECC memory, errors will generally be detected. The difference is that ECC offers single-bit error immunity, and it ensures that such errors will be corrected.

The current culture of "a server or workstation requires ECC memory and nothing else" may be unfounded, registered high-density memory aside. But for some workflows, ensuring zero tolerance towards single-bit errors is definitely the best policy. Anything to do with financials is the obvious example, but many or even most medical, engineering, disaster, etc. simulations certainly apply as well.

Many times a small, undetected error (no matter how rare) can be greatly compounded over the length of a long-running simulation. Other times, the risk of having to start over after an error is detected would be unacceptable or costly.

Anyway, your friend will be able to best determine how error tolerant their workflow(s) need to be.

We actually talked about that.

For his particular application, yeah, most simulations do such a large amount of averaging that a bit error here and there is likely to be less common than a statistical outlier and probably won't even be noticed.

ECC would definitely be better, in most cases, but in this specific case, possibly not.
 
Can someone give me a clear answer on this?

I see mixed responses.

I am considering buying 4 Tesla GPUs, but I'm not sure if they will produce more computing results (Folding@home, Stanford) than consumer-grade equipment, or if it's just about the same.

That said, should I go with 4 K20s or 4 Titans? I love saving money.


I am doing this for free to donate some computing power.
It's for tax reasons. I can classify the power used and the hardware as expenses/donations and save a lot. Love my accountant. :)
 
I am considering buying 4 Tesla GPUs, but I'm not sure if they will produce more computing results (Folding@home, Stanford) than consumer-grade equipment.

You might consider posting this in the distributed computing sub-forum.


I am doing this for free to donate some computing power.
It's for tax reasons. I can classify the power used and the hardware as expenses/donations and save a lot. Love my accountant. :)

You might want to consider getting a second opinion on the issue of deductibility.
 