M76

[H]F Junkie
Joined
Jun 12, 2012
Messages
12,793
Which you don't; if an allocation fails, simply transfer a chunk of it back into normal RAM/virtual RAM to free up space. Imagine if the host OS worked like you describe and BSOD'd every time a memory access failed.
I might have worded it incorrectly. I don't mean the app needs to crash or BSOD. What I don't want an app to do is try to guess the RAM requirement before even trying the task, and refuse it based on that. There are many tasks whose exact RAM requirement depends on the data itself, and can only be determined by actually doing the task.
 

DocNo

Gawd
Joined
Apr 23, 2012
Messages
662
You don't 'need' this capability, but it'll help.

Who are you to argue if people "need" something? I can tell you time is money and even if these cards only shave off 10% of time when working with video they will more than pay for themselves in a matter of months. If they flat out enable editing that doesn't work at all with cards with less money, then that's far from "helping".
 

IdiotInCharge

NVIDIA SHILL
Joined
Jun 13, 2003
Messages
14,679
Who are you to argue if people "need" something? I can tell you time is money and even if these cards only shave off 10% of time when working with video they will more than pay for themselves in a matter of months. If they flat out enable editing that doesn't work at all with cards with less money, then that's far from "helping".

If that's the case, then the users in question would have already purchased a proper professional card.
 

Cerulean

[H]F Junkie
Joined
Jul 27, 2006
Messages
9,476
If you are serious about video production, do yourself a favor and use professional cards. I bet they don't use cheap smartphones to capture 4k video, right?
If you are serious about video production, do yourself a favor by doing research and realize that professional cards have nothing on non-professional cards in terms of rendering performance. Professional cards won't provide any benefit or justification.

EDIT: I did some more digging and found this enlightening on why some would choose a professional card vs. a non-professional one. Non-professional cards have more relaxed tolerances and are not as strong in FP64 calculations. Because of these two factors, the quality of renders is slightly lower than when rendering with a CPU. The other disadvantage against CPUs is that once you run out of VRAM you run out of VRAM, whereas CPUs have a lot of DRAM available to them. For the majority of videos out there, non-professional hardware is more than adequate. But if you are a studio where, due to the risk/impact/demands of your client and audience, there cannot be even the slightest and most minute flaw on any frame, you will want FP64 with minimal tolerance for errors. This means you either render with a CPU or with a professional card.

See Mark Sin's post at https://www.quora.com/Why-are-CPUs-more-important-in-final-rendering-than-GPUs. It is a little lengthy but it is worth the read for the technical minds that want to understand why GPUs output a lower quality than CPUs.
This is a very “it depends” answer, but:

It turns out that many GPUs, especially in earlier days, being designed to render for gaming, may actually take short cuts when rendering DirectX or OpenGL-based graphics. There were web sites that used to compare screen shots from different GPUs (or drivers, which also make a big difference) at different quality and driver settings, vs. the “reference” DirectX or OpenGL renderer, which is CPU-based (so it runs on a common platform independent of GPUs). These short cuts lead to increased frame rates, at the expense of image quality, i.e. display artifacts. However, for gaming purposes, since FPS is key, a little less image quality, or an artifact that may only show up more obviously for just a few frames, out of 60 FPS, is not going to be noticeable by a gamer.

However, this will certainly be noticeable for other 3D animation purposes such as if those frames are going to be “paused” and studied or simply need to look pristine.

Another thing is precision. Even if the GPU is used to run a pixel shader for rendering, to do various effects beyond the capabilities exposed in DirectX/OpenGL, it turns out the GPU may only be capable of, or fastest at, 16-bit or 32-bit precision floating-point math (FP16/FP32). In particular, the GPU may not be capable of 64-bit precision math (FP64) at all, or may be much slower at it than at FP32. For example, "Explaining FP64 performance on GPUs" says FP64 runs at 1/24 the performance of FP32.

It's there, but no one will use FP64 for gaming, only FP32. Now for gaming, trading quality for speed is fine. But not for rendering.
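
To make the precision point concrete, here is a minimal sketch (not taken from the quoted post) that adds 0.1 a million times, once keeping the running total in FP64 and once rounding it to FP32 after every add. The helper names `to_fp32` and `accumulate` are made up for this illustration, and FP32 is emulated on the CPU with Python's `struct` module rather than run on a GPU.

```python
import struct

def to_fp32(x: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def accumulate(step: float, n: int, fp32: bool) -> float:
    """Add `step` to a running total `n` times."""
    total = 0.0
    for _ in range(n):
        total += step
        if fp32:
            # Emulate storing the running total in a 32-bit float.
            total = to_fp32(total)
    return total

N = 1_000_000
EXACT = 100_000.0  # the mathematically exact sum of N * 0.1

err64 = abs(accumulate(0.1, N, fp32=False) - EXACT)
err32 = abs(accumulate(to_fp32(0.1), N, fp32=True) - EXACT)

print(f"FP64 absolute error: {err64:.3e}")
print(f"FP32 absolute error: {err32:.3e}")
```

The FP32 total drifts far more than the FP64 one. A single pixel rarely sums a million terms, but long shader chains and accumulation buffers can build up error the same way.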

It also turned out earlier GPUs didn’t even implement so-called IEEE 754 compliant floating point, or were not as reliable in their floating-point computations. This means there could be, in some cases, more error in the floating-point math than that spec allows. These subtle errors can lead to tiny artifacts in renders. They did so to be fast, and again it wasn’t super-critical for gaming. Now that GPUs are also used for computation, it is important to maintain some accuracy in the calculation, so modern GPUs are much better in this regard.

And there’s also an issue of memory. GPUs only compute renders quickly if everything fits in their video RAM (VRAM), which is directly attached to the GPU. The VRAM chips and the interface are designed for speed more than capacity, so while you typically see say 16–64GB of DRAM on a CPU, the GPUs have more like 2–8GB of VRAM (and this is almost always not user-expandable). Once this VRAM is exhausted the GPU driver has to swap to the DRAM (of the CPU), and in that case the limit is the PCIe interface to the GPU, which is much slower than, say, the CPU’s interface to its own DRAM.
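
As a rough back-of-the-envelope check on that last point, the sketch below compares how long it takes to move spilled data over PCIe versus reading it from system DRAM. The bandwidth figures are theoretical peaks for one assumed configuration (PCIe 3.0 x16 and dual-channel DDR4-3200), and the 4 GB spill size is likewise an assumption. None of these numbers come from the quoted post, and real throughput is lower.

```python
GB = 1e9  # decimal gigabyte, matching how bus bandwidth is usually quoted

# Theoretical peak bandwidths in bytes/second (assumed example hardware).
PCIE3_X16 = 15.75 * GB       # PCIe 3.0 x16, one direction
DDR4_3200_DUAL = 51.2 * GB   # dual-channel DDR4-3200

def transfer_ms(num_bytes: float, bandwidth: float) -> float:
    """Best-case time in milliseconds to move num_bytes at the given bandwidth."""
    return num_bytes / bandwidth * 1000.0

spill = 4 * GB  # hypothetical amount of data that no longer fits in VRAM

pcie_ms = transfer_ms(spill, PCIE3_X16)
dram_ms = transfer_ms(spill, DDR4_3200_DUAL)

print(f"GPU fetching spill over PCIe: {pcie_ms:.0f} ms")
print(f"CPU reading the same from DRAM: {dram_ms:.0f} ms")
```

Even with these optimistic numbers, the PCIe path is several times slower than the CPU reading its own DRAM, and far slower again than the GPU reading local VRAM.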

For intermediate renders, speed is important, so the renderer is more likely to use the GPU’s driver to render. That means there may be subtle artifacts or lower-precision math, and the textures and such have to fit in the available VRAM.

For final renders, quality is important. If the VRAM capacity issue dominates, or if FP64 is needed, it may turn out that, for all the very capable hardware in the GPU, the CPU is faster at FP64 (or is the only device capable of it), has far more DRAM at its disposal, and has more direct bandwidth to that DRAM, to handle the size of the final rendered image and the quality required.
 

DocNo

Gawd
Joined
Apr 23, 2012
Messages
662
If that's the case, then the users in question would have already purchased a proper professional card.

Why? You think businesses like throwing out money they don't need to spend for labels like "proper professional card"?!?

lol - some may, but they usually aren't in business long doing stupid stuff like that.
 

Algrim

[H]ard|Gawd
Joined
Jun 1, 2016
Messages
1,737
Most gaming cards don't have or support ECC memory whereas professional cards can. For the use cases where this matters you don't have much choice. For the workloads that aren't needing such precision gaming cards can become usable or even desirable (overclocked gaming cards can be compute monsters).
 

Tsumi

[H]F Junkie
Joined
Mar 18, 2010
Messages
13,538
Why? You think businesses like throwing out money they don't need to spend for labels like "proper professional card"?!?

lol - some may, but they usually aren't in business long doing stupid stuff like that.

When something is labeled professional, the primary factor is not performance, but stability, reliability, and support. That is what businesses pay for, and often preventing lost working time is worth the additional cost of "professional."
 

IdiotInCharge

NVIDIA SHILL
Joined
Jun 13, 2003
Messages
14,679
When something is labeled professional, the primary factor is not performance, but stability, reliability, and support. That is what businesses pay for, and often preventing lost working time is worth the additional cost of "professional."

Yup. There's a thin margin between doing enough work that a high-end gaming card makes sense, but a professional card doesn't.
 

DocNo

Gawd
Joined
Apr 23, 2012
Messages
662
When something is labeled professional, the primary factor is not performance, but stability, reliability, and support. That is what businesses pay for, and often preventing lost working time is worth the additional cost of "professional."

Not always - it just depends on the company, the product and the costs involved.

The number of "IBM" level businesses that will just blindly pay for things like "Professional" nomenclature is tiny compared to the overall number of smaller businesses out there.

Don't think so? Apple didn't overtake Microsoft in the enterprise.

As with anything there are pros and cons. Maybe in the final render path you put in the pro card - in the editing bay? Lots of friends are over the moon with this new card and the new capabilities it will give them. For them it wasn't a matter of paying more for the pro card, it was a matter of having this card or not since the "pro" cards are simply out of the question.

How that's a bad thing still mystifies me but in forums like these people like you with these arguments exist so here we are I guess.
 

Tsumi

[H]F Junkie
Joined
Mar 18, 2010
Messages
13,538
Not always - it just depends on the company, the product and the costs involved.

The number of "IBM" level businesses that will just blindly pay for things like "Professional" nomenclature is tiny compared to the overall number of smaller businesses out there.

Don't think so? Apple didn't overtake Microsoft in the enterprise.

As with anything there are pros and cons. Maybe in the final render path you put in the pro card - in the editing bay? Lots of friends are over the moon with this new card and the new capabilities it will give them. For them it wasn't a matter of paying more for the pro card, it was a matter of having this card or not since the "pro" cards are simply out of the question.

How that's a bad thing still mystifies me but in forums like these people like you with these arguments exist so here we are I guess.

Again, we are not saying it's a bad thing. We are saying its purpose is extremely limited, and doesn't do much good towards gamers, which the majority of us on here are. It's not a halo card, it's not a great value card, it's a meh card at an even more meh price that offers only one special thing to an extremely small niche of people. I don't understand how hard that is to comprehend for you.

While the number of businesses that "blindly" pay for things (I assure you, they don't, they have the cost-benefit already figured out) might be small, the volume they purchase is far from insignificant. Apple may not have taken over enterprise, but they do offer a lot of enterprise level services, which goes to say that enterprise cannot be ignored.

I mean, we gave AMD hell for Bulldozer. It offered great multi-threaded performance for the price, but was so meh everywhere else that it was slammed. Same thing here. Great 4K video rendering capabilities for the price, meh everywhere else.
 

gamerk2

[H]ard|Gawd
Joined
Jul 9, 2012
Messages
1,972
That's a somewhat simple way of looking at it. If you have a chunk of data that is 8GB on its own, and only have 6GB of VRAM, you will have to split the data into smaller pieces before working on it. Even if you had 12GB of VRAM you may have to split it if the original and modified data don't all fit into memory. In some cases, that can have an extremely detrimental effect on the time of the operation, where it might not be worth it to even try.

First off, you pretty much never work with the entire buffer in one go. You typically only need to allocate MB at a time. Secondly, you only need to break up the data in the cases where it won't fit, so there's zero performance loss otherwise. And I again note the alternative is "don't do it at all", so arguing performance is kind of the definition of ironic.
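
A minimal sketch of that "only split when it doesn't fit" idea, in plain Python with no GPU involved. The names `chunks`, `process` and `invert`, and the `budget` parameter standing in for free VRAM, are hypothetical and chosen for illustration. It assumes an element-wise operation, where every chunk can be processed independently. Effects that need neighboring pixels or other frames would need overlapping chunks and are not covered here.

```python
from typing import Callable, Iterator

def chunks(data: bytes, budget: int) -> Iterator[memoryview]:
    """Yield zero-copy views of `data`, each at most `budget` bytes long."""
    view = memoryview(data)
    for start in range(0, len(view), budget):
        yield view[start:start + budget]

def process(data: bytes, budget: int,
            kernel: Callable[[memoryview], bytes]) -> bytes:
    """Run `kernel` over `data`, splitting only if it exceeds `budget`."""
    if len(data) <= budget:
        # Everything fits: one pass, no splitting overhead.
        return kernel(memoryview(data))
    # Too large: process piece by piece and stitch the results together.
    return b"".join(kernel(chunk) for chunk in chunks(data, budget))

def invert(block: memoryview) -> bytes:
    """Example element-wise kernel: invert every 8-bit sample."""
    return bytes(255 - b for b in block)

frame = bytes(range(256)) * 4                      # 1024 bytes of fake image data
whole = process(frame, budget=4096, kernel=invert)  # fits, single pass
split = process(frame, budget=100, kernel=invert)   # does not fit, 11 chunks
print(whole == split)
```

Both paths produce the same bytes. The only cost of the chunked path is the extra passes, which is the trade-off being argued about in this thread.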
 

gamerk2

[H]ard|Gawd
Joined
Jul 9, 2012
Messages
1,972
I might have worded it incorrectly. I don't mean the app needs to crash or BSOD. What I don't want an app to do is try to guess the RAM requirement before even trying the task, and refuse it based on that. There are many tasks whose exact RAM requirement depends on the data itself, and can only be determined by actually doing the task.

Which is how ALL memory management works. An application requests a memory allocation of X size from HW, and either gets a memory address to the start of a block of RAM or an error if the request cannot be met. All Adobe needs to do here is move some data out of VRAM if a memory request fails and do the request again (which is pretty much how paging works at the OS level).
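
The "move some data out and ask again" loop can be sketched as a toy simulation. This is not how Adobe, any driver, or any graphics API actually implements it. `VramPool`, `OutOfVram` and the oldest-first eviction policy are all assumptions made up for the example, and sizes are abstract units rather than real bytes.

```python
from collections import OrderedDict

class OutOfVram(Exception):
    """Raised when a request cannot be satisfied from the pool."""

class VramPool:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.resident: "OrderedDict[str, int]" = OrderedDict()  # in VRAM, oldest first
        self.spilled: dict = {}                                  # moved to system RAM

    def used(self) -> int:
        return sum(self.resident.values())

    def _try_alloc(self, name: str, size: int) -> None:
        """Single allocation attempt: succeed or raise, never evict."""
        if self.used() + size > self.capacity:
            raise OutOfVram(name)
        self.resident[name] = size

    def alloc(self, name: str, size: int) -> None:
        """Allocate, evicting the oldest buffers and retrying on failure."""
        if size > self.capacity:
            raise OutOfVram(name)  # can never fit, so do not evict anything
        while True:
            try:
                self._try_alloc(name, size)
                return
            except OutOfVram:
                # Spill the oldest resident buffer to system RAM, then retry.
                victim, victim_size = self.resident.popitem(last=False)
                self.spilled[victim] = victim_size

pool = VramPool(capacity=8)
pool.alloc("clip_a", 4)
pool.alloc("clip_b", 3)
pool.alloc("effect", 4)   # does not fit, so clip_a is spilled first
print(list(pool.resident), pool.spilled)
```

The hard part a real application has to solve, and which this sketch ignores, is choosing which buffer to evict and paying the transfer cost when it is needed again.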
 

Nobu

Supreme [H]ardness
Joined
Jun 7, 2007
Messages
7,981
First off, you pretty much never work with the entire buffer in one go. You typically only need to allocate MB at a time. Secondly, you only need to break up the data in the cases where it won't fit, so there's zero performance loss otherwise. And I again note the alternative is "don't do it at all", so arguing performance is kind of the definition of ironic.
It's not an easy problem to solve. When you're working with large data sets (rendering 4k video, which means working with uncompressed image data, lots of textures, meshes, etc), the program cannot always know how much data will be required to render a frame in advance, and a lot of data needs to be in memory in order to complete the operation.
https://devtalk.nvidia.com/default/...error-thrown-by-the-driver-instead-of-opengl/
https://blender.stackexchange.com/q...of-memory-how-to-identify-the-problem-objects
Edit: and from a comment on Dan's question:
http://blender.stackexchange.com/a/61421/1853

Look at it this way: you have a timeline with say two 4k videos, an effect, and a transition between the two. You also have a 1080p video embedded in one of the two streams, using a green-screen effect. You need to have the 1080p video, the two 4k videos, the previous rendered frame (maybe multiple), the 4k and 1080p combined frame, and the current render buffer in video memory. You also have to use video memory for each of the individual operations (sometimes pixel granularity, sometimes multiple pixel, sometimes the whole frame). Sometimes an effect requires you to render the next frame before the current one or at the same time. You have to do this 60 times a second (or more), and you want them to guess the amount of vram needed each time? Or fetch a frame from system memory each time?
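
To put rough numbers on that scenario, here is a small calculation of the per-frame working set. It assumes UHD 4K (3840x2160), 8-bit RGBA at 4 bytes per pixel, and exactly one buffer per item listed above. These figures are illustrative assumptions. Real pipelines often use higher bit depths, floating-point intermediates and multiple previous frames, so treat this as a lower bound.

```python
BYTES_PER_PIXEL = 4  # assumed 8-bit RGBA

def frame_bytes(width: int, height: int) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * BYTES_PER_PIXEL

UHD = frame_bytes(3840, 2160)  # one 4K frame
FHD = frame_bytes(1920, 1080)  # one 1080p frame

# One buffer each: two 4K sources, the embedded 1080p clip, one previous
# rendered frame, the combined 4K+1080p frame, and the current render buffer.
buffers = {
    "4k_source_a": UHD,
    "4k_source_b": UHD,
    "1080p_overlay": FHD,
    "previous_frame": UHD,
    "combined_frame": UHD,
    "render_target": UHD,
}

working_set = sum(buffers.values())
per_second = working_set * 60  # traffic if every buffer were refetched at 60 fps

print(f"per-frame working set: {working_set / 2**20:.1f} MiB")
print(f"refetching everything at 60 fps: {per_second / 1e9:.2f} GB/s")
```

Under these assumptions the working set itself is small compared with a multi-gigabyte card. The pressure comes from effect scratch space, look-ahead frames and decoded source material piling up on top of it, and from how much of it would have to cross the bus every frame if it were not resident.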

The alternative is reducing the data size, removing effects, changing encoding settings (possibly reduced quality), or doing rendering on the CPU.
 
Joined
May 3, 2016
Messages
63
lol - if Linus Tech Tips owns multiple Red cameras and shoots all their content in 4K then there are a shitload of people out there who could benefit from these cards.

It's easy to dismiss things you are ignorant of. It's why teenagers are smarter than their parents ;)

Because one company totaling a use for say a half dozen of these cards possibly is such a huge market indicator. I think the kiddies don't know what a mass market product is.
 

gamerk2

[H]ard|Gawd
Joined
Jul 9, 2012
Messages
1,972
It's not an easy problem to solve. When you're working with large data sets (rendering 4k video, which means working with uncompressed image data, lots of textures, meshes, etc), the program cannot always know how much data will be required to render a frame in advance, and a lot of data needs to be in memory in order to complete the operation.
https://devtalk.nvidia.com/default/...error-thrown-by-the-driver-instead-of-opengl/
https://blender.stackexchange.com/q...of-memory-how-to-identify-the-problem-objects
Edit: and from a comment on Dan's question:
http://blender.stackexchange.com/a/61421/1853

Look at it this way: you have a timeline with say two 4k videos, an effect, and a transition between the two. You also have a 1080p video embedded in one of the two streams, using a green-screen effect. You need to have the 1080p video, the two 4k videos, the previous rendered frame (maybe multiple), the 4k and 1080p combined frame, and the current render buffer in video memory. You also have to use video memory for each of the individual operations (sometimes pixel granularity, sometimes multiple pixel, sometimes the whole frame). Sometimes an effect requires you to render the next frame before the current one or at the same time. You have to do this 60 times a second (or more), and you want them to guess the amount of vram needed each time? Or fetch a frame from system memory each time?

The alternative is reducing the data size, removing effects, changing encoding settings (possibly reduced quality), or doing rendering on the CPU.

The failure in thinking here is that all these need to be in VRAM at the same time; yes, shuffling data across the PCI-E bus can certainly slow the operation down, but that's still preferable to the application crashing and getting the work done never.

Both the OS and graphics APIs have mechanisms for recovering from memory allocation errors; it's up to the developer to use them properly.
 