nvidia or amd for color accuracy and calibration features

I notice that at the row for digital value 1, the output for green is 191 and blue is 193, so it seems to be specifying a luminance correction to within only 2 points out of 65 thousand! Whether dithering at 8-bit is sufficient to create such a small step in luminance is something I simply don't know.

No. Assuming your LUT is 8 bit, then anything from 0-255 will be treated identically. Anything from 256-511 will be treated identically, and so on.

If your LUT is 10 bit, then anything from 0-63 will be treated identically. Anything from 64-127 will be treated identically, and so on.
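To put rough numbers on that (a quick Python sketch, assuming the LUT entries are specified as 16-bit integers and simply truncated to the hardware's precision, as described above):

def effective_step(value, lut_bits):
    # hypothetical helper: which hardware step a 16-bit LUT entry lands on,
    # assuming plain truncation to the LUT's precision
    return value >> (16 - lut_bits)

# the green/blue values 191 and 193 mentioned earlier:
print(effective_step(191, 8),  effective_step(193, 8))    # 0 0 -> identical with an 8-bit LUT
print(effective_step(191, 10), effective_step(193, 10))   # 2 3 -> they happen to straddle a 10-bit step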
 
No. Assuming your LUT is 8 bit, then anything from 0-255 will be treated identically. Anything from 256-511 will be treated identically, and so on.

If your LUT is 10 bit, then anything from 0-63 will be treated identically. Anything from 64-127 will be treated identically, and so on.

Yep, makes sense :)

I'm guessing AMD is sampling a value for each 64-value chunk to create its 10-bit table, and then dithering that spatially to create the final 8-bit output.
 
I'm not sure why you think there needs to be any dithering.

Your LUT is still only 256 entries. Give me an example of where you think dithering comes into the picture.
 
I'm not sure why you think there needs to be any dithering.

Your LUT is still only 256 entries. Give me an example of where you think dithering comes into the picture.

Well, it's just not logically possible to change a value in the table without reducing the total number of shades, which causes banding. Dithering can solve this issue by creating more intermediate shades.

Here is a ramp of only 4 shades
[attached image: 4-shade ramp, undithered]


Now here is also 4 shades, but dithered
[attached image: 4-shade ramp, dithered]


Dithering is pretty incredible actually :eek:
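For anyone who wants to play with this, here's a rough numpy sketch of the idea - ordered dithering with a Bayer matrix, purely as an illustration, not what any particular driver does:

import numpy as np

H, W = 64, 256
ramp = np.tile(np.linspace(0.0, 1.0, W), (H, 1))       # smooth 0..1 grayscale ramp

levels = 4                                              # only 4 output shades
banded = np.round(ramp * (levels - 1)) / (levels - 1)   # straight quantisation -> 4 visible bands

# 4x4 Bayer threshold matrix, offsets centred on zero
bayer = (np.array([[ 0,  8,  2, 10],
                   [12,  4, 14,  6],
                   [ 3, 11,  1,  9],
                   [15,  7, 13,  5]]) + 0.5) / 16.0 - 0.5
offsets = np.tile(bayer, (H // 4, W // 4))

# ordered dither: nudge each pixel by a sub-step offset before quantising
dithered = np.clip(np.round(ramp * (levels - 1) + offsets), 0, levels - 1) / (levels - 1)

print(len(np.unique(banded)), len(np.unique(dithered)))   # both images still use only 4 shades

# averaged over small neighbourhoods (roughly what the eye does at a distance),
# the dithered version reproduces many intermediate shades
blocks = dithered.reshape(H // 4, 4, W // 4, 4).mean(axis=(1, 3))
print(len(np.unique(np.round(blocks, 3))))                # far more than 4 effective shades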
 
Well, it's just not logically possible to change a value in the table without reducing the total number of shades, which causes banding. Dithering can solve this issue by creating more intermediate shades.

I think we might be talking at cross wavelengths or something here.

Let's take an example of a 256 entry 8 bit LUT. Each value in the LUT will be specified with a number between 0 and 65,535. By default, the LUT is linearized, and the values will be in increments of 256 (65,536 / 256 entries = 256).

So the first entry will have a value of 0, the second a value of 256, the third, a value of 512, etc. all the way up to 65,535.

Now, if you change the value of the first entry, from 0 to any number between 0 and 255 (inclusive), absolutely nothing will change. You'll still have the same image when you present, for example, an 8 bit grayscale ramp (256 shades of gray from black to white).

If you change the value of the first entry to, say, 300, you will now have crushed the first two levels together. This first entry will now map onto exactly the same luminance as the second level. You now only have 255 effective shades of gray.

If you have a 256 entry 10 bit LUT, the linearized default LUT will still be the same as in the 8 bit case, but now you have some flexibility. You can now map each entry onto luminance values that are intermediate between these original 256 luminances. This flexibility will allow you to do things like gamma adjustment without crushing levels together (which is one cause of banding). In essence, you have a palette of 1024 luminance levels to choose from, when deciding the luminance levels of each of the 256 entries.

Dithering is not required at all here for this flexibility.

Of course, dithering can improve upon whatever flexibility you already have, but that's a separate issue.
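A quick numpy sketch of the flexibility point - counting distinct output luminances after a small gamma adjustment, with the 2.4/2.2 exponent picked arbitrarily for illustration:

import numpy as np

i = np.arange(256)
target = (i / 255.0) ** (2.4 / 2.2)    # a mild gamma adjustment

lut8  = np.round(target * 255)         # entries limited to a palette of 256 luminances
lut10 = np.round(target * 1023)        # entries chosen from a palette of 1024 luminances

print(len(np.unique(lut8)))            # fewer than 256: some levels crushed together
print(len(np.unique(lut10)))           # 256: every entry still gets its own luminance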
 
If you have a 256 entry 10 bit LUT, the linearized default LUT will still be the same as in the 8 bit case, but now you have some flexibility.

I don't understand why reducing the precision of each entry from 65536 to 1024 would result in more flexibility.


Dithering is not required at all here for this flexibility.

But without dithering, every shade will still have to snap back to the nearest of the 256 output shades when the final 8-bit (256 steps) output is being derived from it, which will still cause banding.
 
I don't understand why reducing the precision of each entry from 65536 to 1024 would result in more flexibility.

Reducing the precision from 65536 to 1024 is more flexible than reducing it from 65536 to 256. The precision was never 16 bit to begin with, other than in terms of nominal specification. The effective precision is 1024 in the 10 bit case, and 256 in the 8 bit case. By effective precision, I mean the number of distinct luminances that can be derived from it.

Having a high nominal precision may have other potential benefits, though. I think having more precision means you can store, in memory, the results of cumulative transformations to the LUT - something to do with floating point and rounding errors? But this is a separate issue from the sort of flexibility required to reduce banding when doing things like gamma adjustments.



But without dithering, every shade will still have to snap back to the nearest of the 256 output shades when the final 8-bit (256 steps) output is being derived from it, which will still cause banding.

Are you talking about in the 8 bit LUT case or the 10 bit LUT case?

In the 8 bit case, yes, absolutely there will be banding if you try to create a luminance function that differs in any way from the default one. In the 10 bit case, you have more options.

In the 8 bit case, you have 256 different luminance levels to choose from (for the final 256 "slots"). Assuming you want a monotonically increasing luminance curve, there is exactly one possible luminance function that you can have. The only way you can get different luminance functions is by sacrificing slots by crushing levels together (i.e. banding).

In the 10 bit case, you have 1024 different luminance levels to choose from (for the final 256 slots). I don't have the mathematical tools to figure out how many unique, monotonically increasing luminance functions this allows for, but it's probably in the trillions, if not way more.
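If "no crushed levels" means strictly increasing, the count is just a binomial coefficient - a quick Python sketch:

import math

# strictly increasing choices of 256 luminance levels out of the available palette
print(math.comb(1024, 256))   # 10-bit palette: on the order of 10**248 - rather more than trillions
print(math.comb(256, 256))    # 8-bit palette: exactly 1, matching the previous paragraph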

Are you assuming that the display itself is only capable of the same 256 shades no matter what? If so, then yes, you'd need dithering in the 10 bit case. I'm really only familiar with CRTs and analogue stuff (and even then my knowledge is somewhat precarious). But in the case of a CRT, I think the values are sent out as voltages, the amplitude of each voltage being proportional to the effective LUT value. So, in the 10 bit LUT case, at each moment in time, one of a set of 256 unique voltages will be sent through the cable (ultimately resulting in the intensity of the target pixel). The 10 bit flexibility means that the amplitudes of these 256 voltages can be chosen from a set of 1024 voltages, in advance.

I don't know how things work in the digital realm.
 
Are you assuming that the display itself is only capable of the same 256 shades no matter what? If so, then yes, you'd need dithering in the 10 bit case. I'm really only familiar with CRTs and analogue stuff (and even then my knowledge is somewhat precarious). But in the case of a CRT, I think the values are sent out as voltages, the amplitude of each voltage being proportional to the effective LUT value.

Ah that explains it :)

You are getting a true 1024 step 10-bit signal being displayed on your CRT, whereas I am getting no more than 8-bit 256 steps on my LCD.
 
I think we might be talking at cross wavelengths or something here.
Let's take an example of a 256 entry 8 bit LUT. Each value in the LUT will be specified with a number between 0 and 65,535. By default, the LUT is linearized, and the values will be in increments of 256 (65,536 / 256 entries = 256).
So the first entry will have a value of 0, the second a value of 256, the third, a value of 512, etc. all the way up to 65,535.
Now, if you change the value of the first entry, from 0 to any number between 0 and 255 (inclusive), absolutely nothing will change. You'll still have the same image when you present, for example, an 8 bit grayscale ramp (256 shades of gray from black to white).
If you change the value of the first entry to, say, 300, you will now have crushed the first two levels together. This first entry will now map onto exactly the same luminance as the second level. You now only have 255 effective shades of gray.
If you have a 256 entry 10 bit LUT, the linearized default LUT will still be the same as in the 8 bit case, but now you have some flexibility. You can now map each entry onto luminance values that are intermediate between these original 256 luminances. This flexibility will allow you to do things like gamma adjustment without crushing levels together (which is one cause of banding). In essence, you have a palette of 1024 luminance levels to choose from, when deciding the luminance levels of each of the 256 entries.
Dithering is not required at all here for this flexibility.
Of course, dithering can improve upon whatever flexibility you already have, but that's a separate issue.
What you describe is the reason why your processing needs to have greater precision than the output to be effective.
If you adjust values using the same precision as your output, any changes that you make will create banding because you cannot create intermediate values.
If you use greater precision than your output, you can use dither to display the intermediate shades.
If you do not dither, the output will have banding even if you use processing with greater precision than the output.

Here are some examples: [attached images]
If we brighten up the images, these differences are far more apparent: [attached images]
That software (madVR video renderer) only allows you to select 10-bit or 16-bit processing, so I can't show you what happens when you process an 8-bit source in 8-bit, but I think the results should be clear from these examples.
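And a small numeric sketch of the same idea (floating point standing in for the higher-precision processing, and simple random dither - just the principle, not what madVR actually does):

import numpy as np

rng = np.random.default_rng(0)
ramp8 = np.tile(np.arange(256, dtype=np.float64), (256, 1))   # 8-bit source ramp

# process with more precision than the 8-bit output, e.g. a mild gamma adjustment
processed = 255.0 * (ramp8 / 255.0) ** (2.4 / 2.2)

out_plain  = np.round(processed)                                  # back to 8-bit, no dither
out_dither = np.floor(processed + rng.random(processed.shape))    # back to 8-bit with random dither

print(len(np.unique(out_plain)))     # fewer than 256 distinct codes -> banding
print(len(np.unique(out_dither)))    # close to 256 codes, and more importantly:

# column averages of the dithered output track the high-precision values,
# so the intermediate shades survive as spatial mixes
print(np.abs(out_plain.mean(axis=0)  - processed[0]).max())   # close to 0.5 (plain rounding error)
print(np.abs(out_dither.mean(axis=0) - processed[0]).max())   # noticeably smaller - the error averages out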

Are you assuming that the display itself is only capable of the same 256 shades no matter what?
Yes, that's how it works with an 8-bit output. You only get 256 shades.

I'm really only familiar with CRTs and analogue stuff (and even then my knowledge is somewhat precarious).
I cannot confirm it myself, but it would appear that the output is 10-bit over VGA. Most (all?) GPUs seem to use a 10-bit RAMDAC.
So any behavior that you see when using your CRT is equivalent to a 10-bit digital display, not an 8-bit one.
 
Ah that explains it :)

You are getting a true 1024 step 10-bit signal being displayed on your CRT, whereas I am getting no more than 8-bit 256 steps on my LCD.


I cannot confirm it myself, but it would appear that the output is 10-bit over VGA. Most (all?) GPUs seem to use a 10-bit RAMDAC.
So any behavior that you see when using your CRT is equivalent to a 10-bit digital display, not an 8-bit one.

Right, but within each framebuffer, there are only 256 possible levels available. In other words, I'm not, as far as I know, able to display a grayscale ramp that has 1024 different levels in the same image (unless I use MadVR Full screen directx exclusive mode).

If my output was 10 bit through and through, then how come I'm unable to get 1024 simultaneous shades of gray in a frame using OpenGL? I remember there was this small application that would render an image, and it would only work if your system was 10 bit - can't remember if it was an eizo or nvidia application, but it was supposed to work with the Quadro. Anyway, it wouldn't even run on my system.

Also, if my system truly is 10 bit, I should be able to specify a 1024 entry LUT, right? How would I go about doing that?
 
Right, but within each framebuffer, there are only 256 possible levels available. In other words, I'm not, as far as I know, able to display a grayscale ramp that has 1024 different levels in the same image (unless I use MadVR Full screen directx exclusive mode).
If my output was 10 bit through and through, then how come I'm unable to get 1024 simultaneous shades of gray in a frame using OpenGL? I remember there was this small application that would render an image, and it would only work if your system was 10 bit - can't remember if it was an eizo or nvidia application, but it was supposed to work with the Quadro. Anyway, it wouldn't even run on my system.
If we go back to my post about bit-depths:
  • The source is 8-bit
  • The processing is 10-bit
  • The output is 10-bit
  • The display is 10-bit
Standard windows applications are all 8-bit. They can only use 256 steps.
The exception is full-screen exclusive D3D applications which can output 10-bit.
Or that weird OpenGL mode that was only supported on Windows 7 with certain applications, with professional graphics cards, when the desktop compositor was disabled.

Using processing/output/display with more than 8-bit precision allows you to modify an 8-bit image to a certain extent without losing any steps of gradation.

If everything was 8-bit, you would lose steps of gradation with any changes that you make.

Also, if my system truly is 10 bit, I should be able to specify a 1024 entry LUT, right? How would I go about doing that?
Again: don't think of the number of steps as the LUT bit-depth.
That is just the number of points that a LUT has.
A 256-point 10-bit LUT would have 256 steps from 0-100%, where each value can be 0-1023
The 0-1023 range may be scaled up to 16-bit values when actually creating the profile, but only 10-bit precision would be used.

As I said previously, using that many points generally harms image quality. Using more points in the LUT is not always better.
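A tiny sketch of that scaling step, just to show that storing the values as 16-bit numbers adds no extra precision:

ten_bit_values = range(1024)                                  # every value a 10-bit LUT entry can take
as_16bit = {round(v * 65535 / 1023) for v in ten_bit_values}  # scaled up for storage in the profile
print(len(as_16bit))                                          # 1024 - still only 10-bit precision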
 
Does this mean that in DVI-D, for example, there is only a single 8 bit word for the value of each channel for each pixel?
I find your wording confusing, but yes?
With an 8-bit output, you can only specify 8-bit values.
 
If we go back to my post about bit-depths:
  • The source is 8-bit
  • The processing is 10-bit
  • The output is 10-bit
  • The display is 10-bit




Standard windows applications are all 8-bit. They can only use 256 steps.
The exception is full-screen exclusive D3D applications which can output 10-bit.
Or that weird OpenGL mode that was only supported on Windows 7 with certain applications, with professional graphics cards, when the desktop compositor was disabled.

So surely I should be able to do anything a Quadro can (with respect to 10 bit rendering), right? If I disable the desktop compositor, then I should be able to create 10 bit grayscale ramps in Photoshop, right?

Again: don't think of the number of steps as the LUT bit-depth.
That is just the number of points that a LUT has.
A 256-point 10-bit LUT would have 256 steps from 0-100%, where each value can be 0-1023
The 0-1023 range may be scaled up to 16-bit values when actually creating the profile, but only 10-bit precision would be used.

Yes, I understand this, but if I want to render images that have 1024 different shades of gray simultaneously, the lookup table has to be 1024 entries long, no?
 
So surely I should be able to do anything a Quadro can (with respect to 10 bit rendering), right? If I disable the desktop compositor, then I should be able to create 10 bit grayscale ramps in Photoshop, right?
This feature is locked to pro-level GPUs. Unless you are using one, you can't enable it.
Photoshop supports 8-bit, 16-bit, and 32-bit editing modes. So you can already create a grayscale ramp with more than 256 values in it.

The 10-bit output only affects what is sent to the display, not the precision that images are rendered with.

Yes, I understand this, but if I want to render images that have 1024 different shades of gray simultaneously, the lookup table has to be 1024 entries long, no?
You could have a 2-point profile with a 10-bit output that just specifies new black and white points.
At some point - and I'm not sure where (the profile creation software, the LUT loader, etc.) - that profile will be scaled to the output bit-depth.
Yes, if you want to display 1024 shades of gray on the display, the GPU needs a 10-bit LUT.
But the process of creating a profile for that does not require 1024 points of measurement.
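Conceptually something like this - not the actual profile format, just a hypothetical 2-point profile expanded by interpolation to a 1024-entry output LUT:

import numpy as np

black, white = 40, 1000        # hypothetical new black and white points, as 10-bit codes
lut = np.round(np.linspace(black, white, 1024)).astype(int)   # 2-point profile scaled to the 10-bit LUT

print(lut[:4], lut[-4:])       # two points are enough to define the whole ramp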
 
This feature is locked to pro-level GPUs. Unless you are using one, you can't enable it.
Photoshop supports 8-bit, 16-bit, and 32-bit editing modes. So you can already create a grayscale ramp with more than 256 values in it.

I'm confused. You just said that everything except for my source was 10 bit. But now I need something to be unlocked to get 10 bit grayscale ramps? Would this feature be a fifth element in your list of signal chain components?

I'm also confused by the fact that you say I can already create grayscale ramps with more than 256 values. If that's the case, what would unlocking this fifth element allow me to do that I wouldn't be able to otherwise?

(not trying to be difficult, I really appreciate this discussion :) )
 
@spacediver
There is no proof NV does any dithering at all. Dithering in 10bpc DirectX applications can come from DirectX itself.

Other than that, there is no proof that on VGA output 10bpc applications are actually rendered in 10bpc rather than just using the same dithering as on any other 8bpc monitor. Does the driver report the analogue mode as 10bpc-capable? If not, then what are we even discussing here?

And how does anything related to analog outputs relate to G-Sync monitors? They obviously use DP and not VGA input.

@flossy_cake
I was unable to test my friend's GTX 970 at the moment.
Imho, however, the chances that it uses dithering like the Radeons do are zero, so I would not hold my breath.

You need to see if the monitor you are interested in has 10bpc support. If it has, there will be no issue; if not, there will be banding after calibration. It is very simple.
 
Photoshop supports 8-bit, 16-bit, and 32-bit editing modes. So you can already create a grayscale ramp with more than 256 values in it.

I'm trying to do this in Photoshop but can't get it to work. I've set the bit depth to 32-bits per channel, and the colour picker is still limited to only 0-255, and creating a gradient only makes 0-255. Are there some other options I need to set up as well?
 
weird, this software called Bloom markets itself as "16 bit per channel everything"

Bloom is a purely 16-bit per channel application. Everything, including images, shape colors, masks, and any other color data is stored with 16 bits per channel. This means you never lose precision and color accuracy, allowing for the most vibrant and precise images to be created.

Yet the screenshots tell a different story.

If it's 16 bit everything, shouldn't you be able to specify 65,536 levels per channel, rather than the 256 shown?
 
I'm confused. You just said that everything except for my source was 10 bit. But now I need something to be unlocked to get 10 bit grayscale ramps? Would this feature be a fifth element in your list of signal chain components?
Yes it would be prior to the "source" section, which I should really call the "source application's output bit-depth".
Photoshop itself will be rendering in 8, 16, or 32-bit.
Photoshop's output to the desktop compositor is limited to 8-bit.

There is an OpenGL mode which allows Photoshop to bypass the desktop compositor and output 10-bit to the display instead, but this only works on Windows 7, requires the compositor (Aero) to be disabled, and only works with professional GPUs. (Quadro or FirePro/FireGL)
This is very likely a driver lock-in, rather than a hardware limitation.
I'm not sure if this mode is still supported on newer versions of Adobe's applications either?

I'm also confused by the fact that you say I can already create grayscale ramps with more than 256 values. If that's the case, what would unlocking this fifth element allow me to do that I wouldn't be able to otherwise?
If you have a 16-bit or 32-bit image, using any of the gradient tools will generate a 16-bit or 32-bit ramp.
The software does not expose 16-bit values to the user, however, and you can only select values from 0-255 in the color picker. (which are scaled to their 16/32-bit equivalent)
My guess is that they don't see a need for anyone to have that much precision for the "pixel-editing" tools, and it is a lot easier for artists to keep the values consistent.

To check this, I created two gradient files:
The first used the gradient tool to create an undithered gradient in an 8-bit image.
The second used the tool to create an undithered gradient in a 16-bit image.
Both were then saved as PNG files.

To verify these, I loaded the PNG files into madVR, which is able to display them using a 10-bit output to my display.

Ignore that there is some banding still visible with the 16-bit gradient.
I don't have the time to investigate that, or to reshoot those photos, right now.
It may be that I forgot to disable some processing in madVR when I disabled dither, or it may be that generating an undithered gradient requires the image to be wider than 1920px to be completely smooth when you have 65536 possible values.
Whatever the reason, you can clearly see that the gradient is a lot smoother, so Photoshop is creating a true 16-bit gradient even if it does not expose those values to the user.

EDIT: Obviously it's because I did not follow my own advice and dither was disabled when converting from 16-bit to 10-bit. :rolleyes:
I disabled dither with the 8-bit image because I did not want there to be any processing applied at all, and forgot to re-enable it.
So there you go, it's also proof that 10-bit alone is not enough to avoid banding.
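If anyone wants to reproduce the effect without Photoshop, here's a rough numpy sketch of why a 16-bit gradient still shows faint banding on a 10-bit output when dither is off (assuming a 1920px-wide gradient):

import numpy as np

width = 1920
g8  = np.round(np.linspace(0, 255,   width))    # 8-bit gradient: at most 256 distinct codes
g16 = np.round(np.linspace(0, 65535, width))    # 16-bit gradient: a distinct code per column

print(len(np.unique(g8)), len(np.unique(g16)))  # 256 vs 1920

# reduce the 16-bit gradient to a 10-bit output without dither:
out10 = np.round(g16 / 65535 * 1023)
print(len(np.unique(out10)))                    # 1024 codes across 1920 columns, so some adjacent
                                                # columns share a code - the faint banding seen above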

There is no proof NV does any dithering at all. Dithering in 10bpc DirectX applications can come from DirectX itself.
With a 10-bit D3D application, and a 10/12-bit output in the NVIDIA Control Panel, there is no dither being applied.
You only get dither with a 10-bit application, and an 8-bit output.
I expect that it would be the driver which handles this, since the D3D application's output is still 10-bit.

You need to see if the monitor you are interested in has 10bpc support. If it has, there will be no issue; if not, there will be banding after calibration. It is very simple.
10-bit without dither can still have banding. It just has a lot less than 8-bit without dither.
 
thanks for the reply zone.

I'm trying to get my head around this driver lock in business.

Here's where my thinking and understanding currently is.

Digital context:

Suppose you have a video card that has a 10 bit LUT.
This card is connected to a digital 10 bit display via a digital connection.
This means that the pixel value for each channel is encoded in the output as a 10 bit string, which can therefore take on any of 1024 values, and the display will be able to accept these values and display any of 1024 values.

Apparently, AMD cards can do this, which is why you can do things like gamma correction and still have 256 distinct shades of gray (so long as you don't change the gamma function radically).

My first question:

Do you need to disable the desktop compositor to be able to do the above?

Now, suppose you want to render an image that has all 1024 shades of gray shown simultaneously. Suppose you create or find an opengl application for this. This application allows you to create a patch, and instead of the usual 256 shades of gray, you can now pick from 1024.

My second question is this:

If it's already possible to send out 10 bit information in the graphics pipeline to the display (as it must be, if you are able to make use of gamma correction), then how would a driver be able to lock away this feature?

My third question:

There seems to be a distinction between being able to send out 10 bit information over a graphics pipeline (which is all that is necessary for being able to do gamma correction while maintaining 256 distinct shades), and being able to encode 1024 distinct shades within a single frame buffer. What is the proper terminology that respects this distinction? They're both 10 bit, but the second one seems like a "stronger" 10 bit.
 
10-bit without dither can still have banding. It just has a lot less than 8-bit without dither.
I am talking solely about an 8bit source + gamma correction, and in this case 10bit - whether available due to the monitor itself or done by the Radeon - is sufficient to not lose any gradation step, and thus to have no additional banding compared to not doing any color correction.
 
I am talking solely about an 8bit source + gamma correction, and in this case 10bit - whether available due to the monitor itself or done by the Radeon - is sufficient to not lose any gradation step, and thus to have no additional banding compared to not doing any color correction.

It depends on what your target luminance function is, relative to the natural Electro-Optical Transfer Function (EOTF) of the display.

For example, if the natural EOTF is around a gamma of 2.2, then trying to linearize the gamma (gamma = 1.0) with only 10 bits is not going to be sufficient if you want to preserve 256 distinct shades.

But if you want to go to 1.8 or 2.6, then there'll be no problem.
 
Do you need to disable the desktop compositor to be able to do the above?
The GPU LUT runs a level above the desktop compositor.
Anything going through the desktop compositor (standard desktop applications) is limited to 8-bit, or 256 values.
Those 256 values can then be placed anywhere in the GPU's 10-bit LUT (1024 values) before being output to the display.

What you can't do, is pass 1024 values through to the LUT from a standard desktop application.
For that you need a full-screen exclusive D3D application (bypasses the compositor automatically) or to use that OpenGL mode on Windows 7 which requires you to disable the compositor, and even then it can only be enabled on professional GPUs - whether they are NVIDIA or AMD.
That mode only works on Windows 7 because Microsoft removed the option to disable the compositor with the release of Windows 8.

I don't know why AMD or NVIDIA only enable the use of that mode on their professional GPUs, other than keeping it as a selling point for them, but the only applications I know of which use it at all are Adobe's Creative Suite ones. It doesn't seem like it would be difficult for them to block that mode on consumer GPUs.
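To put the compositor/LUT relationship in pseudo-numpy terms (a sketch, not actual driver code):

import numpy as np

frame8 = np.arange(256)                                                 # every value a desktop app can output
lut10  = np.round(1023 * (frame8 / 255.0) ** (2.4 / 2.2)).astype(int)   # 256-entry GPU LUT with 10-bit entries

out10 = lut10[frame8]      # still only 256 distinct values per frame,
                           # but each one can sit anywhere on the 0-1023 output scale
print(len(np.unique(out10)), out10.min(), out10.max())   # 256 0 1023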

My third question:
There seems to be a distinction between being able to send out 10 bit information over a graphics pipeline (which is all that is necessary for being able to do gamma correction while maintaining 256 distinct shades), and being able to encode 1024 distinct shades within a single frame buffer. What is the proper terminology that respects this distinction? They're both 10 bit, but the second one seems like a "stronger" 10 bit.
I'm not sure that there is a specific term for this.
As I said, currently, the desktop compositor is limited to 8-bit so the assumption is that all applications will be limited to outputting 8-bit values.
To send 10-bit values to your display, you need to select a 10-bit output in the drivers. For VGA, this appears to be automatic. With digital connections, you need to select a 10-bit output.
 
The GPU LUT runs a level above the desktop compositor.
Anything going through the desktop compositor (standard desktop applications) is limited to 8-bit, or 256 values.
Those 256 values can then be placed anywhere in the GPU's 10-bit LUT (1024 values) before being output to the display.

What you can't do, is pass 1024 values through to the LUT from a standard desktop application.

Ok this is starting to make a bit more sense.

So, in the (digital) case of AMD:

Compositor = 8 bit. LUT = 10 bit. Video output = 10 bit.

The 8 bit compositor bottleneck means that only 256 unique values can be used in each framebuffer, but you can still have 10 bit precision for these values.


I don't know why AMD or NVIDIA only enable the use of that mode on their professional GPUs, other than keeping it as a selling point for them, but the only applications I know of which use it at all are Adobe's Creative Suite ones. It doesn't seem like it would be difficult for them to block that mode on consumer GPUs.

I find it bizarre that nobody can just write a simple full screen exclusive directx application that allows for 10 bit graphics rendering. Even a simple demo application that allows people to test out things like rendering 1024 shades.

To send 10-bit values to your display, you need to select a 10-bit output in the drivers. For VGA, this appears to be automatic. With digital connections, you need to select a 10-bit output.

In the case of VGA, does it even make sense to speak of 10 bit values? After all, unlike in the digital case, where I'm assuming the information that represents the "intensity" for each pixel for each channel is a string of ones and zeros 10 characters long, in the analogue case, it's simply a voltage.
 
Ok this is starting to make a bit more sense.
So, in the (digital) case of AMD:
Compositor = 8 bit. LUT = 10 bit. Video output = 10 bit.
The 8 bit compositor bottleneck means that only 256 unique values can be used in each framebuffer, but you can still have 10 bit precision for these values.
Correct.

I find it bizarre that nobody can just write a simple full screen exclusive directx application that allows for 10 bit graphics rendering. Even a simple demo application that allows people to test out things like rendering 1024 shades.
Well the madVR video renderer uses it, and so does the game Alien: Isolation.
Adobe probably aren't using it because their applications are cross-platform, and DirectX is exclusive to Windows.

And don't forget that you can only output >8-bit with a full-screen exclusive application.
You can't do this in a window, so it would not be very useful in many applications.
Lightroom is probably the only Adobe application I can think of that would be feasible to run in full-screen exclusive mode for most people.

What really needs to happen is for Microsoft to make it possible to get >8-bit from the compositor with standard applications, instead of being limited to using that D3D mode.

In the case of VGA, does it even make sense to speak of 10 bit values? After all, unlike in the digital case, where I'm assuming the information that represents the "intensity" for each pixel for each channel is a string of ones and zeros 10 characters long, in the analogue case, it's simply a voltage.
Well you have to send digital information to the RAMDAC for it to convert the signal to an analog voltage.
And it would appear that this is typically 10-bit.
 
Well the madVR video renderer uses it, and so does the game Alien: Isolation.

Right, that reminds me. Gonna have to go back and try to reproduce your tests.


What really needs to happen is for Microsoft to make it possible to get >8-bit from the compositor with standard applications, instead of being limited to using that D3D mode.

Kind of embarrassing that in 2015 this isn't the case.


Zone, seriously, thanks for all your patient education here.
 
Right, that reminds me. Gonna have to go back and try to reproduce your tests.
It will be interesting to see if applications can actually pass through 10-bit data over VGA, or if you only get 8-bit data with a 10-bit LUT.

Kind of embarrassing that in 2015 this isn't the case.
Now that 10-bit displays are becoming more common, it would be nice if Microsoft did something about it.

Zone, seriously, thanks for all your patient education here.
No problem.
 
In the case of VGA, does it even make sense to speak of 10 bit values? After all, unlike in the digital case, where I'm assuming the information that represents the "intensity" for each pixel for each channel is a string of ones and zeros 10 characters long, in the analogue case, it's simply a voltage.
It is only valid for images that are processed fully in analog mode. Output from digital sources is digital and has a bit-depth.

Not so long ago (20 and more years ago :D ) we had marvelous video modes such as e.g. "15bit", which were basically 5 bits per channel. 256-color modes had a palette with a similar limitation. So back then the same CRT VGA monitors that can now display 10bit per color were limited to 5bit.

And 10bit is the most that you will ever get on a CRT because there are no higher bit-depth DACs out there. Graphics cards are limited to 10bit and that is that, this is the limit. And frankly 10bit is well enough imho.
 
It is only valid for images that are processed fully in analog mode. Output from digital sources is digital and have bit-depth.

I was referring to the actual information that is physically encoded in the cable going to the display.

And 10bit is most that you will ever get on CRT because there are no higher bitdepth DACs out there. Graphics cards are limited to 10bit and that is that, this is the limit. And frankly 10bit it well enough imho.

Not true. This piece of hardware has a 16 bit DAC. We have one in our lab, though it's not connected to anything right now. And 10 bit is certainly not enough if you're interested in psychophysical procedures such as measuring human contrast sensitivity with a linearized gamma, or you're working in HDR.
 
Ok this is starting to make a bit more sense.

So, in the (digital) case of AMD:

Compositor = 8 bit. LUT = 10 bit. Video output = 10 bit.

The 8 bit compositor bottleneck means that only 256 unique values can be used in each framebuffer, but you can still have 10 bit precision for these values.

I took this from some other forum a long time ago when I was trying to figure out why my 10bit direct drive monitor (HP ZR30w) looked like shit (b-b-b-banding) on Nvidia after calibration but not on AMD.
Here's a small sample of the start of my monitor color profile (it's actually a text file!):

0.00000000 0.00000000 0.00000000 0.00000000
0.00392160 0.00115970 0.00012207 0.00000000
0.00784310 0.00402840 0.00308230 0.00296030
0.01176500 0.00695810 0.00608830 0.00610360
0.01568600 0.00993360 0.00915540 0.00929270
0.01960800 0.01294000 0.01228400 0.01252800
0.02352900 0.01600700 0.01545700 0.01583900
0.02745100 0.01912000 0.01867700 0.01919600
0.03137300 0.02226300 0.02197300 0.02262900
0.03529400 0.02548300 0.02531500 0.02612300

This is called a gamma ramp / curve.

Range: 0 --> 1, for the sake of simplicity I will assume in my examples that the range is 0 --> 255 (8-bit per color channel).

The graphics card loads the matrix into a look-up table, where it basically converts the input color signals to the corrected output signals on-the-fly.

Now here's where it gets confusing:

AMD's consumer cards support 10-bits per color channel. Nvidia's cards don't; they only support 8-bits per color channel. Now, how is this relevant when you have an 8-bit per color channel monitor, you might ask?

Look up at the matrix. Notice how many digits there are after the decimal point. These numbers contain enough precision that even 16 bits per color channel do not cover them. So that means 8 or 10 bits are taken, and the rest are dumped (not considered).

Here's how it looks when mapped from 0 --> 255:

0 0 0 0
1 0.295724 0.0311279 0
2 1.02724 0.785987 0.754876
3 1.77432 1.55252 1.55642
4 2.53307 2.33463 2.36964
5 3.2997 3.13242 3.19464
6 4.08179 3.94154 4.03895
7 4.8756 4.76263 4.89498
8 5.67707 5.60311 5.77039
9 6.49816 6.45533 6.66137

Here's what it looks like when only 8 bits are taken (after rounding is performed):

0 0 0 0
1 0 0 0
2 1 1 1
3 2 2 2
4 3 2 2
5 3 3 3
6 4 4 4
7 5 5 5
8 6 6 6
9 6 6 7

Duplicate numbers in sequence? Inability to display certain shades? Banding? Whoops.

Here's how it looks when 10 bits are taken (after rounding is performed):

0 0 0 0
1 0.25 0.00 0.00
2 1.00 0.75 0.75
3 1.75 1.50 1.50
4 2.50 2.50 2.50
5 3.25 3.25 3.25
6 4.00 4.00 4.00
7 5.00 4.75 5.00
8 5.75 5.50 5.75
9 6.50 6.50 6.75
10 7.50 7.50 7.50

Fewer gaps, smoother curve, less banding. Awesome.

BUT...monitor is still 8-bit...sooo what AMD does is use dithering from the 10-bit LUT to the 8-bit monitor. Banding is gone!
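You can reproduce those tables from the profile rows quoted above - a rough Python sketch for the red channel (small rounding differences aside):

# red channel from the profile rows above
red = [0.00000000, 0.00115970, 0.00402840, 0.00695810, 0.00993360,
       0.01294000, 0.01600700, 0.01912000, 0.02226300, 0.02548300]

for i, v in enumerate(red):
    full  = v * 255                # mapped onto the 0-255 scale
    bit8  = round(full)            # 8-bit: whole steps only
    bit10 = round(full * 4) / 4    # 10-bit: quarter steps on the 0-255 scale
    print(i, round(full, 5), bit8, bit10)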

But afaik that isn't completely true anymore, because AMD can use a 12 bit GPU LUT (can't find the source for that info anymore, but if my memory serves me well, it happened when cat 14.6 beta arrived). Don't know if Nvidia upped the internal LUT to 10-12bit recently.

AMD disables dithering if you do not have a color profile loaded into the GPU LUT. In that case the chain goes like this: Program = 8bit -> GPU LUT = 8bit -> Monitor, but I'm not sure if this is true anymore because you can define your color bpc in CCC nowadays.

You can disable dithering altogether by putting a DisableDithering (something like that) DWORD into the registry, but I can't remember the exact value and where to put it anymore :/
 
Don't know if Nvidia upped the internal LUT to 10-12bit recently.
Not recently, some years ago (I can confirm for my GTX 465, for example). Most modern AMD/nVidia cards seem to operate like this (assuming 8 bit connection to display):

Linear videoLUT:
Compositor -> 10-12 bit 1D videoLUT -> 8-bit output

Non-linear videoLUT:
Compositor -> 10-12 bit 1D videoLUT -> dithered 8-bit output
 
you referring to this?

Yeah I was referring to that.

Not recently, some years ago (I can confirm for my GTX 465, for example). Most modern AMD/nVidia cards seem to operate like this (assuming 8 bit connection to display):

Linear videoLUT:
Compositor -> 10-12 bit 1D videoLUT -> 8-bit output

Non-linear videoLUT:
Compositor -> 10-12 bit 1D videoLUT -> dithered 8-bit output

Now I really need to get myself a used newish nvidia card to see how it behaves with my 10bit monitor vs team red. Hmm, where did I put my old 560Ti.
 
Now I really need to get myself a used newish nvidia card to see how it behaves with my 10bit monitor vs team red. Hmm, where did I put my old 560Ti.
Note that a 10-bit monitor does nothing for gaming (unless it has an internal LUT that can be used by calibration software), and you need a Quadro card.
 
Note that a 10-bit monitor does nothing for gaming (unless it has an internal LUT that can be used by calibration software), and you need a Quadro card.

Hmm, I wasn't talking about gaming. I know perfectly well that Windows composition pushes only 8bit data to the GPU LUT and all games use an 8bit LUT. What I want to see is how nvidia behaves with a grayscale ramp after calibration software has done its magic, because last time I tried, it looked like shit.

But if we talk about gaming, then AMD is the hands-down winner in the color department when it comes to keeping the calibrated color profile in the GPU LUT, because you can actually force it on AMD cards with powerstrip so that games can't change it. Afaik that is not doable with nvidia cards, and colorclutch etc. has limitations and doesn't even work every time.
 
Not recently, some years ago (I can confirm for my GTX 465, for example). Most modern AMD/nVidia cards seem to operate like this (assuming 8 bit connection to display)
With a CRT or digital connection?
Because I see considerable banding with a GTX 570, 960, and 970 if I make any adjustments to the LUT and the output is an 8-bit digital connection.
Banding is greatly reduced if the output is set to 12-bit via HDMI.

Dithering is not used in either case.
The only time NVIDIA seems to dither is if the application outputs a signal >8-bit. (which is limited to the madVR video renderer and the game Alien: Isolation as far as I am aware)
 
With a CRT or digital connection?
Because I see considerable banding with a GTX 570, 960, and 970 if I make any adjustments to the LUT and the output is an 8-bit digital connection.
Banding is greatly reduced if the output is set to 12-bit via HDMI.

Dithering is not used in either case.
The only time NVIDIA seems to dither is if the application outputs a signal >8-bit. (which is limited to the madVR video renderer and the game Alien: Isolation as far as I am aware)

Thank you for this information. Gonna test this myself at some point and most likely be hugely disappointed when I see extreme banding on a grayscale ramp image after making any adjustments to the LUT by loading a calibrated color profile.

Is dithering used when you are running madVR in windowed mode? Getting over that 8bit Windows composition limit needs a DX or OGL application in fullscreen exclusive mode so that the application can actually output more than 8bit data - or that's how I have understood Windows color management to work.
 