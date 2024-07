Marees said: Seems to me that unreal engine 5 (& nvidia) are doing a pretty good job of pushing the work to the CPU under vram limited scenarios !!!

Not even when VRAM is constrained. DirectStorage, for all its benefits, doesn't streamline the texture decompression process. Essentially it replaces the old Win32 storage I/O APIs, which are utterly incapable of working at NVMe speeds, with one that can.

All GPU I/O requests still travel through the CPU, though. The GPU asks the CPU to pull some textures from storage, be it an NVMe drive or system RAM; those textures are then processed by the CPU, which handles decompression and uploads the result to the GPU's VRAM.

Nvidia does have RTX IO, which takes the texture decompression job off the CPU, but it only recently became available (it was announced in 2020) and debuted in Ratchet & Clank: Rift Apart.

However, in R&C, Nvidia users who enable DirectStorage (and with it RTX IO) see a substantial increase in VRAM usage, and given that Nvidia's lineup is lacking in that department, that is not ideal. It looks to be caused by both the compressed and the decompressed textures being stored in VRAM. That could be a game implementation issue, but it is an issue.

To combat that, Nvidia is fast on garbage collection and culling, but that offloads resources to system RAM, which causes a 10-15% increase in usage there. Since system RAM can be expanded and is generally cheap, that is far less of a problem for most users.

But yeah.
This is made necessary by how PCs function in general and how interrupts are processed, but even if you just look at a CPU diagram for the PCIe layout you quickly notice that all PCIe traffic passes through the CPU; there is no direct route between storage, system RAM, and the GPU.

A game requires a texture in VRAM, so the game asks the CPU to pull it from storage, which loads it into RAM, decompresses it in RAM, and finally uploads the decompressed texture from RAM to VRAM.

RTX IO removes a few hops, which eases the pressure on a constrained PCIe configuration, but unless we start seeing 8-lane PCIe 4 or 5 cards become a normal thing, it's mostly useless for existing games.

All of this is greatly helped by some optional parts of the NVMe spec becoming non-optional. You would not be overly shocked to find out how many NVMe drives out there are using SATA controllers, which can be tweaked to get great throughput that looks good enough on charts from CrystalDiskMark and the like.

But when the OS tries to take advantage of the NVMe storage and hits one of those SATA controllers, those commands are translated back to the same old SATA interface, bringing things back to a relative crawl. Such a shame. So the drive looks fine on a review site with synthetic benchmarks but then falls behind in gaming when DirectStorage or other newer, more advanced features are implemented.

But Epic as a whole is really doing awesome work with UE5. It's going to be their cash cow for a long while, and it is going to be incredibly difficult for any studio to put out a higher-end title on anything else for less. Epic has managed to build one hell of a great ecosystem there, and the performance is honestly impressive.
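
For anyone who wants to put rough numbers on the two texture paths described above (CPU decompression in RAM versus decompression on the GPU), here is a toy Python model. It is only a back-of-the-envelope sketch, not a measurement of DirectStorage or RTX IO: the function names, the 100 MB texture size, and the 2:1 compression ratio are all made up for illustration, and it assumes the GPU path briefly holds both the compressed and decompressed copies in VRAM, which is the suspected cause of the higher VRAM usage mentioned earlier.

```python
from dataclasses import dataclass


@dataclass
class Traffic:
    pcie_to_gpu_mb: float  # data uploaded across PCIe into VRAM
    ram_peak_mb: float     # peak system RAM held for this texture
    vram_peak_mb: float    # peak VRAM held for this texture


def cpu_decompress_path(compressed_mb: float, ratio: float) -> Traffic:
    """Storage -> RAM, CPU decompresses in RAM, decompressed data goes to VRAM."""
    decompressed_mb = compressed_mb * ratio
    return Traffic(
        pcie_to_gpu_mb=decompressed_mb,               # full-size upload
        ram_peak_mb=compressed_mb + decompressed_mb,  # both copies sit in RAM
        vram_peak_mb=decompressed_mb,
    )


def gpu_decompress_path(compressed_mb: float, ratio: float) -> Traffic:
    """Storage -> RAM staging, compressed data goes to VRAM, GPU decompresses."""
    decompressed_mb = compressed_mb * ratio
    return Traffic(
        pcie_to_gpu_mb=compressed_mb,                  # smaller upload
        ram_peak_mb=compressed_mb,                     # staging copy only (assumed)
        vram_peak_mb=compressed_mb + decompressed_mb,  # both copies sit in VRAM
    )


if __name__ == "__main__":
    # Hypothetical texture: 100 MB compressed, 2:1 compression ratio.
    for name, path in (("CPU", cpu_decompress_path), ("GPU", gpu_decompress_path)):
        print(name, path(compressed_mb=100.0, ratio=2.0))
```

Under those assumptions the GPU path halves the PCIe upload but raises peak VRAM for that texture, which is the same trade-off described above: less pressure on the bus, more pressure on a card that is already short on VRAM.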