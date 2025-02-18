What? Really?

How does it even make any sense?

Wouldn't the actual PhysX API be exactly the same between 32-bit and 64-bit?

I don't get it...



Maybe it is some kind of driver bug or omission rather than the actual direction Nvidia is taking things in the future?

Was Blackwell not supporting PhysX confirmed by Nvidia, or was it only an observation?



I hope it's the latter, because it would kinda suck to not have PhysX.

I didn't have it on AMD and didn't miss it much, but still - it is nice to have such features if I do decide to play an older game with PhysX...



I have a 3090 dangling off the side of my computer, connected via PCI-e 3.0 x1, to act as an LLM accelerator and to run the desktop, with the RTX 4090 otherwise used for gaming. I mean, why bother the 4090 with all the desktop stuff like Chrome and a bunch of other apps which use the GPU? RTX Video Super Resolution especially is quite taxing on the GPU, and in this setup I can use it while playing a game with no performance loss. The PCI-e speed is enough to play 8K 60fps HDR videos without dropped frames.
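For a rough idea of why a x1 link can still cope with 8K 60fps video, here is a back-of-the-envelope sketch. It assumes the video is decoded on the 3090 itself (so only the compressed stream crosses the bus) and uses a hypothetical 100 Mbit/s stream bitrate, which is my placeholder and not a measured value. The PCI-e 3.0 figures (8 GT/s per lane, 128b/130b encoding) are from the spec; real-world throughput is somewhat lower due to protocol overhead.

```python
# Back-of-the-envelope: PCI-e 3.0 x1 bandwidth vs. 8K 60fps video traffic.
# The stream bitrate below is a hypothetical placeholder, not a measurement.

PCIE3_TRANSFERS_PER_S = 8e9          # 8 GT/s per lane (PCI-e 3.0)
ENCODING_EFFICIENCY = 128 / 130      # 128b/130b line encoding


def pcie_bytes_per_s(lanes=1):
    """Theoretical payload bandwidth of a PCI-e 3.0 link in bytes/s."""
    return PCIE3_TRANSFERS_PER_S * ENCODING_EFFICIENCY * lanes / 8


def uncompressed_video_bytes_per_s(width, height, fps,
                                   samples_per_pixel=1.5, bytes_per_sample=2):
    """Raw frame traffic, assuming 4:2:0 chroma with 10-bit samples stored in 16 bits."""
    return width * height * samples_per_pixel * bytes_per_sample * fps


ASSUMED_STREAM_MBIT_PER_S = 100      # hypothetical compressed bitrate

link = pcie_bytes_per_s(lanes=1)
raw = uncompressed_video_bytes_per_s(7680, 4320, 60)
stream = ASSUMED_STREAM_MBIT_PER_S * 1e6 / 8

print(f"PCI-e 3.0 x1 link:      {link / 1e9:.3f} GB/s")
print(f"Uncompressed 8K60:      {raw / 1e9:.3f} GB/s ({raw / link:.1f}x the link)")
print(f"Compressed stream:      {stream / 1e6:.1f} MB/s ({stream / link:.1%} of the link)")
```

Under these assumptions raw decoded frames would not fit through x1, but the compressed stream uses only a small fraction of it, which is consistent with playback working fine when decode and display both happen on the 3090.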



Anyway, I ran the Cryostasis demo to benchmark it and test whether PhysX works with the newest drivers, 572.42.



RTX 4090 Game + PhysX

avg 283.5 min 133.1 max 531.1

avg 281.7 min 130.8 max 553.9





RTX 4090 Game + RTX 3090 PhysX

avg 215.6 min 109.5 max 523.7

avg 214.9 min 105.8 max 520.0



With the 3090 doing PhysX I got lower performance. The 3090 was utilized 100% while the 4090 did barely anything, whereas with the 4090 doing PhysX it still wasn't utilized 100% and the card ran at 250W.

And funnily enough, the computer is quieter running everything on the 4090.

I wonder if PCI-e performance is a limiting factor here or not. I cannot improve or degrade it, but in time I will do some more testing with OCing the 3090 to see if performance improves.
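To put a number on the difference, here is a small sketch that averages the two runs of each configuration from the results above and computes the relative drop. The values are copied from my runs; the helper name is just for illustration.

```python
from statistics import mean

# Results from the two runs of each configuration (avg and min values as posted).
avg_4090_only = [283.5, 281.7]
avg_3090_physx = [215.6, 214.9]
min_4090_only = [133.1, 130.8]
min_3090_physx = [109.5, 105.8]


def relative_drop(baseline, other):
    """Fractional drop of mean(other) relative to mean(baseline)."""
    base = mean(baseline)
    return (base - mean(other)) / base


avg_drop = relative_drop(avg_4090_only, avg_3090_physx)
min_drop = relative_drop(min_4090_only, min_3090_physx)

print(f"Average drops by {avg_drop:.1%} with PhysX on the 3090")
print(f"Minimum drops by {min_drop:.1%} with PhysX on the 3090")
```

So offloading PhysX to the 3090 costs roughly a quarter of the average and a bit under a fifth of the minimum in this demo. Two runs per configuration is a small sample, so treat these as ballpark figures.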



So... a 3090 is an option if anyone is into running local LLMs.

I mean, it's cool to have "abliterated", fully private ChatGPT-alikes and all, and you kinda need two 24GB GPUs to even run 32B models with a proper context length like 32K (note: I was able to squeeze in a num_ctx of 55296 for qwen2.5-32B, so quite good actually) or to run 70/72B models. Otherwise a 32B model can fit in a 24GB GPU, and with a 32GB GPU you could increase the context length quite a lot and/or use better quantization than 4-bit.
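A rough sketch of why long context eats so much VRAM: the KV cache grows linearly with num_ctx. The numbers below assume the published Qwen2.5-32B layout as I understand it (64 layers, 8 KV heads, head dimension 128, about 32.5B parameters) and an fp16 KV cache; these are assumptions for illustration. Real 4-bit quantizations use somewhat more than 4 bits per weight, and the runtime adds its own overhead, so actual usage will be higher.

```python
# Rough VRAM estimate for a 32B model with a long context.
# Architecture numbers are assumptions about Qwen2.5-32B, not measured values.

GIB = 2**30


def kv_cache_bytes(num_tokens, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Size of the K and V caches: 2 tensors per layer, one entry per token."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


def weight_bytes(num_params, bits_per_weight):
    """Idealized weight size, ignoring quantization scales and other overhead."""
    return num_params * bits_per_weight / 8


# Assumed Qwen2.5-32B layout: 64 layers, 8 KV heads (GQA), head_dim 128.
kv = kv_cache_bytes(num_tokens=55296, num_layers=64, num_kv_heads=8, head_dim=128)
weights = weight_bytes(num_params=32.5e9, bits_per_weight=4)

print(f"KV cache at num_ctx=55296: {kv / GIB:.1f} GiB")
print(f"4-bit weights (idealized): {weights / GIB:.1f} GiB")
print(f"Total (no overhead):       {(kv + weights) / GIB:.1f} GiB")
```

Under these assumptions the cache alone is about as big as the weights, which is why a single 24GB card fits the model but not much context, while two cards leave room for both.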



Having two GPUs is, however, not all roses.

For one, to really do it like I did you need two monitors, e.g. gaming + desktop, and run the desktop on the secondary GPU.

Otherwise the big issue is that some games are quite stupid and might choose to run on the slower GPU even if the primary monitor is connected to the faster GPU and the application was specifically told by the OS to run on the faster GPU.

For now the only game I've run which has this issue is Fortnite - but as you can guess this is quite irritating.

What I need to do is disable the 3090 in Device Manager before running Fortnite, and then I can re-enable it. Some applications might crash. Chrome/Firefox etc. will switch to the 4090 in this case, so to really get any benefit from this setup I need to close the web browser and open it again. Otherwise it's even worse: these apps run on the faster GPU and their output is transferred to the slower GPU. In this case the impact isn't that big, though, since only so much can be transferred from the 4090 to the 3090, which limits the performance impact.



Of course, if an older card were used solely for PhysX, it could just be enabled the moment a PhysX game is run, avoiding all these issues.

From the test results, however, I don't think performance will be that great if too old/slow a GPU is used. I would definitely not expect something like a 750 Ti to do an amazing job.



On the other hand, it will definitely do a better job than the CPU.

With CPU PhysX Nvidia screwed us, because it doesn't even use multiple cores. It's almost like Nvidia deliberately made CPU PhysX run as slow as it can to make having a GPU that supports PhysX a necessity.

Otherwise I guess on a 13900KF with its 24 cores and 32 threads I could get quite decent performance. Probably not as good as even with the 3090, but still at least playable.