NVIDIA JUST HAD one of their most sleazy marketing tactics exposed, that PhysX is faster on a GPU than a CPU. As David Kanter at Real World Tech proves, the only reason that PhysX is faster on a GPU is because Nvidia purposely hobbles it on the CPU. If they didn't, PhysX would run faster on a modern CPU.
The article itself can be found here, and be forewarned, it is highly technical. In it, Kanter watched the execution of two PhysX enabled programs, a game/tech demo called Cryostasis, and an Nvidia program called PhysX Soft Body Demo. Both use PhysX, and are heavily promoted by Nvidia to 'prove' how much better their GPUs are.
The rationale behind using PhysX in this way is that Nvidia artificially blocks any other GPU from using PhysX, going so far as to disable the functionality on their own GPUs if an ATI GPU is simply present in the system but completely unused. The only way to compare is to use PhysX on the CPU, and compare it to the Nvidia GPU version.
If you can imagine the coincidence, it runs really well on Nvidia cards, but chokes if there is an ATI card in the system. Frame rates tend to go from more than 50 to the single digits even when you have an overclocked i7 and an ATI HD5970. Since this setup is vastly faster than an i7 and a GTX480 in almost every objective test, you might suspect foul play if the inclusion of PhysX drops performance by an order of magnitude. As Real World Tech proved, those suspicions would be absolutely correct.
How do they do it? It is easy, a combination of optimization for the GPU and de-optimization for the CPU. Nvidia has long claimed a 2-4x advantage for GPU physics, using their own PhysX APIs, over anything a CPU can do, no matter what it is or how many there are. And they can back it up with hard benchmarks, but only ones where the Nvidia API is used. For the sake of argument, lets assume that the PhysX implementations are indeed 4x faster on an Nvidia GPU than on the fastest quad core Intel iSomethingMeaningless.
If you look at Page 3 of the article, you will see the code traces of two PhysX using programs. There is one thing you should pay attention to, PhysX uses x87 for FP math almost exclusively, not SSE. For those not versed in processor arcana, Intel introduced SSE with the Pentium 3, a 450MHz CPU that debuted in February of 1999. Every Intel processor since has had SSE. The Pentium 4 which debuted in November of 2000 had SSE2, and the later variants had SSE3. How many gamers use a CPU slower than 450MHz?
Of the SSE variants, the one that matters here is SSE, but SSE2 could also be quite relevant. In any case, Intel hasn't introduced a CPU without SSE or SSE2 in almost a decade, 9 years and a few days short of 8 months to be precise. For the sake of brevity, we will lump SSE, SSE2, and later revisions in to one basket called SSE.
AMD had a similar API called 3DNow!, but the mainstream K8/Athlon64/Opteron lines had full SSE and SSE2 support since May of 2004. Some variants of the K7 had SSE with a different name, 3DNow! Professional, for years prior to that.
Basically, anything that runs at 1GHz or faster has SSE, even the Atom variants aimed at phones and widgets supports full SSE/SSE2. Nothing on the market, and nothing that was on the market for years prior to the founding of Ageia, the originator of PhysX that Nvidia later bought, lacked SSE.
To make matters worse, x87, the 'old way' of doing FP math, has been deprecated by Intel, AMD and Microsoft. x64 extensions write it off the spec, although you still can make it work if you are determined to, but it won't necessarily be there in the future. If you don't have a damn good reason to use it, you really should avoid it....
Copyright 2013 © Godem Online Inc. | Web and server solutions by NewTech Solutions.