That blog entry is utter garbage.
I understand it's difficult to comprehend data that doesn't align with your worldview, but you *do* realize that the forum post the blog article references is actually by an Intel employee who knows his :banana::banana::banana::banana: about SSE and compilers, right? An Intel engineer (of all people!) debunked the thing, and Scali simply referenced it.
Sorry, but if even the competition debunks the thing, you probably have to take a second look.
I should clarify -- it is not the content of what was shown; it was the tone and out-of-context treatment of Kanter's article that I thought was garbage. I comprehend the data just fine. It did not address the thesis of the original article; it reads like a hack job. But I agree, I am biased in that regard because I consider David a friend.
In any event, his entry would have been better presented if it were a bit more professional in tone and less like a fanatical fanboy post.
These are your Intel engineer's thoughts on that Scali post:
http://forum.beyond3d.com/showpost.p...&postcount=148
And I agree, that blog entry misses the point entirely and misreads what Kanter is really saying.
Jack
Fair enough, it reads like the rantings of a rabid fanboy to me. I did talk with Kanter after he posted that article :) and asked how long it took before he got a call from nVidia PR... it was pretty quick.
Apparently, I am not the only one with this 'frame of Scali' in my head: http://forum.beyond3d.com/showthread.php?p=1474154
Thanks!
While the internets were ablaze, especially here, with comments about NVIDIA "de-optimizing" PhysX to cripple CPU PhysX performance, in reality that's how PhysX was developed prior to NVIDIA's involvement.
When Bullet is recompiled to take advantage of SSE, the performance improvement is fairly minimal. Even if PhysX benefited more significantly from a recompile, it still wouldn't hit the 2-4x figure that's repeatedly quoted.
To see a significant performance increase with SSE, PhysX would need to be entirely re-written to make the best use of SSE. NVIDIA would have to dump a bunch of money into optimizing for their competitor, which makes no business sense at all, and it still seems unlikely that CPU PhysX would be a strong competitor to GPU PhysX.
The point of NVIDIA PhysX is to add additional value to NVIDIA GeForce GPUs. What's wrong with a company trying to add additional value to their products? :shrug:
The premise of the original RWT article is that, after some profiling, a definite pattern emerged -- PhysX was using legacy x87 as opposed to SIMD SSEx for the heavy math lifting. That is on top of the fact that PhysX is also not multithreaded on multicore CPUs by default. Kanter then goes on to question why this is the case, and builds his point around the fact that Intel and AMD are both deprecating x87 and focusing on improving/developing SSE.
There are two points of contention: a) Kanter claims that properly compiled and vectorized SSE can speed up throughput by a theoretical 4x, and probably 2x in reality, and b) Kanter argues that there is no technical reason why PhysX should not be compiled for SSE rather than x87. The implication is that nVidia did this intentionally in order to make PhysX on the GPU look that much better -- this seems to have pissed a few people off. :)
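For what it's worth, the 4x figure comes straight from the vector width: scalar x87 (or scalar SSE) code handles one float per instruction, while packed SSE handles four. A toy sketch of the difference (plain C with SSE intrinsics, nothing to do with the actual PhysX source):

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE intrinsics (x86 only) */

/* Scalar path: one multiply per element, the way x87 or scalar-SSE code runs. */
void scale_scalar(const float *in, float *out, float k, int n) {
    for (int i = 0; i < n; ++i)
        out[i] = in[i] * k;
}

/* Packed-SSE path: four floats per multiply -- the source of the
 * "theoretical 4x" figure for single-precision math. */
void scale_sse(const float *in, float *out, float k, int n) {
    __m128 vk = _mm_set1_ps(k);         /* broadcast k into all 4 lanes */
    int i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(out + i, _mm_mul_ps(_mm_loadu_ps(in + i), vk));
    for (; i < n; ++i)                  /* leftover tail elements */
        out[i] = in[i] * k;
}
```

In practice the gain depends on data layout: if the math isn't arranged so that four independent values line up per register, the packed path degrades toward scalar speed -- which is exactly why a straight recompile buys so little and a real rewrite would be needed.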
You're forgetting about Ageia's PPU.. PhysX was originally meant to run on a PPU, which is even worse than running it on a GPU, because with the former you absolutely had to buy dedicated hardware (and it was slower and more expensive than a GPU as well).
You can run PhysX on a single GPU however, provided it's powerful enough and your graphics settings aren't maxed out.
It seems you have deep-seated issues with everyday corporate practices :shrug:

Quote:
It is limited to eye candy because NV wants to be unfair to everyone else by disabling the function when other GPU makes are in the system. So developers have no choice but to use it for eye candy only, because otherwise they would lose out on sales, as the game could only be run on NV discrete GPUs. And it's nothing about being fair; if developers could make more by being unfair, they would.
Eh? That link doesn't prove anything... He is in agreement with Scali, but confused why the article was posted now instead of back when this topic first arose.
I think in most cases Andrew would be right that, in general, 20% perf is not trivial. But when you're talking about taking PhysX CPU performance from 8fps to 10fps it *is* trivial. You'd need integer multiples of that to get it to where it needs to be to be valuable.
Nobody here talks about optimizing it for the ATI cards. But if Nvidia tries to push PhysX as a standard, then it needs to have a decently working CPU-based one, and it needs to stop disabling PhysX computations on the Nvidia card when an ATI card is present.

Quote:
NVIDIA would have to dump a bunch of money into optimizing for their competitor, which makes no business sense at all, and it still seems unlikely that CPU PhysX would be a strong competitor to GPU PhysX.
Current Physx state of things is pretty much a reason for many people to stay away from nvidia.
I wasn't trying to prove something, I was just pointing out that others thought it was fanboy drivel as well... Read through the thread and what I said -- you implied he never gave reason to appear a fanboy. I disagree.
The 'wondering why it was posted now instead of...' part is not what I was referring to... perhaps I was not clear... he and I are in agreement that

Quote:
Apparently, I am not the only with with this 'frame of Scali' in my head: http://forum.beyond3d.com/showthread.php?p=1474154
Quote:
And I agree, that blog entry misses the point entirely and misreads what Kanter is really saying.

Quote:
I don't think he does... that seems like a misreading of the article to me.
I personally couldn't give a rat's behind if it were only 8%, 20%, or 5000% improvement -- a fanboy rant is a fanboy rant.
It's the PPU that I was talking about, & it's still dedicated hardware that you have to buy -- now it's called an NV gfx card, & you're limited in what other hardware you can have in the system, unlike with the PPU.
And I don't have issues with everyday corporate practices; I just have issues with specific corporate practices, like most people do, & I have a right to air my dislike just like anyone else.
I have a question. What happens to PhysX (x87 coded) when native 64-bit games that support it show up on the market (whenever that may be)? We all know that in native AMD64 mode SSE is used instead of x87, so will NV be forced to change its stance?
It *should* be faster. That should be enough reason for a recompilation. It takes, what, less than 15 seconds to change the compilation flags and a few minutes to compile.
Then again, compiling with the Intel compiler should already do that for you... And since it is the best (as in performance) C++ compiler for Windows, I'd bet Nvidia uses it. So there should be a clear difference in PhysX CPU performance if the CPUID string were changed from GenuineIntel to AuthenticAMD -- assuming the PhysX runtime was compiled with the Intel C++ compiler AND one could change the CPU's CPUID string.
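For anyone curious what that "CPUID string" actually is: CPUID leaf 0 hands back a 12-byte vendor ID packed into three registers, and vendor-dispatching runtimes just compare it against "GenuineIntel". A little illustrative C -- the register constants below are the documented Intel/AMD values, nothing PhysX-specific:

```c
/* CPUID leaf 0 returns the vendor ID as 12 ASCII bytes packed into the
 * EBX, EDX and ECX registers (in that order), low byte first within each.
 * Vendor-dispatching code compares this string against "GenuineIntel". */
void decode_vendor(unsigned ebx, unsigned edx, unsigned ecx, char out[13]) {
    for (int i = 0; i < 4; ++i) {
        out[i]     = (char)(ebx >> (8 * i));   /* EBX -> bytes 0..3  */
        out[i + 4] = (char)(edx >> (8 * i));   /* EDX -> bytes 4..7  */
        out[i + 8] = (char)(ecx >> (8 * i));   /* ECX -> bytes 8..11 */
    }
    out[12] = '\0';
}

/* Example: an Intel CPU reports EBX=0x756E6547, EDX=0x49656E69,
 * ECX=0x6C65746E, which decodes to "GenuineIntel"; an AMD CPU's
 * registers decode to "AuthenticAMD". */
```

Actually reading the registers needs inline asm or a compiler builtin like GCC's `__get_cpuid`, which is beyond this sketch; the point is just that the "string" in question is three register values, not something freely editable from software.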
Time to ask Agner.
...that brought me to a question: why on earth did I recompile the benchmark to use SSE, when the actual Bullet runtime should've been recompiled... Fixing it now.

Edit: Recompiled the whole Bullet runtime with -msse, -msse2, -msse3, -mfpmath=sse and -mtune=core2. There was about a 5% improvement in the benchmark demo. The physics step went from ~46.5 ms down to ~43.25 ms.
I don't quite see the point here. PhysX can already run on the CPU, so why make a big deal out of this? The main reason it's so damn slow is that it's not even utilizing the CPU properly. I used to have a Core 2 Duo E4300, and Mirror's Edge runs exactly as fast on a 4-times-faster Core i7 920 (even overclocked to 4GHz) with more cores, more threads, more cache, etc. The CPU was hardly utilized, when I could easily have dedicated 3 full cores just to physics calculation. Yet it was utilized at maybe 5% if I was lucky.
So it's not slow because it wasn't recompiled into something else; it's slow because it's stupid. That's why.
You know, I have played Vampire Bloodlines on the Source engine; it looks average, the loading times are terrible, and it's not optimized. I bought a new GPU, several times faster, DX11 and everything, and nothing changed. So the GPU wasn't the problem; the Source engine just plain sucks.
Interesting conversation.
I think a lot of people are equating poorly implemented PhysX (Batman, Cryostasis, etc) with the actual performance of it in ALL games. This just isn't the case and games like Dark Void (with the latest patch) and Mafia II illustrate this. Both of these games have absolutely no issue processing higher levels of PhysX on the CPU across multiple threads.
Yes, processing it on a dedicated GPU will yield a performance increase, but in some cases processing it on the CPU yields SLIGHTLY BETTER PERFORMANCE due to RENDERING bottlenecks. Transferring the PhysX processing to the CPU can free up the GPU to get on with the task of rendering.
On a side note, I am a firm believer that physics processing has no business being done on the GPU when we have multi core CPUs usually sitting around doing nothing.
I suppose Nvidia will eventually optimize PhysX to take full advantage of x64.
The new SDK 3.0 which will supposedly be released next year will have SIMD optimizations and inherent multithreading, so I'm betting that it will eventually be 64 bit as well..
Either way though, hardware accelerated PhysX will always be faster..
All I know is I need fast physics simulation in Garry's Mod. Until that day there is little going on I care about in this field.
Good points :up:
Although I disagree that physics processing has no business being done on the GPU. If we want higher level physics, then the GPU is the natural choice since it excels at that kind of processing, has a lot more bandwidth than desktop CPUs (which apparently is important for physics processing), and generally provides much more bang for the buck.
Also people often forget that dual cores still constitute the majority in gaming PCs; although quadcores are catching up, it's not as fast as you'd think.
Oh, I agree. However I also think that an effort should be made to max out the CPU before trying to validate a company's marketing practices which say using a GPU for high level physics is better.
The main problem with doing physics on the GPU is that you will ALWAYS lose rendering power in order to accomplish it unless you use a dedicated graphics card. Personally, I want the ability to choose which to use, but in many games (again: Batman, Cryostasis, etc.) PhysX is so poorly implemented that the choice is effectively taken away from the user.
That isn't a proper way to sell a technology to consumers. Rather, show us that a CPU can do it WELL and then show how much BETTER a GPU can do it. I think this will actually help sell more GPUs because in many cases a $60 dedicated GPU will most likely benefit physics calculations much more than a $60 CPU upgrade.
At this point I think that a dedicated-GPU setup for PhysX is the only way I would actually consider doing this. Losing rendering power is perfectly fine if the game doesn't demand much, but what about games like Mafia II and Metro 2033? Would I want to lower IQ settings just to ensure persistent debris, bouncing titties and somewhat realistic fabric? Heck no.