That blog entry is utter garbage.
I understand it's difficult to comprehend data that doesn't align with your worldview, but you *do* realize that the forum post the blog article references is actually by an Intel employee who knows his :banana::banana::banana::banana: about SSE and compilers, right? An Intel engineer (of all people!) debunked the thing, and Scali simply referenced it.
Sorry, but if even the competition debunks the thing, you probably have to take a second look.
I should clarify -- it is not the content of what was shown; it was the tone and out-of-context treatment of Kanter's article that I thought was garbage. I comprehend the data just fine. It did not address the thesis of the original article; it reads like a hack job. But I agree, I am biased in that regard because I consider David a friend.
In any event, his entry would have been better presented if it were a bit more professional in tone and less like a fanatical fanboy post.
These are your Intel engineer's thoughts on that Scali post:
http://forum.beyond3d.com/showpost.p...&postcount=148
And I agree, that blog entry misses the point entirely and misreads what Kanter is really saying.
Jack
Fair enough, it reads like the rantings of a rabid fanboy to me. I did talk with Kanter after he posted that article :) and asked how long it took before he got a call from nVidia PR... it was pretty quick.
Apparently, I am not the only one with this 'frame of Scali' in my head: http://forum.beyond3d.com/showthread.php?p=1474154
Thanks!
While the internets were ablaze, especially here, with comments about NVIDIA "de-optimizing" PhysX to cripple CPU PhysX performance, in reality that's how PhysX was developed prior to NVIDIA's involvement.
When Bullet is recompiled to take advantage of SSE, the performance improvement is fairly minimal. Even if PhysX benefited more significantly from a recompile, it still wouldn't hit the 2-4x figure that's repeatedly quoted.
To see a significant performance increase with SSE, PhysX would need to be entirely re-written to make the best use of SSE. NVIDIA would have to dump a bunch of money into optimizing for their competitor, which makes no business sense at all, and it still seems unlikely that CPU PhysX would be a strong competitor to GPU PhysX.
The point of NVIDIA PhysX is to add additional value to NVIDIA GeForce GPUs. What's wrong with a company trying to add additional value to their products? :shrug:
The premise of the original RWT article is that, after some profiling, a definite pattern emerged -- PhysX was using legacy x87 as opposed to SIMD SSEx for the heavy math lifting. That is on top of the fact that PhysX is also not multithreaded on multicore CPUs by default. Kanter then goes on to question why this is the case, and builds his point around the fact that Intel and AMD are both deprecating x87 and focusing on improving/developing SSE.
There are two points of contention: a) Kanter claims that properly compiled and vectorized SSE can speed up throughput by a theoretical 4x, and probably 2x in reality, and b) Kanter argues that there is no technical reason why PhysX should not be compiled for SSE rather than x87. The implication is that nVidia did this intentionally in order to make PhysX on the GPU look that much better -- this seems to have pissed a few people off. :)
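For what it's worth, the 4x figure comes straight from the vector width: scalar x87 (or scalar SSE) code handles one float per instruction, while packed SSE handles four. A toy sketch of the difference (plain C with SSE intrinsics, nothing to do with the actual PhysX source):

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE intrinsics (x86 only) */

/* Scalar path: one multiply per element, the way x87 or scalar-SSE code runs. */
void scale_scalar(const float *in, float *out, float k, int n) {
    for (int i = 0; i < n; ++i)
        out[i] = in[i] * k;
}

/* Packed-SSE path: four floats per multiply -- the source of the
 * "theoretical 4x" figure for single-precision math. */
void scale_sse(const float *in, float *out, float k, int n) {
    __m128 vk = _mm_set1_ps(k);         /* broadcast k into all 4 lanes */
    int i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(out + i, _mm_mul_ps(_mm_loadu_ps(in + i), vk));
    for (; i < n; ++i)                  /* leftover tail elements */
        out[i] = in[i] * k;
}
```

In practice the gain depends on data layout: if the math isn't arranged so that four independent values line up per register, the packed path degrades toward scalar speed -- which is exactly why a straight recompile buys so little and a real rewrite would be needed.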
You're forgetting about Ageia's PPU.. PhysX was originally meant to run on a PPU, which is even worse than running it on a GPU, because with the former you absolutely had to buy dedicated hardware (and it was slower and more expensive than a GPU as well).
You can run PhysX on a single GPU however, provided it's powerful enough and your graphics settings aren't maxed out.
It seems you have deep-seated issues with everyday corporate practices :shrug:

Quote:
It is limited to eye candy because NV wants to be unfair to everyone else by disabling the function when other GPU makes are in the system. So developers have no choice but to use it for eye candy only, because otherwise they would lose out on sales, as the game could only be run on NV discrete GPUs. And it's nothing about being fair; if developers could make more by being unfair, they would.
Eh? That link doesn't prove anything... He is in agreement with Scali, but confused why the article was posted now instead of back when this topic first arose.
I think in most cases Andrew would be right that, in general, 20% perf is not trivial. But when you're talking about taking PhysX CPU performance from 8fps to 10fps it *is* trivial. You'd need integer multiples of that to get it to where it needs to be to be valuable.
Nobody here talks about optimizing it for the ATI cards. But if Nvidia tries to push PhysX as a standard, then it needs to have a decently working CPU-based one, and it needs to stop disabling PhysX computations on the Nvidia card when an ATI card is present.

Quote:
NVIDIA would have to dump a bunch of money into optimizing for their competitor, which makes no business sense at all, and it still seems unlikely that CPU PhysX would be a strong competitor to GPU PhysX.
Current Physx state of things is pretty much a reason for many people to stay away from nvidia.
I wasn't trying to prove something, I was just pointing out that others thought it was fanboy drivel as well... Read through the thread and what I said -- you implied he never gave reason to appear a fanboy. I disagree.
The 'wondering why it was posted now instead of...' part is not what I was referring to... perhaps I was not clear... he and I are in agreement that

Quote:
Apparently, I am not the only with with this 'frame of Scali' in my head: http://forum.beyond3d.com/showthread.php?p=1474154
Quote:
And I agree, that blog entry misses the point entirely and misreads what Kanter is really saying.

Quote:
I don't think he does... that seems like a misreading of the article to me.
I personally couldn't give a rat's behind if it were only 8%, 20%, or 5000% improvement -- a fanboy rant is a fanboy rant.
It's the PPU that I was talking about, & it's still dedicated hardware that you have to buy -- now it's called an NV gfx card, & you're limited in what other hardware you can have in the system, unlike with the PPU.
And I don't have issues with everyday corporate practices; I just have issues with specific corporate practices, like most people do, & I have a right to air my dislike just like anyone else.
I have a question. What happens to PhysX (x87 coded) when native 64-bit games that support it show up on the market (whenever that may be)? We all know that in native AMD64 mode SSE is used instead of x87, so will NV be forced to change its stance?
It *should* be faster. That should be enough reason for a recompilation. It takes, what, less than 15 seconds to change the compilation flags and a few minutes to compile.
Then again, compiling with the Intel compiler should already do that for you... And since it is the best (as in performance) C++ compiler for Windows, I'd bet Nvidia uses it. So there should be a clear difference in PhysX CPU performance if the CPUID string were changed from GenuineIntel to AuthenticAMD -- assuming the PhysX runtime was compiled with the Intel C++ compiler AND one could change the CPU's CPUID string.
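For anyone curious what that "CPUID string" actually is: CPUID leaf 0 hands back a 12-byte vendor ID packed into three registers, and vendor-dispatching runtimes just compare it against "GenuineIntel". A little illustrative C -- the register constants below are the documented Intel/AMD values, nothing PhysX-specific:

```c
/* CPUID leaf 0 returns the vendor ID as 12 ASCII bytes packed into the
 * EBX, EDX and ECX registers (in that order), low byte first within each.
 * Vendor-dispatching code compares this string against "GenuineIntel". */
void decode_vendor(unsigned ebx, unsigned edx, unsigned ecx, char out[13]) {
    for (int i = 0; i < 4; ++i) {
        out[i]     = (char)(ebx >> (8 * i));   /* EBX -> bytes 0..3  */
        out[i + 4] = (char)(edx >> (8 * i));   /* EDX -> bytes 4..7  */
        out[i + 8] = (char)(ecx >> (8 * i));   /* ECX -> bytes 8..11 */
    }
    out[12] = '\0';
}

/* Example: an Intel CPU reports EBX=0x756E6547, EDX=0x49656E69,
 * ECX=0x6C65746E, which decodes to "GenuineIntel"; an AMD CPU's
 * registers decode to "AuthenticAMD". */
```

Actually reading the registers needs inline asm or a compiler builtin like GCC's `__get_cpuid`, which is beyond this sketch; the point is just that the "string" in question is three register values, not something freely editable from software.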
Time to ask Agner.
...that brought me to a question: why on earth did I recompile the benchmark to use SSE, when the actual Bullet runtime should've been recompiled... Fixing it now.

Edit: Recompiled the whole Bullet runtime with -msse, -msse2, -msse3, -mfpmath=sse and -mtune=core2. There was about a 5% improvement in the benchmark demo. The physics step went from ~46.5 ms down to ~43.25 ms.
I don't quite see the point here. PhysX can already run on the CPU, so why make a big deal out of this? The main reason it's so damn slow is that it's not even utilizing the CPU properly. I used to have a Core 2 Duo E4300, and Mirror's Edge runs exactly as fast on a 4-times-faster Core i7 920 (even overclocked to 4GHz) with more cores, more threads, more cache, etc. The CPU was hardly utilized, when I could easily have dedicated 3 full cores just to physics calculation. Yet it was utilized at maybe 5% if I was lucky.
So it's not slow because it wasn't recompiled into something else; it's slow because it's stupid. That's why.
You know, I have played Vampire Bloodlines on the Source engine; it looks average, the loading times are terrible, and it's not optimized. I bought a new GPU, several times faster, DX11 and everything, and nothing changed. So the GPU wasn't the problem; the Source engine just plain sucks.
Interesting conversation.
I think a lot of people are equating poorly implemented PhysX (Batman, Cryostasis, etc) with the actual performance of it in ALL games. This just isn't the case and games like Dark Void (with the latest patch) and Mafia II illustrate this. Both of these games have absolutely no issue processing higher levels of PhysX on the CPU across multiple threads.
Yes, processing it on a dedicated GPU will yield a performance increase, but in some cases processing it on the CPU yields SLIGHTLY BETTER PERFORMANCE due to RENDERING bottlenecks. Transferring the PhysX processing to the CPU can free up the GPU to get on with the task of rendering.
On a side note, I am a firm believer that physics processing has no business being done on the GPU when we have multi core CPUs usually sitting around doing nothing.
I suppose Nvidia will eventually optimize PhysX to take full advantage of x64.
The new SDK 3.0 which will supposedly be released next year will have SIMD optimizations and inherent multithreading, so I'm betting that it will eventually be 64 bit as well..
Either way though, hardware accelerated PhysX will always be faster..
All I know is I need fast physics simulation in Garry's Mod. Until that day there is little going on I care about in this field.
Good points :up:
Although I disagree that physics processing has no business being done on the GPU. If we want higher level physics, then the GPU is the natural choice since it excels at that kind of processing, has a lot more bandwidth than desktop CPUs (which apparently is important for physics processing), and generally provides much more bang for the buck.
Also people often forget that dual cores still constitute the majority in gaming PCs; although quadcores are catching up, it's not as fast as you'd think.
Oh, I agree. However I also think that an effort should be made to max out the CPU before trying to validate a company's marketing practices which say using a GPU for high level physics is better.
The main problem with doing physics on the GPU is that you will ALWAYS lose rendering power in order to accomplish it unless you use a dedicated graphics card. Personally, I want the ability to choose which to use, but in many games (again: Batman, Cryostasis, etc.) PhysX is so poorly implemented that the choice is effectively taken away from the user.
That isn't a proper way to sell a technology to consumers. Rather, show us that a CPU can do it WELL and then show how much BETTER a GPU can do it. I think this will actually help sell more GPUs because in many cases a $60 dedicated GPU will most likely benefit physics calculations much more than a $60 CPU upgrade.
At this point I think that a dedicated-GPU setup for PhysX is the only way I would actually consider doing this. Losing rendering power is perfectly fine if the game doesn't demand much, but what about games like Mafia II and Metro 2033? Would I want to lower IQ settings just to ensure persistent debris, bouncing titties and somewhat realistic fabric? Heck no.