Page 3 of 9 FirstFirst 123456 ... LastLast
Results 51 to 75 of 217

Thread: PhysX on a CPU likely to see very little benefit from SSE recompile

  1. #51
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    That blog entry is utter garbage.
    One hundred years from now It won't matter
    What kind of car I drove What kind of house I lived in
How much money I had in the bank Nor what my clothes looked like.... But The world may be a little better Because, I was important In the life of a child.
    -- from "Within My Power" by Forest Witcraft

  2. #52
    Banned
    Join Date
    Mar 2010
    Posts
    88
    Quote Originally Posted by JumpingJack View Post
    That blog entry is utter garbage.
I understand it's difficult to comprehend data that doesn't align with your worldview, but you *do* realize that the forum post the blog article references is actually by an Intel employee who knows his stuff about SSE and compilers, right? An Intel engineer (of all people!) debunked the thing, and Scali simply referenced it.

Sorry, I think if even the competition debunks the thing, you probably have to take a second look.

  3. #53
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    Quote Originally Posted by Svnth View Post
I understand it's difficult to comprehend data that doesn't align with your worldview, but you *do* realize that the forum post the blog article references is actually by an Intel employee who knows his stuff about SSE and compilers, right? An Intel engineer (of all people!) debunked the thing, and Scali simply referenced it.

    Sorry, I think if even the competition debunks the thing, you probably have to take a second look.
I should clarify -- it is not the content of what was shown, it was the tone and the out-of-context treatment of Kanter's article that I thought was garbage. I comprehend the data just fine. It did not address the thesis of the original article; it reads like a hack job. But I agree, I am biased in that regard because I consider David a friend.

    In any event, his entry would have been better presented if it were a bit more professional in tone and less like a fanatical fanboy post.

This is your Intel engineer's thoughts on that Scali post:
    http://forum.beyond3d.com/showpost.p...&postcount=148

    And I agree, that blog entry misses the point entirely and misreads what Kanter is really saying.


    Jack
    Last edited by JumpingJack; 09-23-2010 at 10:50 PM.

  4. #54
    Banned
    Join Date
    Mar 2010
    Posts
    88
    Quote Originally Posted by JumpingJack View Post
I should clarify -- it is not the content of what was shown, it was the tone and the out-of-context treatment of Kanter's article that I thought was garbage. I comprehend the data just fine. It did not address the thesis of the original article; it reads like a hack job. But I agree, I am biased in that regard because I consider David a friend.

    In any event, his entry would have been better presented if it were a bit more professional in tone and less like a fanatical fanboy post.

    Jack
Maybe it's how you frame him in your head... Scali has never been like that. I read it and it came off as calm and clean, debunking what needed to be corrected.

    I don't think Scali has any incentive to be a "fanboy" at all.

  5. #55
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    Quote Originally Posted by Svnth View Post
Maybe it's how you frame him in your head... Scali has never been like that. I read it and it came off as calm and clean, debunking what needed to be corrected.

    I don't think Scali has any incentive to be a "fanboy" at all.
Fair enough, it reads like the rantings of a rabid fanboy to me. I did talk with Kanter after he posted that article and asked how long it took before he got a call from nVidia PR... it was pretty quick.

Apparently, I am not the only one with this 'frame of Scali' in my head: http://forum.beyond3d.com/showthread.php?p=1474154
    Last edited by JumpingJack; 09-23-2010 at 11:01 PM.

  6. #56
    Registered User
    Join Date
    Feb 2010
    Location
    NVIDIA HQ
    Posts
    76
    Thanks!

    Quote Originally Posted by Calmatory View Post
    Great job Nvidia PR warrior!
While the internets were ablaze, especially here, with comments about NVIDIA "de-optimizing" PhysX to cripple CPU PhysX performance, the reality is that this is how PhysX was developed prior to NVIDIA's involvement.

When Bullet is recompiled to take advantage of SSE, the performance improvement is fairly minimal. Even if PhysX benefited more significantly from the recompile, it's still not going to be the 2-4x faster figure that's repeatedly quoted.

    To see a significant performance increase with SSE, PhysX would need to be entirely re-written to make the best use of SSE. NVIDIA would have to dump a bunch of money into optimizing for their competitor, which makes no business sense at all, and it still seems unlikely that CPU PhysX would be a strong competitor to GPU PhysX.

    The point of NVIDIA PhysX is to add additional value to NVIDIA GeForce GPUs. What's wrong with a company trying to add additional value to their products?
    NVIDIA Forums Administrator

  7. #57
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    Quote Originally Posted by SexyMF View Post
I'm not really seeing any detailed rebuttal to the Real World Tech article within the blog. The RWT article at least had some analysis showing the use of x87 instructions.

The rebuttal alludes to other bottlenecks which limit the effectiveness of using SSE over x87, but doesn't start down the path of identifying these bottlenecks or what is involved in overcoming them. Or did I miss that?
The premise of the original RWT article is that, after some profiling, a definite pattern emerged -- PhysX was using legacy x87 as opposed to SIMD SSEx for the heavy math lifting; this is besides the fact that PhysX is also not multithreaded on multicore CPUs by default. Kanter then goes on to question why this is the case, and makes his point around the concept that Intel and AMD are both deprecating x87 and focusing on improving/developing SSE.

There are two points of contention -- a) Kanter claims that properly compiled and vectorized SSE can speed up throughput by a theoretical 4x and probably, in reality, 2x, and b) Kanter argues that there is no technical reason why PhysX should not be compiled for SSE rather than x87. In this it is implied nVidia did this intentionally in order to make PhysX on the GPU look that much better -- this seems to have pissed a few people off.

  8. #58
    Xtreme Addict
    Join Date
    Jul 2004
    Location
    U.S of freakin' A
    Posts
    1,931
    Quote Originally Posted by Final8ty View Post
1) PhysX is proprietary technology, yes, & it was so under Ageia as well, & its purpose was to be more than just eyecandy as it could be used in conjunction with any GPU.
    You're forgetting about Ageia's PPU.. PhysX was originally meant to run on a PPU, which is even worse than running it on a GPU, because with the former you absolutely had to buy dedicated hardware (and it was slower and more expensive than a GPU as well).

    You can run PhysX on a single GPU however, provided it's powerful enough and your graphics settings aren't maxed out.

It is limited to eyecandy because NV wants to be unfair to everyone else with its disabling of the function when other GPU makes are in the system, & so developers have no choice but to use it for eyecandy only, because they would lose out on sales as the game could only be run on NV discrete GPUs. & it's nothing about being fair; if developers could make more by being unfair, they would.
It seems you have deep-seated issues with everyday corporate practices.
    Intel Core i7 6900K
    Noctua NH-D15
    Asus X99A II
    32 GB G.Skill TridentZ @ 3400 CL15 CR1
    NVidia Titan Xp
    Creative Sound BlasterX AE-5
    Sennheiser HD-598
    Samsung 960 Pro 1TB
    Western Digital Raptor 600GB
    Asus 12x Blu-Ray Burner
    Sony Optiarc 24x DVD Burner with NEC chipset
    Antec HCP-1200w Power Supply
    Viewsonic XG2703-GS
    Thermaltake Level 10 GT Snow Edition
    Logitech G502 gaming mouse w/Razer Exact Mat
    Logitech G910 mechanical gaming keyboard
    Windows 8 x64 Pro

  9. #59
    Banned
    Join Date
    Mar 2010
    Posts
    88
    Quote Originally Posted by JumpingJack View Post
Fair enough, it reads like the rantings of a rabid fanboy to me. I did talk with Kanter after he posted that article and asked how long it took before he got a call from nVidia PR... it was pretty quick.

    Apparently, I am not the only one with this 'frame of Scali' in my head: http://forum.beyond3d.com/showthread.php?p=1474154
Eh? That link doesn't prove anything... He is in agreement with Scali, but confused why the article was posted now instead of back when this topic first arose.

I think in most cases Andrew would be right that, in general, 20% perf is not trivial. But when you're talking about taking PhysX CPU performance from 8fps to 10fps it *is* trivial. You'd need integer-multiple gains to get it where it needs to be to be valuable.
    Last edited by Svnth; 09-23-2010 at 11:55 PM.

  10. #60
    Banned
    Join Date
    Jan 2003
    Location
    EU
    Posts
    318
    NVIDIA would have to dump a bunch of money into optimizing for their competitor, which makes no business sense at all, and it still seems unlikely that CPU PhysX would be a strong competitor to GPU PhysX.
Nobody here is talking about optimizing it for ATI cards. But if Nvidia tries to push PhysX as a standard, then it needs to have a decently working CPU-based one, and it needs to stop disabling PhysX computations on the Nvidia card if an ATI card is present.
    The current state of PhysX is pretty much a reason for many people to stay away from Nvidia.

  11. #61
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    Quote Originally Posted by Svnth View Post
Eh? That link doesn't prove anything... He is in agreement with Scali, but confused why the article was posted now instead of back when this topic first arose.

    I think in most cases Andrew would be right that, in general, 20% perf is not trivial. But when you're talking about taking PhysX CPU performance from 8fps to 10fps it *is* trivial. You'd need integer-multiple gains to get it where it needs to be to be valuable.
I wasn't trying to prove something, I was just pointing out that others thought it was fanboy drivel as well... read through the thread and what I said -- you implied he never gave reason to appear as a fanboy. I disagree.
Apparently, I am not the only one with this 'frame of Scali' in my head: http://forum.beyond3d.com/showthread.php?p=1474154
As for 'wondering why it was posted now instead of...' -- that is not what I was referring to... perhaps I was not clear... he and I are in agreement that:

    I don't think he does... that seems like a misreading of the article to me.
    And I agree, that blog entry misses the point entirely and misreads what Kanter is really saying.

    I personally couldn't give a rat's behind if it were only 8%, 20%, or 5000% improvement -- a fanboy rant is a fanboy rant.
    Last edited by JumpingJack; 09-24-2010 at 12:10 AM.

  12. #62
    Xtremely Kool
    Join Date
    Jul 2006
    Location
    UK
    Posts
    1,875
    Quote Originally Posted by Carfax View Post
    You're forgetting about Ageia's PPU.. PhysX was originally meant to run on a PPU, which is even worse than running it on a GPU, because with the former you absolutely had to buy dedicated hardware (and it was slower and more expensive than a GPU as well).

    You can run PhysX on a single GPU however, provided it's powerful enough and your graphics settings aren't maxed out.



It seems you have deep-seated issues with everyday corporate practices.
It's the PPU that I was talking about, & it's still dedicated hardware that you have to buy; now it's called an NV gfx card, & you're limited in what other hardware you can have in the system, unlike with the PPU.


And I don't have issues with everyday corporate practices; I just have issues with specific corporate practices, just like most people do, & I have a right to air my dislike just like others.
    Last edited by Final8ty; 09-24-2010 at 12:25 AM.

  13. #63
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by JumpingJack View Post
there is no technical reason why PhysX should not be compiled for SSE rather than x87. In this it is implied nVidia did this intentionally in order to make PhysX on the GPU look that much better.
    How much faster is PhysX when recompiled with SSE flags?

  14. #64
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
I have a question. What happens to PhysX (x87-coded) when native 64-bit games that support it show up on the market (whenever that may be)? We all know that in native AMD64 mode SSE is used instead of x87, so will NV be forced to change its stance?

  15. #65
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    2,128
    Quote Originally Posted by trinibwoy View Post
    How much faster is PhysX when recompiled with SSE flags?
It *should* be faster. That should be enough reason for a recompilation. It takes what, less than 15 seconds to change the compilation flags, and a few minutes to compile.

Then again, compiling with the Intel compiler should already do that for you... And since it is the best (as in performance) C++ compiler for Windows, I'd bet on Nvidia using it. So there should be a clear difference in PhysX CPU performance if the CPUID string were changed from GenuineIntel to AuthenticAMD -- that is, if the PhysX runtime was compiled with the Intel C++ compiler AND one could change the CPUID string of the CPU.

    Time to ask Agner.

...that brought me to a question: why on earth did I recompile the benchmark to use SSE, when the actual Bullet runtime should've been recompiled... Fixing it now. Edit: Recompiled the whole Bullet runtime with -msse, -msse2, -msse3, -mfpmath=sse -mtune=core2. And there was some 5% improvement in the benchmark demo. The physics step went from ~46.5 ms down to ~43.25 ms.
    Last edited by Calmatory; 09-24-2010 at 04:21 AM.

  16. #66
    Xtreme Addict
    Join Date
    Jul 2004
    Location
    U.S of freakin' A
    Posts
    1,931
    Quote Originally Posted by informal View Post
    I have a question.What happens to Physx(x87 coded) when native 64bit games that support it show up on the market(whenever it may be)?We all know in native AMD64 mode SSE is used instead of x87 ,so will NV be forced to change its stance?
Doesn't WOW64 allow 32-bit applications to run in a 64-bit environment without performance loss?

  17. #67
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by Carfax View Post
Doesn't WOW64 allow 32-bit applications to run in a 64-bit environment without performance loss?
It does, but I'm talking about native x64 mode, not compatibility mode.

  18. #68
    Xtreme Addict
    Join Date
    May 2007
    Location
    Europe/Slovenia/Ljubljana
    Posts
    1,540
I don't quite see the point here. PhysX can already run on the CPU, so why make a big deal out of this? The main problem, and why it's so damn slow, is that it's not even utilizing the CPU properly. I used to have a Core 2 Duo E4300, and Mirror's Edge ran exactly as fast on a 4-times-faster Core i7 920 (even overclocked to 4GHz) with more cores, more threads, more cache, etc. The CPU was hardly utilized when I could easily have dedicated 3 full cores just to physics calculation. Yet they were utilized up to 5% if I was lucky.
    So it's not slow because it's not recompiled into something else, but because it's stupid. That's why.
    Intel Core i7 920 4 GHz | 18 GB DDR3 1600 MHz | ASUS Rampage II Gene | GIGABYTE HD7950 3GB WindForce 3X | WD Caviar Black 2TB | Creative Sound Blaster Z | Altec Lansing MX5021 | Corsair HX750 | Lian Li PC-V354
    Super silent cooling powered by (((Noiseblocker)))

  19. #69
    Xtreme Enthusiast
    Join Date
    Jan 2010
    Posts
    533
You know, I have played Vampire Bloodlines on the Source engine; it looks average, loading times are terrible, and it's not optimized. I bought a new GPU, several times faster, with DX11 and everything, and nothing changed. So the GPU wasn't the problem; the Source engine just plain sucks.

  20. #70
    Xtreme Guru
    Join Date
    Aug 2007
    Posts
    3,562
    Interesting conversation.

    I think a lot of people are equating poorly implemented PhysX (Batman, Cryostasis, etc) with the actual performance of it in ALL games. This just isn't the case and games like Dark Void (with the latest patch) and Mafia II illustrate this. Both of these games have absolutely no issue processing higher levels of PhysX on the CPU across multiple threads.

Yes, processing it on a dedicated GPU will yield a performance increase, but in some cases processing it on the CPU yields SLIGHTLY BETTER PERFORMANCE due to RENDERING bottlenecks. Transferring the PhysX processing to the CPU can free up the GPU to get on with the task of rendering.

    On a side note, I am a firm believer that physics processing has no business being done on the GPU when we have multi core CPUs usually sitting around doing nothing.

  21. #71
    Xtreme Addict
    Join Date
    Jul 2004
    Location
    U.S of freakin' A
    Posts
    1,931
    Quote Originally Posted by informal View Post
    It does but I'm talking about Native x64 mode,not compatibility mode.
    I suppose Nvidia will eventually optimize PhysX to take full advantage of x64.

The new SDK 3.0, which will supposedly be released next year, will have SIMD optimizations and inherent multithreading, so I'm betting that it will eventually be 64-bit as well.

    Either way though, hardware accelerated PhysX will always be faster..

  22. #72
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    Quote Originally Posted by trinibwoy View Post
    How much faster is PhysX when recompiled with SSE flags?
    I don't know, if you read my post I was simply explaining where the point of contention was with most people.

  23. #73
    Xtreme X.I.P. Particle's Avatar
    Join Date
    Apr 2008
    Location
    Kansas
    Posts
    3,219
All I know is I need fast physics simulation in Garry's Mod. Until that day, there is little going on in this field that I care about.
    Particle's First Rule of Online Technical Discussion:
    As a thread about any computer related subject has its length approach infinity, the likelihood and inevitability of a poorly constructed AMD vs. Intel fight also exponentially increases.

    Rule 1A:
    Likewise, the frequency of a car pseudoanalogy to explain a technical concept increases with thread length. This will make many people chuckle, as computer people are rarely knowledgeable about vehicular mechanics.

    Rule 2:
    When confronted with a post that is contrary to what a poster likes, believes, or most often wants to be correct, the poster will pick out only minor details that are largely irrelevant in an attempt to shut out the conflicting idea. The core of the post will be left alone since it isn't easy to contradict what the person is actually saying.

    Rule 2A:
    When a poster cannot properly refute a post they do not like (as described above), the poster will most likely invent fictitious counter-points and/or begin to attack the other's credibility in feeble ways that are dramatic but irrelevant. Do not underestimate this tactic, as in the online world this will sway many observers. Do not forget: Correctness is decided only by what is said last, the most loudly, or with greatest repetition.

    Rule 3:
    When it comes to computer news, 70% of Internet rumors are outright fabricated, 20% are inaccurate enough to simply be discarded, and about 10% are based in reality. Grains of salt--become familiar with them.

    Remember: When debating online, everyone else is ALWAYS wrong if they do not agree with you!

    Random Tip o' the Whatever
    You just can't win. If your product offers feature A instead of B, people will moan how A is stupid and it didn't offer B. If your product offers B instead of A, they'll likewise complain and rant about how anyone's retarded cousin could figure out A is what the market wants.

  24. #74
    Xtreme Addict
    Join Date
    Jul 2004
    Location
    U.S of freakin' A
    Posts
    1,931
    Quote Originally Posted by SKYMTL View Post
    Interesting conversation.

    I think a lot of people are equating poorly implemented PhysX (Batman, Cryostasis, etc) with the actual performance of it in ALL games. This just isn't the case and games like Dark Void (with the latest patch) and Mafia II illustrate this. Both of these games have absolutely no issue processing higher levels of PhysX on the CPU across multiple threads.

Yes, processing it on a dedicated GPU will yield a performance increase, but in some cases processing it on the CPU yields SLIGHTLY BETTER PERFORMANCE due to RENDERING bottlenecks. Transferring the PhysX processing to the CPU can free up the GPU to get on with the task of rendering.

    On a side note, I am a firm believer that physics processing has no business being done on the GPU when we have multi core CPUs usually sitting around doing nothing.
    Good points

    Although I disagree that physics processing has no business being done on the GPU. If we want higher level physics, then the GPU is the natural choice since it excels at that kind of processing, has a lot more bandwidth than desktop CPUs (which apparently is important for physics processing), and generally provides much more bang for the buck.

Also, people often forget that dual cores still constitute the majority in gaming PCs; although quad cores are catching up, it's not happening as fast as you'd think.

  25. #75
    Xtreme Guru
    Join Date
    Aug 2007
    Posts
    3,562
    Quote Originally Posted by Carfax View Post
    Good points

    Although I disagree that physics processing has no business being done on the GPU. If we want higher level physics, then the GPU is the natural choice since it excels at that kind of processing, has a lot more bandwidth than desktop CPUs (which apparently is important for physics processing), and generally provides much more bang for the buck.

Also, people often forget that dual cores still constitute the majority in gaming PCs; although quad cores are catching up, it's not happening as fast as you'd think.
Oh, I agree. However, I also think that an effort should be made to max out the CPU before trying to validate a company's marketing practices which say using a GPU for high-level physics is better.

The main problem with doing physics on the GPU is that you will ALWAYS lose rendering power to accomplish it unless you use a dedicated graphics card. Personally, I want the ability to choose which to use, but in many games (again: Batman, Cryostasis, etc.) PhysX is so poorly implemented that the choice is effectively taken away from the user.

That isn't a proper way to sell a technology to consumers. Rather, show us that a CPU can do it WELL and then show how much BETTER a GPU can do it. I think this would actually help sell more GPUs, because in many cases a $60 dedicated GPU will most likely benefit physics calculations much more than a $60 CPU upgrade.

At this point I think that a dedicated GPU setup for PhysX is the only way I would actually consider doing this. Losing rendering power is perfectly fine if the game doesn't demand much, but what about games like Mafia II and Metro 2033? Would I want to lower IQ settings just to ensure persistent debris, bouncing titties and somewhat realistic fabric? Heck no.

