No, it was not meant to contradict everything; it was just meant to give a wider view on the matter. It clearly shows that it depends on the type of program used. Remember, Bobcat is meant for netbooks/low-power notebooks; think about what kinds of programs that target market uses. Yes, it is clear that there are cases where the decode stage can bottleneck and where the L2 can bottleneck, but for the intended target market those kinds of tasks should be minimal.
While we are on the subject, the BOINC benchmark is Whetstone and Dhrystone, which are useless; for one, Linux scores much higher than Windows on the same system. There is controversy over using this benchmark as the basis of points. Some projects award points as WU time * BOINC bench score; others use a really old computer, such as a PIII machine, as a base and then multiply by a speed-up factor. This whole system gets screwed up when you run non-deterministic algorithms, because the WU may run until some criterion is met.
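As a rough sketch, the first scheme described above amounts to something like this (the function and scale factor are my own illustration, not BOINC's actual credit code):

    /* Hypothetical sketch of "points = WU time * BOINC bench score".
     * wu_seconds is the run time of the work unit; bench_score is the
     * host's synthetic benchmark result.  The scale factor is made up
     * here -- real projects pick their own. */
    double credit_for_wu(double wu_seconds, double bench_score)
    {
        const double scale = 0.01;  /* arbitrary normalization, assumed */
        return wu_seconds * bench_score * scale;
    }

The non-determinism problem follows directly: if wu_seconds varies because the algorithm runs until some criterion is met, identical hosts earn different credit for the "same" work.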
Vectorization is good, but it is not a panacea. Replace the third operation in my example with "mul", "and", "shift", "test" or "sub" and SIMD won't help (even though those ops are still independent). But my point was simple: the bigger the pool of uops a CPU has available for execution, the better the chance its OoO logic has to exploit ILP. This is why I'm surprised by the Bobcat results (if these are real). I would guess that they used a loop buffer, but such a buffer would consume a lot of space on the CPU die.
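To illustrate the kind of independent-but-mixed operations meant here, a minimal C sketch (the constants and function name are mine, not the original example):

    #include <stdint.h>

    /* Three independent operations on three separate values.  An OoO
     * core can issue all of them in parallel on different execution
     * units, but because the operations differ (add, mul, shift) no
     * single SIMD instruction covers all three. */
    uint32_t mixed_ops(uint32_t a, uint32_t b, uint32_t c)
    {
        a += 7;     /* integer add   */
        b *= 13;    /* integer mul   */
        c >>= 2;    /* logical shift */
        return a ^ b ^ c;  /* combine results so nothing is optimized away */
    }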
Just finally realized I can see bumps on the die shot, meaning it's likely a shot of the top-level interconnect. The regular structure over the GPU portion must just be power/ground rails.
However, his example emphasizes how the actual instructions are ordered (that's dynamic OoO execution; see VLIW for ordering done statically by the compiler). He could have noted that those were SIMD operations, though!
Yeah, the BOINC benchmark is a purely synthetic benchmark that has little correlation to real-world performance.
While this is sort of off-topic: the whole notion of a universal credit scheme based on an arbitrarily chosen synthetic benchmark makes consistency between machines very hard to find, considering each processor has different real-world performance characteristics and many projects run completely different algorithms... And this is before even bringing SSE into the mess (the benchmark doesn't use it, and making it do so would only further distort the results), along with the other issues up that alley...
It would be nice if each project instead made a benchmark representing its own algorithm, and if those all got normalized for cross-project comparison!
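A minimal sketch of what such normalization could look like, assuming each project benchmarks its own algorithm against one fixed reference machine (all names here are hypothetical, not actual BOINC code):

    /* Hypothetical cross-project normalization: each project runs its
     * algorithm-representative benchmark on the host and on a fixed
     * reference machine.  Credit scales with the speed-up ratio, so
     * one credit means the same amount of reference-machine time
     * regardless of project. */
    double normalized_credit(double wu_seconds,
                             double host_bench, double ref_bench)
    {
        return wu_seconds * (host_bench / ref_bench);
    }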
Well, if you have free time to play with it...
http://sourceforge.net/ - choose whatever you want.
Is it just me, or is anyone else interested in overclocking that little toy?
extremely interested...
You can do it in software too, I think, Boris.
No, on my old X2 @ 2.2GHz only WMV played smoothly at 1080p; some formats were way too slow.
My distrust of hardware decoding has nothing to do with speed; it has to do with support. I want something that can play all formats in all containers. As far as I know, ATi still doesn't support pixel-mapped 1080p in Vista or Win 7.