Originally Posted by
savantu
Legacy code? What exactly is legacy? Everything not compiled with the BD uarch, XOP and FMA in mind? I'm not mentioning AVX, since the implementation seems extremely fragile and causes performance degradation when used (see the thread at RWT).
In x86 you need to maintain performance on old, legacy SW (that is, SW from 5-10 years ago), on current SW, and on future SW (AVX, FMA3). There is only one culprit to blame if BD does poorly on legacy and current SW: AMD management.
They will learn the Pentium 4 lesson the hard way: even Intel couldn't get code optimized fast enough (and back then there weren't so many ISVs and standards: OpenCL, Java, uarch-independent source code, etc.). By the time SSEx SW and TLP started to catch on, Pentium 4 was on the chopping block and Core was on the horizon.
AMD is launching BD now. By the time FMA code is common, BD will be what P4 is today: a distant, ugly memory. As for XOP and BD-optimized code paths, given BD's lackluster reception, most developers will avoid investing much time and resources in them.
Btw, BD reinforces the point that optimizing for product families, as Intel's compiler does, and not just checking for features, is the right way to go. BD has AVX, for example, but its implementation is suboptimal and should be avoided. If the Intel compiler only checked the feature flags, it would churn out auto-vectorized 256-bit AVX code, which would perform terribly on BD. Eric Bron mentioned this interesting twist.
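
To make the difference concrete, here is a rough sketch of the two dispatch strategies. It is not how any real compiler implements its dispatcher: the `CpuInfo` record, the function names and the path labels (`"avx256"`, `"sse128"`) are all made up for illustration. The only hard facts used are the CPUID vendor strings and that Bulldozer reports family 15h. Falling back to a 128-bit SSE path on BD is an assumption that follows the claim above that AVX should be avoided there.

```python
from dataclasses import dataclass

# Hypothetical record of what a CPUID query would report.
@dataclass(frozen=True)
class CpuInfo:
    vendor: str          # CPUID vendor string, e.g. "AuthenticAMD"
    family: int          # CPUID display family
    features: frozenset  # feature flags, lowercase names

BULLDOZER_FAMILY = 0x15  # AMD family 15h

def pick_by_features(cpu: CpuInfo) -> str:
    """Feature-flag-only dispatch: AVX present means use the 256-bit path."""
    return "avx256" if "avx" in cpu.features else "sse128"

def pick_by_family(cpu: CpuInfo) -> str:
    """Family-aware dispatch: check the flag, then the product family."""
    if "avx" not in cpu.features:
        return "sse128"
    # Assumption taken from the post: 256-bit AVX is slow on BD, so skip it.
    if cpu.vendor == "AuthenticAMD" and cpu.family == BULLDOZER_FAMILY:
        return "sse128"
    return "avx256"

bd = CpuInfo("AuthenticAMD", 0x15, frozenset({"sse2", "avx"}))
print(pick_by_features(bd))  # avx256
print(pick_by_family(bd))    # sse128
```

The flag-only version hands BD the 256-bit path simply because the AVX bit is set, while the family-aware version keeps it on the narrower path. The price is that somebody has to maintain the family table for every new chip.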