Originally posted by sergiojr:
Actually, this is an autoparallelization FAIL for ICC. It doesn't seem to scale beyond 4 cores (the i7-3960X loses significantly to the i7-2700K in lots of tests). They definitely should have run it with OMP_NUM_THREADS=4 on the FX-8150 and set core affinity to the first core in each module. But it is Intel, and I don't think they have any intention of making an AMD processor look good.
It is SPECint, which is more or less single-threaded. Autoparallelization cannot offer significant speedups there.
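For anyone who wants to try the setup suggested in the quote (OMP_NUM_THREADS=4 plus one thread on the first core of each module), here is a rough Python sketch. It assumes Linux, and it assumes the OS numbers the two cores of each module consecutively (0-1, 2-3, ...), which you should check on your own box. The function names are made up for illustration.

```python
import os

def first_cores_per_module(n_cores, cores_per_module=2):
    """Return the first logical CPU of each module.

    Assumption: sibling cores of a module are numbered consecutively,
    so module k owns CPUs k*cores_per_module .. (k+1)*cores_per_module - 1.
    """
    return list(range(0, n_cores, cores_per_module))

def pin_to_first_cores(n_cores=8, cores_per_module=2):
    """Restrict this process to one core per module and size OpenMP to match."""
    cpus = first_cores_per_module(n_cores, cores_per_module)
    # One OpenMP thread per selected core (4 on an 8-core, 4-module chip).
    os.environ["OMP_NUM_THREADS"] = str(len(cpus))
    # os.sched_setaffinity is only available on Linux; skip it elsewhere.
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, set(cpus))
    return cpus
```

If you call pin_to_first_cores() and then launch the benchmark with subprocess, the child inherits both the environment variable and the affinity mask.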
And comparing SSE3 code for AMD and AVX code for Intel is totally irrelevant.
Why? AVX and SSE have the same throughput on BD, since AMD did not spend any transistors optimizing for AVX. The only question is whether using FMA would have made a difference.
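To put numbers on the throughput point, here is a back-of-the-envelope sketch in Python. It assumes each BD module has two 128-bit FMAC pipes and that a 256-bit AVX op occupies both of them; those figures are my assumption, not something stated above, and the result is a theoretical peak rather than a prediction for this benchmark.

```python
# Assumed Bulldozer FPU layout (per module); adjust if your figures differ.
FMAC_PIPES_PER_MODULE = 2
PIPE_WIDTH_BITS = 128
SP_BITS = 32  # width of one single-precision float

def sp_lanes_per_cycle(vector_bits):
    """Single-precision lanes processed per cycle per module."""
    # A vector wider than one pipe ties up several pipes at once.
    pipes_per_op = max(1, vector_bits // PIPE_WIDTH_BITS)
    ops_per_cycle = FMAC_PIPES_PER_MODULE // pipes_per_op
    return ops_per_cycle * vector_bits // SP_BITS

def peak_sp_flops_per_cycle(vector_bits, fma):
    """Peak single-precision flops per cycle; an FMA counts as two flops per lane."""
    return sp_lanes_per_cycle(vector_bits) * (2 if fma else 1)

sse_lanes = sp_lanes_per_cycle(128)   # two 128-bit ops per cycle
avx_lanes = sp_lanes_per_cycle(256)   # one 256-bit op per cycle
```

Under these assumptions both widths come out to the same number of lanes per cycle, and only FMA changes the peak.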