
Originally Posted by
JumpingJack
So I just glanced over your data and have not fully digested it yet, but all the benches you ran are multithreaded. You have not shown anything that explains single threaded performances, rather what you have done is forced a situation where you have 4 available contexts, in one situation 4 threads are scheduled across 2 modules (a sharing situation) and in other case 4 threads spread over 4 modules (a non-shared situation). What you are showing is the performance hit taken when resources in the front end are shared (i.e. cache, TLBs, BTBs, etc etc). The results would be exactly what I would have expected.
What is more interesting about your data is you can now ascertain AMD's claims of 180% or that 1.8 scaling factor of a module vs two distinct cores.
Here is an experiment to do.... repeat this, but with just one application (Fritz Chess) because this app allows you to specify the number of threads spawn.
Do it for 1, 2, 3, and 4 threads. Then turn everything on (all modules, clusters, cores what ever you want to call them), and do the same runs for 1, 2, 3, 4, 5, 6, 7 and 8 threads and plot the scaling vs thread for the three different configurations.
You will find that the fritz run for 1 thread will not be different regardless of how you configure the modules, clusters, cores (again, call them whatever you want to call them).
jack
Bookmarks