Originally Posted by
BeepBeep2
Sorry, it seems my math is wrong. You are right, I calculated area wrong, neglecting a simple formula.
I had calculated 0.7111 * 346; however, the correct formula is 0.7111 * 0.7111 * 346 (A = L*W), meaning Thuban's die on 32nm, if the cache structure and IMC were left the same, would be ~175 mm^2.
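For anyone who wants to check the scaling, here is a quick sketch in Python. It assumes an ideal shrink, where area scales with the square of the node ratio (32/45 = 0.7111), which real processes rarely achieve. The function name is just for illustration.

```python
def shrink_area(area_mm2, old_nm=45.0, new_nm=32.0):
    """Ideal die shrink: both length and width scale by new_nm / old_nm."""
    linear = new_nm / old_nm
    return area_mm2 * linear * linear  # A = L * W, so the factor applies twice

thuban_45nm = 346.0                      # Thuban die area quoted in the post
thuban_32nm = shrink_area(thuban_45nm)   # idealized 32nm estimate

print(f"linear factor: {32 / 45:.4f}")
print(f"Thuban @ 32nm: {thuban_32nm:.0f} mm^2")
```

Applying the factor only once (0.7111 * 346) gives about 246 mm^2, which was the original mistake.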
A "theoretical" eight core STARS design couldn't be much bigger than 250mm^2...giving 9.69mm^2 (x2) for extra (Llano's) cores and a generous 55mm^2 for extra L2 cache and other improvements. It would be impossible for this CPU to be larger than 300mm^2.
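The 250mm^2 ceiling is just a sum of the figures above. A rough sketch, assuming the shrunk Thuban die at about 175mm^2 and taking the Llano core size and the 55mm^2 allowance as given (variable names are mine):

```python
thuban_32nm = 175.0     # shrunk six-core die, mm^2 (estimate from above)
llano_core = 9.69       # per-core area quoted for Llano's cores, mm^2
extra_cores = 2         # six cores -> eight cores
extra_budget = 55.0     # allowance for extra L2 cache and other improvements

eight_core_stars = thuban_32nm + extra_cores * llano_core + extra_budget
print(f"8-core STARS estimate: {eight_core_stars:.2f} mm^2")
```

That lands just under 250mm^2, well short of the 300mm^2 upper bound.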
Thuban is beating BD in EVERY x86-64 single-threaded application I've seen so far except WinRAR and AES-encryption benchmarks (if they happened to run in a single thread, that is), both stock and overclocked. It is also near BD performance at equal or lower power usage despite a deficit of 2 cores. So it seems Thuban would be about 80% better in performance per mm^2, ignoring power consumption, as that would be an unknown at 32nm.
Also, one would have to think that yields would be much better at 32nm with the older, smaller architecture. Smaller dies are easier (not to mention cheaper!) to produce, and chances are the chips would perform better as well, since AMD has worked with K10 for 4 years now.
On the server side, since Magny Cours is an MCM package with 2 Istanbul dies, its area is 724mm^2. On 32nm, this would translate to ~366mm^2...
A twelve core Magny Cours CPU at 32nm, just ~51mm^2 (about 16%) larger than the current 8 core Bulldozer design (366mm^2 vs. 315mm^2), has a four thread benefit (50% more cores/threads for about 16% more area, and that is the desktop chip)...this defeats Tomasis's argument about BD being "designed for server".
In fact, that CPU already performs almost as well as, and sometimes even better than, the 16 core MCM Orochi design while a whole node behind.
AMD was able to pull 2.3 GHz on 45nm with just a 140W TDP on the old architecture, and 2.5 GHz at 140W now, if you look at numbers before process improvements. 2.2 GHz was possible with a 115W TDP. (Opteron 6176 SE, the more recent 6180 SE, and 6174, respectively.)
To sum up, with (correct me if I'm wrong, like Tomasis said I am a "kid") correct math:
Thuban @ 32nm would be around 175mm^2, up to 80% improvement in performance per mm^2 (315mm^2 being 80% larger than 175mm^2)...no less than 40-50% in worst case scenario.
Magny Cours @ 32nm would be only ~51mm^2 (about 16%) larger than the current Orochi design, and performs in the best case equal to the 16 core Orochi MCM design and in the worst case 33% slower. The Orochi MCM design would be 1.7x the size of this "theoretical" Magny Cours.
A "theoretical" 8/16 core "STARS" MCM design would be no larger than 250mm^2/500mm^2, so we end up with a 16 core STARS design at ~500mm^2, 130mm^2 smaller than the Orochi 16 core MCM. This design would be smaller, more efficient per mm^2, and keep the same performance as Orochi MCM in worst case scenarios (where Orochi MCM has pulled ahead of Magny Cours by 33%) even if clocked at a mere 1.8 GHz due to GloFo's 32nm process.
Yields would also be better since die sizes would be smaller, and chips would be produced much more cheaply, as AMD/GloFo has been producing K10 for 4 years.
Did I mention that the old uarch runs much cooler as well? (Not known for sure, since smaller node means heat is more concentrated, but less should be produced)
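The size comparisons in the summary can be re-derived in a few lines. This is only a sketch using the die areas quoted in this post (none of it is measured data), and the "80%" figure assumes equal performance between the two chips:

```python
# Die areas in mm^2, as quoted in the post; the 32nm values are idealized shrinks.
thuban_32nm = 175.0
orochi = 315.0              # current 8 core Bulldozer die
magny_cours_32nm = 366.0
stars_8core = 250.0         # upper bound for the "theoretical" 8 core STARS die

orochi_mcm = 2 * orochi              # 16 core Orochi MCM
stars_16core_mcm = 2 * stars_8core   # 16 core STARS MCM

# Performance per mm^2 ratio IF both chips performed the same.
area_advantage = orochi / thuban_32nm
mcm_saving = orochi_mcm - stars_16core_mcm
orochi_vs_magny = orochi_mcm / magny_cours_32nm

print(f"Thuban perf/mm^2 advantage at equal performance: {(area_advantage - 1) * 100:.0f}%")
print(f"16 core STARS MCM is {mcm_saving:.0f} mm^2 smaller than Orochi MCM")
print(f"Orochi MCM is {orochi_vs_magny:.1f}x the size of Magny Cours @ 32nm")
```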
I'm sure wez, TESKATLIPOKA, Tomasis, informal and others will still find a way to blame the process for all of this. If AMD hadn't let go of the fab, it would still be AMD's fault and nobody would give a :banana::banana::banana::banana: about that argument. I did the math, where is yours?