AMD "Piledriver" refresh of Zambezi - info, speculations, test, fans


11-22-2011, 10:51 PM
behrouz

Don't bother with the Ars Technica article.

New compilers like Open64 and GCC 4.7 improve BD's performance. Here is the reason:

http://www.phoronix.com/scan.php?pag...r_open64&num=3

http://www.phoronix.com/scan.php?pag...c_open64&num=3
11-23-2011, 12:26 AM
-Boris-

Quote:

Originally Posted by TESKATLIPOKA

-Boris-
can you show me some proof? Some parts are so wrong, mainly the 2x performance per mm2. I really want to know how you got that. And the same or higher frequencies is just a wet dream of yours. It's enough to look at Llano with the IGP disabled: that CPU can't break the 4GHz barrier, while BD is attacking 5GHz and is made on the same 32nm process.

P.S. I agree on higher IPC, and perf/W is worse only on the 4-module part, which is still on par with Thuban; the 3-module and 2-module parts are better than anything else from AMD. Example: Llano at 2.9GHz has the same power consumption as the FX-4100, but the FX is a few % faster.
Even the FX-6100 has only 4% higher consumption but performs 15% better; the rest of AMD's CPUs have higher power draw.

I said on 32nm! And I know, we don't know exactly how Phenom II would perform on 32nm, but it wouldn't be worse than on 45nm. How do I know? GloFo's 32nm isn't that bad, since it already manages two really large dies. I think it's fairly safe to say that Thuban would reach somewhat higher frequencies on early 32nm at almost half the size of BD, and with BD's or Llano's better IMC and Llano's IPC improvements it would already gain the equivalent of a few hundred MHz of extra performance. There you have at least 10% higher performance than Thuban at almost half the size of BD, and that with plenty of headroom to grow!
And as I have said many times before, you can't use Llano as an example of the performance of GloFo's 32nm. It's a 1.45 billion transistor monster with almost no "easy" transistors and die space like caches; it's all complex logic. And it has tradeoffs we don't know anything about. A GPU design originally made for TSMC's low-power, low-frequency processes, meant for larger dies with more shaders, wouldn't work very well on a high-power, high-frequency process like the ones you make CPUs on. Just as a CPU would have a hard time reaching high frequencies if made on a process tuned for wide, low-frequency chips like a GPU. Another reason for Llano's bad overclocking is that it has no frequency limiters and is locked, so raising the bus raises a lot of frequencies that shouldn’t be touched. No one says Intel's 32nm is bad just because locked SBs don’t overclock well at all.

Quote:

Originally Posted by Oliverda

http://prohardver.hu/dl/cnt/2011-10/...reltel_cpu.png

Bulldozer: 315 mm2

Thuban: 346 mm2

Just to add that GF's 45 nm process is capable of higher performance than the current 32 nm one.

Seriously? Quote mining much? You forgot the last part of the sentence "Phenom II has higher performance per watt, twice(!) the performance per mm² (taking processes into account)." I honestly thought our discussion would be above your quote mining tactics. And your chart just proves my point. Phenom II DOES have higher performance per watt! And if shrunk to 32nm it would be almost half as big, but even cooler and capable of even higher performance! There you have twice the performance per mm². And no, no one has presented any proof that 32nm is very bad at all, of course it might not be the best process right now, but if it's capable right from the start to make two gargantuan chips it can't be too bad, and it would most likely perform much better on smaller chips, like Thuban.
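
For anyone who wants to check the "twice the performance per mm²" arithmetic, here is a rough sketch in Python. The two die areas come from the chart quoted above; everything else is an assumption for illustration only: the 0.5 shrink factor (standing in for "almost half as big"), the name `perf_per_mm2_ratio`, and the idea that performance can be reduced to a single relative number.

```python
# Die areas in mm^2, taken from the chart quoted above.
BULLDOZER_AREA_MM2 = 315.0
THUBAN_45NM_AREA_MM2 = 346.0

# Assumption: a 32nm Thuban ends up at half the 45nm area ("almost half as big").
ASSUMED_SHRINK_FACTOR = 0.5


def perf_per_mm2_ratio(rel_perf, area_a_mm2, area_b_mm2):
    """Perf/mm^2 of chip A over chip B, where rel_perf = perf_A / perf_B."""
    return rel_perf * area_b_mm2 / area_a_mm2


shrunk_area = THUBAN_45NM_AREA_MM2 * ASSUMED_SHRINK_FACTOR

# Case 1: the hypothetical shrunk Thuban merely matches Bulldozer (rel_perf = 1.0).
ratio_at_parity = perf_per_mm2_ratio(1.0, shrunk_area, BULLDOZER_AREA_MM2)

# Case 2: relative performance needed for exactly 2x perf/mm^2.
rel_perf_for_2x = 2.0 * shrunk_area / BULLDOZER_AREA_MM2

print(f"shrunk area: {shrunk_area:.0f} mm^2")
print(f"perf/mm^2 ratio at equal performance: {ratio_at_parity:.2f}x")
print(f"relative performance needed for 2x: {rel_perf_for_2x:.2f}")
```

Under these assumptions, equal performance gives roughly 1.8x the performance per mm², and the full 2x needs the shrunk chip to be about 10% faster than BD. Whether a real 32nm Thuban would hit that area or that performance is exactly what the thread is arguing about, so treat the inputs as placeholders.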

Quote:

Originally Posted by Smartidiot89

I am not going to say Bulldozer is great cause it isn't. Facts are it was designed with power efficiency in mind and something has gone terribly wrong with the architecture, and we won't know for sure if these problems can be resolved until we see Piledriver to be honest.

And clocks aren't "near the roof" - far from it. AMD have for the past two quarterly results said explicitly they aren't happy with 32nm performance at GlobalFoundries and have overall been very open about it. Bulldozer was designed with high frequencies which shows as it retails at 3,6 GHz base. Llano was aimed at >3,0 GHz and reached only 2,9 GHz at launch and lets not speak of the mobile models. There are no problems releasing a CPU with clocks above 4,0 GHz and good power efficiency as long as it was designed for it (IPC usually gets cut then) which was a design compromise AMD did.

The manufacturing process isn't good at all today, and AMD decided to launch a brand new architecture on an unproven manufacturing process (never done by AMD nor Intel ever before?). I will wait for Piledriver before I pass any judgment on the architecture as a whole. I don't expect it to rock anyones world, but it will most likely be better than K10 and Phenom II.

First, AMD's current projections for Piledriver don't show it being that much better. It just can't magically get the twice-as-high performance per mm² it needs to be competitive in the long run. And no process scales very well at frequencies above 3-4GHz. That's the reason BD fails: they made huge tradeoffs for frequency, and those come at a very high price. The changes needed in an architecture or a process to earn an extra GHz at these levels are huge! In the past we could see a 50-100% frequency increase with each process, sometimes even more; today a new process doesn't give you that. You can still make larger and more complex chips, but not much has happened with frequencies since the 3GHz barrier was broken many years ago. So if a step that used to give you a lot of frequency headroom no longer does, how much do you have to do to earn the 1-2GHz that AMD needs right now? When closing in on 4GHz, Intel's speed demon P4 couldn't go higher, while designs made for lower frequencies kept rising in speed until the high-IPC A64 and Core 2 were capable of almost the same frequencies. Higher IPC has the same costs it always had, but higher frequencies today require larger tradeoffs than ever before. That's why the relatively huge tradeoffs in BD haven't given more than a few hundred MHz, which Thuban on 32nm might have reached just as well.

So, you simply can't make BD at 5GHz and 95W, and that’s where it needs to be to at least be competitive with mid-range SB, not even taking BD's enormous die size into account. It would be easier to make a 32nm Thuban with IPC improvements at 95W, and it would have room to grow. So yes, BD is pretty close to the roof; the roof might be just a bit over 4GHz in base clock, and it needs to be much, much higher.

And no, no design can eradicate transistor-level leakage. In the old days, before leakage was a big problem, you could make designs for higher frequencies, but not today; both low-IPC and high-IPC designs suffer from the same leakage at the same high frequencies. The larger the die, the larger the problem, as it usually means a voltage increase. In a leakage-free world, pipeline stages half as long could mean twice the frequency, but if you run into massive leakage problems that grow exponentially with frequency and voltage, then both designs suffer from this at the same frequencies. So today, tuning down IPC to get more frequency is asking for more heat at the same performance. This is the same story all over again as when Prescott had its problems: people blamed the process, even though Dothan shone on the same process. The difference was that the speed demon Prescott was already pushing the roof.

Simply put, to be just a bit competitive with SB, BD would need to be at 5GHz and 95W with a much smaller die. No process can fix that! And how will it go against IB, which seems to gain an even larger performance per watt advantage over PD? The situation might be even worse between PD and IB!
11-23-2011, 12:53 AM
mongoled

<Sarcasm>My word, I have never seen so many formidable CPU architects posting in one place at the same time in all my time on the Interwebs....</Sarcasm>

My, my, what are some of you people like?

Yes, things aren't as expected

Yes, things could have been done better

Yes, there are/were a lot of smoke and mirrors

But some of the stuff being posted here is borderline insanity!

I don't know much about CPU architecture myself, but I sure know it isn't simply plug and pray.

First some of you need to grasp the concept of how tiny tiny tiny tiny tiny to the power of a million (simple speak :D) the parts that constitute a CPU are.

Then you need to have some humility and understand that it takes years of knowledge, research and experience to even be in a position to know somewhat what is going on.

Then you need to understand that you are still reliant on old knowledge and information.

Then you need to understand that even if you know the above, the item still needs to be physically made.

Some people posting here need a reality check and need to learn to chill out and look at things a little more pragmatically, as you DON'T (I speak for the majority here including myself) have the skillset to be discussing these things in any other way.

Express your opinions, yes

But stating things as fact, well, absurd isn't the word.............
11-23-2011, 02:30 AM
Smartidiot89

@ Boris: I've never suggested 95W @ 5 GHz, but the 32nm process is borked. I think AMD has a much bigger clue about what they did than anyone on enthusiast forums. There have been designs before with long pipelines and high frequencies that worked; IBM Power6 reached over 5 GHz on a 65nm process. IPC alone is irrelevant; it's the combination of IPC and clock frequency that really matters. The advantage is, in theory, fewer transistors and less die area, and thus fewer transistors to power, but you need higher frequencies. Obviously AMD dropped the ball here, but the concept nonetheless works if properly executed.

I am not expecting miracles from Piledriver, only that the path AMD chose will start making sense compared to K10. And the manufacturing process they are using is severely borked, so there are increases in frequency and power efficiency to be had here.
11-23-2011, 02:58 AM
-Boris-

Quote:

Originally Posted by Smartidiot89

@ Boris: I've never suggested 95W @ 5 GHz, but the 32nm process is borked. I think AMD has a much bigger clue about what they did than anyone on enthusiast forums. There have been designs before with long pipelines and high frequencies that worked; IBM Power6 reached over 5 GHz on a 65nm process. IPC alone is irrelevant; it's the combination of IPC and clock frequency that really matters. The advantage is, in theory, fewer transistors and less die area, and thus fewer transistors to power, but you need higher frequencies. Obviously AMD dropped the ball here, but the concept nonetheless works if properly executed.

I am not expecting miracles from Piledriver, only that the path AMD chose will start making sense compared to K10. And the manufacturing process they are using is severely borked, so there are increases in frequency and power efficiency to be had here.

No, I say 95W at 5GHz is needed to be even a bit competitive with SB; I never said it was your opinion. Power6 had one thing AMD doesn't: IBM. IBM used some very interesting techniques to combat leakage that AMD doesn't have. And it was an in-order processor, cutting away a lot of heat-generating logic. And I know IPC is nothing without frequency. But today, when you have to make huge sacrifices to make an architecture gain a few hundred MHz, IPC is in itself more important than ever.

And I still haven't seen how the manufacturing process can play such a big role here. Not even Intel could make bulldozer nearly as fast, cool and small as even mid range SB. Besides no one has given any proof that the process is that bad yet. I know there are supply problems, 32nm is still a lot better than 45nm considering that a huge monstrosity like BD is even doable, it wouldn't work at all at 45nm. So even if 32nm can get better than it is today, it's still better than 45nm, which would make a Phenom III on 32nm much more attractive.
11-23-2011, 04:17 AM
Piledriver

Quote:

Originally Posted by Zeus

Look at the post above. Don't see it being any faster than Thuban, do you?

You are kidding right? Why don't you instead of looking at cherry picked benchmarks by someone with an agenda, consisting of only single thread benchmarks, and great sites like neoseeker :p:, you looked at the complete reviews of the best tech sites out there?

Quote:

I saw 5/6 reviews, and my impression was that Zambezi won the large majority of the tests vs Thuban, but when reading the comments on this thread I doubted myself, so I had to double check and review the reviews I've seen. I stopped at the third, it was pointless to go on: Techreport 20-5, X-bit labs 21-6, TomsHardware 31-10, bringing the total to 72-21 benchmarks in favor of the FX-8150. It's not even close. How does that translate to the FX-8150 being 40% slower? Or the 1100T being quite a bit faster? Or a worse launch than Barcelona, for that matter? Phenom 9600 lost the majority of the benchmarks to the X2 6400.

Dispute this. i will be quietly lmao as you try. Its 72 wins for 8150 against 21 to thuban. Good luck.
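
The totals above are easy to verify. A quick Python sketch, using only the three per-review scores quoted in the post (the names `review_tallies` and `total_tally` are made up for this example):

```python
# (FX-8150 wins, Thuban wins) per review, exactly as quoted in the post.
review_tallies = {
    "Techreport": (20, 5),
    "X-bit labs": (21, 6),
    "TomsHardware": (31, 10),
}


def total_tally(tallies):
    """Sum per-review (wins, losses) pairs into one overall pair."""
    wins = sum(w for w, _ in tallies.values())
    losses = sum(l for _, l in tallies.values())
    return wins, losses


fx_wins, thuban_wins = total_tally(review_tallies)
fx_share = fx_wins / (fx_wins + thuban_wins)
print(f"FX-8150 {fx_wins} - {thuban_wins} Thuban ({fx_share:.0%} of tests)")
```

This confirms the 72-21 total. Keep in mind that a win count says nothing about the size of each win or how the benchmarks were chosen, so it cannot settle the "how much faster" question on its own.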

A recent review made by the best tech site out there:
http://techreport.com/articles.x/21987

8150 wins 25, 1100t wins 7, even the 8120 beats the 1100t often, How do you reconcile this as thuban being faster?

Oh bu-bu-bu-but Thuban has better IPC... and? Bulldozer has better turbo, can handle 8 threads and hopefully will get a lot higher frequencies. Why do Intel fanboys keep bringing up iTunes? First cherry-picked benchmark, go figure, iTunes... are they trying to say Bulldozer can't handle iTunes? Why don't they show a single-thread benchmark of Windows Calculator? It would be as useful.

And if you do manually what the OS should do the difference between 8150 and 1100t is even bigger:
http://techreport.com/articles.x/21865/2

Fact, broken phenom was clearly beaten by K8, a broken bulldozer clearly beats Phenom.

http://www.electroiq.com/articles/ss...nitiative.html
Intel will have finfets on the market next year, with 22nm, everyone else will have it with 14nm, god knows when. A consortium including everyone else but Intel, can't keep up with Intel, and people want AMD alone to compete and win against Intel...
11-23-2011, 04:26 AM
Piledriver

Quote:

Originally Posted by -Boris-

No, I say 95W at 5GHz is needed to be even a bit competitive with SB

:shakes:
11-23-2011, 04:34 AM
-Boris-

Quote:

Originally Posted by Piledriver

:shakes:

Your point is? Show me some valid points.
I've seen i3s at standard frequency beat BD at 4.7GHz+ in games.
11-23-2011, 05:05 AM
Oliverda

Quote:

Originally Posted by muzz

Not sure how anyone can blame GF for the architecture failure.
Granted, they DO take up too much power, and it DOES heat up too much when OC'd....

What else can be THEIR fault?
How fast was it supposed to be, 6G? How was the performance of that monster OC that was on here, did it even compete with a 2600 or 990 at 4.5 g?
At stock frequencies the draw and heat are fine, and this chip gets WHALED on, which tells me and anyone else paying attention that its IPC is weak.
We still live in a ST world, regardless of what the geniuses over at AMD say.

This post reflects pretty well that you aren't familiar with the BD uarch at all.

Quote:

Originally Posted by -Boris-

Seriously? Quote mining much? You forgot the last part of the sentence "Phenom II has higher performance per watt, twice(!) the performance per mm² (taking processes into account)." I honestly thought our discussion would be above your quote mining tactics. And your chart just proves my point. Phenom II DOES have higher performance per watt!

Shrunk? Please show me that shrink. Your statement is just based on a simple theory.

Quote:

Originally Posted by -Boris-

And if shrunk to 32nm it would be almost half as big, but even cooler and capable of even higher performance! There you have twice the performance per mm².

Or not. Probably that shrink would be worse than Thuban because of the crappy 32nm tech. If not, then prove it please.

Quote:

Originally Posted by -Boris-

And no, no one has presented any proof that 32nm is very bad at all, of course it might not be the best process right now, but if it's capable right from the start to make two gargantuan chips it can't be too bad, and it would most likely perform much better on smaller chips, like Thuban.

There is Llano. It's perfect proof. Neither AMD nor the overclockers can reach frequencies similar to what we saw on Propus or Deneb.
11-23-2011, 05:32 AM
muzz

Quote:

Originally Posted by Oliverda

This post reflects pretty well that you aren't familiar with the BD uarch at all.
.

Is that so?
Is it performing at stock frequencies in comparison to its counterparts at similar clocks? No
Other than the MASSIVE heat and power draw, when it's overclocked, does it compete favorably to an overclocked SB? No

Not sure what you want me to say, clock for clock it just doesn't get it done, and there are hundreds of benches out there that prove this to be correct.
11-23-2011, 05:33 AM
Piledriver

Quote:

Originally Posted by -Boris-

Your point is? Show me some valid points.

You said bulldozer needs to be 5Ghz to be a "BIT" competitive and now you are asking for valid points? Priceless.

Quote:

I've seen i3s at standard frequency beat BD at 4.7GHz+ in games

You just said in a tech forum "I've seen" and you want to be taken seriously?

I'm gonna use the same review i used above, it's the most recent:
http://techreport.com/articles.x/21987/17

A 4.4Ghz 8150 beats the stock 2700k quite often, yet somehow 5ghz is needed to be a "BIT" competitive.

Game benchmarks, 8150 vs i3 2100: 8150 wins 7, i3 wins 3, 2 draws... 2 of the 3 wins of the i3 were by 1 frame, and in the low resolutions the 8150 wins by a large margin, but hey, "you've seen"... how can I dispute that.

Now i will stop feeding trolls and i will get on with my life. Thank you sir.
11-23-2011, 05:34 AM
savantu

Quote:

Originally Posted by Oliverda

Shrunk? Please show me that shrink. Your statement is just based on a simple theory.

Or not. Probably that shrink would be worse than Thuban because of the crappy 32nm tech. If not, then prove it please.
..

A shrink typically brings you an area scaling of 0.6-0.7x (~220mm^2 for a 32nm Thuban vs. the 45nm Thuban) and 10-20% higher clocks (3.5-3.9GHz), ceteris paribus (uarch-wise). So would a 220mm^2, 3.8GHz Thuban be better than today's BD? Most likely yes, in both ST and MT workloads.
The problem is that such a Thuban would have several issues:
- I do not know how speed-path limited K10.5 is with its 3-cycle L1 and 12-cycle L2; in other words, getting to 3.5-4GHz might have required AMD to relax the cache latencies (to something like 4 and 14-15 cycles respectively).
- It lacks AVX and FMA. I can only assume an approach similar to the one taken with BD: use the existing 128-bit FPUs and split the 256-bit AVX operations into 2 halves to minimize area and complexity. I do not know the increase in FPU area and power needed to support AVX and FMA, but I don't think it's trivial.
- It maintains the status quo vs. Intel. Thuban needs roughly 2x the core count to match Intel Xeons. BD did not improve on this; it definitely needs at least 2x the core count to match Intel Xeons.

Given BD's failures, it could be that at least vs. BD ver 1, a 32nm Thuban might have performed better.
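
The rule-of-thumb numbers above can be reproduced with a few lines of Python. This is only a sketch of the back-of-envelope estimate: the 346 mm² area is the Thuban figure posted earlier in the thread, the 3.3 GHz base clock is an assumption (the stock clock of the Phenom II X6 1100T), and `shrink_estimate` is a name invented for this example.

```python
THUBAN_45NM_AREA_MM2 = 346.0   # die area posted earlier in the thread
THUBAN_BASE_CLOCK_GHZ = 3.3    # assumption: Phenom II X6 1100T stock clock


def shrink_estimate(area_mm2, clock_ghz, area_factor, clock_gain):
    """Apply a rule-of-thumb shrink: scale the area, raise the clock by clock_gain."""
    return area_mm2 * area_factor, clock_ghz * (1.0 + clock_gain)


# Optimistic and conservative ends of the quoted ranges.
best_area, best_clock = shrink_estimate(
    THUBAN_45NM_AREA_MM2, THUBAN_BASE_CLOCK_GHZ, area_factor=0.6, clock_gain=0.20)
worst_area, worst_clock = shrink_estimate(
    THUBAN_45NM_AREA_MM2, THUBAN_BASE_CLOCK_GHZ, area_factor=0.7, clock_gain=0.10)

print(f"area:  {best_area:.0f} to {worst_area:.0f} mm^2")
print(f"clock: {worst_clock:.2f} to {best_clock:.2f} GHz")
```

With these inputs the area lands between about 208 and 242 mm² and the clock between about 3.6 and 4.0 GHz, which brackets the ~220mm^2 figure. A slightly lower starting clock gives the 3.5-3.9GHz range quoted in the post.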

Imagine we are in late 2008/early 2009. BD simulations show the CPU to be too large and too slow on 45nm vs. competing CPUs. At the same time, Intel has announced they will not use SSE5 and XOP but will go for AVX and FMA3. This raises an interesting point: what if AMD had planned a 32nm Thuban for summer 2011, delayed BD to 2012, dropped FMA4 support, and focused only on AVX and FMA3?
The first one should have been not that difficult to do (?) and would have bought time to polish BD. AVX and FMA4 support is more or less irrelevant now and by the time they will become widespread, BDver1 is history anyway.
11-23-2011, 05:55 AM
sergiojr

Quote:

Originally Posted by savantu

A shrink typically brings you ... and 10-20% higher clock

Do you have an example of such a shrink besides Deneb at 45nm? The history of AMD shrinks (90nm, 65nm) teaches that they give 10-20% lower clocks at the start. And speaking of Deneb, it seems to me it was more Agena's failure than Deneb's win. AMD had plans for a 3GHz Phenom, but the TLB bug left no time to develop a frequency-optimized stepping of Agena before Deneb. So a 3GHz Phenom vs a 3GHz Phenom II would mean no clock increase from the shrink at the start.
11-23-2011, 06:01 AM
wez

Quote:

Originally Posted by BeepBeep2

Charts are above. They may be handpicked (you asked me to find them...)...

I see no reason to repeat stuff, so just read the posts above this one, and look at non-hand picked results.

Still waiting on your facts on 32nm thuban/agena. But I see you changed direction after being called out on it :rolleyes:
11-23-2011, 06:15 AM
-Boris-

Quote:

Originally Posted by Oliverda

Shrunk? Please show me that shrink. Your statement is just based on a simple theory.

A theory yes, not a hypothesis. BD wouldn't be possible on 45nm, but it is on 32nm. The fact that their 32nm is capable of BD is proof enough that it's not crap, it might not be the best around, but it's good enough.

Quote:

Originally Posted by Oliverda

Or not. Probably that shrink would be worse than Thuban because of the crappy 32nm tech. If not, then prove it please.

No one has yet shown any proof that 32nm is that bad; the fact that BD is alive and kicking is proof enough that 32nm works. Thuban is not nearly as hard to produce as BD, so if you can make BD, then Thuban would be easy. Thuban would probably be below 200mm².
For me the existence of beasts like Llano and BD is proof enough that 32nm is way better than 45nm, and Thuban would be better off at 32nm.

Quote:

Originally Posted by Oliverda

There is Llano. It's perfect proof. Neither AMD nor the overclockers can reach frequencies similar to what we saw on Propus or Deneb.

Not proof at all! Are locked SBs proof that Intel's 32nm sucks because they don't overclock? No! Llano suffers from similar problems since it isn't unlocked, and what's worse, you can't lock frequencies like PCIe. We have no clue how Llano would clock if it were unlocked. Besides, the integrated GPU isn't made for that kind of process, which means there will be tradeoffs in process choice when putting it on the die. The proof we do have is that Llano consumes 5-20W less power than a comparable Athlon II in different tests, and that with an extra GPU in the test for the Llano! What does that say about GloFo's 32nm? It's better than 45nm!

So, you have no valid proof whatsoever that CPUs fare worse on 32nm than 45nm. I on the other hand have numbers that show lower power consumption, and the fact that BD exists is a strong indicator that a much simpler chip would perform quite well on 32nm.

There are few tests that compare Athlon II with Llano. Here is one; it's in Swedish, but I hope you understand charts.
http://www.sweclockers.com/recension...no/25#pagehead
11-23-2011, 06:15 AM
muzz

Quote:

Originally Posted by behrouz

dont bother with article ars technica

new compilers like open64 and gcc 4.7 improved BD's performance , here Reason :

http://www.phoronix.com/scan.php?pag...r_open64&num=3

http://www.phoronix.com/scan.php?pag...c_open64&num=3

I can't take them seriously, they overclocked the 8150, and their results got WORSE?
11-23-2011, 06:36 AM
-Boris-

Quote:

Originally Posted by Piledriver

You just said in a tech forum "I've seen" and you want to be taken seriously?

The i3 was a worst-case thing. I'm fully aware that an i3 isn't close to an 8150, but when it actually scores better in some games even though the 8150 is heavily overclocked, I use that as an example that there is a long way left to beat the i7 in games.

Quote:

Originally Posted by Piledriver

I'm gonna use the same review i used above, it's the most recent:
http://techreport.com/articles.x/21987/17

A 4.4Ghz 8150 beats the stock 2700k quite often, yet somehow 5ghz is needed to be a "BIT" competitive.

Game benchmarks, 8150 vs i3 2100: 8150 wins 7, i3 wins 3, 2 draws... 2 of the 3 wins of the i3 were by 1 frame, and in the low resolutions the 8150 wins by a large margin, but hey, "you've seen"... how can I dispute that.

And how often does an overclocked 8150 beat a stock i7 when you look at anything other than heavily multithreaded benches? How often does a BD at any frequency beat a stock SB i7 in games? Show me!

At what frequency can a BD match a stock i7 across the board?

EDIT:
For some reason gamebenches with overclocked BD seems to be rare. So I'll give you the ones I found:
http://www.neoseeker.com/Articles/Ha...x-8150/11.html
http://www.overclockers.com/amd-fx-8...ocessor-review <-- Graphics limited, so differences appear smaller.
http://www.vortez.net/articles_pages...review,13.html
http://www.madshrimps.be/articles/ar...#axzz1eXh3RmmC

Again, at what frequency can Bulldozer match this?! You are free to supply reviews of your own to show gaming performance between i7 and overclocked BD.
11-23-2011, 06:59 AM
savantu

Quote:

Originally Posted by sergiojr

Do you have an example of such a shrink besides Deneb at 45nm? The history of AMD shrinks (90nm, 65nm) teaches that they give 10-20% lower clocks at the start. And speaking of Deneb, it seems to me it was more Agena's failure than Deneb's win. AMD had plans for a 3GHz Phenom, but the TLB bug left no time to develop a frequency-optimized stepping of Agena before Deneb. So a 3GHz Phenom vs a 3GHz Phenom II would mean no clock increase from the shrink at the start.

It's a general rule of thumb in the industry. Moving to a new process brings you two advantages :
-die size reduction, maximum is 50% (0.7*0.7 )
-20% more frequency for the same power

All new processes usually claim a 20-50% power reduction or, alternatively, 20-40% higher clocks for the same power consumption.
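
The "maximum is 50% (0.7*0.7)" figure comes from scaling both die dimensions by the same linear factor. A small Python sketch of that idealized calculation follows; `ideal_area_factor` is a name made up for this example, and treating the node names (45, 32) as real linear dimensions is itself a simplifying assumption.

```python
def ideal_area_factor(old_node_nm, new_node_nm):
    """Idealized area scaling: both die dimensions shrink by new/old."""
    linear = new_node_nm / old_node_nm
    return linear * linear


# Rule-of-thumb linear factor of 0.7 per full node.
rule_of_thumb = 0.7 * 0.7

# Same idea using the node names themselves, 45nm -> 32nm.
from_node_names = ideal_area_factor(45, 32)

print(f"rule of thumb:   {rule_of_thumb:.2f}")
print(f"45nm -> 32nm:    {from_node_names:.3f}")
```

Both routes give an area of about half the original. That is a best case: it assumes every part of the die scales perfectly, so real shrinks would be expected to land above it.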
11-23-2011, 07:38 AM
freeloader

Quote:

Originally Posted by muzz

I can't take them seriously, they overclocked the 8150, and their results got WORSE?

Clock throttling due to heat.
11-23-2011, 08:00 AM
Zeus

Quote:

Originally Posted by Piledriver

You are kidding right?

I guess i was when i bought an FX8120. Coming from a Thuban i can tell performance sucks.

But with a name like Piledriver i don't think you're gonna take that from someone with first hand experience.
11-23-2011, 08:12 AM
Smartidiot89

Quote:

Originally Posted by -Boris-

No one has yet shown any proof that 32nm is that bad; the fact that BD is alive and kicking is proof enough that 32nm works. Thuban is not nearly as hard to produce as BD, so if you can make BD, then Thuban would be easy. Thuban would probably be below 200mm².
For me the existence of beasts like Llano and BD is proof enough that 32nm is way better than 45nm, and Thuban would be better off at 32nm.

So I take it you're still ignoring the facts that AMD have said openly that GlobalFoundries 32nm didn't reach AMD's expectations in both performance and that yields are bad? The last two quarterly calls they've talked about it with media and investors, they also issued a press release before their Q3 results saying projections for that quarter would be lower because of bad yields at their 32nm node.

Llano was also still projected to enter the market at 3,0+ GHz yet only retailed at 2,9 GHz, it was also supposed to have launched late-2010 and not mid-2011. Llano is also in extremely short supply both in the retail space, but also with OEM's. 32nm is horrid right now and facts are that AMD aren't happy with it.

There is no doubt 32nm "works" but it's still a dog with horrible yields, which needs to be fixed and is reflected upon in both of their 32nm products. Talking about Bulldozer, it is also very possible AMD are running specific functions at lower clocks, which can impact performance greatly.
11-23-2011, 10:32 AM
BeepBeep2

Quote:

Originally Posted by wez

I see no reason to repeat stuff, so just read the posts above this one, and look at non-hand picked results.

Still waiting on your facts on 32nm thuban/agena. But I see you changed direction after being called out on it :rolleyes:

Considering half the results lean heavily toward Thuban being the better architecture and the other results show FX matching Sandy Bridge (in MT performance only...losing up to 80% in single thread) using 25% more power to do it.

Facts on 32nm Thuban/Agena? Changed direction after being called out on it?
Llano's refined core was supposed to gain up to 5% IPC, correct?
Lets say we shrunk Thuban but used Llano's core...a 6 core would be 269mm^2 like I stated before, correct? Assuming that the 32nm process can produce chips that function at least as good as the 45nm, (or maybe something like the 90nm > 65nm transition was at least) we would have chips with a much smaller die and less power consumption than current BD, producing much more performance per mm^2 even if you ignore the power consumption. I didn't say "Add two cores for Phenom II X8 and set it at 4 Ghz" like informal thought I did. Anyway, the X6 CPU performs very close to BD in real world apps when both are overclocked to 4.2/4.8. Also, "STARS" is very bandwidth starved, the more you overclock ram and overclock CPUNB the better it performs, what if it had the type of bandwidth available that BD has? More IPC improvement.

The only comment I made about an eight core with the old uarch was that the die size would be around 330-340mm^2, only slightly bigger than BD is today (~5-10%). Anyway, who knows if they couldn't have added two more cores AND increased clock? Even if clockspeed had to be reduced, it would still perform better than BD. Lets say we could only get 3.8 Ghz out of the architecture on 32nm with 8 cores. Would that not perform better than BD? Look what they did going from X4 to X6, the CPUs overclocked just as well, and still do, compared to recent quads. Would it be hard to prove that shrinking Thuban would have brought more performance per mm^2 over BD on 32nm? No, not at all. I believe the answer is quite clear in the first paragraph of this post.
11-23-2011, 10:52 AM
savantu

Yield and performance are different things. Having bad yields doesn't preclude working parts from operating at high frequency. The question is still open whether the uarch or the process is to blame for the high power consumption at high clocks. I'd say it is a bit of both, but the process isn't completely broken.
Llano isn't a good indicator since the GPU is causing all the issues apparently.
11-23-2011, 11:41 AM
-Boris-

Quote:

Originally Posted by Smartidiot89

So I take it you're still ignoring the facts that AMD have said openly that GlobalFoundries 32nm didn't reach AMD's expectations in both performance and that yields are bad? The last two quarterly calls they've talked about it with media and investors, they also issued a press release before their Q3 results saying projections for that quarter would be lower because of bad yields at their 32nm node.

Llano was also still projected to enter the market at 3,0+ GHz yet only retailed at 2,9 GHz, it was also supposed to have launched late-2010 and not mid-2011. Llano is also in extremely short supply both in the retail space, but also with OEM's. 32nm is horrid right now and facts are that AMD aren't happy with it.

There is no doubt 32nm "works" but it's still a dog with horrible yields, which needs to be fixed and is reflected upon in both of their 32nm products.

I don't ignore anything, but lower yields than expected is not the same thing as the finished chips performing worse than their 45nm counterparts. On the contrary, we have numbers showing that Llano consumes less power than Athlon II despite the GPU. Llano or BD would most likely not even be feasible on 45nm. So even if 32nm yields aren't where AMD wants them to be, I think it's safe to say that a 32nm Thuban would perform better than a 45nm Thuban. You are forgetting that the chips that currently have yield problems are record breakers when it comes to transistor count. It's not surprising that yields are bad so far. Yields would be better with smaller chips, so that's just another reason why a 32nm Thuban would be better off. You still can't blame GloFo for BD's shortcomings as some people do; the amount of speed needed to make BD competitive isn't possible on any process, especially not when taking thermals into account.

Even if 32nm isn't where AMD expected it's still most likely to give cooler chips and/or higher frequency headroom considering the evidence that we have.

Quote:

Originally Posted by Smartidiot89

Talking about Bulldozer, it is also very possible AMD are running specific functions at lower clocks, which can impact performance greatly.

If so, and if these functions cripple BD considerably, and they expect yields to improve over the next two years, allowing them to run these functions at full speed, shouldn't we expect a successor with radically improved performance? AMD's current projections aren't too promising.

So, still, nothing points to Thuban on 32nm performing worse than Thuban on 45nm. It should perform much better, and with Llano's IPC improvements you could call it a day. Thuban still has higher performance per mm² than BD taking processes into account, and that doesn't bode well for the future.
11-23-2011, 11:42 AM
-Boris-

Quote:

Originally Posted by BeepBeep2

Considering half the results lean heavily toward Thuban being the better architecture and the other results show FX matching Sandy Bridge (in MT performance only...losing up to 80% in single thread) using 25% more power to do it.

Facts on 32nm Thuban/Agena? Changed direction after being called out on it?
Llano's refined core was supposed to gain up to 5% IPC, correct?
Lets say we shrunk Thuban but used Llano's core...a 6 core would be 269mm^2 like I stated before, correct? Assuming that the 32nm process can produce chips that function at least as good as the 45nm, (or maybe something like the 90nm > 65nm transition was at least) we would have chips with a much smaller die and less power consumption than current BD, producing much more performance per mm^2 even if you ignore the power consumption. I didn't say "Add two cores for Phenom II X8 and set it at 4 Ghz" like informal thought I did. Anyway, the X6 CPU performs very close to BD in real world apps when both are overclocked to 4.2/4.8. Also, "STARS" is very bandwidth starved, the more you overclock ram and overclock CPUNB the better it performs, what if it had the type of bandwidth available that BD has? More IPC improvement.

The only comment I made about an eight core with the old uarch was that the die size would be around 330-340mm^2, only slightly bigger than BD is today (~5-10%). Anyway, who knows if they couldn't have added two more cores AND increased clock? Even if clockspeed had to be reduced, it would still perform better than BD. Lets say we could only get 3.8 Ghz out of the architecture on 32nm with 8 cores. Would that not perform better than BD? Look what they did going from X4 to X6, the CPUs overclocked just as well, and still do, compared to recent quads. Would it be hard to prove that shrinking Thuban would have brought more performance per mm^2 over BD on 32nm? No, not at all. I believe the answer is quite clear in the first paragraph of this post.

How do you calculate die size? 32nm Thuban would be much smaller, in theory as small as half the size of 45nm Thuban.
