http://www.youtube.com/watch?v=JeQ-wjDH4F4
Thanks for clarifying that, JF-AMD.
Any hint on the interconnect of the Bulldozer variation of Magny-Cours? :)
Thanks JF-AMD for clearing some of my questions!
Much appreciated.
Desktop die, server die, what is the difference? We start out with the same base design, for instance Shanghai was the same as Deneb (I think, again I don't know my desktop code names). The cores are identical.
The memory controllers are the same; you could support unbuffered or registered memory on the same controller (though not at the same time). It is the memory validation that drives the platform. Desktop would probably never validate for registered memory because a.) nobody would probably want that and b.) it eats up a lot of validation and support cycles (OEMs don't like it either because it causes more work for them). We don't validate for unbuffered memory even though we could support it, because nobody wants a server limited to 8GB of memory.
When it came to Istanbul, that was a server-only design. Desktop never asked for a version, so when we did the design, all considerations were for server. We added things like APML and HT Assist that desktop would never use. It was, and it still is, a server-only design. Lisbon is the same way.
Thuban, as I understand it, is probably based on the same general design, they probably took out server features and added desktop features.
Once you punch out a wafer of Istanbuls or Lisbons, you could put them in any package, but they would still be Istanbul or Lisbon. There is a feature set that makes them different from desktop. It's not a fusing recipe during APM that makes it a server die or a desktop die, it is the actual design.
JF: Are there already first samples of server Bulldozer? Thx.
http://translate.googleusercontent.c...1M32ea2kxmQ_kA
Very interesting read.... the approach is quite conventional "hehe get it"
Interesting read
Anyway, Bulldozer is confirmed to have lower single-threaded int performance compared to K10.
But multi-threaded and ILP apps would benefit a lot.
Hopefully AMD can fuse a GPU into the Bulldozer module concept. Throw the FPU load onto the GPU, so that AMD can spend the CPU transistors on more int performance.
Lol confirmed by whom?? Hiroshige Goto's ranting? Yeah right. :rolleyes:
You cannot have higher multithreaded performance (i.e. QC Orochi vs QC K10) if single-thread performance is lower in Orochi's case. It's simple logic at work. Even AT got 35% higher integer performance, on average, in the QC vs QC case (yeah, that's 2-module BD vs QC K10, btw). Add in that BD will come to desktop in a 4-module version, so that's 2x better than the 35%. Add in that it will definitely have better IPC per core and per clock than 10h (quote me on that, I'll admit I was wrong if not true), add in that it will have an aggressive Turbo mode, add in that it will have a 2x as powerful FPU thanks to the FMAC units inside the dual-thread SIMD unit, add in that it will work at a higher clock to begin with, and you end up at a much better place than the one Mr. Goto is portraying in his blog.
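To put rough numbers on the first two steps of that chain, here is a quick sketch in Python. The 1.35 factor is the AT figure quoted above; the idea that doubling the module count doubles throughput is an assumption (perfect scaling), and the `scaling_efficiency` knob and the 0.8 value are hypothetical, only there to show how the estimate moves if scaling is less than ideal. IPC, Turbo, FPU and clock effects are not modelled.

```python
def scaled_speedup(base_speedup, module_ratio, scaling_efficiency=1.0):
    """Estimate throughput versus the baseline chip after adding modules.

    base_speedup       -- measured speedup of the small part (e.g. 1.35)
    module_ratio       -- how many times more modules the big part has (e.g. 2)
    scaling_efficiency -- hypothetical fraction of the extra modules' ideal
                          throughput that is actually realised (1.0 = perfect)
    """
    extra = (module_ratio - 1.0) * scaling_efficiency
    return base_speedup * (1.0 + extra)


# 2-module BD at 1.35x a QC K10, doubled to 4 modules with perfect scaling.
ideal = scaled_speedup(1.35, 2)

# Same estimate with a hypothetical 80% scaling efficiency.
derated = scaled_speedup(1.35, 2, scaling_efficiency=0.8)

print(f"ideal: {ideal:.2f}x, derated: {derated:.2f}x")
```

With perfect scaling this comes out at about 2.7x a QC K10 on multithreaded integer work; anything less than perfect scaling pulls the number down accordingly.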
Ehh .. now you contradict yourself:
If I use your argument from the first paragraph then it would be:
The cores are identical - Lisbon or Thuban.
If I reuse arguments from the 2nd paragraph to Shanghai / Deneb:
Once you punch out a wafer of Shanghais, you could put them in any package, but they would still be Shanghais, not Denebs. There is a feature set that makes them different from desktop. It's not a fusing recipe during APM that makes it a server die or a desktop die, it is the actual design.
The only way to explain that is that you really mean Thuban and Istanbul / Lisbon will be based on different designs and different masks.
That would be the first time in AMD's whole history. I can hardly believe that ... but I will trust you on that until Thuban's presentation.
If there are still 4 Hypertransport links on the Thuban die, be prepared for my complaint :D
Have a nice Sunday
I don't see anything there.
If you refer to the 2-pipe design, that does not mean much. K10 has 3 pipes, yes, but its front end is inferior to Bulldozer's, i.e. K10's third pipe is seldom used.
If I have to choose between a 2-pipe design with an average load of 90% and a 3-pipe design with a useless 3rd pipe, I would choose the first one ;-)
It would save die space and power, too. Furthermore, due to simplicity of the design, it could increase clock headroom, too.
Actually I think that is BD's biggest advantage. Because the front end is shared, it can be more sophisticated and complex than usual.
That will result in better utilization of the back end.
So far - with the little information available - the design looks quite streamlined to me, I like it. It looks efficient and fast. I just wonder how much of the "information" will be true in the end.
I wouldn't necessarily say that the integer comment is correct. For instance, MC has 12 cores, Interlagos has 16 cores. 33% more cores but more than 33% greater performance. That sounds like faster to me.
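The arithmetic behind that can be sketched in a few lines of Python. The core counts (12 and 16) are from the post; the function name and the 50% chip-level gain are hypothetical, picked only to show what "more than 33%" implies. Note this is throughput per core with all cores loaded, which is not the same thing as single-threaded performance.

```python
def per_core_ratio(old_cores, new_cores, chip_gain):
    """Per-core throughput of the new chip relative to the old one.

    chip_gain is the whole-chip performance increase as a fraction
    (0.33 means 33% faster overall).
    """
    return (1.0 + chip_gain) / (new_cores / old_cores)


# 12 -> 16 cores is a 33.3% increase in core count.
core_increase = 16 / 12 - 1

# If the chip were exactly 33.3% faster, each core would be on par.
break_even = per_core_ratio(12, 16, core_increase)

# Hypothetical: a 50% chip-level gain would mean faster cores.
example = per_core_ratio(12, 16, 0.50)

print(f"break-even: {break_even:.3f}, at +50%: {example:.3f}")
```

Any chip-level gain above 33.3% gives a ratio above 1.0, i.e. more throughput per core than MC, which is the point being made.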
That is the whole idea of the architecture: flexibility for multiple designs. I'm not necessarily sure that the FPU goes by the wayside, at least not until the software situation changes. There will still be regular applications that need to access floating point for a cycle or two, and the FPU will be necessary for that. However, I would expect that, over time, a GPU can plug into the architecture the same way as a Bulldozer module.
JF: Is Bulldozer planned for many CPU generations? Does that mean about the next 3-4 years? First gen Zambezi, etc.?
Generally speaking we are 12 months from tapeout to final product. That is why commenting on things is very dangerous because it signals where we are in the process. That can have an impact (positively or negatively) on stock price, which is why I have to be careful.
When we give tapeout or sampling information we release through coordination with investor relations to make sure that we are in full compliance with the law.
Talking about BD tapeout or samples would be considered "material" and I could land in hot water (or legal issues) for making statements in public that were not cleared.