AMD cuts to the core with 'Bulldozer' Opterons



02-05-2010, 10:13 AM
god_43

http://www.youtube.com/watch?v=JeQ-wjDH4F4
02-05-2010, 10:45 AM
doompc

Thanks for clarifying that, JF-AMD.
Any hint on the interconnect of the Bulldozer variation of Magny-Cours? :)
02-05-2010, 11:08 AM
Manicdan

Quote:

Originally Posted by god_43

http://www.youtube.com/watch?v=JeQ-wjDH4F4

great band, awesome song, brings back a memory of trying to listen to that cd in a portable cd player before any skip protection existed.
02-05-2010, 12:22 PM
Lightman

Thanks JF-AMD for clearing some of my questions!

Much appreciated.
02-05-2010, 07:10 PM
JF-AMD

Quote:

Originally Posted by doompc

Thanks for clarifying that, JF-AMD.
Any hint on the interconnect of the Bulldozer variation of Magny-Cours? :)

Interconnect will remain HyperTransport.
02-05-2010, 07:19 PM
JF-AMD

Desktop die, server die, what is the difference? We start out with the same base design, for instance Shanghai was the same as Deneb (I think, again I don't know my desktop code names). The cores are identical.

The memory controllers are the same; you could support unbuffered or registered memory on the same controller (though not at the same time). It is the memory validation that drives the platform. Desktop would probably never validate for registered memory because a.) nobody would probably want that and b.) it eats up a lot of validation and support cycles (OEMs don't like it either because it causes more work for them). We don't validate for unbuffered memory, even though we could support it, because nobody wants a server limited to 8GB of memory.

When it came to Istanbul, that was a server only design. Desktop never asked for a version, so when we did the design, all considerations were for server. We added things like APML and HT Assist that desktop would never use. It was, and it still is, a server-only design. Lisbon is the same way.

Thuban, as I understand it, is probably based on the same general design, they probably took out server features and added desktop features.

Once you punch out a wafer of Istanbuls or Lisbons, you could put them in any package, but they would still be Istanbul or Lisbon. There is a feature set that makes them different from desktop. It's not a fusing recipe during APM that makes it a server die or a desktop die, it is the actual design.
02-05-2010, 07:20 PM
JF-AMD

Quote:

Originally Posted by Manicdan

great band, awesome song, brings back a memory of trying to listen to that cd in a portable cd player before any skip protection existed.

Where is the love for The Replacements??? THAT is rock and roll.
02-06-2010, 03:53 AM
FlanK3r

JF: Do first samples of server Bulldozer already exist? Thx.
02-06-2010, 07:30 AM
JF-AMD

Quote:

Originally Posted by FlanK3r

JF: Do first samples of server Bulldozer already exist? Thx.

Sorry, can't comment on that.
02-06-2010, 07:36 AM
freeloader

Quote:

Originally Posted by JF-AMD

Sorry, can't comment on that.

That means yes. :D
02-06-2010, 07:44 AM
ajaidev

http://translate.googleusercontent.c...1M32ea2kxmQ_kA

Very interesting read.... the approach is quite conventional "hehe get it"
02-06-2010, 08:22 AM
Lightman

Quote:

Originally Posted by freeloader

That means yes. :D

Especially if you consider that a couple of months ago his reply to the same question was a simple NO!

I agree with your conclusion :D
02-06-2010, 08:58 AM
haylui

Quote:

Originally Posted by ajaidev

http://translate.googleusercontent.c...1M32ea2kxmQ_kA

Very interesting read.... the approach is quite conventional "hehe get it"

Interesting read
Anyway, BullDozer is confirmed to have lower single threaded int performance compared to K10.
But multi-threaded and ILP-heavy apps would benefit greatly.
Hopefully AMD could fuse a GPU into the BullDozer module concept. Offload the FPU work to the GPU, so that AMD could spend the CPU transistors on more int performance.
02-06-2010, 09:54 AM
FlanK3r

Quote:

Originally Posted by JF-AMD

Sorry, can't comment on that.

OK, I understand... I only asked because partners will get samples this year (from AMD financial day info).
02-06-2010, 02:52 PM
informal

Quote:

Originally Posted by haylui

Interesting read
Anyway, BullDozer is confirmed to have lower single threaded int performance compared to K10.
But multi-threaded and ILP-heavy apps would benefit greatly.
Hopefully AMD could fuse a GPU into the BullDozer module concept. Offload the FPU work to the GPU, so that AMD could spend the CPU transistors on more int performance.

Lol confirmed by whom?? Hiroshige Goto's ranting? Yeah right. :rolleyes:
You cannot have higher multithreaded performance (i.e. QC Orochi vs. QC K10) if single-thread performance is lower in Orochi's case. It's simple logic at work. Even AT got 35% higher integer performance, on average, in the QC vs. QC case (yeah, that's 2-module BD vs. QC K10, btw). Add in that BD will come to desktop in a 4-module version, so that's 2x better than the 35%. Add in that it will definitely have better IPC per core and per clock than 10h (quote me on that, I'll admit I was wrong if it's not true), add in that it will have an aggressive Turbo mode, add in that it will have a 2x as powerful FPU thanks to the FMAC units inside the dual-thread SIMD unit, add in that it will work at a higher clock to begin with, and you end up at a much better place than the one Mr. Goto is portraying in his blog.
02-06-2010, 03:57 PM
Opteron146

Ehh .. now you contradict yourself:

Quote:

Originally Posted by JF-AMD

Desktop die, server die, what is the difference? We start out with the same base design, for instance Shanghai was the same as Deneb (I think, again I don't know my desktop code names). The cores are identical.
(...)
Once you punch out a wafer of Istanbuls or Lisbons, you could put them in any package, but they would still be Istanbul or Lisbon. There is a feature set that makes them different from desktop. It's not a fusing recipe during APM that makes it a server die or a desktop die, it is the actual design.

If I use your argument from the first paragraph then it would be:
The cores are identical - Lisbon or Thuban.

If I reuse arguments from the 2nd paragraph to Shanghai / Deneb:
Once you punch out a wafer of Shanghais, you could put them in any package, but they would still be Shanghais, not Denebs. There is a feature set that makes them different from desktop. It's not a fusing recipe during APM that makes it a server die or a desktop die, it is the actual design.

The only way to explain that is that you really mean Thuban and Istanbul / Lisbon will be based on different designs and different masks.

That would be the first time in the whole of AMD's history. I can hardly believe that ... but I will trust you on that until Thuban's presentation.

If there are still 4 Hypertransport links on the Thuban die, be prepared for my complaint :D

Have a nice Sunday
02-06-2010, 04:11 PM
Opteron146

Quote:

Originally Posted by haylui

Anyway, BullDozer is confirmed to have lower single threaded int performance compared to K10.

I don't see anything there.
If you are referring to the 2-pipe design, that does not mean much. K10 has 3 pipes, yes, but its front end is inferior to Bulldozer's, i.e. K10's 3 pipes are seldom used.

If I have to choose between a 2-pipe design with an average load of 90% and a 3-pipe design with a useless 3rd pipe, I would choose the first one ;-)
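
A rough sketch of the arithmetic behind that comparison, in Python. Only the 2 pipes, the 3 pipes and the 90% load come from the post above; the function name and the simple "pipes times average utilization" model are assumptions made for illustration, not a description of how either core actually schedules work.

```python
def sustained_ops_per_cycle(pipes: int, avg_utilization: float) -> float:
    """Toy model (assumption): sustained integer ops per cycle is simply
    the number of pipes times their average utilization."""
    return pipes * avg_utilization


# 2-pipe design at the 90% average load quoted in the post.
two_pipe = sustained_ops_per_cycle(2, 0.90)

# Average utilization a 3-pipe design would need across all three pipes
# just to match that figure.
break_even_utilization = two_pipe / 3

print(f"2 pipes at 90%: {two_pipe:.2f} ops/cycle")
print(f"3 pipes need {break_even_utilization:.0%} average utilization to match")
```

Under this toy model, a 3-pipe design whose third pipe sits idle can never exceed 2 ops per cycle, so the narrower design gives up little as long as its front end keeps both pipes busy.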

It would save die space and power, too. Furthermore, due to simplicity of the design, it could increase clock headroom, too.

Actually I think that is BD's biggest advantage. Because of the shared Front-End, it could be more sophisticated and complex than usual.

That will yield better utilization of the back-end.

So far - with the little available information - the design looks quite streamlined to me, I like it. It looks efficient and fast. I just wonder how much of the "information" will be true in the end.
02-06-2010, 05:49 PM
JF-AMD

Quote:

Originally Posted by freeloader

That means yes. :D

Not necessarily. It means there are some aspects of the business that I can't comment on.
02-06-2010, 05:59 PM
JF-AMD

Quote:

Originally Posted by haylui

Interesting read
Anyway, BullDozer is confirmed to have lower single threaded int performance compared to K10.
But multi-threaded and ILP-heavy apps would benefit greatly.
Hopefully AMD could fuse a GPU into the BullDozer module concept. Offload the FPU work to the GPU, so that AMD could spend the CPU transistors on more int performance.

I wouldn't necessarily say that the integer comment is correct. For instance, MC has 12 cores, Interlagos has 16 cores. 33% more cores but more than 33% greater performance. That sounds like faster to me.
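
To make the arithmetic in that point concrete, here is a minimal Python sketch. The 12 and 16 core counts are from the post; the 50% performance gain is a made-up placeholder (no actual figure is given in this thread), and `per_core_ratio` is a hypothetical helper name. It also measures per-core throughput with all cores loaded, which is not quite the same thing as single-threaded performance.

```python
def per_core_ratio(old_cores: int, new_cores: int, perf_gain: float) -> float:
    """Per-core throughput of the new part relative to the old one.

    perf_gain is the total throughput gain as a fraction (0.50 means +50%).
    A result above 1.0 means each core is doing more work than before.
    """
    core_ratio = new_cores / old_cores
    return (1.0 + perf_gain) / core_ratio


core_growth = 16 / 12 - 1  # 12 cores -> 16 cores

# Placeholder gain for illustration only; NOT an AMD figure.
example_gain = 0.50

print(f"core growth: {core_growth:.1%}")
print(f"per-core ratio at +{example_gain:.0%}: "
      f"{per_core_ratio(12, 16, example_gain):.3f}")
```

Any total gain above one third yields a ratio above 1.0, which is the sense in which "more than 33% greater performance" from 33% more cores implies faster cores.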

That is the whole idea of the architecture, flexibility for multiple designs. Not necessarily sure that the FPU goes by the wayside, at least not until the software situation changes. There will still be regular applications that need to access floating point for a cycle or two, and FPU will be necessary for that. However, I would expect, that over time, a GPU can plug into the architecture the same way as a bulldozer module.
02-06-2010, 07:34 PM
freeloader

Quote:

Originally Posted by JF-AMD

Not necessarily. It means there are some aspects of the business that I can't comment on.

I would find it hard to believe that you would not have up and running silicon by now, considering BD is about one year away. :) I know you can't comment about it, so I'm being an optimist and filling in between the lines for myself. :D
02-07-2010, 01:11 AM
FlanK3r

JF: Is Bulldozer planned for many CPU generations? Does that mean about the next 3-4 years? First gen. Zambezi etc. etc...?
02-07-2010, 06:43 AM
JF-AMD

Quote:

Originally Posted by freeloader

I would find it hard to believe that you would not have up and running silicon by now, considering BD is about one year away. :) I know you can't comment about it, so I'm being an optimist and filling in between the lines for myself. :D

Generally speaking we are 12 months from tapeout to final product. That is why commenting on things is very dangerous because it signals where we are in the process. That can have an impact (positively or negatively) on stock price, which is why I have to be careful.

When we give tapeout or sampling information we release through coordination with investor relations to make sure that we are in full compliance with the law.

Talking about BD tapeout or samples would be considered "material" and I could land in hot water (or legal issues) for making statements in public that were not cleared.
02-07-2010, 06:44 AM
JF-AMD

Quote:

Originally Posted by FlanK3r

JF: Is Bulldozer planned for many CPU generations? Does that mean about the next 3-4 years? First gen. Zambezi etc. etc...?

There will be multiple generations of the processor architecture. I can't speak beyond what has been shown in the roadmaps to the press and analysts.
02-07-2010, 04:46 PM
haylui

Quote:

Originally Posted by JF-AMD

I wouldn't necessarily say that the integer comment is correct. For instance, MC has 12 cores, Interlagos has 16 cores. 33% more cores but more than 33% greater performance. That sounds like faster to me.

That is the whole idea of the architecture, flexibility for multiple designs. Not necessarily sure that the FPU goes by the wayside, at least not until the software situation changes. There will still be regular applications that need to access floating point for a cycle or two, and FPU will be necessary for that. However, I would expect, that over time, a GPU can plug into the architecture the same way as a bulldozer module.

Hopefully AMD could take back the performance crown from Intel with BullDozer.
02-07-2010, 06:05 PM
freeloader

Quote:

Originally Posted by haylui

Hopefully AMD could take back the performance crown from Intel with BullDozer.

Would definitely be nice for AMD's gross margins. :) More money for R&D.
