2.3GHz is available for the 12-core MCM Magny-Cours.
I don't think the next Magny-Cours is only 3x2 modules... it has already been said to have 33% more cores, so the next Magny-Cours (Interlagos) is 16 cores.
Why not? More logic than 10h, more cores, etc. Even with the shrink I doubt the 8-core desktop part will launch at more than 3.2GHz at first... maybe higher clocks with a later revision, but at release it won't go higher than that.
But it won't be the same architecture as Magny-Cours... so only a 200MHz bump to 2.5GHz for the high-end model wouldn't be hard to believe... maybe 2.6GHz without turbo. How the turbo will kick in on the server parts remains to be seen, but I believe that at stock, with no turbo, a 12-core 32nm part won't go past 2.6GHz at most, and the 16-core won't go much past 2.3GHz. If my assumption is wrong, then all the better.
AMD seems quite confident that Bulldozer can actually deliver, and I do hope they are right. They need to really compete in the high end against Intel.
I wonder, even if the top-end Bulldozer does actually perform faster than Intel's top end or close to it, will they price it at $1000 or more, like the Athlon FX?
I won't mind if the price isn't too high, seeing as we should still have plenty of "wiggle" room with the FSB (HTT) on the low-end chips, assuming they don't pull a Sandy Bridge stunt.
If it delivers, I'd expect both Intel and AMD to come down a bit on price.
He said that eight 10h cores, if done on 32nm, would be bigger than eight BD modules. The rest you can figure out for yourself.
bingo??:eek:
http://www.heise.de/ct/artikel/Proze...r-1064662.html
google translated:http://translate.googleusercontent.c...hbJHxdcLE54JUg
Quote:
With its eight modules - so, depending on your perspective, eight to 16 cores - the Bulldozer server chip Interlagos should achieve about 70 percent more integer performance (SPECint) than the 12-core Magny-Cours, which suggests the front end is not exactly "starving". Besides the big Interlagos, with up to 8 MB of L3 cache shared by all modules on the chip, AMD plans to release half-as-large chips for servers (Valencia) and high-end desktop PCs (Zambezi). For floating point, Interlagos offers a third fewer cores (FPUs) compared to Magny-Cours, but thanks to AVX and FMA and a better memory connection its SPECfp throughput should still be a third higher. These FPUs are not even connected to the small L1 cache. An L1 bypass for the FPUs is something Intel's Itanium also had, with little success - hopefully that is not a bad omen.
I meant an octocore MCM = 16 cores. I think that an octocore 10h at 32nm runs cooler than a hexacore 10h at 45nm. So I believe that even without a new architecture AMD would be able to make a 16-core MCM processor at 2.3GHz and higher.
I also believe that a hexacore 10h uses more transistors than a 4-module BD.
Why not? I'm not saying I would ever buy their products if they priced them that high, as I go with the best low-to-mid grade product on the market regardless of brand, but once again, why not?
If Intel can remove FSB overclocking and it's understood around here why they are doing it (for profit reasons, of course), then why shouldn't AMD be able to profit if they have a real winner?
Even so, if you remember back to Conroe pricing, Intel smashed AMD performance-wise but didn't really start to milk their products until now, unlike NVIDIA. They always had very powerful, very affordable products alongside the ridiculously overpriced Extreme Editions. It's really up to us to decide the pricing anyway, as the market always determines a product's value, not the seller (at least in the US).
Let's forget the server parts for a minute, shall we, and focus on desktop: 8 cores @ 3.2GHz at release, and later on maybe even higher without turbo boost.
And if you care so much about the server parts, then you'll see how wrong my statement was when JF-AMD makes an official statement about the clocks of the Bulldozer CPU lineup.
Well, because I wouldn't buy a $1000 GPU, let alone a $1000 CPU, and most people wouldn't. The market for that type of CPU is quite small, so if AMD goes that route I'll stick with what I have now.
If you're looking purely at desktop, do you really need 4+ modules? Honestly I'm more interested in seeing the 2-module version; I really couldn't care less about anything beyond 4 threads, as I just can't think of anything I do that needs the processing power.
It's the same thing I brought up in the Sandy Bridge thread: I'm most interested in seeing both how many additional multipliers you get beyond turbo and how the 2500K is priced. Quite frankly I don't need an i7, especially not the additional motherboard costs, so if I can get a quad-core (non-HT) Sandy Bridge that overclocks to 4.5GHz on air, then I'll be a really happy camper.
Same goes for AMD: if their 2-module processor is only 5% behind SB (unlikely, but we can hope) and it clocks to 5GHz on air, then that's what I'm getting. As I said before, in the end I couldn't really care less what the top-end stuff costs and how it performs, as I can't afford it anyway. If one company offers a significantly better-value product around the ~$200 level, then I'll be very content.
Well, two programs that I use scale well with more cores, so yes, I wouldn't mind having 8 cores. I could also reinstall my VMware Workstation and set up a virtual machine to try out Linux and do some other bits.
It's a strange one, isn't it? If AMD can't get close to SB in terms of out-and-out per-core performance, then they will need more GHz to get to that point.
You then hit the problem that a two-module chip delivers only about 80% of the throughput of four independent cores when all the cores on those modules are loaded, so you need even more GHz to match Intel (all estimates, of course).
So would the ideal solution, for a powerful desktop, be a 3-module system?
Me personally, I want 12, 16, or 32 cores; the more the better, since my workflow requires a lot of CGI rendering. But that's me.
Not really. IPC may be lower than SB, or not, but BD will be a higher core-count CPU than SB. Against the 8-10 core SB versions you will have 12-16 core BD parts. Probably in future BD CPUs we will get 12-16 cores for desktop as well, if they need it.
So in multi-threaded scenarios BD will be really good. In single-threaded scenarios the penalty you mention doesn't apply, since only one core in a module is used, not both. The penalty applies only when both of them are working, and then they run at 90%, not 80%: a module delivers about 180% of what two independent BD cores (200%) would.
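To make that module math concrete, here's a tiny sketch of the scaling argument (the 90%/180% figure is the assumption from the post above, not an official AMD number):

```python
# Tiny sketch of the shared-module scaling argument above.
# Assumption: one BD core delivers 1.0 units of throughput when the other
# core in its module is idle, and ~0.9 units per core when both cores are
# loaded (the 90% figure from the post, not an AMD spec).

def module_throughput(active_cores, per_core=1.0, shared_factor=0.9):
    """Throughput of one two-core module with 0, 1 or 2 active cores."""
    if active_cores <= 1:
        return active_cores * per_core        # single thread: no penalty
    return 2 * per_core * shared_factor       # both cores contend: ~180%

print(module_throughput(1))   # 1.0 -> same as a standalone core
print(module_throughput(2))   # 1.8 -> vs. 2.0 for two fully independent cores
```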
That's entirely my point. No one's forcing you to buy their high end product nor should they price their top end product the same as now if it does compete with Intel's. The only reason why their products are so cheap right now is because the performance isn't there. If it performs accordingly then they would be stupid not to raise the prices. As usual, there will always be good midrange products, probably even affordable 4+ module processors, just not the high end - which is perfectly fine.
It's like Movieman said about SB and overclocking: nowhere in any law does it state that they have to let us overclock via the FSB, or even at all. Sure, it was nice to get high-end performance for 1/6th the price, but their goal, as with any company's, is to make money. Same goes for AMD: nowhere does it state they have to sell their top processors at a loss. So instead of asking for what you will likely never receive, just take what the market gives you, and that will be a very powerful midrange product for $200-300.
Yep, it will be expensive for Intel to compete with BD in multithreaded workloads if the BD performance is as high as we are led to believe. But it will be a good old scenario all over again: release 8+ SB cores when they are ready, and as soon as the BD benchmarks come out, Anand will have a benchmark of Ivy Bridge 16 core/32 thread with 4 graphics cores and whatnot.
Who would buy BD with the knowledge that something quicker is coming 2 months down the road?
If AMD has an edge in performance with BD it needs it on the market ASAP.
If they release 6-8 core SB in Q3 2011, there's no way that high-end Ivy Bridge (10-12-16 cores) will come out 2-3 months later. Only low-end/mainstream Ivy Bridge will come at the end of 2011 or the start of 2012.
you're still completely missing the point.
The market dictates what you get for $450, not the consumer nor the producer. In the end, regardless of whether it's "high end" or "midrange", what you buy for $450 will be roughly equivalent on both sides, and even if AMD's "high end" is $300 and Intel's is $1000, the Intel part is certain to be the better product. Same goes if AMD held the $1000 product and Intel the $350 one: you get what you pay for.
In the end if you're willing to pay $450, then pay $450. If there is a $1000 product from the same company, you can bet your ass it's a better product. No company in their right mind would price their product at $450 if it was deemed to be worth $1000 by the market.
Honestly, who cares if it's high end or not? If you gave me the choice between an i7 970 and an AMD 1090T, I would take the i7 hands down every time, even though it's not the top-end product and the 1090T is, because the performance lies with the i7. Same goes for graphics cards and so on and so forth. You pay for performance, not for the name (or at least you should, otherwise you're getting ripped off every time).
Yes, I understand your point... you're the guy who couldn't care less about high-end CPUs because he doesn't have a use for them, so he doesn't deem them necessary for others either and couldn't care less if they're priced at $1000, and to that person a $1000 CPU means high end because the market deems it so. And guess what? These companies could get more sales and make more profit per wafer if they were willing to cut their profit margins... selling more at a lower price = more sales, which in the end = more exposure for the company, etc. But yes, people on XS feel more powerful when they have a $1000 CPU... Oo
My reply was in regard to home computing (higher-end desktops). In that instance you're looking for more speed and fewer cores. The problem is that BD's strength is in its cores, whereas it looks like Intel's will be in its speed.
There comes a point in home desktops where more cores are just wasted space, hence why I'm asking whether a 3-module part will be the sweet spot for AMD.
Following my previous post, I wanted to see how an 8-core Bulldozer might stack up against an 8-core Sandy Bridge. I looked on the net for average performance figures for Thuban and Westmere and ended up at the BeHardware website. It's not the most accurate list nor the best test selection, but it's a start.
Westmere @ 3.33GHz scores 221 "points" on their scale; the X6 @ 3.2GHz scores 160.
Now, for the impact of adding cores, I looked at their conclusions for the Nehalem->Westmere and Deneb->Thuban transitions:
As can be seen, with 50% more cores at the same clock, Westmere is ~16% faster than Nehalem in client workloads, while Thuban gets a ~24% performance increase. Quote:
During our test of the Core i7-980X, we noted a gain of 16% on the i7-975, and concluded that the Phenom II would benefit more by going up from 4 to 6 cores because of the absence of Hyperthreading. A quad-core Intel with HT already makes 8 logical cores available to applications and this meant that the potential gain would be lower.
This is verified in practice as, at equal clocks, the average gain between the Phenom II X4 955 and the Phenom II X6 1090T is 24.7% (23.9% without Turbo CORE). The performance gain on the Phenom II X4 965 is a notable 18.2%.
Now, with SB we will have 33% more cores, and the impact according to the quote above should be ~10% (50% more cores : 16% perf. increase = 33% more cores : x => x ≈ 10%). Add to that the IPC jump with SB of ~15% (average) and we get: 221 x 1.1 x 1.15 = 279 "points". Since Westmere and SB are built on the same 32nm node, I suspect Intel won't hit 3.33GHz for a chip with 33% more cores and an IPC increase while staying in the 130W bracket. So let's assume they have to cut the clocks by 10% (a bit generous), and we get a 3GHz 8-core Sandy Bridge with approximately 279/1.11 ≈ 251 "points". Turbo is already counted in the scores, since Westmere (the baseline for these results) has a Turbo mode up to 3.6GHz. But let's assume a better Turbo in SB adds another 3% on top of the score above: 251 x 1.03 ≈ 258 "points".
On to the 8-core Bulldozer. I've done some speculative calculations and already have some numbers for BD; to cut the story short, a 4GHz 125/140W model with Turbo up to 4.5GHz and a 10-15% IPC jump. Following BeHardware's Deneb->Thuban jump (due to the sheer core/thread uptick, no IPC change) we would have: 33% more cores : x% perf. increase = 50% more cores : 24% perf. increase => x = 18%. The clock difference between 4GHz and 3.2GHz (for Thuban in BeHardware's chart) is 25%. For the IPC difference I would pick 12.5% (the arithmetic mean of 10-15%). The module scaling hit is 10%. Turbo affects the score by 3% due to the higher Turbo on BD vs Thuban, the same as for SB's better Turbo. All summed up: 160 x 1.18 x 1.25 x 1.125 x 0.9 x 1.03 = 246 "points". Thuban @ 3.2GHz scores 160 "points". This correlates well with the 4.7-5.4GHz X6 Thuban performance range I speculated before (~5GHz/3.2GHz = 1.56x ≈ 246/160).
An X8 BD @ 4GHz with the new Turbo should be generally comparable (a ~5% difference), performance-wise, to a 3GHz 8-core Sandy Bridge with its new Turbo. Die sizes will be different, though: I expect the 8-core Sandy Bridge to be a noticeably larger chip (300-333mm^2), while for the BD X8 I expect 200-220mm^2.
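For anyone who wants to play with the assumptions, the whole back-of-the-envelope estimate above boils down to a few multiplications; here it is as a small Python sketch (every factor is a speculative input from the posts above, not a measurement):

```python
# Back-of-the-envelope reproduction of the BeHardware-based estimates above.
# Every scaling factor is a speculative assumption from the post, not data.

WESTMERE_6C_333 = 221   # BeHardware "points", i7-980X @ 3.33GHz
THUBAN_6C_32    = 160   # BeHardware "points", X6 @ 3.2GHz

# 8-core Sandy Bridge estimate
sb = WESTMERE_6C_333
sb *= 1.10   # +33% cores, scaled from the "50% more cores -> +16%" observation
sb *= 1.15   # assumed ~15% average IPC gain over Westmere
sb /= 1.11   # assumed ~10% clock cut (3.33GHz -> ~3GHz) to stay in 130W
sb *= 1.03   # assumed slightly better Turbo
print(f"8-core SB estimate: {sb:.0f} points")   # ~259 (the post rounds to 258)

# 8-core (4-module) Bulldozer estimate
bd = THUBAN_6C_32
bd *= 1.18   # +33% cores, scaled from Thuban's "50% more cores -> +24%"
bd *= 1.25   # assumed 4GHz vs Thuban's 3.2GHz
bd *= 1.125  # assumed 10-15% IPC gain (midpoint)
bd *= 0.90   # assumed shared-module scaling hit
bd *= 1.03   # assumed better Turbo
print(f"8-core BD estimate: {bd:.0f} points")   # ~246
```

Change any single factor (the clock cut, the IPC guess, the module scaling hit) and the two chips can easily swap places, which is why these numbers should be read as ballpark figures only.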
Even though we used different approximations, you, poper, dresdenboy and I arrived at about the same performance, give or take a few percent. Hmm....
I don't think that's the case, as you said people buy the high end products primarily to feel more powerful. That means that it has a set market to it, and more than likely even if they priced it at $500, 99% of the people still wouldn't buy it. Those people would just save their money and get the cheaper product that offers just about identical performance as they don't overclock, or if they do they don't have the means to actually capitalize on the extreme edition's additional capabilities (such as cooling).
I'm certain that these companies all have people with business degrees from the best schools in the world designing their price ranges. There is science behind their so-called madness.
Anyway... can you please wish for us that AMD doesn't sell its high-end product at more than $450 if they win the performance crown, or even come close to it...
omg .. that die shot looks so beautiful :D
The top half looks like a GPU with cache, but Orochi was supposed to be a 4-module 8-core without a GPU :confused:
The picture is taken from a high angle, so there's a lot of distortion.
I don't think it is distorted. Plus, the two left modules are completely asymmetrical with the right ones O_o. Also, the top and bottom modules are of different sizes. To tell you the truth, all the modules are different! WTF!
If they are the same, why are the top ones clearly much larger than the bottom ones? They should be smaller given that they are further away in the picture! :eek:
Slightly better angle...
http://a.imageshack.us/img210/148/orochidieshot1.jpg
So this is the 8 core (4 module) BD fusion chip?
http://www.techeye.net/chips/amd-sho...ks-more-fusion
Nope, I took that quote from the article:
The difference is less than 1%. Quote:
This is verified in practice as, at equal clocks, the average gain between the Phenom II X4 955 and the Phenom II X6 1090T is 24.7% (23.9% without Turbo CORE). The performance gain on the Phenom II X4 965 is a notable 18.2%.
I said the same, not the same size. The top ones look like they contain the memory controllers and the bottom ones connect to them; I could easily be wrong, but so what? The die shot looks like it's mirrored left/right rather than top/bottom. Half of the processor would be the left or right side, not the top or bottom.
First thoughts on the layout. :)
EDIT:
The orange field should be L1 cache. L2, L3 and the other stuff were so obvious that I didn't mark them. ;)
:banana::banana::banana::banana:, the green field should say "Prefetch!" ;)
The top 2 modules are a bit larger indeed.
All the differences in the chip noted above are easily doable with solid knowledge of certain imaging software :), and that's what AMD did.
Better picture page 20 : click
There you can clearly see that they are different sizes.
8-core Orochi die. Heavily photoshopped. We don't release the final shots until launch.
I hope Hans will enlighten us soon :)
Until then, my wild speculation about the different core sizes:
- (real) 256-bit FPUs plus smaller but faster, leakier 6T L2 cells on the top
- 128-bit FPUs with bigger but less leaky 8T L2 cells on the bottom.
But this is just my amateurish guess .. ;-)
Also, I just read a paper which stated that 8T cells are smaller than 6T cells at 32nm :shrug:
I would say the FPU is in the middle of your two orange parts. Otherwise the FMAC part at the bottom is, IMO, too small... that should be two 128-bit FMAC units, and they have to be bigger than a single INT cluster/core.
@JF:
Of course it is photoshopped, but you wouldn't change the module sizes, would you?
^^ They would :D
OK, maybe some scaling... but the top cores are not just scaled, there is definitely more in them. Thus, if that was added purely in Photoshop, then either it is not an Orochi die, or Zambezi/Valencia are not based on that Orochi die, as I thought earlier.
Then Orochi would maybe just be some kind of strange prototype.
JF stated 2MB L2 in the first version of his latest blog entry; this, along with the Open64 compiler info, leads me to believe that it really is 2MB.
Lol, why would there be a GPU?
Noob statement/question warning
Would it be completely out of the picture for them to start integrating some parts of a "GPU" onto Bulldozer, not for VGA output but for the benefits of (per my possibly mistaken understanding) improved FPU calculation speed, or whatever the "fusion" of GPU and CPU is supposed to bring? :shrug:
Programs with ATI Stream support would benefit... like PowerDVD 10, for example.
http://www.cyberlink.com/products/po...iew_en_US.html
But there is no point in wasting die space for such a limited benefit.
There is no GPU in Bulldozer at this point. That could always happen in the future but not for the products that we are launching in 2011.
I suppose that the successor to Llano might have BD modules instead of K10-derived cores.
John, I know you don't really comment on desktop hardware but can you confirm whether the desktop parts will be almost a year after server CPUs?
To JF: is there any chance of a 6-module Bulldozer die at 32nm in the future?
I don't care about the FUD, I care that it's time to upgrade my 965BE and I can't wait for BD's release :D I'm almost tempted to go for server hardware to satisfy myself :p:
That was a joke.
Well, it really didn't look like one to me. Use smileys next time :)
Just a quick dirty question now: when is BD on 22nm coming? Any guesses?
Well, the beginning or the end of it? Because that would mean only one year with BD on 32nm.
beginning of 2013
I know, but maybe this strategy is going to change. The desktop steppings may contain some errata which the server steppings may not, and the desktop segment is more tolerant of errata. Just think back to Nehalem's launch: the C0 (first desktop) stepping came months earlier than the D0 (first server) stepping because C0 contained errata that are not acceptable in a server CPU (like the TLB bug). So if there is an early stepping that is suitable for desktop, then my speculation could come true.
Why not? The 65nm K10 lived 1 year and 2 months.
AMD should use a 1206-pin socket and align pin 1 on the AM3 CPU with pin 1 on the BD socket; that way you have 265 spare pins for BD. :up:
Socket AM34, anyone? ;)
Use some of them so Zambezi works in 2P :D
After thinking about nn_step's comment I think he is mostly right.
I can't say what the IPC of code is like on average. But my own code doesn't often do a lot of linear arithmetic. I can usually break the computational work into multiple threads and/or send it to the FPU instead. But the bulk of my code tends to be branchy. With the new branch fusion, branch prediction, and prediction steered data prefetch this type of code could see a significant improvement on BD compared to K10.
But despite the gains or losses in IPC for specific types of code, it seems to me that increasing IPC for average code is going to hit diminishing returns in the coming years: first because there is only so much more ILP left to extract from code, and second because Amdahl's law applies just as much to instruction-level parallelism as it does to thread-level parallelism. Also, any new program doing a major amount of computation will usually be multi-threaded. Increasingly, as programmers learn to write threaded programs better and take care of higher-level parallelism themselves, getting more IPC out of each individual thread may be rapidly approaching a wall.
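To put a rough number on the Amdahl's-law point, here's a minimal sketch with made-up fractions (purely illustrative, not measured from any real workload):

```python
# Amdahl's law applied to instruction-level parallelism: if only a fraction p
# of a thread's work can be accelerated (e.g. issued in parallel) by a factor
# s, the overall speedup is bounded no matter how large s gets.
# The 0.6 fraction below is purely illustrative.

def amdahl_speedup(p, s):
    """Overall speedup when fraction p of the work is sped up by factor s."""
    return 1.0 / ((1.0 - p) + p / s)

print(amdahl_speedup(0.6, 4))     # ~1.82x if 60% of the work runs 4x faster
print(amdahl_speedup(0.6, 1e9))   # ~2.5x hard ceiling: 1 / (1 - 0.6)
```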
We also seem to be forgetting the case of program level parallelism. When I run a program and it's not taking up all my cores or not utilizing them fully, I run more programs. That's exactly why I have run MP machines since the Pentium Pro days and why I have a multicore processor now. Multitasking has been around for a long time. But because it's hard to quantify as a benchmark people tend to compare performance only in the single instance case.
I believe the answer will depend on what you are using your high end desktop for. Not everyone uses a high end machine for the same purpose and which architecture and core count is right for you depends on which applications you use the most.
It may be the case that for office users or gamers a high-clocked 2-module BD or 2-core SB is the sweet spot, while if you do lots of encoding, crunching, or media creation, a 4-module BD or 8-core SB might be the sweet spot. Of course we will have to wait for benchmarks to know for sure.
I think you could instead buy an octo-core MC and a dual-socket mobo... this will be somewhat future-proof, as after BD comes out you should be able to run it with a simple BIOS update, which is not something you could say about Intel platforms (both server and desktop) or AMD's desktop offerings at this point in time...
Edit: Makes for an incredible value proposition too... what with the cheapest MC coming in at approx. $290 and a dual-socket board for about $450.
FYR...
Mobo... http://www.newegg.com/Product/Produc...82E16813131643
CPU... http://www.newegg.com/Product/Produc...82E16819105266
The big problem with AMD's dual server boards is that they are not overclockable! :mad::(
If the MC dual boards ever came out as an overclockable workstation board, then I would buy two or three of them! :cool::up:
If MC or BD CPUs came out unlocked (like Intel's), running at around 3.0GHz, then I would buy four to six CPUs! :cool::up:
If AMD released BD as a Phenom FX CPU for G34 or whatever socket comes out, then I would buy four to six of those CPUs! :cool::up:
Who really needs to OC a dual-socket board? Seriously... besides for e-peen value.
I mean the processing power :)
< This guy. When I was running a dual Istanbul setup between last July and earlier this year, I really wanted to overclock it. I did, to a limited degree, on my first (nVidia) motherboard.
Why? Because I do some of everything. When I'm compiling FPGA programming files during development, that's unfortunately a single-threaded process that takes a long time. At 2.2 GHz, you're looking at a minute and a half to two minutes every time you want to test a configuration. It's not like Visual Studio where you hit F5 to check your changes and the thing instantly starts up. When I'm relaxing by playing games, that's mildly threaded. Most games don't do their best with more cores and lower clock frequencies, though a few like Bioshock do fine. For the most part though, my gaming experience really tanked when I went from a Phenom II X4 940 @ 3.6 GHz to the dual Istanbul @ 2.2-2.5 GHz. And finally I do some video work which tends to take advantage of more cores quite nicely... sometimes.
It's never fast enough though. A Thuban overclocked to 4GHz, which isn't terribly difficult, can turn in throughput very close to what I used to get out of my dual Istanbul station. That was the point I decided to switch. If I had been able to overclock that platform, the question of changing or not wouldn't have been so easy to answer.