http://www.xbitlabs.com/news/cpu/dis...tml#discussion
I wonder what they have up their sleeve.
I like AMD, I prefer AMD, but they are emphasizing multithreaded improvements a little too much.
I really hope they got something much better in ops/clock/core performance coming.
august... that's still quite a ways away. after they divulge info about it, then it'll still be some time before the launch. gosh i really hate playing the waiting game lol
So this is just an announcement of the announcement? :p
We have said many times that single threaded performance will be higher than current systems. Don't expect benchmarks until launch.
But as we all move forward into 2011 and beyond, single threaded performance will become less relevant as platforms get higher core counts and applications are written to expect 4+ cores as a minimum.
With both intel and AMD driving to higher core counts, expect software developers to find more ways to take advantage of those resources and expect them to rely less on clock speed. That trend has been happening already and it will only increase over time.
The problem that AMD is facing currently is not how many cores they can squeeze onto a chip, but how much performance they can get out of those cores. I mean it's taken 6 cores @ 3.2Ghz to match Intel's i7 w/ 4 cores/8 threads @ 2.6-2.8Ghz. That's rough.
Of course we want more cores, but there's got to be some processing power behind each of those cores, and intel is running away with it in brute force. Hopefully AMD's bulldozer will exceed i7 performance per clock. If it doesn't, I find it hard to believe that it will be fully competitive in its first generation. AMD will be playing the pricing game again instead of the performance game.
It's a good thing to be an enthusiast, but at some point you have to realize that most people won't feel the difference in their normal everyday tasks between a quad-core processor from Intel or AMD.
There is no doubt that Zambezi will outperform Phenom II in singlethreaded applications, but will it be competitive? Higher performance than current processors from AMD is not the same thing as higher performance than 2011 Intel processors.
And you are a server dude, we aren't. Even if games can use up to four cores, or even more, it's the singlethreaded performance that is the most important part. And that's how it's going to be for a long time.
How's Thuban 1090T better in price/performance when it loses in the majority of tests to a low-end Core i7? You can always choose some highly multithreaded apps (that the average user has no idea about, e.g. Cinema studio, POV-RAY, etc.) to try to skew your argument, but higher IPC trumps frequency/cores in every other scenario, including gaming, encoding, and even in some multithreading scenarios. The all-round better chip is clearly the budget i7, with its robust power-saving features. Thuban is only useful for some specific apps, and even there the difference to the budget i7 is negligible.
No, it's not an announcement.
They've just shown up in this program, that's all. http://www.hotchips.org/program/conference-day-two/
lol, man, I've read all the Thuban reviews, I think (I read about 40 reviews for every desktop CPU product). Some reviews were better, some were very bad (choice of test method). We're talking about Thuban vs Bloomfield. Bloomfield has much higher power consumption. Thuban is even a bit lower there than the Phenom quad-cores! And this is impressive. See, for example, the practical review at lostcircuits.
Most users don't buy hexacores for the internet ;-). This segment is for overclockers, for enthusiasts and for people working in 3D/video. It's sooo simple.
I guess things work differently in AMD dream land.
http://www.anandtech.com/bench/Product/146?vs=47
And you keep laughing but it takes AMD six real cores to match Intel's "fake" cores right? :rofl:
...
Anand must be an Intel pumper. ;)
say what you want AMD, but i want to see bench marks cuz' you've broken my heart before!
Buy a 6 core, it will get faster over time, as software becomes more and more multi-threaded...
Windows7 (64bit OS) is still not mainstream. Once it is, you will see developers switch to using 64 bit apps and multi-threading. The hardware is ahead of the software, so none of what you say matters in the immediate future.
CPUs overclock themselves now when using fewer cores. Unless you have a specific purpose, buying a CPU/platform is about features.
Anandtech is not the middle of the world ;). You read maybe 1-5 reviews, I read 40. Try, for example, encoding 6 videos at the same time and you will see the difference. :) I'm at work now, but later at home I can send a PM with reviews. Definitely, the X6 1090T is about on par (+-) with the i7 965 in real multithreaded applications. PS: games are OK, but they're not optimized for more than quad-cores (I don't mean a 10% load on the other cores, that's nothing). Look at the X4 965 and X6 1090T in games. The 965 BE will be better.
Guys, what's this obsession with IPC? There is no use in having a 50% higher IPC if your competition can clock their cores at more than double the frequency of your cores while still using the same amount of power as your cores. It's about the architecture as a whole: that's the number of threads per chip, the IPC per thread and the clock frequency. This all has to stay within a certain power envelope and the chip should not be too big.
We have absolutely no idea how Bulldozer will turn out. It might even completely surprise us by having a lower IPC than Phenom II, but running at a lot higher frequency while still consuming less power than whatever Intel has on the market when Bulldozer launches. The end result would be higher per thread performance, while actually having lower IPC and no sacrifices in power consumption. I'm not saying that Bulldozer will be anything like that, but you guys just seem to be too focused on IPC and IPC is only part of the solution.
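To put the point above in numbers, here is a minimal sketch that treats per-thread performance as IPC times clock frequency. The IPC and clock values are made-up placeholders for illustration, not figures for any real chip, and the model ignores memory stalls, turbo and power limits.

```python
def per_thread_perf(ipc, ghz):
    """Relative single-thread throughput: instructions per cycle times cycles per second."""
    return ipc * ghz

# Hypothetical reference core (assumption, not a real part).
baseline = per_thread_perf(ipc=1.00, ghz=3.2)

# Hypothetical design with 5% lower IPC but a 25% higher clock (assumption).
candidate = per_thread_perf(ipc=0.95, ghz=4.0)

speedup = candidate / baseline
print(f"per-thread speedup despite lower IPC: {speedup:.3f}x")
```

In this toy model the lower-IPC design still comes out ahead per thread, which is all the argument claims: IPC alone does not settle it.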
Another thread about AMD versus Intel, just what I needed! Thanks guys! You're awesome!
IPC is Intel's weapon right now, so it can't be ignored in any honest debate, especially since a quad-core 2.66 GHZ part is beating a hexa-core 3.2GHZ part. The point is moot anyways since the competitor product has higher ipc and can overclock even better with non-extreme cooling.
And what if AMD could clock an 8 core (4 modules) Bulldozer at 5 GHz while only consuming 95 Watts? And it does this while having about 95% of the IPC of AMD's current Phenom II chips. That's what I'm getting at. IPC is only part of the story.
I'm not claiming that Bulldozer will be anything like that, just that there's more than just IPC to getting great single threaded performance.
This is just a HotChips presentation,relax guys.
yup, took less than 10 posts to go from, AMD makes an announcement, to, my chip is bigger than your chip :down:
both brands have cpus that are great at their own thing, get used to the fact that you will never see one have a massive advantage unless you have a very small niche you're trying to fill, since pricing is set so there is always some competition.
A quadcore at 3GHz will have a significant lead over a hexacore at 2GHz or an octacore at 1.5GHz for a very long time in games. Even if they have the same IPC/core and theoretical performance.
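A rough way to see why this holds is Amdahl's law. The sketch below assumes identical IPC per core and a workload where only a fraction of the work can be spread across cores; the parallel fractions are illustrative guesses, not measurements from any game.

```python
def run_time(cores, ghz, parallel_fraction, work=1.0):
    """Amdahl-style run time: the serial part runs on one core, the rest on all cores."""
    serial = (1.0 - parallel_fraction) * work / ghz
    parallel = parallel_fraction * work / (ghz * cores)
    return serial + parallel

# The three configurations from the post; all have the same cores x GHz product.
CONFIGS = {
    "quad 3.0GHz": (4, 3.0),
    "hexa 2.0GHz": (6, 2.0),
    "octa 1.5GHz": (8, 1.5),
}

def compare(parallel_fraction):
    """Run time of every configuration for a given parallel fraction."""
    return {name: run_time(c, f, parallel_fraction) for name, (c, f) in CONFIGS.items()}

# Assumed parallel fractions: perfectly parallel, mostly parallel, half serial.
for p in (1.0, 0.8, 0.5):
    times = compare(p)
    summary = ", ".join(f"{name}: {t:.3f}" for name, t in times.items())
    print(f"parallel fraction {p:.1f} -> {summary}")
```

Only the perfectly parallel case is a tie; as soon as some of the work is serial, the higher-clocked quad finishes first.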
Just as Intel did with Pentium 4? Except that Intel had tons of more resources of course.
You mentioned "honest debate" in the same sentence where you then pretend that the chips are always running at their base clocks. If IPC is as important as you claim, then we must know the true running frequency in all benchmarks; otherwise any debate is worthless.
Bulldozer? What Bulldozer.... Oh yeah Bulldozer i almost forgot about it. I'm not holding my breath.
Ooops, here comes the TURBO debate again. I thought that argument would disappear with AMD's own rather aggressive TURBO implementation on Thuban compared to bloomfield. Anyway, what's the argument here, Intel has the ipc lead? Clock for clock, even penryn cores are faster than deneb cores.
??
IPC and frequency are conflicting approaches. You can't have both, it's one or the other. Your theoretical Bulldozer is like having a Pentium 4 with the IPC of Core 2. Pigs will learn to fly sooner than you will get such a CPU.
To get a lot of IPC you need very wide cores with lots of execution units, complex decoders able to extract the parallelism out of the instruction stream and lots of buffers to keep a mountain of data in flight through the chip. All that limits the frequency you can get. In other words, if your goal is IPC, you give up on frequency. Complex circuits clock badly and burn a lot of power.
The best example is Itanium. It is extremely wide and typically has 1.5-2x the IPC of Core/Nehalem, but it also clocks about half as high. That's in spite of the fact that Intel tried to reduce complexity as much as possible by making it in-order and moving the task of finding parallelism out of the chip and into the SW (compiler).
The other approach is frequency. Netburst is a fine example: make it as narrow as possible, have half or two thirds of the execution units of other CPUs, but clock as high as possible. Do some clever stuff to hide the cache miss penalty and raise unit utilization (SMT) and you have a speed racer design. IPC can't be high: you have fewer decoders, which are simpler, few execution units, and miss penalties are huge.
The middle approach is a beefed-up Nehalem; the current one leans more towards the fat core, high IPC, than the speed racer. Although if you optimize it by hand as Intel does, you can get some impressive frequency too. But that's only possible if you have a few hundred years of man-hours available.
well, they are actually neck and neck in video encoding/3d rendering at the same clock, and bloomfield takes a pretty big lead in most of other tasks
http://img526.imageshack.us/img526/2295/img0028754.png
http://img249.imageshack.us/img249/3104/img0028843.png
I'm only trying to say that IPC is only part of the story. We have pretty much no idea how Bulldozer works. What if Bulldozer has separate clock domains within a core? Not saying it will, but AMD might have just come up with something really innovative for all we know. Let's be realistic, what do we really know about Bulldozer? Not much at all AFAIK.
AMD in thread title, and the trolls just swarm on in. lock please, the announcement has nothing to really discuss, and it's been horribly derailed so quickly
I wonder why people only complain when it comes to AMD threads? Go look in the "Sandybridge," "Westmere-EP," and "Westmere-EX" threads. No one is complaining. This thread is about an announcement. What possibly can you discuss, possible dates of the actual announcement? No chip exists in a vacuum, every chip design is inevitably going to be compared to existing designs. The discussion right now is about "IPC," which is quite relevant to bulldozer since this is the one area AMD is lagging heavily (and is only competitive by ramping up frequency and adding more cores, plus aggressive pricing) :shrug: IPC is very relevant imo.
That's not what I said. I said, lower IPC than Phenom II while clocking higher and Phenom II has a lower IPC than Core 2 AFAIK.
You could also go for a more hybrid approach, like double clocking those parts in a core that make sense. Clock domains within a single core in other words. You could for example run the schedulers and execution units at double the clockspeed of the fetch and decode stage. I'm not saying they will, but it's another approach.
As I just said, we know very little about Bulldozer and there is a very slight chance that AMD may really surprise us. I'm just being cautiously optimistic.
i dont complain often, but what caught my attention was how quickly it was derailed. notice the first thing you said was that its AMD fanboys that are crying, instead of pointing out reasons for keeping it open, you decided to start your post by attacking a group of people.
and i dont see discussions. i dont see anyone talking about design ideas to make things better. all i see is people trying to point out that what they have is better than what others have, a battle of epeens.
Again, people, could you just relax for a second. Like Helmore said, very little is known about how the Bulldozer module works, let alone the IPC relative to Deneb or Bloomfield. Wait for August for more details and then we can discuss based on facts.
Yep, couldn't agree more.
Until you have benchmarks from both, all of this is just noise. There isn't anyone that has access to both parts to be able to make an honest statement.
The funniest thing I read is "it takes 6 of your cores to match 4 of Intel's". I guess I could have said "it takes 8 of your threads to match 6 of AMD's" but let's face it, those arguments aren't going anywhere.
Patience, 2011 will be a good year for everyone.
that argument is one of the best proofs that it's the same damn chip, but people look at it as less cores or more threads.
i look at 2 things, the chips die size, and the price. the price matters to users more, the die size to relative profits (though size isnt what decides anywhere near the total cost)
cant wait for CMT so both sides can have 2 ways to look at the same chip, resulting in 4 methods to argue instead of just two.
Oh thank me.
I have already read them and nothing in them is new or actually tells me something about performance.
#6: "We have said many times that single threaded performance will be higher than current systems." That's old, and shows that we won't get any more specific info right now.
#7: Speculations
#9: Talks about the current X6, not Zambezi.
I'm not into this "you're a towel!" logic.
You're the only one throwing words around. At least we had a "tech" discussion going before you started pointing fingers.
I have asked you at least twice what you'd rather discuss seeing the OP is about a date of disclosure. So you'd rather have us discuss possible dates? Or focus on what is an inherent weakness in AMD's previous micro-architecture, and which almost everyone agrees AMD has to improve on in order to catch up to Intel? It seems to me you're in this thread to point fingers rather than contribute to what was a relevant discussion to the impending battle between AMD and Intel's next gen.
keep going with your smear campaign, until you learn how a discussion works you're only going to continue to blindly troll your way into your so called "discussions". you said yourself you dont have a point, which means you're only responding to comments with the intent to start an argument. did you actually bring any useful information into this thread?
a lot of people on here are very curious about how amd is going to uniquely handle multithreading, and might try and speculate on that. so please explain how the intel IPC of a cpu 4 years old, vs current phenoms, has anything to do with that? im here trying to see what ideas people have around CMT, but thanks to a handful of people who constantly start brand wars, that quickly turned into the rest of the people trying to defend themselves.
so how exactly does one get the thread back on topic. since im here to mainly read, i shouldnt be posting much. but i think i have the right to ask people to stop derailing, dont ya think?
When the BD design was in its infancy, back in 2005-2007, scheduled to be released @ 45nm with SSE5 (instead of AVX), AMD targeted it to be the highest performing x86 compute core in both single and multi thread scenarios. Then they decided to improve the design and release it @ 32nm with additional performance and scalability options (all due to competitive reasons and the 45nm node ramp combined with good Shanghai design results). So, if it was to be the highest performing compute core in 2008, they certainly worked on it to improve on that baseline with the BD ver2 we will be seeing next year. My guess is both higher IPC per core and per MHz AND better power gating, meaning a very aggressive turbo mode on the core level (inside the module). There could be asynchronous clocking too and many other things that can all beef up the performance of Bulldozer when compared to what we have today (both Intel and AMD).
agreed!!!!!!
better single thread performance is a good indication of higher ipc dont you think.... and dont tell me that they could have taken a 2ghz cpu against a 3ghz cpu to make those claims... we should all drop that subject and wait till september for the real numbers....
All I have said is that bulldozer will be faster than current products. I have not made any clock speed statements.
The statement that I made was that Interlagos would have 33% more cores and will be 50%+ faster than Magny Cours. If you are more than 50% faster with 33% more cores, then your "per core" performance is faster. That is the only statement that we will make on performance.
looks like they wont talk about clock speeds :)
150+% of the performance with 133% the number of cores
that's about 12.8+% more performance per core (150/133), something easily achievable with higher clocks (magny cours works at a modest 2.3ghz)
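The per-core arithmetic can be checked with a few lines. This is only a sketch of the division being done in the posts above: the 50% throughput figure is the one quoted in the thread, and the core counts are the 16-core Interlagos and 12-core Magny Cours mentioned elsewhere in this discussion.

```python
def per_core_gain(throughput_ratio, core_ratio):
    """Per-core performance change implied by a total throughput ratio and a core-count ratio."""
    return throughput_ratio / core_ratio - 1.0

# Exact core ratio: 16 cores vs 12 cores.
gain_exact = per_core_gain(1.50, 16 / 12)

# Same calculation with the rounded "133%" used in the post.
gain_rounded = per_core_gain(1.50, 1.33)

print(f"exact core ratio:   {gain_exact:.1%}")
print(f"rounded core ratio: {gain_rounded:.1%}")
```

Since the claim is "50%+", either result is a lower bound on the per-core gain, and it says nothing about whether that gain comes from IPC or from clocks.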
I think AMD is on the right track. They've posted Quarterly profits I think for the first time in years this past quarter (correct me if I'm wrong on that. they have made a profit though) so that shows that they're doing something right. Obviously, it's chump change compared to Intel's earnings, but they are doing fine. AMD/ATI's R&D budget is much smaller than Intel's or Nvidia's, so that means they're being smart about what they're doing. The Radeon cards this generation have been fantastic. Who's to say that Bulldozer won't be the same way?
im with mad pistol ... amd has changed in the past couple of years ... please give em credit that bulldozer is a beast .....
to ALL WHO ARE bickering here,remember it takes two to tango and two sides to ruin a thread.
That figure appears to come from an AMD chart focusing on one projection, "Floating Point performance".
It shows MC at "28", and Interlagos at "43" (at max, the line fades starting around 40). 43% to 53% improvement.
But given that Interlagos/BD has AVX (including FMA), this really isn't all that impressive, is it?
"Integer performance" on the same chart goes from "29" to "36-38.5" (from start fade to end fade)
That's a 24% to 33% improvement for 33% more cores.
Hmmm, doesn't really look like much single-threaded improvement there.
In sum:
From -7% to 0% performance loss per core on "Integer"
From 7.5% to 15% performance gain per core on "Floating point" -- and that's with AVX!
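For anyone who wants to redo this sum, here is a small sketch that reproduces it. The bar heights are the values read off the chart in the post above (eyeballed by that poster, so treat them as assumptions rather than AMD figures), and the core ratio assumes 16 Interlagos cores against 12 Magny Cours cores.

```python
CORE_RATIO = 16 / 12  # Interlagos cores / Magny Cours cores

# Bar heights as read by the poster: baseline, then (fade start, bar end).
chart = {
    "integer": {"magny_cours": 29.0, "interlagos": (36.0, 38.5)},
    "floating_point": {"magny_cours": 28.0, "interlagos": (40.0, 43.0)},
}

def gains(baseline, projected, core_ratio=CORE_RATIO):
    """Return (total gain, per-core gain) as fractions."""
    total = projected / baseline - 1.0
    per_core = projected / baseline / core_ratio - 1.0
    return total, per_core

results = {}
for name, row in chart.items():
    results[name] = [gains(row["magny_cours"], value) for value in row["interlagos"]]
    for (total, per_core), label in zip(results[name], ("fade start", "bar end")):
        print(f"{name:15s} {label:10s} total {total:+.1%}  per core {per_core:+.1%}")
```

The output only restates what the chart readings imply; it cannot tell you how accurate the readings, or the chart itself, are.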
Can you imagine the outcry from saaya were SB to compare to Westmere in such a way? I can hear it now... "Epic Fail!!!!!111"
Other than AVX helping FP perf, somewhat less than expected, I don't see any single-threaded gains from MC --> BD, not based on these performance projections, anyhow.
About Bulldozer and single thread performance though, it seems AMD is going to take a page from Intel by employing an even more aggressive core boosting strategy (ala SB) in single thread scenarios. Which means the focus may not be so much on ipc tweaks, but rather ultra high core frequency boosting, a scenario that would be greatly helped by power-gating to ensure the chip stays within its tdp limits.
http://www.anandtech.com/show/2879/3
A 16 core (8 modules) Bulldozer based CPU is 60 to 80% faster than a 12 core Magny Cours in integer performance when using SPECInt_rate as a benchmark according to Anandtech.
I'm talking about design philosophy. You're comparing a product that reached 3.4GHz in 130nm with Nehalem which reaches 3.5 with 45nm.
The actual speed P4 reached isn't relevant (not to mention it was held back in 65nm); it's a product designed in the late '90s. Nehalem was done by the same team that did Netburst and it took 6 years; obviously, all they learned with Netburst was put to good use.
And to sum it up : my point is that you can either aim for frequency or for IPC. The middle path are designs like Core/I7/K10 and possibly Bulldozer. I don't expect any wonders from either be it in IPC or frequency.
Even so, it has a lower frequency, thus it has other bottlenecks in the design.
Nothing new here; Pentium 4 did this back in 2000. The integer core was clocked at 2x the core clock. Ultimately it ended badly; although they tried every technique (low-swing circuits, domino logic), running something at 6-8GHz means a lot of power used, and performance per watt is poor.
Quote:
You could also go for a more hybrid approach, like double clocking those parts in a core that make sense. Clock domains within a single core in other words. You could for example run the schedulers and execution units at double the clockspeed of the fetch and decode stage. I'm not saying they will, but it's another approach.
lol. intel copied amd by having LSU's!
integrated FPU, TLB, IMC, OoO, register renaming, superscalar pipeline, branch prediction, integrated L2 cache, L3 cache, among others are all things that intel did first (for x86) that amd has used.
on a more serious note intel doesnt even need to beat amd's uarch. they have the best process and circuit design teams in the world.
Well, I mean, what can I say? It is AMD's own chart from 6 months back. :rofl:
I'll stick with AMD's official figures over Johan's tidbits, especially when the latter date (and source) from the SAME time of the "5% extra die size for 80% performance gain" nonsense that was later clarified.
Unless JF-AMD wants to reiterate any performance claim that is different from what that AMD slide shows?
JF, is Interlagos really 60-80% better than MC (12-core) in specInt_rate, despite the AMD slide showing "integer performance" is only 24-33% better?
Or was your comment to Johan in error?
You know, I've been looking more at that chart from AMD.
I've got a very close fit for the Y-axis:
**** The base results for specInt_rate and specFP_rate, for 2-socket systems with the noted processors, divided by 10. ****
Look them up. They are dead on for the 2009 "2435 Istanbul 2-socket", and very close for 2008 and 2007 as well.
Now, you say, but 2010, MC is better than this: the chart would give:
290 for int_rate (base), 280 for fp_rate for a 2-socket 2.3 MC system.
When we look we find: 309 int, 290 fp.
But recall that JF likes to say that they over-delivered with MC vs what was promised... so I think this is ok.
If I got it right, this chart from AMD calls for
=================================
Interlagos top-bin 2-socket system:
SpecInt_rate(base): 360-390
SpecFP_rate(base): 400-430
(lower numbers are where the fade starts, upper is end of bar)
=================================
The upper end would amount to an FP improvement of 48% (thank you, AVX), and integer is 390/309 = 26%.
Note that per-core, this is 148/133 = 11% better SpecFP_rate, but about 5% worse on SpecInt_rate.
It would make sense that these charts would be some form of SpecInt/FP rates, and base is easier to project than peak, and it must be 2-socket (or 1, but that doesn't make much sense) systems from the 2xx initial parts chosen.
--------------------
Anyhow, given that I now think this chart is giving specInt/FP_rate projections, the Johan specInt_rate tidbit from JF is completely at odds with this chart, and as they were both put out at the same time... gotta think the chart stands unless JF wants to (re-)claim otherwise.
Johan got his information directly from AMD. The chart has fading bars and AMD (JF) already stated that only they know how high the bars actually go (that's the purpose of the fading btw, to not actually disclose the true perf. projection). You are reading waaaay too much into that chart, especially knowing that AMD couldn't possibly predict the clock speeds they would milk from the BD silicon at the time they made the chart. A 60-80% uplift from MC is a good bet, but seeing how AMD delivered and over-delivered with Shanghai, Istanbul and especially MC, you can bet they will do all they can to over-deliver with BD when it launches.
edit:
a question: why are you so obsessed with AMD, BD performance/tapeout and 2011? Any chance you're an Intel shareholder?
Johan got his info from JF, at the same time of the 5%-die-size thing, the same day (or so) that AMD released this chart. Hence the question to JF.
***EDIT: Could it be that it was just a misread? What *is* 80% better on SpecInt_rate (per that chart) is MC over Istanbul. (rather than Interlagos over MC, which the chart shows at 35%)
AMD can make a reasonable stab at Interlagos clocks... remember that power is what is really gating things here. (more so than with a Zambezi 1-die part) But I agree there's a bin or so of "not sure", which is why the bars fade out.
I'm sure AMD will try to over-deliver, my point is merely that the chart shows BD relative to a slightly-worse-than-reality version of MC.
The numbers are interesting:
With int_rate, both Nehalem-EX and Westmere are already at the low-end of BD's projected range, so I think Westmere-EX (25% core increase, higher clocks), and also SB (33% core increase, new arch, higher mem bandwidth) will have no trouble maintaining dominance here.
With fp_rate, Intel has a lot further to go to catch a (2-socket) 400-430 SpecFP_rate(base). But presumably this is where AVX comes in, as well as more cores/bandwidth.
For single-to-low-threaded stuff, I expect Intel will win across the board, probably substantially.
edit: obsessed? Isn't the whole point of these boards/threads speculation? Some people find it fun, you know. ;) It's a challenge trying to decode these AMD performance projection slides, but the results can be informative, no?
anecdote:
back then their CEO was Craig Barrett, and he used to be an engineer for intel who started working for them in the 70's. he was on the materials side of things so he pushed process over arch. netburst uarch was a really bad idea from the start. even researchers then knew about future power issues. ever since prescott/tejas intel has focused on making a good uarch.
my point is that intel will almost always be ahead in process/physical design. amd can match or beat them in uarch but the only realistic way intel would lose is to fall behind in uarch (a la netburst). and fwiw hand optimized circuits can be up to 7x more power efficient over a synthesized counterpart.
At the end of the day whoever can encode my video faster wins. Currently this is Intel and has been for several years now.
And if one person responds to this and mentions something like badaboom I'm going to :banana::banana::banana::banana:! :down:
No, you are completely wrong here.
PLEASE DELETE THAT CHART.
I have explained several times:
1. The chart was drawn in powerpoint, the chart was not done in excel where you would have exact numbers
2. The chart uses a fade to purposely hide the actual performance estimates because we were not making actual estimates at the time.
Anyone that obsesses about that chart would also notice that the Magny Cours performance increase over Istanbul was also underestimated.
If you want to refer to any performance estimate for bulldozer, there is one official one: 50% greater total throughput than Magny Cours.
We won't be saying anything else for the foreseeable future.
Any other guess, no matter how complicated the math or methodology, will be wrong.