What is Orochi? We need Zambezi. :)
Savantu, if you hadn't noticed, this thread is about the Ontario Fusion product, which is due Q4 THIS YEAR (Bobcat). Furthermore, another FUSION product is Llano, which is due Q1 2011. So I don't get your comment about waiting until 2012 for netbooks/notebooks based on it...
Yes, the Bulldozer Fusion product will probably show up in 2012. But that has nothing to do with this thread :).
2010.Q2 Tape Out
2010.Q4 Sample to OEM
Let's see Magny Cours:
2009.Q3 Sample to OEM
2010.Q1 Announcement
So my speculation: in an optimistic scenario, Interlagos will appear in 2011 Q2, or later in Q3. As for the desktop part, Zambezi, I don't know and won't make any wild speculation.
savantu responded to my comment that responded to batteryoperated's comment about the bulldozer product being delivered in 2012.. which is FUD really ... 2011 for the server product ... JF-AMD confirmed it .. and Q4 2011 for desktop products based on the article savantu posted .... so i guess it was relevant to a part of this thread
i think ontario/zacate are more like 64 mm² than 74 mm²
all you have to do is expect nothing. speculation is pure entertainment for most of us and you don't have to take it seriously.
as an aside, here is my transistor estimate for bobcat:
100M for the GPU
30M * 2 for the L2 cache
20M * 2 for the cores
_______
~200M transistors.
h1 2011, BD will come to market. Not h2 2011 or 2012.
Over at SA, Hans was kind enough to provide a summary of known Atom and Bobcat ES scores in the BOINC benchmark (which tests single-thread performance in both int and fp):
http://www.semiaccurate.com/forums/s...&postcount=163
For comparison, Conroe (Core 2 @ 65nm) @ 1.6GHz gets int ~3290, fp ~1560. Bobcat is, according to BOINC, around 95% of the 65nm C2D's performance in integer and around 85% in fp calculations. Compared to Atom, it's 85% faster in int and 90% faster in fp.
Why compare it with C2? P-M is in the same range:
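For anyone who wants the ratios above turned back into rough scores, here is a quick sketch. Only the Conroe numbers and the percentages come from the post; the Bobcat and Atom figures it prints are derived estimates, not measured results, and the dictionary and function names are made up for illustration.

```python
# Back-of-the-envelope: recover approximate Bobcat and Atom BOINC scores
# from the ratios quoted in the post. Derived values are estimates only.
conroe = {"int": 3290.0, "fp": 1560.0}        # Conroe @ 1.6 GHz (quoted, approximate)
bobcat_vs_conroe = {"int": 0.95, "fp": 0.85}  # Bobcat as a fraction of Conroe
bobcat_over_atom = {"int": 1.85, "fp": 1.90}  # "85% faster" -> 1.85x, "90% faster" -> 1.90x


def implied_scores():
    """Return (bobcat, atom) score dicts implied by the quoted ratios."""
    bobcat = {k: conroe[k] * bobcat_vs_conroe[k] for k in conroe}
    atom = {k: bobcat[k] / bobcat_over_atom[k] for k in conroe}
    return bobcat, atom


bobcat, atom = implied_scores()
for name, scores in (("Bobcat", bobcat), ("Atom", atom)):
    print(f"{name}: int ~{scores['int']:.0f}, fp ~{scores['fp']:.0f}")
```

Given the +/- 100-150 run-to-run noise people report for this benchmark, treat the output as ballpark only.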
P-M 1.6ghz: 1467/2914 (banias)
P-M 1.86ghz 1476/3052 (dothan)
It's nice to have results, but I'd rather have results other than the built-in BOINC benchmark... way too unreliable... +/- 100-150 MIPS on both fp and int is not rare, and it makes quite the difference.
Because C2 is the mainstream performance segment, just like Regor, and the results fit almost perfectly with the 90% of mainstream perf. claim. P-M is yesterday's news; heck, even 65nm Conroe was replaced by Penryn a long time ago.
Very impressive numbers. On top of that, with the APU this little chip will be totally rocking the netbook, subnotebook.... market.
Maybe it's yesterday's news, but that's probably the performance you get.
It smacks Atom in every way, and if it can beat the Pentium Dual-Core it will be very interesting. If not, they are basically locked into the sub-450€ market, and there it will be a price war with Atom, especially in this segment where people hardly care about performance.
interesting, this is kind of a preview/impressions piece.
apu
people hardly care if the performance difference is just around, say, 10-20%. But if the faster part is noticeably quicker, up to twice the speed, especially in graphics-intensive tasks like streaming video over the net or playing online games, i think they WILL notice the difference.
It looks like they have done an excellent job, optimizing the die size and aiming directly at the most common and intensive tasks [video].
I'll believe it when I see it. I have some doubts about these tests. The max int instruction throughput for Bobcat at 1.6 GHz is 3.2 GIPS (limited by two decoders). I am very sceptical that Bobcat can reach nearly max instruction throughput in this synthetic test (while Conroe & Athlon64 can't). Still very impressive if true.
Even if it had, say, 8 decoders it wouldn't be any faster. It could in theory run at 12.8 "GIPS", but in practice it wouldn't run any faster (only in cases where it could actually exploit ILP > 2, which I believe is quite rare with the given code).
But haters gonna hate. :shrug:
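The 3.2 and 12.8 GIPS figures above are just decode width times clock. A tiny sketch of that arithmetic, plus the "wider decode doesn't help if the code has no ILP" argument, assuming a deliberately crude model in which sustained rate is capped by the smaller of decode width and code ILP (the function names are hypothetical, and the model ignores decode-order effects, caches, and everything else):

```python
def peak_gips(decode_width, clock_ghz):
    """Upper bound in GIPS if the decoders are the only limit."""
    return decode_width * clock_ghz


def sustained_gips(decode_width, clock_ghz, code_ilp):
    """Crude cap: the code cannot use more slots per cycle than its own ILP."""
    return min(decode_width, code_ilp) * clock_ghz


print(peak_gips(2, 1.6))            # two decoders at 1.6 GHz
print(peak_gips(8, 1.6))            # hypothetical 8-decoder part
print(sustained_gips(8, 1.6, 2.0))  # 8 decoders, but code with ILP of 2
```

Under this model the 8-wide part running ILP-2 code lands on the same number as the 2-wide part's peak, which is the point being argued in the post.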
But if your CPU has only two decoders, that doesn't mean it has the same IPC as a CPU with 4 decoders when it executes code with ILP <= 2. A simple example (a sequence of 4 arithmetic operations):
a = b + c
a = a + d
e = g + h
e = e + f
A CPU with 4 decoders can execute the first and third instructions in the same cycle, while a CPU with 2 decoders will need one more cycle for that. Of course, in reality things are a bit more complex because of the out-of-order buffer, but again, I really doubt Bobcat has a bigger OoO instruction window than Conroe/Athlon64.
You are sticking to the same old number-of-decoders story. You really need to let it go. Bobcat has other OoO improvements that make it very efficient, which is very important in average x86 code (which in turn has an average IPC of just ~1).
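The four-instruction example can be checked with a toy model. This is only a sketch under loud assumptions: instructions are decoded in program order, every operation has one-cycle latency, there are unlimited execution units, and there is no out-of-order window beyond what has already been decoded. The function name and the dependency encoding are made up for illustration.

```python
def cycles_to_finish(deps, decode_width):
    """Toy in-order-decode model.

    deps[i] lists the earlier instructions that instruction i depends on.
    An instruction executes in the first cycle in which it has been decoded
    and all of its producers finished in an earlier cycle.
    """
    done = {}  # instruction index -> cycle in which it executed
    for i, producers in enumerate(deps):
        decode_cycle = i // decode_width + 1
        ready_cycle = max((done[p] + 1 for p in producers), default=1)
        done[i] = max(decode_cycle, ready_cycle)
    return max(done.values())


# a = b + c ; a = a + d ; e = g + h ; e = e + f
deps = [[], [0], [], [2]]
print(cycles_to_finish(deps, 2))  # 2 decoders
print(cycles_to_finish(deps, 4))  # 4 decoders
```

In this model the 2-wide machine takes one cycle more than the 4-wide one, even though the code never has more than two independent operations available at once. A real core with a reorder buffer and a decode queue would blur this, as the post itself notes.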
Anand seems to think it's around 100 mm^2. I reckon it's somewhere between 77 and 100 mm^2, kinda similar to Atom at 87.5 mm^2..
Link
It's not an "old story" until proven otherwise. I have not seen any proof that decoders are not important any more. Also, any OoO improvement (if one exists) is not a replacement for the decode stage.
Also keep in mind that average ILP depends on the actual application, and it is only an average: for some parts of the code it may be < 1 and for other parts it may be > 3. That means that on a CPU with fewer execution resources the average IPC will be lower (even if the max IPC of the CPU > the average ILP of the app). I can also bet that apps which show very low IPC on "Pentium" will be killers for Bobcat. Low IPC usually means the CPU is waiting for data from memory, and Bobcat's half-speed cache can't help much there. BTW, the BOINC benchmark keeps memory access to a minimum.
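To put numbers on the "average ILP is only an average" point, here is a small sketch. The workload is entirely made up (two phases with ILP 1 and 3), and the model simply assumes each phase runs at the smaller of its ILP and the machine width; nothing here is measured on real hardware.

```python
def achieved_ipc(phases, width):
    """phases: list of (instruction_count, ilp) pairs.

    Each phase is assumed to run at min(ilp, width) instructions per cycle.
    """
    total_instructions = sum(n for n, _ in phases)
    total_cycles = sum(n / min(ilp, width) for n, ilp in phases)
    return total_instructions / total_cycles


# Hypothetical app: 400 instructions at ILP 1, then 600 instructions at ILP 3.
phases = [(400, 1.0), (600, 3.0)]

avg_ilp = achieved_ipc(phases, float("inf"))  # IPC on an unlimited-width machine
print(f"average ILP of the app: {avg_ilp:.2f}")
print(f"IPC on a 2-wide machine: {achieved_ipc(phases, 2):.2f}")
print(f"IPC on a 4-wide machine: {achieved_ipc(phases, 4):.2f}")
```

With these made-up phases the app's average ILP is below 2, yet the 2-wide machine still ends up with a lower IPC than the 4-wide one, because it loses time in the high-ILP phase and cannot win it back in the low-ILP phase.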
Read this