AMD: 32nm issues fixed

11-27-2010, 04:01 PM
Opteron146

Quote:

Originally Posted by JF-AMD

Terramar and Sepang are server products. Don't try to draw conclusions about client products based on server.

Sure, I was just speculating, not drawing conclusions. With Sepang's confirmed triple channel IMC, we can speculate whether AMD will launch a triple channel enthusiast platform like Intel's, too.

There are pros and cons ... but what AMD will do is up to your desktop colleagues ;-)

To say it in lawyer compatible English:
AMD could launch a triple channel enthusiast platform in 2012.
(but maybe they don't ;-))
11-27-2010, 08:57 PM
qcmadness

Quote:

Originally Posted by Opteron146

http://phx.corporate-ir.net/External...xUeXBlPTM=&t=1

C2012: Triple channel
G2012: Quad channel

Where did the dual channel go? :confused:
11-28-2010, 06:25 AM
JF-AMD

Both of those are server sockets, not client sockets.
11-28-2010, 06:37 AM
Movieman

Quote:

Originally Posted by JF-AMD

Both of those are server sockets, not client sockets.

BUT some of us like server sockets..Just like you do!:rofl:
11-28-2010, 06:42 AM
-Boris-

Quote:

Originally Posted by BeepBeep2

Efficiency was awesome on AM2 compared to intel, what are you talking about?

Of course you could compare with Intel's FSB-limited chipsets, but that says nothing about the efficiency of the memory controller. It only shows limitations in the bus.

Look at this:
http://www.anandtech.com/show/1991/4
13% increase in read and 28% increase in write. With double the theoretical bandwidth. And that's with no bottleneck like the FSB in between.

That's a very big dip in efficiency. And the efficiency didn't increase with AM3.

Compare with i5 or i7 with dual channel IMC. The difference is huge. Phenom II is almost down at C2Q territory. C2Q being limited by a bus only half the speed of the memory.

http://www.techspot.com/review/193-i...750/page5.html
http://www.anandtech.com/show/2832/7

The problem is that Phenom II doesn't have a much faster memory controller than the 939 CPUs had. It's just hacked to support DDR2 and DDR3. The performance hasn't scaled.
11-28-2010, 07:26 AM
sergiojr

Quote:

Originally Posted by -Boris-

The problem is that Phenom II doesn't have a much faster memory controller than the 939 CPUs had. It's just hacked to support DDR2 and DDR3. The performance hasn't scaled.

Really? Look at history of DDR2 MC from Athlon X2 to Phenom II X4.
http://techreport.com/articles.x/16147/3
And look what DDR3 adds to performance (DDR2-1066 vs DDR3-1333)
http://techreport.com/articles.x/16796/3
11-28-2010, 07:37 AM
qcmadness

Quote:

Originally Posted by sergiojr

Really? Look at history of DDR2 MC from Athlon X2 to Phenom II X4.
http://techreport.com/articles.x/16147/3
And look what DDR3 adds to performance (DDR2-1066 vs DDR3-1333)
http://techreport.com/articles.x/16796/3

Dual-channel DDR3-1333
Theoretical: 21.3GB/s
Real: 9.51GB/s
Efficiency: 45%

Is it low enough?
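
The arithmetic behind those three numbers can be sketched as below. This is only an illustration: it assumes a 64-bit (8-byte) bus per channel and decimal GB/s, and the function names are made up for the example. The 9.51GB/s figure is the measured value quoted in the post.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, channels: int,
                       bus_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth in decimal GB/s.

    transfers per second * bytes per transfer * number of channels.
    The 64-bit bus width per channel is an assumption of this sketch.
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer * channels / 1e9


def efficiency(measured_gbs: float, peak_gbs: float) -> float:
    """Fraction of the theoretical peak that was actually measured."""
    return measured_gbs / peak_gbs


# Dual-channel DDR3-1333, with the measured figure from the post.
peak = peak_bandwidth_gbs(1333, channels=2)
eff = efficiency(9.51, peak)
print(f"Theoretical: {peak:.1f}GB/s, efficiency: {eff:.0%}")
```

The same helper gives the 2x5300MB/s and 2x6400MB/s class of figures for dual-channel DDR2-667 and DDR2-800 if those transfer rates are plugged in.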
11-28-2010, 07:46 AM
sergiojr

Quote:

Originally Posted by qcmadness

Dual-channel DDR3-1333
Theoretical: 21.3GB/s
Real: 9.51GB/s
Efficiency: 45%

Is it low enough?

It is dependent on application. STREAM benchmark for example shows much higher numbers.
http://techreport.com/articles.x/18799/5
I think that weak core prefetchers in K8-K10 cores can be blamed (compared to Intel processors), but IMC itself seems ok.
11-28-2010, 07:50 AM
Sn0wm@n

LOL, that's your basis for judging memory controller efficiency???
11-28-2010, 08:03 AM
saaya

thx for the infos and heads up everybody :toast:
@opteron, i didnt think it was impressive cause i assumed it was mostly from a clockspeed boost, and cause bw boosts rarely have a notable impact on actual usage perf... at least desktop wise
a 30% ipc boost would be insane cause it would result in notable boost on actual usage perf

i hope that with fusion amd will create GAMING oriented cpus, that would be awesome...
imagine if there was a cpu that would rock in gaming but suck in photoshop and video compression etc...
90% wouldnt care and get the cpu that rocks in games... yet right now new cpus barely show gains in game perf, and if they do, its notably slower than in other areas...
thats why people loved a64 so much, it rocked in games...

PS: enthusiasts being crazy about intels tri channel? lol what?
in most situations dc is as fast as tc, and in some situations its faster...
and the weirdest thing is that sc is sometimes the fastest config on 1366 LOL :D
i know of a few wrs that were set with dc on 1366, so... if i had to choose id go for fast dc over tc any day of the week...
12-02-2010, 09:38 PM
hlopek

Quote:

Originally Posted by Hans de Vries

inverse photoshopping effort...
Regards, Hans

Thank you Hans, nice work :up: (seems like your PS is using GPU acceleration, judging by the tiling on the zoomed core :D)

Looking at your numbers, with L3 cache consuming only 3.85mm²/MB, I must say I'm pleasantly surprised by how little space L3 needs per MB.
But then I'm in fact wondering why AMD decided to go with only 4x 2MB L3 cache, when 4x 4MB L3 on an already huge 320mm² die wouldn't take up much more space (~355mm²) nor require much more power on chips that probably already have a >140W TDP. An extra 8MB might raise that number by only 8-13W while giving more breathing room to the IMC.
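
As a rough check of that die-size estimate, here is a minimal sketch using only the numbers in the post (3.85mm²/MB, a 320mm² die, 8MB of L3 today). It assumes area scales linearly with capacity and ignores any extra tags, routing or padding, which is an assumption of the sketch and not something the die shot confirms. The variable names are made up for the example.

```python
# Figures taken from the post above.
AREA_PER_MB_MM2 = 3.85      # estimated L3 area per MB
CURRENT_DIE_MM2 = 320.0     # estimated die size with 4x 2MB L3
CURRENT_L3_MB = 8
PROPOSED_L3_MB = 16         # the suggested 4x 4MB configuration

# Assumption: area grows linearly with capacity, no overhead.
extra_mb = PROPOSED_L3_MB - CURRENT_L3_MB
extra_area = extra_mb * AREA_PER_MB_MM2
new_die = CURRENT_DIE_MM2 + extra_area
growth = extra_area / CURRENT_DIE_MM2

print(f"Extra L3 area: {extra_area:.1f}mm²")
print(f"New die size:  {new_die:.1f}mm² ({growth:.1%} larger)")
```

Under that linear assumption the result lands slightly below the ~355mm² mentioned in the post, so the quoted figure already leaves a few mm² of margin for overhead.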

16MB L3 + 8MB L2 cache seems easily reachable for a 32nm 300mm²+ die, and in server loads, which is what Bulldozer is designed for, an extra 8MB (or even just an extra 4MB of L3) could be of much use. I'm guessing that in some special cases 16MB vs 8MB L3 could provide more than a 20% boost while, yet again, consuming only a modest amount of power, which in server loads could be more beneficial if cores aren't throttling while waiting for data to come from a few hops away.

And a 12MB-16MB L3 cache could "present in specs" as a more competitive product against server Sandy Bridges with 20MB L3 caches and an enormous 384mm² die (if I read that correctly somewhere)

Do these new distributed L3s also support the L3 cache power-down feature?
And does Bulldozer come with yet another separate power plane, with the IMC disconnected from L3, so that we now have independent IMC and independent L3 power planes? If that is supported now I really don't see the reason why AMD went with only 8MB L3 cache.

Quote:

Originally Posted by Hans de Vries

The area in the middle shows four times the HT logic as well so I wouldn't expect more than 4 HT links.

Or maybe a better "directory table" (DT) and the complexity of seamlessly relaying data over a dedicated HT link between separated L3 caches? That's why I'm asking why such a small amount of L3 cache. 1MB L3 for the dedicated HT link DT, plus 1MB for swapping and the needs of the local core.

Simply too small. 1MB reserved for the HT link DT, 1MB for swap/relaying-routing and at least 1MB of local core "reservation" would be a much saner approach imho.
12-02-2010, 10:58 PM
hlopek

Quote:

Originally Posted by saaya

to make it 5 cores? 0_o :D

Seems AMD disagrees with you :p: take a look at slide p.4 what AMD announces for 2012. It seems to me that 5 modules (10 cores) however weird looking to us will be available at least in server market :rolleyes:

Quote:

Originally Posted by saaya

cause i actually like amd, and i think you guys can do a lot better than what youve been doing in the past years...

amds ddr2 and ddr3 imcs are :banana::banana::banana::banana:e... no offense, but seriously... the ddr2 imc offered no advantages over ddr1 whatsoever, and the ddr3 imc had clockspeed issues from day1 and is comparable to intels P35/X38 ddr3 memory controller which is 2, soon 3 generations behind.

Where did you find so many wrong statements, and what counts as an improved IMC for you?

K8 90nm/65nm on the M2 socket did have an improved IMC: it supported first DDR2-667 and then DDR2-800 in dual channel without overclocking, versus the old DDR-400. But the vastly different thing was that most DDR-400 in A64 times could be OCed to 533MHz with timings superior, or at least closely comparable, to DDR2-667. Vast difference #2 is that dual core K8 could nowhere near utilize 2x6400MB/s of bandwidth, and only a properly clocked 3.2GHz early K10 B3 came close to saturating 2x5300MB/s, which is DDR2-667.

Don't see why so much hate when K10 finally has an integrated DDR3 controller that can work as well as a P45/X48 chipset (btw at release time that was in fact only 0.5 generation behind Intel, which had just started to ship Nehalems) ... you seem to forget that AMD has had the MC integrated onto the same die as the CPU since 2003, and we all know how that worked out for Intel, which needed to pump up L2 cache size just to keep pace with AMD's 2x smaller caches :p:

And then, even though the bandwidth offered by DDR3-1333 is mostly overkill even for a PII X6 CPU, it's nice that they support it, because DDR3-1333 is the first step up from DDR2-800 that offers some visible performance advantage @CL6 over DDR2-800 @CL5. But a way more important feature over X48/P45 is that PII on AM3 supports DDR3 at normal voltages, as well as the insane 1.9V DDR3 voltages that X48/P45 also support, and if a Phenom II is set to work @1.5V it will reduce IMC power needs and overall CPU TDP compared to the same part working as AM2+. :cool: Lame reviewers usually point that out as better performance of AM3 vs. AM2+ and give the credit to better AM3 motherboards :p:

Quote:

Originally Posted by JF-AMD

Yes, they can share the FPU but they don't always. To get to 256-bit AVX, we share 2 FPU pipelines and Intel shares a 128-bit FPU with integer pipelines. In either case, to get to 256-bit AVX, you are sharing resources. We just choose to do it in a way that gives you 8 AVX units AND 16 integer pipelines. Intel does it in a way that gives you 8 AVX units and LESS THAN 8 integer units (because some of their integer resources are going to handle the AVX instructions).

Could this be true for Intel? Did you mean that floating point resources will be shared with AVX, and that floating point (SSE?) is being shared with the integer pipeline? That sounds messy to me.

And for the rest of us, outside of the server 8-module BDs, we'll in fact have only 4 AVX units and 8 integer ones. It's nice to write it the proper way, because "16 core" parts are just for the G34 market ;)
12-02-2010, 10:59 PM
nn_step

Quote:

Originally Posted by hlopek

Thank you Hans, nice work :up: (seems like your PS is using GPU acceleration, judging by the tiling on the zoomed core :D)

Looking at your numbers, with L3 cache consuming only 3.85mm²/MB, I must say I'm pleasantly surprised by how little space L3 needs per MB.
But then I'm in fact wondering why AMD decided to go with only 4x 2MB L3 cache, when 4x 4MB L3 on an already huge 320mm² die wouldn't take up much more space (~355mm²) nor require much more power on chips that probably already have a >140W TDP. An extra 8MB might raise that number by only 8-13W while giving more breathing room to the IMC.

16MB L3 + 8MB L2 cache seems easily reachable for a 32nm 300mm²+ die, and in server loads, which is what Bulldozer is designed for, an extra 8MB (or even just an extra 4MB of L3) could be of much use. I'm guessing that in some special cases 16MB vs 8MB L3 could provide more than a 20% boost while, yet again, consuming only a modest amount of power, which in server loads could be more beneficial if cores aren't throttling while waiting for data to come from a few hops away.

And a 12MB-16MB L3 cache could "present in specs" as a more competitive product against server Sandy Bridges with 20MB L3 caches and an enormous 384mm² die (if I read that correctly somewhere)

Do these new distributed L3s also support the L3 cache power-down feature?
And does Bulldozer come with yet another separate power plane, with the IMC disconnected from L3, so that we now have independent IMC and independent L3 power planes? If that is supported now I really don't see the reason why AMD went with only 8MB L3 cache.

Or maybe a better "directory table" (DT) and the complexity of seamlessly relaying data over a dedicated HT link between separated L3 caches? That's why I'm asking why such a small amount of L3 cache. 1MB L3 for the dedicated HT link DT, plus 1MB for swapping and the needs of the local core.

Simply too small. 1MB reserved for the HT link DT, 1MB for swap/relaying-routing and at least 1MB of local core "reservation" would be a much saner approach imho.

More cache will not solve a problem that is not related to cache capacity.

The problem is prefetch accuracy and unnecessary cache flushing, both of which kill more performance than anything else in AMD's and Intel's designs.

Unfortunately there is little Intel or AMD can do to improve that. [x86_64 didn't include separate address spaces for separate programs]

For example: a benchmark program runs and gets interrupted by the operating system, which removes all cache lines holding said benchmark from the processor, thus preventing a prefetch of said benchmark's vital data.

For example: a text editor and a music player share the same MMU. To enable multi-tasking, the two programs must be stopped and swapped every 10-300ms. Thus the operating system must first clear out the cached page table and load in the page entries for the next running application, costing thousands of clock cycles instead of dozens.
12-03-2010, 12:32 AM
god_43

i say its time to create a brand new instruction arch (x64 anyone?), you can move x86 into virtualization for compatibility.
12-03-2010, 01:45 AM
nn_step

Quote:

Originally Posted by god_43

i say its time to create a brand new instruction arch (x64 anyone?), you can move x86 into virtualization for compatibility.

Or we could just add a half dozen more instructions and modify the Memory management units to support Address space tags.

It would require Zero changes in backwards compatibility but would make both operating systems more efficient and not require us to change Any user space code.
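
To illustrate why address space tags would help, here is a toy model of a cache of page translations. It is a sketch only: the class and function names are made up, capacity is unlimited, and the miss counts say nothing about real hardware. It just shows that an untagged cache has to be emptied on every task switch, while a tagged one can keep each program's entries.

```python
class TranslationCache:
    """Toy cache of virtual-page translations, optionally tagged per address space."""

    def __init__(self, tagged: bool):
        self.tagged = tagged
        self.entries = {}          # (tag, virtual_page) -> placeholder frame
        self.current_space = None
        self.misses = 0

    def switch_to(self, address_space_id: int) -> None:
        # Without tags, stale entries could be confused with the new
        # program's pages, so everything has to be thrown away.
        if not self.tagged:
            self.entries.clear()
        self.current_space = address_space_id

    def access(self, virtual_page: int) -> None:
        tag = self.current_space if self.tagged else None
        key = (tag, virtual_page)
        if key not in self.entries:
            self.misses += 1       # stands in for an expensive page walk
            self.entries[key] = object()


def count_misses(tagged: bool, switches: int = 10, pages: int = 4) -> int:
    """Two programs alternate; each touches the same few pages per time slice."""
    cache = TranslationCache(tagged)
    for i in range(switches):
        cache.switch_to(i % 2)
        for page in range(pages):
            cache.access(page)
    return cache.misses


print("untagged misses:", count_misses(tagged=False))
print("tagged misses:  ", count_misses(tagged=True))
```

In this model the untagged cache misses on every page after every switch, while the tagged cache only misses the first time each program touches a page.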

Then of course, there are the much more controversial changes which would break some programs and require serious operating system and user space changes. Things such as absolutely separated Data and Instruction spaces [makes self-modifying code easy to prevent and eliminates entire classes of viruses]
and let us not forget the controversial separate stack space [makes stack based attacks literally impossible and prevents the stack from ever hitting the heap]
12-03-2010, 02:07 AM
zalbard

Quote:

Originally Posted by god_43

i say its time to create a brand new instruction arch (x64 anyone?), you can move x86 into virtualization for compatibility.

x86-64 is a superset of x86, though.
I do not want something like IA64, to be honest. It has to work with older software relatively well to be well accepted by the market.
Hopefully AVX will provide a decent boost for optimised applications, I have big hopes for it.
12-03-2010, 03:30 AM
chew*

I see a lot of talk about SOI and how it's not beneficial and blah blah blah......

Let's however face the facts.

You do not hear horror stories of AMD CPUs degrading and dying on a regular basis. Yes they can be killed........it's not that common however, more often than not it's a mobo taking them out.

Intel is not using SOI, their CPUs die or degrade if you look at them funny.

Can this be attributed to SOI? maybe........it also could be the fact that AMD doesn't use HKMG.......

One thing is for certain. One company has an unreliable product under xtreme conditions and one doesn't.
12-03-2010, 04:06 AM
hlopek

Quote:

Originally Posted by chew*

Intel is not using SOI, their CPUs die or degrade if you look at them funny.

Can this be attributed to SOI? maybe........it also could be the fact that AMD doesn't use HKMG.......

One thing is for certain. One company has an unreliable product under xtreme conditions and one doesn't.

Seems like you're under the impression left by the premature deaths of i7 980s. My memory isn't exhaustive, but I can remember the same thing happening to Northwood A/B, especially the P4 Celerons based on it. Meanwhile the hot Willamette and the furnace Prescott were designed to cope with high temps.... and they were also more likely to kill a low-quality mobo, and then accidentally themselves, than to degrade or commit CPU suicide without damaging the mobo.

The last reliable parts before that were the highly praised Tualatins, and then their iterations in Pentium Ms and eventually desktop Core 2. So somehow it looks more like a product feature than an inherited fault of the process nodes. I myself am not too surprised by 32nm Nehalems degrading; I even expect the same from 45nm ones, considering the low TDP they claim while still needing 6+2 phases (real, not doubled, and iirc some mobos end up with 12+4 and 18+6 redundant phases).

And SOI, well, it's the reason why AMD only has a gate-first implementation, and it's beneficial (again). IIRC it was publicly noted somewhere that SOI itself needs somewhat more stretched components (larger footprints) than bulk on the same node, and HKMG GF could be squeezed tighter than HKMG GL, so I guess it should come out the same as when they departed on different routes: 130nm SOI - 90nm Ga stretched bulk Si
12-03-2010, 04:07 AM
Hornet331

Quote:

Originally Posted by chew*

I see a lot of talk about SOI and how it's not beneficial and blah blah blah......

Let's however face the facts.

You do not hear horror stories of AMD CPUs degrading and dying on a regular basis. Yes they can be killed........it's not that common however, more often than not it's a mobo taking them out.

Intel is not using SOI, their CPUs die or degrade if you look at them funny.

Can this be attributed to SOI? maybe........it also could be the fact that AMD doesn't use HKMG.......

One thing is for certain. One company has an unreliable product under xtreme conditions and one doesn't.

Lol wut? :ROTF:

The only really known degradation was the SNDS, and that still resulted from overclocking. I don't know of a case where a CPU from either AMD or Intel died from degradation in its lifetime when it was run within specs.
12-03-2010, 11:44 AM
mongoled

Quote:

Originally Posted by Hornet331

.......when it was run within specs.

Just exactly which forum are you posting on, run within specs, lol, :rofl:

I wonder how many people who post here are running non-overclocked desktop systems?

I can't wait to see what AMD brings to the table, I hope it ends up being good

:)
12-03-2010, 01:16 PM
Tomasis

yeah AMD is sweet candy for Oc'ers.. cheap and almost indestructible as bulldozer :D

yay :D
12-03-2010, 09:04 PM
nn_step

Quote:

Originally Posted by Hornet331

Lol wut? :ROTF:

The only really known degradation was the SNDS, and that still resulted from overclocking. I don't know of a case where a CPU from either AMD or Intel died from degradation in its lifetime when it was run within specs.

Name one IT department that expects any desktop CPU to last more than 3 years.
12-03-2010, 11:05 PM
ajaidev

Quote:

Originally Posted by nn_step

Name one IT department that expects any desktop CPU to last more than 3 years.

My old IT head told me that every farm build has to work without major modifications for at least 5 years. But we could make small additions to and subtractions from the farm.

But sadly a huge chunk of the farm created problems in its 3rd year because of incompatibility with newer hardware additions, and the whole thing had to be replaced.

The lesson is that one can't plan that far ahead: expecting chips to last more than 2 years is a gamble unless you don't need more power and have repair resources on hand.
12-04-2010, 03:48 AM
Hornet331

Quote:

Originally Posted by mongoled

Just exactly which forum are you posting on, run within specs, lol, :rofl:

I wonder how many people who post here are running non-overclocked desktop systems?

I can't wait to see what AMD brings to the table, I hope it ends up being good

:)

Enough, my notebook and my HTPC (undervolted) aren't overclocked at all. :p:
12-04-2010, 04:20 AM
chew*

Quote:

Originally Posted by Hornet331

Enough, my notebook and my HTPC (undervolted) aren't overclocked at all. :p:

My netbook isn't overclocked because I haven't modded the BIOS yet.......luckily it's AMI.

BTW, my post was not meant to start an AMD versus Intel flame war. I use both; these are just my personal observations, and they do not only take Gulftown into the equation: Bloomfield, Clarkdale and Lynnfield are all in that mix. Bottom line, the weird thing is the Intel chips did not necessarily die under extreme OCing, which is odd because I would have expected them to die then......they died the next time, for no apparent reason, at rather conservative OCs considering the cooling.

The only chip I expected to die has not, 1.93v to IGP for well over 10 hours............yes it's degraded and can't pass Vantage at stock........it's to be expected however. There is no rhyme or reason with Intel.

The AMD chips have died due to operator error or a flaky mainboard. For instance, my Biostar has a repeatable bug where I can kill a CPU instantly at stock if I use certain software bundled with the board. Other times my insulation failed, PWM got wet...game over. AMD deaths make sense at least.
