Is this clear enough?
Each integer core takes 4 macro-ops from the dispatch group buffers, while each 10h (Istanbul) core takes 3 macro-ops.
From here. It's a very good article.
It's not like it hadn't already been posted here in this thread:
http://www.xtremesystems.org/forums/...&postcount=685
I didn't see the post.
PS: I read it the first day it was published, http://www.google.com/realtime
The thing is that it uses it. If the CPU can't use all 6 at the same time that's another thing. All 6 will get used at some point. Either way, they are on the die, they're connected, and they are used. Alternatively, not at the same time, whatever. But they are there, they are used and thus they are a resource. K10 has more resources than BD (integer "clusters").
Instructions per clock (compared to K10). Frequency doesn't matter, this is per clock:
IPC (CPU level) --> Will be higher, more "modules", double integer resources per "module", less resources per integer "cluster", better use of available resources per integer "cluster".
IPC ("module" level) --> Will be higher, double integer resources per "module", less resources per integer "cluster", better use of available resources per integer "cluster".
IPC (single integer "cluster") --> Less resources, better use of available resources. Higher or lower instructions per clock?
The bold part is likely lower, and that's exactly what savantu, terrace and others are discussing here: IPC per integer "cluster". We don't know for sure, since JF just says "IPC will be higher". At which of the previous levels? After all the BS, bans, etc., he still hasn't answered this question.
Now, if you throw frequency into the mix, knowing that it will be higher than on current K10 CPUs, of course you can say single integer "cluster" performance is higher. Just notice how he never uses IPC + higher + per integer "cluster" in the same sentence. The only info we have about single-thread performance is that it will "be higher". Of course, because of the higher frequency, not because IPC is higher.
JF just has to answer the question and this debate will end fast: has IPC per integer cluster been increased or not? No BS, just yes or no.
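To put the frequency point in numbers: single-thread performance scales roughly with IPC times clock frequency, so a lower IPC per integer "cluster" can still end up as higher performance if the clock goes up enough. This is only a sketch; the 0.95 and 1.20 ratios below are made-up illustration values, not anything AMD has stated.

```python
def relative_performance(ipc_ratio, freq_ratio):
    """Performance relative to K10, modeled as IPC ratio times clock ratio."""
    return ipc_ratio * freq_ratio


def break_even_freq_ratio(ipc_ratio):
    """Clock ratio needed to match K10 performance at a given IPC ratio."""
    return 1.0 / ipc_ratio


# Hypothetical: 5% lower IPC per integer cluster, 20% higher clock.
per_cluster = relative_performance(ipc_ratio=0.95, freq_ratio=1.20)
needed = break_even_freq_ratio(0.95)

print(f"relative single-thread performance: {per_cluster:.2f}")
print(f"clock ratio needed just to break even: {needed:.3f}")
```

With these made-up numbers, performance per cluster still comes out above K10 even though IPC per cluster dropped, which is why "single thread performance is higher" alone doesn't settle the IPC question.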
Stargazer, can you read or not? The man said IPC will be higher and single-thread performance will be higher. Can't you just stop beating the dead horse already? It's dead, alright?
K10 has a clear bottleneck in the retirement unit. It has a massive 9 execution units available (3 ALU, 3 AGU, 3 FPU) but can retire only 3 macro-ops per cycle.
What would make the IPC lower on "integer cluster level"? Deeper pipeline + "less" resources, L1D cut down by 75%?
Not any of those. Less absolute resources, more practical resources per thread. This alone could possibly compensate for any IPC loss caused by deeper pipeline, let alone the improvements in other areas. If the cache is actually inclusive, then that alone would compensate for every possible CPU-level change which would reduce IPC even the fiercest Intel fan could think of.
Potential integer throughput of those 2ALU/2AGU says very little about the IPC performance, let alone single-thread performance, or whole product performance. All you'd need is slightly faster cache access and more aggressive prefetching and branch predicting to bring 10 % IPC increase with 10 % penalty on "integer clusters".
What BS? He has already stated both single thread and multithreaded performance are both higher.
Particle's First Rule of Online Technical Discussion:
As a thread about any computer related subject has its length approach infinity, the likelihood and inevitability of a poorly constructed AMD vs. Intel fight also exponentially increases.
Rule 1A:
Likewise, the frequency of a car pseudoanalogy to explain a technical concept increases with thread length. This will make many people chuckle, as computer people are rarely knowledgeable about vehicular mechanics.
Rule 2:
When confronted with a post that is contrary to what a poster likes, believes, or most often wants to be correct, the poster will pick out only minor details that are largely irrelevant in an attempt to shut out the conflicting idea. The core of the post will be left alone since it isn't easy to contradict what the person is actually saying.
Rule 2A:
When a poster cannot properly refute a post they do not like (as described above), the poster will most likely invent fictitious counter-points and/or begin to attack the other's credibility in feeble ways that are dramatic but irrelevant. Do not underestimate this tactic, as in the online world this will sway many observers. Do not forget: Correctness is decided only by what is said last, the most loudly, or with greatest repetition.
Rule 3:
When it comes to computer news, 70% of Internet rumors are outright fabricated, 20% are inaccurate enough to simply be discarded, and about 10% are based in reality. Grains of salt--become familiar with them.
Remember: When debating online, everyone else is ALWAYS wrong if they do not agree with you!
Random Tip o' the Whatever
You just can't win. If your product offers feature A instead of B, people will moan how A is stupid and it didn't offer B. If your product offers B instead of A, they'll likewise complain and rant about how anyone's retarded cousin could figure out A is what the market wants.
informal, some people can't believe their eyes.
Anyway, yeah, read the article at anand. It's quite clear about uarch changes there (and probably the only site other than realworldtech which bothered to make their own diagrams to help better understand). I haven't read the one on realworldtech yet but judging from that pic at post 726, it might be good.
David Kanter over at Real World Tech has a writeup about Bulldozer's uArch.
http://www.realworldtech.com/page.cf...WT082610181333
I figured this didn't have to be posted as a completely new thread.
Last edited by Mechromancer; 08-31-2010 at 07:51 AM.
2 pages is a few?
And finally we got a statement. It's the first time he explicitly mentioned this, and the question was answered, ironically after the one who asked it first was banned...
If he had done so much earlier, we could have saved at least 15 pages of nonsense... Anyway, I'm satisfied with the answer and there is nothing more to ask.
Or the guy who made 15 pages of nonsense could have been willing to read a few articles on the front page before making assumptions (and indirectly accusing people/slides of lying). Two ways of looking at it...
1) As for what percentage improvement could be seen... JF has already said that with 33% more cores, a 50% performance gain in server workloads could be seen. This is the only information JF is willing to share, and unless you hold Intel stock or work for them, I see no reason why you'd press so much for that information... which he already explained he couldn't share, owing to the product being some time away from launch (I assume a good 2 quarters or so...). Personally speaking, AMD wouldn't want Intel to have information on an upcoming product, as it would give Intel an edge and possibly a chance to outmaneuver them. It works the same way in the opposite direction... The only time Intel leaked information (remember C2D) on an upcoming architecture was when AMD was kicking them around left, right and center in all segments of the market... Now if Intel finds out stuff, they could possibly devise a new pricing strategy (given their scale and market share it's easier now) or something else to counter a competitive product. Competitive BD is...
2) IPC is higher...
3) IPC compared to previous AMD architectures is higher... he said as much... and many times over...
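For what it's worth, the "50% more performance from 33% more cores" figure in point 1 can be turned into a rough per-core number. This is a back-of-the-envelope sketch that assumes throughput scales linearly with core count; the function name is just for illustration. Note that the result is per-core throughput, which mixes IPC and clock speed, so it doesn't isolate IPC.

```python
def per_core_gain(total_gain, core_gain):
    """Per-core throughput gain implied by a total gain and a core-count gain.

    Both arguments are fractions (0.50 means +50%). Assumes linear scaling
    with core count, which real server workloads only approximate.
    """
    return (1.0 + total_gain) / (1.0 + core_gain) - 1.0


stated = per_core_gain(total_gain=0.50, core_gain=0.33)
doubled = per_core_gain(total_gain=1.00, core_gain=0.33)

print(f"+50% total on +33% cores  -> {stated:.1%} per core")
print(f"+100% total on +33% cores -> {doubled:.1%} per core")
```

So the stated figure works out to roughly 13% more throughput per core, while doubling total performance on the same core increase would need about 50% more per core.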
To be fair, historically companies that hide information until days before launch tend to have problems with their product, especially the ones with many delays. Even if BD does have a sizeable increase over k10.5, which it should, I honestly don't think it will be enough to compete with Sandy Bridge.
That preview by Intel was a red cape for AMD to charge at, and I'm willing to bet that if they had a better product they would have released their own preview, challenging for the top spot. My guess is that BD will be a fine product, just still not as powerful as Intel's in terms of pure performance. To me it seems to be more about power efficiency, as JF keeps mentioning 50% off 33% more cores. Well, why not 100% off 33% more cores? Because the thermal envelopes would just be too high, not to mention the power draw would be astronomical considering they don't have a working 32nm process.
At least from my perspective, it seems that AMD is done challenging for the top enthusiast performance spot. They seem to have shifted in a new direction, trying to offer the most performance per dollar, especially over the long run when you consider electricity bills. That's quite reasonable, as Intel has spent far more money on its fabrication process and thus has denser, faster caches, which seriously help in applications like Super Pi.
Is the OS aware of the cores sharing resources? If 2 cores of a module have 80% of the performance of two independent cores, when an application is using 2 threads (most of the games, for example) will the OS work on two different modules, or on a single module?
I'm reading the thread, sorry if it has already been answered...
Last edited by Andi64; 08-31-2010 at 10:00 AM.
No one is sure; all JF has said is that AMD is working with MS to devise core utilization order etc.
I would imagine that ideally, for multithreaded tasks you would want the same module because of the shared L2, but for separate tasks you would want different modules because of the performance loss from sharing components.
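The trade-off in the question can be sketched with a toy model. It takes the 80% figure from the question at face value and assumes a thread running alone on a module gets full speed (1.0); it ignores any benefit from the shared L2, and the function and constant names are made up for illustration. None of this describes how Windows or any real scheduler actually places threads.

```python
SHARED_MODULE_FACTOR = 0.80  # from the question: 2 cores in a module ~ 80% of 2 independent cores


def throughput(threads_per_module):
    """Total throughput for a placement, e.g. [2, 0] or [1, 1].

    Each entry is the number of busy threads on one module (0, 1 or 2).
    """
    total = 0.0
    for n in threads_per_module:
        if n == 0:
            continue
        elif n == 1:
            total += 1.0  # assumption: a lone thread owns the shared resources
        elif n == 2:
            total += 2 * SHARED_MODULE_FACTOR
        else:
            raise ValueError("a module runs at most 2 threads")
    return total


same_module = throughput([2, 0])   # both threads packed onto one module
two_modules = throughput([1, 1])   # one thread per module

print(f"same module: {same_module:.1f}, separate modules: {two_modules:.1f}")
```

Under these assumptions, spreading two independent threads across modules wins on raw throughput, while packing them onto one module would only pay off if the shared L2 (or powering down the idle module) makes up the difference.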