No, like I said before, I've run my X2 at 2.0 GHz (with Cinebench 10 x64), and the 1-CPU result was 1905 (versus 1896 for this K10).
I can believe K10 would run rings around K8 at the same clock. The mere doubling of SSE throughput means that a lot of HPC apps will run nearly twice as fast. Core 2 is in many cases nearly twice as fast as the P4 clock for clock, and in HPC in some cases up to 4 times as fast.
The 32-byte fetch means K10 should be a floating-point monster (HPC). The load forwarding capabilities of K8 are quite deficient (none!) compared to Core 2 (load forwarding was already in the Pentium Pro), which means that their inclusion in K10 will give an even bigger boost than Core 2 got from it. Too bad the clock rate is low and the cache is relatively small.
The 32-byte fetch can help with decoding long instructions, but K10 is still limited by its 3-wide pipeline (Core has a 4-wide pipeline). It can't help in legacy code with short instructions.
But Core(tm) features a 64-byte fetch buffer which can help short loops run faster (on any code).
Core(tm) is still better in almost everything related to the memory subsystem.
Quote:
The load forwarding capabilities of K8 are quite deficient (none!) compared to Core 2 (load forwarding was already in the Pentium Pro), which means that their inclusion in K10 will give an even bigger boost than Core 2 got from it. Too bad the clock rate is low and the cache is relatively small.
http://www.xbitlabs.com/articles/cpu...k10.html#sect0
As a result, we see that the memory subsystem of K10 processors has undergone some positive improvements. But we still have to say that it potentially yields to the memory subsystem in Intel processors in some characteristics. Among these features are: the absence of speculative loading at unknown addresses past write operations, lower L1D cache associativity, a narrower bus between L1 and L2 caches (in terms of data transfer rate), a smaller L2 cache and a simpler prefetch. Despite all the improvements, Core 2 prefetch is potentially more powerful than K10 prefetch. For example, K10 has no prefetch at instruction addresses that could keep track of individual instructions, as well as no prefetch from L2 to L1 that could hide L2 latency efficiently enough. These factors can have different effects on various applications, but in most cases they will determine higher performance of Intel processors.
Until I see a reputable site trusted by XS post a full review, it's all speculation. Let's wait and see mature results; enough with the speculation.
1st point. That's a loop detector. I'm interested in pure SIMD FP, where 32 byte fetch should help.
2nd point. AMD has implemented a write burst buffer and real RAM prefetcher into the IMC. Intel probably has better prefetchers. The real hurt is the low launch clockspeed if true.
People seem to forget how badly one early K7 sucked (simply put).
Link: http://firingsquad.com/hardware/k7550preview/page7.asp
It even loses to the K6-3 at the same clock. It has abysmal FPU performance, and the same goes for integer too.
I do:) So few good ones are out there that leaks of the so called bad ones would likely be seen first.
Yet, I'll say what I've said from day one. If AMD had something to show, they or someone friendly to them would have shown it by now. I hope like hell I'm wrong on this one.
The AMD stated working latency is "less than 38 cycles and depends on the clock speed of the northbridge". Higher clock speeds offset the latency as in all processors. The L3 cache is just the shared victim cache for the L2 cache, nothing more. It works very well to reduce latency between RAM<->CPU for the K10, as the larger L2 does in Core 2.
K8 had a 12 stage pipeline, Barcelona a 12 stage, and Core 2 a 14 stage.
K8 L2 latency is 12 clock cycles, Core 2 is 14 and Barcelona is 12.
K8 L2 cache bus width is 128-bit, Core 2 is 256-bit, Barcelona is 128-bit.
SSE engine width of K8 was 64-bit (2 per clock), Core 2 was 128-bit (3 per clock) and Barcelona is 128-bit (2 per clock).
L1+L2 cache latency is 15 cycles for the K8, 17 cycles for Core 2, and 15 for Barcelona IIRC.
Correction: L1+L2 cache access combined latency is median 13 cycles for Core 2 and Barcelona. That's twice as much data in the same time frame accessed by Core 2 due to the double bus width between L2<->Core.
There are many more improvements; of these, the larger stack load and the reordering of loads/stores have the potential to make the most difference. Many of the improvements are identical to what was done from Yonah to Core 2. Many are more specific, and some are even more advanced.
Based on the technicalities, the improvement seems like this:
K8 -> K10 is like Yonah -> Core 2. As I mentioned before, I think it's a retail clock speed yield race, nothing more. We'll wait and see how it pans out in reality.
Coolaler isn't trusted by XS?
From what I recall, the early Kentsfield benchmarks found on that forum were quite accurate, and so were the Core 2 overclocks. Faking a benchmark like this would be really stupid, seeing how easy it would be to prove the forgery once the CPUs went retail.
The results are disappointing taking Intel's current and future (45nm) offerings into consideration, but they are not unreasonable. They show a performance increase of a few percent over the K8.
There will probably be benchmarks where the K10 scores like the K8, and there will probably be benchmarks where the performance delta is bigger than in Cinebench (say circa 20-30%).
http://firingsquad.com/hardware/k7550preview/
Take a good look at that. Those were pretty bad K7 pre-launch benchmarks. And we all know the K7's story don't we?
Better wait a few more days. You never know with AMD, I remember T-Bred A and B. Well that was a surprise too, many thought AMD couldn't get the K7 over 2GHz properly when T-Bred A came out.
I'd say don't worry too much, AMD has the reputation to exceed everyone's expectations. It happened so many times before.
Edit, wow informal, I for sure take tooo long to write a reply.
Anyway, that's the same point I wanted to present here.
Kinda funny to note that both Pentium III and K7 did not reach their pinnacle until they were well in the shade of anticipation for the newer tech. Barton and Tualatin, too few remember ye.
K8 has gone through a lot too, but besides the X2... it hardly feels like a mutated alien compared to its original self.
Guys,
I would like to mention that AMD said that software using the FPU needs a recompile to take full advantage of the new Barcelona FPU, right?
So, running current benchmarks like Cinebench or Super Pi with apparently 'old' code doesn't show the full speedup the K10 would get.
Dunno if Super Pi uses the FPU, but basically a lot of scientific programs use it.
Please correct me if I am wrong here.
I love how everyone is coming up with conspiracy theories. I totally trust Coolaler.
I love how discussions like these bring out the stupid in stupid people.
*sigh*
First off, it's not Coolaler, it's some guy on the site's forum. Secondly, remember the original Athlon previews? Even if somebody credible were to bench the chip, the results would still be bogus if it were bugged. Thirdly, Rahul Sood has stated that the 3GHz Phenom is considerably faster than all of AMD and Intel's current offerings. These benchmarks fly against Rahul's claims, and, quite frankly, I trust Rahul Sood just a bit more than some guy on some forum who's possibly using a bugged chip/mainboard/BIOS to boot!
How about the AM2 previews, they turned out to be correct. Performance decrease vs DDR-400 unless paired with DDR2-800 memory.
Sood also questioned the initial Conroe benchmarks so I doubt he really knows much more than anybody else.
Quote:
Thirdly, Rahul Sood has stated that the 3GHz Phenom is considerably faster than all of AMD and Intel's current offerings. These benchmarks fly against Rahul's claims, and, quite frankly, I trust Rahul Sood just a bit more than some guy on some forum who's possibly using bugged chip/mainboard/bios to boot!
Stupid people huh? Like the person before said, everyone thought AM2 was going to be competitive with Core 2 clock for clock, well well look what happened.
At least this seems more credible than the 30K in 3DMark06 :rolleyes:
And everyone else who believes in that 30K isn't "stupid," right?
Stupid.
Lol man, Sood is the head of HP's gaming department (you've heard of that company HP, haven't you?)
If AMD wants Phenom in HP's systems, they have surely handed a few Phenoms to them and asked them to sign the stuff all the others have signed. Other than breaking NDA, Sood simply said Phenom @ 3 GHz is a "stone cold killer" and that it is and will be considerably faster than any Intel or AMD chip when it is out.
Gentlemen:
I settled the statement.
The next one who quotes that guy and keeps this going on the stupid comment takes a vacation? Is that plain enough? Stop the fighting!
Let's see a concrete review. A review with pictures and an author. I do not trust pics from a foreign nation across the world.
The only hype or benches I will believe will come from a true posted review, written up by an author, with pics of the packaging and CPU. I want to know what chipset it is running on as well as the memory.
I don't trust a few screenshots from across the globe as a reliable performance benchmark.
Also note these are early Opteron K10s, not the Phenoms that will be out later this year, nor the chipset we will run them on, and most likely different RAM speeds/timings.
Thanks.... typically enterprise chips are the cream of the crop. Though there is some ambiguity in the stepping represented here, it doesn't look like CPUID has been updated with the CPUID string/look up table to designate stepping yet... so if this is B1 as some are postulating, then this is the launch stepping is it not?
This is of course proceeding under the assumption that this data is factual and not made up, back alley leaks like these deserve a healthy dose of skepticism.
The BIOS argument is a decent one, but typically BIOS code does not interact to produce a computational result... it provides very low level IO code, most of which OSes simply ignore today. It could be that there are initializations for various functions that are not occurring correctly and such, but I have never really experienced a BIOS update, even a buggy one, affecting performance that much.
Nonetheless, there is way, way too much contradictory information or conjecture. Theo seems to think it is a monster, Hector Ruiz is downplaying the significance, Rahul is certain of great performance, AMD has conceded single thread performance to Intel in various press statements.... so it is hard to know what to make of this.
No, he didn't say that, actually.
This is what he said:
And for the record, if you were to benchmark Phenom at 3GHz you would see that it kicks the living crap out of any current AMD or Intel processor
It's also worth noting here that Rahul has been an AMD fan for some time, predicted good Q1 results for them, and IIRC, has mentioned that he is invested in their stock.
I don't read too much into this, other than to say -- top execs leave for various reasons, and Richard's departure, especially the timing, subjects the event to an enormous amount of speculation.
My underlying point is that we simply are still in the dark. This back-alley bench is the most data we have had to date, and it consisted of Cinebench, CPUMark, and SP1M... we have various people making strong but non-data-backed claims of trouncing this or failing that...
Having said that, this data did originate on Coolaler's site... that name alone attaches a certain degree of credibility to the results, based on past experience and on a reputation some of which was built on this site. However, it is still not sanctioned, and fails some basic principles, such as clearly spelling out all the details necessary to reproduce the results -- and, most importantly, the stepping is ambiguous at best ... it is nonetheless fun to debate.... but this is such a hot topic, I approach this with a high degree of trepidation. Do not underestimate how important this product is to AMD; they are betting the farm on it and, frankly, I prefer to give them the benefit of the doubt.
I think he was getting bored. Being the chief of marketing at a company that doesn't market very much makes little sense. I believe Henri resigned because he wanted to do the best thing for the company, and seeing he had little to offer at the moment, he resigned. Takes a lot of balls to do something like that. :up:
It is hard to determine what stepping this site is showing, if it is real (my little disclaimer :) ).... but there is a hint.
The CPUID string that CPUID read from the BIOS does not read "AMD Opteron processor unknown" or "Engineering Sample (ES)".... AMD uses special registers, which the BIOS reads to construct the processor name and model string, and information.... see for example, page 345 http://www.amd.com/us-en/assets/cont...docs/32559.pdf
The CPUID report from Coolaler's site correctly identifies the processor family as 10, and also does not read the CPUID string as anything other than the branded processor name....
Assuming this is not faked (which it well could be), it appears to be the release product.
Jack
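For anyone who wants to check a CPUID dump by hand, here is a rough Python sketch of how the family/model/stepping fields and the brand string are laid out. It follows the usual AMD convention (extended family is added to the base family only when the base family is 0xF). The function names and the sample value 0x00100F22 are made up for illustration; they are not taken from the Coolaler screenshots.

```python
def decode_cpuid_signature(eax):
    """Split a CPUID leaf-1 EAX value into family, model and stepping."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    if base_family == 0xF:
        # Extended fields only count when the base family is saturated.
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family = base_family
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


def decode_brand_string(regs):
    """Decode the 12 registers from leaves 0x80000002..0x80000004.

    regs is EAX, EBX, ECX, EDX of each leaf in order; every register
    carries four ASCII characters, least significant byte first.
    """
    raw = b"".join(r.to_bytes(4, "little") for r in regs)
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace").strip()


if __name__ == "__main__":
    # Hypothetical signature, chosen only to show a family 10h decode.
    sig = decode_cpuid_signature(0x00100F22)
    print(hex(sig["family"]), hex(sig["model"]), sig["stepping"])
```

With that sample value the family comes out as 0x10, which is the "family 10" a CPUID tool would show for a K10 part. The stepping letter/number mapping (B1, B2 and so on) is not something this sketch can tell you; that needs AMD's revision guide.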
Oh and I PM-ed you :) Before I forget, Rahul Sood was one of the folks calling those early Conroe tests done here by *that guy, fugger, fcg, victor, etc. bogus. He hinted that some of the folks here were being paid. That was the cause of that misunderstanding in that other post. At least 4 webmasters weren't as rude but said similar things.
Saying he is biased is contrary to what he says, his own words.
Love blinds!
Quote:
Originally Posted by Sood
I'll wait for the real release from AMD before I comment on the performance; not that my opinion means anything, anyhow. :D I'd like to see how these chips run on updated BIOS and motherboards (shipping) at speeds of 2.5ghz plus.
Wait and the truth will be revealed.
Man these Phenom threads are getting messy :|
Yes they do.
The K8 core is already strong to begin with. The law of diminishing returns is in full force here.
When you already start high, the improvements you make will bring little advantage.
Compare that with the P4 core, which was far more fragile (code quality was vital, while the K8 eats just about anything). Core brought massive improvements over the P4, so the scores jumped a lot. Even compared to the K8, Core has a huge number of improvements, yet in the end it is only 20% better on average.
There are a lot of situations where a K8 core + 1MB L2 will be as fast as or faster than a K10 core + 512KB L2 + 2MB L3, especially where latency counts. In multithreaded apps this will be more pronounced, as the L3 will get thrashed by different threads.
Obviously not. When you force your Chief SALES and Marketing officer out, you have a replacement ready to announce at the same time as you announce the departure of the old guy.
When someone quits with short notice, a la Henri, you have to say that his functions (head of sales AND marketing) are "reporting in to the office of the CEO" while you scramble to find the replacement.
We do not really know much about the rig that was benched, only speculation.
But the bad BIOS argument seems like it could be true.
I tell you, I once flashed my mobo with a new BIOS and everything looked good, no errors etc. Yet with exactly the same settings, memory bandwidth was a whole 2GB/s lower than with the older one. SuperPi went up 5 sec. Everything was slow. I reflashed the BIOS twice, same dodgy result.
I went back to the older BIOS and it was how it should be.
Sometimes, especially if you don't have a good background or "base score", you can't tell whether the BIOS is a good or a bad one....
Agreed....these fools who insist a BIOS has nothing to do with it need brain transplant. A BIOS can make ALL the difference to a machines performance....and we ALL know this. You'd need to be a complete forehead biter to think otherwise.
Besides....it would be kinda silly to pick PHENOM as a name for a line of CPUs which perform really badly against even older architectures and fail miserably against your direct competitors.
some time ago in an AMD lab
......"Ehhhh hold on guys...this CPU ain't looking too hot now we're approaching completion. I just ran it in CINEBENCH and INTEL's Q6700 scores double!!!! What we gonna do guys?".
"Well (says one of the other guys in the lab)...we could always just release it anyway insisting it's actually better than INTEL's quads".
"Yeah," says another lab hand, ".........and we can call it PHENOM as in PHENOMENAL since it's soooooo CRAP!" :ROTF: :clap:
god... 10 more days till this crap battle is over...
Xs is an international forum with trusted members from all over.
Would you use this logic to dismiss Victor Wang's findings or Pedro Rocha's or Kinc's?
All people from a "foreign country" as you describe it and all people of impeccable credentials.
I'll be very polite and say that this was a very badly worded statement on your part. Where someone is from isn't a factor to their credibility.
More here,
http://www.theinquirer.net/?article=42052
Puts mod hat back on:
Yes it is. The intent is to insult and there is the issue.
Rational adults should be able to get together, discuss a subject from all viewpoints without insulting one another.
Someone can come here and post a brilliant thought, but if that person calls another a moron, etc. in his first sentence then his message is lost.
He's simply lost his credibility in my eyes, as he's shown that he doesn't have the ability to dissect another's argument with facts and logic.
At that point it just turns you off to listening to anything that guy has to say.
I see, but it's never too late to learn something new. :yepp:
If he means in 3DMark, that doesn't contradict the numbers from Coolaler's site. We have seen that K10 improves over K8 by about 7% per core and per clock.
So how does that compare to a 3.0 GHz Core2 if you consider two eight-cores? We have seen (in Cinebench) that Barcelona's disadvantage per clock gets smaller the more processors you count, which is hardly surprising given HyperTransport, the integrated memory controller and NUMA. It has been the same way with K8 in both multiprocessor sockets.
But the Intel Core2 eight-core is a socket 771 system, so no SLI.
So even if the per-core per-clock performance of Barcelona is still lower (which it is, face it), a benchmark involving both multi-processor code and graphics code, with Barcelona using SLI, will easily see K10 beat Core2.
You're right.
The dump I got reveals several problems with the HT link of the CPU. According to the config regs, the CPU HT link width is set to "Link physically not connected".
I don't exactly know if this is possible that the system works in such conditions, however, there's definitely something odd.
Code:
Capabilities
HyperTransport Capability
Offset A0h
Revision 3.00
Interface type Host/Secondary
Device number 0
Link 0 frequency 200MHz
HyperTransport Capability
Offset C0h
Revision 3.00
Interface type Host/Secondary
Device number 0
Link 0 frequency 200MHz
HyperTransport Capability
Offset E0h
Revision 3.00
Interface type Host/Secondary
Device number 0
Link 0 frequency 200MHz
@cpuz: Has this any impact on the performance of the k10?
As much as I despise quoting fudzilla, he posted a rumor that another VP will leave AMD:
http://www.fudzilla.com/index.php?op...=2774&Itemid=1
Yeah, I have seen small improvements here or there, but let's assume the Coolaler data is correct, it appears AMD is somewhere between 15-20% short on IPC relative to Core at least in rendering, CPUmark is general but not a great indicator .... BIOS, even an early beta that gets this rig running, would be a stretch to make up that gap. Which is why I say the BIOS argument is a decent one, but not a good one.
What you see argued in this thread, and an argument that needs to be had in light of all the speculation, is that these results indicate that Barcey is faring worse than expected -- based upon AMD claims and known architectural work.... therefore, the skepticism is rational and the idea is to explain the data in light of the lower than expected performance... BIOS is one argument, not a good one but it is one... disabled features are another good argument, which is why knowing the stepping is so critical.
That somehow fits to what Gary Key from Anandtech (thanks informal) said about early benches.
Interesting at least...
Well, two more Mondays and we will see the real truth ...I can't stand more speculation :(
Hey guys here is even more information from Gary Key. I just thought there wasn't enough speculation so I will add some more ;)
I asked him if the part scales better than linear since he was saying Barcelona needed at least 2.4GHz to show its "true colors". Here is his response:
Also here is his take on the recent benchmarks on the Coolaler forums:
Quote:
Originally Posted by Gary Key, AT Editor
Quote:
Originally Posted by Gary Key, AT Editor
HT3.0 has been moved in the roadmap from 2007 to 2008, the chipset will support HT3.0, but it appears the CPU will not:
http://www.amd.com/us-en/assets/cont...ck_Bergman.pdf
See slide 19.
HT3.0 support in the CPU appears to have been moved to the 45 nm sandtiger product.
So now, this is interesting...
So it means that it's running at only 200 MHz HTT? That gives roughly 1.6GB/s of bandwidth...
Pair this with a non-tweaked BIOS and 667 MHz DDR2 CL5 and you have current K8 performance.
I can imagine that Phenom at the full 1000 MHz HTT, with a good BIOS and DDR2-1066 CL4, should fly...
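A quick way to sanity-check the 1.6GB/s figure is sketched below in Python. It assumes a 16-bit link in each direction and double data rate signalling (two transfers per clock), which is how HyperTransport links are normally described; the dump above does not confirm the actual link width, so treat the width as an assumption. The function name is made up for this example.

```python
def ht_bandwidth_gb_s(clock_mhz, width_bits=16):
    """Rough peak HyperTransport bandwidth in GB/s.

    Assumes double data rate (2 transfers per clock) and the same
    link width in both directions; protocol overhead is ignored.
    """
    transfers_per_s = clock_mhz * 1e6 * 2
    per_direction = transfers_per_s * width_bits / 8 / 1e9
    return {
        "per_direction": per_direction,
        "both_directions": 2 * per_direction,
    }


if __name__ == "__main__":
    for mhz in (200, 1000):
        print(mhz, "MHz:", ht_bandwidth_gb_s(mhz))
```

Under those assumptions 200 MHz works out to 0.8 GB/s each way, so the 1.6GB/s quoted above would be both directions added together, and 1000 MHz gives five times that.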
You're right! But if you follow that presentation, you will find on page 20 that the Pinwheel platform, based around Athlon X2 AM2+ and AMD690G, supports HT3.0. Weird, isn't it??? :confused:
Either the person responsible for these slides was drunk or heavily misinformed :p:
At the moment I'm still thinking Phenom will have HT3.0 but not Barcelona....
Hope to see the real deal soon.... I'm sick of all the speculation going around about Barcelona
http://forum.coolaler.com/showpost.p...&postcount=125
Harpertown 2GHz numbers for comparison!
23.3s vs 39.7s for Pi
20.48 vs 17.13 in "relative speed" for Fritz
2454 vs 1896 in single cpu cinebench
15334 vs 13295 in multi cpu cinebench
6.25x vs 7.01x cinebench scaling (78 vs 88 percent)
10.4s vs 10.6s for wprime 32m
310.9s vs 327.4s for wprime 1024m
Any near direct 2GHz Clovertown/dual socket F for comparison? All else I have is http://www.xtremesystems.org/forums/...3&postcount=34
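The scaling and per-clock figures in that list can be reproduced from the raw Cinebench scores. The Python sketch below assumes both rigs are 8-core (two quad-core sockets), which is what the 78 and 88 percent figures imply; the function name is just for illustration.

```python
def cinebench_scaling(single, multi, cores):
    """Return (speedup, efficiency) of a multi-CPU run over a 1-CPU run."""
    speedup = multi / single
    return speedup, speedup / cores


# Scores quoted above, both at 2 GHz; 8 cores per rig is an assumption.
harpertown = cinebench_scaling(2454, 15334, cores=8)
k10 = cinebench_scaling(1896, 13295, cores=8)

# Single-thread score of the K10 relative to Harpertown at the same clock.
k10_vs_harpertown_1cpu = 1896 / 2454

if __name__ == "__main__":
    print("Harpertown: %.2fx, %.0f%%" % (harpertown[0], harpertown[1] * 100))
    print("K10:        %.2fx, %.0f%%" % (k10[0], k10[1] * 100))
    print("K10 1-CPU score relative to Harpertown: %.3f" % k10_vs_harpertown_1cpu)
```

So the K10 rig scales better across 8 cores, but its single-thread score is a bit under 80% of Harpertown's at the same clock, which is why the multi-CPU total still ends up lower.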
That's a good thing :) Computers have always been about the whole system. If K10 isn't faster processor vs. processor but the whole system kicks Intel's a$$, then I'd want AMD's best. This was the same argument I made when I tried to tell nn that there's more to a great PC than a speedy processor alone. There shouldn't be any *if* excuses for AMD or Intel. The processor is just one part of a system and the whole system should be judged as such!
Please don't forget, Intel does have V8, which was supposed to have both CF and SLI. It seems that AMD-ATI and nVidia are fighting Intel on support of both techs.
http://img.coolaler.com.tw/images/jj...mckdzozmt1.jpg
15 cycle L2, vs. a 12 cycle L2 on Penryn (for 6 TIMES the L2 cache per core)
45 cycle L3.
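To put those cycle counts in wall-clock terms, here is a small Python sketch. It assumes the 2.0 GHz core clock mentioned earlier in the thread and that the tool reports every latency in core clock cycles; the L3 may well sit in a different clock domain, so the L3 number is only a rough guide.

```python
def cycles_to_ns(cycles, clock_ghz):
    """Convert a latency in clock cycles to nanoseconds."""
    return cycles / clock_ghz


if __name__ == "__main__":
    clock_ghz = 2.0  # assumed core clock of the benched sample
    for name, cycles in (("L2", 15), ("L3", 45)):
        print(name, cycles_to_ns(cycles, clock_ghz), "ns")
```

The same cycle count shrinks in nanoseconds as the clock goes up, which is the sense in which higher clock speeds offset cache latency.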
What, rubbed a sore spot on ya'?
Sure, so does the sun, flash bulbs and something like me punching in the eye LOL!:rofl: Even after K10 hits the market, there will still be plenty of lies told. See the AMDZone after C2D? See [H]'s game review of C2D. Folks who lied on C2D reviews will NOT be trusted. They'll more than likely lie for K10 if it doesn't do so well.
No nn, I'll wait for Fugger, Victor, MovieMan and others here who I trust and not Ad-Whorrrresss who'll say anything for hits and Ad-Dollars.
The L2 latency has gone up or down for Barcey relative to Brisbane depending on how you look at it... I think CPUI latency measurements put Brisbane as high as 20 cycles, but Lostcircuits did an analysis that put L2 latency on Brisbane at 14 cycles...
AMD added 2 more metal layers to Barcey, so equivalent to slightly better is expected. No wonder AMD/IBM are pushing for ultralow-K for 45 nm. Nice dig.
The thread scaling here is to be expected, and this is also spanning two sockets... so in a single socket comparison (4 core v 4 core), the scaling difference between the two will be smaller, but Barcey should still be better overall at that..... (again, proceeding from the assumption that this is production silicon and not older ES samples).... the problem is that even with slightly better thread scaling, absolute clock for clock performance will not surpass Intel's....
The best hope is that there is something wrong with this particular sample/testing.
Oh trust me, there will be arguing :ROTF:
Just wait till the upgrade threads, when someone wants to spec something out that isn't the fastest but is in their price range. It happened in the AXP and A64 days, as well as the C2D days.
I'm still seeing the "Don't waste your money, buy a Conroe!" replies in AMD threads, though not nearly as much as 6 months ago. The same goes for AMD benching; some 4GHz Conroe guy always comes along and bashes it.
And to the one dude who said that HT3.0 was pushed to 2009, he is wrong. All Phenoms are HT3.0 ready, and the boards (RD790) support HT3.0 (16-bit link width, 5.2GT/s). The chips CAN and will work (100%) on AM2 motherboards, and the HT scales back to HT1.0 (what this does to performance is not clear yet).