K8L, just like K8, is going to be featured in a few sockets.Quote:
Originally Posted by awhir
And like always, any socket that it fits into (without the removal or bending of pins) will work, though not always at optimal efficiency.
Interesting:) ....I have to get more info on this AMD stuff.
The way K10 will achieve its 40% advantage:
http://img201.imageshack.us/img201/8...taskingzq7.png
:)
We've already had confirmation that 40% overall is simply not happening
Randy Allen and Patrik Patla (AMD directors) told us about 40 per cent, and suddenly brentpresley appears and tells usQuote:
that 40% overall is simply not happening
Take a better look at the picture: the 40% advantage is for a heavy multitasking environment; the 10% is for single-threaded applications.
Funny, in my favorite programming class the teacher whipped code optimization and code splitting into us: knowing when to use integer approximation and when to do massively parallel floating point. Fun class, but definitely not for beginners.Quote:
Originally Posted by brentpresley
Everyone that owns a C2D depends on how well it OCs!!! ... they'd be 30%-50% less powerful if they didn't scale as well.Quote:
Originally Posted by brentpresley
Well....how many of you power users OC your C2D's? 100% of you?
When you start talkin' OC's then the performance gap will only grow further.
We all know a C2D HAS to be overclocked in order to attain the performance levels everyone talks of, so why should it be any different for AMD?
All we can go off are performance estimates and a speculative 10% from s7's under-NDA guy.
If that is the only figure we know of, then we can probably expect around 15% more speed than a C2 (clock for clock). Until other figures are released, it is nothing but a pointless argument.
Agreed :)Quote:
Originally Posted by brentpresley
Hear, hear :)Quote:
Originally Posted by Motiv
Just to finish the SSE FUD that somehow got started: look at C2D vs. CD. Is the C2D like 6x faster? C2D has 6x higher potential SSE throughput, but it's not really that much of the total code that's SSE.
Less dreaming, more reality please. x87-to-SSE patches for games don't even bring that much.
And SSE is still widely missing in many places... MS tries to force this with no x87 in 64-bit, but mandatory SSE. However... don't dream... SSE is a nice boost, but no miracle. It's more a matter of cleaning up the stupid x87 and getting it removed from the CPU over time.
Absolutely, and fortunately a well-made and well-documented program can be updated rather quickly. I remember helping on a project to convert an audio encryption routine from integer to SSE3; it took a couple of days, but the performance boost was huge.
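Not that project, obviously, but for anyone curious what this kind of conversion looks like in practice, here is a minimal sketch: a scalar loop over 16-bit audio samples next to the same loop rewritten with SSE intrinsics. I've used plain SSE2 XOR for simplicity; the function names, the XOR-keystream "encryption" and the sample format are all just assumptions for illustration.
Code:
/* Hypothetical example: "encrypt" 16-bit audio samples by XORing them
 * with a keystream.  Scalar version vs. SSE2 version of the same loop. */
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stdint.h>
#include <stddef.h>

void scramble_scalar(int16_t *samples, const int16_t *key, size_t n)
{
    for (size_t i = 0; i < n; i++)
        samples[i] ^= key[i];                      /* one sample per iteration */
}

void scramble_sse2(int16_t *samples, const int16_t *key, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {                   /* eight samples per iteration */
        __m128i s = _mm_loadu_si128((const __m128i *)(samples + i));
        __m128i k = _mm_loadu_si128((const __m128i *)(key + i));
        _mm_storeu_si128((__m128i *)(samples + i), _mm_xor_si128(s, k));
    }
    for (; i < n; i++)                             /* scalar tail */
        samples[i] ^= key[i];
}
The vector version touches eight samples per instruction, which is where that kind of boost comes from - but as the quoted post says, it only helps for the fraction of the run time actually spent in loops like this.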
Quote:
Originally Posted by brentpresley
So it's ultimately about how important performance is to you.
the clue is "estimates"
I hope it's at least half true ;) it would still give C2D a run for its money.
I definitely agree with you there. Heck, take ten seconds to look at Microsoft source code and you'll wonder how the hell they got it to run. Some of them just seem to love their "goto" statements. But I must admit their binary interfaces and the assembly they use for them are extremely well made.Quote:
Originally Posted by brentpresley
Unfortunately the technically skilled aren't the ones writing the most code.
And if you really want to see a 300% speed increase, transcribe .Net programs to pure C code. Talk about a huge improvement.
Some interesting bits:
Quote:
A 65nm silicon-on-insulator process is used for producing the near-450-million-transistor device, with dual stress liners, and a silicon-germanium process is used to speed up the pFETs. Eleven layers of copper and low-k dielectrics connect the device.
At 95 degrees Celsius, modelling suggests the processor will run at between 2.2 and 2.8GHz at 1.15 volts. Each of the four cores includes eight temperature sensors. The on-chip northbridge contains a further six.
The memory interface is 400 to 800Mbps from a 1.7 to 1.9 volt supply for DDR2, and 800 to 1,600Mbps from 1.4 to 1.6 volts for DDR3.
The HyperTransport interface supports legacy HT1 and HT2 modes as well as HT3 at 2.4Gbps, with a peak of 5.2Gbps.
Source: http://www.edn.com/article/CA6415782.html?partner=enews
Enjoy!
:)
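Side note on those memory numbers: the per-pin data rates above are what turn into the aggregate bandwidth figures quoted later in the thread (e.g. 12.8GB/s for a single on-die controller). A quick back-of-the-envelope in C, assuming the usual dual-channel 2x64-bit (128-bit total) integrated memory controller - theoretical peak only, not sustained throughput:
Code:
/* Peak-bandwidth arithmetic for the quoted DDR2/DDR3 data rates.
 * Assumes a 128-bit (dual-channel) memory interface; illustrative only. */
#include <stdio.h>

int main(void)
{
    const double bus_bytes = 128.0 / 8.0;                  /* 16 bytes per transfer */
    const double rates_mtps[] = { 400.0, 800.0, 1600.0 };  /* MT/s, from the article */

    for (int i = 0; i < 3; i++) {
        double gb_per_s = rates_mtps[i] * 1e6 * bus_bytes / 1e9;
        printf("%4.0f MT/s x 16 bytes = %.1f GB/s peak\n", rates_mtps[i], gb_per_s);
    }
    return 0;   /* DDR2-800 works out to 12.8 GB/s, the figure quoted below */
}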
But it's not slower; sometimes it's faster, sometimes it's slower depending on the application, just like the K8. Overall, it still remains the fastest 64-bit x86 processor available today.Quote:
Originally Posted by LOE
We see one or two unrealistic scenarios where this happens, and they require specific situations that benefit from the Quad FX's additional memory controller. However, in a single-socket system, the desktop versions of Barcelona will only have 1 memory controller and 12.8GB/s of memory bandwidth.Quote:
Heavy multithreading - we already see Quad FX running inferior chips outperforming Core 2 Quad in heavy multithreaded scenarios; that gap will only grow bigger when K10 comes out.
Most other heavy multi-threaded scenarios have the QX6700 beating the Quad FX just as easily as it does in single-threaded scenarios.
A C2D can execute 1 128-bit multiply, 1 128-bit add plus a load, store and jump in the same cycle.Quote:
Are you sure? :nono: C2D can process one 128-bit SSE instruction per cycle; do you mean the Pentium has a 21.33-bit (128/6) SSE engine? :rofl:
:up: :party3:Quote:
Originally Posted by Lightman
All CPUs speed up in 64-bit due to the larger number of registers and the standard SSE2 instructions.Quote:
Originally Posted by accord99
But Core2 does not speed up as much as K8 since MacroFusion doesn't work in long mode.
On SSE execution K10 has a small advantage.
Core2 has 3 SSE units plus one load unit and one store unit.
K8 has 3 FPUs (which execute SSE) plus a load/store unit that does two loads/stores per cycle; on K10 the FPUs are widened to 128 bits, so it can do 3 128-bit SSE ops per cycle plus 2 loads/stores.
So Core2 does 3 SSE ops, 1 load and 1 store per cycle. K10 does 3 SSE ops plus either 1 load and 1 store, 2 loads, or 2 stores.
http://www.xbitlabs.com/articles/cpu...amd-k8l_5.html
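To make those per-cycle numbers a bit more concrete, here is the kind of loop the argument is really about: a single-precision y += a*x kernel written with SSE intrinsics. Each unrolled step wants one 128-bit multiply, one 128-bit add, two loads and a store, so the execution-unit and load/store limits listed above map almost directly onto its peak throughput. Purely an illustration - the function name and structure are mine, not from either vendor's documentation.
Code:
/* Illustration: an SSE "y += a*x" loop whose steady-state rate is bounded
 * by how many 128-bit SSE ops, loads and stores the core can issue per cycle. */
#include <xmmintrin.h>   /* SSE intrinsics */
#include <stddef.h>

void saxpy_sse(float *y, const float *x, float a, size_t n)
{
    __m128 va = _mm_set1_ps(a);                      /* broadcast the scalar */
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);             /* load  */
        __m128 vy = _mm_loadu_ps(y + i);             /* load  */
        vy = _mm_add_ps(vy, _mm_mul_ps(va, vx));     /* mulps + addps */
        _mm_storeu_ps(y + i, vy);                    /* store */
    }
    for (; i < n; i++)                               /* scalar tail */
        y[i] += a * x[i];
}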
Quote:
Originally Posted by savantu
I don't know why 0.9 to 1.2 keeps sticking in my head, but on some code bases, yes, the P4 could do that 0.9 to 1.2 (some apps within the SPECint bench showed numbers this high):
http://www.princeton.edu/~jdonald/re...uck_pact03.pdf
(EDIT: it was reading this paper some time ago that stuck 0.9 to 1.2 in my head, because my first thought was wow... a P4 can actually do that :) )Quote:
The benchmarks that perform best in this environment are mcf, art and swim at 93%, 97% and 98% of peak respectively. eon and wupwise have relatively high instruction throughput of 0.9 and 1.2 IPC respectively, while mcf and swim have relatively low IPCs of .08, .2 and .4 (all IPCs measured in ops). Not unexpectedly, then, those applications with low instruction throughput demands due to poor memory performance are less affected by the statically partitioned execution resources. See Figure 1 for a summary of results from these runs.
IPC, of course, is very code dependent (compiler optimizations, instruction ordering, etc.) and depends on how efficiently the architecture extracts ILP, combined with all sorts of other factors. The truth is I have looked over probably half a dozen to a dozen papers where IPC is measured/calculated; HT helps, and I have seen IPC as high as 1.6 on some code bases. However, the original point stands that it really, really stunk in a general sense... a long pipeline running code not optimized for that situation will generally crater the efficiency.
Another example of how well and how poorly the P4 can do IPC-wise:
http://www.geocities.com/ykchen913/p...ions/CAECW.pdf
In H.264, the IDCT chain could get as high as 1.16 (see Table 4). This is a good paper, as it also shows that FSB utilization on a P4 is quite low even with a high L2 miss rate... this is on a 533MHz FSB... and multimedia is likely to place the highest demand on the FSB.
Anyway, I do believe C2D is significantly higher than 1.0 IPC on average (some workloads will be low, of course, others high), but I have not found any studies or data that have measured it.
Barcelona appears to be heading for a good IPC boost; achieving something higher than C2D will be a true accomplishment, since C2D already did a good job in this department. I am anxious to see the data.
Jack
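For anyone wondering where numbers like those come from: IPC is just retired instructions divided by core clock cycles, and on real hardware both counts come from the CPU's performance counters. As a rough sketch of the idea, here's a minimal Linux perf_event example - a much more recent interface than the tools of that era, and the dummy workload, counter selection and stripped error handling are all mine:
Code:
/* Sketch: measure IPC of a code region via Linux perf_event
 * (instructions retired / core cycles).  Error handling trimmed. */
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <asm/unistd.h>
#include <sys/ioctl.h>
#include <string.h>
#include <unistd.h>
#include <stdint.h>
#include <stdio.h>

static int open_counter(uint64_t config)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = config;
    attr.disabled = 1;           /* start stopped, enable around the region */
    attr.exclude_kernel = 1;
    return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}

int main(void)
{
    int fd_ins = open_counter(PERF_COUNT_HW_INSTRUCTIONS);
    int fd_cyc = open_counter(PERF_COUNT_HW_CPU_CYCLES);

    ioctl(fd_ins, PERF_EVENT_IOC_ENABLE, 0);
    ioctl(fd_cyc, PERF_EVENT_IOC_ENABLE, 0);

    volatile double x = 0.0;                         /* stand-in workload */
    for (long i = 0; i < 50000000; i++)
        x += (double)i * 1.0000001;

    ioctl(fd_ins, PERF_EVENT_IOC_DISABLE, 0);
    ioctl(fd_cyc, PERF_EVENT_IOC_DISABLE, 0);

    uint64_t ins = 0, cyc = 0;
    read(fd_ins, &ins, sizeof(ins));
    read(fd_cyc, &cyc, sizeof(cyc));
    printf("instructions=%llu cycles=%llu IPC=%.2f\n",
           (unsigned long long)ins, (unsigned long long)cyc,
           cyc ? (double)ins / (double)cyc : 0.0);
    return 0;
}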