
Originally Posted by
JumpingJack
Informal, be careful and read Kanter's note carefully. This is beginning to come into focus... it was a good link... but you may be misinterpreting what Kanter is saying...
I am, myself, trying to understand this at a level of detail that makes sense; this is much more complicated than what we are assuming...
Each core (that is, the execution core and its dedicated cache) will be clocked independently, a major power-saving feature of K10. So in order to share a cache at the L3 level, it will need to send data asynchronously to differently clocked cores... wow, this is complicated... So what AMD has done (per Kanter) is build a 'translator', or FIFO buffer, to send data to and from the L3. This is not the same as dynamically adjusting the L3 clock or latency; what it is doing is dynamically adjusting a clock divider to sync the L3 with variable-speed cores. Now this variable L3 latency makes much, much more sense.
Any asynchronous communication will incur extra latency (over a simple 1:1) simply as a result of clock mismatch... this is a given... (this is why C2D shows a dip in performance going from DDR2-533 to DDR2-667 to DDR2-800, as dividers beyond 1:1 introduce extra latency).
So with this understanding, the observed latency (which is actually the important part) will be variable, not because the L3 cache itself has variable latency but because it has to be synchronized through the FIFO buffers to cores running at variable clocks.
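To put rough numbers on that idea, here is a toy model in Python. It is only a sketch under my own assumptions, not AMD's actual design: the function names, the two-cycle synchronization cost per FIFO crossing, and all clock speeds are made up for illustration. The L3 access itself takes a fixed number of L3 clocks; only the core clock changes.

```python
def observed_latency_ns(core_mhz, l3_mhz, l3_cycles, sync_cycles=2):
    """Toy estimate of L3 latency as seen by one core, in nanoseconds.

    Assumptions (hypothetical, for illustration only):
    - the L3 access takes a fixed `l3_cycles` of the L3 clock
    - the request pays `sync_cycles` of the L3 clock entering the FIFO
    - the reply pays `sync_cycles` of the core clock leaving the FIFO
    """
    core_period_ns = 1000.0 / core_mhz
    l3_period_ns = 1000.0 / l3_mhz
    request_sync = sync_cycles * l3_period_ns    # core -> L3 crossing
    l3_access = l3_cycles * l3_period_ns         # fixed L3 array latency
    reply_sync = sync_cycles * core_period_ns    # L3 -> core crossing
    return request_sync + l3_access + reply_sync


def observed_latency_core_cycles(core_mhz, l3_mhz, l3_cycles, sync_cycles=2):
    """Same latency, expressed in cycles of the requesting core."""
    core_period_ns = 1000.0 / core_mhz
    return observed_latency_ns(core_mhz, l3_mhz, l3_cycles, sync_cycles) / core_period_ns


if __name__ == "__main__":
    # Same L3 (2000 MHz, 20 cycles), cores at different clocks.
    for core_mhz in (2000, 1000):
        ns = observed_latency_ns(core_mhz, 2000, 20)
        cycles = observed_latency_core_cycles(core_mhz, 2000, 20)
        print(f"core @ {core_mhz} MHz: {ns} ns, {cycles} core cycles")
```

In this model the L3 never changes, yet a core at 2000 MHz sees 12 ns (24 core cycles) while a core at 1000 MHz sees 13 ns (13 core cycles), so the measured latency moves with the core clock even though the cache latency is fixed.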
Damn, I should have paid more attention to Kanter's article too...
Guys I am learning a lot here... thanks.
Jack