Interesting reads here, but 3 questions:
1.) Why would AMD do such a thing? (You wrote "for better clock-scaling"; does that mean higher clocks, or better scaling?)
2.) Can it be optimized further?
3.) This somewhat contradicts "Write-through means that every write to the cache causes a synchronous write to the backing store. Because L2 is slower than L1, L1 must wait for L2 to write out data." Or do you mean this can be due to the WCC? I am not a professional, but I would guess the WCC doesn't take that long to preserve coherence... If anything at all, I'd guess the L1D waiting for L2 can be a problem, but if stuff is written to the L1D fast and the WCC writes to L2 some cycles later, I guess there will only be rare cases where "The L3 is a non-inclusive victim cache (L2 data are evicted to the L3) with data transferred from L3 to the L1D of the expected core without being copied to the L2." really matters... And then there is your point: "The L1 is in write-through, but you're not sure that data not in L2 is not in the L1D of another core."
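To make my question 3 concrete, here is a toy model of the difference I mean. All latencies and the coalescing behavior are made up for illustration (not real hardware numbers): a strict write-through L1D pays the L2 write latency on every store, while a WCC-style coalescing buffer merges stores to the same cache line and only pays the L2 latency once per dirty line.

```python
# Toy cost model (made-up cycle counts, NOT real hardware numbers) comparing
# a strict write-through L1D with a write-coalescing buffer (WCC-style).

L1_LATENCY = 1   # cycles per store landing in L1D (assumption)
L2_LATENCY = 12  # cycles per write reaching the slower L2 (assumption)
LINE = 64        # bytes per cache line

def strict_write_through(addresses):
    # Every store synchronously updates L1D *and* L2,
    # so each store waits the full L2 latency.
    return sum(L1_LATENCY + L2_LATENCY for _ in addresses)

def coalescing_write_through(addresses):
    # Stores land in L1D immediately; the coalescing buffer merges
    # stores to the same line and flushes each dirty line to L2 once.
    cycles = len(addresses) * L1_LATENCY
    dirty_lines = {addr // LINE for addr in addresses}
    return cycles + len(dirty_lines) * L2_LATENCY

# 16 consecutive 4-byte stores all hit one 64-byte line:
stores = [i * 4 for i in range(16)]
print(strict_write_through(stores))      # 16 * (1 + 12) = 208 cycles
print(coalescing_write_through(stores))  # 16 * 1 + 1 * 12 = 28 cycles
```

So in this toy model the coalescing case is only slow if writes are scattered across many lines; that is why I'd guess the WCC hides most of the write-through penalty in the common case.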
I guess it will first speed up things, and then "data transferred from L3 to the L1D of the expected core without being copied to the L2" will need some time. To my eyes it all depends on how this stuff is used; it can be faster in one case and slower in another... Hmm, somewhere we came to that conclusion before: "snoop traffic in order to keep the coherency correct".
Maybe the problem is this: one core might need to wait for the WCC to complete to ensure coherency? As you said: "The L1 is in write-through, but you're not sure that data not in L2 is not in the L1D of another core."
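Just to check I understand the quoted L3 description, here is a tiny sketch of my reading of it (a hand-rolled illustration, not a verified model of the actual chip): lines evicted from L2 become victims in L3, and an L3 hit fills the requesting core's L1D directly without being copied back into L2, which is exactly why a line can sit in an L1D but not in L2.

```python
# Toy sketch of a non-inclusive victim L3, per the quoted description
# (my interpretation for illustration; sizes, ways and timing ignored).

l1d = set()  # lines in this core's L1D
l2 = set()   # lines in this core's L2
l3 = set()   # shared victim L3

def evict_from_l2(line):
    # Victim behavior: data evicted from L2 moves into L3.
    l2.discard(line)
    l3.add(line)

def load(line):
    if line in l1d:
        return "L1D hit"
    if line in l2:
        l1d.add(line)
        return "L2 hit"
    if line in l3:
        l1d.add(line)  # filled straight from L3...
        # ...without being copied into L2 (per the quoted text),
        # so afterwards the line lives in L1D but not in L2.
        return "L3 hit, bypassing L2"
    l1d.add(line)
    l2.add(line)
    return "memory fill"

l2.add(0x40)
evict_from_l2(0x40)
print(load(0x40))   # "L3 hit, bypassing L2"
print(0x40 in l2)   # False -> a line can be in an L1D yet absent from L2
```

If that reading is right, it would explain why coherency checks cannot rely on L2 contents alone and need the snoop traffic mentioned above.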
Ahhhh btw... CONGRATZ FOR THE NEW WR!




