WCC is a joke in the current BD implementation and cannot make up for the massive loss that comes from the L1D. The entire caching system is lowering the performance of the µarch. The L3 is a non-inclusive victim cache (data evicted from the L2 goes to the L3), with data transferred from the L3 to the L1D of the requesting core without being copied to the L2. That means high snoop traffic to keep coherency correct, and snoop traffic is something really unwanted from a bandwidth/performance POV. There is a paradox here: the L1 is write-through, but you can't be sure that data which is not in the L2 is not in the L1D of another core.
Bullsh1t.



