Me too.
I would think it is clocked much higher than the rest of the L2. The higher the L1 throughput, the less it holds back execution, so L1 accesses need to be as fast as reasonably possible. The ~22 GB/s of (apparent) L1 write is only about 1/6 of the ~130 GB/s of L1 read... So you can store the results of a computation at only 1/6 the speed at which you read your data. True, there is usually more input data than results, but the input-to-result ratio is not always 6:1 or higher. There can even be more results than input data.
The buffer is nice, but it is just a buffer, not a part of L1. It just helps to bundle the data into bigger chunks so that some accesses to the L2 can be avoided; nevertheless, you still write into the L2.
Of course, it all depends on the size of this WCC as well, and thus on how often we run out of it. I hope it's reasonably sized. (The rather low number in AIDA64 could also come from the fact that it writes 64 KB [the size of L1D], while the WCC is certainly smaller.)
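To put the read/write asymmetry above in numbers, here is a rough back-of-the-envelope sketch. It uses only the two approximate bandwidth figures quoted in this post and assumes, purely for simplicity, that reads and writes do not overlap in time. The function names are made up for illustration, and real hardware will behave differently.

```python
# Approximate figures quoted in this thread; not measured here.
L1_READ_GBPS = 130.0
L1_WRITE_GBPS = 22.0


def write_to_read_ratio(read_gbps=L1_READ_GBPS, write_gbps=L1_WRITE_GBPS):
    """Write bandwidth as a fraction of read bandwidth."""
    return write_gbps / read_gbps


def write_time_share(input_bytes, output_bytes,
                     read_gbps=L1_READ_GBPS, write_gbps=L1_WRITE_GBPS):
    """Fraction of total transfer time spent writing results.

    Simplifying assumption: reads and writes are serialized (no overlap).
    """
    read_time = input_bytes / (read_gbps * 1e9)
    write_time = output_bytes / (write_gbps * 1e9)
    return write_time / (read_time + write_time)


if __name__ == "__main__":
    print(f"write/read bandwidth ratio: {write_to_read_ratio():.3f}")
    # Compare a few input:output size ratios.
    for inp, out in [(6, 1), (2, 1), (1, 1), (1, 2)]:
        share = write_time_share(inp, out)
        print(f"input:output = {inp}:{out} -> writes take {share:.0%} of the time")
```

Under this simple model, writes already take about half of the transfer time at an input:output ratio of roughly 6:1, and they dominate as soon as the results are not much smaller than the inputs.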




