
Originally Posted by
drfedja
The old slides were probably based on CPU simulation, but in real-world scenarios the BD microarchitecture behaves very differently. Maybe some of the advanced features are disabled because of a serious bug, and that carries a performance penalty. I can't wait to see the BD errata list. I can't believe the cache architecture alone explains the low performance, because they could have simulated how much of a penalty comes from the smaller 16 KB L1D and the write-through (WT) cache policy. The hit rate of a 4-way 16 KB cache is only 1 or 2% lower than that of a 2-way 64 KB cache. A 2 MB L2 also has a much higher hit rate than a 0.5 MB L2. Maybe BD is better optimized for large working sets.
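For anyone who wants to poke at the hit-rate comparison themselves, here is a rough sketch of a set-associative LRU cache model. Everything in it is an assumption for illustration: the 64-byte line size, the LRU replacement, the `hit_rate` and `sweep` helper names, and the synthetic sweep trace. It does not model write-through behavior, prefetching, or anything else specific to real hardware, so the numbers it gives say nothing about actual BD performance. How far apart the two configurations land depends entirely on the access pattern you feed it.

```python
from collections import OrderedDict


def hit_rate(addresses, size_bytes, ways, line_bytes=64):
    """Fraction of accesses that hit in a set-associative LRU cache.

    Toy model only: no write policy, no prefetch, and an assumed 64-byte line.
    """
    num_sets = size_bytes // (line_bytes * ways)
    # One OrderedDict per set; insertion order doubles as LRU order.
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    total = 0
    for addr in addresses:
        total += 1
        line = addr // line_bytes
        cache_set = sets[line % num_sets]
        if line in cache_set:
            hits += 1
            cache_set.move_to_end(line)  # mark as most recently used
        else:
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict least recently used
            cache_set[line] = True
    return hits / total if total else 0.0


def sweep(working_set_bytes, passes, line_bytes=64):
    """Hypothetical trace: walk a working set line by line, several times."""
    one_pass = list(range(0, working_set_bytes, line_bytes))
    return one_pass * passes


if __name__ == "__main__":
    for ws_kb in (8, 32):
        trace = sweep(ws_kb * 1024, passes=2)
        small = hit_rate(trace, 16 * 1024, ways=4)
        large = hit_rate(trace, 64 * 1024, ways=2)
        print(f"{ws_kb} KB sweep: 16K 4-way = {small:.2f}, 64K 2-way = {large:.2f}")
```

A linear sweep is close to a worst case for LRU once the working set exceeds the cache, so it exaggerates the gap. To test the 1 or 2% figure in any meaningful way you would need to replace `sweep` with an address trace captured from a real workload.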