Intel has built strong data prefetchers into its CPUs (at both the L1 and L2 cache levels). It also offers development tools that help optimize data layout for those prefetchers. With them you can get a higher hit rate in the L1 and L2 caches, provided the data is well ordered.
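To make "ordered well" a bit more concrete, here is a rough sketch in C. It is only an illustration under my own assumptions: the function names (sum_in_order, sum_by_index, make_shuffled_order) are made up for this post and have nothing to do with Intel's tools. Both sum functions add up the same array, one walking it in memory order and one following an index table. With a shuffled table the addresses jump around, which is the kind of pattern a hardware prefetcher generally cannot predict.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum in memory order: consecutive addresses, the easy case for a
 * hardware prefetcher. */
uint64_t sum_in_order(const uint32_t *data, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += data[i];
    }
    return total;
}

/* Sum the same elements, but in the order given by an index table.
 * With a shuffled table the next address is hard to predict. */
uint64_t sum_by_index(const uint32_t *data, const size_t *order, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += data[order[i]];
    }
    return total;
}

/* Hypothetical helper: fill order[] with a pseudo-random permutation of
 * 0..n-1 (Fisher-Yates driven by a small LCG; for illustration only,
 * not a high-quality shuffle). */
void make_shuffled_order(size_t *order, size_t n, uint32_t seed) {
    assert(order != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        order[i] = i;
    }
    uint32_t state = seed;
    for (size_t i = n; i > 1; i--) {
        state = state * 1664525u + 1013904223u;  /* LCG step */
        size_t j = (size_t)(state >> 8) % i;     /* pick j in [0, i) */
        size_t tmp = order[i - 1];
        order[i - 1] = order[j];
        order[j] = tmp;
    }
}
```

Both functions return the same total, so any speed difference comes purely from the access order. I would expect the shuffled walk to be slower once the array no longer fits in cache, but I am not claiming any numbers here; you would have to time it on your own machine.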
This prefetching comes in handy in applications used for performance tests, or when games are only drawing vertices, but most real-world applications don't get that much advantage from these prefetchers.
Six cores are much better than four, even if Intel has some tricks (an 80% market share makes it a lot easier to add special tricks) that make those four cores work a bit harder by not having them wait for data as much as AMD's cores do in some scenarios.