Quote Originally Posted by Origin_Unknown View Post
HT is still a fake trick imho....

flame me all you like

How is it a fake trick? I mean, I guess it "tricks" the task manager. But as far as being a "trick", it's no more a trick (as another poster pointed out) than out-of-order execution, branch prediction, or basically any other component of a modern x86 processor. By that logic, it's "not fair" to decode those big x86 instructions down into micro-operations, and it's "not fair" that AMD uses an integrated memory controller. Cheap tricks, all of them!

Customers care about the raw processing power of a CPU. Hyper-Threading is a great way to increase potential performance at a relatively low cost in heat and die area. All it does is make more efficient use of execution units that would otherwise sit idle. Your Core 2 processor is probably averaging only about 2.5 of its 4 execution units per cycle: it rarely fills all four, more often uses two or three, and sometimes pushes through only one instruction (or just sits idle because of a pipeline stall).

This goes back to the poster complaining that they would prefer Intel focus on increasing single-threaded performance. I don't think you appreciate how difficult that is. AMD and Intel already use tricks to squeeze out performance that would make your head spin. Out-of-order execution is incredibly complex, especially as it's implemented in today's processors. Branch prediction can only get so accurate (Intel learned that with NetBurst). And a processor can only complete so many instructions at once.

Sure, it's possible to build a fifty-issue-wide CPU with billions of transistors that looks more like a GPU than a CPU in construction. But quite frankly, save for a few small synthetics, more often than not only a handful of instructions would actually be in flight. There is only so much you can complete in one cycle before you hit dependency chains: if one instruction needs the result of another, you simply can't execute it until that result is known.

Which goes back to hyperthreading. If you have execution units sitting empty, why not fill them? Sounds pretty fair to me.

My only concern with Nehalem is the small cache sizes. With hyperthreading, two threads are effectively splitting the cache between them (although I'm sure the dominant thread is given priority). With Nehalem's effective per-core L2 size (L2 minus L1) being only 192 KB, it's going to be interesting to see how it deals with the often-thrown-around "cache thrashing" problem.