5.1 Hyper-Threading Aware Thread Scheduling
For the purpose of this discussion, a thread refers to an operating system or application instruction stream. For HT-enabled systems, it is advantageous to schedule any thread to a logical processor on an idle physical processor, that is, on a physical processor that has no instructions executing on either logical processor. This minimizes the impact of competition for shared HT processor resources on overall system performance.
Figure 7 shows an example of a four-processor system that has two active threads. A shaded logical processor indicates an active logical processor; a non-shaded processor indicates an inactive logical processor.
Figure 7. A Four-Processor System with Two Active Threads
Assuming that no processor affinity has been set, the operating system scheduler is free to schedule the next available thread to any of the inactive logical processors. A scheduler that is not HT-aware might pick any of the available logical processors at random. This section examines the performance impact if the scheduler chooses a logical processor on a physical processor that is already busy.
Some assumptions about the performance of the system have to be made to simplify the comparison. Assume that there are no significant bottlenecks in the system architecture or in the software, and that adding a thread to an idle physical processor results in a performance increase equal to 100% of the performance of a similarly equipped non-HT processor (written here as 100). On this basis, the performance of the system in Figure 7, with two active threads on two different physical processors, is 200 before the third thread is scheduled.
Scheduling the third thread on either logical processor 2 or 5, each of which shares a physical processor with an active logical processor, would have the following effects:
• The performance increase that is delivered when transitioning from one active logical processor to two active logical processors, on the same physical processor, is typically in the range of 10% (10) to 30% (30). Taking the midpoint of that range, 20% (20), the total system performance would be likely to increase from 200 to 220 (that is, it goes up by 10%).
• The increase is smaller because two threads are competing for the shared resources on one of the physical HT processors. Scheduling a thread onto an HT processor that already has an active logical processor therefore has the following effects:
o Slowing down the thread on the logical processor that was already active
o Limiting the performance of the newly scheduled thread on the second logical processor
• The good news is that, for multithreaded applications, the sum of the performance of these two threads will typically be better than the performance of a similarly equipped non-HT processor. The specific performance increase for any application depends on how its threads use the shared resources.
Scheduling the third thread on logical processor 3, 4, 7 or 8 would have the following effect:
• The performance increase of the system is 100% (100) of a similarly equipped non-HT processor. In this case there is no competition for shared HT processor resources and the total system performance goes up from 200 to 300 (that is, it goes up by 50%).
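The arithmetic in the two cases above can be checked with a small model. The following is a minimal sketch, not a measurement: it assumes the logical processors in Figure 7 are paired as (1,2), (3,4), (5,6), (7,8), that the two active threads run on logical processors 1 and 6 (inferred from the text, which treats 2 and 5 as the shared choices), and that the gain from a second thread is the 20 midpoint of the 10 to 30 range. The names `physical_of` and `system_performance` are hypothetical.

```python
# Hypothetical model of the section's arithmetic. Pairing and active
# processors are assumptions inferred from the text around Figure 7.
FULL = 100     # one thread on an otherwise idle physical processor
HT_GAIN = 20   # assumed gain from a second thread (source range: 10 to 30)


def physical_of(logical):
    """Map logical processor 1..8 to physical processor 0..3 (assumed pairing)."""
    return (logical - 1) // 2


def system_performance(active_logicals, ht_gain=HT_GAIN):
    """Sum the performance of each physical processor that has active threads."""
    counts = {}
    for lp in active_logicals:
        phys = physical_of(lp)
        counts[phys] = counts.get(phys, 0) + 1
    # First thread on a physical processor contributes FULL,
    # a second thread on the same physical processor adds only ht_gain.
    return sum(FULL + ht_gain * (n - 1) for n in counts.values())


baseline = system_performance({1, 6})      # two threads, two physical processors
shared = system_performance({1, 6, 2})     # third thread shares a physical processor
idle = system_performance({1, 6, 3})       # third thread gets an idle physical processor

print(baseline, shared, idle)  # 200 220 300
```

Changing `ht_gain` to 10 or 30 reproduces the ends of the quoted range (210 and 230) for the shared case, while the idle case stays at 300.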
To take advantage of this performance opportunity, the scheduler in the Windows Server 2003 family and Windows XP has been modified to identify HT processors and to favor dispatching threads onto inactive physical processors wherever possible.
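The placement preference described here can be illustrated with a short sketch. This is not the Windows scheduler's implementation, which the section does not describe in detail; it is a hypothetical policy that assumes the scheduler knows which logical processors share a physical processor (the `siblings` list) and which are currently active. The function name `pick_logical_processor` is an assumption made for illustration.

```python
def pick_logical_processor(active, siblings):
    """Pick an idle logical processor, preferring a fully idle physical processor.

    active:   set of logical processor numbers currently running a thread
    siblings: list of tuples, one tuple of logical processors per physical processor
    Returns a logical processor number, or None if every logical processor is busy.
    """
    # First choice: a physical processor with no active logical processors.
    for group in siblings:
        if not any(lp in active for lp in group):
            return group[0]
    # Fallback: any idle logical processor, even if its sibling is busy.
    for group in siblings:
        for lp in group:
            if lp not in active:
                return lp
    return None


# Assumed topology for the four-processor system in Figure 7.
SIBLINGS = [(1, 2), (3, 4), (5, 6), (7, 8)]

print(pick_logical_processor({1, 6}, SIBLINGS))        # 3
print(pick_logical_processor({1, 3, 6, 7}, SIBLINGS))  # 2
```

In the first call two physical processors are fully idle, so the sketch avoids logical processors 2 and 5. In the second call every physical processor already has an active thread, so the only option left is to share one.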