Yes it does, because the dumbasses at Best Buy say they are real cores.
I know this from personal experience.
HT is mostly positive (1-30%), but sometimes not. For some applications and games, work gets bounced between threads (including two logical threads sharing the same physical core), and then it has a negative impact.
HT does not make sense when a workload demands dedicated throughput from a true CPU core.
Check my previous post, edited to make my point on threading.
If HT is so good, why not increase it a hundredfold?
For the most part, to me multi-core is more about the future than now.
The vast majority of apps that I use are not multithreaded; they saturate a single core and end up constrained by the CPU while other cores sit idle. Intel must have realised the problem, hence the turbo mode on the new processors. Apparently it is not easy to code multithreading, so I expect it is not something we will see in mainstream apps anytime soon; it will be limited to very CPU-heavy stuff such as encoding apps.
Maybe now back to the Bulldozer topic again? :)
More cores will always be better than more threads, hyper or not. AMD is right to add more cores in BD. Those who think adding cores does not provide better performance are handicapped by their restricted computing experience.
Quote:
Yes, and how many apps take advantage of more than 4 cores? Single-threaded performance is still relevant today. The only thing AMD has going for it is brute force with more cores and high clock speeds.
Exactly, it's all about the design trade-offs decided by the architect(s). An additional core would require a bare minimum of twice the area (note: core area only; some chip-level components will be shared) and would likely make bus arbitration/data coherency mildly more complex (+ latency due to physical layout constraints); however, close to twice the potential throughput could be created. As for HT/SMT, the intent is to piggyback on most of the existing infrastructure (i.e. share caches) and achieve the most efficiency possible with the execution units, so that the area consumption is nowhere near that of a new core.
High throughput does not imply low latency, although the converse tends to be true somewhat more often. Whether a uArch has physically distinct cores/threads or an assortment of SMT/HT threads, a true throughput workload should not care about such differences once efficiency and physical constraints are taken into account. That is, unless there is an emphasis on latency (and at what granularity); even then there's often no clear answer.
AFAIK 2-4x SMT is the acceptable balance between overhead and performance gain for most targeted workloads. 100x would be just plain asinine, unless you had an absolutely terrible deep pipeline chock-full of hazards along with equally bad main memory latency :rolleyes: (arguably, at the point of 100+ threads per "core" you wouldn't be in the category of traditional pipeline + SMT anyways).
Isn't that somewhat of a contradiction? More cores directly implies more threads, since [hardware] threads are the low-level concept for program flow and at a basic level one core must have at least one thread of operation (while more can be multiplexed in SMT).
What you're trying to do here is make an overall justification for more cores without taking any real design constraints into account, though I'll add that more cores certainly is the current industry trend. However, it's the architect's job to evaluate when it's applicable to add physical cores in lieu of SMT, or any IPC improvement for that matter. While the concepts involved aren't hard per se, it's the slew of constraints that makes these decisions more statistical analysis than judging on raw merit.
The HT vs. cores argument will never be solved here. The reality is that we should just look at total throughput and spend less time thinking about how you get there.
I know everyone uses their computer differently, but I needed multi-core before multi-core CPUs even existed. It's very rare that I'm only running a single single-threaded application. Adding more cores may not help one ST app, but it will allow me to run more apps simultaneously before they slow down. Not to mention having the spare resources to deal with all the background processes, daemons, etc. always running in modern OSes.
What's going to matter to the majority of end users (besides the biggest factor: the name on the box :rolleyes:) is which processor will give them the best experience for the least money, not implementation details like core count, IPC vs. throughput, etc. For the extreme end of the market, absolute performance is important, but often in these scenarios the apps are multithreaded or we are power users doing heavy multitasking. I'd suggest that it's a very small segment of the market that needs to run only one CPU-bound single-threaded app at maximum speed.
I don't think there should even be an argument about this: SMT is not a replacement for cores, it is simply there to take care of inefficiency, or rather, it enhances efficiency. Intel did not push Bloomfield as an octa-core but as a quad-core capable of executing 8 threads, thanks to SMT. SMT is not a fluke and has proven real-world benefits. If it weren't for HT, Bloomfield would struggle to compete with Thuban in highly multithreaded apps. In some inefficiently coded apps, the more threads you run the better results you get, and these scenarios are where HT helps a lot.
With that in mind, and since Intel is also in the multi-core game, Intel seems poised to rely on SMT for the foreseeable future, maintaining their lead by processing more threads, albeit with fewer physical cores.
The main issue is that most people in the Hyper-Threading (SMT) vs. real cores discussion can't frame a proper comparison. The question is how much of an increase in die size and power consumption SMT amounts to, weighed against the extra performance you get when using that additional thread, compared to an identical core without SMT support.
Intel once stated, if I recall correctly, that the inclusion of Hyper-Threading on the Pentium 4 cost just an extra 5% of die size. Imagine the following with a 10% die area increase due to SMT support:
A standalone Core of a Processor without SMT (1 Core, 1 Thread): 50mm^2
A standalone Core of a Processor with SMT (1 Core, 2 Threads): 55mm^2
Let's say that you have a 1100mm^2 die-size budget to spend adding cores together for a processor die...
1100mm^2 / 50mm^2 = 22 Cores, 22 Threads
1100mm^2 / 55mm^2 = 20 Cores, 40 Threads
And let's also assume that an application can scale well even with such a high number of threads. Now, given that you're losing two physical cores, do the extra 18 threads give you MORE performance than the physical cores you lose in the same die space? If the answer is yes, SMT is worth it whenever software scales. If the answer is no, then SMT is useless.
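To put a number on the trade-off, here's a back-of-the-envelope sketch in Python. It only uses the hypothetical figures from this post; the per-core gain from the second thread (smt_gain) is an assumption you'd have to measure on real hardware:

```python
# Back-of-the-envelope for the layouts above: a fixed 1100mm^2 budget,
# 50mm^2 per plain core, 55mm^2 per SMT core. The per-core throughput
# gain from the second thread (smt_gain) is an assumed parameter.

BUDGET = 1100.0       # mm^2 spent on cores
AREA_PLAIN = 50.0     # mm^2, core without SMT
AREA_SMT = 55.0       # mm^2, core with SMT (+10% area)

plain_cores = int(BUDGET // AREA_PLAIN)   # 22 cores, 22 threads
smt_cores = int(BUDGET // AREA_SMT)       # 20 cores, 40 threads

for smt_gain in (0.05, 0.15, 0.30):       # assumed gain per SMT core
    plain = plain_cores * 1.0             # 1.0 = one plain core's work
    smt = smt_cores * (1.0 + smt_gain)
    winner = "SMT" if smt > plain else "plain cores"
    print(f"+{smt_gain:.0%} per core: {plain:.0f} vs {smt:.1f} -> {winner}")

# Break-even: 20 * (1 + g) >= 22  =>  g >= 10%. The second thread must
# buy at least as much throughput as the area it costs (power ignored).
```

With these made-up numbers, SMT pays off exactly when the second thread's throughput gain exceeds its 10% area cost, and that's before power enters the picture.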
And I would add, power is much more important than die size, unless you're talking about something crazy.
And, in fact, that's what Intel said about Hyperthreading, in one of the Nehalem talks:
It was the feature that provided the most performance/W gain of the many that were added. It didn't take much space, it had great perf/W benefits, and the downside was actually in implementation complexity (tricky to get right) and validation (tricky to prove you got it right). As you can imagine, it really explodes the number of possible core states.
^^zir_blazer
That's the point I was trying to get at: there are so many variables/constraints involved that there's no easy direct comparison between more vanilla cores, SMT, or even the in-between variant, Cluster-based Multi-threading (CMT), that's more applicable here. It's all about cutting the resource budget into the perceived best slice, and the transistor budget (area) isn't even the only constraint; there's still power efficiency/density and overall chip development/logic/layout complexity to consider, along with the varying performance gains[/losses] for each workload under evaluation.
Once Bulldozer is out it'll be nice to see an example of CMT in action and hopefully glean some information on how the concept performs in application/implementation. Assuming that's what it'll implement :)
Definitely on the first part, but it doesn't hurt when discussions get you thinking [without propagating misinformation]. That reality is true for the majority of end users/consumers, but there are at least a few around here who like to drill a little further into the details, from either a personal relation to the industry/field or general intellectual intrigue :D
http://techreport.com/articles.x/18799/4 :confused:
You can't compare AMD to Intel to get an idea of how effective HT is; do an HT on/off test with the exact same system (a sketch of scripting that is below). HT increases power consumption, but it also increases performance and overall power efficiency.
http://techreport.com/articles.x/15818/14
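For what it's worth, an HT on/off run doesn't even need a BIOS trip on Linux: you can offline the sibling logical CPUs from sysfs. A minimal sketch, assuming the usual sysfs topology files and root privileges (double-check against your kernel's layout):

```python
# Sketch: offline the HT sibling CPUs so each physical core keeps exactly
# one logical CPU, for an apples-to-apples HT-off benchmark run.
# Assumes Linux sysfs topology files and root; cpu0 stays online.
import glob
import re

seen = set()  # (package_id, core_id) pairs we've already kept one CPU for
for path in sorted(glob.glob("/sys/devices/system/cpu/cpu[0-9]*"),
                   key=lambda p: int(re.search(r"(\d+)$", p).group(1))):
    cpu = int(re.search(r"(\d+)$", path).group(1))
    with open(path + "/topology/physical_package_id") as f:
        pkg = int(f.read())
    with open(path + "/topology/core_id") as f:
        core = int(f.read())
    if (pkg, core) in seen:
        with open(path + "/online", "w") as f:  # "0" takes the CPU offline
            f.write("0")
        print(f"cpu{cpu}: offlined (HT sibling of an already-kept core)")
    else:
        seen.add((pkg, core))  # first logical CPU on this core: keep it
```

Write "1" back to each cpuN/online file (or reboot) to restore the siblings for the HT-on run.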
Looks like it's already been beaten to death, but just a quick rundown of reasons (each significant enough on its own):
1) Fundamentally different physical processes when it comes to power efficiency (SOI vs HKMG Bulk)
2) Cache/interconnection/chip-level designs differ at the transistor level -> More/less active logic
3) Vastly different pipelines and power consumption profiles -> Each has strengths/weaknesses in different applications/benchmarks
4) Power management is different (power-gating, etc.) -> Not all portions of the chip are consuming active power at a given moment
5) Considering the PII 940 and 1090T consume roughly the same amount of power, layout and process refinements skew the results widely
AMD have created a small doll with an Intel i7 logo on it; they're poking it and chanting voodoo Chinese sounds in a basement, dressed in red/green/black stuff, while in the next room they collect the soul of the Nehalem in a big jar so that they can pour it into the new Bulldozer. It's that simple, mkay?
I bet the browser you're using to type that message is single-threaded. :) Although I know you meant a single-threaded app that maxes out the CPU :)
Your last sentence is key there. The vast majority of apps are bound by single-threaded performance, but end users will like being able to run multiple apps at once without them affecting each other; multi-core is definitely useful for that.
For what it's worth, I think hyperthreading is near enough useless, a gimmick from Intel. I run various servers, and servers love multiple CPUs; however, in a server environment HTT only serves to cause problems, because when apps allocate threads to processors they expect a real processor, not some virtual processor that has no dedicated processing power. So e.g. you can have a 4-core Intel with HTT, hence 8 processors, and MySQL uses processors 1, 2 and 4, with 1 and 2 being the same physical core, which leads to an unbalanced load (a sketch of how you'd pin around that is below).
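To illustrate the workaround rather than just the complaint: on Linux you can parse the sibling topology out of sysfs and pin workers to one logical CPU per physical core, so two busy workers never end up sharing a core. A minimal sketch; the sysfs layout is assumed, and os.sched_setaffinity needs Python 3.3+ on Linux:

```python
# Sketch: pick one logical CPU per physical core and pin a worker to it,
# so two busy workers never share a core through HT.
# Assumes Linux sysfs; os.sched_setaffinity is Linux-only (Python 3.3+).
import glob
import os

def first_sibling(tsl_path):
    """Lowest CPU number in a thread_siblings_list file ("0,4" or "0-1")."""
    with open(tsl_path) as f:
        text = f.read().strip()
    return min(int(part.split("-")[0]) for part in text.split(","))

cores = sorted({first_sibling(p) for p in glob.glob(
    "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list")})
print("one logical CPU per physical core:", cores)

# Hypothetical use: pin this process (say, worker i of a pool) to its own
# physical core; a pool would hand out cores[i % len(cores)].
i = 0
os.sched_setaffinity(0, {cores[i % len(cores)]})
```

An app or scheduler that isn't topology-aware can be wrapped in pinning like this to even out the load across physical cores.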