JF-AMD are you a process or design engineer for AMD?
JF-AMD are you a process or design engineer for AMD?
He is director of product marketing for servers at AMD...
CMT was something said a while ago, long before I started working with Bulldozer. When I sat down with engineering and product management to get the Bulldozer story for the first time (I jump in much further downstream than the other teams) I asked about CMT and they basically said that they were not using that term. I never pursued it.
how about we peruse the term "Standardized Threading Definitions" so all cpus can have STDs
To be brief, I'll bulletize my bones to pick in this thread.
- SMT - Listed as threads instead of cores for a reason. Intel's implementation means you'll only ever have [core count] threads active/executing even though you have [core count] * 2 "threads". Let's not lose sight of that. I don't know why Sav is going off on a tangent about this.
- CMT - Genuinely has [core count] threads active/executing at any given moment. Each module contains two integer units capable of chewing on instructions at the same exact moment. It's two cores per module.
- Threads - When it comes to comparing thread counts, realize that Intel's SMT-enabled chip thread counts aren't the same thing as AMD's CMT thread counts. In the case of AMD, all those threads are actually executing in parallel. In Intel's case they are not.
Maybe that will help clear up some confusion.
Actually, you're contributing to the confusion because you do not understand (neither does JF, apparently, or he does it intentionally for FUD) what SMT really is.
As a hint, you should pay attention to the S part in SMT ( simultaneous multithreading ). There is plenty of literature on the subject, 5min of reading would help you get it settled.
??
http://i473.photobucket.com/albums/r...anadej/HTT.gif
Straight from Intel and in such a pretty diagram that most people understand right away....
To help out a bit, I think in this case a picture is worth a thousand words:
http://info.nuje.de/intel_smt.png
For some more variants, see http://molesterwaterball.blogspot.co...luster-mt.html
There are pipeline stages in Nehalem, where only one thread is active during one cycle (e.g. decoding) and other stages, where multiple subunits (like EUs) can be used by two threads simultaneously - but still one thread per EU.
Edit: Fixed image (didn't allow direct linking).
So what should we understand ? That instructions from 2 different threads are in flight at the same time in various execution stages ?
Sounds awfully familiar with what I'm saying : SMT means the simultaneous execution of 2 threads or more in parallel.
Maybe, you should answer why it's called simultaneous in the first place.
What's your point, somehow I am missing it ? What do execution units have to do with executing threads simultaneously ?
Maybe instead of amateur sources and interpretations, we should look at real technical articles, written by the people who invented these technologies and published at conferences and in tech journals.
I've attached a diagram of a NetBurst execution core to show the simultaneous execution of 2 threads; you can find it in this paper:
ftp://download.intel.com/technology/...technology.pdf
You don't appear to understand what is really going on yourself. There is only one execution unit. You can't have two threads with instructions that compete for the same resources executing on the same clock cycle in the same execution unit. That's the end of the story. HT is, as we've been claiming all along, just a way to maximize the utilization of the core's resources by scheduling work where there would normally be none being done (misses and whatnot). It does not magically let you execute two threads at the same time the way two real cores do.
Let me put it in simple terms:
With actual cores, throughput generally goes up ~90% when you go from 1 core to 2 cores.
With SMT, throughput generally goes up ~14% for int and ~20% for FP (from SPEC.org, on Intel-based submissions).
SMT may double the number of threads, but it does not double the number of pipelines. You can only fit so many executions per cycle based on the pipelines. SMT might give you better utilization, but you are still limited on pipelines.
Doubling the number of cores will double the number of pipelines and allow for more simultaneous execution. That is the key to this whole discussion. Everyone can argue about how many angels can dance on the head of a pin, but in reality, having more cores means that you have a larger dancefloor.
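To put the rough figures from this post side by side, here is a minimal Python sketch. It only restates the approximate gains quoted above (~90% for a second core, ~14% int and ~20% FP for SMT); the constant and function names are made up for illustration, and real scaling depends heavily on the workload.

```python
# Approximate gains quoted in the post above; these are ballpark figures,
# not measurements made here, and they vary by workload.
SECOND_CORE_GAIN = 0.90  # ~90% more throughput going from 1 core to 2
SMT_GAIN_INT = 0.14      # ~14% more integer throughput with SMT enabled
SMT_GAIN_FP = 0.20       # ~20% more floating-point throughput with SMT enabled


def relative_throughput(gain: float) -> float:
    """Throughput relative to one core running one thread (baseline 1.0)."""
    return 1.0 + gain


def summarize() -> dict:
    """Relative throughput for each configuration discussed in the thread."""
    return {
        "two cores": relative_throughput(SECOND_CORE_GAIN),
        "one core + SMT (int)": relative_throughput(SMT_GAIN_INT),
        "one core + SMT (FP)": relative_throughput(SMT_GAIN_FP),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}x the single-core, single-thread baseline")
```

The point of the comparison is only the gap between the two kinds of "second thread", not the exact percentages.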
Just a thought but does it really matter the path that the two companies have taken?
What should matter is the effectiveness of the choice they made.
IE: Take a $1000.00 intel chip and a $1000.00 AMD chip and see which one does the work you need done better.
Maybe that's too black and white for you smart guys here but to me that's all that counts..
The rest is just a way to kill time typing in a forum..
( Puts on flamesuit):rofl:
That is the craziest thing that I have ever heard ;)
NOBODY ever buys like that. Just customers, but outside of customers, who would ever do that?
http://cdn.mos.bikeradar.com/images/...n36-399-75.jpg
That is exactly what I mean..
Look at the type of work you're doing, then look at the strengths of the two approaches and choose the one that works best for you..
I imagine that in different types of work there are places where both excel and others where both don't..
Last I knew a CPU typically has 3-4 ALUs and 3-4 FP units. No single thread will use all of them.
And it seems you are missing my point: I'm not saying HT is equal to having another core, I'm saying it allows you to execute 2 threads in parallel. End of story.
That's a strawman; you originally claimed
That is incorrect. SMT means your pipeline has 2 threads active at the same time. Which approach is better, is another discussion.
Bingo, :banana::banana::banana::banana:ing Yahtzee! I don't care for any of the technical side to any CPU. I just want what works for me, the fastest and most cost efficient. I'm the type of person who would also give up 10 to 15% performance if it was going to save me a few hundred dollars.
Let's get away from theory and into reality for a minute, ok?
You guys know I have both Intel and AMD systems here yes?
Both excellent, but they are different.
My Westmere systems (2 actually, one in an SR2 board, one in an SM X8DA3 board) are powerful but suck electricity and generate a lot of heat.
My dual Magny-Cours system is a little less powerful (6168 chips, 1900 MHz, 24 cores) but runs at 39-41C and takes a lot less electricity to run.
I also noticed that in the DC work I do there are far fewer page faults with the MC system. Why, I don't know, but it's true.
It is also solid as a rock and has been at 100% load since built 2 months ago.
Bottom line is, as said, different strengths and characteristics, but both good systems.
Now this is an interesting point : what if being dependent on shared resources isn't a liability, but actually a desirable feature ?
SMT allows me to increase the utilization of an underutilized core. CMT duplicates the core or part of it, thus duplicating the lack of utilization also.
Example: we take a 4-issue-wide core with 4 ALUs. Let's say that most of the time only 1 or 2 of those units are used.
-with SMT, we have 2 threads running in parallel on that core, the second thread being dispatched to the idle units. Thus we now have 3 or even all 4 of the units in use.
-with CMT, I add another cluster of 4 ALUs for a total of 8. I have 2 threads, but I also have 2x as many resources available, and each thread still uses only 1 or 2 ALUs most of the time. Thus, out of 8 in the module, I'm constantly using 3-4 units.
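The arithmetic in this example can be written out as a small Python sketch. It uses only the ALU counts given in the post (1-2 busy units for a single thread, 3-4 busy units for two threads); the scenario labels and function names are made up for illustration, and this is a back-of-the-envelope model, not a description of how either vendor's hardware actually schedules work.

```python
def utilization(busy_units: int, total_units: int) -> float:
    """Fraction of execution units doing useful work."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return busy_units / total_units


# (low busy count, high busy count), total ALUs: figures taken from the post.
SCENARIOS = {
    "single thread, 4 ALUs": ((1, 2), 4),
    "SMT: 2 threads sharing 4 ALUs": ((3, 4), 4),
    "CMT: 2 threads on 2 x 4 ALUs": ((3, 4), 8),
}


def utilization_ranges() -> dict:
    """Low and high utilization for each scenario in the example."""
    return {
        name: (utilization(low, total), utilization(high, total))
        for name, ((low, high), total) in SCENARIOS.items()
    }


if __name__ == "__main__":
    for name, (low, high) in utilization_ranges().items():
        print(f"{name}: {low:.1%} to {high:.1%} of ALUs busy")
```

Under these assumed numbers the same amount of work gets done in the SMT and CMT cases; only the number of idle units differs, which is the point the post is making.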
Because it is damn hard to achieve high utilization in just about every domain. Look at your body; when do you use it at max capacity ? The arms for example ? I really doubt you use more than 30-40% of the capacity of the right arm and 10-20% of the left one ( if you're right handed ).
But, I suppose, you don't think in this way : " why does the human body have 2 arms if they are never used to maximum capacity ? ".
well if i could ask god why i have 2 arms when i almost always use just one i would, but i cant. so instead im asking why build a chip that you say, and i quote:
Quote:
No single thread will use all of them.
so are cpu designers idiots who overbuild chips knowing its a waste of money and resources? or is it that SMT only gets 15% of a bonus because chips are almost always being fully utilized?