So AMD has pseudo-cores now??? LOL
Right strategy
Wrong strategy
I do not know
Bulldozer Module is completely different.
With HyperThreading, a core only appears as two cores to the OS.
In reality it only uses the second thread to reduce the amount of time the core spends doing nothing whenever there's a stalled thread.
So for a core without hyperthreading:
cache miss occurs -> requests correct data from memory -> waits for it to arrive -> continues thread
A core with HyperThreading:
cache miss occurs -> requests correct data from memory -> instead of waiting for it before continuing the first thread, it starts processing the second thread.
Each Bulldozer module does actually have the hardware to process two threads simultaneously. It is actually two cores.
HyperThreading doesn't and can't make one core process two threads simultaneously. It just reduces the time the core spends waiting on stalled threads.
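To put rough numbers on that stall-hiding idea, here is a toy cycle-count sketch in C. The burst and miss latencies are completely made up, and the SMT case is an idealized best case where every burst of one thread overlaps a miss of the other; it is not a model of how real hardware schedules anything.

```c
/* Toy cycle-accounting model of latency hiding: one core, two threads,
 * each thread alternates compute bursts with cache misses.
 * All numbers are made-up illustrations, not real hardware figures. */
#include <stdio.h>

#define BURST_CYCLES       20   /* cycles of useful work between misses  */
#define MISS_CYCLES       200   /* cycles a cache miss stalls the thread */
#define BURSTS_PER_THREAD  10

int main(void) {
    /* Without SMT: the core runs thread A, then thread B, and sits idle
     * during every miss. */
    long no_smt = 2L * BURSTS_PER_THREAD * (BURST_CYCLES + MISS_CYCLES);

    /* With SMT (idealized best case): while one thread waits on a miss,
     * the core executes the other thread's next burst, so up to one burst
     * of each miss's latency is hidden. */
    long hidden = MISS_CYCLES < BURST_CYCLES ? MISS_CYCLES : BURST_CYCLES;
    long smt    = 2L * BURSTS_PER_THREAD * (BURST_CYCLES + MISS_CYCLES - hidden);

    printf("no SMT : %ld cycles\n", no_smt);
    printf("SMT    : %ld cycles (idealized)\n", smt);
    return 0;
}
```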
per-core speed (operations per unit time) x core count = performance
Even if AMD has more cores, if they don't also reduce the time in which a core does its operations, it's useless...
Of course they are increasing per-core performance. The only question is how much.
I think most likely IPC will at least reach i7 level if not SB level.
Then there's the question of frequency, which depends on manufacturing and materials.
With 32nm and HKMG we can expect better frequencies for sure - again, only question is how much.
And power efficiency depends on a combination of manufacturing, materials, and design.
Performance depends on a combination of IPC, frequency, and memory access (getting data to/from the cores fast enough for the cores to keep working).
Looking at IPC alone only gives you part of the picture, and looking at frequency alone only gives you part of the picture.
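As a back-of-the-envelope illustration of how those factors combine, here is a small sketch. The IPC, clock, core count, and stall figures are hypothetical placeholders, not claims about any real chip.

```c
/* Back-of-the-envelope throughput estimate: instructions per second
 * ~= IPC * frequency * cores * (fraction of cycles not stalled on memory).
 * All figures below are hypothetical, purely to show how the factors combine. */
#include <stdio.h>

static double giga_ips(double ipc, double ghz, int cores, double not_stalled) {
    return ipc * ghz * cores * not_stalled;
}

int main(void) {
    /* Hypothetical chip A: higher IPC, fewer cores. */
    printf("chip A: %.1f GIPS\n", giga_ips(1.6, 3.4, 4, 0.85));
    /* Hypothetical chip B: lower IPC, more cores, slightly higher clock. */
    printf("chip B: %.1f GIPS\n", giga_ips(1.2, 3.6, 8, 0.80));
    return 0;
}
```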
That's why we have to see how it performs. They had a lot of time to develop a brand new design. The old one had a few bottlenecks, especially in the execution stage of the pipeline. The new core has more flexibility in the EX stage and supports many new features: a wider front end able to effectively decode 4+1 x86 instructions (branch fusion), a unified math and address scheduler in the integer cores, 2x the load/store bandwidth vs. 10h, much better prefetching, better branch prediction, a 4x larger L2 that is shared between core pairs, a partitioned L3 that is now 8 MB, complete ISA support, FMA support, a universal FPU design which can even execute 2 FADDs or 2 FMULs if needed, etc.
There are a LOT of things in BD, and if they did it right it can be a lot faster than Thuban. With a standard (average) IPC jump of 15%, a pure core count jump of 33%, and a clock jump of 20% plus a much more aggressive Turbo Core versus Thuban, this thing can go a lot higher. Interlagos has the same 33% higher core count and in throughput it's claimed to be 50% faster. Zambezi will have half the cores of Interlagos, so higher clock rates are very possible (stock and Turbo), so that 50% Interlagos advantage may turn into the same or a somewhat higher number for Zambezi vs. Thuban.
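Just compounding the rough figures quoted above (15% IPC, 33% more cores, 20% clock) as a sanity check; the combined number is only an upper bound that assumes the work scales perfectly across cores.

```c
/* Compounding the rough figures from the post above: ~15% IPC, ~33% more
 * cores, ~20% clock.  The "fully threaded" result is an upper bound for
 * perfectly scaling work; single-threaded code only sees IPC and clock. */
#include <stdio.h>

int main(void) {
    double ipc = 1.15, cores = 1.33, clock = 1.20;
    printf("fully threaded: %.2fx\n", ipc * cores * clock);   /* ~1.84x */
    printf("single thread : %.2fx\n", ipc * clock);           /* ~1.38x */
    return 0;
}
```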
Only if you are using a weakly parallel or single threaded program, and only one instance/program at a time.
There are only a few real-world high-performance usage scenarios I can think of where that is the case (e.g. gaming). For most high-performance work, the people I know run multiple programs at the same time, and those programs often have multiple intensive threads. There is a reason I have mostly had multi-processor or multi-core machines since the Pentium Pro days.
If it comes down to a case where SB is faster at low thread counts and BD is faster at high thread counts, then you can't go by a statement as simplistic as the one you made. Nor can you simply go by most review sites either; they usually run a single benchmark at a time. You will have to look at the types of programs you use AND how you use them before you can determine what will have the best performance for your scenario.
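One way to put numbers on the low-thread-count vs. high-thread-count trade-off is Amdahl's law; the parallel fractions below are hypothetical examples, not measurements of any particular program.

```c
/* Amdahl's law: speedup(N) = 1 / ((1 - p) + p / N),
 * where p is the fraction of the work that can run in parallel.
 * The workloads and fractions here are hypothetical. */
#include <stdio.h>

static double amdahl(double p, int n) {
    return 1.0 / ((1.0 - p) + p / n);
}

int main(void) {
    /* A game-like workload, mostly serial. */
    printf("p=0.50: 4 cores %.2fx, 8 cores %.2fx\n", amdahl(0.50, 4), amdahl(0.50, 8));
    /* A batch encode / render-like workload, mostly parallel. */
    printf("p=0.95: 4 cores %.2fx, 8 cores %.2fx\n", amdahl(0.95, 4), amdahl(0.95, 8));
    return 0;
}
```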
Are most people going to put that much thought into their purchasing decisions? Lol, hell no.
It's like trying to put 10 pounds of data in a 5 pound bag. While HT allows a single core to handle 2 threads, it can only handle one at a time. There is only 1 set of integer pipelines, so while 2 threads are initiated, only one is active.
It is like the ability to date 2 people. Anyone can date 2 people, but the ability to do both at once is very rare, if not non-existent.
You have no idea what you're talking about, do you?
Your comparisons are really wrong and misleading.
Interleaved hyperthreading (where only one thread at a time can be active on each core) is used only in Itanium. All x86 hyperthreaded cores can genuinely handle two threads at the same time (depending on available execution resources). And since in most cases there are spare execution resources (a single thread rarely utilizes even 70% of a core's execution resources), HT mostly shows a positive effect. It is a really simple and elegant way to utilize resources which are already in place but not used.
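If anyone wants to check this on their own machine, a rough Linux-only sketch is below: pin two busy threads either to two HT sibling logical CPUs or to two separate physical cores and compare the run times. Which logical CPU numbers are siblings is machine-specific (check /sys/devices/system/cpu/cpu0/topology/thread_siblings_list), and the workload is just an arbitrary dependent floating-point loop.

```c
/* Rough Linux-only sketch: pin two compute-bound threads to the two logical
 * CPUs given on the command line and time them running together.  Run it
 * once with two HT siblings and once with two separate physical cores. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ITERS 200000000UL

static void *spin(void *arg) {
    int cpu = *(int *)arg;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

    volatile double x = 1.0;                 /* dependent FP chain, keeps the core busy */
    for (unsigned long i = 0; i < ITERS; i++)
        x = x * 1.0000001 + 0.0000001;
    return NULL;
}

static double run_pair(int cpu_a, int cpu_b) {
    struct timespec t0, t1;
    pthread_t a, b;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    pthread_create(&a, NULL, spin, &cpu_a);
    pthread_create(&b, NULL, spin, &cpu_b);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s <cpu_a> <cpu_b>\n", argv[0]);
        return 1;
    }
    int a = atoi(argv[1]), b = atoi(argv[2]);
    printf("two threads on CPUs %d and %d: %.2f s\n", a, b, run_pair(a, b));
    return 0;
}
```

Build with gcc -O2 -pthread. If the sibling-pinned run finishes in much less than twice the separate-core run, both threads really are making progress at the same time on the one physical core.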
Actually hyper-threading was criticized for being energy-inefficient. For example, ARM has stated SMT can use up to 46% more power than dual core designs. Furthermore, they claim SMT increases cache thrashing by 42%, whereas dual core results in a 37% decrease.
Not to mention that in May 2005 Colin Percival demonstrated that a malicious thread operating with limited privileges can monitor the execution of another thread through their influence on a shared data cache, allowing for the theft of cryptographic keys.
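The attack itself isn't reproduced here, but the primitive behind that class of attack is simple: on a shared cache, timing a load tells you whether someone else's activity evicted your data. Below is a minimal sketch of just that timing primitive (x86 with GCC/Clang intrinsics assumed; the explicit flush stands in for the other thread's influence on the cache, and cycle counts vary per machine).

```c
/* Minimal cache-timing primitive: time a load to tell a cache hit from a
 * miss.  Here _mm_clflush simulates another thread evicting the line from
 * a shared cache; in practice you would calibrate the hit/miss gap per CPU. */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

static uint64_t time_load(volatile char *p) {
    _mm_lfence();
    uint64_t t0 = __rdtsc();
    (void)*p;                       /* the access being timed */
    _mm_lfence();
    return __rdtsc() - t0;
}

int main(void) {
    static char buf[64];
    volatile char *p = buf;

    (void)*p;                       /* warm the line: it is now cached */
    uint64_t hit = time_load(p);

    _mm_clflush(buf);               /* evict the line, standing in for the */
    _mm_mfence();                   /* other thread's cache activity       */
    uint64_t miss = time_load(p);

    printf("cached: %llu cycles, flushed: %llu cycles\n",
           (unsigned long long)hit, (unsigned long long)miss);
    return 0;
}
```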
Hyperthreading is largely a strong negative effect on any system.
Adding a performance feature will add some power consumption. This is natural. I can't comment on ARM's design, but SB is very efficient even with HT.
Cache thrashing is mostly a function of cache size and software quality. In fact, Bulldozer will be affected in the same way as SB with HT, since each module uses a shared cache (while the dedicated L1 was reduced to just 16k).
Well... this is funny. In fact, no absolutely secure hardware exists. The Blue Pill malware was introduced, which uses security holes in AMD virtualization technology. Can we say AMD-V is a "strong negative effect" on any system?
Update:
Since you used Wikipedia as your source of information, I just want to add a sentence which you forgot to copy:
In May 2005 Colin Percival demonstrated that a malicious thread operating with limited privileges can monitor the execution of another thread through their influence on a shared data cache, allowing for the theft of cryptographic keys.[15] Note that while the attack described in the paper was demonstrated on an Intel Pentium 4 processor with HTT, the same techniques could theoretically apply to any system where caches are shared between two or more non-mutually-trusted execution threads; see also side channel attack.
Also:
In 2010, ARM has stated that it will include simultaneous multithreading in its chips in the future.