Merom/Conroe to get FMAC baby!


Carfax
08-23-2005, 01:29 PM
Wow, now I'm really excited about Conroe and Merom! Apparently, they'll have a feature called Macro-Op fusion, which takes Dothan's Micro-Op fusion much further by fusing together adds and multiplies for higher FP throughput :cool:

Translation: gaming and multimedia performance will be awesome, because with FMAC it should have roughly double the single-precision output of both the P4 and the K8.

One of the tricks that Banias brought to the forefront was Micro-Op Fusion, basically the ability to gang multiple decoded operations into one single. Merom takes this much farther, and adds a more sophisticated version to the mix. In addition, Merom has Macro-Op fusion, the ability to gang x86 operations before decode. As an example, if you have a multiply followed by an add, Macro-Op fusion can turn that into a Multiply and Accumulate. Again, this simplifies the complex process of x86 execution and again increases IPC.

Source (http://67.19.9.2/?article=25623) (article written by Charlie)

Man, I can't wait for this chip! The only thing I don't like so far is that the first incarnation will still use the archaic FSB, but is scheduled to move to OMC in 2007.

Now, if it has some kind of multithreading, I will be in bliss :D

saratoga
08-23-2005, 01:38 PM
Translation: gaming and multimedia performance will be awesome, because with FMAC it should have roughly double the single-precision output of both the P4 and the K8.

I don't think that's a new FMA instruction; rather, it just fuses the individual instructions together after decode for lower overall latency. I don't think it will change the peak performance at all (since you still have to decode the individual instructions), though it will likely save power and increase real-world performance in some cases.

Carfax
08-23-2005, 01:43 PM
I don't think that's a new FMA instruction; rather, it just fuses the individual instructions together after decode for lower overall latency. I don't think it will change the peak performance at all (since you still have to decode the individual instructions), though it will likely save power and increase real-world performance in some cases.

Well, it's not a bona fide FMAC instruction, I agree, but it does roughly the same thing.

I guess software will have to be recompiled to take advantage of Macro-Op fusion, no?

kl0012
08-23-2005, 01:53 PM
Here is more about it:
http://www.anandtech.com/tradeshows/showdoc.aspx?i=2504

According to some speculation, Merom will have the following features:
* 14-stage pipeline.
* 4-wide execution core (3-wide in Dothan).
* 2x256-bit read/write L1<->L2 bus (instead of 2x128-bit in Dothan).
* Deeper buffers (more instructions in flight).
* Real 128-bit-wide SSE/SSE2 engine (instead of 64-bit in Dothan/Yonah).

DevilsRejection
08-23-2005, 03:27 PM
Here is more about it:
http://www.anandtech.com/tradeshows/showdoc.aspx?i=2504

According to some speculation, Merom will have the following features:
* 14-stage pipeline.
* 4-wide execution core (3-wide in Dothan).
* 2x256-bit read/write L1<->L2 bus (instead of 2x128-bit in Dothan).
* Deeper buffers (more instructions in flight).
* Real 128-bit-wide SSE/SSE2 engine (instead of 64-bit in Dothan/Yonah).

I saw the first two bits of info; where did you find the rest?

saratoga
08-23-2005, 11:53 PM
Well, it's not a bona fide FMAC instruction, I agree, but it does roughly the same thing.

I guess software will have to be recompiled to take advantage of Macro-Op fusion, no?

No, because there's no FMAC instruction :)

All this does (from the link you gave, anyway) is look for multiply-add pairs in sequence and then combine them after decode into one fused instruction, just like Dothan does now for some int operations and the G5 does for most operations, IIRC. So it's not really much (if any) faster, though it is simpler to implement and uses less power, so it's a good idea.
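
To make the idea concrete, here is a toy Python sketch of that kind of pairing: scan an instruction stream and fold a multiply that is immediately followed by a dependent add into one fused op. Everything in it is an assumption for illustration (the tuple instruction format, the `fma` opcode name, the `fuse_mul_add` helper); it is not how Merom or any real decoder does it, and it ignores details like rounding.

```python
from typing import List, Tuple

# Hypothetical toy format: (opcode, dest, src1, src2). Not a real x86 or
# micro-op encoding; the fused op carries one extra field for the addend.
Instr = Tuple[str, ...]


def fuse_mul_add(stream: List[Instr]) -> List[Instr]:
    """Fuse 'mul t, a, b' directly followed by 'add t, t, c' into 'fma t, a, b, c'."""
    out: List[Instr] = []
    i = 0
    while i < len(stream):
        cur = stream[i]
        nxt = stream[i + 1] if i + 1 < len(stream) else None
        if nxt is not None and cur[0] == "mul" and nxt[0] == "add":
            tmp = cur[1]       # register holding the product
            srcs = nxt[2:4]    # the add's two sources
            # Only fuse when the add overwrites the product register and reads
            # it exactly once, so no other instruction can need the raw product.
            if nxt[1] == tmp and srcs.count(tmp) == 1:
                addend = srcs[0] if srcs[1] == tmp else srcs[1]
                out.append(("fma", tmp, cur[2], cur[3], addend))
                i += 2         # consumed both original instructions
                continue
        out.append(cur)
        i += 1
    return out


program = [
    ("mul", "t", "a", "b"),   # t = a * b
    ("add", "t", "t", "c"),   # t = t + c  (depends on the mul)
    ("add", "x", "x", "y"),   # unrelated add, left alone
]
fused = fuse_mul_add(program)
```

Note that the input stream is still two separate instructions per pair, so the front end has to fetch and decode both either way; the saving in this sketch is only in how many ops come out the other side.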

saratoga
08-23-2005, 11:55 PM
One thought, though: if they really do include hardware for doing FMAC, it would certainly make adding an actual FMAC instruction a hell of a lot easier. Maybe they have plans for SSE4 ;)

dqniel
08-24-2005, 02:48 AM
The first thing you notice is that Intel has abandoned the long pipe, high speed, lower IPC model that was the norm for the last few years.

:slobber: thank god