Originally posted by kl0012:
2. The RAM of a "Fusion" CPU (at least in its first version) isn't shared between GPU and CPU. Only the memory controller is shared, and the RAM is divided into regions, each dedicated to either the CPU or the GPU. So in order to exchange data between CPU and GPU, you need to copy it from one memory region to the other. This is probably not very efficient.
It sounds like the efficient part of this story is that the CPU doesn't have to do the copying by executing code. In fact, it could be powered off while the copy takes place. That's what the Fusion paper suggests (at least to me). What if the command blocks (a few MB per frame in total) were sent in packets of a few hundred KB? These could still be resident in the L2 cache and fetched from there. The copy itself has to be initiated by the graphics driver, which in Llano's case should be able to program/control the IMC accordingly.
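
To make the packet idea concrete, here is a rough Python sketch of splitting one frame's command data into fixed-size packets and copying them one at a time. Everything in it is my own assumption for illustration: the function names, the 4 MiB per frame and the 256 KiB packet size are made-up stand-ins for "a few MB" and "a few hundred KB", and the plain slice copy only stands in for whatever transfer the driver would actually set up. It says nothing about how the real Llano driver or IMC works.

```python
def split_into_packets(total_bytes, packet_bytes):
    """Return (offset, length) pairs that cover total_bytes in chunks
    of at most packet_bytes. Hypothetical helper, not a driver API."""
    if packet_bytes <= 0:
        raise ValueError("packet_bytes must be positive")
    packets = []
    offset = 0
    while offset < total_bytes:
        # The last packet may be shorter than the others.
        length = min(packet_bytes, total_bytes - offset)
        packets.append((offset, length))
        offset += length
    return packets


def copy_in_packets(src, dst, packet_bytes):
    """Copy src into dst one packet at a time. The slice assignment is
    only a stand-in for a copy that would not run on the CPU."""
    for offset, length in split_into_packets(len(src), packet_bytes):
        dst[offset:offset + length] = src[offset:offset + length]
    return dst


# Assumed sizes: 4 MiB of command data per frame, 256 KiB per packet.
FRAME_BYTES = 4 * 1024 * 1024
PACKET_BYTES = 256 * 1024

cpu_region = bytes(range(256)) * (FRAME_BYTES // 256)  # "CPU" memory region
gpu_region = bytearray(FRAME_BYTES)                    # "GPU" memory region

packets = split_into_packets(FRAME_BYTES, PACKET_BYTES)
copy_in_packets(cpu_region, gpu_region, PACKET_BYTES)
```

With these assumed sizes a frame breaks down into 16 packets. The point is only that each packet is small enough that it could plausibly still sit in L2 when the copy is kicked off, while the full multi-MB block would not.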

Textures are a different story, but they don't have to be copied per frame from CPU to GPU.