It sounds like the efficient part of this story is that the CPU doesn't have to do the copying by executing code. In fact, it could be powered off while the copy takes place. That's what the Fusion paper suggests (at least to me). What if the command blocks (a few MB per frame in total) were sent in packets of a few hundred KB? These could still be resident in the L2 cache and fetched from there. The copy itself has to be initiated by the graphics driver, which in Llano's case should be able to program/control the IMC accordingly.
Textures are a different story, but they don't have to be copied from CPU to GPU every frame.
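
To put rough numbers on the packet idea, here is a minimal sketch of how a driver might carve one frame's command data into cache-sized pieces. Everything in it is an assumption for illustration: the function name `split_into_packets`, the 3 MiB frame size, and the 256 KiB packet size are made up, not taken from the Fusion paper, and the actual programming of the copy hardware is not shown.

```python
KIB = 1024
MIB = 1024 * KIB


def split_into_packets(total_bytes, packet_bytes):
    """Return (offset, length) pairs that cover total_bytes in order.

    Hypothetical helper: each pair stands for one packet that would be
    handed to the copy hardware while its data is still cache-resident.
    """
    if total_bytes < 0 or packet_bytes <= 0:
        raise ValueError("sizes must be non-negative / positive")
    return [
        (offset, min(packet_bytes, total_bytes - offset))
        for offset in range(0, total_bytes, packet_bytes)
    ]


# Assumed sizes: "a few MB per frame" and "a few hundred KB" per packet.
frame_commands = 3 * MIB
packet_size = 256 * KIB

packets = split_into_packets(frame_commands, packet_size)
print(len(packets), "packets per frame")  # 12 packets per frame
```

With these assumed sizes a frame ends up as 12 packets, so the driver would have to kick off a copy a dozen times per frame. Whether each packet really is still in L2 when it is fetched depends on the cache size and on what else the CPU touches in between.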




