Quote: Originally posted by zir_blazer
Fusion: CPU commands GPU internally, lowest possible latency at 0 hops. You are using shared RAM at just one hop through the IMC, so whatever either the CPU or GPU wants to access has to go through the same bus. Possibly the most important improvement would be that the data to process is uploaded directly from the CPU to the GPU in real time, instead of the CPU just telling the GPU where in RAM it has placed the data, in which case the GPU would have to retrieve it from the VRAM.
1. The CPU sends a "command buffer" to the GPU through memory, not through I/O ports. CPU I/O operations (as opposed to memory operations) are slow by nature, and they are not cached.
2. The RAM of a "Fusion" CPU (at least in its first version) isn't shared between the GPU and CPU. Only the memory controller is shared; the RAM is divided into regions, each dedicated to either the CPU or the GPU. So to exchange data between the CPU and GPU you need to copy it from one memory region to another, which is probably not very efficient.
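To make the two points above concrete, here is a rough toy model in Python. It is only a sketch: the RAM size, the region split, the `DRAW` opcode, and all function names are made up for illustration and say nothing about real Fusion hardware or drivers. It shows one RAM split into a CPU region and a GPU region, a copy between the regions, and a command placed in an in-memory command buffer rather than sent through port I/O.

```python
# Toy model only: sizes, offsets, and names are hypothetical, not real Fusion parameters.
RAM_SIZE = 1024
GPU_BASE = 512  # assumed split: [0, 512) is the CPU region, [512, 1024) is the GPU region

ram = bytearray(RAM_SIZE)                 # one physical RAM behind one memory controller
cpu_region = memoryview(ram)[:GPU_BASE]   # region dedicated to the CPU
gpu_region = memoryview(ram)[GPU_BASE:]   # region dedicated to the GPU


def cpu_write(offset: int, data: bytes) -> None:
    """CPU stores data in its own region."""
    cpu_region[offset:offset + len(data)] = data


def copy_cpu_to_gpu(src: int, dst: int, size: int) -> int:
    """Copy bytes from the CPU region into the GPU region; returns bytes moved."""
    gpu_region[dst:dst + size] = cpu_region[src:src + size]
    return size


def submit_command(commands: list, dst: int, size: int) -> None:
    """Append a command to an in-memory command buffer (a plain memory write, no port I/O)."""
    commands.append(("DRAW", dst, size))


vertices = b"\x01\x02\x03\x04"
command_buffer = []

cpu_write(0, vertices)                                       # data lands in the CPU region
moved = copy_cpu_to_gpu(src=0, dst=16, size=len(vertices))   # the extra copy from point 2
submit_command(command_buffer, dst=16, size=moved)           # command goes through memory

op, dst, size = command_buffer[0]
print(op, bytes(gpu_region[dst:dst + size]) == vertices, moved)
```

In a truly shared layout the copy step would disappear and the command would simply point at the original data, which is why the partitioned scheme costs extra memory traffic.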