
Originally Posted by darkskypoet
@Helmore:
Actually, if you think about it, with both Intel and AMD having IMCs, the next move is onto the CPU module. Never mind the marchitecture speak; think about what happens right now on an AMD platform: a request goes Memory -> CPU -> chipset (PCI Express arbiter) -> video card. With CSI, same thing. Moving the GPU onto the CPU drops the extra steps between system RAM and video RAM. Now, honestly, I couldn't quantify the penalty of routing memory requests through the CPU's IMC and then passing the data back through the CPU, and in reality it is probably less than using PCI Express slots off the SB (think NB <-> SB config), but I think we'll be seeing these GPGPU cores on the CPU sooner rather than later. Think of the benefits for CPU-to-GPGPU communication, and for memory access by both.

Now don't get me wrong, it won't happen overnight. But it will happen, and it will cause a massive shift in the industry. And then what are we looking at? An SGI shared-memory infrastructure? Tiered system memory? Embedded DRAM a la BitBoys (a shared CPU/GPU L3 cache) plus very fast DDR3 or DDR5 system memory modules accessible by both? In a nutshell, this is Fusion. Think of the tremendous redesign in such a chip: you could probably shave the CPU component down some and leverage the strengths of GPGPU transistors to take on jobs that the CPU's FPU currently handles. For me this is the real game at hand, and a lot of the tech being developed now is a stepping stone toward something more like a combined CPU/GPGPU design. Once you start to play with that sort of idea, the magnitude of the design shift and paradigm rethink is tremendous, yet the benefits could be equally great. Chips like the Cell start to become somewhat more interesting if you consider that in five years' time the same sorts of ideas will be played out in PC land.
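The hop-counting argument above can be put into a toy model. Every latency number below is an invented placeholder, not a measurement; the point is only that the discrete-GPU path crosses the IMC, a chipset link, and PCIe, while an on-die GPU shares the IMC directly:

```python
# Toy latency model for a memory read from the GPU's point of view.
# All per-hop values in HOP_NS are made-up illustrative numbers.

HOP_NS = {
    "dram": 60,     # the DRAM access itself
    "imc": 10,      # CPU's integrated memory controller
    "ht_link": 40,  # HyperTransport hop, CPU <-> chipset
    "pcie": 250,    # PCIe transaction out to a discrete card
}

def discrete_gpu_read():
    # Memory -> CPU IMC -> chipset (PCIe arbiter) -> video card
    return HOP_NS["dram"] + HOP_NS["imc"] + HOP_NS["ht_link"] + HOP_NS["pcie"]

def fused_gpu_read():
    # GPU on the CPU die shares the IMC: Memory -> IMC -> GPU
    return HOP_NS["dram"] + HOP_NS["imc"]

print(discrete_gpu_read())  # 360
print(fused_gpu_read())     # 70
```

With these placeholder numbers the fused path drops most of the round trip; the real win would of course depend on actual link and controller latencies.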
It also puts AMD/ATI's vector SP design in a different light, since AMD is banking on merging the two in the near future. So does it make more sense for them to have gone with Nvidia's very scalar method of processing, or with AMD's more vector-oriented SPs? I honestly don't know, but I do wonder if they are thinking in terms of what will make a faster unified CPU/GPU. Being able to harness half the power of RV770 as an extended FPU would be incredible. I think part of AMD's teething problems is that they can't afford to do R&D for its own sake; they can't afford not to make a product out of their stepping stones. So they learn and ship a product to us, moving toward a goal on a five-year time frame (maybe longer, who knows). Perhaps their design decisions on discrete parts now are being informed by what they see Fusion becoming later. Multi-GPU and multi-die communication then becomes more important, because it may be a requirement for their Fusion plans. The same might be said for the vector SIMD SP approach vs. the scalar SP approach. Hard to say, but exciting nonetheless.
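The scalar-vs-vector SP tradeoff can be sketched with a toy issue-slot count. The `pack_rate` parameter is hypothetical, standing in for how often a compiler can fill all lanes of a vector SP; scalar SPs never waste lanes but only do one op per slot:

```python
# Toy model: issue slots needed to execute n_ops independent scalar ops
# on (a) purely scalar SPs vs (b) 4-wide vector SPs whose lanes are
# only partly fillable. pack_rate is an invented knob, not a real metric.

import math

def scalar_slots(n_ops):
    # One op per issue slot, always fully utilized.
    return n_ops

def vector_slots(n_ops, width=4, pack_rate=1.0):
    # Effective ops per slot degrades as packing gets worse.
    ops_per_slot = 1 + (width - 1) * pack_rate
    return math.ceil(n_ops / ops_per_slot)

print(scalar_slots(1000))                   # 1000
print(vector_slots(1000, pack_rate=1.0))    # 250  (graphics-like, packs well)
print(vector_slots(1000, pack_rate=0.25))   # 572  (branchy GPGPU code)
```

On well-packed work the vector design gets close to a 4x win per slot; on poorly packed work the advantage shrinks toward parity, which is roughly the bet each vendor is making.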
Intel's Nehalem has three DDR3 channels? Interesting, as that is surely laying the groundwork for adding Larrabee later on. Consider the die sizes: compare GT200 to Nehalem, and there's lots of space left to add video, non? Further, Intel's 45nm node is far smaller than that, and far better tweaked. If Nvidia, using third-party libraries, can lay down 1.4 billion transistors at 65nm, Intel can do far more on an in-house 45nm process. However, I bet they would do it as an MCM, so that Nehalem's frequency is not held back by the transistor-rich graphics die. Also, when you have 14+ fabs, why not break production down into multiple dies? That allows for both discrete and embedded video. (Side note: an MCM allows for completely separate power planes without trying to do it on one die; I believe this hurt Phenom quite a bit.)
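The transistor argument is just area scaling. A quick sketch, assuming ideal inverse-square density scaling with feature size (real shrinks fall short of this):

```python
# Rough area-scaling arithmetic: transistor density ideally scales with
# the inverse square of the feature size. Real process shrinks deliver
# less than this ideal figure.

def density_gain(old_nm, new_nm):
    return (old_nm / new_nm) ** 2

gain = density_gain(65, 45)
print(round(gain, 2))                # 2.09x ideal density gain, 65nm -> 45nm

# GT200 is ~1.4 billion transistors on 65nm; in the same area at 45nm:
print(round(1.4e9 * gain / 1e9, 1))  # ~2.9 billion, in the ideal case
```

So even with a generous discount off the ideal 2.09x, a GT200-class transistor budget at 45nm leaves plenty of room, which is the "lots of space left" point above.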