Thread: Dresdenboy's blog: AMD Bulldozer - Patent based research

    Dresdenboy's Blog

    AMD has a slew of patents from the last couple of years that point in the direction AMD is taking with its upcoming microarchitecture, codenamed Bulldozer (the Interlagos CPU). A bright German going by the screen name Dresdenboy has been following these patents for quite some time and putting together a picture of what Bulldozer will look like. His blog is very informative and insightful as to the possible inner workings of AMD's future CPU.

    Here is the most recent diagram from August 21st:


    An interesting part of his research came when he explored whether or not AMD will implement SMT:
    More details on Bulldozer's multi-threading and single thread execution

    by Dresdenboy @ 2009-07-07 - 11:36:01 am

    Unfortunately I had neither enough time nor enough details (some things had to be guessed) to create the promised architecture diagram. However, the missing details can now be found in newly published patent applications. I think that will help me get back to the task. But now I switch to another topic: Will Bulldozer have SMT or not?

    AMD's John Fruehe recently said in an AMDZone forum thread that AMD will not do SMT in the next few years. That could be understood to mean that the architecture revealed here will not be able to execute more than one thread per core. However, that is not a given, because no such statement has been made. So far, John has only said that AMD would not implement SMT. In my eyes it was a smart move to mention SMT, just to be able to deny it. However, this is still speculation.

    Instead, we saw the term "cluster-based multi-threading" (also known as clustered multi-threading, CMT) years ago in an AMD presentation. If you look at Chuck Moore's slide below, you see that SMT is the least desirable multi-threading variant to AMD. So far they have been working in the CMP part of this diagram, and it just seems logical to move to the much greener CMT area from there, even more so since they explicitly state an 80% throughput gain for a 50% area investment. They already had this view four years ago, with the first patents covering the new architecture being filed just two years later. If Bulldozer had been meant to be ready for 2009 or 2010, these time frames seem OK to me. And even the four-year difference from the patent filing dates to 2011 fits well with what we know from older architectures.
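To put the slide's figures in perspective, here is a rough back-of-the-envelope sketch. It reads the claim as "+50% area buys +80% throughput" relative to a single core normalized to 1.0 area and 1.0 throughput. The idealized CMP case (two full cores, exactly double the throughput) is my own assumption for comparison, not an AMD number.

```python
# Toy arithmetic for the CMT claim: +50% area for +80% throughput.
# All values are normalized to one conventional core (area 1.0, throughput 1.0).

def throughput_per_area(throughput: float, area: float) -> float:
    """Return throughput delivered per unit of die area."""
    return throughput / area

# One conventional core: the baseline.
single_core = throughput_per_area(1.0, 1.0)

# Idealized CMP (assumption): a second full core doubles area and throughput.
cmp_two_cores = throughput_per_area(2.0, 2.0)

# CMT as read from the slide: 1.5x area, 1.8x throughput.
cmt_module = throughput_per_area(1.8, 1.5)

# How CMT compares with the idealized two-core CMP.
cmt_vs_cmp_throughput = 1.8 / 2.0
cmt_vs_cmp_area = 1.5 / 2.0

print(f"single core : {single_core:.2f} throughput per area")
print(f"ideal CMP   : {cmp_two_cores:.2f} throughput per area")
print(f"CMT module  : {cmt_module:.2f} throughput per area")
print(f"CMT vs CMP  : {cmt_vs_cmp_throughput:.0%} of the throughput "
      f"in {cmt_vs_cmp_area:.0%} of the area")
```

Under these assumptions a CMT module delivers about 20% more throughput per unit of area than either a single core or an ideally scaling dual core, which would explain why that part of the slide is painted green.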



    So we find the new arch again in:
    20090164758 - System and method for performing locked operations
    20090172359 - Processing pipeline having parallel dispatch and method thereof
    20090172362 - Processing pipeline having stage-specific thread selection and method thereof
    20090172370 - Eager execution in a processing pipeline having multiple integer execution units

    Most of these patent applications now give much more detail on how the threads are executed and the like. Most of it fits well with what Hans de Vries already described in his detailed post on aceshardware.

    These patent applications describe ways to execute a single thread on both clusters. This could be done by having a thread run ahead to issue early memory prefetches, or by executing both paths of a branch in parallel and scrapping the wrong path after branch resolution. A different variant is the parallel execution of the same code to gain reliability by comparing the results afterwards.
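As a purely conceptual illustration of two of those ideas (eager execution of both branch paths, and redundant execution with a result check), here is a small software sketch. The function names `eager_execute` and `redundant_execute` are hypothetical, the "clusters" are just ordinary function calls run one after another, and nothing here reflects how the patents actually implement this in hardware.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

def eager_execute(resolve_branch: Callable[[], bool],
                  taken_path: Callable[[], T],
                  not_taken_path: Callable[[], T]) -> T:
    """Toy model of eager execution: compute both paths, keep one."""
    # Both paths are computed before the branch outcome is known.
    # In hardware each path would run on its own integer cluster.
    result_taken = taken_path()
    result_not_taken = not_taken_path()
    # Branch resolution: keep the correct result, scrap the other one.
    return result_taken if resolve_branch() else result_not_taken

def redundant_execute(work: Callable[[], T]) -> T:
    """Toy model of redundant execution: run twice, compare results."""
    first = work()   # conceptually cluster 0
    second = work()  # conceptually cluster 1
    if first != second:
        raise RuntimeError("results differ: possible fault detected")
    return first

x = 7
# Branch: is x odd? Taken path triples and adds one, the other path halves.
value = eager_execute(lambda: x % 2 == 1,
                      lambda: 3 * x + 1,
                      lambda: x // 2)
checked = redundant_execute(lambda: x * x)
print(value, checked)
```

The trade-off in both cases is the same: the second cluster does work that is partly thrown away, in exchange for lower branch latency or higher confidence in the result.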

    Some of the mentioned patent applications also state that the 4-way decoders could decode more than 4 instructions per cycle if there are both a microcoded and a fastpath instruction (of different threads) in one decoding path.

    Another interesting and related topic is how future general-purpose and graphics processing units could be combined. This is covered in the following patent applications:
    20090164726 - Programmable address processor for graphics applications
    20090160863 - Unified processor architecture for graphics and general processing workload

    CMT?!?!?!?! The above is just one of the entries on his blog. The rest are just as interesting, especially the entry called "Faster adaption to ISA Extensions". I think his blog is newsworthy and deserves a bit of healthy discussion.

    EDIT: Updated CPU Diagram as of August 27th. Check out Dresdenboy's blog for details on the changes!
    Last edited by Mechromancer; 08-27-2009 at 06:13 PM.
