Page 4 of 4
Results 76 to 87 of 87

Thread: 2009 AMD analysts day [official thread]

  1. #76
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Bulldozer's vision in Fred Weber's head (ex-AMD fellow, back in 2005):
    http://www.scribd.com/doc/7316303/CPU-Aug-05 (page 108)

    When asked "Q: What are the top three items on your dream list of CPU innovations?", Mr. Weber (a man who was actually a driving force behind Clustered MultiThreading in the first place; look at his presentations back in 2005) responded:
    "FW: Well, I’m thinking a lot about the heterogeneous multiprocessing right now. The other area that is of great interest but there isn’t a solution for right now is how to best automatically, whether in hardware or software, use multiple threads in order to speed up programs that are not traditionally parallel. There are a bunch of techniques, such as run-ahead processing, being investigated. But there’s no obvious solution right now."
    Can you see the effects of the above quote in the Bulldozer module diagram?

  2. #77
    Xtreme Member
    Join Date
    Sep 2009
    Location
    Czech Republic, 50°4'52.22"N, 14°23'30.45"E
    Posts
    474
    Quote Originally Posted by rcofell View Post
    Huh? Unless I'm mistaken, I seem to remember Bergman alluding to the fact they're SOI with the comment that the GPU portion is not a risk item on SOI.
    Well, I know, Fuad; but there are some other indicators, like the last picture at http://www.anandtech.com/cpuchipsets...spx?i=3673&p=4 - Brazos is not colored like the 65/45/32 nm parts (and 28 nm is too far off for 2011). Also, http://xtreview.com/addcomment-id-64...rocessors.html talks about the possibility of TSMC testing Fusion on a 40 nm bulk process.

  3. #78
    Xtreme Addict
    Join Date
    Apr 2008
    Location
    Texas
    Posts
    1,663
    Quote Originally Posted by informal View Post
    Bulldozer's vision in Fred Weber's head (ex-AMD fellow, back in 2005):
    http://www.scribd.com/doc/7316303/CPU-Aug-05 (page 108)

    When asked "Q: What are the top three items on your dream list of CPU innovations?", Mr. Weber (a man who was actually a driving force behind Clustered MultiThreading in the first place; look at his presentations back in 2005) responded:
    "FW: Well, I’m thinking a lot about the heterogeneous multiprocessing right now. The other area that is of great interest but there isn’t a solution for right now is how to best automatically, whether in hardware or software, use multiple threads in order to speed up programs that are not traditionally parallel. There are a bunch of techniques, such as run-ahead processing, being investigated. But there’s no obvious solution right now."
    Can you see the effects of the above quote in the Bulldozer module diagram?
    AMD always pimped the Bulldozer out to be a single-threaded maniac. The only way I see that happening is if an ENTIRE module (2 cores and all) can process a single thread by itself. Two real cores working as one to chew through a single thread should be interesting. The Integer and floating point performance of a single module should make that interesting.

  4. #79
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,366
    Quote Originally Posted by informal View Post
    Can you see the effects of the above quote in the Bulldozer module diagram?
    No, I actually can't.
    Quote Originally Posted by Mechromancer View Post
    AMD always pimped the Bulldozer out to be a single-threaded maniac. The only way I see that happening is if an ENTIRE module (2 cores and all) can process a single thread by itself. Two real cores working as one to chew through a single thread should be interesting. The Integer and floating point performance of a single module should make that interesting.
    Guys, you are dreaming too much. While it is possible that one core may help the other by executing some instructions when the second core is saturated (which is really not a common case - isn't it informal who always says that 4-way execution is too much for a single thread?), that is really not what "speculative threading" is.

    Here is something interesting:
    http://arstechnica.com/hardware/news...-bulldozer.ars
    Right now, AMD is referring to each integer scheduler and the pipelines associated with it as a "core," making each Bulldozer module "dual-core." I think this terminology is a huge mistake, and I hope AMD rethinks it.
    If this is the case, then I wonder how AMD is planning to be competitive with its 4-"core" part in the high end against 4-core Sandy Bridge/Ivy Bridge in the mainstream.

  5. #80
    Xtreme Member
    Join Date
    Aug 2009
    Posts
    244
    Quote Originally Posted by kl0012 View Post
    If this is the case, then I wonder how AMD is planning to be competitive with its 4-"core" part in the high end against 4-core Sandy Bridge/Ivy Bridge in the mainstream.
    High end Bulldozer will have 4 modules.

  6. #81
    Xtreme Member
    Join Date
    Oct 2009
    Location
    Bucharest, Romania
    Posts
    381
    As far as I know, Bulldozer will come in flavours of 4-8 modules with 2 mini-cores each. So, at the very high end, it will compete with Sandy Bridge quite decently.

  7. #82
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,366
    Quote Originally Posted by mindfury View Post
    High end Bulldozer will have 4 modules.
    Are you sure? I guess AMD is now taking the Sun Rock route of sharing FP resources between multiple cores (with offloading heavy FP calculation to the APU in mind).

  8. #83
    Xtreme Addict
    Join Date
    Dec 2007
    Location
    Hungary (EU)
    Posts
    1,376
    Quote Originally Posted by informal View Post
    Bulldozer's vision in Fred Weber's head (ex-AMD fellow, back in 2005):
    http://www.scribd.com/doc/7316303/CPU-Aug-05 (page 108)

    When asked "Q: What are the top three items on your dream list of CPU innovations?", Mr. Weber (a man who was actually a driving force behind Clustered MultiThreading in the first place; look at his presentations back in 2005) responded:
    "FW: Well, I’m thinking a lot about the heterogeneous multiprocessing right now. The other area that is of great interest but there isn’t a solution for right now is how to best automatically, whether in hardware or software, use multiple threads in order to speed up programs that are not traditionally parallel. There are a bunch of techniques, such as run-ahead processing, being investigated. But there’s no obvious solution right now."
    Can you see the effects of the above quote in the Bulldozer module diagram?
    Interview with AMD's Fred Weber - The Future of AMD Microprocessors (March 31st, 2005)

    When Intel announced Hyper Threading, AMD wasn't (publicly) paying any attention at all to TLP as a means to increase overall performance. But now that AMD is much more interested and more public about their TLP direction, we wondered if there was any room for SMT a la Hyper Threading in future AMD processors, potentially working within multi-core designs.

    Fred's response to this question was thankfully straightforward; he isn't a fan of Intel's Hyper Threading in the sense that the entire pipeline is shared between multiple threads. In Fred's words, "it's a misuse of resources." However, Weber did mention that there's interest in sharing parts of multiple cores, such as two cores sharing a FPU to improve efficiency and reduce design complexity. But things like sharing simple units just didn't make sense in Weber's world, and given the architecture with which he's working, we tend to agree.

  9. #84
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by kl0012 View Post
    No, I actually can't.

    Guys, you are dreaming too much. While it is possible that one core may help the other by executing some instructions when the second core is saturated (which is really not a common case - isn't it informal who always says that 4-way execution is too much for a single thread?), that is really not what "speculative threading" is.

    Here is something interesting:
    http://arstechnica.com/hardware/news...-bulldozer.ars
    No, you probably misunderstood me. 4-way execution has become standard today, and AMD will obviously use it with BD cores. Plus, there are a lot of ways the second cluster can do speculative threading work. For example, look at one of the patents describing what Mr. Weber expressed in the quote (the patent actually describes a variation of what we now regard as the Bulldozer module):
    Link to patent (thanks to dresdenboy's blog; you can count 4 or 5 ways to implement various algorithms for speculative threading on the Bulldozer-like module the patent describes, kl0012. Enjoy!):
    [0026]In certain instances, the processing pipeline 200 may be processing only a single thread. In this case, the instruction dispatch module 210 can be configured to dispatch integer instruction operations associated with the thread to both integer execution units 212 and 214 based on a predefined or opportunistic dispatch scheme. Alternately, the instruction dispatch module 210 can be configured to dispatch integer instruction operations of the single thread to only one of the integer execution units 212 or 214 and the unused integer execution unit can be shut down or otherwise disabled so as to reduce power consumption. The unused integer execution unit can be disabled by, for example, reducing the power supplied to the circuitry of the integer execution unit, clock-gating the circuitry of the integer execution unit, and the like.

    [0027]The implementation of multiple integer execution units that execute in parallel and share the same front-end unit 202 facilitates accelerated execution of a single thread through collaboration between the integer execution units. The integer execution units 212 and 214 can be used to implement a run ahead scheme whereby the instruction dispatch module 210 dispatches memory-access operations (e.g., load operations and store operations) to one integer execution unit while dispatching non-memory-access operations to the other integer execution unit. To illustrate, the front-end unit 202 can fetch and decode instructions associated with a thread such that load instructions later in the program sequence of the thread are prefetched and dispatched to one of the integer execution units for execution while the other integer execution unit is still executing non-memory-access instructions at an earlier point in the program sequence. In this way, memory data will already be prefetched and available in a cache (or already in the process of being prefetched) by the time one of the integer execution units prepares to execute an instruction dependent on the load operation.
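
    (Not part of the patent text: a toy Python sketch of the run-ahead idea in [0027], just to show why it could help. The latencies, the one-load-per-cycle helper unit, and all the names here are made-up assumptions, and it ignores address dependencies entirely.)

```python
MISS_LATENCY = 100  # assumed cycles for a load that misses the cache
HIT_LATENCY = 1     # assumed cycles for a load that hits the cache


def make_program(blocks=4, alu_per_block=50):
    """Hypothetical thread: blocks of ALU work, each ending in a load."""
    program = []
    for i in range(blocks):
        program += [("alu", None)] * alu_per_block
        program.append(("load", 0x1000 + 64 * i))
    return program


def run_single(program):
    """One integer unit: every first-touch load pays the full miss latency."""
    cycles = 0
    cached = set()
    for op, addr in program:
        if op == "load":
            cycles += HIT_LATENCY if addr in cached else MISS_LATENCY
            cached.add(addr)
        else:
            cycles += 1
    return cycles


def run_ahead(program):
    """Two units: a helper unit issues the loads early, the main unit runs in order."""
    # Helper unit: issue one load per cycle starting at cycle 0 and record
    # the cycle at which each address becomes available in the cache.
    ready_at = {}
    issue_cycle = 0
    for op, addr in program:
        if op == "load" and addr not in ready_at:
            ready_at[addr] = issue_cycle + MISS_LATENCY
            issue_cycle += 1

    # Main unit: executes the program in order; a load only stalls until
    # the helper's prefetch for that address has completed.
    cycles = 0
    for op, addr in program:
        if op == "load":
            cycles = max(cycles, ready_at[addr]) + HIT_LATENCY
        else:
            cycles += 1
    return cycles


program = make_program()
print(run_single(program), run_ahead(program))  # 600 254
```

    In this toy model only the first load still stalls the main unit; the later ones are already in the cache by the time it reaches them.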

    [0028]Another example of a collaborative use of the integer execution units 212 and 214 is for an eager execution scheme whereby both results of a branch in an instruction sequence can be individually pursued by each integer instruction unit. When the correct branch is determined, the integer instruction unit that was tasked with the branch that ultimately was correct can transfer its state to the other integer instruction unit and both integer instruction units can then be used for execution of the program stream of the thread. Table 1 illustrates an example eager execution of instructions of a thread:

    TABLE 1: Eager Execution Example

    Section   Instructions
    A         add r1, r2, r1
              cmp r1, r3
              jne next
    B         add r3, 3, r6
              . . .
              mov r6, r3
    C         next: add r3, 6, r1
              . . .

    [0029]As illustrated by Table 1, instruction section A represents the instruction sequence leading to and including the conditional jump instruction (jne), the instruction section C represents the instruction sequence that follows if the jump is taken, and the instruction section B represents the instruction sequence between the conditional jump instruction and the target of the conditional jump instruction (next). In this example, the front-end unit 202 could dispatch the instruction sequence represented by sections A and B (i.e., the program flow in the event that the jump is not taken) to the integer execution unit 212 and dispatch the instruction sequence represented by sections A and C (i.e., the program flow in the event that the jump is taken) to the integer execution unit 214. In the event that it is ultimately determined that the jump is to be taken, the state of the integer execution unit 214, having been executing the correct path, can be transferred to the integer execution unit 212. Conversely, in the event that it is ultimately determined that the jump is not to be taken, the state of the integer execution unit 212, having been executing the correct path, can be transferred to the integer execution unit 214. More detailed eager execution implementations are described below with reference to FIGS. 6 and 7.
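
    (Again not from the patent: a rough Python sketch of the Table 1 example. I'm assuming the operand order is "source, source, destination" and that the ". . ." rows do nothing; the register-dict "units" and function names are purely illustrative.)

```python
def section_a(regs):
    """Section A: returns True if the conditional jump (jne) is taken."""
    regs["r1"] = regs["r1"] + regs["r2"]  # add r1, r2, r1
    return regs["r1"] != regs["r3"]       # cmp r1, r3 ; jne next


def section_b(regs):
    """Section B: the fall-through path (jump not taken)."""
    regs["r6"] = regs["r3"] + 3           # add r3, 3, r6
    regs["r3"] = regs["r6"]               # mov r6, r3


def section_c(regs):
    """Section C: the jump target (jump taken)."""
    regs["r1"] = regs["r3"] + 6           # next: add r3, 6, r1


def eager_execute(initial):
    """Run both branch outcomes on separate register copies, then keep the winner."""
    unit_212 = dict(initial)  # pursues A + B (jump not taken)
    unit_214 = dict(initial)  # pursues A + C (jump taken)

    taken = section_a(unit_212)
    section_a(unit_214)

    # Both paths are pursued before the outcome is used; in hardware the
    # branch would still be unresolved at this point.
    section_b(unit_212)
    section_c(unit_214)

    # Branch resolved: copy the state of the unit on the correct path
    # over the state of the unit on the wrong path.
    if taken:
        unit_212 = dict(unit_214)
    else:
        unit_214 = dict(unit_212)
    return taken, unit_212, unit_214


print(eager_execute({"r1": 1, "r2": 2, "r3": 3, "r6": 0}))
print(eager_execute({"r1": 1, "r2": 1, "r3": 5, "r6": 0}))
```

    Either way, both "units" end up holding the state of the correct path, which is the point of the state transfer described in [0029].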

    [0030]As yet another example, the integer execution units 212 and 214 can be used collaboratively to implement a reliable execution scheme for a single thread. In this instance, the same integer instruction operation is dispatched to both integer execution units 212 and 214 for execution and the results are compared by, for example, the thread retirement modules 226 of each integer execution unit. In the event that the results match, the results of the integer instruction operation are deemed reliable and execution of the next instruction operation proceeds. Otherwise, if there is a mismatch between the results of execution of the same integer instruction operation, the results are deemed unreliable and corrective action is taken, such as by issuing an exception or other interrupt, by executing the integer instruction operation again, etc.
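
    (Not from the patent either: a minimal sketch of the reliable execution scheme in [0030], assuming each "unit" is just a Python callable and that the corrective action is a bounded retry followed by an exception. All names are hypothetical.)

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub}


class ResultMismatch(Exception):
    """Raised when the two units keep disagreeing on the same operation."""


def healthy_unit(op, operands):
    """A unit that always computes the correct result."""
    return OPS[op](*operands)


def make_flaky_unit(bad_calls):
    """A unit that returns a corrupted result for its first `bad_calls` calls."""
    calls = 0

    def unit(op, operands):
        nonlocal calls
        calls += 1
        result = OPS[op](*operands)
        return result + 1 if calls <= bad_calls else result

    return unit


def reliable_execute(op, operands, unit_a, unit_b, max_retries=1):
    """Dispatch the same operation to both units and accept only matching results."""
    for _ in range(max_retries + 1):
        result_a = unit_a(op, operands)
        result_b = unit_b(op, operands)
        if result_a == result_b:
            return result_a  # results match: deemed reliable
    # Still mismatching after the retries: signal an exception instead.
    raise ResultMismatch(f"units disagree on {op}{operands}")


print(reliable_execute("add", (2, 3), healthy_unit, healthy_unit))        # 5
print(reliable_execute("add", (2, 3), healthy_unit, make_flaky_unit(1)))  # 5
```

    The second call mismatches once, gets re-executed, and then passes; a unit that keeps returning bad results would end in the exception instead.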
    As for this:
    Quote Originally Posted by kl0012
    If this is the case, then I wonder how AMD is planning to be competitive with its 4-"core" part in the high end against 4-core Sandy Bridge/Ivy Bridge in the mainstream.
    The Bulldozer "X8" model (4 modules) will have eight 4-issue cores, as I understand it, since every core inside the module will be 4-issue wide (not 2-way but 4-way, per AMD's slide; note that Bobcat and Bulldozer are similar but designed with different goals in mind, low power and high performance respectively, so a single 2-way Bobcat "core" may not be the same as one cluster inside a Bulldozer module).
    So AMD will actually have an X4 (2-module version) to compete with the 4-core Sandy Bridge and an X8 to compete with an 8-core Sandy Bridge (if Intel launches anything like that). A possible 8-core Sandy Bridge would support 16 threads, though, so nobody knows how those 8 physical + 8 logical cores will fare against the 8 physical ones in Bulldozer.

  10. #85
    Xtreme Mentor
    Join Date
    Apr 2005
    Posts
    2,550
    Remember the cartoon with Road Runner and Coyote, where Coyote reads "water" on a canister with a big G A S O L I N E on it...

    Well, I don't know why I remembered that, but this is how Fuad reads Bulldozer: http://www.fudzilla.com/content/view/16425/1/

    Quote Originally Posted by Fuad
    If you look at the picture above, posted in Chuck's presentation, Bulldozer has two quad cores with an integer scheduler and the two cores share two FPU 128bit FMAC schedulers.

    Each int scheduler quad has its own L1 cache that talks with L2 shared cache used by both cores and FPU units and finally, the last layer has Shared L3 cache as well as Nortbridge support. (You lost me at 'each'. sub.ed.)

    AMD didn’t save its breath to attack Intel for stitching two cores together, and in two years from now, it plans to stitch two cores that will share some parts.
    The last sentence is just brilliant, and it shows the level of Fudo's ignorance in comparing Smithfield with Bulldozer!

    But of course, he just can't stop there and must conclude his contemplation with news that isn't news:

    Quote Originally Posted by Fudo
    As far as we know 8-core Nehalem EX can get to 8 native cores even at 45nm and we are quite sure that for late 2010 Intel plans to launch an 8-core 32nm Westmere based CPUs.

    AMD plans Bulldozer for desktop and server market in 2011
    hehe... Fudo, the legend

  11. #86
    Xtreme Member
    Join Date
    May 2008
    Location
    NorCal
    Posts
    150
    Fudo has truly become Otellini's ass-puppet.

  12. #87
    Banned
    Join Date
    Jun 2009
    Posts
    348
    For a while there I thought I was at amdzone. lol!

    Here's an interesting thread re: Bulldozer/SpMT (Speculative MT).
