
Thread: AMD's Eight-Core Bulldozer Processors to Get Massive Cache - Documents.

  1. #101
    Xtreme Addict
    Join Date
    Jan 2003
    Location
    Ayia Napa, Cyprus
    Posts
    1,354
    Quote Originally Posted by savantu View Post
    Hobby #3. ( IT in general )


    How can searching for info be negative? If I want to know when BD is being launched, and my previous research pointed to H2 2011, why is it perceived as negativity towards AMD when I say so? Simply because I do not believe in the Q1 fairy tales so many are clinging to here?



    I'm not part of the "cheer up" crowd. I do not perceive companies as beings which need encouragement, patting on the back and faith in their future.
    I look at the state of things, financial and operational performance, discrepancy between management messages and operations and so on.

    I stand by what I write. If I occasionally edit my posts, that's to temper the ironic/vitriolic content. Members around this forum are particularly sensitive to that. :P I am more of a fan of heated debates.
    Thanks for your answers, I am not going to derail this thread anymore.

    Seasonic Prime TX-850 Platinum | MSI X570 MEG Unify | Ryzen 5 5800X 2048SUS, TechN AM4 1/2" ID
    32GB Viper Steel 4400, EK Monarch @3733/1866, 1.64v - 13-14-14-14-28-42-224-16-1T-56-0-0
    WD SN850 1TB | Zotac Twin Edge 3070 @2055/1905, Alphacool Eisblock
    2 x Aquacomputer D5 | Eisbecher Helix 250
    EK-CoolStream XE 360 | Thermochill PA120.3 | 6 x Arctic P12

  2. #102
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    To make estimates, guesses or expectations about launch dates a bit less prone to attack, one could simply add an error margin (the scientific way). So if I'm expecting a CPU to launch in Q2 but am not that sure about it, I could say first half of the year, middle of the year, or simply Q2 +/- one quarter. If it finally turned out to be Q1, Q2 or Q3, I wouldn't have been wrong.

    Sure, if someone is already expecting a certain month, then naming a quarter could be a way to add an error margin to it. But I don't think there are many posters besides John who are already working with month granularity in their minds regarding BD or Llano.

    @John:
    Will there be any microarchitectural updates on BD at the Financial Analyst Day?

    I'm currently collecting data supporting the assumption that BD might not use the same clock in different parts of a module. This seems to be an interesting concept.
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

  3. #103
    Xtreme Addict
    Join Date
    Nov 2006
    Posts
    1,402
    JF posted a new thing on his blog :

    http://blogs.amd.com/work/2010/10/25/the-new-flex-fp/

    nothing really new, but interesting reading.

  4. #104
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    2,128
    Damn, I need BD now. Blazing fast software rasterization here I come!

  5. #105
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    I think that with FMA-optimized software, 2P Interlagos will be one uber-fast number-crunching machine (compiler support should be there by the time it launches).

  6. #106
    Xtreme Enthusiast
    Join Date
    Dec 2009
    Posts
    846
    Matthias, there will be some analyst day updates. I expect possibly one or two nuggets that we can share. There is data in the slides, but it obviously needs to get through the final executive reviews.

    This is financial analysts, not industry analysts, so don't expect any deep technical discussions, they are more interested in business discussions.
    While I work for AMD, my posts are my own opinions.

    http://blogs.amd.com/work/author/jfruehe/

  7. #107
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    Quote Originally Posted by JF-AMD View Post
    Matthias, there will be some analyst day updates. I expect possibly one or two nuggets that we can share. There is data in the slides, but it obviously needs to get through the final executive reviews.

    This is financial analysts, not industry analysts, so don't expect any deep technical discussions, they are more interested in business discussions.
    No prob. You could still write additional blogs like the one about Flex FP. Some interesting BD tech is still behind the curtain. But I can understand why you don't want to share info about it. Sometimes I've thought that by speculating in multiple directions I am actually camouflaging the microarchitecture instead of revealing it.

    I listed your Flex FP blog here with other neat stuff:
    http://citavia.blog.de/2010/10/26/mo...lysis-9794436/
    Last edited by Dresdenboy; 10-27-2010 at 02:35 PM.
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

  8. #108
    Xtreme Enthusiast
    Join Date
    Oct 2007
    Location
    Hong Kong
    Posts
    526
    Quote Originally Posted by JF-AMD View Post
    Matthias, there will be some analyst day updates. I expect possibly one or two nuggets that we can share. There is data in the slides, but it obviously needs to get through the final executive reviews.

    This is financial analysts, not industry analysts, so don't expect any deep technical discussions, they are more interested in business discussions.
    How is GF's 32nm shaping up? On time or delayed?
    I think this is vital for Bulldozer and Llano products.

  9. #109
    Xtreme Enthusiast
    Join Date
    Dec 2009
    Posts
    846
    I don't work for GF so I can't comment on them. Dirk continues to reiterate that BD is still on schedule.
    While I work for AMD, my posts are my own opinions.

    http://blogs.amd.com/work/author/jfruehe/

  10. #110
    Xtreme Addict
    Join Date
    Apr 2007
    Location
    canada
    Posts
    1,886
    Quote Originally Posted by JF-AMD View Post
    Matthias, there will be some analyst day updates. I expect possibly one or two nuggets that we can share. There is data in the slides, but it obviously needs to get through the final executive reviews.

    This is financial analysts, not industry analysts, so don't expect any deep technical discussions, they are more interested in business discussions.

    maybe an advance pricing sneak peek? that surely is a business decision

    can't wait to see how this Flex FP unit will work out, plus loads of other secret architectural changes...
    Last edited by Sn0wm@n; 10-27-2010 at 09:47 PM.
    WILL CUDDLE FOR FOOD

    Quote Originally Posted by JF-AMD View Post
    Dual proc client systems are like sex in high school. Everyone talks about it but nobody is really doing it.

  11. #111
    Xtreme Addict
    Join Date
    Sep 2010
    Location
    Australia / Europe
    Posts
    1,310
    sshhhhh! tis a secret

  12. #112
    Xtreme Enthusiast
    Join Date
    Oct 2007
    Location
    Hong Kong
    Posts
    526
    Quote Originally Posted by madcho View Post
    JF posted a new thing on his blog :

    http://blogs.amd.com/work/2010/10/25/the-new-flex-fp/

    nothing really new, but interesting reading.
    Each Flex FP has its own scheduler; it does not rely on the integer scheduler to schedule FP commands, nor does it take integer resources to schedule 256-bit executions.
    But don't L/S and retirement require the INT cores?

  13. #113
    Xtreme Enthusiast
    Join Date
    Dec 2009
    Posts
    846
    Well, for a Bulldozer module you have 3 schedulers (I am going off memory right now): 2 integer schedulers at 40 entries each and an FP scheduler at 60 entries. So you have 140 total entries for 2 integer threads and 1 FP thread.

    If you look at SB, they have 1 scheduler that has to handle 1 thread (1 hyperthread) and 1 FP. I believe they only have 54 entries on that scheduler (Based on Real World Tech article).
    While I work for AMD, my posts are my own opinions.

    http://blogs.amd.com/work/author/jfruehe/
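
The entry counts quoted in post #113 can be tallied with a few lines of Python. This is only a sketch of the arithmetic: the numbers are the ones given in the post (stated there as being from memory), the dictionary names are made up for illustration, and the per-thread split for SB assumes Hyper-Threading is enabled with two threads sharing the scheduler.

```python
# Scheduler entry counts as quoted in the post (given there from memory).
# The dictionary keys are illustrative labels, not official unit names.
BULLDOZER_MODULE = {"int_scheduler_0": 40, "int_scheduler_1": 40, "fp_scheduler": 60}
SANDY_BRIDGE_CORE = {"unified_scheduler": 54}


def total_entries(schedulers):
    """Sum the entries of all schedulers in one module or core."""
    return sum(schedulers.values())


bd_total = total_entries(BULLDOZER_MODULE)
sb_total = total_entries(SANDY_BRIDGE_CORE)

# Assumption: two hardware threads share each structure
# (2 integer cores per BD module, 2 hyperthreads per SB core).
THREADS = 2
bd_per_thread = bd_total / THREADS
sb_per_thread = sb_total / THREADS

print("Bulldozer module:", bd_total, "entries,", bd_per_thread, "per thread")
print("Sandy Bridge core:", sb_total, "entries,", sb_per_thread, "per thread")
```

Raw entry counts like these say nothing about performance on their own, which is the point taken up in the replies that follow.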

  14. #114
    Xtreme Addict
    Join Date
    Nov 2006
    Posts
    1,402
    what about phenom II ?

  15. #115
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    Quote Originally Posted by qcmadness View Post
    But don't L/S and retirement require the INT cores?
    The int cores each could retire up to 4 instructions per cycle incl. the FP instructions. And while the FPU has a combined 2R/1W mem throughput via the int cores' LSUs, each core's 2R/1W (4R/2W in total) should be enough to handle that.
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

  16. #116
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,366
    Quote Originally Posted by JF-AMD View Post
    Well, for a Bulldozer module you have 3 schedulers (I am going off memory right now): 2 integer schedulers at 40 entries each and an FP scheduler at 60 entries. So you have 140 total entries for 2 integer threads and 1 FP thread.

    If you look at SB, they have 1 scheduler that has to handle 1 thread (1 hyperthread) and 1 FP. I believe they only have 54 entries on that scheduler (Based on Real World Tech article).
    It makes sense for Bulldozer to have bigger schedulers. According to Dresdenboy, Bulldozer has longer instruction latencies, so each instruction may wait longer in the scheduler queue for an available execution slot (depending on the actual instruction throughput). On the other hand, SB has 168 slots in its uop reorder buffer vs. 128 slots in each Bulldozer core. So depending on how you position Bulldozer vs. SB (core vs. core or module vs. core), it is a bit better or a bit worse. Also, separate FP/INT schedulers are nothing new for AMD; AMD has stuck with this approach since K7. So there's no clear winner, since Intel's unified scheduler has proven very effective.
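
The core-vs-core and module-vs-core comparison in post #116 can be put into numbers. A minimal sketch, using only the reorder buffer sizes quoted in that post (168 for SB, 128 per Bulldozer core) and assuming two cores per module; the constant and function names are made up for illustration.

```python
# Reorder buffer sizes as quoted in the post; names are illustrative.
ROB_SLOTS_SB_CORE = 168
ROB_SLOTS_BD_CORE = 128
BD_CORES_PER_MODULE = 2  # assumption: one module holds two integer cores


def slot_ratio(bd_slots, sb_slots):
    """Return Bulldozer slots as a fraction of Sandy Bridge slots."""
    return bd_slots / sb_slots


# One Bulldozer core against one SB core.
core_vs_core = slot_ratio(ROB_SLOTS_BD_CORE, ROB_SLOTS_SB_CORE)

# One Bulldozer module (two cores) against one SB core.
module_vs_core = slot_ratio(ROB_SLOTS_BD_CORE * BD_CORES_PER_MODULE, ROB_SLOTS_SB_CORE)

print(f"core vs. core:   {core_vs_core:.2f}")
print(f"module vs. core: {module_vs_core:.2f}")
```

Which ratio is the fair one depends entirely on the positioning question raised in the post, and neither ratio predicts performance by itself.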

  17. #117
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by kl0012 View Post
    It makes sense for Bulldozer to have bigger schedulers. According to Dresdenboy, Bulldozer has longer instruction latencies, so each instruction may wait longer in the scheduler queue for an available execution slot (depending on the actual instruction throughput). On the other hand, SB has 168 slots in its uop reorder buffer vs. 128 slots in each Bulldozer core. So depending on how you position Bulldozer vs. SB (core vs. core or module vs. core), it is a bit better or a bit worse. Also, separate FP/INT schedulers are nothing new for AMD; AMD has stuck with this approach since K7. So there's no clear winner, since Intel's unified scheduler has proven very effective.
    AMD is switching to a unified integer scheduler per core. This will allow greater flexibility when executing integer instructions compared to K8/10h.
    As for instruction latencies, yes, they are longer, but BD has other ways of masking this; plus, this is a hint about the possible high frequency targets for BD cores.

  18. #118
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    Quote Originally Posted by kl0012 View Post
    It makes sense for Bulldozer to have bigger schedulers. According to Dresdenboy, Bulldozer has longer instruction latencies, so each instruction may wait longer in the scheduler queue for an available execution slot (depending on the actual instruction throughput). On the other hand, SB has 168 slots in its uop reorder buffer vs. 128 slots in each Bulldozer core. So depending on how you position Bulldozer vs. SB (core vs. core or module vs. core), it is a bit better or a bit worse. Also, separate FP/INT schedulers are nothing new for AMD; AMD has stuck with this approach since K7. So there's no clear winner, since Intel's unified scheduler has proven very effective.
    The number of scheduler entries also depends on the clock frequency target (FO4 pipeline stage delay). This could be a "knee of the curve" type optimization in Bulldozer: less FO4 increases latencies but limits scheduler entries.

    OTOH, average scheduler size only influences overall performance by a small amount. It depends much more on branch prediction, memory prefetches (no data, nothing to schedule), mem subsystem efficiency, front end, etc.

    You could try the simple scheduler simulation available on my blog. I'll soon update it with one having different instruction latencies.
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/
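
To get a feel for how scheduler window size interacts with instruction latency, here is a toy simulation. It is not the simulation from Dresdenboy's blog and models no real CPU: the function `simulate`, its parameters, the latency mix and the dependency pattern are all assumptions chosen for illustration.

```python
import random


def simulate(window_size, n_instr=2000, issue_width=4, long_latency=20,
             long_fraction=0.1, dep_fraction=0.5, seed=1):
    """Toy out-of-order scheduler model; returns instructions per cycle."""
    rng = random.Random(seed)

    # Build a synthetic program: each entry is (latency, producer index or None).
    program = []
    for i in range(n_instr):
        latency = long_latency if rng.random() < long_fraction else 1
        producer = None
        if rng.random() < dep_fraction:
            candidate = i - rng.randint(1, 8)  # depend on a recent instruction
            if candidate >= 0:
                producer = candidate
        program.append((latency, producer))

    done_at = {}      # instruction index -> cycle at which its result is ready
    window = []       # instructions currently waiting in the scheduler
    next_fetch = 0
    cycle = 0
    issued_total = 0

    while issued_total < n_instr:
        # Fill free scheduler entries in program order.
        while len(window) < window_size and next_fetch < n_instr:
            window.append(next_fetch)
            next_fetch += 1

        # Issue, oldest first, up to issue_width instructions whose producer is done.
        issued = []
        for idx in window:
            latency, producer = program[idx]
            if producer is None or done_at.get(producer, float("inf")) <= cycle:
                done_at[idx] = cycle + latency
                issued.append(idx)
                if len(issued) == issue_width:
                    break
        for idx in issued:
            window.remove(idx)

        issued_total += len(issued)
        cycle += 1

    return n_instr / cycle


if __name__ == "__main__":
    for size in (1, 8, 40, 64):
        print(f"window={size:3d}  IPC={simulate(size):.2f}")
```

The model ignores everything post #118 lists as more important (branch prediction, prefetching, the memory subsystem, the front end), so it only shows the isolated effect of the window, not overall performance.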

  19. #119
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,366
    Quote Originally Posted by informal View Post
    AMD is switching to a unified integer scheduler per core. This will allow greater flexibility when executing integer instructions compared to K8/10h.
    Yep. And a unified FP/INT scheduler might bring another few percent of performance (at least according to theoretical studies), but I guess AMD won't take this route because of its future APU approach (which aims to replace the traditional FPU).
    As for instruction latencies, yes, they are longer, but BD has other ways of masking this; plus, this is a hint about the possible high frequency targets for BD cores.
    Yep, and a bigger scheduler is one of the ways to mask the high latency of some instructions. But rather than making performance predictions from higher instruction latencies (which by themselves mean nothing without the "big picture"), my point was that schedulers of different architectures are not comparable just by their size. Sometimes bigger does not mean better (just like the P4 with its frequency).

  20. #120
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    Quote Originally Posted by kl0012 View Post
    Yep. And a unified FP/INT scheduler might bring another few percent of performance (at least according to theoretical studies), but I guess AMD won't take this route because of its future APU approach (which aims to replace the traditional FPU).
    My "big picture" looks more like there will first be an updated BD architecture (BD2) before bigger changes are made to the µArch again. There is currently no need to prepare any circuits in the critical path for future changes. This would just cost time, performance, area, verification effort etc.

    In this big picture we should also stop thinking in the old ways. Why does a scheduler have to be big and cover most stalls? In the past this was because there was a fixed power budget. Not making use of it while stalling simply reduced performance. Now you could switch off (clock gate) the core during a stall. This saves power. The saved power could be turned into work/performance at another place, even at another time. Overall performance is not lost, so the scheduler window size is OK.

    Another question is at which point it will be useful to bring shader-like computing resources closer to the GP (x86) core. One way could be to add APU resources as additional cluster(s), fed by a common front end. Then they would have their own scheduling etc. AMD has patents for decoding mixed ISA streams. So they could build a decoder which directs decoded or even translated instruction packets to their appropriate targets. There could further be some instruction buffers/execution caches etc.
    Last edited by Dresdenboy; 10-28-2010 at 06:30 AM.
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/
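
The clock-gating argument in post #120 can be illustrated with a toy energy budget. Everything here is hypothetical: the power figures are arbitrary integer units, the cycle counts are invented, and the function `energy` is only a sketch of the reasoning, not a model of any real core.

```python
# Hypothetical per-cycle power in arbitrary integer units.
ACTIVE_POWER = 10          # core doing useful work
UNGATED_STALL_POWER = 6    # core stalled, clocks still running
GATED_STALL_POWER = 1      # core stalled, clock gated

# Hypothetical workload: 700 busy cycles and 300 stall cycles.
ACTIVE_CYCLES = 700
STALL_CYCLES = 300


def energy(active_cycles, stall_cycles, active_power, stall_power):
    """Total energy spent over a run, in arbitrary units."""
    return active_cycles * active_power + stall_cycles * stall_power


baseline = energy(ACTIVE_CYCLES, STALL_CYCLES, ACTIVE_POWER, UNGATED_STALL_POWER)
gated = energy(ACTIVE_CYCLES, STALL_CYCLES, ACTIVE_POWER, GATED_STALL_POWER)

# Energy freed by gating, and how many extra active cycles it could fund
# elsewhere (another core, or the same core at a later time).
saved = baseline - gated
extra_active_cycles = saved // ACTIVE_POWER

print("baseline energy:", baseline)
print("gated energy:", gated)
print("saved:", saved, "-> extra active cycles:", extra_active_cycles)
```

Under these made-up numbers the stalls still cost time, but the energy they used to burn becomes budget that can be spent on work somewhere else, which is the trade the post describes.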

  21. #121
    Xtreme Addict
    Join Date
    Jan 2009
    Posts
    1,445
    Quote Originally Posted by kl0012 View Post
    Sometimes bigger does not mean better.

    thats what he said......
    [MOBO] Asus CrossHair Formula 5 AM3+
    [GPU] ATI 6970 x2 Crossfire 2Gb
    [RAM] G.SKILL Ripjaws X Series 16GB (4 x 4GB) 240-Pin DDR3 1600
    [CPU] AMD FX-8120 @ 4.8 ghz
    [COOLER] XSPC Rasa 750 RS360 WaterCooling
    [OS] Windows 8 x64 Enterprise
    [HDD] OCZ Vertex 3 120GB SSD
    [AUDIO] Logitech S-220 17 Watts 2.1
