Page 8 of 39
Results 176 to 200 of 954

Thread: AMD's Bobcat and Bulldozer

  1. #176
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Posts
    800
    I hope they'll talk about platforms (both server and desktop) on the second wave of slides.

  2. #177
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Another deck of slides, or a good part of it, is up at a certain (well) known website that deals with hardware inner things. I don't want to say which one since the NDA is not over yet (1 hour remaining), so there is a chance it gets pulled. If you understood my word play you will know which website it is.

    edit:
    Since terrace asked me on the SA forum about the load/store BW, here it is, confirmed: 2x128-bit load and 1x128-bit store capability per core (the slide talks about dedicated integer cores and distinct/non-shared features). This is from the other slide deck with more detailed info on Bulldozer. The thing I noticed about the shared FPU is the dual 128-bit packed integer pipelines, alongside the two 128-bit FMACs that sit inside the FPU monster.
    Attached Images
    • File Type: jpg 22.jpg (52.5 KB, 1804 views)
    Last edited by informal; 08-24-2010 at 03:03 PM.
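To put the load/store figures from the post above into numbers, here is a rough sketch of the peak per-core bandwidth they imply. The port counts and widths come from the post; the clock speed in the usage note is purely hypothetical, and whether both loads and the store can issue in the same cycle is an open question (the slide does not say), so the function models both readings.

```python
# Peak L1 data bandwidth per core implied by the quoted slide figures:
# 2 x 128-bit loads and 1 x 128-bit store.
LOAD_PORTS = 2
STORE_PORTS = 1
PORT_WIDTH_BITS = 128


def peak_bytes_per_cycle(concurrent: bool) -> int:
    """Return the peak bytes moved per cycle for one core.

    concurrent=True  assumes both loads and the store issue in the same
                     cycle (NOT confirmed by the slide).
    concurrent=False assumes a cycle carries either the loads or the
                     store, so the wider of the two sets the peak.
    """
    load_bytes = LOAD_PORTS * PORT_WIDTH_BITS // 8
    store_bytes = STORE_PORTS * PORT_WIDTH_BITS // 8
    if concurrent:
        return load_bytes + store_bytes
    return max(load_bytes, store_bytes)


def peak_gb_per_s(clock_ghz: float, concurrent: bool) -> float:
    """Bytes per cycle times clock in GHz gives GB/s."""
    return peak_bytes_per_cycle(concurrent) * clock_ghz
```

For example, `peak_gb_per_s(3.0, concurrent=True)` evaluates the optimistic reading at an assumed 3.0 GHz clock; no clock speed has been announced, so treat any absolute number as illustrative only.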

  3. #178
    Xtreme Mentor
    Join Date
    Aug 2006
    Location
    HD0
    Posts
    2,646
    Quote Originally Posted by terrace215 View Post
    Single-thread performance --> low-thread performance, due to the nature of 2-cored modules and Intel's 2-way HT.

    So you are essentially saying "wtf cares about 4-threaded performance?!?!"

    Well, that encompasses the vast majority of what you'll be doing on the desktop most of the time, so, the answer is: most people.
    sc2 only uses 2 cores. Photoshop is limited by the user's skill not the computer. Web browsers are dependent upon the connection

    so the answer is: only nerds who rage over 1/10th of a frame rate on forums.

    desktops are single thread oriented

  4. #179
    Xtreme Addict
    Join Date
    Jul 2005
    Posts
    1,646
    Quote Originally Posted by STaRGaZeR View Post
    So this is AMD's Hot Chips presentation? Gosh, they didn't say anything new except for Bobcat's L1 and L2 cache sizes as Hans has already said...
    Saw AMD in thread title and couldn't resist huh? Try harder.

  5. #180
    Banned
    Join Date
    Jul 2004
    Posts
    1,125
    Quote Originally Posted by xlink View Post
    desktops are single thread oriented
    And um, that's the problem for BD.

  6. #181
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by terrace215 View Post
    And um, that's the problem for BD.
    Why do you say that?
    How do you know how it performs in underutilized scenarios?

  7. #182
    Banned
    Join Date
    Jul 2004
    Posts
    1,125
    Well, that doesn't actually go as far as saying you can do all 3 (2 loads, 1 store) at the same time...

    There's now info up at tech report, too.
    Last edited by terrace215; 08-24-2010 at 03:34 PM.

  8. #183
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by terrace215 View Post
    Well, that doesn't actually go as far as saying you can do all 3 (2 loads, 1 store) at the same time...
    Well, that's the most granular bit you will get for now. You can interpret it as you wish.

  9. #184
    Xtreme Member
    Join Date
    Sep 2008
    Posts
    235
    Quote Originally Posted by terrace215 View Post
    And um, that's the problem for BD.
    1) JF told you at the other thread that IPC is higher.

    2) If that's true then the higher frequency design comes on top of that.

    3) And last but not least: Power gating Turbo now allows much higher single core frequencies.


    Looks like a 1-2-3 speed bump for single thread performance to me....


    Regards, Hans
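The "1-2-3" argument above stacks three independent gains, and since single-thread performance is roughly IPC times clock frequency, the gains multiply rather than add. A minimal sketch of that arithmetic follows; every percentage in the usage note is a made-up placeholder, not a leaked or measured figure.

```python
def single_thread_speedup(ipc_gain: float,
                          base_clock_gain: float,
                          turbo_gain: float) -> float:
    """Combine three relative gains into one single-thread speedup.

    Each argument is a ratio versus the previous generation
    (1.10 means 10% better). Performance ~ IPC x frequency, so the
    ratios multiply.
    """
    return ipc_gain * base_clock_gain * turbo_gain


def required_ipc_gain(target_speedup: float,
                      base_clock_gain: float,
                      turbo_gain: float) -> float:
    """IPC ratio needed to reach a target speedup given the clock gains."""
    return target_speedup / (base_clock_gain * turbo_gain)
```

With hypothetical inputs such as `single_thread_speedup(1.05, 1.10, 1.10)` the three modest gains compound to more than their simple sum, which is the point of the post. `required_ipc_gain` turns the question around: given assumed clock gains, how much IPC is still needed to hit a chosen target.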

  10. #185
    Xtreme Mentor
    Join Date
    Aug 2006
    Location
    HD0
    Posts
    2,646
    Quote Originally Posted by Hans de Vries View Post
    1) JF told you at the other thread that IPC is higher.

    2) If that's true then the higher frequency design comes on top of that.

    3) And last but not least: Power gating Turbo now allows much higher single core frequencies.


    Looks like a 1-2-3 speed bump for single thread performance to me....


    Regards, Hans
    fewer execution units...

    those three factors, coupled with other architectural improvements, would need to make it at least 50% faster in some instances.

  11. #186
    Devil kept pokin'
    Join Date
    Jan 2010
    Location
    South Kakalaky
    Posts
    1,299
    It's AMD, you've got to have faith they'll give Intel something to think about.

    I love the underdog.

  12. #187
    I am Xtreme
    Join Date
    Oct 2004
    Location
    U.S.A.
    Posts
    4,743
    I like how AMD is combating Intel's HT. I'm looking forward to both Bobcat and Bulldozer, although I still wonder how AMD is going to compete in the mobile market against Intel's Atom.


    Asus Z9PE-D8 WS with 64GB of registered ECC ram.|Dell 30" LCD 3008wfp:7970 video card

    LSI series raid controller
    SSDs: Crucial C300 256GB
    Standard drives: Seagate ST32000641AS & WD 1TB black
    OSes: Linux and Windows x64

  13. #188
    Xtreme Cruncher
    Join Date
    Apr 2005
    Location
    TX, USA
    Posts
    898
    Quote Originally Posted by Chumbucket843 View Post
    synthesizers do not floorplan. generally modern logic blocks are 50-100K gates, probably around 300-600K transistors. this is a rather large chunk when the core itself, including L1 & L2 it is probably <20M transistors.
    Not exactly sure which context you mean when you say they "do not floorplan", but they definitely allow floorplanning at some level. The first step of synthesis, RTL -> Netlist, doesn't floorplan (it just cares about standard cell usage & timing estimates/constraints), if that's what you're trying to get at. However, the second step of synthesis, Netlist -> Placement (placement tool), definitely does floorplanning.

    Tools like Cadence Encounter take floorplan constraints and allow for partitioning sub-modules; however, as the picture above shows, the results tend to look like a jumbled mess, since strict boundaries aren't adhered to.
    Quote Originally Posted by -Boris- View Post
    Finally, chips from AMD has always been nicely ordered, pointing at a mostly hand made layout. I can imagine that it leads to an uneven power usage and unnecessary long circuits and timings. And wastes die space.
    Well, historically [x86] chips from both camps have always been mainly custom layout in the datapath with a varying amount of synthesized control logic, so seeing pictures like this is a bit of an eye-opener compared to the norm.

    Another example is Intel's Pine-Trail (bigger):

    The huge purple blotch running down the middle is all synthesized logic
    Quote Originally Posted by deeperblue View Post
    AMD says (http://www.youtube.com/watch?v=VIs1CxuUrpc)
    "Synthesizable with small number of custom arrays"
    Together with what was said before I think one of the main goals that AMD wants to achieve is to have easily customizable processors. Add a gpu core here, some cache there and another core here. From the slide it looks like lots of their process is already capable of being laid out by a computer.
    We have the caches, the integer units and the floating point units being the fixed hand optimized blocks with stuff like the x86 decode organically filling up the space in between. AMD also says it makes it easier to put the whole thing on a different process.

    I've only limited knowledge about modern synthesizing and floor planning from working with some FPGAs.
    Maybe Hans or somebody in the industry can say something about Bobcat?
    You pretty much sum up my thoughts on the matter, it looks like they shot for a semi-custom approach by supplying some of the main datapath logic (not necessarily say the whole FPU, etc., just the important chunks) and the arrays as hard-macros/external-IP (in- or out-of-house, doesn't matter) while synthesizing the rest.

    While they're definitely not unique in the approach, it will certainly provide a quicker process adaptation, since only a standard cell library and select logic/array-IP pieces would technically be necessary. Granted there's still a bit more work than just swapping libraries/IP and pressing a few buttons
    Quote Originally Posted by Chumbucket843 View Post
    lol, it is the exact opposite. hand layout is much better. humans are better at finding eulerian paths and coming up with clever layouts. computers cant really do that with all of the design rules and other parameters as effectively. the difference in performance is 2.6-7x faster with custom designed circuits.

    really what happens is a coder will simulate his module and make sure it reaches the targeted timing, which is usually much higher than actual delay to assure robust operation. if the logic cant reach the speed it is either rewritten or circuit designers optimize it. in certain logic families it must be entirely custom designed.

    circuits that are custom designed are usually things like power gating, clock distribution, and analog circuits such as pll's, dll's, and memory controllers/ io pads.
    Just pointing out that while we humans can definitely be more adept at coming up with these clever (sometimes novel) solutions to optimizing layout area/timing/congestion constraints, it's also a significant capital and time investment, so for ROI and time-to-market reasons it doesn't always work out. The case of Bobcat is obviously an example of this, and Atom for that matter.

    Honestly, I would find the logically optimal Euler path to be much easier for a computer to solve.
    But yes, computerized tools aren't very good when it comes to balancing the plethora of added constraints in a physical world, hence us restricting them to sub-optimal standard cells + wiring constraints.
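On the Euler path remark above: the purely logical part of the problem is indeed mechanical. As a sketch, treat one transistor network as a graph with nets as nodes and transistors as edges; an Euler path then gives a gate ordering in which neighbouring transistors can share diffusion. The code below uses Hierholzer's algorithm on a single network. It is an illustration only: real cell layout wants a common ordering for the pull-up and pull-down networks plus all the physical constraints mentioned in the post, none of which is handled here, and the net and transistor names are invented.

```python
from collections import defaultdict


def euler_path(edges):
    """Return edge labels in Euler-path order, or None if none exists.

    edges: list of (label, node_a, node_b) tuples describing an
    undirected multigraph (label = transistor, nodes = nets).
    """
    if not edges:
        return []

    # node -> list of (edge_index, neighbour); each edge is listed at both ends
    adj = defaultdict(list)
    for i, (_, a, b) in enumerate(edges):
        adj[a].append((i, b))
        adj[b].append((i, a))

    # An Euler path needs exactly 0 or 2 nodes of odd degree.
    odd = [n for n in adj if len(adj[n]) % 2 == 1]
    if len(odd) not in (0, 2):
        return None
    start = odd[0] if odd else edges[0][1]

    used = [False] * len(edges)
    stack = [(start, None)]  # (current node, edge index used to reach it)
    path = []
    while stack:
        node, via = stack[-1]
        # Drop edges already traversed from the other endpoint.
        while adj[node] and used[adj[node][-1][0]]:
            adj[node].pop()
        if adj[node]:
            i, nxt = adj[node].pop()
            used[i] = True
            stack.append((nxt, i))
        else:
            # Dead end: emit the edge that led here (Hierholzer back-out).
            stack.pop()
            if via is not None:
                path.append(edges[via][0])

    if len(path) != len(edges):
        return None  # graph is disconnected
    return path[::-1]
```

A call such as `euler_path([("A", "out", "n1"), ("B", "n1", "gnd"), ("C", "out", "gnd")])` models the pull-down network of a hypothetical (A·B)+C gate. The algorithm runs in time linear in the number of transistors, which is why the graph step is the easy part and the physical constraints are where tools struggle.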



  14. #189
    Banned
    Join Date
    Jul 2004
    Posts
    1,125
    Quote Originally Posted by Hans de Vries View Post
    1) JF told you at the other thread that IPC is higher.

    2) If that's true then the higher frequency design comes on top of that.

    3) And last but not least: Power gating Turbo now allows much higher single core frequencies.


    Looks like a 1-2-3 speed bump for single thread performance to me....


    Regards, Hans
    Hans, I wouldn't put much faith in "much higher single core frequencies", relative to phenom II.

    As for the IPC, I could believe it is higher, but having seen how they've trimmed the cores in certain ways, not that much higher.
    Last edited by terrace215; 08-24-2010 at 04:44 PM.

  15. #190
    Xtreme Addict
    Join Date
    Jan 2008
    Location
    milwaukee
    Posts
    1,683
    Quote Originally Posted by informal View Post
    Why do you say that ?
    How do you know how it performs in underutilized scenarios?
    they dont, but standard procedures say to enter amd thread and say anything negative about amd even if there is no proof to back it
    LEO!!!!
    amd phenom II x6 1100T | gigabyte 990fxa-ud3 . .
    2x2gb g.skill 2133c8 | 128gb g.skill falcon ssd
    sapphire ati 5850 | x-fi xtrememusic. . .
    samsung f4 2tb | samsung dvdrw . .
    corsair tx850w | windows 7 64-bit.
    ddc3.25 xspc restop | ek ltx | mc-tdx | BIP . .
    lycosa-g9-z2300 | 26" 1920x1200 lcd .

  16. #191
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    Quote Originally Posted by crazydiamond View Post
    they dont, but standard procedures say to enter amd thread and say anything negative about amd even if there is no proof to back it
    you say that so much and i think it's easier to just either visit another forum, not get so upset, ignore the posts, or even just set certain users to ignore. notice it's not a big deal to anyone else. the shintai era is over dude.

    some people don't like amd and you'll just have to deal with it. personally i am more frustrated with them (amd) than anything else. i have high standards, especially from big companies. and yes AMD is a big semiconductor company.

  17. #192
    Xtreme Member
    Join Date
    Sep 2008
    Posts
    235
    Quote Originally Posted by deeperblue View Post
    AMD says (http://www.youtube.com/watch?v=VIs1CxuUrpc)
    "Synthesizable with small number of custom arrays"
    Together with what was said before I think one of the main goals that AMD wants to achieve is to have easily customizable processors. Add a gpu core here, some cache there and another core here. From the slide it looks like lots of their process is already capable of being laid out by a computer.
    We have the caches, the integer units and the floating point units being the fixed hand optimized blocks with stuff like the x86 decode organically filling up the space in between. AMD also says it makes it easier to put the whole thing on a different process.

    I've only limited knowledge about modern synthesizing and floor planning from working with some FPGAs.
    Maybe Hans or somebody in the industry can say something about Bobcat?
    Another nice example is the 1.9W TDP 2GHz hard-macro version of the dual-core ARM Cortex-A9 in the TSMC 40G process (total size of only 6.7 mm²)

    http://www.arm.com/products/CPUs/Cor...ard-Macro.html



    Regards, Hans

  18. #193
    Xtreme Addict
    Join Date
    Aug 2004
    Location
    Sweden
    Posts
    2,084
    Quote Originally Posted by Hans de Vries View Post
    Another nice example is the 1.9W TDP 2GHz hard-macro version of the dual-core ARM Cortex-A9 in the TSMC 40G process (total size of only 6.7 mm²)
    What's the reason behind the odd die shape, do you know?

  19. #194
    Xtreme Member
    Join Date
    Apr 2005
    Posts
    110
    Quote Originally Posted by terrace215 View Post
    Hans, I wouldn't put much faith in "much higher single core frequencies", relative to phenom II.
    Why? Does this boil down to your special insight with regards to GF's 32nm process?

  20. #195
    Banned
    Join Date
    Jul 2004
    Posts
    1,125
    Quote Originally Posted by redpriest View Post
    Why? Does this boil down to your special insight with regards to GF's 32nm process?
    Phenom II hits what, 3.8 on turbo? Yeah, I don't see BD turboing much past 4GHz, *especially* on that GF process. Power just starts to climb so fast with frequency at that point.
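The point about power climbing quickly near 4GHz can be illustrated with the textbook approximation for dynamic power, P ≈ C·V²·f: pushing the clock higher usually also needs more voltage, so power grows faster than frequency. The sketch below assumes a simple linear voltage/frequency relation; the base clock, base voltage and volts-per-GHz slope are hypothetical placeholders, not data for any real process, and leakage is ignored.

```python
def relative_dynamic_power(freq_ghz: float,
                           base_freq_ghz: float = 3.8,
                           base_voltage: float = 1.30,
                           volts_per_ghz: float = 0.15) -> float:
    """Dynamic power at freq_ghz relative to the base operating point.

    Uses P ~ C * V^2 * f with constant switched capacitance C.
    The voltage needed is modelled as rising linearly with frequency
    (an assumption for illustration; real V/f curves are not linear).
    """
    voltage = base_voltage + volts_per_ghz * (freq_ghz - base_freq_ghz)
    return (voltage / base_voltage) ** 2 * (freq_ghz / base_freq_ghz)
```

Comparing `relative_dynamic_power(4.2)` with the plain frequency ratio `4.2 / 3.8` shows the superlinear growth under these assumed numbers; a steeper voltage slope makes the gap wider.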

  21. #196
    Xtreme Cruncher
    Join Date
    Apr 2005
    Location
    TX, USA
    Posts
    898
    Quote Originally Posted by Mats View Post
    What's the reason behind the odd die shape, do you know?
    It's still supposed to be a rectangle; the discrepancy is that the white "missing" area is reserved for external IP (L2 cache SRAM mainly, but chip-level interfacing too).



  22. #197
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    2,128
    Quote Originally Posted by terrace215 View Post
    Hans, I wouldn't put much faith in "much higher single core frequencies", relative to phenom II.

    As for the IPC, I could believe it is higher, but having seen how they've trimmed the cores in certain ways, not that much higher.
    1. Deeper pipeline. -> Higher clocks.
    2. 32 nm Gate Last HKMG SOI. -> Higher clocks.
    3. Inclusive cache hierarchy. -> Higher IPC.

  23. #198
    Xtreme Addict
    Join Date
    Aug 2004
    Location
    Sweden
    Posts
    2,084
    Quote Originally Posted by redpriest View Post
    Why? Does this boil down to your special insight with regards to GF's 32nm process?
    No, it's because Intel is raising their turbo clock by 66 MHz, from 3733 MHz in the 45 nm i7 880 to 3800 MHz in the 32 nm i7 2600.

    If Intel can only raise it by 66 MHz, then AMD should only be able to raise it by less than that, right? It's fundamentally impossible that AMD can do ANYTHING better, according to some people.
    Yet, they keep on posting and bashing in threads about things they hate.. I mean, what's the point?

    Maybe I should join a Britney Spears forum and see what being negative in a forum is all about, because I honestly don't know.

  24. #199
    Xtreme Addict
    Join Date
    Jan 2009
    Posts
    1,445
    Quote Originally Posted by Calmatory View Post
    1. Deeper pipeline. -> Higher clocks.
    2. 32 nm Gate Last HKMG SOI. -> Higher clocks.
    3. Inclusive cache hierarchy. -> Higher IPC.
    lol a trifecta eh?


    i guess stp matters still....well thanks for all who explained why they view it as important.
    [MOBO] Asus CrossHair Formula 5 AM3+
    [GPU] ATI 6970 x2 Crossfire 2Gb
    [RAM] G.SKILL Ripjaws X Series 16GB (4 x 4GB) 240-Pin DDR3 1600
    [CPU] AMD FX-8120 @ 4.8 ghz
    [COOLER] XSPC Rasa 750 RS360 WaterCooling
    [OS] Windows 8 x64 Enterprise
    [HDD] OCZ Vertex 3 120GB SSD
    [AUDIO] Logitech S-220 17 Watts 2.1

  25. #200
    Xtreme Addict
    Join Date
    Jan 2009
    Posts
    1,445
    Quote Originally Posted by Mats View Post
    No, it's because Intel is raising their turbo clock by 66 MHz, from 3733 MHz in the 45 nm i7 880 to 3800 MHz in the 32 nm i7 2600.

    If Intel can only raise it by 66 MHz, then AMD should only be able to raise it by less than that, right? It's fundamentally impossible that AMD can do ANYTHING better, according to some people.
    Yet, they keep on posting and bashing in threads about things they hate.. I mean, what's the point?

    Maybe I should join a Britney Spears forum and see what being negative in a forum is all about, because I honestly don't know.
    why Britney spears? there's nothing wrong with Britney!.....LEAVE BRITNEY ALONE!!!


    http://www.youtube.com/watch?v=kHmvkRoEowc
