Page 59 of 181
Results 1,451 to 1,475 of 4519

Thread: AMD Zambezi news, info, fans !

  1. #1451
    Xtreme Member
    Join Date
    May 2009
    Location
    São Paulo, Brazil
    Posts
    317
    Quote Originally Posted by TESKATLIPOKA
    drfedja great work, I love these diagrams. If you don't mind I will link them to another forum I frequently visit.
    I'd be even more thankful if there happened to be a version in English.

  2. #1452
    Xtreme Member
    Join Date
    Apr 2007
    Location
    Serbia
    Posts
    102
    Quote Originally Posted by Oliverda
    BTW, you interpret the Address Generation Units as units that calculate linear addresses as well as INC/LEA values. The Optimization Guide also refers to them as simple integer execution units (AGLU).

    Would you briefly explain what kind of operations these units can execute?

    Thanks
    I'd also like to know what types of instructions the AGLU can execute. What I know is based on the Optimisation Manual plus my own assumptions: the AGLU can execute address calculations and LEA, and can probably execute INC. AMD's manual does say that the AGLU can execute simple ALU operations, so even if I'm wrong about INC, it is plausible that such a unit supports instruction types other than CALL and LEA.
    If it can calculate an address, that unit should also be able to execute a simple ADD or INC on unsigned integers (address + offset) and some logical operations such as XOR or AND. That is my speculation, because the Optimisation Manual probably isn't complete yet.

    The Optimisation Guide also says this:
    There are four integer execution units per core. Two units which handle all arithmetic, logical and
    shift operations (EX). And two which handle address generation and simple ALU operations
    (AGLU). Figure 2 shows a block diagram for one integer cluster. There are two such integer clusters
    per compute unit.
    The Optimisation Manual says that the AG0|AG1 units execute the LEA instruction when it works with three operands, but with the legacy two-operand form LEA can be executed only on the EX0|EX1 units. AG0|AG1 can also execute CALL instructions, which are decoded as double ops: the first op executes on EX and the second op executes on the AGLU.
    The CALL instruction clearly transfers control to another procedure, and the RET instruction returns to the instruction following the call.
    But that isn't a big difference compared to K10. K10 also executes the CALL instruction as a double op, but on BD, CALL disp (near) and CALL reg (near) have 50% lower latency than on 10h, and CALL mem (near) is hardwired (double decoded), whereas on 10h it is microcoded.

    According to the Optimisation Manual, the main difference between the BD AGU and the 10h AGU is that the BD AGU can execute LEA when it works with three operands, and CALL is fully hardwired, with slightly lower latencies.
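For readers less familiar with what an address generation unit actually computes, here is a minimal Python sketch of the x86 effective-address arithmetic that LEA exposes (base + index*scale + displacement, wrapped to the address width). The function name `effective_address` and the example values are made up for illustration, and which of these forms the manual counts as "two operand" or "three operand" is an assumption here, not something confirmed in the thread.

```python
def effective_address(base, index=0, scale=1, disp=0, bits=64):
    """Compute base + index*scale + disp, wrapped to the address width.

    Hypothetical helper for illustration only; it models the arithmetic,
    not which execution unit (EX or AGLU) performs it.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factor must be 1, 2, 4 or 8")
    mask = (1 << bits) - 1
    return (base + index * scale + disp) & mask


# Base + displacement only, e.g. lea rax, [rbx + 16]
two_input = effective_address(0x1000, disp=16)

# Base + scaled index + displacement, e.g. lea rax, [rbx + rcx*8 + 16]
three_input = effective_address(0x1000, index=4, scale=8, disp=16)

print(hex(two_input), hex(three_input))
```

The point is only that address generation is an add (plus a small shift for the scale), which is why it seems plausible for the same hardware to handle a plain ADD or INC as well.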

    Quote Originally Posted by danielkza
    I'd be even more thankful if there happened to be a version in English.
    Use google translate to learn Serbian... :P
    I will translate the diagrams to English, that isn't a problem, but I think they are understandable in this version. A picture is worth a thousand words! :P
    Last edited by drfedja; 08-09-2011 at 03:07 PM.
    "That which does not kill you only makes you stronger." ---Friedrich Nietzsche
    PCAXE

  3. #1453
    Registered User
    Join Date
    Sep 2008
    Posts
    45
    drfedja, awesome-looking diagrams, however I think there's a small error in the Bulldozer FPU. Bulldozer only has 1 IMAC unit, and it's located in pipe 0, which it shares with an FMA unit and a convert unit. The two integer units in pipe 2 and pipe 3 are for vector integer ADD (multiplication is done in the IMAC unit) as well as AVX, SSE and x87 instructions that are not handled by the FMA, CVT, or XBAR units.

  4. #1454
    Xtreme Member
    Join Date
    Apr 2007
    Location
    Serbia
    Posts
    102
    You are right. I will correct that.
    There are no 256-bit integer AVX instructions (256-bit int. AVX comes with AVX2) and integer FMA is handled by FP pipe 0.

    According to the Optimisation Manual:
    A 128-bit integer multiply accumulate (IMAC) unit is incorporated into FPU pipe 0. The IMAC
    performs integer fused multiply and accumulate, and similar arithmetic operations on AVX, MMX
    and SSE data. A crossbar (XBAR) unit is integrated into FPU pipe 1 to execute the permute
    instruction along with shifts, packs/unpacks and shuffles. There is an FPU load-store unit which
    supports up to two 128-bit loads and one 128-bit store per cycle.
    All of them can execute integer SIMD instructions. Pipe 0 performs integer fused multiply-accumulate, and pipe 1 executes shuffles, FSTORE, and integer SIMD operations.

    There are four units. According to the instruction latencies table, the CVT unit is in pipe 0, which it shares with FMA0 (128-bit FP block); pipe 1 holds the XBAR unit, which is responsible for shuffle operations and is shared with FMA1; pipe 2 and pipe 3 are integer SIMD units, which also handle some bitwise or logic operations.
    E.g. ANDNPS and ANDNPD are FP bitwise operations, but they are handled by pipe 2 and pipe 3. That type of instruction has the same throughput on 10h, but there it shares the FADD or FMUL pipe.
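To keep the corrected layout straight, the pipe assignment described in the last few posts can be written down as a small lookup table. This is only a sketch of what was said in this thread; the names `FPU_PIPES`, `INT_SIMD` and `pipes_for` are made up for illustration and are not AMD terminology.

```python
# FPU pipe layout as described in the posts above (not an official table).
FPU_PIPES = {
    0: ("FMA0", "IMAC", "CVT"),   # FP multiply-add, integer MAC, convert
    1: ("FMA1", "XBAR"),          # FP multiply-add, crossbar (shuffle/permute)
    2: ("INT_SIMD",),             # vector integer add, logic ops
    3: ("INT_SIMD",),             # vector integer add, logic ops
}


def pipes_for(unit):
    """Return the pipe numbers that contain the given unit name."""
    return [pipe for pipe, units in sorted(FPU_PIPES.items()) if unit in units]


print(pipes_for("IMAC"))      # the single IMAC sits in pipe 0
print(pipes_for("INT_SIMD"))  # integer SIMD units sit in pipes 2 and 3
```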

  5. #1455
    Xtreme Member
    Join Date
    Jan 2011
    Location
    145.21.4.???
    Posts
    319
    The thread's becoming more & more technical. I still wonder whether Bulldozer could use all four FPU units (2x 128-bit FMAC + 2x 128-bit MMX) to run SuperPi.

  6. #1456
    Xtreme Member
    Join Date
    Apr 2007
    Location
    Serbia
    Posts
    102
    SuperPi consists of about 50% legacy x87 FP operations. The average IPC of SPi is 0.65-0.7 on the 10h microarchitecture. SPi is a mixed type of code, and it is very memory dependent, because there are a lot of memory stack operations. In general, FPU throughput isn't the bottleneck for executing SPi. x87 execution of SPi is limited by the inefficiency of the 10h memory subsystem (LS units -> L1D -> L2 -> L3 caches). I think SPi could run much better on BD, but still significantly slower than on Sandy or even Nehalem.
    In general, SPi isn't code optimised for a modern parallel SIMD architecture. There could also be problems with unaligned memory access, store-to-load forwarding, and data dependencies that force the code to run serialized. That is probably the main reason why the Core architecture is so superior when running such unoptimized code.

  7. #1457
    Xtreme Addict
    Join Date
    Dec 2007
    Location
    Hungary (EU)
    Posts
    1,376
    Thanks, so there is no newer information about AGLUs.

    My speculation: if the AGLUs can handle any ALU operations at all, then they must at least support normal, zero-extended and sign-extended register copies, so the instruction table is inaccurate. Additions must also be handled by them and, with slightly more complexity, the SUB, NEG, INC, DEC and CMP operations (not the fused compares, just the standalone ones). The logical NOTs, ANDs, ORs, XORs and (non-fused) TESTs also require only a few more simple circuits.

    And exactly these operations are performed by the double-pumped fast ALUs in Netburst, at a rate of 4/cycle.

  8. #1458
    Xtreme Member
    Join Date
    Apr 2007
    Location
    Serbia
    Posts
    102
    Quote Originally Posted by Oliverda
    Thanks, so there is no newer information about AGLUs.

    My speculation: if the AGLUs can handle any ALU operations at all, then they must at least support normal, zero-extended and sign-extended register copies, so the instruction table is inaccurate. Additions must also be handled by them and, with slightly more complexity, the SUB, NEG, INC, DEC and CMP operations (not the fused compares, just the standalone ones). The logical NOTs, ANDs, ORs, XORs and (non-fused) TESTs also require only a few more simple circuits.

    And exactly these operations are performed by the double-pumped fast ALUs in Netburst, at a rate of 4/cycle.
    Maybe. Some people on the Net have speculated that the simple ALU, or AGLU, on BD may handle instructions like the 30-year-old 6502 did.

    However, mov, push and call are the most frequent instructions in x86 machine code. SUB, NEG, INC, DEC and CMP are also used often, so the AGLU unit could be very useful.

    Attachment 118788

  9. #1459
    Xtreme Addict
    Join Date
    Dec 2007
    Location
    Hungary (EU)
    Posts
    1,376
    Quote Originally Posted by drfedja
    Maybe. Some people on the Net have speculated that the simple ALU, or AGLU, on BD may handle instructions like the 30-year-old 6502 did.

    However, mov, push and call are the most frequent instructions in x86 machine code. SUB, NEG, INC, DEC and CMP are also used often, so the AGLU unit could be very useful.

    I think the MOV section of the picture covers simple loads and stores (definitely work for the AGUs) and register copies. Since PUSH and POP instructions have already been reduced to single store and load operations at the execution level by K10's Stack Engine, they had to look for another instruction group to speed up (the concept I mentioned could raise general integer execution speed from K10's 3 to 4 ALU operations/cycle/thread). Many conditional jumps (je, jne and others) will be fused with the preceding TEST/CMP instruction, so the add-like and logical ALU instructions I listed would let the AGLUs cover another 10-12% of the picture.
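The "3 to 4 ALU operations/cycle" figure and the Netburst comparison are just unit counts multiplied out. A rough Python sketch of that arithmetic follows; `peak_alu_ops_per_cycle` is a made-up helper, the numbers are peak rates for simple operations only, and the Bulldozer line assumes the speculation in this thread (AGLUs accepting simple ALU ops) is true.

```python
def peak_alu_ops_per_cycle(full_alus, extra_simple_alus=0, pump_factor=1):
    """Peak simple-ALU ops per cycle: unit count times ops per unit per cycle.

    Hypothetical helper for illustration; ignores decode, scheduling and
    dependency limits entirely.
    """
    return (full_alus + extra_simple_alus) * pump_factor


# K10: three ALUs.
k10 = peak_alu_ops_per_cycle(full_alus=3)

# Bulldozer core: two EX units plus two AGLUs, IF the AGLUs really can
# take simple ALU operations (speculation, not confirmed by the manual).
bd_speculative = peak_alu_ops_per_cycle(full_alus=2, extra_simple_alus=2)

# Netburst: two fast ALUs, double-pumped (two ops per cycle each).
netburst_fast = peak_alu_ops_per_cycle(full_alus=2, pump_factor=2)

print(k10, bd_speculative, netburst_fast)
```

Without the AGLUs contributing, the same count for a Bulldozer core would be just the two EX units.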

  10. #1460
    Xtreme Addict
    Join Date
    Aug 2004
    Location
    Sweden
    Posts
    2,084

  11. #1461
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    Good idea for CPUs that are ONLY expensive, since people buying such chips can't really do anything about it. But for chips in the $300-400 range it's going to deter people who don't need the nice heatsink, unless the regular versions are also available at a discount.
    2500k @ 4900mhz - Asus Maxiums IV Gene Z - Swiftech Apogee LP
    GTX 680 @ +170 (1267mhz) / +300 (3305mhz) - EK 680 FC EN/Acteal
    Swiftech MCR320 Drive @ 1300rpms - 3x GT 1850s @ 1150rpms
    XS Build Log for: My Latest Custom Case

  12. #1462
    Devil kept pokin'
    Join Date
    Jan 2010
    Location
    South Kakalaky
    Posts
    1,299
    Quote Originally Posted by Mats
    AMD Considers Equipping FX Chips with Liquid-Cooling Solution
    http://www.xbitlabs.com/news/coolers..._Solution.html
    Intel Considers to Bundle Liquid Cooling Solution with Next-Generation Enthusiast Processors
    http://www.xbitlabs.com/news/coolers...rocessors.html

  13. #1463
    Xtreme Cruncher
    Join Date
    Jul 2006
    Posts
    1,374
    That is an interesting concept, but then they have to deal with the potential for RMAs. Having a fan fail on an inexpensive heatsink isn't the same as, or as likely as, having the pump fail on a water-cooling unit. Kind of a cool idea, but I'd be afraid it would come back to haunt them in the long term.

  14. #1464
    Xtreme Member
    Join Date
    Apr 2005
    Location
    London, UK
    Posts
    261
    I had the pump on a Coolit sealed water system pop. It took out the whole system except the CPU (5870, AM3 motherboard). I would imagine that if AMD went this way, they would use one of those sealed systems, as they are the cheapest. If an air cooler fails, you get a system shutdown or freeze; if a water cooler fails, you get quite a bit of damage.

  15. #1465
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    all alienware PCs come with these now, so i bet they are reliable enough to use with other products too

  16. #1466
    Registered User
    Join Date
    Dec 2010
    Location
    127.0.0.1
    Posts
    71
    would AMD manufacture them or farm them out?

  17. #1467
    Xtreme 3D Team
    Join Date
    Jan 2009
    Location
    Ohio
    Posts
    8,499
    Quote Originally Posted by thematrixhazune
    would AMD manufacture them or farm them out?
    Farm them out. AMD doesn't even manufacture their own CPUs.... :)

  18. #1468
    I am Xtreme FlanK3r's Avatar
    Join Date
    May 2008
    Location
    Czech republic
    Posts
    6,823
    I think it would be a good idea: the top-model CPU bundled with a better air cooler (tower type) or a low-end water-cooling setup such as the H50 or H70....
    ROG Power PCs - Intel and AMD
    CPUs:i9-7900X, i9-9900K, i7-6950X, i7-5960X, i7-8086K, i7-8700K, 4x i7-7700K, i3-7350K, 2x i7-6700K, i5-6600K, R7-2700X, 4x R5 2600X, R5 2400G, R3 1200, R7-1800X, R7-1700X, 3x AMD FX-9590, 1x AMD FX-9370, 4x AMD FX-8350,1x AMD FX-8320,1x AMD FX-8300, 2x AMD FX-6300,2x AMD FX-4300, 3x AMD FX-8150, 2x AMD FX-8120 125 and 95W, AMD X2 555 BE, AMD x4 965 BE C2 and C3, AMD X4 970 BE, AMD x4 975 BE, AMD x4 980 BE, AMD X6 1090T BE, AMD X6 1100T BE, A10-7870K, Athlon 845, Athlon 860K,AMD A10-7850K, AMD A10-6800K, A8-6600K, 2x AMD A10-5800K, AMD A10-5600K, AMD A8-3850, AMD A8-3870K, 2x AMD A64 3000+, AMD 64+ X2 4600+ EE, Intel i7-980X, Intel i7-2600K, Intel i7-3770K,2x i7-4770K, Intel i7-3930KAMD Cinebench R10 challenge AMD Cinebench R15 thread Intel Cinebench R15 thread

  19. #1469
    Xtreme Addict
    Join Date
    Jun 2002
    Location
    Ontario, Canada
    Posts
    1,782
    http://www.amdzone.com/phpbb3/viewto...rt=900#p209349

    Starting to look like BD is a total dud. Sorry, it hurts me to say this, but I believe OBR is correct. I just hope I didn't waste $230 CDN on a Crosshair V mobo.

    BD = DNF of processors. ???
    As quoted by LowRun......"So, we are one week past AMD's worst case scenario for BD's availability but they don't feel like communicating about the delay, I suppose AMD must be removed from the reliable sources list for AMD's products launch dates"

  20. #1470
    I am Xtreme
    Join Date
    Aug 2008
    Posts
    5,586
    nice i'll see lots of highend boards going for cheap


  21. #1471
    Xtreme Enthusiast
    Join Date
    Jul 2004
    Location
    London
    Posts
    577
    I'll wait till the end of October before I slowly start losing faith in BD. I hope that day never comes.
    i7 920@4.34 | Rampage II GENE | 6GB OCZ Reaper 1866 | 8800GT (zzz) | Corsair AX750 | Xonar Essence ST w/ 3x LME49720 | HiFiMAN EF2 Amplifier | Shure SRH840 | EK Supreme HF | Thermochill PA 120.3 | MCP355 | XSPC Reservoir | 3/8" ID Tubing

    Phenom 9950BE @ 3400/2000 (CPU/NB) | Gigabyte MA790GP-DS4H | HD4850 | 4GB Corsair DHX @850 | Corsair TX650W | T.R.U.E Push-Pull

    E2160 @3.06 | ASUS P5K-Pro | BFG 8800GT | 4GB G.Skill @ 1040 | 600W Tt PP

    A64 3000+ @2.87 | DFI-NF4 | 7800 GTX | Patriot 1GB DDR @610 | 550W FSP

  22. #1472
    Xtreme Member
    Join Date
    Apr 2005
    Location
    London, UK
    Posts
    261
    Quote Originally Posted by freeloader
    http://www.amdzone.com/phpbb3/viewto...rt=900#p209349

    Starting to look like BD is a total dud. Sorry, it hurts me to say this, but I believe OBR is correct. I just hope I didn't waste $230 CDN on a Crosshair V mobo.

    BD = DNF of processors. ???
    If it's true, it means there is a problem with yields. It does not mean that BD has bad performance of some sort.

  23. #1473
    Xtreme Mentor
    Join Date
    May 2008
    Location
    cleveland ohio
    Posts
    2,879
    Quote Originally Posted by freeloader
    http://www.amdzone.com/phpbb3/viewto...rt=900#p209349

    Starting to look like BD is a total dud. Sorry, it hurts me to say this, but I believe OBR is correct. I just hope I didn't waste $230 CDN on a Crosshair V mobo.

    BD = DNF of processors. ???
    If OBR is correct, I'm magical.
    HAVE NO FEAR!
    "AMD fallen angel"
    Quote Originally Posted by Gamekiller
    You didn't get the memo? 1 hour 'Fugger time' is equal to 12 hours of regular time.

  24. #1474
    Xtreme Member
    Join Date
    Jan 2011
    Location
    145.21.4.???
    Posts
    319
    Quote Originally Posted by muziqaz
    If it's true, it means there is a problem with yields.
    It's a capacity problem, which occurred with the old 90nm X2.

  25. #1475
    Xtreme Member
    Join Date
    Apr 2005
    Location
    London, UK
    Posts
    261
    Quote Originally Posted by undone
    It's a capacity problem, which occurred with the old 90nm X2.
    I think everyone, especially AMD, would be happy to sell as much as they did in the 90nm days.
