Thread: AMD Cayman info (or rumor)

  1. #10
    Registered User, Join Date: Oct 2009, Posts: 1
    Originally posted by Macadamia:
    Hm? How does logic go up? Logic should go down (minorly, 10-15% die size would be fortunate) and perf-logic ratio should get boosted just slightly less.
    That ~10-15% number was also my estimate. Of course, it depends on the actual code being run. It will be possible to construct cases where Cayman is slower than Cypress (unless Cayman has significantly more than 1920 SPs). But in general, it will gain the most in situations where the VLIW5 architecture fared worst in comparison to NVIDIA.
    Originally posted by Macadamia:
    Unless it's WT-XT-YT-ZT which might be what you mean. That'd be quite an increase actually, but the ratio's way overdone.

    I'm thinking of WXYZ + logic in the SIMD blocks that allows transcendentals to be performed through looping or such.
    Actually, it is already known how the VLIW4 units will be organized. The code path for that architecture has been functional in the driver since Catalyst 10.4; I posted some details about it over at B3D 10 days ago.

    Transcendental functions are computed by the x, y and z units working together (much like double precision is already done today, except that it takes 3 slots), so a transcendental occupies 3 of the 4 slots of the VLIW unit. The fourth slot (w) does not take part and is still free for another instruction in the same cycle. That means a good part of the t unit was split into three parts and distributed to the x, y and z units.
    Another job of the t unit was format conversion and rounding. This functionality has been replicated in all subunits, so Cayman will fly for this kind of stuff.
    24-bit integer arithmetic is now fully supported by Cayman and can be done in all 4 slots (Evergreen had only partial support, which was not really used).
    A 32-bit integer multiplication will unfortunately block all 4 slots (on Evergreen it could be done by the t unit, leaving the xyzw slots free for other instructions), but this is probably the price to pay for the transistor savings from the change.
    All other integer instructions can still be done in all 4 slots (as before).
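
    The slot costs described above can be put into a small Python sketch. This is only a toy model of what is stated in this post: the instruction class names and the helper fits_in_bundle are made up for illustration, and it ignores data dependencies and any other packing restrictions the real compiler has to respect.

    ```python
    # Slots occupied per instruction class in one 4-wide (x, y, z, w) VLIW bundle
    # on Cayman, as described in the post. Class names are hypothetical labels.
    CAYMAN_SLOTS = {
        "sp_alu": 1,          # ordinary single-precision op, any slot
        "transcendental": 3,  # x, y and z work together; w stays free
        "convert_round": 1,   # conversions/rounding, replicated to all subunits
        "int24": 1,           # 24-bit integer arithmetic, any slot
        "int32_mul": 4,       # 32-bit integer multiply blocks the whole bundle
        "int_other": 1,       # all other integer instructions, any slot
    }

    VLIW_WIDTH = 4  # x, y, z, w


    def fits_in_bundle(ops):
        """Return True if the slot costs of ops sum to at most VLIW_WIDTH."""
        return sum(CAYMAN_SLOTS[op] for op in ops) <= VLIW_WIDTH
    ```

    For example, fits_in_bundle(["transcendental", "sp_alu"]) is True (the extra op goes into the free w slot), while fits_in_bundle(["int32_mul", "sp_alu"]) is False.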

    Double precision instructions behave the same way as in Cypress. Everything involving a multiplication (MUL, FMA) takes 4 slots, while the other stuff (like ADD and conversions) takes 2 slots. That means the DP:SP ratio is 1:4 (for MUL/FMA).
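
    The ratio follows directly from the slot counts. A minimal sketch of the arithmetic, assuming each of the 4 slots can issue one single-precision instruction per cycle (the function name dp_to_sp_ratio is made up for this example):

    ```python
    from fractions import Fraction

    VLIW_WIDTH = 4  # x, y, z, w slots per VLIW unit


    def dp_to_sp_ratio(dp_slots):
        """DP instructions per cycle divided by SP instructions per cycle.

        dp_slots is how many of the 4 slots one DP instruction occupies.
        Assumes one SP instruction per slot per cycle.
        """
        dp_per_cycle = Fraction(VLIW_WIDTH, dp_slots)
        sp_per_cycle = Fraction(VLIW_WIDTH, 1)
        return dp_per_cycle / sp_per_cycle
    ```

    With dp_slots=4 (MUL, FMA) this gives 1/4, the ratio quoted above; with dp_slots=2 (ADD, conversions) the same arithmetic gives 1/2.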
    Last edited by Gipsel; 10-29-2010 at 02:46 PM.
