Thread: AMD embraces AVX making a new superset with SSE5(256bit support)

  1. #176
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Thanks ajaidev, 2x128 is simpler at least. There are a lot of things around the AVX spec that AMD must do in order to get full support. The good thing is that there is plenty of time until the BD core tapes out.

    BTW, there are some whispers that RevD *may* have SSE4.x (or SSSE3) support. RevD should be Istanbul. But at this point it's only a rumor.

  2. #177
    Xtreme Member
    Join Date
    Sep 2008
    Posts
    235
    Quote Originally Posted by informal View Post
    Thanks ajaidev, 2x128 is simpler at least. There are a lot of things around the AVX spec that AMD must do in order to get full support. The good thing is that there is plenty of time until the BD core tapes out.

    BTW, there are some whispers that RevD *may* have SSE4.x (or SSSE3) support. RevD should be Istanbul. But at this point it's only a rumor.
    I see that Intel publicly stated that it would be 256 bits wide.
    This is from a while ago though.

    http://www.cse.scitech.ac.uk/disco/m..._BillHoran.pdf


    Regards, Hans

  3. #178
    Xtreme Addict
    Join Date
    Nov 2005
    Posts
    1,084
    Shintai, AMD says that it intends to support AVX. You are just hoping and predicting that AMD will fail at that.
    Don't do that, or more and more signatures will carry your failed predictions.
    Quote Originally Posted by Shintai View Post
    And AMD is only a CPU manufacturer due to stolen technology and making clones.

  4. #179
    Xtreme Member
    Join Date
    Nov 2006
    Posts
    324
    I would go further and put it this way: AMD intends to support AVX because it intends to introduce Bulldozer in the future (presumably in 2011), and Bulldozer will support AVX as well.
    Windows 8.1
    Asus M4A87TD EVO + Phenom II X6 1055T @ 3900MHz + HD3850
    APUs

  5. #180
    Xtreme Addict
    Join Date
    Dec 2002
    Location
    Sweden
    Posts
    1,261
    Quote Originally Posted by v_rr View Post
    Don't do that or more and more signatures get your failed predictions
    Isn't that a good way to become "famous"? Like Nostradamus.

  6. #181
    Xtreme Member
    Join Date
    Sep 2008
    Posts
    235
    Quote Originally Posted by Hans de Vries View Post
    I see that Intel publicly stated that it would be 256 bits wide.
    This is from a while ago though.

    http://www.cse.scitech.ac.uk/disco/m..._BillHoran.pdf
    This document from spring 2008 already shows that FMA is post Sandy Bridge
    (see page 23 and on)

    Regards, Hans

  7. #182
    Xtreme Mentor
    Join Date
    Jul 2008
    Location
    Shimla , India
    Posts
    2,631
    Quote Originally Posted by Hans de Vries View Post
    This document from spring 2008 already shows that FMA is post Sandy Bridge
    (see page 23 and on)

    Regards, Hans
    That has changed now: AVX and FMA are considered separate. AVX will come in SB, but FMA will come in its shrink. It will be 256 bits wide, but using the 2x128-bit route it is easy to implement.

  8. #183
    I am Xtreme
    Join Date
    Jul 2004
    Location
    Little Rock
    Posts
    7,204
    Why not just get the updated info from Intel?

    http://software.intel.com/en-us/avx/

    Intel® AVX (Intel® Advanced Vector Extensions) is a 256 bit instruction set extension to SSE and is designed for applications that are floating point intensive.
    Let's see, as the architecture improves, shouldn't Intel update AVX accordingly? If that causes AMD heartburn, then that means they have to work harder. Or should Intel hold back improvements to help AMD? Intel has always improved its SIMD extensions and, hell, I hope they continue to.

    http://software.intel.com/en-us/foru...s/topic/61121/

    Quote Originally Posted by Igor
    I have heard that Sandy Bridge won't have FMA implementation.

    If that rumor is true, I would really like to know who decided that x86 developers should wait longer to finally get a fused multiply-add instruction? Is it so useless in real code, or has the marketing department again started doing an engineer's job?

    Hereby I publicly voice my displeasure over that poor decision.
    Quote Originally Posted by Intel Engineering team:
    Hi Igor,

    Sandy Bridge will not have FMA, it's targeted for a future processor. I apologize if there is any confusion I (or Intel) caused. In our defense, we did discuss feature timing in the last two Intel developer forums (and now to my embarrassment, I see that presentation has been removed from the IDF content catalog at http://www.intel.com/idf , we'll have it up in time for the upcoming IDF on Oct 20). And it's on a separate CPUID feature flag (separate section of the document too ) in the programming reference.
    Yes, it changed alright ajaidev
    Last edited by Donnie27; 05-08-2009 at 07:56 PM. Reason: forgot the second link
    Quote Originally Posted by Movieman
    With the two approaches to "how" to design a processor WE are the lucky ones as we get to choose what is important to us as individuals.
    For that we should thank BOTH (AMD and Intel) companies!


    Posted by duploxxx
    I am sure JF is relaxed and smiling these days with there intended launch schedule. SNB Xeon servers on the other hand....
    Posted by gallag
    there yo go bringing intel into a amd thread again lol, if that was someone droping a dig at amd you would be crying like a girl.
    qft!

  9. #184
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by Donnie27 View Post
    Why not just get the updated info from Intel?

    http://software.intel.com/en-us/avx/

    Let's see, as the architecture improves, shouldn't Intel update AVX accordingly? If that causes AMD heartburn, then that means they have to work harder. Or should Intel hold back improvements to help AMD? Intel has always improved its SIMD extensions and, hell, I hope they continue to.

    http://software.intel.com/en-us/foru...s/topic/61121/

    Yes, it changed alright ajaidev
    Donnie, everything you wrote has already been posted a dozen times in this very thread (in other words, nothing new). 256-bit registers are a *must* for AVX; there is no *if* there. What can differ is the way these are organized (i.e. 2x128). This is what we do not know about AMD's approach.

    As for AVX spec changes, the last change was the FMA change: Intel went with the less powerful FMA approach, while AMD used Intel's previous implementation, called FMA4. This means that even though FMA will be in BD cores, it won't be compatible with FMA3 in the post-Sandy Bridge CPU (IvyB). AVX is separate from FMA, and if Intel doesn't make some radical changes to it, the BD cores will be compatible with SandyB.
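
Since AVX, FMA3 and FMA4 are each reported through their own CPUID feature bit, software that wants to run on both vendors has to test them separately. A minimal sketch of that decision in C follows. It assumes the caller has already read ECX from CPUID leaf 1 and from leaf 0x80000001 by whatever means the compiler offers; the names `choose_path` and `simd_path` are made up for illustration, and the OS-support check (OSXSAVE/XGETBV) that real code also needs is left out.

```c
#include <stdint.h>

/* Feature bits as listed in the Intel and AMD CPUID documentation. */
#define CPUID1_ECX_FMA3  (1u << 12)  /* leaf 1, ECX bit 12: three-operand FMA */
#define CPUID1_ECX_AVX   (1u << 28)  /* leaf 1, ECX bit 28: AVX */
#define CPUID8_ECX_XOP   (1u << 11)  /* leaf 0x80000001, ECX bit 11: XOP */
#define CPUID8_ECX_FMA4  (1u << 16)  /* leaf 0x80000001, ECX bit 16: FMA4 */

/* Hypothetical code paths an application might ship. */
typedef enum { PATH_SSE, PATH_AVX, PATH_AVX_FMA3, PATH_AVX_FMA4 } simd_path;

/* Pick a code path from raw CPUID register values.
   Assumption: the OS-enabled-YMM-state check has been done elsewhere. */
static simd_path choose_path(uint32_t leaf1_ecx, uint32_t ext_leaf_ecx)
{
    if (!(leaf1_ecx & CPUID1_ECX_AVX))
        return PATH_SSE;              /* both FMA flavours are VEX-encoded, so no AVX means no FMA */
    if (leaf1_ecx & CPUID1_ECX_FMA3)
        return PATH_AVX_FMA3;
    if (ext_leaf_ecx & CPUID8_ECX_FMA4)
        return PATH_AVX_FMA4;
    return PATH_AVX;                  /* AVX without any FMA, as expected for Sandy Bridge */
}
```

The order of the FMA3 and FMA4 tests is arbitrary here; a part that sets neither bit simply falls through to the plain AVX path.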

  10. #185
    D.F.I Pimp Daddy
    Join Date
    Jan 2007
    Location
    Still Lost At The Dead Show Parking Lot
    Posts
    5,182
    Quote Originally Posted by v_rr View Post
    Have a nice day:
    AMD64 Architecture
    Programmer’s Manual
    Volume 6:
    128-Bit and 256-Bit
    XOP, FMA4 and CVT16
    Instructions

    http://support.amd.com/us/Processor_TechDocs/43479.pdf


    Hey thanks I will take a look at that
    SuperMicro X8SAX
    Xeon 5620
    12GB - Crucial ECC DDR3 1333
    Intel 520 180GB Cherryville
    Areca 1231ML ~ 2~ 250GB Seagate ES.2 ~ Raid 0 ~ 4~ Hitachi 5K3000 2TB ~ Raid 6 ~

  11. #186
    Xtreme Addict
    Join Date
    May 2007
    Location
    Europe/Slovenia/Ljubljana
    Posts
    1,540
    Well, whatever they implement, i hope it'll be put to a good use.
    Intel Core i7 920 4 GHz | 18 GB DDR3 1600 MHz | ASUS Rampage II Gene | GIGABYTE HD7950 3GB WindForce 3X | WD Caviar Black 2TB | Creative Sound Blaster Z | Altec Lansing MX5021 | Corsair HX750 | Lian Li PC-V354
    Super silent cooling powered by (((Noiseblocker)))

  12. #187
    Xtreme Member
    Join Date
    Sep 2008
    Posts
    235
    Quote Originally Posted by ajaidev View Post
    That has changed now: AVX and FMA are considered separate. AVX will come in SB, but FMA will come in its shrink. It will be 256 bits wide, but using the 2x128-bit route it is easy to implement.
    Quote Originally Posted by Donnie27 View Post
    Quote Originally Posted by Intel Engineering Team
    Hi Igor,

    Sandy Bridge will not have FMA, it's targeted for a future processor. I apologize if there is any confusion I (or Intel) caused. In our defense, we did discuss feature timing in the last two Intel developer forums (and now to my embarrassment, I see that presentation has been removed from the IDF content catalog at http://www.intel.com/idf , we'll have it up in time for the upcoming IDF on Oct 20). And it's on a separate CPUID feature flag (separate section of the document too ) in the programming reference.
    Yes, it changed alright ajaidev

    What I meant (and what Intel says here) is that AVX and FMA never were
    supposed to be implemented at the same time. FMA was always targeted
    for a later processor.


    Regards, Hans

  13. #188
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    More on the "AVX in Bulldozer" topic: Agner's question about the SSE4.1 instructions that are now missing from Volume 6 (AMD developer forums):
    http://forums.amd.com/devforum/messa...&enterthread=y

    Quote Originally Posted by Agner
    I have noticed that those instructions that SSE5 shared with SSE4.1 are missing in the new manual. Does this mean that all SSE4.1 instructions will be covered in the next AMD processor and hence don't need to be categorized as XOP instructions?
    Response from Mr Christie:
    Quote Originally Posted by D. Christie,AMD senior fellow
    Yes -- see my response to your question on my blog entry, if you haven't already.
    This clearly means that SSE4.1 will be included with AVX, and hence there is no need to list it in Volume 6, which covers the additional XOP extensions that AMD engineered themselves. There you go Shintai, AMD won't list ANY of the instructions included in AVX among their own XOP extensions because they do not belong there at all. XOP is AMD's baby and they only posted the XOP extensions manual. AVX support is there and will not be covered in Volume 6 (nor will any of SSE4.x or SSSE3).

    I have posted the blog response a number of times already, so there is no need to re-post it. Christie basically said AMD will support all of the post-SSE3 instruction set extensions in Bulldozer. Only FMA will be an exception, but SandyB will not feature any support for it anyway, so it's a non-issue, so to speak. FMA3 support will come in the BD shrink, since AMD wants to be sure that the specification is finally stable and not prone to last-minute changes.
    Last edited by informal; 05-09-2009 at 07:39 AM.

  14. #189
    Xtreme Mentor
    Join Date
    Jul 2008
    Location
    Shimla , India
    Posts
    2,631
    Quote Originally Posted by Hans de Vries View Post
    What I meant (and what Intel says here) is that AVX and FMA never were
    supposed to be implemented at the same time. FMA was always targeted
    for a later processor.


    Regards, Hans
    Not really. FMA was supposed to go with SB but it did not. The really sad part is that even FMA4 might not work with SB; that is very sad indeed.

    I don't much care for FMA3 because of its destructive nature.

    BTW, the really nice thing about these next-gen processors is the number of operands they can handle. IMO, 4 operands at a time is really the big news.

  15. #190
    I am Xtreme
    Join Date
    Jul 2004
    Location
    Little Rock
    Posts
    7,204
    Quote Originally Posted by informal View Post
    Donnie, everything you wrote has already been posted a dozen times in this very thread (in other words, nothing new). 256-bit registers are a *must* for AVX; there is no *if* there. What can differ is the way these are organized (i.e. 2x128). This is what we do not know about AMD's approach. "and a lot of other things you do not KNOW"

    As for AVX spec changes, the last change was the FMA change: Intel went with the less powerful FMA approach, while AMD used Intel's previous implementation, called FMA4. This means that even though FMA will be in BD cores, it won't be compatible with FMA3 in the post-Sandy Bridge CPU (IvyB). AVX is separate from FMA, and if Intel doesn't make some radical changes to it, the BD cores will be compatible with SandyB.
    All this, why? Because I asked if Intel should be responsible for AMD keeping up with improvements to their software. A simple yes or no answer instead of more PR would have been nice! Your not answering the question is nothing new from you either, LOL!

    Quote Originally Posted by Hans
    What I meant (and what Intel says here) is that AVX and FMA never were supposed to be implemented at the same time. FMA was always targeted for a later processor.

    Regards, Hans
    Thanks Hans, I know what you meant, and I posted what I did to back up what was said. Why are you guys acting like I disagreed or something? See the ""? Yepp means: yes, damned straight, in fact, you got it, bet your @$$, etc. I asked why not just go straight to the source? The other questions still go unanswered meanwhile!

    Thanks Hans, love your site, was one of the first ones here to link to it BTW!
    Last edited by Donnie27; 05-11-2009 at 06:33 PM.

  16. #191
    Registered User
    Join Date
    Jan 2005
    Posts
    41
    I fail to see how FMA3 is much worse than FMA4. The "destructive" nature means that if you want to preserve the initial value, you store it somewhere before the FMA op and retrieve it later; there are probably many ways to cope with it. The big thing is doing the FMA op in any form, IMO.
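
To make the "destructive" point concrete, here is a minimal scalar sketch in C. It assumes the usual meaning of fused multiply-add (a*b + c), with the three-operand form overwriting one of its inputs and the four-operand form writing to a separate destination. The helper names are made up for illustration, and plain C arithmetic stands in for the real instructions, so the single-rounding property of a true FMA is not modelled.

```c
/* FMA3 style: three operands, the destination doubles as a source. */
static void fma3_style(float *dst_and_src, float b, float c)
{
    *dst_and_src = *dst_and_src * b + c;   /* old value of *dst_and_src is lost */
}

/* FMA4 style: four operands, all three inputs survive. */
static void fma4_style(float *dst, float a, float b, float c)
{
    *dst = a * b + c;
}

/* Keeping the input alive with the FMA3 form costs one extra copy. */
static float fma3_preserving(const float *x, float b, float c)
{
    float tmp = *x;            /* the extra register-to-register move */
    fma3_style(&tmp, b, c);
    return tmp;                /* *x is untouched */
}
```

When the overwritten operand is an accumulator that would be replaced anyway, the copy in `fma3_preserving` is never needed and the two forms do the same work.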

  17. #192
    Xtreme Mentor
    Join Date
    Jul 2008
    Location
    Shimla , India
    Posts
    2,631
    Quote Originally Posted by asmodean View Post
    I fail to see how FMA3 is much worse than FMA4. The "destructive" nature means that if you want to preserve the initial value, you store it somewhere before the FMA op and retrieve it later; there are probably many ways to cope with it. The big thing is doing the FMA op in any form, IMO.
    OK, let's see: when all the inputs have to survive, FMA3 needs an extra copy plus the FMA, while FMA4 does the same job in a single instruction.

    FMA3 --> a = a*b + c (the destination is also one of the sources)
    FMA4 --> a = b*c + d (separate destination)

    b=2
    c=3
    d=4

    FMA3 --> a = b //First, a copy so b is not overwritten
    a = 2

    a = a*c + d //Second
    a = 10

    FMA4 --> a = 2*3 + 4
    a = 10

    FMA3 takes two instructions here, but each instruction is shorter to encode.
    FMA4 takes one instruction, but that instruction is longer to encode.

    Think of it as MPEG4 Part 10 and XVID

  18. #193
    Xtreme Cruncher
    Join Date
    Apr 2005
    Location
    TX, USA
    Posts
    898
    Quote Originally Posted by asmodean View Post
    I fail to see how FMA3 is much worse than FMA4. The "destructive" nature means that if you want to preserve the initial value, you store it somewhere before the FMA op and retrieve it later; there are probably many ways to cope with it. The big thing is doing the FMA op in any form, IMO.
    Agreed, just having the FMA op in any form is better than none.

    If you're implementing something like an FIR filter (yay for time-domain convolution), then the destructive form is what you'd write anyway. It's a step up from the dot product op in SSE4.1 if you're dealing with more than 4 (8 if extended to AVX's 256-bit) elements, as the FMA handles the accumulation until the end, when just one horizontal add is needed.

    Code:
    //Basic FIR Filter (one output sample)
    int taps = 8 * some_constant;   //multiple of 8: one 256-bit vector per 8 taps
    float Input[taps], System_Response[taps];
    float Output = 0.0f;

    for(int n = 0; n < taps; n++) {
       //multiply-accumulate: Output is both addend and destination
       Output = Input[n] * System_Response[n] + Output;
    }
    I'm too lazy to convert to the equivalent assembly, but the point should come across (one 256-bit FMA3 and 2 MOV instructions per 8 taps, plus one HADD at the end).

    It's all situation-dependent...
    No benefit from FMA4 => Worse code-density than FMA3
    Benefit from FMA4 => Better code-density/performance than FMA3
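
The accumulation pattern described above can be sketched in portable C without intrinsics. This is only an emulation under stated assumptions: an 8-element array stands in for one 256-bit accumulator register, the inner loop stands in for a single vector FMA, and the function name `fir_dot_8lane` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 8   /* single-precision floats per 256-bit register */

/* One FIR output sample; taps must be a multiple of LANES. */
static float fir_dot_8lane(const float *input, const float *response, size_t taps)
{
    float acc[LANES] = {0};   /* stands in for the 256-bit accumulator */
    float out = 0.0f;

    assert(taps % LANES == 0);

    for (size_t n = 0; n < taps; n += LANES) {
        /* One vector FMA would do all 8 lanes at once: acc = in * h + acc.
           The accumulator is overwritten on purpose, so the destructive
           three-operand form loses nothing here. */
        for (size_t l = 0; l < LANES; l++)
            acc[l] = input[n + l] * response[n + l] + acc[l];
    }

    /* The horizontal add happens once, after the last block of taps. */
    for (size_t l = 0; l < LANES; l++)
        out += acc[l];
    return out;
}
```

Keeping eight partial sums and reducing them only at the end is what lets the loop body stay a pure multiply-accumulate.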
    Last edited by rcofell; 05-10-2009 at 07:11 AM.



  19. #194
    Registered User
    Join Date
    Apr 2007
    Location
    Bangkok, Thailand
    Posts
    55

  20. #195
    Xtreme Addict
    Join Date
    Apr 2008
    Location
    Texas
    Posts
    1,663
    Quote Originally Posted by informal View Post
    Thanks ajaidev, 2x128 is simpler at least. There are a lot of things around the AVX spec that AMD must do in order to get full support. The good thing is that there is plenty of time until the BD core tapes out.

    BTW, there are some whispers that RevD *may* have SSE4.x (or SSSE3) support. RevD should be Istanbul. But at this point it's only a rumor.
    Didn't happen.
    Core i7 2600K@4.6Ghz| 16GB G.Skill@2133Mhz 9-11-10-28-38 1.65v| ASUS P8Z77-V PRO | Corsair 750i PSU | ASUS GTX 980 OC | Xonar DSX | Samsung 840 Pro 128GB |A bunch of HDDs and terabytes | Oculus Rift w/ touch | ASUS 24" 144Hz G-sync monitor

    Quote Originally Posted by phelan1777 View Post
    Hail fellow warrior albeit a surat Mercenary. I Hail to you from the Clans, Ghost Bear that is (Yes freebirth we still do and shall always view mercenaries with great disdain!) I have long been an honorable warrior of the mighty Warden Clan Ghost Bear the honorable Bekker surname. I salute your tenacity to show your freebirth sibkin their ignorance!

  21. #196
    Live Long And Overclock
    Join Date
    Sep 2004
    Posts
    14,058
    Quote Originally Posted by Mechromancer View Post
    Didn't happen.
    Do you think they'll add it in with the Q1 2010 update?

    Perkam

  22. #197
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    At this point, the earliest date looks to be the Bulldozer release date.

  23. #198
    Xtreme Addict
    Join Date
    Apr 2008
    Location
    Texas
    Posts
    1,663
    Quote Originally Posted by perkam View Post
    Do you think they'll add it in with the Q1 2010 update?

    Perkam
    Heck, I missed the news on the Q1 2010 update. Got a link?

  24. #199
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by Mechromancer View Post
    Heck, I missed the news on the Q1 2010 update. Got a link?
    It's the Magny-Cours/Sao Paulo launch.
