Results 51 to 75 of 124

Thread: Can Llano do AVX?

  1. #51
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    nn_step, but there are specific apps that would benefit from avx right? iirc intel already plans 256bit avx in haswell...

    and about cpus... well i read that x86 instructions are broken down into actual simple math instructions which are then executed by specialized logic for those simple math instructions. originally the x86 cpus had kinda general purpose logic but they were not as efficient as breaking down x86 into basic math and having specialized math logic that then works on that... this is what i read a while ago, please correct me if im wrong
    Last edited by saaya; 04-27-2010 at 08:13 AM.

  2. #52
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    AVX is already a 256b ISA extension to the 128bit SIMD we have today.

  3. #53
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    Quote Originally Posted by informal View Post
    AVX is already a 256b ISA extension to the 128bit SIMD we have today.
    oh then its 512bit for haswell? i just remember that its going to double the width once more

  4. #54
    Banned
    Join Date
    Jul 2004
    Posts
    1,125
    Quote Originally Posted by saaya View Post
    oh then its 512bit for haswell? i just remember that its going to double the width once more
    Well, "currently" we're at 128b with officially launched parts.

    AVX starts at 256b with Intel Sandy Bridge. BD is also to support this. So that's a doubling.

    You've heard Haswell is going to 512b FP vector width?
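
    For a sense of what the widths in this exchange mean in practice, here is a quick sketch of how many floating-point elements fit in one vector register at each width. The 512b row is only the rumour being discussed here, not a confirmed Haswell feature, and `lanes` is a made-up helper name.

```python
# Element sizes in bits for the two common floating-point types.
ELEMENT_BITS = {"float32": 32, "float64": 64}


def lanes(vector_bits: int, element: str) -> int:
    """Number of elements of the given type that fit in one vector register."""
    return vector_bits // ELEMENT_BITS[element]


# 128b = SSE today, 256b = AVX; 512b is only the rumoured width from this thread.
for width in (128, 256, 512):
    print(f"{width}b: {lanes(width, 'float32')} x float32, "
          f"{lanes(width, 'float64')} x float64")
```

    Each doubling of the register width doubles the number of elements one instruction can work on, which is where the "doubling" talk comes from.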

  5. #55
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    Quote Originally Posted by terrace215 View Post
    You've heard Haswell is going to 512b FP vector width?
    I've heard something like that too. It might be based on the assumption that Haswell could support Larrabee's SIMD ISA.
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

  6. #56
    Registered User
    Join Date
    Sep 2009
    Posts
    77
    It's interesting. It seems AMD has never disclosed whether Llano will be equipped with AVX, SSE5 or other features, but the changes in Llano don't look small. Will Llano have some extra pipeline stages compared to K10?
    I guess it's quite possible that the changes are being made for AVX or SSE5.

  7. #57
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    hmmm im not sure about 512 for haswell, but i remember from idf 2008 or 2009 that haswell would double avx of sb...
    i think ive seen 512 but cant be sure, its been a while...
    and yes, there was a lot of talk about haswell having elements of lrb, some interpreted that as it having an lrb block for an igp, others as a true hybrid design...
    back then intels plan was to get game devs hooked on lrb, then merge lrb with the cpus, and to bring graphics back to the cpu that way...
    without lrb im curious what intels plans are now...

  8. #58
    YouTube Addict
    Join Date
    Aug 2005
    Location
    Klaatu barada nikto
    Posts
    17,574
    Quote Originally Posted by saaya View Post
    nn_step, but there are specific apps that would benefit from avx right? iirc intel already plans 256bit avx in haswell...

    and about cpus... well i read that x86 instructions are broken down into actual simple math instructions which are then executed by specialized logic for those simple math instructions. originally the x86 cpus had kinda general purpose logic but they were not as efficient as breaking down x86 into basic math and having specialized math logic that then works on that... this is what i read a while ago, please correct me if im wrong
    The applications that benefit the most from AVX are those that are already embarrassingly parallel. Embarrassingly parallel applications might as well skip the CPU and go straight to the GPU because that is the work load it was designed for.

    Now in the edge cases of code that is parallel but has considerable choke points, AVX can in theory improve performance, but it won't double it, as you might expect from nearly double the computational resources.
    So logically, we can expect only 5-80% utilization of a FULL AVX unit (which is going to eat a lot of transistors).

    Now if we plan on getting the most benefit out of an AVX unit, it would need to be shared between two or more threads.
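
    A rough way to put numbers on that, assuming Amdahl's law is a fair model here (the thread doesn't name it): only the fraction of run time that actually vectorizes gets faster when the vector width doubles. The function name and the sample fractions below are illustrative, not measurements.

```python
def overall_speedup(vector_fraction: float, unit_speedup: float = 2.0) -> float:
    """Amdahl's law: speedup of the whole program when only `vector_fraction`
    of its run time benefits from a `unit_speedup`-times faster vector unit."""
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be between 0 and 1")
    serial = 1.0 - vector_fraction
    return 1.0 / (serial + vector_fraction / unit_speedup)


# Hypothetical fractions of run time spent in vectorizable code.
for fraction in (0.05, 0.5, 0.8, 1.0):
    print(f"{fraction:.0%} vectorizable -> {overall_speedup(fraction):.2f}x")
```

    Under this model, even code that spends 80% of its time in vectorizable work gets roughly 1.67x from a vector unit that is twice as wide, not 2x.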
    Fast computers breed slow, lazy programmers
    The price of reliability is the pursuit of the utmost simplicity. It is a price which the very rich find most hard to pay.
    http://www.lighterra.com/papers/modernmicroprocessors/
    Modern Ram, makes an old overclocker miss BH-5 and the fun it was

  9. #59
    Xtreme Addict
    Join Date
    Apr 2006
    Location
    City of Lights, The Netherlands
    Posts
    2,381
    Quote Originally Posted by nn_step View Post
    Now if we plan on getting the most benefit out of an AVX unit, it would need to be shared between two or more threads.
    Smells like Bulldozer .
    "When in doubt, C-4!" -- Jamie Hyneman

    Silverstone TJ-09 Case | Seasonic X-750 PSU | Intel Core i5 750 CPU | ASUS P7P55D PRO Mobo | OCZ 4GB DDR3 RAM | ATI Radeon 5850 GPU | Intel X-25M 80GB SSD | WD 2TB HDD | Windows 7 x64 | NEC EA23WMi 23" Monitor |Auzentech X-Fi Forte Soundcard | Creative T3 2.1 Speakers | AudioTechnica AD900 Headphone |

  10. #60
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    Quote Originally Posted by Helmore View Post
    Smells like Bulldozer .
    Yeah, because of "shared". But Sandy Bridge's FP units are shared as well.

    Any kind of sharing is good for longer latency pipelined units.

  11. #61
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    Quote Originally Posted by nn_step View Post
    The applications that benefit the most from AVX are those that are already embarrassingly parallel. Embarrassingly parallel applications might as well skip the CPU and go straight to the GPU because that is the work load it was designed for.

    Now in the edge cases of code that is parallel but has considerable choke points, AVX can in theory improve performance, but it won't double it, as you might expect from nearly double the computational resources.
    So logically, we can expect only 5-80% utilization of a FULL AVX unit (which is going to eat a lot of transistors).

    Now if we plan on getting the most benefit out of an AVX unit, it would need to be shared between two or more threads.
    well, you know intel...
    they had lrb on one side and avx on the other and wanted them to touch at some point... and if anything goes wrong, the one or the other will serve as a backup plan... which was a smart strategy as lrb did fail

    so highly parallel code, lets say video compression, will benefit a lot, but it wont be double as fast... thanks, thats good to know!

    Quote Originally Posted by Dresdenboy View Post
    Sandy Bridge's FP units are shared as well.
    they are? how? is there anything public to read about this?

  12. #62
    Xtreme Addict
    Join Date
    Apr 2006
    Location
    City of Lights, The Netherlands
    Posts
    2,381
    Quote Originally Posted by saaya View Post
    they are? how? is there anything public to read about this?
    He is referring to SMT (Hyper Threading): two threads sharing the same execution resources, including the FP units.
    "When in doubt, C-4!" -- Jamie Hyneman

    Silverstone TJ-09 Case | Seasonic X-750 PSU | Intel Core i5 750 CPU | ASUS P7P55D PRO Mobo | OCZ 4GB DDR3 RAM | ATI Radeon 5850 GPU | Intel X-25M 80GB SSD | WD 2TB HDD | Windows 7 x64 | NEC EA23WMi 23" Monitor |Auzentech X-Fi Forte Soundcard | Creative T3 2.1 Speakers | AudioTechnica AD900 Headphone |

  13. #63
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    Quote Originally Posted by Helmore View Post
    He is referring to SMT (Hyper Threading): two threads sharing the same execution resources, including the FP units.
    ahhhhhh gotcha...

  14. #64
    Xtreme Enthusiast
    Join Date
    Mar 2005
    Posts
    644
    Besides AVX, could those units be used for other purposes? AMD wanted to introduce their own x86 instruction set extensions a few years ago, and there isn't a clear picture of how they intended to do so (everything points to them planning to introduce them with Bulldozer, but maybe they can do it earlier with Llano). Besides, they aren't up to date on supporting Intel's standards. Currently, AMD lacks SSSE3, SSE 4.1 and SSE 4.2 support.
    AMD wanted to introduce SSE5, but because Intel announced AVX, AMD revised their proposed extension to make sure that it doesn't overlap with, and isn't incompatible with, the new Intel AVX opcodes. The revised instruction set was broken into three smaller extensions: XOP (which, if I understand properly, groups the old SSE5 instructions that had no AVX equivalent), CVT16 (which can convert 32-bit precision floating point numbers to 16-bit precision and vice versa), and FMA (Fused Multiply Add). Adding to the mess is the fact that there are two proposed versions of FMA: FMA3, which works with 3 operands, and FMA4, which works with 4 operands (long live redundancy!), and predictably, AMD and Intel have each sided with one of them.
    Apart from Intel's AVX, AMD's extensions seem to be missing in action for the most part. I suppose AMD could at least have included their very own extensions, or at least the missing SSSE3/SSE4 ones.
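
    To make the FMA3/FMA4 and CVT16 descriptions above a bit more concrete, here is a toy sketch. It models registers as a plain dict, which is purely an illustration and not how either vendor specifies the instructions, and the function names are made up. The FMA part shows only the operand-count difference (the four-operand form writes to a separate destination, the three-operand form overwrites one of its sources; which source gets overwritten is an arbitrary choice here) and does not model the single rounding of a real fused operation. The half-precision part uses the `e` format of Python's `struct` module to mimic a round trip through 16 bits.

```python
import struct


def fma4(regs: dict, dst: str, a: str, b: str, c: str) -> None:
    """Four-operand form: dst = a * b + c, all three sources are preserved."""
    regs[dst] = regs[a] * regs[b] + regs[c]


def fma3(regs: dict, dst_and_src: str, b: str, c: str) -> None:
    """Three-operand form: the destination doubles as a source and is overwritten."""
    regs[dst_and_src] = regs[dst_and_src] * regs[b] + regs[c]


def to_half_and_back(value: float) -> float:
    """Round a float to IEEE 754 half precision (16 bits) and widen it again.
    Python floats are 64-bit, so this narrows from 64 bits rather than from 32."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


regs = {"r0": 0.0, "r1": 2.0, "r2": 3.0, "r3": 4.0}
fma4(regs, "r0", "r1", "r2", "r3")   # r0 = 2*3 + 4, r1..r3 untouched
fma3(regs, "r1", "r2", "r3")         # r1 = 2*3 + 4, the old r1 is gone
print(regs)
print(to_half_and_back(0.1))         # close to 0.1, but not exactly 0.1
```

    The trade-off in the toy model is the usual one: the three-operand form needs one fewer register name per instruction, while the four-operand form saves a copy whenever all three inputs are still needed afterwards.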

  15. #65
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    i think amd can do most if not all of it, some better some less efficiently... they just arent promoting it because as soon as they will, intel will step on their foot again
    i think they are just waiting for intel to push for whatever they think makes sense and then support it as well... and maybe announce that they support a few extra instructions on top of that...

    whatever amd now claims they will do, intel will speak up with a much louder voice and proclaim that they will support the same or even more and call it something else...
    so amd is doing the smart thing and playing the waiting game until intel puts their cards on the table i think...

  16. #66
    YouTube Addict
    Join Date
    Aug 2005
    Location
    Klaatu barada nikto
    Posts
    17,574
    Quote Originally Posted by saaya View Post
    well, you know intel...
    they had lrb on one side and avx on the other and wanted them to touch at some point... and if anything goes wrong, the one or the other will serve as a backup plan... which was a smart strategy as lrb did fail

    so highly parallel code, lets say video compression, will benefit a lot, but it wont be double as fast... thanks, thats good to know!


    they are? how? is there anything public to read about this?
    Actually, video compression applied to large enough raw video is embarrassingly parallel, and the same could be said for most video work. But honestly, because it is so parallel, and GPUs are very common and extremely good at embarrassingly parallel work, video tends to use GPUs instead of SIMD units.

    The applications that would actually see improvements via SIMD would be compression and encryption, but even those tend to see an order of magnitude better performance if there is explicit hardware support [which is A LOT cheaper than doubling the SIMD unit]

    Once GPUs start supporting IEEE 754 (and a proper standard), there will be absolutely no reason for the CPU to have SIMD [except for legacy reasons]

  17. #67
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    Quote Originally Posted by nn_step View Post
    Once GPUs start supporting IEEE 754 (and a proper standard), there will be absolutely no reason for the CPU to have SIMD [except for legacy reasons]
    unless you're intel and you dont HAVE a proper gpu
    thats why avx exists to begin with, doesnt it?
    if intel had a proper gpu avx would be part of opencl or direct compute i guess...

  18. #68
    Banned
    Join Date
    Jul 2004
    Posts
    1,125
    Quote Originally Posted by saaya View Post
    unless you're intel and you dont HAVE a proper gpu


    Even the Arrandale GPU ain't bad on a performance PER WATT basis, and SB's iGPU will probably be quite competitive perf/W-wise.

    You can of course argue about how much power (and thus performance) gets allocated to an iGPU.

    Presumably Intel considered this when designing SB.

  19. #69
    Registered User
    Join Date
    Jul 2008
    Posts
    73
    Quote Originally Posted by terrace215 View Post
    Even the Arrandale GPU ain't bad on a performance PER WATT basis, and SB's iGPU will probably be quite competitive perf/W-wise.
    As Intel IGPs don't support OpenCL, DX Compute or CUDA they don't qualify to be mentioned in this thread.

  20. #70
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    Quote Originally Posted by terrace215 View Post


    Even the Arrandale GPU ain't bad on a performance PER WATT basis, and SB's iGPU will probably be quite competitive perf/W-wise.
    for 3d, perf per watt is "ok"... for 2d? hell no!

    cuda? directcompute? opencl? pff, who cares?
    what would you need this for again? oh right, there arent really any apps for it at all :P the few apps that do exist would be so slow on an intel igp, its pointless supporting it...

    just like igp/entry level dx11 gpus... as if they could actually render anything dx11 at a double digit fps :P

  21. #71
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Llano's GPU and SB's are not in the same league, performance and feature wise. While Llano will have (practically) a GPGPU onboard, SB will have a tweaked IGP from Arrandale. One could argue that it (SB's IGP) will be enough (as it probably will) for the average consumer, but Llano will stomp it in games and whenever OpenCL/DirectX is used. There is a chance Intel will do a whole rework of the GPU that goes into SB, but to expect it to be even close in performance to Llano in games is a fantasy IMO.

  22. #72
    Xtreme Enthusiast
    Join Date
    Mar 2005
    Posts
    644
    Quote Originally Posted by informal View Post
    Llano's GPU and SB's are not in the same league, performance and feature wise. While Llano will have (practically) a GPGPU onboard, SB will have a tweaked IGP from Arrandale. One could argue that it (SB's IGP) will be enough (as it probably will) for the average consumer, but Llano will stomp it in games and whenever OpenCL/DirectX is used. There is a chance Intel will do a whole rework of the GPU that goes into SB, but to expect it to be even close in performance to Llano in games is a fantasy IMO.
    You are mistaken if you underestimate Intel. I think no one would have expected Clarkdale's GPU (1, 2) to be capable of competing with the Radeon 3200/4200 IGPs, considering that Intel's history in GPUs made it the graphics industry's permanent punchbag and laughingstock. Sandy Bridge's IGP is NOT to be underestimated.
    Remember that both AMD and Intel are not only processor makers but full platform providers (processors, chipsets and GPUs; OEMs like it all together). Intel seems willing to take the GPU part of its platform seriously, otherwise they would later be at a SERIOUS disadvantage should AMD set a strong baseline of GPU performance even on its cheapest platforms. What would happen if Intel wanted to stick to its old and stinky IGP in the full platform war? AMD would have an overall slower processor that is still stupidly fast for the vast majority of mainstream users, but a GPU that would crush Intel GMA in epic fashion to make the difference. From Core 2 Duo onwards, Intel is as greedy as always, but not stupid anymore.
    The real advantage of the Fusion GPU, whether it is a direct derivative of a current budget GPU or specifically made, is that it will be built with everything ATI's experience has to offer (including software development tools, driver compatibility, the always-mentioned features that are currently used by no one, etc). That is something Intel currently doesn't have, but they still have much more money to throw at it should they get ambitious about GPU R&D.
    Last edited by zir_blazer; 05-01-2010 at 04:45 AM.

  23. #73
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    480 SPs of Cypress class, my friend... If they come close to that with even a "dual core IGP" as the rumors suggest, I'll tip my hat to them!

  24. #74
    Xtreme Enthusiast
    Join Date
    Mar 2005
    Posts
    644
    Quote Originally Posted by informal View Post
    480 SPs of Cypress class, my friend... If they come close to that with even a "dual core IGP" as the rumors suggest, I'll tip my hat to them!
    Basically, that would place it slightly above the Radeon 5570/5670 but some way short of the 5750, which have 400 and 720 SPs respectively. That means we can speculate quite accurately about Fusion GPU performance, with two caveats. Having the GPU on the same piece of silicon as the CPU basically eliminates their communication latency, which is a benefit. However, how much of an impact will it have that the GPU shares memory bandwidth with the CPU cores, at a higher latency than a video card gets from its own VRAM? Well, that is all that is left to know about the Fusion GPU besides actual numbers.
    Now... what do we know about Sandy Bridge? Do we have even a remote idea of its performance? If not, I would hold back until more info surfaces. The worst thing you can do is say that you are I N V I N C I B L E and get owned before you finish the classic sentence.

    BTW... where the hell is Hans de Vries? His input would be useful in this thread after so many days.

  25. #75
    Xtreme Mentor
    Join Date
    Jul 2008
    Location
    Shimla , India
    Posts
    2,631
    SB does not have a dual core IGP. Well, it's tweaked and some additions have been made, but it's not dual core per se. Performance is improved quite a bit over the old one, but it is still not something that can defeat Llano's Cypress class GPU.

    @informal: the GPU is not taken directly from Arrandale. Yes, it has similarities, but it has much more.
    Coming Soon
