
Thread: AMD does reverse GPGPU, announces OpenCL SDK for x86

  1. #1
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977

    AMD does reverse GPGPU, announces OpenCL SDK for x86

    AMD does reverse GPGPU, announces OpenCL SDK for x86

    By Jon Stokes | Last updated August 6, 2009 6:15 AM CT

    http://arstechnica.com/hardware/news...dk-for-x86.ars

    "AMD has announced the release of the first OpenCL SDK for x86 CPUs, and it will enable developers to target x86 processors with the kind of OpenCL code that's normally written for GPUs. In a way, this is a reverse of the normal "GPGPU" trend, in which programs that run on a CPU are modified to run in whole or in part on a GPU.

    Why would you want to run GPU programs on a CPU? Debugging is one reason, if you don't have access to an OpenCL-compliant GPU. And for now, that's essentially what you'll be doing, since the new SDK doesn't appear to be able to target GPUs yet. But eventually, developers will be able to write in OpenCL and target multicore x86 CPUs alongside GPUs from NVIDIA, AMD, and Intel. Of course, when you can write once and target a variety of parallel hardware types, the fact that Larrabee runs x86 will be irrelevant; so Intel had better be able to scale up Larrabee's performance, because its x86 support will not be a selling point (at least for Larrabee as a GPU, though an HPC coprocessor might be a different story).

    Note that you can already write once, run anywhere for GPUs and multicore x86, but you'd have to use RapidMind's proprietary middleware layer. Because it's more than just an API—the middleware does just-in-time compilation targeting whatever hardware is in the system, dynamic load-balancing, and real-time optimization—an OpenCL vs. RapidMind comparison is a little bit apples-to-oranges, but only just a bit.

    In reality, few workloads are such that you can break them up in the design phase into parallel chunks so that a middleware layer can dynamically map them to hardware resources at run-time. Certainly there are some problem domains that this works for—finance is one that comes to mind at the moment—but these are very specialized (though profitable) niches. Most of the stuff that ordinary developers will want to do with GPGPU in the medium-term is more mundane and application-specific, like using the GPU to speed up some specific part of a common application in order to give a performance boost vs. the CPU alone. In other words, these common apps don't solve data-parallel, compute-intensive problems—rather, they have specific parts that need acceleration, and if there's a capable GPU available then they can use OpenCL to hand off that part to it.

    Note that Snow Leopard will come with an OpenCL implementation that works on both CPUs and GPUs. Ars will have a review when it launches, so stay tuned."
    __________________________________________________________

    I find this odd...

    Could this mean that ATI is having issues getting OpenCL to run on their GPUs?

    Note the last comment...
    "Snow Leopard will come with an OpenCL implementation that works on both CPUs and GPUs."
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)

  2. #2
    Xtreme X.I.P. Particle's Avatar
    Join Date
    Apr 2008
    Location
    Kansas
    Posts
    3,219
    I think you're trying to read something between the lines when there isn't meant to be anything there. More likely, they mean what the text says on its face--the SDK isn't finished and the next release of OSX will support OpenCL.
    Particle's First Rule of Online Technical Discussion:
    As a thread about any computer related subject has its length approach infinity, the likelihood and inevitability of a poorly constructed AMD vs. Intel fight also exponentially increases.

    Rule 1A:
    Likewise, the frequency of a car pseudoanalogy to explain a technical concept increases with thread length. This will make many people chuckle, as computer people are rarely knowledgeable about vehicular mechanics.

    Rule 2:
    When confronted with a post that is contrary to what a poster likes, believes, or most often wants to be correct, the poster will pick out only minor details that are largely irrelevant in an attempt to shut out the conflicting idea. The core of the post will be left alone since it isn't easy to contradict what the person is actually saying.

    Rule 2A:
    When a poster cannot properly refute a post they do not like (as described above), the poster will most likely invent fictitious counter-points and/or begin to attack the other's credibility in feeble ways that are dramatic but irrelevant. Do not underestimate this tactic, as in the online world this will sway many observers. Do not forget: Correctness is decided only by what is said last, the most loudly, or with greatest repetition.

    Rule 3:
    When it comes to computer news, 70% of Internet rumors are outright fabricated, 20% are inaccurate enough to simply be discarded, and about 10% are based in reality. Grains of salt--become familiar with them.

    Remember: When debating online, everyone else is ALWAYS wrong if they do not agree with you!

    Random Tip o' the Whatever
    You just can't win. If your product offers feature A instead of B, people will moan how A is stupid and it didn't offer B. If your product offers B instead of A, they'll likewise complain and rant about how anyone's retarded cousin could figure out A is what the market wants.

  3. #3
    Xtreme Member
    Join Date
    Jul 2009
    Location
    Madrid (Spain)
    Posts
    352
    Quote Originally Posted by Talonman View Post
    Could this mean that ATI is having issues getting OpenCL to run on their GPUs?
    Probably, if you mean that their GPU OpenCL drivers are not polished or functional enough yet. These things take time to develop. NVIDIA should need much less time if what I've read about the OpenCL API being so similar to the CUDA API is true. So it doesn't really surprise me...

    It's all a matter of time, though.

  4. #4
    Registered User
    Join Date
    Dec 2008
    Location
    Chicago
    Posts
    49
    This is in no way weird.

    The point of OpenCL actually IS to run on every computation device in your machine. That's the beauty of it! Just wait for the time when you can fire up an OpenCL folding client and send the same jobs to your GPU, CPU and your Cell add-on card. Sure, different work units will still run faster on different architectures.
    I'm just waiting for the first guys to implement a driver that generates an optimized FPGA image on the fly for every executed kernel

    And btw the OSX 10.6 beta already has a functioning AMD GPU OpenCL driver (as well as a CPU driver and an NVIDIA GPU driver).

  5. #5
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by Talonman View Post
    Could this mean that ATI is having issues getting OpenCL to run on their GPUs?
    Naturally, it's easier to write an x86 OpenCL driver than one targeting multiple generations of GPU hardware. Besides, having a CPU driver allows developers to start learning the API and porting their algorithms now so that once GPU support is available they will have had a head start. It'll be interesting to see how much overhead OpenCL brings though, compared to writing your own stuff directly in C/C++.
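    To make the "porting their algorithms now" point concrete, here is a hypothetical sketch of what a port looks like: the same loop written as plain C and as the OpenCL C kernel source you would hand to the runtime (host-side setup omitted; the kernel name and code are made up for illustration):

    /* Hypothetical example of porting an algorithm to OpenCL. */

    /* Plain C/C++ version, running on the host. */
    void saxpy_c(int n, float a, const float *x, float *y)
    {
        for (int i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }

    /* OpenCL version: the loop disappears and each work-item handles one i.
       A CPU driver maps work-items onto cores and SSE lanes; a GPU driver
       would map the very same kernel onto its shader cores. */
    static const char *saxpy_cl_source =
        "__kernel void saxpy(const int n, const float a,       \n"
        "                    __global const float *x,          \n"
        "                    __global float *y)                \n"
        "{                                                     \n"
        "    int i = get_global_id(0);                         \n"
        "    if (i < n) y[i] = a * x[i] + y[i];                \n"
        "}                                                     \n";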

    Quote Originally Posted by deeperblue View Post
    And btw the OSX 10.6 beta already has a functioning AMD GPU OpenCL driver (as well as a CPU driver and an NVIDIA GPU driver).
    I know device detection and enumeration is up and running on OSX but I haven't seen any functioning OpenCL apps as yet.

  6. #6
    Xtreme Addict
    Join Date
    Apr 2008
    Location
    Texas
    Posts
    1,663
    OpenCL is supposed to be able to run on both CPUs and GPUs simultaneously. Serial workloads are better suited for CPUs and parallel loads work best on GPUs. This SDK will make sure we get the most bang for our buck out of our 4 core or more, single/multi-GPU desktops. I'm very excited to see what game engines and GPGPU software will take advantage of OpenCL over the next few years.
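    For what it's worth, that "both at once" usage is just one context with two command queues. A hypothetical sketch in C (assumes both a CPU and a GPU OpenCL driver are installed; error handling omitted):

    #include <CL/cl.h>

    /* Hypothetical sketch: one context, two queues, so serial-ish kernels
       can be sent to the CPU while wide data-parallel kernels go to the GPU. */
    void make_cpu_and_gpu_queues(cl_command_queue *cpu_q, cl_command_queue *gpu_q)
    {
        cl_platform_id platform;
        cl_device_id   devs[2];   /* devs[0] = CPU, devs[1] = GPU */
        cl_int         err;

        clGetPlatformIDs(1, &platform, NULL);
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_CPU, 1, &devs[0], NULL);
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &devs[1], NULL);

        cl_context ctx = clCreateContext(NULL, 2, devs, NULL, NULL, &err);

        /* The application decides which kernels go to which queue. */
        *cpu_q = clCreateCommandQueue(ctx, devs[0], 0, &err);
        *gpu_q = clCreateCommandQueue(ctx, devs[1], 0, &err);
    }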
    Core i7 2600K@4.6Ghz| 16GB G.Skill@2133Mhz 9-11-10-28-38 1.65v| ASUS P8Z77-V PRO | Corsair 750i PSU | ASUS GTX 980 OC | Xonar DSX | Samsung 840 Pro 128GB |A bunch of HDDs and terabytes | Oculus Rift w/ touch | ASUS 24" 144Hz G-sync monitor

    Quote Originally Posted by phelan1777 View Post
    Hail fellow warrior albeit a surat Mercenary. I Hail to you from the Clans, Ghost Bear that is (Yes freebirth we still do and shall always view mercenaries with great disdain!) I have long been an honorable warrior of the mighty Warden Clan Ghost Bear the honorable Bekker surname. I salute your tenacity to show your freebirth sibkin their ignorance!

  7. #7
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    So it's actually a positive thing, getting the API rolling ASAP. GPU support can come in time. I can see that.

    It just initially sounded odd to me coming from a company that produces GPUs.
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)

  8. #8
    Xtreme Addict
    Join Date
    Nov 2006
    Location
    Red Maple Leaf
    Posts
    1,556
    Quote Originally Posted by Talonman View Post
    It just initially sounded odd to me coming from a company that produces GPUs.
    AMD produces CPUs.
    E8400 @ 4.0 | ASUS P5Q-E P45 | 4GB Mushkin Redline DDR2-1000 | WD SE16 640GB | HD4870 ASUS Top | Antec 300 | Noctua & Thermalright Cool
    Windows 7 Professional x64


    Vista & Seven Tweaks, Tips, and Tutorials: http://www.vistax64.com/

    Game's running choppy? See: http://www.tweakguides.com/

  9. #9
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Posts
    800
    They produce processors too, Talonman. I have to admit though, it sounds more like an Intel thing (just a thought; they created x86 anyway, so it made me think AMD did Intel a favour).

  10. #10
    YouTube Addict
    Join Date
    Aug 2005
    Location
    Klaatu barada nikto
    Posts
    17,574
    Specifically, it would enable people without OpenCL-capable cards to start creating programs for them.

    Bravo AMD
    Fast computers breed slow, lazy programmers
    The price of reliability is the pursuit of the utmost simplicity. It is a price which the very rich find most hard to pay.
    http://www.lighterra.com/papers/modernmicroprocessors/
    Modern Ram, makes an old overclocker miss BH-5 and the fun it was

  11. #11
    Xtreme Enthusiast
    Join Date
    Dec 2007
    Posts
    816
    Quote Originally Posted by Talonman View Post
    Could this mean that ATI is having issues getting OpenCL to run on their GPUs?
    It is because of the GPGPU vs. CPU paradox... I tried to explain it in my blog.
    OpenCL: http://www.khronos.org/opencl/
    When you look at the OpenCL API, you'll understand very quickly that the CPU has some skills today that the GPU does not have yet. With many cores, you need a lot of cache to store the data between the steps of your OpenCL procedure calls. If you have to go off the socket or off the GPU, your performance suffers. The trick NVIDIA uses with CUDA only works if you touch your data once: a hardware thread scheduler freezes a thread when it stalls on a memory access, moves to the next thread, and comes back to the first one when the memory request is done.
    This works when your algorithm is not f(n-1), that is, when loop n has no dependence on loop n-1.
    Whatever you have seen out of CUDA so far is a bunch of corner-case algorithms, and as proof that it is not generic, you cannot get any version of SPECint or SPECfp to run on those GPUs, even though some parts of SPEC are very parallelizable.

    AMD found that x86 is very flexible for 128-bit vectors, with many loops using each other's results, and that is the base of x86 (via SSE2). I am sure they will add more support for the GPU where it makes sense, where the work is massively parallel and can take advantage of GPU acceleration. With AVX coming, it will be a really big challenge to beat the processor at 256-bit float and double processing.

    The other part is the memory size limit. Today's GPUs are seriously limited; the places where you need TFLOPS also need a lot of memory. Right now the GPGPUs are using their GDDR5 as a cache to main memory, and PCI Express is a very poor cache protocol.

    The GPU is a world that requires the programmer to make the code 100% perfect: loads have to be perfectly aligned, stores too, vectorization has to be done perfectly. It is a world of pain, and if you don't have a programmer with a PhD in parallelism or SIMD, you will not see the end of the project. If you want OpenCL to perform well on a GPU, you still have to plan for all of those tricks.

    So AMD will probably increase their support for GPGPU over time; they have their own stable starting point, and it is always x86.
    You will see the CPU winning all the very iterative workloads on OpenCL.
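    For illustration, a minimal hypothetical C sketch of the two cases: the first loop is data-parallel and maps cleanly onto a GPU, while the second has the f(n-1) loop-carried dependence, so a fast serial core with big caches wins.

    /* Hypothetical sketch of the two cases discussed above. */

    /* Data-parallel: iteration i never reads iteration i-1's result,
       so the work can be split across thousands of GPU threads. */
    void scale(int n, float a, const float *x, float *y)
    {
        for (int i = 0; i < n; ++i)
            y[i] = a * x[i];
    }

    /* Loop-carried dependence: each y[i] needs y[i-1] (the f(n-1) case).
       As written, iterations cannot simply be handed to independent
       threads, so latency (caches, out-of-order execution) matters far
       more than raw parallel throughput. */
    void smooth(int n, float a, const float *x, float *y)
    {
        y[0] = x[0];
        for (int i = 1; i < n; ++i)
            y[i] = a * y[i - 1] + (1.0f - a) * x[i];
    }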

    Francois
    Last edited by Drwho?; 08-10-2009 at 07:44 AM.
    DrWho, The last of the time lords, setting up the Clock.

  12. #12
    Xtreme Addict
    Join Date
    Dec 2008
    Location
    Sweden, Linköping
    Posts
    2,034
    Thanks for a very informative post that even I can understand, Francois
    SweClockers.com

    CPU: Phenom II X4 955BE
    Clock: 4200MHz 1.4375v
    Memory: Dominator GT 2x2GB 1600MHz 6-6-6-20 1.65v
    Motherboard: ASUS Crosshair IV Formula
    GPU: HD 5770

  13. #13
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Toronto ON
    Posts
    566
    Same here. Thanks Francois
    Core i7-4930K LGA 2011 Six-Core - Cooler Master Seidon 120XL ? Push-Pull Liquid Water
    ASUS Rampage IV Black Edition LGA2011 - G.SKILL Trident X Series 32GB (4 x 8GB) DDR3 1866
    Sapphire R9 290X 4GB TRI-X OC in CrossFire - ATI TV Wonder 650 PCIe
    Intel X25-M 160GB G2 SSD - WD Black 2TB 7200 RPM 64MB Cache SATA 6
    Corsair HX1000W PSU - Pioner Blu-ray Burner 6X BD-R
    Westinghouse LVM-37w3, 37inch 1080p - Windows 7 64-bit Pro
    Sennheiser RS 180 - Cooler Master Cosmos S Case

  14. #14
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    That was a good post...

    I don't have a good grip on this...

    "Whatever you have seen out of CUDA so far is a bunch of corner-case algorithms, and as proof that it is not generic, you cannot get any version of SPECint or SPECfp to run on those GPUs, even though some parts of SPEC are very parallelizable."

    What does that mean in terms of GPU resource allocation or performance?
    Last edited by Talonman; 08-10-2009 at 08:30 AM.
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)

  15. #15
    I am Xtreme
    Join Date
    Jul 2005
    Posts
    4,811
    Nice post Francois

    Quote Originally Posted by deeperblue View Post
    This is in no way weird.

    The point of OpenCL actually IS to run on every computation device in your machine. That's the beauty of it! Just wait for the time when you can fire up an OpenCL folding client and send the same jobs to your GPU, CPU and your Cell add-on card. Sure, different work units will still run faster on different architectures.
    I'm just waiting for the first guys to implement a driver that generates an optimized FPGA image on the fly for every executed kernel

    And btw the OSX 10.6 beta already has a functioning AMD GPU OpenCL driver (as well as a CPU driver and an NVIDIA GPU driver).
    As mentioned, OpenCL is not GPU-specific.
    Last edited by Eastcoasthandle; 08-10-2009 at 08:30 AM.

  16. #16
    Registered User
    Join Date
    Dec 2008
    Location
    Chicago
    Posts
    49
    Quote Originally Posted by trinibwoy View Post
    I know device detection and enumeration is up and running on OSX but I haven't seen any functioning OpenCL apps as yet.
    I'm actually working on a benchmark tool for OpenCL on my Mac. The drivers are there and running.
    I hope to have the tool finished and released for Mac, Linux and Windows as soon as Snow Leopard is officially released.

  17. #17
    Xtreme Addict
    Join Date
    Apr 2008
    Location
    Texas
    Posts
    1,663
    Some of the elements Francois just pointed out are things I've read AMD is trying to improve on in its next-gen GPUs (the cache issues). I bet Larrabee will eat up OpenCL workloads because of this. OpenCL is on the fast track to fame with AMD, Intel, and Apple getting the software and hardware worked out for its implementation. It makes me giddy thinking about the possibilities.
    Core i7 2600K@4.6Ghz| 16GB G.Skill@2133Mhz 9-11-10-28-38 1.65v| ASUS P8Z77-V PRO | Corsair 750i PSU | ASUS GTX 980 OC | Xonar DSX | Samsung 840 Pro 128GB |A bunch of HDDs and terabytes | Oculus Rift w/ touch | ASUS 24" 144Hz G-sync monitor

    Quote Originally Posted by phelan1777 View Post
    Hail fellow warrior albeit a surat Mercenary. I Hail to you from the Clans, Ghost Bear that is (Yes freebirth we still do and shall always view mercenaries with great disdain!) I have long been an honorable warrior of the mighty Warden Clan Ghost Bear the honorable Bekker surname. I salute your tenacity to show your freebirth sibkin their ignorance!

  18. #18
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    Quote Originally Posted by deeperblue View Post
    I'm actually working on a benchmark tool for OpenCL on my Mac. The drivers are there and running.
    I hope to have the tool finished and released for Mac, Linux and Windows as soon as Snow Leopard is officially released.
    Good deal!

    Keep us posted...
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)

  19. #19
    Xtreme Addict
    Join Date
    May 2007
    Location
    'Zona
    Posts
    2,346
    Quote Originally Posted by Mechromancer View Post
    Some of the elements Francois just pointed out are things I've read AMD is trying to improve on in its next-gen GPUs (the cache issues). I bet Larrabee will eat up OpenCL workloads because of this. OpenCL is on the fast track to fame with AMD, Intel, and Apple getting the software and hardware worked out for its implementation. It makes me giddy thinking about the possibilities.
    Well, we already know they radically changed and improved the scheduler and it is much more complex than previous generations.
    Originally Posted by motown_steve
    Every genocide that was committed during the 20th century has been preceded by the disarmament of the target population. Once the government outlaws your guns your life becomes a luxury afforded to you by the state. You become a tool to benefit the state. Should you cease to benefit the state or even worse become an annoyance or even a hindrance to the state then your life becomes more trouble than it is worth.

    Once the government outlaws your guns your life is forfeit. You're already dead, it's just a question of when they are going to get around to you.

  20. #20
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by Drwho? View Post
    With AVX coming, it will be a really big challenge to beat the processor at 256-bit float and double processing.
    GPUs aren't standing still, my friend. 256-bit AVX will still be a joke compared to the GPUs out by that time.

    The GPU is a world that requires the programmer to make the code 100% perfect: loads have to be perfectly aligned, stores too, vectorization has to be done perfectly. It is a world of pain, and if you don't have a programmer with a PhD in parallelism or SIMD, you will not see the end of the project. If you want OpenCL to perform well on a GPU, you still have to plan for all of those tricks.
    This is no different to SSE programming on a CPU, so I'm not sure what you're trying to say here. And yes, OpenCL will be making full use of SSE/AVX in order to extract maximum performance out of the hardware.

    You will see the CPU winning all the very iterative workloads on OpenCL.
    OpenCL is split into task-based and data-based parallelism. I'm not sure what you mean by "iterative workloads". GPUs aren't just "not so hot" at task-based stuff, they downright suck at it, and that work should definitely stay on the CPU for the foreseeable future.

    But we have a big problem, one that Larrabee is trying to solve: what's the point of doing task-based work on the CPU if it has to communicate over the bottlenecking PCIe bus? This is why folks are looking to move more and more task-based work over to the GPU: even though the CPU is faster at it, by the time the data travels over PCIe it would have been faster to keep everything on the GPU.
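    For anyone wondering what the task/data split looks like in the API itself, here is a hypothetical sketch using the OpenCL 1.0 host calls (the queue and kernels are assumed to exist already; error handling omitted): clEnqueueTask runs a kernel as a single work-item, while clEnqueueNDRangeKernel spreads a kernel over a large index space.

    #include <CL/cl.h>

    /* Hypothetical sketch of the two dispatch styles in OpenCL 1.0. */
    void dispatch_examples(cl_command_queue queue,
                           cl_kernel task_kernel,
                           cl_kernel data_kernel,
                           size_t n)
    {
        /* Task parallelism: the kernel executes as a single work-item.
           Several independent tasks can be enqueued and overlapped. */
        clEnqueueTask(queue, task_kernel, 0, NULL, NULL);

        /* Data parallelism: one kernel instance per element of an
           n-sized, one-dimensional index space; passing NULL for the
           local size lets the runtime pick the work-group size. */
        clEnqueueNDRangeKernel(queue, data_kernel,
                               1, NULL, &n, NULL,
                               0, NULL, NULL);

        clFinish(queue);  /* block until both have completed */
    }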

  21. #21
    Xtreme Addict
    Join Date
    Apr 2004
    Posts
    1,640
    Quote Originally Posted by Farinorco View Post
    Probably, if you mean that their GPU OpenCL drivers are not polished or functional enough yet. These things take time to develop. NVIDIA should need much less time if what I've read about the OpenCL API being so similar to the CUDA API is true. So it doesn't really surprise me...

    It's all a matter of time, though.
    All NVIDIA had to do was use the CUDA interface to add in OpenCL support. So OpenCL runs on top of CUDA, as I understand it.
    DFI LANParty DK 790FX-B
    Phenom II X4 955 BE (1003GPMW) @ 3.8GHz (19x200) w/1.36v
    -cooling: Scythe Mugen 2 + AC MX-2
    XFX ATI Radeon HD 5870 1024MB
    8GB PC2-6400 G.Skill @ 800MHz (1:2) 5-5-5-15 w/1.8v
    Seagate 1TB 7200.11 Barracuda
    Corsair HX620W


    Support PC gaming. Don't pirate games.

  22. #22
    Xtreme Member
    Join Date
    Jul 2009
    Location
    Madrid (Spain)
    Posts
    352
    Quote Originally Posted by Cybercat View Post
    All NVIDIA had to do was use the CUDA interface to add in OpenCL support. So OpenCL runs on top of CUDA, as I understand it.
    I don't think OpenCL runs on top of the CUDA API, because it's not a higher-level API; if anything, it's the opposite. It surely runs on top of the CUDA architecture, since that's just a commercial name for their architecture, but that has nothing to do with development times: for every vendor, the API will obviously run on top of their architecture...

  23. #23
    I am Xtreme zanzabar's Avatar
    Join Date
    Jul 2007
    Location
    SF bay area, CA
    Posts
    15,871
    Quote Originally Posted by blindbox View Post
    They produce processors too, Talonman. I have to admit though, it sounds more like an Intel thing (just a thought; they created x86 anyway, so it made me think AMD did Intel a favour).
    Not really. With AMD creating the SDK and compiler, they get code that is optimized for their hardware. The largest problem with AMD server parts is that x86 code with heavy integer work on SSE1/2 runs drastically differently depending on what compiler you use, which skews performance comparisons between AMD and Intel. So this is a huge boost for AMD in the long run, since whoever controls the compiler controls the optimization path for the hardware.

    It also makes it so Intel won't be so hasty to pull the x86 license, and it may give AMD enough to claim they have changed x86 enough to get their own rights and not have to pay licensing on x86 parts made at GF.

    Quote Originally Posted by Farinorco View Post
    I don't think OpenCL runs on top of the CUDA API, because it's not a higher-level API; if anything, it's the opposite. It surely runs on top of the CUDA architecture, since that's just a commercial name for their architecture, but that has nothing to do with development times: for every vendor, the API will obviously run on top of their architecture...
    OpenCL is higher level than CUDA; CUDA is a low-level C, while OpenCL is closer to C++. And there is a CUDA interface in the driver that everything has to run through, no matter what GPGPU language ends up on top of it.
    Last edited by zanzabar; 08-11-2009 at 02:41 AM.
    5930k, R5E, samsung 8GBx4 d-die, vega 56, wd gold 8TB, wd 4TB red, 2TB raid1 wd blue 5400
    samsung 840 evo 500GB, HP EX 1TB NVME , CM690II, swiftech h220, corsair 750hxi

  24. #24
    Xtreme Member
    Join Date
    Jul 2009
    Location
    Madrid (Spain)
    Posts
    352
    Quote Originally Posted by zanzabar View Post
    OpenCL is higher level than CUDA; CUDA is a low-level C, while OpenCL is closer to C++. And there is a CUDA interface in the driver that everything has to run through, no matter what GPGPU language ends up on top of it.
    Maybe; I don't know the APIs themselves yet (I will when I have some time for it), but I read the interview with the Khronos president (also NVIDIA's VP of Embedded Content) published at TechReport and posted here by trinibwoy, where he said the following about OpenCL and CUDA:

    OpenCL and C for CUDA are actually at very different levels. OpenCL is the typical Khronos API. Khronos likes to build the API as close as possible to the silicon. We call it the foundation-level API that everyone is going to need. Everyone who's building silicon needs to at some point expose their silicon capability at the lowest and most fundamental, and in some ways the most powerful, level because we've given the developer pretty close access to the silicon capability—just high enough abstraction to enable portability across different vendors and silicon architectures. And that's what OpenCL does. You have an API that you have control over the way stuff runs. It gives you that level of control.

    Whereas C for CUDA, it takes all of that low-level decision making and automates it. So you just write a C program, and the C for CUDA architecture will figure out how to parallelize. Now, some developers will love that, because it's much easier, and the system is doing a lot more figuring out for you. Other developers will hate that, and they will want to get down to bits and bytes and have a more instant level of control. But again, it's all good, and as long as the developers are educated as to what are the various approaches that the different programming languages are taking, and are enabled to pick the one that best suits their needs, I think that's a healthy thing.
    Here is the complete interview: http://www.techreport.com/articles.x/17321/1

    This is why I said that OpenCL isn't higher level than CUDA but, if anything, the opposite; that's what he's saying there...

    Of course, given that NVIDIA uses the CUDA name for everything from the hardware architecture up to the higher-level language/API, any GPGPU code is going to sit on top of some CUDA layer...
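    As a small illustration of the "close to the silicon" level the quote describes, OpenCL lets the host query raw device limits and make its own decisions around them. A hypothetical sketch in C (error handling omitted):

    #include <stdio.h>
    #include <CL/cl.h>

    /* Hypothetical sketch: query the low-level capabilities that a
       "foundation-level" API exposes to the programmer. */
    void print_device_limits(cl_device_id dev)
    {
        char     name[128];
        cl_uint  compute_units;
        cl_ulong global_mem;
        size_t   max_wg_size;

        clGetDeviceInfo(dev, CL_DEVICE_NAME, sizeof(name), name, NULL);
        clGetDeviceInfo(dev, CL_DEVICE_MAX_COMPUTE_UNITS,
                        sizeof(compute_units), &compute_units, NULL);
        clGetDeviceInfo(dev, CL_DEVICE_GLOBAL_MEM_SIZE,
                        sizeof(global_mem), &global_mem, NULL);
        clGetDeviceInfo(dev, CL_DEVICE_MAX_WORK_GROUP_SIZE,
                        sizeof(max_wg_size), &max_wg_size, NULL);

        printf("%s: %u compute units, %llu MB global memory, "
               "max work-group size %lu\n",
               name, compute_units,
               (unsigned long long)(global_mem / (1024 * 1024)),
               (unsigned long)max_wg_size);
    }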

  25. #25
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    A handy picture...
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)
