Page 4 of 7 FirstFirst 1234567 LastLast
Results 76 to 100 of 175

Thread: AMD does reverse GPGPU, announces OpenCL SDK for x86

  1. #76
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by Chumbucket843 View Post
    hey im in high school and i take that as an offense but i do see how easy that would be to parallelize and then paste into a forum. it looks like you can do this with a lot of sequences.
    Haha, sorry. Wasn't meant as an insult, just pointing out how smart people in high school are

    Quote Originally Posted by Drwho? View Post
    Knowing that you have to deal with numbers that are way beyond what the CPU or GPU registers can deal with
    Yes, we've already covered that earlier. You would need to implement z=f(x,y), where f is a multiplication function for arbitrarily large numbers. It doesn't change the algorithm; you're still doing a prefix sum on an associative operation. You should already know how to implement f(x,y) since you coded a factorial function on the CPU, right?
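
    As a rough illustration of the f(x,y) mentioned above, here is a minimal sketch of schoolbook multiplication on arrays of fixed-size "limbs", which is the kind of structure you would need when a value no longer fits in a register. The names `BASE`, `to_limbs`, `from_limbs` and `f` are made up for this example, and Python integers are already unbounded, so this only shows the idea and is not anyone's actual implementation.

```python
BASE = 10 ** 9  # assumption: each limb holds 9 decimal digits


def to_limbs(value):
    """Split a non-negative int into little-endian limbs in base BASE."""
    limbs = []
    while True:
        value, limb = divmod(value, BASE)
        limbs.append(limb)
        if value == 0:
            return limbs


def from_limbs(limbs):
    """Rebuild the int from little-endian limbs (only used for checking)."""
    value = 0
    for limb in reversed(limbs):
        value = value * BASE + limb
    return value


def f(x, y):
    """Schoolbook multiplication of two limb arrays: z = f(x, y)."""
    z = [0] * (len(x) + len(y))
    for i, xi in enumerate(x):
        carry = 0
        for j, yj in enumerate(y):
            total = z[i + j] + xi * yj + carry
            carry, z[i + j] = divmod(total, BASE)
        z[i + len(y)] += carry  # this slot is still zero at this point
    # Drop leading zero limbs so equal values have equal representations.
    while len(z) > 1 and z[-1] == 0:
        z.pop()
    return z
```

    Because f is associative, f(f(a,b),c) and f(a,f(b,c)) give the same limbs, which is what lets the partial products be combined in any grouping.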

    Quote Originally Posted by xVeinx View Post
    That being said, the factorials calculation can be computed in parallel, but you'd have to break up the problem and be able to keep track of what has/hasn't been calculated, which would be more effectively done on the CPU.
    For non-deterministic problem sets maybe, but this is a simplification of an already simple prefix scan. I'm sure Drwho didn't take the time to look it up when I suggested he do so, so I'll help him out.

    http://developer.download.nvidia.com...n/doc/scan.pdf

    This problem is even simpler because you don't need to store an array of operands: for n! you generate a sequence where i=1..n and, for every element a[i], a[i+1]=a[i]+1. So it's just simple math to partition that sequence into M thread blocks each taking M*2 elements from the sequence. Once all blocks are done with their sub-sequences you do a final reduction of all block results and voila.

  2. #77
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by Chumbucket843 View Post
    hey im in high school and i take that as an offense but i do see how easy that would be to parallelize and then paste into a forum. it looks like you can do this with a lot of sequences.
    Haha, sorry. Wasn't meant as an insult, just pointing out how smart people in high school are

    Quote Originally Posted by Drwho? View Post
    Knowing that you have to deal with numbers that are way beyond what the CPU or GPU registers can deal with
    Yes, we've already covered that earlier. You would need to implement z=f(x,y), where f is a multiplication function for arbitrarily large numbers. It doesn't change the algorithm; you're still doing a prefix sum on an associative operation. You should already know how to implement f(x,y) since you coded a factorial function on the CPU, right?

    Quote Originally Posted by xVeinx View Post
    That being said, the factorials calculation can be computed in parallel, but you'd have to break up the problem and be able to keep track of what has/hasn't been calculated, which would be more effectively done on the CPU.
    For non-deterministic problem sets maybe, but this is a simplification of an already simple prefix scan. I'm sure Drwho didn't take the time to look it up when I suggested he do so, so I'll help him out.

    http://developer.download.nvidia.com...n/doc/scan.pdf

    This problem is even simpler because you don't need to store an array of operands: for n! you generate a sequence where i=1..n and, for every element a[i], a[i+1]=a[i]+1. So it's just simple math to partition that sequence into M thread blocks, each block taking n/M elements from the sequence. Once all blocks are done with their sub-sequences you do a final reduction of all block results and voila.
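
    The partitioning described above can be sketched on the CPU like this. It is only an illustration under some assumptions: `block_ranges`, `block_product` and `parallel_factorial` are hypothetical names, a thread pool stands in for the M thread blocks, and CPython threads will not actually speed up big-integer multiplication, so this shows the structure and not the performance.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import mul
import math


def block_ranges(n, m):
    """Split 1..n into m contiguous (start, stop) ranges, stop exclusive."""
    size, extra = divmod(n, m)
    ranges, start = [], 1
    for b in range(m):
        # The first `extra` blocks take one more element when n % m != 0.
        stop = start + size + (1 if b < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def block_product(bounds):
    """Multiply the integers of one block; no operand array is stored."""
    start, stop = bounds
    return math.prod(range(start, stop))


def parallel_factorial(n, m=4):
    """Compute n! as a final reduction over m independent block products."""
    with ThreadPoolExecutor(max_workers=m) as pool:
        partials = list(pool.map(block_product, block_ranges(n, m)))
    return reduce(mul, partials, 1)
```

    Each block only needs its start and stop index, since the operands are generated from i=1..n on the fly.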

  3. #78
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by trinibwoy View Post
    So it's just simple math to partition that sequence into M thread blocks each taking M*2 elements from the sequence. Once all blocks are done with their sub-sequences you do a final reduction of all block results and voila.
    do it please. i genuinely wonder how long it takes to implement and how fast it will run

  4. #79
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by W1zzard View Post
    do it please. i genuinely wonder how long it takes to implement and how fast it will run
    I'm no CUDA programmer but it's trivial to do in Java on multiple CPU threads using BigDecimal. To get this to run in CUDA you would need to build a similar data structure to represent unbounded integer values on the GPU. That isn't trivial!

  5. #80
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by trinibwoy View Post
    I'm no CUDA programmer but it's trivial to do in Java on multiple CPU threads using BigDecimal.
    to me that sounds like an argument _for_ x86 "normal programming" and against gpgpu?

  6. #81
    YouTube Addict
    Join Date
    Aug 2005
    Location
    Klaatu barada nikto
    Posts
    17,574
    Quote Originally Posted by W1zzard View Post
    i dont think it is possible to make a general statement like "x86 will outperform opencl".

    from what i understand, the unique feature of opencl is that it can run threads on the cpu and the gpu. so run iterative/branching/recursive stuff on the cpu and use the gpu for stuff like matrix multiplications.

    of course in a commercial environment it is also important how quickly you can get things working. your factorial is a great example for that. 1 hour to write cpu code, weeks to write gpu code. coders cost money, time you often don't have, hardware is cheap, just get a few dozen boxes running the cpu implementation
    It is entirely possible to make that statement in cases in which the code is massively serial.
    Fast computers breed slow, lazy programmers
    The price of reliability is the pursuit of the utmost simplicity. It is a price which the very rich find most hard to pay.
    http://www.lighterra.com/papers/modernmicroprocessors/
    Modern Ram, makes an old overclocker miss BH-5 and the fun it was

  7. #82
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by W1zzard View Post
    to me that sounds like an argument _for_ x86 "normal programming" and against gpgpu?
    Nope, it just means that people have written billions of lines of code for CPUs over time that other people can reuse. If you had to go reinvent the wheel every time you wrote a program on a CPU you would have the same problem. Just because there's a lot of code already out there for CPUs doesn't make it "normal programming". Eventually there will be similar reusable data structures and libraries for GPUs, but right now we're in the inventing-the-wheel phase.

  8. #83
    Xtreme Addict
    Join Date
    Jul 2007
    Posts
    1,488
    Quote Originally Posted by trinibwoy View Post
    Nope, it just means that people have written billions of lines of code for CPUs over time that other people can reuse. If you had to go reinvent the wheel every time you wrote a program on a CPU you would have the same problem. Just because there's a lot of code already out there for CPUs doesn't make it "normal programming". Eventually there will be similar reusable data structures and libraries for GPUs, but right now we're in the inventing-the-wheel phase.
    Not only are the programming tools for GPGPU still evolving, the hardware itself is still evolving too. Will future GPUs have more capabilities and fewer limitations than current GPUs? Definitely.

    It's the same story with x86. It definitely wasn't pretty in its early days.

  9. #84
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by Solus Corvus View Post
    Not only are the programming tools for GPGPU still evolving, the hardware itself is still evolving too. Will future GPUs have more capabilities and fewer limitations than current GPUs? Definitely.

    It's the same story with x86. It definitely wasn't pretty in its early days.
    true, but back then there wasnt a competing technology available that was more mature in every regard except that it came with 1 potential order of magnitude execution time reduction promise.

    does gpgpu offer anything else other than the promise of being faster?

  10. #85
    Xtreme Addict
    Join Date
    Jul 2007
    Posts
    1,488
    Being faster is the only thing that comes to mind at the moment.

    But x86 at its start wasn't faster, easier to develop for, or more mature than some of the other ISAs of the time. IMO it was low cost and ubiquity that led to its early success.

  11. #86
    Xtreme Enthusiast
    Join Date
    Dec 2007
    Posts
    816
    Here are the first few digits of 1000000!

    82639316883312400623766461031726662911353479789638 73045167775885563379611035645084446530511311463973 35160680421087858854146474695064783618230121097542 32995901156417462491737988838926919341417654578323 93198728024721989396436544455216153392058351993879 89417742062408415939877018188072231692520577371284 36859815222389311521255279546829742282164292748493 88778471244357228595093436211764525449305226584119 76299056190121202414190025341283194330650762070040 51595915117186613844750900755834037427137686877042 09375102350263340124834131491021768454943127363639 90669719529613457333185577827926166902990562020543 69409707066647851950401003675381978549679950259346 66642561397857355976414208350625445088433370019103 46467387684551438607977529149091464312409535521184 54755321622415039686972337454998330671728809536246 48077818320285426377421697236875196416847586471082 42100299778180917677898375179885052302287080350414 43033114386531070330335994687223907101785184740297 83087941364846376812855960498131717659293531630665 17480609772403756072910751424791692204603165502971 69354047663716190369381921896827532213492684751953 50100400335255754749084696578284256885422102069005 13532770029669416657845610334030651805753128367466 39302736238230285383667241957608614717411328876198 68511924639222394549034184071704524191278560391738 18065632292055980341496433939625002178497725812661 13824842989268461523442361118297456949837880338146 87469951709983634639027320349411064589427249057923 05860181434890526382963177342611731621379255430545 09844614644539199499300410119112360492794946012951 30659278247356204716530093403421148592953866359624 47461097303253380618748369952010062827823753289857 99694238906888529567715949209225149583806856514505 96670904555697317271773014004432808102825390252086 48000898390663154978466569004207120665481602926839 38943423565931192630890840193101695234971107709046 97787866113905135675939369298998926409043243894492 10869093844596609832775820823588351223893611134646 
66406618178718704409169235099804650072819161636524 13939761918325607097915533516559818774178302468101 91055845341174334013771594704523382014837248934927 36225371480136546463360019692746427132960171671652 65441110722837953835858565183287750920925286592401 94965758088678759206907055438624460369055092476355 26054794986896702217286822255591086864614968403942 96022938804319833255001743711629597524893489578370 95360081696384427809037703763228572556032284560089 99845469405610984854146135108704412702067643651153 44979482463513313444153430254463924329415057983134 38164811961057658077860119374565557615301684268901 39677170011676284699970487161367602640539814056150 76174991110145759325069317658157472438057937126633 21062938475659697058245825083806111628613740584206 59508856753326967758289024053017037388329261183392 62073903655201325025196211382457528966196071713373 73341415000488973314367305436350799224765380907032 09861024215211324276243856841718874229182392681292 43363349207727805776448423177161377696833069137987 18657693126531324263866633492572602475741131270444 04608411672103993075727560835425172875214665576667 53585365011649712316077596276209806711351395595202 90353790517679281320455781148008256022502403520944 18728580839132063823810049868162254485749606864679 63196757333252468918033997378753215979105301502089 53588025656911878379700135022281279549107192693613 03676716335815882297746959712767622369155812445915 42159203789325849877377253647181266016127482761148 50134991920240358141232300237159287318808267591457 62627992848590250125002598851330095693810747674664 62504209801396315560074702200487555503612708498350 12940298999675675579588642888236610922205864806906 14864975748995768599128029077423069602653349151176 04319247461732810342030136008307585923493900720907 09875209583615653286546364321818375923337598328543 29052989317533692726608339636249804335480642838830 08123661398831602385581691638967393073513831901910 73415336238872693893185099676224924923929217356968 
19504295313584531641074934068744500074936290108349 58153559155389347802774206884320467759248341743960 48006973013539551268187279817589685670484287440146 05421834377138978917111671450768625032061658059995 85406503232567148020462991799431611384015570586523 58852508045995562952093385973141835366791803327096 80217333180312997081216239784565937349207787117546 29914411817732595427140979400994263957733427264737 29757557913338214068403561403845875360892453193573 92666972178821498259985381836723997350970266164892 96868992146507053507787686031955414948630508815561 81534492742626837745841509501340035575185545866258 01731330227658679470947867991908998005801312353910 83552457316921195450681523988038443523500576261736 72106174176541410805477030118479612226739507005391 36965385386322096999802893744920888199841684579760 78349927468227641710156799857237292512204028434615 95452837484928924363534585108117528740271114914737 03839809490061097949738662299917488472320970546584 28288930883365952599522026974997050515927837611769 83375160955577985429534777370393609916817795415523 73271727762926158243565690479067692357925210100950 42763971980362553967164666075926525255498239956533 90150839950346847844657971816428269624855904620519 74662214858594699923577350806058599119724672269982 69670820661606934608345836918378355747958086762889 05575495090090020396959408067867958245571987651073 74861231535861776330846676522607131298526935686662 94737854416606884961598063482789622865196450490370 43396363175856316752511822706253217902835968128310 79356364777075822460177225011900869920112351315946 47323090022967589388601144145840544801623171271565 68681192593368037260116546132659204733891466222512 17991186319920237834403917167752426122417340701999 74650396315730937569927461797686168186475297624529 98595843992508215937345497275215915101915923246085 35604268507330509830338186848156865546300876470920 73761218324109255761447293630880265629147584697263 03806151663495932714839832439310290315206091787066 
70674267885565342881777214099583923859191871870221 45795090672004652280516262234170471832500601444260 40358264275425185874122376739449503902440652769510 32674571104201745044798423117559990976866750325014 46826823407879823968425360524394938677858362243722 64892283688713430559334389210486428583776107995858 83607092924268750049180123692569815071514545095602 94161155322998893167801766828643769411452848055328 65661126028830980407763723462157342911834373879238 41215183789796207120569834554544496974346880600840 01644203237543784951317765091934926182876712576197 67865564501281309874606653039304785686495334951590 21964410760448523031462161187026322598151341168145 72720956140327969306447112656902589720981240861395 87877000063993898498756192527266460430150228453081 08742614910246860744001242709301534877561237253431 17315889404710831630912764250371315833158345710507 23440404209950030553142975113098148376874187298591 63475015047478292361277811744353228353708243183153 28580391272367267116992181916320246816702407469842 37827034509445941034724833661375134934478090660923 32094337568119639561626647638216146802414852149679 30819395377426023771715074243149472414031673993934 02137746503223707216406226414858892092052346232319 67811354139470595261625133695210478313746943181436 88734673328083009082597889458502773455620395455374 83006523566971871912737558711136270045221981821560 92190778344607092643012024769002727138168508236962 95821789144006416392041468737496978807764315762806 53210369515443993916958887120111224206165649722235 08693532906946918122748231024110458630696804806042 05996388269950965177435882508086079919433118621128 91754324424249517582658498124390877275201018016639 73153542970415598766900392363768597302514832825609 74169430723202399878983386535766425379859355725291 32442616477679088522524796355328365163019367626593 89384487424880700691541790176626726325781195050425 13181474377660934753947044340523447173713814534977 50793642218993304528552193049125169157374836929017 
80926012044172547481001186559298397522855763917351 41132016465497970234020097026124084546705606729514 42254463092663942711206762499963946517745595480656 33790209809018746828189556864697955675257741029212 91385138330311998481189728417154303992203954622461 29625040019654388337565770385683229507701071050856 29913115093522576350545056507043919957195306841281 51302589704764372301151202295746379050118808819387 61416864719809188636839535817045568300334115114894 30201744719458286493899759764110796395131968243285 71547584138789821871087424047559508505892065155028 16624158839147312194212821660528761736101475001485 66516422572110065573707223820092380052688800052806 63908450978929242365400709295520087094973274809106 01541360397832191536656237963723851643109831078393 71580851781180833997179948924490773715270680335388 08618572944786572205394657993298161815999839926950 04626608850207202622080640999259868214911296622332 38004066835400833535473880015037160451379817355817 92295557362516822108572507064376760696686009429529 25173504408727974950108825040315238831023477317389 40323686719767043703456248244832007751808814395187 04905484746819057123318556202216229583441346553717 23200716290168918397074970762947624764809421452634 64289966717314259018959518703662412413158329075686 03963519136797505289075965579096288322122291637222 26020190433000550841442026091307144782867100135699 52006675802692656543566071774846185254599667191775 11573465672807627164521643228405465169149807124218 48284593978323278028027469766253200815513882219871 77645760410027927747287411572163083557759078952577 97128524996594548925829659929867797775132521200155 76270477687465699004272929145928174706844858471039 62352135760660719474280661601329696164788178062614 62351886512754025568040309237521504554309701174368 08652424523560028228164422202943923402629217818654 52690700649198013197638563371008688348707330107224 72250649159997013857510699068604406509366375038175 57914864054286848836042284536873941367218188389716 
54753213466662227163911429345359079653372969098519 14846664193968882121700414718204757584484078048781 14257929120817684171224838060353311367511135615527 70449707301662433106895624383530559736019003112110 72020866021426553821633353923648307962410874958002 67720169938502515632718842424940224358886194979390 72475406126376873006931187184791894427432074521337 59816597924544429589765412718696937356893948346780 05719364538544515294747200727654260333498197915843 65365013852685543053805849197374826376294917478528 30927822199571032113729437223319867479721169647135 23319712049279923901215349954492374715372117145743 78628554826134753344458214096739964519670156700440 90950849772085174459494029999425599705281308573001 06122462547845760729381797745098691766238897821769 48668912119666341737410277053206626092546469675501 08276039273802922036939660343915948501093463409080 79854973462076861669323680155225222644010835400829 50821451393729783582123336622966051705099043145319 57779992692322648899291936158589431789997734466510 70932322951167736888945587463433881425722584037061 81397382332210204315449794425238729892061652678799 21901593894067045695576635306749683704935592101406 14693315426974759350479571888540863395857238144671 33130353010457052315867973331886590278882757666821 91619814272332249730959443676188112353050680164342 11513372119440084752707233456354376894312383409598 17091558013273204901990652574805902510432948714773 90485652268053414051603803208318738568627963067745 62660547095615756178178175769809592817169519066768 68045907373947349103193653151787027258622170211081 40277192115812643207501062711643506948015377884026 24118071559989854877465599441916519938915778987389 79956722117175969478216130755303553922609097148034 37529546866311512036312361693780695933766075062624 48406156704978862546010714417008122064126425093226 52626872952692199076279451193047352736047398019711 05803660976592009394820723775811983749713561115826 78033756243296206893206904106466544395377813776882 
31313976345634823943890715580173539134927010114993 64904777341173862362974755411858312125670685257945 51812049883976561478081964502978945888019797304721 37138453254627024341294657813265749892969721912513 61642450835468200141451346457251696786393635995541 54523485944621334937883365344169447352799029473675 05683720970721685921186958695137083763904218670537 02030526008774230811611974837201728161706210353975 47608588394427526071868116973281913923309157189766 64308506976508171190036885489506493779391572436185 20906929504816607527726736705111030572997410533606 88171968256439819855301593206224106112427227790259 97674709815692921798177367213802316315337184679387 44310421813798749962213033341872974093152674607686 18470035246333453592669254253671139046866245141604 64546928769999523414150753902908380718225576971151 40934516953215543224087769326290450008367923159884 15767117229116048423730900274038509582072671829237 83653840457299503135899464131984048605712773177233 67271917008892893902902228162549407799241663560689 65054451917049908134020398531419445099389223254955 28437650509000849507235708603398123152899945987765 47491848343173703102855436831999820730472869490598 75444299355153901004515957350674707263113851662776 23101065664254549852557972701263871081507885793076 01270017694831773366953197100432370657235884793795 15762434351112732365835688180698739424787511333190 77325362399970034303615187213273318976117356247443 68727088498038539977194528492466633821069501523319 33601909190226659267216823113640177099066512886299 21029823545161678114157953636790601586329814045158 82212017248708972153689990213636389705867831639034 00831409048755120184963446220571445326684183824083 259252131
    .....
    .....
    .....
    For those who are crazy enough and want to see it with their own eyes ...
    The Word document with 1000000! inside ... 2.55MB

    The screen shot:

    Without ANY optimization, it took 43735260 milliseconds, about 728 minutes ... around 12 hours ...
    With some threading, I think I can get it down dramatically; I'll just do it for the fun of it ...
    The 8MB of cache on my Nehalem will be really useful when combining the split multiplications ...
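
    For comparison, here is a sketch of what "combining split multiplications" could look like next to a plain loop. The post does not say how the 12-hour run was written, so `factorial_naive` is only a guess at an unoptimized version, and `product_tree` / `factorial_tree` are hypothetical names. The idea is that splitting the range in halves keeps both operands of each multiplication at a similar size, which is where big-integer libraries tend to do better than with one huge operand times one tiny one.

```python
import math


def product_tree(lo, hi):
    """Product of the integers lo..hi inclusive, by recursive halving."""
    if lo > hi:
        return 1
    if hi - lo < 8:  # small ranges: multiply directly
        return math.prod(range(lo, hi + 1))
    mid = (lo + hi) // 2
    # The two halves are independent, so they could run on separate threads.
    return product_tree(lo, mid) * product_tree(mid + 1, hi)


def factorial_tree(n):
    """n! computed by combining balanced sub-products."""
    return product_tree(1, n)


def factorial_naive(n):
    """Plain loop: the running result grows while the multiplier stays tiny."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
```

    Both functions return the same value; only the order in which the multiplications are grouped differs.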

    Basically: Done!
    Last edited by Drwho?; 08-16-2009 at 10:11 AM.
    DrWho, The last of the time lords, setting up the Clock.

  12. #87
    Xtremely Kool
    Join Date
    Jul 2006
    Location
    UK
    Posts
    1,875
    I can do it faster in my head, it's just that copy & paste to the forum from my head is problematic.

  13. #88
    YouTube Addict
    Join Date
    Aug 2005
    Location
    Klaatu barada nikto
    Posts
    17,574
    Quote Originally Posted by Final8ty View Post
    I can do it faster in my head, it's just that copy & paste to the forum from my head is problematic.
    Then drop to scientific notation and give us the first 13 digits [which is more than enough for any engineering project]
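
    Getting the leading digits in scientific notation does not require the full multi-million-digit result. A minimal sketch, assuming a truncated Stirling series for ln(n!) evaluated with Python's `decimal` module (this is a swapped-in technique, not something anyone in the thread used; `leading_digits` and `PI` are names invented here, and the series is only accurate enough for large n):

```python
from decimal import Decimal, localcontext

# pi to 50 decimal places; the decimal module has no built-in constant.
PI = Decimal("3.14159265358979323846264338327950288419716939937510")


def leading_digits(n, count=13, precision=50):
    """Return (digits, exponent) with n! ~= d.ddd... x 10**exponent.

    Intended for large n (say n >= 100); the truncated series loses
    accuracy for small n.
    """
    with localcontext() as ctx:
        ctx.prec = precision
        n_dec = Decimal(n)
        # Stirling series for ln(n!), truncated after the 1/n**5 term.
        ln_fact = (
            n_dec * n_dec.ln() - n_dec
            + (2 * PI * n_dec).ln() / 2
            + 1 / (12 * n_dec)
            - 1 / (360 * n_dec ** 3)
            + 1 / (1260 * n_dec ** 5)
        )
        log10_fact = ln_fact / Decimal(10).ln()
        exponent = int(log10_fact)  # integer part = power of ten
        mantissa = Decimal(10) ** (log10_fact - exponent)  # in [1, 10)
        digits = str(mantissa).replace(".", "")[:count]
    return digits, exponent
```

    Calling `leading_digits(10**6)` gives 13 digits that can be compared against the start of the number posted above.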

  14. #89
    Xtreme Enthusiast
    Join Date
    Dec 2007
    Posts
    816
    Quote Originally Posted by W1zzard View Post
    i dont think it is possible to make a general statement like "x86 will outperform opencl".

    from what i understand, the unique feature of opencl is that it can run threads on the cpu and the gpu. so run iterative/branching/recursive stuff on the cpu and use the gpu for stuff like matrix multiplications.

    of course in a commercial environment it is also important how quickly you can get things working. your factorial is a great example for that. 1 hour to write cpu code, weeks to write gpu code. coders cost money, time you often don't have, hardware is cheap, just get a few dozen boxes running the cpu implementation
    OpenCL is to Java what the Larrabee new instructions are to assembly code ... See the point?

    Francois

  15. #90
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by Drwho? View Post
    OpenCL is to Java what the Larrabee new instructions are to assembly code ... See the point?

    Francois
    no

  16. #91
    Xtreme Enthusiast
    Join Date
    Dec 2007
    Posts
    816
    Quote Originally Posted by W1zzard View Post
    no
    OpenCL is a software layer ... with drivers ... it has consequences ... and overhead ... Assembly code is known to be the fastest code on any architecture.
    A simple layer between the LrB new instructions and the programmer will do. We all know that as soon as a desktop application needs performance, it goes down to ASM.

    The massive number of Java desktops is gone, there is no more need for those ... almost every computer runs x86 from AMD, Intel or VIA.

    Last word ... Drivers seem to be a problem, there is a lot of empirical evidence about this ... a healthy PC uses fewer drivers in the long run.
    Before people make fun of the Intel drivers, some should clean up in front of their own doors ... ;-)

    Last edited by Drwho?; 08-16-2009 at 11:41 AM.

  17. #92
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by Drwho? View Post
    OpenCL is a software layer ... with drivers ... it has consequences ... and overhead ... Assembly code is known to be the fastest code on any architecture.
    A simple layer between the LrB new instructions and the programmer will do. We all know that as soon as a desktop application needs performance, it goes down to ASM.

    Last word ... Drivers seem to be a problem, there is a lot of empirical evidence about this ... a healthy PC uses fewer drivers in the long run.
    Before people make fun of the Intel drivers, some should clean up in front of their own doors ... ;-)

    i haven't seen anyone write assembler code on x86 in many years. even drivers are coded in a high level language. compilers are fairly smart, they produce good code, better than 90% of asm coders. asm code is a nightmare to maintain.
    what do you code in asm at intel?

  18. #93
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    Quote Originally Posted by Drwho? View Post
    OpenCL is a software layer ... with drivers ... it has consequences ... and overhead ... Assembly code is known to be the fastest code on any architecture.
    I believe you...

    I posted once that I thought OpenCL would probably be slower than CUDA and got my head bit off!

    The thread was locked in the end.
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)

  19. #94
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by Talonman View Post
    I posted once that I thought OpenCL would probably be slower than CUDA and got my head bit off!
    i doubt a significant difference will exist, even if nv implements opencl on top of cuda. got the link to the thread?

  20. #95
    Xtreme Enthusiast
    Join Date
    Dec 2007
    Posts
    816
    Quote Originally Posted by Talonman View Post
    I believe you...

    I posted once that I thought OpenCL would probably be slower than CUDA and got my head bit off!

    The thread was locked in the end.
    Please read totally before replying
    hehehe ... CUDA is like OpenCL ... a thick layer between the programmer and the hardware.

    For the next few months, it may be fine, but we can't have a PC fragmented with different APIs ... Don't get me wrong, I was a super lover of GLIDE, but it did not survive the "Standardization" ... I guess CUDA is in the same position. OpenCL and DirectX compute are more in a position to standardize ... but long term, integration will do its work.
    Network cards and sound cards used to be separate; they are now on the motherboard, and with system-on-chip they will very soon be in the CPU.
    This is the nature of this industry: when the CPU catches up from the bottom, it starts doing custom functions with generic transistors. The CPU goes through a transformation to do this, and we are at the beginning of general-purpose x86 cores getting into GFX.

    The software drivers for sound cards were never as good as the Creative Labs 48-bit cards, but they caught up on the whole required feature set. They are now the huge majority of what ships.

    Let's all share opinions, without beating on each other. I may be right, I may be wrong ... It is just what I think, and you are free to disagree.

    Francois
    Last edited by Drwho?; 08-16-2009 at 12:02 PM.

  21. #96
    Xtreme Enthusiast
    Join Date
    Dec 2007
    Posts
    816
    Quote Originally Posted by W1zzard View Post
    i haven't seen anyone write assembler code on x86 in many years. even drivers are coded in a high level language. compilers are fairly smart, they produce good code, better than 90% of asm coders. asm code is a nightmare to maintain.
    what do you code in asm at intel?
    I mostly use compilers for the soft architecture ... but most of the code that needs performance on PCs IS using ASM.

    * Most video codecs do. Check the x264 code, or run VTune on DivX or Windows Media ...
    * Most 3D rendering, like 3DSMax, uses intrinsics, almost a 1:1 translation to ASM.
    * Most audio programs have the MP3 parts using MMX.
    * Most drivers, including NV and ATI, use ASM to manage cache pollution properly when sending through PCI Express ... MOVNT and so on ...

    The list is very long, you just don't see it because we don't explain many details ...

    As soon as performance can give you a competitive advantage over your software competitor, people use ASM.

    I help a lot of those guys.

    Francois

  22. #97
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    Quote Originally Posted by Drwho? View Post
    Let's all share opinions, without beating on each other. I may be right, I may be wrong ... It is just what I think, and you are free to disagree.

    Francois
    I am with you 100%. I was kind of fired up yesterday, and might have gotten out of line toward you. Sorry about that.

    I think we can all learn from each other.

  23. #98
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by W1zzard View Post
    true, but back then there wasnt a competing technology available that was more mature in every regard except that it came with 1 potential order of magnitude execution time reduction promise.
    I hope you don't believe x86 to be the pinnacle of human innovation in ISAs.

    Quote Originally Posted by Solus Corvus View Post
    Being faster is the only thing that comes to mind at the moment.
    And the only thing that matters. It's not like x86 is any more parallel friendly than other platforms.

    But x86 at its start wasn't faster, easier to develop for, or more mature than some of the other ISAs of the time. IMO it was low cost and ubiquity that led to its early success.
    Exactly.

  24. #99
    Xtreme Addict
    Join Date
    Dec 2008
    Location
    Sweden, Linköping
    Posts
    2,034
    Quote Originally Posted by Drwho? View Post
    Network cards and sound cards used to be separate; they are now on the motherboard, and with system-on-chip they will very soon be in the CPU.
    This is the nature of this industry: when the CPU catches up from the bottom, it starts doing custom functions with generic transistors. The CPU goes through a transformation to do this, and we are at the beginning of general-purpose x86 cores getting into GFX.
    So going a bit off-topic perhaps... How far do you think System on Chip will develop within the coming years? I'd like your personal opinion on this.

    It is obvious that the three major PC vendors, Intel, AMD and Nvidia, have a large interest in this, and all of them either have a solution or are working on one that will be released very soon. Looking 10 years ahead, will anything have changed about our traditional computers, with a motherboard, a CPU and the option of a stronger GPU if we want one? Or will we, within 10 years, have started a transition in which the difference between a CPU and a GPU has become minimal, if not non-existent, and all PCs consist of a single chip on a PCB instead of the classical motherboard, CPU, GPU, etc.?

    We have seen SoCs making their way into phones and smaller handheld products, and soon Qualcomm will release its smartbooks: not performance monsters, but an SoC in a small package which I can carry around and use as a PC together with an OS (not Windows, of course).

    So my question really is, will this pattern follow into the PC market?
    SweClockers.com

    CPU: Phenom II X4 955BE
    Clock: 4200MHz 1.4375v
    Memory: Dominator GT 2x2GB 1600MHz 6-6-6-20 1.65v
    Motherboard: ASUS Crosshair IV Formula
    GPU: HD 5770

  25. #100
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by trinibwoy View Post
    I hope you don't believe x86 to be the pinnacle of human innovation in ISAs.
    so which isa is more successful?
