
Thread: SC09: Intel speaks about 3D Web, demonstrates LRB.

  1. #26
    Xtreme Member
    Join Date
    Feb 2008
    Posts
    340
    Quote Originally Posted by SQjay View Post
    Anyone wondering about 40 deactivated cores?
    It seems to me that they didn't have a spare NPP nearby to power up the whole chip!

    If half of the chip is already on the limit of 300W then all this comparing with 4870/5870 is useless.
    LRB with half the chip is on par with the 4870 performance-wise and on par with the 5970 (8x 4870) TDP-wise.
    So when it finally comes out it will be, say, 10x slower than R900 (a very optimistic scenario for Intel) if they cut power by half and get a 50% speedup in clocks.

    As a GPU, it would finally be nice to see LRB with TWO-DIGIT FPS in any modern game.
    Reading comprehension ftw.
    On the SGEMM single precision, dense matrix multiply test, Rattner showed Larrabee running at a peak of 417 gigaflops with half of its 80 cores activated; and with all of the cores turned on, it was able to hit 805 gigaflops. As the keynote was winding down, Rattner told the techies to overclock it, and was able to push a single Larrabee chip up to just over 1 teraflops, which is the design goal for the initial Larrabee co-processors.
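
As a rough check on those SGEMM figures, the sketch below computes how close the jump from half the cores to all of them comes to ideal linear scaling. The 417 and 805 gigaflops values are taken from the quoted article; the function name and the assumption that "all cores" means exactly twice as many active cores are illustrative only.

```python
def scaling_efficiency(partial_gflops, full_gflops, core_ratio):
    """Fraction of ideal linear speedup achieved when the active
    core count grows by a factor of core_ratio."""
    speedup = full_gflops / partial_gflops
    return speedup / core_ratio


# Figures from the quoted article (peak SGEMM throughput).
half_cores_gflops = 417.0
all_cores_gflops = 805.0

# Assumption: "all cores" is exactly 2x the "half" configuration.
speedup = all_cores_gflops / half_cores_gflops
efficiency = scaling_efficiency(half_cores_gflops, all_cores_gflops, 2.0)

print(f"speedup: {speedup:.2f}x, scaling efficiency: {efficiency:.1%}")
```

Under those assumptions the speedup is a bit over 1.9x, so the benchmark scales almost linearly with core count.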

    I'm not really sure that gaming is really what they are swinging at.

    Kicking off the SC09 supercomputing event...
    αποστασία

  2. #27
    Administrator
    Join Date
    Nov 2007
    Location
    Stockton, CA
    Posts
    3,569
    Quote Originally Posted by Drwho? View Post
    here is what you need to know: Here
    the rest is not important ...

    Pommm pom pom pom
    Now that was scary, 5:20am and I click that link

    haha good one

  3. #28
    Xtreme Mentor
    Join Date
    Nov 2005
    Location
    Devon
    Posts
    3,437
    Quote Originally Posted by Hornet331 View Post
    So if you go with the current best case (besides the theoretical 880 gflops), Intel's Larrabee can pull off twice the performance of an HD4870. Now it remains to be seen how much more performance the HD5870 has brought to the table. If the scaling is the same, it should be around 1.2TF.
    It's not theoretical, it's sustainable and the guy tweaked it to do 980GFlops/s on stock HD4870 ...
    Here is the thread to read --> http://forum.beyond3d.com/showthread.php?t=54842

    Some quotes:
    prunedtree
    The result ? I measure 880 Gflop/s for 4096x4096 dense matrix-matrix products. That makes a pair of HD4870x2 boards faster than nine GTX280s ^^
    prunedtree
    Well, given that double precision multiply-adds are four (five if you count the `t' unit) times slower but only require twice the bandwidth, it's much easier to achieve high ALU utilization. ATi's implementation is almost optimal, over 200 Gflop/s (out of 240 Gflop/s peak).
    prunedtree
    The 8x8 block kernel in the original post uses only 26 float4 registers. There's clearly plenty of margin, so how much further can we go ? Well, it's possible to fit 8x10 blocks using the integrality of the register file. This is 11% faster in theory. It achieves 980 Gflop/s in practice, over 4 multiply-adds per cycle on average.
    MicahVillmow [AMD/ATi]
    prunedtree,
    Congratulations on improving on our algorithm for dense-matmat mul. It is very impressive to see that people can take our code developed on older hardware and improve it past its original design. The original code was developed on the R600 hence the 8x4 design and later optimized for R670 but the original design didn't change.
    And there is much more good information in that thread!
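
One plausible reading of the "11% faster in theory" figure in the quotes above, assuming the kernel is limited by how many values it fetches per multiply-add rather than by the ALUs: for a block of `rows` x `cols` outputs, each inner-loop step fetches `rows + cols` values and performs `rows * cols` multiply-adds. This is a back-of-the-envelope model, not prunedtree's actual code; the function name is hypothetical. The 880, 980, 200 and 240 Gflop/s numbers are the ones quoted above.

```python
def madds_per_fetch(rows, cols):
    """Multiply-adds performed per value fetched in one inner-loop step
    of a rows x cols register-blocked matrix multiply (simplified model)."""
    return (rows * cols) / (rows + cols)


base = madds_per_fetch(8, 8)    # original 8x8 block
wide = madds_per_fetch(8, 10)   # enlarged 8x10 block
theoretical_gain = wide / base - 1.0

# Measured single-precision figures quoted in the thread.
measured_gain = 980.0 / 880.0 - 1.0

# Double-precision utilization quoted in the thread: 200 out of 240 Gflop/s.
dp_utilization = 200.0 / 240.0

print(f"theoretical gain from 8x10 blocks: {theoretical_gain:.1%}")
print(f"measured gain (880 -> 980 Gflop/s): {measured_gain:.1%}")
print(f"double-precision ALU utilization: {dp_utilization:.1%}")
```

In this model the theoretical and measured gains both land at roughly 11%, which is consistent with the quoted claim, though the real kernel may be constrained in other ways.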

    So ATi can do almost 1TF/s sustained on HD4870 vs. this new demo. Not that impressive. We don't know enough about Intel's implementation and its limitations, but we can be sure it's pretty well optimized as well.
    RiG1: Ryzen 7 1700 @4.0GHz 1.39V, Asus X370 Prime, G.Skill RipJaws 2x8GB 3200MHz CL14 Samsung B-die, TuL Vega 56 Stock, Samsung SS805 100GB SLC SDD (OS Drive) + 512GB Evo 850 SSD (2nd OS Drive) + 3TB Seagate + 1TB Seagate, BeQuiet PowerZone 1000W

    RiG2: HTPC AMD A10-7850K APU, 2x8GB Kingstone HyperX 2400C12, AsRock FM2A88M Extreme4+, 128GB SSD + 640GB Samsung 7200, LG Blu-ray Recorder, Thermaltake BACH, Hiper 4M880 880W PSU

    SmartPhone Samsung Galaxy S7 EDGE
    XBONE paired with 55'' Samsung LED 3D TV

  4. #29
    Xtreme Mentor
    Join Date
    Apr 2005
    Posts
    2,550
    Quote Originally Posted by kl0012 View Post
    I think that the biggest advantage of Larrabee is its ISA, so you don't need to deal with various proprietary APIs. Also, its memory model (coherent caches, general-purpose memory hierarchy) allows much higher flexibility in code development.
    LOL... proprietary APIs? since when are OpenCL and DirectCompute proprietary for exclusively one GPU?

    BTW to efficiently program Larrabee you'll need to learn how to use LRB instruction set and Larrabee's own High Level Programming Language...
    Adobe is working on Flash Player support for 64-bit platforms as part of our ongoing commitment to the cross-platform compatibility of Flash Player. We expect to provide native support for 64-bit platforms in an upcoming release of Flash Player following the release of Flash Player 10.1.

  5. #30
    all outta gum
    Join Date
    Dec 2006
    Location
    Poland
    Posts
    3,390
    Quote Originally Posted by Lightman View Post
    almost 1TF/s sustained
    OT: TF is teraflops, one trillion flops. Flops is one floating point operation per second, therefore there's no such thing as TF/s (floating point operation per second per second?).
    www.teampclab.pl
    MOA 2009 Poland #2, AMD Black Ops 2010, MOA 2011 Poland #1, MOA 2011 EMEA #12

    Test bench: empty

  6. #31
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    2-3x faster than a gtx280... fermi is supposed to be more than 2-3x faster than a gtx280 isnt it?
    but lrb will cost 300-400$, while fermi will come at a hefty 2500$+... i really dont understand nvidias price positioning... they either think lrb wont be ready anytime soon for when fermi comes out, or they think it wont perform nearly as well as fermi... or they have their head in the clouds and the prices are a wish on their part... which is actually what it looks like... 2500 and 4000 sounds like what nvidia would like to sell their gpus for... i just dont see it happening in this time-space reality we currently exist in ^^

    and i wonder about rasterization performance...
    while its a big achievement for gpus to do double precision and do lots of flops, single or double, its easy for lrb to have a high flops number... the difficult part for intel is getting rasterization perf right... and im curious how itll do there...

    cause lets face it, who here is gonna buy lrb to run some synthetic flops benchmark or run a gpgpu app?
    Last edited by saaya; 11-19-2009 at 10:46 AM.

  7. #32
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    Quote Originally Posted by saaya View Post
    cause lets face it, who here is gonna buy lrb to run some synthetic flops benchmark or run a gpgpu app?
    the same people who also use an nvidia card to cheat on 3DmarkV CPU scores?

  8. #33
    Xtreme Member
    Join Date
    Feb 2008
    Posts
    340
    Me! Seeing as I will be in grad school when this tech is mature, I'd like to play with it for as long as possible ahead of time. I'm really interested in the API support; the price is also an important issue that I've not seen anything about.
    αποστασία

  9. #34
    Xtreme Mentor
    Join Date
    Nov 2005
    Location
    Devon
    Posts
    3,437
    Quote Originally Posted by xoqolatl View Post
    OT: TF is teraflops, one trillion flops. Flops is one floating point operation per second, therefore there's no such thing as TF/s (floating point operation per second per second?).
    Good catch
    I deserve that

  10. #35
    Xtreme Enthusiast
    Join Date
    Nov 2008
    Location
    Sweden
    Posts
    621
    Quote Originally Posted by xoqolatl View Post
    OT: TF is teraflops, one trillion flops. Flops is one floating point operation per second, therefore there's no such thing as TF/s (floating point operation per second per second?).
    Can't we make it mean acceleration from 0 TF to x TF, kinda like in physics?
    Main Rig: Phenom II X6 1055T 95W @3562 (285x12.5) MHz, Corsair XMS2 DDR2 (2x2GB), Gigabyte HD7970 OC (1000 MHz) 3GB, ASUS M3A78-EM,
    Corsair F60 60 GB SSD + various HDDs, Corsair HX650 (3.3V/20A, 5V/20A, 12V/54A), Antec P180 Mini


    Notebook: HP ProBook 6465b w/ A6-3410MX and 8GB DDR3 1600

  11. #36
    Xtreme Member
    Join Date
    Sep 2008
    Posts
    235
    Quote Originally Posted by kl0012 View Post
    Here it is:
    http://www.theregister.co.uk/2009/11...ttner_keynote/



    80 cores? Probably the author's mistake. But rather interesting is the 805 GFLOPS in SGEMM. As a reference point, Tesla (GTX280) hits 370 GFLOPS in the same task.
    Rattner isn't even speaking about Larrabee but about its 80 core research chip:

    Intel to unveil energy-efficient, many-core research chip
    http://www.computerworld.com/s/artic...p?taxonomyId=1


    Regards, Hans

  12. #37
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    Quote Originally Posted by Drwho? View Post
    here is what you need to know: Here
    the rest is not important ...

    Pommm pom pom pom
    is that supposed to be funny?
    is it referring to something else? is it a parody?

    Quote Originally Posted by Hans de Vries View Post
    Rattner isn't even speaking about Larrabee but about its 80 core research chip:

    Intel to unveil energy-efficient, many-core research chip
    http://www.computerworld.com/s/artic...p?taxonomyId=1


    Regards, Hans
    ha! journalism at its best
    Last edited by saaya; 11-20-2009 at 03:12 AM.

  13. #38
    I am Xtreme
    Join Date
    Jul 2007
    Location
    Austria
    Posts
    5,485
    Quote Originally Posted by saaya View Post
    ha! journalism at its best
    indeed, so we are talking about polaris and not larrabee... wtf.

  14. #39
    Coat It with GOOOO
    Join Date
    Aug 2006
    Location
    Portland, OR
    Posts
    1,608
    Quote Originally Posted by kl0012 View Post
    Here it is:
    http://www.theregister.co.uk/2009/11...ttner_keynote/



    80 cores? Probably the author's mistake. But rather interesting is the 805 GFLOPS in SGEMM. As a reference point, Tesla (GTX280) hits 370 GFLOPS in the same task.
    sounds like someone's gotten mixed up with the Polaris test chip
    Main-- i7-980x @ 4.5GHZ | Asus P6X58D-E | HD5850 @ 950core 1250mem | 2x160GB intel x25-m G2's |
    Wife-- i7-860 @ 3.5GHz | Gigabyte P55M-UD4 | HD5770 | 80GB Intel x25-m |
    HTPC1-- Q9450 | Asus P5E-VM | HD3450 | 1TB storage
    HTPC2-- QX9750 | Asus P5E-VM | 1TB storage |
    Car-- T7400 | Kontron mini-ITX board | 80GB Intel x25-m | Azunetech X-meridian for sound |


  15. #40
    Xtreme Member
    Join Date
    Feb 2008
    Posts
    340
    I remember seeing a video on this chip.
    αποστασία

  16. #41
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    so what are the specs of this polaris chip? chip size and TDP?

  17. #42
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    Quote Originally Posted by Manicdan View Post
    so what are the specs of this polaris chip? chip size and TDP?
    Intel has a department called Terascale computing, basically Skunk Works of Intel.

    There they develop and test concepts and products which will deliver TFlops of performance and TBs of BW. Polaris was basically an 80 core chip that was testing whether a 2D interconnect is able to scale to such core counts.
    In other words, I/O is the bottleneck in many-core chips, not the cores themselves.

    Polaris isn't new; IIRC it was demoed back in 2007. It is 275 mm², built on 65 nm, burns 62 W and reached 1 Tflop of performance. On-chip data BW is 1.6 TB/s with a 2D mesh.
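
To put those Polaris numbers in perspective, here is a small sketch that turns them into performance per watt and per unit of die area. The inputs are the figures recalled in this post (1 Tflop, 62 W, 275 mm²), so treat the results as approximate; the variable names are just for illustration.

```python
# Figures as recalled in the post above (approximate).
peak_gflops = 1000.0   # about 1 Tflop
power_watts = 62.0
die_area_mm2 = 275.0

# Derived efficiency metrics.
gflops_per_watt = peak_gflops / power_watts
gflops_per_mm2 = peak_gflops / die_area_mm2

print(f"{gflops_per_watt:.1f} Gflops per watt")
print(f"{gflops_per_mm2:.2f} Gflops per mm^2")
```

That works out to roughly 16 Gflops per watt, which helps explain why a research chip like this looks so good in a pure flops demo.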
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  18. #43
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    polaris is basically a mentally challenged half brother of atom and grandfather of LRB
    dont really understand why they were showing it off tbh... LRB would have been much more interesting...
    i read this as bad news, sounds as if LRB is still not working properly? :/

  19. #44
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    Quote Originally Posted by saaya View Post
    polaris is basically a mentally challenged half brother of atom and grandfather of LRB
    dont really understand why they were showing it off tbh... LRB would have been much more interesting...
    i read this as bad news, sounds as if LRB is still not working properly? :/
    sounds like Polaris is Philip J Fry

  20. #45
    Xtreme Addict
    Join Date
    Aug 2008
    Location
    Hollywierd, CA
    Posts
    1,284
    i remember reading about this a few years ago, wondering if they were trying to mimic nvidia's gpu philosophy... i find it interesting that it is still a simple 80 core. iirc, the purpose of the chip initially was to develop inter-core communications for future many core designs. i also remember they had working silicon back then, so i wonder what has changed and why they would feel the need to demo the silicon now?

    I am an artist (EDM producer/DJ), pls check out mah stuff.

  21. #46
    I am Xtreme
    Join Date
    Jul 2007
    Location
    Austria
    Posts
    5,485
    Quote Originally Posted by 570091D View Post
    i remember reading about this a few years ago, wondering if they were trying to mimic nvidia's gpu philosophy... i find it interesting that it is still a simple 80 core. iirc, the purpose of the chip initially was to develop inter-core communications for future many core designs. i also remember they had working silicon back then, so i wonder what has changed and why they would feel the need to demo the silicon now?
    Well maybe they made the cores more complex, the chip back in 07 had very simple cores, and was basically a co processor, not even that.

    Now they could run a fully fledged benchmark on it.

  22. #47
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,366
    Did Theo Valich change his mind?
    http://www.brightsideofnews.com/news...n-1h-2010.aspx
    During regular quarterly meetings with large OEMs, Larrabee was recently brought out and Intel's representatives stated that the part will surprise the competition with performance. Among other things, Intel re-iterated that the part is on track for introduction in very late first half of 2010, which would pitch the part in time for 30th anniversary of Computex Taipei 2010 [June 1-5].
    Also a few words from DreamWorks:
    http://www.fastcompany.com/magazine/...m-machine.html
    DreamWorks is piloting Intel's new Larrabee chip. It will drastically improve the long rendering times required by the animation process: "It's going to change every single thing we do," says Katzenberg.

  23. #48
    Xtreme Member
    Join Date
    Dec 2007
    Location
    CR:IA
    Posts
    384
    n/m
    Last edited by ChinStrap; 11-25-2009 at 11:14 AM.
    PC-A04 | Z68MA-ED55 | 2500k | 2200+ XPG | 7970 | 180g 520 | 2x1t Black | X3 1000w

  24. #49
    Xtreme Member
    Join Date
    Dec 2008
    Posts
    285
    Quote Originally Posted by Cybercat View Post
    Why, because they didn't demonstrate it at an HPC event?
    HPC is a relatively small market ATM and will stay so unless everybody goes cloud computing crazy. If they were confident of its performance in rasterized 3D gfx then they would have demoed the thing at a consumer PC hardware event.
    Core i7 920, Gigabyte x58-USB3, Radeon 5850 [CF coming soon], 6GB OCZ Platinum, Corsair 40GB Force, 3x 2TB Spinpoint F4, Silverstone OP1000, Dell XPS Studio Case.

    Alienware M11x.

  25. #50
    Xtreme Mentor
    Join Date
    Aug 2006
    Location
    HD0
    Posts
    2,646
    Quote Originally Posted by xoqolatl View Post
    OT: TF is teraflops, one trillion flops. Flops is one floating point operation per second, therefore there's no such thing as TF/s (floating point operation per second per second?).
    it's how quickly your flop of a product accelerates...
