Page 6 of 82
Results 126 to 150 of 2036

Thread: The GT300/Fermi Thread

  1. #126
    Xtreme Addict
    Join Date
    Aug 2007
    Location
    Istantinople
    Posts
    1,574
    Those numbers in the AMD slides are the most misleading things I have ever seen. Shader cores and SP performance? 4870 should be a hell of a lot faster than GTX 285 if they are to mean anything.
    Has anyone really been far even as decided to use even go want to do look more like?
    INTEL Core i7 920 // ASUS P6T Deluxe V2 // OCZ 3G1600 6GB // POWERCOLOR HD5970 // Cooler Master HAF 932 // Thermalright Ultra 120 Extreme // SAMSUNG T260 26"

  2. #127
    Xtreme Enthusiast
    Join Date
    Jun 2005
    Posts
    960
    Quote Originally Posted by annihilat0r View Post
    Those numbers in the AMD slides are the most misleading things I have ever seen. Shader cores and SP performance? 4870 should be a hell of a lot faster than GTX 285 if they are to mean anything.
    Yes, they are marketing stuff.
    Just like Nvidia's slides talking about "8X DP performance improvement" and not mentioning that the SP improvement is marginal.

  3. #128
    Xtreme Addict
    Join Date
    Aug 2007
    Location
    Istantinople
    Posts
    1,574
    Marginal being, like, double the previous SP?

    Plus, when Nvidia is misleading with something, they are greedy, cruel capitalists out for the money in our pockets; but when AMD is misleading it's "marketing". It's always like that around here.
    Has anyone really been far even as decided to use even go want to do look more like?
    INTEL Core i7 920 // ASUS P6T Deluxe V2 // OCZ 3G1600 6GB // POWERCOLOR HD5970 // Cooler Master HAF 932 // Thermalright Ultra 120 Extreme // SAMSUNG T260 26"

  4. #129
    Xtreme Enthusiast
    Join Date
    Dec 2008
    Posts
    752
    Seriously, those kinds of slides make me sick. They're so wrong. You'd see the same kind of "fail" hardware for Nvidia if you compared the GTX 280 to the 4870. Except the GTX 280 outperformed the 4870.

    And I agree with annihilat0r: if Nvidia does something like this it's "haha, the fools, we won't fall for their rebranding and overhyped product", but with AMD it's simply "marketing."

    Stop being so biased, people. The sad part is that the people who say it's just marketing are the same people who always call others fanboys.

  5. #130
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    Quote Originally Posted by Piotrsama View Post
    Yes, they are marketing stuff.
    Just like Nvidia's slides talking about "8X DP performance improvement" and not mentioning that the SP improvement is marginal.
    Theoretically the improvement is marginal, but in the real world it is much faster. The Tesla 2000 series is 80% faster than last gen if you exclude the MUL unit. It's even faster with FMA efficiency. Comparing peak flops rather than the usual n-body demo is not a good marketing technique. Besides, having full-speed double precision is a very important feature of Fermi.

  6. #131
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    The funny thing is that a lot of HPC workloads are bandwidth limited and can't even make use of all the flops because they can't get data to the cores fast enough. That's why caching and the use of shared memory is so important. For example, a lot of compute workloads just don't play nice with HD4xxx cards because the LDS there doesn't really function like it should. So people resort to a lot of other trickery like using the texture units instead to pump data into the cores but that's obviously not a scalable approach. Things should be much better with HD5xxx but I haven't seen independent confirmation yet.
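To put the bandwidth-limit point in numbers, here is a minimal roofline-style sketch. The 1000 GFLOPS and 100 GB/s figures are made-up round numbers for a hypothetical card, not specs of any real part, and `attainable_gflops` is just an illustrative name.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline-style upper bound on throughput.

    A kernel cannot run faster than the compute peak, and it cannot
    run faster than memory bandwidth times the flops it performs
    per byte fetched (its arithmetic intensity).
    """
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


# Hypothetical card (assumed numbers, for illustration only).
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 100.0

for intensity in (0.25, 1.0, 4.0, 16.0):
    bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, intensity)
    print(f"{intensity:5.2f} flops/byte -> at most {bound:7.1f} GFLOPS")
```

With these assumed numbers, anything doing fewer than 10 flops per byte fetched from memory never sees the peak. Caches and shared memory help because they raise the effective flops per byte of off-chip traffic.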

  7. #132
    Xtreme Enthusiast
    Join Date
    Jun 2005
    Posts
    960
    Quote Originally Posted by annihilat0r View Post
    Marginal being, like, double the previous SP?
    No, like the jump from the 933 GFLOPS of the current C1060 to the 1040 GFLOPS of the C2050 that is coming in Q2 2010.

    Which amounts to: 1.11X

    Quote Originally Posted by annihilat0r View Post
    Plus, when Nvidia is misleading with something, they are greedy, cruel capitalists out for the money in our pockets; but when AMD is misleading it's "marketing". It's always like that around here.
    I didn't say that.
    Re-read what I wrote:
    Quote Originally Posted by Piotrsama View Post
    Yes, they are marketing stuff.
    Just like Nvidia's slides
    talking about "8X DP performance improvement" and not mentioning that SP improvement is marginal.
    I'm saying that both sides' slides are marketing.

    Quote Originally Posted by orangekiwii View Post
    And I agree with annihilat0r: if Nvidia does something like this it's "haha, the fools, we won't fall for their rebranding and overhyped product", but with AMD it's simply "marketing."
    Same for you.
    Last edited by Piotrsama; 12-21-2009 at 10:41 AM.

  8. #133
    Xtreme Mentor
    Join Date
    Jul 2008
    Location
    Shimla , India
    Posts
    2,631
    Quote Originally Posted by trinibwoy View Post
    The funny thing is that a lot of HPC workloads are bandwidth limited and can't even make use of all the flops because they can't get data to the cores fast enough. That's why caching and the use of shared memory is so important. For example, a lot of compute workloads just don't play nice with HD4xxx cards because the LDS there doesn't really function like it should. So people resort to a lot of other trickery like using the texture units instead to pump data into the cores but that's obviously not a scalable approach. Things should be much better with HD5xxx but I haven't seen independent confirmation yet.
    Yes they are 960 dwords/clock.

    Of course this is advertised performance; I would say the worst they can do is around 960/3 dwords/clock, which is still a lot more than RV770.

    Now Fermi has 16 LDS, if I am not mistaken, and works with 32 executions/clock. Given the huge bandwidth, I would say it is quite a bit more than what Evergreen can offer; not so sure about dual Evergreen, though.
    Coming Soon

  9. #134
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    Quote Originally Posted by trinibwoy View Post
    The funny thing is that a lot of HPC workloads are bandwidth limited and can't even make use of all the flops because they can't get data to the cores fast enough. That's why caching and the use of shared memory is so important. For example, a lot of compute workloads just don't play nice with HD4xxx cards because the LDS there doesn't really function like it should. So people resort to a lot of other trickery like using the texture units instead to pump data into the cores but that's obviously not a scalable approach. Things should be much better with HD5xxx but I haven't seen independent confirmation yet.
    yes physics simulations will be a lot faster on fermi than previously. i do remember seeing a slide on the cache hierarchy and it was 3x faster than gt200. it was CFD i think.
    Quote Originally Posted by Piotrsama View Post
    No, like the jump from the 933 GFLOPS of the current C1060 to the 1040 GFLOPS of the C2050 that is coming in Q2 2010.

    Which amounts to: 1.11X
    Well, first off, you compared the old high-end Tesla to the new low-end Tesla. Secondly, theoretical flops are not a measure of real-world performance (e.g. Larrabee). The MUL unit in GT200 makes the card look faster than it really is. It can be used, but not as often as it should. You might want to take other factors into consideration, like 6GB of RAM, FMA, the memory hierarchy, more bandwidth, etc.

  10. #135
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    http://www.nvidia.com/docs/IO/43395/...83-001_v01.pdf



    So clocks are actually 1.25-1.4 GHz. Tesla SKUs will have 2 cores disabled, probably due to yields and/or power consumption.

  11. #136
    Xtreme Enthusiast
    Join Date
    Jun 2005
    Posts
    960
    Quote Originally Posted by Chumbucket843 View Post
    Well, first off, you compared the old high-end Tesla to the new low-end Tesla. Secondly, theoretical flops are not a measure of real-world performance (e.g. Larrabee). The MUL unit in GT200 makes the card look faster than it really is. It can be used, but not as often as it should. You might want to take other factors into consideration, like 6GB of RAM, FMA, the memory hierarchy, more bandwidth, etc.
    Yes, I realize that FLOPS aren't a real measure, but that didn't stop Nvidia from hyping an 8X improvement in DP FLOPS, did it?

    What's the high end you're talking about? The C2070?
    If it is (which I'm not sure of), the C2070 is 1260 GFLOPS and is coming in Q3 2010.

    The jump from 933 to 1260 is 1.35X (and the card won't be on the market until Q3 2010).
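For anyone who wants to check the ratios quoted in this exchange, a quick sketch using only the peak SP figures posted in the thread (933, 1040 and 1260 GFLOPS). The `speedup` helper is a hypothetical name, and these are theoretical peaks, not measured performance.

```python
def speedup(old_gflops, new_gflops):
    """Ratio of new peak throughput to old peak throughput."""
    return new_gflops / old_gflops


# Peak single-precision figures as quoted in the thread.
C1060_GFLOPS = 933.0
C2050_GFLOPS = 1040.0
C2070_GFLOPS = 1260.0

c2050_ratio = round(speedup(C1060_GFLOPS, C2050_GFLOPS), 2)
c2070_ratio = round(speedup(C1060_GFLOPS, C2070_GFLOPS), 2)

print(f"C1060 -> C2050: {c2050_ratio}X")
print(f"C1060 -> C2070: {c2070_ratio}X")
```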

  12. #137
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by Piotrsama View Post
    Yes, I realize that FLOPS aren't a real measure, but that didn't stop Nvidia from hyping an 8X improvement in DP FLOPS, did it?
    Sure, you can regurgitate Nvidia's marketing numbers or you can draw your own conclusions based on what we know about the architectures so far.

    GT200: MAD + MUL/SFU
    Fermi: MAD + SFU

    In the case of GT200 they assign the SFU 1 flop. With Fermi the contribution of the SFU isn't counted which further under-represents the compute power available. It's probably best to be conservative with Fermi expectations though - it'll limit the disappointment if it bombs
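The counting difference can be sketched as cores x shader clock x flops per core per clock. The 448 cores and the 1.25-1.4 GHz range come from the thread; the 240 cores and 1.296 GHz shader clock for the C1060 are assumed here, as is counting MAD/FMA as 2 flops and the extra MUL as 1. `peak_gflops` is an illustrative helper, not anything from Nvidia's documentation.

```python
def peak_gflops(cores, shader_clock_ghz, flops_per_core_per_clock):
    """Theoretical peak: every core issues its maximum flops every clock."""
    return cores * shader_clock_ghz * flops_per_core_per_clock


# GT200 (Tesla C1060): assumed 240 cores at 1.296 GHz.
gt200_with_mul = peak_gflops(240, 1.296, 3)   # MAD (2 flops) + MUL (1 flop)
gt200_mad_only = peak_gflops(240, 1.296, 2)   # MAD only

# Fermi Tesla: 448 cores, 1.25 to 1.4 GHz per the thread, FMA = 2 flops.
fermi_low = peak_gflops(448, 1.25, 2)
fermi_high = peak_gflops(448, 1.4, 2)

print(f"GT200 MAD+MUL: {gt200_with_mul:.1f} GFLOPS")
print(f"GT200 MAD only: {gt200_mad_only:.1f} GFLOPS")
print(f"Fermi at 1.25 GHz: {fermi_low:.1f} GFLOPS "
      f"({fermi_low / gt200_mad_only:.2f}X MAD-only GT200)")
print(f"Fermi at 1.40 GHz: {fermi_high:.1f} GFLOPS "
      f"({fermi_high / gt200_mad_only:.2f}X MAD-only GT200)")
```

Under those assumptions the MAD+MUL count gives about 933 GFLOPS for GT200, and dropping the MUL puts Fermi at roughly 1.8X to 2X the older part on paper rather than 1.1X.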

  13. #138
    Xtreme Addict
    Join Date
    Aug 2007
    Location
    Istantinople
    Posts
    1,574
    Has anyone really been far even as decided to use even go want to do look more like?
    INTEL Core i7 920 // ASUS P6T Deluxe V2 // OCZ 3G1600 6GB // POWERCOLOR HD5970 // Cooler Master HAF 932 // Thermalright Ultra 120 Extreme // SAMSUNG T260 26"

  14. #139
    Xtreme Mentor
    Join Date
    Apr 2005
    Posts
    2,550
    Expected! Performance in line of GTX295!
    Adobe is working on Flash Player support for 64-bit platforms as part of our ongoing commitment to the cross-platform compatibility of Flash Player. We expect to provide native support for 64-bit platforms in an upcoming release of Flash Player following the release of Flash Player 10.1.

  15. #140
    Xtreme Addict
    Join Date
    Aug 2007
    Location
    Istantinople
    Posts
    1,574
    well that's for Tesla actually
    Has anyone really been far even as decided to use even go want to do look more like?
    INTEL Core i7 920 // ASUS P6T Deluxe V2 // OCZ 3G1600 6GB // POWERCOLOR HD5970 // Cooler Master HAF 932 // Thermalright Ultra 120 Extreme // SAMSUNG T260 26"

  16. #141
    Xtreme Addict
    Join Date
    Nov 2007
    Posts
    1,195
    Is 1.4 GHz the shader clock or the GPU clock?

  17. #142
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    ^^shader, tesla does not have a core clock.

    Wow. Memory clocks are quite low. That's only 50% faster than DDR3 and not much faster than the 1600MHz (effective) GDDR3 of the Tesla C1070. It is still probably a major source of power consumption, so they really need some 2Gb GDDR5 ICs. Judging by the voltage of Tesla, which is probably higher than GeForce (like Xeons/Opterons are to Phenom/Core), it could hit the original target clocks if frequency scales properly and power consumption is still within reason.

  18. #143
    I am Xtreme
    Join Date
    Dec 2008
    Location
    France
    Posts
    9,060
    Professional cards are usually lower clocked than gamer ones anyway.
    But doesn't look bad IMO, does it?
    Donate to XS forums
    Quote Originally Posted by jayhall0315 View Post
    If you are really extreme, you never let informed facts or the scientific method hold you back from your journey to the wrong answer.

  19. #144
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    1 GHz GDDR5 is 88% more bandwidth than the 800 MHz GDDR3 on the C1070. Don't see that being a problem at all.
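One way to arrive at the 88% figure, as a rough sketch. The bus widths are assumptions not stated in the thread (512-bit for the GDDR3 card, 384-bit for Fermi), along with the usual 2 transfers per command clock for GDDR3 and 4 for GDDR5. `bandwidth_gbs` is an illustrative helper.

```python
def bandwidth_gbs(bus_width_bits, command_clock_mhz, transfers_per_clock):
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * command_clock_mhz * transfers_per_clock / 1000


# Assumed: 512-bit bus, 800 MHz GDDR3, double data rate.
gddr3_gbs = bandwidth_gbs(512, 800, 2)

# Assumed: 384-bit bus, 1000 MHz GDDR5, four transfers per command clock.
gddr5_gbs = bandwidth_gbs(384, 1000, 4)

increase_pct = (gddr5_gbs / gddr3_gbs - 1) * 100

print(f"GDDR3: {gddr3_gbs:.1f} GB/s")
print(f"GDDR5: {gddr5_gbs:.1f} GB/s")
print(f"Increase: {increase_pct:.1f}%")
```

If the bus widths are different from what is assumed here, the percentage changes accordingly; a narrower bus on Fermi is why the gain is less than the 2.5X jump in per-pin data rate.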

  20. #145
    Xtreme Mentor
    Join Date
    Jul 2008
    Location
    Shimla , India
    Posts
    2,631
    Nvidia castrates Fermi to 448SPs

    "IT LOOKS LIKE we were right about Fermi being too big, too hot, and too late, Nvidia just castrated it to 448SPs. Even at that, it is a 225 Watt part, slipping into the future.

    The main point is from an Nvidia PDF first found here. On page 6, there are some interesting specs, 448 stream processors (SPs), not 512, 1.40GHz, slower than the G200's 1.476GHz, and the big 6GB GDDR5 variant is delayed until 2H 2010. To be charitable, the last one isn't Nvidia's fault, it needs 64x32GDDR5 to make it work, and that isn't coming until 2H 2010 now..."

    http://www.semiaccurate.com/2009/12/...-fermi-448sps/

    Ouch
    Coming Soon

  21. #146
    I am Xtreme
    Join Date
    Oct 2004
    Location
    U.S.A.
    Posts
    4,743
    Quote Originally Posted by ajaidev View Post
    Nvidia castrates Fermi to 448SPs

    "IT LOOKS LIKE we were right about Fermi being too big, too hot, and too late, Nvidia just castrated it to 448SPs. Even at that, it is a 225 Watt part, slipping into the future.


    That's why you wait for Nvidia to refresh Fermi. Nvidia has been too predictable lately, with the poorly designed GTX 280 (refreshed to 285 which still needs better cooling) and their dual pcb sandwiches.


    Asus Z9PE-D8 WS with 64GB of registered ECC ram.|Dell 30" LCD 3008wfp:7970 video card

    LSI series raid controller
    SSDs: Crucial C300 256GB
    Standard drives: Seagate ST32000641AS & WD 1TB black
    OSes: Linux and Windows x64

  22. #147
    Xtreme Addict
    Join Date
    Aug 2007
    Location
    Istantinople
    Posts
    1,574
    which node will a Fermi refresh be on? 32?
    Has anyone really been far even as decided to use even go want to do look more like?
    INTEL Core i7 920 // ASUS P6T Deluxe V2 // OCZ 3G1600 6GB // POWERCOLOR HD5970 // Cooler Master HAF 932 // Thermalright Ultra 120 Extreme // SAMSUNG T260 26"

  23. #148
    Diablo 3! Who's Excited?
    Join Date
    May 2005
    Location
    Boulder, Colorado
    Posts
    9,412
    Fermi is on 40nm. Weren't the last few rounds of GPUs done on half-nodes like 55nm and 40nm, compared to 65nm and 45nm for CPUs? So maybe 28nm for the Fermi refresh, if TSMC does the right spirit dance by 2011?

  24. #149
    Xtreme Addict
    Join Date
    Mar 2006
    Location
    Saskatchewan, Canada
    Posts
    2,207
    If those are the specs for the consumer card, that's really disappointing.

    Nvidia can probably make do with 40nm for the refresh, since the problem seems to be more about faults in the process and yields than anything else.

    These things, if they are held back by heat, will overclock like crazy (at least under LN2).

    I am really surprised they are not getting higher clocks, because from what I remember, the MUL was what made the clock drop from the 9800 GTX to the GTX 280.

    I am guessing ECC has a role to play in the clock drops all around, and without it in the consumer version, hopefully they can bring the clocks up a bit.
    Core i7 920@ 4.66ghz(H2O)
    6gb OCZ platinum
    4870x2 + 4890 in Trifire
    2*640 WD Blacks
    750GB Seagate.

  25. #150
    Xtreme Addict
    Join Date
    Apr 2004
    Posts
    1,640
    It's pretty bad to be stuck with 448 SPs when they've been promising 512 up and down since GDC. I mean they couldn't even do 480? And they have to do that for the whole product line, they can't give the more expensive cards better binning? Something must be REALLY wrong with it then.

    Here's to hoping A3 works out for them.
    DFI LANParty DK 790FX-B
    Phenom II X4 955 BE (1003GPMW) @ 3.8GHz (19x200) w/1.36v
    -cooling: Scythe Mugen 2 + AC MX-2
    XFX ATI Radeon HD 5870 1024MB
    8GB PC2-6400 G.Skill @ 800MHz (1:2) 5-5-5-15 w/1.8v
    Seagate 1TB 7200.11 Barracuda
    Corsair HX620W


    Support PC gaming. Don't pirate games.
