
Thread: Radeon HD 6870 could widely outperform gtx 480

  1. #76
    Xtreme Enthusiast
    Join Date
    Dec 2009
    Location
    Burbank, CA
    Posts
    563
    Quote Originally Posted by 570091D View Post
    and you, of course, realize this news came from kitguru... the same site that saw a custom gtx480 pcb and declared the coming of the 512sp gtx485.

    and yes i do think that amd would ship a modified version of evergreen with better tessellation performance and only a modest improvement for all other aspects of the chip.
    hahaha, yes I remember that. I'm hoping that the 6000 is a fast card. Who doesn't want a faster video card? I sure do! Bring it on AMD!!!

  2. #77
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    A little speculation on my part. From here we see this table:


    If SI has 1920 SPs, then that's a 20% increase in stream processor count, meaning the die would be roughly ~3.2% larger (since a 40% bigger die, at the same node, got AMD ~2.5x the SPs going from RV670 to RV770). Rounding up to a ~4% die area investment gives ~347mm2, basically the same die size as Cypress. If the SPs are reorganized in a 4D scheme and utilization is better (as rumored) compared to Cypress' approach, then the SI 6000 series can bring more than a 20% performance improvement with almost no die space investment. Keep in mind that a tessellation improvement was mentioned in the news too, so overall SI, done @ 40nm, could mean more performance @ the same die space and the same or slightly higher TDP envelopes. It doesn't have to be named 6870; a 6770 would suffice.
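
For anyone who wants to rerun the estimate in this post, here is a minimal Python sketch. The 1920 SP count is the rumor, the 40% and 250% figures are the post's own rough RV670 -> RV770 numbers, and CYPRESS_DIE_MM2 = 334 is an assumed value (the commonly cited Cypress die size) that the post does not state. The function names are made up for illustration.

```python
# Back-of-the-envelope die size estimate, using the post's rough figures.
CYPRESS_SPS = 1600       # stream processors on Cypress
RUMORED_SI_SPS = 1920    # rumored SI count, not confirmed
CYPRESS_DIE_MM2 = 334.0  # assumption: commonly cited Cypress die size


def sp_increase_pct(old_sps: int, new_sps: int) -> float:
    """Percentage increase in stream processor count."""
    return (new_sps - old_sps) / old_sps * 100.0


def die_cost_pct(sp_pct: float, ref_die_pct: float = 40.0,
                 ref_sp_pct: float = 250.0) -> float:
    """Scale an SP increase by the post's RV670 -> RV770 ratio
    (~40% die growth for its ~250% SP figure)."""
    return sp_pct * ref_die_pct / ref_sp_pct


def projected_die_mm2(base_mm2: float, growth_pct: float) -> float:
    """Die size after growing the base die by growth_pct percent."""
    return base_mm2 * (1.0 + growth_pct / 100.0)


sp_pct = sp_increase_pct(CYPRESS_SPS, RUMORED_SI_SPS)
print(f"SP increase: {sp_pct:.1f}%")
print(f"Die cost by the post's ratio: {die_cost_pct(sp_pct):.1f}%")
# The post rounds the die cost up to ~4% before projecting.
print(f"Projected die: {projected_die_mm2(CYPRESS_DIE_MM2, 4.0):.0f} mm2")
```

This only reproduces the ratio argument as stated; it says nothing about whether one generation's ratio carries over to the next.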

  3. #78
    Xtreme Member
    Join Date
    Oct 2009
    Location
    Santos(São Paulo), Brasil.
    Posts
    202
    Quote Originally Posted by SnipingWaste View Post
    I find it funny that y'all think there can't be much improvement because it's going to be on 40nm. Just look at the RV670 and RV770. Both are on 55nm and there is a nice improvement from the RV670 to the RV770.
    There was a nice improvement because the RV770's shader count is 150% bigger than the RV670's.
    That can't apply here; the RV670 was REALLY small and didn't eat much power, so there was a lot of room for that improvement.
    But the RV870 is not small; such a big jump in shader count would mean a huge, really huge chip.
    It's going to have some nice improvements, but nothing comparable with RV670 -> RV770. That will only be possible at 28nm.
    AMD Phenom II X6 1055T @ 4009MHz
    NB @ 2673MHz
    Corsair H50 + Scythe Ultra Kaze 3k
    Gigabyte GA-MA790X-UD4P
    2X2GB DDR2 OCZ Gold
    XFX Radeon HD5850 XXX @ 900MHz Core
    OCZ Agility2 60GB
    2x500GB HDD WD Blue
    250GB Samsung
    SevenTeam 620W PAF
    CoolerMaster CM690

  4. #79
    Xtreme Mentor
    Join Date
    Feb 2007
    Location
    West hartford, CT
    Posts
    2,804
    Quote Originally Posted by Lokinhow View Post
    There was a nice improvement because the RV770's shader count is 150% bigger than the RV670's.
    That can't apply here; the RV670 was REALLY small and didn't eat much power, so there was a lot of room for that improvement.
    But the RV870 is not small; such a big jump in shader count would mean a huge, really huge chip.
    It's going to have some nice improvements, but nothing comparable with RV670 -> RV770. That will only be possible at 28nm.
    read above
    FX-8350(1249PGT) @ 4.7ghz 1.452v, Swiftech H220x
    Asus Crosshair Formula 5 Am3+ bios v1703
    G.skill Trident X (2x4gb) ~1200mhz @ 10-12-12-31-46-2T @ 1.66v
    MSI 7950 TwinFrozr *1100/1500* Cat.14.9
    OCZ ZX 850w psu
    Lian-Li Lancool K62
    Samsung 830 128g
    2 x 1TB Samsung SpinpointF3, 2T Samsung
    Win7 Home 64bit

  5. #80
    Xtreme Addict
    Join Date
    Feb 2008
    Location
    Russia
    Posts
    1,910
    impossible vga. what about 6970

    Intel Q9650 @500x9MHz/1,3V
    Asus Maximus II Formula @Performance Level=7
    OCZ OCZ2B1200LV4GK 4x2GB @1200MHz/5-5-5-15/1,8V
    OCZ SSD Vertex 3 120Gb
    Seagate RAID0 2x ST1000DM003
    XFX HD7970 3GB @1111MHz
    Thermaltake Xaser VI BWS
    Seasonic Platinum SS-1000XP
    M-Audio Audiophile 192
    LG W2486L
    Liquid Cooling System :
    ThermoChill PA120.3 + Coolgate 4x120
    Swiftech Apogee XT, Swiftech MCW-NBMAX Northbridge
    Watercool HeatKiller GPU-X3 79X0 Ni-Bl + HeatKiller GPU Backplate 79X0
    Laing 12V DDC-1Plus with XSPC Laing DDC Reservoir Top
    3x Scythe S-FLEX "F", 4x Scythe Gentle Typhoon "15", Scythe Kaze Master Ace 5,25''

    Apple MacBook Pro 17` Early 2011:
    CPU: Sandy Bridge Intel Core i7 2720QM
    RAM: Crucial 2x4GB DDR3 1333
    SSD: Samsung 840 Pro 256 GB SSD
    HDD: ADATA Nobility NH13 1GB White
    OS: Mac OS X Mavericks

  6. #81
    Xtreme Enthusiast
    Join Date
    Oct 2004
    Posts
    684
    Quote Originally Posted by Lokinhow View Post
    There was a nice improvement because the RV770's shader count is 150% bigger than the RV670's.
    That can't apply here; the RV670 was REALLY small and didn't eat much power, so there was a lot of room for that improvement.
    But the RV870 is not small; such a big jump in shader count would mean a huge, really huge chip.
    It's going to have some nice improvements, but nothing comparable with RV670 -> RV770. That will only be possible at 28nm.

    I'm not saying that we will see the same improvement as from RV670 to RV770. There were many posts saying there can be little to no improvement because the next GPU is on 40nm. One thing I can see is some improvement in shader efficiency, because to me it looks lower on the Evergreens than on the RV770. I can see a nice improvement if shader efficiency is improved, tessellation is improved, and more shaders are added.

  7. #82
    Xtreme Addict
    Join Date
    Nov 2006
    Posts
    1,402
    Nvidia's last gen made a nice improvement in geometry; I hope ATI thought to upgrade this part too.

  8. #83
    Xtreme Addict
    Join Date
    May 2007
    Location
    'Zona
    Posts
    2,346
    Quote Originally Posted by informal View Post
    A little speculation on my part. From here we see this table:


    If SI has 1920 SPs, then that's a 20% increase in stream processor count, meaning the die would be roughly ~3.2% larger (since a 40% bigger die, at the same node, got AMD ~2.5x the SPs going from RV670 to RV770). Rounding up to a ~4% die area investment gives ~347mm2, basically the same die size as Cypress. If the SPs are reorganized in a 4D scheme and utilization is better (as rumored) compared to Cypress' approach, then the SI 6000 series can bring more than a 20% performance improvement with almost no die space investment. Keep in mind that a tessellation improvement was mentioned in the news too, so overall SI, done @ 40nm, could mean more performance @ the same die space and the same or slightly higher TDP envelopes. It doesn't have to be named 6870; a 6770 would suffice.
    320ALUs 5d(1600SPs) to 480ALUs 4d(1920SPs) is a 50% increase...
    You can't really use the RV670->RV770 transition to base future changes to the architecture since it isn't a true comparison because they went from a ringbus to a hub.
    6700 is definitely larger than Cypress, with current rumors putting it just a bit under 400mm2, most specific number I have heard is 395mm2. Performance target is obviously a "full" GF100 and I'm guessing the TDP would be around GTX470 levels, 210-220w.
    Originally Posted by motown_steve
    Every genocide that was committed during the 20th century has been preceded by the disarmament of the target population. Once the government outlaws your guns your life becomes a luxury afforded to you by the state. You become a tool to benefit the state. Should you cease to benefit the state or even worse become an annoyance or even a hindrance to the state then your life becomes more trouble than it is worth.

    Once the government outlaws your guns your life is forfeit. You're already dead, it's just a question of when they are going to get around to you.

  9. #84
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    If they go for a 400mm2 chip at 40nm, then 28nm with double everything is still going to be close to 400mm2 and maybe 200W+, which does raise the question of how stripped down the 6970 will have to be for a dual-GPU card.
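
A rough check of the "still close to 400mm2" figure, as a Python sketch. It assumes ideal scaling, where area shrinks with the square of the node ratio; real processes rarely scale that cleanly, so treat the result as a best case. The function name is made up for illustration.

```python
def scaled_area_mm2(area_mm2: float, old_node_nm: float, new_node_nm: float,
                    unit_multiplier: float = 1.0) -> float:
    """Estimate die area after a node shrink, assuming ideal scaling
    (area proportional to the square of the feature size)."""
    shrink = (new_node_nm / old_node_nm) ** 2
    return area_mm2 * unit_multiplier * shrink


# 400 mm2 at 40nm, everything doubled, moved to 28nm.
print(f"{scaled_area_mm2(400.0, 40.0, 28.0, unit_multiplier=2.0):.0f} mm2")
```

Under that assumption the shrink factor is (28/40)^2 = 0.49, so doubling everything roughly cancels the shrink.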

  10. #85
    Xtreme Addict
    Join Date
    Jul 2007
    Posts
    1,488
    Quote Originally Posted by LordEC911 View Post
    320ALUs 5d(1600SPs) to 480ALUs 4d(1920SPs) is a 50% increase...
    Why would a 4d ALU be as big as a 5d ALU?

  11. #86
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    How big is that 5th SP in the ALU? Removing 320 of those, then adding 160 of the other 4 sounds like it will make it a lot bigger.

  12. #87
    Xtreme Addict
    Join Date
    May 2007
    Location
    'Zona
    Posts
    2,346
    Quote Originally Posted by Solus Corvus View Post
    Why would a 4d ALU be as big as a 5d ALU?
    It shouldn't be... I was just stating that some of the larger die savings from RV670->RV770 aren't going to happen with Cypress->6700. So you can't use the ratio of SP increase to die size increase to estimate 6700.

    Quote Originally Posted by Manicdan View Post
    How big is that 5th SP in the ALU? Removing 320 of those, then adding 160 of the other 4 sounds like it will make it a lot bigger.
    If they are all the same size, going from 320 5d to 480 4d should only be a ~20% increase in total shader space, not taking into account other units: scheduler, cache, TMU/TFUs, etc.
    A 50% shader increase for a 20% size increase seems like an efficient design if 5d is really only averaging a max utilization of ~80% in most situations, meaning the actual performance of a 5d vs 4d ALU should be about the same.
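
The numbers in this reply can be checked with a short Python sketch. It assumes every SP is the same size, takes the ~80% utilization figure for 5d from the post, and adds one assumption the post only implies: that a 4d ALU runs fully utilized. The function names are made up for illustration.

```python
def total_sps(alus: int, width: int) -> int:
    """Total stream processors for a given ALU count and ALU width."""
    return alus * width


def effective_sps(alus: int, width: int, utilization: float) -> float:
    """SPs doing useful work at a given average utilization."""
    return alus * width * utilization


cypress = total_sps(320, 5)   # 320 ALUs, 5d
rumored = total_sps(480, 4)   # 480 ALUs, 4d (rumored)

print(f"ALU increase: {(480 / 320 - 1) * 100:.0f}%")
print(f"SP (area) increase: {(rumored / cypress - 1) * 100:.0f}%")

# ~80% utilization on 5d (from the post), 100% on 4d (assumption).
old_eff = effective_sps(320, 5, 0.80)
new_eff = effective_sps(480, 4, 1.00)
print(f"Effective throughput ratio: {new_eff / old_eff:.2f}x")
```

At 80% utilization a 5d ALU does about 4 SPs' worth of work, which is why the per-ALU performance comes out about the same under these assumptions.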

    Quote Originally Posted by Manicdan View Post
    If they go for a 400mm2 chip at 40nm, then 28nm with double everything is still going to be close to 400mm2 and maybe 200W+, which does raise the question of how stripped down the 6970 will have to be for a dual-GPU card.
    That's what I was thinking about back in April, when I first heard that the 28nm beast will be 512bit, though I don't know if the 512bit part is true or if it is a single GPU or dual GPU?
    It really depends on how "mature" 28nm is, what clocks and what power savings they can get from it.
    Last edited by LordEC911; 07-23-2010 at 12:34 PM.

  13. #88
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    TSMC's 28nm HP is 40% faster at the same leakage. That could mean a 2GHz 480sp Fermi, or a Cypress with 1600sp @ 1200MHz, with a smaller die and cheaper. Or is ATi going to GloFo? If so, that makes predictability a lot harder.

    leakage increases exponentially with more voltage and linearly with transistor count.
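
The clock figures in this post line up with a flat 40% speedup applied to current clocks. A small Python sketch, where the starting clocks are assumed values not stated in the post (GTX 480 shader clock of 1401 MHz, HD 5870 core clock of 850 MHz) and the function name is made up for illustration:

```python
def scaled_clock_mhz(clock_mhz: float, speedup: float = 0.40) -> float:
    """Clock after applying a fractional speedup at the same leakage."""
    return clock_mhz * (1.0 + speedup)


GTX480_SHADER_MHZ = 1401.0  # assumption: stock GTX 480 shader clock
HD5870_CORE_MHZ = 850.0     # assumption: stock HD 5870 core clock

print(f"Fermi shaders: {scaled_clock_mhz(GTX480_SHADER_MHZ):.0f} MHz")
print(f"Cypress core:  {scaled_clock_mhz(HD5870_CORE_MHZ):.0f} MHz")
```

This treats the 40% figure as a straight clock multiplier, which is a simplification.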

  14. #89
    Xtreme Enthusiast
    Join Date
    Oct 2008
    Posts
    678
    Quote Originally Posted by informal View Post
    A little speculation on my part. From here we see this table:


    If SI has 1920 SPs, then that's a 20% increase in stream processor count, meaning the die would be roughly ~3.2% larger (since a 40% bigger die, at the same node, got AMD ~2.5x the SPs going from RV670 to RV770). Rounding up to a ~4% die area investment gives ~347mm2, basically the same die size as Cypress. If the SPs are reorganized in a 4D scheme and utilization is better (as rumored) compared to Cypress' approach, then the SI 6000 series can bring more than a 20% performance improvement with almost no die space investment. Keep in mind that a tessellation improvement was mentioned in the news too, so overall SI, done @ 40nm, could mean more performance @ the same die space and the same or slightly higher TDP envelopes. It doesn't have to be named 6870; a 6770 would suffice.
    So if AMD made a die double the size they would have many thousand SPs, and hundreds of ROPs and TMUs?
    I would say that you are making some mistakes here. First, you can never use numbers like that, since you don't know how much of the die increase is due to shaders. And second, you can't compare different architectures like that. HD3870 had a big ringbus, and AMD made some major changes to the layout; they increased the transistor density a bit too.

    If you want to compare chips like that, compare modern chips from the same generation, like Cypress and Juniper. And you can see that it scales pretty linearly.

    I would say that SI will have at least 20% larger die, since the amount of shaders has increased. But we don't know much about TMUs and ROPs. Besides they have made some architectural changes that we don't know too much about either.

    So, my guess at the moment, around 400mm² and about 20-50% better performance depending on situation.

  15. #90
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    Quote Originally Posted by LordEC911 View Post
    If they are all the same size, going from 320 5d to 480 4d should only be a ~20% increase in total shader space, not taking into account other units: scheduler, cache, TMU/TFUs, etc.
    A 50% shader increase for a 20% size increase seems like an efficient design if 5d is really only averaging a max utilization of ~80% in most situations, meaning the actual performance of a 5d vs 4d ALU should be about the same.
    i didnt think it was the same size, i thought the first was much larger than the other 4, which is why they went with 5, instead of like 2-3

    removing 1 may get them 5-10% more space, for a 2-3% in-game perf loss (unless you're like furmark, which hopefully means 20% less heat and power consumption) then adding in

    i really would like to get a much more knowledgeable answer about the size of the ALUs and SPs.

    im gonna go with really bad fake numbers that are made up in my head
    first SP is 30%, next are 10% each, (30+10+10+10+10) total is 70% of the chip from SPs
    removing 1 SP per ALU will net them 10% space (aiming high); with 320 ALUs that's 0.03125% of chip space per small SP
    adding in 160 more ALUs will add 480 small SPs, which is 15% more space
    adding 160 large SPs is another 15%.
    -10+15+15=
    20% bigger for 1920 SPs using 480 ALUs of 4d

    please be aware that i do not know crap about the accuracy of those numbers
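
The made-up numbers in this post do add up. A Python sketch of the same bookkeeping, using only the post's admittedly fake percentages (30% of the chip for all 320 large SPs, 10% for each group of 320 small SPs); the variable and function names are made up for illustration:

```python
BASE_ALUS = 320
LARGE_SP_TOTAL_PCT = 30.0  # all 320 large SPs together (post's made-up figure)
SMALL_SP_GROUP_PCT = 10.0  # each group of 320 small SPs (post's made-up figure)

# Chip area per single SP, in percent of the whole chip.
small_sp_pct = SMALL_SP_GROUP_PCT / BASE_ALUS
large_sp_pct = LARGE_SP_TOTAL_PCT / BASE_ALUS


def net_growth_pct(extra_alus: int = 160, small_per_alu: int = 3) -> float:
    """Net change in chip area, in percent of the chip, when one small SP
    is dropped from each existing ALU and extra 3+1 ALUs are added."""
    removed = BASE_ALUS * small_sp_pct                        # 320 small SPs dropped
    added_small = extra_alus * small_per_alu * small_sp_pct   # 480 small SPs added
    added_large = extra_alus * large_sp_pct                   # 160 large SPs added
    return -removed + added_small + added_large


print(f"Per small SP: {small_sp_pct}% of the chip")
print(f"Net growth:   {net_growth_pct()}% of the chip")
```

The result is only as good as the 30/10/10/10/10 split, which the post itself flags as a guess.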

  16. #91
    Xtreme Member
    Join Date
    Apr 2010
    Posts
    145
    If they go from 4+1 configuration to 3+1 (assuming no other factors are involved in die area)…

    If the fat SPU is 1.0x the size of a regular SPU:
    480[3+1] is 1.20x the area of 320[4+1]

    If the fat SPU is 1.2x the size of a regular SPU:
    480[3+1] is 1.21x the area of 320[4+1]

    If the fat SPU is 1.5x the size of a regular SPU:
    480[3+1] is 1.23x the area of 320[4+1]

    If the fat SPU is 2.0x the size of a regular SPU:
    480[3+1] is 1.25x the area of 320[4+1] (and 384[3+1] is the same area as 320[4+1])

    If the fat SPU is 3.0x the size of a regular SPU:
    480[3+1] is 1.29x the area of 320[4+1]

    If the fat SPU is 5.0x the size of a regular SPU:
    480[3+1] is 1.33x the area of 320[4+1]

    Even with a really fat SPU, the increase in fat SPU density by going from 4+1 to 3+1 doesn't result in a large increase in total SPU area.
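
The ratios listed in this post follow from counting area in units of one regular SPU, so an ALU with n regular SPUs and one fat SPU of relative size f takes n + f units. A Python sketch under the same assumption the post makes (no other factors involved in die area); the function names are made up for illustration:

```python
def spu_area(alus: int, regular_per_alu: int, fat_ratio: float) -> float:
    """Total SPU area in units of one regular SPU."""
    return alus * (regular_per_alu + fat_ratio)


def area_ratio(fat_ratio: float, new_alus: int = 480, old_alus: int = 320) -> float:
    """Area of new_alus[3+1] relative to old_alus[4+1]."""
    return spu_area(new_alus, 3, fat_ratio) / spu_area(old_alus, 4, fat_ratio)


def break_even_alus(fat_ratio: float, old_alus: int = 320) -> float:
    """Number of 3+1 ALUs that fit in the area of old_alus[4+1]."""
    return spu_area(old_alus, 4, fat_ratio) / (3 + fat_ratio)


for f in (1.0, 1.2, 1.5, 2.0, 3.0, 5.0):
    print(f"fat SPU {f:.1f}x -> 480[3+1] is {area_ratio(f):.2f}x the area of 320[4+1]")

print(f"Break-even at 2.0x: {break_even_alus(2.0):.0f} ALUs")
```

As f grows very large the ratio approaches 480/320 = 1.5, so the fat SPU would have to dominate the ALU before the area penalty gets close to the ALU count increase.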

  18. #93
    Xtreme Member
    Join Date
    Mar 2009
    Location
    Unknown
    Posts
    266
    1920 SP is going to be hardly any faster than 1600 SP without significant background improvements (which I presume is called NI )
    Higher clocks at same or lower TDP, better Tessellation and GPGPU features could be good enough to counter GF104 till we have a ATI 'Fermi'.
    Heck, even Tessellation and other DX11 stuff is useless ATM, higher clocks would be enough to just go clear.

  19. #94
    Xtreme Enthusiast
    Join Date
    Oct 2008
    Posts
    678
    Quote Originally Posted by Tao~ View Post
    1920 SP is going to be hardly any faster than 1600 SP without significant background improvements (which I presume is called NI )
    Higher clocks at same or lower TDP, better Tessellation and GPGPU features could be good enough to counter GF104 till we have a ATI 'Fermi'.
    Heck, even Tessellation and other DX11 stuff is useless ATM, higher clocks would be enough to just go clear.
    Up to 50% more power on the same frequency is not bad at all on the same node.
    Everyone is assuming that the 1920 SPs are a fact even if we don't know anything about that yet. But if it is, we will have 50% more ALUs with a total of 20% more SPs, the difference will be quite big. Especially if you take into account how much each SP has been utilized earlier.

  20. #95
    Xtreme Member
    Join Date
    Mar 2009
    Location
    Unknown
    Posts
    266
    Why would they go back from 4+1 to 3+1? I thought we were moving towards the parallel computing era. But it depends on what the long-term strategy is: they can trade die space for better ILP or more shader units.

    It is worth noting that ATi simply doesn't need anything much faster than what it has right now to dominate the market; it simply needs to improve the efficiency of the current-gen arch. I doubt Nvidia has anything left to offer in the Perf/Enthusiast segment till maybe Q1 2011.

    Once 28nm comes, the whole equation changes. The improvements made now must be progressive, that's all.
    Last edited by Tao~; 07-28-2010 at 11:17 PM.

  21. #96
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Posts
    800
    Tao~, go read a few uarch details in the 4xxx reviews and also the 5xxx reviews. 4+1 configs are probably underutilized. A 3+1 config will probably be utilized more, but as a result requires more die space, due to an increase in the number of fat shaders. The upside is that DP performance increases.

    Radeon HD 6xxx as we know it is a big experiment by AMD. They want HD 7xxx, their next uarch to be smooth. Might as well take the opportunity to earn some profit from it, because they can. I think nvidia would've made a fermi+G90 uarch hybrid if they could, just as a stopgap before the arrival of HD 5xxx.


    One other thing people haven't taken into account is how 4+1 is a MUST for DP (the shaders combine to do DP. I think techreport explains in length regarding this). If AMD were to do a 3+1 config, there must have been a change in how the shaders work. That's for SI. NI probably implements the whole uarch. All speculation on my part.

    SI is the upcoming. NI is the future.

    If AMD wants SI to do 1920 shaders in 40nm, plus a 3+1 config, there HAS to be a change in the shader uarch (or anything directly related to it). They're the most space-consuming after all.
    Last edited by blindbox; 07-28-2010 at 11:30 PM.

  22. #97
    Xtreme Guru
    Join Date
    Jun 2010
    Location
    In the Land down -under-
    Posts
    4,452
    Not sure if this has been mentioned, but any release date? Haven't been on much lately..

    Another thing I find funny is AMD/Intel would snipe any of our Moms on a grocery run if it meant good quarterly results, and you are forever whining about what feser did?

  23. #98
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by Johnny87au View Post
    Not sure if this has been mentioned, but any release date? Haven't been on much lately..
    5 posts above yours ....
