A little speculation on my part. From here we see this table:
If SI has 1920 SPs, then that's a 20% increase in stream processor count, meaning the die size is roughly 3.2% higher (since a 40% bigger die, at the same node, gets AMD ~250% more SPs). A ~4% die area investment means ~347mm2, basically the same die size as Cypress. If the SPs are reorganized in a 4D scheme and utilization is better (as rumored) compared to Cypress' approach, then the SI 6000 series can bring more than a 20% performance improvement with almost no die space investment. Keep in mind that a tessellation improvement was mentioned in the news too, so overall SI, done @ 40nm, could mean more performance at the same die space and the same or slightly higher TDP envelope. It doesn't have to be named 6870; a 6770 would suffice.
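The die-size estimate above can be written out as a quick back-of-the-envelope sketch. The 334 mm² Cypress die area is an assumption on my part (the post does not state it), and the function names are only illustrative; the 40%-die-for-250%-SPs ratio is the post's own figure.

```python
CYPRESS_DIE_MM2 = 334.0  # assumption: commonly cited Cypress die area, not from the post


def die_growth_pct(sp_increase_pct, ref_die_growth_pct=40.0, ref_sp_growth_pct=250.0):
    """Scale an SP-count increase by the post's reference ratio (40% die for ~250% SPs)."""
    return sp_increase_pct * ref_die_growth_pct / ref_sp_growth_pct


def projected_die_mm2(base_mm2, growth_pct):
    """Apply a percentage growth to a base die area."""
    return base_mm2 * (1.0 + growth_pct / 100.0)


sp_increase = (1920 / 1600 - 1) * 100          # 1600 -> 1920 SPs
growth = die_growth_pct(sp_increase)           # the post's ~3.2% figure

print(f"SP increase: {sp_increase:.0f}%")
print(f"Estimated die growth: {growth:.1f}%")
print(f"Die at 3.2% growth: {projected_die_mm2(CYPRESS_DIE_MM2, growth):.0f} mm2")
print(f"Die at 4% growth:   {projected_die_mm2(CYPRESS_DIE_MM2, 4.0):.0f} mm2")
```

Under that assumed base area, the rounded-up 4% figure lands on roughly the 347mm2 quoted in the post.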
There was a nice improvement because the RV770's shader count is 150% bigger than the RV670's.
It can't apply here. RV670 was REALLY small and didn't eat much power, so there was a lot of room for that improvement.
But RV870 is not small; such a big jump in shader count would mean a huge, really huge chip.
It's going to have some nice improvements, but in no way comparable with RV670 -> RV770. That will only be possible at 28nm.
impossible vga. what about 6970
I'm not saying that we will see the same improvement as from RV670 to RV770. Many posts say there can't be much improvement, or none at all, because the next GPU is still on 40nm. One thing I can see is some improvement in shader efficiency, because to me it looks lower on Evergreen than on RV770. I can see a nice improvement if shader efficiency is improved, tessellation is improved, and more shaders are added.
Nvidia's latest generation made a nice improvement in geometry; I hope ATI thought to upgrade this part too.
320 ALUs 5D (1600 SPs) to 480 ALUs 4D (1920 SPs) is a 50% increase in ALU count...
You can't really use the RV670->RV770 transition as a basis for predicting future architecture changes; it isn't a true comparison because they went from a ringbus to a hub.
6700 is definitely larger than Cypress, with current rumors putting it just a bit under 400mm2, most specific number I have heard is 395mm2. Performance target is obviously a "full" GF100 and I'm guessing the TDP would be around GTX470 levels, 210-220w.
If they go for a 400mm2 chip at 40nm, then 28nm with double everything is still going to be close to 400mm2 and maybe 200W+, which does raise the question of how stripped down the 6970 will have to be for a dual-GPU card.
How big is that 5th SP in the ALU? Removing 320 of those, then adding 160 ALUs of the other 4, sounds like it will make it a lot bigger.
It shouldn't be... I was just stating that some of the larger die savings from RV670->RV770 aren't going to happen with Cypress->6700. So you can't use the ratio of SP increase to die size increase to estimate 6700.
If they are all the same size, going from 320 5D to 480 4D should only be a ~20% increase in total shader space, not taking into account other units: scheduler, cache, TMUs/TFUs, etc.
A 50% ALU increase for a 20% size increase seems like an efficient design if 5D is really only averaging a max utilization of ~80% in most situations, meaning the actual performance of a 5D vs 4D ALU should be about the same.
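To make the 5D vs 4D reasoning concrete, here is a small sketch of the arithmetic. It assumes all SPs are the same size and takes the ~80% utilization figure from the post at face value; the helper names are hypothetical.

```python
def total_sps(alus, width):
    """Total stream processors for a given ALU count and ALU width."""
    return alus * width


def effective_lanes(width, utilization):
    """Average number of lanes doing useful work per ALU."""
    return width * utilization


old_sps = total_sps(320, 5)   # Cypress-style 5D layout
new_sps = total_sps(480, 4)   # rumored 4D layout

alu_increase = 480 / 320 - 1            # growth in ALU count
sp_increase = new_sps / old_sps - 1     # growth in SP count (area, if SPs are equal-sized)

print(f"SPs: {old_sps} -> {new_sps}")
print(f"ALU increase: {alu_increase:.0%}, SP/area increase: {sp_increase:.0%}")
print(f"Useful lanes per 5D ALU at 80%: {effective_lanes(5, 0.8):.1f}")
print(f"Useful lanes per 4D ALU at 100%: {effective_lanes(4, 1.0):.1f}")
```

If a 5D ALU really averages about four useful lanes, a fully used 4D ALU matches it, which is why 50% more ALUs could translate into much more than a 20% gain.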
That's what I was thinking about back in April, when I first heard that the 28nm beast will be 512bit, though I don't know if the 512bit part is true or if it is a single GPU or dual GPU?
It really depends on how "mature" 28nm is, what clocks and what power savings they can get from it.
Last edited by LordEC911; 07-23-2010 at 12:34 PM.
TSMC's 28nm HP is 40% faster at the same leakage. That could mean a 2GHz 480SP Fermi, or a Cypress with 1600 SPs @ 1200MHz with a smaller, cheaper die. Or is ATi going to GloFo? If so, that makes predictions a lot harder.
Leakage increases exponentially with voltage and linearly with transistor count.
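As a rough check on where the 2GHz and 1200MHz figures come from, the sketch below applies the quoted 40% speedup to 40nm reference clocks. The base clocks are my assumption (the post does not state them), and real parts would not necessarily scale this cleanly.

```python
SPEEDUP = 1.40  # from the post: 28nm HP is 40% faster at the same leakage

# Assumption: approximate 40nm reference clocks in MHz, not stated in the post.
BASE_CLOCKS_MHZ = {
    "Fermi 480SP (shader clock)": 1401,
    "Cypress 1600SP (core clock)": 850,
}


def scaled_clock(base_mhz, speedup=SPEEDUP):
    """Clock after applying a process speedup at the same leakage."""
    return base_mhz * speedup


for name, base in BASE_CLOCKS_MHZ.items():
    print(f"{name}: {base} MHz -> {scaled_clock(base):.0f} MHz")
```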
So if AMD made a die double the size they would have many thousand SPs, and hundreds of ROPs and TMUs?
I would say that you are making some mistakes here. First, you can never use numbers like that since you don't know how much of the die increase is due to shaders. And second, you can't compare different architectures like that. HD3870 had a big ringbus and AMD made some major changes to the layout; they increased the transistor density a bit too.
If you want to compare chips like that, compare modern chips from the same generation, like Cypress and Juniper. You can see that it scales pretty linearly.
I would say that SI will have at least a 20% larger die, since the number of shaders has increased. But we don't know much about TMUs and ROPs. Besides, they have made some architectural changes that we don't know too much about either.
So, my guess at the moment: around 400mm² and about 20-50% better performance depending on the situation.
I didn't think they were the same size. I thought the first was much larger than the other 4, which is why they went with 5 instead of like 2-3.
Removing 1 may get them 5-10% more space for a 2-3% in-game perf loss (unless you're like FurMark, which hopefully means 20% less heat and power consumption), then adding in
I really would like to get a much more knowledgeable answer about the size of the ALUs and SPs.
I'm gonna go with really bad fake numbers that are made up in my head:
first SP is 30%, the next are 10% each, (30+10+10+10+10) so the total is 70% of the chip from SPs
removing 1 SP per ALU will net them 10% space (aiming high); with 320 ALUs that's 0.03125% of chip space per small SP
adding 160 more ALUs will add 480 small SPs, which is 15% more space
adding 160 large SPs is another 15%
-10+15+15 =
20% bigger for 1920 SPs using 480 ALUs of 4D
Please be aware that I do not know crap about the accuracy of those numbers.
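For anyone who wants to follow that tally step by step, here it is as a short sketch. The area shares are the made-up numbers from the post, not measurements, and the variable names are mine.

```python
ALUS_OLD = 320
ALUS_ADDED = 160

LARGE_SP_SHARE = 30.0  # made-up: % of chip taken by all 320 large SPs
SMALL_SP_SHARE = 10.0  # made-up: % of chip taken by each set of 320 small SPs

per_small_sp = SMALL_SP_SHARE / ALUS_OLD   # % of chip per small SP
per_large_sp = LARGE_SP_SHARE / ALUS_OLD   # % of chip per large SP

removed = ALUS_OLD * per_small_sp            # drop one small SP from every ALU
added_small = ALUS_ADDED * 3 * per_small_sp  # 160 new ALUs with 3 small SPs each
added_large = ALUS_ADDED * per_large_sp      # 160 new large SPs

net_growth = -removed + added_small + added_large

print(f"Per small SP: {per_small_sp}% of chip")
print(f"Net change: -{removed:.0f} + {added_small:.0f} + {added_large:.0f} = {net_growth:.0f}% of chip")
```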
If they go from 4+1 configuration to 3+1 (assuming no other factors are involved in die area)…
If the fat SPU is 1.0x the size of a regular SPU:
480[3+1] is 1.20x the area of 320[4+1]
If the fat SPU is 1.2x the size of a regular SPU:
480[3+1] is 1.21x the area of 320[4+1]
If the fat SPU is 1.5x the size of a regular SPU:
480[3+1] is 1.23x the area of 320[4+1]
If the fat SPU is 2.0x the size of a regular SPU:
480[3+1] is 1.25x the area of 320[4+1] (and 384[3+1] is the same area as 320[4+1])
If the fat SPU is 3.0x the size of a regular SPU:
480[3+1] is 1.29x the area of 320[4+1]
If the fat SPU is 5.0x the size of a regular SPU:
480[3+1] is 1.33x the area of 320[4+1]
Even with a really fat SPU, the increase in fat SPU density by going from 4+1 to 3+1 doesn't result in a large increase in total SPU area.
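The ratios listed above can be reproduced with a few lines. This is only a sketch of the same simplified model (SPU area only, thin SPU = 1 unit, no other die factors); the function names are mine.

```python
def spu_area(alus, thin_per_alu, fat_ratio):
    """Total SPU area in thin-SPU units: each ALU has some thin SPUs plus one fat SPU."""
    return alus * (thin_per_alu + fat_ratio)


def area_ratio(fat_ratio, new_alus=480, old_alus=320):
    """Area of new_alus x [3+1] relative to old_alus x [4+1]."""
    return spu_area(new_alus, 3, fat_ratio) / spu_area(old_alus, 4, fat_ratio)


for fat in (1.0, 1.2, 1.5, 2.0, 3.0, 5.0):
    print(f"fat SPU {fat:.1f}x: 480[3+1] is {area_ratio(fat):.2f}x the area of 320[4+1]")
```

The ratio moves only from 1.20x to 1.33x across that whole range, and as the fat SPU keeps growing it approaches 480/320 = 1.5x, since the area is then dominated by the fat SPU count alone.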
1920 SPs are going to be hardly any faster than 1600 SPs without significant background improvements (which I presume is what NI is).
Higher clocks at the same or lower TDP, better tessellation and GPGPU features could be good enough to counter GF104 until we have an ATI 'Fermi'.
Heck, even Tessellation and other DX11 stuff is useless ATM, higher clocks would be enough to just go clear.
Up to 50% more power on the same frequency is not bad at all on the same node.
Everyone is assuming that the 1920 SPs are a fact even if we don't know anything about that yet. But if they are, we will have 50% more ALUs with a total of 20% more SPs, and the difference will be quite big, especially if you take into account how much each SP has been utilized so far.
Why would they go back from 4+1 to 3+1? I thought we were moving towards the parallel computing era. But it depends on what the long-term strategy is: they can trade die space for better ILP or more shader units.
It is worth noting that ATi simply doesn't need anything much faster than what it has right now to dominate the market; it simply needs to improve the efficiency of the current-gen arch. I doubt Nvidia has anything left to offer in the perf/enthusiast segment until maybe Q1 2011.
Once 28nm comes, the whole equation changes. The improvements made now must be progressive, that's all.
Last edited by Tao~; 07-28-2010 at 11:17 PM.
Tao~, go read a few uarch details in the 4xxx reviews and also the 5xxx reviews. 4+1 configs are probably underutilized. A 3+1 config will probably be utilized more, but as a result requires more die space, due to an increase in the number of fat shaders. The upside is that DP performance increases.
Radeon HD 6xxx as we know it is a big experiment by AMD. They want HD 7xxx, their next uarch, to be smooth. Might as well take the opportunity to earn some profit from it, because they can. I think nvidia would've made a fermi+G90 uarch hybrid if they could, just as a stopgap before the arrival of HD 5xxx.
One other thing people haven't taken into account is how 4+1 is a MUST for DP (the shaders combine to do DP; I think techreport explains this at length). If AMD were to do a 3+1 config, there must have been a change in how the shaders work. That's for SI. NI probably implements the whole uarch. All speculation on my part.
SI is the upcoming. NI is the future.
If AMD wants SI to do 1920 shaders in 40nm, plus a 3+1 config, there HAS to be a change in the shader uarch (or anything directly related to it). They're the most space-consuming after all.
Last edited by blindbox; 07-28-2010 at 11:30 PM.
Not sure if this has been mentioned, but is there any release date? Haven't been on much lately.