Page 1 of 3 123 LastLast
Results 1 to 25 of 58

Thread: A Fermi benchmark of sorts...

  1. #1
    Xtreme Mentor Talonman's Avatar
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,972

    A Fermi benchmark of sorts...

    A Fermi benchmark of sorts...

If you do consider folding a benchmark, we have had a Fermi report, in case you haven't heard.

http://www.evga.com/forums/tm.asp?m=...1&key=

    I make these bold assumptions about Fermi.

    1) It will use about 320 watts when loaded up.

    2) (1) Fermi GPU can produce 40K PPD, or more.

    3) Fermi will fold near or exceed 4X the speed of the 200 Series GPU's.
    What now takes 60 minutes to produce, will soon be done in 15 minutes.

    Fermi will also have 1.5GB of memory, and is made up of 3.0 billion transistors and features 512 CUDA processing cores organized into 16 SMs (Streaming Multiprocessors) of 32 cores each.
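Those headline numbers reduce to simple arithmetic (a rough sketch; the 4X folding speedup is this thread's guess, not a measured figure):

```python
# Published GF100 ("Fermi") shader layout: 16 SMs x 32 CUDA cores each.
sms = 16
cores_per_sm = 32
print(sms * cores_per_sm)   # 512 CUDA cores

# Assumption 3: ~4X the folding speed of a 200-series GPU, so a work
# unit that now takes 60 minutes would finish in about 15.
print(60 / 4)               # 15.0 minutes
```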
    Last edited by Talonman; 10-18-2009 at 04:01 PM.
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)

  2. #2
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    http://foldingforum.org/viewtopic.ph...=11717#p114890

    btw, Fermi only has 16 cores but with 32 shaders each.

  3. #3
    Da Goose DAK1640's Avatar
    Join Date
    Oct 2005
    Location
    Chicago
    Posts
    4,914
OMG... :D


    i7-860 Farm with nVidia GPU's

  4. #4
    Xtreme Cruncher OldChap's Avatar
    Join Date
    Mar 2008
    Location
    Plymouth (UK)
    Posts
    5,101
I've not seen pics yet, but what connectors will this have? PCIe slot = 75W, 6-pin = 75W, 8-pin = 150W; one of each = 300 watts. Going to need to upgrade PSUs for this?
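OldChap's connector budget, summed as a quick sketch (the per-source limits come from the PCI Express spec):

```python
# Power available from each source under the PCI Express spec (watts).
budget = {"pcie_slot": 75, "6pin": 75, "8pin": 150}
print(sum(budget.values()))  # 300 W with one of each
```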


    My Biggest Fear Is When I die, My Wife Sells All My Stuff For What I Told Her I Paid For It.
    79 SB threads and 32 IB Threads across 4 rigs 111 threads Crunching!!

  5. #5
    Xtreme Mentor Talonman's Avatar
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,972
    Quote Originally Posted by trinibwoy View Post
    http://foldingforum.org/viewtopic.ph...=11717#p114890

    btw, Fermi only has 16 cores but with 32 shaders each.
    Thanks!

    Fixed:

    Fermi will also have 1.5GB of memory, and is made up of 3.0 billion transistors and features 512 CUDA processing cores organized into 16 SMs (Streaming Multiprocessors) of 32 cores each.
    Last edited by Talonman; 10-18-2009 at 04:02 PM.

  6. #6
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    930
    there's no proof that this is indeed fermi folding .. a lot of information that doesnt add up.. just wishful thinking by the nv fanboys

  7. #7
    Xtreme Member jfromeo's Avatar
    Join Date
    Jul 2009
    Location
    Madrid, Spain
    Posts
    169
    Quote Originally Posted by OldChap View Post
I've not seen pics yet, but what connectors will this have? PCIe slot = 75W, 6-pin = 75W, 8-pin = 150W; one of each = 300 watts. Going to need to upgrade PSUs for this?
    PCI-e 2.0 is 150W, 75W is 1.x.
    WORKSTATION || TJ10B-W | i7-3930K C2 | 4x8GB DDR3-2400 CL10 | 2xGTX TITAN 6GB SLI | P1000W || 30" 2560x1600@60hz
    HTPC || GD08B | i7-920 D0 | 3x4GB 2000 CL9 | HD5870 1GB | X25-M 80GB | X-750 || 75" 1920x1080@4x200hz
    NOTEBOOK || P170EM | i7-3820QM | 2x8GB DDR3 1600 CL9 | GTX680M 4GB | HyperX 3K 240GB || 17,3" 1920x1080@60hz
    ULTRABOOK || W130EW | i7-3620QM | 2x8GB DDR3 1600 CL9 | HD4000 | HyperX 3K 240GB || 13,3" 1366x768@60hz

  8. #8
    Xtreme Mentor Olivon's Avatar
    Join Date
    Jun 2008
    Location
    France - Bx
    Posts
    2,595
    Quote Originally Posted by W1zzard View Post
    there's no proof that this is indeed fermi folding .. a lot of information that doesnt add up.. just wishful thinking by the nv fanboys
    ^Right

    1) It will use about 320 watts when loaded up.
It seems weird, no?
    Last edited by Olivon; 10-18-2009 at 07:37 AM.

  9. #9
    Xtreme Mentor Talonman's Avatar
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,972
    Quote Originally Posted by W1zzard View Post
    there's no proof that this is indeed fermi folding .. a lot of information that doesnt add up.. just wishful thinking by the nv fanboys
    Wishful thinking that will most likely turn out to be fact.

    Do you not think (1) Fermi GPU can produce 40K PPD, or more?

    BTW - His 24hr PPD is up to 344,845.



    http://folding.extremeoverclocking.c...hp?s=&u=477950

344,845 - 10K PPD for the (4) i7 cores = 334,845.

334,845 / 7 Fermi = 47,835. Wow!!

    It might actually be closer to 50K PPD per Fermi?
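The per-card estimate works out as follows (the 10K PPD figure for the i7 is the thread's guess, not a measurement):

```python
total_24h_ppd = 344_845   # FahMan's 24-hour total from the stats page
cpu_ppd = 10_000          # rough guess for the 4 i7 cores
cards = 7

gpu_ppd = total_24h_ppd - cpu_ppd
print(gpu_ppd)                  # 334845
print(round(gpu_ppd / cards))   # 47835 per Fermi
# (Dividing the full 344,845 by 7 without subtracting the CPU
# would give ~49,263 instead.)
```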
    Last edited by Talonman; 10-18-2009 at 08:06 AM.

  10. #10
    Xtreme Mentor Jamesrt2004's Avatar
    Join Date
    Feb 2007
    Location
    Oxford, England
    Posts
    3,431
    Quote Originally Posted by W1zzard View Post
    there's no proof that this is indeed fermi folding .. a lot of information that doesnt add up.. just wishful thinking by the nv fanboys
    +1..
    "Cast off your fear. Look forward. Never stand still, retreat and you will age. Hesitate and you will die. SHOUT! My name is…"
    //James

  11. #11
    Xtreme Mentor Talonman's Avatar
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,972
    Quote Originally Posted by jfromeo View Post
    PCI-e 2.0 is 150W, 75W is 1.x.
    A 295's TDP is 289 Watts, and can produce 18K PPD.

If a Fermi's TDP is about 320 watts, that's not too bad if it can produce 50K PPD...

    It would take 2.777 295's using 802 watts to do the same.

    Actually here: http://www.brightsideofnews.com/news...not-300w!.aspx

    "In case of upcoming high-memory configurations nVidia Tesla, Quadro and GeForce cards, the company had to install a 6-pin and an 8-pin connector, getting 300W of power to play with. However, this was a precautionary measure. According to information we have at hand, the GT300 board [yeah, featuring "Fermi" CUDA architecture] barely missed 225W cut-off for the 6+6 pin if the board comes with 6GB of GDDR5 memory."

    The power it takes to run (1) Fermi might actually be less than what I am guessing? Near 225W sounds outstanding...
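Putting that perf-per-watt comparison into numbers (all of these are the thread's figures: 289 W / 18K PPD for a GTX 295 versus a guessed 320 W / 50K PPD for Fermi):

```python
gtx295_w, gtx295_ppd = 289, 18_000   # GTX 295 TDP and rough folding PPD
fermi_w, fermi_ppd = 320, 50_000     # guessed Fermi numbers

ratio = fermi_ppd / gtx295_ppd       # GTX 295s needed to match one Fermi
print(round(ratio, 3))               # 2.778
print(round(ratio * gtx295_w))       # ~803 W of GTX 295s for the same PPD
```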
    Last edited by Talonman; 10-18-2009 at 09:49 AM.

  12. #12
    Xtreme Member NeedMoMegaHurtZ's Avatar
    Join Date
    Oct 2007
    Posts
    398
    2) (1) Fermi GPU can produce 40K PPD, or more.

    3) Fermi will fold near or exceed 4X the speed of the 200 Series GPU's.
    What now takes 60 minutes to produce, will soon be done in 15 minutes.
While this is interesting, you can't compare apples to oranges.

    He is using GPU3 beta client, not the standard GPU2 client.

How much do today's cards generate with that GPU3 client? He's running 200 instances of GPU3. I think GPU2 only allows for 8?

    How big are GPU3 workunits?

I don't see how a direct comparison is possible? Especially when you don't even know what he is using?

  13. #13
    Xtreme Mentor Talonman's Avatar
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,972
But we do know his PPD!

    And we do know his rig is (1) i7 CPU using 4 cores, and (7) Fermi.

    I am thinking the BETA Client lets you run more instances of folding, but probably doesn't give a performance increase all by itself.
    It would probably need Fermi to take advantage of that new feature.
    Last edited by Talonman; 10-18-2009 at 10:13 AM.

  14. #14
    Xtreme Cruncher Chumbucket843's Avatar
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    Quote Originally Posted by Talonman View Post
    A Fermi benchmark of sorts...

If you do consider folding a benchmark, we have had a Fermi report, in case you haven't heard.

http://www.evga.com/forums/tm.asp?m=...1&key=

    I make these bold assumptions about Fermi.

    1) It will use about 320 watts when loaded up.
    PCI SIG only allows 300 watts.
    2) (1) Fermi GPU can produce 40K PPD, or more.
    even with 48Kb of cache it still wont get that.
    3) Fermi will fold near or exceed 4X the speed of the 200 Series GPU's.
    What now takes 60 minutes to produce, will soon be done in 15 minutes.

Fermi will also have 1.5GB of memory, and is made up of 16 cores which contain 32 shaders each.
He is folding on over 200 processors; there is no way that is Fermi. It's just the GPU3 beta: 31 GPUs and an i7. G80 and up are MIMD arrays of SIMDs. It sounds like it is a current GPU because he is getting 700 points per SM; that sounds like G92. I find it ironic that if there are 248 active CPUs, that means exactly 31 GPUs.

  15. #15
    Xtreme Mentor Talonman's Avatar
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,972
    ORIGINAL: FahMan

    Hi Folks!
Thanks for your interest in my humble contribution to team EVGA.
    Some of you wanted to know about my hardware so please check:

    www.fahmanfolding.webs.com

As you can see in the photos, this single PC is able to produce more than 200K PPD (around 250 with extreme overclocking).

    Some additional explanations:
I was forced to use a USB monitor as the GPUs haven't any video output (these engineering samples of Fermi are Tesla-like, but they have 1.5GB of memory each, like GT300 will).
Because of the new MIMD architecture (they have 32 clusters of 16 shaders) I was not able to load them at 100% in any other way but to launch 1 F@H client per cluster and per card. Every client is the GPU3 core Beta (OpenMM library). I suppose it is much more efficient than the previous GPU2. In addition they need very little memory to run. Having 16GB of DDR3 and using Windows 7 Enterprise, I've managed to run 200 instances of F@H GPU and 4 CPU (i7 processor, HT off). The 7th card is not fully loaded. This could also be an issue with the EVGA X58 mobo.
I use two Silverstone Strider PSUs together, 1500W each, which is probably too much, but for now I experiment with overclocking (cards are factory unlocked). The max power consumption I've noticed was 2400W.
The whole system is cooled by my own liquid CO2 construction, which is heavy and inconvenient, and I have to supply a new cylinder every 5 days.
That's it. I wish for everybody to improve their folding speed using GT300 soon!
    Just keep folding...
"(they have 32 clusters of 16 shaders) I was not able to load them at 100% in any other way but to launch 1 F@H client per cluster and per card."

1 folding instance per cluster is 32 per card.

    32 x 7 = 224 (GPU folding instances.)

248 - 224 = 24

    The 24 is probably just from experimenting? Not sure.
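FahMan's client count, checked against the 248 active CPUs on the stats page (taking his 32-clusters-per-card description at face value):

```python
clusters_per_card = 32   # per FahMan's description
cards = 7
gpu_clients = clusters_per_card * cards
print(gpu_clients)                # 224

active_cpus = 248                 # reported by the stats page
print(active_cpus - gpu_clients)  # 24 clients unaccounted for
```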

    You may be right on the power, I am starting to think it does take less than 320 watts when loaded up.
    Last edited by Talonman; 10-18-2009 at 10:28 AM.

  16. #16
    Registered User
    Join Date
    Jul 2006
    Location
    SoCal
    Posts
    44
is it 32 clusters of 16 shaders, or 16 clusters of 32 shaders?

    Which is it?

    And when did benchmarks become wildly theoretical?
    ΜOΛΩΝ ΛΑΒΕ

  17. #17
    Xtreme Member owcraftsman's Avatar
    Join Date
    Sep 2006
    Location
    Cape Coral, FL USA
    Posts
    184
I found FahMan's real setup!
A little levity:
http://www.evga.com/forums/tm.asp?m=100973448

    i72600k|P8Z68-V Pro|CMZ8GZ3M2A1866C9|EVGA 015-P3-1582-AR SLI|Antec High Current Pro 1200w|VTX3-25SAT3-240G system|Caviar Black 1TB & 640GB storage|SONY BDU-X10S|CM HAF932 A 3.0|H70/GT AP-15 x2|R.A.T.7|SidewinderX6|CM Storm Sirus (headset)|z5500|VIZIO E371VL

    TOASTER SIG
    UBCD
    You still voting Democrat? You're stuck on stupid!

  18. #18
    Xtreme Mentor Talonman's Avatar
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,972
    Quote Originally Posted by Pixie_marj View Post
is it 32 clusters of 16 shaders, or 16 clusters of 32 shaders?

    Which is it?
    Not sure now...

    According to this, Fermi is 16 SMs which contain 32 "CUDA cores":

http://www.realworldtech.com/page.cfm?ArticleID=RWT093009110932&p=4

    "The control hierarchy is similar to the GT200, with a global scheduler that issues work to each SM. Previously the global scheduler (and hence the GPU) could only have a single kernel in flight. Nvidia’s newer scheduler can maintain state for up to 16 different kernels, one per SM. Each SM runs a single kernel, but the ability to keep multiple kernels in flight increases utilization especially when one kernel begins to finish and has fewer blocks left. More importantly, assigning a kernel per core means that smaller kernels can be efficiently dispatched to the GPU.

    The latency for context switch between kernels has also been reduced by 10X to around 25 microseconds, this delay is largely due to cleaning up the state that each kernel must track – such as TLBs, dirty data in caches, registers, shared memory and the other kernel context."



Now I think he is running 16 folding instances, one per SM, each calculating on 32 CUDA cores.

    But 7 * 16 is only 112.

    I don't have a grip on this part yet...
    "Fermi SM Overview

    The cores (or SMs) in Fermi have been tremendously beefed up and resources have been shifted around substantially. At a high level, the execution resources have quadrupled, but are shared between two scalar execution pipelines; each pipeline has twice the execution resources (or vector lanes) of the GT200 cores. It’s essential to note that while the two pipelines can execute two warps from the same thread block, they are not superscalar in the sense of a CPU. The memory pipeline has also been brought into the core, whereas previously each memory pipeline was shared between three cores. More importantly, the shared memory has been folded into a (semi-coherent) L1 data cache, giving each core a real memory hierarchy.

    In many respects, these changes are conceptually reminiscent of the improvements between Niagara I and II. Niagara II doubled the thread count to 8, but each set of 4 threads had a dedicated scheduler and integer (ALU) pipeline, compared to dedicated ALUs and floating point (FPU) pipelines for Fermi. All 8 threads in a Niagara II core shared memory pipelines, just like Fermi, and FPUs, which are analogous to the special function units (SFU).

    To utilize those execution resources, the number of threads in flight for each Fermi core has increased by 50% to 1536, spread across 8 concurrent thread blocks. This means that to fully utilize one of the new cores, 192 threads per block are required up from 128 in GT200. As with the current generation, execution within an SM occurs at the granularity of a warp, which is a set of 32 threads. With the increase in threads, each core can have up to 48 warps in-flight at once.

    As with all Nvidia DX10 hardware, Fermi has several different clock domains in each core – principally the regular clock for front-end and scheduling, and then the fast clock for actual execution units that runs at twice the regular clock."
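The occupancy numbers in that quote reduce to simple arithmetic (1536 resident threads per SM, 8 concurrent blocks, 32-thread warps):

```python
threads_per_sm = 1536   # resident threads per Fermi SM
blocks_per_sm = 8       # concurrent thread blocks per SM
warp_size = 32          # threads per warp

print(threads_per_sm // blocks_per_sm)  # 192 threads per block for full occupancy
print(threads_per_sm // warp_size)      # 48 warps in flight per SM
```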


    I still don't know the exact Max number of folding instances you could load up on 1 Fermi GPU.
    Any help would be appreciated on the subject.

    Quote Originally Posted by Pixie_marj View Post
    And when did benchmarks become wildly theoretical?
Since we are trying to get benchmarks on a GPU that is still under NDA but is currently being used, and we get to see the production numbers.
    Last edited by Talonman; 10-18-2009 at 11:06 AM.

  19. #19
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by Talonman View Post
    Thanks!
    Did you even read the link?

  20. #20
    Xtreme Mentor Talonman's Avatar
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,972
    Yes...

    But I don't think they believe the guy.

    I still don't know the exact Max number of folding instances you could load up on 1 Fermi GPU.
    Any help would be appreciated on the subject. What is your opinion trinibwoy?

    He is now up to 382,179 in 24 hours!
    - 10K for the CPU

    372,179 / (7) Fermi = 53,168 PPD per each Fermi.

    I simply have a hard time believing a guy would show up with so much processing power, then give us all that info about his system, with it flat out being a lie.
    It just goes against human nature. He would naturally want to tell his Folding Team about the rig he is running.

Or, are we to believe it is a bunch of guys that all started folding together and just picked the name FahMan? No way!

    I believe he typed the truth as best he knew it. That simple.
    Last edited by Talonman; 10-18-2009 at 01:51 PM.

  21. #21
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by Talonman View Post
    What is your opinion trinibwoy?
    Don't have one. We have one guy claiming he's busting up FaH on some Fermis and another guy saying he's got confirmation that it's a lie.

    My only question is why somebody with that kind of PPD throughput would be motivated to lie about running Fermis. My first thought was that it was the Nvidia hype machine at work but that would be really out there.

  22. #22
    Xtreme Mentor Talonman's Avatar
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,972
    I agree. Too far out there for it to be the nVidia hype machine...

    Reading here...

    http://www.pcper.com/article.php?aid=789

    (1) Fermi has 16 SMs, with 32 CUDA Cores each.

    The SMs (streaming multiprocessors) execute threads in groups of 32 called “warps” that help to improve efficiency of the GPU.

    The GPU is made up of 3.0 billion transistors and features 512 CUDA processing cores organized into 16 streaming multiprocessors of 32 cores each.

    If he had the BETA folding program, and could load up more than one instance on a Fermi, does this mean:

    The groups of 32 Threads called “warps” x the 16 SMs = 512 total threads per (1) Fermi, all processing on 512 CUDA cores?
    What does that say to how many folding instances he could load up on 1 GPU, if he wanted to go for Max load?
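The warp arithmetic in that question works out as follows (a sketch; this counts one warp per SM at a time, not the total threads a Fermi keeps in flight):

```python
warp_size = 32   # threads per warp
sms = 16         # SMs per Fermi GPU
print(warp_size * sms)   # 512 -- one warp per SM matches the 512 CUDA cores
```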
    Last edited by Talonman; 10-18-2009 at 03:26 PM.

  23. #23
    Xtreme Cruncher Chumbucket843's Avatar
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
Warps handle WAY more threads than that. Fermi has about 24,000 and GT200 handles about 30,000. The GPU can't run a program on each thread like a CPU, though; the threads are designed to hide latency. Yes, a single folding instance has multiple threads. That's why they are so much faster than CPUs: there are many, many calculations that can be done independently, so they run very well on GPUs.
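Those in-flight thread counts check out if you multiply resident threads per SM by SM count (1536 x 16 for Fermi, 1024 x 30 for a full GT200):

```python
fermi_in_flight = 16 * 1536    # 24,576 -- "about 24,000"
gt200_in_flight = 30 * 1024    # 30,720 -- "about 30,000"
print(fermi_in_flight, gt200_in_flight)
```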

  24. #24
    Xtreme Mentor Talonman's Avatar
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,972
    Sorry, I did edit my post a bit.

    Thanks for the info.

    Do you have an opinion Chumbucket843 on how many GPU folding instances (1) Fermi could load up in theory?

    This seems to be a big point of yours:
    Quote Originally Posted by Chumbucket843 View Post
He is folding on over 200 processors; there is no way that is Fermi. It's just the GPU3 beta: 31 GPUs and an i7. G80 and up are MIMD arrays of SIMDs. It sounds like it is a current GPU because he is getting 700 points per SM; that sounds like G92. I find it ironic that if there are 248 active CPUs, that means exactly 31 GPUs.
    But I don't understand why the 248 is such a big deal.

    That is only about 35 folding instances per GPU. I thought Fermi could do that, no problem?
    Last edited by Talonman; 10-18-2009 at 04:05 PM.

  25. #25
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    1,870
    Quote Originally Posted by Talonman View Post
    Do you have an opinion Chumbucket843 on how many GPU folding instances (1) Fermi could load up in theory?
    Fermi can run 16 compute kernels in parallel, one per core.

