Page 3 of 3
Results 51 to 72 of 72

Thread: GPU Benchmarking Methods Investigated: Fact vs. Fiction

  1. #51
    Xtreme Addict
    Join Date
    Aug 2007
    Location
    Toon
    Posts
    1,570
    Quote Originally Posted by iMacmatician View Post
    I've been wondering why reviews haven't been using the standard deviation.
    Cos then we'd ask for confidence intervals and more hard stats.
    Intel i7 920 C0 @ 3.67GHz
    ASUS 6T Deluxe
    Powercolor 7970 @ 1050/1475
    12GB GSkill Ripjaws
    Antec 850W TruePower Quattro
    50" Full HD PDP
    Red Cosmos 1000

  2. #52
    Xtreme Addict
    Join Date
    Mar 2009
    Posts
    1,116
    Quote Originally Posted by cegras View Post
    bamtan: Yeah, but to get an accurate average you need at least 5 runs. And that's just for making sure you have a good linear fit. The sample size for a statistically relevant average + 3 SD rule would be LARGE.

    Extrapolating from 3 runs gives you no useful information at all.
    shrug, I don't think I mentioned how many runs should be done

    but since you brought it up, I disagree with you because I don't think computer game benchmarks vary enough that "3 runs gives you no useful information at all"

    I'm trying to help because every video card review on the web is at a grade 6 mathematics level. you are too, but you just landed a bridge too far
    Last edited by bamtan2; 06-18-2010 at 11:10 AM.
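To put rough numbers on the "3 runs vs. 5 runs" disagreement, here is a minimal sketch that computes the mean of repeated runs and a 95% confidence interval around it. The run results are hypothetical, the `mean_with_ci` helper is made up for illustration, and the interval assumes run-to-run variation is roughly normal.

```python
import math
import statistics

# Two-sided 95% Student's t critical values, keyed by degrees of freedom (n - 1).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}

def mean_with_ci(runs):
    """Return (mean, half_width) of a 95% confidence interval for the mean."""
    n = len(runs)
    if n - 1 not in T_95:
        raise ValueError("this sketch only covers 2 to 10 runs")
    mean = statistics.mean(runs)
    sd = statistics.stdev(runs)  # sample standard deviation
    half_width = T_95[n - 1] * sd / math.sqrt(n)
    return mean, half_width

# Hypothetical average-FPS results from repeating the same test.
three_runs = [58.0, 60.0, 62.0]
five_runs = [58.0, 59.0, 60.0, 61.0, 62.0]

for runs in (three_runs, five_runs):
    mean, half_width = mean_with_ci(runs)
    print(f"{len(runs)} runs: {mean:.1f} +/- {half_width:.1f} fps")
```

With these made-up numbers the interval from three runs is wide but not empty of information, and it narrows noticeably at five runs. How tight it gets in practice depends entirely on how much the real benchmark varies between runs.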

  3. #53
    Xtreme Addict
    Join Date
    Feb 2008
    Location
    America's Finest City
    Posts
    2,078
    Quote Originally Posted by SKYMTL View Post
    Like I said in the conclusion: there is no "right" way to go about it. Kyle's team has a unique perspective and it sets them (and sometimes their conclusions) apart from the norm. They do use in-game sequences and research things properly it seems, which is a huge step in the right direction. However, I am not actually sure if they include action sequences in their benchmarks or if they are just doing a run-through.

    What the article is really meant to convey is that the vast majority of standard benchmarking methods (built in benchmarks & stand-alone) are dead wrong. This is why readers should push for a transparent benchmarking process where there is disclosure of exactly which methods were used.



    I can't speak for other sites but the fact that we test every run three times and average out the results eliminates any "zingers" when it comes to minimum framerates. Average FPS are also there for a reason.
    I fully plan to include my testing methodology whenever i start up a new review... explaining the software and the environments used. Otherwise, i'm just putting the cloth over my readers' eyes.

    My point of view is that games are very finicky and the most important thing is not scores or anything like that, it's all about raw performance in games and the best bang for your buck as well as the best performance money can buy. I don't care what company is better or worse, I just want to give my readers a clear-cut, well-informed and transparent review.
    Quote Originally Posted by FUGGER View Post
    I am magical.

  4. #54
    Xtreme Enthusiast
    Join Date
    Aug 2008
    Posts
    577
    Quote Originally Posted by hurleybird View Post
    The ironic thing here is that this article is perpetuating one of the most common benchmarking mistakes of today: providing minimum frame rates without qualifying them. Minimum FPS by itself is worthless, since for all you know it could be for a single frame at the start of the level, or conversely that card might be hitting that minimum frame rate all of the time. Another example: if one card hits a very low minimum frame rate once for a very short period, and another card hits a higher minimum frame rate but goes there more often, it's the first card with the lower min fps that is providing the better gameplay experience. If you want to provide minimum frame rates, you MUST qualify them with a graph of fps over time, or at the very least a description of the gameplay. Unfortunately this poor methodology is very widespread.
    I think you are right. By my estimate, the solution is to provide the MEDIAN FPS as well, since together with the average and minimum FPS it tells you where most of the frame rates lie.
    --Intel i5 3570k 4.4ghz (stock volts) - Corsair H100 - 6970 UL XFX 2GB - - Asrock Z77 Professional - 16GB Gskill 1866mhz - 2x90GB Agility 3 - WD640GB - 2xWD320GB - 2TB Samsung Spinpoint F4 - Audigy-- --NZXT Phantom - Samsung SATA DVD--(old systems Intel E8400 Wolfdale/Asus P45, AMD965BEC3 790X, Antec 180, Sapphire 4870 X2 (dead twice))

  5. #55
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by bamtan2 View Post
    I'm trying to help because every video card review on the web is at a grade 6 mathematics level. you are too, but you just landed a bridge too far
    just right for many of the readers then. i'm sure less than 1% of readers of a typical vga review understand what standard deviation is, people have difficulties "reading" linear x-y plots

  6. #56
    Xtreme 3D Team
    Join Date
    Jan 2009
    Location
    Ohio
    Posts
    8,499
    Quote Originally Posted by W1zzard View Post
    just right for many of the readers then. i'm sure less than 1% of readers of a typical vga review understand what standard deviation is, people have difficulties "reading" linear x-y plots
    +1.

    People don't want to sit there for 2 minutes looking at linear graphs and charts. They want to see the result and see it quick.

    Quote Originally Posted by Stukov View Post
    I think you are right. By my estimate, the solution is to provide the MEDIAN FPS as well, since together with the average and minimum FPS it tells you where most of the frame rates lie.
    You do know what median means right? Middle. Median of what? Median of FPS? What? The Mode is the most.

  7. #57
    Xtreme Enthusiast
    Join Date
    Aug 2008
    Posts
    577
    Quote Originally Posted by BeepBeep2 View Post
    +1.

    People don't want to sit there for 2 minutes looking at linear graphs and charts. They want to see the result and see it quick.



    You do know what median means right? Middle. Median of what? Median of FPS? What? The Mode is the most.
    http://en.wikipedia.org/wiki/Median

    Median is the "middle" of a series of numbers. So if you have 1, 1, 1, 2, 3, 4, 5...2 is the median number. Having the Median number would allow you to know on which side of the average you will get most of the time. Average mean can be heavily skewed by massive jumps in data (or FPS here) or massive drops.

    Let's say it's 1, 1, 1, 2, 3, 100: the average (mean) would be 18, however the median would be between 1 and 2 (1.5). In terms of FPS, an average of 60, a minimum of 20, and a median of 65 will tell you that getting 20 FPS is not very likely, but if the median was 30, it would tell you that you end up lower than 60 quite often, but there are some high peaks skewing the average higher.
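The two number series in this post can be checked directly. This is a small sketch using Python's standard `statistics` module; the variable names are only for illustration.

```python
import statistics

# The two series used as examples in the post.
first = [1, 1, 1, 2, 3, 4, 5]
second = [1, 1, 1, 2, 3, 100]

for data in (first, second):
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{data}: mean = {mean:.2f}, median = {median}")
```

The single value of 100 drags the mean of the second series far above anything typical, while the median barely moves. That is the skew the post is describing for FPS peaks and drops.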
    Last edited by Stukov; 06-19-2010 at 12:20 AM.
    --Intel i5 3570k 4.4ghz (stock volts) - Corsair H100 - 6970 UL XFX 2GB - - Asrock Z77 Professional - 16GB Gskill 1866mhz - 2x90GB Agility 3 - WD640GB - 2xWD320GB - 2TB Samsung Spinpoint F4 - Audigy-- --NZXT Phantom - Samsung SATA DVD--(old systems Intel E8400 Wolfdale/Asus P45, AMD965BEC3 790X, Antec 180, Sapphire 4870 X2 (dead twice))

  8. #58
    Xtreme Addict
    Join Date
    Mar 2009
    Posts
    1,116
    Quote Originally Posted by W1zzard View Post
    just right for many of the readers then. i'm sure less than 1% of readers of a typical vga review understand what standard deviation is, people have difficulties "reading" linear x-y plots
    readers don't need to know what it is. it is just a tool to produce a result. like adding and dividing produces average, standard deviation can produce a useful minimum
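One possible reading of "standard deviation can produce a useful minimum" is to report the mean minus a couple of standard deviations instead of the single worst sample. The sketch below assumes that reading; the `effective_minimum` helper, the factor `k=2.0`, and both FPS traces are hypothetical and not something stated in the thread.

```python
import statistics

def effective_minimum(fps_samples, k=2.0):
    """Mean minus k standard deviations, never lower than the observed minimum."""
    mean = statistics.mean(fps_samples)
    sd = statistics.pstdev(fps_samples)  # population SD of the recorded samples
    return max(min(fps_samples), mean - k * sd)

# Hypothetical per-second FPS samples from a 60-second run.
one_off_dip = [60] * 59 + [20]      # a single bad second
often_slow = [60] * 30 + [30] * 30  # slow half of the time

for label, trace in (("one-off dip", one_off_dip), ("often slow", often_slow)):
    print(f"{label}: raw min = {min(trace)}, "
          f"effective min = {effective_minimum(trace):.1f}")
```

With these made-up traces, the raw minimum makes the first card look worse (20 vs. 30), while the SD-based figure ranks the card that is slow half the time lower. Readers would only ever see the final number, not the formula behind it.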

  9. #59
    I am Xtreme
    Join Date
    Dec 2007
    Posts
    7,750
    Quote Originally Posted by Stukov View Post
    http://en.wikipedia.org/wiki/Median

    Median is the "middle" of a series of numbers. So if you have 1, 1, 1, 2, 3, 4, 5...2 is the median number. Having the Median number would allow you to know on which side of the average you will get most of the time. Average mean can be heavily skewed by massive jumps in data (or FPS here) or massive drops.

    Let's say it's 1, 1, 1, 2, 3, 100: the average (mean) would be 18, however the median would be between 1 and 2 (1.5). In terms of FPS, an average of 60, a minimum of 20, and a median of 65 will tell you that getting 20 FPS is not very likely, but if the median was 30, it would tell you that you end up lower than 60 quite often, but there are some high peaks skewing the average higher.
    the biggest problem with median is that the fluctuation of fps is very noticeable. if you're talking about mpg, you don't really care if one tank gets 20 and the next gets 30, and then back to 20, but in games that's called micro stuttering, lol, and people really hated that.

    i think line graphs is where its at, 100%. anything else takes away from that. sure you can have summaries, but the raw data needs to be fully available for some readers who do take the time to read.
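The point about fluctuation can be illustrated with two traces that share the same mean and median but feel very different. This is only a sketch: both traces are made up, and `mean_abs_change` is a hypothetical helper, not an established stutter metric.

```python
import statistics

def mean_abs_change(fps_trace):
    """Average size of the jump between consecutive FPS samples."""
    jumps = [abs(b - a) for a, b in zip(fps_trace, fps_trace[1:])]
    return statistics.mean(jumps)

# Hypothetical traces: one steady, one bouncing between two values.
steady = [45] * 8
bouncing = [30, 60] * 4

for label, trace in (("steady", steady), ("bouncing", bouncing)):
    print(f"{label}: mean = {statistics.mean(trace)}, "
          f"median = {statistics.median(trace)}, "
          f"avg jump = {mean_abs_change(trace)}")
```

Any single summary number (mean or median) rates these two traces as identical. Only the raw line graph, or a measure of sample-to-sample change like the one above, shows the difference.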

  10. #60
    Xtreme Addict
    Join Date
    Aug 2002
    Posts
    1,202
    Quote Originally Posted by Manicdan View Post
    im pretty sure older cards are still better price/perf kings, considering a 4850 is 100$, nearly 1/3 the price, but i doubt its 3x slower
    Yeah but it runs 3x hotter and it's not really faster than even the cheaper, smaller 8800GT.

    5850 is the best price/performance card especially when clocked to 1000Mhz core.
    2600k @ 5.0Ghz 1.54V, Giga Z68, Zotac GTX680 AMP!, Patriot 1066Mhz 8GB RAM, Custom water, Silverstone 1000W, HAF932

  11. #61
    Xtreme Addict
    Join Date
    Mar 2009
    Posts
    1,116
    Quote Originally Posted by QuadDamage View Post
    Yeah but it runs 3x hotter and it's not really faster than even the cheaper, smaller 8800GT.

    5850 is the best price/performance card especially when clocked to 1000Mhz core.
    when you speak in absolutes it takes only one counterexample to disprove you.

    we are way off topic here, but you must be corrected. used cards have huge discounts, making it very difficult for any new card to compete for the price/performance crown.

    example: gtx 260 goes for $120 on ebay. 5850 is about $300. 5850 is not near twice as fast as gtx 260.

    http://completed.shop.ebay.com/i.htm...c0.m283&_rdc=1

    http://www.guru3d.com/article/vga-ch...ecember-2009/4
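The arithmetic behind this price/performance argument is short enough to spell out. The sketch uses only the two prices quoted in the post; `break_even_speedup` is a made-up helper name.

```python
def break_even_speedup(cheap_price, expensive_price):
    """How many times faster the pricier card must be to match performance per dollar."""
    return expensive_price / cheap_price

# Prices quoted in the post: used GTX 260 at about $120, new 5850 at about $300.
needed = break_even_speedup(120, 300)
print(f"The 5850 would need to be {needed:.1f}x as fast to tie on price/performance.")
```

So "not near twice as fast" is already more than enough for the argument: at these prices the new card would have to be 2.5 times as fast just to break even.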

  12. #62
    Xtreme Enthusiast
    Join Date
    Oct 2006
    Posts
    617
    you could display the spread of framerates like in the 1st attached pic, where the green GPU has higher average and max framerates but doesn't do as well at lower framerates

    if you had more FPS ranges on the x axis (eg 30 instead of the 10 in my pic) it could look good as a line graph instead of a bar graph

    if you had heaps of FPS ranges on the x axis (eg each fps range was an integer) and just used a colored dot for each value you'd get a cloud of scattered dots that would look good if you had enough data points.

    but you'd often want to compare multiple GPUs and multiple resolutions and too many graphs can be a nuisance... but if you made the columns of a multiple GPU/resolution graph into representations of the FPS spread you could display the full story in a single JPG (eg the "red" GPU that i did for example's sake in the 2nd pic, where each horizontal line is perhaps an integer FPS value, and its width represents how much time the GPU spends at that FPS. i didn't bother mspainting the other bars to perfection)
    Attached Images
    Last edited by hollo; 06-20-2010 at 11:16 AM.
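Since the attached pictures are not reproduced here, the following is a rough text-mode sketch of the kind of FPS-spread chart described in this post. The two GPU traces are invented for illustration, the helper names are hypothetical, and the code assumes one FPS sample per second so that the share of samples equals the share of time.

```python
from collections import Counter

def fps_histogram(fps_samples, bin_width=10):
    """Map each bin's lower edge to the share of samples that fall in it."""
    counts = Counter(int(f // bin_width) * bin_width for f in fps_samples)
    return {edge: counts[edge] / len(fps_samples) for edge in sorted(counts)}

def print_histogram(label, fps_samples, bin_width=10, bar_scale=40):
    """Print one text bar per FPS range, sized by time spent in that range."""
    print(label)
    for edge, share in fps_histogram(fps_samples, bin_width).items():
        bar = "#" * round(share * bar_scale)
        print(f"  {edge:3d}-{edge + bin_width - 1:<3d} {share:5.0%} {bar}")

# Hypothetical traces: "green" has the higher average and max, but also
# spends more time at low frame rates than "red".
green_gpu = [25] * 15 + [45] * 15 + [75] * 40 + [95] * 30
red_gpu = [35] * 10 + [55] * 60 + [65] * 30

print_histogram("green GPU", green_gpu)
print_histogram("red GPU", red_gpu)
```

Shrinking `bin_width` toward 1 gives the "each FPS range is an integer" variant from the post, at the cost of needing many more samples before the shape is readable.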

  13. #63
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    Quote Originally Posted by W1zzard View Post
    just right for many of the readers then. i'm sure less than 1% of readers of a typical vga review understand what standard deviation is, people have difficulties "reading" linear x-y plots
    implicitly they can understand it. kind of like how you understand physics and use it in everyday life even though you don't solve PDEs in your head.

    as shown above anyone can see the distribution of framerates and see that card X has more consistent framerates than card Y. if they dont know what to look for tell them or explain it.

    btw im not a fan of typical reviews. sites should do something unique, something that gives them individuality and a reason to visit them. as a technical person i would like to see in depth statistical analysis of benchmarks.

  14. #64
    Xtreme Addict
    Join Date
    Mar 2009
    Posts
    1,116


    a good point is made here, which is that just changing the presentation (what Y and X represent) of existing data can change how useful the picture is.

    normal reviews have Y=framerates along the left, and X=time along the bottom. but wait a second, who cares about time? we don't even know what the benchmark was doing at any given time.

    so rather than a graph presenting what speed things were going at a given time, why not use the exact same data to present how often things were going a given speed?
    Last edited by bamtan2; 06-20-2010 at 11:51 AM.
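Here is a minimal sketch of that change of presentation: the same FPS-over-time samples, re-read as "how much of the run was spent below a given speed". The trace is hypothetical, `share_below` is a made-up helper, and the samples are assumed to be evenly spaced in time.

```python
def share_below(fps_trace, threshold):
    """Fraction of samples (and so of time, if evenly spaced) below a threshold."""
    return sum(1 for fps in fps_trace if fps < threshold) / len(fps_trace)

# Hypothetical FPS-over-time samples, one per second.
trace = [62, 60, 58, 31, 29, 28, 55, 61, 63, 60]

# Sorting throws away the time axis and keeps only "how fast, how often".
print("sorted samples:", sorted(trace))
for threshold in (30, 60):
    print(f"below {threshold} fps: {share_below(trace, threshold):.0%} of the run")
```

No new measurements are needed for this view. It is the same log that would normally be drawn as a line graph, just summarized by speed instead of by time.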

  15. #65
    Xtreme Enthusiast
    Join Date
    Jul 2004
    Posts
    535
    This here is the good .

  16. #66
    Xtreme Enthusiast
    Join Date
    Mar 2005
    Location
    North USA
    Posts
    670
    Here's what I do with vbscript based off of samples provided (la as a 2d array) to calculate "average" (median) fps. Comments/Criticism very welcome.

    Code:
    ' Average FPS with the single highest and lowest sample discarded.
    ' Assumes la holds numeric samples in la(1)..la(UBound(la)), and that
    ' sDebug and logfile are set up earlier in the script.
    n = UBound(la)
    sum = 0
    kept = 0
    high = 1
    low = 1

    If n >= 3 Then
        ' Find the indices of the highest and lowest samples.
        For i = 2 To n
            If la(i) > la(high) Then high = i
            If la(i) < la(low) Then low = i
        Next
    End If

    For i = 1 To n
        If n >= 3 And (i = high Or i = low) Then
            ' Outlier: leave it out of the sum.
            If sDebug = 1 Then WScript.Echo "la(" & i & ")=" & la(i) & " discarded"
        Else
            sum = sum + la(i)
            kept = kept + 1
        End If
    Next

    ' With 1 or 2 samples nothing is discarded; if every sample is equal,
    ' high and low are the same index and only one sample is dropped.
    If kept > 0 Then
        avgfps = Round(sum / kept, 2)
        WScript.Echo "Average FPS=" & avgfps
        logfile.WriteLine ("AVERAGE FPS: " & avgfps)
    End If
    Last edited by Truckchase!; 06-20-2010 at 08:15 PM. Reason: said "mean", meant "median"
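As a cross-check on the "mean" vs. "median" wording, here is a short sketch of the same drop-the-highest-and-lowest idea next to a true median. The run values are hypothetical and `drop_extremes_mean` is a made-up name; it is a simplified stand-in, not a line-by-line port of the script.

```python
import statistics

def drop_extremes_mean(samples):
    """Average after discarding one highest and one lowest sample (if 3 or more)."""
    if len(samples) <= 2:
        return statistics.mean(samples)
    return statistics.mean(sorted(samples)[1:-1])

# Hypothetical average-FPS results from repeated runs.
three_runs = [55, 60, 80]
five_runs = [50, 55, 60, 70, 90]

for runs in (three_runs, five_runs):
    print(f"{runs}: trimmed mean = {drop_extremes_mean(runs):.2f}, "
          f"median = {statistics.median(runs)}")
```

With exactly three runs the two numbers are the same thing. With more runs, discarding the extremes and averaging the rest is a trimmed mean, which can differ from the median.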
    Asus P6T-DLX V2 1104 & i7 920 @ 4116 1.32v(Windows Reported) 1.3375v (BIOS Set) 196x20(1) HT OFF
    6GB OCZ Platinum DDR3 1600 3x2GB@ 7-7-7-24, 1.66v, 1568Mhz
    Sapphire 5870 @ 985/1245 1.2v
    X-Fi "Fatal1ty" & Klipsch ProMedia Ultra 5.1 Speaks/Beyerdynamic DT-880 Pro (2005 Model) and a mini3 amp
    WD 150GB Raptor (Games) & 2x WD 640GB (System)
    PC Power & Cooling 750w
    Homebrew watercooling on CPU and GPU
    and the best monitor ever made + a Samsung 226CW + Dell P2210 for eyefinity
    Windows 7 Utimate x64

  17. #67
    Xtreme Guru
    Join Date
    Aug 2007
    Posts
    3,562
    Interesting but how would you do that in a program like Excel in a quick and easy manner? I am guessing you could make a Custom List but that would take an insane amount of time. (directed @ Batman)

    Also, NVIDIA's newest WHQL drivers bring my point about in-game benchmarks into pretty sharp focus. They are claiming a massive framerate jump in the Concrete Jungle benchmark. In game? Nada.

  18. #68
    Xtreme Enthusiast
    Join Date
    Mar 2005
    Location
    North USA
    Posts
    670
    Quote Originally Posted by W1zzard View Post
    just right for many of the readers then. i'm sure less than 1% of readers of a typical vga review understand what standard deviation is, people have difficulties "reading" linear x-y plots
    Hey W1z; first of all, I'm a HUGE fan of your work. Thank you for all your contributions to the community. Just wanted to say that I think we're selling ourselves short. I would assert that the number is closer to 20%, but we look at it as a much smaller minority because that minority tends not to be very "vocal" in the first place. If someone gets it, they rarely complain; they just consume and move on.

    To that end, I think there is an audience for more intelligent data, it's just an audience that is difficult to market to or convince to do much of anything through standard advertising means.
    Asus P6T-DLX V2 1104 & i7 920 @ 4116 1.32v(Windows Reported) 1.3375v (BIOS Set) 196x20(1) HT OFF
    6GB OCZ Platinum DDR3 1600 3x2GB@ 7-7-7-24, 1.66v, 1568Mhz
    Sapphire 5870 @ 985/1245 1.2v
    X-Fi "Fatal1ty" & Klipsch ProMedia Ultra 5.1 Speaks/Beyerdynamic DT-880 Pro (2005 Model) and a mini3 amp
    WD 150GB Raptor (Games) & 2x WD 640GB (System)
    PC Power & Cooling 750w
    Homebrew watercooling on CPU and GPU
    and the best monitor ever made + a Samsung 226CW + Dell P2210 for eyefinity
    Windows 7 Utimate x64

  19. #69
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    nice article...

  20. #70
    Champion
    Join Date
    Jan 2007
    Location
    Romania, lab501.ro
    Posts
    1,707
    Nice read

    One observation though - when it comes to actual performance in a game, there is a difference between stand-alone benchmarks and different portions of the game, thus sustaining your point. However, when it comes to the % difference in performance between the 5850 and GTX 470 (your examples), I see almost no percentage difference between the two from one situation to the other.

    The point is very simple, based on your tests and results - if real-life performance in a game is what you are after, then in-game testing, even if it is far less accurate than benchmarks due to a large number of variables, is the way to go. However, if an accurate percentage difference in performance between 1, 2, 3, etc. products is what you are after, benchmarks are in no way less representative than in-game testing.
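The distinction between absolute frame rates and the relative gap between cards can be sketched as follows. All four FPS figures are invented for illustration and are not the article's results; `percent_faster` is a hypothetical helper.

```python
def percent_faster(slower_fps, faster_fps):
    """Relative lead of the faster card over the slower one, in percent."""
    return (faster_fps - slower_fps) / slower_fps * 100

# Hypothetical results for two cards under two test methods.
results = {
    "stand-alone benchmark": {"card A": 80.0, "card B": 100.0},
    "in-game run-through": {"card A": 40.0, "card B": 50.0},
}

for method, fps in results.items():
    gap = percent_faster(fps["card A"], fps["card B"])
    print(f"{method}: A = {fps['card A']:.0f} fps, "
          f"B = {fps['card B']:.0f} fps, B leads by {gap:.0f}%")
```

In this made-up case the absolute numbers halve between methods while the gap stays at 25%. That is the situation the post describes, though whether real games behave this way is exactly what the thread is arguing about.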
    Weissbier - breakfast of champions



  21. #71
    Xtreme Guru
    Join Date
    Aug 2007
    Posts
    3,562
    Quote Originally Posted by Monstru View Post
    The point is very simple, based on your tests and results - if real-life performance in a game is what you are after, then in-game testing, even if it is far less accurate than benchmarks due to a large number of variables, is the way to go. However, if an accurate percentage difference in performance between 1, 2, 3, etc. products is what you are after, benchmarks are in no way less representative than in-game testing.
    You can't forget that we're talking about playability as well. Take the AvP benchmark for example. If you were going by the stand-alone test, you would be under the wrong impression that none of the tested cards could actually play the game.

    The same could be said about some in-game benchmarks such as the Desert sequence in Just Cause 2. If you went by that, you would once again get the wrong impression since most cards perform amazingly in it while the game itself is more challenging.

  22. #72
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Posts
    800
    I agree with SKYMTL. Conclusion? We need more horsepower. I'm starting to feel the age of my 4890. Damn source games dipping me to below 30 fps sometimes.
