Dunno what to make of this... my system is p95 stable for >30 h under these settings. I guess this benchmark confirmed the stability, since the residual numbers are identical on every trial:
Code:
Intel(R) LINPACK data
Current date/time: Wed May 21 15:00:58 2008
CPU frequency: 3.400 GHz
Number of CPUs: 4
Number of threads: 4
Parameters are set to:
Number of tests : 1
Number of equations to solve (problem size) : 20000
Leading dimension of array : 20000
Number of trials to run : 100
Data alignment value (in Kbytes) : 4
Maximum memory requested that can be used = 3200404096, at the size = 20000
============= Timing linear equation system solver =================
Size LDA Align. Time(s) GFlops Residual Residual(norm)
20000 20000 4 117.053 45.5704 4.455000e-010 3.943651e-002
20000 20000 4 116.505 45.7846 4.455000e-010 3.943651e-002
20000 20000 4 116.076 45.9537 4.455000e-010 3.943651e-002
20000 20000 4 116.418 45.8190 4.455000e-010 3.943651e-002
20000 20000 4 116.994 45.5934 4.455000e-010 3.943651e-002
20000 20000 4 115.492 46.1864 4.455000e-010 3.943651e-002
20000 20000 4 117.306 45.4718 4.455000e-010 3.943651e-002
20000 20000 4 115.621 46.1346 4.455000e-010 3.943651e-002
20000 20000 4 117.541 45.3809 4.455000e-010 3.943651e-002
20000 20000 4 118.747 44.9200 4.455000e-010 3.943651e-002
20000 20000 4 117.721 45.3116 4.455000e-010 3.943651e-002
It f*cking scorched my cores. I ran HWMon, which logs the highest temps. I hit 69 °C during the tests; p95 by contrast didn't break 63 °C. If I open up my case, the 69 °C drops to 63 °C with this thing, but damn!
Something else I found striking (perhaps a major knock against this util as a stability app) is that despite a 100 % load on all 4 cores as per Task Manager, I'm seeing some pretty radical differences in temps between cores 0/1 and cores 2/3 which I do not see when running p95. I think this means that p95 does a more effective job of spreading the load out evenly, which may have stability implications.
Example:
Temps as read by HWMon for linpack: 63,63,56,56
Temps as read by HWMon for p95 large FFTs: 52,52,51,50
Temps as read by HWMon idle: 35,35,36,35
Again, the magnitude of the numbers doesn't bother me as much as the fact that linpack gives that massive 7 °C difference vs. the small 2 °C diff on p95. Yeah, I know about issues affecting temp spreads such as HS mounting, uneven pressure on the mount via the screws, uneven surfaces, TIM irregularities, etc., but other stability tests such as p95, OCCT, etc. do not show this major difference on my machine, which indicates to me that whatever is causing this is not hardware related.
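For anyone who wants to compare their own readings the same way, here is a minimal sketch of the spread arithmetic. The numbers are the HWMon readings quoted above; the `spread` helper is just an illustrative name.

```python
# Core temps (°C) as read by HWMon, copied from the readings above.
temps = {
    "linpack": [63, 63, 56, 56],
    "p95 large FFTs": [52, 52, 51, 50],
    "idle": [35, 35, 36, 35],
}


def spread(core_temps):
    """Difference between the hottest and the coolest core."""
    return max(core_temps) - min(core_temps)


for name, cores in temps.items():
    print(f"{name}: spread {spread(cores)} °C")
```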
What I would really love is a benchmark that can quickly (<15 min) tell me if my vcore is high enough for a given set of conditions. Perhaps this might do that if one shrinks the problem size so that the cycle time is more like 30-40 sec? It might be as good as or superior to p95 for verifying not thermal stability, but the "stability" of mathematical precision. As you know, p95 has the round-off error checking which we all use to verify "stability" of a given set of conditions. What sucks is that as you approach a "stable" system, the errors from p95 often take tens of hours to manifest themselves.
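A rough way to guess the problem size for a 30-40 sec cycle, assuming the GFlops rate stays about the same at smaller sizes (it may well drop a bit): the solver's work grows roughly with the cube of the problem size, so time scales the same way. The reference numbers come from the run above (size 20000, about 117 s per trial); `size_for_time` and `matrix_bytes` are made-up helper names for this sketch.

```python
# Rough estimate only: LU solve cost grows as ~(2/3)*n^3 flops, so at a
# constant GFlops rate the run time scales with the cube of the problem size.
REF_SIZE = 20000      # problem size from the run above
REF_TIME_S = 117.0    # approx. seconds per trial from the run above


def size_for_time(target_s, ref_size=REF_SIZE, ref_time_s=REF_TIME_S):
    """Problem size expected to take about target_s seconds per trial."""
    return round(ref_size * (target_s / ref_time_s) ** (1.0 / 3.0))


def matrix_bytes(n):
    """Memory for an n x n matrix of 8-byte doubles."""
    return 8 * n * n


for target in (30, 40):
    n = size_for_time(target)
    print(f"{target} s per trial -> size ~{n}, ~{matrix_bytes(n) / 2**30:.1f} GiB")
```

Under that assumption the answer lands somewhere around 12,700 to 14,000 equations. Actual times would need to be checked on the machine, since smaller problems may not hit the same GFlops.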
I'm not interested in doing the Pepsi challenge with this software in this regard, but maybe one of you out there is?
Question 1: How long do errors, as indicated by differences in the residual (norm) values, take to show up? In other words, if it takes p95 doing large FFTs 6 hours to fail on a core, how long does it take this software given a reasonable work size (my machine did a 20k size in about 3 min)?
Question 2: Has anyone run this thing for 12+ hours? If so, was it stable, and if not, how many hours did it take to give an error, again characterized by differences in the residual (norm) values?
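If someone does take this on, a small script could save eyeballing a 12-hour log. The sketch below assumes the output has been saved as text with result rows laid out like the ones pasted above (seven whitespace-separated columns); it compares the residual and residual (norm) columns as printed and reports the first trial that differs from trial 1. The function names are hypothetical, not part of the Linpack tool.

```python
import re

# Matches one result row of the table, e.g.
# "20000 20000 4 117.053 45.5704 4.455000e-010 3.943651e-002"
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+([\d.]+)\s+([\d.]+)\s+(\S+)\s+(\S+)\s*$"
)


def residuals(log_text):
    """Return (residual, residual_norm) string pairs, one per result row."""
    rows = []
    for line in log_text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((m.group(6), m.group(7)))
    return rows


def first_mismatch(log_text):
    """1-based index of the first trial that differs from trial 1, else None."""
    rows = residuals(log_text)
    for i, row in enumerate(rows[1:], start=2):
        if row != rows[0]:
            return i
    return None


# Two rows copied from the log above.
sample = """20000 20000 4 117.053 45.5704 4.455000e-010 3.943651e-002
20000 20000 4 116.505 45.7846 4.455000e-010 3.943651e-002"""
print(first_mismatch(sample))  # None when every trial agrees
```

Multiplying the mismatching trial number by the per-trial time gives a rough time-to-first-error to compare against p95.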