
Thread: Rewriting SuperPi

  1. #26
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    Quote Originally Posted by FUGGER View Post
    Our Super Pi will stay our version, muffinflavored pi or whatever you want to call it to avoid any confusion.

    But go nuts on making your version, I'm not one to hold anyone back from succeeding, just don't do it thinking you are replacing Super Pi.
    That is not my goal.

    Everyone might be seeing this as more than it really is.
    I am just trying to calculate x digits of pi, and see how long it takes.
    I do not know if the memory usage will be amazing, or anything.
    Hopefully in the end, it will be that way.

    If anyone knows anything about the Gauss-Legendre algorithm for calculating pi, let me know.
    Last edited by MuffinFlavored; 04-03-2008 at 04:22 PM.
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  2. #27
    Xtreme Guru
    Join Date
    Jan 2005
    Location
    waukegan
    Posts
    3,607
    I hope it will be even faster, cuz RAM is faster than any HDD.
    mobo: strix b350f
    gpu: rx580 1366/2000
    cpu: ryzen 1700 @ 3.8ghz
    ram: 32 gb gskill 2400 @ 3000
    psu: corsair 1kw
    hdd's: samsung 500gb ssd 1tb & 3tb hdd

  3. #28
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    So far, I can calculate 16 digits of pi correctly.

    Code:
        #include <stdio.h>
        #include <math.h>

        int main(void) {
            /* Gauss-Legendre (AGM) iteration; the number of correct digits roughly doubles each pass */
            long double a = 1.0L;
            long double b = 1.0L / sqrtl(2.0L);
            long double t = 0.25L;
            long double p = 1.0L;
            long double temp, pi;

            int iterations = 4;
            int n = 1;

            while (n <= iterations) {
                temp = a;
                a = (a + b) / 2.0L;
                b = sqrtl(temp * b);
                t = t - p * (temp - a) * (temp - a);
                p = 2.0L * p;
                n++;
            }

            pi = (a + b) * (a + b) / (4.0L * t);
            printf("%.18Lf\n", pi);   /* %Lf, not %lf, is the conversion for long double */
            return 0;
        }
    I am trying to avoid the use of an external library to support large numbers.
    "long double" is at most 16 bytes (and on most x86 compilers it is really just 64-bit or 80-bit precision), so it only gives about 16-19 significant digits.
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  4. #29
    Banned
    Join Date
    Feb 2006
    Location
    Hhw
    Posts
    4,036
    Quote Originally Posted by MuffinFlavored View Post
    So far, I can calculate 16 digits of pi correctly.

    Code:
        #include <stdio.h>
        #include <math.h>

        int main(void) {
            /* Gauss-Legendre (AGM) iteration; the number of correct digits roughly doubles each pass */
            long double a = 1.0L;
            long double b = 1.0L / sqrtl(2.0L);
            long double t = 0.25L;
            long double p = 1.0L;
            long double temp, pi;

            int iterations = 4;
            int n = 1;

            while (n <= iterations) {
                temp = a;
                a = (a + b) / 2.0L;
                b = sqrtl(temp * b);
                t = t - p * (temp - a) * (temp - a);
                p = 2.0L * p;
                n++;
            }

            pi = (a + b) * (a + b) / (4.0L * t);
            printf("%.18Lf\n", pi);   /* %Lf, not %lf, is the conversion for long double */
            return 0;
        }
    I am trying to avoid the use of an external library to support large numbers.
    "long double" is at most 16 bytes (and on most x86 compilers it is really just 64-bit or 80-bit precision), so it only gives about 16-19 significant digits.
    What SPi does is save temporary results to HDD/RAM, which is why it's so tweakable. You won't get far if you only use built-in data types; you have to store intermediate results somewhere.

    Edit:

    Or, just combine data types, e.g. two 'long double's where the first one holds the leading zeros. Actually it's the same as I said above: you're doubling the variables which are used to hold the data, so it will double the RAM usage, and at some point the OS might start using the swap file for some reason.
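
    (Purely illustrative, and not how SPi actually does it: one minimal way to "store intermediate results somewhere" is to keep a big number as an array of base-10^9 limbs. The names, the limb count and the addition-only interface below are all made up for this sketch.)
    Code:
        /* Big number stored as LIMBS unsigned ints in base 10^9, least significant limb first. */
        #include <stdio.h>

        #define LIMBS 8                /* 8 limbs * 9 digits = 72 decimal digits */
        #define BASE  1000000000u      /* each limb holds 9 decimal digits */

        typedef struct { unsigned int limb[LIMBS]; } bignum;

        /* r = a + b, carrying between limbs */
        void big_add(bignum *r, const bignum *a, const bignum *b) {
            unsigned long long carry = 0;
            int i;
            for (i = 0; i < LIMBS; i++) {
                unsigned long long s = (unsigned long long)a->limb[i] + b->limb[i] + carry;
                r->limb[i] = (unsigned int)(s % BASE);
                carry = s / BASE;
            }
        }

        void big_print(const bignum *x) {
            int i;
            printf("%u", x->limb[LIMBS - 1]);      /* most significant limb first */
            for (i = LIMBS - 2; i >= 0; i--)
                printf("%09u", x->limb[i]);        /* inner limbs padded to 9 digits */
            printf("\n");
        }

        int main(void) {
            bignum a = {{999999999u, 1u}};         /* a = 1999999999 */
            bignum b = {{2u}};                     /* b = 2 */
            bignum r;
            big_add(&r, &a, &b);
            big_print(&r);                         /* prints ...0002000000001 */
            return 0;
        }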
    Last edited by Marvin_The_Martian; 04-03-2008 at 11:37 PM.

  5. #30
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    Quote Originally Posted by Marvin_The_Martian View Post
    What SPi does is save temporary results to HDD/RAM, which is why it's so tweakable. You won't get far if you only use built-in data types; you have to store intermediate results somewhere.

    Edit:

    Or, just combine data types, e.g. two 'long double's where the first one holds the leading zeros. Actually it's the same as I said above: you're doubling the variables which are used to hold the data, so it will double the RAM usage, and at some point the OS might start using the swap file for some reason.
    The way the algorithm I was using worked, it returned twice the amount of digits calculated each time. That is not good.

    Trying a digit-extraction method, I can get much better results.
    But the method I have now takes a very long time to calculate 16384 digits of pi (16 KB).
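
    (Side note, purely as a sketch of what a digit-extraction method can look like: the BBP formula extracts hexadecimal, not decimal, digits of pi at an arbitrary position. The tail length and output width below are arbitrary choices, and this is not necessarily the exact method used above.)
    Code:
        #include <stdio.h>
        #include <math.h>

        /* 16^n mod m, by floating-point exponentiation by squaring */
        double pow16_mod(int n, double m) {
            double result = 1.0, base = fmod(16.0, m);
            while (n > 0) {
                if (n & 1) result = fmod(result * base, m);
                base = fmod(base * base, m);
                n >>= 1;
            }
            return result;
        }

        /* fractional part of the sum over k of 16^(d-k) / (8k + j) */
        double series(int j, int d) {
            double s = 0.0;
            int k;
            for (k = 0; k <= d; k++) {
                double den = 8.0 * k + j;
                s += pow16_mod(d - k, den) / den;
                s -= floor(s);
            }
            for (k = d + 1; k <= d + 16; k++)          /* small tail where 16^(d-k) < 1 */
                s += pow(16.0, (double)(d - k)) / (8.0 * k + j);
            return s - floor(s);
        }

        int main(void) {
            int d = 0, i;                              /* hex position after the point */
            double x = 4.0 * series(1, d) - 2.0 * series(4, d)
                     - series(5, d) - series(6, d);
            x -= floor(x);

            printf("hex digits of pi from position %d: ", d);
            for (i = 0; i < 8; i++) {                  /* expect 243F6A88 for d = 0 */
                x *= 16.0;
                printf("%X", (int)x);
                x -= (int)x;
            }
            printf("\n");
            return 0;
        }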

    But, I have this question.

    What does everyone prefer:
    a pi-calculating benchmark
    or a prime-calculating benchmark
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  6. #31
    Banned
    Join Date
    Feb 2006
    Location
    Hhw
    Posts
    4,036
    I think the algorithm doesn't matter to most; for calculating prime numbers you can use P95, and pi has already been calculated to idk how many digits.

    People will care about how tweakable the benchmark is, and about how consistent the scaling will be (e.g. predictability, so you can weed out any scores which are simply bugged or cheated). Best would be a cheat-safe bench of course.

    My .02c.

  7. #32
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    Quote Originally Posted by Marvin_The_Martian View Post
    I think the algorithm doesn't matter to most; for calculating prime numbers you can use P95, and pi has already been calculated to idk how many digits.

    People will care about how tweakable the benchmark is, and about how consistent the scaling will be (e.g. predictability, so you can weed out any scores which are simply bugged or cheated). Best would be a cheat-safe bench of course.

    My .02c.
    I would very much like to make something new.
    If anyone knows anything that would stress a CPU (I am now even trying to write some DirectX 9 applications), please let me know.
    I will attempt to implement it.
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  8. #33
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    2,128
    So, you know HOW to do it, but you don't know WHAT to do?

    What about doing something GRAPHICAL instead? Software rendering.

    What about doing the "plasma effect" from oldie intros/demos? Don't use LUTs, calculate everything over and over again for every pixel, with cosines and sines.

    Basic idea goes like this:
    go through every pixel on the screen (with for(){}'s).
    calculate the pixel color, e.g. c = 128 + cos((x*y)/120)*64 + sin(x/50 + y/30)*48 + cos((y/x)*60)/16
    set the pixel at the appropriate point on the screen (direct framebuffer access would be way faster though).
    loop for 2 minutes.
    calculate the amount of rendered frames.
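
    (A minimal, CPU-only sketch of that per-pixel loop - the "frame" is just a buffer, there is no real display, and the resolution, the colour formula and the 10-second run are made-up examples.)
    Code:
        #include <stdio.h>
        #include <math.h>
        #include <time.h>

        #define W 1600
        #define H 1200

        static unsigned char frame[W * H];

        void render_frame(double t) {
            int x, y;
            for (y = 0; y < H; y++) {
                for (x = 0; x < W; x++) {
                    /* no lookup tables: fresh sin/cos for every pixel */
                    double c = 128.0
                             + 64.0 * cos((double)(x * y) / 120.0 + t)
                             + 48.0 * sin(x / 50.0 + y / 30.0 + t);
                    frame[y * W + x] = (unsigned char)((int)c & 0xFF);
                }
            }
        }

        int main(void) {
            clock_t start = clock();
            int frames = 0;
            double secs;
            while (clock() - start < 10 * CLOCKS_PER_SEC) {   /* roughly 10 s of CPU time */
                render_frame(frames * 0.05);
                frames++;
            }
            secs = (double)(clock() - start) / CLOCKS_PER_SEC;
            /* print something read from the buffer so the work can't be optimized away */
            printf("%d frames in %.2f s (%.2f fps), sample pixel %u\n",
                   frames, secs, frames / secs, (unsigned)frame[0]);
            return 0;
        }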

    You can put a huge amount of load on a CPU if you don't use LUTs and use big resolutions. Think about 1600*1200*10 trigonometric functions in one second. Too fast? Add a more complex formula.

    It doesn't really matter what it renders, as long as it renders with the CPU. I'd be interested in watching some kind of an effect rather than "Calculating... Please wait..." text over and over again. Oh, and whatever you do, please do it with cross-platform in mind.

    Or what about image rotating? HUGE maze generation? 3D rendering? Fractal rendering? There is SO much you can do, just be creative.
    Last edited by Calmatory; 04-06-2008 at 11:04 PM.

  9. #34
    Xtreme Member
    Join Date
    Jun 2007
    Location
    Finland
    Posts
    236
    Have you, muffinflavored, seen this thread?

    Maybe you would be better off coding the stuff calmatory said.

    I'm not against you rewriting SuperPi, but it seems to me that there are already many versions of the same thing.
    ASUS P5Q DELUXE
    Intel Q9450
    Mushkin 4GB
    Asus GTX260

  10. #35
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    You are right.
    I need to create a DirectX 9 benchmark that heavily uses the CPU.

    2 in 1 bonus.

    I have already started.
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  11. #36
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    2,128
    Why include DX?

    DX with a few cubes, textures, simple shaders and some primitive lighting techniques is about the same for the GPU as 1+1 is for the CPU - a VERY poor idea. Either you need to stress, stress a lot, and even more, or not stress at all.

    Besides, unless you really are going to STRESS the GPU, it will be CPU-bound, and then the DirectX implementation is just a waste of time. You can get graphics with the CPU as well. Every 3DMark, including 3DMark06, is CPU-bound; are you, alone, aiming higher than Futuremark went with a full team of professionals?

    So, make one good, instead of two poor benches, I am fairly certain that 1+1 = 2 isn't good in this case.

    And no, I am not trying to put you down or anything, just saying what I think.

    (Well, tbh, I believe SuperPI is #1 because it has HWBot etc. support. It's the de facto standard. If you ask me, the NucRus MultiCore benchmark or wPrime would be better benches than SuperPI, but SPI is a legend, and nothing is going to take that away, no matter how good it is/what it stresses. Sad but true. I'd personally be more interested in wPrime/NucRus etc. Or a 3D software renderer.)

  12. #37
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    Quote Originally Posted by Calmatory View Post
    Why include DX?

    DX with a few cubes, textures, simple shaders and some primitive lighting techniques is about the same for the GPU as 1+1 is for the CPU - a VERY poor idea. Either you need to stress, stress a lot, and even more, or not stress at all.

    Besides, unless you really are going to STRESS the GPU, it will be CPU-bound, and then the DirectX implementation is just a waste of time. You can get graphics with the CPU as well. Every 3DMark, including 3DMark06, is CPU-bound; are you, alone, aiming higher than Futuremark went with a full team of professionals?

    So, make one good, instead of two poor benches, I am fairly certain that 1+1 = 2 isn't good in this case.

    And no, I am not trying to put you down or anything, just saying what I think.

    (Well, tbh, I believe SuperPI is #1 because it has HWBot etc. support. It's the de facto standard. If you ask me, the NucRus MultiCore benchmark or wPrime would be better benches than SuperPI, but SPI is a legend, and nothing is going to take that away, no matter how good it is/what it stresses. Sad but true. I'd personally be more interested in wPrime/NucRus etc. Or a 3D software renderer.)
    I guess I must have said something wrong in my first post.

    I am not trying to replace 3DMark.
    I am not trying to replace SuperPi.
    I am not trying to replace wPrime.
    I am not trying to "replace" existing benchmarks.

    I was just going to see if I could "program" a very stressing CPU benchmark, and then try to multi-thread it.

    I know 1+1 is not stressing.
    I know drawing individual pixels is not stressing.

    I will attempt to make something difficult, stressing, and worthwhile.

    All I can do is try.
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  13. #38
    Xtreme Enthusiast
    Join Date
    Oct 2005
    Location
    Melbourne, Australia
    Posts
    529
    Quote Originally Posted by MuffinFlavored View Post
    I just need a way to benchmark the CPUs.
    After that, I can make it less memory dependent, more CPU dependent, anything.
    That will just make it more biased to Core 2 chips with their huge cache, but no onboard memory controller.

  14. #39
    Banned
    Join Date
    May 2005
    Location
    Belgium, Dendermonde
    Posts
    1,292
    try something like FFT or DFT? very intensive
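
    (For illustration, a naive O(n^2) DFT makes a decent pure floating-point load; the transform size and the test signal below are arbitrary, and a real benchmark would more likely use an optimized FFT.)
    Code:
        #include <stdio.h>
        #include <math.h>

        #ifndef M_PI
        #define M_PI 3.14159265358979323846
        #endif

        #define N 4096

        static double re_in[N], im_in[N], re_out[N], im_out[N];

        /* naive O(N^2) discrete Fourier transform - pure floating-point work, no FFT tricks */
        void dft(void) {
            int k, n;
            for (k = 0; k < N; k++) {
                double sum_re = 0.0, sum_im = 0.0;
                for (n = 0; n < N; n++) {
                    double angle = -2.0 * M_PI * (double)k * (double)n / N;
                    sum_re += re_in[n] * cos(angle) - im_in[n] * sin(angle);
                    sum_im += re_in[n] * sin(angle) + im_in[n] * cos(angle);
                }
                re_out[k] = sum_re;
                im_out[k] = sum_im;
            }
        }

        int main(void) {
            int n;
            for (n = 0; n < N; n++) {
                re_in[n] = sin(2.0 * M_PI * 5.0 * n / N);   /* a simple test tone in bin 5 */
                im_in[n] = 0.0;
            }
            dft();
            printf("|X[5]| = %f (expect about N/2 = %d)\n",
                   sqrt(re_out[5] * re_out[5] + im_out[5] * im_out[5]), N / 2);
            return 0;
        }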

  15. #40
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    2,128
    Basically people "care" about how their CPU brand is doing in benches. If AMD beats Intel in wPrime and NucRus (don't heat up please, just an example!) then Intelists shout for "real world benches" or say that Intel > AMD since Intel is faster in SuperPI. Same goes if Intel > AMD. It just depends who you ask, but generally people care about "real world" (read: Crysis, 3DMark CPU benches, SuperPI & PiFast and maybe CINEBENCH).

    So what you might wanna do is stress the CPU in multiple ways: integer division and multiplication, floating-point division/multiplication, bit shifting, string manipulation, base-10 to base-16 conversions and vice versa, trigonometric functions...
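
    (A rough sketch of what such a mixed loop could look like - the iteration counts, constants and checksum below are arbitrary; it's only meant to show the idea of mixing integer, floating-point, shift and base-conversion work.)
    Code:
        #include <stdio.h>
        #include <math.h>

        int main(void) {
            unsigned int acc = 12345u;
            double facc = 1.0001;
            unsigned long checksum = 0;
            char hex[16];
            unsigned int i;

            for (i = 1; i <= 20000000u; i++) {
                acc = (acc * 2654435761u) / (i | 1);    /* integer multiply + divide */
                acc ^= (acc << 7) | (acc >> 25);        /* bit shifts (a rotate) */
                facc = facc * 1.0000001 / 0.9999999;    /* floating-point multiply + divide */
                if ((i & 0xFFFFFu) == 0) {
                    sprintf(hex, "%X", acc);            /* value to base-16 string */
                    checksum += (unsigned long)hex[0];
                }
                checksum += acc & 0xFu;
            }
            /* print a checksum so the compiler can't throw the work away */
            printf("checksum %lu, facc %f, sin(facc) %f\n", checksum, facc, sin(facc));
            return 0;
        }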

    If there is such a benchmark which really tests all of that (let's call it X), people will STILL want to see "real world" benches, despite benchmark X testing an even WIDER variety of features than the average "real world" bench.

    Pretty much screwed, aye?

    Well, not really, unless one aims to make a benchmark which people care about and regard as the best meter of CPU speed (which it could in reality be, thanks to the wide variety of features tested). But after all, the main idea and motivator is the will to learn and experiment, right?

  16. #41
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    Quote Originally Posted by Calmatory View Post
    Basically people "care" about how their CPU brand is doing in benches. If AMD beats Intel in wPrime and NucRus (don't heat up please, just an example!) then Intelists shout for "real world benches" or say that Intel > AMD since Intel is faster in SuperPI. Same goes if Intel > AMD. It just depends who you ask, but generally people care about "real world" (read: Crysis, 3DMark CPU benches, SuperPI & PiFast and maybe CINEBENCH).

    So what you might wanna do is stress the CPU in multiple ways: integer division and multiplication, floating-point division/multiplication, bit shifting, string manipulation, base-10 to base-16 conversions and vice versa, trigonometric functions...

    If there is such a benchmark which really tests all of that (let's call it X), people will STILL want to see "real world" benches, despite benchmark X testing an even WIDER variety of features than the average "real world" bench.

    Pretty much screwed, aye?

    Well, not really, unless one aims to make a benchmark which people care about and regard as the best meter of CPU speed (which it could in reality be, thanks to the wide variety of features tested). But after all, the main idea and motivator is the will to learn and experiment, right?
    Very true. From what you are saying, it seems the end result, in comparison to other people's end results, almost matters more than the benchmark itself.

    Yes, I am in it for the learning and experimentation, and the fellow members of XS could be, sort of "beta testers".
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  17. #42
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    2,128
    If you ever need a tester, feel free to count me in!

    May I ask what kind of C++ experience you have? i.e. how much? When did you start, what have you done so far? Other languages? Just curious.

  18. #43
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    Quote Originally Posted by Calmatory View Post
    If you ever need a tester, feel free to count me in!

    May I ask what kind of C++ experience you have? i.e. how much? When did you start, what have you done so far? Other languages? Just curious.
    A tester? Didn't you send me the code of the ASM benchmarks? You need to be writing with me!

    To be honest, I have what I guess people would call "minimal" C/C++ experience. I kind of combined the two when I was learning, but now I am trying to stick with C.

    I am 14 years old, I started by just reading a couple of tutorials, and I went from there.

    I know a lot of PHP, Python, and C/C++ (I think of them as the close relatives )

    And ever since you sent me that CPU Bench 1.02 source code, I have been reading about ASM opcodes.

    Do you want to do some sort of combined project?
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  19. #44
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    2,128
    Hmm, you must be mixing me up with someone else, since as far as I know, I haven't sent anyone anything related to ASM benchmarks, as I don't even know ASM well enough.

    Why don't you want to use external libraries? The clock() in time.h is AWFULLY inaccurate under heavy CPU load (the running time for my plasmas was over 14 seconds, whereas it should have stopped right after 10 seconds had passed), so there is no accurate way of measuring time with it.

  20. #45
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    Quote Originally Posted by Calmatory View Post
    Hmm, you must be mixing me up with someone else, since as far as I know, I haven't sent anyone anything related to ASM benchmarks, as I don't even know ASM well enough.

    Why don't you want to use external libraries? The clock() in time.h is AWFULLY inaccurate under heavy CPU load (the running time for my plasmas was over 14 seconds, whereas it should have stopped right after 10 seconds had passed), so there is no accurate way of measuring time with it.
    Then who sent me those benchmarks?

    I should find a better way to measure time, but by external libraries, I meant a bignum library, because I felt it would not provide the best performance.
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  21. #46
    Xtreme Mentor
    Join Date
    May 2007
    Posts
    2,792
    Quote Originally Posted by MuffinFlavored View Post
    Yeah, wPrime is an awesome benchmark.
    But with the recent fluctuation between versions, I don't know.
    The author commented quite clearly that official stable is still 1.55 and other newer builds having problems are not yet stable and unofficial for the exact reason. They will be improved and quickened up quite a bit before official release build according to him

    As for something which creates heat and computes Pi, check SuperPrime, developed by a fella on here. The best CPU/RAM test I've played with since Linpack; it beats Linpack 32b for stress and heat and shows up errors very quickly. Cheat-preventative, and it picks up system details through CPUZ too [which was a problem for a long time], better than wPrime does. One major downside to it is, it only runs on Intel platforms.

    I have no idea why Charles hasn't updated and made Super Pi more efficient, I know many fellas have complained and advised for it many times over the years. One good reason is, you get to compare from the beginning to the end across all platforms - single threaded, single channel performance. Why Charles doesn't release the source code is one major reason you should be careful with it too -> to limit the cheating.

    No calculation I've come across yet that isn't a specific memory test is RAM-intensive or very sensitive to memory until you get into the high-memory-footprint periods of the code. Something like SysTool Pi is more RAM-intensive than most apps around. PhotoWorxx in EVEREST and WinRAR are about the best I've seen that are quite RAM/FSB-intensive, much more than others anyway. A very good memory subsystem benchmark is STREAM; all professionals incl. hardware firms use it, and yeah, they tend to compete in it. Maybe you can take pointers from that, since IIRC it is coded in C as well as Fortran, the source code is available, and STREAM2 is being developed currently.
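
    (Not STREAM itself, just a minimal triad-style kernel of the kind STREAM times - a[i] = b[i] + q*c[i] over arrays far bigger than cache - to show why such a loop measures memory bandwidth. The array size and repeat count are arbitrary.)
    Code:
        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>

        #define N    10000000           /* 3 arrays x 80 MB, far bigger than any cache */
        #define REPS 10

        int main(void) {
            double *a = malloc(N * sizeof(double));
            double *b = malloc(N * sizeof(double));
            double *c = malloc(N * sizeof(double));
            long i;
            int rep;
            clock_t t0;
            double secs;

            if (!a || !b || !c) { printf("out of memory\n"); return 1; }
            for (i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

            t0 = clock();
            for (rep = 0; rep < REPS; rep++)
                for (i = 0; i < N; i++)
                    a[i] = b[i] + 3.0 * c[i];   /* triad: two loads and one store per element */
            secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

            /* roughly 3 doubles = 24 bytes move per element per repetition */
            printf("a[0] = %f, ~%.1f MB/s\n", a[0], (double)REPS * N * 24.0 / secs / 1e6);
            free(a); free(b); free(c);
            return 0;
        }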

    As for Intel/AMD, you should ignore zealots and biased individuals, or those with little understanding of uarch functioning, and neutralize your benchmark to be platform-independent while still remaining consistent across processor generations and core improvements - i.e. it can't be showing a Pentium 4 faster per MHz than Penryn in integer calcs, for instance. Check out Intel's documentation on ILP, TLP and PLP at their website; their software developer community covers quite a lot of information and hints. Check out the Linpack source code for good pointers too, I know this is how many other devs start off building good benchmarks.
    That's what SPEC did when developing the CPU 2006 benchmark, and hence why all professional firms compete and rely on it, since they govern the benchmarking affair officially. Both CPU MFG firms have different strengths, weaknesses and dependencies, as all data analyzers and coders will know. If you use code optimized with instructions which favor AMD, it will win, and if you use the opposite, Intel will win. You can see the instruction and bench types which favor Intel and AMD in this 2.83G Harpertown and 2.3G Barcelona comparison. If you took Gamess, for instance, the AMD CPU would outclass the Intel one. But if you took PovRay, for example, the Intel CPU would outclass the AMD one. Different code, different strengths, no one brush fits all.

    There really is a strong need to get some multi-threaded version that is RAM-intensive out. All CPU MFGs are now moving to decrease data bottlenecks in their core uarchs and buses, and to massively increase cache, inter-core, intra-core and RAM bandwidth, to enable fast computations with improved energy efficiency. The problem is, if the applications and benchmarks around are not coded to take advantage of them - their new instruction sets, their multiple cores, their memory, cache and bus performance - the end result will be useless for showing system performance, and the application might very well run slower per clock on newer uarchs than on older single-threaded ones with little memory bandwidth. Everything is being improved for multithreaded parallelism, so coding needs to take into account compilers/languages, their optimizations, operational semantics, functional languages, extensions, higher-order functions, polymorphism, non-determinism and so on that exploit parallelism best. Maybe have a look into lambda calculus too.

    As for coding, my coding is v.weak now so I'm reserved in what I say, I have no interest in it nor the time. I quit in late 2005 and have not touched it since then apart from Firefox/Thunderbird related coding and whatever comes with its debugging.

    EDIT: this might help you. QPi uses the most efficient algorithm I've tried, although there may well be better implementations of it out there: http://www.geocities.com/tsrmath/pi/piprogs.html#QPI
    Last edited by KTE; 04-10-2008 at 01:52 AM.

  22. #47
    Xtreme Enthusiast
    Join Date
    May 2007
    Posts
    831
    Quote Originally Posted by KTE View Post
    The author commented quite clearly that official stable is still 1.55 and other newer builds having problems are not yet stable and unofficial for the exact reason. They will be improved and quickened up quite a bit before official release build according to him

    As for something which creates heat and computes Pi, check SuperPrime, developed by a fella on here. The best CPU/RAM test I've played with since Linpack; it beats Linpack 32b for stress and heat and shows up errors very quickly. Cheat-preventative, and it picks up system details through CPUZ too [which was a problem for a long time], better than wPrime does. One major downside to it is, it only runs on Intel platforms.

    I have no idea why Charles hasn't updated and made Super Pi more efficient, I know many fellas have complained and advised for it many times over the years. One good reason is, you get to compare from the beginning to the end across all platforms - single threaded, single channel performance. Why Charles doesn't release the source code is one major reason you should be careful with it too -> to limit the cheating.

    No calculation I've come across yet that isn't a specific memory test is RAM-intensive or very sensitive to memory until you get into the high-memory-footprint periods of the code. Something like SysTool Pi is more RAM-intensive than most apps around. PhotoWorxx in EVEREST and WinRAR are about the best I've seen that are quite RAM/FSB-intensive, much more than others anyway. A very good memory subsystem benchmark is STREAM; all professionals incl. hardware firms use it, and yeah, they tend to compete in it. Maybe you can take pointers from that, since IIRC it is coded in C as well as Fortran, the source code is available, and STREAM2 is being developed currently.

    As for Intel/AMD, you should ignore zealots and biased individuals, or those with little understanding of uarch functioning, and neutralize your benchmark to be platform-independent while still remaining consistent across processor generations and core improvements - i.e. it can't be showing a Pentium 4 faster per MHz than Penryn in integer calcs, for instance. Check out Intel's documentation on ILP, TLP and PLP at their website; their software developer community covers quite a lot of information and hints. Check out the Linpack source code for good pointers too, I know this is how many other devs start off building good benchmarks.
    That's what SPEC did when developing the CPU 2006 benchmark, and hence why all professional firms compete and rely on it, since they govern the benchmarking affair officially. Both CPU MFG firms have different strengths, weaknesses and dependencies, as all data analyzers and coders will know. If you use code optimized with instructions which favor AMD, it will win, and if you use the opposite, Intel will win. You can see the instruction and bench types which favor Intel and AMD in this 2.83G Harpertown and 2.3G Barcelona comparison. If you took Gamess, for instance, the AMD CPU would outclass the Intel one. But if you took PovRay, for example, the Intel CPU would outclass the AMD one. Different code, different strengths, no one brush fits all.

    There really is a strong need to get some multi-threaded version that is RAM-intensive out. All CPU MFGs are now moving to decrease data bottlenecks in their core uarchs and buses, and to massively increase cache, inter-core, intra-core and RAM bandwidth, to enable fast computations with improved energy efficiency. The problem is, if the applications and benchmarks around are not coded to take advantage of them - their new instruction sets, their multiple cores, their memory, cache and bus performance - the end result will be useless for showing system performance, and the application might very well run slower per clock on newer uarchs than on older single-threaded ones with little memory bandwidth. Everything is being improved for multithreaded parallelism, so coding needs to take into account compilers/languages, their optimizations, operational semantics, functional languages, extensions, higher-order functions, polymorphism, non-determinism and so on that exploit parallelism best. Maybe have a look into lambda calculus too.

    As for coding, my coding is v.weak now so I'm reserved in what I say, I have no interest in it nor the time. I quit in late 2005 and have not touched it since then apart from Firefox/Thunderbird related coding and whatever comes with its debugging.

    EDIT: this might help you. QPi uses the most efficient algorithm I've tried, although there may well be better implementations of it out there: http://www.geocities.com/tsrmath/pi/piprogs.html#QPI
    Thank you very much for all of this information.

    Hmm, you must be mixing me up with someone else, since as far as I know, I haven't sent anyone anything related to ASM benchmarks, as I don't even know ASM well enough.

    Why don't you want to use external libraries? The clock() in time.h is AWFULLY inaccurate under heavy CPU load (the running time for my plasmas was over 14 seconds, whereas it should have stopped right after 10 seconds had passed), so there is no accurate way of measuring time with it.
    I have found a way to get the value returned by the high-precision timer of the processor.

    Code:
        #include <stdio.h>
        #include <windows.h>

        void benchmark() {
            const unsigned int calculations = 4294967295u; // UINT_MAX: number of loop iterations
            unsigned int a, b, c, d;
            LARGE_INTEGER start, end, ticks;

            a = b = c = d = 0;

            QueryPerformanceFrequency(&ticks);  // counter ticks per second
            QueryPerformanceCounter(&start);

            __asm {                             // MSVC 32-bit inline assembly
                mov edi, calculations

                mov eax, a
                mov ebx, b
                mov ecx, c
                mov edx, d

                $loop:
                    add eax, 1              // four independent 32-bit adds per pass
                    add ebx, 1
                    add ecx, 1
                    add edx, 1

                    dec edi
                    jnz $loop
            }

            QueryPerformanceCounter(&end);

            printf("%0.9f seconds\n", (double)(end.QuadPart - start.QuadPart) / (double)ticks.QuadPart);
        }
    I also apologize; it was spycpu who had sent me all of the information.
    He provided me with a benchmark that does SSE, SSE2, and 3DNow! operations, and returns results in MIPS (Millions of instructions per second)
    Gigabyte P35-DQ6 | Intel Core 2 Quad Q6700 | 2x1GB Crucial Ballistix DDR2-1066 5-5-5-15 | MSI nVIDIA GeForce 7300LE

  23. #48
    Xtreme Mentor
    Join Date
    May 2007
    Posts
    2,792
    Np
    Have a look at this, coding properly for multi-core, multi-processor systems is not easy: http://books.google.co.uk/books?id=q...l=en#PPA198,M1
    Maybe a bit simpler on loop transformations: http://en.wikipedia.org/wiki/Loop_optimization
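
    (Loosely related to the multi-core links above, and not taken from any of them: a bare-bones sketch of splitting one loop across two Windows threads. The workload, thread count and names are arbitrary.)
    Code:
        #include <stdio.h>
        #include <windows.h>

        typedef struct { unsigned int start, end; double partial; } chunk;

        DWORD WINAPI worker(LPVOID arg) {
            chunk *c = (chunk *)arg;
            double s = 0.0;
            unsigned int i;
            for (i = c->start; i < c->end; i++)
                s += 1.0 / (double)(i + 1);        /* arbitrary floating-point work */
            c->partial = s;
            return 0;
        }

        int main(void) {
            const unsigned int n = 100000000u;
            chunk c[2];
            HANDLE h[2];
            int i;

            c[0].start = 0;      c[0].end = n / 2;  c[0].partial = 0.0;
            c[1].start = n / 2;  c[1].end = n;      c[1].partial = 0.0;

            for (i = 0; i < 2; i++)
                h[i] = CreateThread(NULL, 0, worker, &c[i], 0, NULL);
            WaitForMultipleObjects(2, h, TRUE, INFINITE);     /* join both halves */

            printf("sum = %f\n", c[0].partial + c[1].partial);
            for (i = 0; i < 2; i++) CloseHandle(h[i]);
            return 0;
        }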

    This is basics, I expect you've already looked at it but if you haven't it may help: http://softwarecommunity.intel.com/a...s/eng/2589.htm

    And this is what MC/MP specialists say: http://www.electronicsweekly.com/Art...ng-experts.htm
    Last week, Chris Rowen, CEO of multi-processor specialist Tensilica, said: "The challenge of writing software for programming general purpose computing applications is generally recognised in the scientific computing community as the biggest single unsolved, and perhaps unsolvable, computing problem."

    Will Intel and AMD ever get there? "They'll suddenly realise where they've been going wrong and take the right approach and then they'll say they invented it", replied Robertson, "I suspect Intel are doing it already, they're just too embarrassed to admit it."

  24. #49
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    http://www.benchmarkhq.ru/english.html?/be_cpu.html

    MaxPI is a multithreaded PI calculator, using a different algorithm altogether. It is very interesting to run this on a Phenom (9600 BE, B2) ... core 2 is always slower (by almost 10%)


    Jack
    Attached thumbnail: AMD-790FX-9600BE-2.3G-200x11.5-800-5-MaxPI256K.JPG (241 views, 161.5 KB)
    One hundred years from now It won't matter
    What kind of car I drove What kind of house I lived in
    How much money I had in the bank Nor what my clothes looked like.... But The world may be a little better Because, I was important In the life of a child.
    -- from "Within My Power" by Forest Witcraft

  25. #50
    Xtreme Enthusiast
    Join Date
    Jun 2007
    Location
    Victoria, Australia
    Posts
    948
    Been following, thought I would subscribe.

    Good luck MuffinFlavored, hope it goes well... No requests here, don't know much about it, would like a nice GUI though :P

