MMM
Results 1 to 25 of 226

Thread: SuperPi on GPU, were going CUDA

Threaded View

  1. #11
    Xtreme Member
    Join Date
    Aug 2006
    Location
    Warsaw, Poland
    Posts
    148
    Quote Originally Posted by Neuuubeh View Post
    is calculating pi a process thats easily multi-thread'able?
    While using "the SuperPi" algorithm [Gauss-Legendre http://en.wikipedia.org/wiki/Gauss-Legendre_algorithm] it is absolutely multi-threadable. Other algorithms probably too.
    The reason is simple: there are only about 10 matematical operations each loop - the rest is computing it with custom-written methematical operations using very high precision [eg. adding 30MB to compute a result of simple addition ].

    I've done it a little while ago, looks like this: [FORTRAN77, oldschool , but original SuperPi was also written in FORTRAN ]

    Code:
    program pi_calc
          implicit none
          integer i
          real*8 a(1:21),b(1:21),t(1:21),p(1:21)
          real*8 pi      
          a(1)=1
          b(1)=1/sqrt(2.0)
          t(1)=0.25
          p(1)=1
    
          do i=1,20,1
            a(i+1)=(a(i)+b(i))/2
            b(i+1)=sqrt(a(i)*(b(i)))
            t(i+1)=t(i)-p(i)*(a(i)-a(i+1))**2
            p(i+1)=2*p(i)
            pi=(a(i)+b(i))**2/(4*t(i))
            write(*,*)i,pi
          enddo
    
          stop
          end
    "do ____ enddo" - loop of course, the main algorithm
    "**" - power
    output:
    1 2.91421352106
    2 3.14057922577
    3 3.14159262165
    4 3.14159262902
    5 3.14159262902
    6 3.14159262902
    .............................................

    That program without custom-written mathematical formulas only calculates ~15 digits and in fact not all digits are correct, the accuracy is not so great . But with that formulas it is a "classic SuperPi" - 1M with 19 iterations etc .
    As we can see the idea is really simple, only ~10 mathematical operations per loop.
    The thing is to implement mathematical operations which can handle veeeery accurate precision [30M of digits or as we like].

    So there is:
    addition [add bits with carry, simple and multithreadable with some error margin [if carry will happen too many times ], division by 2 [simple binary "shift left"] - very fast in fact, not worth optimizing;
    square root - I don't know the algorithm, but this is probably the slowest in that program - if it really is slowest hmm... I dunno if this is easy multithreadable
    multiplication - probably faster than sqrt but non-comparable slower than addidtion for sure - the simple algorithm is as we all know to multiplicate in row and add - it will be very time-eating [30MB * 30MB means ~10^12 operations, I dunno if it is coded that way, probably there is faster algorithm, because 10^12 cycles in 1 second means a terahertz] - this is absolutely multithreadable [perfect scaling I can say]. IIRC 32-bit CPU can multiply 32-bit * 32-bit in 6 cycles which will be more real, but still somehow slow. No, wait - there must be faster algorithm [or it is not the most time-eating], because with doubling the accuracy we are slowing the calculations a little more than twice. Or maybe there is some trick to increase number of digits with each loop, but I don't think so [every loop takes +- the same amount of time].
    division - also not that simple to write with threads ["not that simple" means "I don't know how to do it" ]

    So the main problems: write a multi-threaded sqrt and division [power is not that demanding as it is only ^2 which in fact means multiplication].

    Sorry if that was too long and I bored You , but I personally find it interesting to know what it is all about [especially when we are benching it hours and hours, and all we can see are those 19-24 loops, nothing more .
    Well, we all can now at least understand what means "not convergent in sqr" - we know what sqrt it is and why it should be convergent .
    Last edited by JMKS; 05-26-2008 at 12:03 PM. Reason: indentations, [code]

Bookmarks

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •