Oh wow... I see you're creating and destroying threads during the computation cycle itself. In my own programming I've found it to be a good idea to create a fixed number of threads up front and then reuse them all to process pieces of work dispatched from a synchronous controller. That may or may not be practical or applicable to your particular algorithm, of course; I won't pretend to be familiar with your project. In any case, the thread counts I was seeing do make sense now: as few as 8 threads and as many as 60-some.
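To make the fixed-pool idea concrete, here is a minimal sketch in Python using the standard library's `ThreadPoolExecutor`. Everything specific in it is my own assumption for illustration: `NUM_WORKERS`, `process_chunk`, `run_cycle`, and the sum-of-squares stand-in for real work are all hypothetical, and it isn't meant to reflect how your project is actually structured.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 8  # hypothetical pool size; pick whatever suits the hardware

seen_threads = set()            # names of worker threads that did any work
seen_lock = threading.Lock()    # protects seen_threads


def process_chunk(chunk):
    """Stand-in for one piece of real work (hypothetical)."""
    with seen_lock:
        seen_threads.add(threading.current_thread().name)
    return sum(x * x for x in chunk)


def run_cycle(pool, data, chunk_size=1000):
    """Controller: split the data, dispatch the pieces, combine the results."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # Summing the iterator from pool.map waits for every chunk to finish.
    return sum(pool.map(process_chunk, chunks))


def main():
    data = list(range(10_000))
    # The pool is created once, outside the computation cycles, and reused.
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        totals = [run_cycle(pool, data) for _ in range(5)]
    return totals


if __name__ == "__main__":
    print(main())
    print(len(seen_threads), "worker threads used in total")
```

However many cycles run, the number of worker threads never goes above `NUM_WORKERS`, because the same workers are reused instead of being created and destroyed each cycle.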