I ran two tests, one with 36 threads and one with 12 threads, both at the same hardware settings. With 36 threads the daily point production is at least 12% lower than with 10 GPU threads. The maximum is actually reached with 12 threads, which means you dedicate a full CPU core to each GPU thread. That gives me at least 8% more than a 10+2 configuration.
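To put the two percentages on one scale, here is a rough sketch that normalizes everything to the 10+2 configuration. It assumes that the "10 GPU threads" figure and the "10+2" figure refer to the same baseline run, which is my reading of the numbers and not something measured separately. The variable names are made up for illustration.

```python
# Relative daily output, normalized to the 10+2 configuration (= 1.00).
# Assumption: "10 GPU threads" and "10+2" are the same baseline run.
baseline = 1.00

rel_12 = baseline * (1 + 0.08)  # 12 threads: at least 8% above the baseline
rel_36 = baseline * (1 - 0.12)  # 36 threads: at least 12% below the baseline

# Since rel_12 is a lower bound and rel_36 an upper bound,
# this ratio is a lower bound on the gap between the two tests.
gap = rel_12 / rel_36 - 1

print(f"12 threads vs 36 threads: at least {gap * 100:.1f}% more output")
```

Under that assumption, 12 threads comes out at least about 22.7% ahead of 36 threads.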
The reason is that HCC WUs are still CPU intensive. The CPU is heavily used at the end of each diffraction image, which means twice per WU, and on top of that comes all the logistics: loading and unloading the WUs on the GPU, the PC's network functions, and so on. Going above 12 threads could give better results, but only if you were able to define the start time of each WU, so as to avoid collisions where multiple WUs use CPU resources at the same time.
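Just to illustrate the start-time idea, here is a toy sketch of how evenly staggered starts would keep the CPU bursts apart. It is not something the client lets you do today. It assumes every WU takes the same time and that the two CPU bursts fall at the halfway point and at the end of the WU; the function names and the 240-second duration are hypothetical.

```python
def staggered_starts(n_threads: int, wu_seconds: float, bursts_per_wu: int = 2) -> list[float]:
    """Return start offsets (seconds) that spread the CPU bursts evenly.

    Assumes equal-length WUs with equally spaced CPU bursts, so the
    burst pattern repeats every wu_seconds / bursts_per_wu.
    """
    burst_period = wu_seconds / bursts_per_wu
    step = burst_period / n_threads
    return [i * step for i in range(n_threads)]


def burst_times(start: float, wu_seconds: float, bursts_per_wu: int = 2) -> list[float]:
    """Times at which one WU started at `start` hits its CPU-heavy phases."""
    burst_period = wu_seconds / bursts_per_wu
    return [start + (k + 1) * burst_period for k in range(bursts_per_wu)]


# Hypothetical example: 12 GPU threads, 240 s per WU.
starts = staggered_starts(12, 240.0)
for thread, start in enumerate(starts):
    print(thread, start, burst_times(start, 240.0))
```

With these made-up numbers each thread starts 10 seconds after the previous one, so no two WUs reach a CPU-heavy phase at the same moment. In practice WU run times vary, so the offsets would drift and this only shows the principle.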
The reason I am using the 10+2 configuration is that if there is a shortage of GPU WUs and the cache runs empty (it has already happened), I avoid having a powered but idle machine: the CPU crunches at least two WUs, which is better than nothing.
Last edited by EtaCarinae; 03-25-2013 at 12:26 AM.