Steve - you're very welcome. And same to you - Happy Holidays and a Happy New Year to you and your family as well. I wish you health, fortune, and happiness for 2019.
(Yeah, I've been on this forum for quite some time, although I've also been inactive for quite a while. As life would have it, I've been busy. Family. Kid. Work. Same 'ol, same 'ol.)
(That, plus my hardware acquisitions keep me pretty busy testing configurations, not unlike what you're trying to do.)
My experience with overclocking (see my thread here asking about overclocking the Core i7-3930K from its stock 3.2 GHz to 4.5 GHz 24/7, and subsequently killing at least one of its six cores as a result) taught me that overclocking is great for short-term gains, but long term, I don't recommend it.
It'd be different if we were made of money and could just replace hardware whenever it fails, but that's generally not the case. At least for me, it wasn't. I'm still running 3930Ks some 7 years after that processor launched. My daily driver was originally built in 2011. And my "new" hardware is pretty much all off eBay. (Which can be GREAT!)
So given that, longevity is now my top priority, above overclocking, because I can't keep pumping money into my own stupidity.
Hence, I'm not an advocate for overclocking. It's great for setting world records and short-term runs, but for the stuff that I do (computational fluid dynamics), a single run can last 42 days. Straight. So a hardware failure during that time is VERY, VERY bad.
(One of my more recent smoothed-particle hydrodynamics (SPH) simulation runs got HALF-way through in three months of running straight. I stopped the analysis early.)
My take on overclocking: it's great for super-short runs, but for the kind of work I do, it ultimately ends up being a detriment, so I don't bother with it.
I speed up the parts that I can within the limits of the hardware and try to make everything run within those constraints/limitations.
re: Hyperthreading
Again, YMMV.
Some programs will show up to about a 7% performance increase. Most average a 0% difference, +/- about 2 to 2.5%. So it's not great. But it's also not quite as bad as when Hyper-Threading first launched, when we were seeing penalties of 10% or more.
I typically don't run with Hyper-Threading enabled on any of my systems because trying to manage processor affinities is a giant pain and not particularly worth the effort. (Sidenote: if you do go through the steps of assigning processor affinities, that CAN help speed things up, again with varying results, because it avoids the cache penalties of core migration. But having to set it and reset it every time is a pain, so I generally don't bother.)
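For what it's worth, on Linux you can script the affinity step instead of clicking through Task Manager every time. A minimal sketch using Python's standard library (`os.sched_setaffinity` is Linux-only; the core IDs are illustrative):

```python
import os

# 0 means "the calling process"; on a real box you'd pass the cruncher's PID.
print("allowed before:", sorted(os.sched_getaffinity(0)))

# Pin this process to core 0 only (core 0 always exists).
os.sched_setaffinity(0, {0})
print("allowed after:", sorted(os.sched_getaffinity(0)))
```

Run once at cruncher start-up and you don't have to re-set it by hand, though it still won't survive a reboot of the cruncher process itself.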
re: RAM
More RAM is usually better. The only times when it isn't are:
1) When a specific configuration of RAM forces the memory to run at suboptimal speeds (e.g. quad-ranked DIMMs forcing the memory controller to drop to DDR3-800 instead of DDR3-1600).
2) Mixed or mismatched types/speeds/timings, which cause everything to run slower, due in part to synchronization issues.
So if you have 4x 8GB available, I'd recommend using that. Your cruncher is unlikely to actually need all of it, but depending on what you're running for your file server, you might. (e.g. ZFS is notoriously RAM-heavy, due in part to its caching and background scrubbing abilities.)
re: SSD
Samsung 960 Pro is fine.
Really, most SSDs will be plenty. Most crunchers (with a few minor exceptions, sometimes only within a project's beta testing) won't send you a large volume of data. Each work unit is usually small, by the very principle of distributed parallel computing. Even with "large" data, since you have to send the results back to their server, and since they can't know the speed of your connection, they have to keep the work units from being too big; otherwise you'd spend more time uploading the results than crunching them. The project teams have designed things to balance this.
(I was on the Folding@Home beta team for quite some time because, for a while, I had advanced hardware relative to my peers, so I was able to test bigger work units for them. Now they're GPU-dominant, and most of my stuff isn't GPGPU-capable, which has limited how much I can contribute with pure CPU-based hardware.)
re: GPU
Again, someone more knowledgeable in this area might be able to speak to it better, but it will also be project-dependent.
The better metric that I tend to use for relative comparisons is floating-point operations per second (FLOPS), in single and/or double precision.
The higher the number in those metrics, the better. Some GPU crunchers use double precision, so for those you want a card with high double-precision performance. Most other crunchers will be single precision (e.g. the cancer-type stuff, as was suggested). But if you do any of the distributed AI computing, they're now using what is basically 4x4 half-precision FMA (read: "tensor cores"), so that's becoming a new metric to look out for if you want to get into that game. (More traditional single-precision cores CAN execute 4x4 half-precision FMAs, but SIGNIFICANTLY slower, because tensor cores are specialized hardware built expressly for this class of problems.)
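If it helps, the FLOPS numbers on spec sheets come from simple arithmetic: cores x clock x FLOPs per core per cycle (an FMA counts as 2 FLOPs). A back-of-the-envelope sketch; the core count, clock, and the 1/32 FP64 ratio below are made-up illustrative numbers, not any specific card's spec:

```python
# Theoretical peak FLOPS = cores * clock * FLOPs-per-core-per-cycle.
# One fused multiply-add (FMA) counts as 2 floating-point operations.
def peak_gflops(cores, clock_ghz, flops_per_cycle=2):
    return cores * clock_ghz * flops_per_cycle

fp32 = peak_gflops(2560, 1.6)        # hypothetical 2560 shader cores @ 1.6 GHz
fp64 = peak_gflops(2560 // 32, 1.6)  # many consumer cards run FP64 at ~1/32 rate
print(f"FP32 ~{fp32:.0f} GFLOPS, FP64 ~{fp64:.0f} GFLOPS")
```

That 1/32 FP64 ratio is why a cheap gaming card can look great for single-precision projects and terrible for double-precision ones.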
re: OS
If that's what you have, go with that.
re: everything else
Please don't take any of this as a reason NOT to do something just because the benefits might be limited for the goals you have in mind.
Messing with hardware can be fun, but it can also be EXTREMELY frustrating (as I found out when I started messing with my InfiniBand network adapters; it turned out that one of the four adapters I had ordered was DOA).
But I will also say this: IF this machine is going to pull double duty as both a cruncher and a file server, and you don't have some kind of CPU offload (e.g. TCP offload onto the network interface card), transfers to and from the server can be very slow while it is crunching, because network traffic (usually) has to go through core 0 of your CPU. If that core is busy crunching, data will only trickle in and out. So keep/bear that in mind.
(And now you're optimizing between using (n-1) cores for crunching vs. (n-2) cores vs. something else (e.g. RDMA/TCP offload).)
I used to have my file servers crunch too. Except my file servers were designed around low-power, slow processors, since they're just dummy servers, and transfers would slow to a crawl (~5 MB/s on a gigabit Ethernet network, which should be capable of about 116 MB/s peak). The moment I turned off the cruncher, the transfer speeds recovered. The moment I turned the cruncher back on, they slowed back down again.
So I just kept the cruncher off, made something else do the crunching, and let the dummy file server be the dummy file server.
(Course, now that I am using NAS appliances, I've completely offloaded the file server task onto an entirely different device altogether.)
But I wanted to bring this to your attention because it can and likely will happen unless you put mitigation and management tools (and hardware) in place to deal with it; otherwise, things might not turn out the way you'd hoped.
Thanks.
*edit*
re: "I kinda understood a lot of it but i not going to pretend that I got it all."
a) Story of my life.
b) It's okay to not know things. Learning is fun IMO.
c) When in doubt, ask. There are very few stupid questions, and the stupidest ones of all are the ones that aren't asked.
Some people here might know me a little bit to know that I've been doing this stuff for quite a long time, with 99% of it out of sheer necessity. So...
And there will always be people who know more about this stuff than even I do. The difference (usually) is that it's their job to know. I do it because I have (or had) to, to support the needs of whatever else I am or was working on. (I've scared and surprised a few sysadmins on account of that before.)
re: "(The QPI on my server (with first gen E5-2690 processors) is capable of 32 GB/s (256 Gbps) vs. SATA 6 Gb/s vs. PCIe 3.0 x16 15.75 GB/s (128 Gbps) link.
seems like i have a LOT to work towards, but i love projects like this, that i can expand my horizons and knowledge and do what i love most. Mess with hardware.."
Actually, your system might have a faster main bus than my systems do. But it also depends on how well you can make use of it. I doubt that any distributed computing project is going to be nearly as strenuous as the simulations I run myself, so it's not likely that any of that will be a limiting factor for crunching performance. Distributed computing projects tend NOT to be disk-I/O heavy, so I wouldn't worry about that too much.
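Just to put the buses you listed on a common unit (GB/s, one direction) — a quick sketch; the QPI and PCIe numbers are the ones from your quote, and the SATA figure accounts for SATA III's 8b/10b line encoding:

```python
# Comparing the quoted interconnects in GB/s (one direction).
qpi_gb_per_s = 32.0                 # QPI, as quoted for the dual E5-2690 box
pcie3_x16_gb_per_s = 15.75          # PCIe 3.0 x16
# SATA III: 6 Gb/s line rate * 8/10 usable bits (8b/10b) / 8 bits-per-byte
sata3_gb_per_s = 6 * (8 / 10) / 8   # = 0.6 GB/s (~600 MB/s)
print(f"QPI {qpi_gb_per_s} vs PCIe 3.0 x16 {pcie3_x16_gb_per_s} "
      f"vs SATA III {sata3_gb_per_s} GB/s")
```

The takeaway: the storage link is one to two orders of magnitude slower than the CPU/GPU interconnects, which is fine, because crunching workloads barely touch it.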
Again, the stuff that I normally run is more along the lines of "real" HPC stuff, so the demands are greater. Distributed computing is meant to be broken down so that the average computer can perform those tasks.