
Thread: AMD FX "Bulldozer" Review - (4) !exclusive! Excuse for 1-Threaded Perf.

  1. #11
    Xtreme Member
    Join Date
    Nov 2007
    Posts
    103
    Quote Originally Posted by son14 View Post
    The answer could be simple.
    That answer doesn't address the question, because he was talking about 4CU/8C vs. 4CU/4C, which is irrelevant for games with 4 threads or fewer.
    So, let's go back to 4CU/4C vs. 2CU/4C.

    If you use 2CU/4C, there are 2 floating point units available; in 4CU/4C mode there are 4. So a task using 4 cores gets access to 4 vs. 2 FPUs. The performance gain in 50% int and 50% FPU scenarios is up to 25%. Xbox-ported games mostly use 2 cores, so they get access to 2 FPUs in both scenarios and you see a performance gain of only about 5%, from the doubled cache per core.
    You're searching for the answer in the right direction, floating-point wise. Let's not forget, though, that this FPU is 2-way SMT, with double the resources of K10's FPU: it's 2xFMAC vs. 1xFADD+1xFMUL, an FMAC being 1xFADD+1xFMUL combined. The thing is that all of this computing power can only be fully utilized with FMA code, because an FMAC unit can't start an independent FADD and FMUL in the same cycle; it can only do both when they come as a single FMA instruction.

    Thus, for legacy code, 4CU/4C mode means each hw thread has double the resources available all the time (not shared with a second thread), so it can start independent FADDs and FMULs in the same cycle on the two FMAC units. (An extra bonus is that it can also start 2xFADD or 2xFMUL, not just 1xFADD+1xFMUL as on K10.) In 2CU/4C mode, each hw thread can usually start only 1xFADD OR 1xFMUL per cycle, because most of the time only one FMAC unit is available while the second thread occupies the other one.

    (And so, I think there won't be such a significant gain in performance for FMA-heavy code going from 2CU/4C to 4CU/4C. EDIT: Or, even if there is, both cases will be significantly faster than the same algorithm with legacy code.)
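A rough way to sanity-check the issue-slot argument and the quoted "up to 25%" figure for legacy (non-FMA) code is a toy model like the one below. It is only a sketch under stated assumptions: the function names are made up for illustration, each thread in a fully loaded CU is assumed to get one FMAC pipe on average, and the 25% is read here as a reduction in run time, which is one possible interpretation of the quote.

```python
# Toy model only, not a measurement.
# Assumptions taken from the post: each compute unit (CU) has 2 FMAC pipes,
# and with legacy code a pipe starts one FADD or one FMUL per cycle.
# Extra assumption: when both cores of a CU are busy, each thread gets
# one pipe on average.

FMAC_PIPES_PER_CU = 2


def fp_starts_per_cycle(threads_in_cu: int) -> float:
    """Legacy FP instructions one thread can start per cycle (peak)."""
    return FMAC_PIPES_PER_CU / threads_in_cu


def relative_runtime(fpu_fraction: float, fp_speedup: float) -> float:
    """Amdahl-style estimate: only the FP share of the work is sped up."""
    return (1.0 - fpu_fraction) + fpu_fraction / fp_speedup


shared = fp_starts_per_cycle(threads_in_cu=2)     # 2CU/4C: two threads per CU
dedicated = fp_starts_per_cycle(threads_in_cu=1)  # 4CU/4C: one thread per CU
fp_speedup = dedicated / shared

runtime = relative_runtime(fpu_fraction=0.5, fp_speedup=fp_speedup)
print(f"FP starts/cycle/thread: 2CU/4C={shared}, 4CU/4C={dedicated}")
print(f"Relative run time, 50% int / 50% FP: {runtime:.2f}")
```

Under these assumptions the FP-bound half of the work runs twice as fast and the total run time drops to 0.75 of the 2CU/4C case. Real workloads will land below that peak, since memory, cache and front-end limits are ignored here.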

    ...So a task using 4 cores...
    (I think that's not the best wording, because apps with fewer than four threads can also end up "using 4 cores", since Windows constantly moves threads between cores. So it's about how many threads an application has.)

    The problem is Dirt, F1, Civ5, Crysis 2, ... Northbridge overclocking and 4CU/4C can be the difference between playable or not.
    Definitely worth testing.

    Quote Originally Posted by The Stilt View Post
    It is a design limitation as far as I know.
    You can run the compute units either in single or dual core mode but not at the same time.
    So, you're saying that if we enable the 2nd core here, then we can't disable any other core individually within its CU?

    Quote Originally Posted by omninmo View Post
    Oh, and by the way folks... but this might be a decent solution to allow us to manage our threads the way we like them and give us the extra flexibility of deciding if we want 4c/2cu, 4c/4cu or 8c/4cu on an app to app basis WITHOUT much hassle and without having to reboot!

    http://img.tomshardware.com/us/2004/...taskassign.zip

    just set your turbo OCs accordingly and setup each game in the way it will benefit you the most? win/win?

    if someone could please test this and see if it plays nice with BD and let the rest of us know, would be grateful!
    Wow, I'd forgotten about this little utility. Perhaps it could be enhanced and adapted to BD, so that all of this could work fully automatically! That would be even more useful!
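To make the per-app idea concrete, here is a minimal sketch of how affinity masks for the three modes could be built. The helper name is hypothetical, and the core numbering is an assumption: logical CPUs 2k and 2k+1 are taken to be the two cores of compute unit k, which should be verified on the actual system before relying on it.

```python
# Hypothetical helper: build CPU affinity bitmasks for the modes discussed.
# ASSUMPTION: logical CPUs 2k and 2k+1 are the two cores of compute unit k.
# This numbering is not confirmed in the thread; check it on your own system.

def affinity_mask(compute_units: int, cores_per_cu: int) -> int:
    """Return a bitmask selecting `cores_per_cu` cores in each of the
    first `compute_units` compute units."""
    if cores_per_cu not in (1, 2):
        raise ValueError("a compute unit has 1 or 2 usable cores")
    mask = 0
    for cu in range(compute_units):
        for core in range(cores_per_cu):
            mask |= 1 << (2 * cu + core)
    return mask


MODES = {
    "4c/2cu": affinity_mask(compute_units=2, cores_per_cu=2),  # CPUs 0-3
    "4c/4cu": affinity_mask(compute_units=4, cores_per_cu=1),  # CPUs 0,2,4,6
    "8c/4cu": affinity_mask(compute_units=4, cores_per_cu=2),  # CPUs 0-7
}

for name, mask in MODES.items():
    print(f"{name}: 0x{mask:02X}")
```

On Windows, a mask like this could then be handed to the launcher per game, for example via `start /affinity 55 game.exe` in cmd (where `game.exe` is a placeholder), so no reboot or BIOS change is involved. Whether the scheduler and Turbo behave as hoped in each mode would still need testing on BD.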
    Last edited by dess; 10-18-2011 at 06:15 AM.
