
Thread: Thinking about EVGA's dual CPU mobo, but have questions...

  1. #1
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977

    Thinking about EVGA's dual CPU mobo, but have questions...

    I am thinking about building a new system using the EVGA dual socket mobo.

    1 thing bothers me so far that I am trying to get squared away in my mind....

    http://www.xtremesystems....hp?t=242204&page=3

    Post 65, and 66...

    lutjens is running a dual-CPU mobo (Tyan S7025) with (2) Xeon W5580 @ 3.20GHz CPUs.

    When he ran a ray-tracing Pinball game, his CPUs didn't load up to 90% or so...

    "Got 18 FPS, but only ~45% CPU usage. Is there something I'm missing to improve multithreading? Couldn't see any CPU usage affinity options in the game and I double checked affinity in Task Manager (all cores are selected).

    Edit: For something to do, I ran another instance of the game and did 14 FPS with both copies running simultaneously (FRAPS only displays in the active window, but the same 14 FPS was seen regardless of which instance of the game was selected). Usage with both instances active was 95-100%.
    Edit2: Ran the arauna bench as well (one cycle...its zzz time for me). Better scaling than the pinball game, but CPU usage varied widely, hitting anywhere from 20-80% usage across all cores. Anyhow, 7046 was the result..."


    For ray tracing, I consider having all CPUs run near their full potential a critical success factor.

    Why didn't they run with higher utilization, giving him more performance?
    These configs produce just as much:
    ajaidev ------------ i7 980X@ 4.60GHz ---- 21 FPS (32nm Hex Core - HT on)

    rge ----------------- i7 950@ 4.81GHz ---- 20 FPS (45nm Quad Core - HT on)

    AkRazor ------- Xeon W3520@ 4.41GHz ---- 19 FPS (45nm Quad Core - HT on)

    -=DVS=- ------------ i7 920@ 4.30GHz ---- 18 FPS (45nm Quad Core - HT on)
    RCG Bex ------------- i7 920@ 4.18GHz ---- 18 FPS (45nm Quad Core - HT on)

    He scored 7046 on the benchmark, but these almost did too...
    http://www.xtremesystems....howthread.php?t=242688
    #1 --- 6518 - rge -----------Core i7 950--------- 4600Mhz -- 4/8--
    #2 --- 6032 - Pyr0 ----------Core i7 920--------- 3800Mhz -- 4/8--

    I would want my dual CPU'ed mobo to produce more. Could be just an app thing, could be the way things are on all apps...
    Could be the GHz difference too on the CPU's?
    I don't know.

    The same seems to happen to Particle's dual Hex-core setup.

    "4000, Particle, Dual AMD Opteron 2427, 2380 MHz, 12/12

    http://www.pcrpg.org/pics/computer/arauna.png

    Note the low CPU utilization. I know that's a known issue, but I think a star or something would make it less embarrassing."

    Are the dual CPU'ed systems actually going to give us better performance?
    Last edited by Talonman; 01-07-2010 at 07:22 PM.
    Asus Maximus SE X38 / Lapped Q6600 G0 @ 3.8GHz (L726B397 stock VID=1.224) / 7 Ultimate x64 /EVGA GTX 295 C=650 S=1512 M=1188 (Graphics)/ EVGA GTX 280 C=756 S=1512 M=1296 (PhysX)/ G.SKILL 8GB (4 x 2GB) SDRAM DDR2 1000 (PC2 8000) / Gateway FPD2485W (1920 x 1200 res) / Toughpower 1,000-Watt modular PSU / SilverStone TJ-09 BW / (2) 150 GB Raptor's RAID-0 / (1) Western Digital Caviar 750 GB / LG GGC-H20L (CD, DVD, HD-DVD, and BlueRay Drive) / WaterKegIII Xtreme / D-TEK FuZion CPU, EVGA Hydro Copper 16 GPU, and EK NB S-MAX Acetal Waterblocks / Enzotech Forged Copper CNB-S1L (South Bridge heat sink)

  2. #2
    Xtreme X.I.P. Particle's Avatar
    Join Date
    Apr 2008
    Location
    Kansas
    Posts
    3,219
    It doesn't affect all applications. Sometimes when I'm doing an x264 encode, for instance, I see 95% utilization or so. I see it a lot in apps where the author didn't consider the ramifications of memory management for their threads. If you're running a thread, Windows bounces it around all the cores quite a bit. If you allocated memory when the thread was on CPU3 and then later use that memory when it's on CPU9, the memory has to be read on physical CPU0, transferred over to physical CPU1, and then used by logical CPU9. If an author is careful to minimize that cross-talk, they'll see better performance. I'm not saying that's the only issue that can hurt performance, but it's one I've seen in some apps. Most apps aren't optimized for multi-socket systems even if they're multithreaded.
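One way to picture the "keep the memory next to the thread that uses it" idea is to pin a worker to a set of CPUs first, and only then allocate and touch its buffer. This is a minimal sketch, not what any of the benchmarks in this thread do. The thread is about Windows, but the sketch uses the Linux-only `os.sched_setaffinity` call and silently does nothing elsewhere. `pin_to_cpus` and `worker` are hypothetical names, the 4 KiB page size is an assumption, and the claim that the OS then places the pages near those CPUs relies on the default local-allocation behavior that Linux typically uses.

```python
import os


def pin_to_cpus(cpus):
    """Restrict the caller to `cpus`.

    Returns the set actually applied, or None when the platform has no
    os.sched_setaffinity (it is not available on Windows or macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    target = set(cpus) & os.sched_getaffinity(0)  # only CPUs we may use
    if not target:
        return None
    os.sched_setaffinity(0, target)
    return target


def worker(cpus, n_bytes, page=4096):
    """Pin first, then allocate and touch the buffer from the same CPUs."""
    applied = pin_to_cpus(cpus)
    buf = bytearray(n_bytes)            # allocated after pinning
    for offset in range(0, n_bytes, page):
        buf[offset] = 1                 # touch one byte per (assumed 4 KiB) page
    return applied, sum(buf)
```

The order is the whole point: if the buffer were allocated before pinning, or the worker were allowed to migrate to the other socket afterwards, later accesses could end up crossing the socket link as described above.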
    Particle's First Rule of Online Technical Discussion:
    As a thread about any computer related subject has its length approach infinity, the likelihood and inevitability of a poorly constructed AMD vs. Intel fight also exponentially increases.

    Rule 1A:
    Likewise, the frequency of a car pseudoanalogy to explain a technical concept increases with thread length. This will make many people chuckle, as computer people are rarely knowledgeable about vehicular mechanics.

    Rule 2:
    When confronted with a post that is contrary to what a poster likes, believes, or most often wants to be correct, the poster will pick out only minor details that are largely irrelevant in an attempt to shut out the conflicting idea. The core of the post will be left alone since it isn't easy to contradict what the person is actually saying.

    Rule 2A:
    When a poster cannot properly refute a post they do not like (as described above), the poster will most likely invent fictitious counter-points and/or begin to attack the other's credibility in feeble ways that are dramatic but irrelevant. Do not underestimate this tactic, as in the online world this will sway many observers. Do not forget: Correctness is decided only by what is said last, the most loudly, or with greatest repetition.

    Rule 3:
    When it comes to computer news, 70% of Internet rumors are outright fabricated, 20% are inaccurate enough to simply be discarded, and about 10% are based in reality. Grains of salt--become familiar with them.

    Remember: When debating online, everyone else is ALWAYS wrong if they do not agree with you!

    Random Tip o' the Whatever
    You just can't win. If your product offers feature A instead of B, people will moan how A is stupid and it didn't offer B. If your product offers B instead of A, they'll likewise complain and rant about how anyone's retarded cousin could figure out A is what the market wants.

  3. #3
    Xtreme Addict
    Join Date
    Mar 2008
    Location
    Minnesota
    Posts
    1,653
    Applications that can make use of that many threads usually work better than that very badly multi-threaded ray tracing bench you linked.
    i5 2500K @ 4.9GHz+ 8GB G-Skill RipJaws DDR3-2000 @1600Mhz CAS 6 Asus P8P67 Pro CrossFire 6970's @ 950/1450
    Xeon X5677 @ 4.5Ghz 6GB G-Skill RipJaws DDR3-2000 @1600Mhz CAS 7 Gigabyte EX58-UD5 4870x2
    i7-880 @ 4.2Ghz+ (still playing) 4GB G-Skill RipJaws DDR3-2000 @2300Mhz CAS 9 Asus Maximus III Formula MSI Hawk 5770

  4. #4
    Xtreme Addict
    Join Date
    Jun 2007
    Posts
    1,442
    Quote: Originally posted by Talonman
    Are the dual CPU'ed systems actually going to give us better performance?
    If you look at poke349's multi-threaded pi thread (which maintains load in the 94-100% range), he does an apples-to-apples comparison of single CPU vs dual CPU in several runs, using the exact same benchmark.

    For example in the
    250,000,000 digits:

    110.530 - v0.4.2 x64 SSE3 - poke349 - 2 x Intel Xeon X5482 Harpertown @ 3.2 GHz

    201.449 - v0.4.2 x64 SSE3 - poke349 - Intel Xeon X5482 Harpertown @ 3.2 GHz

    So that is 201.449/110.530, or about 1.82x faster. The others he tested fell in the same 1.8-1.9x range, just shy of 2x the performance.

    It seems to depend on how well the app takes advantage of all available cores, but given that the ray tracing load varies from 30-80% whether on 1 CPU or 2, it isn't taking full advantage of even 1 CPU.
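For anyone who wants to redo the arithmetic on other runs, here is a small sketch. It assumes the two figures are run times where lower is better (one-socket run 201.449, two-socket run 110.530); the function names are just for illustration.

```python
def speedup(single_time, dual_time):
    """How many times faster the dual-CPU run finished."""
    return single_time / dual_time


def scaling_efficiency(single_time, dual_time, sockets=2):
    """Fraction of the ideal (sockets x) speedup that was achieved."""
    return speedup(single_time, dual_time) / sockets


s = speedup(201.449, 110.530)
e = scaling_efficiency(201.449, 110.530)
print(f"speedup {s:.2f}x, efficiency {e:.0%}")  # speedup 1.82x, efficiency 91%
```

Using the unrounded times gives about 1.82x, or roughly 91% of the ideal 2x.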

    Edit: god I am a slow typist or searcher... 2 responses came in by the time I posted mine... kind of like in the water cooling link where I posted skinees link... after he chimed in.
    Last edited by rge; 01-09-2010 at 10:16 AM.

  5. #5
    I am Xtreme
    Join Date
    Aug 2008
    Posts
    5,586
    Were all the operating systems the same? I like the comparison.


  6. #6
    Xtreme CCIE
    Join Date
    Dec 2004
    Location
    Atlanta, GA
    Posts
    3,842
    Dual proc boards do offer ~2x the performance of 1x processor boards with the same architecture/clocks, except they do obviously require more threads. If an application is not properly threaded then no, you won't see much of an increase. I would suspect that to be the case.
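The "only if it's properly threaded" part can be put into numbers with Amdahl's law. This is a rough sketch under stated assumptions: the parallel fractions below are made up for illustration (nobody in this thread measured them), and 8 threads per socket is simply what a quad core with HT would expose.

```python
def amdahl_speedup(parallel_fraction, n_threads):
    """Amdahl's law: speedup over a single thread."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_threads)


def second_socket_gain(parallel_fraction, threads_per_socket=8):
    """Speedup from doubling the thread count by adding a second socket."""
    one = amdahl_speedup(parallel_fraction, threads_per_socket)
    two = amdahl_speedup(parallel_fraction, 2 * threads_per_socket)
    return two / one


for p in (1.0, 0.99, 0.9, 0.5):  # hypothetical parallel fractions
    print(f"parallel fraction {p:.2f}: second socket gives "
          f"{second_socket_gain(p):.2f}x")
```

A perfectly parallel app gets the full 2x, while an app that is only half parallel gains just a few percent from the second CPU. Amdahl's law ignores NUMA and memory bandwidth effects, so treat this as an upper bound.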
    Dual CCIE (Route\Switch and Security) at your disposal. Have a Cisco-related or other network question? My PM box is always open.

    Xtreme Network:
    - Cisco 3560X-24P PoE Switch
    - Cisco ASA 5505 Firewall
    - Cisco 4402 Wireless LAN Controller
    - Cisco 3502i Access Point

  7. #7
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Particle is right.

    Most apps aren't written for NUMA memory access. (Non-Uniform Memory Access)

    Prior to Gainestown (i7-based Xeons), the Intel dualies were all uniform access. (including my dual X5482s).

    Writing for NUMA is an absolute nightmare... You need to control exactly where all your threads run and know where all your memory is, and as such, optimizations become almost entirely hardware-specific.
    Needless to say, I don't have the resources to write and test my benchie on NUMA for now...


    But anyhow, NUMA and the memory thrashing that it can cause have nothing to do with low CPU consumption. Time a core spends waiting on memory still shows up to the OS as "CPU consumption".
    You'll notice NUMA penalties in your actual performance scaling, not in your CPU consumption.
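A quick way to see the difference between "blocked" and "busy" from the OS's point of view is to compare wall-clock time with CPU time. This is only a sketch: `time.sleep` stands in for a thread waiting on a lock or an empty work queue, and the spin loop stands in for a core that is occupied (whether doing useful work or stalled on memory, the OS accounts for it the same way). The function names are hypothetical.

```python
import time


def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent in fn()."""
    wall0 = time.perf_counter()
    cpu0 = time.process_time()
    fn()
    return time.perf_counter() - wall0, time.process_time() - cpu0


def blocked(seconds=0.2):
    # The thread is descheduled: it counts as idle in Task Manager.
    time.sleep(seconds)


def busy(seconds=0.2):
    # The thread occupies a core the whole time: it counts as CPU usage.
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        pass


for fn in (blocked, busy):
    wall, cpu = measure(fn)
    print(f"{fn.__name__}: wall {wall:.2f}s, cpu {cpu:.2f}s")
```

So a ray tracer sitting at ~45% usage most likely has threads that are blocked (waiting on each other or on work), which is a threading problem rather than a NUMA one.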


    EDIT:
    Quote: Originally posted by rge
    If you look at poke349's multi-threaded pi thread (which maintains load in the 94-100% range), he does an apples-to-apples comparison of single CPU vs dual CPU in several runs, using the exact same benchmark.

    For example in the
    250,000,000 digits:

    110.530 - v0.4.2 x64 SSE3 - poke349 - 2 x Intel Xeon X5482 Harpertown @ 3.2 GHz

    201.449 - v0.4.2 x64 SSE3 - poke349 - Intel Xeon X5482 Harpertown @ 3.2 GHz

    So that is 201.449/110.530, or about 1.82x faster. The others he tested fell in the same 1.8-1.9x range, just shy of 2x the performance.

    It seems to depend on how well the app takes advantage of all available cores, but given that the ray tracing load varies from 30-80% whether on 1 CPU or 2, it isn't taking full advantage of even 1 CPU.
    Here's some slightly more detailed data that I've run:

    Core i7:
    http://www.numberworld.org/y-crunche...4.3/ushio.html
    2 x Xeon X5482
    http://www.numberworld.org/y-crunche....3/nagisa.html
    Last edited by poke349; 01-09-2010 at 11:39 AM.
    Main Machine:
    AMD FX8350 @ stock --- 16 GB DDR3 @ 1333 MHz --- Asus M5A99FX Pro R2.0 --- 2.0 TB Seagate

    Miscellaneous Workstations for Code-Testing:
    Intel Core i7 4770K @ 4.0 GHz --- 32 GB DDR3 @ 1866 MHz --- Asus Z87-Plus --- 1.5 TB (boot) --- 4 x 1 TB + 4 x 2 TB (swap)

  8. #8
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    Thanks boys...

    I feel better, and do appreciate the info.

  9. #9
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    Bay Area, California
    Posts
    705
    Quote: Originally posted by Talonman
    Thanks boys...

    I feel better, and do appreciate the info.
    Gainestown dualies are NUMA...
    However, the penalty for accessing the wrong processor's memory is very tiny...

    I read somewhere that for Gainestown, accessing the other socket's ram has "only" 30% more latency than accessing local ram...
    In other words... You probably won't feel it.
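To put a rough number on "you probably won't feel it", here is a back-of-the-envelope sketch. The 30% remote penalty is the figure recalled above, not a measurement, and the 50% remote share is an assumption for a thread that gets bounced evenly between two sockets. Latency is in relative units with local RAM = 1.0.

```python
def average_latency(remote_fraction, remote_penalty=0.30, local_latency=1.0):
    """Average memory latency when some accesses go to the other socket."""
    local_part = (1.0 - remote_fraction) * local_latency
    remote_part = remote_fraction * local_latency * (1.0 + remote_penalty)
    return local_part + remote_part


for share in (0.0, 0.5, 1.0):  # assumed fraction of remote accesses
    print(f"{share:.0%} remote -> {average_latency(share):.2f}x local latency")
```

Even in the worst case, the average latency tops out at 1.3x, and only the part of the run time that is actually waiting on RAM gets stretched by that amount.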

  10. #10
    Xtreme Mentor
    Join Date
    Sep 2007
    Location
    Ohio
    Posts
    2,977
    More good trivia... thanks.

    Still deciding which 2 CPUs to pick up. My top price would be 500 bucks each...

    (I can't build this for another month, so hopefully prices will come down on CPU's a tad...)

    I also will want a new Fermi approved PSU, and Mountain Mods case I guess? Might go with a work-station rather than a case. Undecided.

    Probably a new triple rad too. My WaterKegIII Xtreme will be used to cool the mobo only.
    Last edited by Talonman; 01-09-2010 at 01:56 PM.
