Results 126 to 150 of 272

Thread: Intel Xeon 5570: Smashing SAP records.

  1. #126
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    Quote Originally Posted by Shadowmage View Post
    The cores on the same silicon die are not connected using HT. My guess is that they have some dedicated coherency traffic buses. If what you said is correct, then AMD's processor diagram will have the L1, L2, and L3 caches all connected to some arbiter which is then connected to the HT port. Instead, note that L1 feeds into L2 which feeds into L3. In other words, like with Nehalem, the L3 handles all the coherency traffic - it just takes a little longer to process the data.


    AMD's caches have a pseudo-exclusive relationship (data cannot be found in two cache levels at the same time, although “pseudo” means that there are a few exceptions). Things get problematic when you have an L3 miss: you need to check what the other cores have in their caches.

    With an inclusive cache, an L3 miss guarantees that the data is not in the other caches, so a memory request is sent.

    Nehalem's inclusive relationship and the flag system it uses to maintain coherency give it little to no headaches about coherency traffic.
    AMD's caches burn more BW and latency on this problem, that's all.
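
As a rough illustration of the point above, here is a minimal Python sketch of which cores would have to be probed on an L3 lookup under each policy. It is a toy model, not either vendor's actual protocol: the function names, the `l3` dictionary layout and the per-line core flags are assumptions made up for this example.

```python
# Toy model only: real caches track this state per line in hardware.

def probes_on_l3_miss_exclusive(requesting_core, num_cores):
    """Exclusive-style L3: a miss says nothing about the private caches,
    so every other core has to be probed before going to memory."""
    return [c for c in range(num_cores) if c != requesting_core]


def probes_on_l3_lookup_inclusive(requesting_core, l3, address):
    """Inclusive L3 with per-line core flags.

    `l3` is a hypothetical mapping: address -> set of cores that may
    hold the line in their private L1/L2.
    """
    if address not in l3:
        # Miss: inclusion means no private cache can hold the line,
        # so the request goes straight to memory with no probes.
        return []
    # Hit: only the cores flagged for this line need to be probed.
    return sorted(c for c in l3[address] if c != requesting_core)


l3 = {0x1000: {2}}  # line 0x1000 is flagged as held by core 2 only
print(probes_on_l3_miss_exclusive(0, 4))             # [1, 2, 3]
print(probes_on_l3_lookup_inclusive(0, l3, 0x2000))  # []
print(probes_on_l3_lookup_inclusive(0, l3, 0x1000))  # [2]
```

In this sketch the exclusive case always pays for three probes on a quad-core, while the inclusive case pays for none on a miss and one on a hit.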

    What's the cache coherency traffic like on Nehalem? It seems to me that AMD is interconnect-bound because they are still waiting for HT3. Intel has less of a problem because QuickPath is higher performance than AMD's current HT implementation.

    Your argument is pinpointing the problem on the wrong component. The traffic caused by the exclusive caches is reasonable. It's just that AMD needs a bandwidth improvement for their intra-chip interconnects.
    That is simply not true. AMD isn't bottlenecked by interconnects; it is precisely the coherency traffic which kills it. Huge amounts of BW are wasted on maintaining coherency.

    In a Nehalem multi-CPU system, you need to maintain the coherency of the L3s. Furthermore, Intel implemented a directory-based coherence protocol which is point-to-point instead of broadcast.

    Not so with the Opteron, because data in L1/L2 is more or less guaranteed not to be in the L3. They also use a snoop-based protocol in which the caches listen in on the transport of variables to any of the CPUs and update their own copies of these variables if they have them. Snooping logic in the processor broadcasts a message over the bus each time a word in its cache has been modified. The snooping logic also snoops on the bus looking for such messages from other processors.
    Since K8/K10 use 64-byte lines, can you imagine the traffic in a 4-socket system to maintain coherency? What about 8 sockets? Yeah, HT 3.0 will help, but it is a band-aid curing the symptoms by brute force (more BW) and not the disease (a better cache coherency protocol).
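
To put rough numbers on the scaling argument, here is a back-of-the-envelope Python sketch that counts probe messages per cache miss for a broadcast scheme versus a directory-style scheme. The message model is a deliberate simplification and the function names are made up for this example; it does not describe the real HyperTransport or QuickPath protocols.

```python
def broadcast_probes(sockets):
    """Broadcast snooping: each miss probes every other socket."""
    return sockets - 1


def directory_probes(sharers):
    """Directory-style: one lookup at the home node, then one probe per
    socket that actually holds the line (simplified assumption)."""
    return 1 + sharers


for sockets in (2, 4, 8):
    # Assume, for illustration, that one other socket holds the line.
    print(sockets, broadcast_probes(sockets), directory_probes(sharers=1))
```

Under these assumptions the broadcast count grows linearly with the socket count (1, 3, 7), while the directory-style count stays at 2 as long as the number of sharers stays the same.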

    Why do you think Newisys tried to build Horus, a directory-based chipset, to overcome this?
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  2. #127
    Xtreme Enthusiast
    Join Date
    May 2008
    Posts
    612
    savantu:
    Inclusive vs exclusive cache
    Why are the L1 and L2 caches smaller on i7 compared to Deneb?
    What about redundant data in cache?
    What is the hit rate for the cache (associativity)?
    Do you have any thoughts about manufacturing yields comparing i7 and Deneb?

    There have been some discussions on the internet that Intel focused very much on servers with the i7 (performance); that was AMD's strong area, where Intel was behind. With the i7, Intel may have focused too much on server performance?

    What AMD has done seems to be the opposite (if desktop is the opposite). I think they have some good news for gamers etc. with this design for the Deneb. Maybe Intel knows this and that's why they are releasing this type of information (letting go of the NDA).
    The reason why they didn't focus on server performance could be that it is enough now. Companies don't need more CPU performance there; they want CPUs that draw less power, and maybe they focused on that area.
    Last edited by gosh; 12-19-2008 at 02:52 AM.

  3. #128
    Xtreme Addict
    Join Date
    Feb 2006
    Location
    northern ireland
    Posts
    1,008
    Quote Originally Posted by gosh View Post
    The reason why they didn't focus on server performance could be that it is enough now. Companies don't need more CPU performance there; they want CPUs that draw less power, and maybe they focused on that area.
    They will love i7 for servers then. They will be able to cut their power usage in half. Throw out those four-socket systems and replace them with duals. Lovely.

  4. #129
    Xtreme Enthusiast
    Join Date
    May 2008
    Posts
    612
    Quote Originally Posted by gallag View Post
    They will love i7 for servers then. They will be able to cut their power usage in half. Throw out those four-socket systems and replace them with duals. Lovely.
    The problem (as I see it) is that servers are on 24/7, and there aren't that many servers that need raw CPU performance. Fast disks are probably more important in general. Sometimes there might be high CPU loads, but that is probably not a common scenario.

  5. #130
    Xtreme Addict
    Join Date
    Feb 2006
    Location
    northern ireland
    Posts
    1,008
    Quote Originally Posted by gosh View Post
    The problem (as I see it) is that servers are on 24/7, and there aren't that many servers that need raw CPU performance. Fast disks are probably more important in general. Sometimes there might be high CPU loads, but that is probably not a common scenario.
    Ahhh, so now server CPU performance is not important. Well, it can just join all those other things that became unimportant as soon as Intel excelled in them.

    Why does AMD do so well in server land atm and what will change to make cpu performance negligible once i7 comes to the server sector?

  6. #131
    Xtreme Enthusiast
    Join Date
    May 2008
    Posts
    612
    Quote Originally Posted by gallag View Post
    Ahhh, so now server CPU performance is not important. Well, it can just join all those other things that became unimportant as soon as Intel excelled in them.

    Why does AMD do so well in server land atm and what will change to make cpu performance negligible once i7 comes to the server sector?
    I only explained what I think is important, and the same was important "yesterday" too. Power consumption has always been important for servers, but it is much more important today. I read something about AMD's server CPU sales: even with the current Opteron they sell more of the CPUs that draw less power at lower speed than of the faster, more power-hungry CPUs (if what I read was right).

    If you read forums where people are looking for home servers etc., they always list power consumption as one very important attribute. My own server draws 52 watts at idle.

  7. #132
    I am Xtreme
    Join Date
    Jul 2007
    Location
    Austria
    Posts
    5,485
    Quote Originally Posted by gosh View Post

    If you read forums where people are looking for home servers etc., they always list power consumption as one very important attribute. My own server draws 52 watts at idle.
    I just burst out laughing when I read that sentence...

  8. #133
    Xtreme Addict
    Join Date
    Feb 2006
    Location
    northern ireland
    Posts
    1,008
    All we have heard for the last few years is that all that matters is server performance. It was, so to speak, the jewel in AMD's crown. It just seems that every time Intel dominates a new performance metric, the goalposts shift on how important it is. It seems that since the launch of i7 all that matters is GPU-limited gaming performance.

  9. #134
    Xtreme Enthusiast
    Join Date
    May 2008
    Posts
    612
    Quote Originally Posted by gallag View Post
    All we have heard for the last few years is that all that matters is server performance. It was, so to speak, the jewel in AMD's crown.
    It was performance/watt. And if you go back, CPUs then weren't that fast compared to CPUs today (the new Opteron is of course faster than the old one).

  10. #135
    I am Xtreme
    Join Date
    Jul 2004
    Location
    Little Rock
    Posts
    7,204
    Quote Originally Posted by gosh View Post
    It was performance/watt. And if you go back, CPUs then weren't that fast compared to CPUs today (the new Opteron is of course faster than the old one).
    Yet still slower and has been slower for almost 2 years now. AMD's main advantage was Bandwidth from SOC and Point to Point. Dual Socket was lost to Xeon-Bensley as I showed you when you were Duby229. Not only was the Intel Platform Faster, but it drew less power as well.

    New Opterons are faster than the old ones but still slower than their Intel counterparts. Power draw will make some, maybe a few, pause, but others will still get power savings from using Nehalem. Why? Look at the power used over the time it took to finish the project. Let's say you're right: you draw less power, but the job takes longer. Over the whole project, the longer run time can mean you used more energy anyway.
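
The argument is simply energy = average power × time. A small Python sketch with made-up numbers (they are hypothetical and not measurements of any Opteron or Nehalem system):

```python
def energy_wh(avg_power_watts, hours):
    """Energy consumed over a job, in watt-hours."""
    return avg_power_watts * hours


# Hypothetical systems: one draws less power but needs longer for the job.
slow_low_power = energy_wh(avg_power_watts=300, hours=10)
fast_high_power = energy_wh(avg_power_watts=400, hours=6)

print(slow_low_power, fast_high_power)  # 3000 2400
```

With these assumed figures the faster box draws a third more power yet finishes the job on 20% less energy.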

    http://www.reuters.com/article/techn...24573820080708

    DreamWorks picks Intel over AMD for chip supply

    They used to use AMD processors exclusively. A friend told me about DW testing a Nehalem long before Intel let the rest know about it. Just like with Apple, maybe they will tell us just how long ago that was. Please don't say DW were duped by Intel's clever marketing.

    http://tech.blorge.com/Structure:%20...t-in-the-cold/

    AMD’s dual-core Opteron processors will be replaced over the next 18 months – a move involving some 1,000 workstations and 1,500 server units. DreamWorks has agreed to buy Intel’s Nehalem 8-core processor for the high-end workstations, and Larrabee processors for the servers. Larrabee processors will have between 10 and 100 cores, according to Intel.
    "He said that the studio’s recent offering, Kung Fu Panda, had its final touches done on computers using both Intel and AMD processors."

    Maybe you know more than they do
    Last edited by Donnie27; 12-19-2008 at 08:02 AM.
    Quote Originally Posted by Movieman
    With the two approaches to "how" to design a processor WE are the lucky ones as we get to choose what is important to us as individuals.
    For that we should thank BOTH (AMD and Intel) companies!


    Posted by duploxxx
    I am sure JF is relaxed and smiling these days with there intended launch schedule. SNB Xeon servers on the other hand....
    Posted by gallag
    there yo go bringing intel into a amd thread again lol, if that was someone droping a dig at amd you would be crying like a girl.
    qft!

  11. #136
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    Quote Originally Posted by Stukov View Post
    Uh, the TLB was a bug from previous designs of L3 cache that was never fixed. It's Fudzilla but it was the first in google search http://www.fudzilla.com/index.php?op...=6242&Itemid=1
    DO NOT TRUST FUDzilla to get anything right. His post on this point, and any point about a TLB, is complete and total hogwash.
    One hundred years from now It won't matter
    What kind of car I drove What kind of house I lived in
    How much money I had in the bank Nor what my cloths looked like.... But The world may be a little better Because, I was important In the life of a child.
    -- from "Within My Power" by Forest Witcraft

  12. #137
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    Quote Originally Posted by gosh View Post
    savantu:
    Inclusive vs exclusive cache
    Why are the L1 and L2 caches smaller on i7 compared to Deneb?
    AMD always preferred larger L1s with low associativity; Intel went for small L1s with high associativity. I would assume the hit rate is higher on Intel's more highly associative caches.
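
To show why associativity can matter for hit rate, here is a small Python sketch of a set-associative cache with LRU replacement. The cache sizes and the access pattern are hypothetical and chosen to provoke conflict misses; they do not model the actual i7 or Deneb caches, and the sketch says nothing about the capacity advantage of a larger L1.

```python
from collections import OrderedDict


def hit_rate(addresses, num_sets, ways, line_bytes=64):
    """Simulate a set-associative cache with LRU replacement."""
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in addresses:
        line = addr // line_bytes
        index, tag = line % num_sets, line // num_sets
        cache_set = sets[index]
        if tag in cache_set:
            hits += 1
            cache_set.move_to_end(tag)         # mark as most recently used
        else:
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict least recently used
            cache_set[tag] = True
    return hits / len(addresses)


# Three lines accessed round-robin; all of them land in set 0 in both
# configurations below. Both caches hold 8 lines in total.
pattern = [line * 64 for line in (0, 4, 8)] * 100

print(hit_rate(pattern, num_sets=4, ways=2))  # 0.0
print(hit_rate(pattern, num_sets=2, ways=4))  # 0.99
```

With the same total capacity, the 2-way cache thrashes on this pattern while the 4-way cache misses only on the first pass. Real workloads are far less one-sided than this.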

    Based on extensive simulation, Intel chose a small but very fast L2. It had to be small because of the inclusive relationship with the L3. Results show that their approach is outstanding performance-wise.

    Even so, there were serious debates inside Intel over the size of the L2s; many advocated a larger one.

    http://www.realworldtech.com/page.cf...2808015436&p=1

    What about redundant data in cache?
    That's the drawback of the inclusive approach; using smaller L1/L2 caches is one fix for the problem. The other is to increase the size of the L3.
    What is the hit rate for the cache (associativity)?
    Intel's is generally higher, as can be seen from David Kanter's review.
    Do you have any thoughts about manufacturing yields comparing i7 and Deneb?
    They have the same die size. Given Intel's prowess in manufacturing, I'd assume their yields are better. I don't have hard data on this; it is just a hunch based on past performance.
    There have been some discussions on the internet that Intel focused very much on servers with the i7 (performance); that was AMD's strong area, where Intel was behind. With the i7, Intel may have focused too much on server performance?
    How so? From all reviews, Nehalem stomps the desktop world with ease, especially in multimedia benchmarks. It has superb all-around performance.
    Once graphics drivers are optimized for it, it will increase its lead in games.
    What AMD has done seems to be the opposite (if desktop is the opposite). I think they have some good news for gamers etc. with this design for the Deneb. Maybe Intel knows this and that's why they are releasing this type of information (letting go of the NDA).
    Gamers are 1% of the market. Barely relevant, and Deneb still has to prove that it can beat Kentsfield/Yorkfield at the same clock.
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  13. #138
    Xtreme Member
    Join Date
    Apr 2006
    Posts
    393
    Quote Originally Posted by gosh View Post
    It was performance/watt. And if you go back, CPUs then weren't that fast compared to CPUs today (the new Opteron is of course faster than the old one).
    Gosh, I know you have no clue about servers but that's ok.

    Everyone can now pick up dual socket Xeons to replace their quad socket Opterons and have same/better performance while using less than half the power.

    There have been some discussions on the internet that Intel focused very much on servers with the i7 (performance); that was AMD's strong area, where Intel was behind. With the i7, Intel may have focused too much on server performance?
    I don't get it. The i7 920 @ 2.66GHz is on par with a QX9650 @ 3GHz; how is that focusing too much on server performance? Going by all the reviews for Deneb, it will still be slower than the current Core 2 Quads. So what's your argument then? That AMD focused too much on server performance from the beginning?
    Last edited by Clairvoyant129; 12-19-2008 at 09:23 AM.

  14. #139
    V3 Xeons coming soon!
    Join Date
    Nov 2005
    Location
    New Hampshire
    Posts
    36,363
    Quote Originally Posted by Zucker2k View Post
    Come on Dave, don't be greedy. Let's all have a go at it with our very subjective purchases, unbridled, unfettered; doesn't get any real world than that. Anyway, I have a head start: http://www.xtremesystems.org/forums/...d.php?t=211079
    I wasn't being greedy. I only asked for one of each!
    Crunch with us, the XS WCG team
    The XS WCG team needs your support.
    A good project with good goals.
    Come join us,get that warm fuzzy feeling that you've done something good for mankind.

    Quote Originally Posted by Frisch View Post
    If you have lost faith in humanity, then hold a newborn in your hands.

  15. #140
    Banned
    Join Date
    May 2006
    Posts
    458
    Quote Originally Posted by gallag View Post
    They will love i7 for servers then. They will be able to cut their power usage in half. Throw out those four-socket systems and replace them with duals. Lovely.
    Sure they can, if they plan to run synthetic benchmarks 24/7 with HT on.

  16. #141
    Xtreme Cruncher
    Join Date
    Aug 2006
    Location
    Denmark
    Posts
    7,747
    Quote Originally Posted by DoubleZero View Post
    Sure they can, if they plan to run synthetic benchmarks 24/7 with HT on.
    Talking from first hand experience. That is not true. They really do have quad socket performance in dual socket.
    Crunching for Comrades and the Common good of the People.

  17. #142
    Xtreme Enthusiast
    Join Date
    May 2008
    Posts
    612
    Quote Originally Posted by Clairvoyant129 View Post
    Gosh, I know you have no clue about servers but that's ok.
    Everyone can now pick up dual socket Xeons to replace their quad socket Opterons and have same/better performance while using less than half the power.
    What type of server software needs this hardware speed? How many concurrent users (employees) do you think need to be using the server before it will have trouble handling the load?
    What is the idle power for i7?

    Quote Originally Posted by savantu View Post
    They have the same die size. Given Intel's prowess in manufacturing, I'd assume their yields are better. I don't have hard data on this; it is just a hunch based on past performance.
    Can Intel turn off one or more cores and sell it as X3, X2 etc. with this design?

    Quote Originally Posted by savantu View Post
    Nehalem stomps the desktop world with ease, especially in multimedia benchmarks. It has superb all-around performance.
    There are some big English sites that have shown good results. But if you read foreign reviews, the results are not as good (good but not super). It is very fast on memory-intensive applications. That is also needed for server software, not on desktops (latency is much more important on desktops).
    Last edited by gosh; 12-19-2008 at 12:15 PM.

  18. #143
    Registered User
    Join Date
    Dec 2007
    Posts
    66
    Obviously at 65nm AMD weren't able to give K10 sufficient die area to implement a fully inclusive L3 cache, but that isn't the case at 45nm and lower.
    I would assume comprehensive simulations were run to test such a design revision, but given that the change wasn't made, would it be reasonable to conclude that any performance gains were deemed not to warrant the required man-hours of engineering effort? Granted, the time-to-market pressure on Shanghai has been great, but the design choices would most likely have been made before Barcelona even launched.

  19. #144
    Banned
    Join Date
    Oct 2006
    Location
    Haslett, MI
    Posts
    2,221
    Quote Originally Posted by Movieman View Post
    I wasn't being greedy. I only asked for one of each!
    My bad, Dave. I would be very interested in seeing results from a trusted person like you. Hopefully the AMD/Intel reps on this forum will accept the challenge.


    Quote Originally Posted by gosh View Post
    What type of server software needs this hardware speed? How many concurrent users (employees) do you think need to be using the server before it will have trouble handling the load?
    What is the idle power for i7?


    Can Intel turn off one or more cores and sell it as X3, X2 etc. with this design?
    If I understand you right, you're saying Intel's latest dualcores for the server platform are so fast they're useless. Wow, that's dangerous talk. We don't want Intel's engineers to rest on their laurels. We need even faster chips, with ultra low power consumption. Don't you agree?
    Last edited by Zucker2k; 12-19-2008 at 12:17 PM.

  20. #145
    Xtreme Cruncher
    Join Date
    Aug 2006
    Location
    Denmark
    Posts
    7,747
    Quote Originally Posted by gosh View Post
    What type of server software needs this hardware speed? How many concurrent users (employees) do you think need to be using the server before it will have trouble handling the load?
    What is the idle power for i7?
    Well, AMD people have been shouting virtualization since the Core 2 Xeons, so let's try that. Otherwise there are PLENTY of DB applications, web services, terminal services etc. Plus all the cases where you would buy a quad-socket machine: you just get a Xeon 5500 based one instead.

    And the idle power... seriously. You are trolling, gosh.

    http://www.anandtech.com/cpuchipsets...spx?i=3453&p=3
    http://techreport.com/articles.x/15818/14
    http://www.xbitlabs.com/articles/cpu..._18.html#sect0

    It's nothing you can't find on your own. But there is a reason you have 10% of the posts here.

    Quote Originally Posted by gosh View Post
    Can Intel turn off one or more cores and sell it as X3, X2 etc. with this design?
    I don't see why not. But Intel said earlier that such things would be scrapped. AMD's reason for doing so was horrible process manufacturing mixed with a very large die size.

    And there are no such CPUs on Intel's Nehalem roadmaps. There are three designs: Bloomfield, Lynnfield and Havendale.
    Last edited by Shintai; 12-19-2008 at 12:20 PM.
    Crunching for Comrades and the Common good of the People.

  21. #146
    I am Xtreme
    Join Date
    Jul 2007
    Location
    Austria
    Posts
    5,485
    Quote Originally Posted by Shintai View Post
    Well, AMD people have been shouting virtualization since the Core 2 Xeons, so let's try that. Otherwise there are PLENTY of DB applications, web services, terminal services etc. Plus all the cases where you would buy a quad-socket machine: you just get a Xeon 5500 based one instead.

    And the idle power... seriously. You are trolling, gosh.

    http://www.anandtech.com/cpuchipsets...spx?i=3453&p=3
    http://techreport.com/articles.x/15818/14
    http://www.xbitlabs.com/articles/cpu..._18.html#sect0

    It's nothing you can't find on your own. But there is a reason you have 10% of the posts here.



    I don't see why not. But Intel said earlier that such things would be scrapped. AMD's reason for doing so was horrible process manufacturing mixed with a very large die size.

    And there are no such CPUs on Intel's Nehalem roadmaps. There are three designs: Bloomfield, Lynnfield and Havendale.
    Yeah, I don't think we will ever see S1366 dual-cores. Though on the other hand, I think we may see dual-core Lynnfields without IGP.

  22. #147
    Xtreme Enthusiast
    Join Date
    May 2008
    Posts
    612
    The problem is that there are some sites (English sites) that always say good things about Intel.

    Here is one review from another site in another country
    http://sweclockers.com/articles_show.php?id=6125&page=5

  23. #148
    Xtreme Enthusiast
    Join Date
    May 2008
    Posts
    612
    About cache hitrate

    Quote Originally Posted by savantu View Post
    Intel's is generally higher, as can be seen from David Kanter's review.
    Isn't the i7 L3 cache 16-way associative? And Deneb's is 48-way associative?

    What does that mean in hit rate?

    I know that i7 runs its cache at a higher clock (more power needed), and maybe they use that to compensate?

  24. #149
    Xtreme Mentor
    Join Date
    May 2008
    Location
    cleveland ohio
    Posts
    2,879
    Quote Originally Posted by savantu View Post


    AMD's caches have a pseudo-exclusive relationship (data cannot be found in two cache levels at the same time, although “pseudo” means that there are a few exceptions). Things get problematic when you have an L3 miss: you need to check what the other cores have in their caches.

    With an inclusive cache, an L3 miss guarantees that the data is not in the other caches, so a memory request is sent.

    Nehalem's inclusive relationship and the flag system it uses to maintain coherency give it little to no headaches about coherency traffic.
    AMD's caches burn more BW and latency on this problem, that's all.



    That is simply not true. AMD isn't bottlenecked by interconnects; it is precisely the coherency traffic which kills it. Huge amounts of BW are wasted on maintaining coherency.

    In a Nehalem multi-CPU system, you need to maintain the coherency of the L3s. Furthermore, Intel implemented a directory-based coherence protocol which is point-to-point instead of broadcast.

    Not so with the Opteron, because data in L1/L2 is more or less guaranteed not to be in the L3. They also use a snoop-based protocol in which the caches listen in on the transport of variables to any of the CPUs and update their own copies of these variables if they have them. Snooping logic in the processor broadcasts a message over the bus each time a word in its cache has been modified. The snooping logic also snoops on the bus looking for such messages from other processors.
    Since K8/K10 use 64-byte lines, can you imagine the traffic in a 4-socket system to maintain coherency? What about 8 sockets? Yeah, HT 3.0 will help, but it is a band-aid curing the symptoms by brute force (more BW) and not the disease (a better cache coherency protocol).

    Why do you think Newisys tried to build Horus, a directory-based chipset, to overcome this?
    An inclusive L3 cache in Nehalem and a non-inclusive L3 cache in K10/Shanghai. Nice post, but HT is also coherent.

    There was a reason the L3 cache is like that??? Anyone know why? (I don't know or remember why.)

    I don't think HT 3.0 will be big enough for 4-socket Shanghai.

  25. #150
    Xtreme Mentor
    Join Date
    May 2008
    Location
    cleveland ohio
    Posts
    2,879
    Quote Originally Posted by gosh View Post
    About cache hitrate



    Isn't the i7 L3 cache 16-way associative? And Deneb's is 48-way associative?

    What does that mean in hit rate?

    I know that i7 runs its cache at a higher clock (more power needed), and maybe they use that to compensate?
    Good luck finding that CPU-Z screenshot of the cache associativities >_>....
