Page 29 of 39
Results 701 to 725 of 954

Thread: AMD's Bobcat and Bulldozer

  1. #701
    Xtreme Addict
    Join Date
    Jun 2002
    Location
    Ontario, Canada
    Posts
    1,782
    Quote Originally Posted by -Boris- View Post
    I'm pretty sure that when you have the position JF has in a company you get pretty accurate numbers from engineering and so on. There is no need for him to sit down and bench engineering samples personally. Would be quite stupid if engineering lied about the performance in internal reviews and documents.
    You know this isn't Dilbertland right?
    Thanks for answering that JF.

    So if he's getting accurate numbers from engineering, then it's a simple thing for him to say, "BD numbers are better than existing chips". Simple right?
    As quoted by LowRun......"So, we are one week past AMD's worst case scenario for BD's availability but they don't feel like communicating about the delay, I suppose AMD must be removed from the reliable sources list for AMD's products launch dates"

  2. #702
    Xtreme Enthusiast
    Join Date
    Oct 2008
    Posts
    678
    Quote Originally Posted by freeloader View Post
    Thanks for answering that JF.

    So if he's getting accurate numbers from engineering, then it's a simple thing for him to say, "BD numbers are better than existing chips". Simple right?
    That's pretty much what he has said countless times already.

    http://www.xtremesystems.org/forums/...&postcount=602

  3. #703
    Xtreme Enthusiast
    Join Date
    Dec 2009
    Posts
    846
    Quote Originally Posted by freeloader View Post
    JF...have you personally seen running BD chips yet or whatever the server variant is called? Just wondering how you're so sure if you haven't bench tested one yet.

    Does anyone here know when BD compatible socket motherboards will go on sale?
    I don't have any reason to be in building 400. And it is better off that the marketing guy is not "dropping in" on them.

    Our performance engineering team has done a real accurate job on performance modeling in the past, I have no reason to doubt them. Generally the worst that we see is too much conservatism, not too much optimism.
    While I work for AMD, my posts are my own opinions.

    http://blogs.amd.com/work/author/jfruehe/

  4. #704
    Xtreme Enthusiast
    Join Date
    Dec 2009
    Posts
    846
    Quote Originally Posted by STaRGaZeR View Post
    He's right. K10 has more resources, shared or not.
    OK, so let me get the gist of all of this whole thread down to two statements:

    1. People are claiming Bulldozer will be slower than existing products because they are sharing resources in the processor and sharing is inherently worse.

    2. People are claiming that even though Bulldozer has dedicated resources relative to the old architecture that shares them, this is worse.

    OK, I got it now.
    While I work for AMD, my posts are my own opinions.

    http://blogs.amd.com/work/author/jfruehe/

  5. #705
    Xtreme Member
    Join Date
    Nov 2008
    Posts
    117
    Quote Originally Posted by JF-AMD View Post
    OK, so let me get the gist of all of this whole thread down to two statements:

    1. People are claiming Bulldozer will be slower than existing products because they are sharing resources in the processor and sharing is inherently worse.

    2. People are claiming that even though Bulldozer has dedicated resources relative to the old architecture that shares them, this is worse.

    OK, I got it now.
    They are Intel fanboys. No problem!
    When AMD had 64-bit and Intel had only 32-bit, they tried to tell the world there was no need for 64-bit. Until they got 64-bit.
    When AMD had IMC and Intel had FSB, they told the world "there is plenty of life left in the FSB" (actual quote, and yes, they had *math* to show it had more bandwidth). Until they got an IMC.
    When AMD had dual core and Intel had single core, they told the world that consumers don't need multi core. Until they got dual core.
    When Intel was using MCM, they said it was a better solution than native dies. Until they got native dies. (To be fair, we knocked *unconnected* MCM, and still do, we never knocked MCM as a technology, so hold your flames.)
    by John Fruehe

  6. #706
    Xtreme Guru
    Join Date
    May 2007
    Location
    Ace Deuce, Michigan
    Posts
    3,955
    Quote Originally Posted by Hans de Vries View Post
    Our friend who goes by the name dougsf30/terrace215/chipper/chipdesigner/tatertot/justaview/gloo...
    and 100 more names has a history of driving people nuts (and maybe himself as well...)

    To recapitulate this thread:

    AMD Architects : IPC increases (Anand article commenting on the 2 ALUs and 16KB L1)

    terrace215 post: IPC decreases, because of the 2 ALUs..
    terrace215 post: IPC decreases, because of the 16KB caches
    terrace215 post: IPC decreases, AMD presentation sheet no.X tells us so.
    terrace215 post: IPC decreases, AMD presentation sheet no.Y confesses this.

    JF-AMD posting: IPC increases!! instead of getting worse.

    terrace215 post: IPC decreases, the marketing guy isn't talking about IPC
    terrace215 post: IPC decreases, don't trust marketing guys.
    terrace215 post: IPC decreases, Bulldozer is only optimized for server workloads.
    terrace215 post: IPC decreases, AMD presentation sheet no.Y confesses this.

    JF-AMD posting: IPC increases!!!! You are spreading FUD

    terrace215 post: IPC decreases, AMD presentation sheet no.X tells us so.
    terrace215 post: IPC decreases, The AMD architect says it decreases by 5%
    terrace215 post: IPC decreases, because of the 2 ALUs..
    terrace215 post: IPC decreases, AMD has given up improving IPC.

    JF-AMD posting: IPC increases!!!!!!! How many times did I tell you!!!

    forever{
    terrace215 post: IPC decreases, because .....
    terrace215 post: IPC decreases, says .... of AMD
    terrace215 post: IPC decreases, according to AMD's presentation.
    terrace215 post: IPC decreases, don't trust marketing guys.
    terrace215 post: IPC decreases, because of the 2 ALUs..
    terrace215 post: IPC decreases, the marketing guy isn't talking about IPC
    terrace215 post: IPC decreases, because of the 16KB caches
    terrace215 post: IPC decreases, AMD has given up improving IPC.
    terrace215 post: IPC decreases, The AMD architect says it decreases by 5%
    terrace215 post: IPC decreases, Bulldozer is only optimized for server workloads.
    terrace215 post: IPC decreases, AMD presentation sheet no.X tells us so.
    terrace215 post: IPC decreases, The more I post the more it decreases.
    terrace215 post: IPC decreases, The more I post the more it decreases.
    terrace215 post: IPC decreases, The more I post the more it decreases.
    .....}
    until (interrupt by Movieman)

    savantu post: IPC decreases, AMD has given up improving IPC.
    savantu post: IPC decreases, The AMD architect says it decreases by 5%
    savantu post: IPC decreases, The more I post the more it decreases.
    savantu post: IPC decreases, The more I post the more it decreases.
    savantu post: IPC decreases, The more I post the more it decreases.

    JF-AMD posting: Epic palms to the Face....

    Regards, Hans
    updated
    Quote Originally Posted by Hans de Vries View Post

    JF-AMD posting: IPC increases!!!!!!! How many times did I tell you!!!

    terrace215 post: IPC decreases, The more I post the more it decreases.
    terrace215 post: IPC decreases, The more I post the more it decreases.
    terrace215 post: IPC decreases, The more I post the more it decreases.
    .....}
    until (interrupt by Movieman)


    Regards, Hans

  7. #707
    Xtreme Mentor
    Join Date
    Nov 2006
    Location
    Spain, EU
    Posts
    2,949
    Quote Originally Posted by JF-AMD View Post
    OK, so let me get the gist of all of this whole thread down to two statements:

    1. People are claiming Bulldozer will be slower than existing products because they are sharing resources in the processor and sharing is inherently worse.

    2. People are claiming that even though Bulldozer has dedicated resources relative to the old architecture that shares them, this is worse.

    OK, I got it now.
    I'll make it short and easy to understand. Original quote:

    Quote Originally Posted by savantu View Post
    K10 has 3 ALUs and 3 AGUs. No matter how hard you and others try to downplay K10 execution resources, fact is, a K10 integer core has more resources than a BD integer core.
    Which is 100% true, as K10 has more execution units. I don't see the words performance, shared or dedicated in this post. Then you say:

    Quote Originally Posted by JF-AMD View Post
    No, you are wrong. Old architecture has shared resources, new architecture has dedicated resources.
    Which is wrong, based on the above. I just pointed it out, but it seems it was a perfect excuse to ignore what the guy is actually saying (as you like to do) and repeat the same post you've been repeating how many times now?

    I hope you properly get it now.
    Friends shouldn't let friends use Windows 7 until Microsoft fixes Windows Explorer (link)


    Quote Originally Posted by PerryR, on John Fruehe (JF-AMD) View Post
    Pretty much. Plus, he's here voluntarily.

  8. #708
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    Quote Originally Posted by -Boris- View Post
    I'm pretty sure that when you have the position JF has in a company you get pretty accurate numbers from engineering and so on. There is no need for him to sit down and bench engineering samples personally. Would be quite stupid if engineering lied about the performance in internal reviews and documents.
    You know this isn't Dilbertland right?
    Let's say their past history isn't as immaculate as you portray it. There is an alternate discussion on BD details on Aces, and Paul Demone directly answers JF's claims:

    Quote Originally Posted by Paul Demone
    Quote Originally Posted by inf64
    wrote:
    JF's comment about "lower IPC" in BD
    http://www.xtremesystems.org/forums/sho ... tcount=589

    Quote:
    See, that statement is what gets people in trouble. Someone reads that statement and assumes 10% lower performance.

    IPC will be higher than previous generation
    Single threaded performance will be higher than previous generation
    I hope this is clear enough for Paul and his crystal ball.

    ROFL. A Niagara has higher "aggregate" (across all threads) IPC than a US-IV
    but far lower single thread performance. Listen for what a salesman doesn't
    say! Higher single thread performance than K10? Probably, but at far higher
    clock rates enabled by a deeper pipeline, simpler cores, and a process shrink.

    I remember another AMD "great white hope" chip called Barcelona.

    I remember another rash and brash AMD marketing guy called Henri Richard.

    That guy said a lot of things about that chip, made a lot of fantastic claims
    of how it would make Intel cry. AMD fans worshipped the ground he walked on.

    The funny thing is that once Barcelona silicon was characterized, SKUs defined,
    and the first pre-release internal benchmark data compiled that guy left AMD so
    fast it made the air crackle. Once Barcelona was released it was obvious why. :-D

    Good ole JF has said a lot of things about BD. My guess is he still has at least
    18 months to spin BD before having a third party reality check. I hope unlike
    Henri he sticks it out post BD release just to see his fancy footwork trying to
    match these claims up to reality.

    Until then I'll believe a gram of disclosure from AMD's current and recent design
    engineers over a ton of claims from a marketing guy. My advice to AMD fans
    hanging on to JF's every last word is to remember back on what Henri Richard
    said in private vs what he said in public.

    http://news.cnet.com/8301-13924_3-10433953-64.html

    an excerpt of a 2004 internal AMD communication from former AMD Executive Vice
    President Henri Richard, the company's then-highest-ranking sales executive: "If you
    look at it with an objective set of eyes, you would never buy AMD. I certainly would
    never buy AMD for a personal system, if I wasn't working here."
    BD taped out a month or two ago. If they were lucky silicon is mostly functional. If not, they are working overtime to fix it and get working samples. Silicon is being characterized and in pre-validation stage.
    In other words, benchmarks and performance are second place at this time, most important is getting a functional chip.

    What this all means is that every claim about BD performance is based on estimates made without actual silicon in hand.
    Sun's Rock was meant to be the greatest chip of the past decade, with innovative features like transactional memory and scout threads. I still remember how ecstatic Jonathan Schwartz was over Rock.
    Rock turned out a complete dud, burning 300 W with abysmal performance.
    Last edited by savantu; 08-31-2010 at 04:58 AM.
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  9. #709
    Xtreme Addict
    Join Date
    Jul 2005
    Posts
    1,646
    Six year old quotes? I'm going to start filling Sandy Bridge threads with info on Netburst, that's cool with you guys, right?

  10. #710
    Xtreme X.I.P.
    Join Date
    Apr 2008
    Location
    Kansas
    Posts
    3,219
    I have to wonder why Paul would even say this unless he just wants to argue:
    ROFL. A Niagara has higher "aggregate" (across all threads) IPC than a US-IV
    but far lower single thread performance. Listen for what a salesman doesn't
    say! Higher single thread performance than K10? Probably, but at far higher
    clock rates enabled by a deeper pipeline, simpler cores, and a process shrink.
    The first section deals with repeating the whole "OK, so it's faster overall but what about single threaded work?! Ha!" We've already been told that BD is faster than the current gen at both, which he even acknowledges in the second half. As such, what point was there to even posting the first part? As for the second part, who cares? If the frequencies are higher due to the changes in the chip and that permits an overall faster singlethreaded and multithreaded experience than is possible with current designs, then why does it matter if the new chip ticks faster? I'm not saying I think that clock for clock the new part will be slower at this point, but even if it were it would be fine given that in the end it's still faster and not only clocked higher.
    Particle's First Rule of Online Technical Discussion:
    As a thread about any computer related subject has its length approach infinity, the likelihood and inevitability of a poorly constructed AMD vs. Intel fight also exponentially increases.

    Rule 1A:
    Likewise, the frequency of a car pseudoanalogy to explain a technical concept increases with thread length. This will make many people chuckle, as computer people are rarely knowledgeable about vehicular mechanics.

    Rule 2:
    When confronted with a post that is contrary to what a poster likes, believes, or most often wants to be correct, the poster will pick out only minor details that are largely irrelevant in an attempt to shut out the conflicting idea. The core of the post will be left alone since it isn't easy to contradict what the person is actually saying.

    Rule 2A:
    When a poster cannot properly refute a post they do not like (as described above), the poster will most likely invent fictitious counter-points and/or begin to attack the other's credibility in feeble ways that are dramatic but irrelevant. Do not underestimate this tactic, as in the online world this will sway many observers. Do not forget: Correctness is decided only by what is said last, the most loudly, or with greatest repetition.

    Rule 3:
    When it comes to computer news, 70% of Internet rumors are outright fabricated, 20% are inaccurate enough to simply be discarded, and about 10% are based in reality. Grains of salt--become familiar with them.

    Remember: When debating online, everyone else is ALWAYS wrong if they do not agree with you!

    Random Tip o' the Whatever
    You just can't win. If your product offers feature A instead of B, people will moan how A is stupid and it didn't offer B. If your product offers B instead of A, they'll likewise complain and rant about how anyone's retarded cousin could figure out A is what the market wants.

  11. #711
    I am Xtreme
    Join Date
    Jul 2007
    Location
    Austria
    Posts
    5,485
    http://flamewheelspin.ytmnd.com/

    perfectly sums up this thread...

  12. #712
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    Quote Originally Posted by MrMojoZ View Post
    Six year old quotes? I'm going to start filling Sandy Bridge threads with info on Netburst, that's cool with you guys, right?

    No need to; the point was simply to take the appropriate spoonful of salt with respect to marketing and performance claims for a future product.

    Quote Originally Posted by JF-AMD
    OK, so let me get the gist of all of this whole thread down to two statements:

    1. People are claiming Bulldozer will be slower than existing products because they are sharing resources in the processor and sharing is inherently worse.

    2. People are claiming that even though Bulldozer has dedicated resources relative to the old architecture that shares them, this is worse.

    OK, I got it now.
    Not in the slightest.
    First of all, nobody claimed BD will be slower than existing products, either in overall performance or in single-threaded performance. Nobody but you brought dedicated vs. shared resources into the discussion, so that's a false dilemma you have there.
    The only point raised (by me at least) was that, given the design trade-offs BD made (which I addressed in detail in a previous post, my POV and nothing more, and which David Kanter also mentioned in his article), BD is expected to lose slightly in performance per clock compared to K10 in integer code. Overall performance of BD, including single-threaded, will no doubt be higher than K10's. But not per clock.
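    To make the per-clock vs. overall distinction concrete, here is a minimal sketch that assumes single-thread performance scales as IPC times clock frequency. The IPC and clock figures are made-up placeholders, not measurements or AMD numbers, and the function name is hypothetical.

```python
# Sketch: single-thread throughput modeled as IPC x clock.
# All numbers are illustrative placeholders, not real K10/BD figures.

def instructions_per_second(ipc: float, clock_ghz: float) -> float:
    """Instructions retired per second under a simple IPC x clock model."""
    return ipc * clock_ghz * 1e9

# Hypothetical older core: higher IPC, lower clock.
old_core = instructions_per_second(ipc=1.00, clock_ghz=3.0)
# Hypothetical newer core: 5% lower IPC, but a higher clock.
new_core = instructions_per_second(ipc=0.95, clock_ghz=3.6)

speedup = new_core / old_core
print(f"speedup = {speedup:.2f}x")
```

    With these placeholder values the core that loses 5% per clock still comes out ahead overall, which is the scenario described above; different numbers can just as easily flip the result.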
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  13. #713
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    2,128
    On what do you base your opinion? Deeper pipeline?

    Are there any bits of info regarding cache inclusiveness/exclusiveness, other than the 16 kB L1D, which hints at an inclusive cache?

    I'm still predicting an inclusive cache given the L1D size. There is no reason to stick to an exclusive cache, as it gives virtually no benefit because of poor L2/L1 and L3/L2 ratios. It just slows every memory operation quite a bit while giving a marginal improvement in hit rate. Anyone with some knowledge of the performance penalty due to an exclusive cache? I'd believe that an inclusive cache would bring more than enough to compensate for any loss the deeper pipeline could potentially cause, bringing the cache latencies to near Nehalem numbers, if not greater. SB seems to be a real badass on this, so I can't see BD coming near its latencies with an inclusive cache.
    Last edited by Calmatory; 08-31-2010 at 05:35 AM.

  14. #714
    Xtreme Addict
    Join Date
    Jul 2005
    Posts
    1,646
    Quote Originally Posted by savantu View Post
    No need to; the point was simply to take the appropriate spoonful of salt with respect to marketing and performance claims for a future product.
    And the same applies to Intel parts so I need to start spamming their threads too, right? Is that what you are advocating here?

  15. #715
    Xtreme Addict
    Join Date
    Jan 2005
    Posts
    1,730
    Quote Originally Posted by Particle View Post
    I have to wonder why Paul would even say this unless he just wants to argue:


    The first section deals with repeating the whole "OK, so it's faster overall but what about single threaded work?! Ha!" We've already been told that BD is faster than the current gen at both, which he even acknowledges in the second half. As such, what point was there to even posting the first part?
    He was addressing JF's point about IPC being higher (Paul doubts that). I am surprised it isn't obvious.

    As for the second part, who cares? If the frequencies are higher due to the changes in the chip and that permits an overall faster singlethreaded and multithreaded experience than is possible with current designs, then why does it matter if the new chip ticks faster? I'm not saying I think that clock for clock the new part will be slower at this point, but even if it were it would be fine given that in the end it's still faster and not only clocked higher.
    Well, you see, neither I nor Paul nor the others are interested in absolute benchmark scores. My interest is in how they got there: the uarch, the trade-offs, the clever stuff done to hide bottlenecks, the corner cases, etc. I don't give a rat's ass if it scores 101 FPS in I-don't-know-what-game or does SuperPi in -2 sec.
    The fun is in analyzing the intentions and the implementation, not the end result. I take great pleasure in reading about Netburst, Prescott, Tejas, Nehalem (the 1st one), Tanglewood, Rock, etc., even if some were duds in the end. A chip may suck, but it was innovative and challenging.
    Last edited by savantu; 08-31-2010 at 05:37 AM.
    Quote Originally Posted by Heinz Guderian View Post
    There are no desperate situations, there are only desperate people.

  16. #716
    I am Xtreme
    Join Date
    Jul 2007
    Location
    Austria
    Posts
    5,485
    Quote Originally Posted by savantu View Post
    The only point raised (by me at least) was that, given the design trade-offs BD made (which I addressed in detail in a previous post, my POV and nothing more, and which David Kanter also mentioned in his article), BD is expected to lose slightly in performance per clock compared to K10 in integer code. Overall performance of BD, including single-threaded, will no doubt be higher than K10's. But not per clock.
    Watch as it will again be spun as if you said overall performance or overall IPC...

  17. #717
    Xtreme Enthusiast
    Join Date
    Jun 2006
    Location
    Space
    Posts
    769
    Well, after reading all the stuff about BD, my nooby chip expertise tells me that:

    IPC will be improved at the same clocks compared to current AMD processors.
    It will take less space per core
    It will clock higher than the current crop of AMD processors.

    It looks like it will be highly competitive in the server market, but behind in the 'gamers' segment (possibly close to matching today's Intels because of clockspeed, but not surpassing it in IPC).

    obviously no one will know until it gets leaked.

  18. #718
    Xtreme Member
    Join Date
    Sep 2008
    Posts
    235
    Quote Originally Posted by savantu View Post
    Let's say their past history isn't as immaculate as you portray it. There is an alternate discussion on BD details on Aces, and Paul Demone directly answers JF's claims:



    BD taped out a month or two ago. If they were lucky silicon is mostly functional. If not, they are working overtime to fix it and get working samples. Silicon is being characterized and in pre-validation stage.
    In other words, benchmarks and performance are second place at this time, most important is getting a functional chip.

    What this all means is that every claim about BD performance is based on estimates made without actual silicon in hand.
    I don't agree with the word "estimates".

    A design is validated and debugged long before it goes to silicon. Validation
    is done both by cycle accurate software simulation and FPGA hardware
    emulation. An FPGA hardware implementation of the core, or entire processor,
    can run 10+MHz and can be made cycle accurate. This is also how you do
    performance tuning during the design phase itself.

    Typically operating systems are booted and many software applications
    are run long before you go to silicon.
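    For a sense of scale, here is a back-of-the-envelope sketch. The 10 MHz emulation clock is the figure mentioned above; the 3 GHz target clock and the 10-second workload are assumptions picked purely for illustration, and the function name is hypothetical.

```python
# Sketch: how much slower a cycle-accurate FPGA emulation runs than silicon.
EMULATION_HZ = 10e6   # 10 MHz, the emulation speed mentioned above
TARGET_HZ = 3e9       # hypothetical clock of the finished chip (assumption)

def emulation_seconds(cycles: float, emulation_hz: float = EMULATION_HZ) -> float:
    """Wall-clock time needed to emulate a given number of processor cycles."""
    return cycles / emulation_hz

slowdown = TARGET_HZ / EMULATION_HZ
# A workload taking 10 s on the hypothetical 3 GHz chip is 3e10 cycles.
workload_cycles = 10 * TARGET_HZ
minutes = emulation_seconds(workload_cycles) / 60
print(f"slowdown: {slowdown:.0f}x, emulated run: {minutes:.0f} minutes")
```

    Under these assumptions the emulated run is a few hundred times slower than silicon, which is slow but still practical for booting an OS and running benchmarks overnight.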

    About your link......

    What in this musing from the investment board inhabitants can be classified
    as not being investor FUD and of any technical relevance concerning
    the architectural details of bulldozer?


    Regards, Hans

  19. #719
    Xtreme Guru
    Join Date
    May 2007
    Location
    Ace Deuce, Michigan
    Posts
    3,955
    Quote Originally Posted by savantu View Post
    Not in the slightest.
    First of all, nobody claimed BD will be slower than existing products, either in overall performance or in single-threaded performance. Nobody but you brought dedicated vs. shared resources into the discussion, so that's a false dilemma you have there.
    The only point raised (by me at least) was that, given the design trade-offs BD made (which I addressed in detail in a previous post, my POV and nothing more, and which David Kanter also mentioned in his article), BD is expected to lose slightly in performance per clock compared to K10 in integer code. Overall performance of BD, including single-threaded, will no doubt be higher than K10's. But not per clock.
    I don't get why you guys are so sure it can't offer more IPC than K10. I think it does make a difference that the 3 could be either ALUs or AGUs, because they couldn't be both simultaneously. Add to that the fact that many applications don't even use a full ALU/AGU, so combined with a better prefetcher, Bulldozer should offer good IPC gains.

    It's been confirmed many times over that the 80% number is integer cores in a single module vs. integer cores in different modules, and the performance is lost due to shared components in the modules, not due to weaker cores.
    Quote Originally Posted by Hans de Vries View Post

    JF-AMD posting: IPC increases!!!!!!! How many times did I tell you!!!

    terrace215 post: IPC decreases, The more I post the more it decreases.
    terrace215 post: IPC decreases, The more I post the more it decreases.
    terrace215 post: IPC decreases, The more I post the more it decreases.
    .....}
    until (interrupt by Movieman)


    Regards, Hans

  20. #720
    Xtreme Addict
    Join Date
    Apr 2007
    Location
    canada
    Posts
    1,886
    Quote Originally Posted by freeloader View Post
    Thanks for answering that JF.

    So if he's getting accurate numbers from engineering, then it's a simple thing for him to say, "BD numbers are better than existing chips". Simple right?

    In fact, it's been said a couple of times by JF-AMD himself ....


    Quote Originally Posted by JF-AMD View Post
    OK, so let me get the gist of all of this whole thread down to two statements:

    1. People are claiming Bulldozer will be slower than existing products because they are sharing resources in the processor and sharing is inherently worse.

    2. People are claiming that even though Bulldozer has dedicated resources relative to the old architecture that shares them, this is worse.

    OK, I got it now.

    Sharing most inevitably means communism for some .... but not for me .. if it's to bring a good product at an affordable price with a big improvement over the last product, I'm all for anything really.


    Quote Originally Posted by STaRGaZeR View Post
    I'll make it short and easy to understand. Original quote:



    Which is 100% true, as K10 has more execution units. I don't see the words performance, shared or dedicated in this post. Then you say:



    Which is wrong, based on the above. I just pointed it out, but it seems it was a perfect excuse to ignore what the guy is actually saying (as you like to do) and repeat the same post you've been repeating how many times now?

    I hope you properly get it now.

    Arguing with the man who works at the company whose product you've decided to pick on .. and said person is in talks with the engineers who built the damn thing ....
    Last edited by Sn0wm@n; 08-31-2010 at 05:53 AM.
    WILL CUDDLE FOR FOOD

    Quote Originally Posted by JF-AMD View Post
    Dual proc client systems are like sex in high school. Everyone talks about it but nobody is really doing it.

  21. #721
    Xtreme Addict
    Join Date
    Nov 2006
    Posts
    1,402
    Quote Originally Posted by Hornet331 View Post
    http://flamewheelspin.ytmnd.com/

    perfectly sums up this thread...
    I agree, and savantu doesn't help in getting an objective view of the facts.

    If JF said IPC will be better, this is true ... Why? It's simple: he doesn't want to be unemployed.

    Good marketing is telling the truth ... And Henri Richard made some big mistakes, and now doesn't work for AMD anymore.

    Bad guys don't stay so long ...
    Last edited by madcho; 08-31-2010 at 05:55 AM.

  22. #722
    Xtreme Addict
    Join Date
    Apr 2007
    Posts
    2,128
    What if there are multiple threads with lots of AVX instructions? Single module can only feed one AVX instruction at a time, or two 128-bit SSEx instructions, or 4 64-bit FPU instructions, right?

    Up to <number of modules> threads running AVX, there should be no performance penalty as long as there are no other FP instructions in flight. The more there are, the lower the AVX performance will be. And if one adds more AVX threads, the FPU units will just starve and there is no performance improvement?

    In short: If I want to do lots of AVX, I can only run <number of modules> threads for improved performance?
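    The model behind this question can be sketched as follows. It only encodes the assumption stated above (one 256-bit AVX op per module per clock, FPU shared by both cores of a module) and is not a confirmed Bulldozer spec; the 4-module part and the function name are hypothetical, and threads are assumed to be spread across modules first.

```python
# Sketch of the model in the question above: one 256-bit AVX op per module
# per clock, with the FPU shared by the two integer cores of a module.
# This encodes the poster's assumption, not a confirmed Bulldozer spec.

def avx_ops_per_clock(threads: int, modules: int) -> int:
    """Aggregate AVX ops per clock when every thread is AVX-bound."""
    if threads < 0 or modules < 0:
        raise ValueError("threads and modules must be non-negative")
    # Each module contributes at most one AVX op per clock.
    return min(threads, modules)

MODULES = 4  # hypothetical 4-module (8-core) part
for threads in range(1, 2 * MODULES + 1):
    total = avx_ops_per_clock(threads, MODULES)
    print(f"{threads} threads: {total} AVX ops/clock, {total / threads:.2f} per thread")
```

    In this simplified model the aggregate rate stops growing once the thread count reaches the module count, which is the "in short" conclusion above; real code mixes AVX with integer work, so the actual curve would be less abrupt.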

  23. #723
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Repeating for the 10th time already: 10h can retire (back end of the chip) 3 macro-ops, period. It has 9 execution units. There's your problem.

  24. #724
    Xtreme Addict
    Join Date
    Apr 2007
    Location
    canada
    Posts
    1,886
    Quote Originally Posted by Calmatory View Post
    What if there are multiple threads with lots of AVX instructions? Single module can only feed one AVX instruction at a time, or two 128-bit SSEx instructions, or 4 64-bit FPU instructions, right?

    Up to <number of modules> threads running AVX, there should be no performance penalty as long as there are no other FP instructions in flight. The more there are, the lower the AVX performance will be. And if one adds more AVX threads, the FPU units will just starve and there is no performance improvement?

    In short: If I want to do lots of AVX, I can only run <number of modules> threads for improved performance?

    By the time this what-if comes up, there will be more powerful CPUs capable of doing more than a single AVX instruction per core, etc....

    And anyway, isn't AVX better suited to massive multimedia tasks in terms of the way it processes the info? So even if the CPUs are limited by that fact, they will most likely finish their job easily, right?
    Last edited by Sn0wm@n; 08-31-2010 at 06:06 AM.
    WILL CUDDLE FOR FOOD

    Quote Originally Posted by JF-AMD View Post
    Dual proc client systems are like sex in high school. Everyone talks about it but nobody is really doing it.

  25. #725
    Xtreme Enthusiast
    Join Date
    Oct 2008
    Posts
    678
    Quote Originally Posted by STaRGaZeR View Post
    Original quote:
    Originally Posted by savantu
    K10 has 3 ALUs and 3 AGUs. No matter how hard you and others try to downplay K10 execution resources, fact is, a K10 integer core has more resources than a BD integer core.

    Which is 100% true, as K10 has more execution units. I don't see the words performance, shared or dedicated in this post. Then you say:
    Quotes taken out of context can be true, but in context they can mean something different. Your quote was a response to my post about pipes. He is trying to make 3 pipelines appear like 6 pipes. Which is a twist to the truth.

    BD has more resources since it can use 2 ALUs and 2 AGUs every clock; Phenom II averages 1.5 ALUs and 1.5 AGUs since they share pipes. Again, if you can't use it, it isn't a resource. 2+2=4; (3+3)/2=3.
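    The arithmetic in this post can be written out as a small sketch. It encodes this post's assumption that each K10 pipe issues either an ALU op or an AGU op per clock while BD has separate ALU and AGU pipes; whether K10 is actually limited this way is exactly what is disputed in this thread. The function name is hypothetical.

```python
# Sketch of the pipe arithmetic in this post. It encodes the poster's
# assumption that each K10 pipe issues either an ALU op or an AGU op per
# clock (not both), while BD has separate ALU and AGU pipes.

def usable_ops_per_clock(alus: int, agus: int, shared_pipes: bool) -> float:
    """Ops issued per clock; shared pipes halve the usable ALU+AGU count."""
    total_units = alus + agus
    return total_units / 2 if shared_pipes else float(total_units)

k10 = usable_ops_per_clock(alus=3, agus=3, shared_pipes=True)   # (3+3)/2
bd = usable_ops_per_clock(alus=2, agus=2, shared_pipes=False)   # 2+2
print(f"K10: {k10}, BD: {bd}")
```

    Flipping `shared_pipes` to `False` for K10 gives the other side's count of 6 units, so the whole disagreement comes down to that one assumption.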


    Quote Originally Posted by STaRGaZeR View Post
    Which is wrong, based on the above. I just pointed it out, but it seems it was a perfect excuse to ignore what the guy is actually saying (as you like to do) and repeat the same post you've been repeating how many times now?

    I hope you properly get it now.
    The discussion is still about IPC, even if you try to make it look different. And it's still about BD's integer execution capacity compared to K8 (10h); we are pointing out that BD's 4 pipes seem a bit stronger than K8's 3 pipes.
    And by adding the different parts of K8's pipeline together, some people here are trying to make them look twice as strong.
    4 pipes equals more resources than 3.
    Last edited by -Boris-; 08-31-2010 at 06:24 AM.
