
Thread: Deneb Samples are almost out

  1. #10
    Xtreme Mentor
    Join Date
    Mar 2006
    Posts
    2,978
    Quote Originally Posted by Macadamia View Post
    I know previous gaming code was branchy, but I don't think the current trend is emphasized there any more.
    Xenon and Cell for the consoles aren't too apt at branching, I last remember, especially with buffed up SIMD units. There will always be branchy code, but does it still comprise the majority of the engine?

    AMD does need serious work on their predictors though - for general performance more than anything. Despite the improvements Intel has a really decisive lead here ever since Conroe.
    Yeah I completely agree....

    A few comments ...

    It's really kind of intuitive when you think about it; just a high-level gedanken experiment helps to really understand why game code runs more branches than, say, compressing a file or encoding a movie clip. It is unavoidable -- there is no way to program a game engine without it. The input actions of the player are unpredictable, and the resulting cause and effect will always require testable conditions; the crux of any gaming algorithm is ultimately nondeterministic. For the CPU duties of the gaming engine, yeah, it is still important -- the CPU is responsible for receiving player input, tracking AI, culling, etc.
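    To make that concrete, here's a toy sketch (all names are mine, not from any real engine) of one frame of a game update. Every decision below is a conditional branch whose outcome depends on live player input or entity state, so the branch stream is far less regular than the tight, predictable loops of a file compressor or video encoder:

```cpp
#include <cassert>

// Hypothetical per-frame update: each "if" is a data-dependent branch.
enum class Input { None, Jump, Fire };

struct Entity {
    int  health   = 100;
    int  score    = 0;
    bool airborne = false;
};

// Counts the conditional decisions taken this frame (purely illustrative).
int updateFrame(Entity& player, Input in, int enemyDistance) {
    int branches = 0;

    ++branches;                                    // input-dependent branch
    if (in == Input::Jump && !player.airborne)
        player.airborne = true;

    ++branches;                                    // collision/AI-style test
    if (in == Input::Fire && enemyDistance < 50)
        player.score += 10;

    ++branches;                                    // state-dependent branch
    if (player.health <= 0)
        player.health = 100;                       // respawn

    return branches;
}
```

    The predictor can't learn a stable pattern here because the outcomes follow the player, not the code.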

    Kanter sent me a Word copy of this article a few months ago for review, and it is one of the only technical write-ups that really helps rationalize why C2D did such a good job with gaming code when it launched compared to K8 (it was the most dramatic feature of Conroe and really lit up the forums/net, of course): http://www.realworldtech.com/page.cf...2808015436&p=5

    Ironically, in my opinion at least, Intel's branch prediction capability, as seen in C2D, was probably the only good thing that ultimately came out of Netburst. The penalty for a mispredicted branch can be as bad as or worse than an L2 cache miss -- you need to flush the pipeline, fetch the new code/data into the front end, reorder again, and repopulate the pipeline. I've seen numbers between 30-150 cycles wasted just to correct a mispredicted branch. To avoid this, I would not doubt that Intel's architects went balls to the wall to figure out any and all possible ways to improve branch prediction accuracy; even then, that 31-stage pipeline just dragged it all back down, as two or three mispredictions in a thousand would cripple a Prescott.
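    You can see the predictor's job in the classic sorted-vs-shuffled experiment (my own sketch, not from the article). The loop below counts the same elements either way, but with sorted input the branch outcome forms a pattern the predictor learns (all false, then all true), while shuffled input makes the outcome effectively random -- and every miss is one of those 30-150 cycle pipeline flushes:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// The branch inside this loop is the whole story: its cost on real hardware
// depends on how predictable the data makes it, not on the code itself.
long countAbove(const std::vector<int>& data, int threshold) {
    long count = 0;
    for (int v : data)
        if (v >= threshold)   // predictable on sorted data, random otherwise
            ++count;
    return count;
}

// Deterministic pseudo-random bytes (0..255), optionally sorted.
std::vector<int> makeData(bool sorted) {
    std::vector<int> data(100000);
    unsigned seed = 12345;
    for (int& v : data) {
        seed = seed * 1103515245u + 12345u;
        v = (seed >> 16) & 255;
    }
    if (sorted) std::sort(data.begin(), data.end());
    return data;
}
```

    Both inputs produce the identical count; only the misprediction rate (and hence the runtime) differs between them.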

    The branch predictor logic most likely carried over, in some fashion and to some degree, into C2D -- probably the only feature of Netburst to make it into C2D. Who knows, but it makes sense.

    So yeah, AMD's branch prediction is weaker than Intel's at the moment, though K8 could kick butt against Netburst gaming-wise -- that was because, even with great branch prediction, that uber-long pipeline just stunk if there was ever a stall. Shortening up the pipeline with C2D, widening it, and coupling it with strong branch predictors = a great scenario for gaming.
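    A back-of-the-envelope model shows why pipeline depth and predictor accuracy multiply together (the numbers plugged in are illustrative assumptions, not measured figures): the flush penalty scales roughly with the number of stages to drain and refill, so a deep pipeline needs a far lower miss rate just to break even.

```cpp
#include <cassert>

// effective_cpi = base_cpi + branch_freq * miss_rate * flush_penalty
// where flush_penalty is taken as ~1 cycle per pipeline stage -- a crude
// stand-in for the real drain-and-refill cost.
double effectiveCpi(double baseCpi, double branchFreq,
                    double missRate, int pipelineStages) {
    double flushPenalty = static_cast<double>(pipelineStages);
    return baseCpi + branchFreq * missRate * flushPenalty;
}
```

    With the same branch frequency and miss rate, a 31-stage Prescott-like pipeline eats more than twice the misprediction tax of a 14-stage Conroe-like one -- which is exactly the short-pipe-plus-good-predictor combination described above.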

    EDIT: Here's a good one ... look at figure 4: http://www.research.ibm.com/people/m...s/2004_msp.pdf -- this is a neat paper; it also shows the L1/L2 misses for FPS games vs. other applications. So you can see why we get the generalized statement "games love L2 cache" -- true, they do ... but they also love good branch predictors, both of which C2D/Qs have lots of.

    Jack
    Last edited by JumpingJack; 11-05-2008 at 11:09 PM.
    One hundred years from now
    It won't matter
    What kind of car I drove
    What kind of house I lived in
    How much money I had in the bank
    Nor what my clothes looked like....
    But the world may be a little better
    Because I was important
    In the life of a child.
    -- from "Within My Power" by Forest Witcraft
