45nm Phenom Overclocked, Super Pied

**xPliziT** · 07-27-2008, 05:25 AM

Nice find...

Found that about cache associativity. Taken from wiki: http://en.wikipedia.org/wiki/CPU_cache#Associativity

The replacement policy decides where in the cache a copy of a particular entry of main memory will go. If the replacement policy is free to choose any entry in the cache to hold the copy, the cache is called fully associative. At the other extreme, if each entry in main memory can go in just one place in the cache, the cache is direct mapped. Many caches implement a compromise, and are described as set associative. For example, the level-1 data cache in an AMD Athlon is 2-way set associative, which means that any particular location in main memory can be cached in either of 2 locations in the level-1 data cache.

Associativity is a trade-off. If there are ten places the replacement policy can put a new cache entry, then when the cache is checked for a hit, all ten places must be searched. Checking more places takes more power, area, and potentially time. On the other hand, caches with more associativity suffer fewer misses (see conflict misses, below), so that the CPU spends less time servicing those misses. The rule of thumb is that doubling the associativity, from direct mapped to 2-way, or from 2-way to 4-way, has about the same effect on hit rate as doubling the cache size. Associativity increases beyond 4-way have much less effect on the hit rate, and are generally done for other reasons (see virtual aliasing, below).
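To make the conflict-miss point concrete, here is a minimal sketch of an LRU set-associative cache simulator. The function name `count_misses`, the 8-line cache, and the 64-byte line size are illustrative assumptions, not figures for any real CPU. It replays an access pattern that alternates between two addresses exactly one cache-size apart, which is the worst case for a direct-mapped cache.

```python
from collections import OrderedDict

def count_misses(addresses, num_lines, ways, line_size=64):
    """Count misses in a toy LRU set-associative cache (ways=1 is direct mapped).

    All parameters are hypothetical; this is not a model of any real processor.
    """
    num_sets = num_lines // ways
    # One OrderedDict per set; key order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        block = addr // line_size      # which memory line the address belongs to
        index = block % num_sets       # which set that line maps to
        tag = block // num_sets        # identifies the line within the set
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)     # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
    return misses

# Two addresses that collide in a direct-mapped cache of 8 lines, accessed alternately.
pattern = [0, 8 * 64] * 100
direct = count_misses(pattern, num_lines=8, ways=1)   # every access misses: 200
two_way = count_misses(pattern, num_lines=8, ways=2)  # only the two cold misses: 2
print(direct, two_way)
```

Same total capacity in both runs; only the associativity changes. Real workloads are nowhere near this pathological, so treat the gap as an upper bound on the effect rather than a typical result.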

In order of increasing (worse) hit times and decreasing (better) miss rates,

* direct mapped cache -- the best (fastest) hit times, and so the best tradeoff for "large" caches
* 2-way set associative cache
* 2-way skewed associative cache -- "the best tradeoff for .... caches whose sizes are in the range 4K-8K bytes" -- André Seznec[2]
* 4-way set associative cache
* fully associative cache -- the best (lowest) miss rates, and so the best tradeoff when the miss penalty is very high

If each location in main memory can be cached in either of two locations in the cache, one logical question is: which two? The simplest and most commonly used scheme, shown in the right-hand diagram above, is to use the least significant bits of the memory location's index as the index for the cache memory, and to have two entries for each index. One good property of this scheme is that the tags stored in the cache do not have to include that part of the main memory address which is implied by the cache memory's index. Since the cache tags are fewer bits, they take less area [on the microprocessor chip] and can be read and compared faster.
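The indexing scheme described above can be sketched as a simple address split. The helper name `split_address` is hypothetical, and the 64 KB / 2-way / 64-byte-line geometry in the usage lines is an assumed example configuration chosen to match the 2-way L1 mentioned earlier, so check the vendor documentation before relying on it.

```python
def split_address(addr, cache_bytes, ways, line_bytes):
    """Split an address into (tag, index, offset) for a set-associative cache.

    The low bits select the byte within a line, the next bits select the set,
    and only the remaining high bits need to be stored as the tag.
    """
    num_sets = cache_bytes // (ways * line_bytes)
    offset = addr % line_bytes
    index = (addr // line_bytes) % num_sets
    tag = addr // (line_bytes * num_sets)
    return tag, index, offset

# Assumed geometry: 64 KB, 2-way, 64-byte lines -> 512 sets.
CACHE_BYTES, WAYS, LINE_BYTES = 64 * 1024, 2, 64
NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)

a = 0x12345678
b = a + NUM_SETS * LINE_BYTES  # same set index, next tag value

tag_a, index_a, offset_a = split_address(a, CACHE_BYTES, WAYS, LINE_BYTES)
tag_b, index_b, offset_b = split_address(b, CACHE_BYTES, WAYS, LINE_BYTES)
print(tag_a, index_a, offset_a)
print(tag_b, index_b, offset_b)
```

Addresses `a` and `b` land in the same set, so they compete for the same two ways, and the stored tag is all that distinguishes them. The index and offset bits never need to be stored, which is where the area saving comes from.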

One of the advantages of a direct mapped cache is that it allows simple and fast speculation. Once the address has been computed, the one cache index which might have a copy of that datum is known. That cache entry can be read, and the processor can continue to work with that data before it finishes checking that the tag actually matches the requested address.

The idea of having the processor use the cached data before the tag match completes can be applied to associative caches as well. A subset of the tag, called a hint, can be used to pick just one of the possible cache entries mapping to the requested address. This datum can then be used in parallel with checking the full tag. The hint technique works best when used in the context of address translation, as explained below.

Other schemes have been suggested, such as the skewed cache[2], where the index for way 0 is direct, as above, but the index for way 1 is formed with a hash function. A good hash function has the property that addresses which conflict with the direct mapping tend not to conflict when mapped with the hash function, and so it is less likely that a program will suffer from an unexpectedly large number of conflict misses due to a pathological access pattern. The downside is extra latency from computing the hash function[3]. Additionally, when it comes time to load a new line and evict an old line, it may be difficult to determine which existing line was least recently used, because the new line conflicts with data at different indexes in each way; LRU tracking for non-skewed caches is usually done on a per-set basis. Nevertheless, skewed-associative caches have major advantages over conventional set-associative ones.[4]
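A rough sketch of the skewing idea, for illustration only: way 0 uses the plain index, way 1 XORs the index with the low tag bits. This XOR is a stand-in hash chosen for simplicity; it is an assumption and not the function from Seznec's paper. The name `skewed_indexes` is hypothetical, and `num_sets` must be a power of two for the XOR result to stay in range.

```python
def skewed_indexes(addr, num_sets, line_bytes=64):
    """Return (index_way0, index_way1) for a toy 2-way skewed cache.

    Way 0 is indexed directly; way 1 uses an illustrative XOR hash of the
    index with the low bits of the tag. num_sets must be a power of two.
    """
    block = addr // line_bytes
    index0 = block % num_sets
    tag = block // num_sets
    index1 = index0 ^ (tag % num_sets)
    return index0, index1

NUM_SETS = 512
first = skewed_indexes(0, NUM_SETS)
second = skewed_indexes(NUM_SETS * 64, NUM_SETS)  # one "cache way" further on
print(first, second)
```

The two addresses collide in way 0 but map to different entries in way 1, so a third conflicting address has a better chance of finding a free slot than in a conventional 2-way cache.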

They basically increased the hit rate of the L3 it seems.

**Mechromancer** · 07-27-2008, 12:05 PM

So by increasing the hit rate of the L3 along with adding 4 more megabytes, AMD seems to have improved the work done in each clock cycle (more IPC maybe), right? The CPU is simply more efficient because it makes fewer mistakes, it seems.

Can we get an engineer or somebody in the "know" to explain this to us?

**Bandwidth** · 07-28-2008, 09:13 AM

Don’t read too much into cache associativity. It could be a (small) advantage or a (small) disadvantage in some cases. AMD engineers may have found that increasing the L3 cache associativity to 48-way works better for the Deneb design with 6MB L3 cache. A larger cache and higher associativity can also increase latency. So the question now is this:

Did AMD manage to lower the L3 cache latency on Deneb even though it’s 3x larger and highly associative?

I think they did just that, but the NB has to be clocked higher than 2GHz this time around. I wish it was running 1/1 with the CPU clocks, but heat and power issues may be the problem.

**FlanK3r** · 07-29-2008, 11:37 PM

watch this

with default cooling (box?), real Vcore was 1.475V.

**gurusan** · 07-29-2008, 11:55 PM

flank3r is that stable?

**FlanK3r** · 07-30-2008, 12:21 AM

I don't know, it's from some user (but possibly yes)

**informal** · 07-30-2008, 01:29 AM

I doubt that shot is genuine, but hey, who knows. The red flag was when he stated it's a C2 stepping... No way he got a production stepping in July, at least not from AMD. But if we give it a slight chance it's real, it could have been done on a C0 or C1 Deneb and an SB750 board; that's possible.

**BrowncoatGR** · 07-30-2008, 01:48 AM

link?

**Macadamia** · 07-30-2008, 01:57 AM

Originally Posted by informal

I doubt that shot is genuine, but hey, who knows. The red flag was when he stated it's a C2 stepping... No way he got a production stepping in July, at least not from AMD. But if we give it a slight chance it's real, it could have been done on a C0 or C1 Deneb and an SB750 board; that's possible.

If it's launching in Sep/Oct why not?

Definitely no more time for another stepping to slip in.

**informal** · 07-30-2008, 02:00 AM

Originally Posted by Macadamia

If it's launching in Sep/Oct why not?

Definitely no more time for another stepping to slip in.

My doubts are about the fact he got this chip in the first place. AMD is more secretive than the Mossad (bar the GPU part).
But yes, you are correct, there is no more time for a new stepping.

**Macadamia** · 07-30-2008, 02:11 AM

Originally Posted by informal

My doubts are about the fact he got this chip in the first place. AMD is more secretive than the Mossad (bar the GPU part).
But yes, you are correct, there is no more time for a new stepping.

Actually, Phenoms did leak out (Agena) and that was a big flop (B2 that is)

AMD has learnt to shut up, but maybe in the face of Nehalem, one image can speak enough.

However I still think this is an unintentional leak. Very unintentional. Or the poster has access to the fabs/ is an OEM partner.

I mean, RV770 was gawdawesome and most of you guys didn't know it until the benchmarks came. Even the partners didn't know!

Remember the GT200 superiority and nVidia blowing ATI off planet earth?

**informal** · 07-30-2008, 02:18 AM

Originally Posted by Macadamia

Actually, Phenoms did leak out (Agena) and that was a big flop (B2 that is)

AMD has learnt to shut up, but maybe in the face of Nehalem, one image can speak enough.

However I still think this is an unintentional leak. Very unintentional. Or the poster has access to the fabs/ is an OEM partner.

I mean, RV770 was gawdawesome and most of you guys didn't know it until the benchmarks came. Even the partners didn't know!

Remember the GT200 superiority and nVidia blowing ATI off planet earth?

Yeah, the GT200 was hyped up and declared the winner before the market even saw the RV770 numbers (and they were a "small" surprise).
Anyhow, I do hope the shot is real, especially given that AOD is running a stress test in the background and the 4GHz was done with 1.47V.
If we suppose Deneb is 10-15% faster than Agena, then at this clock 45nm Phenom can be a formidable opponent not only to Penryn, but to Nehalem too.

**possessed** · 07-30-2008, 02:35 AM

Let's hope this gets confirmed too, like the SB750 was. I'm gonna see many Intel fanboys cry, imo.

**superpyton** · 07-30-2008, 08:56 AM

Originally Posted by BrowncoatGR

link?

http://www.overclock.net/amd-cpus/35...om-45nm-2.html

**Oese** · 07-30-2008, 10:08 AM

omfg that would be TOTALLY amazing

**Oc-Ghost** · 07-30-2008, 11:31 AM

I'm not sure, but is that font supposed to look like that in AOD, in the 4GHz screenshot?

Hint: the first core's frequency is in a smaller font than the others. Should it be like that?

Can't check myself as I'm still running K7 here

**HondaGuy** · 07-30-2008, 11:38 AM

I think so, mine is the same way

**cky2k6** · 07-30-2008, 11:48 AM

holy cow... if 4ghz is possible on stock even if unstable, it should be doable on even just good air cooling. if that is the case, i'm selling my intel/nvidia junk, and getting my hands on some 790fx/rv770 goodness. looks like amd might be back in the game...

**Extelleron** · 07-30-2008, 11:51 AM

Originally Posted by cky2k6

holy cow... if 4ghz is possible on stock even if unstable, it should be doable on even just good air cooling. if that is the case, i'm selling my intel/nvidia junk, and getting my hands on some 790fx/rv770 goodness. looks like amd might be back in the game...

Even if it was technically stable, the voltage is too much for 45nm. If you look at Intel's 45nm chips, anything above 1.4V is dangerous. HK/MG may play a role there but still 1.475V on 45nm is like ~1.55-1.6V on 65nm.

If that is true though then a stable overclock to 3.7-3.8GHz should be possible with reasonable voltage. That's a huge improvement over 65nm B3.

**Oese** · 07-30-2008, 12:21 PM

Indeed... but 1.475V, or a comparable 1.55-1.6V, is not too much for testing or just a 4GHz screenshot.

**Mechromancer** · 07-30-2008, 12:32 PM

Don't get my hopes up! I will eStab you with my iStab v1.04, FlanK3r!

Seriously though, if Deneb can do 4GHz@1.475V with a stock cooler then it's time to buy a little AMD stock.

**Kasparz** · 07-30-2008, 01:10 PM

Originally Posted by informal

I doubt that shot is genuine, but hey, who knows. The red flag was when he stated it's a C2 stepping... No way he got a production stepping in July, at least not from AMD. But if we give it a slight chance it's real, it could have been done on a C0 or C1 Deneb and an SB750 board; that's possible.

Retail Deneb CPUs are ready; they just need to make enough before release. There have been both Intel and AMD CPUs in retail with manufacturing/packing dates well before the official release.

**MrMojoZ** · 07-30-2008, 01:17 PM

I choose to believe that screen is a fake in order to not get my hopes up.

**Psychlone** · 07-30-2008, 01:38 PM

Why not believe it? I can pull 3.51GHz out of my 9850BE right now, so another 500MHz really isn't asking too much from a new fab process...

http://valid.x86-secret.com/show_oc.php?id=391344

I don't believe for a second that it's *NOT* real - for 3.51GHz, my 'golden' JAAHB 0816 GPMW only needs 1.375V, so the jump to 4GHz on the new 45nm process could possibly require a bit more voltage to be stable there - oh, and a chipset that allows it along with the new ACC (can we say 790G/GX w/ SB750???)

I, for one, am ALL OVER this (both the new 45nm and the 790G/GX/SB750) when they come out!!!

Psychlone

**Devil's Prophet** · 07-30-2008, 01:43 PM

Well, Psychlone, aren't you VERY lucky with your Phenom batch? Because 3.5GHz with only 1.375V is nothing short of spectacular. Or is that kind of achievement much more common than I thought? I'm reading stories everywhere of 9850BEs that could only reach 3GHz-ish, regardless of voltage.

But next to your overclock, the oc in the pic seems very doable. At least with some cherry picked phenom that is. It's best not to think of this as your average 45nm overclock, right?
