Thread: Nehalem 101 part1 - 1366 and X58

  1. #1
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147

    Hey guys,

    I thought I'd post some details so you're already prepared for Nehalem and know how to clock it.
    I'll try to post more details here bit by bit as I have the time.

    What CPU should I get?
    Well, that depends on your budget of course.
    Here is which CPU I would recommend to whom:

    920: ~300US$
    people who want to try the latest platform
    people who want sli on an intel system or might want it in future
    occasional gamers
    hardcore gamers
    workstation users
    overclockers on air

    940: ~600US$
    workstation users
    overclockers on water
    overclockers on xtreme cooling
    3dmark fetishists

    965: ~1000US$
    demanding workstation users
    overclockers on xtreme cooling*
    3dmark fetishists *
    bandwidth fetishists
    people with a big budget seeking the best of the best

    *Several people report they can't hit 5GHz with the 965 chips on LN2, so you might want to wait and see whether the extra 400$ for a 965 over a 940 is worth it. If most CPUs max out below 5GHz, then the only advantage of the 965 is better memory performance thanks to its unlocked memory and Uncore multipliers, which might not be worth an additional 400$.

    What overclocks can I expect?
    Most 965 chips don't seem to hit 5GHz on LN2...
    So far a 940 should be good enough to max out the current chips under LN2 (23x200=4600MHz).
    The gains from LN2 are rather small for most chips, only around 300MHz, and temps lower than -20C don't seem to gain much at all.
    Most chips don't like voltages higher than around 1.5v and barely scale above that.

    965:
    ~4.5Ghz 3d stable with all 4 cores on ln2
    ~3.8Ghz 3d stable with all 4 cores on air
    ~2100Mhz tri channel stable on air
    ~220Mhz Bclock
    ~200Mhz Bclock stable on air
    ~8GT/s or 4Ghz or below QPI speed, cooling doesnt really matter

    940:
    max 5Ghz (23x220)
    ~4.5Ghz 3d stable with all 4 cores on ln2
    ~3.8Ghz 3d stable with all 4 cores on air
    ~2100Mhz tri channel stable on air
    ~220Mhz Bclock
    ~200Mhz Bclock stable on air
    ~8GT/s or 4Ghz or below QPI speed, cooling doesnt really matter

    920:
    max 4.5Ghz (21x220)
    ~4.5Ghz 3d stable with all 4 cores on ln2
    ~3.8Ghz 3d stable with all 4 cores on air
    ~2100Mhz tri channel stable on air
    ~220Mhz Bclock
    ~200Mhz Bclock stable on air
    ~8GT/s or 4Ghz or below QPI speed, cooling doesnt really matter

    As you can see, all CPUs seem to clock about the same so far, with a few 965 CPUs clocking to 5GHz and higher.
    Cooling is definitely a big limitation, as Core i7 CPUs run very hot above stock voltage and speed.
    However, similar to K10, most CPUs don't like to run at very low temperatures, and the gain from LN2 over watercooling should be very small.
    There are some good CPUs that clock very well at cold temps of around -100C, but those seem to be the big exception so far.
    For most CPUs there seems to be no or almost no scaling below around -40C, so phase change cooling will have a big revival it seems.



    What Performance gains can I expect?
    In audio/video/CAD and other highly multithreaded applications, higher clockspeeds should result in notable performance gains.
    Unfortunately you won't see very high gains if you look at games...
    Going from DDR3 1066 to DDR3 1600 barely gives any boost in games or 3D benchmarks. It's great for synthetic memory benchmarks, but that's about it. According to an article on bit-tech.net, going from DDR3 1066 7-7-7 to DDR3 1600 7-7-7 doubled their memory bandwidth in synthetic tests, but this memory overclock plus going from 2.66GHz to 4GHz with a GTX 280 gave them a boost of only:

    4fps in crysis (~10% boost)
    6fps in far cry2 (~10% boost)
    20fps in hl2 (~10% boost)

    So a 50% boost in CPU clocks and a 50% boost in memory clocks results in only a 10% performance boost in games...
    Their overclocked 920 was faster than a 965, yet there are only very minor gains in gaming performance.
    So spending 3x the money for a 965 over a 920 doesn't make any sense whatsoever if your main focus is gaming.
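
    To put rough numbers on that, here is a quick sketch of the scaling in plain Python. The function name is made up for illustration, the clock figures are the ones quoted above, and the ~10% fps gain is simply taken as given from the game results:

```python
def percent_gain(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


cpu_gain = percent_gain(2.66, 4.0)   # CPU clock in GHz
mem_gain = percent_gain(1066, 1600)  # DDR3 speed rating
fps_gain = 10.0                      # approximate gain quoted for the games above

# Share of the CPU overclock that actually shows up as extra fps.
scaling = fps_gain / cpu_gain

if __name__ == "__main__":
    print(f"CPU +{cpu_gain:.0f}%, memory +{mem_gain:.0f}%, games +{fps_gain:.0f}%")
    print(f"roughly {scaling:.0%} of the clock increase reaches the frame rate")
```

    In other words, only about a fifth of the extra clockspeed turns into extra frames in these particular tests.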

    Clocks:
    Instead of the FSB, Nehalem uses a reference clock (Bclock) like AMD, but it's at 133MHz and not 200MHz.
    This reference clock is multiplied to create all clocks inside the CPU:

    Bclock x CPU Multiplier = CPU Clock (12-256)
    Bclock x QPI Multiplier = QPI Clock (2-30)
    Bclock x Uncore Multiplier = Uncore Clock (2-30)
    Bclock x Memory Multiplier = Memory clock (2-30)

    available/working multipliers on 965XE
    Bclock x CPU Multiplier = CPU Clock (12-256)
    Bclock x QPI Multiplier = QPI Clock (18, 20, 24)
    Bclock x Uncore Multiplier = Uncore Clock (10-30)
    Bclock x Memory Multiplier = Memory clock (6-30) (max is 14x since Uncore Multiplier needs to be 2x or higher than Memory Multiplier)

    available/working multipliers on 940
    Bclock x CPU Multiplier = CPU Clock (22)
    Bclock x QPI Multiplier = QPI Clock (18)
    Bclock x Uncore Multiplier = Uncore Clock (16)
    Bclock x Memory Multiplier = Memory clock (6,8) (retail chips might have higher memory multiplier unlocked)

    available/working multipliers on 920
    Bclock x CPU Multiplier = CPU Clock (20)
    Bclock x QPI Multiplier = QPI Clock (18)
    Bclock x Uncore Multiplier = Uncore Clock (16)
    Bclock x Memory Multiplier = Memory clock (6,8) (retail chips might have higher memory multiplier unlocked)


    default Multipliers on 965XE
    133 x 24 = 3200MHz CPU
    133 x 24 = 3200MHz QPI (6.4GT/s)
    133 x 20 = 2666Mhz UnCore
    133 x 10 = 1333Mhz Mem (DDR3 1333)

    default Multipliers on 940
    133 x 22 = 2933MHz CPU
    133 x 18 = 2400MHz QPI (4.8GT/s)
    133 x 16 = 2133Mhz UnCore
    133 x 8 = 1066Mhz Mem (DDR3 1066)

    default Multipliers on 920
    133 x 20 = 2666MHz CPU
    133 x 18 = 2400MHz QPI (4.8GT/s)
    133 x 16 = 2133Mhz UnCore
    133 x 8 = 1066Mhz Mem (DDR3 1066)
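
    The default clock tables above can be reproduced with a few lines of Python. This is only a sketch: the dictionary and function names are invented for illustration, the Bclock is taken as exactly 133.33MHz (400/3), and the QPI GT/s figure assumes two transfers per clock, which matches the 2400MHz = 4.8GT/s pairing in the tables:

```python
from fractions import Fraction

# Nominal reference clock: 133.33 MHz, kept exact to avoid rounding noise.
BCLK_MHZ = Fraction(400, 3)

# Default multipliers as listed in the tables above.
DEFAULT_MULTIPLIERS = {
    "965XE": {"cpu": 24, "qpi": 24, "uncore": 20, "mem": 10},
    "940": {"cpu": 22, "qpi": 18, "uncore": 16, "mem": 8},
    "920": {"cpu": 20, "qpi": 18, "uncore": 16, "mem": 8},
}


def derived_clocks(bclk_mhz, multipliers):
    """Return every clock domain in MHz as Bclock x multiplier."""
    return {name: bclk_mhz * mult for name, mult in multipliers.items()}


def qpi_gt_per_s(qpi_mhz):
    """Convert a QPI clock in MHz to GT/s, assuming two transfers per clock."""
    return 2 * qpi_mhz / 1000


if __name__ == "__main__":
    for model, mults in DEFAULT_MULTIPLIERS.items():
        clocks = derived_clocks(BCLK_MHZ, mults)
        # int() truncates, matching how the tables write 2666 and 2933.
        summary = {name: int(mhz) for name, mhz in clocks.items()}
        print(model, summary, f"QPI {float(qpi_gt_per_s(clocks['qpi'])):.1f} GT/s")
```

    Passing a different Bclock (say 200) into derived_clocks shows at a glance how far every domain moves when you overclock via the Bclock.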


    default voltages on 965XE:
    default voltages on 940:
    default voltages on 920:


    Steppings:
    retail Core i7 chips will be C0 stepping
    some ES Core i7 and nehalem based Xeon chips are B stepping chips
    B stepping chips clock a few hundred mhz higher than C stepping chips, at least on Ln2
    C stepping chips can take more vdimm, vtt and vcore than B stepping chips without burning the chip, but clock slightly worse

    Overclocking:
    If you have an XE CPU it's all easy: you just change the CPU Multiplier to increase the CPU clockspeed, that's it...
    If you don't have CPU multiplier access, then things get a bit tricky.
    You need to push up the Bclock, which all other clocks are derived from... so when you increase the Bclock you overclock every part of the processor: QPI, Uncore and the memory controller/memory.

    Crime Scene Investigation errr I mean QPI:
    You don't really want to overclock QPI though...
    Just like HyperTransport, QPI is way overpowered for a desktop system and only really useful for servers, and you won't see any notable improvements from clocking the QPI bus higher. So while overclocking it won't really give you any benefits, that's what you have to do if you want to overclock your CPU via the Bclock.

    It usually maxes out at around 8GT/s, which is around 4GHz, and as you have seen from the details above, there are only 2 multipliers you can select unfortunately, which will limit your overclocks on Bclock for the 920 and 940 parts. The lowest multiplier you can select is 18. If you do the maths (4000MHz / 18), that means you will be limited to a Bclock of around 222MHz. That is IF your processor and board can run 4GHz QPI... some can't run that high...
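
    The QPI ceiling works out like this in code. Again just a sketch with made-up function names; the 4000MHz limit and the 18x multiplier are the figures from the paragraph above, and your chip or board may top out lower:

```python
def max_bclk_for_qpi(qpi_limit_mhz, qpi_multiplier):
    """Highest Bclock (MHz) that keeps Bclock x QPI multiplier at or under the limit."""
    return qpi_limit_mhz / qpi_multiplier


def cpu_clock_mhz(bclk_mhz, cpu_multiplier):
    """CPU clock in MHz that results from a given Bclock and CPU multiplier."""
    return bclk_mhz * cpu_multiplier


if __name__ == "__main__":
    bclk = max_bclk_for_qpi(4000, 18)
    print(f"max Bclock at 18x QPI: {bclk:.0f} MHz")
    for cpu_mult in (20, 22):  # default multipliers of the 920 and 940
        print(f"{cpu_mult}x CPU multiplier -> {cpu_clock_mhz(bclk, cpu_mult):.0f} MHz")
```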

    Can I get an Uncore do you want more?
    Now about Uncore: Uncore is the L3 cache and memory controller, and it's powered not by Vcore but by VTT. I guess Intel couldn't decide what to call this part of the CPU and then went for Uncore, which pretty much means it's not the core... not very creative, but it gets the job done.
    I'm sure you've seen many if not all mainboard manufacturers using one, two or even 4 phases of a CPU PWM to power the VTT. While on Core 2 VTT consumed around 10-15W and you could use a basic PWM to feed it, it now needs double that power, around 30W, and when you're overclocking even more, naturally...

    3-some Memories: mhhh good times, good times
    The Uncore multiplier needs to be 2x or higher than the memory multiplier.
    Since Uncore is locked to 2133MHz (16x) on the 920 and 940, memory multipliers higher than 8x are not available (retail chips might have higher memory multipliers unlocked). So to reach a higher memory clock than 1066 you either need an XE CPU, which lets you overclock Uncore and use 10x, 12x and 14x memory multipliers, or you overclock the Bclock, which will raise the memory clock as well. But with the highest multiplier of 8x you will only just reach DDR3 1600 at a Bclock of 200 (8 x 200 = 1600), so memory overclocking seems to be limited on the 920 and 940 chips! How big the impact of speeds beyond DDR3 1600 really is depends highly on the application though...
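
    Here is the Uncore/memory rule as a small Python sketch. The function names are invented, and the candidate list (6, 8, 10, 12, 14) is an assumption pieced together from the multipliers mentioned in this post, not an official list:

```python
def allowed_memory_multipliers(uncore_multiplier, candidates=(6, 8, 10, 12, 14)):
    """Memory multipliers that satisfy: Uncore multiplier >= 2 x memory multiplier."""
    return [m for m in candidates if uncore_multiplier >= 2 * m]


def memory_clock(bclk_mhz, memory_multiplier):
    """Effective DDR3 rating that results from a Bclock and memory multiplier."""
    return bclk_mhz * memory_multiplier


if __name__ == "__main__":
    # Locked 16x Uncore (920/940) versus higher Uncore multipliers on an XE.
    for uncore in (16, 20, 28):
        print(f"Uncore {uncore}x -> memory multipliers {allowed_memory_multipliers(uncore)}")
    # Best case on a locked chip: 8x memory multiplier at a Bclock of 200.
    print("8x at Bclock 200 ->", memory_clock(200, 8))
```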

    It also seems to be quite useful to overclock the Uncore even if you don't want to run huge memory clocks. Remember it contains the L3 cache, so overclocking it will reduce L3 cache latencies, and we all know that the higher the memory controller is clocked, the more efficiently it works and the higher the bandwidth, even if the memory is still running at stock. So high Uncore clocks with low memory clocks at tight latencies will definitely be a sweet combo! Again, memory enthusiasts will NEED an XE to fully max out the memory potential even at low clocks, since only the XE CPUs allow Uncore overclocking... (retail 920 and 940 chips might have higher Uncore multipliers unlocked)

    Does high Vdimm kill Core i7?
    Some say yes, some say no...
    For those of you who remember AMD going IMC some years back, you might remember that the 90nm shrink brought some problems with it... high vdimm could kill or degrade the integrated memory controller. How did we work around that back then? Vcore had to be increased as well to keep the vcore/vdimm ratio more or less the same, and things were fine. Later, when AMD moved to K10, the same thing happened again, with some IMCs dying at 1.9v DDR2 vdimm if vcore and other related voltages were kept low.
    In AMD's IMC implementation the memory controller was actually powered by vcore, so that's most likely why the vcore/vdimm ratio had to be maintained. On Nehalem/Core i7 the memory controller is powered by VTT...

    Intel recommends a max vdimm of 1.65v, which is curiously 1.5x VTT (1.65v / 1.5 = 1.1v)...
    Several people reported that running higher than default VTT plus higher than 1.65v vdimm works just fine. How come?
    From what I know about manufacturing processes, you have to pick the target voltage you want to work with at some point and then decide what transistor design to use. Some transistors can take high voltages but switch slowly and are rather beefy; others are small and can switch much faster but will degrade with higher voltages, which is exactly what people reported with vdimm-damaged Core i7 CPUs. One way to work around this and stress the transistors less is to reference them not to ground but to some other voltage.

    I don't know why or how, but vdimm is definitely related to VTT, and the fact that Intel recommends a max vdimm of 1.5x VTT is not a coincidence if you ask me... It's not clear whether this 1.5x ratio really exists and has to be maintained, or whether vdimm just has to stay within a certain voltage range of VTT, but in either case the resulting highest vdimm/VTT ratios are about the same... Once we have relatively cheap 920 CPUs available and people can risk burning a CPU, I'm sure it will only be a matter of weeks until we know more about vdimm/VTT on Core i7.
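
    Just to illustrate the two guesses above side by side, here is a tiny sketch. Everything in it is an assumption: the 1.1v default VTT is inferred from 1.65v / 1.5, neither rule is confirmed by Intel, and this is NOT a safe-voltage calculator:

```python
DEFAULT_VTT = 1.10          # assumption: inferred from 1.65v / 1.5
DEFAULT_VDIMM_LIMIT = 1.65  # Intel's recommended max vdimm, per the text above


def vdimm_limit_ratio(vtt, ratio=1.5):
    """Guess 1: vdimm may be at most `ratio` times VTT."""
    return round(vtt * ratio, 3)


def vdimm_limit_offset(vtt, offset=DEFAULT_VDIMM_LIMIT - DEFAULT_VTT):
    """Guess 2: vdimm may sit at most a fixed `offset` volts above VTT."""
    return round(vtt + offset, 3)


if __name__ == "__main__":
    # Both guesses agree at default VTT and drift apart slowly as VTT rises.
    for vtt in (1.10, 1.20, 1.30):
        print(f"VTT {vtt:.2f}v -> ratio rule {vdimm_limit_ratio(vtt):.3f}v, "
              f"offset rule {vdimm_limit_offset(vtt):.3f}v")
```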



    High Uncore and memory clocks are possible without high vdimm: we hit DDR3 2200 with 1.7v vdimm on BloodRage a few days ago.


    Setfsb Note:
    Once you boot at a certain Bclock, you will only be able to bump it up by around 25MHz before the board gets unstable.
    Most likely it's caused by the chipset and memory training during bootup: Core i7 has a built-in routine that automatically fine-tunes the chipset and memory during bootup.




    Here are some recommended settings for you that I got from a friend at Intel.
    If you're reading this, cheers man!

    How to find max Bclock:
    Bclock 133Mhz
    CPU Multiplier 12x/14x
    Uncore Multiplier 16x or higher
    QPI Multiplier 18x
    Memory Multiplier 8x or lower
    Vcore 1.25v
    VTT 1.45v
    Vdimm 1.65v

    push up the Bclock and increase VTT and QPI volts to get higher.
    Max 100% safe VTT seems to be 1.65v


    How to find max Mem clock:
    Bclock 133Mhz
    CPU Multiplier 12x/14x
    Uncore Multiplier 16x or higher
    QPI Multiplier 18x
    Highest Memory Multiplier
    Vcore 1.25v
    VTT 1.45v
    Vdimm 1.65v

    push up the Bclock and increase VTT and Vdimm to get higher.
    more vdimm and vtt than 1.65v at your own risk
    1.8vdimm should be 100% safe though



    How to find max CPU Clocks on i7-965:
    Bclock 133Mhz or lower
    CPU Multiplier 24x
    Uncore Multiplier 16x or higher
    QPI Multiplier 18x
    Memory Multiplier 8x or lower
    Vcore 1.25v
    VTT 1.45v
    Vdimm 1.65v

    reduce Bclock, then increase the cpu multipliers step by step.
    push up vcore to get higher clocks stable


    How to find max CPU Clocks on i7-940/i7-920:
    max Bclock
    CPU Multiplier 15x
    Uncore Multiplier 16x or higher
    QPI Multiplier 18x
    Memory Multiplier 8x or lower
    Vcore 1.25v
    VTT 1.45v
    Vdimm 1.65v

    Increase the CPU multiplier step by step and push up vcore to get higher clocks stable. If you can't get a multiplier stable, reduce Bclock until you're stable.
    Max Bclock for high multis is slightly lower than for low multis.
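
    The procedure for the 920/940 can be sketched as a simple search loop. This is only an illustration: is_stable is a hypothetical callback standing in for your own stress test at a given Bclock and multiplier, the 5MHz back-off step is arbitrary, the starting point is assumed to be stable already, and vcore tuning is left out entirely:

```python
def find_max_multiplier(bclk, start_mult, max_mult, is_stable,
                        bclk_step=5, min_bclk=133):
    """Raise the CPU multiplier one step at a time, backing off Bclock when a step fails.

    Returns the last (bclk, multiplier) pair that passed `is_stable`.
    """
    mult = start_mult
    while mult < max_mult:
        candidate = mult + 1
        trial_bclk = bclk
        # Lower Bclock until the new multiplier passes or we run out of room.
        while trial_bclk >= min_bclk and not is_stable(trial_bclk, candidate):
            trial_bclk -= bclk_step
        if trial_bclk < min_bclk:
            break  # this multiplier never became stable, keep the previous result
        mult, bclk = candidate, trial_bclk
    return bclk, mult


if __name__ == "__main__":
    # Toy stand-in for a real stress test: pretend the chip tops out at 4000 MHz.
    def fake_stress_test(bclk_mhz, multiplier):
        return bclk_mhz * multiplier <= 4000

    print(find_max_multiplier(200, 15, 21, fake_stress_test))
```

    With a real chip the stability check is hours of testing per step, so treat the loop as a checklist rather than something to automate blindly.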


    The return of the Turbo-Button?
    The max turbo multipliers are stored in separate registers from the "normal" max allowed CPU multipliers, and they cannot be changed (unless you hack the CPU). The turbo multipliers cannot be forced to stay on all the time... at least not directly... you can fool the CPU into believing all the conditions that are necessary to run turbo mode are there, even if they are not.
    The highest boot multiplier is the highest "normal" multiplier...

    for the i7-965 it works like this:

    IF turbo is disabled
    OR vcore is unstable
    OR current is too high
    OR Tdp is too high:

    4 cores will run at 24x133=3200Mhz
    3 cores will run at 24x133=3200Mhz
    2 cores will run at 24x133=3200Mhz
    1 core will run at 24x133=3200Mhz

    IF Turbo is enabled
    AND vcore is stable
    AND current is ok
    AND the TDP is below the limit

    4 cores will run at 25x133=3333Mhz
    3 cores will run at 25x133=3333Mhz
    2 cores will run at 25x133=3333Mhz
    1 core will run at 26x133=3466Mhz


    for the i7-940 it works like this:

    IF turbo is disabled
    OR vcore is unstable
    OR current is too high
    OR Tdp is too high:

    4 cores will run at 22x133=2933Mhz
    3 cores will run at 22x133=2933Mhz
    2 cores will run at 22x133=2933Mhz
    1 core will run at 22x133=2933Mhz

    IF Turbo is enabled
    AND vcore is stable
    AND current is ok
    AND the TDP is below the limit

    4 cores will run at 22x133=2933Mhz
    3 cores will run at 22x133=2933Mhz
    2 cores will run at 22x133=2933Mhz
    1 core will run at 23x133=3066Mhz

    So you see, turbo is really not that useful for most of the CPUs. It's only useful for the 965, where it basically auto-overclocks the CPU... so basically the Core i7 lineup looks like this:

    920 2800Mhz BUT if vcore is unstable, current is too high, or temps are too high, it will run slower.
    940 3066Mhz BUT if vcore is unstable, current is too high, or temps are too high, it will run slower.
    965 3466Mhz BUT if vcore is unstable, current is too high, or temps are too high, it will run slower.

    What turbo really is, is a more advanced form of throttling the CPUs from those clockspeeds down to a SAFE speed that will work even with high temps, fluctuating vcore and high current. Instead of advertising a 3466MHz CPU that throttles down to 3333MHz or less depending on the situation, which would cause a lot of complaints, the CPUs are rated at 3200MHz and everything above that is a BONUS, so nobody can complain.
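
    The turbo conditions listed above boil down to a small decision function. This is a sketch of the logic as described in this post, not Intel's actual implementation; the table and function names are made up, and the multipliers are the ones from the 965 and 940 lists:

```python
# Multipliers as listed above: base, turbo with 2-4 active cores, turbo with 1 active core.
TURBO_TABLE = {
    "965": {"base": 24, "turbo_multi_core": 25, "turbo_single_core": 26},
    "940": {"base": 22, "turbo_multi_core": 22, "turbo_single_core": 23},
}


def effective_multiplier(model, active_cores, turbo_enabled,
                         vcore_stable, current_ok, tdp_ok):
    """Return the CPU multiplier the chip would run at under the given conditions."""
    entry = TURBO_TABLE[model]
    # Any single failed condition drops the chip back to its base multiplier.
    if not (turbo_enabled and vcore_stable and current_ok and tdp_ok):
        return entry["base"]
    if active_cores == 1:
        return entry["turbo_single_core"]
    return entry["turbo_multi_core"]


if __name__ == "__main__":
    for cores in (4, 1):
        mult = effective_multiplier("965", cores, True, True, True, True)
        print(f"965, {cores} active core(s), all conditions met: {mult}x")
    print("965, TDP over the limit:",
          f"{effective_multiplier('965', 1, True, True, True, False)}x")
```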


    This is it for now, I'll add more details later on. If you have any questions, just ask.
    Attached image: Nehalem vdimm vtt.PNG
    Last edited by saaya; 01-03-2009 at 12:14 AM.
