Thread: How to set up GTL Ref Values for 45nm & 65nm

  1. #11
    Xtreme Mentor
    Join Date
    Aug 2008
    Location
    Australia
    Posts
    3,035
    Quote Originally Posted by mikeyakame View Post
    Cryptik,

    Tonight I had to read more Intel datasheets on something unrelated and I debunked my own theories on Asus' differing GTLs. I actually have an explanation now that isn't based on hunches! It seems if I had read thoroughly over all the FSB pin definitions I would have caught it straight away!

    Address strobes are common clock source synchronous. Data strobes are differential clock source synchronous.

    So why do Asus use 0.635 for data and 0.667 for address?

    Address being common clock source synchronous means that the strobe is set up across multiple clock domains during the same clock edge, and synchronized to the source of the strobe.

    Data being differential clock source synchronous means the strobe is set up at the target destination on the next falling edge of either differential clock, so the clock domains of source and target never actually come into contact with each other. The data strobe is simply synchronized based on when the source set it up, and is accessed at the target on the following falling edge on the host bus, not the source.

    Now I see why Data is lower than address.

    Ringback on low and high swing has the potential to wreak havoc when it leaks into two clock domains that have an open strobe on the same clock, and which would otherwise never cross talk even if their frequencies oscillate similarly to one another on different domains. Cross talk from ringback can only occur when a common clock is used for a strobe; it's otherwise harmless on differential clocks. Jitter on a common clock is as harmless as ringback on a differential clock. Asus understand that this is true in practice, judging by their decisions on what reference voltage point to use for the associated strobe on the data or address bus. Jitter across differential clocks is bad because it causes skewing and deviation when calculating the following falling clock edge on the inverse clock. If the following clock period actually takes place earlier or later than the maximum stability window allows, there is no way to be sure that the target won't begin to read the strobe before it begins, contaminating the strobe with something completely different, or after it begins, missing the first bursts of data and leaving incomplete data at the endpoint. Checking that data is complete and valid is optional, and it is up to the endpoint device to ensure that what it received is the same as what was sent.

    So on a data strobe you want to minimize jitter at the expense of harmless ringback, so you set the GTL for data bus strobes at the same value as the Host FSB GTL reference to more precisely time when to set up the strobe and how long to hold it, making sure the target can successfully read successive strobes for the completion of the data transaction. The clock domains will never exist on the same clock so any ringback within the valid period window won't do anything other than look cool on an oscilloscope. There is your 0.63x.

    On an address strobe, different clock domains being exposed to each other on a common clock means you want to eliminate ringback, because all the shielding and pulling down on the input voltage you do won't make the slightest difference if both clock domains have clock frequencies with similar oscillation and resonance characteristics. If they cross talk between device buses there is no way to know this, as cross talk from source bus to destination bus appears as what should be part of an address strobe transaction, even if it's actually random data bits from a data strobe on the same falling clock edge for a completely different transaction. Address strobe data is highly critical: even minute random pollution can completely alter the data the MCH needs to determine where a following data strobe needs to be requested to, or change a bit in the destination address so that, when decoded, it ends up initiating a data strobe to a completely unrelated device bus and completely corrupting a data strobe it is accessing from a different source. Data all looks the same, so you rely on the MCH to decode and route it properly and leave responsibility for validity to the target. Address is all completely different, and decoding is based on a set of hexadecimal identifiers supplied so that you don't need to know where it actually lives or how to get to it, just that it is there and you want to communicate with it. Asus uses 0.67 to eliminate ringback cross talk between clock domains at the expense of more jitter, because jitter causes skewing, and skewing is only a problem when timing strobes across differential clocks, especially when the low clock's data is inverse to its value and the high clock's isn't. A != A on low clock, F != 0 on high clock; mess the timing up and AD0F from low clock is read as AD0F on high instead of 52F0, which is what the source device burst on to the Host FSB bus.
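The AD0F versus 52F0 example is a plain bitwise inversion. A minimal Python sketch of that arithmetic, assuming a 16-bit word only because the example uses four hex digits; the function name is illustrative and does not come from any datasheet:

```python
def invert_word(value: int, width_bits: int = 16) -> int:
    """Return the bitwise complement of value, limited to width_bits bits."""
    mask = (1 << width_bits) - 1
    return ~value & mask

# Value seen on the inverted (low) clock in the example above.
seen_on_low_clock = 0xAD0F

# Undoing the inversion recovers what the source actually sent.
print(f"{invert_word(seen_on_low_clock):04X}")  # prints 52F0
```

Reading AD0F without applying the inversion is exactly the mistimed case described: the bits arrive intact but are interpreted with the wrong polarity.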


    Summary.

    0.63 for Data uses the ringback resistance of differential clock bus access for source and target to eliminate clock jitter when timing strobe access between source and target. It is set to the same reference point as Host Bus GTL, aka NB_GTLREF, because all clocks are referenced from host BCLK, and what better reference to use than the same point being used to drive the clocks you are timing against: clocks driven from the same source at the destination accessing the strobe on the following clock edge. Chances of clocks skewing and deviating too far are decreased dramatically; that's the critical factor here.

    0.67 for Address uses jitter's ineffectiveness to raise the reference point as compensation, to minimize or eliminate ringback into the valid clock period, which has the potential to pollute and disrupt system stability or activity. A higher sample point for every calculated clock period decreases the risk of address strobe contents being altered by a concurrent data strobe on the same falling clock edge as the address strobe is set up across the bus. What the source sends and the target receives isn't checked, to minimize the cost of the transaction; it is assumed that what the target receives is what the source originally sent.

    Even with making sense of this, I feel so much more confused about other things. Datasheets are evil, they lure you in with the dream of answers, but in reality just pose more questions that they can't give you answers for!

    The above also fits with my understanding and the practical results I've found from using both NB & CPU clock skews together with GTL reference points, as it appears they are designed to go hand in hand and to be set up to complement each other's strengths and weaknesses, most importantly when Host BCLK frequency becomes fairly high (though quite a bit lower on quad cores than dual cores) and minimum criteria for operation can no longer be met for 100% of clocks on FSB and corresponding clocks driven on CPU and MCH from the Host FSB BCLK.

    GTL reference points are extremely important for calculating highly critical values that correspond to handling data and timing strobes between clocks, and because of this they have a weak ability to compensate for small skew deviations between differential clocks, provided that the deviation at worst never exceeds the maximum stability criteria determined by the voltage crossing point between low and high swings. From what I now realize, using them solely to do this seems to result in perfectly stable system behaviour at high FSB until CPU and MCH clock timing becomes critical during heavy bus transactions, such as the type Linpack puts in motion, at which point the system will hard lock, sometimes after a minute, other times after 5-10, but the result is the same. The deviation between the previous voltage crossing point and the following one was greater than 150ps and the previous falling edge data strobe polluted the adjacent rising edge address strobe on the inverse clock. No amount of GTL or Vtt adjustment alone can correct it at this point: if deviation averages 120-130ps, then all it takes is a bit too much jitter on the valid clock period for a data strobe and the deviation has the potential to jump beyond that 150ps maximum tolerance.

    NB and CPU clock skews can't fix problems which are a result of poorly set up GTL reference points, so changing these to fix this kind of problem is like changing the tyres on a car that won't run because it has a flat battery. The car might have spanking new tyres but this means squat if your battery is still flat, since you can't drive it anyway. What NB & CPU clock skews are designed to do is make static corrections to when the output clock needs to be driven for the MCH and for the CPU with respect to the voltage crossing point, which the GTL reference for the Host FSB clock (NB GTLREF) is used to determine. If this is set correctly, then the skews can correct the skewing that occurs from all the extra NB voltage we are pumping in to get high FSB stable in the first place. (That voltage is actually slowly becoming an obstacle that will halt further potential, and heat just amplifies it, which is why water cooled NBs give more consistent FSB clocks under similar conditions: a constant lower temperature acts to reduce the onset of symptoms and problems.) Setting up the timing for driving CPU & MCH output clocks correctly from the Host BCLK reference will take over the dynamic compensation that GTLs were being incorrectly used to handle, if you either intentionally or accidentally managed to set them up to do that. If the problem lies with the design of the board itself, with its VR limitations or with compensation circuit components that are designed to only handle so much, then no amount of skew or GTL adjustment will change the results from this point, even if the memory and CPU can do it.

    Most 8 phase X48 boards will only do near 500MHz FSB, and this is down to the VR and capacitance compensation limitations of their designs. P45 boards, even with 16 phases, unless priced at the same point as their X48 counterparts, are crippled by the use of cheaper components to save production costs, and it becomes like winning the lottery: some win first try, others never win at all. If you are lucky enough to get a P45 board that does somehow pull over 500MHz FSB on a quad core then you are one of very few who have actually achieved it.

    The only two boards designed for this task are the Asus Rampage Extreme and Foxconn Blackops. When not crippled by either CPU or DRAM falling flat on their face first, both can and will achieve some very high, extremely stable FSB frequencies, as a result of either never exceeding their design or not finding a CPU that is perfect enough to exceed their abilities.

    If, after correctly setting up both clock skews and GTLs, you are still stuck and can't get further, there is a damn good chance that the CPU is losing coherency or the DRAM timings cannot be as tight as required because of inconsistent IC limitations or ability. Sometimes it can be this; more often than not it is what happens when you become consumed with making one set of values work that just can't work. If this happens, the best thing to do is step back for a bit, start from the beginning at a different point and make your way back up there with new values. Being able to cut your losses and start fresh will be the difference some of the time between making a setup work or not at a certain FSB frequency. Not all FSB frequencies are worth using anyway; sometimes 1 or 2MHz either way is all it takes to get the system running smooth.
    Sorry, I haven't been in here in a while. That description of the 0.63x data and 0.667x address makes perfect sense, however does the same situation exist when you factor in increased vcore and very large FSB/clock frequency increases? Is the data bus as resistant to ringback and is the address bus as resistant to jitter under the completely different circumstances created by overclocking? I found with my system that the data bus is the most sensitive (perhaps the ringback starts to need compensating for at higher bus/clock speeds) and responds to (is stabilized by) increased Vref much more than the address bus, which can almost be left at 0.667x with a slightly raised Vtt even up to ~534 FSB.

    I have not had much luck setting the Vref for the data bus the same as the NB Vref. My board's NB default multi is 0.640x whilst the default data bus multi is 0.635x. To be completely stable, my NB GTL Ref is -40mv and the data bus GTL Ref is +60mv; my NB will not tolerate 0.64x + GTL Refs and the CPU will not tolerate less than 0.635x on the data bus when at high speeds (4GHz+). This may be a function of the changes occurring due to overclocking and the requirements of both the CPU and NB under these conditions. It seems, at least with my system, that at elevated speeds the NB and CPU need to be tuned separately to maintain valid data transfer between source and destination.
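To see how far apart those two references end up, here is a rough Python sketch. It assumes the BIOS forms each reference as Vtt times the default multiplier plus the mV offset, which is the model used elsewhere in this thread. The post does not state the Vtt, so the Vtt values below are placeholders, and the function names are illustrative.

```python
NB_MULT = 0.640    # NB default multiplier quoted above
DATA_MULT = 0.635  # data bus default multiplier quoted above

def vref_volts(vtt: float, multiplier: float, offset_mv: float) -> float:
    """Reference voltage under the assumed linear model: Vtt * multiplier + offset."""
    return vtt * multiplier + offset_mv / 1000.0

def data_minus_nb_mv(vtt: float, nb_offset_mv: float = -40, data_offset_mv: float = 60) -> float:
    """Gap in mV between the data bus reference and the NB reference."""
    gap_v = vref_volts(vtt, DATA_MULT, data_offset_mv) - vref_volts(vtt, NB_MULT, nb_offset_mv)
    return round(gap_v * 1000.0, 1)

# Placeholder Vtt values; substitute the Vtt actually set in the BIOS.
for vtt in (1.20, 1.30, 1.40):
    print(vtt, data_minus_nb_mv(vtt))
```

Because the two multipliers differ by only 0.005, the gap is dominated by the offsets: under this model the data bus reference sits roughly 93 to 94 mV above the NB reference for any Vtt between 1.2 V and 1.4 V.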

    The use of clock skews is imperative to achieve stability, especially it seems with quads. However each system appears to exhibit unique behaviour, some preferring a delay of 100 - 200ps on the NB with the cpu left at normal, and others prefer a 100 - 200ps delay on the CPU with the NB left at normal. This may be either due to the CPU used, or variations in components on the board, or other factors. To a very limited extent you can overpower the need for NB skews with increased NB voltage but in a lot of cases a decrease in NB voltage can be achieved with correctly set NB skew.

    500 FSB on a quad is not something we see often, perhaps as much because people don't want to push the chip that hard when the same speed is achievable on a higher multi as because of a lack of boards to support it. The only quads I've seen doing 500 FSB with reasonable volts are the occasional 65nm and the occasional low VID 9650 on a suitable P45. The A3 revision P45 chipset seems much more capable in terms of quad overclocking, and although it only has a 6 phase analogue PWM, the Gigabyte UD3P is handling them very well, with 4500 MHz at 500 FSB achievable 24/7 stable. The Max II Formula, which has the same 16 phase PWM as the Rampage Extreme, seems to be able to handle it too, but not many of the guys with one have chips that are 500 FSB capable at volts they are comfortable using. The M2F also seems quite capable at 4500 MHz on duals, with no CPU/NB clock skewing or RAM skewing necessary for complete stability. Choice of board is very important when intending to push the limits of your hardware.



    Quote Originally Posted by Cibic View Post
    Guys, I've been reading but I don't fully understand it. I have never done maths in english.

    MB: Asus Striker II Extreme v1104 Bios
    CPU: Q6600 at 3.6GHz, 1.46v
    VTT: 1.44v (actual 1.46)
    NB: 1.55v
    PLL: 1.6v
    According to Seban's excel file and
    from what I gathered:


    Doing the maths I get 46mv that means I should set the values and then work my way down or up?
    and each time I get a failure but I must keep 1 and 3 40mv below 0 and 2.

    eg.
    GTL ref 0: +50mv
    GTL ref 1: +10mv
    GTL ref 2: +50mv
    GTL ref 3: +10mv
    NB: +50mv

    Which way is it correct?
    What range should I be and if none of the ranges work, that means I need more volts somewhere?
    There's no hard and fast rule to this. You can feel free to experiment with every combination of the GTL Refs. For example, I set +20mv for all my GTL Refs the other night doing some testing; you are not going to hurt anything unless you have a very high Vtt and the Vref goes over 1.10v.

    You can try, for example, +50/+10/+50/+10 or +50/+20/+50/+20 or +50/auto/+50/auto, or +40/-5/+40/-5 would give you your ~46mv difference if you find that to work better.
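A quick way to sanity-check a combination against the 1.10v limit is to work out the resulting Vref values. The Python sketch below uses the Vtt from the quoted post (1.44v) and the default multipliers discussed in this thread (0.635x for GTL Ref 0/2, 0.667x for 1/3). It assumes the offset is simply added to Vtt times the multiplier; the helper names are illustrative, and "auto" is left out because its value is not known.

```python
DATA_MULT = 0.635      # GTL Ref 0/2 default multiplier
ADDRESS_MULT = 0.667   # GTL Ref 1/3 default multiplier
VREF_CEILING_V = 1.10  # limit mentioned in the reply above

def vref_volts(vtt: float, multiplier: float, offset_mv: float) -> float:
    """Assumed linear model: Vtt * multiplier + offset."""
    return vtt * multiplier + offset_mv / 1000.0

def check_combo(vtt: float, offset_02_mv: float, offset_13_mv: float) -> dict:
    """Summarise one offset pair: both Vrefs, the offset gap and the ceiling check."""
    v02 = vref_volts(vtt, DATA_MULT, offset_02_mv)
    v13 = vref_volts(vtt, ADDRESS_MULT, offset_13_mv)
    return {
        "vref_02": round(v02, 4),
        "vref_13": round(v13, 4),
        "offset_gap_mv": offset_02_mv - offset_13_mv,
        "under_ceiling": max(v02, v13) < VREF_CEILING_V,
    }

# The three fully specified combinations suggested above, at Vtt = 1.44v.
for combo in [(50, 10), (50, 20), (40, -5)]:
    print(combo, check_combo(1.44, *combo))
```

Under these assumptions all three combinations stay well below 1.10v at 1.44v Vtt, so the choice between them comes down to which one tests as more stable.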

    Quote Originally Posted by Bobly View Post
    Okay there's one thing I'm slightly confused about: This thread explains how to set the GTL Ref voltages, however when looking at examples of what people have set, there are always 2 of the 4 which are higher (~50) and 2 of the 4 that are lower (~10):
    Example:
    GTL ref 0: +50mv
    GTL ref 1: +10mv
    GTL ref 2: +50mv
    GTL ref 3: +10mv

    Is that because 0 is the maximum and 1 the minimum? Should that not be mentioned in the first post?



    2/ More importantly, what is the preferred method to find a stable overclock while fiddling with GTL Ref voltages?
    When you overclock a CPU it's fairly trouble free, increase FSB, check stability, if stable increase again, if not raise Vcore or Vnb (one at a time) until stable, then resume.
    But with GTL Ref voltages I'm in a slightly more troublesome situation...
    Should I first start manually setting the VTT and see if I can overclock any higher? Or is manual VTT without manual GTL Ref useless and I need to do both at the same time?
    If unstable, should I increase or decrease the GTL Ref voltages to get stable again? Are there any margins one should not go over? (ex: +~100mV?)

    Sorry for the noobish questions but this has got to be one of the most confusing aspects of overclocking ^^
    The default multiplier for GTL Refs 0/2 is 0.635x and the default multiplier for GTL Refs 1/3 is 0.667x, so to get the same Vref when multiplied by the Vtt you need different modifiers for 0/2 and 1/3, ie: 50/10/50/10.
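The size of that difference depends on Vtt. A minimal Python sketch of the arithmetic, assuming the offsets are simply added to Vtt times the default multiplier (the function names are illustrative):

```python
DATA_MULT = 0.635     # GTL Ref 0/2 default multiplier
ADDRESS_MULT = 0.667  # GTL Ref 1/3 default multiplier

def equalising_gap_mv(vtt: float) -> float:
    """Offset difference (0/2 minus 1/3) in mV that makes both Vrefs equal."""
    return round((ADDRESS_MULT - DATA_MULT) * vtt * 1000.0, 2)

def offset_13_for_equal_vref(vtt: float, offset_02_mv: float) -> float:
    """Offset for GTL Ref 1/3 that matches the Vref of GTL Ref 0/2."""
    return round(offset_02_mv - equalising_gap_mv(vtt), 2)

print(equalising_gap_mv(1.44))               # about 46 mV
print(equalising_gap_mv(1.25))               # 40 mV, i.e. the 50/10 pattern
print(offset_13_for_equal_vref(1.44, 50.0))  # about +4 mV
```

So a 40mv gap such as 50/10/50/10 equalises the two references at roughly 1.25v Vtt, while at 1.44v the matching gap is about 46mv, which is the figure worked out in the earlier question.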

    I guess the way to tune GTL Refs differs person to person. To roughly tune them, I prefer to make sure RAM/NB is stable, manually set Vtt, and, using a vcore that is slightly too low for the given speed, use either Orthos small FFT or Intel Burn Test to see what GTL Refs give me the longest fail time or least errors, respectively. You must be certain the rest of your system is stable though or you will be wasting your time. You can also somewhat fine tune the GTLs by examining the Gflops output with IBT: the values should be very similar to each other, and correct adjustment of the GTLs can achieve this.
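If you log the per-pass Gflops figures by hand, comparing settings can be reduced to a small script. The Python sketch below is only one way to express "very similar to each other": it ranks settings by the spread between the highest and lowest pass. The setting labels and numbers are made-up placeholders, not measurements, and the function names are illustrative.

```python
def gflops_spread(passes: list[float]) -> float:
    """Difference between the best and worst pass for one GTL setting."""
    return max(passes) - min(passes)

def most_consistent(results: dict[str, list[float]]) -> str:
    """Return the setting label whose passes vary the least."""
    return min(results, key=lambda label: gflops_spread(results[label]))

# Placeholder values for illustration only; replace with your own IBT output.
logged = {
    "+40/+0/+40/+0": [48.9, 50.2, 49.1],
    "+50/+10/+50/+10": [49.8, 49.9, 49.7],
}
print(most_consistent(logged))
```

A small spread on its own is not proof of stability; it is just a quick filter before longer test runs.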

    Quote Originally Posted by Amurtigress View Post
    mikeyakame/Cryptik:

    Thanks for the long post on the last page. I haven't posted here for quite some time...

    It appears that my mainboard hasn't read your posting yet, since my system would only get stable setting all references to 0.69 +-0.2.
    Any larger difference in any of those references leads to calculation errors in IBT. If my memory doesn't fail me, keeping the difference of 0.04 between address and data did not even work on default clocks.

    Here a quick summary of my current settings:
    CPU: Q9550 E0
    FSB clock: 460 MHz (465 and more will cause hangs and BSODs)
    FSB term V.: 1.36V
    NB Volt: 1.40V
    CPU PLL Volt: 1.56V
    CPU clock: 3910 MHz
    CPU GTL 0/2: 0.69x
    CPU GTL 1/3: 0.69x
    NB GTL: 0.69x

    CPU and NB skew do nothing for me, they only cause instabilities (BSODs, system hang) if not set to AUTO. More than 200 ps even keeps the system from posting.

    Does that make any sense in relation to your last post on why ASUS is keeping that 0.04 difference?
    You're not the only person with a 9550 that requires high GTLs. Just use whatever works best for you; GTL settings don't port over from other people's systems. You may need more Vtt, vcore, vPLL etc, as GTLs can't make up for a lack of vcore etc.
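For anyone comparing a multiplier-style BIOS like this one with the mV-offset style used earlier in the thread, the conversion is simple arithmetic. The Python sketch below uses the quoted settings (0.69x on every reference, 1.36V FSB termination voltage, taken here to be Vtt) and assumes offset-style boards add the offset to Vtt times the default multiplier. The function names are illustrative.

```python
def vref_volts(vtt: float, multiplier: float) -> float:
    """Reference voltage for a multiplier-style setting."""
    return vtt * multiplier

def equivalent_offset_mv(vtt: float, target_mult: float, default_mult: float) -> float:
    """mV offset that turns default_mult into target_mult at this Vtt."""
    return round((target_mult - default_mult) * vtt * 1000.0, 2)

VTT = 1.36  # "FSB term V." from the quoted settings

print(round(vref_volts(VTT, 0.69), 4))         # Vref on every reference
print(equivalent_offset_mv(VTT, 0.69, 0.635))  # versus the 0/2 default
print(equivalent_offset_mv(VTT, 0.69, 0.667))  # versus the 1/3 default
```

Under those assumptions 0.69x works out to about 0.938V, equivalent to roughly +75mv on GTL Ref 0/2 and +31mv on GTL Ref 1/3.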

    Quote Originally Posted by Expat GriZ View Post
    Big Thanks to Cryptik for starting this thread & Mike for his detailed analysis (had to read some of it twice!! ... uh well maybe three times!!) I've had a ton of fun with my RE & E8600 but was rather disappointed that I couldn't reach the magical 600fsb that I had seen posted all over this and other sites, AUTO settings at that!! I'd gazed at this thread a few times and then in a fit of desperation @ 4 in the morning decided it was time to play with the GTLs. Well, booted on the 2nd try @ 9 x 600 PL 8 after doing the math!! Didn't think it was necessary with the E8600, thought I would read the thread when I put my quad in but WOW!! Been busy the last few days and had problems getting into Xtremesystems for the last week but now that I'm back on just had to say a BIG THANKS.. I'm sure this will really benefit me when I go for the QX9650!! Here is where I finished off the other night....
    Cheers I'm glad this thread has turned out to be useful. Mikey's fantastic explanations have also been a very valuable contribution.

    That's a great result, congratulations! Glad this thread helped, and yes, dual cores are also affected by GTLs, not just quads. Any particular reason why you are buying a QX9650? From what I've seen from other people's results the Q9650 E0s clock much better than the QX chips, and the RE seems to be able to handle 470 - 480 FSB, so 4.23 GHz should be achievable on much lower vcore than on a QX with your RE.
    Last edited by CryptiK; 01-26-2009 at 06:42 AM.
    Ci7 990X::Rampage III Extreme::12GB Corsair Dominator 1866C7GT::2 x EVGA SC Titans in SLI::Corsair AX1200::TJ07::Watercooled
    Ci7 920 3849B018::Rampage II Extreme::6GB GSKILL Trident 2000C9 BBSE::EVGA GTX580::Antec Signature SG850::TJ09::Aircooled w/TRUE 120X
