Cryptik,
Tonight I had to read more Intel datasheets on something unrelated and I debunked my own theories on Asus' differing GTLs. I actually have an explanation now that isn't based on hunches! It seems if I had read thoroughly over all the FSB pin definitions I would have caught it straight away!
Address strobes are common clock source synchronous. Data strobes are differential clock source synchronous.
So why do Asus use 0.635 for data and 0.667 for address?
Address being common clock source synchronous means that the strobe is set up across multiple clock domains during the same clock edge, and synchronized to the source of the strobe.
Data being differential clock source synchronous means the strobe is set up at the target destination on the next falling edge of either differential clock. The actual clock domains of source and target never come into contact with each other; the data strobe is simply synchronized based on when the source set it up, and it is accessed at the target on the following falling edge on the host bus, not the source.
Now I see why Data is lower than address.
Ringback on low and high swing has the potential to wreak havoc when it leaks between two clock domains that have an open strobe on the same clock; otherwise they would never cross talk, even if their frequency oscillates similarly to another on a different domain. Cross talk from ringback can only occur when a common clock is used for a strobe; it's otherwise harmless on differential clocks. Jitter on a common clock is as harmless as ringback on a differential clock. Asus understand that this is true in practice, judging by their decisions on what reference voltage point to use for the associated strobe on the data or address bus. Jitter across differential clocks is bad because it causes skewing and deviation when calculating the following falling clock edge on the inverse clock. If the following clock period actually takes place earlier or later than the maximum stability window allows, there is no way to be sure that the target won't begin to read the strobe before it begins and contaminate the strobe with something completely different, or begin after and miss the first bursts of data, resulting in incomplete data at the endpoint. Checking that data is complete and valid is optional, and it is up to the endpoint device to ensure that what it received is the same as what was sent.
So on a data strobe you want to minimize jitter at the expense of harmless ringback, so you set the GTL for data bus strobes at the same value as the Host FSB GTL reference to more precisely time when to set up the strobe and how long to hold it, making sure the target can successfully read successive strobes for the completion of the data transaction. The clock domains will never exist on the same clock, so any ringback within the valid period window won't do anything other than look cool on an oscilloscope. There is your 0.63x.
On an address strobe, different clock domains being exposed to each other on a common clock means you want to eliminate ringback, because all the shielding and pulling down on the input voltage you do won't make the slightest difference if both clock domains have clock frequencies with similar oscillation and resonance characteristics. If they cross talk between device buses there is no way to know it, as cross talk from source bus to destination bus appears as what should be part of an address strobe transaction, even if it's actually random data bits from a data strobe on the same falling clock edge for a completely different transaction. Address strobe data is highly critical and random, and even minute pollution can completely alter the data the MCH needs to determine where a following data strobe needs to be requested to. Something like a changed bit in the destination address data can, when decoded, end up initiating a data strobe to a completely unrelated device bus and completely corrupt a data strobe it is accessing from a different source. Data all looks the same, so you rely on the MCH to decode and route it properly and leave responsibility for validity to the target. Address is all completely different, and decoding is based on a set of hexadecimal identifiers supplied so that you don't need to know where it actually lives or how to get to it, just that it is there and you want to communicate with it. Asus uses 0.67 to eliminate ringback cross talk between clock domains at the expense of more jitter, because jitter causes skewing, and skewing is only a problem when timing strobes across differential clocks, especially when the low clock's data is the inverse of its value and the high clock's isn't. A != A on low clock, F != 0 on high clock; mess the timing up and AD0F from low clock is read as AD0F on high instead of 52F0, which is what the source device burst onto the Host FSB bus.
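The AD0F versus 52F0 mix-up is just a bitwise inversion of a 16-bit word. A minimal Python sketch of that arithmetic follows; the 16-bit width and the helper name `invert16` are assumptions for illustration, not anything taken from the datasheets.

```python
def invert16(word: int) -> int:
    """Return the bitwise inverse of a 16-bit word (hypothetical helper)."""
    return ~word & 0xFFFF


sent = 0x52F0               # what the source burst onto the bus
on_wire = invert16(sent)    # inverted representation, as on the low clock

print(f"sent    {sent:04X}")
print(f"on wire {on_wire:04X}")

# Reading the wire value without undoing the inversion gives the wrong word;
# inverting a second time recovers what was sent.
print(f"decoded {invert16(on_wire):04X}")
```

Each hex digit flips to its complement (5 and A, 2 and D, F and 0), so a read that lands on the wrong clock hands back a word that looks perfectly valid but is not what was sent.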
Summary.
0.63 for Data uses the ringback resistance of differential clock bus access for source and target to eliminate clock jitter when timing strobe access between source and target. It is set to the same reference point as the Host Bus GTL, aka NB_GTLREF, because all clocks are referenced from host BCLK, and what better reference to use than the same point being used to drive the clocks you are timing against, with clocks driven from the same source at the destination accessing the strobe on the following clock edge. The chances of clocks skewing and deviating too far are decreased dramatically; that's the critical factor here.
0.67 for Address uses jitter's ineffectiveness to raise the reference point as compensation, minimizing or eliminating ringback into the valid clock period, which has the potential to pollute and disrupt system stability or activity. A higher sample point for all calculated clock periods decreases the risk of address strobe contents being altered by a concurrent data strobe on the same falling clock edge as the address strobe is set up across the bus. What the source sends and the target receives isn't checked, to minimize the cost of the transaction; it is assumed that what the target receives is what the source originally sent.
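To put rough numbers on the two reference points, here is a small Python sketch. It assumes the 0.635 and 0.667 settings are multipliers applied to Vtt (an assumption, not something stated above), and the 1.20V Vtt and the function name are made up purely for the example.

```python
def gtl_ref_volts(vtt: float, multiplier: float) -> float:
    """Reference voltage as a fraction of Vtt (assumed relationship)."""
    return vtt * multiplier


VTT = 1.20          # hypothetical Vtt in volts, illustration only
DATA_MULT = 0.635   # data strobe setting from the post
ADDR_MULT = 0.667   # address strobe setting from the post

data_ref = gtl_ref_volts(VTT, DATA_MULT)
addr_ref = gtl_ref_volts(VTT, ADDR_MULT)
gap_mv = (addr_ref - data_ref) * 1000

print(f"data ref    {data_ref * 1000:.1f} mV")
print(f"address ref {addr_ref * 1000:.1f} mV")
print(f"gap         {gap_mv:.1f} mV")
```

Under those assumptions the address sample point sits a few tens of millivolts above the data one, which is the whole trade-off in a nutshell: a higher point to keep ringback out, a lower one to keep timing tight.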
Even with making sense of this, I feel so much more confused about other things. Datasheets are evil, they lure you in with the dream of answers, but in reality just pose more questions that they can't give you answers for!
The above also fits my understanding and the practical results I've found from using both NB & CPU clock skews together with GTL reference points. It appears they are designed to go hand in hand and to be set up to complement each other's strengths and weaknesses, most importantly when Host BCLK frequency becomes fairly high (though quite a bit lower on quad cores than dual cores) and the minimum criteria for operation can no longer be met for 100% of clocks on the FSB and the corresponding clocks driven on the CPU and MCH from the Host FSB BCLK.
GTL reference points are extremely important for calculating highly critical values that correspond to handling data and timing strobes between clocks, and because of this they have a weak ability to compensate for small skew deviations between differential clocks, provided that the deviation at worst never exceeds the maximum stability criteria determined by the voltage crossing point between low and high swings. Using them solely to do this, from what I now realize, seems to result in perfectly stable system behaviour at high FSB until CPU and MCH clock timing becomes critical during heavy bus transactions, such as the type Linpack puts in motion, and the system will hard lock instantly, sometimes after a minute, other times after 5-10, but the result is the same. The deviation between the previous voltage crossing point and the following one was greater than 150ps, and the previous falling edge data strobe polluted the adjacent rising edge address strobe on the inverse clock. No amount of GTL or Vtt adjustment alone can correct it at this point: if deviation averages 120-130ps, then all it takes is a bit too much jitter on the valid clock period for a data strobe and the deviation has the potential to jump beyond that 150ps maximum tolerance.
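The margin argument is simple arithmetic, sketched below in Python. The 150ps tolerance and the 120-130ps averages come from the paragraph above; the simple additive model, the extra jitter figures, and the function name are assumptions for illustration only.

```python
MAX_DEVIATION_PS = 150  # maximum tolerance quoted above


def exceeds_tolerance(avg_deviation_ps: float, jitter_ps: float,
                      limit_ps: float = MAX_DEVIATION_PS) -> bool:
    """True if average deviation plus extra jitter goes past the limit.

    Simple additive model, assumed here only to show the headroom.
    """
    return avg_deviation_ps + jitter_ps > limit_ps


for avg in (120, 130):
    headroom = MAX_DEVIATION_PS - avg
    print(f"average {avg}ps leaves {headroom}ps of headroom")

# Hypothetical jitter figures, not measurements
print(exceeds_tolerance(130, 15))  # inside the window
print(exceeds_tolerance(130, 25))  # past the window, strobe gets polluted
```

With only 20-30ps of headroom, it does not take much extra jitter under heavy bus load to cross the limit, which matches a system that looks stable right up until Linpack.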
NB and CPU clock skews can't fix problems which are a result of poorly set up GTL reference points, so changing them to fix this kind of problem is like changing the tyres on a car that won't run because it has a flat battery. The car might have spanking new tyres, but this means squat if your battery is still flat, since you can't drive it anyway. What NB & CPU clock skews are designed to do is make static corrections to when the output clock needs to be driven for the MCH and for the CPU with respect to the voltage crossing point, which the GTL reference for the Host FSB clock (NB GTLREF) is used to determine. If this is set correctly, then the skews can correct the skewing that is occurring from all the extra NB voltage we are pumping in to get high FSB stable in the first place. (That voltage is actually slowly becoming an obstacle that will halt further potential, and heat just amplifies it, which is why water cooled NBs give more consistent FSB clocks under similar conditions, as a constant lower temperature reduces the onset of symptoms and problems.) Setting up the timing for driving CPU & MCH output clocks correctly from the Host BCLK reference will take over the dynamic compensation that GTLs were incorrectly being used to handle, if you either intentionally or accidentally managed to set them up to do that, of course. If the problem lies with the design of the board itself, its VR limitations, or compensation circuit components that are designed to handle only so much, then no amount of skew or GTL adjustment will change the results from this point, even if the memory and CPU can do it.
Most 8 phase X48 boards will only do near 500MHz FSB, and this is down to the VR and capacitance compensation limitations of their designs. P45 boards, even with 16 phases, are crippled by the use of cheaper components to save production costs unless they are priced at the same point as their X48 counterparts, and it becomes like winning the lottery: some win first try, others never win at all. If you are lucky enough to get a P45 board that does somehow pull over 500MHz FSB on a quad core, then you are one of very few who have actually achieved it.
The only two boards designed for this task are the Asus Rampage Extreme and Foxconn Blackops. Both, when not crippled by either CPU or DRAM falling flat on their face first, can and will achieve some very high, extremely stable FSB frequencies, as a result of either never exceeding their design or not finding a CPU that is perfect enough to exceed their abilities.
If, after correctly setting up both clock skews and GTLs, you are still stuck and can't get farther, there is a damn good chance that the CPU is losing coherency or the DRAM timings cannot be as tight as required because of inconsistent IC limitations or ability. Sometimes it can be this; more often than not it is what happens when you become consumed with making one set of values work that just can't work. If this happens, the best thing to do is step back for a bit, start from the beginning at a different point, and make your way back up there with new values. Being able to cut your losses and start fresh will, some of the time, be the difference between making a setup work or not at a certain FSB frequency. Not all FSB frequencies are worth using anyway; sometimes 1 or 2MHz either way is all it takes to get the system running smooth.


