What temperatures did you guys use for the heatspreader removal?
Maybe it was too hot for the core, something like an upside-down burn-in?
I try to be useful here
This behaviour occurs because you need some buffer between the pot and the die.
I found out that if you have direct contact between the large copper block (base of the GPU pot) and the die,
results will get worse, especially under heavy load.
When you have this kind of contact, which is actually super good,
you will not be able to transfer heat out of the die fast enough.
This is the reason you will lose that many MHz. It will probably also cause the instant lock-up at the beginning of some tests.
Nvidia's stock thermal paste is Shin-Etsu, which is not a good paste in general, but not bad with LN2 either.
It works pretty well under heavy load and is able to transfer that huge heat load to the HS.
When you use some good thermal paste, you will then be able to transfer that heat from the HS to the GPU pot.
This way the GPU temperature is higher, and when they have enough voltage they will run well.
Without the HS you will get too cold very easily and face some other cold-related issues as well.
I actually tried a fairly thick copper plate between the pot and the GPU HS and it worked well.
That was my 1455MHz Unigine run with that particular DCII card.
The card was a lot easier to hold in the correct temperature range that way.
So my point is: keep the HS on. Keep the GPU warm enough and give it enough voltage.
You are as good as your samples are!
The first time (at MOA) we used a CPU pot at around 100° for the removal; the core degraded.
The second time we used a heat gun set to 250° directly on the heatspreader; the core degraded.
Unfortunately, putting the heatspreader back on (OCZ Freeze between GPU and HS) is not the answer; the core is still degraded.
Your point is really interesting, and next time we will test the difference with a copper plate.
Later, in the days of the HD 5870 Lightning, we used a copper plate too, not as a buffer but to get more distance between PCB and pot.
But I think the cores degraded because the temperatures during the removal were too high.
I used just 95°C at MOA, and that was the pot base temperature; the GPU temperature must have been even lower. These GPUs reach up to 95°C in normal use on stock cooling. Removing the IHS did not improve or degrade clocks, but it made benching easier.
www.teampclab.pl
MOA 2009 Poland #2, AMD Black Ops 2010, MOA 2011 Poland #1, MOA 2011 EMEA #12
Test bench: empty
If that is true, we have two very good Lightnings....
We will try to add this additional buffer.
But from my understanding of the physics, it makes no sense that the card has more problems under heavy load without the HS,
because there is more thermal resistance with the HS...
If the thermal paste between die and HS is better than the OCZ Freeze we used after the HS removal and can handle heavy loads better, we have to find an equivalent replacement for that thermal paste.
But then it should be fine even without the HS and with direct contact between die and pot.
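The steady-state side of this argument can be put into rough numbers with the usual one-dimensional conduction formula R = t / (k * A), adding the layers in series. This is only a sketch: the die area, paste conductivity, bond-line thickness, HS thickness and load below are all assumed illustration values, not measurements from anyone's card, and heat spreading in the HS is ignored.

```python
# Series thermal resistance of the die-to-pot stack, with and without HS.
# Every number here is an assumption chosen for illustration only.

def layer_resistance(thickness_m, conductivity_w_mk, area_m2):
    """1-D conduction resistance of a flat layer, in K/W."""
    return thickness_m / (conductivity_w_mk * area_m2)

DIE_AREA = 520e-6                                            # m^2, assumed die area
PASTE = dict(thickness_m=50e-6, conductivity_w_mk=3.0)       # assumed paste layer
HS_COPPER = dict(thickness_m=2e-3, conductivity_w_mk=400.0)  # assumed copper HS
LOAD_W = 500.0                                               # assumed heat load

def stack_resistance(layers, area_m2=DIE_AREA):
    """Sum the resistances of layers stacked in series."""
    return sum(layer_resistance(area_m2=area_m2, **layer) for layer in layers)

r_direct = stack_resistance([PASTE])                      # die -> paste -> pot
r_with_hs = stack_resistance([PASTE, HS_COPPER, PASTE])   # die -> paste -> HS -> paste -> pot

for name, r in (("die -> pot", r_direct), ("die -> HS -> pot", r_with_hs)):
    print(f"{name}: {r:.4f} K/W, die sits {r * LOAD_W:.1f} K above the pot base")
```

With these assumed values the stack with the HS has more than twice the resistance, so in steady state the die simply runs warmer at the same pot temperature. That matches the point that there is more thermal resistance with the HS, but a steady-state model like this says nothing about the first moments of a load.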
Thanks for sharing your knowledge...
Andi
It could also be a problem of too much pressure...
Removing the IHS and mounting the pot often works fine for some time, but after a few benching sessions there can be problems because there is too much pressure on the core (some BGA solder balls of the GPU core could be broken).
Maybe that is the reason why a fairly thick copper plate between the pot and the GPU HS worked well?
I heard of the same issues some time ago (GTX280, I think Shammy was talking about that): some cards worked really badly after a few benching sessions (BGA solder balls of the GPU core had broken, and some spacers to apply pressure at the right spots helped to solve that problem).
And what SF3D is saying is right too: without the HS you will easily run into some cold-related issues and need really good TIM.
Without the HS there is no cold-related issue at all on our cards; you can run 1300 MHz in every bench with quite low voltage and not even a hint of a coldbug.
Pressure was also one of our ideas, but on the first card (with the HS on) we had issues that were surely related to solder contacts. After fixing it in the oven and removing the HS it worked well again, but then we had the no-HS related issues.
What are the symptoms if you go over 1300MHz? Lock-ups or strange colors on the screen?
Born to lose, live to win!
That is true. It makes no sense if we think in terms of pure physics.
The situation is very unpredictable with 3000 million transistors working at 1.6V and 1500MHz+.
The GPU die will be a lot warmer with the HS on, and if you have an extra plate between HS and pot, it will be even warmer.
That does make sense when we think about current flow + GPU operation + temperature.
There is a sort of current sweetspot for every GPU depending on its leakage, which changes with temperature.
For that same reason there is also a temperature sweetspot.
Without the HS I could test my DCII card at 1400MHz at -120C, whereas it needed -135/-140C with the HS on.
Above 1400MHz I got lock-ups no matter what voltage or temperature was applied, if there was no HS.
I reinstalled the HS and ran the same tests at -140C, raising only the voltage, and it was scaling.
Then I kept the temperature at that sweetspot and raised the voltage for as long as it helped (up to 1.62V).
From this point of view we can conclude that voltage is more important than temperature for some GPUs.
You will need to add more voltage to gain clocks, because getting colder probably will not help in most cases.
When you don't have the HS on, you will easily go too cold and lose the current/temperature sweetspot.
Then the GPU must have some sort of cold issue, which leads to a crash or lock-up. It might be some certain part of the die, who knows.
Usually this is solved by a driver restore.
All materials have a certain heat capacity and heat transfer capability.
I think the GPU pots everyone is using are pure copper and the base is quite massive.
The very heavy load from the GPU will heat up the first 0.1mm of copper very fast and the GPU will overheat, because the load is so extreme.
The massive copper block cannot transfer enough heat in that very short time, and that causes problems. This happens within milliseconds to tenths of a second.
When there are two layers of thermal paste and the HS, the heat has a kind of buffer zone (the HS) into which it can be transferred rapidly. Then the GPU, or some part of it, will not fail at the beginning of the load.
The mass of the HS is a lot lower than that of the pot base, and that might be the reason it is better to have it there than not.
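To get a feel for the time scales mentioned here, one can use the standard diffusion estimate t ≈ L² / α, the time heat needs to spread through a thickness L of material with thermal diffusivity α. The sketch below uses the room-temperature diffusivity of copper (about 1.11e-4 m²/s); at LN2 temperatures the value is different, and the 2 mm and 10 mm thicknesses are assumed examples, so treat the output as an order-of-magnitude guide only.

```python
# Order-of-magnitude estimate of how long heat takes to diffuse through copper.
# ALPHA_COPPER is the room-temperature value; it changes at cryogenic temperatures.

ALPHA_COPPER = 1.11e-4  # m^2/s, thermal diffusivity of copper near room temperature

def diffusion_time_s(thickness_m, alpha=ALPHA_COPPER):
    """Characteristic diffusion time t ~ L^2 / alpha, in seconds."""
    return thickness_m ** 2 / alpha

# 0.1 mm comes from the post above; the other two thicknesses are assumed examples.
for label, thickness_m in (("first 0.1 mm of copper", 0.1e-3),
                           ("2 mm heatspreader (assumed)", 2e-3),
                           ("10 mm pot base (assumed)", 10e-3)):
    print(f"{label}: ~{diffusion_time_s(thickness_m) * 1e3:.2f} ms")
```

Under these assumptions the thin layers respond within microseconds to tens of milliseconds, while a 10 mm base needs on the order of a second. That is in the same range as the milliseconds to tenths of a second suggested above, but it does not by itself show that the HS works as a buffer.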
So getting cooler is not always so cool!
Last edited by SF3D; 06-21-2011 at 11:46 AM.
Let's all go dislike T_M's YouTube video on how to remove the HS! haha.
SF3D, that makes sense. Maybe -180 is too cold, and -140 / -120 was always the sweetspot; you just didn't know it because the stock HS was holding back that cold temperature.
Before the removal there was not even a CB (only in 2D mode below -145° without additional memory voltage); in 3D mode with benching memory voltage the card was fully CB free.
On that last card I tested every 10° down from -100° after the removal.
From -120° on I can run this 1300 MHz... nothing more.
With the HS put back on I can run 1300 MHz from -120°... nothing more.
In both cases there is no more CB, not even in 2D mode or without additional memory voltage, no matter whether the HS is on or off.
SF3D, do you really think that the copper of the HS and the copper of the pot make any difference?
Your case is different. You can reproduce the old behavior by putting the HS back on, but that doesn't do the trick here...
unfortunately...
The problem is somewhere else and we are missing it....
Maybe we also need to cool the GPU's PCB, which holds the traces? Hipro said this in the DCUII thread, and it makes me wonder, because the stock IHS also makes some contact with the GPU's PCB.
I don't have your issues, but again, as I told you, I never use any GTX580 with the IHS. BTW, that coldbug in 2D that you are talking about comes from the memory (you must use >1.8v at all times; 1.9v or 2v recommended).
If you have very good contact without the IHS you don't need an excessive amount of voltage; 1.48v-1.5v is enough at -120 °C. Also, if with this voltage and the GPU under high load (Unigine, 3DMark03 Nature, etc.) you fill the pot and the temperature still climbs, then you have GOOD contact (yes, even with fat pots, Fermi is a monster).
Hope it helps!
Memory CB is not gone... this just means that memory is warmer without HS... I see no other conclusion.
Assuming the memory voltage was the same in both tests.
Maybe the contact between the die bracket PCB and the HS makes the difference?
So we have to find a way to cool the bracket PCB of the GPU like the HS does; then maybe the extreme load is not so fatal and the processes SF3D described are not so important.
The problem of finding a thermal paste similar to the one between die and HS still exists.
Or use a very slim pot without such a fat base... if we can run a full pot, mass is not that important, and the slim pot acts like the buffer SF3D applied?!
Thank you for this very interesting discussion
I tried it only because I don't have a fat pot. Up to about 1.35-1.37v was OK on a DCu II. I ran Vantage at 1200/1200 with a full pot, but when pushing to 1.4v+ the slim just can't handle the load. The pot probe temps were rising 20C and more through a single game test. I expect the temps of the die and IHS were rising even more. I gave up until I get hold of a fatty pot.
Actually, LN2 was hard to pour, now that I think about it. The slim pot's opening is so small that the liquid was getting sprayed back by the amount of expanding nitrogen gas. These cards are absolute LN2-sucking monsters.
I think loopy's idea is a good one. A pot with a slim base that allows maximum cold conduction through to the IHS is a good idea.
No, a slim is not a good solution at all because it cannot handle the load. The solution is to put something thermally conductive between the pot and the GPU PCB. I think I will try plain silicone paste or, better, dielectric grease.
I meant something like a fat pot with a thin base. Rather than a thick 1cm or whatever of copper, drop it to a couple of mm to get the LN2 as close to the IHS as possible.
I don't think this will help at all. The GPU needs voltage, and if you have never benched with the HS on, you have learned to bench your cards in a certain temperature range and with a certain voltage. When you don't have the HS on you need less voltage, because the die is cooler and leakage is lower. Simple as that.
Running a card at 1450MHz at -120C with 1.55V is quite normal. It is about the same as running it with the HS on at 1450MHz at -140C with 1.6V. This is what I have seen in my tests.
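For a rough idea of what the extra 0.05V costs in heat, the common CMOS approximation is that dynamic power scales with V² * f. The sketch below compares the two operating points quoted above under that approximation only. It deliberately ignores leakage, which the posts say changes with temperature, so the real difference between the two setups will not be exactly this.

```python
# Relative dynamic power of two operating points, using P_dyn ~ V^2 * f.
# Leakage is ignored, so this is only a first-order comparison.

def relative_dynamic_power(voltage_v, freq_mhz, ref_voltage_v, ref_freq_mhz):
    """Ratio of dynamic power at (voltage, freq) to a reference point."""
    return (voltage_v / ref_voltage_v) ** 2 * (freq_mhz / ref_freq_mhz)

# The two operating points quoted in the post: same clock, different voltage.
ratio = relative_dynamic_power(1.60, 1450, ref_voltage_v=1.55, ref_freq_mhz=1450)
print(f"1.60V vs 1.55V at 1450MHz: {100 * (ratio - 1):.1f}% more dynamic power")
```

So at the same clock, the HS-on setting at 1.6V has to move roughly 6 to 7 percent more dynamic power than the 1.55V setting, before any change in leakage is counted.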
The instant lock-up without the HS happens when you go too cold, and that might be a lot earlier than you think. I think this is what causes the confusion.
About the HS of the GPUs we are talking about: it is attached to the PCB at the sides with a very thin glue layer, which does not conduct heat well.
It cools down the corners of the GPU PCB, and in that part of the GPU there are traces for memory operation, PCI-E, etc. The GPU gets its power from behind the die, so I see no point in cooling the PCB separately.
I do think that this issue comes from a combination of many small factors like leakage, current, thermal power handling, heat conductivity, etc.
There are just so many people trying things in their own ways that it is hard to see any pattern. I have learned from my own mistakes and I have my own test pattern nowadays.
It is a lot easier to draw conclusions when there is at least some idea of what should happen and what should work.