My thoughts on distributed computing for drug design
Ackbar
04-29-2005, 08:44 PM
I've waited a long time to post this, but I think it's worth posting now, given the most recent thread questioning D2OL's usefulness.
(Some background on myself.) I have personally used "folding" to examine protein structure in the past. I have been involved in several research projects that used Molecular Mechanics (MM), which is what "folding" and "drug discovery" programs use, including work examining binding sites for molecules (essentially what some drug discovery programs do). I have found (and this is well known among people in molecular modeling) that MM rarely leads to results comparable to what is found in vivo. Quite simply, MM treats bonds as static classical objects that are basically springs, and adds empirical parameters to help make things fit a little better. For this reason, "folding" results are ALWAYS suspect at best, and usually downright wrong, because they don't model the physics correctly. For very simple cases, the molecules modeled may work, but you have to be very careful and conscious of when the particular type of MM method you're using works and what it has been validated for.
Now, drug discovery is more complex. In addition to knowing whether a set of atoms will take on a certain shape, we'd like to know how effectively it will bind to another molecule. Sometimes we may only look at the "surface" of the molecule to see if it "fits" with the surface of another molecule and blocks some reaction from happening, to determine if a drug is viable. (There are other things that could happen as well, but this is complicated enough.) So when we look at two molecules binding, we can look at the energy of the system: we "fit" them together and see if the energy drops. If it does, it probably binds. The first problem with this is that, as I mentioned before, we're not modeling the physics correctly. Bonding is a quantum mechanical process. Breaking and creating bonds requires QM to be modeled correctly. Some people choose to do otherwise, but results will vary and tend not to be very good. With MM, you can't really model breaking and creating bonds; they're static, like springs. So you can stretch and morph the spring, but it won't break.
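To make the "springs don't break" point concrete, here is a minimal Python sketch comparing a harmonic (spring) bond term with a Morse potential, a common textbook form whose energy levels off so the bond can dissociate. The function names and all parameter values are made up for illustration (arbitrary units); they are not taken from D2OL, F@H, or any real force field.

```python
import math


def harmonic_bond_energy(r, r0, k):
    """Spring-like bond term: grows without limit as the bond stretches."""
    return 0.5 * k * (r - r0) ** 2


def morse_bond_energy(r, r0, well_depth, a):
    """Morse potential: flattens out at well_depth, so the bond can break."""
    return well_depth * (1.0 - math.exp(-a * (r - r0))) ** 2


# Made-up illustrative parameters in arbitrary units (assumptions, not real data).
R0 = 1.0            # equilibrium bond length
WELL_DEPTH = 100.0  # energy needed to pull the Morse bond apart
A = 2.0             # controls how narrow the Morse well is
# Pick the spring constant so both curves have the same curvature at R0.
K = 2.0 * WELL_DEPTH * A ** 2

if __name__ == "__main__":
    for r in (1.0, 1.01, 1.5, 3.0, 6.0):
        e_spring = harmonic_bond_energy(r, R0, K)
        e_morse = morse_bond_energy(r, R0, WELL_DEPTH, A)
        print(f"r={r:4.2f}  spring={e_spring:10.3f}  morse={e_morse:8.3f}")
```

Near the equilibrium length the two curves agree closely, which is roughly why the spring picture can work for small distortions. Stretch the bond far enough and the spring energy keeps climbing, while the Morse energy approaches the well depth and stays there.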
Doing these things without distributed computing is a really long process. In the past I have modeled very simple systems with QM that took years of computational time and were barely able to give good results. So why do we expect to gain ANYTHING from folding and/or drug discovery? Well, it comes down to the old adage about an infinite number of monkeys typing. We know that most of what we're doing is downright wrong at worst, but at best it's just an oversimplification of things. So if we send out an infinite amount of work for people to do, eventually we might get something out of the gibberish. At worst, we learn that our MM method is poor and we figure out how to fix it. At best, things look good and we learn something about the molecule being studied.
I do not want to discourage people from doing distributed computing; I used to do it back in the day! But trust me when I say that, for the most part, the work your computer does will not necessarily be more than a monkey randomly typing in the hopes of writing Shakespeare. So enjoy it as a competition. Eventually I'm sure there will be a distributed computing project worth running, but at the moment I don't really see many that are. :toast:
On a personal note, I've gone through the frustrations involved with lots of personal molecular modeling projects, from molecular mechanics (folding), to semi-empirical, to true quantum mechanical (ab initio) methods, and it's hard to get good results. Even at the best levels of theory, you have to constantly verify your results because everything uses approximations. And using molecular mechanics you basically live by experimental results. In fact, you're often just trying to make things fit experimental results.
I've used some VERY fast computer systems as well (some even in the top rankings) and it takes a lot of time. The day someone comes up with a true quantum mechanical distributed computing project, I'll be all for it. At least the physics will in principle be sound (to the order of the approximation used)!
In conclusion, don't be upset with D2OL or any of those distributed computing projects; enjoy it as a competition. :woot: They're all a testament to how great your organization is at getting the most CPU power together for a single purpose! :clap:
trakslacker
04-29-2005, 09:33 PM
Very insightful and an excellent post.
My feelings on DC are that, yeah, it's fun as a competition. But I at least want to run a project where I have a minuscule shot at being one of the monkeys that comes up with something at least semi-helpful or productive. :) This stuff costs us money (some ppl a LOT of money ;)), and I think we all want to put our resources toward something that at least has a chance of being productive in some way, even if it's showing researchers how not to do something.
With that in mind, Ackbar, have you ever analyzed any of the current projects (D2OL, F@H, FaD, etc.) based on available information to determine if any of them are going about this in, let's say, a less wrong way than the rest? I think these kinds of opinions would be great for our team. Again, thanks for your input.
Ackbar
04-29-2005, 09:47 PM
With that in mind, Ackbar, have you ever analyzed any of the current projects (D2OL, F@H, FaD, etc.) based on available information to determine if any of them are going about this in, let's say, a less wrong way than the rest? I think these kinds of opinions would be great for our team. Again, thanks for your input.
I've read some of the papers that F@H put out earlier on. A significant portion of the earlier ones were about the creation of the distributed computing project itself. But lately F@H pumps out a large number of papers that come from user calculations. So there is scientifically important work going on. The difference is that scientifically important work is A) not necessarily beneficial to anyone in the short term and B) not necessarily correct in reality (i.e., it shows that, within the level of theory used, we predict something to happen). The problem is that no one is currently using very high levels of theory to compute these things. For the most part, that's okay. If you ONLY want to look at folding, MM might work for you. Folding is EXTREMELY difficult to predict, though, since it is highly reliant on the initial conditions of the molecule and of the entire system, as there are proteins in nature (chaperones) that "guide" proteins to fold a certain way.
I'd say that F@H does a good job of putting jobs into the users' hands that get results that are eventually looked at. Practical applications? Umm... pretty small. Basically, Stanford would hate to tell you guys this, but you guys are basically just saving them money on supercomputer usage. So it's a win-win situation for them: they get lots of CPU time, and they get papers for a much smaller upkeep cost of maintaining the system.
So it's up to you. I wouldn't mind giving my opinion on some options if you guys narrow it down some, but I'd rather not look through ALL the current DC projects to see what looks good. LOL
Maybe I'm old fashioned, but when I want to get some good long calculations done, I just go over to the local supercomputer and run some programs. :D
trakslacker
04-29-2005, 09:55 PM
Maybe I'm old fashioned, but when I want to get some good long calculations done, I just go over to the local supercomputer and run some programs. :D
that makes me chuckle :p:
Regarding projects, obviously we are still a ways from coming to a decision, but the two most popular choices at this time are Find-A-Drug and F@H.
Ackbar
04-29-2005, 10:06 PM
As far as competitions go... XS is going to be VERY far behind in F@H. I'm not sure about FAD though.
In theory (and it depends on how they try to implement it), FAD will fall under the same problems I described for D2OL in the first post. If done correctly, you can use these calculations as a good basis for experimental work. It's still searching for a needle in a haystack, in the dark, with a flashlight that only works part of the time, but you don't know when it works because you're blind. Molecular modeling is an art; molecules are too hard to describe well, so we approximate A LOT. The challenge is knowing when your approximations work. Running millions of calculations is useless without knowledge of what your calculations are doing. In that respect, Stanford is doing relatively well, since they seem to limit what they do to things that make some sense to do. Drug design with simple calculations is in and of itself flawed, but can be a good starting point. Please post the link to FAD and I'll read up on it.
StandrdDev
04-29-2005, 10:19 PM
Find-a-Drug (http://www.find-a-drug.org.uk/)
Sigma
Ackbar
04-29-2005, 10:25 PM
Yah, just surfed to it, thanks. The premise seems okay; the idea is functionally the same as previous efforts like "Grid.org" and D2OL. They seem to have some "milestone" system that might be nice. But... I REALLY don't like this:
Q: Can academics access the results without charge?
A: We look sympathetically at requests from academics for a subset of the results as well as suggestions of certain protein targets.
It is my belief that results that are drawn from the public should be available for ALL people to use. :nono: Otherwise, they should do the calculations on their own computers and keep all the results to themselves! They don't seem to publish anything either, so it's questionable whether they actually do anything useful with what they get, or even whether they can truly tell a good result from a poor one.
Ackbar
04-29-2005, 10:28 PM
But... Creutzfeldt-Jakob disease (CJD) is a worthwhile disease to crunch for, since the root cause of CJD is a protein folding incorrectly and causing other proteins to fold incorrectly. In theory, that is a problem that may be solvable with simple folding calculations.
That particular sub-project might be worthwhile, but again, they don't provide enough information to know what they're doing with the data! If they published some results you might be able to tell if the data they get is being used well or not. Unfortunately, no information is available. If anyone finds publications from them, please post them!
StandrdDev
04-29-2005, 10:39 PM
Ackbar,
While you seem far more informed than I on these topics, it is my understanding that FaD DOES collaborate with academics on a variety of targets. I believe the researcher/collaborator is identified on the "Target" pages for each target. Likewise, I don't believe FaD itself publishes any material, as that is the researchers' function, I would assume. There are also many discussions on these matters in the forums where Keith (THINK) has clarified or elaborated. Lastly, I would also direct you to the certs page where the NIH has issued activity statements for molecules that have desired properties in the laboratory setting. HTH.
I look forward to reading more...
Cheers,
Sigma
Ackbar
04-29-2005, 10:50 PM
You're absolutely right, it is the job of researchers to write papers and publish the findings from the work at FAD. But unlike F@H, they don't provide any examples of results that have been published, and they admittedly are not particularly open to giving results to fellow researchers. The targets are fine; I think they are listening to the right people, as those targets seem sound. Unfortunately, we again don't know how they are quantifying what is a good result. If they had some publications from researchers that used their results, I'd be more inclined to believe it was going somewhere useful.
I just wish I had something more tangible to look at; otherwise I'm just asked to believe that it works. I was trained as a scientist, so I'm skeptical unless I see published results accepted by the community. Just the way I am... :fact: It seems that they might be hoarding the results since they collaborate with for-profit researchers.
FAD doesn't seem too bad overall though. It's hard to say what is a worthwhile project to devote so much CPU time to... I wish I knew of one that would do what I consider more correct calculations.
STEvil
04-30-2005, 12:04 AM
I kinda miss running SETI... sure, it's somewhat pointless compared to curing a disease, but you knew your work was generating results.
Confirmation of results isn't much to ask, is it?
NapalmV5
04-30-2005, 09:06 AM
Was leaning towards Find-a-Drug, but after I've read this...
Whatever is decided, I'll go with that...
StandrdDev
04-30-2005, 11:48 AM
Ackbar,
First let me start by saying that I fully understand and respect your skepticism. I, for one, do not discourage the practice of investigating or questioning the project or administrators of a DC project that one might participate in, and in fact encourage it. I'm sorry that I am not able to satisfactorily answer your questions; however, they exceed the limits of my knowledge and require a level of proof higher than I (a mere layman participant) am prepared to give.
A few links that might contain some more information:
Interview with Keith Davies of Find-a-Drug (http://usalug.org/phpBB2/viewtopic.php?t=4569)
3rd Joint Sheffield Conference on Chemoinformatics (http://cisrg.shef.ac.uk/shef2004/) or more specifically The Slides to Keith's presentation there (http://cisrg.shef.ac.uk/shef2004/talks/KDavies.pdf)
I also have a small quote excerpted from a similar discussion from Keith (dated 2/7/05) which might interest you:
"Bioterrorism antidotes: University of Rochester
Cancer: NIH validation
HIV: University of Cardiff + NIH validation
Multiple Sclerosis: University of Pennsylvania
SARS: Lubeck University
There is also an agreement in place for follow-up testing molecules for cancer activity (with NIH) and we anticipate a similar agreement for TB.
A collaboration is anticipated for the CJD project and for some obscure reasons an academic group discontinued exploring collaborations on Malaria!
Later this month we hope to announce an initiative aimed at increasing the number of collaborations and partnerships we have in place.
I should also reiterate that proteome project queries have no immediate follow-up plans and methodology project queries mainly serve to increase the quality etc of the software."
Lastly, if you still have further questions on the validity of research and collaborators, methodology, etc., I would sincerely invite you to post them or e-mail Keith, or give them to me and I'll post them for you, if you prefer. While I can't guarantee that you will like the answers you receive, or that your questions will be satisfactorily answered by him (for valid reasons), I can guarantee you will get an answer, which is possibly the point of greatest contrast that distinguishes this project from all the rest. HTH.
Sigma
Ackbar
04-30-2005, 12:04 PM
The information that you've provided me with is VERY useful! I've read through much of it and generally agree with what they are trying to do. The general methodology follows from several other things I've read about drug design. Two things caught my attention:
A) They use protein structures from X-ray crystallography studies in the Protein Data Bank (PDB). This is always interesting, as X-ray crystallography results are not necessarily the true structure of the protein of interest. So trying to dock to it may be a fruitless task. They're probably aware of this, but computationally you'd have to extend the simulations to include some type of "folding" (molecular dynamics) to account for it.
B) The other thing is, "We do not publish which molecules or how many have been found as this might compromise agreements with potential partners to fund the develop such initial hits into drugs." It does seem that FAD is in direct association with pharmaceutical companies looking to market drugs directly. I, personally, would prefer that the results be available to a wider group of individuals so society can use this information and potentially find cures, as opposed to a single company bent on marketing a particular drug hiding results so it can patent the drug and charge ridiculous prices for it.
angrysquirrel
04-30-2005, 12:40 PM
I know of one DC project that has published results: Distributed.net (http://www.distributed.net). They post their findings when they finish decoding the information. They also donate some of their winnings to a group voted on by the users and give some to the winning team/user to donate to whomever they please.
So this doesn't directly help anyone, but whoever finds the key could donate their winnings to a worthwhile group.
Magnj
05-01-2005, 12:53 PM
Wow. So what are we going to do? I'll continue folding for D2OL because we are doing VERY well. From what I'm gathering from posts here, none of these programs are actually accomplishing anything significant. So we might as well continue on with D2OL even if it is just a competition. Although I'm sure some of the guys like Rod who are way more involved might be looking to do actual useful work with their 5 Digit Farms.
Ackbar
05-01-2005, 12:59 PM
They are accomplishing a significant amount of work... but the problem requires a nearly infinite amount of work and a lot of luck. So in the end you're searching for something that may or may not exist and that you may or may not be able to find with the method you're using. All of science gains something in the long run! If we understand something small, we all gain. So I don't want to discourage use of these DC programs, but your odds of curing cancer through these programs are maybe around the same as the odds of finding an alien with SETI@home. But again, if we do an infinite amount of work, we'll find a cure! Anyone know of a way to get an infinitely fast computer and an infinite amount of them to work for us? ;)
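The "enough work eventually finds something" argument can be put in numbers with a small Python sketch. It assumes every calculation is an independent try with the same tiny chance of producing a useful hit, which is a simplification, and the per-calculation probability used below is purely hypothetical; nothing in this thread gives a real figure.

```python
import math


def chance_of_at_least_one_hit(p_hit, n_calcs):
    """Return 1 - (1 - p_hit)**n_calcs for independent tries (0 <= p_hit < 1)."""
    # log1p/expm1 keep the result accurate when p_hit is very small.
    return -math.expm1(n_calcs * math.log1p(-p_hit))


def calcs_needed(p_hit, target):
    """Smallest number of tries giving at least `target` chance of one hit.

    Requires 0 < p_hit < 1 and 0 < target < 1.
    """
    return math.ceil(math.log1p(-target) / math.log1p(-p_hit))


if __name__ == "__main__":
    P_HIT = 1e-9  # hypothetical chance that one calculation yields a real hit
    for n in (10**6, 10**8, 10**10):
        chance = chance_of_at_least_one_hit(P_HIT, n)
        print(f"{n:>14,d} calculations -> {chance:.4f} chance of a hit")
    print(f"for a 50% chance: {calcs_needed(P_HIT, 0.5):,d} calculations")
```

The chance only reaches exactly 1 in the limit of infinitely many calculations, which matches the point above. The sketch also says nothing about whether a "hit" from an approximate method would hold up in the lab.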
Magnj
05-01-2005, 01:02 PM
Anyone know of a way to get an infinitely fast computer and an infinite amount of them to work for us? ;)
Oh well why didn't anyone say so. I have a bunch in the basement just sitting around :p