An organism's genome is its set of chromosomes, its complete set of genetic information. Many have compared the genome to a massive database, a blueprint for every protein and organ in the organism. Certainly it is an extraordinary storage device. But can the computer analogy be taken further? Can the genome be thought of as a program that controls the moment-to-moment functioning of the organism? Can it be viewed as a "self-installing and self-launching application" that enables an organism to develop or "build" itself?
In order to make this metaphor concrete, I propose that computer scientists and biologists begin attempting to describe the processes that the genome participates in as though they were parts of a large computer program. Specifically, they could create flowcharts with genes as objects connected by logical terms like "and" and "or" and, of course, "while" loops.
Schematic representation of genetic processes has a long history. The "Fundamental Dogma" of genetics, as James Watson once glibly called it, is represented by:
"DNA is transcribed by RNA and RNA is the template upon which proteins are constructed."
As Watson himself knows better than anyone, the picture is far more complex than that. Robert Robbins of the DOE told me that one can begin to approximate it with the chart
DNA --> primary transcript --> messenger RNA --> primary polypeptide --> processed polypeptide --> final protein --> does stuff
with each item in the sequence giving feedback to all the earlier items, and DNA even giving feedback to itself.
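To make the shape of that chart concrete, here is a minimal Python sketch that writes it down as a list of stages plus two sets of arrows: the forward steps, and the feedback from each item to all earlier items (with DNA also feeding back to itself). The names STAGES and build_edges are my own illustrative choices, and the sketch only records which items are connected, not what the feedback actually does.

```python
# Stages of the chart, in order (names taken from the chart above).
STAGES = [
    "DNA",
    "primary transcript",
    "messenger RNA",
    "primary polypeptide",
    "processed polypeptide",
    "final protein",
    "does stuff",
]


def build_edges(stages):
    """Return (forward, feedback) lists of (source, target) pairs."""
    # Forward arrows: each stage leads to the next one in the sequence.
    forward = [(stages[i], stages[i + 1]) for i in range(len(stages) - 1)]
    # Feedback arrows: each stage feeds back to every earlier stage.
    feedback = [
        (stages[later], stages[earlier])
        for later in range(len(stages))
        for earlier in range(later)
    ]
    # DNA additionally feeds back to itself.
    feedback.append((stages[0], stages[0]))
    return forward, feedback


forward, feedback = build_edges(STAGES)
print(len(forward), "forward arrows,", len(feedback), "feedback arrows")
```

Even this crude tally shows why the simple linear picture is misleading: the feedback arrows far outnumber the forward ones.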
This past spring I posted a note to the bionet.genome.chromosome and bionet.general discussion groups asking whether a genome can be regarded as a computer program. Quite a lively discussion ensued, which I want to make available to a larger audience. Excerpts of the discussion are linked from my synopsis of it below.
It began with my original posting on April 13, 1995, which was followed by a very thoughtful and detailed reply from Robert Robbins of the US Dept. of Energy Genome Database Project. Robbins is a biologist with a serious interest in having computer scientists consider my questions. He was encouraging while politely pointing out the naive errors in my thinking.
Robbins himself then heard from G. Dellaire of McGill who raised some interesting points of his own. Robbins replied in detail to Dellaire's comments.
David Baillie from the Institute of Mol. Biol. Biochem. at Simon Fraser University in Burnaby, Canada; Vahe Bedian from the Univ. of Pennsylvania; and Paul O'Neill from the Univ. of Utah Computer Center offered some short but useful comments.
Tengleong Chew from the St. Louis University Medical Center replied in detail to my posting and closed with the tantalizing remark that "There are potential Nobel Prizes hidden in this field."
I sent a few people the collected comments, and G. Dellaire replied with some detailed remarks on the comments of others.
Next, I posted my first attempt to create a flow chart of a genetic process: the regulation of the gene for beta-galactosidase, an enzyme used for the digestion of the sugar lactose in the bacterium E. coli. The gene is activated if glucose is not present and lactose is.
The chart seemed fairly simple, but Keith Robison of Harvard pointed out that the processes of detecting the presence of glucose and lactose took place in parallel, not in a linear order as my chart implied.
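The rule itself, together with Robison's point about parallel detection, can be sketched in a few lines of Python. This is only an illustration under loud assumptions: both sugars are treated as simple present/absent values, and the function name lacz_expressed is hypothetical, not taken from the discussion.

```python
def lacz_expressed(glucose_present: bool, lactose_present: bool) -> bool:
    """Return True when the gene for beta-galactosidase is switched on.

    The two inputs are independent arguments, mirroring the point that
    glucose and lactose are detected in parallel rather than one after
    the other. Treating them as plain booleans is a simplification.
    """
    # Activated only if lactose is present AND glucose is not.
    return lactose_present and not glucose_present


# Print the full truth table for the two inputs.
for glucose in (False, True):
    for lactose in (False, True):
        state = "on" if lacz_expressed(glucose, lactose) else "off"
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> gene {state}")
```

Because the function takes both conditions at once and combines them with a single "and", neither test comes "first", which is closer to the parallel picture than a flowchart that checks one sugar and then the other.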
I responded to Robison saying basically that this type of discussion was precisely what I hoped would result from my posting. This was not, after all, an obvious fact to a naive non-molecular biologist.
Vahe Bedian commented more enthusiastically on the rough chart and Robison's remarks.
Guy Tantenzopf suggested a few candidate organisms for this type of analysis. Ron Sapolsky gave references to two papers by P.D. Karp that deal with some of the same questions that I had raised.
This discussion has been very enriching, first because of the intelligence and generosity of the electronic acquaintances I have made in the international molecular biology community, but also because it has made me realize that there is a place - perhaps even a need - for naive computer science thinking in the world of molecular genetics.