Junk DNA

Junk DNA is a provisional label for the portions of the DNA sequence of a chromosome or a genome for which no function has yet been identified.

More than ninety percent of human DNA is referred to as junk DNA: repetitive, seemingly nonsensical sequences that do not code for proteins (the key job of DNA is to produce specific proteins). Many believe that the junk is just deactivated leftovers from earlier stages in our evolution. The truth is that junk DNA reveals not just our genetic past, but also our genetic future. That is, the junk DNA contains a blueprint not just of what we currently are and how we got here, but also of what we will eventually become - genetic destiny writ large.

Both RNA and DNA are composed of molecular building blocks called nucleotides, each made of one nitrogenous base plus a phosphate molecule and a sugar molecule. DNA consists of four different nucleotides: cytosine, guanine, adenine, and thymine (abbreviated C, G, A, and T, respectively). Due to their structure, A and G are known as purines, while C and T are called pyrimidines. Individual DNA strings typically consist of many thousands of these nucleotides. DNA exists as a double strand, in which each nucleotide binds to a complementary nucleotide on the opposite strand: A always binds to T, and C always binds to G. Every three nucleotides constitute what is known as a codon, and each codon codes for a particular amino acid according to the genetic code.
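
The pairing rule and the grouping into codons can be illustrated with a short Python sketch. The function names `complement` and `codons` are invented for this illustration, and the example sequence is arbitrary.

```python
# Base-pairing rule from the text: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA string."""
    return "".join(PAIRS[base] for base in strand)

def codons(strand: str) -> list[str]:
    """Split a DNA string into consecutive three-letter codons.

    Trailing bases that do not fill a whole codon are dropped.
    """
    usable = len(strand) - len(strand) % 3
    return [strand[i:i + 3] for i in range(0, usable, 3)]

print(complement("CTAGTCG"))  # GATCAGC
print(codons("CTAGTCG"))      # ['CTA', 'GTC']
```

The seventh base of the example has no complete codon to belong to, so `codons` leaves it out.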

There exists a second layer of information on top of the traditional C,G,A,T base-pair DNA coding. This additional layer is related to the presence or absence of a methyl group, CH3, attached to the base cytosine (the "C" in C,G,A,T). This well-known phenomenon is called "cytosine methylation" and is traditionally ignored in mapping DNA.

But the methylation state is preserved in DNA replication, meaning it is faithfully copied from generation to generation - and therefore could code significant additional information into the DNA. The sequence of attached methyl groups inside the long strings of supposedly junk DNA forms meaningful patterns. The binary pattern of presence or absence of methyl groups forms a complex algorithm for invoking what geneticists call "frameshift mutations".
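
To make the idea of a binary methylation pattern concrete, here is a minimal Python sketch that reads one bit per cytosine from a DNA string. The function name `methylation_bits`, the example sequence, and the methylated positions are all made up for illustration; the text does not specify how such a pattern would actually be laid out.

```python
def methylation_bits(strand: str, methylated: set[int]) -> str:
    """Return one bit per cytosine: '1' if methylated, '0' if not.

    `methylated` holds zero-based indices into `strand`.
    """
    return "".join(
        "1" if index in methylated else "0"
        for index, base in enumerate(strand)
        if base == "C"
    )

strand = "CTAGTCGCC"           # cytosines at indices 0, 5, 7 and 8
methylated_positions = {0, 7}  # hypothetical methylated cytosines
print(methylation_bits(strand, methylated_positions))  # 1010
```

Only the cytosines contribute bits, so the pattern is a second string running alongside the ordinary base sequence.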

Frameshifts occur when a nucleotide (e.g. one containing the base T) is unexpectedly added to or deleted from a DNA string. When that happens, the rest of the string is shifted along, so that the codons (the "words" of the genetic language) downstream of the insertion or deletion are regrouped. This may be of little consequence if it happens near the end of a DNA string, or of great consequence if it happens near the beginning. So, if you take out the first "T" from a DNA string reading C-T-A-G-T-C-G, then instead of the first two codons being C-T-A and G-T-C, they become C-A-G and T-C-G - a completely different genetic message. Although such frameshift mutations were previously thought to be random (and almost always detrimental), they can be invoked by natural processes when the "junk" DNA is transcribed into RNA, thereby endowing a person with a pre-programmed evolutionary improvement. Transcription is the process by which a DNA sequence is copied into RNA.
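
The C-T-A-G-T-C-G example can be reproduced in a few lines of Python. This is only a sketch of the regrouping effect; the helper names `codons` and `delete_base` are invented here.

```python
def codons(strand: str) -> list[str]:
    """Split a DNA string into consecutive three-letter codons,
    dropping any trailing bases that do not fill a whole codon."""
    usable = len(strand) - len(strand) % 3
    return [strand[i:i + 3] for i in range(0, usable, 3)]

def delete_base(strand: str, index: int) -> str:
    """Return the string with the base at `index` removed."""
    return strand[:index] + strand[index + 1:]

original = "CTAGTCG"
# Remove the first "T", as in the example in the text.
shifted = delete_base(original, original.index("T"))

print(codons(original))  # ['CTA', 'GTC']
print(codons(shifted))   # ['CAG', 'TCG']
```

Every codon after the deletion point changes, which is why a frameshift near the beginning of a string alters far more of the message than one near the end.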

Mitochondria are small organelles within cells that contain their own DNA, which is not part of the chromosomes and is inherited solely from the mother. This mitochondrial DNA provides a checksum for random frameshift mutations. A checksum is a simple mathematical procedure for verifying the integrity of a lengthy string of data - such as the genetic information coded in chromosomes.
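
As an illustration of what a checksum does with a string of data, the Python sketch below uses a position-weighted sum. The scheme, the base values, and the modulus 97 are assumptions chosen for simplicity; the text does not say which procedure is meant.

```python
# Hypothetical numeric value for each base (an assumption for this sketch).
BASE_VALUE = {"A": 1, "C": 2, "G": 3, "T": 4}

def checksum(strand: str) -> int:
    """Position-weighted sum of base values, reduced modulo 97."""
    total = sum(
        position * BASE_VALUE[base]
        for position, base in enumerate(strand, start=1)
    )
    return total % 97

stored = checksum("CTAGTCG")  # checksum recorded for the intact string

print(checksum("CTAGTCG") == stored)  # True: string is unchanged
print(checksum("CAGTCG") == stored)   # False: first "T" was deleted
```

Weighting each base by its position means that a deletion, which moves every later base, will normally change the result. A simple scheme like this can still miss some changes, since different strings can share a checksum.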

If a frameshift occurs by accident (due to a random addition or loss of a base pair), the checksum sees to it that the DNA in the female's egg cells is corrected, so that the error in coding won't be passed on to the next generation. Only if the frameshift is invoked during RNA transcription of junk DNA does it get passed on to the egg cells. Cells thus have a built-in mechanism that corrects random frameshifts but still allows certain special frameshifts to be passed on. Those mutations had been waiting to be activated; the characteristics they coded for had been pre-programmed into the DNA.

The cytosine methylation encodes a kind of counter that increments very slowly - on the order of tens of thousands of years. The frameshift mutations and the evolutionary changes they cause were timed to occur throughout the ages.

The fossil record shows a notable lack of evidence for missing links - intermediate stages halfway between one species and the next. This is because the evolutionary process is characterized by long periods with little or no change interspersed with short periods of rapid speciation. Evolution occurs in huge and sudden jumps rather than in a steady process of slow change.

Environmental upheavals destabilize populations and allow the offspring of only a handful of individuals with a mutant characteristic to rapidly become the new dominant form. This means all members of a new species are descended from a very few members of the previous species - the entire new species arises from a very tiny gene pool. Tiny gene pools are normally recipes for disaster, but with timed frameshifts occurring almost simultaneously in millions of members of a species, new species can arise safely without the dangerous narrowing of the gene pool.