In June 2000, President Bill Clinton announced the completion of the first draft of the full human genome alongside Craig Venter and Francis Collins, two of the leading scientists instrumental in this endeavor. This scientific milestone was rightfully heralded as an accomplishment that would "lead to new ways to prevent, diagnose, treat, and cure disease" and the "starting point for the new era of genetic medicine."
While the first full DNA sequence draft of the human genome certainly marks the start of the genomics era, the actual sequence was really just a reference since there is no single human genome. Every human's genome is unique. In fact, the sequence that was obtained in that first draft at the turn of the century was an amalgam of DNA from multiple individuals.
What Is the Genome?
The genome is all the DNA that makes up the total heritable genetic content of an individual, and everyone's is different. The first draft of the human genome just provided the basic template to define the genes, regulatory points, structural elements, and other features. This generic sequence did and still does help advance an overall understanding of biology and disease pathology, and provides a tool for researchers to discover new treatments and diagnostics. However, for personalized diagnosis of disease risks, selection of treatments, and other medicine based on an individual's genetic makeup, it is the small differences in the DNA from individual to individual that are critical. The human genome sequence just provides a baseline so the small differences in the three billion nucleotides contained in the genomic sequence of each person can be identified.
Mutations Cause Everyone’s Genomes to Vary
These small differences in individuals' genomes are created by mutations, specifically germline mutations, which arise from mistakes made when DNA is copied and chromosomes recombine during the formation of egg and sperm cells. Mutations also occur in other cells in the body. When they do, they are referred to as somatic mutations, and they typically affect only a few body cells, although damaging ones may produce cancerous tumors. Germline mutations, however, which occur in the sperm or egg either before or directly after fertilization, become a permanent feature of all the cells in the individual and can be passed on to that individual's descendants. These mutations provide the working material for evolution and define the range of traits and characteristics that make humans diverse.
What Exactly Are Mutations?
DNA provides the information to make and regulate the proteins that form the structure of the body and run all of its chemical reactions. Chemically, however, DNA is just a long chain made by combining four different bases: adenine (A), cytosine (C), guanine (G), and thymine (T). It is the order of these chemical bases that provides the information, so if one of the bases is changed, it can provide different information that may alter the composition or amount of a protein. The particular DNA base sequence, with all its variations, that one organism has is its genotype.
Small changes in the DNA base sequence do not always change something in the organism because most genetic instructions require several bases together. For example, sets of three DNA bases together code for each one of the amino acids that are linked together to make the peptide chains that form proteins. Since, in most cases, several of these DNA base triplets define the same amino acid, one base change may not affect anything. The genetic code has a lot of built-in redundancy. However, some single base changes alter an important element so that a protein functions incorrectly or just differently, which in turn changes a person's appearance or chemistry. These changes in the actual traits an individual has are phenotypic variations.
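The redundancy described above can be illustrated with a short Python sketch. It is only an illustration: the table lists six of the 64 codons in the standard genetic code, and the function name classify_change is made up for this example.

```python
# Excerpt of the standard genetic code: DNA codon -> amino acid.
# Only six of the 64 codons are listed to keep the sketch short.
CODON_TABLE = {
    "GAA": "Glu", "GAG": "Glu",  # glutamate
    "GTA": "Val", "GTG": "Val",  # valine
    "GGA": "Gly", "GGG": "Gly",  # glycine
}


def classify_change(ref_codon, alt_codon):
    """Return 'synonymous' if both codons encode the same amino acid,
    otherwise 'nonsynonymous'. (Hypothetical helper for illustration.)"""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    return "synonymous" if ref_aa == alt_aa else "nonsynonymous"


# Changing the third base of GAG to A still encodes glutamate.
silent = classify_change("GAG", "GAA")
# Changing the second base of GAG to T swaps glutamate for valine.
altering = classify_change("GAG", "GTG")
print(silent, altering)
```

The second change, GAG to GTG, is the same kind of single-base substitution that underlies sickle cell disease, which shows how one altered base can matter while another passes unnoticed.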
Phenotypes include almost any physical or physiologic trait, such as eye color, hair color, shape of the nose, length of fingers, and also, genetic diseases, heart conditions, different abilities to taste or digest food types, levels of proteins in blood, etc. Virtually all phenotypic differences result, at least in part, from some combination of variations in genes. The difficulty is figuring out which DNA variations are responsible for any particular phenotype.
Most traits result from combinations of genetic variations. Some variations depend on others to produce an effect, and some produce a phenotype only under the right environmental conditions. As the human population evolved and spread out over the world, these small mutations accumulated. Since most genetic alterations did not have a big effect on health or survival, they simply remained in the DNA and were passed on from generation to generation. With millions of genotypic variations between individuals, it is extremely difficult to sift out the few relevant ones for a specific trait, and it is even more difficult to figure out exactly how they affect that trait.
Mapping the Genetic Differences
To work out the important genotypic mutations that affect health-related issues such as disease susceptibility or drug sensitivity, you first need to know what common genetic variations occur in the human population. Characterizing individual genome variations means sequencing the three billion bases of each genome from a large number of people and then comparing them all to each other. Fortunately, DNA sequencing technology has advanced enormously since the start of the Human Genome Project, which took about 10 years to complete the first version of the human genome.
Types of Genetic Variation
The 1000 Genomes Project has taken on this challenge of cataloging human genome variations. Most genotypic variations are single-base mutations, more formally known as single nucleotide polymorphisms (SNPs). About 38 million of these SNPs have been found to occur in at least 1% of the human population.
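The idea of cataloging single-base variations and keeping those that reach a frequency threshold can be sketched in Python. This is a toy illustration, not how the 1000 Genomes Project actually processes data: it assumes the sequences are already aligned and of equal length, treats each sample as a single sequence, and uses made-up data and function names (find_snvs, common_variants).

```python
from collections import Counter


def find_snvs(reference, sample):
    """List (position, ref_base, sample_base) for every single-base mismatch.
    Assumes the two sequences are already aligned and equal in length."""
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference, sample))
        if ref != alt
    ]


def common_variants(reference, samples, min_frequency=0.01):
    """Return {variant: frequency} for variants seen in at least
    min_frequency of the samples."""
    counts = Counter()
    for sample in samples:
        counts.update(find_snvs(reference, sample))
    total = len(samples)
    return {
        variant: count / total
        for variant, count in counts.items()
        if count / total >= min_frequency
    }


# Made-up toy data: an 8-base "reference" and four sampled sequences.
REFERENCE = "ACGTACGT"
SAMPLES = ["ACGTACGT", "ACGAACGT", "ACGAACGT", "ACGTACGA"]

# With a 50% threshold, only the T-to-A change at position 3 qualifies.
print(common_variants(REFERENCE, SAMPLES, min_frequency=0.5))
```

Real genomes add complications this sketch ignores, such as two copies of each chromosome per person, sequencing errors, and insertions or deletions that shift the alignment.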
SNPs have been associated with an incredibly wide range of traits, diseases, and behaviors: health conditions such as metabolic syndrome, diseases like Alzheimer's, metabolic differences such as sensitivity to caffeine, taste preferences, the effectiveness of cancer treatments, and more. SNPs have been associated with almost every type of trait imaginable. Also, since SNP mutations have occurred at different times and places around the world and are passed on to descendants, most SNPs are also associated with geographic and ethnographic ancestry. SNPedia maintains a database of all the SNPs identified as associated with some interesting human trait. It has over 35,000 entries.
In addition to substitutions of single bases, sometimes extra bases are added to or deleted from DNA. These changes are called indels. While only about 1.4 million indels occur with at least 1% frequency in humans, some have been linked to susceptibility to certain cancers or neurological diseases.
The Real Genomics Revolution Is About to Unfold
While the first draft of the human genome at the beginning of the century symbolized the start of the genomics era, the real genomics revolution requires two other milestones. One is an understanding of genetic variation that is good enough to make sense of the specific differences in an individual's genome. The second is DNA sequencing technology that is cheap, easy, and fast enough to make it routine for anyone to have their whole genome sequenced.
The cost of DNA sequencing is dropping precipitously, and companies are already claiming to have crossed the threshold of being able to sequence a person’s full genome for less than $1,000. Also, enough associations have been found between DNA variations, diseases, and other phenotypes that sequencing data is becoming useful for predicting disease risks and health vulnerabilities. It seems we are on the cusp of a new era in medicine and maybe in self-knowledge.