Articles on this site and many others have described how high-throughput “next-generation” DNA sequencing has expanded genetic analysis capabilities to the point that researchers are finally starting to unravel the complexity of cancer, understand the mutations causing a range of genetic disorders, and use the technology for diagnosis in personalized medicine. However, DNA sequencing only provides information about gene variations, which are only a fraction of the biological picture. Proteins, not DNA, are the molecules that manifest life. It is proteins that run the reactions and provide the structure for living organisms. Genes provide the specifications, but it is proteins that actually make living organisms.
Proteins Are the Real Molecules of Life
Genes provide the basic code to make the proteins, so a lot can be inferred about how the biology works by looking at genetics. However, proteins themselves are regulated and controlled in hundreds of ways after being made. A full understanding of what’s happening in living systems will require advanced high-throughput techniques to analyze the diverse range of proteins controlling millions of biological processes happening in thousands of different cells.
The discovery of proteins predates the discovery of DNA by more than 100 years, and techniques to analyze proteins go back to the 1800s. However, the chemistry of proteins, which are polymer chains made by linking sequences of 20 different types of amino acids in various combinations, is much more complex than that of DNA, which has just 4 nucleotide building blocks. Proteins possess an incredibly diverse set of characteristics so they can provide the hundreds of thousands of functions required for life.
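To put rough numbers on the 20-versus-4 comparison, here is a minimal sketch that counts how many distinct chains of a given length each alphabet allows. The function name `sequence_space` and the example lengths are illustrative choices, not from the original article. The count is purely combinatorial and says nothing about which sequences actually occur in nature.

```python
# Count the theoretical number of distinct linear chains of a given length
# that can be built from an alphabet of a given size.
# Assumption: every position can hold any building block independently.

AMINO_ACIDS = 20   # building blocks of proteins
NUCLEOTIDES = 4    # building blocks of DNA


def sequence_space(alphabet_size: int, length: int) -> int:
    """Return alphabet_size ** length, the number of possible chains."""
    return alphabet_size ** length


if __name__ == "__main__":
    for length in (3, 10, 50):
        proteins = sequence_space(AMINO_ACIDS, length)
        dna = sequence_space(NUCLEOTIDES, length)
        # The ratio is (20 / 4) ** length, i.e. 5 ** length.
        print(f"length {length}: {proteins:.3e} protein chains, "
              f"{dna:.3e} DNA chains, ratio {proteins // dna:.3e}")
```

Even at a length of three building blocks, the protein alphabet allows 8,000 combinations against 64 for DNA, and the gap grows by a factor of five with every added position.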
System-Wide Next Generation Proteomics
Because of the diversity of proteins, there is a plethora of techniques to analyze them. Many, though, are designed to look at only certain types of proteins or to assay just one type of protein-controlled reaction. A more comprehensive understanding of how proteins, together with the interactions and modifications that regulate, control, and otherwise affect their activity, function to produce life requires more general, broad-based techniques. Serial analysis of one protein at a time just can’t provide the scale of data required. Just as “next generation” DNA sequencing massively expanded throughput at reduced cost and enabled broad-scale analysis of large numbers of genes in parallel, we need a next generation proteomics technology.
Maarten Altelaar, Javier Munoz, and Albert Heck suggest, in their recent Nature Reviews Genetics article, that instrumentation, sample preparation, and computational analysis are, in fact, reaching a point where rapid analysis of large numbers of proteins on a system-wide basis is possible. They describe how, with reverse-phase liquid chromatography coupled to mass spectrometry, next generation proteomics is beginning to emerge.
Liquid Chromatography/Mass Spectrometry (LC/MS)
Mass spectrometry breaks up molecules into chemical components and measures the mass of the resulting fragments to produce a unique profile for each molecule. Since individual proteins break up differently in a mass spectrometer, producing different profiles, the profile of a cell-extracted protein can be compared against a database of patterns produced by known proteins. In this way, the proteins present in a particular sample can be identified, much like matching the fingerprints of a criminal suspect to a database.
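The fingerprint-matching idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the protein names, the peak masses, the tolerance, and the scoring rule (the fraction of observed peaks found in a reference profile) are invented. Real search software uses far more elaborate scoring.

```python
# Toy "fingerprint" matching: compare an observed list of fragment masses
# against reference profiles and pick the best-scoring entry.
# All names, masses, and the tolerance are hypothetical.

def match_score(observed, reference, tolerance=0.5):
    """Fraction of observed peaks lying within `tolerance` of a reference peak."""
    if not observed:
        return 0.0
    hits = sum(
        1 for peak in observed
        if any(abs(peak - ref) <= tolerance for ref in reference)
    )
    return hits / len(observed)


def identify(observed, database, tolerance=0.5):
    """Return (name, score) of the database entry that best matches `observed`."""
    scores = {
        name: match_score(observed, peaks, tolerance)
        for name, peaks in database.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical reference profiles (fragment masses in arbitrary units).
DATABASE = {
    "protein_A": [175.1, 304.2, 417.3, 530.3, 645.4],
    "protein_B": [147.1, 276.2, 389.2, 502.3, 659.4],
    "protein_C": [175.1, 262.1, 375.2, 488.3, 601.4],
}

# Hypothetical measurement: four peaks near protein_A plus one stray peak.
observed_peaks = [175.2, 304.0, 417.5, 530.1, 700.0]

best_name, best_score = identify(observed_peaks, DATABASE)
print(best_name, best_score)
```

With these invented numbers, four of the five observed peaks fall within the tolerance of `protein_A`, so it is reported as the best match. The stray peak shows why a score threshold, not just a best hit, matters in practice.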
Before running the samples through a mass spectrometer, liquid chromatography can be used as part of an automated system to separate out the individual proteins in complex biological samples. Instruments that combine liquid chromatography with mass spectrometry (LC/MS) thus enable parallel analysis of the range of proteins present in virtually any sample.
How LC/MS Is Being Used for Broad-Scale Proteomic Analysis
Altelaar et al. point out that, in 2008, this LC/MS approach enabled researchers from the Max Planck Institute in Germany to characterize all the proteins of a type of yeast. At that time, the process took several days. Just four years later, researchers at the same institute were able to run a very similar study in just a few hours. Another study that same year showed that the approach also worked to profile the larger number of proteins expressed in human cells.
More complex analyses have also been done using LC/MS. For example, groups in the Netherlands and Wisconsin used LC/MS analysis to compare all the expressed proteins in human embryonic stem cells with those in induced pluripotent stem cells, which are made in the lab from differentiated adult cells. There did not appear to be a significant difference in the protein composition of these two types of cells, suggesting that induced pluripotent stem cells may be a reasonable substitute for human embryo-derived stem cells.
Unraveling the Complexity of Proteins
While the LC/MS approach is very powerful, proteomic analysis involves more than just looking at which proteins are expressed in particular cells. As mentioned above, there are hundreds of ways proteins can be modified and controlled. For example, cells often regulate responses to their environment by activating or deactivating proteins in series through chemical modifications, such as adding or removing a phosphate group, to create a signalling cascade of events, almost like dominoes falling in a series. Proteomic analysis requires monitoring these sorts of modifications and understanding which proteins interact with each other, in addition to simply knowing the protein levels present. There are estimated to be more than 130,000 protein-protein interactions in humans.
Is This the Start of a Revolution in Proteomics?
As a result of the complexity of proteins, there is still a long way to go before LC/MS technology can generate a comprehensive picture of the proteome of any living cell as conveniently and cost-effectively as whole-genome sequencing does for the genome. However, there has been a real, almost revolutionary, advance in the system-wide analysis of proteins in just the last several years.
For example, Altelaar et al. point out a recent study from Stanford that combined protein LC/MS results with complementary system-wide genetic, immunologic, and metabolic analyses to monitor the physiology of an individual over a 14-month period. The results showed how dynamic all these measurements are over an extended period, and the extent to which the integrated data provide a much more complete picture of an individual’s state of health than data from any one class of biomolecules.
While use in the clinic is still many years off, it is clear that advanced LC/MS will significantly advance our understanding of disease, reveal the mechanisms of drug resistance, and identify new biomarkers for diagnosis and treatment. System-wide proteomic analysis will likely produce illuminating and surprising results over the next several years.