In proteomics studies, proteins are digested into hundreds of thousands of peptides, thus creating very large and complicated mixtures. Simultaneous electrospray ionization of these complex mixtures, however, results in suppression of ion formation. Therefore, it is essential to have effective chromatographic methods to separate the peptides before analysis with mass spectrometry, to relieve ion suppression and to allow the mass spectrometer sufficient time to collect tandem mass spectra of peptide ions. The challenges involved in developing such separations are great, however. This is the first of a series of articles exploring topics that will be addressed at the HPLC 2016 conference in San Francisco, from June 19 to 24.
Large-scale protein biochemistry, otherwise known as proteomics, was enabled by the protein sequence infrastructure created by genome sequencing. Tandem mass spectrometers create spectra of peptides that can be used to search sequence databases to assign amino acid sequences to the spectra. Such methods allow the analysis of protein complexes, organelles, and whole cells through a process that proteolytically digests proteins into a collection of peptides that can then be identified using tandem mass spectrometry (MS) and informatics. Intrinsic to this strategy is the creation of very large and complicated mixtures of peptides. Simultaneous electrospray ionization of large and diverse collections of peptides results in suppression of ion formation; consequently it is imperative to separate peptides sufficiently to relieve ion suppression and to allow the mass spectrometer sufficient time to collect tandem mass spectra of peptide ions. Thus, peptide separation methods are critical for proteomics studies. The challenges involved are great.
The digestion of the proteins from a human cell can produce many hundreds of thousands of peptides. The analysis of such a complex mixture by tandem MS depends on both the resolution of the separation and the scan speed of the tandem mass spectrometer. The separation has to reduce the complexity of the mixture entering the instrument at any moment so that the instrument can maximize the collection of tandem mass spectra of peptide ions. The scan speeds of tandem mass spectrometers are increasing, allowing the collection of more spectra per unit time. New strategies such as data-independent acquisition promise to increase data acquisition rates further. Clearly, this situation creates a challenging separation problem, especially if the goal is also to minimize the overall time required for the analysis.
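The interplay between scan speed and analysis time can be made concrete with a rough back-of-the-envelope calculation. The sketch below is only an illustration: the function names, the 20 spectra/s scan rate, the 90-min gradient, and the 300,000-peptide mixture are assumed values chosen for the example, not figures reported in this article, and the calculation ignores duty-cycle losses, repeated sampling of the same ion, and uneven elution.

```python
def max_tandem_spectra(scan_rate_hz: float, gradient_minutes: float) -> int:
    """Upper bound on tandem mass spectra collected during one gradient.

    Assumes the instrument acquires spectra continuously at a constant
    rate, which is an idealization.
    """
    return int(scan_rate_hz * gradient_minutes * 60)


def fraction_sampled(n_peptides: int, scan_rate_hz: float,
                     gradient_minutes: float) -> float:
    """Best-case fraction of peptides that could receive one spectrum each."""
    if n_peptides <= 0:
        raise ValueError("n_peptides must be positive")
    spectra = max_tandem_spectra(scan_rate_hz, gradient_minutes)
    return min(1.0, spectra / n_peptides)


if __name__ == "__main__":
    # Hypothetical inputs: 20 spectra/s, 90-min gradient, 300,000 peptides.
    print(max_tandem_spectra(20, 90))
    print(fraction_sampled(300_000, 20, 90))
```

Even under these optimistic assumptions, the number of spectra that can be acquired falls well short of the number of peptides, which is why either faster scanning, longer gradients, or better-resolved separations are needed.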
Initial methods to resolve such complicated peptide mixtures used a two-dimensional liquid chromatography (2D LC) separation technique that combined a strong cation exchange separation with a reversed-phase separation on a C18 column (1,2). Although the two modes are not perfectly orthogonal, together they provide excellent peak capacity, but long analysis times are needed to achieve high numbers of protein identifications. Various other 2D approaches to this problem have been developed, including a high-pH reversed-phase separation followed by a low-pH reversed-phase separation, and methods that combine a strong anion exchange step with a reversed-phase step, analogous to the strong cation exchange–reversed phase combination (3).
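The appeal of 2D LC can be illustrated with the commonly used product rule, under which the ideal peak capacity of a 2D separation is the product of the peak capacities of the two dimensions. The sketch below assumes that rule and adds a simple `coverage` factor between 0 and 1 as a stand-in for imperfect orthogonality; that factor, the function name, and the numbers in the usage example are illustrative assumptions, not values from the cited studies.

```python
def peak_capacity_2d(n_first: float, n_second: float,
                     coverage: float = 1.0) -> float:
    """Estimate 2D peak capacity with the idealized product rule.

    n_first, n_second: peak capacities of the first and second dimensions.
    coverage: assumed fraction of the 2D separation space actually used
              (1.0 means perfectly orthogonal modes); a simplification.
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    if n_first < 0 or n_second < 0:
        raise ValueError("peak capacities must be non-negative")
    return n_first * n_second * coverage


if __name__ == "__main__":
    # Hypothetical values: 50 in the ion-exchange dimension,
    # 200 in the reversed-phase dimension.
    print(peak_capacity_2d(50, 200))        # ideal, fully orthogonal case
    print(peak_capacity_2d(50, 200, 0.6))   # assumed 60% use of the 2D space
```

The product rule is an upper bound; in practice, undersampling of the first dimension and correlated retention between the two modes reduce the usable peak capacity.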
Through continuous improvements to MS technology, it is now possible to analyze most of a mammalian cell proteome. In fact, a nearly complete analysis of the proteome of the eukaryote Saccharomyces cerevisiae has been reported (4), with an analysis time under 90 min using reversed-phase high-performance LC (HPLC). Shorter analysis times make it possible to design higher-throughput experiments that examine more human patient samples or more sample conditions. The development of high-resolution single-dimension separations based on ultrahigh-pressure liquid chromatography (UHPLC) using a C18 column has offered a new direction, although the time required to achieve high numbers of peptide identifications is still 4–8 h for complex proteomes (5,6). Additionally, although those analyses identify nearly all of the proteins, they do so with very low numbers of identified peptides per protein (in other words, low sequence coverage); in fact, most of the proteome is identified with one peptide per protein.
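Sequence coverage, as used here, is the fraction of a protein's residues that fall within at least one identified peptide. A minimal sketch of that calculation follows; the function name and the short example sequence are made up for illustration, and matching is by exact substring only, which ignores modified residues and ambiguous amino acids.

```python
from typing import Iterable


def sequence_coverage(protein: str, peptides: Iterable[str]) -> float:
    """Fraction of protein residues covered by at least one peptide.

    Every exact occurrence of each peptide in the protein is counted,
    and overlapping peptides do not count a residue twice.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            # Continue searching one residue later to catch repeats.
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)


if __name__ == "__main__":
    # Hypothetical 16-residue sequence with one and then two identified peptides.
    protein = "MKTAYIAKQRQISFVK"
    print(sequence_coverage(protein, ["TAYIAK"]))
    print(sequence_coverage(protein, ["TAYIAK", "QISFVK"]))
```

A protein identified from a single peptide, as in the first call, is detected but leaves most of its sequence unobserved, so isoforms or modifications elsewhere in the sequence would go unnoticed.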
Another reason to increase the resolution and peak capacity of separations is to improve the dynamic range of the analysis. Within a proteome, the abundance of expressed proteins spans a range of at least 10⁶, and this range doesn’t include post-translational modifications (PTMs). It remains to be seen whether single-dimension UHPLC separations can be extended to encompass the entire proteome of a mammalian cell and capture all the subtleties and nuances of protein structure. Developing high-speed 2D LC separations may provide a means to achieve this goal, but this type of method does present a challenge because of the longer analysis times often associated with separations in two dimensions.
As separation methods in combination with advanced MS technology bring us closer to the capability to characterize the complete human proteome, advancing these methods to achieve high sequence coverage will be essential. High sequence coverage in combination with good dynamic range will allow the collection of information about isoforms and PTMs, which will be important for understanding the function of proteins.
John Yates III is the Ernest W. Hahn Professor of Chemical Physiology and Molecular and Cellular Neurobiology at The Scripps Research Institute, in La Jolla, California.