The identification of nontargeted species in environmental and commercial samples by mass spectrometry can be very difficult.
In this column installment, guest authors from Eastman Chemical Company describe their systematic approach for the identification
of nontargeted species using nominal and accurate mass data, searching both mass spectral and "spectra-less" databases.
Organic mass spectrometry (MS) has witnessed an extraordinary increase in capabilities this past decade because of major advances
in ionization sources, analyzers, detectors, chromatography, and computer technology. Many of these technological advances
focus on biological applications, a fact plainly evident to attendees of the American Society for Mass Spectrometry's (ASMS)
annual conferences. Yet the significance of this ever-sophisticated technology has not been lost on industrial, environmental,
and forensic mass spectrometrists, whose work involves characterizing commercial chemical products.
Eastman Chemical Company is a global manufacturer of polymers, fibers, coatings, additives, solvents, adhesives, and many
other products. Gas chromatography–mass spectrometry (GC–MS) and liquid chromatography–mass spectrometry (LC–MS) have proven
to be essential for characterizing our company's products and those of other companies. With reasonable effort, we routinely
and reliably obtain mass spectral data from these highly sensitive yet robust techniques. However, unless the data can
be converted into structural information, they are of little use in resolving the analytical problem at hand.
Over the last 34 years, we have developed and refined a systematic process (1,2) for the identification of nontargeted species using
GC–MS and LC–MS analyses. We refer to these types of species as "known unknowns": species that are known in the chemical
literature or MS reference databases but unknown to the investigator. The essence of the process is finding candidate structures
by searching mass spectral databases, Chemical Abstracts Service (CAS) databases, and ChemSpider databases. Figure 1 presents a
simplified flowchart of the overall process; the subsequent sections discuss individual steps and illustrate three examples
in the identification of known unknowns.
Figure 1: Simplified flowchart for identifying "known unknowns." MF = molecular formula and MW = molecular weight.
Computer-Searchable Mass Spectral Databases
The first step in the process is computer searching of spectra against mass spectral databases. This approach (3) is very
powerful and efficient for the identification of unknowns, typically requiring 3–5 s for each component in a mixture. Electron
ionization (EI) databases are used for identifying compounds in GC–MS analyses, and collision-induced dissociation (CID) databases
are used for LC–MS analyses. The databases are purchased from commercial sources or are created from compounds characterized
at our company (see Table I).
Table I: Spectra with associated structures searched with NIST search software
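To make the idea of a spectral library search concrete, the following minimal sketch ranks library entries by a plain cosine (dot-product) similarity between nominal-mass spectra. It is an illustration only and is not the algorithm used by the NIST search software; the function names, the two library entries, and the query spectrum are all made up for this example.

```python
import math

def cosine_score(query, reference):
    """Cosine similarity between two spectra given as {nominal m/z: intensity}."""
    shared = set(query) & set(reference)
    dot = sum(query[mz] * reference[mz] for mz in shared)
    norm_q = math.sqrt(sum(i * i for i in query.values()))
    norm_r = math.sqrt(sum(i * i for i in reference.values()))
    if norm_q == 0 or norm_r == 0:
        return 0.0
    return dot / (norm_q * norm_r)

def search_library(query, library, top_n=3):
    """Rank library entries (name -> spectrum) by similarity to the query."""
    hits = [(cosine_score(query, spec), name) for name, spec in library.items()]
    hits.sort(reverse=True)  # best score first
    return hits[:top_n]

# Hypothetical spectra for illustration; not taken from any real database.
library = {
    "compound A": {43: 100, 58: 35, 71: 20},
    "compound B": {77: 100, 105: 80, 51: 30},
}
query = {43: 95, 58: 40, 71: 15}

best_score, best_name = search_library(query, library)[0]
print(best_name, round(best_score, 3))
```

Production search programs typically add intensity and mass weighting, prefiltering, and match-quality statistics on top of a comparison like this one.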
EI mass spectral searches are normally more successful than CID searches for two reasons. First, the number
of entries in EI databases for GC–MS is approximately 10 times larger than that for CID databases for LC–MS. Second, 70-eV
EI spectra are much more reproducible than CID spectra, which can vary significantly depending on instrument design and user-specified
NIST MS Search Software as Eastman Corporate Standard
We adopted the National Institute of Standards and Technology (NIST) MS Search program as our corporate standard for searching
mass spectral databases for the following reasons:
- Searches both EI and CID databases
- Performs fast EI searches with essentially no false negatives (3)
- Searches libraries by spectra, structure, and other data fields
- Merges search results for multiple databases
- Creates users' libraries with structures and other data fields
- Merges, archives, and distributes users' libraries nightly
- Imports spectra and structures from all major commercial software programs
- Correlates fragments to substructures for EI and CID spectra via MS Interpreter utility
The automated process of merging, archiving, and distributing our corporate EI and CID databases occurs nightly by means of
batch files and a simple event-scheduler utility. A standard GC–MS laboratory computer on the network serves as the sole library
server for our company, which operates a worldwide computer network of MS systems. Many of these remote systems are operated
by scientists with minimal expertise in mass spectral interpretation. When necessary, those scientists send their files via
the network for interpretation by corporate experts in MS. The experts then add spectra and associated structures to our corporate
Soft Ionization for Molecular Weight Determinations
The molecular weight of a component is one of the most important pieces of information obtained from MS analysis. CID spectra
obtained by LC–MS analyses that use "soft" ionization techniques, such as electrospray ionization (ESI) and atmospheric pressure
chemical ionization (APCI), normally yield ion species that indicate the molecular weights of components. In contrast, the
molecular ions of components often go unobserved in EI analyses. We use chemical ionization (CI) to determine the molecular
weights of those components in EI GC–MS analyses.
We use a wide variety of CI gases and gaseous mixtures in GC–MS analyses, including methane, isobutane, ammonia, ammonia-d3
(4), methylamine, and others. The choice of gas depends on the proton affinity of the unknown. We primarily use ammonia,
however, because most of our unknowns contain heteroatoms. Ammonia CI yields very good molecular weight information (proton
adducts, ammonium adducts, or both). Moreover, it does not leave carbon deposits that contaminate and ultimately hinder the
performance of the CI source. MS CI manifolds supplied by the manufacturers for many of our GC–MS instruments are incompatible
with ammonia gas, so we fit our instruments with custom manifolds (4). In addition to tolerating ammonia, the custom manifolds
provide easy in situ preparation of gaseous mixtures.
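The way a proton adduct and an ammonium adduct together confirm a molecular weight can be sketched in a few lines. The example below assumes singly charged ions and accurate-mass peak values; the peak list is hypothetical, and the function name and tolerance are our own choices for illustration. It looks for two peaks separated by the mass of NH3 (about 17.0265 u) and reports the neutral mass they imply.

```python
PROTON = 1.007276      # mass of H+ in u
AMMONIUM = 18.033826   # mass of NH4+ in u

def infer_mw(peaks, tol=0.01):
    """Return neutral masses supported by both an [M+H]+ and an [M+NH4]+ peak."""
    spacing = AMMONIUM - PROTON  # expected gap between the two adducts
    candidates = []
    for low in peaks:
        for high in peaks:
            if abs((high - low) - spacing) <= tol:
                # 'low' is taken as [M+H]+, so subtract the proton mass
                candidates.append(round(low - PROTON, 4))
    return candidates

# Hypothetical ammonia CI peak list (m/z values) for one component
peaks = [151.0754, 168.1019, 91.0542]
print(infer_mw(peaks))
```

With nominal-mass data the same logic applies with a spacing of 17 u and a correspondingly wider tolerance.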
Accurate Mass Data for Molecular Formula Determinations
The wide availability of time-of-flight (TOF), quadrupole TOF (Q-TOF), and orbital trap mass analyzers allows the routine acquisition
of high-resolution mass spectral data with low parts-per-million (ppm) mass accuracy in either LC–MS or GC–MS modes. In many
cases (5), even a mass accuracy of <1 ppm is inadequate to determine a unique molecular formula (MF). Therefore, mass spectrometry
vendors apply orthogonal filters such as isotopic abundance ratios and a variety of heuristic and chemistry rules (6) to limit
the number of candidate molecular formulas.
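As a rough illustration of why mass accuracy alone may not single out one formula, the sketch below enumerates C, H, N, O compositions whose monoisotopic mass falls within a chosen ppm window of a measured neutral mass. It is a simplified brute-force search, not any vendor's formula generator: the element set, the upper limits on N and O, and the single hydrogen-count rule (H ≤ 2C + N + 2) are assumptions made for this example, and the function names are hypothetical.

```python
# Monoisotopic masses in u
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

def ppm_error(measured, theoretical):
    """Mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

def formula_string(counts):
    """Format [("C", 9), ("H", 10), ...] as 'C9H10...' and skip zero counts."""
    return "".join(f"{el}{n if n > 1 else ''}" for el, n in counts if n > 0)

def candidate_formulas(mass, tol_ppm=5.0, max_n=6, max_o=10):
    """List (formula, theoretical mass, ppm error) for CHNO formulas near 'mass'."""
    hits = []
    for c in range(int(mass // MONO["C"]) + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                rest = mass - (c * MONO["C"] + n * MONO["N"] + o * MONO["O"])
                if rest < -1:
                    break  # adding more oxygen only overshoots further
                h = round(rest / MONO["H"])
                # Simple chemistry rule: hydrogen count cannot exceed saturation
                if h < 0 or h > 2 * c + n + 2:
                    continue
                theo = mass - rest + h * MONO["H"]
                if theo <= 0:
                    continue
                err = ppm_error(mass, theo)
                if abs(err) <= tol_ppm:
                    name = formula_string([("C", c), ("H", h), ("N", n), ("O", o)])
                    hits.append((name, round(theo, 5), round(err, 2)))
    return sorted(hits, key=lambda hit: abs(hit[2]))

for hit in candidate_formulas(150.0681, tol_ppm=5.0):
    print(hit)
```

Widening the tolerance or adding elements such as S, P, or halogens makes the candidate list grow quickly, which is where isotope-pattern filters and additional chemistry rules become useful.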