Identifying "Known Unknowns" in Commercial Products by Mass Spectrometry

Feb 01, 2013

The identification of nontargeted species in environmental and commercial samples by mass spectrometry can be very difficult. In this column installment, guest authors from Eastman Chemical Company describe their systematic approach for the identification of nontargeted species using nominal and accurate mass data, searching both mass spectral and "spectra-less" databases.

Organic mass spectrometry (MS) has witnessed an extraordinary increase in capabilities this past decade because of major advances in ionization sources, analyzers, detectors, chromatography, and computer technology. Many of these technological advances focus on biological applications, a fact plainly evident to attendees of the American Society for Mass Spectrometry's (ASMS) annual conferences. Yet the significance of this increasingly sophisticated technology has not been lost on industrial, environmental, and forensic mass spectrometrists, whose work involves characterizing commercial chemical products.

Eastman Chemical Company is a global manufacturer of polymers, fibers, coatings, additives, solvents, adhesives, and many other products. Gas chromatography–mass spectrometry (GC–MS) and liquid chromatography–mass spectrometry (LC–MS) have proven to be essential for characterizing our company's products and those of other companies. With reasonable effort, we routinely and reliably obtain mass spectral data from these highly sensitive and yet robust techniques. However, unless the data can be converted into structural information, it is not useful as a knowledge base to resolve the analytical problem at hand.

Figure 1: Simplified flowchart for identifying "known unknowns." MF = molecular formula and MW = molecular weight.
In the last 34 years, we developed and refined a systematic process (1,2) for the identification of nontargeted species using GC–MS and LC–MS analyses. We refer to these species as "known unknowns": species that are known in the chemical literature or MS reference databases but unknown to the investigator. The essence of the process is finding candidate structures by searching mass spectral databases, Chemical Abstracts Service (CAS) databases, and ChemSpider databases. Figure 1 presents a simplified flowchart of the overall process; the subsequent sections discuss individual steps and illustrate the identification of known unknowns with three examples.

Computer-Searchable Mass Spectral Databases

Table I: Spectra with associated structures searched with NIST search software
The first step in the process is computer searching of spectra against mass spectral databases. This approach (3) is powerful and efficient for the identification of unknowns, typically requiring 3–5 s for each component in a mixture. Electron ionization (EI) databases are used for identifying compounds in GC–MS analyses, and collision-induced dissociation (CID) databases are used for LC–MS analyses. The databases are purchased from commercial sources or are created from compounds characterized at our company (see Table I).

The results of the EI mass spectral searches are normally more successful than CID searches for two reasons. First, the number of entries in EI databases for GC–MS is approximately 10 times larger than that for CID databases for LC–MS. Second, 70-eV EI spectra are much more reproducible than CID spectra, which can vary significantly depending on instrument design and user-specified variables (3).
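To make the idea of computer searching against a spectral database concrete, the sketch below ranks library entries against a query spectrum by cosine similarity. This is a generic similarity measure chosen for illustration, not the algorithm used by any particular search program. The function names, compound names, and peak intensities are hypothetical placeholders, not reference spectra.

```python
import math

def cosine_match(query, reference):
    """Cosine similarity (0 to 1) between two nominal-mass spectra.

    Each spectrum is a dict mapping integer m/z to intensity.
    """
    shared = set(query) & set(reference)
    dot = sum(query[mz] * reference[mz] for mz in shared)
    norm_q = math.sqrt(sum(i * i for i in query.values()))
    norm_r = math.sqrt(sum(i * i for i in reference.values()))
    if norm_q == 0 or norm_r == 0:
        return 0.0
    return dot / (norm_q * norm_r)

def search_library(query, library, top_n=3):
    """Return the top_n (score, name) pairs, best match first."""
    hits = [(cosine_match(query, spec), name) for name, spec in library.items()]
    hits.sort(reverse=True)
    return hits[:top_n]

# Hypothetical library: invented peak lists, for illustration only.
library = {
    "compound_A": {43: 100, 58: 30, 71: 10},
    "compound_B": {43: 20, 57: 100, 85: 40},
    "compound_C": {77: 100, 105: 80, 51: 25},
}

query = {43: 100, 58: 30, 71: 10}
hits = search_library(query, library)
for score, name in hits:
    print(f"{name}: {score:.3f}")
```

A score near 1 means the two peak lists have nearly the same relative intensities at the same masses. A score of 0 means they share no peaks. Practical search programs add weighting and filtering steps that this sketch leaves out.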

NIST MS Search Software as Eastman Corporate Standard

We adopted the National Institute of Standards and Technology (NIST) MS Search program as our corporate standard for searching mass spectral databases for the following reasons:

  • Searches both EI and CID databases
  • Performs fast EI searches with essentially no false negatives (3)
  • Searches libraries by spectra, structure, and other data fields
  • Merges search results for multiple databases
  • Creates users' libraries with structures and other data fields
  • Merges, archives, and distributes users' libraries nightly
  • Imports spectra and structures from all major commercial software programs
  • Correlates fragments to substructures for EI and CID spectra via MS Interpreter utility

The automated process of merging, archiving, and distributing our corporate EI and CID databases occurs nightly by means of batch files and a simple event-scheduler utility. A standard GC–MS laboratory computer on the network serves as the sole library server for our company, which operates a worldwide computer network of MS systems. Many of these remote systems are operated by scientists with minimal expertise in mass spectral interpretation. When necessary, those scientists send their files via the network for interpretation by corporate experts in MS. The experts then add spectra and associated structures to our corporate database.

Soft Ionization for Molecular Weight Determinations

The molecular weight of a component is one of the most important pieces of information obtained from MS analysis. CID spectra obtained by LC–MS analyses that use "soft" ionization techniques, such as electrospray ionization (ESI) and atmospheric pressure chemical ionization (APCI), normally yield ion species that indicate the molecular weights of components. In contrast, the molecular ions of components often go unobserved in EI analyses. We use chemical ionization (CI) to determine the molecular weights of those components in EI GC–MS analyses.

We use a wide variety of CI gases and gaseous mixtures in GC–MS analyses including methane, isobutane, ammonia, ammonia-d3 (4), methylamine, and others. The choice of gas depends on the proton affinity of the unknown. We primarily use ammonia, however, because most of our unknowns contain heteroatoms. Ammonia CI yields very good molecular weight information (proton adducts, ammonium adducts, or both). Moreover, it does not leave carbon deposits that contaminate and ultimately hinder the performance of the CI source. MS CI manifolds supplied by the manufacturers for many of our GC–MS instruments are incompatible with ammonia gas, so we fit our instruments with custom manifolds (4). In addition to tolerating ammonia, the custom manifolds provide easy in situ preparation of gaseous mixtures.
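The arithmetic behind reading a molecular weight from an ammonia CI spectrum can be sketched as follows. When both a proton adduct and an ammonium adduct are present, the two peaks are separated by the mass of NH3 (about 17.03 Da), and subtracting the proton mass from the lower peak gives the neutral mass. The function name, the tolerance, and the peak list are assumptions made for illustration. The adduct masses are standard monoisotopic values rounded to five decimals.

```python
PROTON = 1.00728     # mass added to M for [M+H]+
AMMONIUM = 18.03383  # mass added to M for [M+NH4]+

def mw_from_ammonia_ci(peaks, tol=0.5):
    """Return candidate neutral masses M from an ammonia CI peak list.

    A candidate is reported whenever two peaks are spaced like
    [M+H]+ and [M+NH4]+, within tol (in Da).
    """
    spacing = AMMONIUM - PROTON  # about 17.03 Da, the mass of NH3
    ordered = sorted(peaks)
    candidates = []
    for i, low in enumerate(ordered):
        for high in ordered[i + 1:]:
            if abs((high - low) - spacing) <= tol:
                candidates.append(round(low - PROTON, 4))
    return candidates

# Hypothetical peak list (m/z values), for illustration only.
peaks = [91.0, 151.1, 168.1]
candidates = mw_from_ammonia_ci(peaks)
print(candidates)
```

The wide default tolerance suits unit-resolution data. With accurate mass data it could be tightened considerably. If only one adduct appears, the spacing test cannot be applied and the assignment needs other evidence.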

Accurate Mass Data for Molecular Formula Determinations

The wide availability of time-of-flight (TOF), quadrupole TOF (Q-TOF), and orbital trap mass analyzers allows the routine acquisition of high resolution mass spectral data with low parts-per-million (ppm) mass accuracy in either LC–MS or GC–MS modes. In many cases (5), even a mass accuracy of <1 ppm is inadequate to determine a unique molecular formula (MF). Therefore, mass spectrometry vendors apply orthogonal filters such as isotope abundance ratios and a variety of heuristic and chemistry rules (6) to limit the number of candidate molecular formulas.
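The relationship between mass accuracy and the number of candidate formulas can be explored with a brute-force sketch such as the one below. It computes the ppm error and enumerates C, H, N, O compositions whose monoisotopic mass falls within a chosen tolerance of a measured neutral mass. The function names, element limits, and example mass are assumptions for illustration. The sketch deliberately omits the isotope-pattern and heuristic filters mentioned above, so it can return chemically implausible formulas.

```python
# Monoisotopic masses of the most abundant isotopes (Da).
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

def candidate_formulas(neutral_mass, tol_ppm, max_c=20, max_n=6, max_o=8):
    """List (formula, mass) pairs within tol_ppm of neutral_mass."""
    hits = []
    for c in range(max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MASS["C"] + n * MASS["N"] + o * MASS["O"]
                rest = neutral_mass - heavy
                if rest < 0:
                    break  # adding more oxygen only overshoots further
                # Fill the remaining mass with the nearest hydrogen count.
                h = round(rest / MASS["H"])
                mass = heavy + h * MASS["H"]
                if mass > 0 and abs(ppm_error(neutral_mass, mass)) <= tol_ppm:
                    hits.append((f"C{c}H{h}N{n}O{o}", mass))
    return hits

# Example: a measured neutral mass close to that of C8H10N4O2.
measured = 194.0804
for tol in (1, 5, 20):
    found = candidate_formulas(measured, tol)
    print(f"{tol} ppm: {len(found)} candidate(s)")
```

Widening the tolerance or adding elements such as S, P, or halogens increases the candidate count, which is why additional filters are needed even when the mass accuracy is very good.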