How to Ensure Accurate Measurement of Dioxins and PCBs


The most accurate method for measuring dioxins and related compounds, such as polychlorinated biphenyls (PCBs), is isotope dilution–selected ion monitoring–mass spectrometry (ID-SIM-MS). To use the method correctly, however, analysts need to take various factors into account, such as proper spiking of the extraction and injection standards and establishing proper quality control approaches that are applicable to ID-SIM-MS.

Below, renowned dioxins expert Yves Tondeur of SGS-Analytical Perspectives in Wilmington, North Carolina, answers questions about dioxin analysis that were raised during a recent web seminar.

A recording of the web seminar is available for free on demand at: Dioxins & Dioxin-Like Compounds: An Overview and Insights Into the Analytical Methodology

You showed several different dioxin structures. What constitutes the essential “toxic” core structure of dioxins? Is it the aromatic rings, the chlorinated aromatic rings, or something else?

Tondeur: Since this is not really an analytical question, I am not sure that I am the correct person to answer your question. Nevertheless, I will give it a shot. Many researchers have examined possible structure–toxicity relationships. I am not a toxicologist, but I remember that in the late 1970s scientists at the National Institute of Environmental Health Sciences (NIEHS) looked at 2,3,7,8-tetrachlorodibenzodioxin (TCDD) as a “box” and attempted to see the same “box” within other structures (such as chlorinated biphenylenes) to estimate relative toxicity. Another structural feature of interest is the “planarity” of the molecule. Indeed, the toxic equivalency (TEQ) concept was extended to polychlorinated biphenyl (PCB) congeners with planar structures. I recommend readings that describe the mechanism of action for these molecules and their interaction with DNA materials.

What method detection limit is used to calculate toxic equivalency (TEQ)?

Tondeur: Under isotope-dilution gas chromatography–mass spectrometry (GC–MS) and selected ion monitoring (SIM) conditions, one is clearly better off using sample- and analyte-specific estimated detection limits (EDLs) as described in United States Environmental Protection Agency (US EPA) Methods 8290, 23, and 0023A. For the purpose at hand, the method detection limits (MDLs) in US EPA Method 1613 are absolutely worthless. MDLs are not representative of actual performance, and their existence is only justified by the desire to preserve long-established practices originating from older non–isotope-dilution methodologies, where the MDL has a legitimate place. Thus, if you want to know the truth, use the EDL. If you have to comply with a mandated requirement (such as a permit specifying the MDL), then use the MDL for reporting purposes. However, keep in mind that the MDL does not tell you anything about actual performance. Use the EDL for your own assessment, and to determine whether the data received meet your data quality objectives. Changing the permit is also an option to consider. Under a performance-based measurement system (PBMS), changes should be forthcoming in which requirements for demonstrating compliance focus more on accuracy and reliability than on conventional, archaic, and not-fit-for-purpose thinking.
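For context, a TEQ is the sum of each congener's concentration weighted by its toxic equivalency factor (TEF), which is why the detection limit matters: non-detected congeners must be substituted with some fraction of that limit before the sum is formed. The sketch below is illustrative only; the congener set is truncated, the function names are invented for this example, and the TEF values shown (WHO-style) should be verified against the scheme your program mandates.

```python
# Illustrative TEF values (verify against the mandated TEF scheme)
TEF = {
    "2,3,7,8-TCDD": 1.0,
    "1,2,3,7,8-PeCDD": 1.0,
    "2,3,7,8-TCDF": 0.1,
    "OCDD": 0.0003,
}

def teq(results, nd_policy="half"):
    """results: {congener: (conc_pg_per_g or None, detection_limit_pg_per_g)}.
    nd_policy: non-detect substitution -- 'zero' (lower bound),
    'half' (DL/2, middle bound), or 'dl' (upper bound)."""
    total = 0.0
    for congener, (conc, dl) in results.items():
        if conc is None:  # non-detect: substitute per policy
            conc = {"zero": 0.0, "half": dl / 2, "dl": dl}[nd_policy]
        total += conc * TEF[congener]
    return total

sample = {
    "2,3,7,8-TCDD": (0.5, 0.1),
    "1,2,3,7,8-PeCDD": (None, 0.2),   # not detected; limit = 0.2 pg/g
    "2,3,7,8-TCDF": (4.0, 0.1),
    "OCDD": (120.0, 0.5),
}

print(round(teq(sample, "zero"), 3))  # lower-bound TEQ
print(round(teq(sample, "dl"), 3))    # upper-bound TEQ
```

The gap between the lower- and upper-bound results is driven entirely by the detection limits substituted for non-detects, which is why a sample- and analyte-specific EDL gives a more honest TEQ range than a generic MDL.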

What is the approximate cost for the extraction and analysis you have described in this seminar?

Tondeur: Depending on the matrix, the level of detection, and other specific data quality objectives as well as reporting format, the price can range from $650 to slightly over $1000 per sample. Very often, data users do not appreciate (until it is too late) the costs associated with making decisions focused on short-term costs. Many experienced and savvy data users will tell you that for highly visible projects, short-term savings can easily translate into very expensive long-term effects. Therefore, it is important for data users to adopt a true PBMS orientation that entails assuming a greater responsibility for the decisions made. As data users, we must determine up front what constitute acceptable data, and stop relying on the methods to define what we need. The US EPA and the methods are not our friends when it comes to accurately demonstrating our state of compliance. Thus, not surprisingly, the cost depends on what kind of information you need.

I agree that isotope dilution is a powerful tool. I think part of the reason it hasn't taken off is that there is a lot to keep track of, especially when it comes to calibration, where you have to have the isotopic compound and the normal compounds analyzed together, calculating response factors, and so forth. And I am not sure how much added cost there is for purchasing the labeled standards.

Tondeur: I understand your concerns. However, as a 33-year practitioner of isotope dilution, I can reassure you that your concerns are not warranted. ID-MS has taken off. I am even expecting it to completely replace traditional methods based on electron-capture detection (ECD) and low-resolution MS when definitive data are required. How long this will take will depend on how long it takes data users to appreciate the method’s fitness for purpose.

Personally, I find ID-MS methodologies much easier to work with than non-ID-MS methods. The amount of quality control (QC) effort needed to establish a higher level of confidence in the data is significantly reduced for ID-MS relative to non-ID-MS assays. The latter will never (and cannot) achieve the same level of confidence no matter how much QC is added. And yes, it costs more to achieve an acceptable level of confidence in the data obtained by low-resolution methods. Those are the hidden costs of focusing too much on short-term savings discussed above. Efforts to calibrate non-ID-SIM-MS methodologies are much more involved and complex as a result of the additional QC work required to compensate for the deficiencies inherent to non-ID-SIM-MS approaches.

The cost of labeled standards is actually much lower than one might think. The original purchase may be costly. However, with ID-SIM-MS methods we spike much lower quantities per sample than with non-ID-SIM-MS methods, so the cost per sample may be only a few dollars (such as $3 per sample before adding the costs of preparation and QC, which can easily double the cost to around $7 per sample). Figures like these make me wonder why we are not converting all of our methodologies to ID-MS wherever labeled standards are available. The value one derives from ID-MS compared to the cost of standards is out of this world!

How can a laboratory best eliminate residual materials that cause false positives?

Tondeur: Fundamentally, the laboratory should have personnel capable of applying the principles of the scientific method. There are no “cookbook” recipes capable of handling all of the challenges analytical chemists face when analyzing real samples. For this reason, we, as data users, must stop associating performance with what the laboratory is capable of achieving on pure water or super clean sodium sulfate. The latter might be useful information for the laboratory, but clearly not for a data user spending $1000 on a sample! Data users should expect more, and we cannot accomplish our goals for continuous improvement if we remain stuck in the past while the future is shaped by changes that occur daily. Under a PBMS approach, such as the one my firm has been developing for ID-MS methods, the analyst must have a legitimate reference point from which decisions are acted upon without compromising the integrity of the analysis. In addition, the analyst should rely on procedures that have embedded performance — whereby accurate feedback is available to assess the validity of the chosen actions.

Are incremental improvements in sample preparation methods and high-resolution GC–MS allowing significantly greater sensitivity in traditional methods (such as EPA Method 1613) than when the methods were originally developed?

Tondeur: Absolutely! Current innovations actually go beyond improvements in the sample preparation methods. The analogy with the eternal triangle we presented during the web seminar helps show that regulations and changes in regulations create tensions between data users and the tools available to them. Thus, under a highly dynamic system, we cannot expect the tools (methods) to stand still — to be set in stone with no hope of enhancing quality because of an unfounded fear of upsetting the comparison with older (and bad) data — while the data user’s need for reliable data keeps increasing over time because of more stringent regulations. The assays have not reached the status of a commodity product. Achieving the data quality objectives that data users require in the future still requires “intervention” using human intelligence. By the way, sensitivity is not the only performance indicator to consider. The improvements you mention should be dictated by the data user’s data quality objectives at any particular time for any particular project.

Is the amount of contribution of diphenylethers to furan concentrations predictable? Could the contribution be corrected for if the diphenylethers were quantified in the sample?

Tondeur: Outside a theoretical model, the answer is yes, but only if all of the possible congeners are available as quantitative standards, and if we can determine their behavior on a “when and where it is needed” basis. Each structure plays a role in this “prediction” endeavor. Without any doubt, there will be differences between fragmentation rates based on congener structure. Also worth mentioning is that the reaction rates depend on the actual conditions at the time of analysis (such as the amount of internal energy imparted to the formed ions). The latter is a prerequisite for applying the suggested “correction.” Because none of these conditions can be met today, the only currently acceptable solution is to remove these interferences during sample preparation or to use more-selective detection systems.

Criteria for low-resolution (three ion abundances) qualitative detection have been used for years in the detection of drugs in urine using isotope dilution. Such qualitative identifications have been found to be forensically defensible in courts of law. With cleanup procedures constantly improving, what is the rationale for high-resolution mass spectrometry (HRMS) — other than the profit margin — especially when a similar number of qualitative ratio ions are used (three)?

Tondeur: The detection of drugs cannot easily be compared to polychlorinated dioxins and furans (PCDD/F) assays. For one thing, in a drug detection assay, one looks at one compound compared to 210 compounds in a PCDD/F assay. When the drug metabolizes, it leads to substances that are substantially different (in the mass-to-charge ratio, m/z, and chemical properties) from the parent compound. Organochlorine compounds can lead to many different types of compounds with similar chemical behaviors and mass-to-charge ratios, each representing large families of compounds that contain congeners with anywhere from 1 to 10 chlorines. Moreover, using three ions to “positively” identify a drug does not preclude false positives or false negatives. The selection of three ions is actually arbitrary. In fact, it is driven by the need to increase confidence (a coping mechanism to offset the weaknesses in the assay) in the determination given the inferior selectivity of the low-resolution assay. If the drug compound’s three selected ions contained both molecular and fragment ions, a deeper understanding of MS and ion kinetics indicates that the fragment ions are not ideal for quantitative determinations unless true isotopically labeled standards are used. Note that the ionization technique can also play a role. Depending on the specific example, a reliance on three ions for positive identification can be scientifically challenged, even in court. In the case of PCDD/F assays, even the use of three or four ions (and the associated ratios) is not sufficient to distinguish commonly encountered interferences such as PCBs, chlorinated xanthenes, chlorinated benzylphenylethers, dichlorodiphenyldichloroethylene, chlorinated methoxybiphenyls, and possibly many others that have not been identified.

Then, of course, we are not limited to qualitative determinations (“qualitative identifications”). When it comes to qualitative and quantitative measurements, the power of ID-SIM-HRMS methodologies is indisputable. This fact is especially true once one understands that the ultimate product’s reliability is the result of a well-coordinated combination of highly selective steps (specific extraction-fractionation, HRGC, and HRMS). HRMS by itself does not define the power of the assay; it is the interplay between the extraction–fractionation–GC separation and the MS resolving power, as well as the use of isotopically labeled standards.

It would be interesting to hear from data users who have relied on low-resolution PCDD/F assays such as Method 8280. I would like to know how they feel about the technology after being burned and spending thousands of dollars to defend themselves. Such users may be able to provide valuable insights regarding “profit margin.” Having said this, low-resolution methods are useful if the purpose of the test is to screen data.

Another important point is the following: As a result of its higher selectivity, HRMS offers lower detection limits than LRMS. Thus, in clinical studies where lower detection limits are required, the added benefit offered by HRMS is smaller sample sizes (less-intrusive biological sample collection). The application of HRMS to athlete doping programs provides the means to detect the use of illicit performance-enhancing drugs further back in time from the point of specimen collection. Thus, the rationale for using HRMS is dictated by the data user’s needs and the ability to make effective decisions based on data with the best uncertainty at the lowest level required. The margins achieved by the laboratories are needed to keep such sophisticated instrumentation up to par (repairs, maintenance, upgrades, and skilled personnel).
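The ion-abundance-ratio check discussed above can be illustrated with a simple binomial model of the chlorine isotope cluster. This is a sketch under simplifying assumptions: it ignores ¹³C contributions, and the ±15% acceptance window and function names are illustrative (a tolerance of similar magnitude appears in the SIM-based EPA dioxin methods, but check the method you are following).

```python
from math import comb

P35, P37 = 0.7577, 0.2423  # approximate natural chlorine isotope abundances

def cl_ratio(n_cl, k1=0, k2=1):
    """Theoretical abundance ratio of the isotopologue with k1 37Cl atoms
    to the one with k2 37Cl atoms, for a molecule with n_cl chlorines
    (binomial model; 13C contributions are neglected)."""
    a = lambda k: comb(n_cl, k) * P35 ** (n_cl - k) * P37 ** k
    return a(k1) / a(k2)

def ratio_ok(measured, n_cl, tol=0.15):
    """Accept the measured M/M+2 ratio if it falls within a +/-15%
    window around the theoretical value (illustrative QC criterion)."""
    theo = cl_ratio(n_cl)
    return abs(measured - theo) / theo <= tol

print(round(cl_ratio(4), 2))  # M/M+2 for a tetrachloro congener, ~0.78
print(ratio_ok(0.85, 4))      # measured ratio inside the window?
```

The point of the answer above survives this arithmetic: many chlorinated interferences share essentially the same cluster ratios, so passing such a ratio check is necessary but not sufficient for a positive identification.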

Do you see any problem using single-point calibration, once the relative response factor (RRF) is maintained over the working range?

Tondeur: Good question. Once an initial calibration has been established, I do not see a problem in using a single-point calibration. However, as I briefly alluded to in my answer to another question at the end of the web seminar, initial and routine calibration runs fulfill more than one function. The conventional view is limited to establishing the linearity of the detector. The aspect that we (practitioners and methods) unfortunately do not recognize is the critical role these calibrations play in isotope-dilution methods. In that respect, they act as probes into the veracity of the standards used during sample preparation and the detection of errors (providing accurate feedback on performance), and they provide the basis for compensating for systematic errors during the fortification and analysis steps.

With regard to the linearity of the detector, I do not believe that establishing an initial calibration (ICAL) that meets the method's criteria for acceptance guarantees that linearity is maintained at the moment a particular sample is analyzed. Under a PBMS orientation, analysts must find ways to assess the uncertainty associated with the determination of the target analytes on a per-sample and per-analyte basis. That uncertainty should be an integral part of the combined standard uncertainty or the expanded uncertainty supplied with the analytical results. Therefore, we should abandon “data qualifiers” (vague, ambiguous annotations that amount to the laboratory transferring its responsibility to the data user, who must then find a way to quantify the inference the “flag” represents before making a decision) and instead be more specific and quantitative in our description of the error associated with the detector's response for a particular target analyte, in a particular sample, at that particular moment of the analysis. These ideas are not just wishes: my firm has been very active in identifying, evaluating, and validating solutions, which are now part of a comprehensive program known internally as our performance-based criteria analysis (PBCA).
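As a point of reference, the relative response factor and the isotope-dilution quantitation it feeds can be sketched as follows. The formulas follow the general form used in ID-SIM-MS methods (native area times labeled quantity, over labeled area times native quantity); the function names and the numbers are invented for illustration.

```python
def rrf(area_native, area_labeled, qty_native, qty_labeled):
    """Relative response factor from one calibration run:
    RRF = (A_native * Q_labeled) / (A_labeled * Q_native)."""
    return (area_native * qty_labeled) / (area_labeled * qty_native)

def id_quant(area_native, area_labeled, ng_labeled_spiked, rrf_value, sample_g):
    """Isotope-dilution quantitation: because the labeled analog is spiked
    before extraction, recovery losses cancel in the area ratio."""
    ng_native = (area_native * ng_labeled_spiked) / (area_labeled * rrf_value)
    return ng_native / sample_g  # concentration in ng per gram

# Single-point calibration standard: 0.5 ng native, 0.5 ng labeled
r = rrf(120000, 100000, 0.5, 0.5)
# Sample: 10 g aliquot spiked with 1.0 ng of labeled standard
print(round(id_quant(60000, 80000, 1.0, r, 10.0), 4))
```

A single-point check works here precisely because the quantity reported rides on the native-to-labeled area ratio rather than on the absolute detector response, which is the property the answer above says the calibration runs are really probing.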
