This month's "LC Troubleshooting" looks at two reader-submitted questions regarding method calibration.
One of the parts of being a column editor for LCGC that I enjoy is interacting with readers, primarily by e-mail. I try to answer each reader's question promptly. Some questions have a wider interest to the liquid chromatography (LC) community, so I occasionally pick a question or two to use as the centerpiece of my "LC Troubleshooting" columns. Reader questions also help me keep my finger on the pulse of reader interests so that I can pick topics that will be useful to a wide variety of reader needs. This month I've picked two topics related to method calibration that came in recent e-mail messages. If you would like to submit a question, contact me via my e-mail address listed at the end of this article.
What's the Matter with the Calibrators?

Question: I am running a method to verify that the product we produce contains 100±5% of the label claim of the active ingredient. The method has worked well for our product containing 50 mg of active ingredient. We are just introducing a lower potency product that contains 28 mg of the active ingredient, and I can't get the product to pass the specifications. I weighed out 28 mg of reference standard to check the method and this reference standard will not pass either.
Answer: As is my usual practice when reviewing data, I first examine it to see if I can establish any indicators of data quality. One easy way to do this is to look at the repeatability. At the bottom of Table I, I've shown the average, standard deviation, and percent relative standard deviation (%RSD) for the n = 6 replicate injection sets (as is usual for these discussions, I've rounded the numbers for display convenience). You can see that the 50-mg set comes in at 0.12% RSD and the 28-mg set at 0.23%. Both of these are reasonable and show that the performance of the LC system, especially the autosampler and data system, is adequately reproducible. In our laboratory, we typically see ≤0.3% RSD for six 10-µL injections, and these data meet that criterion.
So, what's going wrong here? I suspect that it is a problem with the calibration procedure. This method uses what we call single-point calibration, where a reference standard is made at one concentration and a response factor is determined based on the peak response (area) and the standard concentration. Then sample concentrations are calculated by dividing the sample peak area by the response factor. This is a simple and straightforward technique, but it assumes that the calibration plot goes through zero (x = 0, y = 0). The response factor is simply the slope of the calibration line, and to establish the calibration line, you need two points. One point is the response of the standard and the other is assumed to be zero — that is, zero response for zero concentration — a very logical assumption. However, this assumption must be demonstrated during method validation to justify single-point calibration.
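For readers who like to see the arithmetic, the response-factor calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names are my own, and the peak areas are hypothetical round numbers rather than values from Table I.

```python
def response_factor(std_area: float, std_amount: float) -> float:
    """Peak area per unit amount: the slope of a line drawn through 0,0."""
    if std_amount <= 0:
        raise ValueError("standard amount must be positive")
    return std_area / std_amount


def single_point_amount(sample_area: float, rf: float) -> float:
    """Sample amount from a single-point (response-factor) calibration."""
    return sample_area / rf


# Hypothetical illustration values (not taken from Table I).
rf = response_factor(std_area=17850.0, std_amount=50.0)
sample_amount = single_point_amount(sample_area=17493.0, rf=rf)
print(f"response factor = {rf:.1f} area units/mg")
print(f"sample amount   = {sample_amount:.2f} mg")
```

Note that nothing in this calculation checks whether the line really passes through zero; that assumption is built in the moment the area is divided by the amount.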
To illustrate the problem, I've assumed that all the weighings at the 50-mg and 28-mg levels are accurate as shown in Table I. By pooling all 16 injections for both concentrations, we can perform a linear regression on the data set. In this case, we get a formula for the calibration curve of y = 346x + 530, with r² = 0.9999. On the other hand, if we use only the 50-mg points and force the line through the origin, we get y = 357x, with r² = 1.0000. It is simple to determine which formula is appropriate for the data set, as was discussed in an earlier "LC Troubleshooting" column (1). In the regression data report for the full data set, there is a value called "standard error of y" (SEy), which can be thought of as the normal error, or uncertainty, around the value of y when x = 0. If the reported value of y at x = 0 is less than the standard error of y, the curve can be forced through zero, and in the present case, this would justify a single-point calibration. If the reported value is greater than the standard error of y, forcing the curve through 0,0 is not justified, and a multipoint calibration is required. In the present case, SEy = 34. In other words, if the calibration line crossed the y axis at 0 ± 34, it would be within 1 standard deviation of zero and not statistically different from zero, so the use of 0,0 as a calibration point would be reasonable. Here, though, the calibration curve y = 346x + 530 tells us that if x = 0, y = 530, which is more than 10 times larger than SEy, so a single-point calibration is not justified.
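The intercept test can be sketched as follows. I am assuming here that SEy is the residual standard deviation of the regression (the "standard error of the y estimate" that spreadsheet regression tools report); the function names are hypothetical, and the only numbers used are the intercept and SEy quoted above.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope*x + intercept.

    Returns (slope, intercept, se_y), where se_y is the residual standard
    deviation with n - 2 degrees of freedom (assumed meaning of SEy).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, math.sqrt(ss_res / (n - 2))


def can_force_through_zero(intercept, se_y):
    """Rule of thumb from the text: intercept lies within +/- SEy of zero."""
    return abs(intercept) <= se_y


# Intercept and SEy reported in the text for the pooled data set.
print(can_force_through_zero(intercept=530.0, se_y=34.0))  # False
```

In practice you would pass your own amounts and peak areas to `fit_line` and feed the resulting intercept and SEy to the check.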
To illustrate how much better the data appear when we do a multipoint calibration, I have selected just the second weighings of the 50.2-mg and 28-mg data and used the n = 2 injections of each (four total injections) to generate the calibration curve (y = 342x + 694, r² = 1.0000). I've used this calibration to back-calculate the concentration of each injection, as shown in the right-hand column of Table II. You can see that the 50-mg data points are about the same as for the response-factor approach, but the 28-mg points all fit within approximately 0.5% of the target values. This is definitely a better choice for calibration of the current method.
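Back-calculation with a multipoint curve simply inverts the regression line. A minimal sketch, using the slope and intercept quoted above and a hypothetical peak area (not a value from Table II):

```python
def back_calculate(area, slope, intercept):
    """Invert y = slope*x + intercept to recover the amount x from a peak area."""
    if slope == 0:
        raise ValueError("slope must be non-zero")
    return (area - intercept) / slope


SLOPE, INTERCEPT = 342.0, 694.0  # two-level calibration quoted in the text

# Hypothetical peak area chosen for illustration only.
area = 10300.0
amount = back_calculate(area, SLOPE, INTERCEPT)
print(f"{amount:.2f} mg, {100 * amount / 28:.1f}% of the 28-mg target")
```

The only difference from the response-factor calculation is that the intercept is subtracted before dividing by the slope.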
When we move to the 28-mg product, however, the story changes. I've drawn the solid curve in Figure 1 to represent the true calibration curve when both the 50- and 28-mg points are used to generate the plot (again, the difference in slopes is exaggerated for illustration). The 28-mg point is shown on this solid line as a solid dot. However, if this same response is plotted on the single-point curve (dashed line), the point is moved to the right (open circle) to fit the curve and gives a reported value that is higher than it should be. This is exactly what we see with the data of Tables I and II when the response-factor (single-point) approach is taken — the 28-mg values are larger than expected by a little more than 2% (the drawing of Figure 1 greatly exaggerates the difference). With this understanding of the problem, it is not at all surprising that the 28-mg points all fail with the single-point calibration at 50 mg. If another single-point calibration curve were made using 28-mg calibrators, those calibrators would be expected to pass system suitability, and it is likely that at least some of the product batches would also pass specifications for the same reason that the 50-mg ones did with a 50-mg calibration. That is, a single-point calibration can appear to work properly if the test values are close to the calibration point, even if the wrong calibration scheme is used.
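The size of this bias can be estimated directly from the two equations given earlier (y = 346x + 530 for the pooled data and y = 357x for the single-point line). The sketch below treats the pooled regression as the "true" response, which is my assumption for illustration, and uses the rounded coefficients from the text, so the results are approximate.

```python
def true_response(amount_mg, slope=346.0, intercept=530.0):
    """Peak area predicted by the pooled regression quoted in the text."""
    return slope * amount_mg + intercept


def single_point_bias_percent(amount_mg, rf=357.0):
    """Percent error when the area is converted with the 50-mg response factor."""
    reported = true_response(amount_mg) / rf
    return 100.0 * (reported - amount_mg) / amount_mg


for amount in (50.0, 28.0):
    print(f"{amount:.0f} mg: {single_point_bias_percent(amount):+.2f}%")
```

With these coefficients, the bias is about -0.1% at 50 mg and about +2.2% at 28 mg, in line with the "little more than 2%" seen in the tables.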
I don't know the history of the method or if the single-point calibration was ever validated properly. In any event, when a method is transferred, whether it is a pharmacopeial method or an internally developed one, it is always wise to make several concentrations of reference standards over the range of zero to the proposed single-point concentration (and likely to 25–50% higher than the single-point value). Make a calibration curve using all the data points and compare it to a single-point calibration. If they give the same results, which is easily checked by testing whether the calibration curve can be forced through zero, then a single-point calibration is justified.
The final part of the reader's question pertained to the requirement for making two equal weighings of the reference standard and comparing the response of these. Technically, if you are positive that your weighings are always correct, this practice is a waste of time. However, there is a normal amount of uncertainty (error) in the laboratory, and sometimes we just make a mistake. This is the reason for making duplicate weighings — to double-check that a mistake was not made. As a consumer, I am certainly more comfortable depending on a certain dose of a pharmaceutical product if I know the concentration has been double-checked. This is the same reason that duplicate preparations of sample often are called for in an analysis. In some methods, both of the standards are used for calibration. For example, one might have a sequence of STD1, STD2, SPL1a, SPL1a, SPL1b, SPL1b, STD1, STD2, where STD1 and STD2 are the two reference standards, and sample 1 (SPL1) is prepared in duplicate (SPL1a and SPL1b) and each preparation is injected twice. Perhaps all four bracketing standards are averaged to get the response factor for that set of samples. The design of the injection sequence, calibration method, and number of replicates will depend on the analytical method, laboratory policy, and regulatory requirements.
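As one possible illustration of the bracketing scheme just described, the sketch below averages the response factors of all four standard injections and applies the result to the duplicate sample preparations. All peak areas and weights are hypothetical, and the averaging rule is only one of the options a laboratory might choose. It still relies on a response factor, so it is appropriate only when single-point calibration has been justified.

```python
from statistics import mean


def bracketing_response_factor(standards):
    """Average response factor over all bracketing standard injections.

    `standards` is a list of (peak_area, amount_mg) pairs, one per injection.
    """
    return mean(area / amount for area, amount in standards)


# Hypothetical STD1 and STD2 injections before and after the samples.
bracket = [(17850.0, 50.0), (17930.0, 50.2), (17800.0, 50.0), (17890.0, 50.2)]
rf = bracketing_response_factor(bracket)

# Hypothetical areas: each sample preparation injected twice.
sample_areas = {"SPL1a": [17700.0, 17720.0], "SPL1b": [17690.0, 17710.0]}
for name, areas in sample_areas.items():
    print(f"{name}: {mean(areas) / rf:.2f} mg")
```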