John W. Dolan

This is the fifth and final installment in a series of "LC Troubleshooting" columns about various aspects of calibration of
liquid chromatography (LC) methods and problems related to calibration. We have looked at questions related to whether or
not the calibration curve should pass through the origin (1), how to determine method limits (2), the use of %error plots
to highlight potential problems (3), and which calibration model to use (4). This month will focus on a more specialized topic,
the use of curve weighting. This also is an appropriate topic, because it addresses questions submitted by several readers
asking why I didn't weight the curves in the discussion of whether or not the calibration curve passed through the origin
(1).
Why Weighting?
When a least-squares linear regression is used to fit experimental data to a linear calibration curve, equal emphasis is given
to the variability of data points throughout the curve. However, because the absolute variation (as opposed to %error) is
larger for higher concentrations, the data at the high end of the calibration curve tend to dominate the calculation of the
linear regression. This often results in excessive error at the bottom of the curve. One way to compensate for this error
and to give a better fit of the experimental data to the calibration curve is to weight the data inversely with the concentration,
a process called curve weighting. The weighting of calibration curves often will lower the overall error of the method and,
thus, improve the quality of the analytical results. Most LC calibration curves that span several orders of magnitude show
increasing absolute error with increasing concentration, whereas the relative error (percent relative standard deviation, %RSD) is
reasonably constant. Curve weighting should be evaluated whenever the relative error is fairly constant throughout the calibration
curve (5).
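The idea can be written compactly as a weighted least-squares objective. This is only a sketch in assumed notation (the symbols x_i, y_i, a, b, w_i, and n are introduced here for illustration and do not come from the column): each squared residual is multiplied by a weight that shrinks as concentration grows, so the high-concentration points no longer dominate the fit.

```latex
% Assumed notation: x_i = calibrator concentration, y_i = measured response,
% a = intercept, b = slope, w_i = weight given to calibrator i.
\min_{a,\,b} \; \sum_{i=1}^{N} w_i \left[ y_i - (a + b\,x_i) \right]^2,
\qquad w_i = \frac{1}{x_i^{\,n}}, \quad n \ge 0
```

With n = 0 every weight equals 1 and the expression reduces to ordinary (unweighted) least squares; larger n places progressively more emphasis on the low end of the curve.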
There also are regulatory reasons why curve weighting should be considered. The FDA's guidelines for validation of bioanalytical
methods (6) contain this statement: "Standard curve fitting is determined by applying the simplest model that adequately
describes the concentration–response relationship using appropriate weighting and statistical tests for goodness of fit."
How are the key words "simplest," "adequately," and "appropriate" determined? The statement seems to allow many interpretations;
however, one key point is that weighting should be evaluated, and evaluating the impact of curve weighting allows
statistical tests to be applied.
Evaluation of Data
A thorough evaluation of the appropriateness of curve weighting and selection of the weighting factor is best done at the
end of method development or during method validation when a sufficiently large data set is available to calculate standard
deviations at each calibrator concentration. However, because most LC calibration curves exhibit similar characteristics from
the standpoint of weighting, we can use a few shortcuts for the present discussion.
The first step is to determine whether the standard deviation at the lower limit of the curve is significantly different from that
at the upper end. This determination is based upon the F-test, which compares the variances (squared standard deviations) of
two populations. However, this test is largely unnecessary for LC calibration
curves that span two or more orders of magnitude. In almost every case, the standard deviation (or absolute error) will increase
with concentration, causing the null hypothesis of the F-test (equal variances) to be rejected. An alternative way to come to
the same conclusion is to observe that the %RSD is fairly constant throughout the
curve. We know the latter to be the common observation. For example, the FDA's guidelines for validation of bioanalytical
methods (drugs in biological matrices, such as plasma) suggest that the method precision and accuracy be within ±15% at all concentrations
above the lower limit of quantification (LLOQ) and within ±20% at the LLOQ. It is clear that the FDA expects the relative error
to be approximately constant throughout the calibration curve; this is the nature of LC calibration curves. So after testing a couple of data
sets to satisfy yourself, you probably can skip the F-test.
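For readers who want to try the comparison on their own replicates, the following is a minimal Python sketch, not the procedure from references 5 or 7. The function names and the replicate values are invented for illustration, and the critical value of 6.39 is the tabulated one-tailed 5% F value for 4 and 4 degrees of freedom (five replicates at each level); a different number of replicates needs a different table entry.

```python
from statistics import mean, stdev, variance

def percent_rsd(values):
    """Percent relative standard deviation of a set of replicate responses."""
    return 100.0 * stdev(values) / mean(values)

def f_ratio(low_reps, high_reps):
    """F statistic: sample variance at the high end divided by that at the low end."""
    return variance(high_reps) / variance(low_reps)

# Hypothetical replicate responses (invented for illustration, not real data).
low = [0.95, 1.02, 1.06, 0.98, 1.01]   # lowest calibrator
high = [962, 1015, 1048, 991, 1003]    # highest calibrator

# Assumption: one-tailed 5% critical value for (4, 4) degrees of freedom.
F_CRITICAL = 6.39

f_value = f_ratio(low, high)
print(f"F = {f_value:.3g}, variances differ: {f_value > F_CRITICAL}")
print(f"%RSD low = {percent_rsd(low):.1f}, %RSD high = {percent_rsd(high):.1f}")
```

When the F value far exceeds the critical value while the two %RSD values are similar, the data show the pattern described above: absolute error grows with concentration and relative error stays roughly constant.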
The next step is to determine the proper weighting factor for the data. The calculations are based upon a fairly complex equation
that can be found in references 5 or 7. For my own satisfaction, I programmed this into an Excel spreadsheet, but by the time
I entered all the formulas and debugged them so that I could get the same answers as the textbook, I had spent an entire day.
A much easier approach is to use the curve weighting options that are built into most data analysis software packages. These
allow you to choose a weighting factor; 1/x^{0} (no weighting), 1/x^{0.5}, 1/x, and 1/x^{2} are the most useful weighting
calculations. Each weighting factor will produce a weighted least squares calibration curve,
which can be used to calculate the %error (also called relative error) for each experimental value.
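If your data system does not offer these options, a weighted straight-line fit is short enough to sketch in plain Python. This is an illustrative sketch using the standard closed-form weighted least-squares expressions, not the equation from references 5 or 7. The function names and the data are invented, and it assumes that %error is computed on the back-calculated concentration relative to the nominal concentration.

```python
def weighted_linear_fit(x, y, power):
    """Fit y = intercept + slope * x by least squares with weights 1 / x**power.

    power = 0 gives the ordinary unweighted fit; concentrations must be > 0
    whenever power > 0.
    """
    w = [1.0 / xi ** power for xi in x]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return slope, intercept

def percent_errors(x, y, slope, intercept):
    """%error of each back-calculated concentration versus its nominal value."""
    return [100.0 * ((yi - intercept) / slope - xi) / xi for xi, yi in zip(x, y)]

# Hypothetical calibrators (invented for illustration, not real data).
conc = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]
area = [10.4, 19.5, 51.8, 97.0, 206.0, 489.0, 1030.0, 1950.0, 5150.0, 9800.0]

for power in (0, 0.5, 1, 2):
    slope, intercept = weighted_linear_fit(conc, area, power)
    errors = percent_errors(conc, area, slope, intercept)
    print(f"1/x^{power}: %error at lowest calibrator = {errors[0]:.1f}")
```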
You can compare the effectiveness of the various weighting schemes at reducing method error by calculating the sum of the
absolute values of the relative error (ΣRE). The weighting factor that gives the smallest ΣRE is the best choice. You will
find that the ΣRE will drop quickly as weighting is increased, then stabilize. I suggest using the least amount of weighting
that minimizes the error, as is shown in the examples in the following section. This should satisfy the FDA's "simplest model
that adequately describes the concentration–response relationship using appropriate weighting" (6).
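The selection rule described above can be sketched in a few lines of Python. This is a hedged illustration, not a definitive implementation: the function names and data are invented, %error is assumed to be computed on back-calculated concentrations, and the 10% tolerance used to decide when ΣRE has "stabilized" is an arbitrary assumption that you should set to suit your own method.

```python
def fit(x, y, power):
    """Weighted least-squares line with weights 1 / x**power (x > 0 if power > 0)."""
    w = [1.0 / xi ** power for xi in x]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    return slope, (swy - slope * swx) / sw

def sum_abs_relative_error(x, y, power):
    """Sum of |%error| of the back-calculated concentrations for one weighting."""
    slope, intercept = fit(x, y, power)
    return sum(abs(100.0 * ((yi - intercept) / slope - xi) / xi)
               for xi, yi in zip(x, y))

def least_adequate_weighting(x, y, powers=(0, 0.5, 1, 2), tolerance=0.10):
    """Return the smallest power whose sum is within `tolerance` of the minimum.

    The tolerance is an assumption made for this sketch, not a published rule.
    """
    sre = {p: sum_abs_relative_error(x, y, p) for p in powers}
    best = min(sre.values())
    for p in sorted(powers):
        if sre[p] <= best * (1 + tolerance):
            return p, sre

# Hypothetical calibrators (invented for illustration, not real data).
conc = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]
area = [10.4, 19.5, 51.8, 97.0, 206.0, 489.0, 1030.0, 1950.0, 5150.0, 9800.0]

chosen, table = least_adequate_weighting(conc, area)
for power, total in table.items():
    print(f"1/x^{power}: sum of |%error| = {total:.1f}")
print(f"least weighting within tolerance of the minimum: 1/x^{chosen}")
```

Setting the tolerance to 0 reduces this to simply picking the weighting with the smallest ΣRE.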