Integration Errors in Chromatographic Analysis, Part II: Large Peak Size Ratios

June 1, 2006
Merlin K.L. Bicking

ACCTA, Inc., St. Paul, Minnesota.

LCGC North America

Volume 24, Issue 6
Page Number: 604–616

In the second of a two-part series, Merlin K.L. Bicking extends his earlier work on integration errors in peaks of approximately equal size (small peak ratios) to peaks with large size ratios.

This study continues previous work that was concerned with integration errors in peaks with approximately equal sizes (small peak ratios). Chromatographic situations are created here for varying peak resolution and relative peak size. In this case, the peak size of the smaller peak ranges from 5% to less than 0.5% of the larger peak, and resolution is varied from 4.0 to 1.0. Such situations arise in trace analysis and in the determination of impurities in pharmaceutical active ingredients. All chromatograms have been integrated, using both area and height, by four baseline methods — drop, valley, exponential skim, and Gaussian skim. Integration errors are calculated using reference calibration injections. Not all combinations of relative peak size and resolution produced separate peaks. When two peaks were present, the errors for the large peak were negligible. The drop method produced large positive integration errors for a small second peak, but was accurate when the small peak was eluted first. Valley integration generally resulted in a negative peak error. The exponential skim method was accurate at resolution of 2.0 for all situations, but not at lower resolutions, where negative errors were observed. The Gaussian skim procedure was accurate at resolution equal to 1.5 only when using height. Errors were positive for greater resolution and negative when the resolution decreased. There are some situations in which none of the methods produced an accurate estimate of small peak size. As in the first study, height measurements produced less error than area measurements.

Integration of chromatographic peaks (determination of height, area, and retention time) is the first and most important step during data analysis in chromatography-based analytical methods. Peak information is used for all subsequent calculations, such as calibration or analysis of unknowns. Clearly, any error in measurement of peak size will produce a subsequent error in the reported result.

In part I of this article series (1), integration errors were evaluated when the peaks were of similar size. That is, the smaller peak was at least 5% of the larger peak. The results demonstrated that the drop method produced the least error in all situations. The valley method consistently produced negative errors for both peaks, and the skim method generated a significant negative error for the shoulder peak. Peak height also was shown to be more accurate than peak area. As the relative peak size increased (one peak became smaller), resolution at or below 1.0 generated unacceptable errors, and resolution greater than 1.5 was necessary to minimize integration errors.

In the present study, this error investigation is expanded to situations in which the smaller peak is significantly different in size from the larger peak. Specifically, small peak size ratios from about 5% to less than 0.5% of the large peak are investigated at resolution values from 4.0 to 1.0. Such peak size ratios commonly occur in trace analysis, in which a solvent or major matrix component is eluted near the analyte of interest, which is present at much smaller concentrations. Similar situations occur in the determination of impurities in pharmaceutical formulations, in which regulatory requirements specify that all compounds present above 0.1% levels must be quantified.

Unfortunately, there is little available guidance on how to properly integrate small peaks when they are near a much larger peak. One previous report discussed integration errors for relative peak sizes in this range (2), noting that the drop–height method was appropriate when the small peak is eluted first. When the small peak was eluted after the large peak, the best integration method depended upon the relative widths of the peaks. Drop–height was best for peaks of equal width, while skim–height was preferred when the peak widths were not similar. Not all integration options were considered in that report, so its conclusions could not be generalized completely.

Conventional wisdom and current chromatographic practice suggest that the minimum resolution between two peaks must be at least 1.5 to ensure complete separation. Some laboratories have even larger minimum resolution requirements. This is a prudent practice, because the effective resolution is also a function of relative peak size, in addition to retention time differences, peak width, and tailing (3). The results presented here provide some initial guidelines on the minimum separation needed and the best integration baseline method to use in each situation.

As noted in the previous report, all resolution situations described here were created using a liquid chromatography (LC) system. The peak shape observed is typical for any well-behaved chromatographic system, so application to gas chromatography (GC) separations should be valid. Although only one data system was used for this study, the author believes that all modern chromatography data systems process data using similar procedures, and only minor differences would be produced by other software packages. The general conclusions should still be valid.

Experimental

Experimental details were described earlier (1). A brief summary is provided here.

Equipment: All chromatographic experiments were performed using an Agilent 1100 HPLC system controlled by ChemStation software Version B.01.03 (Agilent Technologies, Palo Alto, California). A 100 mm × 4.6 mm Hypersil C18 column, packed with 5-μm particles, was used for this study (Thermo Electron, Waltham, Massachusetts).

Preparation of test solutions: Individual stock solutions of dimethyl phthalate (DMP) and nitrobenzene (NB) were prepared in acetonitrile, so that when diluted, the maximum peak height observed was near 1 absorbance unit (AU).

A combination of serial and parallel dilutions with these two solutions resulted in a series of test samples with a constant concentration of dimethyl phthalate and varying concentrations of nitrobenzene. These solutions are referred to in the text as "second peak small," because nitrobenzene is eluted after dimethyl phthalate. Similarly, the initial mixture was diluted with a solution of nitrobenzene only in acetonitrile, producing another series of test samples with a constant concentration of nitrobenzene and varying concentrations of dimethyl phthalate. These solutions are labeled "first peak small."

Operating conditions: All experiments were conducted at a flow rate of 1.50 mL/min with an injection volume of 5 μL and a column temperature of 40 °C. Absorbance was monitored at 250 nm. The system was operated under isocratic conditions, using the following concentrations of acetonitrile in water: 45.0%, 67.5%, 75.0%, and 83.0%. These conditions produced resolution between the two analytes (dimethyl phthalate and nitrobenzene) of 4.0, 2.0, 1.5, and 1.0, respectively, as measured by the data system, using the resolution tangent method for the test sample containing approximately equal concentrations. USP tailing factors ranged between 1.05 and 1.10, so this study presents results for typical chromatographic peaks that show only minimal tailing.
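The two figures of merit quoted above can be computed from simple peak measurements. The following is a minimal sketch, assuming the textbook definitions: tangent-method resolution Rs = 2(t2 − t1)/(w1 + w2) from baseline widths, and USP tailing T = W0.05/(2f) measured at 5% of peak height. The function names and the numerical values are hypothetical and are not taken from this study.

```python
def resolution_tangent(t1, w1, t2, w2):
    """Resolution from retention times and tangent (baseline) peak widths."""
    return 2.0 * (t2 - t1) / (w1 + w2)


def usp_tailing(front_width_5pct, total_width_5pct):
    """USP tailing factor: total width at 5% height over twice the front part."""
    return total_width_5pct / (2.0 * front_width_5pct)


# Hypothetical values in minutes, for illustration only.
print(round(resolution_tangent(t1=2.00, w1=0.20, t2=2.30, w2=0.20), 2))  # 1.5
print(round(usp_tailing(front_width_5pct=0.050, total_width_5pct=0.108), 2))  # 1.08
```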

Calculations: Analysis under conditions generating a resolution of 4.0 (45% acetonitrile) was used to define the "true" value for each test solution. That is, the relative response of the two components was determined from these data. The values for each test solution are listed in Table I, with the largest peak always assigned a value of 100. (Note that the lowest concentration of the small peak from the previous study is included here as the highest concentration.) The ratio between each component in the test solutions and its corresponding calibration reference produced a response factor. Then, under each set of resolution conditions, and for each integration method, the response for the calibration reference was used with the response factor to calculate an expected peak response (area or height) for those conditions. Comparison of this expected peak response with the actual value allowed calculation of the error for the observed chromatogram. All values reported here represent the percent error between the observed and expected values. A positive error means the observed value was higher than expected; a negative error means some peak area (or height) was lost.
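The error calculation described above can be sketched in a few lines. This is one reading of the procedure, with hypothetical argument names and made-up responses: the response factor comes from the fully resolved run (resolution 4.0), and it is then applied to the calibration reference measured under the conditions being evaluated.

```python
def percent_error(observed, test_rs4, reference_rs4, reference_now):
    """Percent error of an observed response against the expected response.

    test_rs4 / reference_rs4 is the response factor from the resolution 4.0 run;
    reference_now is the calibration reference response under the resolution
    and integration method being evaluated.
    """
    factor = test_rs4 / reference_rs4
    expected = factor * reference_now
    return 100.0 * (observed - expected) / expected


# Hypothetical responses (arbitrary units), not data from this study.
error = percent_error(observed=1.3328, test_rs4=2.0, reference_rs4=100.0, reference_now=98.0)
print(round(error, 1))  # -32.0
```

A negative result, as here, corresponds to lost peak area or height.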

Table I: Relative response for test solutions*

Results and Discussion

When one peak is significantly smaller than the other peak, situations will exist in which the small peak cannot be integrated as a separate peak because a valley no longer appears between the peaks. Table II summarizes the conditions under which two separate peaks were obtained. Although this phenomenon is familiar to chromatographers, and has been mentioned elsewhere (3,4), the table is included here to provide readers with a readily available reference for predicting the resolution necessary to produce separate peaks under the conditions described here (good peak shape). Readers should be aware, however, that poor peak shape will produce more situations where separate peaks are not observed. That is, the valley will disappear at smaller peak size ratios and larger resolutions.

Table II: Conditions producing separate peaks

When the small peak is less than about 5% of the large peak, and is eluted second, it is not resolved from the other peak as the resolution approaches 1.0 because of tailing from the large peak. When the small peak is eluted first, it is resolved only if the area is more than 1% of the large peak. A smaller peak can be seen in this elution order than when the small peak is eluted second, because there is no interference from the large peak's tail.
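The disappearance of the valley can be explored with a simple simulation. The sketch below assumes two ideal Gaussian peaks of equal width, so that resolution equals the separation divided by 4σ; the function names are hypothetical. Because a Gaussian has no tail, this model cannot distinguish elution order and will be more optimistic than Table II for a small second peak.

```python
import math


def gaussian(t, height, center, sigma=1.0):
    """Gaussian peak profile with the given apex height."""
    return height * math.exp(-0.5 * ((t - center) / sigma) ** 2)


def has_valley(large, small, resolution, dt=0.01):
    """True if the summed signal shows a local minimum (a valley).

    Equal peak widths are assumed, so separation = 4 * sigma * resolution.
    """
    separation = 4.0 * resolution  # in units of sigma
    n = int(round((separation + 8.0) / dt))
    ts = [-4.0 + i * dt for i in range(n + 1)]
    ys = [gaussian(t, large, 0.0) + gaussian(t, small, separation) for t in ts]
    return any(ys[i - 1] > ys[i] < ys[i + 1] for i in range(1, n))


# Size ratios are illustrative; tailing-free peaks are an idealization.
for ratio in (5.0, 1.0, 0.2):
    for rs in (2.0, 1.5, 1.0):
        print(ratio, rs, has_valley(100.0, ratio, rs))
```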

A summary of integration errors is provided in Table III. The table is organized by relative peak size and resolution. For each of the two peaks, average integration errors are listed for both area and height measurements, using each of the four integration methods — drop, valley, exponential skim, and Gaussian skim. The values in parentheses represent the standard deviation for each average, based upon three injections. Variability was generally less than 0.5%. Any errors less than about 2% should be considered negligible, and for the purposes of this discussion, such results represent no error.

Table III: Integration errors for differing relative peak areas, resolutions, and integration methods*

Finally, because relative rather than absolute errors are reported, it is important to recognize that constant integration errors will be magnified for the smallest peaks. This approach has been used here because the analyst should be most interested in the error compared to the true value. The data indicate that reported values for very small peaks can be in error by more than 50% in some cases, and the analyst must be aware of such situations, as they have a direct impact on the reported results.

Unlike in the previous study, the variability was larger for a few data points, particularly when the smaller peak was very small. This result reflected the practical problems that all analysts encounter when working at trace levels. Even minor baseline fluctuations, not visible at less sensitive scale settings, could cause a noticeable change in integration results, especially when the peak being integrated was very small. In addition, even though high-purity standards (99%) were used in this study, some minor peaks (less than 0.1% of the main ingredient) were observed. Although such peaks were insignificant when the smaller peak was at least 5% of the larger peak, they sometimes had an influence when the smaller peak was less than 2% of the larger peak. In all such situations, careful examination of the baseline was necessary to determine the correct location for the integration start–stop positions. Analysts are cautioned strongly to consider this problem when integrating small peaks, as it can further increase integration errors.

Integration Errors for the Drop Method

Integration errors for the large peak become insignificant when the smaller peak is less than 5% of the larger peak, and will not be discussed here. The errors for the small peak, however, become very large as the peak decreases in size, as shown in Figure 1, which summarizes the integration errors at resolution 1.5 for the small peak, using both area and height.

Figure 1

The drop method produces a steadily increasing error for the second small peak as its area decreases. The error exceeds 100% when the small peak size is only 0.20% of the large peak. These results should be contrasted with the previous study (1), in which the drop method generated little error for either peak (when both peaks were of similar size). Figure 2 illustrates why the drop method produces such large errors for the second small peak. A chromatogram for the parent peak only is overlaid on the chromatogram of the mixture. In the region where the second peak is eluted, it is evident that the baseline has not returned to its prepeak 1 levels, because of tailing from the large parent peak. Unfortunately, the data system sets the baseline at the prepeak 1 level, which is below the actual baseline. The result is a significant positive error for the smaller peak. As would be expected, this error decreases as the small peak increases in size, so that the error is negligible once the small peak is more than about 5% of the large peak. Height measurements produce somewhat less error, but still at unacceptable levels. These tailing problems are not present when the first peak is small. As a result, large integration errors are absent (Figures 1c and d). Here, the negative area errors observed for the smallest peak reflect losses due to the movement of the valley toward the smaller peak. Height measurements provide an accurate value.

Figure 2

Clearly, the drop method is inappropriate for integrating the second peak when the resolution is 1.5. However, examination of Table III indicates that the situation is improved only slightly when the resolution is increased to 2.0. For the smallest second peak (100:0.20), the area and height errors are 52% and 29%, respectively, and few laboratories would consider such errors to be acceptable. When the smaller peak is more than about 1% of the larger peak, errors are relatively small, if height is used and the resolution is at least 2.0. When the smaller peak is less than about 1% of the larger peak, much larger resolutions are required to minimize errors. In some cases, a resolution of 4.0 might not be sufficient, if high accuracy is required. When the small peak is first, measuring height using the drop method produces accurate results.
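The mechanism behind the positive drop-method error for a small second peak can be illustrated with a small simulation. The sketch below assumes Gaussian peaks separated by 6σ (nominally resolution 1.5) and adds a hypothetical long, low tail to the large peak, roughly 0.1% of its apex height. The tail shape, its time constant, and the peak-end position are assumptions, so the printed values illustrate the trend only and are not expected to reproduce Table III.

```python
import math


def gauss(t, area, center, sigma=1.0):
    """Gaussian peak with the given area."""
    return area / (sigma * math.sqrt(2.0 * math.pi)) * math.exp(-0.5 * ((t - center) / sigma) ** 2)


def parent_peak(t):
    """Large peak (area 100 at t = 0) plus a hypothetical long, low tail."""
    tail = 0.05 * math.exp(-(t - 3.0) / 4.0)  # assumed shape, not fitted to data
    return gauss(t, 100.0, 0.0) + tail


def drop_area_error(small_area, center=6.0, dt=0.005):
    """Percent area error for the second peak: perpendicular drop, flat baseline at 0."""
    t_end = center + 3.0  # assumed peak-end position (3 sigma past the apex)
    ts = [i * dt for i in range(int(round(t_end / dt)) + 1)]
    ys = [parent_peak(t) + gauss(t, small_area, center) for t in ts]
    apex = int(round(center / dt))
    valley = min(range(apex + 1), key=ys.__getitem__)  # position of the drop line
    if valley == apex:
        return None  # no valley, so no separate second peak to integrate
    # Trapezoidal area from the drop line to the peak end, above the zero baseline.
    drop = sum(0.5 * (ys[i] + ys[i + 1]) * dt for i in range(valley, len(ts) - 1))
    return 100.0 * (drop - small_area) / small_area


for small_area in (0.2, 1.0, 5.0):
    print(small_area, drop_area_error(small_area))
```

In this model the tail area captured under the small peak is roughly constant, so the relative error grows as the small peak shrinks.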

Integration Errors for the Valley Method

When the peaks are of similar size, the valley method produces significant negative errors for both peaks (1). When the peaks are not of similar size, the error in the large peak becomes insignificant. However, the error for the small peak remains a large negative number. As shown in Figure 3, the valley baseline for the small peak is significantly higher than the real baseline from the parent peak, resulting in a 32% loss of peak area (and 17% loss of height) from the small peak when the area ratio is 100:2.0 and the resolution is 1.5. Note that there is also a loss of area from the large peak, but the relative loss is small compared with the size of the peak. The second peak errors increase as the second peak gets smaller.

Figure 3

Higher resolution reduces these errors, but the small peak must be much more than 2% of the large peak for the errors to become insignificant. The use of height measurements will reduce the errors. However, even higher resolution values (perhaps 3 or larger) would be necessary to keep the errors low consistently. Increased peak tailing would make the situation worse, and additional separation of the peaks would be necessary. Note that if the larger peak were significantly wider than the small peak, then the valley method might provide a more accurate approximation of the real baseline (2), but such situations are less common.
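The negative error of the valley method can be sketched in the same way. The example below assumes Gaussian peaks separated by 6σ and a hypothetical long, low tail on the large peak; the tail parameters and the peak-end position are assumptions, and the function names are illustrative. The straight baseline from the valley point to the peak end lies above the curved tail, so area is removed from the small peak.

```python
import math


def gauss(t, area, center, sigma=1.0):
    """Gaussian peak with the given area."""
    return area / (sigma * math.sqrt(2.0 * math.pi)) * math.exp(-0.5 * ((t - center) / sigma) ** 2)


def valley_area_error(small_area, center=6.0, dt=0.005):
    """Percent area error for the second peak with a straight valley baseline."""
    t_end = center + 3.0  # assumed peak-end position
    ts = [i * dt for i in range(int(round(t_end / dt)) + 1)]
    # Large peak (area 100 at t = 0) with a hypothetical long, low tail.
    ys = [gauss(t, 100.0, 0.0) + 0.05 * math.exp(-(t - 3.0) / 4.0)
          + gauss(t, small_area, center) for t in ts]
    apex = int(round(center / dt))
    start = min(range(apex + 1), key=ys.__getitem__)  # valley point
    end = len(ts) - 1
    if start == apex:
        return None  # no valley, so no separate second peak
    # Straight baseline from the valley point to the peak-end point.
    slope = (ys[end] - ys[start]) / (ts[end] - ts[start])
    above = [ys[i] - (ys[start] + slope * (ts[i] - ts[start])) for i in range(start, end + 1)]
    area = sum(0.5 * (above[k] + above[k + 1]) * dt for k in range(len(above) - 1))
    return 100.0 * (area - small_area) / small_area


for small_area in (2.0, 0.2):
    print(small_area, valley_area_error(small_area))
```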

Integration Errors for the Skim Method

In examining the shape of the parent peak baseline in Figures 2 and 3, it might seem intuitive to expect the exponential skim method to provide better results, particularly when the second peak is small. The tail from the parent peak has an exponential character, and the skim method should be able to provide an accurate correction. Unfortunately, Table III indicates that the exponential skim method produces significant negative errors for the small peak at resolution equal to 1.5. At high peak size ratios (very small second peak), an exponential skim is less inaccurate than either drop or valley, but an area error of −25% would not be acceptable in most cases. Similar trends with smaller errors occur when the first peak is small. Height measurements produce much less error, although the results are less accurate than the drop–height combination. At resolution equal to 2.0, the exponential skim method produces little error, but few analysts would consider using it under these conditions.

The Gaussian skim procedure might provide better accuracy if the shape of the skim line is a better approximation of the parent peak profile. Unfortunately, the data in Table III indicate that this expectation is not always met. While the exponential skim method tends to draw the skim line above the real baseline (negative error for the skimmed peak), the Gaussian skim tends to draw the skim line below the real baseline. That is, the skim line returns to baseline faster than the real baseline, resulting in a positive error. The magnitude of the error is particularly large (greater than 100%) when the second peak is very small at resolution equal to 2.0. Fortunately, this integration method would probably not be used in such situations. However, when the resolution is equal to 1.5, the errors are still significant for a very small second peak, with area errors greater than 30%. It is interesting to note that the Gaussian skim–height procedure does provide the best overall accuracy at this resolution, even when the small peak is eluted first. At resolution equal to 1.0, the errors are minimal when the small peak is second, but they are increasingly negative when the small peak is first.

While the Gaussian skim procedure is an appealing option, its implementation in a wide range of size ratio and resolution combinations is problematic. The errors can be either large positive, negative, or minimal, depending upon the situation. Peaks with more tailing might be expected to result in even larger positive errors, if the skim line does not properly account for the tailing. Additional experiments will be necessary to evaluate the general applicability of this method for a wider range of chromatographic conditions.

The reasons for the errors in the skim method are illustrated in Figure 4. Here overlay chromatograms with a size ratio of 100:4.6 are presented for resolution equal to 1.0 (Figures 4a and b) and 1.5 (Figures 4c and d). At the lower resolution, the height of the tail from the parent peak drops to much lower values than the exponential skim line, resulting in a large negative peak area error. It can be seen that the Gaussian skim line is a good approximation of the real baseline; however, Figure 4b also illustrates how critical the exact shape of the skim line is. A small change in shape would result in significant errors. As the separation improves, the shape of the exponential skim line more closely approximates the actual shape of the tail from the parent peak, while the Gaussian skim line diverges from the real tail. Figure 4 illustrates the general failure of the exponential skim method and the inconsistent nature of the errors for the Gaussian skim procedure.

Figure 4
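To make the geometry of a skim line concrete, the sketch below draws one plausible exponential skim: a single exponential curve through the valley point and the peak-end point. This form is an assumption for illustration only; the skim function actually used by the data system is proprietary and may differ. The coordinates are hypothetical.

```python
def exponential_skim(t, t_v, y_v, t_e, y_e):
    """Exponential curve through the valley point (t_v, y_v) and end point (t_e, y_e).

    Both signal values must be positive (measured above the true baseline).
    """
    return y_v * (y_e / y_v) ** ((t - t_v) / (t_e - t_v))


def straight_valley(t, t_v, y_v, t_e, y_e):
    """Straight valley baseline through the same two points, for comparison."""
    return y_v + (y_e - y_v) * (t - t_v) / (t_e - t_v)


# Hypothetical valley and peak-end points (time, signal).
points = (3.75, 0.140, 9.0, 0.020)
mid = 0.5 * (points[0] + points[2])
print(exponential_skim(mid, *points))  # geometric mean of the two signal values
print(straight_valley(mid, *points))   # arithmetic mean of the two signal values
```

Between the two anchor points the exponential curve always lies below the straight line, so it removes less area from the skimmed peak than a valley baseline does. Whether it follows the real tail depends on how closely the tail matches a single exponential.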

Other Integration Issues

The shape of the skim baseline cannot be controlled, as it is a proprietary part of the software. However, the results presented here suggest that further refinements in the skim function, or at least the addition of user control over the skim function, might provide somewhat better integration results. Such features would be hard to design, and execution by the user would be difficult in the absence of information on the peak shape of the parent peak. But successful implementation of such an algorithm would be a welcome addition to current chromatography software.

The errors described here highlight the difficulty of working with separations where one peak is significantly smaller than the other peaks. As noted earlier, low-level impurities can interfere with peaks of interest in many cases. In other situations, the impurities simply can cause a disturbance in the baseline that results in a change in the location of the peak start and stop points. For most analyses, such a change would not be noticeable, either visually on the chromatogram, or in the resulting data. However, when one peak of interest is small (less than 5% of other peaks), the changes caused by an impurity which is present even at levels below 0.1% can create significant integration errors. The analyst must be very careful in reviewing the baseline and integration points, to ensure that these errors are recognized and minimized.

Finally, the peaks generated in this study showed typical chromatographic characteristics, with USP tailing factors of less than 1.1. This parameter is measured at 5% of peak height, and generally is considered to be a good measure of tailing. In Figure 5, the width of the peak very near the baseline is shown, along with indications of the location relative to the peak height. It is clear that the width of the peak increases substantially below 0.2% of the peak height. It is in this region where a second peak at very low levels would be eluted. When viewed at this scale, the errors reported here become much easier to visualize, and it also is clear that the conventional measures of peak symmetry have little meaning this close to the baseline.

Figure 5

The cause of this extended tail very near the baseline is not clear. It seems unlikely that this tailing results from the same mechanisms responsible for asymmetry at higher points on the peak. It would be interesting to study whether this tail is unique to the instrument, the column, the analyte, or some combination of them. But regardless of the source, conventional peak symmetry measurements will not detect this tailing, yet it has a significant impact on integration of small peaks that appear on the tail. Ultimately, it might be necessary to report peak symmetry values at 0.1–0.5% of peak height for challenging separation problems such as those described here.
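The suggestion to report symmetry closer to the baseline can be sketched as follows. The code measures a tailing value, (front + back)/(2 × front), at any fraction of the apex height of a sampled peak. The test signal is synthetic: a Gaussian plus a small exponentially modified Gaussian component (1% of the area, long time constant) standing in for the extended tail. That tail model and its parameters are assumptions, not a description of the peaks in Figure 5.

```python
import math


def gaussian(t, area, mu, sigma):
    """Gaussian peak with the given area."""
    return area / (sigma * math.sqrt(2.0 * math.pi)) * math.exp(-0.5 * ((t - mu) / sigma) ** 2)


def emg(t, area, mu, sigma, tau):
    """Exponentially modified Gaussian, used here only to mimic a long, low tail."""
    arg = (sigma / tau - (t - mu) / sigma) / math.sqrt(2.0)
    return area / (2.0 * tau) * math.exp(sigma ** 2 / (2.0 * tau ** 2) - (t - mu) / tau) * math.erfc(arg)


def tailing_at(ts, ys, fraction):
    """(front + back) / (2 * front), measured at `fraction` of the apex height."""
    apex = max(range(len(ys)), key=ys.__getitem__)
    level = fraction * ys[apex]

    def crossing(step):
        i = apex
        while ys[i] > level:
            i += step
            if i < 0 or i >= len(ys):
                raise ValueError("signal never falls to the requested level")
        j = i - step  # last sample still above the level
        # Linear interpolation between the two samples that bracket the level.
        return ts[j] + (level - ys[j]) * (ts[i] - ts[j]) / (ys[i] - ys[j])

    front = ts[apex] - crossing(-1)
    back = crossing(+1) - ts[apex]
    return (front + back) / (2.0 * front)


# Synthetic peak: hypothetical parameters, arbitrary time units.
ts = [i * 0.01 for i in range(4001)]
ys = [gaussian(t, 100.0, 10.0, 1.0) + emg(t, 1.0, 10.0, 1.0, 5.0) for t in ts]
for fraction in (0.05, 0.005, 0.001):
    print(fraction, round(tailing_at(ts, ys, fraction), 2))
```

For this synthetic peak the value at 5% of peak height stays close to 1, while the value at 0.1% is much larger, which is the behavior the text describes.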

Conclusions

When the second peak is small and resolution is at least 2.0, the exponential skim method with height measurement produces accurate results. When the first peak is small, the drop–height combination provides the least integration error at all peak size ratios and resolutions. At resolution 1.5, with the second peak small and measuring at least 2% of the large peak, the drop–height method is more accurate, although there will be a small positive error in the result. For smaller second peaks and lower resolution, there are no methods that produce accurate results. The exponential skim–height procedure provides the least error, but it has a relatively large negative value. The Gaussian skim–height method also provides accurate results at resolution equal to 1.5, but the errors change significantly with changes in resolution. The drop method for very small second peaks produces a very large positive error. In general, for large peak size ratios, resolution less than 1.5 should be avoided.

Most of the integration errors are due to extended tailing from the large peak, at levels very near the baseline. This region of the chromatogram must be examined carefully, and integration baselines must be drawn to follow this tail to minimize integration errors.

References

(1) M.K.L. Bicking, LCGC 24(4), 402–414 (2006).

(2) V.R. Meyer, Chromatographia 40, 15–22 (1995).

(3) J.W. Dolan, LCGC 20(5), 430–436 (2002).

(4) V.R. Meyer, LCGC 13(3), 252–260 (1995).