Failed System Suitability Test: A Case Study


LCGC Europe, 12-01-2016
Volume 29, Issue 12
Pages: 684–686

A reader’s problem of a method that fails the repeatability requirement of the system suitability test serves as an example of how to approach liquid chromatography (LC) method troubleshooting.

John W. Dolan, LC Troubleshooting Editor

I regularly receive questions from readers of “LC Troubleshooting”. Many of these questions are simple and can be answered with a quick e-mail, and I often collect these to include in a discussion of readers’ questions in this column. Others stand alone as good examples not only of how to solve a specific problem, but also of how to approach liquid chromatography (LC) problems in general. The present case study is a good example of the latter type of question.

In a somewhat edited version, the question went like this:

I am using a pharmacopeial method for assay of a drug tablet. This is a normal phase method with the following conditions. Column: unbonded silica, 10-µm diameter particles packed in a 300 mm × 4.6 mm column, operated at 25 °C [flow not specified]. Mobile phase: acetonitrile–ethyl acetate–tetrahydrofuran–n-hexane (1:1:1:7). Injection: 10 µL of sample dissolved in methanol + mobile phase. Samples were kept cool (6 °C) while on the autosampler.

During verification of the method in the research and development (R&D) laboratory and initial transfer to the quality control (QC) laboratory, I did not observe any failure of the system suitability testing (SST) requirements (≤2% relative standard deviation [RSD]) for either the initial injection of replicate standards (n = 5) or bracketing standards.

After six months of use in the QC lab, two types of SST failures were noticed:
1. Failure of %RSD for the initial n = 5 replicate standards. Sometimes SST passed, sometimes it failed. This occurred on two different LC systems.
2. If the initial n = 5 standards passed, sometimes the bracketing standards did not meet the requirements and other times they did.

In addition to the failure of the standards, I’ve noticed a difference in peak area response that occurs occasionally and suddenly with multiple injections of either standard or sample.

After observing the above problems, we have performed parallel analysis in the R&D lab using the same brand of LC system used in QC, but we have not observed any of the problems. I’m not sure what to do next.


Mental Experiments First

As I read this question, my mind immediately pointed me to an injection problem. When I’m with a group and that happens, I’m often asked how I came to that conclusion so quickly. Here’s the process that I use, either consciously or subconsciously.

I always like to begin troubleshooting by simplifying the workload as much as possible, so I do as many mental experiments as I can first. It is appropriate to apply the “divide and conquer” rule here. In this approach, an experiment (mental or physical) is chosen that divides the entire problem space into large pieces. Knowing the answer to the experiment, you can eliminate a large portion of the possible root causes. Repeat with the remaining potential problems to eliminate as many as possible.

When communicating by e-mail, I usually find that I don’t have all the information that I would like when initially approaching an LC problem. For example, a few example chromatograms and a table of results for typical passing and failing runs would be nice; it is surprising how often a quick glance at a chromatogram or some data will help to spot the problem source quickly. I did not get this information for the present problem, so I’ll have to try another approach.

Even without all the desired information (and this is usually the case for most problems you will encounter in the laboratory), I think that I can safely draw a few conclusions:

  • The method is basically sound. If it came from one of the pharmacopeias (for example, the United States Pharmacopeia [USP] or European Pharmacopoeia [EP]), it was thoroughly vetted before publication. Such methods are sufficiently stable that revalidation is usually not necessary when they are set up in a new laboratory. Some kind of verification that the method works is required, usually defined by the receiving laboratory’s standard operating procedures (SOPs) based on regulatory guidance. It appears to me from the reader’s input that this process went smoothly. Divide and conquer result: It probably isn’t the method itself.


  • The method worked initially in the R&D laboratory on at least one instrument and later it continued to work in R&D when parallel analysis was done. Divide and conquer result: confirms the above assumption that the method is basically sound, and it might be instrument or operator dependent.

  • The method worked initially in QC, but failed after approximately six months and continued to fail. It also failed on a second LC system, but it wasn’t clear if it originally ran properly on this system. Divide and conquer result: Something has changed over six months and it isn’t unique to a single LC system. Because it worked for six months, I am less likely to suspect operator problems than I would if the transfer to QC never worked.

  • The comment that the areas appear to change suddenly from one injection to the next leads me to assume that gradual ageing of the instrument or column is not the cause, or at least not the most likely one. Divide and conquer result: Don’t focus first on the column.

  • When the method was returned to R&D it worked as expected. Divide and conquer result: There’s something different between the R&D and QC instruments that is causing the problem.


Is It Plugged In?

It is amazing how often the simplest things are overlooked, and I’m as guilty of this absentmindedness as the next person. We all know stories about some appliance that doesn’t work and we fret and fuss trying to get it operational, but it never occurs to us that it might not be plugged in. So ask the obvious questions, even though you may be certain of the answers. Here are a couple:

Is retention reasonable? For consistent chromatography, including retention times and peak area or height, the first peak of interest should be retained so that it is well clear of the column dead-time peak (t0). This means that the retention factor, k, should be at least 1 and preferably 2 or larger. A value of k = 0.5 may be acceptable in some cases, but smaller retention factors mean that the first peak is likely to run into interfering materials at the solvent front and will be less reproducible. You can calculate k as follows:

k = (tR − t0)/t0    [1]

where tR is the retention time. For the present purpose, you can visually estimate if the k-value is large enough by making sure the time (baseline distance) between injection and the solvent front is less than the time between the solvent front and the peak of interest. For the current setup, the column dead volume should be approximately 3 mL, so if we assume a flow rate of 2 mL/min (the reader did not specify the flow), t0 ≈ 1.5 min. For k ≥ 1, tR ≥ 3 min would be required. Because this is a pharmacopeial method, it is highly likely that the method is designed to satisfy this requirement, but the brand of column or accuracy of column oven calibration can change k significantly relative to the original method conditions. If I had been supplied with chromatograms of a good and a failing run, I could check the k-value easily.
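The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the 3 mL dead volume and the 2 mL/min flow rate are the values used in the text (the reader did not report the flow), and the function names and the 4.5 min example retention time are hypothetical.

```python
def dead_time_min(dead_volume_ml: float, flow_ml_per_min: float) -> float:
    """Column dead time t0 (min) from dead volume and flow rate."""
    return dead_volume_ml / flow_ml_per_min


def retention_factor(t_r_min: float, t0_min: float) -> float:
    """Retention factor k = (tR - t0) / t0, as in equation 1."""
    return (t_r_min - t0_min) / t0_min


def min_retention_time(k_min: float, t0_min: float) -> float:
    """Smallest retention time (min) that still gives k >= k_min."""
    return t0_min * (1.0 + k_min)


# Values from the text: ~3 mL dead volume, assumed 2 mL/min flow.
t0 = dead_time_min(3.0, 2.0)
print(f"t0 = {t0} min")
print(f"tR needed for k >= 1: {min_retention_time(1.0, t0)} min")

# Hypothetical observed retention time of 4.5 min.
print(f"k at tR = 4.5 min: {retention_factor(4.5, t0)}")
```

With a chromatogram in hand, the same check takes only the retention time of the first peak of interest and the position of the solvent front.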

Another retention check is to make sure that the failed injections have the same retention times as the good ones. I have assumed that a change in retention between the two conditions would be so obvious that it was not overlooked, but it is good to ask. Again, a couple of chromatograms supplied with the question would allow me to evaluate the retention times.

I mentioned in the previous section that the symptom of variable peak size is very unlikely to be related to a problem with the column. However, replacing the column with a new one is such a simple check, it wouldn’t hurt to try a new column before digging deeper. Another rule of thumb I often use is “easy over powerful”. That is, it may be most efficient to try an easy experiment, such as column replacement, before executing a more complicated experiment, even if you don’t expect it to succeed. Column replacement may have been done already, so it may not have to be repeated, but I don’t have that information.


Reduce the Variables

Now that we’ve eliminated the easy possibilities with some mental experiments or by visual evaluation of existing chromatograms or data, it is time to roll up our sleeves and dig in deeper. Changes in peak area are most commonly the result of some change in the injection process, so that’s where I’m going to concentrate.

I’m not sure how the sample sequence is set up, but for this type of method often a single injection is made from each vial. If replicate injections are made, they are made from two separate vials. Thus the sequence might be SST sample 1, SST2, SST3, SST4, SST5, then the bracketing standards (BS), BS1, BS2. This would be followed by several samples for analysis (SA), SA1, SA2, SA3, SA4, SA5, and a couple more bracketing standards BS3 and BS4, then more SA samples. Often all of the SST samples come from a common stock and the BS samples from another common stock. The %RSD calculation mentioned by the reader would then be made for SST1–SST5 (n = 5) and for BS1–BS4 (n = 4).

If the above sequence is correct, we can’t be sure if the problem of differing areas is a result of the injection process or something that happened to the samples before they were injected because a different vial was used for each injection. The first experiment that I would do is to eliminate this uncertainty. Fill two vials with sufficient SST or BS for a large number of injections. The method uses only 10 µL of sample, so 0.5–1.0 mL would allow 50–100 injections per vial. Set up the method sequence to make n = 10 injections of each vial, then repeat (10 × SST and 10 × BS), so the total number of injections is n = 20 for SST and n = 20 for BS.
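The injection plan just described can be laid out as a short sketch. The vial labels, function names, and defaults here are illustrative assumptions; the only figures taken from the text are the 10 µL injection volume, the blocks of 10 injections, and the single repeat.

```python
def build_sequence(vials=("SST", "BS"), injections_per_block=10, repeats=2):
    """Return the injection order: a block from each vial, then repeat."""
    sequence = []
    for _ in range(repeats):
        for vial in vials:
            sequence.extend([vial] * injections_per_block)
    return sequence


def volume_drawn_ml(n_injections, injection_ul=10.0):
    """Total sample volume withdrawn from one vial, in mL."""
    return n_injections * injection_ul / 1000.0


sequence = build_sequence()
print(f"total injections: {len(sequence)}")
for vial in ("SST", "BS"):
    n = sequence.count(vial)
    print(f"{vial}: {n} injections, {volume_drawn_ml(n)} mL drawn")
```

The volume drawn counts only what is injected. The needle cannot reach all of the liquid in a vial, which is one reason to fill with the 0.5–1.0 mL suggested above rather than the bare minimum.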

After the run is complete, calculate the %RSD for each set of 10 and for the combined set of 20 for each sample. Compare these to the failed experimental results. If the %RSD is lower when the samples are from the same vial compared to when samples are in individual vials, this suggests that the problem is related to preparation of the vials or the position of the vial in the run sequence. If the variability is similar, the injection process itself is more likely the problem. Also, examine the data to look for any patterns in peak areas. Do peaks gradually get larger or smaller or are the changes random? A sequence of gradually changing peak sizes can occur if the sample vial is filled too full and poorly vented. It can also occur if sample or solvent evaporates as the sample sits on the sample tray. Evaporation of the sample solvent is more likely a problem with a normal phase method, such as this one, than a reversed‑phase method, where the sample solvent usually contains water, making it much less volatile.
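A minimal sketch of this data review follows, using only the Python standard library. The peak areas are invented purely for illustration, and the function names are hypothetical. The %RSD uses the sample standard deviation, and the drift figure is an ordinary least-squares slope of area against injection number, expressed as a percentage of the mean area per injection.

```python
from statistics import mean, stdev


def percent_rsd(areas):
    """Relative standard deviation (sample SD / mean) in percent."""
    return 100.0 * stdev(areas) / mean(areas)


def drift_percent_per_injection(areas):
    """Least-squares slope of area vs. injection index, as % of mean area."""
    n = len(areas)
    x_mean = (n - 1) / 2
    y_mean = mean(areas)
    sxy = sum((i - x_mean) * (a - y_mean) for i, a in enumerate(areas))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return 100.0 * (sxy / sxx) / y_mean


# Hypothetical peak areas for two blocks of 10 injections from one SST vial.
sst_block_1 = [1502, 1498, 1505, 1499, 1501, 1497, 1503, 1500, 1496, 1504]
sst_block_2 = [1500, 1462, 1501, 1499, 1535, 1498, 1502, 1471, 1500, 1503]

for label, areas in [
    ("SST block 1", sst_block_1),
    ("SST block 2", sst_block_2),
    ("SST combined", sst_block_1 + sst_block_2),
]:
    print(
        f"{label}: %RSD = {percent_rsd(areas):.2f}, "
        f"drift = {drift_percent_per_injection(areas):+.3f} % per injection"
    )
```

A slope near zero combined with a high %RSD points toward random variation, whereas a consistent positive or negative slope suggests a gradual effect such as evaporation or poor vial venting.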

If the %RSD is still high and the peak area changes are random, as I suspect they will be, I believe the problem is associated with the autosampler. Before going further, I would thoroughly check the autosampler for obvious problems. Is there enough wash solvent in the reservoir? Is the wash solvent compatible with the sample (for example, it must contain no water for a normal-phase method)? Is the correct syringe installed? Is it worn out? Is it tight in the mount? Is the sample needle clear of any septum debris or other material? Is the needle depth adjusted properly? Sometimes the tubing connecting the needle and syringe needs to be purged of bubbles, so check for this. If there is a needle wash station in the autosampler, is it clean and working properly? Make sure that the draw or fill rate chosen for withdrawing sample from the sample vial isn’t too high. Because of the volatility of the normal-phase sample solvent, rapid fill conditions that may be satisfactory for reversed-phase methods can cause cavitation (bubble formation) during the sample transfer process, introducing error. If in doubt, reduce the fill rate to see if this improves things. Consult the operator’s manual for other suggestions, as well as the earlier “LC Troubleshooting” column on autosamplers (1). If you find and correct any problems at this point, repeat the SST and BS injection series to see if the problem is solved.


Autosampler Performance Qualification

If the problem persists at this point, the autosampler needs to be evaluated independent of the method. Under test conditions, today’s autosamplers should perform at %RSD ≤ 1% for n = 5 injections of 10 µL of a well behaved substance. In our laboratory we typically observed 0.3–0.5% RSD under such conditions.

I like to check the autosampler under reversed-phase conditions because there are typically fewer inherent problems with reversed phase than normal phase. I’ll explain the reversed-phase test conditions here; if you want to stay in normal-phase mode, you can figure out comparable conditions. If you are working in the normal-phase mode, as is the present case, be sure to use a compatible intermediate solvent when you switch to reversed phase. The simplest choice is propanol or isopropanol, either of which is fully miscible with both aqueous and normal-phase solvents. In the present case, acetonitrile is a component of the mobile phase, so you can use acetonitrile as the changeover solvent. First, remove the column and replace it with a piece of connecting tubing. Then replace all solvents (wash solvents and any mobile-phase solvents) with acetonitrile. Flush 10–20 mL of acetonitrile through each line to ensure all the normal-phase solvents have been flushed from the system. Then switch to the desired reversed-phase mobile phase and flush with another 10–20 mL of mobile phase. Finally, install the reversed-phase column and equilibrate it.

The autosampler performance qualification test is described in an earlier “LC Troubleshooting” column (2), which can be found on the LCGC website. The test column is a C18 column of your choice. Choose a stable test compound that is easy to detect. Reference 2 recommends anthracene, but most low-volatility neutral aromatic compounds will do the job. Avoid volatile aromatics, such as toluene or benzene, because these can evaporate during the test and change the results. Select a concentration and detection wavelength that will give a detector signal of 0.2–0.8 absorbance units for a 10-µL injection; for anthracene, a solution of ~2 µg/mL and a wavelength of 260 nm works well. Choose methanol–water, methanol–buffer, acetonitrile–water, or acetonitrile–buffer at a ratio that gives good retention (for example, k ≈ 4) for the test compound. A good starting place is 80% methanol–20% water or 70% acetonitrile–30% water. You may need to adjust the organic solvent percentage to get k ≈ 4, depending on the column brand you choose. For n = 5 injections of 10 µL each, you should see %RSD ≤ 1% for peak area under these conditions. If the variability exceeds this, there is something wrong with the autosampler. Review and double-check the component inspection mentioned above. If these items look OK, the autosampler is in need of more serious maintenance. Consult the service manual for more options or contact the manufacturer’s technical support department for more help.
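The acceptance criteria for this test can be collected into a small check, sketched here under stated assumptions. The limits (at least n = 5 injections, peak area %RSD ≤ 1%, detector signal of 0.2–0.8 absorbance units) come from the text. The function name, the use of peak height as the measure of detector signal, and the example data are hypothetical.

```python
from statistics import mean, stdev


def evaluate_pq_run(areas, peak_heights_au, rsd_limit=1.0, au_range=(0.2, 0.8)):
    """Return a list of problems; an empty list means the run meets the criteria."""
    if len(areas) < 5:
        return [f"only {len(areas)} injections; the test calls for n = 5"]

    problems = []

    # Peak area repeatability, using the sample standard deviation.
    rsd = 100.0 * stdev(areas) / mean(areas)
    if rsd > rsd_limit:
        problems.append(f"area %RSD {rsd:.2f} exceeds {rsd_limit}")

    # Detector signal should stay inside the recommended absorbance window.
    low, high = au_range
    if any(not (low <= h <= high) for h in peak_heights_au):
        problems.append(f"peak height outside {low}-{high} AU")

    return problems


# Invented example data for five 10-µL injections.
areas = [1000, 1002, 998, 1001, 999]
heights = [0.50, 0.50, 0.49, 0.50, 0.50]
print(evaluate_pq_run(areas, heights) or "autosampler meets the test criteria")
```

A run that fails only the absorbance check says more about the chosen concentration or wavelength than about the autosampler, so it is worth correcting that and repeating before drawing conclusions.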


Based on the information I was supplied by the reader about the area variability he observed, I strongly suspect that the problem is related to the autosampler. If, after going through the method checks and autosampler checks, the problem persists, it’s time to sit down and re-evaluate the possibilities. For example, transfer the exact column and mobile phase that works in the R&D laboratory to QC and see if the method works now; this should help to eliminate problems with the mobile phase or column.

This case study serves as a good example of how to approach a method problem. We broke the problem down into its potential causes, then examined them with mental experiments first, because these are faster and can save time that would otherwise be spent on unproductive laboratory experiments. It is wise to double-check even the most obvious root causes (did you plug it in?) just to be sure you didn’t overlook something. When tests under method conditions lead to a dead end, it is necessary to change to system or component qualification conditions that test the system or component independent of the method.


  1. J.W. Dolan, LCGC Europe 29(7), 370–374 (2016).
  2. G. Hall and J.W. Dolan, LCGC North Am. 20(9), 842–848 (2002).

“LC Troubleshooting” Editor John Dolan has been writing “LC Troubleshooting” for LCGC for more than 30 years. One of the industry’s most respected professionals, John is currently a principal instructor for LC Resources in McMinnville, Oregon, USA. He is also a member of LCGC Europe’s editorial advisory board. Direct correspondence about this column via e-mail to
