The authors discuss the specifics of analytical method validation.
As summarized in Part I, in any regulated environment analytical method validation is a critical part of the overall process of validation (1). Analytical method validation is the part of the validation process that establishes, through laboratory studies, that the performance characteristics of the method meet the requirements for the intended analytical application, and it provides an assurance of reliability during normal use. It is sometimes referred to as "the process of providing documented evidence that the method does what it is intended to do." Regulated laboratories must perform analytical method validation to comply with government or other regulators, and doing so is also good science. A well-defined and well-documented validation process not only provides evidence that the system and method are suitable for their intended use, but also aids in transferring the method and satisfies regulatory compliance requirements.
Since the late 1980s, government and other agencies (for example, the FDA and the International Conference on Harmonization, ICH) have issued guidelines on validating methods. In 1987, the FDA designated the specifications in the current edition of the United States Pharmacopeia (USP) as those legally recognized when determining compliance with the Federal Food, Drug, and Cosmetic Act (2). More recently, new information has been published, updating the previous guidelines and providing more detail and harmonization with ICH guidelines (3-5). As encompassed in the guideline, analytical method validation is just one part of the overall validation process that includes at least four distinct steps: software validation, hardware (instrumentation) validation and qualification, analytical method validation, and system suitability.
In part I of this article, we addressed software validation, analytical instrument qualification (AIQ), and system suitability (1). In part II, we will discuss the specifics of analytical method validation, including terms and definitions, protocol/methodology, and how analytical method validation changes according to the type or intended purpose of the analytical method.
Terms and Definitions
Several parameters, generally referred to as analytical performance characteristics, may be investigated during any method validation protocol depending upon the type of method and its intended use. These analytical performance characteristics, which we have over the years affectionately referred to as "The Eight Steps of Analytical Method Validation," are illustrated in Figure 1.
Figure 1: Analytical method validation performance characteristics.
Although most of these terms are familiar and are used daily in any regulated high performance liquid chromatography (HPLC) laboratory, they sometimes mean different things to different people. To avoid any confusion, it is necessary to have a complete understanding of the terms, definitions, and methodology as discussed in the following sections. Examples of acceptance criteria are listed in Table I.
Table I: Example analytical method validation protocol acceptance criteria
Accuracy
Accuracy is the measure of exactness of an analytical method or the closeness of agreement between an accepted reference value and the value found in a sample. Established across the range of the method, accuracy is measured as the percent of analyte recovered by the assay. For drug substances, accuracy measurements are obtained by comparison of the results to the analysis of a standard reference material, or by comparison to a second, well-characterized method. For the assay of the drug product, accuracy is evaluated by the analysis of synthetic mixtures spiked with known quantities of components. For the quantification of impurities, accuracy is determined by the analysis of samples (drug substance or drug product) spiked with known amounts of impurities (if impurities are not available, see specificity).
To document accuracy, the guidelines recommend that data be collected from a minimum of nine determinations over a minimum of three concentration levels covering the specified range (that is, three concentrations, three replicates each). The data should be reported as the percent recovery of the known, added amount, or as the difference between the mean and true value with confidence intervals (for example, ± 1 SD).
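The recovery calculation described above can be sketched in a few lines of Python. This is only an illustration: the spiked amounts, the found values, and the function names are hypothetical and are not taken from the guidelines or from Table I.

```python
from statistics import mean, stdev

def percent_recovery(found, added):
    """Percent of the known, added amount that the assay recovered."""
    return 100.0 * found / added

# Hypothetical data: amount added -> three replicate amounts found
# (three concentration levels x three replicates = nine determinations)
spiked = {
    80.0: [79.2, 80.5, 79.8],
    100.0: [99.1, 100.6, 100.2],
    120.0: [119.0, 121.1, 120.3],
}

def summarize_accuracy(data):
    """Return per-level (mean, SD) recovery plus overall mean, SD, and count."""
    per_level = {}
    all_recoveries = []
    for added, found_values in data.items():
        recoveries = [percent_recovery(f, added) for f in found_values]
        per_level[added] = (mean(recoveries), stdev(recoveries))
        all_recoveries.extend(recoveries)
    return per_level, mean(all_recoveries), stdev(all_recoveries), len(all_recoveries)

per_level, overall_mean, overall_sd, n = summarize_accuracy(spiked)
for added, (m, sd) in per_level.items():
    print(f"level {added:6.1f}: mean recovery {m:6.2f}% (SD {sd:.2f})")
print(f"overall (n={n}): {overall_mean:.2f}% +/- {overall_sd:.2f} (1 SD)")
```

Reporting the spread as ± 1 SD follows the example given in the text; a protocol might instead call for a formal confidence interval.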
Precision
The precision of an analytical method is defined as the closeness of agreement among individual test results from repeated analyses of a homogeneous sample. Precision is commonly evaluated at three levels: repeatability, intermediate precision, and reproducibility.
Repeatability refers to the ability of the method to generate the same results over a short time interval under identical conditions (intra-assay precision). To document repeatability, the guidelines suggest analyzing a minimum of nine determinations covering the specified range of the procedure (that is, three levels or concentrations, three repetitions each) or a minimum of six determinations at 100% of the test or target concentration. Repeatability results are typically reported as % RSD.
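As a minimal sketch of how repeatability is reported, the following Python computes % RSD for six determinations at 100% of the target concentration. The peak areas and the function name are hypothetical.

```python
from statistics import mean, stdev

def percent_rsd(values):
    """Relative standard deviation (sample SD / mean), as a percentage."""
    if len(values) < 2:
        raise ValueError("need at least two determinations")
    return 100.0 * stdev(values) / mean(values)

# Hypothetical peak areas from six determinations at 100% of target
areas = [1502.1, 1498.7, 1505.3, 1499.9, 1501.4, 1503.0]
rsd = percent_rsd(areas)
print(f"n = {len(areas)}, %RSD = {rsd:.2f}")
```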
Intermediate precision refers to the agreement between the results from within-laboratory variations due to random events that might occur when using the method, such as different days, analysts, or equipment. To determine intermediate precision, an experimental design should be used so that the effects (if any) of the individual variables can be monitored. Intermediate precision results are typically generated by two analysts who prepare and analyze replicate sample preparations. Each analyst would prepare his or her own standards and solutions, and might use a different HPLC system for the analysis. The %-difference in the mean values between the two analysts' results is reported, and the results are subjected to statistical testing (for example, a Student's t-test) to examine whether there is a difference in the mean values obtained.
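The two-analyst comparison can be illustrated with a short Python sketch that uses only the standard library. It assumes an equal-variance (pooled) two-sample Student's t-test, which is one common choice and not necessarily the one a given protocol specifies. The assay results are hypothetical, and the critical value is the tabulated two-sided 5% value for 10 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def percent_difference(a, b):
    """Difference between two means relative to their average, in percent."""
    ma, mb = mean(a), mean(b)
    return 100.0 * abs(ma - mb) / ((ma + mb) / 2.0)

def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic and degrees of freedom (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, na + nb - 2

# Hypothetical assay results (% label claim), six preparations per analyst
analyst_1 = [99.8, 100.4, 99.5, 100.1, 100.6, 99.9]
analyst_2 = [100.3, 100.9, 99.7, 100.5, 100.2, 100.8]

t, df = pooled_t_statistic(analyst_1, analyst_2)
T_CRITICAL = 2.228  # t table: two-sided, alpha = 0.05, 10 degrees of freedom

print(f"%-difference in means: {percent_difference(analyst_1, analyst_2):.2f}")
print(f"t = {t:.3f} with {df} degrees of freedom")
if abs(t) > T_CRITICAL:
    print("means differ at the 5% level")
else:
    print("no significant difference detected at the 5% level")
```

The critical value must be looked up again if the number of preparations, and therefore the degrees of freedom, changes.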
Reproducibility refers to the results of collaborative studies among different laboratories. Documentation in support of reproducibility studies should include the standard deviation, the relative standard deviation (or coefficient of variation), and the confidence interval. To generate data to demonstrate reproducibility, a typical experimental design might include analysts from two laboratories (possibly different from the analysts involved in the intermediate precision) preparing and analyzing replicate sample preparations. Again, each analyst would prepare his or her own standards and solutions and use a different HPLC system for analysis. Results are reported as % RSD, and the %-difference in the mean values between the two analysts must be within specifications. Statistical calculations could also be carried out to determine if there is any difference in the mean values obtained.
Past USP guidelines included the term ruggedness, defined as the degree of reproducibility of test results obtained by the analysis of the same samples under a variety of conditions, such as different laboratories, analysts, instruments, reagent lots, elapsed assay times, assay temperature, or days. Ruggedness is a measure of the reproducibility of test results under the variation in conditions normally expected from laboratory to laboratory and from analyst to analyst. The term ruggedness, however, is falling out of favor; it is not used by the ICH, which instead addresses the concept in guideline Q2 (R1) under the discussion of intermediate precision (5).
Specificity
Specificity is the ability to measure accurately and specifically the analyte of interest in the presence of other components that may be expected to be present in the sample. Specificity takes into account the degree of interference from other active ingredients, excipients, impurities, and degradation products. Specificity in a method ensures that a peak's response is due to a single component (no peak coelutions). Specificity for a given analyte is commonly measured and documented by resolution, plate number (efficiency), and tailing factor.
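The three chromatographic figures of merit named above can be computed as in the following Python sketch. It uses the tangent (baseline-width) forms of the resolution and plate-number equations and the 5%-height tailing factor, as commonly given in the chromatography chapter of the USP (6). The peak values and function names are hypothetical, and the current chapter should be consulted for the definitions a given method must follow.

```python
def resolution(t1, t2, w1, w2):
    """Rs = 2 (t2 - t1) / (w1 + w2), using baseline (tangent) peak widths."""
    return 2.0 * (t2 - t1) / (w1 + w2)

def plate_number(t_r, w):
    """N = 16 (tR / W)^2, with W the baseline (tangent) peak width."""
    return 16.0 * (t_r / w) ** 2

def tailing_factor(w_005, f):
    """T = W0.05 / (2 f): width at 5% of peak height divided by twice the
    distance from the leading edge to the peak maximum at that height."""
    return w_005 / (2.0 * f)

# Hypothetical values (minutes) for an impurity eluted just before the API
t_imp, w_imp = 5.20, 0.25
t_api, w_api = 5.80, 0.27

print(f"Rs = {resolution(t_imp, t_api, w_imp, w_api):.2f}")
print(f"N  = {plate_number(t_api, w_api):.0f}")
print(f"T  = {tailing_factor(0.20, 0.09):.2f}")
```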
For identification purposes, specificity is demonstrated by the ability to discriminate between other compounds in the sample or by comparison to known reference materials. For assay and impurity tests, specificity can be shown by the resolution of the two most closely eluted compounds. These compounds usually are the major component or active ingredient and a closely eluted impurity. If impurities are available, it must be demonstrated that the assay is unaffected by the presence of spiked materials (impurities or excipients). If the impurities are not available, the test results are compared to a second well-characterized procedure. For assay, the two results are compared. For impurity tests, the impurity profiles are compared. Comparison of test results will vary with the particular method, but it may include visual comparison as well as retention times, peak areas (or heights), and peak shape.
Starting with the publication of USP 24, and as a direct result of the ICH process, it is now recommended that a peak-purity test based upon photodiode-array (PDA) detection or mass spectrometry (MS) be used to demonstrate specificity in chromatographic analyses by comparison to a known reference material. Modern PDA technology is a powerful tool to evaluate specificity. PDA detectors can collect spectra across a range of wavelengths at each data point collected across a peak, and through software processes, each spectrum can be compared to determine peak purity. Used in this manner, PDA detectors can distinguish minute spectral and chromatographic differences not readily observed by simple overlay comparisons.
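To make the idea of spectrum-by-spectrum comparison concrete, the sketch below computes the angle between two spectra treated as vectors: identical spectral shapes give an angle near zero regardless of concentration, and a coeluted interference increases the angle. This is an assumption-laden illustration and not the algorithm of any particular vendor's software, which typically also accounts for baseline and noise. The five-point spectra are hypothetical.

```python
from math import acos, degrees, sqrt

def contrast_angle(spec_a, spec_b):
    """Angle in degrees between two absorbance spectra sampled at the same wavelengths."""
    dot = sum(a * b for a, b in zip(spec_a, spec_b))
    norm = sqrt(sum(a * a for a in spec_a)) * sqrt(sum(b * b for b in spec_b))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return degrees(acos(cos_theta))

# Hypothetical absorbances at five wavelengths, taken at three points across a peak
apex = [0.10, 0.45, 0.80, 0.40, 0.12]
upslope = [0.05, 0.225, 0.40, 0.20, 0.06]  # same shape at half the concentration
tail = [0.09, 0.20, 0.30, 0.22, 0.11]      # shape distorted by a coeluted compound

print(f"apex vs upslope: {contrast_angle(apex, upslope):.3f} degrees")
print(f"apex vs tail:    {contrast_angle(apex, tail):.3f} degrees")
```

In practice the angle would be compared with a threshold that reflects the noise of the system, which is one reason that low absorbances and very similar spectra make coelutions hard to detect.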
However, PDA detectors can be limited on occasion in the evaluation of peak purity by a lack of UV response, as well as by the noise of the system and the relative concentrations of interfering substances. Also, the more similar the spectra are, and the lower the relative absorbances, the more difficult it is to distinguish coeluted compounds. MS detection overcomes many of the limitations of a PDA, and in many laboratories it has become the detection method of choice for method validation. MS can provide unequivocal peak purity information, exact mass, and structural and quantitative information. The combination of both PDA and MS on a single HPLC instrument can provide valuable orthogonal information to help ensure that interferences are not overlooked during method validation.
Limit of Detection and Quantitation
The limit of detection (LOD) is defined as the lowest concentration of an analyte in a sample that can be detected, but not necessarily quantitated. It is a limit test that specifies whether or not an analyte is above or below a certain value. The limit of quantitation (LOQ) is defined as the lowest concentration of an analyte in a sample that can be quantitated with acceptable precision and accuracy under the stated operational conditions of the method.
In a chromatography laboratory, the most common way of determining both the LOD and the LOQ is using signal-to-noise ratios (S/N), commonly 3:1 for LOD, and 10:1 for LOQ. Another method that is gaining popularity is a means of calculating the limits based upon the following formula:
LOD/LOQ = K(SD/S) where K is a constant (3 for LOD, 10 for LOQ), SD is the standard deviation of response, and S is the slope of the calibration curve.
It should be noted that determination of these limits is a two-step process. Regardless of the method used to determine the limit, an appropriate number of samples needs to be analyzed at the limit, once calculated, to fully validate the method performance at the limit.
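Both approaches can be sketched in Python. The example uses the constants stated above (K = 3 for LOD and K = 10 for LOQ); note that ICH Q2 (R1) lists 3.3 rather than 3 for the detection limit. Taking the standard deviation of blank responses as SD is an assumption made here for illustration, and all data values are hypothetical.

```python
from statistics import mean, stdev

def slope(x, y):
    """Least-squares slope of response y on concentration x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def limit(k, sd_response, s):
    """LOD or LOQ = K (SD / S), in the concentration units of the calibration curve."""
    return k * sd_response / s

def signal_to_noise(peak_height, noise):
    """Simple S/N ratio; how noise is measured is method specific."""
    return peak_height / noise

# Hypothetical calibration data (ng/mL vs. peak area) and blank responses
conc = [0.5, 1.0, 2.0, 4.0, 8.0]
area = [10.2, 19.8, 40.5, 79.6, 160.3]
blanks = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7]

s = slope(conc, area)
sd = stdev(blanks)          # assumption: SD of the response taken from blanks
lod = limit(3, sd, s)       # K = 3 as stated in the text (ICH lists 3.3)
loq = limit(10, sd, s)      # K = 10
print(f"slope = {s:.2f}, SD = {sd:.3f}")
print(f"LOD = {lod:.3f} ng/mL, LOQ = {loq:.3f} ng/mL")
print(f"S/N at a hypothetical low-level peak: {signal_to_noise(3.1, 1.0):.1f}")
```

Whichever calculation is used, the result is only an estimate until samples prepared at the calculated limit have been analyzed, which is the second step described above.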
Linearity and Range
Linearity is the ability of the method to provide test results that are directly proportional to analyte concentration within a given range. Range is the interval between the upper and lower concentrations of an analyte (inclusive) that have been demonstrated to be determined with acceptable precision, accuracy, and linearity using the method as written. The range is normally expressed in the same units as the test results obtained by the method (for example, ng/mL). Guidelines specify that a minimum of five concentration levels be used to determine the range and linearity, along with certain minimum specified ranges depending upon the type of method. Table II summarizes typical minimum ranges specified by the guidelines. Data to be reported generally include the equation for the calibration curve line, the coefficient of determination (r²), residuals, and the curve itself.
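The quantities to be reported for linearity can be obtained from an ordinary least-squares fit, sketched here in Python with five concentration levels. The levels (80 to 120% of target) and the peak areas are hypothetical, and an unweighted fit is assumed.

```python
from statistics import mean

def fit_line(x, y):
    """Unweighted least-squares line: returns slope, intercept, r^2, and residuals."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = my - m * mx
    residuals = [yi - (m * xi + b) for xi, yi in zip(x, y)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return m, b, 1.0 - ss_res / ss_tot, residuals

# Hypothetical five-level calibration (percent of target vs. peak area)
levels = [80.0, 90.0, 100.0, 110.0, 120.0]
areas = [1201.5, 1352.0, 1498.8, 1651.2, 1803.1]

m, b, r2, residuals = fit_line(levels, areas)
print(f"area = {m:.3f} * level + {b:.2f}")
print(f"r^2 = {r2:.5f}")
for level, res in zip(levels, residuals):
    print(f"  level {level:5.1f}: residual {res:+.2f}")
```

Inspecting the residuals alongside r² matters because a curve with systematic curvature can still produce a coefficient of determination close to 1.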
Table II: Example minimum recommended ranges
Robustness
The robustness of an analytical procedure is defined as a measure of its capacity to obtain comparable and acceptable results when perturbed by small but deliberate variations in procedural parameters listed in the documentation. Robustness provides an indication of the method's suitability and reliability during normal use. During a robustness study, method parameters are intentionally varied to see if the method results are affected. The key word in the definition is deliberate. Examples of HPLC variations are illustrated in Tables III and IV for isocratic and gradient methods, respectively. Variations should be chosen symmetrically around a nominal value, or about the value specified in the method, to form an interval that slightly exceeds the variations that can be expected when the method is implemented or transferred. For instrument settings, manufacturers' specifications are sometimes used to determine variability. The range evaluated during the robustness study should not be selected to be so wide that the robustness test will purposely fail, but rather to represent the type of variability routinely encountered in the laboratory. Challenging the method to the point of failure is not necessary. One practical advantage of robustness tests is that after robustness is demonstrated over a given range of a parameter, the value of that parameter can be adjusted within that range to meet system suitability without a requirement to revalidate the method.
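As a sketch of what "symmetrically around a nominal value" can look like in practice, the Python below generates a simple one-factor-at-a-time set of runs. The parameter names, nominal values, and step sizes are hypothetical and are not taken from Tables III or IV. A one-factor-at-a-time layout is only one possible design, and multivariate designs are also used.

```python
# Hypothetical nominal method conditions and symmetric variations
nominal = {"flow_rate_mL_min": 1.0, "column_temp_C": 30.0, "mobile_phase_pH": 3.0}
delta = {"flow_rate_mL_min": 0.1, "column_temp_C": 2.0, "mobile_phase_pH": 0.1}

def one_factor_at_a_time(nominal_conditions, variations):
    """Return the nominal run plus a low and a high run for each varied parameter."""
    runs = [dict(nominal_conditions)]
    for name, step in variations.items():
        for sign in (-1, 1):
            run = dict(nominal_conditions)
            run[name] = round(nominal_conditions[name] + sign * step, 6)
            runs.append(run)
    return runs

runs = one_factor_at_a_time(nominal, delta)
for i, run in enumerate(runs):
    print(i, run)
```

Each run would then be evaluated against the same responses, so that the effect of each deliberate change can be read against the nominal run.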
Table III: Example isocratic separation robustness variations
Robustness should be tested late in the development of a method; if it is not, it is typically one of the first parameters investigated during method validation. Throughout the method development process, however, attention should be paid to the identification of which chromatographic parameters are most sensitive to small changes so that when robustness tests are undertaken the appropriate variables can be tested. Robustness studies are also used to establish system suitability parameters to make sure the validity of the entire system (including both the instrument and the method) is maintained throughout method implementation and use. In addition, if the results of a method or other measurements are susceptible to variations in method parameters, these parameters should be adequately controlled and a precautionary statement included in the method documentation. Common HPLC parameters used to measure and document robustness include (for information about how to calculate, see reference 6)
Replicate injections will improve the estimates (for example, %RSD) of the effect of a parameter change. In many cases, multiple peaks are monitored, particularly when some combination of acidic, neutral, or basic compounds is present in the sample.
A common question that comes up during the development of analytical method validation protocols is how to distinguish robustness parameters from intermediate precision or reproducibility parameters. A simple rule of thumb: If it is internal or written into the method (for example, temperature and flow rate) it is a robustness parameter; if it is external to the method (for example, the analyst, instrument number, or day) it is an intermediate precision (formerly ruggedness) parameter. In other words, you would write a method to reflect flow rate, temperature, wavelength, buffer composition, and pH, but you would never write into a method: "Jim runs the method on Tuesdays on Column Lot # 42587 on System Six." Those are all intermediate precision parameters.
While not formally listed in the guidelines, it is also quite common to investigate sample and standard stability during validation. The stability of both the samples and the stock reference standard solution is evaluated at different time intervals (for example, at time 0, 3, and 7 days) following storage at both room temperature and refrigeration. This information is used to determine how often standards need to be prepared, how they (and the samples) should be stored, and how quickly the samples must be analyzed following preparation.
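A stability evaluation of this kind reduces to comparing each time point with the initial result, as in the Python sketch below. The responses, the storage conditions as keyed here, and the 2% tolerance are hypothetical; the acceptance limit for a real study would come from the validation protocol.

```python
def percent_of_initial(response, initial):
    """Response at a later time point as a percentage of the time-0 response."""
    return 100.0 * response / initial

def stable_through(series, tolerance_pct=2.0):
    """Last day up to which every result stays within the tolerance of time 0.

    `series` maps day -> response and must include day 0.
    """
    initial = series[0]
    last_ok = 0
    for day in sorted(series):
        if abs(percent_of_initial(series[day], initial) - 100.0) > tolerance_pct:
            break
        last_ok = day
    return last_ok

# Hypothetical standard-solution peak areas at 0, 3, and 7 days
stability = {
    "room temperature": {0: 1500.0, 3: 1492.0, 7: 1461.0},
    "refrigerated": {0: 1500.0, 3: 1498.0, 7: 1495.0},
}

for condition, series in stability.items():
    print(f"{condition}: stable through day {stable_through(series)}")
```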
Analytical Method Validation by Method Type
Several types of methods are used to measure the active pharmaceutical ingredient (API) and impurities, related substances, and excipients, and the USP recognizes that it is not always necessary to evaluate every analytical performance parameter for every method. The type of method and its intended use dictate which performance characteristics need to be investigated, as summarized in Table V. Both the USP and ICH divide analytical methods into four separate categories:
Table IV: Example gradient separation robustness variations
Category 1 Methods
Category 1 tests target the analysis of major components and include methods such as content-uniformity and potency-assay analyses. The latter methods, while quantitative, are not usually concerned with low concentrations of analyte, but only with the amount of the API in the drug product. Because of the simplicity of the separation (the API must be resolved from all interferences, but any other peaks in the chromatogram do not need to be resolved from each other), emphasis is on speed over resolution. For assays in Category 1, LOD and LOQ evaluations are not necessary because the major component or active ingredient to be measured is normally present at high concentrations. However, because quantitative information is desired, all of the remaining analytical performance parameters are pertinent.
Category 2 Methods
Category 2 tests target the analysis of impurities or degradation products (among other applications). These assays usually look at much lower analyte concentrations than Category 1 methods, and are divided into two subcategories: quantitative and limit tests. If quantitative information is desired, a determination of LOD is not necessary, but the remaining parameters are required. Methods used in support of stability studies (referred to as stability-indicating methods) are an example of a quantitative Category 2 test. The situation reverses itself for a limit test. Because quantitation is not required, it is sufficient to measure the LOD and demonstrate specificity and robustness. For a Category 2 limit test, it is only necessary to show that a compound of interest is either present or not, that is, above or below a certain concentration. Methods in support of cleaning validation and environmental EPA methods often fit into this category. Although, as seen in Table V, it is never necessary to measure both LOD and LOQ for any given Category 2 method, it is common during validation to evaluate both characteristics.
Category 3 Methods
The parameters that must be documented for methods in USP-assay Category 3 (specific tests or methods for performance characteristics) are dependent upon the nature of the test. Dissolution testing is an example of a Category 3 method. Because it is a quantitative test optimized for the determination of the API in a drug product, the validation parameters evaluated are similar to a Category 1 test for a formulation designed for immediate release. However, for an extended-release formulation, where it might be necessary to confirm that none of the active ingredient has been released from the formulation until after a certain time point, the parameters to be investigated would be more like a quantitative Category 2 test that includes LOQ. Because the analytical goals may differ, the Category 3 evaluation parameters are dependent upon the actual method, as indicated in Table V.
Table V: Analytical performance characteristics to measure vs. type of method (3)
Category 4 Methods
Category 4 identification tests are qualitative in nature, so only specificity is required. For example, identification can be performed by comparing the retention time or a spectrum to that of a known reference standard. Freedom from interferences is all that is necessary in terms of chromatographic separation.
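The category descriptions above can be captured as a simple lookup, sketched here in Python. The mapping is transcribed from the prose of this section and not from Table V itself, so USP Chapter 1225 (3) should be treated as the authority. The category keys and the function name are hypothetical, and Category 3 is left undefined because its requirements depend on the specific test.

```python
ALL_CHARACTERISTICS = frozenset({
    "accuracy", "precision", "specificity", "lod", "loq",
    "linearity", "range", "robustness",
})

# Transcribed from the text of this section (not from Table V)
REQUIRED = {
    "1": ALL_CHARACTERISTICS - {"lod", "loq"},
    "2-quantitative": ALL_CHARACTERISTICS - {"lod"},
    "2-limit": frozenset({"lod", "specificity", "robustness"}),
    "3": None,  # depends upon the nature of the test
    "4": frozenset({"specificity"}),
}

def required_characteristics(category):
    """Return the characteristics to evaluate, or None when it depends on the test."""
    if category not in REQUIRED:
        raise KeyError(f"unknown category: {category!r}")
    return REQUIRED[category]

for category in REQUIRED:
    needed = required_characteristics(category)
    listing = "depends on the test" if needed is None else ", ".join(sorted(needed))
    print(f"Category {category}: {listing}")
```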
In today's global market, validation can be a long and costly process, involving regulatory, governmental, and sanctioning bodies from around the world. A well-defined and well-documented validation process provides regulatory agencies with evidence that the system (instrument, software, method, and controls) is suitable for its intended use. Method validation constantly evolves and is just one part of this overall process. The bottom line is that all parties involved should be confident that an HPLC method will give results that are sufficiently accurate, precise, and reproducible for the analysis task at hand, and method validation is just one of the tools to use to accomplish this task.
"Validation Viewpoint" Co-Editor Michael E. Swartz is with Ariad Pharmaceuticals, Cambridge, Massachusetts, and is a member of LCGC's editorial advisory board.
"Validation Viewpoint" Co-Editor Ira S. Krull is an Associate Professor of chemistry at Northeastern University, Boston, Massachusetts, and is a member of LCGC's editorial advisory board.
Direct correspondence about this column to email@example.com.
(1) M.E. Swartz and I.S. Krull, LCGC North America 27(11), 989–995 (2009). http://chromatographyonline.findanalytichem.com/ValidationViewpoint
(2) United States Food and Drug Administration, Guideline for submitting samples and analytical data for methods validation, February 1997. US Government Printing Office: 1990-281-794:20818, or at www.fda.gov/cder/analyticalmeth.htm
(3) USP 32-NF 27, August 2009, Chapter 1225.
(4) Fed. Reg. 65(169), 52,776–52,777, 30 August 2000. See also: www.fda.gov/cder/guidance
(5) International Conference on Harmonization, Harmonized Tripartite Guideline, Validation of Analytical Procedures, Text and Methodology, Q2 (R1), November 2005, See www.ICH.org.
(6) USP 32-NF 27, August 2009, Chapter 621.