Small-Molecule Drug Discovery: Processes, Perspectives, Candidate Selection, and Career Opportunities for Analytical Chemists

LCGC North America, August 2022, Volume 40, Issue 8
Pages: 344–350

Column: Perspectives in Modern HPLC

This article provides an overview of the small-molecule drug discovery (SMDD) process for analytical scientists. The focus is on the modern approaches of identifying molecular targets, followed by high-throughput screening and synthesizing molecules with optimized properties for disease mitigation. The fundamental concepts and studies required in drug candidate selection, the business landscape, the technology trends, and the career opportunities for analytical chemists are discussed.

This installment is the second of six white papers on the pharmaceutical industry written for analytical chemists and others who want a concise overview of the industry and its practices. This article describes the drug discovery process for small-molecule drugs, from molecular target identification and validation, through screening compounds for hits, to synthesizing lead compounds with optimized structural motifs to maximize efficacy, bioavailability, and safety. The medicinal chemist uses structure-activity relationship (SAR) and therapeutic index concepts to guide the discovery process and nominate drug candidates for clinical trials. The business and technology perspectives and the role of the analytical chemist in drug discovery are described.

What is Drug Discovery?

Today’s approach to discovering new drugs generally starts with an understanding of the pathophysiology of a disease and potential molecular targets, followed by high-throughput screening (HTS) and the synthesis of a variety of molecules that bind specifically to the target (hits and leads), which medicinal chemists subsequently optimize to development candidates. The complex and multidisciplinary processes in drug discovery are described in detail in textbooks (1–4) and short courses (5), using fundamental concepts in molecular biology, medicinal chemistry, pharmacology, pharmacokinetics, pharmacodynamics, and toxicology (6–10). This paper provides a brief overview of the processes, criteria in candidate selection, business landscape, and technology trends of small-molecule drug discovery (SMDD).

In the rational approach, modern drug discovery begins with a therapeutic concept or assumption that a molecular target is associated with a disease state that a small organic molecule can mitigate (1–3). The process involves screening a library of compounds for those that bind specifically to the target. Typically, the lead compounds comprise two or more structural motifs that are then further optimized through a series of customized syntheses to increase the binding affinity, selectivity (to reduce the potential for side effects), efficacy or potency (to mitigate the disease), metabolic stability (to increase the half-life in systemic circulation), and bioavailability (through good solubility and permeability). Once a compound that fulfills these requirements has been identified as a candidate, the project focus shifts to drug development to enable human clinical trials. Successful clinical trials lead to regulatory registration as a commercial drug product.

Glossary and Abbreviations

Table I lists the common terminologies and abbreviations associated with SMDD.

A Brief Review of Drug Discovery and Development Processes

Figure 1 depicts the major steps and milestones of the drug discovery and development process. The reader is referred to many excellent books on this complex, expensive, and multidisciplinary process (1–4). The key stages are basic research, drug discovery, nonclinical development, and clinical development, which, if successful, result in regulatory approval of a new drug product. In the United States, the regulatory filings are the investigational new drug (IND) application for starting clinical trials in humans and the new drug application (NDA) for gaining approval to commercialize the drug product. In the European Union, the corresponding filings are the clinical trial application (CTA) and the marketing authorization application (MAA). Primary regulations that must be followed in drug development are Good Laboratory Practice (GLP), Good Manufacturing Practice (GMP), and Good Clinical Practice (GCP) regulations. International Council for Harmonisation (ICH, formerly the International Conference on Harmonisation) guidelines are not regulations, but they provide detailed technical requirements for late-stage development and production, which sponsors universally follow. Figure 1 provides perspectives on timelines, costs, regulations, and standard practices for each step.

Basic Research Using the Molecular Approach

In the preclinical research stage, a therapeutic concept is formed based on understanding the pathophysiology and molecular biology of a disease. The research is often conducted by molecular biologists, physicians, toxicologists, pharmacologists, and biochemists working in academia or research institutions (1,2). Because most biological functions in the body result from the activities of proteins, a state of health can often be viewed as the “normal” functioning of proteins. In the most simplistic view, a disease state can be viewed as the “abnormal” functioning of proteins, which an appropriate drug may restore to a “normal” state.

A critical finding of preclinical research is the identification of one or more molecular targets, each typically a receptor, enzyme, signaling protein, or ion channel of the host (patient) or pathogen (for example, a virus or bacterium). The disease state may be mitigated when a drug molecule binds to this molecular target. Occasionally, a gene or messenger ribonucleic acid (mRNA) can be targeted, although delivering a drug to such a target is more challenging than targeting a protein. This type of research is conducted by protein chemists, pharmacologists, and computational chemists. In the exploratory phase, data are generated to prove that modulation of the target could ameliorate the disease, using in vitro assays and validation studies in experimental animal models.

Drug Discovery (From Therapeutic Concept to Leads) and Medicinal Chemistry

The next step in drug discovery starts with the high-throughput screening (HTS) of many library compounds to identify those molecules that bind to the target. Fragment screening and results from in vitro assays allow for understanding the structural motifs that will increase the binding affinity and specificity to the target. Once multiple hits are found, they are used to generate large virtual libraries of compounds that are then further screened for the best drug candidates. To speed up this process, laboratories may employ:

In silico modeling, which allows predicting compounds’ properties from their structure without actual synthesis.

Statistical methods. For example, grouping (clustering) compounds by similarity and selecting just a few representatives from each group. The whole cluster of compounds may be screened out quickly if the representative was ineffective.

High-throughput experimentation (HTE), which is a set of practices that allows running many (sometimes thousands of) experiments simultaneously. In addition to speed, HTE helps save precious material through miniaturization. It often implies automation using robotics such as liquid handlers.
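The clustering idea in the list above can be sketched in a few lines. This is a minimal illustration, not a description of any particular laboratory's workflow: the compound names, the fingerprint bit sets, the Tanimoto similarity measure, the 0.5 threshold, and the simple "leader" clustering scheme are all assumptions chosen for clarity.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def leader_cluster(
    fingerprints: Dict[str, FrozenSet[int]], threshold: float = 0.5
) -> Dict[str, List[str]]:
    """Put each compound in the first cluster whose representative it resembles.

    The first compound placed in a cluster acts as its representative.
    """
    clusters: Dict[str, List[str]] = {}
    for name, fp in fingerprints.items():
        for rep in clusters:
            if tanimoto(fp, fingerprints[rep]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters


# Hypothetical library: each compound is reduced to a set of "on" fingerprint bits.
library = {
    "cmpd_A": frozenset({1, 2, 3, 4}),
    "cmpd_B": frozenset({1, 2, 3, 5}),
    "cmpd_C": frozenset({10, 11, 12}),
    "cmpd_D": frozenset({10, 11, 13}),
    "cmpd_E": frozenset({20, 21}),
}

clusters = leader_cluster(library)
representatives = list(clusters)  # only these would be screened first
print(representatives)
```

With this toy library, only three of the five compounds would need to be screened initially; if a representative proves ineffective, the rest of its cluster can be deprioritized.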

The medicinal chemist with an organic chemistry background serves as the project lead by synthesizing compounds with optimized physicochemical properties such as binding affinity, bioavailability, and pharmacokinetics and drug metabolism (PKDM), using both in vitro and in vivo studies (7). Common principles in this iterative optimization process are the SAR, the therapeutic index (TI), and the Lipinski “Rule of Five” for drug-like properties, which are described later (7,11).

A major pharmaceutical company that has a SMDD organization often has many projects focusing on one or more disease indications (for example, oncology, immunology, central nervous system, cardiovascular, pain, and others). To increase the productivity of the organization and the quality and purity of the leads, a centralized compound management system and analytical support group are employed to purify and characterize the leads before they are archived in a central depository. Case studies on the workflow and separation techniques used in the high-throughput purification and characterization are published in the literature (12,13).

The main focus of a drug discovery program is to nominate drug candidates for clinical trials. When a lead candidate is ready for development, a GLP toxicology study is initiated to evaluate whether the candidate is safe enough to be administered to humans. This study can cost over a million dollars and is required for filing an IND.

Preclinical Drug Development and CMC (From Drug Candidates to IND)

Before conducting a GLP toxicology study, a multidisciplinary technical development project team is formed, consisting of a process chemist, analytical chemist, pharmaceutics scientist, regulatory specialist, quality control specialist, project manager, and team leader. This chemistry, manufacturing, and controls (CMC) team is charged with the detailed characterization of the drug candidate, driving process development for the production of the clinical trial materials (CTM) (both drug substance [DS] and drug product [DP]), and filing the regulatory documents (IND) required for conducting clinical trials. These processes and practices in nonclinical drug development will be discussed further in the next installment of this series.

Clinical Development (From IND to NDA)

Clinical trials in humans may commence 30 days after the filing of the IND (unless the study is placed on clinical hold by the regulatory agency). The purpose of Phase 1 trials is to assess the safety and pharmacokinetics (PK) of the new chemical entity (NCE) using dose escalation studies.

In Phase 2 trials, the goals are dose finding and evaluating the efficacy of the NCE using blinded studies (to establish proof of concept [POC] of the NCE). Phase 2 studies are conducted in patients with the disease.

Phase 3 clinical trials aim to establish the commercial POC of the dosing regimen using a larger patient population with near-commercial formulations and scale. Upon the successful completion of the clinical trials, clinical data are analyzed by biostatisticians and clinicians. The data are compiled into the new drug application (NDA), an extensive document submitted to the regulatory authorities for drug approval.

Fundamental Concepts for Candidate Selection in Drug Discovery

In this section, we discuss the fundamental studies and concepts used in selecting drug candidates in drug discovery.

Pharmacology and Dose-Response Curves

Pharmacology is the study of drug action, or the interactions between a living organism and exogenous chemicals that alter normal biochemical functions (6). Paul Ehrlich stated an important pharmacology concept in 1913: “Corpora non agunt nisi fixata,” which translates to “A substance will not work unless it is bound” (1).

Most drugs bind to the target reversibly and either inhibit or activate its functions (antagonist or agonist). The binding kinetics can be described using the equation below.
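For reversible 1:1 binding of a drug D to a target R at equilibrium, one standard textbook form is the Hill-Langmuir relationship. It is given here as an assumed illustration; the symbols $k_{\mathrm{on}}$, $k_{\mathrm{off}}$, and $K_d$ (the association rate constant, dissociation rate constant, and equilibrium dissociation constant) are not defined elsewhere in this article.

```latex
\mathrm{D} + \mathrm{R}
\;\underset{k_{\mathrm{off}}}{\overset{k_{\mathrm{on}}}{\rightleftharpoons}}\;
\mathrm{DR},
\qquad
K_d = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}} = \frac{[\mathrm{D}][\mathrm{R}]}{[\mathrm{DR}]},
\qquad
\text{fractional occupancy}
= \frac{[\mathrm{DR}]}{[\mathrm{R}] + [\mathrm{DR}]}
= \frac{[\mathrm{D}]}{[\mathrm{D}] + K_d}
```

Under these assumptions, half of the target is occupied when the free drug concentration equals $K_d$.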

Figure 2 describes the relationship between the target response (fractional occupancy) and the concentration of the drug. EC50 is the effective drug concentration that produces a response halfway between the baseline and the maximum, and it is a measure of the potency of the drug. The figure inset shows that drug D1 is more potent than D2 because its EC50 is lower, whereas D3 is less efficacious than D2 because its maximum response is approximately half that of D2, although the EC50 of both drugs is the same.
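The distinction between potency (EC50) and efficacy (maximum response) can be made concrete with a small numerical sketch. The Hill-type function below is a common textbook model rather than the exact curve plotted in Figure 2, and the EC50 and maximum-response values assigned to D1, D2, and D3 are hypothetical numbers chosen only to mirror the qualitative relationships described above.

```python
def response(conc: float, ec50: float, e_max: float = 1.0,
             baseline: float = 0.0, hill: float = 1.0) -> float:
    """Hill-type dose-response: response at drug concentration `conc`."""
    if conc <= 0:
        return baseline
    return baseline + (e_max - baseline) * conc**hill / (ec50**hill + conc**hill)


# Hypothetical parameters (EC50 in arbitrary concentration units).
drugs = {
    "D1": {"ec50": 1.0, "e_max": 1.0},   # most potent: lowest EC50
    "D2": {"ec50": 10.0, "e_max": 1.0},  # same maximum response, higher EC50
    "D3": {"ec50": 10.0, "e_max": 0.5},  # same EC50 as D2, half the maximum response
}

for name, params in drugs.items():
    at_ec50 = response(params["ec50"], **params)
    at_high = response(1000.0, **params)
    print(f"{name}: response at EC50 = {at_ec50:.2f}, at high dose = {at_high:.2f}")
```

At its own EC50, each drug gives half of its own maximum response, so D3 reaches only 0.25 on the common scale even though its EC50 matches that of D2.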

Early Animal Studies

Although early screening in drug discovery using HTS is based mainly on in vitro cell- and tissue-based assays (mostly using radioactivity, fluorescence, or enzyme-linked immunosorbent assay [ELISA]), in vivo animal studies in rodents (for example, mice or rats) are required for target validation and for assessing efficacy, safety, tolerability, and bioavailability (8–10). The data that are collected include mortality, weight loss, observations of disease manifestation, and efficacy and exposure measures (for example, tumor volume, viral RNA, anti-inflammatory response, biomarkers, and bioavailability) (14). Animal studies may be single dose, repeated dose, or dose escalating, and short or long term. They assess safety through measures such as the lethal dose for 50% of the animals (LD50), the toxic dose for 50% (TD50), and the no-observed-adverse-effect level (NOAEL); they provide pharmacokinetics (PK) data; and they aid candidate selection using concepts such as the therapeutic index (TI). Data are also used to define the route of administration, dose, and dosing frequency for clinical studies.
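As a rough illustration of how the therapeutic index can rank candidates, the sketch below uses one common definition, TI = TD50/ED50, where ED50 (the dose effective in 50% of animals) is a quantity assumed here and not defined in this article. The candidate names and dose values are invented for illustration only.

```python
def therapeutic_index(td50: float, ed50: float) -> float:
    """Ratio of toxic dose (TD50) to effective dose (ED50); larger is a wider margin."""
    if td50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return td50 / ed50


# Hypothetical doses in mg/kg, not measured data.
candidates = {
    "candidate_X": {"td50": 300.0, "ed50": 10.0},
    "candidate_Y": {"td50": 50.0, "ed50": 5.0},
}

ti_by_candidate = {
    name: therapeutic_index(**doses) for name, doses in candidates.items()
}
ranked = sorted(ti_by_candidate, key=ti_by_candidate.get, reverse=True)

for name in ranked:
    print(f"{name}: TI = {ti_by_candidate[name]:.1f}")
```

In this toy case, candidate_Y is effective at a lower dose, but candidate_X has the wider margin between effective and toxic doses, which is why potency alone does not decide candidate selection.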

GLP toxicology studies in rodent and non-rodent test animals are pivotal safety studies with regulatory scrutiny to collect essential data such as organ toxicity. Before filing the IND, the GLP toxicology studies are completed to justify that the NCE is sufficiently safe for human clinical trials.

Pharmacokinetics (PK) and Metabolism

For a drug to be active, it must be able to reach the site of action. One underlying assumption is that the pharmacodynamic (PD) response is related (proportional) to the concentration of drug at the target. Because this concentration is difficult to measure, the drug concentration in the systemic circulation is measured in PK studies instead, as it may correlate with the concentration at the target (8,9). Because the preferred route of administration for small-molecule drugs is peroral for convenience and marketability (for example, tablets or capsules), the solubility and permeability of the active pharmaceutical ingredient (API) are important factors in oral bioavailability; permeability is commonly assessed in vitro with Caco-2 cells as a model of the intestinal epithelium. Metabolism is another critical consideration because rapid metabolism of the API in the body will reduce the duration and maximum concentration of the drug in circulation (6–9).
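The effect of metabolic rate on drug exposure can be sketched with the simplest PK model: a one-compartment system with first-order elimination after an intravenous bolus. This is a deliberate simplification (it ignores oral absorption and distribution), and the starting concentration and elimination rate constants are hypothetical values, not data from any study.

```python
import math


def half_life(k_el: float) -> float:
    """Elimination half-life (h) for a first-order rate constant k_el (1/h)."""
    return math.log(2) / k_el


def concentration(c0: float, k_el: float, t_h: float) -> float:
    """Plasma concentration at time t_h (h) for initial concentration c0."""
    return c0 * math.exp(-k_el * t_h)


# Hypothetical elimination rate constants (1/h) for two compounds.
profiles = {"slowly metabolized": 0.1, "rapidly metabolized": 1.0}

for label, k_el in profiles.items():
    t_half = half_life(k_el)
    c_4h = concentration(100.0, k_el, 4.0)  # c0 = 100 ng/mL, assumed
    print(f"{label}: half-life = {t_half:.2f} h, C(4 h) = {c_4h:.1f} ng/mL")
```

A tenfold faster elimination rate shortens the half-life tenfold, so the rapidly metabolized compound is nearly cleared at a time when the slowly metabolized one still retains most of its initial concentration.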

What is a Lead? Lipinski’s “Rule of Five” of Drug-Like Properties

The small-molecule drug product should preferably be a peroral formulation (a tablet or capsule) with a shelf life of two years for marketability and convenience to the patient. Lipinski’s “Rule of Five,” proposed by Christopher A. Lipinski of Pfizer in 2000 (15), describes the drug-like properties of the orally available candidate with promising physicochemical properties for aqueous solubility, absorption, and bioavailability.

The “Rule of Five” is as follows: a molecular weight of no more than 500, a partition coefficient (LogP) of no more than 5, no more than 5 H-bond donors, and no more than 10 H-bond acceptors (11,15,16). These rules of thumb guide candidate selection in drug discovery. Compounds of high hydrophobicity generally have a higher binding affinity to the target and therefore tend to be more potent than less hydrophobic candidates, but they are also less soluble. In contrast, more hydrophilic molecules are readily excreted by the body, resulting in less uptake. According to Lipinski, the rule was developed in response to Pfizer’s observation that reliance on HTS binding assays biased hit selection toward compounds with higher hydrophobicity, which later turned out to be difficult to develop (16). Therefore, aqueous solubility and permeability data must be provided to medicinal chemists as early as possible to avoid oral absorption problems. Lipinski also pointed out that substrates for transporters and natural products are exceptions to the “Rule of Five.”
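A “Rule of Five” check is simple enough to express as a short filter. The sketch below is illustrative only: the `Compound` class, its field names, and the two example compounds are hypothetical, and the cutoffs are treated as inclusive limits (500, 5, 5, and 10), which is an assumption about boundary handling. In practice the property values would come from measurement or from a cheminformatics toolkit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Compound:
    name: str
    mol_weight: float       # molecular weight, Da
    log_p: float            # octanol-water partition coefficient (LogP)
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(c: Compound) -> List[str]:
    """Return the list of Rule-of-Five criteria that the compound fails."""
    checks = {
        "molecular weight <= 500": c.mol_weight <= 500,
        "LogP <= 5": c.log_p <= 5,
        "H-bond donors <= 5": c.h_bond_donors <= 5,
        "H-bond acceptors <= 10": c.h_bond_acceptors <= 10,
    }
    return [rule for rule, passed in checks.items() if not passed]


# Hypothetical leads with invented property values.
lead_1 = Compound("lead_1", mol_weight=350.4, log_p=2.1,
                  h_bond_donors=2, h_bond_acceptors=5)
lead_2 = Compound("lead_2", mol_weight=612.7, log_p=5.8,
                  h_bond_donors=3, h_bond_acceptors=11)

for lead in (lead_1, lead_2):
    failed = rule_of_five_violations(lead)
    print(f"{lead.name}: {len(failed)} violation(s) {failed}")
```

Because the rule is a guideline rather than a hard filter, returning the list of failed criteria (instead of a single pass/fail flag) lets the medicinal chemist see which property to optimize next.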

Guiding Principles in Candidate Selection

The selection of drug candidates from thousands of hits and hundreds of leads by the medicinal chemist is both an art and a science and is often compared with finding “needles in a haystack.” The two primary guiding principles in candidate selection are SAR and TI. Figure 3 depicts the myriad factors in the selection process, in which the desirable attributes that affect efficacy are balanced against those that affect safety and toxicity. Other significant attributes are bioavailability, stability, intellectual property (IP) rights, and the marketing landscape (17). Although the risk of failure is great in new drug development, the payback is substantial, both in revenues for the company and in benefits to patients with serious ailments and to public health.

Business Landscape in Drug Discovery

Sales of innovative new drugs for serious diseases are the lifeblood of the pharmaceutical industry and represent the primary revenue source for established companies, because most revenue is lost after patent expiry to competition from generic products (17). Nevertheless, many pharmaceutical companies do not conduct basic research or drug discovery because these activities are expensive and the return on investment is uncertain. Most pharmaceutical companies acquire development candidates through licensing, collaborative partnerships, and mergers or acquisitions. Cases in point are the acquisition of Pharmasset by Gilead Sciences for $11 billion in 2011 for its Phase 2 drug candidates for chronic hepatitis C virus infection (18) and the 2022 acquisition of Biohaven by Pfizer for $11.6 billion for Biohaven’s migraine therapies pipeline (17).

Of the companies that engage in drug discovery, many choose to offshore or outsource their internal programs to lower costs. In recent years, there has been a proliferation of startups whose business model is to take promising molecules into early-phase trials with the intention of being acquired by larger companies. These startups are mostly virtual (with minimal staff and limited laboratory facilities, relying on contract development and manufacturing organizations [CDMOs]) and are funded by venture capitalists or private equity firms.

It is essential to mention that SMDD is quite different from the discovery of biologics (for example, monoclonal antibody [mAb] biotherapeutics) (1,2), which have distinctively different modes of action, targeted binding, and longer pharmacokinetic half-lives (1,2,5). Biologics require more expensive production facilities using recombinant DNA technology, fermentation, and bioprocessing. Biological therapeutics are also not orally bioavailable and therefore must be administered parenterally.

Technology Perspectives: Automating Drug Discovery Using Artificial Intelligence

SMDD can be viewed as a multidimensional problem in which various characteristics of compounds—efficacy, pharmacokinetics, and safety—need to be optimized in parallel to provide drug candidates. Past innovations and concepts, such as combinatorial compound synthesis, HTS, microtiter plate automation, fluorescence-based binding assays, 3D imaging, computer-aided drug design (CADD), and acoustic micro dispensing, have all contributed to the improvement in the speed, cost, and quality of the decision-making process in the iterative molecular design cycle by the medicinal chemist (19). Recent advances in microfluidics-assisted chemical synthesis and biological testing systems incorporating artificial intelligence systems to provide feedback analysis may provide the basis for the next level of adaptive automated de novo drug design systems (19).

Pharmaceutical Career Opportunities for Analytical Chemists in Drug Discovery

The most significant employment segment for analytical chemists (particularly separation scientists) in the pharmaceutical industry is new drug development and quality control (20). The drug discovery segment offers significantly fewer job opportunities for reasons mentioned earlier in the business landscape section. For analytical chemists, the skill requirements in discovery are different from those in development because there are no GMP or regulatory compliance requirements. Those working in centralized analytical groups deal with thousands of compounds rather than a few drug candidates in development.

The technical skills sought after in drug discovery are HTS, automation, high-throughput purification and characterization, bioanalysis for pharmacokinetic assays and biodistribution analysis, high performance liquid chromatography (HPLC), gas chromatography (GC), mass spectrometry (MS), supercritical fluid chromatography (SFC), preparative LC, and chiral separation and purification. Details of the workflows and separation techniques used in a central drug discovery support group are shown in Figure 4 and Tables II and III. Readers are referred to the original publications for details (12,13).


Summary

This article provides an overview of the drug discovery process of small-molecule drugs for the analytical chemist. It includes a description of the modern rational approach of finding a compound binding to the molecular target and optimizing its structural motifs for efficacy and safety. Concepts for candidate selection, business landscape, technology trends, and career opportunities in this pharmaceutical segment for analytical chemists are also discussed.


Acknowledgments

The author thanks the following reviewers who provided timely comments on the technical content and clarity of the manuscript: He Meng from Sanofi, Tao Jiang from Mallinckrodt, Alice Krumenaker from TW Metals LLC, Mike Shifflet from J&J Consumer Health Care, Cicely Zhu from Nitto Americas, Yen-Yu Yang from University of California at Riverside, Mengling Wong from Genentech, Stanislav Bashkyrtsev from Elsci, and Ranjitkumar Patil from Sun Pharma.


Disclaimer

This paper provides a brief overview of the drug discovery process, concepts in drug candidate selection, and the business and technology landscape. The information presented here stems from books, journal articles, and internet resources collected for my short course in drug development (5) and often reflects personal experience and opinions. It is challenging to present a concise and general overview of this complex and diversified process from the perspective of an analytical chemist, as some levels of simplification or interpretations are unavoidable.


References

(1) R.G. Hill and H.P. Rang, Eds., Drug Discovery and Development: Technology in Transition (Churchill Livingstone, Edinburgh, UK, 2nd ed., 2012), chapters 4–11.

(2) B.E. Blass, Basic Principles of Drug Discovery and Development (Academic Press, Cambridge, MA, 1st ed., 2015).

(3) E.D. Zanders, The Science and Business of Drug Discovery: Demystifying the Jargon (Springer, Berlin, Germany, 2nd ed., 2020).

(4) U. Nielsch, U. Fuhrmann, and S. Jaroch, Eds., New Approaches to Drug Discovery (Springer, Berlin, Germany, 1st ed., 2016).

(5) M.W. Dong, “Drug Discovery and Development Processes,” Pittcon 2019 short course presented at Pittcon, Philadelphia, PA, 2019.

(6) K. Whalen, Lippincott Illustrated Reviews: Pharmacology (Wolters Kluwer, Philadelphia, PA, 7th ed., 2018).

(7) G. Patrick, An Introduction to Medicinal Chemistry (Oxford University Press, Oxford, UK, 6th ed., 2017).

(8) P. Beringer, Winter’s Basic Clinical Pharmacokinetics (Lippincott Williams & Wilkins, Philadelphia, PA, 6th ed., 2017).

(9) M. Rowland and T.N. Tozer, Clinical Pharmacokinetics and Pharmacodynamics: Concepts and Applications (Lippincott Williams & Wilkins, Philadelphia, PA, 4th ed., 2010).

(10) C. Klaassen and J. Watkins, Casarett & Doull’s Essentials of Toxicology (McGraw-Hill, New York, NY, 4th ed., 2021).

(11) L. Di and E. Kerns, Drug-Like Properties: Concepts, Structure Design and Methods from ADME to Toxicity Optimization (Academic Press, New York, NY, 2nd ed., 2016).

(12) M. Wong, B. Murphy, J.H. Pease, and M.W. Dong, LCGC North Am. 33(6), 402–413 (2015).

(13) B. Lin, J.H. Pease, and M.W. Dong, LCGC North Am. 33(8), 534–545 (2015).

(14) R. Rahbari, J. Van Niewaal, and M.R. Bleavins, Eds., Biomarkers in Drug Discovery and Development (Wiley, Hoboken, NJ, 2nd ed., 2020).

(15) C.A. Lipinski, J. Pharmacol. Toxicol. Methods 44, 235 (2000).

(16) C.A. Lipinski, “Medicinal Chemistry: Tips and Tricks and Compound Properties in Drug Discovery,” a two-hour course presented at Genentech, South San Francisco, CA, 2012.

(17) M.W. Dong, LCGC North Am. 40(6), 252–257 (2022).

(18) R. Mullin, C & EN 100(17), 10 (2022).

(19) G. Schneider, Nat. Rev. Drug Discov. 17, 97–113 (2018).

(20) M.W. Dong, “Careers in Pharmaceutical Industry for Analytical Chemists: A Personal Journey,” paper presented at ACS National Meeting, Virtual, 2021.

About the Author

Michael W. Dong is a principal of MWD Consulting, which provides training and consulting services in HPLC and UHPLC, method improvement, pharmaceutical analysis, and drug quality. He was formerly a Senior Scientist at Genentech, a Research Fellow at Purdue Pharma, and a Senior Staff Scientist at Applied Biosystems/PerkinElmer. He holds a PhD in Analytical Chemistry from City University of New York. He has more than 130 publications and a best-selling book in chromatography. He is an editorial advisory board member of LCGC North America and the Chinese American Chromatography Association. Direct correspondence to: