
LCGC Blog: Reclaiming the Foundations of Separation Science
Key Takeaways
- Overreliance on black-box peak picking, alignment, and smoothing can normalize suboptimal chromatography and erode intuition about injection overload, phase interactions, column health, and system cleanliness.
- Peak asymmetry and baseline artifacts should be treated as mechanistic signals tied to stationary-phase chemistry and instrument condition, not merely parameters to tune in post-processing.
In this month's LCGC Blog, Caitlin Cain from the University of Virginia examines whether technological advances in chromatography—particularly the adoption of big data, artificial intelligence (AI), and machine learning (ML)—are reducing the emphasis on the fundamental chromatographic theory that is key to understanding the technique.
Flip through any recent issue of Analytical Chemistry or Journal of Chromatography A, and the dominant narrative becomes clear: we are living in the era of big data, machine learning, and artificial intelligence. Complex software packages promise to align hundreds of messy samples and routinely parse gigabytes of data in the blink of an eye, while neural networks claim to optimize multi-step gradients with the click of a button. Computational tools have become so dominant that I sometimes wonder whether an outside observer might assume the field of analytical chemistry has evolved into a branch of data science. Are we in danger of trading student training in fundamental chromatographic theory for black-box automation?
Now, I know this question might sound crazy coming from someone who has spent much of their career so far developing new computational methods to untangle complex comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GCxGC–TOF-MS) and liquid chromatography tandem mass spectrometry (LC–MS/MS) data sets. I know exactly how transformative a chemometric pipeline can be for handling high-dimensional data and providing deeper chemical insights. My worry, however, is that these tools are becoming black-box shortcuts rather than tools for executing chromatographic theory. When these computational pipelines become a safety net for poor chromatography, we stop training true separation scientists and instead teach software dependency.
Let’s take a look at an example: when a student or routine operator encounters severe peak tailing or a baseline anomaly, the solution is often computational. Parameters in a peak-picking algorithm are adjusted, threshold tolerances are shifted, or machine learning smoothing filters are applied. However, these algorithms do not help diagnose the actual problem with the separation at hand.
If a peak is severely asymmetric, the first thing to check is whether too much sample was injected; better chromatography could often be realized with a lower injection volume. In some cases, peak asymmetry could also tell us something interesting about the interactions between the analyte and the stationary phase. For instance, during my undergraduate research on mixed-mode liquid chromatography (LC) columns featuring a gradient of chemical functionalities, we discovered that peak asymmetry resulted from neighboring ligand effects, where two distinct functionalities in close proximity on the silica support interact simultaneously with the analyte.1 If baseline noise is larger than expected, we should look at column age and system cleanliness rather than shifting software parameters. When software is treated as a safety net rather than a diagnostic partner, we lose our own technical intuition. A generation of researchers is at risk of learning how to flawlessly navigate a software interface without ever understanding the chemistry and physics at work inside the column housing.
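To make the diagnostic concrete, peak shape can be quantified before any post-processing parameter is touched. The following is a minimal sketch, not a method from this blog: it computes the asymmetry factor (b/a at 10% of peak height) and the USP-style tailing factor (W0.05/2f at 5% of peak height) for a single peak. The function names and the synthetic two-sided Gaussian test peak are assumptions introduced for illustration, and the interpolation assumes a smooth, baseline-corrected trace with one peak.

```python
import numpy as np


def edge_times(time, signal, height_fraction):
    """Interpolate where the peak crosses a given fraction of its apex height.

    Assumes one smooth, baseline-corrected peak that rises monotonically to
    the apex and falls monotonically after it.
    """
    time = np.asarray(time, dtype=float)
    signal = np.asarray(signal, dtype=float)
    apex = int(np.argmax(signal))
    level = height_fraction * signal[apex]
    # Leading edge: signal increases with time, so it can serve as the x-axis.
    t_front = np.interp(level, signal[: apex + 1], time[: apex + 1])
    # Trailing edge: signal decreases, so reverse both arrays before interpolating.
    t_back = np.interp(level, signal[apex:][::-1], time[apex:][::-1])
    return t_front, time[apex], t_back


def asymmetry_factor(time, signal):
    """As = b / a at 10% of peak height (a = front half-width, b = back half-width)."""
    t_front, t_apex, t_back = edge_times(time, signal, 0.10)
    return (t_back - t_apex) / (t_apex - t_front)


def tailing_factor(time, signal):
    """USP-style T = W_0.05 / (2 f), with f the front half-width at 5% height."""
    t_front, t_apex, t_back = edge_times(time, signal, 0.05)
    return (t_back - t_front) / (2.0 * (t_apex - t_front))


def synthetic_peak(time, center, sigma_front, sigma_back):
    """Hypothetical test peak: a Gaussian with different widths on each side."""
    sigma = np.where(time < center, sigma_front, sigma_back)
    return np.exp(-0.5 * ((time - center) / sigma) ** 2)


# Illustrative data only: retention times and widths are made up.
t = np.linspace(4.0, 7.0, 3001)
symmetric = synthetic_peak(t, 5.0, 0.10, 0.10)
tailing = synthetic_peak(t, 5.0, 0.10, 0.20)
print(f"symmetric: As = {asymmetry_factor(t, symmetric):.2f}")
print(f"tailing:   As = {asymmetry_factor(t, tailing):.2f}, "
      f"T = {tailing_factor(t, tailing):.2f}")
```

A value near 1 indicates a symmetric peak, while values well above 1 indicate tailing. Tracking these numbers across a dilution series is one simple way to tell injection overload apart from a stationary-phase or hardware problem, since overload-driven asymmetry should shrink as the injected amount drops.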
This tension between algorithmic fixes and physical reality is exactly why my time at HPLC 2026 in Indianapolis felt so refreshing. Amidst the necessary focus on big data, it was inspiring to see the community continue to champion the core physical principles of our discipline. The sessions honoring the legacy of Pete Carr offered a stark reminder that the true leaps in separation science have always been driven by a deep, uncompromising understanding of thermodynamics and kinetics. We saw this philosophy echoed in several presentations highlighting the continued need to delve into the reversed-phase separation mechanism and push the speed limits required for 2D-LC. Similarly, another large portion of the talks at HPLC 2026 focused on new developments in column technology, including core-shell and monodisperse particles and ultra-long capillary columns. These advancements were not predicted by artificial intelligence. Instead, they came from researchers obsessing over each individual term of the van Deemter equation. While computer simulations and data analysis tools can provide us with deep insights, they must serve as tools for executing theory and not as black-box shortcuts. True analytical innovation will always start at the bench.
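For readers who want the terms in front of them, the van Deemter equation expresses plate height H as a function of mobile-phase linear velocity u: H = A + B/u + C·u, where A reflects eddy dispersion, B longitudinal diffusion, and C resistance to mass transfer. Setting dH/du = 0 gives the optimum velocity u_opt = sqrt(B/C) and the minimum plate height H_min = A + 2·sqrt(B·C). The sketch below simply evaluates those expressions; the coefficient values are hypothetical and in arbitrary units, chosen only to make the arithmetic easy to follow.

```python
import math


def van_deemter(u, A, B, C):
    """Plate height H at linear velocity u: H = A + B/u + C*u."""
    return A + B / u + C * u


def optimum(A, B, C):
    """Velocity that minimizes H, and the plate height at that velocity."""
    u_opt = math.sqrt(B / C)            # from dH/du = -B/u**2 + C = 0
    h_min = A + 2.0 * math.sqrt(B * C)  # H evaluated at u_opt
    return u_opt, h_min


# Hypothetical coefficients (arbitrary units), for illustration only.
A, B, C = 1.0, 4.0, 0.25
u_opt, h_min = optimum(A, B, C)
print(f"u_opt = {u_opt:.2f}, H_min = {h_min:.2f}")

# Running slower or faster than u_opt costs efficiency for different reasons:
# the B/u term dominates at low velocity, the C*u term at high velocity.
for u in (1.0, u_opt, 16.0):
    print(f"u = {u:5.2f}  H = {van_deemter(u, A, B, C):.2f}")
```

Each of the column technologies mentioned above can be read as an attack on one of these terms, which is why working through the equation by hand remains a useful exercise before reaching for a simulation.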
As I prepare to launch my own independent research group this fall as an assistant professor, I want to foster a culture where data science is viewed as a partner to chromatographic theory and never its replacement. Yes, it is imperative to train our students on how to leverage different chemometric and machine learning algorithms for wrangling complex chromatographic datasets. In an era where a single untargeted metabolomics run can yield thousands of features, data literacy is a non-negotiable skill. I highly recommend the framework that Dr. Katelynn Perrault-Uptmor provides in her recent LCGC Blog regarding data training in analytical chemistry.2 She provides a spectacular roadmap for transforming the way we teach data workflows, ensuring students treat chemometrics as an active, iterative extension of their analytical process rather than a static button to push on a computer screen.
But as we eagerly adopt these modern data frameworks, we must ensure students spend the necessary time to evaluate separation conditions and column performance before collecting large datasets. It is incredibly tempting for a young researcher to rush through method development, operating under the assumption that if their baseline drifts or their peaks overlap, a powerful post-run alignment algorithm or a deconvolution script can simply sort it all out later. We have to break that habit early. In my laboratory, no large-scale untargeted sequence will be queued until a student can provide evidence that their system is operating with optimal chromatographic performance. Through these preliminary experiments, the instrument stops being an abstract appliance and becomes a dynamic chemical environment.
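One way such evidence could be gathered is a short system-suitability check on a standard mixture before the sequence is queued. The sketch below is only an illustration of the idea, not a description of any specific laboratory protocol: the `Peak` class, the `suitability_failures` function, and the acceptance thresholds are all hypothetical. It uses the common half-height expressions for plate count, N = 5.54 (tR/w½)², and resolution, Rs = 1.18 (tR2 − tR1)/(w½,1 + w½,2).

```python
from dataclasses import dataclass


@dataclass
class Peak:
    name: str
    retention_time: float     # minutes
    width_half_height: float  # minutes


def plate_count(peak):
    """N = 5.54 * (tR / w_half)^2, using the width at half height."""
    return 5.54 * (peak.retention_time / peak.width_half_height) ** 2


def resolution(first, second):
    """Rs = 1.18 * (tR2 - tR1) / (w_half_1 + w_half_2)."""
    return (1.18 * (second.retention_time - first.retention_time)
            / (first.width_half_height + second.width_half_height))


def suitability_failures(peaks, min_plates, min_resolution):
    """Return a list of human-readable failures; an empty list means pass."""
    ordered = sorted(peaks, key=lambda p: p.retention_time)
    failures = []
    for peak in ordered:
        n = plate_count(peak)
        if n < min_plates:
            failures.append(f"{peak.name}: N = {n:.0f} < {min_plates}")
    # Check resolution between each pair of adjacent peaks.
    for first, second in zip(ordered, ordered[1:]):
        rs = resolution(first, second)
        if rs < min_resolution:
            failures.append(
                f"{first.name}/{second.name}: Rs = {rs:.2f} < {min_resolution}"
            )
    return failures


# Hypothetical standard mixture and thresholds, for illustration only.
standards = [
    Peak("A", 5.00, 0.05),
    Peak("B", 5.20, 0.05),
    Peak("C", 5.25, 0.05),
]
for problem in suitability_failures(standards, min_plates=10_000, min_resolution=1.5):
    print("FAIL:", problem)
```

The value of a check like this lies less in the pass/fail flag than in the conversation it forces: a failing resolution or plate count sends the student back to the column, the mobile phase, or the injection conditions rather than to a deconvolution script.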
Advanced chemometrics and machine learning are magnificent tools that allow us to make sense of unprecedented molecular complexity, but they are entirely dependent on the quality of the raw physical separation. Our goal as educators and mentors is to ensure that fundamental understanding of chromatographic theory does not fade into a historical footnote. We should push our students to ask: How does the new instrument or data analysis tool further our theoretical understanding? Math can reveal remarkable patterns, but true innovation will always live at the physical bench.
References
1. Forzano, AV; Cain, CN; Rutan, SC; Collinson, MM. In situ silanization for continuous stationary phase gradients on particle packed LC columns. Anal. Methods, 2019, 11, 3648-3656. DOI: 10.1039/C9AY00960D
2. Perrault-Uptmor, KA. Who Will Handle the Data? Training Data Wranglers in Analytical Chemistry. The LCGC Blog, 2026.
Biography
Caitlin N. Cain is an incoming Assistant Professor in the Department of Chemistry at the University of Virginia, Virginia, USA. Her research specializes in the development of multidimensional chromatographic techniques and computational pipelines for untargeted metabolomics. She recently completed a postdoctoral research fellowship at the University of Michigan and previously earned her Ph.D. in Chemistry from the University of Washington in 2024 and B.S. degrees in Chemistry and Forensic Science from Virginia Commonwealth University in 2019. Her research efforts have been recognized by numerous fellowships and accolades, including the NIH Ruth L. Kirschstein Postdoctoral Individual National Research Service Award (F32 Fellowship), NSF Graduate Research Fellowship, and the inaugural LCGC Rising Stars of Separation Science Award. She proudly serves as Secretary for the ACS SCSC (Subdivision on Chromatography and Separations Chemistry).



