
HTC-19 Update: Artificial Intelligence and Machine Learning
Key Takeaways
- Encoding chromatographic priors into objective functions, decision policies, and GP kernels reduces regret and outperforms simply increasing Bayesian-optimisation algorithmic complexity in closed-loop LC workflows.
- Hybrid workflows couple pre-trained GNN retention prediction with mechanistic retention modelling and Bayesian optimisation, iteratively updating models from experiments to guide subsequent condition selection.
An illuminating session titled AI and Modelling, held on Wednesday 27 May 2026 at HTC-19 in Leuven, Belgium, focused on automated and computational approaches to LC method development, with five presentations spanning Bayesian optimization, hybrid retention modelling, QSAR-based retention prediction, graph-theoretic peak alignment, and functional data analysis.
Bob Pirok from the University of Amsterdam, Amsterdam, The Netherlands, delivered a keynote titled No Regrets: Encoding Chromatography in Automated Method Development Beyond Bayesian Optimization. Pirok challenged the premise that Bayesian optimization alone is sufficient for reliable automated LC method development, arguing that when treated as a generic black-box optimiser it frequently fails to navigate the complexity of chromatographic systems effectively. The central problem, he argued, is not the algorithm but the absence of encoded chromatographic knowledge within it. His group's approach reframes the problem — rather than applying machine learning to chromatography, the aim is to embed chromatography within machine learning. This involves systematic encoding of chromatographic knowledge across three components: the objective function, the decision-making strategy, and the surrogate model kernel. Pirok gave particular attention to the design of chromatographic response functions, which govern the ruggedness and interpretability of the optimisation landscape, and to kernel assumptions, which determine how the algorithm learns from and exploits that landscape. Chromatographic optimisation problems, he noted, exhibit low effective dimensionality driven by a limited number of dominant parameters; ignoring this structure leads to inefficient exploration, while incorporating it accelerates convergence and reduces optimisation regret. The principles were shown to extend to selectivity screening and method transfer. Using both simulated and experimental closed-loop workflows, Pirok demonstrated that design choices of this kind have a greater impact on performance than increasing algorithmic complexity.
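To make the idea of a chromatographic response function concrete, the following is a minimal sketch of one possible objective, not the function used by Pirok's group. It scores a chromatogram by the mean resolution of adjacent peak pairs, capped at a target value, minus a penalty for analysis time. The names `response_function`, `target_rs`, `max_time`, and `time_weight`, along with their default values, are illustrative assumptions.

```python
from typing import List, Sequence


def resolutions(times: Sequence[float], widths: Sequence[float]) -> List[float]:
    """Resolution of adjacent peak pairs: Rs = 2 * (t2 - t1) / (w1 + w2)."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    return [
        2.0 * (times[b] - times[a]) / (widths[a] + widths[b])
        for a, b in zip(order, order[1:])
    ]


def response_function(
    times: Sequence[float],
    widths: Sequence[float],
    target_rs: float = 1.5,    # assumed target for baseline resolution
    max_time: float = 20.0,    # assumed longest acceptable run, in minutes
    time_weight: float = 0.1,  # assumed trade-off between separation and speed
) -> float:
    """Hypothetical response: capped mean resolution minus a time penalty."""
    rs = resolutions(times, widths)
    if not rs:
        raise ValueError("at least two peaks are required")
    # Capping stops one very well separated pair from masking a co-elution.
    separation = sum(min(r, target_rs) / target_rs for r in rs) / len(rs)
    time_penalty = time_weight * max(times) / max_time
    return separation - time_penalty


# Two hypothetical chromatograms: well separated versus nearly co-eluting.
good = response_function([1.0, 2.0], [0.5, 0.5])
poor = response_function([1.0, 1.1], [0.5, 0.5])
print(good > poor)
```

The capping and the time penalty are the kind of design choices that shape the optimisation landscape: without the cap, an optimiser can chase one large resolution value while ignoring a critical pair.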
Kai Chen of Janssen Research & Development, Beerse, Belgium, presented a talk entitled Speeding Up Liquid Chromatography Method Development in Pharmaceutical Development by Using Strategies of Hybrid Retention Modeling, describing a workflow developed in collaboration with KU Leuven, Leuven, Belgium, and the VIB-UGent Center for Medical Biotechnology, Ghent, Belgium. Chen noted that conventional trial-and-error strategies and Quality by Design factorial designs can be slow to adapt to the shifting demands of pharmaceutical development across early and late-stage projects. His group has developed a hybrid workflow combining data-driven machine learning with mechanistic retention modelling. A Graph Neural Network (GNN) model pre-trained on approximately 220,000 molecules using a generic three-minute LC method is fine-tuned with retention time data from 500 to 1,000 molecules across a set of eight to sixteen screening methods covering two columns, two organic modifiers, and mobile phases at different pH values. Following initial screening and column selection, the GNN is further fine-tuned on the screening data, and an autonomous optimisation framework based on mechanistic retention models and Bayesian optimisation refines the remaining parameters, including column temperature and gradient. Experimental retention data are fed back iteratively to the GNN, whose updated predictions for potential related compounds (not yet physically available) are then used by the mechanistic model to select subsequent experimental conditions. Case studies demonstrated that iterative fine-tuning with data from self-optimising experiments improves GNN prediction accuracy, and that the strategy extends method development scope to include potential related substances.
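The report does not say which mechanistic retention model the workflow uses. As an illustration only, the sketch below fits the widely used linear solvent strength relationship, ln k = ln k_w - S·φ, to isocratic retention factors and predicts a retention time at a new organic modifier fraction φ. The function names and numerical values are hypothetical, and gradient elution and temperature effects, which the described workflow optimises, would need a more elaborate model.

```python
import math
from typing import Sequence, Tuple


def fit_lss(phis: Sequence[float], ks: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of ln k = ln_kw - S * phi. Returns (ln_kw, S)."""
    n = len(phis)
    ys = [math.log(k) for k in ks]
    mean_x = sum(phis) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in phis)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(phis, ys))
    slope = sxy / sxx
    ln_kw = mean_y - slope * mean_x
    return ln_kw, -slope  # S is defined as the negative of the slope


def predict_retention_time(ln_kw: float, s: float, phi: float, t0: float) -> float:
    """Isocratic retention time t_R = t0 * (1 + k), with k from the fitted model."""
    k = math.exp(ln_kw - s * phi)
    return t0 * (1.0 + k)


# Hypothetical scouting data generated from ln_kw = 5 and S = 10.
phis = [0.3, 0.4, 0.5]
ks = [math.exp(5.0 - 10.0 * p) for p in phis]
ln_kw, s = fit_lss(phis, ks)
print(predict_retention_time(ln_kw, s, phi=0.45, t0=1.0))
```

In a hybrid scheme of the kind described, retention times predicted by the GNN for compounds that are not physically available could stand in for the measured `ks`, which is what would let a mechanistic model plan experiments for them.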
Roman Szucs of Comenius University Bratislava, Bratislava, Slovakia, presented on Predicting Chromatographic Retention Times: Descriptor Choice, Feature Selection, and QSAR Interpretation. His work assessed multiple molecular descriptor classes (traditional physicochemical, topological and fragment-based, and solvation-related descriptors) in combination with a range of feature selection strategies, including filter-based, wrapper-based, and embedded methods, evaluating their respective effects on predictive performance, model stability, and feature consistency using nested cross-validation and permutation-based validation. Predictive models were trained using regularised linear and nonlinear regression approaches. Szucs placed particular emphasis on model interpretability, analysing selected features in terms of their physicochemical meaning and chromatographic relevance to provide mechanistic insight into retention behaviour. Solvation-aware descriptors were shown to offer complementary information to conventional molecular features, and both descriptor choice and feature selection strategy were found to strongly influence model robustness.
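Permutation-based validation, often called y-scrambling in QSAR work, can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the protocol from the talk: it uses a single hypothetical descriptor, ordinary least squares, and the in-sample R² rather than a cross-validated score. The descriptor values and retention times are invented for the example.

```python
import random
from typing import Sequence, Tuple


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """R^2 of a one-descriptor least-squares fit (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def permutation_p_value(
    x: Sequence[float],
    y: Sequence[float],
    n_permutations: int = 500,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (observed R^2, permutation p-value) using shuffled responses."""
    rng = random.Random(seed)
    observed = r_squared(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real descriptor-retention link
        if r_squared(x, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_permutations + 1)


# Invented data: a lipophilicity-like descriptor and retention times in minutes.
descriptor = [0.5, 1.1, 1.8, 2.2, 2.9, 3.4, 4.0, 4.6]
retention = [2.1, 3.0, 4.2, 4.9, 6.1, 6.8, 8.0, 8.9]
observed, p_value = permutation_p_value(descriptor, retention)
print(observed, p_value)
```

A model whose score is not clearly better than the scores obtained on scrambled responses is probably fitting noise, which is why this check is usually paired with cross-validation when many descriptors and feature selection steps are involved.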
Gabriel Vivó-Truyols from the Spanish Scientific Council, Madrid, Spain, gave a talk called On the Use of Graph Theory for Peak Assignation and Chromatographic Alignment: A Bayesian Approach. Vivó-Truyols noted that the growth of two-dimensional chromatography, high-resolution mass spectrometry, and higher-order detectors has substantially increased the volume and complexity of chromatographic data, making robust automated data curation strategies increasingly important. He showed that peak assignment and chromatographic alignment, typically treated as separate problems, become mathematically equivalent when formulated through graph theory, in which nodes represent chromatographic peaks or compounds and edges represent assignments of identity. Both operations reduce to a classical assignment problem solvable by combinatorial optimisation. He presented a Bayesian approach to this combinatorial optimisation that considers a distribution of solutions rather than a single optimum, demonstrating that this makes both peak assignment and alignment considerably more robust and reliable than conventional single-solution methods.
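The difference between a single optimal assignment and a distribution over assignments can be shown with a toy example. The sketch below is not the speaker's algorithm: it assumes equal numbers of reference and observed peaks, a uniform prior over assignments, a Gaussian likelihood on retention time differences with an assumed `sigma`, and brute-force enumeration that is only feasible for a handful of peaks.

```python
import itertools
import math
from typing import Dict, List, Sequence, Tuple


def assignment_posterior(
    ref_times: Sequence[float],
    obs_times: Sequence[float],
    sigma: float = 0.1,  # assumed retention time uncertainty, in minutes
) -> Dict[Tuple[int, ...], float]:
    """Posterior over one-to-one assignments; perm[i] is the observed peak for reference i."""
    n = len(ref_times)
    weights = {}
    for perm in itertools.permutations(range(n)):
        squared_error = sum(
            (ref_times[i] - obs_times[j]) ** 2 for i, j in enumerate(perm)
        )
        weights[perm] = math.exp(-squared_error / (2.0 * sigma**2))
    total = sum(weights.values())
    return {perm: w / total for perm, w in weights.items()}


def pair_marginals(posterior: Dict[Tuple[int, ...], float], n: int) -> List[List[float]]:
    """Probability that reference peak i matches observed peak j, summed over assignments."""
    marginals = [[0.0] * n for _ in range(n)]
    for perm, probability in posterior.items():
        for i, j in enumerate(perm):
            marginals[i][j] += probability
    return marginals


# Hypothetical retention times: the last two peaks elute close together.
reference = [1.0, 2.0, 2.2]
observed = [1.02, 2.1, 2.12]
posterior = assignment_posterior(reference, observed)
marginals = pair_marginals(posterior, len(reference))
best = max(posterior, key=posterior.get)
print(best, round(marginals[1][1], 2))
```

Here the best single assignment keeps the elution order, but the marginal probability for the second peak is far from 1, so a single-solution method would hide how ambiguous that match is.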
Leveraging Functional Data Analysis for Advanced Chromatographic Method Development: A Case Study in Sophorolipid Separation was presented by Manlio Caldara of JMP Statistical Discovery. Caldara proposed that the conventional practice of reducing chromatograms to discrete scalar values (peak area, height, or retention time) discards high-dimensional information about peak morphology, baseline behaviour, and co-elution that is directly relevant to method development. His group's approach treats chromatograms as functional objects and applies Functional Data Analysis within a three-stage pipeline: automated spectral preprocessing, including baseline correction and retention time alignment via Dynamic Time Warping; signal modelling using a Daubechies 10 wavelet basis, chosen for its ability to capture sharp peak features while compressing noise; and Functional Principal Component Analysis to decompose shape variation into a manageable set of uncorrelated variables such as peak broadening, separation distance, and tailing. These functional principal component scores then serve as responses in a Functional Design of Experiments framework, enabling prediction of the entire chromatogram across the design space without additional experimental runs. Applied to the separation of two sophorolipid congeners differing by a single carbon atom (C18:1 and C17:1), with column temperature and flow rate varied as critical method parameters, the workflow successfully identified an operating region in which the two compounds are fully resolved.
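The alignment step of such a pipeline can be illustrated with a textbook Dynamic Time Warping routine. This is a minimal sketch rather than the implementation used in the study: it uses an absolute-difference local cost and no warping window, and the two short signals are invented to mimic a peak shifted by one sampling point.

```python
from typing import List, Sequence, Tuple


def dtw(a: Sequence[float], b: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Classic DTW: returns the cumulative cost and the warping path as index pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    # Backtrack from the end to recover which points were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Invented signals: the same peak, shifted by one sampling point.
reference = [0, 0, 1, 3, 1, 0, 0, 0]
shifted = [0, 0, 0, 1, 3, 1, 0, 0]
distance, path = dtw(reference, shifted)
pointwise = sum(abs(x - y) for x, y in zip(reference, shifted))
print(distance, pointwise)
```

The pointwise difference treats the shift as a large change in shape, whereas the warped comparison removes it. That matters before Functional Principal Component Analysis, because otherwise the leading components would mostly describe retention time drift instead of peak shape.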