March 5, 2026 (Updated: March 4, 2026)

AI/ML in Practice: Boudewijn Hollebrands on Deep Learning Models to Predict Food Peptide Retention Times

LCGC International’s interview series on the evolving role of artificial intelligence (AI)/machine learning (ML) in separation science continues with Boudewijn Hollebrands from Unilever Foods R&D, Wageningen, Netherlands, discussing deep learning models to predict food peptide retention times.

You recently published a paper, “Application of a Deep Learning Model to Predict Liquid Chromatography Retention Times of Food Peptides Across Chromatographic Conditions” (1). What was the rationale behind this research, and what makes your machine-learning approach innovative compared with existing strategies for peptide identification using LC–MS?

Confident identification of food-derived peptides in LC–MS data using conventional proteomics workflows is challenging. Food peptides exhibit wide diversity in sequence length, composition, and biological origin, often placing them outside the chemical space typically covered by standard proteomics workflows. Identifying these molecules therefore involves a lot of manual work. The inclusion of retention information in the identification workflow, by comparing predicted and experimentally measured retention times, substantially improves the reliability of peptide identification and reduces the need for manual intervention.

We applied a transfer-learning approach where we fine-tune a generic deep learning model initially trained on large proteomics datasets using our own experimental data obtained from commercial peptide standards. In practice this means that we don’t need to start from scratch when developing retention time prediction models.
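The idea of starting from an existing model rather than from scratch can be illustrated with a deliberately small sketch. It is not the deep learning model from the paper: a linear model over amino acid counts stands in for the generic model, and the "pretrained" weights, the peptides, and the retention times are all made-up values chosen only to show the mechanics of fine-tuning.

```python
# Minimal fine-tuning sketch. ASSUMPTIONS: a linear model replaces the deep
# network, and all weights and retention times below are hypothetical.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def featurize(seq):
    """Amino acid counts plus a constant bias term."""
    return [float(seq.count(aa)) for aa in AMINO_ACIDS] + [1.0]


def predict(weights, seq):
    return sum(w * x for w, x in zip(weights, featurize(seq)))


def mean_abs_error(weights, data):
    return sum(abs(predict(weights, seq) - rt) for seq, rt in data) / len(data)


def fine_tune(pretrained_weights, data, learning_rate=0.02, epochs=300):
    """Start from the pretrained weights and take small gradient steps
    on the new (sequence, retention time) pairs."""
    weights = list(pretrained_weights)
    for _ in range(epochs):
        for seq, rt in data:
            x = featurize(seq)
            error = sum(w * xi for w, xi in zip(weights, x)) - rt
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
    return weights


# Hypothetical "generic" model: every residue adds one minute, no offset.
generic_weights = [1.0] * len(AMINO_ACIDS) + [0.0]

# Hypothetical retention times (minutes) measured for a peptide standard.
standard_runs = [("GL", 4.1), ("VPP", 6.0), ("IPP", 7.2),
                 ("LKP", 5.5), ("FY", 9.8), ("AW", 8.9)]

before = mean_abs_error(generic_weights, standard_runs)
tuned = fine_tune(generic_weights, standard_runs)
after = mean_abs_error(tuned, standard_runs)
print(f"mean absolute error before: {before:.2f} min, after: {after:.2f} min")
```

In a real setting the error would be checked on peptides held out from fine-tuning, since a model can always be made to fit the handful of points it was tuned on.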

Why are retention time predictions particularly important for identifying small, food-derived peptides when MS spectral information alone is often insufficient?

Small food-derived peptides often produce limited or ambiguous tandem mass spectrometry (MS/MS) fragmentation patterns due to their short length, low charge states, and multiple overlapping fragment ions. This reduces the effectiveness and confidence of identification by peptide-spectrum matching alone. There are multiple strategies to obtain additional information that could aid identification, including alternative MS/MS approaches, collisional cross sections, orthogonal chromatographic separation, and, last but not least, retention time prediction. The latter can be implemented with minimal additional experimental effort, and it really helped us to identify small food peptides more effectively.

Proteomics repositories contain vast peptide data, yet models derived from them often perform poorly for food peptides. What causes this gap, and how does transfer learning help bridge it?

Proteomics repositories are dominated by tryptic peptides from well-characterized proteins, resulting in strong biases in peptide length, terminal residues, charge states, and physicochemical properties. Food peptides, in contrast, often arise from less-specific enzymatic or thermal processes and hence occupy a broader and different chemical space.

This mismatch leads to poor results when proteomics-trained models are applied to peptides they were not trained on. With transfer learning, you can build on existing models instead of starting from scratch. This approach enables the model to accommodate differences in both peptide chemistry and the particular chromatographic conditions used, while minimizing the need for large additional datasets.

From a practical perspective, how much experimental data are needed to fine-tune such a model, and can routine analytical laboratories realistically implement this approach?

Only a small set of experimental data is required for effective fine-tuning of generic prediction models. Typically, a few hundred confidently identified peptides are sufficient. Such a dataset could readily be obtained from the analysis of a single peptide standard within the same sample sequence. Therefore, adoption in routine analytical laboratories is entirely realistic.

How robust are retention time predictions across different chromatographic conditions, and which practical parameters most strongly influence prediction accuracy?

Retention time prediction is robust when the model is appropriately adapted to a chromatographic system. This means that the dataset used for fine-tuning should contain sufficient data points, especially around the retention region of interest. Things tend to go wrong if you start extrapolating outside these boundaries.
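One way to make this coverage requirement concrete is a simple check on each predicted retention time. The sketch below is an illustration only: the function name, the one-minute window, and the minimum of three neighbouring calibration points are assumptions, not values from the interview or the paper.

```python
# Coverage check sketch. ASSUMPTIONS: the window width and neighbour count
# are arbitrary illustrative thresholds; tune them for a real system.
def coverage_check(calibration_rts, predicted_rt, window=1.0, min_neighbors=3):
    """Classify a predicted retention time against the fine-tuning data.

    Returns "extrapolated" if the prediction lies outside the calibrated
    range, "sparse" if too few calibration points lie nearby, else "ok".
    """
    if not calibration_rts:
        raise ValueError("calibration_rts must not be empty")
    lowest, highest = min(calibration_rts), max(calibration_rts)
    if not lowest <= predicted_rt <= highest:
        return "extrapolated"
    neighbors = sum(1 for rt in calibration_rts if abs(rt - predicted_rt) <= window)
    return "ok" if neighbors >= min_neighbors else "sparse"


# Hypothetical retention times (minutes) of the peptides used for fine-tuning.
calibration = [2.0, 2.5, 3.1, 3.4, 9.0, 15.0]

for rt in (3.0, 12.0, 18.0):
    print(rt, coverage_check(calibration, rt))
```

Predictions labelled "sparse" or "extrapolated" are the ones to treat with caution.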

An important practical parameter is that molecules should interact with the column material to have predictable retention. Limited interaction with the stationary phase limits prediction accuracy drastically.

How can retention time prediction be integrated into everyday LC–MS workflows to increase peptide identification confidence and reduce manual validation work?

All you have to do is collect retention data from a commercial peptide standard in the same sample sequence as your samples. This enables precise model adjustment for retention time prediction. The predicted retention times can then be compared with experimental values to remove candidate identifications that are too far off, rank peptide assignments based on their retention time match, or flag inconsistent results for manual review.
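The three uses of predicted retention times described above (remove, rank, flag) can be sketched in a few lines. The function name, the tolerances, and the retention times are hypothetical. The example uses IPP and LPP because they have the same mass and so are the kind of candidates that retention time can help to separate.

```python
# Candidate triage sketch. ASSUMPTIONS: tolerances (minutes) and all
# retention times are illustrative values, not recommendations.
def triage_candidates(observed_rt, candidates, review_tol=0.5, reject_tol=2.0):
    """Filter, rank, and flag candidate peptides for one LC-MS feature.

    candidates: list of (sequence, predicted_rt) pairs.
    Returns (sequence, abs_rt_error, status) tuples, best match first.
    Candidates further than reject_tol from the observed time are removed;
    those between review_tol and reject_tol are flagged for manual review.
    """
    kept = []
    for seq, predicted_rt in candidates:
        delta = abs(predicted_rt - observed_rt)
        if delta > reject_tol:
            continue  # too far off: drop the candidate
        status = "accept" if delta <= review_tol else "review"
        kept.append((seq, round(delta, 3), status))
    return sorted(kept, key=lambda item: item[1])


# Hypothetical feature observed at 12.4 min with three candidate sequences.
ranked = triage_candidates(12.4, [("LPP", 13.6), ("VPP", 8.0), ("IPP", 12.1)])
print(ranked)  # [('IPP', 0.3, 'accept'), ('LPP', 1.2, 'review')]
```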

Looking ahead, how do you see machine learning and transfer learning transforming separation science beyond food peptide analysis?

AI is transforming peptide analysis thanks to the simplicity of encoding amino acid sequences as text, enabling rapid advances in building models. In general, peptide properties are now just a sequence away from accurate prediction.
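As a small illustration of why peptides are convenient model inputs, the sketch below turns a one-letter amino acid sequence into a fixed-length list of integers, the kind of input a sequence model typically consumes. The alphabet order, the padding value of 0, and the function name are assumptions made for this example. Real tools may use different vocabularies and may also encode modifications.

```python
# Sequence encoding sketch. ASSUMPTIONS: 20 standard residues only,
# token 0 reserved for padding, alphabet order chosen arbitrarily.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOKEN_IDS = {aa: index + 1 for index, aa in enumerate(AMINO_ACIDS)}


def encode_peptide(seq, max_len=10):
    """Map a peptide sequence to integer tokens, right-padded with zeros."""
    seq = seq.upper()
    if len(seq) > max_len:
        raise ValueError(f"sequence longer than max_len={max_len}")
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    tokens = [TOKEN_IDS[aa] for aa in seq]
    return tokens + [0] * (max_len - len(tokens))


print(encode_peptide("IPP", max_len=5))  # [8, 13, 13, 0, 0]
```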

Machine learning and transfer learning are pushing separation science towards data-driven optimization and faster adaptation, moving beyond the era of trial-and-error. However, encoding molecules other than peptides remains challenging, and high-quality data is still the key to better predictions. Better data eventually means smarter models. As accessibility grows, machine learning is set to become a flexible tool across separation science, not just for niche tasks.

Reference

  1. Hollebrands, B.; Hageman, J.; Janssen, H.-G. Application of a Deep Learning Model to Predict Liquid Chromatography Retention Times of Food Peptides Across Chromatographic Conditions. J. Sep. Sci. 2025, 48 (9), e70270. DOI: 10.1002/jssc.70270