A recent study conducted by the Department of Chemical Engineering at the Indian Institute of Technology (Delhi, India) used liquid chromatography-mass spectrometry (LC–MS) to distinguish hetero-variants (glycoforms) of a monoclonal antibody (mAb), revealing discernible peaks at the intact level.
Although automated peak detection is available in commercial software, visual inspection and manual adjustment are often necessary to achieve optimal true positive rates. LCGC International spoke to Anurag Rathore, corresponding author for the article, about his department’s findings.
Your paper (1) presents a study conducted by you and your coauthors where a machine learning (ML)-based approach for peak detection is used to facilitate a head-to-head intact-level comparison of commercially licensed biosimilars and innovator products. What are the benefits of using ML to do this, as opposed to other approaches?
Unlike traditional peak detection methods, which require prior information about the sample, such as baseline distortions, phasing errors, and t1 noise, ML-based techniques need minimal prior knowledge. This reduces the dependency on manual adjustments and expert input, making the process more automated and streamlined. In addition, ML-based methods are less sensitive to noise, which makes them particularly advantageous for data with a poor signal-to-noise ratio, where they provide reliable peak detection without extensive manual intervention.
Why is the method of peak detection important?
The method of peak detection is important for the following reasons in the context of the analysis of therapeutic proteins (mAbs):
Have you found that there are certain chromatography or spectrometry techniques that are optimized by using ML-based approaches?
Yes, certain chromatography and spectrometry techniques can be optimized with ML-based approaches because they produce complex, high-dimensional data that can be challenging to process using traditional methods. A common example is high-resolution liquid chromatography-mass spectrometry (LC–MS). Studies have shown that ML techniques such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) far outperform other techniques in achieving higher true positive rates.
Was ML your best option in carrying out your peak detection analysis? Were other artificial intelligence (AI) approaches considered?
Conventional algorithms such as partial least squares-discriminant analysis (PLS-DA) and locally weighted regression (LWR) were applied to our problem. They showed lower accuracy in detecting multiple peaks and imposed a heavy computational load. Artificial neural networks were also deployed for the same peak detection task, but their inability to extract relationships from spectral data led to inaccurate detections. The approach we developed with convolutional neural networks surpassed both the conventional algorithms and the ML approach based on artificial neural networks in terms of accuracy, computational efficiency, and operational efficiency.
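As a rough illustration of why convolution suits spectral data, the sketch below applies a single 1D convolution filter followed by a ReLU, which is the basic building block of a CNN. It is not the network used in the study: the kernel is set by hand rather than learned, and the signal values, `conv1d_valid`, and `relu` are all hypothetical names and data chosen for this example.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel along the signal (no padding) and return the dot product at each position."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Keep positive responses and clip negative ones to zero."""
    return [max(0.0, v) for v in values]


# Hand-set kernel (an assumption, not a learned weight): responds strongly
# where the centre point is higher than both of its neighbours.
kernel = [-1.0, 2.0, -1.0]

# Made-up intensity trace with an isolated bump and a broader shoulder.
signal = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]

feature_map = relu(conv1d_valid(signal, kernel))
print(feature_map)
```

Because the same small filter is reused at every position, the response depends on the local shape of the trace rather than on where in the spectrum that shape occurs. A trained CNN stacks many such filters and learns their weights from labelled data.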
Briefly state your findings in this study.
In the initial phase, hetero-variants (glycoforms) of a mAb were distinguished using LC–MS, revealing discernible peaks at the intact level. To comprehensively identify each peak in the intact-level analysis, a deep learning approach using CNNs was employed. Conventional peak identification software detected only five peaks at a threshold of 0.5. Under the same conditions, the CNN model identified seven main peaks, with many overlapping peaks within the main peak, indicating superior detection capability. At the 0.5 threshold, the CNN model achieved a true positive rate of 0.9 and a probability AUC value of 0.9949. The results were also compared with conventional peak detection algorithms such as PLS-DA and LWR, and the CNN model outperformed both while offering higher computational efficiency.
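For readers unfamiliar with these metrics, the following minimal sketch shows how a true positive rate at a 0.5 threshold and an AUC can be computed from per-candidate peak probabilities. It assumes the AUC quoted above refers to the area under the ROC curve, which the interview does not state explicitly. The labels and scores are invented for illustration and are not data from the study, and the function names are hypothetical.

```python
def true_positive_rate(labels, scores, threshold=0.5):
    """Fraction of true peaks whose predicted probability reaches the threshold."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not positives:
        raise ValueError("at least one true peak is required")
    return sum(s >= threshold for s in positives) / len(positives)


def roc_auc(labels, scores):
    """ROC AUC as the probability that a random true peak outscores a random non-peak.

    Ties between a true peak and a non-peak count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 1 marks a true peak, 0 marks a non-peak candidate.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.40, 0.55, 0.30, 0.20, 0.10]

tpr = true_positive_rate(labels, scores, threshold=0.5)
auc = roc_auc(labels, scores)
print(f"TPR at 0.5: {tpr:.2f}, AUC: {auc:.4f}")
```

The true positive rate depends on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.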
Do your findings correlate with what you had hypothesized?
Yes. We hypothesized that machine learning, specifically CNNs, would improve peak detection accuracy and true positive rates compared to conventional methods, and the results bore this out. Conventional peak identification software detected only five peaks at a threshold of 0.5. Under the same conditions, the CNN model identified seven main peaks, with many overlapping peaks within the main peak, indicating superior detection capability.
Was there anything particularly unexpected that stands out from your perspective?
What stood out unexpectedly was the ability of the CNN model to accurately detect multiple overlapping peaks without being affected by noise.
Were there any limitations or challenges you encountered in your work?
Some of the limitations of the present study include:
What best practices can you recommend in this type of analysis for both instrument parameters and data analysis?
The best practices we recommend for effective data analysis are:
In the case of instrument parameters, one should focus on:
What are the next steps in this research and are you planning to be involved in improving this technology?
We can explore combining CNNs with other AI techniques, such as classifiers, to further enhance detection capability and robustness. We can also develop strategies to reduce the computational load, such as parallel processing or splitting datasets into smaller regions, to make the approach more efficient and scalable.
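The idea of splitting a dataset into smaller regions and processing them in parallel can be sketched as follows. This is only an illustration under stated assumptions: the simple local-maximum detector stands in for a trained model, the window size, overlap, and signal are invented, and all function names are hypothetical. Threads are used for brevity; CPU-bound work in Python would more likely use separate processes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_regions(n_points, region_size, overlap):
    """Return (start, stop) index pairs that cover n_points with overlapping windows."""
    if n_points <= 0 or region_size <= overlap:
        raise ValueError("need n_points > 0 and region_size > overlap")
    step = region_size - overlap
    regions, start = [], 0
    while True:
        stop = min(start + region_size, n_points)
        regions.append((start, stop))
        if stop == n_points:
            return regions
        start += step


def detect_in_window(window, min_height):
    """Stand-in detector: strict local maxima at or above min_height, window edges excluded."""
    return [
        i for i in range(1, len(window) - 1)
        if window[i] >= min_height and window[i - 1] < window[i] > window[i + 1]
    ]


def detect_peaks_by_region(signal, region_size=6, overlap=2, min_height=1.0, workers=4):
    """Run the detector on each region concurrently and merge the de-duplicated hits."""
    regions = split_into_regions(len(signal), region_size, overlap)

    def run(region):
        start, stop = region
        return [start + i for i in detect_in_window(signal[start:stop], min_height)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run, regions))
    return sorted({idx for hits in results for idx in hits})


# Invented trace with three peaks at indices 2, 6, and 10.
signal = [0, 1, 3, 1, 0, 2, 5, 2, 0, 1, 4, 1, 0]
peaks = detect_peaks_by_region(signal)
print(peaks)
```

The overlap matters because a detector that sees only one window cannot judge a point on the window's edge; overlapping neighbouring regions gives every interior point at least one window in which both of its neighbours are visible, and the final set removes hits reported twice.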
What are your thoughts on AI and ML for data analysis in chromatography and spectrometry?
AI and machine learning can significantly improve the reliability and depth of analytical results in chromatography and spectrometry by enhancing accuracy, efficiency, and scalability. ML algorithms can readily handle the high-dimensional, complex datasets these techniques produce, extracting meaningful features and patterns that conventional methods might miss.
Reference
1. Nikita, S.; Bhattacharya, S.; Manocha, K.; Rathore, A. S. Deep Learning Framework for Peak Detection at the Intact Level of Therapeutic Proteins. J. Sep. Sci. 2024, 47 (11), 139888. DOI: 10.1002/jssc.202400051