Precision Analytics for Life Science and Medicine: AI & Data Science

Column
June 2024
Volume 20
Issue 6
Pages: 7

Gradient boosting machine (GBM) learning is at the forefront of applied machine learning in the life sciences and medicine, and forms part of the broader realm of artificial intelligence (AI). Gradient boosting builds a predictive model by combining the outputs of many weaker models, typically decision trees, fitted sequentially so that each new tree corrects the errors left by those before it. During a session at Analytica in Munich, Germany, experts discussed the use of GBM in the life sciences and medicine.
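To make the sequential idea concrete, the following is a minimal sketch of gradient boosting for squared-error regression, written in plain Python. It is not taken from any of the talks: the function names (`fit_stump`, `fit_gbm`, `predict_gbm`, `mse`), the toy data, and the choice of one-split decision trees ("stumps") on a single feature are all assumptions made for illustration. Production work would normally rely on an established library rather than code like this.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a one-split tree: pick the threshold with the lowest squared error."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds sit midway between consecutive distinct x values,
    # so both sides of every split are guaranteed to be non-empty.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        error = (sum((r - left_value) ** 2 for r in left)
                 + sum((r - right_value) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]  # (threshold, left_value, right_value)


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_gbm(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = mean(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict_gbm(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return mean((y - predict_gbm(model, x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data: a noisy three-level step function.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 5.0, 5.1, 4.9]

for rounds in (0, 1, 50):
    model = fit_gbm(xs, ys, n_rounds=rounds)
    print(rounds, "rounds -> training MSE", round(mse(model, xs, ys), 4))
```

Each stump on its own is a poor predictor, but because every round is fitted to what the ensemble still gets wrong, the training error shrinks as rounds accumulate. The learning rate deliberately damps each correction, which in practice trades more rounds for better generalization.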

The first talk of this session was presented by Bing Zhang from Baylor College of Medicine in Houston, Texas, and was titled "Leveraging Artificial Intelligence to Illuminate the Dark Phosphoproteome." This talk addressed the challenge of effectively analyzing and interpreting mass spectrometry-based phosphoproteomics data. Zhang's team employed machine learning (ML) and deep learning (DL) methods to enhance phosphoproteomic data analysis, aiming to understand what is referred to as the "dark phosphoproteome." Specifically, they developed the DeepRescore2 software, which uses deep learning-based retention time and fragment ion intensity predictions to improve phosphopeptide identification and phosphosite localization. Additionally, Zhang discussed the IDPpub computational pipeline, which leverages BioBERT software to extract phosphorylation sites from biomedical abstracts, facilitating the identification of the regulating enzymes and biological functions of phosphosites.

The second talk of this session was presented by Lennart Martens from the VIB life sciences research institute and Ghent University in Belgium. The presentation, entitled "Machine Learning-powered Floodlights to Illuminate Precision Medicine," focused on the integration of machine learning models into mass spectrometry-based proteomics. Martens highlighted the significant improvement in identification performance achieved by machine learning models such as MS2PIP and DeepLC software coupled with the MS2Rescore variant of the Percolator rescoring engine. These machine learning models enhance information recovery from proteomic data, providing new insights into the underlying biology and pathology encoded in existing datasets. Furthermore, Martens emphasized the potential of machine learning models to reveal detailed insights into molecular pathologies and to map protein activity at a proteome-wide scale, which could have implications for precision medicine.

The third presentation, by Fan Liu from Leibniz-Forschungsinstitut für Molekulare Pharmakologie (FMP) in Berlin, Germany, was titled "Developing Structural Interactomics and its Application in Cell Biology," and focused on proteome-wide cross-linking mass spectrometry for capturing protein interactions and molecular spatial arrangements. Liu highlighted advancements in experimental methods and software tools that have generated extensive protein-protein interaction (PPI) data across several biological systems. These data offer insights into protein subcellular localizations, interactions, and architectures, and serve as valuable training data for AI-based methods that identify PPIs and the specific amino acid sequences or structural features within proteins that play a crucial role in mediating the binding of proteins to each other.

The final talk of the session was given by A. P. Gamiz-Hernandez from Stockholm University in Sweden, who presented "Insights into Molecular Principles of Protein Function and Disease." The talk addressed the energy metabolism of cells and the challenges in understanding the energy transduction mechanism of the oxidative phosphorylation (OXPHOS) complexes, which are located in the inner mitochondrial membrane and are responsible for generating adenosine triphosphate (ATP) for cellular energy. Gamiz-Hernandez discussed combining molecular dynamics simulations and machine learning models to predict structure-based chemical reactivity, such as pKa and redox potentials, in proteins. This approach aimed to identify key residues responsible for protein function and disease-related mutations, providing insights into the molecular principles underlying protein function and disease mechanisms.
