Development of New Centroiding Algorithms for High-Resolution Mass Spectrometry

The Column, January 2022, Volume 18, Issue 1
Pages: 11

Researchers have developed two new algorithms capable of converting centroided data—generated during high-resolution mass spectrometry (MS) analysis—to mass peak profile data and vice versa (1).

Liquid chromatography and gas chromatography coupled with high-resolution mass spectrometry (LC/GC–HRMS) are ubiquitous when comprehensive chemical characterization of complex samples is necessary, generating extremely information-rich datasets on everything from environmental matters to biological problems. However, while the amount of data generated is often a benefit, processing that data becomes a challenge, particularly when dealing with unknown chemicals in highly complex sample matrices. One commonly employed strategy for reducing data size and information density is centroiding, in which the distribution of the mass profile peak is represented by a single point that is commonly associated with the mass peak apex. Centroiding is performed either by the instrument during data acquisition or as a step in a data processing workflow. Data can often be reduced 10-fold using this process; however, the price is the loss of information related to the mass peak distribution, which provides valuable insights into mass accuracy and precision. There is also a wide range of issues associated with the software packages used for centroiding, both vendor-specific and open source. These issues can lead to uncertainty in results, and the analysis of complex matrices in particular can be a struggle. The overall result is often a lack of reproducibility and difficulty identifying unknown chemicals of interest.
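The centroiding step described above can be illustrated with a minimal sketch. It assumes a single, already isolated profile peak and places the centroid at the intensity-weighted mean m/z while reporting the apex height; the function name and the example values are hypothetical, and this is not the method used by any particular vendor or open-source package.

```python
def centroid_peak(mz, intensity):
    """Collapse one profile-mode mass peak to a single (m/z, intensity) point.

    Assumption: the inputs describe one isolated peak, not a full spectrum.
    """
    if len(mz) != len(intensity) or len(mz) == 0:
        raise ValueError("mz and intensity must be non-empty and of equal length")
    total = sum(intensity)
    if total <= 0:
        raise ValueError("peak has no signal")
    # Intensity-weighted mean m/z: one common choice of centroid position.
    centroid_mz = sum(m * i for m, i in zip(mz, intensity)) / total
    # Report the apex height as the centroid intensity.
    apex_intensity = max(intensity)
    return centroid_mz, apex_intensity


# Hypothetical symmetric profile peak sampled at five m/z values.
profile_mz = [200.098, 200.099, 200.100, 200.101, 200.102]
profile_int = [10.0, 60.0, 100.0, 60.0, 10.0]
centroid = centroid_peak(profile_mz, profile_int)
print(centroid)
```

Five profile points become one centroid here, which shows where the data reduction comes from and also why the peak width is no longer recoverable from the centroid alone.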

One solution to these problems would be to provide access to information about the peaks in both the time and mass domains, which has been shown to improve reproducibility and reliability in other techniques. Currently, most existing centroiding algorithms do not produce such information, and there is no algorithm that can estimate peak mass widths from centroided data. As such, researchers have sought to develop and validate new algorithms capable of seamlessly converting centroided data to profile data and vice versa.
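To make the reverse direction concrete, the sketch below rebuilds an approximate profile peak from a single centroid. It assumes a Gaussian peak shape and a width taken from a nominal resolving power (FWHM = m/z divided by resolving power); both are simplifying assumptions for illustration, and all names and values are hypothetical. It is not the Cent2Prof model.

```python
import math


def resolution_based_fwhm(mz, resolving_power):
    """Estimate the peak width (FWHM) from a nominal resolving power."""
    return mz / resolving_power


def centroid_to_profile(mz, intensity, fwhm, n_points=11):
    """Rebuild a Gaussian profile peak around one centroid.

    Assumptions: Gaussian peak shape, grid spanning +/- 3 sigma.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    # Convert FWHM to the Gaussian standard deviation.
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half_span = 3.0 * sigma
    step = 2.0 * half_span / (n_points - 1)
    grid = [mz - half_span + k * step for k in range(n_points)]
    profile = [intensity * math.exp(-0.5 * ((x - mz) / sigma) ** 2) for x in grid]
    return grid, profile


# Hypothetical centroid at m/z 500 with a nominal resolving power of 25,000.
fwhm = resolution_based_fwhm(500.0, 25000.0)
grid, profile = centroid_to_profile(500.0, 1000.0, fwhm)
for x, y in zip(grid, profile):
    print(f"{x:.4f}\t{y:.1f}")
```

The quality of such a reconstruction depends entirely on the width estimate, which is the quantity that centroided data normally lacks.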

The resulting algorithms, collectively named the Cent2Prof package, were developed in the Julia language and tested using seven previously analyzed datasets from three different vendors in both positive and negative ionization modes. For evaluation purposes, the new algorithms were compared against the existing centroiding algorithm in MZmine.

Researchers found that rates of false detection were ≤5% with the new algorithm package, whereas the MZmine algorithm had a 30% rate of false positives and a 3% rate of false negatives. The error in profile prediction was found to be ≤56%, independent of the mass, ionization mode, and intensity. This was six times more accurate than the resolution-based estimates, according to the authors.
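As a rough illustration of how false detection rates of this kind can be computed, the sketch below matches detected centroids to a reference peak list within a mass tolerance. The rate definitions, the tolerance, and the example values are assumptions made for this sketch and may differ from those used in the study.

```python
def detection_rates(detected_mz, reference_mz, tol=0.005):
    """Return (false positive rate, false negative rate).

    Assumed definitions: a false positive is a detected centroid with no
    reference peak within tol; a false negative is a reference peak with no
    detected centroid within tol. Rates are fractions of each list.
    """
    def has_match(value, candidates):
        return any(abs(value - c) <= tol for c in candidates)

    false_pos = sum(1 for d in detected_mz if not has_match(d, reference_mz))
    false_neg = sum(1 for r in reference_mz if not has_match(r, detected_mz))
    fp_rate = false_pos / len(detected_mz) if detected_mz else 0.0
    fn_rate = false_neg / len(reference_mz) if reference_mz else 0.0
    return fp_rate, fn_rate


# Hypothetical reference peaks and detected centroids (m/z values).
reference = [100.050, 200.100, 300.150, 400.200]
detected = [100.051, 200.099, 350.000, 400.201, 450.000]
fp_rate, fn_rate = detection_rates(detected, reference)
print(f"false positive rate: {fp_rate:.0%}, false negative rate: {fn_rate:.0%}")
```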

All of the algorithms are open source and open access. The current model is based only on quadrupole time-of-flight (QTOF) data, which limits the application of the algorithms to orbital trap data. Researchers believe an additional model is needed for orbital trap data and will work on this in the future, alongside optimization to reduce the time currently required to run the package, which is around 16 min for a chromatogram consisting of approximately 2000 scans.


  1. S. Samanipour et al., Anal. Chem. 93(49), 16562–16570 (2021).