Regions of the human genome previously thought to be “junk” have been demonstrated to be protein coding.
A new proteogenomics method using high-resolution isoelectric focusing together with liquid chromatography–mass spectrometry (HiRIEF LC–MS) — developed by researchers from Karolinska Institute and Science for Life Laboratory (SciLifeLab) in Sweden — has been published in the journal Nature Methods.1
Every day it becomes clearer that understanding the genetic make-up of humans is key to furthering our understanding of health and disease, from the discovery of new biomarker proteins for the diagnosis of disease to finding new targets for personalized medicine. It is estimated that more than three quarters of the human genome is actively transcribed (or decoded) into RNA, but only 1.5% of the genome has been linked to the production of proteins. The remaining regions are uncharted, attributed to regulatory function or “junk” DNA.
Current genome annotation, which attributes genetic code to protein production, relies on RNA analysis. However, to conclusively determine the purpose of a gene, it has to be linked to a protein. Associate professor and study leader Janne Lehtiö told The Column: “I believe our method is one important component in the active effort by the scientific community to connect genomics and proteomics fields. The combination of the two will provide many new discoveries as shown already by our small-scale work.”
The researchers collected lysate samples from human A431 cells and mouse N2A cells, then prefractionated the lysates based on pI using HiRIEF narrow strips. The resulting fractions were then analyzed by performing LC–MS. Lehtiö said: “We initiated the project to develop a hassle-free peptide level fractionation method to allow deep proteome analysis of complex samples such as human clinical samples; a goal which we achieved early on in the project. We then pushed the method further to make use of the additional data that the isoelectric focusing of peptides provide and applied it to discover new protein coding genes.”
Prefractionation using peptide-level isoelectric focusing (IEF) prior to MS has been previously reported, but within this study the ultranarrow HiRIEF strips gave a fivefold increase in resolution. Lehtiö said: “It was challenging to put all the components of the workflow together and make it fly. I mean wet lab part, data analysis workflow and improved isoelectric point calculator, and so on. We performed over a year of validations and reruns to make sure we had a functional method.”
The method detected and identified 13,078 proteins within the human cell lysates, and found 98 previously undiscovered protein-coding loci. Lehtiö commented: “When we had everything in place it felt like participating in a Jules Verne adventure inside the genome.”
When asked to expand on the importance of the results to the wider research community, Lehtiö told The Column that many of the newly proposed protein variants are connected to several diseases. He said: “I hope those findings will provide ideas for researchers across the globe which will accelerate medical research.” - B.D.
1. J. Lehtiö et al., Nature Methods DOI: 10.1038/NMETH.2732 (2013).