Uncovering the Mysteries of Microproteins

Zhe Ji, PhD, assistant professor of Pharmacology and at the McCormick School of Engineering, was senior author of the study published in Nature Communications.

Northwestern Medicine scientists have developed a method to identify and characterize microproteins — a development that opens the door for understanding physiology and disease at a molecular level of detail not previously possible, according to findings published in Nature Communications.

Microproteins, which are proteins fewer than 100 amino acids in length, went largely unappreciated until recent technological developments made them detectable. As a result, microproteins are not well-characterized or understood, said Zhe Ji, PhD, assistant professor of Pharmacology and at the McCormick School of Engineering, and senior author of the study.

“Non-canonical proteins, or microproteins, are a relatively new field,” said Ji, who is also a member of the Robert H. Lurie Comprehensive Cancer Center of Northwestern University. “The median length of these microproteins is around 20 amino acids, and most of them aren’t stable; they degrade quickly. However, some are stable and do have biological functions, and my lab wanted to understand the molecular features distinguishing these stable versus unstable microproteins.”

While traditional proteins are well-characterized and annotated in open-source databases available to scientists, no such comprehensive catalog currently exists for microproteins, Ji said. 

In the study, Ji and his collaborators used ribosome profiling data to characterize microproteins in humans, mice, zebrafish, worms and yeast. The Ji Laboratory also developed a low-input and rapid ribosome profiling method, which was detailed in a previous study.

Then, the study authors developed a logistic regression model based on the microproteins’ features to determine how likely each microprotein was to be stable in humans.

“Our model effectively explained the two groups of microproteins: why some of them are stable and detectable, and why some are not,” Ji said. “When a microprotein is longer and conserved, and also has a domain, it’s more stable, which makes sense.”
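To make the idea concrete, here is a minimal sketch of how a logistic regression can turn the three features Ji mentions (length, conservation and the presence of a domain) into a stability probability. It is not the study's model: the toy data, the feature scaling, the training settings and the function names (`scale`, `fit`, `predict_stability`) are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy data, NOT from the study:
# (length in amino acids, conservation score 0-1, has domain 0/1) -> stable (1) or not (0)
samples = [
    ((15, 0.1, 0), 0),
    ((22, 0.2, 0), 0),
    ((30, 0.1, 0), 0),
    ((18, 0.3, 0), 0),
    ((65, 0.6, 1), 1),
    ((80, 0.4, 1), 1),
    ((72, 0.7, 0), 1),
    ((90, 0.8, 1), 1),
]


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def scale(features):
    """Put length on roughly the same 0-1 scale as the other two features."""
    length, conservation, has_domain = features
    return (length / 100.0, conservation, float(has_domain))


def predict_stability(features, weights, bias):
    """Return the modeled probability that a microprotein is stable."""
    x = scale(features)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit(data, epochs=2000, learning_rate=0.5):
    """Fit weights and bias by batch gradient descent on the logistic loss."""
    weights = [0.0, 0.0, 0.0]
    bias = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0, 0.0, 0.0]
        grad_b = 0.0
        for features, label in data:
            x = scale(features)
            error = predict_stability(features, weights, bias) - label
            for j in range(3):
                grad_w[j] += error * x[j] / n
            grad_b += error / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


weights, bias = fit(samples)

# Score two made-up microproteins: one long, conserved and with a domain; one short without.
for candidate in [(85, 0.7, 1), (15, 0.1, 0)]:
    print(candidate, round(predict_stability(candidate, weights, bias), 3))
```

In a sketch like this, each fitted weight indicates how strongly its feature pushes the predicted probability toward "stable," which is the sense in which such a model can explain why some microproteins are detectable and others are not.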

After validating the first round of findings by selectively expressing the microproteins in cultured cells, Ji and his colleagues suggested that about 4,000 microproteins may be stably expressed in humans, he said.

“A surprising finding of our work is that most of these stable human microproteins have longer lengths (generally >60 amino acids) but are poorly conserved across mammals. This means that a lot of these species-specific young proteins in the cell encoded by our genome have been ignored by the literature,” Ji said. “This work basically opens up the possibility for us to characterize these microproteins. We found some of them are human-specific, some can be mouse-specific, so this suggests there are potentially thousands of functional microproteins in these different species that we know very little about.”

Building on this discovery, Ji and his collaborators will continue to study microproteins to see which ones are translated into fully functioning proteins within human cells, he said.

“We also plan to study their functions in different biological conditions such as immunology, cancer and neuronal disorders,” Ji said. “Microproteins can all play functional roles there.”

The study was supported by National Institutes of Health grants R35GM138192, R01HL161389, and R00CA207865. Additional funding was provided by the Lynn Sage Scholar fund and the Predoctoral Training Program in Biomedical Data Driven Discovery.