
Solving the bottleneck in genome biology

June 25, 2025

Whole genomes—complete sets of an organism’s DNA—can now be assembled in anywhere from a couple of days to a couple of minutes, thanks to today’s computational power and advanced algorithms. But to understand gene function, study evolutionary relationships, and advance medical and agricultural research, geneticists need to go a step further to accurately identify genes and other important biological features through a slow, tedious process called genome annotation.

Supported by the National Institutes of Health, Johns Hopkins researchers have created LiftOn, a new software tool that can transfer annotations between the genomes of different species to map out new genomes far more quickly than current methods. Their work appears in Genome Research.

Although promising, this “shortcut” provided by genome annotation transfer (the act of using one genome’s annotations on another genome) has historically been limited by its accuracy.

“Existing tools often struggle to transfer annotations between genomes with significant divergence, such as those of distinct species,” explains first author Kuan-Hao Chao, a fourth-year doctoral candidate working in the labs of two biomedical engineering faculty members: Steven Salzberg, Bloomberg Distinguished Professor of Computational Biology and Genomics, and Mihaela Pertea, associate professor of biomedical engineering. “The key question is how to improve the reliability and comprehensiveness of genome annotation transfer, particularly for protein-coding genes, across both closely related and distantly related species,” says Chao.

Rather than starting from scratch, Chao and his team in the Center for Computational Biology (CCB) combined the strengths of two existing alignment-based tools.

They began with Liftoff, an efficient and cost-effective annotation “lift-over” tool developed by Alaina Shumate, Med ’22 (PhD), and Salzberg, who is the director of the CCB. With the help of the DNA alignment program Minimap2, Liftoff can transfer annotations from a well-annotated genome to a newly sequenced one based on similarities in their DNA sequences.

According to the research team, LiftOn combines Liftoff with miniprot, a protein-based aligner, to generate better protein-coding gene annotations than either alignment method can achieve on its own. LiftOn’s protein-maximization algorithm allows it to identify optimal coding sequences, resolve overlapping annotations, and detect additional gene copies, overall enhancing its ability to transfer genome annotations.
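As a rough illustration of the general idea only, and not LiftOn’s actual implementation, the sketch below picks between a DNA-based and a protein-based candidate annotation by asking which one yields a protein closer to the reference protein. The function names, the toy sequences, and the use of Python’s `difflib` as a stand-in for a real protein alignment score are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def protein_identity(candidate: str, reference: str) -> float:
    """Return a rough similarity in [0, 1] between two protein sequences.

    Assumption: difflib's ratio stands in for a real alignment-based
    identity score, which a production tool would compute differently.
    """
    if not candidate or not reference:
        return 0.0
    return SequenceMatcher(None, candidate, reference, autojunk=False).ratio()


def pick_annotation(reference_protein: str,
                    dna_based_protein: str,
                    protein_based_protein: str) -> tuple[str, float]:
    """Choose the candidate whose protein best matches the reference.

    Hypothetical helper: the DNA-based candidate is kept on ties.
    """
    dna_score = protein_identity(dna_based_protein, reference_protein)
    prot_score = protein_identity(protein_based_protein, reference_protein)
    if prot_score > dna_score:
        return "protein-based", prot_score
    return "dna-based", dna_score


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    reference = "MKTAYIAKQR"
    source, score = pick_annotation(reference, "MKT", "MKTAYIAKQR")
    print(source, round(score, 2))
```

Scoring each candidate against the reference protein, rather than trusting one aligner outright, is what lets a combined approach do at least as well as either input on any given gene under this simplified model.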

The researchers first evaluated LiftOn on same-species datasets, such as human, mouse, honeybee, and rice genomes. They then tested the tool’s ability to transfer annotations across more distantly related species, achieving high mapping rates between human and chimp (98.7%), two types of fruit flies (94.5%), and mouse and rat genomes (94.3%). The resulting protein-coding gene annotations were more comprehensive than either DNA- or protein-based methods could achieve alone.

“Our experimental study demonstrates that incorporating protein sequence alignment into the annotation process substantially improves accuracy, even between divergent species,” Chao says. “LiftOn—the first tool to do so—reliably outperforms other methods that rely solely on DNA or proteins for mapping annotations from one genome to another.”

The researchers plan to expand LiftOn’s current capabilities by incorporating RNA sequencing assembly data, using multiple sources for genome annotation transfer, and optimizing it for faster runtimes.

“Our goal is to apply LiftOn to a broader range of species and genome projects to further validate and refine this important tool,” Chao says. “Making genome annotations widely available gives scientists a detailed map of an organism’s genetic information, helping them find disease-causing mutations more quickly and better understand genetic differences between species and individuals.”

Additional authors of this work from Johns Hopkins include Shumate; Pertea; Salzberg; Celine Hoh, a PhD student in the Department of Computer Science; and Alan Mao, Engr ’25, Peab ’25. Jakob M. Heinz, a doctoral student at Harvard Medical School, also contributed to this research.

Learn more by watching Chao’s presentation of this work at the 2024 International Conference on Intelligent Systems for Molecular Biology.

This story originally appeared on the Department of Computer Science’s website.

Associated Faculty: Steven L. Salzberg, Mihaela Pertea
