Why is this work important?
We really don’t know how many people are walking around with a genetic aberration that is causing them health issues. They go completely undiagnosed, meaning we cannot find the genetic cause of their problems. Everyone has around 50,000 variants that are rare in the population, and we have absolutely no idea what most of [those variants] are doing. If we collect gene expression data, which indicate which proteins are being produced in a patient’s cells and at what levels, we’re going to be able to identify what’s going on at a much higher rate.
How does your software tackle this challenge?
Our computational system, called Watershed, scours reams of genetic data along with gene expression data to predict the functions of variants in individuals’ genomes. We validated these predictions experimentally and applied the findings to assess the rare variants captured in massive genetic studies, such as the UK Biobank, the Million Veteran Program, and the Jackson Heart Study. What we found has helped reveal which rare variants may be having an impact. Our results were published in Science in September and are part of the National Institutes of Health’s Genotype-Tissue Expression project.
Does this work advance the field of personal genomics?
Yes, I think that’s fair to say. Characterizing rare variants in the noncoding parts of the genome, which are not currently evaluated, represents an important advance in the field of personal genomics, which focuses on the sequencing and analysis of individuals’ genomes. Any improvement we can make in this area has implications for public health; even identifying the genetic cause of an illness gives parents and patients a huge sense of relief and understanding, and can point to possible therapeutics.