Despite improvements in genomic sequencing technologies, a significant number of medically important genes are within genomic regions in which sequencing lacks the technical accuracy to generate reliable reads, according to a report in Genome Medicine.1
This study examined the relationship between human genome complexity and gene variants associated with human disease. Researchers mapped regions of disease relevance to regions of high or low confidence. These benchmark regions were used to determine the sensitivity and positive predictive value of 2 sequencing pipelines for specific categories of variation.
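For readers unfamiliar with the two metrics, the short Python sketch below shows how sensitivity and positive predictive value are conventionally calculated from counts of true-positive, false-positive, and false-negative variant calls. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the study, and the study's own pipelines may compute these values differently.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real (benchmark) variants that the pipeline detected."""
    return true_positives / (true_positives + false_negatives)


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Fraction of the pipeline's variant calls that are real."""
    return true_positives / (true_positives + false_positives)


# Hypothetical counts for illustration only; not data from the study.
tp, fp, fn = 950, 25, 50

sens = sensitivity(tp, fn)               # 950 / (950 + 50)
ppv = positive_predictive_value(tp, fp)  # 950 / (950 + 25)

print(f"sensitivity = {sens:.3f}, PPV = {ppv:.3f}")
```

A pipeline can score well on one metric and poorly on the other, which is why the two are reported separately for each category of variation.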
Researchers used a gold-standard genome sequence from the United States National Institute of Standards and Technology (NIST). This genome had previously been sequenced by 5 different sequencing technologies. NIST combined all sequencing results to generate a reliable consensus sequence for 77% of the genome.
The results indicated that the accuracy of gene variant information depended on the genomic region, the type of gene variant, and the sequencing read depth. Accuracy also varied with the sequencing pipeline from which the data were obtained.
Of 3300 genes that are known to cause human disease and that overlap with the high-confidence regions of the genome, 593 (18%) contained large protein-coding regions that could not be reliably sequenced. Remarkably, only 990 medically important genes in the human genome were found entirely within high-confidence, reliable regions.
Sites prone to sequencing errors existed by the thousands in publicly available gene variant databases, and a false-positive genetic variant was found in the BRCA2 gene.
“As this technology moves from the research lab to the clinic, we need to be able to accurately and reliably sequence entire genomes, because incorrect sequence information can lead to inappropriate medical care,” said Rachel Goldfeder, a graduate student in the Department of Medicine and the Stanford Center for Inherited Cardiovascular Disease at Stanford University, in California, and lead author on the study.
“The good news is that, in this case, 77% of the donor’s genome was reliably sequenced using current methods. The challenge now is to focus our efforts on the other 23%, namely, on regions of the genome that remain elusive. Only then can we realize the full potential of precision medicine.”
1. Goldfeder RL, Priest JR, Zook JM, et al. Medical implications of technical accuracy in genome sequencing. Genome Med. doi:10.1186/s13073-016-0269-0.