Thousands of never-before-seen genetic variants in the human genome have been uncovered by researchers using a new genome sequencing technology.
The discoveries close many gaps in the map of the human genome that have long resisted sequencing.
The technique, called single-molecule, real-time DNA sequencing (SMRT), may now make it possible for researchers to identify potential genetic mutations behind many conditions whose genetic causes have long eluded scientists, said Evan Eichler, professor of genome sciences at the University of Washington, who led the team that conducted the study.
“We now have access to a whole new realm of genetic variation that was opaque to us before,” Eichler said.
To date, scientists have been able to identify the genetic causes of only about half of inherited conditions. This puzzle has been called the “missing heritability problem.”
One reason for this problem may be that standard genome sequencing technologies cannot map many parts of the genome precisely.
The standard approach has made it possible to identify very small variations, such as changes to a single base, as well as very large variations, typically involving segments of DNA that are 5,000 bases long or longer.
But for technical reasons, scientists had previously not been able to reliably detect variations whose lengths are in between – those ranging from about 50 to 5,000 bases in length.
The SMRT technology used in the new study makes it possible to sequence and read DNA segments longer than 5,000 bases, far longer than standard sequencing technology can read.
This “long-read” technique allowed the researchers to create a map of structural variation in the genome at a much higher resolution than has previously been achieved.
Mark Chaisson, a postdoctoral fellow in Eichler’s lab and lead author of the study, developed the method that made it possible to detect structural variants at base-pair resolution using these data.
To simplify their analysis, researchers used the genome from a hydatidiform mole, an abnormal growth caused when a sperm fertilises an egg that lacks the DNA from the mother.
Because the mole genome contains only one copy of each gene, instead of the two copies that exist in a normal cell, the search for genetic variation is simpler.
Using the new approach on the hydatidiform mole genome, the researchers were able to identify and sequence 26,079 segments that differed from the standard human reference genome used in genome research. Most of these variants, about 22,000, have never been reported before, Eichler said.
“These findings suggest that there is a lot of variation we are missing,” he said.
The findings were published in the journal Nature.
