The time and cost of sequencing entire human genomes have dropped dramatically in the 21st century. Unfortunately, it can take months to analyze the results, given that there are 3.2 billion base pairs to decode. (As a result, it has become popular to focus instead on the less than 2 percent of the genome that codes for proteins, in a process called exome sequencing.)
Now researchers out of the University of Chicago are reporting in the journal Bioinformatics that one of the world’s fastest supercomputers devoted to life sciences can analyze 240 full genomes in 50 hours. They estimate that the same task would have taken a single 2.1 GHz CPU more than 47 years.
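To put those two figures side by side, here is a rough back-of-the-envelope calculation. It is only a sketch: it treats "more than 47 years" as exactly 47 years (so the results are lower bounds), assumes an average year of 365.25 days, and uses variable names invented for illustration.

```python
# Back-of-the-envelope comparison of the two figures quoted above.
# Assumptions (not from the study): a year is 365.25 days, and
# "more than 47 years" is treated as exactly 47, so these are lower bounds.
HOURS_PER_YEAR = 365.25 * 24

genomes = 240            # genomes analyzed in the reported run
wall_clock_hours = 50    # reported supercomputer time
single_cpu_years = 47    # estimated time on one 2.1 GHz CPU

single_cpu_hours = single_cpu_years * HOURS_PER_YEAR

# How many times faster the supercomputer run was than a single CPU.
speedup = single_cpu_hours / wall_clock_hours

# Single-CPU compute time attributable to each genome, in days.
cpu_days_per_genome = single_cpu_hours / genomes / 24

# Wall-clock time per genome when the 50 hours are spread over all 240.
minutes_per_genome = wall_clock_hours * 60 / genomes

print(f"Implied speedup: at least {speedup:,.0f}x")
print(f"Single-CPU time per genome: about {cpu_days_per_genome:.1f} days")
print(f"Amortized supercomputer time per genome: {minutes_per_genome:.1f} minutes")
```

Under these assumptions, the supercomputer run works out to roughly 8,200 times faster than a single CPU, which would need about 71.5 days per genome.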
To be clear, tapping the extraordinary powers of the Beagle (named after the ship that carried the then-22-year-old Charles Darwin on his scientific voyage around the world in 1831) isn’t going to keep the cost of whole-genome analysis down, and low cost is of course imperative if the tech is to be clinically useful. It does, however, demonstrate that genome analysis has the potential to be far, far faster than it is today.
“The supercomputer can process many genomes simultaneously rather than one at a time,” first author Megan Puckelwartz, a postdoctoral fellow at the University of Chicago, said in a school news release. “It converts whole genome sequencing, which has primarily been used as a research tool, into something that is immediately valuable for patient care.”
The researchers report that they used one quarter of the Beagle’s operating capacity in conjunction with commercially available software packages to analyze raw sequencing data from 61 human genomes, and that this approach dramatically improved not only speed but also accuracy. This will presumably help reduce the cost of both sequencing and analyzing whole genomes down the road.
In fact, study author Elizabeth McNally, director of the Cardiovascular Genetics clinic at the University of Chicago Medicine, said in the news release that if the cost of analysis can be brought into the $1,000 range, which is the current target on the sequencing side, it will make sense to analyze entire genomes instead of just a small fraction of them.
Because exome sequencing can help spot an estimated 85 percent