Fujitsu Laboratories develops new technology that accelerates database analyses of genomic information

Fujitsu Laboratories Ltd. announced the development of a technology that accelerates database analyses of the correlations between genomic variations and environmental information, such as disease and lifestyle habits. This technology speeds up the process by a factor of roughly 400 compared to existing methods.

Thanks to advances in genomic medicine, it is possible to analyze genomic and genetic information in combination with clinical and environmental information to study the relationship between genetic factors and environmental factors. This kind of research analyzes genomic information stored in databases from different perspectives, but because of the massive volumes of genomic information being handled, processing takes a long time.

Fujitsu Laboratories has greatly accelerated analysis processing by introducing a new data structure that makes it possible to rapidly analyze large-scale genomic information within a database.

This technology makes it possible to acquire knowledge that previously was difficult to obtain quickly, aiding the advance of genomic medical research.

Details of this technology are being presented at the 19th International Conference on Extending Database Technology (EDBT 2016), opening March 15 in Bordeaux, France.

The advent of next-generation sequencers, which quickly read enormous volumes of genomic information, has opened up the possibility of measuring and analyzing a genome to reveal what diseases a person might be susceptible to, to predict a patient’s response to a drug and the drug’s side effects, and to design personalized preventive and therapeutic treatments. Making effective use of genomic medicine will require studying and understanding the relationship between genomic information and clinical and environmental information.

With a person’s entire genome being approximately three billion bases in length, there can be tens of millions of variations, known as “variants,” that can account for differences between individuals. With type-2 diabetes, for example, there are dozens of variants and several lifestyle habits that are known to cause the disease, and there may be synergies among these factors. One method for gaining such insights is the genome-wide association study, where a huge volume of genomic information and clinical and environmental information is collected and subjected to statistical analysis.
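To make the idea of a per-variant statistical analysis concrete, here is a minimal sketch of one common kind of association test: a Pearson chi-square statistic on a 2x2 table of variant carriers versus non-carriers among cases and controls. The announcement does not say which statistics Fujitsu Laboratories computes, so the choice of test, the function name, and the counts below are all illustrative assumptions.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout (an assumption for illustration):
        a = cases carrying the variant,    b = cases without it
        c = controls carrying the variant, d = controls without it
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / denom


# Made-up counts: 30 of 100 cases and 20 of 100 controls carry the variant.
stat = chi_square_2x2(30, 70, 20, 80)
print(f"chi-square statistic: {stat:.3f}")
```

A genome-wide study would repeat a test of this kind for every variant, which is why the cost of aggregating the counts for each variant dominates the running time.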

Issues

Aggregating data on a single variant across a population of 100,000 people takes about one second of processing time using existing open-source database software (according to Fujitsu Laboratories’ research). Accordingly, for a single disease, aggregating variants at 10 million loci in a study population of 100,000 people would take roughly 120 days. Genome-wide association studies require multiple iterations of this kind of analysis, making improvements in processing speed a pressing issue.
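The estimate above can be checked with simple arithmetic. The sketch below uses the figures stated in the announcement (about one second per variant, 10 million loci); the function name is hypothetical. Ten million seconds comes to just under 116 days, which the announcement rounds to roughly 120. The last lines apply the reported 400-fold speedup uniformly to the whole run, which is an assumption made here for illustration rather than a result stated in the announcement.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_days(num_loci, seconds_per_variant=1.0):
    """Days needed to aggregate every locus one after another."""
    return num_loci * seconds_per_variant / SECONDS_PER_DAY


# Figures from the announcement: 10 million loci at about 1 second each.
baseline_days = total_days(10_000_000)

# Assumption: the reported ~400x speedup applies evenly to the whole run.
accelerated_hours = baseline_days * 24 / 400

print(f"baseline: {baseline_days:.1f} days")
print(f"with 400x speedup: {accelerated_hours:.1f} hours")
```

Under that assumption, a run that would take months with existing software would finish in a matter of hours.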

[“Source-news-medical”]