Indexing Finite Language Representation of Population Genotypes

Lecturer: Prof. Veli Mäkinen
Event type: HIIT seminar
Event time: 2012-03-19 13:15 to 14:00
Place: Lecture hall T2, ICS department
Description: 

Abstract:

With recent advances in DNA sequencing, it is now possible to have the complete genomes of individuals sequenced and assembled. This rich and focused genotype information can be used for different population-wide studies, now for the first time directly at the whole-genome level. We propose a way to index population genotype information together with the complete genome sequence, so that the index can be used to efficiently align a given sequence to the genome with all plausible genotype recombinations taken into account. This is achieved by converting a multiple alignment of individual genomes into a finite automaton that recognizes all strings that can be read from the alignment by switching the sequence at any time. The finite automaton is indexed with an extension of the Burrows-Wheeler transform to allow pattern search inside the plausible recombinant sequences. The size of the index stays limited because of the high similarity of individual genomes. The index finds applications in variation calling and in primer design. In a variation calling experiment, we found that about 1.0% of the matches were to novel recombinants with exact matching alone, and up to 2.4% with approximate matching.
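
The conversion from a multiple alignment to an automaton can be illustrated with a small sketch. This is not the indexing method of the talk: it replaces the Burrows-Wheeler-based index with a naive traversal and only shows which strings the automaton accepts. It assumes that identical characters in the same alignment column are merged into one state and that '-' marks a gap; the function names build_automaton and occurs are hypothetical.

```python
from collections import defaultdict


def build_automaton(alignment):
    """Build a toy automaton from equal-length alignment rows.

    Assumption: a state is a (column, character) pair, so identical
    characters in the same column are merged. Gap symbols '-' are skipped.
    Returns the set of states and a mapping state -> successor states.
    """
    nodes = set()
    edges = defaultdict(set)
    for row in alignment:
        prev = None
        for col, ch in enumerate(row):
            if ch == "-":
                continue  # a gap contributes no state
            node = (col, ch)
            nodes.add(node)
            if prev is not None:
                edges[prev].add(node)  # follow this row to its next character
            prev = node
    return nodes, edges


def occurs(nodes, edges, pattern):
    """Naive check (no index): does the pattern label some path?"""
    if not pattern:
        return True
    # Start from every state whose character matches the first symbol.
    frontier = {n for n in nodes if n[1] == pattern[0]}
    for ch in pattern[1:]:
        frontier = {m for n in frontier for m in edges.get(n, ()) if m[1] == ch}
        if not frontier:
            return False
    return bool(frontier)


# Two aligned toy "genomes" that differ in columns 1 and 3.
nodes, edges = build_automaton(["ACGTA", "ATGCA"])
print(occurs(nodes, edges, "ACGTA"))  # a row of the alignment
print(occurs(nodes, edges, "ACGCA"))  # a recombinant: switches rows at column 2
print(occurs(nodes, edges, "ACCTA"))  # not readable from the alignment
```

In this toy example the string ACGCA is found although it is not a row of the alignment, because a path may enter the shared state for G along one row and leave it along the other. Merging shared columns also keeps the number of states close to the length of a single sequence when the rows are highly similar.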

Joint work with Jouni Sirén and Niko Välimäki.

Bio:

Veli Mäkinen completed his PhD in computer science in 2003 at the University of Helsinki. Since then he has worked mainly in research projects funded by the Academy of Finland, as a Postdoctoral Research Fellow (2005-2007) and as an Academy Research Fellow (2007-2010). In 2004-2005 he worked as a Postdoctoral Researcher at Bielefeld University / Center for Biotechnology, Germany. In 2010, he was appointed Professor of computer science at the University of Helsinki, specializing in data analysis and computational modeling of biological systems. He now heads the Genome-scale algorithmics research group, which belongs to the new Center of Excellence in Cancer Genetics Research.


Last updated on 11 May 2012 by Webmaster - Page created on 5 Mar 2012 by Sohan Seth