Vineet Bafna

University of California San Diego

Haplotype Assembly

 

The availability of high-density SNP chips has empowered whole-genome association scans for many common diseases. However, current genotyping methods do not reveal haplotypes: the combinations of alleles at neighboring SNPs on a single chromosome that tend to be inherited together. Knowledge of haplotypes is important for fine-scale mapping of disease-related variants and for understanding the role of different evolutionary mechanisms (meiotic recombination, positive selection) in shaping human genetic variation. Statistical methods based on population genotype analysis can be used, but they are limited for long-range haplotyping.

 

Parallel advances in sequencing technologies now allow sequencing of individual genomes; they also enable haplotyping. In this talk, I will describe combinatorial and stochastic (Markov Chain Monte Carlo) algorithms for reconstructing long and accurate haplotypes from whole-genome sequence data for an individual (J. Craig Venter). While the overall method is a heuristic one, relying on computing cuts in an associated graph, it is motivated by a theoretical analysis of the mixing time for representative Markov chains, where we show that two similar Markov chains have very different mixing properties. Experimental results on the Venter data and simulations validate the power of our approach to haplotype assembly.

 

(Joint work with Vikas Bansal (UCSD) and Aaron Halpern (JCVI))

 

****************

Vineet Bafna is an Associate Professor in the Computer Science Department at UCSD. Prior to joining UCSD in 2003, he spent seven years in the bio-science industry, ultimately as Director of Informatics Research at Celera Genomics. At Celera, he participated in the human genome project, designing novel tools for gene discovery and leading the analysis of mass spectrometry data for identifying cancer bio-markers. His current research focus is on computational problems arising in mass spectrometric data analysis, population genetics, non-coding genes, and cancer genomics. He is an Associate Editor for JBCB, IEEE TCBB, and Biology Direct, and has served on the program committees of ISMB, RECOMB, and other conferences. He has co-authored over seventy research articles in refereed journals and conference proceedings.

 

4:15pm

B17 Upson Hall

Thursday, November 20, 2008

Refreshments at 3:45pm in the Upson 4th Floor Atrium

Computer Science

Colloquium

Fall 2008

www.cs.cornell.edu/events/colloquium