Thursday, February 26, 2004
B17 Upson Hall
Probabilistic Models for Identifying Regulation Networks
Microarray-based hybridization techniques make it possible to measure the expression levels of thousands of genes simultaneously. Such measurements contain information about many different aspects of gene regulation and function, and indeed this type of experiment has become a central tool in biological research. A major computational challenge is finding ways to extract new biological understanding from this wealth of data.
Our goal is to uncover the causal structure of the interactions between genes, with the aim of understanding the regulatory processes that bring about the observed expression patterns. I will argue that one way of pursuing this goal is to adopt a Bayesian framework, in which we treat the measured expression level of each gene as a random variable and each regulatory interaction as a probabilistic dependency between such variables. In my talk, I will describe an ongoing project that uses Bayesian networks and their extensions to model such dependencies.
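As a rough illustration of this framing (not the models presented in the talk), the sketch below builds a hypothetical three-gene Bayesian network in which a regulator A influences two targets, B and C. The gene names, the discretization into low/high expression, and all probability values are assumptions made up for the example. It shows how the joint distribution factorizes along the network, and how B and C are dependent marginally but independent once A is known.

```python
from itertools import product

# Hypothetical toy network: regulator gene A influences targets B and C.
# Expression levels are discretized: 0 = low, 1 = high.
# All probabilities below are invented for illustration only.
P_A = {0: 0.6, 1: 0.4}
P_B_GIVEN_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
P_C_GIVEN_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.1, 1: 0.9}}


def joint(a, b, c):
    """Bayesian network factorization: P(a, b, c) = P(a) P(b | a) P(c | a)."""
    return P_A[a] * P_B_GIVEN_A[a][b] * P_C_GIVEN_A[a][c]


def marginal(**fixed):
    """Sum the joint over every assignment consistent with the fixed values."""
    total = 0.0
    for a, b, c in product((0, 1), repeat=3):
        assignment = {"a": a, "b": b, "c": c}
        if all(assignment[name] == value for name, value in fixed.items()):
            total += joint(a, b, c)
    return total


# Marginally, B and C are dependent because they share the regulator A.
p_b1 = marginal(b=1)
p_c1 = marginal(c=1)
p_b1_c1 = marginal(b=1, c=1)

# Conditioned on A, the dependency between B and C disappears.
p_a1 = marginal(a=1)
p_b1_given_a1 = marginal(a=1, b=1) / p_a1
p_c1_given_a1 = marginal(a=1, c=1) / p_a1
p_b1_c1_given_a1 = marginal(a=1, b=1, c=1) / p_a1

print("P(B=1, C=1) vs P(B=1) P(C=1):", p_b1_c1, p_b1 * p_c1)
print("P(B=1, C=1 | A=1) vs product:", p_b1_c1_given_a1,
      p_b1_given_a1 * p_c1_given_a1)
```

In this toy setting the co-expression of B and C is explained entirely by their common regulator, which is the kind of structure a learned network is meant to expose.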
I will explain the basic foundations of the approach, the possible choices in defining the modeling language, the methods we use to learn models from data, and finally how we interpret the learned models. This last stage includes validating the models against known biology and constructing new hypotheses about the role of unknown genes. I will present a progression of models that capture different aspects of gene regulation, and an assessment of their performance on several large-scale yeast gene expression experiments.
This is joint work with Dana Pe'er, Iftach Nachman, Aviv Regev, Gal Elidan, Eran Segal, Micha Shapira, David Botstein, and Daphne Koller.