PAG-XIII  Plant & Animal Genomes XIII Conference

January 15-19, 2005
Town & Country Convention Center
San Diego, CA



P866 : Algorithms


Bayesian Poisson Modeling Strategies For Evaluating Differential Gene Expression In cDNA Tag Sampling Experiments

Xiao-Lin Wu1, Shizhong Xu2, Zhihua Jiang1

1  Department of Animal Sciences, Washington State University, WA 99164-6351, USA
2  Department of Botany and Plant Sciences, University of California, CA 92521, USA

Evaluation of differential gene expression is an important step toward defining transcriptomes in specific tissue/cell types. In the past, cDNA tag-counting methods, such as EST (expressed sequence tag)-based in silico gene profiling or serial analysis of gene expression (SAGE), have been used as efficient tools for whole-genome transcriptome analysis. Although a variety of statistical methods have been proposed, questions have been raised regarding their simplified underlying assumptions and thus their statistical robustness and application context. In the present research, we explored various Bayesian Poisson modeling strategies for evaluating differential gene expression in cDNA tag sampling experiments using simulated data. We modeled differential gene expression either as the difference of two independent Poisson random variables or as a bivariate Poisson difference. Theoretically, Poisson distributions describe the discrete, relatively low-frequency nature of cDNA tag data better than approximating normal distributions do. The use of bivariate Poisson distributions allows us to model correlation in evaluating gene expression, whereas ignoring the correlation leads to model misspecification and over-estimation of model parameters. Furthermore, a homogeneous Poisson model tends to underestimate the observed dispersion when population/tissue/cell heterogeneity exists. In practice, such heterogeneity is often unobserved yet can be inferred from an excess of zero values over the expected frequency. The zero-inflated Poisson (ZIP) model successfully handles situations where the assumption of a homogeneous Poisson distribution is clearly violated. As a special mixture model, the ZIP approach has significantly improved estimation efficiency by putting optimal weights on the various components of the mixture.
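
As an illustration only, the two distributional ideas named above can be sketched in a few lines of Python. This is not the authors' Bayesian procedure, which the abstract does not specify; it merely evaluates (a) the probability mass function of the difference of two independent Poisson counts and (b) a zero-inflated Poisson mixture with its mean and variance. All function names, the truncation bound k_max, and the example rates are assumptions made for this sketch.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), lam > 0, computed in log space."""
    if k < 0:
        return 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_diff_pmf(d, lam1, lam2, k_max=200):
    """P(X1 - X2 = d) for independent X1 ~ Poisson(lam1), X2 ~ Poisson(lam2).

    Sums P(X1 = k + d) * P(X2 = k) over k, truncated at k_max. The
    truncation is an assumption that is adequate only for small rates
    such as the ones used in the example below.
    """
    start = max(0, -d)
    return sum(poisson_pmf(k + d, lam1) * poisson_pmf(k, lam2)
               for k in range(start, k_max + 1))


def zip_pmf(k, pi0, lam):
    """P(X = k) under a zero-inflated Poisson: a point mass at zero with
    weight pi0, mixed with Poisson(lam) with weight 1 - pi0."""
    point_mass = pi0 if k == 0 else 0.0
    return point_mass + (1.0 - pi0) * poisson_pmf(k, lam)


def zip_mean_var(pi0, lam):
    """Closed-form mean and variance of the zero-inflated Poisson."""
    mean = (1.0 - pi0) * lam
    var = mean * (1.0 + pi0 * lam)
    return mean, var


if __name__ == "__main__":
    # Hypothetical tag rates for one gene in two cDNA libraries.
    lam1, lam2 = 3.0, 5.0
    p_more_in_1 = sum(poisson_diff_pmf(d, lam1, lam2) for d in range(1, 61))
    print("P(count1 > count2):", p_more_in_1)

    # Hypothetical zero-inflation weight and Poisson rate.
    pi0, lam = 0.3, 2.0
    mean, var = zip_mean_var(pi0, lam)
    print("ZIP mean, variance:", mean, var)
    print("ZIP P(0):", zip_pmf(0, pi0, lam))
    print("Poisson P(0) at the same mean:", poisson_pmf(0, mean))
```

In this sketch the ZIP variance exceeds its mean whenever pi0 > 0, and the ZIP places more mass at zero than a homogeneous Poisson with the same mean, which is the excess-zero signal the abstract refers to.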