
Bayesian hierarchical modeling and selection of differentially expressed genes for the EST data. (English) Zbl 1216.62182

Summary: Expressed sequence tag (EST) sequencing is a one-pass sequencing reading of cloned cDNAs derived from a certain tissue. The frequency of unique tags among different unbiased cDNA libraries is used to infer the relative expression level of each tag. We propose a hierarchical multinomial model with a nonlinear Dirichlet prior for the EST data with multiple libraries and multiple types of tissues. A novel hierarchical prior is developed and the properties of the proposed prior are examined. An efficient Markov chain Monte Carlo algorithm is developed for carrying out the posterior computation. We also propose a new selection criterion for detecting which genes are differentially expressed between two tissue types. Our new method with the new gene selection criterion is demonstrated via several simulations to have low false negative and false positive rates. A real EST data set is used to motivate and illustrate the proposed method.
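As a rough illustration of the general idea only, the sketch below uses a much simpler stand-in than the model summarized above: a conjugate multinomial model with a symmetric Dirichlet prior, fitted separately to tag counts pooled within each tissue type, followed by a Monte Carlo estimate of the posterior probability that a gene's relative abundance is higher in one tissue than in the other. It is not the authors' nonlinear Dirichlet prior, MCMC algorithm, or selection criterion, and pooling ignores the between-library variation that a hierarchical model is meant to capture. The function names, the prior value, and the toy counts are all assumptions made for this example.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def posterior_prob_higher(counts_a, counts_b, prior=1.0, n_draws=2000, seed=0):
    """Estimate, per gene, P(proportion in tissue A > proportion in tissue B | data).

    counts_a, counts_b: per-gene tag counts pooled over each tissue's libraries.
    prior: symmetric Dirichlet concentration (an assumption, not the paper's prior).
    """
    rng = random.Random(seed)
    # Conjugate update: Dirichlet prior + multinomial counts -> Dirichlet posterior.
    post_a = [c + prior for c in counts_a]
    post_b = [c + prior for c in counts_b]
    wins = [0] * len(counts_a)
    for _ in range(n_draws):
        pa = sample_dirichlet(post_a, rng)
        pb = sample_dirichlet(post_b, rng)
        for g in range(len(wins)):
            if pa[g] > pb[g]:
                wins[g] += 1
    return [w / n_draws for w in wins]

# Toy counts, made up for illustration: gene 0 is far more abundant in tissue A.
counts_a = [60, 20, 20, 900]
counts_b = [5, 22, 18, 955]
probs = posterior_prob_higher(counts_a, counts_b)
```

In this simplified setting, a gene could be flagged when its estimated probability is close to 0 or 1; any such threshold is a placeholder for illustration and does not correspond to the selection criterion proposed in the paper.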

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62F15 Bayesian inference
92C40 Biochemistry, molecular biology
65C40 Numerical analysis or methods applied to Markov chains
65C60 Computational problems in statistics (MSC2010)
