Nonparametric Bayesian sparse factor models with application to gene expression modeling. (English) Zbl 1223.62013

Summary: A nonparametric Bayesian extension of Factor Analysis (FA) is proposed where observed data \(\mathbf Y\) is modeled as a linear superposition, \(\mathbf G\), of a potentially infinite number of hidden factors, \(\mathbf X\). The Indian Buffet Process (IBP) is used as a prior on \(\mathbf G\) to incorporate sparsity and to allow the number of latent features to be inferred. The model’s utility for modeling gene expression data is investigated using randomly generated data sets based on a known sparse connectivity matrix for E. coli, and on three biological data sets of increasing complexity.
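As an illustration of the generative structure described in the summary, the following is a minimal sketch that draws a binary sparsity pattern from the IBP and simulates data as \(\mathbf Y = \mathbf G \mathbf X + \text{noise}\). Several details are assumptions made for the sketch and are not stated in the summary: the function names `sample_ibp` and `simulate_sparse_factor_data`, the construction of \(\mathbf G\) as an elementwise product of the IBP mask with standard Gaussian weights, the standard Gaussian factors, the isotropic Gaussian noise, and all numerical settings. It covers forward simulation only, not posterior inference.

```python
import numpy as np


def sample_ibp(num_rows, alpha, rng):
    """Draw a binary matrix from the Indian Buffet Process (sequential scheme)."""
    counts = []  # counts[k] = number of earlier rows that use feature k
    rows = []
    for i in range(num_rows):
        # Reuse each existing feature with probability counts[k] / (i + 1).
        row = [int(rng.random() < m / (i + 1)) for m in counts]
        counts = [m + z for m, z in zip(counts, row)]
        # Open Poisson(alpha / (i + 1)) brand-new features for this row.
        num_new = int(rng.poisson(alpha / (i + 1)))
        row.extend([1] * num_new)
        counts.extend([1] * num_new)
        rows.append(row)
    # Pad earlier rows with zeros for features that appeared later.
    Z = np.zeros((num_rows, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z


def simulate_sparse_factor_data(num_genes, num_samples, alpha, noise_std, rng):
    """Simulate Y = G X + noise with an IBP-distributed sparsity mask on G."""
    Z = sample_ibp(num_genes, alpha, rng)       # which gene loads on which factor
    num_factors = Z.shape[1]                    # not fixed in advance
    W = rng.normal(size=(num_genes, num_factors))
    G = Z * W                                   # assumed: mask times Gaussian weights
    X = rng.normal(size=(num_factors, num_samples))
    noise = noise_std * rng.normal(size=(num_genes, num_samples))
    Y = G @ X + noise
    return Y, G, X, Z


rng = np.random.default_rng(0)
Y, G, X, Z = simulate_sparse_factor_data(
    num_genes=50, num_samples=20, alpha=3.0, noise_std=0.1, rng=rng
)
print("factors drawn:", Z.shape[1], "| nonzero loadings:", np.count_nonzero(G))
```

In this sketch the number of columns of `Z` is an outcome of the draw rather than an input, which mirrors how the IBP prior lets the number of latent features be inferred; larger `alpha` tends to produce more factors.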


62F15 Bayesian inference
62H25 Factor analysis and principal components; correspondence analysis
62G99 Nonparametric inference
65C40 Numerical analysis or methods applied to Markov chains




