Theory and computations for the Dirichlet process and related models: an overview. (English) Zbl 1406.62029

Summary: Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on an infinite-dimensional space; the resulting models are referred to as Bayesian nonparametric models. We provide an overview of the most popular Bayesian nonparametric models for probability distributions and for collections of predictor-dependent probability distributions. The intention is not to be complete or exhaustive, but rather to touch on areas of interest for the practical use of these priors in the context of a hierarchical model. We cover the main properties of the basic models and the algorithms for fitting them.

MSC:

62G05 Nonparametric estimation
60G57 Random measures
62F15 Bayesian inference
62G07 Density estimation

Software:

DPpackage

References:

[1] Müller, P.; Quintana, F. A.; Jara, A.; Hanson, T., Bayesian Nonparametric Data Analysis (2015), Springer: Springer New York, USA · Zbl 1333.62003
[2] Lo, A. Y., On a class of Bayesian nonparametric estimates I: density estimates, Ann. Stat., 12, 351-357 (1984) · Zbl 0557.62036
[3] Ferguson, T. S., A Bayesian analysis of some nonparametric problems, Ann. Stat., 1, 209-230 (1973) · Zbl 0255.62037
[4] Ferguson, T. S., Prior distributions on spaces of probability measures, Ann. Stat., 2, 615-629 (1974) · Zbl 0286.62008
[5] Ghosal, S.; Ghosh, J. K.; Ramamoorthi, R. V., Posterior consistency of Dirichlet mixtures in density estimation, Ann. Stat., 27, 143-158 (1999) · Zbl 0932.62043
[6] Shen, W.; Tokdar, S. T.; Ghosal, S., Adaptive Bayesian multivariate density estimation with Dirichlet mixtures, Biometrika, 100, 623-640 (2013) · Zbl 1284.62183
[7] Lijoi, A.; Prünster, I.; Walker, S., On consistency of non-parametric normal mixtures for Bayesian density estimation, J. Am. Stat. Assoc., 100, 1292-1296 (2005) · Zbl 1117.62387
[8] Ghosal, S.; Van der Vaart, A. W., Posterior convergence rates of Dirichlet mixtures at smooth densities, Ann. Stat., 35, 697-723 (2007) · Zbl 1117.62046
[9] Dey, D.; Müller, P.; Sinha, D., Practical Nonparametric and Semiparametric Bayesian Statistics (1998), Springer: Springer New York, USA · Zbl 0893.00018
[10] Hanson, T.; Branscum, A.; Johnson, W., Bayesian nonparametric modeling and data analysis: an introduction, (Dey, D. K.; Rao, C. R., Bayesian Thinking: Modeling and Computation, Handbook of Statistics, vol. 25 (2005), Elsevier: Elsevier Amsterdam, The Netherlands), 245-278
[11] Hjort, N. L.; Holmes, C.; Müller, P.; Walker, S., Bayesian Nonparametrics (2010), Cambridge University Press: Cambridge University Press Cambridge, UK
[12] Blackwell, D., Discreteness of Ferguson selections, Ann. Stat., 1, 356-358 (1973) · Zbl 0276.62009
[13] Blackwell, D.; MacQueen, J., Ferguson distributions via Pólya urn schemes, Ann. Stat., 1, 353-355 (1973) · Zbl 0276.62010
[14] de Finetti, B., Foresight: its logical laws, its subjective sources, (Kyburg, H. E.; Smokler, H. E., Studies in Subjective Probability (1937), John Wiley and Sons: John Wiley and Sons New York, USA), 53-118
[15] Korwar, R. M.; Hollander, M., Contributions to the theory of Dirichlet processes, Ann. Probab., 1, 705-711 (1973) · Zbl 0264.60084
[16] Sethuraman, J., A constructive definition of Dirichlet priors, Stat. Sin., 2, 639-650 (1994) · Zbl 0823.62007
[17] Feigin, P. D.; Tweedie, R. L., Linear functionals and Markov chains associated with Dirichlet processes, Math. Proc. Camb. Philos. Soc., 105, 579-585 (1989) · Zbl 0677.60080
[18] Antoniak, C. E., Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems, Ann. Stat., 2, 1152-1174 (1974) · Zbl 0335.60034
[19] Petrone, S., Bayesian density estimation using Bernstein polynomials, Can. J. Stat., 27, 105-126 (1999) · Zbl 0929.62044
[20] Petrone, S., Random Bernstein polynomials, Scand. J. Stat., 26, 373-393 (1999) · Zbl 0939.62046
[21] Barrientos, A. F.; Jara, A.; Quintana, F. A., Bayesian density estimation for compositional data using random Bernstein polynomials, J. Stat. Plan. Inference, 166, 116-125 (2015) · Zbl 1394.62037
[22] Dalal, S. R., Dirichlet invariant processes and applications to nonparametric estimation of symmetric distribution functions, Stoch. Process. Appl., 9, 99-107 (1979) · Zbl 0415.60035
[23] Doss, H., Bayesian nonparametric estimation of the median. I. Computation of the estimates, Ann. Stat., 13, 1432-1444 (1985) · Zbl 0587.62070
[24] Doss, H., Bayesian nonparametric estimation of the median. II. Asymptotic properties of the estimates, Ann. Stat., 13, 1445-1464 (1985) · Zbl 0587.62071
[25] Newton, M. A.; Czado, C.; Chappell, R., Bayesian inference for semiparametric binary regression, J. Am. Stat. Assoc., 91, 142-153 (1996) · Zbl 0870.62026
[26] Freedman, D., On the asymptotic distribution of Bayes’ estimates in the discrete case, Ann. Math. Stat., 34, 1386-1403 (1963) · Zbl 0137.12603
[27] Fabius, J., Asymptotic behavior of Bayes’ estimates, Ann. Math. Stat., 35, 846-856 (1964) · Zbl 0137.12604
[28] Mauldin, R. D.; Sudderth, W. D.; Williams, S. C., Polya trees and random distributions, Ann. Stat., 20, 1203-1221 (1992) · Zbl 0765.62006
[29] Lavine, M., Some aspects of Polya tree distributions for statistical modeling, Ann. Stat., 20, 1222-1235 (1992) · Zbl 0765.62005
[30] Lavine, M., More aspects of Polya tree distributions for statistical modeling, Ann. Stat., 22, 1161-1176 (1994) · Zbl 0820.62016
[31] Christensen, R.; Hanson, T.; Jara, A., Parametric nonparametric statistics: an introduction to mixtures of finite Polya trees, Am. Stat., 62, 296-306 (2008)
[32] Monticino, M., How to construct a random probability measure, Int. Stat. Rev., 69, 153-167 (2001) · Zbl 1171.60311
[33] Dubins, L. E.; Freedman, D. A., Random distribution functions, (Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 2 (1967)), 183-214 · Zbl 0201.49502
[34] Kraft, C. H., A class of distribution function processes which have derivatives, J. Appl. Probab., 1, 385-388 (1964) · Zbl 0203.19702
[35] Métivier, M., Sur la construction de mesures aléatoires presque sûrement absolument continues par rapport à une mesure donnée, Z. Wahrscheinlichkeitstheor. Verw. Geb., 20, 332-334 (1971) · Zbl 0212.19303
[36] Schervish, M. J., Theory of Statistics (1995), Springer: Springer New York, USA · Zbl 0834.62002
[37] Walker, S. G.; Mallick, B. K., Hierarchical generalized linear models and frailty models with Bayesian nonparametric mixing, J. R. Stat. Soc. B, 59, 845-860 (1997) · Zbl 0886.62072
[38] Hanson, T.; Johnson, W. O., Modeling regression error with a mixture of Polya trees, J. Am. Stat. Assoc., 97, 1020-1033 (2002) · Zbl 1048.62101
[39] Hanson, T., Inference for mixtures of finite Polya tree models, J. Am. Stat. Assoc., 101, 1548-1565 (2006) · Zbl 1171.62323
[40] Paddock, S. M.; Ruggeri, F.; Lavine, M.; West, M., Randomized Polya tree models for nonparametric Bayesian inference, Stat. Sin., 13, 443-460 (2003) · Zbl 1015.62051
[41] Jara, A.; Hanson, T.; Lesaffre, E., Robustifying generalized linear mixed models using a new class of mixture of multivariate Polya trees, J. Comput. Graph. Stat., 18, 838-860 (2009)
[42] Hanson, T.; Monteiro, J. V.D.; Jara, A., The Polya tree sampler: toward efficient and automatic independent Metropolis proposals, J. Comput. Graph. Stat., 20, 1, 41-62 (2011)
[43] Muliere, P.; Tardella, L., Approximating distributions of random functionals of Ferguson-Dirichlet priors, Can. J. Stat., 26, 283-297 (1998) · Zbl 0913.62010
[44] Muliere, P.; Secchi, P., A Note on a Proper Bayesian Bootstrap (1995), Università degli Studi di Pavia, Dipartimento di Economia Politica e Metodi Quantitativi, Tech. rep. · Zbl 1121.62542
[45] Pitman, J., Some developments of the Blackwell-MacQueen urn scheme, (Ferguson, T. S.; Shapley, L. S.; MacQueen, J. B., Statistics, Probability and Game Theory. Papers in Honor of David Blackwell, IMS Lecture Notes - Monograph Series, Hayward, California (1996)), 245-268
[46] Regazzini, E.; Lijoi, A.; Prünster, I., Distributional results for means of normalized random measures with independent increments, Ann. Stat., 31, 560-585 (2003) · Zbl 1068.62034
[47] Ishwaran, H.; James, L. F., Gibbs sampling methods for stick-breaking priors, J. Am. Stat. Assoc., 96, 161-173 (2001) · Zbl 1014.62006
[48] Pitman, J.; Yor, M., The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator, Ann. Probab., 25, 855-900 (1997) · Zbl 0880.60076
[49] Ishwaran, H.; Zarepour, M., Markov chain Monte Carlo in approximate Dirichlet and beta two-parameter process hierarchical models, Biometrika, 87, 371-390 (2000) · Zbl 0949.62037
[50] Kingman, J. F.C., Random discrete distributions, J. R. Stat. Soc. B, 37, 1-22 (1975) · Zbl 0331.62019
[51] Cifarelli, D.; Regazzini, E., Problemi statistici non parametrici in condizioni di scambiabilità parziale e impiego di medie associative (1978), Quaderni Istituto Matematica Finanziaria: Quaderni Istituto Matematica Finanziaria Torino, Tech. rep.
[52] Muliere, P.; Petrone, S., A Bayesian predictive approach to sequential search for an optimal dose: parametric and nonparametric models, J. Ital. Stat. Soc., 2, 349-364 (1993) · Zbl 1446.62283
[53] Mira, A.; Petrone, S., Bayesian hierarchical nonparametric inference for change-point problems, (Bernardo, J. M.; Berger, J. O.; Dawid, A. P.; Smith, A. F.M., Bayesian Statistics, vol. 5 (1996), Oxford University Press)
[54] Giudici, P.; Mezzetti, M.; Muliere, P., Mixtures of Dirichlet process priors for variable selection in survival analysis, J. Stat. Plan. Inference, 111, 101-115 (2003) · Zbl 1033.62099
[55] MacEachern, S. N., Dependent nonparametric processes, (ASA Proceedings of the Section on Bayesian Statistical Science (1999), American Statistical Association: American Statistical Association Alexandria, VA)
[56] MacEachern, S. N., Dependent Dirichlet Processes (2000), Department of Statistics, The Ohio State University, Tech. rep.
[57] Barrientos, A. F.; Jara, A.; Quintana, F. A., On the support of MacEachern’s dependent Dirichlet processes and extensions, Bayesian Anal., 7, 277-310 (2012) · Zbl 1330.60067
[58] De Iorio, M.; Müller, P.; Rosner, G. L.; MacEachern, S. N., An ANOVA model for dependent random measures, J. Am. Stat. Assoc., 99, 205-215 (2004) · Zbl 1089.62513
[59] De Iorio, M.; Johnson, W. O.; Müller, P.; Rosner, G. L., Bayesian nonparametric non-proportional hazards survival modelling, Biometrics, 65, 762-771 (2009) · Zbl 1172.62073
[60] Jara, A.; Lesaffre, E.; De Iorio, M.; Quintana, F. A., Bayesian semiparametric inference for multivariate doubly-interval-censored data, Ann. Appl. Stat., 4, 2126-2149 (2010) · Zbl 1220.62023
[61] Gelfand, A. E.; Kottas, A.; MacEachern, S. N., Bayesian nonparametric spatial modeling with Dirichlet process mixing, J. Am. Stat. Assoc., 100, 1021-1035 (2005) · Zbl 1117.62342
[62] Dunson, D. B.; Herring, A. H., Semiparametric Bayesian Latent Trajectory Models (2006), Duke University: Duke University Durham, NC, USA, Tech. rep., ISDS Discussion Paper 16
[63] Müller, P.; Rosner, G. L.; De Iorio, M.; MacEachern, S., A nonparametric Bayesian model for inference in related longitudinal studies, J. R. Stat. Soc. C, 54, 611-626 (2005) · Zbl 1490.62121
[64] Inácio, V.; Jara, A.; Hanson, T. E.; de Carvalho, M., Bayesian nonparametric ROC regression modeling, Bayesian Anal., 8, 623-646 (2013) · Zbl 1329.62154
[65] Müller, P.; Erkanli, A.; West, M., Bayesian curve fitting using multivariate normal mixtures, Biometrika, 83, 67-79 (1996) · Zbl 0865.62029
[66] Müller, P.; Quintana, F. A.; Rosner, G., A method for combining inference across related nonparametric Bayesian models, J. R. Stat. Soc. B, 66, 735-749 (2004) · Zbl 1046.62053
[67] Teh, Y. W.; Jordan, M. I.; Beal, M. J.; Blei, D. M., Hierarchical Dirichlet processes, J. Am. Stat. Assoc., 101, 1566-1581 (2006) · Zbl 1171.62349
[68] Griffin, J. E.; Steel, M. F.J., Order-based dependent Dirichlet processes, J. Am. Stat. Assoc., 101, 179-194 (2006) · Zbl 1118.62360
[69] Rodriguez, A.; Dunson, D. B.; Gelfand, A., The nested Dirichlet process, J. Am. Stat. Assoc., 103, 1131-1154 (2008) · Zbl 1205.62062
[70] Dunson, D. B.; Pillai, N.; Park, J. H., Bayesian density regression, J. R. Stat. Soc. B, 69, 163-183 (2007) · Zbl 1120.62025
[71] Dunson, D. B.; Park, J. H., Kernel stick-breaking processes, Biometrika, 95, 307-323 (2008) · Zbl 1437.62448
[72] Dunson, D. B.; Xue, Y.; Carin, L., The matrix stick-breaking process: flexible Bayes meta-analysis, J. Am. Stat. Assoc., 103, 317-327 (2008) · Zbl 1471.62502
[73] Chung, Y.; Dunson, D. B., The local Dirichlet process, Ann. Inst. Stat. Math., 63, 59-80 (2011) · Zbl 1432.62083
[74] Ren, L.; Du, L.; Carin, L.; Dunson, D. B., Logistic stick-breaking process, J. Mach. Learn. Res., 12, 203-239 (2011) · Zbl 1280.62079
[75] Chung, Y.; Dunson, D. B., Nonparametric Bayes conditional distribution modeling with variable selection, J. Am. Stat. Assoc., 104, 1646-1660 (2009) · Zbl 1205.62039
[76] Rodriguez, A.; Dunson, D. B., Nonparametric Bayesian models through probit stick-breaking processes, Bayesian Anal., 6, 145-178 (2011) · Zbl 1330.62120
[77] Müller, P.; Quintana, F. A., Random partition models with regression on covariates, J. Stat. Plan. Inference, 140, 2801-2808 (2010) · Zbl 1191.62073
[78] Müller, P.; Quintana, F. A.; Rosner, G. L., A product partition model with regression on covariates, J. Comput. Graph. Stat., 20, 260-278 (2011)
[79] Quintana, F. A., Linear regression with a dependent skewed Dirichlet process, Chil. J. Stat., 1, 35-49 (2010) · Zbl 1213.62071
[80] Barrientos, A. F.; Jara, A.; Quintana, F. A., Fully nonparametric regression for bounded data using dependent Bernstein polynomials, J. Am. Stat. Assoc. (2016), in press
[81] Epifani, I.; Lijoi, A., Nonparametric priors for vectors of survival functions, Stat. Sin., 20, 1455-1484 (2010) · Zbl 1200.62121
[82] Leisen, F.; Lijoi, A., Vectors of two-parameter Poisson-Dirichlet processes, J. Multivar. Anal., 102, 482-495 (2011) · Zbl 1207.62062
[83] Lijoi, A.; Nipoti, B.; Prünster, I., Bayesian inference with dependent normalized completely random measures, Bernoulli, 20, 1260-1291 (2014) · Zbl 1309.60048
[84] Tokdar, S. T.; Zhu, Y. M.; Ghosh, J. K., Bayesian density regression with logistic Gaussian process and subspace projection, Bayesian Anal., 5, 1-26 (2010) · Zbl 1330.62182
[85] Jara, A.; Hanson, T., A class of mixtures of dependent tail-free processes, Biometrika, 98, 553-566 (2011) · Zbl 1231.62178
[87] Escobar, M. D., Estimating normal means with a Dirichlet process prior, J. Am. Stat. Assoc., 89, 268-277 (1994) · Zbl 0791.62039
[88] Escobar, M. D.; West, M., Bayesian density estimation and inference using mixtures, J. Am. Stat. Assoc., 90, 577-588 (1995) · Zbl 0826.62021
[89] Liu, J. S., Nonparametric hierarchical Bayes via sequential imputations, Ann. Stat., 24, 911-930 (1996) · Zbl 0880.62038
[90] MacEachern, S. N.; Clyde, M.; Liu, J. S., Sequential importance sampling for nonparametric Bayes models: the next generation, Can. J. Stat., 27, 251-267 (1999) · Zbl 0957.62068
[91] Newton, M. A.; Quintana, F. A.; Zhang, Y., Nonparametric Bayes methods using predictive updating, (Dey, D.; Müller, P.; Sinha, D., Practical Nonparametric and Semiparametric Bayesian Statistics (1998), Springer), 45-62 · Zbl 0918.62030
[92] Newton, M. A.; Zhang, Y., A recursive algorithm for nonparametric analysis with missing data, Biometrika, 86, 15-26 (1999) · Zbl 0917.62045
[93] Jordan, M.; Ghahramani, Z.; Jaakkola, T.; Saul, L., An introduction to variational methods for graphical models, Mach. Learn., 37, 183-233 (1999) · Zbl 0945.68164
[94] Blei, D.; Jordan, M., Variational inference for Dirichlet process mixtures, Bayesian Anal., 1, 121-144 (2006) · Zbl 1331.62259
[95] Bush, C. A.; MacEachern, S. N., A semiparametric Bayesian model for randomised block designs, Biometrika, 83, 275-285 (1996) · Zbl 0864.62052
[96] MacEachern, S. N., Estimating normal means with a conjugate style Dirichlet process prior, Commun. Stat., Simul. Comput., 23, 727-741 (1994) · Zbl 0825.62053
[97] Jain, S.; Neal, R. M., A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model, J. Comput. Graph. Stat., 13, 158-182 (2004)
[98] Dahl, D. B., Sequentially-Allocated Merge-Split Sampler for Conjugate and Nonconjugate Dirichlet Process Mixture Models (2005), Texas A&M University: Texas A&M University USA, Tech. rep.
[99] Phillips, D. B.; Smith, A. F.M., Bayesian model comparisons via jump diffusions, (Gilks, W. R.; Richardson, S.; Spiegelhalter, D. J., Markov Chain Monte Carlo in Practice (1996), Chapman and Hall: Chapman and Hall New York, USA), 215-239 · Zbl 0855.62018
[100] Richardson, S.; Green, P. J., On Bayesian analysis of mixtures with an unknown number of components, J. R. Stat. Soc. B, 59, 731-792 (1997) · Zbl 0891.62020
[101] Fong, Y.; Wakefield, J.; Rice, K., An efficient Markov chain Monte Carlo method for mixture models by neighborhood pruning, J. Comput. Graph. Stat., 21, 197-216 (2012)
[102] MacEachern, S. N.; Müller, P., Estimating mixture of Dirichlet process models, J. Comput. Graph. Stat., 7, 2, 223-238 (1998)
[103] Neal, R., Markov chain sampling methods for Dirichlet process mixture models, J. Comput. Graph. Stat., 9, 249-265 (2000)
[104] Doss, H., Bayesian nonparametric estimation for incomplete data via successive substitution sampling, Ann. Stat., 22, 1763-1786 (1994) · Zbl 0824.62027
[105] Florens, J.-P.; Rolin, J.-M., Simulation of Posterior Distributions in Nonparametric Censored Analysis (1998), GREMAQ: GREMAQ Toulouse, available at
[106] Hanson, T.; Johnson, W. O., A Bayesian semiparametric AFT model for interval-censored data, J. Comput. Graph. Stat., 13, 341-361 (2004)
[107] Papaspiliopoulos, O.; Roberts, G. O., Retrospective Markov chain Monte Carlo methods for Dirichlet process hierarchical models, Biometrika, 95, 169-186 (2008) · Zbl 1437.62576
[108] Walker, S. G., Sampling the Dirichlet mixture model with slices, Commun. Stat. Simul. Comp., 36, 45-54 (2007) · Zbl 1113.62058
[109] Kalli, M.; Griffin, J. E.; Walker, S., Slice sampling mixture models, Stat. Comput., 21, 93-105 (2011) · Zbl 1256.65006
[110] Gelfand, A. E.; Kottas, A., A computational approach for full nonparametric Bayesian inference under Dirichlet Process Mixture models, J. Comput. Graph. Stat., 11, 289-304 (2002)
[111] Jara, A., Applied Bayesian non- and semi-parametric inference using DPpackage, R News, 7, 17-26 (2007)
[112] Jara, A.; Hanson, T.; Quintana, F.; Müller, P.; Rosner, G. L., DPpackage: Bayesian semi- and nonparametric modeling in R, J. Stat. Softw., 40, 1-30 (2011)
[113] Ishwaran, H.; James, L. F., Approximate Dirichlet process computing in finite normal mixtures: smoothing and prior information, J. Comput. Graph. Stat., 11, 508-532 (2002)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.