Integrated cumulative error (ICE) distance for non-nested mixture model selection: application to extreme values in metal fatigue problems. (English) Zbl 1308.62134

Summary: We consider the problem of selecting the most appropriate model, from a given collection of mixture models, to describe datasets likely drawn from a mixture of distributions. The proposed method consists of finding the quasi-maximum likelihood estimators (QMLEs) of the competing models using Expectation-Maximization (EM) type algorithms, and then estimating, for every model, a statistical distance to the true model based on the empirical cumulative distribution function (cdf) of the original dataset and the QMLE-fitted cdf. To evaluate goodness of fit, a new metric, the Integrated Cumulative Error (ICE), is proposed and compared with existing metrics for accuracy in detecting the appropriate model. We state, under mild conditions, that our estimator of the ICE distance converges at the rate \(\sqrt{n}\) in probability, and that our model selection procedure is consistent (it asymptotically detects the right model). Over a set of benchmark examples, the ICE criterion shows numerically improved performance over existing distance-based criteria in identifying the correct model. The method is applied in a material fatigue life context to model the distribution of indicators of fatigue crack formation potency obtained from numerical experiments.
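The selection procedure described in the summary (fit each candidate by an EM-type algorithm, measure the gap between the empirical cdf and the fitted cdf, keep the candidate with the smallest gap) can be sketched as follows. This is only an illustration under stated assumptions: the summary does not give the ICE formula, so the sketch uses a plain integrated absolute difference between the two cdfs as a stand-in; the candidates are Gaussian mixtures rather than the extreme-value families of the application; and all function names are hypothetical.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)  # NumPy has no erf, so vectorize the stdlib one


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + _erf((x - mu) / (sigma * math.sqrt(2.0))))


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return np.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_single_normal(x):
    # Closed-form maximum likelihood fit; returned as a one-component mixture.
    return [(1.0, x.mean(), x.std())]


def fit_two_normals_em(x, n_iter=200):
    # Basic EM for a two-component Gaussian mixture (illustrative only).
    w = np.array([0.5, 0.5])
    mu = np.percentile(x, [25, 75])
    sigma = np.array([x.std(), x.std()])
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        dens = np.stack([w[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)])
        resp = dens / dens.sum(axis=0)
        # M-step: re-estimate weights, means, and standard deviations.
        nk = resp.sum(axis=1)
        w = nk / x.size
        mu = (resp * x).sum(axis=1) / nk
        sigma = np.sqrt((resp * (x - mu[:, None]) ** 2).sum(axis=1) / nk)
        sigma = np.maximum(sigma, 1e-6)  # guard against a collapsing component
    return [(w[k], mu[k], sigma[k]) for k in range(2)]


def mixture_cdf(x, components):
    return sum(w * normal_cdf(x, mu, sigma) for w, mu, sigma in components)


def integrated_abs_cdf_error(x, components, n_grid=2000):
    # ASSUMPTION: stand-in distance, not the paper's ICE definition.
    # Integrates |empirical cdf - fitted cdf| over the data range (trapezoid rule).
    xs = np.sort(x)
    grid = np.linspace(xs[0], xs[-1], n_grid)
    ecdf = np.searchsorted(xs, grid, side="right") / xs.size
    gap = np.abs(ecdf - mixture_cdf(grid, components))
    return float(np.sum(0.5 * (gap[1:] + gap[:-1]) * np.diff(grid)))


# Synthetic bimodal data, so the two-component model is the "right" one.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-3.0, 1.0, 250), rng.normal(3.0, 1.0, 250)])

candidates = {
    "single normal": fit_single_normal(data),
    "two-component normal": fit_two_normals_em(data),
}
distances = {
    name: integrated_abs_cdf_error(data, comps) for name, comps in candidates.items()
}
best = min(distances, key=distances.get)
print(distances)
print("selected:", best)
```

Comparing models through their cdfs, rather than through likelihoods, is what lets the candidates be non-nested: each family only has to supply a fitted cdf.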

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62F10 Point estimation
62F12 Asymptotic properties of parametric estimators
62E10 Characterization and structure theory of statistical distributions

Software:

ABAQUS
