A non-inferiority test for diagnostic accuracy in the absence of the golden standard test based on the paired partial areas under receiver operating characteristic curves. (English) Zbl 1514.62867

Summary: Non-inferiority tests of diagnostic accuracy are common in medical research. The area under the receiver operating characteristic (ROC) curve is a familiar summary measure of overall diagnostic accuracy. However, it may fail to distinguish ROC curves of different shapes that carry different diagnostic significance. The partial area under the ROC curve (PAUROC) is an alternative summary measure for diagnostic processes that require the false-positive rate to lie within a clinically relevant range. Traditionally, estimating the PAUROC requires a gold standard (GS) test that establishes the true disease status. The GS test is sometimes infeasible, and in many research fields, such as epidemiology, the true disease status of the patients may be unknown or unavailable. Assuming normally distributed diagnostic test results, we propose a heuristic method, based on the expectation-maximization algorithm combined with the bootstrap, to construct a non-inferiority test for the difference in paired PAUROCs without the GS test. In a simulation study, the proposed test can be liberal, but on the whole its empirical size is adequately controlled at the nominal significance level, and its empirical power in the absence of the GS is as good as that of the non-inferiority test in the presence of the GS. The proposed method is illustrated with published data.
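To make the quantity being compared concrete, the following is a minimal sketch of the PAUROC under a normality assumption. It is not the authors' procedure: the summary does not give their parametrization, so the standard binormal form ROC(t) = Φ(a + b Φ⁻¹(t)), with a = (μ1 − μ0)/σ1 and b = σ0/σ1, is assumed here, as is the convention that larger test values indicate disease. The function names, the default false-positive range [0, 0.2], and the midpoint-rule integration are illustrative choices. The EM and bootstrap steps are not implemented; the group means and standard deviations are taken as given.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for Phi and its inverse


def binormal_roc(t, mu0, sd0, mu1, sd1):
    """True-positive rate at false-positive rate t.

    Assumes non-diseased results ~ N(mu0, sd0^2), diseased results
    ~ N(mu1, sd1^2), and that larger values indicate disease.
    """
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    a = (mu1 - mu0) / sd1
    b = sd0 / sd1
    return _STD.cdf(a + b * _STD.inv_cdf(t))


def partial_auc(mu0, sd0, mu1, sd1, fpr_lo=0.0, fpr_hi=0.2, n=2000):
    """Area under the binormal ROC curve for FPR in [fpr_lo, fpr_hi].

    Uses the midpoint rule with n equal subintervals.
    """
    h = (fpr_hi - fpr_lo) / n
    total = 0.0
    for i in range(n):
        t = fpr_lo + (i + 0.5) * h
        total += binormal_roc(t, mu0, sd0, mu1, sd1)
    return h * total


def pauc_difference(test_new, test_ref, fpr_lo=0.0, fpr_hi=0.2):
    """PAUROC(new) - PAUROC(reference); each test is (mu0, sd0, mu1, sd1)."""
    return (partial_auc(*test_new, fpr_lo=fpr_lo, fpr_hi=fpr_hi)
            - partial_auc(*test_ref, fpr_lo=fpr_lo, fpr_hi=fpr_hi))
```

A non-inferiority test would then ask whether this difference exceeds a negative margin, with the parameters estimated from the data rather than supplied directly as they are here.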

MSC:

62-XX Statistics
