On the classification of financial data with domain agnostic features. (English) Zbl 1481.91200

Summary: We compare a data-driven, domain-agnostic set of canonical features with a smaller collection of features that capture well-known stylized facts about financial asset returns. We show that these stylized facts discriminate between different asset types better than general-purpose features do. Financial time series analysis is therefore a domain where well-informed expert knowledge should not be disregarded in favor of agnostic representations of the data.
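
As a rough illustration of what "features that capture stylized facts" can mean in practice, the sketch below computes four commonly cited quantities from a return series: skewness, excess kurtosis (heavy tails), and the lag-1 autocorrelation of returns and of squared returns (volatility clustering). The choice of these four quantities and the function names `log_returns`, `autocorr`, and `stylized_features` are assumptions made for this example; the summary does not list the features used in the paper.

```python
import math


def log_returns(prices):
    """Log returns r_t = log(p_t / p_{t-1}) from a list of positive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def autocorr(x, lag=1):
    """Sample autocorrelation of x at the given lag (0.0 if x is constant)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    if var == 0:
        return 0.0
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def stylized_features(returns, lag=1):
    """Hypothetical feature vector built from a few stylized facts.

    Assumes the returns are not all identical (non-zero variance).
    """
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n  # second central moment
    m3 = sum((r - mean) ** 3 for r in returns) / n  # third central moment
    m4 = sum((r - mean) ** 4 for r in returns) / n  # fourth central moment
    return {
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
        "acf_returns": autocorr(returns, lag),
        "acf_squared": autocorr([r * r for r in returns], lag),
    }
```

A feature dictionary of this kind, computed once per asset, could then be passed to any standard classifier. A domain-agnostic approach would instead compute a fixed general-purpose feature set for every series, regardless of its origin.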

MSC:

91G15 Financial markets
62P05 Applications of statistics to actuarial sciences and financial mathematics

References:

[1] Alonso, A. M.; Maharaj, E. A., Comparison of time series using subsampling, Comput. Stat. Data Anal., 50, 2589-2599 (2006) · Zbl 1445.62216
[2] Bastos, J. A.; Caiado, J., Clustering financial time series with variance ratio statistics, Quant. Finance, 14, 2121-2133 (2014) · Zbl 1402.62246
[3] Bekaert, G.; Harvey, C., Emerging equity market volatility, J. Financ. Econ., 43, 29-77 (1997)
[4] Bollerslev, T., Generalized autoregressive conditional heteroskedasticity, J. Econom., 31, 307-327 (1986) · Zbl 0616.62119
[5] Caiado, J.; Crato, N.; Peña, D., A periodogram-based metric for time series classification, Comput. Stat. Data Anal., 50, 2668-2684 (2006) · Zbl 1445.62222
[6] Caiado, J.; Crato, N.; Peña, D., Comparison of time series with unequal length in the frequency domain, Commun. Stat., Simul. Comput., 38, 527-540 (2009) · Zbl 1161.37348
[7] Caiado, J.; Maharaj, E. A.; D’Urso, P., Time series clustering, (Hennig, C.; Meila, M.; Murtagh, F.; Rocci, R., Handbook of Cluster Analysis (2015), CRC Press, Taylor & Francis Group), 241-263 · Zbl 1396.62196
[8] Caiado, J.; Crato, N.; Poncela, P., A fragmented-periodogram approach for clustering big data time series, Adv. Data Anal. Classif., 14, 117-146 (2020) · Zbl 1474.62214
[9] Caiado, J.; Crato, N., Identifying common dynamic features in stock returns, Quant. Finance, 10, 797-807 (2010)
[10] Cerqueti, R.; Giacalone, M.; Mattera, R., Model-based fuzzy time series clustering of conditional higher moments, Int. J. Approx. Reason., 134, 34-52 (2021) · Zbl 07414900
[11] Clarida, R.; Gali, J., Sources of real exchange-rate fluctuations: how important are nominal shocks?, Carnegie-Rochester Conf. Ser. Public Policy, 41, 1-56 (1994)
[12] Cont, R., Empirical properties of asset returns: stylized facts and statistical issues, Quant. Finance, 1, 223-236 (2001) · Zbl 1408.62174
[13] Ding, Z.; Granger, C. W.J.; Engle, R. F., A long memory property of stock market returns and a new model, J. Empir. Finance, 1, 83-106 (1993)
[14] D’Urso, P.; De Giovanni, L.; Massari, R., GARCH-based robust clustering of time series, Fuzzy Sets Syst., 305, 1-28 (2016) · Zbl 1368.62167
[15] D’Urso, P.; Garcia-Escudero, L. A.; De Giovanni, L.; Vitale, V.; Mayo-Iscar, A., Robust fuzzy clustering of time series based on B-splines, Int. J. Approx. Reason., 136, 223-246 (2021) · Zbl 07415297
[16] Engle, R. F., Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation, Econometrica, 50, 987-1008 (1982) · Zbl 0491.62099
[17] Fulcher, B. D.; Jones, N. S., Highly comparative feature-based time-series classification, IEEE Trans. Knowl. Data Eng., 26, 3026-3037 (2014)
[18] Galeano, P.; Peña, D., Multivariate analysis in vector time series, Resen. Inst. Mat. Estat. Univ. Sao Paulo, 4, 383-404 (2000) · Zbl 1098.62558
[19] Galeano, P.; Peña, D.; Tsay, R. S., Outlier detection in multivariate time series by projection pursuit, J. Am. Stat. Assoc., 101, 654-669 (2006) · Zbl 1119.62360
[20] Granero, S.; Segovia, J. T.; Perez, J. G., Some comments on Hurst exponent and the long memory processes on capital markets, Phys. A, Stat. Mech. Appl., 387, 22, 5543-5551 (2008)
[21] Granger, C. W.J.; Ding, Z., Varieties of long-memory models, J. Econom., 73, 61-77 (1996) · Zbl 0854.62100
[22] Harvey, C., Predictable risk and returns in emerging markets, Rev. Financ. Stud., 8, 773-816 (1995)
[23] Kraus, A.; Litzenberger, R., Skewness preference and the valuation of risky assets, J. Finance, 31, 1085-1094 (1976)
[24] Lubba, C. H.; Sethi, S. S.; Knaute, P.; Schultz, S. R.; Fulcher, B. D.; Jones, N. S., catch22: CAnonical Time-series CHaracteristics, Data Min. Knowl. Discov., 33, 1821-1852 (2019)
[25] Maharaj, E. A.; D’Urso, P., A coherence-based approach for the pattern recognition of time series, Phys. A, Stat. Mech. Appl., 389, 17, 3516-3537 (2010)
[26] Maharaj, E. A.; D’Urso, P., Fuzzy clustering of time series in the frequency domain, Inf. Sci., 181, 1187-1211 (2011) · Zbl 1215.62061
[27] Maharaj, E. A.; D’Urso, P.; Caiado, J., Time Series Classification and Clustering (2019), CRC Press, Taylor & Francis Group · Zbl 1435.62006
[28] Mandelbrot, B., The variation of certain speculative prices, J. Bus., 36, 394-419 (1963)
[29] Otranto, E., Clustering heteroskedastic time series by model-based procedures, Comput. Stat. Data Anal., 52, 4685-4698 (2008) · Zbl 1452.62784
[30] Peña, D.; Prieto, F. J., Cluster identification using projections, J. Am. Stat. Assoc., 96, 1433-1445 (2001) · Zbl 1051.62055
[31] Piccolo, D., A distance measure for classifying ARIMA models, J. Time Ser. Anal., 11, 152-164 (1990) · Zbl 0691.62083
[32] Robinson, P. M., Gaussian semiparametric estimation of long-range dependence, Ann. Stat., 23, 1630-1661 (1995) · Zbl 0843.62092
[33] Taylor, S. J., Modelling Financial Time Series (2008), World Scientific, Singapore · Zbl 1146.91033
[34] Tsay, R. S., Analysis of Financial Time Series (2010), Wiley · Zbl 1209.91004
[35] Wang, X.; Smith, K.; Hyndman, R. J., Characteristic-based clustering for time series data, Data Min. Knowl. Discov., 13, 335-364 (2006)
[36] Zakoian, J.-M., Threshold heteroskedastic models, J. Econ. Dyn. Control, 18, 931-955 (1994) · Zbl 0806.90018
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.