Geographic ratemaking with spatial embeddings. (English) Zbl 1484.91375

Summary: Spatial data are a rich source of information for actuarial applications: knowledge of a risk’s location could improve an insurance company’s ratemaking, reserving or risk management processes. Relying on historical geolocated loss data is problematic for areas where it is limited or unavailable. In this paper, we construct spatial embeddings within a complex convolutional neural network representation model using external census data and use them as inputs to a simple predictive model. Compared to spatial interpolation models, our approach leads to smaller predictive bias and reduced variance in most situations. This method also enables us to generate rates in territories with no historical experience.

MSC:

91G05 Actuarial mathematics
91D20 Mathematical geography and demography
91B72 Spatial models in economics

References:

[1] Anselin, L., Syabri, I. and Kho, Y. (2010) GeoDa: An introduction to spatial data analysis. In Handbook of Applied Spatial Analysis, pp. 73-89. Heidelberg, Germany: Springer.
[2] Bengio, Y., Courville, A. and Vincent, P. (2013) Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798-1828.
[3] Blier-Wong, C., Baillargeon, J.-T., Cossette, H., Lamontagne, L. and Marceau, E. (2020) Encoding neighbor information into geographical embeddings using convolutional neural networks. In The Thirty-Third International Flairs Conference.
[4] Blier-Wong, C., Baillargeon, J.-T., Cossette, H., Lamontagne, L. and Marceau, E. (2021) Rethinking representations in P&C actuarial science with deep neural networks. arXiv preprint arXiv:2102.05784.
[5] Boskov, M. and Verrall, R. (1994) Premium rating by geographic area using spatial models. ASTIN Bulletin: The Journal of the IAA, 24(1), 131-143.
[6] Cocos, A. and Callison-Burch, C. (2017) The language of place: Semantic value from geospatial context. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pp. 99-104.
[7] Collobert, R. and Weston, J. (2008) A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning, pp. 160-167.
[8] Denuit, M. and Lang, S. (2004) Non-life rate-making with Bayesian GAMs. Insurance: Mathematics and Economics, 35(3), 627-647. · Zbl 1070.62095
[9] Dimakos, X. K. and Di Rattalma, A. F. (2002) Bayesian premium rating with latent structure. Scandinavian Actuarial Journal, 2002(3), 162-184. · Zbl 1039.91039
[10] Dumoulin, V. and Visin, F. (2018) A guide to convolution arithmetic for deep learning. arXiv:1603.07285 [cs, stat].
[11] Eisenstein, J., O’Connor, B., Smith, N. A. and Xing, E. P. (2010) A latent variable model for geographic lexical variation. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp. 1277-1287. Association for Computational Linguistics.
[12] Fahrmeir, L., Lang, S. and Spies, F. (2003) Generalized geoadditive models for insurance claims data. Blätter der DGVFM, 26(1), 7-23.
[13] Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E., Solan, Z., Wolfman, G. and Ruppin, E. (2002) Placing search in context: The concept revisited. ACM Transactions on Information Systems, 20(1), 16.
[14] Frees, E. W. (2015) Analytics of insurance markets. Annual Review of Financial Economics, 7, 253-277.
[15] Glorot, X. and Bengio, Y. (2010) Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS) 2010, p. 8, Sardinia, Italy.
[16] Goodfellow, I., Bengio, Y. and Courville, A. (2016) Deep Learning. Cambridge, Massachusetts: MIT Press. · Zbl 1373.68009
[17] Gschlößl, S. and Czado, C. (2007) Spatial modelling of claim frequency and claim size in non-life insurance. Scandinavian Actuarial Journal, 2007(3), 202-225. · Zbl 1150.91026
[18] He, K., Zhang, X., Ren, S. and Sun, J. (2015) Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1026-1034.
[19] He, K., Zhang, X., Ren, S. and Sun, J. (2016) Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778.
[20] Henckaerts, R., Antonio, K., Clijsters, M. and Verbelen, R. (2018) A data driven binning strategy for the construction of insurance tariff classes. Scandinavian Actuarial Journal, 2018(8), 681-705. · Zbl 1418.91241
[21] Hengl, T., Heuvelink, G. B. and Rossiter, D. G. (2007) About regression-kriging: From equations to case studies. Computers & Geosciences, 33(10), 1301-1315.
[22] Hui, B., Yan, D., Ku, W.-S. and Wang, W. (2020) Predicting economic growth by region embedding: A multigraph convolutional network approach. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 555-564.
[23] Ioffe, S. and Szegedy, C. (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pp. 448-456.
[24] ISO 19109 (2015) Geographic information — rules for application schema. Standard, International Organization for Standardization, Geneva. Technical Committee ISO/TC 211, Geographic Information/Geomatics.
[25] Jeawak, S. S., Jones, C. B. and Schockaert, S. (2019) Embedding geographic locations for modelling the natural environment using Flickr tags and structured data. In European Conference on Information Retrieval, pp. 51-66. Springer.
[26] Jurafsky, D. and Martin, J. H. (2009) Speech & Language Processing, second edition. Upper Saddle River, New Jersey: Prentice Hall.
[27] Kingma, D. P. and Ba, J. (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
[28] Lambert, J. H. (1772) Beiträge zum Gebrauche der Mathematik und deren Anwendung: Part III, Section 6: Anmerkungen und Zusätze zur Entwerfung der Land- und Himmelscharten. Berlin. Translated and introduced by W. R. Tobler, Univ. Michigan, 1972.
[29] Mikolov, T., Chen, K., Corrado, G. and Dean, J. (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
[30] Mikolov, T., Grave, E., Bojanowski, P., Puhrsch, C. and Joulin, A. (2017) Advances in pre-training distributed word representations. arXiv preprint arXiv:1712.09405.
[31] Miller, H. J. (2004) Tobler’s first law and spatial analysis. Annals of the Association of American Geographers, 94(2), 284-289.
[32] Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., Devito, Z., Lin, Z., Desmaison, A., Antiga, L. and Lerer, A. (2017) Automatic differentiation in pytorch. In 31st Conference on Neural Information Processing Systems.
[33] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al. (2019) Pytorch: An imperative style, high-performance deep learning library. arXiv preprint arXiv:1912.01703.
[34] R Core Team (2020) R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
[35] Shi, P. and Shi, K. (2017) Territorial risk classification using spatially dependent frequency-severity models. ASTIN Bulletin: The Journal of the IAA, 47(2), 437-465. · Zbl 1390.62221
[36] Simonyan, K. and Zisserman, A. (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
[37] Taylor, G. (2001) Geographic premium rating by Whittaker spatial smoothing. ASTIN Bulletin: The Journal of the IAA, 31(1), 147-160. · Zbl 1060.91101
[38] Taylor, G. C. (1989) Use of spline functions for premium rating by geographic area. ASTIN Bulletin: The Journal of the IAA, 19(1), 91-122.
[39] Tobler, W. R. (1970) A computer movie simulating urban growth in the Detroit region. Economic Geography, 46(sup1), 234-240.
[40] Wang, C., Schifano, E. D. and Yan, J. (2017) Geographical ratings with spatial random effects in a two-part model. Variance, 13(1), 20.
[41] Wang, Z., Li, H. and Rajagopal, R. (2020) Urban2Vec: Incorporating Street View imagery and POIs for multi-modal urban neighborhood embedding. Proceedings of the AAAI Conference on Artificial Intelligence, 34(01), 1013-1020.
[42] Wood, S. (2012) mgcv: Mixed GAM computation vehicle with GCV/AIC/REML smoothness estimation.
[43] Xu, S., Cao, J., Legg, P., Liu, B. and Li, S. (2020) Venue2Vec: An efficient embedding model for fine-grained user location prediction in geo-social networks. IEEE Systems Journal, 14(2), 1740-1751.
[44] Yao, Y., Li, X., Liu, X., Liu, P., Liang, Z., Zhang, J. and Mai, K. (2017) Sensing spatial distribution of urban land use by integrating points-of-interest and Google Word2Vec model. International Journal of Geographical Information Science, 31(4), 825-848.
[45] Yin, Y., Liu, Z., Zhang, Y., Wang, S., Shah, R. R. and Zimmermann, R. (2019) GPS2Vec: Towards generating worldwide GPS embeddings. In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 416-419, Chicago IL USA. ACM.