Data Science in Insurance

A special issue of Risks (ISSN 2227-9091).

Deadline for manuscript submissions: closed (31 October 2022) | Viewed by 24740

Special Issue Editors


Guest Editor
Department of Mathematics for Economic, Financial and Actuarial Sciences, Catholic University of Milan, 20123 Largo Gemelli 1, Milan, Italy
Interests: actuarial science; complex networks

Guest Editor
Department of Mathematics for Economic, Financial and Actuarial Sciences, Università Cattolica del Sacro Cuore, 20123 Milano, Italy
Interests: capital requirement for non-life insurance; reinsurance; medical malpractice

Guest Editor
Department of Statistical Sciences, Università Cattolica del Sacro Cuore, 20123 Milano, Italy
Interests: insurance statistics; data analytics; premium pricing; claims reserving

Special Issue Information

Dear Colleagues,

The digital revolution has enabled the collection and storage of large and diverse volumes of information in the insurance field. This era is often described as the era of big data, because the data, and the uncertainty to be modelled, are too complex for traditional data processing techniques. For insurance purposes, big data refers to unstructured and/or structured data used to inform underwriting, rating, pricing, forms, marketing and claims handling.

Building on these considerations, the fourth edition of the InsuranceDataScience conference will take place from 15 to 17 June 2022 at the Catholic University of the Sacred Heart, Milan (see https://insurancedatascience.org/). Authors and participants in the meeting are invited to submit papers with original contributions related to the topics of the Special Issue.

The Special Issue aims to compile high-quality papers that discuss the state of the art or introduce new theoretical or practical developments in this field. We welcome papers related to, but not limited to, the following areas: data science, analytics, machine learning, artificial intelligence, computational statistics and software, as applied in the insurance industry (life, non-life, health insurance and/or reinsurance).

Dr. Gian Paolo Clemente
Dr. Nino Savelli
Dr. Diego Zappa
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Risks is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • statistical models for insurance
  • machine learning and data science in insurance
  • artificial intelligence in insurance

Published Papers (8 papers)


Editorial


3 pages, 277 KiB  
Editorial
Special Issue “Data Science in Insurance”
by Gian Paolo Clemente, Francesco Della Corte, Nino Savelli and Diego Zappa
Risks 2023, 11(5), 80; https://doi.org/10.3390/risks11050080 - 24 Apr 2023
Cited by 1 | Viewed by 1460
Abstract
Within the insurance field, the digital revolution has enabled the collection and storage of large quantities of information [...]
(This article belongs to the Special Issue Data Science in Insurance)

Research


19 pages, 9181 KiB  
Article
ECLIPSE: Holistic AI System for Preparing Insurer Policy Data
by Varun Sriram, Zijie Fan and Ni Liu
Risks 2023, 11(1), 4; https://doi.org/10.3390/risks11010004 - 21 Dec 2022
Cited by 1 | Viewed by 1411
Abstract
Reinsurers possess high volumes of policy listings data from insurers, which they use to provide insurers with analytical insights and modeling that guide reinsurance treaties. These insurers often act on the same data for their own internal modeling and analytics needs. The problem is that this data is messy and needs significant preparation in order to extract meaningful insights. Traditionally, this has required intensive manual labor from actuaries. However, a host of modern AI techniques and ML system architectures introduced in the past decade can be applied to the problem of insurance data preparation. In this paper, we explore a novel application of AI/ML on policy listings data that poses its own unique challenges, by outlining the holistic AI-based platform we developed, ECLIPSE (Elegant Cleaning and Labeling of Insurance Policies while Standardizing Entities). With ECLIPSE, actuaries not only save time on data preparation but can build more effective loss models and provide crisper insights.
(This article belongs to the Special Issue Data Science in Insurance)

35 pages, 1164 KiB  
Article
A Combined Neural Network Approach for the Prediction of Admission Rates Related to Respiratory Diseases
by Alex Jose, Angus S. Macdonald, George Tzougas and George Streftaris
Risks 2022, 10(11), 217; https://doi.org/10.3390/risks10110217 - 16 Nov 2022
Cited by 2 | Viewed by 1866
Abstract
In this paper, we investigated rates of admission to hospitals (or other health facilities) due to respiratory diseases in a United States working population and their dependence on a number of demographic and health insurance-related factors. We employed neural network (NN) modelling methodology, including a combined actuarial neural network (CANN) approach, and modelled admission numbers by embedding Poisson and negative binomial count regression models. The aim is to explore the gains in predictive power obtained with the use of NN-based models, when compared to commonly used count regression models, in the context of a large real data set in the area of healthcare insurance. We used nagging predictors, averaging over random calibrations of the NN-based models, to provide more accurate predictions based on a single run, and also employed a k-fold validation process to obtain reliable comparisons between different models. Bias regularisation methods were also developed, aiming at addressing bias issues that are common when fitting NN models. The results demonstrate that NN-based models, with a negative binomial distributional assumption, provide improved predictive performance. This can be important in real data applications, where accurate prediction can drive both personalised and policy-level interventions.
(This article belongs to the Special Issue Data Science in Insurance)

14 pages, 831 KiB  
Article
Modeling Under-Reporting in Cyber Incidents
by Seema Sangari, Eric Dallal and Michael Whitman
Risks 2022, 10(11), 200; https://doi.org/10.3390/risks10110200 - 22 Oct 2022
Cited by 4 | Viewed by 1472
Abstract
Under-reporting in cyber incidents is a well-established problem. Due to reputational risk and the consequent financial impact, a large proportion of incidents are never disclosed to the public, especially if they do not involve a breach of protected data. Generally, the problem of under-reporting is solved through a proportion-based approach, where the level of under-reporting in a data set is determined by comparison to data that is fully reported. In this work, cyber insurance claims data is used as the complete data set. Unlike most other work, however, our goal is to quantify under-reporting with respect to multiple dimensions: company revenue, industry, and incident categorization. The research shows that there is a dramatic difference in under-reporting (a factor of 100) as a function of these variables. Overall, it is estimated that only approximately 3% of all cyber incidents are accounted for in databases of publicly reported events. The output of this work is an under-reporting model that can be used to correct incident frequencies derived from data sets of publicly reported incidents. This diminishes the “barrier to entry” in the development of cyber risk models, making it accessible to researchers who may not have the resources to acquire closely guarded cyber insurance claims data.
(This article belongs to the Special Issue Data Science in Insurance)

28 pages, 1181 KiB  
Article
Scenario Generation for Market Risk Models Using Generative Neural Networks
by Solveig Flaig and Gero Junike
Risks 2022, 10(11), 199; https://doi.org/10.3390/risks10110199 - 22 Oct 2022
Cited by 3 | Viewed by 2355
Abstract
In this research study, we show how existing approaches of using generative adversarial networks (GANs) as economic scenario generators (ESG) can be extended to an entire internal market risk model, with enough risk factors to model the full bandwidth of investments for an insurance company and for a time horizon of one year, as required in Solvency II. We demonstrate that the results of a GAN-based internal model are similar to regulatory-approved internal models in Europe. Therefore, GAN-based models can be seen as an alternative data-driven method for market risk modeling.
(This article belongs to the Special Issue Data Science in Insurance)

25 pages, 646 KiB  
Article
Robust Classification via Support Vector Machines
by Alexandru V. Asimit, Ioannis Kyriakou, Simone Santoni, Salvatore Scognamiglio and Rui Zhu
Risks 2022, 10(8), 154; https://doi.org/10.3390/risks10080154 - 01 Aug 2022
Cited by 2 | Viewed by 2231
Abstract
Classification models are very sensitive to data uncertainty, and finding robust classifiers that are less sensitive to data uncertainty has raised great interest in the machine learning literature. This paper aims to construct robust support vector machine classifiers under feature data uncertainty via two probabilistic arguments. The first classifier, Single Perturbation, reduces the local effect of data uncertainty with respect to one given feature and acts as a local test that could confirm or refute the presence of significant data uncertainty for that particular feature. The second classifier, Extreme Empirical Loss, aims to reduce the aggregate effect of data uncertainty with respect to all features, which is possible via a trade-off between the number of prediction model violations and the size of these violations. Both methodologies are computationally efficient and our extensive numerical investigation highlights the advantages and possible limitations of the two robust classifiers on synthetic and real-life insurance claims and mortgage lending data, but also the fairness of an automated decision based on our classifier.
(This article belongs to the Special Issue Data Science in Insurance)

16 pages, 510 KiB  
Article
Multiple Bonus–Malus Scale Models for Insureds of Different Sizes
by Jean-Philippe Boucher
Risks 2022, 10(8), 152; https://doi.org/10.3390/risks10080152 - 28 Jul 2022
Cited by 3 | Viewed by 1425
Abstract
How to consider the a priori risks in experience-rating models has been questioned in the actuarial community for a long time. Classic past-claim-rating models, such as the Bühlmann–Straub credibility model, normalize the past experience of each insured before applying claim penalties. On the other hand, classic Bonus–Malus Scales (BMS) models generate the same surcharges and the same discounts for all insureds because the transition rules within the class system do not depend on the a priori risk. Despite the quality of prediction of the BMS models, this experience-rating model could appear unfair to many insureds and regulators because it does not recognize the initial risk of the insured. In this paper, we propose the creation of different BMSs for each type of insured using recursive partitioning methods. We apply this approach to real data for the farm insurance product of a major Canadian insurance company with widely varying sizes of insureds. Because the a priori risk can change over time, a study of the possible transitions between different BMS models is also performed.
(This article belongs to the Special Issue Data Science in Insurance)

Other


50 pages, 2008 KiB  
Systematic Review
Explainable Artificial Intelligence (XAI) in Insurance
by Emer Owens, Barry Sheehan, Martin Mullins, Martin Cunneen, Juliane Ressel and German Castignani
Risks 2022, 10(12), 230; https://doi.org/10.3390/risks10120230 - 01 Dec 2022
Cited by 13 | Viewed by 10939
Abstract
Explainable Artificial Intelligence (XAI) models allow for a more transparent and understandable relationship between humans and machines. The insurance industry represents a fundamental opportunity to demonstrate the potential of XAI, with the industry’s vast stores of sensitive data on policyholders and centrality in societal progress and innovation. This paper analyses current Artificial Intelligence (AI) applications in insurance industry practices and insurance research to assess their degree of explainability. Using search terms representative of (X)AI applications in insurance, 419 original research articles were screened from IEEE Xplore, ACM Digital Library, Scopus, Web of Science and Business Source Complete and EconLit. The resulting 103 articles (between the years 2000–2021) representing the current state-of-the-art of XAI in insurance literature are analysed and classified, highlighting the prevalence of XAI methods at the various stages of the insurance value chain. The study finds that XAI methods are particularly prevalent in claims management, underwriting and actuarial pricing practices. Simplification methods, called knowledge distillation and rule extraction, are identified as the primary XAI technique used within the insurance value chain. This is important as the combination of large models to create a smaller, more manageable model with distinct association rules aids in building XAI models which are regularly understandable. XAI is an important evolution of AI to ensure trust, transparency and moral values are embedded within the system’s ecosystem. The assessment of these XAI foci in the context of the insurance industry proves a worthwhile exploration into the unique advantages of XAI, highlighting to industry professionals, regulators and XAI developers where particular focus should be directed in the further development of XAI. This is the first study to analyse XAI’s current applications within the insurance industry, while simultaneously contributing to the interdisciplinary understanding of applied XAI. Advancing the literature on adequate XAI definitions, the authors propose an adapted definition of XAI informed by the systematic review of XAI literature in insurance.
(This article belongs to the Special Issue Data Science in Insurance)
