In recent years, as new technologies have emerged, the concepts and systems surrounding big data have matured, and the application of big data has become deeper and more extensive. Industries and fields of all kinds have innovated in digital transformation and big data integration, using big data to improve existing business processes, develop new business forms and more. By drawing on larger volumes of data and richer data dimensions, actuaries can perform the actuarial role at a higher level, helping to improve risk management capabilities and better serve consumers.
Part 1 of this two-part series, published in November 2022, analyzed the change in the actuarial mindset in the era of big data. This article, Part 2, introduces the third and fourth steps of actuarial big data application. It then explores the outlook of future actuarial work using big data.
Modeling and Model Performance Evaluation
The application of actuarial big data is not limited to traditional actuarial model construction; it also includes various machine learning algorithms, mapping knowledge domains and other modeling methods. It is necessary to select suitable modeling methods according to the scenario’s objectives and to iterate and optimize the model using a series of test indicators and visual evaluation indicators, in order to obtain a model that performs well, gives stable results and generalizes to new data.
Let’s take machine learning modeling as an example.
- Algorithm selection: The choice of machine learning algorithm must match the modeling objective, such as solving a classification or a regression problem. Compared with information technology (IT) staff who build models, actuaries are more concerned with the rationality of the modeling process and the scientific soundness of the results, and they need to judge whether the impact of model variables accords with business rules and logic. This requires a strong ability to interpret features and models. In actuarial big data practice, companies usually use a combination of algorithms. First, they use nonlinear algorithms with high-dimensional feature processing capabilities to screen effective features and determine the model structure. Then they use generalized linear models, which are highly interpretable, for training and prediction, balancing model performance against the interpretability of the rules.
- Variable selection: The variables in the model generally are screened on the training set and tuned on the validation set. This requires careful consideration of several factors, including feature importance, the influence of correlation among variables, the interpretability of variables and the performance of visual evaluation indicators such as one-way analysis. The exposure of each variable also is considered to ensure the stability of its impact. Variables that are insignificant or unstable because of low exposure should be considered for elimination. For variables whose influence is judged, from actuarial principles or business logic, to be contrary to the substance of the business, it is necessary to analyze them case by case to find the relevant influence factors and adjust the form of the variables (e.g., by adding cross-influence variables) before re-entering them into the model for training. Still, attention needs to be paid to the risk of overfitting. In addition, because life insurance contracts have long durations, some variables may show large distribution differences over long time spans, so further generalization and validation of the model are needed. Lengthening the time window of the data adjusts the variable coefficients so that the model generalizes better and better reflects future trends.
- Model evaluation: Regression and classification models have different evaluation methods and indicators, such as mean square error for regression models and receiver operating characteristic (ROC) curves and lift charts for classification models. In practice, it is necessary to select evaluation indicators scientifically by combining the model algorithm, business requirements and professional experience, and to define the evaluation level and boundary of each indicator reasonably. This allows actuaries to determine the relative importance or weight of each evaluation indicator, which in turn helps them evaluate model performance comprehensively. In addition to the commonly used indicators, the reasonableness of the model results and other aspects should be evaluated, for example by measuring the deviation between predicted and actual values through lift curves in different dimensions, or by calculating the lift from lift charts to measure the predictive ability of classification models (bad samples captured compared with random selection). Further rationality analysis of the prediction results, relying on expert experience, also is required to ensure compliance with actuarial principles and business logic.
- Iterative optimization: Machine learning model construction and model evaluation are iterative processes that require continual adjustment of algorithms, hyperparameters and features based on evaluation indicators. Usually, a model must be tuned dozens of times before it is finalized, which involves a large amount of repetitive work and makes the whole modeling process time-consuming. Therefore, actuaries have developed an automation tool based on the modeling process in preliminary practice, used mainly for iterative model optimization. This can improve modeling efficiency greatly.
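The exposure check from variable selection and the lift measure from model evaluation can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the tooling described in this article: the record fields (`exposure`, `claims`, `predicted`), the function names and the `min_exposure` threshold of 100 are all hypothetical and chosen only for the example.

```python
from collections import defaultdict


def one_way_table(records, variable, min_exposure=100.0):
    """Compare actual and predicted claim rates for each level of one variable.

    Each record is a dict holding the variable's value, an 'exposure' amount,
    an observed 'claims' count and a model-'predicted' claim count
    (hypothetical field names).
    """
    totals = defaultdict(lambda: {"exposure": 0.0, "claims": 0.0, "predicted": 0.0})
    for rec in records:
        level = totals[rec[variable]]
        level["exposure"] += rec["exposure"]
        level["claims"] += rec["claims"]
        level["predicted"] += rec["predicted"]

    table = {}
    for value, t in totals.items():
        has_exposure = t["exposure"] > 0
        table[value] = {
            "exposure": t["exposure"],
            # Rates are undefined when a level has no exposure.
            "actual_rate": t["claims"] / t["exposure"] if has_exposure else None,
            "predicted_rate": t["predicted"] / t["exposure"] if has_exposure else None,
            # Levels below the assumed threshold are flagged as unstable.
            "credible": t["exposure"] >= min_exposure,
        }
    return table


def top_fraction_lift(scores, labels, fraction=0.1):
    """Lift of the highest-scored fraction of samples versus random selection.

    labels are 1 for a bad sample and 0 otherwise.
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal in length")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no bad samples")
    return top_rate / base_rate


# Illustrative data only; the values are made up for this sketch.
records = [
    {"smoker": "yes", "exposure": 50.0, "claims": 2, "predicted": 1.5},
    {"smoker": "yes", "exposure": 30.0, "claims": 1, "predicted": 0.9},
    {"smoker": "no", "exposure": 200.0, "claims": 2, "predicted": 2.4},
]
for value, row in one_way_table(records, "smoker").items():
    print(value, row)
```

A level whose `credible` flag is false is a candidate for elimination or grouping, and a lift above 1 means the model finds bad samples more often than random selection would. What counts as sufficient exposure or sufficient lift remains an actuarial judgment.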
Application Implementation
Application implementation is the most difficult step in actuarial big data practice. It requires careful consideration of application scenarios and of suitable forms of implementation that combine actual business needs with the model results. It also requires being well-versed in business rules and logic and being able to transform professional, complex analysis logic into an applicable and implementable form embedded in business processes. The application implementation process often requires the assistance and cooperation of personnel from other business lines, such as the IT and research and development departments.
Promote Innovative Integration
With the gradual development of differentiated and customized life insurance products, along with the more refined and specialized requirements of life insurance operations, the application of actuarial big data will become deeper and more extensive. Actuaries should keep an open mind and actively promote the integration of actuarial science with technologies such as big data, so that the combination achieves a “1+1>2” effect.
Focus on Comprehensive Ability Training and Strengthen Professional Applications
For future actuarial big data applications, actuaries should innovate on existing business processes and management modes to provide better actuarial services. For example, in risk quantification, multidimensional data, the corresponding analysis tools and in-depth research on insurance risks help actuaries explore risk measurement methods in more refined dimensions using actuarial tools. This can provide an actuarial basis for accurate pricing and differentiated risk management. As another example, through in-depth correlation with big data, a company can conduct more multidimensional research on the factors that influence long-term interest rate trends to support its asset and liability management. Beyond improving existing service models, it is necessary to develop new management modes and other related application areas.
Maintain a Scientific Cognition of Risk
Life insurance provides a long-term risk guarantee, and the essence of the operation is risk management. Big data gives actuaries suitable methods and means to measure risk. The underlying changes in data will make actuarial models different. Still, the most basic actuarial logic remains unchanged, and the relevant applications of big data are not a subversion of insurance principles and basic laws. Rather, they help to realize the essence of insurance and actuarial science better. Therefore, while actuaries continue to strengthen the big data application depth and broaden the application field, they must grasp the essence of life insurance and follow the actuarial logic for reasonable integration and innovation.
With big data, the number of data dimensions has increased significantly, making it more necessary for actuaries to weigh and optimize the information dimensions and reduce interference from redundant information. Actuaries should never blindly pursue model complexity and large computing power. They should consider all relevant steps of the value chain, coordinate different models and business rules based on actuarial principles, and build a complete, logical closed loop to ensure the application of big data does not deviate from the essence of the business. At the same time, it is necessary to carry out relevant analysis in light of the actual business situation to ensure that the application results are scientifically sound, reasonable and usable.
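One simple way to reduce redundant information is to drop features that are almost perfectly correlated with a feature already kept. The sketch below assumes Pearson correlation as the redundancy measure and a cutoff of 0.9; both are illustrative choices rather than methods described in this article, and the feature names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature is treated here as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def drop_redundant(features, threshold=0.9):
    """Keep features in order, skipping any that duplicate a kept feature.

    features maps a feature name to its list of values. A feature is skipped
    when its absolute correlation with a kept feature exceeds the threshold.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Illustrative data only: age in months carries the same information as age.
features = {
    "age": [30, 40, 50, 60],
    "age_months": [360, 480, 600, 720],
    "bmi": [22, 31, 24, 27],
}
print(drop_redundant(features))
```

Because the filter is greedy, the order of the features decides which of two correlated variables survives. Listing the variables with the clearest business interpretation first keeps the result aligned with actuarial judgment.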
This article mainly summarizes the company’s working practice in actuarial big data. Special thanks to Li Mingguang and Peggy Hou for their guidance and support and to Li Xiangye for participating in the discussion.
Statements of fact and opinions expressed herein are those of the individual authors and are not necessarily those of the Society of Actuaries or the respective authors’ employers.
Copyright © 2023 by the Society of Actuaries, Schaumburg, Illinois.