COMPUTER SYSTEMS APPLICATIONS, 2023, 32(3): 338-344
Research on Interpretative TreeSHAP Based on CatBoost’s Credit Utilization Prediction Model
(1. School of Information Engineering, Ningxia University, Yinchuan 750021, China; 2. Laboratory of Financial Big Data, Bank of Shizuishan, Yinchuan 750011, China; 3. School of Advanced Interdisciplinary Studies, Ningxia University, Zhongwei 755099, China)
Received: August 17, 2022; Revised: September 15, 2022
Abstract: After a bank approves a client's credit line, accurately predicting whether the client will draw on it, and identifying the key factors behind that decision, is important for improving the bank's client service and profitability. Machine learning algorithms have rarely been applied to credit utilization prediction, and model interpretability has received little study in this area. This study therefore proposes an interpretable credit utilization prediction model that combines CatBoost with TreeSHAP. A prediction model is first built with CatBoost and tuned with three hyperparameter optimization algorithms, whose results are compared. The tuned model is then compared experimentally with baseline models on four main performance metrics; the results show that the model optimized with the TPE algorithm outperforms the other models on these metrics. Finally, TreeSHAP is used to improve the model's interpretability at both the global and local levels and to analyze the factors that influence client credit utilization, providing a decision-making basis for banks' precision marketing to clients.
Keywords: credit utilization prediction; interpretability; TPE; CatBoost; TreeSHAP; machine learning
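The global and local interpretability mentioned in the abstract rests on Shapley values: each feature receives a contribution, and the contributions for one client sum to the gap between that client's prediction and the average prediction. The sketch below illustrates this property only. It enumerates feature subsets by brute force rather than using TreeSHAP's tree-specific algorithm, and the tree, feature names, and data are hypothetical stand-ins, not the authors' model or dataset.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names and toy background data (not from the paper).
FEATURES = ["credit_limit", "interest_rate", "prior_loans"]
BACKGROUND = [
    (50.0, 6.0, 0),
    (120.0, 4.5, 2),
    (80.0, 5.5, 1),
    (200.0, 4.0, 3),
]


def toy_tree(x):
    """Hand-written decision tree standing in for a trained model.

    Returns a made-up probability that the client uses the credit line.
    """
    limit, rate, loans = x
    if rate < 5.0:
        return 0.9 if loans >= 2 else 0.6
    return 0.4 if limit > 60.0 else 0.1


def value(x, subset):
    """Average model output when features in `subset` are fixed to x's values
    and the remaining features are taken from each background row."""
    total = 0.0
    for row in BACKGROUND:
        mixed = tuple(x[i] if i in subset else row[i] for i in range(len(x)))
        total += toy_tree(mixed)
    return total / len(BACKGROUND)


def shapley(x):
    """Exact Shapley values by enumerating all subsets (exponential cost)."""
    n = len(x)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                fixed = set(s)
                contrib += weight * (value(x, fixed | {i}) - value(x, fixed))
        phi.append(contrib)
    return phi


def global_importance(rows):
    """Global view: mean absolute Shapley value of each feature over rows."""
    per_row = [shapley(r) for r in rows]
    return [
        sum(abs(p[i]) for p in per_row) / len(per_row)
        for i in range(len(FEATURES))
    ]


if __name__ == "__main__":
    client = (120.0, 4.5, 2)  # hypothetical client
    # Local view: per-feature contributions for one client.
    for name, phi in zip(FEATURES, shapley(client)):
        print(f"{name}: {phi:+.4f}")
    # Global view: average contribution magnitude across the background set.
    for name, imp in zip(FEATURES, global_importance(BACKGROUND)):
        print(f"{name}: {imp:.4f}")
```

With a real CatBoost model, a tree-specific SHAP implementation would take the place of the `shapley` function here, since subset enumeration becomes impractical beyond a handful of features.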
Funding: National Natural Science Foundation of China (71461025); Natural Science Foundation of Ningxia (2020A1166)
Cite this article:
MA Shuo, LI Zhao, ZHAO Jun. Research on Interpretative TreeSHAP Based on CatBoost’s Credit Utilization Prediction Model. COMPUTER SYSTEMS APPLICATIONS, 2023, 32(3): 338-344