Journal of Systems & Management ›› 2020, Vol. 29 ›› Issue (4): 629-638. DOI: 10.3969/j.issn.1005-2542.2020.04.002


Credit Loan Evaluation Model Based on Natural Language Processing and Deep Learning

ZHAO Xuefeng (赵雪峰), WU Weiwei (吴伟伟), SHI Huining (时辉凝)

  1. School of Management, Harbin Institute of Technology, Harbin 150000, China; 2. School of Accounting, Guangdong University of Foreign Studies, Guangzhou 510000, China
  • Online: 2020-07-29 Published: 2020-08-07

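  • Corresponding author: WU Weiwei (1978-), male, professor.
  • About the author: ZHAO Xuefeng (1993-), male, master's student. Research interests: technology forecasting and innovation management.
  • Supported by:

    National Natural Science Foundation of China (71472055); Key Program of the National Social Science Fund of China (16AZD0006);

    Fundamental Research Funds for the Central Universities (HIT.NSRIF.2019033)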

Abstract:

Current credit loan evaluation models suffer from complex feature preprocessing, interference from subjective factors, and low accuracy. To address these problems, a novel model is proposed. It first assembles credit features into continuous text data, vectorizes the words with the Word2Vec algorithm, and then passes the vectors through a word embedding layer into a convolutional neural network (CNN) for evaluation. An empirical analysis is conducted with the Keras framework on bank personal credit data from 2008 to 2018. The results show that the overall evaluation accuracy of the novel model reaches 91.7%. Records with missing features can be evaluated directly, without any processing of the missing features, at an accuracy of 85.8%. By transforming discrete credit features into continuous text, the model reduces the complexity of feature preprocessing. Combining Word2Vec with natural language processing allows records with missing credit features to be assessed directly. The strong feature analysis capability of the CNN improves the robustness of the credit evaluation model, alleviates some of the problems of current credit loan evaluation models, and avoids the interference of subjective factors in the evaluation.

Key words: natural language processing, convolutional neural network (CNN), deep learning, credit loan
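
The text-construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, values, and the `name_value` token scheme are hypothetical, and skipping a missing feature is only one assumed way of evaluating a record "directly". The sketch turns each credit record into a token sequence and encodes it as padded integer ids, the kind of input a word embedding layer (for example one initialized from Word2Vec vectors) would typically consume.

```python
from typing import Dict, List, Optional

# Hypothetical record layout: feature name -> categorical value (None if missing).
Record = Dict[str, Optional[str]]


def record_to_tokens(record: Record) -> List[str]:
    """Turn one credit record into a token sequence (a "sentence").

    Each present feature becomes one token "name_value".
    Assumption: a missing feature (None) is simply left out of the sentence.
    """
    return [f"{name}_{value}" for name, value in record.items() if value is not None]


def build_vocab(sentences: List[List[str]]) -> Dict[str, int]:
    """Map each distinct token to an integer id; id 0 is reserved for padding."""
    vocab: Dict[str, int] = {}
    for sentence in sentences:
        for token in sentence:
            if token not in vocab:
                vocab[token] = len(vocab) + 1
    return vocab


def encode(sentence: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Convert tokens to ids, truncate to max_len, and right-pad with 0."""
    ids = [vocab[token] for token in sentence if token in vocab][:max_len]
    return ids + [0] * (max_len - len(ids))


# Two made-up records; the second one has a missing "housing" feature.
records: List[Record] = [
    {"education": "bachelor", "housing": "owned", "income": "high"},
    {"education": "master", "housing": None, "income": "medium"},
]

sentences = [record_to_tokens(r) for r in records]
vocab = build_vocab(sentences)
encoded = [encode(s, vocab, max_len=3) for s in sentences]
print(encoded)
```

In this sketch the record with the missing feature simply yields a shorter sentence that is padded to the common length, so no separate handling of the gap is required before the sequence reaches the embedding layer.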


CLC Number: