Research on Dynamic Prediction Model of Consumer Credit Risk under Fintech Innovation
Published online: 17 March 2025
Received: 04 October 2024
Accepted: 26 January 2025
DOI: https://doi.org/10.2478/amns-2025-0341
© 2025 Yangyudongnanxin Guo, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
In recent years, commercial banks have stepped up their promotion of consumer loans, and the scale of consumer lending has grown accordingly [1]. China’s economic development has raised household wealth and greatly improved people’s capacity to consume, opening up new room for the consumer credit business. Personal consumer credit is expected to become an important source of income for commercial banks in the future [2]. Consumer credit is currently one of the important businesses within commercial banks’ comprehensive financial services, and it plays an important role in helping banks reduce homogeneous competition and find a differentiated development path [3]. As China’s economy continues to develop, people’s quality of life and disposable income have increased substantially, and overall demand for consumer credit has grown rapidly, which in turn promotes economic development. The personal credit market allows residents to obtain financial support through consumer credit, further satisfying their consumption needs and greatly raising their consumption [4-5].
Traditional banking business can no longer meet banks’ need for diversified development and is in urgent need of transformation. Under enormous competitive pressure, banks have gradually begun to optimize and adjust their business strategies [6]. On the one hand, as the economy continues to grow, residents’ income and consumption demand have both increased significantly, and financial institutions have gradually shifted human, financial, and material resources to personal consumer credit [7]. On the other hand, the low risk of personal consumer credit transactions can effectively improve the asset quality structure of financial institutions [8]. Because the delinquency rate of consumer credit is much lower than that of corporate credit and its profit margins are higher, competition in the consumer credit market is becoming more and more intense. The personal consumer credit business has therefore become a new source of income for commercial banks and will be the most important development direction in this field in the coming period, and the consumer credit business of commercial banks continues to grow [9-10]. Personal consumer credit is one of the important ways for banks to generate profits, and developing it soundly can substantially increase the profits of Chinese commercial banks, but it also increases their business risks [11]. Any business operation carries operational risk, and this is especially true of personal credit transactions, whose volume is rising. Although the total volume of personal consumer credit transactions in China is growing rapidly, the risks involved are becoming more diverse, which requires commercial banks to strengthen their risk management awareness [12-13].
For commercial banks, identifying and managing risk is usually the most important problem in developing personal consumer credit, and establishing a sound risk early warning indicator system and risk early warning model is the top priority [14].
The existing risk early warning indicator system of commercial banks relies heavily on the People’s Bank of China (PBOC) credit system. However, the PBOC credit system has structural defects that limit its coverage [15]. In addition, because of inherent design problems, the system cannot record customers’ recent credit behaviors, so its credit data lag behind and lack timeliness and comprehensiveness. Relying solely on this traditional credit assessment method therefore can no longer meet commercial banks’ needs for consumer credit risk management in the era of big data [16-18].
Consumer credit is developing and spreading rapidly, and consumer credit risk assessment is becoming more and more important. Current research on consumer credit risk assessment mainly explores how big data, artificial intelligence, and machine learning algorithms can optimize credit risk assessment strategies, and it also keeps trying to discover and improve the indicators used in credit risk assessment models. Literature [19] surveyed residents’ income, housing value, education level, and occupational status, used credit scores and SEP measures to predict self-assessed health, found a moderate correlation between credit scores and the SEP indicators, and pointed out that credit scores can effectively complement the SEP indicators. Literature [20], based on a literature review, reveals that artificial intelligence and machine learning algorithms can analyze information asymmetry, adverse selection, and moral hazard among consumer credit customers using public data. Literature [21] proposed a credit risk assessment model that accounts for political and economic crises and tested it by simulation. The model’s predictions were broadly in line with actual non-performing loans and were endorsed by experts. Literature [22] discusses multi-rule based decision making (MRDM), which has considerable potential in consumer credit assessment, and uses real cases to compare the roles, advantages, and disadvantages of two MRDM methods in assisting consumer credit assessment.
Literature [23] used bibliometric analysis to examine journal articles that combine credit risk assessment with big data technologies. It found that big data in credit assessment practice is a hot and growing topic in the field, and noted that current research has effectively improved knowledge and understanding of big data in credit risk assessment. Literature [24] proposed a two-stage credit risk model that introduces target evolutionary feature selection to minimize both the misclassification cost (root-mean-square error) and the number of attributes required for the PD and EAD models. Experiments showed that the proposed model performs well in terms of prediction accuracy and cost-effectiveness. Literature [25] aims to improve the accuracy of credit risk assessment by proposing a modified binary particle swarm optimization algorithm (MBPSO) based on the logic of the GK algorithm and applying it together with the GK algorithm, obtaining more robust and accurate credit assessments. Literature [26] explored how consumers’ personal information affects credit risk assessment and showed that such information helps predict individuals’ repayment behavior and that incorporating it into credit risk assessment improves prediction accuracy.
Literature [27] analyzes consumer credit risk assessment along three dimensions: classification algorithms, data features, and machine learning methods. It categorizes each of the three, builds on that basis a data-feature-driven modeling framework with multiple classifiers, and finally examines the model’s interpretability, fairness, and multimodality, making a positive contribution to the field of credit risk assessment. Literature [28] proposed an online consumer credit risk inference method based on data augmentation and model enhancement strategies. It adds consumer profile information, while multi-stage view monitoring based on consumer repayment time improves the accuracy of credit risk prediction.
Reasonable use and management of consumer credit benefit both individuals and society, so it is meaningful to understand the logic behind the consumer credit boom and to improve the supervision and management of consumer credit. Literature [29] builds a measurement model to assess how material desire affects credit card use, impulsive buying, and compulsive buying. The results show that material desire significantly promotes impulsive buying, and the authors argue that reducing material desire can reduce both impulsive and compulsive buying. Literature [30] discusses the international consensus on regulating the consumer credit market, which includes appropriate supervision by regulators, mandatory information disclosure, and a reasonable cost of credit financing, and concludes with an in-depth analysis of how payment intermediaries can play an active role in cross-border purchasing. Literature [31] builds a database of loan characteristics from a sample of more than one million personal consumer loans from LendingClub, which effectively improves the accuracy of consumer loan default prediction, and points out that the regulation and guidance of loan origination platforms should be made more transparent. Literature [32] quantifies the explanatory power of liquidity constraints and anchoring theory using changes in issuers’ minimum payment formulas, pointing out that anchoring to a contractual clause facilitates households’ repayment decisions.
To meet the demand for consumer credit risk prediction in the context of financial technology, this article constructs a consumer credit risk prediction model based on survival analysis. The survival data are first divided according to survival time theory, and the Kaplan-Meier method is used to estimate the survival function. After the survival data are tested and the model fit is checked, the consumer credit risk prediction model is built with the Cox proportional risk model. The Cox proportional risk model is then compared with risk prediction models such as RandomForest, XGBoost, and LightGBM, and its prediction performance is tested using ROC curves, KS values, and probability values together. Finally, the predictive validity of the Cox model is further verified by an empirical analysis of whether borrowers become overdue and how long they take to do so.
Financial technology (fintech) refers to a new form of financial business produced by organically integrating modern science and technology with finance and by deeply transforming and innovating financial products and the management of financial enterprises [33]. It is not a simple combination of finance and technology. Rather, it uses big data, blockchain, artificial intelligence, cloud computing, and other advanced technologies to make financial services transparent, intelligent, and digital, producing a value-added effect of “1+1>2”. The development of fintech aims to enhance the operational efficiency of the financial industry, optimize customer experience, and give rise to more advanced and convenient financial products and modern financial management systems.
Big data technology is an advanced technology for processing and analyzing massive data sets, especially referring to those data that exceed the processing capacity of traditional databases. It relies heavily on sophisticated data processing software that can quickly analyze, process, and extract valuable information.
In the banking industry, the application of big data technology is becoming increasingly broad and deep. First, it plays an important role in risk management, enabling more effective identification and management of credit and market risks. Second, in terms of customer service, big data technology helps banks analyze customer needs and provide more personalized services. In addition, big data technology plays a key role in banks’ marketing strategies. By analyzing customer data, banks can design more accurate marketing campaigns. Also, big data technology is very effective in preventing financial fraud. Finally, big data technology also helps banks improve their internal operational efficiency.
Blockchain technology is a distributed ledger technology that makes data transmission secure and reliable in a decentralized network. In this network, data is stored in the form of blocks connected by encrypted chains, with each new block containing encrypted information from the previous block, thus ensuring data immutability and transparency. This structure makes blockchain an ideal system for recording and sharing data. In the banking industry, blockchain technology is mainly used to improve transaction efficiency and security.
Artificial intelligence (AI) technology mimics human thought processes, including the ability to learn, reason, self-correct, and automate decision-making. It uses algorithms and large amounts of data to enable machines to solve complex problems and perform specific tasks. In the banking industry, AI is used to improve service efficiency and customer experience.
Cloud computing is a technology that provides computing resources and data storage services over the Internet. It allows users to access and use software, storage, and other computing functions located on remote servers over the Internet without having to manage physical servers or run application software locally.
In the banking industry, the use of cloud computing is growing, revolutionizing this traditional industry. First, cloud computing provides powerful data processing capabilities that help banks efficiently handle large amounts of transaction data and enable fast, accurate data analysis. Second, cloud computing helps banks cope with business fluctuations by providing flexible resource allocation. In addition, cloud computing supports innovation in the banking industry. Finally, cloud computing also plays a key role in improving the security and compliance of banking operations.
Consumer loans specifically refer to loan services provided by banks or other financial institutions to individual consumers with a defined consumption purpose. Such loans are mainly based on the borrower’s credit history and future repayment ability and are used to satisfy their needs for purchasing consumer goods or paying for other personal consumption.
Thanks to personal consumer credit, public deposit funds can be further circulated, which can effectively alleviate the financial pressure on families or individuals. Under the joint promotion of “consumption upgrading, policy support, and financial technology development”, China’s consumer credit system is becoming increasingly mature. According to the purpose of the loan, personal consumer credit can be categorized into housing and non-housing categories.
Housing consumer loans, also known as personal housing loans, provide financial support for borrowers to purchase various types of housing for their own use, including ordinary housing and villas. This type of loan usually has a large amount, up to tens of millions of dollars, and a long loan period, usually 1-30 years, and the sum of the borrower’s age and the loan period does not exceed 30 years. Because the mortgage loan cycle is generally long, the borrower can choose among different repayment methods, such as equal monthly installments of principal and interest, equal monthly repayments of principal, and repayment of principal and interest in a single lump sum at maturity. In order to minimize business risks, commercial banks often require borrowers to provide security in the form of mortgages or pledges. If the borrower fails to repay the loan in accordance with the contract, the commercial bank has the right to dispose of the collateral to recover the loan. In the past, the interest rate for housing loans was fixed. With the introduction of the loan prime rate (LPR), borrowers can choose to repay their loans at floating interest rates.
Non-housing consumer loans refer to personal consumption loans other than those used for the purchase of a home, which have a wide range of uses, including but not limited to the purchase of automobiles, home renovations, and the purchase of large consumer goods. Based on the length of the repayment period, non-housing consumer loans can be further classified into installment and non-installment modes. The installment mode is usually used for larger amounts of consumption, such as purchasing a car, renovation, etc., while the non-installment mode is usually used for smaller amounts of consumption or emergencies.
Consumer credit risk refers to the possibility that the borrower of personal consumer credit is unable to fulfill the repayment obligation according to the contract due to various uncertainties, thus exposing the commercial bank to the loss of funds [34]. This kind of risk runs through the three links of pre-credit, credit, and post-credit, including but not limited to incomplete pre-credit investigation, non-compliance of credit review, as well as untimely and unstandardized post-credit management. When the borrower defaults, commercial banks will face the risk of non-performing loans, which will affect their asset quality and operational stability.
The characteristics of consumer credit risk mainly include: First, the diversity of risk sources, which may come from the borrower’s own credit risk but also from the market environment, policy adjustment, and other external factors. Second, the risk is hidden. Part of the risk may be difficult to detect at the beginning of the loan but gradually exposed over time. Thirdly, risks are contagious, and once a risk arises in a certain link, it may spread to the entire credit chain and affect the overall operation of the bank.
Survival time refers to the time that elapses between a defined starting point of the study and the occurrence of an endpoint event for the research object. Survival time has different units and ways of measurement depending on how it is defined, and its probability density function can be categorized into continuous and discrete types [35]. These two types of functions are explained in detail below.
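Assuming that $T$ is a continuous non-negative random variable denoting survival time, with probability density function $f(t)$, its cumulative distribution function is:

$$F(t) = P(T \le t) = \int_0^t f(u)\,du \tag{1}$$

In contrast, the probability of an individual surviving longer than time $t$ is given by the survival function:

$$S(t) = P(T > t) = 1 - F(t) \tag{2}$$

If first-order differentiation is done on $F(t)$, the probability density function is obtained:

$$f(t) = \frac{dF(t)}{dt} = -\frac{dS(t)}{dt} \tag{3}$$

Another way of describing the magnitude of the probability of death in the survival analysis approach is to express the degree of risk, or instantaneous mortality, that an individual may die when it is known that the individual is still alive at the moment of $t$. This is called the risk function (hazard function):

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} \tag{4}$$

From equations (1) and (2) we know:

$$h(t) = \frac{f(t)}{S(t)} = -\frac{d \ln S(t)}{dt} \tag{5}$$

The survival function is then obtained by taking the integral of the above equation and converting it to exponential form:

$$S(t) = \exp\left(-\int_0^t h(u)\,du\right) \tag{6}$$

That is, writing $H(t)$ for the cumulative risk function:

$$S(t) = \exp\left(-H(t)\right), \qquad H(t) = \int_0^t h(u)\,du \tag{7}$$

Then, based on the relationship between the functions, the probability density function of an individual at time point $t$ can be obtained:

$$f(t) = h(t)\,S(t) = h(t)\exp\left(-\int_0^t h(u)\,du\right) \tag{8}$$

Similarly, assuming that $T$ is a discrete random variable taking the values $t_1 < t_2 < \cdots$, its probability mass function is:

$$p(t_i) = P(T = t_i), \qquad i = 1, 2, \ldots \tag{9}$$

At this point, the survival function is represented by the following equation:

$$S(t) = P(T > t) = \sum_{i:\, t_i > t} p(t_i) \tag{10}$$

where the sum runs over all time points $t_i$ later than $t$.

The risk function at $t_i$ is the conditional probability that the event occurs at $t_i$ given survival up to $t_i$, with $S(t_0) = 1$:

$$h(t_i) = P(T = t_i \mid T \ge t_i) = \frac{p(t_i)}{S(t_{i-1})} \tag{11}$$

From equation (10), we know that $p(t_i) = S(t_{i-1}) - S(t_i)$.

So equation (11) can be rewritten as:

$$h(t_i) = 1 - \frac{S(t_i)}{S(t_{i-1})} \tag{12}$$

That is:

$$S(t) = \prod_{i:\, t_i \le t} \left[1 - h(t_i)\right] \tag{13}$$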
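The discrete-time relationships above can be checked numerically. The following is a minimal Python sketch, not taken from the original paper: the function names and the risk values are hypothetical and chosen purely for illustration. It builds the survival function as a running product of one minus the risk at each time point and then recovers the probability mass as the drop in survival between consecutive time points.

```python
def survival_from_risks(risks):
    """Return S(t_1), S(t_2), ... as the running product of (1 - h(t_i))."""
    survival = []
    current = 1.0  # S(t_0) = 1: everyone is alive at the start
    for h in risks:
        current *= 1.0 - h
        survival.append(current)
    return survival


def pmf_from_survival(survival):
    """Return p(t_i) = S(t_{i-1}) - S(t_i), taking S(t_0) = 1."""
    previous = 1.0
    pmf = []
    for s in survival:
        pmf.append(previous - s)
        previous = s
    return pmf


# Hypothetical risks at three consecutive time points (illustration only).
risks = [0.10, 0.20, 0.50]
survival = survival_from_risks(risks)
pmf = pmf_from_survival(survival)

print(survival)
print(pmf)
```

Dividing each probability mass by the survival value of the previous time point returns the risks that were supplied, which is the consistency the derivation relies on.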
In survival analysis, study data are divided into two types, complete data and censored data, according to the final outcome observed for each sample.
If the full survival time of an individual can be observed at the time of data collection or experimentation (i.e., the individual can be determined to be a normal or a past-due customer), the data are called complete data. Conversely, if an individual cannot be observed continuously because of some uncertainty, or has not yet failed or died when the predetermined data collection or experimental period ends (a period limited by costs in human resources, material resources, and time), so that the researcher cannot confirm the true survival time, the data are referred to as censored data.
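The distinction between complete and censored data can be made concrete with a small sketch. The Python function below is an illustrative assumption about how a loan record might be converted into a survival record: the function name, its arguments, and the dates are hypothetical and do not come from the paper. A loan whose first overdue date falls inside the observation window yields complete data, and every other loan is censored at the end of the window.

```python
from datetime import date


def to_survival_record(start, overdue_date, observation_end):
    """Return (duration_in_days, event_observed) for one loan.

    event_observed is True for complete data (the overdue event was seen
    inside the observation window) and False for censored data.
    """
    if overdue_date is not None and overdue_date <= observation_end:
        return (overdue_date - start).days, True
    # No overdue event seen before the window closed: censor at the window end.
    return (observation_end - start).days, False


# Hypothetical loans, for illustration only.
window_end = date(2024, 6, 30)
complete = to_survival_record(date(2024, 1, 1), date(2024, 3, 1), window_end)
censored = to_survival_record(date(2024, 1, 1), None, window_end)

print(complete)
print(censored)
```

For a censored record the duration is only a lower bound on the true survival time, which is why the estimators that follow treat the two kinds of record differently.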
The main method used in the industry is the Kaplan-Meier method, a non-parametric method that estimates the survival function and plots it as a curve. In practice, the turning points of the curve and its degree of variation are easy to observe. Because the estimator takes the form of a product limit, it is also called the product-limit (PL) method.
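The Kaplan-Meier method arranges the survival times of complete and censored data from smallest to largest, assuming that $t_{(1)} < t_{(2)} < \cdots < t_{(k)}$ are the distinct observed event times, $n_i$ is the number of individuals still at risk just before $t_{(i)}$, and $d_i$ is the number of events at $t_{(i)}$.
Then, the estimation method of the survival function is expressed as:

$$\hat{S}(t) = \prod_{i:\, t_{(i)} \le t} \frac{n_i - d_i}{n_i}$$

In this paper, the Kaplan-Meier method will be applied to estimate the cumulative survival rate $\hat{S}(t)$.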
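As a minimal sketch of the product-limit calculation, the following Python function computes the Kaplan-Meier estimate from a list of durations and event indicators. It is an illustration under stated assumptions rather than the paper's implementation: the function name and the six sample records are hypothetical, and censored records at a given time are treated as still at risk when events at that same time are counted.

```python
def kaplan_meier(durations, events):
    """Return [(t, S_hat(t))] at each distinct time with at least one event.

    durations: observed times (event time or censoring time)
    events:    True if the event was observed, False if censored
    """
    at_risk = len(durations)
    estimate = 1.0
    curve = []
    for t in sorted(set(durations)):
        deaths = sum(1 for u, e in zip(durations, events) if u == t and e)
        censored = sum(1 for u, e in zip(durations, events) if u == t and not e)
        if deaths > 0:
            # Product-limit step: multiply by the share surviving past t.
            estimate *= (at_risk - deaths) / at_risk
            curve.append((t, estimate))
        # Everyone observed at t leaves the risk set afterwards.
        at_risk -= deaths + censored
    return curve


# Hypothetical sample: six borrowers, two of them censored.
durations = [2, 3, 3, 5, 8, 8]
events = [True, True, False, True, False, True]
curve = kaplan_meier(durations, events)

for t, s in curve:
    print(t, round(s, 4))
```

Censored borrowers never reduce the estimate directly. They only shrink the risk set used at later event times.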
When we want to compare more than
where
When a model is applied in practice, we generally cannot be sure in advance that it suits the problem, because one or more of its features may not be appropriate for the data at hand. The model should therefore be checked against the data before it is used.
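A common method used in the industry is to make a log-log plot, which is a logarithmic transformation of the estimated survival function. Assuming that all samples are now divided into $G$ groups according to the value of a covariate, with estimated survival function $\hat{S}_g(t)$ for group $g$, the curves $\ln[-\ln \hat{S}_g(t)]$ are plotted against $\ln t$. If the curves of the groups are approximately parallel, the proportional risk model matches the data.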
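The reason parallel curves indicate a good match can be seen in a short numerical sketch. The Python code below is illustrative only: the function name, the two groups, and the exponential survival curves are hypothetical assumptions. It constructs a group whose risk is exactly twice that of another group at every time point and shows that the vertical gap between the two transformed curves is constant.

```python
import math


def log_minus_log(curve):
    """Map (t, S(t)) points to (ln t, ln(-ln S(t))).

    Points where the transform is undefined (t <= 0, S <= 0 or S >= 1) are skipped.
    """
    return [(math.log(t), math.log(-math.log(s)))
            for t, s in curve if t > 0 and 0.0 < s < 1.0]


# Hypothetical curves: group B has twice the risk of group A at every time,
# so under proportional risks S_B(t) = S_A(t) ** 2.
times = [1, 2, 4, 8]
group_a = [(t, math.exp(-0.1 * t)) for t in times]
group_b = [(t, s ** 2) for t, s in group_a]

gaps = [yb - ya
        for (_, ya), (_, yb) in zip(log_minus_log(group_a), log_minus_log(group_b))]
print(gaps)
```

Every gap equals the natural logarithm of the risk ratio, so the two curves are vertical shifts of each other. With real data the curves are only approximately parallel, which is why a formal test is still needed.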
The Cox proportional risk model (also known as the Cox proportional hazards model), which was first applied to the pricing of bonds and some financial products, has gradually been used to measure credit risk in recent years with the continuous development of commercial banks’ credit risk measurement models [36]. This subsection introduces the basic form of the Cox proportional risk model and the estimation and testing of its parameters.
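Let the density function of survival time $T$ be $f(t)$ and its survival function be $S(t)$, so that the risk function is $h(t) = f(t)/S(t)$.
If $X = (x_1, x_2, \ldots, x_p)^{\top}$ is the vector of covariates that affect survival time, the Cox proportional risk model is:

$$h(t \mid X) = h_0(t)\exp\left(\beta^{\top} X\right) = h_0(t)\exp\left(\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p\right) \tag{18}$$

Here $h_0(t)$ is the baseline risk function, i.e., the risk function when all covariates are zero, and $\beta = (\beta_1, \beta_2, \ldots, \beta_p)^{\top}$ is the vector of regression coefficients.
From equation (18):

$$\ln\frac{h(t \mid X)}{h_0(t)} = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p \tag{19}$$

where $\beta_k$ measures the change in the logarithm of the relative risk for a one-unit increase in $x_k$. For two individuals with covariate vectors $X_i$ and $X_j$, the risk ratio is:

$$\frac{h(t \mid X_i)}{h(t \mid X_j)} = \exp\left[\beta^{\top}(X_i - X_j)\right] \tag{20}$$

It follows from Eq. (20) that for all $t \ge 0$ the risk ratio of any two individuals does not depend on time, which is why the model is called the proportional risk model.
In practical problems, covariates may change with time. Writing the covariate vector as $X(t)$, the model becomes:

$$h(t \mid X(t)) = h_0(t)\exp\left(\beta^{\top} X(t)\right) \tag{21}$$

Equation (21) is called the generalized Cox model. When $X(t) = X$ does not depend on time, Equation (21) reduces to Equation (18).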
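The time-independence of the risk ratio can be illustrated with a few lines of Python. This is a sketch under assumptions: the coefficient values, the two borrowers, and the baseline risk values are hypothetical and are not estimates from the paper. The code evaluates the model for two borrowers at several baseline risk levels and shows that their ratio never changes.

```python
import math


def relative_risk(beta, x):
    """exp(beta . x): the factor by which covariates x scale the baseline risk."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))


def risk_at(baseline_risk, beta, x):
    """Risk of an individual with covariates x when the baseline risk is given."""
    return baseline_risk * relative_risk(beta, x)


# Hypothetical coefficients and covariates (illustration only).
beta = [0.4, -0.8]
borrower_a = [1.0, 0.5]
borrower_b = [0.0, 0.5]

# The baseline risk changes over time, but the ratio of the two risks does not.
ratios = [risk_at(h0, beta, borrower_a) / risk_at(h0, beta, borrower_b)
          for h0 in (0.01, 0.02, 0.05)]
print(ratios)
```

The baseline risk cancels in the ratio, leaving only the exponential of the coefficient-weighted difference between the two covariate vectors.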
The proportional risk assumption, or PH assumption, means that the effect of the covariates in the model does not change with time. That is, the risk functions of different individuals are proportional to each other [37]. The PH assumption must be tested before the Cox proportional risk model is built, because the model is valid only if the assumption holds. If the PH assumption is not satisfied, first, the significance of the variables that violate it is distorted. Second, if the risk ratio increases with time, the relative hazard ratio is overestimated, and if it decreases with time, the relative hazard ratio is underestimated.
In academic research, common methods for testing PH assumptions include the graphical method and the test method.
(1) Graphical method. The graphical method determines whether the PH assumption is satisfied by observing the distribution of points in a scatterplot. It is simple to operate and can be applied to many kinds of variables, including continuous, binary, and hierarchical variables, so the researcher can judge intuitively whether the PH assumption holds, with a certain degree of credibility. However, human judgment is subjective, and it is sometimes difficult to determine how much deviation leads to model error, which affects the validity of the model. It is therefore necessary to use statistical tests to judge whether deviations from the PH assumption are statistically significant.
(2) Testing method. The testing method constructs a statistic to test the fitted model and uses the P value derived from that statistic to determine whether the data meet the assumptions of the model. It mainly includes the time-covariate method, the generalized linear regression method, etc. These tests are based on the null hypothesis that the risk ratio is zero and do not require the stratification of time and covariates, so they are more commonly used in testing the PH assumption.
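For the parameter part of the Cox proportional risk model, this paper adopts the partial likelihood estimation method. Assuming a sample size of $n$, of which $k$ individuals have an observed event and $n - k$ are censored, the observed event times are ordered as $t_{(1)} < t_{(2)} < \cdots < t_{(k)}$, and $X_{(i)}$ denotes the covariate vector of the individual whose event occurs at $t_{(i)}$.
Let $R(t_{(i)})$ be the risk set at time $t_{(i)}$, i.e., the set of individuals who are still under observation and have not experienced the event just before $t_{(i)}$.
If an individual in $R(t_{(i)})$ dies at time $t_{(i)}$, the conditional probability that it is the individual with covariates $X_{(i)}$ is:

$$\frac{\exp\left(\beta^{\top} X_{(i)}\right)}{\sum_{j \in R(t_{(i)})} \exp\left(\beta^{\top} X_j\right)}$$

Multiplying the conditional probabilities of death at all time points gives the partial likelihood function, shown in equation (24):

$$L(\beta) = \prod_{i=1}^{k} \frac{\exp\left(\beta^{\top} X_{(i)}\right)}{\sum_{j \in R(t_{(i)})} \exp\left(\beta^{\top} X_j\right)} \tag{24}$$

Taking logarithms of Eq. (24) and setting the derivatives with respect to $\beta$ to zero gives:

$$\frac{\partial \ln L(\beta)}{\partial \beta} = \sum_{i=1}^{k}\left[X_{(i)} - \frac{\sum_{j \in R(t_{(i)})} X_j \exp\left(\beta^{\top} X_j\right)}{\sum_{j \in R(t_{(i)})} \exp\left(\beta^{\top} X_j\right)}\right] = 0$$

The maximum likelihood estimate $\hat{\beta}$ of parameter $\beta$ is obtained by solving this system of equations, which in general requires numerical iteration.
Once the estimate of parameter $\beta$ is obtained, the baseline survival function can be estimated.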
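The nonparametric method is a method of estimating the baseline survival function, i.e., the estimation expression is:

$$\hat{h}_0(t_{(i)}) = \frac{d_i}{\sum_{j \in R(t_{(i)})} \exp\left(\hat{\beta}^{\top} X_j\right)}$$

Where $d_i$ is the number of events observed at time $t_{(i)}$ and $R(t_{(i)})$ is the risk set at that time.
This method defines the baseline cumulative hazard rate function as:

$$\hat{H}_0(t) = \sum_{i:\, t_{(i)} \le t} \hat{h}_0(t_{(i)})$$

In this way, the baseline survival function is obtained:

$$\hat{S}_0(t) = \exp\left(-\hat{H}_0(t)\right)$$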
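The estimation steps above can be sketched in plain Python. The code is a minimal illustration under explicit assumptions and is not the paper's implementation: it assumes no tied event times in the partial likelihood, it uses a Breslow-type estimator for the baseline cumulative hazard, and the four sample records, the coefficient values, and all function names are hypothetical. It evaluates the partial log-likelihood at a given coefficient vector but does not perform the numerical maximization.

```python
import math


def linear_predictor(beta, x):
    """beta . x for one individual."""
    return sum(b * v for b, v in zip(beta, x))


def partial_log_likelihood(beta, durations, events, covariates):
    """Logarithm of the Cox partial likelihood, assuming no tied event times."""
    total = 0.0
    for i, (t_i, observed) in enumerate(zip(durations, events)):
        if not observed:
            continue  # censored individuals contribute only through risk sets
        risk_sum = sum(math.exp(linear_predictor(beta, covariates[j]))
                       for j, t_j in enumerate(durations) if t_j >= t_i)
        total += linear_predictor(beta, covariates[i]) - math.log(risk_sum)
    return total


def baseline_cumulative_hazard(beta, durations, events, covariates):
    """Breslow-type estimate: [(t, H0_hat(t))] at each distinct event time."""
    cumulative = 0.0
    steps = []
    for t in sorted({u for u, e in zip(durations, events) if e}):
        deaths = sum(1 for u, e in zip(durations, events) if e and u == t)
        risk_sum = sum(math.exp(linear_predictor(beta, covariates[j]))
                       for j, u in enumerate(durations) if u >= t)
        cumulative += deaths / risk_sum
        steps.append((t, cumulative))
    return steps


# Hypothetical data: four borrowers, one covariate, one censored record.
durations = [1, 2, 3, 4]
events = [True, True, False, True]
covariates = [[1.0], [0.0], [1.0], [0.0]]

ll_zero = partial_log_likelihood([0.0], durations, events, covariates)
ll_half = partial_log_likelihood([0.5], durations, events, covariates)
baseline = baseline_cumulative_hazard([0.0], durations, events, covariates)
baseline_survival = [(t, math.exp(-h)) for t, h in baseline]

print(ll_zero, ll_half)
print(baseline_survival)
```

In practice the coefficient vector would be chosen to maximize `partial_log_likelihood`, typically with an iterative optimizer, and the baseline functions would then be evaluated at that estimate.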
The Cox proportional risk model in this paper is compared with risk prediction models such as RandomForest, XGBoost, and LightGBM to analyze the prediction effect of each model.
Several algorithms are trained on the training set (Train) using the above parameters, and then the model results are evaluated using the test set (Test). A comparison of ROC curves of several algorithms is shown in Figure 1.

Figure 1. ROC curves (AUC values) of several algorithms
In Figure 1, the best result is obtained by the Cox proportional risk model proposed in this paper, whose AUC value reaches 0.7295. Compared with the RF model, the worst in this group of experiments with an AUC value of 0.6978, the Cox model improves the AUC value by 3.17 percentage points on the same dataset with the same features.
A comparison of the KS curves and KS values of several algorithms is shown in Fig. 2, where panels (a)~(d) show the KS curves and KS values of the RF, XGB, LGB, and Cox models, respectively, together with the optimal dividing line of the sample at which the KS value reaches its maximum. The cut-off point at which the maximum KS value is obtained differs across modeling algorithms. For the RF model, the model discriminates most clearly at the samples ranked in the top 36.5% by predicted probability, where its KS value is at its maximum. Among the four algorithms, the Cox algorithm proposed in this paper has the largest KS value and the best prediction result. Its maximum KS value (0.2846) is obtained at the samples ranked in the top 27.53% by predicted probability.

Figure 2. KS curves (KS values) of several algorithms
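For readers unfamiliar with the two metrics, the following Python sketch shows one common way to compute them from predicted probabilities. It is an illustration only: the eight labels and scores are hypothetical, the function names are not from the paper, and tied scores are assumed not to occur. The ROC curve is traced by ranking samples from the highest to the lowest predicted probability, the AUC is the area under that curve by the trapezoidal rule, and the KS value is the largest gap between the cumulative true positive rate and the cumulative false positive rate.

```python
def roc_points(labels, scores):
    """Cumulative (FPR, TPR) points after ranking by descending score.

    Assumes both classes are present and no two scores are tied.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, y in ranked:
        if y == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def ks_statistic(points):
    """Largest gap between cumulative TPR and cumulative FPR."""
    return max(tpr - fpr for fpr, tpr in points)


# Hypothetical labels (1 = overdue) and predicted probabilities.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

points = roc_points(labels, scores)
print(round(auc(points), 4), round(ks_statistic(points), 4))
```

The position along the ranking at which the gap is largest corresponds to the cut-off points discussed for each model above.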
In the final model results, the probabilities are transformed into scores through a linear mapping function. Note that the larger the probability value, the lower the score and the higher the likelihood of a positive sample (a bad customer in risk control). Conversely, the smaller the probability value, the higher the score and the higher the likelihood of a negative sample (a good customer in risk control). The scores are binned to compare the distribution of good and bad customers across models. The distribution of scoring results of each model is shown in Fig. 3, where panels (a)~(d) show the distribution of prediction results of the RF, XGBoost, LightGBM, and Cox models, respectively.
Under the same feature conditions, the Cox algorithm proposed in this paper captures more positive samples (label=1). In the low-score range of 300-500, the results of the RF, XGBoost, and LightGBM models contain almost no positive samples, so these models do not effectively catch positive samples and pass positive samples that should have been rejected in the low-score range. In risk control, a bad customer often causes asset losses, which is extremely unfavorable. The Cox algorithm proposed in this paper effectively captures a certain proportion of positive samples in the low-score segment, which reflects the improvement in model effectiveness.
In short, the results once again verify the effectiveness of the proposed method, in terms of both the algorithm and the innovation in feature construction.

Figure 3. Precision-recall curve of several algorithms
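The probability-to-score transformation and the binning step can be sketched as follows. The paper does not give the mapping, so everything specific here is an assumption made for illustration: the score range of 300 to 850, the bin width of 100, the function names, and the four sample borrowers. The sketch only shows the general shape of a linear mapping in which a higher predicted probability of being overdue gives a lower score.

```python
def probability_to_score(p, low=300.0, high=850.0):
    """Linear mapping: p = 0 gives the highest score, p = 1 the lowest.

    The range [low, high] is an assumed scale, not taken from the paper.
    """
    return high - (high - low) * p


def bin_scores(scores, labels, width=100, low=300):
    """Count (good, bad) customers per score bin, keyed by the bin's lower edge."""
    bins = {}
    for s, y in zip(scores, labels):
        start = low + int((s - low) // width) * width
        good, bad = bins.get(start, (0, 0))
        bins[start] = (good + (1 if y == 0 else 0), bad + (1 if y == 1 else 0))
    return dict(sorted(bins.items()))


# Hypothetical predicted probabilities and labels (1 = bad customer).
probabilities = [0.9, 0.8, 0.2, 0.1]
labels = [1, 1, 0, 0]

scores = [probability_to_score(p) for p in probabilities]
distribution = bin_scores(scores, labels)
print(scores)
print(distribution)
```

A model that separates good and bad customers well concentrates the bad customers in the lowest bins, which is the pattern the comparison above looks for.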
The paper adopts the Boruta feature selection method to screen the key influencing features, and 8 features remain after screening: interest rate, the maximum credit limit of valid RMB credit cards, the number of times a single credit card has been overdue for M1 and above in the past 24 months, the total amount of mortgage repayment due in the current month, the number of credit card inquiries in the past 6 months, whether the repayment is overdue for the first time, the amount of the first overdue payment, and the current credit limit. These eight features are numbered X1~X8, where X1 belongs to the macro environment, X2~X5 belong to pre-credit credit information, and X6~X8 belong to post-credit lending behavior information.
Then, the Pearson correlation coefficients of the screened features are calculated, as shown in Table 1. The correlations between the features are low (the largest absolute value is 0.452, between X2 and X3), so all of them can be used as feature variables in the subsequent regression analysis.
Table 1. Correlation coefficients of the feature variables
| | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 |
|---|---|---|---|---|---|---|---|---|
| X1 | 1 | -0.055 | -0.042 | 0.018 | -0.042 | 0.035 | -0.088 | -0.065 |
| X2 | -0.055 | 1 | 0.452 | 0.033 | 0.105 | -0.052 | 0.164 | 0.273 |
| X3 | -0.042 | 0.452 | 1 | 0.138 | 0.046 | -0.037 | 0.153 | 0.242 |
| X4 | 0.018 | 0.033 | 0.138 | 1 | -0.114 | -0.009 | 0.033 | 0.092 |
| X5 | -0.042 | 0.105 | 0.046 | -0.114 | 1 | 0.010 | 0.003 | -0.094 |
| X6 | 0.035 | -0.052 | -0.037 | -0.009 | 0.010 | 1 | 0.221 | -0.119 |
| X7 | -0.088 | 0.164 | 0.153 | 0.033 | 0.003 | 0.221 | 1 | 0.379 |
| X8 | -0.065 | 0.273 | 0.242 | 0.092 | -0.094 | -0.119 | 0.379 | 1 |
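The screening logic behind the correlation table can be sketched in a few lines of Python. The `pearson` function shows how a single coefficient is computed from two feature columns, and the second function scans a correlation matrix for its strongest off-diagonal pair. The function names are illustrative, the two short columns passed to `pearson` are hypothetical, and the matrix values are copied from the correlation table above.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length columns with non-zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def strongest_pair(names, matrix):
    """Return (name_i, name_j, r) for the largest |r| above the diagonal."""
    best = None
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = matrix[i][j]
            if best is None or abs(r) > abs(best[2]):
                best = (names[i], names[j], r)
    return best


names = ["X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8"]
# Values copied from the correlation table in the text.
table = [
    [1, -0.055, -0.042, 0.018, -0.042, 0.035, -0.088, -0.065],
    [-0.055, 1, 0.452, 0.033, 0.105, -0.052, 0.164, 0.273],
    [-0.042, 0.452, 1, 0.138, 0.046, -0.037, 0.153, 0.242],
    [0.018, 0.033, 0.138, 1, -0.114, -0.009, 0.033, 0.092],
    [-0.042, 0.105, 0.046, -0.114, 1, 0.010, 0.003, -0.094],
    [0.035, -0.052, -0.037, -0.009, 0.010, 1, 0.221, -0.119],
    [-0.088, 0.164, 0.153, 0.033, 0.003, 0.221, 1, 0.379],
    [-0.065, 0.273, 0.242, 0.092, -0.094, -0.119, 0.379, 1],
]

print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly linear hypothetical columns
print(strongest_pair(names, table))
```

Checking the strongest pair against a chosen threshold is a simple way to confirm that no two features are close to collinear before they enter the regression.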
The results of the PH assumption test are shown in Table 2. A variable satisfies the PH assumption when P>=0.05. In the test results, the P values of all eight covariates are greater than 0.05 (the smallest is 0.21, for X1), so the null hypothesis is not rejected, i.e., the model meets the conditions of the PH assumption, and the eight covariates have a relatively small correlation with time t.
Table 2. PH test results for the covariates
| Code | Covariate | P |
|---|---|---|
| X1 | Interest rate | 0.21 |
| X2 | Highest valid credits | 0.88 |
| X3 | M1 and above overdue frequency | 0.32 |
| X4 | Housing loan repay the amount of this month | 0.59 |
| X5 | Credit query number in 6 months | 0.23 |
| X6 | First overdue | 0.27 |
| X7 | First overdue amount | 0.24 |
| X8 | Current credit | 0.69 |
Based on the above eight covariates, the following proportional risk model is established, with the significance level for variables entering the model set to 0.05. The model parameters and significance test results are shown in Table 3. Then, according to the Cox proportional risk model expression and the coefficients in Table 3, the survival function of the Cox model of default risk after the first borrowing of personal loans can be obtained:

$$S(t \mid X) = S_0(t)^{\exp\left(0.05X_1 - 0.08X_2 + 0.14X_3 + 0 \cdot X_4 + 0.08X_5 + 0.42X_6 + 0.03X_7 - 0.78X_8\right)}$$

where $S_0(t)$ is the baseline survival function.
Table 3. Model parameters and significance test results
| Code | Covariate | coef | exp(coef) | se(coef) | P |
|---|---|---|---|---|---|
| X1 | Interest rate | 0.05 | 1.16 | 0.05 | <0.005 |
| X2 | Highest valid credits | -0.08 | 0.97 | 0.07 | 0.015 |
| X3 | M1 and above overdue frequency | 0.14 | 1.18 | 0.06 | <0.005 |
| X4 | Housing loan repay the amount of this month | 0 | 1 | 0.04 | 0.088 |
| X5 | Credit query number in 6 months | 0.08 | 1.09 | 0.03 | <0.005 |
| X6 | First overdue | 0.42 | 1.52 | 0.03 | <0.005 |
| X7 | First overdue amount | 0.03 | 1.04 | 0.04 | <0.005 |
| X8 | Current credit | -0.78 | 0.55 | 0.09 | <0.005 |
The paper used the likelihood ratio test (LRT) for the overall significance of the model. The result, p<0.05, indicates that the model as a whole is significant, i.e., at least one covariate has a non-zero coefficient. The housing loan repayment amount for the current month (X4) is not significant at the 0.05 level (P = 0.088, coef = 0); the remaining seven covariates are interpreted as follows.
1) Interest rate (X1): the regression coefficient coef = 0.05 > 0 indicates that the interest rate is a risk factor, and exp(coef) = 1.16 > 1 indicates that for every one-unit increase in the interest rate, the degree of risk increases to 1.16 times the original. Generally speaking, the higher the borrowing interest rate, the higher the loan cost and the repayment pressure, and thus the higher the default risk.
2) Maximum credit limit of a valid RMB credit card (X2): coef = -0.08 < 0 indicates that this variable is a protective factor, and exp(coef) = 0.97 < 1 indicates that for every one-unit increase in the variable, the degree of risk is reduced to 0.97 times the original. The higher the borrower's maximum credit limit on a valid RMB credit card, the better the borrower's credit status and the lower the default risk. Generally speaking, a higher credit limit reflects higher creditworthiness as recognized by the bank and a higher repayment ability.
3) Number of times a single credit card has been overdue for M1 and above in the past 24 months (X3): coef = 0.14 > 0 indicates that this overdue count is a risk factor, and exp(coef) = 1.18 > 1 indicates that for every one-unit increase in the variable, the degree of risk increases to 1.18 times the original. The more often a single credit card has been overdue for M1 and above in the past 24 months, the higher the default risk.
4) Number of credit inquiries within 6 months (X5): coef = 0.08 > 0 indicates that the number of inquiries within 6 months is a risk factor, and exp(coef) = 1.09 > 1 indicates that the degree of risk increases to 1.09 times the original for every additional inquiry. The greater the number of inquiries within 6 months, the greater the default risk.
5) Whether the first repayment is late (X6): coef = 0.42 > 0 indicates that a late first repayment is a risk factor, and exp(coef) = 1.52 > 1 indicates that when the first repayment is late, the degree of risk increases to 1.52 times the original, that is, the default risk increases.
6) First overdue amount (X7): coef = 0.03 > 0 indicates that the first overdue amount is a risk factor, and exp(coef) = 1.04 > 1 indicates that for every one-unit increase in the first overdue amount, the degree of risk increases to 1.04 times the original. The higher the first overdue amount, the higher the default risk.
7) Current credit limit (X8): coef = -0.78 < 0 indicates that the current credit limit is a protective factor, and exp(coef) = 0.55 < 1 indicates that for every one-unit increase in the current credit limit, the degree of risk is reduced to 0.55 times the original. The higher the borrower's current credit limit, the better the borrower's credit condition and the lower the default risk.
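The interpretations above all rest on the same relationship: in a Cox model the hazard ratio for a one-unit increase in a covariate is exp(coef), and for an increase of k units it is exp(k · coef). A minimal sketch follows; the function name `hazard_ratio` is illustrative, and the coefficients are the rounded values from Table 3, so the recomputed ratios need not match the table's exp(coef) column exactly.

```python
import math


def hazard_ratio(coef, delta=1.0):
    """Multiplicative change in hazard for a `delta`-unit increase in a covariate."""
    return math.exp(coef * delta)


# Rounded coefficients of the seven interpreted covariates, from Table 3.
COEFS = {
    "X1": 0.05, "X2": -0.08, "X3": 0.14, "X5": 0.08,
    "X6": 0.42, "X7": 0.03, "X8": -0.78,
}

for code, coef in COEFS.items():
    kind = "risk factor" if coef > 0 else "protective factor"
    print(code, kind, round(hazard_ratio(coef), 2))
```

A ratio above 1 marks a risk factor and a ratio below 1 a protective factor, which is the reading applied to each covariate in the list.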
The fitted Cox survival function is then used for prediction. Because the paper takes the write-off of a repayment that is more than 180 days overdue as the sign of default, and there is a legitimate repayment period of 30 days after borrowing, a borrower cannot be written off within 210 days of the first borrowing. The prediction therefore targets defaults that occur more than 210 days after the first borrowing but within 1 year, and the result is shown in Fig. 4. The horizontal axis indicates time (in days) and the vertical axis the survival probability of the borrower after the first borrowing. A borrower is considered to be in default when this value falls below 0.5.
As can be seen in Figure 4, borrowers 3 and 9 are predicted to default more than 210 days after the first borrowing and within 1 year, on day 286 and day 357, respectively. The other borrowers are not predicted to default during that performance period.

Cox proportional hazards model survival time prediction result
In actuality, borrowers 3 and 9 did default, surviving for 286 and 357 days, respectively, so their predicted default times were neither early nor late. The other eight borrowers were predicted not to default. The Cox proportional risk model performed well in its predictions.
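The prediction rule used in this section (a default is predicted on the first day between day 210 and day 365 on which the survival probability falls below 0.5) can be sketched as follows. This is a sketch under stated assumptions, not the paper's implementation: the baseline survival curve, the two example borrowers, and the function names are hypothetical, and only the two coefficients are taken from Table 3.

```python
import math


def survival_curve(baseline_survival, coefs, covariates):
    """Cox survival S(t|x) = S0(t) ** exp(sum(beta_i * x_i)) for each day t."""
    risk = math.exp(sum(coefs[k] * covariates[k] for k in coefs))
    return {day: s0 ** risk for day, s0 in baseline_survival.items()}


def predicted_default_day(curve, start=210, end=365, threshold=0.5):
    """First day in [start, end] with survival below `threshold`, else None."""
    for day in sorted(curve):
        if start <= day <= end and curve[day] < threshold:
            return day
    return None


# Hypothetical baseline with a constant daily hazard of 0.0015 (an assumption;
# the paper's estimated baseline is not given in the text).
baseline = {day: math.exp(-0.0015 * day) for day in range(366)}

# Two coefficients from Table 3; the covariate values below are made up.
coefs = {"X6": 0.42, "X8": -0.78}
late_first_repayment = {"X6": 1, "X8": 0}
on_time_with_credit = {"X6": 0, "X8": 1}

print(predicted_default_day(survival_curve(baseline, coefs, late_first_repayment)))
print(predicted_default_day(survival_curve(baseline, coefs, on_time_with_credit)))
```

Under this hypothetical baseline the first borrower crosses the 0.5 threshold inside the window, while the second never does and is therefore not predicted to default.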
This paper reviews consumer credit and its risks in the context of financial technology and adopts the Cox proportional risk model to predict consumer credit risk dynamically. The Cox model is compared with other models in terms of prediction performance, and its validity is further verified through empirical research.
The AUC value of the Cox proportional risk model is 0.7295, and the KS statistic reaches its maximum value (0.2846) within the top 27.53% of the samples ranked by predicted probability. The Cox proportional risk model has the best predictive performance among all the models compared.
The regression coefficients of the interest rate, the maximum credit limit of a valid RMB credit card, the number of times a single credit card was overdue for M1 and above in the last 24 months, the number of credit inquiries in 6 months, whether the first repayment is overdue, the first overdue amount, and the current credit limit are 0.05, -0.08, 0.14, 0.08, 0.42, 0.03, and -0.78, respectively.
Among the 10 borrowers, the Cox model predicts that borrowers #3 and #9 will default, on day 286 and day 357, respectively, while the other borrowers are predicted not to default during that performance period. The prediction matches the actual situation, and the empirical results of the Cox model are good.
This research was supported by the provincial soft science key project of Hunan Science and Technology Department: Science and Technology Financial Innovation supporting the Development of Hunan Science and Medium-sized Enterprises (No. 2013ZK2024).
