Open access

Intelligent decision support systems in information systems: integrated learning algorithms and applications

17 March 2025

Introduction

An intelligent decision support system (IDSS) is an information system based on technological tools and data analysis that is designed to help managers and decision makers make effective decisions. By using advanced algorithms and models, an IDSS helps users collect, analyze and interpret data and provides a basis and suggestions for decisions, thereby improving the accuracy and efficiency of decision-making [1-4]. It integrates and analyzes large amounts of internal and external data, and its features include data integration, data analysis, decision suggestions and real-time updates. IDSSs have been applied in finance, healthcare, logistics management and marketing [5-8], and integrated learning algorithms (also known as ensemble learning) play an important role in optimizing them.

With the development of Internet technology, the demands on data processing have risen, and how to classify, recognize and predict massive data quickly and accurately has become a focus of attention. Integrated learning algorithms are one solution to this problem [9-12]. They fall mainly into two categories: Bagging and Boosting. In Bagging, the weak learners are independent of one another: each weak learner samples the training set with replacement and trains on its own sample, and the results of all the weak learners are finally combined by voting [13-14]. In Boosting, the weak learners are trained in sequence: the input to each weak learner is a weighted sample based on the errors of the previous learner, so accuracy improves over multiple iterations. Integrated learning algorithms are widely used for regression, classification and other problems [15-18].

This paper first introduces the concepts and development of the traditional integrated learning models Bagging, Boosting and Stacking, together with the specific applications derived from them. Building on these traditional algorithms and on deep integrated learning ideas, it then uses integrated learning predictive analytics, artificial intelligence, recommender systems and other technologies to build a financial risk early warning decision support system. The system uses the results of risk prediction to provide effective decision support and to help decision makers make better strategic decisions. Finally, the system's response time and the accuracy and effectiveness of its financial risk prediction are verified through comparative experiments and analysis.

Risk prediction modeling based on integrated learning algorithms
Related theories and algorithms

An integration algorithm combines several good algorithms into a stronger one. Because of this power and versatility, integration algorithms have performed very well in major competitions and have gradually attracted attention, and some researchers have used them to build prediction models for credit card delinquency. In a sense, an integration algorithm is less a machine learning algorithm in its own right than an optimization strategy: it builds a number of simple, weak machine learning algorithms and combines them to make more effective decisions and complete the learning task.

Integrated learning is one of the important research topics in artificial intelligence. It generates multiple base learners and combines them according to certain combining rules so that the combined model outperforms any single base learner. For integration to work well, the base learners need to be both accurate and diverse. A common form of integrated learning trains homogeneous algorithms on the same training set to reach the desired training effect. In a risk warning intelligent decision support system, the system generally uses a single rule to combine the classifications produced by the various models. There are three main integration techniques: the Bagging method, the Boosting method and the Stacking method.

1) The Bagging method is mainly concerned with reducing variance, that is, the error caused by random fluctuations in the data, which also helps to avoid overfitting. Random Forest is a typical representative of the Bagging method. During training there is no dependence between the individual learners, so they can be trained in parallel, which speeds up training and saves time.

2) The Boosting method focuses more on reducing bias. During training, each model depends on the results of the previous model, and samples that were predicted incorrectly in the previous round are given more weight. Typical representatives are the AdaBoost algorithm and the Boosting Tree algorithm. Within the Boosting Tree family, the most representative algorithm is the Gradient Boosting Decision Tree (GBDT), of which Extreme Gradient Boosting (XGBoost) is a widely used variant.

3) The Stacking method focuses on improving prediction. The primary learners are first trained on the original dataset, and a new dataset is then generated and used to train a secondary learner. In the new dataset, the outputs of the primary learners on the original dataset appear as features, and the labels are the same as those of the original dataset.
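The Boosting idea in item 2, reweighting the samples that the previous weak learner got wrong, can be sketched with a small AdaBoost loop over one-feature decision stumps. This is a minimal illustration and not the implementation used in this paper: the function names, the stump form and the ten-point toy dataset are all assumptions made for the example.

```python
import math


def stump_predict(threshold, polarity, x):
    """One-feature decision stump: returns +1 or -1."""
    return polarity if x >= threshold else -polarity


def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(threshold, polarity, x) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` stumps in sequence; returns [(alpha, threshold, polarity)]."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote of this weak learner
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples (y * h(x) = -1) gain weight, the rest lose weight.
        weights = [w * math.exp(-alpha * y * stump_predict(threshold, polarity, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(a * stump_predict(t, p, x) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data (illustrative only): no single stump separates these labels.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single = adaboost(xs, ys, rounds=1)
boosted = adaboost(xs, ys, rounds=3)
errors_single = sum(predict(single, x) != y for x, y in zip(xs, ys))
errors_boosted = sum(predict(boosted, x) != y for x, y in zip(xs, ys))
print(errors_single, errors_boosted)
```

On this toy set a single stump misclassifies three points, while three reweighted rounds classify all ten correctly, which is the bias reduction the Boosting description refers to.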

Traditional Integrated Learning and Applications

Integrated learning is a common means of improving model performance. It can be explained from many perspectives, and it is a typical field in which the development of theory has been driven by practical application. Classical integrated learning theory explains the approach in terms of error optimization, that is, which part of the error each integration idea optimizes: the bias and variance components of the classical error decomposition can each be reduced by integrated learning.

On error: error is the difference between a measured value and a reference value. In machine learning, the results predicted by a model inevitably deviate from the actual results, and this deviation is called the error. The error is generally divided into three parts: irreducible (inherent) error, bias, and variance. Mathematically, the error is represented by the following equation:

$$\mathrm{Err}(x_0) = E\left[\left(Y - \hat{f}(x_0)\right)^2 \mid X = x_0\right] = \sigma_\varepsilon^2 + \left[E[\hat{f}(x_0)] - f(x_0)\right]^2 + E\left[\left(\hat{f}(x_0) - E[\hat{f}(x_0)]\right)^2\right] = \sigma_\varepsilon^2 + \mathrm{Bias}^2\left(\hat{f}(x_0)\right) + \mathrm{Var}\left(\hat{f}(x_0)\right) = \text{Irreducible Error} + \text{Bias}^2 + \text{Variance}$$

where $x_0$ represents the sample input, $\hat{f}(x_0)$ denotes the prediction of the model obtained from the training set, $f(x_0)$ represents the true underlying model, and $\sigma_\varepsilon^2$ denotes the variance of the noise in $Y$, which no model can remove.
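The decomposition can be checked numerically on a small discrete case. The sketch below assumes $Y = f(x_0) + \varepsilon$ with zero-mean noise that is independent of the fitted model; the numbers (`f_true`, the noise values, the possible estimates) are hypothetical and chosen only so that the arithmetic is exact.

```python
from itertools import product

# Hypothetical setting at one fixed input x0 (illustrative values only).
f_true = 2.0                                         # f(x0)
noise = [(-1.0, 0.5), (1.0, 0.5)]                    # (epsilon, probability), mean 0
estimates = [(1.5, 0.25), (2.5, 0.5), (3.0, 0.25)]   # (f_hat(x0), probability)


def expectation(pairs):
    """Expected value of a discrete distribution given as (value, prob) pairs."""
    return sum(value * prob for value, prob in pairs)


# Left-hand side: E[(Y - f_hat(x0))^2], enumerating every (noise, estimate) pair.
err = sum(p_eps * p_fh * (f_true + eps - fh) ** 2
          for (eps, p_eps), (fh, p_fh) in product(noise, estimates))

# Right-hand side: irreducible error + squared bias + variance.
mean_fhat = expectation(estimates)
irreducible = expectation([(eps ** 2, p) for eps, p in noise])
bias_sq = (mean_fhat - f_true) ** 2
variance = expectation([((fh - mean_fhat) ** 2, p) for fh, p in estimates])

print(err, irreducible + bias_sq + variance)
```

Both sides evaluate to the same number, with the irreducible term fixed by the noise alone and the other two terms depending only on how the fitted model varies across training sets.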

Bagging and Boosting methods in integrated learning methods are essentially optimized for the variance and bias of the error as described above, and we will mainly focus on these two methods in the next section for a detailed introduction.

Bagging is an integration method that draws random subsets of the samples, trains a model on each subset separately, and combines the models to make a joint decision. Mathematically, Bagging is represented by the following equation:

$$f_{\mathrm{bag}}(x) = \frac{1}{M}\sum_{m=1}^{M}\hat{f}_m(x)$$

where $M$ denotes the number of models and each $\hat{f}_m$ is trained on data drawn from the original dataset with replacement. If the outputs $x_1, \dots, x_M$ of the individual models are identically distributed with variance $\sigma^2$ and pairwise correlation $\rho$, the variance of their mean is:

$$\mathrm{Var}\left(\frac{1}{M}\sum_{m=1}^{M} x_m\right) = \rho\sigma^2 + \frac{1-\rho}{M}\sigma^2$$

As the number of individual learners $M$ increases, the second term converges to 0, so the variance of the overall decision decreases toward $\rho\sigma^2$.
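The variance formula can be verified directly from the covariance matrix of $M$ equally correlated learner outputs. The sketch below is illustrative: `sigma2`, `rho` and the values of `m` are arbitrary assumptions, and the function names are not from the paper.

```python
def variance_of_mean(sigma2, rho, m):
    """Variance of the mean of m outputs, summed from the full covariance matrix."""
    cov = [[sigma2 if i == j else rho * sigma2 for j in range(m)]
           for i in range(m)]
    return sum(sum(row) for row in cov) / m ** 2


def closed_form(sigma2, rho, m):
    """rho * sigma^2 + (1 - rho) / m * sigma^2."""
    return rho * sigma2 + (1 - rho) / m * sigma2


sigma2, rho = 4.0, 0.3   # illustrative values
for m in (1, 5, 50, 500):
    print(m, variance_of_mean(sigma2, rho, m), closed_form(sigma2, rho, m))
```

The two columns agree for every `m`, and the variance falls from `sigma2` toward the floor `rho * sigma2`. That floor is why Bagging benefits from diverse, weakly correlated base learners.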

Deep Integrated Learning Ideas

Integrated learning ideas are now being introduced into deep learning frameworks to improve the generalization ability of models. There are typically two types of methods: implicit deep integration and explicit deep integration.

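Implicit deep integration: implicit deep integration usually refers to an integration strategy that introduces stochasticity inside the deep framework to increase the diversity of the network during training, thus improving generalization ability. A typical example is Dropout, which randomly sets the outputs of a layer to 0. We express it with the formula:

$$y = m \odot \alpha(Wx)$$

where $x$ denotes the input, $W$ the weight parameters, $\alpha$ the activation function, and $m$ a binary mask vector whose elements are drawn from a Bernoulli distribution.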

DropConnect does not randomly set the outputs to 0. Instead, it randomly sets to 0 the weight values $W$ between nodes. We represent the process by the following equation:

$$y = \alpha\left((M \odot W)x\right)$$

where $x$ again denotes the input, $W$ denotes the weight parameters, and $M$ is a binary mask matrix whose elements are drawn from a Bernoulli distribution with parameter $p$. Randomly zeroing the weight parameters in this way diversifies the training of the model. From the perspective of integrated learning, DropConnect can theoretically generate up to $2^{|M|}$ random network structures, while Dropout can reach $2^{|m|}$ structures. Because the matrix $M$ has more elements than the vector $m$, DropConnect can generate more structures, and both methods give the fully-connected layer better generalization performance.
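A minimal sketch of one fully-connected layer makes the difference between the two masking schemes concrete. It assumes a ReLU activation for $\alpha$ and treats `p` as the probability that a mask element equals 1; the weights, the input and the helper names are hypothetical.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def dropout_forward(W, x, p, rng):
    """y = m * alpha(Wx): one Bernoulli(p) mask element per output unit."""
    m = [1 if rng.random() < p else 0 for _ in W]
    return [mi * a for mi, a in zip(m, relu(matvec(W, x)))]


def dropconnect_forward(W, x, p, rng):
    """y = alpha((M * W)x): one Bernoulli(p) mask element per weight."""
    masked = [[w if rng.random() < p else 0.0 for w in row] for row in W]
    return relu(matvec(masked, x))


# Hypothetical 3x2 layer and input.
W = [[1.0, -2.0], [0.5, 0.5], [-1.0, 1.0]]
x = [1.0, 2.0]
rng = random.Random(0)

print(dropout_forward(W, x, 0.5, rng))
print(dropconnect_forward(W, x, 0.5, rng))

# Number of distinct sub-networks each scheme can sample for this layer.
n_dropout_masks = 2 ** len(W)                          # 2^|m|
n_dropconnect_masks = 2 ** sum(len(row) for row in W)  # 2^|M|
print(n_dropout_masks, n_dropconnect_masks)
```

For this 3x2 layer the mask vector has 3 elements and the mask matrix has 6, so DropConnect can sample 64 structures against 8 for Dropout.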

Explicit deep integration: there are two classes of ideas. The first extends classical base-model algorithms by applying traditional integration learning ideas, typically the voting method and the averaging method, to deep learning base models. The weights of the models can also be learned, as in the Super Learner algorithm: the predicted probability distributions output by the multiple networks are fed into a 1 × 1 convolutional kernel, which learns the weights and outputs the final result. For the second class of ideas, we use Equation (6) to express the concept:

$$l = \alpha\, r(x, w) + l_{\mathrm{softmax}}(x, y) \tag{6}$$

Intelligent decision support system for risk early warning information
Concept of Intelligent Decision Support System

Intelligent decision support systems for information are built on the principles of data integration, knowledge representation, reasoning and user interaction. These systems use intelligent technologies such as integrated learning predictive analytics, natural language processing, machine vision and recommender systems to provide valuable insights and recommendations. Successful applications span healthcare, finance, supply chain management, marketing and other fields, demonstrating the versatility of these systems and their transformative impact on the decision-making process.

Intelligent decision support systems are designed to help decision makers improve the quality and efficiency of decision-making by providing data-driven insights, recommendations and analytical tools. Their fundamentals include data integration, knowledge representation, reasoning, and user interaction. First, data integration involves collecting and aggregating relevant data from a variety of sources, both internal and external to the organization, and then processing and cleansing the data and converting it into a structured format suitable for analysis. Second, knowledge representation involves encoding domain-specific knowledge and rules into the system. The knowledge can take the form of expert knowledge, historical data, or predefined decision models. Effective knowledge representation allows the system to understand and interpret the data in the context of the decision problem. Third, inference and reasoning mechanisms allow the system to draw conclusions and make predictions based on the available data and encoded knowledge. These mechanisms may involve pattern and correlation detection, statistical analysis, machine learning algorithms, and expert systems for identifying potential solutions. Finally, user interaction is an important aspect of system design. The system must provide decision makers with intuitive interfaces for interacting with the data, exploring scenarios, and receiving suggestions. User feedback and collaboration are critical to improving the decision-making process.

Architecture of an intelligent decision support system for risk early warning

With the continuous development of information technology, information collection technologies and communication networks now deliver massive amounts of information from multiple sources. To manage and use this multi-source, heterogeneous and massive information effectively, the information advantage must be turned into a decision-making advantage through real-time judgment, identification and integration. Acquiring financial risk early warning information is complicated, because it requires taking into account large amounts of information such as the operating status and development status of the enterprise, and this is the need that the intelligent decision-making system addresses. The main function of the system is to predict the probability of a company's future risks by applying the financial risk prediction model to the company's credit data.

The architecture of the intelligent decision support system (IDSS) is shown in Fig. 1. The IDSS organically combines a dialog component (the human-computer interaction system), a model component (the model library and its management system) and a data component (the database and its management system), which greatly extends the functions of both the database and the model library. In this way, the management information system is raised to the level of a decision support system that can provide decision support to managers, so that problems that previously could not be solved by computer gradually become solvable by computer.

Figure 1. General structure of the intelligent decision support system

Human-machine interface components

The human-computer interface is the interface between the decision support system and the user. Through it, the user controls the operation of the decision support system: the system asks the user to input control information and the data needed for calculation, and it displays the running process and the final results to the user. The human-computer dialog includes the following functions.

Data components

The database is the basis for the stable operation of the designed system, and the data warehouse stores the raw information related to financial forecasting, planning, decision-making and control. The data component includes the database and the database management system. The database stores large amounts of data, and typical data organization models include the network model, the hierarchical model and the relational model.

The database management system must support creating, deleting, modifying and maintaining databases, as well as storing, retrieving, sorting, indexing and computing statistics on data.

The database language system consists of two parts. The data definition language provides the means to define how data is organized in the database, such as data schemas and data dependency relationships. The data manipulation language provides the means to operate on the data in the database, including creating and maintaining the database and the data dictionary, and querying and retrieving data.

Model components

The model library is one of the core parts of the financial risk early warning decision support system: it collects all the financial risk early warning models and stores all the financial risk decisions. The model component consists of the model library and the model library management system. The model library stores models, which have their own characteristics and differ from data. A model is always represented as a computer program, and the model library records this physical form as the name of the model and its associated program, the classification of the model's function, the model's input and output data, its control parameters and other attributes. A model can be run in a specified way to perform input, output, calculation and other processing. The model library management system manages the model library. To match the static and dynamic characteristics of models, it has two kinds of functions: static management functions similar to those of a database management system, and management of the dynamic running of models.
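The split between static management and dynamic running can be sketched as a small registry. Everything here is hypothetical and only mirrors the attributes named above (name, program, functional classification, inputs, outputs, control parameters); `ModelEntry`, `ModelLibrary` and the `debt_ratio_warning` model are illustrative names, not components of the system described in this paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ModelEntry:
    """Static description of one model stored in the library."""
    name: str
    category: str                    # classification of the model's function
    program: Callable[..., float]    # the associated computer program
    inputs: List[str]                # names of the required input data
    outputs: List[str]               # names of the produced output data
    params: Dict[str, float] = field(default_factory=dict)  # control parameters


class ModelLibrary:
    def __init__(self) -> None:
        self._models: Dict[str, ModelEntry] = {}

    # Static management, similar to a database management system.
    def register(self, entry: ModelEntry) -> None:
        self._models[entry.name] = entry

    def remove(self, name: str) -> None:
        del self._models[name]

    def find(self, category: str) -> List[str]:
        return sorted(n for n, e in self._models.items() if e.category == category)

    # Dynamic management: run a stored model on supplied data.
    def run(self, name: str, **data: float) -> Dict[str, float]:
        entry = self._models[name]
        missing = [k for k in entry.inputs if k not in data]
        if missing:
            raise ValueError(f"missing inputs for {name}: {missing}")
        args = {k: data[k] for k in entry.inputs}
        return {entry.outputs[0]: entry.program(**args, **entry.params)}


def debt_ratio_warning(total_liabilities: float, total_assets: float,
                       threshold: float) -> float:
    """Toy early warning rule: flag 1.0 when the debt ratio exceeds the threshold."""
    return 1.0 if total_liabilities / total_assets > threshold else 0.0


library = ModelLibrary()
library.register(ModelEntry(
    name="debt_ratio_warning",
    category="early_warning",
    program=debt_ratio_warning,
    inputs=["total_liabilities", "total_assets"],
    outputs=["risk_flag"],
    params={"threshold": 0.7},
))
warning = library.run("debt_ratio_warning",
                      total_liabilities=80.0, total_assets=100.0)
print(library.find("early_warning"), warning)
```

Keeping the descriptive attributes separate from the callable lets the registry be queried and maintained like ordinary data, while `run` handles execution.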

Application of integrated learning algorithms in financial early warning decision-making systems
Analysis of experimental results
System response time

Obtaining financial risk early warning information is relatively complex and requires comprehensive consideration of large amounts of information, such as the enterprise's operating and development conditions. Assisted decision-making systems arose to meet this need and have greatly advanced research on financial risk early warning. Based on the experimental preparation described above, simulation experiments on assisted decision-making for financial risk early warning were carried out, with system response time and financial risk incidence rate used as the performance indicators. The experimental results are analyzed as follows. Ten financial risk scenarios were designed and numbered 1 to 10. The existing system and the designed system were each applied to issue risk warnings for these scenarios, and the time each system took to issue the financial risk warning, that is, the system response time, was recorded. The response time data obtained in the experiment are shown in Table 1. The data show that, across the different financial risk scenarios, the response time of the system designed in this paper stays within 10 s to 13 s, which is lower than that of the existing system and more stable.

Table 1. System response time data

| Financial risk scenario no. | Existing system (s) | System designed in this paper (s) |
| --- | --- | --- |
| 1 | 23.63 | 10.21 |
| 2 | 25.50 | 10.57 |
| 3 | 20.11 | 11.23 |
| 4 | 15.57 | 12.58 |
| 5 | 10.35 | 10.13 |
| 6 | 20.67 | 10.25 |
| 7 | 15.48 | 10.72 |
| 8 | 15.59 | 10.32 |
| 9 | 14.92 | 10.28 |
| 10 | 16.94 | 10.42 |
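The claims about level and stability can be read off the table with a few summary statistics. The sketch below uses only the ten response times reported above. The choice of mean, range and population standard deviation as summaries is an assumption of this example, not a procedure stated in the paper.

```python
from statistics import mean, pstdev

# Response times in seconds for scenarios 1-10, copied from the table above.
existing = [23.63, 25.50, 20.11, 15.57, 10.35, 20.67, 15.48, 15.59, 14.92, 16.94]
designed = [10.21, 10.57, 11.23, 12.58, 10.13, 10.25, 10.72, 10.32, 10.28, 10.42]

for label, times in (("existing", existing), ("designed", designed)):
    print(label,
          "mean=%.3f" % mean(times),
          "min=%.2f" % min(times),
          "max=%.2f" % max(times),
          "std=%.3f" % pstdev(times))

# The designed system is faster in every scenario, not only on average.
faster_everywhere = all(d < e for d, e in zip(designed, existing))
print(faster_everywhere)
```

A smaller standard deviation for the designed system is what supports the statement that its response time is more stable.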
Financial risk incidence analysis

A number of different financial risk scenarios were implemented for a particular enterprise. The existing system and the designed system were applied in turn, and the probability of the enterprise incurring financial risk, that is, the financial risk incidence rate, was observed. The incidence data obtained in the experiment are shown in Table 2. The data show that the incidence of financial risk is lower with the system designed in this paper (8.26%-10.23%) than with the existing system (35.24%-46.63%). These experimental results show that the designed system effectively shortens system response time and reduces the incidence of financial risk, which confirms its effectiveness.

Table 2. Financial risk incidence data

| Experiment no. | Existing system (%) | System designed in this paper (%) |
| --- | --- | --- |
| 1 | 45.36 | 10.23 |
| 2 | 35.24 | 9.56 |
| 3 | 40.17 | 8.26 |
| 4 | 45.36 | 8.93 |
| 5 | 46.63 | 9.16 |
Parameter description and experimental comparison of integrated learning methods
Data processing

This paper uses paired samples drawn across industries and across multiple cross-sections. The year in which a financial crisis sample firm is designated ST (special treatment) is taken as the standard year (t-0), and three initial datasets are constructed from the candidate financial indicator data of all sample firms in the three years prior to the standard year. The ST year of each financial crisis sample firm is obtained from the individual stock information section of Sina.com. If a firm has been designated ST more than once, the year of the first designation is used as the standard year. The financial indicator data of the sample companies are collected and organized from the China Stock Market Research Database System (CSMAR), which is jointly developed by Shenzhen Guotaian Information Technology Co. Ltd. and the China Center for Financial Research of the University of Hong Kong.

Since real-world data are often incomplete or contain noise (errors or deviations from expected values), data quality must be improved through preprocessing, which in turn improves the validity and generalizability of the financial crisis prediction model. This paper preprocesses the incompleteness and the noise in the three initial datasets separately, with the following methods and justifications. For incompleteness, rows with missing financial indicator data are deleted. Missing data in the initial datasets mainly arise because the database has no record for a given company in a given year, so that generally the whole row is missing. Ignoring such records directly is therefore an appropriate preprocessing method. Noise preprocessing mainly excludes abnormal data that deviate from the expected value. First, within each of the three initial datasets, the subset of financial crisis sample companies and the subset of financially normal sample companies are each filtered with the three-standard-deviation test, which excludes rows containing a financial indicator that deviates from the mean by more than three standard deviations. Then the five largest and the five smallest values of each candidate financial indicator are analyzed separately, and rows whose indicator values are extremely abnormal in order of magnitude are excluded.
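The first two preprocessing steps can be sketched in a few lines. The rows below are synthetic and the function names are hypothetical. The sketch applies the filter to a single subset, whereas the paper applies it separately to the crisis and normal subsets of each dataset. The manual review of the five largest and five smallest values is not reproduced here.

```python
from statistics import mean, stdev


def drop_incomplete(rows):
    """Delete every row that has at least one missing indicator value."""
    return [row for row in rows if all(v is not None for v in row)]


def three_sigma_filter(rows):
    """Keep rows whose indicators all lie within 3 standard deviations of the column mean."""
    columns = list(zip(*rows))
    stats = [(mean(col), stdev(col)) for col in columns]

    def within_limits(row):
        return all(s == 0 or abs(v - m) <= 3 * s
                   for v, (m, s) in zip(row, stats))

    return [row for row in rows if within_limits(row)]


# Synthetic data: 20 ordinary rows with two indicators each,
# one row with a missing value, and one row with an extreme first indicator.
rows = [(1.0 + 0.01 * i, 0.5 + 0.01 * i) for i in range(20)]
rows += [(None, 0.6), (50.0, 0.55)]

complete = drop_incomplete(rows)
cleaned = three_sigma_filter(complete)
print(len(rows), len(complete), len(cleaned))
```

The test needs a reasonably large subset to be effective: in a very small sample, a single extreme value inflates the standard deviation so much that it can never lie more than three standard deviations from the mean.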

The three datasets obtained after preprocessing are denoted as (t-2), (t-3), and (t-4), respectively.

Comparison results of integration methods

The deep integrated learning algorithm has three main adjustable parameters: the wealth ratio, the liquidity parameter, and the initial wealth value. The wealth ratio is the proportion of its wealth that each model pays to buy contracts. For example, if a model has an initial wealth value of 100 and a wealth ratio of 0.1, it pays 10 virtual coins to buy contracts each time it makes a prediction. The liquidity parameter is usually denoted by b. The larger b is, the more the market price will move. The initial wealth value is the wealth that a model holds at the start: all models begin with the same wealth, but models that predict accurately win more wealth and hold more afterwards. To choose appropriate parameters, this paper tries several sets of parameter values and observes the results. Because the wealth ratio takes values between 0 and 1, this paper picks three values for it, i.e., 0.2, 0.5 and 0.8. The liquidity parameter b takes the values 100, 200 and 400, and the initial wealth value takes the values 50, 100 and 150. Each of the three parameters thus takes three different values, giving 27 parameter combinations in total. Because there are many evaluation indicators, and the experiments show that a model with a better AUC also has correspondingly better results on the other four indicators, this section reports only the AUC. The results for all combinations are shown in Table 3. The experiments also show that the larger the value of b, the longer the model's runtime, and that larger values of b do not give the best results. Across all combinations, the deep integrated learning algorithm gives the best result when the wealth ratio is 0.2, b is 100, and the initial wealth value is 100.

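Table 3. AUC for wealth ratios of 0.2, 0.5 and 0.8

| Wealth ratio | b | Initial wealth 50 | Initial wealth 100 | Initial wealth 150 |
| --- | --- | --- | --- | --- |
| 0.2 | 100 | 0.8369 | 0.91125 | 0.8792 |
| 0.2 | 200 | 0.8156 | 0.8549 | 0.8479 |
| 0.2 | 400 | 0.7914 | 0.8379 | 0.7836 |
| 0.5 | 100 | 0.8075 | 0.7746 | 0.7993 |
| 0.5 | 200 | 0.7986 | 0.7520 | 0.8326 |
| 0.5 | 400 | 0.7825 | 0.8714 | 0.8086 |
| 0.8 | 100 | 0.8156 | 0.7596 | 0.8229 |
| 0.8 | 200 | 0.8479 | 0.8163 | 0.8074 |
| 0.8 | 400 | 0.8226 | 0.7785 | 0.8476 |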
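The parameter search amounts to a three-by-three-by-three grid. The sketch below enumerates the 27 combinations and looks up the best one, using the AUC values reported in the table rather than rerunning any model. The variable names are illustrative.

```python
from itertools import product

wealth_ratios = [0.2, 0.5, 0.8]
liquidity_b = [100, 200, 400]
initial_wealth = [50, 100, 150]

# AUC per (wealth ratio, b); each list follows the order of initial_wealth.
# The entry for (0.2, 400, 100) is read as 0.8379.
auc_table = {
    (0.2, 100): [0.8369, 0.91125, 0.8792],
    (0.2, 200): [0.8156, 0.8549, 0.8479],
    (0.2, 400): [0.7914, 0.8379, 0.7836],
    (0.5, 100): [0.8075, 0.7746, 0.7993],
    (0.5, 200): [0.7986, 0.7520, 0.8326],
    (0.5, 400): [0.7825, 0.8714, 0.8086],
    (0.8, 100): [0.8156, 0.7596, 0.8229],
    (0.8, 200): [0.8479, 0.8163, 0.8074],
    (0.8, 400): [0.8226, 0.7785, 0.8476],
}

grid = list(product(wealth_ratios, liquidity_b, initial_wealth))
results = {(ratio, b, w0): auc_table[(ratio, b)][initial_wealth.index(w0)]
           for ratio, b, w0 in grid}

best = max(results, key=results.get)
print(len(grid), best, results[best])
```

The lookup returns the combination of wealth ratio 0.2, b = 100 and initial wealth 100, matching the choice stated in the text.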

The parameters of the deep integration learning algorithm are those selected in the experiments of the previous section. The results obtained by the algorithm are shown in Table 4. They are consistent with the results obtained through the machine learning approach: the closer the data are to the ST year, the better the results.

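Table 4. Performance of the integration method at different time nodes

| Dataset | ACC | AUC | F1-score | Precision | Recall |
| --- | --- | --- | --- | --- | --- |
| T-2 | 0.9115 | 0.8802 | 0.8836 | 0.8815 | 0.8884 |
| T-3 | 0.9063 | 0.8536 | 0.8679 | 0.8537 | 0.8677 |
| T-4 | 0.8724 | 0.8476 | 0.8216 | 0.8479 | 0.8015 |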
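For reference, the five indicators in the table can be computed from labels, predicted labels and predicted scores as follows. The sketch uses the standard definitions, with AUC computed as the fraction of positive-negative pairs ranked correctly. The eight labels and scores are synthetic and unrelated to the paper's data.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1-score for binary labels (1 = crisis)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"acc": (tp + tn) / len(y_true),
            "precision": precision, "recall": recall, "f1": f1}


def auc_score(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic example: four crisis firms (1) and four normal firms (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # threshold 0.5 is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = auc_score(y_true, scores)
print(metrics, auc)
```

ACC, precision, recall and F1-score depend on the chosen decision threshold, while AUC depends only on how the scores rank the samples.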

To observe the results of all models at the same time, all the results are visualized and analyzed in this section. Because tabular data are not intuitive enough, line graphs are used to visualize the results of the different models. The performance of the deep integration learning algorithm on the three datasets from different years is shown in Fig. 2, together with the results of the other models for comparison. Comparing the three graphs shows that the results obtained by the deep integrated learning algorithm are relatively smooth, and its five indicators tend to fall within the same range of values: 0.875 to 0.95 for T-2, 0.85 to 0.95 for T-3, and 0.8 to 0.9 for T-4. The results obtained by the other models, by contrast, vary in size and are more scattered. In addition, the results obtained by the deep integrated learning algorithm are better than those of the other models in general. This does not mean that the algorithm performs best on every indicator, but that its combined performance across the five indicators is the best. The information market integration learning method selected in this paper is therefore a reasonable choice.

Figure 2. Model comparison

Conclusion

In this paper, we construct an intelligent decision support system for financial risk using artificial intelligence, recommender systems and other technologies, and use deep integration algorithms to predict the occurrence of financial risk. The effects of all models are compared on five evaluation indicators: ACC, AUC, F1-score, recall, and precision. The analysis shows that with the intelligent decision support system designed in this paper, the response times all fall within 10 s to 13 s, lower than those of the existing system. The system also reduces the incidence of financial risk, which indicates its reliability and effectiveness. The combined results of the deep integrated learning algorithm on the five indicators are better than those of the other models. The application of deep integrated learning algorithms in intelligent decision support systems for information is therefore a promising direction for future work.

Language:
English
Publication frequency:
1 time per year
Journal subjects:
Biological Sciences, Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics, Physics, other