Multidimensional Data Integration and Predictive Modeling of Political Stability under the Chinese Model
Published online: 17 March 2025
Received: 01 Nov. 2024
Accepted: 17 Feb. 2025
DOI: https://doi.org/10.2478/amns-2025-0337
© 2025 Zhennan Xiao, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Political legitimacy is a basic concept in political science and lies at the core of the establishment of a state [1]. Once class differentiation appears, with a ruling class and a ruled class, a state, and a political society, legitimacy becomes a problem that human societies must face in order to continue to develop. Its degree not only reflects the process of political democratization but is also a prerequisite for the continued stability of the political system [2-3]. At present, however, the relationship between political legitimacy and political stability in many countries remains complicated. Legitimacy acts like a double-edged sword: it can enhance the political stability of a society, but it may also bring instability [4-5]. The relationship between political legitimacy and political stability is therefore an important theoretical and practical issue [6].
Political stability refers to the solidity of the structure of the political system and the orderliness and continuity of its operation. Solidity means that the internal structure of the political system remains relatively stable and does not change arbitrarily; orderliness means that the internal elements of the system are arranged in a reasonable order; and continuity means that the functioning of the system is not impeded and the system operates normally [7-9]. The political stability of a state has three levels. The first is the stability of the regime system, which covers the stability of the political community, of the political system, and of the rulers: the stability of the political community requires the integrity and unity of the state and the continuity of state identity; the stability of the political system implies the continuity of the constitutional system, the basic political system, and its rules; and the stability of the rulers means that there is no extraordinary turnover of political leaders [10-12]. The second is a rational structure of state power, reflected in a power structure that is scientific, regular, legitimate, effective, and unified. The third is an orderly political process, meaning that political decision-making and implementation proceed in order [13-15]. These three levels are both interdependent and relatively independent. The stability of the regime system is the most fundamental: instability of the regime is absolute political instability, whereas instability at the other two levels is relative political instability. At present, China has entered the development stage of the “12th Five-Year Plan”, which is a critical period for building a moderately prosperous society in all respects and a period for deepening reform and opening up and accelerating the transformation of the mode of economic development [16-19].
Influenced by complex factors at home and abroad, China is experiencing profound social changes and social transformation, and new contradictions and problems are constantly emerging, which makes it particularly important to maintain a stable political environment and social order [20-21].
In this paper, multidimensional data related to political stability are integrated and preprocessed to construct a political stability prediction model. An algorithm combining an improved stacked sparse autoencoder (TS-SAE) with K-Means++ is proposed. The TS-SAE algorithm uses a mapping-value matching mechanism and the layer-by-layer greedy principle to de-duplicate and reduce the dimensionality of the multidimensional data related to political stability, and the TS-SAE-K-Means++ model is then constructed. After the multidimensional data have been integrated, a Bayesian network structure for political stability prediction is built, the parameters of the network are learned, and the computational steps and the number of nodes of the network are optimized to realize the prediction of political stability. The prediction analysis is carried out for China, which has developed rapidly under the Chinese model. The silhouette coefficient S and the F1 score are used as evaluation indexes to compare the effect of multidimensional data preprocessing, and the robustness of the prediction results is analyzed. Predictions are made from both the overall and the local perspective: for China as a whole and for individual Chinese cities.
Political stability reflects the orderliness and continuity of the political system, and predicting and assessing national political stability is crucial for understanding China’s political stability during the transition period. In this paper, a political stability prediction model is constructed, and the TS-SAE-K-Means++ multidimensional data clustering model is used to integrate and preprocess the multidimensional data related to political stability.
The sparse autoencoder (SAE) is an extension of the autoencoder. By adding a sparse regularization term to the hidden layer, it extracts the important features from high-dimensional datasets in a feature-selective manner, for the purpose of de-duplication and dimensionality reduction [22].
In this paper, for the average activation of the autoencoder hidden layer shown in Eq. (1), the KL divergence is constructed as the regularization constraint term of the SAE network; the KL divergence is detailed in Eq. (1):
The regular constraint term is used to penalize the hidden layer neuron
In order to prevent overfitting during the training process, this paper adds a weight decay term Δ to the SAE loss function to improve the training ability of the network:
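A minimal sketch of a loss of this kind, assuming the standard sparse-autoencoder form (mean squared reconstruction error, a KL-divergence sparsity penalty on the average hidden activation, and an L2 weight decay term). The names `rho`, `beta`, and `lam` and their default values are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def kl_sparsity_penalty(rho, rho_hat, eps=1e-12):
    """Sum over hidden units of KL(rho || rho_hat_j) between Bernoulli distributions."""
    rho_hat = np.clip(rho_hat, eps, 1.0 - eps)  # avoid log(0)
    return float(np.sum(
        rho * np.log(rho / rho_hat)
        + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat))
    ))

def sae_loss(x, x_rec, hidden, weights, rho=0.05, beta=3.0, lam=1e-4):
    """Reconstruction error + beta * KL sparsity penalty + weight decay (assumed form)."""
    recon = 0.5 * np.mean(np.sum((x - x_rec) ** 2, axis=1))
    rho_hat = hidden.mean(axis=0)  # average activation of each hidden unit
    sparsity = kl_sparsity_penalty(rho, rho_hat)
    decay = 0.5 * lam * sum(np.sum(w ** 2) for w in weights)
    return recon + beta * sparsity + decay
```

The penalty is zero when every hidden unit's average activation equals the target `rho` and grows as the activations move away from it, which is what pushes the hidden layer toward a sparse code.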
The stacked sparse autoencoder (S-SAE) is an extension of the sparse autoencoder in which the hidden-layer output of each SAE is used as the input of the next SAE; that is, an S-SAE is composed of several SAEs connected in series.
The TS-SAE proposed in this paper is a deep learning network with two hidden layers based on the stacked sparse autoencoder. It consists of two individual SAEs connected in series and contains four layers in total: a data input layer, a data dimensionality reduction layer, a data de-duplication layer, and a data output layer.
When processing multidimensional data
K-Means clustering algorithm has high research value, but the algorithm can not plan the number of clusters well when clustering multidimensional data
The steps of K-Means++ algorithm are as follows.
Step 1, one data from the multidimensional dataset
Step 2, for each data
Step 3, using the
Step 4, repeating step 2 with step 3 until
Step 5, using the above extracted clustering centers to perform multi-dimensional data clustering using the K-Means algorithm.
In the K-Means++ algorithm, step 3 is the core algorithm, and the principle of selecting new clustering centers is that data with larger
In sub-step 1, a data
Sub-step 2, randomly take a fall in the Sum_D data
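The center-selection procedure can be sketched as follows. This is a minimal illustration that assumes the standard K-Means++ seeding, in which each new center is drawn with probability proportional to the squared distance to the nearest existing center; the function name `kmeans_pp_centers` and the `seed` parameter are hypothetical, and `sum_d` plays the role of Sum_D in the sub-steps.

```python
import numpy as np

def kmeans_pp_centers(data, k, seed=0):
    """Pick k initial cluster centers from the rows of `data` (K-Means++ seeding)."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    centers = [data[rng.integers(n)]]  # step 1: first center chosen uniformly at random
    while len(centers) < k:
        # step 2: squared distance from every point to its nearest chosen center
        diffs = data[:, None, :] - np.asarray(centers)[None, :, :]
        d2 = np.min(np.sum(diffs ** 2, axis=2), axis=1)
        # step 3: roulette-wheel selection; draw a value in [0, sum_d) and
        # take the first point whose cumulative distance exceeds it
        sum_d = d2.sum()
        r = rng.random() * sum_d
        idx = int(np.searchsorted(np.cumsum(d2), r, side="right"))
        centers.append(data[min(idx, n - 1)])
    return np.asarray(centers)
```

Points far from all existing centers occupy a larger share of the cumulative sum and are therefore more likely to be chosen, while a point that is already a center has zero distance and cannot be drawn again.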
The K-Means++ algorithm proposed in this paper optimizes the extraction of the cluster centers and improves the quality of the selected centers. The K-Means clustering algorithm in step 5 uses the Euclidean distance shown in equation (5) to measure the distance between samples:
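For reference, the standard Euclidean distance between two samples can be written as below. The symbols $x_i$, $x_j$ (two samples) and $m$ (the number of attributes per sample) are assumed here for illustration.

```latex
d(x_i, x_j) = \sqrt{\sum_{l=1}^{m} \left( x_{il} - x_{jl} \right)^{2}}
```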
In a multi-indicator evaluation system, the evaluation indicators differ in nature. If the raw data are used directly, indicators with higher values dominate the analysis and the role of indicators with lower values is weakened. To ensure the uniformity and reliability of the analysis and evaluation results, the raw data must be standardized.
Normalization of data refers to scaling the data so that it falls within a specified interval. For the multidimensional dataset sample sequence
The multidimensional dataset sample sequence
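One common way to scale data into a specified interval is min-max normalization. The sketch below assumes that scheme, which may differ from the exact formula used in the paper; the function name and the `low`/`high` parameters are illustrative.

```python
import numpy as np

def min_max_normalize(x, low=0.0, high=1.0):
    """Scale each column of `x` linearly into the interval [low, high]."""
    x = np.asarray(x, dtype=float)
    col_min = x.min(axis=0)
    col_range = x.max(axis=0) - col_min
    # a constant column has zero range; map it to `low` instead of dividing by zero
    safe_range = np.where(col_range == 0, 1.0, col_range)
    scaled = (x - col_min) / safe_range
    return low + scaled * (high - low)
```

After this step, an indicator measured in the thousands and one measured in fractions both span the same interval, so neither dominates a distance-based clustering purely because of its units.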
In this paper, the TS-SAE-K-Means++ data clustering model is constructed, which consists of four main parts.
Data input layer: the multidimensional dataset is input into the TS-SAE-K-Means++ clustering model.
Data processing layer: the input multidimensional data are first partitioned into groups of 6 dimensions each, and a group with fewer than 6 dimensions is padded with 0. A layer-by-layer greedy approach is then adopted to reduce the dimensionality of each data group, with the feature data as the template. Finally, the linkage method is used to map each data group, the strings whose mapping values match are de-duplicated, and the deleted data are filled in with 0.
Data clustering layer: the K-Means++ algorithm is first used to extract the clustering centers of the data groups, and K-Means clustering is then performed on each group of data based on the extracted centers.
Data output layer: the groups and data clustered by the TS-SAE-K-Means++ model are output.
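The grouping and zero-filling performed in the data processing layer can be sketched as follows. This is only an illustration under stated assumptions: the mapping value of a group is taken to be the tuple of its entries, which stands in for the linkage-based mapping described above, and both function names are hypothetical.

```python
import numpy as np

def group_by_six(values, group_size=6):
    """Split a flat sequence into rows of `group_size`, zero-padding the last row."""
    values = np.asarray(values, dtype=float).ravel()
    pad = (-len(values)) % group_size  # number of zeros needed to fill the last group
    padded = np.concatenate([values, np.zeros(pad)])
    return padded.reshape(-1, group_size)

def zero_out_duplicates(groups):
    """Keep the first group for each mapping value and fill later repeats with 0."""
    out = groups.copy()
    seen = set()
    for i, row in enumerate(groups):
        key = tuple(row)  # assumed mapping value: the group's own contents
        if key in seen:
            out[i] = 0.0
        else:
            seen.add(key)
    return out
```

Filling removed groups with 0 rather than dropping them keeps the array shape fixed, so later layers can rely on a constant number of groups.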
Given the uncertainty of political risk, analysis of political stability often involves fuzzy and inconsistent decision-making information, which may arise for a variety of reasons. Deterministic analysis methods often treat this uncertainty simply as noise or error in the system and thus ignore the important decision-making information that may be embedded in it. Bayesian networks, as intelligent data mining and knowledge discovery methods, play a crucial role in handling uncertain and inconsistent information through their reasoning capabilities. Therefore, this study applies Bayesian networks to predict political stability.
To make the construction of the Bayesian network more rigorous, reduce the complexity of the network, and strengthen the correlation between nodes, and thereby improve the prediction ability and efficiency of the Bayesian network, this study establishes a Bayesian network structure that maintains the flexibility and fault tolerance of Bayesian inference while increasing the accuracy of the network structure.
Bayesian networks are also known as belief networks or causal probability networks. They are widely used in machine learning, and their theoretical basis is probability theory, with the Bayes formula as the mathematical foundation. A Bayesian network combines probability theory with graph theory to represent graphically the causal relationships between factors in complex, fuzzy, and uncertain problems. A Bayesian network has two main components: a directed acyclic graph (DAG) and probability distribution tables. The DAG provides a qualitative description of the correlation between individual variables, and the probability distribution tables provide a quantitative representation of it.
Bayesian network models, analysis, and reasoning are based on probability theory. The relevant concepts include conditional probability, the multiplication formula, the full probability formula, the Bayes formula, prior probability, posterior probability, and the joint probability distribution. In the Bayesian network representation of knowledge, the joint probability distribution is central.
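For two different events $A$ and $B$ with $P(B) > 0$, the conditional probability of event $A$ occurring under the condition that event $B$ occurs is:
$$P(A \mid B) = \frac{P(AB)}{P(B)}$$
From the conditional probability, the multiplication formula holds when $P(B) > 0$:
$$P(AB) = P(A \mid B)\,P(B)$$
For events $A_1, A_2, \ldots, A_n$ with $P(A_1 A_2 \cdots A_{n-1}) > 0$:
$$P(A_1 A_2 \cdots A_n) = P(A_1)\,P(A_2 \mid A_1) \cdots P(A_n \mid A_1 A_2 \cdots A_{n-1})$$
Let events $B_1, B_2, \ldots, B_n$ be mutually exclusive, with $P(B_i) > 0$ and $\bigcup_{i=1}^{n} B_i = \Omega$.
Then for any event $A$, the full probability formula is:
$$P(A) = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)$$
For event $A$ with $P(A) > 0$, the Bayes formula is:
$$P(B_i \mid A) = \frac{P(A \mid B_i)\,P(B_i)}{\sum_{j=1}^{n} P(A \mid B_j)\,P(B_j)}$$
For the identified Bayesian network model, the joint probability distribution of the nodes $X_1, X_2, \ldots, X_n$ can be expressed as:
$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P\big(X_i \mid \mathrm{Pa}(X_i)\big)$$
where $\mathrm{Pa}(X_i)$ denotes the set of parent nodes of $X_i$.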
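The factorization of the joint distribution over each node and its parents can be illustrated with a small example. The three-node network, its binary states, and all probability values below are invented purely for illustration and are not taken from the paper.

```python
from itertools import product

# Hypothetical network: X1 -> X2, and X1, X2 -> X3. All nodes are binary (0 or 1).
parents = {"X1": (), "X2": ("X1",), "X3": ("X1", "X2")}

# cpt[node][parent values] = P(node = 1 | parents); the numbers are made up.
cpt = {
    "X1": {(): 0.3},
    "X2": {(0,): 0.2, (1,): 0.7},
    "X3": {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9},
}

def joint_probability(assignment):
    """P(X1, X2, X3) as the product of P(node | parents) over all nodes."""
    p = 1.0
    for node, pa in parents.items():
        p_one = cpt[node][tuple(assignment[q] for q in pa)]
        p *= p_one if assignment[node] == 1 else 1.0 - p_one
    return p

# Summing the joint over all 2^3 assignments should give 1.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product((0, 1), repeat=len(parents))
)
```

Here the full joint table would need 7 free parameters, while the factorized form stores 1 + 2 + 4 conditional entries; the saving becomes substantial as soon as nodes have few parents relative to the size of the network.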
Bayesian network is a directed acyclic graph (DAG) composed of nodes and directed arcs. Among them, nodes represent random variables, and directed arcs represent the direct causal relationship between one node and another node, pointing to the child node from the parent node, indicating the “result” caused by the “cause”.
The node variables in a Bayesian network satisfy the following independence properties.
(1) Conditional independence. Given the values of a node’s parent nodes, the node is conditionally independent of all its non-descendant nodes.
(2) Causal independence. A directed edge in the Bayesian network indicates the direct causal influence of a parent node on a child node. The parent nodes do not interact with one another; each affects the child node on its own, i.e., multiple causes independently affect the result.
Establishing a scientific and reasonable Bayesian network model is the first step in the study of Bayesian networks. Currently, the construction methods commonly used in the academic world for Bayesian networks are mainly as follows.
Using questionnaires and field visits, the knowledge and experience of experts are used to estimate the conditional probability of each node, so as to obtain the topology and parameters of the Bayesian network. Constructing a Bayesian network in this way generally does not require a large number of samples, so the approach is mainly used for models with a small number of nodes and simple relationships between nodes. However, a Bayesian network model built on expert knowledge is often susceptible to the subjective judgment of the experts, which makes its conclusions less objective.
A Bayesian network can also be constructed from data by choosing appropriate algorithms for structure learning and parameter learning. This method must be supported by a large amount of sample data, and it is therefore suitable for constructing networks with many nodes and complex relationships among them. With the development of machine learning and artificial intelligence, these learning algorithms continue to evolve. Compared with expert knowledge, however, the structure of a Bayesian network model built by data learning is rather mechanical and not flexible enough.
Expert knowledge and data learning can also be combined. The initial topology of the Bayesian network is first decided by experts and scholars in the discipline, drawing on their practical experience and relevant theoretical knowledge. A suitable algorithm is then selected for learning, and the initial Bayesian network is adjusted and optimized based on the learning results. This method balances subjectivity and objectivity, combines the advantages of expert knowledge and data learning, and improves the efficiency of modeling while ensuring the correctness of the Bayesian network model.
The main work required for Bayesian network construction is as follows:
(1) Bayesian network construction. A Bayesian network is a graphical structure in which the states of the nodes constrain one another. The causal relationships between the nodes determine the states of the connected nodes, which allows the resulting network to express how influence is transferred. After analysis by experts, the Bayesian network can be constructed initially. In the course of a specific analysis, however, some modifications may be needed, such as the introduction of new node variables.
(2) Definition of the variables of the network nodes. A Bayesian network involves two aspects: the construction of the network and the construction of its parameters. The information about the nodes is a prerequisite for determining the likelihood of the nodes and their type. The main nodes of Bayesian networks are natural nodes, decision nodes, and utility nodes. Natural nodes can be further divided into two types, M and N, according to their main function. In a specific analysis, the nodes of a Bayesian network represent the hazardous events and risk factors of a system, and these nodes are usually classified as type N. Such nodes are described by the probability of occurrence or non-occurrence, and each node has two states, Y and N, which indicate whether the risk factor described by the node occurs or not. Nodes of type M take values from 0-1, and a lower probability e is set on such nodes, which brings the setting of the risk factor closer to reality.
(3) Determination of the parameters of the Bayesian network. The parameters to be determined mainly include the probability distribution of the nodes, that is, the marginal probability of each node, and the conditional probability table (CPT) of each node. Under a specific network architecture, the relevant probabilities are calculated from the data in the conditional probability table of each node, according to the causal relationships and conditional independence between the variables, and the marginal probability of each node is finally obtained.
Since many variables are continuous type variables, they need to be discretized. For the purpose of Bayesian network conditional probability determination, this study discretizes the values of all variables into three levels of low, medium, and high according to the distribution characteristics of each variable, which are represented by 1, 2, and 3, respectively.
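A minimal sketch of such a three-level discretization, assuming the cut points are the terciles (1/3 and 2/3 quantiles) of each variable. The paper states only that the levels follow the distribution characteristics of each variable, so the tercile rule and the function name are assumptions.

```python
import numpy as np

def discretize_three_levels(values):
    """Map a continuous variable to 1 (low), 2 (medium), 3 (high) using terciles."""
    values = np.asarray(values, dtype=float)
    lo_cut, hi_cut = np.quantile(values, [1 / 3, 2 / 3])
    # digitize returns 0, 1, or 2 for the three bins; shift to the codes 1, 2, 3
    return np.digitize(values, [lo_cut, hi_cut], right=True) + 1
```

Tercile cuts give each level roughly the same number of samples, which avoids conditional probability table cells that are estimated from very few observations.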
Political stability contains six indicators of regime stability, institutional stability, social stability, economic stability, political and cultural stability, and international relations and regional stability, represented by F1, F2, F3, F4, F5, and F6, respectively.
At present, in many scholars’ applications of Bayesian networks, the conditional probabilities are mostly given directly by experts and then continuously updated with corresponding cases to calculate the final probability values. However, because of the complex associations in a Bayesian network, it is difficult for experts to formulate conditional probabilities from experience, and the more nodes there are, the harder it becomes to preset them. Therefore, this study determines the conditional probability of each node using network parameter learning.
There are four main Bayesian network parameter learning algorithms: maximum likelihood estimation, Bayesian estimation, the gradient descent algorithm, and the Expectation Maximization algorithm. The first two are used when there are no missing values, while the latter two are typically used when there are missing values. Because the samples in this study contain no missing data, Bayesian estimation is used for parameter learning.
Bayesian estimation assumes a fixed unknown parameter
It follows from Bayes’ rule:
For data set
Extending this result to Bayesian networks, event
In the Bayesian estimation algorithm, the parameter estimates are calculated by the following equation:
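As an illustration of Bayesian parameter estimation for a conditional probability table, the sketch below assumes a symmetric Dirichlet prior with hyperparameter `alpha` and returns the posterior mean, (count + alpha) / (row total + number of states × alpha). The prior, the single-parent setting, and the function name are assumptions made for this example; states are coded 1, 2, 3 as in the discretization described earlier.

```python
import numpy as np

def bayesian_cpt(child, parent, n_child=3, n_parent=3, alpha=1.0):
    """Posterior-mean estimate of P(child | parent) under a Dirichlet(alpha) prior."""
    counts = np.zeros((n_parent, n_child))
    for c, p in zip(child, parent):
        counts[p - 1, c - 1] += 1  # states are coded from 1, array indices from 0
    posterior = counts + alpha     # prior pseudo-counts added to observed counts
    return posterior / posterior.sum(axis=1, keepdims=True)

# Example: four observations of (child state, parent state).
cpt = bayesian_cpt(child=[1, 1, 2, 3], parent=[1, 1, 1, 2])
```

Unlike maximum likelihood estimation, this estimate never assigns zero probability to a state that happens not to appear in the sample, and a parent configuration with no observations falls back to the uniform prior.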
China’s rapid economic growth over the past three decades, especially its performance in the economic crisis, has attracted the attention of public opinion and academics at home and abroad, and a wave of debates about the China model and China’s development path has been set in motion. In order to obtain a more comprehensive and in-depth understanding of the China model, this chapter will use the political stability prediction model constructed in this study as a means to carry out political stability prediction for China, which has been developing rapidly under the China model.
Before formally carrying out the political stability prediction, the TS-SAE-K-Means++ multidimensional data clustering method in the political stability prediction model of this paper is used to integrate and preprocess the multidimensional data related to political stability, and the results of the political stability prediction based on the Bayesian network are analyzed for robustness.
In this paper, the TS-SAE-K-Means++ multidimensional data clustering method is used for the integrated preprocessing of multidimensional data on political stability. The three data processing algorithms SSAE, DAE, and TSSAE are each combined with the three clustering algorithms K-Means, KNN, and K-Means++, and the silhouette coefficient S and the F1 score are used as evaluation indexes to compare the multidimensional data preprocessing results of the different combinations. The results are shown in Table 1. The TS-SAE-K-Means++ method adopted in this paper reaches a silhouette coefficient of 0.96 and an F1 of 91.1%, the best values of both indexes among all combinations. The method is therefore well suited to the integrated preprocessing of multidimensional data related to political stability and can be used for the subsequent political stability prediction work.
Table 1. Preprocessing results for multidimensional data
| Data processing method | Clustering algorithm | S | F1 (%) |
|---|---|---|---|
| SSAE | K-Means | 0.67 | 82.5 |
| SSAE | KNN | 0.91 | 82.9 |
| SSAE | K-Means++ | 0.75 | 85.9 |
| DAE | K-Means | 0.71 | 82.1 |
| DAE | KNN | 0.72 | 82.3 |
| DAE | K-Means++ | 0.85 | 88.6 |
| TSSAE | K-Means | 0.90 | 81.1 |
| TSSAE | KNN | 0.84 | 90.6 |
| TSSAE | K-Means++ | 0.96 | 91.1 |
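For readers unfamiliar with the index S, the sketch below computes the standard silhouette coefficient, assuming that this is the coefficient reported in Table 1. For each sample, a is the mean distance to the other members of its own cluster and b is the smallest mean distance to any other cluster; the score is (b - a) / max(a, b), averaged over all samples. The function requires at least two clusters, and the toy data are invented.

```python
import numpy as np

def silhouette_coefficient(data, labels):
    """Mean silhouette score over all samples (Euclidean distance, >= 2 clusters)."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    dist = np.sqrt(((data[:, None, :] - data[None, :, :]) ** 2).sum(axis=2))
    scores = []
    for i in range(len(data)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)  # usual convention for a singleton cluster
            continue
        a = dist[i, same].sum() / (same.sum() - 1)  # excludes the zero self-distance
        b = min(
            dist[i, labels == other].mean()
            for other in np.unique(labels)
            if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Two tight, well-separated toy clusters give a score close to 1.
toy = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = silhouette_coefficient(toy, [0, 0, 1, 1])
bad = silhouette_coefficient(toy, [0, 1, 0, 1])
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values indicate samples that sit closer to another cluster than to their own.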
From the multidimensional data related to China’s political stability, 10 preprocessed samples are randomly selected as the test set, and political stability is predicted on this test set. The robustness of the Bayesian network parameter learning used in the proposed prediction method is verified by comparing the predicted results with the actual results. If the predicted probability is ≥50%, the risk is predicted to occur; if it is <50%, it is predicted not to occur. The results of the robustness test are shown in Table 2. Each sample has 6 predictions, one for each of F1, F2, F3, F4, F5, and F6. The overall prediction accuracy is 89.99%, indicating that the prediction results of the proposed model have good robustness.
Table 2. Robustness test results (columns F1 to F6 give the predicted probability, %)
| Sample number | F1 | F2 | F3 | F4 | F5 | F6 | Forecast risk | Actual risk | Accuracy rate (%) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 20 | 25 | 50 | 25 | 25 | 25 | F3 | F3 | 100 |
| 2 | 33.3 | 55.6 | 45.5 | 33.3 | 33.3 | 66.7 | F2, F6 | F3, F6 | 66.7 |
| 3 | 25 | 28.6 | 11.1 | 20 | 20 | 40 | - | - | 100 |
| 4 | 40 | 50 | 40 | 50 | 33.3 | 33.3 | F2, F4 | F2 | 83.3 |
| 5 | 42.9 | 33.3 | 57.1 | 50 | 20 | 25 | F3, F4 | F3, F4 | 100 |
| 6 | 66.7 | 33.3 | 66.7 | 33.3 | 33.3 | 33.3 | F1, F3 | F1, F3 | 100 |
| 7 | 50 | 50 | 14.3 | 12.5 | 18.8 | 20 | F1, F2 | F2 | 83.3 |
| 8 | 20 | 50 | 33.3 | 33.3 | 75 | 33.3 | F2, F5 | F5 | 83.3 |
| 9 | 33.3 | 20 | 33.3 | 33.3 | 40 | 11.1 | - | - | 100 |
| 10 | 66.7 | 66.7 | 13.3 | 50 | 22.2 | 50 | F1, F2, F4, F6 | F1, F2, F6 | 83.3 |
| Overall prediction accuracy | | | | | | | | | 89.99 |
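The accuracy figures in Table 2 can be reproduced from the table itself. The sketch below applies the 50% threshold to each predicted probability, compares the resulting risk set with the actual risks across the 6 indicators, and averages over the 10 samples; the variable and function names are illustrative. Computed without intermediate rounding the overall accuracy is 90.0% (54 of 60 indicator predictions correct), and the reported 89.99% corresponds to averaging the per-sample rates after rounding them to one decimal place.

```python
# Predicted probabilities (%) for F1..F6 and the actual risks, copied from Table 2.
predicted = {
    1: [20, 25, 50, 25, 25, 25],
    2: [33.3, 55.6, 45.5, 33.3, 33.3, 66.7],
    3: [25, 28.6, 11.1, 20, 20, 40],
    4: [40, 50, 40, 50, 33.3, 33.3],
    5: [42.9, 33.3, 57.1, 50, 20, 25],
    6: [66.7, 33.3, 66.7, 33.3, 33.3, 33.3],
    7: [50, 50, 14.3, 12.5, 18.8, 20],
    8: [20, 50, 33.3, 33.3, 75, 33.3],
    9: [33.3, 20, 33.3, 33.3, 40, 11.1],
    10: [66.7, 66.7, 13.3, 50, 22.2, 50],
}
actual = {
    1: {"F3"}, 2: {"F3", "F6"}, 3: set(), 4: {"F2"}, 5: {"F3", "F4"},
    6: {"F1", "F3"}, 7: {"F2"}, 8: {"F5"}, 9: set(), 10: {"F1", "F2", "F6"},
}

def predicted_risks(probs, threshold=50.0):
    """Indicators whose predicted probability reaches the threshold."""
    return {f"F{i + 1}" for i, p in enumerate(probs) if p >= threshold}

def sample_accuracy(probs, actual_risks):
    """Share (%) of the 6 indicators whose occur / not-occur prediction is correct."""
    pred = predicted_risks(probs)
    correct = sum((f"F{i}" in pred) == (f"F{i}" in actual_risks) for i in range(1, 7))
    return 100.0 * correct / 6

per_sample = {s: sample_accuracy(predicted[s], actual[s]) for s in predicted}
overall = sum(per_sample.values()) / len(per_sample)
```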
This section first explores the political stability of Chinese cities. Fourteen cities are studied: 4 first-tier cities (Beijing, Shanghai, Guangzhou, Shenzhen), 5 second-tier cities (Tianjin, Hangzhou, Nanjing, Wuhan, Chengdu), and 5 third-tier cities (Yangzhou, Dalian, Hohhot, Luoyang, Weihai). The political stability of first-, second-, and third-tier cities is shown in Figure 1. The political stability of all cities fluctuates noticeably over both 2020-2024 and 2025-2030, but that of first-tier cities always stays in the range [3,4], that of second-tier cities in the range [2,3.5], and that of third-tier cities, lower than both, in the range [1,3.5]. There are thus clear differences in the political stability of China’s first-, second-, and third-tier cities, which rank as first-tier cities > second-tier cities > third-tier cities.

Figure 1. The political stability of Chinese cities
Figure 2 compares China’s political stability in 2020-2024 and its projected political stability in 2025-2030 with the global average political stability. Political stability is on the rise both in China and globally. However, the global average fluctuates noticeably: it decreases from 2.67 to 2.47 in 2021 and is predicted to decrease again in 2025 and 2028. In contrast, China’s political stability shows only a small decline of 0.02 in 2024 and is predicted to rise steadily from 2025 to 2030.

Figure 2. China’s political stability forecast and global political stability forecast
According to the political stability prediction model proposed in this paper, political stability consists of six indicators: regime stability (F1), institutional stability (F2), social stability (F3), economic stability (F4), political and cultural stability (F5), and international relations and regional stability (F6). This section explores the predicted values of each indicator for China and for the continents of Asia, Africa, Europe, Oceania, North America, and South America; the prediction results are shown in Figure 3. Among all the political stability indicators in each region, political and cultural stability, institutional stability, and regime stability are relatively stable, with average values of 3.41, 3.55, and 3.47, respectively. International relations and regional stability has the lowest value, with an average across the continents of only 1.92, while China’s is 2.15. China plays an important role in world peace and development and actively establishes good diplomatic relations with other countries, which promotes both China’s political stability and the development of the world. However, due to some historical problems, such as the territorial disputes between China and some countries in Southeast Asia, as well as some emerging problems, such as the hegemonic behavior of the United States in an attempt to curb China’s development, China’s bilateral relations with some countries in the region have been adversely affected.
The indicators of social stability and economic stability are also low, second only to international relations and regional stability, with average values of 2.13 and 2.32, respectively, compared with 2.16 and 3.14 for China. The negative impact of escalating terrorism is also growing. As countries develop economically, the gap between rich and poor becomes more and more obvious, which fuels discontent among people at the bottom of society, triggers a series of social problems, and disrupts public security. Global integration has weakened geographical constraints, and the negative effects of immigration and refugee flows are becoming more pronounced, posing a double challenge to social and economic stability.

Figure 3. Indicators of political stability prediction
In this study, the TS-SAE-K-Means++ multidimensional data clustering model is first constructed to realize the integrated preprocessing of multidimensional data related to political stability. On this basis, a political stability prediction method is proposed based on Bayesian network theory. To analyze the prediction of China’s political stability under the Chinese model, this paper adopts the TS-SAE-K-Means++ method for multidimensional data integration and preprocessing; its silhouette coefficient is as high as 0.96 and its F1 reaches 91.1%, better than all other combinations of the SSAE, DAE, and TSSAE data processing algorithms with the K-Means, KNN, and K-Means++ clustering algorithms. The overall accuracy of the political stability prediction is 89.99%, indicating good robustness. In the local prediction for Chinese cities, there are clear differences among first-, second-, and third-tier cities, whose political stability lies in the intervals [3,4], [2,3.5], and [1,3.5], respectively, giving the ranking first-tier > second-tier > third-tier cities. In the overall forecast, whereas the global political stability forecast fluctuates noticeably, China’s political stability is expected to maintain a steady upward trend in 2025-2030. Consistent with the forecasts for all continents, China’s lowest value, 2.15, is for the international relations and regional stability indicator, followed by 2.16 and 3.14 for the social and economic stability indicators, while the political and cultural stability, institutional stability, and regime stability indicators are relatively stable, with average values of 3.41, 3.55, and 3.47, respectively.
