A time-series analysis study on the correlation between college students’ mental health status and the ideology courses in colleges and universities
Published Online: Sep 25, 2025
Received: Jan 29, 2025
Accepted: May 10, 2025
DOI: https://doi.org/10.2478/amns-2025-1006
© 2025 Yiqian Wang, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
With the rapid development of China’s economy and culture, higher education has become increasingly widespread, and the number of students enrolled in higher education institutions has increased year by year [1-2]. At present, higher education institutions generally emphasize the cultivation of students’ learning and research abilities but pay less attention to their mental health, and the related education system has not kept pace with the development of higher education. Most scholars who study the psychological health of college students focus on “pathology”-type research on individual students, which neglects the social perspective [3-6]. As a result, the attribution of mental health problems to social phenomena has been overlooked, and responses have been one-dimensional. Mental health problems among students in higher education manifest as psychological problems and “dysfunctional” behaviors, similar to the “dysfunction” leading to transgressive behavior identified by Merton, an American sociologist and psychologist; it is therefore especially important to explore the attribution of psychological disorders among students in higher education [7-10]. Mental health education for college students is closely related to ideological work, and cultivating students’ mental health should be the responsibility of all departments, including those responsible for ideological and political education. It is therefore of great practical significance to use the main channel of ideological education in colleges and universities, the ideological and political classes, to teach college students to face and solve mental health problems [11-15].
At present, the mental health literacy of college students is still constrained by many factors, and there remains considerable room for improvement. Literature [16] takes Chinese students from five universities as the research object and explores the mechanism linking ideological and political education and the mental health of college students in three directions: overall dialogue of disciplines, localized articulation, and rupture supplementation. The findings show that ideological and political education can improve the comprehensive quality of college students and enhance their mental health. Literature [17] builds a system, based on a knowledge graph and a KNN model, that predicts the relationship between educational psychology and ideological and political education in colleges and universities, and verifies through performance testing that the system achieves high accuracy, recall, and F1 values. After investigating and analyzing the mental health status of students in private colleges and universities, literature [18] proposes paths and strategies for integrating mental health education with ideological and political education courses and verifies their effectiveness through empirical analysis, which is significant for promoting students’ mental health and comprehensive development. Literature [19] outlines the standards of mental health and the definition and specific manifestations of emotional stability and a high degree of well-being, and discusses how ideological and political education guides students’ psychological state in the context of industry-education integration.
In addition, literature [20] tries to strengthen students’ mental health education through ideological and political education in courses, taking basketball teaching in sports colleges as an experimental example. The teaching results verify the feasibility of this attempt: ideological and political education within courses can improve the mental health of college students and increase their overall sense of well-being and psychological resilience. Literature [21] puts forward an improved teaching method that integrates ideological and political education into the mental health course, aiming to promote the mental health of college students in a targeted way and thereby improve the quality of talent cultivation. Literature [22] analyzes the mental health status of current college students and its attribution through questionnaires, and introduces mental health education into ideological and political education to promote the healthy growth of college students and cultivate high-quality, qualified talents for society. Literature [23] discusses the differences between health psychology education and ideological and political education in terms of content, methods, principles, purposes, and implementers, and establishes a synergistic model combining health psychology education and ideological and political classes. Practical teaching verifies the feasibility and validity of the model, which performs well in improving students’ mental health.
The article first introduces its research methodology: it describes the two stages of the Apriori algorithm, searching for frequent itemsets and searching for strong association rules, and then reviews the basic theory behind the modeling steps of the ARIMA algorithm. The roles of the Apriori algorithm in data mining analysis and of the ARIMA algorithm in predictive analysis of college students’ mental health status and college ideology courses are elaborated in detail. The article takes the enrollment psychological census data of the class of 2020 at one university as the research object, preprocesses the data, and uses the Apriori algorithm to mine association rules between the mental health status of college students and the Civics and Political Science courses of colleges and universities. The ARIMA time series algorithm is then used to predict changes in the psychological status of two students, so that the college Civics program can intervene more effectively in the mental health of college students.
The Apriori algorithm aims to extract the most frequently purchased items by customers from merchandise transaction data, as well as the association rules between these items, in order to better understand and analyze these relationships. It aims to better understand customers’ buying habits by analyzing these association rules and provide valuable information for decision making.
The significance of the Apriori algorithm: (1) Merchants can stock goods according to what customers buy frequently, ordering more of the items that sell often and fewer of those that sell rarely. (2) Correlated goods can be placed next to each other, which is convenient for customers. (3) The circulation of goods is sped up. (4) Economic development is promoted, contributing to the growth of GDP.
The Apriori algorithm consists of two parts: searching for frequent itemsets and searching for strong association rules. “Frequent” means that customers buy the items many times, measured by the support (count) index; “strongly associated” means that customers who frequently buy some goods also have a high probability of buying certain other goods, measured by the confidence index. The algorithm uses the following concepts [24].
Algorithms: in a general sense, an algorithm has several characteristics. (1) An algorithm can be executed by a computer as well as by the human brain; a procedure that the human brain can execute but a computer cannot is not called an algorithm. In this sense, algorithms are specific to computers. (2) An algorithm is independent of software platform, programming language, and hardware: it can be implemented on any platform and coded in any programming language. (3) An algorithm can be coded and implemented by any software engineer. It can be described in natural language, in mathematical language (concepts and notation), or in pseudo-programming statements, but none of these can be understood directly by a computer, so such a description is not itself a program. (4) An algorithm that begins must end: it must complete in a finite amount of time and a finite amount of space (memory and external storage).
Shopping baskets: Customers use shopping baskets to hold items when shopping in a supermarket, and these items are intrinsically linked to each other.
Data Requirements: Data should be true, accurate, and have a large sample size. A large sample size includes a wide range and a long time span. In this algorithm, the database D stores all the transaction data and saves only the name of the commodity without considering the quantity and price of the commodity. Each row of database D is well sorted in the dictionary order of commodity names.
Support (count): The number of transactions in which a customer buys a given set of commodities together is called the support count, and the ratio of this count to the total number of transactions in database D is called the support. The thresholds that determine whether an itemset is purchased frequently are called the minimum support and the minimum support count, denoted min_support and min_support_count, respectively.
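Confidence: among the transactions in which customers buy some item A, the proportion in which they also buy some other item B is the confidence of the rule A => B, i.e., the conditional probability P(B|A), computed as the support count of A and B together divided by the support count of A.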
Minimum support (counts) and minimum confidence levels are set manually and are based on three things: (1) advice from experts in the application area (market); (2) analysis by statisticians; (3) the experience of the programmer.
Ordered set: elements cannot be duplicated and are ordered, whereas in ordinary sets elements are unordered.
Ordered itemsets: the elements in the set are trade names, sorted in dictionary order.
Frequent itemset: if the support of itemset I satisfies a predefined minimum support, or if the support count of I is greater than or equal to the minimum support count, then I is a frequent itemset.
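The support, confidence, and frequent-itemset definitions above can be illustrated with a minimal sketch. The toy database `D`, the item names, and the helper function names are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
from typing import FrozenSet, List

# Hypothetical toy transaction database D: one shopping basket per row.
D: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "milk"}),
]


def support_count(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Support count divided by the total number of transactions."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(a: FrozenSet[str], b: FrozenSet[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Estimate of P(B | A): support_count(A and B) / support_count(A)."""
    return support_count(a | b, transactions) / support_count(a, transactions)


def is_frequent(itemset: FrozenSet[str], transactions: List[FrozenSet[str]],
                min_support: float) -> bool:
    """An itemset is frequent if its support reaches the minimum support."""
    return support(itemset, transactions) >= min_support


bread, milk = frozenset({"bread"}), frozenset({"milk"})
print(support(bread | milk, D))      # share of baskets with both items
print(confidence(bread, milk, D))    # share of bread baskets that also hold milk
```

In this toy data, bread and milk appear together in three of five baskets, and three of the four baskets containing bread also contain milk.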
Strong association rule: rule A => B has support sup
Comparison of ordered itemsets: (1) When counting
Ordered itemsets greatly improve the efficiency of the algorithm. If the itemsets are unordered, each element of one itemset must be compared with every element of the other, so the number of comparisons grows with the product of the two itemset sizes. The flow of the classical Apriori algorithm is shown in Figure 1.

Figure 1. Flowchart of the classic Apriori algorithm
When existing frequent item sets are found in database
As introduced above, association rule mining is divided into two parts: mining frequent itemsets and generating association rules. The algorithm mines the frequent itemsets in the original data using the join step, the pruning step, and the user-defined minimum support count. The mined frequent itemsets are then converted into strong association rules by the rule generation algorithm, using the minimum confidence set by the user. The basic model of association rule mining is shown in Fig. 2.
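The two-part procedure described above (frequent itemset mining with join and prune steps, then rule generation) can be sketched as follows. This is a minimal illustration of the classical Apriori idea, not the improved or matrix-based variant used in the study; the function names and the example baskets are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support_count):
    """Return {itemset: support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Count single items to obtain the frequent 1-itemsets.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support_count}
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Scan the database to count the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support_count}
        frequent.update(current)
        k += 1
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Subsets of a frequent itemset are frequent, so the lookup succeeds.
                conf = count / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules


# Hypothetical baskets with placeholder item names A, B, C.
baskets = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
freq = apriori(baskets, min_support_count=3)
rules = strong_rules(freq, min_confidence=0.75)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "=>", sorted(consequent), round(conf, 2))
```

With a minimum support count of 3, each pair of items is frequent but the triple {A, B, C} appears only twice and is dropped after counting.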

Figure 2. Basic model of association rule mining
A time series is a collection of observations recorded at successive points in time, arranged at a fixed time interval. Time series forecasting uses the existing historical values to predict future trends, for example forecasting the sales of a product or estimating its number of users. Many time series forecasting models are in common use; among them, the ARIMA model is the one most often applied in practical forecasting cases, and it is mainly intended for stationary non-white-noise series.
The ARIMA model consists of three theoretical components: an autoregressive model (AR), a moving average model (MA), and a differencing component (Integration). The AR term predicts the next value using only past values of the series, which is defined by the
The ARIMA
Assuming that the time series data set is represented by
Where,
When
Another component of ARIMA-
Where,
Combining the above two models gives the autoregressive moving average
where
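For reference, the AR, MA, and ARMA components named above are commonly written in the following textbook forms. The original equations did not survive in this text, so the symbols used here ($\phi_i$, $\theta_j$, $\varepsilon_t$, $B$) and the minus-sign convention for the moving average terms are assumptions and may differ from the paper's own notation.

```latex
\begin{align}
% AR(p): the current value depends on the p most recent values
x_t &= \phi_0 + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + \varepsilon_t \\
% MA(q): the current value depends on the q most recent random shocks
x_t &= \mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q} \\
% ARMA(p, q): combination of the two
x_t &= \phi_0 + \sum_{i=1}^{p} \phi_i x_{t-i} + \varepsilon_t - \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} \\
% ARIMA(p, d, q) with \phi_0 = 0: an ARMA(p, q) model for the d-th difference
\Phi(B)\,(1 - B)^d x_t &= \Theta(B)\,\varepsilon_t, \qquad
\Phi(B) = 1 - \sum_{i=1}^{p} \phi_i B^i, \quad
\Theta(B) = 1 - \sum_{j=1}^{q} \theta_j B^j
\end{align}
```

Here $B$ is the backshift operator, $B x_t = x_{t-1}$, and $\varepsilon_t$ is a zero-mean white-noise sequence.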
The ADF test is the augmented form of the Dickey-Fuller test. It is based on identifying the presence or absence of a unit root in a time series: if the series is stationary, there is no unit root; otherwise there is one. The ADF test takes the existence of a unit root as its null hypothesis; if the result is significant, i.e., the p-value is below the chosen significance level, the null hypothesis is rejected and the series is considered stationary.
According to the Cramér decomposition theorem (5):
The above equation is equivalent to equation (7):
It follows that the essence of first-order differencing is an autoregressive process, which leads to any
The essence of the above equation is also a
The coefficients of the autocorrelation function (ACF) express the linear relationship between the observed values of a time series and its values in past time periods. The autocorrelation coefficient takes values from -1 to 1; the closer its absolute value is to 1, the more strongly autocorrelated the series is. For time series with significant cyclical or seasonal variation, and for those with a strong upward or downward trend, the absolute value of the autocorrelation coefficient is closer to 1. The ACF coefficients can be calculated according to the following formula (10):
Where the above equation is expanded in terms to obtain equations (11) and (12):
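As a hedged illustration of the autocorrelation coefficient described above, the sketch below computes the sample ACF in its common textbook form (lag-k autocovariance divided by the lag-0 autocovariance). Formulas (10) to (12) did not survive in this text, so whether this matches the paper's exact expression is an assumption; the function name `acf` and the example series are hypothetical.

```python
def acf(series, max_lag):
    """Sample autocorrelation coefficients for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    # Lag-0 sum of squared deviations, used to normalise every lag.
    denom = sum((x - mean) ** 2 for x in series)
    coefficients = []
    for k in range(max_lag + 1):
        # Sum of products of deviations that are k steps apart.
        num = sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k))
        coefficients.append(num / denom)
    return coefficients


# Hypothetical short series with a steady upward trend.
example = [1, 2, 3, 4, 5]
print(acf(example, 2))
```

The lag-0 coefficient is always 1, and the remaining coefficients lie between -1 and 1, as stated in the text.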
The experimental object selected in this paper is the psychological data generated by the psychological census of the class of 2020 of a university at freshman enrollment. The class of 2020 has 2557 students, of whom 872 are male and 1685 are female. The enrollment assessment data from the Symptom Self-assessment Scale SCL-90 are selected as the research data. The improved Apriori algorithm is used to mine the interconnection between the psychological health status of college students and the Civic and Political Education of colleges and universities, the ARIMA time series algorithm is used to predict the psychological health status of college students, and the results obtained are analyzed and interpreted.
Data integration is the process of combining data from multiple data sources into one consistent data store, thus providing a complete data source for data mining. The data used in this case involve a table of basic information about individual students and a table of individual students’ psychological problems. The mental health analysis table for each student is generated by joining the two tables on their common primary key XH (student number). Part of the joined table is shown in Table 1. From the table, it can be seen that among the 10 randomly selected students, obsessive-compulsive symptoms have the highest mean score, 1.787.
Table 1. Part of the table after joining
| Student number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Gender | female | male | male | female | female | male | male | female | male | female |
| Only child | Y | N | Y | Y | N | N | Y | N | Y | Y |
| Single parent | N | Y | N | N | Y | N | N | Y | Y | N |
| …… | …… | …… | …… | …… | …… | …… | …… | …… | …… | …… |
| Phobic anxiety | 1.25 | 1.29 | 0.91 | 1.31 | 1.31 | 1.31 | 1.55 | 0.78 | 1.26 | 1.28 |
| Paranoia | 1.12 | 2.85 | 1.34 | 1.12 | 1.68 | 1.72 | 1.26 | 1.05 | 0.9 | 1.15 |
| Obsessive-compulsive disorder | 1.22 | 3.02 | 0.83 | 1.58 | 1.46 | 1.55 | 2.63 | 1.47 | 1.35 | 2.76 |
| Somatization | 1.34 | 1.23 | 1.59 | 1.28 | 1.18 | 1.91 | 1.05 | 1.52 | 1.78 | 1.89 |
| Interpersonal sensitivity | 1.6 | 1.77 | 1.19 | 2.17 | 1.65 | 0.95 | 1.23 | 1.63 | 1.25 | 1.47 |
| Depression | 1.17 | 2.63 | 1.5 | 1.55 | 0.91 | 1.59 | 0.96 | 1.12 | 1.62 | 1.4 |
Some of the students’ psychological problems are shown in Table 2. Data mining with the matrix-based Apriori algorithm starts by transforming the transaction database into a Boolean matrix. After data preprocessing, each student’s mental health problem analysis record is a transaction Ti (TID), and the transaction set is T = {T1, T2, T3, ..., Tn}. Each psychological dimension factor is an item. For each transaction, each of the 9 psychological factors is coded 1 if the student exhibits symptoms and 0 if asymptomatic. Assuming a minimum support of minsup = 20%, the minimum support count is sup_count = minsup × |T|.
Table 2. Some students’ psychological problems
| Student number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Hostility | N | N | Y | N | Y | N | Y | N | N | N |
| Anxiety | Y | N | N | N | N | Y | N | N | Y | N |
| Psychoticism | N | N | N | N | N | N | N | Y | N | N |
| Phobic anxiety | N | N | N | Y | Y | N | N | N | N | Y |
| Paranoia | N | N | N | N | N | N | N | N | N | N |
| Obsessive-compulsive disorder | N | Y | N | N | N | Y | N | Y | N | N |
| Somatization | N | N | N | N | Y | N | N | N | Y | N |
| Interpersonal sensitivity | N | Y | Y | N | N | N | Y | N | N | N |
| Depression | N | N | N | N | N | N | Y | N | N | Y |
The contents of Table 2 are transformed into a Boolean matrix, with Y mapped to 1 and N mapped to 0.
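The transformation into a Boolean matrix and the 20% minimum support can be sketched with the ten students listed in Table 2. This is an illustration under assumptions rather than the study's implementation: row order follows Table 2, the labels use standard SCL-90 factor names (which may differ from the wording in the table), and the variable names are hypothetical.

```python
from itertools import combinations

# Factor labels in Table 2 row order (standard SCL-90 names, assumed).
FACTORS = [
    "hostility", "anxiety", "psychoticism", "phobic anxiety", "paranoia",
    "obsessive-compulsive", "somatization", "interpersonal sensitivity",
    "depression",
]

# Y/N flags copied from Table 2, one string per factor, students 1..10.
TABLE_2 = [
    "N N Y N Y N Y N N N",
    "Y N N N N Y N N Y N",
    "N N N N N N N Y N N",
    "N N N Y Y N N N N Y",
    "N N N N N N N N N N",
    "N Y N N N Y N Y N N",
    "N N N N Y N N N Y N",
    "N Y Y N N N Y N N N",
    "N N N N N N Y N N Y",
]

flags = [row.split() for row in TABLE_2]
n_students = len(flags[0])

# Boolean matrix: one row per student (transaction), one column per factor.
matrix = [
    [1 if flags[j][i] == "Y" else 0 for j in range(len(FACTORS))]
    for i in range(n_students)
]

min_sup = 0.20
min_sup_count = min_sup * n_students  # sup_count = minsup x |T|

# Column sums give the support count of each single factor.
col_counts = {f: sum(row[j] for row in matrix) for j, f in enumerate(FACTORS)}
frequent_1 = [f for f in FACTORS if col_counts[f] >= min_sup_count]


def pair_count(a, b):
    """Support count of two factors: logical AND of their columns, summed."""
    ja, jb = FACTORS.index(a), FACTORS.index(b)
    return sum(row[ja] & row[jb] for row in matrix)


frequent_2 = {
    (a, b): pair_count(a, b)
    for a, b in combinations(frequent_1, 2)
    if pair_count(a, b) >= min_sup_count
}
print(frequent_1)
print(frequent_2)
```

Counting on the Boolean matrix reduces to column sums and element-wise AND operations, which is the usual motivation for a matrix-based variant. On these ten rows only, a single pair of factors reaches the support count of 2; the study's full data set would of course give different results.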
Some of the association rules between college students’ mental health status and college Civics courses, mined with the matrix-based Apriori algorithm, are shown in Table 3. The rules are converted to numeric codes through a code table and stored in the association rule base. Analysis of the obtained rules shows that the support between stress adjustment and anxiety is 0.92 and the confidence is 0.40. These data indicate that there are potential relationships between college students’ mental health status and college Civics courses, and that the Civics courses play a certain guiding role in intervening in college students’ mental health.
Table 3. Association rules
| Association rule | Support | Confidence |
|---|---|---|
| Social responsibility => Obsessive-compulsive disorder | 0.85 | 0.16 |
| A positive outlook on life => Obsessive-compulsive disorder | 0.95 | 0.20 |
| Emotional management => depression | 0.86 | 0.25 |
| Pressure adjustment => somatization | 0.91 | 0.45 |
| Interpersonal communication => Interpersonal sensitivity | 0.93 | 0.32 |
| Pressure adjustment => anxiety | 0.92 | 0.40 |
| A positive outlook on life => depression | 0.85 | 0.16 |
This section continues with an experimental test of two students selected from the university to predict their mental health status using a time series algorithm.
The time series of the depression severity index of the two students are shown in Figure 3. It can be seen that the data series of student A fluctuates more; at time point 16 the depression severity index reaches 0.7, and the degree of depressed mood is more serious overall.

Figure 3. Time series of the depression severity index of the two students
Several candidate models are screened after differencing the non-stationary series until it is stationary; the model type and order are initially determined from the ACF and PACF plots of the stationary differenced series. Based on the AIC and SC criteria, the p and q that minimize the AIC and SC values are selected. In addition, other statistics, such as the adjusted R2 and the inverse roots of the lag polynomial, are consulted to select the most appropriate model.
For the sequence of student B, the partial autocorrelation coefficient of the first-order differenced sequence is significantly non-zero at k=1 and quickly converges to 0 after k=2. The autocorrelation coefficient is significantly non-zero at k=1, while k=2 lies on the edge of the confidence interval, so q=1 or q=2 can be considered. The initial model chosen was ARMA(2,1) or ARMA(2,2), but after attempting to build the model, the adjusted R2 was found to be too small. The series was therefore differenced again, and the resulting second-order difference series was used for parameter estimation and model testing.
The test results of each model are shown in Table 4. The adjusted R2 value of ARIMA(3,2,0) model is 0.6431, which is larger than the values of the other three models, while the AIC and SC values are -6.0231 and -5.9031, which are smaller than those of the other models, so it can be determined that ARIMA(3,2,0) is the optimal model.
Table 4. Test results of each model
| Model class | Adjusted R2 | AIC | SC |
|---|---|---|---|
| ARIMA (1,2,0) | 0.5213 | -5.7132 | -5.7012 |
| ARIMA (2,2,0) | 0.5794 | -5.8315 | -5.7123 |
| ARIMA (22,2,0) | 0.5412 | -5.7103 | -5.6713 |
| ARIMA (3,2,0) | 0.6431 | -6.0231 | -5.9031 |
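The selection rule described above (largest adjusted R2, smallest AIC and SC) can be checked directly against the values in Table 4. The sketch below only compares the tabulated numbers and does not refit any model; the dictionary layout and variable names are assumptions, and the model labels are kept exactly as printed in the table.

```python
# (adjusted R2, AIC, SC) for each candidate, copied from Table 4.
candidates = {
    "ARIMA (1,2,0)": (0.5213, -5.7132, -5.7012),
    "ARIMA (2,2,0)": (0.5794, -5.8315, -5.7123),
    "ARIMA (22,2,0)": (0.5412, -5.7103, -5.6713),  # label as printed in Table 4
    "ARIMA (3,2,0)": (0.6431, -6.0231, -5.9031),
}

# Higher adjusted R2 is better; lower AIC and SC are better.
best_by_r2 = max(candidates, key=lambda m: candidates[m][0])
best_by_aic = min(candidates, key=lambda m: candidates[m][1])
best_by_sc = min(candidates, key=lambda m: candidates[m][2])

print(best_by_r2, best_by_aic, best_by_sc)
```

All three criteria point to the same candidate, which is consistent with the choice stated in the text.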
The estimated parameters of each model are shown in Table 5. The autoregressive parameters of the ARIMA(3,2,0) model are -1.0631, -0.7531, and -0.3812, respectively. This model indicates that the change in the second-order difference series of Student B’s depression severity is governed by the level of depression in the previous three days and is also related to the random error of the current day.
Table 5. Estimates of the model parameters
| Model class | Φ1 | Φ2 | Φ3 |
|---|---|---|---|
| ARIMA (1,2,0) | -0.6531 | | |
| ARIMA (2,2,0) | -1.5361 | -0.6512 | 0.9825 |
| ARIMA (22,2,0) | -0.9213 | -0.2712 | |
| ARIMA (3,2,0) | -1.0631 | -0.7531 | -0.3812 |
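To make the structure of an ARIMA(3,2,0) model concrete, the sketch below produces a one-step forecast from the three autoregressive coefficients in Table 5. It assumes a model with no constant term and a zero expected error, which the text does not state; the function names and the example series are hypothetical, and this is not the estimation procedure used in the study.

```python
# Autoregressive coefficients of ARIMA(3,2,0), copied from Table 5.
PHI = (-1.0631, -0.7531, -0.3812)


def second_difference(x):
    """w_t = x_t - 2*x_{t-1} + x_{t-2}, the second-order difference series."""
    return [x[t] - 2 * x[t - 1] + x[t - 2] for t in range(2, len(x))]


def forecast_next(x, phi=PHI):
    """One-step forecast of x from an AR model on its second differences."""
    w = second_difference(x)
    if len(w) < len(phi):
        raise ValueError("need at least len(phi) + 2 observations")
    # AR part: weighted sum of the most recent second differences.
    w_hat = sum(p * w[-(i + 1)] for i, p in enumerate(phi))
    # Undo the differencing: x_{t+1} = w_{t+1} + 2*x_t - x_{t-1}.
    return w_hat + 2 * x[-1] - x[-2]


# Hypothetical depression severity index values, for illustration only.
series = [0.40, 0.42, 0.45, 0.44, 0.46, 0.45]
print(forecast_next(series))
```

For a perfectly linear series the second differences are all zero, so the forecast simply continues the straight line; the autoregressive terms only matter when the series bends.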
Once ordering, estimation, and testing have yielded a satisfactory model, forecasting can proceed. Forecasting estimates the values of the series at future moments based on past and present sample values. The forecasting results of the ARIMA model are shown in Table 6. The predicted values are close to the actual values, and the smallest relative error in magnitude is -0.004 (about -0.4%). The ARIMA (3,2,0) model finally established for the student A sequence can therefore be considered to predict well and to reflect the pattern of change in that sequence. For the model established from the student B sample, the relative error between predicted and actual values is within the acceptable range; the ARIMA (3,2,2) model predicts reasonably well and basically captures the pattern of change in the student B sequence.
Table 6. Prediction of the ARIMA model
| Sequence | Sample period | Actual value | Predicted value | Relative error |
|---|---|---|---|---|
| A | 50 | 0.4512 | 0.4492 | -0.004 |
| A | 51 | 0.4523 | 0.4412 | -0.025 |
| B | 52 | 0.4812 | 0.5123 | 0.065 |
| B | 53 | 0.4612 | 0.5034 | 0.091 |
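The relative errors in Table 6 can be recomputed from the actual and predicted values as (predicted - actual) / actual. The sketch below does this for the four tabulated rows; the variable and function names are assumptions. Recomputing suggests the tabulated errors are fractions rather than percentages, so -0.004 corresponds to roughly -0.4%.

```python
# (sequence, sample period, actual value, predicted value) from Table 6.
rows = [
    ("A", 50, 0.4512, 0.4492),
    ("A", 51, 0.4523, 0.4412),
    ("B", 52, 0.4812, 0.5123),
    ("B", 53, 0.4612, 0.5034),
]


def relative_error(actual, predicted):
    """Signed relative error of a prediction, as a fraction of the actual value."""
    return (predicted - actual) / actual


errors = [relative_error(actual, predicted) for _, _, actual, predicted in rows]
for (seq, period, _, _), err in zip(rows, errors):
    print(seq, period, f"{err:.4f}", f"({err * 100:.2f}%)")
```

The recomputed values agree with the table to within rounding in the third decimal place.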
The article uses data mining and time series methods to analyze and predict the correlation between the mental health status of college students and the Civics courses in colleges and universities. The experimental research leads to the following conclusions:
In the association rule analysis, the support between stress adjustment and anxiety is 0.92 and the confidence is 0.40, which suggests that the Civics and Political Science courses in colleges and universities have a positive effect on the mental health status of college students. In the ARIMA prediction results, the predictions for the student A sequence are close to the actual values, with a smallest relative error of -0.004 (about -0.4%), leading to the conclusion that the ARIMA (3,2,0) model predicts well.
