A time series analysis study of green finance investment returns under the Sustainable Development Goals (SDGs)
Published Online: Sep 29, 2025
Received: Jan 09, 2025
Accepted: Apr 30, 2025
DOI: https://doi.org/10.2478/amns-2025-1086
© 2025 Xiaojia Pan and Lili Liu, published by Sciendo.
This work is licensed under the Creative Commons Attribution 4.0 International License.
In today's globalized world, no country or industry can exist in isolation. Global climate change, the gradual depletion of resources, and the deterioration of the ecological environment have become common challenges for all mankind [1]. To cope with these challenges, the concept of sustainable development emerged and has gradually become a global consensus [2]. Financial investment is an important part of modern economic activity, and its key role in resource allocation is self-evident. Integrating financial investment with the Sustainable Development Goals is an important trend in the future development of the financial industry. By formulating sustainable investment strategies, innovating sustainable investment products, and strengthening risk management, financial institutions can effectively promote the green transformation and sustainable development of the global economy [3-6]. This integration not only helps to realize the coordinated development of the economy, society, and the environment, but is also a path for the transformation and upgrading of the financial industry itself [7-9].
As environmental awareness and the sense of social responsibility grow, more and more investors pay attention to the environmental, social, and governance performance of enterprises. They consider not only the economic benefits of investment projects but also their impact on the environment and society [10-12]. This shift in investment philosophy has made sustainable investment a new market demand, which financial institutions meet by providing investment products aligned with the goal of sustainable development [13-14]. In their investment decisions, financial institutions have to consider both the return on investment and the sustainability of the project [15]. How financial institutions can ensure the sustainability of investment projects while pursuing economic benefits has therefore become an urgent problem [16-18]. Meeting this challenge requires greater judgment and foresight in investment decision-making in order to achieve both economic and social benefits, and information analysis techniques based on return on investment (ROI) analysis have emerged in response [19-20].
In this paper, we study the dynamic characteristics of green financial investment returns with time series analysis methods. We construct a combined ARIMA-PSO-LSTM model that brings together the ability of the ARIMA model to capture short-term fluctuations, the parameter-search capability of the particle swarm optimization (PSO) algorithm, and the ability of the long short-term memory (LSTM) network to learn long-term dependencies. By comparing the prediction results of the single models and the combined model, we then examine the accuracy and stability of the ARIMA-PSO-LSTM model in predicting green financial investment returns, with the aim of providing an effective analytical tool for financial investment under the goal of sustainable development.
Properties of Time Series - Autocorrelation
The nature of autocorrelation of time series data is generally expressed in terms of the partial autocorrelation coefficient function (PACF), autocorrelation coefficient function (ACF) [21] along with the autocovariance function. Common economic behaviors that are measured over time can often be expressed in terms of the correlation of the data.
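For a time series $\{x_t\}$ with mean $\mu$, the autocovariance between the random variable $x_t$ and its value lagged by $k$ periods is
$$\gamma_k = \mathrm{Cov}(x_t, x_{t-k}) = E\left[(x_t - \mu)(x_{t-k} - \mu)\right]$$
The autocorrelation coefficient, which better represents the magnitude of the correlation between the variable and its lagged value, is mathematically expressed in the form:
$$\rho_k = \frac{\gamma_k}{\gamma_0}$$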
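As an illustration of the autocorrelation coefficient, the sample estimate at each lag can be computed directly from the data. The following is a minimal sketch in plain Python, assuming the usual biased estimator that divides by the series length; the function name `sample_acf` and the example values are illustrative and do not come from the paper.

```python
def sample_acf(series, max_lag):
    """Return sample autocorrelations rho_0 .. rho_max_lag of a numeric sequence."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    # Lag-0 autocovariance (the sample variance with divisor n).
    gamma0 = sum(c * c for c in centered) / n
    acf = []
    for k in range(max_lag + 1):
        # Sample autocovariance at lag k, using the same divisor n (assumption).
        gamma_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / n
        acf.append(gamma_k / gamma0)
    return acf


# Illustrative data only, not taken from the paper.
example = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sample_acf(example, 2))
```

Values close to zero at all non-zero lags suggest little linear dependence on the past, while slowly decaying values are typical of trending series.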
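Properties of Time Series - Stationarity
Let $\{x_t\}$ be a time series. The series is strictly stationary if the joint distribution of $(x_{t_1}, \dots, x_{t_n})$ is the same as that of $(x_{t_1+\tau}, \dots, x_{t_n+\tau})$ for every $n$, every choice of time points, and every shift $\tau$.
Weak stationarity is defined through the first two moments of the data, and in general neither strict stationarity nor weak stationarity contains the other. A weakly stationary series is usually one that satisfies the following conditions: the mean $E(x_t) = \mu$ is constant over time, the second moment $E(x_t^2)$ is finite, and the autocovariance $\mathrm{Cov}(x_t, x_{t-k}) = \gamma_k$ depends only on the lag $k$ and not on $t$.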
Properties of Time Series - White Noise
White noise is a special case of a weakly stationary process. A sequence $\{\varepsilon_t\}$ is a white noise sequence if it has the following three characteristics: a constant mean, $E(\varepsilon_t) = \mu$ (commonly $\mu = 0$); a constant variance, $\mathrm{Var}(\varepsilon_t) = \sigma^2$; and no correlation between different time points, $\mathrm{Cov}(\varepsilon_t, \varepsilon_s) = 0$ for $t \neq s$.
Based on these conditions, white noise is stationary, and it plays an important theoretical role in time series analysis.
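A common white noise test statistic is the Ljung-Box statistic:
$$Q_{LB} = n(n+2)\sum_{k=1}^{m}\frac{\hat{\rho}_k^{2}}{n-k}$$
where $n$ is the number of observations, $m$ is the number of lags tested, and $\hat{\rho}_k$ is the sample autocorrelation coefficient at lag $k$. Under the null hypothesis that the sequence is white noise, $Q_{LB}$ approximately follows a $\chi^2$ distribution with $m$ degrees of freedom.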
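The Ljung-Box form of the white noise test statistic, which is the test applied to the model residuals later in this paper, can be sketched in a few lines. This is a minimal illustration in plain Python; the function name `ljung_box_q` is an assumption, and the p-value step (comparison with a chi-square distribution whose degrees of freedom equal the number of lags) is left out to keep the sketch dependency-free.

```python
def ljung_box_q(series, lags):
    """Ljung-Box Q statistic: n (n + 2) * sum_{k=1..lags} rho_k^2 / (n - k)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    gamma0 = sum(c * c for c in centered)
    q = 0.0
    for k in range(1, lags + 1):
        # Sample autocorrelation at lag k (the 1/n factors cancel in the ratio).
        gamma_k = sum(centered[t] * centered[t - k] for t in range(k, n))
        rho_k = gamma_k / gamma0
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q


# Illustrative data only, not taken from the paper.
print(ljung_box_q([1.0, 2.0, 3.0, 4.0, 5.0], lags=2))
```

A large Q relative to the chi-square critical value indicates remaining autocorrelation; a small Q is consistent with white noise.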
Common Financial Time Series Models
Autoregressive
where
The
The moving average (MA) model, also named the sliding average model in some introductions to model theory, can be understood as writing the time series as a linear combination of a series of uncorrelated random variables.
where
Autoregressive integrated moving average (ARIMA) model [23]
In the formula
Modeling the time series data is actually based on the time series historical data, and the final econometric model is determined through the steps of data processing, model judgment, and model parameterization.
The establishment of ARIMA model includes the following main steps:
Data acquisition. The data used in the model can be obtained, after desensitization, from a private database of the company or industry, or directly from the open data market.
Data preprocessing. In this paper, the data are first visualized and then examined with the ACF and PACF for further validation. If the time series shows an upward or downward trend, it is differenced and then checked for stationarity, and the differencing is repeated until the data are stationary.
Model identification. An identification method is used to determine which known model best describes the time series process. The Box-Jenkins method is commonly used for this purpose.
Order determination. The order of the model is judged and tested using the ACF, the PACF, and criteria such as the AIC, and the residuals are checked for any useful information that has not yet been extracted.
Parameter estimation. Once the order is fixed, the specific parameters of the model are estimated and a concrete model is fitted.
Model validation. The actual time series data are fed into the fitted model, data for a future period are predicted following the designed steps, the actual fit of the model is observed, and the model is evaluated against the chosen criteria to draw conclusions.
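The differencing step in the procedure above, and the matching step of turning forecasts of the differenced series back into forecasts of the original level, can be sketched as follows. This is a minimal plain-Python illustration; the function names `difference` and `integrate_forecast` are hypothetical, and the stationarity test and the parameter estimation themselves are not shown.

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing: y_t = x_t - x_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def integrate_forecast(last_value, diff_forecasts):
    """Convert forecasts of the first difference back to level forecasts."""
    levels = []
    current = last_value
    for step in diff_forecasts:
        current += step  # cumulative sum, starting from the last observed level
        levels.append(current)
    return levels


# Illustrative data only, not taken from the paper.
prices = [1, 3, 6, 10]
print(difference(prices))              # first-order difference
print(difference(prices, d=2))         # second-order difference
print(integrate_forecast(10, [5, 6]))  # level forecasts from difference forecasts
```

In an ARIMA(p, d, q) model the value of d is the number of differencing rounds needed before the series passes the stationarity check.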
LSTM is a special RNN structure. To solve the vanishing gradient problem that traditional RNNs suffer from when trained on long sequences, LSTM introduces a special “gate” structure, which consists of an input gate, a forget gate, an output gate, and a cell state.
where
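To make the gate structure concrete, the sketch below runs one time step of an LSTM cell with scalar input and scalar hidden state. It assumes the commonly used gate equations (sigmoid forget, input, and output gates and a tanh candidate), which may differ in notation from the formulas of the paper; the function name `lstm_step` and the parameter layout are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps gate name -> (w_x, w_h, b) for scalar states."""

    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre_activation("input"))        # how much of the candidate to write
    o = sigmoid(pre_activation("output"))       # how much of the cell state to expose
    g = math.tanh(pre_activation("candidate"))  # candidate cell content
    c = f * c_prev + i * g                      # updated cell state
    h = o * math.tanh(c)                        # updated hidden state
    return h, c


# Hypothetical weights chosen only for illustration.
weights = {name: (0.5, 0.1, 0.0) for name in ("forget", "input", "output", "candidate")}
print(lstm_step(x=1.0, h_prev=0.0, c_prev=0.0, params=weights))
```

Because the cell state is updated additively, gradients can flow across many time steps, which is what allows the network to learn long-term dependencies.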
The particle swarm optimization algorithm [24] treats the individuals of a group as particles searching a space. Each particle is initialized with a random candidate solution, and each solution has a position, a velocity, and a fitness value. During the search, every particle keeps tracking the best solutions found so far and adjusts its own parameters accordingly, so that the swarm moves from local optima toward the global optimum.
Specifically, assuming the existence of a
where
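The particle update can be illustrated with a small, self-contained example. The sketch below assumes the standard PSO update with an inertia weight and two acceleration coefficients; the function name `pso_minimize`, the default coefficient values, and the test objective are assumptions made for illustration and are not the settings used in the paper.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # best position of each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position of the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Position update, clamped to the search range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Toy objective (sum of squares) used only to exercise the update rule.
best_pos, best_val = pso_minimize(lambda p: sum(x * x for x in p), [(-5.0, 5.0)] * 2)
print(best_pos, best_val)
```

In the PSO-LSTM setting the objective would be the prediction error of an LSTM trained with the hyperparameters encoded in the particle position.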
The structure of the PSO-LSTM model is shown in Figure 1.

PSO-LSTM model flowchart
Algorithm flow:
Step 1. The sample batch size, the number of hidden layer units, the learning rate, and the number of iterations of the LSTM are taken as the optimization objects, and the positions of the particles are initialized within the preset ranges.
Step 2. The particle population is initialized, the data are divided into a training set and a test set, the parameters initialized in Step 1 are fed into the LSTM network for training, and the model prediction error is taken as the fitness value of each particle.
Step 3. The fitness value of each particle is compared with that of the best position it has visited, the optimal position of the particle is determined, the velocity and position of the particle are updated, and a new round of fitness values is calculated.
Step 4. The update stops when the search reaches the predetermined maximum number of iterations, or when the fitness values of the particles no longer change significantly with further iterations. The sample batch size, the number of hidden layer units, the learning rate, and the number of iterations of the LSTM model at that point are recorded.
Step 5. The values obtained in Step 4 are fed into the LSTM model for training and prediction.
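One practical detail of the flow above is how a particle position, which is a vector of real numbers, is turned into LSTM hyperparameters and scored. The sketch below shows one possible encoding. The search ranges in `HYPOTHETICAL_BOUNDS` are placeholders invented for illustration (they are not the ranges used in the paper), and `train_and_score` stands for any user-supplied routine that trains an LSTM and returns its prediction error.

```python
# Placeholder search ranges, NOT the ranges used in the paper.
HYPOTHETICAL_BOUNDS = {
    "batch_size": (16, 128),
    "hidden_units": (8, 128),
    "learning_rate": (1e-4, 1e-2),
    "epochs": (50, 300),
}


def decode_particle(position, bounds=HYPOTHETICAL_BOUNDS):
    """Clip a particle position to the bounds and cast integer hyperparameters."""
    params = {}
    for value, (name, (lo, hi)) in zip(position, bounds.items()):
        clipped = min(max(value, lo), hi)
        # Only the learning rate stays continuous; the others are counts.
        params[name] = clipped if name == "learning_rate" else int(round(clipped))
    return params


def fitness(position, train_and_score):
    """Fitness of a particle = prediction error returned by the training routine."""
    return train_and_score(**decode_particle(position))


# Stand-in scoring function so the sketch runs without a deep learning framework.
def dummy_score(batch_size, hidden_units, learning_rate, epochs):
    return hidden_units * learning_rate


print(decode_particle([200.0, 7.6, 0.5, 99.6]))
print(fitness([200.0, 7.6, 0.5, 99.6], dummy_score))
```

Keeping the decoding separate from the optimizer means the same PSO loop can be reused when the set of tuned hyperparameters changes.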
In order to verify the prediction performance of the PSO-LSTM model [25] on green financial investment returns, the mean absolute percentage error
where
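The error measures reported in the experiments of this paper (MAE, MSE, RMSE, and MAPE) can be computed as in the following sketch. The definitions are the standard ones, and the function names and example numbers are illustrative assumptions.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative numbers only, not taken from the paper.
y_true = [100.0, 200.0]
y_pred = [110.0, 180.0]
print(mae(y_true, y_pred), mse(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```

MAPE is scale-free, which makes it convenient for comparing models across series, but it is undefined when an actual value is zero.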
The experimental environment is Anaconda, the programming language is Python 3.6, and the model training framework is Keras on TensorFlow 1.4. The LSTM model has 4 neurons in the input layer, 1 hidden layer, and 1 neuron in the output layer, and the step size required for prediction is 40. The Adam algorithm is used to optimize the parameters during training. The sample batch size, the number of hidden layer units, the learning rate, and the number of iterations are the parameters to be optimized, and their value ranges are set as follows: the sample batch takes the value range of
In a parallel combination of forecasting models, the forecasts of the different models do not interfere with each other. The forecasts of the individual models are superimposed by a weighted combination, and the weighted sequence is the forecast of the combined model. Assuming that
where
Generally speaking, the weighting coefficient of each component of a combined prediction model is negatively correlated with its prediction error, so a model with a larger error is given a smaller weight. Depending on how the weighting coefficients are calculated, combined prediction models can be divided into optimal and non-optimal combination models. The optimal combination model determines the weighting coefficient of each component according to the principle of minimizing the prediction error of the combined model. In this paper, the weighted least squares method is used to calculate the weighting coefficients of the predictions of the different models, and the calculation is described mathematically as follows:
Assuming that
Where,
where
where
Let the error information matrix of the combined model be
Thus the sum of squares of the prediction errors of the combined model at moment
Because the combined prediction model in this paper adopts parallel combination, the modeling prediction process of each prediction model does not interfere with each other, so when the constraints are
In this paper, we use the parallel weighted combination of the above prediction methods, using the weighted least squares method to calculate the weighting coefficients of the two predicted values, and the weighting coefficients are calculated as follows:
Let the error vector of the ARIMA model be
There exists a weighting factor
The weighting coefficients and the sum of squares of the prediction errors of the combined model are calculated as follows:
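For two component models, weights that minimize the sum of squared errors of the combination under the constraint that the weights sum to one have a closed form. The sketch below is an illustration under that assumption; it is not claimed to reproduce the exact formulas of this paper, and the names `two_model_weights` and `combine` are hypothetical. With error vectors $e_a$ and $e_b$, the weight of the first model is $(e_b \cdot e_b - e_a \cdot e_b) / (e_a \cdot e_a + e_b \cdot e_b - 2\, e_a \cdot e_b)$.

```python
def two_model_weights(errors_a, errors_b):
    """Weights (w_a, w_b), w_a + w_b = 1, minimizing sum((w_a*e_a + w_b*e_b)^2)."""
    s_aa = sum(a * a for a in errors_a)
    s_bb = sum(b * b for b in errors_b)
    s_ab = sum(a * b for a, b in zip(errors_a, errors_b))
    denom = s_aa + s_bb - 2.0 * s_ab
    if denom == 0.0:
        # Identical error vectors: any split gives the same result, use equal weights.
        return 0.5, 0.5
    w_a = (s_bb - s_ab) / denom
    return w_a, 1.0 - w_a


def combine(pred_a, pred_b, w_a, w_b):
    """Weighted combination of two forecast sequences."""
    return [w_a * a + w_b * b for a, b in zip(pred_a, pred_b)]


# Illustrative error vectors only, not taken from the paper.
w_a, w_b = two_model_weights([1.0, -1.0], [1.0, 1.0])
print(w_a, w_b)
print(combine([10.0, 20.0], [14.0, 16.0], w_a, w_b))
```

Note that this unconstrained-sign solution can produce a negative weight when the two error vectors are strongly correlated; a non-negativity constraint would need an additional clipping step.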
Finally, ARIMA-PSO-LSTM combination model [26] is constructed to predict the green finance investment return.
In this paper, the performance of the ARIMA-PSO-LSTM model is examined using a bank's daily lowest price from January 2010 to December 2024 as an example of green financial investment returns.
The lowest price of a bank from January 2010 to December 2014 is shown in Fig. 2. As the figure shows, the price first declines slowly, then rises rapidly and falls rapidly, and finally declines slowly again, so the overall trend is oscillatory and the series is preliminarily judged to be non-stationary. To remove the subjectivity of this graphical judgment, an ADF stationarity test is applied to the daily price series. The p-value under the null hypothesis of a non-stationary series is 0.5451, which is well above 0.05, so the null hypothesis cannot be rejected and the original series is judged to be non-stationary.

The lowest price of a bank from January 2010 to December 2014
For a non-stationary time series, differencing is often used to remove the non-stationarity and extract the relevant information. The first-order differenced series is shown in Fig. 3. The values show no obvious upward or downward trend and no obvious cyclical or seasonal changes, but instead fluctuate around 0. The trend is essentially eliminated and the mean is stable, so the first-order differenced lowest-price series can be regarded as stationary. To avoid the subjectivity of the graphical test, the ADF stationarity test is again applied to the differenced daily lowest-price series. The p-value under the null hypothesis of a non-stationary series is 0.01, which is below the significance level of 0.05, so the null hypothesis is rejected and the differenced series is judged to be stationary.

First order difference sequence diagram
The autocorrelation results are shown in Fig. 4, and the partial autocorrelation results are shown in Fig. 5. Within the first 100 lags, both the autocorrelation and the partial autocorrelation values exceed the confidence bounds at some lags, so both can basically be judged to tail off. According to the usual decision rules, the series is tentatively identified as an ARIMA model.

Autocorrelation results

Partial autocorrelation results
For a given time series, more than one model may fit significantly well, so a relatively superior model has to be selected for statistical inference. The AIC and BIC criteria can be used as the basis for this choice. All combinations with p and q less than or equal to 5 are fitted, and the ARIMA (4, 1, 2) model is finally chosen by jointly minimizing AIC and BIC. On the training set, this model gives RMSE = 0.431645 and MAE = 0.248307.
The residuals of the model should be independent and normally distributed with mean 0, and their autocorrelation coefficients should be zero at every lag. The Ljung-Box test is often used to check whether the autocorrelation coefficients of the residuals are all zero. In this experiment, the p-value is 0.9937, so the null hypothesis of no residual autocorrelation cannot be rejected, i.e., the autocorrelation coefficients of the residual series can be regarded as zero. The ARIMA model therefore fits the Ping An Bank daily lowest-price series well. In addition, the Q-Q plot of the model residuals is shown in Figure 6. It indicates that the residuals are approximately normally distributed, so the model passes the test.
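The information criteria used for order selection can be computed from the maximized log-likelihood of each fitted candidate. The following is a minimal sketch using the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The candidate log-likelihoods in the example are made-up numbers for illustration, and the helper `best_orders` is a hypothetical name.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def best_orders(candidates, n_obs):
    """candidates: {(p, q): (log_likelihood, n_params)} -> (best by AIC, best by BIC)."""
    by_aic = min(candidates, key=lambda o: aic(*candidates[o]))
    by_bic = min(candidates, key=lambda o: bic(*candidates[o], n_obs))
    return by_aic, by_bic


# Made-up log-likelihoods, used only to show the mechanics of the comparison.
fits = {(1, 1): (-100.0, 3), (2, 1): (-99.5, 4), (4, 2): (-90.0, 7)}
print(best_orders(fits, n_obs=1000))
```

BIC penalizes additional parameters more heavily than AIC for all but very short series, so the two criteria can disagree; some rule for reconciling them is then needed.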

Q-Q plot of the model residuals
Establishing the ARIMA model is the first step in building the ARIMA-PSO-LSTM model. The predicted value of the ARIMA-PSO-LSTM model consists of two parts, the ARIMA prediction and the LSTM prediction of the residuals, and the latter is obtained by training the LSTM on the residuals of the ARIMA model on the training set. The fitting quality of the ARIMA model therefore affects that of the combined model, and selecting appropriate parameters and establishing an effective ARIMA model are necessary conditions for building the combined model successfully. To verify whether the fitted ARIMA model is applicable, 1200 days of rolling forecasts based on the true values are carried out. The prediction results of the ARIMA model are shown in Figure 7. After each one-day-ahead prediction, the true value of that day is added to the training data before the next prediction. The results are MSE = 0.056572, MAE = 0.144935, and RMSE = 0.237659. Because each prediction is based on the true values, the model basically follows the trend of the original data. The modeling is reasonable, and the model can be used to build the ARIMA-PSO-LSTM model.
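The rolling forecast described above, in which each one-step prediction is followed by appending the true value to the history, can be sketched as a simple loop. The function name `rolling_forecast` is illustrative, and a naive last-value predictor stands in for the fitted ARIMA model so that the example runs without external libraries.

```python
def rolling_forecast(series, start, predict_next):
    """One-step-ahead walk-forward forecasts for series[start:]."""
    history = list(series[:start])
    predictions = []
    for actual in series[start:]:
        predictions.append(predict_next(history))
        # The true value, not the prediction, is added before the next step.
        history.append(actual)
    return predictions


def naive_last_value(history):
    """Stand-in predictor (assumption): tomorrow equals today."""
    return history[-1]


# Illustrative data only, not taken from the paper.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(rolling_forecast(data, start=2, predict_next=naive_last_value))
```

Replacing `naive_last_value` with a function that refits or updates a time series model on `history` gives the evaluation scheme used for the 1200-day test. Because every step sees the latest true value, errors do not accumulate across the horizon.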

Prediction of the ARIMA model
To show the advantage of finding hyperparameters with an optimization algorithm, the ARIMA model and the optimized ARIMA-PSO-LSTM model are compared side by side. Figure 8 shows the prediction results of the two models. As the figure shows, the ARIMA-PSO-LSTM model tracks the real values more closely. Compared with the single ARIMA model, the combined ARIMA-PSO-LSTM model established in this paper therefore has better prediction accuracy.

The prediction of the ARIMA model and the ARIMA-PSO-LSTM model
The prediction errors of the two models are compared in Table 1. The MAE, MSE, RMSE, and MAPE values of the ARIMA-PSO-LSTM model are all smaller than those of the ARIMA model, with reductions of 33.4%, 11.2%, 25.09%, and 39.99%, respectively. The ARIMA-PSO-LSTM model established in this paper, with PSO as the basis for optimization, can therefore effectively improve forecast accuracy and adapt better to fluctuations in financial data. In addition, ARIMA-PSO-LSTM does not rely on manual adjustment of parameters, which reduces randomness and makes the selection of parameters more accurate.
Comparison of the prediction errors of the two prediction models
| Model | ARIMA | ARIMA-PSO-LSTM |
|---|---|---|
| MAE | 5.68171 | 3.78381 |
| MSE | 59.06123 | 52.44168 |
| RMSE | 8.04989 | 6.03049 |
| MAPE(%) | 35.05587 | 21.03775 |
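The percentage reductions quoted for Table 1 follow from the relative change (baseline minus combined, divided by baseline). The short check below recomputes them from the table values; the function name `percent_reduction` is illustrative.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of an error measure, in percent of the baseline."""
    return (baseline - improved) / baseline * 100.0


# Values copied from Table 1: (ARIMA, ARIMA-PSO-LSTM).
table_1 = {
    "MAE": (5.68171, 3.78381),
    "MSE": (59.06123, 52.44168),
    "RMSE": (8.04989, 6.03049),
    "MAPE": (35.05587, 21.03775),
}

reductions = {name: percent_reduction(a, b) for name, (a, b) in table_1.items()}
for name, value in reductions.items():
    print(f"{name}: {value:.2f}%")
```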
The ARIMA-PSO-LSTM prediction model proposed in this paper, the single ARIMA model, and the PSO-LSTM model are each simulated and compared using Matlab software. The combined prediction model works similarly to the LSTM neural network model: the collected green financial investment return data are split into a training set and a test set in a ratio of 9:1, and the predicted values are compared with the actual data and with the predictions of the other models.
In this paper, we compare the prediction performance on the green financial investment return series of the LSTM model before and after optimization by the genetic algorithm (GA) and by PSO. The comparison of the predictions before and after optimization is shown in Fig. 9. Zooming in on the prediction results from 0 to 168 h shows that optimizing the parameters of the LSTM neural network with either algorithm further improves the prediction accuracy, but the predictions of the PSO-LSTM model are more accurate than those of the GA-LSTM model and closer to the original data. Compared with setting the model parameters manually, iteratively searching for the optimal hyperparameters of the LSTM neural network with the PSO algorithm therefore improves the prediction accuracy.

Comparison of prediction results before and after PSO optimization
Figure 10 compares the prediction results of the three green financial investment prediction models. As the figure shows, when the green financial investment return is 0, the predictions of the ARIMA model fluctuate up and down, while the predictions of the PSO-LSTM model are more accurate. Among the single models, the PSO-LSTM model therefore has an advantage over the ARIMA model. The predictions of the ARIMA-PSO-LSTM model, however, are closer still to the original data and show the best prediction performance.

Comparison of the three green financial investment prediction models
The results of the complexity assessment of the three prediction models for green finance investment returns are shown in Table 2. Running time and training time are used to assess the complexity of the models. The running times of the three models are similar when predicting data series of the same length, ranging from 0.0621 s to 0.0859 s. The ARIMA model only needs to perform numerical operations on the data and does not need to be trained in advance, so its training time is 0 s. The PSO-LSTM model has to determine the optimal parameters of the LSTM model with the PSO algorithm and train and refine the LSTM neural network before predicting, so it has a training time, and so does the ARIMA-PSO-LSTM model. In summary, the complexity of the ARIMA model is the lowest of the three, followed by the ARIMA-PSO-LSTM model, and that of the PSO-LSTM model is the highest.
Complexity assessment of the green financial investment return prediction models
| Model | Running time (s) | Training time (s) |
|---|---|---|
| ARIMA | 0.0621 | 0 |
| PSO-LSTM | 0.0859 | 115.1623 |
| ARIMA-PSO-LSTM | 0.0669 | 115.1623 |
The prediction accuracy of the three green financial investment models is assessed in Table 3, which compares their RMSE and MAE. The ARIMA model is best suited to stationary, linear data, whereas the collected data contain both linear and nonlinear components. Its prediction accuracy is therefore the lowest of the three models, with RMSE and MAE of 0.11976 and 0.3967, respectively, both higher than those of the other two models. The RMSE and MAE of the ARIMA-PSO-LSTM prediction model are the lowest of the three, at 0.0108 and 0.1201, respectively, which is markedly better than the single ARIMA and PSO-LSTM models.
Forecast accuracy of green financial investment model
| Model | RMSE | MAE |
|---|---|---|
| ARIMA | 0.11976 | 0.3967 |
| PSO-LSTM | 0.0639 | 0.2614 |
| ARIMA-PSO-LSTM | 0.0108 | 0.1201 |
Building on financial time series models, this paper combines the ARIMA model with a long short-term memory (LSTM) network whose parameters are optimized by the particle swarm optimization (PSO) algorithm, and constructs the ARIMA-PSO-LSTM combination model to improve the prediction of green financial investment returns.
Using the daily lowest trading price of a bank, the stationarity test, order determination, model building, and residual test are carried out step by step. Predicting the data of the next 1200 trading days with the ARIMA model gives MSE, MAE, and RMSE values of 0.056572, 0.144935, and 0.237659, respectively. These results indicate a good fit and lay the foundation for building the ARIMA-PSO-LSTM combination model. The prediction process of the combined ARIMA-PSO-LSTM model reduces randomness and makes the selection of parameters more accurate. Its MAE, MSE, RMSE, and MAPE values are 33.4%, 11.2%, 25.09%, and 39.99% lower, respectively, than those of the ARIMA model, which indicates higher prediction accuracy. The simulation experiment compares the prediction accuracy of the ARIMA-PSO-LSTM, ARIMA, and PSO-LSTM models and finds that the predictions of the ARIMA-PSO-LSTM model are closest to the original data, with the lowest prediction error and the highest prediction accuracy.
