Construction of Educational Tourism Seeking Prediction Model in the Context of Smart Tourism
Published online: 29 Sep 2025
Received: 07 Jan 2025
Accepted: 24 Apr 2025
DOI: https://doi.org/10.2478/amns-2025-1094
© 2025 Yan Wang and Lina Fu, published by Sciendo.
This work is licensed under the Creative Commons Attribution 4.0 International License.
China has a population of 1.4 billion. Since the beginning of the twenty-first century, living standards have continued to improve and the tourism market has developed rapidly, but problems have also emerged along the way, such as a shortage of tourism talent [1]. At present, demand in China's tourism talent market generally exceeds supply: the graduates of higher vocational tourism programs are far from sufficient to meet the industry's need for qualified professionals, and the talent supply cannot keep up with the demand of an expanding tourism market [2-4]. As a result, the quality and ability of the staff that tourism enterprises can recruit decline, which lowers service quality in the sector, damages the reputation of tourist attractions, hotels, inns and other tourism service providers, makes them less attractive to customers, and ultimately holds back the tourism economy and reduces its economic benefits [5-8]. Therefore, establishing a model of demand in the tourism talent market, and using it to formulate reasonable talent training programs, sound education strategies and education development plans, is important for the development of both the domestic tourism economy and tourism vocational education [9-10].
The tourism market is also a commodity market, and it fluctuates under the law of value [11]. The value of vocational and technical personnel in tourism management and services is closely tied to the state of the tourism market: when the market is developing well and operating normally, these personnel are in demand and their value is better realized [12-14]. Therefore, higher vocational colleges that specialize in tourism, and those that offer tourism-related disciplines, should train talent in close step with changes in the tourism market [15-17]. To do so, they must monitor the tourism market continuously, use information technology to obtain market information in a timely manner, track the current state of and changes in industrial structure, industry composition, and supply and demand, and forecast how these will develop over a given future period [18-21]. Simply conforming to the current state of the market is not enough: because the training of higher vocational tourism talent lags behind changes in the market, graduates trained for today's market will already be out of date [22-23].
This paper first introduces the working principles of the stacked autoencoder (SAE), the denoising autoencoder (DAE), the variational autoencoder (VAE) and the long short-term memory network (LSTM). The SAE is then used to extract latent high-dimensional features from educational tourism data layer by layer, which compresses the data. The SAE is combined with an LSTM to build the SAE-LSTM model, and an attention mechanism is incorporated to form an attention-based SAE-LSTM model. The model is better able to capture the latent, nonlinear and complex mappings and the long-term dependencies among the modalities of tourism data, which improves the prediction accuracy and fitting performance of the whole model. Finally, the accuracy and stability of the prediction model are verified through processing and empirical analysis of tourism data from Region O.
Auto-Encoder
An Auto-Encoder (AE) [24] is a neural network that learns an efficient representation of sample data by unsupervised learning. For the input data $x$, the autoencoder learns an encoding map to a hidden variable $h$:

$$h = f\left(W^{(1)} x + b^{(1)}\right)$$

The decoding process is expressed as:

$$\hat{x} = g\left(W^{(2)} h + b^{(2)}\right)$$

Where: $W^{(1)}$ and $W^{(2)}$ are the weight matrices of the encoder and decoder, $b^{(1)}$ and $b^{(2)}$ are the corresponding bias vectors, and $f$ and $g$ are activation functions.

The autoencoder is able to transform the input $x$ to the hidden variable $h$ and to reconstruct $\hat{x}$ from it. Its learning objective is the reconstruction error over the $N$ samples:

$$\mathcal{L} = \sum_{n=1}^{N} \left\lVert x^{(n)} - \hat{x}^{(n)} \right\rVert^{2}$$

By minimizing the reconstruction error, the network parameters can be learned efficiently.

Stacked Auto-Encoder
A Stacked Auto-Encoder (SAE) [25] is a deep neural network consisting of multiple autoencoders stacked on top of each other. An SAE has multiple hidden layers, and the input of each hidden layer is a nonlinear mapping of the previous layer, so the deeper the network, the stronger its nonlinear feature extraction and the more complex the hidden features it can learn. To make such deep networks easier to train, SAEs usually use layer-by-layer pre-training to learn the network parameters.

Denoising Auto-Encoder
A Denoising Auto-Encoder (DAE) is an autoencoder that makes the encoding more robust by adding random noise perturbations to the input data. The noise added by a DAE is usually Gaussian noise; alternatively, the values of some dimensions of the input data $x$ are randomly set to 0. The DAE is trained like an ordinary autoencoder, except that noise is injected into the input data $x$ to obtain a corrupted input $\tilde{x}$, and the network learns to reconstruct the original $x$ from $\tilde{x}$.

Variational Auto-Encoder
The main network structure of the variational autoencoder (VAE), an unsupervised learning model, is roughly the same as that of the autoencoder. It improves the generalization ability of the model by separately modeling two complex conditional probability density functions, with the hidden variables constrained to follow a given distribution. The VAE model is shown in Fig. 1. The whole network structure of the VAE consists of an inference network and a generative network. The inference network uses a neural network to estimate the variational distribution $q(z \mid x; \phi)$, and the generative network uses a neural network to estimate the conditional distribution $p(x \mid z; \theta)$.

Figure 1. Variational autoencoder model
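Using the idea of variational inference, the VAE uses $q(z \mid x; \phi)$ to approximate the true posterior $p(z \mid x; \theta)$, which is intractable.

In the VAE, the prior over the hidden variable is the standard normal distribution, $p(z) = \mathcal{N}(0, I)$, and the inference network outputs the mean $\mu$ and variance $\sigma^{2}$ of a Gaussian $q(z \mid x; \phi) = \mathcal{N}\left(\mu, \operatorname{diag}(\sigma^{2})\right)$.

Consider:

$$\log p(x; \theta) = \mathrm{ELBO}(q, x; \theta, \phi) + \mathrm{KL}\left(q(z \mid x; \phi) \,\|\, p(z \mid x; \theta)\right)$$

Because the KL divergence is non-negative, the ELBO is a lower bound on the log-likelihood. So there:

$$\mathrm{ELBO}(q, x; \theta, \phi) = \mathbb{E}_{z \sim q(z \mid x; \phi)}\left[\log p(x \mid z; \theta)\right] - \mathrm{KL}\left(q(z \mid x; \phi) \,\|\, p(z)\right)$$

A reparameterization technique is introduced to turn the non-differentiable sampling step into a differentiable one. It improves the training efficiency of the VAE model by sampling $\varepsilon \sim \mathcal{N}(0, I)$ and setting

$$z = \mu + \sigma \odot \varepsilon$$

so that gradients can be propagated through $\mu$ and $\sigma$.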
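The reparameterization step can be illustrated with a short NumPy sketch. It assumes the usual Gaussian form of the VAE (a diagonal Gaussian posterior parameterized by a mean and a log-variance, and a standard normal prior); the function names `reparameterize` and `kl_to_standard_normal` and the example values are illustrative and do not come from the paper.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I).

    The randomness lives in eps only, so z is a differentiable
    function of mu and log_var.
    """
    eps = rng.standard_normal(mu.shape)
    sigma = np.exp(0.5 * log_var)
    return mu + sigma * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0])        # illustrative posterior mean
log_var = np.array([0.0, -2.0])   # illustrative posterior log-variance

# Draw many samples for the same (mu, log_var) to check their statistics.
z = reparameterize(np.tile(mu, (10000, 1)), np.tile(log_var, (10000, 1)), rng)
print(z.shape)
print(kl_to_standard_normal(mu, log_var))
```

The samples have mean close to `mu` and standard deviation close to `exp(0.5 * log_var)`, which is what sampling directly from the Gaussian would give.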
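The LSTM network introduces an internal state $c_t$ that carries information along the sequence and passes information to the external state $h_t$ of the hidden layer:

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$

$$h_t = o_t \odot \tanh(c_t)$$

Where: $\odot$ is the element-wise product, $c_{t-1}$ is the memory cell at the previous time step, and $\tilde{c}_t$ is the candidate state:

$$\tilde{c}_t = \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right)$$

The gating mechanism introduced by the LSTM network consists of three "gates": the input gate $i_t$, the forget gate $f_t$ and the output gate $o_t$.

The formula for the forget gate is:

$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right)$$

The expression for the input gate is:

$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right)$$

The expression for the output gate is:

$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right)$$

Where: $\sigma(\cdot)$ is the logistic function with output in $(0, 1)$, $x_t$ is the input at the current time step, and $h_{t-1}$ is the external state at the previous time step.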
Figure 2 shows a schematic diagram of the working principle of the LSTM network. Through the LSTM memory unit, the whole network can establish longer-range temporal dependencies.

Figure 2. Schematic diagram of the LSTM network
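One LSTM time step with the three gates described above can be sketched in NumPy as follows. This is a minimal illustration that assumes the standard LSTM formulation; the stacked parameter layout (`W`, `U`, `b`), the feature size `D = 8` and the random weights are assumptions made for the example, while 64 hidden units and a 12-step window are the sizes reported elsewhere in the paper.

```python
import numpy as np

def sigmoid(a):
    """Logistic function, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4H, D), U: (4H, H), b: (4H,), stacked in the order
    input gate, forget gate, output gate, candidate state.
    """
    H = h_prev.shape[0]
    a = W @ x_t + U @ h_prev + b
    i_t = sigmoid(a[0:H])                # input gate
    f_t = sigmoid(a[H:2 * H])            # forget gate
    o_t = sigmoid(a[2 * H:3 * H])        # output gate
    c_tilde = np.tanh(a[3 * H:4 * H])    # candidate state
    c_t = f_t * c_prev + i_t * c_tilde   # new internal state
    h_t = o_t * np.tanh(c_t)             # new external state
    return h_t, c_t

def run_lstm(X, W, U, b):
    """Run the cell over a sequence X of shape (T, D); return the last h."""
    H = U.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    for x_t in X:
        h, c = lstm_step(x_t, h, c, W, U, b)
    return h

rng = np.random.default_rng(0)
D, H, T = 8, 64, 12  # D is hypothetical
W = rng.normal(0.0, 0.1, (4 * H, D))
U = rng.normal(0.0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
h_last = run_lstm(rng.random((T, D)), W, U, b)
print(h_last.shape)  # (64,)
```

Because the forget gate multiplies the previous internal state rather than overwriting it, information can persist across many steps, which is the mechanism behind the longer-range dependencies mentioned above.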
Individual AE training process
The autoencoder is an unsupervised learning algorithm whose target output is the model's own input. It consists mainly of three parts: the input $x$, the hidden representation $h$ and the output $z$ (the reconstruction of $x$). When autoencoders are stacked, the output of one encoder layer is used as the input of the next encoder layer, which increases the depth of the network so that the latent high-dimensional features of the tourism data can be extracted more accurately. Given a set of training samples $\{x_1, x_2, \ldots, x_n\}$, the encoding and decoding of a single AE are:

$$h = f\left(W_1 x + b_1\right), \qquad z = g\left(W_2 h + b_2\right)$$

where $W_1$ and $b_1$ are the weights and bias of the encoder, $W_2$ and $b_2$ are the weights and bias of the decoder, and $f$ and $g$ are activation functions.

To keep the decoded output as consistent as possible with the original input data, the single-layer AE is trained by minimizing the reconstruction error, which is written as:

$$J(W, b) = \frac{1}{n} \sum_{i=1}^{n} \left\lVert x_i - z_i \right\rVert^{2}$$
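Training a single autoencoder by minimizing the reconstruction error can be sketched as below. This is a minimal NumPy illustration and not the authors' Keras implementation: the sigmoid encoder, linear decoder, plain gradient descent, hidden size of 6, learning rate and random data are all assumptions; only the 12-dimensional input follows the paper.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_autoencoder(X, hidden_dim, epochs=200, lr=0.1, seed=0):
    """Fit one autoencoder with full-batch gradient descent on the mean
    squared reconstruction error. Returns the parameters and loss history."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden_dim))
    b1 = np.zeros(hidden_dim)
    W2 = rng.normal(0.0, 0.1, (hidden_dim, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)   # encode: x -> h
        Z = H @ W2 + b2            # decode: h -> z (linear decoder)
        err = Z - X
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        # Backpropagate J = (1/n) * sum_i ||z_i - x_i||^2
        dZ = 2.0 * err / n
        dW2 = H.T @ dZ
        db2 = dZ.sum(axis=0)
        dA = (dZ @ W2.T) * H * (1.0 - H)   # through the sigmoid
        dW1 = X.T @ dA
        db1 = dA.sum(axis=0)
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return (W1, b1, W2, b2), losses

def encode(X, params):
    """Hidden representation h, used as the input of the next stacked AE."""
    W1, b1, _, _ = params
    return sigmoid(X @ W1 + b1)

rng = np.random.default_rng(1)
X = rng.random((80, 12))   # stand-in data, already scaled to [0, 1]
params, losses = train_autoencoder(X, hidden_dim=6)
codes = encode(X, params)
print(codes.shape)         # (80, 6)
```

In a stacked setting, `codes` would be the training input for the next autoencoder, which is the layer-by-layer procedure described in the text.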
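Structural composition of SAE
The SAE used in this paper is formed by stacking n independent autoencoders. Its network structure has three layers in total: one input layer and two hidden layers. The input layer contains 12 neurons, matching the input dimension, and the second and third layers are hidden layers. When the input $x$ passes through the network, the output of the $k$-th hidden layer is:

$$h^{(k)} = f\left(W^{(k)} h^{(k-1)} + b^{(k)}\right), \qquad h^{(0)} = x$$

where $W^{(k)}$ and $b^{(k)}$ are the weights and bias of the $k$-th encoder and $f$ is the activation function.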
The LSTM network used in this paper has two layers. A stacked two-layer structure overcomes a shortcoming of the single-layer LSTM, which can improve accuracy only by enlarging its hidden layer and thereby raises computational complexity. In the model, the first LSTM layer takes the output of the SAE as its input and has 64 nodes; the second LSTM layer takes the output of the first layer as its input and also has 64 nodes. These layers are followed by a dropout layer and a fully connected layer, and the fully connected layer uses the sigmoid activation function.
The SAE-LSTM model [28] constructed in this paper consists of two main parts. The SAE part is composed of three autoencoders connected in series. It extracts latent high-dimensional features from the tourism data and reduces their dimensionality, which compresses the data. The output of the last hidden layer of the SAE is fed directly into the subsequent LSTM tourism prediction model, which simplifies the overall model construction process and reduces the workload of the prediction network. The LSTM part is a stack of two LSTM layers. It captures the time-dependent features in the tourism data, is responsible for predicting tourism volume, and is followed by a fully connected layer that outputs the result.
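Denote the similarity between the Query and the $i$-th Key by $Sim_i$, computed for example as a dot product:

$$Sim_i = Query \cdot Key_i$$

Then the similarities are normalized into weight coefficients: the scores from the first step are transformed numerically with the SoftMax function. Namely:

$$a_i = \mathrm{SoftMax}(Sim_i) = \frac{\exp(Sim_i)}{\sum_{j=1}^{L} \exp(Sim_j)}$$

Finally, the weight coefficients $a_i$ are used to form a weighted sum of the corresponding $Value_i$:

$$\mathrm{Attention}(Query, Source) = \sum_{i=1}^{L} a_i \, Value_i$$

In this way, the attention mechanism queries the sequence and returns the Value entries weighted by their relevance to the Query.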
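The three steps above (similarity, SoftMax normalization, weighted sum) can be sketched in a few lines of NumPy. The dot product is used as the similarity function here as an assumption, since the source does not state which similarity it uses; the function names and the toy vectors are illustrative.

```python
import numpy as np

def softmax(scores):
    """Numerically stable SoftMax over a 1-D score vector."""
    shifted = scores - np.max(scores)
    e = np.exp(shifted)
    return e / e.sum()

def attention(query, keys, values):
    """Step 1: similarity of the query with each key (dot product).
    Step 2: SoftMax normalization into weight coefficients.
    Step 3: weighted sum of the values."""
    sim = keys @ query           # shape (L,)
    weights = softmax(sim)       # shape (L,), sums to 1
    context = weights @ values   # shape (d_v,)
    return context, weights

query = np.array([1.0, 0.0])
keys = np.array([[10.0, 0.0], [0.0, 10.0], [-10.0, 0.0]])
values = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
context, weights = attention(query, keys, values)
print(weights.argmax())  # 0: the first key is the most similar to the query
```

The more similar a key is to the query, the larger its weight, which matches the statement that more important feature information receives a greater weight.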
The attention-based SAE-LSTM model has four layers. The first is the SAE layer, composed of multiple independent autoencoders. The SAE provides an effective feature fusion mechanism for the multidimensional features in tourism data: massive high-dimensional tourism data are fed into the SAE for training, which extracts latent high-dimensional features and reduces dimensionality to compress the data. Next is the LSTM layer, which consists of two linked layers of LSTM units; this overcomes the shortcoming of a single LSTM, which can improve accuracy only by enlarging its hidden layer at the cost of higher computational complexity. The attention layer is a fully connected layer containing the attention mechanism. It computes attention importance scores for the tourism data and its features and assigns different weights to the features in the prediction model: the more important the feature information, the greater the weight. The output layer is the last layer of the model and outputs the prediction results.
The experiments in this paper consist mainly of data processing and analysis, model construction, and evaluation. All experiments were run on an Intel® Core(TM) i7-8700 CPU @ 3.20 GHz (3.19 GHz) under Windows 10. Data processing and analysis were done in Python 3.7 with NumPy, scikit-learn, minepy, pandas, pickle, Matplotlib and other packages. Model building and evaluation were done in Python 3.7 with the Keras deep learning framework and packages such as pandas, NumPy and scikit-learn.
The dataset used in this paper consists of monthly traveler arrivals from January 2015 to August 2023 for two municipalities in Region O, A and B, together with monthly search engine intensity data for factors that influence educational tourism in Region O. The monthly educational tourism arrivals are provided by the local government tourism bureau. This paper collects two types of tourist arrivals from the Driving Start EngineConsulting website: tourist arrivals from the global market and tourist arrivals from China. Fig. 3 shows the monthly tourist arrivals in Region O from the global market. The figure shows that tourist arrivals for educational tourism fluctuate cyclically.

Figure 3. Monthly tourist arrivals from the global market
The experimental data in this paper use search engine intensity series for 250 tourism-related influencing factors in Region O; 42 monthly search engine series are from Baidu and 213 are from Google. The influencing factors related to educational tourism in Region O are extended from influencing factors in seven tourism categories, which are listed in Table 1.
Table 1. Influencing factors of tourism in Region O
Tourism category | Influencing factors |
---|---|
Dining | O province food, O province restaurant, O province food festival |
Lodging | O province hotel, O province accommodation |
Transportation | O province ferry, O province flights |
Tour | O province travel, O province map, O province travel agency, O province tourism |
Clothing | O province weather, O province weather forecast |
Shopping | O province shopping, O province shopping mall |
Recreation | O province bar, O province show, O province night life |
Figure 4 plots the trends of selected influencing factors against tourist arrivals in Region O, where (a) to (c) show O province food, O province hotel and O province ferry, respectively, against the number of arrivals. The plots show a consistent relationship between the influencing factors of educational tourism in Region O and its tourist arrivals: each influencing factor has a time trend and fluctuations similar to those of tourist arrivals, and the seasonality of educational tourism in Region O is visible. The temperature difference between the four seasons in Region O is small, so tourist numbers start to rise in the fall and winter of each year. The similar trends of the influencing factors and tourist arrivals indicate that these factors can reflect the tourism trend.

Figure 4. Trends of selected influencing factors and tourist arrivals in Region O
In summary, the experimental data in this paper consist of the monthly search intensity of 250 educational tourism keywords and two kinds of tourist arrivals in Region O. This time series data is an ordered array of shape 92 × 250.
Missing value processing
In this paper, only the search intensity values of two features, O region tourist attractions and O region food map, have missing values: one for O region tourist attractions and two for O region food map. These have almost no effect on the data, so the three missing values are simply filled with 0.

Feature selection
The experimental data include hundreds of factors. Although the proposed deep learning network does not require feature selection, factors that have very little correlation with educational tourism arrivals in Region O can be removed automatically in the data preprocessing stage. Pearson's correlation coefficient can detect feature correlation, but only for two variables that have a linear relationship. This paper therefore uses the maximum information coefficient (MIC) to detect nonlinear correlation between features. The classical mutual information is defined as:

$$I(X; Y) = \sum_{x} \sum_{y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}$$

The core idea of MIC is that if two attributes are correlated, a grid can be drawn on their scatterplot that partitions the data so as to capture the relationship between the two attributes.

Data normalization
Data normalization scales the data to a fixed interval. This paper uses min-max normalization on the educational tourism data of Region O so that all features are of the same order of magnitude. The conversion function used is:

$$x' = \frac{x - \min}{\max - \min}$$

Min-max normalization maps the sample data into the interval [0, 1], with min and max being the minimum and maximum values of the sample data.
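The zero-filling and min-max steps can be sketched as follows. This is a minimal NumPy illustration with made-up numbers; the helper names are hypothetical, and the guard for constant columns is an addition that the paper does not discuss.

```python
import numpy as np

def fill_missing_with_zero(X):
    """Replace NaN entries with 0, as done for the three missing values."""
    return np.where(np.isnan(X), 0.0, X)

def min_max_fit(X):
    """Per-feature minimum and maximum (columns are features)."""
    return X.min(axis=0), X.max(axis=0)

def min_max_transform(X, x_min, x_max):
    """Scale each feature to [0, 1]; constant columns map to 0."""
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (X - x_min) / span

def min_max_inverse(X_scaled, x_min, x_max):
    """Map scaled values back to the original units."""
    return X_scaled * (x_max - x_min) + x_min

X_raw = np.array([[1.0, 10.0],
                  [3.0, np.nan],
                  [5.0, 40.0]])
X_filled = fill_missing_with_zero(X_raw)
x_min, x_max = min_max_fit(X_filled)
X_scaled = min_max_transform(X_filled, x_min, x_max)
print(X_scaled)
```

The inverse transform is needed to report predictions and error metrics in the original units of tourist arrivals. If the data are split into training and test portions, fitting the minimum and maximum on the training portion only avoids leaking test information; the source does not say which approach it takes.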
Data Conversion
A time series is a sequence of values arranged in chronological order. The educational tourism data used in this paper are collected monthly, and the original data are converted into windowed time series data with a sliding window. Given a time series $T = (t_1, t_2, \ldots, t_N)$ and a window of length 12, the first window is $(t_1, \ldots, t_{12})$, the second is $(t_2, \ldots, t_{13})$, and so on, giving a total of 80 windows of length 12.
The second data transformation step converts the sliding-window time series data of Region O into supervised learning data to facilitate subsequent training of the model. The supervised learning data format consists of inputs (x) and outputs (y), which means that the outputs are predicted from the inputs. In this paper, the actual educational tourism arrivals of the following month in Region O are appended to each window as its output y.
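The two conversion steps (sliding windows, then pairing each window with the next month's arrivals) can be sketched as below. The function name `make_supervised` and the synthetic arrays are illustrative; the sizes (92 months, window length 12, 80 samples) follow the figures given in the text.

```python
import numpy as np

def make_supervised(features, target, window=12):
    """Turn a monthly series into (X, y) pairs.

    features: array of shape (n_months, n_features)
    target:   array of shape (n_months,), e.g. tourist arrivals
    X[i] holds months i .. i+window-1; y[i] is the target in month i+window.
    """
    n = len(target)
    X = np.stack([features[i:i + window] for i in range(n - window)])
    y = target[window:]
    return X, y

n_months, n_features = 92, 3   # 3 features keeps the example small
features = np.arange(n_months * n_features, dtype=float).reshape(n_months, n_features)
target = np.arange(n_months, dtype=float)

X, y = make_supervised(features, target, window=12)
print(X.shape, y.shape)  # (80, 12, 3) (80,)
```

With 92 months and a window of 12 there are 92 - 12 = 80 windows that still have a following month to serve as the label, which matches the count of 80 samples.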
To validate the effectiveness of the SAE feature dimensionality reduction module, an LSTM model without SAE was constructed and compared on the data of the two cities separately. The validation results are shown in Table 2, and the prediction results of each model in the two cities are shown in Fig. 5, where (a) and (b) represent city A and city B. Compared with the LSTM model, the SAE-LSTM model has lower MAE, RMSE and MAPE on both datasets: they decrease by 685.1797, 997.6496 and 6.9165, respectively, on the city A dataset, and by 133.6032, 197.1081 and 6.8158, respectively, on the city B dataset. R2 improves by 0.0765 for city A and by 0.0423 for city B. Overall, the SAE-LSTM model proposed in this paper predicts better than the LSTM model without SAE on both city datasets, which indicates that using the SAE for feature dimensionality reduction improves predictive performance.
Table 2. Effectiveness validation of the SAE model
Data set | Model | MAE | RMSE | MAPE | R2 |
---|---|---|---|---|---|
City A | LSTM | 2435.7387 | 3147.7083 | 11.5892 | 0.9033 |
City A | SAE-LSTM | 1750.559 | 2150.0587 | 4.6727 | 0.9798 |
City B | LSTM | 607.6683 | 785.9161 | 9.8639 | 0.9319 |
City B | SAE-LSTM | 474.0651 | 588.808 | 3.0481 | 0.9742 |

Figure 5. Effectiveness validation of the SAE model: (a) city A, (b) city B
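The four metrics reported in the comparison tables can be computed as sketched below. The paper does not print the formulas, so this assumes the standard definitions, with MAPE expressed as a percentage (which is consistent with the magnitudes in the tables); the function name and sample values are illustrative.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, MAPE (in percent) and R^2 for one set of predictions.

    MAPE requires that y_true contains no zeros.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    mape = 100.0 * np.mean(np.abs(err / y_true))
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}

y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

MAE and RMSE are in the units of tourist arrivals, so their values differ in scale between city A and city B, while MAPE and R2 are scale-free and can be compared across the two cities.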
To verify the effectiveness of the LSTM time series prediction module, the SAE is retained and a commonly used time series prediction model, the BPNN, is built in place of the LSTM for comparison. The performance indexes of each model are shown in Table 3, and the prediction results of each model are shown in Fig. 6, where (a) and (b) represent city A and city B, respectively. On the city A dataset, the SAE-LSTM model proposed in this paper performs best, with MAE, RMSE and MAPE reduced by 639.3335, 794.1965 and 5.5641 compared with the SAE-BPNN model. On the city B dataset, the three metrics decrease by 174.2964, 234.7116 and 3.5582. R2 increases by 0.0406 and 0.0389 for the two cities, respectively. Compared with the BPNN, the LSTM therefore shows higher accuracy, which indicates that using the LSTM improves the accuracy of tourist arrival prediction.
Table 3. Validity of the LSTM model
Data set | Model | MAE | RMSE | MAPE | R2 |
---|---|---|---|---|---|
City A | SAE-BPNN | 2406.7771 | 2955.4755 | 13.2938 | 0.9591 |
City A | SAE-LSTM | 1767.4436 | 2161.279 | 7.7297 | 0.9997 |
City B | SAE-BPNN | 642.8299 | 829.4109 | 14.6134 | 0.9596 |
City B | SAE-LSTM | 468.5335 | 594.6993 | 11.0552 | 0.9985 |

Figure 6. Prediction results of each model: (a) city A, (b) city B
To verify the effectiveness of the model proposed in this paper, a BPNN prediction model is constructed for comparison; it predicts without using the SAE for dimensionality reduction. The performance metrics of each model are shown in Table 4, and the prediction results of the different models are shown in Fig. 7, where (a) and (b) represent city A and city B, respectively. On the city A dataset, the MAE, RMSE and MAPE of the SAE-LSTM model are lower than those of the BPNN model by 950.1945, 1168.4574 and 6.8464, and its R2 is higher by 0.0989. On the city B dataset, the SAE-LSTM model reduces MAE, RMSE and MAPE by 262.4891, 388.6242 and 9.3684 relative to the BPNN model, and its R2 is higher by 0.0586. The SAE-LSTM model proposed in this paper therefore has the best prediction performance.
Table 4. Performance indicators for each model
Data set | Model | MAE | RMSE | MAPE | R2 |
---|---|---|---|---|---|
City A | BPNN | 2697.0785 | 3320.968 | 12.7449 | 0.8985 |
City A | SAE-LSTM | 1746.884 | 2152.5106 | 5.8985 | 0.9974 |
City B | BPNN | 749.9702 | 956.3858 | 19.8099 | 0.9399 |
City B | SAE-LSTM | 487.4811 | 567.7616 | 10.4415 | 0.9985 |

Figure 7. Prediction results of different models: (a) city A, (b) city B
Overall, compared with the prediction models that do not use the SAE for dimensionality reduction, the SAE-LSTM model proposed in this paper shows better prediction performance on the datasets of both cities, so the model is effective for predicting tourist arrivals in the two cities.
With the rapid development of smart tourism, predicting educational tourism demand is of great significance for optimizing resource allocation and enhancing the tourism experience. This paper proposes a hybrid tourism demand prediction model (SAE-LSTM) that combines a stacked autoencoder (SAE) and a long short-term memory network (LSTM) with an embedded attention mechanism, and uses it to forecast educational tourism. The main conclusions are as follows:
This paper analyzes the two main characteristics of the tourist arrival data in Region O, nonlinearity and periodicity. The tourist arrival data are merged with the data on influencing factors, duplicate fields are deleted and missing values are filled, and an SAE-LSTM prediction model based on the stacked autoencoder and the long short-term memory network is then constructed to predict tourist arrivals in the two cities. The effectiveness of the SAE module, the LSTM module and the full SAE-LSTM model is verified separately. An empirical examination of the prediction results shows that the MAE, RMSE and MAPE of the SAE-LSTM model are consistently lower than those of the LSTM, SAE-BPNN and BPNN models, with R2 increases of 0.0389 to 0.0989. Compared with the prediction models that do not use the SAE for dimensionality reduction, the SAE-LSTM model proposed in this paper shows better prediction performance on the datasets of both cities, which shows that the model is effective for predicting educational tourism.