Construction of Educational Tourism Seeking Prediction Model in the Context of Smart Tourism

China has a population of 1.4 billion. Since the beginning of the twenty-first century, living standards have continued to improve and the tourism market has developed rapidly, but problems have also emerged, such as a shortage of tourism talent [1]. At present, China’s tourism talent market is generally undersupplied: the number of graduates from higher vocational tourism programs falls far short of the industry’s need for qualified professionals, and the supply of talent cannot keep up with the demand of an expanding tourism market [2-4]. As a result, the quality and ability of the staff that tourism enterprises can recruit decline, which lowers service quality across the tourism sector, damages the reputation of tourist attractions, hotels, inns and other service providers, makes them less attractive to tourists, and ultimately reduces the economic benefits of tourism [5-8]. It is therefore important, for the development of both the domestic tourism economy and tourism vocational education, to establish a model of tourism talent market demand and to formulate reasonable talent training programs, education strategies and education development plans [9-10].

The tourism market is also a commodity market, and it fluctuates under the law of value [11]. The value of vocational and technical personnel in tourism management and services is closely tied to the state of the tourism market: when the market is developing well and operating normally, these personnel are in demand and their value is better realized [12-14]. Higher vocational colleges specializing in tourism, and those offering tourism-related disciplines, therefore need to follow changes in the tourism market closely when training talent [15-17]. To do so, they must monitor the market continuously, use information technology to obtain timely information on its current state (industrial structure, industry composition, and supply and demand) and on how these are changing, and forecast developments over a given future period [18-21]. Merely conforming to the current state of the market is not enough, because the training of higher vocational tourism talent lags behind changes in the market and would otherwise fall behind the times [22-23].

This paper first introduces the working principles of the stacked auto-encoder (SAE), the denoising auto-encoder, the variational auto-encoder and the long short-term memory (LSTM) network. The SAE is then used to extract latent features from the high-dimensional educational tourism data layer by layer, which compresses the data. Next, the SAE is combined with the LSTM to construct the SAE-LSTM model, and an attention mechanism is incorporated to form the attention-based SAE-LSTM model. The model can better capture the latent, nonlinear and complex mappings and the long-term dependencies among the modalities of tourism data, which improves the prediction accuracy and fitting performance of the whole model. Finally, the accuracy and stability of the proposed prediction model are verified through processing and empirical analysis of tourism data from Region O.

2 Construction of Educational Tourism Prediction Model under the Background of Smart Tourism

2.1 SAE-LSTM prediction modeling

2.1.1 Auto-encoder principle

1) Auto-Encoder

An Auto-Encoder (AE) [24] is a neural network that learns an efficient representation of sample data through unsupervised learning. It learns the mapping f_θ : x → x by using the data x itself as the supervisory signal for training. The encoder learns the mapping g_θ : x → z, which maps a set of D-dimensional data $x^{(n)} \in ℝ^{D}, 1 \leq n \leq N$ into the feature space to obtain an encoding $z^{(n)} \in ℝ^{M}, 1 \leq n \leq N$ for each sample, thereby extracting hidden features that represent the input. The decoder learns the mapping h_θ : z → x, which reconstructs the original data from the extracted hidden features so that the reconstruction is as similar as possible to the input.

For the input data x, the encoding process of the auto-encoder is expressed as: (1) $z = f (w^{(1)} x + b^{(1)})$

The decoding process is expressed as: (2) $\hat{x} = f (w^{(2)} z + b^{(2)})$

Where: x is the input data; z is the hidden variable; $\hat{x}$ is the reconstructed data; w⁽¹⁾ and w⁽²⁾ are the weight matrices; b⁽¹⁾ and b⁽²⁾ are the bias vectors; f(·) is the activation function.

The auto-encoder transforms the input into the hidden variable z and reconstructs it as $\hat{x}$ through the decoder. For a given set of data $x^{(n)} \in ℝ^{D}, 1 \leq n \leq N$, the reconstruction error is: (3) $L = \sum_{n = 1}^{N} {(x^{(n)} - {\hat{x}}^{(n)})}^{2}$
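As a minimal illustration of Eqs. (1)-(3), the following Python sketch computes the encoding, the reconstruction and the summed squared reconstruction error for a tiny auto-encoder. The Sigmoid activation, the layer sizes (D = 4, M = 2), the random weights and the sample values are assumptions chosen for readability; they are not the settings used in the experiments of this paper, and no training step is shown.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(weights, bias, vec):
    """Compute weights @ vec + bias, with weights stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def encode(x, w1, b1):
    """Eq. (1): z = f(w1 x + b1)."""
    return [sigmoid(a) for a in affine(w1, b1, x)]


def decode(z, w2, b2):
    """Eq. (2): x_hat = f(w2 z + b2)."""
    return [sigmoid(a) for a in affine(w2, b2, z)]


def reconstruction_error(samples, w1, b1, w2, b2):
    """Eq. (3): squared reconstruction error summed over all samples."""
    total = 0.0
    for x in samples:
        x_hat = decode(encode(x, w1, b1), w2, b2)
        total += sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))
    return total


# Illustrative sizes only: D = 4 input features, M = 2 hidden units.
rng = random.Random(0)
D, M = 4, 2
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(M)]
b1 = [0.0] * M
w2 = [[rng.uniform(-0.5, 0.5) for _ in range(M)] for _ in range(D)]
b2 = [0.0] * D
samples = [[rng.random() for _ in range(D)] for _ in range(3)]

z = encode(samples[0], w1, b1)
loss = reconstruction_error(samples, w1, b1, w2, b2)
print(len(z), loss)
```

Training would adjust w1, b1, w2 and b2 to reduce `loss`, for example by gradient descent.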

By minimizing the reconstruction error, the network parameters θ = {w⁽¹⁾, b⁽¹⁾, w⁽²⁾, b⁽²⁾} can be learned efficiently.

2) Stacked Auto-Encoder

The Stacked Auto-Encoder (SAE) [25] is a deep neural network consisting of multiple auto-encoders stacked on top of each other. An SAE has multiple hidden layers, and the input of each hidden layer is a nonlinear mapping of the previous layer, so the deeper the network, the stronger its nonlinear feature extraction ability and the more complex the hidden features it can learn. Because deep networks are difficult to train directly, SAEs usually use layer-by-layer pre-training to learn the network parameters.

3) Denoising Auto-Encoder

The Denoising Auto-Encoder (DAE) is an auto-encoder that increases the robustness of the encoding by adding random noise perturbations to the input data. The noise is usually Gaussian noise, or masking noise in which some dimensions of the input data x are randomly set to 0 according to a corruption ratio μ, which usually does not exceed 0.5.

The DAE is trained like an auto-encoder, except that noise is injected into the input data x to obtain the corrupted data $\tilde{x}$. The encoder maps $\tilde{x}$ to the low-dimensional hidden variable z, and the decoder reconstructs from z an estimate $\hat{x}$ of the original uncorrupted data. Finally, the reconstruction error $L (x, \hat{x})$ is minimized, so that the output of the decoder approximately recovers the original input.
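The two corruption schemes described above can be sketched as follows. The function names, the sample vector and the corruption ratio of 0.3 are hypothetical; the key point shown is that the loss compares the reconstruction with the clean input x rather than with the corrupted input. A trained encoder and decoder are not included, so the corrupted vector itself stands in for the reconstruction.

```python
import random


def mask_corrupt(x, ratio, rng):
    """Set each component of x to 0 independently with probability `ratio`."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    return [0.0 if rng.random() < ratio else v for v in x]


def gaussian_corrupt(x, std, rng):
    """Add zero-mean Gaussian noise with standard deviation `std`."""
    return [v + rng.gauss(0.0, std) for v in x]


def denoising_loss(x_clean, x_hat):
    """Squared error measured against the clean input, not the corrupted one."""
    return sum((a - b) ** 2 for a, b in zip(x_clean, x_hat))


rng = random.Random(42)
x = [0.8, 0.1, 0.5, 0.9, 0.3, 0.7]          # made-up input vector
x_tilde = mask_corrupt(x, ratio=0.3, rng=rng)
# A trained DAE would map x_tilde -> z -> x_hat; x_tilde stands in for x_hat here.
x_hat = x_tilde
print(x_tilde, denoising_loss(x, x_hat))
```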

4) Variational Auto-Encoder

The variational auto-encoder (VAE) is an unsupervised learning model whose main network structure is roughly the same as that of the auto-encoder. It improves the generalization ability of the model by using two networks to model two complex conditional probability density functions, with the hidden variables constrained to follow a given distribution. The VAE model is shown in Fig. 1. The whole network consists of an inference network and a generative network. The inference network estimates the variational distribution q_ϕ(z ∣ x): its input is x and its output is q_ϕ(z ∣ x). The generative network estimates the probability distribution p_θ(x ∣ z): its input is the hidden variable z and its output is p_θ(x ∣ z).

The idea of the VAE is to approximate the true posterior p_θ(z ∣ x) by the variational distribution q_ϕ(z ∣ x), which requires computing and minimizing the KL divergence $D_{K L}$ between q_ϕ(z ∣ x) and p_θ(z ∣ x). $D_{K L}$ measures the distance between two distributions q and p, and here it satisfies: (4) $D_{K L} [q_{ϕ} (z | x), p_{θ} (z | x)] = - L (ϕ, θ) + \log p (x)$

In the VAE, q_ϕ(z ∣ x) and p_θ(x ∣ z) are usually Gaussian distributions, and the prior over the hidden variable is denoted p(z). In the prediction model, q_ϕ(z ∣ x) is a normal distribution N(μ, σ²) and the prior p(z) is the standard normal distribution $N (0, 1)$, so the KL divergence between them can be expressed in closed form as: (5) $D_{K L} [q_{ϕ} (z | x), p (z)] = - \log σ + 0.5 σ^{2} + 0.5 μ^{2} - 0.5$

Because the KL divergence in Eq. (4) is non-negative: (6) $D_{K L} [q_{ϕ} (z | x), p_{θ} (z | x)] \geq 0$

it follows that: (7) $L (ϕ, θ) \leq \log p (x)$

$L (ϕ, θ)$ is therefore a lower bound on log p(x), and maximizing it is the training objective of the VAE network model. The objective consists of two components: one is the KL divergence $D_{K L} [q_{ϕ} (z | x), p (z)]$, which measures how close q_ϕ(z ∣ x) is to the prior p(z); the other is the reconstruction term $E_{z \sim q} [\log p_{θ} (x | z)]$, which measures the difference between the reconstructed data and the input data. Weights need to be assigned to the two components during model training: if the KL term is weighted too heavily, the hidden variables are forced very close to the standard normal distribution, and if the reconstruction term is weighted too heavily, the reconstructed data become very similar to the original data but the hidden variables drift too far from the expected normal distribution. The lower bound is written as: (8) $L (ϕ, θ) = - D_{K L} [q_{ϕ} (z | x), p (z)] + E_{z \sim q} [\log p_{θ} (x | z)]$
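The closed form in Eq. (5) can be checked numerically. The sketch below evaluates the expression for one hidden dimension and compares it with a direct midpoint-rule integration of q(z) log(q(z)/p(z)), where q = N(μ, σ²) and p = N(0, 1). The values μ = 1.0 and σ = 0.5, the integration range of ±10σ and the number of steps are assumptions chosen only for the check.

```python
import math


def kl_to_standard_normal(mu, sigma):
    """Eq. (5): KL divergence from N(mu, sigma^2) to N(0, 1), per dimension."""
    return -math.log(sigma) + 0.5 * sigma ** 2 + 0.5 * mu ** 2 - 0.5


def log_normal_pdf(z, mu, sigma):
    return (-0.5 * ((z - mu) / sigma) ** 2
            - math.log(sigma) - 0.5 * math.log(2.0 * math.pi))


def kl_numeric(mu, sigma, steps=20000):
    """Midpoint-rule estimate of the integral of q(z) * log(q(z) / p(z))."""
    lo, hi = mu - 10.0 * sigma, mu + 10.0 * sigma
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        z = lo + (k + 0.5) * h
        log_q = log_normal_pdf(z, mu, sigma)
        log_p = log_normal_pdf(z, 0.0, 1.0)
        total += math.exp(log_q) * (log_q - log_p) * h
    return total


closed_form = kl_to_standard_normal(1.0, 0.5)
numeric = kl_numeric(1.0, 0.5)
print(closed_form, numeric)
```

The divergence is zero when μ = 0 and σ = 1, that is, when the variational distribution already equals the standard normal.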

A reparameterization technique is introduced so that the non-differentiable sampling step becomes differentiable. The hidden variable z is sampled as z = μ + σ⊙ε, where ε is sampled from the standard normal distribution $N (0, 1)$ and μ and σ are generated by the encoder. Because the randomness is isolated in ε, gradients can propagate through μ and σ, which improves the training efficiency of the VAE model.
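A minimal sketch of the sampling rule z = μ + σ⊙ε is given below. The values of μ and σ are hypothetical encoder outputs, and the empirical mean over many draws is printed only to show that the samples are centred on μ.

```python
import random


def reparameterize(mu, sigma, rng):
    """Sample z = mu + sigma * eps element-wise, with eps ~ N(0, 1)."""
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + s * e for m, s, e in zip(mu, sigma, eps)]


rng = random.Random(0)
mu, sigma = [1.0, -2.0], [0.5, 0.1]     # hypothetical encoder outputs
draws = [reparameterize(mu, sigma, rng) for _ in range(20000)]
mean_z = [sum(d[k] for d in draws) / len(draws) for k in range(len(mu))]
print(mean_z)
```

Since z is a deterministic function of μ and σ once ε is drawn, an automatic differentiation framework can compute the gradient of the loss with respect to both.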

2.1.2 Principles of LSTM networks

The internal state c_t and external state h_t of the Long Short-Term Memory (LSTM) network [26-27] are computed as: (9) ${\tilde{c}}_{t} = \tanh (W_{c} x_{t} + U_{c} h_{t - 1} + b_{c})$ (10) $c_{t} = f_{t} ⊙ c_{t - 1} + i_{t} ⊙ {\tilde{c}}_{t}$ (11) $h_{t} = o_{t} ⊙ \tanh (c_{t})$

Where: h_t is the external state of the memory cell; f_t, i_t and o_t are the three gates that control the path of information transfer; ⊙ is the element-wise product of vectors; c_t−1 is the memory cell state at the previous moment; ${\tilde{c}}_{t}$ is the candidate state obtained through the nonlinear function.

The gating mechanism of the LSTM network consists of three “gates”: the input gate i_t, the forget gate f_t and the output gate o_t.

The expression for the forget gate is: (12) $f_{t} = σ (W_{f} x_{t} + U_{f} h_{t - 1} + b_{f})$

The expression for the input gate is: (13) $i_{t} = σ (W_{i} x_{t} + U_{i} h_{t - 1} + b_{i})$

The expression for the output gate is: (14) $o_{t} = σ (W_{o} x_{t} + U_{o} h_{t - 1} + b_{o})$

Where: σ(·) is the Sigmoid function; x_t is the input at the current moment; h_t−1 is the external state at the previous moment; W, U and b are the learnable network parameters.

Figure 2 shows a schematic diagram of the working principle of the LSTM network. Through the LSTM memory unit, the network can establish long-distance temporal dependencies. At moment t, the input of the LSTM memory unit consists of the input value x_t of the current moment, the output value h_t−1 of the previous moment and the unit state c_t−1 of the previous moment, and its output consists of h_t and the unit state c_t of the current moment. The unit state c_t captures the key information of the current moment and can retain it for a long period of time, which gives the LSTM network a strong memory for temporal order.
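The update in Eqs. (9)-(14) can be traced step by step with the scalar sketch below, in which the input, the external state and the internal state each have a single dimension so that every gate is one number. The parameter dictionary and its values are hypothetical: the forget gate is saturated near 1 and the input gate near 0, which is chosen to show that the unit state is then carried through the sequence almost unchanged.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step for scalar input and state; params[gate] = (W, U, b)."""
    def pre(gate):
        w, u, b = params[gate]
        return w * x_t + u * h_prev + b

    f_t = sigmoid(pre("f"))                # Eq. (12), forget gate
    i_t = sigmoid(pre("i"))                # Eq. (13), input gate
    o_t = sigmoid(pre("o"))                # Eq. (14), output gate
    c_tilde = math.tanh(pre("c"))          # Eq. (9), candidate state
    c_t = f_t * c_prev + i_t * c_tilde     # Eq. (10), internal state
    h_t = o_t * math.tanh(c_t)             # Eq. (11), external state
    return h_t, c_t


def run_lstm(sequence, params, h0=0.0, c0=0.0):
    """Apply lstm_step along a sequence and return the final (h, c)."""
    h, c = h0, c0
    for x_t in sequence:
        h, c = lstm_step(x_t, h, c, params)
    return h, c


# Hypothetical parameters: forget gate held open, input gate held shut.
remember = {"f": (0.0, 0.0, 50.0), "i": (0.0, 0.0, -50.0),
            "o": (0.0, 0.0, 50.0), "c": (1.0, 1.0, 0.0)}
h_T, c_T = run_lstm([0.3, -0.8, 0.5, 0.9], remember, c0=0.7)
print(h_T, c_T)
```

In a trained network the gate values depend on x_t and h_t−1, so the cell decides at each step how much of c_t−1 to keep and how much of the candidate state to write.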

2.2 SAE-LSTM model construction for educational tourism demand

2.2.1 SAE layer

1) Individual AE training process

The auto-encoder (AE) is an unsupervised learning algorithm whose target output is its own input. It consists of three layers: the input layer x, the hidden layer h and the output (reconstruction) layer z.

When auto-encoders are stacked, the output of one encoder is used as the input of the next, which increases the depth of the network so that latent features of the high-dimensional tourism data can be extracted more accurately. For a single AE, when a set of training samples {x¹, x², x³…}, xⁱ ∈ Rⁿ is input, the auto-encoder first encodes the input xⁱ into the hidden layer, denoted h(xⁱ), and then decodes and reconstructs the output of the hidden layer, denoted z(xⁱ). The encoding and decoding process is as follows: (15) $h (x) = f (w_{1} x + b)$ (16) $z (x) = f (w_{2} h (x) + d)$

where w₁ and w₂ are the weight matrices of the encoding and decoding processes, respectively, b and d are the bias vectors of the encoding and decoding processes, respectively, and f(·) is the nonlinear activation function used in both.

To make the reconstruction produced by the auto-encoder as consistent as possible with the original input data, the reconstruction error, written as $L (X, Z)$, is minimized when training the single-layer AE. The model parameter θ is obtained as follows: (17) $θ = \arg\min_{θ} \frac{1}{2} \sum_{i = 1}^{N} {‖ x^{i} - z (x^{i}) ‖}^{2}$

2) Structural composition of SAE

The SAE used in this paper is formed by combining n independent auto-encoders. Its network structure is divided into three layers, one input layer and two hidden layers: the input layer contains 12 neurons, matching the input dimension, and the second and third layers are hidden layers.

The auto-encoders are trained sequentially. The input x is first fed into the first auto-encoder, which is trained by encoding and decoding to obtain the encoding h₁. This encoding is then used as the input of the second auto-encoder, and the encoding of the second is used as the input of the third. After all three have been trained, the decoders are removed, and the encoder weights of each layer are saved and then loaded into the three-layer network of the SAE model as its initial weights: the weights of the first layer correspond to hidden1, those of the second layer to hidden2, and those of the third layer to hidden3. The output of the last hidden layer captures the high-dimensional feature information in the original tourism data, which achieves feature fusion and information compression. The number of nodes is set to 400, the activation function of all three layers is the Sigmoid function, and the loss function is MSE. The training process is expressed as follows: (18) $h_{1} = f (W_{1} x + b_{1})$ (19) $h_{2} = f (W_{2} h_{1} + b_{2})$ (20) $h_{3} = f (W_{3} h_{2} + b_{3})$

where h_m (m = 1, 2, 3) is the encoding produced by the m-th auto-encoder of the SAE model, W_m and b_m are the weight matrix and bias term of the m-th encoder, respectively, and f(·) is the activation function of the auto-encoders, which is the Sigmoid function in every layer.
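The chained encoding in Eqs. (18)-(20), with the decoders already discarded, can be sketched as follows. The layer widths (12, 8, 6, 4) are illustrative and much smaller than the 400 nodes stated above, and the randomly initialized weights are stand-ins for the weights that layer-by-layer pre-training would supply; the pre-training itself is not shown.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_layer(n_in, n_out, rng):
    """Random (W, b) standing in for weights that pre-training would supply."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def encode_layer(layer, vec):
    """h = f(W vec + b) with a Sigmoid activation."""
    weights, bias = layer
    return [sigmoid(sum(w * v for w, v in zip(row, vec)) + b)
            for row, b in zip(weights, bias)]


def sae_encode(x, layers):
    """Eqs. (18)-(20): feed x through the stacked encoders, keeping every h_m."""
    codes, h = [], x
    for layer in layers:
        h = encode_layer(layer, h)
        codes.append(h)
    return codes


rng = random.Random(0)
sizes = [12, 8, 6, 4]          # illustrative widths, not the paper's setting
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(sizes, sizes[1:])]
x = [rng.random() for _ in range(sizes[0])]
h1, h2, h3 = sae_encode(x, layers)
print(len(h1), len(h2), len(h3))
```

Only h3, the output of the last hidden layer, would be passed on to the prediction network.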

2.2.2 LSTM layer

The LSTM neural network used in this paper has two layers. The two-layer stacked structure overcomes the shortcoming of a single-layer LSTM network, which can raise accuracy only by enlarging its hidden layer and thus its computational complexity. In the model, the input of the first layer is the output of the SAE, and the number of nodes is 64. The second layer is the output layer of the LSTM; the output of the first layer is used as its input, and its number of nodes is also 64. It is followed by a dropout layer and a fully-connected layer, and the fully-connected layer uses the Sigmoid activation function.

2.2.3 Constructing the SAE-LSTM prediction model

The SAE-LSTM model [28] constructed in this paper consists of two main parts. The SAE layer is composed of three auto-encoders connected in series. It extracts latent features from the high-dimensional tourism data and reduces dimensionality to compress the data, and the output of its last hidden layer is used directly as the input of the subsequent LSTM tourism prediction model. This simplifies the overall model construction process and reduces the workload of the prediction network in the later stage. The LSTM layer is a stack of two LSTM layers, which captures the time-dependent features in the tourism data and is responsible for predicting the tourism volume. It is finally connected to a fully-connected layer that outputs the data.

2.2.4 Attention mechanism layer

In the first step, the similarity S between the Query and each Key_i is computed. Different similarity functions can be used, and the range of the resulting scores depends on the choice, namely the dot product, the cosine similarity or a multilayer perceptron (MLP): (21) $S (\mathrm{Query}, \mathrm{Key}_{i}) = \mathrm{Query} \cdot \mathrm{Key}_{i}$ (22) $S (\mathrm{Query}, \mathrm{Key}_{i}) = \frac{\mathrm{Query} \cdot \mathrm{Key}_{i}}{∥ \mathrm{Query} ∥ \cdot ∥ \mathrm{Key}_{i} ∥}$ (23) $S (\mathrm{Query}, \mathrm{Key}_{i}) = \mathrm{MLP} (\mathrm{Query}, \mathrm{Key}_{i})$

In the second step, the similarity scores from the first step are normalized with the SoftMax function to obtain the corresponding weight coefficients: (24) $b_{i} = \frac{\exp (S (\mathrm{Query}, \mathrm{Key}_{i}))}{\sum_{j = 1}^{L_{x}} \exp (S (\mathrm{Query}, \mathrm{Key}_{j}))}$

Finally, the weight coefficients b_i obtained in the second step are used to compute a weighted sum of the corresponding Value_i, which gives the Attention value: (25) $\mathrm{Attention} (\mathrm{Query}, \mathrm{Source}) = \sum_{i = 1}^{L_{x}} b_{i} \cdot \mathrm{Value}_{i}$

where L_x is the number of Key-Value pairs in the Source. The attention mechanism can thus be viewed as querying the Source by means of the Query: each element of the sequence is stored as a corresponding Key and Value pair, and when the Query matches a Key, the corresponding Value is output.
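The three steps above (similarity, SoftMax normalization, weighted sum) can be sketched as follows for the dot-product and cosine similarities of Eqs. (21) and (22). The query, keys and values are made-up numbers, and the MLP similarity of Eq. (23) is omitted because it would require trained weights.

```python
import math


def dot(a, b):
    """Eq. (21): dot-product similarity."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Eq. (22): cosine similarity."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def softmax(scores):
    """Eq. (24), computed stably by subtracting the maximum score."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values, similarity=dot):
    """Eq. (25): weighted sum of the values, weights from query-key similarity."""
    weights = softmax([similarity(query, key) for key in keys])
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values))
              for d in range(dim)]
    return output, weights


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]
output, weights = attention(query, keys, values)
print(output, weights)
```

The key that is most similar to the query receives the largest weight, so its value dominates the output; identical keys would give uniform weights and the output would be the plain average of the values.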

2.2.5 SAE-LSTM model construction process

1) The SAE layer is composed of multiple independent auto-encoders. It provides an effective feature fusion mechanism for the multidimensional features in tourism data: the massive high-dimensional tourism data are input into the SAE for training, which extracts latent features and reduces dimensionality, thereby compressing the data.

2) The LSTM layer consists of two layers of LSTM units linked together. This overcomes the shortcoming of a single LSTM, which can raise the accuracy of the network only at the cost of higher computational complexity.

3) The attention layer is composed of a fully connected layer containing the attention mechanism. It computes the attention importance scores of the tourism data and its features and assigns different weights to the features in the prediction model: the more important the feature information, the greater its weight.

4) The output layer is the last layer of the model and outputs the prediction results.

3 Analysis of predicted results of educational tourism in the context of smart tourism

3.1 Educational Tourism Data Sources and Processing

3.1.1 Introduction to the experimental environment

The experimental steps in this paper are data processing and analysis, and model construction and evaluation. All experiments were run on a Windows 10 system with an Intel® Core(TM) i7-8700 CPU @ 3.20GHz 3.19GHz. Data processing and analysis were done in Python 3.7 with packages such as NumPy, scikit-learn, minepy, pandas, pickle and Matplotlib. Model building and evaluation were done in Python 3.7 with the Keras deep learning framework and packages such as pandas, NumPy and scikit-learn.

3.1.2 Data sources

The dataset used in this paper consists of the monthly tourist arrivals from January 2015 to August 2023 for two municipalities in Region O, A and B, together with monthly search engine intensity data for tourism-related influencing factors in Region O. The monthly educational tourism arrivals are provided by the local government tourism bureau. This paper collects two types of tourist arrivals from the Driving Start EngineConsulting website: arrivals from the global market and arrivals from China. Fig. 3 shows the monthly tourist arrivals in Region O from the global market. The figure shows clearly that educational tourism arrivals fluctuate cyclically.

The experimental data comprise 250 search engine intensity series for tourism-related influencing factors in Region O; 42 monthly series are from Baidu and 213 monthly series are from Google. The influencing factors used in the experiment are keyword extensions of seven tourism categories, which are listed in Table 1.

Table 1. Influencing factors of tourism in Region O

Tourism category	Influencing factor
Dining	O province food, O province restaurant, O province food festival
Lodging	O province hotel, O province accommodation
Transportation	O province ferry, O province flights
Tour	O province travel, O province map, O province travel agency, O province tourism
Clothing	O province weather, O province weather forecast
Shopping	O province shopping, O province shopping mall
Recreation	O province bar, O province show, O province night life

Figure 4 plots the trends of selected influencing factors against tourist arrivals in Region O, where (a) to (c) show O province food, O province hotel and O province ferry, respectively, against the number of arrivals. The plots show that each influencing factor and the tourist arrivals have similar time trends and fluctuations, and they also reveal the seasonal nature of educational tourism in Region O. The temperature difference between the four seasons in Region O is small, so tourist numbers start to increase in the fall and winter of each year. The similarity between the trends of the influencing factors and of tourist arrivals indicates that these factors can reflect the tourism trend.

In summary, the experimental data consist of the monthly search intensity of 250 educational tourism keywords and two kinds of tourist arrivals in Region O. The search intensity time series is an ordered array of size 92 × 250.

3.1.3 Data processing

1) Missing value processing

In this paper, only two features have missing search intensity values: O region tourist attractions, with one missing value, and O region food map, with two. These have almost no effect on the data, so the three missing values are simply filled with 0.

2) Feature selection

The experimental data in this paper include hundreds of factors. Although the proposed deep learning network does not require feature selection, factors that have very little correlation with educational tourism arrivals in Region O can be removed automatically in the data preprocessing stage. Pearson’s correlation coefficient can detect feature correlation, but only between two variables that have a linear relationship. This paper therefore uses the maximum information coefficient (MIC) to detect nonlinear correlation between features.

The classical mutual information is defined as: (26) $I (X; Y) = \sum_{x \in X} \sum_{y \in Y} p (x, y) \log \frac{p (x, y)}{p (x) p (y)}$

Here X and Y are sets, and x and y are the discrete values they take. The maximum information coefficient searches for the optimal discretization and normalizes the mutual information to the interval [0, 1]. The MIC used in this paper is calculated as: (27) $\mathrm{MIC} (X; Y) = \max_{| X | | Y | < B} \frac{\max (I (X; Y))}{\log_{2} (\min (| X |, | Y |))}$
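To make Eqs. (26) and (27) concrete, the sketch below computes the mutual information of a discretized scatterplot and normalizes it by log₂ min(|X|, |Y|). It is a simplified stand-in rather than the full MIC algorithm: only equal-width grids are tried, whereas MIC (as implemented, for example, in the minepy package used in this paper) also optimizes where the grid lines are placed. The logarithm is taken to base 2 so that the normalized value lies in [0, 1], and the grid budget B = n^0.6 is an assumption taken from the original definition of MIC.

```python
import math


def mutual_information(joint):
    """Eq. (26) in bits for a joint probability table joint[i][j]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi


def bin_index(v, lo, hi, n_bins):
    """Index of the equal-width bin of [lo, hi] that contains v."""
    if hi == lo:
        return 0
    return min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)


def grid_joint(xs, ys, nx, ny):
    """Joint probabilities of (x, y) on an equal-width nx-by-ny grid."""
    counts = [[0] * ny for _ in range(nx)]
    x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)
    for x, y in zip(xs, ys):
        counts[bin_index(x, x_lo, x_hi, nx)][bin_index(y, y_lo, y_hi, ny)] += 1
    return [[c / len(xs) for c in row] for row in counts]


def approx_mic(xs, ys):
    """Largest normalized grid MI over equal-width grids with nx * ny < B."""
    budget = len(xs) ** 0.6          # assumption: B = n ** 0.6
    best = 0.0
    for nx in range(2, int(budget) + 1):
        for ny in range(2, int(budget) + 1):
            if nx * ny < budget:
                mi = mutual_information(grid_joint(xs, ys, nx, ny))
                best = max(best, mi / math.log2(min(nx, ny)))
    return best


xs = list(range(50))
linear = [2 * x + 1 for x in xs]     # made-up, perfectly dependent data
print(approx_mic(xs, linear))
```

A feature whose score against the arrival series is close to 0 would be a candidate for removal in preprocessing.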

The core idea of MIC is that if two attributes are correlated, a grid can be drawn on their scatterplot that partitions the data so as to capture the correlation. |X| is the number of segments into which the horizontal axis of the scatterplot is divided, |Y| is the number of segments into which the vertical axis is divided, and B, which is usually set to the 0.6 power of the total number of data points, limits the resolution of the grid to |X|*|Y| < B.

3) Data normalization

Data normalization scales the data to a given interval. This paper uses min-max normalization on the educational tourism data of Region O so that all features are on the same order of magnitude. The conversion function is: (28) $x^{*} = \frac{x - \min}{\max - \min}$
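A minimal sketch of Eq. (28), applied independently to each feature column, is shown below; min and max are taken over the samples of that feature. The sample values are made up, and mapping a constant feature to 0 is an assumption added to avoid division by zero.

```python
def min_max_normalize(values):
    """Eq. (28): x* = (x - min) / (max - min) for one feature."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalize_columns(rows):
    """Normalize every feature (column) of a row-major table independently."""
    columns = [min_max_normalize(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


months = [[120.0, 3.0], [80.0, 9.0], [100.0, 6.0]]   # made-up values
print(normalize_columns(months))
```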

Min-max normalization maps the sample data into the interval [0, 1], where min and max are the minimum and maximum values of the sample data.

4) Data Conversion

A time series is a sequence of numbers arranged in chronological order. The educational tourism data used in this paper are collected monthly and are converted into windowed time series data with a sliding window. Given a time series T = (x₁, x₂, …, x_n) and a window of length 12, n is 92, representing the 92 months of data collected from January 2015 to August 2023, and each x_i is a 250-dimensional vector representing the monthly search intensity of 250 travel-related features. The data are shifted with the shift function of pandas: the time window is first placed at the start of T to obtain the first segment of length 12, and is then moved forward by one month so that it starts at the second month of T, which gives a second segment of length 12.

Continuing in this way yields a total of 80 segments of length 12, c₁, c₂, …, c₈₀, where c₁ = (x₁, x₂, …, x₁₂), c₂ = (x₂, x₃, …, x₁₃). The converted data are given by: (29) $W_{(c)} = \{ c_{i} \mid i = 1, 2, \ldots, 80 \}$

The second data transformation step converts the sliding-window time series data of Region O into supervised learning data to facilitate subsequent model training. Supervised learning data consist of inputs (x) and outputs (y), where the outputs are predicted from the inputs. In this paper, the actual educational tourism arrivals in Region O for the month following each window are attached to the corresponding c_i as its output.
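The two conversion steps can be sketched together as follows. With n = 92 months and a window of length 12, pairing every window with the arrivals of the following month leaves 92 − 12 = 80 samples, which matches the count given above. The sketch uses plain list slicing in place of the pandas shift function, and the feature and arrival values are synthetic placeholders.

```python
def make_supervised(features, target, window=12):
    """Pair each length-`window` segment c_i with the next month's target."""
    samples = []
    for start in range(len(features) - window):
        c_i = features[start:start + window]      # months start .. start+window-1
        y_i = target[start + window]              # the month right after the window
        samples.append((c_i, y_i))
    return samples


# Synthetic placeholders: 92 months, 250 features per month.
n_months, n_features = 92, 250
features = [[float(m)] * n_features for m in range(n_months)]
arrivals = [1000.0 + m for m in range(n_months)]

samples = make_supervised(features, arrivals, window=12)
print(len(samples), len(samples[0][0]), len(samples[0][0][0]))
```

Each sample then has an input of shape 12 × 250 and a scalar output.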

3.2 Results of Educational Tourism Data Demand Forecasts

3.2.1 Validation of the validity of the SAE model

To validate the effectiveness of the SAE feature dimensionality reduction module, an LSTM model without SAE was constructed and compared on the data of the two cities. The validation results are shown in Table 2, and the prediction results of each model in the two cities are shown in Fig. 5, in which (a) and (b) represent city A and city B. Compared with the LSTM model, the SAE-LSTM model has lower MAE, RMSE and MAPE on both datasets: they decrease by 685.1797, 997.6496 and 6.9165, respectively, on the city A dataset, and by 133.6032, 197.1081 and 6.8158, respectively, on the city B dataset. R² improves by 0.0765 for city A and by 0.0423 for city B. Overall, the SAE-LSTM model proposed in this paper predicts better than the LSTM model without SAE on both city datasets, which indicates that using the SAE for feature dimensionality reduction improves the predictive performance of the model.

Table 2. Effectiveness validation of the SAE model

Data set	Model	MAE	RMSE	MAPE	R²
City A	LSTM	2435.7387	3147.7083	11.5892	0.9033
City A	SAE-LSTM	1750.559	2150.0587	4.6727	0.9798
City B	LSTM	607.6683	785.9161	9.8639	0.9319
City B	SAE-LSTM	474.0651	588.808	3.0481	0.9742
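The four evaluation metrics reported in the tables are not defined in the text. The sketch below assumes their standard definitions, with MAPE expressed as a percentage, and evaluates them on made-up actual and predicted values; it is not the evaluation code used for Table 2.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no zero actuals)."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [100.0, 200.0, 300.0, 400.0]        # made-up arrivals
predicted = [110.0, 190.0, 310.0, 390.0]     # made-up predictions
print(mae(actual, predicted), rmse(actual, predicted),
      mape(actual, predicted), r2(actual, predicted))
```

Lower MAE, RMSE and MAPE and a higher R² indicate a better fit, which is the direction of comparison used throughout this section.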

3.2.2 Validation of the validity of the LSTM model

To verify the effectiveness of the LSTM time series prediction module, the SAE is retained and the LSTM is replaced by a commonly used prediction model for comparison; here a back-propagation neural network (BPNN) is used. The performance indexes of each model are shown in Table 3, and the prediction results of each model are shown in Fig. 6, where (a) and (b) represent city A and city B, respectively. On the city A dataset, the SAE-LSTM model proposed in this paper performs best, with its MAE, RMSE and MAPE reduced by 639.3335, 794.1965 and 5.5641 compared with the SAE-BPNN model. On the city B dataset, the three evaluation metrics decrease by 174.2964, 234.7116 and 3.5582. R² increases by 0.0406 and 0.0389 for the two cities, respectively. Compared with the BPNN-based predictor, the LSTM model is therefore more accurate, which shows that using the LSTM model improves the accuracy of tourist arrival prediction.

Table 3. Validity of the LSTM model

Data set	Model	MAE	RMSE	MAPE	R²
City A	SAE-BPNN	2406.7771	2955.4755	13.2938	0.9591
City A	SAE-LSTM	1767.4436	2161.279	7.7297	0.9997
City B	SAE-BPNN	642.8299	829.4109	14.6134	0.9596
City B	SAE-LSTM	468.5335	594.6993	11.0552	0.9985

3.2.3 Validation of the validity of the SAE-LSTM model

To verify the effectiveness of the complete model proposed in this paper, a BPNN prediction model that does not use SAE for data dimensionality reduction is constructed for comparison. The performance metrics of each model are shown in Table 4, and the prediction results of the different models are shown in Fig. 7, where (a) and (b) represent city A and city B, respectively. On the city A dataset, the MAE, RMSE and MAPE of the SAE-LSTM model are lower than those of the BPNN model by 950.1945, 1168.4574 and 6.8464, and its R² is higher by 0.0989. On the city B dataset, the SAE-LSTM model reduces MAE, RMSE and MAPE by 262.4891, 388.6242 and 9.3684 relative to the BPNN model, and its R² is higher by 0.0586. The SAE-LSTM model proposed in this paper therefore has the best prediction performance of the models compared.

Table 4. Performance indicators for each model

Data set	Model	MAE	RMSE	MAPE	R²
City A	BPNN	2697.0785	3320.968	12.7449	0.8985
City A	SAE-LSTM	1746.884	2152.5106	5.8985	0.9974
City B	BPNN	749.9702	956.3858	19.8099	0.9399
City B	SAE-LSTM	487.4811	567.7616	10.4415	0.9985
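The reductions and gains quoted for Table 4 can be re-derived directly from its rows. The short sketch below does this arithmetic: error metrics are compared as baseline minus proposed, and R² as proposed minus baseline. The dictionary layout and function name are illustrative only; the numbers are copied from Table 4.

```python
# (MAE, RMSE, MAPE, R2) rows copied from Table 4.
TABLE_4 = {
    "City A": {"BPNN": (2697.0785, 3320.968, 12.7449, 0.8985),
               "SAE-LSTM": (1746.884, 2152.5106, 5.8985, 0.9974)},
    "City B": {"BPNN": (749.9702, 956.3858, 19.8099, 0.9399),
               "SAE-LSTM": (487.4811, 567.7616, 10.4415, 0.9985)},
}


def improvement(baseline, proposed):
    """Error reductions (baseline - proposed) and R2 gain (proposed - baseline)."""
    mae_b, rmse_b, mape_b, r2_b = baseline
    mae_p, rmse_p, mape_p, r2_p = proposed
    return (round(mae_b - mae_p, 4), round(rmse_b - rmse_p, 4),
            round(mape_b - mape_p, 4), round(r2_p - r2_b, 4))


for city, rows in TABLE_4.items():
    print(city, improvement(rows["BPNN"], rows["SAE-LSTM"]))
```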

Overall, compared with prediction models that do not use SAE for data dimensionality reduction, the SAE-LSTM model proposed in this paper shows better prediction performance on the datasets of both cities, so the model is effective for predicting monthly tourist arrivals in the two cities.

4 Conclusion

With the rapid development of smart tourism, educational tourism demand prediction is of great significance for optimizing resource allocation and enhancing the tourism experience. In this paper, a hybrid tourism demand prediction model (SAE-LSTM) is proposed that combines a stacked auto-encoder (SAE) and a long short-term memory (LSTM) network with an embedded attention mechanism, and the model is used to forecast educational tourism. The main conclusions are as follows:

This paper analyzes the two main characteristics of the tourist arrival data in Region O, nonlinearity and periodicity. The monthly tourist arrival data are merged with the search intensity data of the influencing factors, duplicate fields are deleted and missing values are filled in, and an SAE-LSTM prediction model based on the stacked auto-encoder and the long short-term memory network is then constructed to predict tourist arrivals. The effectiveness of the SAE module, the LSTM module and the complete SAE-LSTM model is verified separately. The empirical results show that the MAE, RMSE and MAPE values of the SAE-LSTM prediction model are lower than those of the LSTM model, the SAE-BPNN model and the BPNN model, while R² increases by 0.0389 to 0.0989. Compared with prediction models that do not use SAE for dimensionality reduction, the SAE-LSTM model proposed in this paper shows better prediction performance on the datasets of both cities, which shows that the model is effective for predicting educational tourism.

Yan Wang

Lina Fu

Published online: 29 Sep 2025

Received: 07 Jan 2025

Accepted: 24 Apr 2025

DOI: https://doi.org/10.2478/amns-2025-1094

Keywords: Stacked Self-Encoding (SAE), Long Short-Term Memory Network, SAE-LSTM model, Attention Mechanism, Educational Tourism Prediction

© 2025 Yan Wang and Lina Fu, published by Sciendo.

This work is licensed under the Creative Commons Attribution 4.0 International License.
