Combining big data technology to study the geographical distribution characteristics of tourism consumption behavior
Published online: 17 March 2025
Submitted: 11 Oct. 2024
Accepted: 26 Jan. 2025
DOI: https://doi.org/10.2478/amns-2025-0189
© 2025 Zhen Xu, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
In recent years, China’s consumption structure has undergone significant changes, especially since the implementation of double holidays, golden weeks and paid vacations. People’s leisure and recreation time has gradually increased, their daily lifestyles and contents have become richer and richer, and the opportunities for outbound travel have also increased; thus, tourism has become an important part of the daily life of urban residents [1-2]. However, along with the development of urbanization, the fast pace of urban life and the heavy pressure of urban life have made people’s tourism consumption behavior change greatly in form, content and purpose compared with the past [3-4]. In general, people’s total demand for tourism has expanded, the opportunities for tourism have increased, the spatial scope of tourism has been extended, and the consumption behavior of tourism has become more rational. In a research report on the tourism consumption of the new middle class in urban China, it is pointed out that people’s tourism consumption behavior is changing from original sightseeing to leisure, and it also brings challenges to the tourism industry in terms of tourism marketing and service [5-6]. Under the situation that the demand of the tourism market tends to be refined, it is of great significance to grasp the characteristics of travelers’ tourism consumption behavior to promote the development of the tourism industry [7-8].
Staying close to tourism consumers, understanding their new needs and new modes of tourism consumption, integrating new market resources, and planning new product development and operation modes should become the basic work of a tourism industry that seeks sustainable, high-quality development. Miah, S. J. et al. focused on extending strategic decision support in tourism using big data generated by social media, and showed that the approach is general; what adaptation problems and solutions arise when it is applied to other domains or other types of big data streams remains open for discussion [9]. Han, Q. et al. proposed and validated Tourism2vec as a novel and effective method for understanding and analyzing tourism behavior. The method helps uncover regularities hidden in the data, portrays destinations more accurately and comprehensively, and can improve on traditional administrative zoning to support more scientific and accurate tourism planning and promotion [10]. Salas-Olmedo, M. H. et al. indicated that research on urban tourism behavior needs to draw on a variety of data sources; analyzing the digital footprints provided by different sources makes it possible to depict tourists’ trajectories and spatial distribution in the city more accurately and to study areas of different types or functions in more detail [11]. Song, H. et al. indicated that, with the help of big data technology, it will be possible to monitor and analyze travel activities on a global scale and to adjust strategic planning and resource allocation according to the results, which will further promote the intelligent and sustainable development of the world tourism industry [12].
This paper divides tourism consumption activities into three parts: the tourism consumption subject, object, and media. It attributes the factors affecting tourism consumption to four aspects (tourism consumption level, people’s living standard, economic development level, and tourism infrastructure construction level) and selects from them the variables that explain spatial differences in tourism consumption. Spatial and economic distances are used to construct the spatial weights. The spatial lag model (SLM) and spatial error model (SEM) are specified, and their parameters are estimated by the maximum likelihood method. Global and local spatial autocorrelation tests are conducted on 2010-2022 sample data to validate the parameter estimation of the SLM and SEM. The degree of correlation of each influencing factor under the different models is then discussed.
Tourism is a complex activity that is both an economic activity and a spiritual and cultural one. Tourism consumption lies at the intersection of tourism and consumption activities and is an integral part of total social consumption; it is hierarchical, structured, and conditional, and belongs to higher-level consumption behavior. It is a form of enjoyment consumption as well as a form of development and investment, and its constituent factors are very complex. Normal market activity in tourism consumption is composed of three parts: the tourism consumption subject, the tourism consumption object, and the tourism consumption media.
The subject of tourism consumption is the person who leaves their place of residence to visit and stay in other places for sightseeing or other purposes, and who carries out non-remunerated activities there. The subject referred to here must be a tourist who has a desire to travel and a preference for a certain tourism product, and who has leisure time, financial security, and adequate physical fitness.
Tourism consumption subjects can be divided into three categories: inbound tourists, outbound tourists, and domestic tourists.
The tourism consumption object is what tourists consume. It has the attributes of a general commodity but differs from one, and is therefore a special commodity. Here the object refers to tourism products and the things they rely on, i.e., the hardware and software for food, lodging, transport, sightseeing, shopping, and entertainment, together with the composite products formed by combining these with services. Purchasing a tourism product means purchasing the right to enjoy it; this includes purchases that transfer ownership (such as food and souvenirs) and purchases that give the right to enjoy without ownership (such as transport, accommodation, and entertainment).
Tourism consumption media refers to intermediary organizations and enterprises serving tourism, including travel agencies, intermediaries, trade associations, and so on. The tourism media in this context refers to operators and intermediaries of tourism products and goods, including destinations and objects.
Regression analysis rests on using observed data to explore the relationship between several variables. Subsequent statistical inference and analysis are meaningful only if the relationship between the dependent and independent variables is correctly specified. However, in the absence of a clear theoretical relationship, it is uncertain which candidate regressors should be chosen as independent variables. Variable selection, as an important means of screening the independent variables, not only improves prediction accuracy and the interpretability of the model but also reduces costs for practitioners and avoids unnecessary losses.
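This section is based on the linear regression model only. The model is as follows:

$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m + \varepsilon \tag{1}$$

What are the effects of variable misselection on estimation? And how does variable selection improve the predictive accuracy of the dependent variable? For simplicity, model (1) is used as an example, and the standard deviation of the error term $\varepsilon$ is denoted $\sigma$.

Let the design matrix be partitioned as $X = (X_p, X_q)$ with coefficient vector $\beta = (\beta_p', \beta_q')'$, so that the full model is:

$$y = X_p\beta_p + X_q\beta_q + \varepsilon \tag{2}$$

And the alternative model is:

$$y = X_p\beta_p + \varepsilon \tag{3}$$

For the full model (2), the least squares estimate of the regression coefficient $\beta$ is $\hat{\beta} = (X'X)^{-1}X'y$.

For the alternative model (3), the least squares estimate of the regression coefficient $\beta_p$ is $\tilde{\beta}_p = (X_p'X_p)^{-1}X_p'y$.

If the full model (2) is correct, then there are

$$E(\tilde{\beta}_p) = \beta_p + (X_p'X_p)^{-1}X_p'X_q\beta_q$$

From the above, it can be seen that as long as $X_p'X_q\beta_q \neq 0$, i.e., the omitted variables have non-zero coefficients and are not orthogonal to the retained ones, $\tilde{\beta}_p$ is a biased estimate of $\beta_p$.

The above derivation shows that choosing the wrong model can result in biased parameter estimates. Furthermore, it can be shown that if the omitted coefficients $\beta_q$ are zero or sufficiently small, the variance of $\tilde{\beta}_p$ is no larger than that of the corresponding components of $\hat{\beta}$, so the alternative model estimates $\beta_p$ with a smaller mean squared error.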
In summary, it can be concluded from the discussion above that the wrong selection of variables will produce biased estimates and predictions, especially the wrong selection of significant variables will have a larger bias, making the subsequent statistical inference unreliable. However, when the effects of variables are very small or absent (coefficients close to or equal to zero), not selecting these variables not only results in less bias in estimation and prediction but also improves the accuracy of estimation and prediction. Therefore, it is necessary and meaningful to find a suitable method for variable selection.
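The two cases summarized above can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating process, the coefficient values, the correlation between the regressors, and the function names are all assumptions chosen for illustration. It fits a regression that leaves out a second variable, once when that variable matters and once when its coefficient is zero.

```python
import random

def slope(x, y):
    """Least-squares slope of y on a single regressor x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def fit_without_x2(beta2, n=20000, seed=0):
    """Simulate y = 1.0*x1 + beta2*x2 + e, then regress y on x1 only.

    Hypothetical setup: x2 = 0.5*x1 + noise, so x1 and x2 are correlated.
    """
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [0.5 * a + rng.gauss(0, 1) for a in x1]
    y = [1.0 * a + beta2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    return slope(x1, y)

biased = fit_without_x2(beta2=2.0)    # the omitted variable matters
harmless = fit_without_x2(beta2=0.0)  # the omitted variable has no effect
print(f"x2 omitted, beta2 = 2: slope on x1 = {biased:.3f}")
print(f"x2 omitted, beta2 = 0: slope on x1 = {harmless:.3f}")
```

Under these assumptions the first slope should land near 1 + 0.5 × 2 = 2 rather than the true value 1, while the second should stay close to 1, which is the pattern the discussion describes.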
China’s geographical environment is complex and diverse, with large differences between north and south and between east and west and uneven levels of economic development, so the spatial differences in the level of tourism consumption and the factors behind them should be considered from several angles. Selecting representative influencing factors helps identify which factors promote tourism consumption and which inhibit it. Based on the relevant literature, the factors affecting tourism consumption are attributed to four aspects: the level of tourism consumption, people’s living standards, the level of economic development, and the level of tourism infrastructure construction. To reduce the influence of heteroskedasticity, all data were log-transformed.
According to the previous theoretical analysis, this paper takes the mean value of the comprehensive score of tourism consumption as the explained variable; per capita tourism consumption, the consumption level of residents, and the number of A-grade scenic spots as the explanatory variables; and per capita GDP, the number of travel agencies, and the number of star-rated hotels as the control variables. They are explained as follows:
Explained variable. The explained variable is the comprehensive score (ss), whose value reflects the degree of agglomeration of tourism consumption. Many indicators could be used to build such a score; this paper uses the comprehensive score (ss) to reflect both the level of tourism consumption and its level of agglomeration.

Explanatory variables. Per capita tourism consumption is the average amount of money spent by each tourist, i.e., the ratio of total tourism income to the total number of tourists, and reflects average spending. It is measured in yuan, recorded as pcts, and captures the influence of the tourism consumption level. The influence of people’s living standard on tourism consumption comes mainly through tourists’ income level and through demographic and environmental factors, so it is measured by the consumption level of residents (unit: yuan), recorded as hcl. For tourism infrastructure, the number of A-grade scenic spots plays a decisive role in whether people choose to travel to a place, so the number of A-grade scenic spots (unit: count) is selected and recorded as asa.

Control variables. GDP per capita: this paper analyzes spatial differences in tourism consumption, so the per capita GDP of each province is the appropriate measure of the level of economic development. It is measured in yuan and recorded as pcgdp. Travel agencies and star-rated hotels: both are tourism infrastructure that affects the level of development of tourism consumption, so the number of travel agencies (recorded as nta) and the total number of star-rated hotels (recorded as nsh) are selected as control variables.
Spatial autocorrelation is a spatial statistical method used to describe spatially interacting phenomena, referring to the correlation of the same variable at different spatial locations. Many geographical phenomena are spatially autocorrelated because they are influenced by processes that are continuous in their geographical distribution [13-15].
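The global indices for measuring spatial autocorrelation are the global Moran’s $I$, the global Geary’s $C$, and the global Getis-Ord $G$.

The global Moran’s $I$ is defined as:

$$I = \frac{n}{S_0}\cdot\frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}(x_i-\bar{x})(x_j-\bar{x})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}, \qquad S_0=\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}$$

where $n$ is the number of spatial units, $x_i$ is the observation in unit $i$, $\bar{x}$ is the mean of the observations, and $w_{ij}$ is the element of the spatial weight matrix for units $i$ and $j$.

The value of $I$ generally lies between $-1$ and $1$. Values greater than 0 indicate positive spatial autocorrelation, values less than 0 indicate negative spatial autocorrelation, and values close to 0 indicate a spatially random distribution.

For the global Moran’s $I$, significance is tested with the standardized statistic

$$Z(I) = \frac{I - E(I)}{\sqrt{\mathrm{Var}(I)}}, \qquad E(I) = -\frac{1}{n-1}$$

Global Geary’s $C$ is defined as:

$$C = \frac{(n-1)\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}(x_i-x_j)^2}{2S_0\sum_{i=1}^{n}(x_i-\bar{x})^2}$$

The value of the $C$ index generally lies between 0 and 2, with an expected value of 1 when there is no spatial autocorrelation. Values below 1 indicate positive spatial autocorrelation and values above 1 indicate negative spatial autocorrelation.

Similar to the global Moran’s $I$, Geary’s $C$ is tested with the standardized statistic $Z(C) = (C - E(C))/\sqrt{\mathrm{Var}(C)}$.

There are three main measures of local spatial autocorrelation, i.e., the local index of spatial association (LISA), the $G$ statistic, and the Moran scatter plot.

Local Moran’s $I$

Decomposing the global Moran’s $I$ into the contribution of each spatial unit gives, for unit $i$:

$$I_i = \frac{x_i-\bar{x}}{S^2}\sum_{j\neq i} w_{ij}(x_j-\bar{x})$$

Among them: $S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2$.

Local Geary’s $C$

$$c_i = \sum_{j\neq i} w_{ij}(z_i - z_j)^2$$

Among them: $z_i = (x_i-\bar{x})/S$ is the standardized observation.

Local $G$

$$G_i = \frac{\sum_{j\neq i} w_{ij}x_j}{\sum_{j\neq i} x_j}$$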
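As a minimal sketch of how the global and local indices of this section are computed in practice, the functions below implement the textbook forms of Moran's I, Geary's C, and the local Moran's I for a list of observations and a weight matrix given as nested lists. The function names and the four-unit example are hypothetical, and the paper's own software may use different scaling conventions.

```python
def morans_i(x, w):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * num / sum(v * v for v in z)

def gearys_c(x, w):
    """Global Geary's C: (n-1) * sum_ij w_ij (x_i-x_j)^2 / (2 S0 sum_i z_i^2)."""
    n = len(x)
    mean = sum(x) / n
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n))
    return (n - 1) * num / (2 * s0 * sum((v - mean) ** 2 for v in x))

def local_morans_i(x, w):
    """Local Moran's I_i = z_i / S^2 * sum_j w_ij z_j, with S^2 = sum z^2 / n."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    s2 = sum(v * v for v in z) / n
    return [z[i] / s2 * sum(w[i][j] * z[j] for j in range(n) if j != i)
            for i in range(n)]

# Hypothetical example: four units on a line, neighbours share a border.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
x = [1.0, 2.0, 3.0, 4.0]

print(morans_i(x, w))        # about 0.333: positive autocorrelation
print(gearys_c(x, w))        # about 0.3: below 1, also positive
print(local_morans_i(x, w))  # the local values sum to S0 * I
```

With a smoothly increasing variable, both global indices point to positive autocorrelation (I above its expectation, C below 1), and the local values add up to the global index multiplied by the sum of the weights.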
In defining the spatial weights, the first step is to quantify the location of each spatial unit. Location is generally quantified through a “distance”, and the most commonly used distance settings are spatial distance and economic distance.
Spatial distance
Spatial distance weights are mainly set as contiguity (neighbor) weights, threshold (limited) distance weights, and negative exponential distance weights.
Economic distance
Economic distance is defined by the gap between two locations in one or several economic variables, such as GDP or foreign trade volume.
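The two distance settings can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure: the inverse-distance form, the optional cut-off, the planar coordinates, and the choice to give a zero weight to units with identical economic values are all hypothetical.

```python
import math

def spatial_distance_weights(coords, threshold=None):
    """Inverse-distance weights; pairs farther apart than `threshold` get 0."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue                      # no self-neighbours
            d = math.dist(coords[i], coords[j])
            if threshold is None or d <= threshold:
                w[i][j] = 1.0 / d
    return w

def economic_distance_weights(values):
    """Weights as the inverse of the absolute gap in an economic variable."""
    n = len(values)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            gap = abs(values[i] - values[j])
            if i != j and gap > 0:            # assumption: zero gap -> weight 0
                w[i][j] = 1.0 / gap
    return w

# Hypothetical inputs: planar coordinates and per capita GDP for three units.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
gdp = [10.0, 20.0, 40.0]

w_geo = spatial_distance_weights(coords)
w_cut = spatial_distance_weights(coords, threshold=6.0)
w_eco = economic_distance_weights(gdp)
print(w_geo[0], w_cut[0], w_eco[0])
```

Units that are close in space or similar in economic level receive larger weights; with the threshold, the first and third units (10 apart) are no longer treated as neighbors.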
This paper focuses on spatial regression models that incorporate spatial effects (spatial correlation and spatial heterogeneity), namely the spatial lag model (SLM) and the spatial error model (SEM).
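Spatial lag model (SLM)
The general form of the spatial lag model (SLM) can be expressed as follows:

$$y = \rho W y + X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n)$$

The model is a standard regression model augmented with a spatially lagged dependent variable. Where $y$ is the $n \times 1$ vector of the dependent variable, $X$ is the $n \times k$ matrix of explanatory variables, $W$ is the $n \times n$ spatial weight matrix, $Wy$ is the spatially lagged dependent variable, $\rho$ is the spatial autoregressive coefficient, $\beta$ is the vector of regression coefficients, and $\varepsilon$ is the random error term.
Spatial Error Model (SEM)
The interrelationships between regions in the spatial error model are represented through the error terms; its general form is:

$$y = X\beta + u, \qquad u = \lambda W u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n)$$

The spatial error model is essentially a standard regression model combined with a spatially autoregressive error term: the spatial dependence acts through the disturbance $u$, and $\lambda$ is the spatial autocorrelation coefficient of the errors.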
In determining the spatial correlation of regional economic behavior, not only Moran’s I can be used but also the two Lagrange multiplier tests LM-LAG and LM-ERR (for the spatial lag and spatial error models, respectively) and their robust versions R-LMLAG and R-LMERR.
LM-Lag and Robust LM-Lag are suitable for spatial lag models, and LM-Error and Robust LM-Error are suitable for spatial error models.
Both LM-Lag and LM-Error follow a chi-square distribution with one degree of freedom. They test different forms of the spatial econometric model, but in practice both need to be carried out together.
The criteria for choosing between the SLM and the SEM are as follows. Given that Moran’s I is significant, if LM-Lag is more significant than LM-Error in the spatial dependence test, and R-LMLAG is significant while R-LMERR is not, the spatial lag model is chosen. Conversely, if LM-Error is more significant than LM-Lag, and R-LMERR is significant while R-LMLAG is not, the spatial error model is chosen.
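The selection rule can be written as a short decision function. The sketch below follows the commonly used sequence (plain LM tests first, robust versions when both are significant); the function name, the 0.05 threshold, and the "undetermined" outcome for cases the rule does not settle are assumptions, not part of the paper.

```python
def choose_spatial_model(p_lm_lag, p_lm_err, p_rlm_lag, p_rlm_err, alpha=0.05):
    """Pick OLS, SLM or SEM from the p-values of the four LM tests."""
    lag_sig = p_lm_lag < alpha
    err_sig = p_lm_err < alpha
    if not lag_sig and not err_sig:
        return "OLS"            # no evidence of spatial dependence
    if lag_sig and not err_sig:
        return "SLM"
    if err_sig and not lag_sig:
        return "SEM"
    # Both plain tests are significant: fall back on the robust versions.
    rlag_sig = p_rlm_lag < alpha
    rerr_sig = p_rlm_err < alpha
    if rlag_sig and not rerr_sig:
        return "SLM"
    if rerr_sig and not rlag_sig:
        return "SEM"
    return "undetermined"       # the rule above does not settle this case

# Hypothetical p-values for illustration only.
print(choose_spatial_model(0.001, 0.030, 0.010, 0.400))  # SLM
print(choose_spatial_model(0.040, 0.001, 0.500, 0.002))  # SEM
print(choose_spatial_model(0.200, 0.300, 0.600, 0.700))  # OLS
```

Returning "undetermined" when both or neither robust test is significant keeps the function from inventing a tie-break that the criteria do not state.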
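Second, in diagnosing overall significance, in addition to comparing the goodness-of-fit $R^2$, the log-likelihood value (Log L) of the competing models is compared; a larger log-likelihood indicates a better fit.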
Because of the endogeneity of the spatially lagged variables, ordinary least squares (OLS) estimates of spatial autoregressive models are no longer unbiased, efficient, and consistent. This paper therefore uses the maximum likelihood method to estimate the parameters of the SLM and SEM in the empirical analysis.
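The estimation steps of the spatial lag regression model include:
(1) Perform least squares estimation on model $y = X\beta_0 + \varepsilon_0$ to obtain $\hat{\beta}_0$ and the residuals $e_0 = y - X\hat{\beta}_0$.
(2) Perform least squares estimation on model $Wy = X\beta_L + \varepsilon_L$ to obtain $\hat{\beta}_L$ and the residuals $e_L = Wy - X\hat{\beta}_L$.
(3) From the values of $e_0$ and $e_L$, find the $\rho$ that maximizes the concentrated log-likelihood $L_C = -\frac{n}{2}\ln\left[\frac{1}{n}(e_0-\rho e_L)'(e_0-\rho e_L)\right] + \ln|I-\rho W|$.
(4) Given $\hat{\rho}$, compute $\hat{\beta} = \hat{\beta}_0 - \hat{\rho}\hat{\beta}_L$ and $\hat{\sigma}^2 = \frac{1}{n}(e_0-\hat{\rho} e_L)'(e_0-\hat{\rho} e_L)$.
Then the maximum likelihood function value is:

$$\ln L = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\hat{\sigma}^2 + \ln|I-\hat{\rho}W| - \frac{1}{2\hat{\sigma}^2}(y-\hat{\rho}Wy-X\hat{\beta})'(y-\hat{\rho}Wy-X\hat{\beta})$$

The estimation procedure for the spatial error model is:
(1) Perform OLS estimation of model $y = X\beta + \varepsilon$ to obtain $\hat{\beta}_{OLS}$.
(2) Calculate the residuals $e = y - X\hat{\beta}_{OLS}$.
(3) From the value of $e$, find the $\lambda$ that maximizes the concentrated log-likelihood $L_C = -\frac{n}{2}\ln\left[\frac{1}{n}e'(I-\lambda W)'(I-\lambda W)e\right] + \ln|I-\lambda W|$.
(4) Calculate the remaining parameter estimates from the $\hat{\lambda}$: $\hat{\beta} = [X'(I-\hat{\lambda}W)'(I-\hat{\lambda}W)X]^{-1}X'(I-\hat{\lambda}W)'(I-\hat{\lambda}W)y$ and $\hat{\sigma}^2 = \frac{1}{n}\hat{u}'(I-\hat{\lambda}W)'(I-\hat{\lambda}W)\hat{u}$, with $\hat{u} = y - X\hat{\beta}$.
Then the maximum likelihood function value is:

$$\ln L = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\hat{\sigma}^2 + \ln|I-\hat{\lambda}W| - \frac{1}{2\hat{\sigma}^2}\hat{u}'(I-\hat{\lambda}W)'(I-\hat{\lambda}W)\hat{u}$$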
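The spatial lag procedure (two auxiliary regressions, a one-dimensional search for the spatial coefficient, then the remaining parameters) can be sketched with NumPy. This is an illustrative implementation under assumptions, not the Matlab routine used in the paper: the grid search over rho, the ring-shaped weight matrix, and the simulated data are all hypothetical, and the search interval presumes a row-normalized W.

```python
import numpy as np

def fit_slm(y, X, W, grid=None):
    """Concentrated maximum likelihood for y = rho*W*y + X*beta + eps."""
    n = len(y)
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 199)     # assumes row-normalized W
    Wy = W @ y
    b0 = np.linalg.lstsq(X, y, rcond=None)[0]    # step 1: regress y on X
    bL = np.linalg.lstsq(X, Wy, rcond=None)[0]   # step 2: regress Wy on X
    e0, eL = y - X @ b0, Wy - X @ bL
    eye = np.eye(n)
    best_rho, best_ll = None, -np.inf
    for rho in grid:                             # step 3: search over rho
        sign, logdet = np.linalg.slogdet(eye - rho * W)
        if sign <= 0:
            continue                             # skip inadmissible values
        r = e0 - rho * eL
        ll = -0.5 * n * np.log(r @ r / n) + logdet
        if ll > best_ll:
            best_rho, best_ll = rho, ll
    r = e0 - best_rho * eL                       # step 4: beta and sigma^2
    return best_rho, b0 - best_rho * bL, r @ r / n

# Hypothetical data: 200 units on a ring, two equally weighted neighbours each.
rng = np.random.default_rng(0)
n = 200
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_rho, true_beta = 0.5, np.array([1.0, 2.0])
y = np.linalg.solve(np.eye(n) - true_rho * W, X @ true_beta + rng.normal(size=n))

rho_hat, beta_hat, sigma2_hat = fit_slm(y, X, W)
print(rho_hat, beta_hat, sigma2_hat)
```

A grid search is used only to keep the sketch dependency-free; a numerical optimizer would normally replace it. On the simulated data the estimates should fall near the values used to generate it (rho = 0.5, slope = 2).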
To facilitate comparative analysis, this paper constructs spatial weight matrices from geographic distance, from economic distance, and from functions of both. For the last of these, the gravity model is introduced into the study of spatial interaction, and function-based spatial weights are built on it. The gravity model follows the first law of geography, under which near things are more related than distant things, and this also agrees with the general pattern of economic activity. In the gravity model, the interaction between two regions or units is negatively related to the distance between them and positively related to their total economic volume. The gravity model has become an important tool for studying the spatial effects of factor flows between regions.
Based on the gravity model, a non-binary, function-based spatial weight matrix is established. After row normalization, global and local spatial autocorrelation tests are carried out. The Getis-Ord index G is not applicable, since it requires a non-standardized symmetric spatial weight matrix whose elements are all 0 or 1. Only Moran’s I and Geary’s C were therefore calculated for the explained and explanatory variables of tourism economic efficiency.
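A gravity-type weight matrix and its row normalization can be sketched as below. The specific functional form (product of the two regions' economic volumes divided by squared distance) is a common choice assumed here for illustration; the paper does not state its exact formula, and the input numbers are hypothetical.

```python
def gravity_weights(mass, dist):
    """w_ij = mass_i * mass_j / dist_ij**2 for i != j (assumed gravity form)."""
    n = len(mass)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = mass[i] * mass[j] / dist[i][j] ** 2
    return w

def row_normalize(w):
    """Scale each row to sum to 1; rows of zeros are left unchanged."""
    out = []
    for row in w:
        s = sum(row)
        out.append([v / s if s > 0 else 0.0 for v in row])
    return out

# Hypothetical economic volumes and pairwise distances for three regions.
mass = [1.0, 2.0, 4.0]
dist = [[0.0, 1.0, 2.0],
        [1.0, 0.0, 1.0],
        [2.0, 1.0, 0.0]]

w_raw = gravity_weights(mass, dist)
w_std = row_normalize(w_raw)
for row in w_std:
    print([round(v, 3) for v in row])
```

The raw matrix is symmetric, but the row-normalized one is not (here the weight from region 1 to region 2 differs from the weight in the other direction), which is why an index that needs a symmetric binary matrix cannot be applied to it.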
The global correlation test of the tourism consumption level is shown in Table 1. Over 2010-2022, most values of Moran’s I are close to 0 and most values of Geary’s C are close to 1, and the p-values are high, so neither global index rejects the null hypothesis of “no spatial autocorrelation”. Moran’s I is greater than 0 from 2012 to 2015, which points to positive spatial correlation in those years, and Geary’s C (below 1 in the same years) gives the same indication, although neither is statistically significant.
Table 1. Global correlation test for consumption levels
Year | Moran’s I | E(I) | sd(I) | z (I) | p-value* (I) | Geary’s C | E(C) | sd(C) | z (C) | p-value* (C) |
---|---|---|---|---|---|---|---|---|---|---|
2010 | -0.025 | -0.027 | 0.124 | 0.042 | 0.925 | 1.524 | 1.000 | 0.142 | 0.254 | 0.724 |
2011 | -0.018 | -0.027 | 0.124 | 0.214 | 0.798 | 0.839 | 1.000 | 0.143 | -0.078 | 0.951 |
2012 | 0.105 | -0.027 | 0.125 | 1.53 | 0.335 | 0.827 | 1.000 | 0.146 | -1.241 | 0.352 |
2013 | 0.042 | -0.027 | 0.124 | 0.721 | 0.652 | 0.931 | 1.000 | 0.142 | -0.158 | 0.816 |
2014 | 0.045 | -0.027 | 0.128 | 0.715 | 0.652 | 0.993 | 1.000 | 0.145 | -0.129 | 0.948 |
2015 | 0.142 | -0.027 | 0.127 | 1.568 | 0.241 | 0.824 | 1.000 | 0.145 | -1.505 | 0.247 |
2016 | -0.064 | -0.027 | 0.124 | -0.415 | 0.825 | 1.124 | 1.000 | 0.142 | -1.557 | 0.124 |
2017 | -0.089 | -0.027 | 0.124 | -0.524 | 0.662 | 1.181 | 1.000 | 0.143 | 0.682 | 0.542 |
2018 | -0.036 | -0.027 | 0.124 | 0.069 | 0.963 | 1.012 | 1.000 | 0.143 | 0.725 | 0.415 |
2019 | -0.021 | -0.027 | 0.125 | 0.051 | 0.942 | 1.068 | 1.000 | 0.142 | 0.241 | 0.856 |
2020 | -0.055 | -0.027 | 0.127 | -0.182 | 0.785 | 1.043 | 1.000 | 0.143 | 0.359 | 0.521 |
2021 | -0.075 | -0.027 | 0.124 | -0.358 | 0.852 | 1.029 | 1.000 | 0.142 | 0.522 | 0.856 |
2022 | -0.061 | -0.027 | 0.126 | -0.182 | 0.896 | 1.014 | 1.000 | 0.145 | 0.384 | 0.722 |
The local correlation test of tourism consumption levels is shown in Table 2. The results of the local spatial autocorrelation test indicate that spatial autocorrelation exists in most areas. The study shows that although there is no spatial autocorrelation globally, there is spatial autocorrelation locally, and the reason for this situation may lie in the fact that the local autocorrelations cancel each other out, resulting in the absence of autocorrelation globally.
Table 2. Local correlation test for tourism consumption levels
Region | Moran’s Ii | E(Ii) | sd(Ii) | z (Ii) | p-value* (Ii) | Geary’s ci | E(ci) | sd(ci) | z (ci) | p-value* (ci) |
---|---|---|---|---|---|---|---|---|---|---|
Beijing | 1.253 | -0.043 | 0.725 | 1.078 | 0.018 | 0.652 | 2.352 | 1.872 | -0.789 | 0.179 |
Tianjin | 0.893 | -0.043 | 0.380 | 0.783 | 0.167 | 0.666 | 2.352 | 2.376 | -0.133 | 0.234 |
Hebei | -0.425 | -0.043 | 0.378 | -0.606 | 0.516 | 4.463 | 2.352 | 1.887 | 0.848 | 0.476 |
Shanxi | 0.025 | -0.043 | 0.093 | 0.741 | 0.839 | 1.324 | 2.352 | 1.409 | -0.789 | 0.092 |
Neimenggu | -0.072 | -0.043 | 0.740 | -0.674 | 0.239 | 4.884 | 2.352 | 1.307 | 0.275 | 0.399 |
Liaoning | 0.024 | -0.043 | 0.573 | 0.456 | 0.853 | 0.630 | 2.352 | 1.135 | -0.509 | 0.174 |
Jilin | -0.142 | -0.043 | 0.931 | 1.182 | 0.223 | 0.723 | 2.352 | 1.330 | -0.058 | 0.383 |
Heilongjiang | 1.352 | -0.043 | 0.310 | -0.431 | 0.283 | 1.835 | 2.352 | 1.037 | -0.146 | 0.187 |
Shanghai | -0.012 | -0.043 | 0.145 | 0.508 | 0.371 | 1.639 | 2.352 | 1.404 | -0.393 | 0.120 |
Jiangsu | -0.157 | -0.043 | 0.095 | 0.437 | 0.151 | 1.320 | 2.352 | 1.520 | -0.710 | 0.831 |
Zhejiang | -0.214 | -0.043 | 0.685 | -0.609 | 0.396 | 1.493 | 2.352 | 1.047 | -0.183 | 0.179 |
Anhui | 0.135 | -0.043 | 0.603 | 0.428 | 0.443 | 0.562 | 2.352 | 1.316 | -0.422 | 0.334 |
Fujian | -0.024 | -0.043 | 0.041 | 0.446 | 0.625 | 1.403 | 2.352 | 1.966 | -0.798 | 0.409 |
Jiangxi | 0.328 | -0.043 | 0.546 | 0.772 | 0.036 | 0.527 | 2.352 | 1.306 | -0.990 | 0.805 |
Shandong | 0.258 | -0.043 | 0.239 | 0.404 | 0.391 | 1.754 | 2.352 | 1.264 | -0.245 | 0.759 |
Henan | 0.321 | -0.043 | 0.202 | 0.491 | 0.340 | 1.700 | 2.352 | 1.253 | -0.682 | 0.097 |
Hubei | 0.205 | -0.043 | 0.079 | 0.334 | 0.756 | 0.620 | 2.352 | 1.351 | -0.050 | 0.428 |
Hunan | 0.172 | -0.043 | 0.422 | 0.779 | 0.832 | 1.545 | 2.352 | 1.378 | -0.052 | 0.526 |
Guangdong | -0.724 | -0.043 | 0.353 | -0.276 | 0.407 | 5.592 | 2.352 | 1.250 | 0.960 | 0.630 |
Guangxi | 0.024 | -0.043 | 0.180 | 0.756 | 0.144 | 0.603 | 2.352 | 1.364 | -0.660 | 0.083 |
Hainan | -0.825 | -0.043 | 0.537 | -0.664 | 0.068 | 4.566 | 2.352 | 1.177 | 0.135 | 0.608 |
Chongqing | 0.152 | -0.043 | 0.244 | 0.728 | 0.540 | 1.793 | 2.352 | 1.265 | -0.562 | 0.173 |
Sichuan | 0.283 | -0.043 | 0.395 | 0.710 | 0.617 | 0.656 | 2.352 | 1.022 | -0.072 | 0.235 |
Guizhou | -0.983 | -0.043 | 0.517 | -0.340 | 0.230 | 5.449 | 2.352 | 1.261 | -0.088 | 0.709 |
Yunnan | 0.058 | -0.043 | 0.406 | 0.635 | 0.802 | 1.519 | 2.352 | 1.113 | 0.422 | 0.429 |
Xizang | 0.089 | -0.043 | 0.244 | 0.759 | 0.262 | 1.538 | 2.352 | 1.210 | -0.215 | 0.312 |
Shaanxi | 0.087 | -0.043 | 0.369 | 0.696 | 0.677 | 1.707 | 2.352 | 1.434 | -0.602 | 0.533 |
Gansu | -0.135 | -0.043 | 0.364 | -0.363 | 0.590 | 6.610 | 2.352 | 1.177 | 0.494 | 0.309 |
Qinghai | 0.198 | -0.043 | 0.610 | 0.195 | 0.505 | 3.765 | 2.352 | 1.923 | -0.028 | 0.013 |
Ningxia | -0.675 | -0.043 | 0.205 | -0.525 | 0.359 | 4.656 | 2.352 | 1.375 | 0.952 | 0.128 |
Xinjiang | -0.178 | -0.043 | 0.380 | 1.078 | 0.862 | 2.597 | 2.352 | 1.526 | 0.416 | 0.021 |
Moran’s I indicates that Beijing, Shanghai, Hainan, Guizhou, Gansu, Ningxia, and Jilin reject the null hypothesis of no correlation at the 10% significance level, and thus these regions are locally correlated. Geary’s C indicates that Hebei, Guangdong, Hainan, Guizhou, Gansu, and Ningxia reject the null hypothesis at the 10% significance level, so local spatial correlation is present there as well. Combining Moran’s I and Geary’s C, the variables show significant local spatial correlation, either positive or negative, in Beijing, Shanghai, Hainan, Guizhou, Gansu, Ningxia, Hebei, and Guangdong. Therefore, when choosing the spatial estimation model, it is necessary to take full account of the spatial correlation of tourism economic efficiency, to introduce its spatial lag term into the model, and to pay attention to the resulting correlation problems.
Based on how spatial and time effects are controlled, the spatial panel model with fixed effects can be categorized into four types: no fixed effects, spatial fixed effects, time fixed effects, and spatial and time fixed effects. First, the estimation results of the four types of spatial Durbin model (SDM) are compared and the optimal one is selected. Second, the non-spatial panel model with individual fixed effects, the spatial lag model, and the spatial error model are estimated as reference models. The estimation is carried out in Matlab R2020b with its spatial econometrics toolbox.
The regression results of the influencing factors of the regional tourism development index are shown in Table 3. By synthesizing the correlation test and model estimation results, the following conclusions can be initially drawn:
Table 3. Regression results for the regional tourism development index (test statistics in parentheses)
Variable | OLS (individual fixed) | SLM | SEM | SDM (time fixed) |
---|---|---|---|---|
hcl | -0.4254*** (-4.2151) | 0.2986*** (8.5407) | -0.1527** (-2.0124) | 0.2975*** (7.8697) |
asa | 0.3048*** (8.0053) | 0.1243*** (3.6381) | 0.2235*** (5.7680) | 0.1543*** (4.0517) |
pcgdp | 0.2176*** (4.5206) | 0.2688*** (7.9618) | 0.2104*** (3.6524) | 0.2493** (6.3124) |
nta | -0.5248 (-1.2701) | 0.1275*** (3.3562) | -0.827** (9.8571) | 0.1993*** (6.4215) |
nsh | 0.4867*** (10.4813) | 0.2534*** (4.8965) | -0.875** (-2.5354) | 0.1942*** (4.2513) |
W*hcl | - | - | - | -0.1896* (-0.1935) |
W*asa | - | - | - | 0.0027 (0.0528) |
W*pcgdp | - | - | - | 0.0924 (1.657) |
W*nta | - | - | - | -0.5246*** (-8.2012) |
W*nsh | - | - | - | 0.1530 (1.7852) |
ρ / λ | - | 0.1893*** (3.6538) | 0.5243*** (11.2042) | 0.3562*** (7.4251) |
Adj. R² | 0.9147 | 0.7896 | 0.4869 | 0.7892 |
Log L | 653.2568 | 541.2305 | 598.6258 | 463.5284 |
The SDM (time-fixed) model is the optimal model for the regional tourism development index. From the results of the spatial effect test, although the SEM (spatial fixed) model has the highest log-likelihood value (Log L), the adjusted goodness-of-fit coefficient (Adj. $R^2$)
Comparing the OLS model and the SDM model, the regression coefficients of the number of A-grade scenic spots and the total number of star-rated hotels are 0.3048 and 0.4867 in the OLS model but 0.1543 and 0.1942 in the SDM model, i.e., markedly lower. The regression coefficient of per capita gross domestic product (pcgdp) is 0.2176 in the OLS model and 0.2493 in the SDM model, a slight increase. The regression coefficient of the consumption level of residents (hcl) is -0.4254 in the OLS model but 0.2975 in the SDM model. The regression coefficient of the number of travel agencies (nta) is -0.5248 in the OLS model and does not pass the significance test, while in the SDM model it is 0.1993 and significant at the 0.01 level. These comparisons show that ignoring the spatial effects of the explained and explanatory variables overestimates the impact of the number of A-grade scenic spots and the total number of star-rated hotels on the regional tourism development index, underestimates the impact of per capita gross domestic product, and produces the paradoxical result that residents’ consumption level and the number of travel agencies have a negative impact on the regional tourism development index.
The values of the SLM model
Since the regression coefficients of the explanatory variables in the spatial Durbin model cannot directly reflect their specific influence on the regional tourism development index, they need to be decomposed. The direct, indirect, and total effects of the influencing factors of the regional tourism development index are shown in Table 4.
Table 4. Direct, indirect, and total effects on the regional tourism development index (test statistics in parentheses)
Variable | hcl | asa | pcgdp | nta | nsh |
---|---|---|---|---|---|
Direct effect | 0.3512*** (7.8695) | 0.1375*** (4.5264) | 0.2635*** (6.9865) | 0.2286** (5.6838) | 0.2425*** (4.4151) |
Indirect effect | -0.0921 (-0.8879) | 0.0785 (0.7196) | 0.2513*** (3.5628) | -0.6879*** (-6.0591) | 0.2441* (2.3561) |
Total effect | 0.2041 (1.7245) | 0.2215 (1.8206) | 0.5237*** (7.2653) | -0.3604*** (-4.0111) | 0.5124*** (2.9815) |
Residents’ consumption level promotes the regional tourism development index: its direct effect is 0.3512, significant at the 0.01 level, while its indirect effect is -0.0921 and does not pass the significance test. Every 1% increase in residents’ consumption level therefore directly raises the region’s own tourism development index by about 0.35%, whereas its effect on the tourism development index of neighboring regions is not significant.
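The decomposition of a spatial Durbin coefficient into direct, indirect, and total effects is usually done with the partial-derivative approach of LeSage and Pace; the paper does not say which routine it used, so the sketch below is an assumption about the method. The weight matrix (a six-unit ring) and the parameter values are hypothetical and will not reproduce Table 4.

```python
import numpy as np

def sdm_effects(rho, beta_k, theta_k, W):
    """Direct, indirect and total effects of variable k in an SDM.

    Uses S_k = (I - rho*W)^(-1) (beta_k*I + theta_k*W):
    direct = mean diagonal of S_k, total = mean row sum, indirect = difference.
    """
    n = W.shape[0]
    eye = np.eye(n)
    S = np.linalg.solve(eye - rho * W, beta_k * eye + theta_k * W)
    direct = np.trace(S) / n
    total = S.sum() / n
    return direct, total - direct, total

# Hypothetical row-normalized weights: six units on a ring.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

direct, indirect, total = sdm_effects(rho=0.4, beta_k=0.3, theta_k=0.1, W=W)
print(round(direct, 4), round(indirect, 4), round(total, 4))
```

With a row-normalized W the total effect equals (beta_k + theta_k) / (1 - rho), and the direct effect exceeds beta_k because of feedback through neighboring units, which is why the raw SDM coefficients cannot be read as marginal effects.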
In summary, the results of the spatial effect analysis of the regional tourism development index show that:
There is positive spatial autocorrelation in China’s provincial regional tourism development index from 2010 to 2022, and this spatial correlation shows a strengthening trend. The consumption level of residents, the number of A-grade scenic spots, GDP per capita, the number of travel agencies, and the total number of star-rated hotels all have a positive direct effect on the regional tourism development index, with the consumption level of residents having the largest (0.3512). GDP per capita and the total number of star-rated hotels have positive spillover effects: a 1% increase in either in neighboring provinces indirectly raises the regional tourism development index by about 0.25% and 0.24%, respectively. The number of travel agencies has a negative spillover effect: a 1% increase in neighboring provinces lowers the regional tourism development index by about 0.69%.
Taking a tourist attraction in Jilin Province as the main research object, questionnaire surveys, online check-in data, GPS data, communication data, travelogue texts, and other data were compared and analyzed. The questionnaire survey served as the main data source, online travelogue texts and geographic information as auxiliary supplements, and government statistics as support, so that data on tourists’ spatial and temporal behavior were collected and processed in several ways.
A total of 600 questionnaires (online and offline paper questionnaires) were distributed, and 579 valid questionnaires were collected. The questionnaires covered basic information about the surveyed tourists, their mode of travel, the number of days they stayed, the scenic spots they visited, and their consumption behavior and satisfaction in the scenic spots, the service space, the transportation space, and so on.
Interviews with tourists and tourists’ comments on the different scenic spots in tourism apps were summarized to obtain the number of consumption instances at each scenic spot; the average number for each scenic spot is shown in Table 5. Tourists consume at least twice at every scenic spot. The statistics cover 19 scenic spots in total; the average numbers sum to 103, i.e., about 5 per scenic spot.
Average number of consumption instances at different scenic spots
Scenic spot name | Average number of consumption instances (times) |
---|---|
Changbaishan | 5 |
Changbaishanxiagufushilinjingqu | 2 |
Longshunxueshanfeihujingqu | 8 |
Daxitaihejingqu | 3 |
Chuangxingchangbaishanyuanshisamanbuluofengjingqu | 2 |
Daguandongwenhuayuan | 7 |
Mojiefengjingqu | 3 |
Shangbaishanlishiwenhuayuan | 5 |
Shangbaishanhepinghuaxuechang | 5 |
Hongqichaoxianminsucun | 12 |
Changbaishanbaoshixiaozhenlvyoudujiaqu | 15 |
Xidongyouleyuan | 4 |
Songhuacun | 8 |
Baihuagujingqu | 6 |
Changbaishandiyicunfengjingqu | 4 |
Changbaishanwenhuaboliancheng | 5 |
Haigouhuangjincheng | 3 |
Huiyiyizhi | 4 |
Genjudizhanshijinianguan | 2 |
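The summary figures quoted for Table 5 (19 attractions, 103 consumption instances in total, a minimum of 2 and an average of about 5) can be checked directly from the table. The short sketch below only transcribes the table values; the variable names are illustrative and not part of the original study.

```python
# Average consumption counts transcribed from Table 5 (times per scenic spot).
table5 = {
    "Changbaishan": 5,
    "Changbaishanxiagufushilinjingqu": 2,
    "Longshunxueshanfeihujingqu": 8,
    "Daxitaihejingqu": 3,
    "Chuangxingchangbaishanyuanshisamanbuluofengjingqu": 2,
    "Daguandongwenhuayuan": 7,
    "Mojiefengjingqu": 3,
    "Shangbaishanlishiwenhuayuan": 5,
    "Shangbaishanhepinghuaxuechang": 5,
    "Hongqichaoxianminsucun": 12,
    "Changbaishanbaoshixiaozhenlvyoudujiaqu": 15,
    "Xidongyouleyuan": 4,
    "Songhuacun": 8,
    "Baihuagujingqu": 6,
    "Changbaishandiyicunfengjingqu": 4,
    "Changbaishanwenhuaboliancheng": 5,
    "Haigouhuangjincheng": 3,
    "Huiyiyizhi": 4,
    "Genjudizhanshijinianguan": 2,
}

n_spots = len(table5)              # number of attractions in the table
total = sum(table5.values())       # total consumption instances
mean_per_spot = total / n_spots    # mean of the per-spot averages
minimum = min(table5.values())     # smallest per-spot average

print(f"{n_spots} spots, total {total}, mean {mean_per_spot:.2f}, min {minimum}")
```

The mean works out to 103 / 19, roughly 5.4, which the text rounds to 5.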
The transportation isochrone time of this scenic spot is graded, and tourists’ travel duration is divided into three classes: short distance (half an hour), middle distance (one to two hours), and long distance (three hours). Combined with the statistics on tourists’ travel duration, tourist consumption behavior shows a certain correlation with travel duration.
Considering the city’s current development, including its geographic location and its economic and social development, the Moran’s I test for the city gives p = 0.223, with local correlation present. Combined with the results of the spatial effect analysis of the regional tourism development index, the development of this tourist attraction in Jilin Province is correlated to some degree with the consumption level of residents, the number of A-grade scenic spots, GDP per capita, the number of travel agencies, and the total number of star-rated hotels.
This paper divides the subject and object of tourism consumption behavior, selects the tourism consumption spatial difference variables, establishes the spatial measurement model, and conducts a correlation analysis of the factors influencing the spatial difference of tourism consumption behavior. The geographic distance and economic distance functions are used to construct the spatial weight matrix, respectively. Based on the gravity model, Moran’s I and Geary’s C index tests are carried out.
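As a rough illustration of the two global statistics named above, the following sketch computes Moran’s I and Geary’s C with NumPy using their standard textbook definitions. The inverse-distance weight function is an assumption made for this example; the paper’s exact geographic and economic distance functions are not reproduced here, and all function names are hypothetical.

```python
import numpy as np

def inverse_distance_weights(coords):
    """Assumed weight scheme: w_ij = 1 / d_ij for i != j, and w_ii = 0."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = np.zeros_like(dist)
    mask = dist > 0
    w[mask] = 1.0 / dist[mask]
    return w

def morans_i(x, w):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z = x - mean(x)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)

def gearys_c(x, w):
    """Global Geary's C: ((n - 1) / (2 S0)) * sum_ij w_ij (x_i - x_j)^2 / (z' z)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()
    sq_diff = (x[:, None] - x[None, :]) ** 2
    return ((x.size - 1) / (2.0 * w.sum())) * (w * sq_diff).sum() / (z @ z)

if __name__ == "__main__":
    # Hypothetical coordinates and index values for four regions.
    coords = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    values = [0.8, 0.7, 0.3, 0.2]
    w = inverse_distance_weights(coords)
    print("Moran's I:", morans_i(values, w))
    print("Geary's C:", gearys_c(values, w))
```

Values of Moran’s I above its expectation of -1/(n-1) and values of Geary’s C below 1 both point toward positive spatial autocorrelation; significance would still need a permutation or normal-approximation test, which this sketch omits.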
The results of the spatial autocorrelation test show that spatial autocorrelation exists in most regions, and Moran’s I and Geary’s C indices point out that there are significant local spatial correlations between the explanatory variables in Beijing, Shanghai, Hainan, Guizhou, Gansu, Ningxia, Hebei, and Guangdong, which show either spatial positive correlation or spatial negative correlation.
The SLM model, the SEM model, and the SDM model all confirm the existence of a spatial spillover effect of the whole region’s tourism development index. Spatial spillover is a significant factor that influences China’s provincial region-wide tourism development index. For example, a 1% increase in the whole-area tourism development index of neighboring provinces can indirectly promote the whole-area tourism development index of this region by 0.35% through spatial interaction.
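To make the idea of direct and indirect (spillover) effects more concrete, the sketch below applies the commonly used partial-derivative decomposition for a spatial Durbin model, in which the effect matrix for one explanatory variable is (I - ρW)^(-1)(βI + θW). Whether the paper uses exactly this decomposition is an assumption, and the coefficient values and the three-region weight matrix are purely illustrative, not the paper’s estimates.

```python
import numpy as np

def sdm_effects(rho, beta, theta, w):
    """Average direct, indirect, and total effects of one regressor in an SDM.

    Effect matrix: S = (I - rho * W)^(-1) (beta * I + theta * W).
    Direct effect   = mean of the diagonal of S.
    Total effect    = mean of the row sums of S.
    Indirect effect = total - direct (the spatial spillover).
    """
    w = np.asarray(w, dtype=float)
    n = w.shape[0]
    identity = np.eye(n)
    s = np.linalg.solve(identity - rho * w, beta * identity + theta * w)
    direct = np.trace(s) / n
    total = s.sum() / n
    return direct, total - direct, total

if __name__ == "__main__":
    # Hypothetical row-standardized weight matrix for three mutually adjacent regions.
    w = np.array([[0.0, 0.5, 0.5],
                  [0.5, 0.0, 0.5],
                  [0.5, 0.5, 0.0]])
    # Illustrative coefficients only.
    direct, indirect, total = sdm_effects(rho=0.35, beta=0.5, theta=0.1, w=w)
    print(f"direct={direct:.3f}, indirect={indirect:.3f}, total={total:.3f}")
```

With a row-standardized weight matrix the total effect reduces to (β + θ) / (1 - ρ), which gives a quick sanity check on any estimated coefficients.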
When we combine the statistics of tourists’ consumption behavior in scenic spots with the analysis of correlation factors, we can determine that the majority of tourists engage in scenic spot consumption at least twice, with an average frequency of approximately five times.