Open Access

Research on Optimal Strategy of Market Segmentation and Product Development of Rural Tourism Based on Cluster Analysis

Sep 26, 2025


Introduction

In recent years, the rural tourism market has shown a vigorous development momentum. Rural tourism has attracted more and more tourists with its unique natural landscape and folklore charm [1-2]. However, the development of rural tourism market is not a one-step process, and it needs to be segmented in order to deeply understand the needs and development potential of different market segments, so as to better meet the needs of tourists [3-6]. Rural tourism market segmentation can not only better meet the needs of different consumers, but also promote the healthy development of the tourism market [7-9]. Through market segmentation, tourism enterprises can better understand the needs and preferences of consumers and provide more personalized and differentiated products and services. Consumers can also find suitable tourism products more easily and get a better travel experience [10-13].

Rural tourism products are specialty products that mainly serve the demand of urban or non-local leisure travelers visiting specific villages [14-15]. The development of rural tourism has a far-reaching impact on the local economy, society and ecological environment. At present, the development of rural tourism products in China is at an early stage. Product development should combine local resources, rural culture and a sound ecological environment [16-19], and offer enough experiential projects so that tourists can not only “live in the farmhouse, eat farm meals, do farm work, enjoy the farmhouse”, but also “see farmhouse theater, folk festival, green tasting fresh, ecological livability” [20-22]. Only by upgrading the quality of rural tourism products in this way can destinations attract more tourists; otherwise the products will enter a period of decline [23-25]. Cluster analysis can play an important role in rural tourism market segmentation and in formulating an optimal product development strategy.

Literature [26] used cluster analysis to develop segmentation of village tourism and concluded that the level of environmental attitude of tourists is one of the variables explaining the market segmentation, revealing that there is a large heterogeneity in the agritourism market, and the traditional agritourism activities accounted for a relatively small proportion of the market. Literature [27] discussed the motives of rural tourism and segmented the market based on push and pull motives, showing that the rural tourism market is heterogeneous and diversified, providing suggestions for practitioners and future research. Literature [28] used a two-stage cluster analysis approach to segment the market for older rural tourists and determined that the general characteristics and behavioral intentions of older rural tourists differed across travel periods, a result that provides a reference for the development of effective marketing strategies for older rural tourists. Literature [29] segmented the village tourist market using scales reporting environmentally, culturally, socially, and economically sustainable tourism behaviors, and through a hierarchical cluster analysis, produced a triple cluster solution with different sustainability influencing behaviors for more sustainable destination development. Literature [30] describes the nature of agritourism segmentation, motivations for agritourism activities and types of tourists on agricultural farms, emphasizing that market segmentation plays an important role in the development of tourism marketing activities, rural tourism and agritourism. The above studies reveal the importance of market segmentation in rural tourism, which can play an important role not only for the marketing of rural tourism, but also for rural tourism and rural revitalization, etc., and cluster analysis has been applied in these studies.

In addition to market segmentation, rural tourism product development strategy is also important for the sustainable development of rural tourism, but only a few scholars have studied this field. Literature [31] examined the heterogeneity of rural domestic tourism consumption based on cluster analysis, revealing that providing different rural tourism products to groups of tourists plays an important role in improving the management and marketing of rural destinations. Literature [32] describes the main directions of tourism product strategy in Poland, emphasizing that the development of tourism products requires not only time, but also the incorporation of a unique image, which enhances the competitive advantage of the destination. Literature [33] presents two conceptual frameworks aimed at analyzing and understanding the characteristics and strategic choices related to tourism product development, diversification, etc. and proposes types of strategic choices for tourism product development and destination assembly based on tourism product characteristics.

This paper uses a survey of urban residents who take part in rural tourism as its sample data source, and adopts factor analysis together with an improved K-Means clustering method based on lion group optimization to segment the rural tourism market from the perspective of rural tourists. First, the analytical and computational steps of factor analysis are specified: the variance contribution rate of each factor is used as a weight to obtain the comprehensive evaluation index function, and the factor loadings, common factors and special factors are used to build the factor loading matrix and to calculate the variance contribution of each factor. Second, the fitness function is designed on the basis of the Euclidean distance. The lion group optimization algorithm then searches all class clusters for the optimal clustering centers; each fitness value corresponds to the position of a different lion, and the global optimal solution gives the clustering centers, which addresses the sensitivity of traditional clustering algorithms to the initial clustering centers. Finally, a motivational factor analysis of rural tourists is carried out, the tourists are clustered on the basis of the identified motivational factors, and the rural tourism market is segmented by analyzing the differences between the resulting clusters.

Research sample and methodology

Rural tourism is an important leisure mode for urban residents, but product homogeneity and content repetition lead to weak attraction and low revisit rate, forming an intrinsic demand for rural tourism crowd segmentation. In this paper, we will segment the Chinese rural tourism market and explore its rural tourism motivation. This chapter will discuss the sample and research methodology required for the rural tourism market segmentation study.

Research sample

The research sample of this paper comes from a survey of urban residents who take part in rural tourism. To ensure the validity of the survey, the formal survey was conducted in March 2024, a peak season for rural tourism, so that the interval between the tourism experience and the completion of the questionnaire was short and the responses were relatively reliable. The questionnaire consists of two parts: the tourism behavior characteristics of rural tourists, and the rural tourists’ motivation scale.

To ensure the representativeness of the sample, Beijing, Hunan and Guangdong were chosen to represent the northern, central and southern regions respectively, and the questionnaires were collected in two ways.

Online questionnaire survey

Through the online questionnaire sample service, online questionnaires were distributed in Beijing, Hunan and Guangdong, and at the same time, one screening item was set in the questionnaire, “whether you have participated in rural tourism in the previous 12 months”, and an attention test question was set in the questionnaire. Through the above screening and eliminating the questionnaires with too short filling time, 155, 153 and 156 valid questionnaires were obtained respectively, totaling 464 questionnaires.

Field survey

We visited rural tourist attractions in Beijing, Hunan and Guangdong and asked rural tourists to fill in the questionnaires on the spot, collecting 80, 90 and 92 valid questionnaires respectively, 262 in total. Together, the field survey and the online survey yielded 726 valid questionnaires.

Research methodology

The main data analysis methods used in this paper are factor analysis and an improved K-Means clustering method based on lion group optimization.

Factor analysis

Analytical steps of factor analysis [34]

The steps are as follows.

Factor analysis addresses two key issues: how to construct the factor variables, and how to interpret them. The model in this chapter is discussed around these two issues.

There are four steps in factor analysis:

Confirm that the original variables to be analyzed are suitable for factor analysis;

Construct the factor variables;

Rotate the factors so that the factor variables are interpretable;

Calculate the factor variable scores.

Calculation steps of factor analysis:

Standardize the raw data to eliminate differences in magnitude and scale between the variables.

Compute the correlation matrix of the standardized data.

Compute the eigenvalues and eigenvectors of the correlation matrix.

Calculate the variance contribution ratio and the cumulative variance contribution ratio.

Determine the factors. Let f1, f2, f3, …, fp be p factors. If the total amount of information contained in the first C factors, i.e., the cumulative contribution rate mentioned above, exceeds 80%, the first C factors can be extracted to represent the original data.

If the cumulative contribution of the extracted C factors does not exceed 70%, or the factors cannot be clearly identified and have no obvious practical meaning, the factors need to be rotated.

Compute the factor scores as linear combinations of the original variables. The scores can be estimated by regression estimation, Bartlett estimation or Thomson estimation.

Composite score

Using the variance contribution rate of each factor as the weight, the comprehensive evaluation index function is obtained as a linear combination of the factors: $$F = \frac{\lambda_1 f_1 + \lambda_2 f_2 + \lambda_3 f_3 + \cdots + \lambda_p f_p}{\lambda_1 + \lambda_2 + \lambda_3 + \cdots + \lambda_p}$$

where λi is the variance contribution ratio of the factor before or after rotation.

Ranking: the composite scores are used to rank the samples, and the ranking can then be compared with the original data.
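The composite score can be illustrated with a minimal Python sketch. The weights are the variance contribution rates that the paper reports for its five factors in Table 1; the factor scores of the two respondents and the function name `composite_score` are hypothetical and serve only as an illustration.

```python
def composite_score(factor_scores, contributions):
    """Weighted average of factor scores: F = sum(l_i * f_i) / sum(l_i)."""
    if len(factor_scores) != len(contributions):
        raise ValueError("one contribution rate is needed per factor")
    total = sum(contributions)
    return sum(l * f for l, f in zip(contributions, factor_scores)) / total


# Variance contribution rates (%) of the five factors reported in Table 1.
contributions = [31.82, 15.82, 11.31, 7.24, 7.30]

# Hypothetical factor scores for two respondents (illustration only).
respondents = {
    "A": [0.8, 0.1, -0.4, 0.3, 0.0],
    "B": [-0.2, 0.9, 0.5, -0.1, 0.4],
}

scores = {name: composite_score(f, contributions) for name, f in respondents.items()}
# Rank respondents from the highest to the lowest composite score.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because the weights are normalized by their sum, it makes no difference whether the contribution rates are given as percentages or as fractions.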

Mathematical model of factor analysis

The factor analysis model is described as follows:

The original m variables are expressed as linear combinations of p factors. Let the m original variables be x1, x2, x3, …, xm and the p factors (p < m) be f1, f2, f3, …, fp. The relationship between the factors and the original variables is: $$\begin{array}{l} x_1 = \partial_{11} f_1 + \partial_{12} f_2 + \partial_{13} f_3 + \cdots + \partial_{1p} f_p + \theta_1 \\ x_2 = \partial_{21} f_1 + \partial_{22} f_2 + \partial_{23} f_3 + \cdots + \partial_{2p} f_p + \theta_2 \\ \quad \vdots \\ x_m = \partial_{m1} f_1 + \partial_{m2} f_2 + \partial_{m3} f_3 + \cdots + \partial_{mp} f_p + \theta_m \end{array}$$

The coefficient ∂ij is the linear correlation coefficient between the i-th variable and the j-th factor; it reflects the degree of correlation between the variable and the factor and is called the factor loading. The factors fj appear in the linear combination of every original variable and are therefore called common factors. θi is a special factor, which represents the influence of everything other than the common factors on the i-th variable.

The above can be written in matrix form as: $$X = KF + \theta$$

where F = (f1, f2, f3, …, fp)T is the common factor vector, θ = (θ1, θ2, θ3, …, θm)T is the special factor vector, and K = (∂ij)m×p is the factor loading matrix [35].

It is usually assumed that: $$\begin{array}{l} E(F) = 0,\quad \operatorname{var}(F) = I_p \\ E(\theta) = 0,\quad \operatorname{var}(\theta) = D = \operatorname{diag}(\varepsilon_1^2, \varepsilon_2^2, \varepsilon_3^2, \ldots, \varepsilon_m^2) \\ \operatorname{cov}(F, \theta) = 0 \end{array}$$

Under these assumptions, the common factors are uncorrelated with each other and have unit variance, the special factors are uncorrelated with each other, and the special factors are uncorrelated with the common factors.

Communality: $$Z_i^2 = \sum\limits_{j = 1}^{p} \partial_{ij}^2 \quad (i = 1, 2, \ldots, m)$$

The communality is the contribution of the p common factors to the variance of the i-th variable xi, i.e., the extent to which the information in xi is explained by the p common factors.

Contribution of factor variance: $$H_j^2 = \sum\limits_{i = 1}^{m} \partial_{ij}^2 \quad (j = 1, 2, \ldots, p)$$

This is the sum of the variances that the j-th common factor contributes to all variables xi, and it reflects the relative importance of the j-th common factor. For example, in a factor analysis of student achievement, X = (X1, X2, X3, …, Xm)T can represent the scores of n students in m subjects as an m-dimensional random vector, and F = (f1, f2, f3, …, fp)T is a p-dimensional random vector of common factors whose practical meaning depends on the specific problem.
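As a small illustration of these two quantities, the sketch below computes them from a loading matrix with variables as rows and common factors as columns. The loading values and the function names are hypothetical. The last line assumes standardized variables, for which the special-factor variance is one minus the communality.

```python
def communalities(loadings):
    """Row sums of squared loadings: one value per original variable."""
    return [sum(a * a for a in row) for row in loadings]


def factor_contributions(loadings):
    """Column sums of squared loadings: one value per common factor."""
    n_factors = len(loadings[0])
    return [sum(row[j] ** 2 for row in loadings) for j in range(n_factors)]


# Hypothetical loading matrix K: m = 4 variables (rows), p = 2 factors (columns).
K = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.7],
    [0.2, 0.6],
]

Z2 = communalities(K)         # share of each variable's variance explained by the factors
H2 = factor_contributions(K)  # variance contributed by each common factor
# For standardized variables, the special-factor variance is 1 - communality.
specific = [1.0 - z for z in Z2]
```

Both quantities add up to the same total, because each sums every squared loading exactly once.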

Improved K-Means clustering method based on lion group optimization

Lion optimization algorithm

The lion optimization algorithm simulates the hunting habits of lions to search for an optimum. The initial best solution is taken as the position of the lion king, and the lionesses “hunt” in selected parts of the search space; if a lioness finds better prey, the lion king position is moved to the new location. Cubs that are driven out of the pride act as a perturbation factor because they must hunt independently; when the hunting position of such a cub is better than the current best position of the pride, the best position is updated, which promotes convergence dynamically. The principle of the lion optimization algorithm is as follows.

Suppose N lions form a population in a D-dimensional search space, and the number of adult lions is nLeader, with 2 ≤ nLeader ≤ N/2. The position of the i-th lion is xi = (xi1, xi2, ⋯, xiD), 1 ≤ i ≤ N. The number of adult lions is calculated as: $$nLeader = [N\beta ]$$

where β is the scale factor and N is the total number of lions.

In the lion optimization algorithm, the lion king, the lionesses and the cubs update their positions separately.

Lion king: $$x_i^{k + 1} = g^k \left( 1 + \gamma \left\| p_i^k - g^k \right\| \right)$$

Lioness: $$x_i^{k + 1} = \frac{p_i^k + p_c^k}{2}\left( 1 + \alpha_f \gamma \right)$$

Cub: $$x_i^{k + 1} = \left\{ \begin{array}{ll} \dfrac{g^k + p_i^k}{2}\left( 1 + \alpha_c \gamma \right), & 0 < q < \dfrac{1}{3} \\ \dfrac{p_m^k + p_i^k}{2}\left( 1 + \alpha_c \gamma \right), & \dfrac{1}{3} \le q < \dfrac{2}{3} \\ \dfrac{\bar g^k + p_i^k}{2}\left( 1 + \alpha_c \gamma \right), & \dfrac{2}{3} \le q < 1 \end{array} \right.$$

where $$g^k$$ is the optimal position of the group in generation k; γ is a random number in the range (0, 1); $$p_i^k$$ is the optimal position of the individual in generation k; $$p_c^k$$ is the historical optimal position of the lioness collaborator; αf and αc are perturbation factors; q is a probability factor; $$p_m^k$$ is the k-th-generation historical optimal position of the cubs that follow the lioness; and $$\bar g^k$$ is the position of the cubs that have been repelled.
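The three update rules can be written out as a short Python sketch. The function names are hypothetical. Reading [Nβ] as rounding down and clamping the result to the stated range 2 ≤ nLeader ≤ N/2 are assumptions made here, not details given in the text.

```python
import math
import random


def n_leader(n_lions, beta):
    """Number of adult lions, [N * beta], kept within 2 <= nLeader <= N / 2 (assumed)."""
    return max(2, min(int(n_lions * beta), n_lions // 2))


def update_king(g, p_i, gamma):
    """Lion king: x = g * (1 + gamma * ||p_i - g||)."""
    scale = 1.0 + gamma * math.dist(p_i, g)
    return [v * scale for v in g]


def update_lioness(p_i, p_c, alpha_f, gamma):
    """Lioness: x = (p_i + p_c) / 2 * (1 + alpha_f * gamma)."""
    scale = 1.0 + alpha_f * gamma
    return [(a + b) / 2.0 * scale for a, b in zip(p_i, p_c)]


def update_cub(p_i, g, p_m, g_bar, alpha_c, gamma, q):
    """Cub: the position it is averaged with depends on the probability factor q."""
    if q < 1.0 / 3.0:
        partner = g        # move toward the group optimum
    elif q < 2.0 / 3.0:
        partner = p_m      # move toward the position it follows
    else:
        partner = g_bar    # driven-away position
    scale = 1.0 + alpha_c * gamma
    return [(a + b) / 2.0 * scale for a, b in zip(partner, p_i)]


rng = random.Random(0)
gamma, q = rng.random(), rng.random()
# Hypothetical two-dimensional positions, for illustration only.
new_cub = update_cub([1.0, 1.0], [0.0, 2.0], [2.0, 0.0], [3.0, 3.0], 0.5, gamma, q)
```

When the lion king's own best position already equals the group optimum, the distance term vanishes and the king stays where it is.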

Improved K-Means clustering

In this paper, the K-Means clustering algorithm is improved based on the lion group optimization, and the optimal solution obtained by the lion group optimization algorithm, the lion king, is used as the clustering center, in order to solve the problem that the traditional clustering algorithm is sensitive to the initial clustering center [36].

K-Means clustering algorithm

The K-Means algorithm is a clustering algorithm based on the nearest-neighbor rule with a user-defined parameter K. Its purpose is to partition the N data objects in the data set into K mutually disjoint classes such that the similarity of the data within each class is high and the similarity between different classes is low. If $$S = \left\{ S_1, S_2, \cdots, S_N \right\}$$ is the sample data set to be processed, it is divided into K classes $$P = \left\{ P_1, P_2, \cdots, P_K \right\}$$, where $$1 < K \le N$$.

Fitness function design

The fitness function is derived from the objective function, and the fitness value measures, to some extent, the within-class similarity of the data objects [37]. The fitness value of the LSO-KM clustering algorithm is based on the Euclidean distance: it equals the sum of the Euclidean distances from all the data in a class cluster to the center of that cluster. The fitness function is: $$f(u_j) = \sum\limits_{i = 1}^{m} \left\| u_j - s_i \right\|$$

where m is the number of data objects in the data set; si is the i-th data object; and uj is the j-th clustering center.
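A minimal Python sketch of this fitness function follows. The data points and the two candidate centers are hypothetical; a smaller value means that the objects lie closer to the candidate center.

```python
import math


def fitness(center, members):
    """f(u_j): sum of Euclidean distances from the objects s_i to the center u_j."""
    return sum(math.dist(center, s) for s in members)


# Hypothetical two-dimensional cluster (illustration only).
cluster = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

f_middle = fitness((3.0, 4.0), cluster)  # candidate center in the middle of the cluster
f_edge = fitness((0.0, 0.0), cluster)    # candidate center at one end of the cluster
```

Here the central candidate gives the smaller fitness value, so it would be preferred as the clustering center.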

Determination of clustering center

Assume that the sample size is N and the number of clusters is K. The lion group optimization algorithm searches for the K optimal clustering centers in parallel across all class clusters. Each fitness value corresponds to the position of a different lion, and the global optimal position, i.e., the position with the smallest fitness value, gives the K clustering centers.

Once the global optimal solution (the clustering centers) is determined, the clusters are formed by the nearest-neighbor rule, which assigns each data object to the class closest to it: $$E(x_i) = \min\limits_{1 \le j \le K} \sqrt{(x_i - y_j)^2}$$

where xi is the i-th data object; K is the number of categories; and yj is the j-th clustering center.
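The nearest-neighbor rule can be sketched in a few lines of Python. The points and centers are hypothetical, and the function name `assign` is chosen only for this illustration.

```python
import math


def assign(points, centers):
    """Return, for each point, the index of its nearest clustering center."""
    labels = []
    for x in points:
        dists = [math.dist(x, y) for y in centers]
        labels.append(dists.index(min(dists)))  # ties go to the lower index
    return labels


# Hypothetical data objects and clustering centers (illustration only).
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centers = [(1.0, 1.5), (8.5, 9.0)]
labels = assign(points, centers)
```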

Steps of LSO-KM clustering algorithm

Step 1 Read the data. Preprocess the data set selected for the experiment.

Step 2 Input parameters. Input the parameters of the lion swarm optimization algorithm, such as the number of sample data N, the number of sample feature attributes dim, and the number of classes K in the current data set.

Step 3 Initialize the lion group. Randomly initialize the lion group within the parameter ranges; each lion has dimension K × dim, so that the N lions search for the optimal solution in parallel.

Step 4 Initialize the clustering centers. Calculate the numbers of lion kings, lionesses and cubs in the lion group from the formula nLeader = [Nβ], and initialize the lion king as the clustering centers of the K-Means clustering algorithm, where the lion king position is the best position in the initialized lion group.

Step 5 Update the individual optimum and the global optimum. Update the positions of the lion king, lionesses and cubs in turn. Calculate the fitness value at each current position, update the individual optimum, and then update the global optimal position.

Step 6 Check the termination condition. If the current iteration number has not reached Imax, increase it by 1 and return to Step 5; otherwise, go to Step 7.

Step 7 End the iteration. Output the optimal solution, i.e., the lion king and its global optimal position, which gives the K final clustering centers.

Step 8 Obtain the clustering result. Calculate the distance from each object to each clustering center and assign each object to the nearest class according to the nearest-neighbor rule; the clustering is complete once all sample data have been assigned to the K clustering centers.
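The steps can be put together in a compact Python sketch. It is a simplified illustration rather than the authors' implementation, and it rests on several assumptions that the text does not specify: each lion is seeded from K random data points, a move is kept only if it lowers the fitness, the perturbation factor shrinks linearly from 0.1, the driven-away position is the reflection of the group optimum within the data bounds, and the fitness of a lion is the total distance from every object to its nearest center. All names and default parameter values are hypothetical.

```python
import math
import random


def total_distance(flat, data, k, dim):
    """Fitness of one lion: distance from every object to its nearest center."""
    centers = [flat[j * dim:(j + 1) * dim] for j in range(k)]
    return sum(min(math.dist(x, c) for c in centers) for x in data)


def lso_kmeans(data, k, n_lions=20, beta=0.2, i_max=50, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])

    def fit(pos):
        return total_distance(pos, data, k, dim)

    # Coordinate bounds, used only for the assumed "driven away" position.
    low = [min(x[d] for x in data) for d in range(dim)] * k
    high = [max(x[d] for x in data) for d in range(dim)] * k
    # Step 3: each lion encodes K centers (K x dim values), seeded from the data.
    best = [[v for x in rng.sample(data, k) for v in x] for _ in range(n_lions)]
    # Step 4: number of adult lions and the initial lion king.
    n_leader = max(2, min(int(n_lions * beta), n_lions // 2))
    g = list(min(best, key=fit))
    for step in range(i_max):  # Steps 5 and 6
        alpha = 0.1 * (1.0 - step / i_max)  # assumed shrinking perturbation factor
        order = sorted(range(n_lions), key=lambda i: fit(best[i]))
        lionesses = order[1:n_leader]
        for rank, i in enumerate(order):
            gamma, q = rng.random(), rng.random()
            p_i = best[i]
            if rank == 0:  # lion king
                scale = 1.0 + gamma * math.dist(p_i, g)
                new = [v * scale for v in g]
            else:
                if rank < n_leader or 1 / 3 <= q < 2 / 3:
                    # lioness with a collaborator, or a cub following a lioness
                    partner = best[rng.choice(lionesses)]
                elif q < 1 / 3:  # cub following the lion king
                    partner = g
                else:  # cub driven away from the pride
                    partner = [lo + hi - v for lo, hi, v in zip(low, high, g)]
                new = [(a + b) / 2.0 * (1.0 + alpha * gamma)
                       for a, b in zip(partner, p_i)]
            if fit(new) < fit(p_i):  # keep a move only if it improves the lion
                best[i] = new
        g = list(min(best + [g], key=fit))
    # Steps 7 and 8: final centers and nearest-center assignment.
    centers = [g[j * dim:(j + 1) * dim] for j in range(k)]
    labels = [min(range(k), key=lambda j: math.dist(x, centers[j])) for x in data]
    return centers, labels, fit(g)


# Hypothetical data: two well-separated groups of three points each.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centers, labels, best_fitness = lso_kmeans(data, 2)
```

Because the group optimum is only replaced by a better position, the returned fitness never exceeds that of the best initial lion.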

Motivational Factors and Cluster Analysis of Rural Tourists

In this chapter, factor analysis and the improved K-Means clustering method based on lion group optimization are used to analyze the sample data of China’s rural tourism market.

Motivational factor analysis of rural tourists

Exploratory factor analysis was conducted on the initial tourism motivation scale using SPSS 24.0. Bartlett’s test of sphericity (χ² = 4188.765, df = 92, P < 0.01) and the KMO test (KMO = 0.792) indicated that the items share latent factors and that the data were suitable for factor analysis. The scale was therefore subjected to principal component analysis with varimax (maximum variance) rotation, factors with eigenvalues greater than 1 were extracted, and items with a cross-loading greater than 0.4 and a factor loading of less than 0.6 were deleted step by step, resulting in a 14-item, 5-factor tourism motivation scale. The details of the scale are shown in Table 1. The reliability analysis showed that Cronbach’s α was greater than 0.7 for every factor, the inter-item correlations were greater than 0.3, and the corrected item-total correlations were greater than 0.5, which indicates that the scale has good reliability and stability and that the factors have good internal consistency. According to the meanings of the items, the factors were named children’s education, leisure and relaxation, social interaction, entertainment and excitement, and exploration of new things. The five factors cumulatively explained 73.49% of the variance.

Table 1. Factor analysis of rural tourists’ motivation

| Factor / item | Factor loading | Eigenvalue | Variance explained (%) | Reliability coefficient | Mean |
| --- | --- | --- | --- | --- | --- |
| Children education | - | 4.452 | 31.82 | 0.896 | 5.89 |
| Let the children understand the farming culture | 0.923 | - | - | - | 5.95 |
| Let children feel the rural life | 0.864 | - | - | - | 5.97 |
| Let children increase their knowledge | 0.862 | - | - | - | 5.91 |
| Leisure relaxation | - | 2.206 | 15.82 | 0.748 | 5.91 |
| Relieve the pressure | 0.818 | - | - | - | 5.97 |
| Enjoy peace | 0.744 | - | - | - | 5.9 |
| Get rid of the monotonous daily life | 0.692 | - | - | - | 5.86 |
| Adjust the body and mind | 0.682 | - | - | - | 6.11 |
| Social interaction | - | 1.588 | 11.31 | 0.783 | 4.57 |
| Business / business / conference needs | 0.858 | - | - | - | 4.03 |
| Visiting relatives and friends | 0.816 | - | - | - | 4.63 |
| Make new friends | 0.734 | - | - | - | 4.82 |
| Entertainment excitement | - | 1.021 | 7.24 | 0.766 | 5.12 |
| Participate in exciting activities | 0.874 | - | - | - | 5.49 |
| Seek stimulation and excitement | 0.833 | - | - | - | 4.78 |
| Seeking new exploration | - | 1.014 | 7.3 | 0.767 | 5.45 |
| Explore new attractions / places | 0.851 | - | - | - | 5.3 |
| Learn new things | 0.826 | - | - | - | 5.55 |
| Cumulative variance explained (%) | - | - | 73.49 | - | - |

The children’s education motive has the largest variance contribution (31.82%) among all the motives, indicating that Chinese rural tourists differ markedly in this motive. Although previous studies have touched on motives related to children’s education, they did not extract children’s education as an independent motive but included it in study-related or family-related motives. The motive obtained in this paper reflects the traditional Chinese concept of “reading thousands of books and traveling thousands of miles”, shows the importance Chinese families attach to the education of their minor children, and differs markedly from rural tourism motives in other countries and regions. The leisure and relaxation factor has the highest mean value (5.91) and is the main motive for urban residents to take part in rural tourism. This is consistent with previous studies and corresponds to a core motive in motivation theory; it reflects the need to adjust daily life and confirms that rural tourism is an important form of short-distance weekend leisure for urban residents. The children’s education motive differs from the exploration motive in that the former expresses the hope that minor children in the family gain life experience and knowledge, while the latter is the travelers’ own desire for new things.

Cluster analysis of rural tourists

In the previous section, the factors were named children’s education, leisure and relaxation, social interaction, entertainment and excitement, and novelty seeking and exploration according to the meanings of their items. In this section, the improved K-Means clustering method based on lion group optimization is applied repeatedly to the five motivation factors, and the results are compared. A four-cluster solution proved the most suitable: the sample sizes of the clusters are reasonably balanced and the differences between the clusters are significant. The mean values of the motivation factors for each cluster are shown in Table 2. The ANOVA results show that the five motivation factors distinguish the four rural tourist clusters very well (P < 0.001). In addition, Scheffe’s post hoc test showed significant differences between the clusters, which further supports the validity of the four clusters. According to their motivation factor scores, the clusters were named the family education type, the leisure and relaxation type, the exploration and recreation type, and the all-around active type.

Table 2. Mean values of the motivation factors for each rural tourist cluster

| Factor | Family education type | Leisure relaxation type | Explore entertainment type | Full active type | Total | F-value |
| --- | --- | --- | --- | --- | --- | --- |
| Children education | 6.31 (H) | 6.03 (M) | 4.15 (L) | 6.32 (H) | 5.89 | 297.35*** |
| Leisure relaxation | 6.06 (H) | 5.65 (M) | 5.44 (L) | 6.24 (H) | 5.91 | 53.53*** |
| Social interaction | 3.76 (L) | 3.41 (L) | 4.35 (M) | 5.72 (H) | 4.57 | 330.62*** |
| Entertainment excitement | 5.22 (M) | 3.54 (L) | 4.81 (M) | 5.92 (H) | 5.12 | 243.28*** |
| Seeking new exploration | 5.53 (M) | 4.26 (L) | 4.98 (M) | 6.04 (H) | 5.45 | 150.57*** |

Taking the four clusters obtained from the cluster analysis as the dependent variable and the five motivational factors as the independent variables, discriminant analysis yielded three discriminant functions; the results are shown in Table 3. The standardized canonical discriminant function coefficients reflect the contribution of each motivational factor to each function, the Wilks’ Lambda test confirms that the five motivational factors are significant in constructing the discriminant functions, and the chi-square test confirms that all three discriminant functions are significant.

Discriminant analysis

- Function 1 Function 2 Function 3
Standardized typical difference function coefficient Children education 0.003 0.355 0.791
Leisure relaxation 0.062 0.744 0.456
Social interaction 0.652 27.6 5.8
Entertainment excitement 0.644 747.778 168.432
Seeking new exploration 0.456 0.003 0.505
Characteristic root 2.986 -0.277 -0.716
Variance explanation ( % ) 66.8 1.236 0.264
Canonical correlation 0.865 0 0
Wilks’ Lambda 0.088 0.109 -0.125
Chi-square 1745.636 -0.05 0.515
Degree of freedom 15 0.976 -0.153
Prominence 0 8 3

The classification results show that the discriminant functions categorize the tourists well; the final assessment of the clustering results is shown in Table 4. About 95% of the overall sample was correctly classified: 95.33% of the family education type, 90.48% of the leisure and relaxation type, 93.33% of the exploration and recreation type, and 96.99% of the fully active type were assigned to the correct category.
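The per-cluster and overall hit rates can be recomputed from the counts reported in Table 4. The short Python sketch below does this, assuming that the rows of the table are the original clusters and the columns are the classes assigned by the discriminant functions; the function names are hypothetical.

```python
# Counts from Table 4. Rows: original cluster; columns: assigned cluster (assumed).
cluster_names = [
    "Family education", "Leisure relaxation", "Explore entertainment", "Full active",
]
confusion = [
    [204, 2, 0, 8],
    [10, 114, 2, 0],
    [4, 2, 112, 2],
    [5, 0, 3, 258],
]


def per_class_accuracy(matrix):
    """Share of each row that falls on the diagonal."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Diagonal total divided by the grand total."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


per_class = per_class_accuracy(confusion)
overall = overall_accuracy(confusion)
```

With these counts, 688 of the 726 respondents lie on the diagonal, which is roughly 94.8% and in line with the reported overall rate of about 95%.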

Table 4. Results of evaluation and clustering

| Classification | Family education type | Leisure relaxation type | Explore entertainment type | Full active type | Total |
| --- | --- | --- | --- | --- | --- |
| Family education type | 204 (95.33%) | 2 (0.93%) | 0 (0%) | 8 (3.74%) | 214 (100%) |
| Leisure relaxation type | 10 (7.94%) | 114 (90.48%) | 2 (1.59%) | 0 (0%) | 126 (100%) |
| Explore entertainment type | 4 (3.33%) | 2 (1.67%) | 112 (93.33%) | 2 (1.67%) | 120 (100%) |
| Full active type | 5 (1.88%) | 0 (0%) | 3 (1.13%) | 258 (96.99%) | 266 (100%) |

Rural tourism market segmentation analysis based on cluster analysis

In the previous chapter, factor analysis and the improved K-Means clustering method based on lion group optimization were used to analyze the motivation factors and clusters of rural tourists, and four types of rural tourist clusters were obtained: family education, leisure and relaxation, exploration and recreation, and fully active.

This chapter will realize the segmentation of rural tourism market by analyzing the differences of different rural tourist clusters on the basis of rural tourist clustering segmentation.

Analysis of rural tourists’ motivation for traveling

The cross-tabulation and chi-square test of rural tourist clusters against their travel motivations are shown in Table 5. For the motivation of getting close to nature, the exploration and recreation type has the highest proportion (19.17%). For relaxation, the leisure and relaxation type has the highest proportion, 30.16% (38 people). For passing time, the proportion of the family education type is as high as 25.70%, markedly higher than that of the other groups. For consumption and shopping, the fully active type has the highest proportion (17.29%). The chi-square test is significant (P = 0.02 < 0.05), which shows that the travel motivations of the different types of rural tourists differ significantly.
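For readers unfamiliar with the test, the Pearson chi-square statistic for a cross-tabulation can be sketched in a few lines of Python. The 2 × 2 table below is hypothetical and is not taken from the survey; only the degrees-of-freedom rule, (rows - 1) × (columns - 1), is compared with the paper, where 8 motivations and 4 clusters give the reported df = 21.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical 2 x 2 table (illustration only).
stat, df = chi_square([[10, 20], [20, 10]])

# Degrees of freedom for a table shaped like Table 5 (8 motivations x 4 clusters).
_, df_table5 = chi_square([[1] * 4 for _ in range(8)])
```

The P-value is then read from the chi-square distribution with the computed degrees of freedom, a step that statistical packages such as SPSS perform automatically.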

Table 5. Visitor travel motivation

| Visitor travel motivation | Leisure relaxation type: N | Proportion | Explore entertainment type: N | Proportion | Family education type: N | Proportion | Full active type: N | Proportion |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Close to nature | 18 | 14.29% | 23 | 19.17% | 32 | 14.95% | 18 | 6.77% |
| Relax | 38 | 30.16% | 24 | 20.00% | 32 | 14.95% | 38 | 14.29% |
| Experience rural life | 9 | 7.14% | 10 | 8.33% | 23 | 10.75% | 23 | 8.65% |
| Enhance relationships with friends and family | 22 | 17.46% | 24 | 20.00% | 43 | 20.09% | 41 | 15.41% |
| Walk at will | 10 | 7.94% | 18 | 15.00% | 14 | 6.54% | 43 | 16.17% |
| Spend time | 15 | 11.90% | 12 | 10.00% | 55 | 25.70% | 38 | 14.29% |
| Consumer shopping | 12 | 9.52% | 3 | 2.50% | 9 | 4.21% | 46 | 17.29% |
| Knowledge and learning | 2 | 1.59% | 6 | 5.00% | 6 | 2.80% | 19 | 7.14% |

Pearson chi-square: χ² = 34.806, df = 21, P = 0.02

Analysis of Differences in Information Collection Patterns of Rural Tourists

The cross-tabulation and chi-square test of the different types of rural tourists against their information sources are shown in Table 6. For recommendations from relatives and friends, the family education type has the highest proportion, 59.81% (128 people), followed by the leisure and relaxation type (51.59%). For radio and television, the proportion of the fully active type is somewhat higher, at 24.06% (64 people), while the proportions of the other groups differ little. For newspapers and magazines, the family education type has the highest proportion, 19.63% (42 people), which is comparable to the 19.55% of the fully active type but clearly higher than that of the leisure and relaxation type. For internet searches, the fully active type has the highest proportion, 30.45% (81 people), followed by the exploration and recreation type and the leisure and relaxation type, while the family education type uses this channel least. The chi-square test is significant (P = 0.003 < 0.05), so the market segments differ significantly in how they gather information.

Forms of information gathering

| Forms of information gathering | Leisure relaxation type: N | Proportion | Explore entertainment type: N | Proportion | Family education type: N | Proportion | Full active type: N | Proportion |
|---|---|---|---|---|---|---|---|---|
| Introduction of relatives and friends | 65 | 51.59% | 53 | 44.17% | 128 | 59.81% | 64 | 24.06% |
| Radio and television | 20 | 15.87% | 18 | 15.00% | 36 | 16.82% | 64 | 24.06% |
| Newspapers and magazines | 11 | 8.73% | 16 | 13.33% | 42 | 19.63% | 52 | 19.55% |
| Network | 24 | 19.05% | 26 | 21.67% | 0 | 0.00% | 81 | 30.45% |
| Tourism brochure | 6 | 4.76% | 7 | 5.83% | 8 | 3.74% | 5 | 1.88% |

Pearson chi-square: χ² = 33.715, df = 16, P = 0.003
Analysis of differences in assessment guidelines for rural tourists

To test how the rural tourist clusters differ in pre-purchase evaluation, this section analyzes the differences between the clusters on each pre-purchase evaluation criterion; the results are shown in Table 7. The family education type values convenient transportation more than the other clusters, with the lowest mean value of 1.36. The fully active type, on the other hand, values several criteria more than the other clusters, including rich flora and fauna ecological resources (1.49), a beautiful rural natural environment (1.17), dining with special flavors (1.8), the fame of the village (2.59), and diversified experiential activities (1.62). Across the market segments, the clusters do not differ significantly on the two criteria of reasonable price and service quality (P>0.05), while they differ significantly on convenient transportation, rich flora and fauna ecological resources, beautiful rural natural environment, diversified experiential activities, perfect recreational facilities, dining with special flavors, the fame of the village, and environmental sanitation (P<0.05).

Difference analysis

| Evaluation criteria | Leisure relaxation type | Explore entertainment type | Family education type | Full active type | F | P |
|---|---|---|---|---|---|---|
| Traffic convenience | 2.11 | 1.82 | 1.36 | 2.61 | 8.915 | 0.001 |
| Rich animal and plant ecological resources | 1.83 | 1.75 | 2.49 | 1.49 | 7.861 | 0.006 |
| Beautiful rural natural environment | 1.63 | 1.51 | 2.26 | 1.17 | 15.737 | 0.001 |
| Reasonable price | 2.11 | 1.94 | 1.6 | 2.1 | 2.274 | 0.088 |
| The fame of the village | 3.55 | 2.99 | 3.92 | 2.59 | 13.999 | 0.006 |
| Dining with special flavors | 2.22 | 2.24 | 3.16 | 1.8 | 8.108 | 0.002 |
| Perfect recreation facilities | 2.39 | 1.92 | 3.83 | 1.72 | 28.588 | 0.005 |
| Service quality | 1.8 | 1.57 | 1.91 | 1.63 | 1.455 | 0.248 |
| Environmental sanitation | 1.42 | 1.49 | 1.97 | 1.49 | 3.05 | 0.032 |
| Diversified experience activities | 2.41 | 2 | 3.78 | 1.62 | 25.299 | 0.007 |

Note: Means range from 1 to 5; the lower the score, the higher the importance.
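The F and P columns are consistent with a one-way analysis of variance across the four clusters, although the paper does not name the test. A minimal Python sketch of the F statistic under that assumption follows; the ratings are hypothetical placeholders on the 1~5 scale, not survey data, and `one_way_anova_f` is an illustrative name.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical importance ratings for one criterion (1 = most important).
hypothetical_ratings = {
    "leisure relaxation": [2, 2, 3, 1, 2],
    "explore entertainment": [2, 1, 2, 2, 3],
    "family education": [1, 1, 2, 1, 2],
    "full active": [3, 2, 3, 3, 2],
}

f_stat, df_b, df_w = one_way_anova_f(list(hypothetical_ratings.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A large F means the cluster means differ by more than the spread within clusters would suggest; the P value then comes from the F distribution with the two degrees of freedom shown.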
Analysis of differences in post-purchase behavior of rural tourists

The differences in post-purchase evaluation among the tourist clusters are shown in Table 8. The family education type reports higher satisfaction, willingness to revisit, and willingness to recommend rural tourism than the other clusters, with mean values of 1.69, 1.22, and 1.22, respectively (lower scores are more favorable), while the fully active type reports the lowest on all three (2.66, 2.48, and 2.52). The clusters differ significantly on all three measures (P<0.05), which means that post-purchase behavior differs significantly across the market segments.

Post-purchase evaluation

| Post-purchase assessment | Leisure relaxation type | Explore entertainment type | Family education type | Full active type | F | P |
|---|---|---|---|---|---|---|
| Overall satisfaction | 2.26 | 2.48 | 1.69 | 2.66 | 9.721 | 0.005 |
| Willingness to revisit | 2.11 | 2.4 | 1.22 | 2.48 | 11.548 | 0.009 |
| Willingness to recommend | 2.14 | 2.42 | 1.22 | 2.52 | 11.062 | 0.003 |

Note: Means range from 1 to 5; the lower the score, the more favorable the evaluation.
Optimal strategy for rural tourism product development

Building on the above analysis of rural tourism market segmentation across the different rural tourist clusters, this section divides the initial development of rural tourism products into three major product types and proposes the optimal strategy for rural tourism product development.

Idyllic tourism products

This kind of product uses the distinctive natural environment of the countryside to offer activities such as enjoying idyllic scenery and strolling around villages. With the development of domestic rural tourism, travelers are increasingly keen on countryside tourism that goes “back to basics, back to nature”. The city has a number of rural areas with favorable conditions and a beautiful, quiet natural environment, such as the town of Datong; through carefully designed tourism activities, these can attract city dwellers who want to relax, relieve stress, or take a weekend sightseeing getaway. In designing and developing such products, the individual villages should be linked into lines or areas to form clusters of tourist villages. This not only avoids the damage to environmental resources caused by a large influx of tourists into individual small villages, but also prolongs tourists’ visits and builds the advantages of a relatively independent rural tourist resort.

Participatory agro-tourism products

The development of these products should dig deeply into the cultural connotations of local folklore to form tourism products with a strong local flavor and authentic farm life, taking a boutique agro-entertainment route. Nongjiale was the earliest form of rural tourism to appear, and for most travelers rural tourism simply means Nongjiale, generally as one-day or weekend tours; its consumer groups have a high rate of revisiting the same place. Nongjiale is the earliest form but not the best. At present, Nongjiale products are of a single type, with low service quality and simple facilities; they tend to resemble small suburban hotels, local culture is not prominent, and high-quality products are lacking. Most Nongjiale products use the countryside merely as a setting for chess, cards and chatting, which gives the impression of carrying on city life in the countryside. For this reason, the design of agro-entertainment products should dig deeply into rural cultural characteristics and, while stabilizing existing consumer groups, develop a variety of participatory products, such as planting vegetables, making farm food, planting flowers, cutting and transplanting rice, weeding and fertilizing, fishing and shrimping, harvesting crops, picking fresh fruit, and making and tasting local specialties. Such products show the authenticity of rural life, create a local atmosphere, add cultural content, knowledge and entertainment, retain tourists, and establish a reputation for high quality.

Tourism products for experiencing countryside and folklore

This kind of product attracts urban cultural tourists whose main motive for traveling is to experience the cultural differences between urban and rural areas. As the domestic tourism market shifts toward sightseeing vacations and experience vacations, such products have very good market prospects; coupled with the annual folk culture festival, the distinctive folk culture and farming culture will attract a large number of domestic and foreign cultural tourists.

Conclusion

This paper uses factor analysis and an improved K-Means clustering method based on lion group optimization to segment the tourist market, drawing on sample data from a survey of urban residents’ rural tourism, and explores the behavioral patterns and other characteristics of rural tourists in the different market segments.

In the analysis of rural tourists’ motivation, five factors are identified: children’s education, leisure and relaxation, social interaction, entertainment and excitement, and exploration. Together the five factors explain 73.49% of the variance, and the children’s education factor has the largest variance contribution rate (31.82%). Based on the five motivation factors, repeated rapid clustering classifies rural tourists into four clusters: the family education type, the leisure and relaxation type, the exploration and entertainment type, and the fully active type. In the discriminant analysis, samples of these four types are allocated accurately for the most part, with correct allocation rates of 95.33%, 90.48%, 93.33%, and 96.99%, respectively.
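The four per-cluster rates can be combined into a single overall allocation rate by weighting each by its cluster size. The Python sketch below does this, assuming the cluster sizes equal the column totals of the N columns in the tables above (214, 126, 120 and 266, or 726 in all); the overall figure is derived here and is not reported in the paper.

```python
# Cluster sizes: assumed to be the column totals of the N columns in the tables.
cluster_sizes = {
    "family education": 214,
    "leisure and relaxation": 126,
    "exploration and entertainment": 120,
    "fully active": 266,
}

# Per-cluster correct-allocation rates reported for the discriminant analysis.
reported_rates = {
    "family education": 0.9533,
    "leisure and relaxation": 0.9048,
    "exploration and entertainment": 0.9333,
    "fully active": 0.9699,
}

# Recover the implied number of correctly allocated tourists in each cluster.
correct_counts = {
    name: round(cluster_sizes[name] * rate)
    for name, rate in reported_rates.items()
}

total = sum(cluster_sizes.values())
total_correct = sum(correct_counts.values())
overall_rate = total_correct / total
print(f"{total_correct} of {total} correct, overall rate {overall_rate:.2%}")
# prints: 688 of 726 correct, overall rate 94.77%
```

Each reported rate corresponds to a whole number of tourists under this assumption (for example 204/214 = 95.33%), which supports reading the column totals as the cluster sizes.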

On the basis of this cluster segmentation, difference analyses of the rural tourist clusters are carried out to segment the rural tourism market. For tourism motivation and information collection pattern, the chi-square tests give P=0.02 and P=0.003, respectively, both significant (P<0.05), showing that the clusters differ significantly in tourism motivation and information collection pattern. For the evaluation criteria, the clusters differ significantly (P<0.05) on every criterion except reasonable price and service quality, that is, on diversified experiential activities, perfect recreational facilities, dining with special flavors, the fame of the village, environmental sanitation, convenient transportation, rich flora and fauna ecological resources, and beautiful rural natural environment. For post-purchase behavior, the clusters differ significantly in satisfaction, willingness to revisit, and willingness to recommend rural tourism (P<0.05).

Finally, based on the analysis of rural tourism market segmentation, the optimal strategy for the development of rural tourism products is proposed to provide reference for tourism enterprises and local governments in the development of rural tourism projects.
