Research on Precision Marketing Strategy of Rural Tourism Combining Big Data and Cloud Computing Technology
Published: 25 Sep 2025
Received: 27 Jan 2025
Accepted: 29 Apr 2025
DOI: https://doi.org/10.2478/amns-2025-1014
© 2025 Qingqing Sang and Yu Hu, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Rural tourism is increasingly favored by travelers and has become an important part of the tourism industry. In recent years, as urban-rural integration has accelerated and living standards in China have improved, enthusiasm for rural tourism has grown [1-3]. Rural tourism is not only a way to relax, return to nature, and experience country life and scenery; it also promotes the development of the rural economy and the economic growth and cultural integration of urban and rural areas [4-6]. In general, rural tourism is distributed mainly in suburban areas and in scenic spots far from its source markets [7]. Its development has brought economic benefits to the countryside, contributed to solving the “three rural” problem (agriculture, rural areas, and farmers), and satisfied the psychological need of urban and rural residents to return to nature and experience traditional ways of life [8-10]. Vigorously developing rural tourism is therefore of great significance for rural economic development.
A development model led by rural tourism can greatly increase the income of local residents and reduce their economic burden [11-12]. Demand-oriented analysis of rural tourism, with innovation aimed at people’s actual needs and personal preferences, can change the traditional mode of rural tourism and further promote the rural economy. Compared with urban tourism, rural tourism started late, has backward facilities and equipment, and is operated in a rough manner [13-15]. By accurately selecting target tourist groups, grasping tourists’ needs, and improving marketing efficiency and accuracy, precision marketing based on big data analysis can help rural tourism solve these operational problems and contribute to rural revitalization [16-17].
Precision marketing is a marketing concept based on digital technology. Building on data analysis and processing, it accurately selects the target market of rural tourism, draws a digital portrait of the target tourists, and analyzes their demand, so that marketing strategies and means become more demand-oriented and precisely targeted while marketing costs fall, market efficiency rises, and blind marketing is avoided [18-21]. Big data lays the data foundation for precision marketing: with the deep application of the Internet, large amounts of personal, business, and social data are generated and accumulated, and China has entered the era of big data [22-23]. Big data refers to data sets whose size exceeds the ability of ordinary database software tools to capture, store, manage, and analyze [24]. Data in this era are characterized by variety, volume, value, and velocity. The generation of large amounts of identity and behavioral data on rural tourists, together with improved data analysis and processing technology, provides a scientific basis for the precision marketing of rural tourism [25-26].
Literature [27] designed a marketing strategy framework for rural tourism attractions centered on big data and artificial intelligence, which helps enhance tourists’ rural tourism experience. Literature [28] constructed an online marketing model based on villagers’ willingness and found that a southern and southeastern region possesses advantages in e-marketing, which can be vigorously developed to promote employment. Literature [29] proposed a tourism marketing strategy, based on an empirical SWOT analysis, for tourism streets in Asahan County, Basilmandaji City, including exploring the potential of the community, optimizing network signals, and improving regional access streets. Literature [30] proposed a rural tourism cloud data architecture based on Internet platforms and technologies, aimed at improving tourism efficiency and assisting the development of precise tourism marketing strategies, and verified its feasibility through simulation experiments. Literature [31] used a neural network algorithm to extract seven indicators affecting inbound travel and built and tested an analytical model; the study provides an important basis for tourism marketing strategy and pricing policy. Literature [32] examined the underlying mechanisms and influencing factors of the innovation and reform that information technology brings to the tourism market, mainly in tourism operation, marketing, and the strategic development of the tourism industry, which helps explain how tourism technology promotes the informatization and intelligent construction of the industry. Literature [33] used big data analysis to identify consumer motivations, consumer preferences, and the factors affecting consumer satisfaction in rural tourism, and on this basis segmented customer groups and developed tourism marketing strategies.
This paper discusses how cloud computing technology can be combined with rural tourism marketing. Building on K-means, a density-based method is introduced to determine the initial cluster centers, and the triangle inequality is used to reduce the number of comparisons and calculations. A collaborative filtering algorithm is used to realize precision marketing of rural tourism: on the basis of user ratings, the Euclidean distance is used to calculate the similarity of tourist attractions and of user ratings, and a recommendation list is generated to complete the precise recommendation of rural tourist attractions. The improved K-means algorithm is used to partition tourist behavior and analyze the basic preferences and attributes of tourists. The marketing strategy is determined from the clustering results, and finally the performance of the recommendation results is evaluated and analyzed.
Mobile cloud computing usually takes the form of “cloud computing + Internet + mobile terminal” to constitute a management and service platform for various industries, through which end users can obtain the required infrastructure services, software and application services, and resource information on demand, conveniently and quickly [34]. Building a dedicated industrial cloud service platform with mobile cloud computing and big data technology is one option; the following four promotion methods can also be applied.
(1) Build a dedicated rural leisure tourism cloud service platform on the region’s official tourism website, and display the region’s rural leisure tourism resources comprehensively by means of “portal website + APP”.
(2) Open a “Rural Leisure Tourism Resource Product Chain” display and service section in the tourism channel of well-known comprehensive portal websites.
(3) Display the region’s quality rural leisure tourism resources and services, such as a “high-quality products chain” and a “new products chain”, on well-known tourism intermediary service websites.
(4) Selectively display and promote distinctive rural leisure tourism resources and product chains on well-known Chinese specialty tourism websites.
Different types of rural leisure tourism differ in what their construction involves, but the construction goal is the same: an innovative development concept centered on the six elements of tourism, “food, accommodation, transportation, sightseeing, entertainment, and shopping”.
In the mobile cloud computing environment, create a cloud service platform for the rural leisure tourism industry; provide precise interaction and docking channels for the demand side, the supply side, and the regulator; establish supply-side and demand-side product chains; and on this basis carry out supply and demand analysis, hot-sales analysis, and predictive analysis of products. The cloud service platform is then relied on to realize e-commerce for leisure tourism products.
Through the platform, service providers obtain information on actual and potential customers of leisure tourism products. A large amount of industry data accumulates on the cloud service platform, and industry managers, service providers, and leisure tourists share these data resources. Tourists thereby take part in the product design and innovation of service enterprises, playing the role of “cooperative producers”, which is the basis for customer interface innovation.
High-quality service products based on new service concepts, new service ideas, new service processes, and new customer interfaces are delivered to consumers through new service delivery systems. Because e-commerce for leisure tourism products changes the mode and process of business transactions, the internal organization and staff capabilities of the service provider should change accordingly, and the provider should organize, manage, and coordinate all the links.
New technologies such as mobile cloud computing and big data analysis can inject new momentum into the rapid development of the rural service industry; they are platforms, tools, methods, and means for improving service quality. The innovative design adopts the mode of “cloud platform + Internet + mobile terminal” to provide a platform on which the supply side, demand side, and regulatory side of the rural leisure tourism industry can fully connect, exchange, and interact.
The K-means algorithm is a centroid-based technique among partitioning clustering methods, which uses
The K-means algorithm is described as follows: input: dataset containing
The steps of the algorithm are as follows:
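Initialize the cluster centers.
REPEAT
FOR each input vector: assign it to the cluster with the nearest center.
FOR each cluster: update the cluster center to the current centroid of all samples in that cluster.
Compute the criterion function.
UNTIL the criterion function no longer changes.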
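The loop above can be illustrated with a minimal sketch of standard K-means. It assumes random initial centers and the sum of squared errors as the criterion function; the function name `kmeans`, the parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means on a list of equal-length tuples (random initial centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    prev_sse = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[j].append(p)
        # Update step: each center moves to the centroid of its members.
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its old center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
        # Criterion function: sum of squared distances to the updated centers.
        sse = sum(math.dist(p, centers[j]) ** 2
                  for j, members in enumerate(clusters) for p in members)
        if prev_sse is not None and abs(prev_sse - sse) < 1e-12:
            break  # criterion no longer changes
        prev_sse = sse
    return centers, clusters, sse

# Hypothetical toy data: two well-separated groups of three points each.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters, sse = kmeans(pts, k=2)
print(sorted(len(c) for c in clusters), round(sse, 4))
```

Because the initial centers are drawn at random, the result on harder data depends on the seed, which is the sensitivity that motivates a better initialization.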
Although the K-means algorithm is widely used, it has several drawbacks: (1) It is more difficult to choose a suitable
To address the shortcomings of (1) (3) in the above algorithms, this paper introduces a density-based method to determine the initial centroid, thus making up for the fact that the K-means algorithm is only suitable for solving the problem of data types with convex distributions. For the shortcomings of (4), this paper introduces the theory of geometric triangle trilateral relationship to simplify the number of comparisons and calculations.
The basic idea of the density-based approach is to add the density of points in a single region to clusters similar to it whenever it is greater than a certain threshold. The traditional
Definition 1: The distance formula of 2 objects is:
where
Definition 2:
Definition 3: Core object: an object is said to be core if the
The improved density-based centroid initialization algorithm is described as follows: input: a dataset containing
The steps of the algorithm are as follows:
Calculate the distance Calculate the Find the object Calculate the distance between Find the object Continue to find object
In the traditional K-means algorithm, the time complexity of the process of attributing each object to the class of its nearest center is
Since the K-means algorithm uses the Euclidean distance to measure the similarity between objects, this paper uses the triangle inequality (the sum of any two sides of a triangle is greater than the third side) to simplify the calculation. One iteration of the algorithm proceeds as follows:
Calculate the distance Calculate the distance Continue step 2) until
The time complexity of this improved algorithm is
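As an illustration of how the triangle inequality can remove distance computations, the sketch below uses one standard form of the idea: if the distance between two centers is at least twice the distance from a point to the first center, the second center cannot be closer. This is an assumed variant for illustration, not necessarily the exact procedure of this paper; `assign_with_pruning` and the sample data are hypothetical.

```python
import math

def assign_with_pruning(points, centers):
    """Assign each point to its nearest center, skipping distance
    computations that the triangle inequality proves unnecessary."""
    k = len(centers)
    # Center-to-center distances, computed once per iteration.
    cc = [[math.dist(centers[a], centers[b]) for b in range(k)] for a in range(k)]
    labels, skipped = [], 0
    for p in points:
        best, best_d = 0, math.dist(p, centers[0])
        for j in range(1, k):
            # If d(c_best, c_j) >= 2 * d(p, c_best), then d(p, c_j) >= d(p, c_best),
            # so center j cannot be strictly closer and is skipped.
            if cc[best][j] >= 2 * best_d:
                skipped += 1
                continue
            d = math.dist(p, centers[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped

# Hypothetical example with three centers and three points.
centers = [(0, 0), (10, 0), (0, 10)]
points = [(1, 0), (9, 1), (0, 8)]
print(assign_with_pruning(points, centers))
```

The labels are the same as those from an exhaustive comparison; only the number of point-to-center distance evaluations changes.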
In real life, when choosing goods, we often first consult people around us who have purchasing experience, or people with similar interests, and use their advice to make a suitable choice. User-based collaborative filtering algorithms mine and analyze user information and find similar users from the results. Similarity is computed, after preprocessing, from users’ explicit or implicit behavior, such as ratings, retweets, saves, tags, comments, clicks, page dwell time, and purchases; the higher the computed similarity between two users, the more similar their interests are taken to be [36]. The algorithm proceeds as follows:
1) Transform the collected user information into a two-dimensional matrix for numerical representation, and perform noise reduction and normalization on the data.
2) Construct a user-item rating matrix from the preprocessed data, and obtain a TOP-N list of similar users by applying the similarity formula.
3) Obtain the rating data of the similar users, and calculate the weighted average to get the predicted rating of the target user.
4) Sort by predicted rating and take the TOP-N items as the recommendation for the user.
The basic formula of user-based collaborative filtering algorithm is shown in (3):
Because of their characteristics, user-based collaborative filtering algorithms suit cases with a small number of users; when the user set is too large, computing similarity over the user matrix becomes costly.
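A minimal sketch of the user-based procedure follows. It assumes ratings stored as nested dictionaries and a similarity of 1 / (1 + d), where d is the Euclidean distance over co-rated items; this mapping, the function names, and the sample ratings are illustrative assumptions rather than the paper’s formula (3).

```python
import math

def euclid_sim(u, v):
    """Similarity from Euclidean distance over co-rated items: 1 / (1 + d)."""
    common = [i for i in u if i in v]
    if not common:
        return 0.0
    d = math.sqrt(sum((u[i] - v[i]) ** 2 for i in common))
    return 1.0 / (1.0 + d)

def predict(ratings, target, item, top_n=2):
    """Predict the target user's rating of an item as a similarity-weighted
    average over the top_n most similar users who rated that item."""
    neighbours = sorted(
        ((euclid_sim(ratings[target], r), r[item])
         for u, r in ratings.items() if u != target and item in r),
        reverse=True)[:top_n]
    total = sum(s for s, _ in neighbours)
    return sum(s * x for s, x in neighbours) / total if total else None

# Hypothetical ratings: user "a" has not rated attraction "z".
ratings = {
    "a": {"x": 5, "y": 3},
    "b": {"x": 5, "y": 3, "z": 4},
    "c": {"x": 1, "y": 5, "z": 2},
}
print(predict(ratings, "a", "z"))
```

Here user "b" rates exactly like "a" on the shared items, so the prediction is pulled toward b’s rating of 4 rather than c’s rating of 2.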
Item-based collaborative filtering is more widely used in practice. It does not use item attributes; instead it calculates the similarity between items from user ratings, which simply means recommending items similar to those the user has liked. The basic formula is shown in (4):
The recommendation process is broadly similar to that of the user-based algorithm. The only difference is that, after the user-item rating matrix is built, this algorithm calculates the similarity between items, determines from the user’s historical data which items the user has liked, and recommends the TOP-N list of items most similar to them to the target user.
When a system has far fewer items than users and the number of items is unlikely to change much, item-based collaborative filtering is the more suitable choice.
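The item-based variant can be sketched in the same way. The example below assumes the same nested-dictionary ratings and the same 1 / (1 + d) similarity, now computed between item columns; the attraction names and helper functions are hypothetical.

```python
import math

def item_vectors(ratings):
    """Turn {user: {item: rating}} into {item: {user: rating}}."""
    items = {}
    for user, row in ratings.items():
        for item, r in row.items():
            items.setdefault(item, {})[user] = r
    return items

def similar_items(ratings, liked, top_n=2):
    """Rank other items by Euclidean-based similarity to a liked item,
    using only the users who rated both items."""
    items = item_vectors(ratings)
    scores = []
    for other, col in items.items():
        if other == liked:
            continue
        common = [u for u in col if u in items[liked]]
        if not common:
            continue
        d = math.sqrt(sum((col[u] - items[liked][u]) ** 2 for u in common))
        scores.append((1.0 / (1.0 + d), other))
    return [name for _, name in sorted(scores, reverse=True)[:top_n]]

# Hypothetical ratings of three attractions by three users.
ratings = {
    "u1": {"lake": 5, "farm": 5, "cave": 1},
    "u2": {"lake": 4, "farm": 4, "cave": 2},
    "u3": {"lake": 1, "farm": 2, "cave": 5},
}
print(similar_items(ratings, "lake"))
```

Since the item set is small and stable, the item-item similarities could be computed once and reused across users, which is the practical advantage described above.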
In this paper, Euclidean distance is used for similarity calculation. Euclidean distance is the true (absolute) distance between points in a multidimensional vector space, and this distance between individuals is used to determine how similar they are. Using it requires that the two points be measured on the same scale. Cosine similarity measures whether vectors point in the same direction, whereas Euclidean distance measures the true distance between points, so it is more applicable than cosine similarity when user behavior is the indicator for inter-user similarity. Vector
When user similarity is calculated from ratings, the Euclidean distance emphasizes how closely the users’ rating values agree, while the cosine similarity better distinguishes differences in users’ rating tendencies, i.e., the rating hierarchy.
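The difference between the two measures can be seen in a small worked example. The two rating vectors below are hypothetical: the second user follows the same pattern as the first but gives every attraction twice the score.

```python
import math

def euclidean(a, b):
    """Absolute distance between two rating vectors."""
    return math.dist(a, b)

def cosine(a, b):
    """Cosine of the angle between two rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

harsh = (1, 2, 1)      # hypothetical ratings of three attractions
generous = (2, 4, 2)   # same pattern, every rating doubled

print(cosine(harsh, generous))     # direction is identical
print(euclidean(harsh, generous))  # rating values still differ
```

Cosine similarity treats these two users as identical because the vectors are parallel, while the Euclidean distance of sqrt(6) records that their actual rating values do not agree.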
Because the number of tourist attractions changes little and is much smaller than the number of users, and because the resulting attraction-evaluation index matrix is dense, with values present for essentially all variables, the Euclidean distance is chosen for attraction similarity. The algorithm operates on the user-item matrix. After the weight of each item of the evaluation index system is determined, a 549×12 attraction-evaluation indicator matrix is constructed from relevant data crawled from the web:
Where each row represents the data of an attraction and the columns represent the data of the evaluation indicators that have been constructed, multiplying the data of each column with the weight of that indicator to obtain the Attraction-Weight Indicator Matrix
The similarity of the attraction data results in the matrix
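The weighting and similarity steps can be sketched on a toy matrix. The 3×2 matrix and the weights below are made-up illustrative values (the paper’s matrix is 549×12), and the mapping from distance to similarity, 1 / (1 + d), is an assumption.

```python
import math

def weighted_matrix(matrix, weights):
    """Multiply every column of the attraction-indicator matrix by its weight."""
    return [[x * w for x, w in zip(row, weights)] for row in matrix]

def similarity_matrix(rows):
    """Pairwise similarity 1 / (1 + Euclidean distance) between rows."""
    return [[1.0 / (1.0 + math.dist(a, b)) for b in rows] for a in rows]

# Hypothetical data: 3 attractions scored on 2 evaluation indicators.
indicators = [[1, 2], [1, 4], [3, 2]]
weights = [0.5, 0.5]

weighted = weighted_matrix(indicators, weights)
sim = similarity_matrix(weighted)
for row in sim:
    print([round(v, 3) for v in row])
```

The resulting matrix is symmetric with ones on the diagonal, so only the upper triangle needs to be stored in practice.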
Construct a user-attraction rating matrix based on user ratings of attractions:
Here each user is a
Assuming that the number of attractions in the final recommended list
In this paper, improved K-means clustering is used to classify the behavior and preference patterns of tourists’ tourism consumption. Clustering is performed on 35 variables, which record each respondent’s performance on 35 indicators of tourist behavior. The clustering objects are 255 tourists who have traveled or are traveling and consuming in Hangzhou. Based on the questionnaire statistics and the stages of tourist behavior, K is set to 4, so that the samples are distributed more evenly across clusters. After the K-means clustering and ANOVA test, the resulting partition of tourist behavior under the 35 indicators is shown in Table 1.
Table 1. Results of tourist behavior clustering (n=255)
| Indicator | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | -7 | -8 | 9 | -2 | 5.543 | 0.007 |
| 2 | -6 | 17 | -7 | 15 | 8.025 | 0.000 |
| 3 | -16 | 14 | 3 | 6 | 4.141 | 0.014 |
| 4 | 12 | 3 | 3 | -30 | 15.076 | 0.000 |
| 5 | -2 | 4 | -1 | -1 | 0.191 | 0.002 |
| … | … | … | … | … | … | … |
| 31 | -20 | 4 | 2 | 4 | 2.673 | 0.008 |
| 32 | -23 | -7 | 6 | 25 | 4.963 | 0.000 |
| 33 | 2 | -8 | -6 | -8 | 19.907 | 0.004 |
| 34 | -20 | 6 | 12 | -4 | 5.092 | 0.014 |
| 35 | 15 | -11 | -17 | 20 | 16.198 | 0.000 |
The F-tests show significant differences, and the significance level of all 35 indicators is below 0.05, which supports the validity of the analysis at K=4 and of the K-means clustering results for this group. In addition, the significant indicators clearly differ among the four types, and the samples are divided fairly evenly among the clusters, so no further division is needed.
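For reference, the F statistic reported for each indicator is the ratio of between-cluster to within-cluster variance. The sketch below computes it by hand for one indicator, using made-up values for three groups; it does not reproduce the paper’s data. A statistics library routine such as `scipy.stats.f_oneway` would additionally return the significance level.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numbers."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares, weighted by group size.
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares around each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical indicator values in three clusters.
print(anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

A large F value means the cluster means differ far more than the spread inside each cluster, which is what makes an indicator useful for telling the clusters apart.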
After improved K-means clustering, visitor behavior and preferences fall mainly into the following four types:
(1) Personalized push-sensitive type. Tourists who respond sensitively to personalized marketing push content account for 27.7% of the total sample, a higher proportion than any of the other three types in the current digital environment. Before a trip starts, these tourists readily receive customized recommendations for tourism and leisure products generated from their personal online history, and then convert to payment. Although they obtain promotional content passively before the trip, content that meets their personal preferences yields a high marketing conversion rate.
(2) Social media and self-media active information-access type. Tourists who actively obtain information through social apps or self-media before and during their trips, in order to plan specific itineraries and consumption items, account for 23.9% of the total sample. These tourists have clear travel goals and consumption budgets. They obtain vertical information through the travel-related self-media they follow, or through social apps and O2O software on various platforms; they also like to pick up travel information and promotions from self-media likes and retweets, and then make consumption choices during the trip based on the information they have gathered themselves. For this type, digital marketing should be integrated across multiple channels to push and embed content and to create more “check-in” selling points that stimulate consumption.
(3) Technology interactive experience type. Visitors who are easily attracted by novel technology and content, and who prefer immersive scenic-spot experiences built on new technology, account for 27.1% of the total sample. They prefer novel forms, technologies, and interactive marketing promotion, and innovative technology is the entry point for their payment conversion. The proportion of this type is high, and promotion should focus on form rather than content.
(4) Travel path customization + positioning push reliance type. These tourists have little ability or willingness to seek out information actively, and they are even less sensitive than the first type to the “personalized push” content they see when browsing on their phones. Instead, they are used to “arranged” and “one-stop” tourism services. With the popularization of smartphones, they are more likely to rely on local official tourism information service platforms than on social apps and self-media to get real-time information during their trips. They prefer official channels for travel paths and transportation routes, as well as for food, lodging, and leisure options near their location. Although this type accounts for the smallest share of the total sample (21.3%), these tourists are potential loyal users of the cloud-based big data tourism service information platform and can generate marketing and consumption conversion through the platform’s functions.
The basic attributes of the questionnaire respondents in each of the four behavior and preference types were counted, and the results are shown in Table 2.
Table 2. Tourist behavior preference clusters and distribution of basic attributes
| Cluster | Male | Female | Age <15 | Age [15,25] | Age [26,45] | Age [46,65] | Age >65 |
|---|---|---|---|---|---|---|---|
| 1 | 58.65% | 41.35% | 1.44% | 14.5% | 28.23% | 35.29% | 20.54% |
| 2 | 20.82% | 79.18% | 1.11% | 40.31% | 31.18% | 16.8% | 10.6% |
| 3 | 50.27% | 49.73% | 1.25% | 30.26% | 32.31% | 31.57% | 4.61% |
| 4 | 57.30% | 42.70% | 0% | 18.14% | 20.5% | 40.08% | 21.28% |

| Cluster | Monthly income <5000 | Monthly income [5000, 10000] | Monthly income >10000 | Largest item: Food | Largest item: Accommodation | Largest item: Shopping | Largest item: Recreation | Largest item: Scenic spot project |
|---|---|---|---|---|---|---|---|---|
| 1 | 19.33% | 56.13% | 24.54% | 12.09% | 14.22% | 26.12% | 38.16% | 9.41% |
| 2 | 49.12% | 37.93% | 12.95% | 26.6% | 16.32% | 3.3% | 21.86% | 31.92% |
| 3 | 34.01% | 32.53% | 33.46% | 15.07% | 23.83% | 16.22% | 23.41% | 21.47% |
| 4 | 12.46% | 37.92% | 49.62% | 25.42% | 28.12% | 5.99% | 12.01% | 28.46% |
Among personalized push-sensitive tourists, the male-to-female ratio is roughly balanced, and the group is concentrated in the 26~45 and 46~65 age ranges and, in income terms, in the middle- and high-income groups. This suggests that customized push-type online marketing matches the preferences of the main consumer group in the current market.
In terms of consumption share, personalized precise push and technological interactive experience marketing can significantly promote consumption of leisure and entertainment programs in the countryside (38.16% and 23.41%, respectively). Meanwhile, interactive marketing through social media and self-media and positioning push marketing can effectively boost food and beverage consumption (more than 25% of total consumption) and the revenue of attraction programs (about 30% of consumers on average).
It can be seen that a digital marketing system planned around the preferences of the above four types of tourists meets the core needs of tourism consumers and can bring local consumption growth in leisure, accommodation, and shopping.
Systematic (hierarchical) clustering was then performed with squared Euclidean distance; the resulting classification of tourists’ behaviors and preferences in the digital era is shown in Figure 1.

Figure 1. Results of tourist behavior clustering
Systematic (hierarchical) clustering can verify the validity of the K-means quick clustering and show intuitively the similarities and differences between tourists’ consumption behaviors in the digital era. In the Ward’s linkage dendrogram, tourist behavior divides roughly into 2 major categories and, at a finer level, into 4 types, which is consistent with the results of the improved K-means clustering; the sample counts of the 4 categories are also consistent with the quick clustering analysis. This confirms that the behavior of tourists in the digital era can be divided into four main types: personalized push-sensitive, social media and self-media active information access, technology interactive experience, and reliance on travel path customization + positioning push.
In addition, a cluster scatter plot of the collected data was drawn in Python, and the result is shown in Figure 2. Looking at the 4 types of tourist behavior:

Figure 2. Cluster scatter diagram
Type 1, personalized push-sensitive tourists, mainly accept content and channel pushes customized to their preferences, so it is necessary to optimize the mining of potential tourists’ consumption data and the calculation of precise promotion before the trip, in order to activate and expand consumption demand.
Type 2 (social media and self-media active information access) and type 3 (technology interactive experience) tourists need integrated multi-channel digital media marketing that uses interactive experiences, content identification, and other forms matched to user preferences to raise the marketing conversion rate.
For the fourth type of tourists, who rely on official channels for information, the cloud-based big data tourism service information platform can be upgraded and shared to push customized travel routes and nearby projects to tourists, based on their preferences and on rural traffic and scenic area conditions, to stimulate consumption.
In summary, based on the classification of tourist behavior, the tourism digital marketing system should be divided into three main levels. Front-end strategy: tourism consumption data mining and cloud computing. Middle-end strategy: digital media marketing integration. Back-end strategy: upgrading and sharing the cloud-based big data tourism service information platform.
User comments were crawled from a travel website as experimental data. LDA topic analysis was applied to the acquired travel text to extract the feature words of each user’s tourism interest topics, and the interest similarity between users was then calculated; the interest similarity of some users is shown in Table 3. Users are numbered for ease of representation.
Table 3. Interest similarity of some users
| User | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.000 | 0.057 | 0.007 | 0.069 | 0.058 | 0.104 | 0.060 | 0.000 | 0.107 |
| 2 | 0.068 | 1.000 | 0.008 | 0.038 | 0.119 | 0.054 | 0.042 | 0.067 | 0.000 |
| 3 | 0.000 | 0.014 | 1.015 | 0.057 | 0.052 | 0.010 | 0.055 | 0.109 | 0.051 |
| 4 | 0.053 | 0.065 | 0.045 | 1.000 | 0.049 | 0.007 | 0.049 | 0.020 | 0.017 |
| 5 | 0.048 | 0.098 | 0.044 | 0.037 | 1.000 | 0.113 | 0.063 | 0.007 | 0.017 |
| 6 | 0.101 | 0.044 | 0.009 | 0.004 | 0.109 | 1.000 | 0.011 | 0.008 | 0.011 |
| 7 | 0.070 | 0.053 | 0.053 | 0.047 | 0.062 | 0.012 | 1.000 | 0.116 | 0.006 |
| 8 | 0.000 | 0.056 | 0.119 | 0.003 | 0.000 | 0.009 | 0.099 | 1.000 | 0.008 |
| 9 | 0.109 | 0.000 | 0.039 | 0.003 | 0.002 | 0.009 | 0.000 | 0.008 | 1.000 |
The user comment data are processed to extract each user’s attribute word-emotion word pairs about the attractions. These pairs undergo emotion scoring and polarity determination to obtain the user’s attribute-emotion scores for the attractions. The scores belonging to the same attribute facet within one user’s comment are summed and averaged to obtain that user’s sentiment score for each attribute facet of an attraction. The sentiment scores of each user on the different attribute facets are then used to calculate the sentiment similarity between users on attraction-attribute facets. After the interest similarity and sentiment similarity are obtained, the two are combined with weights of 0.5 and 0.5, and the resulting comprehensive similarity is shown in Table 4.
Table 4. Integrated similarity of some users
| User | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.000 | 0.416 | 0.346 | 0.445 | 0.267 | 0.403 | 0.000 | 0.000 | -0.282 |
| 2 | 0.406 | 1.000 | 0.331 | 0.461 | 0.349 | 0.376 | 0.311 | 0.020 | -0.251 |
| 3 | 0.352 | 0.358 | 1.000 | 0.465 | 0.289 | 0.341 | 0.016 | 0.068 | -0.242 |
| 4 | 0.444 | 0.429 | 0.409 | 1.000 | 0.336 | 0.367 | 0.013 | -0.031 | -0.306 |
| 5 | 0.256 | 0.326 | 0.288 | 0.349 | 1.000 | 0.562 | 0.041 | 0.267 | -0.395 |
| 6 | 0.386 | 0.358 | 0.281 | 0.336 | 0.518 | 1.000 | 0.011 | 0.162 | -0.368 |
| 7 | 0.052 | 0.276 | 0.059 | 0.027 | 0.026 | 0.016 | 1.000 | 0.077 | 0.031 |
| 8 | 0.000 | 0.032 | 0.030 | -0.001 | 0.239 | 0.225 | 0.077 | 1.000 | 0.122 |
| 9 | -0.263 | -0.266 | -0.208 | -0.322 | -0.402 | -0.371 | 0.017 | 0.108 | 1.000 |
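The 0.5/0.5 integration is an element-wise weighted sum of the two similarity matrices, after which the nearest neighbors of a target user can be read off one row. The sketch below uses small made-up matrices, not the values in the tables; the function names are hypothetical.

```python
def integrate(interest, sentiment, w_interest=0.5, w_sentiment=0.5):
    """Element-wise weighted sum of two equally sized similarity matrices."""
    return [[w_interest * a + w_sentiment * b for a, b in zip(r1, r2)]
            for r1, r2 in zip(interest, sentiment)]

def nearest_neighbours(sim, target, n=1):
    """Indices of the n users most similar to the target, excluding the target."""
    others = [j for j in range(len(sim)) if j != target]
    return sorted(others, key=lambda j: sim[target][j], reverse=True)[:n]

# Hypothetical 3-user similarity matrices.
interest = [[1.0, 0.2, 0.1], [0.2, 1.0, 0.0], [0.1, 0.0, 1.0]]
sentiment = [[1.0, 0.6, -0.5], [0.6, 1.0, 0.4], [-0.5, 0.4, 1.0]]

combined = integrate(interest, sentiment)
print([round(v, 3) for v in combined[0]])
print(nearest_neighbours(combined, target=0))
```

Because sentiment similarity can be negative while interest similarity is not, a strongly opposed sentiment pulls the combined value below zero, as the third entry of the first row shows.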
After integration, the integrated similarity is used to find the nearest-neighbor users of the target user and the attractions those similar users have visited. The attribute sentiment score of each attraction is calculated, and the Euclidean distance is used to compute the similarity between each attraction’s attribute sentiment scores and the target user’s attraction-attribute sentiment characteristics. Attractions with higher similarity are recommended; the set of attractions recommended via similar users is shown in Table 5.
Table 5. Attractions recommended by similar users
| Sites | Bada mountain | Gulang island | Dujiang ancient city | Jade dragon snow mountain | Potala Palace | Huaqing Palace |
|---|---|---|---|---|---|---|
| Similarity | 0.625 | 0.572 | 0.413 | 0.221 | 0.192 | 0.082 |
To validate the recommendation method in this paper, the recall rate, accuracy rate, and the combined F1 value are used as evaluation criteria. Recall rate: the probability that attractions the user is interested in are recommended to that user, i.e., the ratio of the attractions of interest that appear in the recommendation list to all attractions the user is actually interested in. Accuracy rate (precision): the probability that the user likes a recommended attraction, i.e., the ratio of the attractions of interest in the recommendation list to all attractions in the list. The F1 value evaluates accuracy and recall together. The larger the recall, accuracy, and F1 values, the better the recommendation.
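These three metrics can be computed directly from a recommendation list and the set of attractions a user is actually interested in. The sketch below takes F1 as the harmonic mean of accuracy (precision) and recall, which is its standard definition; the attraction labels are hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """Accuracy rate (precision), recall rate, and F1 for one user."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

# Hypothetical case: 4 attractions recommended, 3 actually of interest, 2 overlap.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["a", "c", "e"])
print(p, round(r, 4), round(f1, 4))
```

In this example precision is 2/4, recall is 2/3, and F1 is 4/7. Lengthening the recommendation list can only keep or raise recall, while precision depends on how many of the added attractions are relevant.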
To verify its effect, the recommendation algorithm proposed in this paper is compared with the traditional collaborative filtering algorithm based on user attraction ratings and with the recommendation algorithm based on similar users; Fig. 3 shows the accuracy curves of the three algorithms. As the recommendation list grows longer, the number of attractions in the list increases steadily while the number that interest the user increases more slowly, so the accuracy rate first rises and then falls. According to the experimental results, this paper’s algorithm based on user ratings improves accuracy considerably over the collaborative filtering algorithm based on user attraction ratings, and is also slightly more accurate than the similar-user recommendation algorithm.

Figure 3. Comparison of the recommendation accuracy of the algorithms
Fig. 4 shows the recall curves of the three recommendation algorithms. The recall of all three increases as the recommendation list grows longer: as more attractions are recommended, the attractions of interest in the list make up a growing share of all attractions the user is actually interested in. The user-rating-based algorithm in this paper achieves higher recall than the other two algorithms, so it has a certain advantage in recall.

Figure 4. Comparison of the recall rate of the algorithms
The F1 curves of the three recommendation algorithms are shown in Fig. 5. Both this paper’s algorithm and the algorithm based on similar users outperform the traditional collaborative filtering algorithm based on attraction ratings in terms of the combined F1 value, and the F1 value of this paper’s algorithm based on the user image model is slightly higher than that of the similar-user algorithm. In summary, the recommendation algorithm in this paper has the better recommendation effect.

Figure 5. Comparison of the F1 value of the algorithms
This study examines the feasibility of using cloud computing, an optimized K-means clustering algorithm, and an improved collaborative filtering algorithm to achieve precision marketing for rural tourism. Based on clustering of the tourist behavior samples, tourists are classified into four types: personalized push-sensitive (27.7%), social media and self-media active information access (23.9%), technology interactive experience (27.1%), and reliance on travel path customization + positioning push (21.3%). The proposed method recommends more accurately: its accuracy, recall, and F1 value are higher than those of the traditional similar-user and collaborative filtering algorithms.
Based on the experimental results, this paper proposes the following rural tourism precision marketing strategies: (1) Utilize tourism consumption data mining and cloud computing. (2) Integrate digital media marketing. (3) Upgrade and share the cloud-based big data tourism service information platform.
