Research on Optimization Design of Talent Cultivation Path in Tourism Industry Based on Data Mining under Vocational Education System 
Published online: 19 Mar 2025
Received: 12 Nov 2024
Accepted: 11 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0378
© 2025 Pingping Wei et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
China’s tourism industry is entering a new stage of prosperity, which places higher demands on tourism talent, yet the industry still faces a contradiction between an apparent “talent surplus” and a structural talent shortage. The current system for training excellent tourism management talent has several problems. Some higher vocational colleges and universities position themselves almost identically and do not draw on their own schooling practice or regional disciplinary strengths when formulating talent cultivation plans, so the levels of tourism talent training are unclear and the types are poorly differentiated [1–3]. Curricula and teaching content lag behind current development needs: the horizontal knowledge network is not comprehensive enough, which leaves students with insufficient theoretical knowledge and weak software skills, and the vertical knowledge system is not deep enough, which leaves them with a limited understanding of cutting-edge topics and development directions in tourism [4–6]. These institutions offer few comprehensive practical teaching programs and provide insufficient training in practical and innovative ability, which contributes to the lack of diversified tourism talent [7–8]. At the same time, the level of internationalized education in higher vocational colleges and universities is low, a gap remains between the supply side and the demand side, and the integration of industry, academia, and research is difficult because trust and responsibility-sharing mechanisms between institutions and enterprises are hard to establish [9–11].
The development of big data technology has changed the talent cultivation mode and brought new standards, new directions, and new requirements for the cultivation of excellent talents [12–13]. Under the wave of big data, the goal of cultivating excellent tourism management talents should build on the profession’s own characteristics and creatively cultivate high-level tourism management talents who have a solid cultural grounding, command of modern management technology, and strong employment and entrepreneurial ability, and who meet market demand, so as to serve the industry’s need for sustainable development [14–17]. This cultivation should take school-enterprise cooperation as its premise, identify a precise positioning, innovate the teaching mode, reform the curriculum system, and develop high-quality teachers, with the aim of producing a group of high-level applied specialists in tourism management who are highly professional, strongly innovative, and technically excellent [18–20].
To address the shortcomings of the Apriori algorithm, an efficient association rule mining algorithm, EA, is proposed. It uses the results in $L_k$ and $C_k$ to screen the database, which reduces the number of records in which candidate itemsets must be looked up and improves the efficiency of the whole algorithm. We then conduct a descriptive statistical analysis of the demand for talents in the tourism industry from three aspects: salary, company size, and work experience. Association rule mining is applied to analyze the correlations between students’ courses and employment, exploring the relationship between the vocational education curriculum, internship opportunities, and high salaries. Finally, based on these analyses, a specific optimized path is designed for cultivating talents in the tourism industry.
We live in the information age, and the explosion of data has created an urgent need for powerful techniques that help us find valuable information in massive amounts of data. This need led to the birth of data mining, the process of extracting interesting patterns and knowledge from large amounts of data. Data mining is an emerging technology. Unlike traditional statistical analysis, it involves a heavy workload and its results are largely unknown in advance, so it is often used to reveal important, valuable, and previously undiscovered information in large amounts of data [21].
Generally speaking, the data mining process includes three stages: data preprocessing, data mining, and result evaluation. The process is shown in Figure 1.
Data preprocessing stage. Data stored in a computer system’s database contains noise, missing values, and redundancies, so it is not suitable for direct analysis and mining. The raw data therefore needs preprocessing, and the quality of the preprocessing directly affects the quality of the mining results. This stage involves data cleaning, integration, generalization, and transformation, and the preprocessed data should be accurate, complete, and consistent.
Data mining stage. This is the key stage of the entire process; it mines and analyzes the preprocessed data [22]. The important steps are to clarify the mining task and to select the mining algorithm. The task is clarified according to the user’s needs; an appropriate mining method, such as clustering, association, or classification, is then selected according to the characteristics of the data and applied to the preprocessed data to obtain the corresponding knowledge and patterns.
Result evaluation stage. The results of the mining stage still need to be analyzed and evaluated to determine whether the mined knowledge is valid, to eliminate redundant and irrelevant results, and to retain the useful information. If the information obtained is not what is required, we must return to the previous stages, adjust the data selection parameters, or even change the mining method. Once meaningful results are obtained, they are converted into a user-friendly presentation.

Figure 1. Data mining process
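To describe the association rule mining problem accurately and to make it easier to discuss, a formal definition is needed. The definitions are as follows.
Definition 1 The dataset mined by an association rule is denoted as $D$ (generally a transaction database), $D = \{t_1, t_2, \ldots, t_k, \ldots, t_n\}$, where each $t_k = \{i_1, i_2, \ldots, i_p\}$ $(k = 1, 2, \ldots, n)$ is called a transaction and each element $i_j$ $(j = 1, 2, \ldots, p)$ is called an item.
Definition 2 Let $I = \{i_1, i_2, \ldots, i_m\}$ be the set of all items in $D$. Any subset $X$ of $I$ is called an itemset in $D$, and $X$ is called a $k$-itemset if $|X| = k$. A transaction $t_k$ is said to contain the itemset $X$ if $X \subseteq t_k$.
Transactions and itemsets, although both are collections of items, have different meanings. A transaction is a constituent element of database $D$, whereas an itemset is any subset of the item set $I$.
Definition 3 The number of transactions in dataset $D$ that contain itemset $X$ is called the support count of $X$, denoted $\sigma_X$. The support of $X$ is
$$\mathrm{support}(X) = \frac{\sigma_X}{|D|} \times 100\%$$
Where: $|D|$ is the number of transactions in dataset $D$. If $\mathrm{support}(X)$ is not less than the user-specified minimum support (minsupport), $X$ is called a frequent itemset; otherwise it is a non-frequent itemset.
Definition 4 Let $X$ and $Y$ be itemsets in $D$. If $X \subseteq Y$, then $\mathrm{support}(X) \geq \mathrm{support}(Y)$. If $X \subseteq Y$ and $X$ is a non-frequent itemset, then $Y$ is also a non-frequent itemset. If $X \subseteq Y$ and $Y$ is a frequent itemset, then $X$ is also a frequent itemset.
Definition 5 If $X$ and $Y$ are itemsets and $X \cap Y = \emptyset$, the implication $X \Rightarrow Y$ is called an association rule, and its support is $\mathrm{support}(X \Rightarrow Y) = \mathrm{support}(X \cup Y)$.
The confidence level of association rule $X \Rightarrow Y$ is
$$\mathrm{confidence}(X \Rightarrow Y) = \frac{\mathrm{support}(X \cup Y)}{\mathrm{support}(X)} \times 100\%$$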
Usually, the minimum confidence level specified by the user according to the extraction needs is denoted as minconfidence.
Support and confidence are two important concepts for describing association rules, the former is used to measure the statistical importance of an association rule in the whole dataset, and the latter is used to measure the degree of confidence of an association rule. Generally speaking, only association rules with high support and confidence are likely to be interesting and useful for users.
Usually, users specify the minimum support (denoted as minsupport) and the minimum confidence (denoted as minconfidence) according to the extraction needs. The former describes the minimum importance of an association rule, and the latter specifies the minimum reliability that an association rule must satisfy.
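Definition 6 If $\mathrm{support}(X \Rightarrow Y) \geq$ minsupport and $\mathrm{confidence}(X \Rightarrow Y) \geq$ minconfidence, the association rule $X \Rightarrow Y$ is called a strong rule; otherwise it is called a weak rule.
The problem of association rule mining is to find all association rules in $D$ that satisfy the user-specified minsupport and minconfidence, that is, all strong rules.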
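The support and confidence measures described above can be illustrated with a minimal Python sketch. It assumes the standard definitions (support as the fraction of transactions containing an itemset, confidence as support of the union divided by support of the antecedent). The function names and the toy transactions are hypothetical and are not taken from the study's data.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(itemset: Iterable[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[Transaction]) -> float:
    """support(X ∪ Y) / support(X); returns 0.0 when X never occurs."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    sup_x = support(x, transactions)
    return support(xy, transactions) / sup_x if sup_x > 0 else 0.0


def is_strong(antecedent: Iterable[str], consequent: Iterable[str],
              transactions: List[Transaction],
              minsupport: float, minconfidence: float) -> bool:
    """A rule is strong when it meets both user-specified thresholds."""
    x, y = frozenset(antecedent), frozenset(consequent)
    return (support(x | y, transactions) >= minsupport
            and confidence(x, y, transactions) >= minconfidence)


# Hypothetical toy data: one transaction per student, items are "course=grade".
transactions = [
    frozenset({"Ecotourism=A", "Landscape=A"}),
    frozenset({"Ecotourism=A", "Landscape=A"}),
    frozenset({"Ecotourism=A", "Landscape=B"}),
    frozenset({"Ecotourism=B", "Landscape=B"}),
]
rule_conf = confidence({"Ecotourism=A"}, {"Landscape=A"}, transactions)
```

In this toy data the antecedent occurs in three of four transactions and the full rule in two, so the rule has support 0.5 and confidence 2/3.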
An example is shown in Fig. 2, where the classification of 

Figure 2. The classification of 1
A new efficient association rule mining algorithm EA
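When calculating support, the Apriori algorithm scans the whole database every time. If the results in $L_k$ and $C_k$ are used to screen the database, the number of records in which candidate itemsets are looked up is reduced, which improves the efficiency of the whole algorithm.
When algorithm EA computes a strong itemset, it records the TIDs of the transactions that contain that itemset, so that each subsequent calculation can be restricted to those recorded transactions.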
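The idea of keeping transaction IDs (TIDs) alongside each frequent itemset can be sketched as follows. This is not the authors' EA algorithm, whose full details are not recoverable from the text. It is a generic level-wise search that assumes the support of a candidate can be obtained by intersecting the TID sets of two frequent subsets instead of rescanning the database. The function name and the toy database are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def frequent_itemsets_with_tids(
    transactions: List[Set[str]], min_count: int
) -> Dict[FrozenSet[str], Set[int]]:
    """Level-wise search; each frequent itemset keeps the TIDs that contain it."""
    # Level 1: scan the database once and record, per item, the TIDs containing it.
    tids: Dict[FrozenSet[str], Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tids.setdefault(frozenset([item]), set()).add(tid)
    current = {s: t for s, t in tids.items() if len(t) >= min_count}
    result = dict(current)

    k = 2
    while current:
        candidates: Dict[FrozenSet[str], Set[int]] = {}
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k or union in candidates:
                continue
            # Apriori pruning: every (k-1)-subset must itself be frequent.
            if any(frozenset(sub) not in current
                   for sub in combinations(union, k - 1)):
                continue
            # Support comes from intersecting TID sets, not from a rescan.
            candidates[union] = current[a] & current[b]
        current = {s: t for s, t in candidates.items() if len(t) >= min_count}
        result.update(current)
        k += 1
    return result


# Hypothetical toy database of five transactions.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
found = frequent_itemsets_with_tids(db, min_count=3)
```

With a minimum support count of 3, all single items and all pairs are frequent in this toy database, while the triple appears in only two transactions and is discarded.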
Example of the operation process of EA algorithm
For example, there is a database 

Figure 3. The explanation of the EA algorithm process
This section applies descriptive statistical analysis to structured data on the demand for talents in the tourism industry from three aspects: salary, company size, and work experience. The results give a preliminary picture of the current demand for talents in the tourism industry.
The logarithmic average salary is shown in Figure 4. The salaries of professionals in the tourism industry are relatively high overall and are mostly concentrated in the range of 8K–20K. However, because the raw salary data contain many human-caused outliers, all salary analysis is based on the average salary after the commonly used logarithmic transformation. The average salary of a position is calculated as the mean of the highest and lowest salaries offered for it. The graph shows that the logarithmic mean salary is roughly evenly distributed, indicating that the job outlook for tourism industry professionals remains stable and favorable.
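The salary transformation can be sketched in a few lines of Python. The sketch assumes that the average salary of a posting is the midpoint of its lowest and highest offered salary and that the natural logarithm is used; the source does not state the base of the logarithm. The function name and the sample postings are hypothetical.

```python
import math
from typing import List, Tuple


def log_average_salary(low_k: float, high_k: float) -> float:
    """Natural log of the midpoint of a posted salary range (in thousands)."""
    if low_k <= 0 or high_k < low_k:
        raise ValueError("expected 0 < low_k <= high_k")
    return math.log((low_k + high_k) / 2)


# Hypothetical postings as (lowest, highest) salary in K.
postings: List[Tuple[float, float]] = [(8, 12), (10, 20), (6, 8)]
log_salaries = [log_average_salary(lo, hi) for lo, hi in postings]
```

Taking the logarithm compresses the upper tail, so a few unusually high postings have less influence on the comparisons by company size and work experience.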

Figure 4. Logarithmic mean salary
Company size is closely related to a company’s economic strength, so analyzing it also helps tourism industry talents find jobs with better prospects. The box plot of salary versus company size is shown in Figure 5. The larger the company, the higher the corresponding logarithmic average salary, especially for foreign-funded enterprises and joint ventures. This signals that job seekers should learn about a company’s size and culture when choosing a career and use this information to screen for the opportunities offered by quality companies. Small and medium-sized companies show more demand for tourism industry talents, while large companies with more than 5,000 employees show less. This empirical finding may not match theoretical expectations, but it invites further thought. On the surface, small and medium-sized enterprises provide more employment opportunities. This may reflect that most tourism-related companies are currently small and medium-sized enterprises at an early stage of development, whereas large enterprises, because of their structural characteristics, already hold more complete talent reserves and therefore need fewer new tourism industry talents.

Figure 5. Box plot of salary versus company size
Work experience remains particularly important for those working in the tourism industry, and it must be accumulated to a certain stage before one develops a complete and independent working mindset. The box plot of average salary versus work experience is shown in Figure 6. As a rule, the salary offered by employers increases with the number of years of work experience they require. However, the tourism industry does not require extensive work experience, and requirements are mainly concentrated at 3–4 years or less. Salary levels differ little for positions requiring 2 years of experience or less, rise significantly for positions requiring more than 5 years, and show no further significant increase once the required experience reaches 7 years or more.

Figure 6. Box plot of average salary versus work experience
Students’ internships and employment are an important index for evaluating vocational education, and every graduate hopes for a wider choice of jobs and a higher salary. This section explores the correlations among courses, and between courses and internships and employment, in order to identify the core courses that students take in school. Strengthening students’ learning of these core courses is the basis for better internship and employment outcomes.
In this section, 100 vocational education students majoring in tourism were selected as the empirical research subjects. Their employment questionnaire data were collected through LimeSurvey, an online questionnaire system. The associations between the professional courses were analyzed, and the courses involved are as follows: Introduction to Tourism, Hospitality and MICE, Tourism Geography, Tourism Consumer Behavior, Lodging Management, Club Management, Economic Analysis of the Tourism Industry, Tourism Project Management, Big Data and Smart Tourism, Community Tourism Planning and Management, Tourism Crisis Management, Food Culture and Catering Management, Intermediate Financial Accounting, Hospitality Management Information System, Revenue Management, Introduction to Tourism, Tourism Economics, Consumer Behavior in Tourism, Tourism Laws and Regulations, Destination Management, Tourism Planning and Development, Ecotourism, Introduction to Landscape Architecture, Rural Tourism Planning and Design, Tourism Psychology, Tourism Sociology, Introduction to World Heritage, Architectural Photography, and Cutting Edge Issues in Tourism, totaling 28 courses. After the scores were discretized into grade levels, the rules found to have higher confidence are as follows:
During data preprocessing, the students’ scores for each course were categorized into three grades: above 80 is A, 60–80 is B, and below 60 is C. The correlations between the courses were then obtained, and two groups of rules were filtered from the results, as shown in Table 1. A preliminary conclusion is that the grades in Tourism Geography and in Introduction to Tourism, Hospitality and MICE A1 do not affect the grades in Introduction to Landscape Architecture and in Introduction to Tourism, Hospitality and MICE A2, respectively, because the consequent grade is A whether the antecedent grade is A or C.
Table 1. The correlation between courses
| Sequence | Rule | Confidence |
|---|---|---|
| 1 | Tourism geography (1-1) = C; Ecotourism (3-1) = A10 ==> Introduction to landscape architecture (3-2) = A10 | 1 |
| 2 | Tourism geography (1-1) = A; Ecotourism (3-1) = A10 ==> Introduction to landscape architecture (3-2) = A8 | 0.9 |

| Sequence | Rule | Confidence |
|---|---|---|
| 1 | Introduction to tourism, hospitality and MICE A1 (1-1) = C20 ==> Introduction to tourism, hospitality and MICE A2 (1-2) A = 16 | 0.85 |
| 2 | Introduction to tourism, hospitality and MICE A1 (1-1) = A16 ==> Introduction to tourism, hospitality and MICE A2 (1-2) A = 13 | 0.8 |
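The three-grade discretization used for Table 1 can be sketched as a small Python function. The text does not say which grade the boundary scores 60 and 80 belong to, so the sketch assumes that both fall in B. The function name is hypothetical.

```python
def grade_level(score: float) -> str:
    """Map a numeric score to A, B, or C (assumption: 60 and 80 both map to B)."""
    if score > 80:
        return "A"
    if score >= 60:
        return "B"
    return "C"


# Hypothetical scores for one student across three courses.
grades = [grade_level(s) for s in (92, 80, 45)]
```

Each student’s list of course-grade pairs then forms one transaction for association rule mining.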
The student achievement data were then preprocessed again by converting all scores to a four-level scale in which the top 20% of the ranking is excellent, the next 30% is good, the next 30% is moderate, and the remaining 20% is poor. The discretized student performance data are shown in Table 2.
Table 2. Data on student achievement
| Introduction to landscape design B | Club management B | Hotel management information system | 
|---|---|---|
| A | B | A | 
| C | C | A | 
| C | C | A | 
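The rank-based four-level scale described above (top 20%, next 30%, next 30%, remaining 20%) can be sketched as follows. The handling of tied scores is not described in the source; this sketch simply relies on the order produced by a stable sort. The function name and the sample scores are hypothetical.

```python
from typing import Dict


def rank_levels(scores: Dict[str, float]) -> Dict[str, str]:
    """Label each student by position in the descending score ranking."""
    ranked = sorted(scores, key=lambda s: scores[s], reverse=True)
    n = len(ranked)
    levels: Dict[str, str] = {}
    for rank, student in enumerate(ranked, start=1):
        # Integer arithmetic avoids floating-point edge cases at the cut-offs.
        if rank * 100 <= 20 * n:
            levels[student] = "excellent"
        elif rank * 100 <= 50 * n:
            levels[student] = "good"
        elif rank * 100 <= 80 * n:
            levels[student] = "moderate"
        else:
            levels[student] = "poor"
    return levels


# Hypothetical scores for ten students: s0 has 100, s1 has 95, ..., s9 has 55.
scores = {f"s{i}": 100 - 5 * i for i in range(10)}
levels = rank_levels(scores)
```

Unlike fixed score thresholds, this scheme always yields the same share of students per level, which keeps the item frequencies balanced across courses with different score distributions.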
The associations between courses obtained from the association rule experiment are shown in Table 3. Rule 10, Introduction to Tourism, Hospitality and MICE A1 (1-1) = C20 => Introduction to Tourism, Hospitality and MICE A2 (1-2) B = 19, indicates that failing to learn Introduction to Tourism, Hospitality and MICE A1 affects the grade in Introduction to Tourism, Hospitality and MICE A2. The data in the table also suggest that if Introduction to Tourism, Hospitality and MICE and Destination Management are not studied seriously, the graduation design grade is affected, and the correlation results later in this paper show that the graduation design has a direct impact on an individual’s work. Rule 7 shows that problems in Economic Analysis of the Tourism Industry in the second semester of the first year and in Food Culture and Catering Management in the second semester of the second year lead to unsatisfactory grades in Tourism Consumer Behavior in the first semester of the third year. These rules indicate that some courses are related and influence one another, and that the grades in some courses directly affect the grades in subsequent courses. Under the credit system, schools can use these rules to guide teaching and students’ learning.
Table 3. Correlation analysis between courses
| Sequence | Rule | Confidence |
|---|---|---|
| 1 | Tourism crisis management (2-1) = A; Ecotourism (3-1) = A11 ==> Introduction to landscape architecture (3-2) = A11 | 1 |
| 2 | Tourism geography (1-1) = A; Ecotourism = A10 ==> Introduction to landscape architecture (3-2) = A10 | 1 |
| 3 | Tourism crisis management (2-1) = A; Destination management (3-1) = A10 ==> Introduction to landscape architecture (3-2) = A10 | 1 |
| 4 | Tourism crisis management (2-1) = A; Destination management = A10 ==> Ecotourism (3-1) = A10 | 1 |
| 5 | Tourism crisis management (2-1) = A; Destination management = A10 ==> Ecotourism (3-1) = A; Introduction to landscape architecture (3-2) = A10 | 1 |
| 6 | Destination management (3-1) = A; Ecotourism (3-1) = A14 ==> Introduction to landscape architecture (3-2) = A13 | 0.94 |
| 7 | Economic analysis of tourism industry (1-2) = C; Food culture and catering management (2-2) = D12 ==> Travel consumer behavior (3-1) = C11 | 0.93 |
| 8 | Community tourism planning and management (2-2) = B; Tourism sociology (3-1) = A11 ==> Tourism laws and regulations (3-2) = B10 | 0.9 |
| 9 | Introduction to tourism, hospitality and MICE (1-2) = ; Destination management (3-1) = C13 ==> Cutting edge issues in tourism (4-2) = C11 | 0.86 |
| 10 | Introduction to tourism, hospitality and MICE (1-1) = C20 ==> Introduction to tourism, hospitality and MICE A2 (1-2) B = 19 | 0.86 |
The correlation between courses and internships is shown in Table 4. The data in the table suggest that unsatisfactory grades in Club Management and Destination Management affect the internship in the fourth year, whereas courses such as Tourism Consumer Behavior, Introduction to Landscape Architecture, Tourism Project Management, Hospitality Management Information Systems, Tourism Economics, Intermediate Financial Accounting, and Sociology of Tourism are helpful for the fourth-year internship.
Table 4. The association of courses and internships
| Sequence | Rule | Confidence |
|---|---|---|
| 1 | Club management (1-2) = C; Destination management (3-1) = C9 ==> Internships = N9 | 1 |
| 2 | Tourism project management (1-2) = B; Introduction to world heritage (3-1) = B9 ==> Internships = Y9 | 1 |
| 3 | Introduction to landscape architecture (3-2) = B; Cutting edge issues in tourism (4-2) = B9 ==> Internships = Y | 1 |
| 4 | Tourism planning and development (1-1) = A; Big data and smart tourism (2-1) = B8 ==> Internships = Y8 | 1 |
| 5 | Tourism project management (1-2) = B; Introduction to landscape architecture (3-2) = A8 ==> Internships = Y8 | 1 |
| 6 | Big data and smart tourism (2-1) = B; Hospitality management information system (2-2) = A8 ==> Internships = Y8 | 1 |
| 7 | Big data and smart tourism (2-1) = B; Tourism economics (2-2) = A8 ==> Internships = Y8 | 1 |
| 8 | Food culture and catering management (2-2) = B; Intermediate financial accounting (2-2) = B8 ==> Internships = Y8 | 1 |
| 9 | Tourism economics (2-2) = A; Tourism sociology (3-1) = B8 ==> Internships = Y8 | 1 |
| 10 | Tourism planning and development (1-1) = A; Rural tourism planning and design (3-2) = A11 ==> Internships = Y10 | 0.9 |
The correlation analysis between courses and job offers is shown in Table 5. Rule 1 in the table, Tourism Consumer Behavior (1-1) = A, Tourism Economics (2-2) = A12 => Offer = Y12, suggests that if Tourism Consumer Behavior and Tourism Economics are not studied seriously, the student’s chance of obtaining an offer is affected.
Table 5. Correlation analysis of courses and offers
| Sequence | Rule | Confidence |
|---|---|---|
| 1 | Tourism consumer behavior (1-1) = A; Tourism economics (2-2) = A12 ==> Offer = Y12 | 1 |
| 2 | C = A; Tourism psychology (3-1) = A11 ==> Offer = Y11 | 1 |
| 3 | Tourism consumer behavior (1-1) = A; Rural tourism planning and design (3-2) = A11 ==> Offer = Y11 | 1 |
| 4 | Community tourism planning and management (2-1) = B; Introduction to landscape architecture (2-2) = A11 ==> Offer = Y11 | 1 |
| 5 | Destination management (3-1) = A; Introduction to landscape architecture (3-2) = A ==> Offer = Y11 | 0.95 |
| 6 | Introduction to tourism, hospitality and MICE (1-2) = C; Big data and smart tourism (2-1) = B10 ==> Offer = Y10 | 0.93 |
| 7 | Tourism consumer behavior (1-1) = A; Intermediate financial accounting (2-1) = B10 ==> Offer = Y10 | 0.92 |
| 8 | Big data and smart tourism (2-1) = B; Tourism economics (2-2) = B10 ==> Offer = Y10 | 0.92 |
| 9 | Tourism crisis management (2-1) = B; Food culture and catering management (2-2) = A10 ==> Offer = Y10 | 0.87 |
| 10 | Food culture and catering management (2-2) = A; Destination management (3-1) = A10 ==> Offer = Y10 | 0.85 |
The analysis of the association between courses and high salary is shown in Table 6. Tourism Consumer Behavior, Tourism Economics, Tourism Geography, Tourism Psychology, Rural Tourism Planning and Design, Community Tourism Planning and Management, Revenue Management, Destination Management, Introduction to Landscape Architecture, Introduction to Tourism, Hospitality and MICE, Big Data and Smart Tourism, Intermediate Financial Accounting, and Food Culture and Catering Management have an impact on the ability to obtain an offer. Intermediate Financial Accounting, Frontier Issues in Tourism, Introduction to Software Engineering, Economic Analysis of the Tourism Industry, Tourism Planning and Development, Tourism Geography, Tourism Consumer Behavior, Rural Tourism Planning and Design, Architectural Photography, and Revenue Management have a critical impact on the ability to achieve a high salary.
Table 6. The relevance of courses to high salary
| Sequence | Rule | Confidence |
|---|---|---|
| 1 | Intermediate financial accounting (2-2) = B20 ==> High salary = G20 | 1 |
| 2 | Cutting edge issues in tourism (4-2) = B17 ==> High salary = G17 | 1 |
| 3 | Economic analysis of tourism industry (1-2) = A14 ==> High salary = G | 1 |
| 4 | Tourism economics (2-2) = A14 ==> High salary = G14 | 1 |
| 5 | Tourism planning and development (3-2) = C14 ==> High salary = G14 | 1 |
| 6 | Travel consumer behavior (1-2) = A13 ==> High salary = G13 | 1 |
| 7 | Tourism consumer behavior (1-1) = A13 ==> High salary = G13 | 1 |
| 8 | Rural tourism planning and design (3-2) = A13 ==> High salary = G13 | 1 |
| 9 | Architectural photography (4-1) = A13 ==> High salary = G13 | 1 |
| 10 | Revenue management (2-2) = A16 ==> High salary = G16 | 1 |
In summary, Tourism Geography, Food Culture and Catering Management, Intermediate Financial Accounting, Big Data and Smart Tourism, Introduction to Tourism, Hospitality and MICE, Destination Management, Tourism Consumer Behavior, Revenue Management, Introduction to Landscape Architecture, Rural Tourism Planning and Design, Tourism Economics, Community Tourism Planning and Management, Tourism Psychology, Tourism Industry Economic Analysis, and Architectural Photography are the courses with the most important impact on graduates’ jobs and salaries.
The descriptive statistical analysis of the current demand for talents in the tourism industry shows that the industry as a whole is developing well, and the association analysis shows an important link between the cultivation of tourism talents and the design of the school curriculum. This section therefore designs an optimization path for talent cultivation in the tourism industry based on these findings.
This path breaks up the inherent curriculum structure and reconstructs the tourism curriculum system with professional-group thinking, realizing professional integration, the fusion of culture and tourism, and skill interaction, and upgrading the knowledge structure and professional skills simultaneously. The knowledge structure is restructured with cross-border integration thinking, highlighting generalized teaching, cross-border integration, and the transformation of traditional knowledge into intelligent ability. The teaching content is designed around the integration of coursework and certificates, introducing the work standards of the “National Tourist Guide Qualification Examination”, “Design and Implementation of Study Tour Curriculum”, and “Tourism Customizer” to give students effective options for mastering a variety of skills. The practical links are designed around the integration of production and education, providing progressive training in tour guiding, travel customization, and study tour guiding so that the digital skills of “Internet+Tourism” complement and reinforce one another.
To match industry development and the new needs of tourism consumption, the training specifications for intelligent tour guides with composite, cross-border knowledge and abilities should be re-established. The cultivation process should incorporate Internet, intelligent technology, and new media knowledge to construct a composite knowledge system, transform basic service skills into composite skills such as the application of digital technology, new media marketing, and intelligent service, and achieve the goal of iteratively upgrading both the knowledge structure and the ability structure. At the same time, students’ cross-border thinking, intelligent thinking, and innovative thinking should be strengthened, and their literacy in scientific and technological innovation enhanced.
Cross-border, composite tour guide talents are cultivated through professional integration and the fusion of industry and teaching. Tour guide talents are cultivated across industries and professions from a borderless perspective, and related professions are integrated into a professional group so that the knowledge system, the curriculum system, and the practical teaching system are integrated and coherent. At the same time, school-enterprise cooperation and the mechanism of alternating work and study should be deepened and a new tourism talent incubation base built. A complete work-study talent cultivation chain is formed through six stages: learning cognition, learning and training apprenticeship, skills enhancement, project combat, internship, and comprehensive enhancement. This chain turns static teaching into dynamic teaching and single-skill learning into composite, cross-border skill learning, and it innovates the teaching process so that both the quality of talent cultivation and the quality of employment improve.
New standards for employment positions should be established, training materials and teachers coordinated and standardized, and a selection process launched for social tourism training institutions, with a focus on supporting and building training bases adapted to new forms of tourism and new occupations. At the same time, virtual reality, human-computer interaction, big data, and other advanced information technologies can remove the constraints of physical training venues and create immersive training bases that meet the diverse learning needs of different students and promote the new “Internet+” and “smart tourism” forms of education.
This article analyzes the current demand for talents in the tourism industry and uses the association rule algorithm in data mining to analyze the relationship between vocational education courses and employment. The conclusions provide data support for the optimal design of talent training paths in the tourism industry. The article draws the following conclusions.
Salaries in the tourism industry are currently satisfactory and mostly concentrated in the range of 8K–20K. The correlation analysis between courses yields the rule Introduction to Tourism, Hospitality and MICE A1 (1-1) = C20 => Introduction to Tourism, Hospitality and MICE A2 (1-2) B = 19, which indicates that failing to learn Introduction to Tourism, Hospitality and MICE A1 affects performance in Introduction to Tourism, Hospitality and MICE A2. The path for cultivating talents in the tourism industry can be pursued in three aspects: integration of industry and education, cross-border integration, and the creation of talent cultivation bases.
