Protection and Inheritance Strategy of She Traditional Sports Skills Based on Pattern Recognition
Published online: 21 Mar 2025
Received: 06 Nov 2024
Accepted: 13 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0660
© 2025 Hui Lan, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Traditional sports of ethnic minorities are an indispensable part of social and cultural life, and an irreplaceable component of national soft power and education [1-3]. Among China’s minority cultures, She traditional sports have always been distinctive. They are closely related to the folk arts that surround them, combining elements of music, dance and martial arts without any Western influence, and they embody the national spirit of ancient China and an unyielding, valorous character. In this sense they are a cultural essence that unites the wisdom of the She people [4-6].
She traditional sports originate in the practical activities of the She people: maintaining life and health, observing customs, and resisting external threats, all of which are important in the development of human culture [7-9]. Most She people have long lived in deep mountains and dense forests, and differences in production, lifestyle and geography have given rise to a unique cultural form. She traditional sports are closely tied to production and labor, living customs, marriage and love, and religious beliefs, and can be roughly divided into four categories: sporting dances, athletic activities, recreational activities, and martial arts in the traditional sense [10-11]. The boundaries between these categories are not sharp; on different occasions and under different needs, the same activity can serve different purposes and show different characteristics. However, many young people today know nothing about She traditional sports, so greater and more timely efforts are needed to protect and pass on this heritage.
This paper proposes a multi-Kinect human behavior recognition approach for She traditional sports skills. The human skeletal motion model is analyzed, and the final data fusion results are obtained by predicting the human skeletal joint points with a Kalman filter model adapted to this setting. After collecting movement data of She traditional sports skills, the human skeleton is represented as a graph structure and processed with the dynamic spatio-temporal graph convolutional network DLSTM-GCN. The experiments use the skeletal data obtained after target detection and skeletal key point detection. The comparative analysis indicates that the improved model better captures both short-term and long-term temporal dependencies between actions, which improves action recognition accuracy.
In modern society, the development of traditional ethnic minority sports sits somewhat uneasily with modern life. Literature [12] describes how traditional sports culture can be protected while a community is governed, by combining the dissemination of traditional sports culture with modern governance; the protection and dissemination of traditional sports heritage are in fact inseparable from social and community governance. Literature [13] constructed a multi-source data analysis model to analyze the reasons for participation in She community sports, which provides a reference for developing sustainable strategies for the dissemination and inheritance of She sports. With the development of intelligent technologies, traditional ethnic minority sports have new prospects and no longer rely solely on oral transmission. For example, literature [14] found that ethnic traditional sports have innovated their inheritance methods, diversified their communication, and promoted sustainable development through the application of smart media technology. Literature [15] analyzed the evolution of the inheritance process of ethnic traditional sports culture using intelligent PLS software. Traditional sports continue to be used to maintain the health of ethnic communities and to spread their culture. Literature [16] used a neural network algorithm to segment and recognize images of ethnic minority costumes; its recognition speed and accuracy help to preserve traditional clothing elements and to disseminate them faithfully. Literature [17] used machine learning models to recognize handwritten Bima script so that the script can be preserved and properly disseminated. Literature [18] recognized historical artifacts using digital image processing techniques such as image enhancement, image coding, and pattern recognition, with recognition speed and accuracy higher than those of traditional handwriting recognition.
Most of the approaches in these studies could in fact be carried over to the recognition of She traditional sports, but they would need to be more accurate and more visual to suit the characteristics of sports.
The human behavior recognition system mainly uses Kinect as the visual sensor for data collection. Kinect is a human-computer interaction device released by Microsoft in 2010 for the Xbox 360 gaming console. In 2012, Kinect came to the Windows platform with the launch of Kinect for Windows, which encouraged developers to design Kinect-based somatosensory interaction applications. In 2014, Microsoft introduced Kinect 2.0, a new generation of the sensor. Both generations can acquire color and depth data simultaneously, and the acquired data can be used for 3D reconstruction, human posture recognition, gesture recognition, and other applications. The structure of the Kinect is shown in Figure 1.

Kinect structure diagram
Unlike ordinary sensors, the Kinect sensor can collect not only color images but also depth information about the objects in the scene. Kinect 2.0 adopts the time-of-flight (TOF) detection method: an infrared emitter emits infrared light whose source is modulated by a square wave with a frequency in the range of 10-100 MHz [19]. Phase detection is used to obtain the phase shift and attenuation between the emitted light and the light reflected from the object. From this, the total flight time of the infrared light from the source to the object surface and back to the sensor is calculated, and the distance from the object to the sensor follows from this round-trip flight time. The depth is calculated using equation (1).
Where
After obtaining the target human body joint coordinate information using the Kinect sensor, the joint information can be used for subsequent data fusion and human behavior recognition studies.
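The phase-based time-of-flight principle described above can be illustrated with a minimal sketch. It uses the standard continuous-wave relation between phase shift, modulation frequency and distance; it is not necessarily identical to the paper's equation (1), and the function names `tof_depth` and `unambiguous_range` are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def tof_depth(phase_shift_rad, modulation_hz):
    """Distance implied by the phase shift of a continuous-wave ToF signal.

    Assumption: the phase shift is already unwrapped to [0, 2*pi).
    """
    if modulation_hz <= 0:
        raise ValueError("modulation frequency must be positive")
    # Phase shift -> round-trip flight time of the modulated light.
    round_trip_time = phase_shift_rad / (2.0 * math.pi * modulation_hz)
    # The light travels to the object and back, so halve the path length.
    return SPEED_OF_LIGHT * round_trip_time / 2.0


def unambiguous_range(modulation_hz):
    """Largest distance measurable before the phase wraps around."""
    return SPEED_OF_LIGHT / (2.0 * modulation_hz)
```

With a 50 MHz modulation frequency (inside the 10-100 MHz range quoted above), a phase shift of pi radians corresponds to roughly 1.5 m, and the phase wraps at roughly 3 m. Lower modulation frequencies extend the range at the cost of depth resolution.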
Sensor calibration generally involves solving for a set of unknown parameters based on the sensor projection process, by which any point in three-dimensional space can be transformed to the corresponding point in the two-dimensional plane by projection. The unknown parameters usually include the focal length
The Kinect sensor has three coordinate systems: the color image coordinate system, the depth image coordinate system, and the skeletal space coordinate system. In the color image coordinate system, coordinate (
In this paper, the spatial coordinate system of the first Kinect (abbreviated as K1) is denoted as
where

Kinect imaging model
From subsection 2.2.2, we can see that the two Kinect external parameter calibrations actually solve the rotation matrix
Let vector
From Eq. (2) and vector properties, the homonymous vectors
From the Rodrigues rotation formula, there exists a matrix
There are
The rotation vector
Finally,
As for the translation matrix
From the above analysis, knowing more than two groups of homonymous vectors in three-dimensional space is sufficient to derive the rotation matrix
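As a rough illustration of the Rodrigues relation used above, the sketch below builds the rotation matrix for a given axis and angle and uses it to rotate one vector onto its homonymous counterpart. It is the textbook form of the formula, not a reconstruction of the paper's own equations, and the helper names are hypothetical. A single vector pair only fixes the rotation up to a spin about the target vector, which is why the text asks for several groups of homonymous vectors.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def rodrigues(axis, angle):
    """Rotation matrix for a rotation of `angle` radians about `axis`."""
    n = norm(axis)
    kx, ky, kz = (c / n for c in axis)
    c, s = math.cos(angle), math.sin(angle)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


def rotation_between(u, v):
    """Smallest rotation taking the direction of u onto the direction of v."""
    axis = cross(u, v)
    sin_part = norm(axis)          # |u||v| sin(theta)
    cos_part = dot(u, v)           # |u||v| cos(theta)
    if sin_part < 1e-12:
        if cos_part > 0:
            return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        raise ValueError("antiparallel vectors: axis is not unique")
    return rodrigues(axis, math.atan2(sin_part, cos_part))


def matvec(m, x):
    return tuple(sum(m[i][j] * x[j] for j in range(3)) for i in range(3))
```

For example, `rotation_between((1, 0, 0), (0, 1, 0))` is a 90 degree rotation about the Z axis. With several vector pairs, a least-squares fit over all pairs would normally replace this single-pair construction.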
In this paper, the homonymous vectors and homonymous points are constructed from the normal vectors of a three-plane target. The point cloud of the three-plane target is extracted with the two Kinects, and the parametric equations of the three planes are then fitted using the PCL point cloud library, which yields the normal vectors and plane equations of the target in both Kinect coordinate systems. The intersection point of the three planes is then found using the plane equation
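The intersection of the three fitted planes, and the translation that follows from it, can be sketched as below. Two assumptions are made for illustration only: each plane is written as a*x + b*y + c*z = d, and the calibration is taken in the direction P_K1 = R * P_K2 + T. Neither convention is confirmed by the surviving text, and the function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def intersect_three_planes(planes):
    """Intersection point of three planes (a, b, c, d) with a*x+b*y+c*z=d.

    Solved with Cramer's rule; raises if the normals are (nearly) coplanar.
    """
    normals = [list(p[:3]) for p in planes]
    offsets = [p[3] for p in planes]
    d = det3(normals)
    if abs(d) < 1e-12:
        raise ValueError("plane normals are linearly dependent")
    point = []
    for col in range(3):
        m = [row[:] for row in normals]
        for r in range(3):
            m[r][col] = offsets[r]   # replace one column by the offsets
        point.append(det3(m) / d)
    return tuple(point)


def translation_from_point(rotation, point_k1, point_k2):
    """T such that point_k1 = rotation * point_k2 + T (assumed direction)."""
    rotated = [sum(rotation[i][j] * point_k2[j] for j in range(3))
               for i in range(3)]
    return tuple(point_k1[i] - rotated[i] for i in range(3))
```

The same physical corner of the target is computed once in each Kinect coordinate system; the pair of points then serves as the homonymous point from which the translation is obtained once the rotation is known.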
To reduce the computation required for fusing the two sets of data and to improve the stability and reliability of the fused skeletal joint positions, reliable data are screened before fusion is carried out. The positions of human skeletal joints depend on and constrain one another, and physiologically the length of each bone is fixed. The bone lengths are therefore derived from the acquired joint coordinates and used as constraints to optimize the skeletal position data.
Each frame acquired by each Kinect device contains 25 3D coordinate data of human skeletal joint points, which are flagged with their corresponding names
The lengths of the corresponding bones acquired by two different Kinect
The corresponding formula for bone length
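A bone length is the Euclidean distance between the two joints it connects, so the screening idea can be sketched as follows. The relative tolerance and the reference length are illustrative assumptions, not values from the paper, and `is_reliable` is a hypothetical helper.

```python
import math


def bone_length(joint_a, joint_b):
    """Euclidean distance between two 3D joint positions (metres)."""
    return math.dist(joint_a, joint_b)


def is_reliable(length, reference, tolerance=0.1):
    """True if a measured bone length is within `tolerance` (relative)
    of the reference length for that bone."""
    return abs(length - reference) <= tolerance * reference


# Hypothetical frame: joint name -> (x, y, z) in one Kinect coordinate system.
frame = {"KneeRight": (0.10, -0.40, 2.00), "AnkleRight": (0.10, -0.80, 2.00)}
shin = bone_length(frame["KneeRight"], frame["AnkleRight"])
```

A frame whose bone lengths drift far from the reference, for example because a joint was inferred during occlusion, would be given less trust or dropped before fusion.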
In order to represent the angular value of joint activity rotation, based on the actual physiological joint activity angle threshold for data screening, here the rotation matrix
The specific formula for converting the rotation matrix
At this point, solve for
Since the angle between the two Kinect devices in this paper is placed at 90°, considering the placement of the human posture direction and the YOZ plane of the first Kinect reference coordinate system, the angles between XOY
Therefore, combining the two characteristics together to determine the contribution of the human skeletal joints, the smaller the standing distance
When both Kinects lose data at the same time, both weights are set to 0, and the lost data must then be compensated by prediction.
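The weighting logic can be sketched as a normalized weighted average with a fallback to the predicted position. The weights are assumed to have been computed already from standing distance and angle; the normalization by the weight sum and the function name `fuse_joint` are assumptions made for this illustration.

```python
def fuse_joint(pos_k1, pos_k2, w1, w2, predicted):
    """Fuse one joint position seen by two sensors.

    pos_k1 / pos_k2 are (x, y, z) tuples in a common coordinate system,
    or None when that sensor lost the joint. `predicted` is the fallback
    position used when both sensors contribute nothing.
    """
    if pos_k1 is None:
        w1 = 0.0
    if pos_k2 is None:
        w2 = 0.0
    total = w1 + w2
    if total == 0:
        # Both observations are missing: rely on the prediction instead.
        return tuple(predicted)
    zero = (0.0, 0.0, 0.0)
    a = pos_k1 if pos_k1 is not None else zero
    b = pos_k2 if pos_k2 is not None else zero
    return tuple((w1 * x + w2 * y) / total for x, y in zip(a, b))
```

Normalizing by the weight sum keeps the fused point between the two observations regardless of how the individual weights are scaled.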
Kalman filtering is a common method for fusing multi-sensor data from different viewpoints. It works recursively: the state at the previous moment is used as input to predict the estimated state at the next moment, and the prediction is then corrected with the observation to obtain the optimal estimate. Because the computational cost of the filter is small, it provides both fast data processing during state prediction and an optimal final estimate. The state change is denoted as
To ensure the accuracy and real-time performance of the improved prediction method for complex nonlinear human motion, the nonlinear motion characteristics are linearized so that the Kalman filtering idea can still be applied, which allows the state position to be predicted quickly and efficiently.
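For reference, the standard linear predict/correct recursion that the improved method builds on can be sketched for a single joint coordinate. This is a plain constant-velocity Kalman filter, not the Taylor-expansion variant developed in this paper; the class name, the noise values and the simplified diagonal process noise are all illustrative assumptions.

```python
class ConstantVelocityKalman:
    """1D Kalman filter with state [position, velocity], measuring position."""

    def __init__(self, dt, process_var, meas_var, x0=0.0, v0=0.0):
        self.dt = dt
        self.q = process_var      # simplified: same variance added to both states
        self.r = meas_var
        self.x = [x0, v0]
        self.P = [[1.0, 0.0], [0.0, 1.0]]

    def predict(self):
        """Propagate state and covariance one time step (no observation)."""
        dt, (x, v), P = self.dt, self.x, self.P
        self.x = [x + dt * v, v]
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]]
        p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + self.q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + self.q
        self.P = [[p00, p01], [p10, p11]]
        return self.x[0]

    def update(self, z):
        """Correct the prediction with a position observation z."""
        P = self.P
        s = P[0][0] + self.r                  # innovation variance
        k0, k1 = P[0][0] / s, P[1][0] / s     # Kalman gain
        y = z - self.x[0]                     # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        # P <- (I - K H) P with H = [1, 0]
        self.P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
                  [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
        return self.x[0]
```

At 30 frames per second, `dt` would be 1/30. When a joint is lost by both sensors, calling `predict()` without `update()` gives the compensating estimate.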
Here the nonlinear motion of the human body is expressed as a Taylor series expansion of its dynamic nonlinear characteristics, and the position change of the human skeletal joints is expressed as
A Taylor series expansion of multidimensional variables is carried out in the form as in Eq. (15), where
This linear representation of the nonlinear human motion velocity feature is brought into the computation to realize the improved Kalman filtering with a state change transfer variance formula of
The two variance correspondences of the carryover expansion using the Taylor series expansion formula are shown in Eqs. (16) and (17), where
Based on the improved formulas above, the corresponding predicted value is given by
Video and image data are Euclidean data: pixels are arranged neatly in a matrix, and each pixel has the same number of neighboring pixels. Traditional convolutional neural networks can only handle Euclidean data of this kind and cannot process non-Euclidean graph data. In graph theory, graph data refers to topological graphs constructed from a set of nodes and the edges that connect them. Graph data has no regular structure: each node may have a different number of neighbors and there is no translation invariance, so a fixed-size convolutional kernel cannot be used to extract node features. To process complex graph data efficiently, graph convolutional networks (GCNs) can be used; a GCN essentially acts as a feature extractor that updates the features of each node using the connectivity relationships between nodes.
There are two types of convolution methods for GCNs: spectral-domain graph convolution and spatial-domain graph convolution. The latter is described here.
Spatial-domain graph convolution updates the features of a node by aggregating the features of its surrounding nodes. In a CNN, the features of a neuron in one layer are obtained by aggregating the features of a region in the previous layer; applying this idea of local connectivity to a GCN gives the iterative node update formula shown in Eq. (18).
In Eq. (18),
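The neighbor-aggregation idea can be sketched in a few lines. The version below averages each node's features with those of its neighbors (including itself) and then applies a shared linear map; this mean normalization is one common choice and is an assumption here, since the exact form of Eq. (18) is not reproduced. The activation function is omitted, and the function name is hypothetical.

```python
def spatial_graph_conv(features, edges, weight):
    """One spatial graph-convolution step on an undirected graph.

    features: list of per-node feature vectors (lists of floats)
    edges:    list of (i, j) node index pairs, e.g. skeleton bones
    weight:   matrix [in_dim][out_dim] shared by all nodes
    """
    n = len(features)
    # Every node is its own neighbour (self-loop) plus its graph neighbours.
    neighbours = [{i} for i in range(n)]
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    in_dim, out_dim = len(weight), len(weight[0])
    out = []
    for i in range(n):
        group = neighbours[i]
        # Aggregate: mean of the neighbourhood features.
        mean = [sum(features[j][d] for j in group) / len(group)
                for d in range(in_dim)]
        # Transform: shared linear map applied to the aggregated feature.
        out.append([sum(mean[d] * weight[d][k] for d in range(in_dim))
                    for k in range(out_dim)])
    return out
```

On a three-joint chain 0-1-2 with scalar features 1, 2, 3 and an identity weight, the updated features are 1.5, 2.0 and 2.5: each joint moves toward the average of its local neighborhood, which is how skeletal structure enters the features.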
With the in-depth study of GCNs and the rapid development of artificial intelligence, the ST-GCN model, which applies graph convolution to skeleton-based human action recognition, was proposed. The input of the model is the skeleton spatio-temporal graph constructed from human skeletal data. The spatial and temporal features of this graph are aggregated using a spatial graph convolutional network and a temporal convolutional network (TCN), gradually generating higher-level feature maps to realize action recognition. The process of Sanda action recognition is shown in Fig. 3.

The process of Sanda action recognition
After human posture estimation, an action video yields a continuous sequence of human skeletal pose points. Because actions are continuous in time, the body posture changes very little between some adjacent frames and the skeletal point coordinates are nearly identical, so a run of consecutive pose frames contains many redundant frames. Removing them reduces the information redundancy in the skeletal pose sequence, which improves model accuracy, simplifies the action representation and reduces the computational cost [20]. In the preprocessing stage, a simple algorithm based on non-maximum suppression is designed to remove redundant frames. The algorithm removes redundant pose frames in a targeted way while retaining frames with pose changes or salient action characteristics, so that more distinctive action information is kept, recognition accuracy improves and the cost of model inference falls.
The non-maximum suppression method is used to remove redundant frames from the continuous skeletal pose sequence, and the redundancy elimination condition is shown in Eq. (20).
Where
Where
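The effect of redundant-frame removal can be illustrated with a deliberately simple stand-in. The sketch below keeps a frame only when its mean joint displacement from the last kept frame exceeds a threshold; this greedy thresholding is an assumption made for illustration and is not the suppression condition of Eq. (20). The function names and the threshold value are hypothetical.

```python
import math


def pose_distance(pose_a, pose_b):
    """Mean Euclidean displacement over corresponding joints."""
    return sum(math.dist(p, q) for p, q in zip(pose_a, pose_b)) / len(pose_a)


def drop_redundant_frames(frames, threshold):
    """Indices of frames to keep.

    frames:    list of poses, each a list of (x, y, z) joint positions
    threshold: minimum mean joint displacement (metres) from the last
               kept frame for a new frame to count as informative
    """
    if not frames:
        return []
    kept = [0]                      # always keep the first frame
    for i in range(1, len(frames)):
        if pose_distance(frames[kept[-1]], frames[i]) > threshold:
            kept.append(i)
    return kept
```

Comparing against the last kept frame, rather than the immediately preceding one, prevents a slow drift from being discarded entirely.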
To improve ST-GCN, this paper uses an LSTM-based recognition termination strategy network instead of the TCN. Compared with a TCN, an LSTM is better at capturing temporal correlation in time-series data and has stronger temporal modeling ability, so it can capture the temporal information and long-term dependencies in action sequences. LSTMs also usually have fewer parameters, which helps to reduce the overall complexity and computation of the model and makes efficient action classification easier in resource-constrained environments. In addition, the hidden states of an LSTM provide a rich latent representation of the action sequence.
Then, the LSTM-based policy module uses the currently pooled features and the hidden state
In order to verify the feasibility of the research methodology proposed in this paper, this section takes the traditional sports skill “Shequan” of the She ethnic group as an example for data collection and action recognition.
The data fusion experiments involve two parts, filtering estimation and fusion, and use Shequan data captured simultaneously by two Kinect sensors. The acquisition rate is 30 frames per second, so a sampling time of several seconds yields many frames of skeletal point data, which can be used in the data fusion simulation experiment after coordinate calibration.
When two Kinects are used to collect Shequan data, both sensors can track the joint positions of the movement simultaneously. However, problems such as body occlusion and data loss may occur. When one Kinect captures a joint while the other only infers it or fails to capture it, the fused data may be inaccurate and carry a large error, so the observed data must be assessed. Obtaining the estimate from the observations by filtering is a continuous iterative process, and discarding observations that have large errors or are unusable yields a more accurate fused value with a smaller error.
First, a simulation experiment on filter estimation is carried out, in which 3D coordinate data of different skeletal joints over multiple frames are simulated, with good filtering results. The right knee (KneeRight) and the right hand (HandRight) are then taken as examples for the filtering experiments, and the filtering effect on these two joints is analyzed to verify the feasibility of Kalman filtering for joint prediction.
Fig. 4 and Fig. 5 show the X-axis data of the right knee joint point in the experiment. The filtered estimate of the right knee skeletal point is closer to the true value, with an error of no more than 0.05 m, and the error after filtering is smaller than the observation error. The filtering effect is evident, so Kalman filtering is a feasible basis for estimation in multi-Kinect fusion.

The right knee filter experiment trajectory

The right knee filter experiment trajectory error diagram
Figure 6 shows the filtering trajectory for the right hand skeletal joint point captured by one of the Kinect sensors. When the skeletal point coordinates are extracted, the right hand is not tracked accurately: there may be occlusion and large data jitter. In this case the Kalman estimate is used in place of the observation, which makes the data smoother and closer to the true value and avoids jumps in the data. It can therefore be concluded that Kalman filtering is feasible for predicting the coordinates of skeletal points.

Right hand filter experiment trajectory
Figure 7 shows the data fusion experiment for the right knee joint point. The data collected by the two Kinect sensors are filtered and estimated separately. Since the two sensors are of the same type, the weights are set to 0.7 each, and the fusion is carried out on the estimated values to obtain the final data fusion result. The fused data lie in the range (0.09 m, 0.11 m), which contains all the skeletal point data detected during the sports skill movement.

Data fusion experiment comparison diagram
The data fusion diagram for the lower She fist posture selected in the experiment is captured by the color camera. The “K1 acquisition pose” and “K2 acquisition pose” are the action postures tracked by the two Kinects in the corresponding frames, and the “post-fusion pose” is the human skeleton model obtained by the data fusion algorithm. Fusion resolves the occlusion problem and yields a complete human skeleton motion model, and recognition of the corresponding Shequan posture can then be completed by extracting human posture features.
In action classification, accuracy is an important measure of model performance: it is the ratio of the number of correctly classified samples to the total number of samples in the test dataset. When the number of samples in each category is roughly balanced, accuracy reflects the model’s performance well. It is calculated as:
$$\text{Accuracy} = \frac{N_{\text{correct}}}{N_{\text{total}}}$$
For datasets with an uneven sample distribution, precision is often considered as well. Compared with accuracy, precision is more concerned with the performance of the model on each category. For a category with $TP$ true positives and $FP$ false positives, it is calculated as:
$$\text{Precision} = \frac{TP}{TP + FP}$$
Calculating the precision of the model on each category reveals its recognition performance for that category, and a high precision means that few samples are misrecognized as that category.
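The two metrics can be computed directly from the true and predicted labels. The sketch below uses the standard definitions (accuracy as the fraction of correct predictions, precision per class as correct predictions of that class over all predictions of that class); the category codes in the example are borrowed from the tables below, while the label lists themselves are made up for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_precision(y_true, y_pred):
    """Precision for every class that was predicted at least once."""
    predicted = Counter(y_pred)
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / count
            for label, count in predicted.items()}


# Illustrative labels only; not data from the experiments.
y_true = ["SL-R", "SL-R", "SJ-L", "SJ-L"]
y_pred = ["SL-R", "SJ-L", "SJ-L", "SJ-L"]
overall = accuracy(y_true, y_pred)
by_class = per_class_precision(y_true, y_pred)
```

Here the overall accuracy is 0.75, while precision is 1.0 for SL-R and 2/3 for SJ-L, which shows how a single aggregate number can hide weaker performance on one category.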
The model was first trained on the training set of the NTU-RGB+D dataset and tested on the validation set. The accuracy over the 60 action classes was calculated and compared with commonly used generic action classification algorithms; the results are shown in Table 1.
As the table shows, dynamic spatio-temporal graph convolution is among the leading methods for general action classification when only human key point data are used. GCN-based methods have a clear advantage over more traditional methods such as Lie Group, LSTM and simple CNNs. Dynamic spatio-temporal graph convolution also clearly outperforms simple GCN methods such as ST-GCN, although it is slightly inferior to DGNN. A possible reason is that DGNN uses fully connected layers as the feature extraction module of its backbone instead of the convolutional layers used in this network, which gives it a clear advantage in the number of parameters. However, DGNN is at a significant disadvantage in training and inference time compared with the network described above: dynamic spatio-temporal graph convolution requires only about one-third of the training and inference time of DGNN.
Comparison with other algorithms
| Model name | Accuracy rate(%) |
|---|---|
| Lie Group | 52.5 |
| ARRN-LSTM | 83.6 |
| 3scale ResNet152 | 87.7 |
| ST-GCN | 81.5 |
| Pb-GCN | 89.2 |
| DGNN | 91.2 |
| DLSTM-GCN | 90.1 |
This model performs the fine-grained action classification task on the Shequan action classification dataset. The backbone of the dynamic spatio-temporal graph convolutional network has 10 layers, the input human keypoint sequence is 300 frames long, and there are 25 human keypoints, each given as 3D coordinates; the batch size is 16. The dataset contains 1274 samples in 10 classes and is split into training and validation sets at a ratio of 8:2. The results of this method are compared with those of other classification algorithms in Table 2.
Comparison of experimental accuracy of action classification data set
| Categories | ST-GCN | 2s-AGCN | DGNN | FenceNet | DLSTM-GCN |
|---|---|---|---|---|---|
| SL-R | 0.38 | 0.58 | 0.45 | 0.72 | 0.71 |
| SL-L | - | 0.12 | 0.41 | 0.68 | 0.39 |
| SJ-R | - | 0.53 | 0.8 | 0.87 | 0.83 |
| SJ-L | - | 0.88 | 0.89 | 0.87 | 0.87 |
| SW-R | - | 0.78 | 0.89 | 0.87 | 0.87 |
| SW-L | - | 0.46 | 0.54 | 0.29 | 0.74 |
| FH-R | - | 0.66 | 0.85 | 0.78 | 0.83 |
| FH-L | - | 0.45 | 0.75 | 0.68 | 0.68 |
| UC-R | - | 0.42 | 0.73 | 0.52 | 0.71 |
| UC-L | - | 0.24 | 0.5 | 0.62 | 0.87 |
| Top 1(%) | 67.75 | 84.57 | 76.67 | 82.76 | 93.45 |
| Top 5(%) | 97 | 100 | 98.15 | 99.51 | 100 |
As the results show, dynamic spatio-temporal graph convolution achieves good results on the fine-grained Shequan action classification dataset. In terms of per-category classification accuracy, the proposed model has a significant advantage over the other models, especially in the swinging fist and uppercut categories. DLSTM-GCN and 2s-AGCN, both of which are based on a keypoint weight matrix, also achieve much higher classification accuracy than the other models on multiple categories.
This paper also analyzes the classification accuracy of this method and other methods on each category of the Shequan movement classification dataset; the results are shown in Table 3. The results show that the recognition accuracy of the present method leads by a significant margin in several categories. Across almost all categories, its recognition performance is more balanced, with accuracy generally in the range 0.8~0.97, which suggests high stability and reliability in practical scenarios.
Per-category accuracy comparison on the classification dataset
| Categories | ST-GCN | 2s-AGCN | DGNN | FenceNet | DLSTM-GCN |
|---|---|---|---|---|---|
| SL-R | 0.38 | 0.58 | 0.45 | 0.72 | 0.81 |
| SL-L | 0.53 | 0.7 | 0.54 | 0.75 | 0.84 |
| SJ-R | 0.38 | 0.39 | 0.2 | 0.2 | 0.82 |
| SJ-L | 0.68 | 0.77 | 0.58 | 0.79 | 0.89 |
| SW-R | 0.44 | 0.6 | 0.79 | 0.66 | 0.85 |
| SW-L | 0.63 | 0.77 | 0.82 | 0.71 | 0.97 |
| FH-R | 0.38 | 0.58 | 0.72 | 0.87 | 0.96 |
| FH-L | 0.55 | 0.89 | 0.83 | 0.76 | 0.88 |
| UC-R | 0.44 | 0.83 | 0.71 | 0.77 | 0.76 |
| UC-L | 0.37 | 0.55 | 0.64 | 0.77 | 0.97 |
DLSTM-GCN was also tested in a simple comparison on a motion-capture-based Shequan dataset, and the results are shown in Table 4. Because this dataset has less data, more categories and a finer granularity, the metrics are lower than on the Shequan action dataset. DLSTM-GCN nevertheless achieves the best results of the compared models, with a Top-1 accuracy of 63.55% and a Top-5 accuracy of 98.76%.
Comparison of experimental results based on action capture
| Model name | Top 1(%) | Top 5(%) |
|---|---|---|
| ST-GCN | 52.11 | 95.76 |
| 2s-AGCN | 53.47 | 94.21 |
| DLSTM-GCN | 63.55 | 98.76 |
In this subsection, the performance of several commonly used interpolation algorithms on the Shequan action dataset is compared. The real collected human key point sequences are first downsampled, the downsampled sequences are then expanded back to the original length by an interpolation algorithm, and the mean error and root mean square error are calculated for several expansion factors; the results are shown in Table 5. The polynomial (Neville) interpolation variants achieve similar accuracy, and the error is lowest at the 8th interpolation order, where the mean error is 0.0391 for k=2 and 0.3011 for k=5. The table also shows that satisfactory interpolation results can be obtained even at the higher expansion factor.
Comparison of the accuracy of various interpolation algorithms
| Interpolation algorithm | Mean error (k=2) | Mean error (k=5) | Root mean square error (k=2) | Root mean square error (k=5) |
|---|---|---|---|---|
| 2nd-order Neville | 0.0781 | 0.4612 | 0.1338 | 0.6353 |
| 4th-order Neville | 0.0413 | 0.3184 | 0.0689 | 0.4474 |
| 8th-order Neville | 0.0391 | 0.3011 | 0.0628 | 0.4123 |
| Linear interpolation | 0.0781 | 8.0278 | 0.1338 | 9.7477 |
Fig. 8-Fig. 10 show the X-axis motion of the hand in one sample as an example. The true length of the key point sequence in the sample is 60; the original sequence is downsampled by factors of 2, 3 and 5, and the downsampled data are then expanded back to the original length by the interpolation algorithm. Compared with linear interpolation, Neville interpolation better captures the motion characteristics of the joints, compensates for some missing key frames, and predicts key point positions more accurately. However, when the sampling frequency is too low, as in the 5-fold expansion example, the expanded data can be distorted.
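Neville's algorithm evaluates the polynomial that passes through a set of sample points without computing its coefficients. The sketch below applies it to the downsample-then-expand procedure described above, using a sliding window of `order + 1` neighboring samples for each output frame; the windowing strategy and the function names are assumptions for illustration, not the paper's exact setup.

```python
def neville(xs, ys, x):
    """Value at x of the polynomial through the points (xs[i], ys[i])."""
    p = list(ys)
    n = len(xs)
    for k in range(1, n):
        for i in range(n - k):
            # Combine two lower-order interpolants into one of order k.
            p[i] = ((x - xs[i + k]) * p[i] + (xs[i] - x) * p[i + 1]) \
                   / (xs[i] - xs[i + k])
    return p[0]


def expand_sequence(samples, factor, order):
    """Expand a downsampled 1D sequence by `factor` using local
    Neville interpolation of the given polynomial order."""
    m = len(samples)
    xs = [i * factor for i in range(m)]      # original frame indices
    npts = min(order + 1, m)                 # points per local window
    out = []
    for t in range((m - 1) * factor + 1):
        j = t // factor                      # sample at or left of t
        start = max(0, min(j - (npts - 1) // 2, m - npts))
        window = range(start, start + npts)
        out.append(neville([xs[i] for i in window],
                           [samples[i] for i in window], float(t)))
    return out
```

With `order=1` this reduces to linear interpolation between neighboring samples. Higher orders follow curved joint trajectories more closely, but can oscillate when the samples are sparse, which is consistent with the distortion noted for the 5-fold case.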

Rendering of 2-fold data expansion with the interpolation algorithm


Rendering of 5-fold data expansion with the interpolation algorithm
The key to protecting and transmitting intangible cultural heritage also lies at the grassroots level, in giving full play to the power of folk organizations. Folk organizations are generally composed of artists who are themselves the protectors and inheritors of intangible cultural heritage, and who have strong appeal and influence in their region. Their enthusiasm should therefore be fully mobilized when the inheritance of She traditional sports and cultural heritage is put into practice. What needs protection is not only She traditional sports themselves but also the individuals and organizations that carry this heritage, since they hold and pass on the culture in its living form. In the process of inheritance, it is therefore necessary to protect the inheritors who are able to pass on the heritage and continue to innovate, rather than limiting protection to collecting and preserving the physical artifacts of She traditional sports.
As an important base for cultivating talent, schools also play an important role in preserving traditional sports in China. However, in school physical education today, whether in compulsory education, high school or university, teaching is limited to simple and basic physical exercise skills, and the teaching format is dull and uniform. Excellent traditional sports are rarely introduced into physical education, so students’ understanding of them is very limited. The author therefore believes that, on the one hand, traditional sports should be introduced into school physical education by adding traditional sports courses and developing teaching materials on intangible cultural heritage, so that students can fully recognize and understand She traditional sports in both thought and practice. On the other hand, the training of teachers of intangible culture should be strengthened, so that more teachers devote themselves to the inheritance and protection of intangible cultural heritage and the inheritance of She traditional sports is promoted comprehensively.
Doing a good job of transmitting and protecting Chinese traditional sports in the context of intangible cultural heritage depends closely on the support and help of the general public. The adoption of the Intangible Cultural Heritage Law of the People’s Republic of China to regulate the protection of China’s intangible cultural heritage is of great significance. The enactment of the law and the legal status it gives to the protection and inheritance of intangible cultural heritage have provided the public with a clear direction and a strong stimulus. Legal publicity on China’s intangible cultural heritage should therefore be strengthened through widely used Internet channels such as microblogging platforms and WeChat, as well as radio and television, so that the public becomes familiar with the law on intangible cultural heritage, consciously develops an awareness of protecting it, and feels a stronger sense of responsibility for the inheritance of the traditional sports of the She ethnic group.
This paper takes the traditional sports skill “She Quan” as an example. The joint coordinates tracked by multiple Kinect sensors are used to recognize She Quan movements: real-time coordinate points and angle features of the human skeleton are extracted and combined with the dynamic temporal regularization algorithm to complete the data fusion of She Quan. By introducing the key point weight matrix and the dynamic graph connection mechanism, the body parts that matter most in traditional sports skills receive higher weights. Extended experiments on the She Quan action dataset with the dynamic spatio-temporal graph convolutional network verify the algorithm’s ability to classify fine-grained boxing techniques and the contribution of each module to classification accuracy. The results show that the recognition performance of this method is more balanced, with accuracy generally in the range 0.8~0.97. These two methods are used to protect the traditional sports skills of the She ethnic group, and an inheritance strategy for these skills is then proposed.
This article is part of the Fujian Provincial Social Science Fund Project: Research on the Identity and Ethnic Integration of the She Ethnic Folk Sports Culture in the New Era in China (FJ2021B129).
