Machine Learning Model Construction and Practice for Personalized Training Programs in Physical Education and Sport Teaching

With the progress of the times and the development of science and technology, people pay more and more attention to physical education. The traditional physical education teaching mode often has the problems of single teaching method, lack of teaching resources, and insignificant teaching effect, which is difficult to meet the diversified and individualized needs of students [1-3]. Therefore, how to use information technology to innovate physical education teaching mode and improve teaching effect has become an urgent problem to be solved in the field of physical education.

The rapid development of information technology provides new ideas and methods for physical education teaching. Physical education teachers use digital evaluation systems to record students’ performance in physical activities, such as running time, high jump performance, etc. These data can help teachers to customize personalized training plans for students, so as to improve students’ physical skills more effectively [4-7]. Interactive software and apps can also be utilized to provide immediate feedback and assessment of students’ athletic performance, helping them correct their movements in a timely manner and promoting their independent learning and self-improvement [8-10]. In practice, students’ exercise data are recorded by using wearable devices to monitor their exercise intensity and changes in physical fitness, thus enhancing students’ understanding of their exercise status and improving their satisfaction and sense of achievement of the effectiveness of exercise through the visual display of data [11-14]. Teachers, in turn, adjust the teaching plan according to the data to accommodate the individual differences of different students, ensuring that each student can give full play to his or her potential under the premise of safety, and truly realizing personalized teaching [15-17].

In this paper, outlier processing, data standardization, and correlation analysis are performed on sports test data. Principal component analysis is then applied to the standardized data to remove the strong correlations between variables and to reduce the dimensionality. The resulting data are classified by K-means clustering to highlight their features, which makes the personalized programs generated later more targeted. A BP neural network is used to build a model that predicts the personalized exercise program. Finally, an NLP sentiment analysis model is used to adjust the personalized program according to user feedback.

2

Modeling of personalized training programmes for sports

2.1

The role of new technologies in motion data analysis

In the 21st century, the rise of emerging technologies, particularly machine learning, artificial intelligence and big data analytics, has profoundly changed the way we see and understand sport, resulting in the increasing prominence of sports data analytics and sports decision support systems. The rapid development of these technologies has opened up unprecedented opportunities in sport, putting data and technology at the heart of modern sport. Machine learning, as a technology that can automatically learn and extract patterns from large amounts of data, holds great potential for sports data analytics. Through machine learning algorithms, we can gain a deeper and more accurate understanding of a sportsperson’s performance, thus providing targeted advice and strategies for coaches and players.

2.2

Machine learning related techniques

2.2.1

Nature of relevance

The correlation problem assumes that there is an indeterminate interdependence between the variables, i.e., for every change in the value of the independent variable, the dependent variable will also change to a greater or lesser extent. However, due to the non-deterministic relationship between the independent variable and the dependent variable, it is unknown exactly how the dependent variable will change when the independent variable changes.

The Pearson linear correlation coefficient measures the strength and direction of the linear correlation between two random variables. It takes a value between -1 and 1. The closer the coefficient is to ±1, the stronger the linear correlation: a value close to +1 indicates a strong positive linear correlation, a value close to -1 indicates a strong negative linear correlation, and a value close to 0 indicates little linear correlation. For random variables X and Y, the Pearson correlation coefficient is calculated as follows: (1) $ρ = \frac{\mathrm{cov} (X, Y)}{\sqrt{\mathrm{var} (X)} \sqrt{\mathrm{var} (Y)}}$
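As a minimal sketch of Equation (1), the coefficient can be computed directly from the sample covariance and variances. The function name `pearson` and the sample values are illustrative assumptions and are not taken from the study's data set.

```python
import numpy as np

def pearson(x, y):
    """Pearson coefficient: cov(X, Y) / (sqrt(var(X)) * sqrt(var(Y)))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()   # deviations from the mean of X
    dy = y - y.mean()   # deviations from the mean of Y
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Illustrative values only: 50 m times (s) and standing long jump (cm)
run_50m_s = [6.6, 6.9, 7.7, 8.4, 7.1]
long_jump_cm = [258, 245, 226, 200, 240]
r = pearson(run_50m_s, long_jump_cm)   # negative: slower runners jump less far
```

The normalization factors of the covariance and the variances cancel, so the same value is obtained whether the sample or the population convention is used.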

The Spearman rank correlation coefficient provides a distribution-free measure of correlation between variables. It does not require any transformation of the original data and does not assume a linear relationship between the variables, only a monotonic one, i.e., the variables increase (or decrease) together. Monotonic relationships are less restrictive and more common than linear relationships. The coefficient is determined not by the actual data values but by the ranks of the sorted data, so it can be calculated as long as the data can be ordered according to some rule. Let (X1,Y1), (X2,Y2) and (X3,Y3) be mutually independent two-dimensional random variables with the same distribution as (X,Y). The Spearman rank correlation coefficient is calculated as follows: (2) $ρ = 3 \{ P [(X_{1} - X_{2}) (Y_{1} - Y_{3}) > 0] - P [(X_{1} - X_{2}) (Y_{1} - Y_{3}) < 0] \}$
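Equation (2) is the population form. For a finite sample, the usual estimator is the Pearson coefficient of the ranks, which the sketch below uses. The helper names `ranks` and `spearman` are assumptions, and the rank helper assumes there are no tied values.

```python
import numpy as np

def ranks(values):
    """Rank positions 1..n of the values (assumes no ties)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(1, len(values) + 1)
    return r

def spearman(x, y):
    """Sample Spearman coefficient: Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 8.0, 27.0, 64.0, 125.0]        # monotonic but nonlinear (y = x**3)
rho_s = spearman(x, y)                   # rank correlation
rho_p = float(np.corrcoef(x, y)[0, 1])   # linear correlation, for comparison
```

For this monotonic but nonlinear relationship the rank correlation reaches 1 while the linear correlation stays below 1, which illustrates why the Spearman coefficient is the less restrictive of the two.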

2.2.2

BP neural network

Neural networks are the dominant approach in modern machine learning. They achieve significantly better results than traditional machine learning methods in many fields, and their powerful fitting ability has led to a proliferation of academic research and to groundbreaking applications in industry. A neural network consists of layers, each containing many nodes. Each node is connected to other nodes, and the output of one node becomes the input of another. The computation of a single node can be written as: (3) $y = f (\sum_{i = 1}^{N} ω_{i} x_{i})$ where x_i denotes the output of the i-th node in the previous layer, ω_i is the weight of the corresponding connection, and y denotes the output of the current node. The node computes the weighted sum of the outputs of the previous layer and passes it through the activation function f, which makes the system nonlinear and gives it the ability to fit nonlinear data.
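The node computation in Equation (3) can be sketched in a few lines. The sigmoid activation and the numeric values are assumptions chosen for illustration; the text does not fix a particular activation function.

```python
import numpy as np

def sigmoid(s):
    """One common choice of nonlinear activation (an assumption here)."""
    return 1.0 / (1.0 + np.exp(-s))

def neuron(x, w, f=sigmoid):
    """Eq. (3): y = f(sum_i w_i * x_i)."""
    return f(np.dot(w, x))

x = np.array([0.5, -1.0, 2.0])   # outputs of the previous layer
w = np.array([0.4, 0.3, 0.1])    # connection weights
y = neuron(x, w)                 # weighted sum is 0.2 - 0.3 + 0.2 = 0.1
```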

Activation functions take various forms; in general, any nonlinear function that is easy to compute is sufficient. The weights ω_i are the parameters of the network. A complete network connects inputs to outputs and acts like a complex function without a specific closed form. The process of obtaining outputs from inputs can be expressed as: (4) $y = ℕ (x)$ where ℕ denotes the computation performed by the network. Once its parameters are determined, the network is able to fit the data. If the fit is good enough, high prediction accuracy can be achieved on test data; conversely, labeled data can be used to test the classification and regression ability of the network.

The process of determining the network parameters is called training: the parameters are adjusted using known input and output data. The general process is:

1) Initialize the network parameters, which can be random values.

2) For a given sample x and true output y, compute the network output y_pred = ℕ(x) and compute the loss, e.g., the squared error loss function: (5) $\mathrm{loss} = \frac{1}{2} {(y - y_{\mathrm{pred}})}^{2}$

3) Use the backpropagation (BP) algorithm to adjust the network parameters in batches, based on the value of the loss and the gradient of the network.

4) Repeat steps 2) and 3) until the loss no longer decreases, at which point the network parameters are trained.

The principle of the BP algorithm is the chain rule of differentiation. Consider a network whose output is given by: (6) $z = f_{2} (\sum ω_{2} f_{1} (\sum ω_{1} x))$

Writing $h_{1} = \sum ω_{1} x$ and $h_{2} = \sum ω_{2} f_{1} (h_{1})$, so that $z = f_{2} (h_{2})$, the gradient of z with respect to the input x along a single path through the network is given by equation (7).

(7) $\begin{matrix} \frac{d z}{d x} = \frac{d z}{d h_{2}} \frac{d h_{2}}{d h_{1}} \frac{d h_{1}}{d x} \\ = f_{2}' (h_{2}) \, ω_{2} f_{1}' (h_{1}) \, ω_{1} \end{matrix}$
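The training procedure and the chain rule can be illustrated with a small two-layer network trained by full-batch gradient descent. This is a sketch under stated assumptions, not the network used in this paper: the data are synthetic, the hidden activation is tanh, the output layer is linear, and the layer sizes and learning rate are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))                   # 64 synthetic samples, 3 inputs
y = np.sin(X.sum(axis=1, keepdims=True))       # synthetic target values

# Step 1: initialize the parameters with random values
W1 = rng.normal(scale=0.5, size=(3, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))

def forward(X, W1, W2):
    h = np.tanh(X @ W1)                        # hidden layer output
    return h, h @ W2                           # linear output layer

def loss_fn(y, y_pred):
    return 0.5 * np.mean((y - y_pred) ** 2)    # squared error, as in Eq. (5)

losses = []
lr = 0.1
for step in range(300):
    # Step 2: forward pass and loss
    h, y_pred = forward(X, W1, W2)
    losses.append(loss_fn(y, y_pred))
    # Step 3: backpropagation, applying the chain rule layer by layer
    d_out = (y_pred - y) / len(X)              # d loss / d y_pred
    grad_W2 = h.T @ d_out
    d_h = (d_out @ W2.T) * (1.0 - h ** 2)      # through tanh: f'(s) = 1 - tanh(s)^2
    grad_W1 = X.T @ d_h
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2
```

In practice step 4 would stop when the loss no longer decreases; a fixed number of iterations is used here only to keep the sketch short.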

Neural networks used in production often have far more complex structures, but the basic principle is the same: learn from the data with the BP algorithm and gradually adjust the network parameters. The loss function defines how the error of the network is calculated and therefore directly determines the direction in which the parameters are adjusted. In addition to the squared error in Equation (5), commonly used loss functions include the 0-1 loss and the absolute value loss.

In this paper, we deal with one-dimensional vectors of specified length, and we do not need a complex neural network to fit the data, therefore, simple neural network structures such as Fully Connected Networks (FCNs) and Convolutional Neural Networks (CNNs) are used.

The FCN, also known as the multilayer perceptron, is the simplest neural network. It has multiple layers, each with many nodes, and every node is connected to every node in the adjacent layers. The number of nodes and the number of layers can be increased; in general, a larger network fits the data better and extracts richer information from it.

However, the number of layers and nodes cannot be increased indefinitely. On the one hand, the amount of computation grows rapidly and the training time is prolonged. On the other hand, the descriptive ability of the network does not grow indefinitely; on the contrary, the network may “remember” all the samples, which results in overfitting. Therefore, rather than enlarging the network without limit, new network structures should be designed to improve performance.

Convolutional neural networks introduce the convolution operation into the network. The discrete one-dimensional convolution is defined as follows: (8) $y_{t} = \sum_{k = 1}^{K} ω_{k} x_{t - k + 1} \Leftrightarrow y = ω * x$ where * denotes the convolution operator, ω is the convolution kernel, and x is the input sequence. The output sequence is obtained by sliding the convolution kernel along the input sequence, and this structure can better extract the features of the data. For backpropagation through this operation, assuming that f is a scalar activation function, the relationship between the output and the input is expressed as: (9) $y = f (ω * x)$

By the nature of the convolution, the derivative can be calculated: (10) $\frac{\partial f (y)}{\partial ω} = \frac{\partial f (y)}{\partial y} * x$
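The discrete convolution in Equation (8) can be written out with explicit loops, keeping the 1-based indices of the formula. The function name `conv1d` and the sample sequence are assumptions; only positions where the kernel fully overlaps the input (t = K, ..., T) are computed.

```python
import numpy as np

def conv1d(x, w):
    """Eq. (8): y_t = sum_{k=1..K} w_k * x_{t-k+1}, for t = K..T (no padding)."""
    T, K = len(x), len(w)
    y = np.zeros(T - K + 1)
    for t in range(K, T + 1):                 # 1-based t, as in Eq. (8)
        for k in range(1, K + 1):             # 1-based k
            y[t - K] += w[k - 1] * x[t - k]   # x_{t-k+1} in 0-based indexing
    return y

x = np.array([1.0, 2.0, 4.0, 7.0, 11.0])   # input sequence
w = np.array([1.0, -1.0])                  # difference kernel
y = conv1d(x, w)                           # y_t = x_t - x_{t-1}
```

The result agrees with `np.convolve(x, w, mode="valid")`, which implements the same definition without the explicit loops.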

Convolutional neural networks also have mechanisms that reduce the number of model parameters, including local connectivity and weight sharing. A typical convolutional neural network follows several convolutional layers with a fully connected network, whose output is passed through an activation function to give the final result.

2.2.3

Dimensionality reduction techniques for sports test data

Principal component analysis (PCA) is a multivariate statistical method for dimensionality reduction. It simplifies and condenses the original data, removes the redundant information shared by the original variables, and extracts a concise representation of the data. The basic idea is to apply an orthogonal transformation that converts the correlated original variables into new, uncorrelated variables. In the new orthogonal coordinate system the covariance matrix of the data becomes diagonal, and the dimensionality is reduced by keeping only a few of the new variables.

Take the two-dimensional space as an example. Let the number of samples be 50, each sample contains two variables x₁ and x₂, and the two-dimensional plane is defined by x₁ and x₂.

The distribution direction of the sample data is not the x₁-axis and x₂-axis, but its distribution trend has a certain regularity, so consider that there may be some connection between the two, and other variables can be used to replace x₁ and x₂. However, if any of these dimensions is used to replace them, it will surely result in the loss of the original data information. Therefore, the coordinate system is rotated counterclockwise by a certain angle Θ so that the direction of maximum dispersion of the sample points is the z₁ axis and its orthogonal direction is the z₂ axis.

The z₁ and z₂ rotation equations are: (11) $\left\{ \begin{matrix} z_{1} = x_{1} \cos Θ + x_{2} \sin Θ \\ z_{2} = - x_{1} \sin Θ + x_{2} \cos Θ \end{matrix} \right.$

Its matrix form is: (12) $(\begin{matrix} z_{1} \\ z_{2} \end{matrix}) = (\begin{matrix} \cos Θ & \sin Θ \\ - \sin Θ & \cos Θ \end{matrix}) (\begin{matrix} x_{1} \\ x_{2} \end{matrix}) = P^{T} X$ where P^T is an orthogonal matrix.

The coordinate system has the following properties after rotation:

1) Variables z₁ and z₂ are orthogonal to each other, and the sample points are uncorrelated in both directions, which prevents information overlap due to multicollinearity between the original data variables.

2) During the rotation, variables z₁ and z₂ are linear combinations of the original variables x₁ and x₂.

3) The sample points have the largest dispersion in the direction of z₁, so most of the information in the original variables is reflected by variable z₁, which is therefore called the first principal component. The variable z₂, with smaller variance, is the second principal component. When dealing with practical problems, the information in the z₂ direction can be ignored and only the first principal component z₁ used to replace the original data variables, which minimizes the loss of information and achieves dimensionality reduction.

The calculation steps of the principal component analysis method are as follows:

1) Standardize the original data and unify the data scale and order of magnitude.

2) Find the covariance matrix.

3) Find the eigenvalues of the covariance matrix, and calculate the variance contribution rate and the cumulative variance contribution rate.

4) Determine the number of principal components and derive the results.
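The four steps above can be sketched with NumPy as follows. The function name `pca` and the synthetic two-variable data set (50 samples, as in the two-dimensional example) are assumptions for illustration; the study's own data are not reproduced here.

```python
import numpy as np

def pca(data, n_components):
    # 1) Standardize each column to zero mean and unit variance
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # 2) Covariance matrix of the standardized data
    cov = np.cov(z, rowvar=False)
    # 3) Eigenvalues and eigenvectors, sorted by decreasing eigenvalue
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()            # variance contribution rate
    # 4) Project the data onto the leading principal components
    scores = z @ eigvecs[:, :n_components]
    return scores, ratio

rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=50)      # strongly correlated with x1
scores, ratio = pca(np.column_stack([x1, x2]), n_components=2)
cumulative = np.cumsum(ratio)                  # cumulative contribution rate
```

Because the two synthetic variables are strongly correlated, the first principal component carries most of the variance, and the resulting component scores are uncorrelated with each other.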

2.2.4

K-means clustering

The K-means clustering algorithm is one of the most widely used algorithms in cluster analysis. It is simple and efficient, and it is a much-studied partitional clustering algorithm. The algorithm takes the number of clusters K as a parameter and divides n data objects into K clusters, so that similarity is high within a cluster and low between different clusters.

The basic steps of the K-means clustering algorithm are as follows:

1) In the data set X, randomly select K data objects as the initial clustering centers.

2) Calculate the distance between each data object and the K clustering centers, and assign each data object to the class of the nearest clustering center.

3) Calculate the arithmetic mean of all data objects in each cluster and take it as the new clustering center.

4) Re-assign the data objects to classes according to the new clustering centers.

5) Repeat steps 3) and 4), and terminate the clustering when the clustering centers no longer change significantly.

6) Output the results.
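The steps above can be sketched as follows. The function name `kmeans`, its parameters (`max_iter`, `tol`, `seed`), and the synthetic two-cluster data are assumptions for illustration; an empty cluster simply keeps its previous center.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest center for every row of X (Euclidean distance)."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, max_iter=100, tol=1e-8, seed=0):
    rng = np.random.default_rng(seed)
    # 1) Randomly select k data objects as the initial centers
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # 2) / 4) Assign each object to the nearest center
        labels = assign(X, centers)
        # 3) Recompute each center as the mean of its cluster
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        # 5) Stop when the centers no longer change significantly
        if shift < tol:
            break
    # 6) Output the final assignment and centers
    return assign(X, centers), centers

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, size=(30, 2)),
               rng.normal(5.0, 0.3, size=(30, 2))])   # two separated groups
labels, centers = kmeans(X, k=2)
```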

2.2.5

NLP Emotional Technology Model

How to use rich subjective evaluation data to support the evaluation of teaching effectiveness is a problem that needs to be solved. Natural language processing (NLP) is an important direction in computer science and artificial intelligence, and NLP techniques allow subjective evaluation data to be analyzed and mined for teaching management. The Natural Language Toolkit (NLTK) is a Python library and one of the most popular tools for natural language programming and development. Proper use of the functions in NLTK can considerably improve efficiency in natural language processing research and applications.

Natural language sentiment analysis can currently be performed using either dictionary matching or machine learning. Dictionary matching counts the sentiment words in the text directly and derives a sentiment tendency score from them. The machine learning approach first selects a set of texts that express positive sentiment and a set that express negative sentiment, and trains a sentiment classifier on them. All texts are then classified as positive or negative by this classifier, and the final output can be either a category such as 0 or 1, or a probability value.
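A minimal sketch of the dictionary matching approach is given below. The word lists are a hypothetical mini-lexicon invented for illustration, not a resource shipped with NLTK and not the model used in this paper; the output uses the labels 1, 0 and -1 that appear in Section 3.4, and the sample sentence is the user feedback quoted there.

```python
# Hypothetical mini-lexicon for illustration only; a real system would use a
# curated sentiment dictionary or a trained classifier.
POSITIVE = {"good", "great", "energetic", "relaxed", "easy"}
NEGATIVE = {"exhausted", "tired", "painful", "sore", "hard"}

def sentiment_label(text):
    """Return 1 (positive), -1 (negative) or 0 (neutral) by counting words."""
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)   # sign of the score

label = sentiment_label("I haven't run for a long time, and I feel exhausted")
```

Simple counting ignores negation and context ("not tired" would still count as negative), which is one reason a trained classifier may be preferred.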

3

Practice of individualized training programmes in physical education

3.1

Motion data correlation analysis

In this project, the composite scores of the physical education test taken by physical education majors at X college were divided into four categories. These contain the excellent category (composite score of 90 or more), the good category (80 to 89.9), the passing category (60 to 79.9), and the failing category (less than 60). The distribution of the overall physical fitness status of students in the school was observed by calculating the proportion of people in each category per year, as shown in Table 1.

Table 1.

The proportion of students in each category from 2020 to 2023

Year	Excellent class	Good class	Passing class	Inferior class
2023	0.86%	15.94%	75.93%	8.46%
2022	1.04%	16.86%	72.38%	10.28%
2021	1.42%	20.31%	67.89%	10.75%
2020	1.02%	19.82%	70.36%	10.29%

Each year, only 0.86% to 1.42% of the students were classified in the “excellent” category, and 15.94% to 20.31% in the “good” category. Between 8.46% and 10.75% of the students fell in the “failing” category, i.e., did not reach the passing line. The proportions of students in the “excellent” and “good” categories in 2021 were higher than in 2020 and were the highest of the four years, which indicates that the overall physical fitness of students in 2021 was higher than in the other years. In the “passing” category, 2023 had the highest percentage of students, while the percentage in the “failing” category decreased in that year. The physical fitness of students in the “passing” and “failing” categories is lower than the normal level and, if not addressed in time, will have a significant impact on their lives and learning. Reducing the failure rate and improving students’ physical fitness is the primary goal of every college and every physical education teacher.

Taking the data of 2023, the last year of the dataset, as an example, the physical education test data reflect basic physical information about contemporary college students, such as body shape and physical function. The test gives teachers a precise understanding of the physical weaknesses of students in each school or region. The average values of the eight test items for each composite score category and each gender are shown in Table 2. Students in the “excellent” category perform markedly better on average than those in the other categories; height is an exception, since it is an unchangeable factor for adults. Some measurements of students in the “failing” category are unsatisfactory, such as weight, seated forward bend, and the strength item. The table shows that students in the “failing” category have a high mean weight, which may be an important factor in their failing overall performance.

Table 2.

Mean value of each test item by composite score category, for male and female students

	Height (cm)	Weight (kg)	Lung capacity (mL)	50m run (s)	Standing long jump (cm)	Seated forward bend (cm)	Endurance item (s)	Strength item (kg)
Male
Excellent class	176.93	65.95	5520.22	6.56	257.56	16.3	212.82	18.76
Good class	176.87	65.82	5187.35	6.91	244.78	13.72	231.19	12.69
Passing class	176.35	70.11	4865.18	7.67	225.84	10.24	257.06	6.12
Inferior class	176.42	84.61	4696.17	8.36	199.66	6.43	304.74	2.03
Female
Excellent class	167.43	56.94	3851.67	7.76	204.13	17.95	214	51.08
Good class	164.98	54.83	3661.15	8.89	182.96	17.56	230.28	41.56
Passing class	163.74	55.9	3191.12	9.62	165.34	13.83	252.12	36.57
Inferior class	164.17	65.65	2886.74	11.46	147.49	11.11	291.26	32.25

The correlations between the attributes were calculated and the results are shown in Figure 1. The measurement items are height (1), weight (2), lung capacity (3), 50-meter run (4), standing long jump (5), seated forward bend (6), endurance item (7), and strength item (8). There are strong correlations between the measurement items in the results of the physical education test. The strongest is the negative correlation between the 50-meter run and the standing long jump, with a coefficient of magnitude 0.7483. Because the 50-meter run is recorded as a time, it is negatively correlated with the scores: the longer the time, the lower the score. Height, weight and lung capacity are positively correlated. There is also a strong negative correlation between height and the strength item: taller students have to exert more strength than others in this item, which affects the speed at which the work is done. Strong correlations between the attributes lead to redundant information, which affects the accuracy of the model. When the input data are too strongly correlated, the weights connected to the corresponding input neurons perform similar functions, and the weight relationships learned by the network do not transfer well to data from other years. Therefore, principal component analysis is needed to transform the original data and eliminate the strong correlations before neural network training. The analysis of the physical fitness test data also shows that the composite scores are of great help to teachers in developing teaching programs based on the actual physical condition of students. The equipment measurement data reflect the changes in students’ indicators over the years, and the composite scores give an intuitive picture of students’ physical quality, which makes it easier to divide students into physical quality levels and allows students to understand their own physical condition clearly.

3.2

Classification of physical education test data

The sports test data described above were used as experimental data and analyzed with the k-means algorithm. BMI (weight/height squared), lung capacity, seated forward bend, standing long jump, 50-meter run, 1000-meter run (male) or 800-meter run (female), and pull-ups (male) or sit-ups (female) were used as inputs, with men and women clustered separately.
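The first input, weight divided by height squared, can be computed as in the sketch below. The function name `bmi` is an assumption. The two calls use the mean weight and height from the first “Excellent class” row and the first “Inferior class” row of Table 2; a ratio of group means is only an illustration and is not the same as the mean of individual values.

```python
def bmi(weight_kg, height_cm):
    """Weight in kilograms divided by the square of height in metres."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2

# Mean weight (kg) and height (cm) taken from two rows of Table 2
bmi_excellent = bmi(65.95, 176.93)   # about 21.1
bmi_inferior = bmi(84.61, 176.42)    # about 27.2
```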

The value of k was determined mainly through consultation and discussion with exercise experts, who recommended 8 to 15 categories: with too many or too few categories, the features of each category are not distinct enough and the prescriptions are not targeted enough. A large number of tests were then carried out, in which the within-cluster sum of squared errors was calculated for a number of clusterings and the optimal center points were selected to obtain the best clustering result. k = 10 was finally selected, because the data assigned to each cluster then showed the most distinct category characteristics. The exercise programs were developed on the basis of these cluster characteristics.

The centroids of the k-means clusters serve as reference points for the data. With k = 10, men and women are each divided into ten categories. The maximum number of iterations was set to 100: iteration stopped when the set of center points no longer changed, and if the center points had not converged when the maximum was reached, the most recent set of center points was used. The results of the body measurement clustering are displayed after dimensionality reduction with PCA (Principal Component Analysis), as shown in Fig. 2. PCA enables the new low-dimensional data set to retain as much of the variance of the original data as possible, but it is important to note that dimensionality reduction by PCA does lose some information. The pca.explained_variance_ratio_ values of [0.99781134, 0.00157322] show that, of the two principal components retained, the first explains 99.78% of the original variance and the second explains 0.16%. This means that the reduction to two dimensions still retains about 99.94% of the original information. The data were divided into 10 classes, and the centroid values differed considerably between classes. The centroid data were used as reference quantities when the initial exercise prescription was formulated.
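As an arithmetic check, the percentages follow directly from the two explained variance ratios quoted in the text. The variable names are assumptions.

```python
ratios = [0.99781134, 0.00157322]    # explained variance ratios quoted in the text

first_pct = 100 * ratios[0]          # share explained by the first component
second_pct = 100 * ratios[1]         # share explained by the second component
retained_pct = 100 * sum(ratios)     # share retained by both components
lost_pct = 100 - retained_pct        # share lost by the reduction
```

Rounded to two decimals, these are 99.78%, 0.16% and 99.94%, so about 0.06% of the original variance is lost.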

3.3

Individualized Program Projections

The sports test data were preprocessed and standardized, transformed by principal component analysis to eliminate the strong correlations between variables, and then modeled with a BP (back propagation) neural network, one of the most successful neural network algorithms available. Of the student sports test data described above, 4500 records were used as training data and 1500 as test data for the training and prediction of exercise programs. The overall structure of the BP neural network used in this paper is shown in Figure 3.

The K-means clustering algorithm was used to divide the 6000 records into 10 categories, and a corresponding exercise program was formulated for each category. The target outputs for these 10 categories were encoded as [1 0 0 0 0 0 0 0 0 0], [0 1 0 0 0 0 0 0 0 0], …, [0 0 0 0 0 0 0 0 1 0] and [0 0 0 0 0 0 0 0 0 1], where the position of the 1 identifies the program. The prediction results of the BP neural network and the relationship between the number of hidden layer nodes and the error loss are shown in Figure 4. Testing showed that prediction was best, with an accuracy of 96%, when the input layer had 7 neurons, the output layer had 10, and the hidden layer had 17 nodes. With too many or too few hidden layer nodes, the prediction performance deteriorates.
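The one-hot encoding of the 10 programs and the 7-17-10 layer sizes can be sketched as follows. Only the layer sizes come from the text; the tanh activation, the bias terms, the random (untrained) weights and the synthetic input are assumptions, so the predicted class here is arbitrary.

```python
import numpy as np

N_INPUT, N_HIDDEN, N_OUTPUT = 7, 17, 10   # layer sizes reported in the text

def one_hot(class_index, n_classes=N_OUTPUT):
    """Vector with a single 1 at the position of the program's class."""
    v = np.zeros(n_classes)
    v[class_index] = 1.0
    return v

def predict_class(x, W1, b1, W2, b2):
    h = np.tanh(x @ W1 + b1)          # hidden layer (activation is an assumption)
    scores = h @ W2 + b2              # one score per exercise program
    return int(np.argmax(scores))     # position of the largest output

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(N_INPUT, N_HIDDEN)), np.zeros(N_HIDDEN)
W2, b2 = rng.normal(size=(N_HIDDEN, N_OUTPUT)), np.zeros(N_OUTPUT)
x = rng.normal(size=N_INPUT)          # one standardized sample (synthetic)
k = predict_class(x, W1, b1, W2, b2)  # untrained weights: arbitrary class
target = one_hot(0)                   # target vector for the first program
```

After training, the index of the largest output would be read as the predicted program, mirroring the position of the 1 in the target vectors.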

3.4

Individualized program adjustments

1) On the first day, the exercise lasts 50 minutes and heart rate data are measured by the bracelet. The heart rate is between 120 and 140 for 40 minutes; this is passed to the server for judgment, and the degree of exercise on that day is recorded as 1. The user’s feedback for that day is: “It’s been a long time since I ran, I feel exhausted”. This text is passed into the NLP sentiment analysis model, which returns a negative sentiment, recorded as -1. Each day’s exercise data are recorded in the same way.

2) After 2 weeks of exercise, the daily exercise level data were collected: [1, 0, 0, -1, 1] for the first week and [0, 0, -1, 1, 1] for the second week. The affective data were [-1, 1, 0, 1, -1] for the first week and [-1, 1, 0, -1, -1] for the second week. Inputting the exercise level data and the affective data into the computational model gives an affective analysis level Y of -1 and a final result for (X,Y) of 1.1 T, which indicates an increase in exercise intensity of 10%.

3) The initial exercise program was adjusted to increase its overall exercise intensity by 10%, and a personalized exercise program was recommended as follows. The exercise intervention program comprised 4-6 training sessions, each lasting 35-60 minutes, and the purpose and precautions of the exercise intervention were the same as those of the initial exercise program. The adjusted personalized exercise program is shown in Table 3.

4) Repeat steps 1) to 3) for another 2 weeks of exercise and continue to adjust the personalized exercise program.

Table 3.

Adjusted exercise prescription for men in Category 1 (weeks 1 and 2)

Week 1	Training topic	Training content	Exercise intensity	Sets/rest
Train 1	Upper limb muscle + Aerobic endurance	Flat support	34 seconds	3 sets/rest for 35 seconds
		Skin-board push-ups	19 times/group	3 sets/rest for 35 seconds
		Straight arm suspension	34 seconds/group	3 sets/rest for 50 seconds
		Push-ups	16 times/group	3 sets/rest for 35 seconds
		3000m	Pace: 5 minutes and 42 seconds per kilometer
Train 2	Torso power + Aerobic endurance	Flat support	34s/group	3 sets/rest for 30 seconds
		Crib	23s/group	3 sets/rest for 35 seconds
		Double up	18s/group	3 sets/rest for 50 seconds
		4000m	Pace: 5 minutes and 38 seconds per kilometer	3 sets/rest for 35 seconds
Train 3	Upper limb torso	Flat support	34s/group	3 sets/rest for 35 seconds
		Straight arm suspension	34s/group	3 sets/rest for 30 seconds
		Push-ups	23s/group	3 sets/rest for 50 seconds
		Crib	23s/group	3 sets/rest for 30 seconds
Week 2	Training topic	Training content	Exercise intensity	Sets/rest
Train 1	Upper limb muscle + Aerobic endurance	Flat support	34s	3 sets/rest for 35 seconds
		Skin-board push-ups	24s/group	3 sets/rest for 30 seconds
		Straight arm suspension	34s/group	2 sets/rest for 40 seconds
		Bent arm suspension	25s/group	2 sets/rest for 45 seconds
		Push-ups	14s/group	3 sets/rest for 35 seconds
		2000m	Pace: 5 minutes and 42 seconds per kilometer
Train 2	Torso power + Aerobic endurance	Flat support	34s/group	3 sets/rest for 40 seconds
		Crib	29s/group	3 sets/rest for 40 seconds
		Double up	24s/group	3 sets/rest for 40 seconds
		2500m	Pace: 5 minutes and 41 seconds per kilometer
Train 3	Upper limb torso	Flat support	34/group	3 sets/rest for 35 seconds
		Straight arm suspension	34/group	3 sets/rest for 52 seconds
		Bent arm suspension	23/group	3 sets/rest for 40 seconds
		Push-ups	23/group	3 sets/rest for 40 seconds
		Crib	30s/group	2 sets/rest for 35 seconds

4

Conclusion

In this paper, we first analyze the correlations in the sports test data and use principal component analysis to reduce the dimensionality of the data. K-means clustering then highlights the different features in the sports test data. A BP neural network model is used to recommend a personalized exercise program, and the program is adjusted by NLP sentiment analysis. The main findings are:

1) Most of the students fell in the “passing” and “failing” categories, which together accounted for roughly 79% to 84% of students each year. The students in the failing category had a high average weight and lower than normal physical fitness, and the correlation analysis showed a negative correlation between the 50-meter run and the standing long jump. The physical education test data were clustered into 10 categories, and the center point scores varied greatly between categories.

2) The BP neural network model, which was used to predict the exercise program corresponding to each class, achieved a prediction accuracy of 96%. After adjustment by NLP sentiment analysis, a personalized exercise program to be carried out over 2 weeks was obtained. The adjusted program basically consisted of 4-6 training sessions, each lasting 35-60 minutes.


Qun Wan

Published online: 19 Mar 2025

Received: 05 Nov 2024

Accepted: 08 Feb 2025

DOI: https://doi.org/10.2478/amns-2025-0374

Keywords: Personalized training scheme, BP neural network, Cluster analysis, Principal component analysis, NLP sentiment analysis

© 2025 Qun Wan, published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
