Machine Learning Model Construction and Practice for Personalized Training Programs in Physical Education and Sport Teaching 
Published online: 19 Mar 2025
Received: 05 Nov 2024
Accepted: 08 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0374
© 2025 Qun Wan, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
With social progress and the development of science and technology, physical education is receiving increasing attention. Traditional physical education teaching often suffers from a narrow range of teaching methods, a lack of teaching resources, and limited teaching effect, which makes it difficult to meet the diversified and individualized needs of students [1-3]. How to use information technology to innovate the physical education teaching mode and improve teaching effect has therefore become an urgent problem in the field.
The rapid development of information technology provides new ideas and methods for physical education teaching. Physical education teachers use digital evaluation systems to record students’ performance in physical activities, such as running times and high jump results. These data help teachers customize personalized training plans so that students improve their physical skills more effectively [4-7]. Interactive software and apps can also provide immediate feedback on and assessment of students’ athletic performance, helping them correct their movements in a timely manner and promoting independent learning and self-improvement [8-10]. In practice, wearable devices record students’ exercise data to monitor exercise intensity and changes in physical fitness; the visual display of these data improves students’ understanding of their exercise status and their satisfaction and sense of achievement with the results of exercise [11-14]. Teachers, in turn, adjust the teaching plan according to the data to accommodate individual differences, ensuring that each student can realize his or her potential safely, and thereby achieving genuinely personalized teaching [15-17].
In this paper, outlier processing, data standardization, and correlation analysis are performed on sports test data. Principal component analysis is then applied to the standardized data to remove the strong correlations between variables. The dimension-reduced data are classified by K-means clustering to highlight their features, which makes the personalized programs generated later more targeted. A BP neural network is used to build a model that predicts the personalized exercise program. Finally, an NLP sentiment analysis model is used to adjust the personalized program according to student feedback.
In the 21st century, the rise of emerging technologies, particularly machine learning, artificial intelligence and big data analytics, has profoundly changed the way we see and understand sport, resulting in the increasing prominence of sports data analytics and sports decision support systems. The rapid development of these technologies has opened up unprecedented opportunities in sport, putting data and technology at the heart of modern sport. Machine learning, as a technology that can automatically learn and extract patterns from large amounts of data, holds great potential for sports data analytics. Through machine learning algorithms, we can gain a deeper and more accurate understanding of a sportsperson’s performance, thus providing targeted advice and strategies for coaches and players.
The correlation problem assumes that there is an indeterminate interdependence between the variables, i.e., for every change in the value of the independent variable, the dependent variable will also change to a greater or lesser extent. However, due to the non-deterministic relationship between the independent variable and the dependent variable, it is unknown exactly how the dependent variable will change when the independent variable changes.
The Pearson linear correlation coefficient measures the degree and direction of linear correlation between two random variables and takes a value between -1 and 1. The closer the coefficient is to ±1, the stronger the linear correlation: a value close to +1 indicates a strong positive linear correlation, a value close to -1 indicates a strong negative linear correlation, and a value close to 0 indicates little linear correlation. For random variables X and Y, the Pearson correlation coefficient is calculated as follows:

$$\rho_{X,Y} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{E[(X - E[X])(Y - E[Y])]}{\sqrt{E[(X - E[X])^2]}\,\sqrt{E[(Y - E[Y])^2]}}$$
The Spearman rank correlation coefficient provides a distribution-free measure of correlation that requires no transformation of the original data and does not assume a linear relationship between the variables, only a monotonic one, i.e., the variables increase (or decrease) together. Monotonic relationships are less restrictive and more common than linear relationships. The coefficient does not depend on the actual data values but on the ranks of the sorted data, so it can be computed as long as the data can be ordered according to some rule. For a two-dimensional random variable (X,Y), let (X1,Y1), (X2,Y2) and (X3,Y3) be mutually independent with the same distribution as (X,Y); the Spearman rank correlation coefficient is calculated as follows:

$$\rho_s = 3\left( P[(X_1 - X_2)(Y_1 - Y_3) > 0] - P[(X_1 - X_2)(Y_1 - Y_3) < 0] \right)$$
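For sample data, both coefficients can be computed directly. The following is a minimal sketch using only the Python standard library; the function names and the sample values are illustrative assumptions, not part of the original study. Spearman's coefficient is obtained here as the Pearson coefficient of the ranks, with tied values sharing their average rank.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values equal to the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values: 50 m run times (s) and standing long jump (cm)
run_50m = [6.5, 6.9, 7.7, 8.4, 9.0]
long_jump = [258, 245, 226, 200, 150]
r_pearson = pearson(run_50m, long_jump)
r_spearman = spearman(run_50m, long_jump)
```

Because the hypothetical relationship is monotonic but not linear, the Spearman value reaches -1 while the Pearson value stays strictly between -1 and 0.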
Neural networks are the dominant approach in modern machine learning. They achieve significantly better results than traditional machine learning methods in many fields and, owing to their powerful fitting ability, have attracted extensive academic research and produced groundbreaking industrial applications. A neural network consists of layers, each containing many nodes; each node is connected to other nodes, and the output of one node becomes the input of another. The computation of a specific node can be depicted as:
There are various forms of activation functions, and it is generally sufficient to require a nonlinear function that is easy to compute. The weights 
The process of determining network parameters is called training. That is, the network parameters are adjusted by known input and output data. The general process is:
1) Initialize the network parameters, which can be random values.
2) For a given sample, compute the network output and the loss.
3) Use the backpropagation (BP) algorithm to adjust the network parameters in batches, based on the value of the loss and the gradient of the network.
4) Repeat until the loss no longer decreases, at which point the network parameters are trained.
The principle of the BP algorithm is chain derivation, which gives the expression of the output as:
Then the gradient of 
Neural networks used in production often have much more complex structures, but the basic principle is still to learn from the data through the BP algorithm and gradually adjust the network parameters. The loss function defines how the error of the neural network is calculated and therefore directly determines the direction in which the parameters are adjusted. In addition to the squared error mentioned in Equation (5), commonly used loss functions include the 0-1 loss and the absolute-value loss.
In this paper, the inputs are one-dimensional vectors of fixed length, and a complex neural network is not needed to fit the data. Simple structures such as fully connected networks (FCNs) and convolutional neural networks (CNNs) are therefore used.
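The training procedure described above can be illustrated with a very small network. This is a minimal sketch under stated assumptions, not the network used in the paper: one hidden layer, sigmoid activations, squared-error loss, per-sample gradient updates instead of batches, and a toy logical-OR data set. All function names and hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Forward pass; the last weight in each row multiplies a constant 1 (bias)."""
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    y = sigmoid(sum(w * v for w, v in zip(w_out, h + [1.0])))
    return h, y


def total_loss(samples, w_hidden, w_out):
    """Squared-error loss summed over all samples."""
    return sum(0.5 * (forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in samples)


def train(samples, n_hidden=3, lr=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # initialise the parameters with small random values
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = [total_loss(samples, w_hidden, w_out)]
    for _ in range(epochs):
        for x, t in samples:
            # forward pass for one sample
            h, y = forward(x, w_hidden, w_out)
            # chain rule at the output node: dL/dz = (y - t) * y * (1 - y)
            delta_out = (y - t) * y * (1.0 - y)
            # error propagated back to each hidden node (uses the old w_out)
            delta_hidden = [delta_out * w_out[j] * h[j] * (1.0 - h[j])
                            for j in range(n_hidden)]
            # gradient-descent updates of output and hidden weights
            for j, v in enumerate(h + [1.0]):
                w_out[j] -= lr * delta_out * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] -= lr * delta_hidden[j] * v
        history.append(total_loss(samples, w_hidden, w_out))
    return w_hidden, w_out, history


# Toy data (logical OR), purely illustrative
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w_hidden, w_out, history = train(samples)
```

The `history` list records the loss after every epoch, so the stopping rule "until the loss no longer decreases" could be implemented by comparing consecutive entries instead of running a fixed number of epochs.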
The FCN, also known as the multilayer perceptron, is the simplest neural network. As the name suggests, it has multiple layers, each with many nodes, and every node in one layer is connected to every node in the adjacent layer. The number of nodes and the number of layers can be increased; in general, a larger network fits the data better and extracts richer information from it.
However, the number of layers and nodes cannot be increased indefinitely. On the one hand, the amount of computation grows rapidly and the training time is prolonged; on the other hand, the descriptive ability of the network does not keep improving, and the network may instead “remember” all the samples, which leads to overfitting. Network performance should therefore be improved by designing new network structures rather than by enlarging the network without limit.
Convolutional neural networks introduce convolutional computation into the network, and the one-dimensional convolutional discretization is defined as follows:
By the nature of the convolution, the derivative can be calculated:
Convolutional neural networks also include mechanisms that reduce the number of model parameters, including local connectivity and weight sharing. A typical convolutional neural network feeds the result of several convolutional layers into a fully connected network, and the final result is then output through the activation function.
Principal component analysis is a comprehensive statistical analysis method that reduces the dimensionality of the original data: it simplifies and condenses the data, removes redundant information shared by the original variables, and extracts concise information. Its basic idea is to transform the coordinate system. Through an orthogonal transformation, the correlated variables are converted into new uncorrelated variables, the covariance matrix of the original variables is converted into a diagonal matrix, and the original variables are expressed in a new orthogonal system, in which the dimensionality of the multidimensional variables can be reduced.
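The one-dimensional discrete convolution and the idea of weight sharing can be sketched in a few lines. This is an illustrative example using the common definition out[n] = Σ_m signal[m]·kernel[n−m]; the function names and the sample values are assumptions. The same small kernel is reused at every position, which is why a convolutional layer needs far fewer parameters than a fully connected one.

```python
def conv1d_full(signal, kernel):
    """Full discrete convolution: out[n] = sum over m of signal[m] * kernel[n - m]."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for m, s in enumerate(signal):
        for k, w in enumerate(kernel):
            out[m + k] += s * w
    return out


def conv1d_valid(signal, kernel):
    """Keep only the positions where the kernel fully overlaps the signal."""
    full = conv1d_full(signal, kernel)
    return full[len(kernel) - 1:len(signal)]


# Hypothetical input vector and shared kernel weights
signal = [1.0, 2.0, 3.0]
kernel = [0.0, 1.0, 0.5]
full_result = conv1d_full(signal, kernel)    # [0.0, 1.0, 2.5, 4.0, 1.5]
valid_result = conv1d_valid(signal, kernel)  # [2.5]
```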
Take the two-dimensional space as an example. Let the number of samples be 50, each sample contains two variables 
The distribution direction of the sample data is not the 
The 
Its matrix form is:
The coordinate system has the following properties after rotation:
 Variables  During the rotation, variables  The sample points have the largest dispersion in the direction of 
The calculation steps of the principal component analysis method are as follows:
1) Standardize the original data to unify the data scale and order of magnitude.
2) Find the covariance matrix.
3) Find the eigenvalues of the covariance matrix, and calculate the variance contribution rate and the cumulative variance contribution rate.
4) Determine the number of principal components and derive the results.
The K-means clustering algorithm is one of the most widely used algorithms in cluster analysis. It is simple and efficient, and it is a well-studied partition-based clustering algorithm. The algorithm takes the number of clusters K as a parameter and divides n data objects into K clusters, so that similarity is high within a cluster and low between clusters.
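For the two-variable case used in the example above, these calculation steps can be carried out without any linear algebra library, because the eigenvalues of a symmetric 2×2 matrix have a closed form. The sketch below is illustrative only: the function names, the 0.85 cumulative-contribution threshold, and the sample values are assumptions.

```python
import math


def standardize(column):
    """Scale a variable to zero mean and unit (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std for v in column]


def pca_2d(x1, x2):
    """Eigenvalues (descending), contribution rates and cumulative rates."""
    # standardize both variables
    z1, z2 = standardize(x1), standardize(x2)
    n = len(z1)
    # covariance matrix [[a, b], [b, c]] of the standardized data
    a = sum(v * v for v in z1) / n
    c = sum(v * v for v in z2) / n
    b = sum(u * v for u, v in zip(z1, z2)) / n
    # closed-form eigenvalues of a symmetric 2x2 matrix
    mid = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    eigenvalues = [mid + radius, mid - radius]
    # variance contribution rate and cumulative contribution rate
    rates = [ev / sum(eigenvalues) for ev in eigenvalues]
    cumulative = [rates[0], rates[0] + rates[1]]
    return eigenvalues, rates, cumulative


def n_components(cumulative, threshold=0.85):
    """Smallest number of components whose cumulative rate reaches the threshold."""
    for count, value in enumerate(cumulative, start=1):
        if value >= threshold:
            return count
    return len(cumulative)


# Hypothetical height (cm) and weight (kg) values
height = [160.0, 165.0, 170.0, 175.0, 180.0]
weight = [50.0, 55.0, 61.0, 64.0, 70.0]
eigenvalues, rates, cumulative = pca_2d(height, weight)
kept = n_components(cumulative)
```

With strongly correlated inputs such as these, almost all of the variance falls on the first component, so a single principal component is kept.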
The basic steps of K-means clustering algorithm are as follows:
1) In the data set X, randomly select k data objects as the initial clustering centers.
2) Calculate the distance between each data object and the k clustering centers, and assign each data object to the class of its nearest clustering center.
3) Based on the results, calculate the arithmetic mean of all data objects in each cluster and take it as the new clustering center.
4) Re-divide the data objects into classes according to the new clustering centers.
5) Repeat steps 3) and 4), and terminate the clustering when the clustering centers no longer change significantly.
6) Output the results.
How to use rich subjective evaluation data to support the evaluation of teaching effectiveness is a problem that needs to be solved. Natural language processing (NLP) is an important direction in computer science and artificial intelligence, and NLP techniques allow subjective evaluation data to be analyzed and mined thoroughly for teaching management. The Natural Language Toolkit (NLTK) is a Python library and one of the most popular tools for natural language processing development. In NLP research and applications, proper use of the functions in NLTK can greatly improve efficiency.
Natural language sentiment analysis can currently be performed using either dictionary matching or machine learning. Dictionary matching directly counts the sentiment words in the text and derives a sentiment tendency score from them. The machine learning approach first selects a set of texts that express positive sentiment and a set that express negative sentiment and trains a sentiment classifier on them. The classifier then assigns every text to the positive or negative class, and the final output can be either a category such as 0 or 1 or a probability value.
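The basic steps above can be sketched as follows. This is a minimal standard-library illustration, not the implementation used in the study: the `seed` and `max_iter` parameters, the helper names, and the sample points are assumptions, and exact equality of the centers is used as the stopping test.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Arithmetic mean of a list of points, coordinate by coordinate."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(data, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # randomly select k data objects as the initial clustering centers
    centers = rng.sample(data, k)
    labels = []
    for _ in range(max_iter):
        # assign every data object to its nearest clustering center
        labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                  for p in data]
        # recompute each center as the mean of the objects assigned to it
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            # an empty cluster keeps its previous center
            new_centers.append(mean_point(members) if members else centers[c])
        # terminate once the centers no longer change
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical two-dimensional points forming two well-separated groups
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(data, k=2)
```

In practice the run is usually repeated with several random initializations, since the result of K-means depends on the initial centers.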
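As an illustration of the dictionary-matching approach, the sketch below counts hits against a small word list and maps the score to 1, 0 or -1. The lexicon and function names are hypothetical toy examples; a real system would rely on a curated sentiment dictionary or a trained classifier.

```python
import re

# Hypothetical toy lexicon, for illustration only
POSITIVE = {"good", "great", "relaxed", "energetic", "easy"}
NEGATIVE = {"exhausted", "tired", "painful", "sore", "hard"}


def sentiment_score(text):
    """Count lexicon hits: +1 per positive word, -1 per negative word."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


def sentiment_label(text):
    """Map the raw score to 1 (positive), -1 (negative) or 0 (neutral)."""
    score = sentiment_score(text)
    return (score > 0) - (score < 0)


label = sentiment_label("I haven't run for a long time, and I feel exhausted")
```

The feedback sentence in this example contains one negative word and no positive words, so it is labeled -1.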
In this project, the composite scores of the physical education test taken by physical education majors at X college were divided into four categories. These contain the excellent category (composite score of 90 or more), the good category (80 to 89.9), the passing category (60 to 79.9), and the failing category (less than 60). The distribution of the overall physical fitness status of students in the school was observed by calculating the proportion of people in each category per year, as shown in Table 1.
Table 1. The proportion of students in each category from 2020 to 2023
| Year | Excellent class | Good class | Passing class | Inferior class | 
|---|---|---|---|---|
| 2023 | 0.86% | 15.94% | 75.93% | 8.46% | 
| 2022 | 1.04% | 16.86% | 72.38% | 10.28% | 
| 2021 | 1.42% | 20.31% | 67.89% | 10.75% | 
| 2020 | 1.02% | 19.82% | 70.36% | 10.29% | 
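The categorization rule and the per-category proportions reported in Table 1 can be reproduced with a short routine. This is a sketch under stated assumptions: the thresholds follow the category definitions given in the text, while the function names and the sample scores are hypothetical.

```python
from collections import Counter


def category(score):
    """Map a composite score to one of the four categories used in the text."""
    if score >= 90:
        return "excellent"
    if score >= 80:
        return "good"
    if score >= 60:
        return "passing"
    return "failing"


def proportions(scores):
    """Share of students in each category, as fractions of the total."""
    counts = Counter(category(s) for s in scores)
    return {name: counts[name] / len(scores)
            for name in ("excellent", "good", "passing", "failing")}


# Hypothetical composite scores, for illustration only
scores = [95.0, 84.5, 79.9, 72.0, 61.3, 60.0, 58.0, 45.5, 88.0, 66.0]
shares = proportions(scores)
```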
Each year, only 0.86% to 1.42% of the students were classified in the “excellent” category, and 15.94% to 20.31% in the “good” category. Between 8.46% and 10.75% of the students fell in the “failing” category, i.e., did not reach the passing line. The proportions of students in the “excellent” and “good” categories in 2021 were higher than in 2020 and were the highest of the four years, which suggests that the overall physical fitness of students in 2021 was better than in the other years. In the “passing” category, 2023 had the highest percentage of students, and the percentage of students in the “failing” category decreased in 2023. The physical fitness of students in the passing and failing categories is below the normal level and, if not addressed in time, will have a significant impact on their lives and learning. Reducing the failure rate and improving students’ physical fitness is the primary goal of every college and every physical education teacher.
Taking the data of 2023, the last year of the dataset, as an example, the physical education test data reflect basic physical information of contemporary college students, such as body shape and physical function. The test can give teachers a precise understanding of the physical weaknesses of students in each school or region. The average values of the eight test items for the different score categories and genders are shown in Table 2. Students in the “excellent” category have clearly better averages than those in the other categories, except for height, which cannot be changed in adults. Students in the “failing” category have unsatisfactory results on several items, such as weight, sitting forward bend, and the strength item. In particular, the table shows that students in the “failing” category have a high mean weight, which may be an important factor in their failing overall performance.
Table 2. Mean value of each test item by score category for female and male students
| Category | Height (cm) | Weight (kg) | Lung capacity (mL) | 50 m run (s) | Standing long jump (cm) | Sitting forward bend (cm) | Endurance item (s) | Strength item (kg) |
|---|---|---|---|---|---|---|---|---|
| Female | | | | | | | | |
| Excellent class | 176.93 | 65.95 | 5520.22 | 6.56 | 257.56 | 16.3 | 212.82 | 18.76 |
| Good class | 176.87 | 65.82 | 5187.35 | 6.91 | 244.78 | 13.72 | 231.19 | 12.69 |
| Passing class | 176.35 | 70.11 | 4865.18 | 7.67 | 225.84 | 10.24 | 257.06 | 6.12 |
| Inferior class | 176.42 | 84.61 | 4696.17 | 8.36 | 199.66 | 6.43 | 304.74 | 2.03 |
| Male | | | | | | | | |
| Excellent class | 167.43 | 56.94 | 3851.67 | 7.76 | 204.13 | 17.95 | 214 | 51.08 |
| Good class | 164.98 | 54.83 | 3661.15 | 8.89 | 182.96 | 17.56 | 230.28 | 41.56 |
| Passing class | 163.74 | 55.9 | 3191.12 | 9.62 | 165.34 | 13.83 | 252.12 | 36.57 |
| Inferior class | 164.17 | 65.65 | 2886.74 | 11.46 | 147.49 | 11.11 | 291.26 | 32.25 |
The correlations between the attributes were calculated, and the results are shown in Figure 1. The measurement items are height (1), weight (2), lung capacity (3), 50-meter run (4), standing long jump (5), seated forward bend (6), endurance item (7), and strength item (8). The results show strong correlations between the measurement items. The strongest is the negative correlation between the 50-meter run and the standing long jump, with a correlation coefficient of magnitude 0.7483. Because the 50-meter run is recorded as a time, it is negatively correlated with the scores: the longer the time, the lower the score. Height, weight and lung capacity are positively correlated. There is also a strong negative correlation between height and the strength item; a possible explanation is that taller students must exert more force than others to complete the item, which affects how quickly they can do the work. Strong correlation between the attributes leads to redundant information, which affects the accuracy of the model. When the input data are too strongly correlated, the weights connected to the input neurons perform similar functions, and the weight relationships learned by the network do not transfer well to data from other years. Therefore, principal component analysis is needed to transform the original data and eliminate the strong correlations before the neural network is trained. The analysis of the physical fitness test data also shows that the comprehensive scores are of great help to teachers in developing teaching programs based on the actual physical condition of students.
The measurement data reflect the changes in students’ indicators over the years, and the comprehensive scores give a direct picture of students’ physical quality. This facilitates a reasonable division of students into physical fitness levels, and students can also clearly understand their own physical condition from the comprehensive scores.

Figure 1. The correlation between attributes
The above sports test data were used as experimental data and analyzed with the k-means algorithm. Body mass index (weight/height squared), lung capacity, seated forward bend, standing long jump, 50-meter run, 1000-meter run (male) or 800-meter run (female), and pull-ups (male) or sit-ups (female) were used as inputs, with men and women clustered separately.
The value of k was determined mainly through consultation and discussion with exercise experts, who recommended 8 to 15 categories: with too many or too few categories, the cluster characteristics are not distinct enough and the resulting prescriptions are not sufficiently targeted. A large number of tests were then carried out, in which the within-cluster sum of squared errors was calculated over multiple rounds of clustering and the optimal center points were selected. Finally, k = 10 was chosen, because the characteristics of the data assigned to each cluster were most distinct at this value, and the exercise programs were developed on the basis of these cluster characteristics.
The centroids of the k-means clustering are required to be actual points in the data so that they have reference value. The value of k is 10, which means that men and women are each divided into ten categories. The maximum number of iterations was set to 100; if the set of center points stopped changing, or had not converged when the maximum number of iterations was reached, the last set of center points was used. The clustering results are displayed after dimensionality reduction with principal component analysis (PCA), as shown in Fig. 2. PCA allows the new low-dimensional data set to retain as much of the variance of the original data as possible, but the reduction does lose some information. The pca.explained_variance_ratio_ values are [0.99781134, 0.00157322]: of the two retained principal components, the first explains about 99.78% of the original variance and the second about 0.16%, so the reduction to two dimensions still retains about 99.94% of the original information. The data were divided into 10 classes, and the centroid scores differed considerably between classes; the centroid data were used as reference quantities when the initial exercise prescriptions were formulated.
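The shares implied by the quoted ratios can be checked with a few lines of arithmetic. This sketch uses only the two values quoted in the text; the variable names are illustrative.

```python
# Values quoted in the text for the two retained principal components
explained_variance_ratio = [0.99781134, 0.00157322]

retained = sum(explained_variance_ratio)  # share of variance kept
lost = 1.0 - retained                     # share discarded by the projection

summary = [f"PC{i}: {ratio:.2%}"
           for i, ratio in enumerate(explained_variance_ratio, start=1)]
summary.append(f"retained: {retained:.2%}, lost: {lost:.2%}")
print("\n".join(summary))
# PC1: 99.78%
# PC2: 0.16%
# retained: 99.94%, lost: 0.06%
```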

Figure 2. Distribution of boys in different categories
The sports test data were preprocessed and standardized, transformed by principal component analysis to eliminate the strong correlations between variables, and then modeled with a BP neural network. The BP (back propagation) neural network is one of the most successful neural network algorithms available. Of the student sports test records described above, 4,500 were used as training data and 1,500 as test data for training and predicting exercise programs. The overall structure of the BP neural network used in this paper is shown in Figure 3.

Figure 3. BP neural network overall structure
The K-means clustering algorithm was used to divide the 6,000 records into 10 categories, and a corresponding exercise scheme was formulated for each class. The target vectors of the 10 classes were set as [1 0 0 0 0 0 0 0 0 0], [0 1 0 0 0 0 0 0 0 0], …, [0 0 0 0 0 0 0 0 1 0] and [0 0 0 0 0 0 0 0 0 1], where the position of the 1 indicates the corresponding scheme. The prediction results of the BP neural network and the relationship between the number of hidden layer nodes and the error loss are shown in Figure 4. Testing showed that the prediction was best, with an accuracy of 96%, when the input layer had 7 neurons, the output layer had 10, and the hidden layer had 17 nodes. With too many or too few hidden layer nodes, prediction performance decreases.
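The ten-element vectors described above are a one-hot encoding of the scheme index. A minimal sketch of encoding a class and decoding a network output is given below; the function names and the example output values are hypothetical.

```python
def one_hot(index, n_classes=10):
    """Vector with a 1 at the position of the class's exercise scheme."""
    return [1 if i == index else 0 for i in range(n_classes)]


def decode(outputs):
    """Select the scheme whose output node has the largest activation."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


# Hypothetical activations of the 10 output nodes for one student
outputs = [0.02, 0.01, 0.85, 0.03, 0.01, 0.02, 0.02, 0.01, 0.02, 0.01]
predicted_scheme = decode(outputs)  # index 2, i.e. the third scheme
```

Prediction accuracy is then the fraction of test records for which the decoded index equals the index of the cluster the record belongs to.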

Figure 4. Hidden layer node number, iteration times and error loss
On the first day, the exercise lasted 50 minutes. Heart rate was measured by the bracelet, and the heart rate stayed between 120 and 140 beats per minute for 40 minutes; these data were passed to the server for judgment, and the exercise level for that day was recorded as 1. The user’s feedback for that day was: “It’s been a long time since I ran, I feel exhausted”. This text was passed to the NLP sentiment analysis model, which produced a negative sentiment record of -1. The same procedure was followed for each day’s exercise data.
After 2 weeks of exercise, the daily exercise level data were [1, 0, 0, -1, 1] for the first week and [0, 0, -1, 1, 1] for the second week, and the affective data were [-1, 1, 0, 1, -1] for the first week and [-1, 1, 0, -1, -1] for the second week. Inputting the exercise level data and the affective data into the computational model gives an affective analysis level Y of -1, and the final (
The initial exercise program was adjusted by increasing its overall exercise intensity by 10%, and the following personalized exercise program was recommended. The program comprised 4-6 training sessions of 35-60 minutes each, and the purpose and precautions of the exercise intervention were the same as those of the initial program. The adjusted personalized exercise program is shown in Table 3.
Repeat steps 1) to 3) for a further 2 weeks of exercise and continue to adjust the personalized exercise program.
Table 3. Adjusted exercise prescription for men in Category 1 (weeks 1 and 2)
| Week 1 | Training topic | Training content | Exercise intensity | Sets/rest |
|---|---|---|---|---|
| Train 1 | Upper limb muscle + Aerobic endurance | Flat support | 34 seconds | 3 sets/rest for 35 seconds |
| | | Skin-board push-ups | 19 times/group | 3 sets/rest for 35 seconds |
| | | Straight arm suspension | 34 seconds/group | 3 sets/rest for 50 seconds |
| | | Push-ups | 16 times/group | 3 sets/rest for 35 seconds |
| | | 3000 m | Pace of 5 minutes 42 seconds per kilometer | |
| Train 2 | Torso power + Aerobic endurance | Flat support | 34s/group | 3 sets/rest for 30 seconds |
| | | Crib | 23s/group | 3 sets/rest for 35 seconds |
| | | Double up | 18s/group | 3 sets/rest for 50 seconds |
| | | 4000 m | Pace of 5 minutes 38 seconds per kilometer | 3 sets/rest for 35 seconds |
| Train 3 | Upper limb torso | Flat support | 34s/group | 3 sets/rest for 35 seconds |
| | | Straight arm suspension | 34s/group | 3 sets/rest for 30 seconds |
| | | Push-ups | 23s/group | 3 sets/rest for 50 seconds |
| | | Crib | 23s/group | 3 sets/rest for 30 seconds |
| Week 2 | Training topic | Training content | Exercise intensity | Sets/rest |
| Train 1 | Upper limb muscle + Aerobic endurance | Flat support | 34s | 3 sets/rest for 35 seconds |
| | | Skin-board push-ups | 24s/group | 3 sets/rest for 30 seconds |
| | | Straight arm suspension | 34s/group | 2 sets/rest for 40 seconds |
| | | Bent arm suspension | 25s/group | 2 sets/rest for 45 seconds |
| | | Push-ups | 14s/group | 3 sets/rest for 35 seconds |
| | | 2000 m | Pace of 5 minutes 42 seconds per kilometer | |
| Train 2 | Torso power + Aerobic endurance | Flat support | 34s/group | 3 sets/rest for 40 seconds |
| | | Crib | 29s/group | 3 sets/rest for 40 seconds |
| | | Double up | 24s/group | 3 sets/rest for 40 seconds |
| | | 2500 m | Pace of 5 minutes 41 seconds per kilometer | |
| Train 3 | Upper limb torso | Flat support | 34/group | 3 sets/rest for 35 seconds |
| | | Straight arm suspension | 34/group | 3 sets/rest for 52 seconds |
| | | Bent arm suspension | 23/group | 3 sets/rest for 40 seconds |
| | | Push-ups | 23/group | 3 sets/rest for 40 seconds |
| | | Crib | 30s/group | 2 sets/rest for 35 seconds |
In this paper, we first analyze the correlation of the sports test data and use principal component analysis to reduce the dimensionality of the data. K-means clustering then highlights the different features in the sports test data. Next, a BP neural network model recommends a personalized exercise program, which is adjusted using NLP sentiment analysis. The main findings are as follows:
1) Most of the students fell in the “passing” category, and 8.46% to 10.75% fell in the “failing” category each year. Students in the failing category had a high average weight and lower than normal physical fitness, and the correlation analysis found a negative correlation between the 50-meter run and the standing long jump.
2) The physical education test data were clustered into 10 categories, and the center point scores differed greatly between categories. The BP neural network was used to predict the corresponding exercise regimen for each class, with a prediction accuracy of 96%.
3) After adjustment by NLP sentiment analysis, a personalized exercise regimen to be executed over a period of 2 weeks was obtained. The adjusted regimen basically consisted of 4-6 training sessions, each lasting 35-60 minutes.
