Research on the Optimization of English Teaching Mode and Personalized Learning Path in Colleges and Universities Based on Big Data Regression Analysis
Published online: 24 March 2025
Submitted: 01 Nov 2024
Accepted: 10 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0794
© 2025 Dongmei Li, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Innovation in teaching modes is key to improving teaching quality and cultivating students’ core competitiveness, so exploring and applying data-driven teaching innovation has become crucial. Through data analysis, teachers can gain a deep understanding of students’ learning needs, progress and styles, and can design and adjust teaching content and methods accordingly. Personalized teaching thus becomes possible, helping each student achieve better learning results [1-4]. However, realizing this innovation requires overcoming several challenges: continuously improving the technical infrastructure, ensuring data security and privacy protection, and raising teachers’ professionalism and level of training [5]. Only in this way can colleges and universities meet the challenges in the field of education and create a more valuable educational environment for students’ learning and development.
In the era of big data, the innovation of English teaching mode in colleges and universities has become the focus of extensive attention in the academic and educational circles. With the rapid development of information technology, traditional English teaching methods gradually appear to be lagging behind and cannot fully meet students’ language learning needs in the context of diversified disciplines and globalization [6-7]. The popularization of big data has led to an unprecedented shift in educational methods, and how to make full use of these technical means to improve the quality of English teaching in colleges and universities is an urgent problem to be solved. The changes brought by the era of big data are not only the updating of technology, but also the challenge to the traditional English teaching concepts and methods. Students’ subject backgrounds are becoming more and more diversified, and the social demand for English application ability and practical talents is also more complex and diverse, which requires the English teaching mode in colleges and universities to have stronger adaptability and innovation [8-10]. Therefore, in-depth research on English teaching mode in colleges and universities in the era of big data is imperative in order to provide more targeted educational programs and promote the overall development of students in language proficiency, innovative thinking, personalized learning and other aspects [11-14].
Literature [15] analyzed a LAN-based, computer-optimized English teaching model, covering classroom structure, classroom practice, the teaching system, and classroom construction strategy, and found that the optimized model significantly improved teaching efficiency and teacher-student communication. Literature [16] established a user-driven, rapid-response knowledge space that integrates multimodal subject knowledge resources in order to optimize the student-oriented English teaching mode under the guidance of educational objectives and to raise the classroom interaction rate. Literature [17] processed English teaching data from colleges and universities with neural network algorithms, and realized the innovation and optimization of English teaching under simulated cognitive processes. Literature [18] explored the possibility of optimizing the teaching mode by combining deep learning with blended English teaching. By comprehensively analyzing the principles for integrating deep learning with blended teaching, it revealed the importance of balancing online English resources with the traditional education mode, but the combined mode showed no outstanding advantage over the traditional teaching mode. These existing optimizations of the English teaching mode mainly improve teaching interaction and efficiency or innovate the teaching mode itself, and they offer few specific reference factors on which further optimization could be based.

In addition, literature [19] used an adjustable recurrent neural network optimized by an improved Drosophila (fruit fly) algorithm to realize personalized English learning. It analyzed and evaluated English learning data, extracted features from them, and recommended personalized learning paths according to those features. Literature [20] analyzed learning with intelligent education technology in business English through an explanatory sequential mixed-methods design, showing that personalized learning is positively related to students’ motivation, commitment, and achievement, and that intelligent education technology promotes the development of personalized teaching. Literature [21] used big data technology to study personalized English teaching strategies and learning analysis. It mainly applied data mining to analyze how well students absorbed what they had learned, and also analyzed the personalized learning effect of a WeChat-based personalized network platform, but it did not describe the personalized path itself. As this overview shows, research on path planning remains relatively scarce. Literature [22] explored self-mapped learning path theory, in which students develop multi-level personalized paths based on their own learning situation and the course design. However, this approach places high demands on students’ self-control, knowledge, and self-understanding, and is not practical for students with poor self-control, weak knowledge, and poor self-perception. A generally applicable personalized learning path is therefore needed to meet the needs of students at different stages and to achieve personalized learning in the true sense.

Big data regression analysis explores the interrelationship between two or more variables by constructing a numerical model that predicts the value of the dependent variable when the independent variables are known [23]. It provides strong guidance for optimizing the teaching mode and planning personalized learning paths in English teaching, where multiple factors act together.

In this paper, an ordered multicategorical logistic regression model is constructed on the basis of the logistic regression model and used to analyze the factors influencing English teaching modes in colleges and universities. The modern and traditional English teaching modes are then integrated, and a learner state model based on online learner behavior is proposed. According to the judged learner state, a personalized learning path that matches that state is recommended, and finally an accurate personalized learning path planning framework based on the learner state is designed.
Logistic regression can predict the likelihood of an event occurring under the action of several input variables. The outcome can be regarded as two opposing events, such as the occurrence of the event ($y=1$) and its non-occurrence ($y=0$).

Logistic regression is most commonly applied to binary classification, in which only classes 0 and 1 are distinguished. Its probability distribution can be expressed as:

$$P(y=1\mid x)=\frac{e^{wx+b}}{1+e^{wx+b}} \tag{1}$$

$$P(y=0\mid x)=\frac{1}{1+e^{wx+b}} \tag{2}$$

Where $x$ is the input variable, $w$ is the model weight, and $b$ is the bias.

When there are multiple inputs, the model weight vector and the input variables are expanded to $w=(w_1,\dots,w_n,b)^{T}$ and $x=(x_1,\dots,x_n,1)^{T}$, but in this paper they are still notated as $w$ and $x$:

$$P(y=1\mid x)=\frac{e^{w^{T}x}}{1+e^{w^{T}x}} \tag{3}$$

$$P(y=0\mid x)=\frac{1}{1+e^{w^{T}x}} \tag{4}$$

For convenience of presentation, the above two equations are unified in the following equation:

$$h_w(x)=P(y=1\mid x)=\frac{1}{1+e^{-w^{T}x}},\qquad P(y=0\mid x)=1-h_w(x) \tag{5}$$

Equation (5) is known as the logistic regression function.

To obtain the optimal coefficients of the logistic regression model, the gradient descent method is commonly used during training. Gradient descent determines the step size of each coefficient update from the deviation between the actual and predicted results and from the learning rate (a preset parameter), and it reaches the optimal regression coefficients after a number of iterations.

Assuming that there are $m$ training samples $(x_i,y_i)$, $i=1,\dots,m$, with $y_i\in\{0,1\}$, the probability of a single sample and the likelihood function of all samples are:

$$P(y_i\mid x_i;w)=h_w(x_i)^{y_i}\,\bigl(1-h_w(x_i)\bigr)^{1-y_i} \tag{6}$$

$$L(w)=\prod_{i=1}^{m}h_w(x_i)^{y_i}\,\bigl(1-h_w(x_i)\bigr)^{1-y_i} \tag{7}$$

where $h_w(x_i)$ is the predicted probability that sample $i$ belongs to class 1. The following is obtained by taking the logarithm of the above equation:

$$\ell(w)=\ln L(w)=\sum_{i=1}^{m}\Bigl[y_i\ln h_w(x_i)+(1-y_i)\ln\bigl(1-h_w(x_i)\bigr)\Bigr] \tag{8}$$

The negative of Eq. (8), $J(w)=-\ell(w)$, can be regarded as the logarithmic loss function of the logistic regression function. Taking its derivative with respect to each weight $w_j$ gives:

$$\frac{\partial J(w)}{\partial w_j}=\sum_{i=1}^{m}\bigl(h_w(x_i)-y_i\bigr)x_{ij} \tag{9}$$

After obtaining the derivatives of the weights, the parameter update is calculated from the deviation that occurs at each training step:

$$w_j\leftarrow w_j-\alpha\sum_{i=1}^{m}\bigl(h_w(x_i)-y_i\bigr)x_{ij} \tag{10}$$

The updating formula of the weights is shown in Eq. (10). After each prediction on the training samples, the deviation from the target is obtained through the logarithmic loss function, as shown in Eq. (9), and the weights $w$ are then updated according to this deviation.

To prevent overfitting during training and to improve the generalization performance of the logistic regression model, a regular term such as the L2 norm of the weights is often added to the loss function.

The loss function after adding the L2 regular term becomes:

$$J_{\lambda}(w)=J(w)+\frac{\lambda}{2}\sum_{j}w_j^{2} \tag{11}$$

The update of the regularized gradient descent method is:

$$w_j\leftarrow w_j-\alpha\Bigl[\sum_{i=1}^{m}\bigl(h_w(x_i)-y_i\bigr)x_{ij}+\lambda w_j\Bigr] \tag{12}$$

where $\alpha$ is the learning rate and $\lambda$ is the regularization coefficient.
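The training procedure described above can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names (`sigmoid`, `fit_logistic`, `predict_proba`), the toy data, and the default learning rate and L2 coefficient are all assumptions chosen for illustration. The sketch performs batch gradient descent on the L2-regularized logarithmic loss, with the bias folded into the weight vector as a constant input of 1.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, alpha=0.1, lam=0.01, n_iter=2000):
    """Batch gradient descent on the L2-regularized log loss.

    X: list of feature lists; y: list of 0/1 labels.
    alpha (learning rate) and lam (L2 coefficient) are illustrative defaults.
    """
    # Append a constant 1.0 so the bias is the last component of w.
    rows = [list(x) + [1.0] for x in X]
    w = [0.0] * len(rows[0])
    for _ in range(n_iter):
        grad = [0.0] * len(w)
        for x, yi in zip(rows, y):
            # Deviation between predicted probability and actual label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x))) - yi
            for j, xj in enumerate(x):
                grad[j] += err * xj
        # Gradient step plus the L2 shrinkage term (applied to the bias too).
        w = [wj - alpha * (gj + lam * wj) for wj, gj in zip(w, grad)]
    return w


def predict_proba(w, x):
    """Predicted probability of class 1 for one feature list x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, list(x) + [1.0])))


# Toy data (hypothetical): one feature, class 1 for larger values.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
w = fit_logistic(X, y)
p_low = predict_proba(w, [0.0])
p_high = predict_proba(w, [3.0])
```

On this toy data the fitted model assigns a probability below 0.5 to the smallest input and above 0.5 to the largest. A larger `lam` shrinks the weights more strongly and flattens the predicted probabilities.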
The questionnaires were distributed through the public online questionnaire platform Questionstar. From September to October 2023, 545 online questionnaires were retrieved, of which 531 were valid, a validity rate of 97.43%. All respondents had at least half a year’s experience of studying university English courses.
After defining each variable, the author designed the questionnaire on a 5-level scale in the form of objective multiple-choice questions. It comprises four major sections: learner factors, teacher factors, online course factors and environmental factors. The questionnaire was revised in light of the author’s own observations of English classroom teaching. Learner factors include learner motivation and learning strategies, totaling six items; teacher factors include professionalism and teaching guidance, totaling five items; course factors include content format, totaling five items; and environmental factors include platform design and teaching interaction, totaling six items. Before the formal survey, the initial scale was pilot tested on a small scale. The 50 questionnaires collected at this stage were factor analyzed, the questionnaire was revised on the basis of this categorization and analysis, and the formal questionnaire was then finalized and administered to the participants.
The ordered multicategorical logistic regression model is a probabilistic nonlinear regression model suited to analyzing the relationship between an ordered multicategorical dependent variable and multiple independent variables. The model does not require the variables to follow a normal distribution, its independent variables can be continuous or discrete, and it is most appropriate for discrete, hierarchically categorized dependent variables [25]. The basic idea of the model is to split the ordered dependent variable into two classes at each successive cut point, and to build a logistic regression model with a dichotomous dependent variable for each split.
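Let the ordered dependent variable $y$ take the values $1,2,\dots,J$, and let $x=(x_1,\dots,x_p)^{T}$ be the vector of independent variables. The ordered multicategorical logistic regression model is:

$$\ln\frac{P(y\le j\mid x)}{1-P(y\le j\mid x)}=\theta_j-\sum_{k=1}^{p}\beta_k x_k,\qquad j=1,\dots,J-1 \tag{13}$$

where $\theta_j$ is the intercept of the $j$-th cumulative split, with $\theta_1\le\dots\le\theta_{J-1}$, and $\beta_k$ is the regression coefficient of $x_k$, which is common to all splits.

The parameter estimates of the model can be derived using the maximum likelihood method. Assuming that there are $n$ independent samples $(x_i,y_i)$, the likelihood function is:

$$L(\theta,\beta)=\prod_{i=1}^{n}\prod_{j=1}^{J}\bigl[P(y_i=j\mid x_i)\bigr]^{d_{ij}} \tag{14}$$

Where

$$P(y=j\mid x)=P(y\le j\mid x)-P(y\le j-1\mid x) \tag{15}$$

with $P(y\le 0\mid x)=0$ and $P(y\le J\mid x)=1$, and $d_{ij}=1$ if $y_i=j$ and $d_{ij}=0$ otherwise.

In this paper, the teachers’ evaluation performance is divided into five grades and set as the dependent variable, which takes the values $y=1,2,3,4,5$.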
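To make the cumulative structure of the ordered model concrete, the following Python sketch computes the probability of each grade from a set of cut-point intercepts and coefficients. It assumes the parameterization in which the log-odds of $y\le j$ equal the $j$-th intercept minus the linear predictor, which is the convention used by the polr function in R. The function name `ordered_logit_probs` and all numeric values are hypothetical and are not estimates from this study.

```python
import math


def ordered_logit_probs(thresholds, beta, x):
    """Return [P(y=1), ..., P(y=J)] for a cumulative logit model.

    Assumes logit P(y <= j | x) = thresholds[j-1] - sum(beta_k * x_k),
    with the J-1 thresholds given in increasing order.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Cumulative probabilities P(y <= j), padded with 0 and 1 at the ends.
    cum = [0.0] + [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    # Category probabilities are differences of adjacent cumulative values.
    return [cum[j + 1] - cum[j] for j in range(len(cum) - 1)]


# Hypothetical five-grade example with two independent variables.
thresholds = [-2.0, -1.0, 0.5, 2.0]
beta = [1.0, 0.5]
probs_low = ordered_logit_probs(thresholds, beta, [0.0, 0.0])
probs_high = ordered_logit_probs(thresholds, beta, [2.0, 2.0])
```

Under this sign convention a positive coefficient shifts probability mass toward the higher grades, so `probs_high` places more weight on grade 5 than `probs_low` does.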
To test the accuracy of the model, 120 samples are taken as the test set to check the model fit, and the remaining 360 samples are used as the training set to fit the model. An ordered multicategorical logistic regression model is built on the training set with the polr function of the MASS package in R, and the parameter estimates of the full model are shown in Table 1. The table shows that many variables have small t-values in absolute terms, while the minimum t-value is -13.92, so there may be a multicollinearity problem among the independent variables.
Table 1. Parameter estimates of the full model
| Variable | Coefficient | Standard error | T value |
|---|---|---|---|
| Intercept 1\|2 | 4.47 | 1.71 | 2.47 |
| Intercept 2\|3 | 4.03 | 1.69 | 2.87 |
| Intercept 3\|4 | 6.02 | 1.66 | 3.48 |
| Intercept 4\|5 | 8.11 | 1.66 | 4.67 |
| Learner factor (x1) | 1.27 | 0.28 | 4.44 |
| Teacher factor (x2) | 0.29 | 0.32 | 0.83 |
| Online course factor (x3) | 0.68 | 0.35 | 1.86 |
| Environmental factor (x4) | 0.53 | 0.81 | 0.62 |
| Learner motivation (x5) | 0.02 | 0.01 | 1.84 |
| Learning strategy (x6) | -0.01 | 0.01 | -1.02 |
| Professional literacy (x7) | -0.69 | 0.36 | -1.79 |
| Teaching guidance (x8) | -12.57 | 0.88 | -13.92 |
| Content form (x9) | 0.02 | 0.01 | 3.31 |
| Platform design (x10) | 0.43 | 0.03 | 12.41 |
| Teaching interaction (x11) | 0.35 | 0.05 | 2.47 |
| Residual deviance | 409.00 | AIC | 439.00 |
To optimize the model, the independent variables were screened using backward stepwise regression, and the insignificant variables were eliminated step by step: x2 (teacher factors), x5 (learner motivation), x6 (learning strategies), x7 (professional literacy), x8 (teaching guidance), x9 (content form), x10 (platform design) and x11 (teaching interaction). The ordered multicategorical logistic model was then refitted on the remaining variables x1 (learner factors), x3 (online course factors) and x4 (environmental factors). The regression results and test results are shown in Tables 2 and 3.
Table 2. Regression results of the reduced model
| Variable | Coefficient | Standard error | T value |
|---|---|---|---|
| Intercept 1\|2 | -4.41 | 0.61 | -6.97 |
| Intercept 2\|3 | -3.91 | 0.52 | -7.48 |
| Intercept 3\|4 | -2.91 | 0.38 | -7.48 |
| Intercept 4\|5 | -0.89 | 0.29 | -2.95 |
| Learner factor (x1) | 1.28 | 0.26 | 4.45 |
| Online course factor (x3) | 0.79 | 0.35 | 2.18 |
| Environmental factor (x4) | -0.85 | 0.30 | -2.01 |
Table 3. Test results of the reduced model
| Variable | LR chi-square | Degrees of freedom | Significance | 95% confidence interval, lower limit | 95% confidence interval, upper limit |
|---|---|---|---|---|---|
| Learner factor (x1) | 19.2045 | 1 | 5.786e-05*** | 1.1963 | 1.3457 |
| Online course factor (x3) | 4.3988 | 1 | 0.018675* | 0.75478 | 0.84279 |
| Environmental factor (x4) | 8.7842 | 1 | 0.001679** | -0.9167 | -0.8317 |
| Residual deviance | 421.78 | | | | |
| AIC value | 435.78 | | | | |
| -2 log likelihood | 1317.39 | | | | |
As shown in Tables 2 and 3, the P-value of each variable in the model is less than 0.02, which is significant. Moreover, the AIC (Akaike information criterion) value of the model is 435.78, which is relatively small and lower than the 439.00 of the full model. Overall, the model fits well.
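The AIC values reported for the two models can be rechecked with a short calculation. The sketch below assumes that the residual value in Tables 1 and 3 is the residual deviance of the fitted model (minus twice the maximized log-likelihood) and that the number of estimated parameters is the four intercepts plus the number of slope coefficients. The helper name `aic` is hypothetical.

```python
def aic(residual_deviance, n_parameters):
    """AIC = residual deviance + 2 * number of estimated parameters."""
    return residual_deviance + 2 * n_parameters


# Full model: 4 intercepts + 11 slope coefficients (Table 1).
full_aic = aic(409.00, 4 + 11)

# Reduced model: 4 intercepts + 3 slope coefficients (Tables 2 and 3).
reduced_aic = aic(421.78, 4 + 3)
```

Under these assumptions the calculation reproduces the tabulated values of 439.00 and 435.78, so the reduced model attains the lower AIC despite its slightly higher deviance.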
The accuracy of the model is tested on the test set. Predictions are obtained with the predict function in R, and the predicted and true values are compared in Table 4.
Table 4. Predicted versus true grades on the test set
| Predicted value (grade) | True value 1 | True value 2 | True value 3 | True value 4 | True value 5 |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 1 |
| 5 | 0 | 1 | 4 | 14 | 95 |
Table 4 shows that no test sample has a true value of 1, while the test samples with true values of 2, 3 and 4 are all predicted as 5, numbering 1, 4 and 14 respectively. Of the test samples with a true value of 5, only 1 is predicted as 4, and 95 are predicted as 5. Therefore, the overall accuracy of the model is 95/120 = 0.7917.
If predictions that differ from the true grade by only one level are counted as close to correct (a true value of 4 predicted as 5, and a true value of 5 predicted as 4), the correctness rate is approximately (1+14+95)/120 = 0.92. This is a good prediction result and suggests that the chosen regression model is reasonable. From this, we can conclude that the main factors influencing English teaching in colleges and universities are learner factors, online course factors and environmental factors.
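The two accuracy figures can be recomputed from the counts in Table 4. The sketch below is only an illustration: the helper name `count_within` is hypothetical, and the denominator follows the test-set size of 120 stated in the text, even though the cells of the table as printed sum to 115.

```python
# Rows: predicted grade 1..5; columns: true grade 1..5 (counts from Table 4).
confusion = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 1, 4, 14, 95],
]
N_TEST = 120  # test-set size as stated in the text


def count_within(matrix, tolerance):
    """Count samples whose predicted grade is within `tolerance` of the true grade."""
    return sum(
        count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
        if abs(i - j) <= tolerance
    )


exact_accuracy = count_within(confusion, 0) / N_TEST
adjacent_accuracy = count_within(confusion, 1) / N_TEST
```

With a tolerance of 0 only the diagonal is counted (95 samples), and with a tolerance of 1 the adjacent cells are added (110 samples), which gives roughly 0.79 and 0.92.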
In the English teaching mode supported by information technology in colleges and universities, information technology joins the teaching process as a teaching aid, while the teacher still occupies the dominant position in the English classroom. The teacher needs to guide students through every stage of the learning process and resolve the problems and difficulties they encounter in a timely manner. In this way, targeted English teaching can be carried out according to students’ actual learning situation, improving the quality and efficiency of English classroom teaching.
In this teaching mode, information technology has clear advantages, and teachers should make full use of them to create a good environment for students’ English learning and to improve their learning efficiency. In actual teaching, teachers should also develop appropriate learning programs and make the corresponding teaching preparations according to students’ actual situation, so as to provide targeted teaching.
Although information technology has these advantages, the traditional teaching mode also retains educational value. Information technology can provide students with rich learning content, while the traditional teaching mode can better integrate the process by which students acquire knowledge, deepen their understanding and mastery of it, and effectively improve the efficiency of classroom teaching. Therefore, in actual teaching it is necessary to combine modern and traditional teaching methods in order to improve the quality and efficiency of English teaching in colleges and universities.
An implicit relationship exists between learners’ online learning behaviors and the difficulty of knowledge points, and it is often overlooked. In judging the difficulty of knowledge points, this study therefore takes into account a series of online learning behaviors, including video viewing, forum interaction and practice. To abstract the relationship between the different parameters, the author constructed a model that measures how difficult a specific knowledge point is for students in general, i.e., a knowledge point difficulty score model based on learner behavior. The specific formula is shown in Equation (16).
Where,
In Eq. (16),
The input parameter of the Knowledge Point Difficulty model is the average historical learning performance of all users who have learned the knowledge point
The assessment of learner state values sets the foundation for subsequent personalized path planning. In order to realize the judgment of students’ learning knowledge state, this study constructs a learner state model based on learners’ online learning behavior. The model aims to objectively evaluate the learner’s mastery state of knowledge points and assign values to them. The input parameters of the learner state judgment model are the individual learner’s video viewing behavior of a specific knowledge point and its practice test results, and the output is the individual learner’s mastery of a specific knowledge point, i.e., the state value judgment.
The judgment process for learners consists mainly of the following steps:
(1) The online learner follows the initial sequence of the course plan.
(2) The learner watches videos and completes the chapter test questions.
(3) The learner’s learning status is determined from the normalized values of the learner’s online learning behavior and practice test results.
(4) The appropriate next sequence of knowledge points is planned for the learner based on the learner status, which completes the personalized path planning process.
The learning behavior of the online learner is recorded as log data, and the results of the completed tests are normalized. In this study, the author divides the learning status of online learners into four states: “unlearned”, “not mastered”, “insufficient mastery” and “mastered”. The “unlearned” state is assigned a state value of 1, the “not mastered” state a value of 2, the “insufficient mastery” state a value of 3, and the “mastered” state a value of 4.
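The mapping from learner behavior to the four state values can be sketched in Python as follows. The paper does not give the decision rule, so everything specific here is an assumption made for illustration: the function names (`normalize`, `learner_state`), the use of min-max normalization on a 0 to 100 test score, and the cut-off values 0.6 and 0.85.

```python
def normalize(score, min_score=0.0, max_score=100.0):
    """Min-max normalize a raw test score to the range [0, 1]."""
    if max_score <= min_score:
        raise ValueError("max_score must be greater than min_score")
    value = (score - min_score) / (max_score - min_score)
    return min(1.0, max(0.0, value))


def learner_state(watched_video, raw_score, low=0.6, high=0.85):
    """Return the state value 1-4 for one learner and one knowledge point.

    1 = unlearned, 2 = not mastered, 3 = insufficient mastery, 4 = mastered.
    The thresholds `low` and `high` are hypothetical, not taken from the study.
    """
    if not watched_video or raw_score is None:
        return 1  # no learning record for this knowledge point
    score = normalize(raw_score)
    if score < low:
        return 2
    if score < high:
        return 3
    return 4


states = [
    learner_state(False, None),
    learner_state(True, 40),
    learner_state(True, 70),
    learner_state(True, 90),
]
```

In a real system the thresholds would have to be calibrated against the logged behavior data, and video viewing behavior could enter the score rather than acting only as a gate.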
A learning path is an ordered sequence of learning content and learning activities experienced by a learner in the learning process, in which the learner realizes the learning of basic knowledge, the mastery of the method system, and the completion of the problem solving and tasks, so as to enhance the corresponding competence [26]. Therefore, the learning path can be represented by a three-dimensional vector matrix including three dimensions: knowledge
Learning path formalized representation:
Where, 1, 2 until
The learning paths experienced by learners can be divided into mainstream learning paths and personalized learning paths. Mainstream learning paths are simple sequences of learning content and activities that meet the learning needs of most students and are applicable to most students based on the big data and knowledge mapping of the learning outcomes of the student population, which also includes three dimensions: knowledge
Personalized learning path is a learning sequence based on the analysis of each learner’s learning outcomes, designing learning objectives to meet his/her learning needs, and providing learning content and activities that meet his/her learning style and cognitive characteristics, which are paced and controlled by the learner. A personalized learning path can be expressed as follows:
Learning path planning is to match each learner’s learning profile with a learning path that suits the learner’s individual development on the basis of mainstream learning paths.
This study used the kmeans function of R to conduct an exploratory cluster analysis of the final estimated attribute mastery probabilities of 531 students from the College of Foreign Languages of a university. The cluster analysis of the attribute mastery probabilities settled on 11 classes, as shown in Table 5.
Table 5. Attribute mastery probabilities of the clusters
| Categories | A1 | A2 | A3 | A4 | A5 | A6 | A7 | T1 | T2 | Mean |
|---|---|---|---|---|---|---|---|---|---|---|
| Ks1 | 0.977 | 0.99 | 0.262 | 0.172 | 0.737 | 0.965 | 0.839 | 0.396 | 0.178 | 0.613 |
| Ks2 | 0.751 | 0.949 | 0.699 | 0.272 | 0.521 | 0.174 | 0.358 | 0.284 | 0.125 | 0.459 |
| Ks3 | 0.751 | 0.949 | 0.699 | 0.272 | 0.521 | 0.174 | 0.358 | 0.284 | 0.125 | 0.534 |
| Ks4 | 0.992 | 1 | 0.883 | 0.933 | 0.423 | 0.99 | 0.592 | 0.395 | 0.792 | 0.778 |
| Ks5 | 0.948 | 0.955 | 0.381 | 0.095 | 0.179 | 0.902 | 0.539 | 0.308 | 0.242 | 0.505 |
| Ks6 | 0.968 | 0.962 | 0.686 | 0.688 | 0.762 | 0.95 | 0.467 | 0.481 | 0.556 | 0.724 |
| Ks7 | 0.836 | 0.936 | 0.223 | 0.307 | 0.332 | 0.974 | 0.838 | 0.467 | 0.809 | 0.636 |
| Ks8 | 0.814 | 0.96 | 0.244 | 0.211 | 0.458 | 0.841 | 0.646 | 0.802 | 0.161 | 0.571 |
| Ks9 | 0.000 | 0.409 | 0.001 | 0.275 | 0.066 | 0.000 | 0.229 | 0.192 | 0.005 | 0.131 |
| Ks10 | 0.000 | 0.01 | 0.137 | 0.01 | 0.037 | 0.042 | 0.003 | 0.961 | 0.48 | 0.187 |
| Ks11 | 0.968 | 0.962 | 0.686 | 0.688 | 0.762 | 0.951 | 0.556 | 0.792 | 0.539 | 0.767 |
In the table, ks9 has the lowest average attribute mastery probability, 0.131, and ks4 has the highest, 0.778. The average mastery probability of ks11, which can be categorized as full mastery, is 0.767 and is therefore not the highest.
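The clustering step can be illustrated with a small pure-Python version of the K-means procedure (Lloyd's algorithm). This is a sketch for illustration, not the routine used in the study: the function names and the toy data are hypothetical, and the centers are initialized deterministically with a farthest-point rule, whereas statistical packages typically use random starting centers.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(points, k):
    """Deterministic start: first point, then repeatedly the farthest point."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(list(far))
    return centers


def kmeans(points, k, max_iter=100):
    """Return (centers, labels) after alternating assignment and update steps."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Toy mastery-probability vectors (hypothetical): low versus high mastery.
points = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
          [0.9, 1.0], [1.0, 0.9], [0.95, 0.95]]
centers, labels = kmeans(points, k=2)
```

Each resulting center plays the role of one row of the cluster table: it is the mean mastery probability of every attribute over the students assigned to that cluster.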
There are five complete learning paths leading from the state in which no attribute is mastered to the state in which all attributes are mastered. Merging the number of students in each knowledge state along a path gives the number of students on each of these five paths, as shown in Table 6. The table shows that learning path 4 has 245 students, accounting for 46.14% of the total. On this path, students start from the state in which no attribute is mastered (ks9) and master attributes A1 (basic knowledge), A2 (lexical properties of words), A3 (grammatical composition of the English language) and A5 (translation of English sentences) to reach knowledge state ks2. Advancing to ks6 then requires mastering, on the basis of ks2, attributes A4 (reading comprehension problem solving), A6 (word mastery) and T2 (relational representation). Finally, mastering A7 (English listening) and T1 (the skill of recognizing implicit conditions) leads to ks11, the state of full mastery of all attributes.
Table 6. Types of complete learning paths and number of students
| Type | Path process | Number of students |
|---|---|---|
| 1 | ks9→ks3→ks11 | 117 |
| 2 | ks9→ks10→ks5→ks8→ks11 | 154 |
| 3 | ks9→ks10→ks5→ks1→ks11 | 179 |
| 4 | ks9→ks2→ks6→ks11 | 245 |
| 5 | ks9→ks10→ks5→ks7→ks4→ks11 | 235 |
Taking the current learning state of each of the 531 students in this study as the starting point yields 15 individual learning path types, for which the number of students on each path and the corresponding competency values were derived.
The types and numbers of individual learning paths are shown in Table 7. In the table, 22 students are judged to have mastered all attributes, which is 4.14% of the total number of students. Learning path 13 has the smallest share of all path types, only 0.56%. Learning path type 1 accounts for 22.79% of the students, the highest share of all path types. On this path the student’s current knowledge state is ks4, which means that the student has mastered all of the attributes except T1 (the skill of recognizing implicit conditions), and the student then advances from ks4 to ks11, the state of full mastery. The average competency value of the students classified into this path type is 0.41.
Table 7. Individual learning path types and number of students
| Type | Path process | Number of students on the path | Mean competency value |
|---|---|---|---|
| 1 | ks4→ks11 | 121 | 0.41 |
| 2 | ks1→ks11 | 55 | 0.265 |
| 3 | ks8→ks11 | 15 | -0.102 |
| 4 | ks3→ks11 | 85 | -0.614 |
| 5 | ks6→ks11 | 76 | 0.574 |
| 6 | ks2→ks6→ks11 | 27 | -0.201 |
| 7 | ks7→ks4→ks11 | 25 | 0.004 |
| 8 | ks5→ks7→ks4→ks11 | 15 | 0.011 |
| 9 | ks5→ks8→ks11 | 6 | -0.052 |
| 10 | ks5→ks1→ks11 | 15 | -0.025 |
| 11 | ks10→ks5→ks7→ks4→ks11 | 52 | -1.635 |
| 12 | ks9→ks10→ks5→ks7→ks4→ks11 | 10 | -1.835 |
| 13 | ks9→ks2→ks6→ks11 | 3 | -1.263 |
| 14 | ks9→ks3→ks11 | 4 | -1.236 |
| 15 | Fully mastered | 22 | 1.041 |
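The shares quoted for the individual path types follow directly from the student counts in Table 7, as the short Python check below shows. The variable names are hypothetical; the counts are copied from the table.

```python
# Number of students per individual learning path type (Table 7).
path_counts = {
    1: 121, 2: 55, 3: 15, 4: 85, 5: 76,
    6: 27, 7: 25, 8: 15, 9: 6, 10: 15,
    11: 52, 12: 10, 13: 3, 14: 4, 15: 22,
}

total_students = sum(path_counts.values())

# Share of each path type as a percentage, rounded to two decimals.
shares = {
    path: round(100 * count / total_students, 2)
    for path, count in path_counts.items()
}

largest_type = max(path_counts, key=path_counts.get)
smallest_type = min(path_counts, key=path_counts.get)
```

The counts add up to the 531 students in the sample. Type 1 has the largest share (22.79%), type 13 the smallest (0.56%), and the fully mastered group (type 15) accounts for 4.14%.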
Based on the above analysis, a learning diagnostic report on the attribute mastery of student No. 107 is shown in Table 8. The table shows that this student’s probability of mastery is relatively low for attributes A3 (grammatical composition of English), A4 (reading comprehension solutions), A5 (English sentence translation), T1 (skill of recognizing implicit conditions) and T2 (relational representations), and the final judgment is that the student has not mastered them. The student should therefore focus on attributes A3, A4, A5, T1 and T2.
Table 8. Attribute mastery of student No. 107
| | A1 | A2 | A3 | A4 | A5 | A6 | A7 | T1 | T2 | |
|---|---|---|---|---|---|---|---|---|---|---|
| mp | 0.9927 | 0.9965 | 0.4527 | 0.041 | 0.0332 | 0.9645 | 0.5115 | 0.2601 | 0.2347 | 0.113 |
| ks | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | |
Note: mp refers to this student’s estimated probability of mastery of the attribute; ks refers to the student’s estimated pattern of mastery.
According to the student’s attribute mastery, the knowledge state of student No. 107 corresponds to ks5 in the learning path map. To advance to the highest level, there are three possible paths: ks5→ks7→ks4→ks11, ks5→ks1→ks11 and ks5→ks8→ks11. The first path learns attribute T2 first, the second learns attribute A5 first, and the third learns attribute T1 first. To decide which path is best for student No. 107, the center distances from ks5 to ks7, ks1 and ks8 are calculated, which are 0.885761, 0.916453 and 0.977684, respectively. Choosing the smallest distance, the next knowledge state for this student is ks7, which means that the recommended learning path for student No. 107 is ks5→ks7→ks4→ks11.
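The selection rule used here, picking the candidate next state whose cluster center is closest to the current state, can be written as a few lines of Python. The dictionary names are hypothetical; the distances and paths are those reported in the text, and the sketch does not assume how the center distances themselves were computed.

```python
# Center distances from the current state ks5 to each candidate next state.
center_distance = {"ks7": 0.885761, "ks1": 0.916453, "ks8": 0.977684}

# Complete path that follows from each candidate next state.
candidate_paths = {
    "ks7": ["ks5", "ks7", "ks4", "ks11"],
    "ks1": ["ks5", "ks1", "ks11"],
    "ks8": ["ks5", "ks8", "ks11"],
}

# Choose the next state with the smallest center distance.
next_state = min(center_distance, key=center_distance.get)
recommended_path = candidate_paths[next_state]
```

Note that the rule compares only the first step of each path, so it selects the nearest next state rather than the path with the fewest steps.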
The factors influencing English teaching in colleges and universities were analyzed with an ordered multicategorical logistic regression model, and three factors, namely learner factors, online course factors and environmental factors, were found to have a significant influence on the English teaching mode. A learner state model was then proposed, and an accurate personalized learning path planning framework was designed on the basis of the learner state. The 531 students were clustered into 11 knowledge states using K-means clustering, which resulted in five complete learning paths. Among the individual learning paths, ks4→ks11 had the largest number of students, 121, accounting for the largest proportion (22.79%).
The learning path of an individual student is based on the student’s estimated knowledge state. Student No. 107, whose knowledge state is ks5, has three possible learning paths, all of which lead to the state of full mastery. The calculation of center distances shows that ks5→ks7→ks4→ks11 is the most time-saving path.
