Research on Optimization Strategy of English Teaching Resource Allocation Based on Intelligent Data Analysis
Published Online: Mar 21, 2025
Received: Oct 22, 2024
Accepted: Feb 18, 2025
DOI: https://doi.org/10.2478/amns-2025-0595
© 2025 Shuhui Cui, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
With the rapid development of the digital environment, English teaching faces both the opportunity and the challenge of a major transformation. On the one hand, digital tools such as online courses, multimedia teaching materials, and interactive learning platforms offer more diversified ways of learning, which promote students’ personalized learning and stimulate their interest in English [1-3]. On the other hand, the digital environment promotes the globalization of English teaching and learning: the Internet and distance learning technology connect students with English resources and teachers from all over the world, broadening their horizons and improving their international communication skills [4-6]. However, teaching resources are always limited, and allocating them effectively is a major challenge for English education, because traditional allocation methods cannot make full use of existing English learning resources and are prone to resource bottlenecks [7-8].
Resources is a general term for material, financial, and human assets [9]. Teaching resources are all the available conditions needed to carry out teaching. In the narrow sense, they mainly comprise material conditions such as teaching materials, classrooms, and teaching instruments and equipment, together with human resources such as teachers and teaching administrators [10-11]. In the broad sense, they also cover intangible resources such as teaching schedules, teaching methods, and teaching philosophy, in addition to material and human resources [12-13]. Improving the fit between teaching resource inputs and subject specialties, and ensuring a reasonable distribution of teaching resources, are the intrinsic motivations for optimizing resource allocation. Teaching resource optimization methods can dynamically adjust the allocation of resource blocks; modeling the allocation and scheduling of teaching resources as a nonlinear optimization problem and solving for its global optimum is an effective way to address English teaching resource allocation [14-16].
Literature [17] proposes a management method for optimal allocation of teaching resources based on convolutional neural network (CNN) and Arduino device, which not only identifies different English teaching scenarios by classifying and identifying the English education resource base, but also innovates the interaction mode between students and teaching resources to achieve a reasonable allocation of English teaching resources. Literature [18] constructed a decision tree model of English teaching semantic topic words based on contextual relations and realized semantic conversion and information extraction in English teaching resource base, which realized good resource scheduling performance and adaptive resource allocation performance, and improved the utilization rate of English teaching resources. Literature [19] describes the key role of data analysis technology and teaching resource allocation model in the teaching process, and the data-driven technology and model can accurately identify students’ learning needs and behavioral patterns, and then assist education administrators to optimize the teaching methods and resource allocation scheme, which is conducive to promoting education equity and improving education quality. Literature [20] shows that the growth and development of online educational resources are accompanied by characteristics such as dispersion and disorganization, based on which a SOA-based English teaching resources integration and optimization system is developed, which improves the situation of isolation of online English teaching resources and provides prerequisites for the allocation of teaching resources through the establishment of an open educational resources sharing platform. 
Literature [21] explored a neighborhood regression correction algorithm for allocating English distance teaching resources, combining structural risk minimization with principal component extraction to reduce the model complexity of the neighborhood regression algorithm; experiments showed that the proposed optimization algorithm significantly improves the allocation of English distance teaching resources. Literature [22] uses an association rule algorithm to integrate and optimize English teaching resources and introduces a semi-supervised neural network to improve it, which substantially improves resource allocation efficiency in multimedia network-assisted English teaching.
In this paper, we develop an intelligent grouping strategy that combines a particle swarm genetic algorithm with a Bayesian knowledge tracking model. A knowledge point relationship parameter matrix is added to the existing Bayesian knowledge tracking model to estimate the knowledge state of English learners. In the particle swarm genetic algorithm, each particle undergoes the crossover operation of the genetic algorithm with its individual extreme and with the group extreme, and then undergoes a mutation operation on itself. The performance of the algorithm is further improved by adaptively adjusting the crossover and mutation probabilities and by using segmented real-number encoding. Finally, experiments verify the advantages of the algorithm and the feasibility of allocating English teaching resources with this method.
The BKT model divides the knowledge system that students need to learn into a number of knowledge points and assumes that each knowledge point is in one of two states: mastered or not mastered. Through continued practice, students can move from the non-mastered state to the mastered state, but not in the reverse direction. In other words, a student who has mastered a knowledge point never returns to the non-mastered state, so the model assumes that no forgetting occurs during learning.
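Therefore, to judge students’ mastery of each knowledge skill, the BKT model sets four parameters. In standard BKT notation these are the initial mastery probability $P(L_0)$, the learning (transition) probability $P(T)$, the guess probability $P(G)$, and the slip probability $P(S)$.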
Since the difficulty of different knowledge skills varies, the corresponding individual parameters are different, so each knowledge point needs to train the corresponding four parameters separately. Referring to the probability distribution table, we can get the following formulas:
The probability of a student getting the Nth question right can be interpreted as the sum of the probability of not making a mistake if the knowledge point is mastered, and the probability of guessing correctly if the knowledge point is not mastered, which can be expressed by formula (1):
The probability of a student getting the Nth question wrong can be interpreted as the sum of the probability of making a mistake with mastery of the point and the probability of guessing incorrectly without mastery of the point, which can be expressed in equation (2):
The probability of a student’s mastery of a knowledge point can be interpreted as the sum of the probability of mastery of the knowledge point at the end of answering the Nth question and the probability of moving from non-mastery to mastery through that practice, which can be expressed by equation (3):
The probability that a student will get the next question right can be interpreted as the sum of the probability of answering without error when the knowledge is mastered and the probability of guessing correctly when it is not [23]; it is used to predict the student’s answer state and can be represented by equation (4):
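For reference, the four relations described in words above are commonly written as follows in the standard BKT literature. This is a reconstruction in conventional notation, not necessarily the author's own symbols or numbering: $P(L_N)$ is assumed to denote the probability that the knowledge point is mastered before the $N$th answer, $P(L_N \mid \text{obs}_N)$ the same probability after observing that answer (obtained by Bayes' rule), $P(T)$ the learning probability, $P(G)$ the guess probability, and $P(S)$ the slip probability.

```latex
\begin{align}
P(\text{correct}_N) &= P(L_N)\,\bigl(1 - P(S)\bigr) + \bigl(1 - P(L_N)\bigr)\,P(G) \tag{1}\\
P(\text{wrong}_N)   &= P(L_N)\,P(S) + \bigl(1 - P(L_N)\bigr)\,\bigl(1 - P(G)\bigr) \tag{2}\\
P(L_{N+1})          &= P(L_N \mid \text{obs}_N) + \bigl(1 - P(L_N \mid \text{obs}_N)\bigr)\,P(T) \tag{3}\\
P(\text{correct}_{N+1}) &= P(L_{N+1})\,\bigl(1 - P(S)\bigr) + \bigl(1 - P(L_{N+1})\bigr)\,P(G) \tag{4}
\end{align}
% Posterior used in (3), by Bayes' rule:
%   P(L_N | correct) = P(L_N)(1 - P(S)) / P(correct_N)
%   P(L_N | wrong)   = P(L_N) P(S)       / P(wrong_N)
```

Equations (1) and (2) sum to one for any parameter values, which is a quick consistency check on the reconstruction.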
Based on the above formula, we can derive the structure of Bayesian knowledge tracking model with parameters. Where
In actual prediction, suppose a particular student has answered 5 questions in succession, with 0 and 1 denoting an incorrect and a correct answer, respectively. The answers are assigned to the corresponding performance nodes in the order in which the questions were completed. The expectation maximization (EM) algorithm is then used to estimate the most likely probability that the student has mastered the knowledge skill corresponding to the questions, given this student’s answers, and to predict the next performance.
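The five-answer example above can be sketched in Python as a single forward pass of standard BKT. This is a minimal illustration under stated assumptions, not the paper's implementation: the parameter values are hypothetical and fixed by hand rather than fitted with EM, and the names `BKTParams`, `predict_correct`, `update_mastery`, and `trace` are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    p_init: float   # prior probability that the skill is already mastered
    p_learn: float  # probability of moving from non-mastery to mastery
    p_guess: float  # probability of a correct answer without mastery
    p_slip: float   # probability of a wrong answer despite mastery


def predict_correct(p_mastery: float, params: BKTParams) -> float:
    """Probability of a correct answer given the current mastery estimate."""
    return p_mastery * (1 - params.p_slip) + (1 - p_mastery) * params.p_guess


def update_mastery(p_mastery: float, correct: int, params: BKTParams) -> float:
    """Bayesian update on one observed answer, followed by the learning step."""
    if correct:
        numerator = p_mastery * (1 - params.p_slip)
        evidence = predict_correct(p_mastery, params)
    else:
        numerator = p_mastery * params.p_slip
        evidence = 1 - predict_correct(p_mastery, params)
    posterior = numerator / evidence
    # No forgetting: mastery can only be gained, never lost.
    return posterior + (1 - posterior) * params.p_learn


def trace(answers, params: BKTParams):
    """Return the mastery estimate after each answer and the next-answer prediction."""
    p = params.p_init
    history = []
    for answer in answers:
        p = update_mastery(p, answer, params)
        history.append(p)
    return history, predict_correct(p, params)


# Hypothetical parameters; in practice they would be fitted per knowledge point.
params = BKTParams(p_init=0.3, p_learn=0.2, p_guess=0.2, p_slip=0.1)
history, p_next = trace([0, 1, 1, 0, 1], params)
print([round(p, 3) for p in history], round(p_next, 3))
```

A correct answer raises the mastery estimate more than an incorrect one from the same starting point, and the prediction for the next answer always lies between the guess probability and one minus the slip probability.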
Based on the above principles, the specific flow of the Bayesian knowledge tracking model is shown in Figure 1.

Figure 1. Flowchart of the Bayesian knowledge tracking algorithm
The steps of the algorithm are summarized as follows:
Step1 Set the initial values of the corresponding parameters according to the number of knowledge skills.
Step2 Input the questions answered by different learners for different knowledge skills.
Step3 Use the gradient descent algorithm to train and update the parameters according to the students’ answers to the questions.
Step4 Use the expectation maximization algorithm with the updated parameters to calculate and judge the students’ knowledge mastery level.
Step5 Predict the students’ answers to the next question.
Step6 Repeat Step3 through Step5 until the threshold is reached.
Therefore, assuming
The CS-BKT model has a core assumption that as students deepen their understanding of a skill A, they will likewise deepen their understanding of skill B to some degree, achieving a touchstone effect.
Therefore, the CS-BKT model introduces a new parameter matrix for the interactions between skills:
Therefore, we need to take into account changes in students’ mastery of skill
The CS-BKT model structure is constantly changing with the answers to skill
Therefore, assuming that there are
Where
According to the above principle, the specific flow of CS-BKT model is shown in Figure 2:

Figure 2. Flowchart of the CS-BKT model
Treating English teaching resource allocation as a combinatorial optimization problem with multiple constraints, and building on the Bayesian knowledge tracking model described above, this paper takes the classic test paper composition problem (the “group paper” or “grouping” problem) as its starting point and designs a paper composition strategy based on a particle swarm genetic algorithm, so as to arrange a scientific and reasonable teaching plan for English classroom teaching and improve both the utilization of teaching resources and the teaching effect.
The group paper problem is essentially a function optimization problem under multiple constraints. It is a typical constraint satisfaction problem (CSP) and is NP-hard: under given preconditions, a reasonable combination of test questions must be selected from the test bank so that a standard test paper is generated quickly and efficiently. Solving such problems requires multiple evaluation indicators and an objective function that reflects the quality of the test paper. If the number of questions in a set of test papers is
where
In this paper,
The quantitative evaluation function of the following 4 indicators is given according to the definition of the 4 evaluation indicators of the test paper:
Where
The four evaluation functions are positively normalized: the larger the function value, the closer the paper is to the user’s expectation and the better the combination of questions performs on that indicator. The analytic hierarchy process (AHP) is a multi-attribute decision-making method that combines qualitative and quantitative analysis and is widely used to determine indicator weights in evaluation models. In this paper, a three-scale AHP is adopted to assign weights to the indicators, which transforms the multi-objective optimization problem into a single-objective combinatorial optimization problem. The objective function is shown in equation (18):
where
For the intelligent paper organizing algorithm, users not only want to be able to generate test papers that meet the requirements, but also want the algorithm to run as quickly as possible.
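The single-objective form described above, in which four normalized indicator values are combined with AHP weights, can be sketched as a weighted sum. This is only an illustration: the weight values and indicator values below are hypothetical placeholders, not the ones derived in the paper, and `paper_fitness` is a name invented for the example.

```python
def paper_fitness(indicators, weights):
    """Weighted sum of normalized indicator values, each assumed to lie in [0, 1]."""
    if len(indicators) != len(weights):
        raise ValueError("one weight is required per indicator")
    if any(not 0.0 <= f <= 1.0 for f in indicators):
        raise ValueError("indicator values must be normalized to [0, 1]")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * f for w, f in zip(weights, indicators))


# Hypothetical weights (as an AHP step might produce) and indicator values.
weights = [0.4, 0.3, 0.2, 0.1]
indicators = [0.9, 0.8, 0.7, 0.6]
score = paper_fitness(indicators, weights)
print(round(score, 2))
```

Because every indicator is positively normalized and the weights sum to one, the fitness also lies in [0, 1], which makes values from different runs directly comparable.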
Combining the strengths and weaknesses of the particle swarm algorithm and the genetic algorithm, this paper proposes an intelligent paper organizing strategy based on a particle swarm genetic algorithm, which incorporates genetic operations into the particle swarm algorithm. The particles do not update themselves through velocity and position; instead, they update through crossover operations and their own mutation operations. Each particle performs crossover with the individual extreme and the population extreme, which are obtained at each iteration by comparing fitness values computed from the objective function. Combining the two algorithms improves the convergence efficiency of the population.
In this paper, a segmented real-number coding mechanism is used to encode the population particles. Test questions are categorized by question type and each type is assigned its own coding segment, so all questions encoded on the same segment are of the same type [25].
In the population initialization stage, the question bank is first screened according to the desired knowledge points to form a smaller question bank from which test questions are drawn. In addition, the same question can easily be selected repeatedly during drawing, which makes the drawn paper invalid. To solve this problem, this paper adopts a non-repeating random array generation algorithm.
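The initialization described above (screening by knowledge point, one coding segment per question type, no repeated questions) might look like the following sketch. The question record layout (`id`, `type`, `points`) and the function name `init_particle` are assumptions made for illustration, and `random.sample` stands in for the paper's non-repeating random array generation algorithm, whose details are not given here.

```python
import random


def init_particle(bank, wanted_points, counts, rng):
    """Build one particle: a dict mapping question type to a segment of question ids.

    bank          -- list of dicts with keys "id", "type", "points" (a set)
    wanted_points -- set of knowledge points the paper should cover
    counts        -- dict mapping question type to the number of questions needed
    """
    # Screen the bank first so that only relevant questions can be drawn.
    small_bank = [q for q in bank if q["points"] & wanted_points]
    particle = {}
    for qtype, n in counts.items():
        candidates = [q["id"] for q in small_bank if q["type"] == qtype]
        if len(candidates) < n:
            raise ValueError(f"not enough '{qtype}' questions in the screened bank")
        # Sampling without replacement guarantees no repeated question ids.
        particle[qtype] = rng.sample(candidates, n)
    return particle


# Toy bank: even ids are multiple-choice, odd ids are fill-in-the-blank.
bank = [
    {"id": i,
     "type": "choice" if i % 2 == 0 else "blank",
     "points": {"grammar"} if i < 16 else {"listening"}}
    for i in range(20)
]
particle = init_particle(bank, {"grammar"}, {"choice": 4, "blank": 3}, random.Random(0))
print(particle)
```

Concatenating the segments in a fixed type order gives the flat segmented code for one test paper.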
During generational updates, the particle with the highest fitness value in each generation is the population extreme. The individual extreme is the highest-fitness state that a particle has reached through its successive updates.
Step1 Set the length of a certain type of coding segment in an individual to be
Step2 Judge whether the
Step3 If
This method can make all the genes of the generated new individual unique, i.e., there is no duplication of questions in the same paper. If the fitness value of the particle after the application of the crossover operation is higher than the previous fitness value, the particle is automatically updated, otherwise it is not adjusted.
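A duplicate-free segment crossover with the accept-only-if-better rule can be sketched as follows. The exact steps of the paper's crossover are not recoverable from the text, so the single cut point and the fill order used here are assumptions chosen only to illustrate the two stated properties: no repeated question within a segment, and no update unless fitness improves. The names `crossover_segment` and `accept_if_better` are hypothetical.

```python
import random


def crossover_segment(own, partner, rng):
    """Cross one coding segment of a particle with the same segment of an extreme.

    Genes before a random cut point come from the particle itself; the rest are
    taken from the partner where possible, skipping ids already present, and
    from the particle's own tail otherwise. The result has no duplicates.
    """
    n = len(own)
    if n < 2:
        return list(own)
    cut = rng.randrange(1, n)
    child = list(own[:cut])
    used = set(child)
    for gene in list(partner[cut:]) + list(own[cut:]):
        if len(child) == n:
            break
        if gene not in used:
            child.append(gene)
            used.add(gene)
    return child


def accept_if_better(current, candidate, fitness):
    """Keep the candidate only when it strictly improves the fitness value."""
    return candidate if fitness(candidate) > fitness(current) else current


rng = random.Random(1)
own = [1, 2, 3, 4, 5]        # current particle segment (question ids)
best = [3, 4, 6, 7, 1]       # same segment of the individual or population extreme
child = crossover_segment(own, best, rng)
print(child)
```

Since the particle's own segment is duplicate-free, its tail always supplies enough unused ids to fill the child to full length.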
The mutation operation focuses on the local search ability of the particle swarm. In this paper, we adopt a segmented mutation model, that is, mutation is carried out within the coding segment corresponding to each question type. The specific process is as follows:
Step1 set the length of a question type coding segment in an individual as
Step2 Obtain the gene (i.e., the question number) at the mutation position determined in Step1, look up the information for that question, and build the set of questions that have the same question type and score and whose knowledge points are in an inclusion relationship with those of the question.
Step3 Randomly select a question from this set and replace the question number identified in Step1 with the number of the selected question. Take single choice questions as an example.
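The three mutation steps can be sketched as below. The record layout (`type`, `score`, `points`) and the name `mutate_segment` are assumptions for illustration, and the "inclusion relationship" between knowledge points is interpreted here as one set being a subset of the other, which is one plausible reading rather than a confirmed definition.

```python
import random


def mutate_segment(segment, bank_by_id, rng):
    """Replace one question in a segment with a compatible question from the bank."""
    # Step1: choose the mutation position inside this question-type segment.
    pos = rng.randrange(len(segment))
    old = bank_by_id[segment[pos]]
    in_segment = set(segment)
    # Step2: same type, same score, knowledge points in a subset/superset relation,
    # and not already present in the segment.
    candidates = [
        qid for qid, q in bank_by_id.items()
        if qid not in in_segment
        and q["type"] == old["type"]
        and q["score"] == old["score"]
        and (q["points"] <= old["points"] or old["points"] <= q["points"])
    ]
    if not candidates:
        return list(segment)  # nothing compatible: leave the segment unchanged
    # Step3: swap in a randomly chosen compatible question.
    mutated = list(segment)
    mutated[pos] = rng.choice(candidates)
    return mutated


bank_by_id = {
    1: {"type": "choice", "score": 2, "points": {"tense"}},
    2: {"type": "choice", "score": 2, "points": {"tense", "voice"}},
    3: {"type": "choice", "score": 2, "points": {"vocab"}},
    4: {"type": "choice", "score": 2, "points": {"tense"}},
    5: {"type": "blank", "score": 2, "points": {"tense"}},
}
segment = [1, 3]
mutated = mutate_segment(segment, bank_by_id, random.Random(0))
print(mutated)
```

Restricting replacements to the same type and score keeps the paper's structure and total score unchanged, so mutation only explores the remaining indicators.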
Improvement idea. The logistic function and a similarity coefficient are combined to adaptively balance the two probabilities (crossover and mutation) used in the genetic operations.
Logistic function. The logistic function is widely used in information science, biology, and other fields because it describes certain bounded growth phenomena accurately. It can be expressed in different forms; a common one is shown in equation (19):
The logistic function increases monotonically with its variable over the interval and converges at both ends. Based on this property, incorporating the function into an adaptive probability strategy yields an algorithm that meets the need for improvement.
Similarity coefficient. The similarity coefficient reflects how similar the individuals in a population are to one another. In this paper, with the fitness value as the variable, the expectation EX is used to obtain the population mean and the variance DX is used to measure the dispersion around that mean; together they are used to calculate the similarity coefficient, as shown in Eqs. (20) and (21):
Where
Theoretically, the particles in the population improve as they evolve: particle fitness gradually increases and the particles become more and more similar. That is, EX gradually increases while DX gradually decreases. Accordingly, the formula for
Crossover probability and mutation probability adjustment formula. Combined with the definition of the similarity coefficient, the improved adaptive crossover probability
where
Setting As the similarity coefficient
The flow of the algorithm is shown in Figure 3. The algorithm ends when the number of iterations reaches the specified value.
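The adaptive adjustment idea (a logistic function driven by a similarity coefficient built from EX and DX) can be illustrated with the sketch below. The paper's actual formulas are not recoverable from the text, so everything specific here is an assumption: the similarity definition `EX / (EX + DX)`, the direction of adjustment (less crossover and more mutation as the population converges), the probability ranges, and the `steepness` parameter are all hypothetical choices, and fitness values are assumed to lie in [0, 1].

```python
import math
from statistics import fmean, pvariance


def logistic(x):
    """Standard logistic function, bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def similarity(fitnesses):
    """Hypothetical similarity coefficient: rises as the mean grows and the variance shrinks."""
    ex = fmean(fitnesses)                 # EX: population mean fitness
    dx = pvariance(fitnesses, mu=ex)      # DX: population variance of fitness
    return ex / (ex + dx) if ex + dx > 0 else 0.0


def adaptive_probs(fitnesses, pc_range=(0.5, 0.9), pm_range=(0.01, 0.1), steepness=10.0):
    """Return (crossover probability, mutation probability) for the current population."""
    pc_min, pc_max = pc_range
    pm_min, pm_max = pm_range
    g = logistic(steepness * (similarity(fitnesses) - 0.5))
    pc = pc_max - (pc_max - pc_min) * g   # converged population: less crossover
    pm = pm_min + (pm_max - pm_min) * g   # converged population: more mutation
    return pc, pm


diverse = adaptive_probs([0.1, 0.5, 0.9])
converged = adaptive_probs([0.7, 0.7, 0.7])
print(diverse, converged)
```

Because the logistic function is bounded, both probabilities stay inside their configured ranges regardless of how extreme the population statistics become.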

Figure 3. Algorithm flow chart
This section compares the overall performance of the Student-DKT model with that of the model in this paper in terms of student achievement prediction. Three students were randomly selected from different datasets for analysis. Figures 4 to 6 show the three students’ mastery of each knowledge concept over 50 practice steps.

Figure 4. ASSISTments0910

Figure 5. ASSISTments2015

Figure 6. Statics2011
The results show a significant difference between the student model and the distillation model in predicting students’ performance at specific practice steps. The figures show that in most cases the predictions of this paper’s algorithm are closer to the actual answer performance than those of Student-DKT. By extracting soft target knowledge to guide the student model, the model in this paper achieves good prediction results even at the initial stage. Based on the predicted performance of Student-DKT and this paper’s model, the following conclusions can be drawn:
In addition to predicting future performance, the algorithmic framework of this paper can automatically capture learners’ tendencies toward specific knowledge concepts. As Figure 4 shows, learners’ mastery of knowledge concepts in the ASSISTments0910 dataset ranges from 0.15 to 0.95 and fluctuates widely, indicating that the learners’ learning state is extremely unstable. As Figure 5 shows, the algorithm in this paper improves on Student-DKT less on ASSISTments2015 than on the other two datasets, which confirms that the distillation mechanism brings no significant performance gain on this dataset. The learners’ actual answer results also fluctuate greatly, between 0.37 and 0.94, whereas the model’s prediction curve for the knowledge state is relatively smooth. This indicates that on ASSISTments2015, where the size of the student set and the size of the concept set are imbalanced, the predictive power of knowledge tracking is significantly lower than on the other two datasets. As Figure 6 shows, the model’s estimate of learner competence gradually improves as the learner answers most of the remaining exercises correctly, indicating that the learner maintains a good learning state.
After comprehensive analysis, the designed framework not only allows accurate cognitive prediction of students, but also has interpretability, which makes the method proposed in this paper more attractive for practical application. For example, teachers can explicitly analyze learners’ knowledge acquisition ability and provide them with richer learning guidance.
To test the effectiveness of this paper’s intelligent grouping strategy for English exams based on the particle swarm genetic algorithm, the question paper of the 2019 English Grade 4 exam in Liaoning Province is taken as an example. The relevant paper-grouping organizations in the province use a computerized automatic grouping system for intelligent grouping, and this paper’s strategy is introduced into that system. The program is written in Java; the experimental environment is Windows XP with an 851 MHz processor and 64 MB of memory. The mean knowledge point value, mean difficulty, and mean differentiation of all the questions in the province’s 2019 English Grade 4 examination question bank are set to 0.64, 0.53, and 0.65, respectively.
The total value of the test paper score is 100 points, the fill-in-the-blank score is 12 points, the multiple-choice score is 32 points, the terminology score is 12 points, the short-answer score is 22 points, and the synthesis score is 22 points. The upper and lower limits of the indicators for the three time periods set for the expected completion of the test paper are 30-60min, 61-90min, 91-120min, respectively, and the effect of the grouping of the three time periods is shown in Fig. 7, Fig. 8, and Fig. 9.

Figure 7. Grouping effect for the 30-60 min test paper

Figure 8. Grouping effect for the 61-90 min test paper

Figure 9. Grouping effect for the 91-120 min test paper
The grouping accuracy is the probability that the computer automatic grouping system obtains a feasible solution over 40 grouping runs, measured before and after this paper’s strategy is applied. The optimal value, worst value, and mean value are, respectively, the best, worst, and mean quality of the feasible solutions obtained in those 40 runs. As Fig. 7, Fig. 8, and Fig. 9 show, before the strategy is applied the probability of obtaining a feasible solution for the three time periods is 0.88, 0.94, and 0.90 in turn; after the strategy is applied the probability is 0.98, and the optimal and mean values of the feasible solutions over the 40 runs are larger than before.
When this paper’s particle swarm genetic algorithm based strategy is used to find the optimal solution for intelligent grouping of English exams, the number of individuals in the population and in the harmonic memory bank is set to 100, and the number of iterations is 350. The maximum fitness value before and after the use of this paper’s strategy is shown in Fig. 10. Fig. 10 shows that the maximum fitness is 0.71 with this paper’s strategy and 0.68 without it, so the questions selected from the question bank are of better quality when this paper’s strategy is used to find the optimal solution.

Figure 10. Maximum population fitness
The diversity of the test paper population can indicate the level of variation among the test questions in the question bank. If the difference between the questions is large, the population diversity is high, otherwise, the population diversity is low and the difference between the questions is small.
For the particle swarm genetic algorithm, population diversity has a direct impact on search performance: the greater the diversity, the better the overall search performance of the algorithm and the more of the unexplored search range it can reach.
In the context of this paper, diversity can be understood as the ability to obtain new question types. However, if population diversity always remains large, the global optimal solution becomes harder to obtain. Therefore, the population needs good diversity at the beginning of the search, while toward the end the population must move closer to the optimal solution and its diversity must gradually shrink so that an accurate global optimum is obtained. Population diversity was tested for the computerized automatic grouping system before and after applying this paper’s strategy, with the time required for the test set to 30-60 min and 91-120 min; the results are shown in Fig. 11 and Fig. 12. Fig. 11 and Fig. 12 show that, when the English test has a time constraint, population diversity under this paper’s strategy is high in the early stage of grouping and then falls rapidly in the late stage as the number of iterations increases, so that the global optimal solution is obtained quickly. Compared with the system before the strategy was applied, the performance of automatic computerized grouping has improved.

Figure 11. Population diversity for 30-60 min test papers

Figure 12. Population diversity for 91-120 min test papers
To further demonstrate the effectiveness of this paper’s strategy, it is compared with two mainstream strategies, a multi-attribute association strategy and an intelligent genetic strategy, in terms of grouping accuracy and grouping time. The results are shown in Fig. 13 and Fig. 14, respectively. At 50, 100, 150, 200, and 250 iterations alike, the grouping accuracy of this paper’s strategy is significantly higher than that of the other two strategies, reaching up to 98%. The grouping time with this paper’s strategy stays within 2 s, which is significantly lower than that of the other two methods. These results support the validity of the strategy in this paper.

Figure 13. Comparison of grouping accuracy for different strategies

Figure 14. Comparison of grouping time for different strategies
To optimize the allocation of teaching resources in the English classroom and achieve more scientific and reasonable curriculum planning, this paper designs an intelligent grouping strategy based on a particle swarm genetic algorithm combined with a Bayesian knowledge tracking model. After the performance of the knowledge tracking module is verified on the datasets, rough simulation experiments on grouping papers are conducted. In most cases, the predictions of this paper’s algorithm are closer to the actual answer performance than those of Student-DKT. The model guides the student model by extracting soft target knowledge, which leads to good prediction results in the initial stage. Before this paper’s strategy is applied, the probability that the intelligent grouping system obtains a feasible solution for the three time periods is 0.88, 0.94, and 0.90 in turn; after it is applied, the probability is 0.98. The optimal and mean values of the feasible solutions over 40 runs of automatic computerized grouping are also larger than before, which realizes efficient allocation of resources for English language teaching.
