Research on Optimization Strategy of English Teaching Resource Allocation Based on Intelligent Data Analysis

With the rapid development of the digital environment, English teaching faces both the opportunity and the challenge of a major transformation. On the one hand, digital tools such as online courses, multimedia teaching materials, and interactive learning platforms provide more diversified ways of learning, which promote students’ personalized learning and stimulate their interest in English [1-3]. On the other hand, the digital environment promotes the globalization of English teaching and learning: with the help of the Internet and distance learning technology, students are connected with English resources and teachers from all over the world, which broadens their horizons and improves their international communication skills [4-6]. However, teaching resources are always limited, and allocating them effectively is a major challenge for English education, because traditional allocation methods cannot make full use of existing English learning resources and are prone to resource bottlenecks [7-8].

Resources is a general term for physical, financial, and human assets [9]. Teaching resources are all the available conditions needed to carry out teaching. In the narrow sense, they mainly include material conditions such as teaching materials, classrooms, and teaching instruments and equipment, as well as human resources such as teachers and teaching administrators [10-11]. In the broad sense, they cover not only material and human resources but also virtual resources such as the teaching schedule, teaching methods, and teaching philosophy [12-13]. Improving the fit between teaching resource inputs and subject specialties and ensuring a reasonable distribution of teaching resources are the intrinsic motivation for optimizing resource allocation. Teaching resource optimization methods can dynamically adjust the allocation of resource blocks; modeling the scheduling of teaching resources as a nonlinear optimization problem and solving for its global optimum is an effective way to address English teaching resource allocation [14-16].

Literature [17] proposes a management method for optimal allocation of teaching resources based on convolutional neural network (CNN) and Arduino device, which not only identifies different English teaching scenarios by classifying and identifying the English education resource base, but also innovates the interaction mode between students and teaching resources to achieve a reasonable allocation of English teaching resources. Literature [18] constructed a decision tree model of English teaching semantic topic words based on contextual relations and realized semantic conversion and information extraction in English teaching resource base, which realized good resource scheduling performance and adaptive resource allocation performance, and improved the utilization rate of English teaching resources. Literature [19] describes the key role of data analysis technology and teaching resource allocation model in the teaching process, and the data-driven technology and model can accurately identify students’ learning needs and behavioral patterns, and then assist education administrators to optimize the teaching methods and resource allocation scheme, which is conducive to promoting education equity and improving education quality. Literature [20] shows that the growth and development of online educational resources are accompanied by characteristics such as dispersion and disorganization, based on which a SOA-based English teaching resources integration and optimization system is developed, which improves the situation of isolation of online English teaching resources and provides prerequisites for the allocation of teaching resources through the establishment of an open educational resources sharing platform. 
Literature [21] explored the domain regression correction algorithm in the process of English distance teaching resource allocation and combined structural risk minimization with principal component extraction to reduce the model complexity of the neighborhood regression algorithm; experiments showed that the proposed optimization algorithm significantly improves the allocation of English distance teaching resources. Literature [22] uses an association rule algorithm to integrate and optimize English teaching resources and introduces a semi-supervised neural network to improve it, which substantially improves the resource allocation efficiency in multimedia network-assisted English teaching.

In this paper, we develop an intelligent grouping strategy that combines a particle swarm genetic algorithm with a Bayesian knowledge tracking model. Building on the existing Bayesian knowledge tracking model, a knowledge point relationship parameter matrix is added to estimate the knowledge state of English learners. In the particle swarm genetic algorithm, each particle undergoes the crossover operation of the genetic algorithm with the individual extreme and the group extreme, and then applies the mutation operation to itself. The performance of the algorithm is further improved by adaptively adjusting the crossover and mutation probabilities and by using segmented real number encoding. Finally, experiments verify the advantages of the algorithm and the feasibility of allocating English teaching resources with this method.

2

A model for tracking students’ knowledge levels based on the CS-BKT model

2.1

Fundamentals of Bayesian Knowledge Tracking Modeling

The BKT model divides the knowledge system that students need to learn into a number of knowledge points and assumes that each point is in one of two states, mastered or not mastered. Through continued practice, students can move from the unmastered state to the mastered state, but the transition never occurs in the reverse direction. In other words, a student who has mastered a knowledge point never returns to not mastering it, so the model assumes there is no forgetting during learning.

Therefore, to judge students’ mastery of a skill, the BKT model sets four parameters: P(L₀), P(T), P(G) and P(S). Two of them, P(L₀) and P(T), are learning parameters, which describe the state of the knowledge that the student has learned. P(L₀) indicates the student’s initial level of knowledge: P(L₀) = 0 indicates that the student has no grasp of the required knowledge before answering the question, and P(L₀) = 1 indicates that the student has completely mastered the required knowledge before answering the question. P(T) indicates the probability that the student moves from not knowing to knowing the knowledge point after a period of time. The other two, P(G) and P(S), are performance parameters: P(G) is the probability of answering correctly without mastery, and P(S) is the probability of answering incorrectly despite mastery.

Since the difficulty of different knowledge skills varies, the corresponding parameters differ, so the four parameters need to be trained separately for each knowledge point. Referring to the probability distribution table, we can get the following formulas:

1)

The probability of a student getting the n-th question right can be interpreted as the sum of the probability of not making a mistake if the knowledge point is mastered and the probability of guessing correctly if the knowledge point is not mastered, which can be expressed by formula (1): (1) $P(\text{Correct}_{n}) = P(L_{n})(1 - P(S)) + (1 - P(L_{n})) P(G)$

2)

The probability of a student getting the n-th question wrong can be interpreted as the sum of the probability of making a mistake with mastery of the point and the probability of guessing incorrectly without mastery of the point, which can be expressed by equation (2): (2) $P(\text{Incorrect}_{n}) = P(L_{n}) P(S) + (1 - P(L_{n}))(1 - P(G))$

3)

The probability of a student’s mastery of a knowledge point can be interpreted as the sum of the probability of mastery at the end of answering the (n − 1)-th question and the probability of moving from the unmastered state to the mastered state. It is used to update the student’s knowledge status and can be expressed by equation (3): (3) $P(L_{n}) = P(L_{n-1} \mid \text{Evidence}_{n-1}) + (1 - P(L_{n-1} \mid \text{Evidence}_{n-1})) P(T)$

4)

The probability that a student will get the next question right can be interpreted as the sum of the probability of answering without error given mastery and the probability of guessing correctly without mastery [23]. It is used to predict the student’s answer to the next question and can be represented by equation (4): (4) $P(C_{n+1}) = P(L)(1 - P(S)) + (1 - P(L)) P(G)$

Based on the above formulas, we can derive the structure of the Bayesian knowledge tracking model with parameters, where kc represents the learning state, o represents the answer outcome, and parameter p(L) is updated continuously as the answers change. U and I represent the unknown and known states of the student’s required knowledge skills, and parameter p(T) is the transition probability from unknown to known. In addition, c in the rounded rectangle represents the student’s correct answer, and parameter p(G) represents the probability of answering correctly in the unknown state. Rounded rectangle i represents the student answering incorrectly, while parameter p(S) is the probability of the student answering incorrectly if the knowledge skill is known.

In actual prediction, suppose a particular student has answered 5 questions in succession, with 0 and 1 denoting incorrect and correct answers, respectively. The answers are assigned to the corresponding performance nodes in the order in which the questions were completed. Finally, the expectation maximization (EM) algorithm is used to estimate the most probable level of mastery of the knowledge skills corresponding to the questions given this student’s answers, and to predict the next performance.

Based on the above principles, the specific flow of the Bayesian knowledge tracking model is shown in Figure 1.

The steps of the algorithm are summarized as follows:

1)

Set the initial values of the corresponding parameters according to the number of knowledge skills

2)

Input questions answered by different learners for different knowledge

3)

Use the gradient descent algorithm to train and update the parameters according to the students’ answers to the questions.

4)

Use the expectation maximization algorithm to calculate and judge the knowledge mastery level of students using the updated parameters.

5)

Predict the students’ answers to the next question.

6)

Repeat steps 3 through 5 until the threshold is reached

Therefore, with n skills, 4n parameters need to be trained. To calculate the mastery probability $p(L_{t+1})_{u}^{k}$ of student u for knowledge k, the probability that the student understands the knowledge is first computed conditional on whether the question was answered correctly or incorrectly. The final mastery level is then the sum of this conditional mastery probability and the probability of conversion in the case of non-mastery. The specific formulas can be expressed as follows:

(5) $p(L_{1})_{u}^{k} = p(L_{0})^{k}$

(6) $p(L_{t+1} \mid \text{obs} = \text{correct})_{u}^{k} = \frac{p(L_{t})_{u}^{k} \cdot (1 - p(S)^{k})}{p(L_{t})_{u}^{k} \cdot (1 - p(S)^{k}) + (1 - p(L_{t})_{u}^{k}) \cdot p(G)^{k}}$

(7) $p(L_{t+1} \mid \text{obs} = \text{wrong})_{u}^{k} = \frac{p(L_{t})_{u}^{k} \cdot p(S)^{k}}{p(L_{t})_{u}^{k} \cdot p(S)^{k} + (1 - p(L_{t})_{u}^{k}) \cdot (1 - p(G)^{k})}$

(8) $p(L_{t+1})_{u}^{k} = p(L_{t+1} \mid \text{obs})_{u}^{k} + (1 - p(L_{t+1} \mid \text{obs})_{u}^{k}) \cdot p(T)^{k}$

(9) $p(C_{t+1})_{u}^{k} = p(L_{t})_{u}^{k} \cdot (1 - p(S)^{k}) + (1 - p(L_{t})_{u}^{k}) \cdot p(G)^{k}$
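As a hedged illustration, equations (5) to (9) can be sketched in Python as follows. The class name `BKTParams`, the function names, and any parameter values passed to them are assumptions introduced here for illustration and are not taken from the paper; parameter fitting is not shown.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    """Per-skill BKT parameters (any concrete values are illustrative)."""
    p_l0: float  # P(L0): prior probability that the skill is mastered
    p_t: float   # P(T): probability of moving from unmastered to mastered
    p_g: float   # P(G): probability of guessing correctly when unmastered
    p_s: float   # P(S): probability of a wrong answer despite mastery


def predict_correct(p_l: float, prm: BKTParams) -> float:
    """Eq. (9): probability that the next answer is correct."""
    return p_l * (1 - prm.p_s) + (1 - p_l) * prm.p_g


def update_mastery(p_l: float, correct: bool, prm: BKTParams) -> float:
    """Eqs. (6)-(8): condition on the observation, then apply the transition."""
    if correct:
        num = p_l * (1 - prm.p_s)
        den = num + (1 - p_l) * prm.p_g
    else:
        num = p_l * prm.p_s
        den = num + (1 - p_l) * (1 - prm.p_g)
    posterior = num / den
    return posterior + (1 - posterior) * prm.p_t


def trace(answers, prm: BKTParams):
    """Run the update over a 0/1 answer sequence, starting from Eq. (5)."""
    p_l = prm.p_l0
    history = []
    for a in answers:
        # Record the prediction made before seeing the answer, then update.
        history.append((predict_correct(p_l, prm), p_l))
        p_l = update_mastery(p_l, bool(a), prm)
    return p_l, history
```

For a student who answers five questions, a call such as `trace([1, 0, 1, 1, 1], BKTParams(0.3, 0.1, 0.2, 0.1))` returns the final mastery estimate together with the per-step predictions. A correct answer raises the estimate and a wrong answer lowers it, but the transition term in equation (8) keeps it from reaching zero.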

2.2

Fundamentals of the CS-BKT knowledge tracking model

The CS-BKT model has a core assumption that as students deepen their understanding of a skill A, they will likewise deepen their understanding of skill B to some degree, achieving a touchstone effect.

Therefore, the CS-BKT model introduces a new parameter matrix for the interactions between skills: R_ij = the effect on skill j while the student learns skill i.

Therefore, we need to take into account changes in students’ mastery of skill k when calculating their mastery of other skills.

The CS-BKT model structure changes continually with the answers to skill k, and accordingly the student’s level of knowledge p(L) about skill k changes as well. This change simultaneously affects the mastery level of the other skills, with an impact parameter of R_k[i].

Therefore, assuming that there are n skills in total, n² more parameters need to be trained than in the standard BKT model. The final level of mastery of a skill in the CS-BKT model is the sum of the level of understanding as calculated in the standard BKT model and the influence of other skills on this skill [24]. It can be expressed by the formulas:

(10) $\hat{p}(L_{t+1})_{u}^{k} = p(L_{t+1} \mid \text{obs})_{u}^{k} + (1 - p(L_{t+1} \mid \text{obs})_{u}^{k}) \cdot p(T)^{k}$

(11) $\Delta p(L_{t+1})_{u}^{k} = \hat{p}(L_{t+1})_{u}^{k} - p(L_{t})_{u}^{k}$

(12) $p(L_{t+1})_{u} = p(L_{t})_{u} + R_{k} \cdot \Delta p(L_{t+1})_{u}^{k}$

Where $p(L_{t+1})_{u}$ is the mastery level of student u for all skills. That is, the update no longer affects only the student’s level for one skill; the mastery status of all skills is updated, because the skill being practiced has an impact on all other skills.
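A minimal sketch of the update in equations (10) to (12) is given below, assuming that the relationship matrix is stored as a nested list with `R[k][j]` holding the effect on skill j of learning skill k. The function name `cs_bkt_update`, the list-based layout, and the clipping of results to [0, 1] are assumptions added for illustration; the paper does not specify them.

```python
def cs_bkt_update(p_l, k, correct, p_t, p_g, p_s, R):
    """Update the whole mastery vector after one answer on skill k.

    p_l           : list of current mastery probabilities, one per skill
    p_t, p_g, p_s : per-skill lists of P(T), P(G), P(S)
    R             : R[k][j] is the assumed effect on skill j of learning skill k
    """
    prior = p_l[k]
    # Standard BKT conditioning on the observed answer for skill k.
    if correct:
        num = prior * (1 - p_s[k])
        den = num + (1 - prior) * p_g[k]
    else:
        num = prior * p_s[k]
        den = num + (1 - prior) * (1 - p_g[k])
    posterior = num / den
    p_hat = posterior + (1 - posterior) * p_t[k]   # Eq. (10)
    delta = p_hat - prior                          # Eq. (11)
    # Eq. (12); clipping to [0, 1] is an added safeguard, not from the paper.
    return [min(1.0, max(0.0, p + R[k][j] * delta)) for j, p in enumerate(p_l)]
```

With `R[k][k] = 1`, the practiced skill receives exactly the standard BKT update, while every other skill j moves by the fraction `R[k][j]` of the same change.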

According to the above principle, the specific flow of CS-BKT model is shown in Figure 2:

3

Allocation of teaching resources based on the group paper strategy

Treating English teaching resource allocation as a combinatorial optimization problem with multiple constraints, and building on the Bayesian knowledge tracking model described above, this paper takes the classic test paper grouping problem as the starting point and designs a paper grouping strategy based on the particle swarm genetic algorithm, so as to arrange a scientific and reasonable teaching plan for English classroom teaching and improve both the utilization rate of teaching resources and the teaching effect.

3.1

Mathematical modeling of the grouping problem

The group paper problem is essentially a function optimization problem under multiple constraints. It is a typical constraint satisfaction problem (CSP) and is also NP-hard: under given preconditions, a reasonable combination of test questions must be selected from the test bank so that a set of standard test papers can be generated quickly and efficiently. Solving such problems requires setting multiple evaluation indicators and an objective function that reflects the quality of the test paper. If the number of questions in a set of test papers is m and each question has n indicators, then this set of test papers can be regarded as an m × n matrix: (13) $T = \begin{bmatrix} t_{11} & t_{12} & \dots & t_{1n} \\ t_{21} & t_{22} & \dots & t_{2n} \\ \dots & \dots & \dots & \dots \\ t_{m1} & t_{m2} & \dots & t_{mn} \end{bmatrix}$

where t_ij represents the j-th attribute of the i-th question of this set of question papers.

In this paper, n is taken as 4, corresponding to the four evaluation indexes of the mathematical model of the set of papers established here: difficulty coefficient, answer time, differentiation, and knowledge point coverage, respectively.

According to the definitions of the 4 evaluation indicators of the test paper, the following quantitative evaluation functions are given: (14) $f_{1} = 1 - \frac{D - D^{*}}{D^{*}}$ (15) $f_{2} = 1 - \frac{H - H^{*}}{H^{*}}$ (16) $f_{3} = 1 - \frac{Q - Q^{*}}{Q^{*}}$ (17) $f_{4} = E$

Where $D^{*}$, $H^{*}$, $Q^{*}$ are the difficulty coefficient, answer time and differentiation set by the user, respectively; D, H, Q are the difficulty coefficient, answer time and differentiation achieved by the generated set of papers; and E is its knowledge point coverage.

The four evaluation functions are positively normalized: the larger the function value, the closer it is to the user’s expectation, and the better the combination of questions in the set of papers performs under that index. The analytic hierarchy process (AHP) is a multi-attribute decision-making method that combines qualitative and quantitative analysis and is widely used to determine the weights of indicators in evaluation models. In this paper, a three-scale AHP is adopted to assign weights to the indicators, which transforms the multi-objective optimization problem into a single-objective combinatorial optimization problem. The objective function is shown in equation (18): (18) $F = w_{1} \times f_{1} + w_{2} \times f_{2} + w_{3} \times f_{3} + w_{4} \times f_{4}$

where $w_{i}$ is the objective weight of the i-th indicator obtained through the improved AHP method.
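The evaluation functions (14) to (17) and the weighted objective (18) can be sketched as follows. This is only an illustration: the function names are hypothetical, the weights are passed in rather than derived through AHP, and the relative deviation is implemented in the signed form in which the equations are printed (the text does not say whether an absolute value is intended).

```python
def indicator_score(achieved: float, target: float) -> float:
    """Eqs. (14)-(16): 1 minus the relative deviation from the user target."""
    return 1 - (achieved - target) / target


def paper_fitness(D, H, Q, E, D_star, H_star, Q_star, weights):
    """Eq. (18): weighted sum of the four indicator scores.

    D, H, Q   : difficulty, answer time, differentiation of the generated paper
    E         : knowledge point coverage of the generated paper, Eq. (17)
    *_star    : the corresponding user targets
    weights   : four indicator weights (hypothetical values, not AHP output)
    """
    f = [
        indicator_score(D, D_star),
        indicator_score(H, H_star),
        indicator_score(Q, Q_star),
        E,
    ]
    return sum(w * fi for w, fi in zip(weights, f))
```

For example, `paper_fitness(0.55, 100, 0.66, 0.8, 0.5, 100, 0.6, [0.4, 0.2, 0.2, 0.2])` gives indicator scores of 0.9, 1.0, 0.9 and 0.8 and therefore a fitness of about 0.90.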

3.2

Intelligent grouping strategy based on particle swarm genetic algorithm

3.2.1

The main idea of the algorithm

For the intelligent paper organizing algorithm, users not only want to be able to generate test papers that meet the requirements, but also want the algorithm to run as quickly as possible.

Weighing the advantages and shortcomings of the particle swarm algorithm and the genetic algorithm, this paper proposes an intelligent paper organizing strategy based on a particle swarm genetic algorithm, which incorporates genetic operations into the particle swarm algorithm. The particles in the swarm do not update themselves through velocity and position; instead, they update themselves through crossover operations and their own mutation operations. Each particle performs crossover with the individual extreme and the population extreme, which are obtained by comparing fitness values under the objective function at each iteration. Combining the two algorithms improves the convergence efficiency of the population.

3.2.2

Coding scheme and initialization of populations

In this paper, a segmented real number coding mechanism is used to encode the population particles: test questions are categorized by question type and grouped into segments, so that all test questions encoded on the same segment are of the same type [25].

In the initialization population stage, when drawing test questions from the question bank, the test questions are first screened from the question bank according to the desired knowledge points to constitute a small question bank. In addition, it is easy to repeatedly select invalid test questions during the drawing process. To solve this problem, this paper adopts a non-repeating random array generation algorithm as a solution.
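The segmented encoding and the non-repeating draw can be illustrated with the sketch below. The paper does not give its non-repeating random array generation algorithm, so this example swaps in Python’s `random.sample`, which draws without replacement. The function names and the dictionary layout of the question bank are assumptions, and question IDs are assumed to be unique across question types.

```python
import random


def init_particle(bank_by_type, counts, rng=random):
    """Build one particle as a list of segments, one per question type.

    bank_by_type : dict mapping question type -> list of candidate question IDs
                   (already filtered by the desired knowledge points)
    counts       : dict mapping question type -> number of questions to draw
    """
    particle = []
    for qtype, n in counts.items():
        # Sampling without replacement: no question ID repeats in a segment.
        particle.append(rng.sample(bank_by_type[qtype], n))
    return particle


def init_population(bank_by_type, counts, size, seed=None):
    """Create `size` particles from a reproducible random generator."""
    rng = random.Random(seed)
    return [init_particle(bank_by_type, counts, rng) for _ in range(size)]
```

Because each segment is sampled only from the questions of its own type, every particle satisfies the question-type constraint by construction, and only the remaining indicators need to be handled by the fitness function.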

During generation updates, the particle with the highest fitness value in each generation is the population extreme value. The individual extreme value is the state with the highest fitness value that a particle has reached over its successive updates.

3.2.3

Crossover operation

Step 1 Let the length of the coding segment for a given question type in an individual be s₁. When the random number R₁ is less than the crossover rate, select in turn the alleles g₁ and g₂ from the individual and from the individual extreme value.

Step 2 Judge whether the scores of g₁ and g₂ are equal and whether any gene of the individual particle duplicates g₂.

Step 3 If the scores of g₁ and g₂ are equal and there is no duplication, the gene of the new individual comes from g₂ when the random number R₂ > 0.5, and from g₁ otherwise. Take the single-choice question as an example.

This method can make all the genes of the generated new individual unique, i.e., there is no duplication of questions in the same paper. If the fitness value of the particle after the application of the crossover operation is higher than the previous fitness value, the particle is automatically updated, otherwise it is not adjusted.
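The three crossover steps and the acceptance rule can be sketched as follows, under the assumption that a particle is a list of segments of question IDs and that alleles are paired by position within a segment. The function name, the `score` lookup and the `fitness` callable are hypothetical.

```python
import random


def crossover(particle, best, score, p_c, fitness, rng=random):
    """Cross one particle with an extreme (individual or population best).

    particle, best : lists of segments (lists of question IDs), same layout
    score          : dict mapping question ID -> point value
    p_c            : crossover probability
    fitness        : callable scoring a particle (higher is better)
    """
    child = [seg[:] for seg in particle]
    used = {q for seg in child for q in seg}
    for s, seg in enumerate(child):
        for i, g1 in enumerate(seg):
            if rng.random() >= p_c:      # Step 1: proceed only if R1 < p_c
                continue
            g2 = best[s][i]
            # Step 2: scores must match and g2 must not already be in the paper
            if score[g1] != score[g2] or g2 in used:
                continue
            if rng.random() > 0.5:       # Step 3: R2 decides the gene source
                seg[i] = g2
                used.discard(g1)
                used.add(g2)
    # Keep the child only if it improves on the original particle.
    return child if fitness(child) > fitness(particle) else particle
```

Tracking the set of question IDs already in use is one simple way to guarantee that the new individual contains no duplicate questions.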

3.2.4

Mutation operations

The mutation operation focuses on the local search ability of the particle swarm. In this paper, a segmented mutation model is adopted, that is, mutation is carried out within the coding segment corresponding to each question type. The specific process is as follows:

Step 1 Let the length of the coding segment for a given question type in an individual be s₂. When the random number R₃ is less than the mutation rate, determine the mutation position.

Step 2 Obtain the gene (i.e., the question number) at the mutation position from Step 1, retrieve the information of that question, and obtain the set of questions that have the same question type and score and whose knowledge points are in an inclusion relationship with those of the question.

Step 3 Randomly select a question from this set and replace the question number identified in Step 1 with the number of the selected question. Take single-choice questions as an example.
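The mutation steps above might be sketched as follows. As a simplification, the inclusion relationship between knowledge points is approximated here by requiring an identical knowledge point label. This approximation, the function name and the `meta` lookup table are assumptions made for illustration.

```python
import random


def mutate(particle, p_m, meta, bank, rng=random):
    """Segment-wise mutation of one particle.

    particle : list of segments (lists of question IDs)
    p_m      : mutation probability
    meta     : dict mapping question ID -> (qtype, score, knowledge_point)
    bank     : iterable of all candidate question IDs
    """
    child = [seg[:] for seg in particle]
    used = {q for seg in child for q in seg}
    for seg in child:
        for i, g in enumerate(seg):
            if rng.random() >= p_m:      # Step 1: mutate only if R3 < p_m
                continue
            # Step 2: unused questions matching type, score and knowledge point
            pool = [q for q in bank if q not in used and meta[q] == meta[g]]
            if not pool:
                continue                 # nothing compatible to swap in
            new_q = rng.choice(pool)     # Step 3: random replacement
            seg[i] = new_q
            used.discard(g)
            used.add(new_q)
    return child
```

Restricting the replacement pool to questions with the same type and score means the mutated paper keeps its total score and structure, so mutation only changes which concrete questions are used.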

3.2.5

Adaptive crossover and mutation probabilities

1)

Improvement idea.

The logistic function and a similarity coefficient are combined to adaptively balance the two probabilities used in the genetic operations.

2)

Logistic function.

The logistic function is widely used in information science, biology and other fields, and describes certain bounded growth phenomena fairly accurately. It can be expressed in different forms; a common one is shown in equation (19): (19) $y = \frac{1}{1 + e^{\alpha + \omega \kappa}}$

The logistic function is positively correlated with its variable over the interval and converges at both ends. Based on this property, incorporating the function into the adaptive probability strategy yields an operator that meets the improvement requirement.

3)

Similarity coefficient.

The degree of similarity between individuals in a population is reflected by the similarity coefficient. In this paper, the expectation EX and the variance DX of the fitness values are introduced to calculate the similarity coefficient. Taking the fitness value as the variable, EX gives the population mean and DX gives the degree of dispersion around that mean, as shown in Eqs. (20) and (21): (20) $EX = f_{avg} = \frac{f_{1} + f_{2} + \dots + f_{M}}{M}$ (21) $DX = \frac{f_{1}^{2} + f_{2}^{2} + \dots + f_{M}^{2}}{M} - f_{avg}^{2}$

Where f_avg is the average fitness value of the population, M is the population size, and f_i(i = 1, 2, ⋯, M) represents the individual fitness value.

In theory, as the population evolves the particles improve, particle fitness gradually increases, and the particles become more and more similar. That is, EX gradually increases while DX gradually decreases. Accordingly, the similarity coefficient λ is defined as shown in equation (22): (22) $\lambda = \frac{EX + 1}{\sqrt{DX}}$

4)

Adjustment formulas for the crossover and mutation probabilities.

Combined with the definition of the similarity coefficient, the improved adaptive crossover probability p_c and mutation probability p_m are given by the adjustment formulas in Eqs. (23) and (24): (23) $p_{c} = \frac{1}{1 + e^{-\frac{u_{1}}{\lambda}}} - 0.1$ (24) $p_{m} = \frac{u_{2}}{5 (1 + e^{\frac{1}{\lambda}})}$

where u₁ and u₂ are two constants with values in (0, +∞) and (0, 1), respectively.

Setting u₁ to 10 and u₂ to 0.1, the improved crossover probability p_c and mutation probability p_m have the following properties:

1)

As the similarity coefficient λ increases, p_c becomes smaller and p_m larger.

2)

p_c always varies within the interval (0.4, 0.9) and p_m always varies within the interval (0, 0.1).
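Equations (20) to (24) can be checked numerically with the short sketch below. The function names are hypothetical, and the small floor on the variance is an added safeguard against division by zero that the paper does not mention.

```python
import math


def similarity_coefficient(fitness_values):
    """Eqs. (20)-(22): lambda = (EX + 1) / sqrt(DX)."""
    m = len(fitness_values)
    ex = sum(fitness_values) / m
    dx = sum(f * f for f in fitness_values) / m - ex * ex
    dx = max(dx, 1e-12)  # added guard: avoid dividing by zero variance
    return (ex + 1) / math.sqrt(dx)


def adaptive_probabilities(lam, u1=10.0, u2=0.1):
    """Eqs. (23)-(24): crossover and mutation probabilities for a given lambda."""
    p_c = 1 / (1 + math.exp(-u1 / lam)) - 0.1
    p_m = u2 / (5 * (1 + math.exp(1 / lam)))
    return p_c, p_m
```

Evaluating `adaptive_probabilities` for increasing λ shows p_c falling toward 0.4 and p_m rising, in line with the two properties listed above. Since $e^{1/\lambda} > 1$, equation (24) also implies $p_{m} < u_{2}/10$, so with u₂ = 0.1 the mutation probability stays below 0.01, well inside the stated interval (0, 0.1).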

3.2.6

Algorithmic flow

The flow of the algorithm is shown in Figure 3. The algorithm ends when the number of iterations reaches the specified value.

4

Algorithm performance validation

4.1

Knowledge Trace Performance Validation

This section compares the overall performance of the Student-DKT model with the model in this paper in terms of student achievement prediction. Three students were randomly selected from different datasets for analysis. Figures 4 to 6 show the three students’ mastery of each knowledge concept in 50 practice steps.

The results show that there is a significant difference between the student model and the distillation model in predicting the performance of students in specific practice steps. From the figures, it can be seen that in most cases, the predictions of this paper’s algorithm are closer to the actual answer performance than those of Student-DKT. The model in this paper can achieve good prediction results at the initial stage by extracting soft target knowledge to guide the student model. Based on the predicted performance of Student-DKT and this paper’s model, the following conclusions can be drawn:

1)

In addition to predicting future performance, the algorithmic framework of this paper is able to automatically capture learners’ tendencies towards specific knowledge concepts.

2)

As can be seen in Figure 4, the learners’ mastery of knowledge concepts in the ASSISTments0910 dataset ranges from 0.15 to 0.95, with large fluctuations in changes, indicating that the learners’ learning status is extremely unstable.

3)

As can be seen in Figure 5, the algorithm in this paper improves on Student-DKT to a lesser extent than on the other two datasets. This observation confirms that the performance improvement of the distillation mechanism on the ASSISTments2015 dataset is not significant. In addition, the learners’ actual answer results fluctuate greatly, with a fluctuation range of 0.37 to 0.94, whereas the model’s prediction curve for the knowledge state is relatively smooth. This indicates that in the ASSISTments2015 dataset, which has an imbalance between the size of the student set and the size of the concept set, the predictive power of knowledge tracking is significantly lower than on the other two datasets.

4)

As can be seen in Figure 6, the model’s estimation of learner competence gradually improves as the learner correctly attempts most of the remaining exercises, indicating that the learner maintains a better learning state.

After comprehensive analysis, the designed framework not only allows accurate cognitive prediction of students, but also has interpretability, which makes the method proposed in this paper more attractive for practical application. For example, teachers can explicitly analyze learners’ knowledge acquisition ability and provide them with richer learning guidance.

4.2

Grouping strategy simulation experiment

To test the effectiveness of this paper’s intelligent grouping strategy for English exams, which is based on the improved harmonic search algorithm, the question paper of the 2019 English Grade 4 exam in Liaoning Province is taken as an example. The relevant grouping organizations in this province use a computerized automatic grouping system for intelligent grouping, and this paper’s strategy is introduced into that system. The program is written in Java; the experimental environment is a Windows XP system with an 851 MHz processor and 64 MB of memory. The mean knowledge point value, mean difficulty, and mean differentiation of all the questions in the question bank of the province’s 2019 English Grade 4 examination are set to 0.64, 0.53, and 0.65, respectively.

4.2.1

Analysis of the effect of grouping papers under different examination paper expectation times

The total score of the test paper is 100 points: 12 points for fill-in-the-blank, 32 points for multiple-choice, 12 points for terminology, 22 points for short-answer, and 22 points for synthesis questions. The three time periods set for the expected completion of the test paper are 30-60 min, 61-90 min, and 91-120 min, and the grouping effect for the three time periods is shown in Fig. 7, Fig. 8, and Fig. 9, respectively.

The grouping accuracy is the probability that the computerized automatic grouping system obtains a feasible solution over 40 grouping runs, measured before and after adopting the strategy in this paper. The optimal, worst, and mean values are the corresponding statistics of the quality of the feasible solutions obtained over those 40 runs. As shown in Fig. 7, Fig. 8 and Fig. 9, before using the strategy in this paper the probability of obtaining feasible solutions for the three time periods is 0.88, 0.94 and 0.90 in turn; after using it, the probability is 0.98. The optimal and mean values of the feasible solutions obtained over the 40 runs are also larger after using the strategy than before.

4.2.2

Test paper population fitness analysis

When this paper’s particle swarm genetic algorithm-based strategy searches for the optimal solution of the intelligent grouping of English exams, the number of individuals in the population in the harmonic memory bank is set to 100 and the number of iterations to 350. The maximum fitness values before and after the use of this paper’s strategy are shown in Fig. 10. As Fig. 10 shows, the maximum fitness value is 0.71 after using this paper’s strategy and 0.68 before, so the quality of the questions selected from the question bank is better when this paper’s strategy is used to find the optimal solution.

4.2.3

Diversity analysis of test paper populations

The diversity of the test paper population can indicate the level of variation among the test questions in the question bank. If the difference between the questions is large, the population diversity is high, otherwise, the population diversity is low and the difference between the questions is small.

For the particle swarm genetic algorithm, the diversity of the population has a direct impact on search performance: the greater the diversity, the better the overall search performance of the algorithm and the more of the unexplored search range it can reach.

In the context of this paper, diversity can be understood as the ability to obtain new question types. However, if the population diversity always remains large, the global optimal solution becomes harder to obtain. Therefore, the population needs good diversity at the beginning of the search, while at the end of the search it needs to move closer to the optimal solution, so diversity should gradually decrease. The population diversity of the computerized automatic grouping system was tested before and after using the strategy in this paper, with the test time set to 30-60 min and 91-120 min; the results are shown in Fig. 11 and Fig. 12, respectively. Fig. 11 and Fig. 12 show that, when the English test is subject to a time constraint and this paper’s strategy is used, population diversity is higher in the early stage of grouping and then decreases rapidly in the late stage as the number of iterations increases, so that the global optimal solution is obtained quickly. In summary, the performance of automatic computerized grouping improves after using the strategy proposed in this paper.

To further demonstrate the effectiveness of this paper’s strategy, two mainstream strategies, multi-attribute association and intelligent genetic, are compared with it in terms of grouping accuracy and grouping time. The results are shown in Fig. 13 and Fig. 14, respectively. At 50, 100, 150, 200 and 250 iterations alike, the grouping accuracy of this paper’s strategy is significantly higher than that of the other two strategies, reaching up to 98%. The grouping time with this paper’s strategy is within 2 s, which is significantly lower than that of the other two methods. These results support the validity of the strategy in this paper.

5

Conclusion

To optimize the allocation of teaching resources in the English classroom and achieve more scientific and reasonable curriculum planning, this paper designs an intelligent grouping strategy based on the particle swarm genetic algorithm in combination with the Bayesian knowledge tracking model. After the performance of the knowledge tracking module is verified on the datasets, simulation experiments on grouping papers are conducted. In most cases, the prediction results of this paper’s algorithm are closer to the actual answer performance than those of Student-DKT. The model presented in this paper guides the student model by extracting soft target knowledge, which leads to good prediction results in the initial stage. Before using this paper’s strategy, the probability that the intelligent grouping system obtains feasible solutions for the three time periods is 0.88, 0.94 and 0.90 in turn; after using it, the probability is 0.98. The optimal and mean values of the feasible solutions over 40 runs of automatic computerized grouping are also larger after using the strategy than before, which realizes the efficient allocation of resources for English language teaching.

Shuhui Cui

Published Online: Mar 21, 2025

Received: Oct 22, 2024

Accepted: Feb 18, 2025

DOI: https://doi.org/10.2478/amns-2025-0595

Keywords: Bayesian, Knowledge tracking, Particle swarm genetics, Teaching resource allocation

© 2025 Shuhui Cui, published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
