Teaching Practices to Enhance English Reading Comprehension Using Natural Language Processing Technology
Published Online: Mar 19, 2025
Received: Oct 27, 2024
Accepted: Feb 13, 2025
DOI: https://doi.org/10.2478/amns-2025-0441
© 2025 Zhuo Wang et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
At present, most students in English teaching consider “reading” easy and “writing” difficult, because “writing” ability involves the use and collocation of words, the composition and syntactic correctness of sentences, the semantic cohesion of paragraphs, and the reasonableness of the overall structure of a text. Developing this ability requires a large amount of practice together with detailed evaluation of and feedback on that practice, which, given the current conditions of English teaching such as the teacher-student ratio, is obviously difficult to provide [1–3]. Natural language processing offers many research areas related to “reading and writing” ability; techniques such as word segmentation, word vector modeling, part-of-speech tagging, sentence breaking, and sentence-meaning segmentation support the recognition, analysis, and comprehension of text from its parts (words, sentences, paragraphs) up to the text as a whole [4–6]. One established application of these technologies is automated essay scoring (AES). AES has been commonly used in the TOEFL, GRE, TOEIC, and other examinations run by the ETS in the U.S.: AES completes the initial review of the essays, and two teachers then confirm the final grade on the basis of the AES rubric, which greatly improves the speed of essay review [7–9]. China has also been using and testing AES for more than ten years, but the accuracy of AI essay grading and the quality of its feedback still need improvement. In recent years, neural network technologies such as RNNs and LSTMs have developed rapidly, and on relevant test sets the correlation of some RNN- or LSTM-based AES scores with reference scores has reached or even exceeded that of teacher scoring [10–12]. In summary, building on AES and related artificial intelligence technologies, introducing natural language processing into the teaching of English “reading and writing” competence makes it possible to analyze students’ reading materials automatically and generate reading comprehension questions while students read, and further to analyze, understand, and evaluate students’ open-ended English reading and writing exercises [13–15].
In the process of carrying out English classroom teaching, developing students’ reading comprehension ability is one of teachers’ main tasks. Literature [16] examines the role and research hotspots of intelligent tutoring systems and natural language processing in artificial intelligence, as well as AI algorithms such as statistical learning, data mining, machine learning, and natural language parsing in language education, and finds that most have been applied in writing, reading, and vocabulary acquisition, where they can address issues such as learning anxiety, willingness to communicate, knowledge acquisition, and classroom interaction. Literature [17] emphasizes the importance of reading and writing to oral and comprehensive English skills, designs an artificial-intelligence-based multi-criteria decision-making (MCDM) support system for the production and application of multimodal online reading in English, and experimentally verifies its feasibility and practicality in helping to improve students’ English reading and writing skills. Literature [18] systematically explores how to take the road of artificial intelligence plus smart classroom, designs a smart classroom teaching mode for college English based on artificial intelligence technology in mobile information systems, and verifies through empirical analysis the effectiveness of the mode in cultivating students’ reading comprehension ability; students’ satisfaction with the teaching mode is extremely high, which can further promote the intelligent development of education. Literature [19], through a systematic review of the relevance and uses of natural language processing in language learning, points out that natural language processing can address the analysis and generation of written and spoken language and plays an important role in improving language learners’ English reading comprehension. Literature [20] shows through simulation experiments that deep-learning complexity and multimodal target recognition can effectively complete a variety of natural language problems, including lexical and semantic role labeling, which in turn improves the accuracy of natural language processing and plays a certain role in promoting the cultivation of students’ reading and writing ability. Literature [21] examines whether applying artificial intelligence technologies such as natural language processing, interactive exercises, personalized feedback, and speech recognition to English language teaching can improve students’ oral English skills and oral self-regulation; the experimental results show that AI-based English language teaching can improve students’ oral proficiency, which highlights the potential of artificial intelligence technology.
This study combines natural language processing techniques with English reading comprehension to construct an English reading comprehension model that utilizes dependency syntactic analysis and keyword co-occurrence. It combines keyword features with the BERT model to calculate the number of keyword co-occurrences in English articles and questions, and fine-tunes using pre-trained model parameters. To enhance the encoding process of the model and obtain long-distance inter-word association information, dependency syntactic analysis is incorporated. The model mainly consists of an input layer, an encoding layer, and a matching layer. The input layer is responsible for vectorizing the question and the article and extracting the features, the encoding layer fuses the textual information and the features, and finally the matching layer finds the answer corresponding to the question in the article and outputs the answer interval. Performance testing experiments are then carried out on this paper’s English reading comprehension model, and English reading comprehension teaching practice is conducted with first-year Business English majors at a university as the research object, to explore the utility, in actual teaching and learning, of an English reading comprehension system built around the model of this paper.
Natural language processing aims to make machines learn humans’ natural language and the characteristics of that language without human assistance, so that they can perform linguistic text-processing tasks in place of humans [22].
The pre-processing stage of the text is crucial for subsequent natural language processing. In NLP engineering, the raw text data should be pre-processed through a series of operations that characterize it, including word segmentation, text cleaning, stemming, lemmatization, and referential disambiguation. Segmentation reduces the sentence level to the individual word level: the sentence is cut into multiple words according to certain rules. Text cleaning removes useless parts of the original data, such as punctuation marks that are not needed, stop words, and unnecessary tags. Stemming and lemmatization both involve finding the original form of the target word. Referential disambiguation (coreference resolution) is one of the more difficult basic tasks: it must be determined which noun or phrase in the preceding text each pronoun in the later text represents, i.e., the mapping relationship between pronouns and earlier words; the process of finding this mapping is referential disambiguation.
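To make these steps concrete, the following is a minimal sketch of such a pre-processing pipeline using the open-source NLTK library; the sample sentence and the choice of NLTK are illustrative assumptions (this paper does not prescribe a toolkit), and coreference resolution is omitted since it requires heavier tooling.

```python
# Minimal text pre-processing sketch with NLTK (illustrative only).
# Requires: pip install nltk
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

for resource in ("punkt", "punkt_tab", "stopwords", "wordnet"):
    nltk.download(resource, quiet=True)

text = "The students were reading the articles, and they enjoyed them."

# 1. Word segmentation: split the sentence into individual tokens.
tokens = word_tokenize(text.lower())

# 2. Text cleaning: drop punctuation and stop words.
stops = set(stopwords.words("english"))
cleaned = [t for t in tokens if t.isalpha() and t not in stops]

# 3. Stemming and lemmatization: recover base forms of the words.
stemmer, lemmatizer = PorterStemmer(), WordNetLemmatizer()
print([stemmer.stem(t) for t in cleaned])
print([lemmatizer.lemmatize(t, pos="v") for t in cleaned])
```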
Text vector modeling converts natural human language into computer-recognizable encoding, a “computer language”: in layman’s terms, the text is somehow mapped into numeric vectors, the text vectors. Text is usually made up of individual words, whose vectors are correspondingly called word vectors. In early research, each word was encoded using One-hot Representation, which represents each word as a long vector whose dimensionality equals the size of the word list: every dimension is assigned the value 0 except one, which is assigned the value 1 and identifies the current word. This is equivalent to numbering each word in order. Since the word list may contain tens to hundreds of thousands of words, a very important problem arises, known as the “lexical gap” phenomenon: this representation cannot express the relationship between words, so no similarity between them can be known and any two words are isolated from each other; in practical applications it is also prone to the curse of dimensionality.
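As a concrete illustration of this “lexical gap”, the short sketch below builds one-hot vectors for a toy vocabulary (invented for the example); any two distinct words have zero similarity, regardless of meaning.

```python
import numpy as np

vocab = ["cat", "kitten", "car"]     # toy word list
one_hot = np.eye(len(vocab))         # row i is the one-hot vector of word i

# Every pair of distinct words is orthogonal: the dot product is 0,
# so "cat" is no closer to "kitten" than it is to "car".
print(one_hot[0] @ one_hot[1])       # 0.0
print(one_hot[0] @ one_hot[2])       # 0.0
```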
As research on text representation progressed, it was gradually found that these shortcomings of One-hot Representation cannot be avoided. Researchers therefore proposed the word embedding model, whose essence is to learn, by training on a large amount of data, a mapping of each word into a vector in a lower-dimensional (dozens to hundreds of dimensions) word vector space. With this method, the semantic similarity between words can be determined through vector operations such as the distance between word vectors, for example cosine similarity or Euclidean distance, so that the relationships between words can be derived. Besides supporting the study of relationships between words, these vectors can serve as input for the next stage of natural language processing, providing the basis for tasks such as text summarization, human-computer dialogue, reading comprehension, and text generation. A word embedding model can only make predictions for known words, i.e., words that appeared in the training set.
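The cosine and Euclidean measures mentioned above are straightforward to compute on dense word vectors; a minimal sketch follows, with invented low-dimensional vectors standing in for learned embeddings.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors: 1 = same direction."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Invented 4-dimensional embeddings, for illustration only.
cat    = np.array([0.8, 0.1, 0.6, 0.2])
kitten = np.array([0.7, 0.2, 0.5, 0.3])
car    = np.array([0.1, 0.9, 0.1, 0.8])

print(cosine_similarity(cat, kitten))   # high: semantically close
print(cosine_similarity(cat, car))      # low: semantically distant
print(np.linalg.norm(cat - kitten))     # Euclidean distance, small for close words
```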
The two classical word embedding models are the CBOW model and the Skip-gram model. CBOW predicts the target word from the words on both sides of it, while Skip-gram predicts the surrounding words from the target word. Neither takes into account the positional information of words in the text when storing them, and a contiguous piece of space is used to represent all the words, so these models can save some training time [23–24].
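A minimal sketch of training both variants with the gensim library is shown below; the toy corpus and hyperparameters are assumptions, not values from this paper (in gensim’s API, `sg=0` selects CBOW and `sg=1` Skip-gram).

```python
# CBOW vs. Skip-gram with gensim (pip install gensim); illustrative only.
from gensim.models import Word2Vec

corpus = [
    ["students", "read", "english", "articles"],
    ["students", "write", "english", "essays"],
    ["reading", "improves", "comprehension"],
] * 50  # repeat the toy corpus so training has enough examples

cbow = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=0)  # CBOW
skip = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)  # Skip-gram

# Cosine similarity between learned vectors (cf. the measures above).
print(cbow.wv.similarity("read", "write"))
print(skip.wv.most_similar("english", topn=2))
```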
Traditional English reading comprehension models mainly suffer from a low rate of text representation and insufficient ability to reason about answers. Aiming at these problems, this chapter constructs an English reading comprehension model based on natural language processing technology.
In this chapter, the multi-feature fusion approach to machine reading comprehension involves three main types of features: keyword features, pre-trained model features, and dependency syntactic features. Keyword features record where the keywords of the question appear in the article, pre-trained model features use the pre-trained BERT model, and dependency syntactic features extract syntactic information that is then fused with the model. The inclusion of multiple features improves the information extraction ability of the model [25].
The essence of using deep learning models for reading comprehension is to find the correlation between the question and the answer. The correlating feature between question and answer is the keyword: the keywords of the question appear near the position of the answer in the article. When a keyword appears in both the article and the question, its co-occurrence value is 1. The co-occurrence feature thus marks, for each token of the article, whether it also appears in the question.
The article and the question are spliced into a single sequence of fixed maximum length, separated by special tokens, as required by the BERT input format.
Combining the keyword features with the BERT model, the input of the sequence is shown in equation (4):

$$X=E_{tok}+E_{seg}+E_{pos}+E_{key} \tag{4}$$

where $E_{tok}$, $E_{seg}$, and $E_{pos}$ are BERT’s token, segment, and position embeddings, and $E_{key}$ is the embedded keyword co-occurrence feature.
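A sketch of this input construction with the HuggingFace transformers tokenizer is given below; the per-token binary co-occurrence computation is a simplified reading of the description above, and the model name and the fusion of the feature into equation (4) are assumptions not shown here.

```python
# Sketch: splice question + article for BERT and mark keyword co-occurrence.
# Requires: pip install transformers
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

question = "Where did the students read?"
article = "The students read English articles in the school library."

# [CLS] question [SEP] article [SEP] as one spliced sequence.
enc = tokenizer(question, article)
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"])

# Keyword co-occurrence: 1 if a token of the sequence also appears
# among the question's tokens, else 0.
q_words = set(tokenizer.tokenize(question))
cooccur = [1 if t in q_words else 0 for t in tokens]
print(list(zip(tokens, cooccur)))
```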
In dependency syntactic analysis, only one constituent in a sentence is independent; all other constituents are subordinate to some constituent; no constituent may depend on two or more constituents; and the constituents to the left and right of a head constituent are not related to each other [26].
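As an illustration of these constraints, the spaCy sketch below parses a sentence and prints each word’s single head; the example sentence and the choice of spaCy are assumptions for demonstration.

```python
# Dependency parse sketch with spaCy (pip install spacy;
# python -m spacy download en_core_web_sm). Illustrative only.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The students read English articles carefully.")

# Each token depends on exactly one head; only the root is independent.
for token in doc:
    print(f"{token.text:10s} --{token.dep_:8s}--> {token.head.text}")
```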
The word embedding uses 200-dimensional Tencent word vectors, which cover more than 8 million Chinese words and phrases, a very rich vocabulary with relatively high accuracy. A CNN is an effective method for extracting localized features. The dependency information is fed into the CNN convolutional layer, and the CNN automatically learns the values of its filters according to the task to be performed; two aspects of this computation are worth noting, positional invariance and compositionality [27]. An important property of CNNs is that they are very fast: for each input, the CNN slides its window (filter) over the matrix and uses a convolutional layer and a pooling layer to extract a new dependency feature vector. The output of the pooling layer is a fixed-length dependency feature vector, which is fused with the other features in the encoding layer.
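A minimal PyTorch sketch of such a convolution-plus-pooling extractor over a sequence of 200-dimensional dependency embeddings follows; the filter count and kernel width are illustrative assumptions.

```python
# Sketch: extract a fixed-length dependency feature vector with a 1-D CNN
# plus max pooling. Dimensions and filter width are assumptions.
import torch
import torch.nn as nn

class DepCNN(nn.Module):
    def __init__(self, emb_dim=200, num_filters=128, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(emb_dim, num_filters, kernel_size, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)     # max over all positions

    def forward(self, x):                       # x: (batch, seq_len, emb_dim)
        x = x.transpose(1, 2)                   # Conv1d expects (batch, emb, seq)
        h = torch.relu(self.conv(x))            # filter slides over the sequence
        return self.pool(h).squeeze(-1)         # (batch, num_filters)

features = DepCNN()(torch.randn(4, 32, 200))   # 4 sentences, 32 tokens each
print(features.shape)                           # torch.Size([4, 128])
```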
The model is based on the pre-trained model BERT. Pre-trained models can greatly improve the effect of natural language processing tasks: they can be applied directly to the current task, making up for an insufficient training corpus and accelerating the convergence of the model. Combining the pre-trained model with dependency syntactic information and keyword features, the model mainly consists of an input layer, an encoding layer, and a matching layer. The overall structure of the model is shown in Figure 1.

Model structure
In this paper, we use the self-attention model to fuse article semantics and question semantics, together with layer normalization for regularization. The coding layer has two sub-layers: the first is multi-head self-attention and the second is a fully-connected feed-forward network. The two sub-layers are connected using the residual network structure, each followed by a layer normalization layer. Multi-head attention applies multiple linear projections and then stitches the different attention heads together to form the final representation.
The formulas are shown in equations (6)-(9):

$$\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_{k}}}\right)V \tag{6}$$

$$\mathrm{head}_{i}=\mathrm{Attention}\left(QW_{i}^{Q},KW_{i}^{K},VW_{i}^{V}\right) \tag{7}$$

$$\mathrm{MhA}(Q,K,V)=\mathrm{Concat}\left(\mathrm{head}_{1},\ldots,\mathrm{head}_{h}\right)W^{O} \tag{8}$$

$$y=x+\mathrm{Sublayer}(x) \tag{9}$$

MhA takes the same input matrix as $Q$, $K$, and $V$, i.e., it performs self-attention over the spliced sequence. The residual connection ensures that the network does not get worse after training, as shown in equation (9): because of the added identity term $x$, gradients can flow directly through the skip connection even when the sub-layer contributes little.
Normalization transforms the hidden layer data into a standard normal distribution with mean 0 and variance 1. Normalization is performed before the data are fed into the activation function, so that the input does not fall in the saturation zone of the activation function, which accelerates training and convergence, as shown in equations (12)-(14):

$$\mu=\frac{1}{H}\sum_{i=1}^{H}x_{i} \tag{12}$$

$$\sigma^{2}=\frac{1}{H}\sum_{i=1}^{H}\left(x_{i}-\mu\right)^{2} \tag{13}$$

$$\mathrm{LN}\left(x_{i}\right)=\gamma\,\frac{x_{i}-\mu}{\sqrt{\sigma^{2}+\epsilon}}+\beta \tag{14}$$

where $H$ is the hidden layer dimension, $\gamma$ and $\beta$ are learnable scale and shift parameters, and $\epsilon$ avoids division by zero. Finally, a deep, context-aware representation of the article and the question is generated, and this output is passed to the matching layer.
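Equations (6)-(14) correspond to the standard Transformer encoder sub-layers; a compact PyTorch sketch with assumed dimensions is given below, where `nn.MultiheadAttention` stands in for the multi-head formulas (7)-(8).

```python
# Sketch of the coding layer: multi-head self-attention and a feed-forward
# sub-layer, each with a residual connection followed by layer normalization.
# Dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    def __init__(self, d_model=768, n_heads=8, d_ff=2048):
        super().__init__()
        self.mha = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                 nn.Linear(d_ff, d_model))
        self.ln1, self.ln2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):                 # x: (batch, seq, d_model)
        a, _ = self.mha(x, x, x)          # Q = K = V = x (self-attention)
        x = self.ln1(x + a)               # residual + layer norm, Eqs. (9), (12)-(14)
        return self.ln2(x + self.ffn(x))  # second sub-layer, same pattern

out = EncoderBlock()(torch.randn(2, 16, 768))
print(out.shape)                          # torch.Size([2, 16, 768])
```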
The matching layer borrows the idea of the pointer network: given the contextualized representations of the article and the question, an answer probability is computed for every token position in the article. Two values are predicted for each position, the probability of being the start and the probability of being the end of the answer span, and the span with the highest joint probability is output as the answer interval. The loss function is the summed cross-entropy of the true start and end positions, as shown in equation (22):

$$L=-\frac{1}{N}\sum_{i=1}^{N}\left[\log p_{y_{i}^{s}}^{start}+\log p_{y_{i}^{e}}^{end}\right] \tag{22}$$

where $y_{i}^{s}$ and $y_{i}^{e}$ are the true start and end positions of the answer for sample $i$.
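A compact sketch of this start/end prediction and the loss of equation (22) follows, in PyTorch; the shapes and the gold answer indices are invented for illustration.

```python
# Sketch of the matching layer: predict start/end positions of the answer
# span and train with the summed cross-entropy of both, as in Eq. (22).
import torch
import torch.nn as nn

d_model, seq_len, batch = 768, 16, 2
H = torch.randn(batch, seq_len, d_model)   # encoder output (assumed shape)
start_head = nn.Linear(d_model, 1)
end_head = nn.Linear(d_model, 1)

start_logits = start_head(H).squeeze(-1)   # (batch, seq_len)
end_logits = end_head(H).squeeze(-1)

# Gold answer interval (illustrative indices).
y_start = torch.tensor([3, 5])
y_end = torch.tensor([6, 8])

ce = nn.CrossEntropyLoss()
loss = ce(start_logits, y_start) + ce(end_logits, y_end)
print(loss.item())
```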
RACE is a large-scale examination-style question-answering dataset released by Carnegie Mellon University in 2017, based on English test questions for middle and high school students in China. The dataset includes 28,453 passages and 95,528 reading comprehension questions. In RACE, each sample is a quadruple (article, answer, question, distractor answers); the distractors are deleted to form triples (article, answer, question), which, after some preprocessing, can be used to train the model. The questions in the RACE dataset fall into two categories, cloze questions and standard questions. To improve sample quality, we divided the original RACE dataset into a RACE-1 dataset (cloze questions) and a RACE-2 dataset (standard questions). Each dataset is divided into training, validation, and test sets in the ratio 7:2:1. In addition, this paper uses the key-sentence annotation method to label the dataset.
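A minimal sketch of such a 7:2:1 random split is shown below; the sample list is an invented stand-in for the RACE triples.

```python
# Sketch: split (article, answer, question) triples 7:2:1 into
# train / validation / test sets. Placeholder data, for illustration.
import random

samples = [f"triple_{i}" for i in range(1000)]  # stand-in for RACE triples
random.seed(42)
random.shuffle(samples)

n = len(samples)
train = samples[: int(0.7 * n)]
valid = samples[int(0.7 * n): int(0.9 * n)]
test = samples[int(0.9 * n):]
print(len(train), len(valid), len(test))        # 700 200 100
```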
The English reading comprehension model of this paper is validated on the two RACE sub-datasets of selective (multiple-choice) reading comprehension, and the validation results are shown in Table 1. Among all the models, including the model in this paper, the Attentive Reader performs worst, reaching only 46.59% and 41.23% on the RACE-1 and RACE-2 sub-datasets, respectively. The model in this paper performs best, with 88.36% on RACE-1 and 85.05% on RACE-2.
Table 1: Performance comparison
| Model | RACE-1 (%) | RACE-2 (%) |
|---|---|---|
| Richardson et al. | 69.93 | 64.19 |
| Wang et al. | 74.93 | 70.93 |
| Li et al. | 74.30 | 72.25 |
| Attentive Reader | 46.59 | 41.23 |
| Neural Reasoner | 46.71 | 46.40 |
| Parallel-Hierarchical | 74.68 | 70.38 |
| Reading Strategies | 82.07 | 81.57 |
| BERT | 74.74 | 80.00 |
| Model of this paper | 88.36 | 85.05 |
As established above, the English reading comprehension model constructed in this paper is based on the pre-trained model BERT, with dependency syntactic analysis and keyword co-occurrence introduced to improve the encoding process (i.e., “BERT + keywords + dependency syntax”). This section compares the model with other baseline models on the RACE dataset and analyzes the results, which are shown in Table 2. From the table, the accuracy of the English reading comprehension model proposed in this chapter is 86.33% on RACE-1, the best performance among the eight baseline methods. The model likewise achieves the best result on RACE-2, with an accuracy of 83.85%, again better than the other eight baseline methods.
Table 2: Comparison with baseline models
| Model | RACE-1 (%) | RACE-2 (%) |
|---|---|---|
| BERT | 73.61 | 80.71 |
| BERT+LUA | 72.16 | 78.97 |
| BERT+TLUA | 82.38 | 80.01 |
| BERT+FRA | 83.80 | 80.96 |
| BERT+Keywords+Dependency syntax | 83.15 | 81.46 |
| BERT+Keywords | 73.34 | 70.42 |
| BERT+Dependency syntax | 68.71 | 76.55 |
| BERT+Wordset+Dependency syntax | 61.41 | 68.59 |
| Model of this paper (BERT+Keywords+Dependency syntax) | 86.33 | 83.85 |
Vector comparators are often used to bring document representation vectors closer to question-answer pair representation vectors. In this section, a vector comparator is introduced to investigate whether it affects the document representation of this paper’s English reading comprehension model. The effect of adding the vector comparator to the pre-trained model is shown in Figure 2. In the figure, the document representation vectors move closer not only to the question-answer representation vectors but also to the representation vectors of the supporting information, which demonstrates that the vector comparator can further enhance the model’s representation ability and is more conducive to the model choosing answers and extracting the supporting information at the same time.

Effect after adding vector comparator
To further explore the relationship between the representation vectors, this section presents them as heat maps, as shown in Figure 3. Subfigure (a) shows that the document representation of the baseline model places its attention on other sentences, causing the model to make wrong choices. Subfigure (b) shows that after adding syntax the model partially focuses on the supporting information, which can guide it toward the right choice, though it cannot yet extract the supporting information accurately. Subfigure (c) shows that after adding the vector comparator the model clearly places most of its attention on the position of the supporting information, so that it can not only choose the correct answer but also extract the supporting information accurately.

Eigenvector heat diagram
In addition, this section compares the prediction results of the baseline model, the model with added syntactic knowledge, and the model with the vector comparator, as shown in Figure 4. The figure compares the models’ answer accuracy with the F1 value of the extracted supporting information. The results show that after introducing the vector comparator, the model’s representation ability is strengthened: the document representation moves closer to the question-answer representation, which benefits answer selection, and at the same time closer to the representation of the supporting information, which improves the model’s ability to extract supporting information.

Accuracy and support information
Above, this paper constructed an English reading comprehension model based on natural language processing technology and used it as the core technology of an English reading comprehension system. This chapter explores whether the system affects students’ English reading ability. Two parallel first-year classes of the Business English major at a university, with similar English reading ability, were selected for the experiment. The two classes have the same English class hours, the same textbooks, and the same teaching schedule; both are taught by the same teacher, and each has 50 students. The experimental class introduces the English reading comprehension system constructed in this paper into reading classroom teaching, while the control class continues to use the traditional grammar-translation teaching method.
Before and after the educational practice in English reading comprehension, English reading tests were administered to the experimental and control classes. To test whether each class’s English reading scores differed before and after the experiment, paired-sample t-tests were conducted for the two classes; the results are shown in Table 3. The mean total reading score of the experimental class in the post-test is 4.46 points higher than in the pre-test, and the p-value from the paired-sample t-test of the experimental class’s two test scores is 0.002 (p<0.05), a statistically significant difference: the post-test scores are significantly higher than the pre-test scores, meaning that the reading ability of the students in the experimental class improved significantly. The mean total score of the control class on the post-test is 0.28 points higher than on the pre-test, and the p-value from the paired-sample t-test of the control class’s pre- and post-test scores is 0.15 (p>0.05), which is not statistically significant. Although the control class’s post-test mean improved slightly over its pre-test mean, there is no significant change between the two tests, indicating no significant improvement in the reading ability of the students in the control class.
Table 3: English reading scores
| Class | Experimental class | | Control class | |
|---|---|---|---|---|
| Test time | Before experiment | After experiment | Before experiment | After experiment |
| Mean | 24.12 | 28.58 | 25.54 | 25.82 |
| Number of cases | 50 | 50 | 50 | 50 |
| Standard deviation | 6.178 | 4.792 | 5.598 | 5.536 |
| t | −9.911 | | −1.619 | |
| p | 0.002 | | 0.15 | |
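To illustrate the paired-samples t-test used above, the following is a minimal sketch with SciPy; the score arrays are invented placeholders generated to resemble the reported means, not the study’s data.

```python
# Sketch: paired-samples t-test on pre/post reading scores with SciPy.
# The score arrays are simulated placeholders, not the study's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
pre = rng.normal(24.12, 6.18, size=50)       # simulated pre-test scores
post = pre + rng.normal(4.46, 2.0, size=50)  # simulated post-test gain

result = stats.ttest_rel(post, pre)          # paired-samples t-test
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```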
Based on natural language processing technology, this paper constructs an English reading comprehension model and builds a corresponding English reading comprehension system for English learners. In the performance tests of the model, its performance on the RACE-1 and RACE-2 sub-datasets reaches 88.36% and 85.05%, higher than all comparison models. Compared with the other baseline models, the model in this paper likewise achieves the highest accuracy, 86.33% and 83.85%, on the RACE-1 and RACE-2 sub-datasets. In the document representation experiments with the vector comparator, the model places most of its attention on the location of the supporting information after the comparator is added, achieving accurate extraction of the supporting information, and the model’s representation ability is also strengthened.
Applying the English reading comprehension system built around this paper’s model in actual English reading comprehension teaching, the English reading post-test score of the experimental class taught with the system increased by 4.46 points over the pre-test, a significant difference (p=0.002<0.05), while the test results of the control class, which was still taught with the traditional method, showed no significant change (p=0.15>0.05).
