The Practice of Machine Translation in the Enhancement of English Reading Comprehension in Multicultural Environments
Published online: March 17, 2025
Submitted: Oct. 17, 2024
Accepted: Feb. 14, 2025
DOI: https://doi.org/10.2478/amns-2025-0234
© 2025 Juan Yu, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Machine translation is an emerging technology that has developed rapidly in recent years. It uses computers and artificial intelligence to convert text in one language into text in another language [1-2]. Machine translation has been widely used in a variety of fields, most visibly in online translation services and translation software. In a diverse environment, the application of machine learning has effectively improved English reading comprehension [3-6].
In today’s era of globalization, the importance of English as an international common language is self-evident. As an important part of English learning, English reading comprehension plays a crucial role in acquiring information, expanding knowledge, and improving language ability [7-10]. English reading comprehension is an important element in promoting cross-cultural communication in a multicultural environment. It helps us learn language skills and enrich cultural knowledge. In today’s increasingly frequent international communication and international trade, reading becomes one of the most convenient means to understand the world and learn advanced technology [11-14]. Therefore, improving reading comprehension has become an important topic of our current research, and cross-cultural language differences, especially the implied cultural differences between languages, are the biggest obstacles in the process of reading comprehension [15-17]. In this context, the emergence of machine translation technology has become an important tool to help people solve the language barriers in cross-cultural communication [18-20].
In this paper, a neural machine translation model is built from an encoder-decoder structure, a text feature representation model, and two key decoding techniques. Using this model as the backbone network, an online English machine translation platform is designed and constructed. The effectiveness of the platform for teaching English reading comprehension is evaluated through a teaching experiment and a satisfaction survey. Finally, a linear regression model is used to further evaluate the improvement in students’ English reading comprehension ability.
A language model is the foundation of natural language processing, and its role is to express natural language into a mathematical form that can be processed by a computer. Let
Neural machine translation [21-22], by contrast, is essentially autoregressive language modeling: the target-language words are decoded one at a time, and the previously decoded results are used as input for decoding the current word.
Conditional probability modeling is performed for a given source language sentence
The initial state
Natural language processing tasks generally operate on sequences, such as a sentence or a piece of speech, and neural machine translation is a “sequence-to-sequence” task. The core structure of neural machine translation, the “encoder-decoder”, is shown in Figure 1.

Encoder-decoder structure
The encoder-decoder structure consists of two parts. The encoder encodes the source language sequence and outputs a fixed-length vector representation. The decoder decodes the vector representation output by the encoder into the target language sequence.
Machine translation, as a natural language processing task, first needs to convert the source language input into numerical vectors that a computer can process. This process is called word embedding (Embedding). The machine translation model containing an Embedding layer is shown in Figure 2.

Machine translation model containing Embedding layer
In natural language processing tasks, word embedding is a form of text feature representation, which is the most fundamental part of the natural language understanding process. CBOW and Skip-Gram train a neural network on contextual information to obtain word vectors. Word2vec uses a Huffman tree in place of the output-layer neurons, with the words at the leaf nodes, to improve training efficiency.
The two most commonly used decoding methods for machine translation are greedy search and beam search. In greedy search, the model selects the word with the maximum probability at each step of word-by-word decoding, if the output sequence of the decoder is
Beam search is similar in spirit to greedy search, but it expands the search space by keeping a fixed number of candidate sequences (the beam width) at each step and selecting the one with the largest overall probability as the output. This makes the translation results more diverse and brings the generated translation closer to the global optimum. The process of searching and decoding is given in the following equation:
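The difference between the two decoding strategies can be illustrated with a minimal Python sketch. The toy next-word table `TOY_MODEL`, the function names, and the probabilities are hypothetical and chosen only for illustration; a real decoder would condition on the source sentence and the whole prefix rather than on the last word alone.

```python
import math

# Hypothetical toy model: next-word probabilities given only the last word.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.4, "dog": 0.35, "</s>": 0.25},
    "a": {"cat": 0.9, "</s>": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def next_probs(prefix):
    """Return the next-word distribution for a partial output sequence."""
    return TOY_MODEL[prefix[-1]]


def greedy_search(max_len=10):
    """Pick the single most probable word at every step."""
    seq, logp = ["<s>"], 0.0
    while seq[-1] != "</s>" and len(seq) < max_len:
        word, p = max(next_probs(seq).items(), key=lambda kv: kv[1])
        seq.append(word)
        logp += math.log(p)
    return seq, logp


def beam_search(beam_width=2, max_len=10):
    """Keep the beam_width best partial sequences at every step."""
    beams = [(["<s>"], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, logp in beams:
            if seq[-1] == "</s>":  # finished hypotheses are carried over
                candidates.append((seq, logp))
                continue
            for word, p in next_probs(seq).items():
                candidates.append((seq + [word], logp + math.log(p)))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == "</s>" for seq, _ in beams):
            break
    return beams[0]


greedy_seq, greedy_logp = greedy_search()
beam_seq, beam_logp = beam_search(beam_width=2)
print(greedy_seq, round(math.exp(greedy_logp), 2))
print(beam_seq, round(math.exp(beam_logp), 2))
```

In this toy table, greedy search commits to the locally best first word and ends with a sequence of probability 0.24, while a beam of width 2 keeps the second-best first word alive and finds a sequence of probability 0.36.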
The attention-based recurrent neural network model is shown in Figure 3. This section takes the attention-based neural machine translation model as an example to describe its principles.

Recurrent neural network based on the attention mechanism
The most basic neural machine translation model is based on the encoder-decoder framework, in which the encoder first encodes the source language sentence into a fixed-length semantic vector, and the decoder then generates the target words one after another from this semantic vector. A fixed-length semantic vector has limited representational capacity and cannot fully express all the information contained in the source language sentence. When the input sentence is very long in particular, the traditional neural machine translation model often performs poorly.
For the neural machine translation model, the model uses a bidirectional recurrent neural network, where the output corresponding to the encoder at each moment is jointly determined by the outputs of the forward and reverse recurrent neural network units at the current moment. By introducing the attention mechanism, an alignment network from the source language sequence to the target language sequence is actually established. After weighting by the attention mechanism, the original fixed-length and unchanged semantic vectors become dynamic semantic vector matrices. This greatly enhances the ability to characterize long sentences. The dynamic generation comes from the attention mechanism, whose formula is shown in equation (8):
The above equation reflects that the essence of the attention mechanism is to find the weighted average, and the dynamic semantic vector
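The weighted-average view of attention can be made concrete with a short Python sketch. The dot-product score used here is an assumption made for brevity (recurrent attention models often use a small feed-forward alignment network instead), and the function names and example vectors are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw alignment scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(decoder_state, encoder_states):
    """Return attention weights and the dynamic context vector.

    Scores are plain dot products (an illustrative assumption); the context
    vector is the weighted average of the encoder outputs.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[k] for w, state in zip(weights, encoder_states))
        for k in range(dim)
    ]
    return weights, context


# Three encoder outputs (one per source position) and one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_context([2.0, 0.0], encoder_states)
print([round(w, 3) for w in weights])
print([round(c, 3) for c in context])
```

Because the weights are recomputed for every decoder state, the context vector changes from one target word to the next, which is what makes the semantic vector dynamic rather than fixed.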
The platform is built mainly so that teachers and students can use neural machine translation conveniently when teaching and learning English reading comprehension in a multicultural environment. The workflow of the neural machine translation platform is shown in Figure 4.

Workflow of the neural machine translation platform
The platform assigns a random id to users accessing the site, and the backend identifies users based on this id, creating a folder for each user where process files are stored, and deleting this folder at the end of the service. The source sentence uploaded by the user is written into the file in the corresponding folder, which triggers the machine translation model residing on the server to read the source sentence and translate it into the target language, and then write the result into another file and return the result to the user’s page through the webpage. The user is then required to submit a rating of the quality of the machine translation and a reference translation. An information storage program residing on the server reads the source sentences, machine translation, ratings, and reference translations from each file in the folder and integrates and stores them.
The platform is used both as an online translation tool and as a parallel corpus collection platform in English reading comprehension classrooms, so the submission of source sentences and the submission of reference translations are independent. When the platform is used as an online translation tool, the server-resident program does not store the source sentences and machine translations long term; this data is deleted along with the user folder at the end of the service. When it is used as a parallel corpus collection platform, the user inputs source sentences and reference translations and clicks submit, and the server-resident program stores this data as a parallel corpus.
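The per-user folder workflow described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the platform's actual implementation (which is built with XOJO): the function names, file names, and the `translate` stub standing in for the server-resident model are all hypothetical.

```python
import shutil
import tempfile
import uuid
from pathlib import Path


def translate(source_sentence):
    """Hypothetical stand-in for the server-resident translation model."""
    return f"[translated] {source_sentence}"


def handle_request(source_sentence, rating=None, reference=None, corpus=None):
    """Serve one request using a temporary per-user folder.

    The folder is removed at the end of the service. A sentence pair is kept
    only when the user supplies a reference translation and a corpus list.
    """
    user_id = uuid.uuid4().hex  # random id assigned to the visitor
    user_dir = Path(tempfile.gettempdir()) / f"nmt_{user_id}"
    user_dir.mkdir(parents=True)
    try:
        source_file = user_dir / "source.txt"
        source_file.write_text(source_sentence, encoding="utf-8")
        # The model reads the source file and writes the result to another file.
        result = translate(source_file.read_text(encoding="utf-8"))
        (user_dir / "target.txt").write_text(result, encoding="utf-8")
        if reference is not None and corpus is not None:
            corpus.append({
                "source": source_sentence,
                "machine_translation": result,
                "rating": rating,
                "reference": reference,
            })
        return result, user_dir
    finally:
        shutil.rmtree(user_dir, ignore_errors=True)  # delete process files


parallel_corpus = []
output, folder = handle_request(
    "Reading is a window on the world.",
    rating=4,
    reference="(reference translation supplied by the user)",
    corpus=parallel_corpus,
)
print(output, folder.exists(), len(parallel_corpus))
```

Keeping the corpus write separate from the temporary folder mirrors the two usage modes: plain translation leaves nothing behind, while corpus collection persists only what the user explicitly submits.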
To let teachers and students use the online English machine translation platform without time or location constraints, this paper uses the XOJO tool to develop a Web-based machine translation platform. The Peanut Shell software dynamically resolves the domain name and maps the port to the intranet host, so that the platform can be reached from the external network. XOJO is a cross-platform integrated development environment that includes an integrated debugger, a multi-platform compiler, and other components. Its User Interface (UI) Builder allows the UI to be built by drag-and-drop, so development can focus on application functionality without requiring complex front-end knowledge such as HTML. The XOJO programming language and framework can be used directly to develop the required applications quickly, without learning the various application programming interfaces of the operating system. XOJO programs are compiled directly into CPU-executable instructions with the LLVM compiler toolchain, which gives better performance. In terms of web application security, XOJO web applications are compiled to binary code, and the source code is not stored on the server.
The web application supports the latest versions of modern web browsers such as Chrome, Internet Explorer, and Safari. As long as the application keeps running on the server and Peanut Shell provides intranet penetration, any online user can access the Web application. The application receives client requests from the web server, processes them, and returns the response to the web server, which finally returns it to the client. Users can reach the platform anytime and anywhere through a browser on cell phones, tablets, computers, and other smart terminals, without installing any plug-ins. The web interface is user-friendly, and its functions are simple and clear.
Linear regression modeling is a classical mathematical method in the field of forecasting and early warning, and an important method to analyze the direction and degree of influence of the independent variable on the dependent variable.
The basic principle of the linear regression model [23-24] is as follows. Through the processing and analysis of data, the linear relationship between the dependent variable and the independent variables is expressed as a mathematical formula, and the regression coefficients of the regression equation are determined by least squares estimation, which minimizes the sum of the squared distances between the actual values and the regression equation, i.e., gives the best fit. The sign and size of a regression coefficient reflect the direction and size of the change in the dependent variable caused by a one-unit change in the independent variable. When least squares is used to estimate the parameters, the linear regression model must satisfy the following assumptions: the random disturbance term is uncorrelated with the explanatory variables, the random disturbance term follows a normal distribution, and the variance is constant. The general form of the regression model is as follows:
After the parametric least squares estimates are derived, tests are needed to evaluate the predictive effect of the regression model.
Goodness-of-fit test
The evaluation of the degree of fit is essentially a measure of the effect of multiple linear regression. In the regression process, the coefficient of determination
Significance test of regression coefficients
Significance test of regression coefficients is to test the hypothesis of overall regression coefficients based on the results of sample estimation, and its purpose is to test the significance of the effect of the independent variables corresponding to each regression coefficient on the dependent variable. If the effect of an independent variable is not significant, this independent variable should be removed to maximize the goodness of fit. The test of significance of the regression coefficients was performed using the
Significance test of regression equation
The test result of the significance of the regression equation is a measure to evaluate the closeness of the linear relationship between the dependent variable and each independent variable. Generally, the
From the distribution
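The least squares fit and the three tests discussed in this section can be sketched for the simplest case of a single independent variable. The data points and function name below are hypothetical and serve only to show how the coefficient of determination, the t statistic of the coefficient, and the F statistic of the equation are computed; a multiple regression would use the matrix form of the same quantities.

```python
import math


def simple_ols(x, y):
    """Least squares fit of y = intercept + slope * x with basic test statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    fitted = [intercept + slope * xi for xi in x]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)               # total sum of squares

    df_res = n - 2                                # residual degrees of freedom
    r2 = 1.0 - ss_res / ss_tot                    # goodness of fit
    se_slope = math.sqrt(ss_res / df_res / sxx)   # standard error of the slope
    t_stat = slope / se_slope                     # coefficient significance
    f_stat = (ss_tot - ss_res) / (ss_res / df_res)  # equation significance
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": r2,
        "t": t_stat,
        "f": f_stat,
    }


# Hypothetical illustrative data, not taken from the study.
fit = simple_ols([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print({name: round(value, 3) for name, value in fit.items()})
```

With one independent variable the F statistic equals the square of the t statistic, which is a convenient sanity check. Turning the statistics into p-values requires the t and F distributions, which a statistics library would normally supply.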
Hardware environment: the operating system is Ubuntu 18.04 and the GPU is a Tesla V100. The PaddlePaddle open-source framework is used to build the Tibetan-Chinese neural machine translation model.
Software environment: the development language is Python 3.7. The Transformer baseline parameters are set as follows: the maximum sentence length is 100 words, the word vector dimension is 128, the training batch_size is 64, the number of Transformer layers is 4, the dropout rate is 0.2, and the filter size is 2048. The Adam optimization algorithm is used in training, the initial value of the learning rate is set to 1.0, and the learning rate decay strategy described by Vaswani et al. is used. Beam search is used in the decoding phase, with the beam width set to 6. The number of iteration steps is set to 5000, and parameters are not shared during model training.
In this section, the optimized corpus is used as the experimental object and is divided into a training set and a test set in a ratio of 7:3. The trends of the Loss value and the Bleu value of the neural machine translation model are recorded over Epoch=100 training rounds. The curves of the loss function and the Bleu score against the number of training rounds are shown in Figure 5.
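The learning rate decay strategy of Vaswani et al. can be sketched as below. The word vector dimension of 128 and the value 1.0 come from the settings above; treating 1.0 as a multiplicative scale on the schedule and using 4000 warm-up steps (the value in Vaswani et al.) are assumptions, since the warm-up length is not stated here. The function name is hypothetical.

```python
def noam_learning_rate(step, d_model=128, warmup_steps=4000, scale=1.0):
    """Learning rate schedule from Vaswani et al.

    The rate grows linearly for warmup_steps steps and then decays in
    proportion to the inverse square root of the step number.
    warmup_steps=4000 and the role of scale are assumptions.
    """
    step = max(step, 1)  # avoid division by zero at step 0
    return scale * d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


for step in (1, 1000, 4000, 5000):
    print(step, f"{noam_learning_rate(step):.6f}")
```

Under these assumptions the rate peaks at the end of warm-up, at roughly 0.0014 for a model dimension of 128, and is only slightly lower by step 5000.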

Loss value and Bleu value curves
As the figure shows, as the number of training rounds increases, the Loss value of the neural machine translation model falls sharply and the Bleu value rises rapidly on both the training set and the test set. The convergence speed and the convergence values are basically the same on the two sets: the Loss value converges to about 1.59 by round 20, while the Bleu value needs about 30 rounds to converge to about 35.4.
To test the practical effect of neural machine translation on the improvement of English reading comprehension in a multicultural environment, this study selected two classes of sophomore students from one major at School A for a thematic teaching experiment on English reading comprehension, with 50 students in each class. The experimental group (T) used the online English machine translation platform constructed in this paper, and the control group (CK) followed the traditional teaching mode. The teaching experiment lasted for 12 weeks, and the evaluation index of the teaching effect was the score on an English reading comprehension level test. The English reading comprehension levels of the experimental group and the control group were tested before and after the teaching experiment.
Table 1 shows the descriptive statistical results of the English reading comprehension level test scores of the experimental group and the control group before and after the teaching experiment. Figures 6 and 7 show the distribution of reading comprehension scores of the two classes before and after the experiment, respectively.
Descriptive statistics of the English reading comprehension level test scores
Statistic | Pre-test, CK | Pre-test, T | Post-test, CK | Post-test, T |
---|---|---|---|---|
Mean reading score | 56.82 | 56.6 | 62.38 | 70.6 |
Standard error | 4.137 | 3.648 | 5.267 | 4.563 |
P (CK vs. T) | 0.951 | | 0.000 | |

Distribution of students’ reading scores before the experiment

Distribution of students’ reading scores after the experiment
Combined with the data in the table and Figure 6, it can be seen that the distributions of the reading comprehension scores of the two groups are approximately the same before the experiment. The scores of the experimental group are more dispersed, with more students at the extremes, while the scores of the control group are relatively concentrated. The mean scores of the two groups are nevertheless very close, at 56.6 and 56.82 respectively. The p-value between the two classes is 0.951 > 0.05, which indicates that there is no statistically significant difference between the pre-test scores of the two groups, so the teaching experiment can be carried out.
After the teaching experiment, the students’ post-test scores were analyzed. According to Figure 7, the scores of the experimental group are generally higher than those of the control group after the experiment. The mean post-test scores of the experimental group and the control group are 70.6 and 62.38 respectively, a difference of 8.22 points. The p-value in the table is 0.000 < 0.05, which means that the post-test scores of the two groups differ significantly. The mean score of the experimental group also rose from 56.6 before the experiment to 70.6 after it, compared with a rise from 56.82 to 62.38 in the control group. This shows that the English reading comprehension teaching model based on the platform of this paper effectively improves students’ English reading comprehension level.
To understand how the online English machine translation platform constructed in this paper was used in the English reading comprehension classroom and what effect it had, this section reports a questionnaire survey of the students who used the platform in the teaching experiment, i.e., the students in the experimental group above. A total of 50 questionnaires were distributed and 50 were returned. The survey covered four dimensions: translation accuracy and fluency, semantic similarity, translation quality, and discourse coherence, with five items for each dimension. Each item was rated on a 5-point scale, where ratings 1-5 represent very dissatisfied, dissatisfied, average, satisfied, and very satisfied in that order.
The validity of the questionnaire was tested and analyzed in four aspects: content validity, structural validity, convergent validity, and discriminant validity, and a reliability
The results of the questionnaire survey on the classroom effect of the online English machine translation platform are shown in Figure 8. The data in the figure show that the students in the experimental group are highly satisfied with the online English machine translation platform constructed in this paper: most items score above 4 points, and the overall satisfaction across the 20 indicators is 4.266. Specifically, the students’ satisfaction ratings for accuracy and fluency, semantic similarity, translation quality, and discourse coherence are 4.252, 4.232, 4.316, and 4.264 in that order. Satisfaction is highest for translation quality, which indicates that the platform can provide students with an English teaching service that combines easy comprehension, high accuracy, and relevant translation, thus improving students’ English reading comprehension skills.

Questionnaire survey results
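The overall satisfaction figure can be checked against the four dimension ratings reported above. Because every dimension has the same number of items (five), the mean over all 20 items equals the plain average of the four dimension means. The variable names in this short Python check are hypothetical.

```python
# Dimension means as reported in the survey results.
dimension_means = {
    "accuracy and fluency": 4.252,
    "semantic similarity": 4.232,
    "translation quality": 4.316,
    "discourse coherence": 4.264,
}

# Equal item counts per dimension, so an unweighted average is appropriate.
overall = sum(dimension_means.values()) / len(dimension_means)
best_dimension = max(dimension_means, key=dimension_means.get)

print(round(overall, 3))   # 4.266
print(best_dimension)      # translation quality
```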
In this section, a regression model is designed to explore the effect of this paper’s English online machine translation platform on students’ reading comprehension ability. Students’ gender, age, specialty, teacher level, and teaching mode are taken as control variables, and this paper’s platform is taken as the independent variable, and reading comprehension ability is taken as the dependent variable. Table 2 shows the regression prediction results of the effect of English online machine translation platform on students’ reading comprehension ability.
Variable | Correlation coefficient | Relative error | Significance |
---|---|---|---|
Gender | 0.016 | 0.002 | 0.075 |
Age | 0.029 | 0.007 | 0.081 |
Majors | 0.005 | 0.001 | 0.143 |
Teacher level | 0.007 | 0.001 | 0.107 |
Teaching model | 0.078 | 0.003 | 0.095 |
Machine translation | 0.199*** | 0.024 | 0.000 |
R2 | 0.261 | | |
F | 17.596 | | |
Dependent variable: reading comprehension. *** indicates significance at the 0.001 level.
The data in the table show that the correlation coefficients between students’ English reading comprehension ability and the control variables (gender, age, specialty, teacher level, and teaching mode) are all greater than 0, with significance values greater than 0.05. The control variables therefore have a positive but not statistically significant influence on English reading comprehension ability. After fixing the control variables and adding the platform to the regression equation as the independent variable, the coefficient between the platform and students’ English reading comprehension ability is 0.199, with a significance of 0.000 < 0.001. The platform thus promotes students’ English reading comprehension ability at the 0.001 significance level: for each one-unit increase in platform use, reading comprehension ability rises by 0.199. The R-squared of 0.261 means that the regression model explains 26.1% of the variation in reading comprehension ability.
This paper carries out a study on the effect of neural machine translation model on the improvement of English reading comprehension ability. The online English machine translation platform constructed in this paper was applied in School A. The changes in students’ reading comprehension ability level before and after the application of the platform were compared to evaluate the actual effect of the platform. In addition, combined with the regression model, the changes in students’ English reading comprehension ability are predicted with the platform of this paper as the independent variable.
The Loss value and the Bleu value of the neural machine translation model proposed in this paper converge to about 1.55 and 35.4 respectively after 20 to 30 training rounds. After the teaching experiment, the mean post-test scores of the experimental group and the control group were 70.6 and 62.38 respectively, with a p-value of 0.000, which shows a significant difference between them. The overall satisfaction of students with the platform is 4.266, which corresponds to “satisfied”. The platform can improve students’ English reading comprehension at the 0.001 significance level.