A Study on the Influence Mechanism of English Corpus on Translation Quality in Multilingual Website Translation
Published online: 21 Mar 2025
Received: 31 Oct 2024
Accepted: 06 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0612
© 2025 Ying Pu et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Translation quality assessment is the process of judging translation products against defined standards. It enables translation activities to comply with established norms and to meet specific needs [1-2]. In translation research, and especially in professional translation, evaluation has never fully shed the baggage of subjectivity: a translation acceptable to one evaluator in one context may be judged poor and unacceptable in another context or by another evaluator [3-4]. Within such a subjective evaluation system, both the client as evaluator and the translator as service provider are left confused. To achieve good translation results, translators must find a workable path between language comparison and quality assessment. Researching and establishing a set of objective criteria, on the basis of which rigorous and considered analysis can be carried out, is the inevitable choice for breaking the vicious circle of subjectivity in translation quality assessment [5-8].
The Translational English Corpus (TEC) covers translations from almost all languages in common international use, and its holdings are drawn mainly from publicly available published translations into English [9]. Notably, the translators of the TEC texts are by default native speakers of English, and the vast majority of the texts were translated after 1983, making them typical of contemporary English translations [10-11]. The text types are biography, fiction, newspapers, and magazines, of which fiction accounts for roughly 80%, magazines for 15%, and biography and newspapers for the remaining 5%. To date, TEC stores roughly 10 million word tokens, and since the number of translations worldwide keeps increasing, new translated texts can be added once copyright is cleared and the texts are scanned, edited, and annotated; the TEC vocabulary has therefore grown continuously [12-14]. To support in-depth study of the characteristics of translated text, TEC carries out two forms of annotation: textual and metadata. TEC does not annotate in detail inside the translated text itself, a precondition for preserving the completeness of the text. Parts that need no translation, such as introductions and summaries, are specially handled in the database index: they are intentionally ignored and hidden, and therefore cannot be found in the index [15-17]. To serve research needs, TEC metadata records the extralinguistic features of each translated text in detail, including basic information such as the translator's name, gender, nationality, and occupation, as well as the source language of the translation, the publisher, the word count, and specifics of the original author. All of these annotations are kept as independent supplementary information in XML markup, following the widely used TEI scheme. These annotations support comparison of content/function word ratios, type/token ratios, sentence lengths, and word collocation patterns, as well as word frequencies in translations and differences between translations from different source languages, from which the characteristics of translated text can be derived by inductive analysis [18-21]. Currently, the TEC client browser supports syntactic queries, word-frequency queries, sorting of concordance-line collocations, and saving of retrieval results [22].
The English translation corpora described above have been widely used to assess learners' translation quality in translation teaching. In the translation industry, the use of corpora for quality assessment is also growing. Guiding such professional assessment effectively and obtaining better results requires researchers to grasp several key factors in the quality assessment model [23-24].
An English corpus is an important tool for English translation, and researchers have analyzed and synthesized the state of corpus use in translation through literature reviews and other methods, offering suggestions for improvement. De Sutter, G. et al. analyzed the state of corpus-based translation research and argued that, under a revised research agenda for corpus-based translation studies, a multi-dimensional, multi-method approach should be taken to explore the factors affecting corpus-based translation, such as socio-cultural context, technology, and cognition, which they discussed in detail with practical cases [25]. Wu, K. et al. synthesized the trajectory of corpus-based translation research, noting that the field is trending toward diversified methods and interdisciplinary thinking, and that translation quality checking and computer-assisted translation are its future directions [26]. In translation, corpora mainly serve to assist translation and to detect and improve translation quality. Giampieri, P. showed empirically that corpus tools can effectively help students make sound translation decisions and develop translation skills such as vocabulary collocation and fixed expressions, while cautioning that, in the course of corpus use, students should not be distracted by Internet data [27]. De Clercq, O. et al. explored the quality of English-French machine translation with corpus-based methods, showing that machine-translated texts contain linguistic features that deviate significantly from the norms observed in original French, and argued that recording these features and using them for machine translation optimization would further improve translation quality [28]. Carl, M. et al. analyzed the effect of cross-linguistic syntactic and semantic distance on translation production time using a corpus tool for multilingual alternative translations, noting that non-literal translation is very difficult both for translation from scratch and for post-editing [29]. Imankulova, A. et al. proposed a quality estimation strategy built around sentence-level round-trip translation, combined with a filtered pseudo-parallel corpus for data-augmented training, which effectively improved BLEU scores; their experiments also corroborated the positive effect of iterative bootstrapping on translation quality [30].
In this study, the influence mechanism of an English corpus on translation quality in multilingual website translation is explored from two angles: the construction of a neural machine translation (NMT) model and an experiment on English corpus-assisted translation. By clustering the training corpus with the K-Means algorithm in a clustering layer and building a memory module on top of it, a neural machine translation model incorporating translation memory is proposed, and its BLEU-score improvements are analyzed in six translation directions: English-German, English-Vietnamese, English-Russian, German-English, Vietnamese-English, and Russian-English. A controlled experiment with an English parallel corpus as the control variable is then designed to verify the corpus's effect on translation quality. On this basis, a translation quality enhancement mechanism combining the NMT model and the English corpus is proposed.
In order to make full use of corpus resources and realize high-quality translation of multilingual websites, this paper proposes a neural machine translation model that incorporates translation memories.
The role of machine translation is to convert a source language into a target language. Language modeling is foundational to natural language processing and plays an integral role in machine translation. Neural language models, such as feed-forward networks and the Transformer, likewise treat the conditional probabilities of the language model as the main path. A language model is a mathematical model that describes the regularities of natural language in a form suited to automatic processing by computers, and its final output is a probability. Specifically, a language model determines the probability that a given sentence occurs in the language: common sentences should receive high probability, while ill-formed sentences should receive probability tending to zero, and a fluent translation is obtained by comparing whether one symbol sequence is more likely to occur than another.
In general, a language model decomposes the probability of a word sequence $w_1, w_2, \ldots, w_T$ into a product of conditional probabilities by the chain rule: $$P(w_1, w_2, \ldots, w_T) = \prod_{t=1}^{T} P(w_t \mid w_1, \ldots, w_{t-1})$$
The N-gram model was proposed to approximate these conditional probabilities: with the help of the Markov assumption, each word's occurrence depends only on a finite number of preceding words, and the model is trained by maximum likelihood estimation [31]. In the unigram model, the conditional probability of the current word ignores context entirely. In the bigram model, the conditional probability of a word considers only the single word preceding it. As N increases, more context is captured, but the complexity of the language model grows, so in practice N is generally set to no more than 3.
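To make this concrete, the following minimal Python sketch estimates a bigram (N = 2) model by maximum likelihood on a tiny illustrative corpus; the corpus and the start/end markers are assumptions for illustration only, not data from this study.

```python
# Minimal bigram language model estimated by maximum likelihood.
from collections import Counter

corpus = [
    ["<s>", "the", "cat", "sat", "</s>"],
    ["<s>", "the", "dog", "sat", "</s>"],
    ["<s>", "the", "cat", "ran", "</s>"],
]

unigrams = Counter(w for sent in corpus for w in sent)
bigrams = Counter(
    (sent[i], sent[i + 1]) for sent in corpus for i in range(len(sent) - 1)
)

def p_bigram(prev, word):
    """MLE estimate: P(word | prev) = count(prev, word) / count(prev)."""
    return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

def p_sentence(sent):
    """Chain-rule probability under the Markov (bigram) assumption."""
    p = 1.0
    for i in range(1, len(sent)):
        p *= p_bigram(sent[i - 1], sent[i])
    return p

print(p_sentence(["<s>", "the", "cat", "sat", "</s>"]))  # common sentence: ~0.33
print(p_sentence(["<s>", "sat", "the", "cat", "</s>"]))  # implausible order: 0.0
```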
To realize machine translation, word embedding techniques are generally used to convert natural language text into a form that machines can recognize and process. Word embedding represents natural language text as discrete or distributed vectors. The discrete representation, also known as one-hot encoding, represents each word in the vocabulary as a vector whose dimension equals the vocabulary size, with a 1 at the word's own index and 0 elsewhere.
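The following minimal sketch contrasts the two representations; the vocabulary and the embedding dimension are illustrative assumptions, with the embedding matrix initialized randomly in place of trained parameters.

```python
# One-hot (discrete) versus distributed word representations.
import numpy as np

vocab = ["the", "cat", "dog", "sat", "ran"]
word2id = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """One-hot vector: dimension = |V|, a single 1 at the word's index."""
    v = np.zeros(len(vocab))
    v[word2id[word]] = 1.0
    return v

# Distributed representation: a |V| x d embedding matrix, normally learned;
# here initialized randomly for illustration (d = 4).
rng = np.random.default_rng(0)
embedding = rng.normal(size=(len(vocab), 4))

def embed(word):
    """Low-dimensional dense vector: the word's row of the embedding matrix."""
    return embedding[word2id[word]]

print(one_hot("cat"))  # [0. 1. 0. 0. 0.]
print(embed("cat"))    # dense 4-dimensional vector
```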
Feed-forward neural network language models have far fewer parameters than N-gram models: each word is represented as a low-dimensional vector, modeling is done in a continuous space, and all N-grams need not be stored explicitly.
A “sequence” is a common data structure. In computer vision, an image can be viewed as a sequence of pixels; in natural language processing, a sentence can be regarded as a sequence of characters, words, or phrases. Building on the sequence-to-sequence (Seq2Seq) training paradigm, the encoder-decoder structure has become the mainstream framework for neural machine translation models. Its core idea is to convert the source text sequence into a semantic encoding with an encoder and then decode it with a decoder. In the encoder-decoder framework, the length of the input sequence may differ from that of the output sequence, which matches the needs of translation applications. The structure of the encoder-decoder framework is shown in Figure 1.

Training process of the encoder-decoder framework
As can be seen in Figure 1, the encoder-decoder framework consists of two parts. During training, the input parallel sentence pairs are embedded into lists of word vectors, the model parameters are randomly initialized, the mapping between the source language sequence and the target language sequence is learned, and the parameters are updated until the loss is minimized. Assuming the source sequence is $x = (x_1, \ldots, x_n)$ and the target sequence is $y = (y_1, \ldots, y_m)$, training maximizes the conditional probability $P(y \mid x)$.
The inference process of the encoder-decoder framework is shown in Figure 2. The inputs to the decoder are the context vector, the hidden state of the previous moment, and the predicted output of the previous moment; that is, the gold target text fed in during training is replaced by the model's own prediction at each step, which becomes the input for the next prediction. The training method of feeding only the correctly labeled words as inputs for the next moment is known as teacher forcing, which effectively alleviates problems such as the weak prediction ability of early recurrent neural networks during training. In both training and inference, a special start flag “<sos>” serves as the initial state of the decoder and an end flag “<eos>” terminates generation.

Inference process of the encoder-decoder framework
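As an illustration of the two regimes just described, the following PyTorch sketch pairs a teacher-forced training pass with a greedy inference loop in a minimal GRU encoder-decoder; the vocabulary sizes, hidden dimension, special-token indices, and toy batch are all assumptions for illustration, not the configuration used in this paper.

```python
# Minimal GRU encoder-decoder with teacher forcing (training) and
# greedy decoding (inference). Hyperparameters are illustrative.
import torch
import torch.nn as nn

SRC_V, TGT_V, DIM, SOS, EOS = 100, 100, 32, 1, 2

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(SRC_V, DIM)
        self.tgt_emb = nn.Embedding(TGT_V, DIM)
        self.encoder = nn.GRU(DIM, DIM, batch_first=True)
        self.decoder = nn.GRU(DIM, DIM, batch_first=True)
        self.out = nn.Linear(DIM, TGT_V)

    def forward(self, src, tgt):
        """Teacher forcing: the gold previous word is fed at every step."""
        _, h = self.encoder(self.src_emb(src))      # final encoder state
        dec_out, _ = self.decoder(self.tgt_emb(tgt[:, :-1]), h)
        return self.out(dec_out)                    # predicts tgt[:, 1:]

    @torch.no_grad()
    def translate(self, src, max_len=20):
        """Inference: each step consumes the model's own previous prediction."""
        _, h = self.encoder(self.src_emb(src))
        word = torch.full((src.size(0), 1), SOS, dtype=torch.long)
        outputs = []
        for _ in range(max_len):
            dec_out, h = self.decoder(self.tgt_emb(word), h)
            word = self.out(dec_out).argmax(-1)     # greedy choice
            outputs.append(word)
            if (word == EOS).all():
                break
        return torch.cat(outputs, dim=1)

model = Seq2Seq()
src = torch.randint(3, SRC_V, (2, 5))               # toy batch: 2 source sentences
tgt = torch.randint(3, TGT_V, (2, 6))               # gold targets incl. markers
loss = nn.functional.cross_entropy(
    model(src, tgt).reshape(-1, TGT_V), tgt[:, 1:].reshape(-1)
)
print(loss.item(), model.translate(src).shape)
```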
The encoder-decoder model has been widely used across natural language and speech processing, including machine translation, text generation, and speech recognition. Encoders and decoders are commonly built from RNNs, CNNs, and similar architectures.
BLEU is currently the most widely used automatic evaluation metric for machine translation quality.
The accuracy of n-gram matching $p_n$ is the proportion of n-grams in the candidate translation $C$ that also appear in the reference, with each n-gram's matching count clipped by its count in the reference: $$p_n = \frac{\sum_{\text{n-gram} \in C} \mathrm{Count}_{\mathrm{clip}}(\text{n-gram})}{\sum_{\text{n-gram} \in C} \mathrm{Count}(\text{n-gram})}$$
The above evaluation method favors short sentences; to make machine translation generate sentences of appropriate length, BLEU introduces a brevity penalty factor $\mathrm{BP}$, where $c$ is the length of the candidate translation and $r$ the length of the reference: $$\mathrm{BP} = \begin{cases} 1, & c > r \\ e^{\,1 - r/c}, & c \le r \end{cases}$$ The final score combines the penalty with the geometric mean of the n-gram precisions: $$\mathrm{BLEU} = \mathrm{BP} \cdot \exp\Big(\sum_{n=1}^{N} w_n \log p_n\Big)$$
Many studies have confirmed BLEU's effectiveness in differentiating translation quality, and its results are relatively stable. The correlation between BLEU and manual evaluation is high at the system level, but may be poor at the sentence level.
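The following minimal Python sketch computes sentence-level BLEU with clipped n-gram precision and the brevity penalty, following the standard formulation above; the smoothing of zero counts is an assumption added for the toy example, not part of the original metric.

```python
# Sentence-level BLEU: clipped n-gram precision plus brevity penalty.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        clipped = sum(min(c, ref[g]) for g, c in cand.items())  # clipped matches
        total = max(sum(cand.values()), 1)
        precisions.append(max(clipped, 1e-9) / total)           # smooth zeros
    # Brevity penalty: BP = 1 if c > r, else exp(1 - r/c).
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / max(c, 1))
    # Uniform weights w_n = 1/N give the geometric mean of the precisions.
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

cand = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(round(bleu(cand, ref), 4))
```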
To enable the model to exploit the corpus more fully, this paper introduces external knowledge through translation memories and uses it to guide the training of the translation model, so as to improve the quality of the model's output. The architecture of the neural machine translation model incorporating translation memory is illustrated in Figure 3. The encoding layer uses a bidirectional encoder to encode the input source utterance $x$ into a sequence of hidden states.

Structure of neural machine translation model integrating translation memory
For the current input source-target utterance pair $\langle x, y \rangle$, the bidirectional encoder produces a forward hidden state $\overrightarrow{h}_i$ and a backward hidden state $\overleftarrow{h}_i$ for each source word. The two are spliced to obtain the expression of each word, as shown in Equation (7): $$h_i = [\overrightarrow{h}_i; \overleftarrow{h}_i] \tag{7}$$ Meanwhile, the spliced forward and backward expressions at the final moments are used as the final expression of the sentence, as shown in Equation (8): $$s_x = [\overrightarrow{h}_n; \overleftarrow{h}_1] \tag{8}$$
Similarly, the same operations are performed on the target utterance $y$, yielding an expression for each target word and a final sentence expression $s_y$.
The clustering layer divides the source-target utterance pairs in the current corpus into $K$ semantic clusters using K-Means, and the cluster centres of the source-side and target-side sentence expressions constitute the memory module. For the current input source utterance expression $s_x$, its correlation with each source semantic cluster centre is computed as an attention weight, and the weighted sum over all clusters yields the memory module embedding expression $m_x$ for the current source semantics. Similarly, for the current input target utterance expression $s_y$, the correlation with each target semantic cluster centre is computed, and the weighted sum over all clusters serves as the memory module embedding expression $m_y$ for the current input target semantics. In particular, the memory module embedding expressions $m_x$ and $m_y$ constitute the external translation-memory knowledge that is passed to the decoding layer.
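A minimal numpy sketch of this retrieval step is given below, assuming dot-product similarity and random stand-ins for the cluster centres and the utterance expression; the model's actual similarity function and dimensions may differ.

```python
# Memory retrieval: the sentence expression queries the cluster centres,
# and the softmax-weighted sum of centres is the memory module embedding.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
K, d = 8, 16
centroids = rng.normal(size=(K, d))  # K semantic cluster centres from K-Means
s_x = rng.normal(size=d)             # expression of the current source utterance

scores = centroids @ s_x             # correlation of the utterance with each cluster
weights = softmax(scores)            # normalized attention weights
m_x = weights @ centroids            # memory module embedding (weighted sum)
print(weights.round(3), m_x.shape)
```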
The self-encoding layer consists mainly of the target utterance self-encoder, which is used to update all target utterance expressions in the corpus and thereby refresh the target semantic cluster expressions in the memory module (Eqs. (15) and (16)).
The loss of the memory self-encoding layer is given in Equation (17).
The decoding layer uses a unidirectional decoder, whose initial hidden state is derived from the encoding of the source utterance.
The decoding process uses an attention mechanism to capture contextual attention information from the source utterance. Specifically, suppose the decoder obtains the hidden state $d_t$ at moment $t$; attention weights over the source word expressions $h_i$ are computed with $d_t$ as the query, and their weighted sum gives the source context vector for the current moment.
For the decoding process that generates the translated utterance, a different target cluster attention is computed at each moment, since semantic shifts may occur from moment to moment and the target cluster semantics of interest are not always the same. Specifically, taking the decoder hidden state $d_t$ at moment $t$ as the query, attention weights over the target semantic cluster centres are recomputed, and their weighted sum gives the target cluster information for the current moment.
Note that the target cluster attention is computed once in the clustering layer to obtain the relevant target semantic cluster information, and again at every moment of the decoding process, forming a two-level attention mechanism. In the clustering layer, the sentence expression is used as the query vector, producing attention information at the level of overall semantics, which guides the whole translation process at the sentence level; in decoding, the current hidden state is used as the query vector, producing attention information conditioned on the word currently being decoded, which guides the current translation moment at the word level. The overall attention information ensures semantic accuracy and fluency of the translated utterance as a whole, while the attention information at individual decoding moments allows better alignment of the translation at the word level.
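The following numpy sketch illustrates the two-level mechanism under the assumption of dot-product attention, with random vectors standing in for the sentence expression, the decoder hidden states, and the target cluster centres.

```python
# Two-level attention over target semantic clusters: one sentence-level
# query, then a word-level re-query at every decoding moment.
import numpy as np

def attend(query, keys):
    """Dot-product attention: softmax-weighted sum of the key vectors."""
    scores = keys @ query
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ keys

rng = np.random.default_rng(1)
K, d, steps = 8, 16, 5
tgt_clusters = rng.normal(size=(K, d))  # target semantic cluster centres (mock)
s_y = rng.normal(size=d)                # sentence expression (mock)

# Level 1: computed once in the clustering layer, guiding the sentence level.
sentence_ctx = attend(s_y, tgt_clusters)

# Level 2: recomputed at each decoding moment t from the hidden state d_t,
# so the attended cluster semantics can shift as decoding proceeds.
for t in range(steps):
    d_t = rng.normal(size=d)            # decoder hidden state at moment t (mock)
    word_ctx = attend(d_t, tgt_clusters)

print(sentence_ctx.shape, word_ctx.shape)
```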
After fusing the three parts of information (the source context vector, the memory module embedding, and the target cluster information), the decoder generates the probability distribution of the current target word.
The loss function of the decoding layer is the negative log-likelihood of the target words, as shown in Equation (26): $$\mathcal{L}_{dec} = -\sum_{t=1}^{m} \log P(y_t \mid y_{<t}, x) \tag{26}$$
The total model loss consists of two parts, the decoding layer loss and the self-encoding layer loss, as shown in Equation (27): $$\mathcal{L} = \mathcal{L}_{dec} + \mathcal{L}_{ae} \tag{27}$$
In the training phase, within the current epoch, the encoding results of all source-target utterance pairs are first obtained and K-Means clustering is used to obtain the memory module, after which training is performed. At the end of the current epoch, all source-target utterance pair expressions are re-acquired, the memory module is updated, and the next round of training is performed.
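A minimal sketch of this per-epoch loop is shown below using scikit-learn's KMeans, with a mock encoder standing in for re-encoding all source-target pairs with the current model; the corpus size, dimensions, and cluster count are assumptions.

```python
# Per-epoch memory-module refresh: re-encode the corpus, re-cluster,
# then train with the refreshed cluster centres as the memory module.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
N, d, K = 1000, 16, 8

def encode_corpus():
    """Stand-in for re-encoding all utterance pairs with the current model."""
    return rng.normal(size=(N, d))

for epoch in range(3):
    reprs = encode_corpus()                       # re-encode at epoch start
    km = KMeans(n_clusters=K, n_init=10).fit(reprs)
    memory = km.cluster_centers_                  # memory module: cluster centres
    # ... one epoch of NMT training using `memory` for retrieval ...
    print(f"epoch {epoch}: memory module shape {memory.shape}")
```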
For testing, only the source utterances are used as input, and the similarity of the source utterance to the source semantic clusters replaces the similarity of the target utterance to the target semantic clusters.
To verify the superiority of the constructed neural machine translation model incorporating translation memory, it is compared with other models on several neural machine translation (NMT) tasks, and the experimental results of the different NMT systems are summarized in Table 1. Table 1 reports BLEU scores in six translation directions, where “Ours” denotes the NMT system built with the translation-memory fusion method proposed in this paper.
BLEU scores of different NMT systems on NMT tasks

| Compare | System | English-German | German-English | English-Vietnamese | Vietnamese-English | English-Russian | Russian-English |
|---|---|---|---|---|---|---|---|
| 1 | RNN | 21.58 | 26.91 | 30.03 | 29.33 | 15.52 | 17.83 |
| 1 | Ours | 24.49 | 28.32 | 30.85 | 30.54 | 15.99 | 19.48 |
| 2 | Transformer | 24.81 | 31.14 | 29.27 | 27.61 | 14.38 | 19.44 |
| 2 | Ours | 24.88 | 32.89 | 30.04 | 29.15 | 15.21 | 20.14 |
| 3 | mRASP | 30.82 | 37.61 | 35.41 | 38.19 | 19.04 | 25.21 |
| 3 | Ours | 31.85 | 38.02 | 36.47 | 38.78 | 20.15 | 25.44 |
| 4 | mBART | 29.81 | 37.74 | 35.13 | 37.64 | 19.18 | 25.53 |
| 4 | Ours | 30.92 | 38.07 | 35.44 | 38.53 | 19.82 | 25.99 |
In all six translation directions, compared with the baseline models, the proposed method of fusing translation memories effectively improves BLEU scores. The effect is most pronounced for the RNN model: relative to the RNN baseline, this paper's model gains 2.91 BLEU in the English-German direction and 1.65 BLEU in the Russian-English direction. For the Transformer model, the NMT system with this paper's method gains 1.75 BLEU and 1.54 BLEU in the German-English and Vietnamese-English directions, respectively. For the multilingual pretrained models, this paper's method raises the mRASP model's BLEU score in the English-German direction from 30.82 to 31.85, its largest gain, and likewise raises the mBART model's score in the same direction from 29.81 to 30.92. In summary, the proposed modeling method incorporating translation memory achieves better performance across multiple translation directions of multilingual neural machine translation.
To better understand the proposed NMT model construction method fusing translation memories, this paper further compares the BLEU scores of the mRASP baseline model and the improved model on the six translation tasks. The BLEU score curves on the validation set for the different translation directions are shown in Figure 4, where (a)~(f) correspond to English-German, English-Vietnamese, English-Russian, German-English, Vietnamese-English, and Russian-English, respectively.

BLEU score curves on the validation set for different translation directions
As can be seen from Figure 4, the NMT model with fused translation memory converges faster than the baseline model, indicating better behavior during training. One explanation for its stronger momentum in the later stage of training (i.e., after all training samples have been included) may be that, when retrieving relevant information through the memory module, it fuses both global and local attention information, guiding the translation process from both the sentence-level and word-level perspectives and thus producing higher-quality translated utterances.
Meanwhile, when the source language is English, the mRASP model with the fused translation-memory approach outperforms the baseline from the outset. Several factors can explain this observation. First, the mRASP model was pre-trained on a large-scale multilingual corpus and has learned information shared across languages; this prior knowledge, which bears on the comprehension and generation of English sentences, gives the model an initial advantage. Second, English sentences tend to have relatively regular syntax and structure, allowing the model to quickly capture and learn the language's patterns of expression. Finally, the mRASP model is exposed to a large number of English sentences in the early stages of training, which helps it learn English language features and expressions more quickly.
To investigate the mechanism by which an English corpus influences translation quality in multilingual website translation, this paper selected 80 third-year English majors from undergraduate colleges as experimental subjects and designed an experiment on English parallel corpus-assisted manual translation. The experimental group used a parallel corpus, while the control group used conventional reference resources such as dictionaries. To facilitate comparison between the two groups and control variables as far as possible, the experiment was a 90-minute time-limited translation, and apart from the difference in reference tools, the conditions of the two groups were identical in all other respects. Limited by the available English parallel corpus resources, only the German-English direction was chosen for this experiment.
Based on a preliminary analysis of the translations from the experimental data, three sets of statistics were collected to examine the two groups' translation efficiency from three perspectives: first, the number of participants in each group who completed all translation tasks, shown in Table 2; second, the total number of words in the translations completed by each group, shown in Table 3; and third, statistics on the two groups' completion of the translation of 25 terms in the original text, shown in Table 4.
Comparison of the number of people who completed all translation tasks
| Group | Total number of people | Number of task completers | Percentage/% |
|---|---|---|---|
| Experimental group | 40 | 3 | 7.5 |
| Control group | 40 | 21 | 52.5 |
Comparison of completed translations
| Project | Experimental group | Control group |
|---|---|---|
| Standard translation quantity | 7047 words | 7047 words |
| Total translations completed | 4358 words | 5706 words |
| Percentage of translations completed /% | 61.84 | 80.97 |
Terminology translation statistics
As can be seen from Table 2, in terms of submitting complete translations, the experimental group lagged behind the control group: only 3 of its members completed all the translation tasks within the stipulated 90 minutes. Meanwhile, Table 3 shows a significant difference between the experimental group and the control group in the total number of words translated (P<0.05).
Taken together, Tables 2~4 show that the experimental group not only had no advantage over the control group but lagged significantly behind it in translation efficiency. In theory, an English parallel corpus should greatly facilitate translation practice and improve translator efficiency. The reason the experimental group's actual efficiency fell short may be that the excessive search results returned by the English corpus increased the burden of selection and screening on the translators, especially for students with limited English proficiency and little translation experience. In post-experiment interviews, students in the experimental group emphasized that reading and browsing the large number of concordance lines in the parallel corpus was a very time-consuming part of the translation process: among the 25 sampled terms, the highest frequency of occurrence in the parallel corpus was 582, and the average frequency reached 61. By contrast, tools such as dictionaries return a single, unambiguous result, so the burden of selection is far smaller than with a parallel corpus. The NMT model incorporating translation memory constructed in this paper can address this problem, with machine translation compensating for the slower efficiency of human translation.
Given the current experimental conditions and for reasons of operability, this paper uses two parameters to evaluate translation quality: the accuracy of terminology translation and the completeness of the sentences in the translated text. In pragmatic translation, terminology translation is an important criterion for judging the quality of a translation. The two groups' data on terminology translation are shown in Table 5.
Comparison of term translation accuracy

| Project | Experimental group | Control group |
|---|---|---|
| Number of terms translated | 363 | 502 |
| Number of terms translated correctly | 307 | 325 |
| Accuracy of term translations /% | 84.57 | 64.74 |
The accuracy of term translation in the experimental group (84.57%) is higher than in the control group (64.74%), and a chi-square test further confirmed that the experimental group was significantly better than the control group in this respect (P<0.05).
Meanwhile, to compare the two groups' data, the total number of sentences in each group's completed translations and the number of well-formed sentences successfully constructed using techniques such as adding words were counted separately, as shown in Table 6. The experimental group has a significant advantage over the control group in constructing well-formed sentences by adding subjects (P<0.05).
Statistics on translated sentences constructed by adding words

| Project | Experimental group | Control group |
|---|---|---|
| Total number of translated sentences | 204 | 241 |
| Number of sentences adjusted by adding words | 124 | 113 |
In summary, although the experimental group was inferior to the control group in translation efficiency, it had clear advantages in the accuracy of terminology translation and in the construction of well-formed translated sentences. Judging from the experimental data, an English parallel corpus can improve translation quality to a certain extent, and by combining the NMT model that incorporates translation memory with the English corpus, high-quality and high-efficiency translation of multilingual websites can be achieved.
This study examines the relationship between the English corpus and translation quality in multilingual website translation using both a neural machine translation (NMT) model and an English corpus. The main findings are as follows:
The neural machine translation model construction method incorporating translation memory proposed in this paper effectively improves BLEU scores. Relative to the RNN model, it achieves a 2.91 BLEU improvement in the English-German direction and a 1.65 BLEU improvement in the Russian-English direction. For the Transformer model, it improves BLEU by 1.75 and 1.54 in the German-English and Vietnamese-English directions, respectively. It also raises the BLEU scores of the mRASP and mBART models in the English-German direction by 1.03 and 1.11, respectively. During training, the proposed model converges faster than the baseline models. The proposed method of fusing translation memories thus effectively improves the translation performance of NMT models across multiple translation directions.
In terms of translation efficiency, the experimental group using the English parallel corpus completed significantly fewer translations, both in number and in total words, than the control group (P<0.05). In terms of translation quality, the accuracy of term translation in the experimental group (84.57%) was higher than in the control group (64.74%), and the experimental group had a significant advantage in constructing well-formed sentences by adding subjects (P<0.05). An English parallel corpus can therefore effectively improve translation quality, and its deficiency in translation efficiency can be compensated for by machine translation.
