The Effects of Intelligent Semantic Analysis Techniques on Language Acquisition in the Improvement of English Intercultural Communication Skills
Published: 17 Mar 2025
Received: 17 Nov 2024
Accepted: 18 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0264
Keywords
© 2025 Xiangming Huang, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
With the deepening of globalization and the continued advancement of the “One Belt, One Road” initiative, China increasingly needs international talent with cross-cultural communication skills. Cultivating a new generation of talent with a global vision, a knowledge of international rules, and the ability to participate in international affairs and international competition has become an important mission of Chinese education [1-4]. At present, intercultural communicative competence is becoming ever more prominent in university English teaching. Cultivating students’ intercultural communicative competence has become an important task in university English teaching, and intelligent semantic analysis technology plays an important role in intercultural communicative teaching activities [5-8].
Semantic analysis technology converts the linguistic information in texts into machine-understandable forms so that computers can grasp the actual meaning of sentences and documents [9-10]. As a major branch of artificial intelligence, it uses natural language processing (NLP) to turn natural language text into meaningful, machine-readable information, helping computers understand the real meaning of the text [11-14]. Traditional text processing techniques can only match and manipulate text superficially and cannot understand its meaning; semantic analysis technology, by contrast, transforms textual information into a form computers can understand, enabling in-depth analysis of the text [15-18].
This study builds an AMR intelligent semantic analysis system based on the stack-LSTM algorithm. After constructing the system database, a similarity algorithm is used to identify semantic differences, and a label transformation algorithm and a graph-to-graph linear transformation algorithm are then proposed. To verify the role of the AMR intelligent semantic analysis system in enhancing intercultural communicative competence in English language acquisition, an experimental environment was built. First, the system is compared with other syntactic analysis tools on the accuracy of English and Chinese analysis; then an experiment on the accuracy of utterance labeling is run on the corpus; finally, the practical effect is analyzed.
English in cross-cultural context may have different meanings in different occasions, so we need to take into account the cross-cultural communication context when analyzing the semantics of English, and understand the semantics of these words and phrases in the context of different cultures.
Many English phrases differ greatly between their literal and actual meanings, which is the most typical problem to watch for in semantic analysis of English in the context of cross-cultural communication. Such phrases cannot simply be translated literally; the rules of language use in the cross-cultural context, that is, the special meanings the phrases take on in actual use, must be taken into account.
Sentences consist of various phrases and some auxiliary words. Understanding English semantics in a cross-cultural communication context, especially when facing certain English utterances, essentially comes down to grasping the core vocabulary and phrases in the sentence and inferring the meaning of the whole sentence from the actual meanings of those words and phrases.
Euphemisms are also a very common way of expression in English. Although many foreigners prefer to express themselves directly in terms of language communication, they sometimes use euphemisms to express themselves at certain times to avoid embarrassment or to be polite, taking into account some historical and religious factors.
Lexical semantic analysis has great advantages. The most prominent is that it no longer fixes each word as a single set of combined meanings, but breaks it down into a number of senses and places those senses in different contexts for learners to understand and use; this can break students’ rote learning habits and stimulate their exploration of and love for the language. The structure of the intelligent algorithm for English semantic analysis of utterance components is shown in Figure 1. As Figure 1 shows, the algorithm first decomposes the English words in a sentence into individual words, then retrieves each word’s lexical properties and corresponding senses, and finally analyzes each word’s meaning in the sentence in context.

Figure 1. Structure of the intelligent algorithm for semantic analysis
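The pipeline in Figure 1 (split the sentence into words, look up each word's parts of speech and senses, then disambiguate by context) can be sketched as follows. The tiny `LEXICON` and the single contextual rule are hypothetical stand-ins for the system database and its disambiguation logic, not the paper's actual implementation.

```python
# Sketch of the Figure 1 pipeline: tokenize, look up lexical properties and
# senses, then pick a sense by context. LEXICON is a hypothetical toy database.

LEXICON = {
    "book":  [("noun", "a written work"), ("verb", "to reserve")],
    "a":     [("article", "indefinite article")],
    "we":    [("pronoun", "first person plural")],
    "will":  [("auxiliary", "future marker")],
    "table": [("noun", "furniture"), ("verb", "to postpone")],
}

def analyze(sentence):
    words = sentence.lower().split()
    result = []
    for i, w in enumerate(words):
        senses = LEXICON.get(w, [("unknown", "unknown")])
        # Crude contextual rule: after an auxiliary ("will"), prefer a verb sense.
        if i > 0 and words[i - 1] == "will":
            verb_senses = [s for s in senses if s[0] == "verb"]
            senses = verb_senses or senses
        result.append((w, senses[0]))
    return result

print(analyze("We will book a table"))
```

Here "book" is resolved to its verb sense because it follows "will", while "table" falls back to its first listed (noun) sense.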
The system database is built from a collection of teachers’ materials and textbooks, with practitioners compiling the vocabulary that students need to learn according to their level of study and the syllabus. It mainly contains tables of basic vocabulary information and special vocabulary information. Basic vocabulary information: according to the syllabus, information such as the part of speech of each required word and its phonetic transcription in different environments is stored in the database. Special vocabulary information: in English, most nouns follow regular rules for forming plurals, verbs for their past tenses, past participles and present tenses, and adjectives for their comparatives and superlatives; however, some words deviate from these common rules and have special forms. To make the semantic analysis algorithm more precise, such words are stored separately in the database.
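A minimal sketch of the two tables described above, assuming an SQLite layout; the column names and sample rows are hypothetical illustrations, not the paper's actual schema.

```python
import sqlite3

# Two tables as described: basic syllabus vocabulary, plus a separate table
# for words with irregular (special) forms.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE basic_vocabulary (
    word     TEXT PRIMARY KEY,
    pos      TEXT NOT NULL,      -- part of speech (noun, verb, ...)
    phonetic TEXT,               -- pronunciation symbols
    level    TEXT                -- syllabus level the word belongs to
);
CREATE TABLE special_vocabulary (
    word        TEXT PRIMARY KEY,
    plural      TEXT,            -- irregular plural, if any
    past_tense  TEXT,            -- irregular past tense
    past_part   TEXT,            -- irregular past participle
    comparative TEXT,
    superlative TEXT
);
""")
conn.execute("INSERT INTO basic_vocabulary VALUES ('go','verb','/ɡoʊ/','CET-4')")
conn.execute("INSERT INTO special_vocabulary VALUES ('go',NULL,'went','gone',NULL,NULL)")
row = conn.execute(
    "SELECT b.word, b.pos, s.past_tense FROM basic_vocabulary b "
    "JOIN special_vocabulary s ON b.word = s.word").fetchone()
print(row)  # ('go', 'verb', 'went')
```

Keeping irregular forms in their own table, as the text suggests, lets regular words be inflected by rule while exceptions are resolved by lookup.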
Lexical analysis takes apart each word in a sentence individually and analyzes its part of speech, sense, and so on. Utterance analysis builds on lexical analysis: it first obtains each word’s part of speech and sense, and then, based on these data and the parts of speech of the surrounding words, determines the grammatical role the word plays in the utterance.
The key problem that needs to be solved in cross-language dependent syntactic analysis is the variability of the two languages themselves, the source language and the target language, both at the word level and at the sentence structure level. This variability is mainly due to the fact that we use different words to express the same meaning in different languages, and each language has its own grammar to constrain the order of occurrence of words [19]. The representation learning method for language independent features is to formulate the same feature rules among different languages to represent words from multiple languages in the same feature space. Linguistically independent feature representation learning is mainly used to compensate for the differences between languages and establish inter-language connections by inducing the generation of linguistically independent features, and then train the dependent syntactic analyzer through the vector representation of the source language in the feature space, so that the dependent syntactic analyzer can carry out the dependent syntactic analysis of the target language.
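One very reduced way to picture language-independent feature representation is to map words from both languages onto shared identifiers so that translation pairs land on the same point in the feature space. The bilingual dictionary below is a hypothetical toy; real systems induce such shared representations rather than enumerating them.

```python
# Toy sketch of a language-independent feature space: translation pairs from a
# (hypothetical) bilingual dictionary receive the same feature ID, so a parser
# trained on source-language vectors can consume target-language input.

SHARED_IDS = {}                                   # shared feature space
BILINGUAL = {"chat": "cat", "chien": "dog"}       # hypothetical French→English pairs

def shared_id(word):
    key = BILINGUAL.get(word, word)    # map to the pivot form if known
    if key not in SHARED_IDS:
        SHARED_IDS[key] = len(SHARED_IDS)
    return SHARED_IDS[key]

assert shared_id("cat") == shared_id("chat")      # same point in feature space
assert shared_id("dog") == shared_id("chien")
print(shared_id("cat"), shared_id("chat"), shared_id("dog"))
```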
Each word pair

Figure 2. Representation learning of word embeddings
The input layer of the model shown in Figure 2 is introduced first. From the corpus of the source language, a clause
After obtaining the
where
Similarly, where
The parameters included in equations from (1) to (4) are the hidden layer parameter
The stochastic gradient descent method is utilized to solve the above equation, and the values of the model parameters are continuously updated over several iterations. By training the above model we obtain the word embedding matrix
The vector space model is used to measure utterance similarity. Vector space modeling separates the smallest semantic units, such as words and phrases, in a text and uses their computed similarity as vector elements. The cosine measure is then applied to two English sentences to obtain their semantic similarity.
In the vectorized representation of English utterances, two utterances are first represented as equal length vectors, e.g., for utterances
Removing the same words in
Combine the two statements by removing the articles and exclamations from both statements, keeping the real word prototypes, and recording the same words to obtain the combined statement
where the decimal obtained from the calculation is the similarity value corresponding to the word. Prepositions and auxiliary verbs in the utterances, such as of and do, have no counterpart words for comparison. After determining the semantic vectors
(1) Generate pseudo-labeled data. First, the label transformation probability based on arc alignment is estimated from the parallel labeled data
(2) Two-step training. The training data for the first step is the pseudo-labeled data in the target specification obtained by the method above. So that the analyzer better fits the real data distribution, the training data for the second step (fine-tuning) is the manually labeled parallel corpus D
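The control flow of the two-step scheme can be sketched as below. `Parser` is a hypothetical stand-in that only records what it is trained on; the corpus sizes and learning rates are illustrative, not values from the paper.

```python
# Sketch of two-step training: first a large, noisy pseudo-labeled corpus in
# the target specification, then fine-tuning on a small manually labeled one.

class Parser:
    def __init__(self):
        self.updates = []          # record of (corpus size, learning rate)
    def fit(self, data, lr):
        self.updates.append((len(data), lr))

pseudo_labeled  = [("sent%d" % i, "auto-graph") for i in range(1000)]
manual_parallel = [("sent%d" % i, "gold-graph") for i in range(50)]

parser = Parser()
parser.fit(pseudo_labeled, lr=1e-3)     # step 1: pseudo-labels, larger step size
parser.fit(manual_parallel, lr=1e-4)    # step 2: fine-tune on real annotations
print(parser.updates)  # [(1000, 0.001), (50, 0.0001)]
```

Fine-tuning with a smaller learning rate is a common choice so that the second, trusted corpus adjusts rather than overwrites what was learned from the pseudo-labels.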
Unlike the label transformation algorithm, the graph-to-graph linear transformation algorithm directly learns a linear function that maps the analyzer trained on the source-specification data to the target specification. Since the Biaffine attention matrix is a core component of the Biaffine parser, it contains important information for predicting semantic dependency graphs. A natural approach is therefore to inherit the parser trained on the source specification and use it to help train the parser on the target specification.
Specifically, assume that
The final target analyzer is:
LSTM is widely used across natural language processing tasks; its recurrent-neural-network structure allows the model’s output at a given position to carry contextual information [20]. The network structure of the LSTM is similar to that of the RNN in that it is a repeating module connected recurrently; unlike the RNN, however, the LSTM has four internal neural network layers that interact in a very specific way. Its internal structure is shown in Figure 3.

Figure 3. LSTM internal structure
The LSTM has three structures called “gates”, each consisting of a sigmoid neural network layer and an element-by-element multiplication. A gate selectively lets information through: the element-by-element multiplication multiplies the input by a number in the interval [0, 1], where 0 means nothing passes through and 1 means everything passes through.
The key to the LSTM is the cell state
In the above equation,
Finally the network will determine the value of the output based on the cell state, the output of the previous time point and the input of the current time point, calculated as shown in Eqs. (17) and (18), where
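The gate mechanics above can be sketched as a single LSTM step in NumPy: each gate is a sigmoid layer followed by element-wise multiplication, the cell state blends the forgotten old state with new candidate content, and the output is gated from the cell state. The weights here are random, so this is a structural sketch, not a trained model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    # All four internal layers computed at once from [h_prev; x].
    z = W @ np.concatenate([h_prev, x]) + b
    n = h_prev.size
    f = sigmoid(z[0:n])          # forget gate
    i = sigmoid(z[n:2*n])        # input gate
    g = np.tanh(z[2*n:3*n])      # candidate cell content
    o = sigmoid(z[3*n:4*n])      # output gate
    c = f * c_prev + i * g       # new cell state: keep part of old, add new
    h = o * np.tanh(c)           # new hidden state / output
    return h, c

rng = np.random.default_rng(0)
n, d = 4, 3                      # hidden size, input size
W = rng.normal(size=(4 * n, n + d))
b = np.zeros(4 * n)
h, c = lstm_step(rng.normal(size=d), np.zeros(n), np.zeros(n), W, b)
print(h.shape, c.shape)  # (4,) (4,)
```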
The stack-LSTM adds a stack pointer to the standard LSTM. As in the traditional LSTM, new inputs are always added at the last position, but when computing the new memory content, the stack pointer determines from which position of the LSTM to provide the
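The stack-pointer idea can be sketched as follows: states are kept in a list, a pop only moves the pointer back (earlier states stay cached), and a push computes the next state from the state the pointer currently indicates. The arithmetic "update" is a toy stand-in for the real LSTM transition.

```python
# Minimal sketch of a stack-LSTM's pointer behaviour (toy state update).

class StackLSTM:
    def __init__(self, init_state=0):
        self.states = [init_state]
        self.ptr = 0                       # stack pointer

    def push(self, x):
        prev = self.states[self.ptr]       # state comes from the pointed-at cell
        self.states.append(prev + x)       # toy stand-in for the LSTM update
        self.ptr = len(self.states) - 1

    def pop(self):
        self.ptr -= 1                      # popped states remain cached

    def top(self):
        return self.states[self.ptr]

s = StackLSTM()
s.push(1); s.push(2)      # states: 0, 1, 3
s.pop()                   # pointer back to state 1
s.push(5)                 # computed from state 1, not from the popped state 3
print(s.top())            # 6
```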
Bidirectional LSTMs, in contrast to unidirectional LSTMs, encode not only the current position and what precedes it but also, through an LSTM running in the opposite direction, the current position and what follows it. More formally, we combine two vectors as the input representation of the word, a character-based word representation
In this paper, three stack-LSTMs are used to learn the representations of the three stacks
where
Our model uses the transfer state
In the above equation
Once the model has selected the transfer action CONFIRM, the model needs to determine the nodes in the AMR graph generated from the top word of the buffer, again by categorizing state
In the above equation N represents all possible candidate concept nodes for the top element of the buffer,
The softmax classification in Eq. (21) and Eq. (22) is applied on the state representation of the transfer system
Confidence in language research rests most fundamentally on annotation accuracy. In this study, the accuracy of dependency syntactic structures is evaluated by comparing the Mate Parser, AMR, and Malt Parser intelligent techniques; the results are shown in Table 1.
Table 1. Accuracy of the syntactic analysis tools
| Analyser | Language | Style | Dependent relation: min | Dependent relation: max | Dependent relation: average | Whole sentence |
|---|---|---|---|---|---|---|
| AMR | English | Literary | 15.88% | 100% | 87.56% | 41% |
| AMR | English | Non-literary | 37.28% | 100% | 86.91% | 34% |
| AMR | Chinese | Literary | 11.4% | 100% | 81.42% | 32% |
| AMR | Chinese | Non-literary | 32.55% | 100% | 84.51% | 23% |
| Mate Parser | English | Literary | 27.12% | 100% | 86.31% | 28% |
| Mate Parser | English | Non-literary | 13.21% | 100% | 85.33% | 21% |
| Mate Parser | Chinese | Literary | 10.2% | 100% | 77.51% | 16% |
| Mate Parser | Chinese | Non-literary | 45% | 100% | 80.55% | 7% |
| Malt Parser | English | Literary | 13.21% | 100% | 80.78% | 28% |
| Malt Parser | English | Non-literary | 40% | 100% | 81.21% | 20% |
| Malt Parser | Chinese | Literary | 11.4% | 100% | 76.57% | 17% |
| Malt Parser | Chinese | Non-literary | 41.51% | 100% | 78.32% | 6% |
The accuracy of the syntactic parsers falls well short of that of the lexical assignment tools, both for the annotation of dominant nodes and for dependencies: average parser accuracy ranges from 76% to 88%, while the highest fully correct rate for whole sentences is only 41%. The data also show that language and genre affect parsing accuracy. Syntactic annotation accuracy for English is significantly higher than for Chinese, both for local dominant nodes and dependencies and for whole-sentence accuracy, probably because Chinese lacks a rich morphological marking system, and part-of-speech assignment based on morphological analysis is one of the important cues for a syntactic analyzer.
The automatic syntactic analysis errors for English and Chinese were comparatively analyzed using the AMR semantic technology; the results are shown in Tables 2 and 3. As the tables show, the errors of the analyzers share certain commonalities: sentence-structure analysis produces more errors than phrase-structure analysis, and stack-LSTM makes the fewest errors, significantly fewer than Mate Parser and Malt Parser. Compared with English, the number of syntactic errors for Chinese is significantly higher, but the error types are broadly consistent with the English results.
Table 2. Error analysis of the English sentence set
| Syntax type | Grammatical type | AMR: lexical | AMR: dependency | AMR: subtotal | Mate Parser: lexical | Mate Parser: dependency | Mate Parser: subtotal | Malt Parser: lexical | Malt Parser: dependency | Malt Parser: subtotal |
|---|---|---|---|---|---|---|---|---|---|---|
| Phrase structure | Modification relation | 15 | 127 | 182 | 26 | 145 | 218 | 27 | 199 | 284 |
| | Functional relation | 5 | 35 | | 9 | 38 | | 2 | 56 | |
| Sentence structure | Component relation | 8 | 120 | 261 | 19 | 151 | 331 | 18 | 227 | 467 |
| | Clause relation | 9 | 124 | | 18 | 143 | | 13 | 209 | |
| Other | Undefined dependencies | 1 | 7 | 8 | 1 | 15 | 16 | 1 | 11 | 12 |
| Total | | 38 | 413 | 451 | 73 | 492 | 565 | 61 | 702 | 763 |
Table 3. Error analysis of the Chinese sentence set
| Syntax type | Grammatical type | AMR: lexical | AMR: dependency | AMR: subtotal | Mate Parser: lexical | Mate Parser: dependency | Mate Parser: subtotal | Malt Parser: lexical | Malt Parser: dependency | Malt Parser: subtotal |
|---|---|---|---|---|---|---|---|---|---|---|
| Phrase structure | Modification relation | 25 | 78 | 162 | 27 | 145 | 214 | 121 | 111 | 311 |
| | Functional relation | 11 | 48 | | 1 | 41 | | 29 | 50 | |
| Sentence structure | Component relation | 42 | 185 | 358 | 69 | 197 | 462 | 72 | 218 | 430 |
| | Clause relation | 46 | 85 | | 41 | 155 | | 45 | 165 | |
| Other | Undefined dependencies | 4 | 22 | 26 | 12 | 23 | 35 | 14 | 31 | 45 |
| Total | | 128 | 418 | 546 | 150 | 561 | 711 | 281 | 575 | 786 |
The experiment uses the intelligent semantic techniques to process the collected sentence set, which contains 8,000 interrogative sentences in total. 5,367 sentences were randomly selected as the training corpus, and 1,535 interrogative sentences were then selected at random from the remainder as the test corpus.
Owing to the characteristics of the sentences, many semantic components do not appear or appear only at low frequency. Since the feature set in the experiment is rich and the feature space huge, pruning before feature selection greatly improves system performance. In this experiment, only features whose occurrence frequency exceeds a set threshold are used for training; the pruned components are giving things, causing, object, than things, sub things, basis, materials, conditions, indicating judgment components, and other components. The feature and prediction counts for the training corpus and the test corpus are shown in Table 4.
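Frequency-based pruning as described above can be sketched as follows; the threshold value is hypothetical, since the source does not state the exact cutoff.

```python
from collections import Counter

# Drop features seen fewer than min_count times before feature selection.
def prune_features(feature_occurrences, min_count=5):
    counts = Counter(feature_occurrences)
    return {f for f, c in counts.items() if c >= min_count}

# Toy occurrence list: one common word feature, one rare one, one POS feature.
feats = ["w=the"] * 12 + ["w=aardvark"] * 2 + ["pos=NN"] * 9
kept = prune_features(feats, min_count=5)
print(sorted(kept))  # ['pos=NN', 'w=the']
```

Pruning shrinks the feature space before the (more expensive) feature selection step, which is why it helps when the raw feature set is huge.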
Table 4. Feature and prediction counts
| Semantic component | Training corpus: features | Training corpus: predictions | Test corpus: features | Test corpus: predictions | Semantic component | Training corpus: features | Training corpus: predictions | Test corpus: features | Test corpus: predictions |
|---|---|---|---|---|---|---|---|---|---|
| Tie | 7981 | 6621 | 256 | 242 | Degree | 567 | 531 | 19 | 15 |
| Consul | 26 | 25 | 1 | 1 | Range | 4632 | 3765 | 134 | 108 |
| Suffer | 25 | 13 | 1 | 0 | Trend | 1999 | 1891 | 71 | 69 |
| Results | 821 | 710 | 26 | 23 | Cause | 30 | 27 | 1 | 1 |
| Guest | 5632 | 4982 | 155 | 141 | Purpose | 26 | 26 | 1 | 1 |
| And things | 172 | 132 | 5 | 4 | Predicate | 7255 | 7168 | 256 | 242 |
| Party | 24 | 11 | 0 | 0 | Marker | 5176 | 5133 | 186 | 185 |
| Tools | 246 | 231 | 7 | 6 | Unit | 156 | 156 | 6 | 6 |
| Mode | 322 | 315 | 10 | 9 | yw | 79 | 67 | 2 | 2 |
| Space | 5834 | 5521 | 198 | 182 | ots | 3978 | 3786 | 135 | 126 |
| Time | 166 | 121 | 5 | 4 | - | - | - | - | - |
The training results for each semantic component labeling metric are shown in Figures 4, 5 and 6. As the figures show, recognition accuracy for the tie, consul, result, guest, tool, mode, space, degree, trend, cause, and purpose components is above 92%, a good recognition result. Recognition accuracy for the and-things, time, and range components is 82% to 91%, an average result, while recognition of the suffer and party components is poor.

Figure 4. Precision of semantic component labeling for questions

Figure 5. Recall of semantic component labeling for questions

Figure 6. F1 of semantic component labeling for questions
In this section, 100 students from the senior and junior years of an English major at a university were selected for a regression analysis of the influence of using or not using intelligent semantic analysis technology on language acquisition; for the accuracy of the results, 50 students used the technology and 50 did not. The results are shown in Table 5. The table shows the following: in the higher grades, whether or not the technology is used has no significant effect on students’ school language use, and in the lower grades a clear intention to use it likewise has no significant effect. However, after controlling for other factors, lower-grade students who were unsure whether they would use the technology were significantly more likely to use it than non-users, exp(1.774) = 5.86, significant at the 0.1 level. The contrast between using both and not using the technology was not significant, exp(0.965) = 2.58.
The effect of teachers’ in-class use of intelligent semantic analysis technology on students’ language acquisition is as follows: after controlling for other factors, students whose teachers used the technology in class were significantly more likely to use it themselves than students whose teachers did not, exp(0.707) = 2.01, while the contrast for the group using both modalities was not significant, exp(0.225) = 1.23. Gender also affects students’ language acquisition: after controlling for other factors, the odds of male students using the technology (versus not using it) were 41% lower than those of female students, exp(-0.577) = 0.54, significant at the 0.05 level, and the odds of male students using both (versus not using it) were 29% lower, exp(-0.375) = 0.69, significant at the 0.1 level.
Table 5. Regression analysis of students’ school language use
| Variable | Assignment | Regression coefficient | Standard error | Z value | Significance |
|---|---|---|---|---|---|
| Use vs. not use | | | | | |
| Seniors will use | no = 0 | | | | |
| | yes = 1 | .0643 | .4183 | 0.13 | 0.996 |
| | unclear = 2 | -.9618 | 1.0717 | -0.81 | 0.289 |
| Lower grades will use | no = 0 | | | | |
| | yes = 1 | -.3649 | .3916 | 0.9 | 0.371 |
| | unclear = 2 | 1.7742 | .9853 | 1.77 | 0.002 |
| Teachers don’t use it | yes = 0 | .7065 | .2382 | 2.93 | 0.023 |
| Gender | female = 0 | -.5794 | .2332 | 2.45 | 0.033 |
| Age | | -.4206 | .0892 | -4.66 | 0 |
| Constant term | | 5.4251 | 1.255 | 4.29 | 0 |
| Both (use and not use) vs. not use | | | | | |
| Seniors will use | no = 0 | | | | |
| | yes = 1 | -.2769 | .3868 | -0.69 | 0.593 |
| | unclear = 2 | .0109 | .9797 | 0.02 | 0.997 |
| Lower grades will use | no = 0 | | | | |
| | yes = 1 | .3011 | .3652 | 0.8 | 0.396 |
| | unclear = 2 | .9651 | .9485 | 1.06 | 0.314 |
| Teachers don’t use it | yes = 0 | .2249 | .2293 | 0.95 | 0.341 |
| Gender | female = 0 | -.3751 | .2248 | -1.63 | 0.095 |
| Age | | -.1554 | .0877 | -1.74 | 0.082 |
| Constant term | | 2.638 | 1.235 | 2.11 | 0.153 |
| N | | 578 | | | |
| LR chi2 | | 59.58 | | | |
| Prob > chi2 | | 0.0000 | | | |
| Log likelihood | | -588.6723 | | | |
| Pseudo R2 | | 0.0402 | | | |
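In a multinomial logit model such as Table 5, a coefficient b maps to an odds ratio exp(b). The short sketch below recomputes odds ratios for a few of the reported coefficients; the dictionary keys are descriptive labels added here, not the paper's variable names, and the exact exp values may differ slightly from the rounded figures quoted in the text.

```python
import math

# Odds ratio = exp(regression coefficient), per standard logit interpretation.
coefs = {
    "lower_grade_unclear": 1.7742,   # lower grades, unclear whether to use
    "teacher_uses":        0.7065,   # teacher uses the technology in class
    "male":               -0.5794,   # gender (female = 0)
}
odds = {name: math.exp(b) for name, b in coefs.items()}
for name, ratio in odds.items():
    print(name, round(ratio, 2))
```

An odds ratio above 1 (e.g. for teacher use) means higher odds of using the technology relative to the baseline; below 1 (e.g. for male students) means lower odds.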
This study proposes a stack-LSTM-based AMR intelligent semantic analysis method for supporting intercultural communicative competence in English language acquisition. The intelligent semantic system analyzes syntax with good accuracy and a low error rate. In the semantic component annotation recognition experiment, the proposed system achieves an accuracy of 92%, a good annotation recognition result. When the system is applied in practice to different user groups, the results show a significant difference in the improvement of language communication ability between the group whose teachers taught with the intelligent semantic system and the students who did not use it.
