An Innovative Approach to AI-Based Automated Knowledge Mapping for English Teaching Resource Construction
Published online: 21 Mar 2025
Received: 29 Oct 2024
Accepted: 19 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0696
© 2025 Yuqing Ge, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
In education, the automated construction of knowledge graphs (KGs) using artificial intelligence has become a research hotspot [1-2]. The new round of scientific and technological revolution and industrial change represented by artificial intelligence and big data has become a new driving force for economic and social development, and knowledge graph technology is now widely used in industry [3-5]. A knowledge graph is a structured semantic knowledge base that describes concepts in the world and the relationships between them. Knowledge graphs link knowledge concepts scattered across textbooks into a large knowledge base, describe complex relationships between real-world entities in a structured form, reduce data granularity from the document level to the knowledge-point level, and aggregate large amounts of knowledge to support knowledge-specific retrieval and reasoning [6-9].
In education and teaching, some studies have also used knowledge graph technology to construct curriculum knowledge systems. Knowledge mapping can make full use of existing learning and educational resources to visualize the structural relationships between knowledge points across multiple teaching resources [10-13]. Building an efficient English classroom is a major focus of today’s teaching reform. English teachers must keep pace with the times and, in line with the requirements of the new English curriculum standard and the actual situation of their students, use automated knowledge mapping technology to integrate high-quality English teaching resources such as online information, e-bookbags, newspapers, and magazines [14-15]. Sorting out the knowledge framework of the English subject helps students understand the connections between English courses, quickly grasp the main points of course knowledge, accurately retrace how the knowledge develops, and quickly identify and fill gaps. This method is expected to provide data support for the reform and optimization of English teaching in colleges and universities and to promote learners’ deeper understanding and application of knowledge [16-17].
In this study, with the technical support of artificial intelligence, several web pages were crawled using Scrapy to obtain suitable English teaching resources. Most of the crawled exercises were multiple-choice questions, followed by fill-in-the-blank questions, reading comprehension, etc.; the multiple-choice questions contained question stems, options, and explanations. The data were found to contain a great deal of interfering information that seriously affects the results of subsequent study; these problematic data are collectively referred to as dirty data. The dirty data were then processed in roughly the following stages: pre-processing, handling missing data, and deleting formatting errors and logical errors. A combination of “top-down” and “bottom-up” approaches was then used to construct the knowledge graph. To address the low matching degree of teaching materials in traditional knowledge maps, we introduce mapping relationships between different entity structures on top of the original graph, so as to build a knowledge map for multi-dimensional English teaching resources. To address the current situation of English teaching in colleges and universities, we propose a practical application path for English teaching from the perspective of knowledge mapping and use mathematical statistics to explore its practical effect.
Knowledge mapping is a new, data-based mode of knowledge organization with knowledge at its core. It uses computer technology to represent associated knowledge visually, aiming to describe entities and their interrelationships in the objective world from multiple dimensions [18-19]. It is a visual way of organizing information that can be used to mine the relationships between pieces of information through semantic retrieval and visual query. A knowledge graph involves three important concepts: entities, relationships and patterns. An entity is a thing in the objective world and is the basic unit of the knowledge graph. Entities fall into three types: individuals, organizations, and relationships between entities. A relationship is an interaction between different entities established in a particular way.
Currently, English language teaching faces challenges that affect both teaching effectiveness and student learning and that threaten the achievement of teaching and learning goals. These problems appear at multiple levels. First, the low level of informatization has become a bottleneck for improving classroom teaching effectiveness and students’ learning efficiency. With the advent of the era of educational informatization, students expect to use advanced technology to assist their learning, but in reality many universities struggle to provide informatization equipment and resources. As a result, classroom teaching remains in the traditional mode and cannot fully use modern scientific and technological achievements to enrich teaching content and methods. Second, students’ access to knowledge is relatively narrow, and relying on a single channel of information largely hinders their ability to use modern tools such as the Internet for independent learning and exploration, which slows the improvement of their learning efficiency. Third, the lack of a language environment cannot be ignored. As a communication tool, language cannot be learned without a good language environment, yet in some schools such an environment is not effectively constructed or maintained. Without opportunities to practice in real contexts, students encounter many difficulties in language learning: they have little exposure to sufficient vocabulary and complex grammatical structures, let alone the relevant cultural background. Such deficiencies limit the cultivation and development of language proficiency. Moreover, the fragmentation of teaching resources has caused serious inconvenience for students’ learning.
English teaching plays an important role in higher education. However, due to the many problems of the traditional teaching mode, the quality of English teaching has not significantly improved. In recent years, with the rapid development of artificial intelligence and big data technology, the field of education has begun to introduce knowledge mapping technology to solve the problems of traditional teaching methods. The emergence of knowledge mapping can not only enable effective integration and sharing of information, but also provide more accurate learning resources for different professions to enable personalized learning and accurate teaching. Therefore, this subsection proposes a knowledge graph for English teaching resources, which is expected to effectively solve the many problems faced by current English teaching.
Web crawlers, also known as web spiders, are programs or scripts that automatically obtain web content from the World Wide Web; they are an important part of modern search engines [20-21] and an important source of their data. When crawling, a web crawler first fetches a web page from a given URL and then parses it. Because information in a web page is marked with specific tags (for example, div divides the document into areas, a indicates a hyperlink, and p indicates a paragraph of text), a crawler can locate information accurately by tag. Web crawlers are therefore an essential technique for retrieving learning resources from the internet.
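The tag-based parsing step can be illustrated with a minimal sketch. It uses only Python's standard `html.parser` module rather than Scrapy, and the page fragment, class name `TagCollector`, and URL path are hypothetical; it is meant to show the idea of locating content by `a` and `p` tags, not the crawler used in this study.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect hyperlinks from <a> tags and text from <p> tags."""

    def __init__(self):
        super().__init__()
        self.links = []        # href values found on <a> tags
        self.paragraphs = []   # text content found inside <p> tags
        self._in_p = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "p":
            self._in_p = True
            self._buffer = []

    def handle_data(self, data):
        # Only keep text that appears inside a <p> element.
        if self._in_p:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self.paragraphs.append("".join(self._buffer).strip())
            self._in_p = False


# Hypothetical page fragment; a real crawler would fetch this from a URL.
SAMPLE_HTML = """
<div class="exercise">
  <p>She ___ to school every day.</p>
  <a href="/exercises/verbs/1">Explanation</a>
</div>
"""

collector = TagCollector()
collector.feed(SAMPLE_HTML)
print(collector.links)
print(collector.paragraphs)
```

In a Scrapy project the same selection would normally be written with CSS or XPath selectors on the response object; the standard-library parser is used here only to keep the sketch self-contained.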
In this paper, we used Scrapy to crawl several web pages and obtained 24,995 English teaching resources. During crawling, the exercises were categorized by knowledge point into three major categories: lexical, syntactic, and comprehensive. The knowledge points examined in lexical exercises include nouns, verbs, similar words, prepositions, and so on. Syntactic exercises include emphasized sentences, the subjunctive mood, general questions, imperative sentences, and so on. Comprehensive exercises include four question types: gap-filling, reading comprehension, writing, and translation. There are 12,389 lexical exercises, 11,472 syntactic exercises, and 1,134 comprehensive exercises. Most of the crawled exercises are multiple-choice questions, followed by fill-in-the-blank questions, reading comprehension, etc. Multiple-choice questions contain question stems, options, and explanations.
The data in this paper were crawled from different web pages and are not all correct; incomplete records and other problems are common. Such problematic data are collectively called dirty data, and they directly affect data quality. The process of handling dirty data is called data cleaning. Data cleaning is a very important step because it directly affects the performance of the model and the final conclusions. With the continuous development of computer technology, a set of practical processes and methods for data cleaning has emerged. The process is roughly divided into three stages: pre-processing, handling missing data, and deleting format errors and logical errors.
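The three cleaning stages can be sketched as follows. The record layout (`Exercise`), the specific rules (duplicate stems, required stem and answer, answer must appear among the options), and the sample records are all assumptions made for illustration; the paper does not specify the exact rules that were applied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Exercise:
    stem: Optional[str]
    options: list
    answer: Optional[str]
    explanation: Optional[str] = None


def clean(records):
    """Return the records that survive three illustrative cleaning stages."""
    seen = set()
    kept = []
    for r in records:
        # Stage 1 (pre-processing): normalise whitespace, drop duplicate stems.
        stem = " ".join(r.stem.split()) if r.stem else ""
        if stem in seen:
            continue
        # Stage 2 (missing data): a stem and an answer are both required.
        if not stem or not r.answer:
            continue
        # Stage 3 (format and logical errors): a multiple-choice question
        # needs at least two options, and its answer must be one of them.
        if r.options and (len(r.options) < 2 or r.answer not in r.options):
            continue
        seen.add(stem)
        kept.append(Exercise(stem, r.options, r.answer, r.explanation))
    return kept


# Hypothetical raw records, not data from the study.
raw = [
    Exercise("He  ___ a book.", ["read", "reads"], "reads"),
    Exercise("He ___ a book.", ["read", "reads"], "reads"),  # duplicate stem
    Exercise(None, ["a", "b"], "a"),                          # missing stem
    Exercise("I ___ happy.", ["am", "is"], "are"),            # answer not an option
    Exercise("Translate the sentence.", [], "Hello"),         # not multiple choice
]
cleaned = clean(raw)
print(len(raw), "->", len(cleaned))
```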
After a series of data cleaning operations, the quality of the data meets the requirements of the knowledge graph, and the data can be used for its construction. Data cleaning reduced the amount of data from 24,995 to 16,850 records. Of these, 8,529 exercises examine lexical knowledge, covering modal verbs, nouns, conjunctions, prepositions and prepositional phrases, articles, nonpredicative verbs, verb inflections, verbs and verb phrases, verb tenses, pronouns, adverbs and adjectives, numerals, and similar words. The number of exercises for each lexical knowledge point is shown in Fig. 1, in which each label gives the count and the area of each square reflects its rank: the larger the value, the larger the square. Numerals have the most exercises (945), while nouns have the fewest (552).

Figure 1. Number of specific lexical exercises
There are 7,937 exercises examining syntax. The knowledge points involved are the subjunctive mood, imperative sentence, emphasized sentence, noun clause, sentence constituent, simple sentence, compound sentence, exclamatory sentence, antithetical question, object clause, inverted sentence, parallel sentence, definite clause, subject-verb agreement, gerund clause, and general question. The number of exercises for each is shown in Figure 2. Among the 16 syntactic knowledge points, inverted sentences have the most exercises (659) and definite clauses the fewest (419).

Figure 2. Specific number of syntax exercises
There are 384 comprehensive exercises, covering completion filling, reading comprehension, translation and composition; the number of each type is shown in Table 1. The table shows that Reading comprehension (113) > Composition (111) > Completion filling (97) > Translation (63).
Table 1. Comprehensive exercises
| Exercise type | Number of exercises | Rank |
|---|---|---|
| Completion filling | 97 | 3 |
| Reading comprehension | 113 | 1 |
| Translation | 63 | 4 |
| Composition | 111 | 2 |
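As a quick consistency check on the counts reported above, the category totals before and after cleaning can be summed and compared. The figures are taken from the text; the variable names and the retention-rate calculation are added here for illustration only.

```python
# Counts reported in the text, before and after data cleaning.
crawled_counts = {"lexical": 12389, "syntactic": 11472, "comprehensive": 1134}
cleaned_counts = {"lexical": 8529, "syntactic": 7937, "comprehensive": 384}
# Breakdown of the comprehensive exercises from Table 1.
comprehensive_types = {
    "Completion filling": 97,
    "Reading comprehension": 113,
    "Translation": 63,
    "Composition": 111,
}

total_crawled = sum(crawled_counts.values())
total_cleaned = sum(cleaned_counts.values())
removed = total_crawled - total_cleaned

print("crawled:", total_crawled)
print("cleaned:", total_cleaned)
print("removed as dirty data:", removed)

# Share of each category that survived cleaning.
for name in crawled_counts:
    rate = cleaned_counts[name] / crawled_counts[name]
    print(f"{name}: {rate:.1%} retained")
```

The totals agree with the 24,995 and 16,850 records stated in the text, and the four question types in Table 1 sum to the 384 comprehensive exercises.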
The knowledge graph for English teaching resources is shown in Figure 3, and the construction of the knowledge graph mainly consists of two methods: “top-down” and “bottom-up”. “Top-down” refers to the use of structured databases to define the ontology knowledge layer, which is then progressively refined by adding entities to the concepts. “Bottom-up” refers to the use of open-link data to summarize and organize entities and then progressively form upper-level concepts. The English Knowledge Graph is constructed using a combination of “top-down” and “bottom-up” methods. The ontological knowledge layer aims to extract and summarize concepts and inter-conceptual relationships from textbooks and teaching materials, forming “concept-relationship-concept”.

Figure 3. Knowledge graph for English teaching resources
Under the constraint of the ontology knowledge layer, the data resource layer extracts entities, their relationships, and “attribute-value” pairs from the specific knowledge points in the teaching materials, and builds a hierarchical structure of “entity-attribute-attribute value” and “entity-relation-entity” triples. On this basis, concepts and entities are linked, and the ontology knowledge layer and the data resource layer are integrated to construct the college English knowledge graph. For the knowledge point “third-person pronouns”, for example, instances of “third-person singular pronouns” are manually extracted from textbooks, syllabuses, courseware and other materials and used as entities in the data resource layer, and the knowledge point and its instances are linked by edges labeled “rdf:type”. In addition, the attributes of the “third-person singular pronouns” (nominative, accusative, etc.) and their attribute values (he, she, it, him, her, it, etc.) are extracted, and edges between attributes and attribute values express the semantic relationship between concept and entity, forming the “entity-attribute-attribute value” representation. To expand the coverage and application potential of knowledge elements in the English knowledge graph, the original data, including unstructured data (such as pictures, audio, and video), were included in the construction process, knowledge elements (such as knowledge points and difficulty) were extracted, and associations between the knowledge elements and resource objects were established.
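The “third-person singular pronouns” example can be written down as a small set of triples. The sketch below is one possible in-memory representation, assuming a plain list of (subject, predicate, object) tuples; the `subClassOf` edge, the `TripleStore` class, and its method names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

CONCEPT = "third-person singular pronoun"

# Each fact is a (subject, predicate, object) triple.
triples = [
    # Ontology knowledge layer: concept-relationship-concept (assumed edge).
    (CONCEPT, "subClassOf", "third-person pronoun"),
    # Data resource layer: instances linked to the concept by rdf:type.
    ("he", "rdf:type", CONCEPT),
    ("she", "rdf:type", CONCEPT),
    ("it", "rdf:type", CONCEPT),
    # Entity-attribute-attribute value.
    ("he", "nominative", "he"),
    ("he", "accusative", "him"),
    ("she", "nominative", "she"),
    ("she", "accusative", "her"),
    ("it", "nominative", "it"),
    ("it", "accusative", "it"),
]


class TripleStore:
    """Index triples by subject and by object for simple lookups."""

    def __init__(self, facts):
        self._by_subject = defaultdict(list)
        self._by_object = defaultdict(list)
        for s, p, o in facts:
            self._by_subject[s].append((p, o))
            self._by_object[o].append((s, p))

    def instances_of(self, concept):
        """Entities linked to the concept by an rdf:type edge."""
        edges = self._by_object.get(concept, [])
        return sorted(s for s, p in edges if p == "rdf:type")

    def attributes(self, entity):
        """Attribute -> attribute value pairs of an entity."""
        edges = self._by_subject.get(entity, [])
        return {p: o for p, o in edges if p != "rdf:type"}


store = TripleStore(triples)
print(store.instances_of(CONCEPT))
print(store.attributes("she"))
```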
Aiming at the problem of low matching degree of teaching materials in traditional knowledge mapping, on the basis of subsection 2.2.4, the mapping relationship of different entity structures is introduced, so as to establish a knowledge mapping oriented to multidimensional English teaching resources.
Taking the English teaching resources entity knowledge point as a spatial structure and denoting it as (
Where:
Where:
First, learning resources that match the learner’s knowledge level and learning style are preliminarily screened based on an assessment of that level and style. Second, the learner’s knowledge needs are analyzed using their test scores on the exercises, with reference to the relationship network of the knowledge points they have learned in the knowledge graph. Third, the pre-selected resources are filtered again to pinpoint those closely related to the learner’s current knowledge needs, and these are displayed on the user interface for immediate learning. Finally, exercises closely matched to the learning resources are recommended, the learner’s mastery is assessed from their scores, and the recommendation strategy is adjusted dynamically so that subsequent learning materials match the learner’s actual mastery level, until all relevant knowledge points in the network are fully covered and consolidated.
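The screening and filtering steps described above might be sketched as a two-pass filter. Everything specific here is an assumption for illustration: the `Resource` fields, the 1 to 3 level scale, the style labels, the 0.6 mastery threshold, and the prerequisite table standing in for the knowledge graph's relationship network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    title: str
    knowledge_point: str
    level: int   # 1 = beginner ... 3 = advanced (hypothetical scale)
    style: str   # e.g. "visual" or "text" (hypothetical labels)


# Hypothetical prerequisite edges standing in for the knowledge graph.
PREREQUISITES = {"present perfect": ["past participle"], "past participle": []}


def weak_points(scores, threshold=0.6):
    """Knowledge points below the threshold, plus unmastered prerequisites."""
    needed = set()
    for point, score in scores.items():
        if score < threshold:
            needed.add(point)
            for pre in PREREQUISITES.get(point, []):
                # A prerequisite with no recorded score counts as unmastered.
                if scores.get(pre, 0.0) < threshold:
                    needed.add(pre)
    return needed


def recommend(resources, level, style, scores, threshold=0.6):
    # Pass 1: preliminary screening by knowledge level and learning style.
    candidates = [r for r in resources if r.level <= level and r.style == style]
    # Pass 2: keep only resources for the learner's current knowledge needs.
    needed = weak_points(scores, threshold)
    return [r for r in candidates if r.knowledge_point in needed]


resources = [
    Resource("Participle chart", "past participle", 1, "visual"),
    Resource("Present perfect video", "present perfect", 2, "visual"),
    Resource("Present perfect essay", "present perfect", 2, "text"),
    Resource("Passive voice video", "passive voice", 2, "visual"),
]

first_round = recommend(
    resources, level=2, style="visual",
    scores={"present perfect": 0.4, "passive voice": 0.9},
)
print([r.title for r in first_round])

# After new exercise scores arrive, the recommendation is recomputed.
second_round = recommend(
    resources, level=2, style="visual",
    scores={"present perfect": 0.8, "past participle": 0.9, "passive voice": 0.9},
)
print([r.title for r in second_round])
```

Recomputing the recommendation after each round of exercise scores is one simple way to realize the dynamic adjustment: once every knowledge point clears the threshold, nothing further is recommended.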
As a bridge connecting linguistic intelligence and English education, English teaching resources based on knowledge mapping significantly enhance the richness of teaching content and the effectiveness of teaching activities through structured knowledge points and learning resources, and build a solid resource foundation for promoting the digital transformation of English education and realizing intelligent upgrading. In the end, an efficient and personalized English intelligent education ecosystem will be established, which will provide strong support for cultivating English talents with international vision and intercultural communication skills.
After all the entity and relationship tables are imported, the graph can be visualized. Figure 4 shows the connections between lexical and syntactic knowledge points. Relationships between entities are drawn as arrows, and node colors distinguish types: light green indicates the whole lexicon, light cyan indicates tense information, light blue indicates the knowledge point information corresponding to the lexicon, and light orange indicates the syntactic information. Visual mapping makes the relationships between knowledge points easy to see and makes the graph easy to update and maintain. The exercises cover many types of knowledge points, but the exercises are weakly connected to one another and the knowledge points are scattered. A knowledge graph represents the relationships between exercises and knowledge points more intuitively and makes knowledge easy to store; it also lets the system call its algorithms more efficiently when making personalized exercise recommendations and helps uncover deeper English knowledge, so visual analysis of the knowledge graph is necessary. With knowledge graph visualization and analysis, students can improve their ability to learn independently. It can also promote interaction among students and in-depth communication between students and teachers. This model can not only improve students’ English listening, speaking, reading, and writing skills, but also stimulate their interest in learning and promote the development of their thinking abilities.

Figure 4. Visual analysis of lexical and sentence patterns
In addition to knowledge points such as English letters, words, and grammar, the knowledge graph also stores the explanations and answers of the exercises. The visual analysis of the exercise part is shown in Fig. 5, where a light yellow icon indicates a specific exercise that is associated with key information such as question type, explanation, knowledge point and usage. By querying an exercise, we can therefore find out which category of knowledge points it belongs to. Graphical presentation shows the structure of the knowledge more clearly, which helps users view and analyze the connections between knowledge points, and managers can also add and modify knowledge through the graphical interface, so visualization is necessary. In summary, knowledge mapping technology, as a cutting-edge technology, has revolutionized English teaching. It is not only a teaching aid, but also an important force to promote educational innovation, which can create a more efficient, interactive, and personalized university English teaching environment.

Figure 5. Visual analysis of exercises
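Querying an exercise for its knowledge points can be sketched over a plain edge table. The node identifiers, relation names (`has_type`, `examines`, `has_explanation`, `belongs_to`), and sample edges are hypothetical; a graph database would typically answer the same question with a pattern query.

```python
# Hypothetical edge table: (source node, relation, target node).
EDGES = [
    ("exercise_001", "has_type", "multiple choice"),
    ("exercise_001", "examines", "modal verbs"),
    ("exercise_001", "has_explanation", "'must' expresses obligation."),
    ("exercise_002", "has_type", "multiple choice"),
    ("exercise_002", "examines", "modal verbs"),
    ("exercise_003", "examines", "inverted sentence"),
    ("modal verbs", "belongs_to", "lexical"),
    ("inverted sentence", "belongs_to", "syntactic"),
]


def neighbours(node, relation):
    """Targets reachable from node over edges with the given relation."""
    return [t for s, r, t in EDGES if s == node and r == relation]


def describe(exercise):
    """Knowledge points, categories, and related exercises for one exercise."""
    points = neighbours(exercise, "examines")
    categories = {c for p in points for c in neighbours(p, "belongs_to")}
    related = {
        s for s, r, t in EDGES
        if r == "examines" and t in points and s != exercise
    }
    return {
        "knowledge_points": points,
        "categories": sorted(categories),
        "related_exercises": sorted(related),
    }


info = describe("exercise_001")
print(info)
```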
In this study, Class A (experimental group, 8 students) and Class B (control group, 8 students) at one school were selected as experimental subjects. Both classes are taught by the same English teacher, are of similar size, and show no significant difference in English mapping scores, grammar test scores, or attitudes toward grammar learning. The independent variable has two levels: the experimental class uses knowledge graph-based English teaching resources, and the control class uses traditional English teaching resources.
The main research tool used in this study is the Knowledge Mapping Practical Application Effectiveness Assessment Scale for English Teaching Resources. The scale has three parts: knowledge skills, learning attitudes, and values. Each part has 10 items, for a total of 30 items. The scale has excellent reliability and validity, which supports the scientific soundness of the research results.
Pre-intervention comparative analysis. Based on the scale test data, an independent samples t-test was conducted on the control group and the experimental group before the intervention. The results are shown in Fig. 6, where (a)~(c) are knowledge skills, learning attitudes, and values, respectively. Figure 6(a)~(c) shows that before the intervention the two groups did not differ significantly (P>0.05) in knowledge skills (P=0.233), learning attitudes (P=0.121), or values (P=0.162). The main reason is that the subjects were selected so that all indicators of the two groups were at a comparable level, which meets the requirements of the experiment and allows the follow-up research to proceed.
Post-intervention comparative analysis. The above analysis shows that the control group and the experimental group did not differ significantly before the intervention. A teaching intervention lasting one experimental cycle was then carried out for both groups, and an independent samples t-test was used to compare the practice effect of the two groups after the intervention; the results are shown in Figure 7. Based on the P-values in Fig. 7(a)~(c), after the intervention there are significant differences (P<0.05) between the control group and the experimental group in knowledge skills (P=0.006), learning attitudes (P=0.002), and values (P=0.005). This indicates that introducing knowledge mapping on the basis of traditional English teaching resources is more conducive to enhancing students’ abilities, which has guiding value for promoting the development of intelligent and digitalized English teaching resources.
Within-group comparative analysis. Finally, an independent samples t-test was used for the within-group comparison. Figure 8 shows the results, where (a)~(b) are the experimental group and the control group, respectively. The P-values show that in the experimental group the practice effect indicators differ significantly before and after the intervention (knowledge skills P=0.047, learning attitudes P=0.033, values P=0.021), while in the control group the differences before and after the intervention (knowledge skills P=0.143, learning attitudes P=0.219, values P=0.355) are not significant. In summary, English teaching resources without knowledge mapping produced only modest practice effects, whereas adding knowledge mapping to the original resources made the practice effect more pronounced.

Figure 6. Comparative analysis before intervention

Figure 7. Comparative analysis after intervention

Figure 8. Comparative analysis within groups
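For readers unfamiliar with the test used in these comparisons, the statistic behind an independent samples t-test can be computed as in the sketch below. It assumes the equal-variance (pooled) form, which the paper does not state explicitly, and the two score lists are invented for illustration; they are not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance of the two groups.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical scale scores for two groups of 8 students each.
experimental = [8, 9, 7, 9, 8, 10, 9, 8]
control = [6, 7, 6, 8, 7, 6, 7, 7]

t_stat, df = independent_t(experimental, control)
print(f"t = {t_stat:.3f}, df = {df}")
```

With two groups of 8 the test has 14 degrees of freedom, so a two-sided result is significant at the 0.05 level when |t| exceeds about 2.145. Obtaining the P-value itself requires the t distribution, which a statistics package such as SciPy (`scipy.stats.ttest_ind`) provides.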
In this paper, we use web crawling to obtain English teaching resource data and pre-process the data before constructing a knowledge graph of these resources. To address the current dilemmas of English teaching in colleges and universities (relatively narrow channels for students to access knowledge, low level of informatization, etc.), we propose adding the constructed knowledge map to the actual English teaching process to enhance teaching effectiveness, and we evaluate its practical effect using a scale and independent samples t-tests. After a period of experimental teaching intervention, there are significant differences (P<0.05) between the control group and the experimental group in knowledge skills (P=0.006), learning attitudes (P=0.002), and values (P=0.005), which indicates that introducing knowledge mapping on the basis of traditional English teaching resources is more conducive to enhancing students’ abilities and provides a reference for the construction of intelligent and digital English teaching resources.
