Research on English Corpus Construction and Optimisation of Language Teaching Strategies Supported by Data Mining Algorithms
Published online: 17 March 2025
Received: 29 Oct. 2024
Accepted: 12 Feb. 2025
DOI: https://doi.org/10.2478/amns-2025-0179
© 2025 Hong Zhou, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
In this paper, the word2vec text mining method is applied to extract target English teaching resources, such as English teaching materials, videos, newspapers and magazines. The collected resources are preprocessed and subjected to clustering, morpheme and association analysis to derive a set of keywords. These keywords are then used to collect additional corpus material, and the process is iterated until a corpus of the required size has been constructed. When the corpus is applied to English teaching and taken as the object of study, a total of 184 [APPOINT]-related corpus entries are obtained. There are 11 collocations related to “be addicted to”, comprising 8 negative words and 3 positive words. Eleven common errors in college students’ writing are also identified. The performance of English majors in S colleges and universities improved significantly after corpus-based English teaching was applied (P=0.003). The English corpus therefore has a positive effect on English teaching.
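The iterative construction loop described in the abstract (derive keywords, collect more material with them, repeat until the corpus reaches a target size) can be sketched roughly as follows. This is a minimal illustration, not the paper's method: it swaps in plain word-frequency counting for the word2vec, clustering and association analysis, and the names `build_corpus`, `top_keywords`, `pool`, `target_size` and the stopword list are all assumptions introduced here.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(docs, k):
    """Return the k most frequent non-stopword tokens (ties broken alphabetically).

    Frequency counting stands in for the paper's keyword analysis.
    """
    counts = Counter(t for d in docs for t in tokenize(d) if t not in STOPWORDS)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def build_corpus(seed_docs, pool, target_size, k=5, max_rounds=10):
    """Grow a corpus from seed_docs by repeatedly pulling keyword-matching docs from pool."""
    corpus = list(seed_docs)
    remaining = [d for d in pool if d not in corpus]
    for _ in range(max_rounds):
        if len(corpus) >= target_size:
            break  # corpus has reached the requested size
        keywords = set(top_keywords(corpus, k))
        matched = [d for d in remaining if keywords & set(tokenize(d))]
        if not matched:
            break  # no new material matches the current keywords
        added = matched[: target_size - len(corpus)]
        corpus.extend(added)
        remaining = [d for d in remaining if d not in added]
    return corpus


# Hypothetical toy data, for illustration only.
seed = ["Students learn English vocabulary"]
pool = [
    "Vocabulary drills help students",
    "English grammar rules",
    "Football scores today",
    "Grammar drills for writing",
]
corpus = build_corpus(seed, pool, target_size=4)
print(corpus)
```

In this toy run the unrelated document is never pulled in, because it shares no keyword with the growing corpus. A real pipeline would replace `top_keywords` with whatever keyword-extraction step the study actually uses, and `pool` with a crawler or search step.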