Open access

Lexical co-occurrence network and semantic relation mining based on English corpus

  
17 March 2025

This paper analyzes the kinds of content that nodes can represent in a complex network model of text and describes how to construct a text network with words as nodes. It defines the lexical co-occurrence relationship on the basis of lexical semantics and, following the implementation process of the lexical co-occurrence analysis method, proposes a keyword extraction method for lexical co-occurrence networks based on an improved TextRank algorithm. Drawing on the features of complex networks, it uses the FWN short text clustering algorithm to reveal the semantic associations between words, and it analyzes the advantages of the improved TextRank algorithm. The paper then counts the distribution of node word classes in the lexical co-occurrence networks of an English corpus covering literature, journalism, and law, and calculates their semantic relevance. In the overall network of the English corpus (which comprises the word co-occurrence networks for news, literature, and law), nouns account for the largest number of nodes, followed by verbs, while time words appear as nodes least often.
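
To make the pipeline in the abstract concrete, the following is a minimal sketch of a word co-occurrence network scored with the standard TextRank update. It is not the improved variant the paper proposes, whose details the abstract does not give. The function names, the sliding-window definition of co-occurrence, and the parameter values (`window`, `damping`, `iterations`) are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def build_cooccurrence(tokens, window=2):
    """Build an undirected weighted graph from a token list.

    Assumption: two distinct words co-occur when they appear within
    `window` tokens of each other; the edge weight counts how often.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            v = tokens[j]
            if v == w:
                continue  # skip self-loops
            graph[w][v] += 1.0
            graph[v][w] += 1.0
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Standard weighted TextRank (not the paper's improved version)."""
    nodes = list(graph)
    scores = {n: 1.0 for n in nodes}
    # Total edge weight attached to each node, used to normalize votes.
    strength = {n: sum(graph[n].values()) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbor m passes on a share of its score
            # proportional to the weight of the edge (m, n).
            rank = sum(graph[m][n] / strength[m] * scores[m] for m in graph[n])
            updated[n] = (1 - damping) + damping * rank
        scores = updated
    return scores


def top_keywords(tokens, k=3, window=2):
    """Return the k highest-scoring words, ties broken alphabetically."""
    scores = textrank(build_cooccurrence(tokens, window))
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


if __name__ == "__main__":
    sample = "network word network node network edge".split()
    print(top_keywords(sample, k=2, window=1))
```

In practice the token list would first be filtered (for example by part of speech or a stop-word list) before the graph is built; that preprocessing is omitted here because the abstract does not specify it.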