Stylistic Analysis of Chinese Language Literature Based on Text Mining Techniques

Literary style captures the distinctive features of individual Chinese-language literary works and is therefore of considerable significance to research on Chinese-language literature. In this paper, we use web crawler technology to build a dataset of 183 Chinese-language literary works collected from online reading websites, and we preprocess the text data with statistics-based word segmentation, de-duplication, and related methods. The acquired texts are represented with a conditional co-occurrence matrix, and text features are extracted with the BRET-AE model. Appropriate classifiers are then selected for the different literary style analysis tasks. Seven literary works by Mo Yan and Jia Pingwa are selected for an empirical analysis of Chinese-language literary style. The average word length across all of Mo Yan’s works is 1.5115, and the average word length of Jia Pingwa’s 7 works is 1.3995. In addition, the average sentence length of Mo Yan’s works exceeds that of Jia Pingwa’s works, while the word formation rate of Jia Pingwa’s works is higher than that of Mo Yan’s works. Finally, the clustering degree analysis shows that Mo Yan’s Red Sorghum Family has the lowest clustering degree and Wine Country the highest, highlighting the transformation in the literary style of Mo Yan’s works.
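The abstract does not define its surface statistics, so the following is only a minimal sketch under stated assumptions: average word length is taken as characters per segmented word, and average sentence length as characters per sentence split on Chinese end punctuation. The function names and the sample tokens are hypothetical, and the input is assumed to be already segmented by whatever segmenter the author used.

```python
import re


def average_word_length(tokens):
    """Mean number of characters per segmented word (assumed definition)."""
    words = [t for t in tokens if t.strip()]
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)


def average_sentence_length(text):
    """Mean number of characters per sentence (assumed definition).

    Sentences are split on the Chinese full stop, exclamation mark,
    and question mark; empty fragments are discarded.
    """
    sentences = [s.strip() for s in re.split(r"[。！？]", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s) for s in sentences) / len(sentences)


# Hypothetical, hand-segmented sample; not taken from the paper's dataset.
tokens = ["我", "喜欢", "红", "高粱"]
text = "我喜欢红高粱。你呢？"

print(average_word_length(tokens))
print(average_sentence_length(text))
```

Under these assumptions, a value such as 1.5115 would mean that a typical segmented word is between one and two characters long, which is the kind of per-author figure the abstract compares.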

Language: English

Publication frequency: 1 issue per year
Journal subjects: Life Sciences, Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics, Physics, other

Xiaomin Shuai

Published online: 09 Oct 2024

Received: 12 May 2024

Accepted: 25 Aug 2024

DOI: https://doi.org/10.2478/amns-2024-2902

Keywords: Text mining, Conditional co-occurrence matrix, Classifier, Text feature extraction, Literary style

© 2024 Xiaomin Shuai, published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
