Stylistic Analysis of Chinese Language Literature Based on Text Mining Techniques
Published online: 09 Oct 2024
Received: 12 May 2024
Accepted: 25 Aug 2024
DOI: https://doi.org/10.2478/amns-2024-2902
© 2024 Xiaomin Shuai, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Literary style captures the distinctive features of individual Chinese-language literary works and is therefore of considerable significance to research on Chinese-language literature. In this paper, we use web crawler technology to build a dataset of 183 Chinese-language literary works collected from online reading websites, and we preprocess the text data with statistics-based word segmentation, de-duplication, and other methods. The acquired texts are represented by a conditional co-occurrence matrix, and the BRET-AE model extracts text features. Appropriate classifiers are then selected for the different literary style analysis tasks. Seven literary works by Mo Yan and Jia Pingwa are selected for an empirical analysis of Chinese-language literary style. The average word length across Mo Yan’s works is 1.5115, and the average word length of Jia Pingwa’s seven works is 1.3995. In addition, the average sentence length of Mo Yan’s works exceeds that of Jia Pingwa’s works, while the word formation rate of Jia Pingwa’s works is higher than that of Mo Yan’s works. Finally, the clustering degree analysis shows that, among Mo Yan’s works, Red Sorghum Family has the lowest clustering degree and Wine Country the highest, highlighting the stylistic transformation across Mo Yan’s works.
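The abstract reports average word length and average sentence length without stating how they are computed. As a minimal sketch, assuming word length is counted in characters per segmented token and sentence length in characters per sentence, the two measures could be obtained as follows. The function names, the punctuation set used to split sentences, and the toy input are illustrative assumptions, not the paper's actual pipeline or data.

```python
import re
from statistics import mean

# Assumed sentence delimiters: Chinese and Western sentence-final punctuation.
SENTENCE_END = re.compile(r"[。！？!?]+")


def split_sentences(text):
    """Split raw text into sentences, dropping the final punctuation."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def average_word_length(tokens):
    """Mean number of characters per segmented token."""
    return mean(len(t) for t in tokens)


def average_sentence_length(sentences):
    """Mean number of characters per sentence."""
    return mean(len(s) for s in sentences)


# Toy example (hypothetical): the tokens are assumed to come from a
# word segmenter that has already been run on the text.
tokens = ["我", "喜欢", "红", "高粱"]
text = "我喜欢红高粱。你呢？"

word_len = average_word_length(tokens)
sent_len = average_sentence_length(split_sentences(text))
print(word_len, sent_len)
```

Under these assumptions, a value such as 1.5115 would mean that a typical token is between one and two characters long. Other definitions, for example sentence length counted in words rather than characters, would change the numbers but not the structure of the computation.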
