Open Access

Folk Tales from Diverse Cultures: Digital Analysis of Content using Natural Language Processing

  
19 Mar 2025

Natural language processing has become one of the research hotspots of machine learning, and text categorization is an important branch of it. This paper applies natural language processing to folktales from different cultures. The texts are preprocessed using an N-gram language model and an SGM model. Word frequency statistics are then used to characterize and classify the folktales of each culture, and a data-driven comparison identifies how key text features differ between them. A complex network characterization of linguistic rhythm shows that, for famous works, the aggregation coefficients are all above 0.35, the average distances are all below 2.5, and the product of aggregation coefficient and average distance stays around 1 in every case.
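The quantities named in the abstract can be illustrated with a short sketch. It is not the authors' pipeline: the abstract does not say how the linguistic rhythm network is built, so the code assumes a simple undirected network that links words appearing next to each other, and it treats the "aggregation coefficient" as the mean local clustering coefficient. All function names are hypothetical, and the sample sentence is invented for illustration only.

```python
from collections import Counter, deque
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    return cleaned.split()


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens (n=1 gives word frequencies)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def adjacency_network(tokens):
    """Assumed construction: undirected graph linking adjacent distinct words."""
    graph = {}
    for a, b in zip(tokens, tokens[1:]):
        if a != b:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def average_clustering(graph):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist between pairs of this node's neighbours.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph) if graph else 0.0


def average_distance(graph):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Usage on an invented sentence (not data from the paper).
tokens = tokenize("The fox saw the hen, and the hen saw the fox.")
word_freq = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)
network = adjacency_network(tokens)
c = average_clustering(network)
d = average_distance(network)
print(word_freq.most_common(3), bigrams.most_common(3))
print(c, d, c * d)
```

The last printed value is the product of the two network measures, the quantity the abstract reports as staying around 1 for famous works. A different network construction, such as a window wider than adjacent words, would give different values.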

Language:
English
Publication frequency:
1 time per year
Journal subjects:
Life Sciences, Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics, Physics, other