Modeling Emotional Patterns and Singing Styles of Ancient Poetry in Vocal Art Songs Based on Text Mining
Published online: 24 Mar 2025
Received: 01 Nov 2024
Accepted: 14 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0710
© 2025 Dan Shen, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Chinese civilization has a long and splendid history, and ancient poems are among its brightest pearls: they carry distinctive national cultural connotations and aesthetic aspirations and are typical representatives of China’s finest culture. Ancient poems combine deep feeling with beautiful diction. Their meaning embodies culture, emotion, and thought, and their language is exquisite and fluent, giving them literary, artistic, and aesthetic depth [1-4]. Ancient poetry is the crystallization of the wisdom of the ancient Chinese working people and a carrier of emotion. It was itself an art sung to music, so it has an innate affinity with music, which creates an opportunity for composing art songs on ancient poems [5-8]. The Chinese ancient poetry art song is an art form that combines ancient Chinese poems with Western compositional techniques, uniting the aesthetic elements of poetry and music, and it has distinctive musical-aesthetic significance and national cultural value [9-12]. These songs are fresh and elegant in style, with timeless lyrics and graceful melodies. They are widely welcomed and sung, and they contribute to people’s spiritual fulfillment and to the construction of an advanced culture [13-15].
Music is an art of emotion, and singing serves emotional expression: a performance is truly powerful only when it conveys the emotion the work intends. To enhance emotional value and promote traditional Chinese culture, the emotional patterns of ancient poetry art songs are examined so that performers can better express the emotion of a work [16-17]. Doing so deepens the understanding of cultural connotation and emotion, and draws artistic value from the ancient poems themselves [18]. Therefore, by exploring the distinctive emotional patterns and singing styles of ancient poetry art songs, and by sorting out the details and techniques of singing, works based on ancient Chinese poems can be better interpreted and passed on [19-22].
This study applies text mining, using the KH Coder software, to analyze the sentiment patterns of two types of songs: those set to Tang poetry and those set to Song poetry (ci). First, overall word frequencies are mined, high-frequency words are selected, and the data are used to compute the central words. In parallel, K-Means clustering is applied to the word frequencies to analyze how emotion is expressed. Then the emotional patterns of Tang and Song poems are analyzed from the perspectives of genre and tone. Finally, a singing style model is constructed from the different singing styles found in ancient poetry songs.
The main software used for text mining in this paper is KH Coder (Koichi Higuchi Coder). It is free, open-source software for quantitative text analysis, content analysis, and dictionary-based analysis of unstructured text. It supports text analysis in Simplified Chinese, including word frequency statistics, part-of-speech tagging, keyword indexing, co-occurrence networks, cluster analysis, and visualization. At present, the software is widely used in historical research, literature review, and ecology. Some scholars in education have also used it for literature-review research on China and Japan. Combining the functions of KH Coder with manual processing, such as entering and screening the original lyric texts, completes the text mining of ancient poetry vocal art songs. Figure 1 shows this text mining process.

Figure 1. Text mining process for the lyrics
Preprocessing converts unstructured text into numerical form before the formal content analysis. It filters out text unrelated to the study, extracts text features, and normalizes the form of the text. Preprocessing is crucial in text mining and directly affects the accuracy of the results. It differs across text types but generally involves cleaning the text, segmenting words, and setting stop words. In this paper, preprocessing comprises five steps: screening and cleaning, semantic replacement, stop-word setting with a forced-extraction test, Chinese word segmentation, and a word frequency distribution test.
Screening and cleaning. Screening and cleaning selects eligible folk songs from the source books and deletes interfering lyric fragments, producing the collection of lyrics to be analyzed. Because the folk songs collected in this paper come from older printed books, the lyrics of each song must be entered manually, and during entry some anonymous songs and some shaman songs without lyrics are sifted out. In addition, songs whose lyrics are unrelated to the landscape imagery studied here are eliminated during entry, such as toasts that focus on kinship and hymns praising a leader.

Semantic replacement. Text mining extracts valuable words scattered throughout the text. The lyrics contain many dialect words, transliterations from ethnic minority languages, and synonyms that refer to the same thing. For example, “spring water” appears in three forms: “mountain spring”, “clear spring”, and “spring water”. To keep the subsequent analysis accurate, such synonym sets are extracted manually and each set is replaced with a single word, for example “mountain spring”, which completes the semantic replacement of the lyric text.

Stop-word setting and forced-extraction test. Ancient poetry lyrics are essentially long texts in which not every word is informative for the research, and some words even interfere with it. These meaningless words must be deleted before feature words are extracted. A stop-word list is a collection of such words, compiled by linguists, that carry no useful information in text mining and cannot serve as feature words of the text.
Chinese word segmentation. Segmentation, the process of splitting continuous text into individual words, is the basis of text mining and the most important preprocessing step for Chinese. Unlike English, where spaces separate the words of a sentence, Chinese sentences consist of consecutive characters, so they must be cut into words before any quantitative operation.

Word frequency distribution test. After segmentation, the preprocessed lyric texts of the four ethnic groups were combined, and word and phrase frequencies were counted.
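The preprocessing steps described above can be illustrated with a minimal Python sketch. It is not KH Coder's internal procedure: the synonym map, stop-word list, and forced-extraction list are hypothetical examples, and segmentation is approximated by forward maximum matching against the forced-extraction list, which is a deliberate simplification.

```python
from collections import Counter

# Hypothetical resources; real lists would be compiled manually from the corpus.
SYNONYMS = {"清泉": "山泉", "泉水": "山泉"}   # variant -> canonical form
STOP_WORDS = {"的", "之", "而"}               # uninformative words to discard
FORCED_WORDS = {"山泉", "白云", "万里"}       # terms that must be kept whole


def replace_synonyms(text, synonyms):
    """Map every variant onto its canonical form (semantic replacement)."""
    for variant, canonical in synonyms.items():
        text = text.replace(variant, canonical)
    return text


def segment(text, dictionary, max_len=4):
    """Forward maximum matching: take the longest dictionary word at each position,
    falling back to a single character when nothing matches."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def word_frequencies(lines):
    """Run replacement, segmentation, and stop-word filtering, then count words."""
    counts = Counter()
    for line in lines:
        tokens = segment(replace_synonyms(line, SYNONYMS), FORCED_WORDS)
        counts.update(t for t in tokens if t not in STOP_WORDS and not t.isspace())
    return counts


lyrics = ["白云之下清泉流", "万里白云"]  # made-up lines, for illustration only
freq = word_frequencies(lyrics)
print(freq.most_common(3))
```

In practice a full segmentation dictionary or a dedicated segmenter would replace the tiny forced-extraction list used here.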
Keywords are the core of an article, and their frequency reflects its main views. A keyword frequency cloud visualizes the main keywords of a research field: the larger a keyword is drawn, the more frequently it occurs, and the smaller, the less frequently. Word frequency analysis counts how often words occur in a text in order to extract and describe its lexical patterns. Its results reflect the main ideas and knowledge structure of the text.
Commonly used algorithms for word frequency analysis include the shortest path method, the sparse matrix method, the msktuple multivariate sequence method, itemset mining, naive Bayes, recursive neural network models, and term frequency-inverse document frequency (TF-IDF) [23]. Among them, TF-IDF is the most commonly used.
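The term frequency (TF) represents how often a keyword appears in a text. Assume a feature word $t_i$ occurs $n_{i,j}$ times in document $d_j$. Then

$$\mathrm{TF}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$

where the denominator is the total number of word occurrences in $d_j$.

The inverse document frequency (IDF) is calculated as:

$$\mathrm{IDF}_{i} = \log \frac{|D|}{|\{ j : t_i \in d_j \}|}$$

where $|D|$ is the total number of documents in the corpus and $|\{ j : t_i \in d_j \}|$ is the number of documents that contain $t_i$.

TF-IDF is then denoted as:

$$\mathrm{TF\text{-}IDF}_{i,j} = \mathrm{TF}_{i,j} \times \mathrm{IDF}_{i}$$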
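As a rough illustration of TF-IDF weighting, the sketch below assumes the common textbook form: term count divided by document length, multiplied by the natural logarithm of the number of documents over the number of documents containing the term. The toy documents and the function name `tf_idf` are hypothetical, and every document is assumed to be non-empty.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized, non-empty document."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy segmented lyrics (made up for illustration).
docs = [
    ["white_cloud", "wanli", "white_cloud"],
    ["bright_moon", "wanli"],
    ["bright_moon", "hometown", "hometown"],
]
scores = tf_idf(docs)
for i, doc_scores in enumerate(scores):
    print(i, {term: round(w, 3) for term, w in doc_scores.items()})
```

A term that is frequent in one text but rare across the corpus, such as `white_cloud` here, receives a higher weight than a term spread across several texts.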
A semantic network is the network of relationships among the feature words of a text. Formally, it is a set of points together with the relations between them, which forms a graph of mapping relationships and may be directed or undirected. Semantic network analysis is the quantitative study of the mapping relationships in the directed or undirected network formed by these points [24].
Semantic primitives are the basic units of a semantic network. Each can be represented by a triple (node 1, mapping relation, node 2), and one or more such triples form a semantic network, i.e., a directed weighted graph. Assume there exists a triple $(v_i, w_{ij}, v_j)$, where $v_i$ and $v_j$ are nodes (feature words) and $w_{ij}$ is the weight of the mapping relation from $v_i$ to $v_j$.
The core difficulty of semantic network analysis is computing the similarity between nodes. Similarity measures how closely two words or texts are related and takes values in [0, 1]: the larger the value, the closer the connection between the two words or documents, and the smaller, the more distant. Commonly used similarity measures include Euclidean distance, cosine similarity, the Jaccard coefficient, and the Pearson coefficient.
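Two of the similarity measures named above can be sketched in a few lines of Python. The co-occurrence vectors and word sets are hypothetical. For non-negative count vectors the cosine similarity falls in [0, 1], as does the Jaccard coefficient; returning 0.0 for empty inputs is a convention chosen for this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical co-occurrence counts of two words with four context words.
moon = [3, 0, 2, 1]
cloud = [1, 0, 2, 0]
cos = cosine_similarity(moon, cloud)
jac = jaccard_similarity({"moon", "river", "boat"}, {"moon", "river", "hill"})
print(round(cos, 3), jac)
```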
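K-Means is an iterative machine learning algorithm for cluster analysis, mainly applied in the data clustering stage of data mining. Given a data set

$$X = \{x_1, x_2, \ldots, x_n\}$$

and the number $K$ of clusters, the algorithm partitions the data into $K$ disjoint subsets $C_1, C_2, \ldots, C_K$. Each division represents a class $C_k$ with center $\mu_k$, the mean of its members.

The clustering objective is to minimize the total sum of squared distances

$$E = \sum_{k=1}^{K} \sum_{x \in C_k} \lVert x - \mu_k \rVert^{2}$$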
K-Means is an unsupervised learning algorithm: no labels are provided, and the machine groups the data on its own in order to discover inherent patterns. It minimizes the sum of squared distances from every sample in a cluster to that cluster’s center, so that data in the same class are as similar (close) as possible and data in different classes are as dissimilar (far apart) as possible.
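The iteration can be sketched in plain Python as Lloyd's algorithm, the usual way K-Means is solved. The initialization (taking the first k points as centers) and the toy two-dimensional data are assumptions made for this illustration; the paper does not state how centers were initialized.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100):
    """Lloyd's algorithm, using the first k points as initial centers."""
    centers = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = [mean(c) if c else centers[j] for j, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    # Objective value: total within-cluster sum of squared distances.
    sse = sum(squared_distance(p, centers[j]) for j, c in enumerate(clusters) for p in c)
    return centers, clusters, sse


points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centers, clusters, sse = k_means(points, k=2)
print(centers, round(sse, 3))
```

Because the result depends on the starting centers, practical runs usually repeat the procedure with several initializations and keep the partition with the smallest objective value.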
Two ranking-based metrics are used for evaluation. The first is the top-$K$ attribute classification accuracy, P@K:

$$P@K = \frac{1}{K} \sum_{i=1}^{K} y_{r_i}$$

where $y \in \{0, 1\}^{L}$ is the true label vector and $r_1, \ldots, r_K$ are the indices of the $K$ labels with the highest predicted probabilities, in descending order.

The second is the top-$K$ normalized discounted cumulative gain, nDCG@K, which is often used to evaluate the quality of rankings in recommender systems. Because the multi-label classification model predicts a probability between 0 and 1 for each sentiment attribute in a comment, attributes with high predicted values can be assumed to be the ones the user most likely intends to express. If the predicted ranking agrees with what the user actually expressed, the model recognizes each label well, so nDCG@K is also often used to evaluate multi-label classification models:

$$DCG@K = \sum_{i=1}^{K} \frac{y_{r_i}}{\log(i+1)}, \qquad nDCG@K = \frac{DCG@K}{\sum_{i=1}^{\min(K, \lVert y \rVert_0)} \frac{1}{\log(i+1)}}$$

where $\lVert y \rVert_0$ is the number of relevant labels in the true label vector $y$.
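The two metrics can be sketched as follows, assuming the definitions commonly used for multi-label ranking: P@K is the share of true labels among the K highest-scored labels, and nDCG@K divides the discounted gain of that ranking by the gain of an ideal ranking. The label vectors are hypothetical, and base-2 logarithms are used here, which does not affect nDCG because the base cancels in the ratio.

```python
import math


def top_k_indices(scores, k):
    """Indices of the k highest predicted probabilities, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def precision_at_k(y_true, scores, k):
    """Fraction of the top-k predicted labels that are true labels."""
    return sum(y_true[i] for i in top_k_indices(scores, k)) / k


def ndcg_at_k(y_true, scores, k):
    """Discounted gain of the top-k ranking, normalized by the ideal ranking."""
    ranked = top_k_indices(scores, k)
    dcg = sum(y_true[i] / math.log2(pos + 2) for pos, i in enumerate(ranked))
    n_relevant = min(k, sum(y_true))
    ideal = sum(1 / math.log2(pos + 2) for pos in range(n_relevant))
    return dcg / ideal if ideal > 0 else 0.0


y_true = [1, 0, 1, 0, 0]            # hypothetical gold labels
scores = [0.9, 0.8, 0.3, 0.2, 0.1]  # hypothetical predicted probabilities
p2 = precision_at_k(y_true, scores, 2)
n2 = ndcg_at_k(y_true, scores, 2)
print(p2, round(n2, 3))
```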
A two-step pipeline that performs entity recognition followed by emotion classification generates redundant information from unnecessary entities, and errors in the entity recognition module also degrade emotion classification, which greatly increases the difficulty of emotion recognition. This paper therefore trains entity recognition and emotion classification jointly, treats attribute-level emotion classification as a multi-label classification task, and proposes BERT-SLAF, a model that dynamically fuses self-attention information with external label attention information. Model training is divided into four stages, as shown in Fig. 2. In the first stage, the pre-trained BERT model semantically encodes the text. In the second stage, the resulting word vectors are fed into a Bi-LSTM to further strengthen the influence of contextual semantics on them. In the third stage, a self-attention mechanism and an attention mechanism that introduces external label knowledge adaptively capture the correlations between the textual semantics and the product attribute labels, and the two are then fused to strengthen the model’s ability to capture specific attribute labels. The fourth stage uses a multilayer perceptron for multi-label classification.

Figure 2. Framework of the BERT-SLAF model
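Multi-label classification takes the fused attention matrix and passes it through the multilayer perceptron to produce a prediction vector $\hat{y}$. Each element $\hat{y}_i$ of the vector represents the predicted probability for the $i$-th sentiment attribute label.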
Users usually focus on certain attributes when commenting on a product and mention others less often, so different emotional attributes make up different proportions of the comment dataset. Attributes are also expressed both explicitly and implicitly. Features of directly evaluated attributes are easier to learn, while those of implicitly expressed emotional attributes are harder to learn. To help the model learn these harder-to-recognize attribute features accurately, the FocalLoss loss function is chosen to compute the loss between the predicted label $\hat{y}$ and the true label $y$.
FocalLoss is a loss function for class-imbalanced classification. It weights each sample’s loss according to how hard the sample is to discriminate: easy samples receive smaller weights and hard samples receive larger ones. The difference between hard and easy samples lies mainly in classification confidence. Samples whose confidence is close to 1 or close to 0 are usually called easy samples, and the rest are called hard samples. Consider two positive samples: the model assigns the first a probability of 0.9 of being positive and the second a probability of 0.6. Both are correctly classified as positive, but the second clearly contributes more to the total loss. To further raise the share of hard samples in the loss, taking positive samples as an example, FocalLoss introduces the modulation factor $(1 - \hat{y})^{\gamma}$:

$$FL(\hat{y}) = -(1 - \hat{y})^{\gamma} \log \hat{y}$$

The focusing parameter $\gamma \ge 0$ controls how strongly easy samples are down-weighted. When $\gamma = 0$, FocalLoss reduces to the ordinary cross-entropy loss.
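The effect of the modulation factor on the two positive samples mentioned above (predicted probabilities 0.9 and 0.6) can be checked numerically. This sketch assumes the standard focal loss form for a positive sample, cross-entropy multiplied by (1 - p) raised to the focusing parameter, and picks a focusing parameter of 2 purely for illustration; the value used in the paper is not stated.

```python
import math


def cross_entropy(p):
    """Loss of a positive sample predicted with probability p."""
    return -math.log(p)


def focal_loss(p, gamma=2.0):
    """Cross-entropy scaled by the modulation factor (1 - p) ** gamma."""
    return (1.0 - p) ** gamma * cross_entropy(p)


easy, hard = 0.9, 0.6  # the two positive samples from the text
ce_ratio = cross_entropy(hard) / cross_entropy(easy)
fl_ratio = focal_loss(hard) / focal_loss(easy)
print(f"cross-entropy ratio hard/easy: {ce_ratio:.1f}")
print(f"focal-loss ratio hard/easy:    {fl_ratio:.1f}")
```

Under plain cross-entropy the harder sample contributes a few times more loss than the easy one. With the modulation factor that ratio grows by a further factor of (0.4 / 0.1) squared, that is 16, so the hard sample dominates the total loss far more strongly.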
Taking songs set to Tang and Song dynasty poems as an example, this paper mines their lyrics to analyze their emotions and their singing styles in vocal art.
KH Coder was used to analyze word frequencies in the lyric texts. Table 1 lists the 75 most frequent words and phrases in the lyrics of ancient poetry songs. According to the table, Wanli, Baiyun (white cloud), and Chengdu are high-frequency landscape words, with frequencies of 30, 20, and 17, respectively. Jinjiang, Dufu Cao Tang (Du Fu Thatched Cottage), and Jinli are high-frequency historical and cultural sites. The list also contains many evaluative words that express the poets’ inner emotions, such as leisurely, lonely, and missing. Together these reflect the landscape and humanities in ancient poetry art songs, the poets’ praise of scenery, and their feelings about personal experience and the changes of the times.
Table 1. Top 75 high-frequency words in the lyrics
| Ranking | High frequency word | Word frequency | Ranking | High frequency word | Word frequency | Ranking | High frequency word | Word frequency |
|---|---|---|---|---|---|---|---|---|
| 1 | Wanli | 30 | 26 | Air smoke | 9 | 51 | Southern song | 8 |
| 2 | White cloud | 20 | 27 | Green day | 9 | 52 | Tize | 8 |
| 3 | Chengdu | 17 | 28 | Jade base | 9 | 53 | Spring wind | 8 |
| 4 | Old man | 15 | 29 | Spring color | 9 | 54 | White horse | 8 |
| 5 | Dust dust | 14 | 30 | Centenary | 9 | 55 | Land tour | 8 |
| 6 | The world | 12 | 31 | Commentary | 9 | 56 | Today | 8 |
| 7 | General | 12 | 32 | Jiang han | 8 | 57 | Look at | 7 |
| 8 | Where | 12 | 33 | Bright moon | 8 | 58 | Meet | 7 |
| 9 | Yo-Yo | 11 | 34 | Cui wei | 8 | 59 | Universally | 7 |
| 10 | Plain life | 11 | 35 | jiandynasty | 8 | 60 | Day | 7 |
| 11 | Luoyang | 11 | 36 | Ancients | 8 | 61 | Extremes | 7 |
| 12 | Brocade | 11 | 37 | Jianghai | 8 | 62 | Landing | 7 |
| 13 | When | 11 | 38 | The three gorges | 8 | 63 | Cockle | 7 |
| 14 | Yamaguchi | 10 | 39 | Ruzi | 8 | 64 | Works | 7 |
| 15 | Cloud | 10 | 40 | Hometown | 8 | 65 | Tianya | 7 |
| 16 | Loneliness | 10 | 41 | World | 8 | 66 | Entertain | 7 |
| 17 | Court | 10 | 42 | Flock | 8 | 67 | World | 7 |
| 18 | Today | 9 | 43 | Nine days | 8 | 68 | Chiang jiang | 7 |
| 19 | Out of sight | 9 | 44 | Minshan | 8 | 69 | Since then | 7 |
| 20 | Jincheng | 9 | 45 | Trail | 8 | 70 | Long song | 7 |
| 21 | Rain and rain | 9 | 46 | Guan shan | 8 | 71 | Old man | 7 |
| 22 | Emei | 9 | 47 | Aspect | 8 | 72 | Cloud fog | 7 |
| 23 | River side | 9 | 48 | White crane | 8 | 73 | Grass hall | 7 |
| 24 | Road | 9 | 49 | Air current | 8 | 74 | Return | 7 |
| 25 | Xishan | 9 | 50 | Sun moon | 8 | 75 | Jiangshan | 7 |
After word segmentation, the text was imported into the ROST CM6 software for semantic network analysis. The software automatically identifies the strength of semantic association between elements and builds binary relationships, forming a network diagram that reflects the interrelationships of the landscape elements. Based on the semantic information carried by the words and their relative positions in the network diagram, K-Means clustering further groups the lyric word frequencies into 8 clusters, as shown in Fig. 3, with different colors representing different types of vocabulary. A semantic network with a binarized co-occurrence matrix was then constructed and imported into the Ucinet6 software to generate centrality scores for the song elements. The results are shown in Table 2, where “sun and moon” has the highest score, 0.051. The semantic elements of the poems and the classification of landscape words are shown in Table 3.

Figure 3. Semantic network map
Table 2. Centrality scores of the song elements
| Serial number | Element entry | Centrality score | Serial number | Element entry | Centrality score |
|---|---|---|---|---|---|
| 1 | Sun moon | 0.051 | 17 | Green day | 0.020 |
| 2 | White cloud | 0.048 | 27 | Air month | 0.017 |
| 3 | Ba shan | 0.041 | 28 | Qing jiang | 0.017 |
| 4 | Ancients | 0.03 | 29 | Zhong ding | 0.017 |
| 5 | Mountain forest | 0.03 | 30 | Bright moon | 0.013 |
| 6 | Entertain | 0.027 | 31 | Road | 0.01 |
| 7 | Gaping | 0.027 | 32 | Flower creek | 0.01 |
| 8 | Court | 0.024 | 33 | Shu shan | 0.01 |
| 9 | Wanli | 0.024 | 34 | Ba shan | 0.01 |
| 10 | Emissary | 0.024 | 35 | Guan shan | 0.006 |
| 11 | Land tour | 0.02 | 36 | Spring color | 0.006 |
| 12 | Air current | 0.02 | 37 | Jiang han | 0.006 |
| 13 | Wu hou | 0.02 | 38 | Egret | 0.006 |
| 14 | The world | 0.02 | 39 | Universally | 0.006 |
| 15 | Minshan | 0.02 | 40 | Autumn wind | 0.006 |
| 16 | Southern song | 0.02 | 41 | jiandynasty | 0.006 |
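How centrality scores can be derived from a binarized co-occurrence matrix is sketched below. This is not Ucinet6's procedure: the sketch assumes normalized degree centrality, which is only one of several centrality measures, and the vocabulary, toy documents, and binarization threshold are hypothetical. Its values are therefore not comparable to the scores reported in Table 2.

```python
from itertools import combinations


def cooccurrence_matrix(documents, vocabulary):
    """Count how many documents contain each pair of vocabulary words."""
    index = {w: i for i, w in enumerate(vocabulary)}
    n = len(vocabulary)
    matrix = [[0] * n for _ in range(n)]
    for tokens in documents:
        present = sorted({index[t] for t in tokens if t in index})
        for i, j in combinations(present, 2):
            matrix[i][j] += 1
            matrix[j][i] += 1
    return matrix


def binarize(matrix, threshold=1):
    """Turn counts into 0/1 links: 1 when the count reaches the threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in matrix]


def degree_centrality(binary):
    """Share of the other nodes that each node is linked to."""
    n = len(binary)
    return [sum(row) / (n - 1) for row in binary]


vocab = ["sun_moon", "white_cloud", "wanli", "road"]
docs = [
    ["sun_moon", "white_cloud", "wanli"],
    ["sun_moon", "road"],
    ["white_cloud", "sun_moon"],
]
scores = degree_centrality(binarize(cooccurrence_matrix(docs, vocab)))
for word, score in zip(vocab, scores):
    print(word, round(score, 3))
```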
Table 3. Classification of the semantic terms of poetry
| First class | Second class | Tertiary |
|---|---|---|
| Natural landscape | Waterscape | Flower creek |
| | Lake | Jiang han |
| | Landscape | Heshan, Xi shan, Yunshan |
| | River bank | Jinjiang, Jiang bian, Chun jiang, all-round, jiang han |
| | Celestial species | Spring, Cloud, White Cloud, Sun moon, today |
| | Meteorological class | Air, air, cold, white dew, cloud fog |
| | Fauna | None |
| | Plants | Mountain forest |
| Humanistic landscape | Building class | Brocade, Jiang han, Du fu grass hall |
| | Road class | Road |
| | Facilities | Dry, embroidered, high marks |
| | Traffic | Boat boat |
| | Acoustic light | The moon and the moon |
| | Characters | General, Emissary, Tianzi, Ancients, Gitations |
| | Placid | Ba shan, Wu hou, Du fu grass hall |
| Subjective emotion | Words | Prosperous, Yo-Yo, Air month, Entertainment, Hometown, The imperial Chinese, Bright moon |
Overall, ancient poetry songs make good use of landscape, meteorological, and transportation words to express emotion, and the practice of using objects to express one’s aspirations is quite evident.
The software analysis yielded four categories of landscape features in the word-frequency semantics of ancient poetry songs. To extract distinguishing features, the number of word-frequency elements in each category was counted and plotted as a radar chart, shown in Figure 4. The chart makes the word bias of each subcategory within the semantic elements clearly visible, which helps in understanding the distribution of words.

Figure 4. Radar chart of the four categories of features
The global view of Tang and Song poetry songs is shown in Fig. 5 and Fig. 6. In Fig. 5, the horizontal axis shows genre above and period below, with the periods ordered as Early, High (Sheng), Middle, and Late Tang, and each genre is drawn in a different color. Poems of every genre increase in number across the four periods, reaching as many as 4,575, with the growth especially marked in the Early, High, and Middle Tang. In Fig. 6, the horizontal axis again shows genre above and period below, with the periods ordered as Northern Song and Southern Song. Statistical analysis of the three genres of Song lyrics shows that every genre has more works in the Southern Song than in the Northern Song. The gap is clearest for the “small order” (xiaoling) genre, with only 2,254 lyrics in the Northern Song against 4,623 in the Southern Song. The genres of Tang poetry and Song lyrics thus developed over time, leading to the creation of more song works.

Figure 5. Bar chart of Tang poetry genres

Figure 6. Bar chart of Song ci genres
In this paper, tones are counted by genre and plotted as shown in Figure 7. From left to right are the five genres of Tang poetry and the three genres of Song lyrics. The tones are listed below them, and the Chinese characters in the chart label the corresponding tones.

Viewed locally, the fourth tone is the most frequent in every genre and the third tone the least frequent. In most genres the first (level) tone outnumbers the second (rising) tone, whereas in miscellaneous poems the second tone outnumbers the first, which suggests that the first and second tones are about equally frequent in ancient texts overall. Tang poetry and Song lyrics are only part of ancient texts and cannot stand for all of them. In Tang poetry and Song lyrics, which are known for their attention to level and oblique tones, the first and second tones are nearly equal in number, but writers of the Tang and Song dynasties preferred the first tone when composing. Excluding neutral tones, the third tone is the least frequent, and when the sample is sufficiently large it occurs about half as often as the others.

The fourth tone intensifies emotion, the third tone signals a turn, the second tone is a rising tone that leans positive, and the first tone is more soothing and calm. The third tone is also the hardest to pronounce, and Tang poems were meant to be read aloud, with emotion expressed through the reading. The fourth tone is the falling tone: producing it involves an exhaling movement of the mouth. Physically this is an exhalation of breath, and psychologically it is an expression of emotion through that breath.
The charts also show that only heptameter (seven-character) poems and miscellaneous poems contain neutral tones, because the neutral tone was originally a supplement to the four traditional tones.
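The kind of tone tally described above can be sketched for a single poem. The example assumes modern Mandarin readings written as pinyin with tone numbers (1 to 4, with 5 for the neutral tone) and uses Li Bai's "Jing Ye Si" as input. How the paper annotated tones is not stated, so this is only an illustration, and one short poem cannot reproduce the corpus-level distribution.

```python
from collections import Counter

# Modern Mandarin pinyin with tone numbers for Li Bai's "Jing Ye Si"
# (5 would mark the neutral tone; none occurs in this poem).
poem = (
    "chuang2 qian2 ming2 yue4 guang1 "
    "yi2 shi4 di4 shang4 shuang1 "
    "ju3 tou2 wang4 ming2 yue4 "
    "di1 tou2 si1 gu4 xiang1"
)


def tone_counts(pinyin_text):
    """Count syllables per tone from the trailing digit of each syllable."""
    return Counter(int(syllable[-1]) for syllable in pinyin_text.split())


counts = tone_counts(poem)
for tone in range(1, 6):
    print(f"tone {tone}: {counts.get(tone, 0)}")
```

Grouping such counts by genre and period would give the per-genre tone distribution that the chart summarizes.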

Figure 7. Tone line diagram of Tang and Song poems
Ancient poetry songs are sung mainly in four styles: chanting, ethnic style, American style, and pop style. This paper integrates these four styles to build a model of singing styles for ancient poems within vocal art, providing a deeper understanding of both the singers and the songs.
Chanting. “Chanting”, that is, singing and reciting, is a singing style unique to ancient Chinese poems, and to a certain extent the two go hand in hand. The “Classic of Poetry”, the “Chu Ci”, Tang poetry, and Song lyrics all originally had their own tunes, and the ancients sang poems aloud as they composed them. This singing was often improvised and needed no musical instruments.

Americanized singing style. The Americanized singing style is an exploration and innovation by the creators of Chinese ancient poetry songs. It builds on traditional ancient Chinese poetry and combines Western music with traditional Chinese music to interpret the poems and express emotion.

Ethnic style. Ethnic-style Chinese poetry songs emphasize the original flavor of Chinese poetry songs and focus on conveying the overall “flavor” of Chinese poetry. Above all, singers should understand the background in which a poem was written and feel the emotion the author expresses.

Pop style. Pop music, also known as “popular music”, is usually short and concise in structure, easy to understand, sincere and touching in emotion, widely accepted, and widely sung. With rapid economic development, fast-changing science and technology, and the resulting acceleration of social change, pop music increasingly meets people’s current aesthetic needs because of its popularity. Pop-style Chinese ancient poetry songs, like American-style ones, are based on traditional ancient poetry. The difference is that the former incorporate the singing style of pop music, which makes them more down-to-earth and more readily accepted by the public than the latter.
Based on text mining, this article uses KH Coder to analyze word frequencies in Chinese ancient poetry songs, and applies K-Means clustering and semantic analysis to categorize the word-frequency types. On this basis, vocal art songs set to Tang and Song poems were selected for emotional analysis. The songs were found to favor words for landscape and weather to express emotion and to use objects to express aspirations. The number of Tang and Song poems increased over time. In terms of tone, the fourth tone is the most frequent, followed by the first and second tones, which are comparable, with the first slightly ahead; the third tone comes next, and the neutral tone is the least frequent. Pronouncing the fourth tone is accompanied by breath exhaled from the mouth, which psychologically expresses emotion. Finally, based on the singing styles of ancient poetry vocal art songs, a singing style model is constructed by integrating four styles, namely chanting, ethnic style, American style, and pop style, which facilitates analysis of the emotions of the singer and the song.
