Research on the Role of Digital Media Technology in the Cultural Inheritance of Fujian-Taiwan Folk Songs

Minnan folk songs originated in the Minnan region of southern Fujian; in Taiwan they are called Taiwanese songs (Putian songs are sung in the Putian area of southern Minnan). They are very popular in Quanzhou, Xiamen, and Zhangzhou and are sung by men and women of all ages. Fujian and Taiwan have been culturally connected since ancient times, and the local Han Chinese in Taiwan have long used the Minnan language, so Minnan folk songs are also Min-Taiwanese folk songs. They are important symbols of the national culture, and digital media technology can benefit their inheritance and development [1–4].

Digital media technology is one of the most advanced communication technologies today and one of the most important components of the information age. Digital media technology is gradually penetrating our daily lives, and we use it almost all the time. With the continuous innovation and application of digital media technology, it has become an indispensable part of cultural heritage [5–8]. With its powerful functions and excellent forms of expression, digital media technology has the characteristics of visualisation, interactivity and omni-directionality, which can effectively convey cultural information, improve the efficiency and quality of cultural inheritance, and provide a new way for cultural inheritance and innovation [9–11]. Digital technology can digitally preserve cultural heritage for better protection and inheritance. Digital cultural heritage can be disseminated in a wider range, bringing people a more convenient cultural experience. Digital media technology can also provide new ways of cultural innovation, injecting new vitality into traditional media culture [12–15].

Literature [16] introduces China’s national culture, including folk music, dance, painting, etc., and the inheritance and development of these cultures need to be guided by the government to pay attention to the whole population. In this process, the integration of digital media technology and national culture and art is an important way to achieve the inheritance and development of national culture. Literature [17] points out that traditional music is an important part of Chinese national culture, but people’s attention to traditional music is gradually weakened in the new era. The current situation of the development of traditional music in the digital era is examined, and solution strategies are proposed to address its problems. Literature [18] not only emphasises the importance of primitive folk songs and the challenges they face in a new era but also elaborates on the status of primitive folk songs in contemporary society, their cultural value, and the ways to protect and pass them on, so as to ensure a better development of the cultural heritage of primitive folk songs. Literature [19] enumerated the role played by new media technology in the inheritance and development of traditional cultural heritage and emphasised that the application of multimedia technology is an innovative direction for the protection and inheritance of traditional cultural heritage, which can realise multimedia interactive products of user experience. Literature [20] indicated that “One Belt, One Road” is an important platform for the inheritance and development of Chinese folk songs, in which context Chinese folk songs should comply with the country’s development strategy to establish innovative forms of expression, thus promoting the enrichment and optimisation of traditional music culture. Suggestions such as customising multi-angle communication strategies and integrating domestic and international music elements are put forward. 
Literature [21] explored the development status of Chinese traditional music in the context of big data based on data analysis and constructed a database for the protection of traditional music. The results of the study show that information media technology and social changes have led to the decline of traditional music, and its survival environment is facing many challenges. Therefore, it is imperative to take advantage of the opportunities and platforms brought by the information age to promote the development of traditional music.

Literature [22] discusses the inheritance and innovation of traditional culture in the new media environment. A comprehensive introduction to traditional culture and its development was made using a combination of qualitative and quantitative analyses. Measures for the innovation and development of traditional culture in the new media environment are proposed, emphasising the important role of new media in promoting the inheritance and development of traditional culture. Literature [23] examined the modernisation of traditional cultures in southeastern coastal areas and the role played by information technology in preserving these traditional cultures. It was found that the modernisation of traditional culture is mainly reflected in the popularisation of music and the modernisation of communication methods. Literature [24] analysed the digital communication channels of the music of the quartet and found that the digital communication of the music of the quartet showed advantages in pitch, rap combinations, rhythmic changes, etc., but at the same time, there are also drawbacks such as insufficient talents of music of the quartet, monotonous communication methods, etc., to which optimization methods, such as digital media technology and self-media, are proposed to promote the development of the music of the quartet. Literature [25] analyses the opening ceremony of the Beijing Olympic Games and the Chinese Pavilion of the Shanghai World Expo and explores the inheritance and realization of multimedia art design based on traditional cultural elements using computer multimedia technology. Literature [26] introduces that digital music is an important part of promoting the construction of digital countryside, based on the function of music carrying cultural information, which can realise cultural inheritance. Therefore, digital music empowers rural civilisation to help promote rural economic development and modernisation. 
Literature [27] elucidates the advantages of using new media channels to fulfill the dissemination of ethnic minority music, as well as the inheritance and development of ethnic minority music under new media channels. The case study reveals that new media technology has great potential in promoting the field of minority music.

This paper analyzes the inheritance and development trend of Fujian-Taiwan folk song culture using a genetic algorithm. The people who are already familiar with Fujian-Taiwan folk song culture and the creators of Fujian-Taiwan folk songs are taken as the “infectious source”, and an infectious disease model is used to construct the inheritance model of Fujian-Taiwan folk song culture. The model parameters are solved using the fitness function and selection operation of the genetic algorithm, and the optimized parameters are substituted into the equation system to obtain the inheritance trend of Fujian-Taiwan folk song culture with and without digital media technology support. Based on digital media technology, a Hidden Markov Model (HMM) is then used to design an automatic creation model for Fujian-Taiwan folk songs. Automatic creation improves the efficiency of composing Fujian-Taiwan folk songs and increases the probability that the culture is transmitted, thus providing an efficient new form of inheritance. Ten Min-Taiwan folk songs are selected as sample songs for model training, and their lyrics, melodies, and other information are processed. The HMM is used to describe the state transition process of unknown hidden states given known observation states, and the Viterbi algorithm is used to solve the decoding problem of the HMM. On the given dataset, the HMM is used to compose Min-Taiwan folk songs automatically, and its results are compared with those of other folk song creation models to show the performance of this paper’s model in Min-Taiwan folk song creation.

2

Trend of cultural inheritance of Fujian-Taiwan folk songs based on genetic algorithm

In order to explain the importance of the inheritance of Fujian-Taiwan folk song culture, this section will analyse the trend of the cultural inheritance of Fujian-Taiwan folk songs based on a genetic algorithm. Cultural inheritance is transmitted between generations, similar to the spread of infectious diseases, so this section firstly constructs the cultural inheritance model of Fujian-Taiwan based on the infectious disease model, then solves the model parameters using genetic algorithms, and finally analyses the trend of cultural inheritance of Fujian-Taiwan folk songs based on the solving results.

2.1

Establishment of the cultural inheritance model of Fujian-Taiwan folk songs

2.1.1

Modelling of infectious diseases

The SIR model [28] is one of the most classical infectious disease models. Its main objective is to predict the numbers of uninfected, infected, and recovered persons at different points in time after the onset of an infectious disease. The model adds a recovered compartment R to the SI model, and its main parameters are the infection rate β, the recovery rate γ, and the total population N. Here, β denotes the number of susceptible persons to whom an infected person can transmit the virus per unit of time, while γ denotes the probability of an infected person recovering per unit of time. N denotes the total population, i.e., the sum of the susceptible S, the infected I, and the recovered R. The SIR model can be expressed as equations (1), (2), and (3):
$\frac{d S}{d t} = - β \frac{S I}{N}$ (1)
$\frac{d I}{d t} = β \frac{S I}{N} - γ I$ (2)
$\frac{d R}{d t} = γ I$ (3)

Equation (1) describes the change in the number of susceptible persons S over time t, which is influenced by the infection rate β, the number of susceptible persons S, and the number of infected persons I. Equation (2) expresses the change in the number of infected persons I over time t, which is influenced by the infection rate β, the number of susceptible persons S, the number of infected persons I, and the recovery rate γ. Equation (3) describes the change in the number of recovered persons R over time t, which is influenced by the recovery rate γ and the number of infected persons I.
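To make the dynamics of the SIR model concrete, the following is a minimal sketch that integrates the three SIR equations with a simple forward Euler scheme. The step size `dt`, the function name `simulate_sir`, and the parameter values are illustrative assumptions and do not come from this paper.

```python
def simulate_sir(beta, gamma, s0, i0, r0, steps, dt=0.1):
    """Integrate the SIR equations with forward Euler steps (illustrative only)."""
    n = s0 + i0 + r0  # total population N, constant over time
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt  # flow S -> I
        new_recoveries = gamma * i * dt         # flow I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical parameter values, chosen only to exercise the code.
trajectory = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, steps=1000)
final_s, final_i, final_r = trajectory[-1]
print(round(final_s + final_i + final_r, 6))  # S + I + R stays equal to N
```

Because every person leaving one compartment enters another, the sum S + I + R remains equal to N at each step, which is a convenient sanity check for any implementation.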

2.1.2

Cultural heritage model of Fujian-Taiwan folk songs

In this paper, we take the local people in the Fujian-Taiwan area as the people who are familiar with the culture of Fujian-Taiwan folk songs. Building on the SEIA model [29], ignoring the birth rate and death rate for the time being, taking the people who are already familiar with the culture of Fujian-Taiwan folk songs and the creators of Fujian-Taiwan folk songs as the “contagious source”, and introducing the technical reference factors u₁ and u₂, we construct the model of the cultural inheritance of Fujian-Taiwan folk songs as equation (4):
$\begin{cases} \frac{d S (t)}{d t} = - α S (t) E (t) + β E (t) u_{1} + (1 - ν) I (t) u_{2} \\ \frac{d E (t)}{d t} = α S (t) E (t) - E (t) u_{1} - η E (t) \\ \frac{d I (t)}{d t} = η E (t) - I (t) u_{2} \\ \frac{d A (t)}{d t} = (1 - β) E (t) u_{1} + ν I (t) u_{2} \end{cases}$ (4)
Where: S(t) is the proportion of the national population that is interested in Fujian-Taiwan folk song culture. E(t) is the proportion of the national population that is familiar with the culture of Fujian-Taiwan folk songs. I(t) is the ratio of the number of Min-Tai folk song creators and other inheritors of Min-Tai folk song culture to the national population. A(t) is the ratio of the number of opponents of cultural protection to the national population. α is the contact probability between each person familiar with the culture of Min-Tai folk songs and those interested in it. β is the probability that people who are familiar with the culture of Min-Tai folk songs do not actively spread it. η is the probability that a person who is familiar with Min-Tai folk song culture becomes a Min-Tai folk song creator. ν is the probability that creators of Fujian-Taiwan folk songs become opponents of cultural protection. u₁ is the reference factor for people familiar with the culture of Min-Tai folk songs becoming inheritors of that culture with the technical support of digital media. u₂ is the reference factor for the decline of Min-Tai folk song culture under digital media technology support.

Here u₁ and u₂ are assumed values chosen according to the support that existing digital media technology provides for traditional culture. The parameters α, β, η, and ν are unknown, and this paper applies a genetic algorithm to solve for these parameters of the mathematical model of Min-Tai folk song culture inheritance.
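As an illustration of how the four-compartment inheritance model in equation (4) can be evaluated numerically, the sketch below integrates it with forward Euler steps. The parameter values, the initial proportions, the step size, and the function name `simulate_seia` are hypothetical; they are not the fitted values from this paper.

```python
def simulate_seia(alpha, beta, eta, nu, u1, u2, state0, steps, dt=0.01):
    """Forward Euler integration of the S, E, I, A system in equation (4)."""
    s, e, i, a = state0
    history = [tuple(state0)]
    for _ in range(steps):
        ds = -alpha * s * e + beta * e * u1 + (1 - nu) * i * u2
        de = alpha * s * e - e * u1 - eta * e
        di = eta * e - i * u2
        da = (1 - beta) * e * u1 + nu * i * u2
        s, e, i, a = s + ds * dt, e + de * dt, i + di * dt, a + da * dt
        history.append((s, e, i, a))
    return history


# Hypothetical proportions and parameters, used only to run the sketch.
initial = (0.6, 0.3, 0.05, 0.05)
run_small_u = simulate_seia(0.8, 0.3, 0.2, 0.1, u1=0.05, u2=0.05,
                            state0=initial, steps=2000)
run_large_u = simulate_seia(0.8, 0.3, 0.2, 0.1, u1=0.5, u2=0.5,
                            state0=initial, steps=2000)
print(run_small_u[-1])
print(run_large_u[-1])
```

The four right-hand sides of equation (4) sum to zero, so S + E + I + A stays constant; checking this after a run is a quick way to catch a transcription error in the equations.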

2.2

Genetic algorithm to solve the model parameters

2.2.1

Overview of genetic algorithms

The genetic algorithm replaces the natural biological population with a population of encoded individuals, uses the fitness function as the criterion for judging superiority and inferiority, and replaces natural genetic mechanisms with operations on the encoded data, including selection, crossover, and mutation. By randomly recombining the encodings according to these rules, the genes of the whole population are continuously optimized so that they approach the optimal solution of the problem.

The principle of the genetic algorithm is: first of all, the algorithm establishes an initial population, which represents a solution set of the problem. The population can be generated either by random generation or through the experience of experts on different problems. Each individual in the population has a genetic code that represents a solution to the problem.

2.2.2

Solving for model parameters

1)

Calculation of fitness

An individual's fitness value determines whether it survives or is eliminated, so the fitness function directly determines the evolutionary direction of the population; in accordance with the principle of survival of the fittest, the adaptability of each individual must be evaluated. The fitness function is the function that evaluates this adaptability. In this paper, we choose $\sum_{j = 1}^{n} φ_{j}^{2} = \sum_{j = 1}^{n} {(φ (i_{j}) - i_{j})}^{2}$ and $\sum_{j = 1}^{n} φ_{j}^{2} = \sum_{j = 1}^{n} {(φ (a_{j}) - a_{j})}^{2}$ as the fitness functions, i.e., the residual sum of squares between the estimated and actual numbers of people who are familiar with the folk song culture of Fujian-Taiwan, and the residual sum of squares between the estimated and actual numbers of opponents of the protection of Fujian-Taiwan culture. The parameters α, β, η, and ν are taken to be the values at which the fitness function is smallest.

2)

Selection

The selection operation picks, with a certain probability, the individuals with higher fitness values in the population and keeps them to form the offspring for the next generation, while individuals with lower fitness values are eliminated, reflecting the principle of survival of the fittest. The selection method used in this paper is roulette selection, which works like a roulette wheel: the probability of each individual being selected is directly proportional to its fitness value.

The selection strategy is implemented by generating a random number r (r ∈ [0,1]); if P₁ + P₂ + ⋯ + P_{i−1} < r ≤ P₁ + P₂ + ⋯ + P_i, then individual i is selected. Clearly, the higher an individual's selection probability, the more likely it is to be selected and retained.
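The fitness calculation and roulette selection described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and because the residual sum of squares is minimized while roulette selection favours larger values, the sketch converts each residual into a score with 1 / (1 + RSS), a conversion that is assumed here and is not given in the paper.

```python
import random
from itertools import accumulate


def residual_sum_of_squares(estimated, observed):
    """Sum of squared differences between model estimates and survey values."""
    if len(estimated) != len(observed):
        raise ValueError("sequences must have the same length")
    return sum((est - obs) ** 2 for est, obs in zip(estimated, observed))


def selection_probabilities(scores):
    """Normalise non-negative scores so that they sum to one."""
    total = sum(scores)
    return [score / total for score in scores]


def roulette_select(probabilities, r=None):
    """Return index i with P_1+...+P_(i-1) < r <= P_1+...+P_i."""
    if r is None:
        r = random.random()
    for index, upper in enumerate(accumulate(probabilities)):
        if r <= upper:
            return index
    return len(probabilities) - 1  # guard against floating-point round-off


# Hypothetical residuals of three candidate parameter sets.
rss_values = [residual_sum_of_squares([1, 2, 3], [1, 2, 5]), 1.0, 0.0]
scores = [1.0 / (1.0 + rss) for rss in rss_values]  # assumption: smaller RSS is better
probabilities = selection_probabilities(scores)
chosen = roulette_select(probabilities)
```

Passing an explicit `r` makes the selection deterministic, which is useful for checking the cumulative-probability rule by hand.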

2.3

Analysing the trend of cultural inheritance of folk songs in Fujian and Taiwan

The choice of fitness function in the genetic algorithm ensures that the parameters of the Min-Tai folk song culture inheritance model are consistent with the survey statistics. The values of α, β, η, and ν obtained after parameter optimisation are substituted into the equation system. Taking u₁ = u₂ = 1 to represent the absence of any technical support, the development trend of the number of people who are familiar with the culture of Fujian-Taiwan folk songs is shown in Fig. 1. As can be seen from the figure, without technical support the number of people who are familiar with the culture of Min-Tai folk songs gradually decreases over time.

To examine how strongly digital media technology influences the protection and inheritance of Min-Tai folk song culture, different values are taken for the reference factors u₁ and u₂, and the resulting development trend of the inheritance of Min-Tai folk song culture is shown in Fig. 2. The figure shows that the proportion of the population familiar with the culture of Fujian-Taiwan folk songs does not change significantly before the 15th round of inheritance. The technical reference factors u₁ and u₂ represent different strengths of digital media technical support for Fujian-Taiwan folk song culture. This paper analyses two values, 0.5 and 0.05, which differ by a factor of 10 and therefore reflect a large gap in the strength of digital media technical support. With stronger digital media technical support, i.e., when u₁ = u₂ = 0.05, the growth in the number of people familiar with the folk song culture of Fujian and Taiwan arrives earlier and the maximum is larger. With weaker digital media technical support, i.e., when u₁ = u₂ = 0.5, the growth arrives later and the maximum is smaller. This indicates that digital media technology has a significant impact on the inheritance of Fujian-Taiwan folk song culture and can effectively promote its dissemination, and that only stronger digital media technology support can keep Fujian-Taiwan folk song culture from dying out.

3

Cultural Inheritance Methods of Fujian-Taiwan Folk Songs Based on Digital Media Technology

From the previous analysis, it can be seen that the inheritance of Fujian-Taiwan folk song culture needs the support of digital media technology. In order to protect the culture of Fujian-Taiwan folk songs so that it will not go into extinction, in this section, the Hidden Markov Algorithm will be used to design the automatic creation model of Fujian-Taiwan folk songs based on digital media technology. Adopting the digital method to automatically generate new Min-Tai folk songs based on the constructed sample library of Min-Tai folk songs can efficiently improve the creation speed and quality of Min-Tai folk songs, increase the probability of Min-Tai folk songs’ dissemination, and thus promote the dissemination of Min-Tai folk song culture, and ultimately realise the inheritance of Min-Tai folk song culture.

3.1

Sample song collection and organisation

Since there is no authoritative expert database or dataset for Fujian-Taiwan folk songs, this paper collects the required sample songs through expert consultation and literature research. At present, the collection consists of ten representative and widely disseminated Min-Taiwan folk songs from traditional Min-Taiwan music, taken from “Selected Min-Taiwan Folk Songs”, and the sample songs are stored as sheet music images in JPG format. Table 1 lists the folk songs from Fujian-Taiwan gathered in this paper. Because the songs in the table are in sheet music format, they contain relatively rich information, whereas the research in this paper uses only the melody for analysis and folk song counting. At the same time, to facilitate the analysis of the samples, the other, secondary information in the sample files also needs to be processed. The items processed are mainly the lyrics, melody, key, and related musical information.

Table 1.

Sample song set

Number	Song name
1	Working hard can be win
2	Just be happy
3	A gust of love
4	The mood of a ronin
5	Other world
6	Heartbreak hotel
7	Floating brothers
8	Alang
9	A thousand mistakes
10	Empty laughing dream

3.1.1

Lyrics processing

In the score of any vocal song, the lyrics take up a relatively large portion, including words with no real meaning, such as exclamations and padding syllables. The same melody usually carries more than one line of lyrics, and this is also the case for Min-Tai folk songs. Since the melody fragment generation experiments in this chapter do not require lyrics, the short scores of the sample set are processed by removing the lyrics and retaining the melody. After processing, only the melody of each sample song is retained, and it is converted to a pentatonic score so that the morphological characteristics of the melody are displayed more intuitively than in the original sheet music.

3.1.2

Melodic processing

A melody is composed of tones with various properties arranged in a certain order. Generally speaking, the tones in a melody are all musical tones, whose properties are pitch, loudness, duration, and timbre. These four properties occupy an important position in music theory, especially pitch and duration, which constitute the “skeleton and blood” of music.

In Fujian-Taiwan folk songs, some songs may consist of two voices due to the existence of duets and instrumental accompaniment. The sample song’s melody is comprised of both the main melody and the accompaniment melody. When dealing with the melody, the accompaniment melody and the secondary melody in the antiphonal singing are removed, and only the main melody of the song is retained, which is able to better reflect the relationship between the notes of the song, and is conducive to the subsequent analysis of the song. In addition, most of the music in Fujian and Taiwan is in a two-part form, with the second part repeating the first part. To avoid redundant melodic information, the repetition of passages is omitted during melodic processing, so as to analyze and obtain more accurate information.

3.1.3

Key processing

The sample song collection contains many pieces in different keys. Analysing pieces in different keys directly would lead to inaccurate analysis results and thus affect the subsequent folk song compositions. Therefore, it is necessary to unify the keys of the songs into C natural major and A natural minor. In addition, in terms of mode, Fujian-Taiwan folk songs belong to the Chinese national modes, which can be identified by the mode judgement method used in musicology. Other modes that appear in very few songs are temporarily excluded from consideration, and the sample songs that do not already satisfy the conditions are transposed. The transposition is assisted by Sibelius software. After all the sample songs have been transposed, the information of the songs is more unified, which is conducive to the subsequent creation of folk songs.
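The paper carries out the transposition in Sibelius; the following sketch only illustrates the underlying arithmetic of moving a melody into a common reference key. It assumes, hypothetically, that each melody is available as a list of MIDI note numbers (where 60 is middle C) together with the pitch class of its tonic (0 for C, 2 for D, and so on). The function name and data layout are not from the paper.

```python
def transpose_to_reference(midi_pitches, tonic_pitch_class, is_minor=False):
    """Shift a melody so its tonic becomes C (major) or A (minor)."""
    target = 9 if is_minor else 0  # pitch class of A or of C
    shift = (target - tonic_pitch_class) % 12
    if shift > 6:
        shift -= 12  # prefer the smaller move, e.g. down 2 rather than up 10
    return [pitch + shift for pitch in midi_pitches]


# D major fragment (D, E, F sharp) moved to C major (C, D, E).
in_c_major = transpose_to_reference([62, 64, 66], tonic_pitch_class=2)
# E minor fragment (E, G, B) moved to A minor (A, C, E).
in_a_minor = transpose_to_reference([64, 67, 71], tonic_pitch_class=4, is_minor=True)
```

Applying the same shift to every note preserves all intervals, so the melodic contour is unchanged and only the absolute pitch level is unified.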

3.1.4

Expression of musical score information

In this section of the study, common musical information is represented in a form that an algorithm can recognise. The representation mainly covers the pitch and the time value of each note. A textual approach is adopted for pitch: each note is expressed by its pitch name together with its octave-group name, e.g., {c, d, e}. Time values are expressed as fractions. With a quarter note as one beat, 1 means a whole note, 1/2 means a half note, 1/4 means a quarter note…
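One possible way to hold this textual representation in a program is sketched below, pairing each pitch name with an exact fractional time value. The tuple layout and the helper name `encode_note` are assumptions made for illustration; the paper does not specify a data structure.

```python
from fractions import Fraction


def encode_note(pitch_name, time_value):
    """Pair a textual pitch name with an exact fractional time value."""
    return (pitch_name, Fraction(time_value))


# A hypothetical one-bar fragment: two quarter notes and one half note.
melody = [
    encode_note("c", "1/4"),
    encode_note("d", "1/4"),
    encode_note("e", "1/2"),
]
total_time_value = sum(time_value for _, time_value in melody)
print(total_time_value)  # 1, i.e. the fragment fills one whole note
```

Using `Fraction` rather than floating-point numbers keeps time values exact, so sums such as 1/4 + 1/4 + 1/2 compare equal to 1 without rounding error.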

3.2

Hidden Markov-based model of folk song composition

3.2.1

Hidden Markov Models

In an HMM, states are not directly visible, but certain variables affected by the states are [30]. Each state has a probability distribution over the possible output symbols, so the sequence of output symbols can reveal something about the sequence of states. Thus, an HMM contains two probability matrices: the transition probability matrix between the states themselves and the emission probability matrix that governs which symbol is generated in each state. An HMM contains the following five parameters:

1)

The set of hidden states is given by equation (5): $X = \{x_{1}, x_{2}, \ldots, x_{n}\}$ (5) where n denotes the number of all possible states.

2)

The set of observation symbols is given by equation (6): $Y = \{y_{1}, y_{2}, \ldots, y_{m}\}$ (6) where m denotes the number of possible observation symbols corresponding to each state.

3)

The state transition matrix is given by equations (7) and (8): $A = \{a_{i j}\}$ (7) $a_{i j} = p (q_{t + 1} = x_{j} | q_{t} = x_{i}), 1 \leq i, j \leq n$ (8) where q_t denotes the state held at moment t.

4)

The emission matrix, i.e., the observation probability matrix, is given by equations (9) and (10): $B = \{b_{i} (k)\}, 1 \leq k \leq m, 1 \leq i \leq n$ (9) $b_{i} (k) = p (o_{t} = y_{k} | q_{t} = x_{i})$ (10) where o_t denotes the observation emitted at moment t while the model is in state x_i.

5)

The initial state distribution is given by equations (11) and (12): $π = \{π_{i}\}, 1 \leq i \leq n$ (11) $π_{i} = p (q_{1} = x_{i})$ (12)

An HMM can describe the state transition process of unknown hidden states given known observation states. That is, when the parameters of the model are known and an observation sequence O = o₁, o₂, …, o_T is given, the task is to select the most probable state sequence Q = q₁, q₂, …, q_T. This is the decoding problem of the HMM. In this paper, the Viterbi algorithm is used to solve it.

3.2.2

Viterbi algorithm

The Viterbi algorithm is a dynamic programming algorithm [31]. It is often applied to the problem of decoding Hidden Markov Models to find the implied state sequence that is most likely to generate the observed state sequence based on the known observed state sequence.

Let the state space be X, the probability of the initial state x_i be π_i, the state transition probability matrix be A with entries a_{i,j}, the emission probability matrix be B, and the observed output be o₁, o₂, …, o_T. Then the most probable sequence of states q₁, q₂, …, q_T generating the observations can be obtained recursively by Equation (13) and Equation (14):
$V_{1, x_{j}} = P (o_{1} | x_{j}) \cdot π_{j}$ (13)
$V_{t, x_{j}} = P (o_{t} | x_{j}) \cdot \max_{x_{i} \in X} (a_{i, j} \cdot V_{t - 1, x_{i}})$ (14)

The quantity V_{t,x} in Eq. (14) is the probability of the most probable state sequence that accounts for the first t observations and has x as its final state. The Viterbi path can be recovered by saving back pointers that record which previous state was used in the maximisation of Eq. (14). To this end, a function Ptr(x, t) is defined that returns the previous state used to compute V_{t,x} if t > 1, or x itself if t = 1. This leads to Eq. (15):
$\begin{cases} x_{T} = \arg \max_{x \in X} (V_{T, x}) \\ x_{t - 1} = \mathrm{Ptr} (x_{t}, t) \end{cases}$ (15)

Then, according to the Viterbi algorithm, the most probable implicit state can be inferred using the known observed states.
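The recursion and backtracking described in this subsection can be sketched as follows. This is a generic Viterbi decoder, not the authors' code. In the toy usage, the hidden states are two pitch names and the observations are time values, mirroring the rhythm-as-observation and melody-as-hidden-state design of this paper, but all probabilities are invented for illustration.

```python
def viterbi(observations, states, start_prob, trans_prob, emit_prob):
    """Return the most probable hidden state path and its probability."""
    # best[t][x]: probability of the best path that explains the first
    # t + 1 observations and ends in state x.
    best = [{x: start_prob[x] * emit_prob[x][observations[0]] for x in states}]
    back = [{}]  # back[t][x]: previous state on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for current in states:
            prev = max(states,
                       key=lambda p: best[t - 1][p] * trans_prob[p][current])
            best[t][current] = (emit_prob[current][observations[t]]
                                * trans_prob[prev][current] * best[t - 1][prev])
            back[t][current] = prev
    last = max(states, key=lambda x: best[-1][x])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])  # follow back pointers
    path.reverse()
    return path, best[-1][last]


# Toy model with made-up probabilities (not estimated from the sample songs).
states = ["c", "g"]
start_prob = {"c": 0.6, "g": 0.4}
trans_prob = {"c": {"c": 0.7, "g": 0.3}, "g": {"c": 0.4, "g": 0.6}}
emit_prob = {"c": {"1/4": 0.5, "1/8": 0.4, "1/2": 0.1},
             "g": {"1/4": 0.1, "1/8": 0.3, "1/2": 0.6}}
rhythm = ["1/4", "1/8", "1/2"]
melody_path, probability = viterbi(rhythm, states, start_prob,
                                   trans_prob, emit_prob)
print(melody_path)  # ['c', 'c', 'g']
```

For long sequences the products of probabilities become very small, so practical implementations usually work with sums of log-probabilities instead; the sketch keeps plain products to stay close to Equations (13) and (14).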

3.2.3

HMM authoring model design

Rhythm is the backbone of music: even if a melody ebbs and flows gracefully, it becomes disorganised without the right rhythm as a foundation. An HMM offers a good way to consider rhythm and melody jointly. In this paper, rhythm is treated as the observed state of the HMM, and melody is treated as the hidden state. First, a rhythm sequence is generated. Then, based on this observation sequence, the Viterbi algorithm is applied to generate a new melodic sequence.

In order to capture as much of the internal structure of musical knowledge as possible with a first-order HMM, this paper learns melodic elements and individual notes simultaneously. A melodic element is defined as a musical fragment that has more than two occurrences in a piece of music and is between two and five notes in length. Each occurrence of a note is treated as a root node and queried downwards until a different note occurs in all branches of that root node. By this method, all eligible melodic elements in the music to be learned can be extracted.
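A simplified way to find such repeated fragments is to count every contiguous run of two to five notes and keep the frequent ones. Note that this n-gram counting is a stand-in for the root-node query described above, not a reimplementation of it, and that the occurrence threshold `min_count` is left as a parameter because the exact threshold is an assumption here.

```python
from collections import Counter


def extract_melodic_elements(notes, min_len=2, max_len=5, min_count=2):
    """Count contiguous note fragments and keep those that repeat."""
    counts = Counter()
    for length in range(min_len, max_len + 1):
        for start in range(len(notes) - length + 1):
            counts[tuple(notes[start:start + length])] += 1
    return {fragment: count for fragment, count in counts.items()
            if count >= min_count}


# Hypothetical note sequence in which the figure c-d-e appears twice.
notes = ["c", "d", "e", "c", "d", "e", "g"]
elements = extract_melodic_elements(notes)
print(elements)
```

In this toy sequence the fragments (c, d), (d, e), and (c, d, e) each occur twice and are returned; a stricter threshold such as `min_count=3` would return nothing.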

4

Experiments and Analyses of Min-Tai Folk Song Composition

The experiments in this paper are implemented in Python 3.6.1, and the server is configured with an Intel Xeon E5-2603 v2 1.8 GHz x 8 processor, 16 GB of RAM, and an NVIDIA 1080Ti GPU with 12 GB of video memory. The experimental dataset is the collection of 10 Min-Taiwanese folk song samples, which is divided into a training set and a test set at a ratio of 6:4.

4.1

Reconstruction performance analysis

The reconstruction loss reflects the decoding ability of a model, i.e., whether the model can restore the real samples from the latent variable space, and it is an important indicator for variational autoencoders. Therefore, this paper designs a comparative experiment on reconstruction performance in which each model predicts the 2D sequence at the next time step from the 2D sequence at the previous time step in the given dataset, and the reconstruction loss of the prediction is calculated. The convergence of the model proposed in this paper is compared with that of the baseline models under the same parameter settings. Figure 3 shows the convergence of the reconstruction loss of each model over 1000 rounds of training. The reconstruction losses of the VAEGAN model and the VAEGAN+G model show similar trends and convergence speeds during training. The reconstruction loss of the VAEGAN+L model converges to a better value than those of the previous two models because its long-term memory structure improves the sample reconstruction ability, but its convergence in the early stage is slower. The HMM model proposed in this paper further improves training speed and convergence owing to the addition of the Viterbi algorithm, which improves the model’s ability to learn the important features of music data, and it outperforms the other models. Table 2 shows the average reconstruction loss of each model on the training and test sets. The HMM model achieves the lowest loss on both sets, and its training and test losses are close, which indicates that it does not overfit and generalizes better.

Table 2.

Reconstruction loss comparison

Model	Training set	Test set
VAEGAN	2.372	2.857
VAEGAN+G	2.264	2.578
VAEGAN+L	2.096	2.347
HMM	1.271	1.302
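
The figures in Table 2 can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original experiments: the helper names `relative_reduction` and `generalization_gap` are hypothetical, and only the loss values come from the table. It computes how much lower the HMM loss is than the VAEGAN loss and how far apart the training and test losses are for each model.

```python
# Average reconstruction losses copied from Table 2: (training set, test set).
LOSSES = {
    "VAEGAN": (2.372, 2.857),
    "VAEGAN+G": (2.264, 2.578),
    "VAEGAN+L": (2.096, 2.347),
    "HMM": (1.271, 1.302),
}


def relative_reduction(baseline: float, value: float) -> float:
    """Percentage by which `value` is lower than `baseline` (hypothetical helper)."""
    return (baseline - value) / baseline * 100.0


def generalization_gap(train: float, test: float) -> float:
    """Test loss minus training loss; a small gap suggests little overfitting."""
    return test - train


base_train, base_test = LOSSES["VAEGAN"]
hmm_train, hmm_test = LOSSES["HMM"]
train_gain = relative_reduction(base_train, hmm_train)
test_gain = relative_reduction(base_test, hmm_test)
gaps = {name: generalization_gap(tr, te) for name, (tr, te) in LOSSES.items()}

print(f"HMM vs VAEGAN: training loss {train_gain:.2f}% lower, test loss {test_gain:.2f}% lower")
for name, gap in gaps.items():
    print(f"{name}: test - train = {gap:.3f}")
```

Under these definitions the reductions come out at about 46.42% and 54.43%, matching the figures quoted in the conclusion, and the HMM model has the smallest train/test gap of the four models.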

4.2

Comparison of objective evaluation indicators

To verify more comprehensively the performance of this paper’s model in the creation of Min-Taiwan folk songs, objective assessment indicators are used to compare it with the other models in ablation experiments. To compare the generation effect of the models more intuitively, box plots and scatter plots are drawn for the indicator values of the samples generated by the different models. The creation rate is defined to evaluate the effect of a model on creating Min-Taiwan folk songs:

$\mathrm{CreationRate} = \sum_{i = 0}^{T} \frac{P (n_{i})}{T}$ (16)

$P (n_{i}) = \begin{cases} 1, & n_{i} \geq 2 \\ 0, & n_{i} < 2 \end{cases}$ (17)

where T is the total number of time steps and n_i is the number of notes played simultaneously at time step i in the composed folk music. The creation rate is thus the ratio of the number of time steps in which the sample plays at least two pitches simultaneously to the total number of time steps. Repetition counts the number of times combinations of two or more notes are repeated in the music; regular repetition of short sequences over a long span reflects the unity of the thematic material and the smoothness of the melody. The high-quality note ratio is the proportion of notes in the music whose duration exceeds 2 time steps, and it indicates whether the model is creating too many fragmented notes with short durations.
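
The three indicators can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors’ implementation: the piano roll is assumed to be a list of sets of active pitches (one set per time step), the function names are hypothetical, and the reading of "repetition" as the number of repeated occurrences of length-n pitch patterns is an assumption, since the text does not give an exact formula. The creation rate averages the indicator of Eqs. (16) and (17) over the T time steps of the roll.

```python
from collections import Counter


def creation_rate(roll):
    """Fraction of time steps with at least two simultaneous pitches (Eqs. 16-17)."""
    if not roll:
        return 0.0
    return sum(1 for step in roll if len(step) >= 2) / len(roll)


def repetition_count(pitches, n=2):
    """Repeated occurrences of each length-n pitch pattern (assumed definition)."""
    grams = Counter(tuple(pitches[i:i + n]) for i in range(len(pitches) - n + 1))
    # Every occurrence after the first one counts as a repetition.
    return sum(count - 1 for count in grams.values() if count > 1)


def high_quality_ratio(durations, min_steps=2):
    """Proportion of notes whose duration exceeds `min_steps` time steps."""
    if not durations:
        return 0.0
    return sum(1 for d in durations if d > min_steps) / len(durations)


# Toy data, invented for illustration only (MIDI pitch numbers).
roll = [{60, 64}, {60, 64}, {62}, set(), {64, 67, 72}, {65}]
melody = [60, 62, 64, 60, 62, 64, 67]
durations = [2, 2, 1, 4, 3, 1]

print(creation_rate(roll))            # 3 of 6 steps are polyphonic
print(repetition_count(melody))       # (60, 62) and (62, 64) each recur once
print(high_quality_ratio(durations))  # 2 of 6 notes last longer than 2 steps
```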

Using the evaluation indexes described above, each model randomly generates 200 folk song samples, each 50 bars long. The index values of every sample are calculated, and box plots and scatter plots are drawn for comparison. Figure 4 shows the box plots of the objective evaluation indexes of the created samples, in which the red line is the mean value of the real samples, the blue line on each box is the mean of the created samples, and the box spans one standard deviation around that mean. In terms of creation rate, most of the samples created by the VAEGAN model fall below the real-sample mean, and its creation effect is the worst. The VAEGAN+G model also gives poor creation rates for its samples. A small fraction of the samples created by the VAEGAN+L model deviates far from the mean of the real samples. The HMM model is closest to the real samples in terms of creation rate, which indicates that the music model of Fujian-Taiwan folk songs designed with the HMM model in this paper performs well in folk song composition.
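
The quantities read off such a box plot (the sample mean, a band of one standard deviation around it, and the distance to the real-sample mean) can be computed as in the sketch below. The function name `box_summary`, the use of the sample standard deviation, and the numbers are all assumptions made for illustration; they are not the experimental data behind Figure 4.

```python
from statistics import mean, stdev


def box_summary(values, real_mean):
    """Mean, one-standard-deviation band, and gap to the real-sample mean."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation; an assumed choice
    return {
        "mean": m,
        "low": m - s,
        "high": m + s,
        "gap_to_real": abs(m - real_mean),
    }


# Invented creation-rate values for two hypothetical models.
real_mean = 0.55
generated = {
    "model_a": [0.40, 0.50, 0.60],
    "model_b": [0.20, 0.25, 0.30],
}

summaries = {name: box_summary(vals, real_mean) for name, vals in generated.items()}
closest = min(summaries, key=lambda name: summaries[name]["gap_to_real"])

for name, summary in summaries.items():
    print(name, {key: round(val, 3) for key, val in summary.items()})
print("closest to the real-sample mean:", closest)
```

A model whose mean line sits near the real-sample mean and whose band is narrow would be read as both accurate and stable, which is the reading applied to the HMM model in the text.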

To compare more intuitively the variability of the creation effects of the different models and the diversity of the created samples, scatter plots of the scale consistency and polyphony rate of the created samples are also drawn. The results are shown in Fig. 5, where the red line again indicates the mean of the real samples. Judging from the overall distribution of the creation rate, none of the models generates many abnormal samples, and the samples are diverse. The samples created by the HMM model cluster around the real-sample mean line, and the standard deviation between them is small, which indicates that the samples created by the HMM model are more stable and further highlights its good performance in the composition of Fujian-Taiwan folk songs.

4.3

Subjective comparison of the human ear

Because folk songs are strongly subjective, the perceived quality of the samples is largely influenced by subjective factors. This paper therefore designs an evaluation experiment based on human listening judgement, comparing the HMM model with three existing mainstream folk song composition models: MusicVAE, C-RNN-VAEGAN, and SR-CNN-VAEGAN. From the data generated by each model and from the real data, 30 folk song samples of 30 seconds each were randomly selected, converted to audio format, shuffled, and uploaded to a questionnaire platform. They were given to 30 anonymous participants: 10 students and teachers from the Conservatory of Music, 10 folk song enthusiasts, and 10 folk song creators. After listening to each song, the assessors scored it on a ten-point scale for continuity, pleasantness, and naturalness. The data were collected and summarised once the evaluations were submitted, and Fig. 6 compares the mean values of the summarised results. In terms of naturalness and pleasantness, the mean scores of the samples created by the HMM model are 8.47 and 8.39, respectively, which are higher than those of the other models and closer to the scores of the real samples. This indicates that the samples created by the model in this paper are more pleasant to listen to and close to the natural listening impression of real folk songs. As for continuity, the mean score of the HMM model is 8.16, slightly lower than that of the C-RNN-GAN model; although the folk songs created by that model are more continuous, it performs worse than the other models in pleasantness and naturalness.
The subjective evaluation results support the effectiveness of this paper’s model: the Min-Tai folk songs created by the HMM model are subjectively acceptable to listeners with a certain level of musical training.
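
The listening test amounts to a blind, shuffled presentation of clips followed by per-model averaging of the scores. The sketch below illustrates that bookkeeping only. The function names, the record layout `(model, criterion, score)`, the accepted score range of 0 to 10, and the example ratings are assumptions made for illustration and are not the questionnaire data.

```python
import random
from collections import defaultdict
from statistics import mean


def blind_order(clips, seed):
    """Return the clips in a reproducible random order so raters cannot infer the source."""
    order = list(clips)
    random.Random(seed).shuffle(order)
    return order


def mean_scores(ratings):
    """Average score for every (model, criterion) pair."""
    buckets = defaultdict(list)
    for model, criterion, score in ratings:
        if not 0 <= score <= 10:  # assumed bounds of the ten-point scale
            raise ValueError(f"score out of range: {score}")
        buckets[(model, criterion)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


# Invented clip identifiers and ratings, for illustration only.
clips = ["hmm_01", "hmm_02", "real_01", "real_02"]
playlist = blind_order(clips, seed=7)

ratings = [
    ("HMM", "naturalness", 8), ("HMM", "naturalness", 9),
    ("real", "naturalness", 9), ("real", "naturalness", 10),
]
averages = mean_scores(ratings)

print(playlist)
for (model, criterion), value in sorted(averages.items()):
    print(f"{model:5s} {criterion:12s} {value:.2f}")
```

Keeping the shuffle seeded makes the presentation order reproducible, so the same playlist can be regenerated when the scores are later matched back to their source model.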

To sum up, the creation model of Fujian-Taiwan folk songs built on the HMM model and digital media technology can effectively complete the creation of Fujian-Taiwan folk songs, and its creation performance is better than that of the other folk song creation models compared. In addition, the Min-Taiwan folk songs created by this paper’s model match the subjective preferences of the listeners, which helps their dissemination and, to a certain extent, promotes the cultural inheritance of Min-Taiwan folk songs.

5

Conclusion

In this paper, we design a cultural inheritance model of Fujian-Taiwan folk songs under digital media technology, based on the infectious disease model, and use a genetic algorithm to solve for the model parameters. A hidden Markov model (HMM) is then applied to construct an automatic creation model of Min-Tai folk songs, which supports the cultural inheritance of Min-Tai folk songs through digital media technology.

The HMM model reduces the average reconstruction loss on the training set and test set to 1.271 and 1.302, respectively, which is 46.42% and 54.43% lower than that of the VAEGAN model, and it converges faster than the other models. This indicates that the HMM model has strong decoding ability in the automatic creation of Min-Taiwan folk songs: it can restore real samples from the latent variable space, which makes it easier to create high-quality Min-Taiwan folk songs. Meanwhile, the folk song samples created by the HMM model are closer to the real-sample mean on the creation rate index, and their standard deviation is smaller. This shows that the HMM model creates Fujian-Taiwan folk songs that are more stable and consistent and that are close to the real samples. The mean scores of the model on naturalness and pleasantness are 8.47 and 8.39, respectively, which are close to the mean values of 8.96 and 8.57 for the real samples and higher than the mean scores of the other models on these two indicators. This shows that the Fujian-Taiwan folk songs created by the HMM model are more pleasing to the ear and close to the natural listening impression of real folk songs. The performance of this paper’s model in creating Fujian-Taiwan folk songs indicates that it is feasible to use it to innovate and pass on the culture of Fujian-Taiwan folk songs.

Language: English

Frequency: 1 time per year
Journal subjects: Life Sciences, Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics, Physics, other

Research on the Role of Digital Media Technology in the Cultural Inheritance of Fujian-Taiwan Folk Songs

Wei Guo

Xiaoning Wang

Weiwei Tong

Published online: 03 Feb 2025

Received: 30 Aug 2024

Accepted: 17 Dec 2024

DOI: https://doi.org/10.2478/amns-2025-0033

Keywords: Infectious disease model, Genetic algorithm, Digital media technology, Hidden Markov Algorithm, Fujian-Taiwan folk songs

© 2025 Wei Guo et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
