A Study of Artistic Expression and Spectral Data Analysis in Yangqin Performance Techniques
Published online: 24 Sep 2025
Received: 10 Jan 2025
Accepted: 06 May 2025
DOI: https://doi.org/10.2478/amns-2025-0954
© 2025 Siyu Wei, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
The yangqin holds an important position in the history of Chinese music. It has long been an instrument of choice for the literati; it embodies the essence of ancient Chinese musical culture and is one of the most widely used instruments in Chinese classical music [1–4]. In both performance technique and artistic expression, continuous innovation is the fundamental reason the yangqin has gained recognition worldwide. Yangqin techniques include pizzicato, glissando, overtones, vibrato, transcription, and others, of which pizzicato is the most basic. Three fingers are usually used: the first plucks the string, the second releases it, and the third plays it [5–8].
Becoming a good yangqin player requires a great deal of technical work, especially on a stable finger-picking technique. Finger-picking training is the cornerstone of yangqin technique, and stability and speed improve with diligent practice [9–10]. When practicing finger-picking, the player should maintain good posture and carefully observe the distance between the fingertips and the strings. Only by mastering finger-picking can the player produce beautiful music. Other yangqin techniques also require constant training to master. Players also need to pay attention to the overtones and tuning of the strings and be able to adjust string tension flexibly [11–14] to meet different tonal requirements. The study of yangqin technique and artistic expression is an important branch of research on Chinese music history. Continued exploration of these techniques and expressions not only preserves and carries forward the cultural significance of the yangqin but also promotes the international exchange of Chinese classical music [15–17].
To measure and analyze spectral data in yangqin performance, this study proposes a spectral data analysis model whose core components are the processing and identification of spectral features and the quantization of spectral data. The importance of fine time-frequency features is calculated and ranked using feature selection based on recursive feature elimination and random forest; the selected features are then fused with MFCC and NMFCC to form two new features, on which the random forest model is trained to complete the processing and identification of spectral features. Spectrum quantization follows the nearest neighbor principle and the center of mass principle: the splitting method is used to design the initial codebook, and the codebook and iterative training parameters are then set to complete the quantization. This paper takes the yangqin concerto “Smoke”, composed by the young composer Liu Chang, as the object of study and analyzes its performance techniques in depth through spectral data analysis and an evaluation of artistic expression.
Traditional evaluation of yangqin performance skills tends to rely on generalized description, and relatively little research has addressed frequency-domain data. With the rapid development of information technology, analyzing the spectral data of yangqin performance has become increasingly feasible. This paper builds a model for analyzing the spectral data of yangqin performance, proposes a spectral feature processing and recognition algorithm that characterizes the spectral features of the resonance body more accurately, and realizes spectral quantization on this basis.
Yangqin Audio Generation Model
Sound generation in the yangqin can generally be viewed as a process in which the audio signal emitted by a vibrating excitation source is shaped by the resonator. This process can be described by the excitation source-filter model, whose time-domain expression takes the form of a convolution:
Where
Where
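As a minimal sketch of the convolution form of the excitation source-filter model, the snippet below convolves an excitation sequence with a resonator impulse response in discrete time, computing s[n] as the sum over k of e[k]·h[n−k]. The names `excitation`, `impulse_response`, and `convolve`, and the toy values, are assumptions made for illustration; they are not the paper's notation or data.

```python
def convolve(excitation, impulse_response):
    """Full discrete convolution: out[n] = sum over k of e[k] * h[n - k]."""
    out = [0.0] * (len(excitation) + len(impulse_response) - 1)
    for i, e in enumerate(excitation):
        for j, h in enumerate(impulse_response):
            out[i + j] += e * h
    return out


# Hypothetical toy signals: two strikes of different strength and a short,
# decaying resonator response.
excitation = [1.0, 0.0, 0.0, 0.5]
impulse_response = [1.0, 0.5, 0.25]

signal = convolve(excitation, impulse_response)
print(signal)
```

Each strike in the excitation launches a scaled copy of the resonator response, which is the time-domain picture of a source being shaped by a filter.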
NMFCC extraction process
First, the effective audio segment is extracted by the endpoint detection algorithm; the fundamental frequency of the audio segment is then estimated to obtain the fundamental frequency
Random Forest [20].
Random forest is a form of ensemble learning. It improves accuracy by training multiple decision trees, letting each tree vote on the class, counting the votes, and taking the label with the most votes as the final classification result. For each decision tree, the random forest draws samples from the training set randomly and with replacement, drawing as many samples as the original training set contains. The training sets of the individual trees therefore differ from one another while still overlapping. Because random forests both draw samples randomly and select a random subset of all features, they are more resistant to overfitting.
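The two mechanisms described above, sampling with replacement and majority voting, can be sketched in a few lines. This is an illustration only: it builds no decision trees, and the function names and the technique labels used as classes are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(samples, rng):
    """Draw len(samples) items with replacement, so duplicates are expected."""
    return [rng.choice(samples) for _ in samples]


def majority_vote(predictions):
    """Return the label that received the most votes from the trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
training_set = list(range(10))  # stand-in for ten training samples

# Each tree would be trained on its own bootstrap sample.
per_tree_sets = [bootstrap_sample(training_set, rng) for _ in range(3)]

# Hypothetical votes of three trees for one audio frame.
votes = ["strike", "scrape", "strike"]
print(majority_vote(votes))
```

Because every bootstrap sample has the same size as the training set but is drawn with replacement, the trees see different yet overlapping data.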
The process of constructing a decision tree includes selecting samples, selecting a subset of features, and selecting features from the subset of features for node splitting. If there is
Random forest can not only handle datasets with high-dimensional features but also produce an importance score for each feature dimension, which provides a feature ranking to guide feature selection while maintaining high classification performance. It is also resistant to overfitting and noise, so the random forest model is chosen as the classification model in this study.
Feature Selection
In feature engineering, the goal is to filter out ineffective or even counterproductive features with an effective feature selection algorithm and to select the best-performing subset of the existing feature set. This not only improves the classification performance of the model but also reduces the feature dimension and thus speeds up computation.
Feature selection methods mainly follow the filter, wrapper, or embedded approach. In the filter approach, criteria are set for the features, and any feature whose computed indicator does not meet the criteria is filtered out.
The embedded approach trains a machine learning model on the feature subset to obtain a weight for each feature dimension and then selects features according to those weights, which depends on a threshold. As a hyperparameter, this threshold is often difficult to determine.
Recursive feature elimination and random forest based feature selection
Among these approaches, the filter approach is computationally efficient but yields low classification accuracy; the embedded approach yields high accuracy, but its threshold is difficult to set; and the wrapper approach yields high accuracy but consumes more resources on larger datasets. Since the dataset in this paper is not large, the wrapper approach is chosen. Within the wrapper approach, feature selection based on recursive feature elimination and random forest searches more comprehensively than forward or backward search and achieves higher classification accuracy, so this study combines recursive feature elimination with random forest to rank feature importance and screen features. The importance score uses the weighted F1-score criterion, which sets the weight of each category's F1-score according to the frequency of that category and then combines the weighted F1-scores. The formula for the F1-score is as follows:
Where
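The selection loop and its scoring criterion can be sketched as follows, using the standard definition F1 = 2PR / (P + R) with precision P and recall R, and weighting each class by its share of the true labels. The callables `importance_fn` and `score_fn` are hypothetical stand-ins: in a setup like the one described, the first would return random forest importances refitted on the current subset and the second a weighted F1-score on held-out data. The feature names and toy numbers are assumptions.

```python
from collections import Counter


def f1_for_label(y_true, y_pred, label):
    """F1 = 2PR / (P + R) for one class, treated one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def weighted_f1(y_true, y_pred):
    """Average the per-class F1, weighting each class by its frequency."""
    counts = Counter(y_true)
    return sum(n / len(y_true) * f1_for_label(y_true, y_pred, c)
               for c, n in counts.items())


def recursive_feature_elimination(features, importance_fn, score_fn):
    """Drop the least important feature each round; keep the best subset seen."""
    current = list(features)
    best_subset, best_score = list(current), score_fn(current)
    while len(current) > 1:
        importances = importance_fn(current)  # dict: feature -> importance
        current.remove(min(current, key=lambda f: importances[f]))
        score = score_fn(current)
        if score >= best_score:
            best_subset, best_score = list(current), score
    return best_subset, best_score


# Toy stand-ins (assumptions, not results from the paper).
toy_importance = {"mfcc": 0.5, "nmfcc": 0.4, "noise": 0.1}


def toy_score(feats):
    if "noise" in feats:
        return 0.8
    return 0.9 if len(feats) == 2 else 0.7


subset, score = recursive_feature_elimination(
    ["mfcc", "nmfcc", "noise"],
    lambda feats: {f: toy_importance[f] for f in feats},
    toy_score,
)
print(subset, score)
print(weighted_f1(["strike", "strike", "scrape", "scrape"],
                  ["strike", "scrape", "scrape", "scrape"]))
```

Keeping the best subset seen so far, rather than the last one, means that eliminating too far does not degrade the final choice.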
Scalar quantization alone cannot deliver the best coding performance. When multiple source symbols are combined into a multidimensional vector and the vector is quantized as a whole, there are more degrees of freedom, so the quantization base can be reduced further and the code rate compressed further at the same distortion.
Vector quantization is an efficient data compression and coding technique. Its basic idea is to form a vector from a number of scalar values and to divide the vector space into a number of small regions, each with a representative vector; during quantization, any vector that falls into a region is replaced by that region's representative vector. In the accompanying figure, each vector in the space is quantized to the red * vector of the region it falls into.
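The replacement of a vector by its region's representative can be illustrated with a small lookup, assuming squared Euclidean distance as the distortion measure. The codebook values and function names below are hypothetical.

```python
def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def quantize(vector, codebook):
    """Return (index, code word) of the code word closest to the vector."""
    index = min(range(len(codebook)),
                key=lambda i: squared_distance(vector, codebook[i]))
    return index, codebook[index]


# Hypothetical two-dimensional codebook with three representative vectors.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]

index, word = quantize((0.9, 1.2), codebook)
print(index, word)
```

Only the index needs to be stored or transmitted, which is where the compression comes from.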
The basics of quantization: a speech signal consists of many frames, and each frame can be treated as a vector. The vocal tract parameters extracted from a particular frame of the speech signal, a total of K,
The criterion for quantization is to minimize the distortion caused by quantization for a given codebook size K.
Design of quantization: in a speaker recognition system, the process is divided into training and recognition. In training, the codebook is generated by the LBG algorithm and trained iteratively until a well-designed codebook is obtained [21]. In recognition, feature vectors are first extracted from the unknown speech, and the sequence of feature vectors is then quantized with each codebook in turn. To obtain a well-designed codebook, that is, to minimize the statistical mean of the quantization error, codebook design follows two principles.
Nearest Neighbor Principle NNR [22]. This criterion should be followed when selecting the corresponding code word based on X. The mathematical expression is:
Center of Mass Principle. Setting the set of all input vectors X that select the code word
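For reference, the two conditions are commonly written as follows under a squared-error distortion measure. The symbols Y_i (code word), S_i (the set of training vectors assigned to Y_i), N (codebook size), and d (distortion measure) are assumed here and may differ from the paper's own notation.

```latex
% Nearest neighbor rule: X is assigned to the code word with the smallest distortion
d(X, Y_i) = \min_{1 \le j \le N} d(X, Y_j)

% Center of mass rule: each code word is the mean of the vectors assigned to it
Y_i = \frac{1}{|S_i|} \sum_{X \in S_i} X
```

The first condition fixes the best partition for a given codebook, and the second fixes the best codebook for a given partition; alternating between them is what drives the iterative training.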
When a signal is quantized, the quantization intervals are generally made fine in ranges where signal values occur with high probability and coarse where they occur with low probability, which reduces the average quantization distortion. In the LBG algorithm, the design and selection of the initial codebook strongly influences the design of the optimal codebook. Methods for generating the initial codebook include the random selection method, the splitting method, and the chain mapping method. The random selection method is simple and requires no initialization, but it may select atypical vectors as code words, it converges slowly during codebook training, and the code words of the trained codebook are not fully used. The splitting method and the chain mapping method make up for these shortcomings. The initial codebook in this experiment is therefore designed with the splitting method, and the specific steps for generating the codebook are as follows.
Step 1, set the codebook and iterative training parameters, set the set of training vectors X of all input music signals to be S, set the size of the codebook to be N, set the maximum number of iterations of the iterative algorithm to be L, set the number of quantization levels, the threshold for distortion improvement to be
Step 2, treat all training sequences of the recorded music as one class and calculate their center of mass as the code word of the initial codebook:
Step 3, set the initial value
Step 4, divide S into M different subsets of
When
Step 5, calculate the total distortion
Step 6, the relative value
Step 7, recalculate the code words of the new codebook
Step 8, comparing
Step 9, judge m<L? if it holds, another
Step 10, end the iteration and take
The quality of the codebook is improved to some extent by using the splitting method to generate the codebook.
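A compact sketch of LBG training with splitting initialization is given below. It follows the general shape of the steps above (global center of mass, splitting, nearest neighbor partition, center of mass update, stop on small relative distortion improvement or at the iteration limit), but the details are assumptions: squared-error distortion, an additive splitting offset, a codebook size that is a power of two, and all parameter names and defaults.

```python
def squared_error(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def centroid(vectors):
    """Component-wise mean (center of mass) of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(component) / n for component in zip(*vectors))


def nearest_index(vector, codebook):
    return min(range(len(codebook)),
               key=lambda i: squared_error(vector, codebook[i]))


def lbg_split(training, codebook_size, max_iter=50, threshold=1e-6, offset=0.01):
    """Grow a codebook by repeated splitting; codebook_size is assumed 2**k."""
    codebook = [centroid(training)]  # one class: the global center of mass
    while len(codebook) < codebook_size:
        # Split every code word into two slightly shifted copies.
        codebook = [tuple(c + s for c in word)
                    for word in codebook for s in (offset, -offset)]
        previous = float("inf")
        for _ in range(max_iter):
            # Nearest neighbor rule: assign each training vector to a cell.
            cells = [[] for _ in codebook]
            for vector in training:
                cells[nearest_index(vector, codebook)].append(vector)
            # Center of mass rule: move each code word to its cell's mean
            # (an empty cell keeps its previous code word).
            codebook = [centroid(cell) if cell else word
                        for cell, word in zip(cells, codebook)]
            distortion = sum(
                squared_error(v, codebook[nearest_index(v, codebook)])
                for v in training)
            # Stop when the relative improvement falls below the threshold.
            if distortion == 0 or (previous - distortion) / distortion < threshold:
                break
            previous = distortion
    return codebook


# Hypothetical training vectors forming two well-separated clusters.
training = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
trained = lbg_split(training, codebook_size=2)
print(sorted(trained))
```

Starting from the global center of mass and splitting means every initial code word sits inside the data, which avoids the atypical code words that random selection can produce.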
The yangqin concerto “Smoke Gesture” was composed by the young composer Liu Chang in 2016 and is one of the representative new works for yangqin in recent years. The motif of the piece originates from the folk songs of Sangzhi in western Hunan province, and the composer adopts the Western compositional approach of “symphonic thinking”, which gives the music conflict and contrast. The piece is divided into six parts: introduction, slow movement, middle movement, fast movement, climax, and coda. Its structure is clear, its conception is refined, and the mood advances continuously throughout the music.
This paper takes “Smoke” as the research object and uses the spectral data analysis model constructed above as the means of analysis. The spectral feature processing and recognition algorithm identifies the passages of the work played with the striking, scraping, and anti-bamboo methods, and the spectrum quantization algorithm is then used to analyze the spectral data.
The spectrograms of the bass, middle, and treble regions of the yangqin concerto “Smoke Gesture” are shown in Figure 1. The spectrogram of the bass region has about 20 obvious peaks, while those of the middle and treble regions have about 15 and 10, respectively. The number of peaks thus decreases from the bass to the treble region, which is in line with the characteristics of the yangqin as a pitched instrument. Moreover, the peaks in the bass region are relatively dispersed, while those in the middle and treble regions are relatively concentrated, and the fundamental frequency energy gradually increases as pitch rises.

Figure 1. Pitch
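Counting the "obvious" peaks of a magnitude spectrum can be sketched as counting local maxima above a threshold. The paper does not state its peak criterion, so the threshold rule, the function name, and the toy magnitudes below are all assumptions.

```python
def count_peaks(magnitudes, threshold):
    """Count interior local maxima whose magnitude exceeds the threshold."""
    return sum(
        1
        for i in range(1, len(magnitudes) - 1)
        if magnitudes[i] > threshold
        and magnitudes[i] > magnitudes[i - 1]    # strictly rises into the peak
        and magnitudes[i] >= magnitudes[i + 1]   # a flat top is counted once
    )


# Hypothetical magnitude spectrum (arbitrary units).
spectrum = [0.0, 3.0, 1.0, 0.2, 0.5, 0.1, 4.0, 4.0, 1.0]
print(count_peaks(spectrum, threshold=1.0))
```

The small bump at 0.5 is ignored because it falls below the threshold, and the flat top at 4.0 counts as a single peak.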
The maximum amplitude difference in each register when the yangqin concerto “Smoke” is played with the scraping method is shown in Figure 2. The energy of the performance is concentrated between 394 and 2442 Hz. The maximum amplitude differences in the bass and middle registers are 12.85 and 14.06, respectively, while those in the tenor and soprano registers reach 25.07 and 32.5; the bass register has the smallest maximum amplitude difference.

Figure 2. Maximum amplitude difference
This section analyzes the melodic pitch frequencies of the yangqin concerto “Smoke” when performed with the anti-bamboo method. A total of 24 groups of pitches are counted and numbered 1 to 24; the corresponding melodic pitch frequencies are shown in Figure 3. Panel (a) shows the objective reference frequency values, while panel (b) shows the melodic pitch frequencies of the concerto played with the anti-bamboo method. The energy of the performance is mainly concentrated in the range of 440-740 Hz, a span of 300 Hz. Compared with the reference values, the melodic pitch frequencies of the concerto are generally slightly higher: on average, each group is 6.65 Hz above its reference value, and the 19th group, in the soprano register, shows the largest deviation at 30.26 Hz above its reference value. This is consistent with the brighter tone of the anti-bamboo method.

Figure 3. Frequency
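The comparison between measured and reference pitch frequencies amounts to a mean of per-group differences. The sketch below shows that calculation on hypothetical values, not the paper's 24 measured groups. The `cents` helper is an addition that the paper does not use; it expresses a deviation on a logarithmic pitch scale, which can be easier to compare across registers.

```python
import math


def mean_deviation_hz(measured, reference):
    """Average of (measured - reference) over paired pitch groups, in Hz."""
    return sum(m - r for m, r in zip(measured, reference)) / len(measured)


def max_deviation(measured, reference):
    """Return (1-based group number, deviation in Hz) of the largest deviation."""
    diffs = [m - r for m, r in zip(measured, reference)]
    index = max(range(len(diffs)), key=lambda i: diffs[i])
    return index + 1, diffs[index]


def cents(measured_hz, reference_hz):
    """Pitch deviation in cents: 1200 * log2(measured / reference)."""
    return 1200 * math.log2(measured_hz / reference_hz)


# Hypothetical reference and measured frequencies for three pitch groups.
reference = [440.0, 490.0, 520.0]
measured = [445.0, 500.0, 523.0]

print(mean_deviation_hz(measured, reference))
print(max_deviation(measured, reference))
```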
Music is an art of hearing. Objective evaluation of sound is based on physical analysis, but the final evaluation of an instrument's sound quality must still rest on human subjective perception. This chapter therefore evaluates the performance of the yangqin concerto “Smoke” from the perspective of artistic expression. Ten participants took part in this evaluation: 2 yangqin teachers from College of Arts A, 5 yangqin undergraduates from College of Arts A, and 3 yangqin artists.
The listening venue was the Recording Technology Laboratory of College A. The acoustic parameters of the venue were controlled as follows: reverberation time 0.3 s to 0.65 s, deviation less than 25%, background noise below 35 dB, ambient temperature 20°C to 25°C, and ambient humidity 50% to 70%, in line with the requirements of the “Chinese National Musical Instruments Sound Standard Library”.
This paper uses a multi-dimensional evaluation method to assess the artistic expression of the yangqin concerto “Smoke Gesture”. The dimensions of the evaluation rules draw on Baldeman's analysis of the constituent elements of music, which treats a performance as a whole. There are four general dimensions: “the overall appearance of the performance work”, “the accuracy of the performance technique” in service of the music, “the degree of grasp of some elements” in the performance process, and “the ability to interpret the performance work”. There are 17 sub-dimensions: difficulty of the piece; accuracy of score reading; independence, dexterity, and evenness of the fingers; key touch; use and relaxation of the arms; use of the pedals; mastery of tempo; mastery of rhythm; mastery of strength; mastery of timbre; musical breathing and mastery of syntax; mastery of the melodic line; harmony; completeness and fluency of the piece; clarity of the compositional structure; embodiment of the style of the piece; and expressiveness of the performance.
The evaluation of the overall dimensions of artistic expression for the yangqin concerto “Smoke Gesture” is shown in Table 1, using a 100-point scoring method. The highest average rating was 81.1 for “degree of mastery of some elements”, and the lowest was 72 for “accuracy of performance technique”. The average ratings for “overall appearance of the work” and “ability to interpret the work” were 74.9 and 80.3, respectively. The overall dimension with the largest gap between its highest and lowest ratings is “accuracy of performance technique”, which is highly dispersed: Expert 1 gave the minimum rating of 60 and Expert 3 the maximum of 95, a difference of 35.
Table 1. Evaluation of the overall dimensions
| Expert | Overall appearance of the work | Accuracy of performance technique | Degree of mastery of some elements | Ability to interpret the work |
|---|---|---|---|---|
| Expert 1 | 69 | 60 | 82 | 69 |
| Expert 2 | 68 | 64 | 94 | 92 |
| Expert 3 | 95 | 95 | 80 | 73 |
| Expert 4 | 69 | 64 | 72 | 85 |
| Expert 5 | 66 | 75 | 86 | 78 |
| Expert 6 | 68 | 90 | 81 | 73 |
| Expert 7 | 76 | 67 | 77 | 86 |
| Expert 8 | 68 | 68 | 81 | 77 |
| Expert 9 | 94 | 73 | 84 | 89 |
| Expert 10 | 76 | 64 | 74 | 81 |
| Average | 74.9 | 72 | 81.1 | 80.3 |
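The averages and rating ranges quoted for Table 1 can be recomputed directly from the ten expert scores. The sketch below copies the scores from the table; the dictionary keys are shortened column labels chosen for illustration.

```python
# Scores of Experts 1 to 10 for each overall dimension, copied from Table 1.
table1 = {
    "overall appearance of the work": [69, 68, 95, 69, 66, 68, 76, 68, 94, 76],
    "accuracy of performance technique": [60, 64, 95, 64, 75, 90, 67, 68, 73, 64],
    "degree of mastery of some elements": [82, 94, 80, 72, 86, 81, 77, 81, 84, 74],
    "ability to interpret the work": [69, 92, 73, 85, 78, 73, 86, 77, 89, 81],
}

# Mean rating and max-minus-min spread per dimension.
means = {name: sum(scores) / len(scores) for name, scores in table1.items()}
spreads = {name: max(scores) - min(scores) for name, scores in table1.items()}
widest = max(spreads, key=spreads.get)

for name in table1:
    print(f"{name}: mean {means[name]:.1f}, spread {spreads[name]}")
print("largest spread:", widest)
```

The recomputed means match the Average row, and the largest spread, 35 points, belongs to the accuracy of performance technique.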
The 17 sub-dimensions, including the difficulty of playing the piece, the accuracy of score reading, finger ability, and key touch, are labeled D1 to D17, as shown in Table 2.
Table 2. Sub-dimensions
| Number | Dimension |
|---|---|
| D1 | Difficulty of the piece |
| D2 | Accuracy of score reading |
| D3 | Finger ability |
| D4 | Key touch |
| D5 | Use and relaxation of the arms |
| D6 | Use of the pedals |
| D7 | Mastery of tempo |
| D8 | Mastery of rhythm |
| D9 | Mastery of strength |
| D10 | Mastery of timbre |
| D11 | Musical breathing and syntax |
| D12 | Melodic line |
| D13 | Harmony |
| D14 | Completeness and fluency of the piece |
| D15 | Clarity of the compositional structure |
| D16 | Embodiment of the style of the piece |
| D17 | Expressiveness of the performance |
The ratings of the sub-dimensions of artistic expression are shown in Figure 4, using a five-point scale. The highest-rated sub-dimensions are “difficulty of playing the piece” and “accuracy of reading the score”, with average ratings of 4.4 and 4.6; both received ratings of 5 from experts. Among the 17 sub-dimensions, “finger ability” and “grasp of strength” are the only two with average ratings below 3, at 2.4 and 2.8, respectively. This suggests that performances of the yangqin concerto “Smoke” should pay attention to improving finger ability and control of playing strength.

Figure 4. Evaluation of the sub-dimensions
This study establishes a spectral data analysis model for yangqin performance, focusing on the processing and recognition of spectral features and the quantization of spectral data. The yangqin concerto “Smoke”, composed by the young composer Liu Chang, was chosen as the research object, and its spectral data and artistic expression were analyzed and evaluated.
The spectral data of the yangqin concerto “Smoke” are analyzed for three playing methods: striking, scraping, and anti-bamboo. With the conventional striking method, the number of spectral peaks gradually decreases from the bass to the treble region, with about 20 peaks in the bass region and about 15 and 10 in the middle and treble regions, respectively. With the scraping method, the energy of the performance is concentrated between 394 and 2442 Hz; the maximum amplitude differences in the tenor and soprano registers are as high as 25.07 and 32.5, while those in the bass and middle registers are lower, at 12.85 and 14.06. With the anti-bamboo method, the melodic pitch frequencies are generally slightly higher than the objective reference values, by an average of 6.65 Hz per pitch group.
A multidimensional evaluation method was used to evaluate the artistic expression of the yangqin concerto “Smoke Gesture”. The average ratings of the four overall dimensions, “overall appearance of the performance work”, “accuracy of performance technique”, “degree of grasp of some elements”, and “ability to interpret the performance work”, were 74.9, 72, 81.1, and 80.3, respectively. Among the sub-dimensions, only “finger ability” and “grasp of strength” had average ratings below 3; all others were above 3, and “difficulty of performing the work” and “accuracy of reading music” had the highest averages, at 4.4 and 4.6.
