A study on timbre feature extraction and sound quality optimisation of guzheng performance based on spectral analysis
Published Online: Sep 29, 2025
Received: Jan 27, 2025
Accepted: May 11, 2025
DOI: https://doi.org/10.2478/amns-2025-1121
© 2025 Dan Lu, published by Sciendo.
This work is licensed under the Creative Commons Attribution 4.0 International License.
With today’s global integration, the distinctive tone of each national instrument is increasingly valued and explored by musicians. As instruments have been refined over time, fixing an instrument’s shape has also fixed its nature and character, that is, its basic timbre. This basic timbre cannot be changed or replaced arbitrarily; the quality and timbre of an instrument are determined by the way its sound-producing body vibrates [1-3].
The guzheng is a plucked instrument. Exciting the strings in different directions and at different positions produces different timbres and sound intensities (loudness), and the vibration in each direction and position reflects the character of the guzheng’s sound in that orientation. In the modern guzheng the sound source is string vibration, triggered by plucking the strings with the fingers, artificial nails or plectrums. Plucking displaces a string from its equilibrium position; when it is released, the string’s inertia and restoring force carry it back to the equilibrium position and past it, and the motion then repeats. This continuous back-and-forth motion generates the acoustic vibration [4-6]. The sound energy is then conducted through the bridges (the conduction system) to the resonance box, where it is diffused. At the same time, acoustic convection and acoustic resonance are generated, which amplify the sound of the zheng. In practice, the pitch and timbre of the guzheng are controlled by the tension and length of the strings and the density of the string material. Strings of different materials have different elasticity and density, and vibrate with different frequency components [7-8].
The term “sound quality” is used in a general sense to refer to the quality of a sound. To judge the sound quality of a sound-producing body, its pitch, volume and timbre are assessed together. Timbre, also called tone colour, is the core element that characterises a sound. Pitch is determined by the vibration frequency of the sound-producing body, volume by its vibration amplitude, and timbre by the structure of the harmonic series and the onset transient of the sound (commonly known as the “tone head”) [9-10]. Sound quality encompasses timbre. However, timbre is difficult to improve for a given sound-producing body: it reflects the harmonic content of the sound wave, which is inherent to that body. Because guzhengs differ in the material of the sound-producing body and in the thickness, density and structure of the wood, they produce different timbres even at the same pitch and intensity. This is why listeners can tell different guzhengs apart by their tone. Professional instrument makers therefore need to distinguish the concepts of “sound quality” and “timbre” and not use them interchangeably [11-12].
The silk strings of the traditional guzheng, though pure in sound quality, lacked an ethereal quality; their tone was not bright or penetrating in the high register, the after-tone was short and the volume was low. The steel strings adopted later made up for these shortcomings, and their clear, bright tone is better suited to expressive, richly coloured music. However, their long after-tone produces noise, which makes them far inferior in performance to metal-nylon synthetic strings. Xue, H et al. constructed a generalisable dataset of guzheng works with multiple sources and types, achieved accurate classification based on a single feature, and confirmed that the quality of synthetic guzheng music differs significantly from that of real guzheng music [13]. Han, M et al. designed a large Long Short-Term Memory (LSTM) model to assess the quality and style of synthesised guzheng compositions and tested it on famous guzheng performance pieces, among others, confirming the feasibility and validity of the model [14]. Wang, Z et al. proposed a neural network model based on residual convolution, with single-task and multi-task variants and three recognition strategies, for recognising Chinese ethnic pentatonic modes; the study promotes the diversified development of traditional music culture [15]. Jiang, W et al. devised a framework for analysing and perceiving timbre features, and showed through simulation experiments that it can analyse the correlation between spatial dimensions and timbre evaluation, thereby confirming the auditory perception attributes of a three-dimensional timbre space [16]. Li, D et al. used a multiscale network to solve the frame-level multi-label classification problem in guzheng performance, and experimental results showed that the approach improved instrument playing technique (IPT) detection [17].
The main accessories affecting the sound of the guzheng are the strings and the bridges. Because manufacturers differ in materials, craft standards and handworking habits, the body of the guzheng is often mismatched with its accessories, which affects the sound quality of the product. Zhang, S analysed the factors affecting the expressiveness of guzheng performance from the perspectives of playing technique, emotion, arrangement and performance environment, and offered suggestions for improvement that contribute to the artistic appeal of guzheng performance [18]. Ding, H et al. cut and subdivided guzheng audio samples to highlight the features of each fingering and replaced the deep learning algorithm with a traditional machine learning algorithm for fingering recognition, which effectively improved the recognition accuracy for six guzheng fingerings [19]. Chen, H et al. used quantitative methods to reveal a strong correlation among ancient accounts of guzheng techniques, compositions, regional influences and authors’ subjective perceptions, clarifying the role the guzheng has played in the development of traditional music and deepening the understanding of traditional music [20]. Zhao, C et al. used spectral analysis and numerical simulation to demonstrate the consistency between an objective evaluation system for pipa string sound quality and subjective perception, and showed that the rosewood pipa has better sound pressure uniformity while the mahogany pipa has superior sound quality [21]. Zhong, X. drew on the historical research literature to describe the origin, development and performance characteristics of the Ya-Zheng, and pointed out that its development was closely related to the social power hierarchy, historical environment and cultural background of the time [22].
This paper establishes a string vibration model of the guzheng based on the principle by which the instrument produces sound. The guzheng performance audio is preprocessed with the fast Fourier transform: the centre of the window function is shifted along the signal to intercept successive segments, and the Fourier transform converts each time-domain segment into its spectral function, which completes the extraction of the guzheng audio signal; the result can be regarded as the output of a band-pass filter. The power spectrum of the guzheng performance is then obtained using the discrete Fourier transform in place of the continuous Fourier transform. Additive and convolutional operations are applied so that the guzheng signal is transformed into an additive signal, and the timbre component sequence is obtained by linear system processing. Mel frequency cepstrum coefficients are selected as the characteristic parameters of guzheng timbre and are extracted accordingly. The harmonic structure is used to analyse the timbre expression spectrum of the guzheng and to evaluate the timbre extraction results. Finally, a sound quality optimisation experiment is designed, and the sound quality of the guzheng after optimisation is evaluated by analysing its time-domain and frequency-domain curves.
According to the path along which vibration energy is conducted through the body of the guzheng, sound production can be divided into four parts: the excitation system, the vibration system, the conduction system and the resonance system. The vibrating body and the resonating body interact with each other, and different frequency components are enhanced or attenuated during sound production, so different guzhengs differ in perceived timbre.
The source of the sound of the guzheng is the vibration of the strings. Assuming that the length of the string is
Solving the equation gives:
From the formula, it can be seen that the guzheng sound is a superposition of partials (overtones) with different frequencies, phases and amplitudes. Each guzheng presents a different timbre, and the composition and relative strength of these partials play the decisive role.
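As a minimal sketch of this superposition, the Python snippet below sums the partials of an ideal string with fixed ends that is plucked at a fractional position `pluck_pos`. The textbook ideal-string amplitude law (proportional to sin(nπ·pluck_pos)/n²), the function names and all parameter values are illustrative assumptions, not the model derived in this paper.

```python
import numpy as np

def partial_amplitudes(pluck_pos, n_partials=10):
    """Relative partial amplitudes of an ideal string plucked at the
    fractional position pluck_pos (0 < pluck_pos < 1).
    Assumption: textbook ideal string, not the paper's own formula."""
    n = np.arange(1, n_partials + 1)
    amps = np.sin(n * np.pi * pluck_pos) / n**2
    return amps / np.max(np.abs(amps))

def synthesize(f0, pluck_pos, duration=0.5, sample_rate=44100, n_partials=10):
    """Superpose the partials n * f0 with the amplitudes above."""
    t = np.arange(int(round(duration * sample_rate))) / sample_rate
    amps = partial_amplitudes(pluck_pos, n_partials)
    n = np.arange(1, n_partials + 1)[:, None]
    return (amps[:, None] * np.sin(2 * np.pi * n * f0 * t)).sum(axis=0)

# Plucking at the midpoint suppresses the even partials;
# plucking at one tenth of the length suppresses the 10th partial.
mid = partial_amplitudes(0.5)
edge = partial_amplitudes(0.1)
tone = synthesize(440.0, 0.25)
```

Changing `pluck_pos` changes the relative strength of the partials while the pitch stays the same, which is one simple way to see why the plucking position affects timbre.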
Spectral analysis transforms a signal from the time domain into the frequency domain. The Fast Fourier Transform (FFT) is one of the algorithms used to perform this transformation [23].
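A small illustration of this transformation, assuming NumPy and a synthetic two-component tone (the sample rate, frequencies and the function name `amplitude_spectrum` are chosen only for the example):

```python
import numpy as np

def amplitude_spectrum(signal, sample_rate):
    """Return (frequencies in Hz, single-sided amplitude spectrum).
    The 2/N scaling is correct for interior bins (not DC or Nyquist)."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate)
    amplitude = 2.0 * np.abs(spectrum) / signal.size
    return freqs, amplitude

sample_rate = 8000                               # Hz, assumed for the example
t = np.arange(sample_rate) / sample_rate         # one second of samples
tone = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

freqs, amp = amplitude_spectrum(tone, sample_rate)
dominant = freqs[np.argmax(amp)]                 # strongest component in Hz
```

Because the test tone lasts exactly one second, the frequency bins are 1 Hz apart and both components fall exactly on a bin.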
Fourier analysis is a powerful tool for analysing the steady-state properties of linear systems and stationary signals, and it is widely used in many engineering and scientific fields. The fast Fourier analysis used here is a short-time Fourier analysis, also called the time-dependent Fourier transform: it handles non-stationary signals by steady-state analysis under the assumption of short-time stationarity. For music played on the guzheng, a fast passage is generally about 240 beats per minute. Even at the limit of a player’s ability, with thirty-second notes played at this speed (eight notes per beat) and adjacent notes all different, about 960 notes can be played per minute, so a single note occupies 0.0625 s. The assumption of short-time stationarity therefore holds for guzheng music (the music signal can be regarded as stationary over a period as short as 10 ms). With an analysis window of 0.0625 s, the lowest frequency that can be distinguished is 1/0.0625 s = 16 Hz, while the pitch of the lowest tone on the guzheng is about 27.5 Hz. Sound perception is closely related to the spectral analysis performed by the human auditory system. Spectral analysis of music signals is therefore an effective means of recognising music signals and processing audio signals.
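The arithmetic above, together with the windowed analysis it motivates, can be sketched as follows. The note rate of 960 per minute is taken from the text; the Hamming window, the hop size and the function name `short_time_power` are assumptions made for illustration.

```python
import numpy as np

notes_per_minute = 960                       # figure quoted in the text
note_duration = 60.0 / notes_per_minute      # seconds occupied by one note
freq_resolution = 1.0 / note_duration        # Hz, for a window one note long

def short_time_power(signal, sample_rate, frame_seconds, hop_seconds):
    """Power spectrum of each Hamming-windowed frame (one row per frame)."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(frame_seconds * sample_rate))
    hop = int(round(hop_seconds * sample_rate))
    window = np.hamming(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s:s + frame_len] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

# Example: frames one note long, overlapping by half a frame.
sample_rate = 16000                          # Hz, assumed for the example
t = np.arange(sample_rate) / sample_rate
power = short_time_power(np.sin(2 * np.pi * 440 * t), sample_rate,
                         note_duration, note_duration / 2)
```

Each row of `power` is the spectrum of one short segment, so changes in the spectrum from note to note show up as changes from row to row.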
The fast Fourier transform of signal {
Where 〈
This can be viewed as the output produced when the time signal passes through a bandpass filter with centre frequency
The square of the fast Fourier transform amplitude |
where the short-time autocorrelation function is defined as:
In practical calculations, the discrete Fourier transform is generally used instead of the continuous Fourier transform, which requires a periodic expansion of the signal, i.e.,
Assume that the music signal
where
“+” and “*” denote the additive and convolutional operations, respectively [25]. The role of the first system
The second system
In the actual processing of music signals, if
In the characteristic subsystems
In most digital signal processing,
If only the real part of
where
Cepstral coefficients carry more information than other parameters in sound signal processing, and the most commonly used acoustic feature parameters are MFCC and LPCC. MFCC is built on a model of human hearing: the sound signal is passed through a filter bank and its acoustic features are obtained directly via the discrete Fourier transform (DFT). LPCC starts from a model of sound production and uses linear prediction coding (LPC) to find the cepstral coefficients. Because the evaluation of guzheng timbre is essentially based on the auditory perception of the human ear, this paper chooses MFCC as the characteristic parameter of guzheng timbre.
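The homomorphic idea behind cepstral analysis, turning a convolution of two components into a sum, can be checked numerically. The sketch below uses the real cepstrum (inverse FFT of the log magnitude spectrum) on random stand-in sequences; the sequence lengths, the FFT size and the names `excitation` and `response` are assumptions for illustration, not the signals used in this paper.

```python
import numpy as np

def real_cepstrum(signal, n_fft):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.rfft(signal, n=n_fft)
    log_magnitude = np.log(np.abs(spectrum) + 1e-12)   # offset avoids log(0)
    return np.fft.irfft(log_magnitude, n=n_fft)

rng = np.random.default_rng(0)
excitation = rng.standard_normal(64)     # stand-in for the string excitation
response = rng.standard_normal(16)       # stand-in for the body response
mixed = np.convolve(excitation, response)   # length 79, fits in n_fft = 128

n_fft = 128
cepstrum_of_mixture = real_cepstrum(mixed, n_fft)
sum_of_cepstra = real_cepstrum(excitation, n_fft) + real_cepstrum(response, n_fft)
```

The two results agree to within numerical precision, which is what allows a linear system to separate the components after the logarithm has been taken.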
The Mel frequency cepstrum coefficient (MFCC) is a characteristic parameter commonly used in the analysis of music signals. It emphasises the auditory properties of the human ear: the spectral characteristics of the music signal are analysed on the basis of human hearing experiments, which yields timbral characteristics of the guzheng that agree with subjective auditory sensation [26]. MFCC maps the actual frequency onto the Mel frequency scale, which emphasises the low-frequency information of the sound signal. The relationship between Mel frequency and actual frequency can be expressed by the following formula:
The unit of perceived frequency
Based on the above theory, combined with the characteristics of guzheng music signal, this paper adopts the MFCC parameters as the timbre feature parameters of guzheng, and Fig. 1 shows the specific computational steps of MFCC feature parameter extraction.

Extraction of MFCC feature parameters
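For reference, the mapping between actual frequency and Mel frequency is commonly written as Mel(f) = 2595·log10(1 + f/700). The sketch below assumes this widely used form; the constants and function names are not taken from this paper, whose own formula may use different notation.

```python
import numpy as np

def hz_to_mel(freq_hz):
    """Common Mel-scale formula (assumed form): 2595 * log10(1 + f / 700)."""
    return 2595.0 * np.log10(1.0 + np.asarray(freq_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

# Ten points equally spaced on the Mel scale between 0 Hz and 8 kHz:
# they are densely packed at low frequencies and spread out at high ones.
centres_hz = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(8000.0), 10))
```

The growing spacing of `centres_hz` is the sense in which the Mel scale emphasises low-frequency information.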
Preprocessing. The preprocessing of guzheng music signal
Fast Fourier Transform (FFT). The discrete Fourier transform converts the music signal from the time domain to the frequency domain, but its computational cost is high. To reduce the computation in MFCC parameter extraction, the fast Fourier transform is used instead for this conversion. The FFT is performed on the
Spectral line energy calculation. Before Mel filtering, the spectral line energy must be calculated for each frame of data after the FFT, i.e.:
Mel filter bank. After preprocessing and the FFT, the music signal is represented by its discrete spectrum. The energy of each frame of the spectrum is passed through a sequence of triangular filters, which yields a series of related coefficients. For the calculation of the filter coefficients
Where
where
Where,
In calculating the energy of the Mel filter, the derived spectral line energy is passed through the Mel filter bank described above and the energy in that Mel filter bank is calculated. That is, the energy spectrum
where
Discrete cosine transform (DCT). Assuming that
where
The discrete cosine transform retains rich spectral components of the signal, concentrates energy well, and does not require the phase of the sound to be estimated, so it achieves a good speech enhancement effect at low computational complexity.
The cepstrum computed with the DCT during MFCC extraction is analogous to the FFT cepstrum, in which the Fourier spectrum of the signal is logarithmically transformed and an inverse FFT then converts the frequency-domain result back to the time domain. Here, the Mel filter energies are logarithmically transformed and their DCT is then calculated:
The Mel cepstrum coefficients of the signal in frame
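The steps of this section (framing and windowing, FFT, spectral line energy, triangular Mel filters, logarithm and DCT) can be sketched end to end with NumPy. This is a minimal illustration under stated assumptions: the frame length, hop, number of filters, number of coefficients, the Hamming window and the Mel formula 2595·log10(1 + f/700) are all chosen for the example and are not the settings used in this paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the Mel scale."""
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, bin_freqs.size))
    for m in range(n_filters):
        lo, centre, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        rising = (bin_freqs - lo) / (centre - lo)
        falling = (hi - bin_freqs) / (hi - centre)
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mfcc(signal, sample_rate, frame_len=1024, hop=512, n_filters=24, n_coeffs=12):
    """Return an array of shape (n_frames, n_coeffs)."""
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2          # spectral line energy
    mel_energy = power @ mel_filterbank(n_filters, frame_len, sample_rate).T
    log_energy = np.log(mel_energy + 1e-10)                   # offset avoids log(0)
    # DCT-II of the log filter energies, keeping coefficients 1..n_coeffs
    m = np.arange(n_filters)
    k = np.arange(1, n_coeffs + 1)[:, None]
    basis = np.cos(np.pi * k * (m + 0.5) / n_filters)
    return log_energy @ basis.T

# Example on a synthetic 440 Hz tone (not a guzheng recording).
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
features = mfcc(np.sin(2 * np.pi * 440.0 * t), sample_rate)
```

Each row of `features` is the coefficient vector of one frame; for real recordings a tested library implementation would normally be preferred over a hand-written one.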
Since the timbre of a musical instrument depends on its harmonic structure, and similar instruments have similar harmonic structures, the harmonic structure can be defined as an objective index of an instrument’s timbre. Based on the existing discrete harmonic transform, timbre features are extracted from the harmonic structure as follows. First, the music signal is divided into frames with a frame length of 0.5 s and a frame shift of 0.25 s; signals shorter than 0.5 s are not divided. The harmonic structure information of the guzheng signal is then obtained with the harmonic structure extraction method, and the harmonic coefficients are normalised. Finally, the timbre expression spectrum is formed from the discrete harmonic transform coefficients together with their first-order and second-order differences.
For the single A4 tone of the guzheng, the fundamental frequency is 440.0 Hz. With a sampling rate of 44.2798 kHz, the window length of the discrete harmonic transform is 100 samples and the window shift is set to 1/3 of the window length. For an audio signal 1 s long, the timbre expression spectrum is computed for each frame with a highest harmonic number of 10. Fig. 2 shows the timbre expression spectrum of the guzheng’s A4 tone for frames 0 to 100. The first few frames of the timbre expression spectrum contain a large amount of audio information, with coefficients greater than 0.5, so an appropriate highest harmonic number must be chosen when carrying out guzheng timbre feature extraction experiments.

Timbre expression spectrum of the guzheng A4 tone
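The framing and normalisation steps described above can be illustrated with a simplified stand-in. The sketch below does not implement the discrete harmonic transform itself; it reads the FFT magnitude near each integer multiple of a known fundamental, normalises the result, and appends first-order and second-order differences. The 44.1 kHz sample rate, the Hann window and all function names are assumptions made for the example.

```python
import numpy as np

def split_frames(signal, sample_rate, frame_seconds=0.5, hop_seconds=0.25):
    """Frames of 0.5 s with a 0.25 s shift; shorter signals are kept whole."""
    frame_len = int(round(frame_seconds * sample_rate))
    hop = int(round(hop_seconds * sample_rate))
    if len(signal) < frame_len:
        return [np.asarray(signal, dtype=float)]
    return [np.asarray(signal[s:s + frame_len], dtype=float)
            for s in range(0, len(signal) - frame_len + 1, hop)]

def harmonic_coefficients(frame, sample_rate, f0, n_harmonics=10):
    """Normalised FFT magnitudes near the first n_harmonics multiples of f0.
    Assumes every harmonic lies below the Nyquist frequency."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    bin_width = sample_rate / len(frame)
    coeffs = np.zeros(n_harmonics)
    for k in range(1, n_harmonics + 1):
        centre = int(round(k * f0 / bin_width))
        lo, hi = max(centre - 1, 0), min(centre + 2, spectrum.size)
        coeffs[k - 1] = spectrum[lo:hi].max()
    return coeffs / coeffs.max()

def timbre_expression(signal, sample_rate, f0, n_harmonics=10):
    """Per frame: coefficients, first differences and second differences."""
    frames = split_frames(signal, sample_rate)
    coeffs = np.array([harmonic_coefficients(f, sample_rate, f0, n_harmonics)
                       for f in frames])
    delta = np.diff(coeffs, axis=0, prepend=coeffs[:1])
    delta2 = np.diff(delta, axis=0, prepend=delta[:1])
    return np.hstack([coeffs, delta, delta2])

# Synthetic A4 tone with three harmonics of known relative strength.
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
tone = (np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
        + 0.25 * np.sin(2 * np.pi * 1320 * t))
expression = timbre_expression(tone, sample_rate, 440.0)
```

For this stationary test tone the difference columns are close to zero; for a plucked string, whose harmonics decay at different rates, they would carry the time-varying part of the timbre.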
The harmonic structures of the seven single tones from C4 to B4 of the guzheng were extracted, and three sampling points were randomly selected for analysis. The amplitude-frequency spectra are shown in Fig. 3, where (a) and (b) correspond to the first sampling point, (c) and (d) to the second, and (e) and (f) to the third; each subfigure shows the first 10 harmonic coefficients of the seven single tones. The waveform at sampling point 2 has the largest signal amplitude, ranging from -0.75 to 0.75, and the waveform at sampling point 3 has the longest duration, close to 4 × 10⁴ s. The harmonic structures at different sampling points are clearly distinguishable, while the harmonic structures of the same instrument (the guzheng) remain similar, so the harmonic structure can serve as a basis for extracting guzheng timbre.

The spectrum of guzheng
The sound quality of the guzheng is closely related to the degree of ageing of its wood, and the body can be regarded as a finely made, special wooden box structure. Acoustic frequency sweeping is adopted, and a rise in the resonance peaks, a lowering of the resonance peak frequencies and the appearance of multiple resonance peaks are taken as the criteria for judging the sound quality optimisation process. This makes it possible to transfer the vibration ageing technique, which has been applied successfully to steel structures, to guzheng production. This chapter explores the feasibility of optimising the quality of the guzheng by applying a sweep signal to the test instrument, obtaining its resonance frequency from the response, and then continuously driving the instrument at that resonance frequency.
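The sweep-then-excite procedure can be outlined in code. The sketch below generates a linear sine sweep, picks the resonance frequency as the maximum of a response curve, and builds a constant-frequency excitation tone. The sweep range, durations, sample rate and the synthetic response curve (peaking at a hypothetical 600 Hz) are assumptions for illustration and do not reproduce the experimental setup.

```python
import numpy as np

def linear_sweep(f_start, f_end, duration, sample_rate):
    """Sine sweep whose instantaneous frequency rises linearly with time."""
    t = np.arange(int(round(duration * sample_rate))) / sample_rate
    phase = 2.0 * np.pi * (f_start * t + 0.5 * (f_end - f_start) * t**2 / duration)
    return t, np.sin(phase)

def resonance_frequency(freqs, response):
    """Frequency at which the measured response amplitude is largest."""
    return float(freqs[int(np.argmax(response))])

def excitation_tone(frequency, duration, sample_rate):
    """Constant-frequency sine used to drive the instrument at resonance."""
    t = np.arange(int(round(duration * sample_rate))) / sample_rate
    return np.sin(2.0 * np.pi * frequency * t)

t, sweep = linear_sweep(20.0, 2000.0, 2.0, 8000)

# Synthetic response curve with a single resonance (hypothetical values).
freqs = np.linspace(20.0, 2000.0, 199)
response = 1.0 / (1.0 + ((freqs - 600.0) / 30.0) ** 2)
f_res = resonance_frequency(freqs, response)
drive = excitation_tone(f_res, 1.0, 8000)
```

In a real measurement `response` would come from the recorded sweep rather than from a formula, and an instrument with several resonance peaks would need a peak search rather than a single maximum.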
Four strings were sampled before and after vibration, for open-string pulling and plucking respectively. In this chapter, only the results for the A string are compared and analysed. Data segments with similar waveforms and timing were extracted so that the effect of vibration on the guzheng can be observed through amplitude-time and amplitude-frequency curves, and its effect on the optimisation of the guzheng’s acoustic quality can be explored.
The results of the guzheng sound quality optimisation test were first evaluated instrumentally. The LabVIEW sound acquisition program recorded the two sweeps made before and after the vibration optimisation and automatically computed the difference between them.
The optimisation time of the first test was 50 minutes. Because the LabVIEW sound acquisition program was still immature, acquisition stopped partway through the optimisation and the experimental data were lost. Since more than 70% of the effect of vibration ageing appears in the first half hour of vibration, the optimisation time of the second test had to be extended, and it was set to 150 minutes. Figure 4 compares the sweep curves of the second test. After the second optimisation, under equal-amplitude vibration, guzheng No. 4 has tended to stabilise: below 500 Hz the two sweeps differ little in amplitude, with amplitudes between 0.2 and 0.45; at around 750 Hz and 1750 Hz the amplitude difference between the two sweeps is large; elsewhere the two sweeps are almost identical. The only remaining step was to increase the power in order to achieve the desired optimisation effect.

Comparison of the sweep curves in the second test
The power amplifier was therefore replaced with a more powerful one, together with a correspondingly thicker coil. A third optimisation test was then carried out, with a vibration time of 150 minutes.
Figure 5 compares the sweep curves of the third test. From 20 Hz to 1200 Hz the effect is not obvious. From 1200 Hz to 1700 Hz several new peaks appear, at 1250 Hz, 1385 Hz and 1678 Hz, with peak amplitudes of 0.5 V, 0.65 V and 0.735 V respectively. The region above 1700 Hz shows the most significant effect: the peak amplitude of the second sweep exceeds that of the first across the board, indicating that the high-frequency region has been significantly optimised. The low-amplitude vibration of the earlier test had already optimised the low-frequency region and part of the mid-frequency region; increasing the power of the amplifier increased the vibration amplitude, which optimised the high-frequency region.

Comparison of the sweep curves in the third test
Figure 6 shows the difference between the two sweep curves. It shows more intuitively that, in the high-frequency range above 1700 Hz, the amplitude difference fluctuates within the interval [0.01, 0.058], which further verifies the optimisation of the guzheng’s sound quality.

Difference between the two sweep curves
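The comparison of two sweep curves, taking their difference and locating newly emerged peaks, can be sketched as below. The curves are synthetic: a flat baseline with three Gaussian bumps placed at the peak frequencies quoted in the text, purely to exercise the code. They are not the measured data, and the threshold `min_gain` and the function name `new_peaks` are assumptions.

```python
import numpy as np

def new_peaks(freqs, before, after, min_gain=0.05):
    """Frequencies where (after - before) has a local maximum above min_gain.
    Returns the peak frequencies and the full difference curve."""
    diff = np.asarray(after, dtype=float) - np.asarray(before, dtype=float)
    inner = diff[1:-1]
    is_peak = (inner > diff[:-2]) & (inner >= diff[2:]) & (inner > min_gain)
    return freqs[1:-1][is_peak], diff

freqs = np.arange(20.0, 2001.0, 1.0)             # Hz, 1 Hz steps
before = np.full(freqs.shape, 0.3)               # synthetic baseline, not measured
after = before.copy()
for centre, height in [(1250.0, 0.20), (1385.0, 0.35), (1678.0, 0.435)]:
    after += height * np.exp(-0.5 * ((freqs - centre) / 10.0) ** 2)

peak_freqs, diff = new_peaks(freqs, before, after)
```

With measured sweeps, some smoothing before the peak search would likely be needed so that noise is not reported as a new peak.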
The more pronounced the sharp overtone peaks and the greater the variation in the guzheng’s time-domain curve, the more beautiful the guzheng sounds. From the point of view of time-domain analysis, guzheng No. 4 has therefore been optimised.
Figure 7 shows the amplitude-time curve of the A string before the vibration treatment of the guzheng, together with the amplitude-frequency curve obtained by FFT. The highest amplitude corresponds to a frequency of 1136 Hz (about 5/2 times the main frequency), followed by 1326.5 Hz. The peak at the main frequency of 436.2 Hz is not the highest. In the high-frequency region, the amplitudes at 5 times and at 11/2 and 17/2 times the main frequency are relatively large, while the rest are small.

Amplitude curves of the A string before the vibration treatment of the guzheng
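The relation between the quoted peaks and the main frequency can be checked with a few lines of arithmetic. The frequencies are those given in the text; rounding to the nearest half multiple is an assumption made here to match the way the peaks are described.

```python
main_frequency = 436.2               # Hz, main frequency quoted in the text
peaks = [1136.0, 1326.5]             # Hz, the two strongest peaks quoted in the text

# Ratio of each peak to the main frequency, and the nearest half-integer multiple.
ratios = [p / main_frequency for p in peaks]
nearest_half_multiples = [round(2 * r) / 2 for r in ratios]
```

The first peak lies at roughly 2.6 times the main frequency, closest to the 5/2 multiple, while the second lies close to 3 times the main frequency.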
Figure 8 shows the amplitude-time curve of the A string after the vibration treatment of the guzheng, together with the amplitude-frequency curve obtained by FFT. Its characteristic feature is that the main frequency, 483 Hz, has the highest amplitude, while the 2x, 3x, 5x, 8x and 9x overtones are relatively prominent, and the amplitude at the main frequency decays rapidly along a ‘logarithmic curve’. In addition, one or two small overtone peaks appear between the integer overtones of the main frequency. These changes are the result of vibration optimisation.

Amplitude curves of the A string after the vibration treatment of the guzheng
The basic tone has four dimensions: fullness, solidity, evenness and relaxation. If a performance departs from the basic tone, the melody is heard only as floating notes, without any sense of tonal concentration, and the mood the composer wants to express cannot be conveyed. In addition, the bass melody of the left hand contrasts with the middle- and high-register melody of the right hand; the left hand must make full use of the basic tone to enrich the melody and harmony of the piece, otherwise its melody is just notes floating in the air.
Starting from the concept of basic tone, the first component is the “full” tone. A full tone must “sink down” in playing; whatever the music, a round, full tone that comes from the inside out is something the player must master. The second is the “solid” tone: the player uses the upper arm to drive the forearm and lets the force fall into the fingertips before plucking, so that the sound is more solid and the flavour of the music comes through. The third is the “even” tone, which requires every note to be uniform in tone, for example in techniques such as finger shaking, wheel fingering and arpeggios, and in phrases that call for the same tone throughout.
The granular tone is characterised by a fast attack and strong finger independence. When playing the wheel finger technique, the player should first distinguish it from single notes without accent marks: the middle finger, index finger and thumb should concentrate their force and play explosively, with particular attention to the independence of the fingertips so that the three notes of the wheel finger do not blur into one. Secondly, the mood of the piece should build with each repetition, which requires the player to vary the dynamics of the granular tone between the first and second playing of the same melody.
Linear tone is relatively common in zheng performance and is reflected in the development of playing techniques such as finger shaking, wheel playing, trills and glissandi, all of which enhance the expression of linear tone. The linear timbre of a monophonic melody is found mainly in the free-tempo and slow sections of a piece. To depict a hazy morning mist, for example, the piece uses a simple, single technique. From the player’s point of view, expressing a linear tone with a single technique requires a slow application of force driven by the wrist, so that the notes connect to one another.
Controlled tone emphasises fingertip control that is “strong but not explosive, weak but not false”. To depict a misty, illusory scene, the player must play the right-hand melody with extremely little force, which requires the ability to control a weak tone and still achieve fullness in an extremely light melody. “Strong but not explosive” also belongs to tone control. In fast, expressive passages the player tends to become more excited as the mood builds, and if tone control has not been mastered, the sound, especially in the treble register, can burst out harshly. When playing with the right hand, the player should therefore keep the force within certain limits.
When playing guzheng music, the first priority is the expression of timbre. Its primary task and significance is to convey the mood the composer wants to depict and the emotion the composer wants to express; its second is to carry the performer’s own subjective emotion. Most guzheng players overlook timbre expression in their practice: their interpretation of the work does not come through in their playing, and over time they neglect the training of fingertip timbre. Timbre expression matters not only for the emotional expression and mood of a piece; training in timbre and in fingertip tone control also raises the overall level of performance, and further encourages players to attend to timbre when performing.
Once attention is paid to fingertip tone, the performer can also develop the ability to analyse the context of a piece before playing, which lays the foundation for combining tone expression with that context. Analysing the context of a piece begins with an in-depth understanding of its genre, the background of its composition and its introduction, which together set the contextual colour of the whole piece.
Once players pay more attention to fingertip timbre and have strengthened their ability to analyse the context of a piece, they can go on to apply the appropriate timbre to each piece and so complete the expression of timbre. For example, in lyrical passages the player should use a linear timbre to carry the emotion of the piece, while in emotionally intense passages a more granular, solid timbre is appropriate. What kind of timbre should be used in different musical situations? How should tone be used to express the mood of the piece? Answering these questions is a mutually reinforcing process that requires both thought and practice from the performer. Timbre expression is therefore not only a matter of fingertip tone in performance; it also improves, step by step, the player’s ability to analyse the context of the music, so that theory and practice combine, timbre expression is joined to musical context, and the practical significance of timbre expression in performance is ultimately realised.
This paper proposes a string vibration model for the guzheng based on the principle by which the instrument produces sound. The guzheng audio is preprocessed with the fast Fourier transform, and MFCC is selected as the timbre characteristic parameter. After homomorphic processing and cepstrum and complex cepstrum processing, the Mel frequency cepstrum coefficients are extracted as the timbre characteristic parameters of the guzheng. The timbre extraction results are analysed through simulation experiments.
The harmonic structures of the seven single tones in the C4~B4 range of the guzheng were extracted, and three sampling points were randomly selected and analysed. The waveform at sampling point 2 had the largest amplitude, ranging from -0.75 to 0.75, and the waveform at sampling point 3 had the longest duration, close to 4 × 10⁴ s. Acoustic frequency sweeping was used to optimise the sound quality of the guzheng. After three sweep tests, new peaks appeared between 1200 Hz and 1700 Hz, at 1250 Hz, 1385 Hz and 1678 Hz, with peak amplitudes of 0.5 V, 0.65 V and 0.735 V respectively, and the effect was most significant in the region above 1700 Hz. The difference between the sweep curves shows more intuitively that, above 1700 Hz, the amplitude difference fluctuates within the interval [0.01, 0.058], so the optimisation of the guzheng’s sound quality is evident.