A study of auditory-associative musical emotion based on multidimensional signal processing techniques
Published online: 17 Mar 2025
Received: 10 Nov 2024
Accepted: 18 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0279
© 2025 Xiaohong Cui, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
In this paper, we introduce an attention mechanism into the VGG16 network and use the feature maps of its convolutional layers to characterize the visual emotion of music. For the auditory side, a CNN is constructed to extract emotional features from the music audio. The extracted audio and visual features are fed into a fusion module, enabling the study of multidimensional signal processing and auditory-associative musical emotion. Comparative analysis of the method's emotion recognition performance shows that the fusion module is most effective when the audiovisual associative features are reduced to 200 dimensions. Fusing the audiovisual features yields an average emotion recognition rate of 88.07%, an improvement in the overall recognition rate. At a music clip length of 60 s the recognition accuracy is 0.87, and accuracy increases as clip length decreases. Rhythmic features, however, have no significant effect on emotion recognition.
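The abstract outlines the pipeline only at a high level and does not specify layer configurations. The following is a minimal PyTorch sketch of the described structure, assuming a squeeze-and-excitation-style channel attention applied to the VGG16 convolutional feature maps, a small CNN over log-mel spectrograms for the auditory branch, and a single linear layer for the 200-dimensional reduction in the fusion module. All class names, feature sizes, and the number of emotion classes here are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention over conv feature maps (assumed form)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # x: (B, C, H, W) -> per-channel weights (B, C, 1, 1), then reweight the maps
        w = self.fc(x.mean(dim=(2, 3))).unsqueeze(-1).unsqueeze(-1)
        return x * w

class VisualBranch(nn.Module):
    """VGG16 convolutional features reweighted by channel attention, then pooled and projected."""
    def __init__(self, out_dim=512):
        super().__init__()
        self.backbone = vgg16(weights=None).features  # convolutional layers only
        self.attention = ChannelAttention(512)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.proj = nn.Linear(512, out_dim)

    def forward(self, img):
        f = self.attention(self.backbone(img))
        return self.proj(self.pool(f).flatten(1))

class AudioBranch(nn.Module):
    """Small CNN extracting auditory emotion features from a log-mel spectrogram (assumed input)."""
    def __init__(self, out_dim=512):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, out_dim)

    def forward(self, spec):
        return self.proj(self.conv(spec).flatten(1))

class FusionModule(nn.Module):
    """Concatenates the audio and visual features and reduces them to 200 dimensions
    (the dimensionality the abstract reports as most effective) before classification."""
    def __init__(self, feat_dim=512, fused_dim=200, n_emotions=4):
        super().__init__()
        self.reduce = nn.Linear(2 * feat_dim, fused_dim)
        self.classifier = nn.Linear(fused_dim, n_emotions)

    def forward(self, v, a):
        z = torch.relu(self.reduce(torch.cat([v, a], dim=1)))
        return self.classifier(z)

# Illustrative usage with random tensors (shapes are assumptions):
v, a, fuse = VisualBranch(), AudioBranch(), FusionModule()
img = torch.randn(2, 3, 224, 224)    # images associated with the music
spec = torch.randn(2, 1, 128, 256)   # log-mel spectrogram of the audio
logits = fuse(v(img), a(spec))       # (2, n_emotions)
```

In this sketch the 200-dimensional bottleneck in the fusion module plays the role of the dimensionality reduction the abstract evaluates; the paper may instead use a separate reduction technique such as PCA, which the abstract does not specify.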
