Constructing a Multimodal Music Teaching Model in College by Integrating Emotions
Published online: 22 May 2024
Received: 25 Jan 2024
Accepted: 03 Apr 2024
DOI: https://doi.org/10.2478/amns-2024-1202
Keywords
© 2024 Jia Song, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
In this study, we enhanced the CaffeNet network to recognize students’ facial expressions in the music classroom and extracted emotional features from those expressions. In parallel, students’ speech signals were passed through filters to extract speech-based emotional features. The expression and speech emotional features were then combined using the LRLR fusion strategy to obtain multimodal emotion recognition results, and a music teaching model incorporating this multimodal emotion recognition was developed. Our analysis shows only a 6.03% discrepancy between the model’s emotion recognition results and manual emotional assessments, underscoring its effectiveness. Applying the model in a music teaching setting produced a noticeable increase in positive emotional responses: happy and surprised emotions peaked at 30.04% and 27.36%, respectively, in the fourth week, and 70% of students displayed a positive learning status, indicating a clear boost in engagement and motivation for music learning. This approach markedly enhances students’ interest in learning and provides a solid basis for improving educational outcomes in music classes.
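The abstract does not detail how the LRLR strategy combines the two modalities, so the following is a minimal sketch assuming a generic decision-level (late) fusion of expression-based and speech-based emotion scores; the emotion label set, the weighting, and the example values are hypothetical stand-ins, not the paper's actual method.

```python
# Minimal sketch of decision-level fusion of expression and speech emotion
# scores. The LRLR strategy itself is not specified in the abstract, so a
# simple weighted late fusion is used here as a hypothetical stand-in; the
# emotion labels, weights, and score values below are illustrative only.
import numpy as np

EMOTIONS = ["happy", "surprised", "neutral", "sad", "angry"]  # assumed label set

def fuse_emotions(expr_probs: np.ndarray,
                  speech_probs: np.ndarray,
                  expr_weight: float = 0.6) -> str:
    """Combine per-modality emotion probabilities and return the fused label."""
    expr_probs = expr_probs / expr_probs.sum()        # normalize each modality
    speech_probs = speech_probs / speech_probs.sum()
    fused = expr_weight * expr_probs + (1.0 - expr_weight) * speech_probs
    return EMOTIONS[int(np.argmax(fused))]

# Example: scores that a CaffeNet-style expression branch and a filter-based
# speech branch might produce for one student at one time step.
expr = np.array([0.55, 0.20, 0.15, 0.05, 0.05])
speech = np.array([0.30, 0.40, 0.20, 0.05, 0.05])
print(fuse_emotions(expr, speech))  # -> "happy" under the default weighting
```

In a classroom deployment, fused labels of this kind would be aggregated over time to track the share of positive emotions (e.g., happy, surprised) reported in the study.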