A Multidimensional Mining and Pattern Recognition Approach for Piano Teaching Behavior Data in Music Education
Published online: 24 Mar 2025
Received: 01 Nov 2024
Accepted: 14 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0705
© 2025 Cheng Lyu, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Piano is one of the important basic courses and a compulsory course for music education majors in colleges and universities. The cultivation and improvement of students’ piano playing ability and performance level are especially important for the music teaching majors in colleges and universities. For a long time, the research and discussion on the reform of piano teaching for college music teaching majors has been very active, highlighting the importance of piano teaching for talent cultivation. With the arrival of the “Internet+” era represented by big data, Internet of Things and cloud computing, traditional piano teaching faces unprecedented opportunities and challenges.
Traditional piano playing courses rely on a relatively uniform teaching method that does not consider each student’s actual learning situation and learning needs. The resulting “one-size-fits-all” teaching mode weakens the effect of playing and singing classes and restricts students’ learning initiative. Big data technology can identify and analyze the knowledge mastery and learning needs of every student in the playing and singing class, so that different teaching programs can be developed for different students. This ensures that the knowledge taught is better suited to students’ learning needs and fully stimulates their initiative to learn independently [1-2]. After class, teachers can explain again the knowledge that students have not yet mastered and assign targeted training tasks, which consolidates the theoretical basis of what students have learned and raises their level of playing and singing technique [3-4]. In addition, by applying big data technology flexibly, teachers can identify their own teaching deficiencies, which better guides their subsequent “playing and singing” teaching and helps to improve its overall quality and efficiency [5].
Cheng, M. et al. constructed a piano playing gesture recognition model based on the Extreme Learning Machine algorithm. By recognizing and analyzing changes in learners’ hand gestures, the model obtains dynamic information about their hand joints, and a data analysis model then uses this information to recommend personalized piano learning resources [6]. Johnson, D. et al. designed an automatic evaluation system that recognizes piano hand poses by applying machine learning to depth maps for hand segmentation and hand pose detection. Experiments showed that the best performance was achieved by the model that used deep contextual features for hand image segmentation and normal vector histograms for hand image detection [7]. Hsiao, C. P. et al. proposed a glove that simulates the haptics of a piano teacher’s playing. Embedded vibration sensors capture the sound signals during the teacher’s playing, and machine learning algorithms recognize them as tapping behaviors, which complements the way students learn during piano teaching [8]. Chen, Y. C. et al. developed a human posture image recognition system for piano playing based on edge computing technology. It captures side-sitting, front-sitting, and hand images of a player, which helps to standardize students’ postures during piano playing [9]. Dai, L. addressed the low recognition accuracy of two-piano playing sequences in noisy and reverberant environments by proposing a neural network-based assisted training and analysis system, which provides more scientific training means for two-piano players and significantly improves their training efficiency and performance quality [10].
Huang, N. et al. improved the BP neural network-based note recognition method used in traditional music teaching by fusing an endpoint detection algorithm with a radial frequency extraction algorithm, which strengthened the model’s recognition of note timing and base notes, and also implemented a piano performance evaluation model based on the improved BP neural network [11]. Chen, Q. divided the piano performance scoring system into single-note and multi-note recognition tasks. For the real-time single-note recognition task, algorithms such as local energy endpoint detection were used to improve the real-time performance and robustness of recognition, which improves the stability of the scoring system and gives students the correct feedback they need on their performance [12]. Yu, Z. et al. emphasized the importance of automatically recognizing and evaluating playing intensity during piano performance for music teaching assistance, and proposed a piano playing intensity evaluation system whose performance initially meets expectations and which can accurately assess the piano playing effect under interference conditions [13]. Xue, X. et al. examined the use of AI wireless networks in music teaching, using MIDI and audio editing to capture and record piano performances labeled with notes and audio waveforms, so that students could both train with the support of AI multiple signal classification algorithms and select teachers accordingly [14]. Asahi, S. et al. built a piano practice support system based on a long short-term memory network that extracts information about learners’ performances during practice, enabling students to evaluate the rhythm and melody of their performances through a systematic analysis of that information and to practice the piano independently [15].
This study is oriented towards multidimensional data mining of teaching behaviors. Based on the characteristic attributes of teaching behaviors, information gain (IG) is adopted as the measure of the role of each feature, and the teaching behavior data are clustered. To reduce the interfering factors present in teaching videos and better identify teaching behavior patterns, this paper proposes the Teacher-Set IE algorithm to identify and extract those patterns. In addition, through cross-layer bilinear aggregation of the last layer of a 2D convolutional neural network and a 3D convolutional neural network, a teacher behavior pattern recognition model based on 3D bilinear pooling (3D BP-TBR) is proposed. Finally, the practical effectiveness of the proposed teacher behavior pattern recognition method is tested through experiments.
Data mining, also known as knowledge discovery in databases, is the process of extracting implicit, previously unknown, but potentially useful information and knowledge from large amounts of incomplete, noisy, fuzzy, and random data drawn from practical applications. After more than 20 years of development, data mining technology has not only become more mature in theory, but a considerable number of data mining products and application systems have also emerged and proven successful.
Data mining is a multidisciplinary cross-cutting information technology, which contains theories and techniques from a number of subject areas, such as databases, machine learning, artificial intelligence, and statistics. Databases, artificial intelligence, and mathematical statistics are strong technical pillars of data mining research. Methods and mathematical tools for data mining include statistics, decision trees, neural networks, fuzzy logic, linear programming, etc.
The basic steps of data mining are shown in Fig. 1. Data mining is a complete and iterative process of human-computer interaction that goes through several steps. Generally speaking, the process consists of five main stages: data preparation, data selection, data preprocessing, data mining, and transforming models and patterns.

Figure 1. The basic steps of data mining
Data preparation involves identifying the research object and setting predictable objectives for the project. Data selection involves collecting and parsing the data needed for the study. Data preprocessing includes operations such as cleaning, outlier removal, filtering, denoising, and normalization. Model evaluation assesses the effectiveness of the constructed model and fully explains and justifies the discovered knowledge so that it can support practical problems.
In feature extraction for multidimensional data mining, the dimension of the feature space grows as new features are added, which can easily lead to the curse of dimensionality if it is not restricted. When the dimension of the extracted features is high, some features will be correlated or redundant. If the feature space is too large, computational complexity may increase, model training accuracy may suffer, and the classification effect may decline. High dimensionality is therefore a key and difficult problem in data mining research, and features need to be managed carefully during behavior recognition so that the number of features is as small as possible while the information remains as complete as possible.
When it is uncertain which feature attributes of teaching behaviors should be included in class characterization or class comparison, feature filtering methods are used to identify irrelevant or weakly relevant attributes, so that a subset of features can be selected to represent the teaching behavior data for classification.
Information gain (IG) is an effective measure of the role of features. The IG value of a feature characterizes the average role that the feature plays in classification. The larger the IG value of a feature, the larger the role of this feature for classification in that corpus set. If the same feature has significantly different IG values in two different corpus sets, the role it plays in the two corpus sets differs significantly.
The calculation of information gain is described below.
Let
where
Let attribute
where term
where
Calculate the information gain for each attribute of the sample in
Feature screening proceeds as follows: for a given training set, the information gain IG of each behavioral feature is calculated, which requires the probabilities and entropy of each feature attribute, and the feature attributes whose IG is lower than a set threshold are removed from the feature space.
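The screening step can be illustrated with a minimal sketch. It assumes the standard entropy-based definition of information gain for discrete features (class entropy minus the weighted entropy after splitting on the feature); the feature names, toy values, and threshold below are hypothetical and are not taken from the experiments in this paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """IG of one discrete feature: class entropy minus weighted entropy after the split."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def screen_features(table, labels, threshold):
    """Drop features whose IG is below the threshold; return kept names and all gains."""
    gains = {name: information_gain(column, labels) for name, column in table.items()}
    kept = [name for name, gain in gains.items() if gain >= threshold]
    return kept, gains


# Hypothetical toy data: two behavioral features and a teaching-mode label per sample.
table = {
    "gesture": ["high", "high", "low", "low"],
    "position": ["left", "right", "left", "right"],
}
labels = ["S4", "S4", "S1", "S1"]
kept, gains = screen_features(table, labels, threshold=0.1)
print(kept, gains)
```

In this toy case the `gesture` feature separates the two classes completely, while `position` carries no class information and is screened out.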
Teaching behavioral features are of various types, and their data representation needs to be normalized to facilitate fusion calculation. After data collection, data mining consists of three stages: data preprocessing, pattern discovery, and pattern analysis. Because preprocessed data is the source for pattern discovery, the quality of data preprocessing directly affects the final result. A good data source not only yields high-quality patterns but also improves the performance of data mining. Data preprocessing is therefore the foundation of the whole data mining process and the key to its quality. The data preprocessing process is as follows:
1) Data cleaning: filling in missing values, smoothing noisy data, and identifying and removing outliers. Dirty data can throw the mining process into chaos and lead to unreliable output. In the multidimensional data mining in this paper, teaching behavior filtering is determined in real time rather than after the whole teaching process has finished, so the information recorded in the teaching behavior logs will have some missing fields. These missing fields are filled with default values, while data with missing key information is removed directly during the data preprocessing stage.
2) Data integration: combining data from multiple data sources into a consistent data store for the research project. The data sources include teaching behavior data, teacher data, etc. During preprocessing, the required feature attributes are extracted from both to form a new data set.
3) Data transformation: transforming data into a form suitable for mining by summarizing and aggregating data and using concept hierarchies, so that low-level “raw” data is generalized into high-level concepts. Attribute data is normalized to fall within specific intervals, and new attributes are constructed and added to the attribute set to aid the mining process.
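The cleaning and normalization steps above might look like the following sketch. The record fields (`teacher_id`, `duration`, `gesture_count`), the default values, and the choice of min-max scaling to [0, 1] are all assumptions made for illustration; the paper does not specify the log schema or the normalization interval.

```python
def preprocess(records, key_fields, defaults, numeric_fields):
    """Clean behavior-log records, then min-max normalize numeric fields to [0, 1]."""
    cleaned = []
    for rec in records:
        # Records with missing key information are removed outright.
        if any(rec.get(k) is None for k in key_fields):
            continue
        # Other missing fields are filled with default values.
        filled = {**defaults, **{k: v for k, v in rec.items() if v is not None}}
        cleaned.append(filled)
    if not cleaned:
        return cleaned
    for field in numeric_fields:
        values = [r[field] for r in cleaned]
        low, high = min(values), max(values)
        span = high - low
        for r in cleaned:
            # A constant column carries no information; map it to 0.0.
            r[field] = (r[field] - low) / span if span else 0.0
    return cleaned


# Hypothetical log entries; None marks a field that was not recorded.
logs = [
    {"teacher_id": "T1", "duration": 30.0, "gesture_count": 4},
    {"teacher_id": "T2", "duration": None, "gesture_count": 10},
    {"teacher_id": None, "duration": 12.0, "gesture_count": 7},
]
result = preprocess(
    logs,
    key_fields=["teacher_id"],
    defaults={"duration": 0.0, "gesture_count": 0},
    numeric_fields=["duration", "gesture_count"],
)
print(result)
```

Filling before normalizing means the default value takes part in the min-max range, which is a design choice of this sketch rather than a requirement of the method.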
Research on clustering in data mining has focused on finding appropriate and effective cluster analysis methods for large databases. Popular research themes include the scalability of clustering methods, high-dimensional cluster analysis techniques, the effectiveness of methods for clustering complex types of data, and clustering methods for mixed numerical and categorical data in large databases.
As an important data analysis method, clustering has received growing attention in large-scale data applications. Based on the similarity between data points, a mathematical model divides the data set into classes so that the similarity between classes is minimized and the similarity within classes is maximized. A large number of clustering algorithms have emerged since clustering was first introduced. In this section, clustering algorithms are classified into six categories: partition-based, hierarchical, spectral, grid-based, density-based, and fuzzy clustering algorithms.
The process of k-means algorithm is as follows:
1) Randomly select k data points as the initial class centers.
2) Assign each remaining data point to the closest class by calculating its distance from the centers of the k classes.
3) For the newly formed k classes, recalculate the class centers.
4) Repeat 2) and 3) until all class centers no longer change.
The selection of the initial centers of the k-means clustering algorithm is random, and the initial points selected in different ways make the clustering results show differences.
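The k-means procedure can be sketched in a few lines. This is a minimal illustration using squared Euclidean distance and a seeded random initialization; the `seed` and `max_iter` parameters and the toy points are assumptions added here so that a run is reproducible and guaranteed to stop.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # 1) random initial centers
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:  # 2) assign each point to the closest center
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # 3) recompute each center as the mean of its class (keep it if the class is empty)
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # 4) stop when the centers no longer change
            break
        centers = new_centers
    return centers, clusters


# Hypothetical 2D points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(points, k=2)
print(sorted(centers))
```

Changing `seed` changes the initial centers; on less clearly separated data this can lead to different final clusters, which is the sensitivity to initialization noted above.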
The k-medoids clustering algorithm represents each class by the data point closest to the center of the class. The k-medoids algorithm is used in many applications because of its fast convergence, local search capability, and simplicity.
The theory of the k-medoids clustering algorithm is that given
where
The k-medoids algorithm intra-class similarity is usually measured using the Euclidean distance, which is defined as:
where
The exact procedure of k-medoids clustering algorithm is as follows:
1) Select k data points as the initial class centers.
2) Calculate the distances of the other data points to the class centers according to the Euclidean distance and assign each point to the class with the closest center.
3) Calculate the required objective function.
4) Generate a non-center point as a candidate class center.
5) Update the centers of the original classes to form new classes.
6) Repeat 2), 3), 4) and 5) until the objective function no longer changes.
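A minimal sketch of the k-medoids procedure follows. It assumes the objective function is the sum of Euclidean distances from each point to its nearest class center, and, instead of drawing one random non-center point per round, it tries every non-center point as a replacement and keeps a swap only when the objective decreases. That deterministic search is a simplification chosen for this illustration, not necessarily the variant used in this paper.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(points, medoids):
    """Assumed objective: sum of distances from each point to its nearest medoid."""
    return sum(min(euclidean(p, m) for m in medoids) for p in points)


def kmedoids(points, k, seed=0):
    rng = random.Random(seed)
    medoids = rng.sample(points, k)  # initial class centers
    cost = total_cost(points, medoids)
    improved = True
    while improved:  # stop when no swap lowers the objective
        improved = False
        for i in range(k):
            for candidate in points:
                if candidate in medoids:
                    continue  # only non-center points are candidates
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                trial_cost = total_cost(points, trial)
                if trial_cost < cost - 1e-12:
                    medoids, cost, improved = trial, trial_cost, True
    labels = [min(range(k), key=lambda j: euclidean(p, medoids[j])) for p in points]
    return medoids, labels, cost


# Hypothetical 2D points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels, cost = kmedoids(points, k=2)
print(sorted(medoids), cost)
```

Unlike k-means centers, the returned medoids are always actual data points, which makes the result less sensitive to outliers.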
After studying a large number of actual piano teaching videos, we found that teacher behavior in videos of teaching scenes follows a certain pattern: the few people whose movement distance and body posture change the most over the whole video are highly likely to include the teacher. In this paper, we name this regularity of teaching behavior the “teacher set”, that is, the possible spatial area of teacher behavior in the whole classroom teaching video.
To reduce the interfering factors in teaching videos and better identify teaching behavior patterns, this paper proposes the Teacher-Set IE algorithm, which identifies and extracts teaching behavior patterns. The algorithm consists of three steps applied to videos of actual teaching scenes: tracking the human body key points of the teacher and students, teacher set recognition, and teacher set extraction.
In the piano teaching video, assume that
Where
Next, the movement distances and body posture gesture change distances of the teacher and multiple students in the actual teaching scene are calculated. Assuming that
Where,
After determining the spatial region of the teacher set, the behavioral information of the teacher set will be acquired and recorded, mainly including: frames, Person ID,
Where,
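The idea of ranking people by how much they move can be sketched as follows. Everything in this block is an assumption made for illustration: the input layout (one list of 2D key points per frame for each tracked person ID), the definition of movement distance as the displacement of the body center between consecutive frames, the definition of posture change as the mean key point displacement after removing that translation, and the equal-weight sum used for ranking. It is not the Teacher-Set IE implementation itself.

```python
import math


def center(joints):
    """Mean position of one person's key points in a single frame."""
    xs, ys = zip(*joints)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def motion_scores(track):
    """Return (movement distance, posture change distance) for one person's track."""
    movement = posture = 0.0
    for prev, curr in zip(track, track[1:]):
        (px, py), (cx, cy) = center(prev), center(curr)
        movement += math.hypot(cx - px, cy - py)
        # Posture change: key point displacement after removing whole-body translation.
        posture += sum(
            math.hypot((x2 - cx) - (x1 - px), (y2 - cy) - (y1 - py))
            for (x1, y1), (x2, y2) in zip(prev, curr)
        ) / len(curr)
    return movement, posture


def teacher_set(tracks, top_n=1):
    """Rank person IDs by movement plus posture change and keep the top_n."""
    totals = {pid: sum(motion_scores(track)) for pid, track in tracks.items()}
    return sorted(totals, key=totals.get, reverse=True)[:top_n]


# Hypothetical tracks: three frames, two key points per person.
tracks = {
    "p1": [[(0, 0), (0, 2)], [(3, 4), (3, 6)], [(6, 8), (6, 10)]],  # walks across the room
    "p2": [[(5, 5), (5, 7)], [(5, 5), (5, 7)], [(5, 5), (5, 7)]],  # stays seated
}
print(teacher_set(tracks))
```

The bounding region of the selected IDs' key points over all frames would then give a candidate spatial area for the teacher set.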
After the teacher set and its motion regions have been identified and extracted, the teacher’s behaviors in the teaching scenario need to be recognized. To obtain the types of teacher behaviors in the teaching video at a fine-grained level, this paper builds on fine-grained recognition methods for two-dimensional images and proposes a cross-layer bilinear aggregation of the last layer of a two-dimensional convolutional neural network with a three-dimensional convolutional neural network (3D CLBP). Based on 3D CLBP, which can incorporate more features from the three-dimensional convolutional layers, a teacher behavior pattern recognition model based on three-dimensional bilinear pooling (3D BP-TBR) is proposed.
Since the number of 3D convolutional operations is
Assume
where
Similarly, the formula for the aggregated 3D convolutional feature
where
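As a rough illustration of bilinear aggregation of two convolutional feature maps, the sketch below pools two sets of per-location feature vectors by summing their outer products. The signed square root and L2 normalization steps are common practice in bilinear pooling and are assumed here; the input shapes and values are hypothetical, and the cross-layer 2D/3D wiring of 3D BP-TBR is not reproduced.

```python
import math


def bilinear_pool(feat_a, feat_b):
    """Sum of outer products over locations, then signed sqrt and L2 normalization.

    feat_a and feat_b are lists with one channel vector per spatial(-temporal)
    location; both must cover the same locations.
    """
    assert len(feat_a) == len(feat_b), "both maps must cover the same locations"
    ca, cb = len(feat_a[0]), len(feat_b[0])
    pooled = [0.0] * (ca * cb)
    for a, b in zip(feat_a, feat_b):
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                pooled[i * cb + j] += ai * bj  # outer product, accumulated over locations
    pooled = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]  # signed square root
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm else pooled


# Hypothetical features: 2 locations, 2 channels from one layer and 3 from another.
feat_a = [[1.0, 0.0], [1.0, 0.0]]
feat_b = [[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
descriptor = bilinear_pool(feat_a, feat_b)
print(len(descriptor), descriptor)
```

The pooled descriptor has one entry per channel pair, so its length is the product of the two channel counts, which is why bilinear features capture pairwise channel interactions at the cost of a larger representation.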

Figure 2. The architecture of 3D BP-TBR
In this paper, piano teaching is classified into four teaching modes based on the characteristics of piano teaching behaviors, i.e., stagnant (S1), focused (S2), rushed (S3), and rhythmic (S4), and multidimensional data mining is then performed on the real course data set. In teaching, learners can choose when to participate in the course according to their needs. The learning data for each teaching behavior mode are aggregated in weekly units, and the teaching behavior payoff, learning gain, and learning efficiency are calculated. The payoff formula is:
Where,
The learning gain formula is:
The formula for learning efficiency is:
Over the entire course, learning efficiency under the rhythmic teaching behavior mode fluctuates less and is better than under the other modes, remaining stable above 2.0 after the second week. Learning efficiency is closely related to the teaching behavior mode, which directly affects learners’ effectiveness in course learning and reflects the teaching situation and the characteristics of the teaching behavior. The fact that learners maintained their learning efficiency under rhythmic teaching behavior in the piano course indicates that this mode lets learners enjoy exploring knowledge during teaching, choose their own learning space, and develop stronger motivation to learn, and that it leads to better learning results.

Figure 3. Dynamic change of learning efficiency under different teaching behavior modes
This paper studies the dynamic evolution of quiz scores under the various piano teaching behavior patterns, where the scores indicate learners’ mastery of the unit’s course content and demonstrate their learning progress. The evolution of learning gains is shown in Figure 4. Under the stagnant teaching behavior mode, the instructor contributes only slightly to the quiz near the end of the lesson, and correspondingly the learners’ quiz scores show only a small gain near the end of the lesson and are almost zero at other times. Under the focused teaching behavior mode, the first four weeks yielded good gains in quiz performance, but as the course progressed the quiz scores showed a pattern similar to that of learning efficiency. Under the rushed teaching behavior mode, engagement payoff was relatively high throughout the course, so learners also achieved better quiz scores, with learning gains ranging from 0.4 to 0.9. Under the rhythmic teaching behavior mode, payoff was relatively low in the first two weeks, but instructors of this type recognized the problem quickly and adjusted their teaching in time, so learners were corrected promptly. Subsequent quiz scores trended upward, and learning gains stayed at 0.8 or above after the second week.

Figure 4. Evolution of learning gains
The UCF101 dataset is used for the experiments in this paper. UCF101 is a piano teaching course video dataset with a total of 13,320 video clips and a total duration of 27 hours, with clip lengths ranging from 4 to 10 seconds. The dataset is first randomly divided into a training set and a validation set, with the training set accounting for 80% and the validation set for 20% of the videos in the dataset, and the numbers of video clips in the training set and validation set are 9,537 and 3,783, respectively. Four teaching behavior patterns, stagnant (S1), focused (S2), rushed (S3), and rhythmic (S4), are selected as the research object of this paper. Accuracy, loss, recall, precision, and the F1 index are used to evaluate the recognition results of the model.
The experiment verifies the recognition effect of the model when the dataset is randomly divided, which is also commonly known as a pattern recognition experiment. Randomly dividing data into training and validation sets in a fixed proportion is the most common way of processing it, and such a division allows the model’s recognition ability to be evaluated well. In the experiments, the recognition model of this paper (3D BP-TBR), the traditional BP recognition model, and the TSN model are trained on the training set and validated on the validation set, and their recognition results are compared. The experimental results for the randomly divided dataset are shown in Fig. 5. The Train Loss and Val Loss of the proposed recognition model are 0.089% and 0.047%, respectively, and its Train Acc (97.11%) and Val Acc (99.03%) are higher than those of the traditional BP recognition model and the TSN model. The results show that the proposed method effectively improves recognition accuracy.

Figure 5. Results on the randomly divided data set
The model of this paper was then used to recognize the four teaching behavior patterns, and the results are shown in Table 1. The precision for each of the four teaching behavior patterns, stagnant (S1), focused (S2), rushed (S3), and rhythmic (S4), exceeds 93%, and the average precision, recall, and F1 value are 97.31%, 96.96%, and 97.34%, respectively, which indicates that the pattern recognition method of this paper is effective and can adequately identify piano teaching behavior patterns.
Table 1. Recognition results for the teaching behavior patterns
| Label | Behavior pattern category | Recall | Precision | F1 |
|---|---|---|---|---|
| S1 | Stagnant type | 0.9735 | 0.9895 | 0.9866 |
| S2 | Focused type | 0.9984 | 0.9387 | 0.9682 |
| S3 | Rushed type | 0.9085 | 0.9748 | 0.9435 |
| S4 | Rhythmic type | 0.9978 | 0.9893 | 0.9954 |
| Mean | | 0.9696 | 0.9731 | 0.9734 |
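Per-class precision, recall, and F1 values of this kind, together with their means, are typically derived from a confusion matrix. The sketch below shows the standard definitions; the two-class confusion matrix is a hypothetical example, not the experimental data, and simple unweighted (macro) averaging is assumed for the mean row.

```python
def per_class_metrics(confusion):
    """Compute (precision, recall, F1) per class and their unweighted means.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    rows = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp  # samples of class c predicted as another class
        fp = sum(confusion[r][c] for r in range(n)) - tp  # other classes predicted as c
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        rows.append((precision, recall, f1))
    macro = tuple(sum(col) / n for col in zip(*rows))
    return rows, macro


# Hypothetical confusion matrix for two classes.
confusion = [
    [8, 2],
    [1, 9],
]
rows, macro = per_class_metrics(confusion)
for label, (p, r, f1) in zip(["class 0", "class 1"], rows):
    print(f"{label}: precision={p:.4f} recall={r:.4f} F1={f1:.4f}")
print("mean:", tuple(round(v, 4) for v in macro))
```

Note that the mean F1 computed this way is the average of the per-class F1 values, which generally differs from the F1 of the mean precision and mean recall.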
Mining and identifying teaching behavior data can provide scientific data and a basis for optimizing teaching processes, teaching results, and teaching environments. This paper therefore establishes a teacher behavior pattern recognition model based on 3D bilinear pooling (3D BP-TBR) using multidimensional data mining. The research results of this paper are as follows:
1) Mining the regularities of the different teaching behavior patterns shows that, in piano teaching, the learning efficiency of the stagnant and focused teaching behaviors is relatively low over the whole course and tends towards 0 by the tenth week. The learning efficiency of the rushed teaching behavior pattern is between 1.00 and 2.75, and that of the rhythmic teaching behavior pattern remains stable above 2.0 after the second week. For learning gains, the trends of the four teaching behavior patterns are similar to those of learning efficiency. Among them, the rhythmic teaching behavior pattern has better learning efficiency and learning gains than the other patterns. This shows that different teaching behaviors affect learning efficiency and learning gains differently, and that the rhythmic teaching behavior pattern is associated with better learning gains in piano teaching.
2) 3D BP-TBR achieved the best results in the experiment, with Train Loss and Val Loss of 0.089% and 0.047% and Train Acc and Val Acc of 97.11% and 99.03%, respectively, and its recognition performance is higher than that of the traditional BP recognition model and the TSN model. In addition, the precision rates for the stagnant, focused, rushed, and rhythmic teaching behavior patterns were 98.95%, 93.87%, 97.48%, and 98.93%, respectively, and the average precision, recall, and F1 value were all higher than 96%, which verifies that the pattern recognition model in this paper performs well and can accurately recognize piano teaching behavior patterns.
