Open Access

Research on the optimisation of music education curriculum content and implementation path based on big data analysis

Feb 05, 2025


Introduction

The curriculum is crucial to music teaching, yet in most colleges and universities the music curriculum is poorly matched to the intended teaching outcomes. First, for non-music-major students the curriculum is overly specialised: the textbooks cover not only foundational courses but also many professional courses, such as music appreciation. Such overly complex content makes music learning more difficult for non-major students, who consequently lose interest [1-4]. Secondly, the curriculum design itself is weak. The music courses offered to non-major students often depend on the individual teacher, who tends to offer only whatever he or she happens to be good at, so the music class frequently degenerates into a generic “music appreciation class” of marginal value [5-7].

To help students better understand the connotation and expressive power of music, it is necessary to explore teaching methods suited to cultivating musical talent and to foster students’ independent views on music [8-9]. From this perspective, the arrival of the big data era can partly solve the problem of serving different students with different needs in music learning and appreciation [10-11]. Big data technology can be used to streamline music teaching materials and to compile electronic and online textbooks suited to learners at different levels, providing students with a comprehensive, efficient and multidimensional environment and resources for learning, appreciation and communication [12-14]. Multimedia technology in the big data era can also be used to integrate music teaching organically with other disciplines, combining students’ abilities and interests to deliver more scientific music education and to stimulate enthusiasm for learning [15-18]. In addition, the application of big data platforms opens a new chapter in music curriculum evaluation; as the evaluation mechanism gradually improves, the aesthetic function of music education will be developed more deeply [19-20].

This paper briefly describes big data analysis technology and its processing platform, focusing on data mining technology in terms of its overview, process and algorithms. Based on the Node2vec algorithm, a graph embedding learning algorithm, fused with a machine learning algorithm, a method for recommending course content is proposed. Taking precision, recall and F-score as evaluation indexes, comparison experiments are designed and completed to verify the effectiveness of the Node2vec recommendation algorithm, and the Hadoop big data processing service together with the MapReduce computing framework is used to improve the overall performance of the recommendation system. Finally, the requirements of a recommendation system for MOOC music education courses are analysed, and a specific implementation path for the course content recommendation system in music education is proposed.

Big data analysis techniques

Big data refers to large-scale, wide-ranging data and information assets characterised by high volume, high velocity and great variety of forms and sources. The real purpose of big data technology is not simply to store and manage massive amounts of data and information, but to “add value” to the data by analysing and processing it to obtain the knowledge hidden within it.

Platform for analysing big data

A big data analytics platform is usually divided into a data collection layer, a data storage layer, a data processing layer and a service encapsulation layer; the structure is shown in Figure 1. Data collection is the first step of big data processing: data are extracted from sources such as business systems, the Internet and the Internet of Things by batch collection, web crawlers and other methods. Data storage is the second step, in which the collected data are stored using distributed file systems, databases and other facilities.

Figure 1.

Big data processing platform

Data processing is the third and most important step of big data processing, in which interesting and meaningful knowledge is extracted from the large amount of collected data by various data mining algorithms. Service encapsulation is the last step, often also called data visualisation, in which the knowledge and information obtained from analysis and mining are displayed in the form of charts and graphs.

Data mining techniques

Data mining is the technique of finding data rules in a large amount of data. It is the process of mining potential information and discovering valuable knowledge from the database. It contains many ideas from other fields, such as sampling estimation in statistics, modelling techniques and algorithm optimization in artificial intelligence, etc., which play an important role in supporting data mining techniques.

Data mining process

Data mining has unknown potential and value. The mining process is generally divided into four stages, including data selection, preprocessing, data mining, and data analysis and evaluation. The data mining model can be seen in Figure 2.

Data Selection

Data selection, also known as data integration, entails bringing together data from various operational scenarios. Data mining presupposes the availability of a large amount of rich data, which can be obtained from either an online processing system or an offline data warehouse. This step is to identify the data set to be analyzed, reduce the scope of data processing, and improve the efficiency of the process.

Data Preprocessing

Data cleansing is a form of data preprocessing that involves removing duplicate records, anomalies, type inconsistencies, and missing data from the data. The data obtained from the data selection stage may have “bad data”, such as inconsistent data types, missing and duplicated data, and incomplete data information.
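As a minimal illustration of this cleansing step, the following Python sketch (using pandas) removes duplicates, missing values, type inconsistencies and out-of-range records; the column names and value ranges are hypothetical assumptions, not fields from the study’s dataset.

```python
import pandas as pd

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Remove the kinds of 'bad data' listed above."""
    df = df.drop_duplicates()                       # duplicated records
    df = df.dropna(subset=["student_id", "score"])  # missing key fields
    df["score"] = pd.to_numeric(df["score"], errors="coerce")  # type inconsistencies
    df = df[df["score"].between(0, 100)]            # out-of-range anomalies
    return df.reset_index(drop=True)
```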

Data Mining

It is the most important step in the data mining process, which requires the selection of appropriate mining algorithms according to the mining objectives and data source types, and this step is divided into two categories: discovery type and verification type.

Data Analysis and Evaluation

In this stage the mined results are analysed against the mining objectives: information unrelated to the target or inconsistent with the user’s expected results is discarded, and information that meets expectations is retained. The data mining model is then adjusted according to the evaluation results, and mining is repeated until the user is satisfied with the knowledge obtained.

Figure 2.

Data mining model diagram

Data mining algorithms

Data mining aims to extract interesting, useful and implicit information. Its tasks are to find human-interpretable patterns that describe the data and to use some variables to predict the future values of unknown variables, thereby finding usable knowledge in data that is large, varied and complex. Different mining tasks call for different algorithms.

Description

Algorithms such as clustering, discovery of association rules, discovery of sequence patterns, regression, and deviation detection are generally used.

Clustering

That is, given a set of data points, each with a set of attributes and a measure of similarity between them, find clusters such that data points in the same cluster are similar and data points in different clusters are different. Select the most dominant features from a large number of behavioral features, put similar data together into a category, and find groups with distinctive features.
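A minimal clustering sketch is given below, assuming two hypothetical behavioural features (weekly listening hours and number of courses viewed) and using scikit-learn’s KMeans; the data and feature choice are illustrative only.

```python
import numpy as np
from sklearn.cluster import KMeans

# rows: students; columns: weekly listening hours, courses viewed (hypothetical)
X = np.array([[1.0, 2], [1.5, 3], [8.0, 20], [7.5, 18], [0.5, 1]])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # students in the same cluster receive the same label
```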

Association rules

That is, given a set of records, each containing a number of items, association rule mining finds associations between items based on their frequency of co-occurrence and extracts the corresponding rules.

Regression

That is, regression describes and predicts the trend of a numerical variable from other variables.

Deviation detection

That is, given a large amount of data, deviation detection finds the anomalous or problematic points within it.

Course recommendation methods based on big data analysis techniques
Hadoop Computing Framework

Hadoop is a distributed computing framework for big data processing that runs on clusters of inexpensive commodity hardware and provides applications with a set of reliable and stable interfaces [21]. It allows developers to write distributed programs and make full use of the cluster’s high-speed computing and storage capacity without having to understand the underlying principles and implementation details of the distributed system. Hadoop is a reliable, scalable software framework for parallel distributed computing that can process big data programs in parallel. Its core components are the HDFS distributed file system and the MapReduce distributed parallel computing framework.

HDFS distributed data management

HDFS is the Hadoop distributed file system. It can store very large numbers of files (on the order of millions), accessing data mainly as streams and processing it in batches. Its most notable feature is strong fault tolerance: when a stored replica is lost it can be recovered automatically, which is why it is widely deployed on commodity hardware. The HDFS architecture is shown in Figure 3. The HDFS service is provided by the following daemons: the NameNode, the master daemon running on the master node; the Secondary NameNode, the auxiliary node of the NameNode; and the DataNodes, the data daemons running on each slave node.

NameNode

The NameNode is mainly responsible for managing the HDFS directory structure, including the file directory tree, the index of file data blocks, and directory metadata. It is also responsible for managing block files: when a client reads data, it first contacts the NameNode to learn which data nodes hold the blocks and then contacts those DataNodes to retrieve the data. The NameNode controls how files are decomposed into blocks, identifies the nodes that should store each block, and tracks the overall status and availability of the distributed file system. The tasks performed by the NameNode consume a large amount of memory and I/O.

DataNodes

A DataNode writes the data blocks received from the upper layer of HDFS to actual Linux files and serves them for reading. When a client reads file data from HDFS, it first queries the NameNode to locate the DataNodes holding the data and then reads from those nodes. A key advantage of DataNodes is that they keep communicating with one another, which improves the redundancy of the stored data.

Secondary NameNode

It is not a backup of the NameNode. Its main job is to periodically read the file system edit log, merge the recorded changes into the fsimage file, and thereby provide an updated image for the NameNode’s next startup.

Figure 3.

HDFS Architecture

MapReduce Computing Framework

MapReduce is a software framework for writing applications that can process as well as analyse large datasets in parallel and enables multi-node clustering through features such as reliability, scalability and fault tolerance [22]. It consists of two main phases, i.e. Map phase and the Reduce phase.

Map

The Map phase involves analysing and extracting data, processing it into key-value pairs, and sending it out using parallel computing.

Reduce

The key-value pairs emitted in the Map phase are sorted and partitioned as the input to Reduce, which aggregates and computes over them, and the results are then output and stored.

The core idea of MapReduce is to slice the input data into multiple logical blocks; the number of blocks determines the number of Map tasks. Each Map task is executed in parallel on its computing node, and a single node can run several Map tasks. Figure 4 shows the MapReduce computation process: the intermediate values produced by the Map phase are combined as inputs to the Reduce phase, key-value pairs with the same key are routed to the same Reduce task for processing, and the final aggregation is computed and output as key-value pairs.
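The following single-process Python sketch illustrates the Map/shuffle/Reduce idea described above with a word-count example; a real MapReduce job would execute the map and reduce functions in parallel on Hadoop nodes rather than in one loop.

```python
from collections import defaultdict

def map_phase(line):            # Map: analyse the input and emit key-value pairs
    return [(word, 1) for word in line.split()]

def reduce_phase(key, values):  # Reduce: aggregate all values that share a key
    return key, sum(values)

lines = ["big data big value", "data mining"]
shuffle = defaultdict(list)
for line in lines:
    for k, v in map_phase(line):
        shuffle[k].append(v)    # shuffle: pairs with the same key go to the same Reduce
print(dict(reduce_phase(k, vs) for k, vs in shuffle.items()))
# {'big': 2, 'data': 2, 'value': 1, 'mining': 1}
```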

Figure 4.

MapReduce calculation process

Node2vec algorithm

This paper takes students’ course learning data from an online music education platform as the research object and investigates a course recommendation method based on the graph embedding learning algorithm Node2vec. To address the insufficient extraction of student and course information and the insufficient mining of feature relationships between students and courses in existing methods, a course recommendation method is proposed that combines the Node2vec graph embedding algorithm with machine learning algorithms.

Node Nearest Neighbour Sequence Acquisition

The Node2vec algorithm is built on a biased random walk. The biased random walk serves as the node information mining strategy for the graph: by adjusting hyperparameters, the walk acquires node information with a preference for particular parts of the graph structure, and the mined information is used to generate the sequence of each node’s nearest neighbours [23].

The Node2vec algorithm uses the random walk strategy shown in Fig. 5. Assuming that the current random walk has just traversed the edge (t, n) to reach node n, the bias operator αpq from node n to a candidate node x is computed as: $${\alpha _{pq}} = \left\{ {\begin{array}{*{20}{l}} {{1 /p}}&{{d_{tx}} = 0} \\ 1&{{d_{tx}} = 1} \\ {{1 /q}}&{{d_{tx}} = 2} \end{array}} \right.$$

Where dtx is the length of the shortest path between vertex t and vertex x.

Figure 5.

Schematic diagram of random walk strategy

The hyperparameter p controls the probability of revisiting the node just visited. From the above equation, p is involved in the computation only if dtx is 0. If p is smaller, the probability of returning to the vertex just visited becomes higher, and vice versa. The hyperparameter q controls the direction of the walk and determines whether it moves inward or outward. When q > 1, the random walk tends to visit vertices close to t; when q < 1, it tends to visit vertices farther away from t.

After the bias operator $${\alpha _{pq}}\left( {t,x} \right)$$ has been obtained for the candidate edge $$\left( {n,x} \right)$$, the unnormalised transition probability is computed as: $${\pi _{nx}} = {\alpha _{pq}}\left( {t,x} \right) \cdot {w_{nx}}$$

Here πnx is the transition probability for the random walk that has just traversed the edge $$\left( {t,n} \right)$$ to move on to node x, and wnx is the weight of the edge $$\left( {n,x} \right)$$. The probability $$P\left( {x\left| n \right.} \right)$$ that the current node n visits the next vertex x is then obtained by normalising this transition probability: $$P\left( {x\left| n \right.} \right) = \left\{ {\begin{array}{*{20}{l}} {{{{\pi _{nx}}} /Z}}&{if\left( {n,x} \right) \in E} \\ 0&{otherwise} \end{array}} \right.$$

Where Z is a normalisation constant.

In summary, the random walk obtains nearest-neighbour sequences for all n nodes of the graph. For each node, node information is sampled according to the random walk strategy determined by p and q. Each node is walked r times, and the information gathered over these r walks is finally merged to form that node’s nearest-neighbour sequence.
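A minimal re-implementation sketch of the biased (p, q) random walk described above is shown below for an unweighted adjacency-list graph; the graph, walk length and hyperparameter values are illustrative assumptions rather than the authors’ settings.

```python
import random

def node2vec_walk(adj, start, length, p=1.0, q=1.0):
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbours = adj[cur]
        if not neighbours:
            break
        if len(walk) == 1:                    # no previous node yet: uniform step
            walk.append(random.choice(neighbours))
            continue
        prev = walk[-2]
        weights = []
        for x in neighbours:
            if x == prev:                     # d_tx = 0: return to the previous node
                weights.append(1.0 / p)
            elif x in adj[prev]:              # d_tx = 1: common neighbour of prev and cur
                weights.append(1.0)
            else:                             # d_tx = 2: step outward
                weights.append(1.0 / q)
        walk.append(random.choices(neighbours, weights=weights)[0])
    return walk

adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
print(node2vec_walk(adj, start=0, length=5, p=0.5, q=2.0))
```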

Student/course feature relationship mining

In the Node2vec model, in order to calculate the conditional probabilities between words, each word is represented by two N-dimensional vectors. Assuming that the index of a word in the whole dictionary is i, its word vector is $${v_i} \in {R^N}$$ when it acts as the centre word and $${u_i} \in {R^N}$$ when it acts as a context word. Assuming that the index of the centre word “GCN” in the corpus is c and the index of a context word is o, the Softmax operation gives the conditional probability of the context word occurring given the centre word “GCN”: $$P\left( {{w_o}\left| {{w_c}} \right.} \right) = \frac{{\exp \left( {u_o^T{v_c}} \right)}}{{\sum\limits_{i \in V} {\exp } \left( {u_i^T{v_c}} \right)}}$$

This is then converted into the joint probability of the context words over the whole sequence: $$\prod\limits_{t = 1}^T {\prod\limits_{ - m \leq j \leq m,j \ne 0} P } \left( {{w^{(t + j)}}\left| {{w^{(t)}}} \right.} \right)$$

Where T denotes the length of the sequence, t the position of the window’s centre word, and m the size of the window. This allows the probability of each background word given each centre word to be calculated.

The goal of Node2vec is then to make the observed background words as probable as possible, so the next step considers only maximising this quantity. In this paper the maximum likelihood method is used, which is equivalent to minimising the negative log-likelihood: $$ - \sum\limits_{t = 1}^T {\sum\limits_{ - m \leq j \leq m,j \ne 0} {\log } } P\left( {{w^{(t + j)}}\left| {{w^{(t)}}} \right.} \right)$$

This loss function needs to be minimised in order to make the gap between the predicted result and the true result as small as possible. Expanding the log-probability of a single context word gives: $$\log P\left( {{w_o}\left| {{w_c}} \right.} \right) = u_o^T{v_c} - \log \left( {\sum\limits_{i \in V} {\exp } \left( {u_i^T{v_c}} \right)} \right)$$

Finally, to improve the efficiency of iterative solution on large-scale numerical matrices, and because the least squares method cannot compute a unique global optimum here, the gradient descent method is used to minimise the loss, based on the gradient: $$\frac{{\partial \log P\left( {{w_o}\left| {{w_c}} \right.} \right)}}{{\partial {v_c}}} = {u_o} - \sum\limits_{j \in V} P \left( {{w_j}\left| {{w_c}} \right.} \right){u_j}$$

In this way, the model parameters are iteratively updated to achieve optimisation.
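As a small numerical sketch of the softmax probability and the gradient update above, the following numpy code performs one gradient step that raises log P(w_o | w_c); the vocabulary size, embedding dimension and learning rate are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
V, N = 50, 16                      # vocabulary (node) count and embedding size
v = rng.normal(0, 0.1, (V, N))     # centre-word vectors v_i
u = rng.normal(0, 0.1, (V, N))     # context-word vectors u_i

def step(c, o, lr=0.05):
    """One gradient step that increases log P(w_o | w_c)."""
    scores = u @ v[c]              # u_i^T v_c for every word i
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()           # softmax: P(w_i | w_c)
    grad_vc = u[o] - probs @ u     # d log P(w_o | w_c) / d v_c
    v[c] += lr * grad_vc           # ascend the log-likelihood
    return probs[o]

print(step(c=3, o=7))              # probability of context word 7 given centre word 3
```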

The scenario for the use of the Node2vec model is shifted from contextual text prediction to mining the learning context for feature relationships between students and courses. In this context, its role is to calculate the probability of other nodes appearing under the condition that the target node appears.

Student/course relationship map construction

After mining the feature relationships between students and courses, the feature relationships obtained from mining are used as a priori knowledge to improve the representation of learning ability. Therefore, in this paper, two types of (i.e., student and course) relationship graphs are constructed to mine patterns in a data-driven manner.

In the student relationship graph, students’ collaborative interactions reflect the information on the student side well, so complex patterns can be learned by exploiting users’ collaborative interaction behaviour. Mathematically, there exists an edge $$e_{uv}^{U,*}$$ between student u and student v, and in this paper the Jaccard similarity is used to represent the strength of the correlation between the two students, calculated as follows: $$w_{uv}^{U,*} = \frac{{\left| {e_u^* \cap e_v^*} \right|}}{{\left| {e_u^* \cup e_v^*} \right|}}$$

Unlike student relationships, course relationships are directed because of the temporal information. A directed course relationship graph is therefore constructed to obtain a better learning representation. Specifically, the interactions of each student u are ordered into a sequence Su by interaction timestamp, and the number of times each pair of courses $$\left( {i,j} \right)$$ appears in a particular order across all sequences is counted. Formally, under behaviour $*$, if $$Cnt\left( {i \to j} \right) > 0$$ the course relationship graph GI has an edge $$e_{uv}^{I,*}$$ from course i to course j, where $$Cnt\left( \cdot \right)$$ is a counter, and the sequence strength of the edge is calculated as: $$w_{uv}^{I,*} = \frac{{Cnt{{(i \to j)}^*}}}{{Cnt{{(i \to j)}^*} + Cnt{{(j \to i)}^*}}}$$
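The two edge weights defined above can be illustrated with the following sketch, which computes the Jaccard similarity between students and the directional strength between courses from per-student, timestamp-ordered course sequences; the sequences are hypothetical toy data.

```python
from collections import Counter
from itertools import combinations

sequences = {                       # hypothetical time-ordered course IDs per student
    "u1": ["c1", "c2", "c3"],
    "u2": ["c1", "c3"],
    "u3": ["c2", "c1", "c3"],
}

def student_weight(u, v):
    a, b = set(sequences[u]), set(sequences[v])
    return len(a & b) / len(a | b)          # w_uv^{U,*}: Jaccard similarity

cnt = Counter()
for seq in sequences.values():
    for i, j in combinations(seq, 2):       # i appears before j in the sequence
        cnt[(i, j)] += 1

def course_weight(i, j):                    # w_ij^{I,*}: directional strength
    forward, backward = cnt[(i, j)], cnt[(j, i)]
    return forward / (forward + backward) if forward + backward else 0.0

print(student_weight("u1", "u2"), course_weight("c1", "c3"))
```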

The constructed student/course relationship diagram is shown in Figure 6.

Figure 6.

Student/course relationship

Course Recommended Performance Indicators

Prediction accuracy

Prediction accuracy is a quantitative indicator that measures how accurately the recommendation algorithm predicts user behaviour. The historical dataset of student behaviour is usually divided into a training set and a test set; the training set is used to build the recommendation model, the model is then used to predict behaviour on the test set, and the agreement between the predicted and the real behaviour of the students gives the prediction accuracy.

Suppose $$R\left( u \right)$$ is the list of items recommended to the user and $$T\left( u \right)$$ is the list of the user’s behaviours on the test set. Then the recall formula is:

$$\operatorname{Re} call = \frac{{\sum\limits_{u \in U} {\left| {R(u) \cap T(u)} \right|} }}{{\sum\limits_{u \in U} {\left| {T(u)} \right|} }}$$

The formula for the precision of the recommendation results is: $$\Pr ecision = \frac{{\sum\limits_{u \in U} {\left| {R(u) \cap T(u)} \right|} }}{{\sum\limits_{u \in U} {\left| {R(u)} \right|} }}$$

The F-score is the harmonic mean of Precision and Recall, with the formula: $$F = \frac{{2*\Pr ecision*\operatorname{Re} call}}{{\Pr ecision + \operatorname{Re} call}}$$

This paper uses the above three indicators as the main measures for evaluating the quality of the offline calculation results.
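A minimal sketch of these three offline metrics is given below; R maps each user to the recommended items and T to the true test-set items, and both are hypothetical toy data.

```python
R = {"u1": {"c1", "c2", "c3"}, "u2": {"c2", "c4"}}   # recommended items per user
T = {"u1": {"c1", "c3"}, "u2": {"c5"}}               # true test-set items per user

hits = sum(len(R[u] & T[u]) for u in R)
recall = hits / sum(len(T[u]) for u in R)
precision = hits / sum(len(R[u]) for u in R)
f_score = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
print(precision, recall, f_score)   # 0.4, 0.666..., 0.5
```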

Coverage rate

Coverage is another quantitative indicator for evaluating recommendation performance. The most common definition is the proportion of all items that have been recommended to at least one user. Assuming that U denotes the set of users, I the set of all items, and $$R\left( u \right)$$ the set of items recommended by the system to user u, the coverage is:

$$Coverage = \frac{{\left| {\bigcup\limits_{u \in U} {R\left( u \right)} } \right|}}{{\left| I \right|}}$$
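The coverage formula above can be sketched in the same way; the recommendation lists and the item catalogue below are hypothetical.

```python
R = {"u1": {"c1", "c2", "c3"}, "u2": {"c2", "c4"}}   # recommended items per user
catalogue = {"c1", "c2", "c3", "c4", "c5"}           # all items I
coverage = len(set().union(*R.values())) / len(catalogue)
print(coverage)   # 4 of the 5 items are ever recommended -> 0.8
```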

Recommended Optimisation and Implementation Pathways for the Music Education Curriculum
Experimental data set

This study collaborates with a university’s Collaborative Innovation Centre for Monitoring the Quality of MOOC Teaching to analyse the quality of that university’s music education courses. Text data generated by students on the MOOC platform were collected to construct a survey dataset on the effectiveness of recommending high-quality music education courses; its formal representation of text data is consistent with the MOOC teaching scenario.

The collected data were then cleaned to remove invalid responses and invalid text data. Invalid responses cover two cases: the response time for the scale was too short, or the answers to positively and negatively worded questions were too similar; both suggest that the student did not respond to the scale carefully, so the assessment result is unreliable. Deleting invalid text data means removing sentences that carry no meaning, including direct quotations and duplicated content copied and pasted during the interaction.
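As an illustration only, the two invalid-response rules could be applied as in the following pandas sketch; the column names, the 60-second threshold and the 0.5 similarity threshold are assumptions, not values reported in the study.

```python
import pandas as pd

df = pd.DataFrame({
    "response_seconds": [45, 320, 410],
    "positive_score":   [4.0, 3.5, 4.2],
    "negative_score":   [4.0, 1.5, 1.8],   # negatively worded items, reverse-keyed
})

too_fast = df["response_seconds"] < 60                                   # rule 1
too_similar = (df["positive_score"] - df["negative_score"]).abs() < 0.5  # rule 2
valid = df[~(too_fast | too_similar)]
print(len(valid), "valid responses kept")
```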

Eventually, a total of 1754 student users provided valid scale responses and had at least one valid text record, and were therefore selected as the experimental subjects, i.e. $$U = \left\{ {{u_i}} \right\}_{i = 1}^{\left| U \right|}$$. Each student user ui corresponds to a validity score $${y_i}$$ measured by the Validity Evaluation Scale for Music Education Course Recommendations. The distribution of the validity evaluation results for the MOOC platform’s music education course content recommendations, measured by this professional scale, is shown in Figure 7. The value of y ranges from 12 to 74 points, the mean Mean(y) is 33.19 points, the standard deviation Std(y) is 10.34 points, and the adjusted $${R^2}$$ is 0.9586, indicating a good fit. It can be seen intuitively that the low-rating and high-rating populations account for a smaller proportion, and that the student users of the MOOC teaching platform consider the platform’s course content recommendation effect to be average.

Figure 7.

Evaluation of the effectiveness of course content recommendation

For the 1851 users at this university, this study collected a total of 308225 student behaviour records from the MOOC platform, each carrying temporal information, denoted $${S_p} = \left\{ {({s_j},{t_j})} \right\}_{j = 1}^{\left| {{S_p}} \right|}$$ with $$\left| {{S_p}} \right| = 308225$$, which is used to train the Node2vec algorithm. Each student user ui has a unique subset of personalised learning behaviour data Si, $$i \in [1,1851]$$. These subsets satisfy $$ \cup _{i = 1}^{\left| U \right|}{S_i} = {S_p}$$ and $$\forall i,j \in [1,\left| U \right|]$$, $$i \ne j$$, $${S_i} \cap {S_j} = \emptyset $$. The distribution of the number of texts per student user $$\left| {{S_i}} \right|$$ is shown in Fig. 8; the distribution is uneven, and 220 student users even have fewer than 20 texts.

Figure 8.

Text quantity distribution of individual students

Meanwhile, the distribution of student behavioural text lengths collected by the MOOC platform is shown in Figure 9, and the distribution of text lengths also shows a long-tailed distribution, with the vast majority of the textual data being less than 300 words in length but some of the students’ behavioural textual data reaching lengths of more than 1,000 words. These uneven data distributions, as well as the problem of sparse student behavioural data, make the task of personalised recommendation of students’ music education course content more difficult.

Figure 9.

Text length distribution of student behavior

In order for the Node2vec algorithm to understand fine-grained textual information about students’ behaviours, from the total textual data Sp, a smaller collection of texts Sq was randomly selected to be labelled by an expert as to whether or not each piece of textual data was able to embody the learning behavioural traits of an individual user. By using Sq as supervised data, a machine learning model can be constructed to automatically identify whether sentences contain information that reflects the learning behavioural traits of individual students in music education courses.
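A hedged sketch of this supervised step is shown below, training a simple text classifier on the expert-labelled subset Sq; the example sentences, labels and the model choice (TF-IDF features with logistic regression) are illustrative assumptions rather than the model actually used in the study.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [                                   # hypothetical labelled sentences from S_q
    "I practised the harmony exercises twice before the quiz",
    "see the syllabus page for dates",
    "I replayed the ear-training module until I passed",
    "lecture slides attached",
]
labels = [1, 0, 1, 0]                       # 1 = reflects an individual learning trait

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["I repeated the sight-singing drill every evening"]))
```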

Effectiveness of course content recommendation
Experimental results

To verify the effectiveness of the Node2vec algorithm for music education course content recommendation after training on the 308,225 student behaviour records, this paper uses Pre@K, Recall@K, NDCG@K, MRR and AUC as evaluation metrics to assess its performance on the test set, with K set to 20 so that the gap between different models on the same metric can be seen intuitively. The experimental results are shown in Table 1. Eight baselines are compared: Matrix Factorization (MF), Heterogeneous Graph (HERec), Graph Convolutional Networks (NGCF), Contextual Information (ACKRec), Heterogeneous Information Networks (MOOCIR), Meta-Paths (HFCNqh), Knowledge Concepts (HFCNqk) and Behavioural Sequence Mining (HFCNqb). The last row of the table reports, for each evaluation metric, the improvement of the Node2vec algorithm over the best-performing baseline on that metric. The following conclusions can be drawn from the experimental results:

Firstly, the experimental results show that the MOOCIR, ACKRec and HERec models and the Node2vec algorithm outperform the MF algorithm on all indicators, which indicates that the rich auxiliary data represented by heterogeneous graphs can be exploited to improve the performance of the recommender system. This validates the importance of using heterogeneous graphs in recommender systems to mine entities and the relationships between them.

Secondly, the MOOCIR, NGCF and ACKRec algorithms, like the algorithm in this paper, use the random walk strategy to obtain node nearest-neighbour sequences, and they outperform the MF and HERec algorithms on every evaluation metric. For example, the NGCF model improves Recall@20 by 63.49% and 27.94% over MF and HERec respectively, and the MOOCIR algorithm improves Recall@20 by 69.72% and 32.82% over MF and HERec respectively. This verifies the importance and effectiveness of the random walk strategy for obtaining node nearest-neighbour sequences.

Thirdly, the Node2vec algorithm outperforms all baseline models on all evaluation metrics, and the HFCNqh algorithm is the best-performing recommendation algorithm among the baselines. Compared with HFCNqh, the Node2vec algorithm improves Pre@20, Recall@20, NDCG@20, MRR and AUC by 10.23%, 5.96%, 20.94%, 9.05% and 4.14%, respectively. This verifies the effectiveness and practicality of the Node2vec algorithm proposed in this paper for the course recommendation task.

Table 1. Results of recommended performance indicators for each model

Model Pre@20 Recall@20 NDCG@20 MRR AUC
MF 0.11521 0.10156 0.02516 0.01408 0.50423
HERec 0.14722 0.13489 0.05069 0.03023 0.62355
NGCF 0.18836 0.15266 0.58996 0.04815 0.65882
ACKRec 0.19156 0.16189 0.06251 0.05047 0.67321
MOOCIR 0.19554 0.16205 0.06322 0.06381 0.68011
HFCNqh 0.02131 0.18732 0.07182 0.06852 0.72342
HFCNqk 0.19983 0.17956 0.07134 0.06433 0.69834
HFCNqb 0.02015 0.18090 0.07560 0.06385 0.70705
Node2vec 0.02349 0.19849 0.08686 0.07472 0.75336
Improvement (%) 10.23% 5.96% 14.89% 9.05% 4.14%
Effect of data sparsity on experiments

Currently, one of the major problems facing course content recommendation systems is sparse student behaviour data. This section discusses the performance of the experimental model as the data sparsity varies. The learners are divided into four groups, (0, 5], (5, 15], (15, 30] and (30, 100], according to the number of interactions between learners and courses in the training set; the division results are shown in Table 2. The first group contains the largest number of learners and the last group the smallest, only 42 student users, indicating that most learners have studied fewer than 5 courses and have few historical interactions. The whole dataset is therefore relatively sparse, which matches the current situation and is close to reality.

Table 2. Partitioning results based on learner interaction data

Data set: MOOC music course content recommendation dataset

Groups      Number of users   Number of courses   User-course interactions   Density (%)
(0, 5]      25510             596                 82694                      0.61%
(5, 15]     4583              575                 34566                      1.55%
(15, 30]    284               432                 4315                       4.56%
(30, 100]   42                401                 1675                       11.05%
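The grouping by interaction count described above can be sketched as follows, using pandas.cut to bin learners into the four intervals; the interaction counts are hypothetical.

```python
import pandas as pd

interactions = pd.Series({"u1": 3, "u2": 7, "u3": 22, "u4": 45, "u5": 2})
groups = pd.cut(interactions, bins=[0, 5, 15, 30, 100],
                labels=["(0, 5]", "(5, 15]", "(15, 30]", "(30, 100]"])
print(groups.value_counts())   # number of learners falling in each sparsity group
```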

Using accuracy and AUC as evaluation metrics, the performance of the Node2vec, NGCF and ACKRec algorithms on the four sparsity groups is compared with the HERec model. The improvement rate of each algorithm over HERec on each metric is calculated and visualised as the bar chart in Figure 10. The Node2vec algorithm proposed in this paper achieves the best performance on all four sparsity groups, which shows that it can, to a certain extent, alleviate the problem of data sparsity.

Figure 10.

Model performance on data sets with different sparsity

Pathways to Implementing Curriculum Recommender Systems in Music Education

In the big data era, traditional music course teaching resources can no longer meet students’ diversified and personalised needs, nor can they deliver accurately targeted personalised services. The construction of a course content recommendation system based on big data analysis therefore deserves the attention of the relevant personnel.

In the process of realising the personalised course content teaching resources recommendation system based on the big data platform, the relevant personnel can proceed from the following points:

Firstly, build a big data hardware platform.

Secondly, use the platform to store and manage music course teaching resources efficiently.

Then, the precise push of the teaching content of the music course is realised.

Finally, creating first-class courses is a key link in cultivating high-quality undergraduate talent. High-quality optimisation of the music education curriculum ensures that music students receive training in creative ability and realises the specific process and ultimate goal of music teaching, namely improving core literacy; it is a significant issue for the macro strategy of music and arts education. Universities should build a number of first-class music courses according to the “gold course” standard and achieve a standard gold course structure, which can effectively guide music teaching to focus on educational reform projects, optimise and improve curriculum-based talent training, and thereby establish a prestigious undergraduate education and build high-level music education colleges. The construction of a personalised recommendation system for curriculum content and teaching resources is a long-term strategy for creating a “golden curriculum”, the ultimate goal of music teaching. It can also provide constructive suggestions for optimising the structure and impact of various music courses, integrate online and offline courses, broaden educational thinking, and create effective measures to enhance the quality of music and arts education, ultimately contributing to the cultivation of high-quality and innovative musical talent.

Conclusion

The development of the Internet and big data technology has changed traditional education. Students can learn course content through online education platforms, but this brings the problem of “course overload”, and applying recommendation models to online education platforms has been found to alleviate this problem effectively. This paper therefore focuses on building an efficient course recommendation model to optimise course content in music education. To address the problem of sparse student behaviour data in existing recommendation models, a course content recommendation method based on the Node2vec embedding algorithm, combined with machine learning training on learning behaviour data, is proposed. The Effectiveness Evaluation Scale of Music Education Course Recommendation is used as the measurement tool, and MOOC platform users serve as the source of learning behaviour data for constructing the experimental dataset. To validate the Node2vec algorithm, the 308,225 collected student behaviour records are used for model training and the model is applied to music education course content recommendation. The results show that the proposed algorithm achieves the best performance on every evaluation metric compared with all baseline models, and also achieves the best performance on four datasets with different sparsity levels, indicating that the Node2vec algorithm can, to a certain extent, alleviate the problem of data sparsity.

The effectiveness and feasibility of the Node2vec algorithm proposed in this paper for course recommendation have been demonstrated through experiments, but some shortcomings remain. In the actual course recommendation problem, a learner’s interest in and demand for courses change over time, and the research in this paper does not take this time-varying factor into account. In future research, auxiliary data carrying temporal signals, such as learners’ daily study time records, records of participation in tests, and the time span of browsing a particular course, can be mined and incorporated into the course recommendation system to further improve the performance of the recommendation algorithm.
