Open Access

Optimization Design of College Teaching Reform Paths in the Context of Big Data Mining-Driven High-Quality Development of Commerce and Circulation Based on Big Data Mining

  
29 Sep 2025

Introduction

Establishing educational reform programs is an important way to encourage university teachers to conduct research on teaching and to improve the quality of university education [1]. An education reform project must fit the realities of the university and be planned together with the school's overall reform. Based on the school's talent cultivation goals, reform work should be considered and planned at the level of the whole university, pooling strengths from all sides to advance education reform research. Each school has its own philosophy and characteristics, and its positioning and development concepts differ, so the actual management of education reform projects also varies widely [2-4]. With the rapid development of the communication and computer industries, the concept of big data has gained favor with government, society, and researchers. For colleges and universities, the significance of the big data era lies in a change of ideas [5]. Applying big data concepts in colleges and universities can make educational management, decision-making, and evaluation more intelligent [6]. Against this background, to improve the efficiency of managing education and teaching reform research projects and to give full play to the guiding and service functions of project management, a number of universities have studied the construction of informatization platforms for education reform project management [7-8].

Higher education informatization is an effective way to promote reform and innovation in higher education and to improve its quality, and it is the innovation frontier of education informatization. Future work should focus on deeply integrating information technology with higher education, modernizing educational content and teaching means and methods, innovating the modes of talent training, research organization, and social service, promoting cultural heritage and innovation, and improving the overall quality of higher education [9-11]. University teaching managers need to actively use advanced information technology to carry out educational and teaching reform and management in innovative ways. In particular, under the concept of big data, they should collect and use educational and teaching data to raise the level of educational management, guide the school's educational and teaching work, and continuously improve the quality of talent cultivation [12-14].

Education reform projects are a key link in, and an important means of, educational and teaching reform in colleges and universities, and the level at which they are managed affects how that reform develops. Every year the department that manages education reform projects handles the declaration and completion of projects of various kinds, and it holds a library of projects from past years. These projects are the results of the school's teaching reform and a knowledge base for guiding the school's education and teaching reform [15-17]. However, project declaration and related work have long been carried out mostly on paper, so much of the data is stored in scattered paper form. Data summaries and analyses are limited to project names, and the form, content, mode, results, and other aspects of each study cannot be summarized and analyzed comprehensively and effectively. Moreover, as education reform policies provide incentives, teachers' enthusiasm for education reform keeps growing, the number of declared projects increases year by year, and the pressure on effective project management rises significantly. It is therefore necessary to raise the level of informatization of project management, put project management data and information to use, and effectively improve the efficiency of project management [18-20].

This paper first gives a systematic overview of factor analysis. Based on the mathematical model of the factor analysis method, it summarizes the method's computational characteristics, sets out the computational process and steps, and studies the correlations between the factor analysis variables. Factor analysis is then used to mine and analyze student achievement data along a defined technical route, so as to discover shortcomings in current college teaching. Next, the paper explores the construction of a practical teaching system of “basic interconnection, hierarchical progression, integration of competition and innovation, and comprehensive leapfrogging” within the professional group, and the building of a “five-in-one, virtual and real” cross-professional integrated simulation training center. The study also introduces the “teaching factory” model in the professional group, deepens the “combination of engineering and learning, school-enterprise cooperation”, and innovates the practical teaching model. On this basis, the article proposes a teaching quality assessment model in which the fireworks algorithm (FWA) optimizes k-means clustering. The fireworks algorithm, which balances global and local search, produces results that serve as the initial cluster centroids of the k-means algorithm, which addresses the tendency of k-means to fall into local optima. Finally, using the results of students majoring in commerce and circulation at a university, the FWA-optimized k-means algorithm is applied to achieve accurate and effective cluster segmentation of innovation education and to explore the relationship between college students' innovation education and course teaching.

Mathematical modeling of factor analysis and geometric interpretation
Raw data and correlation matrix

To study an object using factor analysis is to study the underlying relationships among its attributes. Suppose the raw data are sample values of two random variables $$x$$ and $$y$$, which represent two attributes A and B, measured on $$n$$ specimens:

$$\vec x = ({x_1},{x_2}, \cdots ,{x_n})$$

$$\vec y = ({y_1},{y_2}, \cdots ,{y_n})$$

The samples are first standardized. The means and variances are calculated as follows:

$$\begin{array}{l} \bar x = \frac{1}{n}\sum\limits_{i = 1}^n {x_i} \\ \bar y = \frac{1}{n}\sum\limits_{i = 1}^n {y_i} \\ \sigma_x^2 = \frac{1}{n}\sum\limits_{i = 1}^n {({x_i} - \bar x)^2} \\ \sigma_y^2 = \frac{1}{n}\sum\limits_{i = 1}^n {({y_i} - \bar y)^2} \end{array}$$

Then let:

$$x_i^\prime = \frac{{x_i} - \bar x}{\sigma_x},\quad y_i^\prime = \frac{{y_i} - \bar y}{\sigma_y},\quad i = 1,2, \cdots ,n$$

The standardized samples satisfy the following conditions:

$$\overline {x'} = \frac{1}{n}\sum\limits_{i = 1}^n {x'_i} = 0,\quad \overline {y'} = \frac{1}{n}\sum\limits_{i = 1}^n {y'_i} = 0$$

$$\sigma_{x'}^2 = \frac{1}{n}\sum\limits_{i = 1}^n {{x'_i}^2} = 1,\quad \sigma_{y'}^2 = \frac{1}{n}\sum\limits_{i = 1}^n {{y'_i}^2} = 1$$
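The standardization step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `standardize` and the sample scores are hypothetical, and the variance uses the $$1/n$$ form given in the formulas above.

```python
import numpy as np

def standardize(values):
    """Return (values - mean) / sigma, with sigma computed from the 1/n variance."""
    values = np.asarray(values, dtype=float)
    mean = values.mean(axis=0)
    sigma = values.std(axis=0)  # ddof=0 by default, i.e. division by n
    return (values - mean) / sigma

# Hypothetical scores of one course for n = 5 students
x = np.array([62.0, 75.0, 88.0, 91.0, 70.0])
x_std = standardize(x)

print(x_std.mean())         # approximately 0
print((x_std ** 2).mean())  # approximately 1
```

Note that `np.std` divides by $$n$$ unless `ddof=1` is passed, which matches the definition of $$\sigma_x^2$$ used here.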

From here on, $$\vec x,\vec y$$ denote the standardized samples. Their variances and correlation coefficient are calculated as follows:

$$\left\{ {\begin{array}{*{20}{l}} {\sigma_x^2 = \frac{1}{n}\sum\limits_{i = 1}^n {x_i^2} = \frac{1}{n}\vec x'\vec x = 1} \\ {\sigma_y^2 = \frac{1}{n}\sum\limits_{i = 1}^n {y_i^2} = \frac{1}{n}\vec y'\vec y = 1} \\ {{Y_{xy}} = \frac{1}{n}\sum\limits_{i = 1}^n {x_i}{y_i} = \frac{1}{n}\vec x'\vec y} \end{array}} \right.$$

If the random variables $$\vec x,\vec y$$ are uncorrelated, then $$Y_{xy} = 0$$. Algebraically this is equivalent to a zero inner product, $$\vec x'\vec y = 0$$, and geometrically it means that the two vectors are orthogonal.

For $$n$$ samples with $$m$$ variables each, the original data matrix is as follows:

$$X = \left[ {\begin{array}{*{20}{c}} {x_{11}}&{x_{12}}& \cdots &{x_{1m}} \\ {x_{21}}&{x_{22}}& \cdots &{x_{2m}} \\ \cdots & \cdots & \cdots & \cdots \\ {x_{n1}}&{x_{n2}}& \cdots &{x_{nm}} \end{array}} \right] = \left[ {\vec x_1},{\vec x_2}, \cdots ,{\vec x_m} \right]$$

The column vectors on the right-hand side of the equation are:

$${\vec x_j} = {({x_{1j}},{x_{2j}}, \cdots ,{x_{nj}})^\prime },\quad j = 1,2, \cdots ,m$$

The vector $${\vec x_j}$$ holds the observations of the $$j$$-th variable on the $$n$$ samples and can be viewed as a point or vector in $$n$$-dimensional Euclidean space. The relationships between the original variables are studied by examining the relative positions of these $$m$$ points or vectors.

If the sample data are standardized, i.e., $$X$$ is a standardized matrix, then:

$${\bar x_j} = \frac{1}{n}\sum\limits_{i = 1}^n {x_{ij}} = 0$$

$$\sigma_j^2 = \frac{1}{n}\sum\limits_{i = 1}^n {x_{ij}^2} = \frac{1}{n}\vec x_j^\prime {\vec x_j} = 1,\quad j = 1,2, \cdots ,m$$

Then the correlation coefficient between $${\vec x_j}$$ and $${\vec x_k}$$ is:

$${Y_{jk}} = \frac{1}{n}\sum\limits_{i = 1}^n {x_{ij}}{x_{ik}} = \frac{1}{n}\vec x_j^\prime {\vec x_k},\quad j,k = 1,2, \cdots ,m$$

The correlation coefficient matrix $$R$$ collects the correlation coefficients between the $$m$$ variables, with entries $$r_{jk} = Y_{jk}$$:

$$R = \left[ {\begin{array}{*{20}{c}} {r_{11}}&{r_{12}}& \cdots &{r_{1m}} \\ {r_{21}}&{r_{22}}& \cdots &{r_{2m}} \\ \cdots & \cdots & \cdots & \cdots \\ {r_{m1}}&{r_{m2}}& \cdots &{r_{mm}} \end{array}} \right] = \frac{1}{n}X'X$$

The correlation coefficient matrix $$R$$ is symmetric and positive semidefinite, which means that all of its eigenvalues are non-negative.
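As a quick numerical check of these properties, the sketch below builds $$R = \frac{1}{n}X'X$$ from standardized data and inspects its symmetry, unit diagonal, and eigenvalues. The data are randomly generated for illustration only, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 4

# Hypothetical raw data: n samples, m variables, with some induced correlation
raw = rng.normal(size=(n, m))
raw[:, 1] += 0.8 * raw[:, 0]

# Standardize each column with the 1/n variance, then form R = X'X / n
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)
R = X.T @ X / n

eigenvalues = np.linalg.eigvalsh(R)  # eigvalsh is meant for symmetric matrices

print(np.allclose(R, R.T))            # symmetric
print(np.allclose(np.diag(R), 1.0))   # unit diagonal
print(eigenvalues.min() >= -1e-10)    # non-negative up to rounding error
```

The same matrix is returned by `np.corrcoef(raw, rowvar=False)`, since the Pearson correlation does not depend on whether the variance is divided by $$n$$ or $$n-1$$.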

The correlation coefficient matrix is the starting point of factor analysis, and an important part of factor analysis is studying the structure of the correlation matrix [21]. Factor analysis also often involves the correlation coefficient matrix between two sets of variables. Suppose that, in addition to the previous $$m$$ random variables, there are another $$p$$ random variables, with the following data matrix:

$$Y = \left[ {\begin{array}{*{20}{c}} {y_{11}}&{y_{12}}& \cdots &{y_{1p}} \\ {y_{21}}&{y_{22}}& \cdots &{y_{2p}} \\ \cdots & \cdots & \cdots & \cdots \\ {y_{n1}}&{y_{n2}}& \cdots &{y_{np}} \end{array}} \right] = \left[ {\vec y_1},{\vec y_2}, \cdots ,{\vec y_p} \right]$$

Assuming all data are standardized, the correlation coefficient between $${\vec y_k}$$ and $${\vec x_j}$$ is:

$${S_{kj}} = \frac{1}{n}\vec y_k^\prime {\vec x_j},\quad k = 1,2, \cdots ,p;\; j = 1,2, \cdots ,m$$

Written in matrix form:

$${S_{p \times m}} = \left[ {\begin{array}{*{20}{c}} {S_{11}}&{S_{12}}& \cdots &{S_{1m}} \\ {S_{21}}&{S_{22}}& \cdots &{S_{2m}} \\ \cdots & \cdots & \cdots & \cdots \\ {S_{p1}}&{S_{p2}}& \cdots &{S_{pm}} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {\frac{1}{n}\vec y_1^\prime {\vec x_1}}&{\frac{1}{n}\vec y_1^\prime {\vec x_2}}& \cdots &{\frac{1}{n}\vec y_1^\prime {\vec x_m}} \\ {\frac{1}{n}\vec y_2^\prime {\vec x_1}}&{\frac{1}{n}\vec y_2^\prime {\vec x_2}}& \cdots &{\frac{1}{n}\vec y_2^\prime {\vec x_m}} \\ \cdots & \cdots & \cdots & \cdots \\ {\frac{1}{n}\vec y_p^\prime {\vec x_1}}&{\frac{1}{n}\vec y_p^\prime {\vec x_2}}& \cdots &{\frac{1}{n}\vec y_p^\prime {\vec x_m}} \end{array}} \right] = \frac{1}{n}\left[ {\begin{array}{*{20}{c}} {\vec y_1^\prime } \\ {\vec y_2^\prime } \\ \vdots \\ {\vec y_p^\prime } \end{array}} \right]\left[ {\vec x_1},{\vec x_2}, \cdots ,{\vec x_m} \right] = \frac{1}{n}Y'X$$
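The cross-correlation matrix between two variable sets can be sketched the same way. The example below is illustrative only: the data are synthetic, and `standardize`, `X_raw`, and `Y_raw` are hypothetical names.

```python
import numpy as np

def standardize(a):
    """Column-wise standardization with the 1/n variance."""
    return (a - a.mean(axis=0)) / a.std(axis=0)

rng = np.random.default_rng(1)
n, m, p = 150, 3, 2

# Hypothetical data: the p variables in Y depend on the first p variables in X
X_raw = rng.normal(size=(n, m))
Y_raw = X_raw[:, :p] + 0.5 * rng.normal(size=(n, p))

X, Y = standardize(X_raw), standardize(Y_raw)
S = Y.T @ X / n  # p x m matrix, S[k, j] = correlation between y_k and x_j

print(S.shape)
```

Each entry of `S` is an ordinary Pearson correlation, so it agrees with the corresponding off-diagonal block of the joint correlation matrix of the stacked variables.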

Mathematical model for factor analysis

The common factor model of factor analysis can be expressed in the following linear form:

$$\left\{ {\begin{array}{*{20}{l}} {{\vec x_1} = {a_{11}}{\vec f_1} + {a_{21}}{\vec f_2} + \cdots + {a_{p1}}{\vec f_p} + {u_1}{\vec \varepsilon_1}} \\ {{\vec x_2} = {a_{12}}{\vec f_1} + {a_{22}}{\vec f_2} + \cdots + {a_{p2}}{\vec f_p} + {u_2}{\vec \varepsilon_2}} \\ \cdots \\ {{\vec x_m} = {a_{1m}}{\vec f_1} + {a_{2m}}{\vec f_2} + \cdots + {a_{pm}}{\vec f_p} + {u_m}{\vec \varepsilon_m}} \end{array}} \right.$$

Abbreviated as:

$${\vec x_j} = \sum\limits_{k = 1}^p {a_{kj}}{\vec f_k} + {u_j}{\vec \varepsilon_j},\quad j = 1,2, \cdots ,m$$

Here $${\vec f_1},{\vec f_2}, \cdots ,{\vec f_p}$$ and $${\vec \varepsilon_1},{\vec \varepsilon_2}, \cdots ,{\vec \varepsilon_m}$$ are the new variables sought. The former are the common factors, which can be understood as what the variables have in common. The latter are called single factors, or individuality factors. The positive integer $$p$$ is the number of common factors and is much smaller than the number of original variables $$m$$, so the formula simplifies the original $$m$$ variables into a small number of factors. The coefficients $$a_{kj}$$ and $$u_j$$ ($$j = 1,2, \cdots ,m$$; $$k = 1,2, \cdots ,p$$) are called factor loadings: the former are the common factor loadings and the latter the single factor loadings. Since only the common factors are of interest, the term factor loadings usually refers only to the former.

Notation:

$$A = {\left[ {\begin{array}{*{20}{c}} {a_{11}}&{a_{12}}& \cdots &{a_{1m}} \\ {a_{21}}&{a_{22}}& \cdots &{a_{2m}} \\ \cdots & \cdots & \cdots & \cdots \\ {a_{p1}}&{a_{p2}}& \cdots &{a_{pm}} \end{array}} \right]_{p \times m}}$$

where $$a_{kj}$$ is the loading of the $$j$$-th variable on the $$k$$-th factor ($$k = 1,2, \cdots ,p$$; $$j = 1,2, \cdots ,m$$).

$$F = \left[ {\vec f_1},{\vec f_2}, \cdots ,{\vec f_p} \right] = {\left[ {\begin{array}{*{20}{c}} {f_{11}}&{f_{12}}& \cdots &{f_{1p}} \\ {f_{21}}&{f_{22}}& \cdots &{f_{2p}} \\ \cdots & \cdots & \cdots & \cdots \\ {f_{n1}}&{f_{n2}}& \cdots &{f_{np}} \end{array}} \right]_{n \times p}}$$

where column $$k$$ holds the values of the $$k$$-th factor on each specimen. This matrix is called the factor measure (factor score) matrix.

$$U = {\left[ {\begin{array}{*{20}{c}} {u_1}&0& \cdots &0 \\ 0&{u_2}& \cdots &0 \\ \cdots & \cdots & \cdots & \cdots \\ 0&0& \cdots &{u_m} \end{array}} \right]_{m \times m}}$$

This is a diagonal matrix of order $$m$$ whose $$j$$-th diagonal element $$u_j$$ is the loading of variable $${\vec x_j}$$ on the single factor $${\vec \varepsilon_j}$$ ($$j = 1,2, \cdots ,m$$).

$$E = \left[ {\vec \varepsilon_1},{\vec \varepsilon_2}, \cdots ,{\vec \varepsilon_m} \right] = {\left[ {\begin{array}{*{20}{c}} {\varepsilon_{11}}&{\varepsilon_{12}}& \cdots &{\varepsilon_{1m}} \\ {\varepsilon_{21}}&{\varepsilon_{22}}& \cdots &{\varepsilon_{2m}} \\ \cdots & \cdots & \cdots & \cdots \\ {\varepsilon_{n1}}&{\varepsilon_{n2}}& \cdots &{\varepsilon_{nm}} \end{array}} \right]_{n \times m}}$$

where column $$j$$ holds the values of $${\vec \varepsilon_j}$$ on each specimen. The factor model can then be rewritten in matrix form:

$$\left[ {\vec x_1},{\vec x_2}, \cdots ,{\vec x_m} \right] = \left[ {\vec f_1},{\vec f_2}, \cdots ,{\vec f_p} \right]\left[ {\begin{array}{*{20}{c}} {a_{11}}&{a_{12}}& \cdots &{a_{1m}} \\ {a_{21}}&{a_{22}}& \cdots &{a_{2m}} \\ \cdots & \cdots & \cdots & \cdots \\ {a_{p1}}&{a_{p2}}& \cdots &{a_{pm}} \end{array}} \right] + \left[ {\vec \varepsilon_1},{\vec \varepsilon_2}, \cdots ,{\vec \varepsilon_m} \right]\left[ {\begin{array}{*{20}{c}} {u_1}&0& \cdots &0 \\ 0&{u_2}& \cdots &0 \\ \cdots & \cdots & \cdots & \cdots \\ 0&0& \cdots &{u_m} \end{array}} \right]$$

$$X = FA + EU$$

Factor loads

The original variables $${\vec x_j}$$ in the factor model are all standardized variables. Now assume further that both the common factors $${\vec f_k}\;(k = 1,2, \cdots ,p)$$ and the single factors $${\vec \varepsilon_j}\;(j = 1,2, \cdots ,m)$$ to be solved for are also standardized variables.

Assume also that the correlation coefficients between different common factors, between different single factors, and between common factors and single factors are all 0. Then the following relations hold:

$$\left\{ {\begin{array}{*{20}{l}} {\frac{1}{n}\sum\limits_{i = 1}^n {f_{ik}} = 0,\quad k = 1,2, \cdots ,p} \\ {\frac{1}{n}\sum\limits_{i = 1}^n {\varepsilon_{ij}} = 0,\quad j = 1,2, \cdots ,m} \\ {\frac{1}{n}\vec f_k^\prime {\vec f_\ell } = {\delta_{k\ell }} = \left\{ {\begin{array}{*{20}{l}} {1,}&{k = \ell } \\ {0,}&{k \ne \ell } \end{array}} \right.\quad k,\ell = 1,2, \cdots ,p} \\ {\frac{1}{n}\vec \varepsilon_j^\prime {\vec \varepsilon_q} = {\delta_{jq}} = \left\{ {\begin{array}{*{20}{l}} {1,}&{j = q} \\ {0,}&{j \ne q} \end{array}} \right.\quad j,q = 1,2, \cdots ,m} \\ {\frac{1}{n}\vec f_k^\prime {\vec \varepsilon_j} = 0,\quad k = 1,2, \cdots ,p;\; j = 1,2, \cdots ,m} \end{array}} \right.$$

Written in matrix form, these relations give the correlation matrix between the common factors:

$$\frac{1}{n}F'F = \frac{1}{n}\left[ {\begin{array}{*{20}{c}} {\vec f_1^\prime } \\ {\vec f_2^\prime } \\ \vdots \\ {\vec f_p^\prime } \end{array}} \right]\left[ {\vec f_1},{\vec f_2}, \cdots ,{\vec f_p} \right] = \left[ {\begin{array}{*{20}{c}} {\frac{1}{n}\vec f_1^\prime {\vec f_1}}&{\frac{1}{n}\vec f_1^\prime {\vec f_2}}& \cdots &{\frac{1}{n}\vec f_1^\prime {\vec f_p}} \\ {\frac{1}{n}\vec f_2^\prime {\vec f_1}}&{\frac{1}{n}\vec f_2^\prime {\vec f_2}}& \cdots &{\frac{1}{n}\vec f_2^\prime {\vec f_p}} \\ \cdots & \cdots & \cdots & \cdots \\ {\frac{1}{n}\vec f_p^\prime {\vec f_1}}&{\frac{1}{n}\vec f_p^\prime {\vec f_2}}& \cdots &{\frac{1}{n}\vec f_p^\prime {\vec f_p}} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} 1&0& \cdots &0 \\ 0&1& \cdots &0 \\ \cdots & \cdots & \cdots & \cdots \\ 0&0& \cdots &1 \end{array}} \right] = {I_p}$$

where $$I_p$$ is the identity matrix of order $$p$$. Similarly, the correlation matrix between the single factors is:

$$\frac{1}{n}{E^\prime }E = {I_m}$$

The correlation matrix between the common factors and the single factors is then:

$$\frac{1}{n}{F^\prime }E = \frac{1}{n}\left[ {\begin{array}{*{20}{c}} {\vec f_1^\prime } \\ {\vec f_2^\prime } \\ \vdots \\ {\vec f_p^\prime } \end{array}} \right]\left[ {\vec \varepsilon_1},{\vec \varepsilon_2}, \cdots ,{\vec \varepsilon_m} \right] = \left[ {\begin{array}{*{20}{c}} {\frac{1}{n}\vec f_1^\prime {\vec \varepsilon_1}}&{\frac{1}{n}\vec f_1^\prime {\vec \varepsilon_2}}& \cdots &{\frac{1}{n}\vec f_1^\prime {\vec \varepsilon_m}} \\ {\frac{1}{n}\vec f_2^\prime {\vec \varepsilon_1}}&{\frac{1}{n}\vec f_2^\prime {\vec \varepsilon_2}}& \cdots &{\frac{1}{n}\vec f_2^\prime {\vec \varepsilon_m}} \\ \cdots & \cdots & \cdots & \cdots \\ {\frac{1}{n}\vec f_p^\prime {\vec \varepsilon_1}}&{\frac{1}{n}\vec f_p^\prime {\vec \varepsilon_2}}& \cdots &{\frac{1}{n}\vec f_p^\prime {\vec \varepsilon_m}} \end{array}} \right] = {\left[ {\begin{array}{*{20}{c}} 0&0& \cdots &0 \\ 0&0& \cdots &0 \\ \cdots & \cdots & \cdots & \cdots \\ 0&0& \cdots &0 \end{array}} \right]_{p \times m}} = H$$

Substituting $$X = FA + EU$$, the correlation matrix between the common factors and the original variables is:

$$\left[ {\begin{array}{*{20}{c}} {\frac{1}{n}\vec f_1^\prime {\vec x_1}}&{\frac{1}{n}\vec f_1^\prime {\vec x_2}}& \cdots &{\frac{1}{n}\vec f_1^\prime {\vec x_m}} \\ {\frac{1}{n}\vec f_2^\prime {\vec x_1}}&{\frac{1}{n}\vec f_2^\prime {\vec x_2}}& \cdots &{\frac{1}{n}\vec f_2^\prime {\vec x_m}} \\ \cdots & \cdots & \cdots & \cdots \\ {\frac{1}{n}\vec f_p^\prime {\vec x_1}}&{\frac{1}{n}\vec f_p^\prime {\vec x_2}}& \cdots &{\frac{1}{n}\vec f_p^\prime {\vec x_m}} \end{array}} \right] = \frac{1}{n}\left[ {\begin{array}{*{20}{c}} {\vec f_1^\prime } \\ {\vec f_2^\prime } \\ \vdots \\ {\vec f_p^\prime } \end{array}} \right]\left[ {\vec x_1},{\vec x_2}, \cdots ,{\vec x_m} \right] = \frac{1}{n}F'X = \frac{1}{n}F'FA + \frac{1}{n}F'EU = {I_p}A + HU = A$$

where H is the zero matrix.

This shows that the elements of the factor loading matrix $$A$$ are the correlation coefficients between the common factors and the original variables:

$$\frac{1}{n}\vec f_k^\prime {\vec x_j} = {a_{kj}},\quad k = 1,2, \cdots ,p;\; j = 1,2, \cdots ,m$$

The factor loading $$a_{kj}$$ reflects the link between factor $${\vec f_k}$$ and variable $${\vec x_j}$$. When $$a_{kj} > 0$$, factor $${\vec f_k}$$ and variable $${\vec x_j}$$ are positively correlated. When $$a_{kj} < 0$$, they are negatively correlated. When $$a_{kj} \approx 0$$, the link between factor $${\vec f_k}$$ and variable $${\vec x_j}$$ is weak. These cases make the role of $$a_{kj}$$ clear.

From $$X = FA + EU$$, the correlation matrix $$R$$ can be expressed as:

$$\begin{array}{rcl} R &=& \frac{1}{n}X'X = \frac{1}{n}{(FA + EU)^\prime }(FA + EU) = \frac{1}{n}A'F'FA + \frac{1}{n}U'E'FA + \frac{1}{n}A'F'EU + \frac{1}{n}U'E'EU \\ &=& {A^\prime }(\frac{1}{n}{F^\prime }F)A + {U^\prime }(\frac{1}{n}{E^\prime }F)A + {A^\prime }(\frac{1}{n}{F^\prime }E)U + {U^\prime }(\frac{1}{n}{E^\prime }E)U = {A^\prime }A + {U^\prime }U \\ &=& {R^*} + \left[ {\begin{array}{*{20}{c}} {u_1^2}&0& \cdots &0 \\ 0&{u_2^2}& \cdots &0 \\ \cdots & \cdots & \cdots & \cdots \\ 0&0& \cdots &{u_m^2} \end{array}} \right] \end{array}$$

where:

$${R^*} = {A^\prime }A$$

is called the approximate (reduced) correlation matrix of the original variables. Its off-diagonal elements are the same as those of $$R$$:

$${r_{ij}} = \sum\limits_{k = 1}^p {a_{ki}}{a_{kj}},\quad i,j = 1,2, \cdots ,m,\; i \ne j$$

The diagonal elements of $$R^*$$ are:

$$h_j^2 = \sum\limits_{k = 1}^p {a_{kj}^2} = 1 - u_j^2,\quad j = 1,2, \cdots ,m$$
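The decomposition $$R = A'A + U'U$$ can be checked numerically on a synthetic example. The sketch below is an illustration under stated assumptions, not the authors' procedure: the factor columns are made exactly orthogonal with a QR decomposition of centered random data, and the loading matrix `A` contains hypothetical values chosen so that every $$h_j^2 < 1$$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, m = 50, 2, 4

# Build p + m zero-mean columns that are exactly orthogonal with squared length n.
Z = rng.normal(size=(n, p + m))
Z -= Z.mean(axis=0)              # centered columns are orthogonal to the ones vector
Q, _ = np.linalg.qr(Z)           # reduced QR: Q is n x (p + m) with orthonormal columns
Q *= np.sqrt(n)
F, E = Q[:, :p], Q[:, p:]        # common factors and single factors

# Hypothetical p x m loading matrix
A = np.array([[0.8, 0.7, 0.2, 0.1],
              [0.1, 0.3, 0.7, 0.6]])
h2 = (A ** 2).sum(axis=0)        # communalities h_j^2
U = np.diag(np.sqrt(1.0 - h2))   # single factor loadings u_j

X = F @ A + E @ U                # factor model X = FA + EU
R = X.T @ X / n
R_star = A.T @ A

print(np.allclose(R, R_star + U @ U))   # R = A'A + U'U
print(np.allclose(F.T @ X / n, A))      # loadings are factor-variable correlations
print(np.allclose(np.diag(R), 1.0))     # h_j^2 + u_j^2 = 1
```

Because the columns of `Q` lie in the span of the centered data, they have zero mean, so `F` and `E` satisfy the standardization and orthogonality assumptions exactly rather than only approximately.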

$$h_j^2$$ is called the common factor variance (communality) of variable $${\vec x_j}$$. It represents the share of the variance of the original variable that is accounted for by the common factors, and it is numerically equal to the sum of the squares of the elements in column $$j$$ of $$A$$. The common factor variance indicates the extent to which the original variable can be explained by the $$p$$ common factors. It takes values between 0 and 1, and the larger the value, the more of the variable is explained.

The sum of the common factor variances is:

$$\sum\limits_{j = 1}^m {h_j^2} = \sum\limits_{j = 1}^m {\sum\limits_{k = 1}^p {a_{kj}^2} } = \sum\limits_{k = 1}^p {\sum\limits_{j = 1}^m {a_{kj}^2} } = \sum\limits_{k = 1}^p {S_k^2}$$

where:

$$S_k^2 = \sum\limits_{j = 1}^m {a_{kj}^2},\quad k = 1,2, \cdots ,p$$

is called the variance contribution of factor $${\vec f_k}$$ and is numerically equal to the sum of the squares of the elements in row $$k$$ of $$A$$. It indicates the degree of contribution, or importance, of factor $${\vec f_k}$$ among all the common factors: the larger the value, the greater the contribution.
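Both quantities are simple sums over the squared loading matrix, as the short sketch below illustrates. The matrix `A` is a hypothetical $$2 \times 4$$ example, not taken from the paper's data.

```python
import numpy as np

# Hypothetical loading matrix: p = 2 factors (rows), m = 4 variables (columns)
A = np.array([[0.8, 0.7, 0.2, 0.1],
              [0.1, 0.3, 0.7, 0.6]])

communalities = (A ** 2).sum(axis=0)   # h_j^2: sum of squares down column j
contributions = (A ** 2).sum(axis=1)   # S_k^2: sum of squares along row k

print(communalities)   # h_j^2 for each of the 4 variables
print(contributions)   # S_k^2 for each of the 2 factors
print(np.isclose(communalities.sum(), contributions.sum()))
```

The final check reflects the identity $$\sum_j h_j^2 = \sum_k S_k^2$$: both sides add up every squared loading exactly once.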

Geometric Interpretation

All the attributes, variables, or indicators in the factor model, whether known or to be calculated, can be regarded as vectors in $$n$$-dimensional Euclidean space. Since they are all standardized variables, their squared lengths are $$n$$, i.e., each vector has length $$\sqrt n$$. By the orthogonality relations above, the vectors corresponding to all the factors are pairwise orthogonal, and they form a basis of a $$(p + m)$$-dimensional subspace. The factor model is the expansion of the variable $${\vec x_j}$$ in this basis, and the factor loadings are the coordinates of $${\vec x_j}$$ in it. The variables and their projections into the common factor space are shown in Figure 1. The common factor space is the $$p$$-dimensional subspace spanned by the $$p$$ common factors, and the projection of variable $${\vec x_j}$$ into the common factor space is:

$$\vec x_j^* = {a_{1j}}{\vec f_1} + {a_{2j}}{\vec f_2} + \cdots + {a_{pj}}{\vec f_p} = \sum\limits_{k = 1}^p {a_{kj}}{\vec f_k},\quad j = 1,2, \cdots ,m$$

Figure 1.

Variables and their projections in the common factor space

This $$\vec x_j^*$$ is called the projected variable of $${\vec x_j}$$. The square of its length is:

$$||\vec x_j^*||^2 = \vec x_j^{*\prime }\vec x_j^* = {(\sum\limits_{k = 1}^p {a_{kj}}{\vec f_k})^\prime }(\sum\limits_{\ell = 1}^p {a_{\ell j}}{\vec f_\ell }) = \sum\limits_{k = 1}^p {\sum\limits_{\ell = 1}^p {a_{kj}}{a_{\ell j}}\vec f_k^\prime {\vec f_\ell } } = \sum\limits_{k = 1}^p {a_{kj}^2\vec f_k^\prime {\vec f_k}} = n\sum\limits_{k = 1}^p {a_{kj}^2} = nh_j^2$$

Since $$||{\vec x_j}||^2 = n$$, the square of the cosine of the angle $$\theta$$ between vector $${\vec x_j}$$ and its projection $$\vec x_j^*$$ is:

$${\cos^2}\theta = {\left( \frac{||\vec x_j^*||}{||{\vec x_j}||} \right)^2} = \frac{nh_j^2}{n} = h_j^2$$
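The geometric statement can be verified on a small synthetic case. The sketch below is illustrative only: it assumes two exactly orthogonal common factors and one single factor built with a QR decomposition, and the loadings in `a` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 40, 2

# Zero-mean, mutually orthogonal vectors of squared length n (p factors + 1 single factor)
Z = rng.normal(size=(n, p + 1))
Z -= Z.mean(axis=0)
Q, _ = np.linalg.qr(Z)
Q *= np.sqrt(n)
F, eps = Q[:, :p], Q[:, p]

a = np.array([0.6, 0.5])        # hypothetical loadings a_1j, a_2j of one variable
h2 = a @ a                      # communality h_j^2
u = np.sqrt(1.0 - h2)           # single factor loading u_j

x = F @ a + u * eps             # the variable x_j
x_star = F @ a                  # its projection into the common factor space

cos2 = (x @ x_star) ** 2 / ((x @ x) * (x_star @ x_star))

print(np.isclose(x @ x, n))                    # ||x_j||^2 = n
print(np.isclose(x_star @ x_star, n * h2))     # ||x_j*||^2 = n h_j^2
print(np.isclose(cos2, h2))                    # cos^2(theta) = h_j^2
```

A communality close to 1 therefore means the variable lies almost inside the common factor space, while a small communality means it points mostly along its own single factor.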

Application of comprehensive evaluation of student achievement based on factor analysis
Modeling
KMO and Bartlett’s test

KMO test

Kaiser's commonly used KMO criteria for judging whether data are suitable for factor analysis are shown in Table 1.

Bartlett’s test of sphericity

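Bartlett’s test of sphericity checks the degree of association between the variables. Its null hypothesis is that the correlation matrix is an identity matrix, i.e., that the variables are mutually uncorrelated.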

Based on the above principles, the KMO and Bartlett’s tests are applied to the student performance data, and the results are shown in Table 2. Larger KMO values are more favorable for factor analysis. According to the output, the KMO value in this case is 0.965, and the significance (Sig.) value of Bartlett’s test of sphericity is 0.000. The chi-square value is large and the Sig. value is far below 0.05, so the correlation matrix of the observed variables in this study is unlikely to be an identity matrix. The variables should therefore be able to express multidimensional numerical relationships, which makes them well suited for factor analysis.

Table 1. KMO standard

| Suitability for factor analysis | Very suitable | Suitable | Basically suitable | Marginal | Unsuitable |
|---|---|---|---|---|---|
| KMO value (K) | K ≥ 0.9 | 0.9 > K ≥ 0.8 | 0.8 > K ≥ 0.7 | 0.7 > K ≥ 0.6 | K < 0.6 |

Table 2. KMO and Bartlett’s test

| Item | Value |
|---|---|
| KMO measure of sampling adequacy | 0.955 |
| Bartlett’s test of sphericity | 15205.074, 15215.621 |
| | .351, 0.362 |
| Significance | 0.000 |

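For readers who want to reproduce these two checks, the sketch below computes them from a correlation matrix using the standard textbook formulas: the KMO measure compares squared correlations with squared partial correlations, and Bartlett's statistic is $$-\left(n - 1 - \frac{2m + 5}{6}\right)\ln |R|$$ with $$m(m-1)/2$$ degrees of freedom. The function names and the synthetic data are assumptions; the paper does not state which software or code produced Table 2.

```python
import numpy as np

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure from a correlation matrix R."""
    R_inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(R_inv))
    partial = -R_inv / np.outer(d, d)          # partial correlations (off-diagonal)
    off = ~np.eye(R.shape[0], dtype=bool)      # mask selecting off-diagonal entries
    r2 = (R[off] ** 2).sum()
    q2 = (partial[off] ** 2).sum()
    return r2 / (r2 + q2)

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom of Bartlett's test."""
    m = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * m + 5) / 6) * logdet
    dof = m * (m - 1) // 2
    return chi2, dof

# Hypothetical scores: 5 courses driven by one shared ability factor
rng = np.random.default_rng(3)
n, m = 300, 5
common = rng.normal(size=(n, 1))
scores = 0.8 * common + 0.6 * rng.normal(size=(n, m))

R = np.corrcoef(scores, rowvar=False)
kmo_value = kmo(R)
chi2, dof = bartlett_sphericity(R, n)
print(kmo_value, chi2, dof)
```

The p-value would come from the upper tail of a chi-square distribution with `dof` degrees of freedom (for example via a statistics library), which is omitted here to keep the sketch dependent on NumPy only. For 27 variables the formula gives $$27 \times 26 / 2 = 351$$ degrees of freedom.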
Common factor variance

The common factor variances are shown in Table 3. The table shows that the “extraction” values of the common factor variance for all variables lie in the range 0.403-0.782, so the factors are considered to basically reflect the variance of each course.

Table 3. Common factor variance

| Course | Initial | Extraction |
|---|---|---|
| Pathology | 1.000 | 0.66 |
| Formulology | 1.000 | 0.782 |
| The golden chamber is slightly read | 1.000 | 0.664 |
| Internal reading | 1.000 | 0.671 |
| Chilling theory | 1.000 | 0.689 |
| Physiology | 1.000 | 0.655 |
| Biochemistry | 1.000 | 0.769 |
| Microbial parasitology | 1.000 | 0.619 |
| Epidemiology | 1.000 | 0.553 |
| Western medicine | 1.000 | 0.611 |
| Pharmacology | 1.000 | 0.748 |
| Medical history | 1.000 | 0.545 |
| Acupuncture | 1.000 | 0.703 |
| Diagnostic foundation | 1.000 | 0.619 |
| Human anatomy | 1.000 | 0.746 |
| Chinese medical history | 1.000 | 0.403 |
| Combination of Chinese and western medicine | 1.000 | 0.687 |
| Chinese and western medicine combine the oral and throat | 1.000 | 0.702 |
| Combination of Chinese and western medicine combined with gynecology | 1.000 | 0.712 |
| Chinese and western medicine combined with foreign science | 1.000 | 0.68 |
| Chinese and western medicine combined ophthalmology | 1.000 | 0.685 |
| Chinese medicine | 1.000 | 0.676 |
| Traditional Chinese medicine | 1.000 | 0.676 |
| Basic theory of Chinese medicine | 1.000 | 0.554 |
| Internal medicine | 1.000 | 0.714 |
| TCM diagnosis | 1.000 | 0.657 |
| Histology | 1.000 | 0.724 |

Extraction of principal factor components

The total variance explained is shown in Table 4. The system starts from 27 components, and after multiple iterations four components have “initial eigenvalues” greater than 1. “Component 1” explains 50.604% of the variance, “component 2” 6.244%, “component 3” 4.444%, and “component 4” 4.133%, for a cumulative variance contribution rate of 65.425%. That is, the four factors explain 65.425% of the variance of the original 27 variables, which shows that the common factors extracted by factor analysis represent most of the information in the variables under analysis. The scree plot is a graphical representation of the influence of all common factors in the factor analysis, and the scree plot of the factor variables is shown in Figure 2. It displays the eigenvalues of the 27 initial components: the first four are greater than 1, and those of the subsequent components decrease in turn. It is therefore reasonable to extract four principal components.

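Table 4. Total variance explained

| Component | Initial eigenvalues: Total | Initial eigenvalues: % of variance | Initial eigenvalues: Cumulative % | Extraction sums of squared loadings: Total | Extraction: % of variance | Extraction: Cumulative % | Rotation sums of squared loadings: Total | Rotation: % of variance | Rotation: Cumulative % |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 13.663 | 50.604 | 50.604 | 13.663 | 50.604 | 50.604 | 5.184 | 19.200 | 19.200 |
| 2 | 1.686 | 6.244 | 56.848 | 1.686 | 6.244 | 56.848 | 4.872 | 18.044 | 37.244 |
| 3 | 1.2 | 4.444 | 61.292 | 1.2 | 4.444 | 61.292 | 4.719 | 17.478 | 54.722 |
| 4 | 1.116 | 4.133 | 65.425 | 1.116 | 4.133 | 65.425 | 2.89 | 10.703 | 65.425 |
| 5 | 0.866 | 3.207 | 68.632 | | | | | | |
| 6 | 0.863 | 3.196 | 71.828 | | | | | | |
| 7 | 0.632 | 2.341 | 74.169 | | | | | | |
| 8 | 0.603 | 2.233 | 76.402 | | | | | | |
| 9 | 0.564 | 2.089 | 78.491 | | | | | | |
| 10 | 0.548 | 2.030 | 80.521 | | | | | | |
| 11 | 0.477 | 1.767 | 82.288 | | | | | | |
| 12 | 0.435 | 1.611 | 83.899 | | | | | | |
| 13 | 0.435 | 1.611 | 85.51 | | | | | | |
| 14 | 0.361 | 1.337 | 86.847 | | | | | | |
| 15 | 0.357 | 1.322 | 88.169 | | | | | | |
| 16 | 0.352 | 1.304 | 89.473 | | | | | | |
| 17 | 0.315 | 1.167 | 90.64 | | | | | | |
| 18 | 0.307 | 1.137 | 91.777 | | | | | | |
| 19 | 0.291 | 1.078 | 92.855 | | | | | | |
| 20 | 0.286 | 1.059 | 93.914 | | | | | | |
| 21 | 0.283 | 1.048 | 94.962 | | | | | | |
| 22 | 0.272 | 1.007 | 95.969 | | | | | | |
| 23 | 0.231 | 0.856 | 96.825 | | | | | | |
| 24 | 0.22 | 0.815 | 97.64 | | | | | | |
| 25 | 0.219 | 0.811 | 98.451 | | | | | | |
| 26 | 0.217 | 0.804 | 99.255 | | | | | | |
| 27 | 0.201 | 0.744 | 100.000 | | | | | | |
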
Figure 2.

Scree plot of the factor variables

Factor model before rotation

The component matrix before rotation is shown in Table 5. As can be seen from the table, the factor analysis yields four principal components. Principal component 1 loads strongly on every observed variable, with roughly equal loading values and no variable standing out. Principal component 2 is associated with the observed variables “Jin Kui Yao Brief Reading”, “Selected Readings of the Neijing” and “Selected Readings of Typhoid Fever”; principal component 3 with “Biochemistry”, “Microbial Parasitology”, “Warm Disease” and “Ancient Medical Literature”; and principal component 4 with “Pathology”, “Formulary”, “Selected Readings of the Neijing” and “Selected Readings of Typhoid Fever”, among others. The loadings of principal components 2, 3 and 4 on these variables are negative, which is a distinctive feature. Because the loadings of principal component 1 are too uniform across the observed variables, the corresponding principal factors cannot be interpreted from this component matrix, so a rotational transformation is needed.

Rotated factor model

The rotated component matrix is shown in Table 6. The varimax (maximum variance) method is applied to rotate the factor loading matrix, which yields the rotated factor loading matrix. The rotation concentrates the loadings of each principal component on fewer observed variables, which makes the meaning of each principal factor easier to interpret.
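To illustrate what a maximum variance rotation does, the following minimal sketch rotates a two-factor loading matrix and picks the angle that maximizes the variance of the squared loadings. It is a brute-force illustration with hypothetical loadings, not the iterative procedure a statistics package uses for four factors, and all names in it are assumptions.

```python
import math


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    k = len(loadings[0])
    total = 0.0
    for j in range(k):
        squared = [row[j] ** 2 for row in loadings]
        mean = sum(squared) / p
        total += sum((s - mean) ** 2 for s in squared) / p
    return total


def rotate(loadings, theta):
    """Apply an orthogonal rotation by theta (radians) to a two-column matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]


def varimax_2d(loadings, steps=9000):
    """Grid search over [0, pi/2) for the angle that maximizes the criterion."""
    angles = (i * (math.pi / 2) / steps for i in range(steps))
    best = max(angles, key=lambda t: varimax_criterion(rotate(loadings, t)))
    return rotate(loadings, best), best


# Hypothetical unrotated loadings: every variable loads heavily on factor 1,
# which mirrors the "too uniform" first component described in the text.
unrotated = [[0.75, 0.40], [0.70, 0.45], [0.72, -0.38], [0.68, -0.42]]
rotated, theta = varimax_2d(unrotated)

for before, after in zip(unrotated, rotated):
    print(before, [round(v, 3) for v in after])
```

Because the rotation is orthogonal, each variable keeps the same communality (sum of squared loadings); only the distribution of the loadings across factors changes.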

Factor scores

Factor analysis represents each observed variable as a linear combination of common factors and a specific factor. Conversely, factor scores are obtained by reversing this process and expressing each common factor as a linear combination of the observed variables. The component score coefficient matrix is shown in Table 7.
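As a rough sketch of how such coefficients are used, the code below computes the factor scores of one student as weighted sums of standardized grades. For brevity it uses only the first three rows of Table 7 and hypothetical standardized grades, so the numbers are illustrative only. The composite score, weighted by the rotated variance contributions from Table 4, is a common convention and is an assumption here, since the text does not state how the composite score was formed.

```python
def factor_scores(z, coefficients):
    """Factor score k is the sum over courses j of coefficient[j][k] * z[j]."""
    n_factors = len(coefficients[0])
    return [sum(z_j * row[k] for z_j, row in zip(z, coefficients))
            for k in range(n_factors)]


def composite_score(scores, variance_share):
    """Weighted average of the factor scores (assumed weighting scheme)."""
    total = sum(variance_share)
    return sum(f * v for f, v in zip(scores, variance_share)) / total


# First three rows of the component score coefficient matrix (Table 7).
coefficients = [[-0.11, 0.126, 0.026, 0.002],
                [-0.019, 0.333, -0.142, -0.134],
                [0.076, -0.018, -0.038, 0.179]]

# Hypothetical standardized grades of one student in those three courses.
z = [1.0, -0.5, 0.2]

# Rotated percentage of variance for components 1 to 4 (Table 4).
rotated_variance = [19.200, 18.044, 17.478, 10.703]

scores = factor_scores(z, coefficients)
composite = composite_score(scores, rotated_variance)
print([round(s, 4) for s in scores], round(composite, 4))
```

A full calculation would use all 27 courses; ranking students by the resulting composite score is what gets compared with the mean score ranking.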

Component matrix before rotation

Component
Course name 1 2 3 4
Pathology 0.743 0.172 0.175 -0.022
Formulology 0.76 0.069 0.141 -0.402
The golden chamber is slightly read 0.74 -0.254 0.07 0.098
Internal reading 0.722 -0.043 0.188 -0.205
Chilling theory 0.826 -0.121 0.184 -0.098
Physiology 0.665 0.405 0.311 0.147
Biochemistry 0.723 0.1 -0.361 0.281
Microbial parasitology 0.79 0.146 -0.038 -0.178
Epidemiology 0.606 -0.307 -0.054 0.219
Western medicine 0.704 -0.297 0.15 0.161
Pharmacology 0.763 -0.017 0.342 -0.098
Medical history 0.646 0.365 -0.218 0.212
Acupuncture 0.795 -0.107 0.151 -0.11
Diagnostic foundation 0.73 0.011 -0.324 0.136
Human anatomy 0.714 0.345 -0.084 0.287
Chinese medical history 0.554 0.256 0.169 0.056
Combination of Chinese and western medicine 0.633 -0.288 -0.341 -0.278
Chinese and western medicine combine the oral and throat 0.674 -0.3 -0.227 -0.276
Combination of Chinese and western medicine combined with gynecology 0.764 -0.303 -0.126 -0.007
Chinese and western medicine combined with foreign science 0.783 -0.251 -0.057 0.148
Chinese and western medicine combined ophthalmology 0.746 -0.262 -0.127 0.211
Chinese medicine 0.729 0.158 -0.103 -0.201
Traditional Chinese medicine 0.474 -0.331 0.506 0.377
Basic theory of Chinese medicine 0.604 0.311 -0.241 -0.065
Internal medicine 0.768 -0.222 0.059 -0.063
TCM diagnosis 0.632 0.387 0.057 -0.192
Histology 0.776 0.293 -0.043 0.147

Rotated component matrix

Component
Course name 1 2 3 4
Pathology 0.197 0.591 0.486 0.273
Formulology 0.366 0.758 0.243 0.085
The golden chamber is slightly read 0.44 0.32 0.188 0.553
Internal reading 0.321 0.269 0.257 0.662
Chilling theory 0.434 0.375 0.228 0.551
Physiology -0.038 0.44 0.618 0.393
Biochemistry 0.566 0.047 0.624 0.157
Microbial parasitology 0.378 0.556 0.436 0.073
Epidemiology 0.491 0.032 0.292 0.397
Western medicine 0.463 0.271 0.238 0.452
Pharmacology 0.266 0.635 0.316 0.437
Medical history 0.217 0.181 0.726 0.056
Acupuncture 0.426 0.575 0.275 0.399
Diagnostic foundation 0.562 0.136 0.491 0.131
Human anatomy 0.181 0.21 0.787 0.216
Chinese medical history 0.086 0.326 0.477 0.23
Combination of Chinese and western medicine 0.727 0.334 0.122 -0.016
Chinese and western medicine combine the oral and throat 0.714 0.397 0.09 0.045
Combination of Chinese and western medicine combined with gynecology 0.642 0.397 0.263 0.312
Chinese and western medicine combined with foreign science 0.655 0.254 0.282 0.376
Chinese and western medicine combined ophthalmology 0.639 0.104 0.356 0.387
Chinese medicine 0.399 0.561 0.479 0.001
Traditional Chinese medicine 0.802 0.117 0.007 0.111
Basic theory of Chinese medicine 0.277 0.32 0.574 -0.097
Internal medicine 0.538 0.467 0.254 0.381
TCM diagnosis 0.131 0.636 0.514 -0.032
Histology 0.247 0.336 0.704 0.231

Component score coefficient matrix

Component
Course name 1 2 3 4
Pathology -0.11 0.126 0.026 0.002
Formulology -0.019 0.333 -0.142 -0.134
The golden chamber is slightly read 0.076 -0.018 -0.038 0.179
Internal reading -0.062 0.014 -0.158 0.278
Chilling theory 0.023 0.089 -0.081 0.191
Physiology -0.274 0.056 0.18 0.135
Biochemistry 0.196 -0.275 0.171 -0.067
Microbial parasitology 0.02 0.14 0.024 -0.116
Epidemiology 0.135 -0.183 0.023 0.158
Western medicine 0.21 -0.036 -0.063 0.015
Pharmacology -0.114 0.222 -0.113 0.15
Medical history -0.025 -0.123 0.3 -0.07
Acupuncture -0.023 0.138 -0.084 0.058
Diagnostic foundation 0.201 -0.149 0.135 -0.083
Human anatomy -0.047 -0.179 0.322 0.029
Chinese medical history -0.159 0.075 0.158 0.091
Combination of Chinese and western medicine 0.331 0.019 -0.12 -0.23
Chinese and western medicine combine the oral and throat 0.246 0.078 -0.106 -0.122
Combination of Chinese and western medicine combined with gynecology 0.165 -0.014 -0.061 0.003
Chinese and western medicine combined with foreign science 0.136 -0.046 -0.023 0.157
Chinese and western medicine combined ophthalmology 0.166 -0.162 0.031 0.104
Chinese medicine 0.039 0.168 0.061 -0.204
Traditional Chinese medicine 0.566 -0.081 -0.069 -0.134
Basic theory of Chinese medicine 0.046 0.004 0.186 -0.19
Internal medicine 0.091 0.096 -0.091 0.027
TCM diagnosis -0.128 0.228 0.086 -0.132
Histology -0.046 -0.073 0.228 0.034
Results and Analysis of Teaching Quality Monitoring

A comparison of the top 30 students’ composite score rankings with their mean score rankings is shown in Figure 3. The figure shows that the ranking by factor analysis composite score differs from the traditional ranking by mean score. For example, student #1 has the highest composite score and ranks first in the factor analysis, but ranks 30th by mean score.

Figure 3.

Comparison of the composite score ranking with the mean score ranking

To summarize, in the context of the high-quality development of commerce and distribution, students differ considerably in their ability to deal with common diseases in the various clinical disciplines, to diagnose clinically and identify traditional Chinese medicines, to examine conditions and take medical histories, and to recognize and treat illnesses with Chinese medicine. Teachers can therefore tailor their teaching to the needs of their students. Teaching managers can also use the structure of professional knowledge and ability obtained from the factor analysis as an objective reference when refining the professional curriculum and formulating professional training objectives.

Practical teaching reform ideas and measures

In view of the common problems in practice teaching of professional groups, we actively carry out research and reform of practice teaching, construct “basic interoperability, hierarchical progression, integration of competition and creation, comprehensive leap” practice teaching system, and put forward effective reform measures in the construction of training rooms and teaching resources [22].

Constructing the practice teaching system of “basic interoperability, hierarchical progression, integration of competition and creation, and comprehensive leap”

In accordance with the principle of “basic interoperability, hierarchical progression, integration of competition and creation, and comprehensive leap”, the practice teaching system of the commerce professional group centers on the requirements of enterprises for the technical skills and comprehensive quality of finance and commerce management personnel in the context of industrial integration. It is based on the cognitive law of students’ professional learning and the needs of career development, is aligned with the work tasks and work processes of finance and commerce job groups, and follows the law of professional growth. It breaks through the boundaries that separate the majors, highlights the inter-specialty nature of practice teaching on the basis of the basic knowledge of economic management, circulation and commerce and the work of finance and trade management jobs, and realizes the interoperability and sharing of practice teaching resources among different majors.

The practical teaching system of “basic interoperability, hierarchical progression, integration of competition and creation, and comprehensive leap” is shown in Fig. 4. It includes 6 stages: professional cognitive experience training, general practical training of the professional group, professional basic practical training, professional development practical training, practical training for the national higher vocational skills competition, and inter-professional comprehensive practical training and on-the-job internship. Following the law of students’ professional learning cognition and ability enhancement, it forms a hierarchical progression of professional ability cognition, professional ability formation, professional ability expansion and comprehensive professional ability enhancement, while emphasizing the integration and enhancement of personal quality, professional quality and comprehensive quality. Professional cognitive experience training and general practical training of the professional group are the common basic training activities of every specialty in the group. On the one hand, they enhance students’ cognition and understanding of all the occupational positions in finance, economics and trade; on the other hand, they give students a full understanding of the whole process of enterprise operation and of the cooperation between departments. Professional basic practical training, professional development practical training and national higher vocational skills competition training strengthen students’ professional foundation and core skills and promote the formation of vocational ability and the ability to deal with complex vocational problems. Interdisciplinary comprehensive practical training is an indispensable stage in cultivating composite and developmental talents: students integrate interdisciplinary knowledge and abilities, learn to solve practical problems quickly, efficiently and creatively, and become highly adaptable to the new period, new business models and new businesses.

Construction of a “five-in-one, virtual and real” cross-disciplinary integrated simulation training center

To adapt to the integrated and synergistic development of the manufacturing industry and the financial and trade service industry, enterprises and job groups were surveyed. The survey found that the four majors of e-commerce, logistics management, accounting, and advertising planning and marketing are highly relevant to positions in supply chain operation, marketing, warehousing and distribution, purchasing and supply, and financial management, and that they share core skills such as marketing planning, resource planning, cost control and online trading. To strengthen the links between the majors and to improve the core practical training courses and core skills training, we pooled the advantageous teaching resources on the basis of the existing on-campus training rooms and training bases and constructed a cross-specialty integrated simulation training center. Following the idea of “school-enterprise co-construction, intra-group sharing, and professional co-management”, enterprise production management standards were introduced, real enterprise jobs were benchmarked, the types of work were refined, and the development trend of finance and commerce was taken as the orientation. The result is a cross-professional integrated simulation training center that combines “practical teaching, skill competition, vocational training, teaching and research, innovation and entrepreneurship incubation”: the “simulation training center for the integrated development of financial and commercial services and manufacturing industry”, which is shown in Figure 5.

Introducing the “Teaching Factory” model, connecting with real work and innovating practical teaching resources.

The “Teaching Factory” education model was created by Nanyang Polytechnic in Singapore. It introduces the practical environment of enterprises into the teaching environment, takes projects as the link, and deeply integrates teaching, learning and research, which plays an important role in cultivating students’ professional and vocational abilities. By combining enterprises with schools, practice with theory, and teachers with experts, the “Teaching Factory” realizes a seamless connection between graduates and jobs, and has cultivated a large number of highly skilled, application-oriented talents with strong technological research and development capabilities for Singapore.

Figure 4.

Practice teaching system

Figure 5.

Simulation Training Center

Educational reform data mining based on the FWA-optimized k-means clustering algorithm
FWA

The fireworks algorithm (FWA) is a swarm intelligence algorithm inspired by the sparks produced when fireworks explode. An explosion is regarded as a local search in the neighborhood of a specific point: the sparks generated by the explosion probe the surrounding space, and new fireworks are set off repeatedly in the search space until the sparks reach the optimal location.

Fireworks are classified by fitness into 2 types, good and bad. A firework with smaller fitness has a stronger local search capability: it explodes with a smaller radius in a smaller search space and produces a larger number of sparks. Conversely, a firework with greater fitness has a stronger global exploration capability: it explodes with a larger radius in a larger search space and produces fewer sparks [23]. Two parameters play a decisive role in FWA: the explosion radius Ri of firework i and the number of sparks Si produced by the explosion of firework i, which are calculated as follows: $$R_i = \hat R\,\frac{f_i - y_{\min} + \varepsilon}{\sum\limits_{i = 1}^N (f_i - y_{\min}) + \varepsilon}$$ $$S_i = M\,\frac{y_{\max} - f_i + \varepsilon}{\sum\limits_{i = 1}^N (y_{\max} - f_i) + \varepsilon}$$

Where: $$\hat R$$ is a constant that sets the scale of the explosion radius; fi is the fitness value of firework i; ymin = min fi and ymax = max fi are the minimum and maximum fitness values in the fireworks population, respectively; ε is a very small constant (machine epsilon) that avoids division by zero; M is a constant that sets the total number of sparks; and N is the number of fireworks.
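A minimal sketch of the two formulas, assuming a minimization problem and hypothetical fitness values, shows how the best firework receives the smallest radius and the most sparks. The function name and the numbers are illustrative only.

```python
import sys


def explosion_parameters(fitness, r_hat, m, eps=sys.float_info.epsilon):
    """Return (radii, spark counts) for every firework in the population."""
    y_min, y_max = min(fitness), max(fitness)
    radius_den = sum(f - y_min for f in fitness) + eps
    spark_den = sum(y_max - f for f in fitness) + eps
    radii = [r_hat * (f - y_min + eps) / radius_den for f in fitness]
    sparks = [m * (y_max - f + eps) / spark_den for f in fitness]
    return radii, sparks


# Hypothetical fitness values of three fireworks (smaller is better).
fitness = [2.0, 5.0, 9.0]
radii, sparks = explosion_parameters(fitness, r_hat=40.0, m=50.0)
print([round(r, 2) for r in radii], [round(s, 2) for s in sparks])
```

The spark counts returned here are real numbers; they still have to be bounded and rounded before sparks are generated.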

The number of sparks Si is also subject to the following bounds: $$S_i = \begin{cases} \text{round}(aW), & s_i < aW \\ \text{round}(bW), & s_i > bW \\ \text{round}(s_i), & \text{otherwise} \end{cases}$$

Where: a, b are constants, set by default to a = 0.1, b = 0.5; round(·) is the rounding function; si is the initial (unbounded) number of sparks of firework i; and W is the weight parameter.
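The bounding rule can be sketched as a small helper. The sketch assumes that the upper bound applies when si exceeds bW, as in the usual formulation of the fireworks algorithm, and it uses Python's built-in `round`, which rounds exact halves to the nearest even integer; both points are assumptions rather than details confirmed by the text.

```python
def bounded_spark_count(s_i, w, a=0.1, b=0.5):
    """Clamp the raw spark count s_i to [round(a*w), round(b*w)] and round it."""
    if s_i < a * w:
        return round(a * w)
    if s_i > b * w:
        return round(b * w)
    return round(s_i)


# Hypothetical raw spark counts with weight parameter W = 50.
for raw in (1.2, 18.2, 31.8):
    print(raw, bounded_spark_count(raw, w=50))
```

The lower bound keeps poor fireworks from being ignored entirely, and the upper bound keeps a single good firework from consuming the whole spark budget.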

Let the dimension of the fireworks be k and the position of firework i be xi = (xi1, xi2, ⋯, xik). The explosion radius Ri and the number of sparks Si of firework i are calculated from the equations above; u position components of the firework are then selected at random and updated to generate the explosion sparks, that is: $$x_{ik}^\prime = x_{ik} + r_i U(-1, 1)$$

Where: xik is the position element of firework i in dimension k; $$x_{ik}^\prime$$ is xik after the explosion update; U(−1, 1) is a random number on the interval [−1, 1]; and ri is the scaling parameter for firework i.
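The position update can be sketched as follows. The sketch assumes that a fresh random number is drawn for every selected dimension; the text does not say whether one displacement is shared across dimensions, so this is an illustrative choice, and the function name is hypothetical.

```python
import random


def explosion_spark(x, r_i, u, rng=random):
    """Return one spark: u randomly chosen components of x are displaced."""
    dims = rng.sample(range(len(x)), u)  # u distinct dimensions
    spark = list(x)
    for k in dims:
        spark[k] = x[k] + r_i * rng.uniform(-1.0, 1.0)
    return spark


# Hypothetical four-dimensional firework, scaling parameter 0.5, u = 2.
rng = random.Random(0)
firework = [1.0, 2.0, 3.0, 4.0]
print(explosion_spark(firework, r_i=0.5, u=2, rng=rng))
```

Components that are not selected stay at the firework's own position, so each spark explores only part of the neighborhood.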

Explosion sparks may fall outside the boundary of the feasible domain. Sparks within the range are left unchanged, but sparks outside the range must be mapped back into the search space by the mapping rule: $$\hat x_{ik} = x_{lb,k} + |\hat x_{ik}| \bmod (x_{ub,k} - x_{lb,k})$$

Where: $$\hat x_{ik}$$ is the position element of the next-generation firework i in dimension k; xub,k and xlb,k are the upper and lower bounds of the solution space in dimension k, respectively; and mod is the modulo operation.
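A minimal sketch of the mapping rule for a single coordinate, assuming the rule is the usual modulo mapping of the fireworks algorithm, is given below; the function name and the bounds are illustrative.

```python
def map_into_bounds(value, lower, upper):
    """Leave in-range coordinates unchanged; map the others with a modulo."""
    if lower <= value <= upper:
        return value
    return lower + abs(value) % (upper - lower)


# Hypothetical search interval [-5, 5].
for coordinate in (3.0, 12.5, -17.0):
    print(coordinate, map_into_bounds(coordinate, -5.0, 5.0))
```

The mapped value always lies between the lower bound (inclusive) and the upper bound (exclusive), so no further clipping is required.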

A mutation operator is introduced in FWA to generate Gaussian sparks, which increases the diversity of the population. A firework xi is selected at random, a specific dimension k is then selected, and Gaussian mutation is applied to dimension k of xi, i.e.: $$\hat x_{ik} = x_{ik} e$$

where e ~ N(1, 1) is a Gaussian random number with mean 1 and variance 1.
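The Gaussian mutation can be sketched in a few lines. The helper below mutates one randomly chosen dimension of a given firework; its name and the sample values are hypothetical.

```python
import random


def gaussian_spark(x, rng=random):
    """Scale one randomly chosen component of x by e ~ N(1, 1)."""
    k = rng.randrange(len(x))
    e = rng.gauss(1.0, 1.0)  # mean 1, standard deviation 1 (variance 1)
    spark = list(x)
    spark[k] = x[k] * e
    return spark, k


rng = random.Random(1)
spark, mutated_dim = gaussian_spark([1.0, 2.0, 3.0], rng=rng)
print(spark, mutated_dim)
```

Because the factor is multiplicative, a component that is exactly zero is left unchanged by this operator.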

The computational flow of the fireworks algorithm is shown in Fig. 6.

Figure 6.

Calculation flow of fireworks algorithm

k-means clustering algorithm

The principle of the k-means clustering algorithm is to make objects in different clusters as different as possible and objects in the same cluster as similar as possible, according to a similarity rule. The algorithm uses distance as the criterion for assigning clusters, and the Euclidean distance is usually used to calculate the distance between data samples, i.e.: $$d = \sqrt{\sum\limits_{i = 1}^N (x_i - y_i)^2}$$

Where: d is the Euclidean distance, and xi, yi are the i-th components of the data object samples X = {x1, x2, ⋯, xN} and Y = {y1, y2, ⋯, yN}, i ∈ [1, N].

In the clustering process of the k-means algorithm, each iteration recalculates the mean of all samples in each cluster, which is the cluster centroid ci. The update of ci is calculated as: $$c_i = \frac{1}{|C_i|}\sum\limits_{x_j \in C_i} x_j$$

Where Ci is the i-th cluster and |Ci| is the number of samples in it.

The k-means clustering algorithm iteratively updates the cluster assignments and the cluster centroids ci until a termination condition is met. By default, the algorithm terminates when the number of iterations reaches the maximum value or the objective function falls below a threshold value.

As described above, the core of the k-means clustering algorithm is the minimum sum-of-squared-errors criterion. The basic idea is to assign the given data objects to clusters, recalculate the cluster centers, repeat these two steps iteratively, and output the result when the criterion function converges [24]. The criterion function E is defined as: $$E = \sum\limits_{i = 1}^k \sum\limits_{x_j \in C_i} (x_j - \bar x_i)^2$$

where $$\bar x_i$$ is the mean of the samples in cluster Ci.

From the criterion function, the larger E is, the lower the similarity within clusters; conversely, the smaller E is, the higher the similarity within clusters.
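The assignment step, the centroid update and the criterion function E can be put together in a short sketch. It is a plain k-means loop on hypothetical two-dimensional points and stops when E no longer decreases or the iteration limit is reached; that stopping rule and all names are assumptions made for illustration.

```python
import math


def kmeans(points, centroids, max_iter=100, tol=1e-9):
    """Plain k-means; returns (centroids, clusters, E) for given start centroids."""
    centroids = [list(c) for c in centroids]
    prev_sse = float("inf")
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
        # Criterion function E: sum of squared distances to the cluster mean.
        sse = sum(math.dist(p, centroids[i]) ** 2
                  for i, members in enumerate(clusters) for p in members)
        if prev_sse - sse < tol:
            break
        prev_sse = sse
    return centroids, clusters, sse


# Hypothetical data: two well-separated groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters, sse = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids, [len(c) for c in clusters], round(sse, 3))
```

With poorly chosen starting centroids the same loop can stop at a worse value of E, which is the sensitivity to initialization that motivates the FWA-based initialization.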

k-means clustering algorithm based on FWA optimization

The k-means clustering algorithm depends heavily on the initial cluster centroids and easily falls into a local optimum, whereas FWA balances global search and local search; therefore, FWA is used to optimize the k-means clustering algorithm. First, FWA is used to find k clustering centers, which serve as the initial cluster centroids of the k-means clustering algorithm; the k-means clustering algorithm then performs the clustering to obtain the optimal values. In the FWA selection strategy, the candidate set YH (H is the total number of elements) contains the original fireworks, the explosion sparks and the Gaussian sparks. The set Yn contains the optimal element; in addition to it, Q − 1 elements are selected from the candidate set to form a new population of Q elements, where the selection probability of each element is: $$p_i = \frac{\sum\limits_{j = 1}^H d_{ij}}{\sum\limits_{i = 1}^H \sum\limits_{j = 1}^H d_{ij}}$$ $$d_{ij} = \left| f_i - f_j \right|$$

where dij is the distance between firework i and firework j, measured here as the absolute difference between their fitness values.
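The selection step can be sketched as follows, assuming a minimization problem so that the optimal element is the one with the smallest fitness. Sampling the remaining Q − 1 elements with replacement via `random.choices` is an illustrative choice, since the text does not say whether repeats are allowed, and the function names are hypothetical.

```python
import random


def selection_probabilities(fitness):
    """p_i is the summed fitness distance of i, normalized over all candidates."""
    totals = [sum(abs(f_i - f_j) for f_j in fitness) for f_i in fitness]
    grand_total = sum(totals)
    if grand_total == 0:  # all candidates identical: fall back to uniform
        return [1.0 / len(fitness)] * len(fitness)
    return [t / grand_total for t in totals]


def select_population(candidates, fitness, q, rng=random):
    """Keep the best candidate, then draw q - 1 more according to p_i."""
    best = min(range(len(candidates)), key=lambda i: fitness[i])
    probs = selection_probabilities(fitness)
    drawn = rng.choices(range(len(candidates)), weights=probs, k=q - 1)
    return [candidates[best]] + [candidates[i] for i in drawn]


# Hypothetical candidates (labels) and fitness values.
candidates = ["firework", "spark_a", "spark_b"]
fitness = [2.0, 5.0, 9.0]
probabilities = selection_probabilities(fitness)
population = select_population(candidates, fitness, q=2, rng=random.Random(0))
print([round(p, 3) for p in probabilities], population)
```

Candidates whose fitness is far from the others receive a higher probability, which helps keep the next population diverse.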

Empirical studies
Data comprehension

In this section of the experiment, the grades of three courses related to commerce and distribution at a university were selected for statistical analysis, and Q-Q plots were used to observe changes among students at the high-grade level, with an effective sample of 1,366. The normal Q-Q plot for the year of enrollment is shown in Figure 7. There is no significant change, mainly because the limited laboratory equipment and the size of the innovation laboratory restrict the number of students that can be accommodated. The trend normal Q-Q plot for the year of enrollment is shown in Figure 8. The main results so far have been achieved by students of the classes of 2021 and 2022, with the class of 2023 as a strong reserve. The trend normal Q-Q plots of examination results in the main courses are shown in Fig. 9 (panel a: Data Structures; panel b: Database Principles). From these figures, it can be seen that, after a period of participation in the innovation laboratory and enterprise practical training, students’ motivation to learn improved, which is reflected in a higher average grade and, in particular, a growing number of excellent grades. At the same time, polarization is also increasing, since the innovation laboratory does not cover all students.
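For readers unfamiliar with Q-Q plots, the sketch below computes the coordinates of a normal Q-Q plot for a small set of hypothetical grades, together with the deviation of each point from the reference line. The plotting position (i − 0.5)/n is one common convention, and treating the deviation as what the trend plots display is an assumption.

```python
from statistics import NormalDist, mean, stdev


def normal_qq(sample):
    """Return (theoretical quantile, standardized value, deviation) triples."""
    values = sorted(sample)
    n = len(values)
    m, s = mean(values), stdev(values)
    standard_normal = NormalDist()
    triples = []
    for i, value in enumerate(values, start=1):
        theoretical = standard_normal.inv_cdf((i - 0.5) / n)
        observed = (value - m) / s
        triples.append((theoretical, observed, observed - theoretical))
    return triples


# Hypothetical course grades.
grades = [55, 62, 68, 71, 74, 79, 85, 93]
for theoretical, observed, deviation in normal_qq(grades):
    print(round(theoretical, 3), round(observed, 3), round(deviation, 3))
```

Points that lie close to the line observed = theoretical, with deviations scattered around zero, indicate approximately normally distributed data.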

Figure 7.

Normal Q-Q plot of the year of enrollment

Figure 8.

Trend normal Q-Q plot of the year of enrollment

Figure 9.

Trend normal Q-Q plots of examination results in the main courses

Data pre-processing

Database Principles was selected as the comparison object and an ANOVA model was constructed, with the Database Principles score as the dependent variable and the year of enrollment as the fixed factor. The tests of between-subjects effects for the Database Principles course are shown in Table 8, where df is the degrees of freedom, F is the F statistic (the ratio of the effect mean square to the error mean square), and Sig. is the significance value of the test, which is generally compared with 0.05 or 0.01: a value below the chosen level means that the difference is significant. As can be seen from the table, Sig. = .110 for year of enrollment, which is greater than 0.05, so the difference in Database Principles achievement between enrollment years is not significant at the 0.05 level. Next, an analysis with a random factor was conducted to compare achievement in Data Structure and Database Principles by year of enrollment; the resulting tests of between-subjects effects are shown in Table 9. In this comparison, Sig. = .015 < 0.05 for year of enrollment, which indicates a significant difference between enrollment years and is consistent with the deepening of the innovative activities, whereas Sig. = .143 > 0.05 for Data Structure indicates no significant difference for that factor.
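To make the quantities in the ANOVA tables concrete, the following sketch computes the F statistic and the degrees of freedom of a one-way ANOVA for hypothetical grades grouped by enrollment year. It does not reproduce the tables of the study. The Sig. value would be obtained from the F distribution with these degrees of freedom, and a difference is conventionally judged significant when Sig. is below the chosen level (0.05 or 0.01).

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group (error) sum of squares: observations around group means.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical grades for three enrollment years.
grades_by_year = [[70, 72, 74], [71, 73, 75], [74, 76, 78]]
f_stat, df_between, df_within = one_way_anova(grades_by_year)
print(round(f_stat, 3), df_between, df_within)
```

A larger F means that the variation between enrollment years is large relative to the variation within each year.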

Tests of between-subjects effects for the Database Principles course

Source Type III sum of squares df Mean square F Sig.
Corrected model 756.853a 1 369.495 2.234 .112
Intercept 1313830.727 1 1315231.672 7952.285 .000
Year of admission 751.996 2 369.493 2.324 .110
Error 52524.312 315 168.606
Total 1925995.000 305
Corrected total 52269.326 301

Tests of between-subjects effects comparing the Data Structure and Database Principles courses

Source Type III sum of squares df Mean square F Sig.
Intercept Hypothesis 748752.631 1 747482.833 4214.836 .000
Error 17236.548 98.417 167.951a
Year of admission Hypothesis 912.435 1 922.431 5.822 .015
Error 33364.626 215 156.282b
Data structure Hypothesis 6899.072 40 195.932 1.285 .143
Error 33164.526 215 156.122b

In order to compare the mined data more intuitively, the RFM model was introduced to measure students’ learning motivation and innovation ability, and the year of enrollment, the main course scores and the trend of scores were compared. The comparison of students’ year of enrollment and main course scores is shown in Figure 10, and the distribution of the trend of trained students’ test scores is shown in Figure 11. The results indicate that innovation education has a clear impact on students’ course learning, which supports the goal of optimizing the path of education reform in colleges and universities.

Figure 10.

Comparison of students’ year of enrollment with main course scores

Figure 11.

Distribution of the trend of trained students’ test scores

Conclusion

The study provides an in-depth analysis of the optimal design method of college education path through data mining algorithms and discusses its practical applications. The experimental conclusions drawn in this paper are as follows:

The experiment uses factor analysis to evaluate the performance data comprehensively and finds that student No. 1 has the highest composite score and ranks first in the factor analysis, but ranks 30th by average score. This suggests that the composite evaluation derived from factor analysis better reflects students’ overall professional competence. The ability of different students to deal with different types of problems also varies greatly, so teachers can tailor their teaching in a targeted way.

Using the cluster analysis algorithm to compare and analyze the year of entry, main course grades and grade trends of students in a school, it is found that the impact of innovative education on students’ course learning is very obvious, which helps to achieve the goal of optimizing the path of education reform in colleges and universities.

Language:
English
Publication frequency:
1 time per year
Journal subjects:
Biological sciences, Life sciences, other, Mathematics, Applied mathematics, General mathematics, Physics, Physics, other