
Geometric significance of eigenvalues and eigenvectors in linear algebra and their potential value in data analysis

  

Introduction

Eigenvalues and eigenvectors are fundamental attributes of matrices, with important applications in quantum mechanics, machine learning, signal and image processing, and other fields; their concepts and implications should be mastered and deeply understood by science and engineering students. Let $A$ be a square matrix of order $n$. If there exist a number $\lambda$ and a nonzero $n$-dimensional vector $\xi$ such that $A\xi=\lambda\xi$ holds, then $\lambda$ is called an eigenvalue of $A$, and $\xi$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda$. From the point of view of transformations, when an $n$-dimensional vector $\xi$ is an eigenvector of the matrix $A$, the action of $A$ on $\xi$ is equivalent to stretching $\xi$ by a factor of $\lambda$.

Eigenvalues and eigenvectors have many applications in big data analysis. When processing big data, it is usually necessary to reduce the dimensionality of the sample data to ease processing, and principal component analysis (PCA) is a method that projects high-dimensional data into a low-dimensional space by a linear transformation while losing as little information as possible [1-5]. For example, when observing different cars one may record the number of seats, number of tires, number of doors, number of windows, cylinder size, and so on; some of these indicators are strongly correlated, and this redundant information needs to be removed during data processing. PCA therefore treats directions with high variance as the ones that best separate the classes [6-7]. PCA is one of the commonly used algorithms in big data analysis and machine learning, and it appears in the study of computer science, electronic information, economics, medicine, and other fields [8-10]. Beyond eigenvalues and eigenvectors, this example also involves the diagonalization of square matrices and coordinate transformations from linear algebra [11-14].

In this paper, we study by way of example the trajectory of the vector $y=(y_1,y_2)^T$ under the linear transformation $y=Ax$ when the matrix $A$ is invertible. By solving for the eigenvalues and eigenvectors of the invertible matrix $A$, we illustrate the geometric significance of eigenvalues and eigenvectors for the trajectory of $y=(y_1,y_2)^T$ under the linear transformation. To ensure that the conclusions are comprehensive and general, we also give an example in which the matrix $A$ is not invertible. Principal component analysis and a spectral clustering algorithm are then selected to demonstrate the application of eigenvalues and eigenvectors, and of their geometric significance, in data analysis. Principal component analysis is used to reduce the dimensionality of 2048 groups of interferometric data collected by a spatial heterodyne spectrometer and to correct effects, such as irregular spots in the spatial heterodyne interferograms, that degrade the accuracy of the recovered spectra. The partitioning of complex networks is studied with a spectral clustering algorithm, which is improved so that the number of clusters and the eigenvectors to use are determined automatically. The improved spectral clustering algorithm is used to partition the karate club network, demonstrating the effectiveness of the improved algorithm and the potential value of the geometric significance of eigenvalues and eigenvectors in data analysis.

The geometric significance of eigenvalues and eigenvectors

Eigenvalues and eigenvectors are two important concepts in linear algebra and are now widely used in active areas such as dynamical systems, machine learning, image processing and data analysis [15]. In this paper, we take second-order (2×2) square matrices as an example and focus on explaining the geometric significance of eigenvalues and eigenvectors.

In the plane, let the vector $x=(x_1,x_2)^T$ satisfy $x_1^2+x_2^2=1$, i.e., $x=(x_1,x_2)^T$ lies on the unit circle (equivalently, $x$ is a unit vector).

Consider the linear transformation $y=Ax$, where the matrix is
$$A=\begin{pmatrix}a & b \\ c & d\end{pmatrix},\qquad a^2+b^2\neq 0,\; c^2+d^2\neq 0.$$

Using $\begin{pmatrix}a & b \\ c & d\end{pmatrix}\begin{pmatrix}x_1 \\ x_2\end{pmatrix}=\begin{pmatrix}y_1 \\ y_2\end{pmatrix}$, i.e., $y_1=ax_1+bx_2$ and $y_2=cx_1+dx_2$, a direct computation shows that $(c^2+d^2)y_1^2-2(ac+bd)y_1y_2+(a^2+b^2)y_2^2=(ad-bc)^2(x_1^2+x_2^2)$, so on the unit circle
$$(c^2+d^2)y_1^2-2(ac+bd)y_1y_2+(a^2+b^2)y_2^2=(ad-bc)^2. \tag{2}$$

The distribution of the trajectory of the vector $y=(y_1,y_2)^T$ as $x=(x_1,x_2)^T$ varies is studied geometrically below.

Matrix A invertible

When the matrix $A$ is invertible, $ad-bc\neq 0$. The trajectory of the vector $y=(y_1,y_2)^T$ as $x=(x_1,x_2)^T$ varies can be studied geometrically:

When $ad-bc\neq 0$ and $ac+bd=0$, equation (2) represents an ellipse whose major and minor axes lie on the coordinate axes.

When $ad-bc\neq 0$ and $ac+bd\neq 0$, equation (2) still represents an ellipse, but its major and minor axes do not lie on the coordinate axes.

The following is an example:

Example 1: Given the matrix $A=\begin{pmatrix}1 & 3 \\ 3 & 1\end{pmatrix}$, examine the trajectory of the vector $y=(y_1,y_2)^T$ after the linear transformation $y=Ax$.

Solution: The matrix $A=\begin{pmatrix}1 & 3 \\ 3 & 1\end{pmatrix}$ satisfies $ad-bc\neq 0$ and $ac+bd\neq 0$. The trajectory of the vector $y=(y_1,y_2)^T$ after the linear transformation is
$$10y_1^2-12y_1y_2+10y_2^2=64. \tag{3}$$

Equation (3) represents an ellipse whose major and minor axes are not on the coordinate axes.

A Matlab program is used to plot the curve of equation (3); the result is shown in Fig. 1, from which it can be seen that the unit circle is transformed by the linear transformation $y=Ax$ into an ellipse whose major and minor axes are not on the coordinate axes.
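For readers who want to reproduce the construction behind Fig. 1, the following short script (a minimal sketch in Python/NumPy rather than the Matlab program used by the authors; variable names are illustrative) maps the unit circle through $y=Ax$ and checks that every image point satisfies equation (3):

```python
import numpy as np
import matplotlib.pyplot as plt

A = np.array([[1.0, 3.0],
              [3.0, 1.0]])

t = np.linspace(0.0, 2.0 * np.pi, 400)
x = np.vstack((np.cos(t), np.sin(t)))   # unit circle: x1^2 + x2^2 = 1
y = A @ x                               # its image under the linear transformation

# Every image point satisfies equation (3): 10*y1^2 - 12*y1*y2 + 10*y2^2 = 64.
assert np.allclose(10 * y[0]**2 - 12 * y[0] * y[1] + 10 * y[1]**2, 64.0)

plt.plot(x[0], x[1], label="unit circle")
plt.plot(y[0], y[1], label="image ellipse")
plt.axis("equal")
plt.legend()
plt.show()
```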

Figure 1.

The trajectory of linear transformation

Taking $x=\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$, we have $\begin{pmatrix}1&3\\3&1\end{pmatrix}\begin{pmatrix}\frac{\sqrt{2}}{2}\\ \frac{\sqrt{2}}{2}\end{pmatrix}=4\begin{pmatrix}\frac{\sqrt{2}}{2}\\ \frac{\sqrt{2}}{2}\end{pmatrix}$; here 4 is greater than zero, meaning that the transformed vector is stretched in the same direction, as shown in Fig. 2(a). Taking $x=\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$, we have $\begin{pmatrix}1&3\\3&1\end{pmatrix}\begin{pmatrix}\frac{\sqrt{2}}{2}\\ -\frac{\sqrt{2}}{2}\end{pmatrix}=(-2)\begin{pmatrix}\frac{\sqrt{2}}{2}\\ -\frac{\sqrt{2}}{2}\end{pmatrix}$; here $-2$ is less than zero, meaning that the transformed vector is stretched in the opposite direction, as shown in Fig. 2(b). Thus, in terms of the linear transformation, $\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$ and $\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$ are invariants of the linear transformation, where "invariant" means that the line of direction is unchanged, allowing both the same and the opposite direction.

Thus, we define $k_1=\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$ and $k_2=\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$ as eigenvectors ($k_1\neq 0$, $k_2\neq 0$) of the linear transformation $Ax$, and the corresponding values 4 and $-2$ are their eigenvalues.
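As a quick numerical check (a minimal sketch assuming NumPy; not part of the original derivation), the eigenpairs identified above can be recovered directly with a standard eigen-solver:

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [3.0, 1.0]])

# Eigen-decomposition: columns of `vecs` are unit eigenvectors of A.
vals, vecs = np.linalg.eig(A)
print("eigenvalues:", vals)          # expected: 4 and -2 (order may vary)
print("eigenvectors (columns):")
print(vecs)                          # +-(sqrt(2)/2, sqrt(2)/2) and +-(sqrt(2)/2, -sqrt(2)/2)

# Verify the defining relation A @ xi == lambda * xi for each pair.
for lam, xi in zip(vals, vecs.T):
    assert np.allclose(A @ xi, lam * xi)
```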

Figure 2.

The geometric meaning of eigenvalues and eigenvectors

Equation (3) can also be examined from the point of view of quadratic forms. Equation (3) represents an ellipse, shown in Fig. 3(a), whose major and minor axes are not on the coordinate axes. By choosing a suitable orthogonal linear transformation, its major and minor axes can be made to fall on the coordinate axes.

The quadratic form in (3) corresponds to the matrix $B=\begin{pmatrix}10 & -6 \\ -6 & 10\end{pmatrix}$. Computation shows that $B$ has two distinct eigenvalues $\lambda_1=4$ and $\lambda_2=16$, with corresponding unit eigenvectors $\eta_1=\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$ and $\eta_2=\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$.

Since $\eta_1$ and $\eta_2$ are orthonormal (eigenvectors of a real symmetric matrix belonging to distinct eigenvalues are orthogonal), construct the orthogonal matrix $T_1=(\eta_1,\eta_2)=\begin{pmatrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2}\end{pmatrix}$ and perform the orthogonal linear transformation $y=T_1X$, i.e., $\begin{pmatrix}y_1\\y_2\end{pmatrix}=T_1\begin{pmatrix}X_1\\X_2\end{pmatrix}$, where $T_1$ satisfies $T_1^TT_1=E$. Substituting into equation (3) yields $4X_1^2+16X_2^2=64$, i.e.,
$$\frac{X_1^2}{16}+\frac{X_2^2}{4}=1. \tag{4}$$

Equation (4) is an ellipse whose major and minor axes fall on the coordinate axes, as shown in Figure 3(b).

Alternatively, let $T_2=(\eta_2,\eta_1)=\begin{pmatrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\end{pmatrix}$ and perform the orthogonal linear transformation $y=T_2X$, i.e., $\begin{pmatrix}y_1\\y_2\end{pmatrix}=T_2\begin{pmatrix}X_1\\X_2\end{pmatrix}$. Here $T_2$ also satisfies $T_2^TT_2=E$, and substituting into equation (3) gives $16X_1^2+4X_2^2=64$, i.e.,
$$\frac{X_1^2}{4}+\frac{X_2^2}{16}=1. \tag{5}$$

Equation (5) is also an ellipse whose major and minor axes fall on the coordinate axes, as shown in Figure 3(c). Figure 3(b) is equivalent to rotating Figure 3(a) clockwise by 45°, and Figure 3(c) is equivalent to rotating Figure 3(a) counterclockwise by 45°.
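The diagonalization of the quadratic form can be verified numerically as well (a minimal sketch assuming NumPy; the arrays mirror $B$, $T_1$ and $T_2$ above):

```python
import numpy as np

B = np.array([[10.0, -6.0],
              [-6.0, 10.0]])

s = np.sqrt(2.0) / 2.0
eta1 = np.array([s,  s])     # unit eigenvector for eigenvalue 4
eta2 = np.array([s, -s])     # unit eigenvector for eigenvalue 16

T1 = np.column_stack((eta1, eta2))
T2 = np.column_stack((eta2, eta1))

# Both are orthogonal matrices ...
assert np.allclose(T1.T @ T1, np.eye(2))
assert np.allclose(T2.T @ T2, np.eye(2))

# ... and they diagonalize the quadratic form: T^T B T is the diagonal of eigenvalues.
assert np.allclose(T1.T @ B @ T1, np.diag([4.0, 16.0]))
assert np.allclose(T2.T @ B @ T2, np.diag([16.0, 4.0]))
```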

Figure 3.

The corresponding curves of equations (3)-(5)

Matrix A is not invertible

When the matrix $A$ is not invertible, i.e., $ad-bc=0$, equation (2) becomes
$$(c^2+d^2)y_1^2-2(ac+bd)y_1y_2+(a^2+b^2)y_2^2=0. \tag{6}$$

An example of the curve represented by equation (6) is given below.

Example 2: Consider the matrix $A=\begin{pmatrix}1 & 1 \\ 1 & 1\end{pmatrix}$ ($A$ not invertible, symmetric), which satisfies $ad-bc=0$ and $ac+bd\neq 0$. The trajectory of the vector $y=(y_1,y_2)^T$ after the linear transformation is
$$2y_1^2-4y_1y_2+2y_2^2=0 \;\Rightarrow\; y_2=y_1,\quad -\sqrt{2}\leq y_1\leq \sqrt{2}.$$

This shows that under this linear transformation the unit circle becomes the line segment $y_2=y_1$, $-\sqrt{2}\leq y_1\leq\sqrt{2}$. The result of the transformation is shown in Fig. 4, where the trajectory of the linear transformation is given in Fig. 4(a) and the images of certain points are given in Fig. 4(b).

Figure 4.

The trajectory of linear transformation

The linear transformation maps $A(1,0)$ to $A'(1,1)$, $B\!\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)$ to $B'(\sqrt{2},\sqrt{2})$, $C(0,1)$ to $C'(1,1)$, $D\!\left(-\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)$ to $D'(0,0)$, $E(-1,0)$ to $E'(-1,-1)$, $F\!\left(-\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)$ to $F'(-\sqrt{2},-\sqrt{2})$, $G(0,-1)$ to $G'(-1,-1)$, and $H\!\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)$ to $H'(0,0)$. In the figure, $A'$ coincides with $C'$, $D'$ coincides with $H'$ at the origin, and $E'$ coincides with $G'$.

Since $\begin{pmatrix}1&1\\1&1\end{pmatrix}\begin{pmatrix}\frac{\sqrt{2}}{2}\\ -\frac{\sqrt{2}}{2}\end{pmatrix}=0\begin{pmatrix}\frac{\sqrt{2}}{2}\\ -\frac{\sqrt{2}}{2}\end{pmatrix}$ and $\begin{pmatrix}1&1\\1&1\end{pmatrix}\begin{pmatrix}\frac{\sqrt{2}}{2}\\ \frac{\sqrt{2}}{2}\end{pmatrix}=2\begin{pmatrix}\frac{\sqrt{2}}{2}\\ \frac{\sqrt{2}}{2}\end{pmatrix}$, the vectors $\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$ and $\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$ are "invariants of the linear transformation" in the sense above. Thus $k_1=\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$ and $k_2=\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$ are defined to be eigenvectors ($k_1\neq 0$, $k_2\neq 0$) of the linear transformation, and the corresponding values 0 and 2 are their eigenvalues.
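The collapse of the unit circle onto a segment, and the eigenpairs 0 and 2, can be checked with a few lines (a minimal sketch assuming NumPy):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])   # singular: ad - bc = 0

t = np.linspace(0.0, 2.0 * np.pi, 400)
x = np.vstack((np.cos(t), np.sin(t)))
y = A @ x

# The image collapses onto the line y2 = y1, with |y1| <= sqrt(2).
assert np.allclose(y[1], y[0])
assert np.all(np.abs(y[0]) <= np.sqrt(2.0) + 1e-12)

# Eigenvalues 0 and 2, with eigenvectors along (1, -1) and (1, 1).
vals, vecs = np.linalg.eig(A)
print(vals)   # [2. 0.] (order may vary)
print(vecs)
```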

Example 3: Consider the matrix $A=\begin{pmatrix}2 & 4 \\ -1 & -2\end{pmatrix}$ ($A$ not invertible, nonsymmetric), which satisfies $ad-bc=0$ and $ac+bd\neq 0$. The trajectory of the vector $y=(y_1,y_2)^T$ after the linear transformation is then
$$5y_1^2+20y_1y_2+20y_2^2=0 \;\Rightarrow\; y_2=-\tfrac{1}{2}y_1,\quad -2\sqrt{5}\leq y_1\leq 2\sqrt{5}.$$

This shows that under this linear transformation the unit circle becomes the line segment $y_2=-\tfrac{1}{2}y_1$, $-2\sqrt{5}\leq y_1\leq 2\sqrt{5}$. The transformation trajectory is shown in Figure 5.

Figure 5.

The trajectory of linear transformation

The only eigenvalue of this matrix is 0, and it is an eigenvalue of multiplicity two with a single unit eigenvector direction: since $\begin{pmatrix}2 & 4 \\ -1 & -2\end{pmatrix}\begin{pmatrix}-\frac{2}{\sqrt{5}}\\ \frac{1}{\sqrt{5}}\end{pmatrix}=0\begin{pmatrix}-\frac{2}{\sqrt{5}}\\ \frac{1}{\sqrt{5}}\end{pmatrix}$, the vector $\left(-\frac{2}{\sqrt{5}},\frac{1}{\sqrt{5}}\right)^T$ is an "invariant of the linear transformation" in the sense above. Thus $k_1=\left(-\frac{2}{\sqrt{5}},\frac{1}{\sqrt{5}}\right)^T$ is defined to be an eigenvector ($k_1\neq 0$) of the linear transformation, and the corresponding value 0 is its eigenvalue, where 0 is a double root.
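For Example 3 the same kind of check applies (a minimal sketch assuming NumPy and the signs of the matrix as reconstructed above; note that the matrix is defective, so the computed eigenvalues are only numerically close to the double root 0):

```python
import numpy as np

# Example 3 matrix (with the signs as reconstructed above).
A = np.array([[ 2.0,  4.0],
              [-1.0, -2.0]])

t = np.linspace(0.0, 2.0 * np.pi, 400)
y = A @ np.vstack((np.cos(t), np.sin(t)))

# Image of the unit circle: the segment y2 = -y1/2 with |y1| <= 2*sqrt(5).
assert np.allclose(y[1], -0.5 * y[0])
assert np.all(np.abs(y[0]) <= 2.0 * np.sqrt(5.0) + 1e-12)

# 0 is a double eigenvalue; numerically the eigenvalues come out only close to 0,
# because the matrix is defective (a single eigenvector direction (-2, 1)/sqrt(5)).
vals, vecs = np.linalg.eig(A)
print(vals)   # both approximately 0
print(vecs)
```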

Application of eigenvalues and eigenvectors to data analysis

Eigenvalues and eigenvectors have very important applications in modern science. As a foundation of linear algebra, they play an important role both in theory and in practice, and much data analysis work is closely related to them. This section introduces the application of eigenvalues and eigenvectors in data analysis, including principal component analysis and spectral clustering, and uses example studies to explain how methods built on eigenvalues and eigenvectors perform, revealing more deeply the important role their geometric significance plays in data analysis.

Principal component analysis
Principal component analysis methods

The starting point of principal component analysis is to compute, from a set of original features, a new set of features ordered by decreasing importance; the new features are linear combinations of the original features and are mutually uncorrelated [16].

Denote the $p$ original features by $x_1,\ldots,x_p$ and assume that each new feature $\xi_i$, $i=1,\ldots,p$, is a linear combination of these original features:
$$\xi_i=\sum_{j=1}^{p}a_{ij}x_j=a_i^Tx. \tag{10}$$

Here the coefficient vectors of the linear combinations are required to have unit norm, so that the scales of the $\xi_i$ are comparable:
$$a_i^Ta_i=1.$$

In matrix form, equation (10) becomes $\xi=A^Tx$, where $\xi$ is the vector consisting of the new features $\xi_i$ and $A$ is the feature transformation matrix. What is sought here is the optimal orthogonal transformation $A$ that maximizes the variance of each new feature $\xi_i$. The orthogonal transformation guarantees that the new features are uncorrelated, and the larger the variance of a new feature, the more the samples differ along that feature dimension, making that feature more important.

Consider the first new feature $\xi_1$:
$$\xi_1=\sum_{j=1}^{p}a_{1j}x_j=a_1^Tx.$$

Its variance is
$$\operatorname{var}(\xi_1)=E[\xi_1^2]-E[\xi_1]^2=E\!\left[a_1^Txx^Ta_1\right]-E\!\left[a_1^Tx\right]E\!\left[x^Ta_1\right]=a_1^T\Sigma a_1,$$
where $\Sigma$ is the covariance matrix of $x$, which can be estimated from samples, and $E[\cdot]$ denotes mathematical expectation. Maximizing the variance of $\xi_1$ under the constraint $a_1^Ta_1=1$ is equivalent to finding the extreme values of the Lagrangian function
$$f(a_1)=a_1^T\Sigma a_1-\nu\left(a_1^Ta_1-1\right), \tag{14}$$
where $\nu$ is a Lagrange multiplier. Taking the derivative of equation (14) with respect to $a_1$ and setting it equal to 0, the optimal solution $a_1$ is found to satisfy
$$\Sigma a_1=\nu a_1. \tag{15}$$

This is the eigen-equation of the covariance matrix $\Sigma$: $a_1$ must be an eigenvector of $\Sigma$, and $\nu$ is the corresponding eigenvalue. Substituting equation (15) into equation (14) gives
$$\operatorname{var}(\xi_1)=a_1^T\Sigma a_1=\nu a_1^Ta_1=\nu. \tag{16}$$

Therefore, the optimal $a_1$ is the eigenvector corresponding to the largest eigenvalue of $\Sigma$, and $\xi_1$, called the first principal component, has the largest variance among all unit-norm linear combinations of the original features.

The second new feature $\xi_2$ is determined next. In addition to the same requirements as the first feature (maximal variance and unit-norm coefficients), it must be uncorrelated with the first principal component:
$$E[\xi_1\xi_2]-E[\xi_1]E[\xi_2]=0.$$

Substituting equation (10) and simplifying, we obtain
$$a_2^T\Sigma a_1=0.$$

Considering the eigen-equation $\Sigma a_1=\nu a_1$ (equation (15)), the uncorrelatedness requirement is equivalent to requiring $a_2$ to be orthogonal to $a_1$:
$$a_2^Ta_1=0,\qquad a_2^Ta_2=1.$$

Maximizing the variance of ξ2 under the constraints yields that a2 is the eigenvector corresponding to the second largest eigenvalue of Σ , while ξ2 is called the second principal component.

The covariance matrix $\Sigma$ has $p$ eigenvalues, some of which may be equal and some of which may be 0. Arranging them from largest to smallest, the eigenvectors corresponding to these eigenvalues yield the $p$ principal components $\xi_i$, $i=1,\ldots,p$. The sum of the variances of all principal components is
$$\sum_{i=1}^{p}\operatorname{var}(\xi_i)=\sum_{i=1}^{p}\lambda_i.$$

It is equal to the sum of the variances of the individual original features.

The column vectors of the transformation matrix $A$ are the orthonormal eigenvectors of $\Sigma$, so $A^T=A^{-1}$, i.e., $A$ is an orthogonal matrix.

As a feature extraction method, it is generally desirable to represent the data with fewer principal components. If the first $k$ principal components are retained, the proportion of the total variance of the data that they represent is
$$\sum_{i=1}^{k}\lambda_i\Big/\sum_{i=1}^{p}\lambda_i. \tag{21}$$

Figure 6 shows an example of the magnitudes of the individual eigenvalues on a dataset. The first three eigenvalues, i.e., the variances of the first three principal components, account for most of the total variance, and such an eigenvalue plot can be used to decide how many principal components to keep to represent the data. In many cases the desired proportion of total variance to be retained by the new features can be fixed in advance, and the appropriate $k$ is then computed from equation (21).
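A minimal sketch of this selection rule (assuming NumPy; the data matrix `X`, the 95% threshold and the helper name `choose_k` are illustrative, not from the original paper):

```python
import numpy as np

def choose_k(X, ratio=0.95):
    """Return the smallest k whose principal components retain `ratio` of the variance.

    X: array of shape (n_samples, p); rows are samples, columns are features.
    """
    Xc = X - X.mean(axis=0)                          # center the features
    cov = np.cov(Xc, rowvar=False)                   # p x p covariance matrix Sigma
    eigvals = np.linalg.eigvalsh(cov)[::-1]          # eigenvalues, largest first
    explained = np.cumsum(eigvals) / eigvals.sum()   # equation (21) for k = 1..p
    return int(np.searchsorted(explained, ratio) + 1)

# Illustrative use on random data with 3 dominant directions.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 10)) + 0.05 * rng.normal(size=(500, 10))
print(choose_k(X, ratio=0.95))   # typically 3 for this construction
```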

Figure 6.

Eigenvalue of Principal component analysis

Selecting relatively few principal components to represent the data serves not only for dimensionality reduction of the features but also for removing noise from the data. The principal components (also called minor components) at the tail of the eigenvalue spectrum generally represent random noise in the data. In that case, if the components of $\xi$ corresponding to very small eigenvalues are set to 0 and the result is inverse-transformed back to the original space, the original data are denoised. PCA can also reduce $n$ features to $k$ for data compression; for example, if a 100-dimensional vector is ultimately represented by 10 dimensions, the compression rate is 90%.
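The denoising-by-truncation idea can be sketched as follows (assuming NumPy; the helper `pca_denoise`, the toy data and the choice $k=2$ are illustrative):

```python
import numpy as np

def pca_denoise(X, k):
    """Project samples onto the first k principal components and transform back.

    X: array of shape (n_samples, p). Returns the reconstructed (denoised) data.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    A_k = eigvecs[:, ::-1][:, :k]                # top-k orthonormal eigenvectors (columns)
    scores = Xc @ A_k                            # xi = A_k^T x for every sample
    return scores @ A_k.T + mean                 # inverse transform: minor components set to 0

# Illustrative use: a clean rank-2 signal plus noise, reconstructed with k = 2.
rng = np.random.default_rng(1)
clean = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 8))
noisy = clean + 0.1 * rng.normal(size=clean.shape)
denoised = pca_denoise(noisy, k=2)
print(np.linalg.norm(noisy - clean), ">", np.linalg.norm(denoised - clean))
```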

Example applications

Spatial heterodyne spectroscopy (SHS) is a new hyperspectral remote sensing detection technology. The two-dimensional interferometric data acquired by a spatial heterodyne spectrometer are affected by a variety of factors that reduce the accuracy of the recovered spectra, so the experiments in this section focus on a correction method for spatial heterodyne interferometric data based on principal component analysis.

Experimental data

The test data were collected using the spatial heterodyne spectrometer HEP-765-S, which has a fundamental wavelength of 764.8 nm and a spectral resolution of 0.01 nm. The monochromatic light source was a potassium hollow-cathode lamp, and the continuous light source was a GY-10 high-pressure spherical xenon lamp. The raw interferograms were collected with the spectrometer in a darkroom environment. Both images have a size of 2048×2048 pixels, and each row represents one group of interferometric data. Both interferograms show uneven intensity distribution, with irregularly shaped spots or patches in some areas; these effects reduce the accuracy of the recovered spectra and need to be corrected.

Data processing and analysis

There is a Fourier transform relationship between the interferogram and the spectrogram, so the spectral data can be obtained by Fourier transforming the preprocessed two-dimensional interferogram. The two-dimensional interferograms processed in this experiment contain 2048 rows of one-dimensional interferometric data, and Fourier transforming each row yields 2048 groups of spectral data, which can be written as $B=(b_1,b_2,\ldots,b_{2048})$. The principal component analysis algorithm can analyze the correlation between the target spectra and the noise spectra in the spectral data, decompose them into independent components, and then, by retaining only the target spectral components, reduce the dimensionality of the data and remove noise. Therefore, the principal component analysis algorithm is used to correct the spectral data.
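A schematic sketch of this processing chain is given below (assuming NumPy; the de-baselining step, the helper names and `load_interferogram` are placeholders for the authors' actual preprocessing, not their code):

```python
import numpy as np

def interferogram_to_spectra(interferogram):
    """Fourier-transform each row of a 2048 x 2048 interferogram into a spectrum."""
    rows = interferogram - interferogram.mean(axis=1, keepdims=True)   # crude de-baselining placeholder
    return np.abs(np.fft.rfft(rows, axis=1))                           # 2048 spectra, one per row

def pca_correct(spectra, k=2):
    """Keep the first k principal components of the spectra and reconstruct."""
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    # SVD of the centered data gives the same principal directions as eig(Sigma).
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k] + mean

# Hypothetical usage:
# interferogram = load_interferogram(...)          # 2048 x 2048 measured data
# corrected = pca_correct(interferogram_to_spectra(interferogram), k=2)
```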

According to the above analysis, 2048 rows of spectral data are obtained after de-baselining and Fourier transforming the original interferograms. The 2048 sets of spectral data are processed by principal component analysis, and the eigenvalues, eigenvectors and projection values of the principal components are sorted; the eigenvalues, contribution rates and cumulative contribution rates of the first 10 principal components are shown in Table 1. As can be seen from Table 1, the contributions of the first two principal components are larger than those of the other principal components by more than an order of magnitude, and their cumulative contribution rate reaches 97.71%.

Table 1. The results of the first ten principal components

Principal component Eigenvalue Contribution rate/% Cumulative contribution/%
1 2847.83 51.43 51.43
2 2672.39 46.28 97.71
3 61.27 0.83 98.54
4 54.78 0.46 99.00
5 10.83 0.63 99.63
6 6.28 0.21 99.84
7 3.12 0.05 99.89
8 2.76 0.03 99.92
9 2.14 0.02 99.94
10 1.82 0.01 99.95

Figure 7 shows the average spectrogram and the spectrograms of the first ten principal components. The two characteristic peaks of the potassium lamp are at 766.70 nm and 770.59 nm. It can be seen from the figure that in the first and second principal components the noise around the two characteristic peaks is small and the peak intensities are large. From the third to the tenth principal component the noise gradually increases, the intensities of the two peaks gradually decrease, and the peaks are no longer located at 766.70 nm and 770.59 nm, indicating that noise has become the dominant contribution in these principal components.

Figure 7.

Spectra of potassium lamps

In order to verify that the principal component analysis method has the same denoising effect on the interferograms of continuous light, the xenon lamp spectral data were processed in the same way. The contributions of the first three principal components of the processed xenon lamp spectral data were 42.78%, 36.94% and 0.83%, respectively; the contributions of the first two principal components were each greater than 35%, with a cumulative contribution of 79.72%. The average spectrogram and the spectrograms of the first ten principal components are shown in Figure 8. As can be seen from Fig. 8, the intensity of the xenon characteristic peaks in the first two principal components is large, while from the third to the tenth principal component the peak intensity is essentially the same as the noise intensity, i.e., the first two principal components can be taken as the xenon spectral components.

Figure 8.

Spectrum of xenon lamps

In order to quantitatively evaluate the effectiveness of the principal component analysis method, 300 rows of less noisy data were randomly selected from the 2048 rows, and the mean square errors before and after spectral correction were calculated for the 524th, 596th and 974th rows; the results are shown in Table 2. As can be seen from Table 2, the mean square errors of the three rows after correction are 0.134, 0.108 and 0.114, respectively, all smaller than the values before correction, indicating that the correction effect of principal component analysis is good.

Table 2. MSE of xenon lamp spectra before and after denoising

Mean square error Line 524 Line 596 Line 974
Before denoising 0.548 0.476 0.503
After denoising 0.134 0.108 0.114
Spectral clustering

Cluster analysis is a common method in data analysis, and spectral clustering is one of the most popular clustering methods. Spectral clustering is based on spectral graph partitioning theory: compared with traditional clustering algorithms, this class of algorithms considers a continuous relaxation of the problem, transforming the original problem into the search for the eigenvalues and eigenvectors of a Laplacian matrix. The algorithm can recognize non-convex distributions and is applied to many practical problems, with good results in image segmentation, text mining and bioinformatics. This section investigates the spectral clustering algorithm.

Algorithm overview

The spectral clustering algorithm is based on spectral graph partitioning theory and regards data clustering as a multiway partitioning problem on an undirected graph [17]. Each data sample is regarded as a vertex of the graph, and each edge between two vertices is assigned a weight according to the similarity between the corresponding samples, yielding an undirected weighted graph G = (V, E) based on the similarity of the samples. From the point of view of optimal graph partitioning, the goal is to minimize the similarity between any two subgraphs of the partition and maximize the similarity within each subgraph.

Finding the optimal solution of the graph partitioning problem is NP-hard. A better approach is to consider a continuously relaxed form of the problem, whereupon the original problem can be transformed into a spectral decomposition of the Laplacian matrix, giving a globally optimal solution of the graph partition criterion in the relaxed continuous domain.

The similarity matrix, also known as the affinity matrix, is usually denoted by W or A and is defined as
$$W_{ij}=\exp\!\left(-\frac{\|x_i-x_j\|^2}{2\sigma^2}\right),$$
where $x_i,x_j$ denote data sample points, $\|x_i-x_j\|$ is the Euclidean distance between sample points $i$ and $j$, and $\sigma$ is a scale parameter that determines how quickly the similarity decays with distance.

Depending on the criterion function and the spectral mapping method, spectral clustering has a variety of implementations. A representative one is the NJW algorithm, whose main steps are as follows (a code sketch is given after the steps):

Step 1: Construct the similarity matrix W of the data samples.

Step 2: Construct the Laplacian matrix L.

Step 3: Find the first $K$ largest eigenvalues of the matrix L and their corresponding eigenvectors $v_1,v_2,\ldots,v_k$, and construct the eigenvector matrix $V=[v_1,v_2,\ldots,v_k]\in\mathbb{R}^{n\times k}$.

Step 4: Consider each row of the eigenvector space V as a point in the space and cluster it into k classes using classical clustering methods such as K-means.
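A minimal sketch of these four steps (assuming NumPy and scikit-learn's `KMeans`; following the original NJW formulation, the matrix whose leading eigenvectors are taken is the normalized affinity $D^{-1/2}WD^{-1/2}$, and the rows of $V$ are normalized before K-means; the toy data and $\sigma$ are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

def njw_spectral_clustering(X, k, sigma=1.0):
    """NJW-style spectral clustering of the rows of X into k clusters."""
    # Step 1: Gaussian similarity matrix W (zero diagonal).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Step 2: normalized matrix L = D^{-1/2} W D^{-1/2} (the form used in NJW).
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Step 3: the k eigenvectors of L with the largest eigenvalues.
    vals, vecs = np.linalg.eigh(L)        # eigenvalues in ascending order
    V = vecs[:, -k:]

    # Step 4: normalize the rows of V and cluster them with K-means.
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(V)

# Illustrative use on two well-separated point clouds.
rng = np.random.default_rng(0)
X = np.vstack((rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))))
print(njw_spectral_clustering(X, k=2, sigma=1.0))
```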

Improved spectral clustering algorithm

This paper focuses on improving how the number of clusters is determined and how the eigenvectors are selected. First, solve for the eigenvalues of the Laplacian matrix of the network to be partitioned, namely $\lambda_1,\lambda_2,\lambda_3,\ldots,\lambda_n$; then use the eigengap formula $C(i)=e^{\lambda_{i+1}}-e^{\lambda_i}$ to find the largest eigengap $C(i)$, denoted $C(i)_{\max}$. Let $p=\arg\max_i C(i)$ and select the first $N-p-1$ eigenvectors, where $N$ is the number of nodes in the complex network and $p$ is the index at which the eigengap is maximal. The first $N-p-1$ eigenvectors are then processed by the K-means algorithm and divided into $N-p-1$ clusters, which both determines the number of clusters and selects the eigenvectors to be processed. The specific algorithm is as follows (a code sketch is given after the steps):

Compute the adjacency matrix $A$, where $A_{i,j}=1$ if node $i$ is connected to node $j$ by an edge and $A_{i,j}=0$ otherwise.

Construct the Laplacian matrix $L$ from the adjacency matrix, where $L_{i,j}=K_i\delta_{i,j}-A_{i,j}$, $K_i$ is the degree of node $i$, and $\delta_{i,j}$ equals 1 when $i=j$ and 0 otherwise.

Compute the eigenvalues and eigenvectors of the matrix $L$ and arrange the eigenvalues in ascending order, i.e., $0=\lambda_1\leq\lambda_2\leq\lambda_3\leq\cdots\leq\lambda_n$, with corresponding eigenvectors $\alpha_1,\alpha_2,\alpha_3,\ldots,\alpha_{n-1}$, respectively.

Calculate the eigengaps $C(1),\ldots,C(N-2)$ from $C(i)=e^{\lambda_{i+1}}-e^{\lambda_i}$.

Solve for $C(i)_{\max}=\max\{C(1),C(2),C(3),\ldots,C(N-2)\}$, let $p=\arg\max_i C(i)$, and select the first $N-p-1$ eigenvectors, i.e., $\alpha_1,\alpha_2,\ldots,\alpha_{N-p-1}$.

Cluster the $N-p-1$ selected eigenvectors with the K-means algorithm, with the number of clusters set to $N-p-1$.
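A minimal sketch of the improved procedure (assuming NumPy, NetworkX and scikit-learn's `KMeans`; the eigengap formula is implemented as reconstructed above, the rows of the selected eigenvector matrix are what is passed to K-means, and the number of clusters found on a given graph depends on how the Laplacian is normalized, so it may differ from the value reported in Table 3):

```python
import numpy as np
import networkx as nx
from sklearn.cluster import KMeans

def improved_spectral_clustering(G):
    """Partition graph G using the eigengap-based cluster-number selection described above."""
    N = G.number_of_nodes()
    A = nx.to_numpy_array(G)                      # adjacency matrix
    L = np.diag(A.sum(axis=1)) - A                # unnormalized Laplacian L = D - A

    lam, alpha = np.linalg.eigh(L)                # eigenvalues ascending, eigenvectors as columns

    # Eigengaps C(i) = exp(lambda_{i+1}) - exp(lambda_i), i = 1, ..., N-2.
    C = np.exp(lam[1:]) - np.exp(lam[:-1])
    p = int(np.argmax(C[: N - 2]) + 1)            # index of the largest eigengap (1-based)

    k = N - p - 1                                 # number of clusters and of selected eigenvectors
    V = alpha[:, :k]                              # first N - p - 1 eigenvectors (one row per node)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(V)
    return labels, k

# Illustrative use on the Zachary karate club graph shipped with NetworkX.
G = nx.karate_club_graph()
labels, k = improved_spectral_clustering(G)
print("clusters:", k)
print(labels)
```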

Example applications

In order to test the feasibility of the improved algorithm and the accuracy of its partitioning results, the karate club relationship network (Zachary network), which is commonly used in community detection for complex networks, is selected for testing. The eigenvalues of its Laplacian matrix and the values of $C(i)$ are shown in Table 3, and Table 4 shows the eigenvectors selected by this paper's algorithm and the resulting node assignments. From Table 3, $C(p)=0.0507$, so according to the formula proposed in this paper the eigenvectors corresponding to the first $N-p-1=34-31-1=2$ eigenvalues are taken, and the nodes are divided into $N-p-1=2$ groups by K-means.

Table 3. Eigenvalues of the Laplacian matrix and C(i) of the Zachary network

Eigenvalue Value C(i) Eigenvalue Value C(i) Eigenvalue Value C(i)
λ1 0.0043 λ12 0.0257 0.0001 λ23 0.0582 0.0013
λ2 0.0107 0.0064 λ13 0.0257 0.0000 λ24 0.0637 0.0055
λ3 0.0138 0.0031 λ14 0.0314 0.0057 λ25 0.0708 0.0071
λ4 0.0157 0.0019 λ15 0.0348 0.0034 λ26 0.0812 0.0104
λ5 0.0198 0.0041 λ16 0.0376 0.0028 λ27 0.0843 0.0031
λ6 0.0216 0.0018 λ17 0.0403 0.0027 λ28 0.0916 0.0073
λ7 0.0228 0.0012 λ18 0.0429 0.0026 λ29 0.1164 0.0248
λ8 0.0241 0.0013 λ19 0.0431 0.0011 λ30 0.1368 0.0204
λ9 0.0256 0.0015 λ20 0.0439 0.0008 λ31 0.1672 0.0304
λ10 0.0256 0.0000 λ21 0.0532 0.0093 λ32 0.2179 0.0507
λ11 0.0256 0.0000 λ22 0.0569 0.0037 λ33 0.2308 0.0129

Table 4. Selected eigenvectors and node categories

Node Eigenvector1 Eigenvector2 Category Node Eigenvector1 Eigenvector2 Category
1 -0.1038 0.0647 1 18 -0.1001 0.1498 1
2 -0.0405 0.0936 1 19 0.1632 -0.0583 2
3 0.0231 0.0418 2 20 -0.0128 0.0647 1
4 -0.0527 0.1039 1 21 0.1539 -0.0594 1
5 -0.2876 -0.1203 2 22 -0.1000 0.1497 2
6 -0.3178 -0.1986 1 23 0.1579 -0.0597 2
7 -0.3194 -0.2006 1 24 0.1546 -0.0601 2
8 -0.5219 0.1007 1 25 0.1528 -0.0641 2
9 0.0504 0.0138 2 26 0.1489 -0.0732 1
10 0.0915 0.0129 2 27 0.1863 -0.0892 2
11 -0.02769 -0.1208 1 28 0.1176 -0.0359 1
12 -0.2117 0.7549 2 29 0.0948 -0.0059 2
13 -0.1093 0.1647 2 30 0.1634 -0.0698 2
14 -0.0139 0.0654 2 31 0.0726 0.0139 2
15 0.1576 -0.0613 1 32 0.0976 -0.0281 1
16 0.1643 -0.0619 2 33 0.1194 -0.0381 2
17 -0.4218 -0.3576 1 34 0.1173 -0.0285 1

Figure 9 shows the results of the Zachary network partition. The Zachary karate club network is a common test network used to evaluate the effectiveness of community division. The network consists of 34 nodes and 78 edges. Owing to a dispute, the club split into two smaller clubs centered on the instructor and the administrator, respectively. Applying the algorithm proposed in this paper, the network was divided into 2 parts, and the division results show that the proposed algorithm can accurately and automatically determine the number of clusters. At the same time, the eigenvectors corresponding to the first two eigenvalues are selected according to the formula, and after clustering with the K-means algorithm the accuracy of the node division reaches 98.03%, which further indicates that the eigenvectors automatically selected by the algorithm in this paper are effective.

Figure 9.

Network division results of the Zachary network

Conclusion

This paper investigates the geometric significance of eigenvalues and eigenvectors according to whether the matrix $A$ is invertible or not, and demonstrates the value of eigenvalues and eigenvectors and their geometric significance in data analysis through example applications of principal component analysis and spectral clustering algorithms.

Principal component analysis is used to process and reduce the dimensionality of the 2048 sets of interferometric data collected by the spatial heterodyne spectrometer, and the first and second principal components are selected as the main feature components of the monochromatic and continuous light sources, thereby correcting the spatial heterodyne interferometric data. Taking the spectral data in rows 524, 596 and 974 as examples, the mean square errors of the corrected data are 0.134, 0.108 and 0.114, respectively, reduced by 75.55%, 77.31% and 77.34% compared with the values before correction. This demonstrates the effective application of principal component analysis, whose core principle rests on eigenvalues and eigenvectors and their geometric significance, to the correction of spatial heterodyne interferometric data.

The improvement of the spectral clustering algorithm in this paper achieves automatic determination of the number of clusters and automatic selection of the eigenvectors in the partitioning of complex networks. Using the improved algorithm, the karate club network is divided into 2 groups, and the accuracy of the community node assignments reaches 98.03%. Comparison with the real situation of the karate club network shows that the division obtained in this paper is highly consistent with reality, highlighting the effectiveness of the improved spectral clustering algorithm and the important value of eigenvalues and eigenvectors and their geometric significance in data analysis.
