
Geometric significance of eigenvalues and eigenvectors in linear algebra and their potential value in data analysis

  

Introduction

Eigenvalues and eigenvectors are fundamental attributes of matrices, with important applications in quantum mechanics, machine learning, signal and image processing, and other fields; their concepts and implications should be mastered and deeply understood by science and engineering students. Let $A$ be a square matrix of order $n$. If there exist a number $\lambda$ and a nonzero $n$-dimensional vector $\xi$ such that $A\xi=\lambda\xi$ holds, then $\lambda$ is called an eigenvalue of $A$, and $\xi$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda$. From the point of view of transformations, when an $n$-dimensional vector $\xi$ is an eigenvector of the matrix $A$, the action of $A$ on $\xi$ is equivalent to stretching $\xi$ by a factor of $\lambda$.

Eigenvalues and eigenvectors have many applications in big data analysis. When processing big data, it is usually necessary to reduce the dimensionality of the sample data to ease processing, and principal component analysis (PCA) is a method that projects high-dimensional data into a low-dimensional space by a linear transformation while losing as little information as possible [1-5]. For example, when observing different cars one may record the number of seats, number of tires, number of doors, number of windows, cylinder size, and so on; some of these indicators are strongly correlated, and this redundant information needs to be removed during data processing. PCA therefore treats directions with high variance as the ones that best separate the classes [6-7]. PCA is one of the commonly used algorithms in big data analysis and machine learning, and it appears in the study of computer science, electronic information, economics, medicine, and other fields [8-10]. Beyond eigenvalues and eigenvectors, this example also involves the diagonalization of square matrices and coordinate transformations from linear algebra [11-14].

In this paper, we study by way of example the trajectory of the vector $y=(y_1,y_2)^T$ under the linear transformation $y=Ax$ when the matrix $A$ is invertible. By solving for the eigenvalues and eigenvectors of the invertible matrix $A$, we illustrate the geometric significance of eigenvalues and eigenvectors for the trajectory of $y=(y_1,y_2)^T$ under the linear transformation. To ensure that the conclusions are comprehensive and general, we also give an example in which the matrix $A$ is not invertible. Principal component analysis and a spectral clustering algorithm are then selected to demonstrate the application of eigenvalues and eigenvectors, and of their geometric significance, in data analysis. Principal component analysis is used to reduce the dimensionality of 2048 groups of interferometric data collected by a spatial heterodyne spectrometer and to correct effects, such as irregular spots in the spatial heterodyne interferograms, that degrade the accuracy of the recovered spectra. The partitioning of complex networks is studied with a spectral clustering algorithm, which is improved so that the number of clusters and the eigenvectors to use are determined automatically. The improved spectral clustering algorithm is used to partition the karate club network, demonstrating the effectiveness of the improved algorithm and the potential value of the geometric significance of eigenvalues and eigenvectors in data analysis.

The geometric significance of eigenvalues and eigenvectors

Eigenvalues and eigenvectors are two important concepts in linear algebra and are now widely used in active areas such as dynamical systems, machine learning, image processing and data analysis [15]. In this paper, we take second-order (2×2) square matrices as an example and focus on explaining the geometric significance of eigenvalues and eigenvectors.

In the plane, let the vector $x=(x_1,x_2)^T$ satisfy $x_1^2+x_2^2=1$, i.e., $x=(x_1,x_2)^T$ lies on the unit circle (equivalently, $x$ is a unit vector).

Consider the linear transformation $y=Ax$, where the matrix is
$$A=\begin{pmatrix}a & b \\ c & d\end{pmatrix},\qquad a^2+b^2\neq 0,\; c^2+d^2\neq 0.$$

Using $\begin{pmatrix}a & b \\ c & d\end{pmatrix}\begin{pmatrix}x_1 \\ x_2\end{pmatrix}=\begin{pmatrix}y_1 \\ y_2\end{pmatrix}$, i.e., $y_1=ax_1+bx_2$ and $y_2=cx_1+dx_2$, a direct computation shows that $(c^2+d^2)y_1^2-2(ac+bd)y_1y_2+(a^2+b^2)y_2^2=(ad-bc)^2(x_1^2+x_2^2)$, so on the unit circle
$$(c^2+d^2)y_1^2-2(ac+bd)y_1y_2+(a^2+b^2)y_2^2=(ad-bc)^2. \tag{2}$$

The distribution of the trajectory of the vector $y=(y_1,y_2)^T$ as $x=(x_1,x_2)^T$ varies is studied geometrically below.

Matrix A invertible

When the matrix $A$ is invertible, $ad-bc\neq 0$. The trajectory of the vector $y=(y_1,y_2)^T$ as $x=(x_1,x_2)^T$ varies can be studied geometrically:

When $ad-bc\neq 0$ and $ac+bd=0$, equation (2) represents an ellipse whose major and minor axes lie on the coordinate axes.

When $ad-bc\neq 0$ and $ac+bd\neq 0$, equation (2) still represents an ellipse, but its major and minor axes do not lie on the coordinate axes.

The following is an example:

Example 1: Given the matrix $A=\begin{pmatrix}1 & 3 \\ 3 & 1\end{pmatrix}$, examine the trajectory of the vector $y=(y_1,y_2)^T$ after the linear transformation $y=Ax$.

Solution: The matrix $A=\begin{pmatrix}1 & 3 \\ 3 & 1\end{pmatrix}$ satisfies $ad-bc\neq 0$ and $ac+bd\neq 0$. The trajectory of the vector $y=(y_1,y_2)^T$ after the linear transformation is
$$10y_1^2-12y_1y_2+10y_2^2=64. \tag{3}$$

Equation (3) represents an ellipse whose major and minor axes are not on the coordinate axes.

A Matlab program is used to plot the curve of equation (3); the result is shown in Fig. 1, from which it can be seen that the unit circle is transformed by the linear transformation $y=Ax$ into an ellipse whose major and minor axes are not on the coordinate axes.
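For readers who want to reproduce the construction behind Fig. 1, the following short script (a minimal sketch in Python/NumPy rather than the Matlab program used by the authors; variable names are illustrative) maps the unit circle through $y=Ax$ and checks that every image point satisfies equation (3):

```python
import numpy as np
import matplotlib.pyplot as plt

A = np.array([[1.0, 3.0],
              [3.0, 1.0]])

t = np.linspace(0.0, 2.0 * np.pi, 400)
x = np.vstack((np.cos(t), np.sin(t)))   # unit circle: x1^2 + x2^2 = 1
y = A @ x                               # its image under the linear transformation

# Every image point satisfies equation (3): 10*y1^2 - 12*y1*y2 + 10*y2^2 = 64.
assert np.allclose(10 * y[0]**2 - 12 * y[0] * y[1] + 10 * y[1]**2, 64.0)

plt.plot(x[0], x[1], label="unit circle")
plt.plot(y[0], y[1], label="image ellipse")
plt.axis("equal")
plt.legend()
plt.show()
```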

Figure 1.

The trajectory of linear transformation

Taking $x=\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$, we have $\begin{pmatrix}1&3\\3&1\end{pmatrix}\begin{pmatrix}\frac{\sqrt{2}}{2}\\ \frac{\sqrt{2}}{2}\end{pmatrix}=4\begin{pmatrix}\frac{\sqrt{2}}{2}\\ \frac{\sqrt{2}}{2}\end{pmatrix}$; here 4 is greater than zero, meaning that the transformed vector is stretched in the same direction, as shown in Fig. 2(a). Taking $x=\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$, we have $\begin{pmatrix}1&3\\3&1\end{pmatrix}\begin{pmatrix}\frac{\sqrt{2}}{2}\\ -\frac{\sqrt{2}}{2}\end{pmatrix}=(-2)\begin{pmatrix}\frac{\sqrt{2}}{2}\\ -\frac{\sqrt{2}}{2}\end{pmatrix}$; here $-2$ is less than zero, meaning that the transformed vector is stretched in the opposite direction, as shown in Fig. 2(b). Thus, in terms of the linear transformation, $\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$ and $\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$ are invariants of the linear transformation, where "invariant" means that the line of direction is unchanged, allowing both the same and the opposite direction.

Thus, we define $k_1=\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$ and $k_2=\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$ as eigenvectors ($k_1\neq 0$, $k_2\neq 0$) of the linear transformation $Ax$, and the corresponding values 4 and $-2$ are their eigenvalues.
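As a quick numerical check (a minimal sketch assuming NumPy; not part of the original derivation), the eigenpairs identified above can be recovered directly with a standard eigen-solver:

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [3.0, 1.0]])

# Eigen-decomposition: columns of `vecs` are unit eigenvectors of A.
vals, vecs = np.linalg.eig(A)
print("eigenvalues:", vals)          # expected: 4 and -2 (order may vary)
print("eigenvectors (columns):")
print(vecs)                          # +-(sqrt(2)/2, sqrt(2)/2) and +-(sqrt(2)/2, -sqrt(2)/2)

# Verify the defining relation A @ xi == lambda * xi for each pair.
for lam, xi in zip(vals, vecs.T):
    assert np.allclose(A @ xi, lam * xi)
```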

Figure 2.

The geometric meaning of eigenvalues and eigenvectors

Equation (3) can also be examined from the point of view of quadratic forms. Equation (3) represents an ellipse, shown in Fig. 3(a), whose major and minor axes are not on the coordinate axes. By choosing a suitable orthogonal linear transformation, its major and minor axes can be made to fall on the coordinate axes.

The quadratic form in (3) corresponds to the matrix $B=\begin{pmatrix}10 & -6 \\ -6 & 10\end{pmatrix}$. Computation shows that $B$ has two distinct eigenvalues $\lambda_1=4$ and $\lambda_2=16$, with corresponding unit eigenvectors $\eta_1=\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$ and $\eta_2=\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$.

Since $\eta_1$ and $\eta_2$ are orthonormal (eigenvectors of a real symmetric matrix belonging to distinct eigenvalues are orthogonal), construct the orthogonal matrix $T_1=(\eta_1,\eta_2)=\begin{pmatrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2}\end{pmatrix}$ and perform the orthogonal linear transformation $y=T_1X$, i.e., $\begin{pmatrix}y_1\\y_2\end{pmatrix}=T_1\begin{pmatrix}X_1\\X_2\end{pmatrix}$, where $T_1$ satisfies $T_1^TT_1=E$. Substituting into equation (3) yields $4X_1^2+16X_2^2=64$, i.e.,
$$\frac{X_1^2}{16}+\frac{X_2^2}{4}=1. \tag{4}$$

Equation (4) is an ellipse whose major and minor axes fall on the coordinate axes, as shown in Figure 3(b).

Alternatively, let $T_2=(\eta_2,\eta_1)=\begin{pmatrix}\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\end{pmatrix}$ and perform the orthogonal linear transformation $y=T_2X$, i.e., $\begin{pmatrix}y_1\\y_2\end{pmatrix}=T_2\begin{pmatrix}X_1\\X_2\end{pmatrix}$. Here $T_2$ also satisfies $T_2^TT_2=E$, and substituting into equation (3) gives $16X_1^2+4X_2^2=64$, i.e.,
$$\frac{X_1^2}{4}+\frac{X_2^2}{16}=1. \tag{5}$$

Equation (5) is also an ellipse whose major and minor axes fall on the coordinate axes, as shown in Figure 3(c). Figure 3(b) is equivalent to rotating Figure 3(a) clockwise by 45°, and Figure 3(c) is equivalent to rotating Figure 3(a) counterclockwise by 45°.
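The diagonalization of the quadratic form can be verified numerically as well (a minimal sketch assuming NumPy; the arrays mirror $B$, $T_1$ and $T_2$ above):

```python
import numpy as np

B = np.array([[10.0, -6.0],
              [-6.0, 10.0]])

s = np.sqrt(2.0) / 2.0
eta1 = np.array([s,  s])     # unit eigenvector for eigenvalue 4
eta2 = np.array([s, -s])     # unit eigenvector for eigenvalue 16

T1 = np.column_stack((eta1, eta2))
T2 = np.column_stack((eta2, eta1))

# Both are orthogonal matrices ...
assert np.allclose(T1.T @ T1, np.eye(2))
assert np.allclose(T2.T @ T2, np.eye(2))

# ... and they diagonalize the quadratic form: T^T B T is the diagonal of eigenvalues.
assert np.allclose(T1.T @ B @ T1, np.diag([4.0, 16.0]))
assert np.allclose(T2.T @ B @ T2, np.diag([16.0, 4.0]))
```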

Figure 3.

The corresponding curves of equations (3)-(5)

Matrix A is not invertible

When the matrix $A$ is not invertible, i.e., $ad-bc=0$, equation (2) becomes
$$(c^2+d^2)y_1^2-2(ac+bd)y_1y_2+(a^2+b^2)y_2^2=0. \tag{6}$$

An example of the curve represented by equation (6) is given below.

Example 2: Consider the matrix $A=\begin{pmatrix}1 & 1 \\ 1 & 1\end{pmatrix}$ ($A$ not invertible, symmetric), which satisfies $ad-bc=0$ and $ac+bd\neq 0$. The trajectory of the vector $y=(y_1,y_2)^T$ after the linear transformation is
$$2y_1^2-4y_1y_2+2y_2^2=0 \;\Rightarrow\; y_2=y_1,\quad -\sqrt{2}\leq y_1\leq \sqrt{2}.$$

This shows that under this linear transformation the unit circle becomes the line segment $y_2=y_1$, $-\sqrt{2}\leq y_1\leq\sqrt{2}$. The result of the transformation is shown in Fig. 4, where the trajectory of the linear transformation is given in Fig. 4(a) and the images of certain points are given in Fig. 4(b).

Figure 4.

The trajectory of linear transformation

The linear transformation maps $A(1,0)$ to $A'(1,1)$, $B\!\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)$ to $B'(\sqrt{2},\sqrt{2})$, $C(0,1)$ to $C'(1,1)$, $D\!\left(-\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)$ to $D'(0,0)$, $E(-1,0)$ to $E'(-1,-1)$, $F\!\left(-\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)$ to $F'(-\sqrt{2},-\sqrt{2})$, $G(0,-1)$ to $G'(-1,-1)$, and $H\!\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)$ to $H'(0,0)$. In the figure, $A'$ coincides with $C'$, $D'$ coincides with $H'$ at the origin, and $E'$ coincides with $G'$.

Since $\begin{pmatrix}1&1\\1&1\end{pmatrix}\begin{pmatrix}\frac{\sqrt{2}}{2}\\ -\frac{\sqrt{2}}{2}\end{pmatrix}=0\begin{pmatrix}\frac{\sqrt{2}}{2}\\ -\frac{\sqrt{2}}{2}\end{pmatrix}$ and $\begin{pmatrix}1&1\\1&1\end{pmatrix}\begin{pmatrix}\frac{\sqrt{2}}{2}\\ \frac{\sqrt{2}}{2}\end{pmatrix}=2\begin{pmatrix}\frac{\sqrt{2}}{2}\\ \frac{\sqrt{2}}{2}\end{pmatrix}$, the vectors $\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$ and $\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$ are "invariants of the linear transformation" in the sense above. Thus $k_1=\left(\frac{\sqrt{2}}{2},-\frac{\sqrt{2}}{2}\right)^T$ and $k_2=\left(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}\right)^T$ are defined to be eigenvectors ($k_1\neq 0$, $k_2\neq 0$) of the linear transformation, and the corresponding values 0 and 2 are their eigenvalues.
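The collapse of the unit circle onto a segment, and the eigenpairs 0 and 2, can be checked with a few lines (a minimal sketch assuming NumPy):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])   # singular: ad - bc = 0

t = np.linspace(0.0, 2.0 * np.pi, 400)
x = np.vstack((np.cos(t), np.sin(t)))
y = A @ x

# The image collapses onto the line y2 = y1, with |y1| <= sqrt(2).
assert np.allclose(y[1], y[0])
assert np.all(np.abs(y[0]) <= np.sqrt(2.0) + 1e-12)

# Eigenvalues 0 and 2, with eigenvectors along (1, -1) and (1, 1).
vals, vecs = np.linalg.eig(A)
print(vals)   # [2. 0.] (order may vary)
print(vecs)
```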

Example 3: Consider the matrix $A=\begin{pmatrix}2 & 4 \\ -1 & -2\end{pmatrix}$ ($A$ not invertible, nonsymmetric), which satisfies $ad-bc=0$ and $ac+bd\neq 0$. The trajectory of the vector $y=(y_1,y_2)^T$ after the linear transformation is then
$$5y_1^2+20y_1y_2+20y_2^2=0 \;\Rightarrow\; y_2=-\tfrac{1}{2}y_1,\quad -2\sqrt{5}\leq y_1\leq 2\sqrt{5}.$$

This shows that under this linear transformation the unit circle becomes the line segment $y_2=-\tfrac{1}{2}y_1$, $-2\sqrt{5}\leq y_1\leq 2\sqrt{5}$. The transformation trajectory is shown in Figure 5.

Figure 5.

The trajectory of linear transformation

The only eigenvalue of this matrix is 0, and it is an eigenvalue of multiplicity two with a single unit eigenvector direction: since $\begin{pmatrix}2 & 4 \\ -1 & -2\end{pmatrix}\begin{pmatrix}-\frac{2}{\sqrt{5}}\\ \frac{1}{\sqrt{5}}\end{pmatrix}=0\begin{pmatrix}-\frac{2}{\sqrt{5}}\\ \frac{1}{\sqrt{5}}\end{pmatrix}$, the vector $\left(-\frac{2}{\sqrt{5}},\frac{1}{\sqrt{5}}\right)^T$ is an "invariant of the linear transformation" in the sense above. Thus $k_1=\left(-\frac{2}{\sqrt{5}},\frac{1}{\sqrt{5}}\right)^T$ is defined to be an eigenvector ($k_1\neq 0$) of the linear transformation, and the corresponding value 0 is its eigenvalue, where 0 is a double root.
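For Example 3 the same kind of check applies (a minimal sketch assuming NumPy and the signs of the matrix as reconstructed above; note that the matrix is defective, so the computed eigenvalues are only numerically close to the double root 0):

```python
import numpy as np

# Example 3 matrix (with the signs as reconstructed above).
A = np.array([[ 2.0,  4.0],
              [-1.0, -2.0]])

t = np.linspace(0.0, 2.0 * np.pi, 400)
y = A @ np.vstack((np.cos(t), np.sin(t)))

# Image of the unit circle: the segment y2 = -y1/2 with |y1| <= 2*sqrt(5).
assert np.allclose(y[1], -0.5 * y[0])
assert np.all(np.abs(y[0]) <= 2.0 * np.sqrt(5.0) + 1e-12)

# 0 is a double eigenvalue; numerically the eigenvalues come out only close to 0,
# because the matrix is defective (a single eigenvector direction (-2, 1)/sqrt(5)).
vals, vecs = np.linalg.eig(A)
print(vals)   # both approximately 0
print(vecs)
```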

Application of eigenvalues and eigenvectors to data analysis

Eigenvalues and eigenvectors have very important applications in modern science. As a foundation of linear algebra, they play an important role both in theory and in practice, and much data analysis work is closely related to them. This section introduces the application of eigenvalues and eigenvectors in data analysis, including principal component analysis and spectral clustering, and uses example studies to explain how methods built on eigenvalues and eigenvectors perform, revealing more deeply the important role their geometric significance plays in data analysis.

Principal component analysis
Principal component analysis methods

The starting point of principal component analysis is to compute, from a set of original features, a new set of features ordered by decreasing importance; the new features are linear combinations of the original features and are mutually uncorrelated [16].

Denote the $p$ original features by $x_1,\ldots,x_p$ and assume that each new feature $\xi_i$, $i=1,\ldots,p$, is a linear combination of these original features:
$$\xi_i=\sum_{j=1}^{p}a_{ij}x_j=a_i^Tx. \tag{10}$$

Here the coefficient vectors of the linear combinations are required to have unit norm, so that the scales of the $\xi_i$ are comparable:
$$a_i^Ta_i=1.$$

In matrix form, equation (10) becomes $\xi=A^Tx$, where $\xi$ is the vector consisting of the new features $\xi_i$ and $A$ is the feature transformation matrix. What is sought here is the optimal orthogonal transformation $A$ that maximizes the variance of each new feature $\xi_i$. The orthogonal transformation guarantees that the new features are uncorrelated, and the larger the variance of a new feature, the more the samples differ along that feature dimension, making that feature more important.

Consider the first new feature $\xi_1$:
$$\xi_1=\sum_{j=1}^{p}a_{1j}x_j=a_1^Tx.$$

Its variance is
$$\operatorname{var}(\xi_1)=E[\xi_1^2]-E[\xi_1]^2=E\!\left[a_1^Txx^Ta_1\right]-E\!\left[a_1^Tx\right]E\!\left[x^Ta_1\right]=a_1^T\Sigma a_1,$$
where $\Sigma$ is the covariance matrix of $x$, which can be estimated from samples, and $E[\cdot]$ denotes mathematical expectation. Maximizing the variance of $\xi_1$ under the constraint $a_1^Ta_1=1$ is equivalent to finding the extreme values of the Lagrangian function
$$f(a_1)=a_1^T\Sigma a_1-\nu\left(a_1^Ta_1-1\right), \tag{14}$$
where $\nu$ is a Lagrange multiplier. Taking the derivative of equation (14) with respect to $a_1$ and setting it equal to 0, the optimal solution $a_1$ is found to satisfy
$$\Sigma a_1=\nu a_1. \tag{15}$$

This is the eigen-equation of the covariance matrix $\Sigma$: $a_1$ must be an eigenvector of $\Sigma$, and $\nu$ is the corresponding eigenvalue. Substituting equation (15) into equation (14) gives
$$\operatorname{var}(\xi_1)=a_1^T\Sigma a_1=\nu a_1^Ta_1=\nu. \tag{16}$$

Therefore, the optimal $a_1$ is the eigenvector corresponding to the largest eigenvalue of $\Sigma$, and $\xi_1$, called the first principal component, has the largest variance among all unit-norm linear combinations of the original features.

The second new feature $\xi_2$ is determined next. In addition to the same requirements as the first feature (maximal variance and unit-norm coefficients), it must be uncorrelated with the first principal component:
$$E[\xi_1\xi_2]-E[\xi_1]E[\xi_2]=0.$$

Substituting equation (10) and simplifying, we obtain
$$a_2^T\Sigma a_1=0.$$

Considering the eigen-equation $\Sigma a_1=\nu a_1$ (equation (15)), the uncorrelatedness requirement is equivalent to requiring $a_2$ to be orthogonal to $a_1$:
$$a_2^Ta_1=0,\qquad a_2^Ta_2=1.$$

Maximizing the variance of ξ2 under the constraints yields that a2 is the eigenvector corresponding to the second largest eigenvalue of Σ , while ξ2 is called the second principal component.

The covariance matrix $\Sigma$ has $p$ eigenvalues, some of which may be equal and some of which may be 0. Arranging them from largest to smallest, the eigenvectors corresponding to these eigenvalues yield the $p$ principal components $\xi_i$, $i=1,\ldots,p$. The sum of the variances of all principal components is
$$\sum_{i=1}^{p}\operatorname{var}(\xi_i)=\sum_{i=1}^{p}\lambda_i.$$

It is equal to the sum of the variances of the individual original features.

The column vectors of the transformation matrix $A$ are the orthonormal eigenvectors of $\Sigma$, so $A^T=A^{-1}$, i.e., $A$ is an orthogonal matrix.

As a feature extraction method, it is generally desirable to represent the data with fewer principal components. If the first $k$ principal components are retained, the proportion of the total variance of the data that they represent is
$$\sum_{i=1}^{k}\lambda_i\Big/\sum_{i=1}^{p}\lambda_i. \tag{21}$$

Figure 6 shows an example of the magnitudes of the individual eigenvalues on a dataset. The first three eigenvalues, i.e., the variances of the first three principal components, account for most of the total variance, and such an eigenvalue plot can be used to decide how many principal components to keep to represent the data. In many cases the desired proportion of total variance to be retained by the new features can be fixed in advance, and the appropriate $k$ is then computed from equation (21).
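A minimal sketch of this selection rule (assuming NumPy; the data matrix `X`, the 95% threshold and the helper name `choose_k` are illustrative, not from the original paper):

```python
import numpy as np

def choose_k(X, ratio=0.95):
    """Return the smallest k whose principal components retain `ratio` of the variance.

    X: array of shape (n_samples, p); rows are samples, columns are features.
    """
    Xc = X - X.mean(axis=0)                          # center the features
    cov = np.cov(Xc, rowvar=False)                   # p x p covariance matrix Sigma
    eigvals = np.linalg.eigvalsh(cov)[::-1]          # eigenvalues, largest first
    explained = np.cumsum(eigvals) / eigvals.sum()   # equation (21) for k = 1..p
    return int(np.searchsorted(explained, ratio) + 1)

# Illustrative use on random data with 3 dominant directions.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 10)) + 0.05 * rng.normal(size=(500, 10))
print(choose_k(X, ratio=0.95))   # typically 3 for this construction
```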

Figure 6.

Eigenvalue of Principal component analysis

Selecting relatively few principal components to represent the data serves not only for dimensionality reduction of the features but also for removing noise from the data. The principal components (also called minor components) at the tail of the eigenvalue spectrum generally represent random noise in the data. In that case, if the components of $\xi$ corresponding to very small eigenvalues are set to 0 and the result is inverse-transformed back to the original space, the original data are denoised. PCA can also reduce $n$ features to $k$ for data compression; for example, if a 100-dimensional vector is ultimately represented by 10 dimensions, the compression rate is 90%.
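The denoising-by-truncation idea can be sketched as follows (assuming NumPy; the helper `pca_denoise`, the toy data and the choice $k=2$ are illustrative):

```python
import numpy as np

def pca_denoise(X, k):
    """Project samples onto the first k principal components and transform back.

    X: array of shape (n_samples, p). Returns the reconstructed (denoised) data.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    A_k = eigvecs[:, ::-1][:, :k]                # top-k orthonormal eigenvectors (columns)
    scores = Xc @ A_k                            # xi = A_k^T x for every sample
    return scores @ A_k.T + mean                 # inverse transform: minor components set to 0

# Illustrative use: a clean rank-2 signal plus noise, reconstructed with k = 2.
rng = np.random.default_rng(1)
clean = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 8))
noisy = clean + 0.1 * rng.normal(size=clean.shape)
denoised = pca_denoise(noisy, k=2)
print(np.linalg.norm(noisy - clean), ">", np.linalg.norm(denoised - clean))
```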

Example applications

Spatial heterodyne spectroscopy (SHS) is a new hyperspectral remote sensing detection technology. The two-dimensional interferometric data acquired by a spatial heterodyne spectrometer are affected by a variety of factors that reduce the accuracy of the recovered spectra, so the experiments in this section focus on a correction method for spatial heterodyne interferometric data based on principal component analysis.

Experimental data

The test data were collected using the spatial heterodyne spectrometer HEP-765-S, which has a fundamental wavelength of 764.8 nm and a spectral resolution of 0.01 nm. The monochromatic light source was a potassium hollow-cathode lamp, and the continuous light source was a GY-10 high-pressure spherical xenon lamp. The raw interferograms were collected with the spectrometer in a darkroom environment. Both images have a size of 2048×2048 pixels, and each row represents one group of interferometric data. Both interferograms show uneven intensity distribution, with irregularly shaped spots or patches in some areas; these effects reduce the accuracy of the recovered spectra and need to be corrected.

Data processing and analysis

There is a Fourier transform relationship between the interferogram and the spectrogram, so the spectral data can be obtained by Fourier transforming the preprocessed two-dimensional interferogram. The two-dimensional interferograms processed in this experiment contain 2048 rows of one-dimensional interferometric data, and Fourier transforming each row yields 2048 groups of spectral data, which can be written as $B=(b_1,b_2,\ldots,b_{2048})$. The principal component analysis algorithm can analyze the correlation between the target spectra and the noise spectra in the spectral data, decompose them into independent components, and then, by retaining only the target spectral components, reduce the dimensionality of the data and remove noise. Therefore, the principal component analysis algorithm is used to correct the spectral data.
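A schematic sketch of this processing chain is given below (assuming NumPy; the de-baselining step, the helper names and `load_interferogram` are placeholders for the authors' actual preprocessing, not their code):

```python
import numpy as np

def interferogram_to_spectra(interferogram):
    """Fourier-transform each row of a 2048 x 2048 interferogram into a spectrum."""
    rows = interferogram - interferogram.mean(axis=1, keepdims=True)   # crude de-baselining placeholder
    return np.abs(np.fft.rfft(rows, axis=1))                           # 2048 spectra, one per row

def pca_correct(spectra, k=2):
    """Keep the first k principal components of the spectra and reconstruct."""
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    # SVD of the centered data gives the same principal directions as eig(Sigma).
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k] + mean

# Hypothetical usage:
# interferogram = load_interferogram(...)          # 2048 x 2048 measured data
# corrected = pca_correct(interferogram_to_spectra(interferogram), k=2)
```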

According to the above analysis, 2048 rows of spectral data are obtained after de-baselining and Fourier transforming the original interferograms. The 2048 sets of spectral data are processed by principal component analysis, and the eigenvalues, eigenvectors and projection values of the principal components are sorted; the eigenvalues, contribution rates and cumulative contribution rates of the first 10 principal components are shown in Table 1. As can be seen from Table 1, the contributions of the first two principal components are larger than those of the other principal components by more than an order of magnitude, and their cumulative contribution rate reaches 97.71%.

Table 1. The results of the first ten principal components

Principal component Eigenvalue Contribution rate/% Cumulative contribution/%
1 2847.83 51.43 51.43
2 2672.39 46.28 97.71
3 61.27 0.83 98.54
4 54.78 0.46 99.00
5 10.83 0.63 99.63
6 6.28 0.21 99.84
7 3.12 0.05 99.89
8 2.76 0.03 99.92
9 2.14 0.02 99.94
10 1.82 0.01 99.95

Figure 7 shows the average spectrogram and the spectrograms of the first ten principal components. The two characteristic peaks of the potassium lamp are at 766.70 nm and 770.59 nm. It can be seen from the figure that in the first and second principal components the noise around the two characteristic peaks is small and the peak intensities are large. From the third to the tenth principal component the noise gradually increases, the intensities of the two peaks gradually decrease, and the peaks are no longer located at 766.70 nm and 770.59 nm, indicating that noise has become the dominant contribution in these principal components.

Figure 7.

Spectra of potassium lamps

In order to verify that the principal component analysis method has the same denoising effect on the interferograms of continuous light, the xenon lamp spectral data were processed in the same way. The contributions of the first three principal components of the processed xenon lamp spectral data were 42.78%, 36.94% and 0.83%, respectively; the contributions of the first two principal components were each greater than 35%, with a cumulative contribution of 79.72%. The average spectrogram and the spectrograms of the first ten principal components are shown in Figure 8. As can be seen from Fig. 8, the intensity of the xenon characteristic peaks in the first two principal components is large, while from the third to the tenth principal component the peak intensity is essentially the same as the noise intensity, i.e., the first two principal components can be taken as the xenon spectral components.

Figure 8.

Spectrum of xenon lamps

In order to quantitatively evaluate the effectiveness of the principal component analysis method, 300 rows of less noisy data were randomly selected from the 2048 rows, and the mean square errors before and after spectral correction were calculated for the 524th, 596th and 974th rows; the results are shown in Table 2. As can be seen from Table 2, the mean square errors of the three rows after correction are 0.134, 0.108 and 0.114, respectively, all smaller than the values before correction, indicating that the correction effect of principal component analysis is good.

Table 2. MSE of xenon lamp spectra before and after denoising

Mean square error Line 524 Line 596 Line 974
Before denoising 0.548 0.476 0.503
After denoising 0.134 0.108 0.114
Spectral clustering

Cluster analysis is a common method in data analysis, and spectral clustering is one of the most popular clustering methods. Spectral clustering is based on spectral graph partitioning theory: compared with traditional clustering algorithms, this class of algorithms considers a continuous relaxation of the problem, transforming the original problem into the search for the eigenvalues and eigenvectors of a Laplacian matrix. The algorithm can recognize non-convex distributions and is applied to many practical problems, with good results in image segmentation, text mining and bioinformatics. This section investigates the spectral clustering algorithm.

Algorithm overview

The spectral clustering algorithm is based on spectral graph partitioning theory and regards data clustering as a multiway partitioning problem on an undirected graph [17]. Each data sample is regarded as a vertex of the graph, and each edge between two vertices is assigned a weight according to the similarity between the corresponding samples, yielding an undirected weighted graph G = (V, E) based on the similarity of the samples. From the point of view of optimal graph partitioning, the goal is to minimize the similarity between any two subgraphs of the partition and maximize the similarity within each subgraph.

Finding the optimal solution of the graph partitioning problem is NP-hard. A better approach is to consider a continuously relaxed form of the problem, whereupon the original problem can be transformed into a spectral decomposition of the Laplacian matrix, giving a globally optimal solution of the graph partition criterion in the relaxed continuous domain.

The similarity matrix, also known as the affinity matrix, is usually denoted by W or A and is defined as
$$W_{ij}=\exp\!\left(-\frac{\|x_i-x_j\|^2}{2\sigma^2}\right),$$
where $x_i,x_j$ denote data sample points, $\|x_i-x_j\|$ is the Euclidean distance between sample points $i$ and $j$, and $\sigma$ is a scale parameter that determines how quickly the similarity decays with distance.

Depending on the criterion function and the spectral mapping method, spectral clustering has a variety of implementations. A representative one is the NJW algorithm, whose main steps are as follows (a code sketch is given after the steps):

Step 1: Construct the similarity matrix W of the data samples.

Step 2: Construct the Laplacian matrix L.

Step 3: Find the first $K$ largest eigenvalues of the matrix L and their corresponding eigenvectors $v_1,v_2,\ldots,v_k$, and construct the eigenvector matrix $V=[v_1,v_2,\ldots,v_k]\in\mathbb{R}^{n\times k}$.

Step 4: Consider each row of the eigenvector space V as a point in the space and cluster it into k classes using classical clustering methods such as K-means.
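A minimal sketch of these four steps (assuming NumPy and scikit-learn's `KMeans`; following the original NJW formulation, the matrix whose leading eigenvectors are taken is the normalized affinity $D^{-1/2}WD^{-1/2}$, and the rows of $V$ are normalized before K-means; the toy data and $\sigma$ are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

def njw_spectral_clustering(X, k, sigma=1.0):
    """NJW-style spectral clustering of the rows of X into k clusters."""
    # Step 1: Gaussian similarity matrix W (zero diagonal).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Step 2: normalized matrix L = D^{-1/2} W D^{-1/2} (the form used in NJW).
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Step 3: the k eigenvectors of L with the largest eigenvalues.
    vals, vecs = np.linalg.eigh(L)        # eigenvalues in ascending order
    V = vecs[:, -k:]

    # Step 4: normalize the rows of V and cluster them with K-means.
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(V)

# Illustrative use on two well-separated point clouds.
rng = np.random.default_rng(0)
X = np.vstack((rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))))
print(njw_spectral_clustering(X, k=2, sigma=1.0))
```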

Improved spectral clustering algorithm

This paper focuses on improving how the number of clusters is determined and how the eigenvectors are selected. First, solve for the eigenvalues of the Laplacian matrix of the network to be partitioned, namely $\lambda_1,\lambda_2,\lambda_3,\ldots,\lambda_n$; then use the eigengap formula $C(i)=e^{\lambda_{i+1}}-e^{\lambda_i}$ to find the largest eigengap $C(i)$, denoted $C(i)_{\max}$. Let $p=\arg\max_i C(i)$ and select the first $N-p-1$ eigenvectors, where $N$ is the number of nodes in the complex network and $p$ is the index at which the eigengap is maximal. The first $N-p-1$ eigenvectors are then processed by the K-means algorithm and divided into $N-p-1$ clusters, which both determines the number of clusters and selects the eigenvectors to be processed. The specific algorithm is as follows (a code sketch is given after the steps):

Compute the adjacency matrix $A$, where $A_{i,j}=1$ if node $i$ is connected to node $j$ by an edge and $A_{i,j}=0$ otherwise.

Construct the Laplacian matrix $L$ from the adjacency matrix, where $L_{i,j}=K_i\delta_{i,j}-A_{i,j}$, $K_i$ is the degree of node $i$, and $\delta_{i,j}$ equals 1 when $i=j$ and 0 otherwise.

Compute the eigenvalues and eigenvectors of the matrix $L$ and arrange the eigenvalues in ascending order, i.e., $0=\lambda_1\leq\lambda_2\leq\lambda_3\leq\cdots\leq\lambda_n$, with corresponding eigenvectors $\alpha_1,\alpha_2,\alpha_3,\ldots,\alpha_{n-1}$, respectively.

Calculate the eigengaps $C(1),\ldots,C(N-2)$ from $C(i)=e^{\lambda_{i+1}}-e^{\lambda_i}$.

Solve for $C(i)_{\max}=\max\{C(1),C(2),C(3),\ldots,C(N-2)\}$, let $p=\arg\max_i C(i)$, and select the first $N-p-1$ eigenvectors, i.e., $\alpha_1,\alpha_2,\ldots,\alpha_{N-p-1}$.

Cluster the $N-p-1$ selected eigenvectors with the K-means algorithm, with the number of clusters set to $N-p-1$.
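A minimal sketch of the improved procedure (assuming NumPy, NetworkX and scikit-learn's `KMeans`; the eigengap formula is implemented as reconstructed above, the rows of the selected eigenvector matrix are what is passed to K-means, and the number of clusters found on a given graph depends on how the Laplacian is normalized, so it may differ from the value reported in Table 3):

```python
import numpy as np
import networkx as nx
from sklearn.cluster import KMeans

def improved_spectral_clustering(G):
    """Partition graph G using the eigengap-based cluster-number selection described above."""
    N = G.number_of_nodes()
    A = nx.to_numpy_array(G)                      # adjacency matrix
    L = np.diag(A.sum(axis=1)) - A                # unnormalized Laplacian L = D - A

    lam, alpha = np.linalg.eigh(L)                # eigenvalues ascending, eigenvectors as columns

    # Eigengaps C(i) = exp(lambda_{i+1}) - exp(lambda_i), i = 1, ..., N-2.
    C = np.exp(lam[1:]) - np.exp(lam[:-1])
    p = int(np.argmax(C[: N - 2]) + 1)            # index of the largest eigengap (1-based)

    k = N - p - 1                                 # number of clusters and of selected eigenvectors
    V = alpha[:, :k]                              # first N - p - 1 eigenvectors (one row per node)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(V)
    return labels, k

# Illustrative use on the Zachary karate club graph shipped with NetworkX.
G = nx.karate_club_graph()
labels, k = improved_spectral_clustering(G)
print("clusters:", k)
print(labels)
```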

Example applications

In order to test the feasibility of the improved algorithm and the accuracy of its partitioning results, the karate club relationship network (Zachary network), which is commonly used in community detection for complex networks, is selected for testing. The eigenvalues of its Laplacian matrix and the values of $C(i)$ are shown in Table 3, and Table 4 shows the eigenvectors selected by this paper's algorithm and the resulting node assignments. From Table 3, $C(p)=0.0507$, so according to the formula proposed in this paper the eigenvectors corresponding to the first $N-p-1=34-31-1=2$ eigenvalues are taken, and the nodes are divided into $N-p-1=2$ groups by K-means.

Table 3. Eigenvalues of the Laplacian matrix and C(i) of the Zachary network

Eigenvalue Value C(i) Eigenvalue Value C(i) Eigenvalue Value C(i)
λ1 0.0043 λ12 0.0257 0.0001 λ23 0.0582 0.0013
λ2 0.0107 0.0064 λ13 0.0257 0.0000 λ24 0.0637 0.0055
λ3 0.0138 0.0031 λ14 0.0314 0.0057 λ25 0.0708 0.0071
λ4 0.0157 0.0019 λ15 0.0348 0.0034 λ26 0.0812 0.0104
λ5 0.0198 0.0041 λ16 0.0376 0.0028 λ27 0.0843 0.0031
λ6 0.0216 0.0018 λ17 0.0403 0.0027 λ28 0.0916 0.0073
λ7 0.0228 0.0012 λ18 0.0429 0.0026 λ29 0.1164 0.0248
λ8 0.0241 0.0013 λ19 0.0431 0.0011 λ30 0.1368 0.0204
λ9 0.0256 0.0015 λ20 0.0439 0.0008 λ31 0.1672 0.0304
λ10 0.0256 0.0000 λ21 0.0532 0.0093 λ32 0.2179 0.0507
λ11 0.0256 0.0000 λ22 0.0569 0.0037 λ33 0.2308 0.0129

Table 4. Selected eigenvectors and node categories

Node Eigenvector1 Eigenvector2 Category Node Eigenvector1 Eigenvector2 Category
1 -0.1038 0.0647 1 18 -0.1001 0.1498 1
2 -0.0405 0.0936 1 19 0.1632 -0.0583 2
3 0.0231 0.0418 2 20 -0.0128 0.0647 1
4 -0.0527 0.1039 1 21 0.1539 -0.0594 1
5 -0.2876 -0.1203 2 22 -0.1000 0.1497 2
6 -0.3178 -0.1986 1 23 0.1579 -0.0597 2
7 -0.3194 -0.2006 1 24 0.1546 -0.0601 2
8 -0.5219 0.1007 1 25 0.1528 -0.0641 2
9 0.0504 0.0138 2 26 0.1489 -0.0732 1
10 0.0915 0.0129 2 27 0.1863 -0.0892 2
11 -0.02769 -0.1208 1 28 0.1176 -0.0359 1
12 -0.2117 0.7549 2 29 0.0948 -0.0059 2
13 -0.1093 0.1647 2 30 0.1634 -0.0698 2
14 -0.0139 0.0654 2 31 0.0726 0.0139 2
15 0.1576 -0.0613 1 32 0.0976 -0.0281 1
16 0.1643 -0.0619 2 33 0.1194 -0.0381 2
17 -0.4218 -0.3576 1 34 0.1173 -0.0285 1

Figure 9 shows the results of the Zachary network partition. The Zachary karate club network is a common test network used to evaluate the effectiveness of community division. The network consists of 34 nodes and 78 edges. Owing to a dispute, the club split into two smaller clubs centered on the instructor and the administrator, respectively. Applying the algorithm proposed in this paper, the network was divided into 2 parts, and the division results show that the proposed algorithm can accurately and automatically determine the number of clusters. At the same time, the eigenvectors corresponding to the first two eigenvalues are selected according to the formula, and after clustering with the K-means algorithm the accuracy of the node division reaches 98.03%, which further indicates that the eigenvectors automatically selected by the algorithm in this paper are effective.

Figure 9.

Network division results of the Zachary network

Conclusion

This paper investigates the geometric significance of eigenvalues and eigenvectors according to whether the matrix $A$ is invertible or not, and demonstrates the value of eigenvalues and eigenvectors and their geometric significance in data analysis through example applications of principal component analysis and spectral clustering algorithms.

Principal component analysis is used to process and reduce the dimensionality of the 2048 sets of interferometric data collected by the spatial heterodyne spectrometer, and the first and second principal components are selected as the main feature components of the monochromatic and continuous light sources, thereby correcting the spatial heterodyne interferometric data. Taking the spectral data in rows 524, 596 and 974 as examples, the mean square errors of the corrected data are 0.134, 0.108 and 0.114, respectively, reduced by 75.55%, 77.31% and 77.34% compared with the values before correction. This demonstrates the effective application of principal component analysis, whose core principle rests on eigenvalues and eigenvectors and their geometric significance, to the correction of spatial heterodyne interferometric data.

The improvement of the spectral clustering algorithm in this paper achieves automatic determination of the number of clusters and automatic selection of the eigenvectors in the partitioning of complex networks. Using the improved algorithm, the karate club network is divided into 2 groups, and the accuracy of the community node assignments reaches 98.03%. Comparison with the real situation of the karate club network shows that the division obtained in this paper is highly consistent with reality, highlighting the effectiveness of the improved spectral clustering algorithm and the important value of eigenvalues and eigenvectors and their geometric significance in data analysis.
