Automatic Identification of Surface Defects in Semiconductor Materials Based on Machine Learning

Semiconductor processing has developed rapidly since the 1990s, enabling another leap in the capacity for information storage, information processing, and motion control per unit volume of matter. Three-dimensional microfabrication design and processing technology, an important part of this development, requires the fabrication of miniature systems with high depth-to-width ratios on semiconductor materials [1-2]. Processing three-dimensional structures also places increasing demands on the internal properties of semiconductor materials: the internal consistency of the material becomes an important guarantee that a microdevice functions as intended [3-4].

Semiconductor components are essential parts in the manufacture of precision electronic devices and are usually made of semiconductor materials (e.g., silicon) [5]. Their main role is to transmit, amplify, and control electrical signals in electronic circuits so that electronic devices function properly, and they are widely used in the manufacture of integrated circuits, transistors, diodes, and other key electronic components [6-7]. Owing to the working principle and material properties of semiconductor components, the field is moving towards smaller size, higher performance, and higher integration. However, as component size decreases and the degree of integration increases, the manufacturing process becomes more complex and surface defects become harder to detect. Small and complex surface defects are more likely to appear at smaller sizes, and these defects may directly affect the electrical performance, stability, and reliability of the components [8-10]. Research on the detection of surface defects in dense samples of semiconductor components has therefore become particularly critical.

The detection of surface defects is crucial for improving the overall quality and reliability of semiconductor components [11]. First, by detecting surface defects early and effectively, manufacturers can take appropriate improvement and repair measures to enhance the performance and long-term reliability of the devices and meet the demand for high-quality components in modern electronics. Second, defect detection helps to reduce production costs [12-13]: detecting and treating surface defects early avoids more serious quality problems in subsequent production and lowers the scrap rate, which in turn reduces costs and improves production efficiency. Surface quality inspection of semiconductor components has therefore become an indispensable part of the automated production process. It helps to ensure the high quality and reliability of the devices and reduces the performance problems that may occur in later production and use [14-16].

Traditional defect detection methods rely on manual labour: workers identify and mark surface defects by visual inspection or with simple tools [17]. This approach suffers from high labour consumption, low efficiency, and subjective judgement, and it is clearly insufficient in modern manufacturing environments with high throughput and high precision. Machine vision-based surface defect detection methods have therefore become an indispensable alternative. Among them, image processing-based methods use computer vision and image processing techniques to analyse the image and detect surface defects. These methods address the problems of manual inspection to a certain extent and improve the automation and accuracy of detection [18-20].

In this paper, we compare several widely used edge detection algorithms and select the Canny algorithm to locate defective regions on the surface of semiconductor materials. Noise is removed from the detected defect edges with a morphological filtering algorithm to smooth the detection result. The high-dimensional feature space of the defective region is mapped to a low-dimensional feature space, and geometric, gray-scale, and texture features are extracted based on the shape, size, gray level, and texture information exhibited by the defects. A semiconductor material surface defect recognition model is built on the TensorFlow machine learning framework; features are extracted through network self-training, and the convolution operation is used to improve the computational efficiency of the neural network model. The model is trained with transfer learning, and the SGD optimization algorithm and a warm-up strategy are used to further improve its recognition performance. The industrial dataset WM-811K, collected from an actual semiconductor production line, is used to evaluate the model on eight types of surface defects. The recognition results are compared with those of the decision tree, SVM, and random forest algorithms, which highlights the performance of the proposed model in semiconductor material surface defect recognition.

2

Defect area detection

Automatic defect recognition is concerned with the defective region. To improve recognition efficiency, the defective region in the semiconductor material should first be detected and segmented, and then characterized. Edge detection, a classic image segmentation approach, uses the abrupt change in gray value at the boundary between target and background regions as the basis for segmenting the target region. During imaging, light is scattered in the defective region, so its gray value is lower than that of the surrounding normal region. An abrupt change in gray value therefore always appears at the border between the defective region and the surrounding normal region, and an edge detection operator based on gray-value discontinuities can be used for feature extraction.

A comparison of several widely used edge detection operators shows the following. The first-order gradient operators Prewitt and Sobel have low computational complexity, but their detection accuracy is also lower, and they cannot detect the edges of some finer crack defects. The Canny and LoG operators detect the edges of finer defects with higher sensitivity, and of the two the Canny operator is more stable and less susceptible to noise. The Canny operator is therefore used to locate surface defects of semiconductor materials [21]. Figure 1 shows the detection results: Figure 1(a) is the output of the Canny operator and Figure 1(b) is the result after denoising. The Canny operator detects the defect edges completely, and the small number of remaining noise points can be removed by a morphological filtering algorithm (an opening operation, i.e., erosion followed by dilation), after which the defect edges are shown accurately and clearly.
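The opening operation mentioned above can be illustrated with a minimal pure-Python sketch. It assumes a binary image stored as a list of rows and a square structuring element of hypothetical size `k`; neither detail is specified in the paper. The sketch is applied to a filled binary mask, since one-pixel-wide edges would not survive a 3 × 3 opening and would need a smaller or line-shaped element.

```python
def _neighbourhood(img, y, x, r):
    """Yield the pixel values in the (2r+1) x (2r+1) window around (y, x).
    Positions outside the image are treated as background (0)."""
    h, w = len(img), len(img[0])
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            yy, xx = y + dy, x + dx
            yield img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0

def erode(img, k=3):
    """Binary erosion: a pixel stays 1 only if its whole window is 1."""
    r = k // 2
    return [[int(all(_neighbourhood(img, y, x, r))) for x in range(len(img[0]))]
            for y in range(len(img))]

def dilate(img, k=3):
    """Binary dilation: a pixel becomes 1 if any pixel in its window is 1."""
    r = k // 2
    return [[int(any(_neighbourhood(img, y, x, r))) for x in range(len(img[0]))]
            for y in range(len(img))]

def opening(img, k=3):
    """Opening = erosion followed by dilation; removes specks smaller than k x k."""
    return dilate(erode(img, k), k)
```

Applied to a mask containing a 3 × 3 blob and one isolated noise pixel, `opening` keeps the blob and removes the isolated pixel.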

3

Defective area feature extraction

3.1

Feature extraction

The defect contour is detected with the Canny operator. Before recognition, the 2D image must be converted into quantitative information that a computer can process easily; this conversion is the task of feature extraction.

The high-dimensional feature space of the defect region is transformed to the low-dimensional feature space by using transformation or mapping. By analyzing the defective regions on the surface of semiconductor materials, it is found that different defects on the surface of semiconductor materials have their own characteristics, which are manifested in the shape, size, gray scale and texture information of the defects. The geometric features, grayscale features, and texture features of defects on the surface of semiconductor materials are extracted respectively [22].

3.1.1

Geometric Feature Extraction

The geometry of the defective region is an important feature in defect classification, and it can be obtained by extracting the geometric features of the region. The possible types of defects can be analyzed using these geometric features, which mainly include the area of the region, the center of mass, and moments.

Let f(x,y) be the gray-scale image of the defective region with M rows and N columns, where (x,y) denotes the pixel at row x and column y. The set of pixels in the defective region is denoted S_region, and the set of pixels on the boundary of the defective region is denoted S_edge. The geometric features used in this paper are as follows.

1)

Perimeter P

The total number of pixels on the boundary of the defective region characterizes its perimeter, which is a contour feature parameter: (1) $P = \sum_{(x, y) \in S_{edge}} 1$

2)

Area A

The area of the region is characterized by the total number of pixels it contains: (2) $A = \sum_{(x, y) \in S_{region}} 1$

3)

Center of mass coordinates (x_c,y_c)

The position of the defective region in the image is described by the coordinates of its center of mass. The possible types of defects can be determined based on the location of the defective region. The center of mass coordinates can be expressed as: (3) $\left\{\begin{array}{l} x_{c} = \frac{1}{A} \sum_{(x, y) \in S_{region}} x \\ y_{c} = \frac{1}{A} \sum_{(x, y) \in S_{region}} y \end{array}\right.$

4)

Degree of rectangularity R_rect.

The degree of rectangularity R_rect is the ratio of the short side W of the minimum bounding rectangle of the defective region to its long side L. Usually R_rect < 1, and R_rect = 1 when the minimum bounding rectangle is a square: (4) $R_{rect} = \frac{W}{L}$

5)

Duty cycle R_D

The duty cycle R_D is defined as the ratio of the defect area A to the area of the minimum bounding rectangle of the defective region. For “line” and “striation” defects on the surface of semiconductor materials, the defect area accounts for a large proportion of the minimum bounding rectangle, whereas for irregular defects such as “scratches” and “jam marks” it accounts for a small proportion: (5) $R_{D} = \frac{A}{W \times L}$

6)

Eccentricity R_ecc

Eccentricity indicates the compactness of the defective region and is defined as the ratio of the longest chord L₁ of the defective region to the longest chord L₂ perpendicular to L₁: (6) $R_{ecc} = \frac{L_{1}}{L_{2}}$

7)

Circularity R_cent

For geometric shapes with the same perimeter, the circle has the largest area. Circularity R_cent is defined as the ratio of the area of the defective region to the area of a circle (the most compact shape) with the same perimeter; it is a dimensionless compactness descriptor. Typically R_cent < 1, and R_cent = 1 for a circular defective region: (7) $R_{cent} = \frac{4 π A}{P^{2}}$

8)

Hu invariant moments

Hu invariant moments are invariant to translation, rotation, and scale. For a gray-scale image f(x,y) of the defective region with M rows and N columns, the 2D (p+q)th-order moments are defined as: (8) $ω_{p q} = \sum_{x = 0}^{M - 1} \sum_{y = 0}^{N - 1} x^{p} y^{q} f (x, y)$ where p, q = 0,1,2,⋯ are integers. The image center of mass can be expressed as: (9) $\left\{\begin{array}{l} x_{c} = \frac{ω_{10}}{ω_{00}} \\ y_{c} = \frac{ω_{01}}{ω_{00}} \end{array}\right.$

From Eqs. (8) and (9), the corresponding (p+q)th-order central moments and the normalized central moments can be expressed as: (10) $u_{p q} = \sum_{x = 0}^{M - 1} \sum_{y = 0}^{N - 1} {(x - x_{c})}^{p} {(y - y_{c})}^{q} f (x, y)$ (11) $η_{p q} = \frac{u_{p q}}{u_{00}^{α}}$ where $α = \frac{p + q}{2} + 1$ and p+q = 2,3,⋯.

Seven invariant moments can be derived from the second-order and third-order moments. Because the higher-order Hu invariant moments are easily affected by external factors such as noise in pattern recognition, only the first four Hu invariant moments are extracted in this section. They are computed as follows: (12) $λ_{1} = η_{20} + η_{02}$ (13) $λ_{2} = {(η_{20} - η_{02})}^{2} + 4 η_{11}^{2}$ (14) $λ_{3} = {(η_{30} - 3 η_{12})}^{2} + {(3 η_{21} - η_{03})}^{2}$ (15) $λ_{4} = {(η_{30} + η_{12})}^{2} + {(η_{21} + η_{03})}^{2}$
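As an illustration of the geometric features, the following sketch computes Eqs. (1)-(5) and the first two Hu moments of Eqs. (12)-(13) in pure Python. It rests on assumptions that the paper does not state: boundary pixels are found by 4-connectivity, and the bounding rectangle is axis-aligned rather than the true minimum bounding rectangle. The function names are hypothetical.

```python
def geometric_features(mask):
    """Eqs. (1)-(5) for a binary mask given as a list of rows of 0/1."""
    rows, cols = len(mask), len(mask[0])
    region = [(x, y) for x in range(rows) for y in range(cols) if mask[x][y]]

    def is_edge(x, y):
        # Assumption: a region pixel is a boundary pixel if any 4-neighbour
        # is background or lies outside the image.
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if not (0 <= nx < rows and 0 <= ny < cols) or not mask[nx][ny]:
                return True
        return False

    area = len(region)                                       # Eq. (2)
    perimeter = sum(1 for x, y in region if is_edge(x, y))   # Eq. (1)
    xc = sum(x for x, _ in region) / area                    # Eq. (3)
    yc = sum(y for _, y in region) / area
    # Assumption: axis-aligned bounding rectangle, not the rotated minimum one.
    xs = [x for x, _ in region]
    ys = [y for _, y in region]
    short, long_ = sorted((max(xs) - min(xs) + 1, max(ys) - min(ys) + 1))
    return {"P": perimeter, "A": area, "centroid": (xc, yc),
            "R_rect": short / long_,                         # Eq. (4)
            "R_D": area / (short * long_)}                   # Eq. (5)

def hu_first_two(f):
    """First two Hu moments (Eqs. 12-13) of a gray-scale image f[x][y]."""
    M, N = len(f), len(f[0])

    def raw(p, q):                                           # Eq. (8)
        return sum(x ** p * y ** q * f[x][y] for x in range(M) for y in range(N))

    xc, yc = raw(1, 0) / raw(0, 0), raw(0, 1) / raw(0, 0)    # Eq. (9)

    def central(p, q):                                       # Eq. (10)
        return sum((x - xc) ** p * (y - yc) ** q * f[x][y]
                   for x in range(M) for y in range(N))

    def eta(p, q):                                           # Eq. (11)
        return central(p, q) / central(0, 0) ** ((p + q) / 2 + 1)

    lam1 = eta(2, 0) + eta(0, 2)
    lam2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return lam1, lam2
```

Shifting the same shape to a different position in the image leaves `hu_first_two` unchanged, which is the translation invariance the text refers to.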

3.1.2

Gray-scale feature extraction

The gray-scale features of the defective regions on the surface of semiconductor materials are mainly the average gray value, the gray-level variance or standard deviation, and the gray-level entropy. Since the histogram of an image is easy to obtain, it is used to compute the gray-scale statistics. Let h be the histogram of the gray-scale image, with gray values in the range [0, L-1], and let z be a random variable representing the gray level. The probability of occurrence of gray level z_i is p(z_i) = h(z_i)/K, i = 0,1,2,⋯,L-1, where K is the total number of image pixels. Three gray-scale features of the defective region are extracted.

1)

The average gray level μ, i.e: (17) $μ = \sum_{i = 0}^{L - 1} z_{i} p (z_{i})$

2)

Gray scale standard deviation σ(z)

The standard deviation is more intuitive compared to the variance, so it is chosen as a measure of the image gray scale features, which is obtained by calculating the arithmetic square root of the variance: (18) $σ (z) = \sqrt{\sum_{i = 0}^{L - 1} {(z_{i} - μ)}^{2} p (z_{i})}$

3)

Gray scale entropy E_ent, i.e: (19) $E_{e n t} = - \sum_{i = 0}^{L - 1} p (z_{i}) \log_{2} p (z_{i})$
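The three histogram statistics of Eqs. (17)-(19) can be sketched as follows. The function name `gray_features` and the flat list of pixel values are assumptions made for illustration; terms with p(z_i) = 0 are skipped in the entropy sum, following the usual convention 0·log 0 = 0.

```python
import math

def gray_features(pixels, levels=256):
    """Mean, standard deviation and entropy of the gray-level histogram.

    pixels: flat iterable of integer gray values in [0, levels - 1].
    """
    pixels = list(pixels)
    K = len(pixels)
    hist = [0] * levels
    for z in pixels:
        hist[z] += 1
    p = [h / K for h in hist]                                  # p(z_i) = h(z_i) / K
    mu = sum(z * p[z] for z in range(levels))                  # Eq. (17)
    sigma = math.sqrt(sum((z - mu) ** 2 * p[z] for z in range(levels)))  # Eq. (18)
    entropy = -sum(pz * math.log2(pz) for pz in p if pz > 0)   # Eq. (19)
    return mu, sigma, entropy
```

For example, a region with half of its pixels at gray level 0 and half at gray level 2 has mean 1, standard deviation 1, and entropy 1 bit.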

3.1.3

Texture Feature Extraction

One of the simplest ways to describe texture is to use the gray-level histogram statistical moments of an image or region. The texture measures computed using only the histogram do not carry information about the relative positions of the pixels with respect to each other, which is critical when characterizing texture. Therefore, both the distribution of gray levels and the relative positions of pixels in the image are considered in texture analysis.

The gray-level co-occurrence matrix considers both the gray-level distribution and the relative position of the pixels. Let L be the number of gray levels of image f, let Q be an operator that defines the relative position of two pixels, and let G be the co-occurrence matrix of size L×L, whose element g_i,j is the number of times that pixel pairs with gray levels i–1 and j–1 satisfying operator Q appear in the image. Here i,j ∈ [1,L] index the rows and columns of the matrix, respectively. The matrix G is normalized using the following equation to obtain the normalized matrix P_Q, where n = Σ g_i,j is the total number of pixel pairs that satisfy Q: (20) $\left\{\begin{array}{l} p_{i, j} = g_{i, j} / n \\ \sum_{i = 1}^{L} \sum_{j = 1}^{L} p_{i, j} = 1 \end{array}\right.$ (21) $P_{Q} = \left[\begin{matrix} p_{1, 1} & \dots & p_{1, j} & \dots & p_{1, L} \\ \dots & \dots & \dots & \dots & \dots \\ p_{i, 1} & \dots & p_{i, j} & \dots & p_{i, L} \\ \dots & \dots & \dots & \dots & \dots \\ p_{L, 1} & \dots & p_{L, j} & \dots & p_{L, L} \end{matrix}\right]$

The position operator Q can be represented in a variety of ways, and four common operators have been proposed in related studies. Q_ϕ,d describes two pixels separated by a distance d in direction ϕ, where ϕ takes the four values 0°, 45°, 90°, and 135°. This yields four gray-level co-occurrence matrices. The texture features are computed from each of the four matrices and then averaged to give the final texture features of the defective region.

By analyzing the elements of G, the texture patterns present in the image can be detected. In this paper, six descriptors are selected as the texture features of defective regions: correlation (Cor), contrast (Con), energy (Ene), homogeneity (Hom), inverse difference moment (Idm), and entropy (Ent).
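A compact sketch of the co-occurrence computation follows. The paper does not give formulas for its six descriptors, so the four shown here (contrast, energy, homogeneity, entropy) use common textbook definitions and should be read as assumptions; the pixel offsets chosen for the four directions are likewise one possible convention. Gray levels are indexed from 0 in the code, whereas the text indexes the matrix from 1.

```python
import math

# Assumed (dx, dy) offsets for distance d = 1, with rows growing downwards.
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}

def glcm(img, levels, dx, dy):
    """Normalized co-occurrence matrix P_Q (Eq. 20) for one offset."""
    h, w = len(img), len(img[0])
    g = [[0] * levels for _ in range(levels)]
    n = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                g[img[y][x]][img[y2][x2]] += 1
                n += 1
    return [[v / n for v in row] for row in g]

def texture_descriptors(p):
    """Four descriptors computed from a normalized co-occurrence matrix."""
    idx = range(len(p))
    return {
        "Con": sum((i - j) ** 2 * p[i][j] for i in idx for j in idx),
        "Ene": sum(p[i][j] ** 2 for i in idx for j in idx),
        "Hom": sum(p[i][j] / (1 + abs(i - j)) for i in idx for j in idx),
        "Ent": -sum(p[i][j] * math.log2(p[i][j])
                    for i in idx for j in idx if p[i][j] > 0),
    }

def averaged_texture(img, levels, d=1):
    """Average each descriptor over the four directions, as in the text.
    Assumes the image is larger than d pixels in both dimensions."""
    feats = [texture_descriptors(glcm(img, levels, dx * d, dy * d))
             for dx, dy in OFFSETS.values()]
    return {k: sum(f[k] for f in feats) / len(feats) for k in feats[0]}
```

Averaging over the four directions makes the resulting texture vector less sensitive to the orientation of the defect.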

3.1.4

Standardization of feature parameters

The geometric, gray-scale, and texture features extracted directly from the gray-scale image of the defective region have different orders of magnitude, which is detrimental to feature selection and defect classification, so the raw data must be standardized. The Auto Scaling method, i.e., standard deviation standardization, is used. Each standardized feature has mean 0 and standard deviation 1, and the standardization formula is: (22) $\left\{\begin{array}{l} x_{i j}^{*} = \frac{x_{i j} - m_{j}}{s_{j}}, \quad i = 1, 2, \dots, n, \; j = 1, 2, \dots, d \\ m_{j} = \frac{1}{n} \sum_{i = 1}^{n} x_{i j} \\ s_{j} = \sqrt{\frac{1}{n - 1} \sum_{i = 1}^{n} {(x_{i j} - m_{j})}^{2}} \end{array}\right.$ where n denotes the number of samples, d denotes the number of feature dimensions, m_j denotes the sample mean of feature j, and s_j denotes the sample standard deviation of feature j.
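Eq. (22) translates directly into code. The sketch below assumes the feature matrix is a list of n sample rows with d features each and that no feature is constant (otherwise s_j = 0); the function name is hypothetical.

```python
import math

def standardize(X):
    """Column-wise standard deviation standardization (Eq. 22)."""
    n, d = len(X), len(X[0])
    m = [sum(row[j] for row in X) / n for j in range(d)]          # sample mean m_j
    s = [math.sqrt(sum((row[j] - m[j]) ** 2 for row in X) / (n - 1))
         for j in range(d)]                                       # sample std s_j
    return [[(row[j] - m[j]) / s[j] for j in range(d)] for row in X]
```

For instance, two features on very different scales, `[[1, 10], [2, 20], [3, 30]]`, both map to the column `-1, 0, 1` after standardization.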

3.2

Experimental analysis

Five scratch (Sc) and five inclusion (In) defects were taken for feature extraction experiments, and the results are shown in Fig. 2: Fig. 2(a) shows the extracted geometric feature parameters, Fig. 2(b) the gray-scale feature parameters, and Fig. 2(c) the texture feature parameters. The figure shows that the feature extraction algorithm used in this paper can effectively extract the features of defects on the surface of the semiconductor material. Because the feature parameters are standardized, the feature data have a mean of 0 and a standard deviation of 1. At the same time, the values of the same feature differ considerably between defect types. For example, in Fig. 2(a) the standardized perimeter P of the target region is 0.26 for scratch 1 and 1.39 for inclusion 1, a significant difference. The differences between defect types on the same feature can therefore be used to identify and classify defect types.

4

Machine learning-based defect recognition

4.1

Modeling

Based on the TensorFlow machine learning framework, a model for recognizing semiconductor material surface defects is established [23]. The features are selected through network self-training, and the convolution operation is used to improve the computational efficiency of the neural network model and enhance training performance. Weight sharing improves the model’s training speed and also strengthens its defect recognition capability. Convolutional neural network modeling mainly includes input layer modeling, convolutional layer modeling, activation function modeling, pooling layer modeling, and fully connected layer modeling.

The main role of the convolution operation is to perform feature extraction and get the feature map. The convolutional modeling is: (23) $W_{2} = \frac{W_{1} - F + 2 P}{S} + 1$

Where: W₂ is the output feature image size, W₁ is the input feature image size, F is the convolution kernel size, S is the stride, and P is the padding size in pixels.
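Eq. (23) can be checked with a one-line helper. The kernel sizes, strides, and padding values used with it below are hypothetical, since the paper does not list its layer configuration; only the 124-pixel input size comes from the experimental section. Integer (floor) division is an assumption that matches the behaviour of common deep learning frameworks when the division is not exact.

```python
def conv_output_size(w1, f, s=1, p=0):
    """Output width W2 of a convolution (Eq. 23), using floor division."""
    return (w1 - f + 2 * p) // s + 1
```

With a 3 × 3 kernel, stride 1, and padding 1, a 124-pixel-wide input keeps its width: `conv_output_size(124, 3, 1, 1)` gives 124. With stride 2 and no padding, the same kernel gives 61.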

The role of activation function modeling is mainly to apply a nonlinearity to the linear features produced by the convolution operation so that they are closer to the actual feature model. The LeakyReLU function is chosen as the activation function; it converges rapidly and effectively avoids neuron inactivation. It is defined as: (24) $f (x) = \max (0.1 x, x) = \left\{\begin{array}{ll} 0.1 x, & x < 0 \\ x, & x \geq 0 \end{array}\right.$

Where: x is the upper level output value. f(x) is the lower layer input value.

Pooling modeling is located after convolution and activation, mainly to eliminate redundant features. Pooling acts to compress the data, changing the size of the feature matrix, but does not change the dimension of the feature matrix, i.e., the results of the upper layer are sampled and processed, and the mean pooling process is adopted for modeling.
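The activation of Eq. (24) and the mean pooling step can be sketched together in a few lines. The 2 × 2 window with stride 2 is an assumption, as the paper does not state the pooling size.

```python
def leaky_relu(x, slope=0.1):
    """LeakyReLU of Eq. (24): x for x >= 0, slope * x otherwise."""
    return max(slope * x, x)

def mean_pool(fmap, k=2):
    """Non-overlapping k x k mean pooling of a 2D feature map (list of rows).
    Rows and columns that do not fill a complete window are dropped."""
    h, w = len(fmap) // k, len(fmap[0]) // k
    return [[sum(fmap[i * k + a][j * k + b] for a in range(k) for b in range(k)) / (k * k)
             for j in range(w)]
            for i in range(h)]

def activate_and_pool(fmap, k=2):
    """Apply the activation element-wise, then mean pooling."""
    return mean_pool([[leaky_relu(v) for v in row] for row in fmap], k)
```

Pooling a 4 × 4 map with `k=2` yields a 2 × 2 map: the spatial size is halved while the map stays two-dimensional.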

The fully connected layer modeling is located at the end of the whole convolutional neural network modeling, which mainly integrates the outputs with features through the weight matrix, and classifies the defective situations in the feature image.

After the training samples are processed in the input layer, the feature map of the semiconductor material surface is obtained through the convolution operation. After activation and pooling, the fully connected layer integrates the features and produces the final defect recognition decision.

4.2

Experimental analysis

4.2.1

Experimental preparation

1)

Experimental platform and environment

The model proposed in this paper is developed using the PyTorch framework. PyTorch is a data-flow based machine learning library that supports high-performance numerical computation on GPUs and CPUs and can be installed easily and flexibly on all types of systems. The experiments in this section are run on an Ubuntu system using an Nvidia Tesla P100 graphics card (GPU). Neural networks have many parameters, and computing them consumes a lot of memory; the performance of a CPU cannot meet these requirements. A GPU, by contrast, is specialized for pixel-level image operations and accelerates image processing. Therefore, the algorithm in this paper runs on a GPU.

2)

Experimental Function Design

The semiconductor material surface defect recognition experiment in this paper proceeds as follows. The dataset is obtained and labeled, the images and labels are augmented with data preprocessing algorithms, and the result is fed into the feature extraction algorithm. Normalization is then carried out until a better feature extraction model is obtained. Finally, the results are fed into the defect recognition network, which outputs the classification and localization results. The semiconductor material surface defect recognition algorithm in this paper has five main parts: input image, defect region detection, feature extraction, recognition, and output.

3)

Model Training

This paper trains the machine learning model with transfer learning and adjusts the hyperparameters for the semiconductor material surface defect problem. When training the model, positive examples in the training set have IoU > 0.70 and negative examples have IoU < 0.30. When training the recognition network, the model extracts 2500 RoIs, with positive examples having IoU > 0.30 and negative examples IoU < 0.01; the proportion of positive samples does not exceed 25%, and the learning rate is lr = 0.01. Bounding box regression uses the Smooth L1 loss function, computed as in Eq. (25). The classifier uses the SoftMax loss function, and target segmentation uses the Sigmoid function. During training of the multi-task network, the loss function is obtained by linear summation so that the network can be trained end to end.
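The IoU thresholds above determine which candidate boxes count as positive or negative examples. A minimal sketch of that labeling rule is given below, assuming boxes in `(x1, y1, x2, y2)` corner format and assuming that boxes falling between the two thresholds are ignored, a common convention that the paper does not confirm. The function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def label_box(box, gt_boxes, pos_thr=0.70, neg_thr=0.30):
    """Return 1 (positive), 0 (negative) or -1 (ignored, an assumption)
    according to the best IoU with any ground-truth box."""
    best = max((iou(box, g) for g in gt_boxes), default=0.0)
    if best > pos_thr:
        return 1
    if best < neg_thr:
        return 0
    return -1
```

The default thresholds correspond to the first pair quoted in the text (0.70 and 0.30); the second pair (0.30 and 0.01) can be passed as `pos_thr` and `neg_thr` for the recognition network.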

The bounding box regression layer outputs the translation and scaling parameters for 9 anchor boxes. L_reg is the sum of the smooth L_1 losses over the box parameters. The smooth L_1 function and the bounding box regression loss are given in Eqs. (25) and (26): (25) $\mathrm{smooth}_{L_1} (x) = \left\{\begin{array}{ll} 0.5 x^{2}, & \text{if } | x | < 1 \\ | x | - 0.5, & \text{otherwise} \end{array}\right.$ (26) $L_{reg} (p_{i}, p_{i}^{*}) = \sum_{i \in \{x, y, w, h\}} \mathrm{smooth}_{L_1} (p_{i}^{*} - p_{i})$
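Eqs. (25) and (26) can be written out as a short sketch. Representing the predicted and true box parameters as dictionaries keyed by `x`, `y`, `w`, `h` is an illustrative choice, not something the paper specifies.

```python
def smooth_l1(x):
    """Smooth L1 function of Eq. (25): quadratic near zero, linear elsewhere."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def bbox_reg_loss(pred, target):
    """Regression loss of Eq. (26): sum of smooth L1 over x, y, w, h."""
    return sum(smooth_l1(target[k] - pred[k]) for k in ("x", "y", "w", "h"))
```

The quadratic branch keeps gradients small for nearly correct boxes, while the linear branch limits the influence of large errors.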

As mentioned earlier, the target classification layer outputs, for each anchor box, the probability of belonging to the foreground or background. Here i is the index of the RoI, p_i and t_i are the predicted values for the i-th RoI, and $p_{i}^{*}$ and $t_{i}^{*}$ are the true values. N_cls and N_reg are classification and regression normalization parameters, respectively, and λ denotes the weight, which is adjusted during training. The target classification loss and the mask loss are: (27) $L_{cls} (p_{i}, p_{i}^{*}) = - \sum_{i = 0}^{k} p_{i}^{*} \log p_{i}$ (28) $L_{mask} (cls_{k}) = \mathrm{sigmoid} (cls_{k})$

The total loss function for the target detection task is: (29) $L = L_{cls} + λ L_{reg} + L_{mask}$
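The classification term of Eq. (27) and the weighted sum of Eq. (29) are sketched below. The softmax helper and the treatment of the mask term as an already computed scalar are assumptions made to keep the example self-contained; the value of λ is hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cls_loss(p, p_star):
    """Cross-entropy of Eq. (27): -sum of p*_i log p_i over the classes."""
    return -sum(t * math.log(q) for q, t in zip(p, p_star) if t > 0)

def total_loss(l_cls, l_reg, l_mask, lam=1.0):
    """Linear combination of Eq. (29)."""
    return l_cls + lam * l_reg + l_mask
```

With a one-hot true label, `cls_loss` reduces to the negative log-probability assigned to the correct class.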

In the testing phase, the model first screens 2500 candidate boxes with an NMS threshold of IoU > 0.7. In the recognition network, the score threshold is 0.01 and the NMS IoU threshold is 0.8. The optimizer is SGD with a momentum of 0.5 and a weight decay of 0.0003. A warm-up strategy is used in the experiments: the learning rate is gradually increased over the initial 200 iterations, starting at 0.38, and is decreased in the 19th and 36th cycles, for a total of 32 cycles of training.
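The optimizer settings can be illustrated with a small sketch of a linear warm-up schedule and a single SGD update with momentum and weight decay. Several details are assumptions rather than the paper's method: the warm-up is taken to be linear, `start_factor` is a hypothetical parameter, the base learning rate of 0.01 is borrowed from the training description, and the update rule follows one widely used formulation in which weight decay is added to the gradient before the momentum step.

```python
def warmup_lr(iteration, base_lr=0.01, warmup_iters=200, start_factor=0.1):
    """Linearly ramp the learning rate from start_factor * base_lr to base_lr."""
    if iteration >= warmup_iters:
        return base_lr
    alpha = iteration / warmup_iters
    return base_lr * (start_factor * (1 - alpha) + alpha)

def sgd_step(w, grad, velocity, lr, momentum=0.5, weight_decay=0.0003):
    """One SGD update for a scalar weight; returns (new_w, new_velocity)."""
    g = grad + weight_decay * w          # L2 weight decay folded into the gradient
    velocity = momentum * velocity + g   # momentum accumulation
    return w - lr * velocity, velocity
```

Starting from a small learning rate avoids large, unstable updates while the randomly initialized layers on top of the pretrained backbone are still far from a good solution.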

4.2.2

Experimental data

The ability of the algorithm to recognize defects is examined using a sample library of defects on the surface of semiconductor materials from the industrial dataset WM-811K collected in the field, which contains images from actual semiconductor production lines. The WM-811K dataset consists of normal patterns and eight defect patterns. Figure 3 shows the nine types of patterns: center, circular, edge localized, edge ring, localized, nearly full, random, scratch, and normal (no defect). The sample library has a significant class imbalance, which makes it challenging to identify defective patterns correctly. Therefore, the study employs a transfer learning method to expand the diversity and richness of the data, and resizes each semiconductor material image to 124 × 124 pixels during preprocessing. In this experiment, the images are randomly divided into five parts for five-fold cross-validation; in each validation round, 7038 surface images of semiconductor materials are randomly selected as the training set and the remaining 1349 images are used as the validation set.
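A generic five-fold split can be sketched as follows. This is only an illustration of the cross-validation idea: the function name and seed are hypothetical, and equal-sized folds will not reproduce the exact 7038/1349 split quoted above.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Yield (train_indices, val_indices) for each of k shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]        # k nearly equal parts
    for i in range(k):
        val = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, val
```

Each sample appears in exactly one validation fold, so every image is used for validation once and for training in the remaining rounds.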

4.2.3

Analysis of results

1)

Recognition performance analysis

Table 1 shows the confusion matrix of the model’s recognition of semiconductor material surface defects. The model’s overall recognition accuracy on the test dataset reaches 94.53%. The per-class recognition rates show that, except for Random-type defects, the model recognizes every type of semiconductor material surface defect effectively, with a recognition accuracy of at least 94.59%. In addition, transfer learning effectively learns the data features of semiconductor material surface images and yields very good recognition results.

Table 1.

Confusion matrix of model defect recognition rate (%)

Actual \ Predicted	Center	Torus	Marginal local	Edge ring	Local	Nearly full	Random	Scratches
Center	97.69	0.00	0.00	0.57	0.00	0.68	1.06	0.00
Torus	1.48	95.63	0.64	0.00	0.32	0.00	0.29	1.64
Marginal local	0.00	2.81	96.94	0.00	0.03	0.09	0.11	0.02
Edge ring	1.06	0.81	0.98	95.79	0.00	0.37	0.83	0.16
Local	0.09	0.13	0.42	0.00	96.82	1.38	1.04	0.12
Nearly full	0.46	0.74	0.98	0.33	1.47	95.07	0.43	0.52
Random	3.92	0.00	4.08	0.00	4.73	3.53	83.74	0.00
Scratches	1.32	0.00	0.87	0.00	1.43	0.00	1.77	94.59

The reasons for misidentification are analyzed further for the Random class of surface defects. As Table 1 shows, 3.92%, 4.08%, 4.73%, and 3.53% of Random defects are misidentified as the Center, Marginal local, Local, and Nearly full categories, respectively. Fig. 4 shows four Random defect maps that were misidentified as other classes. In each case, the map exhibits the characteristics of a standard Random defect together with those of the Center, Marginal local, Local, or Nearly full class, in that order, which explains the misidentification.
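The figures quoted from Table 1 can be reproduced with a short script. The reported overall accuracy of 94.53% appears to coincide with the unweighted mean of the diagonal entries; treating it that way is an observation about the table, not a definition given in the paper. The helper names are hypothetical.

```python
CLASSES = ["Center", "Torus", "Marginal local", "Edge ring",
           "Local", "Nearly full", "Random", "Scratches"]

# Rows are actual classes, columns are predicted classes (values in %), from Table 1.
TABLE_1 = [
    [97.69, 0.00, 0.00, 0.57, 0.00, 0.68, 1.06, 0.00],
    [1.48, 95.63, 0.64, 0.00, 0.32, 0.00, 0.29, 1.64],
    [0.00, 2.81, 96.94, 0.00, 0.03, 0.09, 0.11, 0.02],
    [1.06, 0.81, 0.98, 95.79, 0.00, 0.37, 0.83, 0.16],
    [0.09, 0.13, 0.42, 0.00, 96.82, 1.38, 1.04, 0.12],
    [0.46, 0.74, 0.98, 0.33, 1.47, 95.07, 0.43, 0.52],
    [3.92, 0.00, 4.08, 0.00, 4.73, 3.53, 83.74, 0.00],
    [1.32, 0.00, 0.87, 0.00, 1.43, 0.00, 1.77, 94.59],
]

def mean_class_rate(cm):
    """Unweighted mean of the per-class recognition rates (the diagonal)."""
    return sum(cm[i][i] for i in range(len(cm))) / len(cm)

def top_confusions(cm, actual, k=4):
    """The k predicted classes most often confused with the given actual class."""
    row = [(CLASSES[j], v) for j, v in enumerate(cm[actual]) if j != actual]
    return sorted(row, key=lambda item: item[1], reverse=True)[:k]
```

Calling `top_confusions(TABLE_1, CLASSES.index("Random"))` lists Local, Marginal local, Center, and Nearly full, the four classes discussed above.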

2)

Performance Comparison

In order to verify the effectiveness and advancement of the proposed method, several classical and latest semiconductor material surface defect pattern recognition methods are taken as the comparison objects of this paper’s model in the experiment. Among them, the maximum depth of the decision tree C4.5 is set to 50, and the number of nodes is 200. The support vector machine adopts linear kernel function and Gaussian kernel function respectively, and the penalty factor is set to C=1.5. The random forest has a maximum depth of 100 and 1000 trees. The KNN classifier’s K value was adjusted to 3.

Table 2 compares the recognition performance of this paper’s model with that of the other models on the various surface defects of semiconductor materials. Here R_acc is the accuracy rate, R_rec is the recall rate, i.e., the proportion of positive samples that are correctly predicted among all positive samples, and F is the harmonic mean of the two, which indicates the stability of the model’s recognition performance: F = 2R_acc R_rec/(R_acc + R_rec). The table shows that this paper’s model achieves the best performance on six defect patterns, namely Center, Torus, Marginal local, Edge ring, Local, and Random. For the other two defects, Nearly full and Scratches, its F-values reach 94.08 and 88.40, while the highest F-values among all algorithms for these two defects are 95.57 and 88.58, respectively, so the model performs close to the optimum. The characteristics of Scratches defects are very obvious, and they can be effectively recognized using all recognition algorithms.
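The F-value defined above is the harmonic mean of R_acc and R_rec. A minimal helper, with a hypothetical name and a guard for the degenerate case where both rates are zero, could look like this:

```python
def f_value(r_acc, r_rec):
    """Harmonic mean F = 2 * R_acc * R_rec / (R_acc + R_rec)."""
    if r_acc + r_rec == 0:
        return 0.0
    return 2 * r_acc * r_rec / (r_acc + r_rec)
```

Because the harmonic mean is dominated by the smaller of the two rates, a model with `r_acc = 50` and `r_rec = 100` scores about 66.7 rather than the arithmetic mean of 75.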

Table 2.

Performance comparison of models (%)

Models	Defect category	R_acc	R_rec	F
Decision tree algorithm	Center	72.26	34.39	46.85
	Torus	59.27	43.23	49.91
	Marginal local	63.72	79.83	70.91
	Edge ring	93.49	87.61	90.59
	Local	57.33	48.73	52.66
	Nearly full	93.35	93.27	93.49
	Random	90.52	92.84	91.77
	Scratches	85.30	92.28	88.58
SVM	Center	84.75	77.15	80.85
	Torus	47.36	80.85	59.62
	Marginal local	87.55	81.54	84.34
	Edge ring	94.30	86.75	90.26
	Local	81.64	68.71	74.54
	Nearly full	74.92	74.96	75.09
	Random	89.92	99.35	94.4
	Scratches	88.00	73.01	80.09
Random forest	Center	76.4	85.46	80.73
	Torus	95.26	64.63	77.11
	Marginal local	79.76	91.32	85.00
	Edge ring	96.13	81.47	88.06
	Local	82.01	66.92	73.59
	Nearly full	91.69	72.04	95.57
	Random	95.1	87.41	97.31
	Scratches	86.25	65.54	74.49
Ours	Center	97.53	97.51	97.41
	Torus	99.93	94.52	97.32
	Marginal local	95.57	96.55	95.90
	Edge ring	98.68	95.26	97.01
	Local	95.86	96.23	96.02
	Nearly full	94.15	94.17	94.08
	Random	98.87	99.62	99.30
	Scratches	89.64	87.24	88.40

In order to avoid the influence of random factors on the experimental results and to verify the stability of the performance of the proposed model, five-fold cross-validation is performed. The average recognition rates of different recognizers based on the five-fold cross-validation are shown in Table 3. The recognition effect of this paper is exceptional, with a recognition rate of 96.82%, which is significantly better than any other recognizer. It shows that the performance of the model in this paper is significantly superior in the task of recognizing surface defects in semiconductor materials.

Table 3.

Comparison of five-fold cross validation of various algorithms

Models	R_acc
Decision tree algorithm	74.36
SVM	82.47
Random forest	85.39
Ours	96.82

5

Conclusion

In this paper, the Canny operator is used to detect the defective areas on the surface of semiconductor materials, and the geometric, gray-scale, and texture features of the defects are extracted from the detected areas. A machine learning model is constructed to recognize the defects, and a dataset from an actual semiconductor production line is used to evaluate its recognition performance. The model achieves an overall recognition accuracy of 94.53% on the test dataset. Except for Random defects, the model effectively identifies the other types of semiconductor material surface defects, such as Center and Edge localization, with a recognition accuracy of at least 94.59%. The F-values of the model on the six defect patterns Center, Torus, Marginal local, Edge ring, Local, and Random are 97.41%, 97.32%, 95.90%, 97.01%, 96.02%, and 99.30%, respectively, which are higher than those of the decision tree, SVM, and other defect recognition models used for comparison. The optimal F-values for the two defect patterns Nearly full and Scratches are obtained by the random forest algorithm and the decision tree algorithm, respectively. The F-values of this paper’s model on these two defect patterns are 94.08% and 88.40%, which are close to the optimal F-values. In the five-fold cross-validation, the model achieves the highest recognition accuracy, 96.82%, which indicates that it can effectively recognize surface defects in semiconductor materials.

Funding:

This research was supported by the 2023 Key research and development plan of Sichuan Province: Research on Blind Source Separation Technology for unmanned situational awareness Platform (No.: 2023YFG0331).


Huan Li

Published online: 17 Mar 2025

Received: 08 Oct 2024

Accepted: 04 Feb 2025

DOI: https://doi.org/10.2478/amns-2025-0271

Keywords: Machine learning, Canny operator, TensorFlow framework, Five-fold cross-validation, Defect recognition

© 2025 Huan Li, published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
