Open access

Automatic Identification of Surface Defects in Semiconductor Materials Based on Machine Learning

  
17 Mar 2025

Introduction

Semiconductor processing developed rapidly in the 1990s, enabling another leap in the capacity for information storage, information processing, and motion control per unit volume of matter. Three-dimensional microfabrication design and processing technology, an important part of this development, requires fabricating miniature systems with high depth-to-width ratios on semiconductor materials [1-2]. Processing three-dimensional structures also places new demands on the internal properties of semiconductor materials: the internal consistency of the material becomes an important guarantee that microdevices function as intended [3-4].

Semiconductor components are essential to the manufacture of precision electronic devices and are usually made of semiconductor materials (e.g., silicon) [5]. Their main role is to transmit, amplify, and control electrical signals in electronic circuits so that electronic devices function properly, and they are widely used in integrated circuits, transistors, diodes, and other key electronic components [6-7]. Driven by the working principle and material properties of these components, the field is moving towards smaller size, higher performance, and higher integration. However, as component size decreases and the degree of integration increases, the manufacturing process becomes more complex and surface defects become harder to detect. Smaller features make small, complex surface defects more likely to appear, and these defects may directly affect the electrical performance, stability, and reliability of the components [8-10]. Research on detecting surface defects in dense samples of semiconductor components has therefore become particularly critical.

Detecting surface defects is crucial for improving the overall quality and reliability of semiconductor components [11]. First, by detecting surface defects early and effectively, manufacturers can take appropriate improvement and repair measures to enhance the performance and long-term reliability of the devices and so meet the demand for high-quality devices in modern electronics. Second, the process helps to reduce production costs [12-13]: early detection and treatment of surface defects avoids more serious quality problems later in production and lowers the scrap rate, which in turn reduces production costs and improves production efficiency. Surface quality inspection of semiconductor components has therefore become an indispensable part of the automated production process. It helps to ensure the high quality and reliability of the devices and effectively reduces the performance problems that may occur in subsequent production and use [14-16].

Traditional defect detection methods rely on manual labour: workers identify and mark surface defects by visual inspection or with simple tools [17]. This approach consumes considerable human resources, is inefficient, and depends on subjective judgement. In modern manufacturing environments with high throughput and high precision, manual inspection is clearly not sufficient, so machine vision-based surface defect detection has become an indispensable alternative. Image-processing-based surface defect detection uses computer vision and image processing techniques to analyse the image and thereby detect surface defects. It solves the problems of manual detection to a certain extent and improves the automation and accuracy of detection [18-20].

In this paper, we study and compare several widely used edge detection algorithms and select the Canny algorithm to determine the defective regions on the surface of semiconductor materials. Noise is removed from the detected defect edges using a morphological filtering algorithm to improve the smoothness of detection. The high-dimensional feature space of the defective region is transformed to a low-dimensional feature space by transformation or mapping, and the geometric, grayscale, and texture features of the semiconductor material surface are extracted based on the different shapes, sizes, gray levels, and texture information exhibited by the defects. Based on the TensorFlow machine learning framework, a semiconductor material surface defect recognition model is established; features are extracted through network self-training, and the convolution operation is used to improve the computational efficiency of the neural network model. The model is trained with transfer learning, and the SGD optimization algorithm and a warm-up strategy are used to further improve its performance on semiconductor material surface defect recognition. The industrial dataset WM-811K, collected on-site from an actual semiconductor production line, is used to simulate and analyze the model's performance in recognizing eight types of defects on the surface of semiconductor materials. The recognition results are compared with those of the decision tree, SVM, and random forest algorithms, which highlights the good performance of the proposed machine learning model in semiconductor material surface defect recognition.

Defect area detection

Automatic defect recognition is concerned with the defective region. To improve recognition efficiency, the defective region in the semiconductor material should first be detected and segmented, and then characterized. Classical segmentation based on edge detection uses the abrupt change in gray value at the boundary between the target and background regions as the basis for segmenting the target region. During imaging, light is scattered in the defective region, so its gray value is lower than that of the surrounding normal region. The image therefore always shows an abrupt change in gray value at the border between the defective region and the surrounding normal region, and an edge detection operator based on such gray-value changes can be used for feature extraction.

A comparison of several widely used edge detection operators shows the following. The first-order gradient operators Prewitt and Sobel have low computational complexity, but their detection accuracy is also lower, and they cannot detect the edges of some finer crack defects. The Canny and LoG operators detect the edges of finer defects with higher sensitivity, and the Canny operator is the more stable of the two and less susceptible to noise interference. On balance, the Canny operator is therefore used to determine the surface defects of semiconductor materials [21]. Figure 1 shows the detection results: Figure 1(a) is the output of the Canny operator and Figure 1(b) is the result after denoising. The Canny operator detects the defect edges completely, and the small number of remaining noise points can be removed by a morphological filtering algorithm (an opening operation, i.e., erosion followed by dilation), after which the defect edges are shown accurately and clearly.

Figure 1.

Defect area test results
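
The denoising step described above (an opening, i.e. erosion followed by dilation) can be illustrated with a minimal pure-Python sketch on a binary image. The 3 × 3 square structuring element, the zero padding at the image border, and the function names are assumptions made for illustration; the paper does not state these details.

```python
def erode(img, k=3):
    """Binary erosion with a k x k square element; pixels outside the image count as 0."""
    h, w, r = len(img), len(img[0]), k // 2
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
             for x in range(w)] for y in range(h)]


def dilate(img, k=3):
    """Binary dilation with a k x k square element."""
    h, w, r = len(img), len(img[0]), k // 2
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
             for x in range(w)] for y in range(h)]


def opening(img, k=3):
    """Opening = erosion first, then dilation; removes specks smaller than the element."""
    return dilate(erode(img, k), k)


# Toy example: a 3 x 3 blob plus one isolated noise pixel in the corner.
image = [[0] * 7 for _ in range(7)]
for row in range(2, 5):
    for col in range(2, 5):
        image[row][col] = 1
image[0][6] = 1  # isolated noise point
cleaned = opening(image)
```

In this toy case the isolated pixel disappears while the 3 × 3 blob is restored by the dilation. Structures thinner than the structuring element are also removed, so the element size has to be chosen with the expected defect width in mind.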

Defective area feature extraction
Feature extraction

Once the defect contour has been detected with the Canny operator, the 2D image must be converted, before recognition, into quantitative information that a computer can process easily. This conversion of 2D image information into quantitative information is feature extraction.

The high-dimensional feature space of the defect region is transformed to the low-dimensional feature space by using transformation or mapping. By analyzing the defective regions on the surface of semiconductor materials, it is found that different defects on the surface of semiconductor materials have their own characteristics, which are manifested in the shape, size, gray scale and texture information of the defects. The geometric features, grayscale features, and texture features of defects on the surface of semiconductor materials are extracted respectively [22].

Geometric Feature Extraction

The geometry of the defective region is an important feature in defect classification and can be obtained by extracting the geometric features of the region. The possible defect types can then be analyzed using these geometric features, which mainly include the area of the region, its center of mass, and its moments.

Let $f(x,y)$ denote the grayscale image of the defective region, with $M$ rows and $N$ columns, where $(x,y)$ is the pixel at row $x$ and column $y$. The set of pixels in the defective region is denoted $S_{region}$, and the set of pixels on its boundary is denoted $S_{edge}$. The geometric features used in this paper are as follows.

Perimeter $P$

The total number of pixels on the boundary of the defective region characterizes its perimeter, which is a contour feature parameter:

$$P = \sum_{(x,y) \in S_{edge}} 1$$

Area $A$

The area of the region is characterized by the total number of pixels it contains:

$$A = \sum_{(x,y) \in S_{region}} 1$$

Center of mass coordinates $(x_c, y_c)$

The position of the defective region in the image is described by the coordinates of its center of mass. The possible defect types can be narrowed down based on the location of the defective region. The center of mass coordinates can be expressed as:

$$x_c = \frac{1}{A} \sum_{(x,y) \in S_{region}} x, \qquad y_c = \frac{1}{A} \sum_{(x,y) \in S_{region}} y$$

Degree of rectangularity $R_{rect}$

The degree of rectangularity $R_{rect}$ is the ratio of the short side $W$ of the minimum bounding rectangle of the defective region to its long side $L$. Usually $R_{rect} < 1$; when the minimum bounding rectangle is a square, $R_{rect} = 1$:

$$R_{rect} = \frac{W}{L}$$

Duty cycle $R_D$

The duty cycle $R_D$ is defined as the ratio of the defect area $A$ to the area of the minimum bounding rectangle of the defective region. For "line" and "striation" defects on the surface of semiconductor materials, the defect area accounts for a large proportion of the minimum bounding rectangle, whereas for irregular defects such as "scratches" and "jam marks" it accounts for a small proportion:

$$R_D = \frac{A}{W \times L}$$

Eccentricity $R_{ecc}$

Eccentricity indicates the compactness of the defective region and is defined as the ratio of the longest chord $L_1$ of the defective region to the longest chord $L_2$ perpendicular to $L_1$:

$$R_{ecc} = \frac{L_1}{L_2}$$

Circularity $R_{circ}$

Among geometric shapes with the same perimeter, the circle has the largest area. Circularity $R_{circ}$ is defined as the ratio of the area of the defective region to the area of a circle (the most compact shape) with the same perimeter, and is thus a compactness descriptor. Typically $R_{circ} < 1$, and $R_{circ} = 1$ for a circular defective region. Circularity is a dimensionless measure:

$$R_{circ} = \frac{4 \pi A}{P^2}$$
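
As a rough illustration of the pixel-counting definitions above, the following sketch computes the perimeter, area, center of mass, rectangularity, duty cycle, and circularity from a binary mask. Several choices here are assumptions rather than details from the paper: a boundary pixel is taken to be a region pixel with at least one 4-neighbour outside the region, and an axis-aligned bounding box stands in for the minimum bounding rectangle. The function name and dictionary keys are hypothetical, and the mask is assumed to contain at least one region pixel.

```python
import math


def geometric_features(mask):
    """Geometric descriptors of a non-empty binary mask (list of rows, 1 = defect pixel)."""
    h, w = len(mask), len(mask[0])

    def inside(x, y):
        return 0 <= x < h and 0 <= y < w and mask[x][y]

    region = [(x, y) for x in range(h) for y in range(w) if mask[x][y]]
    # Assumed boundary rule: a region pixel with a 4-neighbour outside the region.
    edge = [(x, y) for x, y in region
            if not all(inside(x + dx, y + dy)
                       for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)))]

    area = len(region)
    perimeter = len(edge)
    xc = sum(x for x, _ in region) / area
    yc = sum(y for _, y in region) / area

    # Axis-aligned bounding box used as a simplification of the minimum rectangle.
    rows = [x for x, _ in region]
    cols = [y for _, y in region]
    side_a = max(rows) - min(rows) + 1
    side_b = max(cols) - min(cols) + 1
    short, long_ = min(side_a, side_b), max(side_a, side_b)

    return {
        "P": perimeter,
        "A": area,
        "centroid": (xc, yc),
        "R_rect": short / long_,
        "R_D": area / (short * long_),
        "R_circ": 4 * math.pi * area / perimeter ** 2,
    }


# Toy example: a filled 4 x 3 rectangle inside a 6 x 8 image.
mask = [[0] * 8 for _ in range(6)]
for x in range(1, 5):
    for y in range(2, 5):
        mask[x][y] = 1
features = geometric_features(mask)
```

Because the perimeter is a count of boundary pixels rather than a true contour length, the circularity of very small regions can exceed 1 under this definition; the value is most meaningful for regions that span many pixels.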

Hu invariant moments

Hu invariant moments are invariant to translation, rotation, and scale. For a grayscale image $f(x,y)$ of the defective region with $M$ rows and $N$ columns, the 2D moments of order $(p+q)$ are defined as:

$$\omega_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} x^p y^q f(x,y) \tag{8}$$

where $p = 0,1,2,\cdots$ and $q = 0,1,2,\cdots$ are integers. The image center of mass can be expressed as:

$$x_c = \frac{\omega_{10}}{\omega_{00}}, \qquad y_c = \frac{\omega_{01}}{\omega_{00}} \tag{9}$$

From Eqs. (8) and (9), the corresponding central moments of order $(p+q)$ and the normalized central moments can be expressed as:

$$u_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} (x - x_c)^p (y - y_c)^q f(x,y)$$

$$\eta_{pq} = \frac{u_{pq}}{u_{00}^{\alpha}}$$

where $\alpha = \frac{p+q}{2} + 1$ and $p + q = 2,3,\cdots$.

Seven invariant moments can be derived from the second-order and third-order moments. Because the higher-order Hu invariant moments are easily affected by external factors such as noise in pattern recognition, only the first four Hu invariant moments are extracted in this section. They are computed as follows:

$$\lambda_1 = \eta_{20} + \eta_{02}$$

$$\lambda_2 = (\eta_{20} - \eta_{02})^2 + 4 \eta_{11}^2$$

$$\lambda_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$$

$$\lambda_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$$
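
A small sketch of the first four Hu invariant moments, following the standard definitions of raw, central, and normalized central moments, may help to make the computation concrete. The image is represented as a list of rows, the function names are hypothetical, and the image is assumed to have a non-zero total intensity.

```python
def raw_moment(f, p, q):
    """Raw moment of order (p + q): sum of x^p * y^q * f(x, y), x = row, y = column."""
    return sum((x ** p) * (y ** q) * f[x][y]
               for x in range(len(f)) for y in range(len(f[0])))


def hu_first_four(f):
    """First four Hu invariant moments of a grayscale image given as a list of rows."""
    w00 = raw_moment(f, 0, 0)
    xc = raw_moment(f, 1, 0) / w00
    yc = raw_moment(f, 0, 1) / w00

    def eta(p, q):
        # Central moment, normalized by u_00 ** ((p + q) / 2 + 1); u_00 equals w00.
        u = sum(((x - xc) ** p) * ((y - yc) ** q) * f[x][y]
                for x in range(len(f)) for y in range(len(f[0])))
        return u / w00 ** ((p + q) / 2 + 1)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)

    l1 = n20 + n02
    l2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    l3 = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2
    l4 = (n30 + n12) ** 2 + (n21 + n03) ** 2
    return l1, l2, l3, l4
```

Shifting the same blob to another position in the image, or transposing the image, should leave the four values unchanged up to floating-point error, which is a convenient sanity check for an implementation.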

Gray-scale feature extraction

The grayscale features of the defective regions on the surface of semiconductor materials are mainly the average gray value, the gray-level variance or standard deviation, and the gray-level entropy. Since the histogram of an image is easy to obtain, it is used to compute the grayscale statistics. Let $h$ be the histogram of the grayscale image, with gray values in the range $[0, L-1]$, and let $z$ be a random variable representing the gray level. The probability of occurrence of gray level $z_i$ is $p(z_i) = h(z_i)/K$, $i = 0,1,2,\cdots,L-1$, where $K$ is the total number of image pixels. Three grayscale features of the defective region are extracted.

Average gray level $\mu$

$$\mu = \sum_{i=0}^{L-1} z_i \, p(z_i)$$

Gray-level standard deviation $\sigma(z)$

The standard deviation is more intuitive than the variance, so it is chosen as a measure of the image grayscale features. It is obtained as the square root of the variance:

$$\sigma(z) = \sqrt{\sum_{i=0}^{L-1} (z_i - \mu)^2 \, p(z_i)}$$

Gray-level entropy $E_{ent}$

$$E_{ent} = -\sum_{i=0}^{L-1} p(z_i) \log_2 p(z_i)$$
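
The three histogram statistics above can be sketched in a few lines. The input is assumed to be a flat list of integer gray values from the defective region; gray levels that do not occur contribute nothing to any of the sums, so only observed levels are iterated. The function name is hypothetical.

```python
import math
from collections import Counter


def gray_features(pixels):
    """Mean, standard deviation and entropy of the gray-level histogram."""
    total = len(pixels)                       # K, the number of pixels
    hist = Counter(pixels)                    # h(z_i) for every observed level
    prob = {z: c / total for z, c in hist.items()}

    mu = sum(z * p for z, p in prob.items())
    sigma = math.sqrt(sum((z - mu) ** 2 * p for z, p in prob.items()))
    entropy = -sum(p * math.log2(p) for p in prob.values())
    return mu, sigma, entropy


# A region split evenly between black and white has one bit of entropy.
mu, sigma, entropy = gray_features([0, 0, 255, 255])
```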

Texture Feature Extraction

One of the simplest ways to describe texture is to use the gray-level histogram statistical moments of an image or region. The texture measures computed using only the histogram do not carry information about the relative positions of the pixels with respect to each other, which is critical when characterizing texture. Therefore, both the distribution of gray levels and the relative positions of pixels in the image are considered in texture analysis.

The gray-level co-occurrence matrix considers both the gray-level distribution and the relative positions of the pixels. Let the number of gray levels of image $f$ be $L$, let $Q$ be an operator defining the relative position of two pixels, and let $G$ be the co-occurrence matrix of size $L \times L$, whose element $g_{i,j}$ is the number of times that pixel pairs with gray levels $i-1$ and $j-1$ satisfying operator $Q$ appear in the image. Here $i,j \in [1,L]$ index the rows and columns of the matrix, respectively. The matrix $G$ is normalized as follows to obtain the normalized matrix $P_Q$, where $n$ is the total number of pixel pairs satisfying $Q$, so that the entries sum to 1:

$$p_{i,j} = \frac{g_{i,j}}{n}, \qquad \sum_{i=1}^{L} \sum_{j=1}^{L} p_{i,j} = 1$$

$$P_Q = \begin{bmatrix} p_{1,1} & \cdots & p_{1,j} & \cdots & p_{1,L} \\ \vdots & & \vdots & & \vdots \\ p_{i,1} & \cdots & p_{i,j} & \cdots & p_{i,L} \\ \vdots & & \vdots & & \vdots \\ p_{L,1} & \cdots & p_{L,j} & \cdots & p_{L,L} \end{bmatrix}$$

The position operator $Q$ can be defined in a variety of ways, and four common operators have been proposed in related studies. $Q_{\phi,d}$ describes two pixels separated by a distance $d$ in direction $\phi$, where $\phi$ takes the four values 0°, 45°, 90°, and 135°. This yields four gray-level co-occurrence matrices. The texture features are computed from each of the four matrices and then averaged to give the final texture features of the defective region.

By analyzing the elements of $G$, the texture patterns present in the image can be detected. In this paper, six descriptors are selected as the texture features of defective regions: correlation (Cor), contrast (Con), energy (Ene), homogeneity (Hom), inverse difference moment (Idm), and entropy (Ent).
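
The construction of the normalized co-occurrence matrix and the averaging over the four directions can be sketched as follows. The paper does not give formulas for its descriptors, so the four computed here (contrast, energy, homogeneity, entropy) use common textbook definitions and should be read as assumptions; correlation and inverse difference moment are omitted for brevity. Pixel pairs are counted in one direction only (a non-symmetric matrix), the image is assumed to hold integer gray levels in `0..levels-1`, and it must be large enough to contain at least one pair per direction.

```python
import math

# (row offset, column offset) of the second pixel for each direction at d = 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(img, levels, angle, d=1):
    """Normalized gray-level co-occurrence matrix for one direction and distance."""
    dr, dc = OFFSETS[angle][0] * d, OFFSETS[angle][1] * d
    h, w = len(img), len(img[0])
    g = [[0] * levels for _ in range(levels)]
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < h and 0 <= c2 < w:
                g[img[r][c]][img[r2][c2]] += 1
    n = sum(map(sum, g))                      # total number of pixel pairs
    return [[v / n for v in row] for row in g]


def texture_features(img, levels, d=1):
    """Contrast, energy, homogeneity and entropy, averaged over the four directions."""
    feats = {"Con": 0.0, "Ene": 0.0, "Hom": 0.0, "Ent": 0.0}
    for angle in OFFSETS:
        p = glcm(img, levels, angle, d)
        for i in range(levels):
            for j in range(levels):
                v = p[i][j]
                feats["Con"] += (i - j) ** 2 * v
                feats["Ene"] += v * v
                feats["Hom"] += v / (1 + abs(i - j))
                if v > 0:
                    feats["Ent"] -= v * math.log2(v)
    return {name: total / len(OFFSETS) for name, total in feats.items()}
```

For a perfectly uniform region every pixel pair has identical gray levels, so contrast and entropy are 0 while energy and homogeneity are 1; textured regions move away from these extremes.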

Standardization of feature parameters

The geometric, grayscale, and texture features extracted directly from the grayscale image of the defective region have different orders of magnitude, which is detrimental to feature selection and defect classification, so the raw data must be standardized. The Auto Scaling method, i.e. standard deviation standardization, is used. After standardization, each feature has a mean of 0 and a standard deviation of 1. The standardization formula is:

$$x_{ij}^{*} = \frac{x_{ij} - m_j}{s_j}, \quad i = 1,2,\cdots,n, \quad j = 1,2,\cdots,d$$

$$m_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij}, \qquad s_j = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - m_j)^2}$$

where $n$ denotes the number of samples, $d$ denotes the number of feature dimensions, $m_j$ denotes the sample mean of feature $j$, and $s_j$ denotes the sample standard deviation of feature $j$.
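
A minimal sketch of this column-wise standardization is given below, with one row per sample and one column per feature. It uses the sample standard deviation (divisor n − 1), as in the formula, and assumes that no feature is constant across the samples, since that would give a zero divisor. The function name is hypothetical.

```python
import statistics


def auto_scale(samples):
    """Standardize each feature (column) to zero mean and unit sample standard deviation."""
    columns = list(zip(*samples))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]   # divisor n - 1
    return [[(value - m) / s for value, m, s in zip(row, means, stds)]
            for row in samples]


# Two features on very different scales end up directly comparable.
scaled = auto_scale([[1, 10], [2, 20], [3, 30]])
```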

Experimental analysis

Five scratch (Sc) and five inclusion (In) defects were used for feature extraction experiments. The results are shown in Fig. 2, in which Fig. 2(a) shows the extracted geometric feature parameters, Fig. 2(b) the grayscale feature parameters, and Fig. 2(c) the texture feature parameters. The figure shows that the feature extraction algorithm used in this paper can effectively extract the features of defects on the surface of the semiconductor material. Because the feature parameters are standardized, each feature has a mean of 0 and a standard deviation of 1. At the same time, the values of the same feature differ substantially between defect types. For example, in Fig. 2(a) the standardized perimeter P of the target region is 0.26 for scratch 1 and 1.39 for inclusion 1, a significant difference. These differences between defect types on the same feature can therefore be used to identify and classify defect types.

Figure 2.

Defect feature extraction

Machine learning-based defect recognition
Modeling

Based on the TensorFlow machine learning framework, a model for recognizing semiconductor material surface defects is established [23]. Features are selected through network self-training, and the convolution operation is used to improve the computational efficiency of the neural network model and the performance of model training. Weight sharing not only speeds up training but also strengthens the model's defect recognition capability. Convolutional neural network modeling mainly comprises the input layer, the convolutional layer, the activation function, the pooling layer, and the fully connected layer.

The main role of the convolution operation is to perform feature extraction and obtain the feature map. The convolution is modeled as:

$$W_2 = \frac{W_1 - F + 2P}{S} + 1$$

where $W_2$ is the output feature map size, $W_1$ is the input feature map size, $F$ is the convolution kernel size, $S$ is the stride, and $P$ is the padding size (the number of fill pixels).
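
As a quick worked example of this size formula, the helper below evaluates it for a 124-pixel-wide input, the image size used later in the experiments. The kernel sizes, strides, and paddings in the example are illustrative values, not the paper's configuration, and rounding down when the division is not exact is an assumed (though common) convention.

```python
def conv_output_size(w1, f, s=1, p=0):
    """Output width W2 = (W1 - F + 2P) / S + 1, rounded down if not exact."""
    return (w1 - f + 2 * p) // s + 1


same_size = conv_output_size(124, f=3, s=1, p=1)   # 3 x 3 kernel, padding 1
shrunk = conv_output_size(124, f=5, s=1, p=0)      # 5 x 5 kernel, no padding
halved = conv_output_size(124, f=2, s=2, p=0)      # 2 x 2 kernel, stride 2
```

With these illustrative settings the width stays at 124, drops to 120, and halves to 62, respectively.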

The role of the activation function is mainly to apply a nonlinearity to the linear features produced by the convolution operation, bringing them closer to the actual feature model. The LeakyReLU function is chosen as the activation function because it converges rapidly and effectively avoids neuron inactivation. It is defined as:

$$f(x) = \max(0.1x, x) = \begin{cases} 0.1x, & x < 0 \\ x, & x \ge 0 \end{cases}$$

where $x$ is the output of the preceding layer and $f(x)$ is the input to the next layer.

Pooling follows convolution and activation and mainly serves to eliminate redundant features. Pooling compresses the data, changing the size of the feature matrix without changing its dimension; that is, the results of the preceding layer are downsampled. Mean pooling is adopted in this model.
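
The activation and pooling steps can be sketched on plain nested lists. The slope of 0.1 comes from the formula above; the 2 × 2 pooling window with a stride equal to the window size is an assumption for illustration, since the paper does not state the pooling configuration. Function names are hypothetical.

```python
def leaky_relu(x, slope=0.1):
    """LeakyReLU: x for x >= 0, slope * x for x < 0."""
    return max(slope * x, x)


def mean_pool(fmap, k=2):
    """Non-overlapping k x k mean pooling of a 2D feature map (list of rows)."""
    out_h, out_w = len(fmap) // k, len(fmap[0]) // k
    return [[sum(fmap[i * k + a][j * k + b] for a in range(k) for b in range(k)) / (k * k)
             for j in range(out_w)]
            for i in range(out_h)]


feature_map = [[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12],
               [13, 14, 15, 16]]
activated = [[leaky_relu(v) for v in row] for row in feature_map]
pooled = mean_pool(activated)   # 4 x 4 map reduced to 2 x 2
```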

The fully connected layer modeling is located at the end of the whole convolutional neural network modeling, which mainly integrates the outputs with features through the weight matrix, and classifies the defective situations in the feature image.

In summary, after the training samples are processed in the input layer, the feature map of the semiconductor material surface is obtained through the convolution operation; after activation and pooling, the fully connected layer integrates the features and produces the final semiconductor material surface defect recognition decision.

Experimental analysis
Experimental preparation

Experimental platform and environment

The model proposed in this paper is developed using the PyTorch framework. PyTorch is a data-flow based machine learning library that supports high-performance numerical computation on GPUs and CPUs and can be installed easily and flexibly on all types of systems. The experiments in this section are run on an Ubuntu system using an Nvidia Tesla P100 graphics card (GPU). Neural networks have many parameters, and computing them consumes a large amount of memory; moreover, CPU performance cannot meet the requirements of neural networks. The GPU, by contrast, is specialized for complex pixel-level image operations and accelerates image processing. The algorithm in this paper has therefore been ported to run on a GPU.

Experimental Function Design

The semiconductor material surface defect recognition experiment in this paper starts by obtaining the dataset and labeling it. The images and labels are then augmented with data preprocessing algorithms and fed into the feature extraction algorithm, after which normalization is carried out until a better feature extraction model is obtained. The results are then fed into the defect recognition network, which outputs the final classification and localization results. The semiconductor material surface defect recognition algorithm in this paper has five main parts: input image, defect region detection, feature extraction, recognition, and output.

Model Training

This paper uses transfer learning to train the machine learning model and adjusts the hyperparameters to suit the semiconductor material surface defect problem. For model training, the training set uses positive examples with IoU > 0.70 and negative examples with IoU < 0.30. When training the recognition network, the model extracts 2500 RoIs, with positive examples at IoU > 0.30 and negative examples at IoU < 0.01; the proportion of positive samples is at most 25%, and the learning rate is lr = 0.01. The regression uses the Smooth L1 loss function, computed as in equation (25). The classifier uses the SoftMax loss function, and target segmentation uses the Sigmoid function. During training of the multi-task network, the loss function is obtained by linear summation in order to allow end-to-end training.
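
To make the IoU thresholds concrete, the sketch below computes the intersection over union of two axis-aligned boxes and labels a candidate box as positive or negative using the 0.70 and 0.30 thresholds quoted above. The `(x1, y1, x2, y2)` box format, the function names, and the treatment of in-between boxes as ignored (label -1) are assumptions; the paper does not say how such boxes are handled.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def label_candidate(box, ground_truth, pos_thr=0.70, neg_thr=0.30):
    """Return 1 (positive), 0 (negative) or -1 (assumed: ignored during training)."""
    overlap = iou(box, ground_truth)
    if overlap > pos_thr:
        return 1
    if overlap < neg_thr:
        return 0
    return -1


label = label_candidate((0, 0, 10, 10), (0, 0, 10, 9))   # IoU = 0.9, so positive
```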

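The output of the bounding box regression layer is 9 anchor boxes, which correspond to the translation and scaling parameters. $L_{reg}$ is the sum of the smooth L1 losses. The smooth L1 function (25) and the bounding box regression loss function (26) are:

$$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2, & \text{if } |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases} \tag{25}$$

$$L_{reg}(p_i, p_i^{*}) = \sum_{i \in \{x,y,w,h\}} \mathrm{smooth}_{L1}(p_i^{*} - p_i) \tag{26}$$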

As mentioned earlier, the output of the target classification layer for each anchor box is the probability of belonging to the foreground or background. Here $i$ is the index of the RoI, $p_i$ and $t_i$ are the predicted values for the $i$-th RoI, and $p_i^{*}$ and $t_i^{*}$ are the true values. $N_{cls}$ and $N_{reg}$ are the classification and regression normalization parameters, respectively, and $\lambda$ denotes the weight, which is adjusted during training. The target classification loss function (27) and the mask function (28) are:

$$L_{cls}(p_i, p_i^{*}) = -\sum_{i=0}^{k} p_i^{*} \log p_i \tag{27}$$

$$L_{mask}(cls_k) = \mathrm{sigmoid}(cls_k) \tag{28}$$

The total loss function for the target detection task is:

$$L = L_{cls} + \lambda L_{reg} + L_{mask} \tag{29}$$
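
The loss terms can be sketched with scalar Python functions. The smooth L1 and regression terms follow the formulas above; the classification term is written as a standard cross-entropy over a one-hot label, the mask term is simply passed in as a number, and the default weight of 1.0 is an illustrative assumption because the paper only says that the weight is adjusted during training. All function names are hypothetical.

```python
import math


def smooth_l1(x):
    """Smooth L1: quadratic near zero, linear for |x| >= 1."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5


def reg_loss(pred, target):
    """Sum of smooth L1 terms over the box parameters x, y, w, h."""
    return sum(smooth_l1(target[k] - pred[k]) for k in ("x", "y", "w", "h"))


def cls_loss(probs, one_hot):
    """Cross-entropy between predicted class probabilities and a one-hot label."""
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot) if t > 0)


def total_loss(l_cls, l_reg, l_mask, lam=1.0):
    """Linear combination L = L_cls + lam * L_reg + L_mask (lam is an assumed default)."""
    return l_cls + lam * l_reg + l_mask


predicted = {"x": 0.0, "y": 0.0, "w": 1.0, "h": 1.0}
truth = {"x": 0.5, "y": 0.0, "w": 3.0, "h": 1.0}
box_loss = reg_loss(predicted, truth)   # 0.125 (quadratic branch) + 1.5 (linear branch)
```

The quadratic branch keeps gradients small for nearly correct boxes, while the linear branch limits the influence of large errors compared with a pure squared loss.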

In the testing phase, the model first screens 2500 candidate boxes with an NMS threshold of IoU > 0.7. In the recognition network, the score threshold is 0.01 and the NMS IoU threshold is 0.8. The optimizer is SGD with a momentum of 0.5 and a weight decay of 0.0003. A warm-up strategy is used in the experiments: the learning rate is gradually increased over the initial 200 iterations, starting at 0.38, and is decreased in the 19th and 36th cycles, for a total of 32 cycles of training.
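
One common way to implement such a warm-up followed by step decay is sketched below. The base learning rate of 0.01, the 200 warm-up iterations, and the decay cycles 19 and 36 are taken from the text; the linear shape of the warm-up, the starting factor of 0.1, and the decay factor of 0.1 are illustrative assumptions, since the paper does not specify them. The function name is hypothetical.

```python
def learning_rate(iteration, epoch, base_lr=0.01, warmup_iters=200,
                  warmup_factor=0.1, milestones=(19, 36), gamma=0.1):
    """Learning rate with linear warm-up (by iteration) and step decay (by epoch)."""
    # Step decay: multiply by gamma once for every milestone already reached.
    lr = base_lr * gamma ** sum(epoch >= m for m in milestones)
    # Linear warm-up: ramp from warmup_factor * lr up to lr over the first iterations.
    if iteration < warmup_iters:
        alpha = iteration / warmup_iters
        lr *= warmup_factor * (1 - alpha) + alpha
    return lr


start = learning_rate(iteration=0, epoch=0)        # reduced rate at the very beginning
plateau = learning_rate(iteration=200, epoch=0)    # full base rate after warm-up
late = learning_rate(iteration=5000, epoch=19)     # after the first decay step
```

Starting with a small learning rate gives the randomly initialized layers time to settle before the full step size is applied.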

Experimental data

The ability of the algorithm to recognize defects is examined using a sample library of defects on the surface of semiconductor materials from the industrial dataset WM-811K, collected in the field, which contains images from actual semiconductor production lines. The WM-811K dataset consists of a normal pattern and eight defect patterns. Figure 3 shows the nine types of patterns: center, circular (Torus in the tables below), edge localized (Marginal local), edge ring, localized (Local), nearly full, random, scratch, and normal without defect. The sample library has a significant class imbalance, which makes it challenging to identify defect patterns correctly. The study therefore employs a transfer learning method to expand the diversity and richness of the data, and each semiconductor material image is resized to 124 × 124 pixels during preprocessing. In this experiment, the images are randomly divided into five parts for five-fold cross-validation; for each validation, 7038 surface images of semiconductor materials are randomly selected as the training set and the remaining 1349 images are used as the validation set.

Figure 3.

Normal semiconductor material surface and 8 defect modes

Analysis of results

Recognition performance analysis

Table 1 shows the confusion matrix of the model's recognition of semiconductor material surface defects. The model's overall recognition accuracy on the test dataset reaches 94.53%. The per-category recognition rates show that, apart from Random-type defects, the model effectively recognizes every category of semiconductor material surface defect, with a recognition accuracy of at least 94.59%. In addition, transfer learning effectively learns the data features of semiconductor material surface images and yields very good recognition results.

Confusion matrix of model defect recognition rate (%)

| Actual \ Predicted | Center | Torus | Marginal local | Edge ring | Local | Nearly full | Random | Scratches |
|---|---|---|---|---|---|---|---|---|
| Center | 97.69 | 0.00 | 0.00 | 0.57 | 0.00 | 0.68 | 1.06 | 0.00 |
| Torus | 1.48 | 95.63 | 0.64 | 0.00 | 0.32 | 0.00 | 0.29 | 1.64 |
| Marginal local | 0.00 | 2.81 | 96.94 | 0.00 | 0.03 | 0.09 | 0.11 | 0.02 |
| Edge ring | 1.06 | 0.81 | 0.98 | 95.79 | 0.00 | 0.37 | 0.83 | 0.16 |
| Local | 0.09 | 0.13 | 0.42 | 0.00 | 96.82 | 1.38 | 1.04 | 0.12 |
| Nearly full | 0.46 | 0.74 | 0.98 | 0.33 | 1.47 | 95.07 | 0.43 | 0.52 |
| Random | 3.92 | 0.00 | 4.08 | 0.00 | 4.73 | 3.53 | 83.74 | 0.00 |
| Scratches | 1.32 | 0.00 | 0.87 | 0.00 | 1.43 | 0.00 | 1.77 | 94.59 |

The reasons for misidentifying the Random class of semiconductor material surface defects are analyzed further. As Table 1 shows, 3.92%, 4.08%, 4.73%, and 3.53% of the Random surface defects are misidentified as the Center, Marginal local, Local, and Nearly full categories, respectively. Fig. 4 shows four maps of Random surface defects that were misidentified as other classes. These samples are misidentified because, alongside the standard random defects, they also exhibit the defect characteristics of Center, Marginal local, Local, and Nearly full, in that order.

Figure 4.

Error identification of Random class

Performance Comparison

To verify the effectiveness and competitiveness of the proposed method, several classical and recent semiconductor material surface defect pattern recognition methods are used as baselines for this paper's model in the experiment. The C4.5 decision tree has a maximum depth of 50 and 200 nodes. The support vector machine uses a linear kernel function and a Gaussian kernel function, respectively, with the penalty factor set to C = 1.5. The random forest has a maximum depth of 100 and 1000 trees. The K value of the KNN classifier is set to 3.

Table 2 compares the recognition performance of this paper's model with that of the other models on the various semiconductor material surface defects. Here $R_{acc}$ is the accuracy rate, $R_{rec}$ is the recall rate, which indicates the proportion of positive samples that are correctly predicted among all positive samples, and $F$ is the harmonic mean of the two, which indicates the stability of the model's recognition performance: $F = 2 R_{acc} R_{rec} / (R_{acc} + R_{rec})$. The table shows that this paper's model achieves the best performance on six defect patterns, namely Center, Torus, Marginal local, Edge ring, Local, and Random. For the other two defect types, Nearly full and Scratches, its F-values reach 94.08 and 88.40, while the highest F-values of all algorithms for these two types are 95.57 and 88.58, respectively, so the model performs close to the optimum. The characteristics of Scratches defects are very obvious, and they can be recognized effectively by all of the recognition algorithms.

Performance comparison of models (%)

| Model | Defect category | Racc | Rrec | F |
|---|---|---|---|---|
| Decision tree algorithm | Center | 72.26 | 34.39 | 46.85 |
| | Torus | 59.27 | 43.23 | 49.91 |
| | Marginal local | 63.72 | 79.83 | 70.91 |
| | Edge ring | 93.49 | 87.61 | 90.59 |
| | Local | 57.33 | 48.73 | 52.66 |
| | Nearly full | 93.35 | 93.27 | 93.49 |
| | Random | 90.52 | 92.84 | 91.77 |
| | Scratches | 85.30 | 92.28 | 88.58 |
| SVM | Center | 84.75 | 77.15 | 80.85 |
| | Torus | 47.36 | 80.85 | 59.62 |
| | Marginal local | 87.55 | 81.54 | 84.34 |
| | Edge ring | 94.30 | 86.75 | 90.26 |
| | Local | 81.64 | 68.71 | 74.54 |
| | Nearly full | 74.92 | 74.96 | 75.09 |
| | Random | 89.92 | 99.35 | 94.40 |
| | Scratches | 88.00 | 73.01 | 80.09 |
| Random forest | Center | 76.40 | 85.46 | 80.73 |
| | Torus | 95.26 | 64.63 | 77.11 |
| | Marginal local | 79.76 | 91.32 | 85.00 |
| | Edge ring | 96.13 | 81.47 | 88.06 |
| | Local | 82.01 | 66.92 | 73.59 |
| | Nearly full | 91.69 | 72.04 | 95.57 |
| | Random | 95.10 | 87.41 | 97.31 |
| | Scratches | 86.25 | 65.54 | 74.49 |
| Ours | Center | 97.53 | 97.51 | 97.41 |
| | Torus | 99.93 | 94.52 | 97.32 |
| | Marginal local | 95.57 | 96.55 | 95.90 |
| | Edge ring | 98.68 | 95.26 | 97.01 |
| | Local | 95.86 | 96.23 | 96.02 |
| | Nearly full | 94.15 | 94.17 | 94.08 |
| | Random | 98.87 | 99.62 | 99.30 |
| | Scratches | 89.64 | 87.24 | 88.40 |
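
The F-value used in the comparison is the harmonic mean of the accuracy rate and the recall rate, which a short helper makes explicit. The function name and the input values in the example are purely illustrative and are not taken from the table.

```python
def f_value(r_acc, r_rec):
    """Harmonic mean F = 2 * R_acc * R_rec / (R_acc + R_rec)."""
    return 2 * r_acc * r_rec / (r_acc + r_rec)


balanced = f_value(50.0, 50.0)      # equal inputs give the same value back
unbalanced = f_value(80.0, 60.0)    # pulled towards the lower of the two rates
```

Because the harmonic mean is dominated by the smaller input, a model cannot obtain a high F-value by doing well on only one of the two rates.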

To avoid the influence of random factors on the experimental results and to verify the stability of the proposed model's performance, five-fold cross-validation is performed. The average recognition rates of the different recognizers under five-fold cross-validation are shown in Table 3. The model in this paper achieves an average recognition rate of 96.82%, well above the 74.36%, 82.47%, and 85.39% obtained by the decision tree, SVM, and random forest recognizers, which indicates that it is clearly superior in the task of recognizing surface defects in semiconductor materials.

Comparison of five-fold cross-validation of various algorithms (%)

| Model | Racc |
|---|---|
| Decision tree algorithm | 74.36 |
| SVM | 82.47 |
| Random forest | 85.39 |
| Ours | 96.82 |

Conclusion

In this paper, the Canny operator is used to detect the defective areas on the surface of semiconductor materials, and the geometric, grayscale, and texture features of the surface defects are extracted from the detected areas. A machine learning model is constructed to recognize the defects on the surface of semiconductor materials, and a dataset from an actual semiconductor production line is used to simulate and analyze the defect recognition performance of the model. The model achieves an overall recognition accuracy of 94.53% on the test dataset. Apart from Random defects, the model effectively identifies the other types of semiconductor material surface defects, such as Center and Edge localization, with a recognition accuracy of at least 94.59%. The F-values of the model on the six defect patterns Center, Torus, Marginal local, Edge ring, Local, and Random are 97.41%, 97.32%, 95.90%, 97.01%, 96.02%, and 99.30%, respectively, which are significantly higher than those of the decision tree, SVM, and other defect recognition models used for comparison. The best F-values for the two defect patterns Nearly full and Scratches are obtained by the random forest algorithm and the decision tree algorithm, respectively; the F-values of this paper's model on these two patterns are 94.08% and 88.40%, which are close to the best values. In the five-fold cross-validation, the model achieved the highest recognition accuracy, 96.82%, which indicates that it can effectively recognize surface defects in semiconductor materials.

Funding:

This research was supported by the 2023 Key research and development plan of Sichuan Province: Research on Blind Source Separation Technology for unmanned situational awareness Platform (No.: 2023YFG0331).

Language:
English
Publication frequency:
1 issue per year
Journal subjects:
Biological Sciences, Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics, Physics, other