Automatic Identification of Surface Defects in Semiconductor Materials Based on Machine Learning
Published online: 17 March 2025
Submitted: 08 Oct. 2024
Accepted: 04 Feb. 2025
DOI: https://doi.org/10.2478/amns-2025-0271
© 2025 Huan Li, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Semiconductor processing developed rapidly in the 1990s, enabling another leap in the capacity for information storage, information processing, and motion control per unit volume of matter. Three-dimensional microfabrication design and processing technology, an important part of this development, requires fabricating miniature systems with high depth-to-width ratios on semiconductor materials [1-2]. Processing three-dimensional structures also places increasing demands on the internal properties of semiconductor materials, and the internal consistency of the material becomes an important guarantee that microdevices function as intended [3-4].
Semiconductor components are essential to the manufacture of precision electronic devices and are usually made of semiconductor materials (e.g., silicon) [5]. Their main role is to transmit, amplify, and control electrical signals in electronic circuits so that electronic devices function properly, and they are widely used in the manufacture of integrated circuits, transistors, diodes, and other key electronic components [6-7]. Driven by the working principle and material properties of these components, the field is moving towards smaller size, higher performance, and higher integration. However, as component size decreases and integration increases, the manufacturing process becomes more complex and surface defects become harder to detect. Smaller dimensions make small and complex surface defects more likely to appear, and these defects may directly affect the electrical performance, stability, and reliability of the components [8-10]. Research on detecting surface defects in dense samples of semiconductor components has therefore become particularly critical.
Detecting surface defects is crucial for improving the overall quality and reliability of semiconductor components [11]. First, early and effective detection allows manufacturers to take appropriate improvement and repair measures, enhancing the performance and long-term reliability of devices to meet the demand for high-quality components in modern electronics. Second, it helps to reduce production costs [12-13]: detecting and treating surface defects early avoids more serious quality problems later in production and lowers the scrap rate, which in turn reduces cost and improves production efficiency. Surface quality inspection of semiconductor components has therefore become an indispensable part of automated production. It helps to ensure the quality and reliability of the devices and reduces the performance problems that may occur in subsequent production and use [14-16].
Traditional defect detection relies on manual labour: workers identify and mark surface defects by visual inspection or with simple tools [17]. This approach consumes considerable human resources, is inefficient, and depends on subjective judgement. In modern manufacturing environments with high throughput and high precision, manual inspection is clearly insufficient, so machine vision-based surface defect detection has become an indispensable alternative. Image-processing-based methods use computer vision and image processing techniques to analyse images and thereby detect surface defects. They address the problems of manual detection to a certain extent and improve the automation and accuracy of detection [18-20].
In this paper, we compare several widely used edge detection algorithms and select the Canny algorithm to determine the defective regions on the surface of semiconductor materials. Noise is removed from the detected defect edges with a morphological filtering algorithm to make the detected contours smoother. The high-dimensional feature space of the defective region is transformed into a low-dimensional feature space by transformation or mapping, and geometric, gray-scale, and texture features of the semiconductor material surface are extracted according to the different shapes, sizes, gray levels, and texture information that the defects exhibit. A semiconductor material surface defect recognition model is built on the TensorFlow machine learning framework; features are extracted through network self-training, and the convolution operation is used to improve the computational efficiency of the neural network model. The model is trained with transfer learning, and the SGD optimization algorithm and a warm-up strategy are used to further improve its performance in recognizing semiconductor material surface defects. The industrial dataset WM-811K, collected on-site from an actual semiconductor production line, is used to simulate and analyze the model's performance in recognizing eight types of defects on the surface of semiconductor materials. The recognition results are compared with those of the decision tree, SVM, and random forest algorithms, which highlights the good performance of the proposed machine learning model in semiconductor material surface defect recognition.
Automatic defect recognition is concerned with the defective region. To improve recognition efficiency, the defective region in the semiconductor material should first be detected and segmented, and then characterized. Classic edge-detection-based segmentation uses the abrupt change in gray value at the boundary between the target and background regions as the basis for segmenting the target region. During imaging, light is scattered in the defective region, so the gray value of the defective region is lower than that of the surrounding normal region. An abrupt change in gray value therefore always appears at the border between the defective region and the surrounding normal region, and an edge detection operator based on such gray-value changes can be used for feature extraction.
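The idea that a defect boundary shows up as an abrupt gray-value change can be illustrated with a first-order gradient operator. The following is a minimal sketch, assuming a plain Sobel operator written in pure Python; it is not the Canny pipeline the paper adopts, and the function name `sobel_magnitude` and the 5×5 patch are hypothetical.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gray-value changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    """Gradient magnitude at interior pixels; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            mag[y][x] = math.hypot(gx, gy)
    return mag

# Hypothetical patch: darker defect (gray 0) on the left, normal region (gray 10) on the right.
patch = [[0, 0, 10, 10, 10] for _ in range(5)]
edges = sobel_magnitude(patch)
```

In this toy patch the response is large only in the two columns next to the gray-value step and zero inside the uniform region, which is the property the segmentation relies on.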
Comparing several widely used edge detection operators leads to the following observations. The first-order gradient operators Prewitt and Sobel have low computational complexity, but their detection accuracy is also lower, and they cannot detect the edges of some finer crack defects. The Canny and LoG operators detect the edges of finer defects with higher sensitivity, and of the two the Canny operator is more stable and less susceptible to noise interference. On balance, the Canny operator is therefore used to determine the surface defects of semiconductor materials [21]. Figure 1 shows the detection results: Figure 1(a) is the detection result of the Canny operator and Figure 1(b) is the result after denoising. The Canny operator detects the defect edges completely. The small number of remaining noise points can be removed with a morphological filtering algorithm (an opening operation, i.e., erosion followed by dilation), after which the defect edges are shown accurately and clearly.
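The opening operation mentioned above (erosion followed by dilation) can be sketched on a binary image as follows. This is an illustrative pure-Python version with an assumed square k×k structuring element and zero padding; the function names and the 7×7 test image are hypothetical, and the paper does not state which structuring element it uses.

```python
def _apply(img, k, reducer):
    """Apply `all` (erosion) or `any` (dilation) over a k x k neighbourhood.

    Pixels outside the image are treated as background (0).
    """
    h, w, r = len(img), len(img[0]), k // 2
    return [[int(reducer(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx] == 1
                for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
             for x in range(w)] for y in range(h)]

def erode(img, k=3):
    return _apply(img, k, all)

def dilate(img, k=3):
    return _apply(img, k, any)

def opening(img, k=3):
    """Erosion followed by dilation: removes structures smaller than the element."""
    return dilate(erode(img, k), k)

# Hypothetical binary image: a 3x3 foreground block plus one isolated noise pixel.
block = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(7)] for y in range(7)]
noisy = [row[:] for row in block]
noisy[5][5] = 1
cleaned = opening(noisy)
```

Here the isolated pixel disappears while the 3×3 block is restored by the dilation. The structuring element has to be smaller than the structures that should survive, so its size would need to be chosen with the actual edge width in mind.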

Figure 1. Defect area test results
The defect contour was successfully detected using the Canny operator. Before recognition, the 2D image must be converted into quantitative information that a computer can process easily; this conversion is feature extraction.
The high-dimensional feature space of the defect region is transformed into a low-dimensional feature space by transformation or mapping. Analysis of the defective regions on the surface of semiconductor materials shows that different defects have their own characteristics, which are manifested in their shape, size, gray level, and texture information. Accordingly, the geometric, gray-scale, and texture features of the defects are extracted separately [22].
The geometry of the defective region is an important feature in defect classification and is obtained by extracting the region's geometric features. These features, which mainly include the region's area, center of mass, and moments, can be used to analyze the possible defect types.
Definition
Perimeter
The total number of pixels at the boundary of the defective region can characterize the perimeter of the defective region, which is a contour feature parameter:
Area
Characterize the area of the region by the total number of pixels it contains:
Center of mass coordinates (
The position of the defective region in the image is described by the region center of mass coordinates. The possible types of defects can be determined based on the location of the defective region. The center of mass coordinates can be expressed as:
Degree of rectangularity
The ratio of the length of the short side of the smallest external rectangle of the defective region
Duty cycle
The ratio of defect area
Eccentricity.
Eccentricity is used to indicate the compactness of the defective region and is defined as the ratio of the longest chord
Circularity
For geometric shapes with the same perimeter, the circle has the largest area. Roundness
Hu invariant moments
Hu invariant moments have translation, rotation and scale invariance. For a gray scale image
From Eqs. (8) and (9), the corresponding (
Seven invariant moments can be derived from the second-order and third-order moments. Because the higher-order Hu invariant moments are easily affected by noise and other external factors in pattern recognition, only the first four Hu invariant moments are extracted in this section. They are computed as follows:
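Several of the geometric features listed above can be sketched from a binary defect mask. The following is a minimal illustration under stated assumptions: the perimeter is taken as the number of region pixels with a 4-connected background neighbour, and only the first Hu invariant moment (the sum of the two normalized second-order central moments) is shown. The function name `geometric_features` is hypothetical and the paper's exact formulas are not reproduced here.

```python
def geometric_features(mask):
    """Area, boundary-pixel perimeter, centroid and first Hu moment of a binary mask."""
    h, w = len(mask), len(mask[0])
    pixels = [(x, y) for y in range(h) for x in range(w) if mask[y][x]]
    if not pixels:
        raise ValueError("mask contains no defect pixels")

    def is_background(x, y):
        # Positions outside the image count as background.
        return not (0 <= x < w and 0 <= y < h) or not mask[y][x]

    area = len(pixels)
    perimeter = sum(
        any(is_background(x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)))
        for x, y in pixels)
    cx = sum(x for x, _ in pixels) / area
    cy = sum(y for _, y in pixels) / area
    # Second-order central moments; for a binary mask the zeroth moment equals the area.
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    # Normalized moments eta_pq = mu_pq / mu_00^(1 + (p + q) / 2), here with p + q = 2.
    hu1 = (mu20 + mu02) / area ** 2
    return {"area": area, "perimeter": perimeter, "centroid": (cx, cy), "hu1": hu1}

# Hypothetical 5x5 mask with a 3x3 defect region in the middle.
mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]
features = geometric_features(mask)
```

Because the first Hu moment is built from central moments normalized by the area, shifting or uniformly scaling the region leaves it approximately unchanged, which is the invariance property referred to above.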
The grayscale features of the defective regions on the surface of semiconductor materials are mainly the average grayscale value, grayscale variance or standard deviation, and grayscale entropy. Since it is easy to obtain the histogram of the image, the histogram of the image is chosen for the statistics of the gray-scale features. The histogram is a grayscale image of
The average gray level
Gray scale standard deviation
The standard deviation is more intuitive compared to the variance, so it is chosen as a measure of the image gray scale features, which is obtained by calculating the arithmetic square root of the variance:
Gray scale entropy
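The three gray-scale features named in this section (mean, standard deviation, entropy) can all be read off the normalized histogram. A minimal sketch, assuming the entropy is measured in bits (base-2 logarithm) and that the pixels of the defect region are given as a flat list of integer gray levels; the function name `gray_features` is hypothetical.

```python
import math
from collections import Counter

def gray_features(pixels):
    """Mean, standard deviation and entropy of a region from its gray-level histogram."""
    n = len(pixels)
    # Normalized histogram: p[g] is the fraction of pixels with gray level g.
    p = {g: c / n for g, c in Counter(pixels).items()}
    mean = sum(g * pg for g, pg in p.items())
    variance = sum((g - mean) ** 2 * pg for g, pg in p.items())
    entropy = -sum(pg * math.log2(pg) for pg in p.values())
    return {"mean": mean, "std": math.sqrt(variance), "entropy": entropy}

# Hypothetical region with two equally frequent gray levels.
stats = gray_features([0, 0, 255, 255])
```

A region with a single gray level has zero entropy, while a region whose gray levels are evenly spread has high entropy, so the entropy complements the mean and standard deviation as a measure of how disordered the gray distribution is.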
One of the simplest ways to describe texture is to use the gray-level histogram statistical moments of an image or region. The texture measures computed using only the histogram do not carry information about the relative positions of the pixels with respect to each other, which is critical when characterizing texture. Therefore, both the distribution of gray levels and the relative positions of pixels in the image are considered in texture analysis.
The gray-level co-occurrence matrix considers both the gray level distribution and the relative position of the pixels. Let the gray level of image
The position operator
By analyzing the elements of
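A gray-level co-occurrence matrix for one position operator can be sketched as below. This is an illustrative version under stated assumptions: the position operator is a pixel offset `(dx, dy)`, the matrix is not symmetrized, and contrast, energy, and homogeneity are shown as commonly used descriptors, which may differ from the set the paper extracts. All function names are hypothetical.

```python
def glcm(img, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs separated by (dx, dy)."""
    h, w = len(img), len(img[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                counts[img[y][x]][img[y2][x2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[c / total for c in row] for row in counts]

def texture_features(p):
    """Contrast, energy and homogeneity computed from a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }

# Hypothetical two-level textures: vertical stripes versus a flat patch.
stripes = texture_features(glcm([[0, 1, 0, 1]] * 4, levels=2))
flat = texture_features(glcm([[0, 0], [0, 0]], levels=2))
```

The striped patch has non-zero contrast because horizontally adjacent pixels always differ, whereas the flat patch has zero contrast and maximal energy. The histogram alone could not express this kind of positional difference.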
The geometric, gray-scale, and texture features extracted directly from the grayscale image of the defective region differ by orders of magnitude, which is detrimental to feature selection and defect classification, so the raw data must be standardized. The Auto Scaling method, i.e., standard deviation standardization, is used. After standardization each feature has mean 0 and standard deviation 1, and the standardization formula is:
Five scratch (Sc) and five inclusion (In) defects were used for feature extraction experiments. The results are shown in Fig. 2, where Fig. 2(a) shows the extracted geometric feature parameters, Fig. 2(b) the gray-scale feature parameters, and Fig. 2(c) the texture feature parameters. The figure shows that the feature extraction algorithm used in this paper can effectively extract defect features from the surface of the semiconductor material, and because the feature parameters are standardized, the feature data have a mean of 0 and a standard deviation of 1. At the same time, the value of a given feature differs considerably between defect types. For example, in Figure 2(a), the standardized perimeter P of the target region is 0.26 for scratch 1 and 1.39 for inclusion 1, a significant difference. The differences that different defect types show on the same feature can therefore be used to identify and classify defect types.
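Standard deviation standardization replaces each value of a feature by its distance from the feature mean, measured in standard deviations. A minimal sketch, assuming the population standard deviation is used (the text does not say whether the population or sample form is meant); the function name `standardize` and the sample values are hypothetical.

```python
import statistics

def standardize(values):
    """Z-score one feature column: subtract the mean, divide by the standard deviation."""
    mean = sum(values) / len(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    if std == 0:
        raise ValueError("feature is constant and cannot be standardized")
    return [(v - mean) / std for v in values]

# Hypothetical raw perimeter values for three defect regions.
z = standardize([2.0, 4.0, 6.0])
```

Each feature would be standardized separately in this way, so that a perimeter measured in tens of pixels and an entropy of a few bits end up on a comparable scale.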

Figure 2. Defect feature extraction
Based on the TensorFlow machine learning framework, a model for recognizing semiconductor material surface defects is established [23]. Features are selected through network self-training, and the convolution operation is used to improve the computational efficiency of the neural network model and the performance of model training. Weight sharing both speeds up training and strengthens the model's defect recognition capability. Convolutional neural network modeling mainly includes input layer modeling, convolutional layer modeling, activation function modeling, pooling layer modeling, and fully connected layer modeling.
The main role of the convolution operation is to perform feature extraction and get the feature map. The convolutional modeling is:
Where:
The role of activation function modeling is mainly to apply a nonlinearity to the linear features produced by the convolution operation so that they better match the actual features. The LeakyReLU function is chosen as the activation function because it converges rapidly and effectively avoids neuron inactivation. It is mathematically modeled as:
Where:
Pooling modeling is located after convolution and activation and mainly eliminates redundant features. Pooling compresses the data: it changes the size of the feature matrix but not its number of dimensions, i.e., the output of the previous layer is downsampled. Mean pooling is adopted for modeling.
The fully connected layer is located at the end of the convolutional neural network. It integrates the feature outputs through a weight matrix and classifies the defects in the feature image.
Therefore, after the training samples are processed in the input layer, the feature map of the semiconductor material surface is obtained through the convolution operation. After activation and pooling, the final semiconductor material surface defect recognition decision is made by the fully connected layer.
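The convolution, activation, and pooling stages described above can be traced on a toy input. The following pure-Python sketch is only illustrative: it uses a single channel, "valid" cross-correlation as in common deep learning frameworks, an assumed LeakyReLU slope of 0.01, and 2×2 mean pooling. The kernel, the slope, and the function names are hypothetical and do not describe the paper's network.

```python
def conv2d(img, kernel):
    """Single-channel 'valid' cross-correlation (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(kernel[j][i] * img[y + j][x + i]
                 for j in range(kh) for i in range(kw))
             for x in range(ow)] for y in range(oh)]

def leaky_relu(v, alpha=0.01):
    """Pass positive values through; scale negative values by a small slope."""
    return v if v > 0 else alpha * v

def mean_pool(fmap, size=2):
    """Non-overlapping mean pooling; incomplete windows at the border are dropped."""
    oh, ow = len(fmap) // size, len(fmap[0]) // size
    return [[sum(fmap[y * size + j][x * size + i]
                 for j in range(size) for i in range(size)) / size ** 2
             for x in range(ow)] for y in range(oh)]

# Hypothetical 4x4 input and 2x2 kernel, traced through the three stages.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
kernel = [[1.0, 0.0], [0.0, -1.0]]
feature_map = conv2d(image, kernel)
activated = [[leaky_relu(v) for v in row] for row in feature_map]
pooled = mean_pool(activated)
```

The same small kernel is reused at every position of the input, which is the weight sharing mentioned earlier. The negative responses are scaled down rather than zeroed, which is how LeakyReLU avoids inactive neurons.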
Experimental platform and environment
The model proposed in this paper is developed using the PyTorch framework. PyTorch is a machine learning library that supports high-performance numerical computation on GPUs and CPUs and can be installed easily on most systems. The experiments in this section are run on an Ubuntu system with an Nvidia Tesla P100 graphics card (GPU). Neural networks contain a large number of parameters whose computation consumes substantial memory, and CPU performance alone cannot meet these requirements. GPUs, by contrast, are specialized for highly parallel pixel-level image operations, which makes image processing more efficient. The algorithm in this paper is therefore run on a GPU.
Experimental Function Design
The semiconductor material surface defect recognition experiment in this paper proceeds as follows. The dataset is obtained and labeled, and the images and labels are augmented with data preprocessing algorithms. They are then fed into the feature extraction algorithm, and normalization is carried out until a better feature extraction model is obtained. The results are passed to the defect recognition network, which finally outputs the classification and localization results. The structure of the recognition algorithm has five main parts: input image, defect region detection, feature extraction, recognition, and output.
Model Training
This paper uses transfer learning to train the machine learning model and adjusts the hyperparameters to suit the semiconductor material surface defect problem. For model training, positive examples in the training set have IoU>0.70 and negative examples have IoU<0.30. When training the recognition network, the model extracts 2500 RoIs, of which positive examples have IoU>0.30 and negative examples have IoU<0.01; the proportion of positive samples does not exceed 25%, and the learning rate is lr=0.01. The regression uses the SmoothL1Loss function, calculated as in equation (25). The classifier uses the SoftMax loss function, and target segmentation uses the Sigmoid function. During training of the multitask network, the loss function is obtained by linear summation to allow end-to-end training.
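The IoU thresholds and the smooth L1 regression loss mentioned above can be made concrete with a short sketch. It assumes axis-aligned boxes given as `(x1, y1, x2, y2)`, the common smooth L1 definition with a transition point `beta` of 1, and the 0.70/0.30 thresholds from the text as defaults. The function names, and the convention of returning -1 for samples that are neither positive nor negative, are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def label_sample(overlap, pos_thr=0.70, neg_thr=0.30):
    """1 = positive, 0 = negative, -1 = ignored (between the two thresholds)."""
    if overlap > pos_thr:
        return 1
    if overlap < neg_thr:
        return 0
    return -1

def smooth_l1(x, beta=1.0):
    """Quadratic near zero, linear for large errors, so outliers dominate less."""
    ax = abs(x)
    return 0.5 * ax * ax / beta if ax < beta else ax - 0.5 * beta

# Hypothetical candidate box compared with a ground-truth defect box.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
label = label_sample(overlap)
```

In this example the two boxes share one unit of area out of seven, so the candidate falls below the negative threshold and would be treated as background.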
The output of the bounding box regression layer is 9 anchor boxes, which correspond to the translation scaling parameters.
As mentioned earlier, the output anchor frame of the target classification layer is the probability of belonging to the foreground or background, where
Total loss function for the target detection task (29):
In the testing phase, the model first screens 2500 candidate frames with an NMS screening threshold of IoU>0.7. In the recognition network, the score screening threshold is 0.01 and the NMS screening threshold is IoU 0.8. The optimizer uses the SGD algorithm with a momentum of 0.5 and a weight decay of 0.0003. A warm-up strategy was used in the experiments: during the initial 200 iterations the learning rate was gradually increased, starting at 0.38, and it was decreased in the 19th and 36th cycles, for a total of 32 cycles of training.
The ability of the algorithm to recognize defects is examined using a sample library of semiconductor material surface defects from the industrial dataset WM-811K, which was collected in the field and contains images from actual semiconductor production lines. The WM-811K dataset consists of a normal pattern and eight defect patterns. Figure 3 shows the nine types of patterns: center, circular, edge localized, edge ring, localized, nearly full, random, scratch, and normal (no defect). The sample library has a significant class imbalance, which makes it challenging to identify defective patterns correctly. To address this, the study employs a transfer learning method to increase the diversity and richness of the data, and each semiconductor material image is resized to 124 × 124 pixels during preprocessing. In this experiment, the images are randomly divided into five parts for five-fold cross-validation; for each validation, 7038 surface images of semiconductor materials are randomly selected as the training set and the remaining 1349 images are used as the validation dataset.
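A warm-up schedule and an SGD update with momentum and weight decay can be sketched as follows. This is a minimal illustration under stated assumptions: the warm-up is linear, the update follows the convention of adding the weight-decay term to the gradient before the momentum step, and the base learning rate in the example is hypothetical. The paper does not specify the shape of its warm-up or the size of its later learning-rate reductions, so those are not modelled.

```python
def warmup_lr(iteration, base_lr, warmup_iters=200):
    """Linearly ramp the learning rate up to base_lr over the first warmup_iters steps."""
    if iteration < warmup_iters:
        return base_lr * (iteration + 1) / warmup_iters
    return base_lr

def sgd_momentum_step(weight, grad, velocity, lr, momentum=0.5, weight_decay=0.0003):
    """One SGD update for a scalar weight; returns the new weight and velocity."""
    grad = grad + weight_decay * weight      # L2 weight decay folded into the gradient
    velocity = momentum * velocity + grad    # exponentially decaying gradient history
    return weight - lr * velocity, velocity

# Hypothetical base learning rate; the first steps use a much smaller rate.
lrs = [warmup_lr(i, base_lr=0.01) for i in (0, 99, 199, 500)]
w, v = sgd_momentum_step(weight=1.0, grad=0.5, velocity=0.0, lr=0.1, weight_decay=0.0)
```

Starting with a small learning rate limits the size of the first updates, when the gradients of freshly initialized or newly attached layers tend to be large and noisy.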
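The five-fold procedure can be sketched as an index split in which every sample serves as validation data exactly once. This is an illustrative version with equal-sized folds, a fixed seed, and a hypothetical function name; it does not reproduce the 7038/1349 split reported above, which is not an exact four-to-one partition.

```python
import random

def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (almost) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val

# Hypothetical toy dataset of 10 samples.
splits = list(k_fold_splits(10, k=5))
```

The recognition rate would then be averaged over the five validation folds, which is how an averaged figure such as the one in a cross-validation comparison is normally obtained.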

Figure 3. Normal semiconductor material surface and 8 defect modes
Recognition performance analysis
Table 1 shows the confusion matrix of the model's recognition of semiconductor material surface defects. The model's overall recognition accuracy on the test dataset reaches 94.53%. The per-class recognition rates show that, apart from Random-type defects, the model effectively recognizes all other categories of semiconductor material surface defects, with a recognition accuracy of at least 94.59%. In addition, transfer learning effectively learns the data features of semiconductor material surface images and yields very good recognition results.
Table 1. Confusion matrix of model defect recognition rate (%)
| Actual \ Predicted | Center | Torus | Marginal local | Edge ring | Local | Nearly full | Random | Scratches |
|---|---|---|---|---|---|---|---|---|
| Center | 97.69 | 0.00 | 0.00 | 0.57 | 0.00 | 0.68 | 1.06 | 0.00 |
| Torus | 1.48 | 95.63 | 0.64 | 0.00 | 0.32 | 0.00 | 0.29 | 1.64 |
| Marginal local | 0.00 | 2.81 | 96.94 | 0.00 | 0.03 | 0.09 | 0.11 | 0.02 |
| Edge ring | 1.06 | 0.81 | 0.98 | 95.79 | 0.00 | 0.37 | 0.83 | 0.16 |
| Local | 0.09 | 0.13 | 0.42 | 0.00 | 96.82 | 1.38 | 1.04 | 0.12 |
| Nearly full | 0.46 | 0.74 | 0.98 | 0.33 | 1.47 | 95.07 | 0.43 | 0.52 |
| Random | 3.92 | 0.00 | 4.08 | 0.00 | 4.73 | 3.53 | 83.74 | 0.00 |
| Scratches | 1.32 | 0.00 | 0.87 | 0.00 | 1.43 | 0.00 | 1.77 | 94.59 |
The reasons for misidentification of the Random class of semiconductor material surface defects are analyzed further. As Table 1 shows, 3.92%, 4.08%, 4.73%, and 3.53% of Random-class surface defects are misidentified as the Center, Marginal local, Local, and Nearly full categories, respectively. Fig. 4 shows four Random-class surface defect maps that were misidentified as other classes. They were misidentified because, in addition to the standard random defects, they also exhibit the defect characteristics of Center, Marginal local, Local, and Nearly full, in that order.
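The figures quoted from Table 1 can be checked with a few lines of arithmetic. The sketch below copies the diagonal of Table 1 and the four largest off-diagonal entries of the Random row; the variable names are hypothetical. That the overall accuracy equals the unweighted mean of the per-class rates is an observation from the numbers, not something the text states.

```python
# Diagonal of Table 1: per-class recognition rate (%), keyed by the table's class names.
per_class_rate = {
    "Center": 97.69, "Torus": 95.63, "Marginal local": 96.94, "Edge ring": 95.79,
    "Local": 96.82, "Nearly full": 95.07, "Random": 83.74, "Scratches": 94.59,
}
# Non-zero off-diagonal entries of the Random row (% of Random samples per predicted class).
random_row_errors = {"Center": 3.92, "Marginal local": 4.08, "Local": 4.73, "Nearly full": 3.53}

mean_rate = sum(per_class_rate.values()) / len(per_class_rate)
worst_class = min(per_class_rate, key=per_class_rate.get)
random_error_total = sum(random_row_errors.values())
```

The unweighted mean of the eight per-class rates rounds to 94.53, matching the reported overall accuracy, and the four misclassification shares of the Random class add up to the 16.26% of Random samples that are not recognized correctly.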

Figure 4. Misidentification of Random-class defects
Performance Comparison
To verify the effectiveness and advancement of the proposed method, several classical and recent semiconductor material surface defect pattern recognition methods are compared with this paper's model in the experiment. The C4.5 decision tree has a maximum depth of 50 and 200 nodes. The support vector machine uses a linear kernel function and a Gaussian kernel function respectively, with the penalty factor set to C=1.5. The random forest has a maximum depth of 100 and 1000 trees. The K value of the KNN classifier is set to 3.
Table 2 compares the recognition performance of this paper's model with that of the other models on the various semiconductor material surface defects. Here Rrec is the recall rate, i.e., the proportion of positive samples that are correctly predicted among all positive samples, and F is the harmonic mean of precision and recall, which is used to indicate the stability of the model's recognition performance.
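The relation between precision, recall, and the F-value can be sketched as follows. The counts in the example are hypothetical, and the function names are not from the paper; the sketch only illustrates the standard definitions.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def f_value(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for one defect class: 8 hits, 2 false alarms, 2 misses.
p, r = precision_recall(tp=8, fp=2, fn=2)
f = f_value(p, r)
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a class with high precision but low recall (or the reverse) receives a clearly lower F-value than either figure alone would suggest, which is why it is used as a stability indicator.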
Table 2. Performance comparison of models (%)
| Models | Defect category | | | F |
|---|---|---|---|---|
| Decision tree algorithm | Center | 72.26 | 34.39 | 46.85 |
| Decision tree algorithm | Torus | 59.27 | 43.23 | 49.91 |
| Decision tree algorithm | Marginal local | 63.72 | 79.83 | 70.91 |
| Decision tree algorithm | Edge ring | 93.49 | 87.61 | 90.59 |
| Decision tree algorithm | Local | 57.33 | 48.73 | 52.66 |
| Decision tree algorithm | Nearly full | 93.35 | 93.27 | 93.49 |
| Decision tree algorithm | Random | 90.52 | 92.84 | 91.77 |
| Decision tree algorithm | Scratches | 85.30 | 92.28 | |
| SVM | Center | 84.75 | 77.15 | 80.85 |
| SVM | Torus | 47.36 | 80.85 | 59.62 |
| SVM | Marginal local | 87.55 | 81.54 | 84.34 |
| SVM | Edge ring | 94.30 | 86.75 | 90.26 |
| SVM | Local | 81.64 | 68.71 | 74.54 |
| SVM | Nearly full | 74.92 | 74.96 | 75.09 |
| SVM | Random | 89.92 | 99.35 | 94.40 |
| SVM | Scratches | 88.00 | 73.01 | 80.09 |
| Random forest | Center | 76.40 | 85.46 | 80.73 |
| Random forest | Torus | 95.26 | 64.63 | 77.11 |
| Random forest | Marginal local | 79.76 | 91.32 | 85.00 |
| Random forest | Edge ring | 96.13 | 81.47 | 88.06 |
| Random forest | Local | 82.01 | 66.92 | 73.59 |
| Random forest | Nearly full | 91.69 | 72.04 | |
| Random forest | Random | 95.10 | 87.41 | 97.31 |
| Random forest | Scratches | 86.25 | 65.54 | 74.49 |
| Ours | Center | 97.53 | 97.51 | 94.71 |
| Ours | Torus | 99.93 | 94.52 | 97.32 |
| Ours | Marginal local | 95.57 | 96.55 | 95.90 |
| Ours | Edge ring | 98.68 | 95.26 | 97.01 |
| Ours | Local | 95.86 | 96.23 | 96.02 |
| Ours | Nearly full | 94.15 | 94.17 | 94.08 |
| Ours | Random | 98.87 | 99.62 | 99.30 |
| Ours | Scratches | 89.64 | 87.24 | 88.40 |
To avoid the influence of random factors on the experimental results and to verify the stability of the proposed model's performance, five-fold cross-validation is performed. The average recognition rates of the different recognizers under five-fold cross-validation are shown in Table 3. The model in this paper achieves an average recognition rate of 96.82%, clearly higher than that of every other recognizer compared; the next best, random forest, reaches 85.39%. This shows that the model performs markedly better in the task of recognizing surface defects in semiconductor materials.
Table 3. Comparison of five-fold cross-validation of various algorithms
| Models | Average recognition rate (%) |
|---|---|
| Decision tree algorithm | 74.36 |
| SVM | 82.47 |
| Random forest | 85.39 |
| Ours | 96.82 |
In this paper, the Canny operator is used to detect the defective areas on the surface of semiconductor materials, and the geometric, gray-scale, and texture features of the defects are extracted from the detected areas. A machine learning model is constructed to recognize the defects, and a dataset from an actual semiconductor production line is used to simulate and analyze the model's defect recognition performance. The model achieves an overall recognition accuracy of 94.53% on the test dataset. Apart from Random defects, the model effectively identifies the other types of semiconductor material surface defects, such as Center and Marginal local, with a recognition accuracy of at least 94.59%. The F-values of the model in the six defect pattern recognition tasks of Center, Torus, Marginal local, Edge ring, Local, and Random are 94.71%, 97.32%, 95.90%, 97.01%, 96.02%, and 99.30%, respectively, which are significantly higher than those of the decision tree, SVM, and other defect recognition models in the comparison. The optimal F-values for the two defect patterns Nearly full and Scratches are obtained by the decision tree algorithm and the random forest algorithm, respectively. The F-values of this paper's model on these two defect patterns are 94.08% and 88.40%, which are close to the optimal F-values. In the five-fold cross-validation, the model achieved the highest recognition accuracy, 96.82%, which indicates that it can effectively recognize surface defects in semiconductor materials.
This research was supported by the 2023 Key research and development plan of Sichuan Province: Research on Blind Source Separation Technology for unmanned situational awareness Platform (No.: 2023YFG0331).
