Article

A Semi-Supervised Method for Grain Boundary Segmentation: Teacher–Student Knowledge Distillation and Pseudo-Label Repair

1 School of Materials Science and Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
2 School of Mechanical and Automotive Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(17), 3529; https://doi.org/10.3390/electronics13173529
Submission received: 31 July 2024 / Revised: 29 August 2024 / Accepted: 2 September 2024 / Published: 5 September 2024
(This article belongs to the Special Issue Applications of Artificial Intelligence in Computer Vision)

Abstract

Grain boundary segmentation is crucial for the quantitative analysis of grain structures and material optimization. However, challenges persist due to noise interference, high labeling costs, and low detection Accuracy. We therefore propose a semi-supervised method, Semi-SRUnet, based on teacher–student knowledge distillation and pseudo-label repair, to achieve grain boundary detection with only a small number of labels. Specifically, the method introduces SCConv (Spatial and Channel Reconstruction Convolution) and boundary regression to improve U-Net (a convolutional neural network architecture) as a teacher network. These innovations reduce spatial and channel redundancy, expand the receptive field, and effectively capture contextual information from images, thereby improving the robustness of feature extraction and boundary precision in noisy environments. Additionally, we designed a pseudo-label repair algorithm to enhance the Accuracy of the pseudo-labels generated by the teacher network and used knowledge distillation to train a lightweight student network. The experimental results demonstrate that Semi-SRUnet achieves 88.86% mean Intersection over Union (mIoU), 96.64% mean Recall (mRecall), 91.5% mean Precision (mPrecision), and 98.77% Accuracy, surpassing state-of-the-art models and offering a novel approach for reliable grain boundary segmentation and analysis.

1. Introduction

In materials science, a grain boundary is a region in a crystal structure where different grains intersect [1]. Compared to the interior of the crystal, grain boundaries possess higher energy, and their structure directly affects the overall behavior of the material, significantly influencing its mechanical, electrical, and optical properties [2]. For instance, the presence of grain boundaries can lead to variations in material properties, such as strengthening, fracture, and deformation [3,4,5]. Therefore, grain boundary detection is an important step toward better understanding and optimizing the overall properties of materials.
With the emergence of advanced materials, characterization techniques such as scanning electron microscopy (SEM) [6], transmission electron microscopy (TEM) [7], X-ray micro-computed tomography (XCT) [8], and optical microscopy (OM) [9] have become widely used for microstructure characterization and analysis. However, SEM, TEM, and XCT involve relatively high equipment costs and are time-consuming compared to OM, while OM itself is limited by its lower resolution and the noise introduced during sample preparation. As a result, grain boundary detection remains challenging, and achieving clear and accurate extraction of grain boundaries is still difficult.
Grain size is crucial for analyzing the mechanical properties of materials. To determine grain size, researchers must observe and measure grain boundaries. Grain boundary detection methods fall into two categories: rule-based methods and learning-based methods. Rule-based methods [10,11,12,13] utilize the physical properties or geometric rules of the material to analyze grain shape and set appropriate thresholds to distinguish regions with different properties in the image. Although current rule-based methods have achieved good results in grain boundary detection, they are susceptible to environmental effects such as noise, lighting, and low resolution, and they struggle to adapt to complex scenarios with ambiguous grain boundaries. Learning-based methods (including supervised, semi-supervised, and unsupervised learning) have achieved significant results in the field of grain boundary detection. These methods use deep neural networks to learn features of grain boundaries in images, offering greater flexibility and generalizability than traditional rule-based approaches. However, supervised learning methods [14,15] require extensive manual labeling of grain boundary information, and their denoising effectiveness can be compromised by scratches and blurred or missing grain boundaries. While semi-supervised [17] and unsupervised [16] approaches reduce labeling costs, the limited amount of labeled data often fails to capture all features and complexities of grain boundaries, resulting in lower Accuracy compared to supervised learning. This presents significant challenges in practical applications. Moreover, the effectiveness of the teacher–student framework [18] in handling noisy data has been demonstrated in medical imaging. By incorporating attention mechanisms, this framework can accurately distinguish subtle anatomical features from noisy regions, ensuring the transmission of context-rich information and achieving more precise model outputs [19].
Therefore, to reduce labeling cost while effectively removing noise and improving the Accuracy of grain boundary detection, this paper proposes a semi-supervised grain boundary segmentation method based on teacher–student knowledge distillation and pseudo-label repair. Specifically, the proposed semi-supervised model employs a teacher–student knowledge distillation framework. The teacher–student network introduces SCConv [20] and boundary regression to improve the U-Net [21] model, which enhances the network's representation of spatial and channel features and improves its ability to capture fine-grained features. The teacher network, trained with a small amount of labeled data, then generates pseudo-labels for unlabeled images. Additionally, a pseudo-label repair algorithm based on the physical characteristics of grain boundaries is introduced to address missing label information in the pseudo-labels. Finally, the student network is trained on a dataset augmented with a data enhancement strategy (adding noise points, scratches, and fuzzy boundaries) to improve the robustness of the model. Compared with previous work, our contributions can be summarized as follows:
(1) We propose a novel semi-supervised grain boundary segmentation method based on a knowledge distillation framework. This method enhances model learning efficiency and segmentation Accuracy, addressing the challenge of high labeling costs in grain boundary segmentation.
(2) We propose the novel SRUnet network for teacher–student frameworks. This network integrates SCConv into the U-Net architecture to improve spatial and channel reconstruction, thereby reducing the impact of noise. Additionally, a boundary regression module is included to further suppress noise propagation.
(3) To improve the Accuracy of pseudo-labels and make full use of unlabeled data, we propose a novel pseudo-label repair method based on the unique physical characteristics of grain boundaries.

2. Related Work

2.1. Grain Boundary Segmentation

Grain boundary segmentation is essentially a binary classification task akin to edge detection in computer vision, where each pixel must be classified as either a boundary pixel or not. Early approaches to grain boundary detection primarily utilized the physical properties of the material and pixel gradients for boundary extraction. For instance, Ma et al. [10] employed an overlapping tiling strategy to detect grain boundaries based on the high degree of consistency between two neighboring slices in a material image. Wang et al. [11] proposed a method to extract grain boundaries by reconstructing the boundary region and analyzing pixel relationships. Gajalakshmi et al. [12] applied both the Otsu and Canny edge detection techniques for grain boundary detection. Peregrina-Barreto et al. [13] used image simplification, noise removal, automatic thresholding, and grain delineation to measure grain size in mild steel. With the development of deep learning, however, the use of neural networks to learn grain boundary features has become mainstream. For example, Li et al. [14] used multi-task learning and generative adversarial networks (GAN) for the segmentation of second-phase grains and grain boundary detection. Wang et al. [15] developed an end-to-end CNN-based method for compressing and identifying δ-phase grain boundaries in CrNiFe alloy from SEM images, achieving automatic, rapid, and accurate identification. Na et al. [16] framed grain boundary detection as a real-to-virtual (R2V) translation problem, mapping real microstructures to virtual ones and incorporating regularization into unsupervised segmentation. Li et al. [17] introduced a semi-supervised boundary detection method combining transfer learning and boundary region growth for detecting aluminum grains in metallographic images. It is worth noting that most current deep learning methods for grain boundary segmentation are fully supervised and incur high annotation costs. Semi-supervised and unsupervised learning approaches often suffer from poor generalization and low robustness due to their limited annotation. In practical scenarios with complex conditions such as scratches, partially missing boundaries, and noise from impurities, these models are prone to noise interference, resulting in reduced Accuracy. In contrast, our model addresses these issues by reducing annotation costs while effectively managing the noise problems encountered in practical applications.

2.2. Semi-Supervised Learning

In the case of insufficient labeled data, semi-supervised learning leverages both labeled and unlabeled data during training, using information from the unlabeled data to enhance the model's performance and generalization ability. Current semi-supervised learning methods are primarily categorized into consistency regularization, pseudo-labeling, and holistic models. Consistency regularization aims to ensure that a model's predictions for the same unlabeled image remain consistent before and after adding noise. For example, Tarvainen et al. [22] proposed the Mean Teacher model, which enhances the student model's performance in semi-supervised learning by generating stable consistency targets using exponential moving average weights of the teacher model. Zhao et al. [23] combined contrastive learning with consistency constraints to improve medical image segmentation by leveraging both labeled and unlabeled data. Pseudo-labeling methods, in contrast, initially train a model using labeled data, then predict labels for the unlabeled data and incorporate these predictions into the training set; the model is subsequently trained on both the true labeled data and the pseudo-labeled data. For instance, McClosky et al. [24] proposed a self-training model that trains on labeled data and then predicts and adds high-confidence pseudo-labels to the training set. Cascante-Bonilla et al. [25] introduced Curriculum Labeling, which starts with easier samples and progressively incorporates more complex samples until all unlabeled data are included in the training set. Additionally, holistic models integrate both consistency regularization and pseudo-labeling techniques. For example, Sohn et al. [26] developed the FixMatch model, which computes a supervised loss on labeled data, generates pseudo-labels for unlabeled data through weak augmentation, and uses these pseudo-labels to compute an unsupervised loss. While these semi-supervised methods enhance model robustness by utilizing unlabeled data, challenges remain regarding perturbation selection and the reliability of pseudo-labels. In this paper, we propose a pseudo-label repair algorithm based on the physical properties of grains to improve pseudo-label Accuracy and mitigate the performance degradation caused by pseudo-label noise.

3. Methodology

3.1. Data Preprocessing

In this study, 21 OM images of size 2048 × 450 pixels were collected. The Labelme [27] tool was used to manually annotate the grain boundaries in 10 of these images (see Figure 1), while the remaining 11 were treated as unlabeled data for the semi-supervised training. Each OM image was cropped into four sub-images of 512 × 512 pixels. These sub-images underwent data augmentation, including rotation, flipping, and contrast adjustment. Consequently, we obtained 380 labeled images (340 for the training set and 40 for the test set) and 420 unlabeled images.
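For concreteness, the sketch below illustrates this preprocessing step in Python, assuming each 2048 × 450 image is zero-padded to 512 pixels in height before being cut into four 512 × 512 tiles (the paper does not state how the 450-pixel height is brought to 512); the file path handling and the contrast factor are illustrative assumptions.

```python
import numpy as np
from PIL import Image

def crop_to_tiles(path, tile=512):
    """Pad a 2048x450 OM image to 512 px in height and cut it into four tiles."""
    img = np.array(Image.open(path).convert("L"))   # shape (450, 2048)
    h, w = img.shape
    padded = np.zeros((tile, w), dtype=img.dtype)   # zero-pad height 450 -> 512
    padded[:h] = img
    return [padded[:, x:x + tile] for x in range(0, w, tile)]

def augment(tile):
    """Rotation, flipping, and contrast adjustment, as described above."""
    out = [tile, np.rot90(tile), np.fliplr(tile), np.flipud(tile)]
    contrast = np.clip(tile.astype(np.float32) * 1.2, 0, 255).astype(np.uint8)
    return out + [contrast]
```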

3.2. Semi-SRUnet Modeling Framework

In mathematical terms, this paper defines the OM image as $X \in \mathbb{R}^{W \times H \times C}$ and the label map for the Semi-SRUnet model as $Y \in \{0, 1\}^{W \times H \times C}$, with 0 denoting the background class and 1 denoting the grain boundary class. The dataset $D$ used by the model consists of $n$ labeled samples and $N$ unlabeled samples, denoted as $D = D_l \cup D_u$, where $D_l = \{(X_i^l, Y_i^l)\}_{i=1}^{n}$ and $D_u = \{X_i^u\}_{i=1}^{N}$.
The semi-supervised grain boundary segmentation model proposed in this paper is illustrated in Figure 2. The process involves the following steps: (i) The labeled data $D_l$ are used to train the teacher network. The teacher network learns the feature information of the grain boundaries from the OM image $X_i^l$ and generates predicted grain boundary images $Y_i^{p1}$ and boundary regression images $Y_i^{p2}$. These predictions are then compared with the ground truth labels $Y_i^l$ to train the teacher network and adjust its parameters. (ii) The trained teacher network segments the grain boundaries in the unlabeled OM images $X_i^u$ and generates pseudo-label images $Y_i^u$. The labels $Y_i^u$ are then repaired using the breakpoint connection method to obtain the repaired pseudo-label images $Y_i^f$. (iii) During training, we employ a random noise generation procedure that adds noise points, missing boundaries, and scratches to the labeled OM images $X_i^l$. Missing boundaries are simulated by binarizing and extracting regions with significant grain boundary features in the OM image and masking these regions; this more realistically reproduces the phenomenon of missing grain boundaries in real images. The noise generation strategy aims to enhance the model's generalization capability when faced with noise and distortions beyond those present in the training dataset. The OM images with random noise added and their corresponding ground truth labels are included in a dataset named MixData; the unlabeled OM images $X_i^u$ and repaired pseudo-label images $Y_i^f$ are also incorporated, so that $\mathrm{MixData} = \{(X_i^m, Y_i^m)\}_{i=1}^{M}$, where $M = 3n + N$. Subsequently, the $X_i^m$ are fed into the student network, which outputs $Y_i^{S1}$ and $Y_i^{S2}$; together with the predictions $Y_i^t$ of the trained teacher network, these are used for knowledge distillation to accelerate the student's feature learning. Finally, the predicted output images are compared with the ground truth labels to compute the loss value for the student network, and its parameters are adjusted accordingly to enhance model Accuracy. A minimal sketch of this three-stage pipeline is shown below.
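In the following Python sketch, every helper (the loss functions, the optimizer step `update`, the repair routine, and the noise generator) is passed in as a placeholder; only the overall control flow is taken from the description above.

```python
def train_semi_srunet(labeled, unlabeled, teacher, student,
                      t_loss, s_loss, update,
                      repair_pseudo_label, add_random_noise):
    # (i) supervised training of the teacher on the labeled pairs
    for x, y in labeled:
        y_p1, y_p2 = teacher(x)              # segmentation + boundary regression
        update(teacher, t_loss(y_p1, y_p2, y))

    # (ii) pseudo-labels for unlabeled images, then breakpoint repair (Algorithm 1)
    mixdata = list(labeled)
    for x in unlabeled:
        y_u = teacher(x)[0]
        mixdata.append((x, repair_pseudo_label(y_u)))
    # noise-augmented copies of the labeled images (noise points, scratches, gaps)
    mixdata += [(add_random_noise(x), y) for x, y in labeled]

    # (iii) student training with knowledge distillation from the teacher
    for x, y in mixdata:
        y_s1, y_s2 = student(x)
        y_t = teacher(x)[0]                  # soft targets for distillation
        update(student, s_loss(y_s1, y_s2, y, y_t))
```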

3.2.1. Teacher–Student Network

In this paper, we enhance U-Net by introducing SCConv and a boundary regression module to improve its capture of spatial and channel information and its edge segmentation performance. This improved version is named SRUnet (U-Net enhanced with SCConv and boundary regression), as shown in Figure 3. SCConv is composed of two parts, the Spatial Reconstruction Unit (SRU) and the Channel Reconstruction Unit (CRU) (Figure 3a). The SRU filters out irrelevant spatial details and the CRU optimizes key features across channels, helping the model focus on the most representative features even when the input image is distorted. SRUnet inserts three SCConv modules in the encoder to avoid extracting redundant features, and the convolved features are fused with the up-convolution features via skip connections. Likewise, three SCConv modules are inserted in the decoder to capture the spatial and channel information after feature fusion. Finally, two convolutions with Rectified Linear Units are applied to the features of the final up-convolution output to preserve the detail of local features and produce the output images. Additionally, three more convolutional layers with Rectified Linear Units are applied to the same output to expand the receptive field, capture contextual information, and accurately locate boundaries, producing the boundary regression images (Figure 3b). A sketch of these two output heads follows.
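Below is a minimal PyTorch sketch of the two output heads: two convolutions with ReLU for the segmentation output and three convolutions with ReLU for the boundary regression output. The channel width and kernel sizes are assumptions, as the paper does not list them.

```python
import torch.nn as nn

class SRUnetHeads(nn.Module):
    """Segmentation and boundary regression heads on the final up-conv features."""
    def __init__(self, in_ch=64):
        super().__init__()
        self.seg_head = nn.Sequential(        # two convs: preserve local detail
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, 1, 1),
        )
        self.boundary_head = nn.Sequential(   # three convs: wider receptive field
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, 1, 3, padding=1),
        )

    def forward(self, feats):
        return self.seg_head(feats), self.boundary_head(feats)
```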
In the semi-supervised grain boundary segmentation method proposed in this paper, the teacher network is based on SRUnet. The student network also uses SRUnet but without the SCConv modules, and it receives guidance from the teacher network through knowledge distillation. This approach avoids the need for the student model to learn from scratch, helps it find the correct direction more quickly, reduces the search space, and improves learning efficiency.
The output images and regression images of the SRUnet network are compared with the ground truth labels using two loss functions: binary cross entropy with logits ($loss_1$), defined in (1), and mean squared error ($loss_2$), defined in (2). The teacher uses the loss function $T_{loss}$ defined in (4), while the student's loss function $S_{loss}$ is defined in (5). $S_{loss}$ builds on the teacher's objective and introduces a knowledge distillation loss ($loss_d$) in the form of the Kullback–Leibler divergence, defined in (3), to learn the prior knowledge of the teacher. The calculation formulas are as follows:
$loss_1 = -\frac{1}{M}\sum_{i=1}^{M}\left[y_i \times \log_e(\sigma(p_i)) + (1 - y_i) \times \log_e(1 - \sigma(p_i))\right]$ (1)

$loss_2 = \frac{1}{M}\sum_{i=1}^{M}\left\|y_i - \hat{y}_i\right\|^2$ (2)

$loss_d = \frac{1}{M}\sum_{i=1}^{M}\sigma(p_i) \times \log_e\frac{\sigma(p_i)}{\sigma(q_i)}$ (3)

$T_{loss} = \alpha \times loss_1 + \beta \times loss_2$ (4)

$S_{loss} = \lambda \times loss_d + \omega \times loss_1 + \mu \times loss_2$ (5)
where $M$ denotes the number of samples in the training set, $p_i$ denotes the predicted output of the student model, $q_i$ denotes the predicted output of the teacher model, $\hat{y}_i$ denotes the regression images, $y_i$ denotes the ground truth labels, $\sigma$ denotes the sigmoid activation function, and $\alpha$, $\beta$ and $\lambda$, $\omega$, $\mu$ denote the weights of the loss terms. A PyTorch sketch of these objectives is given below.
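The sketch assumes raw logits p (student) and q (teacher), a regression output y_hat, and binary float targets y; the weight values follow Section 3.3, and the small clamps guarding the logarithm are an added numerical safeguard.

```python
import torch
import torch.nn.functional as F

def t_loss(p, y_hat, y, alpha=0.8, beta=0.2):
    loss1 = F.binary_cross_entropy_with_logits(p, y)   # Eq. (1)
    loss2 = F.mse_loss(y_hat, y)                       # Eq. (2)
    return alpha * loss1 + beta * loss2                # Eq. (4)

def s_loss(p, q, y_hat, y, lam=0.88, omega=0.02, mu=0.1):
    sp = torch.sigmoid(p).clamp_min(1e-8)
    sq = torch.sigmoid(q).clamp_min(1e-8)
    loss_d = (sp * torch.log(sp / sq)).mean()          # Eq. (3), KL form
    loss1 = F.binary_cross_entropy_with_logits(p, y)
    loss2 = F.mse_loss(y_hat, y)
    return lam * loss_d + omega * loss1 + mu * loss2   # Eq. (5)
```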

3.2.2. Pseudo-Label Repair

Since only a small amount of labeled data is used for supervised learning, the pseudo-labels predicted by the teacher network contain some errors compared to the real ground truth labels. Therefore, this paper presents an algorithm for pseudo-label repair, as outlined in Algorithm 1:
Algorithm 1: Pseudo-label repair
Input: pseudo-label $Y_i^u$
Output: fixed label $Y_i^f$
1:  img = thinning($Y_i^u$)
2:  p_b = break_detect(img)
3:  if distance(p_bi, p_bj) < length_1 and (|θ_bi − θ_bj| > θ_0 or |θ_bi − θ_bj| < (180 − θ_0)):
        Connect(img, p_bi, p_bj)
4:  p_newb = break_detect(img)
5:  p_f = forkpoint_detect(img)
6:  if distance(p_newbi, p_fj) < length_2 and |θ_newbi − angle(p_newbi, p_fj)| < θ_1:
        Connect(img, p_newbi, p_fj)
7:  return to step 2, decreasing θ_0 and increasing length_1, until θ_0 = 0
8:  p_remain = break_detect(img)
9:  for each remaining breakpoint p_remaini = (x, y):
        while k < n:
            x = x + cos(θ_r); y = y + sin(θ_r)
            if (x, y) is a grain boundary point:
                Connect(img, (x, y), p_remaini); break
            else if (x, y) is a border point:
                break
10: $Y_i^f$ = dilate(img)
11: return $Y_i^f$
The main steps of Algorithm 1 are as follows. (1) Perform skeleton extraction on the input pseudo-label image $Y_i^u$. (2) Iterate over each non-zero pixel in the image, treating it as the center, and examine its eight neighboring pixels; if exactly one of these neighbors is non-zero, the central pixel is considered a breakpoint. This continues until all breakpoints in the image are detected, yielding a set of breakpoint coordinates $p_b$. (3) Iterate over any two breakpoints $p_{bi}$ and $p_{bj}$ in the set $p_b$. If the distance between the two points is less than the threshold $length_1$, and $|\theta_{bi} - \theta_{bj}|$ is greater than the threshold $\theta_0$ or less than $(180 - \theta_0)$, connect the two breakpoints $p_{bi}$ and $p_{bj}$. Here, $\theta_{bi}$ and $\theta_{bj}$ are the direction angles of the line segments at $p_{bi}$ and $p_{bj}$, respectively, and $\theta_0$ is the threshold. (4) Repeat the breakpoint detection of step (2) to obtain a new set of breakpoints $p_{newb}$. (5) Apply morphological operations to detect all fork points of the grain boundaries in the image, obtaining the set of fork points $p_f$. (6) Traverse all combinations of fork points and breakpoints, say $p_{fj}$ and $p_{newbi}$; connect them if the distance between $p_{fj}$ and $p_{newbi}$ is less than $length_2$ and the absolute difference between $\theta_{newbi}$ and angle($p_{newbi}$, $p_{fj}$) is less than $\theta_1$. (7) Return to step (2) and continue with the subsequent steps, decreasing $\theta_0$ and increasing $length_1$ until $\theta_0 = 0$. (8) After the breakpoints and fork points are connected, repeat the detection of step (2) to find the remaining grain boundary breakpoints $p_{remain}$. (9) Extend each remaining breakpoint $p_{remain}$ at $(x, y)$ by iterating $x = x + \cos(\theta_r)$ and $y = y + \sin(\theta_r)$ until a grain boundary pixel is reached, and then connect $p_{remain}$ to that pixel. (10) Perform a dilation operation on the breakpoint-connected image img to restore the grain boundary thickness of the ground truth labels. (11) Return the repaired pseudo-label image $Y_i^f$. The breakpoint test of step (2) is sketched below.
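A minimal sketch of the breakpoint test of step (2), assuming the skeletonized label is a binary (0/1) NumPy array; this is an illustration, not the authors' code.

```python
import numpy as np

def detect_breakpoints(skeleton):
    """Return skeleton pixels with exactly one foreground pixel among 8 neighbours."""
    points = []
    ys, xs = np.nonzero(skeleton)
    for y, x in zip(ys, xs):
        window = skeleton[max(y - 1, 0):y + 2, max(x - 1, 0):x + 2]
        if window.sum() - skeleton[y, x] == 1:   # exactly one neighbour
            points.append((y, x))
    return points
```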
To avoid introducing noise through erroneous connections between breakpoints, appropriate thresholds must be set for both length and angle. Based on the polygonal character of grain boundaries, when connecting two breakpoints, the target point typically falls within an angular range of (−60°, 60°) relative to the breakpoint; the same holds for the direction angle between a breakpoint and a bifurcation point. We therefore conducted a grid search over 0 to 100 pixels, performing multiple tests and observing the connection results, and determined the optimal connection length to be 30 pixels. Given that the side length of our images is 512 pixels, and considering that different image sizes may require different connection lengths, we expressed the pixel length as a proportion of the image side length for $length_1$ and $length_2$; the final optimal values of $length_1$ and $length_2$ are thus 5.85%. Within the angular range of (−60°, 60°), the same testing procedure yielded the optimal angle thresholds $\theta_1 = 30°$ and $(180° - \theta_0) = 20°$. When the directions of the line segments at two breakpoints are nearly opposite, i.e., close to 180°, the grain boundary may be interrupted in the middle; therefore $|\theta_{bi} - \theta_{bj}| > \theta_0$ also meets our connection criterion. Nevertheless, there are situations where breakpoint connection is impossible: for example, if the grain boundaries in the pseudo-label image are severely missing, the distances between breakpoints become too long, no grain boundary points can be found by extending the breakpoints, and the repair cannot be completed.
Using Algorithm 1, skeleton extraction is first performed on the pseudo-label $Y_i^u$, as shown in Figure 4a. Subsequently, $length_1 = 5.85\%$ (a proportion of the image side length) and $\theta_0 = 160°$ are set to connect the pairs of breakpoints that meet the connection conditions, as shown in Figure 4b. Next, the thresholds $length_2 = 5.85\%$ and $\theta_1 = 30°$ are applied to connect breakpoints to fork points, as shown in Figure 4c. Then, line segments are extended from the remaining breakpoints until they reach grain boundary points, as shown in Figure 4d. Finally, morphological dilation is applied to expand the grain boundaries, and the breakpoint-connected image is black-and-white inverted to obtain the repaired pseudo-label image $Y_i^f$, as shown in Figure 4e.
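The breakpoint extension of step (9) can be sketched as follows, marching one pixel at a time along the breakpoint's direction angle θ_r until a grain boundary pixel or the image border is reached; the step limit and the assumption that θ_r is already estimated are simplifications.

```python
import numpy as np

def extend_breakpoint(skeleton, y, x, theta_r, max_steps=100):
    """March from breakpoint (y, x) along theta_r; return the boundary pixel hit."""
    fy, fx = float(y), float(x)
    for _ in range(max_steps):
        fy += np.sin(theta_r)
        fx += np.cos(theta_r)
        iy, ix = int(round(fy)), int(round(fx))
        if not (0 <= iy < skeleton.shape[0] and 0 <= ix < skeleton.shape[1]):
            return None                       # reached the image border
        if skeleton[iy, ix]:                  # hit an existing boundary pixel
            return iy, ix
    return None
```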

3.3. Training Details

In this paper, we empirically determine the optimal hyperparameters for the teacher and student networks. Both networks are trained using the RMSProp optimizer with a batch size of 2, a learning rate of 0.00001, and a weight decay of 1 × 10⁻⁸ to prevent overfitting. The momentum is set to 0.9 to facilitate faster convergence and improve model performance. After experimentation and testing, the weights $\alpha$ and $\beta$ of the teacher loss $T_{loss}$ were set to 0.8 and 0.2, and the weights $\lambda$, $\omega$, and $\mu$ of the student loss $S_{loss}$ were set to 0.88, 0.02, and 0.1, respectively. All training and testing were performed on a machine with an Intel(R) Xeon(R) Platinum 8352V CPU @ 2.10 GHz, an NVIDIA RTX 3090 GPU, Ubuntu 20.04, and PyTorch 1.10.0.
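Expressed in PyTorch (the framework listed above), the optimizer configuration reads as follows; `model` is a placeholder standing in for SRUnet.

```python
import torch

model = torch.nn.Conv2d(1, 1, 3)   # placeholder for the SRUnet network

optimizer = torch.optim.RMSprop(
    model.parameters(),
    lr=1e-5,            # learning rate 0.00001
    weight_decay=1e-8,  # regularization against overfitting
    momentum=0.9,       # faster convergence
)
```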

4. Experiments and Results

4.1. Dataset Partition

The experiments in this paper use three strategies for partitioning the 340 labeled and 420 unlabeled images of the training set, as detailed in Table 1; the effectiveness of each method is then evaluated on the test set. Specifically, when the strategy is set to 0.5, 170 labeled images and 420 unlabeled images are used for training; when the strategy is set to 0.75, 255 labeled images and 420 unlabeled images are used; and when the strategy is set to 1, all 340 labeled images and 420 unlabeled images are utilized.

4.2. Evaluation Metrics

In this paper, four evaluation metrics are used to assess the model’s performance: mean precision (mPrecision), mean recall (mRecall), Accuracy, and mean intersection over union (mIoU).
Since grain boundary segmentation is a binary classification task, confusion matrices are constructed for the model’s predictions, including true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Based on the confusion matrix, the evaluation metrics are defined as:
$mPrecision = \mathrm{mean}\left(\frac{TP}{TP + FP}\right) \times 100\%$ (6)

$mRecall = \mathrm{mean}\left(\frac{TP}{TP + FN}\right) \times 100\%$ (7)

$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \times 100\%$ (8)

$mIoU = \mathrm{mean}\left(\frac{TP}{TP + FP + FN}\right) \times 100\%$ (9)
where mPrecision, mRecall, and mIoU are the average indicator values for the two categories of background and grain boundary predicted by the model.
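A small NumPy sketch of Equations (6)–(9), computing the per-class statistics for background (0) and grain boundary (1) from binary prediction and label arrays and averaging them for the mean variants:

```python
import numpy as np

def evaluate(pred, label):
    """Compute mPrecision, mRecall, Accuracy, and mIoU from binary arrays."""
    prec, rec, iou = [], [], []
    for cls in (0, 1):                        # background, grain boundary
        tp = np.sum((pred == cls) & (label == cls))
        fp = np.sum((pred == cls) & (label != cls))
        fn = np.sum((pred != cls) & (label == cls))
        prec.append(tp / (tp + fp + 1e-8))
        rec.append(tp / (tp + fn + 1e-8))
        iou.append(tp / (tp + fp + fn + 1e-8))
    return {
        "mPrecision": 100 * np.mean(prec),          # Eq. (6)
        "mRecall": 100 * np.mean(rec),              # Eq. (7)
        "Accuracy": 100 * np.mean(pred == label),   # Eq. (8)
        "mIoU": 100 * np.mean(iou),                 # Eq. (9)
    }
```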

4.3. Comparison with Baseline Models

This section compares our SRUnet model with the baseline U-Net model. The U-Net model uses the same data strategy (Strategy = 1) and experimental configuration but does not include the SCConv module or the boundary regression (BR) module.
As shown in Table 2, the improved SRUnet significantly outperforms the baseline U-Net model in terms of mIoU, mRecall, and mPrecision. Notably, introducing the BR module into the baseline model improved Accuracy by 0.17%, with the other three evaluation metrics increasing by more than 1%. These results suggest that the BR module can more accurately capture object contours, thereby enhancing the boundary precision of segmentation results. Comparing results from (1) and (3), adding the SCConv module to the baseline model increased mIoU by 3.98%, mRecall by 5.09%, mPrecision by 3.26%, and Accuracy by 0.7%. This demonstrates that the SCConv module, through spatial and channel reconstruction, better preserves and conveys critical spatial information, reducing information loss and thus improving segmentation performance. However, our Semi-SRUnet model is an integrated model where each component is indispensable. Therefore, rather than analyzing the model through ablation experiments, this paper evaluates and compares the model’s performance across different proportions of labeled data to demonstrate its effectiveness.

4.4. Comparison with Supervised Models

In this paper, four state-of-the-art supervised models in the field of semantic segmentation, namely U-Net, UNet++ [28], ResUNet++ [29], and DSCNet [30], are compared against our SRUnet and Semi-SRUnet models. The experiments were conducted using each of the three dataset strategies outlined in Table 1 (the supervised models did not use the unlabeled data), and the test results are presented in Table 3. As shown in the table, when the strategy is set to 0.5, our SRUnet demonstrates improvements in all three metrics—mIoU, mRecall, and mPrecision—compared to the other segmentation models, with increases of 4.33%, 7.87%, and 0.12%, respectively. However, its Accuracy drops by 0.31% compared to the DSCNet model. For the Semi-SRUnet model, all four metrics show improvements, with mIoU, mRecall, mPrecision, and Accuracy increasing by 8.18%, 12.9%, 3.54%, and 0.57%, respectively. When the strategy is adjusted to 0.75, the advantages of our models become more apparent. Specifically, SRUnet sees increases of 4.71% in mIoU, 8.69% in mRecall, 1.53% in mPrecision, and 0.14% in Accuracy, while Semi-SRUnet's metrics improve significantly by 25.3% in mIoU, 23.21% in mRecall, 18.1% in mPrecision, and 3.80% in Accuracy. When the strategy is set to 1, the magnitude of improvement over the other segmentation models decreases, but our models still outperform them. For example, SRUnet's metrics increase by 1.21% in mIoU, 1.36% in mRecall, 0.93% in mPrecision, and 0.2% in Accuracy, while Semi-SRUnet's metrics improve by 19.7% in mIoU, 16.69% in mRecall, 13.79% in mPrecision, and 2.91% in Accuracy.
It can be seen that our SRUnet model captures the boundary and shape of the target more accurately and improves the Accuracy of grain boundary detection after incorporating SCConv and boundary regression. Moreover, our semi-supervised model, Semi-SRUnet, achieves a reduced labeling cost and a substantial improvement in detection Accuracy by using knowledge distillation to accelerate network feature extraction and by repairing the pseudo-labels.
We use strategy = 1 for training and testing, and the three test images from the test set are shown in Figure 5. As seen in the figure, certain regions (scratches circled by green and blurred grain boundaries circled by blue) in the four models—U-Net, UNet++, ResUNet++, and DSCNet—are not accurately segmented to capture the grain boundaries. Additionally, three of the models, except for U-Net, fail to remove the noise points circled in red. Since the noise points and scratches have similar pixel values to the grain boundaries, and the regions with blurred boundaries have the same pixel intensity as the background, the models mistakenly identify the noise points and scratches as grain boundaries and the blurred regions as background. This issue arises because these models lack the ability to improve generalization and fail to adequately learn the texture, shape, and proximity features of the grains. In contrast, our semi-SRUnet model introduces a random noise generation module to enhance the model’s generalization and employs boundary regression along with the SCConv module to extract grain features at a deeper level. Consequently, as seen in the segmentation results of our model, noise points and scratches are effectively removed, and grain boundaries in regions with blurred boundaries are accurately segmented, avoiding problems such as missing or disconnected grain boundaries.

4.5. Comparison with Semi-Supervised Models

In this paper, we compare four mainstream semi-supervised methods: Mean Teacher (MT) [22], Uncertainty-guided Collaborative Mean Teacher (UCMT) [31], Self-Supervised Correction (SSC) [32], and Cross-level Contrastive Learning and Consistency Constraint (CLCC) [23]. To validate the effectiveness of our model, we used the three dataset partitioning strategies outlined in Table 1 for training under the same experimental setup.
The comparison results of the models are presented in Table 4. Our Semi-SRUnet model significantly outperforms the other mainstream semi-supervised models. Specifically, when using the strategy of 0.5, our model improves mIoU, mRecall, and mPrecision by 6.34%, 8.51%, and 5.02%, respectively, compared to CLCC. In terms of Accuracy, our model shows a 1.03% improvement over SSC, which is the best-performing model among the others. When using the strategy of 0.75, compared to the best-performing CLCC model, our model achieved improvements of 25.25% in mIoU, 24.81% in mRecall, 17.7% in mPrecision, and 3.93% in Accuracy. Additionally, when using the strategy of 1, our model significantly outperforms the other four mainstream semi-supervised models.
To provide a more subjective comparison, we tested the model trained with strategy 1 and visualized the results, as shown in Figure 6. From the noise points indicated by red circles, it can be observed that except for UC-MT, the other three models are unable to remove these noise points. For the scratches indicated by the blue circles, the other four semi-supervised models mistakenly identify them as crystal boundaries. In contrast, our model effectively removes the noise points and is not affected by the scratches. Additionally, our Semi-SRUnet model can segment the blurred parts of the crystal boundaries, while the other four mainstream semi-supervised models mistakenly recognize these blurred boundary areas as background, resulting in broken and discontinuous boundaries.
Therefore, through performance comparisons with different proportions of labeled data and intuitive visual assessments, it is demonstrated that our model can significantly improve the Accuracy and completeness of crystal boundary segmentation with limited labeled data, showing better generalization ability and robustness.

4.6. Comparison with Unsupervised Models

In this paper, we compare unsupervised models, including R2V (real-to-virtual) with regularization [16], the Watershed algorithm [33], and Canny edge detection [34], with our Semi-SRUnet model (Figure 7). As shown in the red rectangular area of the figure, compared to Semi-SRUnet, the Canny and Watershed algorithms only provide vague segmentation of the grain boundary contours and fail to remove noise points and scratches. Although R2V segments the contours of the grain boundaries clearly, it also fails to remove noise points and scratches. For regions with fuzzy boundaries, the three unsupervised models predict them as background, leading to issues such as multiple grains merging into one large grain or broken grain boundaries. Although these unsupervised models can be used without labeling cost, their segmentation Accuracy is much lower than that of the supervised models. In practical application scenarios, these errors would significantly distort the subsequent quantitative analysis of the grains and are therefore unacceptable. Our Semi-SRUnet model, which uses a small amount of labeled data, significantly improves grain boundary detection Accuracy, effectively addressing the issues of labeling cost and segmentation precision.

4.7. Comprehensive Analysis and Discussion

4.7.1. Ensemble Techniques for Model Enhancement

Ensemble techniques [35] have been widely utilized in deep learning as they combine the advantages of multiple models to reduce variance and bias, thereby enhancing generalization capabilities. In this study, we applied both bagging and cascading ensemble techniques to our semi-SRUnet model. Specifically, we replaced the student network in the semi-supervised SRUnet architecture with ensembles of three advanced supervised models: SRUnet, DSCNet, and U-Net. The resulting models, named semi-SRUnet(Bagging) and semi-SRUnet(Cascading), were evaluated using the strategy = 0.5 dataset to analyze their performance. As indicated in Table 5, the improved semi-SRUnet(Bagging) and semi-SRUnet(Cascading) models show improvements in four evaluation metrics, demonstrating that these ensemble techniques effectively enhance model performance. However, it is important to note that employing ensemble methods may increase computational costs. Therefore, when computational resources permit, utilizing ensemble techniques can further boost the Accuracy of the semi-SRUnet model.
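As a concrete illustration of the bagging variant, the sketch below averages the sigmoid outputs of the three ensembled models; it assumes each model returns a single segmentation logit map, which simplifies SRUnet's two-headed output.

```python
import torch

def bagging_predict(models, x, threshold=0.5):
    """Average the probability maps of the ensembled models, then threshold."""
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(m(x)) for m in models])
    return (probs.mean(dim=0) > threshold).float()
```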

4.7.2. Qualitative Comparison with Advanced Grain Boundary Segmentation Models

We conducted a qualitative comparison of recent boundary segmentation methods with our proposed model. Due to the lack of specific names for two of these methods, they are referred to as Method A [36] and Method B [37] in this study. As shown in Table 6, MPU-Net [38] employs pruning strategies and adaptive particle redistribution techniques to further correct over-segmentation errors. However, this method is fully supervised learning, which incurs high labeling costs, and its correction strategy is limited to scenarios where only partial overlap between grains occurs. Method A employs data augmentation strategies such as random noise generation and simulated defects to expand the dataset, followed by training with a U-Net network. However, this method is associated with complex data preprocessing, high labeling costs, and occasional discontinuities in the predicted grains. Method B leverages EBSD instrumentation to generate label data and applies supervised models like U-Net for boundary segmentation, thus avoiding manual annotation. However, the equipment costs for EBSD instrumentation are high, and this approach merely shifts the labor costs to equipment costs. Furthermore, discrepancies between SEM and EBSD can lead to pixel drift, causing mismatches between SEM images and label data.
In contrast, our proposed semi-SRUnet model integrates a teacher–student network framework and improves U-Net, reduces labeling costs, and enhances denoising capabilities. The pseudo-label refinement algorithm further improves label Accuracy. Additionally, we conducted tests with the MPU-Net model under the same experimental conditions and using the data strategy of strategy = 1. The other two methods could not be compared due to the unavailability of their code. Comparative analysis of mIoU and Accuracy with the MPU-Net model indicates that our approach not only reduces labeling costs but also achieves higher Accuracy.

4.7.3. Comparison of Model Complexity and Training Time

In this section, we utilize Params (parameters) and FLOPs (floating-point operations) as metrics to assess model complexity. Additionally, we evaluate the training time per epoch under the strategy = 1 data strategy and consistent experimental conditions. As shown in Table 7, our semi-SRUnet model exhibits the lowest FLOPs among the compared semi-supervised models, even lower than those of the unsupervised models. In terms of parameters, our model ranks second smallest among the semi-supervised models. In terms of training time, our model has more parameters than the UCMT model, resulting in a training time about 4 s per epoch longer than UCMT's; however, compared to the other semi-supervised models, its training time is the shortest, and it even outperforms the DSCNet model. This indicates that our semi-SRUnet model performs exceptionally well among semi-supervised models, offering not only optimal computational complexity but also relatively fast training speed. When compared to certain supervised models, our model's complexity and training time may increase due to the necessity of learning from both labeled and unlabeled data to enhance Accuracy. Nevertheless, our model effectively keeps complexity and training time within an acceptable range, achieving a balanced trade-off between model complexity and Accuracy.

4.7.4. Performance with Varying Labeled Ratios

In this section, we conduct experiments with varying label ratios to thoroughly analyze the performance of the proposed semi-SRUnet model. As shown in Table 8, when training with only 34 labeled images, the Accuracy remains at 94.24%, while mRecall and mPrecision are 77.37% and 70.05%, respectively. However, mIoU is relatively low at 63.73%. This result is likely due to the insufficient amount of labeled data, which limits the teacher network to learning only partial boundary features, leading to significant boundaries missing in the generated pseudo-labels. In the process of pseudo-label repair, the large area of missing grain boundaries may result in the distance between the breakpoint and the target point being too long. Additionally, grain boundary points may not be found during breakpoint extension, leading to incomplete breakpoint repair and, consequently, affecting the model’s Accuracy.
As the amount of labeled data increases, particularly with the use of 255 labeled images and 420 unlabeled images, the model performance improves significantly, with mRecall reaching 96.57%. This enhancement in performance is primarily due to the fact that, at this label ratio, the teacher network is able to capture more feature information, resulting in pseudo-labels that do not exhibit large areas of blank regions. This allows the pseudo-label refinement algorithm to more effectively repair the pseudo-labels and fully leverage the unlabeled data, thereby significantly improving the overall performance of the model.
Therefore, to achieve reliable grain boundary segmentation with the proposed semi-SRUnet model, the proportion of labeled data in the training set should reach 37.70% (i.e., 37.70% labeled and 62.30% unlabeled data).

4.7.5. Performance under High Noise

To validate the denoising capability of our proposed model, we introduced noise types such as missing boundaries, noise points, and scratches into the test set to simulate the severe noise conditions that might be encountered in actual OM boundary images, and evaluated the model's performance under these conditions. We used the model trained under the strategy = 1 data strategy to test the noise-added test sets and compared the Accuracy. As shown in Table 9, the model is largely unaffected by scratches and noise points; under the missing-boundary noise, mIoU and mRecall decreased by 4.48% and 3.34%, respectively, but Accuracy remained at a relatively high level.
Moreover, SEM and TEM images are influenced by electron noise [39] generated during the interaction of the electron beam with the sample, which intensifies as signal strength increases and typically exhibits Gaussian distribution characteristics [40]. Concurrently, the scattering of electrons passing through the sample introduces scattering noise [41], resulting in image blurring [42]. Additionally, the thermal motion of electron detectors and associated electronic components generates thermal noise [43], which generally manifests as uniform background noise, akin to low-frequency noise in images. To preliminarily assess the denoising capability of our model on different types of images, we simulated electron and scattering noise in SEM and TEM images using Gaussian noise and modeled thermal noise using uniform noise, as sketched below. As shown in Table 9, despite the addition of Gaussian and uniform noise, mIoU and mRecall decreased by only approximately 5%, and the model maintained a high overall Accuracy, reliably performing the boundary segmentation task. It is worth noting that our model has not yet undergone data augmentation with simulated SEM and TEM noise, which represents a direction for further improvement of the model's performance. These preliminary noise simulation tests therefore provide a solid foundation for applying the semi-SRUnet model to SEM and TEM images.
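The noise used in these tests can be sketched as follows: additive Gaussian noise for the electron and scattering noise, and additive uniform noise for the thermal noise. The standard deviation and amplitude are illustrative assumptions, not values from the paper.

```python
import numpy as np

def add_gaussian_noise(img, sigma=15.0):
    """Simulate electron/scattering noise with zero-mean Gaussian noise."""
    noisy = img.astype(np.float32) + np.random.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_uniform_noise(img, amp=20.0):
    """Simulate thermal noise with uniform background noise."""
    noisy = img.astype(np.float32) + np.random.uniform(-amp, amp, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```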
The prediction results of our model on images with various added noise are shown in Figure 8. It can be observed that boundary gaps, noise points, and scratches did not significantly affect the model, as the predictions are nearly identical to the ground truth labels. However, uniform and Gaussian noise caused a few scratches in the original images to be mistakenly identified as crystal boundaries. Despite this, the overall prediction results remain consistent with the ground truth labels and do not affect the integrity and granularity of the grains.

5. Conclusions

Although many methods exist for grain boundary segmentation, including traditional machine learning techniques (such as Canny and Watershed) and unsupervised, semi-supervised, and supervised deep learning models, these methods suffer from labeling cost and interference by noise. Therefore, this study proposes a semi-supervised grain boundary segmentation method, Semi-SRUnet, to address these challenges. The validity of the method was verified through extensive experimental and comparative tests, with the following results:
(1) By introducing SCConv and boundary regression into the U-Net model, the performance of the model surpasses that of state-of-the-art supervised models, with significant improvements in both mIoU and mRecall.
(2) Our Semi-SRUnet model, built on the SRUnet model, introduces knowledge distillation to guide the training of the student network and predicts pseudo-labels for unlabeled images to augment the training dataset.
(3) Our proposed pseudo-label repair algorithm fixes breakpoint defects in pseudo-label images, improves the Accuracy of the pseudo-labels, and addresses the problem that the labeled data in semi-supervised learning are too few to cover all grain boundary features.
(4) Our Semi-SRUnet model, at a small labeling cost, greatly improves robustness, especially in dealing with defects such as noise points, scratches, and grain boundary blurring. Notably, when strategy = 1, Accuracy increased to 98.77%, mRecall reached 96.57%, mPrecision reached 91.50%, and mIoU reached 88.86%.
However, we acknowledge certain limitations in our work. Specifically, our pseudo-label repair algorithm may struggle to rectify issues when there are severe missing boundaries or large blank areas in the pseudo-labels. While the complexity of our model is manageable, it still requires a degree of computational resources. Furthermore, although the model demonstrates excellent segmentation performance on OM images, it is necessary to evaluate its effectiveness against evolving imaging technologies to enhance its applicability.
In future work, we plan to extend our model to microstructure detection across images captured by various devices, including SEM, TEM, and XCT, to enhance its robustness. Additionally, we will focus on optimizing the model to reduce its complexity. We also aim to advance our pseudo-label repair algorithm by integrating more sophisticated image repair models, such as deep learning-based generative adversarial networks (GANs), to more effectively address severe missing data and large blank regions. These efforts will contribute to the advancement and application of deep learning techniques in quantitative microstructure analysis.

Author Contributions

Conceptualization, Y.H., J.L. and S.W.; Formal analysis, Y.H.; Funding acquisition, X.Z. and F.M.; Investigation, Y.H.; Methodology, Y.H., X.Z., J.L. and S.W.; Project administration, X.Z.; Resources, X.Z.; Supervision, X.Z., F.M., J.L. and S.W.; Writing—original draft, Y.H.; Writing—review and editing, Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Class III Peak Discipline of Shanghai—Materials Science and Engineering (High-Energy Beam Intelligent Processing and Green Manufacturing, Project Number: 0235-A1-5300-23-0302) and National Key R&D Program of China under grant 2020AAA0109300.

Data Availability Statement

The data of this study are available from the corresponding author upon request.

Acknowledgments

We express our gratitude to the Editors and Reviewers for their valuable comments.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Randle, V. Grain Boundary Engineering: An Overview after 25 Years. Mater. Sci. Technol. 2010, 26, 253–261. [Google Scholar]
  2. Ke, R.; Hu, C.; Zhong, M.; Wan, X.; Wu, K. Grain Refinement Strengthening Mechanism of an Austenitic Stainless Steel: Critically Analyze the Impacts of Grain Interior and Grain Boundary. J. Mater. Res. Technol. 2022, 17, 2999–3012. [Google Scholar] [CrossRef]
  3. You, Z.Y.; Tang, Z.Y.; Li, J.P.; Chu, F.B.; Ding, H.; Misra, R.D.K. Effect of Grain Boundary Engineering on Grain Boundary Character Distribution and Deformation Behavior of a TRIP-Assisted High-Entropy Alloy. Mater. Charact. 2023, 205, 113294. [Google Scholar] [CrossRef]
  4. Tan, Y.B.; Zeng, M.T.; Zhang, W.W.; Yang, Y.; Zhou, Y.L.; Zhao, F.; Xiang, S. High-Temperature Tensile Properties and Strengthening Mechanism of Cryo-Rolled MP159 Superalloy Strengthened by Deformation Nano-Twins. Mater. Charact. 2024, 209, 113692. [Google Scholar] [CrossRef]
  5. Somekawa, H.; Basha, D.A.; Singh, A. Change in Dominant Deformation Mechanism of Mg Alloy via Grain Boundary Control. Mater. Sci. Eng. A 2019, 746, 162–166. [Google Scholar] [CrossRef]
  6. Goldstein, J.I.; Newbury, D.E.; Michael, J.R.; Ritchie, N.W.M.; Scott, J.H.J.; Joy, D.C. Scanning Electron Microscopy and X-Ray Microanalysis; Springer: Berlin/Heidelberg, Germany, 2017. [Google Scholar]
  7. Arregui-Mena, J.D.; Worth, R.N.; Bodel, W.; März, B.; Li, W.; Selby, A.; Campbell, A.A.; Contescu, C.; Edmondson, P.D.; Gallego, N. SEM and TEM Data of Nuclear Graphite and Glassy Carbon Microstructures. Data Brief 2022, 46, 108808. [Google Scholar] [CrossRef]
  8. Hypolite, G.; Vicente, J.; Taligrot, H.; Moulin, P. X-ray Tomography Crystal Characterization: Growth Monitoring. J. Cryst. Growth 2023, 612, 127187. [Google Scholar] [CrossRef]
  9. Davidson, M.W.; Abramowitz, M. Optical Microscopy. Encycl. Imaging Sci. Technol. 2002, 2, 120. [Google Scholar]
  10. Ma, B.; Ban, X.; Su, Y.; Liu, C.; Wang, H.; Xue, W.; Zhi, Y.; Wu, D. Fast-FineCut: Grain Boundary Detection in Microscopic Images Considering 3D Information. Micron 2019, 116, 5–14. [Google Scholar] [CrossRef]
  11. Wang, Y.H.; He, Q.; Xie, Z. Grain Boundary Extraction Method Based on Pixel Relationship. Measurement 2022, 202, 111796. [Google Scholar] [CrossRef]
  12. Gajalakshmi, K.; Palanivel, S.; Nalini, N.J.; Saravanan, S.; Raghukandan, K. Grain Size Measurement in Optical Microstructure Using Support Vector Regression. Optik 2017, 138, 320–327. [Google Scholar] [CrossRef]
  13. Peregrina-Barreto, H.; Terol-Villalobos, I.R.; Rangel-Magdaleno, J.J.; Herrera-Navarro, A.M.; Morales-Hernández, L.A.; Manríquez-Guerrero, F. Automatic Grain Size Determination in Microstructures Using Image Processing. Measurement 2013, 46, 249–258. [Google Scholar] [CrossRef]
  14. Li, M.; Chen, D.; Liu, S.; Liu, F. Grain Boundary Detection and Second Phase Segmentation Based on Multi-Task Learning and Generative Adversarial Network. Measurement 2020, 162, 107857. [Google Scholar] [CrossRef]
  15. Wang, N.; Guan, H.; Wang, J.; Zhou, J.; Gao, W.; Jiang, W.; Zhang, Y.; Zhang, Z. A Deep Learning-Based Approach for Segmentation and Identification of δ Phase for Inconel 718 Alloy with Different Compression Deformation. Mater. Today Commun. 2022, 33, 104954. [Google Scholar] [CrossRef]
  16. Na, J.; Lee, J.; Kang, S.H.; Kim, S.J.; Lee, S. Label-Free Grain Segmentation for Optical Microscopy Images via Unsupervised Image-to-Image Translation. Mater. Charact. 2023, 206, 113410. [Google Scholar] [CrossRef]
  17. Li, M.; Chen, D.; Liu, S.; Liu, F. Semisupervised Boundary Detection for Aluminum Grains Combined with Transfer Learning and Region Growing. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 6158–6172. [Google Scholar] [CrossRef] [PubMed]
  18. Liang, H.; Burgio, L.; Bailey, K.; Lucian, A.; Dilley, C.; Bellesia, S.; Cheung, C.; Brooks, C. Distilling the Knowledge in a Neural Network (Godfather’s Work). Stud. Conserv. 2014, 59. [Google Scholar] [CrossRef]
  19. Muksimova, S.; Umirzakova, S.; Mardieva, S.; Cho, Y.I. Enhancing Medical Image Denoising with Innovative Teacher–Student Model-Based Approaches for Precision Diagnostics. Sensors 2023, 23, 9502. [Google Scholar] [CrossRef] [PubMed]
  20. Li, J.; Wen, Y.; He, L. SCConv: Spatial and Channel Reconstruction Convolution for Feature Redundancy. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023. [Google Scholar]
  21. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
  22. Tarvainen, A.; Valpola, H. Mean Teachers Are Better Role Models: Weight-Averaged Consistency Targets Improve Semi-Supervised Deep Learning Results. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  23. Zhao, X.; Fang, C.; Fan, D.J.; Lin, X.; Gao, F.; Li, G. Cross-Level Contrastive Learning and Consistency Constraint for Semi-Supervised Medical Image Segmentation. In Proceedings of the International Symposium on Biomedical Imaging, Kolkata, India, 28–31 March 2022. [Google Scholar]
  24. McClosky, D.; Charniak, E.; Johnson, M. Effective Self-Training for Parsing. In Proceedings of the HLT-NAACL 2006—Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Main Conference, New York, NY, USA, 4–9 June 2006. [Google Scholar]
  25. Cascante-Bonilla, P.; Tan, F.; Qi, Y.; Ordonez, V. Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning. In Proceedings of the 35th AAAI Conference on Artificial Intelligence, AAAI, Virtually, 2–9 February 2021. [Google Scholar]
  26. Sohn, K.; Berthelot, D.; Li, C.L.; Zhang, Z.; Carlini, N.; Cubuk, E.D.; Kurakin, A.; Zhang, H.; Raffel, C. FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence. In Proceedings of the Advances in Neural Information Processing Systems, Online, 6–12 December 2020. [Google Scholar]
  27. Russell, B.C.; Torralba, A.; Murphy, K.P.; Freeman, W.T. LabelMe: A Database and Web-Based Tool for Image Annotation. Int. J. Comput. Vis. 2008, 77, 157–173. [Google Scholar] [CrossRef]
  28. Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: Redesigning Skip Connections to Exploit Multiscale Features in Image Segmentation. IEEE Trans. Med. Imaging 2020, 39, 1856–1867. [Google Scholar] [CrossRef]
  29. Jha, D.; Smedsrud, P.H.; Riegler, M.A.; Johansen, D.; De Lange, T.; Halvorsen, P.; Johansen, H.D. ResUNet++: An Advanced Architecture for Medical Image Segmentation. In Proceedings of the 2019 IEEE International Symposium on Multimedia, ISM, San Diego, CA, USA, 9–11 December 2019. [Google Scholar]
  30. Qi, Y.; He, Y.; Qi, X.; Zhang, Y.; Yang, G. Dynamic Snake Convolution Based on Topological Geometric Constraints for Tubular Structure Segmentation. In Proceedings of the IEEE International Conference on Computer Vision, Paris, France, 1–6 October 2023. [Google Scholar]
  31. Shen, Z.; Cao, P.; Yang, H.; Liu, X.; Yang, J.; Zaiane, O.R. Co-Training with High-Confidence Pseudo Labels for Semi-Supervised Medical Image Segmentation. In Proceedings of the IJCAI International Joint Conference on Artificial Intelligence, Macao, China, 19–25 August 2023. [Google Scholar]
  32. Zhang, R.; Liu, S.; Yu, Y.; Li, G. Self-Supervised Correction Learning for Semi-Supervised Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2021, Strasbourg, France, 27 September–1 October 2021; Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer: Cham, Switzerland, 2021. [Google Scholar] [CrossRef]
  33. Flores, F.C.; De Alencar Lotufo, R. Object Segmentation in Image Sequences by Watershed from Markers: A Generic Approach. In Proceedings of the Brazilian Symposium of Computer Graphic and Image Processing, IEEE Computer Society, Sao Carlos, Brazil, 12–15 October 2003; pp. 347–352. [Google Scholar]
  34. Canny, J. A Computational Approach to Edge Detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, PAMI–8, 679–698. [Google Scholar] [CrossRef]
  35. de Zarzà, I.; de Curtò, J.; Hernández-Orallo, E.; Calafate, C.T. Cascading and Ensemble Techniques in Deep Learning. Electronics 2023, 12, 3354. [Google Scholar] [CrossRef]
  36. Warren, P.; Raju, N.; Prasad, A.; Hossain, M.S.; Subramanian, R.; Kapat, J.; Manjooran, N.; Ghosh, R. Grain and Grain Boundary Segmentation Using Machine Learning with Real and Generated Datasets. Comput. Mater. Sci. 2024, 233, 112739. [Google Scholar] [CrossRef]
  37. Chowdhury, S.A.; Taufique, M.F.N.; Wang, J.; Masden, M.; Wenzlick, M.; Devanathan, R.; Schemer-Kohrn, A.L.; Kappagantula, K.S. Automated Grain Boundary (GB) Segmentation and Microstructural Analysis in 347H Stainless Steel Using Deep Learning and Multimodal Microscopy. Integr. Mater. Manuf. Innov. 2024, 13, 244–256. [Google Scholar] [CrossRef]
  38. Zhou, P.; Zhang, X.; Shen, X.; Shi, H.; He, J.; Zhu, Y.; Yi, F. Multi-phase material microscopic image segmentation for microstructure analysis of superalloys via modified U-Net and rectify strategies. Comput. Mater. Sci. 2024, 242, 113063. [Google Scholar] [CrossRef]
  39. Reimer, L. Scanning Electron Microscopy: Physics of Image Formation and Microanalysis; Springer: Berlin/Heidelberg, Germany, 1998. [Google Scholar]
  40. Ishikawa, R.; Lupini, A.R.; Findlay, S.D.; Pennycook, S.J. Quantitative Annular Dark Field Electron Microscopy Using Single Electron Signals. Microsc. Microanal. 2014, 20, 99–110. [Google Scholar] [CrossRef]
  41. Egerton, R.F. Electron Energy-Loss Spectroscopy in the Electron Microscope; Springer: Berlin/Heidelberg, Germany, 2011. [Google Scholar]
  42. Rahman, S.S.M.M.; Salomon, M.; Dembélé, S. Towards Scanning Electron Microscopy Image Denoising: A State-of-the-Art Overview, Benchmark, Taxonomies, and Future Direction. Mach. Vis. Appl. 2024, 35, 87. [Google Scholar] [CrossRef]
  43. Banerjee, A. Noise in Semiconductor Devices. In Semiconductor Devices. Synthesis Lectures on Engineering, Science, and Technology; Springer: Berlin/Heidelberg, Germany, 2024. [Google Scholar] [CrossRef]
Figure 1. Grain boundary labeling: (a) cropped portion of the OM image; (b) grain boundaries annotated with the LabelMe tool; (c) ground truth label.
Figure 2. Overview of the Semi-SRUnet model.
Figure 3. Teacher–student network: (a) SCConv structure; (b) SRUnet network structure.
Figure 4. Effect of Algorithm 1: (a) skeleton extraction (white lines); (b) two-breakpoint connection (red lines); (c) breakpoint and fork point connection (red lines); (d) breakpoint extension (red lines); (e) grain boundary expansion and black–white inversion.
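To make the repair steps in Figure 4 concrete, the following is a minimal Python sketch of steps (a), (b), and (e). The helper names, the `max_gap` threshold, and the use of scikit-image are illustrative assumptions, not the exact implementation of Algorithm 1; steps (c) and (d) are only indicated in comments.

```python
# A minimal sketch of the pseudo-label repair steps in Figure 4 (assumed
# interface; not the authors' exact Algorithm 1 implementation).
import numpy as np
from scipy.ndimage import convolve
from skimage.morphology import skeletonize, dilation, square

def find_endpoints(skeleton: np.ndarray) -> np.ndarray:
    """Return (row, col) coordinates of skeleton pixels with exactly one neighbor."""
    kernel = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]])
    neighbor_count = convolve(skeleton.astype(np.uint8), kernel, mode="constant")
    return np.argwhere((skeleton > 0) & (neighbor_count == 1))

def repair_pseudo_label(binary_boundary: np.ndarray, max_gap: int = 15) -> np.ndarray:
    # (a) Skeletonize the predicted boundary map to one-pixel-wide lines.
    skeleton = skeletonize(binary_boundary > 0)

    # (b) Connect pairs of nearby breakpoints with a straight segment.
    endpoints = find_endpoints(skeleton)
    for i, p in enumerate(endpoints):
        for q in endpoints[i + 1:]:
            if np.linalg.norm(p - q) <= max_gap:
                rr = np.linspace(p[0], q[0], max_gap * 2).round().astype(int)
                cc = np.linspace(p[1], q[1], max_gap * 2).round().astype(int)
                skeleton[rr, cc] = True

    # (c, d) Remaining breakpoints would be linked to fork points or extended
    # along their local direction; omitted here for brevity.

    # (e) Re-thicken the repaired skeleton and invert so grains appear white.
    return ~dilation(skeleton, square(3))
```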
Figure 5. Comparison of grain segmentation with supervised algorithms. From left to right: the OM images, the manually labeled images, the results of the U-Net, UNet++, ResUNet++, and DSCNet models, and the results of our Semi-SRUnet model. The orange rectangle in each image marks the locally zoomed-in region, and the three zoomed-in views are placed to its right. The red, blue, and green circles in the magnified regions indicate noise points, blurred boundaries, and scratches, respectively.
Figure 6. Comparison of grain segmentation with semi-supervised algorithms. From left to right: the OM images, the manually labeled images, the results of the MT, UCMT, SSC, and CLCC models, and the results of our Semi-SRUnet model. The red circles and blue rectangles in each image indicate regions with noise points and scratches, respectively.
Figure 7. Comparison of grain segmentation with unsupervised algorithms. From left to right: the metallographs, the manually labeled images, and the results of the Canny edge detector, R2V with regularization, the watershed algorithm, and our Semi-SRUnet model. Red rectangles indicate defects.
Figure 8. Comparison of the model's predictions under different noise conditions. The right side shows the zoomed-in view of the red rectangle in the original image. Below the cropped original image is the ground truth label; to its right are copies of the crop with various added noise, each with the model's prediction shown beneath it.
Table 1. Dataset partition strategy.

| Strategy | Labeled | Unlabeled |
|---|---|---|
| 0.5 | 170 (50%) | 420 |
| 0.75 | 255 (75%) | 420 |
| 1 | 340 (all) | 420 |
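The partition strategies in Table 1 amount to subsampling the labeled pool while always keeping all 420 unlabeled images. A minimal sketch, assuming a fixed random seed and index-based image pools (both are illustrative assumptions, not our data pipeline):

```python
# A minimal sketch of the Table 1 partition strategies; the seed and the
# index-based pools are illustrative assumptions.
import random

def partition_labeled_pool(labeled_pool: list, ratio: float, seed: int = 0) -> list:
    """Keep `ratio` of the labeled images; all 420 unlabeled images are always used."""
    rng = random.Random(seed)
    return rng.sample(labeled_pool, k=round(len(labeled_pool) * ratio))

labeled_images = list(range(340))                            # indices of labeled OM crops
strategy_05 = partition_labeled_pool(labeled_images, 0.5)    # 170 images
strategy_075 = partition_labeled_pool(labeled_images, 0.75)  # 255 images
strategy_1 = partition_labeled_pool(labeled_images, 1.0)     # all 340 images
```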
Table 2. Performance comparison with baseline models (methods: (1) baseline, (2) baseline + BR, (3) baseline + SCConv, (4) baseline + BR + SCConv).

| Method | BR | SCConv | mIoU | mRecall | mPrecision | Accuracy |
|---|---|---|---|---|---|---|
| (1) |  |  | 64.09 | 75.61 | 72.22 | 94.71 |
| (2) | ✓ |  | 65.24 | 77.26 | 73.11 | 94.89 |
| (3) |  | ✓ | 68.07 | 80.70 | 75.48 | 95.41 |
| (4) | ✓ | ✓ | 70.37 | 81.31 | 78.64 | 96.06 |
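The metrics reported in Tables 2–4, 8, and 9 follow the standard per-class definitions, averaged over the grain and boundary classes to give the "mean" variants. A minimal sketch of one plausible computation (the averaging over classes {0, 1} is our assumption about how the means are formed):

```python
# A minimal sketch of mIoU / mRecall / mPrecision / Accuracy for binary
# grain boundary masks; class averaging over {0, 1} is an assumption.
import numpy as np

def segmentation_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
    """pred, gt: binary masks (1 = grain boundary, 0 = grain interior)."""
    ious, recalls, precisions = [], [], []
    for cls in (0, 1):  # average over both classes -> "mean" metrics
        p, g = pred == cls, gt == cls
        tp = np.logical_and(p, g).sum()
        fp = np.logical_and(p, ~g).sum()
        fn = np.logical_and(~p, g).sum()
        ious.append(tp / (tp + fp + fn + 1e-9))
        recalls.append(tp / (tp + fn + 1e-9))
        precisions.append(tp / (tp + fp + 1e-9))
    return {
        "mIoU": 100 * np.mean(ious),
        "mRecall": 100 * np.mean(recalls),
        "mPrecision": 100 * np.mean(precisions),
        "Accuracy": 100 * (pred == gt).mean(),
    }
```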
Table 3. Comparison results with advanced supervised models.

| Method | Strategy | mIoU | mRecall | mPrecision | Accuracy |
|---|---|---|---|---|---|
| U-Net | 0.5 | 60.39 | 68.27 | 70.71 | 94.60 |
| UNet++ | 0.5 | 60.92 | 70.88 | 69.56 | 94.22 |
| ResUNet++ | 0.5 | 60.84 | 69.54 | 70.48 | 94.51 |
| DSCNet | 0.5 | 60.78 | 68.22 | 71.79 | 94.80 |
| SRUnet | 0.5 | 65.25 | 78.75 | 71.91 | 94.49 |
| Semi-SRUnet | 0.5 | 69.10 | 83.78 | 75.33 | 95.37 |
| U-Net | 0.75 | 62.54 | 73.36 | 70.98 | 94.48 |
| UNet++ | 0.75 | 62.04 | 71.54 | 71.30 | 94.61 |
| ResUNet++ | 0.75 | 61.91 | 71.36 | 71.18 | 94.59 |
| DSCNet | 0.75 | 62.19 | 70.90 | 72.20 | 94.82 |
| SRUnet | 0.75 | 67.25 | 82.05 | 73.73 | 94.96 |
| Semi-SRUnet | 0.75 | 87.84 | 96.57 | 90.30 | 98.62 |
| U-Net | 1 | 64.09 | 75.61 | 72.22 | 94.71 |
| UNet++ | 1 | 63.21 | 73.81 | 71.84 | 94.67 |
| ResUNet++ | 1 | 64.64 | 72.99 | 75.49 | 95.39 |
| DSCNet | 1 | 69.16 | 79.95 | 77.71 | 95.86 |
| SRUnet | 1 | 70.37 | 81.31 | 78.64 | 96.06 |
| Semi-SRUnet | 1 | 88.86 | 96.64 | 91.50 | 98.77 |
Table 4. Comparative results with state-of-the-art semi-supervised models.

| Method | Strategy | mIoU | mRecall | mPrecision | Accuracy |
|---|---|---|---|---|---|
| MT | 0.5 | 60.29 | 69.32 | 69.56 | 94.15 |
| UCMT | 0.5 | 58.72 | 70.90 | 65.87 | 92.84 |
| SSC | 0.5 | 56.79 | 62.34 | 68.61 | 94.34 |
| CLCC | 0.5 | 62.76 | 75.27 | 70.31 | 94.05 |
| Semi-SRUnet | 0.5 | 69.10 | 83.78 | 75.33 | 95.37 |
| MT | 0.75 | 60.24 | 68.17 | 70.56 | 94.43 |
| UCMT | 0.75 | 58.76 | 69.86 | 66.33 | 93.14 |
| SSC | 0.75 | 59.00 | 68.06 | 67.82 | 93.79 |
| CLCC | 0.75 | 62.59 | 71.76 | 72.33 | 94.69 |
| Semi-SRUnet | 0.75 | 87.84 | 96.57 | 90.30 | 98.62 |
| MT | 1 | 62.36 | 73.76 | 70.45 | 94.16 |
| UCMT | 1 | 59.02 | 71.13 | 66.24 | 92.97 |
| SSC | 1 | 61.03 | 75.70 | 67.72 | 93.11 |
| CLCC | 1 | 60.61 | 68.67 | 70.92 | 94.48 |
| Semi-SRUnet | 1 | 88.86 | 96.64 | 91.50 | 98.77 |
Table 5. Comparative analysis of ensemble techniques on Semi-SRUnet.

| Method | mIoU | mRecall | mPrecision | Accuracy |
|---|---|---|---|---|
| Semi-SRUnet | 69.10 | 83.78 | 75.33 | 95.37 |
| Semi-SRUnet (Bagging) | 69.40 | 84.30 | 75.49 | 95.40 |
| Semi-SRUnet (Cascading) | 69.53 | 84.20 | 75.70 | 95.46 |
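The two ensemble variants in Table 5 can be sketched as follows, assuming each trained model exposes a `predict` method returning a per-pixel boundary probability map. The cascade interface (feeding the previous stage's probability map back in as an extra channel) is an illustrative assumption, not our exact implementation:

```python
# A minimal sketch of bagging vs. cascading ensembles over segmentation
# models; the `predict` interface and channels-last layout are assumptions.
import numpy as np

def bagging_predict(models, image):
    """Bagging: average the probability maps of independently trained models."""
    probs = np.mean([m.predict(image) for m in models], axis=0)
    return (probs > 0.5).astype(np.uint8)

def cascading_predict(stages, image):
    """Cascading: each stage refines the previous stage's probability map,
    here received as an extra input channel (assumed interface)."""
    prob = stages[0].predict(image)
    for stage in stages[1:]:
        prob = stage.predict(np.concatenate([image, prob], axis=-1))
    return (prob > 0.5).astype(np.uint8)
```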
Table 6. Comparison with advanced boundary segmentation models.

| Method | Pros | Cons | mIoU | Accuracy |
|---|---|---|---|---|
| Semi-SRUnet | Low labeling cost and high Accuracy | Relatively high model complexity | 88.86 | 98.77 |
| MPU-Net | Corrected segmentation errors | High labeling cost and limited correction strategies | 64.99 | 94.09 |
| Method A | Introduced a data augmentation strategy with simulated defects | Complex data preprocessing and high labeling costs | — | — |
| Method B | No need for manual data annotation | High device costs and inaccurate labeled data | — | — |
Table 7. The complexity and training time of different models.

| Method | Learning Paradigm | Params | FLOPs | Train Time |
|---|---|---|---|---|
| DSCNet | Supervised | 1.35 M | 106,063 M | 79 s |
| U-Net | Supervised | 17.26 M | 320,683 M | 37 s |
| UNet++ | Supervised | 26.89 M | 300,488 M | 38 s |
| ResUNet++ | Supervised | 4.06 M | 506,432 M | 40 s |
| MPU-Net | Supervised | 43.00 M | 386,014 M | 43 s |
| R2V with regularization | Unsupervised | 28.25 M | 115,505 M | 23 s |
| UCMT | Semi-supervised | 10.88 M | 575,517 M | 62 s |
| CLCC | Semi-supervised | 78.79 M | 683,638 M | 73 s |
| SSC | Semi-supervised | 98.07 M | 538,992 M | 69 s |
| MT | Semi-supervised | 93.13 M | 1,159,855 M | 74 s |
| Semi-SRUnet | Semi-supervised | 27.40 M | 511,588 M | 66 s |
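Complexity figures of the kind reported in Table 7 can be measured with a standard profiler. A minimal PyTorch sketch using the third-party `thop` counter; the stand-in network and the 256 × 256 input resolution are assumptions, so substitute the actual Semi-SRUnet student model in practice:

```python
# A minimal sketch of measuring Params / FLOPs; the tiny stand-in network
# and input size are illustrative assumptions.
import torch
import torch.nn as nn
from thop import profile  # third-party FLOPs counter (pip install thop)

# Stand-in network; replace with the actual Semi-SRUnet student model.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 1, 1))
dummy = torch.randn(1, 3, 256, 256)  # assumed input resolution

params = sum(p.numel() for p in model.parameters()) / 1e6
flops, _ = profile(model, inputs=(dummy,))  # multiply-accumulate count
print(f"Params: {params:.2f} M, FLOPs: {flops / 1e6:,.0f} M")
```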
Table 8. Performance of our Semi-SRUnet model under varying labeled ratios.

| Labeled | Unlabeled | mIoU | mRecall | mPrecision | Accuracy |
|---|---|---|---|---|---|
| 34 (10%) | 420 | 63.73 | 77.37 | 70.05 | 94.24 |
| 80 (25%) | 420 | 65.07 | 80.21 | 71.49 | 94.34 |
| 170 (50%) | 420 | 69.10 | 83.78 | 75.33 | 95.37 |
| 255 (75%) | 420 | 87.84 | 96.57 | 90.30 | 98.62 |
| 340 (100%) | 420 | 88.86 | 96.64 | 91.50 | 98.77 |
Table 9. Performance comparison under different noise effects.

| Noise | mIoU | mRecall | mPrecision | Accuracy |
|---|---|---|---|---|
| Original | 88.86 | 96.64 | 91.50 | 98.77 |
| Missing boundaries | 84.38 | 93.30 | 88.83 | 98.20 |
| Noise points | 88.67 | 96.33 | 86.41 | 98.75 |
| Scratches | 88.01 | 95.97 | 90.03 | 98.66 |
| Gaussian noise | 83.35 | 93.41 | 87.52 | 98.02 |
| Uniform noise | 83.59 | 91.85 | 89.02 | 98.13 |
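Synthetic corruptions of the kind evaluated in Table 9 (Gaussian noise, uniform noise, and scratches) can be generated as follows. The noise amplitudes and scratch count here are illustrative assumptions, not the values used in our experiments:

```python
# A minimal sketch of the synthetic corruptions for the robustness test;
# amplitudes and scratch count are illustrative assumptions.
import numpy as np
import cv2  # pip install opencv-python

def add_gaussian_noise(img: np.ndarray, sigma: float = 15.0) -> np.ndarray:
    noisy = img.astype(np.float32) + np.random.normal(0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_uniform_noise(img: np.ndarray, amplitude: float = 25.0) -> np.ndarray:
    noisy = img.astype(np.float32) + np.random.uniform(-amplitude, amplitude, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_scratches(img: np.ndarray, n: int = 5) -> np.ndarray:
    """Draw random bright line segments to mimic polishing scratches."""
    out = img.copy()
    h, w = img.shape[:2]
    for _ in range(n):
        p1 = (int(np.random.randint(w)), int(np.random.randint(h)))
        p2 = (int(np.random.randint(w)), int(np.random.randint(h)))
        cv2.line(out, p1, p2, color=255, thickness=1)
    return out
```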