Article

A Method for Detecting the Yarn Roll’s Margin Based on VGG-UNet

1
School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China
2
Zhejiang Provincial Innovation Center of Advanced Textile Technology, Shaoxing 312000, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(17), 7928; https://doi.org/10.3390/app14177928
Submission received: 27 June 2024 / Revised: 26 August 2024 / Accepted: 27 August 2024 / Published: 5 September 2024

Abstract

The identification of the yarn roll’s margin is a critical step in automated textile production. At present, conventional visual detection techniques struggle to measure the yarn roll’s margin accurately, filter out background noise, and generalize well. To address this issue, this study constructed a semantic segmentation dataset for the yarn roll and proposed a new deep-learning-based method for detecting the margin of the yarn roll. By replacing the encoder of the U-Net with the first 13 convolutional layers of VGG16 and incorporating pre-trained weights, we constructed a VGG-UNet model that is well suited to yarn roll segmentation. On the test set, the model achieved an average Intersection over Union (mIoU) of 98.70%. Subsequently, the contour edge point set was obtained through traditional image processing techniques, and contour fitting was performed. Finally, the actual yarn roll margin was calculated from the relationship between pixel dimensions and actual dimensions. The experiments demonstrate that the yarn roll’s margin can be measured with an error of less than 3 mm, and detection accuracy remains high even when the margin is narrow. This study provides significant technical support and a theoretical foundation for the automation of the textile industry.

1. Introduction

In recent years, the textile industry has increasingly adopted automation technology to enhance production efficiency, reduce labor costs, and improve product quality [1,2]. A principal aspect of this transformation is the automation of yarn margin detection. The detection of yarn margins has a significant impact on the quality of the final textile products [3,4]. In the past, the detection of yarn roll margins has been dependent on manual inspections and basic visual techniques. However, these methods are often plagued by low accuracy, limited anti-interference capabilities, and poor generalizability, which do not align with the stringent requirements of contemporary textile manufacturing.
The advent of digital image processing has led to a notable increase in the adoption of machine vision in textile manufacturing [5,6,7,8,9]. A principal challenge in automating yarn roll margin detection is the positioning of yarn cones on racks, which are typically arranged with their ends facing the aisles, as shown in Figure 1a. This setup allows only the cone ends to be visible during automated inspections, complicating the assessment of the yarn length remaining on the cones. The diverse shapes and sizes of yarn cones present an additional challenge to standardizing residual detection methods. Our research indicates that three types of yarn cones are commonly used in enterprises, as illustrated in Figure 1b.
This study proposes a method for detecting the yarn roll’s margin based on VGG-UNet semantic segmentation. By leveraging the VGG network’s pre-trained weights on large-scale datasets and its powerful feature extraction capabilities, combined with the pixel-level segmentation of U-Net, the method efficiently and accurately segments the remaining yarn regions. Subsequently, edge detection algorithms are used to extract the edge contours, which are then fitted into circles using the least squares method. Finally, the relationship between pixel dimensions and actual dimensions is established to calculate the actual yarn roll’s margin.

2. Related Work

According to the related literature, the techniques used to measure yarn roll dimensions fall into two categories: contact and non-contact. Contact measurement uses assorted instruments or sensors to obtain the dimensions and other characteristics of the yarn roll. For example, Imae, M., Iwade, T., and Shintani, Y. proposed a yarn detection method based on the monitoring of yarn tension [7]. However, such methods can damage the yarn and generally involve complex detection steps, low efficiency, and a negative impact on the normal use of yarn rolls, making them impractical for widespread use in production environments.
Non-contact measurement technologies rely on computer vision to identify yarn roll features such as color and edges. These can be classified into three categories: monocular vision, stereo vision, and deep learning-based techniques.
Monocular vision technology refers to image processing techniques applied to images captured by a single camera. It is the most common detection technology within machine vision. Xiang, Z., and Zhang, J. designed an online detection system for yarn roll density based on machine vision [10]. The theory of perspective projection was employed to develop a correction model for yarn rolls, whereby the linear characteristics of the upper and lower boundaries of the yarn roll were restored and an ideal side-view image of the yarn was obtained. Subsequently, the yarn’s precise volume was calculated using integral methods.
Stereo vision systems capture images of an object from different angles, followed by 3D reconstruction to form the object’s model. Often used in conjunction with lasers, this approach has seen rapid development recently due to its strong anti-interference capabilities. However, stereo vision requires significant computational resources to process and analyze images from two viewpoints, perform image matching, and compute depth. Da Silva Vale, R.T., Ueda, E.K., and others have used stereo vision systems to generate point clouds of fish surfaces and proposed a B-spline surface fitting algorithm to estimate the volume of fish [11], effectively enhancing the accuracy of weight and volume estimates.
In contrast, deep learning techniques have demonstrated significant potential in the measurement of yarn spool dimensions. Deep learning models automatically extract image features and adapt to complex scenarios and changing lighting conditions. By training on large amounts of annotated data, they can achieve high-precision measurements and exhibit strong robustness and generalization capabilities. For example, Wang, J., Shi, Z., and others utilized the YOLO [12,13,14,15] model to roughly detect each yarn roll and its pixel diameter. The region of interest (ROI) images of each yarn roll were then processed further to extract the internal and external contours of the yarn roll. Finally, the measurements and predictions were integrated using a Kalman filter. The validation results demonstrate that this method effectively addresses existing issues and accurately detects the yarn margin [4]. Similarly, Huo, Y., and Bai, H. addressed automatic gauge reading recognition by introducing the SIFT algorithm for image correction and an improved UNet++ for segmentation [16], which enhanced the accuracy and efficiency of recognition.

3. Materials and Methods

3.1. Network Structure

In 2015, fully convolutional networks (FCNs) [17] emerged as a pivotal innovation in semantic segmentation, marking a substantial advancement in the field. Building upon this foundation, Ronneberger, O. et al. developed the U-Net model [18], tailored explicitly to meet the challenges of biomedical image segmentation. U-Net is highly regarded for its outstanding segmentation efficacy and swift training capabilities [19,20,21], making it ideal for use with small datasets. Consequently, this study adopts the U-Net model as the primary framework for detecting yarn roll margins.
The U-Net model is distinguished by its symmetric architecture, comprising an encoder and a decoder. The encoder extracts features from the input image, successively generating feature maps at four distinct levels. The decoder then recovers spatial detail through successive deconvolution and up-sampling operations, concatenating each up-sampled map with the corresponding feature map from the encoder; this mitigates information loss during up-sampling and enhances the model’s segmentation precision. Furthermore, the model’s feature extraction capacity is enhanced by integrating the first 13 convolutional layers of VGG16 [22] into the U-Net encoder, as shown in Figure 2, leveraging its pre-trained weights on large datasets for effective transfer learning [23]. This approach substantially boosts the model’s performance and adaptability, particularly in specialized applications such as yarn residual volume detection.
In this study’s implementation, as shown in Figure 3, the input yarn images are down-sampled by the encoder to generate four sets of feature maps at different scales: 512 × 512 × 64, 256 × 256 × 128, 128 × 128 × 256, and 64 × 64 × 512. In the decoder stage, these up-sampled feature maps are concatenated with the corresponding feature maps from the encoder layers to effectively fuse deep and shallow features. Finally, the number of channels in the feature maps is adjusted to match the number of target segmentation classes using a 1 × 1 convolutional layer to output precise segmentation results. The details of each layer are shown in Table 1. Through this approach, our model not only effectively improves the accuracy of yarn roll margin detection but also demonstrates strong capabilities in handling complex image segmentation tasks.
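For illustration, the sketch below shows one way to assemble such an encoder from torchvision’s pre-trained VGG16. It is a minimal sketch consistent with the feature-map sizes described above, not the authors’ exact implementation; the slice indices into vgg16().features are assumptions based on the standard VGG16 layout.

```python
# Minimal sketch (not the authors' code): a U-Net encoder built from the first 13
# convolutional layers of a pre-trained VGG16, exposing the four skip feature maps
# described in Section 3.1 plus the bottleneck input for the decoder.
import torch
import torch.nn as nn
from torchvision.models import vgg16, VGG16_Weights


class VGGUNetEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        features = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features
        # Each slice ends just before a max-pooling stage of VGG16.
        self.block1 = features[:4]     # -> 512 x 512 x 64
        self.block2 = features[4:9]    # -> 256 x 256 x 128
        self.block3 = features[9:16]   # -> 128 x 128 x 256
        self.block4 = features[16:23]  # -> 64 x 64 x 512
        self.block5 = features[23:30]  # -> 32 x 32 x 512 (fed to the decoder)

    def forward(self, x):
        s1 = self.block1(x)
        s2 = self.block2(s1)
        s3 = self.block3(s2)
        s4 = self.block4(s3)
        bottom = self.block5(s4)
        return [s1, s2, s3, s4], bottom


# Example: a 512 x 512 RGB input produces the four skip maps listed above.
skips, bottom = VGGUNetEncoder()(torch.randn(1, 3, 512, 512))
```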

3.2. Dataset Construction

The imaging device in this study was the Hikvision Color Industrial Camera MV-CA050-10GM (Hikvision, Hangzhou, China), with a resolution of 3072 pixels by 2048 pixels. The cylindrical yarn bobbins in the experiment had an inner diameter of 56 mm and an outer diameter of 76 mm, while the conical yarn bobbins had a front-end inner diameter of 32 mm and an outer diameter of 38 mm. To facilitate image analysis, the camera was positioned 50 cm directly in front of the yarn bobbin rack during each imaging session, ensuring that the camera captured the entire end face of the yarn bobbins, as shown in Figure 4.

3.2.1. YOLOv8 Rapid Dataset Expansion

To augment the training data and enhance the model’s generalization ability, images showing the overall distribution of yarn bobbins within the workshop were captured, as presented in Figure 5a. The YOLOv8 model was then utilized to detect and extract yarn bobbins from these images, as shown in Figure 5b, substantially increasing the efficiency of dataset expansion.
After determining the position coordinates of the yarn margin bounding box within the image, we extracted images of the yarn roll captured from various angles, distances, margins, and shapes. Subsequently, we curated a collection of 200 high-quality images from this set, as shown in Figure 5c.
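As a rough illustration of this expansion step, the following sketch uses the ultralytics YOLOv8 API to crop detected bobbins from workshop images; the weight file name and folder paths are placeholders, not the authors’ actual assets.

```python
# Hedged sketch: cropping single-bobbin ROIs from workshop images with a trained
# YOLOv8 detector. "yolo_bobbin.pt" and the directory names are assumptions.
from pathlib import Path
import cv2
from ultralytics import YOLO

model = YOLO("yolo_bobbin.pt")                      # detector fine-tuned on yarn bobbins (assumed)
out_dir = Path("dataset/crops")
out_dir.mkdir(parents=True, exist_ok=True)

for img_path in Path("dataset/workshop").glob("*.jpg"):
    image = cv2.imread(str(img_path))
    result = model(image)[0]                        # one Results object per input image
    boxes = result.boxes.xyxy.cpu().numpy().astype(int)
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        crop = image[y1:y2, x1:x2]                  # single yarn-roll ROI
        cv2.imwrite(str(out_dir / f"{img_path.stem}_{i}.jpg"), crop)
```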

3.2.2. Dataset Annotation

The 300 raw images were annotated using the ISAT labeling software (v1.1.1), a semi-automatic segmentation annotation tool based on the Segment Anything Model [24]. ISAT lets users adjust the segmentation area with forward and backward hint points, improving the efficiency and accuracy of annotation compared with polygon annotation in LabelMe (v5.4.0). The annotation process produced corresponding JSON files, which were converted into mask images in which the red area indicates the yarn margin region and the black area delineates the background and yarn spool region, as shown in Figure 6.
The final dataset comprises 300 sample images along with their corresponding annotated images. These images were split into a training set (210 images), a validation set (60 images), and a test set (30 images), following a 7:2:1 ratio.
This study uses the enhanced U-Net model for semantic segmentation of the yarn margin region, which yields favorable results and produces the mask image of the yarn margin region. Applying circular fitting to this mask then provides an accurate estimate of the remaining yarn quantity.

3.2.3. Extraction of Yarn Margin Contour Points

To determine the fitting circle of the end face, the process begins by employing the Canny edge detection algorithm to extract edge contour points from the bobbin region mask image. Initially, the Sobel operator is used to calculate the gradient intensity (G) and direction (θ) for each pixel in the image. Following this, non-maximum suppression is utilized to remove non-edge pixels, resulting in refined edge points. The specific calculation formulas for these steps are outlined below.
G = \sqrt{G_x^2 + G_y^2}
\theta = \arctan\left(\frac{G_y}{G_x}\right)
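In practice, this step can be carried out with OpenCV, whose Canny implementation already applies the Sobel gradients and non-maximum suppression described above. The sketch below (file name and thresholds are illustrative assumptions) collects the resulting edge-pixel coordinates.

```python
# Minimal sketch: extracting yarn-margin contour points from a predicted mask
# with OpenCV's Canny detector. "pred_mask.png" and the thresholds are assumed.
import cv2
import numpy as np

mask = cv2.imread("pred_mask.png", cv2.IMREAD_GRAYSCALE)     # segmentation output
mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)[1]   # ensure a binary mask
edges = cv2.Canny(mask, 50, 150)      # Sobel gradients + non-maximum suppression internally
ys, xs = np.nonzero(edges)            # pixel coordinates of the edge points
edge_points = np.column_stack([xs, ys])
```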

3.2.4. Least Squares Fitting of Inner and Outer Contours

Edge detection provides a series of edge points, from which the optimal matching circles for the two point sets (inner and outer contours) can be found through least squares fitting. To employ least squares fitting [25,26,27], it is essential to first define a residual function that quantifies the discrepancy between the fitted circle and the actual data points. For each point (x_i, y_i), the residual is computed as follows:
\mathrm{res}_i = (x_i - x_c)^2 + (y_i - y_c)^2 - r^2
The objective of least squares fitting is to find the circle parameters (x_c, y_c, r) that minimize the sum of squared residuals. The fitting effect is shown in Figure 7.
\sum_i \mathrm{res}_i^2 = \sum_i \left[ (x_i - x_c)^2 + (y_i - y_c)^2 - r^2 \right]^2 \rightarrow \min
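One compact way to solve this minimization is to hand the residual above to a generic nonlinear least-squares solver; the sketch below uses scipy.optimize.least_squares with simple heuristic initial values and is one possible implementation rather than the authors’ exact procedure.

```python
# Circle fit by minimizing the squared residuals res_i = (x - xc)^2 + (y - yc)^2 - r^2.
import numpy as np
from scipy.optimize import least_squares


def fit_circle(points: np.ndarray):
    """points: (N, 2) array of edge coordinates; returns (xc, yc, r)."""
    x = points[:, 0].astype(float)
    y = points[:, 1].astype(float)

    def residuals(p):
        xc, yc, r = p
        return (x - xc) ** 2 + (y - yc) ** 2 - r ** 2

    # Heuristic starting point: centroid and mean distance to it.
    x0 = [x.mean(), y.mean(), np.hypot(x - x.mean(), y - y.mean()).mean()]
    return least_squares(residuals, x0).x
```

Applied once to the inner contour points and once to the outer contour points, this yields the two pixel radii (r, R) used in the next step.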

3.2.5. Calculation of Actual Yarn Roll’s Margin

After fitting the inner and outer contours and obtaining the pixel radius parameters of the two circles (R, r), the pixel width of the yarn margin is denoted as (R − r). From the pixel diameter of the bobbin, one can determine whether the detected bobbin is cylindrical or conical, based on the known actual diameter of each type (D). This establishes a correspondence between pixel measurements and actual physical dimensions, from which the actual yarn roll margin (X) is obtained.
\frac{R - r}{X} = \frac{2r}{D}
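Rearranged, the proportion gives X = D(R − r)/(2r). The helper below applies it; the example values in the comment are purely illustrative, with D taken as the cylindrical bobbin’s 76 mm outer diameter from Section 3.2 (an assumption made for the example).

```python
# Pixel-to-millimetre conversion implied by the proportion above:
# (R - r) / X = 2r / D  =>  X = D * (R - r) / (2 * r).
def yarn_margin_mm(R_px: float, r_px: float, D_mm: float) -> float:
    """R_px, r_px: fitted outer/inner pixel radii; D_mm: known bobbin diameter in mm."""
    return D_mm * (R_px - r_px) / (2.0 * r_px)


# Illustrative numbers only (not measurements from the paper):
# yarn_margin_mm(R_px=420.0, r_px=190.0, D_mm=76.0) -> 46.0 mm of margin
```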

4. Experiments

We conducted control experiments to evaluate the segmentation effectiveness of this model and to compare its strengths and weaknesses with other backbone networks. These experiments elucidate the model’s capabilities in handling different segmentation tasks and provide insights into its comparative efficiency and accuracy across various contexts.

4.1. Loss Function

This experiment chooses the Dice loss as the loss function for model training. The formula for calculating the Dice loss can be expressed as:
1 - \frac{2 \sum_i P_i G_i}{\sum_i \left( P_i^2 + G_i \right)}
In this formula, P_i represents the value of the predicted result at the i-th pixel and G_i denotes the value of the true label at the i-th pixel.
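A short PyTorch-style sketch of this loss is given below. It follows the formula above, adds a small epsilon for numerical stability, and assumes pred holds per-pixel foreground probabilities and target the binary ground-truth mask; it is an illustration, not the authors’ training code.

```python
# Soft Dice loss matching the formula above (epsilon added to avoid division by zero).
import torch


def dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    pred = pred.reshape(pred.shape[0], -1)        # flatten per sample
    target = target.reshape(target.shape[0], -1)
    intersection = (pred * target).sum(dim=1)                 # sum_i P_i * G_i
    denom = (pred ** 2).sum(dim=1) + target.sum(dim=1)        # sum_i (P_i^2 + G_i)
    return (1.0 - 2.0 * intersection / (denom + eps)).mean()
```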

4.2. Training and Optimization

The model used in this experiment is an enhanced variant of the U-Net architecture, where the encoding section has been replaced by the VGG16 feature extraction component. It is trained over 100 epochs using a batch size of 2. The initial learning rate is set to 0.0001, and the model uses the Adam optimizer [28] in conjunction with a cosine decay learning rate scheduling strategy [29]. This strategy effectively modulates the learning rate over time to optimize the training process and improve model performance. The formula for calculating the learning rate using cosine decay is:
lr(t) = lr_{\min} + 0.5 \times \left( lr_{\max} - lr_{\min} \right) \times \left( 1 + \cos\left( \frac{t}{T_{\max}} \pi \right) \right)
lr(t) is the learning rate at time t; lr_min and lr_max are the minimum and maximum values of the learning rate, respectively; T_max is the predetermined decay period, usually set to the total number of training rounds or a training cycle. When combining the Adam optimizer with cosine decay, Adam adapts the updates of individual parameters in real time, while cosine decay provides a global learning rate adjustment strategy that changes over time. Both mechanisms work together during training to achieve improved training outcomes and model performance.
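The same schedule can be reproduced with PyTorch’s built-in CosineAnnealingLR wrapped around Adam, as in the hedged sketch below; the stand-in model and the empty epoch body are placeholders for the actual VGG-UNet training pass.

```python
# Adam with cosine-decay scheduling (T_max = 100 epochs, lr_max = 1e-4, lr_min = 0).
import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, kernel_size=1)   # stand-in for the VGG-UNet described above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100, eta_min=0.0)

for epoch in range(100):
    # ... one pass over the training set with the Dice loss would go here ...
    optimizer.step()                      # placeholder update for the sketch
    scheduler.step()                      # lr(t) follows the cosine-decay formula above
    print(f"epoch {epoch}: lr = {scheduler.get_last_lr()[0]:.6f}")
```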

4.3. Evaluation

To objectively evaluate the segmentation performance of the model, we use the confusion matrix (Table 2) to calculate metrics such as mean Precision (mPrecision), mean Recall (mRecall), mean Pixel Accuracy (mPA), and mean Intersection over Union (mIoU). The specific formulas are given below.
\mathrm{Precision} = \frac{TP}{TP + FP}
\mathrm{Recall} = \frac{TP}{TP + FN}
mPA = \frac{1}{k} \sum_{i=1}^{k} \frac{TP}{TP + FP}
mIoU = \frac{1}{k} \sum_{i=1}^{k} \frac{TP}{TP + FP + FN}
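These per-class quantities can be computed directly from a k × k confusion matrix; the sketch below (row = true class, column = predicted class, an assumed convention) averages precision, recall, and IoU over classes as in the formulas above.

```python
# Mean precision, recall, and IoU from a k x k confusion matrix,
# where cm[i, j] counts pixels of true class i predicted as class j.
import numpy as np


def mean_metrics(cm: np.ndarray):
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp        # predicted as class i but actually another class
    fn = cm.sum(axis=1) - tp        # actually class i but predicted as another class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    iou = tp / (tp + fp + fn)
    return precision.mean(), recall.mean(), iou.mean()


# Example with k = 2 (background vs. yarn margin); the counts are illustrative:
mprec, mrec, miou = mean_metrics(np.array([[9000, 120], [80, 800]]))
```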

4.4. Comparative Experiment

In this study, the encoder part of the U-Net model can be enhanced by utilizing different architectures to improve its feature extraction capabilities. Two models, VGG-UNet and Res-UNet [30], were constructed by replacing the encoder with efficient feature extraction networks VGG16 and ResNet [31], respectively. To investigate the specific impact of pre-trained weights on model performance, an experimental variable was introduced to assess the use of pre-trained weights. Under the same training conditions and parameter settings, these enhanced models were compared to the original UNet model through comparative experiments. Among them, ResNetUNet-pre and VGGUNet-pre represent the ResNet-UNet and VGG-UNet models loaded with pre-trained weights, respectively. The experimental results are shown in Figure 8. The results indicate that models using pre-trained weights exhibit significantly faster convergence during training and smoother loss value reduction, further confirming the effectiveness of pre-trained weights.
To validate the effectiveness of the model in yarn segmentation tasks, we evaluated its performance on the test dataset. The specific results are depicted in Figure 9. Without transfer learning, the Res-UNet model performed best, achieving a mean Intersection over Union (mIoU) of 90.26%. In contrast, the original VGG-UNet model performed worse, reaching only 79.73% mIoU, indicating room for improvement in its segmentation effectiveness. However, once pre-trained weights were introduced, the VGG-UNet model improved significantly in all aspects, surpassing the Res-UNet model in mPrecision, mRecall, mPA, and mIoU. In particular, compared with the VGG-UNet model without pre-trained weights, the model fine-tuned with pre-trained weights showed an increase of 9.52% in average precision, 26.23% in average recall, and 18.3% in mIoU for yarn region segmentation. This outcome clearly demonstrates that, under conditions of limited sample size and few epochs, fine-tuning with pre-trained weights is significantly superior to training from scratch.

4.5. Qualitative Results

After performing semantic segmentation on the image data to extract the yarn margin area, the inner and outer contours of the yarn were fitted to obtain the radii of the inner and outer contour circles. Subsequently, the actual width of the yarn margin was determined based on the relationship between pixel dimensions and actual dimensions. To verify and evaluate the performance of the yarn margin detection method proposed in this study, it was applied to measure the yarn margin in various images with different margin amounts and bobbin shapes, as shown in Figure 10. Relevant data were collected and documented, and specific results are documented in Table 3.
From Table 3, it can be observed that the errors in determining the yarn roll’s margin for different types of yarn bobbins using the proposed algorithm are all less than 3 mm. Additionally, regardless of whether the margin increases or decreases, the relative error remains within 4%, meeting the actual production requirements of the factory.

5. Conclusions

This study combines deep learning with traditional image processing methods. Through comparative experiments, the VGG-UNet model was selected for segmenting the margin of yarn rolls. The original model, whose performance metrics were initially only average, was enhanced by replacing the U-Net encoder with the first 13 convolutional layers of VGG16 and applying transfer learning with pre-trained weights. After introducing pre-trained weights, the model showed significant improvement in overall performance and generalization ability, and its segmentation precision, recall, and mean IoU were superior to those of the other models. After obtaining the segmentation results, traditional image processing methods were used to extract contour edge points, which were then fitted using the least squares method, and the measurements were calculated from the relationship between pixel dimensions and actual dimensions. A comparison between automatically and manually measured values revealed that the error in detecting the yarn roll’s margin with the proposed algorithm is less than 3 mm, with a relative error within 4%, demonstrating practical utility. This method ensures high accuracy while maintaining detection efficiency.
Based on the initial experimentation, it has been identified that there are areas for improvement in the proposed method. In future work, we plan to further research the following aspects:
(1)
Enhancing the model’s generalization ability: Research methods to ensure detection effectiveness in more complex environments and improve the model’s generalization capabilities.
(2)
Optimizing the model’s inference speed: The semantic segmentation network is significantly slower than general detection networks. Our model’s current inference speed is 18.19 FPS, which currently meets industrial requirements. In the future, we will consider pruning the model and optimizing the algorithm to further improve inference speed.

Author Contributions

Conceptualization, J.W. and X.Z.; methodology, J.W. and X.Z.; validation, J.W. and X.Z.; formal analysis, H.W.; investigation, X.Z. and L.P.; resources, J.W. and L.P.; data curation, X.Z. and H.W.; writing—original draft preparation, X.Z.; writing—review and editing, J.W. and L.P.; supervision, J.W.; project administration; funding acquisition, H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Key R&D projects of the Science and Technology Department of Zhejiang Province [grant numbers 2023C01158, 2024C03118, 2024C01222].

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Dal Forno, A.J.; Bataglini, W.V.; Steffens, F.; Ulson de Souza, A.A. Industry 4.0 in textile and apparel sector: A systematic literature review. Res. J. Text. Appar. 2023, 27, 95–117. [Google Scholar] [CrossRef]
  2. Kaur, G.; Dey, B.K.; Pandey, P.; Majumder, A.; Gupta, S. A Smart Manufacturing Process for Textile Industry Automation under Uncertainties. Processes 2024, 12, 778. [Google Scholar] [CrossRef]
  3. Wang, J. The Foundation of the Intellectualization of the Textile Accessories and Parts Including On-line Detection of Textile Production Process, Quality Data Mining and Process Parameters Optimization. Text. Accessories 2018, 5, 1–3. [Google Scholar]
  4. Wang, J.; Shi, Z.; Shi, W.; Wang, H. The Detection of Yarn Roll’s Margin in Complex Background. Sensors 2023, 23, 1993. [Google Scholar] [CrossRef]
  5. Chen, Z.; Shi, Y.; Ji, S. Improved image threshold segmentation algorithm based on OTSU method. Laser Infrared 2012, 5, 584–588. [Google Scholar]
  6. Catarino, A.; Rocha, A.; Monteiro, J. Monitoring knitting process through yarn input tension: New developments. In Proceedings of the IEEE 2002 28th Annual Conference of the Industrial Electronics Society (IECON 02), Seville, Spain, 5–8 November 2002; IEEE: Piscataway, NJ, USA, 2002; pp. 2022–2027. [Google Scholar]
  7. Imae, M.; Iwade, T.; Shintani, Y. Method for Monitoring Yarn Tension in Yarn Manufacturing Process. US6014104A, 11 January 2000. [Google Scholar]
  8. Miao, Y.; Meng, X.; Xia, G.; Wang, Q.; Zhang, H. Research and development of non-contact yarn tension monitoring system. Wool Text. J. 2020, 48, 76–81. [Google Scholar]
  9. Yang, Y.; Ma, X.; He, Z.; Gao, M. A robust detection method of yarn residue for automatic bobbin management system. In Proceedings of the 2019 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), Sha Tin, Hong Kong, 8–12 July 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1075–1079. [Google Scholar]
  10. Xiang, Z.; Zhang, J.; Hu, X. Vision-based portable yarn density measure method and system for basic single color woven fabrics. J. Text. Inst. 2018, 109, 1543–1553. [Google Scholar] [CrossRef]
  11. da Silva Vale, R.T.; Ueda, E.K.; Takimoto, R.Y.; de Castro Martins, T. Fish Volume Monitoring Using Stereo Vision for Fish Farms. IFAC-PapersOnLine 2020, 53, 15824–15828. [Google Scholar] [CrossRef]
  12. Chen, D.; Cheng, J.-J.; He, H.-Y.; Ma, C.; Yao, L.; Jin, C.-B.; Cao, Y.-S.; Li, J.; Ji, P. Computed tomography reconstruction based on canny edge detection algorithm for acute expansion of epidural hematoma. J. Radiat. Res. Appl. Sci. 2022, 15, 279–284. [Google Scholar] [CrossRef]
  13. Tian, J.; Zhou, H.-J.; Bao, H.; Chen, J.; Huang, X.-D.; Li, J.-C.; Yang, L.; Li, Y.; Miao, X.-S. Memristive Fast-Canny Operation for Edge Detection. IEEE Trans. Electron Devices 2022, 69, 6043–6048. [Google Scholar] [CrossRef]
  14. Laroca, R.; Severo, E.; Zanlorensi, L.A.; Oliveira, L.S.; Gonçalves, G.R.; Schwartz, W.R.; Menotti, D. A Robust Real-Time Automatic License Plate Recognition Based on the YOLO Detector. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8-13 July 2018; pp. 1–10. [Google Scholar]
  15. Yu, Z.; Liu, Y.; Yu, S.; Wang, R.; Song, Z.; Yan, Y.; Li, F.; Wang, Z.; Tian, F. Automatic detection method of dairy cow feeding behaviour based on YOLO improved model and edge computing. Sensors 2022, 22, 3271. [Google Scholar] [CrossRef]
  16. Huo, Y.; Bai, H.; Sun, L.; Fang, Y. Reading recognition of pointer meters based on an improved UNet++ network. Meas. Sci. Technol. 2023, 35, 035009. [Google Scholar] [CrossRef]
  17. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar]
  18. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015, Proceedings of the 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241. [Google Scholar]
  19. Liu, Z.; Cao, Y.; Wang, Y.; Wang, W. Computer vision-based concrete crack detection using U-net fully convolutional networks. Autom. Constr. 2019, 104, 129–139. [Google Scholar] [CrossRef]
  20. Liao, J.; Chen, M.H.; Zhang, K.; Zou, Y.; Zhang, S.; Zhu, D.Q. Segmentation of crop plant seedlings based on regional semantic and edge information fusion. Trans. CSAM 2021, 52, 171–181. [Google Scholar]
  21. Pan, Z.; Xu, J.; Guo, Y.; Hu, Y.; Wang, G. Deep learning segmentation and classification for urban village using a worldview satellite image based on U-Net. Remote Sens. 2020, 12, 1574. [Google Scholar] [CrossRef]
  22. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  23. Jiang, J.; Shu, Y.; Wang, J.; Long, M. Transferability in Deep Learning: A Survey. arXiv 2022, arXiv:2201.05867. [Google Scholar]
  24. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.-Y.; et al. Segment Anything. arXiv 2023, arXiv:2304.02643. [Google Scholar] [CrossRef]
  25. Li, B.; Li, B.; Yan, J.; Zhao, Q.; He, J.; Li, R.; Liu, X. Image recognition and diagnosis for vibration characteristics of cone valve core. Adv. Mech. Eng. 2020, 12, 1687814020916389. [Google Scholar] [CrossRef]
  26. Xiong, L.; Wang, Z.; Liao, H.; Kang, X.; Yang, C. Overlapping citrus segmentation and reconstruction based on Mask R-CNN model and concave region simplification and distance analysis. J. Phys. Conf. Ser. 2019, 1345, 032064. [Google Scholar]
  27. Luo, Z.; Jia, Y.; He, J. An Optic Disc Segmentation Method Based on Active Contour Tracking. Trait. Du Signal 2019, 36, 265–271. [Google Scholar] [CrossRef]
  28. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  29. Loshchilov, I.; Hutter, F. Decoupled Weight Decay Regularization. arXiv 2017, arXiv:1711.05101. [Google Scholar]
  30. Diakogiannis, F.I.; Waldner, F.; Caccetta, P.; Wu, C. ResUNet-a: A deep learning framework for semantic segmentation of remotely sensed data. ISPRS J. Photogramm. Remote Sens. 2020, 162, 94–114. [Google Scholar] [CrossRef]
  31. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
Figure 1. Environment. (a) Production environment and layout of yarn bobbin. (b) Types and sizes of three different yarn rolls.
Figure 2. The VGG16 architecture.
Figure 3. VGG-UNet architecture diagram.
Figure 4. Yarn bobbin end-face image.
Figure 5. Data collection. (a) Multiple yarn bobbin overall image. (b) Detection result. (c) Extracting single yarn roll.
Figure 6. Original image and annotated image. (a) Original image. (b) Annotated image. (c) Mask image.
Figure 7. Segmentation and fitting. (a) Original. (b) Segmented (red area indicates the segmented region). (c) Fitted (green contour represents the fitted circle contour).
Figure 8. Comparison of training loss curves for various models.
Figure 9. Comparison of evaluation metrics among various models.
Figure 10. Segmentation and fitting results (red area is the segmented region, green contour represents the fitted circle contour); (a–f) represent various images with different margin amounts and bobbin shapes.
Table 1. The detailed configuration and specifications of VGG-UNet.
Layer | Type | Pad | Kernel Size | Stride | Output Size | Note
1 | Input | - | - | - | 512 × 512 × 3 |
2 | Conv + ReLU | 1 | 3 × 3 × 64 | 1 | 512 × 512 × 64 | down-sampling block 1
3 | Conv + ReLU | 1 | 3 × 3 × 64 | 1 | 512 × 512 × 64 |
4 | MaxPooling | 0 | 2 × 2 | 2 | 256 × 256 × 64 |
5 | Conv + ReLU | 1 | 3 × 3 × 128 | 1 | 256 × 256 × 128 | down-sampling block 2
6 | Conv + ReLU | 1 | 3 × 3 × 128 | 1 | 256 × 256 × 128 |
7 | MaxPooling | 0 | 2 × 2 | 2 | 128 × 128 × 128 |
8 | Conv + ReLU | 1 | 3 × 3 × 256 | 1 | 128 × 128 × 256 | down-sampling block 3
9 | Conv + ReLU | 1 | 3 × 3 × 256 | 1 | 128 × 128 × 256 |
10 | Conv + ReLU | 1 | 3 × 3 × 256 | 1 | 128 × 128 × 256 |
11 | MaxPooling | 0 | 2 × 2 | 2 | 64 × 64 × 256 |
12 | Conv + ReLU | 1 | 3 × 3 × 512 | 1 | 64 × 64 × 512 | down-sampling block 4
13 | Conv + ReLU | 1 | 3 × 3 × 512 | 1 | 64 × 64 × 512 |
14 | Conv + ReLU | 1 | 3 × 3 × 512 | 1 | 64 × 64 × 512 |
15 | MaxPooling | 0 | 2 × 2 | 2 | 32 × 32 × 512 |
16 | Conv + ReLU | 1 | 3 × 3 × 512 | 1 | 32 × 32 × 512 | down-sampling block 5
17 | Conv + ReLU | 1 | 3 × 3 × 512 | 1 | 32 × 32 × 512 |
18 | Conv + ReLU | 1 | 3 × 3 × 512 | 1 | 32 × 32 × 512 |
19 | MaxPooling | 0 | 2 × 2 | 2 | 16 × 16 × 512 |
20 | Upsampling | - | - | - | 32 × 32 × 512 | up-sampling block 1
21 | Concatenate | - | - | - | 32 × 32 × 1024 |
22 | Conv + BN | 1 | 3 × 3 × 512 | 1 | 32 × 32 × 512 |
23 | Conv + BN | 1 | 3 × 3 × 256 | 1 | 32 × 32 × 256 |
24 | Upsampling | - | - | - | 64 × 64 × 256 | up-sampling block 2
25 | Concatenate | - | - | - | 64 × 64 × 512 |
26 | Conv + BN | 1 | 3 × 3 × 256 | 1 | 64 × 64 × 256 |
27 | Conv + BN | 1 | 3 × 3 × 128 | 1 | 64 × 64 × 128 |
28 | Upsampling | - | - | - | 128 × 128 × 128 | up-sampling block 3
29 | Concatenate | - | - | - | 128 × 128 × 256 |
30 | Conv + BN | 1 | 3 × 3 × 128 | 1 | 128 × 128 × 128 |
31 | Conv + BN | 1 | 3 × 3 × 64 | 1 | 128 × 128 × 64 |
32 | Upsampling | - | - | - | 256 × 256 × 64 | up-sampling block 4
33 | Concatenate | - | - | - | 256 × 256 × 128 |
34 | Conv + BN | 1 | 3 × 3 × 64 | 1 | 256 × 256 × 64 |
35 | Upsampling | - | - | - | 512 × 512 × 64 | up-sampling block 5
36 | Conv + BN | 1 | 3 × 3 × 64 | 1 | 512 × 512 × 64 |
37 | Conv | - | 1 × 1 × 3 | 1 | 512 × 512 × 3 |
38 | Output | - | - | - | 512 × 512 × 3 |
Table 2. Confusion matrix of detection.
 | True | False
True | True Positive (TP) | False Negative (FN)
False | False Positive (FP) | True Negative (TN)
Table 3. Segmentation and fitting results.
ID | Detection Value/mm | Actual Value/mm | Error/mm | Relative Error/%
a | 69.26 | 71.92 | 2.66 | 3.7
b | 7.99 | 8.04 | 0.05 | 0.6
c | 19.34 | 19.59 | 0.25 | 1.3
d | 54.28 | 52.76 | 1.52 | 2.8
e | 28.14 | 27.82 | 0.32 | 1.1
f | 32.91 | 33.96 | 1.05 | 3.1
