Article

Optic Nerve Sheath Ultrasound Image Segmentation Based on CBC-YOLOv5s

1 Department of Clinical Engineering, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou 310009, China
2 College of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
3 Department of Emergency Medicine, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou 310009, China
4 Key Laboratory of the Diagnosis and Treatment of Severe Trauma and Burn of Zhejiang Province, Zhejiang Province Clinical Research Center for Emergency and Critical Care Medicine, Hangzhou 310009, China
5 Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Department of Clinical Engineering, School of Medicine, The Second Affiliated Hospital, Zhejiang University, Hangzhou 310009, China
6 Department of Electrical and Computer Engineering, Rowan University, Glassboro, NJ 08028, USA
7 Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA
* Authors to whom correspondence should be addressed.
Electronics 2024, 13(18), 3595; https://doi.org/10.3390/electronics13183595
Submission received: 18 July 2024 / Revised: 25 August 2024 / Accepted: 2 September 2024 / Published: 10 September 2024
(This article belongs to the Special Issue Deep Learning-Based Object Detection/Classification)

Abstract

The diameter of the optic nerve sheath is an important indicator for assessing intracranial pressure in critically ill patients. Methods for measuring the optic nerve sheath diameter are generally divided into invasive and non-invasive ones. Compared to the invasive methods, the non-invasive methods are safer and have thus gained popularity. Among the non-invasive methods, using deep learning to process ultrasound images of the eyes of critically ill patients and promptly output the diameter of the optic nerve sheath offers significant advantages. This paper proposes CBC-YOLOv5s, an optic nerve sheath ultrasound image segmentation method that integrates both local and global features. First, it introduces the CBC-Backbone feature extraction network, which consists of dual-layer C3 Swin-Transformer (C3STR) and dual-layer Bottleneck Transformer (BoT3) modules. The multi-layer convolutions and residual connections of the C3STR module focus on the local features of the optic nerve sheath, while the Window Transformer Attention (WTA) mechanism in the C3STR module and the Multi-Head Self-Attention (MHSA) in the BoT3 module enhance the model’s understanding of the global features of the optic nerve sheath. The extracted local and global features are fully integrated in the Spatial Pyramid Pooling Fusion (SPPF) module. Additionally, the CBC-Neck feature pyramid is proposed, which includes a single-layer C3STR module and three CReToNeXt (CRTN) modules. During upsampling feature fusion, the C3STR module is used to enhance the local and global awareness of the fused features. During downsampling feature fusion, the multi-level residual design of the CRTN module helps the network to better capture the global features of the optic nerve sheath within the fused features. The introduction of these modules achieves a thorough integration of the local and global features, enabling the model to efficiently and accurately identify the optic nerve sheath boundaries, even when the ocular ultrasound images are blurry or the boundaries are unclear. The Z2HOSPITAL-5000 dataset, collected from the Second Affiliated Hospital of Zhejiang University School of Medicine, was used for the experiments. Compared to the widely used YOLOv5s and U-Net algorithms, the proposed method shows improved performance on the blurry test set. Specifically, it achieves precision, recall, and Intersection over Union (IoU) values that are 4.1%, 2.1%, and 4.5% higher than those of YOLOv5s. Compared to U-Net, the precision, recall, and IoU are improved by 9.2%, 21%, and 19.7%, respectively.

1. Introduction

Increased intracranial pressure (ICP) is a common and potentially fatal complication in neurosurgery, neurology, pediatrics, and ophthalmology, usually caused by acute brain injury, intracranial tumors, cerebral hemorrhage, hydrocephalus, or intracranial inflammation [1]. Timely intervention is typically required to prevent adverse outcomes. Therefore, it is crucial for physicians to accurately and quickly determine whether a patient exhibits signs of increased ICP. In recent years, various ICP measurement methods have emerged, both non-invasive and invasive. The invasive methods generally involve implanting sensors into the cranium to directly measure the ICP, such as strain gauge devices, pneumatic sensors, and fiber optic sensors. While the invasive methods offer high accuracy and reliability, they also carry risks of local wound infection and bleeding. In contrast, the non-invasive methods, although less direct, are sufficiently safe and more suitable for patients with milder symptoms. The non-invasive methods include Transcranial Doppler (TCD) and Optic Nerve Sheath Diameter (ONSD) measurement [2]. The optic nerve sheath is a thin membrane that envelops the optic nerve, providing protection and support. Increased ICP can lead to optic nerve sheath edema, characterized by the accumulation and expansion of fluid within the sheath. Therefore, observing and evaluating changes in the optic nerve sheath are crucial for intracranial pressure monitoring and the diagnosis of related diseases [3].
Currently, the most common methods for obtaining eye images are ultrasound and CT. Both traditional image processing methods and deep learning methods can process the ultrasound or CT images of the eyes and output the ONSD. In traditional image processing, Soroushmehr et al. [4] proposed a simple line integral method for detecting the ONSD. First, the image is denoised, and then the region of interest is detected using a simple line integral method. Subsequently, the ONSD is measured by analyzing superpixels. Among the deep learning methods [5,6,7,8,9,10], Pang et al. [5] proposed a new method for the automatic semantic segmentation of B-mode ocular ultrasound images based on the U-Net network. They added two dropout layers at the end of the U-Net encoder and obtained a 97.20% IoU on a well-selected test set. Hirzallah et al. [6] used the U-Net model to label the boundaries of the ONSD, with the ground-truth boundaries defined by experts. The average difference between the ONSD obtained from the U-Net segmentation results and that determined by the experts was 0.19 mm. Meiburger et al. [7] proposed a method for automatically measuring the Optic Nerve Diameter (OND) and ONSD using a U-Net network with a ResNet50 encoder for optic nerve segmentation. On a small test set, the automatic measurements were compared with manual measurements obtained by an operator, yielding average errors of 0.07 ± 0.34 mm for the OND and −0.07 ± 0.67 mm for the ONSD. Xiao [8] proposed a system capable of segmenting and calculating the ONSD in transorbital sonography (TOS) images. This system was trained using a pre-trained fully convolutional network (FCN) on 464 images from 110 different patients. The final automatic measurements of the OND and ONSD had errors of −0.12 ± 0.32 mm and 0.14 ± 0.58 mm, respectively, compared to the manual operator assessments. Ranjbarzadeh et al. [9] first used fuzzy C-means clustering and histogram equalization to preprocess CT scan images and then trained a Convolutional Neural Network (CNN) on 1600 CT scans. The network achieved Dice coefficient, specificity, and precision values of 87.7%, 91.3%, and 90.1%, respectively. Junejo et al. [10] explored the use of point-of-care ultrasound (POCUS) devices combined with artificial intelligence algorithms for ONSD measurement. After training a CNN on ocular ultrasound images, the precision, recall, and F1 score for detecting an ONSD outside the normal range of 0.3 to 0.6 cm were 0.34, 0.83, and 0.49, respectively.
Most of the aforementioned methods that use U-Net [11] or CNN networks for ONSD prediction are limited by the local receptive field of convolutions. As a result, when predicting the edges of the optic nerve sheath, these methods tend to overlook the global features of the optic nerve sheath image, leading to suboptimal segmentation performance on blurry ocular ultrasound images. Aside from U-Net [11] and CNNs, YOLO networks [12,13,14,15] can also be applied to segmentation tasks. The YOLOv5s network and its variants use CSPDarknet53 as the backbone network, combined with the Cross-Stage Partial Network (CSPNet) structure, which can effectively extract multi-scale features from images. Its deep feature extraction capability helps to improve the segmentation precision. However, YOLOv5s likewise emphasizes local features. Therefore, to achieve better performance in optic nerve sheath segmentation tasks, its attention to global features must be enhanced. Both the Transformer [16] and the Swin-Transformer [17] implement attention mechanisms that focus on the global features of the input image. Thus, introducing Transformer and Swin-Transformer modules into the YOLOv5s network can compensate for its tendency to focus primarily on local features.
This paper proposes a CBC-YOLOv5s optic nerve sheath ultrasound image segmentation method that integrates both local and global features. The key contributions are as follows:
  • CBC-Backbone for Feature Extraction: We introduce the CBC-Backbone architecture that integrates dual-layer C3STR [17] modules and dual-layer BoT3 [18] modules. The C3STR module focuses on the local features of the optic nerve sheath through multi-layer convolutions and residual connections. Additionally, the WTA mechanism within the C3STR module and the MHSA within the BoT3 modules enhance the network’s understanding of global features, thereby improving the overall segmentation performance.
  • CBC-Neck for Feature Fusion: The CBC-Neck feature pyramid is introduced, consisting of a single-layer C3STR module and three CRTN [19] modules. During upsampling, the C3STR module increases the network’s awareness of both local and global features, while the CRTN modules, with their multi-level residual design, improve the network’s ability to capture global features during downsampling. This thorough integration of features enhances the model’s robustness, especially in handling blurry or unclear optic nerve sheath boundaries in ultrasound images.
  • Local and Global Feature Integration: The introduction of these modules enables the effective fusion of local and global features in ocular ultrasound images, allowing the network to efficiently and accurately capture the boundaries of the optic nerve sheath even when the images are blurry or the edges are not clearly defined. This, in turn, can better assist medical professionals in making clinical judgments.
The remainder of this paper is organized as follows: Section 2 details the proposed network architecture. Section 3 discusses the datasets used in this paper and presents our experimental results, including the visualization of the attention maps, ablation experiments, and evaluations based on metrics such as precision, recall, IoU, model memory use, number of parameters, and inference time. Finally, the conclusions are provided in Section 4.

2. Optic Nerve Sheath Image Segmentation Based on CBC-YOLOv5s

The basic framework of the optic nerve sheath image segmentation algorithm based on CBC-YOLOv5s is shown in Figure 1. In the CBC-Backbone feature extraction network, five layers of CBS modules, together with the C3 modules contained in the C3STR modules, enhance the network’s perception of local image features. Simultaneously, the WTA mechanism in the C3STR module and the MHSA mechanism in the BoT3 module accurately capture the global features of the optic nerve sheath images. The SPPF module then facilitates the interaction of the global and local features of the optic nerve sheath. In the CBC-Neck feature pyramid, four layers of CBS modules and the C3STR module are again used to capture the local features of the optic nerve sheath. Additionally, the WTA mechanism of the C3STR module and the multi-level residual design of the CRTN module enhance the network’s perception of contextual and global information in the fused features, enabling the segmenter to achieve more precise semantic segmentation across large-, medium-, and small-scale feature maps.

2.1. CBC-Backbone Feature Extraction Backbone Network

To enhance the backbone network’s ability to attend to the overall features of the image while extracting deep features, the C3STR module with global perception and the BoT3 module are introduced. The CBC-Backbone feature extraction network consists of CBS, C3STR, BoT3, and SPPF modules. It enhances the feature representation capability through layer-by-layer CBS modules, helping the network to better perceive the local features of the input image. Additionally, C3STR and BoT3 modules are inserted between the CBS modules. The C3STR module includes a C3 module with residual structures, which strengthens the network’s focus on deep local information and alleviates the vanishing gradient problem during deep feature extraction. Moreover, the WTA mechanism of the C3STR module and the MHSA mechanism of the BoT3 module enable the network to perceive the global information of the image. Finally, the SPPF module performs pooling operations on feature maps at different scales, achieving the interaction of local and global information. The CBC-Backbone gradually extracts and processes deep semantic information from the input image, fully mining local and global information, and generates feature maps at different scales that are fed into the upsampling feature fusion stage of the CBC-Neck.
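Since every named module above builds on the CBS (Conv-BatchNorm-SiLU) unit, a minimal PyTorch sketch of that unit is given below; the channel counts, kernel size, and stride in the example are illustrative assumptions rather than the exact CBC-Backbone configuration.

import torch
import torch.nn as nn

class CBS(nn.Module):
    """Conv + BatchNorm + SiLU: the basic convolutional unit stacked
    layer by layer in the CBC-Backbone to extract local features."""
    def __init__(self, c_in, c_out, k=3, s=2):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

# Example: one stride-2 CBS stage halves the spatial resolution.
x = torch.randn(1, 3, 640, 640)
print(CBS(3, 64)(x).shape)   # torch.Size([1, 64, 320, 320])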

2.1.1. C3STR Module

Figure 2 illustrates that the C3STR module consists of a single-layer C3, double-layer LN, single-layer WTA, single-layer MLP, and two residual structures. First, the C3 module convolves and fuses the optic nerve sheath features, and the feature map is normalized using LayerNorm. Next, the WTA module combines window attention (W-MSA) with shifted window attention (SW-MSA). As shown in Figure 2, the W-MSA mechanism divides the input into local windows and performs self-attention calculations within these windows to facilitate the interaction of local and global information. However, W-MSA alone sometimes fails to capture the global features of the optic nerve sheath in ocular ultrasound images due to its window size limitation. Therefore, SW-MSA is introduced, which shifts the windows by a fixed size in alternating layers, helping the model to better understand the entire ocular ultrasound image and capture the global features of the optic nerve sheath more effectively. The MLP module then uses fully connected layers to adjust and reorganize the features produced by the WTA module, helping the model to better utilize the features obtained by the attention mechanism. Finally, the optic nerve sheath features processed through the first residual connection are combined with the features extracted by the MLP module through a second residual connection, further increasing the network depth.
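To make the W-MSA/SW-MSA distinction concrete, the following is a simplified PyTorch sketch of windowed attention: self-attention is computed inside fixed windows, and the shifted variant cyclically rolls the feature map first so that information can cross window borders. The relative position bias and the attention mask that the full Swin-Transformer applies to shifted windows are omitted, so this illustrates the mechanism rather than reproducing the module exactly.

import torch
import torch.nn as nn

class WindowAttention(nn.Module):
    def __init__(self, dim, window=7, heads=4, shift=False):
        super().__init__()
        self.window, self.shift = window, shift
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):        # x: (B, C, H, W); H and W divisible by window
        B, C, H, W = x.shape
        w = self.window
        if self.shift:           # SW-MSA: cyclic shift before windowing
            x = torch.roll(x, shifts=(-(w // 2), -(w // 2)), dims=(2, 3))
        # partition into (B * num_windows, w * w, C) token groups
        x = x.view(B, C, H // w, w, W // w, w)
        x = x.permute(0, 2, 4, 3, 5, 1).reshape(-1, w * w, C)
        x, _ = self.attn(x, x, x)                 # attention within each window
        # merge the windows back into a (B, C, H, W) map
        x = x.view(B, H // w, W // w, w, w, C)
        x = x.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)
        if self.shift:           # undo the cyclic shift
            x = torch.roll(x, shifts=(w // 2, w // 2), dims=(2, 3))
        return x

# W-MSA (shift=False) vs. SW-MSA (shift=True) on a 28x28, 96-channel map:
x = torch.randn(1, 96, 28, 28)
print(WindowAttention(96, shift=True)(x).shape)   # torch.Size([1, 96, 28, 28])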

2.1.2. BoT3 Module

As shown in Figure 3, the BoT3 module consists of three CBS modules, a BT module, and a CONCAT module. The input feature map to this module is split into two branches. One branch undergoes feature extraction through a CBS module, followed by the BT module, which leverages the MHSA module to capture global relationships among features. The other branch passes through a CBS module only and is then concatenated with the processed features from the first branch. The MHSA module enables interaction between different positions within the feature map, enhancing the richness and diversity of the feature representation. By incorporating MHSA, the BoT3 module can better capture global information and long-range dependencies, which aids in extracting complex optic nerve sheath features from ultrasound images.
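The following is a minimal sketch of the global attention step inside the BT block: the H x W positions of the feature map are flattened into a token sequence and passed through multi-head self-attention, so every position can attend to every other. The 2D relative position encodings used by the original Bottleneck Transformer [18] are omitted for brevity, and the residual-plus-LayerNorm wrapping shown is a common convention assumed here.

import torch
import torch.nn as nn

class MHSABlock(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):                       # x: (B, C, H, W)
        B, C, H, W = x.shape
        seq = x.flatten(2).transpose(1, 2)      # (B, H*W, C): one token per position
        out, _ = self.attn(seq, seq, seq)       # global attention across all positions
        seq = self.norm(seq + out)              # residual connection + LayerNorm
        return seq.transpose(1, 2).reshape(B, C, H, W)

# A 20x20 map yields a 400-token sequence; the output shape is unchanged:
print(MHSABlock(256)(torch.randn(1, 256, 20, 20)).shape)   # torch.Size([1, 256, 20, 20])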

2.2. CBC-Neck Feature Pyramid

To help the Neck feature pyramid obtain more contextual and global information during the feature fusion process, the C3STR module and the CRTN module are introduced. The CBC-Neck feature pyramid consists of CBS modules, UPSAMPLE modules, C3STR modules, CONCAT modules, and CRTN modules. In the upsampling feature fusion process, the fused features from the SPPF module in the CBC-Backbone are upsampled through two cascaded operations of CBS and UPSAMPLE modules. After the first upsampling, the output feature map from the second BoT3 module in the CBC-Backbone is concatenated with the upsampled features, and the C3STR module, which does not contain residual structures, is used to enhance the perception of local and global features while fusing these features. After the second upsampling, the features are concatenated with the output features from the second C3STR module in the CBC-Backbone, and the concatenated feature map is passed to the downsampling stage. In the downsampling feature fusion process, the first CRTN module initially fuses the input concatenated feature map. The output features from the CRTN module then undergo two cascaded operations of CBS and CRTN modules for downsampling. After each downsampling, the features are concatenated with the same scale features from the upsampling part, resulting in a richer feature representation. After each concatenation, the CRTN module is used for feature fusion. The CRTN module, with its multiple residual structures, enhances the network’s perception of contextual and global information, aiding in more accurate segmentation operations at the corresponding scales. The CBC-Neck achieves multi-scale feature fusion through a pyramid structure, effectively integrating local and global information. This enhances the network’s performance in handling blurry ocular ultrasound images.
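Schematically, one upsampling fusion step of the CBC-Neck can be sketched as below; the downsampling steps follow the same concat-then-fuse pattern, but with a stride-2 CBS in place of the interpolation. The module arguments are placeholders for the CBS, C3STR, and CRTN blocks described above, so this is a sketch of the wiring rather than the exact implementation.

import torch
import torch.nn.functional as F

def upsample_fuse(deep_feat, skip_feat, cbs, fuse):
    """deep_feat: lower-resolution map from the previous neck stage;
    skip_feat: same-scale map taken from the CBC-Backbone;
    cbs: a CBS block; fuse: the fusion module (C3STR on the upsampling
    path, CRTN on the downsampling path)."""
    x = cbs(deep_feat)                                    # channel reduction
    x = F.interpolate(x, scale_factor=2, mode="nearest")  # 2x upsampling
    x = torch.cat([x, skip_feat], dim=1)                  # same-scale concatenation
    return fuse(x)                                        # local/global feature fusion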

CRTN Module

As shown in Figure 4, the CRTN module is composed of two layers of CBSW modules integrated with the Swish activation function and three layers of Basic Block Residual (BBR) modules. The input feature map first goes through a CBSW structure for initial feature extraction and then enters the BBR modules. After that, feature maps that have undergone zero to three BBR module operations are concatenated. Finally, the concatenated feature map is fed into another CBSW structure for a second round of feature extraction. Compared to the conventional C3 module, the CRTN module, through its modular design and multi-level feature fusion, can better capture and utilize the global features of the optic nerve sheath in eye ultrasound images, thereby improving segmentation precision.
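A minimal sketch of the CRTN layout described above follows: a CBSW (Conv-BatchNorm-Swish) stem, three stacked BBR residual blocks, concatenation of the feature maps after zero, one, two, and three BBR operations, and a second CBSW projection. The channel plan and block internals are illustrative assumptions.

import torch
import torch.nn as nn

class CBSW(nn.Module):
    """Conv + BatchNorm + Swish (SiLU equals Swish with beta = 1)."""
    def __init__(self, c_in, c_out, k=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, 1, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class BBR(nn.Module):
    """Basic Block Residual: two 3x3 convolutions plus a skip connection."""
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(CBSW(c, c, 3), CBSW(c, c, 3))

    def forward(self, x):
        return x + self.body(x)

class CRTN(nn.Module):
    def __init__(self, c_in, c_out, hidden=64):
        super().__init__()
        self.stem = CBSW(c_in, hidden)                     # first feature extraction
        self.blocks = nn.ModuleList(BBR(hidden) for _ in range(3))
        self.proj = CBSW(4 * hidden, c_out)                # fuses the 4 concatenated branches

    def forward(self, x):
        x = self.stem(x)
        feats = [x]                                        # zero BBR operations
        for blk in self.blocks:                            # one, two, three BBR operations
            x = blk(x)
            feats.append(x)
        return self.proj(torch.cat(feats, dim=1))          # multi-level feature fusion

print(CRTN(128, 128)(torch.randn(1, 128, 40, 40)).shape)   # torch.Size([1, 128, 40, 40])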

3. Simulation Experiment and Analysis

3.1. Dataset

This paper evaluates the proposed algorithm using the Z2HOSPITAL-5000 dataset, which consists of 5000 eye ultrasound images provided by the Second Affiliated Hospital of Zhejiang University School of Medicine, located in Hangzhou, China.
The Z2HOSPITAL-5000 dataset includes ocular ultrasound images from various patients and acquisition angles. As shown in Table 1, the boundaries of the optic nerve sheath are clearly visible in the normal images, whereas, in the blurry images, these boundaries become indistinct due to variations in acquisition angles and patient conditions.
To accurately evaluate the performance of our algorithm on both normal and blurry images, the dataset’s 5000 images were split as follows: 3834 for the training set, 426 for the validation set, and 740 for the test set, with the test set comprising 342 blurry images and 398 normal images.
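For reproducibility, a split like this can be materialized by shuffling the file list once with a fixed seed, as in the sketch below; the directory layout and file extension are assumptions, while the 3834/426/740 counts come from the text.

import random
from pathlib import Path

files = sorted(Path("Z2HOSPITAL-5000/images").glob("*.png"))  # assumed layout
random.Random(0).shuffle(files)             # fixed seed for a reproducible split
train, val, test = files[:3834], files[3834:4260], files[4260:]
# The 740 test images are further labeled blurry/normal (342/398) by annotators.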

3.2. Experimental Environment and Parameter Settings

The experiments were conducted on a server equipped with an NVIDIA TITAN RTX 4090 GPU, sourced from PowerLeader Co., Ltd., Shenzhen, China, using the PyTorch deep learning framework. The input image resolution was adjusted for the optic nerve sheath ultrasound images. The training batch size was set to 16, and the total number of training epochs was 100. The loss function was optimized using the Stochastic Gradient Descent (SGD) optimizer, with the learning rate and weight decay coefficient configured as specified.
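A sketch of the corresponding optimizer setup is shown below. The learning rate, momentum, and weight decay values are common YOLOv5-style defaults used here as placeholders, since the paper does not report the exact numbers; the stand-in model is likewise a placeholder.

import torch
import torch.nn as nn

model = nn.Conv2d(3, 1, 3, padding=1)      # placeholder for the CBC-YOLOv5s model
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,                               # placeholder value, not from the paper
    momentum=0.937,                        # placeholder value, not from the paper
    weight_decay=5e-4,                     # placeholder value, not from the paper
)
EPOCHS, BATCH_SIZE = 100, 16               # as reported in the text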

3.3. Model Performance Evaluation Metrics

This paper visualizes the optic nerve sheath features extracted by the model. By examining the visualized feature maps, it is possible to directly observe the model’s focus areas within the image, enabling us to assess whether the algorithm effectively emphasizes both the local and global features of the optic nerve sheath.
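One common way to produce such visualizations is to register a forward hook on an intermediate layer, average its activation across channels, and normalize the result into a heatmap; a minimal sketch is given below. The layer choice and the channel-mean reduction are illustrative assumptions, not necessarily the exact procedure behind Figure 5.

import torch
import torch.nn as nn

def channel_mean_heatmap(model, layer, image):
    """image: (1, C, H, W) tensor; returns a normalized heatmap from `layer`."""
    captured = {}
    handle = layer.register_forward_hook(
        lambda mod, inp, out: captured.update(feat=out.detach()))
    with torch.no_grad():
        model(image)                                     # hook captures the activation
    handle.remove()
    fmap = captured["feat"].mean(dim=1)[0]               # channel-wise mean -> (h, w)
    return (fmap - fmap.min()) / (fmap.max() - fmap.min() + 1e-9)

# Example with a toy model; hook its first convolution:
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 16, 3, padding=1))
print(channel_mean_heatmap(model, model[0], torch.randn(1, 3, 64, 64)).shape)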
Additionally, the model is evaluated using six metrics: precision, recall, IoU, model memory use, number of parameters, and inference time.
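The three mask-level metrics follow the standard pixel-wise definitions, which can be computed as in the sketch below; this assumes binary masks and reflects our reading of the metrics rather than the authors’ exact evaluation code.

import numpy as np

def mask_metrics(pred, gt, eps=1e-9):
    """pred, gt: boolean arrays of the same shape (True = optic nerve sheath)."""
    tp = np.logical_and(pred, gt).sum()       # true positives
    fp = np.logical_and(pred, ~gt).sum()      # false positives
    fn = np.logical_and(~pred, gt).sum()      # false negatives
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    iou = tp / (tp + fp + fn + eps)
    return precision, recall, iou

# Example: a 2x2 prediction inside a 3x3 ground-truth region.
pred = np.zeros((4, 4), bool); pred[1:3, 1:3] = True
gt = np.zeros((4, 4), bool); gt[1:4, 1:4] = True
print(mask_metrics(pred, gt))   # precision 1.0, recall ~0.44, IoU ~0.44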

3.4. Segmentation Performance

To evaluate the segmentation capability of the proposed algorithm for optic nerve sheath images, YOLOv5s, the U-Net [11] network, and the proposed algorithm are trained on the Z2HOSPITAL-5000 dataset.

3.4.1. Feature Visualization

Figure 5 illustrates that both the U-Net [11] network and the YOLOv5s network simultaneously focus on the optic nerve sheath and other areas, with the U-Net [11] network paying more attention to the edge area compared to the YOLOv5s network. In contrast, the proposed algorithm primarily focuses on the optic nerve sheath.

3.4.2. Specific Segmentation Example

As shown in Figure 6, compared to the manual annotations by the operators, for the normal image with a clear optic nerve sheath boundary, the U-Net [11] network makes some segmentation errors due to over-focusing on other regions. In contrast, YOLOv5s and the proposed algorithm focus more accurately on the optic nerve sheath area, resulting in more precise segmentation. For the blurry image, given the indistinct optic nerve sheath boundary, both the U-Net [11] and YOLOv5s networks, limited by their local receptive fields, exhibit significant segmentation errors. However, the proposed algorithm achieves a closer match to the true annotation by capturing the global information of the optic nerve sheath.

3.4.3. Blurry Test Set Performance Comparison

The segmentation performance of the three algorithms on the Z2HOSPITAL-5000 blurry test set is shown in Table 2.
Analyzing the segmentation results on the Z2HOSPITAL-5000 blurry test set, compared to the U-Net [11] network, the proposed algorithm shows increases in precision, recall, and IoU by 9.2%, 21%, and 19.7%, respectively; compared to the traditional YOLOv5s network, the proposed algorithm shows increases in precision, recall, and IoU by 4.1%, 2.1%, and 4.5%, respectively.
The proposed algorithm enhances the global feature attention, enabling it to capture the main characteristics of the optic nerve sheath even when its edges are blurry, thus achieving more accurate segmentation.

3.4.4. Total Test Set Performance Comparison

The segmentation performance of the three algorithms on the Z2HOSPITAL-5000 total test set is shown in Table 3.
As shown in Table 3, compared to the U-Net [11] network, the proposed algorithm’s precision, recall, and IoU are higher by 7.2%, 14.7%, and 17.5%, respectively. Compared to the traditional YOLOv5s network, our algorithm’s precision, recall, and IoU are higher by 1.8%, 0.4%, and 1.9%, respectively.
The proposed algorithm improves upon the YOLOv5s model by enhancing the global perception without sacrificing the original model’s focus on local features. As a result, the proposed algorithm achieves certain improvements regarding the total test set, which includes both normal and blurry images.

3.4.5. Ablation Experiments

To verify the contributions of the C3STR, BoT3, and CRTN modules in the proposed algorithm, we designed the following ablation experiments: based on the traditional YOLOv5s network structure, we added the C3STR, BoT3, and CRTN modules individually to obtain C3STR-YOLOv5s, BoT3-YOLOv5s, and CRTN-YOLOv5s, respectively. Additionally, we combined these modules into the CBC-Backbone and CBC-Neck to obtain CBC-Backbone-YOLOv5s and CBC-Neck-YOLOv5s. We evaluated these five modified networks along with the U-Net network, the YOLOv5s network, and the proposed algorithm on the Z2HOSPITAL-5000 blurry test set.
As shown in Table 4, compared to the U-Net [11] and YOLOv5s networks, the C3STR-YOLOv5s, BoT3-YOLOv5s, CRTN-YOLOv5s, CBC-Backbone-YOLOv5s, and CBC-Neck-YOLOv5s models all demonstrated superior segmentation performance, with the proposed algorithm achieving the best results. These ablation experiments highlight the critical role of the three additional modules in the proposed algorithm, showing that their contributions are complementary and essential for achieving high-performance segmentation.

3.4.6. Comparison of Model Efficiency

As shown in Table 5, compared to U-Net [11] and the traditional YOLOv5s network, the proposed algorithm reduces the memory use by 109.264 MB and 2.903 MB, respectively, and decreases the number of parameters by 25,518,361 and 1,540,535, respectively. These reductions yield a more lightweight model, enabling faster feature extraction and quicker training convergence. However, the introduction of certain complex attention mechanisms results in a slightly longer inference time compared to the YOLOv5s network.
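For reference, parameter counts like those in Table 5 can be reproduced for any PyTorch model with a one-line reduction; a minimal sketch:

import torch.nn as nn

def count_parameters(model):
    """Total number of parameters, as reported in Table 5."""
    return sum(p.numel() for p in model.parameters())

print(count_parameters(nn.Conv2d(3, 16, 3)))   # 448 = 3*16*3*3 weights + 16 biases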

4. Conclusions

This paper proposes a novel algorithm for optic nerve sheath ultrasound image segmentation based on CBC-YOLOv5s. The algorithm effectively integrates global and local features, capturing the contextual relationships between the global and local characteristics of the optic nerve sheath in ocular ultrasound images. Compared to the U-Net [11] and YOLOv5s networks, our algorithm demonstrates superior segmentation performance, particularly on blurry ocular ultrasound images.

Author Contributions

Conceptualization, Y.C., J.Z. and C.W.; methodology, Y.C. and C.W.; visualization, J.X.; investigation, Y.C. and J.Z.; software, J.X.; writing—original draft preparation, Y.C., J.X. and C.W.; writing—review and editing, J.Y., H.W. and Y.Y.; validation, L.S.; project administration, Y.C., J.Z. and L.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Development Program of Zhejiang Province (2022C03111 and 2023C03088) and the Scientific Research Fund of the National Health Commission-Major Health Science and Technology Program of Zhejiang Province (WKJ-ZJ-2334).

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of The Second Affiliated Hospital of Zhejiang University School of Medicine (Project identification code: 2023-0687-Research, 6 September 2023).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to the sensitive nature of the patient data collected from The Second Affiliated Hospital, Zhejiang University. To protect patient privacy and confidentiality, these data are not publicly available. Requests to access the datasets will be considered on a case-by-case basis, ensuring that all necessary ethical and privacy considerations are addressed.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Steiner, L.A.; Andrews, P.J.D. Monitoring the injured brain: ICP and CBF. BJA Br. J. Anaesth. 2006, 97, 26–38. [Google Scholar] [CrossRef] [PubMed]
  2. Harary, M.; Dolmans, R.G.F.; Gormley, W.B. Intracranial pressure monitoring—Review and avenues for development. Sensors 2018, 18, 465. [Google Scholar] [CrossRef] [PubMed]
  3. Dubourg, J.; Messerer, M.; Karakitsos, D.; Rajajee, V.; Antonsen, E.; Javouhey, E.; Cammarata, A.; Cotton, M.; Daniel, R.T.; Denaro, C.; et al. Individual patient data systematic review and meta-analysis of optic nerve sheath diameter ultrasonography for detecting raised intracranial pressure: Protocol of the ONSD research group. Syst. Rev. 2013, 2, 1–6. [Google Scholar] [CrossRef] [PubMed]
  4. Soroushmehr, R.; Rajajee, K.; Williamson, C.; Gryak, J.; Najarian, K.; Ward, K.; Tiba, M.H. Automated optic nerve sheath diameter measurement using super-pixel analysis. In Proceedings of the 2019 IEEE 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; pp. 2793–2796. [Google Scholar]
  5. Pang, M.; Liu, S.; Lin, F.; Liu, S.; Tian, B.; Yang, W.; Chen, X. Measurement of optic nerve sheath on ocular ultrasound image based on segmentation by CNN. In Proceedings of the 2019 IEEE International Conference on Signal, Information and Data Processing (ICSIDP), Chongqing, China, 11–13 December 2019; pp. 1–5. [Google Scholar]
  6. Hirzallah, M.I.; Bose, S.; Hu, J.; Maltz, J.S. Automation of ultrasonographic optic nerve sheath diameter measurement using convolutional neural networks. J. Neuroimaging 2023, 33, 898–903. [Google Scholar] [CrossRef] [PubMed]
  7. Meiburger, K.M.; Naldi, A.; Lochner, P.; Marzola, F. Automatic segmentation of the optic nerve in transorbital ultrasound images using a deep learning approach. In Proceedings of the 2021 IEEE International Ultrasonics Symposium (IUS), Virtual, 11–16 September 2021; pp. 1–4. [Google Scholar]
  8. Xiao, Y. Automatic Optic Nerve Assessment From Transorbital Ultrasound Images: A Deep Learning-based Approach. Curr. Med. Imaging 2024, 20. [Google Scholar] [CrossRef] [PubMed]
  9. Ranjbarzadeh, R.; Dorosti, S.; Ghoushchi, S.J.; Safavi, S.; Razmjooy, N.; Sarshar, N.T.; Anari, S.; Bendechache, M. Nerve optic segmentation in CT images using a deep learning model and a texture descriptor. Complex Intell. Syst. 2022, 8, 3543–3557. [Google Scholar] [CrossRef]
  10. Junejo, N.; Khazaei, D.; Lipor, J.; Ng, J.; Etesami, F.; Khazaei, H. Evaluation of post-traumatic optic neuropathy by using POCUS comparing optic nerve sheath diameter in phantom tissue and real world to train deep neural networks. Investig. Ophthalmol. Vis. Sci. 2024, 65, 5484. [Google Scholar]
  11. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015), Munich, Germany, 5–9 October 2015; Springer: Cham, Switzerland, 2015; pp. 234–241. [Google Scholar]
  12. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  13. Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [Google Scholar]
  14. Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
  15. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
  16. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar]
  17. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 10012–10022. [Google Scholar]
  18. Srinivas, A.; Lin, T.Y.; Parmar, N.; Shlens, J.; Abbeel, P.; Vaswani, A. Bottleneck transformers for visual recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 16519–16529. [Google Scholar]
  19. Xu, X.; Jiang, Y.; Chen, W.; Huang, Y.; Zhang, Y.; Sun, X. Damo-yolo: A report on real-time object detection design. arXiv 2022, arXiv:2211.15444. [Google Scholar]
Figure 1. CBC-YOLOv5s optic nerve sheath segmentation algorithm.
Figure 2. C3STR module.
Figure 3. BoT3 module.
Figure 4. CRTN module.
Figure 5. Different algorithms for visualization with normal and blurry images.
Figure 6. Different algorithms for segmentation examples with normal and blurry images.
Table 1. Part of the Z2HOSPITAL-5000 dataset: example normal and blurry ocular ultrasound images.
Table 2. Segmentation performance comparison of different algorithms on the Z2HOSPITAL-5000 blurry test set.

Algorithm | Precision P/% | Recall R/% | IoU I/%
U-Net [11] | 70.2 | 64.6 | 50.3
YOLOv5s | 75.3 | 83.5 | 65.5
The proposed algorithm | 79.4 | 85.6 | 70.0
Table 3. Segmentation performance comparison of different algorithms on the Z2HOSPITAL-5000 total test set.

Algorithm | Precision P/% | Recall R/% | IoU I/%
U-Net [11] | 85.1 | 78.3 | 68.8
YOLOv5s | 90.5 | 92.6 | 84.4
The proposed algorithm | 92.3 | 93.0 | 86.3
Table 4. Comparison of segmentation performance of different improved algorithms on the Z2HOSPITAL-5000 blurry test set.

Algorithm | Precision P/% | Recall R/% | IoU I/%
U-Net [11] | 70.2 | 64.6 | 50.3
YOLOv5s | 75.3 | 83.5 | 65.5
CRTN-YOLOv5s | 76.5 | 84.9 | 67.3
CBC-Neck-YOLOv5s | 76.6 | 84.8 | 67.3
BoT3-YOLOv5s | 76.6 | 85.6 | 67.5
CBC-Backbone-YOLOv5s | 76.8 | 84.9 | 67.5
C3STR-YOLOv5s | 77.4 | 84.2 | 67.6
The proposed algorithm | 79.4 | 85.6 | 70.0
Table 5. Comparison of memory use, parameter count, and inference time for different algorithm models.

Algorithm | Model Memory Use/MB | Parameters | Inference Time/ms
U-Net [11] | 121.230 | 31,378,945 | 230
YOLOv5s | 14.869 | 7,401,119 | 6.5
The proposed algorithm | 11.966 | 5,860,584 | 16.6