DeMambaNet: Deformable Convolution and Mamba Integration Network for High-Precision Segmentation of Ambiguously Defined Dental Radicular Boundaries
Abstract
1. Introduction
- Proposed DeMambaNet for panoramic dental X-ray segmentation. Its dual-pathway encoder extracts multilevel features to address three challenges evident in these images: the similar radiographic density of dental and osseous tissues, intricate root geometries, and overlapping teeth. The source code is available on GitHub (https://github.com/IMOP-lab/DeMambaNet) to support further research and clinical adoption.
- Proposed the Hierarchical Convergence Decoder (HCD) for stratified feature fusion, which orchestrates and balances local and global information. The Triplet Attentional Feature Integration (TAFI) module maintains diversity in feature representation across the decoding phases.
- Implemented Deformable Convolution (DCN) and State Space Models (SSM) to better handle the overlaps and intersections that arise when three-dimensional dental structures are compressed into two-dimensional representations, drawing on the dynamic adaptability of DCN and the spatial resolution capabilities of SSM.
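The two building blocks named in the last contribution can be illustrated with a minimal, dependency-free sketch. It is not the DeMambaNet implementation: `ssm_scan` is a plain scalar, time-invariant state space recurrence (Mamba's selective scan makes the parameters input-dependent), and `deform_conv_at` evaluates a single 3x3 deformable convolution response with hand-supplied offsets (in a real DCN the offsets are predicted by a learned layer). All function names and parameter values are assumptions chosen for illustration.

```python
import math


def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Scalar discrete state space model: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t."""
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x  # state carries information from all earlier inputs
        ys.append(c * h)
    return ys


def bilinear_sample(img, y, x):
    """Sample a 2D list-of-lists image at fractional (y, x); zero outside the image."""
    height, width = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < height and 0 <= xx < width:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                total += weight * img[yy][xx]
    return total


def deform_conv_at(img, weights, offsets, cy, cx):
    """3x3 deformable convolution response centred at (cy, cx).

    weights: 3x3 kernel; offsets: 3x3 grid of (dy, dx) shifts, one per kernel tap.
    With all offsets equal to (0, 0) this reduces to an ordinary 3x3 convolution
    (cross-correlation form, as used in deep learning frameworks).
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            out += weights[i][j] * bilinear_sample(
                img, cy + (i - 1) + dy, cx + (j - 1) + dx
            )
    return out


if __name__ == "__main__":
    print(ssm_scan([1.0, 0.0, 0.0], a=0.5))
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    ones = [[1.0] * 3 for _ in range(3)]
    no_shift = [[(0.0, 0.0)] * 3 for _ in range(3)]
    print(deform_conv_at(image, ones, no_shift, 1, 1))
```

The recurrence shows why a state space pathway can carry context along a scanned sequence of pixels or patches, while the offset-shifted sampling shows how a deformable kernel can follow irregular contours such as root boundaries instead of a fixed square grid.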
2. Related Works
2.1. Traditional Computational Approaches in Dental X-ray Segmentation
2.2. Advancements in CNN-Based Dental Image Segmentation
2.3. Exploration of State Space Models in Image Segmentation
2.4. Advancements in Image Segmentation with Deformable Convolutions
2.5. Feature Enhancement and Fusion Techniques for Improved Segmentation
3. Methods
3.1. Coalescent Structural Deformable Encoder (CSDE)
3.2. Cognitively Optimized Semantic Enhancement Module (SEM)
3.3. Hierarchical Convergence Decoder (HCD)
4. Experiments and Results
4.1. Dataset
4.2. Evaluation Metrics
4.3. Implementation Details
4.4. Loss Function Formulation
4.5. Comparison with State-of-the-Art Methods
5. Discussion
5.1. Ablation Experiment
5.2. Clinical Application
5.3. Clinical Implementation Challenges and Potential Limitations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
DeMambaNet | Deformable Convolution and Mamba Integration Network |
CSDE | Coalescent Structural Deformable Encoder |
SEM | Cognitively Optimized Semantic Enhancement Module |
HCD | Hierarchical Convergence Decoder |
TAFI | Triplet Attentional Feature Integration |
SSP | State Space Pathway |
ADP | Adaptive Deformable Pathway |
DCN | Deformable Convolutional Networks |
SSM | State Space Models |
AFF | Attentional Feature Fusion |
EVC | Efficient Vision Center |
MLP | Multi-Layer Perceptron |
LVC | Learnable Vision Center |
DSC | Dice Similarity Coefficient |
HD95 | 95% Hausdorff Distance |
IoU | Intersection over Union |
MCC | Matthews Correlation Coefficient |
References
1. Seitz, M.W.; Listl, S.; Bartols, A.; Schubert, I.; Blaschke, K.; Haux, C.; van der Zande, M.M. Current knowledge on correlations between highly prevalent dental conditions and chronic diseases: An umbrella review [dataset]. Prev. Chronic Dis. 2019, 16, 180641.
2. Chen, Y.C.; Chen, M.Y.; Chen, T.Y.; Chan, M.L.; Huang, Y.Y.; Liu, Y.L.; Lee, P.T.; Lin, G.J.; Li, T.F.; Chen, C.A.; et al. Improving dental implant outcomes: CNN-based system accurately measures degree of peri-implantitis damage on periapical film. Bioengineering 2023, 10, 640.
3. Mao, Y.C.; Chen, T.Y.; Chou, H.S.; Lin, S.Y.; Liu, S.Y.; Chen, Y.A.; Liu, Y.L.; Chen, C.A.; Huang, Y.C.; Chen, S.L.; et al. Caries and restoration detection using bitewing film based on transfer learning with CNNs. Sensors 2021, 21, 4613.
4. Sivari, E.; Senirkentli, G.B.; Bostanci, E.; Guzel, M.S.; Acici, K.; Asuroglu, T. Deep learning in diagnosis of dental anomalies and diseases: A systematic review. Diagnostics 2023, 13, 2512.
5. Huang, X.; He, S.; Wang, J.; Yang, S.; Wang, Y.; Ye, X. Lesion detection with fine-grained image categorization for myopic traction maculopathy (MTM) using optical coherence tomography. Med. Phys. 2023, 50, 5398–5409.
6. Huang, X.; Huang, J.; Zhao, K.; Zhang, T.; Li, Z.; Yue, C.; Chen, W.; Wang, R.; Chen, X.; Zhang, Q.; et al. SASAN: Spectrum-Axial Spatial Approach Networks for Medical Image Segmentation. IEEE Trans. Med. Imaging 2024.
7. Huang, C.; Wang, J.; Wang, S.; Zhang, Y. A review of deep learning in dentistry. Neurocomputing 2023, 554, 126629.
8. Majanga, V.; Viriri, S. Dental Images’ Segmentation Using Threshold Connected Component Analysis. Comput. Intell. Neurosci. 2021, 2021, 2921508.
9. Muresan, M.P.; Barbura, A.R.; Nedevschi, S. Teeth detection and dental problem classification in panoramic X-ray images using deep learning and image processing techniques. In Proceedings of the 2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP), Cluj-Napoca, Romania, 3–5 September 2020; pp. 457–463.
10. Li, C.W.; Lin, S.Y.; Chou, H.S.; Chen, T.Y.; Chen, Y.A.; Liu, S.Y.; Liu, Y.L.; Chen, C.A.; Huang, Y.C.; Chen, S.L.; et al. Detection of dental apical lesions using CNNs on periapical radiograph. Sensors 2021, 21, 7049.
11. Moran, M.; Faria, M.; Giraldi, G.; Bastos, L.; Oliveira, L.; Conci, A. Classification of approximal caries in bitewing radiographs using convolutional neural networks. Sensors 2021, 21, 5192.
12. Buhari, P.A.M.; Mohideen, K. Deep Learning Approach for Partitioning of Teeth in Panoramic Dental X-ray Images. Int. J. Emerg. Technol. 2020, 11, 154–160.
13. Lin, J.; Huang, X.; Zhou, H.; Wang, Y.; Zhang, Q. Stimulus-guided adaptive transformer network for retinal blood vessel segmentation in fundus images. Med. Image Anal. 2023, 89, 102929.
14. Huang, X.; Yao, C.; Xu, F.; Chen, L.; Wang, H.; Chen, X.; Ye, J.; Wang, Y. MAC-ResNet: Knowledge distillation based lightweight multiscale-attention-crop-ResNet for eyelid tumors detection and classification. J. Pers. Med. 2022, 13, 89.
15. Alharbi, S.S.; AlRugaibah, A.A.; Alhasson, H.F.; Khan, R.U. Detection of Cavities from Dental Panoramic X-ray Images Using Nested U-Net Models. Appl. Sci. 2023, 13, 12771.
16. Xing, Z.; Ye, T.; Yang, Y.; Liu, G.; Zhu, L. SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation. arXiv 2024, arXiv:2401.13560.
17. Wang, W.; Dai, J.; Chen, Z.; Huang, Z.; Li, Z.; Zhu, X.; Hu, X.; Lu, T.; Lu, L.; Li, H.; et al. Internimage: Exploring large-scale vision foundation models with deformable convolutions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 14408–14419.
18. Quan, Y.; Zhang, D.; Zhang, L.; Tang, J. Centralized Feature Pyramid for Object Detection. IEEE Trans. Image Process. 2023, 32, 4341–4354.
19. Dai, Y.; Gieseke, F.; Oehmcke, S.; Wu, Y.; Barnard, K. Attentional feature fusion. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Virtual, 5–9 January 2021; pp. 3560–3569.
20. Larsson, G.; Maire, M.; Shakhnarovich, G. Fractalnet: Ultra-deep neural networks without residuals. arXiv 2016, arXiv:1605.07648.
21. Zhang, Y.; Ye, F.; Chen, L.; Xu, F.; Chen, X.; Wu, H.; Cao, M.; Li, Y.; Wang, Y.; Huang, X. Children’s dental panoramic radiographs dataset for caries segmentation and dental disease detection. Sci. Data 2023, 10, 380.
22. Panetta, K.; Rajendran, R.; Ramesh, A.; Rao, S.P.; Agaian, S. Tufts dental database: A multimodal panoramic X-ray dataset for benchmarking diagnostic systems. IEEE J. Biomed. Health Inform. 2021, 26, 1650–1659.
23. Huang, X.; Bajaj, R.; Li, Y.; Ye, X.; Lin, J.; Pugliese, F.; Ramasamy, A.; Gu, Y.; Wang, Y.; Torii, R.; et al. POST-IVUS: A perceptual organisation-aware selective transformer framework for intravascular ultrasound segmentation. Med. Image Anal. 2023, 89, 102922.
24. Sun, Y.; Huang, X.; Zhou, H.; Zhang, Q. SRPN: Similarity-based region proposal networks for nuclei and cells detection in histology images. Med. Image Anal. 2021, 72, 102142.
25. Huang, X.; Li, Z.; Lou, L.; Dan, R.; Chen, L.; Zeng, G.; Jia, G.; Chen, X.; Jin, Q.; Ye, J.; et al. GOMPS: Global Attention-Based Ophthalmic Image Measurement and Postoperative Appearance Prediction System. Expert Syst. Appl. 2023, 232, 120812.
26. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III 18. Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
27. Cai, S.; Tian, Y.; Lui, H.; Zeng, H.; Wu, Y.; Chen, G. Dense-UNet: A novel multiphoton in vivo cellular image segmentation model based on a convolutional neural network. Quant. Imaging Med. Surg. 2020, 10, 1275.
28. Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. Unet++: A nested u-net architecture for medical image segmentation. In Proceedings of the Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, 20 September 2018; Proceedings 4. Springer: Berlin/Heidelberg, Germany, 2018; pp. 3–11.
29. Wang, Z.; Zheng, J.Q.; Zhang, Y.; Cui, G.; Li, L. Mamba-unet: Unet-like pure visual mamba for medical image segmentation. arXiv 2024, arXiv:2402.05079.
30. Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. Transunet: Transformers make strong encoders for medical image segmentation. arXiv 2021, arXiv:2102.04306.
31. Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.Y.; Kainz, B.; et al. Attention u-net: Learning where to look for the pancreas. arXiv 2018, arXiv:1804.03999.
32. Alom, M.Z.; Hasan, M.; Yakopcic, C.; Taha, T.M.; Asari, V.K. Recurrent residual convolutional neural network based on u-net (r2u-net) for medical image segmentation. arXiv 2018, arXiv:1802.06955.
33. Paszke, A.; Chaurasia, A.; Kim, S.; Culurciello, E. Enet: A deep neural network architecture for real-time semantic segmentation. arXiv 2016, arXiv:1606.02147.
34. Zhao, H.; Qi, X.; Shen, X.; Shi, J.; Jia, J. Icnet for real-time semantic segmentation on high-resolution images. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 405–420.
35. Wang, Y.; Zhou, Q.; Liu, J.; Xiong, J.; Gao, G.; Wu, X.; Latecki, L.J. Lednet: A lightweight encoder-decoder network for real-time semantic segmentation. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 1860–1864.
36. Yuan, Y.; Huang, L.; Guo, J.; Zhang, C.; Chen, X.; Wang, J. Ocnet: Object context network for scene parsing. arXiv 2018, arXiv:1809.00916.
37. Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid scene parsing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–27 July 2017; pp. 2881–2890.
38. Badrinarayanan, V.; Kendall, A.; Cipolla, R. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
39. Ruan, J.; Xiang, S. Vm-unet: Vision mamba unet for medical image segmentation. arXiv 2024, arXiv:2402.02491.
40. Chen, L.C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking atrous convolution for semantic image segmentation. arXiv 2017, arXiv:1706.05587.
41. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818.
42. Isensee, F.; Jaeger, P.F.; Kohl, S.A.; Petersen, J.; Maier-Hein, K.H. nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 2021, 18, 203–211.
43. Gu, Z.; Cheng, J.; Fu, H.; Zhou, K.; Hao, H.; Zhao, Y.; Zhang, T.; Gao, S.; Liu, J. Ce-net: Context encoder network for 2d medical image segmentation. IEEE Trans. Med. Imaging 2019, 38, 2281–2292.
44. Wang, S.; Liang, S.; Chang, Q.; Zhang, L.; Gong, B.; Bai, Y.; Zuo, F.; Wang, Y.; Xie, X.; Gu, Y. STSN-Net: Simultaneous Tooth Segmentation and Numbering Method in Crowded Environments with Deep Learning. Diagnostics 2024, 14, 497.
Model | Dice (%) ↑ | IoU (%) ↑ | 95 Hausdorff ↓ | Accuracy (%) ↑ | Kappa (%) ↑ | MCC (%) ↑ | GFLOPS ↓ | Params (MB) ↓ |
---|---|---|---|---|---|---|---|---|
UNet [26] | | | | | | | 302.07 | 29.6 |
Dense-UNet [27] | | | | | | | 497.41 | 17.51 |
UNet++ [28] | | | | | | | 217.25 | 8.74 |
Mamba-UNet [29] | | | | | | | 17.61 | 9.53 |
TransUNet [30] | | | | | | | 147.51 | 64.72 |
Attention U-Net [31] | | | | | | | 416.77 | 33.26 |
R2U-Net [32] | | | | | | | 240.22 | 9.33 |
ENet [33] | | | | | | | 3.22 | 0.34 |
ICNet [34] | | | | | | | 57.88 | 26.98 |
LEDNet [35] | | | | | | | 9.92 | 2.21 |
OCNet [36] | | | | | | | 367.63 | 52.48 |
PSPNet [37] | | | | | | | 288.95 | 46.5 |
SegNet [38] | | | | | | | 251.24 | 28.08 |
VM-UNet [39] | | | | | | | 33.41 | 26.16 |
DeMambaNet (ours) | | | | | | | 216.24 | 41.25 |
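
The 95 Hausdorff column in the table above (HD95 in the abbreviation list) is a boundary-distance metric. The following is a minimal sketch of one common formulation, assuming each boundary is given as a non-empty list of (row, col) points and assuming a nearest-rank percentile. Toolkits differ on the percentile interpolation rule and on whether the two directions are pooled, so values from this sketch need not match the paper's evaluation code. The function names are illustrative and not taken from the DeMambaNet repository.

```python
import math


def directed_distances(a, b):
    """For every point in a, the Euclidean distance to its nearest point in b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def hd95(a, b):
    """Symmetric 95th-percentile Hausdorff distance between two point sets."""
    forward = percentile_nearest_rank(directed_distances(a, b), 95)
    backward = percentile_nearest_rank(directed_distances(b, a), 95)
    return max(forward, backward)


def hausdorff(a, b):
    """Classical (maximum) Hausdorff distance, shown for comparison."""
    return max(max(directed_distances(a, b)), max(directed_distances(b, a)))


if __name__ == "__main__":
    contour = [(i, 0) for i in range(100)]
    with_outlier = contour + [(0, 50)]  # one stray boundary pixel
    print(hausdorff(contour, with_outlier), hd95(contour, with_outlier))
```

Taking the 95th percentile instead of the maximum keeps a single stray boundary pixel from dominating the score, which is why HD95 is often preferred over the plain Hausdorff distance for segmentation masks.
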
Model | Backbone | Dice (%) ↑ | IoU (%) ↑ | Accuracy (%) ↑ |
---|---|---|---|---|
UNet [26] | - | 91.26 | 84.09 | 98.04 |
PSPNet [37] | ResNet18 | 91.49 | 85.66 | 94.76 |
DeepLabV3 [40] | ResNet18 | 91.87 | 86.02 | 94.91 |
DeepLabV3+ [41] | ResNet18 | 91.80 | | 95.13 |
nnUNet [42] | - | 90.86 | 86.11 | 94.91 |
CE-Net [43] | - | 86.62 | 81.64 | 92.67 |
DeMambaNet (ours) | - | | 85.50 | |
SSP | ADP | TAFI | SEM | Dice (%) ↑ | IoU (%) ↑ | 95 Hausdorff ↓ | Accuracy (%) ↑ | Kappa (%) ↑ | MCC (%) ↑ |
---|---|---|---|---|---|---|---|---|---|
\ | √ | √ | √ | 92.37 ± 4.83 | 86.05 ± 5.34 | 8.01 ± 1.18 | 97.09 ± 0.84 | 90.55 ± 4.94 | 90.75 ± 4.89 |
√ | \ | √ | √ | 91.91 ± 6.11 | 85.39 ± 6.43 | 8.14 ± 1.11 | 96.93 ± 0.76 | 89.99 ± 6.12 | 90.28 ± 5.94 |
√ | √ | \ | √ | 92.96 ± 4.85 | 87.08 ± 5.45 | 7.66 ± 1.13 | 97.27 ± 0.85 | 91.25 ± 4.98 | 91.49 ± 4.90 |
√ | √ | √ | \ | 93.24 ± 4.80 | 87.56 ± 5.33 | 7.55 ± 1.15 | 97.38 ± 0.84 | 91.60 ± 4.94 | 91.81 ± 4.88 |
√ | √ | √ | √ | 93.38 ± 4.80 | 87.81 ± 5.30 | 7.49 ± 1.17 | 97.45 ± 0.84 | 91.78 ± 4.93 | 91.98 ± 4.87 |
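
The module ablation above can be summarised by how far the mean score moves when each component is removed. The short sketch below copies the mean Dice, IoU, and 95 Hausdorff values from the table (the key `HD95` stands for the 95 Hausdorff column) and computes the degradation relative to the full model. The variable and function names are illustrative only. The reported standard deviations (for example ±4.80 for Dice) are larger than every one of these differences, so the ranking describes mean scores only and is not a significance test.

```python
# Mean scores of the full model (last row of the table above).
FULL = {"Dice": 93.38, "IoU": 87.81, "HD95": 7.49}

# Mean scores with one module removed (first four rows of the table above).
WITHOUT = {
    "SSP": {"Dice": 92.37, "IoU": 86.05, "HD95": 8.01},
    "ADP": {"Dice": 91.91, "IoU": 85.39, "HD95": 8.14},
    "TAFI": {"Dice": 92.96, "IoU": 87.08, "HD95": 7.66},
    "SEM": {"Dice": 93.24, "IoU": 87.56, "HD95": 7.55},
}

# Metrics for which a smaller value is better.
LOWER_IS_BETTER = {"HD95"}


def degradation(full, ablated):
    """Positive numbers mean the ablated model is worse than the full model."""
    result = {}
    for metric, full_value in full.items():
        diff = full_value - ablated[metric]
        if metric in LOWER_IS_BETTER:
            diff = -diff
        result[metric] = round(diff, 2)
    return result


impact = {module: degradation(FULL, scores) for module, scores in WITHOUT.items()}
ranking = sorted(impact, key=lambda module: impact[module]["Dice"], reverse=True)

if __name__ == "__main__":
    for module in ranking:
        print(module, impact[module])
```

On these means, removing the Adaptive Deformable Pathway (ADP) costs the most Dice, followed by the State Space Pathway (SSP), TAFI, and SEM.
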
TAFI1 | TAFI2 | TAFI3 | Dice (%) ↑ | IoU (%) ↑ | 95 Hausdorff ↓ | Accuracy (%) ↑ | Kappa (%) ↑ | MCC (%) ↑ |
---|---|---|---|---|---|---|---|---|
\ | √ | √ | 93.26 ± 5.68 | 87.67 ± 5.93 | 7.52 ± 1.14 | 97.45 ± 0.80 | 91.66 ± 5.74 | 91.84 ± 5.71 |
√ | \ | √ | 93.02 ± 4.72 | 87.16 ± 5.10 | 7.67 ± 1.09 | 97.31 ± 0.76 | 91.33 ± 4.81 | 91.54 ± 4.76 |
√ | √ | \ | 93.36 ± 4.76 | 87.76 ± 5.18 | 7.51 ± 1.16 | 97.46 ± 0.80 | 91.77 ± 4.86 | 91.94 ± 4.82 |
√ | √ | √ | 93.38 ± 4.80 | 87.81 ± 5.30 | 7.49 ± 1.17 | 97.45 ± 0.84 | 91.78 ± 4.93 | 91.98 ± 4.87 |
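
The overlap and agreement metrics reported in the tables above (Dice, IoU, Accuracy, Kappa, MCC) can all be derived from the four confusion-matrix counts of a binary mask. The sketch below uses their standard textbook definitions and assumes flattened 0/1 masks in which both foreground and background occur, so no denominator is zero. It is an illustration under those assumptions, not the paper's evaluation script, and the function names are hypothetical.

```python
import math


def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for two equally long sequences of 0/1 labels."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(tp, fp, fn, tn):
    """Standard binary segmentation metrics as fractions in [0, 1] (MCC and Kappa may be negative)."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, used by Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1 - expected),
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


if __name__ == "__main__":
    pred = [1] * 50 + [0] * 50
    target = [1] * 40 + [0] * 10 + [1] * 10 + [0] * 40
    print(segmentation_metrics(*confusion_counts(pred, target)))
```

For a single mask, IoU and Dice are tied by IoU = Dice / (2 − Dice). The tabulated means do not satisfy this identity exactly (93.38% Dice would give about 87.58% IoU, against the reported 87.81%), which is expected when each metric is averaged over images separately.
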
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zou, B.; Huang, X.; Jiang, Y.; Jin, K.; Sun, Y. DeMambaNet: Deformable Convolution and Mamba Integration Network for High-Precision Segmentation of Ambiguously Defined Dental Radicular Boundaries. Sensors 2024, 24, 4748. https://doi.org/10.3390/s24144748