Article

Segmentation of Liver Tumors by Monai and PyTorch in CT Images with Deep Learning Techniques

Sabir Muhammad and Jing Zhang
1 School of Information Science and Engineering, University of Jinan, Jinan 250022, China
2 Shandong Provincial Key Laboratory of Network-Based Intelligent Computing, Jinan 250022, China
3 School of Data Intelligence, Yantai Institute of Science and Technology, Yantai 265699, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(12), 5144; https://doi.org/10.3390/app14125144
Submission received: 15 April 2024 / Revised: 12 May 2024 / Accepted: 28 May 2024 / Published: 13 June 2024

Abstract:
Image segmentation and identification are crucial to modern medical image processing techniques. This research provides a novel and effective method for identifying and segmenting liver tumors from public CT images. Our approach leverages the hybrid ResUNet model, a combination of the ResNet and UNet models, implemented with the Monai and PyTorch frameworks. The ResNet deep dense network architecture is applied to public CT scans from the MSD Task03 Liver dataset. The novelty of our method lies in several key aspects. First, we introduce enhancements to the ResUNet architecture that optimize its performance specifically for liver tumor segmentation tasks. Additionally, by harnessing the capabilities of Monai, we streamline the implementation process, eliminating the need for manual script writing and enabling faster, more efficient model development and optimization. Preparing images for analysis by a deep neural network involves several steps: data augmentation, Hounsfield unit windowing, and image normalization. ResUNet network performance is measured using the Dice coefficient (DC) metric. This approach, which utilizes residual connections, has proven more reliable than other existing techniques, achieving DC values of 0.98 for liver segmentation and 0.87 for tumor segmentation. Both qualitative and quantitative evaluations show promising results regarding model precision and accuracy. This research could be used to increase the precision and accuracy of liver tumor detection and liver segmentation, which could in turn aid the early diagnosis and treatment of liver cancer and ultimately improve patient prognosis.

1. Introduction

Image segmentation plays a crucial role in computer vision, and in the medical field this technology can be applied to many areas. Liver cancer is the second leading cause of cancer death worldwide. According to the World Health Organization (WHO), an estimated 8.8 million people died from cancer in 2015, approximately 788,000 of them from liver cancer [1]. In the United States, the American Cancer Society estimated that 40,710 new cases (29,200 in males and 11,510 in females) would be diagnosed in 2017, and that 28,920 individuals (9310 females and 19,610 males) would die from primary liver cancer and intrahepatic bile duct cancer [2]. Radiologists and oncologists use computed tomography (CT) or magnetic resonance imaging (MRI) to examine the structure and texture of the liver; the abnormalities these scans reveal are vital biomarkers for the diagnosis and monitoring of primary and secondary hepatic tumors [3]. Historically, manual or semi-manual methods have been used to analyze liver CT volume scans, but these methods are expensive, time-consuming, and prone to errors. Researchers have developed various computational techniques to improve the accuracy and efficiency of diagnosing liver cancer, but these techniques still face challenges: irregular tumor growth, tissue abnormalities, minimal contrast between healthy and cancerous tissue, poor distinction between the liver and the surrounding organs, and variations in the number and size of tumors all hinder the precise segmentation and detection of liver lesions. The convolutional neural network (CNN) model is known for its efficiency in analyzing images with varied appearances. Our goal in this paper is to improve on previous research by evaluating the performance of a framework model trained with deep learning techniques on a diverse set of tumor CT images taken from the MSD Task03 Liver dataset. To achieve this, we propose a new method that overcomes the challenges in the field. The primary objective is to develop a stronger, more capable deep learning model that accurately identifies regions of interest (ROIs) in images; this approach can be used for both automated feature extraction and feature classification. Earlier studies had drawbacks: they used images from contrast-enhanced MRI and CT, and only a few types of liver tumors were handled with the CNN method. Our current research goal is to develop a fully automated system that can quickly and accurately detect and segment liver tumors with minimal effort and time. Automatic procedures can adapt over time, regardless of how multiple inputs and scenarios are integrated. Medical imaging extensively utilizes image segmentation techniques to identify and isolate regions of interest, such as cancer cells, tumor tissues, or lesions. Segmentation results are critical and require high accuracy, as they directly impact the diagnosis and treatment prescribed to patients, and researchers are actively working to improve the effectiveness of image segmentation for medical imaging applications.
Image segmentation is therefore a challenging task, and choosing the appropriate technique for a specific application remains an active area of research.
There are two types of image segmentation: semantic and instance segmentation. The main difference is that semantic segmentation produces, for each pixel, the probability that it belongs to one of a fixed set of classes, whereas instance segmentation aims to index individual objects, even when they belong to the same class. In our case, we use semantic segmentation because we need to determine whether each pixel belongs to the liver or to the background. Our primary reason for choosing semantic segmentation over instance segmentation was to accurately identify and segment liver tumors in CT images for medical purposes. Semantic segmentation usually employs simpler architectures and requires fewer computational resources than instance segmentation methods; given the complexity of medical image analysis and the limited computational resources in clinical settings, we prioritized a method that could provide efficient and dependable results without excessive computational burden. Instance segmentation, by contrast, requires more detailed and precise annotation of individual objects in the training data. Our decision therefore aligns with our goal of accurately delineating liver tumors in CT images while considering simplicity, efficiency, data availability, annotation complexity, and clinical interpretability. In this paper, we implemented the ResUNet model architecture to detect and segment the liver and its tumors in CT images with Monai version 0.6. Preparing images for analysis by a deep neural network involves several preprocessing steps: data augmentation to increase the number and variety of images, Hounsfield unit windowing to adjust the brightness of the images, and image normalization. The ResUNet network's performance is measured using the Dice coefficient (DC) metric, and the proposed methodology provides better results for liver detection and tumor segmentation. In this work, we also analyze related state-of-the-art techniques for liver tumor segmentation, and we present the materials, methods, and practical results with a discussion and comparisons with other methods. In conclusion, our work contributes to advancing the field of liver tumor segmentation by introducing a novel approach that overcomes existing limitations and demonstrates promising performance. We suggest future research directions to further refine and extend our methodology for broader applications in medical imaging.

2. Literature Review

In recent years, various machine learning techniques have been utilized to segment the liver and tumors in medical images [4]. Song and coauthors introduced an adaptive algorithm based on the fast marching method (FMM) for fully automatic liver segmentation; their method dynamically adjusts parameters by considering the intensity statistics within the potential liver region [5]. Researchers have also developed methods for identifying liver tumors using deep learning models such as CNNs, classifying feature sets into predefined or undefined classes with both supervised and unsupervised methods; training and test data are needed to train and assess a classifier's output [6]. In medical image analysis, training data are usually provided by professionals who label a set of objects, differentiating between normal and diseased cells. In one such study, 79 HE-stained tissue samples were used, 48 of which were hepatocellular carcinoma (HCC) tissue and the remaining 31 normal tissue; the researchers proposed a method to distinguish between malignant and normal HCC liver tissues. Deep learning approaches have been found to have good learning capacity for detecting liver tumors [7]. Several deep learning models have been used to detect liver tumors, including stacked autoencoders (SAEs), convolutional neural networks (CNNs), deep Boltzmann machines (DBMs), and deep belief networks [3,4,5,6,7,8]. Various CNN architectures [9,10] have been employed; some studies have used the VGG16 architecture, while others have used a two-dimensional (2D) UNet [3,11,12,13,14], which is frequently employed for segmenting medical images [15]. The liver and its surrounding organs have low contrast intensities, which makes liver segmentation from CT images challenging. Recent medical image segmentation techniques have relied on deep neural networks [16,17]. Ref. [17] proposed a quality-aware memory network for interactive 3D medical image segmentation; it uses an extended memory network to swiftly encode and retrieve past segmentations in order to segment new slices. Ref. [18] eliminated tiny missing lesions from the data to speed up calculation and increase accuracy using two trained deep CNN models. Ref. [19] employed a convolutional neural network on image patches to assess each pixel as either cancerous or benign liver tissue; a patch was deemed positive if at least 50% of it contained tumor tissue. Ref. [20] integrated a 2D DenseNet, which merges intra-slice features derived from the data, with a 3D counterpart for hierarchical aggregation of volumetric context for liver and lesion segmentation. In a study by Ref. [7], liver tumors were segmented using a cascaded architecture that combined long and short skip connections with soft and hard focusing approaches; a standard Dice loss function was used to decrease the number of false positives. Ref. [21] used UNet modules to integrate spatial-stream convolution and create an edge system. Furthermore, the liver segmentation methods used today may not be appropriate for segmenting interventional CT images because they are designed for diagnostic CT images.
Moreover, low contrast, subtle brightness variations, roughness, and other anomalies are frequently visible in CT images, and existing tumor segmentation techniques have achieved varying degrees of success in resolving these difficult issues. Ref. [22] employed UNet [23] in combination with recurrent neural blocks to overcome the shortcomings of the most advanced models and schemes; the recurrent neural network gathers pertinent liver-region information and supports better liver and tumor segmentation. Ref. [24] constructed a graphical model for retinal image segmentation and obtained an accuracy of 0.99. Ref. [25] employed a fully convolutional network (FCN) that enhanced the segmentation of brain MR images across varying ischemic lesion sizes and shapes, resulting in a Dice similarity coefficient of 0.75. In other work, the liver is located in synthesized images using a YOLOv3 detector and a ResNet block, and DeeplabV3+ with the InceptionResNetv2 model was used for better liver segmentation. Ref. [26] used UNet for segmenting the liver and its tumors and achieved DSC values of 0.96 and 0.74, respectively. Ref. [27] developed a novel network named residual attention-aware UNet (RA-UNet) for 3D segmentation of the liver and its lesions. Ref. [17] classified liver tumors with an accuracy of 99.9% using a semantic pixel-wise classification network called SegNet. Automatic segmentation may be challenging due to the liver's abdominal location and the tumor's overlap with the liver. Extensive liver and tumor segmentation studies have been conducted utilizing various methods, including semiautomatic, automatic, and manual approaches. Sabir et al. [28] employed a ResUNet model for automatically segmenting liver tumors in CT images. In fact, their approach resembles ours, but our study introduces distinct advancements, particularly in model training and testing, performance evaluation, and time efficiency through the Monai framework, which facilitates the integration of the ResUNet architecture and eliminates the need for manual script development.

3. Materials and Methods

3.1. ResUNet Architecture

The UNet model is effective in liver tumor segmentation, but it has a tendency to produce false-negative results, yielding an empty mask. We therefore prefer the ResUNet model, a neural network architecture designed specifically for medical image segmentation tasks. The ResUNet design expands upon the UNet architecture used for object detection and image segmentation in deep learning: it combines the UNet and deep residual learning architectures, making it well suited to medical image analysis tasks that require precise localization and segmentation. It can handle large-scale medical images with high resolution and complex structures, and it has been used in various medical image segmentation tasks such as brain tumor segmentation, organ segmentation, cell segmentation, and lesion detection. ResUNet comprises an encoding network, a decoding network, and a bridge that links the two, mirroring the overall layout of UNet. Whereas UNet employs two 3 × 3 convolutions with ReLU activation functions in each block, ResUNet substitutes pre-activated residual blocks, as shown in Figure 1; the architecture thus maintains an encoding network, decoding network, and bridge while replacing the plain convolutional layers with pre-activated residual blocks.
  • Encoder: This part of the network increases the number of feature maps and reduces the spatial dimensions of the input image, capturing its high-level features. It consists of multiple convolution and max-pooling layers.
  • Residual Connections: These are added to the traditional UNet architecture to address the vanishing gradient problem and enhance network stability. The residual connections allow gradients to flow more easily through the network, improving network accuracy.
  • Decoder: This section of the network up-samples the encoder feature maps and produces segmentation maps corresponding to the input image. It consists of numerous convolutional and up-sampling layers.
  • Skip Connections: In the ResUNet architecture, skip connections allow information from the encoder to bypass the residual blocks and be directly concatenated with decoder feature maps, enabling the network to preserve the fine details of the input image.
The ResUNet model produces a segmentation map in which each pixel is labeled as liver or background. The network is trained with a supervised learning approach to predict accurate labels for a set of annotated images, and its weights are adjusted iteratively to minimize the difference between the predicted labels and the ground-truth labels.
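As a minimal sketch (not the authors' exact configuration), a residual UNet of this kind can be instantiated in Monai 0.6 by setting num_res_units on its generic UNet class; the channel widths and strides below are illustrative assumptions.

```python
from monai.networks.nets import UNet
from monai.networks.layers import Norm

# Residual UNet: num_res_units > 0 swaps the plain convolution blocks
# for residual units, giving the ResUNet structure described above.
model = UNet(
    dimensions=3,                      # 3D convolutional blocks
    in_channels=1,                     # single-channel CT input
    out_channels=2,                    # background and foreground classes
    channels=(16, 32, 64, 128, 256),   # illustrative filter counts per level
    strides=(2, 2, 2, 2),              # down-sampling factor between levels
    num_res_units=2,                   # residual units per block
    norm=Norm.BATCH,                   # batch normalization
)
```

Setting num_res_units to 0 would recover a plain UNet block structure, which is precisely the design difference Figure 1 highlights.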

3.2. Liver CT Image Dataset

In order to assess the effectiveness of the ResUNet architecture, we utilized the publicly accessible MSD Task03 Liver dataset from the Medical Segmentation Decathlon [29]. This dataset consists of 3D CT scans stored as sets of DICOM slices. To conduct our evaluation, we leveraged Monai, which processes NIfTI files; we therefore converted all CT image files to NIfTI using the dicom2nifti library. An illustration of the segmentation approach's complete workflow can be seen in Figure 2.
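As a rough illustration of this conversion step, the snippet below uses the dicom2nifti library's directory-conversion API; the folder paths are hypothetical.

```python
import dicom2nifti

# Convert one patient's DICOM series into compressed, reoriented NIfTI
# files that Monai's loading transforms can read directly.
dicom2nifti.convert_directory(
    "data/dicom/patient_001",   # hypothetical folder of DICOM slices
    "data/nifti/patient_001",   # hypothetical output folder
    compression=True,
    reorient=True,
)
```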

3.3. Preprocessing of CT Image Data

Before analysis, each slice was subjected to preprocessing techniques to distinguish liver tumors. To ensure accuracy and consistency, we normalized the voxel spacings during preprocessing, so that voxel dimensions were uniform across the slices of different patients. As for contrast, we normalized the intensity values to a range between 0 and 1 without making any further adjustments; we believe a neural network's capabilities extend beyond human visual perception, as it can recognize patterns in images that humans cannot. To further improve our results, we employed both dataset splitting and the additional preprocessing techniques outlined below.

3.3.1. Dataset Splitting and Preprocessing

In the initial phase of dataset splitting and preprocessing, we divided the MSD Task03 Liver dataset into three parts for training, testing, and validation. The dataset comprises 3D slices in DICOM format, which we transformed into the NIfTI file format using the dicom2nifti library; this conversion was necessary to ensure compatibility with the Monai framework. To ensure a balanced distribution and facilitate model evaluation, we allocated 70% of the data to the training set, 15% to the testing set, and the remaining 15% to the validation set, and we employed cross-validation techniques during training to prevent overfitting and enhance the model's generalization ability. Additionally, we applied transformations keyed to the image, the label, or both (via the keys argument of Monai's dictionary transforms) to diversify and strengthen the training dataset; this helps the model avoid memorizing patterns and improves its generalization. We incorporated dropout and L2 regularization within the model architecture to prevent overfitting: dropout layers randomly deactivate neurons during training, reducing co-adaptation and enhancing generalization, while L2 regularization penalizes large weight values, preventing the model from fitting noise in the training data. We recorded the training loss, training metric, testing loss, and testing metric for each epoch, implementing early stopping to halt model training when the Dice loss ceased decreasing or began increasing. This dataset splitting and preprocessing pipeline ensured robust training and validation of the ResUNet architecture, preventing overfitting and producing reliable segmentation results; a sketch of how such a split can be set up is given below.
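The sketch below shows one way the 70/15/15 split and data loading could be wired up with Monai's Dataset and DataLoader classes; the file paths and the simple list slicing used for the split are illustrative assumptions rather than the paper's exact pipeline.

```python
import glob
from monai.data import Dataset, DataLoader

# Pair each converted NIfTI volume with its label mask (paths are illustrative).
images = sorted(glob.glob("data/nifti/images/*.nii.gz"))
labels = sorted(glob.glob("data/nifti/labels/*.nii.gz"))
files = [{"image": img, "label": lbl} for img, lbl in zip(images, labels)]

# 70% training, 15% testing, 15% validation.
n = len(files)
n_train, n_test = int(0.70 * n), int(0.15 * n)
train_files = files[:n_train]
test_files = files[n_train:n_train + n_test]
val_files = files[n_train + n_test:]

# Transforms (Section 3.3.3) are passed to Dataset once defined.
train_ds = Dataset(data=train_files, transform=None)
train_loader = DataLoader(train_ds, batch_size=16, shuffle=True)
```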

3.3.2. Hounsfield Window

Hounsfield units (HU) measure tissue density in CT scans on a scale where air is assigned a value of −1000, water is assigned 0, and bone is assigned +1000. For the visualization of regions of interest, a Hounsfield window is typically applied, adjusting the grayscale mapping of the image according to the CT numbers: the window level governs the image's brightness, while the window width controls the contrast.
$$\mathrm{HU} = 1000 \times \frac{\mu_{\text{tissue}} - \mu_{\mathrm{H_2O}}}{\mu_{\mathrm{H_2O}}}$$
Calculating Hounsfield unit (HU) values assigns a grayscale intensity to each pixel, where higher values correspond to brighter pixels. In the context of liver CT scans, a windowing operation within the range of [−100, 400] [19] is employed to improve the visualization of specific features in the image. Figure 3 shows the HU windowing steps applied to the original CT image to isolate the liver.
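In Monai, this windowing step can be expressed with the ScaleIntensityRanged transform; the sketch below clips intensities to the [−100, 400] HU window from Ref. [19] and rescales that window to [0, 1].

```python
from monai.transforms import ScaleIntensityRanged

# Hounsfield windowing for liver CT: clip to [-100, 400] HU and map
# the window linearly onto [0, 1].
window = ScaleIntensityRanged(
    keys=["image"],
    a_min=-100.0, a_max=400.0,   # HU window from Ref. [19]
    b_min=0.0, b_max=1.0,
    clip=True,
)
```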

3.3.3. Data Augmentation

In the MSD Task03 Liver dataset, each scan is accompanied by a tumor mask delineating the respective positions of the liver and tumor. Data augmentation techniques, such as flipping, rotation, zooming, RandGaussianNoised, and RandAffined, are applied; this procedure improves network accuracy. Figure 4 shows image slices with the different transforms mentioned here, and Figure 5 shows a transformed image result. The following data augmentation techniques were used (a combined Monai pipeline illustrating them is sketched after this list).
  • Flipped: Flipping enhances the robustness and diversity of the training dataset by providing additional variations of input images without changing semantic information. By using flipping in the training dataset, the model learns to recognize and segment objects from different orientations.
  • Rotated: This technique involves rotating the input images along different angles, introducing variations that the model needs to learn and adapt to during training.
  • Zoomed: Zoomed aims to enhance the model’s ability to handle scale variations, improve generalization, and increase robustness, leading to better performance on unseen data.
  • RandGaussianNoised: This technique involves adding random Gaussian noise to the images. When the training data are relatively small, RandGaussianNoised can artificially increase the size of the dataset.
  • RandAffined: This involves applying random combinations of translation, rotation, scaling, and shearing operations to input images during the dataset’s training. Introducing noise and variability into the training dataset can help prevent overfitting.
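As a rough sketch of how these steps can be combined, the following Monai pipeline chains the preprocessing and augmentation transforms named above; the probabilities, rotation and zoom ranges, and voxel spacing are illustrative assumptions rather than the paper's exact values.

```python
from monai.transforms import (
    Compose, LoadImaged, AddChanneld, Spacingd, Orientationd,
    ScaleIntensityRanged, RandFlipd, RandRotated, RandZoomd,
    RandGaussianNoised, RandAffined, ToTensord,
)

train_transforms = Compose([
    LoadImaged(keys=["image", "label"]),
    AddChanneld(keys=["image", "label"]),
    Spacingd(keys=["image", "label"], pixdim=(1.5, 1.5, 2.0),
             mode=("bilinear", "nearest")),   # normalize voxel spacings
    Orientationd(keys=["image", "label"], axcodes="RAS"),
    ScaleIntensityRanged(keys=["image"], a_min=-100, a_max=400,
                         b_min=0.0, b_max=1.0, clip=True),
    RandFlipd(keys=["image", "label"], prob=0.5, spatial_axis=0),
    RandRotated(keys=["image", "label"], range_x=0.3, prob=0.3),
    RandZoomd(keys=["image", "label"], min_zoom=0.9, max_zoom=1.1, prob=0.3),
    RandGaussianNoised(keys=["image"], prob=0.2),
    RandAffined(keys=["image", "label"], prob=0.3,
                translate_range=(10, 10, 5), scale_range=(0.1, 0.1, 0.1)),
    ToTensord(keys=["image", "label"]),
])
```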

3.3.4. Image Normalization

Intensity normalization is a method applied to mitigate patient-to-patient variations in the image intensity distribution within the dataset used, as explained in Ref. [29]. Our aim in using this technique is to improve the comparability of patient data and the accuracy of the segmentation results; for this reason, intensity normalization is applied to the liver CT scans. In medical image analysis, intensity values are used to differentiate between different types of tissues or structures. However, due to differences in acquisition protocols and imaging equipment, the intensity values of images can vary significantly, and intensity normalization addresses this by adjusting the intensity values to a common scale. There are both linear and nonlinear normalization methods: linear methods include scaling and normalization based on statistical measures such as the mean and standard deviation, while nonlinear methods involve more complex transformations of the intensity values. In the context of liver CT scans, intensity normalization can improve segmentation results by reducing variability in the intensity values of liver tissues, making it easier to differentiate between the liver and surrounding tissues and improving the accuracy of liver volume measurements or the segmentation of liver tumors.
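As a minimal sketch, a statistical (z-score) normalization of the kind described above can be added to the pipeline with Monai's NormalizeIntensityd transform; whether the paper used this exact transform, as opposed to the [0, 1] rescaling shown earlier, is an assumption.

```python
from monai.transforms import NormalizeIntensityd

# Linear intensity normalization based on statistical measures:
# subtract the mean and divide by the standard deviation of the
# nonzero voxels of each image.
normalize = NormalizeIntensityd(keys=["image"], nonzero=True, channel_wise=True)
```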

4. The Experiments of Liver Tumor Segmentation

After preprocessing the data, we imported the ResUNet model from Monai. ResUNet produces a pixel-wise mask that matches the dimensions of the input image and comprises as many channels [30] as there are classes in the specific task; each channel holds the pixel probabilities of one class.
When performing tumor segmentation, the output channels are shown in Figure 6. Two channels are obtained: one indicating the probabilities of background pixels (absence of tumor) and the other indicating the probabilities of foreground pixels (presence of tumor). The final result, indicating the existence of a tumor, can be obtained by using a soft mask or by applying a threshold. Several parameters must be considered when configuring a ResUNet model for liver tumor segmentation. The 'dimensions' parameter dictates whether the convolutional blocks are 2D or 3D, 'in_channels' corresponds to the input channels, and 'out_channels' is set to 2 to denote the background and foreground classes. The 'channels' parameter controls the number of filters (kernels) in each convolutional block. The remaining parameters have minimal relevance to the task.
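To make the two-channel output concrete, the sketch below (reusing the model from Section 3.1 and a hypothetical input batch named ct_batch) converts the network's logits into a binary tumor mask.

```python
import torch

# The network emits two channels of logits per voxel: background and
# foreground (tumor). Softmax turns them into probabilities; argmax
# (equivalent to a 0.5 threshold for two classes) gives the hard mask.
logits = model(ct_batch)               # shape: (B, 2, H, W, D)
probs = torch.softmax(logits, dim=1)   # per-class probabilities (soft mask)
mask = torch.argmax(probs, dim=1)      # 1 where a tumor is predicted
```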

4.1. Loss Function

In this research, the convolutional neural network is trained with a weighted cross-entropy (WCE) loss function, which assigns each class a penalty based on its median frequency, counteracting class imbalance in segmentation and detection problems [31].
$$\mathrm{WCE} = -\frac{1}{n}\sum_{i=1}^{n} W_{c,i}\left[T_i \log(P_i) + (1 - T_i)\log(1 - P_i)\right]$$
In this equation, the sum runs over the n training images; Pi represents the predicted segmentation probability, Ti corresponds to the target (the detectable object or real segmentation label), and Wc,i represents the weight used for the class.
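A weighted cross-entropy of this form is available in PyTorch as nn.CrossEntropyLoss with a per-class weight vector; the weights below are illustrative placeholders rather than the median-frequency values used in the paper.

```python
import torch
import torch.nn as nn

# Per-class weights penalize mistakes on the rare tumor class more
# heavily than on the dominant background class (values are assumed).
class_weights = torch.tensor([0.3, 0.7])   # [background, foreground]
wce_loss = nn.CrossEntropyLoss(weight=class_weights)

# logits: (B, 2, H, W, D) raw network output; target: (B, H, W, D) class indices
# loss = wce_loss(logits, target)
```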

4.2. Evaluation Metrics

The following are the metrics evaluated throughout the segmentation process.

4.2.1. Dice Similarity Coefficient Metric

When training for semantic segmentation, it is essential to assess the model's performance using a metric that compares the predicted mask against the ground truth. The key terms are "true positive (TP)", "false positive (FP)", and "false negative (FN)".
$$\mathrm{DSC} = \frac{2|TP|}{2|TP| + |FP| + |FN|}$$
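During validation, Monai provides a DiceMetric class that computes this coefficient directly; the configuration below is a plausible setup rather than the paper's exact call.

```python
from monai.metrics import DiceMetric

# Mean Dice over the batch, scoring only the foreground class so the
# dominant background does not inflate the score.
dice_metric = DiceMetric(include_background=False, reduction="mean")
```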

4.2.2. Model Accuracy Metric

This metric assesses the accurate classification of positive and negative observations. The parameter (TP) stands for "true positive", (TN) for "true negative", (FP) for "false positive", and (FN) for "false negative".
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Adam Optimizer

We use the Adam optimizer to optimize our model on the training dataset for liver segmentation with Monai. Adam is an optimization algorithm well suited to refining the segmentation model: with its adaptive nature, it adjusts the learning rate for each weight of the neural network based on the gradient of the loss function. This makes it particularly effective for training deep neural networks, where the gradient of the loss function can vary greatly across different network layers. In liver segmentation, the goal is to identify the regions of the liver in medical imaging scans such as CT or MRI, typically with a convolutional neural network (CNN) trained on a large dataset of annotated images. The Adam optimizer can be used to fine-tune such a pre-trained CNN by minimizing the discrepancy between the predicted and true segmentation masks of new liver images [32]. One advantage of using Adam for liver segmentation is that it helps improve the accuracy of the segmented image by optimizing the weights of the neural network to better fit the given data.
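A minimal sketch of the optimization step, assuming the model and train_loader from the earlier sketches: Adam with the learning rate of 1 × 10−5 reported in Section 5, paired with Monai's differentiable DiceLoss.

```python
import torch
from monai.losses import DiceLoss

optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
loss_fn = DiceLoss(to_onehot_y=True, softmax=True)  # label -> one-hot, logits -> softmax

for batch in train_loader:                 # one training epoch
    image, label = batch["image"], batch["label"]
    optimizer.zero_grad()
    loss = loss_fn(model(image), label)    # Dice loss between prediction and mask
    loss.backward()                        # backpropagate gradients
    optimizer.step()                       # Adam weight update
```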

5. Results and Discussion

The model code was developed in VS Code 1.82 using Monai 0.6 and PyTorch 1.10 with Python 3.11, with the device set to CUDA 11.3 for training and testing. Using the Adam optimizer [32], the network weights are updated with a learning rate of 1 × 10−5 and a mini-batch size of 16. We applied batch normalization to improve neural network training by normalizing the previous layer's activations for each input batch. We used all other necessary libraries, including dicom2nifti, to convert the files into a format readable by Monai. ResUNet training and testing were performed on an NVIDIA GPU. Our proposed method was examined and validated using synthetic and publicly available CT images.
Table 1 shows the outcomes derived from 20 epochs of training the ResUNet model for liver segmentation, details on the tumor segmentation results, and monitors the ResUNet model’s training progress across 100 epochs.
The training Dice and Dice loss metrics are shown in Figure 7; the liver training and validation losses decrease gradually with every epoch, indicating a good learning rate. In each epoch, we save the training loss, training metric, testing loss, and testing metric to guard against overfitting; when the Dice loss ceased decreasing or began increasing during model training, we stopped and saved the training state. The liver model's training and validation accuracy increased with every epoch. The proposed model attains an accuracy rate of 98%, accompanied by an SVD score of 0.22, indicating minimal deviation between the predicted and actual masks, as outlined in Table 2. The high accuracy score can be attributed to class imbalance, particularly in CT images, where most pixels belong to a background class with a low likelihood of tumor presence; accuracy is therefore skewed toward the background class, as it considers the total numbers of true positives (TPs), false positives (FPs), true negatives (TNs), and false negatives (FNs) for all classes. To test the model, we recreate the ResUNet architecture and load its saved weights; finally, we pass the test patients through the network to determine whether it segments well. Sample patient slices and the quantitative and qualitative results of the public patient CT image segmentation are shown in Figure 8 and Table 3.
In our developed method, the liver and tumor are segmented separately. We compared our approach with several other cutting-edge methods, as presented in Table 4. Christ et al. [36] utilized cascaded fully convolutional networks (CFCNs) for liver and tumor segmentation in CT and MRI abdomen scans.
Han et al. [37] introduced a 2.5D deep learning model built on the DCNN methodology that achieved a DSC of 0.67. The model we developed demonstrates considerable potential for application to various medical image modalities and could prove beneficial to surgeons both in treatment procedures and in the identification of emerging tumors. The combination of ResUNet and recurrent neural blocks demonstrated improved segmentation of hepatic tumors; by accumulating additional context through recurrent neural blocks, the network proved superior at segmenting liver CT scans compared with contemporary studies. Table 1 displays the tumor segmentation results and the ResUNet model's training progress over 100 epochs, and Table 4 compares the performance of the developed method with that of other cutting-edge approaches. In addition to the Dice metrics, training accuracy and the SVD value (which quantifies the difference between the actual and expected masks) were determined for assessment; as discussed above and shown in Table 2, the proposed model achieved a notable accuracy of 98% with a low SVD of 0.22.
Table 4. Comparison of the results achieved in liver tumor segmentation using the ResUNet model with state-of-the-art techniques.

Approach | DSC | Accuracy | Precision | Specificity
Ref. [36] | 0.823 | 0.81 | 0.812 | 0.85
Ref. [34], UNet [38] | 67.5 ± 30.8% | 92 ± 3.8% | 0.930 | 0.96
Ref. [39], Ref. [40] | 0.83 | 93 | 98.9 | 98.0
Ref. [39] | 0.67 | 0.89 | 0.891 | 0.90
Our proposed ResUNet | 0.983 | 0.98 | 0.950 | 0.957
In our approach, the tumor is identified separately from the segmented liver, and slices lacking liver tissue in the CT volume were excluded from subsequent tumor segmentation. Table 5 compares our method with contemporary techniques. Ref. [36] employed cascaded FCNs for liver and tumor segmentation in CT and MRI abdomen scans and obtained a VOE of 0.823. Ref. [34] used a 2.5D DCNN to achieve a DSC of 0.67. Ref. [39] proposed an automated liver tumor delineation technique using adaptive fuzzy C-means (FCM) and graph cuts, which resulted in a Dice similarity coefficient (DSC) of 0.83. Our approach showed competitive performance in the liver tumor challenge, along with impressive extension and generalization capabilities on other tumor segmentation datasets, demonstrating high generalizability for liver and tumor segmentation. The ResUNet with recurrent neural blocks displayed superior segmentation for both hepatic and liver tumors; compared with other contemporary networks, a UNet that incorporates additional context through recurrent neural blocks enhances the segmentation of liver CT scans. With a DSC of 0.87, our proposed model holds substantial promise for application in diverse medical image modalities, potentially assisting surgeons in devising treatments for emerging tumors.

The Main Contributions of the Research Paper

  • We developed a novel method of liver tumor segmentation for CT images with Monai in a single run.
  • The main feature of using Monai in this research is that it helps us import the ResUNet architecture instead of writing all the scripts.
  • Using the ResUNet deep neural network, we achieve faster testing with minimal configuration; the UNet component provides promising accuracy, and the ResNet component extracts high-level features from an image.
  • Our research evaluates the proposed technique’s performance completely, comparing it to a few other fully automated techniques.

6. Conclusions and Future Work

This research employed the ResUNet model to perform automated liver and liver tumor segmentation in public CT images. The proposed technique has shown high effectiveness in liver and tumor segmentation and holds promise for similar research on lung and brain tumors. In this study, a 3D model was trained using Monai, a deep learning framework specializing in medical imaging. Ongoing performance-based studies explore networks tailored for liver and tumor segmentation, and the proposed approach is being tested and implemented across diverse datasets, with the potential for continuous improvement in results. Despite advancements in medical image segmentation, there remains substantial room for developing novel techniques that balance high accuracy and sensitivity while minimizing computational complexity. Although the results are promising, it is important to note that medical imaging tasks are inherently challenging, and perfect segmentation is often unfeasible given the difficulty of accessing medical images; larger datasets are needed for better performance and more accurate results. The potential for enhancing liver segmentation performance with deep learning techniques, particularly through the extensive use of larger liver datasets, appears promising for future endeavors. Data augmentation techniques, such as generative adversarial networks (GANs), can be employed to increase the number of liver images available for analysis.

Author Contributions

S.M. is the first author, and he carried out liver tumor segmentation experiments from a public CT image dataset and tested and trained the ResUNet Model loaded from Monai. S.M. drafted the manuscript, and J.Z. participated in its design as a coauthor and helped draft the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

2022–2025 National Natural Science Foundation of China under Grant No. 52171310; 2021–2023 National Natural Science Foundation of China under Grant (Youth) No. 52001039.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset associated with this work is available at http://medicaldecathlon.com/ (accessed on 2 February 2023).

Acknowledgments

The authors of this article would like to express gratitude to those who have provided support for the completion of the paper, especially to Junzheng Yang, the Dean of the School of Data Intelligence at Yantai Institute of Science and Technology, for providing technical advice and guidance in this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. World Health Organization. Cancer Fact Sheet. Available online: https://www.who.int/news-room/factsheets/detail/cancer (accessed on 3 February 2022).
  2. Li, Q.; Cao, M.; Lei, L.; Yang, F.; Li, H.; Yan, X.; He, S.; Zhang, S.; Teng, Y.; Xia, C.; et al. Burden of liver cancer: From epidemiology to prevention. Chin. J. Cancer Res. 2022, 34, 554. [Google Scholar] [CrossRef] [PubMed]
  3. Christ, P.F.; Elshaer, M.E.A.; Ettlinger, F.; Tatavarty, S.; Bickel, M.; Bilic, P.; Rempfler, M.; Armbruster, M.; Hofmann, F.; D’Anastasi, M.; et al. Automatic liver and lesion segmentation in CT using cascaded fully convolutional neural networks and 3D conditional random fields. In Proceedings of the International Conference on Medical Image Computing and Computer-assisted Intervention, Athens, Greece, 17–21 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 415–423. [Google Scholar]
  4. Li, D.; Liu, L.; Chen, J.; Li, H.; Yin, Y. A multistep liver segmentation strategy by combining level set based method with texture analysis for CT images. In Proceedings of the 2014 International Conference on Orange Technologies, Xi’an, China, 20–23 September 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 109–112. [Google Scholar]
  5. Song, X.; Cheng, M.; Wang, B.; Huang, S.; Huang, X.; Yang, J. Adaptive fast marching method for automatic liver segmentation from CT images. Med. Phys. 2013, 40, 091917. [Google Scholar] [CrossRef] [PubMed]
  6. Wen, Y.; Chen, L.; Deng, Y.; Zhou, C. Rethinking pretraining on medical imaging. J. Vis. Commun. Image Represent. 2021, 78, 103145. [Google Scholar] [CrossRef]
  7. Li, X.; Chen, H.; Qi, X.; Dou, Q.; Fu, C.W.; Heng, P.A. H-DenseUNet: Hybrid densely connected UNet for liver and tumor segmentation from CT volumes. IEEE Trans. Med. Imaging 2018, 37, 2663–2674. [Google Scholar] [CrossRef] [PubMed]
  8. Meraj, T.; Rauf, H.T.; Zahoor, S.; Hassan, A.; Lali, M.I.; Ali, L.; Bukhari, S.A.C.; Shoaib, U. Lung nodules detection using semantic segmentation and classification with optimal features. Neural Comput. Appl. 2021, 33, 10737–10750. [Google Scholar] [CrossRef]
  9. Yang, D.; Xu, D.; Zhou, S.K.; Georgescu, B.; Chen, M.; Grbic, S.; Metaxas, D.; Comaniciu, D. Automatic liver segmentation using an adversarial image-to-image network. In Medical Image Computing and Computer Assisted Intervention—MICCAI 2017, Proceedings of the 20th International Conference, Quebec City, QC, Canada, 11–13 September 2017; Proceedings, Part III 20; Springer: Berlin/Heidelberg, Germany, 2017; pp. 507–515. [Google Scholar]
  10. Shafaey, M.A.; Salem, M.A.M.; Ebied, H.M.; Al-Berry, M.N.; Tolba, M.F. Deep learning for satellite image classification. In Proceedings of the International Conference on Advanced Intelligent Systems and Informatics, Cairo, Egypt, 1–3 September 2018; Springer: Berlin/Heidelberg, Germany, 2019; pp. 383–391. [Google Scholar]
  11. Peng, J.; Dong, F.; Chen, Y.; Kong, D. A region-appearance-based adaptive variational model for 3D liver segmentation. Med. Phys. 2014, 41, 043502. [Google Scholar] [CrossRef] [PubMed]
  12. Pan, F.; Huang, Q.; Li, X. Classification of liver tumors with CEUS based on 3D-CNN. In Proceedings of the 2019 IEEE 4th international conference on advanced robotics and mechatronics (ICARM), Toyonaka, Japan, 3–5 July 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 845–849. [Google Scholar]
  13. Yasaka, K.; Akai, H.; Abe, O.; Kiryu, S. Deep learning with convolutional neural network for differentiation of liver masses at dynamic contrast-enhanced CT: A preliminary study. Radiology 2018, 286, 887–896. [Google Scholar] [CrossRef] [PubMed]
  14. Wen, Y.; Chen, L.; Deng, Y.; Ning, J.; Zhou, C. Toward better semantic consistency of 2D medical image segmentation. J. Vis. Commun. Image Represent. 2021, 80, 103311. [Google Scholar] [CrossRef]
  15. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
  16. Khan, Z.; Yahya, N.; Alsaih, K.; Al-Hiyali, M.I.; Meriaudeau, F. Recent automatic segmentation algorithms of MRI prostate regions: A review. IEEE Access 2021, 9, 97878–97905. [Google Scholar] [CrossRef]
  17. Zhou, T.; Li, L.; Bredell, G.; Li, J.; Konukoglu, E. Quality-aware memory network for interactive volumetric image segmentation. In Medical Image Computing and Computer Assisted Intervention—MICCAI 2021, Proceedings of the 24th International Conference, Strasbourg, France, 27 September–1 October 2021; Proceedings, Part II 24; Springer: Berlin/Heidelberg, Germany, 2021; pp. 560–570. [Google Scholar]
  18. Christ, P.F.; Ettlinger, F.; Grün, F.; Elshaera, M.E.A.; Lipkova, J.; Schlecht, S.; Ahmaddy, F.; Tatavarty, S.; Bickel, M.; Bilic, P.; et al. Automatic liver and tumor segmentation of CT and MRI volumes using cascaded fully convolutional neural networks. arXiv 2017, arXiv:1702.05970. [Google Scholar]
  19. Li, W.; Jia, F.; Hu, Q. Automatic segmentation of liver tumor in CT images with deep convolutional neural networks. J. Comput. Commun. 2015, 3, 146. [Google Scholar] [CrossRef]
  20. Hu, P.; Wu, F.; Peng, J.; Liang, P.; Kong, D. Automatic 3D liver segmentation based on deep learning and globally optimized surface evolution. Phys. Med. Biol. 2016, 61, 8676. [Google Scholar] [CrossRef] [PubMed]
  21. Jiang, H.; Shi, T.; Bai, Z.; Huang, L. Ahcnet: An application of attention mechanism and hybrid connection for liver tumor segmentation in ct volumes. IEEE Access 2019, 7, 24898–24909. [Google Scholar] [CrossRef]
  22. Chen, Y.; Wang, K.; Liao, X.; Qian, Y.; Wang, Q.; Yuan, Z.; Heng, P.A. Channel-Unet: A spatial channelwise convolutional neural network for liver and tumors segmentation. Front. Genet. 2019, 10, 1110. [Google Scholar] [CrossRef] [PubMed]
  23. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Proceedings of the 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III 18; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241. [Google Scholar]
  24. Xiong, H.; Liu, S.; Sharan, R.V.; Coiera, E.; Berkovsky, S. Weak label based Bayesian U-Net for optic disc segmentation in fundus images. Artif. Intell. Med. 2022, 126, 102261. [Google Scholar] [CrossRef] [PubMed]
  25. Karthik, R.; Radhakrishnan, M.; Rajalakshmi, R.; Raymann, J. Delineation of ischemic lesion from brain MRI using attention gated fully convolutional network. Biomed. Eng. Lett. 2021, 11, 3–13. [Google Scholar] [CrossRef] [PubMed]
  26. Goshtasby, A.; Satter, M. An adaptive window mechanism for image smoothing. Comput. Vis. Image Underst. 2008, 111, 155–169. [Google Scholar] [CrossRef]
  27. Jin, Q.; Meng, Z.; Sun, C.; Cui, H.; Su, R. RA-UNet: A hybrid deep attention-aware network to extract liver and tumor in CT scans. Front. Bioeng. Biotechnol. 2020, 8, 1471. [Google Scholar] [CrossRef]
  28. Sabir, M.W.; Khan, Z.; Saad, N.M.; Khan, D.M.; Al-Khasawneh, M.A.; Perveen, K.; Qayyum, A.; Azhar Ali, S.S. Segmentation of liver tumor in CT scan using ResU-Net. Appl. Sci. 2022, 12, 8650. [Google Scholar] [CrossRef]
  29. Simpson, A.L.; Antonelli, M.; Bakas, S.; Bilello, M.; Farahani, K.; Van Ginneken, B.; Kopp-Schneider, A.; Landman, B.A.; Litjens, G.; Menze, B.; et al. A large annotated medical image dataset for the development and evaluation of segmentation algorithms. arXiv 2019, arXiv:1902.09063. [Google Scholar]
  30. Christ, P.; Ettlinger, F.; Grün, F.; Lipkova, J.; Kaissis, G. Lits-liver tumor segmentation challenge. In ISBI and MICCAI; 2017; Available online: https://competitions.codalab.org/competitions/17094 (accessed on 15 July 2023).
  31. Bilic, P.; Christ, P.; Li, H.B.; Vorontsov, E.; Ben-Cohen, A.; Kaissis, G.; Szeskin, A.; Jacobs, C.; Mamani, G.E.H.; Chartrand, G.; et al. The liver tumor segmentation benchmark (lits). Med. Image Anal. 2023, 84, 102680. [Google Scholar] [CrossRef] [PubMed]
  32. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  33. Maqsood, M.; Bukhari, M.; Ali, Z.; Gillani, S.; Mehmood, I.; Rho, S.; Jung, Y.A. A residual-learning-based multiscale parallel-convolutions-assisted efficient CAD system for liver tumor detection. Mathematics 2021, 9, 1133. [Google Scholar] [CrossRef]
  34. Sun, C.; Guo, S.; Zhang, H.; Li, J.; Chen, M.; Ma, S.; Jin, L.; Liu, X.; Li, X.; Qian, X. Automatic segmentation of liver tumors from multiphase contrast-enhanced CT images based on FCNs. Artif. Intell. Med. 2017, 83, 58–66. [Google Scholar] [CrossRef] [PubMed]
  35. Trestioreanu, L. Holographic visualization of radiology data and automated machine learning-based medical image segmentation. arXiv 2018, arXiv:1808.04929. [Google Scholar]
  36. Christ, P.F. Convolutional Neural Networks for Classification and Segmentation of Medical Images. Ph.D. Thesis, Technische Universitat München, Munich, Germany, 2017. [Google Scholar]
  37. Han, X. Automatic liver lesion segmentation using a deep convolutional neural network method. arXiv 2017, arXiv:1704.07239. [Google Scholar]
  38. Afzal, S.; Maqsood, M.; Mehmood, I.; Niaz, M.T.; Seo, S. An efficient false-positive reduction system for cerebral microbleeds detection. CMC Comput Mater. Contin. 2021, 66, 2301–2315. [Google Scholar] [CrossRef]
  39. Wu, W.; Wu, S.; Zhou, Z.; Zhang, R.; Zhang, Y. 3D liver tumor segmentation in CT images using improved fuzzy C-means and graph cuts. BioMed Res. Int. 2017, 2017, 5207685. [Google Scholar] [CrossRef]
  40. Lu, S.; Xia, K.; Wang, S.H. Diagnosis of cerebral microbleed via VGG and extreme learning machine trained by Gaussian map bat algorithm. J. Ambient. Intell. Humaniz. Comput. 2020, 14, 5395–5406. [Google Scholar] [CrossRef] [PubMed]
Figure 1. ResUNet architecture representation.
Figure 2. Overview of the liver tumor segmentation workflow.
Figure 3. Windowing steps to obtain an isolated liver.
Figure 4. Slice with different transforms.
Figure 5. Due to simple transformation, the same slice of the patient can show two different body parts.
Figure 6. Illustration of the ResUNet architecture output.
Figure 7. Dice loss and Dice metric curves over 100 epochs of model training.
Figure 8. Slices sample of the patient liver tumor segmentation mask.
Table 1. Results for 100 epochs of training: loss and accuracy.

Epoch | Loss | Accuracy
1 | 0.3927 | 0.7608
10 | 0.4286 | 0.8696
20 | 0.4525 | 0.9867
30 | 0.5462 | 0.9393
40 | 0.4127 | 0.9594
50 | 0.2027 | 0.9473
60 | 0.6548 | 0.9632
70 | 0.3710 | 0.9786
80 | 0.4284 | 0.9803
90 | 0.3455 | 0.9829
100 | 0.2997 | 0.9837
Table 2. Segmentation results for training metrics represented as MSDs.

Author | DSC | Accuracy | Std. Deviation
Ref. [33] | 66 ± 34.6% | 90 ± 2.3% | 0.23
Ref. [34] | 0.58 | 86 ± 7.8% | 0.45
Ref. [35] | 0.63 | 90 ± 4.2% | 0.23
Proposed Model | 98.3% | 98 ± 0.3% | 0.22
Table 3. Results of the ResUNet model for liver segmentation.

Sr. No | Evaluation Metrics | ResUNet Network
1 | DSC | 0.983
2 | Accuracy | 0.98
3 | Precision | 0.950
Table 5. Liver tumor segmentation with the ResUNet model compared with other methods.

Used Technique | Dice Coefficient | Accuracy | Precision | Specificity | VOE | RVD
Ref. [36] | 0.823 | - | - | - | - | -
Ref. [34] | - | - | - | - | 15.6 | 5.8
Ref. [39] | 0.83 | - | - | - | 29.04 | −2.20
Ref. [40] | 0.67 | - | - | - | - | 0.40
Our ResUNet Model | 0.87 | 0.945 | 0.930 | 0.947 | 12.09 | 6.93
