Author Contributions
Conceptualization, V.K.S. and M.A.-N.; methodology, V.K.S. and M.A.-N.; software, V.K.S. and M.A.-N.; validation, V.K.S., M.A.-N. and N.P.; formal analysis, V.K.S. and M.A.-N.; investigation, V.K.S. and M.A.-N.; resources, V.K.S. and M.A.-N.; data curation, V.K.S., M.A.-N. and N.P.; writing—original draft preparation, V.K.S., M.A.-N. and N.P.; writing—review and editing, V.K.S., M.A.-N. and N.P.; visualization, V.K.S., M.A.-N. and N.P.; supervision, M.A.-N. and D.P.; project administration, M.A.-N. and D.P.; funding acquisition, M.A.-N. and D.P. All authors have read and agreed to the published version of the manuscript.
Figure 1.
Four examples of COVID-19 existing in chest CT images (COVID-19 infection is highlighted by red boxes).
Figure 2.
Framework of proposed LungINFseg.
Figure 3.
The encoder network. Block-wise losses are computed at the output of each encoder block. D refers to the down-sampling rate. Out represents the output features generated by the encoder network. RFA refers to the receptive field aware module. DWT refers to the discrete wavelet transform.
Figure 4.
Illustration of decomposing a CT image into sub-bands using the DWT.
Figure 5.
Illustration of a zoom-in visualization of decomposing a CT image into four sub-bands using DWT. A, H, V and D refer to Approximation, Horizontal Detail, Vertical Detail and Diagonal Detail, respectively.
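The four sub-bands can be reproduced with a one-level 2D Haar transform, the simplest DWT. The sketch below is a minimal NumPy implementation for illustration only; the wavelet family used in this section is not stated, and sub-band labeling conventions vary between libraries.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2D Haar DWT of a 2D array with even dimensions.

    Returns the four sub-bands (A, H, V, D): approximation plus
    horizontal, vertical and diagonal detail, each at half resolution.
    """
    img = np.asarray(img, dtype=float)
    # Pairwise rows: low-pass (average) and high-pass (difference).
    lo_r = (img[0::2, :] + img[1::2, :]) / 2.0
    hi_r = (img[0::2, :] - img[1::2, :]) / 2.0
    # Pairwise columns on each intermediate result.
    A = (lo_r[:, 0::2] + lo_r[:, 1::2]) / 2.0  # approximation
    V = (lo_r[:, 0::2] - lo_r[:, 1::2]) / 2.0  # vertical detail
    H = (hi_r[:, 0::2] + hi_r[:, 1::2]) / 2.0  # horizontal detail
    D = (hi_r[:, 0::2] - hi_r[:, 1::2]) / 2.0  # diagonal detail
    return A, H, V, D

# A 256 x 256 "CT slice" decomposes into four 128 x 128 sub-bands,
# matching the halving of spatial size in the initial encoder block.
slice_ = np.random.rand(256, 256)
A, H, V, D = haar_dwt2(slice_)
print(A.shape)  # (128, 128)
```

With this normalization, each 2 × 2 pixel block is exactly recoverable from the four sub-band coefficients (e.g., the top-left pixel equals A + H + V + D at the corresponding position), so no information is lost by the decomposition.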
Figure 6.
Proposed RFA module. Conv refers to convolution layers. D refers to the down-sampling rate. Out represents the output features generated by the RFA module. LPDGC refers to the learnable parallel dilated group convolutional block. FAM refers to the feature attention module.
Figure 7.
Illustration of receptive fields. (a) Receptive fields in the same layer with the same kernel size, which capture many background pixels; (b) receptive fields with varying dilation rates (shown in four different colored boxes), which capture the small and relevant regions.
Figure 8.
Illustration of the LPDGC block. Here, p and d refer to the padding and dilation rates respectively. ELU refers to the exponential linear unit activation function.
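The padding and dilation settings in a block like LPDGC follow standard convolution arithmetic: a k × k kernel with dilation d covers d·(k − 1) + 1 pixels per side, and choosing padding p = d preserves the spatial size for a 3 × 3 kernel, so parallel branches can be merged element-wise. A small sketch of that arithmetic (the dilation rates 1, 2, 4 and 8 are illustrative, not taken from the text):

```python
def conv_out_size(n, k, s=1, p=0, d=1):
    """Output spatial size of a convolution (standard formula)."""
    return (n + 2 * p - d * (k - 1) - 1) // s + 1

def effective_kernel(k, d):
    """A k x k kernel with dilation d spans d*(k-1)+1 pixels per side."""
    return d * (k - 1) + 1

# Hypothetical parallel 3x3 branches with increasing dilation rates:
for d in (1, 2, 4, 8):
    # Padding equal to the dilation rate keeps the 64x64 spatial size,
    # so all branch outputs can be fused element-wise.
    assert conv_out_size(64, k=3, s=1, p=d, d=d) == 64
    print(d, effective_kernel(3, d))  # 1 3, 2 5, 4 9, 8 17
```

This is why varying the dilation rate enlarges the receptive field (up to 17 × 17 here) without adding parameters or reducing resolution.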
Figure 9.
Diagram of our feature attention module (FAM).
Figure 10.
Illustration of the decoder network.
Figure 11.
The role of LPDGC in capturing small COVID-19 lung infections from CT images (represented in dark blue). (a–d) present examples of COVID-19 infection in CT images (left) and the corresponding heatmaps (right). Here, the red box presents a zoom-in visualization of the infected region.
Figure 12.
Different examples of channel attention maps (CAMs) obtained from the LungINFseg. (a–d) present examples of COVID-19 infection in CT images (left) and the corresponding CAMs (right).
Figure 13.
Boxplots of Dice coefficient and Intersection over Union (IoU) scores for all test samples of lung CT infection. Each box indicates the score range of one method; the red line inside each box represents the median; the box limits span the interquartile range (from the 25th to the 75th percentile of samples); the upper and lower whiskers extend to 1.5 times the interquartile range beyond the box limits; and all values outside the whiskers are considered outliers, which are marked with the (+) symbol.
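The whisker and outlier rule described in the caption can be reproduced directly with NumPy; the Dice scores below are hypothetical values for illustration:

```python
import numpy as np

# Illustrative Dice scores for one method on the test samples.
scores = np.array([0.62, 0.71, 0.74, 0.76, 0.78,
                   0.80, 0.81, 0.83, 0.85, 0.97])

q1, q2, q3 = np.percentile(scores, [25, 50, 75])
iqr = q3 - q1                     # height of the box
upper = q3 + 1.5 * iqr           # upper whisker limit
lower = q1 - 1.5 * iqr           # lower whisker limit

# Anything beyond the whisker limits is plotted as a '+' outlier.
outliers = scores[(scores > upper) | (scores < lower)]
print(q2, outliers)
```

For this sample, the median is 0.79 and both 0.62 and 0.97 fall outside the 1.5 × IQR whiskers, so they would be drawn as (+) markers.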
Figure 14.
Qualitative comparison of the segmentation results of LungINFseg and five state-of-the-art segmentation methods (left to right: LungINFseg–SQNet). The numbers on the left and right of each example are the Dice and IoU scores, respectively. The colors represent the segmentation results as follows: TP (orange), FP (green), FN (red), and TN (black).
Figure 15.
Qualitative comparison of the segmentation results of LungINFseg and six state-of-the-art segmentation methods (left to right: ContextNet–DABNet). The numbers on the left and right of each example are the Dice and IoU scores, respectively. The colors represent the segmentation results as follows: TP (orange), FP (green), FN (red), and TN (black).
Figure 16.
Qualitative comparison of the segmentation results of LungINFseg, Inf-Net, and MIScnn. The numbers on the left and right of each example are the Dice and IoU scores, respectively. The colors represent the segmentation results as follows: TP (orange), FP (green), FN (red), and TN (black).
Table 1.
Architecture details of LungINFseg. Skip connections connect the encoder layers with the corresponding decoder layers to preserve spatial information.
| Stage | Layer | Type | Input Feature Size | Stride | Kernel Size | Padding | Output Feature Size |
|---|---|---|---|---|---|---|---|
| Encoder | 1 | Initial block with DWT | n × 1 × 256 × 256 | 1 | 7 | 3 | n × 64 × 128 × 128 |
| | 2 | RFA Block 1 | n × 64 × 128 × 128 | 1 | 3 | 1 | n × 64 × 64 × 64 |
| | 3 | RFA Block 2 | n × 64 × 64 × 64 | 2 | 3 | 1 | n × 128 × 32 × 32 |
| | 4 | RFA Block 3 | n × 128 × 32 × 32 | 2 | 3 | 1 | n × 256 × 16 × 16 |
| | 5 | RFA Block 4 | n × 256 × 16 × 16 | 2 | 3 | 1 | n × 512 × 8 × 8 |
| Decoder | 6 | Block 1 | n × 512 × 8 × 8 | 2 | 3 | 1 | n × 256 × 16 × 16 |
| | 7 | Block 2 | n × 256 × 16 × 16 | 2 | 3 | 1 | n × 128 × 32 × 32 |
| | 8 | Block 3 | n × 128 × 32 × 32 | 2 | 3 | 1 | n × 64 × 64 × 64 |
| | 9 | Block 4 | n × 64 × 64 × 64 | 1 | 3 | 1 | n × 64 × 64 × 64 |
| | 9 | ConvTranspose | n × 64 × 64 × 64 | 2 | 3 | 1 | n × 32 × 128 × 128 |
| | 9 | Convolution | n × 32 × 128 × 128 | 1 | 3 | 1 | n × 32 × 128 × 128 |
| | 10 | ConvTranspose (Output) | n × 32 × 128 × 128 | 2 | 2 | 0 | n × classes (1) × 256 × 256 |
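The feature-map sizes in the table above follow the standard convolution and transposed-convolution size formulas. A quick sanity check of a few rows (the `output_padding` of 1 assumed for the stride-2, kernel-3 ConvTranspose row is not listed in the table; it is needed for the sizes to work out):

```python
def conv_out(n, k, s, p):
    """Spatial size after a standard convolution."""
    return (n + 2 * p - k) // s + 1

def convT_out(n, k, s, p, op=0):
    """Spatial size after a transposed convolution (op = output_padding)."""
    return (n - 1) * s - 2 * p + k + op

# Encoder "RFA Block 2": 64 -> 32 with stride 2, kernel 3, padding 1.
assert conv_out(64, k=3, s=2, p=1) == 32
# Final output row: 128 -> 256 with stride 2, kernel 2, padding 0.
assert convT_out(128, k=2, s=2, p=0) == 256
# The stride-2, kernel-3, padding-1 ConvTranspose row needs
# output_padding = 1 to reach exactly twice the input size (64 -> 128).
assert convT_out(64, k=3, s=2, p=1, op=1) == 128
```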
Table 2.
Metrics used to evaluate the segmentation methods.
| Metric | Formula |
|---|---|
| Accuracy (ACC) | (TP + TN) / (TP + TN + FP + FN) |
| Dice coefficient (DSC) | 2·TP / (2·TP + FP + FN) |
| Intersection over Union (IoU) | TP / (TP + FP + FN) |
| Sensitivity (SEN) | TP / (TP + FN) |
| Specificity (SPE) | TN / (TN + FP) |
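These metrics can be computed directly from binary prediction and ground-truth masks; a minimal NumPy sketch:

```python
import numpy as np

def seg_metrics(pred, gt):
    """Compute the segmentation metrics from binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)    # infected pixels correctly detected
    tn = np.sum(~pred & ~gt)  # background correctly rejected
    fp = np.sum(pred & ~gt)   # background flagged as infection
    fn = np.sum(~pred & gt)   # missed infection
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "DSC": 2 * tp / (2 * tp + fp + fn),
        "IoU": tp / (tp + fp + fn),
        "SEN": tp / (tp + fn),
        "SPE": tn / (tn + fp),
    }

pred = np.array([[1, 1, 0, 0]])
gt   = np.array([[1, 0, 1, 0]])
m = seg_metrics(pred, gt)  # TP=1, FP=1, FN=1, TN=1
print(m["DSC"], m["IoU"])  # DSC = 0.5, IoU ≈ 0.333
```

Note that DSC is always at least as large as IoU for the same prediction (DSC = 2·IoU / (1 + IoU)), which is why the two columns track each other in the result tables.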
Table 3.
Investigating the performance of different configurations of the proposed method (mean ± standard deviation). Best results are in bold.
| Model | ACC (%) | DSC (%) | IoU (%) | SEN (%) | SPE (%) |
|---|---|---|---|---|---|
| Baseline | | | | | |
| Baseline + DWT | | | | | |
| Baseline + LPDGC | | | | | |
| Baseline + FAM | | | | | |
| Baseline + DWT + LPDGC | | | | | |
| Baseline + DWT + FAM | | | | | |
| LungINFseg (w/o augmentation) | | | | | |
| LungINFseg (with augmentation) | | | | | |
Table 4.
The performance of the LungINFseg with different image resolutions on the test set (mean ± standard deviation). Best results are in bold.
| Input Size | ACC | DSC | IoU | SEN | SPE | Feature Map Size |
|---|---|---|---|---|---|---|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
Table 5.
The evaluation of the LungINFseg with different loss functions on the test set (mean ± standard deviation). Best results are in bold.
| Loss Function | ACC (%) | DSC (%) | IoU (%) | SEN (%) | SPE (%) |
|---|---|---|---|---|---|
| BCE | | | | | |
| BCE + IoU-binary | | | | | |
| BCE + SSIM | | | | | |
| TL | | | | | |
| LungINFseg (OL) | | | | | |
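BCE- and IoU-based terms like those compared in the table can be sketched in NumPy as follows. This illustrates the generic loss definitions only, not the exact composition of the overall loss (OL):

```python
import numpy as np

def bce_loss(pred, gt, eps=1e-7):
    """Binary cross-entropy averaged over pixels (pred = probabilities)."""
    pred = np.clip(pred, eps, 1 - eps)  # avoid log(0)
    return float(-np.mean(gt * np.log(pred) + (1 - gt) * np.log(1 - pred)))

def soft_iou_loss(pred, gt, eps=1e-7):
    """1 - soft IoU, computed on probabilities so it stays differentiable."""
    inter = np.sum(pred * gt)
    union = np.sum(pred) + np.sum(gt) - inter
    return float(1 - (inter + eps) / (union + eps))

pred = np.array([0.9, 0.8, 0.2, 0.1])  # predicted probabilities
gt   = np.array([1.0, 1.0, 0.0, 0.0])  # ground-truth mask
total = bce_loss(pred, gt) + soft_iou_loss(pred, gt)
```

Combining a pixel-wise term (BCE) with a region-level term (IoU) is a common way to keep gradients informative for the many background pixels while still rewarding overlap with small infection regions.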
Table 6.
Comparing the proposed model with 13 state-of-the-art baseline segmentation methods on the test set (mean ± standard deviation). Best results are in bold. Dashes indicate that the information is not reported in the cited references.
| Model | ACC (%) | DSC (%) | IoU (%) | SEN (%) | SPE (%) | Parameters (M) |
|---|---|---|---|---|---|---|
| FCN | | | | | | |
| UNet | | | | | | |
| SegNet | | | | | | |
| FSSNet | | | | | | |
| SQNet | | | | | | |
| ContextNet | | | | | | |
| EDANet | | | | | | |
| CGNet | | | | | | |
| ERFNet | | | | | | |
| ESNet | | | | | | |
| DABNet | | | | | | |
| Inf-Net [12] | − | | − | | | |
| MIScnn [26] | | | | | | |
| LungINFseg | | | | | | |