#### **4. Prostate Organ: Segmentation and Volume Estimation**

Although prostate segmentation and volume approximation could greatly improve PCa and BPH management, existing techniques are limited. Currently, prostate segmentation is performed in a manual or semi-automated fashion and is subject to inter-observer variability [61]. In a study by Rash et al. [62], the mean prostate organ volume contoured by three radiation oncologists varied between 0.95 and 1.08. Prostate volume is currently most often calculated during TRUS using an ellipsoid estimate [63] or approximated during a prostate exam. Although this TRUS volume approximation is commonly used, it shows significant intra-observer variation and is less accurate than approximation from mpMRI images [64,65]. Software-based prostate volume approximation has been attempted with limited results; medical students have outperformed a commercially available tool in accuracy [66].
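The TRUS ellipsoid estimate referenced above is conventionally computed as V = π/6 × length × width × height. A minimal sketch (the function name and example dimensions are illustrative):

```python
import math

def ellipsoid_volume(length_cm, width_cm, height_cm):
    """Prolate-ellipsoid prostate volume estimate: V = pi/6 * L * W * H (mL)."""
    return math.pi / 6 * length_cm * width_cm * height_cm

# e.g., a 4 x 3 x 5 cm gland -> ~31.4 mL
print(round(ellipsoid_volume(4, 3, 5), 1))  # 31.4
```

The formula's sensitivity to each caliper measurement is one reason the estimate varies within and between observers.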

To meet this need for an automatic, accurate prostate segmentation and volume approximation tool, ML methods have been applied by various groups (Figure 3). An ML technique, fuzzy c-means clustering, groups data via unsupervised learning and was used by Rundo et al. [67] to segment the prostate on T1-weighted and T2-weighted mpMRI images. Rundo et al. evaluated 21 patients and obtained an average Dice score of 0.91 [67]. The Dice score is a standard statistic for assessing the spatial overlap between two segmentations and ranges from 0 (no overlap) to 1 (perfect overlap) [68]. Therefore, a Dice score of 0.91 demonstrates that the technique segmented and estimated the volume of prostates with a high level of agreement with the reference contours.
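The Dice score described above can be computed directly from two binary segmentation masks as 2|A∩B| / (|A| + |B|); a minimal sketch with NumPy (the mask values are illustrative):

```python
import numpy as np

def dice_score(a, b):
    """Dice coefficient between two binary masks: 2|A intersect B| / (|A| + |B|)."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

pred  = np.array([[0, 1, 1], [0, 1, 0]])  # predicted segmentation
truth = np.array([[0, 1, 0], [0, 1, 1]])  # reference (ground-truth) contour
print(dice_score(pred, truth))  # 2*2 / (3+3) = 0.666...
```

The same computation applies voxel-wise to 3D volumes, which is how the segmentation studies cited here report their scores.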

**Figure 3.** Prostate organ segmentation performed by machine learning methods. The computer takes multiparametric magnetic resonance imaging images as inputs and applies the developed machine learning algorithm to correctly identify the borders of the prostate.

Besides fuzzy c-means clustering, DL has been used extensively for complete prostate segmentation. In 2012, the release of the PROMISE12 challenge dataset, which contained 100 patients, prompted many studies on this topic [69,70]. Two groups, led by Tian et al. [71] and Karimi et al. [70], both employed CNNs. Tian et al. [71] trained their CNN on T2-weighted mpMRI images from 140 patients and achieved a Dice score of 0.85. Karimi et al.'s [70] CNN was trained on a limited dataset of 49 T2-weighted mpMRI images supplemented by data augmentation; their Dice score was 0.88. Both studies achieved high Dice scores and demonstrated that prostate segmentation could be performed with commonly used network designs.
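Data augmentation of the kind mentioned above is often implemented with simple geometric transforms applied jointly to each image slice and its label mask. The sketch below is illustrative only; the specific transforms (90° rotations and flips) are assumptions, not the augmentations reported in [70]:

```python
import numpy as np

def augment(slice2d, mask2d, rng):
    """Apply one random flip/rotation jointly to an image slice and its mask."""
    k = rng.integers(0, 4)              # random 90-degree rotation count
    img, msk = np.rot90(slice2d, k), np.rot90(mask2d, k)
    if rng.random() < 0.5:              # random horizontal flip
        img, msk = np.fliplr(img), np.fliplr(msk)
    return img, msk

rng = np.random.default_rng(0)
img = np.arange(16.0).reshape(4, 4)     # toy "MRI slice"
msk = (img > 7).astype(np.uint8)        # toy "segmentation mask"
aug_img, aug_msk = augment(img, msk, rng)
# image/mask correspondence is preserved under the shared transform
assert ((aug_img > 7).astype(np.uint8) == aug_msk).all()
```

Applying identical transforms to image and mask is what lets a small dataset such as 49 cases yield many distinct but correctly labeled training examples.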

Additionally, a DL network designed specifically for biomedical images, U-Net, has also been proposed for complete prostate segmentation [72]. U-Net is an encoder-decoder algorithm that successively compresses an image, derives features during these contractions, and then expands the representation back to full resolution to classify every pixel in the image [72]. Three studies used U-Net for prostate segmentation and obtained Dice scores of 0.89, 0.93, and 0.89 [73–75]. These three groups showed that U-Net could effectively segment the prostate with dataset sizes between 81 and 163 patients. The high Dice scores across multiple studies with comparable network architectures demonstrate substantial progress towards completely automated prostate segmentation and volume approximation.

Table 1 lists the previously discussed studies along with several others that also segmented the prostate using various CNNs. To establish the ground-truth label, which underlies the Dice score, five studies used radiologists, two used clinicians of unstated specialties, one used an expert, and one used a radiologist for most of its data and an unnamed source for the remainder [67,70,71,73–78].
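The contract-then-expand pattern that defines U-Net can be sketched structurally. The snippet below traces only the tensor shapes through one pooling level, one upsampling step, and one skip connection, with no learned convolutions; it illustrates the architecture's shape flow, not a working network:

```python
import numpy as np

def downsample(x):
    """2x2 max-pool: the 'contracting' step that compresses the image."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour 2x upsampling: the 'expanding' step."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def unet_shapes(x):
    """Trace the U shape: contract, then expand and concatenate the skip."""
    skip = x                      # features saved for the skip connection
    bottleneck = downsample(x)    # compressed representation
    up = upsample(bottleneck)     # back to the input resolution
    fused = np.stack([up, skip])  # skip connection: concatenate feature maps
    return skip.shape, bottleneck.shape, fused.shape

x = np.random.default_rng(1).random((8, 8))
print(unet_shapes(x))  # ((8, 8), (4, 4), (2, 8, 8))
```

In the real architecture, each level also applies learned convolutions, and the skip connections are what let the decoder recover the fine boundary detail needed for per-pixel classification.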


**Table 1.** Machine learning techniques applied to prostate organ segmentation.

#### **5. Prostate Lesion: Detection, Segmentation, and Volume Estimation**

Although prostate lesion detection, segmentation, and volume approximation could benefit PCa management, an effective tool that automates these processes has not yet been created. For detection, small satellite lesions can be challenging to identify [19]. In a study by Steenbergen et al. [19], six teams, each composed of one radiologist and one radiation oncologist, missed 66 out of 69 satellite lesions distributed across 20 patients. Segmentation is likewise difficult because sparse tumors intermixed with benign glands and stroma are challenging to outline [79]. When segmentations from multiple institutions are compared, the contours reveal considerable differences [80]. As a result of inexact segmentation, volume approximation of prostate lesions is also challenging and often underestimates the histopathological volume [79]. This need for improved lesion metrics could be met by ML algorithms that learn to identify these features within mpMRI images.

For prostate lesion detection, ML approaches have been used to identify potential malignancies (Figure 4). Lay et al. [81] used a prostate computer-aided diagnosis (CAD) system based on a random forest for prostate lesion detection (Table 2). This study's dataset comprised 224 patient cases across three sequences (T2-weighted, ADC, and DWI), totaling 287 benign lesions and 123 lesions with a Gleason score of 6 or higher [81]. The Gleason system grades PCa on a scale of 1 to 5 based on the pattern that the cancerous cells form, with 1 or 2 being low grade and 5 being high grade; the final score is the sum of the grades of the most prominent and second most prominent patterns in a biopsy. A Gleason score of 6 or greater has malignant potential [82]. Lay et al.'s random forest technique yielded an area under the curve (AUC) score of 0.93 [81]; AUC measures binary classification performance and ranges from 0 to 1, with 0.5 corresponding to chance and 1 to perfect discrimination. Therefore, this study demonstrates that the ML model can detect lesions with high accuracy.
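The AUC statistic used throughout these detection studies is equivalent to the probability that a randomly chosen positive case is scored above a randomly chosen negative case (the Mann-Whitney formulation); a minimal sketch with illustrative scores:

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability a positive outscores a negative (Mann-Whitney)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))

# malignant lesions mostly scored above benign ones -> AUC near 1
print(auc([0.9, 0.8, 0.6], [0.3, 0.5, 0.7]))  # 8 of 9 pairs correct ~ 0.889
```

Because AUC depends only on the ranking of scores, it lets studies with different score scales, such as random forest votes versus CNN probabilities, be compared directly.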

**Figure 4.** Prostate lesion detection using machine learning methods. The computer takes multiparametric magnetic resonance imaging images of the prostate as inputs and applies the developed machine learning algorithm to correctly localize lesions in the prostate.


**Table 2.** Machine learning techniques applied to prostate lesion detection.

DL techniques have also been applied to prostate lesion detection (Table 2). Xu et al. [84] implemented ResNet [86], a deep residual neural network, to find lesions on T2-weighted, ADC, and DWI images. This study used images from the Cancer Imaging Archive data portal and included 346 patients; it achieved an AUC of 0.97 [84]. Tsehay et al. [85] used a DL algorithm with a 5-layer CNN architecture that applied an individual loss function to each layer. The CNN was trained and validated on a dataset of 39 benign lesions and 86 lesions with a Gleason score of 6 or higher [85]. Tsehay's group achieved an AUC of 0.90 [85], demonstrating high accuracy for prostate lesion detection. All four studies in Table 2 used radiologists to label the ground truth [81,83–85].

Although prostate lesion detection has been implemented with ML, automated prostate lesion segmentation and volume approximation remain largely unsolved (Figure 5). Few studies have attempted this task due to a dearth of well-curated data and the task's technical demands. One obstacle for prostate lesion segmentation is the lack of cross-institutional guidelines for prostate lesion contours, which results in significant inter-observer variability [19,80]. Despite the lack of standardization, three studies have attempted prostate lesion segmentation (Table 3). A study by Liu et al. [87] used fuzzy Markov random fields to achieve a Dice score of 0.62 with 11 patients. Two other groups, Kohl et al. [88] and Dai et al. [89], employed DL algorithms, using U-Net and Mask R-CNN, respectively. Kohl's group used a dataset of 152 patients and implemented U-Net combined with an adversarial network; their architecture yielded an average Dice score of 0.41 for prostate lesion segmentation [88]. Dai's group used a highly specialized DL algorithm, Mask R-CNN, and trained on 63 patients to achieve a prostate lesion Dice score of 0.46 [89]. To label the ground truth, Dai et al. [89] used a clinician, Kohl et al. [88] used a radiologist, and Liu et al. [87] used a pathologist. The lower Dice scores in these studies demonstrate that current techniques have limited precision and that prostate lesion segmentation and volume estimation remain challenging. A larger dataset with more uniform labeling would permit the development of ML models better geared toward these tasks.

**Figure 5.** Prostate lesion segmentation using machine learning techniques. The computer takes multiparametric magnetic resonance imaging images of the prostate as inputs and applies the developed machine learning algorithm to correctly identify the borders of the lesion.

**Table 3.** Machine learning techniques applied to prostate lesion segmentation.


#### **6. Prostate Lesion: Characterization**

Although prostate lesions have been increasingly imaged with mpMRI since 2013 [4], their characterization has been hindered by variability in classification conventions across radiologists and institutions [4,47,90]. To establish better standardization, the PI-RADS scoring system was created in 2012, with an updated version, PI-RADS v2, released in 2015 and the newest version, PI-RADS v2.1, released in 2019 [53,54,91]. Since their inception, multiple studies have attempted to elucidate the clinical utility of PI-RADS, PI-RADS v2, and PI-RADS v2.1. Challenges to broader acceptance include limited inter-reader agreement, dependence on radiologist experience, and the substantial time required to interpret images [4,47,90]. This need for more consistent lesion characterization makes ML an attractive method for accurate, rapid classification.

ML algorithms can augment the PI-RADS scoring system as well as independently classify lesions (Table 4). Regarding PI-RADS, Litjens et al. [92] created a CAD system that applied a random forest to characterize prostate lesions on a scale of suspicion for malignancy. After combining the ML-generated scores with the radiologist-provided PI-RADS scores on a dataset of 107 patients, the overall AUC was greater than that of either the ML-generated scores or the PI-RADS scores alone [92]. Similarly, Wang, J. et al. [93], who used 54 patients in their dataset, concluded that a support vector machine (SVM) algorithm enhanced the PI-RADS performance of radiologists. Song et al. [94] opted for a DL algorithm based on VGG-Net, a deep CNN, as a tool for improving PI-RADS scores assigned by radiologists. Song's group gathered data from 195 patients and likewise observed that the AUC improved when radiologists' decisions were combined with the VGG-Net [94].
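Combining a model's output with a radiologist's PI-RADS score can, in the simplest case, be a weighted blend of the two after mapping them onto a common scale. The sketch below is purely illustrative and is not the fusion method used in the cited studies:

```python
def fused_score(pirads, ml_prob, weight=0.5):
    """Blend a radiologist's PI-RADS score (1-5) with an ML probability (0-1).

    Illustrative only: the cited studies' actual fusion methods differ.
    """
    pirads_norm = (pirads - 1) / 4.0     # map 1..5 onto 0..1
    return weight * pirads_norm + (1 - weight) * ml_prob

print(fused_score(4, 0.9))  # 0.5*0.75 + 0.5*0.9 = 0.825
```

Even this naive blend shows why combining readers and models can raise AUC: the two sources make partly independent errors, so a fused ranking can separate classes better than either alone.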


**Table 4.** Machine-learning techniques applied to prostate lesion characterization.

In addition to bolstering lesion classification by radiologists, ML algorithms have been trained to characterize prostate lesions independently (Figure 6, Table 4). Many studies explored this task with the PROSTATEx challenge dataset, released in 2017 [101]. The PROSTATEx dataset was gathered from 344 patients and contained segmented lesions along with their respective pathology-defined Gleason scores [101]. From this public database, Wang, Z. et al. [96] achieved an AUC of 0.96 by running two CNNs in parallel. Both Seah et al. [97] and Liu et al. [98] obtained an AUC of 0.84 using deep layered CNNs, and Mehrtash et al. [99] implemented a 3D CNN to reach an AUC of 0.80. One study by Kwak et al. [95] used its own proprietary dataset to train an SVM on T2-weighted and DWI images to characterize prostate lesions. This study comprised 244 patients with a total of 333 benign and 146 malignant lesions [95]. The SVM was trained on discriminative features and achieved an AUC of 0.89 [95]. All of the studies listed in Table 4 used radiologists to determine their ground truth [77,92–95,97–100]. These studies highlight the ability of ML algorithms to predict the likelihood of a lesion's malignancy based upon Gleason scores.

**Figure 6.** Prostate lesion characterization using machine-learning techniques. The computer receives multiparametric magnetic resonance imaging images of prostate lesions and applies the developed machine learning algorithm to categorize the lesion as clinically significant prostate cancer or non-significant prostate cancer.
