### *2.2. Stain Normalization*

The proposed algorithm employs a specific preprocessing stage, called stain normalization, to reduce the color variability of the histological samples. Previous studies have shown that stain variability significantly affects the performance of automatic algorithms in digital pathology [12,13]. Stain normalization transforms a source image *I* into another image *INORM* through the operation *INORM* = *f*(*I*, *IREF*), where *IREF* is a reference image and *f*(·) is the function that applies the color intensities of *IREF* to the source image [14]. The reference image is chosen by the pathologist as the image with the best tissue staining and visual appearance. For each image of the dataset, the RENFAST algorithm applies the same stain normalization method that we developed in our previous work [15]. First, the image is converted to the optical density (OD) space, where the relationship between stain concentration and light intensity is linear. The algorithm then estimates the stain color appearance matrix (W) and the stain density map (H) for both the source and reference images. To apply the normalization, the stain density map of the source image is adjusted using the following equation:

$$I_{NORM} = W_{REF} \cdot \frac{H_{SOURCE}}{H_{REF}} \tag{1}$$

where (·)*SOURCE* and (·)*REF* denote the source and reference images, respectively. Finally, the normalized image is converted back from the OD space to RGB. Figure 3 illustrates the color normalization process for sample PAS and TRIC images.
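For illustration, the sketch below (Python/NumPy) converts an image to OD space and applies Eq. (1). The stain separation itself (the estimation of W and H) is the method of [15] and is left as a placeholder `estimate_WH`; rescaling the source densities by a ratio of 99th-percentile values is one common reading of the *HSOURCE*/*HREF* term, not necessarily the authors' exact implementation.

```python
import numpy as np

def rgb_to_od(img, bg=255.0):
    # Beer-Lambert law: OD = -log(I / I0); the +1 avoids log(0)
    return -np.log((img.astype(np.float64) + 1.0) / bg)

def od_to_rgb(od, bg=255.0):
    # Inverse transform back to 8-bit RGB
    return np.clip(bg * np.exp(-od), 0, 255).astype(np.uint8)

def normalize_stain(src_rgb, W_ref, H_ref, estimate_WH):
    """Apply Eq. (1). `estimate_WH` is a placeholder for the stain
    separation of [15]: given a (3, n_pixels) OD image it must return
    the (3, n_stains) color matrix W and the (n_stains, n_pixels)
    density map H. W_ref and H_ref come from the reference image."""
    h, w, _ = src_rgb.shape
    od_src = rgb_to_od(src_rgb).reshape(-1, 3).T          # (3, n_pixels)
    _, H_src = estimate_WH(od_src)
    # Rescale source densities stain-by-stain towards the reference
    # (a robust 99th percentile stands in for H_SOURCE / H_REF)
    scale = (np.percentile(H_ref, 99, axis=1, keepdims=True) /
             np.percentile(H_src, 99, axis=1, keepdims=True))
    od_norm = W_ref @ (H_src * scale)                     # Eq. (1)
    return od_to_rgb(od_norm.T.reshape(h, w, 3))
```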

**Figure 3.** Stain normalization performed by the RENFAST algorithm. (**a**) PAS normalization; (**b**) TRIC normalization.

### *2.3. Deep Network Architecture*

After stain normalization, the first step performed by the RENFAST algorithm is semantic segmentation using a convolutional neural network (CNN). To perform blood vessel segmentation, a UNET architecture with a ResNet34 backbone [16] is employed, implemented in the Keras framework. The overall network architecture is shown in Figure 4. The network consists of an encoder that downsamples the spatial resolution of the input image through convolutional operations to obtain a low-resolution feature map. These features are then upsampled by a decoder to obtain a pixel-wise prediction of the same size as the input image. The output of the network is a probability map that assigns to each pixel a probability of belonging to a specific class. The entire network is trained on a three-class problem, with the 512 × 512 RGB images as input and the corresponding labeled masks as the target. In each image of the dataset, pixels are labeled into three classes: (i) background, (ii) blood vessel, and (iii) blood vessel boundaries. To address class imbalance, the network's loss function is class-weighted according to how frequently each class occurs in the training set. In this way, the least-represented class contributes more to the weight update than a more represented one. The class weight is computed as follows:

$$f_{classX} = \sum_{i=1}^{N} \frac{\% pixel_{classX}^{(i)}}{N} \qquad X = 1, 2, 3 \tag{2}$$

$$\text{class}_{WEIGHT} = \frac{\text{median}([f_{class1}, f_{class2}, f_{class3}])}{[f_{class1}, f_{class2}, f_{class3}]} \tag{3}$$

where *N* is the total number of images, *%pixelclassX(i)* is the percentage of pixels belonging to class X in the *i*-th image, and *fclassX* is the resulting frequency of the generic class X.
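In code, Eqs. (2) and (3) amount to a few NumPy lines; the sketch below assumes the label masks are stored as integer maps with values 0–2:

```python
import numpy as np

def class_weights(masks):
    """Eqs. (2)-(3): median-frequency balancing over N label masks,
    where pixel values {0, 1, 2} encode background, blood vessel,
    and blood vessel boundary."""
    # Eq. (2): average fraction of pixels per class across the dataset
    freqs = np.array([np.mean([(m == c).mean() for m in masks])
                      for c in range(3)])
    # Eq. (3): rarer classes receive proportionally larger loss weights
    return np.median(freqs) / freqs

# e.g., weights = class_weights(train_masks)  ->  array of 3 loss weights
```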

**Figure 4.** Architecture of the deep network employed to perform blood vessel detection. A UNET with a ResNet34 backbone was implemented using the Keras framework.

The encoding network was pre-trained on ILSVRC 2012 ImageNet [17]. During the training process, only the decoder weights were updated, while the encoder weights were set to non-trainable. This strategy exploits the knowledge acquired on a previous problem (ImageNet classification) and reuses the learned features to solve a new problem (vessel segmentation). It both speeds up the training process and yields a robust model even when less data are available. The training data are augmented in real time as they pass through the network, applying the same random transformations (rotation, shifting, flipping) to both the input image and the corresponding encoded mask. Real-time data augmentation increases the amount of training data without storing the transformed images in memory. This strategy makes the model more robust to slight variations and prevents the network from overfitting.
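A common way to obtain such paired real-time augmentation in Keras is to drive two `ImageDataGenerator` instances with the same random seed, one for the images and one for the masks; the transform ranges below are illustrative choices, not values reported by the authors:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Identical settings for images and encoded masks (rotation, shifting,
# flipping, as stated above); the exact ranges are our assumption
aug = dict(rotation_range=90,
           width_shift_range=0.1,
           height_shift_range=0.1,
           horizontal_flip=True,
           vertical_flip=True,
           fill_mode='reflect')

image_datagen = ImageDataGenerator(**aug)
mask_datagen = ImageDataGenerator(**aug)

# A shared seed guarantees that every image and its mask undergo the
# same random transformation, batch after batch, without storing
# augmented copies in memory
seed = 1
image_flow = image_datagen.flow(X_train, seed=seed, batch_size=32)
mask_flow = mask_datagen.flow(Y_train, seed=seed, batch_size=32)
train_generator = zip(image_flow, mask_flow)
```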

Our network (Figure 4) was trained on 300 images with a mini-batch size of 32 and categorical cross-entropy as the loss function. The Adam optimization algorithm was employed with an initial learning rate of 0.01. The maximum number of epochs was set to 50, with a validation patience of 10 epochs for early stopping of the training process.
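The paper does not state which Keras implementation of the UNET/ResNet34 combination was used; the sketch below uses the open-source segmentation_models package, which provides exactly this pairing, and wires in the training settings listed above. The `class_weights` helper and `train_generator` come from the earlier sketches; the validation generator and `restore_best_weights` are our assumptions.

```python
import segmentation_models as sm
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping

# UNET decoder on top of an ImageNet pre-trained, frozen ResNet34 encoder
model = sm.Unet(backbone_name='resnet34',
                input_shape=(512, 512, 3),
                classes=3,
                activation='softmax',
                encoder_weights='imagenet',
                encoder_freeze=True)

# Categorical cross-entropy weighted according to Eqs. (2)-(3)
loss = sm.losses.CategoricalCELoss(class_weights=class_weights(train_masks))
model.compile(optimizer=Adam(learning_rate=0.01), loss=loss)

# 300 training images, mini-batches of 32, at most 50 epochs,
# early stopping with a validation patience of 10 epochs
model.fit(train_generator,
          steps_per_epoch=300 // 32,
          epochs=50,
          validation_data=val_generator,
          callbacks=[EarlyStopping(monitor='val_loss', patience=10,
                                   restore_best_weights=True)])
```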

To preserve the information near the image borders, the RENFAST algorithm applies a specific procedure to build the CNN softmax. Briefly, a mirror border is synthesized in each direction and a sliding-window approach is used to build the probability map. A detailed description of this procedure, along with a summary figure, is provided in Appendix A.
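The exact construction is given in Appendix A; the sketch below only illustrates the general idea, with the tile size, stride, and averaging of overlapping predictions chosen by us for illustration:

```python
import numpy as np

def padded_softmax(model, img, tile=512, stride=256):
    """Mirror-pad the image, then build the probability map with an
    overlapping sliding window so border pixels are predicted from
    full-context tiles. Overlapping predictions are averaged."""
    h, w, _ = img.shape
    pad = tile // 2
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode='reflect')
    H, W = padded.shape[:2]
    probs = np.zeros((H, W, 3))
    counts = np.zeros((H, W, 1))
    # Window origins: regular grid plus a final window flush with the edge
    ys = list(range(0, H - tile, stride)) + [H - tile]
    xs = list(range(0, W - tile, stride)) + [W - tile]
    for y in ys:
        for x in xs:
            patch = padded[y:y + tile, x:x + tile][None]   # add batch axis
            probs[y:y + tile, x:x + tile] += model.predict(patch)[0]
            counts[y:y + tile, x:x + tile] += 1
    probs /= counts
    return probs[pad:pad + h, pad:pad + w]   # crop the mirrored border
```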

### *2.4. Blood Vessel Detection*

Starting from the normalized RGB image (Figure 5a), the RENFAST algorithm applies the deep network described in the previous section. Figure 5b shows the probability map obtained from the CNN, in which the red and green areas represent the pixels inside and on the edge of the blood vessels, respectively. Then, our method detects all the white and nuclear regions within the image. All the unstained structures are segmented by thresholding the grayscale image of the PAS sample, while cell nuclei are detected using the object-based thresholding developed in our previous work [15]. Figure 5c illustrates the segmentation of cellular structures performed by the RENFAST algorithm.
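The sketch below outlines this step; Otsu's method stands in for the unspecified grayscale threshold, and `detect_nuclei` is a placeholder for the object-based nuclei thresholding of [15]:

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.filters import threshold_otsu

def detect_cellular_structures(norm_rgb, detect_nuclei):
    """Segment unstained (white) regions by thresholding the grayscale
    PAS image and delegate nuclei detection to the method of [15]."""
    gray = rgb2gray(norm_rgb)                  # values in [0, 1]
    lumen_mask = gray > threshold_otsu(gray)   # bright, unstained pixels
    nuclei_mask = detect_nuclei(norm_rgb)      # placeholder for [15]
    return lumen_mask, nuclei_mask
```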

**Figure 5.** Steps performed by RENFAST for blood vessel detection. (**a**) Normalized image; (**b**) CNN probability map; (**c**) Cellular structure detection (yellow: nuclei, cyan: lumen); (**d**) Initial blood vessel segmentation; (**e**) Softmax with high SNR (signal-to-noise ratio); (**f**) Final blood vessel segmentation.

To obtain an initial detection of the vascular structures, the probability maps of the regions inside and on the border of the blood vessels are added together and thresholded with a fixed value of 0.35. Then, morphological closing with a disk of 3-pixel radius (equal to 2.80 μm) is carried out to obtain smoother contours. As can be seen from Figure 5d, this strategy leads to accurate detection of the blood vessel boundaries but does not allow the separation of touching structures. To overcome this problem, an additional processing stage is performed to divide clustered blood vessels: the RENFAST algorithm employs a four-step procedure to increase the contrast between each blood vessel's boundary and the background.
This procedure generates a softmax with a high SNR (signal-to-noise ratio) in which the border of each blood vessel is clearly defined (Figure 5e). Finally, a simple check is performed on each connected component of the initial mask (Figure 5d): if subtracting the green layer of the high-SNR softmax (Figure 5e) splits the component into more than one region, those regions are dilated by 1 pixel and added to the final mask. In this way, the thickness lost during the subtraction is recovered while the blood vessels remain separated. Otherwise, if the subtraction creates no additional structure, the connected component is inserted directly into the final mask.
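The two detection stages can be sketched as follows, assuming the softmax channels are ordered (background, vessel, boundary) and using scikit-image for the morphological operations:

```python
import numpy as np
from skimage.measure import label
from skimage.morphology import binary_closing, binary_dilation, disk

def initial_vessel_mask(probs):
    """Sum the inside and boundary probabilities, apply the fixed 0.35
    threshold, and smooth contours with a 3-pixel (2.80 um) disk."""
    mask = (probs[..., 1] + probs[..., 2]) > 0.35
    return binary_closing(mask, disk(3))

def separate_clusters(initial_mask, boundary_mask):
    """Split touching vessels: subtract the high-SNR boundary layer from
    each connected component; if it breaks into several regions, dilate
    the fragments by 1 pixel (recovering the lost thickness), otherwise
    keep the component unchanged."""
    final = np.zeros_like(initial_mask, dtype=bool)
    components = label(initial_mask)
    for idx in range(1, components.max() + 1):
        comp = components == idx
        fragments = label(comp & ~boundary_mask)
        if fragments.max() > 1:                 # the subtraction split it
            for f in range(1, fragments.max() + 1):
                final |= binary_dilation(fragments == f, disk(1))
        else:
            final |= comp
    return final
```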

The last step of the RENFAST algorithm for vessel segmentation is a structural check on the segmented objects: all regions with an area of less than 180 μm<sup>2</sup> are erased, as they are too small to be blood vessels. In addition, objects must have at least 2.5% and 5% of their area occupied by lumen and nuclei, respectively. These structural checks remove most of the false positives generated by the CNN. The final result provided by the proposed algorithm is shown in Figure 5f.
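A sketch of these checks using scikit-image region properties; the pixel size is inferred from the stated 3 px = 2.80 μm disk radius and is therefore an assumption:

```python
import numpy as np
from skimage.measure import label, regionprops

def structural_check(vessel_mask, lumen_mask, nuclei_mask,
                     um_per_px=2.80 / 3):
    """Keep only regions at least 180 um^2 in area whose lumen and
    nuclei coverages reach 2.5% and 5%, respectively."""
    min_area_px = 180.0 / um_per_px ** 2
    final = np.zeros_like(vessel_mask, dtype=bool)
    for region in regionprops(label(vessel_mask)):
        rows, cols = region.coords[:, 0], region.coords[:, 1]
        if (region.area >= min_area_px
                and lumen_mask[rows, cols].mean() >= 0.025
                and nuclei_mask[rows, cols].mean() >= 0.05):
            final[rows, cols] = True
    return final
```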
