Article

EpidermaQuant: Unsupervised Detection and Quantification of Epidermal Differentiation Markers on H-DAB-Stained Images of Reconstructed Human Epidermis

Dawid Zamojski, Agnieszka Gogler, Dorota Scieglinska and Michal Marczyk

1 Department of Data Science and Engineering, Silesian University of Technology, 44-100 Gliwice, Poland
2 Genetic Laboratory, Gyncentrum Sp. z o.o., 41-208 Sosnowiec, Poland
3 Center for Translational Research and Molecular Biology of Cancer, Maria Sklodowska-Curie National Research Institute of Oncology Gliwice Branch, 44-102 Gliwice, Poland
4 Yale Cancer Center, Yale School of Medicine, New Haven, CT 06510, USA
* Author to whom correspondence should be addressed.
Diagnostics 2024, 14(17), 1904; https://doi.org/10.3390/diagnostics14171904
Submission received: 15 July 2024 / Revised: 22 August 2024 / Accepted: 27 August 2024 / Published: 29 August 2024
(This article belongs to the Topic AI in Medical Imaging and Image Processing)

Abstract

The integrity of the reconstructed human epidermis generated in vitro can be assessed using histological analyses combined with immunohistochemical staining of keratinocyte differentiation markers. Technical differences during the preparation and capture of stained images may influence the outcome of computational methods. Due to the specific nature of the analyzed material, no annotated datasets or dedicated methods are publicly available. Using a dataset with 598 unannotated images showing cross-sections of in vitro reconstructed human epidermis stained with DAB-based immunohistochemistry reaction to visualize four different keratinocyte differentiation marker proteins (filaggrin, keratin 10, Ki67, HSPA2) and counterstained with hematoxylin, we developed an unsupervised method for the detection and quantification of immunohistochemical staining. The pipeline consists of the following steps: (i) color normalization; (ii) color deconvolution; (iii) morphological operations; (iv) automatic image rotation; and (v) clustering. The most effective combination of methods includes (i) Reinhard’s normalization; (ii) Ruifrok and Johnston color-deconvolution method; (iii) proposed image-rotation method based on boundary distribution of image intensity; and (iv) k-means clustering. The results of the work should enhance the performance of quantitative analyses of protein markers in reconstructed human epidermis samples and enable the comparison of their spatial distribution between different experimental conditions.

1. Introduction

Immunohistochemistry (IHC) allows the visualization of specific proteins in fixed tissue sections. It involves the application of antibodies specific to a target antigen, which ultimately allows microscopic observation of the antigen–antibody complex in situ [1]. A secondary antibody conjugated to an enzyme, such as horseradish peroxidase (HRP), is used to visualize antigen–antibody binding in DAB (3,3′-diaminobenzidine)-mediated IHC staining. HRP oxidizes DAB and converts it into an insoluble brown precipitate [1]. The presence and localization of brown staining in tissue can be evaluated and imaged using a light microscope [2]. Quantitative digital analysis of images obtained by IHC staining is crucial both for basic biomedical research and as a diagnostic tool in pathology [3]. The application of deep-learning and machine-learning algorithms for the analysis of IHC protein markers has been tested in precision medicine for the analysis of diagnostic biomarkers in pathological conditions such as malignant cancers (breast, prostate, lung), melanocytic proliferations, and hematologic conditions [4]. For example, routine molecular diagnostics of breast cancer [5] can be improved by deep-neural-network algorithms that support the IHC analysis of biomarkers (estrogen receptor, progesterone receptor, epidermal growth factor receptor 2, and Ki67) [6,7,8,9,10,11]. Computational tools for the detection of IHC-stained epithelial-type biomarkers [12] to discriminate between benign, in situ, and malignant breast lesions are scarce, but for this purpose, several algorithms developed for slides stained with hematoxylin and eosin (H&E) can be adopted [13,14]. Quantitative IHC assessment is also important for basic research; e.g., it can provide a quantitative assessment of protein markers in tissue-like (organotypic) cultures generated using 3D in vitro cell culture techniques. For example, the in vitro reconstructed human epidermis (RHE) is considered a representative and sufficient model of the human epidermis. RHEs can be obtained by culturing keratinocytes at the air–liquid interface on a collagen–fibroblast matrix. Such a 3D in vitro cell culture is a valuable tool to study the impact of selected gene products on the integrity and function of the human epidermis [3].
The preparation of IHC slides, the method of image acquisition, technical differences during image capture, and the presence of various artifacts are the main challenges in the computational detection and segmentation of DAB-stained areas. One solution to this problem is to use an unsupervised algorithm for processing IHC images. Compared to the frequently used H&E-stained samples, the colors of DAB-stained samples are less consistent, which is considered a major problem in fully automated image-processing systems. In addition, a multitude of obstacles and limitations make it difficult for both the specialist and the algorithm to produce consistent results [15]. The main difficulties are the variability in the shape, size, and color of the objects of interest, the lack of clear criteria for cell-type identification, and the presence of overlapping structures. An effective image pre-processing concept should overcome all these limitations to ensure high quality and accuracy in the studies performed, especially clinical ones. With the help of computer-assisted evaluation of digital images of IHC-stained tissue (such as biopsy specimens), a quicker and more accurate prognosis or a new understanding of the mechanism of a disease is possible.
A variety of techniques have been proposed to segment and quantify tissue stained with H&E and stored as digital images [16]. Studies related to DAB staining are still limited. Patel et al. used the Otsu method on negative control slides to define the threshold that distinguishes tissue from the background, and all pixels considered tissue were evaluated for normalized red-minus-blue (NRMB) color intensity. Then, a user-defined error tolerance was applied to the negative control slides to set the NRMB threshold distinguishing DAB-stained from unstained tissue. This threshold was used to calculate the pixel fraction of DAB-stained tissue on each test slide [17]. Bencze et al. compared semi-quantitative scoring and digital analysis of DAB-stained images using a convolutional neural network. Semi-quantitative scoring was carried out using a four-point scale based on the IHC intensity of the cells. Users could submit images, which were then analyzed to determine whether the algorithm had correctly recognized the objects or whether manual annotation of points of interest was required [18]. Sziva et al. proposed a mechanism for quantitative histomorphometric-mathematical image analysis of DAB-stained tissue as support in andrology and reproductive medicine. The quantitative analysis was based on features such as the area, circumference, and cross-sectional diameter of stained testes [19]. Roszkowiak et al. described a comprehensive system capable of quantifying digitized DAB&H-stained breast cancer tissue samples at different intensities. The researchers introduced segmentation based on regions of interest, as well as a step of dividing clusters of nuclei, followed by boundary enhancement. The system used machine learning and local recursive processing to eliminate distorted and inaccurate regions of interest [15]. Deep-learning methods are gradually overtaking classical image-processing methods that deal with feature-space partitioning. This is particularly evident in images where the features of objects are uncertain and variable, as is the case with IHC-stained tissue images [20,21]. Some IHC image analysis methods have been developed and made available as freeware, for example, QuPath v0.5.1 or ImageJ version 1.54j [22,23].
Here, we developed an unsupervised method for detecting protein markers of human epidermal keratinocyte differentiation on IHC images. The need to develop a new method was mainly due to (i) technological limitations of current IHC image-processing methods, which are not dedicated to epidermal images; (ii) problems with the structural complexity of the 3D tissue-like models resulting from their culture methodology; and (iii) significant variability in the structure and shape of the areas of interest between the analyzed images, resulting from the specific nature of cell growth at post-confluent cell density. We tested the effectiveness of various methods for pre-processing and segmentation of IHC-stained epidermal differentiation markers, proposed our own solutions, and combined the best methods into a freely available pipeline.

2. Materials and Methods

2.1. Data

The dataset consisted of 598 unannotated .jpg images (1936 × 1460 pixels) containing formalin-fixed, paraffin-embedded sections of in vitro reconstructed human epidermis, i.e., tissue-like structures generated by 3D in vitro culture of the immortalized epidermal keratinocyte HaCaT line. Samples were processed by immunohistochemistry and stained using an antigen-specific primary antibody and an appropriate secondary antibody. After visualization of antigen–antibody binding by the DAB-mediated reaction, cell nuclei were counterstained with hematoxylin (see examples in Figure 1). The details of tissue processing and IHC protocols are described in Gogler-Piglowska et al. [24]. Nine biological repeats were prepared in the form of positive controls (specimens that contained the target molecule in its expected location and enabled the visualization of histomorphology) and negative controls without primary antibodies (to confirm the specificity of the IHC staining) [25]. Four selected proteins (filaggrin (FLG), keratin 10 (K10), Ki67, and HSPA2), considered markers of keratinocyte differentiation that are present in individual layers of the epidermis and indicative of epidermal development, were analyzed. The number of images per marker was as follows: FLG—189 images (163 positive and 26 negative controls); K10—131 images (113 positive and 18 negative controls); Ki67—96 images (96 positive and 0 negative controls); HSPA2—182 images (124 positive and 58 negative controls). Methods were tested using images of the FLG, K10, and Ki67 markers, and the final pipeline was visually tested using the HSPA2 marker data. Images were captured using a ZEISS AXIOCAM 503 color camera (Zeiss, Oberkochen, Germany) with a ZEN 2.6 photo archiving system at 40× magnification. Data preparation and scanning were performed at the Center for Translational Research and Molecular Biology of Cancer, Maria Sklodowska-Curie National Research Institute of Oncology Gliwice Branch, Poland.

2.2. Color Normalization and Deconvolution

Reinhard’s method, normalization by RGB histogram specification, and Macenko’s method were tested as implementations of non-linear stain-normalization algorithms [26]. Reinhard’s method involved applying the mat2gray transformation to improve the results. This operation converts a matrix to a grayscale image and stretches the histogram intensity values to the range from 0 to 1. Then, to transform stained biological samples into images representing the concentrations of the individual stains, color deconvolution was performed [27]. The method by Ruifrok and Johnston was used with different sets of parameters: (i) from the ImageJ Color Deconvolution 2 plug-in; and (ii) from the Python scikit-image implementation (see details in the Supplementary Materials) [28].
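To make these two steps concrete, the sketch below pairs a Lab-space approximation of Reinhard's statistics transfer with the Ruifrok and Johnston deconvolution shipped with scikit-image; the file names and the choice of CIELAB instead of the original l-alpha-beta space are assumptions of this illustration, not details of the published pipeline, and the rgb2hed stain vectors differ from the ImageJ Color Deconvolution 2 parameters ultimately selected in this study.

```python
# A minimal sketch, assuming scikit-image; it is NOT the published MATLAB implementation.
import numpy as np
from skimage import color, io

def reinhard_normalize(source_rgb, reference_rgb):
    """Match per-channel mean and standard deviation of the source to a reference image in Lab space."""
    src = color.rgb2lab(source_rgb)
    ref = color.rgb2lab(reference_rgb)
    out = np.empty_like(src)
    for c in range(3):
        s_mean, s_std = src[..., c].mean(), src[..., c].std()
        r_mean, r_std = ref[..., c].mean(), ref[..., c].std()
        out[..., c] = (src[..., c] - s_mean) / (s_std + 1e-8) * r_std + r_mean
    return np.clip(color.lab2rgb(out), 0.0, 1.0)

def split_stains(rgb):
    """Ruifrok & Johnston color deconvolution into hematoxylin, residual, and DAB channels."""
    hed = color.rgb2hed(rgb)          # stain "concentration" images (higher = more stain)
    return hed[..., 0], hed[..., 1], hed[..., 2]

# usage (file names are placeholders):
# img = io.imread("stained_section.jpg")
# ref = io.imread("reference_section.jpg")
# h_channel, _, dab_channel = split_stains(reinhard_normalize(img, ref))
```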

2.3. Background Detection and Mask Creation

Otsu’s binarization was performed on the grayscale hematoxylin-channel image. Morphological operations were implemented to remove noise and unnecessary regions in the specimen, including the image background. They were applied in the following sequence: opening, closing, and hole filling. Opening removed unconnected light objects that were smaller than the structuring element (SE), while closing removed unconnected dark objects that were smaller than the SE. The SE was a disk with a radius of 15 pixels. The hole-filling operation completed the mask by filling the empty spaces enclosed between nearby pixels. Subsequently, artifacts were discarded by selecting the largest area of the tissue slide on the previously created black-and-white mask. Finally, the morphological operations were applied once again to improve the quality of the results.
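A scikit-image sketch of this masking step, assuming the same sequence of operations, is shown below; whether tissue is brighter or darker than the Otsu threshold depends on how the hematoxylin channel is scaled, so the comparison may need to be flipped.

```python
# Illustrative background-mask construction: Otsu threshold, opening/closing with a
# disk of radius 15, hole filling, and selection of the largest connected region.
# This mirrors the steps described above but is not the authors' MATLAB code.
import numpy as np
from scipy import ndimage as ndi
from skimage import filters, measure, morphology

def tissue_mask(hematoxylin, radius=15):
    """Binary tissue mask from a grayscale hematoxylin-channel image."""
    thresh = filters.threshold_otsu(hematoxylin)
    mask = hematoxylin > thresh                    # flip to "<" if tissue is darker than background
    se = morphology.disk(radius)
    mask = morphology.binary_opening(mask, se)     # remove small light objects
    mask = morphology.binary_closing(mask, se)     # remove small dark objects
    mask = ndi.binary_fill_holes(mask)             # fill enclosed holes
    labels = measure.label(mask)
    if labels.max() == 0:
        return mask                                # nothing detected
    largest = np.argmax(np.bincount(labels.ravel())[1:]) + 1
    return labels == largest                       # keep only the largest tissue region
```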

2.4. Image Rotation and Crop

Two methods of image rotation were tested: (i) the Hough transform and (ii) the proposed image rotation method based on the boundary distributions of pixel intensities.
The Hough transform maps points from image space into a parameter (Hough) space so that lines in the image are represented by points in Hough space. Then, by analyzing the relationships between the points in Hough space, the algorithm can identify the lines in the image. The Hough space contains the parameters that describe the shape being searched for. For example, to detect a straight line in an image, the Hough space is two-dimensional and consists of the angle of the line and the distance of that line from the origin of the coordinate system [29].
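For comparison, a line-orientation estimate based on the Hough transform can be obtained with scikit-image roughly as follows; the Canny edge-detection step is an assumption of this sketch, not a documented detail of the tested implementation.

```python
# Illustrative Hough-transform orientation estimate (the alternative that was tested
# and later replaced by the proposed boundary-distribution method).
import numpy as np
from skimage.feature import canny
from skimage.transform import hough_line, hough_line_peaks

def dominant_line_angle(gray_image):
    """Angle (in degrees) of the strongest straight line detected in the image."""
    edges = canny(gray_image)                       # binary edge map
    hspace, angles, dists = hough_line(edges)       # accumulator over (angle, distance)
    _, peak_angles, _ = hough_line_peaks(hspace, angles, dists, num_peaks=1)
    return np.rad2deg(peak_angles[0])
```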
The proposed image rotation method automatically rotates the tissue to a horizontal position by maximizing the boundary pixel intensities. The boundary intensity distribution is calculated as the sum of the pixel intensities in each row. When the orientation of the tissue is horizontal, the intensity distribution reaches its maximum values, which allows the algorithm to accurately align the tissue to a horizontal position. The proposed algorithm is summarized in the following steps (a code sketch is given after the list):
  • The tissue mask (negation of background mask) is used as the input;
  • The mask is rotated from 1 to 180 degrees with a step of 0.1 degrees;
  • For each rotation position, the sum of pixels in each row of the mask is calculated to create boundary intensity distribution;
  • For each rotation position, the maximum value of the boundary intensity distribution is found;
  • The final rotation position is the one that maximizes the maximum value of the boundary intensity.
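A minimal sketch of this rotation search is given below; the angular step is coarser than the 0.1 degrees used in the study only to keep the example fast, and applying the selected angle to the DAB channel is assumed to follow.

```python
# Illustrative rotation search: rotate the tissue mask over a range of angles and keep
# the angle whose row-sum profile (boundary intensity distribution) reaches the
# highest maximum, i.e., the angle at which the tissue stripe lies horizontally.
import numpy as np
from skimage.transform import rotate

def best_rotation(mask, step=0.5):
    """Return the rotation angle (degrees) that aligns the tissue stripe horizontally."""
    best_angle, best_score = 0.0, -np.inf
    for angle in np.arange(0.0, 180.0, step):
        rotated = rotate(mask.astype(float), angle, resize=True, order=0)
        row_sums = rotated.sum(axis=1)     # boundary intensity distribution
        score = row_sums.max()             # peak of the distribution
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle

# usage: apply the angle found on the mask to the DAB-channel image
# angle = best_rotation(mask)
# dab_rotated = rotate(dab_channel, angle, resize=True)
```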

2.5. Elimination of Images without DAB Stain

To prevent the segmentation of images without DAB staining, an average proportion (AP) measure was introduced. The minimum staining level was determined for a set of images representing the FLG marker, which was the most difficult to analyze. The result for an individual DAB-channel image was expressed as the percentage of pixels meeting the intensity condition relative to the total number of pixels in the image:
AP = (no. of pixels < threshold / total number of pixels) × 100 [%]
To determine the appropriate intensity threshold and the AP cut-off value used to eliminate images without DAB stain, ROC curve analysis was performed and the Youden index was calculated. The AP cut-off was searched in the range from 0.1% to 1% with a step of 0.01.
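The sketch below illustrates this selection, assuming binary labels (1 for positive controls, 0 for negative controls) and a placeholder DAB intensity threshold; it follows the convention used above that DAB-stained pixels are darker (have lower values) in the DAB-channel image.

```python
# Illustrative computation of the average proportion (AP) and the ROC/Youden-based
# choice of the AP cut-off for rejecting images without DAB stain.
import numpy as np
from sklearn.metrics import auc, roc_curve

def average_proportion(dab_channel, intensity_threshold):
    """Percentage of pixels darker (lower value) than the intensity threshold."""
    return 100.0 * np.mean(dab_channel < intensity_threshold)

def best_ap_cutoff(ap_values, labels):
    """AP cut-off maximizing the Youden index (sensitivity + specificity - 1), plus the AUC."""
    fpr, tpr, thresholds = roc_curve(labels, ap_values)
    youden = tpr - fpr
    return thresholds[np.argmax(youden)], auc(fpr, tpr)

# usage (hypothetical arrays of DAB-channel images and control labels):
# aps = np.array([average_proportion(img, 0.3) for img in dab_images])
# cutoff, roc_auc = best_ap_cutoff(aps, labels)
```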

2.6. DAB Region Segmentation

Segmentation algorithms in the context of IHC imaging can be used to find areas stained with particular stains. From the existing segmentation methods, the k-means clustering algorithm was selected due to its efficiency and simplicity [30]. The process was divided into several steps: (i) reshape the DAB image into a new matrix retaining the original data; (ii) normalize the DAB image channels independently using the mean and standard deviation; (iii) estimate the number of clusters (from 1 to 3) using the Davies–Bouldin method, based on the ratio of within-cluster to between-cluster distances; (iv) perform k-means clustering; (v) mask the DAB image using the obtained clusters; and (vi) order the clusters starting from the darkest using the average DAB intensity per cluster.
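A sketch of this segmentation step is given below, assuming a multi-channel (H × W × C) DAB image as input; the Davies–Bouldin index is evaluated only for k = 2 and k = 3, since the criterion is undefined for a single cluster.

```python
# Illustrative DAB-region segmentation: per-channel standardization, cluster-count
# selection with the Davies-Bouldin criterion, k-means clustering, and ordering of
# clusters so that the darkest one marks the DAB-stained region.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score

def segment_dab(dab_image, k_candidates=(2, 3)):
    """Return a label image and the index of the darkest (DAB-stained) cluster."""
    pixels = dab_image.reshape(-1, dab_image.shape[-1]).astype(float)
    pixels = (pixels - pixels.mean(axis=0)) / (pixels.std(axis=0) + 1e-8)

    best_k, best_score = k_candidates[0], np.inf
    for k in k_candidates:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(pixels)
        score = davies_bouldin_score(pixels, labels)   # lower is better
        if score < best_score:
            best_k, best_score = k, score

    labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(pixels)
    labels = labels.reshape(dab_image.shape[:2])

    cluster_means = [dab_image[labels == c].mean() for c in range(best_k)]
    darkest = int(np.argmin(cluster_means))            # darkest cluster = stained region
    return labels, darkest
```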

2.7. EpidermaQuant Framework

The final pipeline for analysis was based on a custom algorithm written in the MATLAB® R2021b programming environment (see Supplementary Materials; Supplementary Figure S1). The algorithm is publicly available at https://github.com/DawZam/EpidermaQuant (accessed on 28 August 2024).

3. Results

The analyzed pre-processing steps include image normalization, color deconvolution, background removal, and image rotation. Then, the image is segmented using the k-means clustering method. We tested selected methods, and the best algorithms chosen by visual investigation of resulting images by experts were implemented in the EpidermaQuant framework (Figure 2).

3.1. Color Normalization and Deconvolution Results

To better distinguish the objects of interest in the image and harmonize images from different scans, we tested several color-normalization and color-deconvolution methods (Supplementary Figure S2). The best normalization method was Reinhard’s normalization. In most images, it exposed the brown staining best and reduced background noise compared with the other methods (Figure 3A). Macenko’s method gave the poorest visual results. The effectiveness of the color-deconvolution methods was tested using the results of the hematoxylin and DAB channels (Figure 3B). The best set of parameters was that from the ImageJ Color Deconvolution 2 plug-in. Only this method provided proper extraction of the DAB image, including only areas with brown staining representing the analyzed marker. In addition, the residual image showed low intensities, which is preferable. With the other methods, background noise was visible in the DAB image (Figure 3B).

3.2. Background Removal and Image Rotation

We detected a background region using the image of the hematoxylin channel since most tissue structures are visible here. The first step of the algorithm, image binarization, gives a rough estimate of tissue regions (Figure 4). By introducing a series of subsequent morphological operations and binarization, we found the background area of the image and created a mask without artifacts. Lastly, we discarded small artifacts by selecting the largest area of the mask. The mask allows the definition of a clear boundary between a tissue slice and an unnecessary image background.
The reconstructed epidermis forms a thin, single stripe on each stained slide. To reduce the scanned image size, we rotated the image and cropped as much background as possible (Supplementary Figure S3). At first, we tested functions implementing the Hough transform; however, its performance was not effective when tests were conducted on the entire set of images. Therefore, we developed the image-rotation method based on the boundary distribution of image intensity, which proved to work much more efficiently (Figure 4B). Based on the results of the mask rotation, we cropped the DAB channel image for further analysis.

3.3. Elimination of Images without DAB Stain Results

Some tissues may not contain the analyzed marker, so no DAB-stained regions should be observed in the image. Since we noticed individual brown-colored pixels in the negative samples, we developed a method to find and remove negative samples from further analysis. First, we calculated the AP for all images representing the FLG marker in both positive (with marker) and negative samples. By maximizing the AUC and the Youden index, we found an optimal threshold value of 0.6% (Supplementary Figure S4).

3.4. DAB Area Detection

Using the DAB channel of the image, we found areas of interest in the form of DAB staining in all images representing each marker. In most cases, two clusters were obtained using the k-means algorithm (Figure 5). The cluster containing the DAB image area with the largest number of darkest pixels forms the final segmentation of the analyzed marker region. These outcomes were used to quantitatively analyze the results, to determine the percentage of DAB-stained tissue, and to mark the outlines of the DAB areas on the original image.

3.5. Comparison of DAB-Staining Levels between Markers

The percentage of DAB-stained tissue area for the representatives of each study group is compared in Figure 6. Samples representing the K10 marker showed the largest DAB-stained area compared with the other markers (median of 39.5% vs. 4.1% for FLG, 2.5% for Ki67, and 25.3% for HSPA2). The spread of the obtained results was also the largest. The smallest spread of results and, at the same time, the smallest DAB-stained area were found for the Ki67 marker (see examples in Supplementary Figures S5–S8).

3.6. Validation of Developed Pipeline on Human Protein Atlas Data

To verify whether the proposed EpidermaQuant framework could correctly determine the DAB-stained tissue in an independent dataset, we used images from The Human Protein Atlas (HPA). HPA contains images of histological sections from normal and tumor tissues obtained by tissue microarrays and immunohistochemistry procedures. Specimens comprising both normal and tumor tissue have been collected and sampled from anonymized paraffin-embedded surgical material. Antibody binding is visualized with DAB, and sections are counterstained with hematoxylin to enable the visualization of microscopic features. The HPA dataset consisted of 66 unannotated .jpg images (3000 × 3000 pixels). Only tissues representing skin were selected. Thus, the number of images per marker was as follows: FLG—17 images; K10—12 images; Ki67—25 images; HSPA2—12 images. The percentage of DAB occupancy on the slide was 0.501–8.3979% for FLG, 4.7549–16.7749% for K10, 0.6336–21.9975% for Ki67, and 6.3639–15.2822% for HSPA2 (see Figure 7). Only one image was excluded and classified as negative.

3.7. Ablation Study

To demonstrate that the proposed steps of the pipeline are required for superior results, we selected the parts of the algorithm that could be removed (normalization and image rotation) or that were optional (plastic removal) and tested these different configurations of the processing pipeline using the K10-marker images as an example. The four tested configurations were as follows: (i) the full pipeline presented in the study (used as a baseline), (ii) the pipeline without the normalization step, (iii) the pipeline without the image rotation and crop step, and (iv) the full pipeline with the additional plastic removal step. Since no ground truth is available, for each scenario we examined the difference in computational time (Figure 8A) and the difference in the percentage of DAB occupancy (Figure 8B). In the K10-marker dataset, turning off image normalization did not change the computational time but increased the percentage of DAB occupancy by 17% on average. Removing the image rotation step from the pipeline increased the computational time by 15 s on average but did not change the percentage of DAB occupancy. When there is visible plastic in the image, the plastic removal step improves the estimate of the DAB occupancy percentage (see example in Supplementary Figure S9) without a large increase in computational time.

4. Discussion

We developed a new pipeline for automated detection, segmentation, and quantitative analysis of human epidermis differentiation markers in images of slides of in vitro reconstructed human epidermis stained by DAB-mediated IHC and counterstained with hematoxylin. The performed analysis led to the selection of the best method for each processing step.
Stain color inconstancy is a known problem that can be caused by several factors: the thickness of the tissue section, antibody concentration, staining time, stain reactivity, characteristics of the scanner, and others. For the color-normalization step, Reinhard’s normalization was chosen because its results best reflected the colors of the staining used, especially for the brown DAB areas. The other two methods produced less satisfactory results. Both the RGB histogram specification and Macenko’s normalization methods failed to sufficiently highlight the areas of interest in the form of DAB staining. Our results differed from the study by Khan et al. [26], in which the authors selected Macenko’s normalization method as the best. They rejected Reinhard’s method because it relies on the assumption of a unimodal color distribution in each channel, which they considered false. This confirms that there is no gold standard and that tests should be performed for each tissue and staining type.
For the image rotation step, we decided to develop a new method that was more effective than the Hough transform. In most cases, it avoided situations in which the image was not rotated fully to a horizontal position or was incorrectly rotated to a vertical position. This step was important for the proper presentation of the segmentation results. Without it, it would have been difficult to compare images in different positions with each other, especially for images representing the FLG marker, where the clusters of DAB-stained areas are very small.
The analyzed images were mostly divided into two or three clusters, correctly reflecting the areas of interest. The k-means algorithm was also used in a publication by Vidhya et al., which presented an effective technique for segmenting and classifying IHC tissue images to detect colorectal tumors. This work also used the k-means method with morphological segmentation, such as erosion and dilation [31].
Quantification of staining intensity in histopathology is relevant only if the absorption of the stain into the preparation is stoichiometric [32]. IHC methods, like most histologic stains, are non-stoichiometric, making it impossible to accurately measure antigen expression based on DAB stain intensity. In addition, the IHC technique uses a series of amplification steps to visualize the results, making it difficult to control the actual signal intensity associated with the amount of antigen. It should be noted that the DAB chromogen does not follow Lambert–Beer’s law, which describes a linear relationship between the concentration of a compound and its absorbance (optical density). The stain acts as a light scatterer with a broad, featureless spectrum, resulting in a darkly colored DAB, which has a different spectral shape than a lightly colored DAB [32]. Therefore, it was decided to quantitatively analyze the results by determining only the percentage of DAB stain occupancy on the slide relative to the entire surface of the tissue.
Considering the characteristics of the biological material (a longitudinal tissue slide with a specific horizontal orientation) and the markers of keratinocyte differentiation used in the study, no publicly available annotated dataset was found to compare and validate the algorithm’s results. Some studies focus on IHC tissue slides generated using techniques such as tissue microarrays or alternative methods of slide acquisition [5,6]. Given the above, we decided to validate the EpidermaQuant algorithm using images provided by The Human Protein Atlas. These images were generated using the tissue microarray technique and did not resemble the shape of the reconstructed human epidermis images. However, visual analysis of the results indicates that the algorithm was able to successfully analyze the external dataset and may be used for other images prepared using the IHC procedure and stained with H-DAB.
The importance of selected steps of the algorithm reported here was established by the ablation study. The analysis without image rotation and crop required the greatest investment of time due to the size of the analyzed image, which shows that this step is crucial for maintaining the time performance of the algorithm. Further analysis showed that the normalization step can decrease false-positive results arising from color variation (too-dark images; see Supplementary Figure S10). We assume that this step is especially important for some low-quality images or images acquired in different laboratories; however, since it does not increase the computational time much, we suggest using it in all cases. For tissue containing a plastic membrane, the optional plastic-removal step allowed a more precise definition of the region of interest (tissue) and did not require a significant investment of time, which is of particular importance in the context of this study.
Several different algorithms have been developed for the analysis of IHC images [7,10,11,13,14,15,18,20,21], but not strictly DAB-stained images of human epidermis. Many of these publications deal with artificial intelligence algorithms and neural networks or use commercially available but paid applications [8,9]. These supervised deep-learning models require complicated architecture systems, high computing power, and the availability of large and complex fully annotated datasets, which can be a challenge in the medical field [33]. Additionally, the heterogeneity of medical examples can present difficulties for deep learning and artificial intelligence models [33]. This excluded the possibility of using these types of algorithms to analyze images of reconstructed human epidermis. Our developed tool allows the quantitative analysis of protein markers in reconstructed human epidermis samples to facilitate a more accurate comparison of their spatial distribution between different experimental conditions.
As with every method, our solution also has drawbacks. The method may not work well when the plastic culture membrane joins the slide and is additionally stained with DAB, or when its color does not resemble the synthetic substrate. Moreover, the algorithm may also include brown artifacts during segmentation. Another limitation is the MATLAB implementation; a likely gain could be achieved by implementing the proposed algorithm in a more commonly used programming language. Lastly, the comparison of the methods was based only on visual evaluation by an expert. More precise methods would be needed to accurately evaluate the performance of the tested algorithms. To obtain automatic matching or mismatching of localized DAB-stained areas, it would have been necessary to compare them with ground-truth results (segmentation based on manual labeling), which were not available in our case.

5. Conclusions

The bioinformatics tool developed in the presented work enabled quantitative analysis of marker proteins and assisted experts in the analysis of their spatial distribution. These results might be useful for a better description of structural changes in the reconstructed epidermis and for staining analysis in biopsy specimens, as biopsy material taken by pathologists for diagnostic purposes also has an elongated shape and undergoes IHC analysis. At the same time, it may inspire other researchers to fill the gap in the literature regarding the detection of markers of human epidermal differentiation on IHC images and may be further developed or adjusted to other research problems in digital pathology.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/diagnostics14171904/s1, Figure S1: EpidermaQuant application. The graphical interface allows the user to select the appropriate image by the marker used in the analysis and to preview the input image, the final result, and the percentage of DAB-stained tissue; Figure S2: color normalization and deconvolution of an IHC image. The first step of image processing is color normalization, which reduces the variability of pixel intensity values in different samples. In the color deconvolution step, the RGB image is split into three channels corresponding to the colors of the stains used (the residual-channel image is not further processed); Figure S3: automatic image rotation and crop. Based on the rotated and cropped mask, the DAB-channel image is scaled accordingly, which allows the background, unnecessary for further analysis, to be removed. Rotation is performed by the proposed method based on the boundary distribution of image intensity; Figure S4: summary of the results of determining the optimal threshold value. Comparison of potential threshold values based on ROC curves, including the AUC and Youden index; Figure S5: final output of the algorithm for exemplary images representing the FLG marker. Percentage of DAB occupancy on the tissue with the outlines of the DAB areas marked over the original image; Figure S6: final output of the algorithm for exemplary images representing the K10 marker. Percentage of DAB occupancy on the tissue with the outlines of the DAB areas marked over the original image; Figure S7: final output of the algorithm for exemplary images representing the Ki67 marker. Percentage of DAB occupancy on the tissue with the outlines of the DAB areas marked over the original image; Figure S8: final output of the algorithm for exemplary images representing the HSPA2 marker. Percentage of DAB occupancy on the tissue with the outlines of the DAB areas marked over the original image; Figure S9: final output of the algorithm for an exemplary image with a plastic membrane (indicated by an arrow) representing the K10 marker. Percentage of DAB occupancy on the tissue with the outlines of the DAB areas marked over the original image before (A) and after (B) removing the plastic culture membrane; Figure S10: final output of the algorithm for an exemplary image representing the K10 marker with and without image normalization. Percentage of DAB occupancy on the tissue with the outlines of the DAB areas marked over the original image with (A) and without (B) normalization.

Author Contributions

Data, A.G. and D.S.; algorithm, computational framework, and data analysis, D.Z. and M.M.; writing—original draft preparation, D.Z.; writing—review and editing, M.M.; project administration, D.Z.; original idea, D.S.; supervision, M.M. All authors have discussed the results and reviewed the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Silesian University of Technology, grant number 02/070/BK_24/0052 for Support and Development of Research Potential (M.M.), by the Excellence Initiative—Research University program implemented at the Silesian University of Technology, grant number 02/070/SDU/10-21-02 (M.M.), by the Ministry of Science and Higher Education ‘Implementation Doctorate’, grant number DWD/7/0396/2023 (D.Z.), and by the National Science Centre, Poland, grant number 2017/25/B/NZ4/01550 (D.S.).

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics Committee of Maria Sklodowska-Curie National Research Institute of Oncology Gliwice Branch (protocol code KB/430-17/18 and date of approval 23 March 2018).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The datasets analyzed during the current study are available in the GitHub repository, https://github.com/DawZam/EpidermaQuant/tree/main/DATA (accessed on 28 August 2024).

Conflicts of Interest

Author Dawid Zamojski was employed by Gyncentrum Sp. z o.o. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Magaki, S.; Hojat, S.A.; Wei, B.; So, A.; Yong, W.H. An Introduction to the Performance of Immunohistochemistry. Methods Mol. Biol. 2019, 1897, 289–298. [Google Scholar] [CrossRef] [PubMed]
  2. Jahn, S.W.; Plass, M.; Moinfar, F. Digital Pathology: Advantages, Limitations and Emerging Perspectives. J. Clin. Med. 2020, 9, 3697. [Google Scholar] [CrossRef] [PubMed]
  3. Gurcan, M.N.; Boucheron, L.E.; Can, A.; Madabhushi, A.; Rajpoot, N.M.; Yener, B. Histopathological Image Analysis: A Review. IEEE Rev. Biomed. Eng. 2009, 2, 147–171. [Google Scholar] [CrossRef] [PubMed]
  4. Poalelungi, D.G.; Neagu, A.I.; Fulga, A.; Neagu, M.; Tutunaru, D.; Nechita, A.; Fulga, I. Revolutionizing Pathology with Artificial Intelligence: Innovations in Immunohistochemistry. J. Pers. Med. 2024, 14, 693. [Google Scholar] [CrossRef]
  5. Sorlie, T.; Perou, C.M.; Tibshirani, R.; Aas, T.; Geisler, S.; Johnsen, H.; Hastie, T.; Eisen, M.B.; van de Rijn, M.; Jeffrey, S.S.; et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl. Acad. Sci. USA 2001, 98, 10869–10874. [Google Scholar] [CrossRef]
  6. Holten-Rossing, H.; Talman, M.-L.M.; Kristensson, M.; Vainer, B. Optimizing HER2 assessment in breast cancer: Application of automated image analysis. Breast Cancer Res. Treat. 2015, 152, 367–375. [Google Scholar] [CrossRef]
  7. Saha, M.; Arun, I.; Ahmed, R.; Chatterjee, S.; Chakraborty, C. HscoreNet: A Deep network for estrogen and progesterone scoring using breast IHC images. Pattern Recognit. 2020, 102, 107200. [Google Scholar] [CrossRef]
  8. Hartage, R.; Li, A.C.; Hammond, S.; Parwani, A.V. A Validation Study of Human Epidermal Growth Factor Receptor 2 Immunohistochemistry Digital Imaging Analysis and its Correlation with Human Epidermal Growth Factor Receptor 2 Fluorescence In situ Hybridization Results in Breast Carcinoma. J. Pathol. Inform. 2020, 11, 2. [Google Scholar] [CrossRef]
  9. Shafi, S.; Kellough, D.A.; Lujan, G.; Satturwar, S.; Parwani, A.V.; Li, Z. Integrating and validating automated digital imaging analysis of estrogen receptor immunohistochemistry in a fully digital workflow for clinical use. J. Pathol. Inform. 2022, 13, 100122. [Google Scholar] [CrossRef]
  10. Abele, N.; Tiemann, K.; Krech, T.; Wellmann, A.; Schaaf, C.; Länger, F.; Peters, A.; Donner, A.; Keil, F.; Daifalla, K.; et al. Noninferiority of Artificial Intelligence-Assisted Analysis of Ki-67 and Estrogen/Progesterone Receptor in Breast Cancer Routine Diagnostics. Mod. Pathol. 2023, 36, 100033. [Google Scholar] [CrossRef]
  11. Erber, R.; Frey, P.; Keil, F.; Gronewold, M.; Abele, N.; Rezner, W.; Beister, T.; Daifalla, K.; Päpper, M.; Springenberg, S.; et al. 48P An AI System for accurate Ki-67 IHC assessment in breast cancer following the IKWG whole section global scoring protocol. ESMO Open 2023, 8, 101272. [Google Scholar] [CrossRef]
  12. Kaufmann, O.; Fietze, E.; Mengs, J.; Dietel, M. Value of p63 and cytokeratin 5/6 as immunohistochemical markers for the differential diagnosis of poorly differentiated and undifferentiated carcinomas. Am. J. Clin. Pathol. 2001, 116, 823–830. [Google Scholar] [CrossRef]
  13. Gandomkar, Z.; Brennan, P.C.; Mello-Thoms, C. MuDeRN: Multi-category classification of breast histopathological image using deep residual networks. Artif. Intell. Med. 2018, 88, 14–24. [Google Scholar] [CrossRef]
  14. Sandbank, J.; Bataillon, G.; Nudelman, A.; Krasnitsky, I.; Mikulinsky, R.; Bien, L.; Thibault, L.; Shach, A.A.; Sebag, G.; Clark, D.P.; et al. Validation and real-world clinical application of an artificial intelligence algorithm for breast cancer detection in biopsies. NPJ Breast Cancer 2022, 8, 129. [Google Scholar] [CrossRef]
  15. Roszkowiak, L.; Korzynska, A.; Siemion, K.; Zak, J.; Pijanowska, D.; Bosch, R.; Lejeune, M.; Lopez, C. System for quantitative evaluation of DAB&H-stained breast cancer biopsy digital images (CHISEL). Sci. Rep. 2021, 11, 9291. [Google Scholar] [CrossRef]
  16. Belsare, A.D. Histopathological Image Analysis Using Image Processing Techniques: An Overview. Signal Image Process. Int. J. 2012, 3, 23–36. [Google Scholar] [CrossRef]
  17. Patel, S.; Fridovich-Keil, S.; Rasmussen, S.A.; Fridovich-Keil, J.L. DAB-quant: An open-source digital system for quantifying immunohistochemical staining with 3,3′-diaminobenzidine (DAB). PLoS ONE 2022, 17, e0271593. [Google Scholar] [CrossRef]
  18. Bencze, J.; Szarka, M.; Kóti, B.; Seo, W.; Hortobágyi, T.G.; Bencs, V.; Módis, L.V.; Hortobágyi, T. Comparison of Semi-Quantitative Scoring and Artificial Intelligence Aided Digital Image Analysis of Chromogenic Immunohistochemistry. Biomolecules 2022, 12, 19. [Google Scholar] [CrossRef]
  19. Sziva, R.E.; Ács, J.; Tőkés, A.-M.; Korsós-Novák, Á.; Nádasy, G.L.; Ács, N.; Horváth, P.G.; Szabó, A.; Ke, H.; Horváth, E.M.; et al. Accurate Quantitative Histomorphometric-Mathematical Image Analysis Methodology of Rodent Testicular Tissue and Its Possible Future Research Perspectives in Andrology and Reproductive Medicine. Life 2022, 12, 189. [Google Scholar] [CrossRef]
  20. Raza, S.E.A.; AbdulJabbar, K.; Jamal-Hanjani, M.; Veeriah, S.; Le Quesne, J.; Swanton, C.; Yuan, Y. Deconvolving Convolutional Neural Network for Cell Detection. In Proceedings of the 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), Venice, Italy, 8–11 April 2019; pp. 891–894. [Google Scholar] [CrossRef]
  21. Swiderska-Chadaj, Z.; Pinckaers, H.; van Rijthoven, M.; Balkenhol, M.; Melnikova, M.; Geessink, O.; Manson, Q.; Sherman, M.; Polonia, A.; Parry, J.; et al. Learning to detect lymphocytes in immunohistochemistry with deep learning. Med. Image Anal. 2019, 58, 101547. [Google Scholar] [CrossRef]
  22. Bankhead, P.; Loughrey, M.B.; Fernández, J.A.; Dombrowski, Y.; McArt, D.G.; Dunne, P.D.; McQuaid, S.; Gray, R.T.; Murray, L.J.; Coleman, H.G.; et al. QuPath: Open source software for digital pathology image analysis. Sci. Rep. 2017, 7, 16878. [Google Scholar] [CrossRef] [PubMed]
  23. Schneider, C.A.; Rasband, W.S.; Eliceiri, K.W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 2012, 9, 671–675. [Google Scholar] [CrossRef] [PubMed]
  24. Gogler-Pigłowska, A.; Klarzyńska, K.; Sojka, D.R.; Habryka, A.; Głowala-Kosińska, M.; Herok, M.; Kryj, M.; Halczok, M.; Krawczyk, Z.; Scieglinska, D. Novel role for the testis-enriched HSPA2 protein in regulating epidermal keratinocyte differentiation. J. Cell. Physiol. 2018, 233, 2629–2644. [Google Scholar] [CrossRef] [PubMed]
  25. Hewitt, S.M.; Baskin, D.G.; Frevert, C.W.; Stahl, W.L.; Rosa-Molinar, E. Controls for Immunohistochemistry. J. Histochem. Cytochem. 2014, 62, 693–697. [Google Scholar] [CrossRef] [PubMed]
  26. Khan, A.M.; Rajpoot, N.; Treanor, D.; Magee, D. A Nonlinear Mapping Approach to Stain Normalization in Digital Histopathology Images Using Image-Specific Color Deconvolution. IEEE Trans. Biomed. Eng. 2014, 61, 1729–1738. [Google Scholar] [CrossRef]
  27. Haub, P.; Meckel, T. A Model based Survey of Colour Deconvolution in Diagnostic Brightfield Microscopy: Error Estimation and Spectral Consideration. Sci. Rep. 2015, 5, 12096. [Google Scholar] [CrossRef]
  28. Ruifrok, A.C.; Johnston, D.A. Quantification of histochemical staining by color deconvolution. Anal. Quant. Cytol. Histol. 2001, 23, 291–299. [Google Scholar]
  29. Hassanein, A.S.; Mohammad, S.; Sameer, M.; Ragab, M.E. A Survey on Hough Transform, Theory, Techniques and Applications. arXiv 2015, arXiv:1502.02160. [Google Scholar] [CrossRef]
  30. Ashabi, A.; Sahibuddin, S.B.; Haghighi, M.S. The Systematic Review of K-Means Clustering Algorithm. In Proceedings of the 2020 The 9th International Conference on Networks, Communication and Computing, Tokyo, Japan, 18–20 December 2020; ACM: New York, NY, USA; pp. 13–18. [Google Scholar] [CrossRef]
  31. Vidhya, S.; Shijitha, M.R. Deep Learning based Approach for Efficient Segmentation and Classification using VGGNet 16 for Tissue Analysis to Predict Colorectal Cancer. Ann. Romanian Soc. Cell Biol. 2021, 25, 4002–4013. [Google Scholar]
  32. van der Loos, C.M. Multiple Immunoenzyme Staining: Methods and Visualizations for the Observation With Spectral Imaging. J. Histochem. Cytochem. 2008, 56, 313–328. [Google Scholar] [CrossRef]
  33. Raman, R.; Radha, R.H.; Inbamalar, T.M.; Subhahan, D.A.; Kumar, A.; Bathrinath, S.; Sarkar, S. Penetration of Deep Learning in Human Health Care and Pharmaceutical Industries; the Opportunities and Challenges. In Proceedings of the 2023 3rd International Conference on Innovative Practices in Technology and Management (ICIPTM), Nagar, India, 22–24 February 2023; pp. 1–6. [Google Scholar] [CrossRef]
Figure 1. An example of IHC images of reconstructed human epidermis stained for: (A)—FLG marker; (B)—K10 marker; (C)—Ki67 marker; (D)—HSPA2 marker. (Scale bar = 100 μm).
Figure 2. EpidermaQuant algorithm flowchart. The main steps of the pipeline include image color normalization, image color deconvolution, background detection, image rotation and crop, initial DAB-area detection step where the algorithm decides whether the image should be further analyzed, image segmentation, and calculation of the DAB percentage on tissue. (Scale bar = 100 μm).
Figure 3. Comparison of normalization (A) and color-deconvolution (B) methods with the original image, depending on the selected marker using exemplary images. (Scale bar = 100 μm).
Figure 4. Subsequent steps of background removal (A) and comparison of two image rotation methods for an exemplary image of K10 marker (B). (Scale bar = 100 μm).
Figure 5. Example of DAB image segmentation result. Interesting regions of markers of human epidermal differentiation are segmented by the k-means method. The selected cluster outlines the DAB-stained areas (red line) on the original image with additional hematoxylin staining (blue). (Scale bar = 100 μm).
Figure 6. The percentage of DAB-stained tissue. The graph shows the percentage of DAB occupancy on the slide in each sample by study marker group.
Figure 7. Exemplary results of EpidermaQuant on HPA dataset. For each analyzed marker (rows), images with the lowest and the highest estimated percentage of DAB occupancy were selected. The red line indicates the estimated area of the tissue.
Figure 8. Summary of the ablation study using the K10-marker images. All results are compared to the full pipeline. Panel (A) shows the computational time comparison for selected configurations of pipeline, while panel (B) shows the difference in the percentage of DAB occupancy.