1. Introduction
Medical care relies on diagnoses made by medically trained physicians based on their own experience. Because clinical training allows a certain degree of latitude, it does not always lead to consistent diagnostic results for the same case. Additionally, in medically underpopulated areas, specialists in each field are often absent, which biases the quality of medical care in the region. In response, online telemedicine has been introduced. This online medical care (telehealth) system was also effective during the COVID-19 epidemic [1,2,3,4]. However, online medical care still requires a physician on the other side of the network. A medical system using artificial intelligence (AI), by contrast, is also being considered to address these issues. One of the critical roles that AI can play in medicine is to raise the adequacy rate of diagnosis and make it more uniform.
AI has been introduced and put into practical use in the United States, primarily in radiological imaging [5,6]. This approach utilizes deep learning, in which the AI is trained on supervised images. Diagnostic systems using AI have also been developed for wounds. Pressure ulcers are classified into Stage 1 to Stage 4 according to the depth of tissue damage (EPUAP). In Stage 1, the damage is superficial and likely reversible. In Stage 2, damage to the epidermis and dermis is observed but does not extend to the subcutaneous fat, whereas damage that reaches the subcutaneous fat is classified as Stage 3. When it reaches deeper structures such as muscle, tendon, or bone, it is defined as Stage 4. In Stage 1 and Stage 2, no specialized wound treatment is required, as tissue damage is shallow and localized; treatment focuses mainly on improving the environmental factors that caused the wound. However, appropriate wound management, including proper debridement of necrotic tissue, is required in Stage 3 and Stage 4 cases. In particular, in Stage 4 cases, infection may have spread to the subcutaneous tissue, and immediate intervention by a wound specialist is desirable. However, there is a shortage of personnel who can make these judgments in medically underserved areas and home care settings. Therefore, the development of a wound assessment system using AI is desired. The ultimate goal is for AI to stage pressure ulcers and propose appropriate treatment methods. The usefulness of AI is low for Stage 1 or Stage 2, which do not necessarily require special treatment; for Stage 3 or Stage 4, however, it is important not to miss cases requiring urgent attention and to avoid incorrect treatment, so AI-based proposals are crucial. However, deep learning requires a large amount of image data. Thus, the difficulty of development differs significantly between radiological imaging, with its virtually unlimited supply of teacher images, and wound imaging, where sources are limited.
In the field of wounds, systems have been developed using machine learning (ML) or deep learning (DL) to evaluate necrotic tissue, granulation tissue, slough, and so on. Many of these systems use tens of thousands to hundreds of thousands of training images produced by augmentation, but the underlying base images number only a few tens to a few hundred. For example, Veredas et al. used 113 base images to create approximately 16,000 images [7], and Zahia et al. used 22 base images to create approximately 380,000 images [8]. On the other hand, Chang et al. used approximately 2800 base images without augmentation to compare five popular AI systems and constructed a system with a high degree of accuracy [9].
In building a wound diagnosis system using deep learning, many systems first perform wound segmentation, followed by wound measurement and tissue classification [7,8,10,11]. Studies using a support vector machine (SVM) as the machine learning component also perform wound segmentation and apply color correction [12,13].
Rather than building a system that determines the stage of pressure ulcers with high accuracy from small data, we aimed to develop a system that evaluates individual items for assessing the condition of pressure ulcers and combines the data obtained from each item to perform the final wound evaluation. That is, we created a simple system that does not require wound segmentation by focusing only on the evaluation of necrotic tissue in the wound.
This study used standard Japanese decubitus ulcer images collected during medical treatment at the Department of Plastic and Reconstructive Surgery, Kobe University Hospital. We constructed an image identification algorithm to identify the presence or absence of necrosis, the type of necrosis (black or white necrosis), and the depth of the wound. The constructed model was then validated by matching its identification results against the physicians’ judgments on images from cases other than those used for development.
2. Materials and Methods
2.1. Image Collection
In conducting this study, approval for the use of clinical images was obtained from the Kobe University Medical Ethics Committee. Clinical images of patients with pressure ulcers treated at the Department of Plastic and Reconstructive Surgery, Kobe University Hospital, were analyzed. The clinical images had been captured and stored as part of past routine care. Compact digital cameras were used for photography, but no single camera model was used, and the method and environment of photography were not consistent. Of the stored images, 50 were randomly selected. Except for cropping the images so that the wound area occupied approximately 50% of each image, no other processing or manipulation was performed. Of these 50 cases, 27 were used as teacher images for the development of the system, and the remaining 23 cases (24 sites) were used to validate the developed system.
2.2. Physician Determination of Necrosis
Two plastic surgeons experienced in wound care determined the presence of necrosis and wound depth from the images. First, the two surgeons were asked to evaluate the images individually. When their evaluations differed, they discussed the case and reached a consensus. However, because their evaluations agreed on all 27 images, we used these answers as the reference answers for image identification.
In pressure ulcers, necrotic tissue changes color between black and white depending on whether it is dry or moist. For color-based identification, black and white therefore form the endpoints of the color range, and both states, although far apart in color, must be recognized as necrotic tissue. Here, necrotic tissue that has dried and turned black is defined as “black necrosis”, and necrotic tissue that is relatively moist and appears white is defined as “white necrosis”.
For the presence of necrosis, only the existence of necrotic areas was an evaluation item; whether the necrosis was black, white, or in another state was not evaluated. Fibrin membranes and the biofilm of slough, which resemble white necrosis, were distinguished from necrosis.
For “wound depth”, we defined “superficial wounds” as wounds confined to the superficial dermis and “deep wounds” as wounds extending from the mid-dermis to the fat layer. Wounds replaced by granulation were classified as “deep”. When the image analysis output included any of the worst findings in the image (presence of necrosis, deep wound), that finding was used to evaluate the case.
2.3. Determining the Presence of Necrosis Using Color Pixels
Trimming was performed on the images of the 27 cases to reduce the computational complexity of the algorithm. The cropped image was taken to be a square rather than a rectangle, and the length of one side was set to 1.25 times the geometric mean of the long and short diameters of the pressure sore. This approach set the area of the pressure ulcer to approximately 50% of the image. The cropped image was then resized to 200 pixels per side (200 × 200 = 40 k pixels per image), and this image was used to identify necrosis. Both black and white necrosis exist, and both should be recognized as necrosis [14].
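The crop sizing can be checked numerically. The following is a minimal Python sketch, assuming one reading of the sizing rule in which the square’s side is 1.25 times the geometric mean of the ulcer’s long and short diameters; under that assumption a roughly elliptical ulcer covers π/6.25 ≈ 50% of the crop, matching the stated target. The function names are ours, not the paper’s.

```python
import math

def crop_side_length(long_diameter: float, short_diameter: float) -> float:
    """Side of the square crop: 1.25 x the geometric mean of the diameters
    (our interpretation of the sizing rule)."""
    return 1.25 * math.sqrt(long_diameter * short_diameter)

def ulcer_area_fraction(long_diameter: float, short_diameter: float) -> float:
    """Fraction of the crop covered by an elliptical ulcer of these diameters."""
    side = crop_side_length(long_diameter, short_diameter)
    ellipse_area = math.pi * (long_diameter / 2) * (short_diameter / 2)
    return ellipse_area / side**2
```

Under this interpretation the covered fraction is π/6.25 ≈ 0.503 regardless of the actual diameters, which is consistent with the ~50% area stated in the text.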
The color depth was set to eight bits per red, green, and blue (RGB) channel, giving 256 brightness levels (0–255) per channel per pixel. An absolute threshold value was set, and a pixel was defined as black when its R, G, and B values were all below this threshold. Separately, the average value of each of R, G, and B over the 40 k pixels constituting the image was calculated and used as a threshold. A pixel was then defined as white when its R and G values were greater than the corresponding thresholds, its R value was greater than its B value, and its G value was greater than its B value.
The percentage of black pixels in the image was then defined as black pixels (%), and similarly, the percentage of white pixels in the entire image was defined as white pixels (%). The white pixels included skin without pressure ulcers. These values were plotted on a graph with black pixels (%) on the vertical axis and white pixels (%) on the horizontal axis.
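The pixel classification described above can be sketched as follows. This is an illustrative NumPy implementation, not the authors’ code: the absolute black cutoff of 50 is a placeholder (the text sets an absolute threshold but does not state its value here), and we read the white-pixel rule as comparing R and G against their own image-wide means.

```python
import numpy as np

def necrosis_pixel_percentages(img: np.ndarray, black_threshold: int = 50):
    """img: (H, W, 3) uint8 RGB array. Returns (black %, white %).
    black_threshold = 50 is an illustrative placeholder, not the paper's value."""
    r = img[..., 0].astype(int)
    g = img[..., 1].astype(int)
    b = img[..., 2].astype(int)

    # Black pixel: R, G, and B all below the absolute threshold.
    black = (r < black_threshold) & (g < black_threshold) & (b < black_threshold)

    # White pixel: R and G above their image-wide means, and both above B.
    white = (r > r.mean()) & (g > g.mean()) & (r > b) & (g > b)

    n_pixels = img.shape[0] * img.shape[1]
    return 100.0 * black.sum() / n_pixels, 100.0 * white.sum() / n_pixels
```

A scatter plot of these two percentages per case then reproduces the black-versus-white-pixel graph used for necrosis thresholding.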
2.4. Construction of an Algorithm for Determining Wound Depth Using Images
The image was converted to 50 × 50 pixels (2.5 k pixels) using the nearest-neighbor method to remove small bumps and observe only large changes [15]. The pressure ulcer image was then shifted diagonally (at 45°) by one pixel, and the R, G, and B values of this shifted image were subtracted from those of the original image at pixels mapped to the same location. This difference was measured across the selected images, and the maximum absolute change in luminance for each of R, G, and B was taken as the luminance difference. The data of the 27 cases were placed on a scatter plot with the black pixels (%) of the 40 k pixel image calculated in Section 2.3 on the vertical axis and the R luminance difference of the 2.5 k pixel image on the horizontal axis. Threshold values on this scatter plot were contrasted with the physicians’ judgments of wound depth.
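The luminance-difference computation can be sketched as below. This is an illustrative NumPy version, assuming that simple decimation stands in for nearest-neighbor resizing and that the one-pixel diagonal shift compares each pixel with its (i+1, j+1) neighbor; the function names are ours.

```python
import numpy as np

def downsample_nearest(img: np.ndarray, factor: int = 4) -> np.ndarray:
    """Nearest-neighbor decimation, e.g. 200x200 -> 50x50 with factor=4."""
    return img[::factor, ::factor]

def luminance_differences(img: np.ndarray) -> np.ndarray:
    """Per-channel maximum absolute change under a one-pixel 45-degree shift:
    each pixel (i, j) is compared with its diagonal neighbor (i+1, j+1)."""
    a = img.astype(int)
    diff = np.abs(a[1:, 1:, :] - a[:-1, :-1, :])
    return diff.reshape(-1, 3).max(axis=0)  # [max |dR|, max |dG|, max |dB|]
```

The first element of the returned array corresponds to the R luminance difference plotted on the horizontal axis of the depth scatter plot.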
4. Discussion
As life expectancy increases with advances in social systems and the healthcare environment, it is clear that the number of chronic wounds, such as pressure ulcers, that commonly occur in the elderly will increase [16,17]. However, wound care is not always performed by experts, because specialists in wound care are few and unevenly distributed across regions. The goals of experts in wound care are to heal wounds and shorten their duration, which coincide with the objectives of telemedicine [18]. For non-expert healthcare providers, by contrast, the critical goal of treatment is to avoid incorrect treatment. Although these objectives differ in direction, they share the need for accurate wound assessment.
Wounds should be evaluated through visual examination and palpation. Of the three signs of infection, heat and tenderness require palpation, whereas redness is evaluated by visual examination. The presence of necrotic tissue and wound depth can also be evaluated by visual examination. In other words, much information can be obtained from images.
Attempts have been made to evaluate wounds with AI, but none have so far resulted in practical applications [11,19,20]. Howell et al. [20] traced wound and granulation areas on images, comparing manual tracing with AI tracing and quantitatively evaluating both. Their results showed that AI tracing almost matched the manual tracing by specialists but did not reach perfect agreement. In addition, the AI tracing was performed using existing software; because they did not develop the software themselves, there was no way to assess whether the AI or the human tracing was accurate without knowing the internals of the software. They simply used AI software and compared it with human tracing. In contrast to their research, our image identification engine first treats the judgment of a skilled specialist as ground truth and then assigns clinical meaning to measurements taken from each pixel of the image so as to fit that judgment. Iizaka et al. attempted to extract the red color range to assess the condition of pressure ulcer granulation [21]. After adjusting the color tone of the captured image using image processing software, their method extracted only the red color range using RGB filters, calculated the luminance, and correlated it with the healing process. Although this method effectively evaluates changes over time in the same patient, it requires constant color correction, and it does not allow evaluation across patients or of tissues other than granulation.
Measuring the wound area during treatment and evaluating its change over time is an essential endpoint because it supports the evaluation of treatment methods. However, wound size measurement does not require advanced technology. The benefits of automating it would be to reduce the risk of infection through non-contact measurement and to free medical personnel from a cumbersome process. Identifying the wound area separately from the healthy area is the gateway to developing a wound evaluation system using AI. Nevertheless, we evaluated the wound without tracing its area. By viewing the image as a collection of pixels and comparing the characteristics of these collections, we successfully evaluated the presence of necrotic tissue and wound depth. Because this system bases its calculation on the difference in parameters from the surrounding healthy skin, it assumes that the photograph includes a wound and that the photographer sets the area ratio of the wound in the photograph to approximately 50%. The automatic diagnosis system we are aiming for will be useful for in-home medical care and in underpopulated areas. Since we are developing the system with the expectation that the person taking the photographs in the field can determine whether a wound is present, we believe that problems at the photography stage can be resolved.
In this study, we did not trace the wound in the wound images. The first step in many previous studies was to automatically recognize the wound and trace its boundaries [20,22,23,24,25,26,27,28]. However, when the AI system is used in a wound care setting, the assumption is that a medical professional (regardless of their knowledge of or experience with wounds) takes the pictures and determines whether a wound is present. It is therefore possible to fit the wound area into approximately half the area of the image. We consider the entire image as a group of pixels containing both normal and wounded areas. By comparing the distribution width of this pixel information, the presence or absence of necrotic tissue and the depth of the wound were successfully determined. However, the extent of necrotic tissue was not measured, and only the deepest point of the wound was evaluated. In this respect, the system extracts only the worst parts of the wound. Yet the most critical aspect of wound management is not to neglect the worst parts of the wound. From this perspective, the system is necessary and sufficient.
In this study, we found that both white and black necrosis could be recognized as necrosis using our algorithm. As the present evaluation focused only on the presence or absence of necrosis, no distinction was made between black and white necrosis. In ischemic necrosis of the toes, however, black necrosis plays an important role in guiding treatment. The fact that the threshold for necrosis can be set from black and white pixels suggests that it is possible to distinguish between black and white necrosis.
This algorithm could also be effective for wounds other than pressure ulcers. To determine wound depth, a threshold value of 100 was set for the R luminance difference. This threshold was not determined by a statistical process but was set based on the physicians’ judgments and the distribution of the cases on the graph. The value was obtained from the data of 27 cases and, although not derived from a large number of cases, it was compatible with the 23 cases used for validation. The true threshold for determining wound depth probably lies in the neighborhood of 100; a more precise value is expected as the system is deployed and more cases are added.
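For illustration, the fixed cutoffs discussed in this paper can be combined into simple rules. This is a hypothetical sketch, not the paper’s exact decision procedure: the 100 luminance-difference value and the 5% black-pixel figure appear in the text, but the direction of each comparison and the functions themselves are our assumptions.

```python
def classify_depth(r_luminance_diff: float, cutoff: float = 100.0) -> str:
    """Hypothetical rule: call a wound 'deep' at or above the cutoff.
    The cutoff of 100 is from the text; the comparison direction is assumed."""
    return "deep" if r_luminance_diff >= cutoff else "shallow"

def necrosis_suspected(black_pct: float, cutoff: float = 5.0) -> bool:
    """Hypothetical rule: flag necrosis when black pixels reach ~5%.
    The 5% figure is mentioned in the text; its exact role is assumed here."""
    return black_pct >= cutoff
```

As the validation set grows, both cutoffs could be re-fitted against physician judgments rather than fixed by inspection of the scatter plot.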
According to the criteria in Section 3.2, all wounds with necrosis should have been deep, but the physicians judged the wounds in Cases 42 (2) and 50 to be shallow despite necrosis. The image identification system made the same decision. In these two cases, necrosis occupied only part of the wound, so image identification did not place the wounds in the deep zone, and the wounds were identified as shallow despite necrosis. Ultimately, the judgments of the physicians and the image identification system were in agreement.
Figure 3 shows a very deep case (blue circle 1) that has a pocket without necrosis. Thus, the depth discrimination method based on luminance differences makes it possible to identify the presence of pockets.
Sloughs could be evaluated using the 5% black-pixel threshold for continuous necrosis and depth determination. This suggests that they are distributed in areas closer in color tone to the surrounding healthy skin than white necrotic areas. A luminance difference of 60 was used to classify the depth of necrotic wounds; however, because only two such cases were included in this study, it is impossible to determine whether this classification is correct. In the validation cases, physicians judged the wounds to be shallow despite necrotic tissue when a luminance difference of 50 was used as the threshold (Cases 42 (2) and 50). In these two cases, necrotic tissue was present, but only superficially. Their similar distributions make it challenging to distinguish these superficial necroses from sloughs. However, whether slough or superficial necrosis is present, removal by debridement and a response to infection are required; even if a slough is judged to be superficial necrotic tissue, the necessary treatment details are not much different, and there were no significant differences between the groups. Similarly, the slough group was positioned between the black and white necrosis groups, but this is not considered a problem in wound management, as the field procedures are similar for the time being. By contrast, a trained physician can distinguish sloughs from necrotic tissue with the naked eye; the high water content relative to necrotic tissue is thought to be responsible for the high white luminosity. In recent years, the prevention of slough formation has been recommended in wound management. Continued accumulation of cases may allow a more detailed classification and threshold values for sloughs.
Compared to other studies using deep learning, the number of images used in this study is relatively small. However, those studies augmented the same images for training, so their actual number of base images does not differ greatly from ours. Because this study is not a clinical trial aimed at general treatment outcomes, we do not believe the sample size needs to be set by power analysis. In addition, as shown in Table 2, Table 3, Table 4, Table 6 and Table 7, the p-values from Fisher’s exact test are close to zero, indicating that these results are unlikely to have occurred by chance. Therefore, the sample size is considered sufficient to obtain these results. In the future, however, a larger sample size may make it possible to set thresholds in more detail and thereby increase sensitivity and specificity.
One potential limitation of this study is that the color information extracted from the entire cropped image may lead to erroneous identification if objects with color tones similar to necrotic tissue or granulation are present within the region, causing the cutoff value to be exceeded. Such objects could include discoloration due to erythema or hemosiderin deposition around the wound, or the coloration of garments. Nevertheless, this issue might be resolved by running our developed system after the wound region has been extracted. We intend to investigate this potential solution in future research.