1. Introduction
Image segmentation is a crucial research direction in the field of computer vision, aiming to partition images into multiple regions with similar properties or meanings for further analysis and processing [1]. With the continuous advancement of digital technology, image segmentation has demonstrated extensive application prospects in numerous fields, including medical image analysis [2], autonomous driving [3], intelligent monitoring [4], and remote sensing image processing [5]. Compared with grayscale images, color images exhibit significant advantages in visual perception, reversible information hiding, and technological applications, making them widely utilized and valued in various domains [6,7,8].
Existing image segmentation methods fall mainly into the following categories: threshold-based methods [9], color-based methods [10], edge-based methods [11], clustering-based methods [12], and region-based methods [13]. Among them, threshold-based methods are widely used because they are simple to implement and give good segmentation results. Two classic threshold segmentation algorithms are the Kapur entropy method and the Otsu method.
The Kapur entropy method is a threshold selection method based on information theory. The grayscale values of pixels are treated as random variables, so the distribution of pixel values reflects the information content of the image. The method calculates the entropy of the classes produced by a candidate threshold and selects the threshold that maximizes the sum of the entropies of the segmented foreground and background [14]. Kapur entropy can therefore serve as an objective function for optimizing image segmentation. The method is widely applied in image processing, especially in scenes that require precise segmentation [15].
The Otsu method is another widely used image segmentation algorithm [16]. Its core idea is to determine a threshold from the grayscale characteristics of the image and to divide the pixels into two sets: those that satisfy the threshold condition and those that do not. The method seeks the threshold that minimizes the intra-class variance of the two sets, which is equivalent to maximizing the variance between them. Candidate thresholds are evaluated from the grayscale distribution of the pixels, and the threshold with the smallest intra-class variance is taken as the optimal segmentation (binarization) threshold. The method performs well in image segmentation and is especially suitable for scenes that require precise differentiation between foreground and background [17].
The Kapur entropy method and the Otsu method have been widely applied in image processing after years of research and practice [18,19,20]. However, both have shortcomings. The calculation in the Kapur entropy method is relatively complex, which can lead to long processing times and low efficiency, especially for large images. In addition, for images with complex textures or noise interference, the Kapur entropy method may fail to identify the boundaries of different regions accurately, resulting in less accurate segmentation. The Otsu method may have difficulty identifying the optimal threshold when the grayscale distribution is multimodal or lacks two clearly separated peaks, resulting in inaccurate segmentation [21,22,23]. Furthermore, noise or uneven lighting in the image may also degrade the segmentation produced by the Otsu method [24].
To overcome these shortcomings, researchers have combined threshold segmentation with metaheuristic optimization algorithms to improve segmentation performance [25]. Metaheuristic optimization algorithms are general-purpose problem solvers inspired by mechanisms such as biological evolution, swarm intelligence, and natural physical and chemical processes. They offer robust global search capability, fast convergence, and relatively straightforward parameter settings, and they search iteratively for the optimal solution without requiring a strict mathematical model of the problem structure. They are therefore well suited to complex optimization problems. In recent years, numerous metaheuristic optimization algorithms have been applied to image segmentation [26]. For instance, Qingxin Liu proposed the chimpanzee-inspired remora optimization algorithm (HCROA) in 2023, which improves the optimization performance of the remora optimization algorithm and achieves promising results in color multi-threshold image segmentation [27]. In 2022, Guoyuan Ma presented an enhanced whale optimization algorithm for multi-threshold image segmentation that addresses the shortcomings of the traditional whale optimization algorithm (WOA) and improves the quality and stability of color multi-threshold segmentation results [19].
In 2023, Heming Jia et al. proposed the crayfish optimization algorithm (COA) [28]. This algorithm simulates the foraging, summer heat avoidance, and competitive behavior of crayfish. The COA searches quickly and has strong search ability, and it balances global search and local search effectively, which makes it a promising new optimization algorithm. However, it still has problems, such as insufficient convergence ability in the later stage and a tendency to become trapped in local optima. Yi Zhang et al. proposed an improved crayfish optimization algorithm, which effectively improved the optimization performance of the algorithm [29]. Nabila H. Shikoun proposed a new binary crayfish optimization algorithm (BinCOA) in 2024, which adds a crisscrossing strategy to the original COA and improves its convergence accuracy [30].
To better solve the image segmentation problem, this article proposes a hybrid adaptive crayfish optimization algorithm with differential evolution for color multi-threshold image segmentation (ACOADE). The method improves the flexibility and efficiency with which the algorithm explores the solution space by adjusting the key parameters of the crayfish optimization algorithm, especially the maximum foraging amount p, and by introducing an adaptive foraging amount adjustment strategy. To balance exploration and exploitation, the core formula of the differential evolution algorithm is incorporated, so that ACOADE balances global and local search when searching for the optimal thresholds. To verify the optimization performance of ACOADE, the IEEE CEC2020 test functions were selected as the benchmark, and ACOADE was compared with eight algorithms. The experimental results showed that ACOADE had advantages in both solution quality and convergence speed. To verify the practical effect of ACOADE in image segmentation, the classical Kapur entropy method and Otsu method were used as objective functions on the R, G, and B channels, and segmentation experiments were carried out on multiple images. The segmentation results were evaluated with the peak signal-to-noise ratio (PSNR), the feature similarity index (FSIM), and the structural similarity index (SSIM). The experimental results showed that ACOADE had advantages in objective function value, image quality measures, convergence, and robustness, which demonstrates its effectiveness and practicability for color multi-threshold image segmentation.
2. Related Work
This section describes the related work.
Table 1 shows that different researchers have addressed image segmentation with different methods and test procedures, and have improved the underlying algorithms to enhance the results. These works propose various improved versions of metaheuristic optimization algorithms and combine them with traditional thresholding methods to search for the optimal threshold. The traditional methods differ in their effects and advantages.
Table 2 compares Otsu’s method, Rényi entropy, Tsallis entropy, and Kapur’s entropy in image segmentation. In principle, Otsu’s method determines the threshold by maximizing the variance between classes, while the other three use different entropy concepts from information theory. The computational complexity of Otsu’s method is relatively low, while that of the other three methods is relatively high. The application scenarios also differ: Otsu’s method is suitable for simple bimodal images, Rényi entropy is suitable for images with a regular distribution and high accuracy requirements, Tsallis entropy is used for images with complex distributions, and Kapur’s entropy performs well on texture-rich images. Otsu’s method is average in its ability to preserve image detail, while the others are better or very good. All four are threshold segmentation methods based on gray-level information, and all are widely used.
In this paper, the Kapur entropy method and the Otsu method are used to realize threshold segmentation. Based on information theory, the Kapur entropy method determines the threshold by maximizing the sum of the entropies of foreground and background, which suits texture-rich images and retains details effectively. The Otsu method is based on the gray-level characteristics of the image and maximizes the variance between classes; it is simple to calculate and segments images with a bimodal histogram well. The ACOADE algorithm enhances randomness and exploration ability, performs well in multi-threshold image segmentation experiments, and solves the color multi-threshold image segmentation problem more effectively.
3. Image Segmentation Theory and Methods
Threshold-based image segmentation obtains optimal thresholds by some method, compares the pixels of the image against them, and thereby distinguishes the target from the background. Threshold-based methods can be divided into two categories: single-threshold segmentation and multi-threshold segmentation. Single-threshold segmentation divides the histogram into target and background with one threshold, while multi-threshold segmentation divides the image into multiple classes with several thresholds, for example by maximizing the inter-class variance between the classes. In complex images containing various objects, single-threshold methods are not as effective as multi-threshold methods. Therefore, multi-threshold image segmentation methods have been widely studied [36,37]. There are many methods for multi-threshold image segmentation, and this article mainly uses the Kapur entropy method and the Otsu method.
The main idea of both the Kapur entropy method and the Otsu method is to divide an image into multiple classes. Assume an image has $L$ grayscale levels, and let $n_i$ denote the number of pixels at the $i$th grayscale level. The total number of pixels is $Sum\_N = n_0 + n_1 + \dots + n_{L-1}$. With $K$ thresholds, the grayscale range of a given image is divided into $K + 1$ classes. The threshold set is written as $[th_1, th_2, th_3, \dots, th_K]$. The details are as follows.
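The notation above can be made concrete with a minimal Python sketch. It assumes one color channel given as a flat list of integer gray levels, and it assumes that a pixel equal to a threshold falls into the upper class; the function names are illustrative and do not come from the paper.

```python
from bisect import bisect_right


def histogram_probabilities(gray_levels, num_levels=256):
    """Return p_i = n_i / Sum_N for each gray level i in 0..num_levels-1."""
    counts = [0] * num_levels
    for g in gray_levels:
        counts[g] += 1
    total = len(gray_levels)
    return [c / total for c in counts]


def classify_pixels(gray_levels, thresholds):
    """Map each pixel to a class index in 0..K given K thresholds.

    Assumed convention: class 0 holds levels below th_1, class k holds
    levels in [th_k, th_{k+1}), and class K holds levels >= th_K.
    """
    ths = sorted(thresholds)
    return [bisect_right(ths, g) for g in gray_levels]


if __name__ == "__main__":
    channel = [0, 99, 100, 200, 255]
    print(classify_pixels(channel, [100, 200]))  # K = 2 thresholds, 3 classes
```

For a color image, the same two functions would be applied to the R, G, and B channels separately.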
3.1. Kapur Entropy Method
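The Kapur entropy method divides the histogram of an image into classes through multiple thresholds, calculates the entropy of each class, and maximizes the sum of these entropies [38]. In its standard form, the objective function is

\[
f_{\mathrm{Kapur}}(th_1, th_2, \dots, th_K) = H_0 + H_1 + \dots + H_K \tag{1}
\]

where

\[
\begin{aligned}
H_0 &= -\sum_{i=0}^{th_1-1} \frac{p_i}{\omega_0}\ln\frac{p_i}{\omega_0}, & \omega_0 &= \sum_{i=0}^{th_1-1} p_i,\\
H_1 &= -\sum_{i=th_1}^{th_2-1} \frac{p_i}{\omega_1}\ln\frac{p_i}{\omega_1}, & \omega_1 &= \sum_{i=th_1}^{th_2-1} p_i,\\
&\;\;\vdots\\
H_K &= -\sum_{i=th_K}^{L-1} \frac{p_i}{\omega_K}\ln\frac{p_i}{\omega_K}, & \omega_K &= \sum_{i=th_K}^{L-1} p_i.
\end{aligned} \tag{2}
\]

In the formula, $H_0, H_1, \dots, H_K$ represent the entropies of the different classes, $\omega_0, \omega_1, \omega_2, \dots, \omega_K$ represent the sum of probabilities in each class, and $p_i$ represents the probability of the $i$th grayscale level:

\[
p_i = \frac{n_i}{Sum\_N} \tag{3}
\]

To obtain the optimal thresholds, the objective function is maximized over the parameters $\{th_1, th_2, \dots, th_K\}$:

\[
(th_1^{*}, th_2^{*}, \dots, th_K^{*}) = \arg\max_{0 < th_1 < th_2 < \dots < th_K < L} f_{\mathrm{Kapur}}(th_1, th_2, \dots, th_K) \tag{4}
\]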
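As an illustration of how the Kapur objective is evaluated for one candidate threshold set, the following Python sketch uses the standard textbook form of Kapur entropy. It assumes `probs` is the list of gray-level probabilities $p_i$ for one channel and that class $k$ covers the levels from $th_k$ up to $th_{k+1} - 1$; the function name is hypothetical and the code is not taken from the paper.

```python
import math


def kapur_entropy(probs, thresholds):
    """Sum of class entropies H_0 + ... + H_K for the given thresholds."""
    # Class boundaries: [0, th_1), [th_1, th_2), ..., [th_K, L)
    bounds = [0] + sorted(thresholds) + [len(probs)]
    total = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        omega = sum(probs[lo:hi])  # probability mass of this class
        if omega <= 0.0:
            continue  # an empty class contributes no entropy
        total -= sum(
            (p / omega) * math.log(p / omega)
            for p in probs[lo:hi]
            if p > 0.0  # 0 * ln(0) is taken as 0
        )
    return total


if __name__ == "__main__":
    uniform = [0.25] * 4
    print(kapur_entropy(uniform, [2]))  # balanced split
    print(kapur_entropy(uniform, [1]))  # unbalanced split scores lower
```

An optimizer such as ACOADE would treat the threshold vector as a candidate solution and use this value as the fitness to maximize.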
3.2. Otsu Method
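The main idea of the Otsu method is to divide the histogram of an image into classes through multiple thresholds, calculate the inter-class variance of each class, and sum the results. According to the Otsu method, the thresholds that maximize the sum of inter-class variances give the best segmentation [39,40]. In its standard form, the objective function is

\[
f_{\mathrm{Otsu}}(th_1, th_2, \dots, th_K) = \sigma_0^2 + \sigma_1^2 + \dots + \sigma_K^2 \tag{5}
\]

where

\[
\sigma_k^2 = \omega_k\,(\mu_k - \mu_T)^2, \qquad \mu_k = \frac{1}{\omega_k}\sum_{i=th_k}^{th_{k+1}-1} i\,p_i, \qquad k = 0, 1, \dots, K, \tag{6}
\]

with $th_0 = 0$ and $th_{K+1} = L$. In the formula, $\mu_T$ represents the average grayscale level of the image:

\[
\mu_T = \sum_{i=0}^{L-1} i\,p_i \tag{7}
\]

To obtain the optimal thresholds, the objective function is maximized over the parameters $\{th_1, th_2, \dots, th_K\}$:

\[
(th_1^{*}, th_2^{*}, \dots, th_K^{*}) = \arg\max_{0 < th_1 < th_2 < \dots < th_K < L} f_{\mathrm{Otsu}}(th_1, th_2, \dots, th_K) \tag{8}
\]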
From the above formulas, it can be seen that as the number of thresholds K increases, the computational complexity increases and the thresholds must be determined more precisely. Finding better thresholds therefore becomes particularly important, and this article proposes the ACOADE algorithm to search for more precise threshold parameters.
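The Otsu objective can be sketched in the same way. The Python code below uses the standard multi-level between-class variance and adds a brute-force search for a single threshold, which is only practical for very small K; that cost growth is the reason a metaheuristic search is used for K = 15 to 30. The function names and the class-boundary convention are assumptions for illustration.

```python
def otsu_objective(probs, thresholds):
    """Sum over classes of omega_k * (mu_k - mu_T)^2."""
    mu_total = sum(i * p for i, p in enumerate(probs))  # mean gray level
    bounds = [0] + sorted(thresholds) + [len(probs)]
    total = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        omega = sum(probs[lo:hi])  # probability mass of this class
        if omega <= 0.0:
            continue  # empty classes contribute nothing
        mu = sum(i * probs[i] for i in range(lo, hi)) / omega  # class mean
        total += omega * (mu - mu_total) ** 2
    return total


def best_single_threshold(probs):
    """Exhaustive search for K = 1; returns the first best threshold."""
    return max(range(1, len(probs)), key=lambda th: otsu_objective(probs, [th]))


if __name__ == "__main__":
    uniform = [0.25] * 4
    print(best_single_threshold(uniform), otsu_objective(uniform, [2]))
```

With L gray levels and K thresholds, exhaustive search must examine on the order of L choose K threshold sets, so the search space grows rapidly with K.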
6. Experimental Results and Discussion
This section uses the IEEE CEC2020 test functions to verify the optimization performance of the ACOADE algorithm. The crayfish optimization algorithm (COA), differential evolution (DE) [43], particle swarm optimization (PSO) [44], the arithmetic optimization algorithm (AOA) [45], the whale optimization algorithm (WOA) [46], sand cat swarm optimization (SCSO) [47], grey wolf optimization (GWO) [48], and moss growth optimization (MGO) [49] were selected as comparison algorithms. ACOADE and the eight comparison algorithms all used a population size of N = 20, dimensions dim = 30 and dim = 50, and a maximum number of function evaluations of maxFES = dim × 10,000. The comparative experimental results showed that the ACOADE algorithm had better optimization performance. The parameter settings for these algorithms are shown in Table 3.
All the experiments in this paper were completed in MATLAB R2021a on a computer with an 11th Gen Intel(R) Core(TM) i7-11700 processor (base frequency 2.50 GHz), 16 GB of memory, and 64-bit Windows 11.
6.1. IEEE CEC2020 Test Function Experimental Results
Table 4 shows the statistical results obtained by running ACOADE and the eight comparison algorithms independently 30 times at dim = 30. “Best” represents the best fitness value, “mean” represents the average fitness value, “std” represents the standard deviation of the fitness values, and “rank” represents the Friedman ranking result. On F1, ACOADE achieved better fitness values than the comparison algorithms on every statistic except “best”. DE obtained the best “best” value, indicating good convergence, but its stability was insufficient. On F2, ACOADE performed poorly, and its results were not as good as those of the other comparison algorithms. On F2 and F3, MGO performed well and ranked first. ACOADE also performed well on F3, ranking only slightly behind PSO and MGO. On F4, most algorithms achieved good fitness values. On F5, ACOADE achieved good results on every statistic except “best” and outperformed the comparison algorithms overall. On F6, ACOADE achieved the best “best” value, but MGO had better overall results. On F7, DE had the better optimization performance, while ACOADE was relatively stable. On F8, ACOADE performed better than the other algorithms and achieved the best optimization results; PSO also achieved a good “best” value. On F9, ACOADE achieved good results, although the overall performance of MGO was better than that of the other comparison algorithms. On F10, ACOADE performed well and achieved better optimization results.
Figure 4 shows the ranking results of each algorithm. ACOADE achieved a better overall ranking than the other algorithms.
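To make the “rank” column concrete, the sketch below shows one common way to compute Friedman-style mean ranks: rank the algorithms on each test function (lower fitness is better, ties share the average rank) and then average the ranks over the functions. The input layout and function names are assumptions, and the paper's exact ranking procedure may differ.

```python
def rank_row(values):
    """Rank values in ascending order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; their average is (i + 1 + j) / 2
        for k in range(i, j):
            ranks[order[k]] = (i + 1 + j) / 2
        i = j
    return ranks


def friedman_mean_ranks(table):
    """table[f][a] is the score of algorithm a on function f (lower is better)."""
    rows = [rank_row(row) for row in table]
    n_alg = len(table[0])
    return [sum(r[a] for r in rows) / len(rows) for a in range(n_alg)]


if __name__ == "__main__":
    # Hypothetical mean fitness values for 3 algorithms on 2 functions
    print(friedman_mean_ranks([[1.0, 2.0, 3.0], [2.0, 1.0, 3.0]]))
```

The algorithm with the smallest mean rank is the best performer overall under this scheme.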
Table 5 shows the statistical results obtained by running ACOADE and the eight comparison algorithms independently 30 times at dim = 50. “Best” represents the best fitness value, “mean” represents the average fitness value, “std” represents the standard deviation of the fitness values, and “rank” represents the Friedman ranking result. On F1, ACOADE achieved good results; only its “best” value was slightly higher than that of MGO. On F2, F3, F8, and F9, MGO performed well, and its overall performance was superior to that of the other algorithms. ACOADE also performed well on these functions. On F4, most algorithms performed well and found good fitness values. On F5, DE had the better optimization performance, but its stability was not as good as that of ACOADE, and the two had the same Friedman ranking. On F6, ACOADE and GWO had the same ranking; ACOADE was better in “best” and “mean”, but its stability was limited. On F7, ACOADE had better overall performance than the other compared algorithms and achieved a better ranking. On F10, ACOADE ranked second, while the overall performance of DE was better.
Figure 5 shows the ranking results of each algorithm. ACOADE achieved a better overall ranking than the other algorithms. MGO also performed well, but ACOADE was superior to MGO.
6.2. Convergence Curve Analysis of IEEE CEC2020 Test Function
Figure 6 shows the convergence curves of the nine algorithms on IEEE CEC2020 at dim = 30 and dim = 50. On F1, ACOADE had a good convergence curve at dim = 30 and converged steadily toward the theoretical optimum. At dim = 50, its convergence curve was significantly better than those of the other algorithms, and it quickly reached a solution close to the theoretical optimum. The convergence curve of DE was also very good. On F2, ACOADE converged quickly to a good fitness value. GWO and MGO kept converging during the iterations and reached better fitness values, but their convergence was slow. On F3, ACOADE converged well and escaped local optima along the way. PSO and MGO behaved similarly and kept converging, although PSO did not match MGO. F4 was relatively simple, and all nine algorithms converged quickly and found excellent solutions. On F5, ACOADE obtained the best convergence curve at dim = 30, followed by DE. At dim = 50, the curves of ACOADE and DE were similar, and both kept converging to better solutions. On F6, ACOADE escaped local optima during convergence and obtained a better fitness value. MGO and GWO also converged well and obtained good fitness values. On F7, ACOADE, DE, and PSO all converged well at both dim = 30 and dim = 50. On F8, ACOADE and the COA performed well at dim = 30 and found very good solutions in the early stage. At dim = 50, ACOADE outperformed the other comparison algorithms. On F9, ACOADE escaped local optima and found better solutions. MGO also kept converging and obtained better fitness values. On F10, the convergence curves of the nine algorithms differed little, and all obtained good solutions.
6.3. Time Complexity Analysis
To compare the time complexity of the algorithms, this paper follows the time complexity evaluation standard of the IEEE CEC 2020 special session [50]. The specific steps are as follows:
(a) Measure the baseline system runtime T0 by running the test program specified in [50].
(b) Measure the complete computing time for 100,000 evaluations of the same dim-dimensional function, which is T1.
(c) Measure the complete computing time for the algorithm with 100,000 evaluations of the same dim-dimensional function, which is T2.
(d) The complexity of the algorithm is reflected by (T2 − T1)/T0.
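Steps (a) to (d) can be sketched in Python as follows. The baseline loop follows the form commonly given in CEC technical reports for measuring T0; its constants (the start value 0.55 and 1,000,000 iterations) are an assumption here and should be checked against [50]. The helper names are illustrative, and the experiments in this paper were run in MATLAB, not Python.

```python
import math
import time


def cec_baseline_loop(iterations=1_000_000):
    """Arithmetic loop used to measure the machine baseline time T0."""
    x = 0.55
    for _ in range(iterations):
        x = x + x
        x = x / 2
        x = x * x
        x = math.sqrt(x)
        # x eventually underflows to 0; MATLAB's log(0) gives -Inf, while
        # Python's math.log(0) raises, so the MATLAB behavior is mirrored here
        x = math.log(x) if x > 0.0 else -math.inf
        x = math.exp(x)  # exp(-inf) evaluates to 0.0
        x = x / (x + 2)
    return x


def timed(fn, *args):
    """Return the wall-clock seconds taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def complexity_score(t0, t1, t2):
    """Step (d): relative algorithm overhead (T2 - T1) / T0."""
    return (t2 - t1) / t0


if __name__ == "__main__":
    t0 = timed(cec_baseline_loop)
    print(f"T0 = {t0:.3f} s")
```

T1 and T2 would be measured with the same `timed` helper around 100,000 plain function evaluations and around a full algorithm run limited to 100,000 evaluations, respectively.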
Table 6 shows the (T2 − T1)/T0 results obtained by each algorithm on the IEEE CEC2020 test functions at dim = 30. From the table, it can be concluded that the time complexity of SCSO was relatively high. The time complexity of ACOADE, the COA, and DE was similar. PSO, AOA, and GWO had lower time complexity.
7. Multi-Threshold Image Segmentation Experiment
This section uses ACOADE and the eight comparison algorithms for image segmentation to verify the optimization performance of ACOADE. Twelve images of size 321 × 481 × 3 were selected from the Berkeley Segmentation Dataset and Benchmark 500 (BSDS500), which can be found at https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/ (accessed on 31 March 2025). The original images and their RGB histograms are shown in Figure 7, where the red, green, and blue curves represent the R, G, and B channels of the three primary colors. For ease of comparison, all algorithms used a population size of N = 20; the number of thresholds K, which was the algorithm dimension dim, was set to 15, 20, 25, and 30; and the maximum number of function evaluations was maxFES = dim × 3000.
7.1. Statistical Results and Analysis of Image Segmentation
Table A1 presents the average fitness values obtained over 30 independent runs with the Kapur entropy method as the fitness function, while Table A2 shows the average fitness values obtained over 30 independent runs with the Otsu method as the fitness function. Bold values mark the best result among the compared algorithms. Overall, ACOADE performed well and obtained the best average fitness value for most images, with exceptions such as image 10 at K = 20 and K = 25. Other algorithms had advantages in specific cases. For images 2, 4, and 6 at K = 15, ACOADE did not attain the best average fitness value, so other algorithms outperformed it in these cases. For image 3, the COA obtained the best average fitness values at K = 15 and K = 20. For image 5, GWO achieved the best average fitness value at K = 30. For images 8 and 10, the COA obtained the best average fitness values at K = 15, K = 20, and K = 25.
Taken together, the two tables show that ACOADE achieved better fitness values for most images, which demonstrates its relatively better optimization performance, while other algorithms such as the COA and GWO were best for specific images and threshold settings.
7.2. Analysis of Image Segmentation Convergence Curve
Figure A1 shows the image segmentation convergence curves obtained by ACOADE and the eight comparison algorithms using the Kapur entropy method. Figure A2 shows the corresponding convergence curves using the Otsu method. These two figures indicate that, for all four threshold counts, ACOADE reached a very good fitness value in the early stage. The COA, WOA, GWO, and PSO algorithms also converged well, but overall not as well as ACOADE. The DE, AOA, and SCSO algorithms converged with difficulty and ended with inferior fitness values. Overall, ACOADE had the better optimization performance.
7.3. Analysis of ACOADE Image Segmentation Results
Figure A3 shows the image segmentation results obtained by ACOADE using the Kapur entropy method. Figure A4 shows the image segmentation results obtained by ACOADE using the Otsu method. It can be seen that the images segmented by ACOADE preserve the content of the original images well, and that the segmentation results improve as the number of thresholds increases.
7.4. Results and Analysis of Wilcoxon Rank Sum Test
Table 7 and Table 8 give the Wilcoxon rank sum test results obtained by using the Kapur entropy method and the Otsu method with ACOADE and the eight comparison algorithms at K = 30. The Wilcoxon rank sum test was used to detect differences between algorithms. In the tables, p is the p-value of the test, h = 1 indicates a significant difference between algorithms, and h = 0 indicates no significant difference. From the two tables, it can be concluded that ACOADE differed significantly from the eight comparison algorithms, with only some images showing no significant difference. ACOADE also differed significantly from the original COA, indicating a significant improvement.
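The p and h values can be illustrated with a minimal Python sketch of the two-sided Wilcoxon rank sum test. It uses the normal approximation with a tie correction and no continuity correction, and it assumes a significance level of 0.05; statistical packages may instead use an exact method or a continuity correction, so small numerical differences from the tabulated values are expected. The function name is illustrative.

```python
import math


def rank_sum_test(x, y, alpha=0.05):
    """Return (p, h) for a two-sided rank sum test of samples x and y."""
    combined = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2  # average rank of a tie group
        t = j - i
        tie_term += t ** 3 - t
        i = j
    w = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)  # rank sum of x
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:
        return 1.0, 0  # all values identical: no evidence of a difference
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return p, int(p < alpha)


if __name__ == "__main__":
    # Hypothetical fitness values from 30 runs of two algorithms
    runs_a = [float(v) for v in range(30)]
    runs_b = [float(v) for v in range(100, 130)]
    print(rank_sum_test(runs_a, runs_b))
```

Here each sample would hold the 30 final fitness values of one algorithm on one image, with ACOADE as one of the two samples.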
7.5. Segmentation Effect Evaluation Index
The average fitness values obtained with the Kapur entropy method and the Otsu method serve as one index of segmentation performance. In addition, this part uses three image quality indicators: the feature similarity index measure (FSIM), the peak signal-to-noise ratio (PSNR), and the structural similarity index measure (SSIM) [51,52,53]. Details are as follows.
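FSIM measures the quality of the segmented image by comparing the similarity of features between the original image and the segmented image. In its standard form, it is defined as

\[
FSIM = \frac{\sum_{x \in \Omega} S_L(x)\,PC_m(x)}{\sum_{x \in \Omega} PC_m(x)}
\]

where $\Omega$ represents the whole image pixel domain, $S_L(x)$ is the similarity index, and $PC_m(x)$ is the phase congruency feature, which is defined as

\[
PC_m(x) = \max\big(PC_1(x),\, PC_2(x)\big)
\]

Here $PC_1(x)$ and $PC_2(x)$ represent the phase congruency of the two images being compared. The similarity index is

\[
S_L(x) = [S_{PC}(x)]^{\alpha}\,[S_G(x)]^{\beta}, \qquad
S_{PC}(x) = \frac{2\,PC_1(x)\,PC_2(x) + T_1}{PC_1^2(x) + PC_2^2(x) + T_1}, \qquad
S_G(x) = \frac{2\,G_1(x)\,G_2(x) + T_2}{G_1^2(x) + G_2^2(x) + T_2}
\]

where $S_{PC}(x)$ represents the similarity measure of phase congruency; $S_G(x)$ represents the similarity between the gradient magnitudes $G_1(x)$ and $G_2(x)$; and $\alpha$, $\beta$, $T_1$, and $T_2$ are constants.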
PSNR is an indicator used to measure the similarity between the segmented image and the original image. For 8-bit images, it is defined in its standard form as

\[
PSNR = 10 \log_{10}\!\left(\frac{255^2}{MSE}\right), \qquad
MSE = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\big(I(i,j) - K(i,j)\big)^2
\]

where MSE represents the mean square error, $I(i,j)$ represents the grayscale level in the $i$th row and $j$th column of the original image, $K(i,j)$ represents the grayscale level in the $i$th row and $j$th column of the segmented image, and $M$ and $N$ are the numbers of rows and columns in the image matrix.
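The PSNR definition translates directly into code. The Python sketch below assumes 8-bit images stored as nested lists of rows with a peak value of 255, and it returns infinity when the two images are identical; the function names are illustrative.

```python
import math


def mse(original, segmented):
    """Mean square error between two M x N gray-level images."""
    rows, cols = len(original), len(original[0])
    total = sum(
        (original[i][j] - segmented[i][j]) ** 2
        for i in range(rows)
        for j in range(cols)
    )
    return total / (rows * cols)


def psnr(original, segmented, peak=255):
    """Peak signal-to-noise ratio in dB; higher means closer images."""
    err = mse(original, segmented)
    if err == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    img = [[10, 20], [30, 40]]
    seg = [[12, 20], [30, 36]]
    print(round(psnr(img, seg), 2))
```

For a color image, one common choice is to compute the value per channel and average the three results, although the paper does not state which convention it uses.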
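SSIM compares the similarity between two images in terms of luminance, contrast, and structure. In its standard form, it is defined as

\[
SSIM(x, y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}
\]

where $x$ and $y$ represent the two images, $\mu_x$ and $\mu_y$ represent the average values of $x$ and $y$, $\sigma_x^2$ and $\sigma_y^2$ represent the variances of $x$ and $y$, $\sigma_{xy}$ is the covariance of $x$ and $y$, and $c_1$ and $c_2$ are constants.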
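The SSIM formula can be sketched for a single window as follows. This Python example treats each whole image as one window given as a flat list of gray levels, and it assumes the commonly used constants c1 = (0.01 × 255)^2 and c2 = (0.03 × 255)^2. Practical implementations usually average SSIM over local sliding windows, so this simplification is for illustration only.

```python
def ssim_global(x, y, peak=255, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized flat pixel lists."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Sample variances and covariance (divide by n - 1)
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    c1 = (k1 * peak) ** 2  # stabilizes the luminance term
    c2 = (k2 * peak) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    original = [10, 50, 90, 130, 170, 210]
    segmented = [30, 30, 110, 110, 190, 190]  # 3-level quantization
    print(round(ssim_global(original, segmented), 4))
```

A value of 1 indicates identical images, and values decrease as luminance, contrast, or structure diverge.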
7.6. Statistical Results and Analysis of Each Algorithm
Table A3 and Table A6 give the FSIM data, Table A4 and Table A7 give the PSNR data, and Table A5 and Table A8 give the SSIM data. The tables report the average FSIM, PSNR, and SSIM values, computed with the above formulas, for images segmented with the Kapur entropy method and the Otsu method. Bold marks the best value among the compared algorithms, and the last row gives the proportion of best values obtained by each algorithm. From the six tables, it can be concluded that ACOADE outperformed the other algorithms, obtaining the most best values. Table A3, Table A4 and Table A5 are based on the Kapur entropy method, and ACOADE was the best in all three. PSO took second place with a relatively high proportion of best values, indicating that it achieved good FSIM, PSNR, and SSIM values with the Kapur entropy method. Although not as good as ACOADE, PSO could still be a viable option when ACOADE is not applicable. Table A6, Table A7 and Table A8 are based on the Otsu method, and ACOADE was again the best. SCSO ranked second, with a higher proportion of best values than every algorithm except ACOADE, and obtained relatively high FSIM, PSNR, and SSIM values with the Otsu method. In general, ACOADE achieved the best FSIM, PSNR, and SSIM with both the Kapur entropy method and the Otsu method. PSO obtained good FSIM, PSNR, and SSIM values with the Kapur entropy method but poor values with the Otsu method, indicating that PSO was more suitable for segmentation with the Kapur entropy method. Conversely, SCSO obtained good values with the Otsu method but poor values with the Kapur entropy method, suggesting that SCSO performed better with the Otsu method.
Figure A5, Figure A6, Figure A7, Figure A8, Figure A9 and Figure A10 are line graphs of the FSIM, PSNR, and SSIM values of the compared algorithms. These figures show that ACOADE consistently attained the highest values, indicating that its results were excellent and its segmentation quality was good.