Experiments on simulated, synthetic, and remote sensing images were conducted to evaluate the performance of the proposed HSMM algorithm. Fuzzy C-means (FCM) [7], GMM [15], and SMM [20] were also implemented for comparative analysis. All algorithms were implemented and run in Matlab on an Intel Core i7 computer. The constants and parameters of the proposed algorithm were set as follows: the number of components K is set by visual interpretation to the number of classes in the image; the number of distributions M is set to 2, which is sufficient to describe the statistical characteristics of remote sensing images; for the initial parameter values, β is set to 0.1, μ and Σ are randomly generated from Gaussian distributions, w and r are randomly generated in the range [0, 1], and v is set to 1. These parameters are then updated iteratively.
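The initialization described above can be sketched as follows for a grayscale image. This is only an illustrative sketch: the function name `init_hsmm_params`, the array shapes, the mean and spread of the Gaussians used to draw μ and Σ, and the use of an absolute value to keep the scale positive are all assumptions, since the text does not specify them.

```python
import random

def init_hsmm_params(K, M=2, beta=0.1, v=1.0, seed=0):
    """Draw initial values for K components, each made of M distributions."""
    rng = random.Random(seed)
    eps = 1e-6  # assumption: keeps the initial scale strictly positive
    return {
        "beta": beta,
        # mu and sigma are drawn from Gaussians; the means/spreads here are assumed.
        "mu": [[rng.gauss(128.0, 40.0) for _ in range(M)] for _ in range(K)],
        "sigma": [[abs(rng.gauss(0.0, 20.0)) + eps for _ in range(M)] for _ in range(K)],
        # w and r are drawn uniformly from [0, 1); their shapes are assumed.
        "w": [[rng.random() for _ in range(M)] for _ in range(K)],
        "r": [rng.random() for _ in range(K)],
        # Degrees of freedom start at 1 for every distribution.
        "v": [[v for _ in range(M)] for _ in range(K)],
    }

params = init_hsmm_params(K=3)
```

Fixing the seed makes the random initialization reproducible, which is convenient when comparing runs with different step lengths.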
3.1. Simulated Grayscale Image Segmentation
Figure 3a shows the template image with three homogeneous regions (1–3 are the labels of regions), and the simulated images generated using the parameters listed in
Table 1 are shown in
Figure 3b. Two sets of parameters were used in generating random values as pixel intensities for the three regions to simulate the spectral heterogeneity of remote sensing images. The FCM, GMM, SMM and HSMM algorithms were then used to segment the image, and the results are shown in
Figure 3c–f.
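One way to generate such a simulated image is sketched below. It assumes that each region mixes two Gaussian parameter sets; the distribution family, the weights, and all numeric values in `REGION_PARAMS` are hypothetical placeholders, since the actual values are those listed in Table 1.

```python
import random

# Hypothetical (mean, std, weight) pairs per region label; not the values of Table 1.
REGION_PARAMS = {
    1: [(60.0, 8.0, 0.7), (80.0, 12.0, 0.3)],
    2: [(120.0, 10.0, 0.6), (140.0, 6.0, 0.4)],
    3: [(180.0, 7.0, 0.5), (220.0, 9.0, 0.5)],
}

def simulate_image(template, region_params, seed=0):
    """Replace each region label in the template by a random intensity in [0, 255]."""
    rng = random.Random(seed)
    image = []
    for row in template:
        out_row = []
        for label in row:
            sets = region_params[label]
            weights = [s[2] for s in sets]
            # Pick one of the region's parameter sets, then draw the intensity.
            mean, std, _ = rng.choices(sets, weights=weights, k=1)[0]
            value = round(rng.gauss(mean, std))
            out_row.append(min(255, max(0, value)))
        image.append(out_row)
    return image

# A tiny template with three regions (labels 1-3).
template = [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 3, 3]]
image = simulate_image(template, REGION_PARAMS)
```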
In
Figure 3c,d, FCM and GMM algorithms were able to segment each region, but some pixels were incorrectly segmented. The SMM algorithm (see
Figure 3e) generated worse results than the FCM and GMM algorithms, with more pixels wrongly segmented, especially in region 2. In
Figure 3f, the proposed HSMM algorithm was able to segment the regions accurately, with fewer pixels segmented erroneously.
Table 2 summarizes the accuracy results (i.e., product accuracy, overall accuracy and kappa coefficient) of the different segmentation algorithms for the simulated image (
Figure 3). While the FCM and GMM algorithms produced good segmentation results, with product accuracies above 95% in all three regions, some pixels were still incorrectly segmented. The SMM algorithm had the lowest product accuracy because of its incorrect segmentation in region 2. The proposed HSMM algorithm segmented each region accurately, with product accuracies above 97% in all regions. The HSMM algorithm also had the highest overall accuracy (98.92%), which was 1.11%, 1.31%, and 19.46% higher than those of the FCM, GMM, and SMM algorithms, respectively; its kappa coefficient was 0.02, 0.02, and 0.27 higher, respectively.
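For reference, the three accuracy measures can be computed from a confusion matrix as sketched below. The sketch assumes that `confusion[i][j]` counts the pixels of reference region i assigned to region j, and that the per-region (product) accuracy is the fraction of reference pixels of a region that are labeled correctly; the function name and the example matrix are illustrative only.

```python
def accuracy_metrics(confusion):
    """Return (per-region accuracies, overall accuracy, kappa coefficient)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_totals = [sum(row) for row in confusion]  # reference pixels per region
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    per_region = [confusion[i][i] / row_totals[i] for i in range(k)]
    overall = sum(confusion[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (overall - expected) / (1.0 - expected)
    return per_region, overall, kappa

# Illustrative two-region example (not data from Table 2).
per_region, overall, kappa = accuracy_metrics([[45, 5], [10, 40]])
```

Unlike the overall accuracy, the kappa coefficient discounts the agreement that would be expected by chance, which is why the two measures can rank algorithms slightly differently.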
To test the modeling performance of the proposed algorithm,
Figure 4a–c shows the histograms and the fitting curves of the simulated image using the proposed HSMM, where the black areas are the histograms of three regions and the red curves show the fitting curves of the HSMM algorithm. In
Figure 4a,b, the histograms of regions 1 and 2 were asymmetric, while the histogram of region 3 was asymmetric and multimodal in
Figure 4c. As shown in the figures, the components of the HSMM accurately fit these complex histograms. Each component is defined as a weighted combination of two Student’s t-distributions, which makes it flexible enough to fit the asymmetric histograms of regions 1 and 2 and the multimodal histogram of region 3. The proposed HSMM can therefore accurately build the statistical model of the simulated image.
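To illustrate why a component built from two weighted Student’s t-distributions can follow asymmetric or multimodal histograms, the sketch below evaluates such a density in one dimension. The location-scale form of the density is standard; the function names, the use of a scalar scale `sigma` in place of Σ, and the example parameter values are assumptions made for illustration.

```python
import math

def student_t_pdf(x, mu, sigma, v):
    """Location-scale Student's t density with v degrees of freedom."""
    z = (x - mu) / sigma
    log_norm = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
                - 0.5 * math.log(v * math.pi) - math.log(sigma))
    return math.exp(log_norm - 0.5 * (v + 1) * math.log1p(z * z / v))

def component_pdf(x, weights, mus, sigmas, vs):
    """Weighted sum of Student's t densities forming one component."""
    return sum(w * student_t_pdf(x, m, s, v)
               for w, m, s, v in zip(weights, mus, sigmas, vs))

# Hypothetical two-distribution component with well-separated locations.
weights, mus, sigmas, vs = [0.4, 0.6], [60.0, 90.0], [5.0, 5.0], [1.0, 1.0]
density_at_modes = [component_pdf(x, weights, mus, sigmas, vs) for x in (60.0, 90.0)]
density_between = component_pdf(75.0, weights, mus, sigmas, vs)
```

With these values the density is lower between the two locations than at either of them, so the component is bimodal; moving the locations closer together and using unequal weights instead yields a single skewed peak.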
According to the principle of gradient-based optimization, the step length affects the convergence of the log-likelihood function. To analyse this influence, the log-likelihood functions obtained with different step lengths are shown in Figure 5, where the horizontal axis is the iteration count, the vertical axis is the log-likelihood value, and each curve corresponds to one step length. As the number of iterations increased, the change in each curve became progressively smaller; by the 50th iteration, the functions remained almost unchanged. The functions also converged to different values, and the largest value was obtained with a step length of 10⁻⁶. Based on the maximum log-likelihood criterion, a step length of 10⁻⁶ was therefore selected as the empirical parameter for the subsequent segmentation experiments.
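The selection procedure can be sketched on a toy problem. The example below is not the HSMM log-likelihood: it uses a simple Gaussian-mean log-likelihood (up to a constant) as a stand-in, and the candidate step lengths and data are hypothetical. It only illustrates the criterion of running a fixed number of iterations per step length and keeping the step length with the largest final log-likelihood.

```python
def log_likelihood(mu, data):
    """Toy stand-in: Gaussian log-likelihood in mu, up to an additive constant."""
    return -0.5 * sum((x - mu) ** 2 for x in data)

def gradient(mu, data):
    """Derivative of the toy log-likelihood with respect to mu."""
    return sum(x - mu for x in data)

def run_gradient_ascent(step, data, iterations=50, mu0=0.0):
    """Return the log-likelihood after each iteration for one step length."""
    mu = mu0
    history = []
    for _ in range(iterations):
        mu += step * gradient(mu, data)
        history.append(log_likelihood(mu, data))
    return history

data = [1.0, 2.0, 3.0, 4.0]
candidate_steps = [1e-1, 1e-2, 1e-3]  # hypothetical candidates for the toy problem
final_values = {s: run_gradient_ascent(s, data)[-1] for s in candidate_steps}
best_step = max(final_values, key=final_values.get)
```

Because the suitable step length depends on the scale of the gradient, the value that works for this toy problem says nothing about the value appropriate for the HSMM.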
3.2. Synthetic Multispectral Image Segmentation
Figure 6 shows a synthetic multispectral image and its segmentation results.
Figure 6a shows the template image having four homogeneous regions. In
Figure 6b, the image was synthesized by cropping four different regions (forest land, water, bare land, and grassland) from a multispectral image; 2% salt-and-pepper noise was then randomly added to the synthetic image.
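Adding 2% salt-and-pepper noise can be sketched as follows for a single band. The sketch assumes 8-bit intensities, that exactly 2% of the pixel positions are corrupted, and that salt (255) and pepper (0) are equally likely; for a multispectral image the same function could be applied per band, which is also an assumption.

```python
import random

def add_salt_and_pepper(image, fraction=0.02, seed=0):
    """Return a copy of the image with a fraction of pixels set to 0 or 255."""
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])
    noisy = [row[:] for row in image]  # leave the input image untouched
    n_noisy = round(fraction * rows * cols)
    # Sample distinct pixel positions so exactly n_noisy pixels are corrupted.
    for index in rng.sample(range(rows * cols), n_noisy):
        noisy[index // cols][index % cols] = rng.choice((0, 255))
    return noisy

clean = [[128] * 100 for _ in range(100)]
noisy = add_salt_and_pepper(clean)
```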
In the segmentation results, the FCM, GMM, and SMM algorithms produced many segmentation errors. The FCM algorithm generated errors particularly in region 2 (see
Figure 6c). The GMM algorithm was unable to segment regions 2 and 3 (see
Figure 6d). The SMM algorithm was able to segment the regions but had incomplete segmentation and erroneous results in each region (see
Figure 6e). In
Figure 6f, the HSMM algorithm segmented the regions accurately, with almost no incorrectly segmented pixels. This improvement arises because the proposed algorithm combines the HSMM with the correlation of local pixels, making reasonable use of both spectral and spatial information. Hence, the proposed algorithm was robust to noise and obtained the best result.
Table 3 lists the segmentation accuracies (i.e., product accuracy, overall accuracy and kappa coefficient) of the various segmentation algorithms for the synthetic image (
Figure 6). The FCM algorithm had a product accuracy below 80% in one region because of its incorrect segmentation of region 2. The GMM algorithm was unable to segment region 3, giving a product accuracy of 0% for that region. The SMM algorithm segmented all regions, with product accuracies above 91%. The HSMM algorithm segmented each region accurately, with a product accuracy of 100% in all regions. Its overall accuracy was 10.39%, 29.01%, and 5.27% higher than those of the other algorithms, and its kappa coefficient was 0.14, 0.36, and 0.09 higher, respectively.
3.3. Remote Sensing Image Segmentation
Various high-resolution panchromatic images were segmented (see
Figure 7) to verify the effectiveness of the proposed HSMM algorithm.
Figure 7a,g,m shows Cartosat1 images of farmland, bare land and buildings with a 2.5 m spatial resolution; the numbers of object regions were 3, 4, and 4, respectively. The standard segmentations (see Figure 7b,h,n) were produced by visual interpretation for quantitative evaluation, with the regions labeled 1–4. In
Figure 7c–f, the FCM and GMM algorithms were unable to segment region 1 and there were some wrongly segmented pixels in each region; the SMM algorithm could not segment each region, and its result was poor; the proposed algorithm better segmented each region, and there were a few incorrectly segmented pixels. In
Figure 7i–l, the regions 1 and 3 were better segmented using the FCM and GMM algorithms, and the wrongly segmented pixels were mainly in region 2; the SMM was unable to segment regions 1 and 2, and there were many wrongly segmented pixels in regions 3 and 4; the proposed algorithm more accurately segmented each region and there were some wrong pixels in region 2. In
Figure 7o–r, there were some incorrect pixels in the results of the FCM and GMM algorithms, and region 3 was segmented worse; the SMM algorithm was unable to segment regions 2 and 3; the result of the proposed algorithm was better than other algorithms, while there were some incorrect pixels in region 2. While the FCM and GMM algorithms considered the spatial correlation of local pixels, they were susceptible to spectral heterogeneity and obtained poor results, such as region 2 in
Figure 7i,j. The SMM algorithm was unable to accurately segment each region and there was either over-segmentation or under-segmentation for farmland, such as region 1 of
Figure 7e and region 2 of
Figure 7q. By building on the spatially constrained HSMM, the proposed algorithm made reasonable use of the spectral information, reduced the effects of spectral heterogeneity, and obtained better results.
Figure 8a,g,m shows different high-resolution multispectral images with 2% salt and pepper noise to test the robustness of the proposed algorithm.
Figure 8a is a 0.5 m spatial resolution image from GeoEye1 and includes buildings, an athletic track, and an area of artificial grass.
Figure 8g is a 0.8 m spatial resolution image from Ikonos and includes buildings, farmland, and bare land.
Figure 8m is a 0.5 m spatial resolution image from Worldview2 and includes buildings, roads, grassland and trees. The numbers of object regions in these images were 3, 4, and 4, respectively. The standard segmentations were generated by visual interpretation, and the regions were labeled 1–4. In Figure 8c–f, the FCM and SMM algorithms were affected by noise, and some pixels in region 1 were wrongly segmented; there were fewer wrongly segmented pixels in
Figure 8d using the GMM algorithm; a few pixels were wrongly segmented using the proposed algorithm, and the region was segmented accurately. The FCM, GMM, and SMM algorithms had difficulty segmenting region 2 with texture in
Figure 8i–k, while the proposed algorithm introduced the spatial correlation to better segment each region in
Figure 8l, especially texture region 2. The three comparative algorithms were unable to segment the road and grassland in
Figure 8o–q, and shadow pixels were wrongly segmented. In contrast, the proposed HSMM algorithm was able to accurately segment road and grassland, and obtain a better result in
Figure 8r, although a few shadow pixels were erroneously segmented. The proposed algorithm used the spatially constrained HSMM to obtain the best segmentation. It was more robust to noise owing to the correlation of local pixels, and it avoided inaccurate segmentation of textured regions. Visually, the proposed HSMM algorithm generated better results than the other algorithms.
Figure 9 shows a large-scale, high-resolution panchromatic image and its segmentation results. In
Figure 9a, there is farmland, rest-arable land, bare land and a residential area, labeled as 1–4. The spectral heterogeneity is obvious in the image, especially in the farmland. There is spectral similarity between farmland and rest-arable land. In
Figure 9c, the FCM algorithm was unable to segment each region. For example, many parts of region 1 were wrongly segmented into region 2, and it failed to segment regions 3 and 4. In
Figure 9d, the GMM algorithm wrongly segmented parts of region 1 into region 3 and also failed to segment regions 3 and 4. The SMM algorithm obtained a better result than the FCM and GMM algorithms in Figure 9e, although some pixels in region 1 were still wrongly segmented. The proposed algorithm segmented each region more accurately, with fewer wrongly segmented pixels in region 1. Overall, the proposed algorithm produced the best result.
Table 4 summarizes the overall segmentation accuracies for the remote-sensing images (
Figure 7,
Figure 8 and
Figure 9). The SMM algorithm had the lowest segmentation accuracy for the panchromatic images. The accuracy of the GMM algorithm was the lowest for the multispectral images, except for Figure 8a. The FCM algorithm had the lowest accuracy for Figure 9a. Compared with the traditional segmentation algorithms, the proposed algorithm had better overall accuracy, with values greater than 87%. Moreover, the average accuracy of the proposed algorithm was 26.08%, 25.73%, and 27.29% higher than those of the FCM, GMM, and SMM algorithms, respectively. Hence, the proposed algorithm obtained the best results for the remote sensing images.
Table 5 lists the segmentation times of the different algorithms to test the efficiency of the proposed HSMM algorithm. For remote sensing images in
Figure 7 and
Figure 8 with 256 × 256 pixels, the segmentation times of the FCM, GMM, SMM, and HSMM algorithms were 15 to 22 s, 14 to 22 s, 33 to 44 s, and 28 to 36 s, respectively. Figure 9a has 512 × 512 pixels, and its segmentation time was significantly longer than that of the other images for every algorithm. Hence, the larger the image, the longer these algorithms took to segment it. On average, the FCM and GMM algorithms were the fastest, the SMM algorithm was the slowest, and the proposed algorithm was faster than the SMM algorithm. The FCM and GMM algorithms derive explicit update formulas for their parameters by minimizing an objective function and by maximum likelihood estimation, respectively, so they obtained segmentation results more efficiently. The SMM algorithm optimized its parameters using a gradient descent method and took much longer to converge. The proposed algorithm combined maximum likelihood estimation with a gradient optimization method to solve for the parameters of a simplified segmentation model. Hence, it was 9.2654 s faster than the SMM algorithm on average.
Figure 10 presents the segmentation results of 10 remote sensing images provided by the proposed algorithm and the Hierarchical GMM (HGMM)-based parameter optimization image segmentation algorithm (called HGMM algorithm). In
Figure 10, images 1 and 2 are from EROS, images 3 and 4 from Worldview1, images 5 and 6 from SPOT5, images 7 and 8 from Cartosat1, and images 9 and 10 from Pleiades1. For image 1, the HGMM algorithm obtained a better result, while the proposed algorithm was affected by shadow and produced some wrongly segmented pixels. For image 2, the HGMM algorithm was unable to segment the right region, whereas the proposed algorithm segmented the image better. For images 7 and 8, the upper regions were incorrectly segmented by the HGMM algorithm, whereas the proposed algorithm segmented each region better and avoided the effect of spectral heterogeneity. For the other images, both algorithms produced accurate segmentations, although the results of the proposed algorithm were better in detail, such as in the lower regions of images 3 and 5. Hence, the proposed algorithm is superior to the HGMM algorithm in segmentation performance.
Table 6 lists the segmentation times of the HGMM and HSMM algorithms for the 10 remote sensing images. The times of the HGMM and HSMM algorithms were in the ranges of 1.66–2.32 s and 2.28–2.97 s, respectively. They were affected by the number of classes: the greater the number of classes, the longer the segmentation time. Accordingly, images 2, 3, and 4 took more time to segment than the other images. Because the Student’s t-distribution has a more complex structure and more parameters than the Gaussian distribution, the proposed algorithm took more time than the HGMM algorithm, 0.6693 s more on average.