1. Introduction
Plastic greenhouses (PGs) have been widely built for decades [1]; consequently, pixel-based indexes [2,3,4], supervised classification [5,6,7,8,9] or semantic segmentation [10,11,12,13,14], window-based detection [10,13,15,16], and object-based analysis [17,18,19,20,21,22,23,24,25] have been proposed to extract the location, boundary, or number of PGs. Generally, the three classification units (pixel, window, and object) have their own advantages and disadvantages at different image resolutions or scales, and object-based analysis of PGs is still a significant approach.
In China, PGs with walls are generally close to a quarter-cylindroid in shape, and PGs without walls are generally close to a semi-cylindroid [26]; the film covering the PGs therefore interacts with sunlight and scattered skylight at various angles. As a result, the digital number (DN) values of pixels belonging to the same PG often show strong heterogeneity in high-resolution remote sensing images [27,28], which increases the difficulty of segmenting the PG pixels in a Gaofen-2 (GF-2) fusion image [29].
Although various segmentation methods have been proposed in recent years, multi-resolution segmentation (MRS) [30] still plays a vital role [31] as a data preprocessing tool for machine learning methods such as nearest neighbor [17,32], decision tree [18], support vector machine [17,25], and random forest [17,23,24,25]. Moreover, the Estimation of Scale Parameter 2 (ESP 2) tool [33], which was proposed to calculate the optimal scale parameter (OSP) on multiple layers, has improved the practicability of MRS. Furthermore, mean shift and MRS are also utilized as preprocessing steps for deep learning applications [34,35]. Nevertheless, there are still several problems with obtaining and evaluating effective PG segments (EPGSs).
First, not every segment containing PG pixels needs to be evaluated; however, the criteria for effective segments in previous studies [29,36] are rather permissive, whereas guaranteeing that EPGSs contain enough PG pixels to be representative is critical to later classification.
Second, the efficiency of obtaining representative EPGSs for each experimental group is not high enough. Most previous studies evaluate spot-check samples of EPGSs [18,20,21,22], which takes less time, but a small number of EPGSs cannot represent the overall quality. Yao et al. used full samples that were manually selected, but such more convincing comparative trials require significantly more time [29].
Third, atmospheric correction has the potential to improve the quality of images; thus, the effects of atmospheric correction on EPGSs of Worldview-3 have been evaluated by local sampling [22] via modified Euclidean distance 2 (ED2) [21]. However, as with the original system of ED2 [37], modified ED2 also ignores the difference between the attributes of the quantity-based indicator and those of the area-based indicator, and gives the larger indicator a bigger weight than the smaller one in calculation [29]. In order to reduce the uncertainty caused by the sampling process and dimensional difference in evaluation, a new pattern composed of the over-segmentation index (OSI), under-segmentation index (USI), error index of total area (ETA), and composite error index (CEI) was proposed to evaluate the effects of linear compression and mean filtering on EPGSs of GF-2 fusion image by full samples [29]. However, the effects of atmospheric correction and image enhancement on EPGSs have not yet been evaluated by the new pattern.
To improve the MRS quality of PG in GF-2 images, in this study, a new proportion of 70% was used to determine whether a segment can be an EPGS by analyzing the relationship between extraction results and reference polygons, and an accurate and efficient method was proposed to extract EPGSs by boundary-removed point samples with adjustable density. Moreover, the effects of atmospheric correction and image enhancement on EPGSs were evaluated by intersection over union (IoU) and the OSI-USI-ETA-CEI pattern [29]. The experimental results show that:
The proportion of 70% designed in the experiment is a reasonable pixel ratio to determine the EPGSs;
The OSI-USI-ETA-CEI pattern can be more effective than IoU when it is needed to evaluate the quality of EPGSs of GF-2 images in the study area;
With the consideration of heterogeneity and target characteristics, the atmospheric correction and image enhancement prior to MRS can improve the quality of EPGSs of GF-2 images in the study area;
The combination of atmospheric correction, Fast Fourier Transform (FFT), and a circular low-pass (CLP) filter with a radius of 800 pixels obtained the lowest CEI in this study.
2. Materials and Methods
The GF-2 images in the study area were preprocessed using image correction, registration, fusion, or enhancement, and then the reference polygons were made into point samples with adjustable density and boundary removal using the characteristics of chessboard segmentation. Finally, the effect of atmospheric correction and image enhancement on EPGSs was analyzed using the IoU and OSI-USI-ETA-CEI pattern for full sample evaluation. A technical flowchart and the configuration of the computer used in the study are shown in Figure 1 and Table 1, respectively.
2.1. Study Area and Data
The GF-2 image, reference polygons, and ground verification photos and their locations are shown in Figure 2. The original images and photos are the same as those in the study by Yao et al. [29], while the reference polygons were improved manually through visual interpretation in some places. For instance, because the imaging time of the GF-2 image is earlier than that of the photo in Figure 2a, the weeds and shrubs that covered the PG with walls in the photo do not appear on the GF-2 image; thus, a PG with walls was added to the new reference polygons. The study area, a typical one in Shouguang City, Shandong Province, China, contains PGs both with and without walls [26], as well as water, trees, buildings with high reflectance, residences, and barren land. Table 2 presents the basic parameters of the GF-2 images in this study [38].
2.2. Preprocessing of GF-2 Images
Three schemes were designed for the comparative experiments: the first was used to generate the control group, and the other two were used to generate experimental groups. All preprocessing operations were conducted in ENVI (L3Harris Geospatial Solutions, Inc., Broomfield, CO, USA) software.
2.2.1. Orthographical Correction, Image Registration and Fusion
Firstly, we orthorectified the GF-2 images using rational polynomial coefficients and ASTER GDEM version 2 [39]. Secondly, we registered the multispectral image without atmospheric correction (MI) to the orthorectified panchromatic image, and then fused them using the Gram Schmidt Pan Sharpening tool, which outperforms most of the other pan sharpening methods in terms of both maximizing image sharpness and minimizing color distortion [40,41]. Thus, a multispectral and panchromatic fusion image without atmospheric correction (FI) with 0.8 m resolution was obtained.
2.2.2. Atmospheric and Orthographical Correction, Image Registration and Fusion
The difference between this scheme and the first one is that radiometric calibration and atmospheric correction are performed prior to the orthographical correction; thus, an atmospheric-corrected multispectral image (ACMI) with 3.2 m resolution and an atmospheric-corrected fusion image (ACFI) with 0.8 m resolution can be obtained. The atmospheric correction was conducted using the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) [42].
2.2.3. Image Enhancement
Linear compression is the opposite of linear stretching: a uniform expansion of the image gray values is called linear stretching, whereas a uniform reduction is called linear compression. The maximum digital number (MDN) is the maximum gray value of a single band in a monochrome or multispectral image, and the gray value of each pixel in each band can be reassigned between 0 and MDN according to the frequency of each original gray value, thereby changing the distribution range of the gray values. When the MDN is reduced, the gray values of the pixels in the image are compressed to a smaller range [43]. The MDNs adopted in this study were 511, 255, and 127.
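As a rough illustration of compressing a band to a smaller MDN, the sketch below applies a plain min-max rescale with NumPy. This is an assumption made for illustration only: the study performed the operation in ENVI, where values are reassigned according to gray-value frequency, and the function name `linear_compress` and the toy array are hypothetical.

```python
import numpy as np

def linear_compress(band, mdn):
    """Rescale one band so that its gray values span 0..mdn.

    Simplified min-max rescale (an assumption); not ENVI's procedure.
    """
    band = np.asarray(band, dtype=np.float64)
    lo, hi = band.min(), band.max()
    if hi == lo:
        # A constant band carries no contrast; map it to zero.
        return np.zeros(band.shape, dtype=np.uint16)
    scaled = (band - lo) / (hi - lo) * mdn
    return np.rint(scaled).astype(np.uint16)

# Toy 10-bit band compressed to the three MDNs used in the study.
toy_band = np.array([[0, 256], [512, 1023]])
for mdn in (511, 255, 127):
    compressed = linear_compress(toy_band, mdn)
    print(mdn, int(compressed.min()), int(compressed.max()))
```

A smaller MDN maps more of the original gray levels onto the same output value, which is how compression reduces the heterogeneity of the image.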
Spatial filtering is also an important method for enhancing image information by changing the DN values of an image for a specific purpose. In order to further improve the heterogeneity among pixels and reduce the noise, the input image is filtered in the spatial domain (x, y) and in the spatial frequency domain (ξ, η), respectively [43,44]. In this study, mean, Gaussian low-pass (GLP), and median convolution filters with a size of 3 × 3 were selected in the spatial domain, and the circular low-pass (CLP) filter combined with FFT was selected in the spatial frequency domain.
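The two filtering domains can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the ENVI implementation used in the study: the function names are hypothetical, edges are handled by replicating border pixels, and the CLP radius is measured in pixels of the centered frequency spectrum.

```python
import numpy as np

def mean_filter_3x3(img):
    """Spatial-domain 3 x 3 mean filter with replicated edges."""
    img = np.asarray(img, dtype=np.float64)
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    # Sum the nine shifted copies of the image, then average.
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / 9.0

def circular_low_pass(img, radius):
    """Frequency-domain filter: keep components within `radius` of DC."""
    img = np.asarray(img, dtype=np.float64)
    rows, cols = img.shape
    # Shift the spectrum so the zero frequency sits at (rows//2, cols//2).
    spectrum = np.fft.fftshift(np.fft.fft2(img))
    y, x = np.ogrid[:rows, :cols]
    dist2 = (y - rows // 2) ** 2 + (x - cols // 2) ** 2
    mask = dist2 <= radius ** 2
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)
```

A larger radius keeps more high-frequency detail, so the result approaches the input image; a radius of zero keeps only the image mean.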
2.3. MRS via ESP 2 Tool
To focus on the effect of atmospheric correction and image enhancement on MRS, the shape and compactness parameters were uniformly set to 0.3 and 0.5, respectively [29]. Each optimal scale parameter (OSP) in this research was then automatically calculated using the ESP 2 tool [33] with the algorithm parameters [29] set as shown in Table 3. Level 1 and its segments in the exported results were adopted for the next stage of analysis.
2.4. A Semi-Automatic Method to Extract EPGSs
The automatic extraction of EPGSs from different segmentation results requires different object samples, features, parameters, and thresholds, which is cumbersome and makes it difficult to avoid the error of false extraction and omission. In this study, the EPGSs were extracted by a semi-automatic process.
Firstly, the point samples used to select PG segments were made, and then the segments that intersected with the point samples were automatically extracted as the PG segments by the eCognition classification algorithm. Finally, visual interpretation and manual selection were carried out to extract the EPGSs. The geometric error of manual selection results has a theoretical minimum, but the error of total extraction area does not, because the total extraction area will be close to the real area of the PGs when the omission error is close to the false error.
2.4.1. Production of Point Samples
Segments overlapping with the reference polygons can be quickly classified using the assign algorithm in eCognition; however, the PG pixel proportion of the segments classified by the boundary of reference polygons is often less than 50%. In order to improve the efficiency of visual interpretation, this study first segmented the GF-2 image by using a chessboard segmentation algorithm combined with the reference polygons (Figure 3a). Thus, the segments that intersected with the reference polygons were classified as initial samples, and then we discarded most of the segment samples that were located at the boundary of the reference polygons by area calculation and filtering (Figure 3b). Finally, the point samples used for extracting PG segments could be derived using the export algorithm in eCognition (Figure 3c), and the “center of gravity” type was chosen when exporting them.
In this study, the size of the chessboard was set as 10 pixels or 20 pixels according to the characteristics of the PG segments. For the segmentation results with a large number of segments and a small average area, a denser chessboard was used; otherwise, we used a sparser one.
Figure 4 shows the specific algorithm steps of making point samples based on reference polygons and a chessboard with 20 pixels as an example.
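The idea of boundary-removed point samples can be approximated outside eCognition on a rasterized reference mask. The sketch below is only an analogy under stated assumptions: `ref_mask` is a hypothetical boolean raster of the reference polygons, and `min_fraction` is a hypothetical area threshold standing in for the area calculation and filtering step, whose exact value is not given in the text.

```python
import numpy as np

def chessboard_point_samples(ref_mask, cell=20, min_fraction=1.0):
    """Return (row, col) centers of gravity of chessboard cells inside PGs.

    Cells covered by reference pixels for less than `min_fraction` of
    their area (typically boundary cells) are discarded.
    """
    ref_mask = np.asarray(ref_mask, dtype=bool)
    rows, cols = ref_mask.shape
    points = []
    for r0 in range(0, rows, cell):
        for c0 in range(0, cols, cell):
            block = ref_mask[r0:r0 + cell, c0:c0 + cell]
            if block.size < cell * cell:
                continue  # skip incomplete cells at the image edge
            if block.any() and block.mean() >= min_fraction:
                ys, xs = np.nonzero(block)
                # Center of gravity of the reference pixels in this cell.
                points.append((float(r0 + ys.mean()), float(c0 + xs.mean())))
    return points
```

Reducing `cell` from 20 to 10 pixels yields a denser set of points, which mirrors the adjustable density described above.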
2.4.2. Criterion for Determining EPGSs
In a broad sense, the PG segment refers to all the segments containing PG pixels when the characteristics of non-PG pixels are not obvious. If the proportions of PG pixels in these segments form a finite set G = {g1, g2, …, gi, …}, then PG segments with g greater than or equal to a set value gset (g ≥ gset) can be defined as EPGSs, in which the patches composed of non-PG pixels can be called extra fragments (EF), which are treated as false errors in the subsequent analysis. Moreover, those segments with g < gset can be regarded as invalid PG segments in a narrow sense, and the patches composed of PG pixels in them can be called lost fragments (LF), which are treated as omission errors in the subsequent analysis.
Generally speaking, the boundary of PG segments cannot completely coincide with the reference polygons. Suppose that the total area of EPGSs is S, the total area of corresponding reference polygons is R, the area of extra fragments is EF, and the area of lost fragments is LF. For a specific segmentation result, as gset decreases, EF will increase, while LF will decrease accordingly. When gset = gmin, the area of EPGSs reaches the maximum value Smax, EF and the difference between S and R reach the maximum value EFmax = Smax − R, while LF reaches the minimum value LFmin = 0 and the producer accuracy (PA) reaches the maximum (PAmax = 1).
Conversely, as gset increases, EF will decrease, and LF will increase accordingly; so, when gset = gmax, the area of EPGSs reaches the minimum value Smin, and EF reaches the minimum value EFmin, while LF and the difference between S and R reach the maximum value LFmax = R − Smin, and the PA reaches the minimum. In this process, because the numerator and denominator of the user accuracy change at the same time, it is difficult to judge the g value corresponding to its maximum and minimum values.
According to the inference above, when different g values are used to extract the EPGSs in a specific segmentation result, EF and LF are in a trade-off relationship. From an application perspective, a reasonable gset value should make the difference between S and R (which is equal to the difference between EF and LF) close to 0. The higher the gset value, the more stringent the evaluation criterion [36]. When gset is set to 60%, EF is much higher than LF; if gset is set too high (for example, 80%), many segments containing PG pixels will not be classified as EPGSs, which leads to LF being much higher than EF. Hence, the value of gset can be set to 70%.
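The trade-off between EF and LF can be explored numerically. The sketch below follows the definitions given in this section (EF as non-PG area inside EPGSs, LF as PG area inside invalid PG segments); the inputs `labels` (a raster of non-negative segment IDs) and `ref_mask` (a boolean raster of the reference polygons), as well as the toy data, are hypothetical and are not the study's data.

```python
import numpy as np

def epgs_areas(labels, ref_mask, g_set):
    """Compute S, R, EF, and LF (in pixels) for a threshold g_set."""
    labels = np.asarray(labels)
    ref = np.asarray(ref_mask, dtype=bool)
    n = int(labels.max()) + 1
    seg_size = np.bincount(labels.ravel(), minlength=n).astype(float)
    pg_size = np.bincount(labels.ravel(),
                          weights=ref.ravel().astype(float), minlength=n)
    # g: proportion of PG pixels in each segment.
    g = np.divide(pg_size, seg_size, out=np.zeros(n), where=seg_size > 0)
    is_pg_segment = pg_size > 0
    effective = is_pg_segment & (g >= g_set)   # EPGSs
    invalid = is_pg_segment & ~effective       # invalid PG segments
    return {
        "S": float(seg_size[effective].sum()),
        "R": float(ref.sum()),
        "EF": float((seg_size - pg_size)[effective].sum()),
        "LF": float(pg_size[invalid].sum()),
    }

# Toy example: four 10-pixel segments holding 10, 7, 6, and 0 PG pixels.
toy_labels = np.repeat(np.arange(4), 10).reshape(4, 10)
toy_ref = np.zeros((4, 10), dtype=bool)
toy_ref[0, :] = True
toy_ref[1, :7] = True
toy_ref[2, :6] = True
for g_set in (0.6, 0.7, 0.8):
    a = epgs_areas(toy_labels, toy_ref, g_set)
    print(g_set, a, "S - R =", a["S"] - a["R"])
```

In every case S − R equals EF − LF, so scanning candidate thresholds and picking the one with the smallest absolute difference is one way to check a choice such as 70% against a given segmentation.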
2.4.3. Extraction of EPGSs
Taking the extraction of EPGSs from FI as an example: Firstly, the point samples were used to obtain the segments that intersected with them (Figure 5a). Then, a small number of segments that did not meet the criterion that at least 70% of their pixels belong to PGs were discarded; hence, the retained PG segments were those needed in the subsequent analysis. Moreover, a few omitted segments that did not intersect with the point samples but satisfied the criterion were added as EPGSs (Figure 5b).
2.5. Evaluation System
According to the single variable principle, all MRS times T in this study were obtained off-line with only a single program running. In order to evaluate the rationality of the experimental settings, intersection over union (IoU) was used to preliminarily evaluate the accuracy of EPGSs, and then the OSI-USI-ETA-CEI pattern [29] was used to evaluate the segmentation error. In order to facilitate comparison, the value of 1 − IoU was taken as the evaluation parameter for subsequent analysis. The description or expression of each parameter is shown in Table 4.
In the formula, the set S represents the total area of EPGSs, and the set R represents the total area of reference polygons; v represents the number of EPGSs of the optimal segmentation result obtained by the ESP 2 tool, and v1 specifically refers to the number of EPGSs in the fusion image without atmospheric correction in the study area. The higher the OSI value, the heavier the over-segmentation degree. The sets LF and EF represent the total area of lost fragments and extra fragments of EPGSs, respectively; the higher the USI, the greater the error of under-segmentation, and the higher the ETA value, the greater the error of the total area. Regarding the CEI, λ is used to rescale the value of quantity-based OSI so that the indicator will not overwhelm the value of area-based USI and ETA.
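For reference, the 1 − IoU parameter can be computed from two rasterized masks as sketched below. The mask names `s_mask` (EPGSs) and `r_mask` (reference polygons) are hypothetical, and only IoU is shown, because the OSI, USI, ETA, and CEI expressions are given in Table 4 and in [29] rather than in this text.

```python
import numpy as np

def one_minus_iou(s_mask, r_mask):
    """Return 1 - IoU between the EPGS mask S and the reference mask R."""
    s = np.asarray(s_mask, dtype=bool)
    r = np.asarray(r_mask, dtype=bool)
    union = np.logical_or(s, r).sum()
    if union == 0:
        raise ValueError("IoU is undefined when both masks are empty")
    intersection = np.logical_and(s, r).sum()
    return 1.0 - intersection / union
```

Because the value depends only on the overlap of the two total areas, it is insensitive to how many segments the EPGS area is divided into.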
4. Discussion
4.1. Comparison of IoU and OSI-USI-ETA-CEI Pattern
Since the IoU is widely applied, the consistent trend of 1 − IoU and USI in Figure 8 and Figure 10 illustrates the validity of USI. However, the IoU cannot indicate the variation in the degree of over-segmentation, nor the change trend of the difference between S and R, whereas the OSI and the ETA can do that, respectively. Thus, the OSI-USI-ETA-CEI pattern can be more effective than IoU when it is needed to evaluate the quality of EPGSs.
4.2. Comparison with Related Research
For comparison, Yao et al. obtained the lowest CEI of 0.185 using the image enhancement method of gray compression and mean filtering on the fusion image without atmospheric correction in the same area [29]. Even though the criterion of EPGS (the proportion of PG pixels in the segment) was raised from 60% to 70%, the CEIs of nine atmospheric correction and image enhancement schemes in this study were lower than 0.185. The main reasons for the improvement are the use of atmospheric correction to optimize the DN values of the image, the use of new filters, and the point samples made from the improved reference polygons, which improved the extraction efficiency and accuracy of the EPGSs.
4.3. Hypothesis of Why Atmospheric Correction and Image Enhancement Work
The results in Section 3 suggest that the CEIs of ACMI, ACFI, and the linearly compressed ACFI are lower than those of MI, FI, and ACFI, respectively. In order to find the reason for this, the line and symbol plots shown in Figure 11 were obtained by counting the number of pixels in each DN interval of the blue bands of MI, ACMI, FI, and ACFI.
It can be seen from Figure 11a that atmospheric correction has the effect of expanding the DN value range of the MI, which can be regarded as the nonlinear stretching of the DN value. Since the radiometric calibration process also stretches the DN value of the panchromatic image, the DN value range of ACFI is significantly higher than that of the fusion image without atmospheric correction (Figure 11b). Combined with the increase in OSIs and the decrease in USIs, it is suggested that atmospheric correction can improve the heterogeneity between the PG pixels and non-PG pixels; thus, the CEIs of ACMI and ACFI are lower than those of MI and FI, respectively.
On the other hand, since the ACFI can easily lead to over-segmentation, the linearly compressed ACFI can obtain better segmentation results by decreasing the heterogeneity of the whole image.
Regarding filters, whether they help improve the quality of EPGSs is still determined by heterogeneity, since the filters can remove image noise, while the homogeneity between the PG pixels and non-PG pixels can be increased at the same time.
Therefore, combined with the theory of MRS, a hypothesis can be inferred: if the increase in heterogeneity between the PG pixels and non-PG pixels is greater than that of the homogeneity during image preprocessing, then the quality of EPGSs will increase; otherwise, it will decrease.
4.4. Next Steps
Limited by the experimental conditions, this research only analyzed the effects of atmospheric correction on EPGSs of GF-2 multispectral image, and the effects of atmospheric correction, linear compression, and spatial filtering on EPGSs of GF-2 fusion image in the study area. Other image correction and enhancement methods or segmentation algorithms to improve the quality of EPGSs in the study area or other areas can be further discussed in a follow-up study.
PGs in China can be divided into two kinds according to whether they are built with walls [26]. However, some research has mistakenly identified PGs with walls as PGs without walls. For instance, Figure 7 in the study of Ma et al. [10] mistakenly matches a photograph of PGs without walls to the satellite image of PGs with walls, which can be very misleading to the reader. It is essential to distinguish the two main types of PG, both in field investigation and image interpretation, in addition to the extraction.
In order to extract surface objects from various satellite images with better accuracy, many kinds of machine learning models have been applied in computing [45,46], which are also worthy of being assessed adequately in subsequent PG extraction.