**3. Results**

To assess whether the DenseNet is more suitable and efficient than the other methods, we first compare the result of the proposed network with the ground truth to evaluate its effectiveness in identifying water bodies. We then compare it with the results derived from the NDWI index and four other deep neural networks: VGG, ResNet, SegNet and DeepLab v3+. Finally, we choose the best model and make a simple analysis of the changes in water area in the Poyang Lake area in winter and summer from 2014 to 2018.

#### *3.1. The Image Preprocessing*

The dataset contains GF-1 images of the middle and lower reaches of the Yangtze River basin from different periods. The corresponding labels are binary water–nonwater classifications produced by expert visual interpretation. To improve the efficiency of model training, we clipped the input data to 224 × 224 pixels. We deliberately selected samples whose labels contain both land and water bodies as training samples. In total, we selected 5558 water body samples, of which 4446 images were used as the training set and the remaining 1112 images as the test set. These data are used only for model training and quantitative evaluation. Because the samples are cut into small tiles and the training and test sets are selected at random, the recognition performance of the model over large images cannot be judged from these data alone. To qualitatively evaluate the performance of the different models on different ground object types, we also applied the models to other GF-1 images from different periods.
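As a minimal sketch of the clipping and random split described above (not the authors' actual preprocessing code), the following Python fragment enumerates 224 × 224 tiles and splits 5558 sample indices into 4446 training and 1112 test samples. The function names, the non-overlapping tiling, the dropping of partial edge tiles and the fixed random seed are all assumptions made for illustration.

```python
import random


def tile_origins(height, width, tile=224):
    """Top-left (row, col) corners of non-overlapping tile x tile patches.

    Partial tiles at the right and bottom edges are dropped (an assumption).
    """
    return [(r, c)
            for r in range(0, height - tile + 1, tile)
            for c in range(0, width - tile + 1, tile)]


def split_samples(n_samples, n_train, seed=0):
    """Randomly split sample indices into disjoint training and test lists."""
    if not 0 <= n_train <= n_samples:
        raise ValueError("n_train must lie between 0 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed only for reproducibility
    return indices[:n_train], indices[n_train:]


# Sample counts taken from the text: 5558 samples, 4446 for training.
train_idx, test_idx = split_samples(5558, 4446)
print(len(train_idx), len(test_idx))  # 4446 1112
```

The split corresponds to roughly 80% training and 20% test data.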

#### *3.2. Water Identification Result of DenseNet*

Figure 4 shows 12 remote sensing images selected from the validation dataset, the corresponding ground truths, and the DenseNet recognition results. All images are 224 × 224 pixels. These 12 images contain water bodies of various shapes and colors, which illustrates the recognition effectiveness of the model on different water bodies.

**Figure 4.** Results of water body identification using the DenseNet model. The figure is divided into four rows and nine columns, showing the recognition results in 12 different regions. Columns (**a**), (**d**), (**g**) are the false-color remote sensing images; columns (**b**), (**e**), (**h**) are the corresponding ground truths; columns (**c**), (**f**), (**i**) are the corresponding model (DenseNet) recognition results.

It can be seen from Figure 4 that the recognition result of DenseNet is consistent with the ground truth. Although the model failed to identify some small water bodies, the error areas are generally very small; such errors have little influence on the overall distribution of water bodies and can be ignored. In addition, the network accurately identifies water bodies of different forms and in different regions, accurately separates small rivers in towns, and even correctly separates small barriers in the water such as bridges. The boundaries between water and land were identified, partly because of the fine resolution of the GF-1 images and partly because of the efficiency of the proposed DenseNet model.

#### *3.3. Working Efficiency of DenseNet, ResNet, VGG, SegNet and DeepLab v3+ Models*

Figure 5 shows the training losses of DenseNet, ResNet, VGG, SegNet and DeepLab v3+. In convolutional neural networks, the loss function measures the difference between the output of the model and the ground truth, and this difference is used to optimize the model. The smaller the loss, the better the robustness of the model. In Figure 5, one epoch represents 1000 iterations. In the initial epochs, the loss value of VGG is by far the highest, two or three times higher than those of ResNet and DenseNet, and it remains the highest until 30 epochs. The initial loss value of SegNet is close to that of VGG, followed by DeepLab v3+. The DenseNet has a higher initial loss value than the ResNet, but its loss declines faster and stays below that of the ResNet after five epochs. The loss of DenseNet remains the lowest after five epochs, indicating the fastest convergence among the five models.

Table 2 shows the training time of the five networks. Among them, the VGG has the longest training time. DeepLab v3+ has the shortest training time, followed by the DenseNet. The DenseNet saves about 80 min compared to the VGG model, about 50 min compared to the ResNet model, and 40 min compared to the SegNet model.

**Figure 5.** Training losses of the DenseNet, ResNet, VGG, SegNet and DeepLab v3+ models. One epoch represents 1000 iterations.

**Table 2.** Training time of the DenseNet, ResNet, VGG, SegNet and DeepLab v3+ models.


However, the DenseNet takes more than 70 min longer to train than DeepLab v3+. This indicates that, under the same training environment, DeepLab v3+ requires the least training time; it is the easiest to train and consumes the fewest resources. We did not compare the time consumption of NDWI with that of these networks because the NDWI method needs very little processing time, which can be ignored.

#### *3.4. Comparison of Identification Results*

The derived P, R, F1 score and mIoU of the VGG, the ResNet, the DenseNet, the SegNet, the DeepLab v3+ and the NDWI models are shown in Table 3. All values in the table were calculated from the prediction results for the 1112 images in the test set and their corresponding ground truth. Given the limited number of samples, we report the 95% confidence interval of each metric to show whether the results are statistically significant. The best result for each indicator is in bold. The results show that all neural networks perform much better overall than the NDWI index. Among the network models, the DenseNet result, with a narrower interval, appears more stable than those of the other methods.
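The text does not state how the 95% confidence intervals were obtained. One common choice, assumed here purely for illustration, is a normal-approximation interval over per-image metric values, i.e., the mean plus or minus 1.96 standard errors. The function name and the toy values below are hypothetical.

```python
import math
import statistics


def mean_ci95(values):
    """Mean and half-width of a normal-approximation 95% confidence interval.

    Assumes the per-image metric values are independent; this is an
    illustrative choice, not necessarily the procedure used for Table 3.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = statistics.fmean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(n)
    return mean, half_width


# Toy per-image F1 scores (hypothetical, not taken from the paper).
mean, half_width = mean_ci95([0.90, 0.92, 0.94, 0.96])
print(f"{mean:.3f} +/- {half_width:.3f}")
```

Under this assumption, a narrower interval reflects lower variability of the metric across test images.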

**Table 3.** The derived P, R, F1 score and mIoU of the VGG, ResNet, DenseNet, SegNet, DeepLab v3+ and NDWI models with 95% confidence interval. The optimal value for each metric is shown in bold.


Among the six models, the DenseNet has the highest precision of 0.961, meaning that 96.1% of the water bodies predicted by the model are correct. The precisions of ResNet, VGG, SegNet and DeepLab v3+ are 0.936, 0.914, 0.911 and 0.922, respectively. This ranking is as expected, considering the path of theoretical improvements of these deep neural network models. However, the NDWI model based on the spectral bands has a much lower prediction precision of only 0.702, although an adaptive threshold from the Otsu method is employed. Hence, the DenseNet performs the best among the five deep neural networks regarding prediction precision; in particular, such a neural network is, at least in this case, far better than the NDWI method commonly used for water body identification in the remote sensing community.
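For reference, the NDWI baseline with an Otsu threshold can be sketched as follows. The snippet assumes the McFeeters form NDWI = (Green − NIR)/(Green + NIR) and a histogram-based Otsu threshold; the function names, the 256-bin histogram and the toy reflectance values are hypothetical, and the implementation used for Table 3 may differ.

```python
def ndwi(green, nir):
    """NDWI = (Green - NIR) / (Green + NIR); returns 0.0 if both bands are 0."""
    total = green + nir
    return (green - nir) / total if total else 0.0


def otsu_threshold(values, bins=256):
    """Threshold that maximizes the between-class variance of a histogram."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0
    for i, h in enumerate(hist[:-1]):
        w0 += h                      # pixels at or below bin i
        sum0 += i * h
        w1 = total - w0              # pixels above bin i
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (total_sum - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_bin = var, i
    return lo + (best_bin + 1) * width  # upper edge of the best bin


# Toy (green, nir) reflectance pairs: two land pixels, then two water pixels.
pixels = [(0.10, 0.40), (0.12, 0.35), (0.30, 0.10), (0.28, 0.08)]
values = [ndwi(g, n) for g, n in pixels]
threshold = otsu_threshold(values)
water_mask = [v > threshold for v in values]
print(water_mask)  # [False, False, True, True]
```

Because the decision depends on a single spectral index, any dark or shadowed surface whose index exceeds the threshold is labeled as water, which is consistent with the low precision reported for NDWI.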

Among the five deep neural networks, SegNet shows the highest recall value of 0.934. The ResNet shows the lowest recall, which is 0.902. The DenseNet is only 0.02 higher than ResNet. VGG and DeepLab v3+ have a recall of 0.915 and 0.917, respectively. The NDWI model shows the highest recall value of 0.983 among all six methods, indicating that it has successfully identified most of the water body samples in the test set. However, its precision value is the lowest, indicating that this method still produces serious false predictions. As can be seen, the metrics of recall and precision give contrary indications of the model performances.

To make a comprehensive evaluation of these two indicators, we investigate the F1 score, which considers both the precision and the recall values. We also use mIoU to evaluate the accuracy of the model segmentation results. A higher F1 score and mIoU indicate a better performance. The F1 scores of the DenseNet, ResNet, VGG, SegNet and DeepLab v3+ models are 0.931, 0.919, 0.914, 0.922 and 0.919, respectively, and their mIoUs are 0.872, 0.850, 0.842, 0.856 and 0.850, respectively. These results show that the performance of DenseNet is better than that of ResNet, VGG, SegNet and DeepLab v3+. This may be due to the dense connection, which increases the utilization efficiency of the features. As for DeepLab v3+, its training efficiency is much better than that of DenseNet. This is because the backbone we chose for DeepLab v3+ is MobileNet, a lightweight network that uses depth-wise separable convolution to reduce the number of parameters and the amount of calculation. The F1 score and the mIoU of the NDWI index are as low as 0.819 and 0.767, showing that all the deep neural networks perform much better than the traditional NDWI method from a comprehensive viewpoint.
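The relation between these metrics can be made concrete with a short sketch. It assumes the standard definitions for binary segmentation: precision, recall and F1 for the water class, and mIoU as the mean of the intersection over union of the water and nonwater classes. The function names are hypothetical, and the confusion counts in the example are toy values, not data from the paper.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def water_metrics(tp, fp, fn, tn):
    """Precision, recall and F1 of the water class, plus two-class mean IoU.

    tp, fp, fn, tn are pixel counts; all denominators are assumed nonzero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    iou_water = tp / (tp + fp + fn)
    iou_nonwater = tn / (tn + fp + fn)
    miou = (iou_water + iou_nonwater) / 2
    return precision, recall, f1_from(precision, recall), miou


# Toy confusion counts (hypothetical).
p, r, f1, miou = water_metrics(tp=80, fp=10, fn=20, tn=90)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f} mIoU={miou:.3f}")

# Consistency check with precision/recall pairs quoted in the text.
print(round(f1_from(0.936, 0.902), 3))  # ResNet: 0.919
print(round(f1_from(0.702, 0.983), 3))  # NDWI: 0.819
```

The last two lines reproduce the F1 scores quoted for ResNet and NDWI from their precision and recall, which illustrates how a very high recall cannot compensate for a low precision.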

The recalls of DenseNet and ResNet are relatively low among these models, meaning that these networks are not good at capturing all the water areas. Figure 6 shows some examples of this disadvantage. The third column is the result of DenseNet, and the fourth column is the result of ResNet. The figure shows that the water area recognized by DenseNet is the smallest among all six models, and the omissions occur mainly in small rivers and intertidal zones. Column (h) is the result of NDWI, which recognized the largest water area, consistent with its highest recall value. However, as the identified water area increases, the probability of recognition error also increases, so the precision is likely to drop. Increasing the recall of DenseNet might therefore come at the cost of a sharp drop in precision. Because DenseNet already achieves good F1 score and mIoU, meaning that its overall performance is very good, we decided not to further optimize its recall.

To further understand the performance of each method in different regions, we selected two GF-1 images of the Poyang Lake, acquired on 29 July and 31 December 2016 during the wet and the dry seasons, respectively, to evaluate the different models. Figure 7 shows the results from the image of 31 December 2016, when the Poyang Lake basin was dry with a complex distribution of water area.

**Figure 6.** Examples of the recognition results of different models, showing the high number of false negatives (FN) of DenseNet. (**a**) False color composite remote sensing image, and water body identification result by (**b**) the ground truth, (**c**) the DenseNet, (**d**) the ResNet, (**e**) the VGG, (**f**) the SegNet, (**g**) the DeepLab v3+ and (**h**) the NDWI models. White color indicates the identified water bodies.

**Figure 7.** Comparison of the water identification effect of different models in Poyang Lake on 31 December 2016. (**a**) False color composite remote sensing image, and water body identification result by (**b**) the DenseNet, (**c**) the ResNet, (**d**) the VGG, (**e**) the SegNet, (**f**) the DeepLab v3+ and (**g**) the NDWI models. White color indicates the identified water bodies, solid line depicts mountain area, dashed line depicts urban area.

In the false-color image, the blue area is mostly water body, and the red area is mostly vegetated. The other colored areas include bare land, buildings and other nonwater areas. The mountain area is depicted with a solid line frame, while the urban area is marked with a dashed line frame. In the prediction results of ResNet, many patches in the mountain region are predicted to be water bodies, which shows that the ResNet model is prone to confusing mountain shadows with water. In the same regions, the VGG and SegNet models have also falsely identified some mountain shadow areas as water bodies. DeepLab v3+ has not confused the mountain shadow with a water body, but the boundary of the water area it extracted is not as clear as those of the other methods. The NDWI model correctly identified the main water body, but the identified extent is much larger than the actual water bodies, and it also identified too many fine patches. The NDWI result also contains false detections of mountain shadows, which are larger than those from the ResNet model but smaller than those from the VGG model. Other than the mountain shadow, the biggest problem with the NDWI result is that it falsely identified some bare land and urban construction areas as water bodies. The DenseNet model has successfully identified the small rivers and lakes in the GF-1 image and has successfully separated the mountain shadows from the water bodies. In general, these five deep neural networks have consistently identified the large water bodies in winter, although the ResNet and the VGG models show a false identification of mountain shadows. These neural networks have performed much better than the traditionally used NDWI water body index.

Figure 8 shows the water bodies identified from the GF-1 image of 29 July 2016. In the false-color image, the white area inside the dotted line indicates cloud, and the dashed line depicts the urban area. Summer is the high-water season for the Poyang Lake, when its water area reaches its annual peak. The VGG, SegNet and DeepLab v3+ models have falsely identified the cloud as water bodies, and the DenseNet also shows a small amount of false identification. The NDWI index identifies the bulk of the water body well, but there is much noise in the boundary areas; in addition, it has falsely identified the urban buildings, bare ground and most clouds as water bodies. Only the ResNet model completely distinguishes the cloud from the water bodies, although it misidentifies some water bodies. The DenseNet result shows a relatively accurate identification of water bodies, with clear boundary separation in the transitional areas between land and water. The DenseNet method falsely identified part of the cloud as water bodies, but it filtered out most of it compared to the NDWI result.

**Figure 8.** Comparison of water identification effect in Poyang Lake on 29 July 2016. (**a**) False color composite remote sensing image, and water body identification result by (**b**) the DenseNet, (**c**) the ResNet, (**d**) the VGG, (**e**) the SegNet, (**f**) the DeepLab v3+ and (**g**) the NDWI models. White color indicates the identified water bodies, dashed line depicts urban area, dotted line depicts cloud area.

Therefore, for the image of 29 July 2016, these five deep networks have their advantages and disadvantages for the water body identification, but overall show better performances than the NDWI method.

#### *3.5. Interannual Variations of the Water Areas*

The above results show that the proposed DenseNet model has higher accuracy and can be used for water body identification. We therefore used this model to examine the interannual changes in the water area of Poyang Lake. Since GF-1 was launched in late 2013, we could only study the water area changes from 2014 to 2018. The water area of Poyang Lake changes significantly between seasons, and there is a huge difference between the wet and the dry seasons. The first row of Figure 9 shows the spatial distribution of Poyang Lake in summer from 2014 to 2018. The water area was largest in 2016, when there was a flooding disaster in the Yangtze River basin, and smallest in 2018, when there was a summer drought due to reduced precipitation. The second row shows the lake areas in winter. The water area of Poyang Lake decreases sharply in winter, and the main lake body shrinks to only tributaries and smaller lakes. The loss of water area is mainly concentrated in the central and southern parts of the lake, leaving only a small part of the water body in the north and northeast. This is principally due to the climatic conditions but is also partly related to the topography, the Yangtze River runoff and the Three Gorges Dam [66,67].

**Figure 9.** The spatial variations of water area in summer and winter of 2014–2018 in Poyang Lake area based on DenseNet. The first row shows the lake areas in summer and the second row shows those in winter. White color indicates the identified water bodies.

Figure 10 shows the interannual variations of the water area of the Poyang Lake in summer and winter, derived from GF-1 images from 2014 to 2018 based on the DenseNet model. The water areas in summer are generally much larger than those in winter; this is not surprising, because summer is the rainy season in the Poyang Lake basin. The difference between the lake areas in winter and summer is about 2000 km<sup>2</sup> on average. The water area in summer 2014 is about 5200 km<sup>2</sup> and that in winter is about 3200 km<sup>2</sup>. In 2015, the water areas in summer and winter are equivalent, amounting to about 4300 km<sup>2</sup>; this is because of increased winter precipitation and reduced summer precipitation in contrast to normal years. In 2016, the water area in winter is about 3250 km<sup>2</sup>, and it more than doubles in summer, reaching 7000 km<sup>2</sup> due to severe flooding. The water areas in summer clearly decrease rapidly from 2016 on, whereas those in winter show relatively small changes.
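The conversion from a predicted water mask to an area in km<sup>2</sup> is not spelled out in this section. A minimal sketch, assuming square pixels and a hypothetical pixel size of 16 m (the actual sensor and resolution behind Figure 10 are not stated here), multiplies the number of water pixels by the ground area of one pixel. The function name is an assumption for illustration.

```python
def water_area_km2(n_water_pixels, pixel_size_m):
    """Area in km^2 covered by n_water_pixels square pixels of the given size."""
    if n_water_pixels < 0 or pixel_size_m <= 0:
        raise ValueError("pixel count must be >= 0 and pixel size > 0")
    return n_water_pixels * pixel_size_m ** 2 / 1e6  # m^2 -> km^2


# With an assumed 16 m pixel (256 m^2), about 20.3 million water pixels
# correspond to the roughly 5200 km^2 reported for summer 2014.
print(water_area_km2(20_312_500, 16.0))  # 5200.0
```

The seasonal difference quoted in the text then follows by subtracting the winter area from the summer area of the same year.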

**Figure 10.** Statistics on the change of water area in summer and winter from 2014 to 2018 in Poyang Lake area, derived from GF-1 images.
