1. Introduction
As a populous nation, China treats food security as a paramount strategic priority. Agricultural plastic films used for ground cover play a crucial role in increasing crop yields, raising soil temperature, reducing water evaporation, preventing pest attacks, and mitigating diseases caused by certain microorganisms. Consequently, they are widely applied in agricultural production [1]. However, as the use of plastic films grows, the problem of plastic film residues has become increasingly severe. Effectively extracting plastic film coverage information has thus become a critical research problem. Because plastic film coverings are dispersed, regionally complex, and managed in diverse ways, remote sensing technologies are well suited to identifying them. This is particularly important for obtaining accurate spatiotemporal distribution information in China, which aids farm environmental health assessment and plastic film recycling management and supports the implementation of low-carbon agriculture [2].
Traditional methods for acquiring plastic film cover information rely primarily on manual field measurements, which are labor-intensive and time-consuming and make data accuracy difficult to ensure. As remote sensing data and information extraction technologies have matured, methods for identifying plastic films have gradually shifted from manual field surveys to automated segmentation and extraction based on remote sensing images, and the detection of agricultural plastic film coverage using satellite imagery has emerged as a research hotspot. Lu [3] proposed a threshold model based on moderate-resolution MODIS-NDVI time-series data, determining the threshold for detecting plastic film by analyzing data related to plastic film features. Xiong [4] presented a method for agricultural plastic film monitoring based on multisource remote sensing data, including steps such as plastic film information extraction, classification, and area estimation. Using high-spatial-resolution satellite imagery and spectral and texture features, they achieved the rapid detection and monitoring of plastic film cover over large areas. Chen [5] utilized high-spatial-resolution satellite imagery to develop a remote sensing index model for plastic greenhouses. By extensively exploring spectral and textural features and employing logistic regression analysis, they achieved the precise extraction of plastic film coverage in greenhouses. Picuno [6] analyzed the spatiotemporal distribution characteristics and extraction methods of plastic film cover in the landscape of southern Italy using remote sensing and object modeling techniques based on Landsat TM imagery. The results indicated widespread plastic film cover in southern Italy, and the plastic film extraction method significantly impacted landscape feature extraction.
Although satellite remote sensing offers wide coverage, abundant information, and freedom from ground restrictions, its long image acquisition cycles and its susceptibility to cloud and fog, which are frequent in Guizhou, reduce plastic film identification accuracy. It therefore struggles to meet the identification needs of the small cultivated plots and fragmented crop planting typical of karst mountainous regions.
With the continuous development of drone technology, information extraction from drone imagery has become a popular research direction. Drones offer high maneuverability, high spatial resolution, timeliness, and low cost, and they adapt well to complex environments. An increasing number of scholars are using drones for detection purposes [7,8]. However, most identification methods based on drone imagery still rely on traditional satellite remote sensing interpretation methods, which involve the manual selection of spectral, texture, and shape features for classification. This not only requires specialized domain knowledge but also entails significant computational effort. Since Krizhevsky et al. [9] used deep learning to set a new record in the ImageNet Large Scale Visual Recognition Challenge, deep learning has opened up new prospects for image classification, semantic segmentation, and other fields. Yang [10] extracted plastic film from high-resolution drone imagery using deep semantic segmentation technology. They established a convolutional neural network model to achieve the precise identification and classification of plastic film-covered areas. The results indicated that plastic film extraction based on deep semantic segmentation technology exhibited high accuracy and reliability, providing a new solution for monitoring and managing plastic film-covered areas. Sun [11] proposed a drone aerial monitoring method for greenhouses and plastic-covered farmland based on the SegNet deep semantic segmentation method, combining texture and spectral features. They used a convolutional neural network to extract plastic film and achieve the precise identification and classification of greenhouse and plastic film-covered areas. Zheng [12] compared the performance of deep learning methods, U-Net methods, and Support Vector Machine (SVM) algorithms in extracting plastic film from greenhouses and constructed the ENVINet5 deep learning model to extract plastic film through semantic learning. Song [13] proposed using a pooling module to extract target features with a large receptive field based on a deep learning model and optimized the model by integrating high-level and low-level features.
The aforementioned studies have achieved the basic extraction of information on plastic film coverings in farmland, including plastic film greenhouses. However, most of them focus on the remote sensing monitoring of large flat areas. Karst areas account for approximately 15% of the world’s land area [14]. China has the largest and most widely distributed karst area [15], with the southwest bare karst region, centered in Guizhou, being the largest and most densely distributed area globally [16]. The rugged surface and extremely poor soil of karst terrain are unfavorable for agricultural development [17], leading to the popular saying in the Yun-Gui Plateau region: “No three flat miles, no three sunny days, no three taels of silver.” Crop growth in karst mountainous regions is complex, with scattered planting distributions. Therefore, more flexible, efficient, rapid, and accurate methods are needed for plastic film recognition and monitoring in areas of complex terrain.
With the continuous development of deep learning technology, its widespread application in automatic feature extraction and image fitting in the field of computer vision has provided new avenues for addressing target recognition issues in medium- and high-resolution remote sensing images. In 2015, Ronneberger [18] proposed the U-Net model to tackle challenges in image segmentation. This model has shown outstanding performance in medical image segmentation, demonstrating strong generalization capability and excellent segmentation performance. As a result, it has become one of the most highly acclaimed classic models and has been widely applied in various fields. In land use classification of satellite remote sensing images, some scholars, such as Ulmas P. [19], utilized the U-Net model for land cover classification of high-resolution satellite images. Additionally, the U-Net method has been applied in building detection and road extraction in aerial images. For instance, Irwansyah E. [20] employed an improved U-Net model for building detection in urban aerial images, achieving an average training accuracy of 0.83. In the field of intelligent transportation, the U-Net model has been used for the real-time detection and tracking of vehicles and pedestrians as well as for road segmentation. Yang X. [21] utilized the U-Net model for vehicle detection and recognition in urban road images. Furthermore, the application of the U-Net model in agriculture is growing and includes crop growth monitoring and pest detection. Su Z. et al. [22] proposed an end-to-end, pixel-to-pixel semantic segmentation method for rice lodging identification in unmanned aerial vehicle remote sensing images, using an improved U-Net network model; it achieved an accuracy of 97.30% and proved suitable for small sample datasets. In plastic film extraction, Zhai Z. et al. [23] combined unmanned aerial vehicle-acquired images of cotton fields with the U-Net model for image segmentation, achieving a Mean Intersection over Union (MIOU) of 87.53%. Overall, as a deep learning method, the U-Net model has demonstrated significant potential and broad application prospects in the fields of image segmentation and target recognition.
In this context, this study focuses on the Fengcong Dam area in Anlong County, Guizhou Province, China. Utilizing a DJI Mavic 2 Pro drone, a large number of visible light images covering farmland with plastic film were obtained. A plastic film sample dataset was constructed, and the U-Net deep learning model was trained to identify plastic film with the aim of extracting information on plastic film cover in fragmented terrain areas. This study aims to provide decision-making support for plastic film surveys, farm environmental health assessments, and green planting management in farmland, and to serve as a reference for the recognition and detection of agricultural plastic film under complex geographical conditions.
4. Discussion
4.1. Applicability of the Method
This study used unmanned aerial vehicle (UAV)-based visible light imagery to explore the application of the U-Net model in monitoring plastic film coverage in high-altitude mountainous farmland and to assess its suitability for such tasks. The rapid development of UAV technology, characterized by high mobility, low cost, and enhanced safety, provides a new avenue for geographical information acquisition. Traditional monitoring methods face limitations due to the diverse land cover types and complex terrain of karst mountainous farmlands. Therefore, this research introduced multi-rotor UAVs to monitor plastic film coverings in high-altitude mountainous areas and investigated their advantages and effectiveness in practical applications. Our results indicate that multi-rotor UAVs are cost-effective and safe and offer unique advantages in land cover monitoring. They can swiftly acquire high-resolution visible light images and effectively monitor features in fragmented planting areas. Moreover, visible light images captured by drones can reflect plastic film coverage under different backgrounds, providing decision-making support for plastic film surveys, farmland health assessments, and modern agricultural park management.
Determining the optimal sample size is a complex issue involving multiple factors, including processing time, machine resources, data quality, and model complexity. By experimenting with and comparing different sample sizes, we found that a sample size of 800 effectively improved the accuracy of plastic film recognition and gave good performance during training. This comparison also offers a practical way to determine the sample size in subsequent research. Furthermore, by using cross-validation techniques, we comprehensively evaluated the model’s performance under different sample sizes, thereby providing reliable evidence for selecting the optimal sample size. These methods not only enhance the training efficiency and performance of the model but also save time and resource costs, providing strong support for subsequent research and practical applications.
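The cross-validation splits mentioned above could be generated along the following lines. This is a minimal sketch rather than the procedure used in this study: the function name `k_fold_indices`, the choice of five folds, and the fixed random seed are assumptions made for illustration; only the sample size of 800 comes from the text.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists for k-fold cross-validation.

    Hypothetical helper: the fold count and seed are illustrative choices,
    not values reported in the study.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # Shuffle with a private, seeded generator so the splits are repeatable.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation
```

Calling `list(k_fold_indices(800, 5))` gives five splits, each with 640 training and 160 validation indices. Repeating the same call for each candidate sample size would give one set of folds per size, on which a model could then be trained and scored.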
4.2. Differences from Existing Research
This study, conducted in a karst mountainous region of southern China, utilized a UAV remote sensing platform, specifically the DJI Mavic 2 Pro, to extract information about plastic film coverings in complex habitats. This approach effectively addresses the challenge of obtaining high-quality remote sensing image data for crop information extraction in fragmented and environmentally fragile karst mountainous terrain with frequent cloudy and foggy weather. In contrast, previous research employed medium- and low-resolution satellite remote sensing images, such as Landsat TM and Landsat 8, and combined spectral and texture features to improve classification accuracy. For instance, Lu [37] achieved overall accuracy rates of 85.27% and 95% using Landsat-5 TM images, and Hasituya [38] achieved an overall classification accuracy of up to 94% by combining spectral and texture features based on Landsat-8 remote sensing data. In this study, the U-Net model was employed for the semantic segmentation of drone images, achieving a patch quantity accuracy of 96.38%, an area accuracy of 91%, and an IOU and F1-score of 85.89% and 94.20%, respectively. Differences between our results and those of Lu and Hasituya could stem from variations in data sources, study areas, methodologies, algorithms, and parameter settings. However, our study was tailored to specific geographical conditions and application needs, and its methods and algorithms are better suited to monitoring agricultural mulching in karst mountain areas. Our findings therefore remain broadly comparable with this earlier work and may offer better applicability and efficacy in such application scenarios.
4.3. Limitations
The limitations of the current study need to be addressed in future research to improve the quality and reliability of the results. The restricted endurance of the unmanned aerial vehicle (UAV) constrained our research to relatively small areas, limiting our capacity for extensive data collection over larger regions. In subsequent studies, employing UAVs with higher endurance or optimizing flight path planning algorithms may enable coverage and data collection over larger areas. Additionally, despite larger sample sizes and the inclusion of samples with diverse backgrounds, the confusion between plastic film and roads remains unresolved. Future work will examine the impact of various scenarios on plastic film extraction and refine training strategies and parameter settings to improve the model’s recognition capability and accuracy in complex scenarios. Furthermore, while our study explored the influence of sample size on plastic film extraction, samples may still be susceptible to environmental factors such as illumination, weather, and ground reflectance. To assess model performance comprehensively, the effects of these environmental factors on plastic film extraction warrant further investigation. Efforts will be made to incorporate these factors into model training and optimization to enhance the model’s robustness and reliability across diverse environmental conditions.
Future work may continue to refine plastic film recognition technology based on UAV visible light imagery. Exploring high-resolution remote sensing data, including high-resolution satellite images, for plastic film recognition and monitoring can enhance recognition accuracy and spatial resolution. Improving and optimizing plastic film recognition algorithms and models can enhance recognition efficiency. Additionally, leveraging multi-temporal remote sensing image data for temporal monitoring and change analysis will facilitate a better understanding of how plastic film coverage changes over time and of the impacts of agricultural management practices.
5. Conclusions
Considering the varied backgrounds of plastic film environments, we employed drones to swiftly capture high-resolution visible light images in a karst mountainous area. Simultaneously, the experimental zone was partitioned into four distinct areas. Utilizing the U-Net model with different parameters, such as learning rates, batch sizes, and iteration counts, we systematically compared the impact of these model parameters. After assessing the effects, the optimal training parameters were identified. Furthermore, we compared the recognition outcomes with varying sample quantities. Ultimately, the U-Net model was used for image segmentation to extract plastic film, and the area method was employed for plastic film area calculation. This facilitated the swift identification and area calculation of plastic film, leading to the following key conclusions:
5.1. Deep Learning Framework and Parameter Optimization
Leveraging the U-Net model within a deep learning framework, this study extracted plastic film areas from UAV-based visible light images. Exploring various learning rates, batch sizes, and iteration counts, this study identified optimal model parameters to enhance training effectiveness and improve plastic film extraction accuracy. The best recognition accuracy was achieved with a learning rate of 0.001 (91.37%), batch size of 10 (92.14%), and iteration count of 25 (99.84%). Therefore, for UAV image-based plastic film extraction, the optimal parameter values for learning rate, batch size, and iteration count are 0.001, 10, and 25, respectively.
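A parameter comparison of this kind can be organized as a one-at-a-time sweep, sketched below. The sketch rests on several assumptions: the function name `sweep_one_at_a_time`, the candidate values other than the reported optima, and the toy scoring function are all hypothetical. In practice the `score` callable would train and validate a U-Net with the given settings; here it is a stand-in that simply peaks at the values reported above, so the example runs without any training.

```python
import math
from typing import Callable, Dict, List

Config = Dict[str, float]


def sweep_one_at_a_time(
    base: Config,
    candidates: Dict[str, List[float]],
    score: Callable[[Config], float],
) -> Config:
    """Tune each parameter in turn, keeping the others at their current best."""
    best = dict(base)
    for name, values in candidates.items():
        # Score every candidate for this parameter with the rest held fixed.
        scored = [(score({**best, name: value}), value) for value in values]
        best[name] = max(scored, key=lambda pair: pair[0])[1]
    return best


def toy_score(config: Config) -> float:
    """Stand-in for a real train-and-validate run (NOT measured accuracy)."""
    return -(
        abs(math.log10(config["learning_rate"]) + 3)
        + abs(config["batch_size"] - 10) / 10
        + abs(config["iterations"] - 25) / 25
    )


# Hypothetical search space; only 0.001, 10, and 25 come from the text.
CANDIDATES = {
    "learning_rate": [0.1, 0.01, 0.001, 0.0001],
    "batch_size": [5, 10, 20],
    "iterations": [10, 25, 50],
}
BASE = {"learning_rate": 0.01, "batch_size": 5, "iterations": 10}

best_config = sweep_one_at_a_time(BASE, CANDIDATES, toy_score)
```

A one-at-a-time sweep needs far fewer training runs than a full grid, at the cost of missing interactions between parameters. Whether the reported optima came from such a sweep or from a joint search is not stated in this section.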
5.2. Validation of U-Net Model in Karst Highland Terrain
This study employed a U-Net model based on UAV visible light imagery for plastic film extraction and conducted comparative experiments with the traditional Support Vector Machine (SVM) method. Increasing the sample size effectively improved the training performance of the U-Net model and, consequently, the accuracy of plastic film identification. With a sample size of 800, the U-Net model demonstrated an area accuracy of 91%, a patch quantity accuracy of 96.38%, an IOU of 85.89%, and an F1-score of 94.20%. During training, there was a 24.1% increase in area accuracy, a 17.6% increase in patch quantity accuracy, and improvements of 13.76% and 8.31% in IOU and F1-score, respectively. These results validate the superiority of the U-Net model in plastic film identification. Compared with the SVM method, the U-Net model exhibited higher area accuracy (increased by 1.10%), patch quantity accuracy (increased by 20.42%), IOU (increased by 9.94%), and F1-score (increased by 5.87%). These data further confirm the excellent performance of the U-Net model in plastic film identification and provide an important reference for the future optimization of model training and the improvement of plastic film identification.
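For readers unfamiliar with the two pixel-level metrics reported above, the sketch below computes IOU and F1-score from binary labels using their standard definitions. The function names are hypothetical, and the exact evaluation protocol of this study (for example, whether scores are computed per image or over the whole test area) is not given in this section, so the code illustrates the definitions only.

```python
from typing import Sequence, Tuple


def confusion_counts(pred: Sequence[int], truth: Sequence[int]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives).

    Labels are flattened binary masks: 1 = plastic film, 0 = background.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return tp, fp, fn


def iou(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = tp + fp + fn
    # Convention assumed here: two empty masks agree perfectly.
    return tp / denom if denom else 1.0


def f1_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """F1-score: 2 * TP / (2 * TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0
```

For the toy masks `pred = [1, 1, 1, 0, 0, 0]` and `truth = [1, 1, 0, 1, 0, 0]`, there are two true positives, one false positive, and one false negative, giving an IOU of 0.5 and an F1-score of about 0.667.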
5.3. UAV Remote Sensing in Small-Scale Crop Recognition
In remote sensing identification studies in fragmented and small-scale agricultural geospatial contexts, UAV remote sensing holds vast application prospects and is poised to become an indispensable means of aerial remote sensing. This study explores the applicability of UAV visible light images in detecting plastic film mulch (PFM) in a karst mountainous area, considering the region’s characteristics of cloudy and misty weather, fragmented crop planting areas, and strong PFM heterogeneity. The proposed method features ease of operation, automation, and high accuracy, meeting the requirements for PFM detection in fragmented terrains, thus boasting broad application prospects. Moreover, by extracting and identifying agricultural PFM, accurate calculations of the covered area and distribution of PFM can be obtained, providing methodological references for PFM recycling and management. Additionally, by identifying the areas covered by PFM, the crop planting area in the region can be inferred, thereby offering data support for agricultural production. The selected parameters of the U-Net model and the sample dataset in this research meet the requirements for precise PFM identification in karst mountainous areas characterized by significant terrain undulations and fragmented spatial distribution of cultivation. This validates the applicability of the U-Net model in PFM identification in karst mountainous areas. Furthermore, this method can assist in monitoring land use and understanding specific land utilization patterns and occupancy situations, thereby providing a scientific basis for land resource management and planning and offering research methods and a decision-making basis for agricultural environmental health assessment and green planting management in agricultural fields.
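The covered-area calculation referred to above can be illustrated with a simple pixel-counting sketch. It assumes that area is obtained by multiplying the number of film pixels in the segmentation mask by the ground area of one pixel, and that area accuracy is one minus the relative area error; neither definition is spelled out in this section, and the function names and the ground sampling distance in the usage note are hypothetical.

```python
from typing import Sequence


def mask_area_m2(mask: Sequence[Sequence[int]], gsd_m: float) -> float:
    """Area covered by film pixels (value 1), in square metres.

    gsd_m is the ground sampling distance: the side length, in metres,
    of the ground footprint of one pixel (assumed square).
    """
    film_pixels = sum(1 for row in mask for value in row if value == 1)
    return film_pixels * gsd_m ** 2


def area_accuracy(predicted_m2: float, reference_m2: float) -> float:
    """Assumed definition: 1 - |predicted - reference| / reference."""
    if reference_m2 <= 0:
        raise ValueError("reference area must be positive")
    return 1.0 - abs(predicted_m2 - reference_m2) / reference_m2
```

As a usage example, a 3 x 4 mask containing six film pixels at an assumed ground sampling distance of 0.05 m corresponds to 6 x 0.0025 = 0.015 square metres, and a predicted area of 91 against a reference of 100 gives an area accuracy of 0.91 under the definition assumed here.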