1. Introduction
Agriculture monitoring plays a significant role in ensuring food security and social stability, and in providing crop yield predictions to farmers and information to decision makers for policy and planning purposes [1]. Global agriculture monitoring systems require large area cropland information as a key input to estimate crop yield and identify cropping patterns [2,3]. The acquisition of consistent cropland information over large areas relies heavily on earth observation data to describe the precise location of croplands on the earth’s surface. Since the early 1990s, satellite imagery has been used to produce cropland extent maps because of its consistent, timely, and systematic observations. Examples of previous cropland datasets include the Global Map of Irrigation Areas (GMIA), the Global Map of Rain-fed Cropland Areas (GMRCA), the Global Monthly Irrigated and Rain-fed Crop Areas (MIRCA2000), the Global Rain-fed, Irrigated, and Paddy Croplands (GRIPC), and the Moderate Resolution Imaging Spectroradiometer (MODIS) Cropland dataset [4].
Recently, with the increased accessibility of advanced computing platforms for processing large datasets, the GFSAD (Global Food Security-support Analysis Data) Project produced cropland extent datasets with improved spatial and thematic detail compared to previous cropland extent mapping efforts. Three new GFSAD cropland extent maps were created separately at three different spatial resolutions (1 km, 250 m, and 30 m) using Landsat and MODIS imagery, along with other existing cropland data, for the nominal year 2015 [5,6,7,8,9,10,11,12,13,14,15,16,17,18]. These datasets are available for download through Data Pool, DAAC2Disk, and NASA Earthdata Search. It is well known that mapping cropland areas at different spatial resolutions can result in large differences in the estimates of cropland area and spatial extent [19,20]. Therefore, these three GFSAD cropland extent maps must be assessed and compared at both the global and regional scale to establish their quality and reliability as the base map for generating higher level cropland products such as crop type and crop intensity maps [21]. These maps allow both large area (i.e., global) comparisons between the different spatial resolutions, including the identification of similarities and differences, and a determination of their suitability for more regional analysis, especially when considering different agriculture field sizes [19]. Consequently, the three maps should also be assessed and compared with respect to agriculture field size.
Previous attempts at rigorous accuracy assessment of large area cropland extent maps have been very limited. Considerable ambiguity exists in the implementation and interpretation of large area thematic map accuracy assessment. Individual measures and guidelines for assessing thematic map accuracy are well established in the literature [22,23,24,25]. However, these guidelines are often not followed because of various limitations in the assessment process (e.g., thematic resolution, geo-location accuracy, and availability of reference data) [26]. The biggest limitation in the large area accuracy assessment process is the availability of valid reference data. As such, large area assessment efforts have mostly relied on insufficient, sparsely distributed reference data [27,28,29,30,31]. Assessments performed with such limited reference datasets reported overall accuracies ranging from 66% to 78%, with considerably lower accuracies of between 10% and 50% for the cropland class [32,33].
Cropland reference data are extremely limited in most parts of the world, resulting in insufficient sample sizes and thus in an inability to assess the accuracy of large area thematic maps [34,35,36,37]. Recently, a few global reference datasets (e.g., FAO-GFRA (Food and Agriculture Organization Global Forest Resources Assessments) and the Geo-wiki sample set) have been developed to assess global land cover maps [28,38]. Despite the increasing number of initiatives to collect reference datasets that are freely available in the public domain, such as Geo-wiki and GOFC-GOLD (Global Observation of Forest and Land Cover Dynamics), cropland reference datasets are still lacking. More work must be done to create additional global cropland reference datasets. Any new cropland reference dataset must be generated using an appropriate sampling design based on the inclusion probabilities of crop and no-crop areas to assess these cropland extent maps [39]. If the inclusion probabilities of crop and no-crop areas are ignored, a significant bias is likely to occur, resulting in a non-proportional and insufficient number of samples. Unless the reference data represent the entire cropland distribution, accuracy results will not be statistically valid or meaningful.
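The effect of ignoring inclusion probabilities can be illustrated with a minimal Python sketch. It assumes a stratified design in which the strata are the mapped crop and no-crop classes and an equal number of samples is drawn from each; the function names, area weights, and sample counts are hypothetical and are not taken from the GFSAD assessment. Pooling all samples as if they were drawn with equal probability gives a different overall accuracy than weighting each stratum by its share of the mapped area.

```python
def naive_overall_accuracy(strata):
    """Pool all samples, ignoring how likely each stratum was to be sampled."""
    correct = sum(n_correct for _, n_correct, _ in strata)
    samples = sum(n_samples for _, _, n_samples in strata)
    return correct / samples


def stratified_overall_accuracy(strata):
    """Weight each stratum's sample accuracy by its share of the mapped area."""
    total_weight = sum(weight for weight, _, _ in strata)
    weighted = sum(weight * n_correct / n_samples
                   for weight, n_correct, n_samples in strata)
    return weighted / total_weight


# Hypothetical region: crop covers 10% of the map, no-crop covers 90%.
# Each tuple is (area weight, correctly mapped samples, total samples).
strata = [
    (0.10, 40, 50),  # crop stratum
    (0.90, 48, 50),  # no-crop stratum
]

naive = naive_overall_accuracy(strata)
weighted = stratified_overall_accuracy(strata)
print(f"naive: {naive:.3f}, area-weighted: {weighted:.3f}")
```

With these made-up numbers the pooled estimate understates the overall accuracy, because the rare crop stratum is over-represented in the sample relative to its area.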
To perform the assessment on a global scale, the sampling design needs to be easy to implement and capable of accounting for the differing proportions of the map categories (i.e., crop and no-crop) in different continents [40,41]. In simple random sampling, each sample of crop and no-crop class has an equal and independent chance of being selected [42]. However, such sampling designs might result in insufficient numbers of samples in the low crop proportion regions of different continents. Where possible (e.g., with Landsat or Sentinel imagery), a homogeneous cluster of 3 × 3 pixels should be used as the sampling unit to account for the positional error at each location [42]. For coarser-resolution imagery (e.g., MODIS), it is difficult to find large homogeneous regions, and therefore a single coarse-resolution pixel is used as the sampling unit in these situations. The goal is to select the best sampling unit to ensure that only thematic error is considered in the accuracy measures and not error due to mis-registration or positional inaccuracy.
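The homogeneous 3 × 3 sampling unit described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the map is a small nested list of class labels (1 = crop, 0 = no-crop), and the helper names `is_homogeneous_3x3` and `candidate_sample_units` are hypothetical, not part of any published GFSAD tooling.

```python
def is_homogeneous_3x3(grid, row, col):
    """True if the 3 x 3 window centred on (row, col) is inside the grid
    and every pixel in it carries the same class label as the centre."""
    n_rows, n_cols = len(grid), len(grid[0])
    if row < 1 or col < 1 or row > n_rows - 2 or col > n_cols - 2:
        return False  # the window would extend past the map edge
    centre = grid[row][col]
    return all(grid[r][c] == centre
               for r in range(row - 1, row + 2)
               for c in range(col - 1, col + 2))


def candidate_sample_units(grid):
    """List the centre pixels of all homogeneous 3 x 3 windows."""
    return [(r, c)
            for r in range(len(grid))
            for c in range(len(grid[0]))
            if is_homogeneous_3x3(grid, r, c)]


# Hypothetical 4 x 5 map: a crop block in the upper left, no-crop elsewhere.
toy_map = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

print(candidate_sample_units(toy_map))
```

Only windows that lie entirely within one class qualify, so a reference sample placed at such a centre keeps the same label even if the map is shifted by one pixel in any direction.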
Another limitation of large area assessments is that most have reported just a single accuracy value for the entire world. This approach does not provide any information about the accuracy results for different continents or regions. Given the differences in crop growing strategies and patterns between continents, a more appropriate assessment strategy must be used. Such an assessment strategy has recently been used to assess the cropland extent maps of three different regions (i.e., the United States, Africa, and Australia) [43]. Yadav and Congalton [43] described an appropriate assessment strategy for these three regions based on their cropland distribution and reference data availability. This strategy employed a stratification approach to divide the entire world into homogeneous regions, and a sample simulation analysis was conducted to determine the appropriate sample size. Implementing a stratification approach prior to the actual assessment provided an effective means of evaluating the cropland extent maps by considering the diverse cropping patterns of different continents [37].
The most widely accepted approach for reporting thematic map accuracy results is the error matrix [22,42]. The error matrix presents the comparison of reference samples with the map and allows computation of overall, producer’s, and user’s accuracy [44]. This assessment technique can be used to report these accuracy measures for different regions. In addition, there are some regions, such as the United States and Canada, where a reference cropland data layer (e.g., CDL in the US) exists for comparing the entire map on a pixel-by-pixel basis. Such comparison results can then also be presented in the form of a similarity analysis, which represents the spatial distribution of agreement and disagreement between the map and the reference map.
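The three accuracy measures derived from an error matrix can be made concrete with a short Python sketch. It assumes the common convention of map classes in rows and reference classes in columns; the function name `accuracy_measures` and the sample counts are hypothetical and are used only for illustration.

```python
def accuracy_measures(matrix, classes):
    """Compute overall, producer's, and user's accuracy from an error matrix.

    matrix[i][j] is the number of samples mapped as classes[i] (rows)
    whose reference label is classes[j] (columns). Every class is assumed
    to have at least one mapped and one reference sample.
    """
    total = sum(sum(row) for row in matrix)
    overall = sum(matrix[i][i] for i in range(len(classes))) / total
    producers, users = {}, {}
    for i, name in enumerate(classes):
        row_total = sum(matrix[i])                   # samples mapped as this class
        col_total = sum(row[i] for row in matrix)    # reference samples of this class
        users[name] = matrix[i][i] / row_total       # complement of commission error
        producers[name] = matrix[i][i] / col_total   # complement of omission error
    return overall, producers, users


# Hypothetical two-class error matrix (rows: map, columns: reference).
classes = ["crop", "no-crop"]
error_matrix = [
    [40, 10],  # mapped as crop
    [5, 45],   # mapped as no-crop
]

overall, producers, users = accuracy_measures(error_matrix, classes)
print(f"overall: {overall:.2f}")
print(f"producer's (crop): {producers['crop']:.3f}")
print(f"user's (crop): {users['crop']:.2f}")
```

Reporting the per-class measures alongside the overall value matters here because a high overall accuracy can hide a poorly mapped cropland class when no-crop dominates the region.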
Finally, in addition to evaluating each of the three GFSAD cropland extent maps separately, it is useful to compare the maps to evaluate the effectiveness of each spatial resolution for specific user requirements [45,46]. Mapping at a variety of spatial resolutions produces inconsistencies, differences, and uncertainties among the estimated cropland areas that can be visualized on the map and that result from the spatial distribution of cropland patches in different cropping patterns [47,48]. Therefore, existing and newly developed cropland extent maps must be compared with each other to investigate, determine, and recommend the appropriate spatial resolution for agriculture monitoring given different agriculture field sizes and patterns. Many comparative studies have been performed for existing datasets such as GLC2000, MODIS, International Geosphere-Biosphere Program (IGBP), and National Land Cover Dataset (NLCD) [49,50]. Despite identifying spatial discrepancies and inconsistencies, particularly in the cropland class on a global scale, these comparison studies have not focused on the adequacy of different spatial resolutions given different agriculture field sizes [1,51]. The uncertainties in the cropland class of these existing maps could be due to: (1) absence of precise spatial location of the cropped areas; (2) coarse resolution of the map products, with significant uncertainties in areas, locations, and detail; and (3) invalid assessments of these cropland extent maps.
The recent production of the three GFSAD cropland extent maps promises to provide more detailed and accurate cropland information with a high degree of certainty in the geographic location of the cropland areas. Therefore, these three cropland extent maps must be assessed with an appropriate large area accuracy assessment strategy describing their use, reliability, and quality for different continents. However, cropland extent mapping in different agriculture field sizes can be inconsistent at different spatial resolutions because of the spectral similarity of different fields and the differences among similar agriculture fields [52]. The similarities and differences in the characteristics of agriculture landscapes (i.e., crop area proportions and landscape metrics) must also be explored to provide specific recommendations for when to apply the three cropland extent maps with respect to different field sizes [53]. This kind of regional comparison can be effectively implemented by using a similarity matrix, which is based on a contingency table approach like an error matrix, and by categorizing landscape metrics such as landscape proportion for different agriculture field sizes.
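A similarity matrix of the kind described above can be sketched in Python as a 2 × 2 contingency table between two maps. The sketch assumes both maps have already been resampled to a common grid and coded as 1 = crop and 0 = no-crop; the function names and the toy 3 × 3 study site are hypothetical and do not reproduce the study's data.

```python
def crop_proportion(grid):
    """Fraction of pixels labelled as crop (1) in a binary map."""
    cells = [value for row in grid for value in row]
    return sum(cells) / len(cells)


def similarity_matrix(map_a, map_b):
    """2 x 2 contingency table of two co-registered binary maps.

    Rows index map_a and columns index map_b; index 0 is crop and
    index 1 is no-crop.
    """
    table = [[0, 0], [0, 0]]
    for row_a, row_b in zip(map_a, map_b):
        for a, b in zip(row_a, row_b):
            table[1 - a][1 - b] += 1
    return table


def overall_similarity(table):
    """Share of pixels on which the two maps agree."""
    total = sum(sum(row) for row in table)
    return (table[0][0] + table[1][1]) / total


# Hypothetical study site mapped at two resolutions, on a common grid.
map_fine = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
map_coarse = [
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 0],
]

table = similarity_matrix(map_fine, map_coarse)
print(table)
print(crop_proportion(map_fine), crop_proportion(map_coarse))
print(overall_similarity(table))
```

The off-diagonal cells show where one map labels crop and the other does not, which is the information needed to judge whether a coarser map is an adequate substitute in a given field size class.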
Therefore, the two primary objectives of this study are to perform an accuracy assessment of the three GFSAD cropland extent maps of the world and to evaluate the impact of different agriculture field sizes on global and regional agriculture monitoring. To accomplish the first objective, a large area accuracy assessment was conducted separately for each spatial resolution GFSAD cropland extent map. The three assessments were performed using a valid, large reference dataset collected from different sources, with sampling simulations used to choose the appropriate sample size for each region [42,43]. To accomplish the second objective, regional comparisons of the three GFSAD cropland extent maps were performed by calculating similarity in crop area proportions and landscape heterogeneity in nine random 10 km by 10 km study sites selected in each of three agriculture field size classes (i.e., small, medium, and large). The specific goal of this paper is to provide an appropriate assessment and comparison of the three GFSAD cropland extent maps that resolves many of the shortcomings and uncertainties of other cropland extent mapping efforts. Finally, the results of this analysis are used to recommend when it is appropriate to apply each map given the different agriculture field sizes for each continent of the world.
5. Conclusions
Over the last few years, the production of global cropland datasets has increased rapidly. With the recent release of the three GFSAD cropland extent maps produced by different researchers, their quality and reliability must be evaluated at global and regional scales. Previous assessments of global thematic maps have been limited by the small size of the reference data used, which resulted in the reporting of a single accuracy value for the entire world. The large area accuracy assessment of the three GFSAD cropland extent maps performed here demonstrates an appropriate sampling strategy for collecting the large cropland reference dataset necessary to achieve meaningful accuracy results not only for the entire world, but also by continent. The assessment report of the GFSAD30m cropland extent map is available at https://lpdaac.usgs.gov [63].
When a global cropland extent map needs to be assessed for individual continents, the prevalent cropland distribution, area, spatial extent, and pattern of each region must be considered to provide meaningful accuracy measures. Accuracy assessments performed in regions with varying crop proportions showed that a statistically valid, proportional, and random sampling design resulted in an insufficient sample size and low accuracy measures for regions with low cropland proportion. Therefore, depending on the objective of assessing the crop and no-crop maps of these low crop proportion regions, alternative sampling designs should be employed to achieve meaningful accuracy measures.
In addition to the accuracy measures estimated for the different continents, the overall accuracy of the three cropland extent maps of the entire world shows that cropland extent mapping becomes more accurate at higher spatial resolutions. Consequently, the GFSAD30m cropland extent map, with its higher overall accuracy, would potentially be preferable to the coarser-resolution maps. However, despite the differences in overall accuracy, the different-resolution cropland extent maps need to be compared in order to provide recommendations as to when to apply which map in different agriculture field sizes. Comparing the characteristics of the cropland landscape mapped at the three spatial resolutions is a very effective way of establishing the similarity among the cropland extent maps. The similarities between the three cropland extent maps were used to develop the following three main recommendations for farmers, crop yield predictors, food market researchers, and policy- or decision-makers in choosing a suitable cropland extent map for agriculture monitoring in different agriculture field sizes:
The cropland extent maps developed at 30 m spatial resolution must be used for the small agriculture fields of Africa. However, the cropland extent maps developed at either 30 m or 250 m spatial resolution can be used for agriculture monitoring in the small agriculture fields of South Asia.
The cropland extent maps developed at either 30 m or 250 m spatial resolution are recommended for the medium agriculture fields of the United States for different agriculture monitoring purposes.
The cropland extent maps developed at 30 m, 250 m, and 1 km spatial resolutions (i.e., any of the three cropland extent maps) can be used for the large agriculture fields of Australia and the United States.