A Methodology for Georeferencing and Mosaicking Corona Imagery in Semi-Arid Environments

Brooke Iacone; Ginger R. H. Allington; Ryan Engstrom

doi:10.3390/rs14215395

¹ Department of Geography, The George Washington University, Washington, DC 20052, USA

² Department of Natural Resources and Environment, Cornell University, Ithaca, NY 14853, USA

* Author to whom correspondence should be addressed.

Remote Sens. 2022, 14(21), 5395; https://doi.org/10.3390/rs14215395

Abstract

High-resolution Corona imagery acquired by the United States through spy missions in the 1960s presents an opportunity to gain critical insight into historic land cover conditions and expand the timeline of available data for land cover change analyses, particularly in regions such as Northern China where data from that era are scarce. Corona imagery requires time-intensive pre-processing, and the existing literature lacks the detail required to replicate these processes easily. This is particularly true in landscapes where dynamic physical processes, such as aeolian desertification, reshape topography over time, or in regions with few persistent features for use in georeferencing. In this study, we present a workflow for georeferencing Corona imagery in a highly desertified landscape that contained mobile dunes, shifting vegetation cover, and few reference points. We georeferenced four Corona images from Inner Mongolia, China using uniquely derived ground control points and Landsat TM imagery with an overall accuracy of 11.77 m, and the workflow is documented in sufficient detail for replication in similar environments.

Keywords:

Corona imagery; remote sensing; georeferencing; semi-arid ecosystems

1. Introduction

Satellite imagery plays a key role in analyzing land cover change and is important for documenting disturbances [1,2,3,4], assessing the impact of policy decisions [5,6], and predicting the future health and structure of landscapes over time. However, the regular acquisition of moderate-resolution satellite imagery did not begin until 1972 with the launch of Landsat 1, and quality imagery is sparse pre-Landsat 5 (1984). Thus, much of the work surrounding land cover change only extends back to that time and we have much less understanding of landscapes prior to this, particularly outside of the United States. In the absence of repeated satellite data, other sources of land cover information, such as historic maps, aerial photography, and field data can be used to derive additional time points of data and establish baseline conditions for comparison to contemporary data. However, aerial photography must be digitized, and is not always available, particularly outside of the U.S.

One such source of historic imagery is Corona imagery, which is high spatial resolution photographic imagery collected via covert satellites by the United States in the 1960s and 1970s. The imagery has been declassified and scanned and is now available to the public. However, Corona imagery remains underutilized due to the challenges associated with preparing it for use. The scanned and digitized images contain no spatial reference information and must be georeferenced. Additionally, inherent distortions from the acquisition process must be corrected before the imagery can be analyzed. Many regions have changed significantly since the acquisition of Corona imagery or contain very few persistent ground features, often making this process difficult and time-consuming. In particular, locating a sufficient number of ground features in very rural areas presents a challenge, particularly when ground features are either too small in scale to detect with moderate-resolution reference imagery, or are not evenly distributed throughout the landscape. Very few studies that have utilized Corona imagery detail the georeferencing process in enough detail to replicate. Several recent studies have utilized structure-from-motion methods that take advantage of contemporary digital elevation models to fit historic images (e.g., [7]). Such methods cannot be applied to dynamic physical landscapes where topographic features shift over time, such as floodplains or dune fields. This paper demonstrates a workflow for georeferencing Corona images that can be used by researchers working in highly dynamic landscapes with little to no reference data. Here, we document the approach for pre-processing Corona imagery for a region of Northern China that has undergone significant modification since the 1960s. The final product can be used in future research to provide a set of data on landscape conditions that predates anything available at this time.

1.1. Corona Imagery

The Corona program consisted of 144 reconnaissance satellites produced and operated by the National Reconnaissance Office from June 1959 to May 1972 in order to gain military intelligence during the Cold War. Single-band panchromatic images were acquired worldwide with a particular focus on political hotspots such as the Soviet Union, China, and the Middle East at spatial resolutions varying from 1.8 m to 12.2 m [8,9]. The photographs were acquired on film via satellite and later scanned into eight-bit panchromatic images [9].

During each mission, the imaging satellite orbited the earth several times, acquiring photographs that cover approximately 13.8 km by 188 km on the ground (2595 square km in area) [3]. Later satellites were equipped with two panchromatic cameras, one facing forward and the other backward (aft) at 15 degrees off-nadir in either direction, with the aft camera roughly six frames behind the front camera [10]. Because of this dual-angle system, it is possible to utilize both camera angles to model topography in the landscape and generate digital elevation models [11,12,13]. The final camera design, KH-4B, is widely considered to be the most useful for modern analysis because of the high quality of the imagery [13]. The KH-4B camera operated over seventeen missions between September 1967 and May 1972 and returned images with an approximate ground resolution of 1.8 m (six feet) at nadir [9].

The images taken by the Corona satellites remained classified as top secret until 1992. In 1995, President Bill Clinton signed an executive order declassifying over 800,000 images. The film is digitized with high-performance film scanners at 1800 dots per inch (dpi), and the scanned images are accessible to the public through the United States Geological Survey’s Earth Explorer [8].

1.2. Georeferencing Corona Imagery

Corona images acquired from the USGS have no spatial reference. This means that the images lack the necessary geo-corrections required to display them over their real-world location using mapping software. In some applications, high geo-location accuracy may only have minor implications for interpreting the imagery, such as when only visual interpretation is required. However, in order to utilize Corona imagery in location-specific analyses such as time series and change detection analyses, it is important to accurately georeference the images because mismatches in pixel alignment will result in missing or false detection of change. In such cases, the level of accuracy required to use the georeferenced Corona images will depend on the spatial resolution of the other datasets in the analysis [3]. Georeferencing Corona imagery is useful outside of land cover change analyses as well and has frequently been used in identifying archeological features [11,14,15].

Accurate georeferencing of Corona images is difficult due to the challenges presented when overlaying scanned photo strips on real-world terrain. Several types of geometric and brightness distortions are present in the imagery from the acquisition, including multiple sources of non-linear distortions. The images tend to have fewer distortions at the center of the image, and greater distortions along the edges. These distortions are unique to Corona imagery and more difficult to correct than typical non-linear distortions [16]. Additionally, the flattened images do not account for varying topography [13,17]. Several studies have analyzed and proposed methodologies to account for the distortions present in the imagery from acquisition [13,17,18]. Multiple studies involving Corona imagery detail a similar approach to georeferencing, which co-registers the image using ground control points from Landsat MSS or TM imagery at the nearest future date [3,18,19,20]. Zhang et al. utilized recent high-resolution Google imagery to help identify stable ground features such as road intersections in Corona and Landsat MSS imagery [18]. While most studies opt for remote sensing software for georeferencing, such as ERDAS Imagine, a few utilize a structure-from-motion photogrammetry method of georeferencing [7,21]. Almost all studies focus on imagery collected during sixteen missions between 1967 and 1972 using the KH-4B sensor due to its high spatial resolution (1.8 m) and high image quality. Some Corona imagery that has been previously orthorectified (georeferenced and corrected using a digital elevation model) is available to the public for download as NITF files through the Corona Atlas of the University of Arkansas’ Center for Advanced Spatial Technologies, though it is worth noting that some of the imagery has been automatically orthorectified and therefore the accuracy is not guaranteed [22].

Despite the increasing number of studies that make use of Corona imagery and efforts to make it widely available for use, few studies provide a set of reproducible methods or steps for georeferencing. Further, most work has been focused on urban and agricultural landscapes, with persistent features that can be utilized to generate ground control points. It is still unclear how well Corona can be utilized in arid and semi-arid environments which are highly variable in cover and topography. This study presents a workflow for georeferencing Corona images in greater-than-average detail using ArcGIS Pro, a common and readily available software system, that others can easily replicate with the goal of reducing the time and training required to prepare Corona imagery and ultimately expand its use across many disciplines.

1.3. Objectives

The main objective of this paper is to demonstrate a repeatable workflow for preparing, georeferencing, and mosaicking Corona imagery, particularly within the context of semi-arid, highly dynamic landscapes with little to no reference data.

1.4. Study Area

The study area is the central portion of Naiman Banner (2324 km²). Located at E120°19′40″–121°35′40″, N42°14′10″–43°32′20″, Naiman Banner is situated in the eastern portion of the Inner Mongolian Autonomous Region of China and on the southern edge of the Horqin Sandy Land District approximately 700 km northeast of Beijing (Figure 1). Naiman Banner is a semi-arid ecosystem characterized by its undulating sand dunes and meadows. Due to dune movement caused by wind, the topography is dynamic and varying [23]. Elevation ranges from 300 to 400 m, with the highest elevation along the western side of the banner where the landscape is much sandier with highly mobile dune fields [23]. Elevation slopes downward from west to east, transitioning from mobile dunes to semi-fixed and fixed dunes and then to flatter, gently sloping meadows. Vegetation cover exists in a gradient across the landscape from west to east, aligned with the gradient of dune stabilization from mobile to fixed [24].

Figure 1. Satellite view of the focal study area and the location within China (red box).

Naiman Banner faces some of the most severe grassland degradation in all of Northern China [2,25], and the trajectory of degradation over time generally follows that of the Horqin Sandy Lands and Northern China, with the most intense period of degradation occurring between the mid-1980s and late-1990s [2]. In general, very little is known about the condition of landscapes in Northern China prior to the 1980s, particularly the status and extent of grasslands. Increased cultivation and grazing pressure are documented to have begun around 1978 with the opening of China’s economy and intensified further around 1985 through policies such as the Rangeland Law of 1985, meaning the earliest remotely sensed imagery that we have utilized for analysis characterizes already-disturbed landscapes [23,26,27].

There is evidence that the landscape has been recovering in recent years due to several grassland conservation programs in the region [2,4,6]. This includes the Grain for Green Program (beginning 1999), the West Development Strategies (beginning 1999), the Beijing and Tianjin Grazing Withdrawal Program (beginning 2003), and the Ecological Subsidy and Award System (beginning 2011) [5]. However, agricultural land has steadily increased, and despite restricting grazing in certain areas, the number of livestock has also increased. The groundwater table is also decreasing, which could lead to future problems associated with desertification despite all of these recent efforts [5]. Extending the timeline of available information with Corona imagery presents an opportunity to better understand the landscape prior to the most intense period of degradation in its history, giving us a better “baseline” understanding of historic conditions and the ability to more effectively analyze long-term trends in degradation.

2. Materials and Methods

2.1. Materials

Four Corona images, captured on 30 November 1970 during mission 1112, were obtained from the USGS’s Earth Explorer (Table 1). Only images from the KH-4B system (missions 1101 through 1117, excluding 1113) were considered since these missions use the dual-camera system and have the highest spatial resolutions. Images were filtered to contain less than 20% cloud coverage and intersect with the bounding coordinates of the study area. It is important to note that since the images are unreferenced, USGS approximates the location of images when displaying them in Earth Explorer. Some images that overlapped with the bounding coordinates within Earth Explorer’s search criteria did not intersect with the study area when positioned manually in mapping software. Priority was placed on finding images with the fewest visible errors. Scan lines, bright/dark spots, and blur spots were among the most commonly found errors. In addition, priority was placed on images where the study area was positioned in the center of the image rather than needing to mosaic multiple images along their vertical edges. This is because greater distortions occur towards the edges of Corona images, and this would require mosaicking along a greater number of seams. When possible, all images should also be acquired on the same date to avoid differences in image quality and possible changes in the landscape. Lastly, all images should be acquired from the same camera direction, forward or aft (backwards). For purposes of this study, either direction would have been sufficient. The dual directions have been utilized to generate digital elevation models [12]; however, this is not useful in this case given the high variability and frequent movement of the dune fields in Naiman Banner. Henceforth, unreferenced images will be referred to as strips.
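
The mission and cloud-cover criteria above can be expressed as a simple filter. The following is a minimal sketch only: the record fields (`id`, `mission`, `cloud_pct`) and the sample records are hypothetical and do not reflect the actual Earth Explorer metadata schema, and the spatial intersection and visual quality checks described above still have to be done manually.

```python
# KH-4B missions considered in this study: 1101 through 1117, excluding 1113.
KH4B_MISSIONS = set(range(1101, 1118)) - {1113}


def select_scenes(scenes, max_cloud_pct=20.0):
    """Keep scenes from the KH-4B missions with cloud cover below the limit.

    `scenes` is a list of dicts with hypothetical keys 'mission' and 'cloud_pct'.
    """
    return [
        scene
        for scene in scenes
        if scene["mission"] in KH4B_MISSIONS and scene["cloud_pct"] < max_cloud_pct
    ]


# Hypothetical candidate records, for illustration only.
candidates = [
    {"id": "scene-a", "mission": 1112, "cloud_pct": 5.0},
    {"id": "scene-b", "mission": 1113, "cloud_pct": 0.0},   # excluded mission
    {"id": "scene-c", "mission": 1112, "cloud_pct": 35.0},  # too cloudy
]
print([scene["id"] for scene in select_scenes(candidates)])
```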

Table 1. Image identifiers for the four Corona images utilized in this study for central Naiman Banner, Inner Mongolia. Image No. 1–4 refers to our internal numbering scheme for the individual strips; Corona Image ID is the metadata label obtained from the source.

Landsat Thematic Mapper (TM) and contemporary high spatial resolution basemap imagery from ESRI were used for locating stable ground control points. A cloud-free Collection 1 Tier 1 Landsat TM image captured by Landsat 5 was acquired from Earth Explorer. The closest available acquisition date to that of the Corona images was 8 December 1984. It is helpful to have an older reference image in landscapes where ground features have changed significantly between the collection of the Corona imagery and contemporary imagery. While Landsat MSS imagery could provide a date closer to that of the Corona images, the lower spatial resolution (60 m) was insufficient for locating distinguishable features in the landscape. The contemporary image was acquired from ESRI for the year 2019.

2.2. Methods

2.2.1. Preprocessing

Several preprocessing steps are required to prepare Corona imagery for georeferencing. Each strip is delivered in four separate files which, when combined, represent one Corona acquisition during the mission. Strip parts are delivered at a 180-degree rotation. Additionally, each Corona strip contains a black film border which must be trimmed before georeferencing [28]. Figure 2 details a workflow for the georeferencing process.

Figure 2. Approximate workflow from data acquisition through export of the final mosaicked image. Details for each step are outlined in the text.

The four files for each strip, referred to in this paper as parts A through D (from left to right), should be mosaicked into a single strip prior to georeferencing in order to maintain consistency across the strip. There are about six kilometers of overlapping area between strip parts. Figure 3 shows the position of the four unreferenced strips in relation to the study area. Because strip parts are only segmented due to file size, not due to differences in image acquisition, the parts should fit together near-perfectly. Since the objective is simply to join the parts back together, the mosaic operator (which defines how overlapping pixels in the mosaic are resolved) should be defined using a first or last method rather than an averaging method.

Figure 3. Location of the study area with respect to the original Corona image strips.

After trials on several platforms, including common spatial software such as QGIS and ERDAS Imagine, it was determined that ArcGIS Pro’s georeferencing interface was the most efficient software for georeferencing Corona imagery. Here, version 2.8 of ArcGIS Pro was utilized [29]. ArcGIS Pro’s georeferencing toolkit allows the user to move, scale, and rotate the unreferenced strips over other image datasets and toggle between images quickly when placing ground control points. Additionally, the impact of ground control points is visible on the reference immediately as you select them, allowing the ability to gauge the quality of each point. For these reasons, ArcGIS Pro was the preferred interface for georeferencing, but many of the methods detailed here can be implemented in other spatial software, such as QGIS.

Parts A through C of each strip were scaled and rotated over contemporary imagery to approximate their location. The exact location of each strip can be further refined once the parts are mosaicked together. Spatial reference is not required to mosaic the strip parts, but the underlying imagery may be helpful for determining their relative overlap. Part D of each strip was located beyond the bounds of the study area and was therefore excluded from the mosaic. However, it is important to note that there should be a suitable buffer of imagery surrounding the desired study area in order to account for warped edges caused by georeferencing. If the study area boundary lies close to the edge of the strip, additional pieces from surrounding strips should be included in the georeferencing process.

After mosaicking, the boundaries of the four strips (excluding the black film borders) were each traced manually to produce a clipping geometry for trimming the borders away. It is important to trim the borders prior to georeferencing in order to properly visualize the alignment of the overlapping areas between strips [28]. Finally, the location of each strip was further refined over the reference imagery.

2.2.2. Georeferencing

Once each Corona strip was positioned as accurately as possible over the reference imagery (accounting for distortions that will prevent ground features from perfectly aligning), ground control points were used to refine the spatial reference. Each Corona strip should be georeferenced individually rather than as a final mosaic. This will result in a more accurate mosaic since the georeferencing transformations can account for unique distortions in each strip.

When possible, it is advised to begin the georeferencing process on the strip with the largest number of potential quality ground control points, particularly where the largest number of persistent structures or roadways exist. Once this strip is georeferenced, it can be used as a reference for other overlapping, unreferenced strips. While not a typical approach for georeferencing in remote sensing, Hamandawana et al. proposed this methodology for Corona imagery because of the unique challenges associated with detecting a sufficient number of quality ground control points in many Corona scenes [17]. They recommend georeferencing a “primary” strip and then, if necessary due to a lack of quality ground control points in the overlapping strip, matching features in the referenced Corona strip to features in the unreferenced strip within the area where they overlap [17]. For this study area, the most practical primary strip to build from was image four (the southernmost image) because it contains significantly more persistent features that could serve as ground control points, including the present-day city of Naiman. Subsequent images were georeferenced using the approach from Hamandawana et al., moving from image four to image one in reverse chronological order and making use of the overlapping area with the previously referenced image when necessary [17].

Based on reports from previous studies [12,21,30], it was determined that at least 20 ground control points should be collected for each strip for a total of 80 ground control points across four strips. In landscapes with an abundance of potential ground control points, more may certainly be collected; however, many studies have achieved sufficient accuracy with between 20 and 30 ground control points per strip [12,21,30]. A minimum of 10 ground control points is required to georeference with a third-order polynomial [30]. Distinct features such as built-up urban areas and road intersections are best for georeferencing because they are typically static through time. Corona imagery presents a unique challenge in that a majority of useful features for ground control points did not exist in this area in 1970. Additionally, utilizing Landsat TM imagery as a reference point is difficult because of its low spatial resolution. Many of the built-up areas and road crossings present in Naiman Banner in 1970 are not visible in the Landsat TM imagery and therefore cannot reliably be referenced in the Corona imagery. Modern, very high spatial resolution imagery is also difficult to utilize due to the dynamic and changing landscape in Naiman Banner, with shifting dunes and rapidly expanding urban areas. Locating reliable ground control points was time-consuming and required a combination of both reference imagery and making use of overlapping, previously georeferenced strips. Examples of ground control points using several imagery dates can be seen in Figure 4.
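
The minimum of 10 ground control points for a third-order polynomial follows from the number of coefficients in a two-variable polynomial of that order. The short sketch below makes the count explicit; the function name `min_gcps` is illustrative and is not part of any georeferencing software.

```python
def min_gcps(order: int) -> int:
    """Number of coefficients in a two-variable polynomial of the given order.

    Each output coordinate needs one coefficient per term x**i * y**j with
    i + j <= order, so at least this many ground control points are required.
    """
    if order < 1:
        raise ValueError("order must be at least 1")
    return (order + 1) * (order + 2) // 2


for order in (1, 2, 3):
    print(f"order {order}: at least {min_gcps(order)} ground control points")
```

Collecting 20 or more points per strip, as done here, leaves the fit overdetermined so that individual point errors are averaged out rather than fitted exactly.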

Figure 4. Example ground control points (green dots) overlaid on Corona imagery and two separate reference datasets.

Ground control points were categorized into one of four types: roadways, overlapping areas, structures, and natural features. Roadways, which consisted predominantly of intersections in small built-up areas that have grown into larger settlements since the acquisition of Corona imagery, accounted for 52.5% of all ground control points. In some instances, informal path networks connecting settlements and grazing areas were also useful. Overlapping strip features accounted for 18.75% of ground control points. These features generally consisted of distinct natural edges. Natural features such as bodies of water or sand dunes are not normally recommended as ground control points because it is less certain that they have not shifted position between image dates; they were used in this case because they were selected from other Corona images acquired at the same time. This type of ground control point should be used sparingly since it relies on the accuracy of the previously referenced strip, but it was impossible to find enough persistent ground control points throughout the four strips to avoid it entirely. Similarly, natural features in non-overlapping areas should be used sparingly because features such as dunes and bodies of water vary over time. Natural features accounted for 13.75% of ground control points in this study and primarily consisted of persistent edges of agricultural fields. Structures accounted for the remaining 13.75% of ground control points. These consisted predominantly of building corners and one persistent grazing enclosure. The number of ground control points of each type can be found in Table 2.

Table 2. Distribution of sampling across different types of ground control points.

Focus was placed on selecting ground control points within and closely surrounding the study area. When ground control points were selected evenly throughout the strips, including the corners and edges of the images, the overall accuracy of the georeferencing decreased. This is because distortions are greater towards the edges of the strips and the polynomial must compromise accuracy to fit these distortions. Since the study area was centrally located in the strips and the edges would be removed, we decided to place ground control points in such a way that would overly warp and compromise the far edges of the strips in exchange for greater accuracy within the study area. However, some points had to be placed outside of the study area to anchor it so that warping did not occur within the boundaries. The location of the final ground control points throughout the strips can be found in Figure 5.

Figure 5. Location of ground control points.

A third-order polynomial transformation was applied to each strip to fit the ground control points. A second- or third-order polynomial transformation is recommended when there are significant distortions in the imagery because the strip can be bent and curved rather than just shifted, rotated, and scaled [31]. However, as mentioned previously, it is important to anchor strip edges within the study area when using a higher-order polynomial because extreme edge warping may result in gaps between strips.
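
To illustrate what fitting a third-order polynomial to ground control points involves, the following is a minimal least-squares sketch in pure Python. It is not ArcGIS Pro's implementation, which is handled internally by the software; the function names, the use of normalized coordinates in the range -1 to 1, and the synthetic warp are all assumptions made for illustration. One such fit is needed for each output coordinate (easting and northing).

```python
def monomials(x, y, order=3):
    """Terms x**i * y**j with i + j <= order (10 terms when order is 3)."""
    return [x**i * y**j for i in range(order + 1) for j in range(order + 1 - i)]


def solve(a, b):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_polynomial(src, dst, order=3):
    """Least-squares coefficients mapping source (x, y) pairs to one target coordinate."""
    rows = [monomials(x, y, order) for x, y in src]
    k = len(rows[0])
    if len(rows) < k:
        raise ValueError(f"need at least {k} control points, got {len(rows)}")
    # Normal equations: (A^T A) c = A^T b
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * d for r, d in zip(rows, dst)) for i in range(k)]
    return solve(ata, atb)


def apply_polynomial(coeffs, x, y, order=3):
    """Evaluate the fitted polynomial at a source location."""
    return sum(c * t for c, t in zip(coeffs, monomials(x, y, order)))


def warp(x, y):
    """Synthetic, hypothetical distortion used only to generate example targets."""
    return 2 + 3 * x - y + 0.5 * x * y + 0.1 * x**3


# 25 synthetic control points on a regular grid in normalized coordinates.
gcps = [(-1 + 0.5 * i, -1 + 0.5 * j) for i in range(5) for j in range(5)]
coeffs = fit_polynomial(gcps, [warp(x, y) for x, y in gcps])
print(apply_polynomial(coeffs, 0.3, -0.7), warp(0.3, -0.7))
```

Because the cubic terms are free to bend the strip, locations far from any control point are poorly constrained, which is consistent with the edge warping described above.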

2.2.3. Mosaicking Georeferenced Strips

Once each Corona strip was georeferenced within the acceptable measure of accuracy within ArcGIS Pro (<15 m), the images were combined using ArcGIS Pro’s Mosaic Dataset tool. Mosaic datasets are used within ESRI software to manage, analyze, and display multiple rasters as one dataset, accompanied by a suite of tools to enhance the mosaic such as color balancing and generating seamlines. First, the four georeferenced images were imported into an empty mosaic dataset. A shapefile with the boundary of the study area was imported into the mosaic dataset as the boundary parameter. The boundary was then applied as a mask to the mosaic, removing the edges of the images that fell outside of the study area. This eliminates the need to clip the mosaic later during export, which saves computational power.

Next, brightness variations were accounted for using color-balancing techniques. Color balancing the mosaic is important because the range of digital numbers (DNs) in each strip can vary, impacting both the appearance and statistics of the final mosaic. Edge darkening and blurring increase towards the outer edges, likely due to the oblique angle of the camera for features distant from the nadir, and other bright and dark spots may occur due to changes in the sun angle [17]. However, since we used high-quality images that contained minimal spotty distortions, and because the edges were being clipped away from the study area, no major distortions needed to be accounted for.

In addition to a smoother visual output, color balancing ensures that the distribution of pixel values is consistent across the mosaic. Color balancing applies an algorithm that manipulates pixel values across several images to equalize them to one another. Here, a histogram matching technique was chosen based on a visual inspection of outputs using different balancing methods. A histogram match changes each pixel’s value according to its relationship with a histogram composed of all of the input images or the histogram of one target input image [32]. This approach is recommended when all of the images have similar histograms and features are generally consistent throughout the images, which was the case for this study area. Since the images in this study did not contain inconsistent coloring that would need to be corrected manually, a uniform color-balancing approach for the entire mosaic was appropriate. However, where uneven bright or dark spots occur, Hamandawana et al. propose several methodologies for accounting for them [17]. We recommend thoroughly assessing the underlying statistics of each image in the mosaic as well as in the resulting color-balanced mosaic when the pixel values of the mosaic will be utilized for further analysis, such as for deriving features. In applications where the objective is simply to achieve a smooth visual output, a uniform color-balancing approach is generally appropriate.
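
The idea behind histogram matching can be sketched with cumulative distributions: each source value is mapped to the reference value that sits at the same cumulative fraction. The code below is a generic textbook-style sketch operating on flattened lists of pixel values; it is an assumption that this resembles what ArcGIS Pro's color-balancing tool does internally, and the function names are illustrative.

```python
from bisect import bisect_left
from collections import Counter


def cdf(values):
    """Return the sorted unique values and their cumulative fractions."""
    counts = Counter(values)
    levels = sorted(counts)
    running, fractions = 0, []
    for level in levels:
        running += counts[level]
        fractions.append(running / len(values))
    return levels, fractions


def histogram_match(source, reference):
    """Remap source pixel values so their distribution follows the reference."""
    src_levels, src_cdf = cdf(source)
    ref_levels, ref_cdf = cdf(reference)
    lookup = {}
    for level, fraction in zip(src_levels, src_cdf):
        # First reference level whose cumulative fraction reaches the source's.
        idx = bisect_left(ref_cdf, fraction - 1e-12)
        lookup[level] = ref_levels[min(idx, len(ref_levels) - 1)]
    return [lookup[value] for value in source]


# Hypothetical digital numbers for a dark strip and a brighter target strip.
dark_strip = [0, 0, 1, 1]
target_strip = [10, 10, 20, 20]
print(histogram_match(dark_strip, target_strip))
```

Comparing summary statistics of each strip before and after matching is one way to carry out the assessment recommended above.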

Next, seamlines were generated for the mosaic dataset. Seamlines are used to sort overlapping imagery and produce a smoother transition between images. Computational methods for generating seamlines use either overlapping geometries of the images or the spectral characteristics of overlapping areas. Through experimentation, we found that generating seamlines based on the physical footprint of each image produced the smoothest result, whereas utilizing a radiometric approach was not useful due to a lack of spectral information. Once seamlines were generated, we inspected the mosaic using different image overlaps in order to determine an order that produced the smoothest seams. An overlap ranking field was created and populated in the mosaic dataset’s attribute table to assign the image order, and the sort method for the mosaic was defined by this attribute. The overlap method was set to “first”, which assigns the final pixel value of the mosaic to the higher-ranking image in overlapping areas. We found that the blend or feathering method, which averages the values of two overlapping pixels, distorted features found along overlapping areas and should therefore be avoided.
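
The difference between the “first” and blend overlap methods can be shown on a single row of pixels. This is a simplified sketch: the function name and the use of `None` as the no-data marker are assumptions, and a plain mean stands in for the blend, which may be weighted differently in the actual software.

```python
def mosaic_row(ranked_rows, method="first", nodata=None):
    """Combine equally long pixel rows; ranked_rows[0] has the highest rank."""
    out = []
    for pixels in zip(*ranked_rows):
        valid = [p for p in pixels if p != nodata]
        if not valid:
            out.append(nodata)
        elif method == "first":
            out.append(valid[0])  # value from the highest-ranking image
        elif method == "blend":
            out.append(sum(valid) / len(valid))  # simple mean as a stand-in
        else:
            raise ValueError(f"unknown method: {method}")
    return out


# Hypothetical pixel rows from two overlapping strips (None = no data).
strip_high = [10, 20, None]
strip_low = [None, 40, 60]
print(mosaic_row([strip_high, strip_low], method="first"))
print(mosaic_row([strip_high, strip_low], method="blend"))
```

With “first”, the overlapping pixel keeps a value that was actually recorded in one strip, whereas the blend produces a new value that exists in neither, which is one way to understand the distortion seen along overlaps.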

2.2.4. Exporting the Mosaic

During the georeferencing process, individual cells are skewed in both shape and size and must be resampled back to a standardized grid format for use in other analyses (Figure 6). Cell sizes varied in both height and width for each strip (Table 3). The minimum cell size of 2.5 m and a bilinear interpolation sampling method were used for resampling. Bilinear interpolation is a standard method for interpolating pixel values and generates a smoother visual output when compared to the nearest neighbor interpolation method. However, nearest neighbor can produce similar visual results at a distance while also resulting in less data manipulation; therefore, the user should choose an interpolation method based on the goals of their research. A script which documents the workflow used to mosaic and export georeferenced Corona images in Python can be found on GitHub (https://github.com/brookeiacone/CORONAmosaic/blob/master/CreateCORONAMosaic_UserInputVersion.py).
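
The trade-off between the two interpolation methods can be seen on a small grid. The sketch below is generic and not tied to ArcGIS Pro; the function names and the tiny 2 × 2 grid of digital numbers are hypothetical, and positions are given in fractional row and column indices.

```python
import math


def nearest(grid, row, col):
    """Value of the cell whose center is closest to the requested position."""
    r = min(max(int(math.floor(row + 0.5)), 0), len(grid) - 1)
    c = min(max(int(math.floor(col + 0.5)), 0), len(grid[0]) - 1)
    return grid[r][c]


def bilinear(grid, row, col):
    """Distance-weighted average of the four surrounding cells (grid >= 2 x 2)."""
    r0 = min(max(int(math.floor(row)), 0), len(grid) - 2)
    c0 = min(max(int(math.floor(col)), 0), len(grid[0]) - 2)
    fr = min(max(row - r0, 0.0), 1.0)
    fc = min(max(col - c0, 0.0), 1.0)
    top = grid[r0][c0] * (1 - fc) + grid[r0][c0 + 1] * fc
    bottom = grid[r0 + 1][c0] * (1 - fc) + grid[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr


grid = [[0, 10], [20, 30]]  # hypothetical digital numbers
print(nearest(grid, 0.4, 0.6))   # an original value from the grid
print(bilinear(grid, 0.5, 0.5))  # a new, averaged value
```

Nearest neighbor only ever returns values that were present in the input, whereas bilinear interpolation creates intermediate values, which is the data manipulation referred to above.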

Figure 6. Example of skewed cell sizes after georeferencing.

Table 3. Cell size of images after georeferencing. Image No. 1–4 refers to our internal numbering scheme for the individual strips.

3. Results

The final georeferenced mosaic has a spatial resolution of 2.5 m and is shown in Figure 7. The average root mean square error (RMSE) for the mosaic was 11.77 m, with RMSE values for the four individual images ranging from 9.17 m to 14.22 m (Table 4 and Table 5). This is consistent with the accuracy values reported in the literature for georeferencing Corona imagery, which are less than 15 m in most cases [12,20,21,30].

Figure 7. Final mosaic of central Naiman Banner, Inner Mongolia in 1970 derived from Corona imagery. Red box in inset map shows approximate location of the study area.

Table 4. Mean accuracy of final Corona mosaic.

Table 5. Accuracy of georeferencing by image.
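For readers reproducing the accuracy assessment, the following minimal sketch shows one common way to compute RMSE from ground control point residuals and to average it across images. The residual values are hypothetical and are not the data behind Table 4 or Table 5; the function name `rmse` and the choice to treat each residual as an (x, y) offset in metres are assumptions made for illustration.

```python
import math


def rmse(residuals):
    """RMSE in metres for a list of (dx, dy) control point residuals."""
    squared = [dx * dx + dy * dy for dx, dy in residuals]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical residuals (metres) for two images; not values from this study.
image_a = [(3.0, 4.0), (-3.0, 4.0)]
image_b = [(6.0, 8.0), (8.0, -6.0)]

per_image = [rmse(image_a), rmse(image_b)]
mean_of_rmse = sum(per_image) / len(per_image)

# Pooling all residuals before taking the root gives a different number,
# because larger errors are weighted more heavily.
pooled_rmse = rmse(image_a + image_b)

print(per_image)     # [5.0, 10.0]
print(mean_of_rmse)  # 7.5
print(round(pooled_rmse, 2))
```

The mean of per-image RMSE values and the RMSE of the pooled residuals generally differ, so it is worth stating which of the two a reported mosaic-level figure represents.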

4. Discussion

Despite the challenges associated with using Corona imagery, this study demonstrates a methodology for successfully georeferencing Corona imagery at a high level of accuracy in a region with dynamic physical features and limited reference points. Working with Corona imagery is time-consuming primarily because the literature surrounding its use lacks robust, repeatable workflows that would streamline the process and reduce unnecessary experimentation. This study demonstrates a tractable and replicable workflow for georeferencing Corona imagery at a high level of accuracy using a single platform, ArcGIS Pro, which is widely available and commonly used. This broadens the applicability of the workflow to a wider range of researchers and reduces the learning curve, and therefore the labor, for those interested in utilizing Corona imagery.

While these methods were developed intentionally to be highly replicable, they should be adapted according to the unique requirements of both the imagery and the scene. The area of interest in this study was located centrally across four strips and did not include any of the intense color distortions found in some Corona images. For study areas that span the edges of Corona images and/or require mosaicking along vertical edges, the accuracy of the final mosaic may decrease due to the greater distortions found along edges (parts A and D). Additional research into the specific shape and scale of distortions near edges is needed to adapt the methodology accordingly.

Additionally, unique color distortions such as bright, dark, or blurred spots may require advanced color balancing techniques not described in this paper. Similarly, some landscapes may possess even fewer persistent ground features, such as areas where no settlements or road networks exist. In these instances, we recommend reducing the number of ground control points from twenty to ten (the minimum required to utilize a third-order polynomial), or to six with a second-order polynomial. This will likely reduce the accuracy. For landscapes similar to the study area described here, the methods detailed in this paper should prove adequate for highly accurate georeferencing.
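The minimum control point counts quoted above follow from the number of coefficients in a two-dimensional polynomial transformation: a polynomial of order n in x and y has (n + 1)(n + 2)/2 terms per output coordinate, and each control point supplies one equation per coordinate. The short sketch below makes this arithmetic explicit; the function names are hypothetical and the code is independent of any GIS software.

```python
def polynomial_terms(order):
    """List the monomials x^i * y^j with i + j <= order as (i, j) exponent pairs."""
    return [(i, total - i)
            for total in range(order + 1)
            for i in range(total, -1, -1)]


def min_control_points(order):
    """Smallest number of control points that determines a 2D polynomial fit."""
    if order < 1:
        raise ValueError("polynomial order must be at least 1")
    return (order + 1) * (order + 2) // 2


for order in (1, 2, 3):
    print(order, min_control_points(order))
# 1 3
# 2 6
# 3 10
```

Using exactly the minimum number of points leaves no redundancy, so the fit passes through every control point and residuals at those points cannot reveal errors; additional points, such as the twenty used in this study, are what make an RMSE at the control points informative.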

Developing a fully automated workflow for georeferencing Corona imagery would be difficult given the complex and varying nature of distortions within the imagery and the different approaches required in different landscapes. However, this study has demonstrated that the georeferencing process can be accomplished entirely within ArcGIS Pro using Python-based tools. Therefore, as shown by the mosaicking process, which was executed using a Python script, there are opportunities to automate portions of the process and create user-facing toolboxes or applications. This would reduce the technical training required to georeference Corona imagery and, in turn, make the imagery more accessible to researchers in other disciplines.

Georeferenced Corona imagery is useful in a wide variety of applications. It presents a unique opportunity to detail and quantify historic land cover and extend the timeline of available data for analyzing anthropogenic impacts on landscapes. Due to the high spatial resolution of the imagery, it is possible to characterize historic landscapes in far greater detail than other currently available datasets allow. Forthcoming research utilizes the mosaic of central Naiman Banner to assess the feasibility of using contextual features to derive spatial information and land cover types. The results indicate that the amount and complexity of information that can be derived from Corona imagery are significant and potentially exceed those of any other available dataset for Naiman Banner. Generating historic cover types is useful for assessing historic, current, and future land use policies and restoration efforts surrounding aeolian desertification in Inner Mongolia. With highly accurate georeferencing, Corona imagery may also be combined with other image datasets to quantify change at the pixel level.

5. Conclusions

Corona imagery presents a unique opportunity to extend the timeline of available information for analyzing changing landscapes. However, Corona imagery remains underutilized due to the challenges associated with preparing it for use and the lack of a robust and replicable methodology in the literature. This study successfully georeferenced four Corona images with a high level of accuracy and describes a repeatable workflow suitable for others to follow. The resulting mosaic of Naiman Banner significantly increased our understanding of the landscape in 1970 and represents one of the first available datasets of its kind. Corona imagery is an important resource for analyzing long-term land use and land cover patterns, and its integration into analyses is essential for a broader understanding of the anthropogenic impacts on landscapes that continue to drive global environmental change.

Author Contributions

Conceptualization, B.I., G.R.H.A. and R.E.; methodology, B.I., G.R.H.A. and R.E.; software, B.I., G.R.H.A. and R.E.; validation, B.I. and G.R.H.A.; formal analysis, B.I. and G.R.H.A.; investigation, B.I. and G.R.H.A.; resources, B.I., G.R.H.A. and R.E.; data curation, B.I. and R.E.; writing—original draft preparation, B.I.; writing—review and editing, B.I., G.R.H.A. and R.E.; visualization, B.I.; supervision, G.R.H.A. and R.E.; project administration, G.R.H.A.; funding acquisition, G.R.H.A. and R.E. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the George Washington University Facilitating Fund and the GWU Center for Urban and Environmental Research (CUER).

Data Availability Statement

Not applicable.

Acknowledgments

Thank you to the George Washington University Department of Geography.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

Brogaard, S.; Prieler, S. Land Cover in the Horqin Grasslands, North China. Detecting Changes between 1975 and 1990 by Means of Remote Sensing; IIASA: Laxenburg, Austria, 1998; p. 26.
Li, J.; Xu, B.; Yang, X.; Jin, Y.; Zhao, L.; Zhao, F.; Chen, S.; Guo, J.; Qin, Z.; Ma, H. Characterizing changes in grassland desertification based on Landsat images of the Ongniud and Naiman Banners, Inner Mongolia. Int. J. Remote Sens. 2015, 36, 5137–5149.
Song, D.-X.; Huang, C.; Sexton, J.O.; Channan, S.; Feng, M.; Townshend, J.R. Use of Landsat and Corona data for mapping forest cover change from the mid-1960s to 2000s: Case studies from the Eastern United States and Central Brazil. ISPRS J. Photogramm. Remote Sens. 2015, 103, 81–92.
Xu, L.; Tu, Z.; Zhou, Y.; Yu, G. Profiling Human-Induced Vegetation Change in the Horqin Sandy Land of China Using Time Series Datasets. Sustainability 2018, 10, 1068.
Li, J.; Xu, B.; Yang, X.; Qin, Z.; Zhao, L.; Jin, Y.; Zhao, F.; Guo, J. Historical grassland desertification changes in the Horqin Sandy Land, Northern China (1985–2013). Sci. Rep. 2017, 7, 3009.
Zhang, G.; Dong, J.; Xiao, X.; Hu, Z.; Sheldon, S. Effectiveness of ecological restoration projects in Horqin Sandy Land, China based on SPOT-VGT NDVI data. Ecol. Eng. 2012, 38, 20–29.
Munteanu, C.; Kamp, J.; Nita, M.D.; Klein, N.; Kraemer, B.M.; Müller, D.; Koshkina, A.; Prishchepov, A.V.; Kuemmerle, T. Cold War spy satellite images reveal long-term declines of a philopatric keystone species in response to cropland expansion. Proc. R. Soc. B 2020, 287, 20192897.
Earth Resources Observation and Science (EROS) Center. A Collection of Declassified Military Intelligence Photographs from the CORONA, ARGON, and LANYARD Satellite Systems in Digital Format (1960 to 1972). 13 July 2018. Available online: https://www.usgs.gov/centers/eros/science/usgs-eros-archive-declassified-data-declassified-satellite-imagery-1?qt-science_center_objects=0#qt-science_center_objects (accessed on 1 June 2021).
National Reconnaissance Office. Corona Fact Sheet. Available online: https://www.nro.gov/History-and-Studies/Center-for-the-Study-of-National-Reconnaissance/The-CORONA-Program/Fact-Sheet/ (accessed on 1 January 2022).
Fowler, M.J. Modelling the acquisition times of CORONA KH-4B satellite photographs. AARGnews 2006, 30, 34–40.
Casana, J.; Cothren, J. Stereo analysis, DEM extraction and orthorectification of CORONA satellite imagery; archaeological applications from the Near East. Antiquity 2008, 82, 732–749.
Casana, J.; Cothren, J. The CORONA Atlas Project: Orthorectification of CORONA Satellite Imagery and Regional-Scale Archaeological Exploration in the Near East. In Mapping Archaeological Landscapes from Space; Springer: New York, NY, USA, 2013; Volume 5, pp. 33–43.
Sohn, H.-G.; Kim, G.-H.; Yom, J.-H. Mathematical modelling of historical reconnaissance CORONA KH-4B Imagery. Photogramm. Rec. 2004, 19, 51–66.
Goossens, R.; De Wulf, A.; Bourgeois, J.; Gheyle, W.; Willems, T. Satellite imagery and archaeology: The example of CORONA in the Altai Mountains. J. Archaeol. Sci. 2006, 33, 745–755.
Ur, J. Agricultural and Pastoral Landscapes in the Near East: Case Studies using CORONA Satellite Photography. ArchAtlas 2007, 2.1 Edition. Available online: http://www.archatlas.org/workshop/Ur07.php (accessed on 24 October 2021).
Scollar, I.; Galiatsatos, N.; Mugnier, C. Mapping from CORONA: Geometric Distortion in KH4 Images. Photogramm. Eng. Remote Sens. 2016, 82, 7–13.
Hamandawana, H.; Eckardt, F.; Ringrose, S. Proposed methodology for georeferencing and mosaicking Corona photographs. Int. J. Remote Sens. 2007, 28, 5–22.
Zhang, Y.; Shen, W.; Li, M.; Lv, Y. Integrating Landsat Time Series Observations and Corona Images to Characterize Forest Change Patterns in a Mining Region of Nanjing, Eastern China from 1967 to 2019. Remote Sens. 2020, 12, 3191.
Saleem, A.; Corner, R.; Awange, J. On the possibility of using CORONA and Landsat data for evaluating and mapping long-term LULC: Case study of Iraqi Kurdistan. Appl. Geogr. 2018, 90, 145–154.
Shahtahmassebi, A.R.; Lin, Y.; Lin, L.; Atkinson, P.M.; Moore, N.; Wang, K.; He, S.; Huang, L.; Wu, J.; Shen, Z.; et al. Reconstructing Historical Land Cover Type and Complexity by Synergistic Use of Landsat Multispectral Scanner and CORONA. Remote Sens. 2017, 9, 682.
Nita, M.D.; Munteanu, C.; Gutman, G.; Abrudan, I.V.; Radeloff, V.C. Widespread forest cutting in the aftermath of World War II captured by broad-scale historical Corona spy satellite photography. Remote Sens. Environ. 2018, 204, 322–332.
Center for Advanced Spatial Technologies, University of Arkansas; United States Geological Survey. Corona Atlas & Referencing System. Available online: https://corona.cast.uark.edu/ (accessed on 1 January 2021).
Wang, T. Deserts and Aeolian Desertification in China; Science Press: Beijing, China, 2011.
Zhou, Y.; Chang, X.; Ye, S.; Zheng, Z.; Lv, S. Analysis on regional vegetation changes in dust and sandstorms source area: A case study of Naiman Banner in the Horqin sandy region of Northern China. Environ. Earth Sci. 2015, 73, 2013–2025.
Wen, Y.; Guo, B.; Zang, W.; Ge, D.; Luo, W.; Zhao, H. Desertification detection model in Naiman Banner based on the albedo-modified soil adjusted vegetation index feature space using the Landsat8 OLI images. Geomat. Nat. Hazards Risk 2020, 11, 544–558.
Robinson, B.E.; Li, P.; Hou, X. Institutional change in social-ecological systems: The evolution of grassland management in Inner Mongolia. Glob. Environ. Chang. 2017, 47, 64–75.
Wu, J.; Zhang, Q.; Li, A.; Liang, C. Historical landscape dynamics of Inner Mongolia: Patterns, drivers, and impacts. Landsc. Ecol. 2015, 30, 1579–1598.
Rigina, O. Detection of boreal forest decline with high-resolution panchromatic satellite imagery. Int. J. Remote Sens. 2003, 24, 1895–1912.
Esri Inc. ArcGIS Pro (Version 2.8); Esri Inc.: Redlands, CA, USA, 2020; Available online: https://www.esri.com/en-us/arcgis/products/arcgis-pro/overview (accessed on 23 August 2022).
Gurjar, S.K.; Tare, V. Estimating long-term LULC changes in an agriculture-dominated basin using CORONA (1970) and LISS IV (2013–2014) satellite images: A case study of Ramganga River, India. Environ. Monit. Assess. 2019, 191, 217.
Environmental Systems Research Institute. Understanding Raster Georeferencing. 2018. Available online: https://www.esri.com/about/newsroom/arcuser/understanding-raster-georeferencing/#:~:text=The%20process%20involves%20identifying%20a,and%20in%20real%2Dworld%20coordinates. (accessed on 24 October 2021).
Liu, J.; Li, H.T.; Gu, H.Y. Study of Color Balance for Remote Sensing Imagery Mosaic. In Proceedings of the 2011 International Symposium on Image and Data Fusion, Tengchong, China, 9–11 August 2011; pp. 1–4.