1. Introduction
Rapid increases in spatial data volumes, the growing complexity of data analysis and modeling methods, the popularity of multi-disciplinary science collaborations, and the requirement for effective remote access to data have resulted in the development and promotion of cyberinfrastructures. Cyberinfrastructures are ‘[t]he set of organizational practices, technical infrastructure and social norms that collectively provide for the smooth operation of scientific work at a distance’ [
1]. Increasingly, data include spatial or spatio-temporal attributes that provide a rich trove of analytical information at the cost of increased complexity in data management and analysis [
2]. These data are stored, analyzed, and used in geospatial cyberinfrastructures, or cyberGIS [
3,
4,
5]. These spatially enabled, distributed information systems are designed to enhance data discoverability, access, and usability across distributed science teams. As defined by this triumvirate of goals, geospatial cyberinfrastructures are concrete, albeit technologically focused, implementations of Spatial Data Infrastructures (SDIs) [
6,
7,
8]. The broad goal of cyberinfrastructures, geospatial cyberinfrastructures, and SDIs is to codify and standardize some collection of data using ad hoc or formalized policies to improve data access and discoverability in support of some endeavor (e.g., data use for modeling economic growth, ecological protection, or scientific research). Adopting the view that cyberinfrastructures are technology-focused SDIs (or stage II SDIs [
9]), throughout this work we use the term SDIs to describe spatial data infrastructures.
SDIs exist at the local, regional, national, and global levels [
10,
11]. It is also possible to identify topical, community SDIs. For example, Li [
12] described PolarHub, an Open Geospatial Consortium (OGC) standards-compliant web crawler, semantic search engine, and search result visualization tool that aggregates global, spatially enabled arctic web services and data repositories. Wright [
13] identifies multiple marine coastal SDIs that can be viewed as topical SDIs or as components within larger national or trans-national SDIs. Finally, He et al. [
14] describe the development of an SDI for spatially aware human brain data. These examples illustrate that SDI concepts are broadly applicable to any spatially enabled data and are not limited in scope to either classic Earth-based spatial data or to the integration into the local-to-global SDI hierarchy proposed by Rajabifard and Williamson [
10].
Several national and trans-national government agencies direct and fund extraterrestrial research and exploration. These include the United States National Aeronautics and Space Administration (NASA), the European Space Agency (ESA), and the Japan Aerospace Exploration Agency (JAXA), among other space agencies. The global cooperative community generates large quantities of planetary science data. NASA archives and serves raw and calibrated data via a distributed network called the Planetary Data System (PDS). ESA offers a similar service in the Planetary Science Archive (PSA) and JAXA provides the Data Archives and Transmission System (DARTS). NASA, ESA, and JAXA, along with several other national space agencies, are members of the International Planetary Data Alliance (IPDA). These agencies and their implicit SDIs seek to fulfill the sometimes competing goals of (1) maintaining an effective data archive and (2) developing and delivering efficient data discovery and access portals. We define second-generation SDIs in line with McLaughlin and Nichols [
9] who suggest that SDI development progresses from focusing on egalitarian questions of ‘for whom’ to technical entrenchment focusing on ‘how’ and finally back to a user-centered focus. We argue here that the explicit identification of the semantic components of a Planetary Spatial Data Infrastructure (PSDI) directly supports the development of topical planetary SDIs (e.g., a Mars SDI) and better utilizes existing data archive and delivery services. Both of these benefits directly address the emerging needs of the planetary science community.
Two issues must be addressed with current planetary SDIs before a more effective collection of PSDIs can be developed. First, the goals of a data archive can be both complementary to and directly competitive with those of an SDI and significant challenges arise when addressing both purposes. We take the view of Masser et al. [
15] that an SDI must serve a broad community whose members need not be experts in spatial concepts and who may not understand the intricacies of storing, finding, utilizing, and exploiting spatially enabled data. We posit that the majority of planetary science data users are experts in some aspect of planetary science and are not spatial data experts; these users want spatial data to ‘just work’. Therefore, the derivation, discoverability, and accessibility of spatial data products are of major importance. Second, a strong technical focus is critical for the success of an SDI and permeates all SDI components. Technology is inherently impermanent and simply a short- to medium-term means by which a long-term spatial data infrastructure is realized. Currently available SDI solutions are often too technology-centric and must instead focus on streamlining data access and improving usability [
8,
16]. Here we propose the development of a PSDI framework that leverages existing data archives, the distributed network of available storage and data sharing institutions, and the significant technical work that has occurred under the auspices of planetary research, cyberinfrastructure, and the development of national SDIs across the globe. This explicitly defined PSDI framework codifies the components of terrestrial SDIs and the unique components of planetary spatial data to support the rapid implementation of PSDIs.
The remainder of this work is organized as follows. In
Section 2 we frame the proposed PSDI framework in the context of NASA strategic goals, existing NASA SDIs, and the broader SDI and geospatial cyberinfrastructure literature. We also identify gaps in the current NASA SDIs and seek to distinguish a PSDI from terrestrial SDIs. In
Section 3 we adopt the model of Rajabifard and Williamson [
17] to further map these SDI concepts to the proposed PSDI. Finally, we conclude by describing the next steps in the development of a PSDI framework in
Section 4.
2. Motivating Planetary Spatial Data Infrastructure
The proposed Planetary SDI (PSDI) is akin to the existing U.S. National Spatial Data Infrastructure (NSDI) [
18] that identifies spatial data, spatial data practitioners, and spatial data interoperability as issues of national importance. This user-motivated effort serves as a conceptual framework for the development of a PSDI that enables effective long-range planning and development, and efficiently accesses and uses spatial data for NASA planetary science exploration and research [
8,
19]. PSDI is not itself a long-range planning document or ‘roadmap’. Instead, PSDI describes the facets and bounds within which spatial data planning should occur, and seeks to identify, understand, and codify spatial data usage and access requirements, technologies, policies, standards, and regulatory issues. In fact, we suggest that a multitude of planetary SDIs are the most appropriate outcome of this research, and the PSDI framework describes the theoretical bounds and scaffolding to support more focused SDIs (e.g., a Venus SDI) to address near-term needs of NASA scientists, engineers, and mission planners. Recognition of the need for a PSDI framework is just the initial step in developing effective spatio-temporal data exploitation strategies over the next several decades; this is explicitly a process-based SDI development view [
17].
2.1. Existing NASA Support for a PSDI
We identify four efforts that broadly define the direction of NASA planetary science research, a long-term information technology infrastructure for data archiving, and a distributed network of physical space science hard-copy artifacts. As a matter of policy, the NASA Strategic Plan [
20] and NASA Decadal Survey [
21] documents both outline short-, mid-, and long-term strategies for the advancement of planetary science and exploration. In addition, the Planetary Data System (PDS) currently acts as both the primary archive for U.S. space science missions and the primary data access portal for data users. Finally, the Regional Planetary Image Facility (RPIF) is a spatially distributed network of planetary science libraries storing historic, hard-copy planetary data.
The current (2014) NASA Strategic Plan [
20] outlines three primary goals to: (1) expand the frontiers of knowledge, capability, and opportunity in space, (2) advance understanding of Earth and develop technologies to improve quality of life on our planet, and (3) serve the American public and accomplish the NASA mission mandate by effectively managing people, technical capabilities, and infrastructure. These goals are designed to return tangible benefits for cutting edge technologies and to ensure sustainability, accountability, and transparency in NASA’s operations. A major component in realizing these goals is the acquisition, processing, analysis, and distribution of spatial data [
22]. Together these plans can be encompassed by the development of a broader, third-generation SDI system that supports both the acquisition and use of spatial data for planetary scientific research.
The NASA Planetary Data System currently manages over one Petabyte of planetary data [
23] through the operation of six federated Science nodes, a Radio Science node, a Navigation and Ancillary Information Facility (NAIF) node, and an Engineering node. PDS node science disciplines include atmospheres, cartography and imaging sciences, geosciences, planetary plasma interactions, ring-moon systems, and small bodies. In addition to supporting data archives from NASA planetary missions, each node contributes to the development and maintenance of data archiving standards, PDS-compliant data and document formats, metadata standards, etc. As the primary archive system for planetary digital data, the PDS plays a critical role in laying the foundation for the development of a PSDI, and also in the ultimate success of such a PSDI. In part, the long-term success of both PDS and PSDI is achieved by recognizing that the fundamental goals of each are both complementary and directly competitive. PDS emphasizes the archival of fundamental data products and their long-term preservation, and these are most commonly raw and low-level data products as well as higher-level products developed by missions and science research. In contrast, the proposed PSDI is a living structure that addresses user needs by leveraging the data within the PDS, using current technologies to transform those data to meet the standards and interoperability policies defined by the PSDI, and providing data access mechanisms that can evolve rapidly and readily as new technologies become available. The PDS can be viewed as a second-generation, data-centric SDI [
24,
25] that emphasizes stability, data security and long-term preservation over accessibility and maintenance of cutting edge technological capabilities. In contrast, the proposed PSDI must address data usability and straightforward data access by fulfilling the user-centric needs and embracing the inevitable complexity of the human-technical interaction. We note that NASA-funded missions are required to submit data to the PDS in a timely manner and said data are freely available to any user without restriction, making issues of open data that are present in the terrestrial community largely moot.
The NASA Regional Planetary Image Facility (RPIF) is a distributed network of publicly accessible planetary libraries, each with a unique topical focus such as lunar exploration or impact cratering. The RPIF Network maintains a rich record of international space exploration that includes photographs, maps, films, and historical documents. In the next five years, the Network will evolve to not only provide improved access to its collections but also to enhance its collections with newly derived data products and assist users in locating and using the most relevant data for their needs. This plan for the future of the RPIFs includes a well-connected network of facilities that provide access to unique collections, specialized resources, public outreach materials, and training opportunities for locating and using planetary data, all while continuing to preserve NASA-funded derived data products related to Solar System exploration.
Finally, the 2013 NASA Decadal Survey [
21] is a medium-term planning document that encompasses a wide variety of planetary science needs and goals as identified by nationwide community members. The scope of the survey document emphasizes exploration strategies to address high-priority science goals, and it does not address details such as the ‘how’ with respect to spatial data acquisition, storage, and usability. The proposed PSDI seeks to augment such strategy approaches by enabling and extending the research and exploration goals of the national planetary science community well into the future and by providing a framework for effective planetary spatial data collection, management, and utilization.
2.2. Gaps in the Existing Solutions
We suggest that the existing NASA and international planetary science strategy and goals documents implicitly define a planetary SDI. Such a PSDI serves the planetary science community in many ways, but does not yet fulfill the needed user-focused goal of making spatial data easily discoverable, accessible, and usable. Data are most readily discoverable via the PDS given an expert understanding of any given mission data collection and archiving strategy. Data access via PDS often is multi-mission at the single instrument level and via cart-based bulk download systems or File Transfer Protocol (FTP) sites. PDS data usability is heavily dependent on the development and delivery of high-level, derived products by the data provider or the capabilities of data portals such as the PDS Planetary Image Locator Tool (PILOT) and Projection on the Web (POW) services [
26] that support limited development of high-order products. While delivery of derived, spatially aware, highly usable products is becoming more common, many datasets served by PDS are simply linked to high-level products not generated by the instrument teams and there is often a multi-year lead time for those products. Such derived products include what we define (below) as both foundational and framework data products. These long lead times and delivery of a limited set of products may seem out of place to terrestrial data users. The generation of derived products for planetary missions is exceptionally complex due to issues described below and can require significant effort. Additionally, many science objectives can be achieved using only the lower-level products that are not spatially enabled. Although issues of product derivation can be side-stepped by the expert user with sophisticated tools (e.g., using a ‘do-it-yourself’ approach), this case is not ideal because it artificially limits the number of possible users and collaborators and thus the possible outcome of the research. We explore each of these issues further below.
Within a single flight mission, data discovery is often supported by a mission team operating a private, web-based data portal with data search capability tailored to mission needs. Unfortunately, cross-mission spatially aware data searches (and any type of semantically enabled search) within such portals are often limited by challenges in data storage location, network access speeds, lack of accurate co-registration of framework data at known levels of precision and accuracy, data format standards compliance for interoperability, and data access technologies. These limitations can be addressed with a community-based planetary spatial data infrastructure that leverages archived data and existing PDS and mission data portals, interoperable data and metadata standards, and flexible data access technologies, all framed within an explicitly defined PSDI.
Terrestrial SDIs and geospatial cyberinfrastructure solutions offer living, proof-of-concept realizations of what a PSDI could be. Terrestrial SDIs are increasingly being realized in the form of web services, web catalogs, and more recently cyber-enabled data processing and collaboration platforms [
15]. For example, Li [
12] described PolarHub, an Open Geospatial Consortium (OGC) standards-compliant web crawler, semantic search engine, and search result visualization tool that aggregates globally available and spatially enabled arctic data web services. As these services are OGC-compliant, they are immediately integrable into a suite of analysis tools. Similarly, Padmanabhan et al. [
27] provide a case study linking two high performance computing grids to perform highly parallelized viewshed analysis, a standard raster data processing task. With significant increases in data volumes, analytical complexity, and resultant algorithm run-times, distributed data processing using shared (and possibly spatially distributed) resources is becoming the norm. Wiemann and Bernard [
28] link distributed processing capabilities and describe OGC-compliant data fusion techniques that integrate semantic search, adequately attributed linked data, and distributed computing.
The evolution of terrestrial SDIs from user-driven, to technology-driven, to user-focused, and finally to user-integrated SDIs [
29] is the standard against which PSDI efforts should be compared. A comparison between implicit PSDI efforts and robust terrestrial SDIs clearly demonstrates that PSDIs are in their infancy. Terrestrial SDIs seek to either maximize data accessibility and service availability or maximize the integration of users into deep SDIs where spatial data can be well utilized. Existing PSDI efforts are largely data driven, as evidenced above in the description of the PDS, and focused on the ‘description, publication, and supply of data’ [
29]. Users are passive recipients of information and the focus of the PSDI is on the technology and standards to achieve the previously quoted goal. We consider this passive view to be a techno-centric view of the role of PSDI. In those instances where spatial data are being made initially ready (e.g., as high-level products served via OGC standards, which is the exception for the majority of data products), the PSDI still serves as a passive view of the data and not a user-centric framework.
Not only are technical implementations for search and discovery of planetary spatial data lagging terrestrial counterparts, the cyber-infrastructure driven analytics and standards necessary for interoperability generally are not being addressed. However, precursor technical studies focusing on interoperability [
30] have established a foundation for the realization of a PSDI, and the planetary science community is acutely aware of the need for a theoretical foundation upon which planetary spatial data stewardship can be developed. For example, Gaddis et al. [
31], summarizing a 2012 Planetary Data Workshop panel discussion, identified 12 issues with the broad use of planetary spatial data (for the full listing, see [
31], pp. 3–6). These include a need to develop high-level derived data products from raw data while reducing redundancy between individual mission teams and the PDS (data access), challenges in finding data including methods for semantically enabled search (data access), a general lack of metadata to support interoperability (standards), and the need to identify big data storage solutions (data). Most interestingly, a recurrent theme was the need for more interaction between users, data providers, and engineering developers. The planetary community has identified the need for a unified PSDI independently, without specifically recognizing the terrestrial SDI discourse.
Given the large challenges and very high costs associated with acquisition of planetary data by space missions, it is unsurprising that PSDI development has not been a primary concern. The act of flying a mission, collecting, downloading, and archiving planetary data requires tremendous expertise and effort, and these are typically needed before the data are made spatially aware. We suggest that given these challenges and costs, it is necessary to treat planetary spatial data as a multi-use infrastructural product [
8] that provides the foundation for leveraging consistent and reliable spatial expertise from multiple institutions. Managing planetary spatial data without a coherent planetary spatial data infrastructure plan propagates the current inefficient state of managing this precious data resource, impedes the efficient fulfillment of future goals and objectives, and squanders the opportunity to fully exploit the data and expertise.
2.3. Issues in Proposing a PSDI
The proposed PSDI exists at the intersection of multiple existing planning documents, implemented processes, and existing information infrastructures and technical requirements. Georgiadou et al. [
32] identify three concepts from the Information Systems literature: (1) lock-in effects, (2) a measured implementation approach, and (3) self-reinforcing standardization. All three of these issues are present in the current, second-generation planetary SDIs. Star and Ruhleder [
33] suggest that the existing access methodologies, standards, policies, data, and users of any information system contribute to a structure with significant gravity that can dramatically influence how a proposed PSDI is implemented. The existence of a second-generation SDI suggests that a measured, cultivation-based approach [
32] will be required where planned, cadenced change is slowly introduced to the system. This type of change can be observed within the PDS at a technical level as it transitions file formats from the PDS3 to a new PDS4 format [
34]. A PSDI would expand the scope, but not the pace, of change. Finally, the process of standardization may be self-reinforcing [
32] within information systems. While we recognize technical standardization and increasing compliance with OGC standards within the existing system, we do not see broad standardization in areas such as data accuracy assessment and error reporting, camera model creation for accurate data representation, or widespread adoption of OGC standards for data discovery and data interoperability. The adoption of standards may therefore require both community buy-in and NASA-directed policy.
There are also conceptual issues in proposing a process-based model, where ‘the objectives behind the design of an SDI...are to provide better communication channels for the community for sharing and using data assets, instead of toward the linkage of available databases’ [
10], while identifying components of a product-based framework. That is, early phases of process-modeled SDIs seek to build community awareness, identify and organize knowledge infrastructures (e.g., datasets, data owners, policies, metadata), and align stakeholder vision [
10]. The ultimate goal of a PSDI is process-based, but the product-based model (composed of users, policies, standards, data access mechanisms, and spatial data) fits exceptionally well into the organization of knowledge frameworks. Therefore, we view the process-based model as encapsulating the product-based model and leveraging it throughout the conceptualization and development of a PSDI.
Finally, institutional and implementation issues exist in proposing a PSDI within the context of NASA-funded planetary science missions and the adoption of the PSDI components by other planetary science organizations (e.g., ESA, JAXA, or the Indian Space Research Organization (ISRO)). In comparison to terrestrial spatial data collection, the inherent financial costs in planetary science research require broad international cooperation. Currently, twelve international space agencies (the Armenian Astronomical Society, China National Space Administration, European Space Agency, German Aerospace Center, Indian Space Research Organization, Italian Space Agency, Japan Aerospace Exploration Agency, National Aeronautics and Space Administration, National Center for Space Studies, Russian Space Research Institute, United Arab Emirates Space Agency, and UK Space Agency) are members of the IPDA. As members, these organizations already seek to support cross-organization interoperability and data management standards using NASA’s PDS data standard and information model. We suggest that membership in the IPDA provides the benefits of an existing cooperative platform for the communication of PSDI concepts (awareness) and solicitation of organizational buy-in (alignment and persuasion) at the cost of increased, cross-organization entrenchment. Existing communication channels and common data formats provide a starting point from which a federated ‘system of systems’ approach can be leveraged, similar to the Global Earth Observation System of Systems [
35]. We do not suggest that institutional issues lack complexity or are already solved, but a strong foundation for international planetary science research exists from which the proposed PSDI can be developed.
3. PSDI: A Product-Based View
A single, unifying definition of SDI is still an active area of research [
8]. For this work, it is necessary to identify a definition such that the complexity of proposing an SDI is tractable. Rajabifard and Williamson [
17] identify five primary components to SDI: ‘policy, access network, technical standards, people (including partnerships), and data’, and these are grouped into two themes: human-data interaction (data and people), and the facilitating technologies (policy, access, and standards),
Figure 1. Technology acts as the facilitator, through which people can discover, access, and exploit spatial data. In defining the National (U.S.) Spatial Data Infrastructure (NSDI) the Office of the President [
18] largely and implicitly followed this decomposition. While the broad components of the proposed PSDI are largely consistent with those of Rajabifard and Williamson [
17], the specific people, technology, and data requirements are uniquely planetary.
3.1. People
Spatial data users pervade all components of a PSDI and are the primary drivers for establishing and maintaining a robust PSDI. Management of the human components includes the development and stewardship of the critical skills necessary to realize a PSDI, the communication mechanisms to engage and educate stakeholders (i.e., data collectors, providers, and users), and the techniques to connect with non-expert and new users; this is a user-centric, not a techno-centric, view. Unfortunately, the planetary science community currently has an entrenched techno-centric point of view and end users are largely being left to fend for themselves with respect to PSDI resources and products. For example, planetary missions, which in principle have interdisciplinary teams that include at least one spatial data expert, can become insular and do not always relate to or work on behalf of the end user (e.g., the planetary scientist). The result is that data processing responsibilities are passed down to the end user, who is often a non-expert in spatial data manipulation.
The confluence of issues facing the planetary science community is exacerbated by the lack of community knowledge, coordination, or communication regarding the importance of data infrastructure and/or for establishing long-term goals and priorities related to a robust PSDI, as noted by Georgiadou et al. [
32]. If the foundational components of a PSDI are taken for granted or ignored, it gradually leads to the atrophy of the workforce necessary to implement a PSDI. The lack of investment in and long term vision for a PSDI-knowledgeable workforce results in fewer qualified candidates for maintaining and growing a PSDI. The decline in qualified candidates is further aggravated by the public perception that planetary science, although exciting and interesting, is exceptionally specialized and ultimately perceived to exclude qualified candidates from other fields in the physical sciences. A large-scale, coordinated, user-centric effort is required to fully realize and maintain a vigorous and long lasting PSDI that serves the needs of a wide user community [
15].
3.2. Policies
The U.S. NSDI is supported by policy generated by both the Office of Management and Budget (OMB) [
22] and the Federal Geographic Data Committee (FGDC). These are broad policy statements in support of the (inter)national coordination and stewardship of spatial data. Implementation, whether by standards development and compliance or by technical implementation, is left to a large number of other organizations, data collectors, and data providers. As noted earlier, NASA already provides policy statements and announcements of opportunity for short-term research grants (e.g., data management plans) and longer-term flight missions (e.g., the data submission to the PDS). We suggest that NASA is in a prime position to expand its policy guidelines to support the development and adoption of a PSDI. These statements can take the form of identification of missing data products (see below for identified foundational data products), clarification on the need to adhere to standards (e.g., standard sensor models), and expansion of existing data management plans to include integration with a nascent PSDI. This final statement immediately raises the question of where the responsibility for the production, storage, and integration of higher-order data products should fall. We suggest that (1) this is a policy decision to be made by the funding agency and (2) a requirement that higher-order products be integrated into the PSDI should be a component of that policy decision. A policy requiring derivation of spatial products should indicate by whom and at what cost the higher-level products are to be created.
3.3. Standards
The primary motivators for planetary standards development, compliance, and adoption have been to support interoperability between spatial data products and interoperability between analytical methods [
30,
36]. These standards include spatial data formats, cartographic mapping and representation standards, recommendations on planetary coordinate frames [
37], FGDC metadata standards, and Open Geospatial Consortium (OGC) web mapping formats. For example, multiple mission and institutional groups utilize the OGC Web Map Service (WMS) format for serving single image data products, as well as regional and global mosaics (e.g., the LunaServ Global Explorer browse map (
https://webmap.lroc.asu.edu/lunaserv.html)). Likewise, Web Feature Service (WFS) standards are frequently employed for serving planetary nomenclature or image footprints (e.g., USGS WFS layers (
https://astrodocs.wr.usgs.gov/index.php/Webservices)) and the Web Coverage Processing Service specification is used to support hyperspectral data analysis in an online environment [
38]. At the time of writing, we are not aware of any groups utilizing the OGC Catalog Service specification. For additional examples of the reach of terrestrial standards into the planetary science community, see Hare et al. [
30]. Clearly, a PSDI can (and should) draw heavily from the successes of the terrestrial community in developing and promoting the standards necessary to support broad data discovery, access, and utilization. Planetary standards requirements necessitate two ongoing interactions with the terrestrial standards community. First, it is frequently necessary to extend the standard to support a broader scope of potential body descriptions beyond the Earth. For example, a Community Sensor Model (CSM) is under development to standardize camera models for planetary instruments so that pixels can be accurately projected onto the surface of a body with the necessary extensions to support alternate semi-major and semi-minor ellipsoid axes [
39]. Second, inclusion of planetary metadata within a standard also frequently requires that the PSDI user-base interact with the terrestrial standards body (e.g., the inclusion of planetary Coordinate Reference Systems (CRS) within the OGC and European Petroleum Survey Group (EPSG) corpus [
30]).
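To make the role of the OGC web mapping standards concrete, the following minimal sketch assembles a WMS 1.1.1 GetMap request URL from its standard key-value parameters. The endpoint `https://example.org/wms`, the layer name `global_mosaic`, and the CRS identifier `IAU2000:30100` are assumptions chosen for illustration; they are not taken from any of the services cited above, and a real service advertises its own layers and supported reference systems in its GetCapabilities response.

```python
from urllib.parse import urlencode


def build_getmap_url(endpoint, layer, bbox, size, srs, image_format="image/png"):
    """Return a WMS 1.1.1 GetMap URL.

    bbox is (min_x, min_y, max_x, max_y) in the units of `srs`;
    size is (width, height) of the requested image in pixels.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": str(size[0]),
        "HEIGHT": str(size[1]),
        "FORMAT": image_format,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint, layer, and planetary CRS code (all assumptions).
url = build_getmap_url(
    endpoint="https://example.org/wms",
    layer="global_mosaic",
    bbox=(-180, -90, 180, 90),
    size=(1024, 512),
    srs="IAU2000:30100",
)
print(url)
```

The only planetary-specific element of the request is the reference system identifier, which is why inclusion of planetary CRS definitions in the registries that clients and servers share matters for interoperability.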
Like the terrestrial SDIs, the route to achieving interoperability of planetary data requires data fusion and/or data conversion. The former is generally conducted during the creation and iterative refinement of foundational data products, for example creation of and registration to the most accurate geodetic coordinate reference frame. The latter makes extensive use of the Geospatial Data Abstraction Library (GDAL) and the support that has been added for planetary data formats [
30]. A key standards component not broadly seen in planetary sciences is robust error reporting (e.g., the ISO 19100 series). Standard error reporting is a critical first step in data integration and interoperability.
3.4. Access Networks
As with standards, planetary access networks generally follow terrestrial guidelines and address similar goals such as enabling data discovery and access and the reduction of redundant information [
30]. Given the wide diversity of methods by which raw planetary data are collected, calibrated, archived, and processed to be spatially enabled, data access portals also seek to be a single point, authoritative source. In addition to PDS discipline-focused data portals, the single, authoritative source model is most often observed when flight missions provide their own data access portals. We suggest that the distinctions between idealized terrestrial and extra-terrestrial data access networks are largely nonexistent; spatial data search, discovery, and utilization are broadly agnostic to the underlying data. In making this assertion, it is important to note that two critical components of terrestrial data access networks are missing from the planetary science community. First, terrestrial access networks include not only data access, but also service access mechanisms. We situate the proposed conceptual PSDI within the growing cyber-infrastructure literature [
3,
40,
41] in order to include the myriad distributed analytical capabilities. Second, terrestrial data access networks increasingly support ontology-based semantic search capabilities.
3.5. Data
OMB Circular No. A-16 Revised [
22] identifies 34 terrestrial data themes critical to national spatial data utilization. Of these, seven are considered foundational data sets; the remainder are more specialized, framework data sets with smaller user bases. We identify three foundational data themes (the remaining four are Earth centered): geodetic coordinate systems, elevation, and orthoimagery. Geodetic coordinate reference frames provide the basic positional framework upon which all other data themes, whether framework or not, are registered. Within the planetary context the International Astronomical Union (IAU) has traditionally recommended specific geodetic reference frames through a cadenced revision schedule [
37]. Elevation data, whether point observation, vector, triangulated irregular network (TIN), or gridded, are a critical data product and a key input for derived data products. The diversity in elevation data representation, collection, generation, and utilization formats has led the FGDC to define elevation schema to support utilization of this data type. We echo that elevation data are foundational and concentrated research is required to identify best practices within a planetary context. Digital orthoimagery is the third framework data theme. Digital orthoimagery includes not just visible spectrum sensor data, but also infrared and radar sensor data.
We intentionally do not call out the need for specific data sets or the methods by which derivation of higher level spatial products should occur. The generation of planetary derived products is largely body and data availability specific. As suggested above and fully described below, the derivation of even foundational data products is often highly iterative as data become available from different instruments. We also note that, in general, methods for the derivation of higher level products are at best pseudo-automated (e.g., the selection of image correspondences for later block bundle adjustment to make a regional or global mosaic). The technologies to support the creation of the necessary foundational and framework data products will always be advancing (and are therefore dynamic). Herein, we seek to identify the classes of products that provide the potential to generate discoverable, high quality science data cubes (e.g., in the style of the Australian DataCube project).
3.5.1. Geodetic Coordinate Reference Frame
Recommended planetary geodetic coordinate systems and associated coordinate reference frames [
42] are updated on a three-year cycle by the IAU Working Group on Cartographic Coordinates and Rotational Elements [
37]. The rapid cadence of reassessment is a function of the methodology used to define extra-terrestrial reference systems and associated frames. In the terrestrial case, benchmarks or reference features on permanent geodetic instruments are used to define a coordinate reference system. In the extra-terrestrial case physical emplacement is largely infeasible. We note lunar laser ranging retro-reflectors and stationary landers are exceptions to this general rule. Therefore, widely recognized morphological features are identified (at the best available resolution) to be used as benchmarks.
The use of morphological features suggests that as the spatial resolution of data improves, so does the definition of the coordinate system. In the case where sensor data (visible spectrum, radar, or infrared) are the only available data set, morphologic features are used to define the position of standard coordinate reference parameters (e.g., longitude of origin, polar axis, and an orientation model). Additionally, some shape or body model (spherical or ellipsoidal) is defined. As increasing volumes of sensor data are collected or as laser altimetry becomes available, the coordinate system can be refined using standard photogrammetric [
43] or radargrammetric [
44] least squares solutions (in the case of sensor data) or altimetric cross over solutions (in the case of laser altimetry [
45]). Once improved image positioning information is available, sensor data can be more accurately positioned and dense stereo-reconstruction techniques applied to derive improved elevation data.
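As an illustration of what defining an ellipsoidal body model provides, the sketch below converts a planetographic latitude and east-positive longitude on the surface of a biaxial ellipsoid into body-fixed Cartesian coordinates. It uses the standard geodetic-to-Cartesian relations; the Mars radii are assumed values included only to make the example runnable, and the currently recommended constants should be taken from the IAU working group reports.

```python
import math


def surface_point(lat_deg, lon_deg, a, b):
    """Body-fixed (x, y, z) of a point on the surface of a biaxial ellipsoid.

    lat_deg is planetographic latitude, lon_deg is east-positive longitude,
    and a and b are the semi-major (equatorial) and semi-minor (polar) axes.
    The result is in the same length unit as a and b.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = 1.0 - (b * b) / (a * a)                      # first eccentricity squared
    n = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)  # prime vertical radius
    x = n * math.cos(lat) * math.cos(lon)
    y = n * math.cos(lat) * math.sin(lon)
    z = n * (1.0 - e2) * math.sin(lat)
    return x, y, z


MARS_A_KM = 3396.19  # assumed equatorial radius, for illustration only
MARS_B_KM = 3376.20  # assumed polar radius, for illustration only

equator = surface_point(0.0, 0.0, MARS_A_KM, MARS_B_KM)
pole = surface_point(90.0, 0.0, MARS_A_KM, MARS_B_KM)
mid_latitude = surface_point(45.0, 120.0, MARS_A_KM, MARS_B_KM)
```

Setting `a` equal to `b` recovers the spherical case, so the same routine covers both of the body model types mentioned above; irregular bodies require a different representation entirely.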
In addition to the iterative nature of definition, the by-products of defining a geodetic coordinate reference frame are of critical importance. First, these include sensor data that are registered, at sub-pixel accuracy, either to other data collected by the same sensor or to data collected by a different sensor. For example, the absolute accuracy of photogrammetrically bundle adjusted visible spectrum lunar images is significantly improved by also tying the identified features (correspondences or control points) to laser altimetry data (generally by identifying crater rims in the former and topographic features in the latter). Second, the application of photogrammetric control using rigorous camera models supports standardized accuracy and error reporting, a critical component in identifying the suitability of spatial data for different use cases.
3.5.2. Elevation
Elevation data can take a wide variety of forms, including those collected by laser altimeters or generated from stereo images, and are necessary for the derivation of a vast number of framework products and scientific investigations. Laser altimeter data have been collected for some planetary bodies (including the Moon, Mars, and Mercury) but not all, and the availability of data is mission dependent. In addition, traditional Digital Terrain Models (DTMs), TINs, and point clouds are widely used for photogrammetric control, planetary process modeling, and geologic mapping purposes. Point clouds and TINs are becoming increasingly popular as irregularly shaped bodies (asteroids, comets) become more widely investigated. Standards-compliant elevation products, tied to a common reference frame, offer improved integration capabilities that can help reduce challenging fusion operations (e.g., Mars Reconnaissance Orbiter Context Imager-derived DTM tied to a Mars Express High Resolution Stereo Camera-derived DTM).
Altimetry data are collected by measuring precisely the range from an orbiting spacecraft to a planetary body, and typically the instrument collects global data over many years resulting in a highly accurate global topographic product. Altimetry data sets currently exist for the Moon (Lunar Orbiter Laser Altimeter (LOLA) onboard the Lunar Reconnaissance Orbiter [
46]), Mars (Mars Orbiter Laser Altimeter onboard Mars Global Surveyor [
47,
48]), and Mercury (Mercury Laser Altimeter (MLA) onboard the MErcury Surface, Space ENvironment, GEochemistry, and Ranging spacecraft (MESSENGER) [
49]), and some altimetry information was collected for asteroid 433 Eros [
50]. Altimetry data are used to generate gravity models of a planetary body, to improve the accuracy of our understanding of the shape of a planetary body, to revise or establish coordinate systems, and to revise latitude positions. Altimetry data also provide accurate topography at global scales, and elevation models are an important product used as ground control when generating higher spatial scale topographic products.
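The basic relationship between a measured range and a surface elevation can be sketched as follows. This is a deliberately simplified illustration that assumes a nadir-pointed shot and a spherical reference surface; operational altimetry processing also accounts for spacecraft pointing, orbit determination, and instrument timing corrections, none of which are modeled here. The reference radius and orbit values are assumptions used only to make the example runnable.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def range_from_time_of_flight(round_trip_s):
    """One-way range in meters from the round-trip travel time of a laser pulse."""
    return SPEED_OF_LIGHT_M_S * round_trip_s / 2.0


def elevation_from_range(spacecraft_radius_m, range_m, reference_radius_m):
    """Surface height above a reference sphere for a nadir-pointed shot.

    spacecraft_radius_m is the distance from the body center to the
    spacecraft; subtracting the measured range gives the surface radius.
    """
    surface_radius_m = spacecraft_radius_m - range_m
    return surface_radius_m - reference_radius_m


REFERENCE_RADIUS_M = 1_737_400.0                      # assumed lunar reference sphere
spacecraft_radius_m = REFERENCE_RADIUS_M + 50_000.0   # assumed 50 km orbit
round_trip_s = 2.0 * 49_000.0 / SPEED_OF_LIGHT_M_S    # synthetic measurement

measured_range_m = range_from_time_of_flight(round_trip_s)
elevation_m = elevation_from_range(
    spacecraft_radius_m, measured_range_m, REFERENCE_RADIUS_M
)
```

Because the elevation is the difference of large numbers, errors in the spacecraft position map directly into the derived topography, which is one reason altimetry and reference frame definition are refined together.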
Most frequently, planetary DTMs are products generated to evaluate regional- to local-scale topography at higher spatial scales than available altimetry data. DTMs are derived from stereo pairs, often collected in a nadir orientation and then an off-nadir orientation on a subsequent orbit. When planning and acquiring stereo images, care is taken to ensure that image stereo overlap, convergence angles, parallax, and lighting conditions are appropriate [
51]. DTMs are almost always processed using images acquired from the same instrument; thus, spatial resolution differences are typically not a concern. Many software packages are used in the planetary science field to generate DTMs; the two most common are the Ames Stereo Pipeline (ASP) [
52] and SOCET Set®/SOCET GXP® from BAE Systems, although individual scientists develop their own independent software and methods (e.g., [
53]). Typically, DTMs are tied vertically (and to a lesser degree, horizontally) to altimetry data, when altimetry data are available. When altimetry data are not available, knowledge of vertical accuracy degrades dramatically and limits the scope of scientific and engineering problems that can be addressed. Derived products often generated from DTMs include orthorectified images (orthoimages), slope and slope aspect, and surface roughness information.
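As an example of a DTM-derived product, the sketch below computes slope and slope aspect for the interior cells of a gridded DTM using central differences. It assumes a map-projected grid with square cells whose post spacing is in the same unit as the elevations, rows that increase southward, and columns that increase eastward; aspect is reported as the downslope azimuth, clockwise from north. These conventions are assumptions of this sketch, and production tools often use larger kernels and their own aspect conventions.

```python
import math


def slope_and_aspect(dtm, cell_size):
    """Slope and aspect (both in degrees) for the interior cells of a DTM.

    dtm is a list of equal-length rows of elevations. Border cells, and the
    aspect of perfectly flat cells, are returned as None.
    """
    rows, cols = len(dtm), len(dtm[0])
    slope = [[None] * cols for _ in range(rows)]
    aspect = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dtm[r][c + 1] - dtm[r][c - 1]) / (2.0 * cell_size)  # eastward
            dz_dy = (dtm[r - 1][c] - dtm[r + 1][c]) / (2.0 * cell_size)  # northward
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
            if dz_dx != 0.0 or dz_dy != 0.0:
                # Downslope direction, measured clockwise from north.
                aspect[r][c] = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect


# Synthetic 3 x 3 DTM: a plane rising toward the east at 10 degrees.
cell = 20.0
rise = cell * math.tan(math.radians(10.0))
plane = [[col * rise for col in range(3)] for _ in range(3)]
slope, aspect = slope_and_aspect(plane, cell)
```

For the synthetic plane, the center cell has a slope of 10 degrees and faces west (downslope azimuth of 270 degrees), which gives a quick sanity check of the sign conventions.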
3.5.3. Orthoimages and Controlled Mosaics
For terrestrial data users, the selection of geometrically controlled, constant scale orthoimages is straightforward. When available, extraterrestrial orthoimages are preferable as many studies depend upon accurate measurement and geometry (e.g., geologic mapping, mission traversal planning, landing site selection, and process modeling). Within this section, we use the single term ‘orthoimages’ broadly to include not just visible spectrum data, but also non-visible sensor data. For example, the Mars Odyssey Thermal Emission Imaging System (THEMIS) [
54] controlled base map of the Mars surface is a high resolution infrared data product that is tied to the Mars geodetic coordinate reference frame and corrected for topography. The dense atmosphere of Saturn’s moon Titan, with few spectral viewing windows, makes the generation of visible orthoimages problematic for that body, unless data from the various spectral windows are carefully used. The Cassini mission collected a wealth of Synthetic Aperture Radar (SAR) data that are not attenuated by the atmosphere of Titan, making SAR derived non-visible controlled mosaics possible [
55].
Above, we make the implicit distinction between orthoimages and controlled mosaics. Orthoimages require registration of radiometrically and photometrically corrected raw remote sensing data to accurate elevation models, while controlled mosaics are photogrammetrically controlled collections of images that are relatively accurate to each other. Elevation models and geodetic coordinate reference frames derived from elevation models are available for only a subset of extraterrestrial bodies. Therefore, to limit the foundational data classification to only orthoimages would preclude many outer planetary bodies and irregular bodies (e.g., asteroids). In a PSDI, we must broaden the definition of the image-based foundational data products and suggest that if orthoimages are not available, controlled mosaics using radar, infrared or other data are a suitable substitute in some cases.
In defining the image-based foundational data product, it is also necessary to explore cases where an elevation derived geodetic coordinate reference frame does and does not exist. In the case of the former, the spatial extent of the image mosaics and orthoimagery is of lesser concern. As these data products are implicitly tied to the base during orthorectification, they are globally controlled to one another (up to the accuracy of the elevation model). In the latter case, the image data may also contribute significantly to both the derivation of elevation and the generation of a geodetic coordinate reference frame. For example, the recent New Horizons mission to Pluto and Charon is returning a wealth of image data, but an accurate elevation model does not exist. In fact, the ability to create such an elevation model exists only in those cases where either stereophotogrammetry (the derivation of elevation from overlapping stereo pairs) or single image photoclinometry (the derivation of elevation using a single image and illumination information) can be leveraged. In these instances, we make two broad suggestions for classification as a foundational data product. First, the generation of some global (or as near global as the data allow) controlled base is critical in providing an initial geodetic coordinate reference frame to which additional data can be registered. Second, the control network (the collection of image correspondences or ‘tie’ points that allow for photogrammetric bundle adjustment) or reference frame used to generate such a base and associated error tracking (e.g., residuals or Root Mean Square (RMS) error) are of equal importance to the resultant base data product; the provenance of data generation must be preserved.
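The error tracking mentioned above can be as simple as summarizing control network residuals. The following minimal sketch computes a two-dimensional RMS error from per-point image residuals; the residual values are hypothetical, and the convention used (RMS of the residual vector magnitudes) is an assumption, since some packages instead report RMS per axis.

```python
import math


def rms_error(residuals):
    """Root mean square of (d_sample, d_line) residual pairs, in input units."""
    if not residuals:
        raise ValueError("at least one residual is required")
    total = sum(ds * ds + dl * dl for ds, dl in residuals)
    return math.sqrt(total / len(residuals))


# Hypothetical tie point residuals in pixels (sample, line) after adjustment.
tie_point_residuals = [(0.3, -0.4), (-0.6, 0.8), (0.0, 0.5)]
rms = rms_error(tie_point_residuals)
```

Storing the residuals alongside the summary statistic, rather than the statistic alone, keeps the provenance needed to re-evaluate a base product when new data are added to the network.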
Within this section, we hope to have identified the complex relationship between three foundational planetary data products. In the most straightforward case, accurate elevation data have been globally collected and can serve to identify a geodetic coordinate reference frame. From this reference frame and an elevation product, orthorectified sensor products can be derived. In the more complex case, image data serve as the starting data product. The more complex case requires relative photogrammetric control, the derivation of elevation when possible, and the identification of a geodetic coordinate reference frame. The definition of PSDI foundational data products must be flexible enough to support the highly iterative paths from which they can be generated.
3.5.4. Framework Data
As in the terrestrial case, foundational data products do not serve the needs of all use cases. In conjunction with foundational data products, framework data products support many more science and engineering goals. Herein, we identify three supporting framework products. This list is not intended to be a full enumeration or a prioritization of those products, but seeks simply to represent the breadth of data that can and should be encompassed within a PSDI.
Compositional data:
The concept of “compositional data” in the planetary sciences is typically inclusive of a variety of data sets, including remote sensing data from orbital spacecraft, landed spacecraft, and even from individual geologic samples. In a broad sense, compositional data consist of information derived from the interaction between planetary surface materials and the electromagnetic spectrum, the results of which make it possible to determine the chemical or mineralogical make up of planetary materials. The chemistry and mineralogy of planetary materials can in turn be used to make inferences regarding the origin and evolution of planetary bodies. Compositional data sets such as visible and near-infrared hyperspectral, multispectral, thermal emission, radar, gamma ray, and neutron data tend to be most useful for airless bodies which have evolutionary histories driven by both endogenic and exogenic processes e.g., [
56]. Although these data sets are important for inferring both origins and processes, they are dependent upon other framework products that establish or provide local, regional, and global context. In other words, the value of compositional data is greatly enhanced when the data are accurately and precisely tied to specific geologic or morphologic features. Conversely, other framework products have limited usefulness unless they can be tied to compositional data, thus allowing researchers to make broad scale geologic and geophysical inferences.
Geographic names:
Another framework data set is planetary nomenclature. The IAU has been the authority of planetary and satellite nomenclature since its first organizational meeting in Brussels in 1919. The first goals were to normalize various systems used in lunar and martian nomenclatures across different countries. Similar to Earth-based gazetteers, the current IAU planetary nomenclature gazetteer [
57] is used to uniquely name a feature on the surface of a planet or satellite so the feature can be easily located and described. As in the terrestrial case, consistent and accurate feature identification continues to be critical in semantic positioning, context sharing, and effective scientific communication.
Geologic units:
Geologic maps, fundamental syntheses of interpretations of the materials, landforms, structures, and processes that characterize planetary surfaces [
58], are another framework data product that serves to illustrate multiple facets of a product-based PSDI. First, the creation of geologic maps depends upon foundational data products, including registered global and regional orthomosaics and elevation data. Without these products, the interpretation of the surface is not rigorously supported. Second, derived geologic maps integrate into a larger science context and provide a regional or global contextual framework for summarizing and evaluating thematic research. For example, robotic and human planetary exploration, including traverse planning and landing site selection, continues to depend upon geologic maps. Finally, the derivation of geologic maps illustrates the integration of data products into the larger PSDI, as the detailed procedural components of producing high-quality, standardized, peer-reviewed, and technically edited geologic maps are complex and involve a wide range of data, cartographic software tools, technical procedures, and publication requirements [
59]. Therefore, consistent adherence to community standards regarding mapping methods and representation is critical and must be thoroughly considered, reported on, and reviewed [
36]. These standards apply to data collection, attribution, symbology, documentation, and distribution formats.
3.6. Comparing SDI with PSDI
The proposed PSDI and terrestrial SDIs share many commonalities with respect to the broad goals, supporting policies and standards, and user bases. Issues of horizontal and vertical data integration (i.e., data fusion or data heterogeneity) [
9], data interoperability [
24], and user-centric data exploitation in a global context [
9,
17] are shared whether terrestrial or extra-terrestrial spatial concepts are examined. The proposal of a PSDI is not driven by early SDI drivers: economic development, ‘better government’ (as measured by the government funding the development), or environmental sustainability [
11] nor is the use of the SDI designed to improve these outcomes [
17]. (One could argue that the NASA strategic goals broadly parallel SDI goals in terms of open, transparent, and efficient government. We take a slightly narrower view, focusing on the non-Earth planetary sciences.) Instead, the need for a PSDI is motivated by the challenges in planetary data acquisition and discovery, the ever increasing need for collaborative research teams fusing spatial data from multiple instruments, and the growing complexity of research questions that require multi-institution storage and analysis techniques (e.g., cyber-enabled research). In other words, the objectives behind the design of an SDI by any coordinating agency are to provide better communication channels for the community for sharing and using data assets, instead of aiming toward the linkage of available databases. We also note that SDI development, described chronologically, has been classified as first focusing on the ‘why’, later focusing on technological concerns (the ‘how’), and finally returning to the user-centric ‘why’ perspective [
9]. We suggest that within a PSDI the ‘what’ is an equally valid concern due to the complexity and cost of non-Earth based data acquisition. The implicit assumption within the SDI is that spatially enabled data at an appropriate extent, scale, resolution, and quality are available or obtainable, but this is frequently not the case in a planetary context. Calls for inter- and intra-regional cooperation that pervade the applied SDI domain with the goal of improving integration of heterogeneous data are relevant here. Taking a product based view, we compare PSDI and SDI in
Table 1.
The ultimate goal of a PSDI is to provide seamless discovery, access, and exploitation of spatially enabled data for all data consumers without any predetermined requirement of spatial data expertise through the use of cutting edge technologies, standards, and transparent policy initiatives. The development of a strategic PSDI plan is foundational in realizing the ability to fully leverage NASA collected spatial data over the next decades. NASA plays a pivotal role in driving the development of a PSDI, identifying policy alignment with existing SDI mandates and filling policy gaps, and empowering partners to codify a user centered plan for spatial data management.
4. Conclusions
Four primary drivers of the need for the proposed Planetary Spatial Data Infrastructure (PSDI) are rapidly increasing planetary spatial data volumes, high data acquisition costs, increasingly cross-domain research teams, and the need to remove spatial data expertise as a prerequisite for planetary data usage. The current approaches for planetary spatial data discovery, access, and utilization are invaluable in supporting the development of a third generation, user-centric PSDI. Unfortunately, these solutions also lead to issues of entrenchment, where the spatial expertise requirement is viewed as a de facto rite of passage for researchers. We suggest that this requirement can impart a shallow spatial expertise that is insufficient when working with the complexities associated with planetary spatial data.
Existing terrestrial SDIs and the associated theory, along with existing, implicit NASA strategies, provide a critical foundation from which a PSDI can be developed. For example, PolarHub [
12] offers an OGC compliant access mechanism and data aggregation layer that can be adapted and expanded to support the unique characteristics of planetary spatial data. Likewise, the generation of topical terrestrial SDIs (e.g., [
13]) and creation of topical cyber-infrastructures (e.g., [
41]) provide models for planetary body specific SDIs (e.g., a Europa SDI to support a Europa Clipper mission and subsequent data analysis) with strong collaboration, High Performance Computing (HPC), and data mining components. Terrestrial efforts in inter- and intra-organization data registration for integration and cross mission data analysis, such as the Australian Geoscience Data Cube project, the Committee on Earth Observation Satellites Future Data Architectures, or the National Science Foundation funded EarthCube initiative, all provide valuable models for future vertical and horizontal data integration and science collaboration efforts. We consider these to be future models as the technical challenges in data fusion remain high in the planetary sciences and an iterative approach of data registration followed by integration into larger systems is most appropriate. As the planetary science community, we can benefit from the theoretical work presented by [
8,
15] (among many others) in defining both what an SDI is and what the boundaries should be. That is, how does a PSDI integrate with the broader NASA science goals? Finally, existing NASA efforts are critical in supporting the proposed PSDI. The Planetary Data System (PDS) and associated data standards provide the long-lived data archive from which dynamic standards-based data access mechanisms can be developed.
Taking the view of Rajabifard and Williamson [
10], the proposed PSDI efforts are situated firmly within the communication phase of the process model. That is, we are actively engaging the planetary science community and NASA directly to promote the concepts of PSDI and articulate how these map to the uniqueness of planetary spatial data and the planetary science community. Taking a product based approach to limiting the extent of what an SDI can cover, we see that standards, policies, and the general needs of our community are largely shared with the terrestrial community. The iterative nature of planetary foundational data product generation is unique and we have identified those areas where PSDI data products must fulfill different roles.
Moving forward, we envision the identification of stakeholders, data products and access portals to support the development of a body specific PSDI. The realization of a PSDI includes the potential development of necessary foundational data products, propagation of standards, and requests for policy updates in order to support a wide user base with discoverable, accessible, and usable spatial data products. The first PSDI could take the form of a Europa SDI to support an anticipated future mission or a (larger in scope) Mars SDI to support current operations. While the specific form of a PSDI is outside of the scope of this work (and a follow on manuscript is planned), below we identify the specific components of a PSDI that we envision being necessary to assess the value of an SDI to the planetary science community.
Taking a fictional outer planet mission as a use case and the five component product based view as the theoretical framework, we suggest that a body specific PSDI could be realized as follows: (1) Existing outer planet data at varying resolutions have already been collected by previous flyby and orbiting missions. These data products provide the bases for foundational data products that have not been photogrammetrically controlled and integrated into a spatial product. Therefore, the derivation of the necessary (and available) foundational data products is required (e.g., a controlled photomosaic, available topography, and the identification of a geodetic coordinate reference system). Concurrently, available framework data can be collated and prepared for discovery. (2) We assume that the previous effort is of sufficient complexity and scope that an intra-organizational approach is required. Therefore, using common data standards, OGC compliant data interoperability services (WMS, WFS, CSW), and semantically enabled discovery capabilities, the necessary technical infrastructures can be developed. (3) Data access mechanisms with the ability to accept new derived products need to be developed in a manner similar to terrestrial efforts to support distributed science teams. Ideally, these mechanisms include integration of cyber-enabled analysis capabilities and the mission team that will shortly be collecting new data. At this point, the PSDI largely resembles a terrestrial SDI, using common standards, metadata policies, and methodologies. The expert spatial users of the PSDI are integrating, or have integrated, with the terrestrial standards and technical communities to ensure support for planetary standards within the broader terrestrial standard. (4) As data are collected by the mission team and submitted to the PDS for archiving, the same or another team begins the process of integrating the non-spatial data into the existing PSDI. This includes application of techniques necessary to accurately locate the data to support spatio-temporal analysis.
In this work, we have sought to motivate the development of a planetary spatial data infrastructure as a theoretical framework to support the implementation of planetary body specific SDIs. In doing so, we have explored the theory and implementation behind terrestrial SDI and cyber-infrastructures. We have also sought to identify where the proposed PSDI exists within a process based SDI model, a product based SDI model [
10], and existing NASA efforts.