Article

Standards for Data Space Building Blocks

by Francesca Noardo 1,*, Rob Atkinson 1, Lucy Bastin 2, Joan Maso 3, Ingo Simonis 1, Alejandro Villar 1, Marie-Françoise Voidrot 1 and Piotr Zaborowski 1
1 Open Geospatial Consortium, Arlington, VA 22201, USA
2 School of Computer Science and Digital Technologies, Aston University, Birmingham B4 7ET, UK
3 Grumets Research Group, Centre de Recerca Ecològica i Aplicacions Forestals (CREAF), Cerdanyola del Vallès, 08193 Bellaterra, Spain
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(20), 3824; https://doi.org/10.3390/rs16203824
Submission received: 19 July 2024 / Revised: 1 October 2024 / Accepted: 2 October 2024 / Published: 14 October 2024
(This article belongs to the Special Issue Earth Observation Data in Environmental Data Spaces)

Abstract

Data spaces are conceptualised as a trusted and secure distributed data ecosystem through which to exchange resources on the Web. Several efforts define guidance toward data space implementation, such as reference architectures and frameworks. As yet, the proposed data space solutions do not provide common and mature implementation options, and this gap between concept and implementation risks confusing users and developers. However, well-recognised organisations have been developing solutions and standards that address interoperability and good data exchange practices for decades, especially in the domain of geospatial information and remote sensing. Therefore, this paper compares the available solutions, providing the mapping and integration of the proposed blueprints to available interoperable standards. This concrete mapping, followed by a discussion with experts, results in a proposal of integrated reference data space building blocks, and an overview of the related standards and solutions. It is designed to support the effective practical implementation of data spaces and to guide future solution developments. This work can form the base for effective collaboration among different organisations, clearly identifying their scopes. A key role is apparent for standards and use cases from the remote sensing and geospatial domains, which have achieved wide adoption and maturity over the past years.

1. Introduction

The concept of the ‘data space’ [1,2] was mentioned in the early 2000s in relation to the web of data, enabled by Linked Data technologies [3], or as an abstraction for the need for the integrated management of diverse data sources for different applications. Data spaces should minimise the bottlenecks to data exchange that result from diverse dataset storage and representation choices and from the lack of trusted channels to communicate data between different stakeholders. A key concern is the need to improve semantic interoperability, i.e., the ability for stakeholders to find, access, and interpret available data and processing services, and ultimately to evaluate and make effective re-use of resources to deliver value. These factors, in particular, generate issues that impede the efficiency of data retrieval and harvesting as well as the automation of processing pipelines.
In parallel to the conceptualisation of data spaces, a number of open data paradigms, theories, and sharing implementations have been formulated and developed, as well as national or regional open data strategies [4,5,6]. The underlying concept driving such open data approaches is that the value of the data is not in their cost but in their usage and the value that they bring to society. This is particularly clear in situations where data are relevant to a variety of stakeholders globally. An example is data from remote sensing, which, in fact, were among the first data to be exchanged through standardised and online procedures.
The potential value of open data was immediately grasped in relation to scientific data and especially for environmental use cases, for which a cross-border approach is obviously essential [7]. At the same time, in Europe, the Directive for an Infrastructure for Spatial Information in the European Community (INSPIRE) was formulated with similar goals (Directive 2007/2/EC [8]). It was preceded by Directive 2003/98/EC [9] on the re-use of public sector information, known as the PSI Directive and later recast as the Open Data Directive. However, the INSPIRE Directive introduced governance means to support higher efficiency in implementation, such as a roadmap and sanctions.
Other international institutions have also been developing successful architectures and standards for data exchange within their specific domains. For example, the World Meteorological Organization (WMO) has used defined standards to exchange data since 1951. WMO’s approach was originally focused on the so-called “push” systems (data dissemination and broadcasting to everyone ‘listening’ from their community) which leveraged telecommunications to exchange standardised open data in real-time. Subsequently, from 1999, WMO moved to Web services and “pull” systems (data are available to anyone and can be requested, which is more similar to the current data space concept) with the WMO Information System (WIS) [10].
These initial formulations and the successful example of WMO were followed by numerous other initiatives and organisations in various fields, and standards and principles were developed, which are now well-known. Examples include the standards developed within the World Wide Web Consortium (W3C) [11], founded in 1994; within the Open Geospatial Consortium (OGC), founded in 1994 [12]; and the principles defined by the Group on Earth Observations (GEO), established in 2005 [13]. Geospatial data-related organisations quickly understood the need for good data-sharing solutions and practices to successfully support their use cases (e.g., Earth observations, satellite information, environment-related applications). Therefore, they actively started to collaborate to define standards and build consensus over those standards.
More recently, other organisations have reformulated and further specified the concept of the ‘data space’ (Section 2.1) and have begun to propose their solutions to solve some of the implied challenges. For example, the International Data Spaces Association (IDSA) defined ‘data space’ as a data exchange infrastructure characterised by uniform rules, certified data providers and recipients, and trust among public and private partners [14].
In recent years, the European Union developed a greater interest in data spaces as a shared European infrastructure through which both public and private data may be exchanged in a reliable and cost-effective way to facilitate shared data-driven products and services across Member States. The European Commission, in the European Data Strategy, promotes data spaces as open for the participation of all organisations and individuals (under defined conditions); secure and privacy-preserving to pool, access, share, and use data; respecting EU rules and values (especially personal data, consumer protection, and competition law); and empowering data holders to make their data available for reuse (either for free or against compensation). Several projects have been funded by the European Union to research the topic and provide working solutions (Section 3.1). The same concept is being investigated and developed in other parts of the world, often without specifically referring to it as a ‘data space’, for example, by the United States National Geospatial-Intelligence Agency (NGA) with its National Unclassified Data Lake (NUDL) [15], or by the United Nations Environment Programme, working on a Global Environmental Data Strategy (GEDS) [16].
The European data spaces approach is part of a broader global trend focusing on data sovereignty, interoperability, governance, and the creation of secure and trustworthy data ecosystems. Various regions and countries have made comparable efforts, but an analysis on a global level is beyond the scope of this paper. In order to do justice to the complexity of data spaces and the special role geospatial data plays therein, this paper focuses on the European approaches. It thus combines a regional perspective on data spaces with the global perspective of the geospatial community, as discussed primarily in the context of the Open Geospatial Consortium.

The Current Gap and the Purpose of This Paper

Several projects, initiatives, and organisations have provided valuable blueprints and architectures, representing the challenges to be addressed when planning data spaces (see a review in Deliverable 3.2 [17] of the HORIZON Europe USAGE project [18]). However, in such proposals, the connection to the wide range of existing standards and solutions, many of them already adopted in practice, is not straightforward.
Therefore, in this paper, various data space initiatives are mapped to the interoperability and sharing solutions and standards that have been provided or adopted by key international organisations. This mapping especially considers the standards relevant to the geospatial domain (e.g., OGC, GEO, W3C), which are also those most relevant for remote sensing. This offers a starting point to consolidate a common framework to which different solutions, coming from different initiatives, can be related.
This article provides an overview of the capability of the currently available standards and services to address the defined challenges. On the one hand, it identifies the scope of expertise of the organisations managing geospatial information, on which this paper focuses. On the other hand, it may help to identify the remaining gaps in data space developments.
To highlight the usefulness of established geospatial information solutions and infrastructures as a starting point for data space architecture, and how they can act as a potential reference for other kinds of data, we can consider the 14 data space domains proposed by the European Commission [19]. While for some of these, geospatial information is not identified as essential (e.g., Finance, Language, Media, Research and Innovation, Skills), for others, it is at the core of any analysis and decision-making activities (Energy, Green Deal, Mobility, Tourism). For the remaining domains, geospatial data might not be essential but deserve to be considered and included because they are highly likely to add relevant value. In addition, remote sensing data exchange and related applications would be improved by the increased interoperability and transparency of data, as well as by the architecture component solutions provided to address each involved challenge.
The paper starts with an overview of the current principles, standards, and initiatives related to interoperability and relevant to the data space concept development (Section 2). After a short explanation of the methodology (Section 3), the results are then presented (Section 4), reporting intermediate and final data space building blocks and definitions, with current standards mapped to them (in detail in Appendix A) and including a proposal for criteria to be used for standards description and assessment, as a guide to standards users and developers (Section 4.4). The paper ends with a discussion (Section 5) and conclusions (Section 6).

2. Background

In this section, the existing principles, organisations, and initiatives working on data space-related matters are reported.

2.1. Data Spaces Concept

A ‘data space’, as currently defined, is a data exchange paradigm that envisions a distributed architecture or federated data ecosystem where data remain close to their owners or providers, who keep full control of their data and manage the scope of and conditions for their use throughout the life cycle of those data. Data would be effectively exchanged over the Internet using trusted connections for the benefit of various use cases. A data space is defined by a governance framework that enables secure and trustworthy data transactions between participants [20,21].
A data space is supposed to have a decentralised data storage, i.e., data physically remain with the respective data owner until they are transferred to a trusted party [22]. The users of such data spaces should be trusted parties enabled to access data in a secure, transparent, trusted, easy, and unified fashion, according to commonly agreed principles [23]. Access and usage rights can only be granted by persons or organisations entitled to dispose of the data [24]. Data spaces with an appropriate underlying interoperability infrastructure enable the shared understanding and reuse of data and processing capabilities.
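The access model described above can be illustrated with a minimal sketch. All names in it (`UsagePolicy`, `DataProvider`, `publish`, `request`) are hypothetical and do not correspond to any IDSA, Gaia-X, or other data space specification. The sketch only assumes that data stay with the provider and are released when a policy defined by the entitled party permits both the requesting party and the stated purpose.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UsagePolicy:
    """Hypothetical usage conditions, set by the party entitled to the data."""
    trusted_parties: frozenset
    allowed_purposes: frozenset

    def permits(self, party: str, purpose: str) -> bool:
        return party in self.trusted_parties and purpose in self.allowed_purposes


@dataclass
class DataProvider:
    """Keeps datasets at the provider; there is no central copy."""
    name: str
    datasets: dict = field(default_factory=dict)
    policies: dict = field(default_factory=dict)

    def publish(self, dataset_id: str, payload: bytes, policy: UsagePolicy) -> None:
        self.datasets[dataset_id] = payload
        self.policies[dataset_id] = policy

    def request(self, dataset_id: str, party: str, purpose: str) -> bytes:
        # Data leave the provider only if the provider's own policy allows it.
        if not self.policies[dataset_id].permits(party, purpose):
            raise PermissionError(f"{party} may not use {dataset_id} for {purpose}")
        return self.datasets[dataset_id]


# Illustrative values only; the dataset and party names are made up.
provider = DataProvider("mapping-agency")
provider.publish(
    "land-cover-2024",
    b"raster bytes",
    UsagePolicy(frozenset({"city-lab"}), frozenset({"research"})),
)
print(provider.request("land-cover-2024", "city-lab", "research"))
```

A real data space would replace the in-memory check with certified identities, machine-readable policies, and a trusted transfer channel; the sketch only shows where the decision sits, namely with the provider.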
The identified advantages of data spaces are as follows [25]:
  • New services relying on enhanced transparency and data sovereignty;
  • A level playing field for data sharing and exchange;
  • A new digital culture for users, with higher awareness of digital data related ethics and value;
  • Availability of large pools of data;
  • Infrastructure to use and exchange data;
  • Appropriate governance mechanisms.
Although this paper primarily considers the technical aspects of data spaces, these are only one part of the entire challenge, which also includes business, organisational, governance, and legal aspects. A recent Joint Research Centre (JRC) document on European data spaces [25] summarises their principles, requirements, and features, including technical aspects but also a range of other facets (e.g., data sovereignty, citizen centricity, inclusion, self-determination, trust, innovation, and scalability).

2.1.1. Data Spaces beyond Technical Features

As mentioned, implementing a data space implies decisions and infrastructures that go beyond the purely technical ones. In fact, specific business models, governance, organisational models, and legal aspects need to be taken into account [21].
A specific business model, or combination of multiple business models, needs to be developed in order for a data space to ensure a sustainable business case. It should consider network effects, should serve both the supply and demand of data and related services, and should be built on consensus from the multiple users and organisations involved. Data spaces go through different life cycle stages: preparatory, implementation, operational, and especially the growing and scaling stages, including maintenance and improvement. The scaling stage is actually the one with the highest potential, and it is therefore essential that the interoperability solutions chosen are well supported by an established organisation and community.
From a legal point of view, several aspects should be taken into account, namely the different local, national, and supranational legal entitlements to data, the intense legislative agendas, and the intricate interplay between the relevant regulatory instruments. Therefore, legal and governance building blocks need to be proposed. Proposals on the topic come, for example, from the H2020 OpenDEI [26] design principles; the EU Support Centre for Data Sharing (SCDS) [27]; the Data Sharing Coalition [28], which proposes conformance to the Business Legal Operational Functional and Technical (BLOFT) Framework; and some of the pillars identified in the UN Environment Programme ‘Global Environmental Data Strategy’ [16]. The operational aspects of a data space include operational governance agreements, such as compliance with the General Data Protection Regulation (GDPR), the onboarding of organisations, decision-making, and dispute resolution. Business operations such as process streamlining, automation, marketing, and awareness activities are also important components of operational activities. So are the monitoring and logging of data exchanges, to detect and solve reconciliation issues in a timely manner, and the monitoring of the whole infrastructure (software, energy, and resources), to ensure efficiency and a trustworthy service level. Guides and documentation, as well as means for technical and organisational support, should support the correct use, management, and maintenance of the data space.

2.1.2. Main Projects and Initiatives for Data Spaces

Several challenges are involved in the definition and implementation of data spaces. The first is the definition itself, together with the identification of a proper design for the components. As the ambition of data spaces is so wide, a multi-sectoral and interdisciplinary collaboration is essential. Therefore, several organisations and projects are usually required to build such a collaboration, and all of them may propose their solutions for data spaces at different levels. Although the ultimate goal will be an integrated definition and implementation, it is important to be aware of the nature and background of contributing parties, to understand better where and how they contribute, and where they are complementary to each other. This also allows the identification of possible gaps and weaker aspects deserving future attention.
The International Data Spaces Association (IDSA) [14] is a global membership association founded in 2016, covering a range of cross-sectoral fields (research, industry, lawmakers, and others). It has the goal of providing solutions for data spaces, implying the sharing of data in a data economy model in which everyone can keep full control over their data (data sovereignty) when exchanging them. The focus of IDSA is to ensure reciprocal trust between the actors involved (data providers and consumers), security and data sovereignty, building on and further developing existing standards, technologies, measures, and governance models [22].
The Gaia-X European Association for Data and Cloud (Gaia-X), a non-profit association founded in 2021 [29], aims at the development of a data exchange architecture (standards for data sharing, best practices, tools, and governance mechanisms [30]) for federated open data infrastructure based on European values regarding data and cloud sovereignty. Gaia-X proposes a Data Ecosystem and an Infrastructure Ecosystem that are complementary and build upon each other.
The FIWARE Foundation (a member-based organisation, dating from 2016) [31] develops components (‘Generic Enablers’) to support Open Source Platforms and connected solutions, and a Reference Architecture to be extended and developed based on applications’ needs. It proposes open standards for interoperability, focusing on Information and Communication Technology (ICT).
The Big Data Value Association (BDVA) [32]/DAIRO, FIWARE, Gaia-X, and IDSA joined together as the Data Spaces Business Alliance (DSBA), which presents itself as a reference for the deployment of data spaces. In 2023, the DSBA published the ‘Technical Convergence Discussion Document’ [33], also endorsed by the DSSC [34]. The document defines a common technology framework, based on the technical convergence of architectures and models, leveraging mutual infrastructure and implementation efforts.
The ‘Data Spaces Support Centre’ (DSSC) is a project (2022–2025) funded by the European Union under the Digital Europe Programme [35]. The DSSC aims at supporting data space deployment by providing assets in cooperation with a network of stakeholders. It developed a stack of high-level data space building blocks, starting from the OpenDEI project [26] proposal and based on consensus among the DSSC partners. These building blocks have also found consensus among several other organisations and initiatives working on the topic of data spaces.
A similar project is the ‘Data space for smart and sustainable cities and communities’ (DS4SSCC) [36], which developed a Catalogue of Specifications providing an overview of 11 identified building blocks (technical and non-technical), compliant with most of the DSSC specifications, especially in their technical aspects.
The Green Deal Data Space (GREAT) project (2022–2024) [37], funded by the European Union under the Digital Europe Programme, aimed at defining foundations for a Green Deal Data Space as well as the connected community of practice. In their Technical Blueprint (2023), a digital ecosystem service design is proposed as a “secure, trusted and seamless sharing of data” [38]. They state the general principles as follows: Inclusiveness (i.e., all datasets should be allowed in the data space); Fairness (i.e., provide equal possibility of access to the data space); Autonomy (i.e., each data source should maintain its own management, while being allowed to be included in the data space). Moreover, they identified the design principles as follows: Low Entry Barrier, System of Systems, Standardisation and Mediation, Loose-coupling, and Interoperability/Security Orthogonality. The GREAT project proposes to use the ISO Reference Model for Open Distributed Processing (RM-ODP) for describing the proposed architecture.
They identify and define some solutions supporting security and trust (authentication, access control, confidentiality, integrity, and non-repudiation). Finally, the GREAT Project Blueprint [38] lists the relevant services and components that might be useful to address the identified functionalities and building blocks, including standardised solutions already in place in practice.
The European Union also funded the Smart Open-source Middleware (Simpl) project [39] under the Digital Europe Programme and Horizon Europe. It is intended to develop an open source smart and secure ecosystem (middleware platform supporting safe and secure access and interoperability). Simpl adds to the principles for software quality described in Section 2.4 by establishing five principles: anchored to specific use cases; smart and modular; open source; green, scalable and agile; and secure and interoperable.
Among the projects not explicitly related to data spaces, but very relevant to the topic, there is the HORIZON Europe WorldFAIR project [40], which recently produced the description of the Cross-Domain Interoperability Framework [41]. It is intended to identify an architecture and a minimum set of standards and solutions able to support applications across different domains. To reach this goal, it identifies a set of standards that are implemented, well established, and used in different domains. The WorldFAIR deliverable [41] is therefore an interesting reference, since it reports a set of standards that have been practically assessed for effectiveness and portability across different applications.
Finally, input and discussion on interoperability and data exchange also come from the ‘Open Agile and Smart Cities’ (OASC) [42] global network of cities and actors, which since 2015 has collaborated to define and agree on solutions for digital transformation. They define the Minimal Interoperability Mechanisms (MIMs) as minimal technical requirements to facilitate digital solutions for cities, mostly based on leveraging open standards and APIs.

2.2. Principles and Recommendations for Data Interoperability and Management

In the efforts to enable data exchange ecosystems, some principles have been formulated to guide good practices for data management and publication. They come from different sectors, e.g., research, in the case of the FAIR principles [43]; legal directives, such as the European Interoperability Framework (EIF) [44]; or practice and operational initiatives, such as the GEO Data Sharing and Data Management Principles. As a result, their scope and goals are slightly different, overlapping in some cases but more often being complementary.
Because of their different origins and remits, the focus area of each is different. The EIF mostly addresses practices of European public administrations, while the FAIR principles consider researchers and stakeholders from a global perspective.
The European Union, with the EIF [44], gives recommendations and guidance to support a shared and interoperable digital environment for the communication and exchange of data with public administrations in Europe. There, interoperability is defined as “the ability of organisations to interact towards mutually beneficial goals, involving the sharing of information and knowledge between these organisations, through the business processes they support, by means of the exchange of data between their Information and Communication Technology systems”.
The EIF provides 12 interoperability principles, divided into four categories. For each principle, recommendations are proposed. In addition, the EIF defines an interoperability model, structured on four layers (legal, organisational, semantic, and technical) plus a transversal layer related to integrated public service governance, for which other recommendations are provided, and finally, a conceptual model of the interoperability components, again with related recommendations.
The GEO Principles consider primarily technical aspects, while in practice the FAIR principles require a consideration of all of the layers. A decision to reuse data or services is ultimately the result of the evaluation of many factors, such as the semantic relevance of resources to a problem, the ease of use from a legal or licensing perspective, and the cost-effectiveness of access and integration. We can directly compare EIF, FAIR, and GEO here to better understand how they inform the requirements of data spaces.
By addressing different sectors and having distinct objectives and components, the EIF and GEO can be seen to serve complementary but different roles in the landscape of interoperability and data sharing.
Since 2015, GEO has promoted fundamental principles for data sharing, recognising data sharing as a key factor in realising the potential societal benefits from Earth observation. Beyond the GEO Data Sharing Principles, ten GEO Data Management Principles (GEO DMP) were defined by the related GEO Task Force in April 2015 [45]. They are grouped under five categories: discoverability, accessibility, usability, preservation, and curation.
The GEO Data Sharing and Data Management Principles Subgroup has made a first mapping of these GEO principles to the FAIR Principles and is working on a more refined realignment. They also advocate the Transparency, Responsibility, User focus, Sustainability, and Technology (TRUST) principles [46] for digital repositories and the Collective Benefit, Authority to Control, Responsibility, Ethics (CARE) principles [47] for Indigenous Data Governance. All these principles are of interest for data spaces.

2.3. Established Interoperability and Standardisation Organisations and Initiatives

Some of the groups working towards consensus on interoperability across diverse operations are particularly relevant to geospatial information, including the ISO TC 211 committee, or the ‘Spatial Data on the Web’ joint Working Group of the World Wide Web Consortium (W3C) and the Open Geospatial Consortium (OGC). Others, such as the Group on Earth Observations (GEO), recommend standards and practically evaluate them and their interoperability through demonstration portals.
The International Organization for Standardization (ISO) [48] is recognised as a global institution for publishing standards, covering a wide range of domains. Founded in 1947, it is an independent, non-governmental international organisation bringing together experts to share knowledge and develop voluntary, consensus-based international standards, supporting market innovation and interoperability. ISO standards are a general reference and are usually considered a priority for compliance.
The Open Geospatial Consortium (OGC) [12] is a global consortium of thirty years’ standing, which provides open standards and solutions to support interoperability for geospatial data. The OGC operates a range of liaisons with ISO, W3C, and others within this specialised context. It has a particular emphasis on improving FAIR through standards openness. Several standards and solutions published by the OGC are currently well known and adopted solutions that can be considered as options to address the required functionalities of data spaces.
The World Wide Web Consortium (W3C) [11] was founded in 1994 and publishes standards enabling the development of the World Wide Web. Many of the other standardisation actions for interoperability across the web build on W3C standards. In particular, a joint W3C-OGC working group, namely, the ‘Spatial Data on the Web Working Group’, running since around 2017, has gathered interoperability and data sharing experts from major organisations around the world and is active in providing and maintaining vocabularies, best practices, and documents supporting the effective use and better sharing of spatial data on the web. Moreover, the work of the group identifies where joint action in developing standards is needed from W3C and OGC [49]. Although formulated in terms of joint standards publishing, and being focused on spatial data, the scope of the working group is very much aligned with the data spaces concept. The deliverables produced can, therefore, be considered as an effective base to be extended in data space implementation. In particular, the latest version of the Spatial Data on the Web Best Practices [50] proposes practices and solutions for publishing spatial data “FAIRly” through the web (including principles, data representation and documentation, validation, data access, metadata, and data ethics), as well as analysing current limitations for future development. They also propose an interesting extension to the FAIR principles, including web accessibility for humans and machines and data quality. Another part of their work regards the legal and ethical implications of sharing data over the web and provides guidance [51].
Some very relevant requirements have emerged from summits and policies related to sustainability, resilience, and disaster risk reduction. At the global level, the Group on Earth Observations (GEO) is a partnership of national governments and participating organisations wishing to coordinate and collaborate on the management and use of Earth Observations; this partnership was initiated in 2003 to support work on the aforementioned challenges. The GEO community developed the Global Earth Observation System of Systems (GEOSS) [52], aiming at the integration of the different observing systems and connection of infrastructures through common standards.

2.4. Software Interoperability Standards

Data spaces need software components that support the exchange, discovery, evaluation, and use of data. The field of software development has also provided standards related to software quality, including interoperability and reciprocal connections between components; these should also be taken into account. For example, the ISO/IEC 25000 series defines, in ISO/IEC 25010 ‘System and software quality models’, some software product quality criteria, including categories such as ‘compatibility’ or ‘interaction capability’. These overlap with some principles and issues addressed within the data spaces blueprints and proposed architectures (Figure 1). In the same ISO/IEC 25000 series, the ISO/IEC 25012 Data Quality model [53] indicates parameters for assessing the quality of data, several of which relate to the capability of data to be accessed, understood, and tracked. These examples clearly fall in the same scope as the data space objectives.
Another relevant standard related to software interoperability is the Open Systems Interconnection (OSI) model, ISO/IEC 7498. This defines a layered architecture for systems interconnections and communication, including data exchange. Each layer includes definitions and guidance that may align with the concerns of data spaces: application, presentation, session, transport, network, data link, and physical [55].
Finally, the Reference Model of Open Distributed Processing (RM-ODP) [56], standard ISO/IEC 10746 [57], is considered for the technical blueprint description in the GREAT project, as well as in several other initiatives and organisations (e.g., OGC). The viewpoints recommended by the standard for the modelling of software architectures are as follows:
  • Enterprise—business requirements of the system (purpose, scope policies);
  • Information—information managed by the system and the structure and content type of the supporting data (semantics and information processing);
  • Computational—functionality provided by the system and its functional decomposition (objects which interact at interfaces);
  • Engineering—distribution of processing performed by the system to manage the information and provide the functionality (mechanisms and functions required to support distributed interactions between objects in the system);
  • Technology—technologies chosen to provide the processing, functionality, and presentation of information.

3. Methodology

Mapping these diverse approaches to data spaces to the existing interoperability and sharing solutions and standards widely adopted at European and global levels required a systematic methodology. The starting point was the work of organisations and projects that have worked extensively on data space definitions and related frameworks in recent years. Among all the options, the data space building blocks stack proposed by the Data Spaces Support Centre project was identified as the most inclusive and high-level, as well as being agreed by several other organisations and project consortia. These building blocks describe the different challenges for producing an effective data space. This stack is considered and further refined using a bottom-up approach, starting by mapping the solutions already provided and adopted in practice, especially focusing on technical standards from the geospatial domain. Interoperability-related principles are also considered for the mapping.
Figure 2 summarises the different parts of the work described in this paper. This study began with work developed in the Green Deal data spaces Horizon Europe project ‘Urban Data Space for Green Deal’ (USAGE; https://www.usage-project.eu, accessed on 31 August 2024) and was extended within the ‘All Data for Green Deal’ (AD4GD; https://ad4gd.eu, accessed on 31 August 2024) project addressing the same topic (Section 4.1.1 and Section 4.1.2, respectively).

3.1. Review and Comparison of Data Spaces-Related Initiatives and Blueprints

As reported by the USAGE deliverable 3.2 [58] and updated for this paper, after reviewing the most prominent initiatives regarding data spaces definitions and related solutions and standards proposals, these were mapped to each other to highlight the reciprocal relationships, the respective scopes or main focuses, and the progression and dependencies in the reference documents [17]. Figure 3 summarises the progression of the different initiatives and related proposed architectures developed over time.
Aspects of interoperability and data sharing (see Section 2.2) are relevant to enable effective data spaces. However, several data spaces-related conceptualisations [22,30] are focused on data sovereignty and trust. By contrast, the institutions involved in (geo)data standardisation (e.g., Joint Research Centre—INSPIRE, OGC) or information sharing through the web (e.g., W3C) have been developing solutions and standards to support data interoperability for some time.

Identification of an Initial Baseline Suite of Building Blocks

Considering the timeline of the different initiatives, as well as the participation of the main actors in data spaces conceptualisation and implementation, the building blocks proposed by the DSSC [59] were taken in this study as an initial baseline for data space building blocks. As a European-funded project, DSSC supports the participation of other activities, not only European-funded ones, such as Gaia-X, IDSA, and BDVA (Section 3.1), building on European efforts and regulations. In addition, the project DS4SSCC considers the same building blocks as a reference for its catalogue, in which some standards are already mapped, alongside the Minimum Interoperability Mechanisms defined by OASC. The GREAT Blueprint [60] acknowledges the proposed building blocks as a reference, in addition to the mapping of building blocks and components proposed by the different initiatives. Grothe [23] also maps the proposed blueprints and architectures onto an initial version of the building blocks, pointing out the main scope of the considered projects and initiatives.
The DSSC building blocks are grouped into ‘Technical’ building blocks and ‘Governance and Business’ building blocks. Although governance and business are essential aspects, in this paper, the focus is on the technical building blocks, and the provision for semantic interoperability. For each building block, the alternative solutions are mapped, starting from the mapping already completed by the DS4SSCC, and are critically re-considered against these concerns and the relevance of further standards and solutions, primarily coming from the W3C, OGC standards, INSPIRE, and GEO references. The direct reference to DS4SSCC remains in the Tables in Appendix A when additional standards are reported by DS4SSCC, with respect to the main references considered in this paper.
First, a mapping to interoperability solutions and principles was performed [58], and the first integration to the building blocks stack was proposed to improve consistency with the available standards and provide a suitable base for technical implementations (Section 4.1.1).
Each building block proposed in the first stage (namely, within the USAGE project) was also critically reconsidered by the AD4GD consortium during a hackathon in Turin in early February 2024. The hackathon was attended mostly by the AD4GD consortium and advisory board, as well as by some guests (e.g., from Sensor.Community [61]). Moreover, during the same days, an open event was organised with local stakeholders on the air quality use case, and the discussions about data exchange needs and adopted practices also fed the discussion on refining the building blocks. The approach capitalised on the experience gained by the consortium during the first part of the project, including the development of AD4GD use cases and supporting architecture. Moreover, the experience of the consortium partners (deeply involved in OGC, FIWARE, and other interoperability-related initiatives and projects) in longer-term interoperability solutions and technology developments was brought to the discussion to justify the proposed changes (Section 4.1.2). In particular, each building block was described, together with the choices and changes applied. Based on these descriptions, examples from the AD4GD architecture components and solutions, as well as from other projects and cases from the consortium’s experience, were considered. These were mapped to the building blocks and, where the building blocks were not sufficiently defined or unambiguously understood, the needed changes were applied.
As a result, data space building blocks were identified, having a sufficient granularity level to allow the consistent mapping of each building block to single functionalities addressed by different standards.
To further refine the mapping, a workshop was held with a panel of experts involved in standardisation organisations and initiatives who had established experience in interoperable architecture and data design and implementation (Section 3.2).
An initial set of the main standards and solutions, as currently available, is then mapped to the identified building blocks, connecting concepts and ideas to proven solutions. This helps to identify where the solutions provided (i.e., working and adopted standards) can already represent a basis for further data space development, facilitating their uptake in general.
Finally, to assist in navigating the wide range of available solutions, some criteria are proposed to provide metrics for the consistent description and assessment of each standard (Section 4.4). This allows users to evaluate and choose the standards they need. At the same time, it can guide standardisation organisations to improve their standards and present them transparently and effectively. However, this is an initial proposal, which will need to be improved through additional research, including testing and a consensus-based process.

3.2. Finding a Consensus over the Revised Data Space Building Blocks

As mentioned in Section 3.1, the data space building blocks stack provided by the DSSC project was analysed and mapped to the available standards and shared data management principles. It was further integrated with reference to specific implementation needs in later phases, especially considering insights and pilot cases from the AD4GD project. Both the USAGE and AD4GD project consortia include several partners deeply involved in standardisation actions and activities, as well as developers of interoperability solutions. Their agreement on the proposed integration of the DSSC building blocks is therefore also valuable as an expert validation, reinforcing the bottom-up approach adopted in the first stage.
To further extend the consensus over the integrated building blocks stack and to outline possible shortcomings, a workshop was organised, involving the authors of this paper (who are involved in several data ecosystems-focused projects and standardisation initiatives) and external experts with similar extensive experience from international research, implementation, and practice perspectives (Table 1). Another goal of this additional review was to extend the validity of the proposal, involving experts who work in international organisations (such as OGC) and in their respective international projects all over the world.
The initial results (Section 4.1.1 and Section 4.1.2) were shared with and explained to the panel of experts. The panel was then asked to give their feedback on the adopted choices, considering their own solutions and standards, other frameworks they know, and general experiences in research, work, and projects. A form was provided which guided them through each step and choice in the DSSC data space building blocks evolution, asking them for specific feedback and level of agreement. A final meeting was organised to discuss the possible discrepancies and different points of view, and to find a common agreement. The final results were summarised as reported in Section 4.1.3 and shared with the panel for last comments and feedback. As a substantial part of the refinement of the building blocks happened across several iterations, taking its final shape during the discussion phases, it would not be efficient to report the detailed feedback given by each expert at each step. Instead, the final results, on which the panel reached an agreement, are shared. Future implementations will provide final validation or offer additional opportunities for refinement.

4. Results

The results reported in this Section describe the outcome of the methodology phases described above, starting with the integration of the data space building blocks (Section 4.1). The available standards, many of which were considered in the building blocks discussion, are then mapped to these results (Section 4.2). Two further sections report examples of how data space building blocks can be used for remote sensing use cases (Section 4.3) and propose initial metrics for assessing standards to guide users and developers (Section 4.4).

4.1. Data Space Building Blocks Integration

The results of the three main phases of the building blocks integration methodology are reported in Section 4.1.1, Section 4.1.2 and Section 4.1.3.

4.1.1. Improved Mapping to International Solutions and Principles for Increased Building Block Granularity

The mapping of building blocks to current solutions highlighted the need for an increase in granularity in the proposed building blocks to allow their consistent use (Figure 4). In particular, under the ‘data exchange’ aspect, both the communication technology, such as Application Programming Interfaces (APIs), and the format to encode data should be considered, and these are two separate issues, for which different standards and solutions apply.
Similarly, usage policy specification and the control over the compliance to the terms established in the policy should be considered separately as they regard (a) the way in which to express and encode the policy terms and (b) the solutions used to read and enforce such terms.
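The distinction between (a) and (b) can be sketched in a few lines of Python. This is an illustration only and does not follow any specific policy standard: the policy structure, the field names (`permitted_actions`, `permitted_purposes`, `expires`), the asset name, and the function `is_use_allowed` are all hypothetical.

```python
from datetime import date

# (a) Policy specification: a declarative, machine-readable statement of terms.
# The structure and field names are hypothetical, chosen for illustration only.
USAGE_POLICY = {
    "asset": "lulc-timeseries",
    "permitted_actions": ["read", "derive"],
    "permitted_purposes": ["research"],
    "expires": "2030-12-31",
}


# (b) Policy enforcement: separate logic that reads the terms and decides.
def is_use_allowed(policy: dict, action: str, purpose: str, on: date) -> bool:
    """Return True only if every term of the policy is satisfied."""
    if action not in policy["permitted_actions"]:
        return False
    if purpose not in policy["permitted_purposes"]:
        return False
    return on <= date.fromisoformat(policy["expires"])
```

Because the terms live in data rather than in the enforcement logic, the same policy document could in principle be read by different enforcement components, which is why the two concerns call for different standards.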
“Data services and offerings description” is intended to describe metadata. However, these should be described for each parameter of data, software, and services. It may be noted that, in a workflow, such elements correspond to the core concepts of the W3C Provenance model: Entities, Agents, and Activities. To keep these distinctions clear and provide appropriate interoperability standards for each case, additional building blocks are needed.
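To make the correspondence with the W3C Provenance model concrete, the following minimal sketch records one processing step in terms of Entities (data), an Activity (a service or software run), and an Agent (a responsible party). The layout follows our understanding of the PROV-JSON serialisation; the `ex:` identifiers, the namespace URL, and the helper `build_provenance` are assumptions made for illustration.

```python
import json


def build_provenance(dataset_id, source_id, activity_id, agent_id) -> dict:
    """Assemble a PROV-JSON-style document linking entities, an activity and an agent."""
    return {
        "prefix": {"ex": "https://example.org/"},  # placeholder namespace
        "entity": {source_id: {}, dataset_id: {}},
        "activity": {activity_id: {}},
        "agent": {agent_id: {}},
        # The activity consumed the source entity ...
        "used": {"_:u1": {"prov:activity": activity_id, "prov:entity": source_id}},
        # ... and produced the derived dataset ...
        "wasGeneratedBy": {
            "_:g1": {"prov:entity": dataset_id, "prov:activity": activity_id}
        },
        # ... under the responsibility of the agent.
        "wasAssociatedWith": {
            "_:a1": {"prov:activity": activity_id, "prov:agent": agent_id}
        },
    }


doc = build_provenance(
    "ex:lulc-map", "ex:satellite-scene", "ex:classification", "ex:data-provider"
)
print(json.dumps(doc, indent=2))
```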
International principles and recommendations for the good management and sharing of data (Section 2.2) were mapped to the building blocks baseline. These principles are a major driver in the digital economy that cannot be ignored, and they have to be explicitly addressed to benefit from related resources and be adopted by a broad community. Additional changes were generated in the proposed building blocks stack in order to make it consistent with such principles, as well as able to include additional standards and components being developed and tested for data spaces developments (Figure 5).
Considering FAIR principles, the ‘Data Interoperability’ category was generalised to address a wider range of aspects and was renamed ‘Data FAIRness’. For each dataset involved in a data space, it is important that all the aspects listed under the ‘Data FAIRness’ category be properly documented in extended versions of metadata, including or linking to the following:
  • Data model, which should be based on standards, typically specifying the used profile (i.e., the subset of a more comprehensive data model, if applicable) and possible extensions. A standardised documentation of them, with machine-readable encoding to possibly support data validation, shall be provided. Data models and profiles should specify the following:
    The semantic and structural description of the data;
    The use of geometry and any related aspect, such as the kind of representation used (as solids, surfaces, polygons, lines, raster), kind of geometry stored, any topology representation required, level of detail or resolution, accuracy, and so on.
  • Data Exchange, i.e., encoding, related to the syntax.
  • Data description (metadata), moved from the ‘Data value creation’ category after splitting from the related service.
  • Provenance and Traceability, extending the attribute ‘provenance’ or ‘lineage’ of typical discovery metadata allowing the visibility of underlying data supply chains critical to a suitable understanding, and subsequent reusability of the data.
  • Data Licences (moved from the ‘Data Sovereignty and Trust’ category after splitting from the related control service), indicating the conditions for use of the data and reusability of derived outputs.
The ‘Data Value Creation’ category is renamed as ‘Tools for FAIRness’ to emphasise that the contained solutions are not adding to the intrinsic value of data, but rather enabling full leveraging of this value by means of the FAIR principles’ comprehensive support.
The ‘Data exchange–communication (e.g., APIs for data exchange)’ building block (previously under ‘Data Interoperability’) has been moved to the ‘Tools for FAIRness’ category. Moreover, a building block for ‘Data requirements specification and data validation’ was added. In fact, it is essential to agree on standardised methodologies and, possibly, supporting tools, to define exactly the data requirements, considering all the building blocks contained in the ‘Data FAIRness’ category.
Data requirements specifications connect the needs of use cases and the processing required for the datasets that need to be retrieved or produced. A highly detailed definition of many aspects of such data requirements, including the needed data quality, is necessary to support evaluation, planning, and any possible automation of data re-use and integration steps.
When such data requirements are defined in a machine-readable encoding, automatic data validation against them becomes possible, ensuring that data to be input into processing have sufficient quality and characteristics, and the result of processing or analysis has, therefore, the expected reliability. They enable pipeline automation, in addition to other advantages given by data spaces. In addition, several aspects related to services were not placed within the building block stack, despite playing a relevant role in supporting data preparation for sharing, data use, analysis, processing, and so on. Such software should exist for the whole data space in order to deliver maximum benefits. In addition, it needs to be suitable for connection to the ecosystem, as well as being able to manage standardised data, or any shared data, properly. Therefore, we proposed an additional category addressing ‘Services FAIRness’ which contains the related building blocks—both from the previous stack, such as the ‘Software descriptions (metadata)’ and new proposals; for example, those to support the documentation and definition of access to software and the related licenses.
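The automatic validation mentioned above can be sketched as follows. This is a simplified illustration, not one of the standards mapped in this paper: the requirements structure, the field names, and the thresholds are hypothetical, and a real deployment would more likely rely on an established schema language.

```python
# Hypothetical machine-readable data requirements for one dataset.
REQUIREMENTS = {
    "required_fields": {"id": str, "land_cover_class": str, "resolution_m": float},
    "allowed_values": {"land_cover_class": {"forest", "urban", "water", "cropland"}},
    "max_values": {"resolution_m": 30.0},
}


def validate(record: dict, requirements: dict) -> list[str]:
    """Return a list of violations; an empty list means the record conforms."""
    errors = []
    # Structural check: every required field is present with the expected type.
    for name, expected_type in requirements["required_fields"].items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            errors.append(f"wrong type for {name}")
    # Semantic check: coded values must come from the agreed list.
    for name, allowed in requirements["allowed_values"].items():
        value = record.get(name)
        if isinstance(value, str) and value not in allowed:
            errors.append(f"value not allowed for {name}: {value}")
    # Quality check: numeric values must not exceed the agreed maximum.
    for name, maximum in requirements["max_values"].items():
        value = record.get(name)
        if isinstance(value, (int, float)) and value > maximum:
            errors.append(f"{name} exceeds {maximum}")
    return errors
```

A pipeline could call `validate` on each incoming record and only pass conforming data to the processing step, which is the kind of automation that machine-readable requirements are meant to enable.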

4.1.2. Building Blocks Refinement with Input from Experts and Green Deal Data Spaces Projects

In the workshop held in Turin by the AD4GD consortium, additional integrations were proposed to the building blocks stack, resulting in the framework represented in Figure 6.
The category ‘Data Sovereignty & Trust’ only underwent minor changes. For example, one more building block, the ‘Sharing traceability’, was added to address traceability of the data. This one came from the splitting of the original DSSC ‘Provenance and Traceability’ building block in two: one about solutions helping to track the data and their use, and one about expressing provenance according to standards, i.e., the ‘Data Provenance models’, which remained within the category ‘Data FAIRness’.
Within the ‘Data FAIRness’ category, ‘Data requirements schemas’ was added, as complementary to ‘Data requirements system (definition + validation)’ under ‘Tools for FAIRness’, both resulting from the split of the previous ‘Data requirements specification’. Symmetrically to ‘Data Licences’, a ‘Licences for Services’ building block was added under ‘Services FAIRness’.
Under the category ‘Tools for FAIRness’ the following building blocks were added:
  • ‘Vocabularies and Meaning service’, i.e., services enabling semantic interoperability by defining and providing terms and mechanisms to represent and leverage “shared knowledge” in different domains;
  • ‘Data transformation’, i.e., any mapping tool and routines facilitating data integration and conversions.
Moreover, ‘Vocabularies’ were added as a transversal building block for both Data FAIRness and Services FAIRness categories, since they are relevant tools to support all the blocks in those categories.
Finally, we added a category, which we considered as not being entirely part of the data spaces, but rather interacting with them (possibly under the concept of ‘digital twin’), to host all computation-related building blocks, i.e., processing, software access (previously under ‘Services FAIRness’), workflows, and actuators.
We also considered adding the data space assets (contents) themselves, which were classified as requirements, metadata, provenance, semantic tagging, data (including personal data), and licenses.

4.1.3. Validation of the Resulting Building Blocks

In the final workshop, the panel of experts described in Section 3.2 reported, through a form, their feedback on each choice made to map the DSSC building blocks to the available standards and to the components used in the projects and current pilots. These results were used as a basis for a discussion to find a consensus over such a mapping and extension. This was an opportunity to clarify which components could be interpreted in different ways, and whether the mapping was aligned with general experience of standards and interoperable systems, as well as with similar discussions in the national and international venues in which the panel experts take part.
Finally, the DSSC blueprint evolved in parallel: the study had originally considered the DSSC Blueprint v.0.5, published in September 2023. We therefore compared the results of the discussion with the structure and definitions of the updated DSSC Blueprint v.1.0, published in March 2024, and applied the needed adjustments.
Figure 7 depicts the result of the DSSC data space building blocks extension based on the mapping completed (below), compared to the DSSC Blueprint v.1.0 technical building blocks (above). In Table 2, the definitions of each building block and the possible changes with respect to the DSSC blueprint are reported, as well as the reasons behind them.

4.2. Available Standards for Data Space Building Blocks

The standards and solutions proposed by different organisations and institutions were mapped to the integrated building blocks stack (Appendix A) and can be considered as an initial reference catalogue for data spaces developers to address each relevant data space aspect by choosing among a set of available solutions.
The focus of the mapping reported in the Appendix is especially on open and international standards. The list reported is intended as an initial and provisional mapping, which can be improved with additional discussion and clarifications by the mentioned organisations about their own standards and solutions, which might still be under development in some cases. Ideally, the mentioned organisations, and possibly others, could agree in refining such mapping and maintain it through time by means of a joint collaboration, for the advantage of all of them and of data spaces in general.
It is hard to comment on and interpret the mapping of standards to the building blocks, although the tables in Appendix A make clear how diverse the situation is for different building blocks. In some cases, several well-known and mature standards are available (e.g., data models, data encodings), while other building blocks are still quite new and can only count on rather general high-level guidelines, or on initial solutions still to be thoroughly tested and improved (e.g., data requirements specification, trust framework). The tables therefore give an initial idea of where the major gaps with respect to data spaces solutions are, as well as which organisations can already provide skills related to the different issues. However, we are not yet able to draw conclusions, because deeper insights and tests would be necessary for each building block.
However, it is possible to see from the tables how standards coming from geospatial information-related organisations, which have been tackling the interoperability problem for a long time due to the nature of their use cases, can offer an extensive set of solutions, especially for all the building blocks related to the category ‘Data Specification for FAIRness’: data models; Data Exchange–Encodings; Data Descriptions (metadata); Data Requirements and quality schemas; and Data provenance model. For ‘Data Sovereignty and Trust’, although some solutions are available, and the current developments are addressing the needs stated by the building blocks in such category, other organisations can provide a wider set of solutions. An extensive set of standards and solutions is again available for the building blocks in the category ‘Data Value Enhancement’.
To improve the reported mapping and integrate this study, it would be useful to discuss the available standards and proposed solutions, which can be very diverse in terms of reference technology, maturity stage, and level of uptake for different reasons. However, agreed criteria and a measurement matrix would be necessary to compare them to each other, in order to guide the users through the whole list. For this reason, an initial list is proposed in Section 4.4, to be improved and discussed in future activities, preferably by a joint working group of standardisation organisations.

4.3. A Remote Sensing-Related Case Study for a Green Deal Data Space: Landscape Connectivity Workflow

An example of a data space architecture for a remote sensing use case can be seen in the biodiversity pilot study of the Horizon Europe AD4GD project [62]. The diagram delineating this set of interacting building blocks (Figure 8) is laid out according to our previously described classifications of the assets, data models, tools, and services to make data FAIR. This classification aligns with the data space concept but also recognises other processes that augment the data space or make use of the assets in the data space. The end-to-end workflow of the pilot covers a range of data space concerns from stakeholder discovery, access, and use/re-use of resources to producer data sovereignty.
The pilot case illustrated has a fundamental reliance on remote sensing and its derived products, since a key input to the workflow is the time series of land-use/land-cover (LULC) information derived from multi-spectral satellite data. Combining this LULC with IoT and citizen science data, the workflow processes the resulting maps to establish and validate key locations for landscape functional connectivity and endangered species persistence under different scenarios. The data pre-processing and computational methods for connectivity calculation can be complex to replicate and thus, in addition to a user client for accessing the workflow, this pilot requires modular analysis and data discovery/access components which can be deployed on a range of platforms. Many of these components reuse or extend existing standardised solutions from those which have been described and mapped in earlier sections. Some examples relating specifically to remote sensing and ancillary data are as follows:
  • The LULC maps are curated in and accessed from an OpenDataCube data store which implements the OGC’s GeoTIFF standard [63] as well as access and publishing standards for data and maps: Web Map Service [64], Web Coverage Service [65], Web Processing Service [66], and OGC API Coverages.
  • Species occurrence data are retrieved from an alternative data cube solution, rasdaman [67], which also conforms to the above OGC interface standards and thus allows the flexible re-use of project code.
  • IoT data from camera traps are published to a STAplus Web service that complies with an extended implementation of the OGC’s SensorThings API standard [68,69]. This harmonises access to a range of bespoke commercial sensors and permits a common access interface that allows a range of querying and filtering operations.
  • Citizen science data on roads, rail, and waterways, which are used to enrich the LULC maps, are derived from OpenStreetMap as Overpass JSON and converted to the standardised OGC GeoPackage format [70], which allows easy processing and combination with the other open standard data formats.
  • Data used in and produced by the project can be discovered and evaluated because of the standardised metadata which conform to the ISO 19115 standard [71], and which are published in GeoNetwork, a catalogue implementation that conforms to the ISO 19110 standard for feature cataloguing [72]. Because of their standards-compliance, these metadata records can be easily exported to or federated by other catalogues which conform to the W3C DCAT standard [73].
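As an illustration of why standardised interfaces allow code to be re-used across components such as those listed above, the sketch below builds request URLs for two of the mentioned standards using only the Python standard library. The endpoint URLs, layer name, and datastream identifier are placeholders and do not refer to the actual AD4GD services; the request parameters follow our reading of WMS 1.3.0 and of the SensorThings API query options.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width=512, height=512):
    """Build a WMS 1.3.0 GetMap request URL for a single layer."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "CRS": "EPSG:4326",
        # In WMS 1.3.0, EPSG:4326 uses latitude first: minLat,minLon,maxLat,maxLon.
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


def sta_observations_url(base_url, datastream_id, since, top=10):
    """Build a SensorThings API query for the latest observations of a datastream."""
    query = {
        "$filter": f"phenomenonTime ge {since}",
        "$orderby": "phenomenonTime desc",
        "$top": top,
    }
    return f"{base_url}/Datastreams({datastream_id})/Observations?{urlencode(query)}"


# Placeholder endpoints and identifiers, for illustration only.
map_url = wms_getmap_url("https://example.org/wms", "lulc_2020", (41.0, 1.5, 42.0, 3.0))
obs_url = sta_observations_url("https://example.org/sta/v1.1", 7, "2024-01-01T00:00:00Z", top=5)
```

The same two functions would work unchanged against any conformant server, which is the practical benefit of the interface standards named in the list.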

4.4. Measuring Standards Quality and Uptake

To further guide users and data spaces developers in the choice of suitable standards, some indications would be useful about the quality of the standards, supported technology, and the level of uptake (e.g., available compliant datasets, available supporting software, supported use cases in research or in operational environments, and so on).
When choosing a set of technical standards from many options, with varying degrees of overlap and inter-dependencies, it is essential to apply a systematic approach to evaluate the quality of standards and their appropriateness for collective adoption. Standards quality is separate from, but related to, uptake and maturity. All three aspects are factors in the evaluation of, and decision to reuse, specific standards. It should be noted that, whilst quality is not the ultimate arbiter of uptake, adoption by the developer community is a key enabler and driver of solutions, and is usually driven by some aspect that is perceived as providing advantages (e.g., quality of design).
Therefore, we should explore standards quality from the perspective of data spaces requirements, and then follow up to explore how best to allow this quality to be seen as attractive to the developer community.
It is worth noting that the ultimate goal of the data interoperability and management principles (Section 2.2) is realised through data reuse itself. The evaluation processes that trigger reuse are supported by the specific principles, but other factors also play a significant role. For data, this includes design factors, such as how the data are gathered and how appropriate this is to the end use requirements. To achieve this ultimate goal, significant attention must be paid to the FAIR principle I1: “(Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation”. This has implications for the way metadata are handled: it is not possible to specify metadata standards for all possible aspects of all possible datasets. Thus, it is advisable to provide extensible graph solutions that can adapt to the available forms of metadata, even if some aspects of the metadata need to be standardised to meet data space requirements.
This language must be expressive enough to convey, in a standardised and recognisable way, the details required for evaluation. The technical implication is that syntactical languages such as JSON, RDF serialisations, XML, etc., are necessary but not sufficient: the description of the data itself must also be standardised, and in turn this means that common aspects must be standardised, as per FAIR R1.3: “(Meta)data meet domain-relevant community standards”.
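The difference between syntax and standardised description can be illustrated with a small metadata record. In the sketch below, JSON provides only the syntax, while the meaning comes from terms of shared vocabularies (here W3C DCAT and Dublin Core terms). The dataset identifier, the URLs under example.org, and the list of required terms are assumptions made for illustration and do not represent an agreed profile.

```python
import json

# JSON-LD-style description of a dataset using DCAT and Dublin Core terms.
# Identifiers and URLs under example.org are placeholders.
dataset_record = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/dataset/lulc-2020",
    "@type": "dcat:Dataset",
    "dct:title": "Land-use/land-cover map 2020",
    "dct:license": {"@id": "https://creativecommons.org/licenses/by/4.0/"},
    "dcat:keyword": ["land cover", "remote sensing"],
    "dcat:distribution": [
        {
            "@type": "dcat:Distribution",
            "dcat:accessURL": {"@id": "https://example.org/data/lulc-2020.tif"},
        }
    ],
}

# Hypothetical minimal profile: terms a data space might require of every record.
REQUIRED_TERMS = ("dct:title", "dct:license", "dcat:distribution")


def missing_terms(record: dict, required=REQUIRED_TERMS) -> list:
    """List the required vocabulary terms that a record does not provide."""
    return [term for term in required if term not in record]


# Any dictionary serialises to valid JSON; only the shared terms make it comparable.
serialised = json.dumps(dataset_record)
```

A record that is valid JSON but uses ad hoc keys would serialise just as well, yet `missing_terms` would flag it, which mirrors the point that a shared syntax alone does not make metadata interoperable.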
For data spaces to function as intended, the interoperability of standards for data and function description thus becomes paramount. As no two datasets or functions are identical, it follows that such standards need to be adapted for application-specific descriptions from standardised components. In addition, the composition process needs to be standardised, hence the concept of such components as reusable modules (see for example the OGC Location Building Blocks [74]).
Some approaches to standards assessment have been proposed in the past. The most widely known in the European context is the ‘Common Assessment Method for Standards and Specifications’ (CAMSS) [75], provided by the European Commission. This is intended to guide public authorities in the choice of suitable standards through an online questionnaire, supporting the European Interoperability Framework recommendations. Other examples come from the Open Data Initiative [76] organisation, which conducted investigations by interviewing users and experts, and drafted guidelines for choosing open standards [77].
Building on these examples and on the experience gained in the field of standards development, in this Section, we propose initial criteria that might be useful to present standards transparently and quantitatively (Table 3), so that similar ones can be more easily compared and chosen by the users, facilitating effective data space development. This might also support discussion and collaboration between different standardisation organisations, to complement each other or to support reciprocal compliance for the sake of mutual advantage.
In future work, we will further test and investigate these criteria to provide a more robust and agreed reference matrix.
When evaluating and selecting a set of standards, a structured approach will support sustainability:
  • Requirements Analysis: Define the specific needs and objectives the standards must meet.
  • Standards Identification: Identify potential standards and gather detailed information about each one.
  • Evaluation and Comparison: Apply the criteria in Table 3 to evaluate and compare the standards. A matrix will be proposed in future work to assess each standard against these criteria.
  • Integration Assessment: Examine how the standards will work together, considering inter-dependencies and potential conflicts.
  • Pilot Implementation: Conduct a pilot implementation to test the chosen standards in a real-world scenario.
  • Decision and Adoption: Based on the evaluation and pilot results, make an informed decision and proceed with the adoption of the selected standards.
By carefully considering these principles, organisations can choose a set of technical standards that are high-quality, appropriate for their needs, and conducive to building a robust, scalable, and maintainable system.
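As an illustration of the 'Evaluation and Comparison' step, the following minimal sketch shows how candidate standards could be ranked once criteria have been scored. It is not the assessment matrix announced for future work: the criterion names, the weights, and the candidate names are hypothetical placeholders, and the actual criteria are those listed in Table 3.

```python
from dataclasses import dataclass, field

# Hypothetical criteria and weights, for illustration only.
# The criteria actually proposed are those in Table 3 of the paper;
# no weighting scheme is defined there.
WEIGHTS = {"openness": 0.4, "maturity": 0.3, "adoption": 0.3}


@dataclass
class Candidate:
    """A candidate standard with one score in [0, 1] per criterion."""
    name: str
    scores: dict = field(default_factory=dict)


def weighted_score(candidate, weights=WEIGHTS):
    """Weighted mean of the criterion scores; unscored criteria count as 0."""
    total_weight = sum(weights.values())
    weighted_sum = sum(w * candidate.scores.get(c, 0.0) for c, w in weights.items())
    return weighted_sum / total_weight


def rank(candidates, weights=WEIGHTS):
    """Return the candidates sorted from highest to lowest weighted score."""
    return sorted(candidates, key=lambda c: weighted_score(c, weights), reverse=True)


if __name__ == "__main__":
    # Hypothetical candidates and scores.
    candidates = [
        Candidate("standard-a", {"openness": 1.0, "maturity": 0.5, "adoption": 0.5}),
        Candidate("standard-b", {"openness": 0.5, "maturity": 1.0, "adoption": 0.0}),
    ]
    for c in rank(candidates):
        print(c.name, round(weighted_score(c), 2))
```

A weighted mean is only one possible aggregation. In practice, some criteria may act as hard requirements rather than contributing to a score.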

5. Discussion

The work described in this paper provides guidance on the major efforts in the data space development domain, bridging them to international standardisation activities and agreed-upon principles for good data management and sharing. The resulting configuration of the data space building blocks stack reflects the current granularity of issues as addressed by the categories of standards and recognised data sharing and management principles, as well as the kinds of components necessary to address the data spaces’ needs. Although most of the work was developed within European Union funded projects, and Europe especially promotes the concept of data spaces, the results are internationally relevant. In fact, the results also build on existing international standards and previous projects developed at the international level (e.g., OGC Testbeds and Pilots, joint ISO-W3C-OGC standardisation activities, and OGC working groups). Moreover, the participating experts in the discussions, representing international organisations, confirmed the international perspective of the achieved results. Finally, interim results were also re-used for other projects developed outside Europe (e.g., in Australia and New Zealand, a common model for cadastral survey data exchange was created, including common and sub-national profiles with extensive constraints around locale-specific vocabularies [78]) providing additional validation for their scalability.
In the intermediate versions of the building block stack, as worked out in the Green Deal data spaces projects USAGE and AD4GD, a building block and a respective category related to software and services were added. These were ultimately removed in order to keep a close alignment with the latest DSSC proposal, which includes a large set of services of any scope in the ‘Added value services’ category. In a future revision, this building block will probably need to be specified further to act as concrete support for planning the range of services necessary for a data space. In the AD4GD version, in particular (Section 4.1.2), the concept of digital twins was included. This encapsulates processing capabilities that have been attributed to data spaces for some time. At the same time, the inclusion of the digital twin concept ensures that both paradigms, digital twins and data spaces, are related to each other. However, the reference to digital twins was later removed. This decision was based on the opinion of some of the panel experts that a higher-level alignment with documents and conceptualisations specific to digital twins is needed first. In future elaborations, in collaboration with the digital twins domain, it would nevertheless be useful to revisit this proposal and investigate the connections and interactions between cross-domain building blocks, to make data spaces and digital twins mutually stronger and consistent.
Technologies such as artificial intelligence, machine learning, edge computing, and semantics technology are continually improving and are expected to increasingly play a key role in either enabling data spaces or achieving their highest potential. For example, using Large Language Models (LLMs) to identify similarities may assist in solving several data harmonisation and integration issues, and enhance processing functionalities for diverse use cases and applications. Whether these technologies are used to make humans more productive, or can eventually provide autonomous capabilities, is yet to be fully explored. However, in principle, such technologies represent sub-systems in an ecosystem of technologies delivering an evolving set of capabilities for digital twins. For this reason, generalised models of processing applications were considered in the proposal described in Section 4.1.2. Future investigations will be required to assess to what extent new technologies impact on data spaces. Many of the initial challenges addressed by this paper will need to be solved before sophisticated tools can be deployed, combined, and extended to meet emerging needs. On the other hand, there is a risk that artificial intelligence will define connections whose roots are no longer traceable, making it difficult to subsequently break down any inconsistencies.
The tables reported in Appendix A, which map the available standards to each building block, can be considered as a reference to support the development and description of data space solutions, as well as a tool to ensure that the FAIR, GEO DMP, and European Interoperability Framework recommendations are respected as far as possible.
The mapping reflects the current status of standards. Some parts are still empty or could be extended. Additional standardisation organisations not involved in this initial effort, as well as other projects and initiatives, can later complement the overview with their solutions. The organisations involved may decide to collaborate in the future to maintain such a catalogue in a joint and shared effort, directly linked to the standardisation organisations. In this way, the most recent updates in terms of standards and proposed solutions could be reflected. At the same time, the rapidly changing state of technology can be flexibly integrated for the benefit of the designers, developers, and users of data spaces. Such flexibility and transparency in the standards mapping and description would also benefit the reuse of the framework in diverse settings: different use cases, different governance levels, and different countries.
The work described in this paper is intended to support a comprehensive technical design of data spaces. However, to reach the implementation phase, wider challenges and barriers must be considered. They involve various aspects, including organisational, legal, and financial aspects, as well as technical implementation details. In some cases, several of these aspects are involved. For example, when it comes to security or privacy, legal measures need to be consistently and effectively supported by technical solutions and standards. Some of them are mentioned in this paper as building blocks and standards to be used and developed. However, the comprehensive design described in this paper does not solve all interoperability issues. Several lower-level implementation details on the technical, organisational, and legal sides still need to be addressed. This requires the involvement of a variety of stakeholders with often different perspectives and demands. Some examples of the challenges left for achieving data space implementation are as follows:
  • Organisational—the need for good and consistent definition of use case and data requirements, supporting consistent or even automatic data retrieval; the promotion of multidisciplinary collaborations among different experts.
  • Financial—a new business model will need to be developed, considering the different needs for investments and maintenance.
  • Political—legal and political choices might influence the adoption of data spaces concerning the use of data, the kind of applications, the changes in responsibilities, or the identification of new roles in organisations. The identification and definition of data policies and related technological components to enforce them is crucial.
  • Technical—the adoption of relevant standards for all elements, including data, software components, and procedures is essential. They might need some transformations such as data harmonisation or conversions. All the components will need to be developed to a suitable maturity (e.g., data space connectors).
Beginning with the building blocks and related standards can be an effective technical starting point for addressing the mentioned challenges and a hook to the related non-technical issues implied. Remote sensing use case developments can therefore act as guides on how to solve such challenges by starting from the initial framework reported in this paper.

6. Conclusions

The landscape of blueprints and reference architectures for data spaces has grown in the last few years, although mostly from rather recent projects and initiatives. For this reason, the study considered the different offers for conceptualising data space structures and components and mapped them to the well-established standards for data interoperability and sharing, as well as guiding principles recognised at the international level. Some additions to the DSSC approach were proposed, together with a mapping to the relevant standards and solutions available.
In addition, some criteria were proposed that might support users and developers in their assessment of and choice among available standards. The criteria allow them to address each building block according to their own needs.
The work discussed in this paper is certainly very complex and situated in a very dynamic field. It requires extensive collaboration among different organisations. For this reason, this work is only the foundation stone for a more robust overview and agreed blueprint, which we hope will be developed in the future through wider and sustainable collaborations. However, it was crucial to link the various initiatives outlining the relatively new concept of data spaces with the current standards that support the established principles and concepts related to interoperability and data sharing over the Internet.
Future work will need to improve, systematise, and maintain this conceptualisation and mapping over time in collaboration with other organisations. It will be important to keep it up-to-date, as well as to embed results from future research and, especially, testing in implementations and real-world scenarios.
Furthermore, the mapped standards should be assessed according to criteria and metrics like those proposed in Section 4.4. Such an assessment would need to be performed by a joint team of users, including data modellers, researchers, standards developers, and software developers who are expected to implement the standards. It will allow a clearer overview of the current status of the offer of standards and solutions for data space implementation, as well as guidance for planning future standards development.
Remote sensing use cases and data were, in some cases, pioneering the data space-sharing concept. They might be a good test case for upgrading their sharing methods by considering the added aspects introduced by the more recent data space concept (e.g., trust, data tracking, improved provenance mapping, semantic uplifting, and so on).
Another important and very broad topic for future research is the further implementation and testing of new technologies (edge processing, machine learning, artificial intelligence, digital twins, and so on) and the assessment of their impact on data spaces and related use cases.
Going beyond the technology aspect, other challenges (e.g., related to organisational and business building blocks) will need to be tackled more closely. This includes training aspects as well as changes in organisational and governance-related principles and processes. Citizens need to understand their potential to produce and use data. Legal frameworks need to be updated or established to regulate certain aspects of data spaces. The business models of several organisations will need to change to adapt to the novelties of data spaces. Future research, not only technical research, will need to be developed to provide good guidance on all of these aspects.
The testing of a wide variety of use cases and the reuse of data across different architectures and data spaces for various domains will be crucial to fully harness the power of such a paradigm.

Author Contributions

Conceptualisation, F.N.; methodology, F.N., R.A., L.B. and I.S.; validation, R.A., L.B., J.M., I.S., A.V., M.-F.V. and P.Z.; investigation, F.N.; writing—original draft preparation, F.N. and R.A.; writing—review and editing, L.B., J.M., I.S., A.V., M.-F.V. and P.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the USAGE project under the European Union’s Horizon Europe programme (Grant Agreement No. 101059950) and by the AD4GD project, co-funded by the European Union under the Horizon Europe programme (Grant Agreement No. 101061001), Switzerland and the United Kingdom.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analysed in this study. All the references to the reviewed documents are available in the text. Data sharing is not applicable to this article.

Acknowledgments

We would like to acknowledge the panel of experts who took part in the discussions over the data space building blocks: Linda van den Brink, Bart de Lathouwer, and Giacomo Martirano. Diego de la Vega (CREAF) has produced the graphics for Figure 6 and Figure 8 within the AD4GD project.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Mapping of Available Standards to Data Space Building Blocks

Tables A1–A24, reported in Appendix A.1, Appendix A.2, Appendix A.3 and Appendix A.4, map the useful solutions and standards from the different reference initiatives and standardisation organisations to the data space building blocks as described in this paper (Section 3.1). The order of institutions in the left column follows the order in which each organisation started its activities.
After each table mapping standards to a building block, a further table reports the good data sharing and management principles addressed, as introduced in Section 2.2 (i.e., the FAIR principles, the GEO Data Management Principles (GEO DMP), and the European Interoperability Framework (EIF) principles and recommendations).

Appendix A.1. Technical Building Blocks—Data Specification Enabling FAIRness

Tables A1–A8 report the mapping related to building blocks in the category ‘Data Specification enabling FAIRness’.
Table A1 reports the mapping of standards to the ‘Data Models’ building block, defined as: “The model provides semantics and a shared vocabulary, as well as a structure for the data (hierarchies and relationships)”.
Table A1. Standards and solutions for the data models building block.
Ref | Specification/Implementation(s) Recommended
W3C | SSN-SOSA
OGC | CityGML/CityJSON, LandInfra, IndoorGML, Indoor Mapping Data Format (IMDF), MUDDI, PipelineML, WaterML, Augmented Reality ML (ARML), SensorThings API data model, SWE common data model, SensorML, Semantic Sensor Network (SSN), STAplus, Time Ontology in OWL, TimeseriesML, GeoPose, Geoscience Markup Language (GeoSciML), Zarr (https://portal.ogc.org/files/100727, accessed on 31 August 2024), GroundwaterML, network Common Data Form (netCDF) standards suite, Observations, Measurements and Samples
GEO et al. | Essential Variables (https://www.earthdata.nasa.gov/learn/backgrounders/essential-variables, accessed on 31 August 2024). Topics/domains: Climate; Ocean; Biodiversity; Geodiversity; Agriculture
INSPIRE | INSPIRE Themes and UML model (https://inspire.ec.europa.eu/Themes/Data-Specifications/2892)
OASC | MIM2 – data models – Smart data Models; NGSI-LD compliant data models for aspects of the smart city have been defined by organisations and projects, including OASC, FIWARE, GSMA and the SynchroniCity project, and there is an ongoing joint activity of TM Forum and FIWARE to specify more. Existing data models and ontologies, e.g., the SAREF (Smart Applications REFerence ontology) standard by ETSI/oneM2M, can be mapped for use with NGSI-LD by identifying what are entities, properties and relationships, which can be managed and requested by the NGSI-LD API. oneM2M base ontology (that is compatible with SAREF). Additionally, oneM2M provides the means to instantiate ontologies as a means to provide semantic descriptions of the data exchanged (through the use of metadata). The extension SAREF4Cities provides an ontology focused on smart cities. Core vocabularies of ISA like Core Public Service Vocabulary Application Profile used as the basis for the Single Digital Gateway Regulation that touches local governments, Core Person, Core Organisation etc. DTDL is the Digital Twin Definition Language developed by Microsoft. This language is built on top of JSON-LD, and the existing FIWARE data models are converted into this format. MIM7–Places
DSBA | SmartDataModels
IDSA | IDS RA – Functional Layer – Ecosystem of Data – Vocabularies; Information Layer – Hexagon of concerns
In addition, domain standardisation organisations are likely to propose their own data models, ontologies, and vocabularies for the specific domains and applications of interest, such as the CIDOC-CRM (https://cidoc-crm.org/) for cultural heritage, the InteroperAble Descriptions of Observable Property Terminology (I-ADOPT) ontology for observable properties (https://i-adopt.github.io), and many more. The different standards may provide extension mechanisms for cases in which the existing data model or previous related extensions are not sufficient. However, usually only a profile of the comprehensive domain data model is necessary, or a combination of profile and extension. Therefore, it is recommended to document such profiles and extensions properly in a machine-readable format and to associate the resulting data model with datasets through metadata. For example, the OGC Data Exchange Toolkit is intended to support this [79].
Table A2. Principles addressed by the data models building block.
FAIR Principles | EIF
R1.3: Data meet domain-relevant community standards | Recommendation 4 (Openness): Give preference to open specifications, taking due account of the coverage of functional needs, maturity and market support and innovation.
Table A3 reports the mapping of standards to the ‘Data Exchange–Encodings’ building block, defined as: “Encoding is the format in which the data are encoded”.
Table A3. Standards and solutions for the Data Exchange–Encodings building block.
Ref | Specification/Implementation(s) Recommended
ISO | SQL, JSON
W3C | RDF (RDF/XML, Turtle, JSON-LD), SPARQL, OWL
OGC | 3D Tiles, Cloud Optimised GeoTIFF, CoverageJSON, GML in JPEG2000, GeoPackage, GeoSPARQL, GML, GeoTIFF, I3S, NetCDF, Zarr, Hierarchical Data Format Version 5 (HDF5), KML, LAS, Moving Features, SWE Service Model Implementation Standard, Sensor Observation Service (SOS), WKT CRS, Simple Features, OpenGeoSMS, GeoXACML
Table A4. Principles addressed by the Data Exchange–Encodings building block.
FAIR Principles | DMP | EIF
I1: data use a formal, accessible, shared, and broadly applicable language for knowledge representation | DMP-3 (Usability). Data will be structured using encodings that are widely accepted in the target user community and aligned with organisational needs and observing methods, with preference given to non-proprietary international standards. | Recommendation 9 (Technological neutrality and data portability): Ensure data portability, namely that data is easily transferable between systems and applications supporting the implementation and evolution of European public services without unjustified restrictions, if legally possible.
Table A5 reports the mapping of standards to the ‘Data descriptions (metadata)’ building block, defined as follows: “Metadata, technical guidance, and schemas for describing datasets, exchanged within a data space between providers and recipients. Metadata allows consistent data retrieval, generation or reuse, ensuring reliability in the results for which data are used as input and related decision-making process”.
Table A5. Standards and solutions for the Data descriptions (metadata) building block.
Ref | Specification/Implementation(s) Recommended
ISO | ISO 19115, ISO 15836 (Dublin Core)
W3C | DCAT
OGC | GeoDCAT – 3dP, EO Dataset Metadata GeoJSON(-LD) Encoding Standard (EO-GeoJSON)
INSPIRE | INSPIRE metadata (based on ISO 19115, ISO 19119 and ISO 15836 (Dublin Core))
OASC | MIM1–Context, MIM7–Places
IDSA | IDS Reference Architecture – Functional Layer – Ecosystem of Data – Data source description; Information Layer
Gaia-X | Federated catalogue – Self description
SIMPL | Self-description (ID SECAV-FUNC-002-FUNC-001) (https://futurium.ec.europa.eu/en/simpl/l2-detailed-requirement/attributes-self-description-dataset)
Table A6. Principles addressed by the Data descriptions (metadata) building block.
FAIR Principles | DMP
F2: Data are described with rich metadata; R1: (Meta)data are richly described with a plurality of accurate and relevant attributes; R1.3: Metadata meet domain-relevant community standards; I1: Metadata use a formal, accessible, shared, and broadly applicable language for knowledge representation; I2: (Meta)data use vocabularies that follow the FAIR principles; I3: Metadata include qualified references to other (meta)data | DMP-4 (Usability). Data will be comprehensively documented, including all elements necessary to access, use, understand, and process, preferably via formal structured metadata based on international or community-approved standards. To the extent possible, data will also be described in peer-reviewed publications referenced in the metadata record.
Q-quality: Data should be of sufficient quality for the user’s task (extension to the FAIR principles [50]). | DMP-6 (Usability). Data will be quality-controlled and the results of quality control shall be indicated in metadata; data made available in advance of quality control will be flagged in metadata as unchecked.
F1: (Meta)data are assigned globally unique and persistent identifiers; F3: Metadata clearly and explicitly include the identifier of the data they describe | DMP-10 (Curation). Data will be assigned appropriate persistent, resolvable identifiers to enable documents to cite the data on which they are based and to enable data providers to receive acknowledgement of use of their data.
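To make the principles above more concrete, the following sketch builds a small dataset description using DCAT and Dublin Core terms, serialised as JSON-LD. It shows how a dataset identifier (F1, F3) and a link to the data model the dataset conforms to (R1.3) can be carried in the metadata. The function names, the choice of required fields, and all example.org identifiers are assumptions made for illustration; they are not prescribed by DCAT or by any of the mapped initiatives.

```python
import json

# Namespaces of the W3C DCAT vocabulary and Dublin Core terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def dataset_record(identifier, title, data_model_uri, access_url):
    """Build a minimal DCAT dataset description as a JSON-LD dictionary."""
    return {
        "@context": CONTEXT,
        "@id": identifier,
        "@type": "dcat:Dataset",
        "dct:identifier": identifier,               # F1/F3: identifier of the data
        "dct:title": title,
        "dct:conformsTo": {"@id": data_model_uri},  # R1.3: data model or profile
        "dcat:distribution": [
            {"@type": "dcat:Distribution", "dcat:accessURL": {"@id": access_url}}
        ],
    }


def missing_fields(record, required=("dct:identifier", "dct:title", "dct:conformsTo")):
    """List the required keys that are absent or empty (hypothetical rule set)."""
    return [key for key in required if not record.get(key)]


if __name__ == "__main__":
    # All identifiers below are placeholders.
    record = dataset_record(
        "https://example.org/dataset/land-cover-2024",
        "Land cover 2024",
        "https://example.org/profiles/land-cover",
        "https://example.org/download/land-cover-2024",
    )
    print(json.dumps(record, indent=2))
    print("missing:", missing_fields(record))
```

A data space would typically agree on its own list of mandatory metadata fields; the three used here only stand in for such a list.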
Table A7 reports the mapping of standards to the ‘Data Requirements and Quality Schemas’ building block, defined as: “Data Requirements Schemas definition: Data requirements specification is essential for a successful data retrieval or generation, leading to reliable results for use cases. Referring to standardised schemas to specify data requirements allows interoperability between systems”.
There is no specific mapping of the interoperability-related principles to this building block. However, it addresses needs very similar to those of metadata and data descriptions, although from a different point of view: metadata document existing datasets, while data requirements specifications describe the needs of use cases, to be matched to metadata for efficient data retrieval or generation.
In addition, the extension to the FAIR principles [50] includes the ‘Q-Quality’ principle, which refers in turn to other best practices proposed by the W3C-OGC working group ‘Spatial data on the Web’, including ‘Best practice 6: Provide data quality information’ (https://www.w3.org/TR/dwbp/#DataQuality, accessed on 31 August 2024) and ‘Best practice 21: Provide data up to date’ (https://www.w3.org/TR/dwbp/#AccessUptoDate, accessed on 31 August 2024).
Table A7. Standards and solutions for the Data Requirements and Quality Schemas building block.
Ref | Specification/Implementation(s) Recommended
ISO | ISO 19131 on Data Product Specification, ISO/IEC 25012 on Data Quality Model (https://iso25000.com/index.php/en/iso-25000-standards/iso-25012, accessed on 31 August 2024)
OGC | Schema used by the Data Exchange Toolkit [79]. It addresses a complementary part of ISO 19131, to support data requirements definition and semantic validation. Other experiments are trying to address the issue starting from profiling the PROV vocabulary, originally intended to represent provenance information (https://github.com/ogcincubator/prov-cwl/tree/master, accessed on 31 August 2024)
Committee on Earth Observation Satellites (CEOS) (https://ceos.org/ard/, accessed on 31 August 2024) | Analysis Ready Data Framework, currently planned to be extended for being applied to geospatial data within OGC (https://www.ogc.org/press-release/ogc-forms-new-analysis-ready-data-standards-working-group/, accessed on 31 August 2024)
SIMPL | Quality dimension and quality rules (ID SEGOA-FUNC-012-FUNC-001) (https://futurium.ec.europa.eu/en/simpl/l2-detailed-requirement/quality-dimension-and-quality-rules-0, accessed on 31 August 2024)
Table A8 reports the mapping of standards to the ‘Data Provenance Model’ building block, defined as follows: “Data Provenance model definition: Standards intended to represent and document provenance and lineage of data”.
Table A8. Standards and solutions for the Data Provenance Model building block.
Ref | Specification/Implementation(s) Recommended
ISO | ISO 19115 geospatial lineage model
W3C | PROV-O (https://www.w3.org/TR/prov-o/, accessed on 31 August 2024)
OGC | OGC Provenance chains (https://ogcincubator.github.io/bblock-prov-schema/build/generateddocs/slate-build/ogc-utils/prov/index.html?json#examples, accessed on 31 August 2024), W3C PROV-O extension
OASC | MIM5–Transparency
IDSA | Part of the information layer – hexagon of concerns
Others (reported by DS4SSCC) | ETSI-CIM; DCAT-AP – property ‘provenance’
Table A9. Principles addressed by the Data Provenance Model building block.
FAIR Principles | DMP
R1.2: (Meta)data are associated with detailed provenance | DMP-5 (Usability). Data will include provenance metadata indicating the origin and processing history of raw observations and derived products, to ensure full traceability of the product chain.
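The traceability of the product chain required by DMP-5 can be illustrated with a few W3C PROV-O terms. In the sketch below, each entity lists the entities it was derived from (prov:wasDerivedFrom), and a small helper walks these links back to the raw observation. The dictionary layout, the `lineage` helper, and the `ex:` identifiers are assumptions for illustration; a real deployment would more likely store the statements as RDF and query them with SPARQL.

```python
# Namespace of the W3C PROV ontology; "ex:" identifiers are placeholders.
PROV_NS = "http://www.w3.org/ns/prov#"

# A toy provenance graph: derived product -> intermediate product -> raw scene.
EXAMPLE_GRAPH = {
    "ex:ndvi-map": {
        "@type": "prov:Entity",
        "prov:wasGeneratedBy": "ex:ndvi-computation",
        "prov:wasDerivedFrom": ["ex:surface-reflectance"],
    },
    "ex:surface-reflectance": {
        "@type": "prov:Entity",
        "prov:wasGeneratedBy": "ex:atmospheric-correction",
        "prov:wasDerivedFrom": ["ex:raw-scene"],
    },
    "ex:raw-scene": {"@type": "prov:Entity"},
    "ex:ndvi-computation": {"@type": "prov:Activity", "prov:used": ["ex:surface-reflectance"]},
    "ex:atmospheric-correction": {"@type": "prov:Activity", "prov:used": ["ex:raw-scene"]},
}


def lineage(entity_id, graph):
    """Return all ancestors of an entity, nearest first (breadth-first walk)."""
    ancestors = []
    queue = list(graph.get(entity_id, {}).get("prov:wasDerivedFrom", []))
    while queue:
        current = queue.pop(0)
        if current in ancestors:
            continue  # guards against repeated or cyclic references
        ancestors.append(current)
        queue.extend(graph.get(current, {}).get("prov:wasDerivedFrom", []))
    return ancestors


if __name__ == "__main__":
    print(lineage("ex:ndvi-map", EXAMPLE_GRAPH))
```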

Appendix A.2. Technical Building Blocks—Data Sovereignty and Trust

Tables A10–A15 report the mapping related to building blocks in the category ‘Data Sovereignty and Trust’.
Table A10 reports the mapping of standards to the ‘Data Policies’ building block, defined as: “Either an offer from the data provider or an agreement between the data provider and the data recipient”.
Table A10. Standards and solutions for the Data Policies building block.
Ref | Specification/Implementation(s) Recommended
W3C | W3C Open Digital Rights Language (ODRL) (https://www.w3.org/TR/odrl-model/), W3C Verifiable Credentials
OGC | RAINBOW for licences; GeoXACML
OASC | MIM3–Contracts
IDSA | RA – Functional layer – Security and Data Sovereignty – Usage Policies; RA – Functional layer – Data markets – Usage restrictions and governance; RA – Security perspective
Gaia-X | Identity and Trust – Federated Access; Sovereign Data Exchange – Policies
Others (reported by DS4SSCC) | Standards: OASIS XACML (https://docs.oasis-open.org/xacml/3.0/xacml-3.0-core-spec-os-en.html, accessed on 31 August 2024) Policy Definition Language; Industry Body Specifications: Rego, Open Policy Agent, JSON-LD; Implementations: i4Trust, Prometheus-X
Others | Creative Commons
Some tools are provided by the European Commission to guide the choice of a suitable licence for specific data sharing needs, through the Joinup Licensing Assistant (https://joinup.ec.europa.eu/collection/eupl/solution/joinup-licensing-assistant/jla-find-and-compare-software-licences?etrans=fr, accessed on 31 August 2024; see an explanation of the tool at https://www.youtube.com/watch?v=DhEhKtlsjQ0, accessed on 31 August 2024). Other websites (for example, https://choosealicence.com, accessed on 31 August 2024) provide guidance for cases in which an open licence is needed.
Table A11. Principles addressed by the Data Policies building block.
FAIR Principles | DMP
R1.1: (Meta)data are released with a clear and accessible data usage licence. | DMP-1b (Discoverability). [...] and data access and use conditions, including licences, will be clearly indicated.
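The definition of a data policy as either an offer or an agreement corresponds closely to two policy types of the W3C ODRL Information Model, Offer and Agreement. The sketch below, given only as an illustration, builds such a policy as JSON-LD: without a recipient it yields an Offer, and with a recipient (the ODRL assignee) it yields an Agreement. The helper name `make_policy` and all example.org identifiers are hypothetical, and a single permission with the ODRL action "use" is a deliberate simplification.

```python
import json

ODRL_CONTEXT = "http://www.w3.org/ns/odrl.jsonld"


def make_policy(uid, target, assigner, assignee=None, action="use"):
    """Build an ODRL Offer, or an Agreement when a data recipient is given."""
    permission = {"target": target, "action": action, "assigner": assigner}
    if assignee is not None:
        permission["assignee"] = assignee
    return {
        "@context": ODRL_CONTEXT,
        "@type": "Agreement" if assignee is not None else "Offer",
        "uid": uid,
        "permission": [permission],
    }


if __name__ == "__main__":
    # Placeholder identifiers for the policy, dataset, provider and recipient.
    offer = make_policy(
        "https://example.org/policy/1",
        "https://example.org/dataset/land-cover-2024",
        "https://example.org/party/provider",
    )
    agreement = make_policy(
        "https://example.org/policy/2",
        "https://example.org/dataset/land-cover-2024",
        "https://example.org/party/provider",
        assignee="https://example.org/party/recipient",
    )
    print(json.dumps(offer, indent=2))
    print(json.dumps(agreement, indent=2))
```

Expressing a policy in this form does not enforce it; enforcement belongs to the access and usage control mechanisms of the data space.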
Table A12 reports the mapping of standards to the ‘Access and usage control and management’ building block, defined as: “Mechanisms in place to ensure access and usage policies related to certain data are respected”.
Table A12. Standards and solutions for the Access and usage control and management building block.
Ref | Specification/Implementation(s) Recommended
W3C | W3C Web Access Control (WAC) (https://www.w3.org/wiki/WebAccessControl, accessed on 31 August 2024)
OGC | OGC APIs rely on the access control from the underlying OpenAPI mechanisms (https://docs.ogc.org/is/19-072/19-072.html#rc_oas30-security, accessed on 31 August 2024), included in service model (STA+)
OASC | MIM3–Contracts
IDSA | RA – Functional layer – Security and Data Sovereignty – Usage enforcement; RA – Functional layer – Data markets – Usage restrictions and governance; RA – Security perspective
Gaia-X | Identity and Trust – Federated Access; Sovereign Data Exchange – Usage control + Data agreement service
Others (reported by DS4SSCC) | Industry Body Specifications: Rego, Open Policy Agent, JSON-LD; Implementations: i4Trust, Prometheus-X
SIMPL | Federated Authentication (ID SEGOA-FUNC-001) (https://futurium.ec.europa.eu/en/simpl/l1-high-level-requirement/federated-authentication, accessed on 31 August 2024)
Table A13 reports the mapping of standards to the ‘Identity and Attestation Management’ building block, defined as follows: “Information provided on the relevant entities must be verifiable to enable the onboarding and offboarding processes. The trustworthiness of information is linked to the trustworthiness of the Trust Anchors (or Trust Service Providers, specifically for identities), who are entitled to issue the respective attestations” (https://dssc.eu/space/BVE/357075352/Identity+and+Attestation+Management, accessed on 31 August 2024).
Table A13. Standards and solutions for the Identity and Attestation Management building block.
Ref | Specification/Implementation(s) Recommended
W3C | W3C Decentralised Identifiers (DID) (https://www.w3.org/TR/did-core/)
OASC | MIM4–Trust, MIM6–Security
IDSA | RA—Functional layer—Trust—Identity Management + user certification; RA—Functional layer—Security & Data Sovereignty—Authentication & Authorisation; RA—Security perspective + Certification perspective
Gaia-X | Identity and Trust—Federated Identity Management; Sovereign Data Exchange—logging service; Compliance—Onboarding and certification
Others (reported by DS4SSCC) | Standards: LDAP, OAuth 2.0, X.500, X.509; Industry body specifications: CEF eID, OpenID Connect; SAML 2.0; SOLID
Table A14. Principles addressed by the Identity and Attestation Management building block.
FAIR principles
A1.2: The protocol [for metadata publication] allows for an authentication and authorisation procedure where necessary
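To make the W3C entry in Table A13 more concrete, the following sketch splits a Decentralised Identifier into its method and method-specific identifier, following the generic `did:<method>:<method-specific-id>` syntax of DID Core. It only checks syntax; it does not resolve the identifier or verify any attestation, and the example identifiers are made up.

```python
import re

# One "idchar" in the DID Core syntax: ALPHA / DIGIT / "." / "-" / "_" / pct-encoded
_IDCHAR = r"(?:[A-Za-z0-9._-]|%[0-9A-Fa-f]{2})"
# did = "did:" method-name ":" method-specific-id
# method-name uses lowercase letters and digits only; the identifier may
# contain ":"-separated segments but must end with at least one idchar.
_DID_RE = re.compile(rf"did:([a-z0-9]+):((?:{_IDCHAR}*:)*{_IDCHAR}+)")


def parse_did(did: str):
    """Return (method, method_specific_id), or None if the syntax is invalid."""
    match = _DID_RE.fullmatch(did)
    return (match.group(1), match.group(2)) if match else None


# Made-up identifiers, for illustration only.
simple = parse_did("did:example:123456789abcdefghi")
nested = parse_did("did:web:example.com:user:alice")
invalid = parse_did("did:Example:abc")  # upper-case method name is rejected
```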
Table A15 reports the mapping of standards to the ‘Trust Framework’ building block, defined as: “Verification that a participant in a data space adheres to certain rules and a common set of standards”. (https://dssc.eu/space/BVE/357075333/Data+Sovereignty+and+Trust).
Table A15. Standards and solutions for the Trust Framework building block.
Ref | Specification/Implementation(s) Recommended
W3C | Verifiable Credentials (https://www.w3.org/TR/vc-data-model-2.0/)
OGC | OGC Web Services Security
OASC | MIM4–Trust, MIM6–Security
IDSA | RA—Functional layer—Security & Data Sovereignty—Trustworthy communication; + Security by design; + Technical certification; RA—Security perspective + Certification perspective
Gaia-X | Identity and Trust—Trust Management; Compliance—Relation between Providers and consumers + Rights and obligations of participants
Others (reported by DS4SSCC) | Standards: EUDI; Industry body specifications: EBSI; Reference implementations: European Blockchain, i4Trust
Table A16. Principles addressed by the Trust Framework building block.
EIF
Recommendation 15 (Security and privacy): Define a common security and privacy framework and establish processes for public services to ensure secure and trustworthy data exchange between public administrations and in interactions with citizens and businesses.
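One way to picture how a trust framework could rely on the W3C Verifiable Credentials data model listed in Table A15 is a structural pre-check of a credential before any cryptographic verification. The sketch below tests only the properties that the version 2.0 data model makes mandatory (`@context`, `type`, `issuer`, `credentialSubject`); it does not verify proofs or signatures, and the issuer, subject, and `dataSpaceRole` values are hypothetical.

```python
VC_V2_CONTEXT = "https://www.w3.org/ns/credentials/v2"


def structural_problems(credential: dict) -> list:
    """List structural problems; an empty list means the basic shape is fine."""
    problems = []
    context = credential.get("@context")
    if not isinstance(context, list) or not context or context[0] != VC_V2_CONTEXT:
        problems.append("@context must be a list starting with the v2 context URL")
    types = credential.get("type")
    types = [types] if isinstance(types, str) else (types or [])
    if "VerifiableCredential" not in types:
        problems.append("type must include VerifiableCredential")
    for required in ("issuer", "credentialSubject"):
        if required not in credential:
            problems.append(f"missing {required}")
    return problems


# Hypothetical credential stating that a participant was admitted to a data space.
credential = {
    "@context": [VC_V2_CONTEXT],
    "type": ["VerifiableCredential"],
    "issuer": "did:example:trust-anchor",
    "credentialSubject": {"id": "did:example:participant-42",
                          "dataSpaceRole": "provider"},  # assumed property
}
problems = structural_problems(credential)
```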
Table A17 reports the mapping of standards to the ‘Sharing Traceability’ building block, defined as: “Standards and services intended to keep track of the data processing and sources along their lifecycle”.
Table A17. Standards and solutions for the Sharing Traceability building block.
Ref | Specification/Implementation(s) Recommended
OASC | MIM5–Transparency
Others (reported by DS4SSCC) | ETSI-CIM; DCAT-AP—property ‘provenance’
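The idea of keeping track of processing steps along the data lifecycle can be sketched as an append-only log in which each entry carries a hash of the previous one, so that later tampering becomes detectable. This is a generic technique offered purely as an illustration; it is not prescribed by MIM5, ETSI-CIM, or any other entry in Table A17, and the event fields are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(event: dict, prev_hash: str) -> str:
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> None:
    """Append an event, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash,
                "hash": _digest(event, prev_hash)})


def chain_is_intact(log: list) -> bool:
    """Recompute every hash and check that each entry links to its predecessor."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical sharing events.
log = []
append_event(log, {"actor": "provider-a", "action": "publish", "dataset": "land-cover"})
append_event(log, {"actor": "consumer-b", "action": "download", "dataset": "land-cover"})
intact = chain_is_intact(log)
```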

Appendix A.3. Technical Building Blocks—Data Value Enhancement

Table A18, Table A19, Table A20, Table A21 and Table A22 report the mapping related to building blocks in the category ‘Data Value Enhancement’.
Table A18 reports the mapping of standards to the ‘Vocabulary Services’ building block, defined as follows: “Services intended to leverage vocabularies for the related functionalities”.
Table A18. Standards and solutions for the Vocabulary Services building block.
Ref | Specification/Implementation(s) Recommended
W3C | SKOS vocabulary
OGC | Uses SKOS; API support under consideration
IDSA | “RDF”, “taxonomies”
... | ?
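A vocabulary service built on SKOS would typically let clients navigate concept hierarchies. The sketch below mimics the effect of `skos:broaderTransitive` over a tiny in-memory concept scheme; the concepts and the dictionary representation are hypothetical and stand in for a real SKOS store or API.

```python
# Hypothetical concept scheme: each concept maps to its skos:broader concept
# (None marks a top concept).
BROADER = {
    "salt-marsh": "wetland",
    "wetland": "habitat",
    "habitat": None,
}


def broader_transitive(concept: str) -> list:
    """Return all ancestors of a concept, nearest first."""
    ancestors = []
    current = BROADER.get(concept)
    while current is not None:
        ancestors.append(current)
        current = BROADER.get(current)
    return ancestors


ancestors = broader_transitive("salt-marsh")
```

A search client could use such a lookup to widen a query from a narrow term to its broader concepts.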
Table A19 reports the mapping of standards to the ‘Data Exchange–Communication (APIs)’ building block, defined as follows: “The data exchange building block focuses on data transmission once the conditions for interchange authorisation are met”.
Table A19. Standards and solutions for the Data Exchange–Communication (APIs) building block.
Ref | Specification/Implementation(s) Recommended
W3C | W3C APIs (https://api.w3.org/doc)
OGC | OGC APIs (https://ogcapi.ogc.org/), OGC Web Services
OASC | MIM1–Context; MIM7–Places
IDSA | Connectors
Others (reported by DS4SSCC) | NGSI-LD; LDES; MQTT; JSON-LD
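As an example of what a request to one of the OGC APIs in Table A19 looks like, the sketch below builds a URL for the items of a feature collection, using the `/collections/{collectionId}/items` path and the `bbox` and `limit` query parameters of OGC API Features. The server root `https://example.org/ogcapi` and the collection name `habitats` are placeholders, and no request is actually sent.

```python
from urllib.parse import quote, urlencode


def items_url(api_root: str, collection_id: str, bbox=None, limit=None) -> str:
    """Build an OGC API Features 'items' URL with optional bbox and limit."""
    url = f"{api_root.rstrip('/')}/collections/{quote(collection_id, safe='')}/items"
    params = {}
    if bbox is not None:
        # bbox is given as minx, miny, maxx, maxy, joined by commas
        params["bbox"] = ",".join(str(v) for v in bbox)
    if limit is not None:
        params["limit"] = limit
    # Keep commas unescaped so the bbox stays readable in the query string.
    return f"{url}?{urlencode(params, safe=',')}" if params else url


# Placeholder server and collection.
url = items_url("https://example.org/ogcapi", "habitats",
                bbox=(2.0, 41.3, 2.3, 41.5), limit=10)
```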
Table A20 reports the mapping of standards to the ‘Metadata Publication and discovery’ building block, defined as follows: “The purpose of the publication and discovery building block is to provision and discover metadata of data, services and offerings in a data space”.
Table A20. Standards and solutions for the Metadata Publication and discovery building block.
Ref | Specification/Implementation(s) Recommended
W3C | ?
OGC | OGC Catalogue Service (https://www.ogc.org/standard/cat/), OGC API Records, Cat:ebRIM App Profile: Earth Observation Products (https://www.ogc.org/standard/cat2eoext4ebrim/)
OASC | MIM1–Context, MIM3–Contracts
IDSA | IDS Reference Architecture—Functional Layer—Ecosystem of Data—Brokering
Gaia-X | Federated Catalogue—Catalogue Management Functions
Others (reported by DS4SSCC) | ICT Innovation Network reference architecture, DCAT-AP, JSON-LD
SIMPL | Catalogues of Data/Application/Infrastructure (ID SECAV-FUNC-001) (https://futurium.ec.europa.eu/en/simpl/l1-high-level-requirement/catalogues-dataapplicationinfrastructure)
Table A21. Principles addressed by the Metadata Publication and discovery building block.
FAIR Principles | DMP | EIF
F4: (Meta)data are registered or indexed in a searchable resource | DMP-1a (Discoverability). Data and all associated metadata will be discoverable through catalogues and search engines | -
A2: Metadata should be accessible even when the data is no longer available | - | -
A1: (Meta)data are retrievable by their identifier using a standardised communication protocol | - | -
A1.1: The protocol is open, free, and universally implementable | - | -
- | DMP-2 (Accessibility). online services, including, at minimum, direct download but preferably user-customizable services for visualisation and computation. | -
- | - | Recommendation 5 (Transparency): Ensure internal visibility and provide external interfaces for European public services.
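Publication and discovery can be illustrated with metadata records expressed in DCAT terms and serialised as JSON-LD, followed by a simple keyword search over them. The sketch below is a minimal in-memory stand-in for a catalogue service; the dataset identifiers, titles, and keywords are invented, and only a handful of DCAT and Dublin Core properties are used.

```python
# Namespace prefixes for DCAT and Dublin Core terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}

# Invented catalogue records, for illustration only.
catalogue = [
    {"@context": CONTEXT, "@type": "dcat:Dataset",
     "@id": "https://example.org/dataset/land-cover",
     "dct:title": "Land cover map",
     "dcat:keyword": ["land cover", "remote sensing"]},
    {"@context": CONTEXT, "@type": "dcat:Dataset",
     "@id": "https://example.org/dataset/air-quality",
     "dct:title": "Air quality sensor readings",
     "dcat:keyword": ["air quality", "sensors"]},
]


def discover(records: list, keyword: str) -> list:
    """Return the identifiers of records tagged with the keyword (case-insensitive)."""
    wanted = keyword.lower()
    return [record["@id"] for record in records
            if wanted in (k.lower() for k in record.get("dcat:keyword", []))]


hits = discover(catalogue, "Remote Sensing")
```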
Table A22 reports the mapping of standards to the ‘Value added services’ building block, defined as follows: “any kind of processing, as service, is included”. (https://dssc.eu/space/BVE/357076468/Value-Added+Services).
Table A22. Standards and solutions for the Value added services building block.
Ref | Specification/Implementation(s) Recommended
W3C | ADMS
ISA/ISA2/SEMIC | ADMS-AP
OGC | Coordinate Transformation Service, GeoAPI, Location Service (OpenLS), Open Modelling Interface (OpenMI), RAINBOW, Filter Encoding, Styled Layer Descriptor, Symbology Encoding, Geospatial User Feedback (GUF)
OASC | MIM3 Basic Data Marketplace Enablers SynchroniCity
IDSA | RA—Functional layer—Data markets—Clearing and billing
SIMPL | UI and API for defining data quality rules (ID SEGOA-FUNC-012) (https://futurium.ec.europa.eu/en/simpl/l1-high-level-requirement/ui-and-api-defining-data-quality-rules, accessed on 31 August 2024), Data quality assessment (ID SEARE-FUNC-017) (https://futurium.ec.europa.eu/en/simpl/l1-high-level-requirement/data-quality-assessment, accessed on 31 August 2024)
Table A23. Principles addressed by the Value added services building block.
EIF
Recommendation 6 (Reusability): Reuse and share solutions, and cooperate in the development of joint solutions when implementing European public services.
In the latest version of the DSSC blueprint [80], several kinds of services serving rather different purposes are gathered into the ‘Value added services’ building block. These include the previous ‘Marketplaces’, which no longer appears as a building block in its own right in the updated version. From the DSSC definitions [80], these services seem to include any kind of software or service intended to use the data (data processing, management and analysis software, services and apps) (https://dssc.eu/space/BVE/357076468/Value-Added+Services?attachment=/download/attachments/357076468/image-20240301-090319.png&type=image&filename=image-20240301-090319.png, accessed on 31 August 2024). They are also foreseen in the functional layer of the IDSA Reference Architecture, under the ‘value adding apps’ group, which covers the following aspects: data processing and transformation; data app implementation; providing data apps; and installing and supporting data apps.

Appendix A.4. Technical Building Blocks—Services FAIRness

Table A24 reports the mapping of standards to the ‘Services Description (metadata)’ building block, defined as follows: “Metadata, technical guidance, and schemas for describing services chosen as components of a data space architecture”.
Table A24. Standards and solutions for the Services Description (metadata) building block.
Ref | Specification/Implementation(s) Recommended
OGC | OGC API Processes, Web Processing Service, Web Coverage Processing Service
Gaia-X | Gaia-X Labels
SIMPL | Attributes of a self-description for an application (ID SECAV-FUNC-002-FUNC-001) (https://futurium.ec.europa.eu/en/simpl/l2-detailed-requirement/attributes-self-description-application, accessed on 31 August 2024)

References

  1. Halevy, A.; Franklin, M.; Maier, D. Principles of dataspace systems. In Proceedings of the Twenty-Fifth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Chicago, IL, USA, 26–28 June 2006; pp. 1–9.
  2. Kasamani, S.B.; Lukandu, A.I.; Gregory, W. Modelling Dataspace Entity Association Using Set Theorems. Comput. Technol. Appl. 2012, 3, 6.
  3. Heath, T.; Bizer, C. Linked Data: Evolving the Web Into a Global Data Space, 3rd ed.; Morgan & Claypool Publishers: San Rafael, CA, USA, 2011; Volume 1.
  4. Zuiderwijk, A.; Janssen, M.T. Open data policies, their implementation and impact: A framework for comparison. Gov. Inf. Q. 2014, 31, 17–29.
  5. Braunschweig, K.; Eberius, J.; Thiele, M.; Lehner, W. The state of open data. Limits Curr. Open Data Platforms 2012, 1, 72.
  6. Huijboom, N.; Van den Broek, T. Open data: An international comparison of strategies. Eur. J. ePractice 2011, 12, 4–16.
  7. Chignard, S. A Brief History of Open Data. ParisTech Review. 2013. Available online: http://www.paristechreview.com/2013/03/29/brief-history-open-data/ (accessed on 8 July 2024).
  8. Directive 2007/2/EC of the European Parliament and of the Council of 14 March 2007 Establishing an Infrastructure for Spatial Information in the European Community (INSPIRE). Available online: http://data.europa.eu/eli/dir/2007/2/oj (accessed on 8 July 2024).
  9. Directive 2003/98/EC of the European Parliament and of the Council of 17 November 2003 on the Re-Use of Public Sector Information. Available online: https://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2003:345:0090:0096:en:PDF (accessed on 8 July 2024).
  10. WIS Timelines. Available online: https://community.wmo.int/en/activity-areas/wis/wis-timelines (accessed on 8 July 2024).
  11. World Wide Web Consortium–W3C. Available online: https://www.w3.org (accessed on 8 May 2024).
  12. Open Geospatial Consortium–OGC. Available online: https://en.wikipedia.org/wiki/Open_Geospatial_Consortium (accessed on 8 May 2024).
  13. Group on Earth Observations. Available online: https://earthobservations.org (accessed on 8 July 2024).
  14. International Data Spaces Association–IDSA. Available online: https://internationaldataspaces.org (accessed on 9 May 2024).
  15. NGA Unclassified Data Lake Fosters GEOINT Innovation. Available online: https://www.nga.mil/news/NGA_Unclassified_Data_Lake_Fosters_GEOINT_Innovati.html (accessed on 10 September 2024).
  16. Co-Creating a Global Environmental Data Strategy (GEDS). Available online: https://committee.iso.org/files/live/users/fh/aj/aj/tc211contributor%40iso.org/files/Presentations/2024-06%20London/SiA1-6.pdf (accessed on 23 August 2024).
  17. USAGE Deliverable 3.2–Data Space Prototype and Report–First Version. Available online: https://drive.google.com/file/d/1FyvNkWkAWkuKKHh-3l529VUu5LIKyu5E/view?usp=drive_link (accessed on 8 July 2024).
  18. Urban Data Space for Green Deal–USAGE Project. Funded within the HORIZON Europe Programme (GA. 101059950). Available online: https://www.usage-project.eu (accessed on 9 May 2024).
  19. Common European Data Spaces. Available online: https://digital-strategy.ec.europa.eu/en/policies/data-spaces (accessed on 8 July 2024).
  20. DSSC Glossary. Available online: https://dssc.eu/space/Glossary/55443460/DSSC+Glossary+%7C+Version+1.0+%7C+March+2023 (accessed on 8 July 2024).
  21. DSSC Starter Kit. Available online: https://dssc.eu/space/SK/29523973/Starter+Kit+for+Data+Space+Designers+%7C+Version+1.0+%7C+March+2023 (accessed on 8 July 2024).
  22. Otto, B.; Steinbuß, S.; Teuscher, A.; Lohmann, S.; Auer, S.; Bader, S.; Bastiaansen, H.; Bauer, H.; Birnstil, P.; Böhmer, M.; et al. Reference Architecture Model, Version 3; International Data Spaces Association: Dortmund, Germany, 2019. Available online: https://internationaldataspaces.org/wp-content/uploads/IDS-Reference-Architecture-Model-3.0-2019.pdf (accessed on 8 July 2024).
  23. Grothe, M. Exploring Data Space Initiatives; Geonovum: Amersfoort, The Netherlands, 2023; Available online: https://www.geonovum.nl/uploads/documents/Exploring%20data%20space%20initiatives%20v0.52_EN%20publication%20version%20Geonovum.pdf (accessed on 8 July 2024).
  24. Ahle, U.; Bastiaansen, H.; Bengtsson, K.; Caballero, M.; Castellvi, S.; Dognini, A.; van Ette, F.; Faraldi, M.; Gelhaar, J.; Graziani, A.; et al. Design Principles for Data Spaces, Version 1.0; OpenDEI, International Data Spaces Association: Dortmund, Germany, 2021.
  25. Farrell, E.; Minghini, M.; Kotsev, A.; Soler-Garrido, J.; Tapsall, B.; Micheli, M.; Posada, M.; Signorelli, S.; Tartaro, A.; Bernal, J.; et al. European Data Spaces: Scientific Insights into Data Sharing and Utilisation at Scale, 3rd ed.; JRC129900; Publications Office of the European Union: Luxembourg, 2023.
  26. OpenDEI Project. Available online: https://www.opendei.eu (accessed on 8 July 2024).
  27. Support Centre for Data Sharing Deliverables. Available online: https://dssc.eu/space/DC/408059908/Support+Centre+for+Data+Sharing (accessed on 8 July 2024).
  28. Data Sharing Canvas—A Stepping Stone towards Cross-Domain Data Sharing at Scale. Available online: https://coe-dsc.nl/wp-content/uploads/2024/02/data-sharing-canvas.pdf (accessed on 8 July 2024).
  29. Gaia-X. Available online: https://gaia-x.eu/ (accessed on 8 July 2024).
  30. Gaia-X. Gaia-X–Architecture Document. 2022. Available online: https://gaia-x.eu/wp-content/uploads/2022/06/Gaia-x-Architecture-Document-22.04-Release.pdf (accessed on 8 July 2024).
  31. FIWARE. Available online: https://www.fiware.org/about-us/ (accessed on 8 July 2024).
  32. Big Data Value Association–BDVA. Available online: https://www.bdva.eu (accessed on 8 July 2024).
  33. Gronlier, P.; Hierro, J.; Steinbuss, S. Technical Convergence—Discussion Document; Data Spaces Business Alliance: Darmstadt, Germany, 2023; Available online: https://data-spaces-business-alliance.eu/wp-content/uploads/dlm_uploads/Data-Spaces-Business-Alliance-Technical-Convergence-V2.pdf (accessed on 8 July 2024).
  34. DSSC Endorsement. Available online: https://dssc.eu/page/Endorsements (accessed on 8 July 2024).
  35. Data Spaces Support Centre–DSSC. Available online: https://dssc.eu/space/DDP/117211137/DSSC+Delivery+Plan+-+Summary+of+assets+publication (accessed on 8 July 2024).
  36. Data Space for Smart and Sustainable Cities and Communities–DS4SSCC. Available online: https://inventory.ds4sscc.eu (accessed on 8 July 2024).
  37. GREAT Project. Available online: https://www.greatproject.eu (accessed on 8 July 2024).
  38. GREAT Technical Blueprint. Available online: https://www.greatproject.eu/wp-content/uploads/2023/10/D3.1-Initial-Blueprint-of-the-GDDS-Reference-Architecture_web.pdf (accessed on 8 July 2024).
  39. Smart Open-Source Middleware (Simpl) Project. Available online: https://digital-strategy.ec.europa.eu/en/policies/simpl (accessed on 30 September 2024).
  40. WorldFAIR Project. Available online: https://worldfair-project.eu (accessed on 30 September 2024).
  41. Gregory, A.; Bell, D.; Brickley, D.; Buttigieg, P.L.; Cox, S.; Edwards, M.; Doug, F.; Gonzalez Morales, L.G.; Heus, P.; Hodson, S.; et al. WorldFAIR (D2.3) Cross-Domain Interoperability Framework (CDIF) (Report Synthesising Recommendations for Disciplines and Cross-Disciplinary Research Areas). Zenodo 2024.
  42. Open and Agile Smart Cities–OASC. Available online: https://oascities.org (accessed on 8 July 2024).
  43. Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Mons, B. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 2016, 3, 160018.
  44. EU. New European Interoperability Framework—Promoting Seamless Services and Data Flows for European Public Administrations; EU: Brussels, Belgium, 2017; ISBN 978-92-79-63756-8.
  45. GEOSS Data Management Principles. Available online: https://old.earthobservations.org/documents/dswg/201504_data_management_principles_long_final.pdf (accessed on 8 July 2024).
  46. Lin, D.; Crabtree, J.; Dillo, I.; Downs, R.R.; Edmunds, R.; Giaretta, D.; De Giusti, M.; L’Hours, H.; Hugo, W.; Jenking, R.; et al. The TRUST Principles for digital repositories. Sci. Data 2020, 7, 144.
  47. CARE Principles. Available online: https://en.wikipedia.org/wiki/CARE_Principles_for_Indigenous_Data_Governance (accessed on 8 July 2024).
  48. International Organization for Standardization–ISO. Available online: https://www.iso.org/home.html (accessed on 8 July 2024).
  49. W3C-OGC Spatial Data on the Web Working Group Charter. Available online: https://www.w3.org/2021/10/sdw-charter.html (accessed on 8 July 2024).
  50. Tandy, J.; van den Brink, L.; Barnaghi, P.; Homburg, T. Spatial Data on the Web Best Practices; OGC, W3C. 2023. Available online: https://www.w3.org/TR/2023/DNOTE-sdw-bp-20230919/ (accessed on 8 July 2024).
  51. Abhayaratna, J.; Daemen, E.; Janowicz, K.; Parsons, E.; Smith, R.; Verschoor, F. The Responsible Use of Spatial Data; OGC, W3C. 2021. Available online: https://www.w3.org/TR/responsible-use-spatial/ (accessed on 8 July 2024).
  52. Global Earth Observation System of Systems–GEOSS. Available online: https://old.earthobservations.org/geoss.php (accessed on 8 July 2024).
  53. ISO 25012. Available online: https://iso25000.com/index.php/en/iso-25000-standards/iso-25012 (accessed on 8 July 2024).
  54. ISO 25010. Available online: https://www.iso25000.com/index.php/en/iso-25000-standards/iso-25010 (accessed on 8 July 2024).
  55. OSI Model. Available online: https://en.wikipedia.org/wiki/OSI_model (accessed on 8 July 2024).
  56. Reference Model of Open Distributed Processing (RM-ODP). Available online: https://en.wikipedia.org/wiki/RM-ODP (accessed on 19 July 2024).
  57. ISO/IEC10746 Information Technology–Open Distributed Processing (RM-ODP). Available online: https://committee.iso.org/sites/jtc1sc7/home/projects/flagship-standards/isoiec-10746.html (accessed on 19 July 2024).
  58. D3.2—Data Space Prototype and Report—First Version. Available online: https://drive.google.com/file/d/1FyvNkWkAWkuKKHh-3l529VUu5LIKyu5E/view (accessed on 31 August 2024).
  59. DSSC. Blueprint, Version 0.5. 2023. Available online: https://dssc.eu/space/BPE/179175433/Data+Spaces+Blueprint+%7C+Version+0.5+%7C+September+2023 (accessed on 8 July 2024).
  60. Santoro, M.; Mazzetti, P. GREAT D3.2 Final Blueprint of the GDDS Reference Architecture. 2024. Available online: https://www.greatproject.eu/wp-content/uploads/2024/04/D3.2-Final-Blueprint-of-the-GDDS-Reference-Architecture.pdf (accessed on 8 July 2024).
  61. Sensor.Community. Available online: https://sensor.community/en/ (accessed on 13 September 2024).
  62. Bastin, L.; Kriukov, V.; Lush, V.; Serral, I.; Hodson, T.; Borger, C.; Zamzow, M.; Bruch, L.; Maso, J. AD4GD D6.1 Pilot Technical Implementation Planning, Implementation and Assessment. Zenodo, 2024.
  63. OGC GeoTIFF Standard. Available online: https://www.ogc.org/standard/geotiff/ (accessed on 10 September 2024).
  64. OGC Web Map Service Standard. Available online: https://www.ogc.org/standard/wms/ (accessed on 10 September 2024).
  65. OGC Web Coverage Service Standard. Available online: https://www.ogc.org/standard/wcs/ (accessed on 10 September 2024).
  66. OGC Web Processing Service Standard. Available online: https://www.ogc.org/standard/wps/ (accessed on 10 September 2024).
  67. Rasdaman (“Raster Data Manager”). Available online: http://www.rasdaman.org/ (accessed on 10 September 2024).
  68. OGC Sensor Things API Standard. Available online: https://www.ogc.org/standard/sensorthings/ (accessed on 10 September 2024).
  69. STAplus Extended Data Model. Available online: https://cos4cloud-eosc.eu/sensorthingsplusapiplus/ (accessed on 10 September 2024).
  70. OGC GeoPackage Standard. Available online: https://www.geopackage.org/spec140/ (accessed on 10 September 2024).
  71. ISO 19115-1:2014; Geographic Information—Metadata. ISO: Geneva, Switzerland, 2014. Available online: https://www.iso.org/standard/53798.html (accessed on 10 September 2024).
  72. ISO 19110:2016; Geographic Information—Methodology for Feature Cataloguing. ISO: Geneva, Switzerland, 2016. Available online: https://www.iso.org/standard/57303.html (accessed on 10 September 2024).
  73. Data Catalog Vocabulary (DCAT)–Version 3. Available online: https://www.w3.org/TR/vocab-dcat-3/ (accessed on 10 September 2024).
  74. OGC Location Building Blocks. Available online: https://blocks.ogc.org (accessed on 8 July 2024).
  75. Common Assessment Methods Standards and Specifications–CAMSS. Available online: https://joinup.ec.europa.eu/collection/common-assessment-method-standards-and-specifications-camss (accessed on 8 July 2024).
  76. The Open Data Institute–ODI. Available online: https://theodi.org/about-the-odi/our-vision/ (accessed on 8 July 2024).
  77. ODI–How to Choose an Open Standard. Available online: https://docs.google.com/document/d/1E5uARrZf5AJUIF_DJz-42_793EY_Dwk7n7B3bMn3x5A/edit#heading=h.xbuzggui7nk0 (accessed on 8 July 2024).
  78. Aus/NZ Intergovernmental Committee on Survey and Mapping (ICSM), 3D Cadastre Survey Data Exchange Specification. Available online: https://icsm-au.github.io/3d-csdm-common/ (accessed on 8 September 2024).
  79. Noardo, F.; Atkinson, R.; Simonis, I.; Villar, A.; Zaborowski, P. OGC Data Exchange Toolkit: Interoperable and Reusable 3D Data at the End of the OGC Rainbow. In Recent Advances in 3D Geoinformation Science–3DGeoInfo 2023; Lecture Notes in Geoinformation and Cartography; Kolbe, T.H., Donaubauer, A., Beil, C., Eds.; Springer: Cham, Switzerland, 2024; pp. 761–779.
  80. DSSC. Blueprint, Version 1.0. 2024. Available online: https://dssc.eu/space/BVE/357073006/Data+Spaces+Blueprint+v1.0 (accessed on 8 July 2024).
Figure 1. Quality characteristics defined by the ISO/IEC25010 [54].
Figure 2. Methodology workflow illustrating the different phases of the discussion and integration of data space building blocks (first line), followed by the last step of mapping existing standards to the building blocks as an initial reference for users and to identify an overview picture of the current offer (second line). Dark blue represents the methodological steps, while light blue boxes summarise the obtained results.
Figure 3. Timeline of the different initiatives, projects, and documents related to the data spaces conceptualisation and implementation.
Figure 4. Re-factored technical building blocks stack, following the mapping to solutions and their scopes.
Figure 5. Integrated proposed building block stack aligning with principles and solutions components [58].
Figure 6. Abstract architecture diagram for the data space building blocks from discussion based on research projects implementations.
Figure 7. Data space building blocks stack, coming from the DSSC blueprint (above) and as a result of the final comments and results of the discussion in this paper (below).
Figure 8. An example of a workflow designed for the Green Deal Data Space, which offers the processing and harmonisation of diverse data to users wishing to evaluate habitat connectivity scenarios.
Table 1. Panel of experts description, including authors, in alphabetical order.
Expert | Current Affiliation | Experience and Reason for Involvement
Rob Atkinson | OGC (Int) | Developer and semantics and standards expert, with 30 years of experience in distributed geospatial system implementation, standards, and standards development methodologies.
Lucy Bastin | Aston University (UK) | Involved in several projects and standardisation initiatives related to data and metadata interoperability and the transparent communication of data and model quality. Active OGC and ISO WG member.
Bart de Lathouwer | Geonovum (NL) | Long experience in OGC and interoperability-related projects and standardisation for spatial data infrastructures and data exchange ecosystems. Involved in Simpl initiative. Currently working for Geonovum, the Dutch geospatial standardisation organisation.
Giacomo Martirano | EPSIT Italia (IT) | Involved in USAGE and FAIRiCUBE (HORIZON project funded under the same call as USAGE and AD4GD), bringing years of experience in OGC and INSPIRE development and implementation.
Joan Maso | CREAF (ES) | Involved in several projects and standardisation initiatives on interoperability. Active OGC member.
Francesca Noardo | OGC (Int) | Working in USAGE and AD4GD projects, having researched the topic of multi-source data integration and standardisation for built environment related use cases in recent years.
Ingo Simonis | OGC (Int) | Expert in distributed architectures, semantics, interoperability, data spaces, and knowledge sharing.
Linda van den Brink | Geonovum (NL) | Geospatial and Web standards expert at Geonovum, co-chair of OGC/W3C Spatial Data on the Web Interest Group, chair of the OGC GeoSemantics domain working group, PhD titled Geospatial data on the Web, and OGC Gardels Award winner.
Alejandro Villar | OGC (Int) | Software engineer and semantics expert, with over 15 years of experience designing traditional and semantic data solutions and platforms.
Marie-Françoise Voidrot | OGC (Int) | Expert in Earth Observation data and knowledge sharing, platforms, and distributed systems from local, regional, and national data infrastructures to European or global systems such as WMO or GEOSS.
Piotr Zaborowski | OGC (Int) | Software architect, developer of GEOSS Common Infrastructure, OASC and JRC-INSPIRE liaison.
Table 2. Definition of each building block and the possible changes with respect to the DSSC blueprint.
Table 2. Definition of each building block and the possible changes with respect to the DSSC blueprint.
DSSC Blueprint AspectExtensionReason for ExtensionDefinition
Data Interoperability categoryData Specification enabling FAIRnessBuilding blocks support different FAIR aspects. All of these contribute to an effective description of data characteristics (be it published or required)A complete and standards compliant specification of all aspects of data FAIRness.
Data Sovereignty and Trust categoryUnchanged-Technical enablers to guarantee reliability and authenticity of participants’ information, to establish trust among them when interacting and performing data transactions. Common standards and agreed policies should prevent lock-in effects for users, and support FAIR principles, verification and authentication mechanisms, ensuring interoperability and security.
Data Value Creation EnablersData Value EnhancementThe intrinsic value of data is not created, however, the tools contained in this category unlock the value of those data by making them available for a wider audience and facilitating their use.To leverage the value of data, users must be supported in the retrieval and access to the data they need, and have the possibility to apply the necessary processing to adapt them to specific needs (e.g.,  data transformation, data visualisation, etc.).
-New Services FAIRness categoryAlthough data spaces are focused on data, specialised Services may be required to enable applications to exploit that data. Therefore, similar challenges apply to identify the necessary services and their possible use.Symmetrically to the category “Data specification enabling FAIRness”, this category supports a good description of services to facilitate services retrieval according to the use cases workflows needs.
Data models | Unchanged | - | The model provides semantics and a shared vocabulary, as well as a structure for the data (hierarchies and relationships).
Data Exchange or Data Models | Data Exchange–Encodings | In previous versions of the DSSC building blocks stack, the encodings of data, or formats, were included in the data models building block and, in the current version, they also have relationships with the Data Exchange building block. However, it is important to specify them separately, because dedicated standards are available, and they are independent from the options for representing data models or data exchange mechanisms. | Encoding is the format in which the data are encoded.
Data Exchange | Data Exchange (APIs) | Minor change in the title. Moved from the “Data interoperability” to the “Data value enhancement” category. | The data exchange building block focuses on data transmission once the conditions for interchange authorisation are met.
Provenance and Traceability | Data Provenance model | The original building block is intended to address both (a) the standards to be used to represent and document provenance in the data and (b) mechanisms to track the data throughout their lifecycle. However, two different kinds of standards and services are available for the two scopes. Moreover, one complements the data description, while the other is intended to support data sovereignty. It is therefore reasonable to address them separately. | Standards intended to represent and document provenance and lineage of data.
Provenance and Traceability | Sharing Traceability | See row above. Moved from the “Data Interoperability” to the “Data Sovereignty and Trust” category. | Standards and services intended to keep track of the data processing and sources along their lifecycle.
Access and usage policies and enforcement | Data Policies | The DSSC building block aims to specify how to define and enforce access and usage policies within a data space and how participants define their policies in data spaces. However, we consider that the policies specifying access and usage conditions, and the policy schemas, follow rules and definitions which are in practice separate from the generic mechanisms used to ensure that such conditions are respected. Therefore, in this proposal, it is split into two respective building blocks. | A policy is defined by the DSSC Blueprint v1.0 as “either an offer from the data provider or an agreement between the data provider and the data recipient. A policy comprises rules that specify the rights and duties of the parties: Access Rules: whether access to a resource is allowed or not; Usage Rules: how a resource might or may not be used; Consent Rules: whether usage of a resource, for which consent might be required from third parties, is allowed or not.” Policies include license details.
Access and usage policies and enforcement | Access and usage control and management | The DSSC building block aims to specify how to define and enforce access and usage policies within a data space and how participants define their policies in data spaces. However, we consider that the policies specifying access and usage conditions, and the policy schemas, follow rules and definitions which should be separated from the mechanisms used to ensure that such conditions are respected. Therefore, in this proposal, it is split into two respective building blocks. | Mechanisms in place to ensure that the access and usage policies related to certain data are respected.
Identity and attestation management | Unchanged | - | Information provided on the relevant entities must be verifiable to enable the onboarding and offboarding processes. The trustworthiness of information is linked to the trustworthiness of the Trust Anchors (or Trust Service Providers, specifically for identities), who are entitled to issue the respective attestations.
Trust Framework | Unchanged | - | Mechanisms and standards enabling a trust environment to be implemented within which data can be securely exchanged.
Data, services and offering descriptions | Data descriptions (metadata) | Separated from the original building block, because specific schemas are needed to describe datasets (metadata). Moved from the “Data Value Creation Enablers” to the “Data Specification enabling FAIRness” category. | Metadata, technical guidance, and schemas for describing datasets exchanged within a data space between providers and recipients. Metadata allow consistent data retrieval, generation or reuse, ensuring reliability in the results for which data are used as input and in the related decision-making process.
Data, services and offering descriptions | Services descriptions (metadata) | Separated from the original building block, because specific schemas are needed to describe services. Moved from the “Data Value Creation Enablers” to the “Services FAIRness” category. | Metadata, technical guidance, and schemas for describing services chosen as components of a data space architecture.
Data, services and offering descriptions | Offerings descriptions | Separated from the original building block, as a part not covered by existing or planned technical descriptions of data and services. We remain in doubt about the category under which it should fall, since several elements from different categories are useful to define the offering. | Offerings refer to a combination of descriptions and conditions attached to the data made available in the data space. However, a sharper definition is hard to find in the DSSC building block description.
Publication and discovery | Metadata publication and discovery | Minor change in the title. The DSSC blueprint refers to metadata publication rather than to data publication itself. Therefore, “metadata” was added to the title, to avoid any confusion with data publication systems (e.g., by means of APIs).
Value added services | Unchanged | - | Included in the Blueprint v1.0; any kind of processing offered as a service is included.
- | New Data Requirements and quality Schemas | In “Data Specification enabling FAIRness” category. | Data requirements specification is essential for successful data retrieval or generation, leading to reliable results for use cases. Referring to standardised schemas to specify data requirements allows interoperability between systems.
Data models | New Vocabularies | Data models are often described as “vocabularies”; however, there are many forms of terminology needed to describe the structure and semantics of data, as well as to constrain the values that are used to convey information within these structures. Data spaces will face issues of abstraction, where one domain may model a concept in detail, whereas for another domain it is simply a classification attribute, e.g., a “cat” or an “animal, type = cat”. | Vocabularies are defined sets of terms. In the case of data models, these terms may be formally described in terms of relationships to other terms, as ontologies or taxonomies, but they may also exist as lists of terms with definitions.
- | New Vocabularies services | In “Data Value Enhancement” category. Services may be used to cross-reference or match related terms from different domains, to enhance “findability” in particular. Publishing cross-walks with expert curation can add significant value to data for domains needing this semantic clarity. | Services intended to manage or augment vocabularies for the related functionalities.
- | New data requirements and quality services (definition + validation) | In “Data Value Enhancement” category. | Services intended to support the specification of standard-based data requirements (based on the building blocks in the category “Data Specifications for FAIRness”) as well as data validation against such requirements.
- | New Licenses for services | In “Services FAIRness” category. | The kinds of licenses available for services (to be known when planning and implementing a data space architecture).
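As an illustration of the policy definition quoted in the Data Policies row above, the three kinds of rules could be represented roughly as follows. This is a minimal sketch in Python and not a DSSC data model: the class names, field names, the deny-by-default behaviour and the `is_permitted` helper are all assumptions introduced here for illustration, as are the example resource and provider identifiers.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class RuleKind(Enum):
    ACCESS = "access"    # whether access to a resource is allowed or not
    USAGE = "usage"      # how a resource might or may not be used
    CONSENT = "consent"  # whether usage needing third-party consent is allowed


@dataclass(frozen=True)
class Rule:
    kind: RuleKind
    action: str    # hypothetical action label, e.g. "read" or "redistribute"
    allowed: bool


@dataclass
class Policy:
    """An offer from a provider, or an agreement once a recipient is named."""
    resource: str
    provider: str
    recipient: Optional[str] = None  # None means the policy is still an offer
    license: Optional[str] = None    # policies include license details
    rules: List[Rule] = field(default_factory=list)

    @property
    def is_agreement(self) -> bool:
        return self.recipient is not None

    def is_permitted(self, kind: RuleKind, action: str) -> bool:
        # Deny by default (an assumption of this sketch): an action is permitted
        # only if a matching rule exists and no matching rule forbids it.
        matching = [r for r in self.rules if r.kind is kind and r.action == action]
        return bool(matching) and all(r.allowed for r in matching)


# Hypothetical offer: reading is allowed, redistribution is not.
policy = Policy(
    resource="example-dataset",
    provider="example-provider",
    license="CC-BY-4.0",
    rules=[
        Rule(RuleKind.ACCESS, "read", True),
        Rule(RuleKind.USAGE, "redistribute", False),
    ],
)
```

Keeping the rules as plain data, separate from any code that enforces them, mirrors the split proposed in the table between the “Data Policies” and “Access and usage control and management” building blocks.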
Table 3. Proposed criteria to describe and assess standards.
Relevance and Scope
Fit for Purpose: Ensure that the standards align with the specific needs and objectives of the domain. Adoption within a domain is an obvious indicator of quality, but should be qualified by any activities to improve such standards based on experience of usage.
Scope Coverage: Evaluate whether the standards cover all necessary areas without significant gaps. Overlapping standards should be assessed to see if they offer complementary benefits or if they introduce redundancy.
Alignment: Check whether alignments between related standards, in use or proposed, have been published and made available for re-use. Data transformations are a significant overhead in most re-use scenarios, so the availability of tested transformation mechanisms makes it easier to combine different standards in practice.

Flexibility and Extensibility

Modularity: Building block composition mechanisms must be explicit and supported by tooling to realise the principle of reuse. Such mechanisms must include explicit traceability of interoperability design, through transparent and standardised description of building block dependencies. Furthermore, such building blocks need to be adaptable for larger building blocks that use the same composition mechanisms, to allow the rich metadata required to evaluate and reuse resources to be assembled and understood. This has been a critical weakness of formal standardisation processes, leading to many alternative ad hoc approaches to standardisation in application profiles.
Adaptability: Standards should be flexible enough to accommodate future changes and extensions. Avoid overly rigid standards that may hinder innovation or adaptation to new requirements.
Special vs. General Solutions: Prefer standards that provide flexible, general solutions over those offering special case solutions, unless the special case is critical and cannot be addressed adequately by the general standard.

Interoperability and Integration

Compatibility: Standards should work well together and integrate smoothly, minimising the need for custom adapters or significant modifications.
Interdependencies: Assess the interdependencies between standards. Strongly interdependent standards should be adopted together only if they provide a cohesive, integrated solution.
Transparency: Machine-readable declarations of compatibility and interdependencies allow for cost-effective and scalable testing and the reuse of integrated suites of standards.

Simplicity and Clarity

Simplicity: Choose standards that support simplification through encapsulation—the ability to test and compose arbitrarily complex complete solutions from simple components.
Ease of Understanding: Standards should be clear, well-documented, and easy to understand. Complex standards can lead to misinterpretation and implementation errors.
Examples: Standards should be supported by clear examples to allow practitioners to easily understand the scope of standards. Examples should be tested to conform to standards, as discrepancies are common and cause significant confusion.

Support and Ecosystem

Tool Support: Consider the availability of software tools and libraries that support the standards. Robust tool support can significantly ease implementation and maintenance.
Community and Vendor Support: Evaluate the level of community and vendor support. Widely adopted standards with active communities and strong vendor backing are often more reliable and future-proof.

Maturity and Stability

Proven Track Record: Mature standards that have been widely adopted and tested in various contexts are usually more reliable.
Stability: Prefer standards that are stable and have a clear roadmap for updates and maintenance. Frequent changes can disrupt development and integration processes.

Compliance and Security

Regulatory Compliance: Ensure that the standards comply with relevant regulations and industry best practices.
Security: Assess the security implications of the standards. They should support the implementation of secure systems and not introduce vulnerabilities.

Cost and Resource Considerations

Implementation Cost: Consider the cost of implementing and maintaining the standards, including licensing fees, if any.
Resource Availability: Ensure that the necessary skills and resources are available for adopting and maintaining the standards.
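
The Modularity and Transparency criteria above call for machine-readable declarations of building block dependencies. The following is a minimal sketch of how such declarations could be checked automatically. It assumes a hypothetical declaration format (a mapping from each building block identifier to the identifiers it depends on); the identifiers and the `check_declarations` function are invented for illustration and are not taken from any existing register or tool.

```python
from graphlib import CycleError, TopologicalSorter


def check_declarations(declared: dict[str, set[str]]) -> list[str]:
    """Return a composition order in which every block follows its dependencies.

    Raises ValueError if a dependency is undeclared or the dependencies are circular.
    """
    # Every dependency must itself be a declared building block.
    referenced = {dep for deps in declared.values() for dep in deps}
    missing = referenced - set(declared)
    if missing:
        raise ValueError(f"undeclared dependencies: {sorted(missing)}")
    try:
        # static_order() yields each node only after all of its predecessors.
        return list(TopologicalSorter(declared).static_order())
    except CycleError as err:
        raise ValueError(f"circular dependency: {err.args[1]}") from err


# Hypothetical declarations for four building blocks.
declarations = {
    "metadata-schema": set(),
    "vocabulary": set(),
    "data-model": {"vocabulary"},
    "api-profile": {"data-model", "metadata-schema"},
}
order = check_declarations(declarations)
```

Because the check runs on declarations alone, it can be repeated cheaply whenever a building block is added or revised, which is the kind of scalable testing the Transparency criterion refers to.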