1. Introduction
Much government data has some reference to location and thus benefits from being mapped to a geographical data framework [
1]. The last decade has seen substantial efforts by governments to open up geospatial datasets. For example, the city of Manchester in the UK (
https://mappinggm.org.uk/, accessed on 13 August 2024) and the Halifax Regional Municipality, Canada (
https://catalogue-hrm.opendata.arcgis.com/, accessed on 13 August 2024) provide open public access to map data, including housing, socioeconomic and demographic data. In the UK, open data policy has led to new datasets being made freely available by the Ordnance Survey (the UK’s national mapping agency) (
https://osdatahub.os.uk/, accessed on 13 August 2024), including, among others, all of the administrative and postal code boundaries [
2]. Similar initiatives exist across all of Europe [
3], and many other countries in the world [
4], driven by evidence of significant social and economic value [
1]. Administrative geography open datasets are an important set of data representing a hierarchy of areas relating to national and local government, and are often used as backdrop map layers for locating other geospatial data. These multilayered hierarchies are normally complicated by the differing structures within and across countries. For example, in the UK, Council Areas, Unitary Authorities and London Boroughs are used to represent regions at corresponding levels in Scotland, Wales, and England, respectively. Proprietary ontologies are used by different countries to encode and publish these datasets [
5,
6,
7], as linked open data [
8,
9]. However, there is a lack of studies that evaluate the effectiveness of these ontologies. There is a need for ontology evaluation methods that consider the specific semantics of the geospatial ontologies. The inherent spatial semantics in the representation of administrative divisions as layered partitions of space by fiat boundaries are common across different geographic places and datasets. The definition and explicit representation of these semantics are important to allow for the development of common ontologies and spatial reasoning frameworks.
On the other hand, much effort has been made in the design and development of standard models for the representation of geospatial data to support their effective discovery, sharing and integration; in particular, the ISO Geographic information standards (
https://committee.iso.org/sites/tc211/home/re.html, accessed on 13 August 2024) and the Open Geospatial Consortium (OGC) (
https://www.ogc.org/standards/, accessed on 13 August 2024) standards. A prominent example of these standards for geospatial linked data is the ontology underlying the OGC Geographic Query Language for RDF (GeoSPARQL) (
https://docs.ogc.org/is/22-047r1/22-047r1.html, accessed on 13 August 2024). The GeoSPARQL ontology provides core vocabulary and concepts for the representation of geographic features and spatial relationships that are fundamental modelling elements in geospatial domains. Adopting and reusing these foundational ontologies will enable the shared use and integration of the administrative geography open datasets.
In this work, we propose a new measure of semantic completeness for geospatial ontologies of administrative datasets. Topological and proximity semantics are distinguished and combined in a measure of spatial semantic completeness. The significance of the proposal is twofold: (a) it provides a homogeneous method of defining spatial semantics in these types of ontologies, and (b) it provides a novel metric of quality for evaluating geospatial ontologies that considers the spatial dimension of the datasets. This core dimension has not been considered before in any ontology evaluation method. The utility of the proposal is demonstrated by the evaluation of ontologies from four European national mapping agencies used in the provision of their open datasets. It is shown how the proposed metric provides an objective measure of quality of the geospatial ontologies and identifies the gaps in their definition. The contributions of this work are as follows: (a) a study of the nature of administrative open geospatial data and the identification of a uniform set of spatial semantics that are inherent in the data and can be explicitly defined in their representative ontologies, (b) the proposal of a uniform measure of spatial completeness as a method for evaluating ontologies of administrative geographies, and (c) a demonstration of the utility and effectiveness of the proposed measures through the evaluation of four real-world ontologies.
The remainder of this paper is structured as follows. 
Section 2 provides an overview of geographic ontologies and ontology evaluation methods. 
Section 3 presents the proposed measures of spatial semantics and concludes with a combined measure of spatial semantics completeness. The proposed spatial semantics are translated into a set of competency questions that can be used for evaluating geospatial ontologies. 
Section 4 provides a systematic evaluation of four administrative division ontologies against the proposed measures. A discussion of the results and their significance, in comparison to other standard ontology evaluation metrics, is also presented. Conclusions and an overview of future work are given in 
Section 5.
  3. Measures of Spatial Semantics in Administrative Geography Data
Consider the map of Wales, UK, shown in 
Figure 1. The map in 
Figure 1a represents the boundaries of the local authority districts in the UK. In particular, the map is focused on the Welsh districts. In 
Figure 1b, the boundaries of the Local Health Boards are shown for the same area. The map implicitly provides semantics about the location of these regions in space and their relationships to one another that can be inferred by visual spatial reasoning. Some examples of these semantics are as follows, with references to the shaded regions in 
Figure 1 (map data are from the Open Geography portal of the Office for National Statistics, UK, 
https://geoportal.statistics.gov.uk/, accessed on 13 August 2024).
- The local authority district of Cardiff is in Wales. 
- Cardiff and Newport are neighbours. 
- Swansea is further away from Cardiff than Newport is. 
- The Isle of Anglesey is farther away to the north from Swansea than Carmarthenshire is. 
- Pembrokeshire, Carmarthenshire and Ceredigion are part of the Hywel Dda University Health Board. 
The above statements are examples of qualitative spatial relationships between regions, in particular: topological relations, which describe the degree of connectedness between regions; proximal relations, which describe qualitative distance relationships between regions; and directional relations, which describe the relative position of the regions with respect to a specific frame of reference (e.g., cardinal directions). Qualitative spatial representation and reasoning (QSRR) is an established area of research that seeks to define formalisms for modelling space and spatial relations to allow for the automatic inference of such spatial semantics (see [
47] for a detailed review of this subject).
Here, we utilise topological and proximal spatial relationships to define the spatial semantics of administrative geographies. The hierarchical division of space normally adopted for administrative divisions lends itself naturally to connection patterns represented by topological relations. We extend these patterns to also express some coarse representation of proximity and leave out directional relationships for future studies. 
Figure 2 shows a set of six standard topological spatial relations between simple regions. Generalised containment relationships are used where no distinction is made for regions with touching boundaries.
Administrative divisions, as seen in the examples shown in 
Figure 1, divide the space into distinct neighbouring regions that together cover the space completely. Several divisions of the same space can be adopted for different purposes, for example, health board regions, local authority district regions or postcode regions. Multiple layers of these divisions combine to form administrative hierarchies, e.g., local authority districts contain wards and parishes, and postcode areas contain postcode districts, which in turn contain postcode sectors, and so on. Relationships between regions in administrative division geographies can thus be summarised as follows:
- Spatial relationships between regions in one division of space are restricted to either $touches$ or $disjoint$. 
- Relationships between regions on different layers can be defined by containment relationships, e.g., $inside$. 
- Regions in different divisions may intersect, where $intersects$ is defined as any relationship of connectedness between two regions [48]. For example, regions representing local districts will intersect with postal code areas in the same geographic region. 
  3.1. Topological Semantics
Let $L_i$ represent level $i$ in an administrative hierarchy $H$. Let a region $r$ represent an instance of the set of regions $R$ in $L_i$. A measure of the topological semantics in $H$ can thus be formulated in terms of the numbers and types of possible topological relations defined between regions across the levels of $H$.
Let $N_r$ be the set of all neighbouring regions to $r$. Let $\overline{N_r}$ be the complement of $N_r$ in $R$ (the remaining set of regions in $R$ besides $N_r$). The topological semantics ($TS$) of the administrative division $L_i$ in $H$ are defined as follows.
- $TS_1$: There exists a $touches$ relationship between $r$ and every region in $N_r$. 
- $TS_2$: There exists a $disjoint$ relationship between $r$ and every region in $\overline{N_r}$. Note that this is a derived relationship from $TS_1$. 
- $TS_3$: There exists one region in $L_{i-1}$ (for all levels except the root) that $contains$ $r$, i.e., every region lies inside one parent region. Transitivity of the $inside$ and $contains$ relationships can then be applied to define all possible containment relationships between regions on all levels. 
- $TS_4$: Regions on a similar level in different administrative hierarchies for the same extent in space will intersect. 
For two administrative hierarchies $H_1$ and $H_2$ that represent different divisions of the same space, let $L_i^1$ represent level $i$ in $H_1$ and $L_i^2$ represent level $i$ in $H_2$. There exists at least one instance of the relationship $intersects$ between $r_1$ in $L_i^1$ and $r_2$ in $L_i^2$.
To determine the correspondence between levels in different hierarchies, we propose the use of a global universal scheme, such as the Global Administrative Areas (GADM) (
https://gadm.org/, accessed on 13 August 2024).
Using the above definitions of topological semantics, we can propose a definition of topological semantic completeness ($TC$) for the representation of administrative geographies as follows.
- $TC(L_i)$: A level $L_i$ in $H$ is considered to be complete with respect to topological semantics (topologically complete) if, for every region $r$ in $L_i$, the four topological semantics above can be defined. 
- $TC(H)$: $H$ is considered to be topologically complete if every level $L_i$ in $H$ is topologically complete. 
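The four topological semantics and the two completeness conditions above can be sketched as a small check over explicitly stored relations. This is a minimal illustration, not the authors' implementation: the function names, the dictionary-based encoding of `touches`, `parent` and `intersects`, and the toy regions `a`, `b`, `c`, `root` are all assumptions introduced here.

```python
from typing import Dict, List, Set


def derived_disjoint(region: str, level: Set[str], touches: Dict[str, Set[str]]) -> Set[str]:
    """Second semantic: every non-neighbour on the same level is disjoint."""
    return level - touches.get(region, set()) - {region}


def level_is_complete(
    level: Set[str],
    touches: Dict[str, Set[str]],
    parent: Dict[str, str],
    intersects: Dict[str, Set[str]],
    is_root: bool = False,
) -> bool:
    """Check the four topological semantics for every region on one level."""
    for region in level:
        # (1) neighbours must be recorded, and touches must be symmetric
        if region not in touches:
            return False
        if any(region not in touches.get(n, set()) for n in touches[region]):
            return False
        # (2) disjoint is derived from (1), so nothing extra has to be stored
        # (3) a parent region is required on every level except the root
        if not is_root and region not in parent:
            return False
        # (4) at least one intersecting region in the other hierarchy
        if not intersects.get(region):
            return False
    return True


def hierarchy_is_complete(
    levels: List[Set[str]],
    touches: Dict[str, Set[str]],
    parent: Dict[str, str],
    intersects: Dict[str, Set[str]],
) -> bool:
    """A hierarchy is complete when every level is complete (levels[0] is the root)."""
    return all(
        level_is_complete(level, touches, parent, intersects, is_root=(i == 0))
        for i, level in enumerate(levels)
    )


# Toy data (hypothetical): a root region with three child regions in a row.
levels = [{"root"}, {"a", "b", "c"}]
touches = {"root": set(), "a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
parent = {"a": "root", "b": "root", "c": "root"}
intersects = {"root": {"other_root"}, "a": {"x"}, "b": {"x", "y"}, "c": {"y"}}
```

With this toy data, `hierarchy_is_complete(levels, touches, parent, intersects)` holds, and dropping the parent of `c` makes the second level, and therefore the hierarchy, incomplete.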
  3.2. Proximity Semantics
We propose a measure of proximity for administrative geography regions that is based on and extends the semantics of connectedness used in defining the topological semantics above. Two regions in $R$ are either connected or disjoint. A measure of the proximity semantics in $H$ can thus be formulated in terms of the numbers and types of regions that are connected across the levels of $H$. Thus, measures of proximity semantic completeness ($PC$) for the representation of administrative geographies can be proposed as follows.
- $PC(L_i)$: A level $L_i$ in $H$ is considered to be complete with respect to proximity semantics (proximity-complete) if, for every region $r$ in $L_i$, its connectedness with all other regions in $L_i$ is defined, or can be inferred. 
- $PC(H)$: $H$ is considered to be proximity-complete if every level $L_i$ in $H$ is considered to be proximity-complete. 
  3.3. Spatial Semantic Completeness
We use the above set of topological and proximity semantics to define an overall spatial semantic completeness metric that can be used to evaluate an ontology of administrative geography. Let a geospatial ontology $O$ be a tuple $(C, I, A, R)$, where $C$ represents concepts, $I$ represents instances of these concepts, $A$ represents a collection of finite sets of attributes of these concepts, and $R$ a finite set of binary relations on these concepts. Let $O^{+}$ be the ontology $O$ that defines, in addition to $R$, all spatial relations that are definable through QSR from $O$ ($R \subseteq R^{+}$).
Let a set of map layers $M$ represent a hierarchy of administrative divisions in space, against which $O$ will be evaluated. Let $O_M$ be the set of concepts, instances, attributes and spatial relationships definable in $M$, where ($O_M = (C_M, I_M, A_M, R_M)$).
$O_M$ represents a gold standard ontology with respect to $O$ and contains all concepts, relations, attributes, and instances of the administrative division for a specific space. $O_M$ is considered to be topologically complete ($TC$) and proximity complete ($PC$); namely, it is possible to define the set of all topological and proximity semantics for $M$. Furthermore, it is assumed that $O$ is semantically complete with respect to $O_M$ on the three components: $C$, $I$ and $A$ (since the maps are faithful representations of the ontology). On the other hand, completeness of $O$ with respect to $R_M$ is not assumed and will need to be evaluated as follows.
- Spatial Semantic Completeness: The spatial semantic completeness (spatial completeness for short) of ontology $O$ with respect to $O_M$ ($SpCom(O, O_M)$) is defined as follows:
$$SpCom(O, O_M) = \frac{|R^{+}|}{|R_M|}$$
where $|R^{+}|$ is the number of spatial relations defined in, or definable through QSR from, $O$, and $|R_M|$ is the number of definable spatial relations in $O_M$. $SpCom(O, O_M)$ is the ratio of the definable spatial relationships in $O$ to those defined in $O_M$.
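As a rough illustration of the ratio described above, the following sketch derives the relations that follow from the stored ones by simple qualitative reasoning and compares them with a gold-standard set. The encoding of a relation as a `(name, a, b)` tuple, the decision to count both directions of a symmetric relation, the rule set in `qsr_closure`, and the toy regions are assumptions made here, not part of the original method.

```python
from itertools import permutations
from typing import Iterable, Set, Tuple

Relation = Tuple[str, str, str]  # (relation name, region a, region b)


def qsr_closure(defined: Set[Relation], levels_with_touches: Iterable[Set[str]] = ()) -> Set[Relation]:
    """Add relations derivable from the stored ones (a deliberately small rule set)."""
    derived = set(defined)
    # touches is symmetric
    derived |= {("touches", b, a) for (rel, a, b) in defined if rel == "touches"}
    # on levels whose neighbours are stored, non-touching pairs are disjoint
    for level in levels_with_touches:
        for a, b in permutations(level, 2):
            if ("touches", a, b) not in derived:
                derived.add(("disjoint", a, b))
    # inside is transitive
    changed = True
    while changed:
        changed = False
        inside = [(a, b) for (rel, a, b) in derived if rel == "inside"]
        for a, b in inside:
            for c, d in inside:
                if b == c and ("inside", a, d) not in derived:
                    derived.add(("inside", a, d))
                    changed = True
    # contains is the inverse of inside
    derived |= {("contains", b, a) for (rel, a, b) in derived if rel == "inside"}
    return derived


def spatial_completeness(defined: Set[Relation], gold: Set[Relation],
                         levels_with_touches: Iterable[Set[str]] = ()) -> float:
    """Ratio of defined-or-definable relations to the gold-standard relations."""
    r_plus = qsr_closure(defined, levels_with_touches) & gold
    return len(r_plus) / len(gold)


# Toy gold standard (hypothetical): regions a, b, c in a row inside region W.
gold = {
    ("touches", "a", "b"), ("touches", "b", "a"),
    ("touches", "b", "c"), ("touches", "c", "b"),
    ("disjoint", "a", "c"), ("disjoint", "c", "a"),
    ("inside", "a", "W"), ("inside", "b", "W"), ("inside", "c", "W"),
    ("contains", "W", "a"), ("contains", "W", "b"), ("contains", "W", "c"),
}
containment_only = {("inside", "a", "W"), ("inside", "b", "W"), ("inside", "c", "W")}
with_touches = containment_only | {("touches", "a", "b"), ("touches", "b", "c")}
```

An ontology that stores only the containment relations recovers half of this toy gold standard, while also storing the two `touches` facts (and treating that level as closed) recovers all of it.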
  3.4. Spatial Semantic Competency Questions
Competency questions (CQs) play a key role in ontology engineering [
49]. They consist of a set of questions that the ontology should be able to answer and thus define the ontology scope and provide a way of evaluating the ontology. Spatial semantics are part of the domain knowledge captured by a geospatial ontology. Capturing these semantics in the form of competency questions can therefore be used to facilitate both the process of defining and evaluating ontologies. When designing the ontology, requirements can be captured by ontology engineers through CQs and expressed in natural language. During evaluation, the questions are expressed formally using Description Logic and are posed as queries to the ontology.
The proposed set of spatial semantics can be used to formulate a set of competency questions for administrative domain ontologies as follows. If $O$ is spatially complete with respect to $O_M$, then $O$ would be able to address the set of “spatial semantic competency questions” for all classes of regions $C$ and instances $I$ in $O$. Let $x$ be an instance of a region, and $H_k$ be a subset of map layers representing a possible hierarchical division in $O$. The set of competency questions (SCQ) that can be used to evaluate $O$ are as follows.
- SCQ1: Which regions are neighbours of ($touch$) region x? 
- SCQ2: Which regions are $near$ region x? 
- SCQ3: Which regions lie $between$ region x and region y? 
- SCQ4: Which regions are not neighbours (do not touch) region x? 
- SCQ5: Which regions are parents of ($contain$) region x? 
- SCQ6: Which regions are contained in ($inside$) region x? 
- SCQ7: Which regions in $H_2$ intersect with region x in $H_1$? 
- SCQ8: Which regions in $H_2$ are near region x in $H_1$? 
 and 
 directly correspond to topological and proximity relationships in 
Figure 2. 
Proximity relationships ($near$ and $far$) describe relative distance in natural language communication. They are dependent on the scale of the space and size of objects described and have no precise quantitative definition [
50]. Using a graph-based approach to the representation of qualitative spatial relations, semantics of the $near$, $far$ and $between$ relationships can be described as the graph distance between regions (nodes). Some interpretations of proximity relationships can be described as follows on a graph representing regions and $touches$ relationships as described in 
Section 3.3 above.
- Regions are considered to be $near$ if they are disjoint and there is a path of two ($touches$) edges between them, as shown in Figure 3a. 
- A region is considered to be $between$ two other regions if it is on the path between them such that the path does not contain cycles or parts of cycles in the graph. For example, region y is between regions x and z in Figure 3a, and the intermediate regions on the path are between regions x and n in Figure 3b. 
- Regions are considered to be $far$ if multiple regions exist $between$ them, as shown in Figure 3b. 
Using the above relationships on the map in 
Figure 3c, proximity relationships such as $near$, $between$ and $far$ can be defined between the regions shown.
The above definitions are possible examples of how proximity can be defined in this type of ontology. Formal definitions of these relationships and other variations (e.g., very near, very far, etc.) using graph theory are possible, but are outside the scope of this work. In the remainder of this paper, it is assumed that there exists a gold standard ontology $O_M$ and an associated set of spatial competency questions, against which an administrative geography ontology can be evaluated.
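The graph-distance reading of the proximity relationships described above can be sketched with a breadth-first search over the neighbour graph. This is one possible interpretation under stated assumptions: near is taken as a graph distance of exactly two, between as lying on a shortest path, and far as having at least `min_between` regions on the shortest path, where `min_between` is a hypothetical parameter introduced here. The region names in the toy graph are placeholders.

```python
from collections import deque
from typing import Dict, Set


def graph_distances(touches: Dict[str, Set[str]], source: str) -> Dict[str, int]:
    """Breadth-first search: number of touches edges from source to each region."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in touches.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def near(touches: Dict[str, Set[str]], x: str) -> Set[str]:
    """Regions that do not touch x but are reachable over two touches edges."""
    return {r for r, d in graph_distances(touches, x).items() if d == 2}


def between(touches: Dict[str, Set[str]], x: str, z: str) -> Set[str]:
    """Regions on some shortest path from x to z (assumed reading of 'between')."""
    dx, dz = graph_distances(touches, x), graph_distances(touches, z)
    if z not in dx:
        return set()
    return {r for r in dx if r not in (x, z) and r in dz and dx[r] + dz[r] == dx[z]}


def far(touches: Dict[str, Set[str]], x: str, min_between: int = 2) -> Set[str]:
    """Regions with at least min_between regions on the shortest path to x."""
    return {r for r, d in graph_distances(touches, x).items() if d - 1 >= min_between}


# Toy neighbour graph (hypothetical): a chain x - y - z - n.
chain = {"x": {"y"}, "y": {"x", "z"}, "z": {"y", "n"}, "n": {"z"}}
```

On the chain, `z` is near `x`, `y` lies between `x` and `z`, and `n` is far from `x` under the default threshold.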
  4. Evaluating the Spatial Semantics of Administrative Geography Ontologies
Four well-established administrative geography ontologies were chosen to be studied here. These are the administrative geographies of the UK, Ireland, France and Greece, offered by their respective National Mapping Agencies (Ordnance Survey of Great Britain, Ordnance Survey Ireland, IGN France and the Ministry of the Interior and Administrative Reconstruction, Greece). The ontologies, provided on open data portals, were downloaded and stored in GraphDB (
https://graphdb.ontotext.com/, accessed on 13 August 2024) and Protégé (
https://protege.stanford.edu/, accessed on 13 August 2024) for analysis. A summary of the number of classes and instances in the datasets is presented in 
Table 1.
The analysis was performed by studying the ontologies to determine the different administrative hierarchies (and corresponding levels) presented in each dataset, and then identifying the spatial relationships defined between the classes and between the instances. In what follows, we present, for each dataset, the structure of the ontology and a summary of the spatial semantics encoded in the data. These are then compared with the spatial semantics proposed in the previous section to provide a measure of completeness. Analysis of the Ordnance Survey ontologies is presented here, while analysis of the Irish, French and Greek ontologies is provided in 
Appendix A.
  4.1. Ordnance Survey, UK (OS)
The Ordnance Survey, UK, has invested much effort in preparing its ontologies, including a specific ontology of spatial relationships [
17,
51]. Administrative divisions for Wales are shown in 
Figure 4a,b. Postal code division for the whole of the UK is shown in 
Figure 4c, and an example class definition of the OS ontology is shown in 
Figure 4d. Note that the Welsh divisions are used here as an example. Similar representations of divisions are used to represent different areas of the UK. Note also that postal code divisions are not considered as administrative divisions. However, they are represented with similar spatial hierarchies and can thus be treated in the same manner. 
Table 2 presents the computation of the spatial completeness measure for the OS ontologies.
In the table, the columns for 
Domain and 
Range represent the types of regions explicitly defined in the administrative division in the ontology. Regions in three administrative divisions, corresponding to those shown in 
Figure 4, are presented in groups, separated by a horizontal divider. For example, the first group in 
Table 2 consists of three types of regions: Unitary Authority, Unitary Authority Electoral Division and Electoral Division. The column of 
Possible Relationships lists the set of sound topological relationships between the regions considered (the only types of physically possible relationships between the regions). For example, two Unitary Authority regions can exist only in $touches$ or $disjoint$ relationships. The 
Defined Relationships column lists the actual relationships that are explicitly defined in the ontology between the regions considered. For example, the OS ontology defines the $touches$ relationship between regions of type Unitary Authority. The 
Definable by QSR column lists the possible relationships that can be automatically derived by qualitative reasoning from the explicitly stored ones. For example, since regions on the same level that do not touch are disjoint, all disjoint relationships between regions of type Unitary Authority can be deduced from the defined touches relationships. The 
Not Defined column lists the difference between the possible relationships and the union of defined and definable relations. 
Spatial completeness is the ratio of the total number of defined and definable relationships to the total number of possible relationships. An overall measure of spatial completeness is given in the last row, as an average of the measures across all the considered regions and relationships.
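The bookkeeping behind such a table can be sketched as follows. The `Row` class, its field names, and the choice to count relationship types per row are assumptions made for illustration. The first example row follows the Unitary Authority example given in the text, while the second row (`Level A`, `Level B`) is hypothetical and does not reproduce any value from Table 2.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Row:
    """One domain/range pair with its relationship columns."""
    domain: str
    range_: str
    possible: FrozenSet[str]
    defined: FrozenSet[str]
    definable: FrozenSet[str]

    @property
    def not_defined(self) -> FrozenSet[str]:
        # possible relationships that are neither stored nor derivable
        return self.possible - (self.defined | self.definable)

    @property
    def completeness(self) -> float:
        # (defined + definable by QSR) / possible
        covered = (self.defined | self.definable) & self.possible
        return len(covered) / len(self.possible)


def overall_completeness(rows: List[Row]) -> float:
    """Average of the per-row completeness values."""
    return sum(row.completeness for row in rows) / len(rows)


rows = [
    # touches is stored, disjoint is derived from it (example from the text)
    Row("Unitary Authority", "Unitary Authority",
        frozenset({"touches", "disjoint"}), frozenset({"touches"}), frozenset({"disjoint"})),
    # hypothetical pair with no stored or derivable relationship
    Row("Level A", "Level B",
        frozenset({"intersects"}), frozenset(), frozenset()),
]
```

Here the first row is fully covered, the second row reports `intersects` as not defined, and the average over the two rows is 0.5.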
Based on the defined relationships, some example competency questions that can be answered by the OS ontologies are as follows, assuming they are used within a type of location-based service application.
Some examples of questions that cannot be answered by the ontologies are as follows.
Table 3 provides a list of all the competency questions that can be answered by the ontology and those that the ontology cannot address.
   4.2. Results and Discussion
As can be seen in the four examples of administrative geospatial ontologies, they all capture the containment relationships between levels in their represented hierarchies. These relationships are essential for encoding the semantic structure of the administrative divisions. In all cases, spatial containment is assumed based on a semantic relationship. With the exception of the OS ontology, no explicit spatial relationships are defined. In the case of the OS ontology, the definition of the spatial relationships was not homogeneous across all hierarchies; in particular, no $touches$ relationships were defined in the postcode hierarchy. Note also that no explicit relationships were encoded across divisions in any of the ontologies (except for postcode units in the OS ontology).
As can be expected, the greater the degree of spatial completeness of the ontology, the more spatial competency questions it is able to address, as shown in 
Table A1, 
Table A2, 
Table A3, 
Table A4, 
Table A5, 
Table A6 and 
Table A7 in 
Appendix A. The OS ontologies are able to handle approximately half of the possible competency questions, while the three other ontologies address only two of the possible eight questions, a coverage score of 2/8 = 0.25. Thus, the proposed topological semantics can make a significant improvement in the semantic richness and usability of the geospatial ontologies. A summary of results for spatial completeness (SpCom) and competency questions coverage (CQCov) for the four ontologies is shown in 
Figure 5.
The measure of spatial completeness proposed here is a special type of the general measure of semantic completeness for ontologies, which measures the coverage of the concepts and relationships in the ontology, usually in comparison to a gold standard ontology that is considered to be complete. A data-driven approach to semantic completeness can also be used, where the coverage is measured against a corpus of data and the information retrieval metrics of recall and precision are used for evaluation. In our case, the administrative geography maps provide the gold standard, and the structure of the administrative divisions is used to identify the complete set of classes and possible spatial relationships between classes. Hence, the production of a gold standard for comparison is feasible and is systematically applicable to any type of administrative geography ontology. The power of this metric is that it provides a clear indication of the gaps in the semantic knowledge as well as how to address them by defining the missing spatial relationships. The uniformity of representation is beneficial because it paves the way for the development of universal tools and languages for querying and manipulating different geospatial ontologies.
Few studies have reported on the correlation between different quality metrics of ontologies, particularly whether completeness of the ontology is related to the complexity of its structure or to its readability, etc. [
30]. Here, we present the results of the evaluation of some selected metrics that are used in the literature to evaluate the structural quality of ontologies and their readability [
52,
53]. The following structuredness metrics were used: schema metrics (Attribute Richness (AR), Relationship Richness (RR), Inheritance Richness (IR)), graph metrics (Average Depth (AD), Maximal Depth (MD), Average Breadth (AB), Maximal Breadth (MB)), and Class Richness (CR) as a knowledge base metric. Readability metrics used are Class comments and labels (C.cmt, C.lbl), Object properties comments and labels (O.cmt, O.lbl) and data properties comments and labels (d.cmt, d.lbl). A brief definition of these metrics is given in 
Table A7 in 
Appendix A. OntoMetrics [
54] was used to calculate the graph metrics, schema metrics, and knowledge base metrics from the ontologies directly. The readability metrics were computed using SPARQL queries over the GraphDB database that stores the ontologies. 
Table 4 and 
Figure 6 show the values of the metrics as applied over the four ontologies.
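To make the readability metrics concrete, the sketch below counts comments and labels attached to classes, object properties and data properties in a list of triples. It is a plain-Python stand-in for the SPARQL queries mentioned above, not the queries themselves. Treating each metric as a raw count, the prefixed-name strings used for the vocabulary terms, and the `ex:` example resources are assumptions introduced here.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as prefixed names

# metric prefixes follow the abbreviations used in the text (C, O, d)
KINDS = {"owl:Class": "C", "owl:ObjectProperty": "O", "owl:DatatypeProperty": "d"}
ANNOTATIONS = {"rdfs:comment": "cmt", "rdfs:label": "lbl"}


def readability_counts(triples: Iterable[Triple]) -> Dict[str, int]:
    """Count comments and labels per kind of ontology element."""
    triples = list(triples)
    # map each typed subject to its metric prefix
    kind_of = {s: KINDS[o] for s, p, o in triples if p == "rdf:type" and o in KINDS}
    counts = Counter({f"{k}.{a}": 0 for k in KINDS.values() for a in ANNOTATIONS.values()})
    for s, p, _ in triples:
        if p in ANNOTATIONS and s in kind_of:
            counts[f"{kind_of[s]}.{ANNOTATIONS[p]}"] += 1
    return dict(counts)


# Hypothetical ontology fragment.
example_triples = [
    ("ex:District", "rdf:type", "owl:Class"),
    ("ex:District", "rdfs:label", "District"),
    ("ex:District", "rdfs:comment", "A local authority district."),
    ("ex:Ward", "rdf:type", "owl:Class"),
    ("ex:Ward", "rdfs:label", "Ward"),
    ("ex:within", "rdf:type", "owl:ObjectProperty"),
    ("ex:within", "rdfs:label", "within"),
    ("ex:hasCode", "rdf:type", "owl:DatatypeProperty"),
]
example_counts = readability_counts(example_triples)
```

For this fragment the result reports two class labels, one class comment, one object property label, and no annotations on the data property.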
  4.3. Overall Comparison of Quality
The metrics need to be normalised for comparison between different aspects of quality as some metrics are relative values (e.g., completeness) and some are absolute values (e.g., structural complexity and readability). We mapped the metrics’ values to subranges, where subrange 1 means the poorest quality of a specific metric, and subrange 5 is the highest quality, as shown in 
Table 5. 
Figure 7 shows the ontology scores after normalisation. Once the data has been normalised, the completeness, structuredness, and readability metrics can be aggregated into single values by taking the average. A combined result of all quality metrics is shown in 
Figure 8. As shown in the figure, the OS ontology is more spatially complete in comparison to the three others, which are equally spatially complete. The degree of structuredness or readability of the geospatial ontologies does not correlate with their spatial completeness; e.g., IGN geofla is superior to all others with respect to both measures, but is inferior with respect to completeness. Hence, it can be seen how this new measure of completeness provides a useful complementary metric for evaluating geospatial ontologies.
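The normalisation and aggregation steps described in this section can be sketched as a binning function followed by an average. The cut points below are hypothetical and do not reproduce the subranges of Table 5; the rule that a value equal to a cut point falls into the higher subrange, and the `higher_is_better` switch, are likewise assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import mean
from typing import Dict, List, Sequence


def to_subrange(value: float, cuts: Sequence[float], higher_is_better: bool = True) -> int:
    """Map a raw metric value to a subrange 1..5 using four ascending cut points."""
    score = bisect_right(cuts, value) + 1
    return score if higher_is_better else 6 - score


def aggregate(scores_by_aspect: Dict[str, List[int]]) -> Dict[str, float]:
    """Average the normalised scores within each quality aspect."""
    return {aspect: mean(scores) for aspect, scores in scores_by_aspect.items()}


# Hypothetical cut points for a metric that ranges over [0, 1].
cuts = [0.2, 0.4, 0.6, 0.8]
scores = {
    "completeness": [to_subrange(0.5, cuts)],
    "readability": [to_subrange(0.9, cuts), to_subrange(0.1, cuts)],
}
summary = aggregate(scores)
```

With these cut points a value of 0.5 lands in subrange 3, and the two readability scores (5 and 1) average to 3.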
  5. Conclusions
This work presents a novel measure of quality of geospatial ontologies. With a focus on administrative geography open data and their ontologies, this work identified a set of topological and proximity semantics, which can be explicitly defined (and inferred by spatial reasoning) and which capture the spatial semantics between their component regions. A uniform measure of spatial completeness is proposed and interpreted as a set of competency questions that, together, can be used to evaluate the completeness of geospatial ontologies in this domain. Four European administrative geography ontologies and datasets were analysed and evaluated. It is shown how the metrics can be homogeneously applied and computed for different ontologies with different levels in their hierarchical divisions. The proposed measures provide an objective view of spatial completeness of the ontologies and explain the gaps in their representations. The proposed spatial completeness measures are novel and significant as they can allow a homogeneous definition of ontologies across different countries and can therefore support data sharing and integration. The measures complement the established methods in the literature, which primarily focus on the syntactical and structural dimensions of the ontologies, and offer a novel approach to ontology evaluation in the geospatial domain. Research points that would be worth studying in the future include: (a) methods of encoding the proposed semantics, including spatial reasoning, in the ontologies to allow for the automatic building of ontologies and checking and maintaining their consistency, and (b) the reuse of spatial data standards and the design of frameworks that can support their integration with the proposed semantics and metrics, for the sharing of geospatial open data.