1. Introduction
The digital transformation of metrology is one of the strategic challenges identified by the International Committee for Weights and Measures (CIPM) in its 2030+ strategy [1]. In November 2022, the 27th General Conference on Weights and Measures (CGPM) adopted a resolution on “Global Digitalization and the International System of Units” [2]. The resolution indicates that the digital transformation of metrology is the consensus of the world metrology community, and that this transformation takes the International System of Units (SI) as the core of its digital framework. Digitizing units of measurement requires a machine-readable SI format for measurement data in digital communications [3].
Scientific research relies heavily on measurement, and units are central to measuring the physical world [4]. Quantitative measurements are meaningless without a clear statement of the units in which they are expressed [5], and missing or ambiguous units become even more problematic when machines analyze data at scale [6]. So that units of measurement can continue to serve society in an increasingly digital world, the CIPM established the CIPM Task Group on the Digital SI (TG-DSI) to study the digitization of units of measurement [7].
An ontology is a formal knowledge representation that describes the concepts and relationships in a domain. Because it supports knowledge sharing, semantic interpretation, and extensibility, it has become a common way to digitize the vocabulary of units [8,9]. The importance of unit ontologies has also been recognized by the W3C Semantic Web Best Practices and Deployment (SWBPD) Working Group [10].
Several ontologies exist in the unit domain [11,12,13]. Quantities, Units, Dimensions, and Data Types Ontologies (QUDT) [14] was developed under the NASA Exploration Initiatives Ontology Models (NEXIOM) project. Quantities, Units, Dimensions, Values (QUDV) [15], a joint collaboration between the SysML 1.2 Revision Task Force and the OMG MARTE Specification Group, aims to define systems of units for use in system models. The Ontology of units of Measure (OM), created by Hajo Rijgersberg and others [16], was developed in the context of food research and focuses on units, quantities, measures, and dimensions.
Existing unit ontologies have three shortcomings for the field of Chinese units of measurement. First, most of them, such as QUDT, QUDV, and OM [17], are presented in English, which can make their content difficult for Chinese users to understand and use, and they cannot meet the requirements of the Chinese measurement field for measurement traceability. Second, the expressions for units in the Chinese metrology vocabulary differ in part from those in ontologies developed abroad, which hinders unit conversion and calculation. Third, most of the institutions that develop unit ontologies are not metrology institutions, which may lead to irregular and inaccurate unit descriptions. To solve these problems, it is necessary to construct a unit vocabulary ontology that conforms to the Chinese metrology domain and remains consistent and interoperable with international unit standards, providing a unified framework for standardized metrology. The main contributions of this paper are the following:
- (1) To address standardization and contextual differences in ontology descriptions of measurement unit vocabulary, this paper constructs an ontology for the measurement unit vocabulary, named “vim”, based on Chinese national standards, metrology technical specifications, and international standards such as the International System of Units and its applications [18] and the SI Brochure published by the International Bureau of Weights and Measures (BIPM) [19].
- (2) By adopting the provisions on quantities and units in the International vocabulary of metrology – Basic and general concepts and associated terms (VIM) [20] and in the chapter on quantities and units of JJF 1001-2011 [21], this paper provides a unified unit standard and reference framework for digitization systems, which facilitates the digital expression and sharing of units and provides a basis and reference for digital applications of units.
The remainder of this paper is organized as follows. Section 2 details the construction of the ontology. Section 3 verifies the ontology’s syntactic correctness, logical consistency, and knowledge inference capabilities, using the OntOlogy Pitfall Scanner! (OOPS!) [22], the RDFLib library, Semantic Web Rule Language (SWRL) rules [23], the Pellet reasoner within Protégé [24], and SPARQL Protocol and RDF Query Language (SPARQL) [25] queries. Section 4 discusses the practical application of the ontology to publicly available datasets. Section 5 compares vim with OM and QUDT, demonstrating the ontology’s advantages in terms of ontology model selection, application domains, and multilingual support. Finally, Section 6 presents conclusions and future work.
5. Comparing vim with OM and QUDT
In this section, we compare vim with OM v2.0 and QUDT v2.1.24, focusing on three main areas: choice of ontology construction model, application areas, and language support.
5.1. Choice of Ontology Construction Model
In QUDT and OM, as in vim, units are instances of a Unit class. The three ontologies differ, however, in how they categorize unit classes. QUDT categorizes units into Derived Units and Dimensionless Units, while OM focuses on describing Unit Multiple or Submultiple subclasses; e.g., “nanometer” is a unit of length that is explicitly related to the prefix “nano” and the unit “meter”. In contrast, vim classifies units from the perspective of SI units, which are generic, standardized units suitable for a wide range of applications, and non-SI units, which are better suited to specific domains or practical applications. Individual countries may also have customary units, but it is common practice to convert these to SI units for data exchange and sharing. All three ontologies relate units to quantities through object properties such as “hasQuantity”. Notably, OM and QUDT do not include certain traditional Chinese units such as “zhang” and “jin”, which may inconvenience Chinese users.
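The prefix-based description attributed to OM above, together with the SI/non-SI distinction used by vim, can be illustrated with a minimal Python sketch. This is not the API of OM or vim: the class names `Unit` and `PrefixedUnit` and the `is_si` flag are hypothetical stand-ins for the corresponding ontology classes, and only a small excerpt of the SI prefixes is listed.

```python
from dataclasses import dataclass
from fractions import Fraction

# A small excerpt of SI prefix factors (the values are fixed by the SI).
SI_PREFIXES = {
    "nano": Fraction(1, 10**9),
    "milli": Fraction(1, 1000),
    "kilo": Fraction(1000),
}


@dataclass(frozen=True)
class Unit:
    name: str
    is_si: bool  # hypothetical flag standing in for an SI / non-SI unit class


@dataclass(frozen=True)
class PrefixedUnit:
    """A multiple or submultiple described by its prefix and its base unit."""
    prefix: str
    base: Unit

    @property
    def name(self) -> str:
        return self.prefix + self.base.name

    @property
    def factor_to_base(self) -> Fraction:
        # The factor is derived from the prefix rather than stored per unit.
        return SI_PREFIXES[self.prefix]


metre = Unit("metre", is_si=True)
zhang = Unit("zhang", is_si=False)  # traditional Chinese unit, non-SI
nanometre = PrefixedUnit("nano", metre)

print(nanometre.name, nanometre.factor_to_base)  # nanometre 1/1000000000
```

Relating a multiple to its prefix and base unit means the scale factor is stated once per prefix, whereas a customary unit such as “zhang” still needs its own explicit conversion factor to an SI unit.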
In handling quantities and kinds of quantities, OM treats “quantity” as a class and provides common units for each quantity. QUDT, on the other hand, treats “kind of quantity” as a class and “quantity” as an instance of “kind of quantity”. To express the correspondence between a quantity and its kind of quantity more clearly, vim treats both “quantity” and “kind of quantity” as classes and associates them through object properties such as “vim:hasKindOfQuantity”. For example, “Depth” corresponds to the kind of quantity “Length”. In addition, vim explicitly defines the applicable units for each quantity or kind of quantity.
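As a rough illustration of this modeling choice, the sketch below stores a few triples as plain Python tuples and resolves the units applicable to “Depth” through its kind of quantity. Only `vim:hasKindOfQuantity` and the Depth/Length example come from the text; the identifiers `vim:Quantity`, `vim:KindOfQuantity`, `vim:hasUnit`, and `vim:metre` are assumptions made for the example, as is the choice to model “Depth” and “Length” as individuals of the two classes.

```python
# Triples as (subject, predicate, object) tuples; a stand-in for an RDF graph.
TRIPLES = {
    ("vim:Depth", "rdf:type", "vim:Quantity"),
    ("vim:Length", "rdf:type", "vim:KindOfQuantity"),
    ("vim:Depth", "vim:hasKindOfQuantity", "vim:Length"),
    ("vim:Length", "vim:hasUnit", "vim:metre"),  # hypothetical property name
}


def objects(subject, predicate):
    """Return all objects of triples matching (subject, predicate, ?)."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def applicable_units(quantity):
    """Units stated on the quantity itself plus those of its kind(s) of quantity."""
    units = set(objects(quantity, "vim:hasUnit"))
    for kind in objects(quantity, "vim:hasKindOfQuantity"):
        units.update(objects(kind, "vim:hasUnit"))
    return sorted(units)


print(applicable_units("vim:Depth"))  # ['vim:metre']
```

Keeping the link from quantity to kind of quantity explicit lets units be declared once on the kind of quantity and reused by every quantity of that kind.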
In OM, most descriptions of classes, properties, and individuals are limited to rdfs:label and lack detailed definitions; e.g., the unit “metre” provides only “metre” as a label. QUDT offers more comprehensive descriptions, including rdfs:label (@en), rdfs:comment (@en), dcterms:description (@en), qudt:ucumCode, and other information. For the same unit “metre”, QUDT provides the label and also specifies its symbol as “m” and its ucumCode as “m”. In comparison, vim goes further: each concept includes vim:alsoKnownAs (@en/@zh), skos:definition (@en/@zh), skos:note (@en/@zh), and so on, together with information from official documents such as the SI Brochure, 9th Edition. In addition, vim includes information from Chinese official documents, such as the Chinese national standard GB/T 17295-2008, in which the standard code for the recommended unit “inch” is “INH”. These additions make the concepts in vim more accurate and richer, and give users more comprehensive information.
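The effect of language-tagged annotations (@en/@zh) can be sketched with a small lookup that falls back to English when the requested language is missing. The data layout, the `literal` helper, and the use of `rdfs:label` as the annotated property are assumptions for illustration; the paper does not state which property carries vim's labels.

```python
# Language-tagged literals keyed by (subject, property); illustrative data only.
ANNOTATIONS = {
    ("vim:metre", "rdfs:label"): {"en": "metre", "zh": "米"},
    ("vim:inch", "rdfs:label"): {"en": "inch", "zh": "英寸"},
}


def literal(subject, prop, lang, fallback="en"):
    """Return the literal in `lang`, else in `fallback`, else None."""
    values = ANNOTATIONS.get((subject, prop), {})
    if lang in values:
        return values[lang]
    return values.get(fallback)


print(literal("vim:metre", "rdfs:label", "zh"))  # 米
print(literal("vim:metre", "rdfs:label", "fr"))  # metre (falls back to English)
```

With both language tags present on every annotation, a Chinese-language client never has to fall back, which is the practical benefit the comparison above attributes to vim.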
5.2. Application Areas
OM constructs classes of application areas, including subclasses such as typography, shipping, information technology, and food engineering, and describes the units and quantities that can be used in each area. QUDT, unlike OM, does not specify which units or quantities can be used in a domain. In vim, the annotation property “vim:domain” indicates the domains to which each unit or quantity applies. Furthermore, vim divides application areas by discipline, which, unlike OM’s division, is better suited to the needs of specific disciplines; e.g., the unit “second” is applicable to the domains of acoustics, atomic and nuclear reactions, ionizing, solid state physics, and space and time.
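A domain annotation of this kind supports simple filtering by discipline. The sketch below assumes the values of “vim:domain” are available as a set of strings per unit; the entry for “second” copies the domains listed above, while the entry for `vim:metre` and the helper `units_in_domain` are illustrative and not taken from the paper.

```python
# Values of the "vim:domain" annotation per unit (illustrative data layout).
DOMAINS = {
    "vim:second": {
        "acoustics",
        "atomic and nuclear reactions",
        "ionizing",
        "solid state physics",
        "space and time",
    },
    "vim:metre": {"space and time"},  # illustrative entry, not from the paper
}


def units_in_domain(domain):
    """Return the units whose domain annotation includes `domain`."""
    return sorted(unit for unit, domains in DOMAINS.items() if domain in domains)


print(units_in_domain("acoustics"))       # ['vim:second']
print(units_in_domain("space and time"))  # ['vim:metre', 'vim:second']
```

Because the domain is an annotation on the unit rather than a separate class hierarchy, one unit can belong to several disciplines without being duplicated.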
5.3. Language Support
In contrast to the OM and QUDT ontologies, which are available only in English, vim offers both English and Chinese descriptions, which makes the ontology easier to use for users in the Chinese market who are not proficient in English.
6. Conclusions and Future Work
In this paper, the complex problem of defining classes, class hierarchies, and properties is addressed with an ontology construction method based on the Seven Steps to Ontology Development. A bilingual ontology of the unit of measure vocabulary (vim) was built in OWL and RDF using the ontology construction tool Protégé. The ontology includes 45 classes, 53 object properties, and 71 data properties. It describes units including, but not limited to, Chinese-specific units of measurement, and it can effectively express units under different systems and the conversions between them. Moreover, verification with the ontology pitfall detection tool OOPS!, the RDFLib library, SWRL inference rules, and SPARQL queries shows that vim is syntactically correct and logically consistent.
To assess the quality of the ontology, we conducted two sets of experiments. The first is unit conversion, which applies vim to a height and weight dataset containing 25,000 data items and to the conversion of “zhang” to “metre”. The results show that vim successfully converts the units of the dataset and correctly answers the calculation 2 “zhang” = 6.666666 “metres”. The second is semantic annotation, using publicly available temperature data as an example; the results show that the vim ontology can annotate the temperature data and support semantic queries according to users’ needs. The experimental results have been certified by the National Institute of Metrology (NIM), China. The corresponding certificate number is SJsj2023-00012, which is also provided by the National Metrology Data Center (NMDC), China. For information on how to access the vim ontology constructed in this paper and the experimental code for each experiment, please refer to Appendix B.
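The “zhang” to “metre” calculation can be reproduced with a short Python sketch that converts through SI factors. This is not the experimental code of Appendix B: the table `TO_SI` and the function `convert` are hypothetical, the factor 1 zhang = 10/3 m is assumed from the market-system definition (it is consistent with the result reported above), and the inch, jin, and pound entries are added only for illustration.

```python
from fractions import Fraction

# Each unit maps to (kind of quantity, exact factor to the SI unit of that kind).
TO_SI = {
    "metre": ("length", Fraction(1)),
    "zhang": ("length", Fraction(10, 3)),      # assumed: 1 zhang = 10 chi = 10/3 m
    "inch": ("length", Fraction(254, 10000)),  # 1 in = 0.0254 m exactly
    "kilogram": ("mass", Fraction(1)),
    "jin": ("mass", Fraction(1, 2)),           # assumed: 1 (market) jin = 0.5 kg
    "pound": ("mass", Fraction(45359237, 100000000)),  # avoirdupois pound
}


def convert(value, source, target):
    """Convert `value` from `source` to `target` via the shared SI unit."""
    src_kind, src_factor = TO_SI[source]
    dst_kind, dst_factor = TO_SI[target]
    if src_kind != dst_kind:
        raise ValueError(
            f"cannot convert {source} ({src_kind}) to {target} ({dst_kind})"
        )
    return Fraction(value) * src_factor / dst_factor


result = convert(2, "zhang", "metre")
print(result, f"{float(result):.6f}")  # 20/3 6.666667
```

The exact value is 20/3 m; rounding to six decimals gives 6.666667, while truncating gives the 6.666666 quoted above. Checking that both units share a kind of quantity before converting mirrors the role the unit-to-quantity links play in the ontology.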
In addition, vim was compared with OM and QUDT in three respects: choice of ontology construction model, application domain, and language support. The comparison shows that vim has several advantages, including more comprehensive descriptions, a discipline-oriented categorization of units and quantities, and bilingual support, which improve its usability and accuracy for a wider user base.
In summary, vim provides a unified architecture for the representation of concepts in the vocabulary of measurement units that are central to the scientific community and engineering applications, facilitates the exchange and sharing of measurement test data, and addresses ontological contextual differences.
The construction of an ontology is a dynamic and iterative process: the ontology must be continuously updated and improved based on domain knowledge and application requirements so that it can better support tasks such as domain modeling, knowledge sharing, and application reasoning. This work has several limitations. First, we extracted knowledge manually, which is time-consuming and labor-intensive. Second, although vim can describe the relationships between units, quantities, systems of units, and so on, it needs to be extended to cover metrology-related content more comprehensively, e.g., descriptions of measurement aspects and experimental processes. Such descriptions will be crucial for handling experimental processes in a machine-readable, understandable, and automatable way and for achieving metrological traceability. Third, multilingual support and integration with other ontologies need further improvement. Finally, our descriptions of the ontology vocabulary are taken from official documents and do not relate everyday customary (unofficial) usage to the official descriptions.
Looking ahead, the development of vim should focus on solving these problems. First, methods such as Natural Language Processing (NLP) can be used to automatically extract the relationships between vocabulary terms and improve the efficiency of ontology construction. Second, the integration of domain expertise should be strengthened to cover more of the field of metrology, for example by describing measurement uncertainty to improve the assessment of the credibility of measurement results. Third, detailed descriptions of the experimental process, including steps, conditions, and instrument settings, should be supported to facilitate the automation and traceability of experiments. Fourth, multilingual and cross-cultural needs should be considered to improve the internationalization and applicability of vim. Fifth, integration with other related ontologies is needed to extend the application scope and interoperability of vim. Finally, the X.Y.Z versioning model [45] should be used to record each version of the vim ontology, and each version should comply with the FAIR principles [46].
Overall, the future development of vim aims to enhance its expressiveness, broaden its application domain, improve multilingual support, and promote integration with other ontologies, so that it better meets the changing needs of metrology and science and thereby promotes the development of metrology in scientific research and engineering applications.