Proceeding Paper

Building a Global Aquatic Resource Knowledge Base for Fisheries †

1 Institute of Computer Science, Foundation of Research and Technology-Hellas, 71110 Heraklion, Greece
2 Computer Science Department, University of Crete, 70013 Heraklion, Greece
3 Food and Agriculture Organization of the United Nations, 00153 Rome, Italy
* Author to whom correspondence should be addressed.
† Presented at the 11th International Conference on Information and Communication Technologies in Agriculture, Food & Environment, Samos, Greece, 17–20 October 2024.
Proceedings 2025, 117(1), 4; https://doi.org/10.3390/proceedings2025117004
Published: 17 April 2025

Abstract

Fisheries management is a complex task that aims to ensure the long-term sustainability of fish populations and the ecosystems they depend on. To achieve these goals, fisheries must be described with precise and unambiguous information. Different agencies report fisheries data using different vocabularies or thesauri. For aquatic species alone, for example, several official and widely used data sources are available. As a result, the same resource is described with different identifiers or names. In this paper, we describe the construction of a global aquatic resource knowledge base, which is the result of the integration of different data sources using semantic web technologies. By focusing on aquatic species, we show that the information provided by different data sources is complementary, and we provide a unified way of accessing it. Finally, we describe how the same process was adopted for other information domains as well.

1. Introduction

Fishing is one of the most significant drivers of declines in ocean wildlife populations. Catching fish is not inherently bad for the ocean; the problem arises when catches are higher than the fish stocks can replenish, which is called overfishing. The number of overfished fish stocks globally has tripled in half a century. For example, according to [1], one-third of the global fish stocks were overfished in 2017. Fisheries management [2] plays a crucial role in preventing overfishing by regulating the harvesting and utilization of fish stocks to ensure their sustainable exploitation while preserving the marine ecosystem. It involves a range of activities aimed at maintaining the balance between the extraction of fishery resources and the conservation of aquatic ecosystems. One of the key components of efficient fisheries management is stock assessment, which involves monitoring fish populations to determine their abundance, distribution, and health. It is, therefore, crucial that the description of relevant information (e.g., species) is accurate and complete.
There are many different ways of referring to a species; most people use common names. The scientific community avoids these, because a species usually has many different common names in various languages. Instead, scientists use the Linnaean binomial nomenclature, the so-called scientific name of a species, which consists of two parts: the genus name first and then the specific epithet (e.g., Mullus barbatus). In addition, alphanumeric identifiers can be used for the same purpose. Such identifiers are provided by different data sources or registries and are widely used, particularly for data exchange and interoperability mechanisms. Some indicative ones are the FAO 3-Alpha Species Code (FAO ASFIS List of Species for Fishery Statistics Purposes (https://www.fao.org/fishery/en/collection/asfis/en, accessed on: 17 March 2025)) (e.g., MUT), the AphiaID from WoRMS (World Register of Marine Species (https://www.marinespecies.org/, accessed on: 17 March 2025)) (e.g., 126985), the FishBase ID (FishBase (https://fishbase.org/, accessed on: 17 March 2025)) (e.g., 790), and the TSN code from ITIS (Integrated Taxonomic Information System (https://www.itis.gov/, accessed on: 17 March 2025)) (e.g., 169419). The problem is that no common guideline has been adopted by the fisheries management authorities and other institutions around the world. In practice, all of these identifiers are in use today, and analyzing fishery reports produced by different authorities can become rather cumbersome.
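To make the identifier problem concrete, the following minimal Python sketch groups the identifiers of one species in a single record and indexes them so that any one code leads to the others. It assumes that the example codes quoted above all refer to Mullus barbatus; the class name, field names, and scheme labels are illustrative choices, not part of any of the registries.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class SpeciesIdentifiers:
    """Identifiers that different registries assign to one species."""
    scientific_name: str
    fao_3alpha: Optional[str] = None
    aphia_id: Optional[str] = None
    fishbase_id: Optional[str] = None
    itis_tsn: Optional[str] = None

    def pairs(self) -> Dict[str, str]:
        """Return the identifiers that are present, keyed by scheme label."""
        candidates = {
            "FAO_3ALPHA": self.fao_3alpha,
            "APHIA_ID": self.aphia_id,
            "FISHBASE_ID": self.fishbase_id,
            "ITIS_TSN": self.itis_tsn,
        }
        return {scheme: code for scheme, code in candidates.items() if code is not None}


def build_index(
    records: Iterable[SpeciesIdentifiers],
) -> Dict[Tuple[str, str], SpeciesIdentifiers]:
    """Map every (scheme, code) pair to the species record it belongs to."""
    index: Dict[Tuple[str, str], SpeciesIdentifiers] = {}
    for record in records:
        for scheme, code in record.pairs().items():
            index[(scheme, code)] = record
    return index


# Example values taken from the text above (assumed to describe one species).
red_mullet = SpeciesIdentifiers("Mullus barbatus", "MUT", "126985", "790", "169419")
index = build_index([red_mullet])

# Looking up the FAO code yields the same record as looking up the ITIS TSN.
same_record = index[("FAO_3ALPHA", "MUT")] is index[("ITIS_TSN", "169419")]
```

With such an index, a report that uses only FAO codes and another that uses only TSN codes can be related to the same species record.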
To alleviate this, in this paper, we have implemented a process that relies on the semantic web [3] by collecting information from different data sources and constructing a single knowledge base with key species taxonomic information. By using an appropriate ontology, as the conceptual model, we managed to semantically integrate information coming from different data sources and describe them in a homogeneous manner. This process managed to interconnect the identifiers of the same species, and exactly because of this, to support the provision of complementary information. This is something that would not be possible at all without semantically integrating them.

2. Semantic Data Integration

Semantic data integration involves the harmonization of heterogeneous data sources by understanding the underlying semantics, relationships, and meanings within the data. It goes beyond syntactic matching to interpret the semantics of data elements, resolving the semantic heterogeneity that arises from differences in terminologies and concepts across sources. This process employs ontologies [4], vocabularies, and definitions of schema mappings to establish common semantic interpretations across disparate datasets.
The key element is the definition of a proper ontology, able to capture clearly the semantics of the knowledge base of aquatic species. In the literature, there are several ontologies for the marine domain, such as MarineTLO [5], SWEET [6], BioTop [7], Uberon [8], etc. In this work, we have adopted MarineTLO, since it contains all the necessary information we wanted to model and there is no need to extend it with new concepts, attributes, or relations.
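As a rough illustration of what a schema mapping does, the sketch below renames source-specific fields to a shared set of property names, so that records from different sources become directly comparable. The source field names (`code_3alpha`, `aphia_id`, `sci_name`) and the scheme labels are hypothetical and do not reproduce the actual export formats or the MarineTLO vocabulary.

```python
from typing import Dict

# Hypothetical source field names (left) mapped to shared property names (right).
SCHEMA_MAPPINGS: Dict[str, Dict[str, str]] = {
    "asfis": {"code_3alpha": "identifier", "scientific_name": "scientific_name"},
    "worms": {"aphia_id": "identifier", "sci_name": "scientific_name"},
}

# Identifier scheme contributed by each source (labels chosen for this sketch).
ID_SCHEMES: Dict[str, str] = {"asfis": "FAO_3ALPHA", "worms": "APHIA_ID"}


def to_common(source: str, record: Dict[str, object]) -> Dict[str, str]:
    """Rename mapped fields to the shared names and drop unmapped fields."""
    mapping = SCHEMA_MAPPINGS[source]
    common = {
        mapping[field]: str(value).strip()
        for field, value in record.items()
        if field in mapping
    }
    common["id_scheme"] = ID_SCHEMES[source]
    return common


asfis_row = {"code_3alpha": "MUT", "scientific_name": "Mullus barbatus", "extra": "ignored"}
worms_row = {"aphia_id": 126985, "sci_name": " Mullus barbatus "}

asfis_common = to_common("asfis", asfis_row)
worms_common = to_common("worms", worms_row)
```

After the mapping, both records expose the same `scientific_name` property, which is what allows them to be compared and connected. In a semantic setting, the shared names would be properties of the adopted ontology rather than plain dictionary keys.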

3. Marine Species Data Sources

For the construction of the knowledge base, we have used the following well-known and actively used data sources:
  • FAO ASFIS List of Species for Fishery Statistics Purposes provides a code list of marine species with several identifiers, such as 3-alpha code, taxonomic code, and ISSCAAP code (International Standard Statistical Classification of Aquatic Animals and Plants (ISSCAAP) (https://www.fao.org/fishery/en/collection/asfis, accessed on: 17 March 2025)). The 3-alpha codes are made of three characters that uniquely identify the species, complemented with scientific names, taxonomic details, and common names in English, French, Spanish, Russian, Arabic, and Chinese when available. The taxonomic codes and ISSCAAP codes are other codes that are used for classificatory purposes.
  • WoRMS is an authoritative database that provides a comprehensive and up-to-date inventory of all known species globally. The main identifier in WoRMS is AphiaID which is a numeric code. Moreover, it contains taxonomic information, synonyms, distribution maps, and bibliographic references for each species listed in the database.
  • FishBase is a global biodiversity information system on fishes that provides detailed information about species regarding taxonomy, morphology, ecology, distribution, behavior, and fisheries-related data. FishBase is maintained and continuously updated by an international consortium of scientists, with support from various organizations.
  • ITIS is an authoritative database that provides taxonomic information including the Taxonomic Serial Number (TSN), scientific names, and taxonomic hierarchies. The database is reviewed and updated periodically to ensure high quality with valid classifications, revisions, and additions on newly described species.

4. Aquatic Species Knowledge Base Construction Process

Figure 1 shows how information was collected from the different data sources. More specifically, we used the code lists in CSV format for the FAO ASFIS List of Species for Fishery Statistics Purposes (https://www.fao.org/fishery/en/collection/asfis, accessed on: 17 March 2025), a RESTful API for WoRMS (https://www.marinespecies.org/rest/, accessed on: 17 March 2025) and FishBase (https://www.fishbase.ca/api/readme.txt, accessed on: 17 March 2025), and the database dump in MySQL format from ITIS (https://www.itis.gov/downloads/index.html, accessed on: 17 March 2025). As illustrated in the figure, the resources collected from each source came in different formats, so a syntax normalization phase was necessary before transforming them into instances of the top-level ontology MarineTLO. We integrated the contents of the data sources into the knowledge base gradually, starting with FAO ASFIS, followed by WoRMS, then FishBase, and finally ITIS. For each pair of data sources, we used different parts as the “glue” to connect species information. More specifically, we used the binomial of a species between FAO ASFIS and WoRMS, while for the remaining pairs (i.e., WoRMS and FishBase, and FishBase and ITIS), we used their codes and their scientific names. Note that from ITIS we collected only the marine species that had already been added to the knowledge base up to that point. It is also worth noting that we collected all available external identifiers for species from WoRMS using the corresponding API method (i.e., AphiaExternalIDByAphiaID). These identifiers refer to IDs of the species in different data sources (such as GBIF (Global Biodiversity Information Facility (https://www.gbif.org/, accessed on: 17 March 2025)), GISD (Global Invasive Species Database (https://www.iucngisd.org/gisd/, accessed on: 17 March 2025)), IUCN Red List (https://www.iucnredlist.org/, accessed on: 17 March 2025), etc.), paving the way for including more data sources in the future.
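The linking step between two sources can be sketched as a join on the normalized binomial. The Python sketch below is only an illustration of that idea under simplifying assumptions: records are plain dictionaries with hypothetical keys (`scientific_name`, `ids`), only the first two tokens of a name are kept (so authorship suffixes and subspecies epithets are ignored), and the second ASFIS record is a fictitious placeholder used to show an unmatched case.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def normalize_binomial(name: str) -> str:
    """Collapse whitespace and standardize case as 'Genus epithet'."""
    parts = name.split()
    if len(parts) < 2:
        raise ValueError(f"not a binomial: {name!r}")
    return f"{parts[0].capitalize()} {parts[1].lower()}"


def link_by_binomial(
    left: List[Record], right: List[Record]
) -> Tuple[List[Record], List[Record]]:
    """Join two record lists on the normalized binomial.

    Returns the merged records and the left-hand records with no match.
    """
    right_by_name = {normalize_binomial(r["scientific_name"]): r for r in right}
    linked: List[Record] = []
    unmatched: List[Record] = []
    for record in left:
        key = normalize_binomial(record["scientific_name"])
        match = right_by_name.get(key)
        if match is None:
            unmatched.append(record)
        else:
            merged_ids = {**record["ids"], **match["ids"]}
            linked.append({"scientific_name": key, "ids": merged_ids})
    return linked, unmatched


asfis = [
    {"scientific_name": "Mullus barbatus", "ids": {"FAO_3ALPHA": "MUT"}},
    # Fictitious placeholder entry, present only to show the unmatched path.
    {"scientific_name": "Placeholderus fictus", "ids": {"FAO_3ALPHA": "XXX"}},
]
worms = [
    {"scientific_name": "mullus   BARBATUS", "ids": {"APHIA_ID": "126985"}},
]

linked, unmatched = link_by_binomial(asfis, worms)
```

Records that end up in `unmatched` correspond to the kind of name mismatches that need a separate resolution step, for example through synonym lists.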
During the construction of the knowledge base about species, we encountered several mismatches regarding the scientific names of species. To resolve those issues, we used FishBase as the source of truth, since it contains a list of synonyms for each species and indicates whether each synonym is valid or not. For the few cases in which the issue could not be resolved that way, the corresponding data source owners were informed. After communicating with them, some of those issues were resolved. Figure 2 shows diagrammatically how the information is stored in the knowledge base by relying on the proper classes of MarineTLO; for example, IDs are instances of the class MarineTLO:BC32_Identifier (http://www.ics.forth.gr/isl/ontology/MarineTLO/BC32_Identifier, accessed on: 17 March 2025). In this example, we report all the information that refers to the species with the binomial “Mullus barbatus”; the left group of information reports the different identifiers, the one in the middle reports some of the names, and the right one reports the taxonomic information of the species. Note that all the species-related information is presented in a uniform manner, although it has been collected from different data sources. Figure 3 is derived from a web application that allows searching the knowledge base and illustrates the available identifiers for marine species belonging to the genus “Mullus”.
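To give a rough idea of how an identifier could be expressed as an instance of MarineTLO:BC32_Identifier, the sketch below emits three statements in N-Triples syntax. Only the class URI comes from the text above. The `http://example.org/kb/` namespace, the `hasIdentifier` property, and the use of `rdfs:label` for the code value are assumptions made for illustration; the actual properties used in the knowledge base may differ.

```python
from typing import List

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BC32_IDENTIFIER = "http://www.ics.forth.gr/isl/ontology/MarineTLO/BC32_Identifier"

# Hypothetical namespace and property, used only in this sketch.
EX = "http://example.org/kb/"
HAS_IDENTIFIER = EX + "hasIdentifier"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text: str) -> str:
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def identifier_triples(species_slug: str, scheme: str, code: str) -> List[str]:
    """Return N-Triples lines that attach one identifier to one species."""
    species = EX + "species/" + species_slug
    identifier = EX + "identifier/" + scheme + "/" + code
    return [
        f"{iri(species)} {iri(HAS_IDENTIFIER)} {iri(identifier)} .",
        f"{iri(identifier)} {iri(RDF_TYPE)} {iri(BC32_IDENTIFIER)} .",
        f"{iri(identifier)} {iri(RDFS_LABEL)} {literal(code)} .",
    ]


lines = identifier_triples("mullus_barbatus", "fao3alpha", "MUT")
```

Because every identifier, whatever its source, is typed with the same class, identifiers from different registries can be retrieved and compared in a uniform way.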

5. Aquatic Resource Knowledge Base

Of course, fisheries management is not just about aquatic species. It includes more essential information, like fishing assessment or management of water areas and fishing gear. So, we applied the same methodology for constructing knowledge bases about those entities as well. With regard to areas, we mainly used FAO major fishing areas for statistical purposes (https://www.fao.org/fishery/en/area/search, accessed on: 17 March 2025), as well as Large Marine Ecosystems (LME) [9], Marine Regions (https://www.marineregions.org/, accessed on: 17 March 2025), GFCM geographical subareas (GSA) (https://www.fao.org/gfcm/, accessed on: 17 March 2025), and several national jurisdiction areas and ISO 3166 codes [10]. For fishing gear, we relied on the different versions of the FAO International Standard Statistical Classification of Fishing Gear (ISSCFG) standard [11].
Overall, we have implemented a process that constructs a knowledge base of aquatic resources from publicly available, well-known data sources, focusing on aquatic species, water areas, and fishing gear. The knowledge base can be browsed either through a SPARQL endpoint (https://isl.ics.forth.gr/grsf/sparql, accessed on: 17 March 2025) or through a dedicated web application (https://isl.ics.forth.gr/grsf/grsf-ir, accessed on: 17 March 2025) that supports searching as well (an indicative screenshot is shown in Figure 3). In total, it contains information about 40,564 marine species, 3316 water areas, and 88 fishing gear resources.
The aquatic resource knowledge base is actively used for monitoring the status of fisheries. In particular, it is used by the Global Record of Stocks and Fisheries [12] for harmonizing and updating, if necessary, the information on the stocks and fisheries it collects.
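As a hedged illustration of how the SPARQL endpoint might be queried, the sketch below builds a query that lists instances of MarineTLO:BC32_Identifier and encodes it as a GET request URL. It only constructs the request and does not contact the server. Two points are assumptions: that the endpoint accepts the standard `query` parameter of the SPARQL protocol, and that identifiers carry an `rdfs:label`, which is why the label is requested as optional.

```python
from urllib.parse import urlencode

ENDPOINT = "https://isl.ics.forth.gr/grsf/sparql"
BC32_IDENTIFIER = "http://www.ics.forth.gr/isl/ontology/MarineTLO/BC32_Identifier"


def build_query(limit: int = 10) -> str:
    """Build a SPARQL query listing identifier instances and optional labels."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?identifier ?label WHERE {\n"
        f"  ?identifier a <{BC32_IDENTIFIER}> .\n"
        # Assumption: labels, if any, are attached with rdfs:label.
        "  OPTIONAL { ?identifier rdfs:label ?label }\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(query: str) -> str:
    """Encode the query as a GET request URL for the endpoint."""
    return ENDPOINT + "?" + urlencode({"query": query})


query = build_query(limit=5)
url = build_request_url(query)
```

The resulting URL could then be fetched with any HTTP client, typically with an `Accept` header requesting a SPARQL results format such as JSON.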

6. Conclusions

In this paper, we describe the process for the construction of a comprehensive knowledge base for aquatic resource identifiers to support the complex process of fisheries management. Although we focused on semantically integrating marine species, water areas, and fishing gear, our methodology and tools can be applied to other entities or resources as well (e.g., identification of countries, fishery management authorities, etc.). The knowledge base can also be used as a reference for harmonizing fishery management resources, and we report its current usage for this purpose. Potential extensions include the addition of more data sources and entities, as well as the automation of the construction process so that the knowledge base remains in sync with the contents of the data sources that are used.

Author Contributions

Conceptualization, Y.M., Y.T., A.G., A.E. and M.T.; methodology, Y.M. and A.G.; software, Y.M.; validation, Y.T., A.G., A.E. and M.T.; formal analysis, Y.M., Y.T. and A.G.; investigation, Y.M.; resources, Y.M., A.G. and A.E.; data curation, Y.M. and A.G.; writing—original draft preparation, Y.M.; writing—review and editing, Y.M., Y.T., A.G., A.E. and M.T.; visualization, Y.M., Y.T. and A.G.; supervision, Y.T. and M.T.; project administration, Y.T. and M.T.; funding acquisition, Y.M., Y.T., A.G., A.E. and M.T. All authors have read and agreed to the published version of the manuscript.

Funding

This work has received funding from the Horizon Europe projects Blue-Cloud 2026 (Grant Agreement No: 101094227) and VeriFish (Grant Agreement No: 101156426).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data will be shared on request to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
FAO: Food and Agriculture Organization
ASFIS: Aquatic Sciences and Fisheries Information System
TSN: Taxonomic Serial Number
ISSCAAP: International Standard Statistical Classification of Aquatic Animals and Plants
LME: Large Marine Ecosystems
GSA: Geographical SubAreas
ISSCFG: International Standard Statistical Classification of Fishing Gear
WoRMS: World Register of Marine Species
ITIS: Integrated Taxonomic Information System

References

  1. Ritchie, H.; Roser, M. Fish and Overfishing; Our World in Data: 2021 (updated on March 2024). Available online: https://ourworldindata.org/fish-and-overfishing (accessed on 17 April 2025).
  2. Pikitch, E.K.; Santora, C.; Babcock, E.A.; Bakun, A.; Bonfil, R.; Conover, D.O.; Dayton, P.; Doukakis, P.; Fluharty, D.; Heneman, B.; et al. Ecosystem-based fishery management. Science 2004, 305, 346–347.
  3. Antoniou, G.; Van Harmelen, F. A Semantic Web Primer; MIT Press: Cambridge, MA, USA, 2004.
  4. De Giacomo, G.; Lembo, D.; Lenzerini, M.; Poggi, A.; Rosati, R. Using ontologies for semantic data integration. In A Comprehensive Guide Through the Italian Database Research over the Last 25 Years; Springer: Cham, Switzerland, 2018; pp. 187–202.
  5. Tzitzikas, Y.; Allocca, C.; Bekiari, C.; Marketakis, Y.; Fafalios, P.; Doerr, M.; Minadakis, N.; Patkos, T.; Candela, L. Integrating heterogeneous and distributed information about marine species through a top level ontology. In Research Conference on Metadata and Semantic Research; Springer International Publishing: Cham, Switzerland, 2013; pp. 289–301.
  6. DiGiuseppe, N.; Pouchard, L.C.; Noy, N.F. SWEET ontology coverage for earth system sciences. Earth Sci. Inform. 2014, 7, 249–264.
  7. Schulz, S.; Stenzhorn, H.; Boeker, M. The ontology of biological taxa. Bioinformatics 2008, 24, i313–i321.
  8. Mungall, C.J.; Torniai, C.; Gkoutos, G.V.; Lewis, S.E.; Haendel, M.A. Uberon, an integrative multi-species anatomy ontology. Genome Biol. 2012, 13, 1–20.
  9. Sherman, K.; Duda, A.M. Large marine ecosystems: An emerging paradigm for fishery sustainability. Fisheries 1999, 24, 15–26.
  10. ISO 3166 Country Codes. Available online: https://www.iso.org/iso-3166-country-codes.html (accessed on 17 March 2025).
  11. The International Standard Statistical Classification of Fishing Gear (ISSCFG). Available online: https://www.fao.org/cwp-on-fishery-statistics/handbook/capture-fisheries-statistics/fishing-gear-classification/en/ (accessed on 17 March 2025).
  12. Marketakis, Y.; Tzitzikas, Y.; Gentile, A.; Van Niekerk, B.; Taconet, M. On the Evolution of Semantic Warehouses: The Case of Global Record of Stocks and Fisheries. In Proceedings of the 14th International Conference on Metadata and Semantics Research, Special Track on Metadata & Semantics for Agriculture, Food & Environment (AgroSEM’20), Madrid, Spain, 2–4 December 2020.
Figure 1. The process of collecting species resources from external data sources, preparing, adapting, and ingesting them into the Knowledge Base.
Figure 2. Detailed information (i.e., identifiers, scientific names, common names in English [UK], Italian [IT], Greek [GR] and Chinese [CN], and taxonomic information) about aquatic species in the knowledge base.
Figure 3. Identifiers of marine species belonging to the genus “Mullus”.

Share and Cite

MDPI and ACS Style

Marketakis, Y.; Tzitzikas, Y.; Gentile, A.; Ellenbroek, A.; Taconet, M. Building a Global Aquatic Resource Knowledge Base for Fisheries. Proceedings 2025, 117, 4. https://doi.org/10.3390/proceedings2025117004
