ROSSIO Infrastructure: A Digital Humanities Platform to Explore the Portuguese Cultural Heritage
Abstract
1. Introduction
2. Software and Data Architecture
2.1. Applications Architecture
- Data Manager and Curator—This actor operates the process of ingesting metadata and also publishes datasets. To perform these roles, they use the Public Datasets Repository and the Metadata Harvesting and Ingestion Application;
- Vocabulary Manager—This actor creates and maintains the controlled vocabularies used in the ROSSIO Infrastructure. Typically, this role is performed by information professionals and terminology specialists;
- End-user—This is the target user group of the ROSSIO Infrastructure, which includes students, teachers, researchers, as well as the general public.
- Metadata Harvesting and Ingestion Application—This application performs the OAI-PMH harvesting process of the data providers’ datasets into ROSSIO. The initial data-processing tasks are also performed by this application. It ingests the harvested datasets into ROSSIO’s internal repository, creates the search indexes, and publishes the datasets in ROSSIO’s public datasets repository. The harvested metadata is also enriched during the ingestion process by using the Data Normalisation and Enrichment Application;
- Internal Repository—The datasets from the data providers are stored in this repository. This storage is designed for fast access to individual records, and assigns each record an identifier in the form of a linked data URI;
- Data Normalisation and Enrichment Application—During the ingestion process, this application is used to enrich the metadata harvested from the data providers. The application matches specific field values from the metadata with entities and concepts included in one of the ROSSIO vocabularies. The application establishes links with the matched ROSSIO entities in order to enable semantic searching in ROSSIO via its vocabularies. In addition, this application normalises the values of date properties to enable a date search index and timeline searching in ROSSIO’s end-user applications;
- Public Repository (Dataverse)—In order to publish datasets to the public, ROSSIO uses the Dataverse software (https://dataverse.org/ accessed on 7 December 2021). Two kinds of datasets will be published by ROSSIO: the datasets aggregated from the data providers and the datasets that will be created by researchers while using one of the ROSSIO applications. The datasets are assigned persistent identifiers by the Dataverse software;
- Linked Data Resolution Application—This application makes the metadata accessible by following the best practices for linked data and implementing the relevant specifications. The application receives and processes the access requests to URIs in ROSSIO’s namespaces. The data for responding to these URI requests is fetched in real time from other applications in ROSSIO, as follows:
  - From the Internal Repository—for the aggregated metadata of cultural and scientific items;
  - From the Public Dataset Repository—for dataset-level metadata;
  - From the Vocabularies RDF Triple Store—for concepts and entities defined in any of the ROSSIO vocabularies;
- Search Engine (Apache Solr)—This application provides searching functionality across the aggregated metadata about cultural and scientific items. The indexing process is operated by the Data Manager and Curator, who uses the Metadata Harvesting and Ingestion Application. The Search Engine implements a search schema designed for the search requirements of the ROSSIO applications for end-users. It consists of a deployment of the Apache Solr software (https://solr.apache.org/ accessed on 7 December 2021), which makes an API available for searching. It is via this API that the applications in ROSSIO search in the aggregated dataset;
- Vocabularies Triple Store (Apache Fuseki)—This application stores and indexes the ROSSIO vocabularies as RDF data. The Triple Store makes available a SPARQL endpoint [7] that allows semantic queries to be made on the vocabularies by other applications. The SPARQL endpoint is available to the public; therefore, the vocabularies can also be queried by internal applications, by applications from third parties, and by end-users proficient in the SPARQL language. The Triple Store is an installation of the Apache Fuseki software (https://jena.apache.org/documentation/fuseki2/ accessed on 7 December 2021);
- Vocabularies Management Application (Vocbench)—By using this application, the Vocabulary Manager creates and maintains the controlled vocabularies of the ROSSIO infrastructure. It is a deployment of the Vocbench (http://vocbench.uniroma2.it/ accessed on 7 December 2021) software;
- Vocabularies Publication Application (Skosmos)—The vocabularies created in Vocbench are published for human usage in this application. The Vocabulary Manager is responsible for the publication process, which consists of exporting the vocabularies from the Vocabularies Management Application and then importing them into the Vocabularies Publication Application. This application is a deployment of the Skosmos software;
- End-user systems—Four independent applications are being developed under a common framework, which will offer the services of the ROSSIO platform to end-users, based on the aggregated dataset. These services are described in detail in Section 3. The applications that implement these services use the Search Engine for querying the dataset, use the Linked Data Resolution Application to obtain the metadata about the digital objects in RDF, and query the ROSSIO vocabularies using the Vocabularies Triple Store. The services are supported by four applications:
- Discovery Portal—This application provides search and retrieval functionality to all end-users of ROSSIO;
- Virtual Research Environment—This application provides functionalities for researchers to work with the resources available in ROSSIO;
- Digital Exhibitions—This application provides functionalities for creating and publishing exhibitions of digital resources available in ROSSIO;
- Digital Collections—This application provides functionalities for creating and publishing explanatory resources that use digital resources available in ROSSIO (the technical development of the VRE and the digital collections and exhibitions benefited from the work of three MA students from FCT, Joana Barbosa, Henrique Raposo, and Luís Coelho, who wrote their theses within the project).
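The OAI-PMH harvesting step performed by the Metadata Harvesting and Ingestion Application can be illustrated with a minimal sketch. It only builds `ListRecords` request URLs and reads record identifiers and the resumption token from a response page, following the OAI-PMH 2.0 protocol. The base URL, set name, identifiers, and sample response are hypothetical, and this is not ROSSIO's actual harvester.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, no other argument besides the verb may be sent.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.iterfind(
            ".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    has_token = token_element is not None and token_element.text
    token = token_element.text.strip() if has_token else None
    return identifiers, token


# Hypothetical response page from a data provider (illustration only).
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    ids, token = parse_page(SAMPLE_PAGE)
    print(ids, token)
    # URL of the follow-up request for the next page of the same harvest.
    print(list_records_url("https://example.org/oai", resumption_token=token))
```

A harvester would repeat the request with each returned token until a page arrives without one, which signals the end of the list.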
2.2. Internal Data Model
- Date normalisation—Data providers use different date formats and values with partial dates, such as only a year, or the month of a year. The values of the date properties are normalised into one format that allows date ranges to be represented. Using these normalised values, a special date range index is created in the Search Engine, allowing end-users to query by periods of time;
- Hyperlink enrichment—This enrichment operation analyses the hyperlinks present in the metadata, and tries to determine if the links point to a web page, directly to an image file, or to another kind of media file. This allows ROSSIO to make particular uses of the links, such as for linking back to the page of the digital objects at the data providers’ websites, or displaying miniature images representing the digital objects;
- Entity linking—This enrichment process aims to allow semantic searching on the ROSSIO dataset by linking entity names expressed in the metadata of data providers to entities and concepts of the ROSSIO vocabularies (described in Section 2.3). It includes four enrichment operations:
  - Georeferencing enrichment—Place names, found in metadata fields about subjects and coverage, are matched against entities in the Places vocabulary, allowing, in some cases, an enrichment of the metadata with accurate geographic coordinates;
  - Agent enrichment—Names of persons and organisations, found in metadata fields about authors, contributors, and subjects, are matched against entities in the Agents vocabulary;
  - Temporal enrichment—Names of historical periods, found in metadata fields about subjects and coverage, are matched against entities in the Periods vocabulary;
  - Concept enrichment—Terms of concepts, found in metadata fields about subjects, are matched against entities in the ROSSIO Thesaurus.
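The date normalisation operation described above can be sketched as follows. This is an illustrative assumption about how partial dates might be expanded into ranges, not ROSSIO's actual implementation: it only handles ISO-like values (`YYYY`, `YYYY-MM`, `YYYY-MM-DD`), whereas real provider data would need many more formats. The function name `normalise_date` is hypothetical.

```python
import calendar
import re
from datetime import date

# Accepts YYYY, YYYY-MM or YYYY-MM-DD (an assumed subset of provider formats).
_PARTIAL_DATE = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def normalise_date(value):
    """Expand a possibly partial date into an inclusive (start, end) range.

    Returns None when the value is not in a recognised format or does not
    denote a valid calendar date.
    """
    match = _PARTIAL_DATE.match(value.strip())
    if not match:
        return None
    year = int(match.group(1))
    month = int(match.group(2)) if match.group(2) else None
    day = int(match.group(3)) if match.group(3) else None
    try:
        if month is None:
            # Only a year: the range covers the whole year.
            return date(year, 1, 1), date(year, 12, 31)
        if day is None:
            # Year and month: the range covers the whole month.
            last_day = calendar.monthrange(year, month)[1]
            return date(year, month, 1), date(year, month, last_day)
        # Full date: start and end coincide.
        exact = date(year, month, day)
        return exact, exact
    except ValueError:
        # Raised for impossible values such as month 13 or 30 February.
        return None


if __name__ == "__main__":
    for raw in ["1914", "1916-02", "1914-07-28", "c. 1900"]:
        print(raw, "->", normalise_date(raw))
```

Storing every value as a start and end pair lets a single range index answer period queries regardless of how precise the original date was.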
2.3. The Development and Publication of Controlled Vocabularies
- ROSSIO Thesaurus. This vocabulary is composed of designations of subjects or topics in the SSAH, which mostly correspond to general concepts (e.g., “Arts”, “Artists”). The ROSSIO Thesaurus also includes several individual concepts, such as disciplines (e.g., “History”) and conceptual objects (e.g., “Identity”). The distinction between general and individual concepts is standardised both in terminology [9] and information science [10]. The development of the ROSSIO Thesaurus was already described in [11];
- ROSSIO Agents. This vocabulary consists of personal and organisational names. For example, it lists every consortium member and data provider of the ROSSIO Infrastructure, along with relevant agents in SSAH;
- ROSSIO Places. This vocabulary includes toponyms. It will include names of geopolitical entities (e.g., countries), geographical features (e.g., rivers), areas (e.g., neighbourhoods), and points of interest (e.g., buildings). “Place” is understood broadly as any physical entity that is inherently located and, therefore, has stable geographic coordinates;
- ROSSIO Periods. This vocabulary is composed of names for periods, including historical, cultural, artistic, or geological periods of time. It will comprise “absolute” time-intervals (e.g., millennia and centuries) as well as more or less variable periods in history (e.g., “Renaissance”) and individual events (e.g., “World War I, 1914–1918”).
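These vocabularies are the targets of the entity linking performed during ingestion (Section 2.2). A minimal sketch of such a lookup, assuming simple matching on normalised labels, is given below. The matching strategy, the function names, and the vocabulary URIs are all hypothetical; the source does not describe the algorithm ROSSIO uses.

```python
import unicodedata


def normalise_label(text):
    """Case-fold, strip diacritics and collapse whitespace for comparison."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_marks.casefold().split())


def build_index(vocabulary):
    """Map each normalised label to the URI of its vocabulary entity.

    `vocabulary` maps an entity URI to its labels (preferred and
    alternative). The first entity seen for a label wins.
    """
    index = {}
    for uri, labels in vocabulary.items():
        for label in labels:
            index.setdefault(normalise_label(label), uri)
    return index


def link_values(values, index):
    """Return {original value: entity URI} for every value that matches."""
    links = {}
    for value in values:
        uri = index.get(normalise_label(value))
        if uri is not None:
            links[value] = uri
    return links


# Hypothetical excerpt of a Places vocabulary (URIs are illustrative only).
PLACES = {
    "https://vocabs.example.org/places/lisboa": ["Lisboa", "Lisbon"],
    "https://vocabs.example.org/places/evora": ["Évora"],
}

if __name__ == "__main__":
    index = build_index(PLACES)
    print(link_values(["  LISBON ", "Evora", "Atlantis"], index))
```

Exact matching on normalised labels is deliberately conservative: it misses spelling variants, and homonyms would need additional disambiguation.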
- BIBFRAME (http://id.loc.gov/ontologies/bibframe/ accessed on 7 December 2021). The BIBFRAME ontology provides classes for modelling types of entities for each vocabulary. These classes are Topic, Person, Organization, Place, and Temporal. Each entity in the vocabularies is simultaneously an instance of skos:Concept and of one of the above-mentioned BIBFRAME classes. This is intended to facilitate the use of the ROSSIO vocabularies in the platform and, also, their potential reuse in the linked open data cloud;
- Getty Vocabulary Program (GVP) ontology (http://vocab.getty.edu/ontology accessed on 7 December 2021). The ROSSIO vocabularies make use of properties from this ontology for representing types of agents (agentType) and places (placeType) by means of concepts from the ROSSIO Thesaurus. These properties allow, for example, declarations that Amália Rodrigues was a fado singer and that Portugal is a country by linking “Amália Rodrigues” in ROSSIO Agents to the “Fado singers” concept in the ROSSIO Thesaurus, and “Portugal” in ROSSIO Places to the “Countries” concept in ROSSIO Thesaurus, respectively;
- SKOS extension of ISO 25964 (http://purl.org/iso25964/skos-thes accessed on 7 December 2021). This extension aligns the ISO 25964 data model for thesauri [15] with SKOS. The ROSSIO vocabularies use the Thesaurus array class for modelling guide terms (e.g., <People by occupation>), which is a common design pattern found in many thesauri and terminologies;
- Schema.org ontology (http://schema.org/ accessed on 7 December 2021). ROSSIO Agents makes use of properties from Schema.org for modelling birth and death dates, since BIBFRAME only contains a generic date property, which is intended to model publication dates;
- WGS84 Geo Positioning ontology (http://www.w3.org/2003/01/geo/wgs84_pos accessed on 7 December 2021). ROSSIO Places uses properties of the Geo Positioning ontology for modelling geographical coordinates, which could be further explored by geolocation applications. When available, latitude and longitude coordinates are extracted from GeoNames (http://www.geonames.org/ accessed on 7 December 2021) or the Getty Thesaurus of Geographic Names (TGN) (http://www.getty.edu/research/tools/vocabularies/tgn/ accessed on 7 December 2021).
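The combination of SKOS, BIBFRAME, and GVP terms described above can be made concrete with a small sketch that emits N-Triples for the two examples mentioned in the text (Amália Rodrigues and Portugal). The entity URIs under `https://vocabs.example.org/` and the helper names are hypothetical; only the ontology namespaces and the `agentType` and `placeType` property names come from the cited ontologies.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BF = "http://id.loc.gov/ontologies/bibframe/"
GVP = "http://vocab.getty.edu/ontology#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical base URI; the real ROSSIO namespaces are not given in the text.
BASE = "https://vocabs.example.org/"


def describe_entity(vocab, local_id, label, bf_class,
                    type_property=None, type_concept=None):
    """Return triples typing an entity as skos:Concept and a BIBFRAME class.

    Objects are URI strings, except literals, which are ("literal", text).
    """
    subject = f"{BASE}{vocab}/{local_id}"
    triples = [
        (subject, RDF_TYPE, SKOS + "Concept"),
        (subject, RDF_TYPE, BF + bf_class),
        (subject, SKOS + "prefLabel", ("literal", label)),
    ]
    if type_property and type_concept:
        # Link to a Thesaurus concept through a GVP typing property.
        triples.append(
            (subject, GVP + type_property, f"{BASE}thesaurus/{type_concept}"))
    return triples


def to_ntriples(triples):
    """Serialise the triples as N-Triples, one statement per line."""
    lines = []
    for subject, predicate, obj in triples:
        if isinstance(obj, tuple):
            escaped = obj[1].replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{escaped}"'
        else:
            rendered = f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


if __name__ == "__main__":
    amalia = describe_entity("agents", "amalia-rodrigues", "Amália Rodrigues",
                             "Person", "agentType", "fado-singers")
    portugal = describe_entity("places", "portugal", "Portugal",
                               "Place", "placeType", "countries")
    print(to_ntriples(amalia + portugal))
```

Typing each entity twice keeps it usable by generic SKOS tools while still exposing whether it is a person, place, organisation, topic, or period.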
3. ROSSIO Infrastructure Services: Discovery Portal, Exhibitions and Digital Collections, Virtual Research Environment
3.1. The Discovery Portal
- Chronological diversity: The digital objects available date from different historical periods, ranging from Prehistory to the present day. The user can find a decorated menhir from the Ancient Neolithic period, preserved at DGPC, or temporary websites, such as those linked to presidential campaigns, searchable only in the Portuguese web-archive;
- Geographic variety: Although the name of the Infrastructure (ROSSIO) might be taken to be associated with the large square located in Lisbon, the data provided covers the entire country, islands (Madeira and Azores), and former Portuguese colonies (e.g., Brazil, Cape Verde, Angola, Mozambique, and Timor). It also covers other regions and countries of the world, some of which have been less documented for earlier periods, such as Oman;
- Thematic heterogeneity: The resources available focus on diverse subjects, including Festivities, Monuments, Music, Architecture, and Theatre, which are relevant to SSAH disciplines such as History, Anthropology, Sociology, Linguistics, Art History, and Musicology;
- Typological diverseness: The available resource types include handwritten documentation (e.g., manuscripts, codices), published studies (e.g., scientific papers, books), iconography (e.g., engravings, paintings, photographs, sculptures), audio-visual material (e.g., sound recordings, videos), and other online resources (e.g., databases, websites).
3.2. Exhibitions and Digital Collections
3.3. The Virtual Research Environment
- My Folders—to store and organise the resources selected;
- Search—connected with the discovery portal (search, visualise and select data);
- Resources—selected resources and personal annotations;
- Text editor—to write notes on the resources chosen.
4. Final Remarks: Potentialities for Public Dissemination of Science Research
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- European Strategy Forum on Research Infrastructures. European Roadmap for Research Infrastructures Report 2016; Office for Official Publications of the European Communities: Luxembourg, 2016; Available online: https://www.esfri.eu/sites/default/files/esfri_roadmap_2006_en.pdf (accessed on 30 July 2021).
- European Strategy Forum on Research Infrastructures. European Roadmap for Research Infrastructures Report 2018; Office for Official Publications of the European Communities: Luxembourg, 2018; Available online: http://roadmap2018.esfri.eu/ (accessed on 30 July 2021).
- ACLS Commission on Cyberinfrastructure. Our Cultural Commonwealth: The Report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Science. 2006. Available online: https://publications.acls.org/Our_Cultural_Commonwealth.pdf (accessed on 30 July 2021).
- Foundation for Science and Technology. Portuguese Roadmap of Research Infrastructures—2020 Update; FCT: Lisbon, Portugal, 2020; Available online: https://www.fct.pt/apoios/equipamento/roteiro/index.phtml.en (accessed on 30 July 2021).
- Lagoze, C.; Van de Sompel, H.; Nelson, M.; Warner, S. The Open Archives Initiative Protocol for Metadata Harvesting, Version 2.0. 2002. Available online: http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm (accessed on 30 July 2021).
- Van de Sompel, H.; Nelson, M. Reminiscing About 15 Years of Interoperability Efforts. D-Lib Mag. 2015, 21, 1–12. Available online: http://www.dlib.org/dlib/november15/vandesompel/11vandesompel.html (accessed on 30 July 2021).
- Prud’hommeaux, E.; Seaborne, A. SPARQL Query Language for RDF. W3C Recommendation. W3C, 2008. Available online: https://www.w3.org/TR/rdf-sparql-query/ (accessed on 7 December 2021).
- Europeana Foundation: Definition of the Europeana Data Model v5.2.8. 2017. Available online: http://pro.europeana.eu/edm-documentation (accessed on 7 December 2021).
- ISO 1087; Terminology Work and Terminology Science—Vocabulary. ISO: Geneva, Switzerland, 2019.
- ISO 5127; Information and Documentation—Foundation and Vocabulary. ISO: Geneva, Switzerland, 2017.
- Almeida, B.; Freire, N.; Monteiro, D. The Development of the ROSSIO Thesaurus: Supporting Content Discovery and Management in a Research Infrastructure. In Proceedings of the 17th Italian Research Conference on Digital Libraries; Dosso, D., Ferilli, S., Manghi, P., Poggi, A., Serra, G., Silvello, G., Eds.; CEUR-WS: Aachen, Germany, 2021; pp. 138–146. Available online: http://ceur-ws.org/Vol-2816/ (accessed on 7 December 2021).
- Miles, A.; Bechhofer, S. SKOS Simple Knowledge Organization System Reference. 2009. Available online: http://www.w3.org/TR/skos-reference (accessed on 7 December 2021).
- Souza, R.R.; Tudhope, D.; Almeida, M.B. Towards a Taxonomy of KOS: Dimensions for Classifying Knowledge Organization Systems. Knowl. Organ. 2012, 39, 179–192.
- Baker, T.; Bechhofer, S.; Isaac, A.; Miles, A.; Schreiber, G.; Summers, E. Key Choices in the Design of Simple Knowledge Organization System (SKOS). J. Web Semant. 2013, 20, 35–49.
- ISO 25964-1; Information and Documentation—Thesauri and Interoperability with Other Vocabularies—Part 1: Thesauri for Information Retrieval. ISO: Geneva, Switzerland, 2011.
- Área de Classificação e Indexação da Biblioteca Nacional. SIPORbase: Sistema de Indexação Em Português: Manual, 3rd ed.; Rev. e Aumentada; Biblioteca Nacional: Lisboa, Portugal, 1998.
- Sottomayor, J.C. (Ed.) Regras de Catalogação: Descrição e Acesso de Recursos Bibliográficos Nas Bibliotecas de Língua Portuguesa; APBAD: Lisboa, Portugal, 2008.
- Sherratt, T. From Portals to Platforms—Building New Frameworks for User Engagement; Paper Presented at LIANZA 2013; Zenodo: Hamilton, New Zealand, 2013; pp. 1–9.
- Kalfatovic, M.R. Creating a Winning Online Exhibition. A Guide for Libraries, Archives, and Museums; American Library Association: Chicago, IL, USA; London, UK, 2002.
- Khoon, L.C.; Chennupati, R.K.; Foo, S. The design and development of an online exhibition for heritage information awareness in Singapore. Program 2003, 37, 85–93.
- Khoon, L.C.; Chennupati, R.K. Design and development of Web-based Online Exhibitions. DESIDOC J. Libr. Inf. Technol. 2014, 32, 97–102.
- Natale, M.T.; Fernández, S.; López, M. (Eds.) Handbook on Virtual Exhibitions and Virtual Performances; Officine Grafiche Tiburtine: Tivoli, Italy, 2012.
- Antoniou, A.; Lepouras, G.L.; Vassilakis, C. Methodology for Design of Online Exhibitions. DESIDOC J. Libr. Inf. Technol. 2013, 33, 158–167.
- Candela, L.; Castelli, D.; Pagano, P. Virtual Research Environments: An overview and a research agenda. Data Sci. J. 2013, 12, 75–81.
- Zhou, J.; Smith, K.; Wilsbacher, G.; Sagona, P.; Reddy, D.; Torkian, B. Building Science Gateways for Humanities. In Practice and Experience in Advanced Research Computing (PEARC’20); Association for Computing Machinery: New York, NY, USA, 2020; pp. 327–332.
- Carusi, A.; Reimer, T. Virtual Research Environment Collaborative Landscape Study: A JISC Funded Project; JISC: Bristol, UK, 2010.
- Osório, A.J. Reflexões sobre tecnologia e educação em tempo de pandemia. In A Universidade do Minho em Tempos de Pandemia II; UMinho Editora: Braga, Portugal, 2020; pp. 212–224.
- Rodrigues, E. A pandemia e a emergência da ciência aberta. In A Universidade do Minho em Tempos de Pandemia. II; UMinho Editora: Braga, Portugal, 2020; pp. 263–294.
- Colace, F.; Santaniello, D.; Casillo, M.; Clarizia, F. BeCAMS: A behavior context aware monitoring system. In Proceedings of the 2017 IEEE International Workshop on Measurement and Networking, Naples, Italy, 27–29 September 2017; IEEE: Naples, Italy, 2017; pp. 1–6. Available online: https://ieeexplore.ieee.org/document/8078374 (accessed on 30 July 2021).
- Clarizia, F.; Colace, F.; de Santo, M.; Lombardi, M.; Pascale, F. A sentiment analysis approach for evaluation of events in field of cultural heritage. In Proceedings of the 2018 Fifth International Conference on Social Networks Analysis, Management and Security (SNAMS), Valencia, Spain, 15–18 October 2018; SNAMS: Valencia, Spain, 2020; pp. 120–127.
- Casillo, M.; Clarizia, F.; Colace, F.; Lombardi, M.; Pascale, F.; Santaniello, D. An Approach for Recommending Contextualized Services in e-Tourism. Information 2019, 10, 180.
- Elrayies, G.M. Flipped Learning as a Paradigm Shift in Architectural Education. Int. Educ. Stud. 2017, 10, 93–108.
- Ouda, H.; Ahmed, K. Flipped Learning As A New Educational Paradigm: An Analytical Critical Study. Eur. Sci. J. 2016, 12, 417–444.
- Kamińska, D.; Sapiński, T.; Wiak, S.; Tikk, T.; Haamer, R.E.; Avots, E.; Helmi, A.; Ozcinar, C.; Anbarjafari, G. Virtual Reality and Its Applications in Education: Survey. Information 2019, 10, 318.
Type of Institution | Institutions | Datasets | Objects |
---|---|---|---|
Archive | 18 | 18 | 4,631,463 |
Library | 4 | 11 | 127,964 |
Research Unit | 10 | 20 | 76,818 |
Audio-Visual Archive | 1 | 2 | 0 |
Institutional Repository | 1 | 1 | 12,069 |
Total | 34 | 52 | 4,848,314 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Silva, G.M.d.; Glória, A.C.; Salgueiro, Â.S.; Almeida, B.; Monteiro, D.; Freitas, M.R.d.; Freire, N. ROSSIO Infrastructure: A Digital Humanities Platform to Explore the Portuguese Cultural Heritage. Information 2022, 13, 50. https://doi.org/10.3390/info13020050