1.1. Context: The Energy and Digital Transformation
During recent decades, research and innovation in Renewable Energy (RE), such as wind and solar, have supported the green transformation of energy systems, the backbone of a low-carbon, climate-resilient society. One of the challenges is to manage the complexity of the grid transformation to allow for higher shares of highly variable renewables while securing the stability of the grid and a stable energy supply. Help comes from the ongoing digital transformation, in which the digitisation of infrastructures and assets in research and industry generates multi-dimensional and multi-disciplinary digital data.
The escalation of data from new sources and the growing complexity of the challenges introduce several roadblocks: among them, a data user who must perform data analytics needs help finding the correct data to exploit. This has two significant facets: firstly, missing data management, i.e., datasets are neither findable (because community-standard metadata and taxonomies are missing) nor interoperable (because standards for data formats are missing); secondly, data owners having a negative perception of sharing data.
It is increasingly clear that we need to introduce elements of information science and data science to deal with the data challenge. The former is essential for data management, to organise data and make them ready for data science exploitation. The latter enables the digitalisation process, i.e., transforming data into information and then into value to find innovative solutions and business models, e.g., to cut costs through better planning, to increase operational efficiency by optimising processes and to optimise investments by reducing risk.
As the energy sector is central to the actions to mitigate climate change, digitalisation can become one of the effective tools to fight it. This adds to the other central challenges already faced by the energy sector, such as its contribution to economic development and pollution reduction. In this framework, data generated and consumed by digital applications in the energy sector are of significant importance. They range from meteorological to consumer data, electricity consumption and production data, or data relating to the infrastructures and the state of machines. In each case, such data may have a value outside the specific scope for which they were generated, especially considering new information-centric business models and the potential in research and development.
However, to be exploited, data must be available to potential users. To this end, different paradigms have been proposed, such as open data or Findable, Accessible, Interoperable and Reusable (FAIR) data, which are discussed in the next section. In any case, data for each topic in a specific field must also be findable; for this, a topic taxonomy for energy-related fields is necessary for two reasons: to standardise the nomenclature, supporting a common understanding amongst the actors, and to classify existing and future data available in this field.
In this paper, we address the information science issue by creating metadata and a taxonomy for the topics of photovoltaics and concentrated solar power, based on previous work carried out by the wind energy community. We adopted expert elicitation: a group of experts in different renewable technologies and in cross-cutting fields, such as Life Cycle Assessment (LCA) and the European Union (EU) taxonomy for sustainable activities, was gathered to propose a consistent and detailed taxonomy for renewable energy-related data. The result is a coherent classification of relevant data sources, considering both the general aspects applicable to electricity generation from the selected renewable energy technologies and their specific elements. The metadata and taxonomies can be easily extended to other renewable resources not considered here.
1.2. State of the Art
Knowing that data exist and being able to access them promptly is widely recognised as a way to optimise research impact, and several frameworks have been proposed to achieve this. Among them, the two most debated are currently the open data [1] and the FAIR [2] approaches. In the first approach, data access is free to users, with several licences granting different levels of reusability. In the second approach, data ownership and access remain with the owners, who make data (a) findable through a faceted search and (b) accessible through agreements, e.g., compensation and/or confidentiality clauses.
Independently of whether data are open or FAIR, data should be findable and reusable; to comply with these conditions, data must be identified and described using metadata, i.e., a set of information (e.g., who, what, when, where and how) that puts the data in context. Furthermore, metadata should be machine-actionable, allowing for searchability and usability by machines. With metadata, data are preserved for future reuse. Metadata can be general or discipline-specific: the former are defined, for example, by the Dublin Core (DC) [3], whilst the latter must be defined by domain experts.
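As an illustration of a machine-actionable record that combines general and discipline-specific metadata, a minimal sketch is given below. The element names are those of the Dublin Core element set; the function `build_record`, the domain fields (`technology`, `variable`, `sampling_interval_s`) and all values are hypothetical and are not taken from any schema discussed in this paper.

```python
import json

# The 15 element names of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def build_record(general, domain_specific):
    """Combine general (Dublin Core) and discipline-specific metadata.

    Hypothetical helper: rejects general fields that are not
    Dublin Core element names.
    """
    unknown = set(general) - DUBLIN_CORE_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return {"dublin_core": dict(general), "domain": dict(domain_specific)}


# Illustrative dataset description; every value is an assumption.
record = build_record(
    general={
        "title": "PV plant power output, 10-minute averages",
        "creator": "Example Research Centre",
        "date": "2021-10-29",
        "format": "text/csv",
        "rights": "Access on request",
    },
    domain_specific={
        "technology": "photovoltaics",
        "variable": "active power",
        "sampling_interval_s": 600,
    },
)

# Serialising to JSON gives a form that software can parse and index.
serialised = json.dumps(record, indent=2, sort_keys=True)
print(serialised)
```

Keeping the general and the domain-specific parts in separate blocks is only one possible design; it makes it easy to validate the general part against a fixed element list while leaving the domain part to the experts of each discipline.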
Among the information included in metadata is the position of the dataset in a relevant taxonomy, which is a method of structuring the knowledge related to a discipline into topics. Examples are the well-known biological classification and the more recent EU taxonomy for sustainable activities [4]. In this work, the most common approach, a hierarchical taxonomy, has been used.
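A hierarchical taxonomy can be represented as a tree in which each dataset is positioned by the path from the root to a topic. The sketch below assumes a nested-dictionary representation; the topic labels and the function `find_paths` are hypothetical and do not reproduce the taxonomy developed in this paper.

```python
# Hypothetical topic tree; labels are illustrative only.
TAXONOMY = {
    "Renewable energy": {
        "Wind energy": {
            "Resource assessment": {},
            "Operation and maintenance": {},
        },
        "Photovoltaics": {
            "Resource assessment": {},
            "Operation and maintenance": {},
        },
    }
}


def find_paths(tree, topic, prefix=()):
    """Return every root-to-node path whose last label equals `topic`."""
    paths = []
    for label, children in tree.items():
        path = prefix + (label,)
        if label == topic:
            paths.append(path)
        # Recurse into the sub-topics of the current node.
        paths.extend(find_paths(children, topic, path))
    return paths


# The same topic label can occur under several technologies.
for path in find_paths(TAXONOMY, "Operation and maintenance"):
    print(" > ".join(path))
```

Storing the full path rather than the leaf label alone avoids ambiguity when the same topic name appears under more than one branch.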
Regarding data availability, datasets about renewable energy resources are now largely accessible thanks to long-term work carried out at independent research centres and large multinational organisations. Examples are the numerous wind or solar atlases created for different regions. Among them are the Global Wind Atlas of the Technical University of Denmark (DTU) [5] and Solargis’ Global Solar Atlas [6], both funded by the World Bank. Regulated grid operators also disclose relevant information through dedicated data platforms, such as the European Network of Transmission System Operators for Electricity (ENTSO-E) Transparency Platform [7] or similar portals at the national level. On the other hand, data relating to operational aspects of renewable plants, such as output or maintenance, are much rarer despite a long operational history. A notable exception is the Open Mod Initiative, which aims at collecting access points to existing open data related to the energy sector [8]. Finally, data are scarce on emerging aspects of renewable energy, such as plants’ end of life (disposal, recycling and reuse), and on concepts of growing importance, such as environmental performance evaluated through LCA.
The idea of this work originates from the lack of such an organisation of knowledge and data for renewable energy, potentially due to the relative youth of this field in the landscape of science and technology research. Nevertheless, this work could build on previous pioneering work in sub-areas of renewable energy and in neighbouring fields.
Regarding wind energy, the most detailed and complete taxonomy of wind energy research and development topics was developed during the European Commission FP7 project Integrated Research Programme in Wind Energy (IRPWIND). IRPWIND combined strategic research projects and support activities within the field of wind energy to leverage the long-term European research potential. To guarantee the reliability of the taxonomies, the work used the expert elicitation procedure, gathering experts from the major organisations in wind energy associated with the European Energy Research Alliance Joint Programme on Wind Energy (EERA JP Wind), whose organisational structure and participation are mirrored in the IRPWIND consortium. Details of the work are presented in [9]. This work is also enhanced by the production of a metadata schema and the design and demonstration of the metadata catalogue Share-Wind. The metadata schema includes the list of general Dublin Core metadata, complemented by wind energy domain-specific metadata and related taxonomies/vocabularies. Data owners can register the metadata of available datasets to populate the data catalogue, and data users can search for the data they need. The platform Share-Wind is in the process of being transferred to the domain http://share-wind.net (accessed on 29 October 2021).
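The register-and-search workflow of a metadata catalogue can be sketched as follows. This is a minimal illustration under stated assumptions and not the Share-Wind implementation: the class `Catalogue`, its methods and the record fields are all hypothetical.

```python
from collections import Counter


class Catalogue:
    """Hypothetical in-memory metadata catalogue with faceted search."""

    def __init__(self):
        self._records = []

    def register(self, record):
        """Data owners add the metadata of a dataset (not the data itself)."""
        self._records.append(dict(record))

    def search(self, **facets):
        """Data users keep only records matching every requested facet."""
        return [
            r for r in self._records
            if all(r.get(key) == value for key, value in facets.items())
        ]

    def facet_counts(self, field):
        """Count records per value of a field, as shown beside search filters."""
        return Counter(r[field] for r in self._records if field in r)


# Illustrative records; titles, topics and access levels are assumptions.
catalogue = Catalogue()
catalogue.register({"title": "Met mast A", "topic": "Resource assessment", "access": "open"})
catalogue.register({"title": "Turbine SCADA B", "topic": "Operation and maintenance", "access": "on request"})
catalogue.register({"title": "Met mast C", "topic": "Resource assessment", "access": "on request"})

hits = catalogue.search(topic="Resource assessment", access="open")
print([r["title"] for r in hits])
print(catalogue.facet_counts("topic"))
```

Because only metadata are registered, such a catalogue is compatible with both open and access-restricted datasets: the record describes the data and states the access conditions, while the data remain with their owner.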
Other significant activities on this topic have been carried out at the National Renewable Energy Laboratory (NREL) [10] and the International Energy Agency (IEA) [11]. In the first case, a taxonomy has been developed for studying the cost structure of wind energy to identify potential sources of cost reduction. In the second case, the activity is not linked exclusively to cost but to several aspects of wind turbines and plants, including design and operation. A second theme which has sparked the creation of taxonomies is the need to map completely the activities linked to maintenance and condition monitoring. Here, it is possible to mention [12], focused on wind turbines, and [13], where the ontology is used to represent the knowledge extracted by fault diagnosis analysis. Finally, it is possible to mention [14], where an ontology of wind energy-related topics is created semi-autonomously from open-source text.
The same types of studies can be found for Photovoltaics (PV), where taxonomies have been proposed for both cost and maintenance analysis. In the first group, cost, it is possible to mention the works carried out at the National Renewable Energy Laboratory (NREL) [15,16] for cost benchmarking and [17], focused on soft costs for PV. In the second group, maintenance, detailed breakdowns of aspects related to photovoltaic energy have been developed in [18] and in the Trust PV project [19] to analyse the risk and maintenance aspects in the whole PV value chain. The need for a common nomenclature in PV has also been highlighted in the H2020 project Solar Bankability [20], where a cost factor was added to the Failure Mode and Effect Analysis (FMEA) in PV risk management: to develop common results from different systems, a common taxonomy must be in place. A third topic has been identified in the need for a taxonomy for systems design [21] and planning [22]. Finally, it is worth mentioning two works particularly relevant to this research: in [23], a topic and metadata taxonomy is presented for enabling FAIR PV production data time series, whilst in [24] a topic taxonomy is developed for agri-PV systems, with a deep level of detail.
Regarding concentrated solar power, little activity has been found, except for attempts to structure the knowledge around solar irradiation forecasts, presented in [25,26].
Finally, it is important to mention works related to knowledge organisation in life cycle assessment thinking. This is the notion of going beyond the traditional focus on the manufacturing site to account for the environmental, social and economic impacts over the whole life cycle of a product. It is important to use this method to cover all the aspects of the life of a renewable project, rather than focusing on specific phases such as planning or operation. This perspective is described in [27], which focused on the data necessary for life cycle assessment, and in [28], which attempted to reduce the ontology to a minimal extent and used a coal power plant as a case study. A comprehensive view of the topics in LCA can be found in [29], and aspects related to uncertainty are detailed in [30]. The Share-Wind initiative started as a data catalogue for wind energy-related datasets and evolved into a fully searchable metadata catalogue [31].
On a general level, a relevant work is the above-mentioned EU taxonomy for sustainable activities [4], developed to direct investments towards sustainable projects and to clarify for industry which types of investment can be considered sustainable. This taxonomy covers several topics affecting the level of sustainability of a system, including electricity production from renewable sources, which is the scope of this work. For each of them, information is given on the type of effect that investments have on improving the impact on the environment, whether their nature is permanent or transitional, direct or enabling, etc.
1.3. Conclusions and Contributions
We summarise the current research activity on taxonomies for renewable energy as follows: (1) activity is mainly carried out for wind and PV technology; (2) activity is often carried out at the level of large research bodies or international organisations; (3) the main objectives are (a) understanding cost structure, (b) classifying maintenance issues and (c) to a lesser extent, understanding system planning; (4) all works are technology-specific and not related to renewable energies in general, except for [4], even though renewable resources share several common aspects; (5) most work focuses on a subset of the life cycle of renewable energy technologies, often planning or operation and maintenance, except for [9,18].
In light of this, the work presented in this paper aims to provide the following main contribution:
This is considered important for several reasons. Firstly, there is a need to generalise the concepts across several renewable technologies. Different technologies, such as wind power or photovoltaics, are clearly based on completely different operating principles, but several common points exist. Among them are the dependence on renewable resources, which need to be assessed at the planning and operation stages; the structure of power plants, which is characterised by an array of energy-capturing units; and the types of studies that need to be carried out for renewable projects. For example, a trans-technology taxonomy could support the comparison and benchmarking of practices, performances and costs among different technologies and not only between plants or generations of the same technology. Secondly, the attention to data is due to the fact that this is where information is stored, both the information necessary to carry out studies and analyses and the information produced by and for the renewable industry and research. Finally, the attention given to the whole life cycle of renewables is necessary because of the current interest in the phases before and after the visible life of a renewable project, such as the manufacture and the disposal of the equipment.