1. Introduction
Clean air is a basic requirement of life, together with food and water. The latter two have been a primary concern of many civilizations for centuries, shaping new lifestyles and driving new economies, especially in more industrialized countries; air, by contrast, is imposed, with no possibility of choice. Since the industrial revolution, as people started to spend most of their time in confined environments, indoor air has become a leading exposure for humans, and clean air should have been considered a prerogative. Therefore, it is necessary to monitor the urban environment in the places where people live and work. For this reason, policy directives and guidelines have recently highlighted energy use, sustainable buildings, outdoor air quality, and indoor air quality (IAQ) as major environmental concerns.
Bibliometrics is the application of quantitative analysis and statistics to publications using different parameters, such as author co-citation, document co-citation, co-word analysis, and journal mapping, in order to understand emerging trends and the knowledge structure of a research field. Jointly with science mapping tools, it is possible to start from a large dataset of scientific publications in order to generate straightforward visual representations of complex structures for statistical analysis and interactive data exploration.
Using bibliometric tools to analyze the scientific literature collected by Web of Science, this article provides, first, an overview of the IAQ topic from 1990 to 2018, reporting the most important publications and grouping the existing literature into a finite number of clusters, and, second, the latest developments and future trends. Although this work is not structured as an exhaustive review of the related literature, it does illustrate the potential of bibliometric techniques for exploring research gaps and new frontiers.
2. Materials and Methods
2.1. Data Collection
All data were obtained from the Web of Science Core Collection (WoSCC) by Thomson Reuters prior to 1 May 2018. The keywords used for the data retrieval strategy were as follows: TS: (“Indoor air quality” OR “IAQ”). Only English-language articles, letters, and reviews published from 1990 to 2018 were retained, drawn from the following indexes: SCI-EXPANDED, SSCI, A&HCI, CPCI-S, CPCI-SSH, and ESCI. The final dataset contained 7389 bibliographic records.
2.2. Data Analysis
We utilized several scientific and visual analytic methods. The statistical results are displayed in CiteSpace V [1], visualization software used for analyzing data by network modeling.
For the first analysis, the overall time span adopted was 1990 through 2018, with a 1-year time slice. The Term Source included Title, Abstract, Author Keywords, and Keyword Plus; Node Type was selected according to the type of analysis conducted; and Selection Criteria included the top 50. No pruning was selected. For the second part, in order to focus on current and future trends, a restricted dataset, containing only papers selected with the same criteria from 2010 to 2018, was used. A different thresholding method was adopted: instead of selecting the top 50 articles of each time slice, CiteSpace thresholding parameters (citation, co-citation, and cosine coefficient thresholds, shortened as c, cc, and ccv) were set at 8;8;40, 6;6;30, and 2;2;10. The three threshold sets refer, respectively, to the beginning, middle, and end of the selected period. By applying more selective parameters to the first years, only the most relevant early publications are retained, whereas a wider set of publications is retained for the last period.
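The three threshold sets can be read as anchor points at the beginning, middle, and end of the 2010 to 2018 period, with the thresholds for intermediate slices interpolated between them. The following Python sketch illustrates that reading under stated assumptions: piecewise-linear interpolation and rounding to the nearest integer are assumed here, the function name is hypothetical, and CiteSpace's own interpolation may differ in detail.

```python
def interpolate_thresholds(years, begin, middle, end):
    """Interpolate (c, cc, ccv) thresholds across yearly time slices.

    Assumption: piecewise-linear interpolation between three anchor
    points placed at the first, middle, and last slice. This is an
    illustrative reading, not CiteSpace's actual implementation.
    """
    n = len(years)
    mid_idx = (n - 1) / 2  # position of the middle anchor
    result = {}
    for i, year in enumerate(years):
        if i <= mid_idx:
            lo, hi = begin, middle
            t = i / mid_idx if mid_idx else 0.0
        else:
            lo, hi = middle, end
            t = (i - mid_idx) / (n - 1 - mid_idx)
        # Interpolate each of the three thresholds and round to an integer.
        result[year] = tuple(round(a + (b - a) * t) for a, b in zip(lo, hi))
    return result


thresholds = interpolate_thresholds(
    list(range(2010, 2019)), (8, 8, 40), (6, 6, 30), (2, 2, 10)
)
for year, (c, cc, ccv) in thresholds.items():
    print(year, c, cc, ccv)
```

Under these assumptions the anchor years reproduce the three stated sets exactly, and the thresholds loosen gradually in between, which is what allows more recent, less-cited publications to enter the network.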
3. Results
3.1. Data Description
The distribution of yearly outputs is shown in Figure 1. The publishing trend increased from 13 publications in 1990 to 786 publications in 2017, highlighting the increased global focus on the topic. In particular, since 1990, it is possible to distinguish between two different phases. During the first phase from 1990 to 2009, a slow increase in publications occurred. The second phase occurred from 2010 to 2017, with a higher growth rate, indicating the growing interest in the topic. This distinct separation coincides with the publication of the World Health Organization (WHO) guidelines for indoor air quality [2].
Almost 30% of the publications in the dataset were published by five scholarly journals (2102, 28.45%), each one accounting for about 350 publications. This trend is reported in Figure 2.
The top 15 contributing institutes are listed in Table 1. The University of California (3.71%) ranked first, followed by two Chinese universities: Tsinghua University (2.64%) and Hong Kong Polytechnic University (2.56%). Among the top 25 institutions, 11 are American, followed by 7 European and 3 Chinese. Overall, more than one-quarter of the total records are from the USA, followed by the People’s Republic of China (PRC) (14.12%).
The top 10 contributing countries, in terms of publications, are reported in Table 2. In first place, the USA accounts for 27.5% of the total literature, with 2032 records, followed by China with 1043 records (14.12%). England, Canada, and South Korea ranked third, fourth, and fifth, respectively, with a cumulative number of publications equal to the total number of publications in 1990, followed by four European countries and Australia.
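The percentage shares quoted in this section follow directly from the record counts and the dataset size of 7389 records. As a quick arithmetic check, the short Python sketch below recomputes them; the variable names are illustrative, and only counts stated in the text are used.

```python
TOTAL_RECORDS = 7389  # size of the final dataset

# Record counts as stated in the text.
counts = {
    "top five journals": 2102,
    "USA": 2032,
    "China": 1043,
}

# Share of the full dataset, in percent, rounded to two decimals.
shares = {name: round(100 * n / TOTAL_RECORDS, 2) for name, n in counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```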
The top 10 categories are reported in Table 3. However, only the first four constitute the major fields. In order of importance, they are: “engineering”, “environmental sciences ecology”, “construction building technology”, and “public environmental occupational health”.
3.2. Categories and Journal Co-Occurring Networks
The network of co-occurring subject categories (Web of Science categories), reported in Figure 3, highlights the relationships between the main subjects and disciplines in the field. The thickness of each link represents the density of the co-occurring category and the color map refers to the average year of the node. The color of a category ring denotes the time of corresponding utilization. The thickness of a ring is proportional to how many times the category has been used in a specified time slice. Lighter colors (yellow) correspond to newer nodes, while darker colors (blue) are related to older nodes. This method enabled us to highlight the multidisciplinary nature and temporal evolution of the subject. We found that environmental sciences and engineering, construction and building technology, and public, environmental, and occupational health were the main subjects in the IAQ field. Minor categories, constituted mainly by unlabeled nodes because of their lower counts, have been grouped into larger areas. “Material science” is a linking node between chemistry, physics, and environmental studies, and “public, environmental, and occupational health” belongs to the highest burst, as it connects the most populous nodes (construction and building engineering with environmental sciences).
In order to outline the set of journals that are connected to the IAQ topic, the co-citation network at the journal level is shown in Figure 4. As in the previous figure, color and thickness reflect the temporal distribution of the cited journals. It was possible to categorize the journals into four macro areas: medicine-related (center), energy-related (left), building-related (bottom), and environment-related (top). The biggest circles correspond to the most-cited journals. The most-cited journals were the ones reported in Figure 2 but, while “Building and Environment”, “Atmospheric Environment”, and “Indoor and Built Environment” are the top three journals in terms of number of publications, the most cited were “Indoor Air”, “Energy and Buildings”, and “Environmental Health Perspectives”. Unlabeled nodes are minor journals in terms of number of citations.
3.3. Term Burst
Citation burst is an indicator of the most active part of the research. CiteSpace’s citation burst is based on Kleinberg’s algorithm [3], and it provides evidence that a particular publication (or keyword) has attracted an extraordinary degree of attention from its scientific community. The keywords in the literature can reveal the main research content, and literature citation frequency can reflect how active a research topic is. The terms (from titles, abstracts, and keywords) with the strongest citation bursts in the dataset are reported in Table 4. The time intervals are represented by the blue line, while the periods of the burst are highlighted in red, indicating the beginning and end of each burst interval. The full list is reported in the Supplementary Materials.
From Table 4, the initial “IAQ” burst was coincident with the “sick building syndrome”, “office workers”, “environmental tobacco smoke”, and “outdoor level” term bursts. The term “sick building syndrome” (SBS) describes situations in which building occupants experience acute health and comfort effects that can be linked to the time spent in a building. The first studies were conducted in offices [4,5], in which tobacco smoke was the major pollutant [6,7], as the smoking ban in workplaces and public places had not yet been implemented, together with contaminants from the outdoor air. During that time, several specific pollutants, like nitrogen dioxide (NO2), fungal spores, and airborne bacteria, showed strong citation bursts over the 1996–2011 period. Notably, volatile organic compounds (VOCs) appeared with burst strength lower than the threshold (arbitrarily set to 7.0, which is equal to the median value of the citation burst distribution) in different time intervals; for this reason, they are not reported in Table 4.
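The reporting rule described above, keeping only terms whose burst strength reaches the median of the burst-strength distribution, can be sketched in a few lines of Python. The term labels and strengths below are invented purely for illustration and are not the values behind Table 4; only the median-as-threshold rule comes from the text.

```python
from statistics import median

# Hypothetical burst strengths, invented for illustration only.
burst_strength = {
    "term_a": 12.4,
    "term_b": 9.1,
    "term_c": 7.0,
    "term_d": 5.3,
    "term_e": 3.8,
}

# Threshold set to the median of the burst-strength distribution.
threshold = median(burst_strength.values())

# Keep terms at or above the threshold, strongest first.
reported = sorted(
    (term for term, strength in burst_strength.items() if strength >= threshold),
    key=burst_strength.get,
    reverse=True,
)
print(threshold, reported)
```

A median cut-off keeps roughly half of the bursting terms by construction, which is why a widely studied term can still fall outside the reported table if its individual bursts are weak.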
All citation bursts that are still active started after 2013. The introduction of other physical and psychological aspects of indoor life led to the definition of Indoor Environmental Quality (IEQ), of which IAQ is one component. In this last time interval, the terms with a strong, still active burst are related to fine particulate matter (PM2.5), carbon dioxide (CO2), and “energy saving”. The “residential buildings” term also appears, which may indicate a shift of academic attention from the workplace to private homes, together with increased attention to building energy saving and the thermal comfort of occupants.
3.4. Document Co-Citation Network
A document co-citation network represents a network of references that have been co-cited by a set of publications. Time was divided into a number of one-year slices, and an individual co-citation network was derived from each time slice. In order to reduce the dimension of every single slice, the top 50 most-cited publications in each year were used to build a network of cited references in that particular year. Subsequently, individual networks were merged. The merged network reported in Figure 5 depicts a spatial visualization of the network, which represents the development of the IAQ topic over time, showing the most important footprints of the related research activities. Each colored node represents a cited reference.
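The slice-and-merge procedure described above can be illustrated with a small Python sketch. It is a simplified reading under stated assumptions, not CiteSpace's implementation: the function name, the input format (a list of year and reference-list pairs), and the way merged edge weights are summed are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_network(papers, top_n=50):
    """Build a merged co-citation network from one-year slices.

    `papers` is a list of (year, cited_references) pairs. Within each
    year, only the `top_n` most-cited references are kept; every pair
    of kept references cited together by one paper adds one to that
    edge, and the per-year edge counts are summed into one network.
    Ties at the top_n boundary are broken arbitrarily in this sketch.
    """
    by_year = {}
    for year, refs in papers:
        by_year.setdefault(year, []).append(set(refs))

    merged = Counter()
    for ref_sets in by_year.values():
        citations = Counter(ref for refs in ref_sets for ref in refs)
        kept = {ref for ref, _ in citations.most_common(top_n)}
        for refs in ref_sets:
            for a, b in combinations(sorted(refs & kept), 2):
                merged[(a, b)] += 1
    return merged


# Toy records with hypothetical reference labels.
papers = [
    (2010, ["A", "B", "C"]),
    (2010, ["A", "B"]),
    (2011, ["B", "C"]),
]
network = cocitation_network(papers, top_n=2)
print(network)
```

In the toy input, reference C is cited only once in 2010 and is dropped from that slice, so it enters the merged network only through its 2011 co-citation with B.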
The network in Figure 5 is divided into 19 co-citation clusters. These clusters are labeled by index terms from their own citers. The numbers in front of the cluster names are identifiers that rank the clusters by size. The “silhouette” value is a descriptor of the homogeneity of a cluster, and it ranges between −1 and 1; higher values indicate meaningful clusters [8], while the “modularity” measures the extent to which a network can be divided into independent blocks, and it ranges from zero to one [9]. The network has a modularity of 0.7437, which is considered to be high, suggesting that the specialties in IAQ are clearly defined in terms of co-citation clusters. The average silhouette score of 0.2579 is low mainly because of the presence of numerous small clusters. Different colors indicate the time when co-citation links in those areas appeared for the first time. Purple areas were generated earlier than yellow areas. Important publications have been reported in the network visualization.
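To make the modularity score more concrete, the sketch below computes it for a toy graph in plain Python. It assumes the standard Newman definition for an undirected, unweighted graph; the source does not state which variant CiteSpace uses, so the formula, the function name, and the toy graph are illustrative assumptions.

```python
def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m - (d_c / (2 * m)) ** 2,
    where m is the number of edges, L_c the number of edges with both
    ends inside c, and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    inside = {}  # edges fully inside each community
    degree = {}  # summed node degree per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(
        inside.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items()
    )


# Toy graph: two triangles joined by a single bridge edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
community = {1: "x", 2: "x", 3: "x", 4: "y", 5: "y", 6: "y"}
q = modularity(edges, community)
print(round(q, 4))
```

Placing all six nodes in one community gives a score of zero under this definition, which is why a value such as 0.7437 is read as evidence of clearly separated clusters.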
The importance of clustering publications lies in identifying the most important thematic macro areas and seeing how they are related to each other. Cluster descriptors in Figure 5 are reported in Table 5. The “size” column refers to the number of publications within the cluster. The “name” tag was generated by the Log-likelihood algorithm from the articles’ indexing terms [10]. The Log-likelihood algorithm was chosen as it better reflected the cluster topic. The “description” column was manually filled after analyzing the top articles of each cluster having high coverage value. The “range” is the period in which the cluster evolves, or the period covered by the various publications within the cluster. A straightforward visual representation of this parameter is reported in Figure 6.
Cluster #0 is the largest cluster, containing 145 publications and having a silhouette value of 0.866. It is labeled as “nasal lavage”. The publications with the highest coverage in this cluster are by Wieslander [11,12], followed by Brooks [13] and Nordstorm [14]. They all investigated the health effects (nasal and ocular) of sick building syndrome in hospitals and public facilities. This cluster collects the first comprehensive studies of the relationship between built environments, characterized by monitoring chemical and physical parameters (pollutant concentrations, temperature, and humidity), and the symptoms of indoor occupants. Publications ranged between 1980 and 1999.
The second largest cluster (#1) has 125 members and a silhouette value of 0.725. Although this is the lowest value among all clusters, it still indicates a relatively high level of homogeneity. The cluster is labeled “building product”. This second cluster is related to factors connected to the presence of gaseous pollutants. The top five topics, in order of citations, are: ozone [15], VOC emission from building products [16], NOx and O3 concentration [17], terpene/ozone mixtures [18], and carbonyl compounds [19]. It ranges from 1973 to 2010.
The third largest cluster (#2) has 109 members and a silhouette value of 0.766. It is labeled “personal exposure”. It is representative of personal exposure to particulate matter (PM2.5 and PM10). Articles with the highest coverage are by Bahadori [20], Janssen [21], and Micallef [22]. This cluster started after the previous one but ended more recently. It covered 1981–2015.
The fourth largest cluster (#3) has 105 members and a silhouette value of 0.753. It is labeled “indoor air quality”. This cluster collects studies of indoor pollution in schools. While the whole population is vulnerable to air pollution, children are the most at risk. This is a topic of current interest, running from 1992 to the present. The measurement of pollutant concentrations [23,24,25] and risk assessments [26,27,28] are the main topics in this cluster.
The fifth largest cluster (#4) has 63 members and a silhouette value of 0.908. It is labeled “macrocyclic trichothecene mycotoxin”. The trichothecene mycotoxins are a group of toxins produced by some fungi. Some of these substances may be present as contaminants in mold and transported through aerosols. This cluster is representative of bacterial and fungal aerosols, as well as the role of environmental factors in asthma. This cluster extends from 1958 to 2011.
The sixth largest cluster (#5) has 43 members and a silhouette value of 0.884. It is labeled “thermal comfort” and includes research papers on thermal personal comfort, Heating, Ventilation and Air Conditioning (HVAC), and the effect of relative humidity. This cluster extended from 1970 to 2016.
The seventh largest cluster (#6) has 41 members and a silhouette value of 0.916. It is labeled as “particle exposure” and collects publications about environmental tobacco smoke. This cluster started in 1982 and ended in 2013, probably due to the implementation of non-smoking laws, which reduced exposure to these pollutants.
Figure 6 shows the timeline visualization in CiteSpace, in which clusters are distributed along the horizontal timeline, reported in the top view. In black are the top two items ranked by centrality, while the colors represent the top three most cited articles. The full list of articles, ordered by coverage for each cluster, is reported in the Supplementary file.
3.4.1. Most-Cited Articles
The top-ranked item by citation counts is a comprehensive review of the relationship between indoor air pollution and health by Jones [29] in Cluster #6, with 262 citation counts. The second most-cited article is by Klepeis [30] in Cluster #2, with 261 citation counts; it presents the National Human Activity Pattern Survey (NHAPS), a two-year probability-based telephone survey sponsored by the U.S. Environmental Protection Agency (EPA) to assess population exposure to environmental pollutants. The third most-cited article is by Daisey [31] in Cluster #3, with 233 citation counts. In this review, data about ventilation rates, pollutant concentrations, and symptoms related to indoor air contaminants in schools were collected. The results highlighted the limited knowledge on the topic, and the review was a starting point for all current research on IAQ in schools, as shown in Figure 6, Cluster #3. The fourth is the “Indoor Air Quality Guidelines” by the World Health Organization (WHO) in 2010 [2] (Cluster #3), with 151 citation counts. The guidelines selected benzene, carbon monoxide, formaldehyde, naphthalene, nitrogen dioxide, polycyclic aromatic hydrocarbons, radon, trichloroethylene, and tetrachloroethylene as the indoor pollutants of concern. The choice was made for three reasons: (1) these compounds have indoor sources; (2) they are known to be hazardous to health; and (3) they are often found indoors in concentrations of health concern. Particulate matter (PM) was excluded from the list, as it is covered by the WHO guidelines on particulate matter updated in 2005 [32], which also apply to indoor spaces. Mendell [33] in Cluster #3, who reviewed the literature on school environments and performance, had the same number of citations.
3.4.2. Citation Bursts
As previously reported, citation bursts are a strong indicator of scholarly impact in terms of the attention of the research community. Table 6 reports the first five citation bursts. The top-ranked item by citation bursts is the WHO guideline for indoor air quality [2], in Cluster #3. The second is a multidisciplinary review of the scientific literature on ventilation rates and occupant health by Sundell [34] and the third is by Klepeis [30]. Following these are two reviews, one about the relationship between indoor and outdoor particles by Chen et al. [35] and one about formaldehyde by Salthammer [36]. These first five publication bursts belong to Clusters #2 and #3. Clusters with numerous nodes with strong citation bursts can be considered an emerging trend; for this reason, Cluster #3 is of particular interest.
3.4.3. Pivotal Points
A high betweenness centrality score, as defined by Freeman [38], indicates how strongly a reference connects references associated with two or more clusters. Centrality is normalized to the unit interval of [0, 1]. The sigma score (Σ) of a node is a composite metric of the betweenness centrality and the citation burstness of the node, computed as Σ = (centrality + 1)^burstness [9].
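As a worked illustration of the sigma metric, the Python sketch below assumes the commonly cited CiteSpace definition, Σ = (centrality + 1)^burstness, and inverts it to show what burstness a given pair of centrality and Σ values would imply. The formula is an assumption taken from the CiteSpace literature rather than from a derivation in this article, and the function names are hypothetical. The input values are the centrality and Σ scores quoted for the three pivot nodes discussed in this section.

```python
import math


def sigma(centrality, burstness):
    """Sigma as (centrality + 1) ** burstness (assumed definition)."""
    return (centrality + 1.0) ** burstness


def implied_burstness(centrality, sigma_value):
    """Invert the assumed formula to recover burstness from sigma."""
    return math.log(sigma_value) / math.log(centrality + 1.0)


# (centrality, sigma) pairs as quoted in the text for three pivot nodes.
pivots = {
    "Brown": (0.14, 3.22),
    "Wallace": (0.13, 9.05),
    "Bornehag": (0.06, 2.47),
}
for name, (c, s) in pivots.items():
    b = implied_burstness(c, s)
    print(f"{name}: implied burstness = {b:.2f}, sigma check = {sigma(c, b):.2f}")
```

Under this definition a node with zero centrality has Σ = 1 whatever its burstness, so the metric only rewards bursts on references that also bridge clusters.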
The 1994 Brown paper (centrality = 0.14, Σ = 3.22) [39] is a typical pivot node; it is a strong contact point mainly between the two biggest clusters, and to a lesser extent between Cluster #8 and Cluster #10, as shown in Figure 5. This review systematically compared the concentration of VOCs in the indoor air of buildings of different classifications and categories, drawing mainly from the articles in Cluster #0 (1980–1999). Results suggested that indoor concentrations were significantly elevated above those outdoors, indicating that the VOCs were emitted from indoor sources. Source emission and modeling were subsequently explored in more depth by the publications gathered in Cluster #1 (as reported in Table 5).
Contemporary with the work of Brown on VOCs is the work of Wallace (centrality = 0.13, Σ = 9.05) [40], in which particle concentrations and sources in homes and buildings were summarized in detail. The conclusions identified tobacco smoking as the leading source of indoor PM, followed by cooking. Many links to Cluster #9 (topic: environmental tobacco smoke) highlight this aspect. Unlike in the previous study, PM infiltration from outdoors was found to contribute significantly to indoor PM. These first two papers have the highest centrality for Clusters #1 and #2, respectively.
The work of Bornehag (centrality = 0.06, Σ = 2.47) [41] is part of Cluster #5 (topic: “thermal comfort, relative humidity, HVAC”). In this study, the relationship between “dampness” in buildings and health was considered, concluding that there is evidence for a strong association between them. Dampness is usually related to the presence of microbial agents, such as airborne molds and bacteria (supporting the closeness to Cluster #4). However, due to the limited knowledge about the mechanisms behind the association between “dampness” and health effects, it is difficult to intervene to limit this problem, unlike for other known major pollutants.
3.5. Actual and Future Trends
In the second part of the analysis, a restricted dataset (from 2010 to 2018) was used. The results of the selection are reported in Table 7. For each year slice, in the “Criteria” column, the interpolated citation, co-citation, and the cosine coefficient thresholds are reported [42]. The “Space” column reports the number of articles having at least one citation within the related year-slice.
In this last period, the most active institutions are reported using a geographical distribution map, which was created using “Generate Google Earth Maps” in CiteSpace (Figure 7). The colors on the map show the institutions that published papers about IAQ during the last eight years. The most active areas are colored in red. Notably, this map only highlights the quantity and not the quality of the publications.
Figure 7 shows that countries and territories in Europe, the Northeast U.S., and East Asia participated actively in IAQ research in recent years. The results of the current geographical map were consistent with the total contribution by country, reported in Table 2, which referred to the complete dataset.
Articles with high citation bursts in the development of IAQ between 2010 and 2018 are reported in the following list, ordered by ascending initial burst date. References with strong values in the “Strength” column can be considered relevant in the last eight years for the IAQ field. The “Topic” column summarizes the content of the publication. The “Begin” and “End” columns report, respectively, the beginning and end of the citation burst.
The first milestone, in the last eight years of study, is the IAQ guidelines published by the WHO. Notably, 10 of the 25 publications reported in Table 8 are related to the school environment, and this constitutes a major topic. Another topic common between articles having a strong citation burst is the relationship between indoor and outdoor pollutants, followed by studies on ventilation.
4. Discussion
The categories and journals co-occurrence analysis showed the multidisciplinary nature of the IAQ topic and the most relevant categories. Although only a few journals are now the most active, the network of co-occurring journals reported an articulated structure, groupable into four macro thematic areas: Medicine, energy, buildings, and environments. Temporal patterns showed IAQ is a theme that has changed over time, often due to new external stimuli (different sources of pollution, different environments, and new emerging countries), and the result of new laws enacted, due to a greater awareness of the topic, such as non-smoking laws. Cluster analysis found 15 well-defined clusters, of which two (Cluster #2, “personal exposure” and Cluster #5, “thermal comfort”) are still active.
This study, although not comparable to a systematic and in-depth study of the issue, has allowed us to delineate the past and present of IAQ research. The analysis of when the meaningful terms had a citation burst, together with the study of the cluster timeline, shows how the concept of IAQ has blossomed, starting from a medical point of view, i.e., from the study of the symptoms of SBS in places of work. Over the years, the focus has shifted to the characterization of pollutants and risk assessment. With the introduction of the ban on smoking in public places and workplaces, one of the prevailing indoor sources has been greatly reduced.
At the same time, academic attention initially focused on workplaces, and then moved to public buildings, especially schools and hospitals. The implementation of non-smoking laws has reduced the focus on the Environmental Tobacco Smoke (ETS), despite the recent introduction of electronic cigarettes on the market.
Although many studies have been completed on indoor and outdoor sources of pollutants and on risk assessments of exposure to poor air, and although some of these sources (ETS, for example) have been abruptly reduced in many parts of the world, IAQ remains important.
The field continues to evolve under several pressures: the presence of new sources, ignored until a few years ago as niche or because they did not exist (3D printers, electronic cigarettes); the need to make workplaces as appropriate and pleasant as possible (by thinking of a broader concept of Indoor Environmental Quality, in which more environmental parameters are considered to increase productivity); the ubiquitous use of newly available technologies (low-cost sensors and DNA sequencing techniques); and the new frontiers in indoor chemistry and the transformation of indoor pollutants. These are just a few examples of the challenges that the scientific community working on IAQ has encountered in recent years, without considering the research based on non-major fields. These challenges range from the impact of clothing on exposure to smart technologies, and from the study of phytoremediation and materials able to passively improve air quality [63,64,65] to new technologies and purification processes [46,64,66,67,68]. The monitoring of indoor airborne pollutants is a necessary step for assessing personal exposure to pollutants not previously considered. Many of the chemicals presently found in indoor environments were not present in the past, and concentrations have varied over time due to the use of different building materials, new consumer products, electrical appliances, and cleaning products. For this reason, new monitoring campaigns, as well as new sampling methods, are required. Passive samplers are popular and convenient for distributed and long-term exposure assessment, but they cannot provide a short time-resolved picture of the indoor air, which can instead be obtained by using more expensive analyzers. Widespread low-cost sensors are a valuable resource that could be coupled with well-established monitoring techniques because, although they have recently improved in power consumption, sensitivity, and resolution, they still have problems with selectivity. At the time of writing, only laser scattering-based sensors for PM measurement provide acceptable results [69,70].
The collection of big data obtained from ubiquitous sensor networks [69,71,72] requires determining how to process and extract useful information from the raw data, and semantic frameworks can be a useful tool to address this challenge [73].
From another point of view, 1.2 billion people from developing countries are without access to electricity, while nearly 3 billion people worldwide are exposed to the threat of household air pollution every day from the use of solid fuel for cooking, heating, and lighting [74,75].
These are only a few of the aspects the indoor air community has to consider. Scientometric tools can be useful to analyze, categorize, and identify milestones and turning points in the scientific literature. However, these tools cannot be considered exhaustive, and scientific advances will probably be driven by new cross-fertilization from other disciplines.