Article

Design and Implementation of a Crowdsensing-Based Air Quality Monitoring Open and FAIR Data Infrastructure

by Paolo Diviacco *, Massimiliano Iurcev, Rodrigo José Carbajales, Alberto Viola and Nikolas Potleca
National Institute of Oceanography and Applied Geophysics, Borgo Grotta Gigante 42/C, 34010 Sgonico, Italy
* Author to whom correspondence should be addressed.
Processes 2023, 11(7), 1881; https://doi.org/10.3390/pr11071881
Submission received: 6 May 2023 / Revised: 11 June 2023 / Accepted: 20 June 2023 / Published: 23 June 2023
(This article belongs to the Special Issue Air Quality Monitoring for Smart Cities and Industrial Applications)

Abstract

This work reports on the development of a real-time vehicle sensor network (VSN) system and infrastructure devised to monitor particulate matter (PM) in urban areas within a participatory paradigm. The approach is based on the use of multiple vehicles on which sensors, acquisition and transmission devices are installed. PM values are measured and transmitted using standard mobile phone networks. Given the large number of acquisition platforms needed in crowdsensing, the sensors need to be low-cost sensors (LCS). This sets limitations in the precision and accuracy of measurements that can be mitigated using statistical methods on redundant data. Once data are received, they are automatically quality controlled, processed and mapped geographically to produce easy-to-understand visualizations that are made available in almost real time through a dedicated web portal. There, end users can access current and historic data and data products. The system has been operational since 2021 and has collected over 50 billion measurements, highlighting several hotspots and trends of air pollution in the city of Trieste (north-east Italy). The study concludes that (i) this perspective allows for drastically reduced costs and considerably improves the coverage of measurements; (ii) for an urban area of approximately 90 square kilometers and 200,000 inhabitants, a large quantity of measurements can be obtained with a relatively low number (5) of public buses; (iii) a small number of private cars, although less easy to organize, can be very important to provide infills in areas where buses are not available; (iv) appropriate corrections for LCS limitations in accuracy can be calculated and applied using reference measurements taken with high-quality standardized devices and methods; and (v) by analyzing the dispersion of measurements in the designated area, it is possible to highlight trends of air pollution and possibly associate them with traffic directions.
Crowdsensing and open access to air quality data can provide very useful data to the scientific community but also have great potential in fostering environmental awareness and the adoption of correct practices by the general public.

1. Introduction

The World Health Organization defines air pollution as the “contamination of the indoor or outdoor environment by any chemical, physical or biological agent that modifies the natural characteristics of the atmosphere” [1]. Epidemiological evidence suggests that polluted air is one of the leading factors associated with the development of respiratory illness, cardiovascular disease and lung cancer [2]. At the same time, air pollution directly and indirectly affects the climate and damages buildings and cultural heritage [3,4]. Many countries have introduced specific legislation setting strict objectives for air quality. In the United States, this was implemented in the 1970s with the Clean Air Act [5] and later in Europe with the 2008/50/EC directive [6]. Although air quality has, in general, improved considerably since then, there are still several hotspots of air pollution in most Western countries [7,8]. Recently, several analysts have highlighted the risk that, owing to the current geopolitical situation and the shortage of natural gas, an increased use of solid fuels will worsen the situation [9]. Indeed, while the combustion of natural gas contributes to the formation of smog and acid rain, its particulate matter (PM) emissions are generally low [10]. In contrast, liquid and solid fuel combustion produces large quantities of PM and high concentrations of sulfur and heavy metals [11].
Air quality monitoring is generally performed by government agencies using standardized methods and devices at specific fixed locations in order to have reliable long time series that could be considered as reference measurements. The position of such monitoring stations is linked to the specific problem to be considered, be it traffic congestion in an urban area or an industrial site or any other issue that could be relevant to public health.
Reference methods are intrinsically expensive and need well-trained personnel. As a consequence, there are limitations in the possible number of stations to install. In order to reconstruct the geographic distribution of the quality of air in an area, measurements at the sparse stations can be inter/extrapolated using statistical [12] or modeling techniques [13]. Although generally very accurate, these methods can be problematic where high gradients are present. In such cases, the possibility to increase the geographic and temporal coverage and resolution of phenomena could be very helpful.
In this perspective, a new paradigm can be introduced that has already been successfully applied in several scientific fields. Since the seminal work of Irwin [14], a large number of initiatives have, in fact, flourished that aim at enrolling resources from outside the scientific community and employing them within several research activities [15,16,17,18,19]. This new approach is generally referred to as citizen science, although slightly different definitions may be more suited to each specific application. In this work, we prefer to use the term ‘crowdsensing’, which refers to a technique where a large number of volunteers offer their help in acquiring and sharing measurements taken with devices or software provided by the project designer. Within this perspective, a large literature exists on the use of air quality sensors [20] for several air quality parameters [21,22,23], both indoors [24,25,26] and outdoors [27,28]. Within this work, we will focus on mobile crowdsensing [29], where PM acquisition devices are installed on vehicles such as cars, vans or public buses. As a result, the availability of a large number of vectors has the potential to radically increase the amount of data available and improve geographic and temporal coverage and resolution.
Moreover, participative research activities such as crowdsensing and citizen science can deliver benefits well beyond scientific outcomes. How the general public is informed about environmental topics, such as climate change or air quality, matters: if the discussion remains confined within the scientific community, it is very unlikely that the general public will be able to take informed actions to mitigate those phenomena. Effective and correct communication via the mass media and the Internet is therefore necessary to spread correct messages. Unfortunately, this is often not the case, because mass media and, in particular, social media may at best be partial, if not altogether manipulative.
In contrast, active participation in research activities by volunteers, together with the possibility to freely and easily access reliable data and information on environmental issues, can have a wide range of positive effects. These can span from an increase in trust in the scientific community, to the improvement of awareness and engagement of citizens in environmental issues [30], up to their empowerment in steering political and economic decisions [31,32]. While improved subject-matter knowledge and stronger scientific literacy are generally easy to trace as outcomes of participative research, the actual impact of such initiatives on policy making is not always easy to fully understand and is sometimes a matter of debate [33,34].
To support citizen science and crowdsourcing activities in the field of environmental monitoring, several, often intermingled, aspects have to be considered. Each of them could, on its own, be the topic of a considerable analysis. In previous works [35,36,37,38,39], we have considered some of these aspects and the scientific results that we were able to obtain by exploiting them. Here, we describe the technological aspects of our work in the hope that our experience could prove beneficial to others intending to replicate or possibly improve what we have been able to build so far.

2. Materials and Methods

Within this work, we will describe a system and infrastructure we developed that, leveraging the crowdsensing paradigm, allows the monitoring of air quality and represents its geographic distribution on a web portal in real time. The initiative is named “COCAL” after the dialectal term used for seagulls in the city of Trieste (Italy), where it has been developed and first deployed. The reason for using a seabird name comes from the fact that Trieste is a coastal city where applications of the crowdsensing paradigm can be envisaged in multiple environments. As a matter of fact, the first trials of COCAL were focused on monitoring marine parameters such as temperature, pH, salinity and dissolved oxygen. Further information on such trials can be found in [35,36].
Trieste is located at the north-east tip of the Adriatic Sea and occupies an NW-SE trending elongated area of approximately 90 square kilometers between the sea and the Karst plateau, which acts as a barrier to air masses (Figure 1). The city center lies in a restricted area that is characterized by economic activities linked to the tertiary sector and tourism and separated from an industrial area located in an SE sector. The wind regime is characterized in winter by a strong NE wind called Bora, which can effectively move polluted masses from the city to the sea and other nearby regions [13,40]. In summer, the sea breezes are the prevalent factor conditioning the behavior and position of the polluted air masses [41].
Within this work, we focus on monitoring PM only, but we are currently working on extending the method to other pollutants. It is worth highlighting that all the data acquired within this initiative and in other crowdsensing initiatives are gathered in an integrated database and managed within a fully FAIR-compliant perspective following international standards as mandated by ISO and OGC.

2.1. PM Sensors

Given the importance in crowdsensing of using a large number of acquisition devices, it is evident that increasing their number will inevitably imply increasing the overall cost of the initiative. Under this approach, in fact, it is not possible to use conventional PM monitoring techniques such as filters and gravimetric mass detection, which are very expensive and based on standardized procedures that can only be performed by trained personnel. Therefore, low-cost sensors (LCS) are needed. New technologies have emerged that use laser scattering, which relates the waveform of the scattered light to the diameter and number of particles, enabling real-time and continuous measurements of particulate matter.
A detailed description of the technologies behind PM sensors is beyond the scope of this work and can be found in other works such as, for example, [42,43]. Suffice it to say that these PM sensors consist of a fan, generally connected to a small tube, that pushes air into the sensing box. Light from a laser diode is scattered by the particles. This scattered light is received by a photodiode, which can estimate the concentration of each type of particle by classifying and counting the number of pulses detected.
Within COCAL, we use the SDS011 PM sensor from Nova Fitness Co., which enables simultaneously measuring both PM2.5 and PM10 levels at a very low cost.

2.1.1. LCS Performances

The major advantages of LCS in terms of price and portability come at the expense of limitations in precision and accuracy [43,44,45].
Many LCS manufacturers and models are available on the market, and detailed comparisons between them can be found, for example, in [12,24,42]. These works highlight that, in addition to the limitations in precision and accuracy, it is very important to consider the environmental conditions in which LCS sensors operate. For example, because these sensors do not have sample conditioning equipment, they are susceptible to drifts due to relative humidity (RH), which can affect the hygroscopic growth of particles and distort measurements [27,46]. The results of these studies demonstrate that among LCS, the issue of quality and precision of the specific brand and model of sensors can be as relevant as the intrinsic limitations of the technology employed, the issues related to the deployment in the designated environment and the environmental conditions.
In this perspective, to monitor the environmental conditions in which the acquisition takes place, together with the PM LCS, we also use a Dallas Semiconductor DS18B20 one-wire communication sensor with waterproof protection outside the acquisition box, while internally, we use a Bosch BME280 temperature, pressure and RH sensor connected to the board via the I2C bus.
In [42], useful references can be found to understand the performance of the SDS011 sensor and other similar sensors under controlled laboratory conditions. The results of that work confirm that the SDS011 sensor is suitable for use within the COCAL project since it performs reasonably well in comparison with similar or even more expensive sensors. At the same time, downsides have been identified such as a general trend to underestimate PM values and the presence of a delay in the timing of measurements.

2.1.2. Statistical Analysis

Following [44], it is particularly important to understand the behavior of the LCS used in this work under real-world conditions. In order to devise possible mitigation strategies for the issues introduced by the use of LCS, following the protocols suggested by US EPA and [43], we designed two experiments, namely: (i) a study of the behavior of LCS in a highly variable PM level environment to evaluate precision and (ii) a co-location-based evaluation of LCS with a reference measurement station managed by the regional environmental agency ARPA-FVG, in order to evaluate accuracy. It should be noted that during these tests, the data completeness of the COCAL system, meaning the ability to avoid gaps in measurements and data transmission, has always been very high, with almost negligible glitches and well above the 75% threshold recommended by US EPA.

LCS Precision

To assess the precision of LCS, we studied the recordings of three co-located LCS in a highly variable PM concentration environment. Following US EPA standards and procedures mentioned by [43], precision can be estimated using the standard deviation (SD) and the coefficient of variation (CV). The SD is 2.15 μg/m3 for PM10 and 1.17 μg/m3 for PM2.5, while the CV is 24.70% for PM10 and 22.77% for PM2.5.
US EPA recommends a target SD less than or equal to 5 μg/m3 and a CV less than or equal to 30%. The tests we conducted therefore indicate that the LCSs used in COCAL meet the recommendations for precision.
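A minimal sketch of how these two precision metrics could be computed is given below. It assumes a pooled SD in which every reading is compared with the mean of all co-located sensors for the same period, and a CV obtained by dividing that SD by the overall mean. The exact formulation used for the values above is not stated here, and the function name and input layout are illustrative only.

```python
import math

# Target values quoted in the text (US EPA recommendations)
SD_TARGET = 5.0    # ug/m3
CV_TARGET = 30.0   # percent


def precision_metrics(readings):
    """Pooled SD and CV for co-located sensors.

    readings: list of periods; each period is a list with one value per
    co-located sensor (ug/m3). Layout and name are assumptions.
    """
    n_values = sum(len(period) for period in readings)
    squared_dev = 0.0
    total = 0.0
    for period in readings:
        period_mean = sum(period) / len(period)
        # deviation of each sensor from the mean of all sensors in the same period
        squared_dev += sum((x - period_mean) ** 2 for x in period)
        total += sum(period)
    sd = math.sqrt(squared_dev / (n_values - 1))
    cv = 100.0 * sd / (total / n_values)
    return sd, cv


# Two periods, three co-located sensors each (made-up numbers)
sd, cv = precision_metrics([[10.0, 12.0, 11.0], [20.0, 22.0, 21.0]])
meets_targets = sd <= SD_TARGET and cv <= CV_TARGET
```

With real data, each inner list would hold the simultaneous readings of the three co-located sensors for one averaging period.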

LCS Accuracy

To test the accuracy of the selected sensors, we placed a COCAL box at a short distance from a certified reference air quality station (ARPA-FVG station ‘Rosmini’). The reference PM values of this station are made available through an API on the official ARPA-FVG website as daily average values only. The measurements and comparison took place from mid-March 2022 to the end of April 2023.
In Figure 2, one can compare measurements from the reference station and those taken during the same period with the COCAL box located close to the reference station.
Figure 2 is divided into two boxes. The upper box shows data and statistics for PM10, while the lower one focuses on PM2.5. In each box, the first graph (n.1) shows in red the time series of the reference daily average values as made available from the ARPA-FVG website, while the daily average values of the COCAL box located close to the reference station are plotted in blue.
High PM values were measured at the end of July 2022. Unfortunately, these were not outliers but the effects of a large forest fire that occurred for several days at a distance of about 20 km from the test site.
It can be seen that COCAL measurements are generally lower than the reference measurements; however, during the first months of 2023, in three specific events (identified by boxes A, B and C), this behavior reverses. To better understand the performance of LCSs, Figure 2 also shows the difference between the reference and the COCAL time series (graph n.2), the RH time series (graph n.3) and the time series of the standard deviation of all COCAL measurements acquired near the reference station, calculated on a daily basis.
It is interesting to note that during the A, B and C events, the standard deviation of COCAL measurements increases. In two of these cases (A and C), this can be attributed to the sensitivity of the LCS to RH. In fact, the RH in those periods exceeded 60%, while elsewhere, when the LCS underestimated the reference measurements, the RH remained below this threshold. In the case of event B, however, where RH was low, a different explanation is needed. This could lie in the different technologies and sampling rates of the LCS and the reference acquisition system. COCAL boxes acquire data every ten seconds, which means that rapid variations in the actual PM concentrations can effectively be captured, which at the same time increases the standard deviation of the set of daily measurements. Reference systems sample much more slowly; even if very reliable, they can overlook rapid phenomena, so that, in the comparison, the LCS data appear to be dispersed.
Following US EPA standards and procedures mentioned by [43], accuracy can be estimated using the coefficient of determination (R2), slope (m), intercept (b), root mean square error (RMSE) and the normalized root mean square error (NRMSE). Results for the mentioned tests for PM10 and PM2.5 are shown in Table 1, while Figure 3 provides a snapshot of the comparison of the LCS and reference measurements.
Considering US EPA recommendations, these results can be problematic. In fact, considering R2, this parameter is recommended to exceed 0.70, while the analysis reveals lower values. The target slope should be approximately 1 ± 0.35, a condition that is instead respected by the LCSs. Similarly, the intercept parameter performs relatively well for PM2.5 sensors, while PM10 sensors do not fall within the recommended range since EPA recommends a value between −5 and +5. Following EPA standards, RMSE should be lower than 7 μg/m3, and again, also here, PM2.5 scores rather well, whereas we measured a value twice the threshold for PM10. In addition, unfortunately, the NRMSE results are also too high, being around 60%, while EPA recommends a value less than 30%.
It can therefore be said that accuracy-wise, the LCSs perform rather poorly, both for PM10 and PM2.5, and that a correction mechanism is needed to obtain results that could be comparable with the official ones. At the same time, it is possible to say that, since the precision is reasonably good, a geographic distribution of LCS measurements built on the integration of multiple COCAL boxes should reasonably be capable of highlighting general trends and pinpointing local anomalies.
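The accuracy metrics named above can be illustrated with a short sketch. It assumes an ordinary least-squares fit of the LCS daily averages against the reference daily averages, and an NRMSE normalized by the mean of the reference values. Both choices, as well as the function name, are assumptions made for illustration and may differ from the procedure actually used to produce Table 1.

```python
import math


def accuracy_metrics(reference, sensor):
    """Compare paired daily averages (ug/m3) from a reference station and an LCS."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_lcs = sum(sensor) / n
    sxx = sum((x - mean_ref) ** 2 for x in reference)
    syy = sum((y - mean_lcs) ** 2 for y in sensor)
    sxy = sum((x - mean_ref) * (y - mean_lcs) for x, y in zip(reference, sensor))

    slope = sxy / sxx                        # m in y = m*x + b
    intercept = mean_lcs - slope * mean_ref  # b
    r2 = (sxy ** 2) / (sxx * syy)            # coefficient of determination
    rmse = math.sqrt(sum((y - x) ** 2 for x, y in zip(reference, sensor)) / n)
    nrmse = 100.0 * rmse / mean_ref          # percent, normalized by reference mean
    return {"r2": r2, "slope": slope, "intercept": intercept,
            "rmse": rmse, "nrmse": nrmse}


# Made-up example: an LCS that reads 80% of the reference value
metrics = accuracy_metrics([10.0, 20.0, 30.0, 40.0], [8.0, 16.0, 24.0, 32.0])
```

In this made-up example the sensor is perfectly correlated with the reference (R2 of 1) yet still shows a non-zero RMSE, which illustrates why a consistent sensor can score poorly on accuracy while remaining correctable.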

2.1.3. LCS Performances Improvement

The results of the tests are consistent with a wide literature on LCS performance. Several authors [47,48,49] highlighted the impact of the environmental conditions and, in particular, of the RH on the deviation between reference measurements and LCS. To address such problems, reference stations are equipped with a device that heats the air samples, causing the water condensed onto the particles to evaporate. This, of course, is not possible with LCSs. To compensate for this effect, several RH correction approaches exist such as, for example, κ-Köhler theory-derived factors and various types of regressions or machine learning methods. After an extensive survey of the existing literature, Ref. [44] maintains that such corrections are very seldom applied and that, when they are, simple linear regression is the most used method. The same authors underline difficulties in accurately defining local parameters and accumulating knowledge from different cases and areas. It is also worth noting that RH itself can be a problematic parameter to measure and that, since COCAL is a VSN system, RH measurements taken with it can have further limitations.
Taking these considerations into account, and since the tests performed in Sections “LCS Precision” and “LCS Accuracy” with the LCSs we use in this work revealed that they perform reasonably well in terms of precision but unfortunately not well enough in terms of accuracy, we devised a specific and pragmatic two-step mitigation strategy to improve their performance.
The first step consists in filtering out all measurements made in problematic conditions, for example, when RH is more than 60%. These data are automatically flagged and are not passed on to the subsequent processing flow.
The second step consists in calculating an accuracy correction with respect to a reference station using a COCAL box located in its proximity. Since the sensors proved to behave consistently with one another, following [39], we apply the same accuracy correction to all the other sensors. As mentioned above, given that, in the designated area, only one value per day is currently available from the reference station, we calculate the difference between that reference value and the daily average value of all the measurements taken by an LCS co-located in the proximity of the reference station. Corrections are inter/extrapolated in all designated areas by means of the technique described in [39].
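The two steps can be sketched as follows. This is a simplified illustration under stated assumptions: records are plain dictionaries with hypothetical keys `pm` and `rh`, the correction is a single additive daily offset, and the spatial inter/extrapolation of corrections described in [39] is not reproduced.

```python
from statistics import mean

RH_LIMIT = 60.0  # percent; threshold quoted in the text


def filter_by_rh(records, rh_limit=RH_LIMIT):
    """Step 1: drop measurements taken when RH exceeds the limit."""
    return [r for r in records if r["rh"] <= rh_limit]


def daily_offset(reference_value, colocated_records):
    """Step 2a: reference daily value minus the daily mean of the co-located LCS."""
    valid = filter_by_rh(colocated_records)
    if not valid:
        return None  # no usable co-located data for that day
    return reference_value - mean(r["pm"] for r in valid)


def apply_offset(records, offset):
    """Step 2b: apply the same additive correction to another sensor's data."""
    return [{**r, "pm_corrected": r["pm"] + offset} for r in filter_by_rh(records)]


# Made-up values for one day
colocated = [{"pm": 10.0, "rh": 50.0}, {"pm": 14.0, "rh": 55.0}, {"pm": 40.0, "rh": 80.0}]
offset = daily_offset(20.0, colocated)   # the RH = 80% record is ignored
mobile = [{"pm": 5.0, "rh": 40.0}, {"pm": 9.0, "rh": 70.0}]
corrected = apply_offset(mobile, offset)
```

Because the offset is derived from a single station, this sketch inherits the same caveat as the method itself: the further a measurement is from that station, the less certain the correction.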
An example of the method’s results can be seen in Figure 4, where on the left, the geographic distribution of LCS measurements before corrections is shown, while on the right, corrected values that are more consistent with reference measurements are shown. It is to be noted that this method can be problematic since applying the correction to areas far from where the reference station is located can unpredictably bias the final values. In the case proposed in Figure 4, measurements taken in the village of Opicina (upper part of the map) generally depict rather different conditions from the city center (lower part of the map). Opicina is, in fact, located uphill, is characterized by a different climatic setting and is less subject to vehicular traffic. No reference station is available in that area such that the only reference measurements available are those taken in the city center. In the example of Figure 4 (left), while the un-corrected data report a polluted city center and a much better situation uphill, after the corrections (Figure 4 right), the revised air quality also degrades notably in the hills. This could be an artifact that needs careful consideration when interpreting the data.

2.2. Deployment on Mobile Platforms

As mentioned above, sensors were installed on two different platform types, namely (i) buses and (ii) cars. In both cases, we developed a tailor-made waterproof box that can easily be installed on the platform and where all the acquisition and transmission electronics can be safely protected while air inlets and outlets could effectively bring air samples to the LCS.
Bus deployment has been developed with key help from the local transportation authority, TPL Trieste Trasporti, which kindly offered to host several COCAL systems. The boxes were installed on the roof of the buses (Figure 5) in a closed compartment with a specific air inlet passing through a syphon in order to prevent rain from entering the box. The power supply is obtained from the bus using a temporized relay to minimize the impact of the COCAL box on the normal functioning of the vehicles.
Buses are a very convenient acquisition platform because each unit can be redirected to several routes throughout the day, thereby covering a large portion of the urban area. On the other hand, bus routes tend to follow the main directions of the traffic in a city, which may somehow bias the coverage of the designated area.
Cars have several advantages over buses, one being that they generally do not follow predefined routes. This makes cars a good means to provide infills in areas where buses are not available. At the same time, cars introduce other constraints that depend mainly on volunteer drivers. Issues may arise, in fact, in order to motivate them to cover areas that are not within their daily routines. In our experience, this often meant also using vehicles belonging to our institute.
COCAL boxes for cars have been designed entirely by us and 3D-printed in-house with the help of the ICTP FabLab laboratory (Figure 6). The boxes were conceived to be fully autonomous and as non-invasive as possible. This forced us to make some design choices; for example, since connecting the boxes to the car’s power supply can be problematic, they are powered by batteries only. Autonomy is approximately one full day, although it can be longer depending on the rate of data transmission. Battery recharge can be completed in a few hours. Another choice was to avoid taking up space inside the vehicle or in the trunk, so we decided to position the box on the roof. To affix the box to the roof surface, we added magnetic plates on the bottom and, for further security, we decided to install it only on cars with roof bars, to which COCAL boxes are secured using Velcro strips. The air inlet passes through the curved white roof of the box so that it remains dry in case of rain (Figure 6 lower left). The air outlet is located on the back of the box. The roof can easily be removed to access the electronics inside.

Limitations of Mobile Platforms

Besides the already mentioned limitations in accuracy, we were also concerned about the possible effects of the deployment of LCSs on moving platforms. While it is known that platform speed influences the measurement, to the best of our knowledge, there is no specific study on this topic since most of the existing literature is based on fixed-position deployments. We therefore set up a test in which, passing multiple times through the same area at different speeds during a restricted period with stable meteorological conditions, we collected a large dataset of measurements. The results of the experiment can be seen in Figure 7. These show an inverse relationship between PM and platform speed. Considering how the COCAL box is built, this is probably due to a pressure drop induced by the platform movement at the inlet of the box. This increases with speed, reducing the quantity of air that reaches the detection device and therefore reducing the estimates of PM values. The drift is relatively small and below the precision of the sensors for speeds lower than 50 km/h, while higher speeds tend to be more problematic. Since the system has been installed mostly in an area where the speed limit is below 50 km/h, we can safely say that data collection is not particularly affected by this issue. As a further safeguard, during data processing, measurements associated with a speed higher than 50 km/h are automatically filtered out of the calculations.
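As an illustration of this kind of analysis, the sketch below groups measurements into speed bins to inspect how PM varies with platform speed, and then discards readings taken above 50 km/h. The record keys (`speed_kmh`, `pm`), the 10 km/h bin width and the use of the median are assumptions made for the example and are not taken from the experiment itself.

```python
from collections import defaultdict
from statistics import median

MAX_SPEED_KMH = 50.0  # threshold quoted in the text


def median_pm_by_speed_bin(records, bin_width=10.0):
    """Median PM per speed bin; keys are the lower edge of each bin (km/h)."""
    bins = defaultdict(list)
    for r in records:
        lower_edge = int(r["speed_kmh"] // bin_width) * bin_width
        bins[lower_edge].append(r["pm"])
    return {edge: median(values) for edge, values in sorted(bins.items())}


def drop_fast(records, max_speed=MAX_SPEED_KMH):
    """Keep only measurements taken at or below the speed threshold."""
    return [r for r in records if r["speed_kmh"] <= max_speed]


# Made-up records
records = [
    {"speed_kmh": 5.0, "pm": 12.0},
    {"speed_kmh": 15.0, "pm": 11.0},
    {"speed_kmh": 18.0, "pm": 9.0},
    {"speed_kmh": 60.0, "pm": 6.0},
]
by_bin = median_pm_by_speed_bin(records)
kept = drop_fast(records)
```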

2.3. Data Acquisition and Transmission

The acquisition system (Figure 8) is based on a low-cost ESP32 microcontroller. We selected a Heltec LoRa 32 v2 board, which has an embedded OLED display and battery charger together with a LoRa chip and Wi-Fi and Bluetooth connectivity. Wi-Fi and Bluetooth are used for testing and short-distance connectivity, while LoRa is used for long-distance connectivity [28]. To this, we added GSM and GPS functionalities using an A9G development board, designed by Ai-Thinker, which, with an active SIM card, allows data transmission over the GSM telephone network where coverage is available. Data transmitted via Wi-Fi and GSM are stored directly in an InfluxDB database, while for LoRa we rely on The Things Network (TTN) LoRaWAN infrastructure. Telegraf, a server-based agent, retrieves the data from TTN using the MQTT protocol and stores them in the database.
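How the firmware formats its payload is not described here. Purely as an illustration, the sketch below builds a record in InfluxDB line protocol, the text format accepted by the InfluxDB write API. The measurement name, tag and field names are hypothetical, and the escaping is simplified (backslashes and string-typed fields are not handled).

```python
def _escape_key(value):
    """Escape commas, equals signs and spaces in tag keys/values and field keys."""
    return str(value).replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tag=value field=value timestamp.

    All field values are written as floats in this simplified sketch.
    """
    name = str(measurement).replace(",", r"\,").replace(" ", r"\ ")
    tag_part = "".join(
        f",{_escape_key(k)}={_escape_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(f"{_escape_key(k)}={float(v)}" for k, v in fields.items())
    return f"{name}{tag_part} {field_part} {timestamp_ns}"


# Hypothetical record from a device labelled "bus 01"
line = to_line_protocol(
    "air",
    {"device": "bus 01"},
    {"pm10": 13.2, "pm25": 7},
    1687500000000000000,  # nanoseconds since the Unix epoch
)
```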

2.4. Data Management

Figure 9 describes the general architecture of the COCAL system. The stream of incoming data transmitted through LoRaWAN flows into an InfluxDB table filled by the TTN service. A server script reroutes the data into the main InfluxDB time-series tables after proper conversion. The final storage and processing server is based on a Postgres database, with a PostGIS extension for dealing with georeferenced objects, geographic projections and geometric objects such as polylines. This storage/processing server (SPS) is built on an open-source architecture: Linux Ubuntu, Apache, PHP, Python and Postgres. It currently manages the database, as well as several scripts responsible for the processing and the web front-end. A PHP script periodically synchronizes the InfluxDB with the Postgres database, inserting into the latter every valid measurement from a sensor together with a timestamp (UTC), WGS84 coordinates, all other GPS information (such as altitude or speed), the type of transmission (e.g., GSM), a device ID, a sensor ID (e.g., atmospheric pressure) and the measured value (e.g., 1007 mbar).

2.5. Data Processing

The SPS performs different activities by means of PHP scripts, which are scheduled with crontab. The most demanding analyses are implemented in Python using widely adopted libraries such as NumPy, SciPy, Matplotlib or PIL.
All processed products are made available in near real-time and stored permanently for better performance.

2.5.1. Window Averaging

Window averaging is necessary to assimilate the large amount of data acquired by many devices spread across a wide area. As in [35], we define a geographical grid of 200 m × 200 m cells, based on a local projection. In addition, we subdivide the timeline into 1 h intervals. Every set of data spanning a spatial cell and a time interval is a datacube, which includes measurements from different devices carrying the same kind of sensor (e.g., PM10). The choice of a local projection provides good accuracy when the area of interest is limited, and, in the case of this work, we used WGS84/UTM zone 33N (EPSG:32633). In order to obtain values that are smoother and more representative of the physical phenomenon, reducing the outliers and providing a uniform subdivision of space and time, and considering the good results obtained in [26], we adopted a similar approach by averaging data (calculating the median) over space and time datacubes. Larger time intervals (for example 2, 3, 4 or 8 h) can be analyzed by selecting the specific datacube. These are processed once a day and made available the next day. A discussion on the advantages of window averaging and the shape of the cells can also be found in [39].
Every averaged datacube is stored into the database, marked with a start time, end time and a polyline describing the square cell.
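The binning into datacubes can be sketched as follows. This is a minimal illustration rather than the production script: it assumes that coordinates have already been projected to EPSG:32633 (easting and northing in meters) and that all samples come from the same kind of sensor; the function names and the tuple layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import median

CELL_SIZE_M = 200.0  # side of a square grid cell
WINDOW_S = 3600      # 1 h time interval, in seconds


def datacube_key(easting_m, northing_m, time_utc):
    """Return (column, row, hour index) of the datacube containing a sample."""
    col = int(easting_m // CELL_SIZE_M)
    row = int(northing_m // CELL_SIZE_M)
    hour = int(time_utc.timestamp() // WINDOW_S)
    return col, row, hour


def window_median(samples):
    """Group (easting_m, northing_m, time_utc, value) tuples and take the median."""
    cubes = defaultdict(list)
    for easting_m, northing_m, time_utc, value in samples:
        cubes[datacube_key(easting_m, northing_m, time_utc)].append(value)
    return {key: median(values) for key, values in cubes.items()}


def t(hour, minute):
    """Helper for the example: a UTC timestamp on an arbitrary day."""
    return datetime(2022, 1, 15, hour, minute, tzinfo=timezone.utc)


samples = [
    (400100.0, 5055050.0, t(10, 5), 10.0),
    (400150.0, 5055100.0, t(10, 40), 30.0),
    (400199.0, 5055199.0, t(10, 59), 20.0),
    (400250.0, 5055050.0, t(10, 10), 5.0),  # falls in the neighboring cell
]
for key, value in sorted(window_median(samples).items()):
    print(key, value)
```

The median is used instead of the mean so that a single spurious reading inside a datacube has little influence on the stored value.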

2.5.2. Correction of LCS Data

As mentioned in Section “LCS Accuracy”, the accuracy of LCS can be problematic. The technologies used within these sensors and the sampling rate, together with the effects of environmental parameters such as relative humidity (RH), often induce drifts in the LCS measurements. In Section 2.1.3, we introduced a pragmatic method that can mitigate such effects. It applies to the LCS measurements a correction value that is calculated daily as the difference between the value provided by a reference station and that of a fixed LCS located in the vicinity of the reference station. The correction is applied server-side one day after the LCS data are collected, since the reference value is available only with such a delay. The resulting grid of data is then made available to the web portal.
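A minimal sketch of this daily correction is given below. It assumes an additive offset with the sign convention "reference minus fixed LCS" and takes the daily mean of the fixed LCS readings as its daily value; both choices, as well as the function names, are assumptions made for this example.

```python
from statistics import mean


def daily_offset(reference_daily_value, fixed_lcs_values):
    """Offset for one day: reference daily value minus the fixed LCS daily mean."""
    return reference_daily_value - mean(fixed_lcs_values)


def apply_correction(mobile_values, offset):
    """Add the daily offset to every mobile LCS value acquired on that day."""
    return [value + offset for value in mobile_values]


# Illustrative numbers only (not measured data).
offset = daily_offset(25.0, [18.0, 20.0, 22.0])
print(offset)
print(apply_correction([10.0, 12.5], offset))
```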

2.5.3. Interpolation and Contouring

In order to provide a more intuitive insight into the measured phenomenon, interpolation is a useful tool. Following [39], there are many aspects to take into account when spatial interpolation is applied:
(i)
The accuracy of the method and how far from the samples the interpolated values remain meaningful, i.e., the question of “extrapolation”. This issue can be partially solved by defining an area of interpolation, such as the bounding box or (better) the convex hull of the samples, as a first approximation.
(ii)
The computational complexity and the relative speed of the interpolation method, which must comply with the near real-time requirement. In our implementation, we chose a very quick and sufficiently accurate method, inverse distance weighting (IDW). IDW interpolation is defined as follows. Assuming that $\{x_1, \dots, x_N\}$ are the interpolating points (samples) and $x$ a generic point, the interpolant is:
$$f_{IDW}(x) = \frac{\sum_{k=1}^{N} w_k(x)\, f(x_k)}{\sum_{k=1}^{N} w_k(x)}$$
where the weights are:
$$w_k(x) = \lVert x - x_k \rVert^{-p}$$
(iii)
In addition, there is an epistemological aspect to be further considered: all processing is automatic and cannot be assisted by human intervention. This fact excludes algorithms such as Kriging, which involve many discretionary models and parameters. An excellent alternative solution is natural neighbor interpolation (Sibson’s method), which is based only on the geometrical properties of the dataset and is approximately ten times slower than IDW [50]. Lastly, it is necessary to define the levels for the contouring in some adaptive way to improve the readability, and also the color map for the interpolation, which must be coherent with all other data visualizations. The interpolation/contouring is implemented on the SPS with a Python script that reads the averaged values, applies the IDW method (with exponent p = 3), generates the contour and produces a transparent PNG image with a small text file for the georeferencing. The image is clipped around the convex hull of the dataset, excluding the external area.
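The IDW interpolant with exponent p = 3 can be illustrated with the short Python sketch below. It is not the SPS script: it evaluates a single point in plain Python, uses planar Euclidean distances, and returns the sample value directly when the query point coincides with a sample (where the weight would otherwise be infinite). Function and variable names are assumptions made for this example.

```python
import math


def idw(x, y, samples, p=3.0):
    """Inverse distance weighting at (x, y) from (sx, sy, value) samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # query point coincides with a sample
        weight = distance ** -p
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Illustrative values on a 2 m baseline (not measured data).
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw(1.0, 0.0, samples))  # midpoint: both samples weigh the same
print(idw(0.5, 0.0, samples))  # closer to the first sample
```

With a larger exponent the nearest samples dominate more strongly, so the interpolated surface stays closer to local values between samples.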

2.5.4. Near-Real-Time Web-Based Visualization

The visualization of environmental data is a topic that raises several questions: Are our data time series or spatial distributions? How do we represent time-varying phenomena? Which colors and graphic patterns are more effective and representative? To what extent does the computation have an impact on near-real-time web interaction? There are many different answers, of course, and much research involving mathematical, computational or psychological aspects (see, for example, [39]).
We implemented a set of visualizations in the web front-end, which allows the end user to browse through spatial and temporal coordinates and to select and analyze both single acquisitions and averaged maps (Figure 10).
All services are available at the web portal https://cocal.ogs.it (accessed on 1 June 2023).
The web interface allows easy navigation through the datasets by means of a simple window (Figure 10 left) where the user can select a single device, the acquisition sensor, the time interval and many different options. The time selection can be made in local time or in UTC, and a simplified view of the day shows the density of available data as shades of red, providing one-click access to time selection.
A calendar (month view button) shows the data density day by day by using the same principle. The acquisitions of a single device are represented as an interactive chart of the time series (a) or as a collection of connected points on a map (b). In the latter case, an arrow shape can show the GPS direction, and the point colors are mapped to the measure scale and corresponding legend.
The averaged data (on a rectangular grid) are represented as colored and labeled polygons (c). The interpolated data use the same color coding but are represented as continuous within the data convex hull, with superimposed contour lines (d). All graphic elements are responsive, showing all data details.
Additionally, we implemented a functionality that allows the user to follow cumulative data as an animation, cycling from a starting to an ending hour, in order to dynamically represent the temporal evolution of each parameter.

2.5.5. Advanced Analysis

The adoption of UTM33N as the map projection is a disadvantage when the acquisitions are beyond the limits of the range from 12° E to 18° E. Moreover, the implemented processing mechanism, which computes the averages periodically (in the background), is very efficient for a quick response and a fluid user experience but, on the downside, is rather rigid. A more flexible interface for data analysis has been tested, based on the global map projection “Web Mercator” (EPSG:3857, Pseudo-Mercator/Spherical Mercator). This spherical projection is used by most web mapping platforms such as Google Maps, Bing, ESRI, etc. It has an increasing distortion at high latitudes and is not conformal, but it is the de facto standard for web applications and allows global coverage for processing. The web page shown in Figure 11 provides a wider range of query parameters and builds an analysis “on the fly” (windowed average or IDW interpolation). The computation is restricted to the visible bounding box and requires some computational time, trading response speed for extended query flexibility.
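The two projection facts used above can be checked with a few lines of Python. The sketch computes the standard UTM zone number from the longitude (ignoring the special zones defined around Norway and Svalbard) and the forward spherical Mercator formulas used by EPSG:3857. The function names and the example coordinates are assumptions made for this illustration.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857 (WGS84 semi-major axis)


def utm_zone(lon_deg):
    """Standard 6-degree UTM zone number for a longitude in [-180, 180)."""
    return int((lon_deg + 180.0) // 6.0) + 1


def to_web_mercator(lon_deg, lat_deg):
    """Forward spherical Mercator: WGS84 degrees to EPSG:3857 meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0)
    )
    return x, y


# Approximate coordinates in the area of Trieste (illustrative only).
print(utm_zone(13.77))                # longitudes from 12° E to 18° E fall in zone 33
print(to_web_mercator(13.77, 45.65))
```

Because y grows without bound toward the poles, the scale distortion of this projection increases with latitude, which is the trade-off mentioned above.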

2.5.6. Open and FAIR Data Access

According to the FAIR principles, data must be findable, accessible, interoperable and reusable. COCAL deploys different protocols and implementations aiming to provide Open and FAIR data in accordance with well-established and official standards. In order to achieve discoverability, the initiative handles the standard ISO 19115-3 metadata profile through the GeoNetwork catalog application. To ensure interoperability, such as machine-to-machine data flows, data harvesting or archive federations, the COCAL geospatial database is compliant with OGC (Open Geospatial Consortium) standards, deployed as Web HTTP services: (i) WMS (Web Map Service), which provides georeferenced map images of the requested features, and (ii) WFS (Web Feature Service), which provides detailed and fine-grained information about features or general capabilities of the dataset in a structured text format (XML, JSON, etc.). All OGC services are implemented on a GeoServer platform, an open-source server running on Apache Tomcat and linked to the main database. In order to achieve accessibility, data products are fully open and accessible, while a download of raw data in CSV format is available on the COCAL web portal, after authentication with a trusted account.
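For machine-to-machine access, a WFS request is an ordinary HTTP GET with standard key-value parameters. The sketch below only builds such a URL for a WFS 2.0.0 GetFeature request returning JSON. The endpoint and the layer name are placeholders invented for this example and do not correspond to the actual COCAL service, whose addresses and layer names should be taken from its capabilities document.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(endpoint, type_name, max_features=100):
    """Build a WFS 2.0.0 GetFeature URL requesting GeoJSON-style output."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,              # layer to query
        "outputFormat": "application/json",  # JSON output, as offered by GeoServer
        "count": max_features,               # limit on returned features
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and layer name, for illustration only.
url = wfs_getfeature_url("https://example.org/geoserver/ows", "cocal:cells", 10)
print(url)
```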

3. Results

The technology behind COCAL has been under constant development since 2020. During the initial phase, trials took place using multiple simultaneous acquisition and transmission platforms mounted on vehicles operated by our institution. This allowed us to extensively test the system, its scalability, precision and accuracy during specific targeted surveys. Once the system was finalized, we were able to deploy a fully operative system on vehicles of the local transportation authority (Trieste Trasporti) and on the cars of some volunteers. COCAL entered into service in February 2021 and has been fully operative ever since.
Up to April 2023, the system acquired and processed a remarkable amount of data, both “points” (single measurements) and “cells” (averaged results). In Table 2, it is possible to see the approximated number of records per year during the period from January 2020 to April 2023.
Data are fully public and can be accessed using standard procedures from the COCAL web portal (https://cocal.ogs.it, accessed on 1 June 2023). The main results of the work are, on one hand, the COCAL system itself, which has proved to be efficient, robust and easy to install and maintain, allowing a very high throughput of environmental data that strongly support the paradigm of low-cost participative systems in monitoring the environment. On the other hand, a very important result is the availability of a very large quantity of environmental measurements, allowing us to significantly increase the spatial and time coverage of the distribution of air pollutants in the designated area. Initial analysis of the dataset acquired enabled identifying several interesting features of the air quality in the area of Trieste. Some of these observations have already been published in the papers mentioned above.
Figure 12 shows the geographic distribution of the standard deviation of all measurements made by the COCAL systems in the designated area. For every possible grid cell, we consider all PM measurements during the whole considered period (…2021…2022). Acquisitions within each cell are divided into time windows of 1 h, and some statistical parameters are calculated for every window, such as the number of samples, average, median and standard deviation. Eventually, the maximum standard deviation over the period is calculated for every cell.
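The statistic mapped in Figure 12 can be sketched as follows. The example groups values by cell and by 1 h window, computes the standard deviation of each window and keeps the maximum per cell. The use of the population standard deviation, the exclusion of windows with a single sample and the record layout are assumptions made for this illustration. The number of contributing windows is returned as well, since it indicates how densely each cell was sampled.

```python
from collections import defaultdict
from statistics import pstdev


def max_hourly_std(records):
    """Map cell -> (maximum hourly standard deviation, number of windows used).

    records: iterable of (cell_id, hour_index, value) tuples.
    """
    windows = defaultdict(list)
    for cell, hour, value in records:
        windows[(cell, hour)].append(value)

    result = {}
    for (cell, _hour), values in windows.items():
        if len(values) < 2:
            continue  # a single sample carries no dispersion information
        deviation = pstdev(values)
        best, count = result.get(cell, (0.0, 0))
        result[cell] = (max(best, deviation), count + 1)
    return result


# Illustrative values only (not measured data).
records = [
    ("A", 0, 10.0), ("A", 0, 14.0),  # window with some dispersion
    ("A", 1, 10.0), ("A", 1, 10.0),  # window with no dispersion
    ("B", 0, 5.0),                   # single sample, skipped
]
print(max_hourly_std(records))
```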
This map should be interpreted with some caution. On one hand, the standard deviation could provide an idea of the amount of variation in measurements during a specific period of time. Where the standard deviation is high, this could mean that higher levels of PM have been recorded in that area compared to areas where the standard deviation is lower. On the other hand, there is a risk that if the data coverage is biased, then the standard deviation could also be biased. Indeed, in Figure 12, it is easy to identify an NW-SE alignment (highlighted by the light blue line) where high values of the standard deviation seem to gather. This correlates almost perfectly with the main traffic direction of the city. It could be tempting to associate this trend with the traffic, concluding that those areas are the most polluted in the city. We think that some caution should be taken because this direction also corresponds to the routes most frequently followed by the buses used in the COCAL initiative. Most of the data, in fact, have been acquired in those areas, so that measurements made there could have had the opportunity to detect all pollution events, while measurements made in other areas may simply have missed them.
An interesting and surprising result we obtained is related to the possible deterioration of the sensors after long usage. After approximately 2 years of activity, we decided to substitute the hardware of the deployed systems and discovered that they practically had not accumulated any dirt inside and that their precision remained (considering the intrinsic limitations of the LCS) almost unaltered (Figure 13).

4. Conclusions and Future Work

This work describes a crowdsensing-based air monitoring system following all of the technological segments of a path that starts from the actual measurement using LCS, to data transmission, to processing and a FAIR-compliant web-based representation and access to the reconstructed data products. The main conclusion of this work is, therefore, that the implementation of all segments of the system can be achieved using low-cost and open-source technology only. At the same time, the acquisition of data does not need trained personnel but can be performed with the help of volunteers and especially of the local transportation authorities. The results of the experience we propose here suggest that such systems can be trustworthy from the point of view of the precision of measurements, while it is necessary to rely on a reference value to correct the deviation of measurements due to the intrinsic limitations of the LCSs. Currently, within the proposed system, this correction is calculated daily, since the reference values are made available by the local environmental agency only as daily averages and one day after the actual measurements took place. We demonstrated that, in some cases, this can be problematic and that, when available, corrections should be calculated with a higher temporal resolution. Other limitations of the system were taken into consideration such as the speed of the acquisition vehicle. We showed that this speed should not exceed 60 km/h; otherwise, the variation in air pressure could bias the measurements. We also demonstrated that after almost two years of continuous field operations, the amount of dirt accumulated within the acquisition box in the case of the designated area was minimal. This, of course, can depend on the levels of pollution in the specific cases where the system is applied.
The amount of data acquired raised important questions from the viewpoint of the ICT system to be used. We understood that, for example, a modular approach in separating each activity on different virtual machines helps considerably in monitoring the performance of the system and in understanding where it may be necessary to increase dedicated resources.
After two years of testing the system with five COCAL boxes installed on buses of the local transportation authority, we found that, since buses are often under maintenance or rerouted, five units is the minimum number of installations for a city of approximately 200,000 inhabitants and an area of approximately 100,000 square meters. In this perspective, much depends on the actual urban configuration. In fact, while installations on cars can cover the urban area almost randomly, buses follow the bus line distribution, which is generally concentrated in specific areas while neglecting others. The resulting data can be biased and limit the reconstruction of the distribution of air quality.
In the coming months, the number of COCAL installations on the public bus network of Trieste will be doubled in order to achieve a broader and more homogeneous coverage of the city.
Although the designated area is coastal, the system developed so far allows reconstructing the on-land area only. This does not allow studying in depth the phenomena related to the movements of polluted air masses due to sea breezes. In this perspective, an extension of the system at sea is planned using volunteers’ recreational sailing boats, while the system will also be installed on a certain number of sea buoys managed by OGS.

Author Contributions

Conceptualization, P.D.; methodology, P.D., M.I. and R.J.C.; software, M.I., R.J.C.; validation, P.D., M.I., R.J.C., A.V. and N.P.; resources, R.J.C., N.P. and A.V.; data management, M.I. and N.P.; writing—original draft preparation, P.D., M.I. and R.J.C.; writing—review and editing, P.D.; visualization, M.I.; supervision, P.D.; project administration, P.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All the data used in this work are freely and openly available at the website https://COCAL.ogs.it, accessed on 3 May 2023.

Acknowledgments

The authors would like to thank Trieste Trasporti TPL, the local transportation authority of Trieste, for its support, without which this work would not have been possible. In particular, we would also like to thank Giuseppe Zottis, Fabrizio Godinich and Fabio Vidotto for their great help. We are also very grateful to the ICTP FabLab for their effective help in designing and 3D printing the COCAL boxes.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Air Pollution. Available online: https://www.who.int/health-topics/air-pollution (accessed on 20 February 2023).
  2. Almetwally, A.A.; Bin-Jumah, M.; Allam, A.A. Ambient air pollution and its influence on human health and welfare: An overview. Environ. Sci. Pollut. Res. 2020, 27, 24815–24830.
  3. Di Turo, F.; Proietti, C.; Screpanti, A.; Fornasier, M.F.; Cionni, I.; Favero, G.; De Marco, A. Impacts of air pollution on cultural heritage corrosion at European level: What has been achieved and what are the future scenarios. Environ. Pollut. 2016, 218, 586–594.
  4. Cachier, H.; Sarda-Estève, R.; Oikonomou, K.; Sciare, J.; Bonazza, A.; Sabbioni, C.; Greco, M.; Saiz-Jimenez, C.; Hermosin, A.; Reyes, J. Aerosol Characterization and Sources in Different European Urban Atmospheres: Paris, Seville, Florence and Milan. In Air Pollution and Cultural Heritage; Taylor and Francis Group: Abingdon, UK, 2004; Available online: https://scholar.google.com/scholar_lookup?title=Aerosol%20characterization%20and%20sources%20in%20different%20European%20urban%20atmospheres%3A%20Paris%2C%20Seville%2C%20Florence%20and%20Milan&publication_year=2004&author=H.%20Cachier&author=R.%20Sarda-Est%C3%A8ve&author=K.%20Oikonomou&author=J.%20Sciare&author=A.%20Bonazza&author=C.%20Sabbioni&author=M.%20Greco&author=J.%20Reyes&author=B.%20Hermosin&author=C.%20Saiz-Jimenez (accessed on 20 February 2023).
  5. United States Environmental Protection Agency. Clean Air Act Text. Available online: https://www.epa.gov/clean-air-act-overview/clean-air-act-text (accessed on 21 February 2023).
  6. Directive 2008/50/EC of the European Parliament and of the Council of 21 May 2008 on Ambient Air Quality and Cleaner Air for Europe. Off. J. Eur. Union 2008, 152, 169–212. Available online: http://data.europa.eu/eli/dir/2008/50/oj/eng (accessed on 21 February 2023).
  7. European Environment Agency. European City Air Quality Viewer. Available online: https://www.eea.europa.eu/themes/air/urban-air-quality/european-city-air-quality-viewer (accessed on 22 February 2023).
  8. United States Environmental Protection Agency. Interactive Map of Air Quality Monitors. Available online: https://www.epa.gov/outdoor-air-quality-data/interactive-map-air-quality-monitors (accessed on 22 February 2023).
  9. Strzelecki, M.; Stezycki, K. “Burn Everything”: Poland Chokes on the Smog of War. Reuters, 8 December 2022. Available online: https://www.reuters.com/world/europe/burn-everything-poland-chokes-smog-war-2022-12-08/ (accessed on 22 February 2023).
  10. United States Environmental Protection Agency. EPA AP-42: Compilation of Air Emissions Factors. Available online: https://www3.epa.gov/ttnchie1/ap42/ch01/ (accessed on 12 August 2022).
  11. Pudasainee, D.; Kurian, V.; Gupta, R. 2—Coal: Past, Present, and Future Sustainable Use. In Future Energy, 3rd ed.; Letcher, T.M., Ed.; Elsevier: Amsterdam, The Netherlands, 2020; pp. 21–48.
  12. Singh, D.; Dahiya, M.; Kumar, R.; Nanda, C. Sensors and systems for air quality assessment monitoring and management: A review. J. Environ. Manag. 2021, 289, 112510.
  13. Bonafè, G.; Montanari, F.; Stel, F. Air quality in Trieste, Italy—A Hybrid Eulerian-Lagrangian-Statistical Approach to Evaluate Air Quality in a Mixed Residential-Industrial Environment. 37p. Available online: https://pdfs.semanticscholar.org/72da/4005028680576c99a26eefc6d68a4df25c78.pdf (accessed on 5 May 2023).
  14. Irwin, A. Citizen Science: A Study of People, Expertise and Sustainable Development; Routledge: London, UK, 1995.
  15. Froeling, F.; Gignac, F.; Hoek, G.; Vermeulen, R.; Nieuwenhuijsen, M.; Ficorilli, A.; De Marchi, B.; Biggeri, A.; Kocman, D.; Robinson, J.A.; et al. Narrative review of citizen science in environmental epidemiology: Setting the stage for co-created research projects in environmental epidemiology. Environ. Int. 2021, 152, 106470.
  16. Fraisl, D.; Campbell, J.; See, L.; Wehn, U.; Wardlaw, J.; Gold, M.; Moorthy, I.; Arias, R.; Piera, J.; Oliver, J.L.; et al. Mapping citizen science contributions to the UN sustainable development goals. Sustain. Sci. 2020, 15, 1735–1751.
  17. Stewart, C.; Labrèche, G.; González, D.L. A Pilot Study on Remote Sensing and Citizen Science for Archaeological Prospection. Remote Sens. 2020, 12, 2795.
  18. Kasperowski, D.; Hillman, T. The epistemic culture in an online citizen science project: Programs, antiprograms and epistemic subjects. Soc. Stud. Sci. 2018, 48, 564–588.
  19. Kullenberg, C.; Kasperowski, D. What Is Citizen Science?—A Scientometric Meta-Analysis. PLoS ONE 2016, 11, e0147152.
  20. Ward, F.; Lowther-Payne, H.J.; Halliday, E.C.; Dooley, K.; Joseph, N.; Livesey, R.; Moran, P.; Kirby, S.; Cloke, J. Engaging communities in addressing air quality: A scoping review. Environ. Health 2022, 21, 89.
  21. Höhne, A.; Schulte, R.A.A.; Kulicke, M.; Huynh, T.-T.; Telgmann, M.; Frenzel, W.; Held, A. Assessing the Spatial Distribution of NO2 and Influencing Factors in Urban Areas—Passive Sampling in a Citizen Science Project in Berlin, Germany. Atmosphere 2023, 14, 360.
  22. De Craemer, S.; Vercauteren, J.; Fierens, F.; Lefebvre, W.; Meysman, F.J.R. Using Large-Scale NO2 Data from Citizen Science for Air-Quality Compliance and Policy Support. Environ. Sci. Technol. 2020, 54, 11070–11078.
  23. Ellenburg, J.A.; Williford, C.J.; Rodriguez, S.L.; Andersen, P.C.; Turnipseed, A.A.; Ennis, C.A.; Basman, K.A.; Hatz, J.M.; Prince, J.C.; Meyers, D.H.; et al. Global Ozone (GO3) Project and AQTreks: Use of evolving technologies by students and citizen scientists to monitor air pollutants. Atmos. Environ. X 2019, 4, 100048.
  24. Baldelli, A. Evaluation of a low-cost multi-channel monitor for indoor air quality through a novel, low-cost, and reproducible platform. Meas. Sens. 2021, 17, 100059.
  25. Bousiotis, D.; Alconcel, L.-N.S.; Beddows, D.C.S.; Harrison, R.M.; Pope, F.D. Monitoring and apportioning sources of indoor air quality using low-cost particulate matter sensors. Environ. Int. 2023, 174, 107907.
  26. Yang, G.; Zhou, Y.; Yan, B. Contribution of influential factors on PM2.5 concentrations in classrooms of a primary school in North China: A machine discovery approach. Energy Build. 2023, 283, 112787.
  27. Zusman, M.; Schumacher, C.S.; Gassett, A.J.; Spalt, E.W.; Austin, E.; Larson, T.V.; Carvlin, G.; Seto, E.; Kaufman, J.D.; Sheppard, L. Calibration of low-cost particulate matter sensors: Model development for a multi-city epidemiological study. Environ. Int. 2020, 134, 105329.
  28. Johnston, H.J.; Mueller, W.; Steinle, S.; Vardoulakis, S.; Tantrakarnapa, K.; Loh, M.; Cherrie, J.W. How Harmful Is Particulate Matter Emitted from Biomass Burning? A Thailand Perspective. Curr. Pollut. Rep. 2019, 5, 353–377.
  29. Ganti, R.K.; Ye, F.; Lei, H. Mobile crowdsensing: Current state and future challenges. IEEE Commun. Mag. 2011, 49, 32–39.
  30. Fraisl, D.; Hager, G.; Bedessem, B.; Gold, M.; Hsing, P.-Y.; Danielsen, F.; Hitchcock, C.B.; Hulbert, J.M.; Piera, J.; Spiers, H.; et al. Citizen science in environmental and ecological sciences. Nat. Rev. Methods Primer 2022, 2, 64.
  31. Wehn, U.; Gharesifard, M.; Ceccaroni, L.; Joyce, H.; Ajates, R.; Woods, S.; Bilbao, A.; Parkinson, S.; Gold, M.; Wheatland, J. Impact assessment of citizen science: State of the art and guiding principles for a consolidated approach. Sustain. Sci. 2021, 16, 1683–1699.
  32. Aristeidou, M.; Herodotou, C. Online citizen science: A systematic review of effects on learning and scientific literacy. Citiz. Sci. Theory Pract. 2020, 5, 11.
  33. Luger, T.M.; Hamilton, A.B.; True, G. Measuring Community-Engaged Research Contexts, Processes, and Outcomes: A Mapping Review. Milbank Q. 2020, 98, 493–553.
  34. Friedman, A.J. (Ed.) Framework for Evaluating Impacts of Informal Science Education Projects; Report from a National Science Foundation Workshop; The National Science Foundation: Alexandria, VA, USA, 2008.
  35. Diviacco, P.; Nadali, A.; Iurcev, M.; Carbajales, R.; Busato, A.; Pavan, A.; Burca, M.; Grio, L.; Nolich, M.; Molinaro, A.; et al. MaDCrow, a Citizen Science Infrastructure to Monitor Water Quality in the Gulf of Trieste (North Adriatic Sea). Front. Mar. Sci. 2021, 8, 619898.
  36. Diviacco, P.; Nadali, A.; Nolich, M.; Molinaro, A.; Iurcev, M.; Carbajales, R.; Busato, A.; Pavan, A.; Grio, L.; Malfatti, F. Citizen science and crowdsourcing in the field of marine scientific research—The MaDCrow project. J. Sci. Commun. 2021, 20, A09.
  37. Diviacco, P.; Iurcev, M.; Carbajales, R.J.; Potleca, N. First Results of the Application of a Citizen Science-Based Mobile Monitoring System to the Study of Household Heating Emissions. Atmosphere 2022, 13, 1689.
  38. Diviacco, P.; Iurcev, M.; Carbajales, R.J.; Potleca, N.; Viola, A.; Burca, M.; Busato, A. Monitoring Air Quality in Urban Areas Using a Vehicle Sensor Network (VSN) Crowdsensing Paradigm. Remote Sens. 2022, 14, 5576.
  39. Iurcev, M.; Pettenati, F.; Diviacco, P. Improved automated methods for near real-time mapping—Application in the environmental domain. Bull. Geophys. Oceanogr. 2021, 62, 427–454.
  40. Horvath, K.; Ivatek-Šahdan, S.; Ivančan-Picek, B.; Grubišić, V. Evolution and Structure of Two Severe Cyclonic Bora Events: Contrast between the Northern and Southern Adriatic. Weather Forecast. 2009, 24, 946–964.
  41. Orlic, M.; Penzar, B.; Penzar, I. Adriatic Sea and Land Breezes: Clockwise Versus Anticlockwise Rotation. J. Appl. Meteorol. 1988, 27, 675–679.
  42. Bulot, F.M.J.; Russell, H.S.; Rezaei, M.; Johnson, M.S.; Ossont, S.J.J.; Morris, A.K.R.; Basford, P.J.; Easton, N.H.C.; Foster, G.L.; Loxham, M.; et al. Laboratory Comparison of Low-Cost Particulate Matter Sensors to Measure Transient Events of Pollution. Sensors 2020, 20, 2219.
  43. Zimmerman, N. Tutorial: Guidelines for implementing low-cost sensor networks for aerosol monitoring. J. Aerosol Sci. 2022, 159, 105872.
  44. Liang, L. Calibrating low-cost sensors for ambient air monitoring: Techniques, trends, and challenges. Environ. Res. 2021, 197, 111163.
  45. Yi, W.Y.; Lo, K.M.; Mak, T.; Leung, K.S.; Leung, Y.; Meng, M.L. A Survey of Wireless Sensor Network Based Air Pollution Monitoring Systems. Sensors 2015, 15, 31392–31427.
  46. Mead, M.I.; Popoola, O.A.M.; Stewart, G.B.; Landshoff, P.; Calleja, M.; Hayes, M.; Baldovi, J.J.; McLeod, M.W.; Hodgson, T.F.; Dicks, J.; et al. The use of electrochemical sensors for monitoring urban air quality in low-cost, high-density networks. Atmos. Environ. 2013, 70, 186–203.
  47. Crilley, L.R.; Shaw, M.; Pound, R.; Kramer, L.J.; Price, R.; Young, S.; Lewis, A.C.; Pope, F.D. Evaluation of a low-cost optical particle counter (Alphasense OPC-N2) for ambient air monitoring. Atmos. Meas. Technol. 2018, 11, 709–720.
  48. Crilley, L.R.; Singh, A.; Kramer, L.J.; Shaw, M.D.; Alam, M.S.; Apte, J.S.; Bloss, W.J.; Hildebrandt Ruiz, L.; Fu, P.; Fu, W.; et al. Effect of aerosol composition on the performance of low-cost optical particle counter correction factors. Atmos. Meas. Technol. 2020, 13, 1181–1193.
  49. Malm, W.C.; Day, D.E.; Kreidenweis, S.M.; Collett, J.L.; Lee, T. Humidity-dependent optical properties of fine particles during the Big Bend Regional Aerosol and Visibility Observational Study. J. Geophys. Res. Atmos. 2003, 108, 4279.
  50. Sathiyanarayanan, M.; Varadarajan, V.; Pradeep, K.V. Visual analytics on spatial time series for environmental data. Int. J. Recent Technol. Eng. 2019, 8, 1173–1181.
Figure 1. The area where the COCAL system is installed.
Figure 2. Comparison of 15 months of PM measurements from a certified reference station with measurements taken during the same period with a COCAL system located near the reference station. The upper set of graphs reports PM10 measurements, while the lower one reports PM2.5 measurements. In both cases, graph no. (1) shows the daily average COCAL measurements in blue, while the reference daily value is plotted in red. Graph no. (2) shows the difference between the two time series. Graph no. (3) shows the RH time series. Graph no. (4) shows the standard deviation of COCAL measurements calculated day by day. As can be seen, COCAL measurements are generally lower than the reference measurements. During the first months of 2023 in three specific events (identified in the graph by the boxes A, B and C), this behavior reverses. It is interesting to note that during these events, the standard deviation of measurements also becomes high. In two of these cases (A and C), this can be understood as owing to the sensitivity of the LCS to RH, but in the case of event B, the RH is low.
Figure 3. Scatter plot comparing Reference PM measurements and COCAL measurements.
Figure 4. Distribution of raw PM10 values (left); distribution of data after correction to improve LCS accuracy (right).
Figure 5. Installing a COCAL box on buses.
Figure 6. Design and installation of COCAL boxes for cars: Rendering of the box (upper left). Lateral section of the box (lower left): the air inlet passes through the curved white roof to remain dry in case of rain. The air outlet is located on the back of the box. The roof can easily be removed to access the electronics inside. Actual deployment on a car (upper right): boxes are placed on car roofs and secured to the roof bars with Velcro strips. COCAL boxes have magnetic plates to better adhere to the roofs (lower right).
Figure 7. Measurements of PM10 concentration as a function of platform speed.
Figure 8. Heltec and A9G board on a PCB designed for COCAL.
Figure 9. General architecture of the COCAL system.
Figure 10. COCAL web-based GUI (left) and near-real-time visualizations: (a) time series; (b) single acquisitions; (c) near-real-time window averaging; (d) interpolation and contour.
Figure 11. COCAL web “real time analysis” with advanced queries.
Figure 12. Map of maximum standard deviation of COCAL PM10 measurements in the designated area. In light blue, the alignment of the higher values correlates with the main traffic direction of the city.
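A map such as the one in Figure 12 can be built by binning georeferenced measurements into grid cells, computing the standard deviation per cell and time window, and keeping the maximum for each cell. The sketch below is one possible reading of "maximum standard deviation", not the authors' implementation: the cell size in degrees, the daily window, the `(lat, lon, day, pm)` layout, and the function name are assumptions.

```python
from collections import defaultdict
from math import floor
from statistics import pstdev


def max_std_per_cell(points, cell_deg=0.01):
    """Return {(row, col): maximum per-day standard deviation of PM in that cell}.

    points: iterable of (lat, lon, day, pm) tuples.
    cell_deg: hypothetical cell size in degrees.
    Groups with fewer than two samples are skipped, since their dispersion
    carries no information.
    """
    groups = defaultdict(list)
    for lat, lon, day, pm in points:
        cell = (floor(lat / cell_deg), floor(lon / cell_deg))
        groups[(cell, day)].append(pm)

    result = {}
    for (cell, _day), values in groups.items():
        if len(values) < 2:
            continue
        result[cell] = max(result.get(cell, 0.0), pstdev(values))
    return result


# Hypothetical points: two days in one cell, plus a lone point in another cell.
points = [
    (45.651, 13.772, "2023-01-01", 10),
    (45.655, 13.778, "2023-01-01", 20),
    (45.651, 13.772, "2023-01-02", 10),
    (45.655, 13.778, "2023-01-02", 30),
    (45.700, 13.800, "2023-01-01", 50),
]
cell_map = max_std_per_cell(points)
```

Cells with persistently high values in such a map indicate locations where concentrations vary most, which is what the caption relates to the main traffic direction.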
Figure 13. An LCS case opened after almost two years of activity. The interiors show only very small quantities of dirt.
Table 1. LCS accuracy metrics.
Parameter                                    PM10      PM2.5
Coefficient of Determination (R²)            0.45      0.25
Slope                                        1.06      0.80
Intercept (b)                                −8.73     −4.09
Root Mean Square Error (RMSE)                13.76     8.73
Normalized Root Mean Square Error (NRMSE)    66.04     60.37
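The metrics in Table 1 can be computed from paired reference and LCS values along the following lines. This is a sketch under stated assumptions: the reference is taken as the independent variable of an ordinary least-squares fit, and NRMSE is taken as RMSE divided by the mean of the reference, expressed in percent. The source does not specify either choice, and the function name is hypothetical.

```python
from math import sqrt


def accuracy_metrics(reference, lcs):
    """Compute R2, slope, intercept, RMSE and NRMSE for paired measurements.

    Assumptions: least-squares fit of lcs on reference; NRMSE is
    100 * RMSE / mean(reference).
    """
    n = len(reference)
    if n != len(lcs) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")

    mean_ref = sum(reference) / n
    mean_lcs = sum(lcs) / n
    sxx = sum((x - mean_ref) ** 2 for x in reference)
    syy = sum((y - mean_lcs) ** 2 for y in lcs)
    sxy = sum((x - mean_ref) * (y - mean_lcs) for x, y in zip(reference, lcs))
    if sxx == 0 or syy == 0:
        raise ValueError("series must not be constant")
    if mean_ref == 0:
        raise ValueError("reference mean must be non-zero for NRMSE")

    slope = sxy / sxx
    intercept = mean_lcs - slope * mean_ref
    r2 = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    rmse = sqrt(sum((y - x) ** 2 for x, y in zip(reference, lcs)) / n)
    nrmse = 100.0 * rmse / mean_ref
    return {"r2": r2, "slope": slope, "intercept": intercept,
            "rmse": rmse, "nrmse": nrmse}


# Hypothetical paired daily values, for illustration only.
metrics = accuracy_metrics([10, 20, 30, 40], [15, 35, 55, 75])
```

For a simple linear fit, R² equals the squared correlation coefficient, so a low R² alongside a slope close to 1 (as for PM10 in Table 1) indicates large scatter rather than a systematic scale error.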
Table 2. Approximate number of records per year.
Record Type    Total Count    Year              Count
Points         53M            2020              3.8M
                              2021              13M
                              2022              27M
                              2023 (partial)    9.2M
Cells          18M            2020              0.9M
                              2021              5.8M
                              2022              9.4M
                              2023 (partial)    1.8M
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Diviacco, P.; Iurcev, M.; Carbajales, R.J.; Viola, A.; Potleca, N. Design and Implementation of a Crowdsensing-Based Air Quality Monitoring Open and FAIR Data Infrastructure. Processes 2023, 11, 1881. https://doi.org/10.3390/pr11071881