1. Introduction
In recent years, air quality has become a major problem in large cities. Although episodic increases in pollutant levels can be caused by natural events such as volcanoes (e.g., SO2) or desert dust outbreaks (PM), the main emission sources of pollutant gases and particulates are of human origin, such as traffic and industrial activities [1]. As demonstrated by scientists, notably by World Health Organization (WHO) expert groups, these compounds can lead to respiratory problems. In particular, they have been linked to the appearance of lung cancer and they contribute to the appearance of mental disorders [2]. These reasons have prompted governments of large cities and national and international organizations to take surveillance and control measures to prevent the impact of atmospheric pollution, which often involve novel clean technologies and traffic restrictions.
The process of obtaining and processing pollutant data to monitor and assess air quality has been separated into three stages by some authors: monitoring, prediction, and tracing [3]. Currently, regulatory monitoring is carried out by large and expensive analytical equipment based on standard and certified measurement techniques, installed inside fixed or mobile (large vehicles) reference stations placed at points of special interest. These stations and the equipment they contain have a high installation cost and require periodic maintenance by qualified personnel. Moreover, they require an extensive energy supply for air conditioning (operating temperature 20 °C) and instrument operation. Aside from these restrictions, the monitoring strategy based on sophisticated fixed and mobile stations has an additional problem: the very limited spatial resolution of the data provided, which compromises the representativeness of the estimated risk to human health and ecosystems.
In this scenario, low-cost sensors (LCSs) have emerged in recent years as complementary tools that, combined with conventional equipment, allow air quality to be monitored more effectively, dramatically improving the spatial resolution of air-monitoring data and effectively engaging citizens in the experimental measurements [4]. These sensors have much lower price and energy requirements than conventional equipment. As described in [5], the price of low-cost sensors is between $10 and $100, which increases to about $1000 to $5000 when the sample supply, electronic systems, protective housing, connections, and management hardware and software are included. LCSs can be installed in easily available locations (typically urban posts, trees, balconies, etc.), mounted on vehicles (such as bicycles), or carried by citizens, thereby improving the spatial and temporal representativeness of pollutant-level measurements [6]. Moreover, a large number of LCSs installed in a given area allows a high spatial resolution to be obtained in that area, as exemplified by two studies, one that monitored the air quality of the Gold Coast (Australia) for several weeks [7] and one using a network of 40 nodes to monitor air quality at Heathrow Airport (London) [8].
Low-cost gas sensors can be classified according to their principle of operation. In this way, a distinction is made between electrochemical, metal-oxide (MOX), and optical sensors such as non-dispersive infrared (NDIR) and optical particle-counter (based on light scattering) sensors. For particulate-matter measurement, the most common sensors are based on photometry using a scattering laser, although there are also some based on gravimetric sampling and beta attenuation [9].
Once calibrated in the lab, LCSs need to be validated for field use in real conditions. In addition, low-cost sensors often present interference between contaminants (cross-sensitivity), especially between NO2 and O3, which should be evaluated and corrected during real operation. They are also affected by environmental conditions, especially temperature and relative humidity [10,11], which are not controllable variables when working in the field. Finally, these devices exhibit drift, i.e., the response of the sensor changes over time as it operates. Due to all of these aspects, it is common in the literature to see different approaches to calibrating these sensors and obtaining reliable data from them, mostly based on machine-learning algorithms such as neural networks, linear regressions, or support vector machines. For example, in [12] these algorithms were used to study the air inside a running vehicle. In [13], the authors obtained good results using a long short-term memory network. In [14], the authors used a different approach; they first conducted a linear regression and then introduced the result into a neural network. Most authors agree that the quality of the LCS data is good enough if the data fulfill the objective of providing useful air-quality information to citizens, or if the objective is to distinguish between different levels of pollution on a semiquantitative basis (high, low, moderate, etc.). Providing citizens with a simple and easy-to-understand system for estimating air quality on a local scale is thus essential and was one of the main purposes of this work.
This work is framed within the NanoSen-AQM project, a European initiative that pursues the monitoring of air quality by designing, fabricating, and testing a range of LCSs in different natural, rural, and urban environments [15]. One of the main contributions of this work is the device presented, which is designed to be installed on bicycle handlebars in order to monitor air quality and its dynamic spatiotemporal evolution. The novelty of our air-quality-monitoring strategy also lies in the implementation of multiparametric neural-network calibration coupled with cloud connectivity and air-quality-index calculation, which readily allows cyclists to get a clear overview of their degree of exposure to air pollution during urban travel, providing a basis for personal decisions such as route selections or preferable time slots. It is also valuable information for urban designers, e.g., for the planning of bike paths. We consider it more refined and personalized information than the data provided by smartphone apps based on reference air-quality-monitoring units, which are sometimes located at quite remote points with respect to the user’s location. Our device has been tested on bicycles, but it can be easily adapted to different vehicles and even for static usage. As shown in [16], it is much more common to find LCSs designed for operation at fixed points, or coupled to UAVs in case a mobile device is needed, than attached to bicycles. However, drones must comply with flight rules, so they are not the best choice if the objective is to study the air quality in an urban area at low altitude.
In the following sections of this work, we describe in depth the device that was designed for this task, its calibration and field validation, and the measurement campaigns, including the problems found during the experimental work. The most relevant results are then shown and discussed, and the work concludes with the main ideas obtained.
2. Materials and Methods
2.1. Description of the Device
The device, described in Figure 1, was developed entirely and exclusively to measure the main pollutants (NO2, O3, PM2.5, and PM10) responsible for worsening air quality. The prototype was designed to fit on a bicycle in order to provide increased spatiotemporal resolution of pollutant maps and estimate the air quality perceived by cyclists. That is why great importance was given to optimizing the device to allow its installation and operation on bicycles, with special attention paid to the autonomy, size, weight, and communication method implemented.
The air-quality-sensor array is composed of 3 sensors produced by AlphaSense (Essex, UK). Two are A4 series 4-electrode electrochemical sensors designed to measure NO2 and O3. The third is an OPC-N3 optical particulate sensor that provides PM10 and PM2.5 concentration values by light scattering, using an internal algorithm.
In addition to these 3 sensors, the device has a number of sensors focused on monitoring environmental conditions, which can provide relevant information when the signals are processed. These include a temperature, pressure, and relative-humidity sensor (BMP280) from Bosch Sensortec. The values of the gas sensors are read by the microcontroller from the built-in 12-bit A/D conversion inputs. The particle sensor uses SPI communication. Finally, the pressure, temperature, and humidity sensor uses the I2C bus.
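As an illustration of the 12-bit A/D readout mentioned above, the following minimal sketch converts a raw converter code into a voltage. The reference voltage `V_REF` and the function name `adc_to_volts` are assumptions made for this example; the text does not state the reference voltage or any gain applied by the analog front end.

```python
ADC_BITS = 12                      # resolution stated in the text
ADC_MAX = (1 << ADC_BITS) - 1      # 4095, the largest 12-bit code
V_REF = 3.3                        # ASSUMPTION: reference voltage, not given in the text


def adc_to_volts(code: int, v_ref: float = V_REF) -> float:
    """Convert a raw 12-bit ADC code to a voltage in volts."""
    if not 0 <= code <= ADC_MAX:
        raise ValueError(f"code {code} outside 0..{ADC_MAX}")
    # Full scale (code 4095) maps to v_ref; some designs divide by 4096 instead.
    return code * v_ref / ADC_MAX
```

Under the assumed 3.3 V reference, one code step corresponds to roughly 0.8 mV, which bounds the smallest electrode-voltage change the microcontroller can resolve.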
The complete system is controlled by a low-power, high-performance ARM® Cortex®-M0+-based flash microcontroller (Microchip ATSAMD21G18), which is a 32-bit microcontroller with an operating frequency of up to 48 MHz. It is connected via I2C interface to an OLED display that shows the sensor values and the operating status of the device (battery, date, time, temperature, GPS status, etc.). The system carries out communication in different ways. First, the GSM/GPRS module (SIM808, SIMCom) allows wireless communication with the internet to store information coming from the sensors in the cloud. This module communicates with the microcontroller via UART connection. There is also a USB input for manual data transfer and device programming. Finally, all of the collected information is also stored locally on a microSD card: date, time, latitude, longitude, altitude, number of satellites in coverage and used, battery voltage and percentage, temperature, humidity, pressure, external-fan rpm, PM1, PM2.5, PM10, OPC-N3 sensor flow rate, NO2 (ppb), O3 (ppb), and the raw values of the gas-sensor electrodes. This study focused on NO2, O3, PM2.5 and PM10. Information is stored in a text file (.txt) with a sampling period of 3 s.
The system is powered by a lithium-ion battery with a capacity of 2750 mAh, which provides system autonomy of up to 8 h. In addition, a jack connector is incorporated to charge the system from a 9 V, 660 mA power supply.
The pneumatic sampling system of the whole sensing device uses the OPC-N3 sensor as the input, since it has an air inlet with flow control from a fan. The air is conducted to a collector, where the rest of the sensors are located, and finally it is expelled to the outside by a second fan placed at the outlet. It should be noted that this collector includes a conductive coating; in addition to redirecting the air flow through the gas sensors, it also protects the sensors against electromagnetic interference.
All of these elements are protected inside a polycarbonate housing with IP66 protection with dimensions of 180 × 120 × 90 mm, which can be seen in Figure 2. The housing is attached to the handlebars of a bicycle by an adjustable adapter designed to fit most models. The approximate weight of the complete system is 1.1 kg, which is appropriate for use without disturbing the cycling experience of the user. Nevertheless, we suggest that a reduced size and weight are desirable technical aspects to be considered in further developments.
With this design, eight identical devices were manufactured, and two of them, coded BEC01 and BEC02, were used in the experiment. The first one served as the object of study, while the second one was used to study the repeatability of the design, which is detailed in Section 3.3.
2.2. Gas Sensors and Particulate Matter Sensor
Two electrochemical sensors manufactured by AlphaSense, NO2-A43F and OX-A431, were used to measure concentrations of NO2 and O3, respectively. As shown in [17], electrochemical gas sensors are the most commonly used type of LCS for air-quality monitoring and have demonstrated good performance (R² = 0.90 and R² = 0.81 for NO2 and O3, respectively, when calibration with an artificial neural network was performed). These sensors calculate the concentration of a given gas based on the changes it produces in the electrical properties of an electrode. The sensors are composed of 4 electrodes: working, reference, counter, and auxiliary electrodes [18]. From the current of the working and auxiliary electrodes, the gas concentration is calculated by means of a conversion algorithm, which will be detailed in Section 2.5.1.
The OPC-N3 sensor was chosen to measure the particle concentration. It is an optical particle counter (OPC) that uses Mie scattering to estimate the concentrations of particulate matter in the air [19]. The sensor incorporates a small fan that draws air inside, where a laser beam passes through the sample and strikes the suspended particles. Knowing the intensity of the scattered light and the refractive index, it is possible to estimate the PM concentration.
2.3. Measurement Campaigns
To fit and experimentally test the prototypes, the measurement campaign was conducted in two parts: first, calibration and validation of the sensors, and second, testing of the devices coupled to a bicycle performing several routes on different days. The place chosen for both the calibration and testing was the city of Badajoz (Spain).
2.3.1. Calibration Measurements
Calibration by adjusting the sensor measurements to the reference instrumentation (see details in Section 2.5) was carried out from 18 to 20 January 2021. During that period, the devices were collocated in parallel with a reference station at an urban location with high traffic density (38°52′14.6″ N 6°58′43.6″ W) to ensure that the sensors worked within a wide concentration range and with the high temporal variability typical of urban traffic. The air-quality-monitoring instruments that were used as a reference belonged to the Air Quality Protection and Research Network of Extremadura (REPICA), of the Department of Ecological Transition and Sustainability of the Regional Government of Extremadura. The devices were installed in the same position at which they were subsequently placed on the bicycles; thus, the conditions were similar in both scenarios.
The reference equipment used was as follows:
O3: Thermo Fisher 49i-B3ZAA (UV absorption)
NOx: Thermo Fisher 42i-BZMTPAA (chemiluminescence)
PM: DIGITEL DHA-80 (high-volume sampler + gravimetric analysis) and GRIMM 180 (optical laser light aerosol spectrometer, non-regulatory)
Data collected by the reference system, comprising 10-min-average values in concentration units of micrograms per cubic meter (μg/m3), were validated before being used for calibration. Data from the developed devices were averaged over the same intervals for comparison. This information was used to apply the calibration techniques described in Section 2.5.
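The alignment of device data with the 10-min reference averages can be sketched as follows. This is a minimal illustration, not the processing code used in the study: the function name `block_average` is hypothetical, and labelling each block by its start time is an assumption (the reference network may label intervals differently).

```python
from collections import defaultdict
from datetime import datetime


def block_average(samples, minutes=10):
    """Average (timestamp, value) pairs over fixed-length blocks.

    Returns a dict mapping the start of each block to the mean value.
    `minutes` is expected to divide 60 evenly.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for ts, value in samples:
        # Floor the timestamp to the start of its block.
        start = ts.replace(minute=ts.minute - ts.minute % minutes,
                           second=0, microsecond=0)
        sums[start][0] += value
        sums[start][1] += 1
    return {start: total / n for start, (total, n) in sorted(sums.items())}
```

Each resulting block can then be paired with the reference value for the same interval before calibration.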
2.3.2. Cycling Routes
To test the devices in the field, three routes (R1, R2, and R3), consisting of 90–120 min bicycle rides through Badajoz, were carried out on 22, 24, and 28 January 2021. During these routes, the bicycle travelled both on roads with high traffic density and in quiet areas with more vegetation and less traffic impact. In addition, the bicycle was equipped with a camera to record the entire route, in order to have more information when analyzing the results. The data collected by the sensors were corrected using the calibration algorithm. The air-quality index was calculated and the results were mapped, as shown in Section 3.4.
2.4. Data Acquisition and Filtering
The devices store pollutant-concentration data, data on the gas-sensor electrodes, the air flow of the OPC-N3 sensor, and data related to the environment on a microSD card, along with the device status such as temperature, humidity, battery, latitude, longitude, and altitude, with a sampling frequency of 3 datapoints per second. The data are then averaged to obtain an appropriate resolution, and all units are converted to micrograms per cubic meter (μg/m3) for later study.
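The conversion of gas readings from ppb to μg/m3 mentioned above can be sketched with the ideal-gas law. The reference conditions used as defaults here (20 °C and 101.325 kPa) are an assumption for illustration, since the text does not state which conditions were applied, and `ppb_to_ugm3` is a hypothetical helper name.

```python
R = 8.314462618                                 # molar gas constant, J/(mol K)
MOLAR_MASS = {"NO2": 46.0055, "O3": 47.9982}    # g/mol


def ppb_to_ugm3(ppb, gas, temp_c=20.0, pressure_pa=101_325.0):
    """Convert a mixing ratio in ppb to a mass concentration in ug/m3.

    ASSUMPTION: ideal-gas behaviour; the default temperature and pressure
    are a common reporting convention, not values taken from the text.
    """
    molar_volume_l = R * (temp_c + 273.15) / pressure_pa * 1000.0  # L/mol
    return ppb * MOLAR_MASS[gas] / molar_volume_l
```

With these defaults, 1 ppb corresponds to about 1.91 μg/m3 of NO2 and about 2.00 μg/m3 of O3.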
In addition, the devices send the concentration data to a cloud platform developed by the University of Coimbra and the University of Evora within the framework of the European NanoSen-AQM project [20,21]. This platform was developed so that users can have access to the collected data and accurate information on the quality of the air around them.
2.5. Calibration Process
2.5.1. Internal Algorithms
The gas sensors use the values of the working and auxiliary electrodes to calculate pollutant concentrations using an internal algorithm designed by the manufacturer through the following expression:

\[ C = \frac{\left(S_{WE} - S_{WE,0}\right) - n\,\left(S_{AE} - S_{AE,0}\right)}{s} \qquad (1) \]

where \(C\) is the pollutant concentration; \(S_{WE}\) and \(S_{AE}\) are the values of the working and auxiliary electrodes, respectively; \(S_{WE,0}\) and \(S_{AE,0}\) are the offsets of the working and auxiliary electrodes; \(n\) is a temperature-dependent parameter given by the manufacturer; and \(s\) is the sensitivity to the contaminant.
This calculation of the concentration from the voltage values was designed for stable conditions and indoor applications. Several authors [22,23] have shown that if the objective is to improve the performance of the sensors when working outdoors, then it is more advisable to work with the electrode values (SWE and SAE) than with the concentration values calculated from Formula (1).
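The manufacturer-style conversion discussed above can be sketched in a few lines. This is a minimal illustration that assumes the common form in which the offset-corrected auxiliary signal, scaled by the temperature factor, is subtracted from the offset-corrected working signal and the result is divided by the sensitivity. The function name and all numeric values in the usage note are hypothetical.

```python
def electrode_concentration(s_we, s_ae, s_we0, s_ae0, n, sensitivity):
    """Offset- and temperature-compensated concentration from electrode signals.

    s_we, s_ae   : working / auxiliary electrode signals
    s_we0, s_ae0 : electrode offsets (zero-gas readings)
    n            : temperature-dependent factor supplied by the manufacturer
    sensitivity  : sensor response per unit concentration (signal units per ppb)
    """
    if sensitivity == 0:
        raise ValueError("sensitivity must be non-zero")
    # ASSUMED form: (WE - WE0) - n * (AE - AE0), then scaled by sensitivity.
    corrected = (s_we - s_we0) - n * (s_ae - s_ae0)
    return corrected / sensitivity
```

For example, with made-up signals of 300 and 280 mV, offsets of 230 and 240 mV, n = 1.0, and a sensitivity of 0.25 mV/ppb, the function returns 120 ppb. The neural-network calibration described later bypasses this step and uses the raw electrode signals directly.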
Optical particle counters (OPCs) are calibrated in the laboratory by the provider, relating the intensity of the scattered light to the diameter and abundance of particles. This is performed using an aerosol generator of known size and optical properties. Thus, an OPC is defined by three parameters: the wavelength of the incident light, the scattering angle, and the number of particle-size intervals (24 in the case of the AlphaSense OPC-N3) [19]. The OPC-N3 counts the particles and creates a size distribution. The mass concentration is then obtained using an internal algorithm (black-box type) that uses the refractive index, particle density, and a weighting factor [24].
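Since the sensor's internal algorithm is a black box, the following is only a generic textbook sketch of how a size distribution can be turned into a mass concentration, not the manufacturer's method. It assumes spherical particles, one representative diameter per size bin, a single particle density, and an optional per-bin weighting factor; all names and parameters are hypothetical.

```python
import math


def mass_concentration(bin_diameters_um, bin_counts, sampled_volume_ml,
                       density_g_cm3, max_diameter_um, weights=None):
    """Estimate mass concentration (ug/m3) of particles up to max_diameter_um.

    Each bin is treated as spheres of one representative diameter.
    """
    if weights is None:
        weights = [1.0] * len(bin_counts)
    total = 0.0
    for d, count, w in zip(bin_diameters_um, bin_counts, weights):
        if d <= max_diameter_um:
            volume_um3 = math.pi / 6.0 * d ** 3      # sphere volume
            total += w * count * density_g_cm3 * volume_um3
    # um3 * g/cm3 gives 1e-6 ug and 1 mL is 1e-6 m3, so the factors cancel.
    return total / sampled_volume_ml
```

Calling the function with `max_diameter_um=2.5` or `max_diameter_um=10` gives PM2.5-like or PM10-like estimates from the same set of bin counts.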
2.5.2. Calibration by Means of Neural Network
In general, the aim of the calibration process is to find a function (f) that returns the concentration of each pollutant from the raw data of the electrodes (SWE and SAE; for simplicity, WE and AE) and environmental parameters such as temperature, relative humidity, pressure, and wind speed and direction.
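In this work, the variables used as the inputs for this calibration function were the electrode signals of each pollutant (WEi and AEi, with i = NO2, O3) and the temperature (T) and relative humidity (RH):

\[ C_i = f\left(WE_i,\, AE_i,\, T,\, RH\right), \qquad i = \mathrm{NO_2},\ \mathrm{O_3} \qquad (2) \]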
An artificial neural network was used to calculate this calibration function. Neural networks are a fundamental tool in the field of low-cost-sensor calibration [25,26,27,28]. In a previous work [29], we tested this type of algorithm against other common techniques and concluded that neural networks performed the best.
A multilayer perceptron was designed with two hidden layers and two inputs per layer. The activation function of the neurons in the hidden layers was a rectified linear unit (ReLU). These parameters were chosen in such a way as to optimize the results. The process was performed using Python 3.8, making use of the scikit-learn 0.24.1 package.
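To make the network structure concrete, the sketch below implements the forward pass of a perceptron with two ReLU hidden layers in plain Python. It assumes that "two inputs per layer" means two units per hidden layer, a feature order of [WE, AE, T, RH], and already-scaled inputs; the weights are made up for illustration and are not the trained values from this study.

```python
def relu(x):
    """Rectified linear unit."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: weights[j] holds the input weights of unit j."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def mlp_predict(features, params):
    """Forward pass: inputs -> hidden (ReLU) -> hidden (ReLU) -> linear output."""
    h1 = dense(features, params["W1"], params["b1"], relu)
    h2 = dense(h1, params["W2"], params["b2"], relu)
    return dense(h2, params["W3"], params["b3"])[0]


# HYPOTHETICAL weights, for illustration only (training would supply these).
params = {
    "W1": [[0.5, -0.2, 0.1, 0.0], [-0.3, 0.4, 0.0, 0.2]], "b1": [0.1, 0.0],
    "W2": [[1.0, -1.0], [0.5, 0.5]], "b2": [0.0, 0.1],
    "W3": [[2.0, 1.0]], "b3": [0.5],
}
# Assumed feature order: [WE, AE, T, RH], already scaled.
print(mlp_predict([1.0, 0.5, 0.2, 0.4], params))
```

In scikit-learn, the same architecture under this reading would correspond to `MLPRegressor(hidden_layer_sizes=(2, 2), activation="relu")`, with the weights learned by calling `fit` on the training set.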
Regarding the OPC-N3 particle sensor, the problems that arise when using it outdoors were exhaustively detailed in [7,24,30], highlighting the effect of temperature and the strong dependence on relative humidity. Relative humidity is especially relevant, since particles of suspended matter absorb part of the humidity in the air, which increases their size and modifies their refractive index, interfering with the sensor reading. In addition, it is important to highlight that because the OPC-N3 has a fan at the inlet, and because this fan is used to drive the flow to the gas sensors, a malfunction of the fan could affect both the particulate sensor and the two gas sensors. The status of the fan was monitored in all tests and will be discussed in the next section.
To solve the main problems of the OPC, another neural network was used. The inputs were the values returned by the OPC-N3 particle sensor (PM1, PM2.5, and PM10), the temperature (T) and relative humidity (RH) as variables describing the environment, and the input flow rate (FR):

\[ PM_j^{\,cal} = f\left(PM_1,\, PM_{2.5},\, PM_{10},\, T,\, RH,\, FR\right), \qquad j = 2.5,\ 10 \qquad (3) \]
For the gas and particle sensors, the calibration dataset was divided into training and test sets. The training set consisted of 60% of the data from the calibration campaign, and the test set contained the remaining 40%.
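The 60/40 partition can be sketched as below. The text does not say whether the split was random or chronological, so the seeded shuffle used here is an assumption, and `split_train_test` is a hypothetical helper name.

```python
import random


def split_train_test(rows, train_fraction=0.6, seed=0):
    """Shuffle rows reproducibly and split them into training and test lists."""
    shuffled = list(rows)
    # ASSUMPTION: random split; a chronological split would skip the shuffle.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

For time series with strong autocorrelation, a chronological split (training on the first 60% of the campaign) is a stricter alternative, because neighbouring samples do not end up on both sides of the partition.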
2.5.3. Model Evaluation
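In order to evaluate the effectiveness of the neural network in calculating the actual concentrations of pollutants, different metrics were calculated: slope, mean absolute error (MAE), mean squared error (MSE), and coefficient of determination (R²). The expressions for the MAE, MSE, and R² are as follows:

\[ \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right| \]

\[ \mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2 \]

\[ R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2} \]

where \(y_i\) is the real value of the pollutant for the ith sample (given by the reference equipment), \(\hat{y}_i\) is the calculated value for that sample (given by the neural network), \(\bar{y}\) is the mean value of \(y\), and \(N\) is the number of samples.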
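The evaluation metrics named above can be computed with a few lines of plain Python. MAE, MSE, and R² follow their standard definitions; the slope is taken here as the least-squares slope of the predicted values against the reference values, which is an assumption because the text does not define it. The function name is hypothetical.

```python
def regression_metrics(y_true, y_pred):
    """Return slope, MAE, MSE and R2 for predicted vs. reference values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("reference values are constant; R2 and slope undefined")
    r2 = 1.0 - ss_res / ss_tot
    # ASSUMPTION: slope of the least-squares line of predictions on reference.
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    slope = cov / ss_tot
    return {"slope": slope, "mae": mae, "mse": mse, "r2": r2}
```

A slope near 1 together with a high R² indicates that the calibrated sensor tracks the reference without systematic scaling error; a constant bias shows up in the MAE and MSE while leaving the slope unchanged.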
2.6. Air-Quality Index
As mentioned above, the raw data collected from an LCS are generally inaccurate and not reproducible. For this reason, the use of an LCS is not a straightforward choice for making accurate measurements without refinement. However, it is an interesting tool for informing citizens about the overall levels of pollution that may be present in the environment. For this reason, in this work, the data collected during the cycling routes were processed and translated into different air-quality levels as stipulated by legislation.
Spanish legislation [31,32], based on recommendations of the European Environment Agency, includes a methodology that allows the calculation of the air-quality index from data of the official reference equipment. This index is divided into different levels with respective recommendations.
This system recommends reducing outdoor activity at level 4 for the general population and eliminating it completely at level 6. For people who belong to a risk group or are particularly sensitive to the effects of pollution, reducing outdoor activities is recommended at level 3 and eliminating them at level 5.
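The recommendation scheme described above can be expressed as a small lookup. This sketch assumes six levels numbered 1 to 6 and that each recommendation applies from the stated level upward (for example, level 5 still means "reduce" for the general population); the function name and the wording of the returned strings are hypothetical.

```python
def outdoor_advice(level, sensitive=False):
    """Map an air-quality level (1-6) to the advice described in the text."""
    if not 1 <= level <= 6:
        raise ValueError("level must be between 1 and 6")
    # General population: reduce at 4, avoid at 6; risk groups: one level earlier.
    reduce_at, avoid_at = (3, 5) if sensitive else (4, 6)
    if level >= avoid_at:
        return "avoid outdoor activity"
    if level >= reduce_at:
        return "reduce outdoor activity"
    return "no restriction"
```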
According to the regulation, the air-quality index at a given time corresponds to the index with the highest value among the four pollutants (PM2.5, PM10, O3, and NO2). To obtain this index, the regulation has established a method that involves 8-h-average data, which would require long measurement campaigns. This procedure was not practical for the purpose of this work. Therefore, to calculate the air-quality index at any time on a cycling route, the data were preprocessed, calibrated, and averaged to 1 point/5 s in order to obtain a more manageable resolution for users. These data were then compared to the values in Figure 3, thus obtaining an index for each pollutant.
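The "worst pollutant wins" rule described above can be sketched as follows. The band limits in `BREAKPOINTS` are evenly spaced placeholders chosen only to make the example runnable; they are not the regulatory thresholds of Figure 3 and would have to be replaced by them. All names are hypothetical.

```python
from bisect import bisect_left

# PLACEHOLDER upper bounds (ug/m3) of levels 1..5 for each pollutant.
# These numbers are illustrative only and are NOT the regulatory thresholds.
BREAKPOINTS = {
    "PM2.5": [10, 20, 30, 40, 50],
    "PM10": [20, 40, 60, 80, 100],
    "O3": [50, 100, 150, 200, 250],
    "NO2": [40, 80, 120, 160, 200],
}


def pollutant_level(pollutant, concentration):
    """Level 1..6: first band whose upper bound is >= the concentration."""
    return bisect_left(BREAKPOINTS[pollutant], concentration) + 1


def air_quality_index(concentrations):
    """Overall index = worst (highest) level among the measured pollutants."""
    levels = {p: pollutant_level(p, c) for p, c in concentrations.items()}
    worst = max(levels, key=levels.get)
    return levels[worst], worst, levels
```

Returning the limiting pollutant alongside the index makes it easy to show on a route map which pollutant is responsible for a poor reading at each point.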
In addition to the four pollutants mentioned above, SO2 is also included in this regulation; however, for this work it was decided not to include a sensor to measure this gas since it is found in very low concentrations in the study area [33].
In this work, the values in Figure 3 were used as a reference to determine the air-quality index at particular moments in order to inform users of the pollution status. However, it should be emphasized that the procedure by which the concentrations of pollutants were calculated is not exactly the same as the one indicated in the regulations [31,32]; therefore, this index will not correspond to the official one. The air-quality index calculated and shown in the following section should be understood as an index that allows users of the device to have immediate but approximate information on the state of the air quality at a local level and a specific time, given the data collected by the LCS.
4. Conclusions
An electronic prototype was developed to monitor air quality in motion and in real time. It is based on low-cost electrochemical sensors and was designed to be coupled to a bicycle. This system obtains reliable information on concentrations of NO2, O3, PM2.5, and PM10 in order to inform bike users and citizens.
The device has a novel design: it takes advantage of the OPC-N3 air-supply pump as an input for the two gas sensors, NO2-A43F and OX-A431. However, this design requires paying special attention to the OPC-N3, since a malfunction in the air-supply pump would affect not only the PM measurements but also the NO2 and O3 measurements. The inlet variables were examined to ensure their correct operation, and the airflow to the sensors was incorporated as an additional input to the calibration algorithm.
Parallel measurements with the reference show some initial deviations of the LCS. These deviations were corrected by the calibration algorithm. Calibration by a neural network allows the sensor accuracy to be increased, with a coefficient of determination up to 0.85.
On the other hand, comparing the two devices made it possible to ensure repeatability. We conclude that there are no meaningful differences between BEC01 and BEC02.
The air-quality-index mapping described in this work allows us to inform citizens about air quality in a simple, easy-to-understand way. Preliminary field testing of the device during short-term cycling routes in urban areas during wintertime showed that the device can provide information about which areas of the city have better air quality, which days of the week have differences in the air-quality index, and at what times the impact of traffic is more severe, making the device a useful tool for citizens in addition to traditional instruments. Work is in progress to improve the quality testing of the system by conducting additional campaigns of longer duration and including different meteorological scenarios, with the goal of constructing a model of spatial distribution and temporal evolution of air pollutants along urban cycling routes.