**1. Introduction**

Air pollution has been reported to significantly affect human health [1], causing issues such as premature death, bronchitis, asthma, cardiovascular disease, and lung cancer [2]. Pollutants in the air include CO, NO2, and particulate matter. Among them, particulate matter with a diameter of less than 2.5 microns (PM2.5) is a key component which severely affects human health in many ways. For example, PM2.5 aerosols are able to directly enter the lungs through the respiratory tract and affect a person's health [3]. According to the World Health Organization report, more than 90% of the world's population inhales large amounts of pollutants every day, which results in approximately seven million deaths each year [4]. Consequently, PM2.5 concentration estimation is required and has become an important concern for human health [5,6].

Many techniques have been developed to measure the PM2.5 concentration, such as the filter-based gravimetric method [7], tapered element oscillating microbalance method [8], beta attenuation monitoring method [9], optical analysis method [10,11], and black smoke measurement [12]. These methods require expensive instruments and professional operations. Some more comprehensive methods analyze the relationship between human activities and PM2.5 by satellite and big data [13,14]. However, satellite and big data are not available to the common user. Therefore, a simple and effective method should be sought for PM2.5 concentration estimation.

In urban environments, researchers have developed low-cost sensors that are widely deployed throughout the city to monitor the PM2.5 concentration [15]. Although a single sensor is low in cost, covering a whole city requires many sensors, so the overall deployment is not cost-effective. A portable PM2.5 sensor can be used to monitor the PM2.5 concentration at different locations [16]. The portable device reduces the cost of employing a large number of sensors, but requires more manpower to move the sensors. Optical sensors, such as the TEOM 1400a analyzer, SDS011 (Nova Fitness, Jinan, China), ZH03A (Winsen, Zhengzhou, China), PMS7003 (Plantower, Beijing, China), and OPC-N2 (Alphasense, Braintree, UK), have been introduced to monitor PM2.5 [17]. However, these optical sensors are more expensive than ordinary cameras. Since a camera is installed on the top floor of environmental monitoring stations in Taiwan, using the camera to estimate PM2.5 is a simpler and more cost-effective approach than employing extra devices.

It should be noted that air pollution is usually characterized by poor visibility due to light scattering, such as Rayleigh scattering and Mie scattering, caused by the interaction between light and airborne particles [18]. In other words, the visibility is reduced when a large amount of aerosol pollution scatters the atmospheric light [19], and vice versa. In previous decades, some researchers proposed methods to estimate the visibility through image processing schemes [20,21]. Recently, an expensive digital camera was used to take high-quality photos for visibility estimation [22]. These studies have shown that image processing schemes can be applied to visibility estimation. Furthermore, it was reported that the PM2.5 concentration is related to visibility reduction [23]. However, these studies did not develop image processing technologies to estimate the PM2.5 concentration. Taken together, these findings suggest that the PM2.5 concentration may be estimated through image processing schemes.

The rapid development of computers, algorithms, and artificial intelligence has meant that image processing methods using machine learning have been widely applied. The main advantage of using machine learning is that the model is learned from training data, so that few features need to be defined by hand. Two types of training-based algorithms are neural network methods [24] and linear regression schemes [25]. The neural network methods require a very fast and expensive graphics processing unit [26]. By contrast, the estimation of spatial variations by linear regression could be performed by a consumer computer, with both the cost and the predictive performance being acceptable [27]. Nowadays, high-quality images can be taken by a commercial digital camera. This facilitates PM2.5 concentration estimation by image processing schemes.

Regarding which image features are related to the PM2.5 concentration, previous research has pointed out that the PM2.5 concentration may affect image characteristics, including the distance, hazy model, entropy, contrast, sky color, and solar zenith angle. It was found that the distance is the feature that has the most influence [28]. This is consistent with the definition of visibility, and a previous study has also shown that visibility can be estimated using high-frequency information from an image [22]. The region of interest (RoI) has also been manually selected to estimate PM2.5 concentrations [28]. However, the estimation performance might be degraded because of such a manually selected RoI. Besides, the computational cost might not be low, since a support vector regression model with several features was involved in the estimation. To solve these problems, this paper presents an approach to PM2.5 concentration estimation in which only a single feature is used and simple linear regression is employed as the estimator. The main contribution of the proposed approach is the use of a series of image processing schemes in PM2.5 concentration estimation, where the images are taken by a consumer camera. It provides a valuable alternative for estimating the PM2.5 concentration. The main aims of this image-based approach are as follows: (i) to automatically locate the RoI to replace the manual selection of Liu's work [28]; (ii) to use a single feature for linear regression instead of multiple features in PM2.5 concentration estimation, with an acceptable performance; and (iii) to provide a cheaper alternative method with a camera for estimating the PM2.5 concentration.

This paper is organized as follows. The proposed approach is described in Section 2. In Section 3, real-world data are used to verify the proposed approach. Finally, conclusions are drawn in Section 4.

#### **2. The Proposed Approach**

There are two stages involved in the proposed approach. In the first stage, a series of image processing schemes are employed to automatically locate the region of interest (RoI) to extract a single feature, which is required in the following stage for PM2.5 concentration estimation. In the second stage, a simple linear regression model is used with the training data, which contains pairs of the single feature obtained through the selected RoI and the actual PM2.5 concentration measurement. The simple linear regression model is then used in PM2.5 concentration estimation with the testing data. The estimated PM2.5 concentration is compared with the actual value and evaluated by performance indices. An overall block diagram for the proposed approach is depicted in Figure 1. The details of the proposed approach are described in the following sections. The proposed automatic RoI selection approach is described in Section 2.1, the simple linear regression model is given in Section 2.2, and three performance indices employed to assess the proposed approach are given in Section 2.3.

**Figure 1.** A block diagram of the proposed approach.

#### *2.1. Automatic RoI Selection*

It should be noted that not all parts of an image are strongly related to the PM2.5 concentration. Therefore, selecting an appropriate RoI to estimate the PM2.5 concentration is an important step for the successful application of the proposed approach. It is known that some details in an image are blurred when the PM2.5 concentration is high, compared to when it is low. In other words, the pixel values of images with high and low PM2.5 concentrations differ. This also illustrates that not every feature has a good correlation with the PM2.5 concentration. It motivates us to use the differences between image pairs of low and high PM2.5 concentrations in automatic RoI selection. A flowchart of the proposed automatic RoI selection is depicted in Figure 2. A pair of images, shown in Figure 3a,b, is given to demonstrate how the proposed automatic RoI selection works. Given a pair of images of low and high PM2.5 concentrations, both images are converted into gray-level images. The image with a low PM2.5 concentration is denoted as *I*<sub>1</sub> and the one with a high PM2.5 concentration is denoted as *I*<sub>2</sub>. The series of image processing steps used to determine the final RoI is described in the following.

**Figure 2.** A flowchart of the proposed automatic region of interest (RoI) selection.

**Figure 3.** A sample image pair. (**a**) *I*<sub>1</sub> (low PM2.5 concentration, 1 μg/m<sup>3</sup>); (**b**) *I*<sub>2</sub> (high PM2.5 concentration, 75 μg/m<sup>3</sup>).

#### 2.1.1. Sobel Edge Detection

As the first step, Sobel edge detection is applied to the image pair, *I*<sub>1</sub> and *I*<sub>2</sub>, to extract the high-frequency components [29]. In Sobel edge detection, the gradients used in this approach for the *x*-axis and *y*-axis, respectively, are denoted as *G<sub>x</sub>* and *G<sub>y</sub>*, and given as

$$\mathbf{G}_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \text{ and } \mathbf{G}_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}, \tag{1}$$

where a 3 × 3 mask is employed. The final magnitude *G<sub>xy</sub>* is calculated as

$$G_{xy} = |G_x| + |G_y|. \tag{2}$$

The images produced by applying Sobel edge detection to Figure 3a,b are denoted as *I*<sub>1,s</sub> and *I*<sub>2,s</sub>, and are shown in Figure 4a,b, respectively. In Figure 4, one can see that the low-concentration image *I*<sub>1,s</sub> has more details than *I*<sub>2,s</sub>; that is, more high-frequency components are contained in *I*<sub>1,s</sub> than in *I*<sub>2,s</sub>. For example, the two buildings on the right of Figure 3a do not appear in Figure 3b, because the PM2.5 concentration of Figure 3b is higher than that of Figure 3a. Consequently, the edges of the two buildings are invisible in Figure 4b. The difference between Figure 4a and Figure 4b is shown in Figure 4c. The results show that the PM2.5 concentration has a significant effect on the high-frequency components of images.

**Figure 4.** Images after Sobel edge detection. (**a**) *I*<sub>1,s</sub> (low PM2.5 concentration); (**b**) *I*<sub>2,s</sub> (high PM2.5 concentration); (**c**) the difference of (**a**) and (**b**).
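The Sobel step can be illustrated with a minimal pure-Python sketch. It applies the two kernels of Equation (1) and sums the absolute responses as in Equation (2). The function name `sobel_magnitude`, the toy images, and the choice to leave border pixels at zero are assumptions made for illustration; the paper does not state how borders are handled.

```python
# Sobel kernels from Equation (1).
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(gray):
    """Return |Gx| + |Gy| (Equation (2)) for a 2-D list of gray levels.

    Border pixels are left at 0 (an assumption for this sketch).
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = gy = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    p = gray[r + dr][c + dc]
                    gx += GX[dr + 1][dc + 1] * p
                    gy += GY[dr + 1][dc + 1] * p
            out[r][c] = abs(gx) + abs(gy)
    return out


# A high-contrast vertical edge (hypothetical clear-day patch).
sharp = [[0, 0, 255, 255] for _ in range(4)]
# The same edge with reduced contrast, as haze might produce.
hazy = [[100, 100, 140, 140] for _ in range(4)]

edges_sharp = sobel_magnitude(sharp)
edges_hazy = sobel_magnitude(hazy)
```

In this toy case the edge response of the low-contrast patch is much weaker than that of the sharp patch, which mirrors the qualitative difference between Figure 4a and Figure 4b.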

#### 2.1.2. Otsu Thresholding

After Sobel edge detection, Otsu thresholding [30] is applied to the two images in Figure 4a,b to obtain binary images. In Otsu thresholding, the pixels in an image are separated into two groups based on the histogram. By employing statistical properties, the optimal threshold, where the variance within each group is minimized and the variance between the two groups is maximized, is determined. In Otsu thresholding, the weighted sum of the variances within the two groups is found as

$$
\sigma_w^2(t) = w_0(t)\sigma_0^2(t) + w_1(t)\sigma_1^2(t),
\tag{3}
$$

where σ<sub>0</sub><sup>2</sup>(*t*) and σ<sub>1</sub><sup>2</sup>(*t*) represent the variance of each group, and *w*<sub>0</sub>(*t*) and *w*<sub>1</sub>(*t*) are the weights of the two groups separated by the threshold *t*, respectively. The weights *w*<sub>0</sub>(*t*) and *w*<sub>1</sub>(*t*) are obtained, respectively, as

$$w_0(t) = \sum_{i=0}^{t-1} p(i) \tag{4}$$

and

$$w_1(t) = \sum_{i=t}^{L-1} p(i), \tag{5}$$

where *p*(*i*) is the probability of the pixel value *i* and *L* is the number of gray levels. The variance between two groups is given as

$$
\sigma_o^2(t) = \sigma^2 - \sigma_w^2(t),
\tag{6}
$$

where σ<sup>2</sup> is the variance of the whole image. Equation (6) can be transformed into

$$
\sigma_o^2(t) = w_0(t)w_1(t) \left[\mu_0(t) - \mu_1(t)\right]^2, \tag{7}
$$

where μ<sub>0</sub>(*t*) and μ<sub>1</sub>(*t*) are the means of the two groups separated by threshold *t*. The optimal threshold is then found as the *t* which maximizes σ<sub>*o*</sub><sup>2</sup>(*t*) in Equation (7). The images *I*<sub>1,s</sub> and *I*<sub>2,s</sub>, after Otsu thresholding, are denoted as *I*<sub>1,so</sub> and *I*<sub>2,so</sub> and shown in Figure 5a and Figure 5b, respectively.

**Figure 5.** Images after Otsu thresholding. (**a**) *I*<sub>1,so</sub> (low PM2.5 concentration); (**b**) *I*<sub>2,so</sub> (high PM2.5 concentration).
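As a rough illustration of Equations (4), (5), and (7), the following sketch searches all thresholds and keeps the one with the largest between-group variance. The names `otsu_threshold` and `binarize`, the flat list of edge strengths, and the convention that pixels at or above the threshold become white are assumptions for this example, not details taken from the paper.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximising the between-group variance
    w0(t) * w1(t) * (mu0(t) - mu1(t))**2 of Equation (7).

    Group 0 holds gray levels 0..t-1 and group 1 holds t..levels-1,
    matching Equations (4) and (5).
    """
    n = len(pixels)
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total_sum = sum(pixels)

    best_t, best_var = 0, -1.0
    count0 = 0  # number of pixels in group 0
    sum0 = 0    # sum of gray levels in group 0
    for t in range(1, levels):
        count0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        count1 = n - count0
        if count0 == 0 or count1 == 0:
            continue  # one group is empty, so no split exists
        w0, w1 = count0 / n, count1 / n
        mu0 = sum0 / count0
        mu1 = (total_sum - sum0) / count1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(pixels, t):
    """Map pixels at or above t to 1 (white), others to 0 (assumed)."""
    return [1 if v >= t else 0 for v in pixels]


# Hypothetical edge strengths: weak background plus three strong edges.
edge_strengths = [10, 12, 11, 9, 200, 210, 205, 10]
t = otsu_threshold(edge_strengths)
binary = binarize(edge_strengths, t)
```

Working with integer counts rather than accumulated probabilities avoids floating-point drift when checking whether a group is empty.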

#### 2.1.3. Morphological Dilation

Using the obtained binary images, *I*<sub>1,so</sub> and *I*<sub>2,so</sub>, shown in Figure 5, morphological dilation is applied to expand boundaries and to connect neighboring pixels. The degree of expansion depends on the size of the structuring element. The equation employed for morphological dilation is given below:

$$A \oplus B = \{\, x \mid B_x \cap A \neq \emptyset \,\}, \tag{8}$$

where *A* is the image to be processed, *B* represents the structuring element, and *B<sub>x</sub>* denotes *B* translated to position *x*.

In the proposed RoI scheme, a 3 × 3 structuring element with all white pixels is used. After morphological dilation, the resulting images are denoted as *I*<sub>1,som</sub> and *I*<sub>2,som</sub> and shown in Figure 6a and Figure 6b, respectively.

**Figure 6.** Images after morphological dilation. (**a**) *I*<sub>1,som</sub> (low PM2.5 concentration); (**b**) *I*<sub>2,som</sub> (high PM2.5 concentration).
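A minimal sketch of dilation with a 3 × 3 all-white structuring element is given below. The function name `dilate3x3`, the toy image, and the treatment of pixels outside the image as black are assumptions for illustration.

```python
def dilate3x3(binary):
    """Dilate a binary image (2-D list of 0/1) with a 3 x 3 all-ones
    structuring element: an output pixel is 1 if any pixel in its
    3 x 3 neighbourhood is 1. Pixels outside the image count as 0
    (an assumption for this sketch)."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w and binary[rr][cc]:
                        out[r][c] = 1
    return out


# Two edge pixels separated by a one-pixel gap.
broken_edge = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
dilated = dilate3x3(broken_edge)
```

After dilation the gap between the two edge pixels is filled, so a later labeling step would treat them as one connected region.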

#### 2.1.4. Image Subtraction and Labeling

In this step, image subtraction is used to obtain the difference image for *I*<sub>1,som</sub> and *I*<sub>2,som</sub> in Figure 6. Then, a labeling scheme is employed to identify connected pixels. The difference image for *I*<sub>1,som</sub> and *I*<sub>2,som</sub> is shown in Figure 7, denoted as *I*<sub>d</sub>, where pixels with the same value in the image pair are eliminated and those with different pixel values remain in a white color. In order to distinguish whether pixels are connected, a labeling scheme [31] is applied to mark the connected pixels by colors. The connected neighborhood pixels are marked with the same color. After labeling, the resulting image, denoted as *I*<sub>dl</sub>, is as shown in Figure 8. Finally, the labeled regions with the top three largest numbers of pixels are considered as candidate regions of interest.

**Figure 7.** The difference image *I*<sub>d</sub> after image subtraction.

**Figure 8.** The image *I*<sub>dl</sub> after labeling.
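The subtraction and labeling step might be sketched as follows. The breadth-first labeling below is a generic connected-component routine and is not necessarily the scheme of [31]; the use of 8-connectivity, the function names, and the toy images are assumptions.

```python
from collections import deque


def difference_image(a, b):
    """Keep (as 1) only the pixels whose binary values differ."""
    return [[1 if pa != pb else 0 for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def label_regions(binary):
    """Group 1-pixels into 8-connected regions (connectivity assumed).

    Returns a list of regions, each a list of (row, col) pixels.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append(pixels)
    return regions


def top_regions(regions, k=3):
    """Return the k regions with the most pixels, largest first."""
    return sorted(regions, key=len, reverse=True)[:k]


low = [[1, 1, 0, 0, 0, 1],
       [1, 1, 0, 1, 0, 1],
       [0, 0, 0, 0, 0, 1]]
high = [[0, 0, 0, 0, 0, 1],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0]]

diff = difference_image(low, high)
candidates = top_regions(label_regions(diff))
```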

#### 2.1.5. Selected RoI in the Given Pair of Images

Now, the red flow path shown in Figure 2 will be described. The difference image, denoted as *I*<sub>sd</sub>, for *I*<sub>1,s</sub> and *I*<sub>2,s</sub> is obtained by image subtraction. Then, the three candidate regions of interest and the difference image *I*<sub>sd</sub> are overlapped to select the pixels in the candidate regions of interest. Next, the averages of pixel values in each candidate region of interest are calculated. Then, the RoI with the highest average is determined as the final RoI in the given pair of images, *I*<sub>1</sub> and *I*<sub>2</sub>. This completes the process of automatic RoI selection given in Figure 2 for the given pair of images.
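The selection rule described above can be sketched in a few lines. Here the difference of the two Sobel images is taken as an absolute difference, which is an assumption (the paper only says "image subtraction"); the function names and the toy data are likewise hypothetical.

```python
def mean_in_region(image, region):
    """Average pixel value of image over a list of (row, col) pixels."""
    return sum(image[r][c] for r, c in region) / len(region)


def select_roi(sobel_low, sobel_high, candidates):
    """Return (index, means): the candidate whose pixels have the
    highest average in the Sobel difference image, and all averages."""
    # Absolute difference of the two Sobel images (assumed form).
    diff = [[abs(a - b) for a, b in zip(ra, rb)]
            for ra, rb in zip(sobel_low, sobel_high)]
    means = [mean_in_region(diff, region) for region in candidates]
    return means.index(max(means)), means


sobel_low = [[100, 90, 0, 40],
             [80, 100, 0, 40]]
sobel_high = [[20, 10, 0, 30],
              [0, 20, 0, 30]]
candidates = [
    [(0, 0), (0, 1), (1, 0), (1, 1)],  # strong edges lost in haze
    [(0, 3), (1, 3)],                  # edges barely affected
]

best_index, region_means = select_roi(sobel_low, sobel_high, candidates)
```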

#### 2.1.6. Final RoI Determination

It needs to be pointed out that the image pair given above is just an example provided to show the process of the proposed automatic RoI selection. In practice, in automatic RoI selection, 30 images with a low PM2.5 concentration (≤5 μg/m<sup>3</sup>) and 120 images with a high PM2.5 concentration (≥70 μg/m<sup>3</sup>) are randomly selected from the training set. In this study, the images with low and high PM2.5 concentrations are paired by combinations. In other words, the 30 × 120 paired images are included in the automatic RoI selection process, as described in Figure 2. By using the averages of 3600 results, the three candidate regions of interest are determined, as shown in Figure 9. The box plot given in Figure 10 shows the range of average pixel values in each candidate RoI. Since Region 1 has the highest average value, it is selected as the final RoI to estimate the PM2.5 concentration. The average pixel value within the final RoI will be used as the only single feature for the following simple linear regression model in the proposed approach.

**Figure 9.** The three candidate regions of interest indicated by red boxes.

**Figure 10.** A box plot for three candidate regions of interest.
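The pairing of 30 low-concentration and 120 high-concentration images can be illustrated with `itertools.product`. The sketch below only shows the combinatorial pairing and the averaging of a per-pair score; each "image" is reduced to three hypothetical region edge strengths, and how the paper aggregates the 3600 results in detail is not specified, so the averaging shown here is an assumption.

```python
from itertools import product


def average_over_pairs(low_images, high_images, score_pair):
    """Average a per-pair list of region scores over every
    (low, high) combination. Returns (averages, number_of_pairs)."""
    pairs = list(product(low_images, high_images))
    totals = None
    for low, high in pairs:
        scores = score_pair(low, high)
        if totals is None:
            totals = [0.0] * len(scores)
        totals = [t + s for t, s in zip(totals, scores)]
    return [t / len(pairs) for t in totals], len(pairs)


def region_differences(low, high):
    """Hypothetical per-pair score: absolute difference per region."""
    return [abs(a - b) for a, b in zip(low, high)]


# Stand-in data: three region edge strengths per image (made up).
low_images = [[90 + i, 50, 70] for i in range(30)]
high_images = [[10, 30, 20] for _ in range(120)]

averages, n_pairs = average_over_pairs(
    low_images, high_images, region_differences)
final_roi = averages.index(max(averages)) + 1  # 1-based region number
```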

#### *2.2. Simple Linear Regression Model*

A simple linear regression model, which is a statistical analysis scheme [25], will be used to estimate the PM2.5 concentration in the proposed approach. Let *x<sub>i</sub>* be the average pixel value within the final RoI and *y<sub>i</sub>* be the corresponding PM2.5 concentration measurement in the training data (where subscript *i* denotes the *i*th sample). It is assumed that these two sequences of data have a linear relation, shown as

$$\hat{y}_i = \alpha + \beta x_i, \tag{9}$$

where α and β are coefficients to be determined, and *ŷ<sub>i</sub>* denotes an estimate of *y<sub>i</sub>* (the corresponding PM2.5 concentration). The estimation error between *y<sub>i</sub>* and *ŷ<sub>i</sub>* is given as

$$
\varepsilon_i = y_i - \hat{y}_i. \tag{10}
$$

Employing the least squares algorithm to minimize the sum of squared estimation errors, coefficients α and β can be found as

$$\alpha = \frac{1}{N}\sum_{i=1}^{N} y_i - \beta \, \frac{1}{N}\sum_{i=1}^{N} x_i \tag{11}$$

and

$$\beta = \frac{\sum_{i=1}^{N} x_i y_i - \frac{1}{N}\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{\sum_{i=1}^{N} x_i^2 - \frac{1}{N}\sum_{i=1}^{N} x_i \sum_{i=1}^{N} x_i}, \tag{12}$$

where *N* is the number of samples. Once the simple linear regression model is obtained, it is employed to estimate the PM2.5 concentration with the testing data.
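For illustration, the standard closed-form least-squares solution for a single feature can be written as below. The function names and the sample values are hypothetical; the data are constructed to lie exactly on the line *y* = 80 − 1.5*x* so that the fitted coefficients are easy to check.

```python
def fit_simple_linear_regression(x, y):
    """Return (alpha, beta) minimising sum((y_i - alpha - beta*x_i)**2)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    beta = (sxy - sx * sy / n) / (sxx - sx * sx / n)
    alpha = sy / n - beta * sx / n
    return alpha, beta


def predict(alpha, beta, x):
    """Estimate the PM2.5 concentration for each feature value."""
    return [alpha + beta * v for v in x]


# Hypothetical training pairs (RoI feature, measured PM2.5).
features = [10.0, 20.0, 30.0, 40.0]
pm25 = [65.0, 50.0, 35.0, 20.0]

alpha, beta = fit_simple_linear_regression(features, pm25)
estimates = predict(alpha, beta, [25.0])
```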

#### *2.3. Performance Indices*

Inherently, an image-based method cannot analyze the ingredients in the air as the methods in previous works do; thus, it is hard to define a single parameter that shows the performance by error. Instead, three overall performance indices are used to evaluate the proposed approach. The first one is the root mean square error (RMSE). It is used to show the error between the recorded value and the estimated value of the proposed method. RMSE is calculated as

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}, \tag{13}$$

where *y<sub>i</sub>* and *ŷ<sub>i</sub>* are the true and estimated PM2.5 concentrations, respectively. The second performance index is R squared (*R*<sup>2</sup>), which has also been used in previous work [28], and is employed to show the correlation between estimated results and measured values. It is defined as

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \overline{y} \right)^2}, \tag{14}$$

where *ȳ* is the mean of *y<sub>i</sub>*. *R*<sup>2</sup> indicates the linearity between *y<sub>i</sub>* and *ŷ<sub>i</sub>*. When the relation is perfectly linear, *R*<sup>2</sup> = 1. The third index is the *F*-test, whose test statistic follows an F-distribution under the null hypothesis [32], and whose *p*-value indicates the statistical significance; that is, it determines whether the result is beyond chance or not. The *p*-value will be used as an indicator of statistical significance in the following experiments.
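The first two indices follow directly from Equations (13) and (14). The sketch below also computes an *F* statistic using the usual relation for a one-predictor regression, *F* = *R*<sup>2</sup>(*N* − 2)/(1 − *R*<sup>2</sup>), which is assumed here since the paper does not state its exact form; the sample values are made up. Turning *F* into a *p*-value requires the F-distribution with (1, *N* − 2) degrees of freedom, which is left out to keep the example free of external libraries.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error, Equation (13)."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, Equation (14)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot


def f_statistic(r2, n):
    """F statistic for one predictor and n samples (assumed form)."""
    return r2 / ((1 - r2) / (n - 2))


# Hypothetical measured and estimated PM2.5 values.
measured = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 37.0]

error = rmse(measured, estimated)
r2 = r_squared(measured, estimated)
f_value = f_statistic(r2, len(measured))
```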

#### **3. Experimental Results**

In this section, the proposed approach is verified by a real-world data set, which is described later in Section 3.1. Then, the results without and with unreliable data exclusion are shown in Sections 3.2 and 3.3, respectively.

#### *3.1. Experimental Data Sets*

In the experiments, the images were taken from Renwu Environmental Monitoring Station, Kaohsiung City, Taiwan. A consumer camera was set up at the station and took one image every ten minutes during the period of 7:00 AM to 5:00 PM. In total, 10,084 images were collected from May to October 2016. We did not exclude sampled images of sunny or rainy days. The image data were divided into training and testing data, of which the proportions were 60% and 40%, respectively. The images shown in Figure 3a,b are examples taken from the data set. Furthermore, the hourly PM2.5 concentration and relative humidity (RH) in the corresponding area were obtained from the open data released by the Environmental Protection Administration, Executive Yuan, Taiwan [33]. Using the data, a simple linear regression model was obtained and used to estimate the PM2.5 concentration by employing the proposed approach.

#### *3.2. Results with All Data*

In this experiment, the whole data set, including 10,084 images, was used. As described in Section 2, three candidate regions of interest were automatically selected and the final RoI was determined by the highest average pixel value among the three candidate regions of interest. Besides, the average pixel value in the final RoI was used as the only single feature. To compare the estimation performances, scatter plots for the whole image, Region 1, Region 2, and Region 3 are presented in Figure 11a–d, respectively, where the region under consideration is shown in the upper right corner of each plot. The three performance indices with all data are displayed in Table 1. Table 1 indicates that Region 1 had a better performance than the other cases. Besides, all results were statistically significant in the *F*-test. In the case with all data, the highest *R*<sup>2</sup> was 0.41, which was achieved by Region 1. One may see that the performance of the whole image case is inferior to those for the candidate regions of interest. When Regions 1 to 3 are considered, the order of the performance index *R*<sup>2</sup>, from high to low, is Region 1, Region 3, and Region 2. The result is consistent with the priority for the proposed automatic RoI selection. In other words, the proposed automatic RoI selection is appropriate for the given data.

**Figure 11.** The scatter plots for (**a**) the whole image; (**b**) Region 1 (selected); (**c**) Region 2; (**d**) Region 3.

**Table 1.** The performance indices (with all data).


#### *3.3. Results with Unreliable Data Exclusion*

By conducting experiments, it was observed that two factors may affect the performance of the proposed approach. One is the time difference between the time to take images and the time to measure the PM2.5 concentration. For the data set described in Section 3.1, the images were taken every ten minutes, but the PM2.5 concentration was collected hourly. In other words, six images were related to only one PM2.5 concentration for each hour. When the PM2.5 concentration changes within an hour, it might degrade the estimation performance. To solve this problem, the variance of six images taken in the same hour was calculated. When the variance was greater than 1, the images were considered as unreliable data and discarded.
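The hourly variance check might look like the sketch below. It assumes that the variance is computed over the RoI feature values of the six images in an hour and that the population variance is meant; the paper specifies neither, and the function name and sample values are hypothetical.

```python
from statistics import pvariance


def drop_unstable_hours(features_by_hour, max_variance=1.0):
    """Keep only the hours whose feature values have a variance of at
    most max_variance (population variance, an assumption)."""
    return {hour: values
            for hour, values in features_by_hour.items()
            if pvariance(values) <= max_variance}


# Six hypothetical RoI feature values per hour (one per ten minutes).
features_by_hour = {
    "09:00": [40.0, 40.5, 41.0, 40.2, 40.8, 40.5],  # stable hour
    "10:00": [40.0, 44.0, 48.0, 52.0, 47.0, 43.0],  # changing hour
}

kept = drop_unstable_hours(features_by_hour)
```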

The other factor seen to affect the performance of the proposed approach was the RH. There are many substances, in addition to PM2.5, in the atmosphere that affect visibility, such as sulfur oxides, nitrogen oxides, carbon monoxide, and water droplets. It has been observed that PM2.5 aerosols are expanded by absorbing water molecules in the air and this affects visibility [34]. It has also been reported that the RH affects PM2.5 concentration estimation [28]. Consequently, the effect of RH on PM2.5 concentration estimation was considered in the proposed approach.

By conducting experiments, we observed that the estimation performance of the proposed approach was significantly degraded when RH ≥ 65%. Consequently, the data were excluded if the corresponding RH ≥ 65%. Moreover, it should be noted that human health is mostly endangered by a higher PM2.5 concentration, instead of a lower one. Consequently, the data with PM2.5 concentrations less than 5 μg/m<sup>3</sup> were excluded. By employing the criteria RH ≥ 65% or PM2.5 concentration less than 5 μg/m<sup>3</sup>, 2361 images were excluded from the given data set. With the consideration of data exclusion, the three performance indices were recorded and are presented in Table 2 for all cases, as in Table 1. As seen in Table 2, Region 1 had a better performance than the other cases, as in Table 1. Moreover, all results were statistically significant in the *F*-test. When comparing the results presented in Tables 1 and 2, one can see that the RMSE and *R*<sup>2</sup> were clearly improved in all cases with data exclusion. Additionally, Region 1 exhibited the most improvement. The RMSE was reduced from 11.88 to 8.67, while the *R*<sup>2</sup> increased from 0.41 to 0.73. Again, the results implied that the automatically selected RoI was appropriate in the given example. To sum up, the proposed approach with automatic RoI selection and data exclusion is feasible and has an acceptable performance for PM2.5 concentration estimation. From Table 2, one may observe that the performance of the whole image case is inferior to those for the candidate regions of interest, as in Table 1. According to the results, the performances from high to low are Region 1, Region 3, and Region 2, which is consistent with the priority for the proposed automatic RoI selection, as shown in Figure 10. Again, the results have verified the feasibility of the proposed automatic RoI selection scheme in the given experiments.
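The two exclusion criteria stated above (RH ≥ 65% or PM2.5 below 5 μg/m<sup>3</sup>) amount to a simple filter. The record layout with keys `rh` and `pm25`, the function name, and the sample values are assumptions for this sketch.

```python
def is_reliable(sample, rh_limit=65.0, pm25_floor=5.0):
    """Return True if the sample passes both criteria: relative
    humidity below rh_limit (%) and PM2.5 of at least pm25_floor
    (micrograms per cubic metre)."""
    return sample["rh"] < rh_limit and sample["pm25"] >= pm25_floor


# Hypothetical records: relative humidity (%) and PM2.5 (ug/m3).
samples = [
    {"rh": 55.0, "pm25": 32.0},  # kept
    {"rh": 65.0, "pm25": 40.0},  # excluded: RH >= 65%
    {"rh": 50.0, "pm25": 4.0},   # excluded: PM2.5 < 5
    {"rh": 64.9, "pm25": 5.0},   # kept: both limits just satisfied
]

kept_samples = [s for s in samples if is_reliable(s)]
```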


**Table 2.** The performance indices (with unreliable data exclusion).
