1. Introduction
Numerous air pollution epidemiology studies have found associations between ambient concentrations and adverse health effects [
1,
2]. Due to various challenges with personal exposure measurements (e.g., cost, participant burden), these health studies often use outdoor air monitors as exposure surrogates, which can: (1) introduce negative bias in health effect estimates due to time spent in indoor microenvironments with ambient-source pollutant concentrations that can be substantially attenuated from outdoor levels [
3,
4], and (2) increase confidence intervals of health effect estimates by not accounting for building-to-building and temporal variability of this attenuation [
4]. To help improve health effect estimates, we are developing an air pollution exposure model for individuals (EMI) in health studies [
5,
6,
7,
8]. The EMI predicts personal exposures based on outdoor concentrations, meteorology, questionnaire information (e.g., building characteristics, occupant behavior related to building operation), and time-location information. A critical aspect of EMI is the air exchange rate (AER) of individual homes, which is the rate of exchange of indoor air with outdoor air. In addition, AERs have been applied as a covariate or modifying factor in air pollution epidemiology studies, showing the importance of this variable [
9,
10].
This study addresses the cross-validation and application of residential AER models, and specifically the AER predictions for the Near-Road Exposures and Effects of Urban Air Pollutants Study (NEXUS) [
5]. The goal of NEXUS is to examine traffic-related air pollution exposures and respiratory effects in asthmatic children living near major roads in Detroit, Michigan (MI).
The AER affects both the steady-state (i.e., long-term average) and dynamic (i.e., time-varying) behaviors of indoor air pollutant concentrations, and the resulting exposures [11]. For example, assume that outdoor concentrations Cout_ss are under steady-state conditions (i.e., short-term changes of concentrations are considered negligible compared with long-term average concentrations); then the steady-state indoor concentrations of outdoor-generated air pollutants Cin_ss can be described by:

$$C_{\mathrm{in\_ss}} = F_{\mathrm{inf}}\, C_{\mathrm{out\_ss}} \qquad (1)$$

where Finf is the fraction of Cout_ss that enters and remains airborne indoors (infiltration factor), defined as:

$$F_{\mathrm{inf}} = \frac{P \cdot \mathrm{AER}}{\mathrm{AER} + k_d} \qquad (2)$$

where P is the penetration coefficient, and kd is the indoor loss rate. Setting P = 0.9 and kd = 1.0 h⁻¹ based on reported values for particulate matter (diameter ≤ 2.5 µm; PM2.5), Cin_ss for a tight (AER = 0.1 h⁻¹) and leaky (AER = 3.0 h⁻¹) building is 0.08 and 0.68 times Cout_ss, respectively. Therefore, the AER can substantially affect Cin_ss. Furthermore, studies examining particulate matter show that the AER can explain a substantial amount of the variability of Finf [12,13,14]. For time-varying outdoor concentrations Cout (e.g., traffic), indoor concentrations Cin can be described by the dynamic mass balance equation:

$$\frac{dC_{\mathrm{in}}}{dt} = P \cdot \mathrm{AER} \cdot C_{\mathrm{out}} - (\mathrm{AER} + k_d)\, C_{\mathrm{in}} \qquad (3)$$

Measurements of Cout and Cin for time-varying traffic pollutants show that the dynamic behavior of Cin depends on the AER [15]; for example, Cin increases more slowly and reaches lower peak levels for tighter buildings [16].
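The steady-state arithmetic above can be checked with a minimal sketch. It assumes the standard single-zone form Finf = P·AER/(AER + kd), which reproduces the 0.08 and 0.68 values quoted in the text; the function name `infiltration_factor` is hypothetical and is not part of the EMI software.

```python
def infiltration_factor(aer, p=0.9, kd=1.0):
    """Steady-state infiltration factor Finf = P*AER / (AER + kd).

    aer: air exchange rate (1/h)
    p:   penetration coefficient (dimensionless); 0.9 is the PM2.5 value from the text
    kd:  indoor loss rate (1/h); 1.0 is the PM2.5 value from the text
    """
    if aer < 0 or kd < 0:
        raise ValueError("aer and kd must be non-negative")
    if aer + kd == 0:
        raise ValueError("aer + kd must be positive")
    return p * aer / (aer + kd)


# Tight and leaky buildings from the text: Finf is roughly 0.08 and 0.68.
for label, aer in [("tight", 0.1), ("leaky", 3.0)]:
    print(f"{label}: AER = {aer} 1/h, Finf = {infiltration_factor(aer):.2f}")
```

With these inputs, a thirty-fold change in AER changes the steady-state indoor concentration of ambient PM2.5 by roughly a factor of eight.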
For gaseous pollutants with kd > 0 (e.g., ozone), Finf depends on AER [17]. For gases with negligible kd (e.g., carbon monoxide) compared with AER, Cin_ss can be considered independent of the AER based on Equation (2) (Finf = P) [18]. However, for outdoor pollutants that vary with time (e.g., traffic), time-varying Cin (Equation (3)) depends on AER even when kd is negligible compared with AER [15].
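A small simulation can illustrate why the dynamic response depends on the AER even when kd is negligible. The sketch below assumes the single-zone mass balance dCin/dt = P·AER·Cout − (AER + kd)·Cin, integrated with a simple forward Euler scheme. The function name, the step-change outdoor profile, and the choice P = 1 are illustrative assumptions, not values from the study.

```python
def simulate_indoor(c_out, aer, t_end_h, p=1.0, kd=0.0, c_in0=0.0, dt_h=0.01):
    """Integrate dCin/dt = P*AER*Cout(t) - (AER + kd)*Cin with forward Euler.

    c_out:   callable, time in hours -> outdoor concentration
    aer:     air exchange rate (1/h)
    t_end_h: simulated duration (h)
    Returns a list of (time_h, c_in) tuples, one entry per time step.
    """
    n_steps = int(round(t_end_h / dt_h))
    c_in = c_in0
    history = [(0.0, c_in)]
    for i in range(n_steps):
        t = i * dt_h
        dcdt = p * aer * c_out(t) - (aer + kd) * c_in
        c_in += dt_h * dcdt
        history.append(((i + 1) * dt_h, c_in))
    return history


def unit_step(t):
    """Hypothetical outdoor profile: concentration jumps to 1 at t = 0."""
    return 1.0


tight = simulate_indoor(unit_step, aer=0.1, t_end_h=100.0)
leaky = simulate_indoor(unit_step, aer=3.0, t_end_h=100.0)

# After 1 h (index 100 at dt = 0.01 h) the leaky home is close to the outdoor
# level while the tight home has barely responded; both approach P*Cout eventually.
print(f"1 h:   tight = {tight[100][1]:.3f}, leaky = {leaky[100][1]:.3f}")
print(f"100 h: tight = {tight[-1][1]:.3f}, leaky = {leaky[-1][1]:.3f}")
```

With kd = 0 both homes converge to the same steady state (Finf = P), but the time needed to get there scales with 1/AER, so short outdoor peaks are damped much more in the tight home.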
A residential AER model has several benefits for exposure assessments in health studies. First, the AER is a key determinant for the entry of outdoor-generated air pollutants and the removal of indoor-generated air pollutants [
11,
19]. Since people in the United States spend approximately 66% of their time indoors at home [
20,
21], the residential AER is a critical parameter for air pollution exposure models. Costs and participant burden often limit the number of AER measurements. Therefore, a residential AER model integrated within exposure models can be a feasible method to predict exposure metrics for epidemiological analysis. Second, an AER model can reduce the uncertainty of exposure models by accounting for factors that influence the house-to-house (spatial) and temporal variability of the AER. These factors include the physical driving forces of the airflows (e.g., indoor-outdoor temperature differences, wind speed), building characteristics (e.g., local wind sheltering, building height, tightness of the building envelope), and occupant behavior (e.g., opening windows). Spatial and temporal differences in weather, building characteristics, and occupant behavior can produce substantial AER variations. The resulting spatial and temporal variations in exposure may help explain the impact of AER for individuals with exceptionally high and low exposures. Also, predicting the AER variability can help reduce exposure misclassifications, and the resulting errors in health effect estimates.
Various AER models are described in the literature [
11]. The Lawrence Berkeley Laboratory (LBL) model is widely used to predict residential AER [
22]. The LBL model predicts the AER due to airflow through small unintentional openings (
i.e., leakage), but does not account for the airflow through large controllable openings (
i.e., natural ventilation), such as open windows. Previously, we addressed this limitation by extending the LBL model (LBLX) to predict natural ventilation airflow [
7]. In this study, we used the previously developed LBL and LBLX models, which were linked with a leakage area model, to predict the AER from questionnaire and weather data [
7]. The LBL model was used for all homes, and the LBLX model was used for a subset of homes with window opening data, as described below.
The NEXUS design includes the development of various tiers of modeled exposure metrics for traffic-related air pollutants, and the use of measurements from a subset of homes for model calibration (
i.e., parameter estimation) and evaluation [
5]. This paper focuses on modeling the residential AER. We used NEXUS questionnaires and airport weather data as inputs for the AER models, and AER measurements from a subset of homes for parameter estimation and model evaluation. Below, we first describe the NEXUS design, and then describe the AER models, methods for parameter estimation and model evaluation, and development of daily AER predictions for the three year health study.
4. Discussion
Our goal was to develop daily AER predictions for each NEXUS participant home to provide improved exposure estimates for the health study. We used cross-validation to evaluate two models (LBL and LBLX), which predict residential AER from questionnaires and meteorology, with measured AERs from a subset of NEXUS homes. The daily modeled AERs closely correspond to the measured AERs, with the same overall median |ε| of 29% for both the LBL and LBLX models. These results demonstrate that it is possible to apply these models for individual-level air pollution exposure assessments that require daily predictions of house-specific AER. However, the impact of applying these models for a health study in support of improving health effect estimates will depend not only on the accuracy of exposure predictions, but also on other factors such as the design of the health study [33,34].
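The evaluation metric can be sketched as follows, assuming |ε| denotes the absolute relative error between modeled and measured AER expressed as a percentage of the measured value. The function name and the three AER pairs are hypothetical and serve only to show the calculation.

```python
from statistics import median


def abs_relative_errors(predicted, measured):
    """Return |predicted - measured| / measured * 100 for each pair (percent)."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if any(m <= 0 for m in measured):
        raise ValueError("measured AER values must be positive")
    return [abs(p - m) / m * 100.0 for p, m in zip(predicted, measured)]


# Hypothetical daily AER values (1/h) for illustration only.
measured_aer = [0.5, 1.0, 2.0]
modeled_aer = [0.6, 0.8, 2.0]

errors = abs_relative_errors(modeled_aer, measured_aer)
print(f"median |error| = {median(errors):.0f}%")
```

Reporting the median rather than the mean keeps a few days with very small measured AERs, which inflate relative errors, from dominating the summary.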
Figure 4.
Time-course of AER predictions (A), absolute indoor-outdoor temperature differences (B), outdoor temperatures (C), and wind speeds (D) across the three years of health study. Two AER time-course plots correspond to homes with highest and lowest median AER predictions. Plots show daily 24 h average values across three years of health study from 1 January 2010 to 31 December 2012. AER oscillations correspond to indoor-outdoor temperature differences. AER transients of positive or negative spikes correspond primarily to wind speeds and secondarily to indoor-outdoor temperature differences.
Figure 5.
AER predictions for 213 homes across three years of health study with results for each season and road type. Boxes correspond to median, 25th and 75th percentiles; and whiskers correspond to minimum and maximum values. Winter includes December, January, and February; spring includes March, April, May; summer includes June, July, August; fall includes September, October, and November.
Figure 6.
AER predictions for 213 homes across the three years of the health study with results for individual homes grouped by the three traffic categories: HTHD (A), HTLD (B), and LTLD (C). Box plots show median, 25th and 75th percentiles, and whiskers represent minimum and maximum values of 24 h average AER.
We found considerable variation in measured AERs (range: 0.09–3.48 h⁻¹) and modeled AERs (range: 0.11–3.04 h⁻¹). Another study in central North Carolina showed similar variation in measured AERs (range: 0.09–3.17 h⁻¹) across 31 homes on seven consecutive days during the same two seasons (spring, fall) as the seasonal intensives in NEXUS [7]. This suggests that AER differences may be an important source of heterogeneity in the infiltration of outdoor air pollutants into homes and the resulting exposures, even for studies focused on within-city variations and for studies in different geographical locations. Using questionnaire and weather data, the LBLX and LBL models explained a substantial amount of the measured AER variation (R² = 61% and 59%, respectively).
There is substantial temporal variation in the modeled AER that differs for each home based on the building envelope tightness. The home with the largest Aleak (i.e., leakiest building envelope) had the highest median AER (1.64 h⁻¹) and largest AER range (0.50–3.04 h⁻¹) across time. The home with the smallest Aleak (i.e., tightest building envelope) had the lowest median AER (0.36 h⁻¹) and smallest AER range (0.11–0.64 h⁻¹) across time.
This study demonstrates a novel health study design and modeling method intended to improve residential AER predictions for individual exposure assessments in health studies. This study is the first to use daily AER measurements and window opening data from a subset of homes for parameter estimation (i.e., model calibration) and model evaluation, and then apply the calibrated model to predict the spatial and temporal variations of the AER for each participant’s home in a health study. This approach allowed us to estimate the uncertainty of the model parameters (e.g., based on the jackknife method) and the uncertainty of the model predictions (e.g., based on the cross-validation method), which can be important when the model is applied for health effect analyses [33].
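The jackknife idea mentioned above can be sketched generically: re-estimate the parameter with each observation (here, each home) left out in turn, and use the spread of those leave-one-out estimates as a standard error. The estimator, function names, and data below are hypothetical placeholders; the study's leakage area model and its parameters are not reproduced here.

```python
import math


def jackknife_se(data, estimator):
    """Jackknife standard error of estimator(data).

    Uses the standard formula sqrt((n - 1) / n * sum((theta_i - theta_bar)^2)),
    where theta_i is the estimate with observation i removed.
    """
    data = list(data)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    theta_bar = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((t - theta_bar) ** 2 for t in leave_one_out))


def sample_mean(values):
    """Placeholder estimator; a real application would refit the model here."""
    return sum(values) / len(values)


# Hypothetical per-home values for illustration only.
values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(f"jackknife SE = {jackknife_se(values, sample_mean):.4f}")
```

For the sample mean, the jackknife standard error coincides with the usual s/√n, which gives a convenient check on the implementation before swapping in a model-fitting estimator.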
We can compare our model performance using two alternative approaches for parameter estimation of
Aleak. First, we estimated parameters using both the 23 older homes and the one newer home combined instead of estimating parameters using only the 23 older homes, as described in the methods. Using this alternative method, the median |ε| for the one newer home increased from 17% to 91% (
Supplementary Material Figure S5,
Figure 3). Second, we used the literature-reported parameters for both the 23 older homes and one newer home instead of only for the newer home, as described in the methods. Using this alternative approach, the median |ε| for the older homes increased from 29% to 43%, the 25th percentile increased from 12% to 19%, and the 75th percentile increased from 63% to 131% (
Supplementary Material Figure S5,
Figure 3). This demonstrates the benefit of including AER measurements from a subset of homes, which represent the housing stock of homes in the same city as the health study, to reduce the AER model uncertainty.
We can compare the AER model evaluation with other studies. LBL model evaluations using whole-building pressurization measurements to determine the leakage area showed mean |ε| of 26%–46% [
35] and 25% [
36] for detached homes. For our implementation of the AER models, which uses a leakage area model, the LBL and LBLX models had mean |ε| of 43% and 48%, respectively for 31 detached homes across four seasons in central North Carolina [
7]. In this study, the LBL and LBLX models both had a mean |ε| of 45%. Given the limitations of single-zone AER models (e.g., no internal resistance to airflow, no internal temperature or pressure differences) and the AER measurement error of the PFT method (accuracy of 20%–25%, precision of 5%–15% for occupied homes) [
19,
25,
26], our LBL and LBLX model evaluations are reasonable, but their impact will depend on the particular application.
For parameter estimation, Tin was set to the 24 h average indoor temperature time-matched to the 24 h average AER measurements from a subset of homes. However, for predicting the daily AER for all homes across the three-year study, Tin was set to a constant (24 °C), which was the median indoor temperature measured in the subset of homes. To investigate the impact of using a constant Tin, we compared the LBL model predictions with Tin set to a lower and upper limit of 20 and 28 °C, respectively. For Tin set to 20, 24, and 28 °C, the minimum AER was 0.10, 0.11, and 0.16 h⁻¹; the median AER was 0.88, 0.95, and 1.05 h⁻¹; and the maximum AER was 3.00, 3.04, and 3.12 h⁻¹. Since these results are similar, we expect that setting Tin to 24 °C does not have a substantial impact on the AER model predictions.
On days with open windows, similar model evaluation results were obtained for the LBLX model, which includes both leakage and natural ventilation, and the LBL model, which includes only leakage. Another study showed similar results for the LBLX and LBL models with AER measurements and window opening data from 31 homes in central North Carolina [7]. For 253 days with open windows across four consecutive seasons, the median |ε| was 41% and 48% for the LBLX and LBL models, respectively. For days with open windows, the LBL model slightly underestimates the measured AER, whereas the LBLX model slightly overestimates it. Also, the LBL and LBLX models may perform similarly since windows may be opened more often on comfortable days with small indoor-outdoor temperature differences. Thus, the stack effect may be small on days with windows opened. In addition, the stack effect can be reduced after windows are opened, as indoor and outdoor temperatures move toward thermal equilibrium. These results suggest that our application of the LBL model, instead of the LBLX model, for the NEXUS health study is reasonable. In certain geographical locations (e.g., coastal regions) with high and persistent winds, comfortable outdoor temperatures across seasons, and frequent window opening, the LBLX model may provide substantially improved estimates as compared to the LBL model.
The temporal resolution of the AER is determined by the meteorological data. In this paper, we used hourly outdoor temperature and wind speed measurements to predict hourly AER, and then calculated 24 h averages to compare with the 24 h average AER measurements. To account for the diurnal variation of traffic-related air pollutants, we plan to use the hourly AER predictions combined with hourly residential outdoor concentration predictions to predict every NEXUS participant’s hourly residential indoor concentrations based on the dynamic mass balance model (Equation (3)) [
4].
Since the AER is the key parameter for
Finf (Equation (2)), we can compare our AER models with a previously reported model used to predict
Finf of outdoor PM
2.5 for individual homes in a health study [
13]. The reported
Finf model is an empirical linear regression model that does not include the stack and wind effects, which are the driving forces for leakage and natural ventilation airflows. The
Finf model also does not account for differences in the leakage area between homes. In our study, we used the mechanistic LBL and LBLX models that include the stack and wind effects, and the building characteristics that modify the stack effect (
i.e., building height) and wind effect (
i.e., local wind sheltering and building height). Also, these AER models are linked to a building-specific leakage area model (Equation (5)). Furthermore, we estimated only a few parameters based on daily measurements, whereas the reported
Finf model required several parameters to be estimated based on two-week average measurements.
Most air pollution health studies use outdoor concentrations as an exposure surrogate. Under steady-state conditions, exposure E can be described by:

$$E = C_{\mathrm{out\_ss}} \left[ f_{\mathrm{in}} F_{\mathrm{inf}} + (1 - f_{\mathrm{in}}) \right]$$

where fin is the fraction of time spent indoors. Therefore, E depends on the product of steady-state outdoor concentration Cout_ss and outdoor attenuation (fin·Finf + (1 − fin)). Since people spend more time indoors than outdoors (i.e., fin > (1 − fin)) [20], Finf is a substantial component of outdoor attenuation. When Cout_ss is used as an exposure surrogate, the estimated health effect parameter is reduced (i.e., biased towards the null) since it is the product of the toxicity (i.e., true health effect) and outdoor attenuation [4]. Using E instead of Cout_ss in health studies should yield a less attenuated health effect estimate [37]. Since the LBL model inputs are relatively easy to obtain, our modeling approach can facilitate the estimation of Finf to help support the use of E in health studies. Also, accounting for AER variability can reduce the uncertainty of Finf and the resulting exposure in support of improving health effect estimates.
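As a worked illustration of the outdoor attenuation term, the sketch below evaluates E = Cout_ss·(fin·Finf + (1 − fin)) for two infiltration factors. The function names and the outdoor concentration are hypothetical. Using fin = 0.66 is an assumption that borrows the approximate fraction of time spent indoors at home cited in the Introduction, and the Finf values 0.08 and 0.68 are the tight and leaky PM2.5 examples from the Introduction.

```python
def outdoor_attenuation(f_in, f_inf):
    """Attenuation factor f_in * Finf + (1 - f_in), dimensionless."""
    if not 0.0 <= f_in <= 1.0:
        raise ValueError("f_in must be between 0 and 1")
    return f_in * f_inf + (1.0 - f_in)


def steady_state_exposure(c_out_ss, f_in, f_inf):
    """Steady-state exposure E = Cout_ss * (f_in * Finf + (1 - f_in))."""
    return c_out_ss * outdoor_attenuation(f_in, f_inf)


C_OUT_SS = 10.0  # hypothetical outdoor concentration, arbitrary units
F_IN = 0.66      # assumed fraction of time indoors

for label, f_inf in [("tight home", 0.08), ("leaky home", 0.68)]:
    e = steady_state_exposure(C_OUT_SS, F_IN, f_inf)
    print(f"{label}: attenuation = {outdoor_attenuation(F_IN, f_inf):.2f}, E = {e:.1f}")
```

Under these assumptions, two people facing the same outdoor concentration would differ in exposure by roughly a factor of two purely because of the tightness of their homes, which is the variability an AER model is meant to capture.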
For exposure models, there are two components of measurement error [
4,
34]. The Berkson-like component of error results from using a model that has some sources of variation or exposure factors missing from the model. The classical‑like component of error is from uncertainty in the estimated model parameters. Both types of measurement error have an impact on health effect estimates. The Berkson error can increase confidence intervals of health effect estimates while classical error can lead to incorrect confidence intervals and biased health effect estimates [
4,
34]. Under a new measurement error correction method [
33], Berkson-like error can also induce a bias. The method used in this study can minimize both types of errors. Our mechanistic AER models (
i.e., LBL and LBLX models) can reduce Berkson error, as compared to using empirical AER models that do not account for temporal variations due to the stack and wind effects [
11]. Also, our model calibration with a subset of homes to improve the estimated parameters of the leakage area model can reduce classical error.
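The contrasting effects of the two error types can be illustrated with a small simulation of their textbook forms in a simple linear model. This is a sketch under stated assumptions: pure classical error (surrogate = truth + noise) and pure Berkson error (truth = surrogate + noise), unit variances, and a hypothetical true slope of 2. It does not reproduce the Berkson-like and classical-like components of an exposure model or the correction method of [33].

```python
import random


def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_slopes(n=20000, beta=2.0, seed=0):
    """Return (slope under classical error, slope under Berkson error)."""
    rng = random.Random(seed)

    # Classical error: the surrogate W scatters around the true exposure X.
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    w_classical = [x + rng.gauss(0.0, 1.0) for x in x_true]
    y_classical = [beta * x + rng.gauss(0.0, 0.5) for x in x_true]

    # Berkson error: the true exposure X scatters around the assigned surrogate W.
    w_berkson = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_berkson = [w + rng.gauss(0.0, 1.0) for w in w_berkson]
    y_berkson = [beta * x + rng.gauss(0.0, 0.5) for x in x_berkson]

    return ols_slope(w_classical, y_classical), ols_slope(w_berkson, y_berkson)


slope_classical, slope_berkson = simulate_slopes()
print(f"classical error slope: {slope_classical:.2f} (true slope 2.0)")
print(f"Berkson error slope:   {slope_berkson:.2f} (true slope 2.0)")
```

In this idealized setting the classical-error slope is attenuated toward the null (toward 1.0 with equal variances), whereas the Berkson-error slope stays near the true value but with more residual scatter, consistent with wider confidence intervals.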
A limitation of this study is that mechanical ventilation could not be included in the AER predictions for the three-year health study, since data on mechanical ventilation were not collected due to cost and participant burden considerations. We expect bathroom fans, outdoor-vented kitchen range hoods, and clothes dryers, which have low-intermediate airflows and are used intermittently, to have a small AER effect. Central heating and air conditioning (HVAC) systems in homes re-circulate indoor air with no outdoor air intake, but can have air duct leaks in unconditioned spaces (e.g., basements, attics) when operated [38]. However, none of the NEXUS homes had HVAC systems. Window/wall air conditioners also re-circulate indoor air, but can be operated with open outdoor vents. Other types of outdoor-vented fans include window fans and whole-house fans, which move outdoor air into the living space through open windows. Overall, we expect a large AER effect from window fans, whole-house fans, and window/wall air conditioners operated with open outdoor vents. Attic fans, which ventilate the attic space and not the living space with soffit or gable vents, are expected to have a small AER effect. The impact of mechanical ventilation on the AER could not be quantified in this study, since the variability of mechanical ventilation can be substantial due to various factors, including the type of mechanical ventilation, frequency of use, and method of operation (e.g., open or closed outdoor vents for window/wall air conditioners).
Another limitation of this study is that the AERs were measured only in the spring and fall, with no measurements from the summer or winter due to cost. However, the leakage area model parameters, which were estimated from the AER measurements and applied for the older homes, are independent of the stack and wind effects that can vary seasonally. Therefore, we expect AER measurements from different seasons to have a small effect on the estimated parameters. In addition, a previous study that compared AER measurements with LBL and LBLX model predictions, which used the same literature-reported parameters that we applied for the newer homes in this study, showed similar results in all four seasons [7]. The LBL and LBLX models had median relative errors of 41% and 37% in spring, 45% and 44% in summer, 43% and 40% in fall, and 39% and 39% in winter, respectively. Therefore, we expect the model performance in this study to be similar across the four seasons.
An additional limitation is that the AER measurements used for parameter estimation were from a cluster of 23 older homes built between 1900 and 1969 (median 1942). Therefore, the estimated parameters were applied for the older homes in the health study, and literature-reported parameters were used for the newer homes in the health study. However, a previous study that compared AER measurements with LBL and LBLX model predictions [7], which used the same literature-reported parameters, showed results similar to those reported in this study, which used the estimated parameters. Based on 642 AER measurements from 31 homes built between 1922 and 2000 (median 1965), the median |ε| was 43% (0.17 h⁻¹) and 40% (0.17 h⁻¹) for the LBL and LBLX models, respectively [7]. In this study, the median |ε| was 29% (0.19 h⁻¹) for both the LBL and LBLX models. Therefore, we expect the model performance in this study to be similar for the older and newer homes.
Another limitation is the small sample size (six homes) used to estimate parameters for conventional homes. This can increase uncertainty in the estimated parameters, and lead to more classical-like measurement error.