*3.2. Data*

### 3.2.1. Individual Activity-Travel Survey Data

Since the study seeks to obtain results based on people's actual activity-travel behaviors (not to compare the congested and non-congested travel speeds/times of all road segments in the study area), we employed individual activity-travel survey data accessed via the Transportation Secure Data Center [40]. This individual activity-travel survey is a part of the "National Household Travel Survey California Add-On" survey conducted in 2017. The survey data were collected from around 55,800 individuals (about 26,000 households) in California. Participants were asked to report their activity-travel patterns (e.g., the location of activities, duration of activities, trip purposes, modes of travel, and the number of accompanying passengers) for one survey day and their socioeconomic attributes. Note that the survey did not collect or provide any global positioning system (GPS) data for constructing the space-time trajectories of participants' trips. Therefore, to estimate the travel time of participants' trips and their exposures to traffic congestion, we assumed that they used the shortest path (in terms of travel time) to travel between all locations and used the Google Maps Application Programming Interface (API) to derive the travel time for each trip based on the time of day and the shortest travel route for the trip.

We selected individuals according to the research goal as follows. First, we selected individuals whose trips were all in the study area (i.e., the Los Angeles Metropolitan Area) on weekdays. We focused only on weekdays because weekend activity-travel patterns typically consist of non-routine trips such as recreational trips and often do not involve any commuting trips [26]. Second, we selected individuals who were actively employed because unemployed people do not undertake any commuting trips. Third, we selected individuals who made trips by driving alone without any accompanying passengers. In other words, all trips in this study were made by driving, and trips made by other modes (including buses, taxis, bicycles, and walking) were not considered. We focused on these individuals to control for other travel-mode-related factors that may also influence how traffic congestion exposure affects health. For instance, previous studies found that the effects of traffic congestion exposure on health may differ when individuals are drivers rather than passengers (e.g., [41,42]). Moreover, they found that the presence of accompanying passengers may affect drivers' stress (e.g., [7,42,43]).

Lastly, individuals who did not undertake any commuting trips or who had only commuting trips were excluded because we seek to generate two exposure assessments: one that considers only commuting trips and another that considers both commuting and non-commuting trips. Note that we define commuting trips as trips that are anchored at a workplace so that we can consider trip-chaining travel behaviors. Before this exclusion criterion was applied, there were 729 individuals in our subsample. Applying this criterion excluded 77 individuals (11%) because they did not make any commuting trips (e.g., they had a day off from work) and an additional 402 individuals because they made only commuting trips (i.e., no other type of trip). Note that a considerable portion of these 729 individuals (34%, 250 individuals) reported making both commuting and non-commuting trips. This provides a compelling rationale that individuals' activity-travel patterns should be considered to accurately assess their exposure to traffic congestion.
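
The exclusion rule can be illustrated with a minimal Python sketch. The trip encoding (dictionaries with `origin` and `destination` labels), the function names, and the three toy diaries are hypothetical and are not taken from the survey data; only the counts 729, 77, 402, and 250 come from the text.

```python
from collections import Counter

def is_commute(trip):
    """Treat a trip as commuting if either end is a workplace (assumed encoding)."""
    return "work" in (trip["origin"], trip["destination"])

def classify(trips):
    """Return 'no_commute', 'commute_only', or 'mixed' for one person's trips."""
    flags = [is_commute(t) for t in trips]
    if not any(flags):
        return "no_commute"
    if all(flags):
        return "commute_only"
    return "mixed"

# Toy travel diaries (hypothetical), one list of trips per person.
diaries = {
    "p1": [{"origin": "home", "destination": "work"},
           {"origin": "work", "destination": "home"}],
    "p2": [{"origin": "home", "destination": "shop"},
           {"origin": "shop", "destination": "home"}],
    "p3": [{"origin": "home", "destination": "work"},
           {"origin": "work", "destination": "gym"},
           {"origin": "gym", "destination": "home"}],
}
groups = Counter(classify(trips) for trips in diaries.values())

# Consistency check on the counts reported in the text.
initial, no_commute, commute_only = 729, 77, 402
remaining = initial - no_commute - commute_only
share_no_commute = round(100 * no_commute / initial)
share_remaining = round(100 * remaining / initial)
```

Only individuals classified as `mixed` would remain in the subsample. Note that the work-to-gym trip of `p3` counts as commuting under the workplace-anchored definition, which is how trip chaining is accommodated.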

After this selection process, 250 individuals were included in the subsample used in this study. To check that the subsample has sociodemographic characteristics similar to those of the larger population in the Los Angeles Metropolitan Area, we compared the participants' socioeconomic attributes with those of the larger population in the study area (Table 1). Note that since our research focuses on employed individuals, the statistics reported in Table 1 represent only employed workers. Overall, the descriptive statistics of the selected participants were similar to those of the larger population in the study area. The only discrepancy we found is that the median age of the selected participants (45.2 years old) is higher than that of the larger population in the Los Angeles Metropolitan Area (39.9 years old). This is likely due to the underrepresentation of younger generations in our subsample because we focus on workers who drive their own cars: recent travel behavior studies revealed that the younger generation (e.g., millennials) may drive less or not own cars (e.g., [44,45]).


**Table 1.** Comparison of the sociodemographic attributes of the 250 selected participants with those of the larger population of the Los Angeles Metropolitan Area.

(a) American Community Survey (ACS) 2016 5-year estimates, (b) Higher education indicates education attainment that is equal to or higher than bachelor's degree.

### 3.2.2. Real-Time Traffic Congestion Data

In this study, we estimate individual exposure to traffic congestion for each trip by subtracting its free-flow travel time from its estimated travel time that considers traffic congestion, following the framework used in previous studies (e.g., [29,46,47]). For example, imagine an individual who undertakes 5 trips in his or her daily life (Figure 2). This person travels from the home location to a workplace (Trip 1), goes back from the workplace to home (Trip 2), goes grocery shopping from home (Trip 3), goes to the gym after the grocery shopping (Trip 4), and finally goes back home from the gym (Trip 5). For each of these trips, by subtracting the free-flow travel time from the estimated travel time (obtained using the Google Maps API (Google, Mountain View, CA, USA) based on time of day and the origin and destination of the trip), we can estimate this person's exposure to traffic congestion for each trip.

Recall that the primary goal of this research is to compare individual exposures to traffic congestion obtained from two assessments: one that only considers commuting trips and one that also considers individuals' activity-travel patterns in addition to considering commuting trips. Assume that the person in this example is exposed to traffic congestion for 10 min for each of the five trips. An assessment that only considers the two commuting trips (Trips 1 and 2) estimates that the duration of exposure to traffic congestion is 20 min (10 + 10), while an assessment that also considers the non-commuting trips (Trips 3, 4, and 5) as well as the commuting trips (Trips 1 and 2) estimates that the duration of exposure to traffic congestion is 50 min (10 + 10 + 10 + 10 + 10).
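
The arithmetic of this example can be written as a short Python sketch. The function name, the field names, and the 15-min free-flow time are illustrative assumptions; only the 10-min exposure per trip and the two totals come from the example above.

```python
def congestion_exposure(trips, commute_only=False):
    """Sum (travel time with congestion - free-flow travel time) over trips, in minutes."""
    total = 0.0
    for trip in trips:
        if commute_only and not trip["commute"]:
            continue  # the commute-only assessment ignores non-commuting trips
        total += trip["traffic_min"] - trip["free_flow_min"]
    return total

# Five trips as in the example; each has 10 min of congestion delay.
# The 15-min free-flow time is an arbitrary illustrative value.
trips = [
    {"name": "Trip 1: home to work", "commute": True,  "free_flow_min": 15, "traffic_min": 25},
    {"name": "Trip 2: work to home", "commute": True,  "free_flow_min": 15, "traffic_min": 25},
    {"name": "Trip 3: home to shop", "commute": False, "free_flow_min": 15, "traffic_min": 25},
    {"name": "Trip 4: shop to gym",  "commute": False, "free_flow_min": 15, "traffic_min": 25},
    {"name": "Trip 5: gym to home",  "commute": False, "free_flow_min": 15, "traffic_min": 25},
]

commute_exposure = congestion_exposure(trips, commute_only=True)  # 20.0 min
full_exposure = congestion_exposure(trips)                        # 50.0 min
```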

**Figure 2.** An example of estimating an individual's exposure to traffic congestion for two types of assessments: (**a**) commute-only versus (**b**) activity-travel patterns in addition to commuting trips.

To obtain free-flow travel time and estimated travel time, we utilized the Google Maps API (Figure 3). The Google Maps API estimates driving time between two points when API users provide the departure time, the locations of the departure point, arrival point, and waypoints, and route options (e.g., avoiding toll roads or highways) [48,49]. The API computes driving time based on two data sources: (1) crowdsourced real-time traffic data submitted by anonymous drivers who consent to send their location information to Google Maps via their smartphones and (2) historical traffic flow databases that Google has established [48]. Free-flow travel time is derived as if trips occurred at 2 A.M., when traffic volumes approach zero. To the best of our knowledge, no study has compared the accuracy of Google Maps data with those from other sources; however, several sources on the web suggest that travel times provided by Google Maps are highly accurate. For instance, in one assessment that used 56 trips with an average journey time of 32 min, the average difference between actual and estimated travel times was 1.8 min (see https://blog.ancoris.com/how-accurate-is-google-maps-journey-time).

In this research, the departure time and geographic coordinates (i.e., longitude and latitude) of the origin and destination of each trip recorded in the travel survey were used to obtain free-flow travel time and estimated travel time through the Google Maps API. Note that we did not use the travel times reported in the survey as the actual travel times. Because the survey did not provide the routes of participants' trips, it is not possible to estimate the free-flow travel time that corresponds to each reported travel time, which in turn makes it impossible to compare the free-flow travel time with the travel time that considers traffic congestion for each trip. Further, estimating travel time using the Google Maps API avoids the recall and rounding errors common in the travel times reported in travel surveys.
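
As an illustration of how such a query might be assembled, the following Python sketch builds a request URL and parses a response. It assumes the Distance Matrix endpoint of the Google Maps Platform; the text does not state which endpoint was used. The function names, the coordinates, the date, the placeholder key, and the sample response are all hypothetical. The sketch also assumes that the survey departure time has been mapped to a future date, since the API accepts only current or future departure times, and no network request is made.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumed endpoint (Distance Matrix API); not confirmed by the text.
BASE_URL = "https://maps.googleapis.com/maps/api/distancematrix/json"

def build_request_url(origin, destination, departure, api_key):
    """Build a driving-time request; origin and destination are (lat, lng) tuples."""
    params = {
        "origins": f"{origin[0]},{origin[1]}",
        "destinations": f"{destination[0]},{destination[1]}",
        "mode": "driving",
        "departure_time": int(departure.timestamp()),  # seconds since the epoch
        "key": api_key,
    }
    return f"{BASE_URL}?{urlencode(params)}"

def free_flow_departure(departure):
    """Same date at 2 A.M., the free-flow reference time described in the text."""
    return departure.replace(hour=2, minute=0, second=0, microsecond=0)

def travel_minutes(response_text):
    """Read the traffic-aware duration if present, otherwise the plain duration."""
    element = json.loads(response_text)["rows"][0]["elements"][0]
    seconds = element.get("duration_in_traffic", element["duration"])["value"]
    return seconds / 60.0

# Hypothetical trip: 6 P.M. departure on an illustrative future date (UTC-8).
pacific = timezone(timedelta(hours=-8))
departure = datetime(2030, 1, 9, 18, 0, tzinfo=pacific)
origin, destination = (34.0522, -118.2437), (34.0195, -118.4912)

congested_url = build_request_url(origin, destination, departure, "YOUR_API_KEY")
free_flow_url = build_request_url(
    origin, destination, free_flow_departure(departure), "YOUR_API_KEY"
)

# Hand-written sample response with the fields the parser relies on.
sample = json.dumps({
    "status": "OK",
    "rows": [{"elements": [{
        "status": "OK",
        "duration": {"value": 1200, "text": "20 mins"},
        "duration_in_traffic": {"value": 1800, "text": "30 mins"},
    }]}],
})
minutes = travel_minutes(sample)
```

Each trip would require two such requests, one at the reported departure time and one at 2 A.M., and the difference between the two parsed durations gives the exposure for that trip.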

**Figure 3.** A screenshot of a map of typical traffic congestion levels at 6 P.M. on Wednesday in the Los Angeles Metropolitan Area (Source: Google Maps).

Using the Google Maps API service has several advantages over traditional desktop-based GIS programs. One compelling advantage is that using the API service does not require researchers to prepare a considerable amount of data and use considerable computing resources. For example, the Google Maps API promptly provides users with travel time that considers traffic congestion between any two given locations (i.e., longitude, latitude) at 20-min intervals.

To obtain this detailed travel time estimate, conventional desktop-based GIS programs require researchers to prepare a considerable amount of network data (e.g., [50,51]). For example, researchers need to prepare road network files, estimate the traffic volume and speed on each road segment at each time interval, and generate penalty information for each street intersection (e.g., one-way roads, no-left-turn penalty, and so on). Preparing these datasets may not be feasible for large metropolitan areas such as Los Angeles. Additionally, even if researchers can prepare the required data, it may take substantial time to run the shortest-path algorithm since the road networks are large and complex. However, by using the Google Maps API service, researchers only need to develop a simple program based on easily accessible programming languages (e.g., Python, Java, and so on). Moreover, since the calculations of travel time are performed inside the API service (where the API uses its own high-performance computing facilities), researchers can get results immediately. For these reasons, there has recently been a growing number of studies that extensively employed the Google Maps API and other map-based API services (e.g., [52,53]).

However, it should be noted that the Google Maps API service has several limitations. One limitation is that users may not know the detailed mechanism by which it estimates travel time, although documentation from API service providers may mitigate this issue (e.g., [49]). Another limitation is that API services may charge a fee based on the number of API requests. For example, Google Maps API users can make 40,000 API requests per month for free. Beyond the 40,000 free requests, users need to pay a fee for each API request (e.g., a single query of travel time estimation for a single pair of origin and destination) [49]. Thus, the Google Maps API service may not be a viable option for researchers who want to obtain travel times for a large number of origin-destination pairs [54]. However, this limitation did not significantly affect our research because we did not need a large number of requests; we requested travel time estimates for the approximately 1000 trips that the 250 selected participants undertook.
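
A rough check of the request volume against the free allowance can be sketched as follows. The assumption of two requests per trip (one at the departure time and one at 2 A.M.) is ours and is not stated in the text; the function name is hypothetical, and the per-request fee is left out because the text does not report it.

```python
FREE_REQUESTS_PER_MONTH = 40_000  # free monthly allowance stated in the text

def billable_requests(n_trips, requests_per_trip=2, free_quota=FREE_REQUESTS_PER_MONTH):
    """Number of requests beyond the free quota (zero if within it)."""
    total = n_trips * requests_per_trip
    return max(total - free_quota, 0)

# Approximately 1000 trips, as reported for the 250 selected participants.
study_billable = billable_requests(1000)
```

Under these assumptions, the roughly 2000 requests needed for this study fall well within the free allowance, whereas a study with tens of thousands of origin-destination pairs would exceed it.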
