Article

Travel Time Variability in Urban Mobility: Exploring Transportation System Reliability Performance Using Ridesharing Data

1
Department of Civil and Environmental Engineering, University of Massachusetts Amherst, Amherst, MA 01002, USA
2
Department of Civil and Environmental Engineering, Northwestern University, Evanston, IL 60201, USA
*
Author to whom correspondence should be addressed.
Sustainability 2024, 16(18), 8103; https://doi.org/10.3390/su16188103
Submission received: 10 July 2024 / Revised: 28 August 2024 / Accepted: 12 September 2024 / Published: 17 September 2024
(This article belongs to the Section Sustainable Transportation)

Abstract

Travel time variability (TTV) is a crucial indicator of transportation network performance, assessing travel time reliability and delays. This study investigates TTV metrics within the context of shared mobility using probe data from transportation network companies (TNCs) in Chicago, Los Angeles, and Dallas–Fort Worth. Eight reliability metrics are analyzed and compared for each origin–destination (OD) pair in the network, including standard deviation (SD), the Planning Time Index (PTI), the Travel Time Index (TTI), the Buffer Index (BI), On-time Measures PR (alpha), and the Misery Index (MI), to evaluate their effectiveness in clustering OD pairs using K-means clustering. The findings confirm that SD, PTI, and MI are particularly effective in measuring travel time reliability and clustering within urban systems. This study identifies the most unbalanced supply–demand OD pairs/regions in each city, noting that low/medium-SD clusters around metropolitan airports indicate stable travel times even in high-demand zones, while high-SD clusters in downtown areas reveal significant traffic demands and unreliability. These patterns become more pronounced in study areas with multiple city centers. This study highlights the need for targeted strategies to enhance travel time reliability, particularly in regions like Dallas–Fort Worth, where public transportation alternatives are limited.

1. Introduction

Traffic congestion remains a pervasive issue in cities throughout the US. The Texas A&M Transportation Institute reported that by 2019, costs related to congestion reached an astounding $190 billion (adjusted for 2020 values) [1]. Urban residents in US metropolitan areas face increasing levels of unexpected traffic congestion, leading to significant delay-induced losses and disrupted services. The erratic nature of traffic congestion significantly impacts travel reliability [2].
Travel time variability (TTV) and travel time reliability (TTR) are interlinked concepts integral to gauging transportation system performance. TTR pertains to the likelihood of completing a trip within a designated time frame, whereas TTV represents fluctuations in travel durations. Generally, an increase in TTV corresponds to a drop in TTR: travelers face greater unpredictability and more delays when travel times fluctuate widely. Recent studies examine TTV and TTR in terms of system performance and user response. From the system performance perspective, TTR serves as a tool with which to monitor transportation systems’ performance, employing various indices that are straightforward and easily interpretable. From the user response perspective, reliability is often incorporated to account for uncertainty in travel demand studies and to increase the prediction accuracy of user choice behavior models. Traditional statistical models and indices have been applied to capture the nuances of travel time variability; however, more data and a systematic evaluation of these indices are still needed.
Since Uber’s launch in 2009, transportation network company (TNC) services have experienced increasing popularity, fundamentally transforming urban travel dynamics. With an increasing number of TNCs providing GPS data containing trip information, TNC trips reveal important traffic patterns in metropolitan regions and reflect people’s travel behaviors. According to the 2017 National Household Travel Survey (NHTS) [3], TNC trips (grouped with taxi trips) made up 0.5 percent of person-trips nationally, with around 6 percent of these trips having work-related destinations and 22 percent originating from work. Although TNC trips may not represent a stratified sample of all-purpose trips, potentially overrepresenting first-mile/last-mile alternatives and underrepresenting regular daily commute trips, they reveal emerging mobility patterns in urban transportation. These trips have the potential to serve as representative samples of overall urban transportation systems. Additionally, TNC data remain significant in urban transportation analysis, as 96.5% of TNC trips take place in urban areas [3], where traffic congestion and emissions present significant challenges.
According to the Federal Highway Administration [3], 12.3% of TNC service users do not have their own vehicles, making these services particularly valuable in enhancing accessibility for car-less individuals. While TNCs have become integral components of urban transportation, they have also added layers of complexity to traffic flow, contributing to greater variability within the network. The advent of TNCs has thus been a double-edged sword: while they introduce additional uncertainty into traffic patterns, they also provide large-scale data that are invaluable in capturing that uncertainty and in analyzing urban transportation reliability patterns. These data not only offer insights into the most unreliable corridors and trends, but also aid in understanding the broader impact of TNCs on urban mobility and sustainability.
This study presents several novel contributions to the understanding of TTV in urban transportation networks, leveraging TNC data to offer actionable insights:
  • The utilization of TNC data: this study leverages TNC data to examine TTV across OD pairs, providing critical insights for use in traffic management and urban planning.
  • Comprehensive TTV measurement: TTV is measured for OD pairs using TNC trip data as a probe, with a comparison of eight TTV indexes. This comprehensive approach allows for a detailed analysis of travel time variability.
  • The identification of key metrics: this study identifies standard deviation (SD), the Planning Time Index (PTI), and the Misery Index (MI) as particularly effective metrics in capturing travel time variability and clustering across cities with diverse transportation characteristics.
  • Temporal and spatial analysis: TTV patterns are analyzed within each cluster, both temporally and spatially, offering a dynamic understanding of how travel time variability evolves over time and across different urban areas.
  • The identification of unbalanced regions: the most unbalanced regions and unreliable trends within the urban transportation networks of three selected metropolitan areas are identified at the census tract level, providing targeted insights for use by urban planners.
  • Practical implications: The insights gained from this study can be leveraged to enhance traffic management, optimize travel routes, and reduce congestion and emissions, contributing to more efficient and sustainable urban transportation systems.
The remainder of the paper is organized as follows: Section 2 provides an overview of the recent literature related to travel time reliability. The travel time reliability indexes and the index selection method are then outlined, followed by an examination of ride-sourcing data from three selected metropolitan areas and a presentation of the clustering results, both spatially and temporally. Section 6 summarizes the findings, discusses the implications of the results, and outlines future research directions.

2. Literature Review

TTV is a critical metric used to evaluate the punctuality of urban transportation systems [4]. It quantifies the uncertainty in travel time for a given route over a specific period, reflecting unexpected delays caused by factors such as short-term incidents (e.g., vehicle breakdowns), long-term issues (e.g., work zones, route closures), or random events (e.g., accidents, travel demand fluctuations) [5,6]. TTV significantly impacts commuters and freight operators who rely on predictable travel times to plan their journeys and schedules. It also influences the transportation agencies and city planners responsible for monitoring and optimizing road network performance to improve reliability [2]. Analyzing travel time variability serves several purposes, such as managing congestion proactively during events, identifying the optimal routes for new services, optimizing bus schedules, and developing traffic management strategies for autonomous vehicles that share roads with other traffic [7].
The evaluation of TTV and TTR primarily aims to assess the probability of travel times remaining within predefined acceptable thresholds. Zang et al. [8] describe two primary methods used to characterize TTR. First, TTR can be assessed based on the assumed distributions of sources of uncertainty. This method involves modeling TTR by identifying various sources of uncertainty, such as weather, events, and accidents, and employing different approaches to model them. Second, TTR can be directly evaluated using travel time datasets to measure the dispersion of the travel time distribution. The direct use of travel time data is particularly advantageous when large-scale datasets are available, as it eliminates the need to assume probability distributions for sources of uncertainty, thereby automatically accounting for all sources of variability [8].
Various statistical measures of TTR have been identified in the previous literature [2,8], including probability-based, moment-based (e.g., standard deviation, variance, and the coefficient of variation), percentile-based (e.g., MI, which gauges the interquartile difference), and tail-based methods (e.g., unreliable areas, signifying unexpected delays in the tail of the travel time spread). Recently, the Federal Highway Administration (FHWA) introduced two new percentile-based reliability metrics, the level of travel time reliability (LOTTR) and truck travel time reliability (TTTR), to assess the performance of the National Highway System (NHS) [6,9]. However, these evaluation measures are not universally applicable, as the choice of measure depends on the specific application and context. Existing reliability measures may exhibit inconsistent behavior for the same assessment object. Consequently, it is important to establish criteria that can guide the selection or development of a generalized optimal reliability measure for practical applications. These criteria should be comprehensible to decision-makers and valuable for identifying optimal reliability measures tailored to different application purposes.
Traditional measures like the mean and variance of travel time may not fully capture the complexities of travel time distributions, especially under atypical conditions. Van Lint et al. [10] emphasize the limitations of these conventional metrics and argue for the inclusion of indicators that consider the skewness and width of the travel time distribution. They propose the use of reliability metrics based on the 10th, 50th, and 90th percentiles for specific routes and time-of-day/day-of-week periods, emphasizing the need for more nuanced approaches to understanding travel time reliability. However, the significance of each metric in assessing reliability can vary depending on the application or context. Moreover, the presence of correlations and collinearity among these metrics adds to the complexity, making the selection of appropriate reliability indicators a challenging task.
In this context, the methods used to collect travel time data are crucial. Typically, data are gathered through three main methods: archived traffic operations data, estimation techniques, and vehicle-based data [11]. Archived traffic operations data primarily cover freeways but are limited by the scope of the data collected [12]. Estimation techniques, while useful for evaluating travel times across broader networks, may oversimplify the complexities of real-world traffic conditions. Vehicle-based data, particularly those taken from floating or probe vehicles, are the focus of this research due to their ability to provide detailed and context-specific insights, though this ability is often constrained by smaller sizes and data application to select road segments [13,14,15].
Building on this foundation, recent studies on travel time reliability (TTR), as summarized in Table 1, have increasingly utilized private datasets, especially GPS-based data, to analyze urban travel patterns at a detailed link level. Ride-sourcing and taxi services have proven to be valuable for revealing the characteristics of these patterns [13,14]. Despite the detailed analysis possible at the link level, many studies often do not compare statistical measures across different contexts, even though travel patterns vary significantly by road type and by the day of the week, both spatially and temporally. To address this variability, clustering methods are often applied to segment urban trips, but such analyses are typically limited to case studies in specific cities [13,16]. Extending the work of Ale-Ahmad et al. [13], this research aims to broaden the application of TTR measures beyond single-city case studies to offer more comprehensive insights into urban travel reliability.
Furthermore, selecting suitable TTR indices requires an awareness of the challenges in network-level aggregation. When examining network-wide travel time reliability (TTR), reliability measures are typically calculated directly with respect to total travel time, being aggregated from the link level to the network level. However, this approach may not be suitable for monitoring network-wide TTR and identifying congested or unreliable areas. As this measure produces overly aggregated data, there is difficulty pinpointing the source of identified unreliability in the network [8]. In this study, rather than focusing solely on the statistical fitting of TTVs, indices are selected based on their ability to effectively cluster origin–destination pairs, analyze their statistical relationships, and identify the contributing factors that affect the reliability of the aggregated network.

3. Methods

This study aims to identify TTV metrics for use in assessing traffic performance across different cities, each of which has its own unique socio-economic background. Three major metropolitan areas (Chicago, Los Angeles, and Dallas–Fort Worth) are selected for their diverse sociodemographic profiles, population density distributions, commuter behaviors, and the availability of extensive datasets, providing a comprehensive overview of major US cities. To improve the understanding of travel reliability in these urban transportation systems, TTV metrics are selected based on their statistical significance, and a clustering method is adapted to segment the origin–destination pairs using these metrics. The objective is to provide valuable insights, allowing city planners and policymakers to evaluate and enhance TTV across diverse urban contexts.
To utilize travel time as an indicator of traffic congestion and road performance, this study compares three pairs of travel time-related variables, as shown in Equations (1)–(6): travel time duration difference with excess rate, travel speed with free-flow travel speed, and travel rate (TR) with free-flow travel rate (FFTR). Through this comparison, the researchers aim to identify travel time metrics that can more accurately capture the variability and patterns of traffic congestion.
Travel time duration difference:
$$\Delta t = t_{\text{actual}} - t_{\text{free flow}} \tag{1}$$
Excess rate (ER):
$$\text{Excess Rate} = \frac{t_{\text{actual}} - t_{\text{free flow}}}{d_{\text{free flow}}} \tag{2}$$
Actual travel speed ($v_{\text{actual}}$):
$$v_{\text{actual}} = \frac{d_{\text{actual}}}{t_{\text{actual}}} \tag{3}$$
Free-flow travel speed ($v_{\text{free flow}}$):
$$v_{\text{free flow}} = \frac{d_{\text{free flow}}}{t_{\text{free flow}}} \tag{4}$$
Travel rate (TR):
$$TR_{\text{actual}} = \frac{t_{\text{actual}}}{d_{\text{actual}}} \tag{5}$$
Free-flow travel rate (FFTR):
$$FFTR = \frac{t_{\text{free flow}}}{d_{\text{free flow}}} \tag{6}$$
where $t_{\text{actual}}$ is the actual travel time; $t_{\text{free flow}}$ is the free-flow travel time, which is the travel time under ideal conditions (the details of estimating free-flow travel time are provided in the data section); $d_{\text{actual}}$ is the actual travel distance; and $d_{\text{free flow}}$ is the free-flow travel distance, which is simplified as the shortest path between OD pairs.
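As an illustration of Equations (1)–(6), the following minimal Python sketch evaluates the six quantities for one trip. The `Trip` container, its field names, the units (seconds and miles), and the example numbers are assumptions made for this sketch and do not come from the study's data.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    """One trip; seconds and miles are assumed units for this sketch."""
    t_actual: float     # observed travel time
    d_actual: float     # observed travel distance
    t_free_flow: float  # estimated free-flow travel time
    d_free_flow: float  # free-flow (shortest-path) distance


def travel_time_variables(trip: Trip) -> dict:
    """Evaluate Equations (1)-(6) for a single trip."""
    delay = trip.t_actual - trip.t_free_flow
    return {
        "delta_t": delay,                                # Eq. (1)
        "excess_rate": delay / trip.d_free_flow,         # Eq. (2)
        "v_actual": trip.d_actual / trip.t_actual,       # Eq. (3)
        "v_free_flow": trip.d_free_flow / trip.t_free_flow,  # Eq. (4)
        "tr_actual": trip.t_actual / trip.d_actual,      # Eq. (5)
        "fftr": trip.t_free_flow / trip.d_free_flow,     # Eq. (6)
    }


# Hypothetical trip: 900 s observed over 5 mi; 600 s free-flow over 4 mi.
example = travel_time_variables(Trip(900.0, 5.0, 600.0, 4.0))
```

In this hypothetical case the trip is 300 s slower than free flow, with a travel rate of 180 s per mile against a free-flow travel rate of 150 s per mile.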

3.1. Indices for Reliability Analysis

This study reviewed and tested eight indices to analyze travel time reliability [2,13,18].
Standard Deviation (SD)
SD (as shown in Equation (7)) is a classic statistical method that is easy to use and adopt. This index reveals the variability in the travel rate and treats early and late arrivals with equal weight, avoiding bias towards late arrivals. However, it is important to note that this metric only indicates a reliable arrival performance concerning the average travel rate.
$$SD = \sqrt{\frac{\sum_{i=1}^{N} \left| TR_i - E(TR_i) \right|^2}{N - 1}} \tag{7}$$
where $TR_i$ = the $i$th travel rate (s/m); $E(TR_i)$ = the expected travel rate (s/m), for which the average travel rate of each pair is used; and $N$ = the number of trips or observations available for a given pair.
Travel Time Index (TTI)
TTI, shown in Equation (8), represents the average additional time required for a trip compared to free-flow travel time. It serves as an indicator of road congestion and can describe travel time reliability. TTI is defined as the ratio of travel time in the peak period to the free-flow travel time, enabling the measurement of base (average) congestion conditions [2]. For example, a TTI of 1.2 indicates that a trip takes 20% longer than it would under ideal conditions.
$$TTI = \frac{TR_{\text{Average}}}{FFTR} \tag{8}$$
Planning Time Index (PTI)
PTI, as described in Equation (9), represents the total travel rate that a traveler should plan for, including a 95th-percentile “adequate” buffer time in addition to on-time arrival. PTI accounts for both typical and unexpected delays. It is defined as the ratio of the 95th-percentile travel rate to the free-flow travel rate. For instance, a PTI of 1.5 for a trip that takes 30 min under free-flow conditions indicates that a traveler should plan for 45 min to accommodate potential delays.
$$PTI = \frac{TR_{\text{95th Percentile}}}{FFTR} \tag{9}$$
Buffer Index (BI)
BI represents the normalized extra time required to arrive on time. It indicates the buffer travel rate size as a percentage of the average travel rate. BI is defined as the difference between the 95th-percentile travel rate and the average travel rate, normalized by the average travel rate.
$$BI = \frac{TR_{\text{95th Percentile}} - TR_{\text{Average}}}{TR_{\text{Average}}} \tag{10}$$
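The four indices defined so far (SD, TTI, PTI, and BI) can be computed from the travel rates observed for one OD pair. The sketch below uses only the Python standard library. The linear-interpolation percentile rule and the sample values are assumptions, since the text does not state which percentile convention was used.

```python
import math
import statistics


def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between order statistics.

    The interpolation rule is an assumption of this sketch.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def basic_indices(travel_rates, fftr):
    """SD, TTI, PTI and BI for one OD pair, given its free-flow travel rate."""
    mean_tr = statistics.fmean(travel_rates)
    p95 = percentile(travel_rates, 95)
    return {
        "SD": statistics.stdev(travel_rates),  # sample SD, N - 1 denominator
        "TTI": mean_tr / fftr,                 # average rate over free-flow rate
        "PTI": p95 / fftr,                     # 95th-percentile rate over free-flow rate
        "BI": (p95 - mean_tr) / mean_tr,       # buffer relative to the average rate
    }


# Hypothetical travel rates (s per mile) for one OD pair, free-flow rate of 100.
indices = basic_indices([100.0, 110.0, 120.0, 130.0, 140.0], fftr=100.0)
```

For these made-up values the average travel rate is 120, so TTI is 1.2, and the interpolated 95th percentile is 138, giving a PTI of 1.38 and a BI of 0.15.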
On-time Measures PR (α = 1.1)
PR(α) represents the probability of achieving a reliable on-time arrival. For example, for a corridor with a median travel rate of 100 s/m, if PR(1.1) = 80%, it means that 80% of trips in this corridor can be completed within a travel rate of 110 s/m, ensuring a reliable on-time arrival. Conversely, the failure rate (FR) can be determined by $FR(\alpha) = 1 - PR(\alpha)$. This quantifies the probability of failing to achieve a reliable on-time arrival.
$$PR(1.1) = \text{Percentage of trips with } TR \le 1.1 \times TR_{\text{median}} \tag{11}$$
On-time Measures PR (α = 1.25)
PR(1.25) is defined analogously to Equation (11). With α = 1.25, a trip is considered on time if its travel rate is within 1.25 times the expected travel rate, taken here as the median travel rate.
$$PR(1.25) = \text{Percentage of trips with } TR \le 1.25 \times TR_{\text{median}} \tag{12}$$
Misery Index (MI)
The MI is defined as the difference between the average of the 20% longest travel rate and the average travel rate, which is then normalized by the average travel rate [1]. This index provides insights into the reliability of travel time for the worst trips. According to Van Lint and Van Zuylen [10], since MI focuses on the impact of extremes on the average travel time, it can reflect similar effects to those of the extreme low values in a heavily left-skewed distribution. Therefore, the presence of a heavily left-skewed distribution may lead to higher MI values because the extremely low travel times elevate the measurement of delays [19]. The formula for calculating MI is presented in Equation (13):
$$MI = \frac{TR_{\text{Average of Highest 20\%}} - TR_{\text{Average}}}{TR_{\text{Average}}} \tag{13}$$
Modified Misery Index (MI_mod)
MI_mod is a measure that denotes the average travel time delay experienced by the worst 5% of trips when compared to the free-flow travel rate.
$$MI\_mod = \frac{ATR_{\text{Average of Highest 5\%}}}{FFTR} \tag{14}$$
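The on-time measures and the two misery indices can be sketched in the same way. The code below reads PR(α) as the share of trips whose travel rate does not exceed α times the median, and the modified index as the mean travel rate of the worst 5% of trips divided by the free-flow travel rate, following the definitions in the text. Rounding the number of "worst" trips up to at least one, and the sample values, are assumptions of this sketch.

```python
import math
import statistics


def on_time_probability(travel_rates, alpha):
    """PR(alpha): share of trips with TR <= alpha * median TR."""
    threshold = alpha * statistics.median(travel_rates)
    return sum(tr <= threshold for tr in travel_rates) / len(travel_rates)


def failure_rate(travel_rates, alpha):
    """FR(alpha) = 1 - PR(alpha)."""
    return 1.0 - on_time_probability(travel_rates, alpha)


def mean_of_worst(travel_rates, share):
    """Mean of the highest `share` of travel rates (at least one trip; assumed rounding)."""
    k = max(1, math.ceil(share * len(travel_rates)))
    return statistics.fmean(sorted(travel_rates)[-k:])


def misery_index(travel_rates):
    """MI: mean of the worst 20% minus the mean, relative to the mean."""
    mean_tr = statistics.fmean(travel_rates)
    return (mean_of_worst(travel_rates, 0.20) - mean_tr) / mean_tr


def modified_misery_index(travel_rates, fftr):
    """MI_mod: mean of the worst 5% relative to the free-flow travel rate."""
    return mean_of_worst(travel_rates, 0.05) / fftr


# Hypothetical travel rates (s per mile) for one OD pair.
rates = [90.0, 95.0, 100.0, 100.0, 105.0, 110.0, 120.0, 130.0, 150.0, 200.0]
pr_110 = on_time_probability(rates, 1.1)
pr_125 = on_time_probability(rates, 1.25)
mi = misery_index(rates)
mi_mod = modified_misery_index(rates, fftr=80.0)
```

With these made-up rates the median is 107.5, so six of ten trips meet the α = 1.1 threshold and eight of ten meet the α = 1.25 threshold.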

3.2. Indices Selection

TNC data from three US cities (Dallas–Fort Worth, Chicago, and Los Angeles) are used to investigate TTV and its relationship with congestion and urban scaling. The indexes are first selected based on their statistical relationships with one another. Eight metrics are calculated and plotted in Figure 1, which presents a correlation matrix showing the relationships between the travel time reliability metrics. The three cities show similar patterns in their TTV indexes. Several pairs of metrics exhibit strong positive linear relationships, such as TTI and PTI, PR (1.1) and PR (1.25), and MI and MI_mod, which can be explained by their formulas. Furthermore, some variables are mutually interpretable based on their definitions. For instance, TTI measures the level of congestion on the road, while MI_mod quantifies the reliability of travel time for the worst trips and extreme delays. A larger MI_mod value indicates a lower reliability of travel time, and when the road experiences heavy congestion, the reliability of travel time for the worst trips decreases. Similarly, PTI reflects the 95th-percentile travel time needed for on-time arrival, so a larger PTI value indicates a lower reliability of travel time. SD provides insights into the variability in travel time, where a larger SD value might suggest a longer distance between the origin and destination or more congested roads. In both scenarios, travel time becomes less stable. Considering these relationships and the correlations among the metrics, PTI, SD, and MI are selected for the TTV analysis due to the limitations of the samples.
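A correlation-based screening of this kind can be sketched as follows. The code computes a Pearson correlation matrix across metrics and lists the pairs that are nearly collinear. The 0.9 threshold and the sample values are assumptions for illustration; the text does not state a numeric cut-off.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations; `columns` maps metric name -> values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def redundant_pairs(matrix, threshold=0.9):
    """Metric pairs whose absolute correlation reaches the (assumed) threshold."""
    names = list(matrix)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(matrix[a][b]) >= threshold
    ]


# Hypothetical metric values for four OD pairs.
metrics = {
    "TTI": [1.1, 1.2, 1.4, 1.8],
    "PTI": [1.3, 1.5, 1.9, 2.7],
    "SD": [12.0, 30.0, 9.0, 20.0],
}
matrix = correlation_matrix(metrics)
pairs = redundant_pairs(matrix)
```

In this toy input PTI is an exact linear function of TTI, so that pair is flagged, while SD is nearly uncorrelated with both and would be kept as a separate feature.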

3.3. Clustering Analysis

Clustering analysis is used to discover patterns or structures in large-scale unlabeled datasets by identifying similarities [20]. Various types of clustering methods, including partitioning, hierarchical, density-based, grid-based, and model-based methods, employ different metrics to measure the similarity between data points, such as Manhattan distance, Euclidean distance, Fréchet distance, or a correlation matrix [20,21]. For this study, an unsupervised clustering method known as K-means is adopted for the analysis.
K-means clustering is a widely used distance-based algorithm that partitions data into K clusters, with the value of K predetermined by the analyst. This method measures the similarity between data objects based on their distance from the centroids. The main objective of K-means clustering is to minimize the total within-cluster variation by finding K centroids and assigning each data point to the nearest centroid [13,20]. K-means clustering is regarded as one of the simplest unsupervised clustering methods, and over time, several variations have been developed to cater to different requirements, such as K-means++, K-Medoids, and K-means with noise algorithms, which can handle multiple features and outliers [20]. To determine the optimal number of clusters, the elbow method is utilized, which identifies the point beyond which adding more clusters does not significantly improve the quality of clustering. By applying K-means clustering, the OD pairs are grouped into K clusters based on the similarity of their selected TTV indexes [13].
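The procedure can be illustrated with a compact implementation of Lloyd's algorithm, the standard K-means iteration, together with the within-cluster sum of squares (WCSS) curve that the elbow method inspects. This is a sketch rather than the study's implementation: the deterministic initialization (evenly spaced points from the sorted data) is a simplification chosen for reproducibility, and the sample data are made up.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm on tuples; returns (centroids, labels, wcss)."""
    ordered = sorted(points)
    # Assumed deterministic initialization: k points evenly spaced in sorted order.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    wcss = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return centroids, labels, wcss


# Hypothetical 1-D feature (one index value per OD pair), stored as 1-tuples.
sample = [(1.0,), (1.2,), (0.8,), (5.0,), (5.2,), (4.8,), (9.0,), (9.2,), (8.8,)]
wcss_by_k = {k: kmeans(sample, k)[2] for k in range(1, 5)}
labels_k3 = kmeans(sample, 3)[1]
```

Plotting `wcss_by_k` against K gives the elbow curve. For this toy input the WCSS falls steeply up to K = 3 and changes very little afterwards, which is the pattern the elbow method looks for.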

4. Trip Data

The datasets used in this paper consist of high-volume, two-week TNC trip data collected before the pandemic in three metropolitan areas: Dallas–Fort Worth (DFW), Chicago, and Los Angeles (LA). Each trip entry contains the trip ID, the trip start time, the trip end time (representing pick-up and drop-off timestamps), the pick-up and drop-off locations, and the actual travel distance and trip duration. To address privacy concerns, the exact geographic coordinates are not provided. Instead, pick-up and drop-off locations are associated with relevant census tracts, and a census tract ID is added accordingly for each location using 2020 Tigerweb census tract data [22]. In total, there were 535,002 trips in the Chicago area, 754,296 trips in the DFW area, and 3,633,890 trips in the LA area.
The travel time of each TNC trip is compared with the estimated free-flow travel time of a similar trip, and eight TTV indices are calculated and compared both temporally and spatially for the three metropolitan areas. The free-flow travel time of each trip is estimated using the ArcGIS Pro 3.0.3 Network Analysis tools, where the best-guess travel time (based on historical and real-time traffic conditions) at 3 am on a typical weekday for the same OD pair is taken as the estimated free-flow travel time. Route analysis finds the shortest paths between the input stops, using the origin layer as the first stop and the destination layer as the last stop, and returns the driving directions together with the travel time and distance of each route [23].
For the following analysis, three TNC datasets are cleaned up by excluding records with erroneous or missing inputs using the following steps:
Step 1: Travel time cleaning. To ensure data consistency, the minimum travel time duration is set to 1 min for all cities, while the maximum travel time duration is set to 120 min.
Step 2: Travel distance cleaning. Data with erroneous values are excluded by considering 0.1 mi as the minimum distance.
Step 3: Geodesic distance cleaning. Since the primary focus of this study is to examine (un)reliable OD pairs, trips with the origin and destination within the same census tract are excluded. OD pairs with geodesic distances shorter than 0.1 mi are also discarded. Figure 2 compares the geodesic distance and travel distance for the three study areas, with the blue line representing the estimated regression line for all data points. The coefficient of travel distance in the regression model is provided in Table 2, indicating that on average, travel distances in Chicago, DFW, and LA are 15%, 30%, and 32% higher than geodesic distances, respectively. This difference indicates that factors such as road network layout, rerouting, detours, traffic patterns, and navigational choices contribute to the increased travel distances compared to direct geodesic distances. Using this information, additional cleaning measures are implemented to refine the analysis.
Step 4: Difference in geodesic distance and travel distance cleaning. Since a trip rarely follows a straight line between the origin and destination, geodesic distances should be less than or equal to trip distances. However, for TNC data, geodesic distances are calculated between the centroids of the pick-up and drop-off census tracts, not the actual origin and destination of the trips. As a result, trip distances might be higher or lower than geodesic distances. Nonetheless, for longer trips, it is more likely that the trip’s distance will be higher than the geodesic distance. Hence, two filters are applied to eliminate the outliers: (a) trips above the regression line are identified, and the top 0.05% of these trips are treated as outliers; (b) points with a travel distance at least 10 times greater than their corresponding geodesic distance are excluded.
Step 5: Travel rate cleaning. After applying the criteria above to clean the datasets, the travel rate is calculated. The SHRP2 Project suggests that values more than 1.5 times the interquartile range above the 75th percentile or below the 25th percentile can be considered outliers [2,24]. To remove implausibly low travel rates, the speed limit is also considered: travel rates lower than 40 s/m (equivalent to 90 mi/hr) are deemed erroneous and are excluded from the analysis.
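The cleaning steps can be sketched as a pair of filters over trip records. The dictionary keys, the helper names, and the sample trips below are hypothetical. The regression-based filter of Step 4(a) is omitted, and the interquartile fences are computed over all remaining trips pooled together, which is an assumption because the text does not say whether they were applied per city or per OD pair.

```python
import statistics


def passes_basic_filters(trip):
    """Steps 1 to 4(b): duration, distance, OD and distance-ratio checks."""
    return (
        1.0 <= trip["duration_min"] <= 120.0                    # Step 1
        and trip["distance_mi"] >= 0.1                           # Step 2
        and trip["origin_tract"] != trip["dest_tract"]           # Step 3
        and trip["geodesic_mi"] >= 0.1                           # Step 3
        and trip["distance_mi"] < 10.0 * trip["geodesic_mi"]     # Step 4(b)
    )


def travel_rate(trip):
    """Travel rate in seconds per mile."""
    return trip["duration_min"] * 60.0 / trip["distance_mi"]


def clean_trips(trips, min_rate=40.0):
    """Apply the basic filters, then Step 5 (IQR fences plus the 40 s/mi floor)."""
    kept = [t for t in trips if passes_basic_filters(t)]
    rates = [travel_rate(t) for t in kept]
    q1, _, q3 = statistics.quantiles(rates, n=4)  # needs at least two trips
    iqr = q3 - q1
    low, high = max(min_rate, q1 - 1.5 * iqr), q3 + 1.5 * iqr
    return [t for t, r in zip(kept, rates) if low <= r <= high]


def make_trip(duration_min, distance_mi, geodesic_mi=4.0, origin="A", dest="B"):
    """Build a hypothetical trip record for the example below."""
    return {"duration_min": duration_min, "distance_mi": distance_mi,
            "geodesic_mi": geodesic_mi, "origin_tract": origin, "dest_tract": dest}


raw = [make_trip(m, 5.0) for m in range(10, 18)]  # eight plausible trips
raw += [
    make_trip(0.5, 5.0),                    # shorter than 1 min (Step 1)
    make_trip(12.0, 5.0, dest="A"),         # same census tract (Step 3)
    make_trip(12.0, 5.0, geodesic_mi=0.4),  # distance over 10x geodesic (Step 4b)
    make_trip(2.0, 5.0),                    # 24 s/mi, below the 40 s/mi floor (Step 5)
    make_trip(100.0, 5.0),                  # 1200 s/mi, above the upper fence (Step 5)
]
cleaned = clean_trips(raw)
```

Only the eight plausible trips survive in this example; each of the other five records is removed by the step noted in its comment.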
The steps described above were applied, and the new regression analysis results are presented in Table 3. This table also provides the descriptive statistics for TNC trips across the three study areas. The number of trips was reduced to 0.4 million in Chicago and 0.6 million in DFW. Chicago has 112 unique origins and destinations and 2651 unique OD pairs, while DFW has 2971 unique origins and destinations and 16,756 unique OD pairs. LA has a much larger amount of data than the other two cities.
Upon analyzing Figure 3a and considering the distributions of different variables in DFW, it becomes evident that the travel time duration difference exhibits a narrower range of values and a less skewed distribution compared to the other two variables. This characteristic has the potential to influence the clustering performance and outlier identification observed during the analysis. Additionally, comparing Figure 3a–c, it is observed that different cities demonstrate similar distribution patterns. Taking these factors into account and aiming to simplify the clustering process while also considering the data range, this study opts to focus on the travel rate and free-flow travel rate pair to conduct reliability analysis. The data are filtered to include only trips occurring between 6 am and 10 pm for each OD pair on weekdays, ensuring that each OD pair has at least three occurrences in an hour.
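The final filtering rule can be sketched as a grouping step. The code below keeps weekday trips that start between 6 am and 10 pm and retains only the OD-pair and hour-of-day groups with at least three trips. Pooling trips from different days into the same hour-of-day group is one possible reading of "at least three occurrences in an hour" and is an assumption, as are the record keys and sample trips.

```python
from collections import defaultdict
from datetime import datetime


def filter_od_hours(trips, min_trips=3):
    """Group weekday trips starting 06:00-21:59 by (origin, destination, hour).

    Returns only the groups containing at least `min_trips` trips.
    """
    groups = defaultdict(list)
    for trip in trips:
        start = trip["start"]
        if start.weekday() < 5 and 6 <= start.hour < 22:  # Monday-Friday, 6 am-10 pm
            key = (trip["origin_tract"], trip["dest_tract"], start.hour)
            groups[key].append(trip)
    return {key: grp for key, grp in groups.items() if len(grp) >= min_trips}


def trip_at(origin, dest, *when):
    """Build a hypothetical trip record starting at the given date and time."""
    return {"origin_tract": origin, "dest_tract": dest, "start": datetime(*when)}


sample_trips = [
    trip_at("A", "B", 2019, 3, 4, 8, 5),    # Monday
    trip_at("A", "B", 2019, 3, 5, 8, 30),   # Tuesday
    trip_at("A", "B", 2019, 3, 6, 8, 45),   # Wednesday
    trip_at("A", "B", 2019, 3, 9, 8, 10),   # Saturday, ignored
    trip_at("A", "B", 2019, 3, 4, 23, 0),   # outside 6 am-10 pm, ignored
    trip_at("A", "C", 2019, 3, 4, 9, 0),    # only two trips in this group
    trip_at("A", "C", 2019, 3, 5, 9, 15),
]
reliable_groups = filter_od_hours(sample_trips)
```

Here only the A-to-B group at hour 8 reaches the three-trip minimum, so it is the only group for which the reliability indices would be computed.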

5. Results

K-means clustering is applied to group the FFTR values of the selected OD pairs into K clusters, each possessing comparable characteristics. The selection of the cut-off point is crucial as it directly impacts the resulting clusters’ number and composition. To identify the optimal number of clusters and corresponding cut-off thresholds, the elbow method is applied, as illustrated in Figure 4 and Table 4.
Subsequently, the clusters are categorized into the following groups based on the number of clusters and OD pairs they contain:
  • Low-FFTR clusters.
  • Medium-FFTR clusters, which are further divided into medium/low- and medium/high-FFTR clusters.
  • High-FFTR clusters.
For LA, the analysis results in three distinct clusters, comprising 38,143, 31,642, and 11,778 ODs in each respective group. In the case of Chicago, three clusters are identified, with 1276, 976, and 399 ODs in each group. Finally, for DFW, the clustering process produces three clusters, containing 7167, 7137, and 2452 ODs in each group. All three cities show right-skewed distributions, which is typical for travel distances where most trips are short and experience more variability. Chicago has higher threshold values for all categories compared to DFW and LA, indicating that travel rates (and hence congestion levels) are generally higher. DFW has the lowest thresholds among the three cities, suggesting that travel rates are generally faster or that congestion is lower. LA falls between Chicago and DFW, indicating moderate congestion levels. It is also noticeable that LA has a different tendency compared to Chicago and DFW, particularly in medium travel rate trips. This suggests that the travel rate distribution in LA may have unique characteristics, possibly influenced by its urban layout, traffic patterns, or travelers’ socio-demographic characteristics.
Figure 5 illustrates the distribution of estimated travel distances for each cluster. It is observed that across all the cities, the cluster with the lowest FFTR includes trips with diverse distances, whereas the clusters with medium- and high-FFTR values primarily comprise much shorter trips. This phenomenon can be attributed to the fact that short trips are more likely to use arterial roads within the city [13], where congestion occurs more frequently, and traffic signal cycles may have varying effects on waiting time variance. Conversely, longer trips are more common in suburban areas and on freeways. Additionally, the overhead time per mile, which includes the wait time, pick-up time, and drop-off time of passengers, tends to be higher for shorter trips compared to longer trips [13]. This distinction contributes to the clustering of shorter trips into the medium- and high-FFTR clusters. Moreover, a comparative analysis among the three cities reveals that Chicago exhibits a relatively slower pace and a longer tail in its FFTR distribution. This could be attributed to the presence of multiple city centers, such as business areas, and lower-density neighborhoods in Chicago. In addition, the occurrence of long intercity trips between airports and business areas further influences the FFTR distribution in the city.
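The overhead-per-mile effect noted in this paragraph can be made concrete with a short calculation. The numbers are purely hypothetical (a 30 mph cruise speed and 240 s of fixed pick-up and drop-off overhead per trip) and are not taken from the study data; the function name is likewise illustrative.

```python
def observed_travel_rate(distance_mi, cruise_speed_mph=30.0, overhead_s=240.0):
    """Seconds per mile when a fixed per-trip overhead is spread over the trip.
    Both default parameter values are assumptions made for illustration."""
    driving_s = distance_mi * 3600.0 / cruise_speed_mph
    return (driving_s + overhead_s) / distance_mi


# Travel rate (s/mi) for a short, a medium, and a long trip.
rates = {d: observed_travel_rate(d) for d in (1, 5, 15)}
for distance, rate in rates.items():
    print(f"{distance:>2} mi: {rate:.1f} s/mi")
```

Under these assumptions the rate falls from 360 s/mi for a 1-mile trip to 136 s/mi for a 15-mile trip, even though the driving speed is identical, which is consistent with short trips landing in the medium- and high-FFTR clusters.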

5.1. Temporal Analysis

To enable a comprehensive comparison of TTV among the three metroplex areas, the SD, PTI, and MI values are first clustered using the K-means clustering method. Subsequently, a temporal analysis of the TTV metrics is performed, examining the average trip distance and travel time of OD pairs within different TTV metrics clusters. These metrics vary over time and demonstrate differences among the three metroplex areas, as illustrated in Figure 6, Figure 7 and Figure 8.
The TTV metrics show higher values during peak hours (morning and evening) compared to off-peak hours (midday and night) for all three metroplex areas. This observation indicates that traffic congestion and travel time variability are more severe during peak hours. In addition, by analyzing the three metrics and the characteristics of clustered trips, it is found that, in comparison to the clustering result in Chicago, most of the OD pairs in LA and DFW belong to the low- and medium-TTV metrics clusters, and these clusters tend to have more outliers. The box plots of the average trip distance of OD pairs show that the median distances in different travel rate variability clusters are very close to each other, implying that the variation in travel time is not substantially impacted by the distance of OD pairs in the clusters.
These figures also show the variation in the different attributes of OD pairs within each cluster. It is observed that for SD clusters, the variation in different attributes predominantly follows the morning and evening peak hour patterns. However, for PTI clusters, this pattern is not consistently observed, particularly in DFW and LA. Trips between these OD pairs are less likely to be short commute trips, and they do not appear to experience significant congestion throughout the day.
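The hourly comparison in this subsection can be sketched as follows. The metric definitions used here are common textbook forms and are assumptions, not necessarily the exact formulas of this study: SD is the sample standard deviation of travel rates, PTI is the 95th-percentile travel rate divided by the free-flow travel rate, and MI is the mean of the slowest 20% of trips minus the overall mean, divided by the overall mean. The observations and function names are hypothetical.

```python
import math
import statistics
from collections import defaultdict


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (one of several percentile conventions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def ttv_metrics(rates, free_flow_rate):
    """SD, PTI, and MI for one OD pair and one hour, under the assumed definitions."""
    mean_rate = statistics.mean(rates)
    ordered = sorted(rates)
    worst = ordered[-max(1, math.ceil(0.2 * len(ordered))):]  # slowest 20% of trips
    return {
        "SD": statistics.stdev(rates),
        "PTI": percentile_nearest_rank(rates, 95) / free_flow_rate,
        "MI": (statistics.mean(worst) - mean_rate) / mean_rate,
    }


def hourly_metrics(observations, free_flow_rate, min_trips=3):
    """Group (hour, travel rate) observations by hour and compute metrics per hour."""
    by_hour = defaultdict(list)
    for hour, rate in observations:
        by_hour[hour].append(rate)
    return {
        hour: ttv_metrics(rates, free_flow_rate)
        for hour, rates in sorted(by_hour.items())
        if len(rates) >= min_trips
    }


# Hypothetical travel rates (s/mi) for one OD pair: morning peak vs. midday.
observations = [(8, r) for r in (150.0, 160.0, 170.0, 180.0, 240.0)]
observations += [(13, r) for r in (120.0, 125.0, 130.0, 135.0, 140.0)]
metrics = hourly_metrics(observations, free_flow_rate=100.0)
```

In this toy input all three metrics are higher at 8 am than at 1 pm, mirroring the peak versus off-peak contrast reported above; clustering the per-OD values of each metric with K-means would then yield the low, medium, and high groups.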

5.2. Spatial Analysis

To understand how the trips are distributed across the metroplex areas and how their reliability varies by location, a spatial analysis of the trips and their TTV metrics is conducted. All trips are plotted based on different SD clusters. Furthermore, the aim is to identify the most unreliable trips, corresponding to the trips clustered in “high”-TTV clusters. The spatial analysis allows for the study of TTV patterns and trends across different locations within the metroplex areas.
Figure 9 shows the spatial distribution of OD pairs in different SD clusters. Notably, the dense areas predominantly consist of airport trips. In the low-SD cluster, the trips are dispersed across various regions, including downtown (of multiple city centers, if applicable), residential, and entertainment areas, as well as airports. Conversely, the high-SD cluster primarily comprises trips originating from the downtowns and airports, indicating significant traffic demands in these areas. Of particular interest is the observation that in DFW, many high/medium-TTV trips are scattered around Fort Worth and Arlington. This highlights the significant unreliability of travel time for these trips, even beyond arterial roads, and emphasizes the need to consider travel demand in city planning efforts. In contrast, LA demonstrates a distinct pattern, where trips to LA downtown are mostly clustered into low/medium-TTV groups. This indicates that the actual trips between LA downtown and other areas are not as congested or unreliable as those in other cities. This condition could be attributed to LA downtown’s well-developed public transit system, which reduces traffic demand and travel time variability for private vehicles on certain routes. Alternatively, the high density of population and employment, coupled with the presence of multiple city centers in LA downtown, leads to a significant number of shorter trips that do not require extensive travel on congested roads.

6. Conclusions and Future Work

In this study, we aggregate OD trips at the census tract level to analyze TTV indices in three metropolitan areas, Chicago, DFW, and Los Angeles, using real-world datasets from TNC trips. To compare TNC trip travel times, the estimated free-flow travel times of similar trips are calculated using ArcGIS Network Analysis tools. Eight reliability metrics are then estimated for each OD pair in the network, with PTI, SD, and MI selected for travel time reliability analysis. The K-means clustering algorithm is used to segment the eligible OD pairs in the metropolitan areas and compare their reliability performances.
Based on the analysis of the spatial distribution of OD pairs across SD clusters, spatial travel reliability patterns across various urban areas are observed. Dense areas, particularly around metropolitan airports, predominantly feature trips within the low-SD cluster, indicating relatively stable travel times, even in high-demand zones such as downtowns and entertainment areas. In contrast, trips in the high-SD cluster, often originating from downtown areas, reveal significant traffic demands and travel time unreliability. These patterns become more pronounced in areas with multiple city centers.
This study confirms the significance of TTV as an essential indicator of transportation network performance. Furthermore, the conclusions drawn from this research can be generalized to other cities with different transportation characteristics. It is observed that reliable travel paths can be identified in each city, with city-wide characteristics influencing the effectiveness of different TTV indices across cities. This study also provides implications and recommendations for enhancing travel time reliability and reducing traffic congestion in case study areas; for instance, to promote transportation equity, Dallas is investing in infrastructure and providing grants to support alternative modes of transportation like biking, walking, and public transit [25]. However, this study shows that in the DFW area, there are very few viable opportunities to replace TNC trips with transit or other environmentally friendly modes of transportation, compared to other transit-friendly cities that have relatively high TNC trip penetration rates. There is a need for transportation planning, targeted strategies, and innovative solutions in order to address travel time variability challenges in the test bed area.
The K-means clustering algorithm, as an unsupervised learning method, may produce different segments across cities. This method is exploratory in nature and is intended to guide urban planning rather than provide definitive solutions. Moreover, the estimation of free-flow travel time using the ArcGIS geo-database could be improved by incorporating more reliable probe data to enhance accuracy. Future research directions include expanding this study to consider a holistic approach to transportation planning that integrates various modes of transportation. Future research should assess the potential benefits of creating an interconnected and efficient multi-modal transportation network in the metropolitan area. Another promising direction is to extend the TTV analysis to temporal variations in travel time reliability by investigating how TTV metrics change across different days of the week, seasons, or during special events. This dynamic analysis will provide a more comprehensive understanding of how traffic patterns impact travel time reliability. To effectively extrapolate findings from TNC data and consider planning implications for cities, we consider the incorporation of demographic and socioeconomic data to understand the impact of different population segments. Furthermore, in terms of data-driven decision-making, machine learning and predictive modeling techniques can be employed to forecast travel time variability and congestion patterns, so that algorithms can be developed to predict areas and times with high TTV in order to facilitate better traffic management and resource allocation.

Author Contributions

Y.S.: Conceptualization, Methodology, Data Analysis, Writing—Original Draft, Visualization; Y.C.: Conceptualization, Methodology, Data Analysis, Writing—Review & Editing, Supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data used in this study are not publicly available due to privacy restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lasley, P. 2021 Urban Mobility Report—Appendix B: Change in Vehicle Occupancy Used in Mobility Monitoring Efforts; Texas Transportation Institute (TTI): College Station, TX, USA, 2021; p. 6.
  2. Cambridge Systematics and National Research Council. Analytical Procedures for Determining the Impacts of Reliability Mitigation Strategies; Transportation Research Board: Washington, DC, USA, 2012.
  3. U.S. Department of Transportation, Federal Highway Administration. National Household Travel Survey. Available online: https://nhts.ornl.gov/ (accessed on 9 July 2024).
  4. Sun, C.; Arr, G.; Ramachandran, R.P. Vehicle Reidentification as Method for Deriving Travel Time and Travel Time Distributions: Investigation. Transp. Res. Rec. 2003, 1826, 25–30.
  5. Low, V.J.M.; Khoo, H.L.; Khoo, W.C. Quantifying bus travel time variability and identifying spatial and temporal factors using Burr distribution model. Int. J. Transp. Sci. Technol. 2022, 11, 563–577.
  6. FHWA Office of Operations. Does Travel Time Reliability Matter?—Primer—What is Travel Time Reliability? Available online: https://ops.fhwa.dot.gov/publications/fhwahop19062/whatis.htm (accessed on 9 July 2024).
  7. Büchel, B.; Corman, F. Review on Statistical Modeling of Travel Time Variability for Road-Based Public Transport. Front. Built Environ. 2020, 6, 70.
  8. Zang, Z.; Xu, X.; Qu, K.; Chen, R.; Chen, A. Travel time reliability in transportation networks: A review of methodological developments. Transp. Res. Part C Emerg. Technol. 2022, 143, 103866.
  9. National Performance Management Measures; Assessing Performance of the National Highway System, Freight Movement on the Interstate System, and Congestion Mitigation and Air Quality Improvement Program. Available online: https://www.federalregister.gov/documents/2017/01/18/2017-00681/national-performance-management-measures-assessing-performance-of-the-national-highway-system (accessed on 27 August 2024).
  10. van Lint, J.W.C.; van Zuylen, H.J.; Tu, H. Travel time unreliability on freeways: Why measures based on variance tell only half the story. Transp. Res. Part A Policy Pract. 2008, 42, 258–277.
  11. Lomax, T.; Schrank, D.; Turner, S.; Margiotta, R. Selecting Travel Reliability Measures; The National Academies of Sciences, Engineering, and Medicine: Washington, DC, USA, 2003.
  12. Chase, R.; Williams, B.; Rouphail, N. Detailed Analysis of Travel Time Reliability Performance Measures from Empirical Data. In Proceedings of the Transportation Research Board 92nd Annual Meeting, Washington, DC, USA, 13–17 January 2013.
  13. Ale-Ahmad, H.; Chen, Y.; Mahmassani, H. Travel Time Variability and Congestion Assessment for Origin–Destination Clusters through the Experience of Mobility Companies. Transp. Res. Rec. 2020, 2674, 103–117.
  14. Chen, P.; Tong, R.; Lu, G.; Wang, Y. Exploring Travel Time Distribution and Variability Patterns Using Probe Vehicle Data: Case Study in Beijing. J. Adv. Transp. 2018, 2018, 3747632.
  15. Yang, S.; An, C.; Wu, Y.-J.; Xia, J. Origin–Destination-Based Travel Time Reliability. Transp. Res. Rec. 2017, 2643, 139–159.
  16. Chepuri, A.; Ramakrishnan, J.; Arkatkar, S.; Joshi, G.; Pulugurtha, S.S. Examining Travel Time Reliability-Based Performance Indicators for Bus Routes Using GPS-Based Bus Trajectory Data in India. J. Transp. Eng. Part A Syst. 2018, 144, 04018012.
  17. Pulugurtha, S.; Koilada, K. Exploring Correlations between Travel Time Based Measures by Year, Day-of-the-week, Time-of-the-day, Week-of-the-Year and the Posted Speed Limit. Urban Plan. Transp. Res. 2021, 9, 1–17.
  18. Analytic Relationships between Travel Time Reliability Measures. Transp. Res. Rec. 2011, 2254, 122–130.
  19. Lint, J.W.C.; van Zuylen, H. Monitoring and Predicting Freeway Travel Time Reliability: Using Width and Skew of Day-to-Day Travel Time Distribution. Transp. Res. Rec. 2005, 1917, 54–62.
  20. Ran, X.; Zhou, X.; Lei, M.; Tepsan, W.; Deng, W. A Novel K-Means Clustering Algorithm with a Noise Algorithm for Capturing Urban Hotspots. Appl. Sci. 2021, 11, 11202.
  21. Qin, K.; Zhou, Q.; Wu, T.; Xu, Y.Q. Hotspots Detection from Trajectory Data Based on Spatiotemporal Data Field Clustering. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2017, XLII-2-W7, 1319–1325.
  22. US Census Bureau. TIGERweb Decennial Nation-Based Data Files. Available online: https://tigerweb.geo.census.gov/tigerwebmain/TIGERweb2020_state_based_files.html (accessed on 9 July 2024).
  23. An Overview of the Network Analysis Toolset—ArcGIS Pro | Documentation. Available online: https://pro.arcgis.com/en/pro-app/latest/tool-reference/ready-to-use/an-overview-of-the-network-analysis-toolset.htm (accessed on 9 July 2024).
  24. Cambridge Systematics, Inc.; Strategic Highway Research Program; Strategic Highway Research Program Reliability Focus Area; Transportation Research Board; National Academies of Sciences, Engineering, and Medicine. Guide to Incorporating Reliability Performance Measures into the Transportation Planning and Programming Processes; Transportation Research Board: Washington, DC, USA, 2013.
  25. O’Donnell, T. Transportation Equity and Access to Opportunity for Transit-Dependent Population in Dallas. 17 October 2017. Available online: https://dallascityhall.com/government/Council%20Meeting%20Documents/msis_2_transportation-equity-and-access-to-opportunity-for-transit-dependent-population-in-dallas_combined_102317.pdf (accessed on 9 July 2024).
Figure 1. Pairwise correlation matrix plot between SD, TTI, PTI, PR10, PR25, MI, and MI_mod of three metropolitan areas: (a) Chicago; (b) DFW; (c) LA. Note: *** p < 0.001 indicates statistical significance.
Figure 2. Scatter plot or geo/travel-distance regression result of three metropolitan areas.
Figure 3. (a) Distribution of travel time-related variables in DFW: (I,IV) travel time duration difference (second) and excess rate (m/s); (II,V) travel speed (m/s) and free-flow travel speed (m/s); (III,VI) travel rate (s/m) and free-flow travel rate (s/m). (b) Distribution of travel time-related variables in Chicago: (I,II) travel rate (s/m) and free-flow travel rate (s/m). (c) Distribution of travel time-related variables in LA: (I,II) travel rate (s/m) and free-flow travel rate (s/m).
Figure 4. The elbow method for FFTR of three metropolitan areas for clustering.
Figure 5. Density plot of travel distance distribution for each FFTR group of three metropolitan areas.
Figure 6. Average trip distance and travel time of OD pairs of Chicago, as calculated using SD, PTI, and MI.
Figure 7. Average trip distance and travel time of OD pairs of DFW, as calculated using SD, PTI, and MI. From left to right, low (green), medium (yellow), high (red) clusters.
Figure 8. Average trip distance and travel time of OD pairs of LA, as calculated using SD, PTI, and MI. From left to right, low (green), medium (yellow), high (red) clusters.
Figure 9. Spatial distribution of all SD clusters, low-SD (green), medium-SD (yellow), high-SD (red) by OD pairs of (a) Chicago, (b) DFW, and (c) LA.
Table 1. Summary of TTR usage and findings.
| Study | City | Data | Selected TTR Index | Reference Travel Time | Comparison Method of Measures |
|---|---|---|---|---|---|
| Ale-Ahmad et al. (2020) [13] | Chicago, IL, US | Ride-sourcing data | Planning Time Index (PTI); On-time measure (PR) | The mode of travel time distribution | Comparison over different time periods/corridors at the OD level; use clustering method to compare TTV performance in different groups. |
| Pulugurtha and Koilada (2020) [17] | Charlotte, NC, US | One-minute interval travel time data for road links (private data source) | Buffer Time (BT); Buffer Time Index (BTI), PTI; Travel Time Index (TTI) | The minimum, maximum, and average; 10th-, 15th-, 50th-, 85th-, 90th-, and 95th-percentile travel times | Correlation between indexes and travel time. |
| Chepuri et al. (2018) [16] | Chennai, India | GPS-based trajectory data of bus trips | BT, 95th-percentile travel time | Average travel time | Comparison of TTV and reliability measures across different time periods and segments. The clustering method is applied to classify reliability indicators based on segment-level data, COV, and V/C ratio. |
| Chen et al. (2018) [14] | Beijing, China | Taxi data (GPS-based probe vehicle data) | Unit distance travel time; coefficient of variation, BTI; punctuality rate | Average travel time for different road types and time windows | Consider the changes in multiple indicators comprehensively. |
| Yang et al. (2017) [15] | Kunshan, China | Taxi data (GPS-based probe vehicle data) | Standard deviation (SD); Coefficient of Variation (CV), BI | Average travel time | Compared route-specific TTR values against non-route-specific (NRS) TTR values using statistical methods. |
| Chase et al. (2013) [12] | North Carolina, Virginia, Florida, Delaware, Pennsylvania, New Jersey, Washington, DC | GPS probe vehicle network data with real-time traffic information | Semi-standard deviation (a one-sided statistic that measures deviations from a reference value) | Free-flow travel time | Correlation test, ranking analysis (root-mean-square differences in the ranks), visual scatterplots. A full distribution analysis should be conducted to obtain a comprehensive evaluation. |
Table 2. Regression analysis.
| | Chicago slope | Chicago intercept | DFW slope | DFW intercept | LA slope | LA intercept |
|---|---|---|---|---|---|---|
| Before outlier removal | 1.149 | 0.480 | 1.300 | 0.889 | 1.316 | 0.644 |
| After outlier removal | 1.152 | 0.472 | 1.299 | 0.902 | 1.317 | 0.624 |
Table 3. Descriptive statistics of TNC trips (after data preprocessing).
| | Chicago | DFW | LA |
|---|---|---|---|
| Number of total trips | 388,642 | 578,409 | 2,853,778 |
| Number of unique origins/destinations | 112 | 1197 | 2971 |
| Number of census tract OD pairs | 2651 | 16,756 | 81,563 |
| Mean of median trip length (mi) | 5.09 | 7.96 | 6.39 |
| Mean of median trip duration (sec) | 1024.07 | 944.30 | 999.01 |
| % of trips made between 6 am–10 pm | 72.64% | 76.68% | 78.53% |
| Mean of median travel time difference (sec) | 361.20 | 186.72 | 326.76 |
| Mean of median FFTR (s/m) | 153.40 | 110.86 | 122.20 |
Table 4. Thresholds of clustering.
| | Chicago | DFW | LA |
|---|---|---|---|
| Low FFTR | 111.66 | 81.54 | 90.28 |
| Medium FFTR | 172.02 | 120.94 | 131.57 |
| High FFTR | 240.34 | 167.61 | 178.13 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Sun, Y.; Chen, Y. Travel Time Variability in Urban Mobility: Exploring Transportation System Reliability Performance Using Ridesharing Data. Sustainability 2024, 16, 8103. https://doi.org/10.3390/su16188103
