Article

Data-Driven Method to Estimate the Maximum Likelihood Space–Time Trajectory in an Urban Rail Transit System

1
Department of Transportation Management Engineering, School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China
2
Department of Civil and Environment Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
3
Wuhan Metro Operation Co. Ltd., Wuhan 430000, China
*
Author to whom correspondence should be addressed.
Sustainability 2018, 10(6), 1752; https://doi.org/10.3390/su10061752
Submission received: 28 April 2018 / Revised: 23 May 2018 / Accepted: 24 May 2018 / Published: 27 May 2018
(This article belongs to the Section Sustainable Urban and Rural Development)

Abstract

The Urban Rail Transit (URT) passenger travel space–time trajectory reflects a passenger’s path-choice and the components of URT network passenger flow. This paper proposes a model to estimate a passenger’s maximum-likelihood space–time trajectory using Automatic Fare Collection (AFC) transaction data, which contain the passenger’s entry and exit information. First, a method is presented to construct a space–time trajectory within a tap in/out constraint. Then, a maximum likelihood space–time trajectory estimation model is developed to achieve two goals: (1) to minimize the variance in a passenger’s walk time, including the access walk time, egress walk time and transfer walk time when a transfer is included; and (2) to minimize the variance between a passenger’s actual walk time and the expected value obtained by manual survey observation. Considering the computational efficiency and the characteristics of the model, we decompose the passenger’s travel links and convert the maximum likelihood space–time trajectory estimation problem into a single-quadratic programming problem. Real-world AFC transaction data and train timetable data from the Beijing URT network are used to test the proposed model and algorithm. The estimation results are consistent with the clearing results obtained from the authorities, and this finding verifies the feasibility of our approach.

1. Introduction

As an important part of urban public transport, Urban Rail Transit (URT) serves an increasing number of citizens, and its network scale is growing rapidly. With the rapid development of URT, networks have expanded from a single line into multiple lines, forming complex networks. The Beijing Subway has developed from a simple network of four subway lines into a complex network of twenty-one subway lines in the past 10 years. On the one hand, passengers have more route choices in a complex network. There is typically more than one feasible spatial route between the Origin and Destination (OD) for passengers to travel. For example, approximately 73.9 percent of OD pairs had two or more spatial routes within the Shanghai subway in July 2015. Consequently, there is an urgent need to study passenger network flow within a complex URT network. On the other hand, complex subway networks are typically operated by several companies, especially in China. Because all fare payments are collected through a common Automatic Fare Collection (AFC) system, high accuracy is required to allocate revenues according to ridership shares.
The characteristics of the network passenger flow distribution in the URT network have always been an important consideration in traffic planning. These characteristics reflect travel demand and form the basis for train scheduling. Previous studies on passenger flow distribution have focused predominantly on two parts. One is the passenger flow assignment model, which calculates the proportion of each selected path and analyzes bottlenecks from a macroscale perspective. The other is a model that assigns a space–time trajectory to each passenger from a microscale perspective.
At the macroscale level, models capture characteristics of the network passenger flow distribution. Generally, travel demand is known because the passenger’s entry and exit information is recorded by the AFC system. Taking specific factors, such as crowding and travel time, into consideration, the models formulate cost functions for all the routes and calculate the ridership share according to network flow assignment principles. Traditional studies on the characteristics of network passenger flow distribution built assignment models using traffic flow theory, based on the network travel demand.
As of the end of 2009, the AFC Clearing Center (ACC) of Shanghai metro corporations applied an all-or-nothing principle for network assignment. The assignment model assumed that all travelers would choose the shortest route for their trips. However, this method is suitable only for a simple network. To reflect the diversity of passengers’ path-choice behavior, some researchers expanded their studies based on a stochastic assignment principle [1,2,3,4,5], which takes individual differences into account and assigns passengers to all the feasible routes according to fixed proportions. The authors of [5] proposed a deletion algorithm for available routes based on the depth-first principle and calculated the corresponding proportions. Then, a passenger flow distribution model was proposed that met the passenger’s desire to minimize cost and reflected path-choice multiplicity. However, the model proposed in that paper is static, and the impedance of routes is time-independent in that method, whereas in practice the impedance of a route varies widely, especially for overcrowded routes. Left-behind passengers were taken into consideration, and new network passenger flow assignment models were proposed based on the user equilibrium assignment principle [6,7,8,9,10,11,12,13,14,15,16,17,18]. Considering the train vehicle capacity and individual path-choice differences, the method assumes that the network will come to a state wherein the journey times on all the routes chosen by passengers are equal and less than the time that would be experienced by a single passenger on any unused route. The authors of [6] developed a time-increment simulation method to load passengers with a capacity constraint and solved the user equilibrium assignment problem iteratively by the method of successive averages.
With the wide adoption of AFC systems, a new methodology to analyze the characteristics of the network passenger flow distribution and the performance of the URT network was proposed based on AFC transaction data [19,20,21,22,23,24,25,26,27,28,29]. Although the primary purpose of the AFC system is to collect fares, bulk transaction data are recorded accurately and saved. The transaction data record the passengers’ travel information in detail, including the origin station, destination station, entry time and exit time. Attracted by the potential value of this system, many transit operators and researchers have begun to use AFC transaction data to characterize travel demands and analyze the URT system performance [23,24,25] and passenger path-choice behavior [19,20,21,22].
The authors of [26] proposed a schedule-based passenger path-choice estimation model using AFC data. They then converted this model into a network Train Schedule Connection Network (TSCN) path generation and assignment problem. Assuming that the access time is equal to the minimum access time and that the transfer time is equal to the minimum transfer time, a set of feasible paths was generated. However, that paper ignored egress activity when the weights of the feasible paths were calculated. In addition, that model relies heavily on the fail-to-board parameters, which were modeled by Schmöcker (2006) in a user equilibrium assignment model.
The authors in [28,29] developed a probabilistic Passenger-to-Train Assignment Model (PTAM) based on automated data. The model takes as input the walking speed distribution estimated from a maximum likelihood method using AFC data and Automatic Vehicle Location (AVL) data. By decomposing the gate-to-gate journey time into access and wait times, the number of times a passenger is left behind at the individual level can be estimated. The output of the model is the probability that a passenger boards each feasible train. At the aggregate level, the degree of train load and station crowding are estimated based on the assignment results with satisfactory accuracy regardless of transfers. That paper focused mainly on journeys without transfers. Even though that paper stated that the problem with transfers could be formulated in a similar way in principle, the diversity of spatial routes due to the complexity of the URT network was ignored.
In [27], a methodology was proposed to estimate the most likely space–time path by mining the AFC data. The model took the left-behind phenomenon into consideration (in which passengers are prevented from boarding the first arriving train due to the crowd) and incorporated a time-expanded network to formulate the passengers’ space–time path. Assuming the independence of passengers, the most likely space–time path estimation model was developed. At the macroscale level, the network passenger flow distribution result estimated with the proposed method is consistent with the actual data. At the microscale level, all passengers’ detailed space–time paths are estimated. However, the accuracy of the estimation results relies heavily on the probability that passengers can board a specific train vehicle. Table 1 provides a systematic comparison of the key modeling components in the existing network flow assignment research.
As shown in Table 1, a data-driven method to assign the maximum likelihood space–time trajectory is proposed, and individual differences are taken into consideration based on prior studies and our earlier work. In contrast to the traditional methods, the proposed method outputs all the passengers’ detailed travel information, which indicates the passengers’ movements among activity locations with respect to time. Moreover, an estimation model using the station walk time parameters, which can be obtained easily and accurately, is developed to reduce the dependence on data and improve the reliability and accuracy of the estimation results.
This paper decomposes the journey within the URT network into different types of activities, including time spent on the access walk, platform, train, and egress walk. Transfer walk and additional platform waiting are also included when a transfer occurs. If all the trains are punctual according to train schedules, the passengers’ space–time trajectories are constructed based on the train timetable data and URT network topology. Since the walking speed of a passenger generally fluctuates within a small range, one of our aims is to minimize the variance among the passenger’s walk activities. Our other aim is to minimize the variance between the passenger’s actual walk time and the expected value, which is obtained by a manual survey. Taking this individual difference into account, the maximum likelihood space–time trajectory estimation model is proposed.
The proposed estimation model presents a method to calculate the weight of each space–time trajectory. Because a train vehicle departs a station at a fixed time according to a schedule, the passenger’s arrival time and departure time can be calculated once his/her space–time trajectory is assigned. Thus, the total time consumption a passenger spent at a station can be calculated. Then, a quadratic problem is formulated to solve the estimation problem. If the activities of the passengers at different stations are independent, the quadratic problem can be converted to a set of one-quadratic problems to improve computational efficiency.
The main contributions of this paper are as follows:
  • A data-driven methodology to estimate a passenger’s detailed travel information is developed. Passengers’ detailed trajectories can be used for further study, such as analysis of path-choice behavior.
  • A method to estimate walk parameters of subway stations using AFC data and train schedules is developed.
  • At the aggregate level, we develop outputs of various kinds of statistical reports for operators, such as time-independent network passenger flow distribution and time-independent congestion of train vehicles and stations.
This paper is organized as follows. Section 2 analyzes the passenger travel components and introduces the main idea of estimating the maximum likelihood space–time trajectory based on AFC data. Section 3 describes the model, followed by an introduction of the solution algorithm in Section 4. In Section 5, numerical experiments on a real-world network are presented. The final section presents the paper’s conclusions along with a summary of the comments and future research steps.

2. Conceptual Illustrations

This section first analyzes the passenger’s travel trajectory components. Then, we illustrate how to use a time-expanded network to represent a passenger’s space–time trajectory.

2.1. Passenger Travel Trajectory Components

A URT system is a closed system that includes a pay zone and a free zone. A passenger enters the pay zone once he/she passes an entry gate and leaves when passing an exit gate by swiping his/her smart card. The AFC system records a passenger’s entry and exit information and produces complete transaction data. Figure 1a illustrates a simple urban rail network topology, which is made up of three railway lines, and Figure 1b presents a complete journey from station A to station E by path A→C→E, which contains one transfer.
As shown in Figure 1b, in general, the travel time comprises the access walk time, Platform Waiting Time (PWT), On-Train Time (OTT), egress walk time and transfer walk time (if a transfer is included). The mean values of the access walk time and egress walk time are typically measured by a manual survey or a simulation model, depending on the station volume. The access walk time is defined as the elapsed time after entry and before arriving at the midpoint of the platform(s), as represented by arrow 1 in Figure 1b. Similarly, the egress walk time, represented by arrow 5 in Figure 1b, is defined as the time spent walking from the midpoint of the platform to the exit gate(s), assuming that the egress walk starts immediately after the train arrives.
The OTT is determined by the departure time and arrival time. According to the punctual assumption, the arrival time and departure time at all stations are determined and fixed. Therefore, the OTT is constant once a space–time path is assigned for a passenger.
The transfer walk time is defined as the elapsed time after the end of the previous on-train travel and before the beginning of the next on-train travel if the transfer walk starts immediately after the train arrives. This time is represented by arrow 3 in Figure 1b.
The PWT is defined as the elapsed time after the end of access and before the beginning of on-train travel, i.e., the PWT is the time from the passenger arrival at the midpoint of the platform to the start of movement of the boarded train.

2.2. Passenger’s Space–Time Trajectory

According to the previous section, travel time contains four parts. For journeys that require one (or more) transfer(s), the transfer walk time and additional PWT are also included. Thus, the space–time trajectory comprises five types of links: the access link, transfer link, egress link, platform wait link and train link. The first three are a type of walk link. Travel trajectory is linked by train connections, and there are no two adjacent walk links. Here, we take the OD from station A to station E as an example.
As shown in Figure 1a, there are two feasible spatial routes for the passenger to choose from. The first route is A→C→E, which requires one transfer. The other one is A→B→D→E. During off-peak hours, a passenger can board the first available train after his/her arrival at the platform. Figure 2a,c display the network-time paths of the different routes that the passenger may choose for his/her travel.
In constructing the time-expanded network, as shown in Figure 2, a transfer station is replaced by two or more stations, depending on the number of railway lines passing through this station. For example, Station C is a transfer station served by Line 1 and Line 3. Therefore, Station C is replaced by Station C′ and Station C″ in the time-expanded network.
As shown in Figure 2, there are usually two or more feasible space–time trajectories that can be assigned for given entry and exit information. Figure 2a,b indicate that the passenger traveled via spatial path A→C→E. However, the passenger spends much more time walking from the entry gate to the platform in Figure 2b than in Figure 2a. Figure 2c indicates that the passenger traveled by a different spatial route, A→B→D→E.
In a complex URT network, there are generally two or more feasible spatial routes for passengers to choose from. Even though the entry and exit information can be recorded accurately by the AFC system, it is hard to identify the specific space–time trajectory from AFC transaction data alone. This paper aims to develop a methodology to estimate the maximum likelihood space–time trajectory given the entry and exit information recorded by the AFC system.

3. Maximum Likelihood Space–Time Trajectory Estimation Models

We now describe the formal problem statement for the passenger’s space–time trajectory model to minimize the variance among the passenger’s walking speeds and mean values. First, the variables used in the mathematical formulations are defined. Those variables describe the construction of the time-expanded network and passenger’s space–time trajectory. Then, the method of constructing the time-expanded network and the maximum likelihood space–time trajectory estimation model are proposed.

3.1. Notations

Table 2 and Table 3 give the related notations, input parameters and decision variables of the corresponding problem.

3.2. Maximum Likelihood Space–Time Trajectory Estimation Model

Problem statement. Given the AFC transaction data and train timetable data, the passenger space–time trajectory estimation problem aims to assign the most likely space–time paths to the passengers.
Space–time flow balance constraints. To depict a time-dependent tour in the space–time network, we formulate a set of flow balance constraints as follows:
$$\sum_{(j,t_2)\in V,\ t_1<t_2} z_{i,j,p}^{t_1,t_2} - \sum_{(j,t_2)\in V,\ t_2<t_1} z_{j,i,p}^{t_2,t_1} = \begin{cases} 1, & i=o_p,\ t_1=t_{o_p,p} \\ -1, & i=d_p,\ t_1=t_{d_p,p} \\ 0, & \text{otherwise} \end{cases} \qquad \forall (i,t_1)\in V,\ p\in P \tag{1}$$
The first term in this constraint for a node represents the total count that passenger p leaves from the node and the second term represents the total count that passenger p arrives at the node. The flow constraint states that the number of departures must equal the number of arrivals unless this node is an origin node or a destination node. If the node is an origin node for passenger p , the number of departures exceeds the number of arrivals and the number of departures minus the number of arrivals must equal 1. If the node is a destination node for passenger p , the number of arrivals exceeds the number of departures and the number of arrivals minus the number of departures must equal 1.
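The flow balance condition can be checked mechanically for a candidate trajectory. The following is a minimal sketch, not the authors' implementation: the function name `flow_balance_ok` and the representation of a space–time node as a `(location, time)` tuple are assumptions made for illustration.

```python
from collections import defaultdict

def flow_balance_ok(links, origin, destination):
    """Check the flow balance constraint for one passenger.

    links: list of ((i, t1), (j, t2)) space-time links selected for the
           passenger (the links whose binary variable z equals 1).
    origin, destination: (location, time) nodes of the tap-in and tap-out.
    """
    net = defaultdict(int)  # departures minus arrivals at each node
    for tail, head in links:
        net[tail] += 1   # one departure from the tail node
        net[head] -= 1   # one arrival at the head node
    for node in set(net) | {origin, destination}:
        expected = 1 if node == origin else -1 if node == destination else 0
        if net[node] != expected:
            return False
    return True

# Hypothetical trajectory: entry gate -> platform -> train -> exit gate.
path = [
    (("A_gate", 0), ("A_platform", 60)),
    (("A_platform", 60), ("A_track", 120)),
    (("A_track", 120), ("E_track", 600)),
    (("E_track", 600), ("E_gate", 690)),
]
print(flow_balance_ok(path, ("A_gate", 0), ("E_gate", 690)))
```

Removing any intermediate link from `path` leaves a node with unmatched arrivals and departures, so the check fails.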
Activity time cost constraints. In the real world, the time spent on any activity is not less than zero. Therefore,
$$c_{i,j,p}^{t_1,t_2} = t_2 - t_1 \ge 0, \qquad \forall (i,j,t_1,t_2)\in E,\ p\in P \tag{2}$$
Total time consumption constraints. Given a passenger’s AFC transaction data, the time that the passenger passed by the entry gate and exit gate is known. Then, the consumed travel time is calculated.
$$\sum_{(i,j,t_1,t_2)\in E} z_{i,j,p}^{t_1,t_2}\cdot c_{i,j,p}^{t_1,t_2} = t_{d_p,p} - t_{o_p,p} \tag{3}$$
Platform wait time constraints. Assuming that all passengers can board the first coming train after their arrival at the platform, PWT should be less than the interval time between the departure time of the first coming train after the passenger’s arrival at the platform and the departure time of the last train that leaves before the passenger’s arrival at the platform.
$$c_{i,j,p}^{t_1,t_2} \le h_{j_{\mathrm{station}},t_2}, \qquad \forall (j,t_2)\in V,\ i\in N_W,\ j\in N_W\cup\{d_p\} \tag{4}$$
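The platform wait time bound can be illustrated with a short sketch. It assumes a sorted list of departure times (in seconds) at one platform; the function name `first_train_wait` and the data layout are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left

def first_train_wait(departures, arrival):
    """Find the first train departing at or after a platform arrival time.

    departures: sorted departure times (seconds) at one platform.
    arrival: time (seconds) at which the passenger reaches the platform.
    Returns (departure, wait, headway), or None if no train remains.
    headway is None for the first train of the list.
    """
    k = bisect_left(departures, arrival)
    if k == len(departures):
        return None
    departure = departures[k]
    headway = departure - departures[k - 1] if k > 0 else None
    return departure, departure - arrival, headway

departure, wait, headway = first_train_wait([600, 900, 1200], 950)
# Under the first-train assumption the wait never exceeds the headway.
print(departure, wait, headway, wait <= headway)
```

Because the passenger arrives after the previous departure and boards the next one, the wait is bounded by the interval between those two departures.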
Objective function. This paper aims to estimate the maximum likelihood space–time trajectory for all passengers based on two aims. Since the walking speed of a passenger generally fluctuates within a small range, one of our aims is to minimize the variance among the passenger’s walk activities. The other aim is to minimize the variance between the passenger’s actual walking time and its expected value. The objective function is given as follows.
$$\min z = \sum_{p\in P}\ \sum_{(i,j,t_1,t_2)\in E} z_{i,j,p}^{t_1,t_2}\cdot w_{i,j}^{t_1,t_2}\left[\left(\frac{c_{i,j,p}^{t_1,t_2}}{et_{i,j}}-1\right)^2+\left(\frac{c_{i,j,p}^{t_1,t_2}}{et_{i,j}}\cdot\frac{et_{i,d_p}}{c_{i,d_p,p}^{t_1,t_2}}-1\right)^2\right] \tag{5}$$
In formulation (5), $\left(\frac{c_{i,j,p}^{t_1,t_2}}{et_{i,j}}-1\right)^2$ represents the variance between the time that passenger $p$ spends on the spatial link $(i,j)$ and the mean value of the expected time consumption on the spatial link $(i,j)$. The term $\left(\frac{c_{i,j,p}^{t_1,t_2}}{et_{i,j}}\cdot\frac{et_{i,d_p}}{c_{i,d_p,p}^{t_1,t_2}}-1\right)^2$ is used to calculate the variance among the passenger’s walking speeds, including the access walking speed, egress walking speed and transfer walking speed.
$w_{i,j}^{t_1,t_2}$ is an attribute of the space–time link $(i,j,t_1,t_2)$ and is a 0–1 binary variable. If the space–time link $(i,j,t_1,t_2)$ is a walk link, $w_{i,j}^{t_1,t_2}$ equals 1. Otherwise, $w_{i,j}^{t_1,t_2}$ equals 0. Because our two aims are both related to the passenger’s walk activities, only the set of space–time walking links is searched while calculating the utility value of a feasible space–time path, which reduces the feasible region and improves computational efficiency. In general, the spatial location of the end node of a space–time walking link is either a platform or an exit gate. Thus, the objective function can be converted into formulation (6).
$$\min z = \sum_{p\in P}\ \sum_{\substack{(i,j,t_1,t_2)\in E \\ j\in N_W\cup\{d_p\}}} z_{i,j,p}^{t_1,t_2}\left[\left(\frac{c_{i,j,p}^{t_1,t_2}}{et_{i,j}}-1\right)^2+\left(\frac{c_{i,j,p}^{t_1,t_2}}{et_{i,j}}\cdot\frac{et_{i,d_p}}{c_{i,d_p,p}^{t_1,t_2}}-1\right)^2\right] \tag{6}$$
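For a single passenger and a single candidate trajectory, formulation (6) reduces to a sum over the walk links. The sketch below evaluates that sum under the assumption that each walk link is summarized by its actual time and its surveyed expected time, with the egress link listed last; the function name `trajectory_weight` and this input layout are illustrative only.

```python
def trajectory_weight(walk_links):
    """Evaluate the bracketed terms of formulation (6) for one trajectory.

    walk_links: list of (actual_time, expected_time) pairs in seconds,
                one per walk link, with the egress walk link last.
                All times are assumed to be strictly positive.
    """
    egress_actual, egress_expected = walk_links[-1]
    total = 0.0
    for actual, expected in walk_links:
        ratio = actual / expected
        # Deviation of the walk time from its expected value.
        deviation = (ratio - 1.0) ** 2
        # Consistency of this link's relative pace with the egress walk.
        consistency = (ratio * egress_expected / egress_actual - 1.0) ** 2
        total += deviation + consistency
    return total

# Walk times equal to their expectations give a weight of zero.
print(trajectory_weight([(60, 60), (90, 90)]))
# An access walk twice as long as expected is penalized by both terms.
print(trajectory_weight([(120, 60), (90, 90)]))
```

A smaller weight indicates a trajectory whose walk times are closer to the surveyed expectations and more consistent with each other.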

4. Solution Algorithms

This section presents an algorithm to estimate the maximum likelihood space–time trajectory based on AFC transaction data and the time-expanded network. A space–time trajectory is fully determined by $z_{i,j,p}^{t_1,t_2}$ and $t_{j,p}$, where $j\in N_W$. $z_{i,j,p}^{t_1,t_2}$ determines the spatial route and the trains that the passenger boards, whereas the passenger’s walk time and PWT are determined by $t_{j,p}$, where $j\in N_W$. Thus, the passenger maximum likelihood space–time trajectory estimation problem can be converted into a time-expanded network trajectory generation and assignment problem.

4.1. Feasible Trajectory Generation Algorithm

A space–time trajectory indicates a passenger’s movements among activity locations with respect to time. Such a trajectory contains five types of links: The access walk link, platform wait link, train link, transfer walk link and egress walk link. Considering that trains are running according to schedules, train links are fixed and can be built based on the train timetable data. However, the travel time of passengers can vary. Therefore, the passenger’s walk time and PWT are uncertain even if the train he/she boards is determined. Some constraints used to formulate the space–time trajectory are given below.
Transfer station constraints. As described in Section 2.2, a transfer station is replaced by two or more stations. Then, if station $s$ and station $s'$ are different stations of the same transfer station, they share the same name.
$$s_{name} = s'_{name}, \qquad \forall s\in S \tag{7}$$
With constraints (2)–(4) and (7), an algorithm is proposed to construct a space–time trajectory network based on the URT train timetable data and AFC data. Figure 3 shows the procedure for building the space–time trajectory network, and Figure 4 gives an example that illustrates the procedure.
Algorithm. Building the space–time trajectory network
  • Input: URT network, train timetable data, AFC data
  • Output: space–time trajectory network
Step 1. Initialize parameters and variables.
    Input the URT train timetable and AFC data, initialize the parameters of the algorithm, and initialize the variables $t_{i,p}$ and $z_{i,j,p}^{t_1,t_2}$.
Step 2. Extend subway stations.
    Step 2.1 Extend the transfer stations according to the number of subway lines serving them. The stations that represent the same transfer station have the same station name. Add all stations to $S$.
    Step 2.2 Replace each station by four spatial nodes that represent the entry gate, exit gate, platform and track of the station. These four nodes are either the start node or the end node of a passenger’s activity links. Add all spatial nodes to $N$. Add all spatial nodes that represent a platform to $N_W$.
Step 3. Build the space–time node set.
    Step 3.1 Generate train departure space–time nodes and arrival space–time nodes. Extend the track spatial nodes according to the train arrival count and departure count. Add these space–time nodes to $V$.
    Step 3.2 Generate passengers’ entry space–time nodes and exit space–time nodes. Extend the spatial nodes that represent an entry gate or exit gate according to the passengers’ entry and exit records. Add these space–time nodes to $V$.
    Step 3.3 Generate platform space–time nodes. Extend each platform spatial node according to the departures of the trains serving the platform. Add these space–time nodes to $V$.
Step 4. Build the space–time link set.
    Step 4.1 Build train space–time links. Connect the space–time departure node and arrival node of the same train according to the sequence of space–time nodes passed by the train. Add these space–time links to $E$.
    Step 4.2 Build walk space–time links. First, build the access walk space–time links between entry space–time nodes and platform space–time nodes, subject to the activity time cost constraint and the constraint that the platform arrival time should be less than the exit time. Assuming that passengers get off the train immediately when transferring at a transfer station or arriving at their destination station, build the egress walk space–time links between platform space–time nodes and exit space–time nodes, and the transfer walk space–time links between platform space–time nodes. Add these space–time links to $E$ and $E_W$.
    Step 4.3 Build platform waiting space–time links. Connect platform waiting space–time nodes and train departure space–time nodes, subject to the platform wait time constraint. Add these space–time links to $E$.
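The steps above can be sketched on a toy single-line example. This is a simplified illustration under stated assumptions, not the authors' software: the timetable layout, the `(station, kind, time)` node tuples and the function names are hypothetical; transfers are omitted; and the access walk and platform wait are merged into one link, because their split is the subject of the weighted assignment in Section 4.2.

```python
def build_network(timetable, entry, exit_):
    """Build a toy space-time link set for one passenger.

    timetable: {train_id: [(station, arrival_s, departure_s), ...]}
    entry, exit_: (station, time_s) taken from one AFC record.
    Returns (origin, dest, links) with links as tail -> [(head, type)].
    """
    links = {}

    def add(tail, head, kind):
        # Activity time cost constraint: a link may not go back in time.
        if head[2] >= tail[2]:
            links.setdefault(tail, []).append((head, kind))

    origin = (entry[0], "entry", entry[1])
    dest = (exit_[0], "exit", exit_[1])
    for stops in timetable.values():
        for k, (station, arr, dep) in enumerate(stops):
            here_arr = (station, "track", arr)
            here_dep = (station, "track", dep)
            if arr < dep:                    # train dwelling at the station
                add(here_arr, here_dep, "dwell")
            if k + 1 < len(stops):           # train link to the next stop
                nxt_station, nxt_arr, _ = stops[k + 1]
                add(here_dep, (nxt_station, "track", nxt_arr), "train")
            if station == entry[0]:          # merged access walk and wait
                add(origin, here_dep, "access+wait")
            if station == exit_[0]:          # egress walk to the exit gate
                add(here_arr, dest, "egress")
    return origin, dest, links

def trajectories(origin, dest, links):
    """Enumerate every space-time path from origin to dest (depth-first)."""
    stack = [[origin]]
    while stack:
        path = stack.pop()
        if path[-1] == dest:
            yield path
            continue
        for head, _ in links.get(path[-1], []):
            stack.append(path + [head])

# Hypothetical timetable: two trains running from station A to station B.
TIMETABLE = {
    "T1": [("A", 100, 120), ("B", 400, 420)],
    "T2": [("A", 300, 320), ("B", 600, 620)],
}
origin, dest, links = build_network(TIMETABLE, ("A", 60), ("B", 700))
for path in trajectories(origin, dest, links):
    print(path)
```

With a tap-in at 60 s and a tap-out at 700 s, both trains yield a feasible trajectory; an earlier tap-out would rule out the second train because its egress link would go back in time.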

4.2. Weighted Assignment

Since passengers are independent individuals, it can be assumed that the passengers’ maximum likelihood space–time trajectories are independent of each other. It is therefore reasonable to estimate the maximum likelihood space–time trajectory for all passengers one by one. A set of feasible space–time trajectories can be obtained based on the algorithm in Section 4.1. Here, we develop an algorithm to calculate the weight of each feasible space–time path.
By decomposing a passenger’s travel trajectory into links, the train link(s) and egress walk link are fixed for a specific space–time trajectory. That is to say, the train(s) taken by a passenger are determined for a specific trajectory. Thus, the time spent at each station is fixed and can be calculated. What is uncertain is the arrival time at the platform and the PWT. Our aim is to determine the value of $t_{j,p}$, where $j\in N_W$, to minimize the variance among the passenger’s walking speeds and the deviation between the passenger’s walking time and its expected value. The estimation problem can be summarized using the following function.
Problem P1:
$$\min z = \sum_{\substack{(i,j,t_1,t_2)\in E_W \\ j\in N_W\cup\{d_p\}}} z_{i,j,p}^{t_1,t_2}\left[\left(\frac{c_{i,j,p}^{t_1,t_2}}{et_{i,j}}-1\right)^2+\left(\frac{c_{i,j,p}^{t_1,t_2}}{et_{i,j}}\cdot\frac{et_{i,d_p}}{c_{i,d_p,p}^{t_1,t_2}}-1\right)^2\right] \tag{8}$$
$$\text{s.t.}\ \begin{cases} \sum_{(i,j,t_1,t_2)\in E_W} z_{i,j,p}^{t_1,t_2}\cdot c_{i,j,p}^{t_1,t_2} = tt_{j_{\mathrm{station}},p} \\ 0 \le c_{i,j,p}^{t_1,t_2} < h_{j_{\mathrm{station}},t_2}, \qquad i\in N_W,\ j\in N_W\cup\{d_p\} \end{cases} \tag{9}$$
In constraints (9), the equality constraint is the total time consumption constraint at a station. For a specific passenger and a specific space–time trajectory, the entry time, the exit time and the train(s) boarded by him/her are fixed. In other words, the total time consumption is a fixed value once a passenger’s space–time trajectory is assigned. For the origin station, for example, the passenger’s entry time is recorded by the AFC system and the departure time from the origin station is the departure time of the train taken by him/her. Thus, the total time consumed by the passenger’s activities in the origin station is a known value, and the values of $c_{i,j,p}^{t_1,t_2}$ at different stations are independent of each other. Thus, Problem P1 is a single-quadratic programming problem with a given range. Problem P1 can be solved easily using the monotonicity of the objective function.
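To make the single-variable structure concrete, the sketch below solves one station's subproblem for the access walk time. It is an illustration under assumptions rather than the authors' algorithm: the egress walk is treated as fixed, the walk time plus the platform wait must equal the total station time, the wait lies between zero and the headway, and the open bound is treated as closed for simplicity. The function name `best_walk_time` and its parameters are hypothetical. Since the objective is a convex quadratic in the walk time, clipping its unconstrained minimizer to the feasible interval gives the constrained optimum.

```python
def best_walk_time(total_time, headway, expected, egress_actual, egress_expected):
    """Choose the access walk time that minimizes the two squared terms.

    total_time: seconds between tap-in and the boarded train's departure.
    headway: seconds between that train and the previous departure.
    expected: surveyed expected access walk time (seconds).
    egress_actual, egress_expected: fixed egress walk time and its
        surveyed expectation (seconds).
    Returns (walk_time, weight).
    """
    a = 1.0 / expected
    b = egress_expected / (expected * egress_actual)
    # f(c) = (a*c - 1)^2 + (b*c - 1)^2 is convex with this minimizer.
    unconstrained = (a + b) / (a * a + b * b)
    # wait = total_time - c must satisfy 0 <= wait <= headway.
    lower = max(0.0, total_time - headway)
    upper = total_time
    walk = min(max(unconstrained, lower), upper)
    weight = (a * walk - 1.0) ** 2 + (b * walk - 1.0) ** 2
    return walk, weight

walk, weight = best_walk_time(200, 300, 60, 90, 90)
print(walk, weight)
```

When the total station time is shorter than the expected walk time, or longer than the expected walk time plus one headway, the optimum sits on the boundary of the interval and the weight becomes positive.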

5. Numerical Experiments

This section shows the numerical experiment conducted using the proposed method based on real-world data from the URT network of Beijing, China between 10:00 a.m. and 12:00 a.m. We developed a software system using C#, Windows Presentation Foundation (WPF) and human–computer interaction technology. The system aims to provide the subway corporations with an intelligent tool to manage URT basic data, assign network flow, analyze the characteristics of the network flow distribution and allocate ticket income. Figure 5 shows the real-world Beijing URT network topology formulated by our software.
As shown in Figure 5, the Beijing URT network consisted of 17 lines and 338 stations. Two subway lines, Line 4 and Line 14, are operated and managed by Beijing MTR Corporation. The remaining 15 subway lines are operated and managed by Beijing Subway. There are 53 transfer stations in total where passengers can interchange from one subway line to another. A transfer station is the intersection of the subway lines serving it and is represented by a node. Three of them are served by three subway lines, and the remaining fifty transfer stations are each served by two subway lines. In addition, the airport line is independent of the other subway lines. That is, passengers must swipe their smart cards when transferring to or from the airport line.
One of our aims is to confirm whether the computation time required is practical. Another aim is to verify the effectiveness of the method. First, we present the data used in this numerical experiment. Second, we show the process of how to estimate the maximum likelihood space–time trajectory.

5.1. Input Data

The numerical experiment employs train timetable data and AFC transaction data obtained from the Beijing Subway. Selected portions of the train timetable data are given in Table 4. The time-expanded network is constructed based on these data.
Table 5 shows part of the AFC transaction data observed on 9 May 2016. The AFC transaction data include the card ID, origin station, destination station, entry time and exit time. The card ID is the unique identifier of a smart card and represents a passenger. It is critical for matching entry information with exit information. The entry time and exit time are recorded by the AFC system when passengers pass the entry gate or exit gate and swipe their smart cards. They are accurate and recorded to the nearest second. The time format is hh:mm:ss.
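A record with these five fields can be turned into a journey time with a few lines of code. The sketch below assumes a comma-separated layout in the field order listed above; that layout, the function name `parse_afc_record` and the sample record are hypothetical and are not taken from Table 5. Journeys that cross midnight are not handled.

```python
from datetime import datetime

def parse_afc_record(line):
    """Parse one AFC record: card ID, origin, destination, entry, exit.

    Times are expected in hh:mm:ss, as stated in the text. The
    comma-separated layout is an assumption made for this sketch.
    """
    card_id, origin, destination, t_in, t_out = [
        field.strip() for field in line.split(",")
    ]
    entry = datetime.strptime(t_in, "%H:%M:%S")
    exit_ = datetime.strptime(t_out, "%H:%M:%S")
    return {
        "card_id": card_id,
        "origin": origin,
        "destination": destination,
        "entry": t_in,
        "exit": t_out,
        # Gate-to-gate journey time in seconds (same-day journeys only).
        "journey_seconds": int((exit_ - entry).total_seconds()),
    }

# Made-up record used only to exercise the parser.
record = parse_afc_record("0001, StationA, StationB, 10:05:30, 10:47:10")
print(record["journey_seconds"])
```

The journey time computed here is the right-hand side of the total time consumption constraint for that passenger.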

5.2. Results and Discussion

In this study, we present the estimation results from several aspects and compare them with the manual survey results, including the travel time parameters and flow statistics.

5.2.1. Computational Efficiency

Table 6 summarizes the estimation results. We used a computer with an Intel Xeon E5-2640 (2.4 GHz) processor and 8 GB of memory; this system took 4.2 min to estimate trajectories for all AFC transactions between 10:00 a.m. and 12:00 p.m. In terms of calculation time, the proposed methodology is acceptable for routine daily processing of transaction data.
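The figures in Table 6 can be turned into two simple indicators, throughput and the share of trips without an estimated path. The short calculation below is only an arithmetic illustration of the reported numbers; the variable names are our own.

```python
# Figures reported in Table 6.
TRIPS = 320_228          # number of trips in the two-hour window
UNESTIMATED = 8_629      # trips for which no path was estimated
CALC_MINUTES = 4.2       # reported calculation time

trips_per_second = TRIPS / (CALC_MINUTES * 60)
unestimated_share = UNESTIMATED / TRIPS
estimated_trips = TRIPS - UNESTIMATED

print(f"{trips_per_second:.0f} trips/s, {unestimated_share:.2%} without a path")
```

In other words, roughly 1270 trips are processed per second, and a path is estimated for all but about 2.7% of the trips.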

5.2.2. Detailed Space–Time Trajectory Information

A space–time trajectory reflects not only the spatial route chosen by a passenger but also the trains that the passenger takes. Figure 6 shows the set of feasible space–time trajectories for the passenger whose card ID is 15093452342. The times marked in the figure indicate when the passenger arrives at each spatial location, with the hour omitted.
As shown in Figure 6, there are four feasible space–time trajectories in total for this passenger given his/her entry and exit information. The first three trajectories (Figure 6a–c) indicate that the passenger transferred at Nanluoguxiang subway station; they differ in the time spent on walk activities and in the trains boarded. The last trajectory, however, identifies Chaoyangmen subway station as the transfer station. The detailed space–time trajectory information is given in Table 7, and Table 8 shows the expected walk times obtained by the manual survey.
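The idea of a feasible set bounded by the tap-in and tap-out times can be sketched for the simplest case, a direct journey on one train. The snippet below is an illustration under stated assumptions, not the time-expanded network procedure used in this paper: the timetable, the train labels and the names `TrainRun` and `feasible_runs` are all hypothetical, and transfers are ignored.

```python
from typing import NamedTuple


class TrainRun(NamedTuple):
    train: str      # hypothetical train label
    depart_s: int   # departure from the origin platform, seconds since midnight
    arrive_s: int   # arrival at the destination platform, seconds since midnight


def feasible_runs(entry_s, exit_s, runs, min_access_s=0, min_egress_s=0):
    """Keep the runs that fit between tap-in and tap-out.

    Returns (train, access_plus_wait_s, egress_s) for each feasible run.
    """
    result = []
    for run in runs:
        access_plus_wait = run.depart_s - entry_s   # walk to platform plus wait
        egress = exit_s - run.arrive_s              # walk from platform to gate
        if access_plus_wait >= min_access_s and egress >= min_egress_s:
            result.append((run.train, access_plus_wait, egress))
    return result


# Hypothetical passenger: tap in at 10:00:00 (36,000 s), tap out at 10:30:00 (37,800 s).
runs = [
    TrainRun("A1", 35_950, 37_000),  # departs before the passenger taps in
    TrainRun("A2", 36_120, 37_500),  # fits the window
    TrainRun("A3", 36_420, 37_740),  # fits the window, with a short egress walk
    TrainRun("A4", 36_720, 37_900),  # arrives after the passenger taps out
]
candidates = feasible_runs(36_000, 37_800, runs)
```

Only runs A2 and A3 survive the filter. Each surviving run implies a different split of the journey between access, waiting and egress, which is the information the estimation model then evaluates.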
According to Table 7 and Table 8, the weights of the four feasible trajectories can be calculated as 1.54, 8.63, 4.89 and 76.24. The first trajectory has the smallest weight and is therefore the maximum likelihood space–time trajectory. Relative to the first trajectory, the relationship between access walk time and transfer walk time is exactly the opposite. The other two trajectories have the same problem in terms of walk time allocation. In addition, it is unreasonable that, under the last trajectory, the passenger would need almost ten times the expected egress walk time to tap out. In contrast, the walk time distribution in the first trajectory is much more reasonable. On the one hand, the actual walk times are closer to their expected values. On the other hand, this trajectory keeps the passenger's access walking speed and egress walking speed as consistent as possible.

5.2.3. Estimation Result of Walk Time Parameters

Walk time parameters are among the most important parameters of a subway station. In general, they comprise the access walk time and egress walk time; for a transfer station, the transfer walk time is also included. The parameters are related to the station layout and infrastructure properties, such as the length and width of passageways, and they determine the walk time that passengers spend at a station. In this paper, we assume that all passengers are independent individuals and that their space–time trajectories do not interfere with each other. Furthermore, individual differences are considered. Table 9 compares the access walk time parameters estimated with the proposed method to the parameters based on manual survey observations for the top six stations in tap-in passenger flow.
As shown in Table 9, the relative deviations between the estimation results and the manual survey are approximately 5% in magnitude (between 3.33% and 5.69%). The overall difference between the manual survey and the estimation results is small and can be tolerated.
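The relative deviations in Table 9 can be recomputed from the survey and estimated values. The sketch below assumes that the deviation is defined as (estimate − survey) / survey; the paper does not state the formula explicitly, but this definition reproduces the signs and magnitudes in the table up to rounding in the last digit.

```python
# (station, survey seconds, estimated seconds) copied from Table 9.
TABLE_9 = [
    ("Beijing West Railway station", 90, 93),
    ("Beijing Railway station", 39, 41),
    ("Beijing South Railway station", 123, 116),
    ("Xizhimen", 327, 314),
    ("Tiantongyuanbei", 47, 49),
    ("Dongzhimen", 222, 233),
]


def relative_deviation(survey: float, estimate: float) -> float:
    """Signed deviation of the estimate from the survey value, as a fraction."""
    return (estimate - survey) / survey


deviations = {name: relative_deviation(s, e) for name, s, e in TABLE_9}
mean_abs = sum(abs(d) for d in deviations.values()) / len(deviations)

for name, d in deviations.items():
    print(f"{name}: {d:+.2%}")
print(f"mean absolute deviation: {mean_abs:.2%}")
```

Under this definition the mean absolute deviation over the six stations is between 4% and 5%, in line with the "approximately 5%" stated above.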
Figure 7 shows the access walk time distribution of four different subway stations. Among them, there is one regular station, the Beijing Railway station, and three transfer stations. The Beijing West Railway station and Beijing South Railway station are transfer stations accessed by two subway lines, while Xizhimen is a transfer station accessed by three subway lines.
According to Figure 7, the distribution of the access walk time resembles a normal distribution. The time that passengers spend walking from the entry gate to the platform is concentrated within a certain range. Comparing the four distributions, we find that the variance of the access walk time is smallest at the Beijing Railway station and largest at Xizhimen station. This result arises because the layout of a regular station is typically simpler than that of a transfer station. Relative to transfer stations, regular stations generally have fewer entrances and exits. In addition, passengers at transfer stations have more ways to reach the platform, including both access and transfer passageways. These factors lead to a greater variance of access walk time at transfer stations.
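The comparison of spreads described above amounts to computing the sample mean and variance of the estimated access walk times per station. The following sketch shows that calculation; the sample values are hypothetical and chosen only for illustration, and they are not data from this study.

```python
from statistics import mean, variance

# Hypothetical access walk times in seconds; NOT data from the paper.
samples = {
    "regular station": [36, 38, 39, 40, 41, 43],
    "transfer station": [250, 290, 310, 330, 360, 400],
}

# Sample mean and sample variance (n - 1 denominator) for each station.
summary = {
    name: {"mean_s": mean(values), "variance_s2": variance(values)}
    for name, values in samples.items()
}
```

With these illustrative numbers the regular station has a mean of 39.5 s and a much smaller variance than the transfer station, which mirrors the qualitative pattern reported for Figure 7.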
Combining the comparison in Table 9 with the estimation results shown in Figure 7, it is clear that the estimation results are consistent with the survey results. While the manual survey is time-consuming and expensive, the proposed method yields a satisfactory estimate of the walk time based on the AFC transaction data and train schedule.

5.2.4. Distribution of the URT Network Passenger Flow

The distribution of the URT network passenger flow is one of the most important characteristics of the URT network. This distribution reflects the time-dependent travel demand and its trend from the macroscale perspective. At a microscale level, the URT network passenger flow is made up of individual passengers. In other words, the passengers' space–time trajectories compose the time-dependent distribution of the URT network passenger flow. Figure 8 presents the passenger flow distribution of the URT network between 10:30 a.m. and 11:30 a.m. The thickness of each line indicates the passenger flow volume, and the color indicates the load carried by the subway section.
According to Figure 8, the load of every section is less than 1, and the majority of passengers are travelling downtown in the study period. The most congested section is the Caishikou–Xuanwumen section, whose load is almost 1. The relatively congested segments are located mainly around the central business district (CBD) and railway stations, such as Beijing South Railway station and the Sanlitun CBD. The passenger traffic in these segments is approximately 7000 to 9000 persons per hour, and the load rate is about 50%. The load rate of segments in the suburbs is generally around 35% or less.
Table 10 compares the estimated results of section passenger flow and the clearing results provided by the Beijing Transportation Operations Coordination Center (TOCC) for the top five subway sections in terms of passenger flow.
At the network level, according to Table 10, the estimated results are consistent with the clearing results from TOCC, which indicates that the network passenger flow distribution estimated with the developed method is accurate at the macroscale level. The difference, relative to the clearing results provided by TOCC, lies at the microscale level: the proposed method estimates the maximum likelihood space–time trajectories for all passengers. In other words, in addition to the network passenger flow distribution, every passenger's detailed space–time trajectory can be estimated.
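The link between the two levels can be sketched as a simple aggregation: once each passenger has an estimated trajectory, the flow on a section is the number of trajectories that traverse it. The snippet below illustrates this with hypothetical trajectories reduced to ordered station lists; the function name `section_flows` is our own, and walk and wait links as well as the time dimension are omitted for brevity.

```python
from collections import Counter

# Hypothetical estimated trajectories: each is the ordered list of stations
# a passenger rides through (walk and wait links omitted).
trajectories = [
    ["Caishikou", "Xuanwumen", "Xidan"],
    ["Caishikou", "Xuanwumen"],
    ["Xuanwumen", "Xidan"],
]


def section_flows(paths):
    """Count passengers on each directed section (pair of consecutive stations)."""
    flows = Counter()
    for path in paths:
        flows.update(zip(path, path[1:]))
    return flows


flows = section_flows(trajectories)
```

Section totals obtained in this way are the kind of quantity that is compared with the TOCC clearing results in Table 10, while the underlying trajectories retain the per-passenger detail.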

6. Conclusions

Network passenger flow is one of the most important characteristics of URT and the basis for train scheduling. Thus, it is important for subway operators to characterize the time-dependent network passenger distribution and enhance the efficiency of the URT system. This paper proposed a data-driven methodology to estimate the maximum likelihood space–time trajectory based on bulk transaction data. The method uses expected values of station walk time as input data instead of the distribution function of the station walk time. This approach reduces the challenges associated with data acquisition and improves the accuracy of estimation results.
Furthermore, the estimation results describe each passenger's travel in detail, including the actual access walk time, the platform wait time and the trains boarded. The passenger's path-choice behavior can be estimated from the detailed space–time trajectories, which is valuable for operators when dispatching trains, especially in the case of unexpected incidents. Moreover, the train load and station congestion can also be inferred.
Future research will focus on three major areas. First, we will extend the model to incorporate passengers left behind due to crowding and improve the methodology so that the space–time trajectory can be estimated without station walk time parameters. Second, we will consider arrivals with a time variance, with the aim of developing a practical method to solve the estimation problem for a whole day. Finally, we aim to optimize the URT timetable based on the estimation results.

Author Contributions

Conceptualization, L.Z.; Investigation, L.L.; Methodology, X.C.; Project administration, L.Z.; Resources, L.L.; Software, X.C. and L.S.Z.; Supervision, Y.Y.; Visualization, X.C.; Writing–original draft, X.C.; Writing–review & editing, Y.Y.X. and Y.Z.

Acknowledgments

This work is financially supported by the Fundamental Research Funds for the Central Universities (grant: 2018RC012), the Science and Technology Research Program of China Railway Corporation (grant: 2016X005-D) and the Fundamental Research Funds (grant: 2017JBZ001). The real data in this paper are based on research supported by the Beijing TOCC. We extend our sincere gratitude to all the reviewers for their careful review.

Conflicts of Interest

The authors declared no potential conflicts of interest with respect to the research, authorship, and publication of this article.

References

  1. Tong, C.O.; Wong, S.C. A stochastic transit assignment model using a dynamic schedule-based network. Transp. Res. Part B Methodol. 1999, 33, 107–121.
  2. Tong, C.O.; Wong, S.C. A schedule-based time-dependent trip assignment model for transit networks. J. Adv. Transp. 1999, 33, 371–388.
  3. Moller-Pedersen, J. Assignment model for timetable based systems (TPSCHEDULE). In Proceedings of the Seminar F, 27th European Transportation Forum, Cambridge, UK, 27–29 September 1999; pp. 159–168.
  4. Nielsen, O.A.; Jovicic, G. A large-scale stochastic timetable-based transit assignment model for route and submode choices. In Proceedings of the Seminar F, 27th European Transportation Forum, Cambridge, UK, 27–29 September 1999; pp. 169–184.
  5. Xu, R.; Luo, Q.; Gao, P. Passenger flow distribution model and algorithm for urban rail transit network based on multi-route choice. J. China Railway Soc. 2009, 31, 110–114.
  6. Poon, M.H.; Wong, S.C.; Tong, C.O. A dynamic schedule-based model for congested transit networks. Transp. Res. Part B Methodol. 2004, 38, 343–368.
  7. Hamdouch, Y.; Lawphongpanich, S. Schedule-based transit assignment model with travel strategies and capacity constraints. Transp. Res. Part B Methodol. 2008, 42, 663–684.
  8. Ben-Akiva, M.E.; Gao, S.; Wei, Z.; Wen, Y. A dynamic traffic assignment model for highly congested urban networks. Transp. Res. Part C Emerg. Technol. 2012, 24, 62–82.
  9. Cepeda, M.; Cominetti, R.; Florian, M. A frequency-based assignment model for congested transit networks with strict capacity constraints: Characterization and computation of equilibria. Transp. Res. Part B Methodol. 2006, 40, 437–459.
  10. Connors, R.D.; Sumalee, A. A network equilibrium model with travellers’ perception of stochastic travel times. Transp. Res. Part B Methodol. 2009, 43, 614–624.
  11. Leurent, F.; Chandakas, E.; Poulhès, A. A passenger traffic assignment model with capacity constraints for transit networks. In Proceedings of the 15th Meeting of the EURO Working Group on Transportation, Paris, France, 10–13 September 2012.
  12. Schmocker, J.-D.; Bell, M.G.H.; Kurauchi, F. A quasi-dynamic capacity constrained frequency-based transit assignment model. Transp. Res. Part B Methodol. 2008, 42, 925–945.
  13. Nuzzolo, A.; Crisalli, U.; Rosati, L. A schedule-based assignment model with explicit capacity constraints for congested transit networks. Transp. Res. Part C Emerg. Technol. 2012, 20, 16–33.
  14. Leurent, F.; Chandakas, E.; Poulhes, A. A traffic assignment model for passenger transit on a capacitated network: Bi-layer framework, line sub-models and large-scale application. Transp. Res. Part C Emerg. Technol. 2014, 47, 3–27.
  15. Lu, C.-C.; Liu, J.; Qu, Y.; Peeta, S.; Rouphail, N.M.; Zhou, X. Eco-system optimal time-dependent flow assignment in a congested network. Transp. Res. Part B Methodol. 2016, 94, 217–239.
  16. Fu, Q.; Liu, R.; Hess, S. A review on transit assignment modelling approaches to congested networks: A new perspective. Soc. Behav. Sci. 2012, 54, 1145–1155.
  17. Tong, C.O.; Wong, S.C. A predictive dynamic traffic assignment model in congested capacity-constrained road networks. Transp. Res. Part B Methodol. 2000, 34, 625–644.
  18. Bekhor, S.; Chorus, C.; Toledo, T. Stochastic User Equilibrium for Route Choice Model Based on Random Regret Minimization. Transp. Res. Rec. 2012, 100–108.
  19. Van Der Hurk, E.; Kroon, L.; Maróti, G.; Vervest, P. Deduction of Passengers’ Route Choices from Smart Card Data. IEEE Trans. Intell. Transp. Syst. 2015, 16, 430–440.
  20. Shi, J.; Zhou, F.; Zhu, W.; Xu, R. Estimation method of passenger route choice proportion in urban rail transit based on AFC data. J. Southeast Univ. (Natl. Sci. Ed.) 2015, 1, 184–188.
  21. Kusakabe, T.; Iryo, T.; Asakura, Y. Estimation method for railway passengers’ train choice behavior with smart card transaction data. Transportation 2010, 37, 731–749.
  22. Sun, Y.; Schonfeld, P.M. Schedule-Based Rail Transit Path-Choice Estimation using Automatic Fare Collection Data. J. Transp. Eng. 2016, 142, 04015037.
  23. Zhou, F.; Xu, R. Model of passenger flow assignment for urban rail transit based on entry and exit time constraints. Transp. Res. Rec. 2012, 2284, 57–61.
  24. Dai, X.; Tu, H.; Sun, L. A Multi-modal Evacuation Model for Metro Disruptions: Based on Automatic Fare Collection Data in Shanghai, China. In Proceedings of the 95th Annual Meeting of the Transportation Research Board, Washington, DC, USA, 10–14 January 2015.
  25. Ma, X.; Wu, Y.J.; Wang, Y.; Chen, F.; Liu, J. Mining smart card data for transit riders’ travel patterns. Transp. Res. Part C Emerg. Technol. 2013, 36, 1–12.
  26. Zhu, W.; Hu, H.; Huang, Z. Calibrating Rail Transit Assignment Models with Genetic Algorithm and Automated Fare Collection Data. Comput.-Aided Civ. Infrastr. Eng. 2014, 29, 518–530.
  27. Chen, X.; Zhou, L.; Tang, J.; Zhou, H. Estimating the most likely space-time path by mining automatic fare collection data. In Proceedings of the 98th Annual Meeting of the Transportation Research Board, Washington, DC, USA, 13–17 January 2018.
  28. Zhu, Y.; Koutsopoulos, H.N.; Wilson, N.H.M. A probabilistic Passenger-to-Train Assignment Model based on automated data. Transp. Res. Part B Methodol. 2017, 104, 522–542.
  29. Zhu, Y. Passenger-to-Train Assignment Model Based on Automated Data. Master’s Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA. Available online: https://dspace.mit.edu/handle/1721.1/90075 (accessed on 27 May 2018).
Figure 1. Topology of the Urban Rail Transit (URT) and a selected passenger’s travel routes.
Figure 2. Different space–time paths of different routes.
Figure 3. Build space–time trajectory network procedures.
Figure 4. Build space–time trajectory network.
Figure 5. Beijing URT topology network.
Figure 6. Set of feasible space–time trajectories.
Figure 7. Distribution of access walk times.
Figure 8. Network passenger flow distribution between 10:30 a.m. and 11:00 a.m.
Table 1. Comparison of key elements in network assignment problems.

| Publications | Objective Function | Solution Algorithm | Data: NT | Data: TSD | Data: AFCTD |
|---|---|---|---|---|---|
| Tong and Wong (1999a) [1] | Minimizing generalized cost | BBA | | | |
| Tong and Wong (1999b) [2] | Minimizing travel time consumed | TDNL | | | |
| Xu et al. (2009) [5] | Minimizing path impedance | MRA | | | |
| Poon et al. (2004) [6] | Equilibrating network flow | MSA | | | |
| Kusakabe et al. (2010) [21] | Minimizing the cost | DA | | | |
| Bekhor et al. (2012) [18] | Minimizing random regret | MSA | | | |
| Van der Hurk et al. (2015) [19] | Minimizing the deviation between path and conductor checks | BFA | | | |
| Chen et al. (2018) [27] | Maximizing the probability of a space–time | DA | | | |
| This paper | Minimizing the variances among the time cost of walk activities | SQP | | | |

Abbreviation descriptions: Solution algorithm: MRA—Multi-Route Assignment; BBA—Branch and Bound Algorithm; TDNL—Time-Dependent Network Loading; MSA—Method of Successive Averages; BFA—Bellman-Ford Algorithm; DA—Dijkstra Algorithm; SQP—Single-Quadratic Programming. Data: NT—Network Topology; TSD—Train Schedule Data; AFCTD—AFC Transaction Data.
Table 2. Subscripts and parameters used in the mathematical formulation.

| Symbol | Definition |
|---|---|
| $S$ | Set of URT stations |
| $C$ | Set of URT connections, including sections and transfer links |
| $N$ | Set of spatial nodes |
| $N_W$ | Set of spatial nodes that represent the platform, $N_W \subseteq N$ |
| $V$ | Set of space–time nodes |
| $E$ | Set of space–time links |
| $E_W$ | Set of walk space–time links |
| $P$ | Set of passengers |
| $T$ | Set of activity times |
| $L$ | Set of URT lines |
| $V$ | Set of URT trains |
| $s, s'$ | Index of stations, $s, s' \in S$ |
| $p$ | Index of passenger, $p \in P$ |
| $i, j$ | Index of spatial nodes, $i, j \in N$ |
| $t_1, t_2$ | Index of different time stamps, $t_1, t_2 \in T$ |
| $(i, t_1), (j, t_2)$ | Index of space–time nodes, $(i, t_1), (j, t_2) \in V$ |
| $(i, j)$ | Index of URT connection, $(i, j) \in C$ |
| $(i, j, t_1, t_2)$ | Index of space–time link indicating the actual movement at the entering time $t_1$ and leaving time $t_2$ on the spatial link $(i, j)$, $(i, j, t_1, t_2) \in E$ |
| $s_{name}$ | Name of station $s$, $s \in S$ |
| $i_{station}$ | Index of the station that spatial node $i$ belongs to, $i \in N$ |
| $o_p, d_p$ | Index of origin node and destination node of passenger $p$, $p \in P$, $o_p, d_p \in N$ |
| $c_{i,j,p}^{t_1,t_2}$ | Time cost for passenger $p$ on link $(i, j, t_1, t_2)$, $(i, j, t_1, t_2) \in E$, $p \in P$ |
| $et_{i,j}$ | Expected value of the time cost of link $(i, j)$, $(i, j) \in C$ |
| $w_{i,j}^{t_1,t_2}$ | 0–1 binary variable: 1 if passengers must walk from node $i$ to node $j$; 0 otherwise, $(i, j, t_1, t_2) \in E$ |
| $h_{s,t}$ | Train departure interval at station $s$ at time $t$, $s \in S$, $t \in T$ |
| $tt_{s,p}$ | Total time consumed by passenger $p$ at station $s$, $s \in S$, $p \in P$ |

Table 3. Decision variables used in the mathematical formulation.

| Variable | Definition |
|---|---|
| $t_{i,p}$ | Time at which passenger $p$ arrives at node $i$, $p \in P$, $i \in N$ |
| $z_{i,j,p}^{t_1,t_2}$ | 0–1 binary variable: 1 if passenger $p$ passes spatial link $(i, j)$ from time stamp $t_1$ at node $i$ to $t_2$ at node $j$; 0 otherwise |

Table 4. Selected train timetable data.

| Line Name | Train Num | Station | Arrival Time | Departure Time |
|---|---|---|---|---|
| 1 | 102116 | Dongdan | 09:51:07 | 09:51:52 |
| 1 | 102116 | Xidan | 09:42:59 | 09:43:33 |
| 1 | 102116 | Fuxingmen | 09:39:59 | 09:40:49 |
| 9 | 091070 | Beijing West Railway | 10:39:28 | 10:40:28 |
| 9 | 091070 | Liuliqiao | 10:44:58 | 10:45:58 |
| 9 | 091070 | Qilizhuang | 10:48:25 | 10:49:10 |

Table 5. Automatic Fare Collection (AFC) transaction data.

| Card ID | Origin | Entry Time | Destination | Exit Time |
|---|---|---|---|---|
| 15093451177 | Huilongguan | 10:14:00 | Fuchengmen | 10:56:24 |
| 15093451188 | Tiantongyuanbei | 11:03:00 | Beijing Railway | 11:53:00 |
| 15093452342 | Dalianpo | 11:35:00 | Guloudajie | 12:18:28 |
| 15093452355 | Chongwenmen | 11:49:00 | Guloudajie | 12:19:16 |
| 15093451636 | Jishuitan | 10:48:00 | Andingmen | 10:55:03 |
| 15093451646 | Huixi North | 11:16:00 | Beijing Railway | 11:50:04 |
| 15093451660 | Changchunjie | 11:44:00 | Fuchengmen | 11:54:04 |
| 15093450644 | Dalianpo | 10:08:00 | Fuchengmen | 10:56:02 |
| 15093451221 | Jishuitan | 11:42:00 | Xizhimen | 11:50:21 |
| 15093452373 | Jishuitan | 11:57:00 | Chongwenmen | 12:20:20 |
| 15093451672 | Wangfujing | 11:21:00 | Jishuitan | 11:53:21 |
| 15093450259 | Dajing | 10:55:00 | Yonghegong | 11:52:16 |

Table 6. Summary of estimation results.

| Item | Value |
|---|---|
| Number of hours | 2 |
| Number of OD pairs | 117,992 |
| Number of trips | 320,228 |
| Number of trips for which no path was estimated | 8629 |
| Calculation time (min) | 4.2 |

Table 7. Detailed space–time trajectory information.

| No. | Transfer Station | Board Train | Access Time | Transfer Time | Egress Time |
|---|---|---|---|---|---|
| 1 | Nanluoguxiang | 231111, 292110 | 144 s | 300 s | 185 s |
| 2 | Chaoyangmen | 231111, 102127 | 141 s | 336 s | 173 s |
| 3 | Nanluoguxiang | 251112, 292110 | 323 s | 151 s | 185 s |
| 4 | Chaoyangmen | 231111, 272126 | 141 s | 197 s | 443 s |

Table 8. Walk time expectation at each subway station.

| Walk Time Parameter | Expected Value (s) |
|---|---|
| Access time of Dalianpo (150996535) | 120 |
| Transfer time of Nanluoguxiang (150996519) | 300 |
| Transfer time of Chaoyangmen (150996523) | 285 |
| Egress time of Guloudajie (150995473) | 46 |
| Egress time of Guloudajie (150997027) | 90 |

Table 9. Access walk time parameter comparison table.

| Station | Survey (s) | Estimation (s) | Relative Deviation |
|---|---|---|---|
| Beijing West Railway station | 90 | 93 | 3.33% |
| Beijing Railway station | 39 | 41 | 5.13% |
| Beijing South Railway station | 123 | 116 | −5.69% |
| Xizhimen | 327 | 314 | −3.98% |
| Tiantongyuanbei | 47 | 49 | 4.26% |
| Dongzhimen | 222 | 233 | 4.96% |

Table 10. Section passenger flow comparison results.

| Section Name | TOCC | Estimate | Relative Deviation |
|---|---|---|---|
| Caishikou–Xuanwumen | 6369 | 6290 | −1.24% |
| Jintailu–Hujialou | 6055 | 5946 | −1.8% |
| Guomao–Yong’anli | 5789 | 5815 | 0.45% |
| Yong’anli–Jianguomen | 5708 | 5724 | 0.28% |
| Dawanglu–Guomao | 5468 | 5469 | 0.02% |

Share and Cite

MDPI and ACS Style

Chen, X.; Zhou, L.; Yue, Y.; Zhou, Y.; Liu, L. Data-Driven Method to Estimate the Maximum Likelihood Space–Time Trajectory in an Urban Rail Transit System. Sustainability 2018, 10, 1752. https://doi.org/10.3390/su10061752
