1. Introduction
When operation suffers a disturbance, prompt rescheduling measures must be taken to maintain the robustness of the metro system. In the past few years, train rescheduling has attracted considerable attention from researchers, and many approaches with different model formulations have been developed.
In some studies, the train rescheduling problem is converted into a mathematical programming problem that aims to return operation to normal as soon as possible by altering running and dwelling times. Usually, the most important goal is to maintain as much service as possible for customers [1]. In early work, classical optimization methods were used for rail transit train regulation to describe passenger perception of service quality [2]. In the study by D’Ariano [3], train scheduling was viewed as a job shop scheduling problem with no-store constraints and was modeled with the alternative graph formulation; a branch-and-bound algorithm was used to obtain an optimized solution. A mixed integer programming model was established in Ref. [4] to minimize the impact of incidents with a heuristic algorithm. Some studies have addressed train regulation problems with high nonlinearity, heavy constraints, and stochastic characteristics, such as Ref. [5]. In addition, efficient train operation control algorithms were presented in Refs. [6,7] in response to growing concerns about environmental protection.
In another type of research, the train rescheduling model is established based on discrete event dynamic systems theory. The discrete-time traffic system was described earlier in Ref. [8], which used state feedback control algorithms to optimize system performance and ensure system stability. Then, a discrete event model was adopted to handle perturbations in the railway network [9]. Recently, discrete methods have received more attention. In Ref. [10], the subway line was characterized through the state transitions of train positions on the basis of discrete events. A timed colored Petri net was adopted in Ref. [11] to describe a railway system with double-track lines. A Markov decision framework was proposed by Yin et al. [12] to deal with uncertain disturbances in real-time train operation.
Currently, the headway between trains has become smaller with growing passenger demand, which places higher requirements on train rescheduling. Many techniques have been tried to improve computational efficiency. A general genetic algorithm was applied in Ref. [13] to obtain optimized train rescheduling solutions. For the same problem, Krasemann [14] described a greedy heuristic that performs a depth-first search and branches according to a set of criteria. The Problem Space Search (PSS) meta-heuristic was used in Ref. [15] to generate a revised timetable quickly for large-scale problems. In Ref. [6], approximate dynamic programming was proposed to solve the stochastic program and obtain a high-quality solution in a shorter time than the MIP solver. In Ref. [16], based on a standard event-based MILP formulation, the problem was solved with ad-hoc heuristic preprocessing on top of a general-purpose commercial solver.
With high-frequency operation and high traffic density, the metro system is now more sensitive to disturbances and less stable than the traditional system. During rush hour, passenger demand is high and can even exceed the transportation capacity, so the dwelling time is often extended by passengers squeezing in, which leads to a departure delay. Although the metro system is commonly equipped with ATC (Automatic Train Control), which can improve punctuality by altering the travel speed profile, ATC has limited ability to respond to the dynamics of passenger flow and to clear accumulated passengers. Therefore, train operation combined with demand management is needed in practice.
In this context, a joint optimal train regulation and passenger flow control model was first developed by Li et al. [17], aiming to improve headway regularity and commercial speed under perturbations, based on the assumption that the dwelling time of each train is affected by boarding and alighting passengers. The paper defined a state vector consisting of the operation error and the passenger loading error to describe the linear time-varying system. To minimize the system error, train regulation and passenger control measures are adopted jointly to adjust the running time and dwelling time. The simplified joint dynamic model described the evolution of the departure time and the passenger loading in matrix form. However, the formulation cannot reflect some feature variables, such as the number of passengers left on the platform. It also ignored the total delay of passengers, which is one of the important performance measures of rescheduling problems. In addition, the proposed model is only applicable to slight delays within a certain range.
Therefore, to minimize the total delay of passengers while accounting for service quality and adjustment costs under dynamic passenger flow, we propose a train regulation model combined with passenger control under the discrete Markov decision process framework. Moreover, we take the uncertainty in the dwelling process into account. Similarly, the running time and the number of controlled passengers are selected as the two decision variables in our study.
In principle, the Markov decision problem can be solved by dynamic programming algorithms, such as value iteration and policy iteration [18]. However, the rescheduling problem is high-dimensional and involves a large number of variables, which renders such algorithms infeasible. To address this problem, Approximate Dynamic Programming (ADP) is applied in our paper. ADP was developed by Powell to overcome the curse of dimensionality [19]. The method has been widely applied in various sequential stochastic optimization problems, such as the network capacity control problem in Ref. [20], supply risk management in Ref. [21], and transshipment policy optimization in Ref. [22]. In our study and experiments, the dynamic operation of the metro system is described explicitly through the Markov decision model, and the ADP method helps us lower the dimensions of the variables. With different scenario settings, the experimental results demonstrate fast convergence on a large-scale problem.
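To make the exact dynamic programming baseline mentioned above concrete, the following is a minimal sketch of value iteration on a toy two-state problem. The states, actions, costs, and transition probabilities are hypothetical and chosen only for illustration; they are not the metro model of this paper. The sketch also shows why the exact method scales poorly: every sweep enumerates all states and all actions.

```python
def value_iteration(states, actions, transition, cost, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Return (values, policy) minimising the expected discounted cost."""
    values = {s: 0.0 for s in states}

    def q(s, a):
        # One-step cost plus discounted expected value of the successor state.
        return cost(s, a) + gamma * sum(p * values[s2] for s2, p in transition(s, a).items())

    for _ in range(max_iter):
        # Full sweep over every state and action (the step that becomes infeasible at scale).
        new_values = {s: min(q(s, a) for a in actions(s)) for s in states}
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tol:
            break

    policy = {s: min(actions(s), key=lambda a: q(s, a)) for s in states}
    return values, policy


# Hypothetical toy problem: a train is either "late" or "on_time".
STATES = ["late", "on_time"]


def actions(s):
    return ["hold", "speed_up"] if s == "late" else ["hold"]


def transition(s, a):
    if s == "on_time":
        return {"on_time": 1.0}
    if a == "speed_up":
        return {"on_time": 0.5, "late": 0.5}
    return {"late": 1.0}


def cost(s, a):
    if s == "on_time":
        return 0.0
    return 2.0 if a == "speed_up" else 5.0


values, policy = value_iteration(STATES, actions, transition, cost)
```

In this toy problem the value of the "late" state satisfies V = 2 + 0.9 × 0.5 × V, so the iteration settles near 2 / 0.55 with "speed_up" as the preferred action.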
The rest of this paper is organized as follows. In Section 2, we state the problem and give the assumptions of the study. In Section 3, we present our train adjustment model based on an analysis of the interaction between train operation and dynamic passenger flow. In Section 4, we explain the advantages of the ADP algorithm and the algorithm procedure. In Section 5, three experimental scenarios are implemented to verify the validity of the proposed model and algorithm. Finally, some improvements and future work are put forward in the Conclusion section.
2. Problem Description
This paper considers a single-track metro line with N stations and N − 1 running sections. As shown in Figure 1, each train begins its journey from the first station, dwells at each station for a period while passengers alight and board in sequence, and then runs through a section to arrive at the next station according to a given train timetable. For ease of reading, the variable notations of the train service process are given in Table 1.
Generally, passengers are delivered from their origin to their destination as they expect. However, in actual operation it is inevitable that trains will suffer disturbances and that operation will deviate from the predetermined timetable. In the cruising phase, equipment failure, improper driving behavior, or sudden accidents can cause a late arrival. In the loading process, there are also uncertainties that can result in a departure delay; for instance, the train may be so crowded that the doors cannot be closed on time. If the deviation is not eliminated in time, the delay can propagate throughout the network because of accumulating passengers, which has been simulated in Refs. [23,24]. The fluctuation of waiting passengers on the platform is illustrated in Figure 2. To prevent secondary delays, train rescheduling is necessary.
Usually, both dwelling time and running time are reset in a train rescheduling problem. However, with fast-growing passenger demand, recovering train operation becomes more difficult. Once a delay occurs, more passengers accumulate in a short time on top of the already large arriving passenger flow. This calls for higher transportation efficiency and sufficient dwelling time to disperse passengers; otherwise, more passengers will be left behind and more trains will deviate from the schedule, which affects the operational efficiency of the entire network. Moreover, passengers squeezing in increases the uncertainty of the dwelling process, while the delayed train needs to depart as soon as possible to improve its punctuality at later stations. Therefore, a passenger control measure should be taken to regulate the dwelling time, thereby achieving a trade-off between the number of loaded passengers and the time required to return to normal conditions.
In addition, inappropriate running regulation strategies may be counterproductive, lowering service quality and raising operating costs. According to Ref. [25], a shorter section running time leads to greater energy consumption. Moreover, excessive acceleration to move faster causes passenger discomfort. The adjustment of dwelling time should likewise take both train dispatching and passenger loading into account.
To develop an adjustment model, we first discuss the interaction between passenger flow and train operation. In reality, the start and end of the train service are part of the process of train stopping. The dwelling time is usually predetermined to match passenger flow in the timetabling stage. However, it should be reset in the case of disturbances in order to return to the original timetable. Therefore, in this paper, we determine the dwelling time based on passenger flow, an approach that has been investigated extensively in Refs. [26,27]. According to Ref. [28], dwelling time is closely related to the speed at which passengers move and the degree of crowding, as in the following formulation.
where
are given correlation coefficients, which can be estimated according to historical data.
is the number of doors of the vehicle. In this paper, the difference of the number of waiting passengers before doors is neglected, and we assume that the number of alighting passengers is proportional to the number of passengers in the vehicle, with the ratio set as
.
As Figure 3 shows, after a train arrives at a station, its arrival at the next station is determined only by the dwelling time at the current station and the section running time, and it forms the initial stage of the subsequent schedule. Accordingly, based on the interaction between passenger boarding and train operation, we propose a train regulation model combined with passenger control to bring the deviation of train operation back within a reasonable range as soon as possible, taking both passenger and operation costs into account.
Ahead of the model formulation, we give several assumptions as follows. (1) To simplify the problem, skip-stopping and overtaking are not permitted in our study, so the order in which trains pass through each station is fixed. (2) We do not consider the impact of the passenger control measure on passenger demand. This means that passengers do not leave the station or reroute even when they are denied boarding. (3) In peak hours, the number of passengers entering stations generally fluctuates little, so we use the passenger arrival rate obtained from AFC (Automatic Fare Collection) data directly, ignoring the temporal gap between the moment passengers enter the station and their arrival at the platform.
3. Model Establishment
As illustrated in Figure 1, train operation planning is a multi-stage decision problem involving passenger flow. When a train departs from a station, its arrival time at the next station depends on the running time. Based on the dwelling model discussed above, the number of boarding passengers, which is limited by the remaining capacity, determines the dwelling time and the departure time. Some passengers will be left on the platform if the remaining capacity is insufficient. If we regard the period from one departure to the next as a step, the metro system evolves in discrete stages. Therefore, at the planning level, we convert the train rescheduling problem into a Markov decision process. The formulations are as follows.
State,
, is a vector that is made from the arrival time of the train, the number of passengers in the vehicle, the number of waiting passengers, and the delay of the arrival time.
Action,
, are decision variables at each step, denoted as Equation (
4), which we mentioned previously.
where
also equals the number of remaining passengers who are left at the platform to wait for the next train.
State transfer function indicates how the state
evolves to the state
exposed to the action
. The function is expressed as Equation (
5), and the components of the state vector can be obtained by Equations (
7)–(
10).
Immediate cost,
, generated by action
, is formulated by:
where,
,
, and
are weighting parameters. In this paper, we aim to minimize the total delay of all the disturbed trains with minimal impact on both operation costs and service quality. Therefore, three terms make up the decision cost in our model. The first is the total delay of passengers. The second penalizes passenger control to reduce its negative impact on service quality. The third is the train regulation penalty. As discussed before, the variation of the running time should be kept small to limit extra energy consumption and to avoid the changes in acceleration that cause passenger discomfort.
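The three-term structure of the immediate cost can be sketched as a weighted sum. This is only an illustration under stated assumptions: the function name, the argument names, the default weights, and the squared running-time deviation used as the regulation penalty are all hypothetical, because the exact cost equation and the weight values are not reproduced in this text.

```python
def immediate_cost(passenger_delay, controlled, run_time, scheduled_run_time,
                   w_delay=1.0, w_control=1.0, w_regulation=1.0):
    """Hypothetical weighted sum of the three cost terms described in the text.

    passenger_delay    -- total passenger delay incurred in this step
    controlled         -- number of passengers held back by passenger control
    run_time           -- chosen section running time (s)
    scheduled_run_time -- timetable running time for the same section (s)
    """
    delay_term = w_delay * passenger_delay
    control_term = w_control * controlled
    # Assumption: deviation from the scheduled running time is penalised quadratically.
    regulation_term = w_regulation * (run_time - scheduled_run_time) ** 2
    return delay_term + control_term + regulation_term


example_cost = immediate_cost(100.0, 20, 110.0, 120.0,
                              w_delay=1.0, w_control=2.0, w_regulation=0.5)
```

Raising `w_control` relative to the other weights makes the policy more reluctant to hold passengers back, while raising `w_regulation` keeps the running time closer to the timetable.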
However, in fact, train adjustment is a real-time problem, and the number of affected trains is unknown, but depends on our policy. We can only predict the future based on the current status and the information we have. In MDP, the value function is calculated to judge how good the decision is in each step. For state
, it is formulated with the long-term expected return, and then, the recursion formula is described as Equation (
12) according to the Bellman optimality principle.
where
is the discount factor, which indicates the impact of current actions on future ones.
is the set of all possible states.
In actual operation, the train operation is also subject to the following constraints on the operating environment and safety restrictions.
where Equations (13)–(18) are the section running time constraint, dwelling time constraint, passenger control constraint, headway constraint, and passenger loading constraint.
C is the vehicle loading capacity, and
is the overload ratio.
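The passenger loading logic described in the text (alighting proportional to the on-board load, boarding limited by the remaining capacity, and the remainder left on the platform) can be sketched as follows. The function and field names are hypothetical, the rounding choices are assumptions, and the default capacity of 1480 and overload ratio of 1.4 are the values reported later in the numerical examples.

```python
from dataclasses import dataclass


@dataclass
class LoadResult:
    alighting: int
    boarding: int
    left_behind: int
    onboard_after: int


def load_passengers(onboard, waiting, alight_ratio, controlled,
                    capacity=1480, overload_ratio=1.4):
    """One station stop: passengers alight, then board up to the remaining capacity."""
    # Alighting passengers are proportional to the on-board load.
    alighting = round(alight_ratio * onboard)
    # Maximum load is the capacity scaled by the overload ratio (rounded, by assumption).
    max_load = int(round(capacity * overload_ratio))
    remaining = max_load - (onboard - alighting)
    # Passenger control holds back `controlled` passengers before boarding starts.
    allowed = max(waiting - controlled, 0)
    boarding = max(min(allowed, remaining), 0)
    # Everyone who did not board waits for the next train.
    left_behind = waiting - boarding
    return LoadResult(alighting, boarding, left_behind, onboard - alighting + boarding)


normal = load_passengers(onboard=1000, waiting=500, alight_ratio=0.1, controlled=50)
crowded = load_passengers(onboard=2000, waiting=500, alight_ratio=0.0, controlled=0)
```

In the crowded case only 72 passengers can board before the load reaches 2072, so the other 428 stay on the platform even though no control action was taken.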
The decision-making process of the proposed model can be explained explicitly by
Figure 4. For train
i at station
j,
is the past state, and
is the current state. After making a decision
, which consists of the number of controlled passengers and the section running time based on the current state, train
i moves to station
and
transfers to the next
; an immediate cost
is produced simultaneously.
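The step from one station to the next can be sketched in code. The state and action fields follow the verbal definitions given above (arrival time, on-board load, waiting passengers, arrival delay; controlled passengers, section running time), but the class names, the function signature, and the convention that a positive delay means a late arrival are assumptions. The dwell time and the passenger counts are passed in as inputs because their equations are not reproduced in this text.

```python
from typing import NamedTuple


class State(NamedTuple):
    arrival_time: float  # arrival time at the current station (s)
    onboard: int         # passengers in the vehicle
    waiting: int         # passengers waiting on the platform
    delay: float         # arrival delay relative to the timetable (s)


class Action(NamedTuple):
    controlled: int      # passengers held back by passenger control
    run_time: float      # section running time to the next station (s)


def transition(state, action, dwell_time, onboard_after, waiting_next,
               scheduled_arrival_next):
    """Move a train from its current station to the next one.

    The next arrival depends only on the current arrival time, the dwell time
    at the current station, and the chosen section running time.
    """
    next_arrival = state.arrival_time + dwell_time + action.run_time
    next_delay = next_arrival - scheduled_arrival_next  # positive means late
    return State(next_arrival, onboard_after, waiting_next, next_delay)


current = State(arrival_time=1000.0, onboard=900, waiting=300, delay=20.0)
decision = Action(controlled=50, run_time=115.0)
nxt = transition(current, decision, dwell_time=35.0, onboard_after=1100,
                 waiting_next=280, scheduled_arrival_next=1140.0)
```

Here a 20 s delay shrinks to 10 s at the next station because the dwell and running times together are shorter than the timetable allows for.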
5. Numerical Examples
In this section, we apply the proposed model and the ADP method to an actual case, the Beijing Subway YIZHUANG Line, which consists of 13 stations, through three experimental scenarios. Passenger flow is markedly high during the morning peak hour, so we consider only the up direction of the line, from the Beijing Economic Technological Development Zone to downtown. The time horizon is set from 7:30 to 8:30, when the passenger arrival rate is high and the headway is short. The first two scenarios were designed to verify the feasibility of the model, and the third focuses on the performance of the algorithm.
The map of the Beijing Subway YIZHUANG Line and its system parameters are shown in Figure 5 and Appendix A (Table A1). Based on field survey data and AFC records, the minimum and maximum running times are defined as 0.85 and 1.2 times the scheduled running time. The lower and upper bounds of headway are 120 s and 400 s. The minimum dwell time for door opening and closing is 8 s. The capacity is 1480, and the overload ratio is 1.4. The number of doors is 24. The coefficients in the immediate cost are set as
. Besides, the algorithm parameters are all fixed in the experimental scenarios. Discount factor
is 0.9, and the maximum iteration
N is 500 with a
step size.
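The operating limits listed above can be collected into a simple feasibility check for a candidate decision. This is a sketch under assumptions: the function name, the argument names, and the violation labels are hypothetical, and the passenger control bound (between zero and the number of waiting passengers) is assumed because the constraint equations are not reproduced in this text. The numeric defaults are the values stated in this section.

```python
def constraint_violations(run_time, scheduled_run_time, dwell_time, headway,
                          onboard, controlled, waiting,
                          capacity=1480, overload_ratio=1.4, min_dwell=8.0,
                          min_headway=120.0, max_headway=400.0,
                          run_lo=0.85, run_hi=1.2):
    """Return the names of the violated constraints (an empty list means feasible)."""
    violations = []
    # Running time must stay within 0.85 to 1.2 times the scheduled value.
    if not run_lo * scheduled_run_time <= run_time <= run_hi * scheduled_run_time:
        violations.append("running_time")
    # Dwell time must at least cover door opening and closing.
    if dwell_time < min_dwell:
        violations.append("dwell_time")
    # Assumption: controlled passengers cannot exceed those waiting.
    if not 0 <= controlled <= waiting:
        violations.append("passenger_control")
    # Headway to the preceding train must stay within its bounds.
    if not min_headway <= headway <= max_headway:
        violations.append("headway")
    # On-board load must not exceed capacity times the overload ratio.
    if onboard > capacity * overload_ratio:
        violations.append("passenger_loading")
    return violations


ok = constraint_violations(110.0, 120.0, 30.0, 180.0, 1500, 20, 300)
bad = constraint_violations(100.0, 120.0, 5.0, 100.0, 2100, 20, 300)
```

Returning the list of violated constraints rather than a single boolean makes it easier to see which limit, for example the headway bound, is shaping a given decision.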
5.1. Scenario 1
To validate the feasibility and effectiveness of the model and algorithm presented in this paper, we first considered a situation where an equipment failure occurred in running section 2 for Train 2 and resulted in an arrival delay of 110 s. Because SONGJIAZHUANG is a transfer station to the city, few passengers get off at the stations along the line. Therefore, in our experiments, the number of people alighting is proportional to the number of people in the vehicle, and the ratio is a small fixed value. The passenger arrival rate and alighting ratio are listed in Table 2.
First, to demonstrate the validity of the ADP method used in our proposed model, we compared the solving performance of the policy iteration algorithm and the ADP method on the MATLAB platform. Because train operation adjustment is time-critical, we concentrated on computational efficiency. The ADP method took 18 s to converge, as shown in
Figure 6, while it took 123 s to get the optimal solution with a
total cost under the policy iteration strategy.
From Table 3, we can conclude that the delay was effectively reduced through train regulation and passenger control, so that operation recovered to the normal schedule as soon as possible. For the delayed Train 2 and the following affected train, some boarding passengers were restricted. The section running time was shortened for Train 2 to avoid arriving too late at later stations, whereas it was prolonged for Train 3 because of the headway constraint. The number of controlled passengers then decreased gradually to zero, the running time returned to the scheduled value, the delay disappeared, and train operation returned to normal. The comparison between the scheduled timetable and the rescheduled one is shown in Figure 7. Note that although the two red lines of Train 2 and Train 3 are close to each other, they still meet the minimum headway constraint.
5.2. Scenario 2
In the former scenario, the delay was caused by a system failure, and we assumed that the passenger arrival rate was constant. In actual operation, however, disturbances are sometimes caused by a sudden increase in passenger flow. Therefore, we designed the second scenario to analyze the sensitivity of the proposed model and algorithm to passenger flow. All rates in this case fluctuated with a small increment k. Because of space limitations, we present only the results for the first following train affected by the delayed train and use Figure 8 and Figure 9 to show how they change.
Two observations can be drawn from the figures. First, when the rate was relatively small, the number of controlled passengers was zero, which is consistent with the actual situation. The scheduled dwell time already exceeded the time needed for passenger alighting and boarding, so there was no need to sacrifice the interests of passengers, and normal operation could be recovered by regulating the section running time alone. Second, as the number of arriving passengers increased, changing the running time alone was no longer sufficient, and the passenger control strategy had to be adopted. The higher the rate, the greater the resulting delay may be, and the more passengers should be controlled. It is also notable that the change in running time was not monotonic. Although our goal was to dissipate the delay, the section running time was not always reduced, because of the headway constraint.
These results also indicate that our model accounts for both dynamic passenger flow and operating characteristics and reflects the impact of passenger flow on operations. Such adjustment measures are also applicable to sudden large passenger flows. Passenger control can flexibly regulate dwell time, which supports a reasonable deployment of transportation resources.
5.3. Scenario 3
In this scenario, we further investigated the application of the model to situations where the delay occurs at different stations and for different trains. The comparison verifies the broad applicability of the model. The different initial delays are shown in Table 4. Other parameters were identical to those in Scenario 1.
Convergence results are shown in Figure 10. In each case, the objective function converged at about the 50th iteration. This iterative update method borrows an idea from machine learning. Although the true value was not available, we could substitute other targets for it. The historical results of each cycle were used as sample data, and exploration and exploitation were used to work out the optimal policy.
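The idea of replacing the unknown true value with a sampled target, while balancing exploration and exploitation, can be illustrated with two small helpers. This is a generic sketch and not the algorithm of this paper: the smoothing rule, the epsilon-greedy selection, and all names are assumptions chosen because they are common in approximate dynamic programming.

```python
import random


def smooth_update(v_old, v_sample, step_size):
    """Blend the previous value estimate with a newly sampled target."""
    return (1.0 - step_size) * v_old + step_size * v_sample


def epsilon_greedy(action_costs, epsilon, rng):
    """Pick a random action with probability epsilon, else the cheapest one."""
    if rng.random() < epsilon:
        return rng.choice(list(action_costs))      # exploration
    return min(action_costs, key=action_costs.get)  # exploitation


rng = random.Random(0)

# Repeated sampled targets pull the estimate toward the sample value.
estimate = 10.0
for _ in range(200):
    estimate = smooth_update(estimate, 20.0, step_size=0.1)

# With epsilon = 0 the selection is purely greedy (hypothetical action costs).
greedy_choice = epsilon_greedy({"hold": 5.0, "speed_up": 2.0}, epsilon=0.0, rng=rng)
```

A smaller step size makes the estimate change more slowly but damps the noise in individual samples, which is the usual trade-off when tuning such an update.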
As discussed before, regulation models in the literature are limited in that they consider only train-running indices. However, we found that as the imbalance between demand and supply grows, the impact of passenger flow fluctuations on operations becomes more apparent, and both the passenger control variable and the running time variable are sensitive to environmental changes. The three experimental scenarios above demonstrate the necessity of passenger control and verify the effectiveness of our model in different situations.