**2. Background**

#### *2.1. Artificial Neural Networks*

The Artificial Neural Network (ANN) is a mathematical method that is widely used for reproducing several physical phenomena and forecasting the results of some actions on (or variations of) the parameters/variables of the system. ANNs are considered to be black-boxes since the functions and the relationships between inputs and outputs are hidden, not known and, generally, not interpretable.

Both the strengths and weaknesses of ANNs are related to their black-box approach. ANNs can reproduce a phenomenon or approximate a function without making the parameters explicit; moreover, once trained, they give results rapidly. On the other hand, a trained ANN cannot be extended even to similar cases and works only if the boundary conditions do not change significantly.

ANNs have been widely studied elsewhere; they were initially introduced in [1–3] and then developed in other pioneering contributions [4–8]. Many general books focus on ANNs; here we refer to [9–14].

Literature reviews have been proposed in several papers. Scarselli and Tsoi [15] presented a review of studies that used Feedforward Neural Networks to approximate some functions, examining computational aspects, structures of the network (hidden layers and neurons), and training algorithms. They also proposed two training algorithms. Baptista and Morgado-Dias [16] examined the numerous software tools available, with the intention to help choose the most appropriate tool while considering its features (operating system, minimum hardware, kind of licence, algorithms implemented, and so on). Timotheou [17] reviewed random neural networks and their application to several problems. Extreme learning machines were reviewed in [18], while reviews on deep learning in neural networks can be found in [19,20]. Finally, here we refer to Yao [21] for an appraisal of evolutionary artificial neural networks.

#### *2.2. Road Traffic Flow Forecasting*

Two main types of transportation flow forecasting problems can be identified: (i) short-term forecasting and (ii) traffic data spatial extension. Some papers that applied ANNs to these problems were reviewed in [22].

The first problem aims to forecast the traffic flows (or user flows) that use a road section (or a transit line) in a future time interval, using the data measured in the previous time intervals in the same road section. This problem has been widely studied elsewhere, and a complete review would deserve a specific paper; here we refer to [23,24]. Several methods have been proposed for solving this problem; in this paper, we focus mainly on the literature that has used ANNs.

Kirby et al. [25] discussed the use of ANNs for forecasting traffic flows on motorways up to an hour ahead and compared this approach with other statistical models. Smith and Demetsky [26] compared the performances of ANNs with traditional methods for solving the short-term traffic flow prediction problem, such as data-based algorithms and time-series models; they found that the back-propagation neural network model was able to predict future traffic flows on highways better than the other models. In the same research field, ANNs were used for modelling freeway traffic in a macroscopic environment [27]; the authors found that the neural network model was able to capture the traffic dynamics quite closely and was "*computationally efficient for real-time implementation*". ANNs were proposed as tools for predicting congestion and forecasting flows in [28]; the authors also discussed whether ANNs were able to estimate parameters that cannot be directly measured with road sensors. Park et al. [29] proposed a radial basis function neural network for short-term forecasting on freeways; they tested the method with real observations and compared it with other approaches such as Taylor series, single and double exponential smoothing methods, and back-propagation neural networks. Zheng et al. [30] proposed a Bayesian combined neural network approach for short-term forecasting on freeways, while a binary neural network was presented in [31]. Another application on a highway can be found in [32], while applications in urban environments can be found in [33,34]. Park et al. [35] used feedforward multilayer neural networks for estimating link travel times on freeways. Other applications of ANNs to short-term forecasting can be found in [36–40]. Ledoux [41] proposed the use of ANNs within an urban traffic flow model, while Florio and Mussone [42] studied traffic flow stability on freeways with neural network models.

Traffic data spatial extension problems have received less attention in the literature. Lin et al. [43] used a macroscopic model for short-term forecasting, which is also able to predict flows on other links. Zheng Zhu et al. [44] used ANNs for spatial extension of traffic flows at road intersections. Gallo and De Luca [45] proposed the use of ANNs for estimating traffic flows on some links of an urban road network according to the flows measured on other links.

Recently, deep learning methods have been proposed in the field of traffic prediction; a survey can be found in [46]. The authors identified four main deep learning models: deep neural networks (DNN), convolutional neural networks (CNN), recurrent neural networks (RNN), and deep reinforcement learning (referring, in particular, to the Deep Q-Network [47]). In the ITS field, among others, DNN, CNN and RNN are useful for time series prediction. However, the above methods have not yet been used for the spatial extension of traffic or passenger data.

Short-term traffic forecasting with deep learning was studied in [48], where a long short-term memory (LSTM) network was proposed; the method was tested on a case study in Beijing, showing promising forecast accuracy compared with other approaches. The same approach has also been used for traffic flow prediction with missing data in [49].

Temporal CNN was proposed in [50] for short-term forecasting of passenger demand, outperforming other models in test cases. Support vector machines and data denoising schemes were combined in [51] for traffic flow prediction; the proposed denoising algorithms improved the results in this hybrid model, compared to other approaches without a denoising strategy. Short-term travel speed prediction was studied in [52–55].

#### *2.3. Metro Passenger Flow Forecasting*

The specific problem tackled in this paper entails metro passenger flow prediction. The literature review in this field presents some interesting contributions.

Deep learning methods were proposed in [56] and tested on a Bus Rapid Transit (BRT) system. The proposed model forecasts the hourly flow, adopting a three-stage deep learning architecture. That paper also analyses the literature, identifying four different approaches: (1) traditional classical algorithms; (2) regressive models; (3) machine learning-based models, including ANNs; (4) hybrid models. All the cases studied, however, refer to short-term or long-term time periods, without considering the spatial extension. Among them, the cases reported in [57,58] are applied to railways and based on ANNs, focusing on short-term and long-term forecasting respectively. Short-term forecasting on urban metros was also studied with other methods, such as the Kalman filter [59] and ARIMA (autoregressive integrated moving average) models [60].

Li et al. [61] proposed a multiscale radial basis function (MSRBF) for forecasting short-term metro passenger flows on special occasions, such as sporting events, concerts, and so on. In this case, passenger flow is very irregular and predictions are more difficult to obtain. Ling et al. [62] used smart-card data for predicting passenger flows in the subway of Shenzhen (China); they analysed four predictive models: a historical average model, ANN, regression model and a gradient-boosted regression tree model. Liu et al. [63] proposed a deep learning method for short-term forecasting of metro inbound/outbound passenger flows, while Wang et al. [64] proposed a Novel Markov-Grey model for solving the same problem.

#### *2.4. Contribution of the Paper*

ANNs have been widely used in numerous scientific fields since the 1950s/1960s and in traffic engineering since the 1990s. Most applications in traffic engineering have focused on the temporal extent of data (more frequently short-term or, sometimes, long-term predictions) and road environments; fewer cases refer to transit systems. The spatial extent of data has been less widely studied and, to the best of our knowledge, the use of ANNs for the specific problem tackled in this paper has not been proposed elsewhere. Therefore, the originality of our contribution does not so much concern the method used, which is indeed consolidated, as the problem dealt with and the procedure used to construct the training datasets. Other more advanced methods, such as deep learning, will be the subject of further research, as will be discussed in the conclusions. In this paper, the performance of ANNs was not compared with other methods because there are no benchmarks. Indeed, almost all methods usually used as benchmarks in short-term forecasting problems are not applicable in our case, since they are time-series specific.

It is important to underline that the problem studied is relevant to the real-time management of metro lines. Data at turnstiles can be easily collected with methods that do not require significant additional investment, while data obtained through the above procedure (loads on line sections) are essential for service operators and, unlike the former, are not easily detectable in real time and continuously.

#### **3. Problem Description and ANN Approach**

We assume that turnstiles control all accesses to a metro line: each user, entering a station, uses a ticket (or a pass) for crossing the turnstile. Moreover, the turnstiles are only able to count users entering the station without linking the origin of each trip with the corresponding destination. This situation is common to many metro lines, such as Line 1 of the Naples metro system (Italy), which will be the subject of the real-scale test. Indeed, turnstiles are often installed only for facilitating ticket control/validation and avoiding no-ticket trips, and, in urban contexts, the fare is the same regardless of the origin-destination pair. Below, we consider two cases: (a) turnstiles at the station entrance that measure only passengers entering, with no indication of which direction they will follow; (b) turnstiles upon access to platforms that also give information on trip direction (see Figure 1).

The data collected by turnstiles can be used, with low technological investment, for implementing a monitoring system of the whole metro line, generating information about the passengers on each railway section (between two stations). Such information can be of great use to metro operators for implementing real-time strategies, like a frequency increase or reduction, determination of train composition (number of passenger carriages), the scheduling of additional runs, and so on.

The problem to solve is the estimation of loads on the line starting from turnstile data. For this purpose, we propose feedforward ANNs, which are suitable because (a) the relationships between inputs and outputs do not need to be explicitly known, (b) the results are obtained rapidly, and (c) the boundary conditions do not usually change so much as to invalidate forecasts.

The structure of the ANN provides an input layer with a node for each turnstile, an output layer with a node for each convoy load, and one or more hidden layers. The best structure of the ANN has to be designed for each specific problem. A crucial point concerns the dynamic nature of the problem: the train moves along the line, loading and unloading passengers at stations in different time intervals. Therefore, the number of onboard passengers between two stations at time *t* depends on the passengers loaded and unloaded at previous stations at different times (< *t*). Hence, ANN inputs have to consider turnstile counts referring to several time intervals preceding those being forecast. The number of inputs will depend on the travel time duration between terminals.
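The dependence of the inputs on several preceding intervals can be illustrated with a minimal sketch. It assumes that the turnstile counts are stored as an array with one row per time interval and one column per turnstile; the function name `build_input_vector` and the toy data are hypothetical and serve only to show how the counts of the intervals preceding *t* are stacked into a single input vector.

```python
import numpy as np

def build_input_vector(counts, t, n_lags):
    """Stack the turnstile counts of the n_lags intervals preceding interval t.

    counts: array of shape (n_intervals, n_turnstiles), one row per interval.
    Returns a flat vector of length n_lags * n_turnstiles, oldest interval first.
    """
    if t < n_lags:
        raise ValueError("not enough preceding intervals for the requested lags")
    window = counts[t - n_lags:t]          # rows t-n_lags, ..., t-1
    return window.reshape(-1)

# Toy data (hypothetical): 6 intervals, 3 turnstiles.
counts = np.arange(6 * 3).reshape(6, 3)
x = build_input_vector(counts, t=5, n_lags=4)
print(x.shape)  # (12,)
```

With this layout the input layer grows linearly with the number of lagged intervals, which is why the travel time between terminals determines the number of inputs.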

The other crucial point is the training phase of ANNs; here we propose to use a supervised learning method, where the example datasets are generated through dynamic simulation models (see Section 4). Note that it is not possible to use real-world data for the training phase. Indeed, only the input data (on passengers entering the stations) are available while the output data (on-board passengers) can be known only if all coaches have sensors that are able to measure them; in this case, however, the proposed approach would not be useful.

**Figure 1.** Types of metro stations: (**a**) turnstiles at the entrance; (**b**) turnstiles at platform accesses.

#### **4. Generation of Training Datasets**

To generate the training datasets, we used the simulation model proposed in [65]. This model assumes that:


In our test, we assume that the passengers follow a FIFO (First In-First Out) rule for boarding the convoy. The analytical details of the model can be found in [65].

Using this model, we generate the training datasets on the case study as follows: (a) numerous origin-destination (OD) matrices referring to 15-min intervals are randomly generated starting from a base OD matrix; (b) four OD matrices, referring to four consecutive time intervals, are assigned to the metro line, yielding as results the passengers counted at turnstiles, for each interval, and the passengers onboard in each railway section in the next time interval; (c) the output data of the problem are the passengers on railway sections, while the input data are the passengers counted at turnstiles in four time intervals, corresponding to the four previous 15-min periods (e.g., flows on railway sections between 10:15 and 10:30 are estimated according to the turnstile counts in the intervals: 9:15–9:30, 9:30–9:45, 9:45–10:00 and 10:00–10:15). The resulting structure of the training datasets is reported in Table 1, where the following notations are used:
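Steps (a) and (c) can be sketched as follows. The way the OD matrices are randomised is not specified here, so the uniform multiplicative perturbation, the `spread` parameter, the function names and the flat base matrix are all assumptions made for illustration. The onboard loads of step (b) require the simulation model of [65] and are deliberately not reproduced.

```python
import numpy as np

def perturb_od(base_od, rng, spread=0.3):
    """Return a random OD matrix around base_od (assumed uniform perturbation)."""
    factors = rng.uniform(1.0 - spread, 1.0 + spread, size=base_od.shape)
    od = np.rint(base_od * factors)
    np.fill_diagonal(od, 0.0)              # no trips from a station to itself
    return od

def entrance_counts(od):
    """Case (a): passengers entering each station, i.e., the row sums of the OD matrix."""
    return od.sum(axis=1)

rng = np.random.default_rng(0)
n_stations = 18
base_od = np.full((n_stations, n_stations), 20.0)   # hypothetical base OD matrix
np.fill_diagonal(base_od, 0.0)

# Four consecutive 15-min intervals give one input vector of the training dataset.
od_sequence = [perturb_od(base_od, rng) for _ in range(4)]
inputs = np.concatenate([entrance_counts(od) for od in od_sequence])
print(inputs.shape)  # (72,)
```

The corresponding output vector (passengers on each railway section in the next interval) would be produced by assigning `od_sequence` to the line with the simulation model.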


**Table 1.** Structure of training datasets for period *t*.


#### **5. Case Study and Numerical Results**

The proposed approach was tested on Line 1 of the Naples metro system. This line (see Figure 2) is 18 km long and has 18 stations; it connects high-density districts in Naples and is crucial infrastructure for urban mobility.

**Figure 2.** Line 1 route.

Considering these characteristics, we have 34 or 18 turnstiles, depending on whether or not passengers are divided according to direction, and 34 mono-directional railway sections. The main features of the line are summarised in Table 2.
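These figures are consistent with a simple count, sketched below under the assumption that each intermediate station has one platform access per direction while each terminal serves a single boarding direction. The function name is hypothetical.

```python
def count_inputs_and_sections(n_stations):
    """Count turnstile groups and railway sections for a single two-way line."""
    entrance_inputs = n_stations              # case (a): one entrance count per station
    platform_inputs = 2 * n_stations - 2      # case (b): terminals board in one direction only
    sections = 2 * (n_stations - 1)           # one section per direction between adjacent stations
    return entrance_inputs, platform_inputs, sections

print(count_inputs_and_sections(18))  # (18, 34, 34)
```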

**Table 2.** Features of Line 1.


The training datasets were obtained by simulating 2500 OD matrices generated randomly. On eliminating some of them because their results were not feasible (too many passengers compared to the actual capacity), we generated 2279 training datasets. Of the latter datasets, 2229 were used for the whole training process of the ANNs with the software MATLAB: 1561 training datasets (70%), 334 validation datasets (15%) and 334 testing datasets (15%). The remaining 50 datasets were used to verify the goodness of the trained ANNs with examples that were not used before in the training process. We tested six ANN structures for both cases: (a) turnstiles at the station entrance (18 turnstiles), and (b) turnstiles at the access points to platforms (34 turnstiles). We thus trained and tested 12 ANNs, as reported in Table 3.
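The partition sizes can be reproduced with a short sketch. How the software actually divided the datasets is not stated, so the random permutation, the seed and the variable names below are assumptions; only the counts are taken from the text.

```python
import numpy as np

rng = np.random.default_rng(42)
n_feasible = 2279                  # datasets left after discarding infeasible ones
n_holdout = 50                     # kept aside for the final verification

indices = rng.permutation(n_feasible)
holdout = indices[:n_holdout]
remaining = indices[n_holdout:]    # 2229 datasets for the training process

n_val = int(0.15 * len(remaining))     # 15% validation, rounded down
n_test = int(0.15 * len(remaining))    # 15% testing, rounded down
n_train = len(remaining) - n_val - n_test

train = remaining[:n_train]
val = remaining[n_train:n_train + n_val]
test = remaining[n_train + n_val:]
print(len(train), len(val), len(test), len(holdout))  # 1561 334 334 50
```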



The training phase required computing times from 30 s (case a\_1\_6) to 8 min (case b\_2\_20) on a Hewlett Packard personal computer (Intel i7-7700HQ, 2.80 GHz, 16 GB RAM). In Table 4 we report the best and worst coefficients of determination (R<sup>2</sup>) for each case, referring to the 50 datasets not used in the training phase, and the corresponding averages and variances. The datasets for which R<sup>2</sup> is lower than 0.9, 0.8, 0.7 and 0.6 are reported in Table 5. In these tables, the best values for each ANN are underlined.
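The statistics reported in the two tables can be computed as in the following sketch. It assumes the standard definition R<sup>2</sup> = 1 − SS<sub>res</sub>/SS<sub>tot</sub> and a population variance; the exact definitions used by the software are not stated here, and the function names are hypothetical.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def summarise(r2_values, thresholds=(0.9, 0.8, 0.7, 0.6)):
    """Best, worst, mean and variance of R2, plus counts below each threshold."""
    r2 = np.asarray(r2_values, dtype=float)
    return {
        "best": r2.max(),
        "worst": r2.min(),
        "mean": r2.mean(),
        "variance": r2.var(),
        "below": {th: int((r2 < th).sum()) for th in thresholds},
    }

# Toy values (hypothetical), one R2 per verification dataset.
print(summarise([0.95, 0.85, 0.65])["below"])  # {0.9: 2, 0.8: 1, 0.7: 1, 0.6: 0}
```

In the paper's setting, `r_squared` would be evaluated once per verification dataset, comparing the simulated and the ANN-estimated loads on the 34 railway sections.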

Examining the results reported in Tables 4 and 5, we may identify as best ANN structures the one with one hidden layer and 20 neurons for case (a), and two hidden layers and 10 neurons for case (b). The corresponding dispersion diagrams in the cases of best and worst R<sup>2</sup> are reported in Figures 3 and 4.
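For reference, the forward pass of a network with the shape of a\_1\_20 can be sketched as below. The sizes follow from the text under the assumption that case (a) uses 18 turnstile counts for each of four intervals (72 inputs) and 34 section loads as outputs. The tanh hidden activation, the linear output layer and the random untrained weights are illustrative assumptions, not the trained model.

```python
import numpy as np

def init_params(n_in, n_hidden, n_out, rng):
    """Random (untrained) weights for a network with one hidden layer."""
    return {
        "W1": rng.normal(0.0, 0.1, size=(n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, size=(n_out, n_hidden)),
        "b2": np.zeros(n_out),
    }

def forward(params, x):
    """Feedforward pass: tanh hidden layer followed by a linear output layer."""
    hidden = np.tanh(params["W1"] @ x + params["b1"])
    return params["W2"] @ hidden + params["b2"]

rng = np.random.default_rng(0)
params = init_params(n_in=72, n_hidden=20, n_out=34, rng=rng)
x = rng.uniform(0.0, 1.0, size=72)     # normalised turnstile counts (hypothetical)
loads = forward(params, x)
print(loads.shape)  # (34,)
```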


**Table 4.** Coefficients of determination (R<sup>2</sup>).

Best values are underlined.


**Table 5.** Analysis of R<sup>2</sup> values.

Best values are underlined.

**Figure 3.** Dispersion diagrams ANN a\_1\_20 (best, upper diagram; worst, lower diagram).

**Figure 4.** Dispersion diagrams ANN b\_2\_10 (best, upper diagram; worst, lower diagram).

#### **6. Conclusions and Research Prospects**

In this paper, we studied the problem of forecasting passenger flows on railway sections of a metro line starting from counts at turnstiles and proposed to use artificial neural networks (ANNs) for its solution. The training datasets were generated using a simulation model. We considered two cases: turnstiles at station entrances and turnstiles at platform accesses. For both, we designed and trained several ANNs.

The results showed a good capacity of ANNs to forecast the loads on railway sections. Our analysis allowed us to identify the best ANN structure for each case.

Future research could profitably proceed in several directions. First of all, other ANN structures could be tested. Then the problem could be extended to more complex metro systems, including systems with more than one line. Finally, other methods could be investigated, such as deep-learning approaches that could be applied to this problem.

**Author Contributions:** Conceptualization, M.G. and L.D.; Methodology, M.G., G.D.L. and L.D.; Validation, M.B. and G.D.L.; Investigation, M.B. and G.D.L.; Resources, L.D. and M.B.; Data Curation, M.B. and G.D.L.; Writing—Original Draft Preparation, M.G. and L.D.; Writing—Review and Editing, M.G., G.D.L., L.D. and M.B.; Supervision, M.G.

**Funding:** This research received no external funding.

**Acknowledgments:** The authors are grateful to the anonymous reviewers for their valuable comments and suggestions.

**Conflicts of Interest:** The authors declare no conflict of interest.
