First, real data were obtained to adapt the model both geometrically and in terms of flow to the real chosen situation and to compare—in further evaluations—the simulated outputs with the real values of the chosen parameter. With this information, the microsimulation model was created and visually tested to avoid unrealistic behavior.
The second step involved the selection of suitable parameters, which mainly influence pedestrian behavior, but also consider the interaction with motorized road users and are to be used as inputs for the neural network. Then the most suitable microsimulation output was selected, which should also be used as an output of the neural network. After these selections, a database with random input combinations was generated. Each of these combinations was linked by simulation to a corresponding output value. The third step consisted in the formulation of the neural network and its training. The results of this step were evaluated in relation to the criterion of learning outcomes.
Finally, a new data set was used to evaluate the network, taking into account the criterion of generalization. In the next paragraphs, the steps are described in detail.
3.1. Data Gathering
As a starting point for setting up the model and developing the neural network, a real situation had to be selected.
It was decided to record an unsignalized pedestrian crossing located on the entry leg of a roundabout in Monfalcone (GO), a city in the Italian region of Friuli-Venezia Giulia. The roundabout lies in the urban area of the municipality and connects the city center with the main neighboring towns. The crossing spans two unidirectional lanes of the same roundabout entry leg and connects pubs and cafés with offices, schools and shops. Both pedestrian movements and conflicting vehicle traffic were recorded for a whole week during the city's two morning peak hours, i.e., from 7:30 to 9:30 a.m., yielding more than 1000 recorded pedestrians.
The recordings were analyzed using semi-automatic software that allows selected detections to be saved automatically and the users involved to be tracked manually. This step yielded a large amount of data on both motorized and non-motorized users. In line with the research objective, the real-world data selected for use in the model were the pedestrian speed distribution, the pedestrian flow and its split rate, and the vehicular flow and speed (Table 1). In Table 1, the numbers in the second column denote, for pedestrian flows, the start and end zones of the pedestrian trip and, for vehicular flows, the link on which vehicles travel (see Figure 2).
Geometrical data were also collected, taken directly from official map information provided by the municipality.
3.2. Micro-simulation Model Set Up
The model on which the neural networks were trained was developed with the commercial microsimulation software Vissim. The simulated geometry and the allowed maneuvers are exactly the same as the real ones (Figure 2; Table 2).
In order to obtain a realistic simulation, vehicular traffic was regulated according to the Wiedemann 74 behavior, which is the suggested behavior for urban traffic modeling [37], while pedestrian traffic was regulated according to the social force model, which was implemented through the Viswalk add-on.
Four main problems were encountered in the reproduction of the site. The first concerned the network geometry, in particular whether to model the entire roundabout or only the entry leg of interest. The decision was made to model the entire roundabout and its oncoming and outgoing legs, as the facility is very crowded and each incoming and outgoing flow is affected by the others. The second concerned how to model the paths of pedestrian movements. Simulated pedestrians can move on a “pedestrian area” or on a “link used as pedestrian area”. After heuristically implementing both possibilities and critically observing the reproduced behavior, it was decided to use the “link used as pedestrian area”, since it visually provided the response that best fitted real observations. The third concerned vehicular behavior: Italian drivers do not usually stop at pedestrian crossings unless they consider it necessary, so this type of yielding behavior had to be reproduced realistically. To solve the problem, priority rules and speed reduction areas were used jointly. The priority rules make oncoming vehicles yield to pedestrians, while 4 “speed reduction areas” were placed, 2 in proximity of the crossing and 2 just before the stop line at the entry to the roundabout circle, to make vehicles reduce their speed and thus reproduce the observed outcomes well.
The fourth problem was posed by pedestrian generation: in the selected scenario, pedestrians can arrive from 5 different directions (3 on the left and 2 on the right side) and can reach 5 different destinations. To solve this problem, a pedestrian area and the corresponding pedestrian input were created for each of the listed directions.
Summing up, the network geometry consists of 49 links: 16 vehicular links defining the oncoming and outgoing roads and the roundabout circle lanes, 12 links used as pedestrian areas, 5 pedestrian areas and 12 connectors.
3.3. Parameter Choice and Preparation
To successfully formulate and train a neural network, the choice of the parameters on which the network will work is of primary importance. Since the aim of the research is to assess pedestrian crossing behavior in an environment where the interaction with vehicular flow is not negligible, it was decided to work on both pedestrian and vehicular behavioral parameters. Specifically, a total of 8 parameters were selected, 5 of which relate to pedestrians and 3 to vehicular traffic. Pedestrian movement was modeled using the Social Force Model (SFM) [38]. As stated in [39], this model is easy to understand, but the parameters of which it is composed, which strongly influence diverse behavioral aspects, are difficult to interpret and measure.
The basic concept of the SFM is that each pedestrian, defined by its desired speed and target time, moves towards its destination ruled by the so-called social forces [38]. These forces are: the attractive forces that lead each pedestrian to its destination; the repulsive forces among pedestrians and between the individual and obstacles; and the attractive forces due to other pedestrians or objects [38]. In Vissim/Viswalk, the Social Force Model expression (Equation (1)) is adapted and some parameters governing the equations emerge:
In detail, these are the following:
is a parameter depending on lambda calculated for each physical force opposing the main force F [37].
d is a parameter dependent on a further formulation, which indicates the distance between two pedestrians [37].
n is the vector pointing the influenced pedestrian to the influencing one [37].
Tau (τ): relaxation time as expressed in Helbing’s original model. It relates the difference between the desired speed and direction to the current speed and direction. Though it is not directly present in Equation (1), it is used to express the force leading each pedestrian to its destination [37].
Lambda_mean (λ): amount of anisotropy. It regulates the effect of phenomena that take place behind the considered pedestrian [37].
A_soc_isotropic and B_soc_isotropic: non-measurable parameters that control the two forces among pedestrians [37].
A_soc_mean and B_soc_mean: define, respectively, the strength and the typical range of the social force between pedestrians [37].
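The role of the relaxation time can be illustrated with the driving term of Helbing's social force model, in which a pedestrian's velocity relaxes towards the desired velocity over a time τ. The snippet below is a minimal sketch of that term only (per unit mass, in two dimensions); the function name and the numerical values are illustrative assumptions and are not taken from the Vissim/Viswalk implementation.

```python
def driving_acceleration(desired_speed, direction, velocity, tau):
    """Driving term of Helbing's model per unit mass: (v0 * e - v) / tau.

    desired_speed: desired walking speed v0 (m/s)
    direction:     unit vector e pointing towards the destination
    velocity:      current velocity v (m/s)
    tau:           relaxation time (s); larger values mean slower adaptation
    """
    ex, ey = direction
    vx, vy = velocity
    return ((desired_speed * ex - vx) / tau,
            (desired_speed * ey - vy) / tau)


# Illustrative values (assumptions): a pedestrian at rest who wants to walk
# at 1.4 m/s along the x axis, with tau = 0.5 s.
acc_x, acc_y = driving_acceleration(1.4, (1.0, 0.0), (0.0, 0.0), tau=0.5)
```

With the same desired and current velocity, doubling τ halves the resulting acceleration, which is why this parameter has a direct effect on how quickly simulated pedestrians regain their desired speed after an interaction.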
Other parameters also influence pedestrian movement in the Vissim/Viswalk simulation tool and should be considered. These are:
Noise: introduces the random forces, which are systematically added to the calculated forces [37].
React_to_n: the number of pedestrians considered for the calculation of the forces [37].
Side_preference: defines whether opposing pedestrians prefer using the right or left hand side when passing each other [37].
Queue_order and Queue_straightness: specify the properties of the queue [37].
In the calibration attempt developed in [40], the authors give an insight into which modifications are brought about by changing some of the above parameters (Table 3). Table 3 also considers parameters additional to the ones reported in Equation (1). They are: the radius of pedestrians, which indicates the size of the two-dimensional ellipse containing the shape of a pedestrian [40]; B physical border, a theoretical parameter controlling the force between a pedestrian and the border of a location [40]; friction force, a component of the main force F, together with attractive and repulsive forces; VD, a theoretical parameter of Vissim named “velocity dependence”, which is expressed in seconds and involves various other parameters, although it is not exactly defined in [37]; velocity, the speed of pedestrians; longitudinal scale, a dimensional parameter of the model; and maximum number of pedestrians, the maximum number of interacting agents considered in the simulation.
A literature review focusing on parameter fine-tuning was carried out and, based on its findings [37,39,40,41], 5 pedestrian parameters (tau, lambda, Asoc_iso, Bsoc_iso, side_pref) and their ranges have been outlined in Table 4.
Since the interaction with vehicular flow is close in the considered situation, vehicular parameters also have to be considered. In particular, the model governing vehicular behavior is Wiedemann 74, and the selected parameters are the three that most affect the car-following model: the average standstill distance, which defines the mean desired distance between two cars, and the additive and multiplicative parts of the safety distance, which are values used for the calculation of the desired safety distance [37].
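To make the role of these three parameters concrete, the following sketch computes the Wiedemann 74 desired distance in the form commonly given in the Vissim documentation, d = ax + (bx_add + bx_mult · z) · √v, where v is the speed in m/s and z is a value in [0, 1] drawn from a normal distribution with mean 0.5 and standard deviation 0.15. This form is stated here as an assumption rather than taken from the present study, and the numerical values in the example are illustrative only, not the calibrated ones.

```python
import math
import random


def w74_desired_distance(ax, bx_add, bx_mult, speed_ms, z=None, rng=None):
    """Desired following distance (m) in the Wiedemann 74 form d = ax + bx.

    ax:       average standstill distance (m)
    bx_add:   additive part of the safety distance
    bx_mult:  multiplicative part of the safety distance
    speed_ms: vehicle speed (m/s)
    z:        value in [0, 1]; if omitted it is drawn from N(0.5, 0.15)
              and clipped to [0, 1] (assumed form, see lead-in)
    """
    if z is None:
        rng = rng or random.Random()
        z = min(1.0, max(0.0, rng.gauss(0.5, 0.15)))
    bx = (bx_add + bx_mult * z) * math.sqrt(speed_ms)
    return ax + bx


# Illustrative values (assumptions): ax = 2 m, bx_add = 2, bx_mult = 3,
# a speed of 9 m/s and the mean value z = 0.5.
distance = w74_desired_distance(2.0, 2.0, 3.0, speed_ms=9.0, z=0.5)
```

At standstill the expression reduces to ax, while at higher speeds the additive and multiplicative parts dominate, which is consistent with their strong effect on car-following behavior.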
Geometric characteristics and traffic load are also very important parameters, but they are entered as inputs to the model for each location separately. Vehicular parameters and their ranges are also summarized in Table 4.
The selection of a feasible output is also important. For this study, pedestrian crossing time was chosen as the assessment output for the neural network, because it is easy to measure both from footage (semi-automatically as well as manually) and from microsimulation results.
Once chosen, the parameters have to be stored in a form that the neural network can read as input values. Good training of the network requires a sufficient amount of data, randomness of the data and a sufficient number of repetitions of each combination.
For this study, a database of 100 random combinations of input-output values was built in Excel. Each combination consists of a randomly selected value for each of the 8 input parameters, drawn with a step of 0.1, and the resulting crossing time obtained by implementing that input combination in the Vissim model. The parameters were also selected so that their ranges have the same order of magnitude, in order to avoid large differences in range. To have a sufficient number of repetitions, ten simulations were run for each random combination, with an initial random seed of 42, a seed increment of 10 and a simulation resolution of 1 time step per second. In this way, the same ten traffic scenarios were analyzed for each combination of input parameters. The number of simulations was chosen as a compromise dictated by the pseudo-stochastic working method of the model: it simulates a phenomenon that is stochastic by nature (traffic flow), but at the same time the experiment has to be repeatable, which means that the same “random seed” parameter value (random number generator) gives the same output.
Here the mean output value of ten simulations was used as the final output of the simulation for each random combination of selected input parameters.
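The database construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the parameter ranges are placeholders (the actual ranges are those of Table 4), and `run_simulation` is a hypothetical stand-in for a Vissim run that simply returns a repeatable pseudo-random crossing time for a given seed.

```python
import random
import statistics

# Placeholder ranges (assumptions); the real ones are listed in Table 4.
PARAM_RANGES = {
    "tau": (0.1, 1.0),
    "lambda_mean": (0.0, 1.0),
    "a_soc_iso": (1.0, 4.0),
    "b_soc_iso": (0.1, 1.0),
    "side_pref": (0.0, 1.0),
    "ax": (1.0, 3.0),
    "bx_add": (1.0, 4.0),
    "bx_mult": (1.0, 5.0),
}
STEP = 0.1
# Ten runs per combination: initial seed 42, increment 10.
SEEDS = [42 + 10 * i for i in range(10)]


def draw_on_grid(low, high, rng, step=STEP):
    """Draw a random value from low, low + step, ..., high."""
    n_steps = round((high - low) / step)
    return round(low + rng.randint(0, n_steps) * step, 1)


def run_simulation(params, seed):
    """Hypothetical stand-in for one Vissim run; returns a crossing time (s)."""
    rng = random.Random(seed)  # same seed -> same "traffic scenario"
    return 5.0 + 2.0 * params["tau"] + rng.uniform(-0.5, 0.5)


def build_database(n_combinations=100, rng_seed=0):
    """Return a list of (input parameters, mean crossing time) records."""
    rng = random.Random(rng_seed)
    database = []
    for _ in range(n_combinations):
        params = {name: draw_on_grid(low, high, rng)
                  for name, (low, high) in PARAM_RANGES.items()}
        times = [run_simulation(params, seed) for seed in SEEDS]
        database.append((params, statistics.mean(times)))
    return database


database = build_database()
```

Because the same ten seeds are reused for every combination, differences between the mean crossing times of two records reflect the input parameters rather than different random traffic scenarios.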
Following the most widely used protocol [42], 80% of the created dataset was used for training the network, while the remaining 20% was used as a test bed for the first evaluations (test and validation).
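A random 80/20 partition of this kind can be sketched as below. The function name and the fixed shuffling seed are assumptions introduced for illustration; the study does not state how the records were assigned to the two subsets.

```python
import random


def split_dataset(records, train_fraction=0.8, seed=0):
    """Shuffle the records and split them into training and test subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Record indices stand in for the 100 input-output combinations.
train_set, test_set = split_dataset(range(100))
```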
An additional database, corresponding in size to 20% of the initial one, was generated after network training and testing. This completely independent database, which was part of neither the training set nor the test set in the neural network learning process, served to confirm independently the generalization ability of the selected neural network. Indeed, the goal of training a neural network is to achieve good generalization, so that it can be applied to databases much larger than the one on which it was trained. In this specific case, the neural network trained on a database of 100 combinations will be applied to databases 3–4 times larger, which compare crossing times measured in real traffic conditions with crossing times obtained by microsimulation, using the neural network prediction function implemented in the calibration process for different values of the input parameters.
3.5. Neural Network Formulation
Previous works [14,15] on the calibration of vehicular traffic models have considered the application of various kinds of neural networks and their performance, examining 176 neural network configurations. On the basis of these considerations and of previous experience, a Ward network was implemented in the Neuroshell2 program for the present problem.
Ward nets, better known as feedforward networks, are neural networks structured in such a way that the neurons of a layer have the outputs of the neurons in the preceding slab as input signals [43]. A feedforward network in which each neuron of one layer is linked to the ones of the adjacent layer is called fully connected [43]. An example of a fully connected feedforward network can be seen in Figure 3.
The adopted neural network is made up of 5 layers: the input layer has 8 neurons, corresponding to the 8 selected input parameters, the output slab has one neuron, i.e., the selected output parameter—pedestrian crossing time, while layers 2, 3 and 4 are hidden layers and each one is composed of 4 hidden neurons.
The architecture of the neural network was chosen following some of the rules found in the literature [43,44,45] and the knowledge acquired from previous experience [16]. The number of input and output neurons is related to the number of initial parameters to be used and desired outputs to be evaluated. The selection of the number of hidden neurons is more complicated and has important consequences for the results of the network: indeed, as reported in [46], too many hidden neurons can cause over-fitting issues and degrade the generalization ability of the neural network. Much research exists on methods for selecting the best number of hidden neurons [44,46]: following the indications in [47], a suitable number of hidden neurons can be suggested by applying the formula (Equation (2))
where
represents the number of input neurons and
is the number of output neurons. Also, a general rule of thumb is to select a number of hidden neurons belonging to the interval between the number of inputs and outputs.
Considering these two rules, 4 neurons have been chosen for the hidden layers of the network.
Regarding the number of hidden layers, it was initially selected according to the literature [45,46,47] and empirically, on the basis of previous experience [15,16]; then 10 different configurations were tested and the one giving the best results was chosen. As with any optimization procedure, the question arises whether the selected network architecture is a local or a global optimum. Given the learning outcomes and their practical application, this question was not the focus of our further research.
The main feature of the selected network lies in its activation functions. The input layer is connected to the hidden ones by the same linear function, but each hidden layer is linked to the output slab through a different activation function, as listed in Table 6.
In Figure 4, the neural network architecture is outlined.
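The structure described above can be illustrated with a short forward-pass sketch. It rests on several assumptions that are not confirmed by the text: the three hidden slabs are treated as parallel blocks that each receive the 8 inputs and feed the single output neuron; the three activation functions (hyperbolic tangent, Gaussian and Gaussian complement) are placeholders for those listed in Table 6; the output neuron is linear; and the weights are random rather than trained.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_OUTPUTS = 8, 4, 1

# Placeholder activation functions, one per hidden slab (assumptions).
ACTIVATIONS = [
    math.tanh,
    lambda x: math.exp(-x * x),        # Gaussian
    lambda x: 1.0 - math.exp(-x * x),  # Gaussian complement
]


def init_slab(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected slab."""
    return {
        "w": [[rng.uniform(-0.3, 0.3) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }


def init_network(seed=0):
    """Three parallel hidden slabs of 4 neurons and one output slab."""
    rng = random.Random(seed)
    hidden = [init_slab(N_INPUTS, N_HIDDEN, rng) for _ in ACTIVATIONS]
    output = init_slab(N_HIDDEN * len(ACTIVATIONS), N_OUTPUTS, rng)
    return hidden, output


def affine(slab, x):
    """Weighted sum plus bias for every neuron of a slab."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(slab["w"], slab["b"])]


def forward(network, x):
    """Predicted crossing time for one vector of 8 input parameters."""
    hidden, output = network
    activations = []
    for slab, act in zip(hidden, ACTIVATIONS):
        activations.extend(act(v) for v in affine(slab, x))
    return affine(output, activations)[0]  # linear output neuron


network = init_network()
prediction = forward(network, [0.5] * N_INPUTS)
```

Using a different activation function in each hidden slab lets the slabs respond to different features of the same input vector, which is the property the text highlights as the main feature of the selected network.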