*Article* **Possibility of a Solution of the Sustainability of Transport and Mobility with the Application of Discrete Computer Simulation—A Case Study**

**Nikoleta Mikušová <sup>1,\*</sup>, Gabriel Fedorko <sup>1</sup>, Vieroslav Molnár <sup>2</sup>, Martina Hlatká <sup>3</sup>, Rudolf Kampf <sup>3</sup> and Veronika Sirková <sup>1</sup>**

	- 370 01 České Budějovice, Czech Republic; hlatka@mail.vstecb.cz (M.H.); kampf@mail.vstecb.cz (R.K.)

**Abstract:** The paper is focused on an example of a solution for the sustainability of transport and mobility with the application of discrete computer simulation. The obtained results from the realized simulation were complemented with the selected multi-criteria decision-making method, namely the analytic hierarchy process (AHP) method. The paper describes the use of the simulation model for obtaining characteristics of alternative solutions that were designed for the needs of transport sustainability. The aim is to address the problem of traffic congestion in urban agglomerations. The simulation model serves as a means to provide information for the needs of their analysis by multi-criteria evaluation by the AHP. The methodology is based on a combination of computer simulation and multi-criteria decision-making and presents a useful tool that can be used in the field of transport sustainability. The paper notes methods to implement analysis of alternative solutions in transport. However, this procedure can also be used to solve other problems in the field of logistics systems. The paper compares five possible solutions for the organization of transport at intersections. Multi-criteria decision-making was realized based on 12 criteria. The result was the solution that reduced the length of congestion in almost all directions, with a maximum shortening of 69 m and a shortening of the average delay by 26 s compared to the current state.

**Keywords:** transport; sustainability; mobility; simulation

#### **1. Introduction**

Sustainable transport within urban agglomerations creates conditions that ensure reliable satisfaction of transport needs and the functioning of individual transport systems. The aim is to ensure the smooth travel of the population, promote public passenger transport, improve the environment, and increase safety and the flow of traffic [1]. To achieve sustainable cities, it is essential to create and sustain changes in people's social behavior through new approaches to mobility, from inefficient, uneconomical, and motorized means, to cleaner, greener, healthier, and more economical means. One of the solutions implemented in connection with sustainable mobility is the support of public passenger transport.

This relates to increasing its attractiveness, with the aim to encourage passengers to switch from individual motoring to public transport. To achieve this aim, it is necessary to create such preconditions within the transport infrastructure that public passenger transport runs continuously, without undue delay, and allows a continuous form of transport [2].

Sustainable mobility within urban agglomerations is closely linked to the effect of transport on the environment. In this regard, it is possible to talk about the importance of green city transport [3]. One of the key tasks is to implement measures to reduce emissions of toxic substances into the environment and the interaction of the car with the environment [4] and, where possible, to reduce the traffic volume in cities [5].

**Citation:** Mikušová, N.; Fedorko, G.; Molnár, V.; Hlatká, M.; Kampf, R.; Sirková, V. Possibility of a Solution of the Sustainability of Transport and Mobility with the Application of Discrete Computer Simulation—A Case Study. *Sustainability* **2021**, *13*, 9816. https://doi.org/10.3390/su13179816

Academic Editor: Elżbieta Macioszek

Received: 3 August 2021 Accepted: 28 August 2021 Published: 1 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In addition, environmentally friendly vehicles are a key element of green transportation in modern economies [6]. However, it must be emphasized that the use of these types of vehicles can depend on the financial and economic situation of cities. As such, their use faces challenges in the countries of Eastern Europe.

This problem can be addressed through traffic planning, whose importance stems from the growth of car traffic in cities. For this reason, cities are beginning to implement town planning measures aimed at improving the traffic situation, reducing congestion with a focus on public transport, and improving the environment [6]. One possible tool is a Transportation Management Information System (TMIS), which is linked to overall social and economic development, such as improving regional conditions, optimizing the environment, promoting communication, and accelerating development [7]. It is very important for cities to guarantee the efficiency and accuracy of the transportation system. One way to achieve this aim is to analyze the real-time status of the transportation network, which determines the urban distribution, travel activities, and development of urban systems [8]. In this context, real-time traffic prediction based on highly accurate spatio-temporal datasets from traffic sensors is a major challenge for intelligent transportation systems and sustainability, owing to complex topological dependencies and the high dynamics associated with changes in road conditions [9]. An important research topic in the field of modern intelligent traffic systems (ITSs) is path planning. Complex and changeable factors, for example, traffic congestion and traffic accidents, should be considered when planning paths, and path point planning schemes can improve the reliability of path planning and also ensure several services needed for the transportation process [10].

Town planning measures can be based on decision-making processes using a comparison of variants and partial solutions. The aim is to find a realistic transport scenario that can meet all the basic requirements associated with congestion.

One approach to traffic planning is the well-known microscopic traffic simulator, Simulation of Urban Mobility (SUMO), which is used to design traffic scenarios and present their parameters, in addition to the evaluation and validation of traffic requirements and mobility patterns [11]. The interdependencies among multimodal modes of transport significantly contribute to effective urban transport planning [11].

Due to the complexity of traffic problems, new approaches based on computer simulation and traffic modeling are increasingly being introduced.

It is possible to effectively use simulation models with different levels of detail [11]. For example, macroscopic traffic simulations focus on traffic streams but do not take into account the individual vehicles within them. By contrast, microsimulation is a form of traffic simulation capable of accurately modeling the behavior of vehicles in a defined environment [11]. The importance of real data for microsimulation in urban mobility monitoring can be seen in a study of vehicle mobility at a roundabout during two daily peak times [12]. The Internet of Things (IoT) can supply data for both micro- and macrosimulations, and IoT-based solutions are themselves an interesting tool for solving traffic problems. IoT can be used in the daunting task of quickly identifying vulnerable network sections [13], and it can provide data transmission and storage in the form of Big Data [14]. The obtained data can be used for real-time planning. Another interesting technology related to data storage is blockchain, which combines distributed data storage, timestamp technology, and a peer-to-peer network. This technology can also provide a solution for secure distributed cloud data storage [15].

Several software packages are used for traffic simulation. One interesting example is the PTV Vissim software, which realistically simulates complex vehicle interactions at the microscopic level [16]. This software can be applied to the simulation of future traffic and transit conditions [17], and the obtained results can reveal the potential to reduce vehicle travel times and delays [17]. It can be used for testing several scenarios relating to the effectiveness of increased safety of toll plazas [18]. The software and the resultant simulation models allow the optimal efficiency of the road network to be determined, for example, in a viaduct, hence allowing traffic management proposals to be made to reduce delays [19]. PTV Vissim can be used to simulate the effects of congestion and delays on a motorway network due to an accident, and then to apply a quantified regression formula to predict the time of traffic recovery [20]. The software and simulation models can evaluate road improvements or new traffic management strategies in different weather conditions [21]. With the help of the standard microscopic simulation platform of PTV Vissim, it is possible to compare the efficiency characteristics of algorithms for autonomous intersection control [22].

In addition to computer simulation tools, it is suitable and effective to apply other methods that support the solution of complex and difficult problems of transport sustainability, particularly in research [23]. One possible approach is multi-criteria decision-making (MCDM). It must be emphasized that MCDM for public transportation is a complicated task that involves environmental, economic, and socio-political issues [24]. Nevertheless, several studies have applied this approach to different transport problems, for example, the use of MCDM for decisions on alternative-fuel public transport buses [25], the selection of sustainable urban transportation alternatives using fuzzy multi-criteria decision-making (FMCDM) [26], the use of the analytic hierarchy process (AHP) for the selection of suitable vehicles [27], the use of AHP to determine the best solutions in traffic planning [28], and the use of AHP to design and evaluate highway routes [29].

The literature also provides interesting combinations of MCDM and the simulation approach. This combination can be used to address transport sustainability and transport problems, for example, by modeling and testing different intersections in PTV Vissim and then building the AHP model from the simulation results, or by using AHP to solve traffic issues that arise from ad hoc urban planning, with alternative road geometries and signaling modeled in PTV Vissim [30–32]. This idea is also presented in this paper.

The goal of this paper is to show how information provided by experts can be used together with the simulation software PTV Vissim and how the simulation results can be implemented in the proposed AHP model. By experts, we mean specialists in conducting and evaluating traffic surveys. These specialists carried out a traffic survey at the examined transport hub, and their data were subsequently used to create the simulation model.

A novelty of the paper is the application of AHP for the selection of a suitable alternative based on the results of simulation experiments. In comparison to a recently published paper [30], which focuses on evaluating the quality of public passenger transport using AHP, the present paper focuses on the solution associated with changing the transport organization, which will not only have an impact on the quality of public transport but will also benefit the environment. The paper also demonstrates its application in the field of transport problems. A similar issue is also presented in [31], but this research study uses multi-criteria decision-making for evaluation and diagnostics of urban streets through an integrated multi-criteria model of a sustainable nature.
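To make the AHP step more concrete, the following minimal Python sketch derives priority weights from a pairwise comparison matrix and checks its consistency. The three criteria, the judgment values, and the function names are hypothetical illustrations and are not taken from the paper's 12-criteria model. The weights use the row geometric-mean approximation rather than the principal eigenvector, and the random-index values are the commonly tabulated ones attributed to Saaty.

```python
import math

# Commonly tabulated random consistency index (Saaty), keyed by matrix size.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate AHP priorities with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Return CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    ratios = [
        sum(a_ij * w_j for a_ij, w_j in zip(row, weights)) / w_i
        for row, w_i in zip(matrix, weights)
    ]
    lambda_max = sum(ratios) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical judgments for three illustrative criteria:
# queue length, average delay, travel time.
judgments = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(judgments)
cr = consistency_ratio(judgments, weights)
print([round(w, 3) for w in weights], round(cr, 4))
```

A consistency ratio below 0.1 is the usual rule of thumb for accepting a set of judgments; larger values suggest that the pairwise comparisons should be revisited.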

#### **2. Materials and Methods**

Addressing the issue of transport sustainability is dependent on the relevance of the criteria used, which is one of the most critical points of the many techniques available to derive decision-making solutions [32]. The selection of criteria and the sequence of steps in the analysis of transport sustainability is a challenging process [33]. The approach to the solution of transport sustainability includes several steps, of which the first is the identification of the sustainability assessment criteria. Computer simulation is often used in this process.

*Sustainability* **2021**, *13*, x FOR PEER REVIEW 4 of 25

The use of computer simulation to tackle transport problems is described by the algorithm in Figure 1. This suggests that computer simulation can be used twice. The first use is within the analytical phase when the simulation model is amassing data and materials to suggest a solution. The second use of the simulation model is implemented in the designing phase when it is necessary to conduct a large number of experiments and evaluations with the intent to assess individual alternative solutions. This simulation method can also be used to compare the original state with the suggested solution. In the future, the resulting model may be applied to the traffic of the transport process and to answer "what if?" questions.

**Figure 1.** The algorithm solving the transport problem using computer simulation on a micro-level.

The use of the simulation model in the area of transport based on a case study referring to an actual traffic hub in the city of Uherské Hradiště in the Czech Republic is presented in the next section of the article.

#### *Described Solution of the Traffic Hub*

The traffic hub was constructed as a level intersection with light signaling to manage the traffic. This research was conducted in the territory of Uherské Hradiště, Kunovice, and Staré Město, using the transport plan of the city from 2015. The data from research carried out in different directions show a cartogram of a transport network loaded by individual car transport within 8 h of the research's duration in the above-mentioned territory. Figure 2 shows a section of the surroundings of the designated traffic hub.

**Figure 2.** The surroundings of the analyzed traffic hub.

The survey lasted 8 h, from 6:00 to 10:00 a.m. and from 2:00 to 6:00 p.m. Passing vehicles were recorded in 30 min intervals. The time and scope of the survey were chosen to cover the periods in which traffic congestion forms at the examined transport hub.
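The counting schedule described above can be sketched as follows. The snippet enumerates the 30-minute recording intervals of the two survey windows. The function name and the placeholder calendar date are illustrative assumptions; only the clock times come from the text.

```python
from datetime import datetime, timedelta


def counting_intervals(start, end, step_minutes=30):
    """Split [start, end) into consecutive counting intervals."""
    step = timedelta(minutes=step_minutes)
    intervals = []
    current = start
    while current + step <= end:
        intervals.append((current, current + step))
        current += step
    return intervals


# Placeholder date; the actual survey date is not stated in the text.
day = datetime(2000, 1, 1)
morning = counting_intervals(day.replace(hour=6), day.replace(hour=10))
afternoon = counting_intervals(day.replace(hour=14), day.replace(hour=18))
all_intervals = morning + afternoon
survey_hours = sum((e - s).total_seconds() for s, e in all_intervals) / 3600
print(len(all_intervals), "intervals covering", survey_hours, "h")
```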

The traffic hub is situated on the traffic artery connecting Kunovice, Uherské Hradiště, and Staré Město (Figure 2). The traffic volume is greatest on the east–west axis in both directions. The traffic situation in this hub is complicated by severe traffic congestion during the rush hour.

Vehicles approach the hub from the eastern side (B) over bridges leading across the arm of the river. In this direction, the intersection consists of three lanes:

• short turning lane to the right.

The other side (A) has two lanes:

• straight direction with a possibility to turn right.

The southern part of the intersection (D) has only two opposite lanes. The northern part (C) is an industrial and commercial quarter. It contains a shopping center and various retail chains (food, fashion, electronics, and hobby shops). There is also a company whose business relates to spare parts for different types of cars. In the immediate surroundings of the intersection, there is an industrial zone with a large number of passing lorries, even though a new bypass was built specifically so as not to overload the traffic hub.

The total number of passing vehicles in a specific direction is recorded as ∑ vehicles in Table 1. The number of lorries and buses from the total number of passing vehicles is expressed in L + B. Figures 3 and 4 represent passing vehicles moving in the prescribed directions. The figures were drawn according to the general city transport plan.

**Table 1.** The traffic intensity of the Hradišťská–Východní intersection.

| Entry | Direction | ∑Vehicles | L + B |
|---|---|---|---|
| A | A–B | 4815 | 212 |
| A | A–C | 358 | 5 |
| A | A–D | 32 | 0 |
| A | Together | 5205 | 217 |
| B | B–A | 3499 | 247 |
| B | B–C | 2530 | 125 |
| B | B–D | 121 | 0 |
| B | Together | 6150 | 372 |


**Figure 3.** Vehicles' dispersal from direction A and direction B.

**Figure 4.** Vehicles' dispersal from direction C and direction D.

Each direction is colored as follows:

Most vehicles coming from Staré Město continue on the main road to Uherské Hradiště. Figure 3 shows that only 7% of the vehicles turn to the shopping centers, which does not cause a significant delay at the traffic light.

The B direction (Figure 3) experiences a more significant dispersal than that of A. Most vehicles from Uherské Hradiště go on to Staré Město; however, 41% of the vehicles turn to the shopping centers. The vehicles turning at the light signal slow the traffic because there is a short turning lane, and when the light is green, the vehicles have to let pedestrians cross the road, as shown in Figure 4.
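The turning shares quoted above follow directly from the counts in Table 1. A minimal sketch, with the dictionary layout and function names chosen here purely for illustration:

```python
# Counts from Table 1: (total vehicles, lorries + buses) per movement.
TABLE_1 = {
    "A": {"A-B": (4815, 212), "A-C": (358, 5), "A-D": (32, 0)},
    "B": {"B-A": (3499, 247), "B-C": (2530, 125), "B-D": (121, 0)},
}


def movement_shares(entry):
    """Return each movement's share of the entry's total vehicle count."""
    movements = TABLE_1[entry]
    total = sum(vehicles for vehicles, _ in movements.values())
    return {name: vehicles / total for name, (vehicles, _) in movements.items()}


def heavy_share(entry):
    """Return the share of lorries and buses among all vehicles of an entry."""
    movements = TABLE_1[entry]
    total = sum(vehicles for vehicles, _ in movements.values())
    heavy = sum(lb for _, lb in movements.values())
    return heavy / total


for entry in TABLE_1:
    for name, share in movement_shares(entry).items():
        print(f"{name}: {share:.1%}")
    print(f"entry {entry}, lorries and buses: {heavy_share(entry):.1%}")
```

Rounded to whole percentages, the A–C and B–C movements reproduce the 7% and 41% shares of vehicles turning toward the shopping centers.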

Upon leaving the shopping centers, most vehicles continue to Uherské Hradiště (Figure 4). Traffic congestion is caused by the large number of vehicles heading to the shopping center and the short duration of the green signal (18 s). The number of vehicles from direction D (Figure 4) is negligible (120 vehicles) considering the total number of vehicles at the intersection. In this direction, no traffic congestion negatively influences the situation at the traffic hub.

Figures 3 and 4 suggest that the most problematic directions are as follows (considering traffic congestion):


The analysis results suggested four possible solutions to the existing situation at the traffic hub. These solutions were compared using computer simulation. The following solutions were proposed:


#### **3. Results**

The simulation model was created in PTV Vissim, a program for multipurpose microscopic traffic simulation based on the behavior of traffic participants. It makes it possible to examine and optimize traffic flows and covers a wide range of applications for modeling urban and motorway traffic and for integrating public and passenger transport. Operating conditions are visualized at a high level [34].

The road network is represented by nodes located at intersections and by connectors placed on road segments. Within the model, each road has the following defined properties:


Roads and connectors are the basic building blocks for adding more infrastructure objects. System elements are divided into different classes and spatial resolutions. Modeling of the transport system in this program is dependent on the specification of vehicles that will be used in the simulation model. Vehicles have the option of choosing the route. The vehicles are divided into categories in the model. Each category has a specific model of vehicle with mandatory technical characteristics, namely length, width, maximum speed, and deceleration and acceleration of the vehicle. The vehicles are generated randomly at the beginning by the function "vehicle inputs."

The program PTV Vissim analyzes and optimizes traffic flows. It comprises a wide range of applications for modeling city and highway traffic and integrating public and passenger transport. The visualization of traffic conditions is of a high standard. The workstation was equipped with the processor Intel Core i9-8950HK and the graphic card nVidia GeForce GTX 1080 8GB DDR5 [34].

The road network in the simulation model is represented by traffic hubs placed at intersections and by connectors at road segments. In this case, four routes are integrated with the use of connectors.

The model is defined by these characteristics:


Routes and connectors decide whether and which infrastructure objects should be added. Elements of the system are classified into various classes and 3D definitions. Placing an object is related to a specific traffic lane, which means that objects relevant to a specific traffic hub must be implemented in all segments. A local object does not have a physical length allocated, which means the object must be allocated in one specific point of a traffic lane. These point objects were used for the model:


Vehicles randomly stop 0.5–1.5 m before the specific signal. Three-dimensional objects with defined lengths emerge at a specific position of the lane. The objects used to define the infrastructure are:


When modeling the system, it is necessary to specify the vehicles located in a particular infrastructure. Passenger vehicles may choose a route. Vehicles in the model are classified into:


Individual categories include a specific vehicle model with mandatory technical characteristics, which are length, width, maximal speed, deceleration, and acceleration of the vehicle. Vehicles are randomly generated at the beginning of routes using the Vehicle Inputs function.
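One common way to model random vehicle generation is a Poisson arrival process with exponentially distributed headways. The sketch below illustrates only that general idea; it is an assumption made for illustration and does not describe how the Vehicle Inputs function is implemented in PTV Vissim. The function name, the seed, and the hourly volume are hypothetical.

```python
import random


def generate_arrivals(volume_per_hour, horizon_s=3600.0, seed=42):
    """Return sorted arrival times (s) of a Poisson arrival process."""
    rng = random.Random(seed)
    rate = volume_per_hour / 3600.0  # mean arrivals per second
    arrivals = []
    t = rng.expovariate(rate)
    while t < horizon_s:
        arrivals.append(t)
        t += rng.expovariate(rate)
    return arrivals


# Hypothetical input volume of 600 vehicles per hour.
arrivals = generate_arrivals(600)
mean_headway = arrivals[-1] / len(arrivals)
print(f"{len(arrivals)} vehicles, mean headway about {mean_headway:.1f} s")
```

Fixing the seed makes a run reproducible, which helps when the same random traffic has to be replayed against several alternative layouts.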

The numbers of vehicles included in the model and the dispersal of the traffic flow through the Relative Flows function are given in Table 2.


**Table 2.** Number of vehicles and dispersal of the traffic flow in the simulation model.

"Volume" represents the number of vehicles generated on the specific route. "Vehcomp" refers to the composition of the traffic flow, in which 3 means passenger transport and 2 is haulage. For A and D directions, the total should equal 1; this result shows that all included vehicles were dispersed into the designated directions.

Vissim includes the light signalization at intersections in the infrastructure. The cycle length of the light signalization is influenced by dispositions of the traffic hub, traffic load, number of cycle phases, form of turning, length of the pedestrian crossing, and construction work at the intersection. The length is determined by fixed split times.

The cycle length at the Hradišťská–Východní–Zrezavice intersection is modified according to rush hours in the morning and the afternoon and on weekends and holidays. The morning cycle lasts 65 s, the afternoon cycle 80 s, and the cycle during holidays is 100 s.

Figure 5 shows a signal program for the intersection in Uherské Hradiště. The cycle length is 65 s in the basic model. The duration of the color green for individual directions is as follows:

• direction from the industrial and commercial zone—18 s;

• direction from the quay—5 s.

**Figure 5.** A signal program for a modeled traffic hub in Uherské Hradiště.
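The signal timing above can be related to approach capacity with the standard relation c = s · g / C, where g is the green time, C the cycle length, and s the saturation flow. The sketch below is illustrative only: the saturation flow of 1800 vehicles per hour per lane is an assumed textbook-style value, not a figure from the paper, and the calculation ignores lost time and amber intervals.

```python
def green_ratio(green_s, cycle_s):
    """Fraction of the cycle during which the approach has a green signal."""
    if not 0 < green_s <= cycle_s:
        raise ValueError("green time must be positive and no longer than the cycle")
    return green_s / cycle_s


def approach_capacity(green_s, cycle_s, saturation_flow=1800.0):
    """Capacity in vehicles per hour per lane: c = s * g / C."""
    return saturation_flow * green_ratio(green_s, cycle_s)


CYCLE_S = 65  # cycle length of the basic model
GREEN_S = 18  # green for the industrial and commercial zone

cycles_per_hour = 3600 / CYCLE_S
green_per_hour_s = cycles_per_hour * GREEN_S
capacity = approach_capacity(GREEN_S, CYCLE_S)  # uses the assumed saturation flow
print(f"g/C = {green_ratio(GREEN_S, CYCLE_S):.3f}, "
      f"green per hour = {green_per_hour_s:.0f} s, "
      f"capacity = {capacity:.0f} veh/h per lane")
```

Under these assumptions the 18 s green occupies a little over a quarter of the 65 s cycle, which is consistent with the text identifying this short green as a source of congestion.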

When the green light is on, it is possible to turn right from the industrial area in the direction of Uherské Hradiště. The specific layout of the intersection (Figure 6) overlaps the green wave from the Uherské Hradiště direction with the green wave from Staré Město, which results in downtime when turning from Staré Město toward the commercial and industrial zone.

Simulation experiments were realized using the simulation models, the results of which present the basic characteristics of variants (Tables 3–5).

**Figure 6.** Model of the Hradišťská–Východní–Zrezavice intersection. (**a**) turning right from the industrial zone in the direction of Uherské Hradiště (when the green light is on); (**b**) downtime in turning from the direction Staré Město toward the commercial and industrial zone; (**c**) overlaps the green wave from the Uherské Hradiště direction with the green wave from Staré Město.


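**Table 3.** Numbers of vehicles and vehicle travel time: current state.

| Interval (s) | A–B vehicles | A–B travel time (s) | B–C vehicles | B–C travel time (s) | C–B vehicles | C–B travel time (s) |
|---|---|---|---|---|---|---|
| 2700–3600 | 332 | 41.27 | 215 | 25.18 | 157 | 67.32 |
| Average | 359.5 | 37.50 | 217.25 | 25.57 | 169.5 | 68.80 |

**Table 4.** Numbers of vehicles and vehicle travel time: changing route from Staré Město.

| Interval (s) | A–B vehicles | A–B travel time (s) | B–C vehicles | B–C travel time (s) | C–B vehicles | C–B travel time (s) |
|---|---|---|---|---|---|---|
| 1800–2700 | 587 | 25.41 | 222 | 24.51 | 168 | 60.54 |
| 2700–3600 | 562 | 25.91 | 244 | 22.66 | 198 | 58.47 |
| Average | 561.3 | 25.40 | 234.3 | 23.00 | 189 | 59.70 |
| Minimum | 553 | 25.18 | 219 | 22.22 | 187 | 58.10 |
| Maximum | 562 | 25.91 | 222 | 24.51 | 203 | 61.70 |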

**Table 5.** Numbers of vehicles and vehicle travel time: modified traffic lights.


This structure of results allows variants to be compared, taking into account the lengths of congestion in each direction (Figure 7) and the vehicle delay (Figure 8).

**Figure 7.** Length of congestion by direction.

The simulation experiments for the alternative solutions were evaluated using data sets that most concisely characterize the undesirable condition at the transport node, namely traffic congestion. First, the length of the vehicle columns in all directions was monitored; this indicator reflects the capacity of the traffic directions. The next indicator was the time vehicles needed to pass through the transport hub, which shows whether vehicles pass slowly in any of the variants, with the associated adverse environmental impacts. The final indicator was the delay of vehicles at the intersection, i.e., the time that includes, in addition to driving time, the waiting of vehicles at the transport hub due to traffic congestion.

**Figure 8.** Average delay of all vehicles.

#### **4. Discussion**

To provide a comprehensive solution and comparison of the individual alternatives, four independent microscopic simulation models were created. Subsequently, simulation experiments were carried out and their results were compared. The key objective was to derive a solution that represented a substantial improvement over the existing, unsatisfactory traffic situation.

#### *4.1. The Experiment of Changing the Composition of Vehicles*

This experiment consists of excluding lorries from the intersection. Lorries can use a bypass without slowing down the traffic in this prominent hub. The model preserved random numbers of buses because the exact numbers were not the subject of the research. During the simulation, 4638 vehicles passed through the traffic hub, which was 126 vehicles more than in the existing state. The transit time of vehicles when excluding lorries did not exceed 25 s for 218 vehicles. Compared to the existing state, there was only a slight difference. The main difference in the transit time and the number of vehicles coming from the commercial zone to Uherské Hradiště compared to the current state is that an additional 15 vehicles can pass through the specific segment within approximately the same period. The transit from Staré Město to Uherské Hradiště upon excluding lorries is not significantly influenced, which means that there is no major difference compared to the current state. Tailbacks in individual directions are usually longer than in the default state of the intersection. The average delay of cars at the traffic hub approximately equals that of the basic model, with the difference not exceeding 7 s of delay within one hour's simulation.

#### *4.2. Changing the Route from Staré Město Direction*

This experiment focuses on vehicles coming from the Staré Město direction and turning toward the commercial centers. At this traffic hub, this direction no longer provides the possibility of turning toward the commercial zone. The only options are to go straight or turn toward the quay. Vehicles traveling to the commercial zone must turn onto Huštěnovská Street at the previous junction and then continue toward the Luční District. As a result, the overall traffic capacity increased by 958 vehicles compared to the default state. In comparison to the current state, the amount of emissions produced was reduced by 23% after re-calculating the ratio of passing vehicles and the quantity of emissions. Because this experiment excluded turning left from the Staré Město direction, the light signalization was modified, which resulted in a shorter transit time in the given directions and an increase in the number of vehicles passing through this segment. Compared with the basic model, approximately 20 more vehicles were able to pass in the C–B direction in a shorter transit time in the modified model. The most significant change occurred in the A–B direction, where the number of passing vehicles increased from the original 360 to 561 and the transit time was reduced by about 12 s (from 37.50 s to 25.40 s on average; Tables 3 and 4). This represents an improvement of 62.5% after the total period was calculated. The tailback from the Staré Město direction was considerably shorter than in the original situation. Other directions did not see such a large difference. Overall, we may argue that the situation at the traffic hub improved because the overall delay at the intersection dropped from 344.5 to 239.45 s within one hour of the simulation. After re-calculating, the interval fell from 86 s to approximately 60 s.
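
The recalculation at the end of this subsection can be reproduced with a few lines of arithmetic. The sketch below assumes that the hourly delay totals are averaged over four 15-minute intervals, an interpretation suggested by the 900 s intervals in the result tables but not stated explicitly in the text; the function name is hypothetical.

```python
# Hypothetical helper: average an hourly delay total over equal sub-intervals.
# Assumption: the one-hour simulation is split into four 900 s intervals.
def delay_per_interval(total_delay_s, n_intervals=4):
    return total_delay_s / n_intervals


current = delay_per_interval(344.5)     # current state, about 86 s
modified = delay_per_interval(239.45)   # changed route, about 60 s

# Relative reduction of the overall delay, in percent.
reduction_pct = 100 * (344.5 - 239.45) / 344.5

print(f"current: {current:.1f} s, modified: {modified:.1f} s, "
      f"reduction: {reduction_pct:.1f} %")
```

Under this assumption, the two per-interval values agree with the 86 s and approximately 60 s reported above.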

#### *4.3. Implementing a Roundabout*

In this experiment, the construction and type of the intersection were changed to a two-lane roundabout (Figure 9). During the design, a minimal roundabout inner diameter of 5.3 m was preserved; the outer diameter was 25 m, which enables a smooth transit of longer semi-trailers.

The number of vehicles that passed through the given intersection within one hour's simulation equaled 545. The total amount of emissions that were released into the atmosphere during the simulation process decreased. Nonetheless, after re-calculating the emissions concerning the total amount of vehicles that had passed through the given segment, this ratio increased by 19.36% compared to the current state.

The transit time of vehicles from Uherské Hradiště to the commercial zone increased significantly compared to the basic model. By contrast, upon implementing the roundabout, the transit time from the commercial zone decreased and the traffic capacity from the respective zone expanded. The vehicles heading to Uherské Hradiště passed through within 46 s on average. However, vehicles coming from Staré Město experienced an increased transit time in this experiment. Tailbacks in separate directions after implementing the roundabout and in the current state were approximately equal; however, the tailback from the commercial–industrial zone was reduced by 72%.

#### *4.4. Changing the Cycle and Type of Light Signalization*

Another possible solution focused on modifying the light signalization at the traffic hub. The cycle of 65 s was preserved, and the alternatives for turning from the quay and commercial zone were added. The green interval from the C direction was changed from 18 s to 16 s. The green for the quay and commercial zone did not start at the same time—the green for the commercial zone was delayed by 2 s.

After modifying the light signalization, 4649 vehicles passed through the specific intersection; i.e., 137 vehicles more than in the current state. The transit of vehicles in the B–C direction improved by 1 s on average and the traffic capacity expanded by four vehicles. The transit time from the commercial zone was reduced, which resulted in a larger number of vehicles passing through the given traffic hub compared to the default setting of the light signalization. The same applied to the transit time of vehicles from Staré Město. The length of congestion was approximately equal; the only significant difference was from the quay direction, with a tailback 27 m long compared to the current 6 m. The waiting time at the traffic hub was roughly the same as in the default state even in Experiment 3. Vehicles wait a maximum of 89 s; i.e., 1.5 s more than before.

**Figure 9.** Model of the roundabout. (**a**) view of the two-lane roundabout from the direction of the industrial zone; (**b**) view of the two-lane roundabout toward Staré Město; (**c**) view of the two-lane roundabout toward Uherské Hradiště.

#### *4.5. Evaluation of Experiments*

The results of simulation experiments provide a wide range of information that characterizes in detail the individual variants designed to ensure the sustainability of transport in a given urban area. Figure 10 presents an example of part of the obtained results.

**Figure 10.** Transit time and congestion length.

From the results, it is necessary to choose the optimal solution based on the obtained parameters. Because the comparison is based on several criteria, it is appropriate in this case to apply the method of multi-criteria decision-making.

There are currently a large number of multi-criteria decision-making methods. One of these methods is the analytical hierarchy process (AHP). AHP is a structured technique used to solve complex decisions. It is based on mathematical procedures and human psychology [35]. AHP provides a comprehensive and logical framework for structuring a problem, quantifying the elements that are related to the overall objectives, and evaluating alternative solutions. Among the factors that make AHP perhaps the most popular decision-making method is that it accommodates fixed data, such as the speed of delivery and price, as well as personal experience [36]. It allows mathematical derivation of the weights of criteria, instead of the subjective selection of criteria weights used by other decision-making methods [37].

Several studies have used AHP for the solution of transport problems in the direction of sustainability and environmental impact; for example, using AHP for the development of a decision support framework to assess quantitative risk in multimodal green logistics [38]; the multiple criteria decision-making approach for route selection in the multimodal supply chain, which is based on the combination of AHP, data envelopment analysis (DEA), and the technique for order of preference by similarity to ideal solution (TOPSIS) [39]; and the evaluation and diagnosis of urban streets using an integrated multi-criteria model of a sustainable nature [31].

Based on published research, this method has good preconditions for its use in combination with the computer simulation method.

The model of AHP was created for comparison of variants of transport sustainability on the basis of the results of simulation experiments. The basic structure is presented in Figure 11.

**Figure 11.** The structure of the analytical hierarchy process (AHP) model.

The structure of the proposed AHP model consists of three hierarchical levels. At the highest level is the goal of choosing a suitable solution for the needs of transport sustainability. The goal is evaluated on the basis of 12 criteria derived from the traffic directions connected with the analyzed transport hub; these 12 criteria form the second level of the model. The third level is represented by the individual alternatives that were part of the realized simulation experiments.


The evaluation criteria and their weights were determined by a group of experts. The role of this group was to help determine the weights of the criteria, to determine the weights of the objectives, to organize the objectives, and to determine the weights of the decision criteria. The selection of the evaluation criteria and their weights was based on published knowledge [40,41]. Experts were selected so that their evaluation could be considered objective. We tried to achieve this by bringing together several specialists from areas related to the problem of the transport hub and its operation.

The criterion of the competence of the experts was their professional knowledge and their knowledge of the transport hub in terms of its operation and its impact on the environment. The experts have more than 10 years of experience in the researched issue. The structure of the experts is presented in Table 6.

**Table 6.** The criterion of the competence of the experts.

y means years.

The individual directions at the transport hub were chosen as criteria. These criteria made it possible to evaluate the individual variants. Criteria that represent the directions of traffic that directly affect the traffic situation at the main nodes have higher weight than the directions that provide the diversion of traffic to areas that are adjacent to the traffic hub and are located on commercial premises and residential blocks. The definition of weights (Table 7) was based on the recommendation published in [35].


**Table 7.** The definition of weights.

Three methods were considered for evaluating the criteria: brainwriting, brainstorming, and the Delphi method. This selection was based on a literature review, for example [42] and [43]. The brainwriting method was chosen, mainly because it put the selected experts on an equal footing and removed their fear of expressing their views in public [44]. Brainwriting is suitable when some participants may be shy about expressing their ideas [45]. In contrast to brainstorming, it can effectively involve all participants and, as a result, often yields more options [45]. A comparison of the individual methods and their possibilities of use within AHP is given in [42,43].

The evaluation of criteria in terms of significance is shown in Table 8. In their evaluation, the impact of individual roads on the creation of traffic congestion was taken into account.


**Table 8.** Evaluation of criteria in terms of significance.

In the next step, a pairwise comparison of the alternative solutions considered in the context of individual criteria was realized (Table 9). In this comparison, the lengths of traffic congestion were taken into account, which were identified in the implementation of simulation experiments. This means that based on routes (criteria), variants were evaluated and compared with each other. The lower the length of the traffic congestion, the higher the rating assigned to the alternative. At the same time, the principle of universal axiom was observed. This created a reciprocal matrix in which all elements on the main diagonal were equal to 1.
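
The construction of a reciprocal comparison matrix described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the judgment values and function names are hypothetical, and the priorities are approximated with the row geometric mean, a common alternative to the principal eigenvector.

```python
import math


def reciprocal_matrix(upper, n):
    """Build an n x n reciprocal matrix from upper-triangle judgments.

    `upper` maps (i, j) with i < j to the judgment "i compared to j".
    The diagonal stays at 1 and a[j][i] is set to 1 / a[i][j].
    """
    a = [[1.0] * n for _ in range(n)]
    for (i, j), value in upper.items():
        a[i][j] = value
        a[j][i] = 1.0 / value
    return a


def priorities(a):
    """Approximate the priority vector by normalized row geometric means."""
    n = len(a)
    gm = [math.prod(row) ** (1.0 / n) for row in a]
    total = sum(gm)
    return [g / total for g in gm]


# Hypothetical judgments for three alternatives under one criterion:
# a shorter congestion length earns a higher rating.
judgments = {(0, 1): 3.0, (0, 2): 5.0, (1, 2): 2.0}
matrix = reciprocal_matrix(judgments, 3)
weights = priorities(matrix)

for row in matrix:
    print([round(x, 3) for x in row])
print([round(w, 3) for w in weights])
```

Only the upper triangle has to be supplied, because the reciprocal property fixes the rest of the matrix.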




**Table 9.** *Cont.*



The basic scale of pairwise comparison was used in the comparison. In total, 12 pairwise comparisons were created, one for each criterion. In addition to the alternative proposals, the current situation at the researched transport hub was also included in the pairwise comparison. The aim was to avoid selecting a solution that would be worse than the current situation.

The study used values obtained from the simulation experiments, specifically the length of traffic congestion in the individual traffic directions, which characterizes the number of vehicles on a given road section. The transit time of the given section was used as a further value to monitor how favorable each transport solution was. Using these values in the evaluation made it possible to compare the individual sections and to remove any advantage arising from a section being less frequented. The consistency ratio (CR) was 0.0933. The main limitation of the analysis is the capacity of the transport infrastructure. Except for the roundabout, we assumed the use of the existing transport infrastructure without increasing its capacity. This approach was chosen because extending the road infrastructure would be demanding in terms of both the available space and the investment costs.
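
For readers unfamiliar with the consistency ratio, the following sketch shows one common way to compute it. It is an assumption-laden illustration rather than the procedure used in the study: the example matrix is hypothetical, the priority vector is approximated by row geometric means, and the random index values are Saaty's commonly cited ones for matrix orders 1 to 10. A CR below 0.10 is the usual rule of thumb for acceptable consistency, which the reported value of 0.0933 satisfies.

```python
import math

# Saaty's commonly cited random index (RI) values for matrix orders 1 to 10.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def consistency_ratio(a):
    """Return CR = CI / RI for a positive reciprocal matrix (nested lists)."""
    n = len(a)
    if n not in RANDOM_INDEX:
        raise ValueError("random index not tabulated for this matrix order")
    if n <= 2:
        return 0.0  # matrices of order 1 or 2 are always consistent
    # Approximate the priority vector by normalized row geometric means.
    gm = [math.prod(row) ** (1.0 / n) for row in a]
    total = sum(gm)
    w = [g / total for g in gm]
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    aw = [sum(a[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)  # consistency index
    return ci / RANDOM_INDEX[n]


# Hypothetical 3 x 3 comparison matrix (not taken from the study).
example = [[1.0, 3.0, 5.0],
           [1 / 3, 1.0, 2.0],
           [1 / 5, 1 / 2, 1.0]]
cr = consistency_ratio(example)
is_acceptable = cr < 0.10
print(f"CR = {cr:.4f}, acceptable: {is_acceptable}")
```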

Based on the presented data, a synthesis of partial evaluations of alternative solutions was realized. Its result is presented in Figure 12.
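
The synthesis step can be illustrated with the standard AHP weighted sum, in which the overall score of each alternative is the sum of its local priorities multiplied by the criteria weights. The sketch below uses hypothetical weights, priorities, and labels (two criteria and three variants instead of the 12 criteria and five variants of the study); it is not a reproduction of the reported results.

```python
def synthesize(criteria_weights, local_priorities):
    """Combine local priorities into overall scores by a weighted sum.

    criteria_weights: {criterion: weight}, weights summing to 1.
    local_priorities: {criterion: {alternative: priority}}.
    """
    first = next(iter(local_priorities.values()))
    return {
        alt: sum(criteria_weights[c] * local_priorities[c][alt]
                 for c in criteria_weights)
        for alt in first
    }


# Hypothetical inputs; V0 stands for the current state.
criteria_weights = {"A-B": 0.6, "B-C": 0.4}
local_priorities = {
    "A-B": {"V0": 0.2, "V1": 0.5, "V2": 0.3},
    "B-C": {"V0": 0.3, "V1": 0.4, "V2": 0.3},
}

scores = synthesize(criteria_weights, local_priorities)
best = max(scores, key=scores.get)
print(scores, best)
```

Because the criteria weights and each set of local priorities sum to 1, the overall scores also sum to 1 and can be read as shares of preference.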


#### **5. Conclusions**

The sustainability of transport in urban agglomerations is currently a key challenge to which constant attention is paid in the field of logistics. Approaches based on the use of computer simulation are widely used in the solution of partial needs related to transport sustainability.

Computer simulation presents a highly useful tool to resolve transport issues and should be applied as much as possible. Experts in transport and infrastructure should consider simulation models as invaluable working tools.

However, computer simulation is not a universal solution that can automatically solve all transport sustainability issues. Computer simulation in the field of transport is often used in the form of a support tool for obtaining broad-spectrum information, whose level of detail can be specified according to the requirements.

The ability of experts to use simulation models effectively, and to modify them with a view to resolving transport issues efficiently, should be a steady trend in the area of transport. In particular, microscopic simulation models, which may be applied to a wide range of transport problems and issues, should be employed at the outset.

When computer simulation is used to obtain information about different variants of a solution to a traffic problem, a final decision must still be made and a suitable variant selected. In these cases, multi-criteria decision-making is an effective approach, for which the AHP method is suitable. The combination of computer simulation and multi-criteria decision-making is an effective analytical tool for solving traffic tasks related to the field of transport sustainability.

This paper verifies, based on a practical example of a real traffic problem, possible solutions to the traffic problem using microscopic computer simulation and the AHP method. The investigated transport problem is associated with a change in the organization of the traffic hub and the selection of a suitable final variant.

The traffic hub was simulated using the PTV Vissim program, which is based on a multipurpose microscopic simulation of road traffic. After entering the required data (i.e., the number of vehicles, the traffic flow distribution, the permitted speed, and the light signalization interval) into the program, the output data were recorded. The applied results included information about the passing time of the vehicles, the length of the congestion, and the average delay of the vehicles at the intersection per simulated hour. Four simulation experiments were performed using the basic model, and the results obtained from these experiments were compared using multi-criteria decision-making via the AHP method.

The first experiment was oriented to changing the composition of the vehicles, thus excluding trucks from the intersection. The conditions of traffic were not improved significantly; rather, the conditions worsened.

Experiment no. 2 was focused on a change in traffic management in the direction from Staré Město, i.e., the vehicles were not allowed to turn from this direction toward the commercial zone. Comparing the results with the original situation, the traffic-carrying capacity improved by almost 1000 vehicles. The transit time was also more efficient and the length of the line of vehicles in the direction from Staré Město shortened.

Another experiment was focused on the application of a roundabout equipped with two traffic lanes. Passage of the vehicles through this intersection significantly worsened. The only positive result was the shortening of the line of vehicles in the direction from the commerce–industrial zone by 72%.

The last experiment consisted of a change in the traffic light signalization. Signals were added for the turn options and the length of the green light was adjusted. The result of this last experiment was that more vehicles passed through the traffic hub. The passage of vehicles and the length of the column of cars changed slightly.

The experiments were evaluated with the AHP method, based on a comparison of the length of congestion. Twelve criteria were used in the multi-criteria decision-making, which compared the individual variants based on the results of the simulation experiments and selected the optimal solution.
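The weighting step of the AHP method can be illustrated with a minimal sketch. It assumes the row geometric-mean approximation of the priority vector and Saaty's random consistency index; the three criteria and the pairwise comparison values are hypothetical and are not the twelve criteria or the judgements used in this study.

```python
import math

# Saaty's random consistency index (RI) for matrix sizes 1..10.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(pairwise, weights):
    """Consistency ratio CR = CI / RI; values below 0.1 are usually accepted."""
    n = len(pairwise)
    if n < 3:
        return 0.0  # 1x1 and 2x2 matrices are always consistent
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(a * w for a, w in zip(row, weights)) for row in pairwise]
    lambda_max = sum(x / w for x, w in zip(aw, weights)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical criteria: queue length, average delay, throughput.
# pairwise[i][j] states how much more important criterion i is than j.
example_matrix = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
example_weights = ahp_weights(example_matrix)
example_cr = consistency_ratio(example_matrix, example_weights)
```

The same weighting can then be applied to the alternatives under each criterion, and the variant with the highest weighted score is selected.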

According to the performed experiments, it is evident that the best solution for improvement of the traffic situation for the given traffic hub was derived from the experiment with the change in traffic management in the direction from Staré Město.

The presented procedure is a productive and reliable methodology that can be applied to solve a wide range of traffic problems. This enables us to make decisions, perform different types of analyses, and investigate the functioning of different types of transport processes.

The application of the model makes it possible to create guidelines for the priority measures in the field of public investment in urban infrastructure. The appropriateness of AHP in the field of transport is also shown by the research study [46], in which AHP is used to evaluate the quality of public passenger transport. In conclusion, the main novelty of the current paper is the combination of the simulation approach and the AHP model to select a suitable transport alternative. According to the available information, this combination has not yet been applied in this way to urban transport or to transport sustainability. The paper was written to show a practical example of how a combination of discrete simulation and multi-criteria decision-making can be used to select solutions for the needs of transport sustainability. The results presented follow several studies that have already been published. The creation of the discrete simulation model applied knowledge about the calibration and validation of a microscopic simulation model in the program PTV Vissim [47]. The selection of the optimal solution by the AHP method is related to the study focused on the creation of a traffic model of an intersection [48]. The paper also points out that multi-criteria decision-making can be used for modeling a mixed traffic flow that passes a traffic junction and causes traffic congestion. New knowledge about this research problem is presented in [49]. The paper further indicates which data need to be focused on if there are several possible solutions in terms of transport sustainability within urban agglomerations. The paper thus extends the results that [50] presents marginally. The results presented extend the knowledge base presented in [51]. In contrast to the use of the AHP method for the process of transport organization in road tunnels, the present paper extends this issue to the area of intersections in built-up areas.

**Author Contributions:** Conceptualization, G.F. and V.M.; methodology, M.H. and N.M.; validation, G.F. and V.M.; formal analysis, N.M.; resources, R.K.; data curation, M.H.; writing—original draft preparation, G.F.; writing—review and editing, V.M. and V.S.; visualization, N.M. and V.S.; supervision, V.M.; project administration, R.K.; funding acquisition, G.F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the projects of the Scientific Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic and the Slovak Academy of Sciences (project No. VEGA 1/0403/18, VEGA 1/0638/19, and VEGA 1/0600/20) and by the projects of the Cultural and Educational Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic and the Slovak Academy of Sciences (project No. KEGA 012TUKE-4/2019, KEGA 013TUKE-4/2019, KEGA 049TUKE-4/2020, and APVV SK-SRB-18-0053).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The presented data can be obtained by email from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

## *Article* **Deep Journalism and DeepJournal V1.0: A Data-Driven Deep Learning Approach to Discover Parameters for Transportation**

**Istiak Ahmad <sup>1</sup> , Fahad Alqurashi <sup>1</sup> , Ehab Abozinadah <sup>2</sup> and Rashid Mehmood 3,\***


**Abstract:** We live in a complex world characterised by complex people, complex times, and complex social, technological, economic, and ecological environments. The broad aim of our work is to investigate the use of ICT technologies for solving pressing problems in smart cities and societies. Specifically, in this paper, we introduce the concept of deep journalism, a data-driven deep learning-based approach, to discover and analyse cross-sectional multi-perspective information to enable better decision making and develop better instruments for academic, corporate, national, and international governance. We build three datasets (a newspaper, a technology magazine, and a Web of Science dataset) and discover the academic, industrial, public, governance, and political parameters for the transportation sector as a case study to introduce deep journalism and our tool, DeepJournal (Version 1.0), that implements our proposed approach. We elaborate on 89 transportation parameters and hundreds of dimensions, reviewing 400 technical, academic, and news articles. The findings related to the multi-perspective view of transportation reported in this paper show that there are many important problems that industry and academia seem to ignore. In contrast, academia produces much broader and deeper knowledge on subjects such as pollution that are not sufficiently explored in industry. Our deep journalism approach could find the gaps in information and highlight them to the public and other stakeholders.

**Keywords:** natural language processing (NLP); topic modelling; BERT; transportation; newspaper; magazine; academic research; journalism; deep learning; smart cities

#### **1. Introduction**

#### *1.1. A Complex World, Governance Failures, and Deep Journalism*

We live in a complex world characterised by complex people, complex times, and complex social, economic, and technological environments. As if this were not enough, our complex activities have complex effects on our ecological environment. This is not an easy time for our governments to manage matters that affect our social, economic, and environmental sustainability. There is clear evidence that governments are failing at addressing education, healthcare, public safety, and the list goes on [1–9]. The recent COVID-19 pandemic is a major example of global governance failure both at preventing such pandemics (caused by the effects of our lifestyles, processed food we eat, and other activities that damage our planet's environment) and managing the COVID-19 pandemic [10–15]. Governments are elected by people, and in a sense, government failure is also a failure of the public. It is time that we take responsibility for both success and failure and look into ways of collaboratively improving governance.

While there are many reasons for government failures, we believe the lack of information availability is a fundamental reason that limits a government's ability to act smartly and allows a lack of transparency to creep into policies and actions, leading to corruption and failure. While the sincerity and intentions of the people involved could be a major cause of shortcomings in any institution or system, particularly large-scale public institutions and systems, practical efforts can be made to reduce silos and segmentation and bridge the gaps in the information and knowledge available to different communities through direct or indirect cross-sectional conversations and collaborations enabled through automated and autonomous technologies such as deep learning, big data analytics, and others.

**Citation:** Ahmad, I.; AlQurashi, F.; Abozinadah, E.; Mehmood, R. Deep Journalism and DeepJournal V1.0: A Data-Driven Deep Learning Approach to Discover Parameters for Transportation. *Sustainability* **2022**, *14*, 5711. https://doi.org/10.3390/su14095711

Academic Editors: Tommi Inkinen, Tan Yigitcanlar and Mark Wilson

Received: 16 March 2022 Accepted: 2 May 2022 Published: 9 May 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

To this end, this paper introduces the concept of deep journalism, a data-driven deep learning-based approach for discovering multi-perspective parameters related to a topic of interest. We examine the academic, industrial, public, governance, and political parameters for the transportation sector as a case study to introduce deep journalism and our tool, DeepJournal (Version 1.0), that implements our proposed approach. The concept of deep journalism will be illustrated in the rest of this section (and this paper) as we introduce the challenges facing the transportation sector and our work on detecting parameters for it as viewed by the public, governments, industry, and academia. The production and distribution of multi-perspective parameters are expected to provide a holistic and multi-perspective view of a sector and help bridge the knowledge and collaboration gaps that exist to reduce inefficiencies and failures.

#### *1.2. Transportation and Challenges*

Transportation is fundamental to modern societies and economies. However, transportation is a major challenge considering the many issues that this sector faces and the design parameters that need considering in developing successful policies, systems, and operations. The issues facing transportation include the safety of people and goods, rising costs, growth of megacities, long commutes for work, parking problems, damage to health and the planet, and more.

Several modes of transportation exist (road, rail, air, and marine), each with its own challenges. Road transportation is considered the backbone of modern economies, although it claims over 1 million lives and causes 50 million injuries annually [16]. Rail transport requires huge capital investments and is therefore subject to monopolies, is relatively inflexible in terms of adjustments to individual passenger needs, cannot be moved around, and may be underutilised in different times and situations (such as during the recent COVID-19 pandemic). Moreover, heavily utilised trains are prone to frequent faults [17], cancellations [18,19], accidents [20–22], etc. [23]. Air transportation faces many challenges including pollution [24,25], high costs [26], high safety [27,28] and security risks [29,30], huge capital investments, fuel requirements [31], and others. Marine transportation also faces many challenges such as pollution, security risks [32–34], increasing costs, and environmental regulations [35,36]. These challenges are threatening the sustainability of our societies, economies, and our planet.

There is a need for innovative approaches based on collaborative thinking enabled through the availability of integrated information. Academia is not being used to its full potential [37]. What is possible in terms of technology and the potential of academia and people is not being matched with what is being done. Policy and action need to work together through dialogue to make information available to all bodies working in the transportation sector, the government, industry, academia, journalism, and the public. Deep Journalism could provide a solution.

#### *1.3. Summary of the Proposed Work*

In this work, we bring together a range of deep learning, big data, and other technologies to discover transportation parameters from three different perspectives using three different types of data sources, viz., a newspaper (*The Guardian*), a transportation technology magazine (*Traffic Technology International*), and academic literature on transportation (from Web of Science). The three types of data sources provide three different views of the transportation domain, that is, a view as seen by the public and governed by the political and other institutions, a second view from the transportation industry, and a third view as seen by the academics and researchers. Certainly, these views are not mutually exclusive and are to some extent affected by each other, but they do represent different perspectives with considerable differences. We call this approach Deep Journalism for two reasons. First, we call it deep in the actual sense of the word because it allows capturing and reporting a relatively deeper view of a topic (e.g., transportation) from multiple perspectives, dimensions, stakeholders, and depths. Second, we use deep learning to automatically discover multi-perspective parameters about a topic.

The newspaper dataset that we built to discover parameters for public, governance, and political aspects of transportation is collected from a UK-based newspaper, *The Guardian.* We collected all the articles from *The Guardian* newspaper that contain the word "transport" (in the title of the news, the full text of the news article, or the metainformation about the article) and found a total of 14,381 unique articles dated between September 1825 [38] and January 2022 [39]. We discovered a total of 25 parameters from *The Guardian* dataset and grouped them into 6 macro-parameters, namely Road Transport; Rail Transport; Air Transport; Crash and Safety; Disruptions and Causes; and Employment Rights, Disputes, and Strikes.
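The selection step described above can be sketched as follows. This is a minimal illustration, not the DeepJournal implementation: the `Article` record, its field names, the case-insensitive substring match (which also matches words such as "transportation"), and the use of the URL as the uniqueness key are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    """Hypothetical record for one scraped news article."""
    url: str
    title: str
    body: str
    meta: str = ""


def select_articles(articles, keyword="transport"):
    """Keep articles mentioning `keyword` in any field, dropping duplicate URLs."""
    needle = keyword.lower()
    seen_urls = set()
    selected = []
    for article in articles:
        text = " ".join((article.title, article.body, article.meta)).lower()
        if needle not in text:
            continue  # keyword absent from title, full text, and meta-information
        if article.url in seen_urls:
            continue  # the same article was already collected
        seen_urls.add(article.url)
        selected.append(article)
    return selected
```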

The industry and technology magazine dataset that we built to discover parameters about industrial aspects of transportation was collected from a technology-focused magazine, *Traffic Technology International (TTI),* a popular magazine reporting the latest transport technologies and news. We collected all the articles, a total of 5193 articles dated between February 2015 [40] and January 2022 [41] from the magazine website without any filters or search queries because this magazine only covers transportation-related news. We discovered a total of 15 parameters from the *TTI* dataset and grouped them into 5 macro-parameters, namely Industry, Innovation, and Leadership; Autonomous and Connected Vehicles; Sustainability; Mobility Services; and Infrastructure.

The academic-view dataset that we built to discover parameters for the academia-focused aspects of transportation is collected from an academic database, Web of Science. We collected in aggregate 21,446 research article abstracts (with titles and keywords) in the English language only from about 20 categories of academic disciplines in Web of Science, such as transportation science and technology, engineering, environmental science, telecommunications, economics, computer science, business, and others. The collected article abstracts were limited to the publishing years 2000 [42] to 2022 [43]. We discovered 49 transportation parameters from the academic dataset and grouped them into 6 macro-parameters. These are policy, planning and sustainability; transportation modes; logistics and SCM; pollution; technologies; and modelling.

We implemented the proposed deep journalism approach into a tool called DeepJournal (Version 1.0). The tool is able to discover transportation parameters using the datasets described above. The tool comprises four software components: Data Collection and Storage, Data Preprocessing, Parameter Modelling and Discovery, and Validation and Visualisation. The three datasets were collected using web scraping and other techniques and were preprocessed to remove duplicate and irrelevant data, tokenise data, clean up the data, and lemmatise data to generate data in a form that can be processed by the deep learning processing engine. We used a pretrained BERT (bidirectional encoder representations from transformers) word embedding model [44] to capture the contextual relations within the data. The BERT model was used along with UMAP (uniform manifold approximation and projection) [45] (a dimension reduction technique), HDBSCAN (hierarchical density-based spatial clustering of applications with noise) [46] (a clustering algorithm), and a class-based TF–IDF (term frequency–inverse document frequency) score, to automatically group documents in the datasets into document clusters.
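To make the class-based TF–IDF step concrete, the sketch below scores terms per document cluster using one common formulation, in which the frequency of a term within a cluster is weighted by log(1 + A / f), where A is the average number of words per cluster and f is the frequency of the term across all clusters. The function names, the toy clusters, and the choice of this exact formulation are assumptions for illustration. The embedding, UMAP, and HDBSCAN stages are not reproduced here; the clusters are taken as given.

```python
import math
from collections import Counter


def class_tfidf(clusters):
    """Score terms per cluster.

    `clusters` maps a cluster label to a list of tokenised documents.
    All documents of a cluster are merged into a single bag of words.
    Assumes at least one cluster is supplied.
    """
    class_counts = {
        label: Counter(token for doc in docs for token in doc)
        for label, docs in clusters.items()
    }
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)
    avg_words = sum(total_counts.values()) / len(class_counts)

    scores = {}
    for label, counts in class_counts.items():
        n_words = sum(counts.values())
        scores[label] = {
            term: (freq / n_words) * math.log(1 + avg_words / total_counts[term])
            for term, freq in counts.items()
        }
    return scores


def top_terms(scores, label, k=3):
    """Return the k highest-scoring terms of a cluster (ties broken alphabetically)."""
    ranked = sorted(scores[label].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical, already tokenised and clustered documents.
toy_clusters = {
    "rail": [["train", "delay", "strike"], ["train", "strike", "union"]],
    "road": [["car", "traffic", "delay"], ["car", "crash", "traffic"]],
}
toy_scores = class_tfidf(toy_clusters)
```

Terms that are frequent in one cluster but rare elsewhere receive the highest scores, which is what makes them useful as labels for a document cluster.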

Subsequently, we discovered transportation parameters and macro-parameters from each dataset using the document clusters along with the domain knowledge and a range of quantitative analysis methods performed on the clustered data including similarity metrics [47], hierarchical clustering [48], term score [49], keyword score [50], and intertopic distance map [51]. A range of visualisation methods were used to elaborate on the datasets, document clusters, and the discovered parameters. These include dataset histograms [52], taxonomies, similarity matrices [53], temporal progression plots, word clouds, and others. Multiple taxonomies of transportation from public, industry, and academic views were extracted using automatic clustering of datasets. Figure 1 depicts a high-level combined multi-perspective taxonomy of transportation as viewed by the public, industry, and academia. The first and second level branches in the figure show the discovered macro-parameters and parameters, respectively.
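As an illustration of how a similarity matrix between parameters could be built, the following sketch computes pairwise cosine similarities between parameter vectors. The choice of cosine similarity, the function names, and the toy vectors are assumptions; the specific similarity metric used by the tool is not restated here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Return (labels, matrix) with matrix[i][j] = similarity of labels[i] and labels[j]."""
    labels = list(vectors)
    matrix = [
        [cosine_similarity(vectors[a], vectors[b]) for b in labels]
        for a in labels
    ]
    return labels, matrix


# Hypothetical 2-dimensional parameter vectors.
toy_vectors = {"rail": [1.0, 0.0], "road": [0.0, 1.0], "safety": [1.0, 1.0]}
toy_labels, toy_matrix = similarity_matrix(toy_vectors)
```

The resulting matrix is symmetric, so it can be passed directly to a heat-map plot or used as input for hierarchical clustering.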

**Figure 1.** A multi-perspective taxonomy of transportation.

The findings related to a multi-perspective view (public, governance, political, industrial, and academic) of transportation show that there are many important problems such as transportation operations and public satisfaction that industry and academia seem to ignore, or perhaps if they do focus on them, the solutions do not achieve measurable results from the policymakers and industrialists. We can also see that academia produces much broader and deeper knowledge on the subject, while many important issues such as pollution are not publicised. Our deep journalism approach could find the gaps and highlight them for the public and other stakeholders.

The validation of our results can be considered internal or external. The internal validation is performed by investigating whether the articles and documents belonging to a certain parameter are related to the parameter. We have provided discussions on many articles in each dataset as to how those articles relate to the parameters. The external validation is performed by comparing parameters, keywords, and quantitative metrics across the three datasets (i.e., the three perspectives of transportation). The external validation is also performed by using sources other than the three dataset sources. Moreover, both the internal and the external validation are performed using the depictions produced by various visualisation methods.

Further details on the methodology and design of our deep journalism approach and the DeepJournal tool are presented in Section 3.

#### *1.4. Broader Aim, Novelty, and Contributions*

The broader aim of our work is to investigate the use of ICT technologies for solving pressing problems in smart cities and societies. Specifically, in this paper, we investigate the use of deep learning and digital methods to discover and analyse cross-sectional multi-perspective information to enable better decision making and develop better instruments for academic, corporate, national, and international governance. The contributions of this paper can be summarised as follows.


The literature review (Section 2) establishes that this work is novel in several respects: the proposed scheme of deep journalism, the developed digital methods and tools for the purpose, the use of three (or more) data sources to create a multi-perspective view of the transportation sector, the three datasets, the specific findings, and more.

#### *1.5. Journalism, Citizen Journalism, and Deep Journalism*

The public in many parts of the world is frustrated with their governments regarding the governance of social, economic, and other aspects of public lives. The public trust in governments has declined sharply with the emergence of phenomena such as kleptocracy, partisanship, and populism leading to extremism in our societies [54,55]. The main issues are related to the public's perception that the responsibility to provide impartial information and ideal governance lies with other people and not themselves. The American Press Institute defines the purpose of journalism as "to provide citizens with the information they need to make the best possible decisions about their lives, their communities, their societies, and their governments" [56]. Traditional journalism has failed at this purpose due to many reasons such as difficulty in maintaining the freedom and impartiality of media organisations and funding cuts, leading to public mistrust [57,58]. This is, for instance, highlighted by the UN News with a statement by UN Secretary-General, António Guterres, "at a time when disinformation and mistrust of the news media are growing, a free press is essential for peace, justice, sustainable development, and human rights" [59]. The distrust of traditional journalism coupled with the rise of the Internet, digital technologies, and digital and social media has given rapid rise to citizen journalism, i.e., journalism by the general public particularly using digital and social media [60–64]. Citizen journalism has also been referred to as public journalism, democratic journalism, participatory journalism, and other names.

Citizen journalism is complex due to its multifaceted, multidimensional, multilevel, and multimodal nature [65]. It is multifaceted due to its embrace of "a wide array of societal institutions, organisations, groups, and social actors at the intersection between journalism, community, and democracy". It is multidimensional due to it "embracing not only news production and creation but also news consumption and sharing, thus, generating interactive processes among news producers, consumers, and citizens". It is multilevel due to it "comprising journalists, sources, and news audiences at the individual level (micro-level), news organisations and other societal institutions at the organisational level (meso-level), and interorganisational networks in local communities and beyond (macro-level)". It is multimodal because it operates "across diverse communication platforms and channels" including radio, television, Internet, social media platforms, and more [65].

While citizen journalism solves some of the problems of traditional journalism, it comes with its own problems such as subjectivity and potentially lacking regulations, standards, quality, and responsibility [66–68]. The responsibility to maintain ideals lies with all people, and therefore, everyone has the responsibility and needs to work towards upholding honesty, sincerity, equity, freedom, and other ideals. Traditional and citizen journalism need to work together, and their problems need to be resolved collaboratively by the public.

Regarding journalism, the specific aim of this paper is to contribute to this area and help improve it through academic integrity and rigour. Academics should be in the vanguard of objective information, sincerity, impartiality, equity, and other ideals. Academics should search, pursue, propagate, and defend these ideals. If the academics fail to do so, then the responsibility lies with common people to pursue and be in the vanguard of the ideals needed to maintain a free society. The idea of deep journalism is to make impartial, cross-sectional, and multi-perspective information available, to bring rigour to journalism by nurturing responsibility in people by making it easy to generate information for the public benefit using deep learning, and to make tools and information available to common people so they can search and defend the ideals of freedom, including social, environmental, and economic sustainability.

#### *1.6. Software and Hardware*

The work reported in this paper was developed on the Aziz supercomputer that comprises a total of 500 CPU, GPU, and Intel MIC nodes. In addition, we also used Google Colab to run some experiments. Specifically, we used an Nvidia V100 GPU with 32 GB RAM, which combines 5120 CUDA cores and 640 Tensor cores for deep learning and other HPC loads. The V100 has a double-precision performance of 7.066 TFLOPS and a single-precision performance of 14.13 TFLOPS. The software and platforms used in this work include Python as the programming language, along with Pandas [69], Numpy [70], BERTopic [71], NLTK [72], Scikit-Learn [73], Gensim [74], SentenceTransformer, and PyTorch [75]. The data visualisation libraries used in this work include Seaborn [76], Plotly [77], and Matplotlib [78].
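The quoted peak figures are consistent with a simple back-of-the-envelope calculation, sketched below. The boost clock of 1.38 GHz, the counting of one fused multiply-add as two floating-point operations per core per cycle, and the assumption that the double-precision rate is half the single-precision rate are not stated in the text above; they are assumptions used only to show how such figures arise.

```python
def peak_tflops(cores, clock_ghz, flops_per_core_per_cycle=2):
    """Theoretical peak throughput in TFLOPS (cores x clock x FLOPs per cycle)."""
    return cores * clock_ghz * flops_per_core_per_cycle / 1000.0


CUDA_CORES = 5120          # from the hardware description above
ASSUMED_CLOCK_GHZ = 1.38   # assumption: boost clock, not given in the text

single_precision = peak_tflops(CUDA_CORES, ASSUMED_CLOCK_GHZ)
double_precision = single_precision / 2  # assumption: FP64 runs at half the FP32 rate
```

These are theoretical peaks; sustained throughput in deep learning workloads is typically lower and depends on memory bandwidth and kernel efficiency.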

#### *1.7. Section Organisation*

The rest of the paper is structured as follows. Section 2 reviews the related works and establishes the research gap. Section 3 describes the deep journalism methodology and the design of our tool. Section 4 introduces and discusses the parameters for public, governance, and politics. Sections 5 and 6 discuss the parameters for industry and academia, respectively. Section 7 provides discussion. Section 8 concludes and gives directions for future work.

#### **2. Literature Review**

We discuss in this section the works related to the proposed work in this paper. We conducted an extensive review of academic research on the use of artificial intelligence (AI) and data analytics for transportation. We did not find any work directly related to our paper. However, to place our work in the context of the overall body of work on data analytics in transportation, we review works in three areas. First, we discuss studies related to the use of AI and machine learning for transportation. Subsequently, we review research works that analyse and detect transport-related events by using social media data. Finally, we discuss works on the scientometric analysis of the general transportation literature, including scientometric analysis studies on specific areas of transportation.

Researchers have used machine learning for different problems in transportation. A large body of work on the use of deep learning is in object detection, environment perception, health effects, resilience in transport, etc. [79]. For example, Wang et al. [80] proposed a model, MobileNetv1\_yolov3lite, to detect objects and speed in real time. Zhu et al. [81] presented an overview of datasets, evaluation criteria, and future work on environment perception, i.e., vehicle tracking, scene understanding, traffic sign detection, lane and road detection, etc. for intelligent vehicles. Deep learning has also been used for many transport modelling problems, including collaborative decision making for environment perception [82], incident detection [83], disaster management [84], rapid transit systems for megacities [85], and traffic flow modelling [86]. Some other research works are on traffic flow prediction [87], autonomous vehicles [88], vehicular networking [89], automatic license plate recognition [90], crash prediction [91], and others.

Researchers have also used various social media data to analyse and detect different events to discover and solve transportation issues. For example, Alomari et al. [92] used a tool and machine learning algorithms to detect traffic events from a total of 2,511,000 tweets, and also to detect transportation-related concerns during the COVID-19 pandemic [93]. In a later study, they [16] used 33.5 million tweets for event detection and road traffic social sensing by using distributed machine learning algorithms. Their research demonstrated Twitter's efficacy in spotting major occurrences without previous information. Suma et al. used a big data tool for automatic event detection [94] from Twitter data and also used Apache Spark to automatically detect and validate the events [95]. Traffic incident detection is another challenge for the transportation system. Zhang [96] proposed LDA and a clustering-based algorithm to detect traffic incidents by using a Twitter dataset. They used the Carlo K-test to validate their research outcomes. There is other research on using social media datasets for various topics in transportation such as transportation planning and management, the traffic monitoring system, traffic event detection, etc. Wang et al. [97] proposed a traffic management system (i.e., traffic alert and warning) using Twitter data and the LDA topic modelling algorithm. In 2020, a BERT-based automatic traffic alert system was developed by Wan et al. [98]. The authors used Twitter data to evaluate their system. Additionally, they implemented a question-answering model to extract the location, time, and nature of the traffic events.

The works that can be considered more related to this paper are those where researchers used scientometric analysis for transportation-related topics. For example, Heilig et al. [99] used scientometric analysis to perform a study on academic research on public transportation, which offers better knowledge of articles, authors, countries, and keywords based on citation information. They used 7868 research articles with 160,132 references from 2009 to 2013. This is the only work that looked into the transportation area as a whole. All other works on scientometric analysis have focused on specific topics in transportation, and these are discussed in the following paragraph.

Das et al. [100] analysed 15,357 paper abstracts from 7 years of Transportation Research Board (TRB) yearly meetings (2008–2014) by using LDA to show the research patterns and intriguing histories of transport research. Sun and Yin [101] proposed an LDA-based topic modelling approach to find the research topics and temporal information over the last 25 years of transportation. They collected transport-related abstracts of 17,163 articles from 22 journals between 1990 and 2015 and applied LDA to discover 50 key academic research topics. In 2021, Putri et al. [102] proposed a systematic review of ITS by using the LDA and named entity recognition. They retrieved 23,823 titles and abstracts from the Scopus database between 1974 and 2020. Their research findings include the evolution of ITS development and related research areas. Some other research work has been conducted on several transport-related topics. For example, road safety is a significant component of the transportation system. Zou et al. [103] presented a scientometric analysis to reveal the core research area of road safety. The authors found that road safety studies focused mostly on driver psychological behaviour, prevention of traffic accidents, the impact of driver risk factors, and the analysis of the consequences and frequency of road crashes. In another research study, Gao et al. [104] presented a scientometric analysis on traffic safety sign research from 1990 to 2019. The authors collected 3102 articles from the Web of Science database and used Citespace to analyze and visualise the research domain. They discovered that most of the research had been conducted in the last 10 years. Their research also found that the United States is in the lead position in traffic sign research. AV is the most heavily researched topic to improve the transportation system. A scientometric analysis on autonomous vehicles was conducted by Faisal et al. [105]. 
They collected a total of 4645 research articles between 1998 and 2017 to perform the scientometric analysis. Their research presented the development of AV systems by analyzing the authors, affiliations, citations, and publications in AV research.

#### *Research Gap*

The literature review shows that the existing research on the use of machine learning in transportation has mainly focused on autonomous vehicles, object detection, and others. There are some works on social media data analysis for detecting events in transportation. The very few works that have focused on scientometric analysis are very different from our work. We did not find any research papers that have used newspapers, transport magazines, and academic research articles altogether. Our work is novel in several respects: the proposed scheme of deep journalism, the developed digital methods and tools for the purpose, the use of three (or more) data sources to create a multiperspective view of the transportation sector, the three datasets, the specific findings, and more.

#### **3. DeepJournal V1.0: Methodology and Design**

Figure 2 depicts the proposed system methodology and design, which analyses contextual topics to discover transportation issues, challenges, developments, and future planning from newspaper articles, magazine articles, and research article abstracts. The software architecture consists of four software components, which are described in the following subsections. The methodology overview, including the master algorithm, is provided in Section 3.1. In this research, we used three data sources: *The Guardian*, *Traffic Technology International*, and Web of Science. Sections 3.2–3.5 summarise these three data sources, discuss the data collection algorithm, and describe the datasets. Sections 3.6–3.9 cover data preprocessing, parameter modelling, parameter discovery and quantitative analysis, and validation and visualisation, respectively.

**Figure 2.** DeepJournal V1.0: the system architecture.

#### *3.1. Methodology Overview*

Algorithm 1 outlines the master algorithm, whose inputs are the search queries and website URLs needed for the data collection. The dataset was collected using the web scraping technique and stored in a CSV file. The CSV file was then loaded into a Pandas data frame, and the articles were passed to the data preprocessing function, which removes duplicate articles, performs tokenisation, removes irrelevant characters and stop words from the articles, performs lemmatisation with the allowed POS (part-of-speech) tags (i.e., noun, adjective, verb, and adverb), and generates cleaned tokens. After that, a pretrained BERT (bidirectional encoder representations from transformers) [44] word embedding model was used to capture the contextual relations between the words. Subsequently, we used UMAP (uniform manifold approximation and projection) [45], which is a dimension reduction technique; HDBSCAN (hierarchical density-based spatial clustering of applications with noise) [46], which is a clustering algorithm; and a class-based TF–IDF (term frequency–inverse document frequency) score to calculate the importance of words for each cluster. Additionally, we reduced the number of clusters by merging the most similar clusters. After that, we saved the clustering model and assigned a cluster to each article. Then, the clusters were relabelled as parameters, and the parameters were grouped into macro-parameters using domain knowledge along with a similarity matrix, hierarchical clustering, and other quantitative analysis methods. Finally, we visualised and validated the parameters against external and internal sources.

#### **Algorithm 1** Master Algorithm

**Input:** *searchQuery, weblink*

**Output:** *article with labelled parameter and visualization*


#### *3.2. Data Collection*

We used three types of data sources in this research: *The Guardian* (newspaper), *Traffic Technology International* (magazine), and the Web of Science (academic research). We utilised web scraping techniques (i.e., Python, BeautifulSoup, Requests, and Pandas) to obtain the *The Guardian* and *TTI* datasets from their corresponding websites. We collected the Web of Science dataset from its website as it allows users to download the dataset as a CSV format. We discuss the data collection steps for *The Guardian*, *Traffic Technology International*, and Web of Science in Sections 3.3–3.5, respectively.

#### *3.3. Dataset: Newspaper Articles (The Guardian)*

The newspaper dataset was collected from the UK-based newspaper *The Guardian* and covers September 1825 to January 2022. We retrieved all transport-related articles from the website using the web scraping technique and collected 14,855 articles.

Algorithm 2 shows the steps of the data collection process. Initially, we used the keyword "transport" to search for the related articles on the website. After that, we passed the web link of the newspaper as a parameter to the algorithm. We divided our data collection methodology into two functions: article link collection and data collection. In the first function, after acquiring all the links from the web page content, we removed the irrelevant links and saved the remaining links as a data frame. In the data collection function, we analysed the HTML and JavaScript code to obtain the article, date, and headline from the web page content. For each news article, we acquired the related heading and publication date. We saved the data in a data frame and finally saved the data frame into a CSV file. After retrieving the articles, we encountered a few duplicate articles. We eliminated all duplicate articles, resulting in 14,381 unique articles from *The Guardian*.
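The two-function split described above can be illustrated with a minimal sketch. It uses only the Python standard library rather than BeautifulSoup and Requests, and it parses an inline HTML string instead of fetching a live page. The function names, the `/transport/` link filter, and the sample markup are illustrative assumptions, not the authors' code or the actual page structure of *The Guardian*.

```python
import csv
import io
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def collect_article_links(page_html, must_contain="/transport/"):
    """Function 1: gather links, drop irrelevant ones, keep first occurrences."""
    parser = LinkCollector()
    parser.feed(page_html)
    seen, kept = set(), []
    for link in parser.links:
        # The substring filter is a hypothetical relevance rule.
        if must_contain in link and link not in seen:
            seen.add(link)
            kept.append(link)
    return kept


def save_rows_as_csv(rows, fieldnames=("headline", "date", "article")):
    """Function 2 (storage step only): serialise scraped rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical search-result markup, used only for illustration.
SAMPLE_HTML = """
<a href="https://example.org/transport/2021/rail-strike">Rail strike</a>
<a href="https://example.org/about">About us</a>
<a href="https://example.org/transport/2021/rail-strike">Rail strike (repeat)</a>
<a href="https://example.org/transport/2020/bus-fares">Bus fares</a>
"""

links = collect_article_links(SAMPLE_HTML)
csv_text = save_rows_as_csv(
    [{"headline": "Rail strike", "date": "2021-01-01", "article": "..."}]
)
```

In this sketch the link filter and the duplicate check are deliberately simple; a real scraper would also need pagination, request throttling, and error handling.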


Figure 3 shows the histogram of *The Guardian* news articles. The *y*-axis indicates the number of news articles and the *x*-axis the word count per news article. We noticed that the prevalent length of news articles was 200–500 words. The number of news articles that contained more than 800 words was relatively small. The maximum number of words in a document was 8341. A zoomed portion is shown inside the graph for readability. The figure also shows the density against the increasing number of words per news article. The maximum density is around 0.0016.

**Figure 3.** Histogram (*The Guardian* news articles).

#### *3.4. Dataset: Technology Magazine Articles (Traffic Technology International)*

*TTI* stands for *Traffic Technology International* which is a popular magazine related to the latest transport technology. From February 2015 to January 2022, we gathered 10,620 articles from all categories on the *TTI* website using the web scraping approach.

Algorithm 3 shows the steps of the data collection process. We divided the algorithm into two functions: article link collection and data collection. In the beginning, we passed the web link to the article link collection function. We used two loops, where the first loop iterated over the category list and the second loop iterated over the web pages of each category. We used a dictionary-type variable to store the category as a key and the total number of web pages for that category as a value. After obtaining all the links from the web pages, we removed the irrelevant links and saved the remaining links as a data frame. In the data collection function, we analysed the HTML and JavaScript code to obtain the article, publication date, and headline from the web page content. For each magazine article, we retrieved the related heading and publication date. We saved the data in a data frame and finally saved the data frame into a CSV file. After saving the data, we found many duplicates, as some articles appear in multiple categories. Therefore, we removed the duplicate articles from the dataset, which left 5193 unique articles.
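The nested category and page loops, together with the duplicate removal, can be sketched as follows. This is an illustration under stated assumptions: the URL pattern, the category names, the page counts, and the choice of headline plus date as the duplicate key are all hypothetical and are not taken from the *TTI* website or from the authors' implementation.

```python
def build_listing_urls(base_url, pages_per_category):
    """Outer loop over categories, inner loop over each category's listing pages."""
    urls = []
    for category, page_count in pages_per_category.items():
        for page in range(1, page_count + 1):
            # Hypothetical URL pattern, for illustration only.
            urls.append(f"{base_url}/{category}/page/{page}")
    return urls


def drop_duplicate_articles(articles):
    """Keep one record per (headline, date); an article may sit in several categories."""
    seen, unique = set(), []
    for article in articles:
        key = (article["headline"], article["date"])
        if key not in seen:
            seen.add(key)
            unique.append(article)
    return unique


# Dictionary with the category as key and its number of listing pages as value.
pages_per_category = {"news": 2, "features": 1}
urls = build_listing_urls("https://example.org", pages_per_category)

scraped = [
    {"headline": "Smart motorway trial", "date": "2021-05-01", "category": "news"},
    {"headline": "Smart motorway trial", "date": "2021-05-01", "category": "features"},
    {"headline": "New tolling system", "date": "2021-06-12", "category": "news"},
]
unique_articles = drop_duplicate_articles(scraped)
```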

#### **Algorithm 3** Data Collection (*Traffic Technology International*)

**Input:** weblink

**Output:** CSV file



Figure 4 depicts the histogram of the *Traffic Technology International* magazine articles. The *y*-axis and *x*-axis show the number of magazine articles and the word count per magazine article, respectively. The majority of magazine articles contained between 300 and 450 words or between 500 and 600 words (see graph peaks). The number of magazine articles that contained more than 600 words was relatively small. The maximum number of words in a document was 2323. The magnified plot inside the figure is presented for the convenience of the reader. The graph also depicts the density in relation to the increasing quantity of words per magazine article. The highest density is around 0.005.

**Figure 4.** Histogram (*Traffic Technology International* articles).

#### *3.5. Dataset: Academic Articles (Web of Science)*

We obtained the most pertinent documents from the Web of Science, the most comprehensive database with a consistent query language and data format. Furthermore, it facilitates access to subject indexes, citation indexes, and other databases from other disciplines, which assists in the discovery of relevant research and the evaluation of its findings. We collected 21,446 research articles by using the keyword "transportation" in several Web of Science categories, for example, transportation science technology, engineering electrical electronics, transportation, environmental science, telecommunications, economics, computer science information system, business, etc. The document type was limited to proceedings papers, articles, and review articles. News items, corrections, book chapters, data papers, book reviews, letters, editorial materials, and so on were excluded. Furthermore, we restricted the search to the English language and the publishing years 2000–2022. In addition, we utilised the advanced search and selected the "topic search" option, which yielded results from the title, abstract, and keywords columns.

Figure 5 illustrates the histogram of the Web of Science research article abstracts. The *y*-axis and *x*-axis show the number of article abstracts and the word count per article abstract, respectively. The majority of article abstracts contained between 150 and 250 words. The number of article abstracts that contained more than 400 words was relatively small, and only a few had more than 450 words. The maximum number of words in an article abstract was 1132. The magnified plot inside the figure is presented for the convenience of the reader. The graph also shows the density in relation to the increasing quantity of words per article abstract. The highest density is around 0.006.

**Figure 5.** Histogram (Web of Science article abstracts).

#### *3.6. Data Preprocessing*

We employed the same preprocessing algorithm for the three datasets. Algorithm 4 shows the preprocessing steps as follows: (1) remove duplicate articles, (2) remove irrelevant characters, (3) tokenisation, (4) stop word removal, and (5) lemmatisation with POS tags. Before these steps, we read the CSV file using the Python package "Pandas" and saved it as a data frame (DF). We then removed all duplicate data to reduce the data redundancy and removed all irrelevant characters, for example, several Unicode characters, from the texts. Next, we tokenised the texts using the simple\_preprocess function, which is included in the Python package called "gensim". The following step was to remove the stop words from the articles. Initially, we used the NLTK predefined stop words list for clustering and implemented the BERT parameter model. After obtaining the parameters from the BERT parameter model, we reviewed the corresponding keywords and identified the unnecessary keywords that were obtaining a high probability score in the parameters. After some testing, we finalised a list of stop words that did not carry significant importance for generating parameters. As a result, in our final model, we added those keywords to the stop words list and removed them from the articles. In the final step of data preprocessing, we performed lemmatisation using the "spaCy" engine and allowed only four types of part-of-speech tags: noun, verb, adjective, and adverb. After the preprocessing step, we obtained the cleaned articles, which were used for parameter modelling.

#### **Algorithm 4** Data Preprocessing

**Input:** *articles*

**Output:** *clean articles*
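The five preprocessing steps can be sketched in a few lines of Python. To keep the example self-contained, it swaps in standard-library stand-ins for the tools named in the text: a regular expression with a length filter replaces gensim's simple\_preprocess, a short hand-written set replaces the NLTK stop word list, and a toy lookup table replaces the spaCy lemmatiser and POS tagger. These stand-ins, and all names in the code, are assumptions for illustration only.

```python
import re
import unicodedata

# Stand-in for the NLTK stop word list plus the custom additions.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "on", "to", "is", "are", "was", "were"}
ALLOWED_POS = {"NOUN", "VERB", "ADJ", "ADV"}

# Toy (lemma, POS) table standing in for a real lemmatiser and POS tagger.
TOY_LEXICON = {
    "buses": ("bus", "NOUN"),
    "delayed": ("delay", "VERB"),
    "heavily": ("heavily", "ADV"),
    "roads": ("road", "NOUN"),
    "busy": ("busy", "ADJ"),
    "london": ("london", "PROPN"),
}


def preprocess(articles):
    """Return one list of cleaned lemmas per unique article."""
    cleaned = []
    for text in dict.fromkeys(articles):  # (1) drop exact duplicates, keep order
        # (2) strip accents and other non-ASCII characters
        text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
        # (3) lowercase and tokenise, keeping tokens of 2 to 15 letters
        tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if 2 <= len(t) <= 15]
        # (4) remove stop words
        tokens = [t for t in tokens if t not in STOP_WORDS]
        # (5) lemmatise and keep only the allowed parts of speech
        lemmas = []
        for token in tokens:
            lemma, pos = TOY_LEXICON.get(token, (token, "NOUN"))  # unknown words default to NOUN
            if pos in ALLOWED_POS:
                lemmas.append(lemma)
        cleaned.append(lemmas)
    return cleaned


articles = [
    "Buses were heavily delayed on busy roads in London.",
    "Buses were heavily delayed on busy roads in London.",
    "The café is on the road.",
]
clean_articles = preprocess(articles)
```

In this toy run the proper noun is dropped because only nouns, verbs, adjectives, and adverbs are allowed, which mirrors the POS filter described above.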


#### *3.7. Parameter Modelling*

We utilised the BERT topic modelling method [71] to cluster the data and discover parameters. At the beginning of parameter modelling, we generated a word-embedding model using BERT (Bidirectional Encoder Representations from Transformers), which is a transformer-based approach developed by Google [44]. Word embedding is a low-dimensional, dense vector representation of words, and BERT produces contextual embeddings. In this paper, we used the pretrained "distilbert-base-nli-mean-tokens" model as it maintains the balance between performance and execution time. We applied a dimensionality reduction algorithm, UMAP, to retain as much information as possible in a lower dimensionality. Furthermore, we used HDBSCAN to group similar articles together; each group defines a cluster or parameter. HDBSCAN is a density-based approach that complements UMAP effectively, considering that UMAP retains a range of local structures even at lower dimensionality. Additionally, HDBSCAN does not force every data point into a cluster; points that do not fit any cluster are treated as outliers.

Furthermore, a class-based TF–IDF (term frequency–inverse document frequency) score was used to calculate the importance of words for each parameter. By determining the frequency of a word in a given document as well as the measure of how prominent the word is in the entire corpus, TF–IDF provides a means of comparing the relevance of words between documents. However, if we consider all documents in a single group as a distinct document and then execute TF–IDF, we will achieve significance scores for words inside a cluster. This significance score is called the c-TF–IDF score. The more significant the words inside a cluster, the more representative the parameter. As a result, we can obtain keyword-based descriptions for every parameter. Equation (1) [50] describes the formula of the c-TF–IDF score, where *f* is the frequency of each word derived for each class *c* and divided by the number of words *w*. The total number of unjoined documents (*d*) is then divided by the total frequency of words (*f*) throughout all classes (*cc*).

$$\text{c-TF-IDF}_{c} = \frac{f_{c}}{w_{c}} \times \log \frac{d}{\sum_{p}^{cc} f_{p}} \tag{1}$$
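As a worked illustration of Equation (1), the sketch below computes the score for every word in every class from tokenised documents. It follows the symbols in the text: the word frequency within a class (*f*), the number of words in that class (*w*), the total number of documents (*d*), and the frequency of the word summed over all classes. The function name and the toy documents are assumptions made for the example, and the code is not the implementation used by the BERT topic modelling library.

```python
import math
from collections import Counter


def c_tf_idf(docs_by_class):
    """docs_by_class: {class_label: [token lists]} -> {class_label: {word: score}}."""
    # d: total number of (unjoined) documents across all classes
    d = sum(len(docs) for docs in docs_by_class.values())
    # Join the documents of each class and count word frequencies per class
    class_counts = {
        label: Counter(token for doc in docs for token in doc)
        for label, docs in docs_by_class.items()
    }
    # Frequency of each word summed over all classes
    total_freq = Counter()
    for counts in class_counts.values():
        total_freq.update(counts)

    scores = {}
    for label, counts in class_counts.items():
        w_c = sum(counts.values())  # number of words in the class
        scores[label] = {
            word: (f / w_c) * math.log(d / total_freq[word])
            for word, f in counts.items()
        }
    return scores


# Invented toy corpus: two classes with two documents each.
docs_by_class = {
    "rail": [["train", "strike", "train"], ["train", "fare"]],
    "road": [["car", "traffic"], ["car", "fare", "road"]],
}
scores = c_tf_idf(docs_by_class)
```

With the formula exactly as written, a word whose total frequency exceeds the number of documents receives a negative score, because the logarithm's argument drops below one.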

We fit all of the articles and trained the BERT parameter model after obtaining the c-TF–IDF. The number of parameters was then decreased by recalculating the articles' c-TF–IDF matrices and then iteratively merging the most often occurring parameter with the most similar one based on the respective c-TF–IDF matrices.

Finally, we assigned parameters to all the articles and saved the model. As the parameter was originally represented as an integer number, we further scrutinised the corresponding parameter articles and relabelled the parameters and aggregated them into macro-parameters using domain knowledge and quantitative analysis methods which are discussed in the next section.

#### *3.8. Parameter Discovery and Quantitative Analysis*

We discovered the parameters and macro-parameters using domain knowledge and quantitative analysis methods (i.e., term score, keyword score, intertopic distance, hierarchical clustering, and similarity matrix).

#### 3.8.1. Term Score

Not all keywords (terms) in the list for a parameter express the context of that parameter equally well. To identify a parameter, we must first determine how many keywords are required, as well as the starting and finishing positions of the significant keywords. We visualised the c-TF–IDF scores of the keywords for each parameter by sorting them in decreasing order [71]. This term score visualisation has a significant influence on identifying the parameter.

#### 3.8.2. Intertopic Distance Map

The intertopic distance map is a two-dimensional representation of the parameters, with the area of the parameter circles proportional to the number of words in the dictionary associated with each parameter. The circles are formed using a MinMaxScaler algorithm depending on the words they contain, with parameters closer together sharing more words [71].

#### 3.8.3. Keyword Score

In the BERT parameter model, we obtained a set of keywords representing a parameter, where each keyword has an importance score or c-TF–IDF score (see Section 3.7) that describes the context of the parameter.

#### 3.8.4. Hierarchical Clustering

Hierarchical clustering systematically pairs the parameters based on the cosine similarity matrices between the parameter embeddings [71]. By systematically pairing clusters, hierarchical clustering assembles a unique cluster of nested clusters. At each phase, beginning with the correlation matrix, all clusters are attempted in all possible pairs, and the pair with the greatest average inter-correlation within the experimental cluster is chosen as the new unique cluster.

#### 3.8.5. Similarity Matrix

The similarity matrix was visualised as a heatmap using the Plotly library in Python to show the similarity between parameters based on the cosine similarity matrix [71]. We computed the similarity matrix by calculating the cosine similarity score between the parameter embeddings to show the relationship between the parameters. We used the Plotly "GnBu" (green to blue) continuous colour scale, where dark blue represents the highest similarity between parameters and light green represents the lowest similarity.
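The computation behind the heatmap can be sketched with plain Python. The three-dimensional vectors below are invented stand-ins for parameter embeddings (real embeddings have far more dimensions), and the function names are hypothetical; only the cosine similarity formula itself is standard. The plotting step is omitted.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """embeddings: {label: vector} -> (labels, square matrix of pairwise similarities)."""
    labels = list(embeddings)
    matrix = [
        [cosine_similarity(embeddings[a], embeddings[b]) for b in labels]
        for a in labels
    ]
    return labels, matrix


def most_similar_pair(labels, matrix):
    """Return (label_i, label_j, score) for the highest off-diagonal entry."""
    best = None
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if best is None or matrix[i][j] > best[2]:
                best = (labels[i], labels[j], matrix[i][j])
    return best


# Invented toy embeddings, for illustration only.
embeddings = {
    "train crash": [0.9, 0.1, 0.4],
    "police and accidents": [0.8, 0.2, 0.5],
    "cycling": [0.1, 0.9, 0.2],
}
labels, matrix = similarity_matrix(embeddings)
pair = most_similar_pair(labels, matrix)
```

The same pairwise scores are what a hierarchical clustering step would consult when deciding which two parameters to pair first.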

#### *3.9. Validation and Visualisation*

The validation of our results can be considered to be internal or external. The internal validation was performed by investigating whether the articles and documents belonging to a certain parameter are related to the parameter. We have provided discussions on many articles in each dataset as to how those articles relate to the parameters. The external validation was performed by comparing parameters, keywords, and quantitative metrics across the three datasets (i.e., the three perspectives of transportation). The external validation was also performed by using sources other than the three dataset sources. Moreover, both the internal and external validation were performed using the depictions produced by various visualisation methods.

A range of visualisation methods were used to elaborate on the datasets, document clusters, and the discovered parameters. These include dataset histograms [52], taxonomies, similarity matrices [53], temporal progression plots, word clouds, and others. For example, we visualised the temporal progression for both parameters and macro-parameters. Initially, we merged the similar representable parameters and then counted the number of articles (intensity) by grouping the parameters and article publication year. Consequently, we obtained a list of intensities for each parameter with specific years. After that, we sorted the list according to the year and plotted the intensity against the year for each parameter. We also plotted the macro-parameter temporal progression in the same way by integrating the parameters of each macro-parameter.
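The counting behind the temporal progression plots can be sketched as follows. The article records, the mapping from parameters to macro-parameters, and the function names are illustrative assumptions; the plotting call itself is left out.

```python
from collections import Counter, defaultdict


def temporal_progression(articles):
    """articles: iterable of (parameter, year) -> {parameter: [(year, count), ...]} sorted by year."""
    counts = Counter(articles)  # intensity per (parameter, year)
    series = defaultdict(list)
    for (parameter, year), n in counts.items():
        series[parameter].append((year, n))
    return {parameter: sorted(points) for parameter, points in series.items()}


def macro_progression(series, macro_of):
    """Sum the yearly intensities of all parameters that share a macro-parameter."""
    totals = defaultdict(Counter)
    for parameter, points in series.items():
        for year, n in points:
            totals[macro_of[parameter]][year] += n
    return {macro: sorted(counter.items()) for macro, counter in totals.items()}


# Invented toy records: one (parameter, publication year) pair per article.
articles = [
    ("cycling", 2020), ("cycling", 2020), ("cycling", 2021),
    ("bus transport", 2020), ("bus transport", 2021),
    ("rail projects and contracts", 2021),
]
macro_of = {
    "cycling": "road transport",
    "bus transport": "road transport",
    "rail projects and contracts": "rail transport",
}
series = temporal_progression(articles)
macro_series = macro_progression(series, macro_of)
```

Each resulting list of (year, intensity) pairs can be passed directly to a line plot, one line per parameter or macro-parameter.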

We used several Python libraries for these visualisations, including Seaborn, Plotly, and Matplotlib.

#### **4. Public, Governance and Politics: Transportation Parameters Discovery**

In this section, we discuss the parameters detected by our BERT model from *The Guardian* dataset. The parameters are grouped into six macro-parameters. We provide an overview of the parameters and macro-parameters in Section 4.1. The quantitative analysis is discussed in Section 4.2. Subsequently, we discuss each macro-parameter in separate sections, Sections 4.3–4.8. Finally, Section 4.9 discusses the temporal analysis of the parameters and macro-parameters.

#### *4.1. Overview and Taxonomy (The Guardian)*

We detected a total of 25 parameters from *The Guardian* dataset using BERT. These 25 parameters were grouped into 6 macro-parameters using the domain knowledge along with similarity matrix, hierarchical clustering, and other quantitative analysis methods. The methodology and process of the discovery of parameters and their groupings into macro-parameters have already been discussed in Section 3.

Table 1 lists the parameters and macro-parameters of *The Guardian* dataset. The parameters are categorised into 6 macro-parameters: road transport, rail transport, air transport, crash and safety, disruptions and causes, and employment rights, disputes, and strikes (Column 1). The parameters are listed in Column 2, where some of them are merged. For example, Parameters 9 and 4 have been combined into a single parameter, rail projects and contracts. The third column indicates the cluster number. The fourth column records the percentage of articles. Our BERT model labelled 50.5% of articles as outlier clusters. The outlier clusters are more similar to the average article than to any of the other clusters. Consequently, we ignored these clusters, and the remaining 49.5% of articles are listed in the fourth column. The fifth column presents the top keywords associated with each parameter.

Figure 6 provides a taxonomy of the transportation domain extracted from a newspaper focused on public, governance, and politics. The taxonomy was created using the parameters and macro-parameters discovered from *The Guardian*. The first-level branches show the macro-parameters, the second-level branches show the discovered parameters, and the third-level branches show the most representative keywords.

#### *4.2. Quantitative Analysis (The Guardian)*

This section discusses the term score, keyword score, intertopic distance map, hierarchical clustering, and similarity matrix.

Each parameter is represented by a group of keywords, although not all of these words describe the parameter equally. The term score decline plot depicts how many keywords are required to describe a parameter and when the benefit of adding more keywords begins to diminish (see Section 3.8). When we assess the keywords, only the top 7 to 10 terms in each parameter accurately describe the parameter, as shown in Figure 7. Because all of the other scores are so close to one another, ranking them becomes more or less useless. We used this information when analysing the top keywords per parameter, focusing on the top seven or so keywords in each parameter.

*Sustainability* **2022**, *14*, 5711


**Table 1.** Parameter and Macro-Parameters for Transportation (Source: *The Guardian*).


ERDS = Employment Rights, Disputes, and Strikes.

**Figure 6.** A taxonomy of transportation extracted from *The Guardian* dataset.

**Figure 7.** Term score (*The Guardian*).

Figure 8 depicts the top five keywords for each parameter. The keywords are sorted based on the importance score, or c-TF–IDF score (see Section 3.8). There are 25 subfigures, and in each subfigure, the horizontal axis shows the c-TF–IDF scores and the vertical axis shows the keywords. For example, the first subfigure is the airport expansion parameter, which is represented by the five keywords runway, airport, Heathrow, expansion, and government, with scores of 0.07, 0.05, 0.49, 0.39, and 0.26, respectively.

**Figure 8.** Newspaper article parameter with keywords c-TF–IDF score.

Figure 9 shows the intertopic distance map (see Section 3.8), where six macro-clusters are separately identified.

Figure 10 shows the hierarchical clustering of the 25 clusters, which systematically pairs the clusters based on the similarity matrix (see Section 3.8). We noticed that clusters No. 6, 3, 9, 2, and 4 formed a unique cluster that we labelled as the rail transport macro-parameter.

Figure 11 visualises the similarity matrix among the parameters (see Section 3.8). Note the dark blue colour between clusters 22 and 16, which indicates a high similarity score because cluster 22 (train, carriage, and crash) and cluster 16 (police, crash, and accidents) closely resemble each other. For example, a train or carriage crash is likely to be reported as an accident, and the police are likely to respond to it.

**Figure 9.** Intertopic distance map (*The Guardian*).

**Figure 10.** Hierarchical clustering (*The Guardian*).

**Figure 11.** Similarity matrix (*The Guardian*).

#### *4.3. Road Transport*

The macro-parameter road transport includes the following parameters: congestion and road charging, pollution and electric vehicles, fuel and SCM (supply chain management), cycling, cycling safety, and bus transport.

#### 4.3.1. Congestion and Road Charging

The congestion and road charging parameter concerns road congestion and the congestion charging imposed to address it. It is represented by keywords (detected by our model) such as road, congestion, traffic, charge, scheme, car, and city. Looking at the news articles that belong to this parameter, we were able to find a number of topics that capture various dimensions of this parameter. These include congestion charging [106], traffic reduction and management [107], smart roads [108], parking [109], walking [110], cycling [111], congestion charge for non-residents [112], e-scooters [113], etc. For example, Harvey reports in a *Guardian* article [114] that the traffic congestion levels in September 2020 in outer London increased above the prepandemic (COVID-19) lockdown levels in 2019. The article also shows the impact of congestion charging on traffic in and outside central London, comparing prepandemic 2019 traffic with 2020.

#### 4.3.2. Pollution and Electric Vehicles

The pollution and electric vehicles parameter captures various dimensions of transportation from *The Guardian*. These dimensions and related news include high air pollution and fears of high risks for COVID-19 infection [115], London being the worst city in Europe in terms of the damages to health due to air pollution [24], inadequacy of electric vehicle reforms alone in solving the pollution problem and the need for holistic solutions [116], the proposed increase in diesel and petrol vehicle prices to curb pollution [117], UK cities delaying creating clean air zones supposedly for COVID-19 excuses [118], the fall in air pollution levels in London by 50 percent through anti-pollution measures reported in April 2020 [119], the UK to introduce E10, a high-ethanol fuel, to cut pollution [120], the UK's plans to ban diesel and petrol vehicles by 2035 or even earlier [121], Bristol's plan (late 2019) to ban diesel vehicles [122], Oxford's plan (late 2017) to become the first zero emissions area in the world [123], charging station deserts and monopoly [124], opening of first all-electric vehicle charging station [125], Tesla struggling in the US, asking funds from government [126], a 2008 article on myths about renewable energy [127], the concerns that despite electric and hybrid car sales the emission gains are only 1% between 2011 and 2021 [128], and many more news and topics. The parameter includes the following keywords detected by our model: car, vehicle, emission, diesel, pollution, fuel, electric, petrol, carbon, and hydrogen.

#### 4.3.3. Fuel and SCM (Supply Chain Management)

The fuel and SCM (Supply Chain Management) parameter contains the keywords fuel, oil, price, petrol, driver, duty, tax, government, shortage, car, and rise. Many news articles in this parameter are about fuel prices, shortages, and crises [31,129], and the costs of supply chains [130–132]. For example, a *Guardian* article [133] dated 17 November 2021 discussed the rising costs affecting all streams of businesses, featuring case studies on agriculture, farming, hospitality, transport (individuals, small and large businesses), manufacturing, and retail. We found in this parameter a fascinating article on just-in-time supply chains by Kim Moody [134]. Moody writes, "Decades of deregulation, privatisation and market worship have left society vulnerable to the unbidden force of 'just-in-time' supply chains. No amount of government subsidies, . . . will be enough to address the crises we face, from the pandemic to climate breakdown, . . . Now is the time to think about not just how we make and consume things, but also how we move them". Moody discusses how we became used to a 'just-in-time world' while not appreciating its complexity, including cross-continent shipping, fuel price variations, floods, closed roads, last-mile delivery people and their well-being, and, most importantly, the triple bottom-line effects of all of it.

#### 4.3.4. Cycling

The cycling parameter captures the transportation issues associated with cycling and bikes, an issue that has become important due to climate and health. It includes the following keywords: bike, cycling, cycle, cyclist, city, ride, road, bicycle, route, lane, scheme, year, car, traffic, transport, way, safe, work, street, and day. The parameter discloses several important dimensions of cycling through *The Guardian* news articles, including the planned 5000 miles of dedicated cycle routes in the UK announced in June 2000 and supported by the charity Sustrans [135], barring charity cyclists from using new trains [136], the increase in the number of bikes and rise of the 'born-again bikers' in the UK, covered in February 2004 [137], ministers setting examples for bike usage in 2004 [138], a major rise of weekend cycling in the UK [139], the loss of a legal case made by cycling and rambling campaigners to prohibit vehicle driving in the Lake District [140], funding to nurture increased walking and cycling in the UK [141], cycle thefts [142], the rise of cycling holidays [143], a new 500-mile cycle route network in West Midlands, UK [144], cycling with young children [145], Uber launching electric bikes for hire in Islington borough [146], and more.

#### 4.3.5. Cycling Safety

The parameter cycling safety is about the risks and safety of cycling due to vehicles and other hazards on the road. Our model detected the following keywords for the parameter: cyclist, cycle, road, bike, cycling, death, kill, pedestrian, traffic, ride, safety, accident, driver, year, lorry, safe, helmet, injury, lane, and number. This parameter captured some issues related to road accidents in general, from the early 2000s such as higher accident risks for children from deprived areas [147] and the use of explicit graphic accident images in ads to nurture road safety [148], but the parameter was dominated by road risks and safety for cyclists. Examples include the increase in cyclist deaths in the UK in 2020 [149]; advice on keeping safe while cycling [150]; cheaper insurance for drivers who take cycle training to improve cyclist safety [151]; concerns that pavements are being used for cycle stands and other purposes, causing problems for pedestrians [152]; the possibility that Great Britain could become a great cycling nation, but road safety is a hurdle [39]; and more.

#### 4.3.6. Bus Transport

The parameter bus transport is represented by keywords including bus, service, route, local, public, passenger, and operator. The parameter captures bus transport issues in the UK, though most of these are applicable to other countries in one way or another. The dimensions and issues of bus transport include schemes from the government to provide cheaper bus fares for pensioners introduced in August 2000 [153]; pensioners hired by the government in 2000 to go undercover and check bus service quality [154]; the proposals in late 2002 to scrap cheaper fares for pensioners and instead provide them for the jobless and students [155]; better employment conditions for bus drivers [156]; a bus strike in London in January 2015 and its effects on commuters [157]; an article from 2019 discussing the importance of integrated public transport across the UK while allowing city mayors to have freedom for local transport operations [158]; the launch of buses in the UK in 2020 that filter air ("air-filtering buses") while in operation around a city [159]; the need to protect bus drivers against coronavirus infection [160]; a boost in electric buses in the UK with a GBP 20 million contract [161]; the behaviour of passengers towards physical distancing measures deteriorating as people get vaccinated [162]; a 2021 news report discussing the government's plans to introduce a major redesign of public transport with new bus lanes, new fare plans, and richer bus schedules [163]; changes in commuting patterns due to COVID-19 [164]; compulsory masks on transport in London [165]; severely negative effects of privatisation of bus services outside London [166]; the effect of the COVID-19 pandemic on setting back public attitudes by two decades regarding giving preference to private cars over public transport for safety reasons [167]; an article discussing the downfall of British public transport services by bus privatisation [168]; and more.

#### *4.4. Rail Transport*

The macro-parameter rail transport defines the characteristics of the transportation sector that relate to trains and railways, as captured by the topic modelling of *The Guardian* dataset. Rail transport includes the following parameters: public discontentment; rail fares; funding, costs, and fares; industry and privatisation; rail projects and contracts; and governance and politics.

#### 4.4.1. Public Discontentment

The public discontentment parameter is represented by keywords train, rail, passenger, service, company, network, year, railtrack, timetable, and railway. The overarching theme of the news articles in this parameter is the state of public discontentment with the rail services in the UK. The range of issues that the public is discontented with includes train delays, particularly due to ineffective train schedules. For example, *The Guardian* reported on 18 May 2019 that a new rail timetable was announced by the Rail Chief, UK, to improve the chaotic situation with the rail services in the UK due to many cancellations and delays in train services during the last year, 2018–2019 [19]. A couple of months later, in July 2019, people again encountered severe delays and cancellations due to overhead wire damage on the mainlines connecting London with Scotland, northeast England, and other regions [169]. A revised rail timetable, promised to be the biggest change in the UK train schedule, was developed and put in place in late 2019 to enhance the rail services, but the plans were reportedly ruined due to staff shortages [170]. These and similar incidents caused public upheaval and discussions on train delays and schedules around the UK.

Other issues of discontent include poor train accessibility [171], delay in project completions [172,173], dissatisfaction with specific train service providers [174,175], discontent of company staff with their management [176], companies trying to win back customer satisfaction [177], change of management due to discontentment with services [178], and more.

#### 4.4.2. Rail Fares

The rail fares parameter is depicted by keywords such as fare, ticket, rail, season, price, commuter, cheap, and peak. The parameter includes issues such as EasyJet in fare wars with Virgin Trains [179], the withdrawal of cheaper fares amid train delays and cancellations [180], denial of compensation subsequent to the Hatfield crash in 2000 for those who did not keep their tickets [181], a planned increase in rail fares in England reported in 2020 [182], a 3.8% increase in rail fares in March 2022 reported in December 2021 [183], the launch of a budget London–Edinburgh rail service announced in September 2021 [184], post-pandemic flexible rail season tickets [185], and more.

#### 4.4.3. Funding, Costs, and Revenues

The train funding, costs, and fares parameter includes the keywords rail, company, network, profit, government, cost, fare, rise, revenue, public, and others. One of the news articles in this parameter, dated 5 December 2021, discusses the hefty budget cuts required by the UK government from train operators who are contracted to deliver train services for a fixed price while the revenues and risks are borne by the government [186]. While the consequences of the pandemic on train travel patterns are obvious and being explored, some groups argue that it is critical to maintain services and cut costs to attract passengers and save taxpayer money. There are other political and public issues, including job cuts that harm some segments of the public. At the same time, it is necessary to reduce costs and improve public services. Other articles touch on a range of dimensions and issues of this parameter including penalties and cutting bonuses [187], bailouts [188], increase in demands and revenues [189–191], funding and funding gaps [192], and more.

#### 4.4.4. Industry and Privatisation

The industry and privatisation parameter is characterised by keywords such as shareholder, railtrack, company, eurotunnel, buyer, debt, government, tunnel, share, and investor. The parameter captures transportation dimensions surrounding governance, privatisation, and industry, mainly for rail transportation. The earliest article [193] in this parameter dates back to 7 February 1964 and is about the agreement on the Channel Tunnel between the French and British governments, seen as "a sound investment". The Channel Tunnel, as we know, was opened in 1994. We then witness a news article [194] from 1999 opposing the cabinet view on partly privatising the public transport system in the UK due to its weaknesses. We also see an article [195] from December 2000 deliberating comments from a chief executive officer of Atkins, a major stakeholder in two London Underground bids, that "the public does not appreciate the benefits brought to the railways by the private sector". These and similar issues [196] show the debates around, and the pros and cons of, privatisation versus government-owned services. Another issue that we can learn from our BERT model is the legal battles between companies such as the one reported in a *Guardian* article [197] from 2001 about the company Virgin planning to sue the company Railtrack for their losses due to the Hatfield train crash [198]. Dealings between companies also extend to the leadership of one company being offered a job by another, as reported in December 2009 by Dan Milmo: the chief of Tube Lines was offered a position at National Express [199]. There is also an article from September 2020 reporting the former transport secretary being offered a lucrative contract by Hutchison Ports [200,201]. Other news and dimensions of this parameter include the Stagecoach offer in 2009 to its rival National Express for a merger [202], the Channel Tunnel operator Eurotunnel's hope in 2007 for "investors to back a debt-for-equity swap" to save it from bankruptcy [203], the problems with the public and private sector working together such as the London Underground public–private partnership (PPP) and East Coast Rail [204], the rail and bus company FirstGroup rejecting a takeover bid from the American company Apollo [205], the post-Brexit rebranding of Eurotunnel to Getlink [206], and more.

#### 4.4.5. Rail Projects and Contracts

The rail projects and contracts parameter was created by merging two clusters (numbers nine and four) because the two clusters contained keywords pointing to similar subjects. The parameter is represented by keywords franchise, rail, government, train, contract, railway, company, service, bid, plan, rail, project, line, transport, north, high, speed, government, plan, and route. *The Guardian* confirmed on 18 November 2021 [207] that the eastern link of HS2 connecting Leeds was abandoned by the government, and this caused fury among the affected segments of the public.
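The merging step described above can be sketched in a few lines. This is a minimal illustration, not the procedure used in the study: the split of the listed keywords into two lists of ten (one per cluster) is an assumption, and `merge_keywords` is a hypothetical helper. It shows why terms such as rail, government, and plan appear twice in the merged keyword list.

```python
# Keywords as listed in the text. Assumption: the first ten belong to
# cluster nine and the last ten to cluster four.
cluster_nine = ["franchise", "rail", "government", "train", "contract",
                "railway", "company", "service", "bid", "plan"]
cluster_four = ["rail", "project", "line", "transport", "north",
                "high", "speed", "government", "plan", "route"]


def merge_keywords(*keyword_lists):
    """Concatenate keyword lists, keeping only the first occurrence of each term."""
    seen = set()
    merged = []
    for keywords in keyword_lists:
        for term in keywords:
            if term not in seen:
                seen.add(term)
                merged.append(term)
    return merged


# Terms shared by both clusters: the overlap that motivates the merge.
shared = sorted(set(cluster_nine) & set(cluster_four))
merged = merge_keywords(cluster_nine, cluster_four)

print("shared terms:", shared)
print("merged keyword list:", merged)
```

Under this assumed split, the two clusters share three of their ten keywords, which is consistent with the stated reason for treating them as a single parameter.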

#### 4.4.6. Governance and Politics

The governance and politics parameter is represented by keywords related to government decisions and plans, including government, buyer, transport, labour, public, minister, private, decision, political, privatisation, and secretary. For example, *The Guardian* reported on 14 November 2021 [23] that the government dropped the plan for HS2 and instead decided to support projects that benefit the ruling party. HS2 was reportedly promised by the prime minister during the very early days of his job. It was expected to provide a new high-speed railway link serving as the foundation of Britain's transportation network.

#### *4.5. Air Transport*

The macro-parameter air transport includes the parameters airport expansions, air pollution, airport security, and air costs and fares.

#### 4.5.1. Airport Expansions

The airport expansions parameter is about expansions planned for airports and related facilities that are needed to meet the increasing demands for air travel [208] as well as about the opposition to expansions due to their negative impacts on climate [209,210]. This parameter includes the keywords runway, airport, Heathrow, expansion, government, aviation, decision, flight, build, and plan. For example, the expansion of London Heathrow Airport and the construction of a third runway have remained a matter of discussion for many years. The project was approved, but climate activists challenged it, leading to the issue becoming bogged down in the courts [211]. Asthana, Laville, and Kale in a *Guardian* news item [212] discussed the Court of Appeal decision announced in March 2020 to deem the expansion unlawful due to the UK government's failure to consider the climate impacts of the expansion. This topic has continued to remain in the news due to the airport trying to challenge the court decision [213]. In addition, Tim Crosland, a lawyer and campaigner for environmental protection, was found guilty (May 2021) by the Supreme Court and lost his appeal (December 2021) for disclosing the court decision before its official announcement to the public [214].

#### 4.5.2. Air Pollution

The air pollution parameter contains the following keywords: emission, carbon, aviation, climate, airline, fuel, environmental, biofuel, and others. The parameter relates to air pollution caused by air transport. For example, Ungoed-Thomas from *The Guardian* wrote in a news item [215] about the high number of flights being taken by UK government staff (293 every day according to a report) despite the UK government's promises to protect the climate and make the government greener.

#### 4.5.3. Airport Security

The airport security parameter is represented by keywords such as flight, airport, passenger, drone, and security. This parameter is exemplified by a March 2008 news article by Dodd and Milmo [30] reporting a security breach, the second within a three-week period, in which a man succeeded in reaching a runway at Heathrow airport.

#### 4.5.4. Air Costs and Fares

The air costs and fares parameter represents the transportation characteristics connected to the costs and fees incurred by air transportation providers and consumers. The keywords include airline, airport, flight, passenger, carrier, cost, price, business, market, and profit. An example of the various issues that come under this parameter is a *Guardian* news article by Topham and Kollewe [26] from 19 October 2021. The news is about Heathrow airport potentially increasing charges for passengers by 56 percent by 2023. Topham and Kollewe explained that Heathrow will be permitted by the CAA, the Civil Aviation Authority, to raise the landing charges considerably from the summer of 2022. This was in response to the airport operator's request to double the charges due to the massive business losses caused by the dearth of airport activity during the COVID-19 pandemic. The CAA explained that the permission to increase charges was necessary for keeping the airport competitive and safe. The airlines are affected by the decision as the costs of their operations will increase. The news shows the complexity of the parameter in terms of the different stakeholders (airport management, airline operators, CAA, and consumers) and changing times and situations.

#### *4.6. Crash and Safety*

The macro-parameter crash and safety includes three parameters: train crashes, accidents and deaths, and dangerous driving and speeding.

#### 4.6.1. Train Crashes

The keywords that represent the parameter train crashes are crash, safety, train, railtrack, rail, signal, and accident. The earliest *Guardian* article we found in this parameter is from 6 October 1999 and is about the worst crash of the decade, between Great Western and Thames trains near Paddington in London [216]. The crash made the safety of rail transport a major political issue [217], prompted the two train operators, Railtrack, and the government to begin an inquiry into the crash [218], and led the government to pledge GBP 1 billion for rail safety [219]. It further led to the possibility of manslaughter charges against Thames Trains and Railtrack [220]. Railtrack, a group of companies, owned a major part of the rail infrastructure in the UK from 1994. It was renationalised in 2002. Many other news items were found relating to train accidents, such as the collision between two trains at Salisbury in November 2021, potentially caused by low adhesion between rail tracks and train wheels [221].

There have also been many news items from *The Guardian* in this parameter about losses to rail companies due to accidents, compensations, penalties, etc. [222]. The parameter also contained some articles related to rail suicides such as the article from November 2017 about urging commuters to engage in small talk with people potentially attempting suicide [223]. It was reported in this article that about 273 people died by suicide on the railways in the UK during 2016–17. The parameter and the contained news articles show the richness of information that can be extracted from our BERT-based modelling approach.

#### 4.6.2. Accidents and Deaths

The accidents and deaths parameter is represented by keywords such as police, crash, accident, incident, scene, woman, injure, die, man, and injury. This parameter contains news articles about deaths and road accidents, as opposed to the parameter train crashes, where the focus of the articles is on train crashes and the various issues surrounding them such as financial, political, investigative, and industrial issues. Moreover, while this parameter mainly contains articles about roads, we also found some articles that involved trains, such as a death (potentially a murder) caused by a woman pushing another person in front of a train [20]. Another example in this parameter showing the focus on deaths rather than the mode of transportation is news from October 2000 about the history of train accidents in the UK [224]. The article focuses on injuries and deaths rather than other details, and this is the reason we believe this article, though also related to train crashes, is mainly associated by our BERT model with the accidents and deaths parameter. Another example is an article about the death of a woman who was a staff member in a railway ticket office. She died of a COVID-19 infection that she may have caught from a man claiming to have COVID-19 who spat and coughed on her while she was on duty [225]. The news is related to rail transport but is about a death. Other examples of articles in this parameter that involve railways and trains (or even air transport) but are mainly about deaths, road transport, and vehicles include [226–233].

The dimensions and issues connected to this parameter as seen through the news articles include the UK government strategy for road safety highlighting the gravity of the matter due to over 0.3 million road casualties in the UK every year (1 March 2000, [234]); the release of the driver of the bus that crashed and killed two and injured dozens of people (5 January 2007, [235]); death and injuries of various people in different incidents due to cold, black ice, road death traps, etc. (8 February 2009, [236]; 31 March 2010, [237]); the M5 crashes in November 2011 [238] and March 2012 [239]; the M1 crash in December 2012 and its investigations [240]; the death of a man due to collision with a Nottingham tram (16 August 2016, [241]); the rescue of 60 children from a bus operated by Stagecoach after its crash (11 November 2021, [242]); a woman killed due to the collision of two buses near Victoria Station, London (10 August 2021, [243]); and many more.

One of the issues discovered from this parameter is the deaths on and the safety concerns of smart motorways in the UK [244,245]. This topic of smart motorways was also detected in Parameters 7 and 14 in relation to congestion reduction and speeding, respectively.

The discussions in this article are supported by a large number of articles for the discovered parameters. Citing so many articles may be seen as unnecessary or of little benefit; however, we discuss a large number and range of articles to show the complexity and breadth of the parameter topics. The parameter discovery and analysis process is currently partly automatic and partly manual and will become increasingly automatic and autonomous. The knowledge gained through it will allow autonomous modelling, analysis (exploratory, dynamic, and real-time), and optimisation of transportation and other sectors. The discussions presented in this article are also helpful in understanding the working and performance of BERT and other clustering algorithms.

#### 4.6.3. Dangerous Driving and Speeding

The dangerous driving and speeding parameter is characterised by drunk, dozing, and other dangerous driving, speeding, speed limits, methods to measure and curb dangerous driving and their devastating effects, and penalties and legal punishments. The first article in this parameter is dated 1 March 2000 and is about the government pledging to introduce tougher measures for drunk-driving and speeding to reduce child pedestrian deaths and other injuries, while road safety and environmental protection groups show dismay, criticising the government for giving in to the motoring lobby [246].

The dimensions and issues related to this parameter include, among others, efforts by the government to intervene and improve dangerous driving behaviour [247]; the government caving in to different lobbies, including motoring and alcohol lobbies [248]; dozing drivers causing deaths and their legal punishments [249]; the use of virtual reality in driving tests [250]; dangerous and drunk drivers and their legal punishments [251]; devices that would not let a drunk driver start the vehicle [252]; drunk police officers [253]; uninsured drivers [254]; speed cameras and privacy [255]; the law being soft on dangerous and drunk drivers [256]; the benefits of lower speed limits to air quality and the environment [257]; penalties and jails for drivers using mobile phones [258]; illegal use of devices to deceive speed measuring equipment [259]; improvements to driving tests to improve driving behaviour of young people [260]; the benefits of autonomous cars to free us from dangerous drivers [261]; shocking driving speed violations during the COVID-19 lockdown [262]; and more.

#### *4.7. Disruptions and Causes*

The macro-parameter disruptions and causes comprises two parameters: extreme weather impacts and disruptions and delays.

#### 4.7.1. Extreme Weather Impacts

The extreme weather impacts parameter captures the various impacts on transportation of extreme weather such as snow, rain, floods, heat, and windstorms. The keywords detected by our BERT model for this parameter include snow, weather, temperature, flood, road, cold, rain, heavy, condition, and wind. The issues and dimensions for this parameter as evidenced through *Guardian* news articles include, among others, ice bombs ("frozen effluent falling from planes") [263]; impact on rail transport causing delays, cancellations, accidents, deaths, injuries, financial losses, and more [264]; a magic de-icer to help railways in applying timely brakes [265]; effects of snow on roads [266]; the heaviest snow in 18 years and its effects [267]; the government rejecting criticism over transport management during extreme weather [268]; the resignation of a transport minister over snow chaos [269]; strong winds, snow, and floods battering the country and bringing it to a halt [270]; extreme weather effects on air travel [271]; weather impact on schooling [272]; travel chaos in the country [273]; Storm Darcy, cold, and snow expected to cause disruptions [274]; weather impacts on Christmas and its arrangements [275]; deaths due to storms [276]; weather impacts on rail repairs [277]; the derailing of a train due to rain and a landslide [17]; village evacuation due to extreme weather [278]; government advice to businesses not to penalise staff for following government snow advice [279]; travel chaos due to rain and high temperatures [280]; the inability of UK rail transport to deal with extreme climates and a call for investments [281]; damages to bridges due to flooding [282]; and more.

#### 4.7.2. Disruptions and Delays

The disruptions and delays parameter contains the following keywords detected by our BERT model: train, service, weekend, holiday, passenger, expect, delay, busy, work, station, line, travel, day, fire, weather, rail, disruption, run, and traffic.

The earliest article in this parameter is from 19 November 1987 and is about a fire at King's Cross underground train station in London. It shows the travel and other disruptions caused by the fire. The dimensions and issues related to travel disruptions include, among others, the closure of many stations in the London Underground due to coronavirus [283]; disruptions due to peak-hour service cancellations [18]; a warning for people to plan their travel due to expected heavy traffic from bank holiday getaway travellers amid expected fine weather [284]; bridge failure causing disruptions [285]; advice to avoid travel due to rail works [286]; rail services disrupted by lightning strikes [287]; arson at a train station [288]; getaways for Easter expected to cause traffic on motorways [289]; a leaf clearing operation by Network Rail to reduce rail accident risks [290]; disruptions in Christmas Eve travels due to engineering problems and weather [291]; disruptions due to Notting Hill carnival [292]; heavy road traffic and delays due to rail closures [293]; disruptions and delays due to London 2012 Olympics [294]; disruptions due to tunnel falls in London Underground [295]; crowded airports and rail stations and congested roads due to school holidays and good weather [296]; and many more.

Considering the keywords and news articles in this parameter, we can say that this parameter is about travel disruptions, delays and their causes. The causes include accidents, fires, both bad and good weather, faults, repairs and new installations in transport infrastructures, and holidays, including bank holidays, Easter, and Christmas.

#### *4.8. Employment Rights, Disputes, and Strikes*

This macro-parameter has only one parameter, which was created by merging two document clusters, numbers 5 and 20, because they contained similar keywords and articles. The parameter employment rights, disputes, and strikes captures information about employees, their rights, job cuts, employment conditions, disputes with the management, and union strikes and their impact on people and the economy. We noted some difference between the two clusters, with cluster number 5 containing more articles related to rail transport and cluster number 20 somewhat inclined towards air transport. We also consider this parameter a macro-parameter because of its vast impact on social, economic, and environmental sustainability. There are always apparent exceptions in the cluster documents, such as an article [297] that is about the cancellation of trains due to COVID-19 but was assigned to the parameter employment rights, disputes, and strikes; however, on close inspection, one can find the connection, such as the mention of strikes in that article.

The earliest article reported by *The Guardian* related to this parameter is from 14 October 1999 and is about rail guards voting to strike over safety matters [298]. We see matters related to job cuts such as British Airways announcing on 7 December 2000 that it would cut 1000 jobs at Gatwick airport [299]. Among the news related to employment rights, union disputes, and strikes we find articles including one about a dispute between the Amalgamated Engineering and Electrical Union (AEEU) and Virgin Atlantic reported on 29 December 2000 [300], the dispute between RMT (National Union of Rail, Maritime, and Transport Workers) and the government (precisely, TfL, Transport for London) over work rosters, with a threat to strike from 26 November 2021 [301], train drivers threatening a major rail strike in London in January 2000 subsequent to rail privatisation [302], a British Airways employees' strike over salary disputes reported in June 2017 [303], a strike stretching multiple weeks during March 2019 by French customs over poor working conditions causing havoc to Eurostar trains [304], Yodel employees threatening to strike in September 2021 due to poor salaries and conditions causing potential disruptions to deliveries for major supermarkets and others [305], Stagecoach under threat of a strike by drivers over low wages [306], a recent article (October 2021) on post-COVID-19 abuse of staff working at transport stations and other customer-facing staff by customers [307], and more.

These examples of different types of employment disputes and strikes reveal insights into a range of issues surrounding stakeholders, causes, and impacts of disputes and strikes.

#### *4.9. Temporal Analysis (The Guardian)*

In this section, we analyse how the parameters have grown over time. Figure 12 displays the temporal progression of the parameters, which are distributed into six subfigures. The vertical axis of each graph indicates the number of articles, which is defined as the intensity, and the horizontal axis indicates the years. Figure 12a depicts the temporal progression of the macro-parameter road transport. Fuel and SCM has a higher intensity compared to the others. Figure 12b illustrates the temporal progression of the macro-parameter rail transport, where the rail projects and contracts and the industry and privatisation parameters first appear in 1960. Both parameters were then highly discussed between 2000 and 2005. The intensity of articles for the macro-parameter air transport, which includes four parameters, is depicted in Figure 12c. Air pollution and airport expansions both had a peak value of around 80 between 2007 and 2008. The temporal progression of the macro-parameter crash and safety, which includes three parameters, is shown in Figure 12d. We observed that there are more articles related to train crashes compared to the others. Figure 12e displays the temporal progression of the macro-parameter disruptions and causes. The parameter extreme weather impacts was highly discussed in 2010 and had the highest peak value of 60. The temporal progression of the macro-parameter employment rights, disputes, and strikes is shown in Figure 12f, where the highest peak value of intensity was more than 80 in 2010.

**Figure 12.** Temporal progression of parameters (*The Guardian*): (**a**) road transport; (**b**) rail transport; (**c**) air transport; (**d**) crash and safety; (**e**) disruptions and causes; (**f**) employment rights, disputes, and strikes.

The temporal progression of all macro-parameters is summarised in Figure 13. Rail transport was first discussed in 1960. After 2000, it attracted considerable attention and reached the highest peak value of 225. In 2008, the macro-parameter air transport had its highest peak value. We also saw in 2020 that the macro-parameters road transport, rail transport, and air transport were equally discussed. The macro-parameters crash and safety, disruptions and causes, and employment rights, disputes, and strikes were also of equal concern in 2020.

**Figure 13.** Aggregated macro-parameters (*The Guardian*).
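The intensity used in Figures 12 and 13 is a count of articles per year. As a minimal sketch of how such counts could be derived, assuming each article carries a publication year and an assigned parameter, the following code tallies the per-parameter intensity and then aggregates it to the macro-parameter level. The records, the mapping, and the function names are hypothetical and only illustrate the counting; they are not data from the study.

```python
from collections import Counter, defaultdict

# Hypothetical records: (publication year, parameter assigned to the article).
articles = [
    (2008, "air pollution"),
    (2008, "airport expansions"),
    (2010, "extreme weather impacts"),
    (2010, "extreme weather impacts"),
    (2010, "disruptions and delays"),
]

# Parameter-to-macro-parameter mapping (names taken from the text).
macro_of = {
    "air pollution": "air transport",
    "airport expansions": "air transport",
    "extreme weather impacts": "disruptions and causes",
    "disruptions and delays": "disruptions and causes",
}


def intensity_by_year(records):
    """Return {parameter: Counter({year: number of articles})}."""
    counts = defaultdict(Counter)
    for year, parameter in records:
        counts[parameter][year] += 1
    return counts


def aggregate_to_macro(param_counts, mapping):
    """Sum the yearly counts of all parameters that share a macro-parameter."""
    macro = defaultdict(Counter)
    for parameter, per_year in param_counts.items():
        macro[mapping[parameter]].update(per_year)  # Counter.update adds counts
    return macro


per_parameter = intensity_by_year(articles)
per_macro = aggregate_to_macro(per_parameter, macro_of)

for name, per_year in per_macro.items():
    print(name, dict(sorted(per_year.items())))
```

Plotting each `Counter` against the years would give one curve per parameter (as in Figure 12) or per macro-parameter (as in Figure 13).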

#### **5. Industry: Transportation Parameters Discovery**

In this section, we discuss the parameters detected by our BERT model from the dataset acquired from the *Traffic Technology International (TTI)* magazine. The parameters are grouped into five macro-parameters. We provide an overview of the parameters and macro-parameters in Section 5.1. The quantitative analysis is discussed in Section 5.2. Subsequently, we discuss each macro-parameter in separate sections, Sections 5.3–5.7. Section 5.8 discusses the temporal analysis of the parameters and macro-parameters.

#### *5.1. Overview and Taxonomy (Traffic Technology International Magazine)*

We detected a total of 15 parameters from the *TTI* dataset using BERT. These 15 parameters were grouped into 5 macro-parameters using the domain knowledge together with a similarity matrix, hierarchical clustering, and other quantitative analysis methods. The methodology and process of the discovery of parameters and their groupings into macro-parameters have already been discussed in Section 3.

Table 2 lists the parameters and macro-parameters of the transportation magazine, *TTI*. The parameters are categorised into five macro-parameters: industry, innovation, and leadership; autonomous and connected vehicles; sustainability; mobility services; and infrastructure (Column 1). The second and third columns list the parameters and the cluster number, respectively. The fourth column lists the proportion of the total number of articles. Our BERT model assigned 58.16% of the articles to the outlier cluster. This cluster was excluded, so the proportions listed in the fourth column cover the remaining 41.84% of the articles. The top keywords related to each parameter are represented in the fifth column.
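The proportions in the fourth column can be understood as each cluster's share of all articles, with the outlier cluster set aside. The sketch below illustrates this calculation under two assumptions: that outliers carry the label `-1` (a common convention in density-based clustering, not confirmed by the text) and that shares are computed against the total number of articles. The labels are toy data, not the *TTI* dataset.

```python
from collections import Counter

OUTLIER = -1  # assumed outlier label; not confirmed by the text


def cluster_shares(labels, outlier=OUTLIER):
    """Return ({cluster: % of all articles}, % of articles in the outlier cluster)."""
    total = len(labels)
    counts = Counter(labels)
    shares = {cluster: 100.0 * n / total
              for cluster, n in counts.items() if cluster != outlier}
    outlier_share = 100.0 * counts.get(outlier, 0) / total
    return shares, outlier_share


# Toy labels: six outliers and two clusters of two articles each.
labels = [-1] * 6 + [0] * 2 + [1] * 2
shares, outlier_share = cluster_shares(labels)

print("outlier share (%):", outlier_share)
print("cluster shares (%):", shares)
print("sum of cluster shares (%):", sum(shares.values()))
```

Because the shares are taken over all articles, the non-outlier shares and the outlier share add up to 100%, in the same way that 41.84% and 58.16% do in the text.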

Figure 14 provides a taxonomy of the transportation domain extracted from a transportation industry-focused technical magazine. The taxonomy was created using the parameters and macro-parameters discovered from the *TTI* magazine. The first-level branches show the macro-parameters, the second-level branches show the discovered parameters, and the third-level branches show the most representative keywords.

*Sustainability* **2022**, *14*, 5711


**Table 2.** Parameters and Macro-Parameters for Transportation (Source: *Traffic Technology International*).

#### *5.2. Quantitative Analysis (Traffic Technology International Magazine)*

This section discusses the term score, word score, intertopic distance map, hierarchical clustering, and similarity matrix.

Figure 15 shows that only the top 7 to 10 keywords in each parameter actually represent the parameter when we evaluate the keywords (see Section 3.8). Because the probabilities of all the other keywords are so close to one another, their ranking becomes more or less meaningless. We therefore focused on the top seven or so keywords when analysing the keywords of each parameter to discover the parameter.

**Figure 15.** Term score (*Traffic Technology International* magazine).

Figure 16 depicts the top five keywords for each parameter. The keywords are ordered by their importance score, that is, the c-TF–IDF score (see Section 3.8). There are 15 subfigures; in each subfigure, the horizontal axis shows the importance score and the vertical axis shows the parameter keywords.

**Figure 16.** Magazine article parameter with keywords c-TF–IDF Score.
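To make the keyword ranking concrete, the sketch below computes a class-based TF–IDF score and returns the top keywords per parameter. It assumes one commonly used formulation, in which the frequency of a term within a class is normalised by the class size and multiplied by log(1 + A / f_t), where A is the average number of words per class and f_t is the frequency of the term across all classes. The definition used in this paper is the one given in Section 3.8; the function names and the toy tokens here are hypothetical.

```python
import math
from collections import Counter


def c_tf_idf(tokens_per_class):
    """tokens_per_class: {class_id: list of tokens from all documents of that class}."""
    class_counts = {c: Counter(tokens) for c, tokens in tokens_per_class.items()}
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)
    # A: average number of words per class.
    avg_words = sum(total_counts.values()) / len(class_counts)
    scores = {}
    for c, counts in class_counts.items():
        n_words = sum(counts.values())
        scores[c] = {
            term: (freq / n_words) * math.log(1 + avg_words / total_counts[term])
            for term, freq in counts.items()
        }
    return scores


def top_keywords(scores, k=5):
    """Return the k highest-scoring terms of each class, best first."""
    return {c: sorted(s, key=s.get, reverse=True)[:k] for c, s in scores.items()}


# Toy tokens for two hypothetical parameters (not data from the paper).
toy_tokens = {
    "tolling": ["toll", "toll", "lane", "system"],
    "parking": ["parking", "parking", "payment", "system"],
}
toy_scores = c_tf_idf(toy_tokens)
toy_top = top_keywords(toy_scores, k=1)
```

In the toy data, "system" occurs in both classes and therefore scores lower than the class-specific terms, which is the behaviour the importance score is meant to capture.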

Figure 17 shows the intertopic distance map (see Section 3.8), in which two clusters are clearly identifiable in the lower-left corner and three clusters in the upper-right area. However, we manually tagged the parameters into five macro-parameters.

**Figure 17.** Intertopic distance map (*Traffic Technology International* Magazine).

Figure 18 shows the hierarchical clustering of the 15 clusters, which systematically pairs the clusters based on the similarity matrix (see Section 3.8). Initially, the clusters were grouped into five groups: (1, 8, 9, 0, 3), (13, 2, 4), (11, 5, 7), (10), and (14, 6, 12). This automated hierarchical clustering grouped the clusters correctly, with some exceptions. Based on our domain knowledge and the magazine articles, we then manually refined the groups into those listed in Table 2.

**Figure 18.** Hierarchical clustering (*Traffic Technology International* Magazine).
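The pairing of clusters by similarity can be sketched as a simple agglomerative procedure. The example below is a minimal, standard-library illustration that assumes average linkage and a precomputed similarity matrix; the linkage method actually used for Figure 18 is not stated here, and in practice a library routine such as SciPy's hierarchical clustering would normally be used. The function name `agglomerate` and the toy matrix are hypothetical.

```python
def agglomerate(similarity, n_groups):
    """Greedy average-linkage clustering on a square similarity matrix.

    Repeatedly merges the two groups with the highest average pairwise
    similarity until only n_groups groups remain.
    """
    n_groups = max(n_groups, 1)
    groups = [[i] for i in range(len(similarity))]

    def avg_sim(a, b):
        return sum(similarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(groups) > n_groups:
        pairs = [(x, y) for x in range(len(groups)) for y in range(x + 1, len(groups))]
        x, y = max(pairs, key=lambda p: avg_sim(groups[p[0]], groups[p[1]]))
        merged = sorted(groups[x] + groups[y])
        groups = [g for k, g in enumerate(groups) if k not in (x, y)] + [merged]
    return sorted(groups)


# Toy similarity matrix for four hypothetical clusters (not data from the paper).
toy_similarity = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
toy_groups = agglomerate(toy_similarity, n_groups=2)
```

As in the analysis above, the output of such a procedure is only a starting point; the final grouping still depends on domain knowledge.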

Figure 19 visualises the similarity matrix among the parameters (see Section 3.8). We used the same configuration as for Figure 11. Dark blue represents the highest similarity between parameters, while light green represents the lowest. For example, Cluster 3, labelled as transport services, and Cluster 1, labelled as AV systems, have a high similarity score because the main purpose of AV systems is to improve transport services and make them smoother and more flexible. Clusters 8 and 9, labelled as AV trials and V2X trials, respectively, are also highly similar.
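A similarity matrix of this kind can be illustrated with a short sketch, assuming that each parameter is represented by a numeric vector and that cosine similarity is the measure. Both are assumptions made for this example; the measure and configuration used in this work are those described in Section 3.8 and for Figure 11. The vectors below are toy values.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(vectors):
    """Pairwise cosine similarities between all parameter vectors."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


# Toy vectors for three hypothetical parameters (not data from the paper).
toy_vectors = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
toy_matrix = similarity_matrix(toy_vectors)
```

Entries close to 1 would correspond to the dark blue cells of the heat map and entries close to 0 to the light green cells.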

#### *5.3. Industry, Innovation, and Leadership*

The macro-parameter industry, innovation, and leadership includes two parameters, leadership and competitive innovation, which reveal information and topics related to events, appointments, innovations, and awards. The leadership parameter captures the transportation events and leadership news that have a high impact on transportation development. Examples include the appointment of Angelos Amditis as ITS Europe chairman in 2018 [308], Laura Chace as CEO and president of ITS America in 2021 [309], and Laura Shoaf as the chair of the UK's transport group in 2021 [310], among many other news items and topics. The competitive innovation parameter is about new innovations and projects related to transportation. It includes the following keywords: award, project, winner, competition, solution, and so on. We discovered an award-related announcement from Florida, in which Woolpert was awarded for operating district-wide aerial photogrammetry and various surveys [311].

**Figure 19.** Similarity matrix (*Traffic Technology International* magazine).

#### *5.4. Autonomous and Connected Vehicles Systems*

The macro-parameter autonomous and connected vehicles systems includes autonomous vehicle (AV) systems, AV trials, vehicle-to-everything (V2X) trials, and platooning/truck platooning. An AV system combines several technologies, such as sensors and GPS, to enable autonomous driving. This parameter reveals several significant innovations and projects through *Traffic Technology International* magazine articles, including the first thermal sensor-equipped production for AV [312], the first use of blockchain to provide connected vehicle data by CyberCar [313], a thermal sensor technology for the AV system announced on 5 November 2019 by the Veoneer system [312], the first vehicle-to-cloud infrastructure for automated connected vehicles by SENSORIS [314], and so on. We also found AV trials-related news that illustrates the AV trials by Singapore's Land Transport Authority [315].

Communication between a car and any component that can be affected by the car is referred to as V2X communication. Platooning is a technology that helps vehicles drive together and boosts road capacity by applying the automated highway system to reduce the distance between automobiles or trucks.

#### *5.5. Infrastructure*

The macro-parameter infrastructure includes road infrastructure, crash and safety, tolling, and ALPR (automatic licence plate recognition). The road infrastructure parameter is represented by the following keywords: project, lane, road, motorway, traffic, construction, bridge, improvement, scheme, work, design, tunnel, junction, etc. We uncovered the following example related to this parameter: Highways England (HE) marked a turning point in road construction, encouraging better-planned roadworks and more consistent travel on motorways and key trunk routes [316]. The crash and safety parameter is represented by road, death, pedestrian, crash, safety, fatality, injury, speed, etc. We noted that UK road deaths increased in 2016 [317]. Tolling is represented by toll, system, tolling, lane, toll collection, electronic, etc. We noticed that Canada's A25 highway electronic tolling system was upgraded on 24 October 2017 [318]. ALPR is represented by camera, video, system, traffic, surveillance, detection, plate, etc. For example, one enforcement system employs over 120 Sicore ALPR cameras located at 80 locations along major arterial routes around the UK capital [319].

#### *5.6. Mobility Services*

The macro-parameter mobility services comprises transport services and parking services. By applying our model, we found that the traffic enforcement system is one of the transport services [320]. Parking systems are another solution that makes transportation services more convenient. We found news about merging a parking payment solution with an electric vehicle charging system in the UK [321].

#### *5.7. Sustainability / Sustainable Infrastructure*

The macro-parameter sustainability includes air pollution and quality, street lighting, and electric vehicles. Reducing transport-sourced air pollution is a major concern because the number of vehicles is increasing dramatically every day. We found a Wolverhampton project focused on reducing air pollution and improving the air quality monitoring system [322].

Street lighting is one of the solutions for public safety and traffic optimisation in smart cities. For example, CityIQ Edge collects and processes street-level video and audio information that will enable urban areas to handle day-to-day problems [323]. Low-emission electric vehicles are one of the best solutions for reducing carbon emissions and improving air quality. The following news article is an example of this parameter: the Department for Transport had contracted the UK's Transport Research Laboratory (TRL) to observe and evaluate the effectiveness and implications of low-emission buses at 13 sites around the country [324].

#### *5.8. Temporal Analysis (Traffic Technology International Magazine)*

In this section, we analyse how the parameters have evolved over time. Figure 20 shows the temporal progression of the parameters in six subfigures. The first five subfigures show the temporal progression of the five macro-parameters, whereas the last subfigure depicts the temporal progression of all macro-parameters. The vertical axis of each graph indicates the number of articles, which we define as the intensity. The temporal progression of the macro-parameter industry, innovation, and leadership is depicted in Figure 20a. Leadership has a higher intensity than the innovation parameter. Figure 20b shows that the intensity of AV systems increased until 2017 and declined thereafter. Figure 20c shows that the intensity of the road infrastructure parameter, which is one of the components of the macro-parameter infrastructure, was high in 2017 and then gradually decreased. The temporal progression of the macro-parameter mobility services, which includes two parameters, parking services and transport services, is shown in Figure 20d. We observed that there are more articles related to transport services than to parking services. The intensity of articles for the macro-parameter sustainability, which includes three parameters (street lighting, air quality and pollution, and electric vehicles), is depicted in Figure 20e. Street lighting and air quality and pollution both had the same peak value of 25 in 2017.

**Figure 20.** Temporal progression of parameters (*Traffic Technology International* Magazine): (**a**) industry, innovation, and leadership; (**b**) autonomous and connected vehicles; (**c**) infrastructure; (**d**) mobility services; (**e**) sustainability; (**f**) macro-parameters.
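The intensity curves can be thought of as article counts per parameter and year, summed over the parameters of a macro-parameter for the aggregated view. The following minimal sketch illustrates this counting under that assumption; the function names, the (year, parameter) input format, and the toy records are hypothetical, while the parameter-to-macro-parameter mapping follows the grouping described in Sections 5.5 and 5.6.

```python
from collections import Counter, defaultdict


def yearly_intensity(articles):
    """articles: iterable of (year, parameter) pairs -> {parameter: Counter(year -> count)}."""
    series = defaultdict(Counter)
    for year, parameter in articles:
        series[parameter][year] += 1
    return dict(series)


def aggregate(series, macro_of):
    """Sum the yearly counts of all parameters belonging to the same macro-parameter."""
    macro = defaultdict(Counter)
    for parameter, counts in series.items():
        macro[macro_of[parameter]].update(counts)
    return dict(macro)


def peak(counts):
    """Return (year, count) with the highest count; ties go to the earliest year."""
    return min(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy article records (not data from the paper).
toy_articles = [
    (2016, "tolling"), (2017, "tolling"), (2017, "tolling"),
    (2017, "road infrastructure"),
    (2017, "parking services"), (2018, "parking services"),
]
toy_macro_of = {
    "tolling": "infrastructure",
    "road infrastructure": "infrastructure",
    "parking services": "mobility services",
}
toy_series = yearly_intensity(toy_articles)
toy_macro = aggregate(toy_series, toy_macro_of)
```

Calling `peak` on a yearly counter then gives the kind of peak year and peak value that are read off the plots.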

The temporal progression of all macro-parameters is summarised in Figure 20f. In 2017, the macro-parameters autonomous and connected vehicles, infrastructure, and sustainability reached their highest peak values of 140, 120, and 60, respectively.

#### **6. Academia: Transportation Parameters Discovery**

In this section, we discuss the parameters detected by our BERT model from the Web of Science. We provide an overview of the parameters and macro-parameters in Section 6.1. The quantitative analysis is discussed in Section 6.2. Subsequently, we discuss each macro-parameter in separate sections, Sections 6.3–6.8. Section 6.9 discusses the temporal analysis of the parameters and macro-parameters.

#### *6.1. Overview and Taxonomy (Web of Science)*

We detected a total of 50 parameters from the Web of Science dataset using BERT. We skipped Cluster 19 because it relates to narrative transportation [325–327] rather than to transportation in general. The remaining 49 parameters were grouped into 6 macro-parameters using the domain knowledge along with a similarity matrix, hierarchical clustering, and other quantitative analysis methods. The methodology and process of the discovery of parameters and their groupings into macro-parameters have already been discussed in Section 3.

Tables 3 and 4 list the parameters and macro-parameters of the academic dataset. The macro-parameters policy, planning and sustainability; transportation modes; logistics and SCM; pollution; technologies; and modelling are listed in Column 1 with the associated parameters (Column 2). Some parameters are merged. For example, in Table 3, Clusters 33 and 38 are merged as road safety, and Clusters 44 and 45 are merged as freight and logistics. The third column indicates the cluster number. The fourth column records the percentage of the number of articles. Our BERT model labelled 56.42% of articles as belonging to the outlier cluster. We ignored this outlier cluster, so the percentages in the fourth column cover the remaining 43.58% of articles. The fifth column shows the top keywords associated with each parameter.
