*Article* **NOAH as an Innovative Tool for Modeling the Use of Suburban Railways**

**Maciej Kruszyna**

Faculty of Civil Engineering, Wrocław University of Science and Technology (Politechnika Wrocławska), Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland; maciej.kruszyna@pwr.edu.pl

**Abstract:** The paper presents an innovative method called the "Nest of Apes Heuristic" (NOAH) for modeling specific problems by combining technical aspects of transport systems with human decision-making. The method is inspired by nature. At the beginning of the paper, potential problems related to modeling a suburban rail system are presented. The literature review is supplemented with a short description of known heuristics. The basic terminology, procedures, and algorithm are then introduced in detail. The factors of the suburban rail system are converted into "monkeys". Monkeys change their position in the nest, creating leaders and followers. This allows for the comparison of the factor sets in a real system. The case study area covers the vicinity of Wrocław, the fourth largest city in Poland. Two experiments were conducted. The first takes into account the average values of the factors in order to observe the algorithm's work and formulate the stopping criteria. The second is based on the current values of the factors. The purpose of this work was to evaluate these values and to assess the possibilities of changing them. The obtained results show that the new tool may be useful for modeling and analyzing such problems.

**Keywords:** suburban railway; human behaviors; modeling; heuristics; algorithm; NOAH

#### **1. Introduction with the Literature Review**

Suburban railways play or should play an important role in agglomerations as the main means of transport connecting the core with the surroundings. Significant numbers of lines, journeys, and seats can affect the choice of means of transport, which in turn makes travel more environmentally friendly. Modeling the use of suburban railways should take into account two main aspects: (a) rail operations and (b) cooperation in the transport system. A suburban railway works similarly to all other railways. It is slightly closer to metro systems due to its high frequency, while differing from long-distance rail due to its higher stop density and lower speeds. Therefore, the typical problems of planning rail operations should be taken into account. There are numerous studies on these problems. Table 1 contains the list of publications analyzed in this paper concerning the considered problems and heuristic tools. In [1–24], researchers address the problems related to railway modeling, and the authors of [25–28] consider the demand side of transport systems. It is worth noting that [29], discussed below, is not listed in the table because of its review character. In [30–37], researchers address the problems of integration in a suburban transport system. Sources [38,39] show directions for future research, and sources [40–43] contain important definitions (they were not added to the table). In [44–53], authors present different tools used to solve some problems (in part not transportation problems, but with methodologies that inspired the method presented here). The tools are discussed in Section 2.

**Citation:** Kruszyna, M. NOAH as an Innovative Tool for Modeling the Use of Suburban Railways. *Sustainability* **2023**, *15*, 193. https://doi.org/10.3390/su15010193

Academic Editors: Tianren Yang, Mengqiu Cao, Claire Papaix and Benjamin Büttner

Received: 22 October 2022 Revised: 12 December 2022 Accepted: 19 December 2022 Published: 22 December 2022

**Copyright:** © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).


**Table 1.** List of publications analyzed in this paper concerning the considered problems and heuristic tools.


Tool description: ANN = Artificial Neural Network, BBO = Biogeography-Based Optimization, BC = Blockchain Framework, GA = Genetic Algorithm, MILP = Mixed Integer Linear Programming, MS = Multiagent systems, NH = Not heuristic, O = other (heuristic), SA = Simulated Annealing, SI = Swarm Intelligence (incl. particle swarm intelligence, PSO, ant or bees colonies, ACO, BCO), SOM = Self Organized Maps, TS = Tabu Search.

Sometimes the problems classified in Table 1 are more complex. Dong et al. [7] integrate the planning of train stops with the timetable, and Yan et al. [2] integrate the timetable with route planning. Wang et al. [12] and Zhao et al. [19] combine train timetables and rolling stock. Zhang et al. [16] integrate train timetables and track maintenance scheduling.

The examples characterized above illustrate the offer (the supply side of the transport system). On the other hand, the result in the form of passenger flow (the demand side of the transport system) was taken into account, inter alia, in Xiao et al. [25], Shen et al. [26], Wu et al. [27], and Liu et al. [28]. These works include passenger flow as a direct result of modeling or simulation. In many other studies, passenger flow is a factor influencing the modeled parameters, such as train schedules.

Rail is not the only means of transport in suburban areas. The railway is or should be one of several integrated components that work together at different levels. Access to rail should be improved by means of "complementary tools or means of transport", forming a "delivery system" that includes local buses, private cars including car sharing, bicycles including rental, etc. It is important to optimize local systems, create nodes, and integrate tariffs and cost coordination. The importance of coordination studies is shown by a review by Liu et al. [29], who identified 135 papers on these topics. Further problems are related to the developing autonomy of vehicles. Examples of studies from recent years concerning cooperation in transport systems are presented in Table 1 ([30–37]).

Many parameters were taken into account in the models presented above. For example, Ahmed et al. [6] collect 27 input parameters, including: average travel speed, train headway, number of stations, spacing between stations, etc. The number of input parameters in Dong et al. [7] is 21 and includes, inter alia: the number of passengers arriving at the station, the number of trains, etc. A specific parameter is "passenger satisfaction" or "dissatisfaction" (Hickish et al. [3], Satoshi et al. [22], Stead et al. [34], Shen et al. [26]). Shen et al. [26] formulate nine elements creating passenger satisfaction: direction and guidance, cleanliness and comfort, speediness and convenience, safety and security, ticket service, equipment and facilities, staff service, information distribution, and convenient facilities for passengers.

Specific review studies (Liu et al. [29], Tang et al. [38]) formulate directions for future research. Integration with various planning activities is important. Data quality, data limitations or imperfections, uncertainty, and passenger behavior should be carefully considered. Modeling analyses should be more complex and include, inter alia, multiobjective optimization, multi-agent systems, and negotiations. Comprehensive and more flexible approaches will pay off.

All the parameters presented above affect the use of suburban railways. However, not only the "physical" ones (easy to identify and measure), such as speed or numbers, are important. Other parameters that are more difficult to identify and have a "psychological" aspect should also be taken into account. Lopez and Farooq [39] state that "*transportation data are shared across multiple entities using heterogeneous mediums*". Such data vary on certain days (not only working days and holidays, but some typical working days may also differ in passenger flow depending on weather, accidents, and random factors). The influence of bounded rationality and unbounded uncertainty is significant (Khisty and Arslan [40]). Similar problems are discussed by Li et al. [41] and Wu et al. [27]. They wrote that the assumption of rational passenger behavior is not correct. Taking into account the behavior of passengers requires the use of advanced and unconventional tools in modeling. Tools inspired by nature and social behavior are called "artificial intelligence" (AI) or "heuristics" (in a broader sense, not just as an optimization tool).

The main research goal of this paper is to create a new algorithm (NOAH) not as an optimization tool but as a method of observation of selected datasets. The reason is the difficulty of identifying a closed set of important factors and of collecting and selecting the data. Known and commonly used methods rest on other assumptions. The proposed algorithm allows us to find new and nonobvious connections between the factors (these are not correlations in the strictly mathematical sense). The assumptions for creating the algorithm will be formulated after the presentation of the heuristics (Section 2). The rest of this paper is organized as follows: Section 3 presents the new algorithm, and Section 4 shows an example of its application (with the description of the case study area, Section 4.1; collection of factors, Section 4.2; and two experiments, Sections 4.3 and 4.4). The last two sections contain a discussion and conclusions.

#### **2. Heuristics as Inspirations from Nature**

The term "heuristics" will be used here in a broader, philosophical sense, as defined by Kahneman [42]: "a heuristic is a mental shortcut that our brains use that allows us to make decisions quickly without having all the relevant information". In more "technical" literature, this concept or tool is often referred to as "computational intelligence" or "artificial intelligence". Regardless of the name, such tools are very popular and efficient in solving many problems, including modeling railways. Many tools developed in the last few decades can be considered "heuristics". Tang et al. [38] identify 139 articles from the last decade on the use of heuristics in railway systems.

The third column in Table 1 presents tools used to solve the collected problems. Most of them are heuristics. An element inspired by nature, especially simulated human or animal behavior, is important. New developments in "metaheuristics" and their applications are presented by Lau et al. [43]. They evoke, among others, a new method called "flying elephants" (Xavier and Xavier [44]), which shows interesting and intriguing assumptions and solutions.

Some studies include more than one tool, including Yang et al. [35], who compared the effectiveness of GA and MINLP. The set in Table 1 contains only selected sources from a very large database. The selection focuses on methods dedicated to railway modeling or on tools that will be inspirations for the method formulated in this paper. Specifically, these are relatively new studies using PSO, SCO, BCO, SOM, blockchain, multiagent, or BBO methods. For example, Zheng et al. [21] used the earlier concept of Simon [54], biogeography-based optimization, to analyze emergency railway wagon scheduling. Similarly, Hua et al. [51] used the Nakamoto blockchain concept [55] for intelligent control on heavy haul railways.

Based on the above description, the conditions for a new model of suburban railway use are summarized below. Railways function in the transport system, and cooperation with other modes of transport is necessary. We may collect a large amount of data, but we do not know the significance (impact) of each individual piece of information. There are many factors that influence the use of suburban rail, and their impacts may vary from day to day. Passenger behavior (including the choice of means of transport) is not rational. We should consider bounded rationality and unbounded uncertainty. The modeled object (railway in the transport system) is variable. The "optimal" solution probably does not exist; rather, we are looking for an "acceptable" solution. An acceptable solution contains a set of factors that are realizable and make economic sense. The results from the model can support the decision-making process, for example, when choosing a specific option or planning system development. It is desirable to use a dedicated metaheuristic in the new model. SOM, multiagent, and blockchain elements inspire certain assumptions about the new proposal. In particular, solutions based on animal or human behavior will be useful for creating a new modeling tool.

The new model (algorithm) should therefore allow different data of higher or lower complexity to be compared in order to show potential sets of them. It should be possible to analyze both the existing (observed) data and more theoretical values. The process of comparison should be flexible and based on partially random procedures. The assumptions collected above can be realized using a specific heuristic. A novel heuristic will be proposed based on the specific behaviors of monkeys.

#### **3. The NOAH Concept Based on the Behavior of Monkeys**

A novel tool created here and called "NOAH" (Nest of Apes Heuristic) is inspired by the social behavior of groups of monkeys. Numerous studies and publications have been devoted to groups of monkeys from different species, such as diana (Decellieres et al. [56]), vervet (Gareta Garcia et al. [57]), capuchin (Leca et al. [58]), gelada (Miller et al. [59]), or colobus (Wikberg et al. [60]), which create various nests with specific social behaviors. Colobus monkeys create specific "social networks" based on interactions [60]. A visualization (model) of such a network is presented in Figure 1 (part b). Diana monkeys form specific relationships called "dear-enemy" or "nasty-neighbor" depending on the type of habitat [56]. Distributed leadership has been observed in the nests of white-faced capuchins [58]. All members can initiate a group movement, and many members recruit followers. Wild female vervets adapt their maneuvering to different pressures [57]. They are characterized by rapid social plasticity and flexible changes in care patterns (described by the authors as "Machiavellian-like"; this "human" analogy is important here). Miller et al. [59] identify leaders in gelada nests under the influence of out-of-group paternity. The behavior of such species has been compared with other primates and has been linked to human mating systems, including jealousy (Scelza et al. [61]) and reproductive strategies (Scelza et al. [62]). The implications presented by Miller et al. in [59] refer to the "weirdness" of various human populations described by Henrich et al. [63]. WEIRD here is an acronym standing for western, educated, industrialized, rich, and democratic. The authors conclude that not all human groups can be characterized in this way. Other groups have different social behaviors. Therefore, their description should assume specific and partially unknown parameters.

**Figure 1.** Different nests used in specific methods or models (**a**) SOM-like [21] (**b**) Colobus network [60] (**c**) NOAH.

Hypothetically and in accordance with the heuristic methods described in Section 2, the behaviors presented above can be used in an algorithm (NOAH) that can describe not only groups of animals, but also technical systems containing parameters (factors) related to human behavior (like choice of transport means). Especially useful can be changeable leadership in the nest and the behaviors of followers. The parameters will be associated with "monkeys"—individuals in the nest that change their behavior (monkey position) according to specific procedures including leader creation, observations by followers, importance and hierarchy of individuals, dynamic changes in the nest, etc.

Changes to the nest will modify individuals (factors) before the algorithm stops. It will be possible to analyze and observe different sets of parameters (monkeys), their interactions, and their correlations. NOAH does not specify an optimal solution but shows possible datasets for comparison. It helps in choosing one or more. The operation of NOAH is very similar to the SOM (self-organized maps) concept, the stages of the blockchain, or the multiagent concept. A graphical representation of the exemplary methods is shown in Figure 1. Part (a) of this figure shows an SOM-like network, part (b) shows the colobus monkey nest described earlier, and part (c) shows the monkey nest and interactions between leaders (big black spots) and followers (little black spots) according to the NOAH concept.

Important for the application of NOAH in selected problems is the selection of parameters (factors) and their conversion into "monkeys". Initially, the selection of parameters is made by an "expert" with the use of all of the available data. After the algorithm is stopped, the re-conversion procedure will follow. These elements will be described in Section 4 with a specific example. The basic and theoretical aspects of NOAH are presented here. Each monkey has a specific position in the nest that is variable. The monkey position values are limited to a range of 0 to 1 as defined by the procedures in NOAH.

A specific set of terms, parameters, and symbols used in NOAH is defined herein.

**Nest** (seat, habitat) is a set of individuals (representing factors in the model).

**Position of the monkey** in the nest, *M*n, is a key variable in the algorithm. Each monkey changes its position in the nest, assuming the role of a leader or follower (representing variable values of factors and their importance in the model).

**Steps** (iterations) of changes in the nest, starting from zero, *i* = 0, are successive periods with a specific nest state (monkey positions, i.e., factor values). The steps continue until the nest is stable (no longer changes). See stopping criteria.

**Importance** of an individual, *I*n, is a random variable indicating the subjective position of the monkey in the nest. The scope of this variable is determined by Formula (1).

$$I\_n = random(0; 1)\tag{1}$$

**Hierarchy** of an individual, *H*n: This is a variable indicating a more objective position of the individual in the nest, based on the current value of *M*n, the random importance *I*n, the moderating followers coefficient *L*, and the step number *i*. The hierarchy is calculated using Formula (2).

$$H\_n = \left\lfloor \frac{I\_n - M\_n}{L \cdot i} \right\rfloor \tag{2}$$

**Followers coefficient** (influence of leaders), *L*: This determines the time and efficiency of the algorithm (it should be precisely defined in accordance with the specification of the modeled problem). Its impact increases in subsequent iteration steps—see Formula (2). *L* values should oscillate around 1–3 (see example in Section 4).

**Random hierarchy modifiers**, *R*n: These are used to modify the position of the monkey going to the next step of changes in the nest. The modifiers depend on the value of the hierarchy, taking into account the defined range of hierarchy modifiers according to Formula (3).

$$R\_n = H\_n \cdot random(R\_{\min}; R\_{\max})\tag{3}$$

**Range of hierarchy modifiers**, *R*min and *R*max: This also determines the runtime and efficiency of the algorithm; the *R*min value should be negative and the *R*max value positive (see example in Section 4).

NOAH works according to the algorithm shown in Figure 2. The position of the monkey in the next step is calculated using Formula (4). The position is changed taking into account the values of random hierarchy modifiers with the restrictions determined by Formula (5).

$$M\_n^{i+1} = M\_n^i + R\_n^i \tag{4}$$

$$0 \le M\_n^{i+1} \le 1\tag{5}$$
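A single step of Formulas (1)–(5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `noah_step`, its parameter names, and the default values of *L*, *R*min, and *R*max are hypothetical. The brackets in Formula (2) are read here as plain grouping, restriction (5) is enforced by clipping, and the step counter starts at 1 so that the product *L* · *i* is never zero.

```python
import random


def noah_step(positions, step, followers_coeff=2.0, r_min=-1.0, r_max=1.0, rng=random):
    """Move every monkey once; returns the new list of positions M_n in [0, 1]."""
    if step < 1:
        # Assumption: i = 0 is skipped because Formula (2) divides by L * i.
        raise ValueError("step must be >= 1")
    new_positions = []
    for m in positions:
        importance = rng.random()                                # Formula (1)
        # Formula (2); brackets treated as plain grouping (assumption).
        hierarchy = (importance - m) / (followers_coeff * step)
        modifier = hierarchy * rng.uniform(r_min, r_max)         # Formula (3)
        moved = m + modifier                                     # Formula (4)
        # Formula (5), enforced here by clipping (assumption).
        new_positions.append(min(1.0, max(0.0, moved)))
    return new_positions


# Example: a nest of 16 monkeys, all starting in the middle of the range.
nest = noah_step([0.5] * 16, step=1, rng=random.Random(42))
```

Because the denominator grows with the step number, the largest possible move shrinks from step to step, which is consistent with the stabilization of the nest described in the text.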

**Figure 2.** NOAH algorithm.

The next steps of the algorithm create a changing nest with individuals of different positions. This refers to changes in the value of the factors included in the model. Leaders (factors of greater importance) are identified, monitored, and analyzed. The entire nest (set of all factors) can also be analyzed. The algorithm aims at nest stability, which means reducing the changes in the monkey's position during steps. The pace of such stabilization depends on the value of the followers coefficient and the range of hierarchy modifiers. Specific stopping criteria have been formulated. At each step, the following "decision measures" are calculated: *M* as the sum of all *M*n, *I* as the sum of all *I*n, *H* as the average of all *H*n, and *R* as the average of all *R*n. The inclusion of these measures in terms of the stopping criteria is illustrated in the example in Section 4.
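The iteration and the decision measures can be sketched as below. This is only an illustrative sketch: the function `run_noah`, the parameters `tolerance` and `max_steps`, and the stopping rule itself (stop once the largest absolute modifier falls below `tolerance`) are assumptions made for this example, since the actual stopping criteria are formulated in Section 4. As an additional assumption, the brackets in Formula (2) are read as plain grouping and the step counter starts at 1 to avoid division by zero.

```python
import random
from statistics import mean


def run_noah(positions, followers_coeff=2.0, r_min=-1.0, r_max=1.0,
             tolerance=1e-3, max_steps=1000, seed=None):
    """Iterate the nest and record the decision measures M, I, H, R per step."""
    if not positions:
        raise ValueError("the nest must contain at least one monkey")
    rng = random.Random(seed)
    nest = list(positions)
    history = []
    for i in range(1, max_steps + 1):  # assumption: steps counted from 1
        importances = [rng.random() for _ in nest]                # Formula (1)
        hierarchies = [(imp - m) / (followers_coeff * i)          # Formula (2)
                       for imp, m in zip(importances, nest)]
        modifiers = [h * rng.uniform(r_min, r_max)                # Formula (3)
                     for h in hierarchies]
        # Decision measures for the nest state at step i.
        history.append({
            "step": i,
            "M": sum(nest),          # sum of all M_n
            "I": sum(importances),   # sum of all I_n
            "H": mean(hierarchies),  # average of all H_n
            "R": mean(modifiers),    # average of all R_n
        })
        # Formulas (4) and (5): move each monkey and clip to [0, 1].
        nest = [min(1.0, max(0.0, m + r)) for m, r in zip(nest, modifiers)]
        # Hypothetical stopping rule: every monkey has almost stopped moving.
        if max(abs(r) for r in modifiers) < tolerance:
            break
    return nest, history


final_nest, history = run_noah([0.5] * 16, seed=1)
```

The returned `history` makes it possible to plot *M*, *I*, *H*, and *R* against the step number and to judge visually when the nest has stabilized, which is the kind of observation the stopping criteria are meant to formalize.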

#### **4. Application Example of NOAH**

#### *4.1. Case Study*

The case study area is located near Wrocław, the fourth largest city in Poland. Wrocław is the core of an agglomeration with approximately 1 million inhabitants. The Wrocław railway junction is large and forms the basis of the shape of the suburban railway system. This system is under construction, and several railway lines are being rebuilt or extended. In December 2021, the rebuilt line connecting Wrocław with Jelcz through the Czernica community was opened after 20 years without passenger transport. The reopening provides an opportunity for specific research, including testing of the NOAH algorithm. The use of NOAH allows temporary changes to the rail network to be taken into account, e.g., the closure of a specific section of the line. The temporary closure of one railway connection makes it possible to observe all public transport connections between Wrocław and Jelcz using only one corridor.

Several factors and their sets are considered. Among them are new railway lines, bus lines, and parking lots. These data were compared with observed values such as the number of passengers. Figure 3 shows the map of the case study area, taking into account different types of railway lines (older ones in black, new ones in blue, and temporarily closed ones in red), two groups of bus lines (those correlated with the railway line in green and those competing with the railway in red), the target area in terms of the area of the Czernica commune, and car parks at railway stations in this area (park and ride, PR). Based on these conditions, 16 factors are formulated to be considered in the NOAH model.

**Figure 3.** The map of the case study area.

#### *4.2. Factors and Their Conversion*

A set of 16 factors was used in this case study (Table 2). Factors are divided into two groups: positive (nine factors) and negative (seven factors). Rising values of positive factors increase the total number of travelers (in all modes), and increasing values of negative factors reduce this number. This classification is a proposal adopted for this research under specific assumptions; it does not define a closed set of factors or their roles. For example, factors F2 and F3 are classified as "positive" on the assumption that the offer presented by the rail operator provides a higher number of places than the forecasted demand. Under this assumption, the trains will not become overcrowded. A higher number of passengers or people who get on the train will increase the use of suburban rail because it creates "good behavior" for new passengers. Conversion into monkeys and re-conversion differ depending on which group the factor belongs to. The set of factors adopted here is not complete in terms of all possible measures of the rail system, but it does contain exemplary, representative, and easy-to-collect data.


**Table 2.** The factors considered in the case study.

Assuming (or identifying) a range for all factors is a necessary and important element of NOAH. The minimum and maximum values of the selected factors are necessary to calculate the monkey position value (*M*n). Monkey positions correspond with the factors collected in Table 2. Two experiments are carried out in this case study. The first (Section 4.3) tests the volatility of the factor values. The second (Section 4.4) takes the actual factor values and compares them with the NOAH results. The conversion from *F* to *M* only occurs in the second experiment, but the re-conversion from *M* to *F* applies to both experiments. The following conversion formulas (6)–(7) and re-conversion formulas (8)–(9) are used depending on the specificity of the positive (*FP*) or negative (*FN*) factors. The mathematical conversion formulas are used to impose monkey position values in the range of 0 to 1.
