1. Introduction
School travel mode choice has changed significantly over the past century with the introduction of motorization, the predominant cause being the motor car [1]. Active transportation has declined sharply in the last 50 years [2]. For school trips of one mile or less in the USA, walking and bicycling represented 87 percent of trips in 1969, which fell to 55 percent in 2001. Conversely, automobile trips increased from 7 percent to 36 percent between 1969 and 2001. A similar situation emerged in England, where walking to primary schools fell from 61 percent to 52 percent while car travel increased from 30 percent to 40 percent between 1992–1994 and 2002–2003 [3]. Mode choice decisions differ between teenagers and children: parents are the main deciders of the travel mode for their children, while teenagers participate in selecting the mode for themselves [4].
Distance is the main factor affecting travel mode choice to school: students walk or bicycle for short trips and shift to cars or public transport for longer trips [5]. Students living in neighborhoods with higher residential density prefer to walk to school, while a higher number of road intersections is associated with fewer students using active transportation to school [6]. For children, an increase in actual and perceived dangers diminishes the opportunities for independent mobility and active travel [7]. Independent mobility and active travel are the main factors that promote the healthy physical and mental growth of children [8]. Parental income, car ownership, and socioeconomic status are positively related to car use for school travel and negatively related to walking/bicycling. Distance and safety concerns have been found to be pivotal barriers for parents towards active transportation [9].
In the Saudi Arabian context, the variables that significantly influence mode choice behavior are the time to school, number of family members, monthly household income, distance to school, nationality, and the number of cars owned by each family [10]. However, Assi et al. [11] found family income, travel time and parent’s income to be the most influential variables in the mode choice behavior of high school students. Their empirical study suggests that the multi-layer perceptron (MLP) tends to predict mode choice behavior more accurately than the more commonly used logistic regression model. Others, notably Al-Atawi and Saleh [12] and Al-Ahmadi [10], used a multinomial logit model to explain mode choice behavior.
In contemporary research, logistic regression and artificial intelligence techniques have been used extensively for modeling school-travel mode choice behavior [11,13,14]. However, their precision has always been open to question, as they apparently fail to capture the dynamics associated with the variables that influence mode choice behavior [15,16]. Moreover, previous studies have addressed mode choice behavior with a disparate focus on high, middle, or elementary school students in Saudi Arabia. This leaves a pervasive knowledge gap on the choice of techniques and their relative efficiency in addressing the mode choice behavior of Saudi students at all levels. In addition, a combination of the relevant techniques, together with a comparison of their results to establish the contribution of each to the precision level, has not yet been explored. The simulation results demonstrate the efficacy of the employed machine learning techniques (MLT). For instance, the extreme learning machine (ELM) and multilayer perceptron neural network (MLP-NN) can predict the students’ mode choice with high accuracy, while the support vector machine (SVM) is not as accurate as the other two techniques but can still predict mode choice behavior with satisfactory accuracy [17]. Against this backdrop, this study uses three MLT, i.e., (i) ELM, (ii) SVM, and (iii) MLP-NN, to achieve relatively higher precision in describing the mode choice behavior of all school-going students in Saudi cities. Specifically, this study first aims to compare the strength of these artificial intelligence (AI) techniques in predicting the mode choice behavior of male school students in the Khobar-Dhahran metropolitan area. The robustness, convergence, and efficacy of the employed techniques are investigated and validated by training and testing them with different portions of the travel data. Furthermore, this study develops a majority voting ensemble method based on the outputs of the employed MLT that enhances the overall prediction performance for the mode-choice problem. Secondly, it develops a consolidated policy perspective emanating from the study results.
This research is presented in the following order. Section 2 presents a short overview of the literature related to mode choice modeling. Section 3 explains the study methods. Section 4 presents data processing, in which the employed MLT are discussed briefly. Section 5 discusses the results of this study. Lastly, Section 6 provides a brief summary of the conclusions of this paper.
2. Literature Review
The urbanity of a region significantly affects travel mode choice: fewer children walk or use a bicycle as urbanity increases, since urban environments have active public transportation networks [18]. Children living within one kilometer (km) of school prefer walking, while those living beyond three kilometers use vehicles. According to Stewart et al. [19], the distance to school, parental attitude towards traffic, culture, climate, and family resources are some of the significant factors affecting walking and biking to school.
McMillan [3] carried out research to identify the effect of urban form on school travel mode choice. The study focused on elementary schools, and travel behavior data were collected from sixteen schools located in Northern and Southern California. The probability of walking, bicycling or using the private car was examined using a binomial logit regression model. The study concluded that, in addition to urban form, many other factors influence the probability of school mode choice. These factors include the safety of the locality, vehicular safety, family mobility options, societal values, and the caregiver’s attitude along with urban form being the most significant factor.
Müller et al. [5] investigated students’ school choice and travel mode choice for schools in Germany. The data were collected for secondary schools and analyzed using the multinomial logit (MNL) model. The student’s profile, along with the distance and responsible authority, affected the school choice. According to the study, mode choice relied significantly on distance along with other factors such as weather conditions and seasonality. Another study, conducted in Florida [20], used an MNL model to examine the factors affecting the mode choice to school. The study found that the time to walk/bike, the distance from school and the elements of the built environment around the school played a significant role in travel mode choice. Lin and Chang [21] reported that a high density of sidewalks and shade trees encourages walking to school, while a higher number of road intersections has the opposite effect. According to Broberg and Sarjala [18], compared to neighborhood schools, fewer students bicycle or walk to magnet schools, and the rate of bus travel was high for magnet schools.
A study carried out in Toronto, Canada [6] investigated the positive effect of the built environment on active transportation, i.e., walking/cycling. The travel mode choice of school students aged eleven to twelve years was estimated using binomial logistic regression. The study concluded that the factors associated with active travel are distance, travel walking density, signal-controlled intersections, and low-income vicinities. Panter et al. [22] investigated the effect of the environment, locality, and route in research carried out in the United Kingdom on students aged nine to ten years. The study reported that children with a direct road to school were less likely to use active transportation than those with a dense network of roads in their locality. A higher number of streetlights decreased the number of students bicycling to school. Pojani and Boussauw [23] examined the impact of the physical and cultural environment on school travel mode choice. The survey data for students aged eleven to thirteen years were collected in Tirana, Albania. The study revealed that most students preferred walking to school, while the use of bicycles and buses was minimal. The study found that students who walk to school do so in groups and attend the nearest schools with fewer road intersections. Students from high-income families who live far away from school travel to school by private car.
Zhang et al. [8] examined the mode choice of school students aged seven to eighteen years in Beijing, China. Tree-based and logit models were employed to study the key factors affecting the mode choice. The study revealed that owning a personal car and poor active transportation encourage the use of passenger cars. Further, it was found that long distances from school increase the use of passenger cars. A study in Kanpur, India investigated various factors affecting the school travel mode choice [4]. The mode choice decisions were examined using a multinomial logit framework. The study revealed that the lack of public transportation and a shortage of quality school bus services encouraged the use of family cars and paratransit. Moreover, the distance from the school, gender, family economic status, cultural exposure towards active transportation, and proper infrastructure for walking/bicycling were recognized as the most significant factors in the school travel mode choice.
Pont et al. [24] conducted a systematic literature review to study the relationship between the physical and socio-economic characteristics of the environment and the active transportation of children. The active transportation of children aged five to eighteen years was inversely related to the distance from the destination, car ownership, and higher family income. Giles-Corti et al. [25] inspected the effect of street connectivity and traffic exposure on a child’s tendency to walk to school. It was found that more students walk to school in areas of densely connected streets and low traffic exposure than in areas of densely connected streets and high traffic exposure. According to Stewart [26], the distance, income, traffic, crime fears, and parental attitude strongly influence the active transportation of children. A study conducted in Belgium [27] revealed that the probability of walking to school was significantly influenced by gender, distance, smoking status, and perception towards walking. McDonald [28] reported that the most critical factors affecting school mode choice were distance and time, while gender showed little effect.
Dave et al. [29] conducted a survey to examine the travel mode choice of school children in Vadodara, India and carried out a feasibility study on the use of a coordinated bus for school travel. According to the study, most students travel by auto-rickshaw and van, as these provide door-to-door service. Moreover, the study found that the likelihood of moving to the coordinated bus increases with a decrease in travel cost and distance. The most significant factors for the travel mode choice were age, car ownership, family income, and the number of persons in the family. Carver et al. [30] examined the social and physical environmental factors that affect the independent movement of children to school in rural/urban areas of Norfolk, United Kingdom. The study surveyed children aged nine to ten and their parents. In addition, the features of the locality, the road to school, and the school environment were measured using school audits and geographical information systems. The study revealed that land use mix, major roads in the vicinity, and parental encouragement were the factors that encouraged the independent mobility of children. Moreover, the study reported that half of the children walked or used a bicycle for school travel. A study conducted in Yorkshire, United Kingdom [9] reported that distance, time constraints, and parents’ safety concerns were the main barriers to active transportation. A study conducted in Dunedin, New Zealand [31] reported that approximately 51 percent of adolescents were enrolled in the nearest school, and that these adolescents showed a five times higher rate of walking and lower motorization rates compared to their counterparts.
The previous studies suggest that the use of AI techniques in mode choice modeling is scarce. This research aims to fill this gap by employing three common artificial intelligence techniques, i.e., ELM, SVM, and MLP-NN, in school mode choice modeling with a relative evaluation of their predictive performance.
5. Results and Discussions
The descriptive statistics for the two modes used in this study are presented in Table 3. More than 70 percent of the school-goers use passenger cars for travel, while approximately 30 percent of them walk to school. The median travel time for both modes is almost the same, and students who reside near the school walk to school.
This study used the MLP-NN, SVM, and ELM for predicting the mode choice of school goers. To test the robustness and convergence of the MLT, this study employed 60, 70, and 80 percent of the available data for training the prediction models for the three different scenarios, while the remaining data were used for testing purposes. It is worth mentioning that the inputs of the MLT models were the school level, the distance between home and school, travel time, family income, family size, the number of students in the family and education level, whereas their output was binary i.e., either choosing passenger car or walking.
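The three training scenarios described above can be illustrated with a minimal Python sketch. It is not the study's MATLAB workflow: the function name `split_train_test`, the fixed random seed, and the placeholder list of 100 records are all assumptions made for illustration, and the original does not state whether the data were shuffled before splitting.

```python
import random


def split_train_test(records, train_fraction, seed=0):
    """Shuffle the records and split them into training and testing parts.

    The shuffle and the seed are assumptions; the paper only states the
    training percentages (60, 70, and 80 percent).
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical placeholder data: 100 record identifiers, not the survey data.
records = list(range(100))

for fraction in (0.6, 0.7, 0.8):
    train, test = split_train_test(records, fraction)
    print(f"{int(fraction * 100)}% training: {len(train)} train, {len(test)} test")
```

Each scenario uses the same records, so differences in accuracy across the three splits reflect only the amount of training data.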
For the multilayer perceptron neural network, the Levenberg-Marquardt (LM) algorithm was employed for training in the MATLAB environment. The LM algorithm combines the Gauss-Newton and steepest descent algorithms and exhibits better generalization performance in terms of stability, robustness, and convergence. This study chose five hidden neurons for the MLP-NN through a systematic trial and error approach. For the SVM, it employed a binary SVM classifier in the MATLAB environment that trains or cross-validates an SVM model for two classes of data, mapping the predictor data using kernel functions and minimizing the objective function via quadratic programming. As with the MLP-NN, the key parameters of the SVM were selected through systematic trial and error. Finally, this research employed an ELM toolbox developed in the MATLAB environment to predict the mode choice behavior of the school goers. As with the other two techniques, the ELM parameters, namely the regularization coefficient, kernel parameter, and kernel option, were chosen through a systematic trial and error process.
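The way the LM algorithm blends Gauss-Newton and steepest descent can be seen in its standard textbook weight update, given here as a sketch. The symbols are assumptions introduced for illustration and do not appear in the original: $\mathbf{w}_k$ is the weight vector at iteration $k$, $\mathbf{e}$ the vector of network errors, $\mathbf{J}$ the Jacobian of $\mathbf{e}$ with respect to the weights, and $\mu > 0$ a damping parameter.

```latex
\mathbf{w}_{k+1} = \mathbf{w}_k - \left( \mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I} \right)^{-1} \mathbf{J}^{\top}\mathbf{e}
```

When $\mu$ is close to zero the step approaches the Gauss-Newton step, and when $\mu$ is large it approaches a steepest descent step with a small step size. In common implementations $\mu$ is decreased after a successful step and increased otherwise.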
Table 4, Table 5 and Table 6 show the operation time required for training and testing the selected MLT, along with their training and testing accuracies, for different amounts of training and testing data. As can be seen, the SVM required approximately 10 seconds for training, whereas the MLP-NN and ELM each took approximately 0.05 seconds. The SVM and MLP-NN took similar times for testing, almost double the testing time of the ELM technique. Therefore, in terms of computational time, the SVM technique is the most expensive and the ELM the least expensive for this specific study. The training accuracy for the SVM technique was approximately 90 percent, whereas that of the ELM and MLP-NN techniques was more than 99 percent in all cases in predicting the mode choice behavior of the school-goers. In terms of testing accuracy, the ELM and MLP-NN achieved almost 98 percent, whereas the accuracy of the SVM technique was less than 90 percent in each case.
Furthermore, this study employed a majority voting ensemble method based on the outputs of the three machine learning tools to achieve better generalization performance on the testing datasets. This technique gave each model an equal vote and chose the predicted class that received more than half of the votes. There was no ambiguity in the selection of the best class, as this study employed an odd number (three) of MLT and the output was binary.
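The voting rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `majority_vote` and `accuracy`, the labels, and the toy predictions are hypothetical and are not taken from the study's data or MATLAB code.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-sample class predictions from an odd number of models.

    Every model gets one equal vote per sample. With binary labels and an
    odd number of models, one class always receives more than half the votes.
    """
    if len(model_predictions) % 2 == 0:
        raise ValueError("use an odd number of models to avoid ties")
    if len({len(p) for p in model_predictions}) != 1:
        raise ValueError("all models must predict the same number of samples")
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]


def accuracy(predicted, actual):
    """Fraction of samples whose predicted class matches the actual class."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy labels and predictions, invented for illustration only.
actual = ["car", "walk", "car", "car"]
mlp_nn = ["car", "walk", "walk", "car"]
svm = ["car", "car", "car", "car"]
elm = ["walk", "walk", "car", "car"]

ensemble = majority_vote(mlp_nn, svm, elm)
print(ensemble)                    # ['car', 'walk', 'car', 'car']
print(accuracy(mlp_nn, actual))    # 0.75
print(accuracy(ensemble, actual))  # 1.0
```

In this toy case each model errs on a different sample, so the vote corrects all three errors. Majority voting does not guarantee an improvement in general; it helps most when the individual models make their mistakes on different samples.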
Table 7 presents and compares the accuracy of the different machine learning tools and the majority voting ensemble method. Figure 2 shows the testing accuracies of the employed MLT and the majority voting ensemble method for different amounts of training and testing data. In terms of accuracy, the SVM technique showed the weakest performance, whereas the ELM and MLP-NN showed excellent performance that justified their use in predicting the mode choice behavior of the school-goers of the selected area of Saudi Arabia. Moreover, the majority voting ensemble method predicted the solutions with accuracies better than or at least equal to those of the individual models.
6. Conclusions
The promising performance of ELM and MLP-NN suggests their use for modeling the travel mode choice behavior of school goers. Both techniques outperformed the SVM technique in terms of training and testing accuracies. The training accuracy for the SVM technique was approximately 90 percent, whereas for the ELM and MLP-NN techniques the accuracies were more than 99 percent in all cases. In addition, the ELM and MLP-NN achieved almost 98 percent accuracy on the test datasets, whereas the accuracy of the SVM technique was less than 90 percent in each case. Furthermore, the SVM technique was more computationally expensive than the other two techniques, and the ELM was the least expensive overall. Therefore, the ELM and MLP-NN models can be applied to predict the mode choice behavior of the school-going student population in KSA with higher precision. Moreover, the developed majority voting ensemble method predicted the solutions with accuracies better than or at least equal to those of the individual MLT models, which confirmed the effectiveness of the ensemble method in modeling mode choice behavior problems.
It was observed that travel time, family income, and parent education level were the prime variables dictating mode-choice behavior. This study is expected to help and guide transport planners, engineers, and decision-makers in devising a plan for strategic management and supporting infrastructure for both traffic and walkers, based on accurate demand prediction. This would also influence decisions on surrounding land use changes that facilitate commuting both by car and on foot.