1. Introduction
The transport system is the heart of trade: the more efficient the transport system, the more efficient the trade. Transport vehicles play a central role in this system, and any failure to provide them on time leads to excessive expense. In some cases, people cannot access vehicles to transport their goods at all, or they must pay large amounts in advance. To address this problem, we developed a transport vehicle demand prediction system that can accurately forecast the demand for goods-transport vehicles in an area. These predictions can then be provided to transport companies, enabling a timely service that benefits both parties. The system employs neural networks, specifically multilayer perceptrons (MLPs) and long short-term memory networks (LSTMs), to forecast transport vehicle needs in different areas or localities. Making sense of travel behaviors and patterns is an important prerequisite for developing intelligent transportation systems [1]. Often, this is achieved using short-term models that capture recurrent patterns in various vehicles, especially those providing transport services. Alternatively, simple models such as logistic regression are used for classification and prediction. While these models are adequate for providing a basic overview and identifying patterns to inform policy, a better solution is needed [2]. Neural networks are a valuable tool in this scenario because the collected data follow a sequential pattern (a time series). Deep learning techniques excel at handling unstructured data and at identifying critical features for analysis, making them especially effective in this context [3]. LSTMs can also help identify long-term changes in trends or patterns in the demand for transport vehicles.
Neural networks have been widely applied to similar problems, as discussed in the literature survey [4]. The main difficulty is that there is no exact way to deal with sudden spikes in demand in certain areas; for example, if a sports event or a music concert takes place in a specific area, transport demand there rises drastically. Traditional models, and even neural networks, struggle to handle such abrupt shifts or spikes in the data, lacking precise methods to forecast these fluctuations in demand. This project is specifically geared towards resolving this issue by devising more effective strategies to predict and manage sudden changes in transportation demand.
As is evident from Figure 1, such sudden changes in demand follow no recurring pattern and cannot be identified by any model without multimodal analysis.
The event data are obtained and processed separately from the transport vehicle dataset and are then incorporated into the model using multimodal deep learning [5]. In essence, the unstructured event data undergo separate processing by another neural network, which can be an LSTM or an MLP. The resulting features are then combined with those of the primary neural network, either through a cross-merge technique or by concatenation followed by a softmax layer, to form a cohesive and comprehensive representation for analysis. The event data are obtained from major news sources on the internet and then processed.
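To make the fusion step concrete, the following is a minimal sketch of such a two-branch network using the Keras functional API. It fuses the demand-history branch and the event-feature branch by plain concatenation and ends in a linear regression head; the layer sizes, look-back length, and event-feature dimensionality are illustrative assumptions rather than the exact configuration used in this work.
```python
# Minimal multimodal fusion sketch (Keras functional API): an LSTM branch for the
# historical demand sequence and an MLP branch for preprocessed event features,
# merged by concatenation. Sizes below are assumptions for illustration only.
import tensorflow as tf
from tensorflow.keras import layers, Model

LOOKBACK = 24        # hours of past demand fed to the LSTM branch (assumed)
N_ZONES = 4          # zones 25, 97, 181, and 189
EVENT_DIM = 16       # dimensionality of the processed event feature vector (assumed)

# Branch 1: historical demand per zone, handled by an LSTM.
demand_in = layers.Input(shape=(LOOKBACK, N_ZONES), name="demand_history")
x = layers.LSTM(64)(demand_in)

# Branch 2: event information, handled by a small MLP.
event_in = layers.Input(shape=(EVENT_DIM,), name="event_features")
y = layers.Dense(32, activation="relu")(event_in)

# Fuse the two branches and predict next-hour demand for each zone.
merged = layers.Concatenate()([x, y])
merged = layers.Dense(32, activation="relu")(merged)
out = layers.Dense(N_ZONES, name="next_demand")(merged)

model = Model(inputs=[demand_in, event_in], outputs=out)
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.summary()
```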
2. Literature Review, Proposed System, Methodology
2.1. Literature Review
CNN-LSTM models are used to extract spatiotemporal characteristics and are highly useful for demand prediction. The paper "Predicting Taxi Demand Based on 3D Convolutional Neural Network and Multi-Task Learning" describes the use of LSTM to extract spatiotemporal characteristics from historical data and uses a 3D ResNet for accurate prediction [6].
The research paper introduces a model that utilizes both the gradient-boosted decision tree algorithm and specific time series algorithms to ensure accurate and precise predictions. This hybrid approach aims to leverage the strengths of each method for more effective forecasting [7]. The proposed model is able to predict taxi demand accurately, is 10% to 40% faster than other models, and uses the LSTM machine learning algorithm to perform time-series analysis.
The paper entitled "An Introduction to Convolutional Neural Networks" describes the working and use of artificial CNNs. The CNN architecture and its use cases are clearly illustrated in the paper [8], and the use of CNNs in pattern recognition is also well explained.
The paper provides a comprehensive overview of the current variations in LSTM cells within network architectures used for predicting time series data. It delves into categorizing the different states of LSTM cells, offering insights into their functionalities and structures within these predictive models [9]. The LSTM behavior is also illustrated in the paper, and the use of long-term and short-term memory for time series analysis is demonstrated.
CNN-LSTM models first operate like CNNs, extracting a feature vector from the continuous load map constructed from the load-influencing factors [10]. The LSTM is then applied to predict the load. The results indicate that the CNN-LSTM model performs better than either a CNN or an LSTM alone [11].
2.2. Methodology
This section discusses the methodology implemented in this project. The overall flow of the project, the data used, and the neural networks used will all be discussed along with other key points.
- (A)
Datasets and event data:
For this project, we used taxi data from New York City as a model for, and a subset of, the overall transport vehicle data. It should be noted that taxi demand may vary and does not completely mirror transport vehicle demand as a whole [12]; however, taxis are one of the key forms of transportation. Owing to the restricted scope of the project and time constraints, we limited ourselves to four of the "taxi zones" in Brooklyn: zones 25, 97, 181, and 189, as shown in Figure 2. The event data were centered on the Barclays Center in Brooklyn, a large arena that serves as the home venue for one of the nation's most popular basketball teams and also hosts a diverse range of large-scale events beyond sports. Again, this was chosen as the single event venue analyzed because of the limited scope of the project. The arena's official website provides an event calendar, from which the event data were mainly extracted; the event descriptions were scraped from the internet using Selenium and BeautifulSoup. The dataset was preprocessed to represent a time series, and a minimal preprocessing sketch is given after the table summaries. The summaries for the data are presented in
Table 1,
Table 2,
Table 3 and
Table 4.
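As a concrete illustration of the preprocessing described above, the following is a minimal pandas sketch that aggregates NYC TLC trip records into an hourly pickup-demand series for the four studied zones and joins a binary event indicator built from the scraped Barclays Center calendar. The file names, column handling, and event-calendar schema are assumptions for illustration.
```python
# Minimal preprocessing sketch: hourly pickup counts per zone plus an event flag.
# "yellow_tripdata.parquet" and "barclays_events.csv" are assumed local files.
import pandas as pd

ZONES = [25, 97, 181, 189]

trips = pd.read_parquet("yellow_tripdata.parquet")            # assumed trip-record file
trips = trips[trips["PULocationID"].isin(ZONES)]
trips["hour"] = trips["tpep_pickup_datetime"].dt.floor("H")

# Pivot to one column of pickup counts per zone, indexed by hour.
demand = (trips.groupby(["hour", "PULocationID"]).size()
               .unstack("PULocationID")
               .reindex(columns=ZONES)
               .fillna(0))

# Scraped event calendar with one row per event (assumed columns: start, end).
events = pd.read_csv("barclays_events.csv", parse_dates=["start", "end"])
demand["event"] = 0
for _, ev in events.iterrows():
    # Mark every hour overlapping the event window with a binary indicator.
    demand.loc[ev["start"].floor("H"):ev["end"].ceil("H"), "event"] = 1
```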
- (B)
Neural network architecture:
Multilayer perceptrons (MLPs) are fundamental feedforward neural networks composed of multiple layers of interconnected neurons, known as perceptrons. These networks consist of an input layer, one or more hidden layers, and an output layer, allowing for complex information processing and pattern recognition [13]. They generate a set of outputs from a given set of inputs and are trained using backpropagation; they are among the simplest and most widely used artificial neural networks. In this project, MLPs were used mainly as a point of comparison for the stronger model, and a minimal sketch of such a network is given below.
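A minimal Keras sketch of such an MLP is shown below; the flattened input window and the hidden-layer sizes are illustrative assumptions rather than the exact architecture used in the project.
```python
# Minimal MLP sketch: input layer, two hidden layers, and a linear output layer
# predicting demand for each zone. Sizes are assumptions for illustration.
from tensorflow.keras import layers, models

def build_mlp(input_dim: int, n_zones: int = 4) -> models.Sequential:
    model = models.Sequential([
        layers.Input(shape=(input_dim,)),     # flattened demand history (+ optional event features)
        layers.Dense(64, activation="relu"),  # hidden layer 1
        layers.Dense(32, activation="relu"),  # hidden layer 2
        layers.Dense(n_zones),                # predicted demand per zone
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model
```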
Long short-term memory (LSTM) networks are a type of recurrent neural network that performs extremely well on sequence prediction and time series problems [14]. They are comparatively complex models built from individual recurrent units, each called an LSTM cell.
Figure 3 explains the LSTM cells further.
LSTM units are the building blocks of a recurrent neural network, and LSTM cells can read, write, and delete their memory [15]. The gating system plays a crucial role within LSTM cells: the input gate controls the entry of new information, the forget gate manages the removal of outdated information, and the output gate regulates how the cell state influences the current output. These gates enable effective information flow and the sophisticated processing of sequential data; the standard gate equations are given below. LSTMs are good at learning from experiences separated by large time gaps, which makes them well suited to learning from seemingly irregular events.
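For reference, the gating mechanism described above corresponds to the standard LSTM cell update, where \(\sigma\) is the logistic sigmoid, \(\odot\) denotes the element-wise product, and \(W\), \(U\), and \(b\) are the learned weights and biases of each gate:
```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) &&\text{(forget gate)}\\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) &&\text{(input gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) &&\text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) &&\text{(candidate memory)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t &&\text{(cell state update)}\\
h_t &= o_t \odot \tanh(c_t) &&\text{(hidden state / output)}
\end{aligned}
```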
2.3. Project Flow
Figure 4 shows the overall flow of the project.
3. Results
The dataset was divided into training, validation, and testing sets at a ratio of 70%, 20%, and 10%, respectively. A baseline model was also developed for comparison; this model used no machine learning or deep learning technique, instead taking the mean of the previous few data points in the time series as the predicted value. The metrics used to evaluate the models were the mean absolute error (MAE) and the root mean squared error (RMSE).
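The following is a minimal sketch of this baseline and of the two evaluation metrics; the window of three previous points is an assumption for illustration.
```python
# Minimal sketch of the non-learning baseline and the MAE/RMSE metrics:
# the prediction for each step is the mean of the previous few observations.
import numpy as np

def baseline_predict(series: np.ndarray, window: int = 3) -> np.ndarray:
    """Predict y[t] as the mean of y[t-window:t]; valid from index `window` onward."""
    return np.array([series[t - window:t].mean() for t in range(window, len(series))])

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Example: evaluate the baseline on an illustrative hourly demand series for one zone.
zone_demand = np.array([12, 15, 14, 30, 28, 16, 13], dtype=float)
preds = baseline_predict(zone_demand, window=3)
print(mae(zone_demand[3:], preds), rmse(zone_demand[3:], preds))
```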
An incremental analysis was conducted to compare the models used in the study. Initially, a multilayer perceptron (MLP) was trained using only the taxi demand and location data; an LSTM model was then trained on the same dataset. Next, the MLP was additionally provided with the event information, and the same was done for the LSTM, which constituted the final model. This comparison evaluated the performance and effectiveness of each variant on the prediction task, and graphs were plotted for the final LSTM model trained with the event data. A sketch of how the time series could be windowed into supervised samples for these variants is given below.
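The sketch below illustrates one way the hourly demand table could be windowed into supervised samples for these variants, with the event indicator included only for the event-aware models; the look-back length of 24 h and the helper-function name are assumptions for illustration.
```python
# Minimal windowing sketch: each sample is a look-back window of per-zone demand
# (plus the event flag for the event-aware variants); the target is the demand
# at the next hour for every zone.
import numpy as np
import pandas as pd

def make_windows(df: pd.DataFrame, target_cols, lookback: int = 24, use_events: bool = False):
    feature_cols = list(target_cols) + (["event"] if use_events else [])
    values = df[feature_cols].to_numpy(dtype=float)
    targets = df[list(target_cols)].to_numpy(dtype=float)
    X, y = [], []
    for t in range(lookback, len(df)):
        X.append(values[t - lookback:t])   # shape: (lookback, n_features)
        y.append(targets[t])               # demand per zone at the next step
    return np.stack(X), np.stack(y)

# Usage with the hourly table from the preprocessing sketch (assumed to exist):
# X_lstm, y = make_windows(demand, target_cols=[25, 97, 181, 189], use_events=True)
# X_mlp = X_lstm.reshape(len(X_lstm), -1)   # flattened windows for the MLP variant
```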
A comparative representation of the performance of the models is presented in
Table 5.
Graphs for the final model, i.e., LSTM with event data, are presented for each zone in
Figure 5 and
Figure 6.
As can be seen, the model predicts the overall trends in demand variation reasonably well. It is even able to accommodate the sudden changes due to events, as is visible from the spikes.
Owing to the considerably high error values, the current model's reliability for accurately predicting demand is limited, and it cannot yet be considered an efficient tool for precise demand prediction. Further refinements or alternative approaches may be necessary to improve its accuracy and practical utility. As discussed above, this project took only one event venue into consideration and analyzed only four zones. Nevertheless, the model is sufficiently well trained to predict overall trends and can give a rough estimate of demand, even if not the exact values.
4. Conclusions
The project was able to estimate the demand for transport vehicles (currently taxis, to which the scope was restricted) across four zones, and it also considered events at the Barclays Center to account for sudden rises in demand across the city.
The scarcity of datasets for other types of transport vehicles was the biggest limitation of the project, as reliable sources for transport vehicles other than taxis could not be found. Another limitation was that only data from the United States were available. Hence, the model's deployability in other countries and cities depends heavily on whether appropriate datasets can be found or generated.
The project's core focus was on effectively processing and utilizing textual event data within the model's training process to derive meaningful results. This capability holds considerable potential for diverse applications and can be harnessed in various ways; event information is only one of them. The model could also be used for traffic analysis or for weather-related prediction without actually obtaining numerical data and increasing the size of the feature vectors. Textual, context-based learning of various factors, and the training of models based on it, is a very interesting concept that deserves further investigation. The relationship between social media activity and vehicle demand could also be established using the same technique.
Author Contributions
Conceptualization, K.J. and A.B.; methodology, A.S.; software, A.B. and A.S.; validation, P.K., K.J. and A.K.; formal analysis, A.S. and A.K.; investigation, A.S. and A.K.; resources, A.S. and A.B.; writing—original draft preparation, K.J., A.B. and R.B.; writing—review and editing, K.J., S.D. and A.B.; supervision, P.K.; project administration, P.K. and K.J. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
No external data were used.
Acknowledgments
We wish to convey our deep appreciation to our mentor, Pankaj Kunekar, from the Department of Information Technology and the Department of Artificial Intelligence and Data Science at Vishwakarma Institute of Technology, Pune, for his unwavering research and technical guidance during our project. We also extend heartfelt thanks to our references for their invaluable contributions. All the authors have consented to the acknowledgement.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Zhong, C.; Wu, P.; Zhang, Q.; Ma, Z. Online prediction of network-level public transport demand based on principle component analysis. Commun. Transp. Res. 2023, 3, 100093. [Google Scholar] [CrossRef]
- Xue, R.; Sun, D.J.; Chen, S. Short-term bus passenger demand prediction based on time series model and interactive multiple model approach. Discret. Dyn. Nat. Soc. 2015, 2015, 682390. [Google Scholar] [CrossRef]
- Vateekul, P.; Sri-iesaranusorn, P.; Aiemvaravutigul, P.; Chanakitkarnchok, A.; Rojviboonchai, K. Recurrent Neural-Based Vehicle Demand Forecasting and Relocation Optimization for Car-Sharing System: A Real Use Case in Thailand. J. Adv. Transp. 2021, 2021, 8885671. [Google Scholar] [CrossRef]
- Kunekar, P.R.; Azam, M.; Maggavi, R.R.; Gehlot, A.; Mahesh, B.; Prajapati, G.K. AI Aero Science Model To Predict Security And To Improve The Fault Space System. In Proceedings of the 2022 2nd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), Greater Noida, India, 28–29 April 2022; pp. 2289–2293. [Google Scholar]
- Kunekar, P.R.; Gupta, M.; Agarwal, B. Deep learning with multi modal ensemble fusion for epilepsy diagnosis. In Proceedings of the 2020 3rd International Conference on Emerging Technologies in Computer Engineering: Machine Learning and Internet of Things (ICETCE), Jaipur, India, 7–8 April 2020; pp. 80–84. [Google Scholar]
- Kuang, L.; Yan, X.; Tan, X.; Li, S.; Yang, X. Predicting taxi demand based on 3D convolutional neural network and multi-task learning. Remote Sens. 2019, 11, 1265. [Google Scholar] [CrossRef]
- Hsu, C.; Chen, H. Taxi Demand Prediction based on LSTM with Residuals and Multi-head Attention. In Proceedings of the 6th International Conference on Vehicle Technology and Intelligent Transport Systems; Science and Technology Publications: Setúbal, Portugal, 2020; Volume 1, pp. 268–275, ISBN 978-989-758-419-0. [Google Scholar] [CrossRef]
- Lin, Z.; Cao, Y.; Liu, H.; Li, J.; Zhao, S. Research on optimization of urban public transport network based on complex network theory. Symmetry 2021, 13, 2436. [Google Scholar] [CrossRef]
- Lindemann, B.; Müller, T.; Vietz, H.; Jazdi, N.; Weyrich, M. A survey on long short-term memory networks for time series prediction. Procedia CIRP 2021, 99, 650–655. [Google Scholar] [CrossRef]
- Agga, A.; Abbou, A.; Labbadi, M.; El Houm, Y.; Ali, I.H.O. CNN-LSTM: An efficient hybrid deep learning architecture for predicting short-term photovoltaic power production. Electr. Power Syst. Res. 2022, 208, 107908. [Google Scholar] [CrossRef]
- Xu, F.; Weng, G.; Ye, Q.; Xia, Q. Research on Load Forecasting Based on CNN-LSTM Hybrid Deep Learning Model. In Proceedings of the 2022 IEEE 5th International Conference on Electronics Technology (ICET), Chengdu, China, 13–16 May 2022; pp. 1332–1336. [Google Scholar]
- Liu, Z.; Chen, H.; Li, Y.; Zhang, Q. Taxi demand prediction based on a combination forecasting model in hotspots. J. Adv. Transp. 2020, 2020, 1302586. [Google Scholar] [CrossRef]
- Van Houdt, G.; Mosquera, C.; Nápoles, G. A review on the long short-term memory model. Artif. Intell. Rev. 2020, 53, 5929–5955. [Google Scholar] [CrossRef]
- Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar] [CrossRef]
- Singh, J.; Banerjee, R. A study on single and multi-layer perceptron neural network. In Proceedings of the 2019 3rd International Conference on Computing Methodologies and Communication (ICCMC), Erode, India, 27–29 March 2019; pp. 35–40. [Google Scholar]