1. Introduction
With economic development and advances in Internet technology, the volume of ride-hailing services has grown rapidly. Because they are more convenient and accessible than public transportation, ride-hailing services have gradually become an important component of urban transportation. However, they also carry higher travel costs. As a result, some passengers opt for a hybrid travel model that combines the convenience of ride-hailing with the cost-effectiveness of public transportation, most commonly by combining metro and ride-hailing services. How to integrate ride-hailing services with urban rail transit so as to reduce passengers' travel costs, enhance travel convenience, shorten travel time, and build a green, friendly, and sustainable transportation system has therefore become an urgent issue.
Existing multimodal transport optimization research mainly focuses on evaluation systems for transfers between different modes of public transport. The principal interchange modes that have been the subject of study are subway–bus, bus–bicycle, and subway–bicycle. The travel time for the connections is typically determined through network analysis, GIS software analysis of origin and destination points, and passenger flow data mining methods [1]. In terms of analyzing transfer paths, Liu [2] points out that accurately representing various urban transport networks and improving connections between transport networks are crucial for understanding travel behavior and enhancing the resilience of transport systems. Wu [3] developed a subway station spacing calculation model aimed at reducing passenger travel time by incorporating both grid and radial road network configurations, thereby improving the model’s applicability across various urban settings. Liu [4] applied a greedy triangulation algorithm to identify possible routes for intermodal transfers across different transportation modes, followed by the development of a multi-objective optimization framework aimed at reconciling sustainability with efficiency in multimodal transport, ultimately lowering travel expenditures.
Most of the traditional path optimization methods use heuristic search algorithms such as [5], genetic algorithms [6], and DQN algorithms. However, traditional genetic algorithms have drawbacks in path planning problems, such as long planning times, slow convergence, unstable solutions [7], and the possibility of getting stuck in local optima [8]. DQN algorithms also have issues in path planning problems, such as relatively inefficient action selection strategies and reward functions [9], long learning times, and slow convergence [10]. As the problem scale increases, it becomes difficult to find the optimal solution for medium- to large-scale instances within a limited time, so only a local optimum is obtained. In recent years, reinforcement learning has gained attention because it addresses model-free Markov decision process (MDP) problems, in which the system dynamics are unknown yet optimal policies can be derived from data sequences collected or generated under a specific strategy.
A substantial body of research has applied reinforcement learning to path optimization problems; these approaches can be classified as model-based or model-free. One of the most representative model-free algorithms is the Q-learning algorithm based on Markov decision processes [11]. Zhou [12] improved path planning efficiency by using a novel Q-table initialization method and applying root mean square propagation in learning rate adjustment. Wang [13] utilized the energy iteration principle of the simulated annealing algorithm to adaptively modify the greedy factor throughout the training process, thus improving the path planning efficiency and accelerating the convergence rate of the conventional Q-learning algorithm. Zhang [14] proposed an optimization framework based on multi-objective weighted Q-learning, using a positively skewed distribution to represent time uncertainty, which can solve multimodal transport multi-objective route optimization problems faster and better. Zhong [15] kept the standard Q-learning algorithm from falling into local optima by dynamically adjusting exploration factors based on the simulated annealing (SA) principle. Q-learning algorithms have been widely applied to path planning problems, but challenges such as the tendency to get stuck in local optima and slow convergence remain, and these are active research topics.
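For reference, the tabular Q-learning update that the works above modify is commonly written as follows. The notation is the conventional one and is an assumption here, not taken from the cited papers: $\alpha$ is the learning rate, $\gamma$ the discount factor, $r_{t+1}$ the reward received after taking action $a_t$ in state $s_t$, and $\varepsilon$ the exploration (greedy) factor.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right],
\qquad
a_t =
\begin{cases}
  \text{a random action} & \text{with probability } \varepsilon, \\
  \arg\max_{a} Q(s_t, a) & \text{with probability } 1 - \varepsilon .
\end{cases}
```

In these terms, the improvements surveyed above act on the initial values of $Q$ (Q-table initialization), on $\alpha$ (learning rate adjustment), or on $\varepsilon$ (adaptive exploration).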
Accordingly, this research, informed by residents’ travel demands and integrating the subway schedule along with the spatial and temporal characteristics of ride-hailing services, presents an investigation into optimizing short-distance ride-hailing connections to rail transit through the application of guided reinforcement learning techniques. Utilizing GPS data from online ride-hailing services in Beijing, this approach incorporates travel information extraction, road network modeling, cluster analysis, and connection route optimization. The Q-learning algorithm is applied to train the agent, and the guided reinforcement empirical principle is integrated to enhance learning efficiency and expedite the algorithm’s convergence rate. The most efficient path across various connectivity options is determined to minimize passenger travel expenses and shorten journey durations, thereby enhancing the attractiveness of public transportation and encouraging sustainable travel practices.
2. Problem Description
This study focuses on travel decision-making for trips that combine ride-hailing services and subway systems. This combined mode integrates the wide coverage and high travel efficiency of ride-hailing services with the low cost of subway travel. The aim is to keep passenger travel efficient while increasing the attractiveness of subway systems, reducing overall travel costs and energy consumption, and providing users with a more convenient, intelligent, and environmentally friendly travel experience.
Subways are the most convenient urban transportation mode. However, for parallel subway lines, passengers frequently need to make several transfers to arrive at their destinations, leading to increased travel distances and diminished transfer convenience. To maximize the benefits of subway systems, the “last mile” problem between subway stations and the final destinations needs to be addressed. As shown in Figure 1, traveling by subway alone takes a long time, and traveling by taxi is expensive. However, combining taxi services with the subway can effectively reduce travel costs for suburban residents and improve travel efficiency. It can also efficiently address the “last mile” challenge and alleviate the inconvenience of transferring between parallel subway lines.
To effectively integrate ride-hailing services and subway travel, this study first conducts cluster analysis and road network modeling using real data to identify pick-up and drop-off hotspots. Next, buffer zones are established around subway stations to analyze the connection modes for travel origins and destinations. Finally, a Guided Q-Learning (GQL) algorithm is proposed to search for the optimal connection mode and shortest path for travelers. Simulations are conducted to validate the proposed approach for different travel modes, and the findings and conclusions are then presented. The overall research process is shown in Figure 2.
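The buffer-zone step can be illustrated with a minimal sketch. It assumes a circular buffer measured by great-circle distance; the 800 m radius, the station names `S1` and `S2`, their coordinates, and the four mode labels are all hypothetical and are not taken from this study.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_station(point, stations):
    """Return (station_name, distance_m) for the station closest to point."""
    candidates = ((name, haversine_m(*point, *coords)) for name, coords in stations.items())
    return min(candidates, key=lambda item: item[1])


def connection_mode(origin, destination, stations, buffer_m=800.0):
    """Classify an OD pair by whether each end lies inside a station buffer.

    The labels are illustrative assumptions, not the paper's taxonomy.
    """
    _, origin_dist = nearest_station(origin, stations)
    _, dest_dist = nearest_station(destination, stations)
    origin_in = origin_dist <= buffer_m
    dest_in = dest_dist <= buffer_m
    if origin_in and dest_in:
        return "subway_only"
    if origin_in:
        return "subway_then_ride_hailing"
    if dest_in:
        return "ride_hailing_then_subway"
    return "ride_hailing_both_ends"


# Hypothetical stations and OD pair (coordinates are made up for illustration).
STATIONS = {"S1": (39.85, 116.30), "S2": (39.90, 116.40)}
ORIGIN = (39.851, 116.30)       # roughly 110 m from S1, inside the buffer
DESTINATION = (39.95, 116.50)   # several kilometers from any station

if __name__ == "__main__":
    print(connection_mode(ORIGIN, DESTINATION, STATIONS))
```

In practice the buffer radius would be chosen from the observed walking or short-ride distances in the data, and a GIS library could replace the hand-written distance function.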
6. Conclusions
This paper analyzes real data through clustering analysis and road network modeling and identifies the hotspots for passenger pick-up and drop-off. Buffer zones are set around the subway stations to analyze the connection modes for ODs. This paper proposes an algorithm for short-distance travel decision-making in online ride-hailing integrated with rail transit, grounded in reinforcement learning. The GQL algorithm is utilized to construct a network topology model reflecting the real road network, determine the optimal route across different connectivity scenarios, and formulate both the Q-table initialization approach and the action selection mechanism of the GQL algorithm tailored to the specific attributes of path planning. By adding the guiding principle of reinforcing experience and including “nearest subway station” as the guiding factor in the path search algorithm, blind search is avoided, and different connection modes exhibit better path planning performance based on prior experience.
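The idea of guiding the search with the nearest subway station can be sketched on a toy network. This is only an illustration under stated assumptions, not the GQL algorithm itself: the graph, the distances, the initialization rule (each Q-value starts at the negative distance from the next node to the station), the reward (negative edge length), and all hyperparameters are hypothetical.

```python
import random

# Hypothetical toy road network: node -> {neighbor: edge length in km}.
GRAPH = {
    "A": {"B": 1.0, "C": 2.5},
    "B": {"A": 1.0, "C": 1.0, "D": 3.0},
    "C": {"A": 2.5, "B": 1.0, "D": 1.0},
    "D": {"B": 3.0, "C": 1.0},
}
# Hypothetical straight-line distance (km) from each node to the nearest
# subway station, which is assumed to sit at node D.
DIST_TO_STATION = {"A": 2.8, "B": 1.9, "C": 1.0, "D": 0.0}
GOAL = "D"


def init_q(graph, guide):
    """Guided initialization: moves ending closer to the station start with higher Q."""
    return {s: {a: -guide[a] for a in nbrs} for s, nbrs in graph.items()}


def train(graph, guide, goal, episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy action selection on the toy graph."""
    rng = random.Random(seed)
    q = init_q(graph, guide)
    starts = [n for n in graph if n != goal]
    for _ in range(episodes):
        state = rng.choice(starts)
        for _ in range(50):  # cap on steps per episode
            actions = list(graph[state])
            if rng.random() < eps:
                action = rng.choice(actions)                      # explore
            else:
                action = max(actions, key=lambda a: q[state][a])  # exploit
            reward = -graph[state][action]                        # cost of the edge
            future = 0.0 if action == goal else max(q[action].values())
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = action
            if state == goal:
                break
    return q


def greedy_path(q, start, goal, max_steps=20):
    """Follow the highest-valued action from start until the goal or the step cap."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        path.append(max(q[path[-1]], key=q[path[-1]].get))
    return path


if __name__ == "__main__":
    q_table = train(GRAPH, DIST_TO_STATION, GOAL)
    print(greedy_path(q_table, "A", GOAL))
```

The guided initial values only bias the early search toward the station; the learned values are still driven by the edge costs, so the agent can settle on a route other than the one the initial guidance favored.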
Extensive experiments carried out in SUMO simulation environments with varying scales, scenarios, and features have robustly demonstrated the viability and efficiency of the proposed approach. The convergence speed is enhanced by 25% when compared to the conventional QL algorithm. The optimal path length is decreased by 8%, and the minimal travel cost is lowered by 11%. The algorithm exhibits robust adaptability to intricate and uncertain environments, efficiently decreasing both the number of iterations and computational time. By incorporating ride-hailing services with rail transit for transportation, this method helps lower passengers’ overall travel expenses. It effectively addresses the “last mile” problem in transportation and provides users with a more convenient, intelligent, and environmentally friendly travel experience.
A limitation of this paper is that the dataset only includes GPS data from ride-hailing vehicles in Fengtai District, Beijing, China, over a continuous week. Future research could incorporate data from additional regions and longer time periods to capture travel patterns that vary by location or season.