A Cluster-Then-Route Framework for Bike Rebalancing in Free-Floating Bike-Sharing Systems

Sun, Jiaqing; He, Yulin; Zhang, Jiantong

doi:10.3390/su152215994


by Jiaqing Sun, Yulin He and Jiantong Zhang *

School of Economics and Management, Tongji University, Shanghai 200092, China

* Author to whom correspondence should be addressed.

Sustainability 2023, 15(22), 15994; https://doi.org/10.3390/su152215994

Submission received: 25 September 2023 / Revised: 1 November 2023 / Accepted: 7 November 2023 / Published: 16 November 2023

(This article belongs to the Special Issue The Improvement of Bike-Sharing System Help Urban Development and Sustainability)


Abstract:

Bike-sharing systems suffer from the problem of imbalances in bicycle inventory between areas. In this paper, we investigate the rebalancing problem as it applies to free-floating bike-sharing systems in which the bicycles can be rented and returned almost anywhere. To solve the rebalancing problem efficiently, we propose a framework that includes (1) rebalancing nodes at which requirements for the redistribution (pickup or delivery) of bicycles are determined, (2) “self-balanced” clusters of rebalancing nodes, and (3) bicycle redistribution by service vehicles within each cluster. We propose a multi-period synchronous rebalancing method in which a rebalancing period is divided into several sub-periods. Based on the anticipated redistribution demand at each node in each sub-period, the service vehicle relocates bicycles between nodes. This method improves the efficiency of the system and minimizes rebalancing costs over the entire rebalancing period, rather than for a single sub-period. The proposed framework is tested based on data from the Mobike (Meituan) free-floating bike-sharing system. The test results demonstrate the effectiveness of the proposed methodologies and show that multi-period synchronous rebalancing is superior to single-period rebalancing.

Keywords:

free-floating bike-sharing systems; multi-period synchronous rebalancing; max–min ant colony system; density- and grid-based clustering algorithm

1. Introduction

Bike-sharing systems (BSSs) providing customers with bicycles for shared use are becoming increasingly popular. As a kind of sustainable travel mode [1,2,3], the BSSs enrich the urban public transport system and can effectively provide an alternative for the first- and last-mile problems. Basak and Iris [4] found that shared bikes decrease the demand for urban buses, and there is some complementarity between shared bikes and urban bus services.

The traditional type of BSS is station-based (SBBSS): users have to rent bikes from a specific docking station and return them to the available lockers at their target station. In recent years, free-floating bike-sharing systems (FFBSSs) have emerged. In some FFBSSs (for example, the Mobike (Meituan)/Hellobike system in China), bikes can be rented and returned almost anywhere in the system’s service area, eliminating the concept of docks and stations and even doing away with bike racks [5]. Smartphone applications allow users to locate, rent and return bicycles. For operators, the advantage of an FFBSS is that it does not require expensive docking stations, so system setup costs are lower. For customers, a free-floating system is more convenient because there is no need to go to a specific station to rent a bike or find a vacant dock to return it.

While there are many advantages to the FFBSSs, they still face some problems. A basic problem with FFBSSs is the constant emergence of imbalances in the geographical distribution of bicycles, resulting from fluctuations in demand and from one-way travel. In FFBSSs, there may be a shortage of bicycles in some areas such that users cannot find a bike when they need one, and an excess of bikes in other areas, a situation that not only wastes public transport resources but also impedes road traffic, since surplus bikes are parked by the side of the road. In order to increase efficiency and user satisfaction, BSS operators have to rebalance the bicycle inventory by picking up bikes from locations that are oversupplied and repositioning them at undersupplied locations [6,7,8].

Currently, related studies can be divided into two main categories based on the operation type: static rebalancing and dynamic rebalancing. The former with regard to BSSs is similar to the single-commodity many-to-many pickup and delivery problem [9]. It does not consider the changes in the number and demand of bicycles in the rebalancing process, and treats the system state as static during the rebalancing process [5,7,10,11,12,13,14,15,16,17,18]. The latter focuses mainly on the dynamic version of the bike-sharing rebalancing problem (DBRP): how to find the optimal routes and inventory levels to keep the system balanced while it is in operation with bike inventories constantly changing [19,20,21,22].

Static rebalancing cannot effectively address changes in the number of bikes during the rebalancing process of FFBSSs. To cope with the changes in the position of shared bikes throughout the day, the DBRP is the more suitable formulation.

The method generally used to solve the DBRP is single-period asynchronous rebalancing [19,21,22,23,24]. This entails dividing the planning horizon into several rebalancing periods and finding the optimal service vehicle routes (lowest cost and highest return) in each period, based on the redistribution demand at each node in each period. However, the optimal time to pick up or deliver bikes at a node, as well as the effects of each rebalancing operation on system efficiency in future periods, are ignored. Nevertheless, pickup and delivery operations carried out during a given period have an impact on subsequent periods, such that their net effect may even reduce the efficiency of the system over the entire planning horizon. In other words, an improvement in system efficiency in a single time period does not necessarily benefit system performance in the long run.

To obtain the optimal rebalancing effect in the long term, this paper proposes a multi-period synchronous rebalancing method for shared bicycles. This method combines several sub-periods into a longer rebalancing period and considers the demand for pickup or delivery at each node in every sub-period synchronously during the bicycle redistribution process. The objective of our rebalancing method is to determine the optimal time to pick up or deliver bikes at each node so as to maximize system efficiency over the whole rebalancing period, thereby reducing the frequency and duration of bike shortages and keeping redistribution costs as low as possible. Ant colony optimization [25,26,27] is an efficient swarm intelligence algorithm that has been proven to be effective in vehicle routing problems and other optimization problems. Therefore, we adopt a max–min ant colony system to solve the rebalancing problem.

The clustering of rebalancing nodes and the prediction model of bike demand are the preparatory steps required before bicycle rebalancing so as to reduce modeling difficulty and computational complexity. Clustering [18,28,29,30,31,32] has practical significance, especially for very large-scale FFBSSs in which the shared bikes make frequent trips scattered over an extensive operating area. The main idea here is to create clusters of rebalancing nodes based on the location and inventory level of each node such that the pickup/delivery demands of nodes in the same cluster can be satisfied simply by redistributing bicycles within the cluster. In other words, each cluster of rebalancing nodes is “self-balanced”. Another major challenge to the DBRP in an FFBSS is fluctuation in user demand for bikes. The mainstream demand forecasting method [33,34,35,36,37,38,39,40] is the machine learning method. Here, we use the backpropagation (BP) neural network algorithm [41] as the prediction method for user demand.

Our methods have been tested on data captured from the Mobike system. A comparison of the system performance after rebalancing with that in a single-period rebalancing scenario demonstrates that our method decreases both the quantity and the duration of bicycle shortages while improving system efficiency. The effects of various managerial decisions were also tested, including varying the length of rebalancing periods, employing a different period strategy, and changing the capacity of service vehicles and other key parameters. In addition, we tested the clustering method for identifying rebalancing nodes and the proposed criteria for evaluating the “self-balanced” rebalancing node clustering scheme. The results of our experiments support the effectiveness of the proposed methods.

In this paper, we propose a cluster-then-route framework for bike rebalancing in FFBSSs. A sequence diagram of the framework is shown in Figure 1. The main contributions of this paper are as follows.

(1): We propose a density- and grid-based clustering algorithm with multi-level density thresholds (DGBCAMDT) to locate rebalancing nodes.
(2): A criterion to measure the self-balancing degree of node clustering is proposed, which is taken as the objective function, and a self-balancing node clustering scheme is obtained by using a heuristic algorithm.
(3): A multi-period synchronous bike-sharing rebalancing method is considered. A BP neural network and environmental simulation are used to predict the user demand of each node in sub-periods, and a rolling-period strategy is used to calculate the dynamic bike-sharing rebalancing problem.

The remainder of this paper is organized as follows. Section 2 reviews the current literature on bike rebalancing in BSSs. Section 3 describes the data used in the modeling. Section 4 describes the methods for determining rebalancing nodes and for calculating redistribution demand. Section 5 describes the method of user demand prediction. Section 6 describes how to create “self-balanced” clusters of rebalancing nodes. Section 7 formally states the multi-period synchronous rebalancing method and presents an improved max–min ant system for solving the rebalancing problem. Section 8 presents the results and analysis of the numerical experiment. Section 9 offers conclusions and discusses possible extensions and directions for further research.

2. Literature Review

The bike-sharing rebalancing problem (BRP) can be modeled using either a static or a dynamic approach. Models of the static approach (SBRP) generally assume that redistribution operations take place during the night, when user intervention is negligible. The redistribution demand at each node is calculated based on an inventory snapshot. On the other hand, models of the dynamic bike rebalancing problem (DBRP) assume that redistribution operations are performed during the day, and therefore take user demand into account. Redistribution operations are based on fluctuating real-time bicycle inventories. Most studies focus on traditional station-based BSSs.

DBRP studies draw on concepts developed in studies of the SBRP. Benchimol et al. [42] were the first to study the SBRP. They proved that it is an NP-hard problem and proposed approximation algorithms to solve the problem. Chemla et al. [6] built a mixed-integer programming (MIP) model and used a branch-and-cut algorithm for station-based BSSs. The upper bound of the problem was obtained by a tabu search and the lower bound of the problem was obtained by solving a relaxation problem. Finally, they used a randomly generated example to verify the effectiveness of the algorithm. Ho and Szeto [43] analyzed the SBRP by selecting a subset of stations to visit, sequence, and determine the pickup/drop-off quantities so as to minimize the total penalties incurred at all the stations, subject to constraints on the inventory level, pickup and drop-off quantities, vehicle capacity, total operating time and service vehicle route. Li et al. [44] proposed two new strategies to solve SBRP. They used a hybrid genetic search to quickly determine the access order of paths and a greedy search algorithm to determine the number of bikes to be picked up and sent to each station.

Liu et al. [45] proposed a station clustering method based on loan–return data between stations. They used SimRank to calculate the similarity between stations based on loan-to-return connections between them. Then, they employed the OPTICS (ordering points to identify the clustering structure) algorithm to cluster stations and delineated local closed areas within which most public bicycles flow.

Schuijbroek et al. [30] also presented a cluster-first, route-second approach. The inventory at each station was represented as a finite-buffer single-server non-stationary queueing system, and the authors used Kolmogorov forward equations to calculate the boundaries of the inventory level of each station. They proposed a mixed-integer programming-based clustering problem with the objective of finding a feasible solution while minimizing the makespan—in other words, rebalancing the system as soon as possible—and with the constraints that stations with shortages must be visited by a vehicle and that a cluster must contain a sufficient number of bicycles and maximum inventory. Stations were clustered based on the known upper bound of routing costs for any combination of stations, which was estimated by a new maximum spanning star approximation. The routing problem for a particular vehicle was solved independently in each cluster.

Many subsequent studies, such as Bruck et al. [46], Lu et al. [16], Ren et al. [17], Li et al. [47], Liu et al. [5], Chen [48], Mahmoodian et al. [18], Zhang et al. [11], etc., have built on these references by considering different objective functions, using mixed-integer programming as a modeling approach, and using different heuristics or exact algorithms to solve the problem.

Since bicycle inventories fluctuate widely and frequently while the system is in operation, especially in FFBSSs, dynamic rebalancing is more in line with real operational requirements.

Regue and Recker [20] proposed a proactive vehicle routing approach by anticipating inventory and redistribution needs and planning vehicle routes in a proactive manner based on current and expected events. They used gradient boosting machines to forecast demand and used Kolmogorov forward equations to calculate maximum and minimum inventory levels, and they proposed a stochastic linear integer program to calculate the pickup/delivery demands of each node. Then, they forecast the rebalancing demands of the following three sub-periods, taking into account the effect of the pickup and delivery operations in the current period on the next two periods, thereby preventing the recurrence of system inefficiencies.

Shui and Szeto [21] adopted a rolling horizon approach to break down the problem into a number of stages; a series of static rebalancing sub-problems are solved in each stage. They divided the planning horizon into several stages, each of which consists of a rolling period and a period of overlap with the next stage. An enhanced artificial bee colony (EABC) algorithm and a route truncation heuristic were jointly used to optimize the route design at each stage, and another heuristic was used to solve the loading and unloading sub-problem along the route at a given stage.

Chiariotti et al. [49] proposed a dynamic rebalancing strategy that utilizes historical data to predict network conditions. They used a birth–death process model to calculate stations’ occupancy and decide when to reallocate bicycles. They tested the effectiveness of their approach using data from New York City’s bike-sharing system. Numerical simulation results showed that the dynamic strategy is superior to the static rebalancing strategy. Zhai et al. [50] combined a Markov stochastic process with linear programming to solve the rebalancing problem in FFBSSs and validated the method using the Mobike dataset. Chen et al. [19] formulated a dynamic rebalancing problem to address the challenge of managing imbalanced bike distributions in FFBSSs. They used a hybrid rolling horizon strategy to rebalance the number of bicycles in each period. We summarize all the literature mentioned above in Table 1.

The prevailing approach to the DBRP reviewed above combines a rolling horizon strategy with single-period asynchronous rebalancing. This requires dividing the operating time into multiple time periods and finding the best solution for each period separately. However, an improvement in system efficiency within each individual period does not necessarily benefit the overall performance of the system.

In this paper, we propose a multi-period synchronous rebalancing method. The method combines multiple sub-periods into one long period and considers the pickup or delivery needs of each node in each sub-period synchronously during the bike rebalancing process. Considering the large scale of FFBSSs, we use a cluster-first, route-second approach, and each cluster is assigned one vehicle. We propose a comprehensive objective for the clustering problem, taking the location and time-variant redistribution demand of each cluster into consideration. The nodes in a cluster should be geographically close to one another, and on the horizon of the rebalancing period, it should be possible to satisfy the redistribution demand of every node in a cluster by performing pickups and deliveries within the area. Our method can obtain a node clustering scheme suitable for a long period of time, rather than just suitable for a specific point in time.

3. Data Description

The following models and methods are based on historical bike distribution data from the Mobike (Meituan) free-floating bike-sharing system. Through the Mobike mobile application, the locations of parked bikes can be ascertained. When we set the location to a certain point, the application displays all parked bikes in the surrounding area (see Figure 2a). To find the URL where this information was stored, we used the packet capture tool Fiddler.

Our research area was an urban area of about 17 square kilometers. We coded a web spider in Python 3.7 to download data about bikes in the research area. We identified specific locations within the area, posted them with the requested URL, and downloaded the received JSON data, which contained information about all bikes near those locations. The combined data from all locations cover all bicycles in the entire research area. The web spider started every 10 min from 00:00 on 1 May 2019, recording the ID and location of bikes in the research area and the data collection time. There were about 8000 bikes on average in the research area. Figure 2b displays how the bikes were scattered across the research area at 10:00 am on 9 May 2019. In the following discussions, we use $n$ to denote the number of points in the dataset, and $(\mathit{Lng}, \mathit{Lat})$ denotes the location of a shared bike.

4. Rebalancing Nodes and Redistribution Demands

4.1. Identification of Rebalancing Nodes

In traditional SBBSSs, during the rebalancing process, bicycles are picked up and dropped off by service vehicles at stations, which can be regarded as rebalancing nodes. However, because there are no fixed stations or docks in FFBSSs, it is necessary to identify specific places to pick up and drop off bikes. Although the bikes are scattered throughout the service area, users will spontaneously leave bikes in certain locations, such as residential blocks and subway stations. This behavior results in the accumulation of bikes—and therefore the concentration of demand for pickup and delivery—in these areas. These hotspots where bikes are parked can be considered virtual stations or rebalancing nodes.

Bicycles parked in the hotspots outnumber those in surrounding areas, so we apply density-based clustering methods to identify them. The basic principle of density-based clustering is to find areas with higher density and interconnect them to form clusters [51]. The data points in the low-density areas are regarded as noise.

Considering the huge number of free-floating bikes, the data volume should be compressed to improve the clustering speed. Employing the method of grid division, we first divide the research area evenly into $m$ grids: $g_{1}, g_{2}, \dots, g_{m}$. Each grid can be regarded as a point during clustering. Following Qiu, Li and Shen [52], we set $m = \sqrt{n}$ ($n$ being the number of data points) so as to balance clustering speed and quality. If the density threshold is denoted by $θ$, grids with a number of bikes $\mathit{den}(g_{i}) > θ$ $(i = 1, \dots, m)$ are regarded as high-density grids. Given a set of grids $g_{1'}, g_{2'}, \dots, g_{m'}$, if $g_{i'}$ is adjacent to $g_{(i + 1)'}$ for all consecutive indices $i', (i + 1)' \in \{1', 2', \dots, m'\}$, then we define $g_{m'}$ as density-reachable from $g_{1'}$. According to the principle of density-based clustering, grids that are mutually density-reachable form a cluster.
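The grid compression and the density-reachable relation can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the square `side x side` grid layout, and the use of the 8-neighborhood as the meaning of "adjacent" are all assumptions made for illustration.

```python
import math
from collections import Counter

def grid_counts(points, bounds, m):
    """Count bikes per grid cell.

    points: iterable of (lng, lat); bounds: (lng_min, lat_min, lng_max, lat_max).
    m is the target number of grids. Laying them out as a square of
    side = round(sqrt(m)) cells per axis is an assumption of this sketch.
    """
    lng_min, lat_min, lng_max, lat_max = bounds
    side = max(1, round(math.sqrt(m)))
    counts = Counter()
    for lng, lat in points:
        # Points on the upper boundary are clamped into the last cell.
        col = min(int((lng - lng_min) / (lng_max - lng_min) * side), side - 1)
        row = min(int((lat - lat_min) / (lat_max - lat_min) * side), side - 1)
        counts[(row, col)] += 1
    return counts

def high_density_grids(counts, theta):
    """Grids whose bike count den(g) exceeds the threshold theta."""
    return {cell for cell, c in counts.items() if c > theta}

def density_reachable(start, goal, dense):
    """True if a chain of adjacent high-density grids links start to goal.

    Adjacency is taken as the 8-neighborhood (an assumption).
    """
    if start not in dense or goal not in dense:
        return False
    seen, stack = {start}, [start]
    while stack:
        r, c = stack.pop()
        if (r, c) == goal:
            return True
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nxt = (r + dr, c + dc)
                if nxt in dense and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return False
```

With $n$ bike locations, `m` would be set to roughly `math.sqrt(n)` as described above; the resulting `counts` dictionary then replaces the raw points in all later clustering steps.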

At the same time, the distribution of bikes varies in space according to the nature of each urban area. Generally, for example, more bikes are parked in commercial areas than in residential areas. If a single, high-density threshold is set, the hotspots in residential areas will be taken as noise and overlooked; on the other hand, a lower threshold will result in the aggregation of the high-density grids in a commercial area into one cluster. Other density-based clustering algorithms, such as DBSCAN, also have this disadvantage.

For these reasons, we propose a density- and grid-based clustering algorithm with multi-level density thresholds (DGBCAMDT), which allows for the identification of hotspot areas with different densities (see Algorithm 1).

We establish a multi-level density threshold $θ = (θ_{1}, θ_{2}, \dots, θ_{k})$ (with $θ_{1} < θ_{2} < \dots < θ_{k}$ and $θ_{k + 1} = +\infty$) and denote by $G_{i}$ the set of grids that belong to the same density level: $G_{i} = \{g_{j} \mid θ_{i} \leq \mathit{den}(g_{j}) < θ_{i + 1}\}$. When $\mathit{den}(g_{j})$ is less than $θ_{1}$, the data in $g_{j}$ can be regarded as noise; otherwise, $g_{j}$ is regarded as a density grid. In the following step, we cluster those density grids.

The overall clustering procedure is shown in Algorithm 1. First, we cluster the grids with the highest density level ($g_{j} \in G_{k}$) based on the density-reachable rule, namely, if two grids are mutually density-reachable, they must belong to the same cluster. Then, we cluster the grids at the second-highest density level ($g_{j} \in G_{k - 1}$). If $g_{j}$ is density-reachable from existing clusters, $g_{j}$ joins the nearest cluster. As for those grids that are not density-reachable from any cluster, we can still cluster them using the same method as in the first step. Similarly, we can cluster all the density grids from the highest density level to the lowest.

Finally, supposing $N$ different cluster areas are obtained, we take the density centers of these clusters as rebalancing nodes and the cluster areas as node areas. The set of rebalancing nodes is denoted by $S = \{s_{1}, s_{2}, \dots, s_{N}\}$.

Algorithm 1: DGBCAMDT
	Input: The distribution of bikes; multi-level density thresholds: $θ_{1} < θ_{2} < \dots < θ_{k}$ ;
	Output: All clusters $s_{1}, s_{2}, \dots, s_{N}$ ;
	Initialize:
		Divide the research area into $m$ grids: $g_{1}, g_{2}, \dots, g_{m}$ ;
		Obtain multi-level density grid sets: $G_{1}, G_{2}, \dots, G_{k}$ ;
	for $i \in [k, k - 1, \dots, 1]$ do
		foreach $g \in G_{i} \cup G_{i + 1} \cup \dots \cup G_{k}$ do
			if clusters exist then
				Join $g$ to the adjacent nearest cluster;
			end
		end
		foreach $g \in G_{k}$ do
			if $g$ does not belong to any existing cluster then
				Build a new cluster $s$, and join $g$ to $s$ ;
				Join the adjacent grids of $g$ that also do not belong to any existing cluster to $s$ ;
			end
		end
	end
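The level-by-level procedure described in the text can be sketched in Python as follows. This is an illustrative reading of the prose description, not the authors' code: the function names are hypothetical, adjacency is taken as the 8-neighborhood, and "nearest cluster" is simplified to the adjacent cluster with the smallest identifier (the one formed earliest). All three choices are assumptions.

```python
def level_of(count, thetas):
    """Density level i (1-based) with thetas[i-1] <= count < thetas[i]; 0 means noise."""
    level = 0
    for i, th in enumerate(thetas, start=1):
        if count >= th:
            level = i
    return level

def adjacent(cell):
    """8-neighborhood of a grid cell (adjacency definition is an assumption)."""
    r, c = cell
    return [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]

def multi_level_clusters(counts, thetas):
    """Cluster density grids from the highest density level down to the lowest.

    counts: {(row, col): number of bikes}; thetas: increasing thresholds.
    Returns {cell: cluster_id}; noise grids (count < thetas[0]) are left out.
    """
    label = {}
    next_id = 0
    for lvl in range(len(thetas), 0, -1):
        pending = {cell for cell, n in counts.items()
                   if level_of(n, thetas) == lvl}
        # Step 1: grids reachable from an existing cluster join it.
        changed = True
        while changed:
            changed = False
            for cell in sorted(pending):
                owners = sorted({label[a] for a in adjacent(cell) if a in label})
                if owners:
                    label[cell] = owners[0]  # tie-break is an assumption
                    pending.discard(cell)
                    changed = True
        # Step 2: remaining grids form new clusters of mutually reachable grids.
        while pending:
            seed = min(pending)
            pending.discard(seed)
            label[seed] = next_id
            stack = [seed]
            while stack:
                cur = stack.pop()
                for a in adjacent(cur):
                    if a in pending:
                        pending.discard(a)
                        label[a] = next_id
                        stack.append(a)
            next_id += 1
    return label
```

In this sketch, two dense peaks separated by a lower-density grid stay in separate clusters because they are labeled at the highest level before the grid between them is processed, which is the behavior the multi-level thresholds are meant to provide.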

4.2. Rebalancing Demand Calculation

In station-based BSSs, the inventory of each station can be controlled within a certain interval [22]. For FFBSSs, although there is no strict limit on the number of bicycles that can be parked at each rebalancing node, numbers also need to be controlled so as to ensure that there are neither too many nor too few bikes.

Suppose the rebalancing period contains $p$ sub-periods, $T_{1}, T_{2}, \dots, T_{p}$, and $t_{1}, t_{2}, \dots, t_{p}$ represent the start times of the sub-periods. Denote by $\mathit{num}_{i}^{B}(T)$ the benchmark number (ideal number) of bikes at rebalancing node $s_{i}$ at $t_{j}$ and by $\mathit{num}_{i}(t_{j})$ the real number of bikes. We define the ratio of the real number to the benchmark number at node $s_{i}$ at $t_{j}$ as the preservation rate $H_{i}(t_{j})$:

$$H_{i}(t_{j}) = \frac{\mathit{num}_{i}(t_{j})}{\mathit{num}_{i}^{B}(T)} \tag{1}$$

Let $(δ_{i}^{\min}(t_{j}), δ_{i}^{\max}(t_{j}))$ be the interval of the preservation rate for rebalancing node $s_{i}$ at $t_{j}$. When $H_{i}(t_{j}) > δ_{i}^{\max}(t_{j})$, bikes can be picked up from this node; when $H_{i}(t_{j}) < δ_{i}^{\min}(t_{j})$, bikes can be delivered to the node. The setting of the interval determines the rebalancing operation, so the interval of each rebalancing node needs to be decided according to users’ rent/return demands both at the node itself and at related nodes.

Let $Z_{i}^{\mathit{out}}(T_{j})$ and $Z_{i}^{\mathit{in}}(T_{j})$ be the numbers of bikes that are ridden by users out of and into node $s_{i}$ during sub-period $T_{j}$, respectively, and let $L_{i}(T_{j}) = Z_{i}^{\mathit{in}}(T_{j}) - Z_{i}^{\mathit{out}}(T_{j})$ be the net flow of bikes. When $L_{i}(T_{j}) < 0$, more bikes are ridden out from $s_{i}$ during $T_{j}$ and the inventory decreases, so to prevent a shortage, $s_{i}$ is likely to receive a delivery before $T_{j}$. When $L_{i}(T_{j}) > 0$, $s_{i}$ is more likely to be relieved of surplus bikes. The turnover rate of node $s_{i}$ in $T_{j}$ is calculated as follows:

$$r_{i}(T_{j}) = \frac{Z_{i}^{\mathit{out}}(T_{j}) + Z_{i}^{\mathit{in}}(T_{j})}{\mathit{num}_{i}^{B}} \tag{2}$$

A relatively high turnover in the following period means that the node is relatively more important. It is reasonable to pick up bikes from a node when it is idle and has a surplus of bikes, and to deliver bikes to busy nodes that lack them.

Accordingly, we propose the following formulas for $δ_{i}^{\min}(t_{j})$ and $δ_{i}^{\max}(t_{j})$:

$$δ_{i}^{\max}(t_{j}) = \left(H_{i}^{B}(t_{j}) + ω\right) + \left(\frac{r_{i}(T_{j}) - r^{\min}(T_{j})}{r^{\max}(T_{j}) - r^{\min}(T_{j})} - ε\right) \times σ \tag{3}$$

$$δ_{i}^{\min}(t_{j}) = \left(H_{i}^{B}(t_{j}) - ω\right) + \left(\frac{r_{i}(T_{j}) - r^{\min}(T_{j})}{r^{\max}(T_{j}) - r^{\min}(T_{j})} - ε\right) \times σ \tag{4}$$

where

$$H_{i}^{B}(t_{j}) = 1 - \frac{L_{i}(T_{j})}{\mathit{num}_{i}^{B}} \times μ \tag{5}$$

is the benchmark preservation rate. As Equations (3) and (4) show, the parameter $ω$ sets the range between the upper and lower boundaries ($δ_{i}^{\max}(t_{j}) - δ_{i}^{\min}(t_{j}) = 2ω$), while the parameter $μ$ adjusts how strongly the net flow shifts the benchmark preservation rate. $r^{\min}(T_{j})$ and $r^{\max}(T_{j})$ are the minimum and maximum turnover rates of all nodes in $T_{j}$. The parameter $ε$ is the dividing value between idle nodes and busy nodes. When the standardized turnover of a node is greater than $ε$, meaning the node is busy, the upper and lower boundaries are both higher; when the standardized turnover is smaller than $ε$, meaning the node is idle, both boundaries decline. In other words, priority is given to delivering bikes to relatively busy nodes that lack bikes and to picking up bikes from nodes that are idle and have too many bikes. $σ$ is the positive influence coefficient of the turnover rate.
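To make Equations (2) to (5) concrete, the following Python sketch computes the preservation-rate interval for a single node and sub-period. The function and argument names are hypothetical, and the handling of the degenerate case in which all nodes have the same turnover (no shift is applied) is an assumption that the text does not address.

```python
def turnover_rate(z_out, z_in, num_b):
    """Equation (2): bikes ridden out plus bikes ridden in, per benchmark bike."""
    return (z_out + z_in) / num_b

def benchmark_preservation_rate(z_out, z_in, num_b, mu):
    """Equation (5): H^B = 1 - (L / num^B) * mu, with net flow L = Z_in - Z_out."""
    net_flow = z_in - z_out
    return 1.0 - net_flow / num_b * mu

def preservation_interval(z_out, z_in, num_b, r_min, r_max,
                          omega, eps, sigma, mu):
    """Equations (3) and (4): return (delta_min, delta_max) for one node."""
    r = turnover_rate(z_out, z_in, num_b)
    if r_max == r_min:
        shift = 0.0  # assumption: no turnover adjustment if all nodes are equal
    else:
        standardized = (r - r_min) / (r_max - r_min)
        shift = (standardized - eps) * sigma
    h_b = benchmark_preservation_rate(z_out, z_in, num_b, mu)
    return h_b - omega + shift, h_b + omega + shift
```

For example, with the illustrative values `z_out=30`, `z_in=10`, `num_b=40` and `mu=0.5`, the expected net outflow of 20 bikes raises the benchmark preservation rate to 1.25, so the node is asked to hold more bikes than its benchmark number before the sub-period starts.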

In accordance with the boundaries of the preservation rate, the redistribution demand of each rebalancing node in each sub-period is determined. The redistribution demand may be either for pickup or for delivery. The pickup/delivery quantity is calculated as follows:

$$Q_{i}(t_{j}) = \begin{cases} \min\left(C, \left(H_{i}^{B}(t_{j}) - H_{i}(t_{j})\right) \cdot \mathit{num}_{i}^{B}\right), & H_{i}(t_{j}) < δ_{i}^{\min}(t_{j}) \\ -\min\left(C, \left(H_{i}(t_{j}) - H_{i}^{B}(t_{j})\right) \cdot \mathit{num}_{i}^{B}\right), & H_{i}(t_{j}) > δ_{i}^{\max}(t_{j}) \end{cases} \tag{6}$$

where $C$ is the capacity of the service vehicles. When $Q_{i}(t_{j}) > 0$, the node $s_{i}$ needs a delivery of bikes at $t_{j}$; $Q_{i}(t_{j}) < 0$ means that node $s_{i}$ needs a pickup at $t_{j}$. Every rebalancing node has a sequence of such demands during a rebalancing period: $Q_{i}(t_{1}), Q_{i}(t_{2}), \dots, Q_{i}(t_{p}), s_{i} \in S$.
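Equation (6) translates directly into a small function. The sketch below uses hypothetical names and returns 0 when the preservation rate lies inside the interval; that case is not written out in Equation (6) and is treated here as "no redistribution demand", which is an assumption.

```python
def redistribution_demand(h, h_b, delta_min, delta_max, num_b, capacity):
    """Pickup/delivery quantity Q for one node at one sub-period start.

    h: current preservation rate H; h_b: benchmark preservation rate H^B.
    Positive result: bikes to deliver. Negative result: bikes to pick up.
    """
    if h < delta_min:
        # Shortage: deliver up to the benchmark, limited by vehicle capacity.
        return min(capacity, (h_b - h) * num_b)
    if h > delta_max:
        # Surplus: pick up down to the benchmark, limited by vehicle capacity.
        return -min(capacity, (h - h_b) * num_b)
    return 0  # assumption: inside the interval, no demand is generated
```

The result is left as a real number; rounding it to a whole number of bikes would be a further design choice that the text does not specify.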

For FFBSS operators, the distribution of bikes is available at any time, so we suppose that the number of bikes at each node at the beginning of the rebalancing period, $\mathit{num}_{i}(t_{1}), s_{i} \in S$, is already known. The rent and return demand of users during each sub-period (i.e., $Z_{i}^{\mathit{out}}(T_{j})$ and $Z_{i}^{\mathit{in}}(T_{j})$, $s_{i} \in S$, $j \in 2, \dots, p$) needs to be forecast, and the number of bikes in each sub-period can be calculated accordingly as $\mathit{num}_{i}(t_{j + 1}) = \mathit{num}_{i}(t_{j}) + L_{i}(T_{j})$, $s_{i} \in S$, $j \in 2, \dots, p - 1$. The preservation rate can also be recalculated. The specific forecasting methods will be presented in Section 5.
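The inventory recursion and the recalculated preservation rates can be sketched as follows. The function names are hypothetical, and the sketch does not clamp the projected inventory at zero, so a forecast outflow larger than the stock would produce a negative value that a real implementation would need to handle.

```python
def project_inventory(num_t1, z_in, z_out):
    """Anticipated bike counts at the start of each sub-period.

    num_t1: known count at t_1; z_in / z_out: forecast flows per sub-period.
    Applies num(t_{j+1}) = num(t_j) + L(T_j) with L = Z_in - Z_out.
    """
    nums = [num_t1]
    for flow_in, flow_out in zip(z_in, z_out):
        nums.append(nums[-1] + flow_in - flow_out)
    return nums

def preservation_rates(nums, num_b):
    """Equation (1) applied to every projected count."""
    return [n / num_b for n in nums]
```

For instance, `project_inventory(40, [5, 20, 0, 3], [10, 5, 25, 10])` gives the counts at five start times, and comparing the resulting rates with the interval at each start time shows when a pickup or delivery would be anticipated.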

Figure 3 gives an example of the preservation rate of a node during a rebalancing period, which is divided into five sub-periods. The number of bikes at a node at the start of the period ($t_{1}$) is already known. The anticipated $H$ is above the upper boundary at $t_{3}$, so this node needs a pickup at $t_{3}$, while the anticipated $H$ is below the lower boundary at $t_{5}$ and the node needs a delivery at $t_{5}$.

5. Environmental Simulation and User Demand Prediction

Accurate and efficient user demand prediction of shared bikes is of great value for operation management. However, the historical data reflect the actual number of shared bikes in use, rather than the user demand. Therefore, we introduce a user loss probability and estimate the user demand through environmental simulation.

5.1. Bike-Sharing Environmental Simulation

The number of shared bikes in use is mainly affected by two kinds of user behavior: borrowing bikes and returning bikes.

In this paper, we use the Poisson process [53,54,55] to simulate user behavior. Suppose that the arrival rate of users borrowing bikes at node $s_{i}$ at time $t$ is $λ_{i}^{u}(t)$, and the arrival rate of users returning bikes at node $s_{i}$ at time $t$ is $λ_{i}^{p}(t)$, $i \in \{1, 2, \dots, N\}$; i.e., the time interval between bikes being ridden out of node $s_{i}$ follows the distribution $\mathrm{Exp}(λ_{i}^{u}(t))$, and the time interval between bikes being ridden in follows the distribution $\mathrm{Exp}(λ_{i}^{p}(t))$. The number of shared bikes at each node is a non-homogeneous continuous-time Markov chain (CTMC).
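A minimal simulation of one node under this model might look as follows. It is a sketch under simplifying assumptions rather than the authors' simulator: the rates are held constant over the simulated horizon, a borrow request that finds the node empty is simply counted as lost, and the function name and return format are hypothetical. It relies on the standard fact that merging two independent Poisson processes gives a Poisson process whose rate is the sum of the two.

```python
import random

def simulate_node(num_start, lam_u, lam_p, horizon, rng):
    """Simulate the bike count at one node over [0, horizon).

    lam_u: borrow arrival rate; lam_p: return arrival rate (events per unit time).
    Returns (final count, served borrows, lost borrows, returns).
    """
    total = lam_u + lam_p
    num, served, lost, returned = num_start, 0, 0, 0
    if total <= 0:
        return num, served, lost, returned
    t = 0.0
    while True:
        # Time to the next event of the merged process is Exp(lam_u + lam_p).
        t += rng.expovariate(total)
        if t >= horizon:
            break
        # The event is a borrow with probability lam_u / (lam_u + lam_p).
        if rng.random() < lam_u / total:
            if num > 0:
                num -= 1
                served += 1
            else:
                lost += 1  # simplification: an empty node loses the request
        else:
            num += 1
            returned += 1
    return num, served, lost, returned
```

Calling `simulate_node(20, 0.5, 0.4, 60.0, random.Random(1))` would simulate one hour with rates expressed per minute; repeating the call with different seeds gives a distribution of end-of-period counts.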

Suppose that the rebalancing period contains $p$ sub-periods $T_{1}, T_{2}, \dots, T_{p}$ and that, within each sub-period, the arrival rates of users borrowing and returning bikes are constant. We can calculate the approximate value of the arrival rates from the users’ historical data for that period. The arrival rate of users borrowing bikes at node $s_{i}$ at time $t$ can be calculated using a weighted sum of the historical arrival rates of the node in the same time period. The weights can be calculated based on the weather conditions. However, the arrival rate of users borrowing bikes reflects the users’ demand for bikes rather than the actual number of bikes used. When the number of bikes at node $s_{i}$ is small, some user demand will not be satisfied and the actual number of bikes used is smaller than the user demand. From the historical data, we can only obtain the number of users who actually borrowed bikes, so we define a user loss probability to calculate user demand.

We define $β(num) \in [0, 1]$ as the user loss probability when the number of bikes is $num$, i.e., the probability that a user cannot find an available shared bike and chooses another transportation mode. When $num = 0$, $β(num) = 1$; we set $β(num) = 0$ when $num \geq num_{0}$. Between these two cases, the probability decreases linearly:

$β(num) = 1 - \frac{1}{num_{0}} num, \quad 0 \leq num < num_{0}$

(7)

Let the number of bikes borrowed by users in the time period $T_{j}$, $j \in \{1, \dots, p\}$, at node $s_{i}$, $i \in \{1, \dots, N\}$, be $u_{i}(T_{j})$ and the real number of shared bikes be $num_{i}(t_{j})$. We denote the users’ rent demand by $ud_{i}(T_{j})$; it can be calculated from the user loss probability, where the second equality holds for $0 < num_{i}(t_{j}) < num_{0}$:

$ud_{i}(T_{j}) = \frac{u_{i}(T_{j})}{1 - β(num_{i}(t_{j}))} = \frac{num_{0}}{num_{i}(t_{j})} u_{i}(T_{j})$

(8)
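As a worked illustration of Equations (7) and (8), the sketch below computes the loss probability and the recovered rent demand. The function names and the choice to return `None` when the node is empty (where Equation (8) is undefined) are assumptions for illustration only.

```python
def loss_probability(num, num0):
    """Equation (7), clipped to 0 once num reaches the threshold num0."""
    if num0 <= 0:
        raise ValueError("num0 must be positive")
    return max(0.0, 1.0 - num / num0)


def rent_demand(borrowed, num, num0):
    """Equation (8): scale observed borrowings by 1 / (1 - beta)."""
    beta = loss_probability(num, num0)
    if beta >= 1.0:
        return None  # assumption: demand cannot be recovered from an empty node
    return borrowed / (1.0 - beta)
```

For example, with $num_{0} = 10$, a node holding 5 bikes has $β = 0.5$, so 4 observed borrowings correspond to an estimated demand of 8.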

If the historical numbers of bikes borrowed at node $s_{i}$ in time period $T_{j}$ are selected as $\{u_{i}^{1}(T_{j}), u_{i}^{2}(T_{j}), \dots, u_{i}^{m}(T_{j})\}$, the arrival rate of users borrowing bikes in time period $T_{j}$ can be calculated as:

$λ_{i}^{u}(t) = \frac{1}{|T_{j}|} \cdot \frac{\sum_{x=1}^{m} w^{x} \cdot u_{i}^{x}(T_{j}) \cdot su^{x}(T_{j})}{\sum_{x=1}^{m} su^{x}(T_{j})}, \quad t \in T_{j}$

(9)

where

$su^{x}(T_{j}) = \begin{cases} 1, & num_{i}(t_{j}) \neq 0 \\ 0, & num_{i}(t_{j}) = 0 \end{cases}$

(10)

Formula (10) means that the $x$th sample is included in the calculation only when the number of bikes is not 0, and we define $w^{x}$ as the weight of the $x$th sample, calculated based on weather conditions.

The number of bikes returned by users is not affected by the real number of bikes at each node, so it can be used directly to calculate users’ return demand.

Let the number of bikes returned by users in the time period $T_{j}$, $j \in \{1, \dots, p\}$, at node $s_{i}$, $i \in \{1, \dots, N\}$, be $p_{i}(T_{j})$. If the historical numbers of bikes returned at node $s_{i}$ in time period $T_{j}$ are selected as $\{p_{i}^{1}(T_{j}), p_{i}^{2}(T_{j}), \dots, p_{i}^{m}(T_{j})\}$, the arrival rate of users returning bikes in time period $T_{j}$ can be calculated as:

$λ_{i}^{p}(t) = \frac{1}{|T_{j}|} \cdot \frac{\sum_{x=1}^{m} w^{x} \cdot p_{i}^{x}(T_{j})}{m}, \quad t \in T_{j}$

(11)
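The two rate estimators in Equations (9) and (11) can be sketched as follows. The function names, argument layout, and the `None` return when no sample has bikes available are hypothetical choices for illustration; the weather-based weights are simply passed in as a list.

```python
def borrow_rate(samples, weights, bike_counts, period_length):
    """Equation (9): weighted borrow rate over samples with bikes available."""
    # su^x from Equation (10): 1 if the node had bikes in sample x, else 0
    included = [1 if n != 0 else 0 for n in bike_counts]
    denom = sum(included)
    if denom == 0:
        return None  # assumption: no usable sample, rate left undefined
    total = sum(w * u * s for w, u, s in zip(weights, samples, included))
    return total / denom / period_length


def return_rate(samples, weights, period_length):
    """Equation (11): weighted mean of returns per unit time."""
    total = sum(w * p for w, p in zip(weights, samples))
    return total / len(samples) / period_length
```

With three unit-weight samples of 6, 0 and 12 borrowings over a 60-minute sub-period, where the second sample had an empty node, the borrow rate is $(6 + 12) / 2 / 60 = 0.15$ users per minute.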

5.2. User Demand Forecasting Method

In view of the fact that there are many factors affecting users’ demand for using shared bikes and there may be nonlinear relationships between the factors, we use a backpropagation (BP) neural network for user demand prediction.

The BP neural network is a multilayer feedforward neural network trained by error backpropagation. In general, adding hidden layers increases the fitting capacity of the model. However, a deeper model takes longer to train and is more prone to overfitting. In order to balance the prediction results and training time, this paper constructs a three-layer neural network for prediction. The variables to be predicted are divided into three parts: users’ rent demand, users’ return demand and the number of shared bikes in the future decision period. These three parts depend on the same factors, i.e., have the same inputs, so they are integrated into the same model for prediction. Although the number of bikes at each node in the future decision period can be calculated from the predicted rent demand and return demand, the predictions contain errors, and the error in the number of shared bikes accumulates with each calculation step. Therefore, we choose to predict the number of shared bikes separately.

The prediction model is a three-layer fully connected BP neural network, including an input layer, an output layer and two hidden layers. The input layer receives the factors that influence user demand. The goal of the model is to predict the users’ rent demand, users’ return demand and the number of shared bikes in the future decision period. Since the neural network outputs a fixed number of values rather than a continuous function of time, the decision period $T$ is divided into $p$ sub-periods $T_{1}, T_{2}, \dots, T_{p}$, and $t_{i}$ represents the start time of the period $T_{i}$. $num_{i}(t_{1})$ is an input to the model. The model needs to predict $num_{i}(t_{2}), num_{i}(t_{3}), \dots, num_{i}(t_{p+1})$, $u_{i}(T_{1}), \dots, u_{i}(T_{p})$, and $p_{i}(T_{1}), \dots, p_{i}(T_{p})$, so the output layer contains a total of $3 × p$ neurons.

6. “Self-Balanced” Clusters of Rebalancing Nodes

The whole service area for shared bikes is very large. Solving the BRP directly over the whole service area would make the problem very large and would not match how operators organize rebalancing in practice. Therefore, we divide the service area into several small rebalancing areas and then consider a BRP within each of these areas.

6.1. The Criteria for “Self-Balance”

The clustering of rebalancing nodes decomposes the rebalancing problem into separate single-vehicle rebalancing problems, reducing the computational complexity of a large-scale BRP, as well as improving operational management on the ground. Each cluster of rebalancing nodes is assigned to one service vehicle such that the redistribution demands of every node can be satisfied by operations performed by the vehicle entirely within the cluster. In other words, each cluster of nodes is “self-balanced”. The degree of self-balancing in a cluster is directly related to the efficiency of rebalancing operations in the cluster.

The degree of self-balance can be measured from multiple perspectives. On the one hand, the nodes in the same cluster should be close to each other so that the vehicle can travel from one node to another in less time, improving redistribution efficiency and reducing the vehicle routing cost. We adopt cluster compactness [56] as one criterion of self-balance. Suppose there is a clustering scheme that groups the rebalancing nodes into $m$ clusters: $S_{1}, S_{2}, \dots, S_{m}$. The cluster compactness is measured by the overall deviation of a cluster, computed as the average distance between the rebalancing nodes in the cluster and the cluster center:

$Dev^{S_{k}} = \sum_{\forall s_{i} \in S_{k}} \frac{d(s_{i}, μ_{k})}{|S_{k}|}$

(12)

where $μ_{k}$ is the centroid of cluster $S_{k}$ and $d(s_{i}, μ_{k})$ is the distance between node $s_{i}$ and $μ_{k}$. The smaller this value, the greater the compactness of cluster $S_{k}$.
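A minimal sketch of Equation (12) is given below. It assumes planar coordinates and Euclidean distance for $d(s_{i}, μ_{k})$, which is an illustrative simplification; the paper may use a different distance measure. The function names are hypothetical.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y) node coordinates."""
    xs, ys = zip(*points)
    return (sum(xs) / len(points), sum(ys) / len(points))


def deviation(points):
    """Equation (12): mean distance from each node to the cluster centroid."""
    mu = centroid(points)
    total = sum(math.hypot(x - mu[0], y - mu[1]) for x, y in points)
    return total / len(points)
```

For the four corners of a square with side 2, the centroid is the center and every node lies at distance $\sqrt{2}$ from it, so the deviation is $\sqrt{2}$.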

On the other hand, the redistribution demands of the nodes in the same cluster need to be complementary. For example, if the cluster consists entirely of nodes that only need deliveries or that only need pickups, the degree of demand complementarity in the cluster is low and its requirements cannot be met solely through redistribution operations within the cluster. However, if some nodes in the cluster require pickups and others require deliveries, the degree of complementarity of the cluster is high and the redistribution of bikes between nodes within the cluster is more effective.

In previous studies, the clustering of rebalancing nodes was based on the snapshot demand or inventory level at each node [29,30,57]. In this study, we need to cluster the nodes based on their time-varying inventory during the rebalancing period. Since it takes time for the service vehicle to transfer the bikes, there will be a time difference in the mutual satisfaction of demand between nodes, but nodes in a “self-balanced” cluster should have opposite trends in their bike inventories so that they can complement one another (see Figure 4). Accordingly, we propose a criterion, $NComp^{S_{k}}$, to calculate the non-complementarity of the rebalancing nodes in the cluster $S_{k}$:

$NComp^{S_{k}} = \sum_{j=1}^{p} \left| \sum_{\forall s_{i} : s_{i} \in S_{k}} (H_{i}(t_{j}) - H_{i}^{B}(t_{j})) \times num_{i}^{B}(t_{j}) \right|$

(13)

where $\left| \sum_{\forall s_{i} : s_{i} \in S_{k}} (H_{i}(t_{j}) - H_{i}^{B}(t_{j})) \times num_{i}^{B}(t_{j}) \right|$ represents the degree of non-complementarity of $S_{k}$ at $t_{j}$, and $NComp^{S_{k}}$ is the sum of the degrees of non-complementarity at different times. The smaller the value of $NComp^{S_{k}}$, the greater the degree of complementarity of the nodes in cluster $S_{k}$.
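Equation (13) can be evaluated with a short sketch like the one below. The inputs `H`, `H_B` and `num_B` stand for $H_{i}(t_{j})$, $H_{i}^{B}(t_{j})$ and $num_{i}^{B}(t_{j})$ and are assumed to be precomputed as one list per node over $t_{1}, \dots, t_{p}$; the function name and data layout are illustrative assumptions.

```python
def non_complementarity(H, H_B, num_B):
    """Equation (13): sum over time of the absolute net imbalance of a cluster.

    Each argument is a list with one entry per node; each entry is a list
    of p values, one per sub-period start time t_1..t_p.
    """
    n_nodes = len(H)
    p = len(H[0])
    total = 0.0
    for j in range(p):
        # signed imbalance of the whole cluster at time t_j
        inner = sum((H[i][j] - H_B[i][j]) * num_B[i][j] for i in range(n_nodes))
        total += abs(inner)
    return total
```

Two nodes whose terms have opposite signs at every time cancel inside the absolute value and give a value of 0, while two nodes with same-signed terms add up, which is the behavior the criterion is meant to penalize.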

In addition, the number of rebalancing nodes in the cluster, denoted by $|S_{i}|$, $i \in \{1, \dots, m\}$, should be limited. Since there is only one service vehicle per cluster, the larger the number of nodes, the more redistribution demands go unsatisfied.

We can combine the three criteria we have specified for an optimal self-balanced clustering scheme as follows:

$\max O^{PS} = \sum_{\forall S_{i} \in PS} F(Dev^{S_{i}}, NComp^{S_{i}}, |S_{i}|), \quad \forall PS \in Ω$

(14)

where $F(Dev^{S_{i}}, NComp^{S_{i}}, |S_{i}|)$ is a function of $Dev^{S_{i}}$, $NComp^{S_{i}}$, and $|S_{i}|$. This function measures the degree of self-balance of the single cluster $S_{i}$. In other words, it indicates how much system efficiency in $S_{i}$ can be improved if a single service vehicle is assigned to perform all redistribution operations within it. We will use case analysis and statistics to obtain the expression of $F(Dev^{S_{i}}, NComp^{S_{i}}, |S_{i}|)$ in Section 8.4. $O^{PS}$ represents the average self-balance degree of the whole service area obtained under the clustering scheme $PS$, and $Ω$ indicates the set of feasible clustering schemes in the solution space. $O^{PS}$ is a criterion that allows us to evaluate the clustering scheme $PS$ from the perspective of system improvement in the whole service area.

6.2. Genetic Algorithm-Based Clustering

The objective function proposed in Section 6.1 measures the degree of “self-balancing” in the small service areas obtained by clustering. To search for clustering schemes with a high degree of self-balancing in each cluster, genetic algorithms (GAs) are selected for clustering in this section.

Genetic algorithms (GAs) are stochastic search and optimization techniques based on the mechanics of natural selection, genetics, and evolution. GAs search for a near-optimal solution to a given problem by simultaneously evaluating many points in the search space. In GAs, the individuals in the search space are encoded as chromosomes, and the collection of chromosomes makes up a population. Initially, a random population is created. An objective function representing the fitness of the individual is associated with each chromosome. The selection is based on the principle of survival of the fittest, retaining the chromosomes with higher fitness. Through biologically inspired operators such as crossover and mutation, the population evolves and yields a new generation of chromosomes. Evolution continues until the termination condition is satisfied, which might mean that the optimal value of the objective function has not changed for many generations or that a preset maximum number of generations has been reached.

The basic steps in a GA are as follows:

(1) Chromosome coding and population initialization.
(2) Fitness function (objective function) computation and chromosome selection.
(3) Chromosome pairing and crossover.
(4) Chromosome mutation.
(5) Repetition of steps (2)–(4) until the termination criterion is satisfied.

The search capacity of GAs is used in this paper to cluster the rebalancing nodes, taking the criterion $O^{PS}$ (Equation (14)) as the fitness function. We adopt the locus-based adjacency representation [58] to encode the chromosomes, and each chromosome represents one partitioning scheme. Locus-based adjacency representation is a graph-based representation and does not require the number of clusters to be fixed in advance. Each individual in the population consists of $N$ genes: $gen_{1}, gen_{2}, \dots, gen_{N}$. The position of each gene represents a rebalancing node, and its gene value, taken from $1, \dots, N$, represents another node. For example, the position $i$ of $gen_{i}$ and its gene value $j$ can be regarded as a link between nodes $s_{i}$ and $s_{j}$. The positions of all genes and their gene values together model a network, and every chromosome thus encodes a clustering scheme: all nodes belonging to the same connected component are assigned to the same cluster. Figure 5 illustrates the locus-based adjacency scheme for a network of six nodes.
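Decoding a locus-based adjacency chromosome amounts to finding the connected components of the link graph. The sketch below does this with a small union-find structure; it uses 0-based node indices (the text uses 1-based indices), and the function name and the example chromosome are illustrative assumptions rather than the chromosome shown in Figure 5.

```python
def decode_clusters(genes):
    """Decode a locus-based adjacency chromosome into clusters.

    genes[i] = j means node i is linked to node j (0-based indices).
    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(genes)
    parent = list(range(n))

    def find(a):
        # find the component root, compressing the path along the way
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i, j in enumerate(genes):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # merge the two components

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

For a six-node chromosome `[1, 0, 1, 4, 3, 4]`, the links 0-1, 2-1, 3-4 and 5-4 produce two clusters, `[0, 1, 2]` and `[3, 4, 5]`, without the number of clusters being specified in advance.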

Crossover is a probabilistic process in which two parent chromosomes exchange information to produce two child chromosomes. We choose the uniform crossover [59], which is unbiased with respect to the ordering of genes and can generate any combination of genes from the two parents. Figure 6 provides an example of a uniform crossover; the heritability of the encoding under this crossover operator can be seen.

As for the mutation operation, we use a nearest-neighbor method [60], where the gene value $j$ of $gen_{i}$ is selected only from the nearest neighbors of node $s_{i}$. This shrinks the search space explored by mutation and improves the convergence of the algorithm. Note that the nearest-neighbor list is precalculated only once at the beginning of the algorithm.
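The two genetic operators described above can be sketched as follows. The function names, the 0.5 swap probability per gene, and the per-gene mutation rate parameter are assumptions for illustration; the paper does not state these settings in this section.

```python
import random


def uniform_crossover(parent_a, parent_b, rng):
    """Build two children by swapping each gene independently with probability 0.5."""
    child_a, child_b = [], []
    for gene_a, gene_b in zip(parent_a, parent_b):
        if rng.random() < 0.5:
            gene_a, gene_b = gene_b, gene_a
        child_a.append(gene_a)
        child_b.append(gene_b)
    return child_a, child_b


def nearest_neighbor_mutation(genes, neighbors, rate, rng):
    """Mutate each gene with probability `rate`, drawing the new value only
    from the precomputed nearest-neighbor list of that node."""
    mutated = list(genes)
    for i in range(len(mutated)):
        if rng.random() < rate:
            mutated[i] = rng.choice(neighbors[i])
    return mutated
```

Because every gene value in a child comes from one of the parents at the same position, each link in the child already exists in a parent, and restricting mutation to `neighbors[i]` keeps new links short.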

7. Multi-Period Synchronous Rebalancing by a Service Vehicle

To obtain the optimal rebalancing effect in the long term, this section seeks to propose a multi-period synchronous rebalancing method by a service vehicle to optimize the rebalancing results of shared bicycles in each area.

7.1. Formulation

All rebalancing nodes have redistribution demands in each sub-period of the rebalancing period, $Q_{i}(t_{1}), Q_{i}(t_{2}), \dots, Q_{i}(t_{p})$, $s_{i} \in S$, and these are considered together in our synchronous rebalancing method, unlike the single-period rebalancing approach, which considers only a single demand of each node in a period.

The service vehicle can visit a node only when it needs pickup or delivery, and each node can be visited multiple times as long as it has redistribution demand. In the rebalancing process, the vehicle could pick up or deliver bikes at a certain node in a certain sub-period and then travel to another node in the following sub-periods. The vehicle routes are planned to prevent shortage events; plans indicate which nodes the vehicle needs to visit in which sub-periods, as well as the corresponding pickup or delivery quantity. Because the vehicles take time to travel between nodes, not all node demands can be satisfied in each sub-period. The objective of our bike-sharing rebalancing model is to achieve optimal system efficiency over the entire rebalancing period by picking up or delivering bikes at each node at the optimal time, and at the same time to keep the routing cost as low as possible. This multi-period synchronous rebalancing method considers the node demands of multiple periods simultaneously, so it can obtain an optimal scheduling scheme applicable to a longer time dimension.

The rebalancing problem is modeled similarly to the shortest path problem. The vertexes in the network represent the start time of sub-periods in which the nodes require pickups or deliveries, and the arcs are from the earlier vertexes to the later. Every node has a series of redistribution demands, so every node has many vertexes, but the vertexes in the network change with every visit from the vehicle. This is because after the vehicle picks up bikes from a node or delivers bikes to it in a certain sub-period, the number of bikes at the node is updated and its redistribution demand is recalculated for later sub-periods.

An example is shown in Figure 7. Each node $s_{i}$, $s_{j}$, $s_{k}$ has many separate vertexes, represented by the points. The horizontal line represents the timeline, passing from left to right. The arcs all go from earlier vertexes to later ones, as shown by the dotted arrows. Each solid arrow represents one step in the rebalancing process: the vehicle leaves the vertex at the tail of the arrow and travels to the vertex at the head of the arrow to make another pickup or delivery. The arc $(a, b)$ represents the first step, and the number and position of the vertexes of node $s_{i}$ to the right of vertex $a$ will change after this step.

At every step, the vehicle moves from a vertex that has just been serviced to the next vertex, and the existing route is extended by one new path, until we obtain the complete route, that is, until the vehicle cannot find the next vertex to visit and the rebalancing process terminates, yielding one rebalancing scheme. In this example, the rebalancing operation ends after three steps. This is a Markov decision process (MDP). At each step, the vehicle makes a decision based on the current system state.

We now formulate the rebalancing problem for one cluster $S_{i}$ in a rebalancing period. The cluster contains $M$ rebalancing nodes. The first step begins at $t_{s}$, when the vehicle departs from a virtual depot that is equidistant from all nodes, and the last step ends at $t_{e}$.

At every step, the vehicle moves from one vertex to the next, and the existing route is extended by one new path. The directed graph $G^{k} = (V^{k}, A^{k})$ describes the network of the routing problem in step $k$. (The directed graphs of other steps would be different.) The vertex set $V^{k} = \{t_{i}^{k}[a]\}_{q}$ consists of the times of redistribution demands at all nodes in step $k$, where $t_{i}^{k}[a]$ is the time of a redistribution demand of node $s_{i}$ in step $k$ and $a$ is its index. Each vertex belongs to a node. The arc set $A^{k}$ consists of all arcs from the earlier vertexes to the later vertexes in step $k$. The quantity of the redistribution demand $Q(t_{i}^{k}[a])$ of node $s_{i}$ at time $t_{i}^{k}[a]$ in step $k$ can be calculated using Equation (6), and the real quantity to be picked up or delivered in step $k$ is denoted by $q^{k}$. There is only one vehicle in this problem, with capacity $C$ and speed $v$. The time required to load or unload a single bike is $l$. The inventory of the vehicle in step $k$ is $I^{k}$, and $d_{ij}$ is the distance between nodes $s_{i}$ and $s_{j}$.

Once a vertex has been visited, this vertex and previous vertexes do not change further. Suppose the vehicle cannot find the next vertex after step $K$. The rebalancing process terminates at that step. From the latest directed graph $G^{K} = (V^{K}, A^{K})$, we can obtain the entire vehicle route. Let $num_{i}^{K}(t_{j})$ be the number of bikes in node $s_{i}$ at $t_{j}$ ($t_{j} \in V^{K}$) in the final step. Some variables can be defined:

$x(t_{i}^{k}[a], t_{j}^{k}[b]) = \begin{cases} 1, & \text{if the vehicle moves from } s_{i} \text{ at } t_{i}^{k}[a] \text{ to } s_{j} \text{ at } t_{j}^{k}[b] \text{ in or before step } k; \\ 0, & \text{else.} \end{cases}$

(15)

$y_{i}^{k}(t_{j}) = \begin{cases} 1, & δ_{i}^{\min} > num_{i}^{k}(t_{j}); \\ 0, & δ_{i}^{\min} \leq num_{i}^{k}(t_{j}). \end{cases}$

(16)

The formulation of this problem is as follows:

\min O^{S_{i}} = P_{1} \cdot (- R U^{S_{i}} - R T^{S_{i}}) + P_{2} \cdot D^{S_{i}}

(17)

R U^{S_{i}} = \frac{\sum_{i = 1}^{M} \sum_{j = 1}^{P} (R U_{i}^{1} (t_{j}) - R U_{i}^{K} (t_{j}))}{\sum_{i = 1}^{M} \sum_{j = 1}^{P} R U_{i}^{1} (t_{j})}

(18)

R U_{i}^{k} (t_{j}) = (δ_{i}^{\min} (t_{j}) - n u m_{i}^{k} (t_{j})) y_{i}^{k} (t_{j})

(19)

R T^{S_{i}} = \frac{\sum_{i = 1}^{M} \sum_{j = 1}^{P} (y_{i}^{1} (t_{j}) - y_{i}^{K} (t_{j}))}{\sum_{i = 1}^{M} \sum_{j = 1}^{P} y_{i}^{1} (t_{j})}

(20)

D^{S_{i}} = \sum_{\forall t_{i}^{K} [a], t_{j}^{K} [b] : (t_{i}^{K} [a], t_{j}^{K} [b]) \in A^{K}} x (t_{i}^{K} [a], t_{j}^{K} [b]) \cdot d_{i j}

(21)

Subject to:

q^{k} = \min (I^{k}, Q (t_{i}^{k} [a])), Q (t_{i}^{k} [a]) > 0, \forall k \in {1, 2, \dots, K}

(22)

q^{k} = - \min (C - I^{k}, - Q (t_{i}^{k} [a])), Q (t_{i}^{k} [a]) < 0, \forall k \in {1, 2, \dots, K}

(23)

I^{k + 1} = I^{k} - q^{k}, \forall k \in {1, 2, \dots, K - 1}

(24)

n u m_{i}^{k + 1} (t_{i}^{k} [a]) = n u m_{i}^{k} (t_{i}^{k} [a]) + q^{k}

(25)

\sum_{\forall t_{i}^{k} [a]} x (t_{i}^{k} [a], t_{j}^{k} [b]) (t_{i}^{k} [a] + \frac{d_{i j}}{v} + q^{k} \cdot l) \leq t_{j}^{k} [b]

(26)

G^{k + 1} (V^{k + 1}, A^{k + 1}) \leftarrow U p d a t e (s_{i}, t_{i}^{k} [a], n u m_{i}^{k + 1} (t_{i}^{k} [a]))

(27)

\sum_{\forall t_{j}^{k} [b]} x (t_{i}^{k} [a], t_{j}^{k} [b]) - \sum_{\forall t_{i}^{k} [a]} x (t_{j}^{k} [b], t_{i}^{k} [a]) = 1, t_{i}^{k} [a] = t_{s}

(28)

\sum_{\forall t_{j}^{k} [b]} x (t_{i}^{k} [a], t_{j}^{k} [b]) - \sum_{\forall t_{i}^{k} [a]} x (t_{j}^{k} [b], t_{i}^{k} [a]) = - 1, t_{i}^{k} [a] = t_{e}

(29)

\sum_{\forall t_{j}^{k} [b]} x (t_{i}^{k} [a], t_{j}^{k} [b]) - \sum_{\forall t_{i}^{k} [a]} x (t_{j}^{k} [b], t_{i}^{k} [a]) = 0, t_{i}^{k} [a] \neq t_{s}, t_{e}

(30)

Equation (17) is the objective function, combining three elements: (1) the reduction in the total number of bicycle shortages (Equation (18)), (2) the reduction in the total duration of bicycle shortages (Equation (20)), and (3) the length of the vehicle route (Equation (21)).

P_{1} and P_{2}

are two weight parameters.

Constraint 22 ensures that the real delivery quantity cannot exceed either the delivery demand of the vertex or the vehicle inventory. Constraint 23 ensures that the real pickup quantity cannot exceed either the pickup demand of the vertex or the remaining vehicle capacity. Constraint 24 is the transition rule of the vehicle inventory. Constraint 25 updates the number of bikes in the node that has been serviced. Constraint 26 is the time constraint, namely, that the arrival time of the service vehicle at the vertex cannot be later than the time of the redistribution demand represented by the vertex. Constraint 27 updates the directed graph. The function

U p d a t e (s_{i}, t_{i}^{k} [a], n u m_{i}^{k + 1} (t_{i}^{k} [a]))

recalculates the redistribution demand of the node that has just been visited

(s_{i})

. The inputs include the rebalancing node that needs to be updated

(s_{i})

, the updated time

(t_{i}^{k} [a])

and the new number of bikes at the node

(n u m_{i}^{k + 1} (t_{i}^{k} [a]))

. Based on these inputs, the redistribution demand of

s_{i}

after

t_{i}^{k} [a]

can be anticipated and the directed graph is rebuilt. Constraints 28–30 ensure that there is only one route and that no vertex can be visited more than once.
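Constraints 22–25 describe one service operation at a vertex. The sketch below applies them for a single visit, following the sign convention in the formulation (a positive demand is a delivery and a negative demand is a pickup). The function name and the tuple it returns are illustrative assumptions.

```python
def service_step(inventory, capacity, demand, node_bikes):
    """Apply Equations (22)-(25) for one visit.

    Returns (q, new vehicle inventory, new bike count at the node).
    """
    if demand > 0:
        # delivery, Equation (22): limited by what the vehicle carries
        q = min(inventory, demand)
    elif demand < 0:
        # pickup, Equation (23): limited by remaining vehicle capacity
        q = -min(capacity - inventory, -demand)
    else:
        q = 0
    new_inventory = inventory - q    # Equation (24)
    new_node_bikes = node_bikes + q  # Equation (25)
    return q, new_inventory, new_node_bikes
```

For example, a vehicle carrying 5 bikes that visits a node needing 8 can deliver only 5, and a vehicle carrying 18 of a possible 20 bikes can pick up only 2 of the 6 surplus bikes at a node.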

This methodology is intended to solve the rebalancing problem in each rebalancing period. We propose two different strategies for periodizing: the fixed-period strategy and the rolling-period strategy. The fixed-period strategy partitions the planning horizon into fixed rebalancing periods; when all the planned rebalancing operations in one rebalancing period are completed, vehicle routing and bike redistribution for the next fixed period begin. With the rolling-period strategy, when the vehicle picks up or delivers bikes at a node in a sub-period

T

, the rebalancing period moves forward, starting from the sub-period

T

, and the optimal rebalancing scheme will be replanned in the new rebalancing period. Figure 8 illustrates these two strategies. The rolling-period strategy is more consistent with the concept of dynamic rebalancing, but the rebalancing problem needs to be solved more often within the same planning horizon than it does under the fixed-period strategy, and this requirement increases the computing burden for operators.

7.2. Improved Max–Min Ant System

Ant colony optimization is an efficient swarm intelligence algorithm that has proved effective for vehicle routing problems and other optimization problems. Therefore, we propose a solution for the above model based on the max–min ant system (MMAS) [61]. The solution process of the ant system algorithm is analogous to the vehicle route-building process. At each step, the ants in the ant colony system (ACS) choose the next vertex as indicated by the pheromone trail and heuristic information. For the problems in this paper, in addition to the geographical distance, the heuristic information also takes into account the time urgency and the size of the node demands. The “ants” (service vehicles) prefer to visit vertexes that are closer in time and distance and that have larger delivery demand. Assuming that an ant has already completed step $k - 1$ and is currently located at vertex $t_{i}^{k}[a]$ ($t_{i}^{k}[a] \in V^{k}$), it will choose the next vertex $t_{j}^{k}[b]$ in step $k$. The following rule governs its choice:

Given $q_{0} \in (0, 1)$, generate a random number $q \in (0, 1)$.

(1): If $q \leq q_{0}$ , the ant chooses the next vertex $t_{j}^{k} [b]$ with the highest combined pheromone and heuristic value:

$t_{j}^{k}[b] = \arg\max_{t_{i'}^{k}[a']} \left\{ [C_{τ} \cdot τ^{r}(t_{i}^{k}[a], t_{i'}^{k}[a'])]^{α} \cdot [C_{η} \cdot η(t_{i}^{k}[a], t_{i'}^{k}[a'])]^{β} \cdot [C_{ψ} \cdot ψ(t_{i}^{k}[a], t_{i'}^{k}[a'])]^{γ} \cdot [C_{ϕ} \cdot ϕ(t_{i'}^{k}[a'])]^{ζ} \cdot ξ \right\}, \quad \text{s.t. } q \leq q_{0}, \ (t_{i}^{k}[a], t_{i'}^{k}[a']) \in A^{k}, \ i \neq i'$

(31)

(2): If $q > q_{0}$ , the ant chooses the next vertex $t_{j}^{k} [b]$ at random according to the probability:

$p(t_{i}^{k}[a], t_{j}^{k}[b]) = \begin{cases} \frac{h(t_{i}^{k}[a], t_{j}^{k}[b])}{\sum_{\forall t_{i'}^{k}[a'] : (t_{i}^{k}[a], t_{i'}^{k}[a']) \in A^{k}} h(t_{i}^{k}[a], t_{i'}^{k}[a'])}, & q > q_{0} \\ 0, & \text{else} \end{cases}$

(32)

where:

$h(t_{i}^{k}[a], t_{j}^{k}[b]) = [C_{τ} \cdot τ^{r}(t_{i}^{k}[a], t_{j}^{k}[b])]^{α} \cdot [C_{η} \cdot η(t_{i}^{k}[a], t_{j}^{k}[b])]^{β} \cdot [C_{ψ} \cdot ψ(t_{i}^{k}[a], t_{j}^{k}[b])]^{γ} \cdot [C_{ϕ} \cdot ϕ(t_{j}^{k}[b])]^{ζ} \cdot ξ$

(33)

$α, β, γ, ζ$ are four parameters that determine the relative importance of the pheromone trail and the heuristic operators. $C_{τ}, C_{η}, C_{ψ}, C_{ϕ}$ are four positive parameters for regulating the influence of dimensions. $τ^{r}(t_{i}^{k}[a], t_{i'}^{k}[a'])$ is the pheromone of the path $(t_{i}^{k}[a], t_{i'}^{k}[a'])$ in iteration $r$. The pheromone trail is used to induce the ant to choose the better path at every step, constructing the route that minimizes the objective function. In the MMAS, the range of possible pheromone trails on each path is limited to an interval $[τ_{\min}, τ_{\max}]$ to avoid stagnation of the search and premature convergence, and the pheromone trails are all initialized to $τ_{\max}$ in the first iteration. After all ants have completed their route construction, the pheromone trails are updated. The update follows the rule:

τ^{r + 1} (t_{i}^{k} [a], t_{i^{'}}^{k} [a^{'}]) = ρ \cdot τ^{r} (t_{i}^{k} [a], t_{i^{'}}^{k} [a^{'}]) + O_{i b}^{S_{i}}

(34)

where $O_{ib}^{S_{i}}$ is the value of the objective function of the best solution in the current iteration, and $ρ$ (with $0 < ρ < 1$) is the trail persistence (thus, $1 - ρ$ models the evaporation). Only the best route adds pheromone in each iteration. The upper limit $τ_{\max}$ also needs to be updated: $τ_{\max} = O_{gb}^{S_{i}} / (1 - ρ)$, where $O_{gb}^{S_{i}}$ is the value of the objective function of the best solution found so far. The lower pheromone limit is $τ_{\min} = τ_{\max} / {C_{τ}}^{'}$, where ${C_{τ}}^{'}$ is a positive regulating parameter.
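The pheromone update with bounded trails can be sketched as below. Arcs are represented as dictionary keys, the deposit added to the best route is passed in as a plain number, and the two limits are supplied by the caller; these names and this data layout are assumptions for illustration, not the paper's implementation.

```python
def update_pheromone(tau, best_arcs, rho, deposit, tau_min, tau_max):
    """Evaporate all trails, reinforce the arcs of the best route,
    then clamp every trail to the interval [tau_min, tau_max]."""
    new_tau = {}
    for arc, value in tau.items():
        value = rho * value          # persistence rho, evaporation 1 - rho
        if arc in best_arcs:
            value += deposit         # only the best route adds pheromone
        new_tau[arc] = min(tau_max, max(tau_min, value))
    return new_tau
```

With $ρ = 0.5$, a trail of 1.0 on an unused arc decays to 0.5 and is raised back to the lower limit if that limit is 0.6, which is how the bounds keep rarely used arcs selectable.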

$η(t_{i}^{k}[a], t_{i'}^{k}[a'])$, $ψ(t_{i}^{k}[a], t_{i'}^{k}[a'])$ and $ϕ(t_{i'}^{k}[a'])$ are heuristic operators; $t_{i}^{k}[a]$ belongs to node $s_{i}$ and $t_{i'}^{k}[a']$ belongs to node $s_{i'}$ ($i \neq i'$). $η(t_{i}^{k}[a], t_{i'}^{k}[a']) = 1 / d_{ii'}$ leads the ants to choose the closest vertex in distance. $ψ(t_{i}^{k}[a], t_{i'}^{k}[a']) = \exp(- C_{ψ'} \cdot (t_{i'}^{k}[a'] - (t_{i}^{k}[a] + d_{ii'} / v + q^{k} \cdot l)))$ represents the time urgency of the demand of $t_{i'}^{k}[a']$, where $t_{i'}^{k}[a']$ also represents a rebalancing time of node $s_{i'}$, and $(t_{i}^{k}[a] + d_{ii'} / v + q^{k} \cdot l)$ is the vehicle arrival time at node $s_{i'}$. If $ψ(t_{i}^{k}[a], t_{i'}^{k}[a'])$ is large, the vehicle arrival time is close to $t_{i'}^{k}[a']$. $C_{ψ'}$ is a positive regulating parameter. $ψ(t_{i}^{k}[a], t_{i'}^{k}[a'])$ leads the ants to choose the closest vertex in time.

$ϕ(t_{i'}^{k}[a'])$ reflects the influence of the redistribution demand of $t_{i'}^{k}[a']$:

$ϕ(t_{i'}^{k}[a']) = \begin{cases} Q(t_{i'}^{k}[a']), & Q(t_{i'}^{k}[a']) > 0 \\ - {C_{ϕ}}^{'} / Q(t_{i'}^{k}[a']), & Q(t_{i'}^{k}[a']) < 0 \end{cases}$

(35)

where $Q(t_{i'}^{k}[a'])$ is the rebalancing demand of $t_{i'}^{k}[a']$ and ${C_{ϕ}}^{'}$ is a positive regulating parameter. If $Q(t_{i'}^{k}[a']) > 0$, node $s_{i'}$ at $t_{i'}^{k}[a']$ needs a delivery, while if $Q(t_{i'}^{k}[a']) < 0$, node $s_{i'}$ at $t_{i'}^{k}[a']$ needs a pickup. $ϕ(t_{i'}^{k}[a'])$ leads the ants to give priority to vertexes with higher delivery demand or lower pickup demand. When the vehicle delivers bikes to the node, the number of bikes after that time will increase and the delivery demand decreases. Conversely, when the vehicle picks up bikes from the node, the bicycle inventory at the node decreases and the delivery demand increases.

$ξ$ equals 1 when part or all of the rebalancing demand can be satisfied; otherwise, $ξ$ equals 0:

$ξ = \begin{cases} 1, & Q(t_{i'}^{k}[a']) > 0, \ I^{k} > 0 \ \text{ or } \ Q(t_{i'}^{k}[a']) < 0, \ C - I^{k} > 0 \\ 0, & \text{else} \end{cases}$

(36)
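The vertex-selection rule built from Equations (31)–(36) can be sketched as follows. The parameter dictionary keys, the `slack` argument (demand time minus vehicle arrival time), and the decision to drop candidates with a zero score before selection are hypothetical choices made for illustration.

```python
import math
import random


def score(tau, dist, slack, Q, inventory, capacity, params):
    """Equation (33) for one candidate vertex.

    slack = demand time of the candidate minus the vehicle arrival time.
    """
    # Equation (36): can any part of the demand be served?
    can_deliver = Q > 0 and inventory > 0
    can_pick_up = Q < 0 and capacity - inventory > 0
    if not (can_deliver or can_pick_up):
        return 0.0
    eta = 1.0 / dist                                       # distance heuristic
    psi = math.exp(-params["c_psi_prime"] * slack)         # time urgency
    phi = Q if Q > 0 else -params["c_phi_prime"] / Q       # Equation (35)
    return ((params["c_tau"] * tau) ** params["alpha"]
            * (params["c_eta"] * eta) ** params["beta"]
            * (params["c_psi"] * psi) ** params["gamma"]
            * (params["c_phi"] * phi) ** params["zeta"])


def choose_next(scores, q0, rng):
    """Equations (31)-(32): greedy choice if q <= q0, else roulette wheel."""
    candidates = [c for c, h in scores.items() if h > 0]
    if not candidates:
        return None  # no reachable, serviceable vertex: the route ends
    if rng.random() <= q0:
        return max(candidates, key=lambda c: scores[c])
    weights = [scores[c] for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

A larger $q_{0}$ makes the ants exploit the current best-scoring vertex more often, while a smaller $q_{0}$ shifts the search toward probabilistic exploration.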

The complete solution process based on this improved MMAS is shown in Algorithm 2. Through this improved MMAS, we can solve the rebalancing problem and obtain a redistribution scheme for a cluster. That is, we can determine which nodes need pickups and which need deliveries, at what times and in what quantities, and nodes can be visited more than once if they so require. The rebalancing scheme is written as $(s_{i}, time^{1}, q^{1}) \to (s_{j}, time^{2}, q^{2}) \to \dots$, where the three elements in each tuple represent, respectively, the rebalancing node visited, the time of the visit, and the number of bicycles picked up or delivered; $q^{k} > 0$ means a delivery and $q^{k} < 0$ means a pickup.

Algorithm 2: Improved MMAS
	Input: A cluster of nodes $S_{i}$ :
	All parameters: $τ^{\max}, ε, μ, α, β, γ, C, C_{τ}, {C_{τ}}^{'}, C_{η}, C_{ψ}, {C_{ψ}}^{'}, C_{ϕ}, {C_{ϕ}}^{'}, q_{0}$ ;
	Ants number $a$ ;
	Max iteration number $i t e r m a x$ ;
	Output: The value of elements of the objective function: $R U^{S_{i}}, R T^{S_{i}}, D^{S_{i}}$ ;
	Rebalancing scheme;
1	Initialize
	Calculate the pickup/delivery times and corresponding quantities of all nodes in $S_{i}$ ;
	Set the first iteration number $r = 1$ , set initial loading of vehicle $I = 0$ ;
	Initialize the $τ_{\max}$ and $τ_{\min}$ , and set initial pheromones $τ^{1} = τ_{\max}$ ;
2	while $r \leq i t e r m a x$ do
3		foreach ant do
			set the ant in the starting vertex, i.e., the virtual depot;
			the step $k = 1$ ;
			set up the initial directed-graph $G^{1} = (V^{1}, A^{1})$ ;
4			while the ant can find the next vertex in $G^{k}$ do
				calculate the pickup/delivery quantity $q^{k}$ according to Equations (22) and (23);
				update the loading $I^{k + 1} = I^{k} - q^{k}$ ;
				update the number of bikes at the rebalancing node just serviced;
				update the directed graph to $G^{k + 1} = (V^{k + 1}, A^{k + 1})$ ;
				choose the next vertex to visit;
				move the ant to the next vertex;
				update the step $k = k + 1$ ;
5			end
6		end
7		calculate the value of objective function of all routes in the current iteration $O_{i b}^{S_{i}}$ ;
		update the pheromone trail of all paths;
		update the best solution found so far $O_{g b}^{S_{i}}$ , and update $τ_{\max}$ and $τ_{\min}$ ;
		$r = r + 1;$
8	end

8. Results and Discussion of the Numerical Study

8.1. Numerical Study Setup

We chose the data of the Mobike system from 9 May 2019 as our experimental dataset (see Figure 9).

All algorithms were coded in Python 3.7 and all experiments were carried out on a workstation powered by an Intel Xeon Gold 6142 CPU @ 2.6 GHz 2.9 GHz with a total RAM of 256 GB running the Windows 10 64-bit operating system.

8.2. Results of Rebalancing Node Determination

All the bike-distribution data collected throughout the day is aggregated. This is necessary because parking hotspots vary according to the time of day and it is important to identify all hotspots in the research area during the whole day. The three heat maps shown in Figure 10 illustrate the distribution of bikes at different times of the day. At 5:00 in the morning, large numbers of bikes are parked in the residential areas, as well as in work areas; at 10:00, the bikes are highly concentrated in work areas; and at 19:00, some bikes have returned to the residential areas. The stacking of all the data collected throughout the day highlights the degree of aggregation by revealing the wide density gaps between areas of high attraction and surrounding areas of low attraction, thus making it easier to identify parking hotspots.

There are 1,224,123 data points in the experimental dataset, and the research area is divided into 1089 (i.e., 33 × 33) grids. The distribution of shared bikes across the city is uneven. It is clear from our observations that bike density is lower in residential areas than in work areas. If we set a high density threshold to identify rebalancing nodes, the hotspot areas in the lower-density residential areas would be omitted as noise; on the other hand, if we set a low density threshold, the hotspots in the work areas would be joined and merged into one very large area. Figure 11a is the heat map of the distribution of all data over the whole day. Figure 11b is the clustering result obtained by setting the density threshold to $θ = 2000$. Based on the simple density-reachable rule, each group of grids represents one cluster (i.e., a hotspot and potential rebalancing node). We observe that some clusters are too large. Figure 11c is the clustering result obtained by setting the density threshold to $θ = 4000$; only a few grids are left, and most of the data in grids with relatively low density are regarded as noise and omitted. Figure 12 illustrates the different clustering results obtained by using different density thresholds.

The problem suggested by these examples can be solved using density thresholds of varying magnitude. The proposed density- and grid-based clustering algorithm with multi-level density thresholds (DGBCAMDTs) identifies rebalancing nodes of varying densities. The larger the number of density threshold levels, the greater the number of rebalancing nodes that can be identified (see Figure 12). In the following experiment, we adopt the multi-level density thresholds $θ = (2000, 2500, 3000, 3500, 4000)$, with which 69 rebalancing nodes are determined.
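
The trade-off between a low and a high threshold can be reproduced on a toy grid. The sketch below is only an illustration under stated assumptions: it treats "density-reachable" as 4-neighbour adjacency between grids whose count reaches the threshold, uses made-up counts rather than the Mobike data, and does not reproduce how DGBCAMDTs combines the levels. The name `grid_clusters` is hypothetical.

```python
from collections import deque


def grid_clusters(counts, theta):
    """Group grids whose count is >= theta into 4-connected clusters."""
    rows, cols = len(counts), len(counts[0])
    seen = set()
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if counts[r][c] < theta or (r, c) in seen:
                continue
            # Breadth-first search over adjacent dense grids.
            queue, cluster = deque([(r, c)]), []
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                cluster.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and counts[ny][nx] >= theta):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            clusters.append(cluster)
    return clusters


# Toy 3 x 5 grid of point counts (made-up numbers).
counts = [
    [2100, 2600, 4200, 2700, 4100],
    [0, 0, 0, 0, 0],
    [2200, 0, 0, 0, 0],
]
clusters_per_level = {theta: len(grid_clusters(counts, theta))
                      for theta in (2000, 2500, 3000, 3500, 4000)}
```

At the lowest level the whole top row merges into one oversized cluster, while at the higher levels the two peaks separate but the isolated low-density grid in the bottom row is lost, which is the behaviour that motivates using several levels together.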

8.3. Case Analysis of Node Demand Forecasting

We set the future decision period to be one hour. Since the data are acquired every 10 min, the future decision period is divided into six time periods with a span of 10 min. Then, the neural network output layer contains 18 neurons, where the first 6 neurons are the number of shared bikes rented by users in the next 6 time periods, the next 6 neurons are the number of shared bikes returned by users in the next 6 time periods, and the last 6 neurons are the number of shared bikes in the next 6 time periods. The model has two hidden layers. The first of these layers contains 200 neurons and the second layer contains 100 neurons. The ReLU function serves as the activation function of the model. The model is optimized using the Adam method and the mean square error (MSE) is used as the loss function.
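
The layout of the 18 output neurons can be made explicit with a small helper. This is a sketch of the output ordering described above only; the function name `split_forecast` and the dictionary keys are assumptions, and the values used are placeholders rather than model predictions.

```python
N_PERIODS = 6  # six 10-min time periods in the one-hour decision period


def split_forecast(output):
    """Split an 18-element output vector into the three 6-period series."""
    if len(output) != 3 * N_PERIODS:
        raise ValueError("expected 18 output values")
    return {
        "rented": list(output[:N_PERIODS]),                  # neurons 1-6
        "returned": list(output[N_PERIODS:2 * N_PERIODS]),   # neurons 7-12
        "inventory": list(output[2 * N_PERIODS:]),           # neurons 13-18
    }


forecast = split_forecast(list(range(18)))  # placeholder values only
```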

We apply the model to the Mobike bike-sharing data collected by web crawlers, together with the environmental simulation method in Section 5.1, to obtain the prediction results shown in Figure 13.

8.4. Results of Bike Redistribution in a Single Cluster

Our rebalancing problem is solved by the improved MMAS. The default values of the parameters are presented in Table 2. The following rebalancing experiments are carried out on the 9 May 2019 dataset.

Figure 14 shows the result of rebalancing operations in a cluster with only two rebalancing nodes: node $s_{2}$ and node $s_{3}$. The length of the rebalancing period is 24 h, divided into 144 sub-periods. Figure 14a presents the number of bikes at the two nodes assuming that no rebalancing operations are performed. We observe that the inventory at node $s_{2}$ would increase in the daytime and decrease in the evening, while the opposite would occur at node $s_{3}$, so the cluster (2, 3) is a “self-balanced” cluster. Figure 14b presents the bike inventories at nodes 2 and 3 after rebalancing operations. The vehicle route is (0, 00:00, 0) → (2, 09:40, −9) → (3, 10:10, 9) → (2, 19:50, −3) → (3, 20:10, −12) → (2, 22:20, 14), the three elements of each expression, respectively, representing the node visited, the access time and the number of bikes picked up or delivered. Compared with no rebalancing operation scenarios, the total number of bike shortages (i.e., delivery demands) at these two nodes has decreased by 76.92% and the total duration of the shortages has fallen by 85.11% as a result of the rebalancing operations, while the total distance traveled by the service vehicle is 2.32 km.
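
As a check on the sign convention, the reported route can be replayed with the load update of Algorithm 2, $I^{k + 1} = I^{k} - q^{k}$. The sketch below assumes an initially empty vehicle and a capacity of 30 bikes (the smallest capacity tested in this section); the function name `replay_loads` is hypothetical.

```python
# Route reported for cluster (2, 3): (node, time, q), q < 0 pickup, q > 0 delivery.
route = [(0, "00:00", 0), (2, "09:40", -9), (3, "10:10", 9),
         (2, "19:50", -3), (3, "20:10", -12), (2, "22:20", 14)]


def replay_loads(route, initial_load=0, capacity=30):
    """Return the vehicle load after each stop, using I_next = I - q."""
    loads = []
    load = initial_load
    for node, time, q in route:
        load -= q  # a pickup (q < 0) raises the load, a delivery lowers it
        if not 0 <= load <= capacity:
            raise ValueError(f"infeasible load {load} at node {node}, {time}")
        loads.append(load)
    return loads


loads = replay_loads(route)
```

Under these assumptions the load never becomes negative and peaks at 15 bikes after the 20:10 pickup, so the route is feasible for every capacity considered.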

In the following study, we test how the rebalancing effect varies with the length of the rebalancing period (30 min, 1 h, 3 h and 6 h), the period strategy (fixed or rolling), and the service vehicle capacity (30, 40 or 50 bikes); the other parameters are set to their default values. The experiment is conducted on three different clusters of 2, 5 and 10 nodes, respectively: $S_{1} = (2, 3)$, $S_{2} = (10, 16, 19, 21, 46)$ and $S_{3} = (2, 3, 24, 25, 26, 37, 38, 44, 55, 59)$. We use three indicators to measure the rebalancing effect: (1) the proportion by which the number of shortages of shared bikes in the cluster was reduced upon completion of the rebalancing operations ($RU^{S_{i}}$: Equation (18)), (2) the proportion by which the total duration of shortages was reduced ($RT^{S_{i}}$: Equation (20)), and (3) the length of the vehicle route ($D^{S_{i}}$: Equation (21)).
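
Read as proportional reductions, the first two indicators can be sketched as below. This is an assumed form based on the verbal description here, not a transcription of Equations (18) and (20); the function name and the counts are illustrative.

```python
def reduction_ratio(before, after):
    """Proportion by which a shortage measure falls after rebalancing.

    before : value without rebalancing (must be positive)
    after  : value with rebalancing
    A negative result means rebalancing made the measure worse.
    """
    if before <= 0:
        raise ValueError("the no-rebalancing value must be positive")
    return (before - after) / before


improved = reduction_ratio(before=20, after=5)    # 0.75, i.e., a 75% reduction
worsened = reduction_ratio(before=20, after=25)   # -0.25, shortages increased
```

The same ratio applies to the number of shortages and to their total duration; a negative value corresponds to the case in which rebalancing increases shortages.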

The planning horizon for the rebalancing experiment is one day. With the fixed-period strategy, it can be divided into several rebalancing periods, with the rebalancing problem being solved in each period. Table 3 presents the results obtained with rebalancing periods of various durations from 30 min to 6 h. Since the sub-period length is fixed at 10 min, a longer rebalancing period means more sub-periods. The table indicates that system efficiency would be improved by the adoption of our rebalancing methodology. In addition, the results improve when a longer rebalancing period is used. When the length of the rebalancing period is 24 h, meaning that there are 144 10-min sub-periods, the system efficiencies of the three tested clusters are all significantly enhanced, with reductions in both the number of shortages and their duration ($RU^{S_{i}} > 0$ or $RT^{S_{i}} > 0$). When the length of the rebalancing period is 10 min, there is only one sub-period, so the rebalancing method is single-period asynchronous rebalancing. With this method, the rebalancing operation will likely increase shortages of shared bikes ($RU^{S_{i}} < 0$ or $RT^{S_{i}} < 0$) and exacerbate the inefficiency of the system.

These results demonstrate that the proposed methodology, which considers the cumulative effects of multiple rebalancing sub-periods, is superior to a method that considers each sub-period separately. This is because the redistribution operations of each period affect subsequent periods, and that effect may even reduce system efficiency. Although single-period rebalancing may improve system efficiency within that period, its positive effects may not be maximized, and it may even have a negative impact from the perspective of the entire planning horizon. On the other hand, the rebalancing of a bike-sharing system involves forecasting redistribution demand. Such forecasts can never be 100% accurate, and the longer the period covered, the less accurate the forecast. In practice, therefore, the duration of the rebalancing period should be limited.

Table 4 depicts the rebalancing performance of the rolling-period strategy, in which a new rebalancing period begins every time the vehicle starts looking for the next node to visit. We tested the rolling-period strategy with rebalancing periods of different durations: 30 min, 1 h, 3 h and 6 h. As expected, given the same period length, the rolling-period strategy performs better than the fixed-period strategy, especially with rebalancing periods of short duration. However, a shorter period still produces worse results than a longer period. Moreover, the computing time is greater for a rolling-period strategy than for a fixed-period one, since the rolling strategy requires the rebalancing problem to be solved more often within the same planning horizon. In actual operation, while the rolling-period strategy will yield greater improvements in system performance and user satisfaction than the fixed-period scheme (for rebalancing periods of equal length), the computing burden will be greater and operating costs will be higher.
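
The difference between the two strategies can be sketched by listing the rebalancing periods each one generates over the one-day horizon. The sketch assumes that times are measured in minutes from midnight and that a period is truncated at the end of the horizon; the function names and the vehicle decision times are made up for illustration.

```python
SUB_PERIOD_MIN = 10  # sub-period length used throughout the experiments


def fixed_windows(horizon_min, period_min):
    """Back-to-back rebalancing periods covering the planning horizon."""
    return [(start, min(start + period_min, horizon_min))
            for start in range(0, horizon_min, period_min)]


def rolling_windows(decision_times_min, period_min, horizon_min):
    """One rebalancing period starting at each vehicle decision time."""
    return [(t, min(t + period_min, horizon_min)) for t in decision_times_min]


def sub_periods(window):
    """Number of 10-min sub-periods inside a rebalancing period."""
    start, end = window
    return (end - start) // SUB_PERIOD_MIN


day = 24 * 60
fixed = fixed_windows(day, 6 * 60)                    # four 6 h periods
rolling = rolling_windows([0, 40, 70, 130], 60, day)  # hypothetical decision times
```

With a 6 h period the fixed strategy solves four problems of 36 sub-periods each, whereas the rolling strategy solves one problem per vehicle decision, with overlapping periods; this is why its computing time is higher for the same planning horizon.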

Table 5 shows the effects of varying the service vehicle capacity—30, 40 or 50 bikes—on performance measures (under a fixed-period strategy). The results indicate that vehicle capacity has little impact on the effectiveness of rebalancing operations.

8.5. Results of Bike Redistribution in the Research Area Based on Node Clusters

In Section 6.1, we propose a comprehensive criterion to measure the degree of “self-balance” of a single cluster $S_{i}$ (Equation (14)). The criterion combines three factors: (1) cluster compactness, $Dev^{S_{i}}$; (2) the non-complementarity of the rebalancing nodes, $NComp^{S_{i}}$; and (3) the number of nodes in the cluster, $|S_{i}|$.

In order to determine the relationships between rebalancing performance ($RU^{S_{i}}$, $RT^{S_{i}}$ and $D^{S_{i}}$) and these three criteria, we randomly group the nodes into 100 different clusters and solve the rebalancing problem in each cluster. The correlation matrix of the three criteria and rebalancing performance is shown in Table 6, and the relationship between $RT^{S_{i}}$ and the three criteria is depicted in Figure 15.

From Table 6 and Figure 15, we observe that the three criteria work together to affect the rebalancing result. As the $Dev^{S_{i}}$ of the cluster declines (i.e., its compactness increases), the values of $RU^{S_{i}}$ and $RT^{S_{i}}$ are likely to rise, meaning that system efficiency increases, but when the value of $Dev^{S_{i}}$ is very small, the rebalancing result would correlate more with the complementarity of the nodes in the cluster. That is, the rebalancing effect is greater when $NComp^{S_{i}}$ is smaller and the complementarity of the nodes is greater. There is a similar relationship between $NComp^{S_{i}}$, $|S_{i}|$ and $RT^{S_{i}}$. When $NComp^{S_{i}}$ and $|S_{i}|$ are both smaller, $RT^{S_{i}}$ is larger. In general terms, the rebalancing effect is greater in a small region with a small number of complementary rebalancing nodes.

This result is easy to understand. On the one hand, if the rebalancing nodes in the same cluster are closer to one another, the service vehicle can drive from one node to another in a shorter time, improving the efficiency of the rebalancing operations. At the same time, the trends in the bicycle inventories of two neighboring nodes are likely to be similar, so the complementarity of the two nodes is probably low. Thus, in a small cluster, the complementarity of the rebalancing nodes in the cluster takes on greater importance. However, with greater rebalancing efficiency, more redistribution operations can be completed in the same span of time and routing costs ($D^{S_{i}}$) would also be improved. This explains the negative correlation between $D^{S_{i}}$ and $NComp^{S_{i}}$.

$RT^{S_{i}}$ and $RU^{S_{i}}$ are two criteria that directly evaluate the improvement of system efficiency in a cluster, and there is a strong positive correlation between them (0.9). We therefore use $RT^{S_{i}}$ alone to represent the degree of “self-balance” of a cluster.

The objective $O^{PS}$ (Equation (14)) is a criterion that evaluates the rebalancing scheme $PS$ from the perspective of improving system efficiency. That is, $O^{PS}$ measures the reduction that can be achieved in all shortage events at rebalancing nodes in the research area through bike redistribution operations. $O^{PS}$ is the sum of the function $F(Dev^{S_{i}}, NComp^{S_{i}}, |S_{i}|)$ over all clusters $S_{i} \in PS$, where $F$ calculates the degree of “self-balance” of the cluster $S_{i}$.

Based on the results of the above tests on 100 random clusters, we use nonlinear regression to formulate the expression of the function $F(Dev^{S_{i}}, NComp^{S_{i}}, |S_{i}|)$. We adopt the reduction in bicycle shortage time $RT^{S_{i}} \times \sum_{\forall s_{j} : s_{j} \in S_{i}} |s_{j}|$ as the dependent variable, while the independent variables are $Dev^{S_{i}}$, $NComp^{S_{i}}$, $|S_{i}|$ and $\sum_{\forall s_{j} : s_{j} \in S_{i}} |s_{j}|$. Using SPSS 26 to perform nonlinear regression analysis on the output data of the above tests, we obtain the following function expression:

\begin{array}{l} F(Dev^{S_{i}}, NComp^{S_{i}}, |S_{i}|) \\ = RT^{S_{i}} \times \sum_{\forall s_{j} : s_{j} \in S_{i}} |s_{j}| \\ = \Big( 98.49 - 7.77 \times 10^{-3} \times Dev^{S_{i}} - 3.95 \times NComp^{S_{i}} \\ \quad + 0.10 \times (NComp^{S_{i}})^{2} - 7.23 \times 10^{-4} \times (NComp^{S_{i}})^{3} \\ \quad - 1.49 \times |S_{i}| + 7.34 \times 10^{-3} \times |S_{i}|^{2} \Big) \times \sum_{\forall s_{j} : s_{j} \in S_{i}} |s_{j}| \end{array}

(37)
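
Equation (37) can be evaluated directly once the three criteria of a cluster are known. The sketch below copies the regression coefficients from Equation (37) and sums $F$ over clusters to obtain $O^{PS}$; the function names, the interpretation of $|s_{j}|$ as a per-node weight supplied by the caller, and all input values are assumptions made for illustration.

```python
def predicted_rt(dev, ncomp, size):
    """Bracketed term of Equation (37): fitted RT for one cluster."""
    return (98.49
            - 7.77e-3 * dev
            - 3.95 * ncomp
            + 0.10 * ncomp ** 2
            - 7.23e-4 * ncomp ** 3
            - 1.49 * size
            + 7.34e-3 * size ** 2)


def self_balance(dev, ncomp, node_weights):
    """F for one cluster: fitted RT times the sum of |s_j| over its nodes."""
    return predicted_rt(dev, ncomp, len(node_weights)) * sum(node_weights)


def partition_objective(clusters):
    """O^PS: sum of F over all clusters in the partitioning scheme."""
    return sum(self_balance(dev, ncomp, weights)
               for dev, ncomp, weights in clusters)


# Hypothetical clusters given as (Dev, NComp, [|s_j| for each node]).
scheme = [(1000.0, 10.0, [4, 6, 5, 3, 2]), (500.0, 5.0, [7, 8])]
objective = partition_objective(scheme)
```

A function of this shape could serve as the fitness evaluation in a genetic algorithm over candidate partitions, since it needs only the three criteria and the node weights of each cluster.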

The result of variance analysis of this regression equation is shown in Table 7. The R squared is 0.933, meaning that this regression equation can explain 93.3% of the variation in the dependent variable. With the expression of $F(Dev^{S_{i}}, NComp^{S_{i}}, |S_{i}|)$, we can also obtain the expression of the objective of rebalancing-node clustering, $O^{PS}$, and through the genetic algorithm-based clustering, we can obtain the clustering scheme for rebalancing nodes in the research area.

To test the explanatory ability of the expression of $O^{PS}$ in rebalancing-node clustering, we randomly create 50 different clustering schemes, each of which comprises all nodes in the research area. The relationships between the rebalancing performance measures $RU^{PS}$, $RT^{PS}$, $R^{PS} = RU^{PS} + RT^{PS}$ and the criterion $O^{PS}$ are shown in Figure 16, and the correlation coefficients in all three plots are greater than 0.6. Hence, we can state that the criterion $O^{PS}$ effectively explains the suitability of the clustering scheme from the perspective of system efficiency improvement, and that when the values of $RU^{PS}$ and $RT^{PS}$ increase, the explanatory ability of $O^{PS}$ rises.

Taking the criterion $O^{PS}$ as the fitness function, the rebalancing nodes are clustered by the genetic algorithm-based clustering method, and after 100 iterations, a final partitioning scheme is generated (see Figure 17 and Table 8). This scheme partitions the rebalancing nodes into 12 different clusters. After rebalancing operations in each cluster, the system performance after rebalancing is compared with its performance in a no rebalancing scenario. The number of bicycle shortages in the whole research area is reduced by 78.4% and the total duration of shortages by 75.2%, while the total distance traveled by service vehicles is 215,401.49.

8.6. Evaluation in Different Scenarios

We evaluate the validity of the model in different scenarios by using the dataset obtained by the web crawler from 1 May to 9 May 2019. The web crawler is introduced in Section 3. The results are shown in Table 9.

Since all four days from 1 to 4 May 2019 were holidays in China, the number of bicycle shortages, the total duration of shortages, and the total distance traveled by service vehicles were smaller compared to the weekdays from 5 to 9 May 2019. The dynamic rebalancing model proposed in this paper has better evaluation results than the single-period rebalancing model on working days and holidays.

8.7. Time Complexity

In this subsection, we will evaluate the time complexity of Algorithms 1 and 2. We also use the bike-sharing data in Section 8.6. The specific experimental results can be seen in Table 10 and Table 11.

As can be seen from Table 10, the computational time of our proposed clustering algorithm is slightly greater than that of the DBSCAN algorithm. Among the three clustering algorithms, however, our clustering algorithm achieves the best results.

As can be seen from Table 11, our proposed rebalancing algorithm outperforms the genetic algorithm and the ant colony algorithm in terms of both computation time and rebalancing results.

Therefore, the two proposed algorithms are practical and effective in solving the BRPs.

9. Conclusions and Future Research

Our research focused on a cluster-then-route framework for bike rebalancing in free-floating bike-sharing systems that allow for bikes to be rented and returned almost anywhere in the service area. The main contribution of this study is the proposal of a framework for the solution of the DBRP in emerging FFBSSs. (1) We propose a density- and grid-based clustering algorithm with multi-level density thresholds (DGBCAMDTs) to locate rebalancing nodes. This algorithm can find and distinguish the hotspots in the areas with different degrees of aggregation. (2) We put forward a method that divides the rebalancing period into multiple sub-periods. Based on a BP neural network and environmental simulation, it is possible to calculate the redistribution demand of each node in each sub-period. (3) A criterion to measure the self-balancing degree of node clustering is proposed, which is taken as the objective function, and a self-balancing node-clustering scheme is obtained by using a genetic algorithm. Based on the series of time-varying pickup/delivery demand of each node, we creatively consider the cluster self-balancing over the rebalancing period, rather than at a particular point in time. (4) We propose a multi-period synchronous rebalancing method. Service vehicles redistribute bikes based on the anticipated pickup or delivery demands of the nodes in all sub-periods of the rebalancing period. During the rescheduling process, the redistribution demands of all sub-periods during the rebalancing period of all nodes are considered simultaneously. (5) We improved the max–min ant system algorithm, taking into account the distance between nodes, the urgency of node demands and the number of node demands, so as to realize multi-period synchronous rebalancing.

Our experiments were based on data from the Mobike system. There were three significant findings. (1) The results of the experiment on the determination of rebalancing nodes demonstrated that the proposed algorithm, DGBCAMDTs, which identifies hotspot areas of varying density, is superior to a clustering method that relies on a single density threshold. (2) The results of the rebalancing experiment in three different test clusters show that the proposed rebalancing methodology significantly improves system performance, reducing the number and duration of bike shortages. We also evaluated the impact of managerial decision factors such as the length of the rebalancing period, the fixed-period strategy and the rolling-period strategy, as well as vehicle capacity. Our results show that the proposed multi-period synchronous rebalancing method, which divides the rebalancing period into several sub-periods and calculates the redistribution demands of the rebalancing nodes in every sub-period, yields better outcomes than the most common dynamic rebalancing method, which is based on redistribution demand in only one period. (3) Through the analysis of the rebalancing results of 100 random clusters, we formulated a comprehensive criterion (objective) for rebalancing-node clustering and obtained a partitioning scheme using genetic algorithm-based clustering. Finally, we report that our tests of the proposed rebalancing method during the full research horizon in the whole experiment area showed a 23.4% reduction in the number of bicycle shortages, a 19.7% decrease in the total duration of such shortages, and a 25.8% reduction in the total distance traveled by the service vehicles, relative to a single-period rebalancing scenario.

Future developments may focus on techniques for predicting user demand, i.e., the number of bikes that users ride into and out of the rebalancing nodes, taking the season, weather, day of the week and other factors into account. In addition, our study evaluated the partitioning of the service area only from the perspective of improving system efficiency; further study is needed to evaluate the cost of rebalancing procedures. Future studies should also tackle the problem of collecting and repositioning bicycles scattered in areas outside the parking hotspots, as well as the multi-vehicle routing problem, and further improve the proposed model and solution algorithms.

Author Contributions

Conceptualization, J.S. and Y.H.; methodology, J.S. and Y.H.; software, J.S. and Y.H.; formal analysis, J.S. and Y.H.; writing—original draft preparation, J.S. and Y.H.; writing—review and editing, J.S., Y.H. and J.Z.; visualization, J.S. and Y.H.; supervision, J.Z.; funding acquisition, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 71971156 and No. 72371188) and the Fundamental Research Funds for the Central Universities (No. 22120210241).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflict of interest.

References

Zhang, H.; Zhuge, C.X.; Jia, J.M.; Shi, B.Y.; Wang, W. Green travel mobility of dockless bike-sharing based on trip data in big cities: A spatial network analysis. J. Clean. Prod. 2021, 313, 127930. [Google Scholar] [CrossRef]
Hu, B.B.; Gao, Y.F.; Yan, J.C.; Sun, Y.; Ding, Y.; Bian, J.; Dong, X.L.; Sun, H.J. Understanding the Operational Efficiency of Bicycle-Sharing Based on the Influencing Factor Analyses: A Case Study in Nanjing, China. J. Adv. Transp. 2021, 2021, 8818548. [Google Scholar] [CrossRef]
Lu, L.; Zhao, S.C.; He, Q.C.; Zhu, N. Task assignment in predictive maintenance for free-float bicycle sharing systems. Comput. Ind. Eng. 2022, 169, 108214. [Google Scholar] [CrossRef]
Basak, E.; Iris, Ç. Do the First-and Last-Mile Matter? Examining the Complementary and Substitution Effects of Bike-Sharing Platforms on Public Transit. SSRN Elibrary 2023. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4391093 (accessed on 10 October 2023).
Liu, Y.; Szeto, W.Y.; Ho, S.C. A static free-floating bike repositioning problem with multiple heterogeneous vehicles, multiple depots, and multiple visits. Transp. Res. Part C Emerg. Technol. 2018, 92, 208–242. [Google Scholar] [CrossRef]
Chemla, D.; Meunier, F.; Calvo, R.W. Bike sharing systems: Solving the static rebalancing problem. Discret Optim. 2013, 10, 120–146. [Google Scholar] [CrossRef]
Pan, X.Y.; Tang, J.J.; Yu, T.J.; Cai, J.M.; Xiong, Y.; Gao, F. Reposition optimization in the free-floating bike-sharing system considering transferring travels from urban rail transit. Comput. Ind. Eng. 2023, 178, 109127. [Google Scholar] [CrossRef]
Cruz, F.; Subramanian, A.; Bruck, B.P.; Iori, M. A heuristic algorithm for a single vehicle static bike sharing, rebalancing problem. Comput. Oper. Res. 2017, 79, 19–33. [Google Scholar] [CrossRef]
Hernandez-Perez, H.; Salazar-Gonzalez, J.J. The one-commodity pickup-and-delivery Travelling Salesman Problem. Comb. Optim. Eureka You Shrink 2003, 2570, 89–104. [Google Scholar]
Pal, A.; Zhang, Y. Free-floating bike sharing: Solving real-life large-scale static rebalancing problems. Transp. Res. Part C: Emerg. Technol. 2017, 80, 92–116. [Google Scholar] [CrossRef]
Zhang, Y.H.; Shao, Y.C.; Bi, H.; Aoyong, L.; Ye, Z.R. Bike-sharing systems rebalancing considering redistribution proportions: A user-based repositioning approach. Phys. A Stat. Mech. Its Appl. 2023, 610, 128409. [Google Scholar] [CrossRef]
Wei, Z.H.; Wang, M.Q.; Wang, S.F. A worker-and-system trade-off model for rebalancing free-float bike sharing systems: A mixed rebalancing strategy. IET Intell. Transp. Syst. 2023, 17, 1037–1050. [Google Scholar] [CrossRef]
Du, M.Y.; Cheng, L.; Li, X.F.; Tang, F. Static rebalancing optimization with considering the collection of malfunctioning bikes in free-floating bike sharing system. Transp. Res. Part E Logist. Transp. Rev. 2020, 141, 102012. [Google Scholar] [CrossRef]
Szeto, W.Y.; Shui, C.S. Exact loading and unloading strategies for the static multi-vehicle bike repositioning problem. Transp. Res. Part B Methodol. 2018, 109, 176–211. [Google Scholar] [CrossRef]
Wang, Y.; Szeto, W.Y. Static green repositioning in bike sharing systems with broken bikes. Transp. Res. Part D Transp. Environ. 2018, 65, 438–457. [Google Scholar] [CrossRef]
Lu, Y.L.; Benlic, U.; Wu, Q.H. An effective memetic algorithm for the generalized bike-sharing rebalancing problem. Eng. Appl. Artif. Intell. 2020, 95, 103890. [Google Scholar] [CrossRef]
Ren, Y.P.; Zhao, F.; Jin, H.Y.; Jiao, Z.H.; Meng, L.L.; Zhang, C.Y.; Sutherland, J.W. Rebalancing Bike Sharing Systems for Minimizing Depot Inventory and Traveling Costs. IEEE Trans. Intell. Transp. Syst. 2020, 21, 3871–3882. [Google Scholar] [CrossRef]
Mahmoodian, V.; Zhang, Y.; Charkhgard, H. Hybrid rebalancing with dynamic hubbing for free-floating bike sharing systems. Int. J. Transp. Sci. Technol. 2022, 11, 636–652. [Google Scholar] [CrossRef]
Chen, D.W.; Chen, Q.; Imdahl, C.; Woensel, T.V. A Rolling-Horizon Strategy for Dynamic Rebalancing of Free-Floating Bike-Sharing Systems. IEEE Trans. Intell. Transp. Syst. 2023, 24, 12123–12140. [Google Scholar] [CrossRef]
Regue, R.; Recker, W. Proactive vehicle routing with inferred demand to solve the bikesharing rebalancing problem. Transp. Res. Part E Logist. Transp. Rev. 2014, 72, 192–209. [Google Scholar] [CrossRef]
Shui, C.S.; Szeto, W.Y. Dynamic green bike repositioning problem—A hybrid rolling horizon artificial bee colony algorithm approach. Transp. Res. Part D Transp. Environ. 2018, 60, 119–136. [Google Scholar] [CrossRef]
Swaszek, R.M.A.; Cassandras, C.G. Receding Horizon Control for Station Inventory Management in a Bike-Sharing System. IEEE Trans. Autom. Sci. Eng. 2020, 17, 407–417. [Google Scholar] [CrossRef]
Yuan, P.L.; Han, W.; Su, X.C.; Liu, J.; Song, J.Y. A Dynamic Scheduling Method for Carrier Aircraft Support Operation under Uncertain Conditions Based on Rolling Horizon Strategy. Appl. Sci. 2018, 8, 1546. [Google Scholar] [CrossRef]
Chang, D.F.; Jiang, Z.H.; Yan, W.; He, J.L. Developing a dynamic rolling-horizon decision strategy for yard crane scheduling. Adv. Eng. Inf. 2011, 25, 485–494. [Google Scholar] [CrossRef]
Jin, Y.; Ruiz, C.; Liao, H.T. A simulation framework for optimizing bike rebalancing and maintenance in large-scale bike-sharing systems. Simul. Model. Pract. Theory 2022, 115, 102422. [Google Scholar] [CrossRef]
Masdari, M.; Gharehpasha, S.; Ghobaei-Arani, M.; Ghasemi, V. Bio-inspired virtual machine placement schemes in cloud computing environment: Taxonomy, review, and future research directions. Clust. Comput. 2020, 23, 2533–2563. [Google Scholar] [CrossRef]
Yu, B.; Yang, Z.Z.; Yao, B.Z. An improved ant colony optimization for vehicle routing problem. Eur. J. Oper. Res. 2009, 196, 171–176. [Google Scholar] [CrossRef]
Wang, Y.J.; Kuo, Y.H.; Huang, G.Q.; Gu, W.H.; Hu, Y.H. Dynamic demand-driven bike station clustering. Transp. Res. E-Log 2022, 160, 102656. [Google Scholar] [CrossRef]
Forma, I.A.; Raviv, T.; Tzur, M. A 3-step math heuristic for the static repositioning problem in bike-sharing systems. Transp. Res. Part B Methodol. 2015, 71, 230–247. [Google Scholar] [CrossRef]
Schuijbroek, J.; Hampshire, R.C.; van Hoeve, W.J. Inventory rebalancing and vehicle routing in bike sharing systems. Eur. J. Oper. Res. 2017, 257, 992–1004. [Google Scholar] [CrossRef]
Caggiani, L.; Camporeale, R.; Ottomanelli, M.; Szeto, W.Y. A modeling framework for the dynamic management of free-floating bike-sharing systems. Transp. Res. Part C Emerg. Technol. 2018, 87, 159–182. [Google Scholar] [CrossRef]
Hyland, M.; Hong, Z.H.; Pinto, H.K.R.D.; Chen, Y. Hybrid cluster-regression approach to model bikeshare station usage. Transp. Res. Part A Policy Pract. 2018, 115, 71–89. [Google Scholar] [CrossRef]
Jiang, M.D.; Li, C.; Li, K.H.; Liu, H. Destination Prediction Based on Virtual POI Docks in Dockless Bike-Sharing System. IEEE Trans. Intell. Transp. 2022, 23, 2457–2470. [Google Scholar] [CrossRef]
Wang, W.; Zhao, X.F.; Gong, Z.G.; Chen, Z.K.; Zhang, N.; Wei, W. An Attention-Based Deep Learning Framework for Trip Destination Prediction of Sharing Bike. IEEE Trans. Intell. Transp. 2021, 22, 4601–4610. [Google Scholar] [CrossRef]
Li, X.Y.; Xu, Y.; Chen, Q.; Wang, L.; Zhang, X.H.; Shi, W.Z. Short-Term Forecast of Bicycle Usage in Bike Sharing Systems: A Spatial-Temporal Memory Network. IEEE Trans. Intell. Transp. 2022, 23, 10923–10934. [Google Scholar] [CrossRef]
Wang, J.B.; Miwa, T.; Morikawa, T. A Demand Truncation and Migration Poisson Model for Real Demand Inference in Free-Floating Bike-Sharing System. IEEE Trans. Intell. Transp. 2023, 24, 10525–10536. [Google Scholar] [CrossRef]
Farahbakhsh, F.; Shahidinejad, A.; Ghobaei-Arani, M. Mulituser context-aware computation offloading in mobile edge computing based on Bayesian learning automata. Trans. Emerg. Telecommun. Technol. 2021, 32, e4127. [Google Scholar] [CrossRef]
Shen, S.; Wei, Z.Q.; Sun, L.J.; Rao, K.S.; Wang, R.C. A Hybrid Dispatch Strategy Based on the Demand Prediction of Shared Bicycles. Appl. Sci. 2020, 10, 2778.
Sohrabi, S.; Paleti, R.; Balan, L.; Cetin, M. Real-time prediction of public bike sharing system demand using generalized extreme value count model. Transp. Res. Part A Policy Pract. 2020, 133, 325–336.
Ai, Y.; Li, Z.P.; Gan, M.; Zhang, Y.P.; Yu, D.B.; Chen, W.; Ju, Y.N. A deep learning approach on short-term spatiotemporal distribution forecasting of dockless bike-sharing system. Neural Comput. Appl. 2019, 31, 1665–1677.
Jiang, J.; Lin, F.; Fan, J.; Lv, H.; Wu, J. A Destination Prediction Network Based on Spatiotemporal Data for Bike-Sharing. Complexity 2019, 2019, 7643905.
Benchimol, M.; Benchimol, P.; Chappert, B.; de la Taille, A.; Laroche, F.; Meunier, F.; Robinet, L. Balancing the Stations of a Self Service “Bike Hire” System. RAIRO-Oper. Res.-Rech. Opérationnelle 2011, 45, 37–61.
Ho, S.C.; Szeto, W.Y. Solving a static repositioning problem in bike-sharing systems using iterated tabu search. Transp. Res. Part E Logist. Transp. Rev. 2014, 69, 180–198.
Li, Y.F.; Szeto, W.Y.; Long, J.C.; Shui, C.S. A multiple type bike repositioning problem. Transp. Res. Part B Methodol. 2016, 90, 263–278.
Liu, L.X.; Hu, Z.H.; Zhou, C.L.; Xu, G.H. Research on the clustering algorithm of the bicycle stations based on OPTICS. Concurr. Comput. Pract. Exp. 2019, 31, e4876.
Bruck, B.P.; Cruz, F.; Iori, M.; Subramanian, A. The Static Bike Sharing Rebalancing Problem with Forbidden Temporary Operations. Transp. Sci. 2019, 53, 882–896.
Li, M.Y.; Wang, X.F.; Zhang, X.; Yun, L.F.; Yuan, Y. A Multiperiodic Optimization Formulation for the Operation Planning of Free-Floating Shared Bike in China. Math. Probl. Eng. 2018, 2018, 2639542.
Chen, D.W. Free-floating bike-sharing green relocation problem considering greenhouse gas emissions. Transp. Saf. Environ. 2021, 3, 132–151.
Chiariotti, F.; Pielli, C.; Zanella, A.; Zorzi, M. A Dynamic Approach to Rebalancing Bike-Sharing Systems. Sensors 2018, 18, 512.
Zhai, Y.; Liu, J.; Du, J.; Wu, H. Fleet Size and Rebalancing Analysis of Dockless Bike-Sharing Stations Based on Markov Chain. ISPRS Int. J. Geo-Inf. 2019, 8, 334.
Ester, M.; Kriegel, H.-P.; Sander, J.; Xu, X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96), Portland, OR, USA, 2–4 August 1996; Volume 1996, pp. 226–231.
Qiu, B.Z.; Li, X.L.; Shen, J.Y. Grid-based clustering algorithm based on intersecting partition and density estimation. In Proceedings of the Emerging Technologies in Knowledge Discovery and Data Mining—PAKDD 2007, Nanjing, China, 22–25 May 2007; Volume 4819, p. 368.
Gervini, D.; Khanal, M. Exploring patterns of demand in bike sharing systems via replicated point process models. J. R. Stat. Soc. Ser. C Appl. Stat. 2019, 68, 585–602.
Ashraf, M.T.; Hossen, M.A.; Dey, K.; El-Dabaja, S.; Aljeri, M.; Naik, B. Impacts of Bike Sharing Program on Subway Ridership in New York City. Transp. Res. Rec. 2021, 2675, 924–934.
Gammelli, D.; Wang, Y.H.; Prak, D.; Rodrigues, F.; Minner, S.; Pereira, F.C. Predictive and prescriptive performance of bike-sharing demand forecasts for inventory management. Transp. Res. Part C Emerg. Technol. 2022, 138, 103571.
Handl, J.; Knowles, J. An evolutionary approach to multiobjective clustering. IEEE Trans. Evol. Comput. 2007, 11, 56–76.
Kloimüllner, C.; Papazek, P.; Hu, B.; Raidl, G.R. A Cluster-First Route-Second Approach for Balancing Bicycle Sharing Systems. In Proceedings of the Computer Aided Systems Theory—Eurocast 2015, Las Palmas de Gran Canaria, Spain, 8–13 February 2015; Volume 9520, pp. 439–446.
Park, Y.; Song, M. A genetic algorithm for clustering problems. In Proceedings of the Genetic Programming 1998: Proceedings of the Third Annual Conference, San Francisco, CA, USA, 22–25 July 1998; pp. 568–575.
Syswerda, G. Uniform Crossover in Genetic Algorithms. In Proceedings of the 3rd International Conference on Genetic Algorithms, San Diego, CA, USA, 13–16 July 1991; pp. 2–9.
Wehbi, L.; Bektas, T.; Iris, Ç. Optimising vehicle and on-foot porter routing in urban logistics. Transp. Res. Part D Transp. Environ. 2022, 109, 103371.
Stützle, T.; Hoos, H.H. MAX-MIN Ant System. Future Gener. Comput. Syst. 2000, 16, 889–914.

Figure 1. Sequence diagram for FFBSS.

Figure 2. (a) The interface of the Mobike mobile application. (b) The distribution of all bikes in the research area at 10:00 a.m. on 9 May 2019, where each red dot represents one shared bike.

Figure 3. An example of scheduling demand determination.

Figure 4. An example of a self-balanced cluster with 3 rebalancing nodes.

Figure 5. Illustration of the locus-based adjacency scheme. (Left): one possible genotype. (Middle): translate the genotype into the graph structure. (Right): the final clusters.

Figure 6. Illustration of the uniform crossover. A and B are two parent genotypes and their corresponding graph structures. A standard uniform crossover of the genotypes yields the child C, which has inherited much of its structure from its parents, but differs from both of them.

Figure 7. An example of the network of scheduling times. After every step of the rebalancing operation, some vertices change.

Figure 8. Illustration of the fixed-period and rolling-period strategies.

Figure 9. The distribution of shared bikes in the research area, where each red dot represents one shared bike.

Figure 10. Heat maps of the distribution of all shared bikes in the research area at (a) 5:00, (b) 10:00, and (c) 19:00.

Figure 11. (a) The heat map of the distribution of all bikes in the dataset. (b) The clustering result when θ = 2000. (c) The clustering result when θ = 4000.

Figure 12. Clustering results and rebalancing nodes with different density thresholds (one color represents one cluster).

Figure 13. (a) Prediction results of number of bikes rented by users at node 1. (b) Prediction results of number of bikes returned by users at node 1. (c) Prediction results of number of bikes at node 1.

Figure 14. (a) The number of bikes in cluster (2, 3) without rebalancing operations. (b) The number of bikes in cluster (2, 3) with rebalancing operations.

Figure 15. (a) The correlation between $D e v^{S_{i}}$ and $R T^{S_{i}}$. (b) The correlation between $C o m p^{S_{i}}$ and $R T^{S_{i}}$. (c) The correlation between $|S_{i}|$ and $R T^{S_{i}}$. (d) The relationship of $D e v^{S_{i}}$, $C o m p^{S_{i}}$ and $R T^{S_{i}}$. (e) The relationship of $|S_{i}|$, $C o m p^{S_{i}}$ and $R T^{S_{i}}$.

Figure 16. The relation between the comprehensive criteria $O^{P S}$ and the real rebalancing result.

Figure 17. The partitioning scheme (each color represents one self-balancing region).

Table 1. Review of related literature.

References	Year	Dynamic	FFBSS	Objective	Model	Solution Method	Dataset
Liu et al. [5]	2018	×	√	(1) the total number of the unsatisfied customers (2) the inconvenience of getting a bike from the FFBSS (3) weighted vehicles’ total operational time	MIP	chemical reaction optimization	data from New York, NY, USA
Chemla et al. [6]	2013	×	×	travel cost	MIP	branch-and-cut	randomly generated instances
Zhang et al. [11]	2023	×	√	(1) the upper level: the repositioning workload (2) the lower level: the user profit	MIP	a bi-level programming model	data from Philadelphia, PA, USA
Lu et al. [16]	2020	×	×	total travel cost	MIP	memetic algorithm	90 real-world instances
Ren et al. [17]	2020	×	×	depot inventory cost and traveling cost	MIP	branch-and-cut	random demand datasets
Mahmoodian et al. [18]	2022	×	√	level of service and rebalancing cost	multi-objective optimization	non-dominated sorting genetic algorithm	a real dataset that operates at the University of South Florida
Chen et al. [19]	2023	√	√	(1) the total cost of the rebalancing vehicle and the degree of imbalance in the FFBSS	the rolling-horizon theory	rolling-horizon strategy	data from the Shanghai FFBSS
Regue and Recker [20]	2014	√	×	-	inventory levels model and machine learning	gradient boosting machine	data from the Hubway Bike-sharing system
Shui and Szeto [21]	2018	√	×	-	the rolling-horizon theory	a hybrid rolling horizon artificial bee colony algorithm approach	data from Vienna’s real Citybike stations
Schuijbroek et al. [30]	2017	×	×	makespan	MIP	Markov processes	data from Hubway (Boston, MA, USA) and Capital Bikeshare (Washington, DC, USA)
Benchimol et al. [42]	2011	×	×	travel cost	graph theory	approximation algorithm	-
Ho and Szeto [43]	2014	×	×	total penalty	MIP	iterated tabu search	data from Paris
Li et al. [44]	2016	×	×	total cost	MIP	combined hybrid genetic algorithm	variants of VRP instances
Liu et al. [45]	2019	×	×	-	clustering theory	the density clustering algorithm	data from Ningbo
Bruck et al. [46]	2019	×	×	total cost	MIP	branch-and-cut	variations on the examples of Chemla et al. [6]
Li et al. [47]	2018	×	√	the overload situation of sites and the profit of operators	multi-periodic optimization formulation	branch-and-bound	data of the FFBSSs in China
Chen [48]	2021	×	√	(2) the degree of imbalance in the FFBSS (3) the greenhouse gas emissions	MIP	adaptive variable neighborhood tabu search	a large-scale FFBSS in Shanghai, China
Chiariotti et al. [49]	2018	√	×	-	stochastic process	birth-death Processes	data from New York, NY, USA
Zhai et al. [50]	2019	√	√	-	stochastic process	Markov stochastic process	data from Mobike

Table 2. The default values of the parameters involved.

Parameter	Illustration	Value	Parameter	Illustration	Value
$T$	The length of a sub-period	10 (min)	$q_{0}$	The probability of a random search by an ant	0.6
$μ$	The weight of the changed number of bikes	0.7	$ρ$	Trail persistence (Equation (34))	0.9
$ω$	The width between the benchmark reservation rate and the boundary (Equation (3))	0.3	$C_{τ}$	A positive regulation parameter of the pheromone trail (Equation (33))	1000
$ε$	The critical value of the turnover ratio of an idle node to a busy node (Equation (3))	0.5	$C_{η}$	A positive regulation parameter of distance information (Equation (33))	10
$σ$	The weight of the turnover of bikes (Equation (3))	0.7	$C_{ψ}$	A positive regulation parameter of time information (Equation (33))	10
$C$	The capacity of the service vehicle (in number of bikes)	30	$C_{ϕ}$	A positive regulation parameter of demand information (Equation (33))	0.2
$v$	The average velocity of the service vehicle	500 (m/min)	${C_{τ}}^{'}$	The ratio of $θ^{\max}$ to $θ^{\min}$ (Equation (33))	100
$l$	The time required to load or unload one bike	0.2 (min)	${C_{ψ}}^{'}$	A positive regulation parameter of time urgency (Equation (33))	1/1200
$α$	The weight of the pheromone trail (Equation (33))	1	${C_{ϕ}}^{'}$	A positive regulation parameter of pickup demand (Equation (33))	100
$β$	The weight of distance information (Equation (33))	1	$i t e r m a x$	The maximum number of iterations of the improved MMAS	300
$γ$	The weight of time information (Equation (33))	1	$P$	The length of each period	24 (h)
$ζ$	The weight of demand information (Equation (33))	1

Table 3. The performance of rebalancing periods of different durations (P) with a fixed-period strategy.

P	$S_{1}$				$S_{2}$				$S_{3}$
P	$R U^{S_{i}}$	$R T^{S_{i}}$	$D^{S_{i}}$	$C P U$	$R U^{S_{i}}$	$R T^{S_{i}}$	$D^{S_{i}}$	$C P U$	$R U^{S_{i}}$	$R T^{S_{i}}$	$D^{S_{i}}$	$C P U$
10 min	−861.54%	−38.30%	580.22	4.23	−62.70%	15.56%	2548.08	4.32	−28.19%	27.01%	29,738.79	7.56
30 min	−778.21%	−38.30%	580.22	2.43	42.70%	46.67%	6291.79	3.52	−13.04%	24.76%	27,578.00	8.43
1 h	−861.54%	−38.30%	580.22	1.86	49.73%	53.33%	4170.65	5.12	0.82%	33.41%	33,171.79	7.46
3 h	76.92%	85.11%	2320.88	1.42	43.24%	44.44%	3628.53	2.38	26.02%	38.91%	27,235.79	7.35
6 h	76.92%	85.11%	2320.88	1.33	92.43%	86.67%	6108.94	3.34	43.00%	29.26%	21,922.68	4.23
12 h	76.92%	85.11%	2320.88	1.23	92.43%	86.67%	4664.23	3.96	36.28%	27.97%	30,190.43	5.27
24 h	76.92%	85.11%	2320.88	1.21	91.89%	82.22%	8519.03	2.03	34.38%	29.90%	29,835.45	4.45

Table 4. The performance of rebalancing periods of different durations (P) with a rolling-period strategy.

P	$S_{1}$				$S_{2}$				$S_{3}$
P	$R U^{S_{i}}$	$R T^{S_{i}}$	$D^{S_{i}}$	$C P U$	$R U^{S_{i}}$	$R T^{S_{i}}$	$D^{S_{i}}$	$C P U$	$R U^{S_{i}}$	$R T^{S_{i}}$	$D^{S_{i}}$	$C P U$
30 min	76.92%	85.11%	2320.88	7.32	49.19%	55.56%	4186.30	13.54	19.57%	21.22%	21,545.33	15.87
1 h	76.92%	85.11%	2320.88	6.54	55.14%	60.00%	4254.76	11.23	25.27%	34.08%	18,877.57	16.54
3 h	76.92%	85.11%	2320.88	6.97	54.50%	57.78%	4238.42	10.38	25.61%	29.58%	27,129.72	15.42
6 h	75.64%	82.97%	2320.88	6.91	94.32%	82.22%	4675.54	9.65	46.74%	23.47%	20,184.74	20.34

Table 5. The performance effects of different service vehicle capacities ($C$, in number of bikes).

$C$	$S_{1}$				$S_{2}$				$S_{3}$
$C$	$R U^{S_{i}}$	$R T^{S_{i}}$	$D^{S_{i}}$	$C P U$	$R U^{S_{i}}$	$R T^{S_{i}}$	$D^{S_{i}}$	$C P U$	$R U^{S_{i}}$	$R T^{S_{i}}$	$D^{S_{i}}$	$C P U$
30	76.92%	85.11%	2320.88	7.51	92.43%	86.67%	6645.49	12.64	34.38%	29.90%	29,835.45	16.52
40	70.51%	80.85%	5802.20	8.32	89.19%	80.00%	9917.81	10.92	25.14%	28.30%	32,574.27	16.96
50	70.51%	80.85%	5802.20	9.62	89.19%	80.00%	7592.76	13.32	27.79%	24.76%	40,404.88	17.84

Table 6. Correlation matrix of the three criteria and the rebalancing results.

	$D e v^{S_{i}}$	$C o m p^{S_{i}}$	$\|S_{i}\|$	$R U^{S_{i}}$	$R T^{S_{i}}$	$D^{S_{i}}$
$D e v^{S_{i}}$	1
$C o m p^{S_{i}}$	−0.41	1
$\|S_{i}\|$	0.59	−0.45	1
$R U^{S_{i}}$	−0.23	−0.06	−0.49	1
$R T^{S_{i}}$	−0.25	−0.07	−0.53	0.90	1
$D^{S_{i}}$	0.86	−0.55	0.75	−0.23	−0.29	1

Table 7. Parameter estimates and ANOVA of the regression model.

Parameter Estimates					ANOVA ^a
Parameter	Estimate	Std. Error	95% Confidence Interval		Source	Sum of Squares	df	Mean Squares
Parameter	Estimate	Std. Error	Lower Bound	Upper Bound	Source	Sum of Squares	df	Mean Squares
a	98.492	0.000	98.492	98.492	Regression	4,590,354.369	7	655,764.910
b	−0.008	0.000	−0.008	−0.008	Residual	119,952.631	193	621.516
c	−3.952	1.039	−6.001	−1.903	Uncorrected Total	4,710,307.000	200
d	0.102	0.049	0.006	0.198	Corrected Total	1,800,711.355	199
e	−0.001	3.652	−7.206	7.204	Dependent variable: $R T^{S_{i}}$
f	−1.489	0.193	−1.868	−1.109	^a. R squared = 1 − (Residual Sum of Squares)/(Corrected Sum of Squares) = 0.933.
g	0.007	0.003	0.002	0.013
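
The footnote formula of Table 7 can be checked directly against the ANOVA columns. The short sketch below simply re-enters the sums of squares and degrees of freedom printed in the table (the variable names are illustrative, not from the paper) and recomputes the reported R squared and mean squares.

```python
# Values copied from the ANOVA columns of Table 7.
regression_ss = 4_590_354.369
residual_ss = 119_952.631
uncorrected_total_ss = 4_710_307.000
corrected_total_ss = 1_800_711.355

# Degrees of freedom as listed in the table.
regression_df = 7
residual_df = 193

# Footnote formula: R^2 = 1 - (residual SS) / (corrected total SS).
r_squared = 1.0 - residual_ss / corrected_total_ss

# A mean square is a sum of squares divided by its degrees of freedom.
regression_ms = regression_ss / regression_df
residual_ms = residual_ss / residual_df

print(f"R squared: {r_squared:.3f}")
print(f"Regression mean square: {regression_ms:.3f}")
print(f"Residual mean square: {residual_ms:.3f}")
print(f"Regression SS + residual SS: {regression_ss + residual_ss:.3f}")
```

The recomputed values agree with the table to the printed precision, and the regression and residual sums of squares add up to the uncorrected total.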

Table 8. Rebalancing results for each cluster.

Cluster	$R U^{S_{i}} %$	$R T^{S_{i}} %$	$D^{S_{i}}$	Cluster		$R U^{S_{i}} %$	$R T^{S_{i}} %$	$D^{S_{i}}$
(1, 17, 39)	60.87	5.88	1405.067	(12, 47, 63, 65)		95.65	90.48	11,769.03
(2, 23, 24, 25, 37, 38, 53, 54, 55)	85.92	71.64	20,587.25	(13, 21, 49, 69)		60.11	42.31	13,992.57
(3, 4, 16, 19, 27, 31, 44, 46, 58, 59, 62, 64)	77.11	71.21	52,853.94	(18, 26, 28, 30, 42, 43, 57, 60)		74.84	77.64	24,805.19
(5, 9, 29, 40, 61)	85.75	88.81	25,434.84	(34, 35)		0.00	0.00	0.00
(6, 7, 10, 32)	0.00	0.00	0.00	(36, 41, 56)		100.00	100.00	8795.043
(8, 11, 14, 15, 20, 22, 33, 48, 51, 52)	68.29	80.90	34,926.24	(45, 50, 66, 67, 68)		90.11	82.08	20,832.33
Total system improvement			$R U^{P S}$ :	78.4%	$R T^{P S}$ :	75.2%	$D^{P S}$ :	215,401.49
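
The system-level distance in the last row of Table 8 appears to be the sum of the per-cluster distances. This is an inference from the numbers rather than something the table states, and the sketch below only checks that arithmetic; the dictionary name is illustrative.

```python
# Per-cluster travel distances D^{S_i} copied from Table 8,
# keyed by the node indices of each cluster.
cluster_distance = {
    (1, 17, 39): 1405.067,
    (2, 23, 24, 25, 37, 38, 53, 54, 55): 20587.25,
    (3, 4, 16, 19, 27, 31, 44, 46, 58, 59, 62, 64): 52853.94,
    (5, 9, 29, 40, 61): 25434.84,
    (6, 7, 10, 32): 0.00,
    (8, 11, 14, 15, 20, 22, 33, 48, 51, 52): 34926.24,
    (12, 47, 63, 65): 11769.03,
    (13, 21, 49, 69): 13992.57,
    (18, 26, 28, 30, 42, 43, 57, 60): 24805.19,
    (34, 35): 0.00,
    (36, 41, 56): 8795.043,
    (45, 50, 66, 67, 68): 20832.33,
}

# System-level distance reported in the "Total system improvement" row.
reported_total = 215_401.49

# Assumption: D^{PS} is the plain sum of the per-cluster distances.
computed_total = sum(cluster_distance.values())

# Every node index should appear in exactly one cluster.
all_nodes = [node for cluster in cluster_distance for node in cluster]

print(f"Computed total distance: {computed_total:.2f}")
print(f"Difference from reported total: {abs(computed_total - reported_total):.2f}")
print(f"Nodes covered: {len(all_nodes)}, distinct: {len(set(all_nodes))}")
```

The computed sum matches the reported total up to rounding of the per-cluster entries.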

Table 9. Results of bike-sharing rebalancing from 1 to 9 May 2019.

NUM	$R T^{P S}$	$R U^{P S}$	$D^{P S}$	CPU
Day 1	16.5%	15.7%	154,267.23	37.42
Day 2	15.6%	15.1%	148,761.58	36.53
Day 3	13.2%	14.6%	148,963.23	38.31
Day 4	13.5%	13.9%	156,523.45	36.47
Day 5	21.4%	18.3%	214,387.65	37.94
Day 6	22.1%	19.3%	208,651.34	37.16
Day 7	21.8%	18.9%	218,764.32	38.34
Day 8	22.3%	20.4%	220,432.26	36.51
Day 9	23.4%	19.7%	215,401.49	38.65

Table 10. Comparison of time complexity of clustering algorithms.

NUM	DGBCAMDTs		k-Means		DBSCAN
NUM	Silhouette Coefficient	CPU	Silhouette Coefficient	CPU	Silhouette Coefficient	CPU
Day 1	0.512	17.32	−0.067	4.32	0.123	16.14
Day 2	0.532	16.52	−0.054	4.52	0.135	14.45
Day 3	0.496	16.13	−0.063	4.37	0.147	15.64
Day 4	0.393	15.95	−0.046	4.69	0.152	16.55
Day 5	0.558	15.77	0.122	5.52	0.313	13.98
Day 6	0.613	15.63	0.143	5.51	0.345	14.69
Day 7	0.578	17.65	0.153	5.87	0.319	15.67
Day 8	0.533	16.76	0.136	5.13	0.254	16.55
Day 9	0.571	17.23	0.172	5.33	0.298	15.99

Table 11. Comparison of time complexity of rebalancing algorithms.

NUM	Our Method				Ant Colony Algorithm				Genetic Algorithm
NUM	$R T^{P S}$	$R U^{P S}$	$D^{P S}$	CPU	$R T^{P S}$	$R U^{P S}$	$D^{P S}$	CPU	$R T^{P S}$	$R U^{P S}$	$D^{P S}$	CPU
Day 1	16.5%	15.7%	154,267.23	37.42	14.7%	13.6%	169,765.44	41.25	13.6%	14.2%	160,477.64	40.65
Day 2	15.6%	15.1%	148,761.58	36.53	14.9%	13.4%	167,123.56	39.64	14.6%	13.2%	160,123.63	37.92
Day 3	13.2%	14.6%	148,963.23	38.31	12.3%	13.5%	165,236.45	39.55	13.0%	12.9%	157,222.79	41.32
Day 4	13.5%	13.9%	156,523.45	36.47	13.1%	13.2%	169,876.52	37.56	13.2%	13.4%	161,123.64	36.95
Day 5	21.4%	18.3%	214,387.65	37.94	19.6%	17.9%	216,354.23	41.32	17.6%	17.2%	222,366.45	37.96
Day 6	22.1%	19.3%	208,651.34	37.16	20.6%	18.9%	221,124.33	44.65	21.1%	17.9%	210,119.85	37.54
Day 7	21.8%	18.9%	218,764.32	38.34	19.8%	18.5%	221,796.57	40.12	20.8%	17.6%	222,381.46	38.95
Day 8	22.3%	20.4%	220,432.26	36.51	19.6%	20.0%	224,417.65	37.13	21.5%	20.2%	225,445.12	43.78
Day 9	23.4%	19.7%	215,401.49	38.65	20.9%	18.2%	228,725.63	38.95	20.4%	18.3%	227,236.55	42.67

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Sun, J.; He, Y.; Zhang, J. A Cluster-Then-Route Framework for Bike Rebalancing in Free-Floating Bike-Sharing Systems. Sustainability 2023, 15, 15994. https://doi.org/10.3390/su152215994
