#### **3. Methodology**

Similar to the solution for the minimum fleet-size problem, the purpose of this study is to improve the operational efficiency of a shared bike system by constructing a shared bike trip chain. In areas with high cycling requirements, it is not always necessary to supply more bikes. If the number of cycling-in bikes is always greater than the number of cycling-out bikes, then the demand does not exceed the supply. The more bikes there are in a system, the greater the inefficiency of the shared bikes. As shown in Figure 1, there are six consecutive cycling trips among the three sites. In the ideal scenario, one bike at site A is sufficient for all trips. In the oversupply scenario, by contrast, two bikes are placed at each site, and the six trips may be completed by up to six different bikes. No matter how many different bikes are used, however, the bike stock at site A is always greater than 1, and the numbers of bikes at sites B and C are always greater than 2. When the volume of shared bikes is greater than the cycling requirement, bikes remain unused, and road space is wasted. If, within a given time interval and spatial range, the number of bikes in stock is always greater than zero (disregarding the possibility of damaged bikes), then the supply is greater than the demand, and no additional bikes are needed. The key to improving the self-organization process of cycling is to fix the initial positions of the shared bikes at the optimal positions.

**Figure 1.** Bike movement and stocks in different scenarios: (1) only one bike at site A and (2) two bikes at each site.

Fixed boundaries are not suitable for shared bikes because of the random nature of user behavior and the unrestricted parking of station-free bikes. Therefore, we propose a heuristic bike optimization algorithm (HBOA). The core concept of the HBOA is to use the fewest bikes to meet all cycling requirements. The principle of using shared bikes is "first come, first served". If the ending position of one trip is close to the starting position of another trip, and the ending time of the former and the starting time of the latter are continuous in time, then, in theory, the same bike can be used for both trips.

To obtain a more reasonable number of optimized bikes, we set the minimum time interval between the ending time of the last trip and the starting time of the next trip to 10 min, and the maximum Euclidean distance between the ending position of the last trip and the starting position of the next trip to 100 m. That is, after completing a trip, the optimized bike serves the closest trip at that time within 100 m of the ending position. Finally, the number of optimized bikes can be considered the ideal delivery scale of shared bikes for meeting all cycling requirements. The initial positions of these bikes can also be considered an optimal configuration for delivering or dispatching the shared bikes.

The calculation process of the HBOA is shown in Figure 2. We denote the data from all valid cycling trips as data set C, in which each trip Ci carries O, D, Ts, and Te information: O is the origin position of trip Ci, D is its destination position, Ts is its starting time, and Te is its ending time. First, one of the earliest cycling trips is selected randomly and recorded as the first trip of an optimized bike, Bj,m(O, D, Ts, Te), where j = 1 and m = 1. Then, the trips within 100 m of Bj,m(D) are searched, and the closest trip at the given starting time is identified as the next trip, Bj,m+1. This process continues until no further trip can be identified for this optimized bike. The search then returns to the earliest trips in the unmarked cycling data set, and the first trip of a new optimized bike is identified as Bj,m(O, D, Ts, Te), where j = j + 1 and m = 1. Its subsequent trips are identified in the same way. The search is repeated until every trip is marked as a trip of some optimized bike. The result of this algorithm is not unique. However, considering the size of the data set and the aim of the HBOA, the result does not need to be the best solution to improve the usage efficiency of shared bikes. The time-space distribution characteristics of these optimized bikes can be used as a configuration reference for initial bike delivery.
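The chaining procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Trip` class, the function name `hboa`, and the use of projected coordinates in metres are assumptions made for illustration. Two further assumptions are made where the text is open to interpretation: the 10 min interval is read as a required gap between the end of one trip and the start of the next, and "the closest trip" is read as the candidate with the earliest starting time (ties broken by distance). The first trip of each bike is chosen deterministically rather than randomly so that the example is reproducible.

```python
import math
from dataclasses import dataclass


@dataclass
class Trip:
    origin: tuple    # O: (x, y) in metres (projected coordinates, assumed)
    dest: tuple      # D: (x, y) in metres
    t_start: float   # Ts: minutes since midnight
    t_end: float     # Te: minutes since midnight


def hboa(trips, max_dist=100.0, min_gap=10.0):
    """Greedily chain trips into optimized bikes.

    Returns a list of chains; each chain lists the trip indices served by one bike.
    """
    # Unmarked trips, ordered by starting time (earliest first).
    unassigned = sorted(range(len(trips)), key=lambda i: trips[i].t_start)
    chains = []
    while unassigned:
        current = unassigned.pop(0)  # earliest unmarked trip starts a new bike
        chain = [current]
        while True:
            last = trips[current]
            # Candidate next trips: start near the last destination, late enough.
            candidates = [
                i for i in unassigned
                if trips[i].t_start >= last.t_end + min_gap
                and math.dist(last.dest, trips[i].origin) <= max_dist
            ]
            if not candidates:
                break
            # Assumed tie-breaking: earliest start first, then shortest distance.
            current = min(
                candidates,
                key=lambda i: (trips[i].t_start,
                               math.dist(last.dest, trips[i].origin)),
            )
            unassigned.remove(current)
            chain.append(current)
        chains.append(chain)
    return chains


# Six consecutive trips among three sites, as in the ideal scenario of Figure 1.
# Site coordinates and times are hypothetical.
A, B, C = (0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)
trips = [
    Trip(A, B, 0.0, 10.0),
    Trip(B, C, 20.0, 30.0),
    Trip(C, A, 40.0, 50.0),
    Trip(A, B, 60.0, 70.0),
    Trip(B, C, 80.0, 90.0),
    Trip(C, A, 100.0, 110.0),
]
chains = hboa(trips)
print(len(chains))  # 1: a single bike starting at site A serves all six trips
```

The number of chains corresponds to the number of optimized bikes, and the origin of the first trip in each chain corresponds to the initial position of that bike.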

**Figure 2.** The calculation process of the HBOA.

#### **4. Study Area and Data Preprocessing**

Shenzhen, the youngest megacity in China, was founded only 40 years ago. By the end of 2017, the city had 12.52 million people in an area of 1997.27 km<sup>2</sup> [38]. According to a report, there were approximately 10 shared bike companies in Shenzhen with approximately 890 thousand shared bikes in the market in August 2017. In September, a new shared bike policy was released by the Shenzhen government that suspended the launch of new shared bike systems in the city [39].

Through the API ports of shared bike apps, the positions of all vacant bikes are given in real time. Therefore, we scanned the positions of vacant bikes for two companies, Ofo and Mobike, which account for more than 80% of the shared bike market. Limited by the app client, we only obtained 2 days of scanning data from 6–7 May 2018. These dates fall on a Sunday and Monday, representing non-working and working days. The weather conditions were similar on these two days, with sporadic light rain. We found approximately 306 thousand different Mobike bikes and 434 thousand Ofo bikes by scanning the entire city, accounting for over 80% of the total number of shared bikes.

Because it took approximately ten minutes to scan the entire city, the scanning interval was ten minutes. By comparing the positions of the vacant bikes at different times, we can determine whether a bike moved and obtain the origin-destination positions and trip times. Correspondingly, we can obtain the Euclidean distance and speed of these trips. However, there may be two types of data errors. The first type is equipment error. According to an actual test, the GPS error for a vacant bike returned to the same position can reach approximately 100 m. The second type is inference error. For example, some shared bike companies use motor vehicles to manually dispatch bikes, and the inferred speed is too fast for a cycling trip in these cases. It is also possible that some bikes are missed during the scanning process, resulting in an overly long trip time. Therefore, the original data were cleaned. First, trips with Euclidean distances of less than 200 m were considered invalid, since they may reflect GPS error or distances for which walking is a more reasonable alternative. Second, trips with an average speed greater than 25 km/h were removed, because they may involve the manual dispatching of bikes by motor vehicles instead of normal cycling. Other trips with low speeds cannot be distinguished from normal cycling and were retained for use in the HBOA. After cleaning, only 640 thousand valid movements remained, and the average number of uses per bike was less than one. Nearly 340 thousand shared bikes did not move in the two days.
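The two cleaning rules can be expressed as a simple filter. The following Python sketch is illustrative only: the function name `is_valid_trip` and the representation of positions as projected coordinates in metres are assumptions, while the 200 m and 25 km/h thresholds are those stated above.

```python
import math


def is_valid_trip(origin, dest, minutes, min_dist_m=200.0, max_speed_kmh=25.0):
    """Return True if an inferred movement passes both cleaning rules.

    origin, dest: (x, y) positions in metres (projected coordinates, assumed).
    minutes: inferred trip time between the two scans.
    """
    if minutes <= 0:
        return False  # no usable trip time
    dist_m = math.dist(origin, dest)  # Euclidean distance
    if dist_m < min_dist_m:
        return False  # rule 1: too short (GPS error or walking distance)
    speed_kmh = (dist_m / 1000.0) / (minutes / 60.0)
    return speed_kmh <= max_speed_kmh  # rule 2: faster trips suggest dispatching


# Hypothetical movements inferred from consecutive scans.
print(is_valid_trip((0.0, 0.0), (150.0, 0.0), 10.0))   # False: shorter than 200 m
print(is_valid_trip((0.0, 0.0), (1500.0, 0.0), 10.0))  # True: 9 km/h
print(is_valid_trip((0.0, 0.0), (6000.0, 0.0), 10.0))  # False: 36 km/h
```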

In addition, two types of databases were used in this paper, as shown in Figure 3. One database includes the transportation routes in Shenzhen in 2018, as well as the metro stations and bus stations. The other database includes building information from 2015, such as outline and usage information for residential buildings, urban village buildings, industrial buildings, commercial buildings, official buildings, and others. Among these buildings, urban village buildings are a special type of low-cost residential building in Shenzhen. These data help us further analyze the temporal and spatial distribution characteristics of optimized bike use. Over the past three decades, the urban area of Shenzhen has gradually transformed from a belt shape within the original Special Economic Zone (including Luohu, Futian, Nanshan and Yantian districts) into an outward radial shape, which, to some extent, reflects a multi-center development pattern [40]. Six central areas were selected for comparison with the spatial distribution of shared bikes. Three of them are public service centers: the Baoan, Futian and Luohu center areas. Two are commercial centers: the Nanshan and Huaqiang center areas. One is an official employment center, the High-tech center area in Nanshan district.

**Figure 3.** The transportation and building information in Shenzhen.

#### **5. Results and Discussion**

#### *5.1. Optimized and Actual Bike Availability*

The HBOA indicated that only 137,216 bikes were needed to complete all valid trips on 6 May 2018, and 154,625 bikes were needed on 7 May 2018. The average number of uses per optimized bike was 4.6 and 4.2 on the two days, respectively. Overall, less than 1/5 of all shared bikes were used.

As shown in Figure 4, there are bikes in almost every land unit (200 m × 200 m) in Shenzhen's built environment. However, in over 99% of these units, the actual number of bikes is higher than the number of optimized bikes, which indicates that the supply is higher than the demand. In particular, the number of bikes in the central area exceeds the number of optimized bikes by more than 100.

**Figure 4.** Difference between the actual number of available bikes and the number of optimized bikes in two days.

We took all the exits of the Houhai metro station as an example. Houhai metro station is located in the central area of Nanshan district of Shenzhen; it is surrounded by commercial and residential buildings and is close to some popular public spaces, such as Shenzhen Bay Stadium, Shenzhen Bay Park and Shenzhen Talent Park. We counted the shared bikes cycling in or out within 100 m of the metro station exits every 10 min on 6 May 2018. There were 883 cycling-in and 997 cycling-out bikes. The initial number of shared bikes around this station was 735 at 0:00 a.m., and more than 500 bikes were available at all times during the 24 h. As shown in Figure 5, this station had a serious oversupply issue.
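The stock calculation behind this example can be sketched as a running balance of cycling-in and cycling-out bikes. The Python snippet below is a minimal illustration: the function name `stock_series` and the per-interval flows are hypothetical, and only the daily totals (735 initial bikes, 883 cycling-in and 997 cycling-out) come from the text.

```python
def stock_series(initial_stock, flows):
    """Return the bike stock after each interval.

    flows: list of (cycling_in, cycling_out) counts, one pair per 10 min interval.
    """
    stock = initial_stock
    series = []
    for cycling_in, cycling_out in flows:
        stock += cycling_in - cycling_out
        series.append(stock)
    return series


# Hypothetical 10 min flows around a site with five bikes at the start.
series = stock_series(5, [(1, 2), (0, 1), (3, 1), (2, 2)])
print(series)               # [4, 3, 5, 5]
print(min([5] + series))    # 3: bikes that were never needed in this period

# Daily totals reported for Houhai station, treated as a single interval.
print(stock_series(735, [(883, 997)])[-1])  # 621 bikes at the end of the day
```

The minimum of the series is the number of bikes that remained idle throughout the period, which is the sense in which a stock that never approaches zero indicates oversupply.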

**Figure 5.** Cycling-in and cycling-out bikes and stock changes around the Houhai metro station every 10 min.

The HBOA results showed that only 219 optimized bikes are needed around the Houhai station exits. The idling of a large number of bikes is a waste of resources and road space. High cycling requirements do not necessarily correspond to a need for more bikes, especially in areas where the cycling requirements are self-balanced by user activities. Shared bike companies tend to deliver more bikes to areas with high cycling requirements to occupy the market. However, in areas with a higher frequency of use, if cycling in and cycling out reach equilibrium, more delivery means less efficiency. It is more worthwhile to examine where the bikes heading to these areas come from. As mentioned earlier, the key to improving the self-organization process of cycling is to fix the initial positions of the shared bikes at the optimal positions. Therefore, in the next section, we compare the spaces with high cycling requirements with the spatial distribution of the optimized bikes' initial positions.

#### *5.2. Spatial Requirements of Cycling and Spatial Distribution of Optimized Bikes' Initial Positions*

We used kernel density estimation to compare the requirement space and the ideal supply space of shared bikes. We defined the distribution of the origin positions of all valid trips as the requirement space of cycling, and the distribution of the initial positions of all optimized bikes as the ideal supply space. As shown in Figure 6, the requirement spaces are similar on working days and non-working days, with a correlation coefficient of 0.942 (*p* < 0.001). The ideal supply spaces of optimized bikes on working days and non-working days are also highly correlated (0.862, *p* < 0.001). We selected the areas with expected values of greater than 25 uses per hectare as high requirement areas and those with expected values of greater than 5 bikes per hectare as the high supply areas for optimized bikes. These areas coincide with or are adjacent to the central areas of each district in Shenzhen.
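The comparison of two density surfaces can be sketched as follows. The kernel, bandwidth, and grid used in the study are not stated, so the Gaussian kernel, the 200 m bandwidth, the grid, and the sample points below are all assumptions made for illustration; the function names are hypothetical, and the significance test is not reproduced.

```python
import math


def kernel_density(points, grid, bandwidth):
    """Gaussian kernel density (events per square metre) at each grid point."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    surface = []
    for gx, gy in grid:
        total = 0.0
        for px, py in points:
            d2 = (gx - px) ** 2 + (gy - py) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        surface.append(norm * total)
    return surface


def pearson(a, b):
    """Pearson correlation coefficient between two equally sized surfaces."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("correlation is undefined for a constant surface")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical trip origins (metres) on a non-working day and a working day.
non_working = [(100.0, 100.0), (150.0, 120.0), (800.0, 800.0)]
working = [(110.0, 90.0), (160.0, 140.0), (820.0, 780.0), (500.0, 500.0)]

# Evaluation grid: cell centres of a 5 x 5 grid of 200 m x 200 m land units.
grid = [(100.0 + 200.0 * i, 100.0 + 200.0 * j) for i in range(5) for j in range(5)]

surface_a = kernel_density(non_working, grid, bandwidth=200.0)
surface_b = kernel_density(working, grid, bandwidth=200.0)

# One hectare is 10,000 square metres, so this converts to events per hectare.
per_hectare = [value * 10_000 for value in surface_a]
high_requirement_cells = [g for g, v in zip(grid, per_hectare) if v > 25]

print(pearson(surface_a, surface_b))
```

Thresholding the per-hectare surface, as in the last step, corresponds to how the high requirement and high supply areas are selected.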

**Figure 6.** Spatial requirements of cycling and the spatial distribution of optimized bikes.

Overlay analysis is applied to these spaces, including overlays of the transportation and building data (Figure 7). The overlay results suggest that: (1) the areas with high cycling requirements are largely consistent with the central areas of the city; except for the central area of Luohu, the high requirement areas basically contain the central areas; (2) most areas with high cycling requirements do not coincide with the high supply spaces but are adjacent to them, as in the Baoan and Futian central areas; and (3) some areas have stable high demand and high supply on both working and non-working days, especially in Nanshan district. It is easy to understand that central areas generate many cycling requirements because of their high vitality. Owing to their non-residential character and their attraction for the surrounding areas, a large number of cycling-in bikes can meet the cycling requirements without a large supply of shared bikes. One distinguishing feature of Nanshan district, compared with the other central areas, is that it has fewer metro stations and lines than other districts. However, it is still difficult to explain why some areas have a more stable supply demand than others. These areas deserve the most attention, because bikes initially placed in them would be used more efficiently. In the next section, we focus on the initial position of each optimized bike and its surrounding traffic and built environment.

**Figure 7.** Built environments in six central areas with high requirement or supply spaces for shared bikes.

#### *5.3. The Temporal and Spatial Characteristics of the Initial Position of Each Optimized Bike*

In this section, the temporal and spatial characteristics of the initial positions of all optimized bikes are discussed. There are two main reasons for assigning an optimized bike: the departure time of cycling out is relatively early, or the number of cycling-in bikes cannot meet the demand for cycling out. Therefore, examining the initial departure times of these optimized bikes and their surrounding built environment could help us better understand their supply needs. To simplify the statistics, we set a simple proximity priority for optimized bikes. First, optimized bikes close to a public transport facility are regarded as transfer demand, so public transportation takes priority over buildings. Metro connections are assumed for bikes within 100 m of any metro station exit, and bus connections are assumed for bikes within 50 m of any bus station. Finally, each remaining unassigned bike is assumed to be related to the building closest to it.
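The proximity priority can be sketched as a short classification routine. This is an illustrative sketch rather than the procedure used in the study: the function name `classify_bike` is hypothetical, the metro check is assumed to precede the bus check (following the order in the text), and buildings are represented by a single point each, whereas the building database contains outlines.

```python
import math


def classify_bike(pos, metro_exits, bus_stations, buildings,
                  metro_radius=100.0, bus_radius=50.0):
    """Assign an optimized bike's initial position to a nearby facility type.

    pos, metro_exits, bus_stations: (x, y) positions in metres (assumed).
    buildings: list of ((x, y), usage) pairs, one representative point each.
    """
    # Priority 1: within 100 m of any metro station exit.
    if any(math.dist(pos, exit_pos) <= metro_radius for exit_pos in metro_exits):
        return "metro"
    # Priority 2: within 50 m of any bus station.
    if any(math.dist(pos, stop) <= bus_radius for stop in bus_stations):
        return "bus"
    # Priority 3: usage type of the closest building.
    if not buildings:
        return "unknown"
    nearest = min(buildings, key=lambda b: math.dist(pos, b[0]))
    return nearest[1]


# Hypothetical surroundings of three optimized bikes.
metro_exits = [(80.0, 0.0)]
bus_stations = [(500.0, 30.0)]
buildings = [((490.0, 0.0), "commercial"), ((1000.0, 0.0), "urban village")]

print(classify_bike((0.0, 0.0), metro_exits, bus_stations, buildings))    # metro
print(classify_bike((500.0, 0.0), metro_exits, bus_stations, buildings))  # bus
print(classify_bike((950.0, 0.0), metro_exits, bus_stations, buildings))  # urban village
```

Because public transportation is checked first, the second bike is attributed to the bus station even though a commercial building is closer.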

As shown in Table 1, about 45% of optimized bikes are closest to residential buildings and urban village buildings. This is because most first trips of the day start from the residence. Interestingly, the areas near industrial buildings also have a significant need for optimized bikes. Although metro stations have higher cycling requirements, as noted in other literature [41], only 5% of optimized bikes are needed within 100 m of metro stations. This is consistent with the previous analysis in Section 5.1.


**Table 1.** The number and percentage of optimized bikes near public transportation facilities and different buildings.

Combining the nearby spatial characteristics and the temporal characteristics of the first trip of all optimized bikes, we obtained Figure 8. In addition to the morning peak at 7–9 a.m., there is also a small peak during the night from 0:00 to 1:00 a.m. This peak arises partly because the algorithm searches for the earliest trips starting from 0:00 a.m.; another reason may be the suspension of public transportation and high taxi prices during the nighttime. Another finding is that industrial buildings, like residential buildings, show both the night peak and the morning peak on working and non-working days. One possible explanation is that these factories operate a three-shift working system, which results in higher demand for optimized bikes at midnight and during the morning peak. In general, optimized bikes are mainly distributed in areas where the first cycling-out trip departs early or where the number of cycling-in bikes is less than the demand for cycling out. Correspondingly, major destinations for cycling in, such as commercial buildings and official buildings, have less demand for optimized bikes.

**Figure 8.** The nearby spatial and temporal characteristics of the first trip for all optimized bikes.

Furthermore, we compare the spatial distributions of optimized bikes in various nearby areas to identify the specific characteristics of the spatial demand for optimized bikes.

As shown in Figure 9, on working and non-working days, the spatial distribution of optimized bikes near public transportation facilities displays some spatial characteristics. The metro stations around the central areas have relatively high optimized bike demands on both working and non-working days, especially in Nanshan district. Our study found that 53.3% of the employed population in the Nanshan high-tech area lives within 5 km. However, the layout of metro lines in Nanshan district is seriously mismatched with the commuter corridor [42]. The bus lines have similar problems: they run mainly along the east-west strip, whereas the commuter corridor in Nanshan district runs north-south. The high demand for optimized bikes at these public transportation facilities shows that the direct accessibility of public transportation is poor and that more transfers are required in the last mile.

**Figure 9.** The spatial distribution of optimized bikes near public transportation facilities.

Similar to the previous analysis, we compared the spatial distribution of optimized bikes near different buildings to find the relatively stable areas with high demand for shared bikes. As shown in Figure 10a, urban village buildings next to the central areas have a significantly high demand for optimized bikes. There is no such obvious spatial characteristic for residential buildings (Figure 10b), except for the buildings in Nanshan district. Among the industrial buildings, the Bantian industrial zone in Longhua district is a special case; it is an industrial production base for electronic information, biotechnology and new materials in Shenzhen (Figure 10c). Whether the three-shift working system is common there needs further investigation. For official and commercial buildings, there are also some particularly stable areas with relatively high demand for optimized bikes on both working and non-working days (Figure 10d,e).

**Figure 10.** Spatial distribution of optimized bikes near different buildings.

Due to data limitations, we only analyzed the spatial distribution of optimized bikes on two days and found that the results exhibited high consistency on both working and non-working days. If the proposed algorithm were applied to long-term data, more spatial characteristics might be identified to help us understand the complementary relationship between public transportation and shared bikes, or to guide more scientific and effective delivery of shared bikes.
