*3.4. Batches Prediction*

A bike-sharing system in most cities is built in several steps (batches). First, the government sets up a large number of bike stations in the downtown area, where many commercial buildings and tourist attractions are located, and then spreads out to nearby regions in the following months, perhaps after a short lull. As usage frequency and the number of new users increase, the government needs to deploy stations over a wider area to satisfy users' demand, so the system expands to the suburbs, and even to districts in the city center that still lack stations, to relieve excess demand.

**Definition 6.** *Batches Prediction. Our work focuses on batch prediction, that is, prediction for sites established in later stages in the suburbs or border zones, which are also referred to as expansion areas in this paper. We use the EMA (Exponential Moving Average) to determine the periods of batches over a continuous time interval. The EMA is a type of moving average whose weighting factors decrease exponentially for older observations. We say that a batch exists if the EMA values of monthly demand stay at or above a given threshold for several consecutive months. Figure 5 shows the EMA curves computed with averaging windows of 2, 3, and 6 months. For example, if we set the threshold to 30 using the two-month EMA for New York City, we can identify three batches (peaks) from 2013 to 2018. The corresponding periods of the first, second, and third batches of NYC are shown in Table 4. Given a set of locations, our framework provides the government with demand estimates for newly established stations, and it can also be applied to the expansion of other facilities.*
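The batch rule in Definition 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the exact procedure used in the paper: the text does not specify the smoothing factor or how many months count as "several", so the sketch assumes the common convention $\alpha = 2/(N+1)$ for an $N$-month window, seeds the EMA with the first observation, and exposes a hypothetical `min_months` parameter. The function names `ema` and `find_batches` and the sample series are likewise illustrative and are not real data.

```python
from typing import List, Sequence, Tuple


def ema(values: Sequence[float], span: int) -> List[float]:
    """Exponential moving average of a monthly series.

    Assumption: smoothing factor alpha = 2 / (span + 1), with the
    first observation used as the initial EMA value.
    """
    if span < 1:
        raise ValueError("span must be >= 1")
    alpha = 2.0 / (span + 1)
    smoothed: List[float] = []
    for x in values:
        if not smoothed:
            smoothed.append(float(x))
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def find_batches(
    values: Sequence[float],
    span: int,
    threshold: float,
    min_months: int,
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) month indices of each batch.

    A batch is a run of at least `min_months` consecutive months whose
    EMA is not less than `threshold`. `min_months` is a hypothetical
    parameter standing in for "several months" in the definition.
    """
    smoothed = ema(values, span)
    batches: List[Tuple[int, int]] = []
    start = None  # index where the current above-threshold run began
    for i, v in enumerate(smoothed):
        if v >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_months:
                batches.append((start, i - 1))
            start = None
    # Close a run that is still open at the end of the series.
    if start is not None and len(smoothed) - start >= min_months:
        batches.append((start, len(smoothed) - 1))
    return batches


# Illustrative monthly values only (not taken from any dataset).
monthly = [0, 0, 60, 60, 60, 0, 0, 0, 60, 0, 0]
batches = find_batches(monthly, span=2, threshold=30, min_months=2)
```

With these made-up values, the three-month surge is reported as one batch, while the isolated one-month spike is ignored because its EMA stays above the threshold for only a single month. Raising `span` smooths the curve further, which tends to merge nearby peaks and shorten or remove brief ones.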

**Figure 5.** EMA of each month.

**Table 4.** Data source and detailed contents.


In this work, we mainly use XGBoost [15] to make predictions for each batch. Other machine learning approaches can also be applied within our framework, and we compare their effectiveness in our experiments.
