## **1. Introduction**

In recent years, congestion, air pollution, climate change, energy scarcity, and physical inactivity have led to increasing importance being attributed to sustainable transport modes, and in particular to cycling. Municipalities have drawn attention to these issues and have started to implement different strategies to encourage greater use of bicycles on urban streets and to reduce car trips. Many cities have decided to invest in the construction of quality bikeways in order to encourage people to cycle even medium (and long) distances on a daily basis.

Several studies have found positive correlations between bike facilities and levels of bicycling. Dill and Carr [1] analyzed data from 35 large cities across the U.S. and found that cities with higher levels of bicycle infrastructure were characterized by higher levels of bicycle commuting. Pucher and Buehler [2] found that the key to achieving high levels of cycling in Dutch, Danish, and German cities appears to be the provision of separate cycling facilities.

Many researchers have studied cyclists' preferences and have estimated route choice models for planning new bicycle infrastructure. Most of these studies were based on stated preference (SP) data and on opinion surveys. Dill and Voros [3] explored the relationships between levels of cycling and demographics, objective environmental factors, perceptions of the environment, and attitudes, based on the results of a random phone survey of adults in Portland, Oregon. Stinson and Bhat [4] proposed empirical models to evaluate the importance of factors (such as travel time and pavement quality) affecting commuter bicyclists' route choices. Winters et al. [5] evaluated 73 motivators and deterrents of cycling based on a survey (telephone interviews) and a self-administered survey (either via the web or mail) of 1402 current and potential cyclists in metropolitan Vancouver. These studies highlighted the cyclists' preference for physically separated bike paths and on-road bike lanes separated by markers. The known limitation of SP studies arises from the difference between stated and observed behavior [6].

In the past, only a few studies were based on revealed preference (RP) data due to the limited availability of this data type. The results of Howard and Burns [7] show that bicycle commuters in metropolitan Phoenix respond to the provision of bicycle facilities. Nelson and Allen [8] found that each additional mile of bikeway per 100,000 people is associated with a 0.069% increase in bicycle commuting.

Data on cycling volumes are the result of choices actually made by cyclists in a real outdoor environment and therefore help to support decision making. Such information is necessary to understand which factors influence ridership. Bike flows can be collected with traditional manual or instrumental counts, both of which have drawbacks: manual counts lack spatial detail and temporal coverage [9,10], whereas instrumental, permanent counting stations provide continuous data but typically cover only a small number of road sections [11].

More recently, the widespread use of smartphones and mobile applications for self-localization and navigation has increased the availability of observed cyclists' data in the form of time series of GPS points, called GPS traces. This type of data provides detailed information about the origin and destination of trips as well as the chosen routes. Furthermore, GPS traces make it possible to determine the total deviation (detour) experienced by cyclists, expressed as the extra distance traveled with respect to the shortest possible route. Empirical data on detour rates exist, although detours have been calculated in different ways; see, for example, Pritchard et al. [12] and Griffin and Jiao [13].
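As a rough illustration of the detour concept, the following minimal sketch computes a relative detour rate for a single GPS trace. It is only one possible definition (the cited studies calculate detours in different ways), and it assumes that the length of the shortest route is already known, for example from a shortest-path search on the network model. The function names `haversine_m`, `trace_length_m`, and `detour_rate` are hypothetical and are not taken from any of the cited works.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6371000.0  # mean Earth radius (assumption: spherical Earth)


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, p)
    lat2, lon2 = map(radians, q)
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def trace_length_m(trace):
    """Length of a GPS trace: sum of distances between consecutive points."""
    return sum(haversine_m(a, b) for a, b in zip(trace, trace[1:]))


def detour_rate(trace, shortest_length_m):
    """Relative extra distance: (travelled - shortest) / shortest.

    shortest_length_m is assumed to be supplied by the caller,
    e.g. from a shortest-path computation on the road network.
    """
    if shortest_length_m <= 0:
        raise ValueError("shortest_length_m must be positive")
    return (trace_length_m(trace) - shortest_length_m) / shortest_length_m


# Hypothetical trace of three points along the equator, 0.01 degrees apart
trace = [(0.0, 0.0), (0.0, 0.01), (0.0, 0.02)]
rate = detour_rate(trace, shortest_length_m=2000.0)
```

In practice, raw GPS points are noisy, so the trace would normally be matched to network links before its length is measured; the sketch skips that step.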

The availability of GPS traces led to the development of new cyclists' route choice models. Dill collected data on bicycling behavior from 166 regular cyclists (1955 trips) in Portland, Oregon, using GPS devices [14]. This study highlighted that a well-connected network of low-traffic streets may be more effective than adding bike lanes on major streets with high volumes of motor vehicle traffic. Menghini et al. estimated a route choice model for bicyclists from a large sample of GPS observations (2498 trips) collected in Zürich, Switzerland [15]. They concluded that trip length dominates the choices of the Zürich cyclists. Hood et al. analyzed GPS traces from cyclists (366 users and 2777 trips) in San Francisco, USA, and showed a preference for separated bicycle lanes, especially among infrequent cyclists [16]. Broach et al. estimated the route choice of cyclists (164 users and 1449 trips) in the Portland metropolitan area, USA [17]. Their study confirmed that route length and slopes have a negative effect on cyclists' route choice. In addition, they found that high traffic volumes, high turn frequencies, and traffic signals are also repellent road attributes for cyclists' route choice. Zimmermann et al. estimated a link-based bike route choice model from a sample of GPS observations (103 users and 648 trips) in the city of Eugene, Oregon [18]. Their study confirmed the sensitivity of cyclists to distance, traffic volume, slopes, crossings, and the presence of bike facilities, distinguishing between average slopes above or below 4% and traffic volumes above or below 8000 vehicles per day. More recently, Bernardi et al. analyzed the GPS traces recorded by approximately 280 bicycle users throughout the Netherlands [19]; the results show a high usage of cycleway links and a preference for the shortest route among frequent cyclists. Casello and Usyukov used GPS data on cyclists' activities to estimate a generalized-cost function that reflects the cyclists' evaluation of path alternatives [20]: their model correctly predicted the revealed path choice for 65% of the examined trips.

One of the main problems with GPS data is its representativeness: data are usually contributed on a volunteer basis, so the sample of participants is not necessarily representative of the entire population [21]. This problem has been highlighted by Jestico et al. [22], who used data provided by strava.com to quantify how well crowdsourced fitness app data represent ridership through a comparison with manual cycling counts in Victoria, British Columbia. Another problem is the level of detail of the network: in many cases, the success of identifying the correct network links from GPS points is limited if the bike network model is not sufficiently detailed [23,24].
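One simple way to examine representativeness is to compare, section by section, the number of GPS traces with the number of cyclists counted manually. The sketch below computes the Pearson correlation coefficient between the two series. It is an illustrative assumption about how such a comparison could be set up, not the procedure of any cited study; the function name `pearson_r` and the sample values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical values: GPS traces and manual counts on the same four road sections
gps_traces = [12, 30, 45, 8]
manual_counts = [120, 310, 430, 95]
r = pearson_r(gps_traces, manual_counts)
```

A high coefficient would suggest that the GPS sample follows the spatial pattern of the counted flows, even if the sampled users differ from the general cycling population in other respects.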

This paper explains how to estimate city-wide bicycle flows and how to identify weak points of the road network in terms of bicycle friendliness. Both methods are data-driven and explicit, and they do not require the calibration of sophisticated models. Nevertheless, a route choice model is also calibrated in order to explain the reasons for deviations at certain road links.

The paper is organized as follows. Section 2 describes the study area and the features of the bike network. Section 3 depicts the bicycle flows obtained by traditional (manual and instrumental) counting methods and by GPS data collected by smartphone application, identifies a correlation between cycling counts and GPS data, and describes the bicycle flow reconstruction method. In Section 4, a deviation analysis is carried out and a Logit model is calibrated to shed more light on why individuals accept deviations, by comparing the characteristics of chosen routes and shortest routes. Section 5 discusses the results of the analysis. Concluding remarks and future research directions are presented in Section 6.

## **2. Dataset Description**

## *2.1. Study Area*

Bologna is a northern Italian city with approximately 390,000 inhabitants [25]. The climate is suitable for cycling all year, with an annual average temperature slightly below 15 °C and low rainfall (about 700 mm of rain per year and 74 days of rain per year). Figure 1 shows an overview map locating the city of Bologna, including a zoom on the city center (see box in Figure 1, bottom left).

The home-to-work bicycle mode share was 8.2% in 2011 [26], which is relatively high compared with other medium to large Italian cities [27]. Nevertheless, car ownership equals 0.515 cars per inhabitant [25], which corresponds to 0.97 cars per household. Based on a survey carried out by TNS opinion & social network in the 28 member states of the European Union between the 11th and 20th of October 2014, the average bicycle mode share was 8.0% [28], whereas in Italy the percentage of people who frequently commute by bike was approximately 4.7% in 2017 [27]. In addition, the major Italian municipalities show an average car ownership of 0.616 per person and a motorcycle ownership of 0.132 per person [29].

**Figure 1.** Location of Bologna in the Italian context.
