**5. Discussion**

Regarding the estimation of all bicycle flows on the network, the agreement between flows from GPS traces and manual/instrumental counts (R<sup>2</sup> = 0.73) is considerably higher than in previous studies; e.g., Jestico et al. [22] obtained an R<sup>2</sup> of 0.4 for the a.m. peak period. This difference is likely due to the more detailed network model of Bologna, which better represents the cyclists' freedom to move on all possible links in both directions. Based on this correlation, one crowdsourced cyclist corresponds on average to 59 cyclists counted with traditional methods, which is consistent with previous findings in [22].
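As an illustration of how such a scaling factor and R<sup>2</sup> can be obtained from paired link flows, the following is a minimal sketch. It assumes a proportional relation without intercept between GPS-derived and counted flows, which is not necessarily the regression used in this study. The function name and the input values are hypothetical.

```python
import numpy as np


def scaling_factor_and_r2(gps_flows, counted_flows):
    """Fit counted = k * gps by least squares (no intercept: an assumption).

    Returns the scaling factor k and the coefficient of determination,
    computed against the mean of the counted flows.
    """
    x = np.asarray(gps_flows, dtype=float)
    y = np.asarray(counted_flows, dtype=float)
    k = float(x @ y / (x @ x))                 # closed-form least-squares slope
    residuals = y - k * x
    ss_res = float(residuals @ residuals)      # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())  # total variation
    return k, 1.0 - ss_res / ss_tot


# Hypothetical flows on three count sections (not data from the study).
k, r2 = scaling_factor_and_r2([1, 2, 3], [59, 118, 177])
```

With perfectly proportional inputs, as in this made-up example, `k` equals the ratio between counted and crowdsourced cyclists and `r2` equals 1; real count data would scatter around the fitted line.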

Although crowdsourced cyclists represent a small portion of all cyclists, the flows obtained from the map-matched GPS data are consistent with the observed flows on the main sections of the Bologna cycle network. Only a few outliers emerge, most likely due to two potential error sources: (1) the sampling days of the two methods do not entirely overlap, because the GPS records were registered during the whole month of May, while the traffic counts were conducted for only two weeks of the same month; (2) the sampling hours do not exactly overlap either, because the GPS traces are selected by their start time, while the traditional methods count the cyclists actually passing the road section during the analyzed time interval.

The deviation metric applied in Section 4 quantifies the total deviations of cyclists generated by single road links, but the metric itself does not identify the reasons for the deviations. However, the radial roads with high bike flows and high deviations in Figure 10 are evidently characterized by an absence of reserved bike lanes, a high level of bus traffic (often on reserved bus lanes) and a high density of intersections. In contrast, the roads where bicycle flows are high but deviations are low are characterized by low motorized traffic volumes, a high share of bike lanes and the absence of major bus routes.

Nevertheless, the total deviation metric depends on the presence of alternative routes to the shortest route and on their respective road attributes. If there are no feasible route alternatives that avoid a certain link, the total deviation metric of that link is zero, even though its attributes may be unfavourable. If the shortest route has favourable link attributes but the alternative has even more favourable ones, the total deviation metric is high despite the good conditions on the shortest route. The former case is the more severe one, as critical conditions on roads that are unfavourable for cyclists but have no route alternatives remain undiscovered by the deviation analysis.

The calibration of a binomial logit model makes it possible to determine the choice probability, given the attributes of two non-overlapping route alternatives. The attributes distance, share of exclusive bikeways and share of low-priority roads have been found to be significant. The negative sign of β1 and the positive sign of β2 are expected, as a longer distance is a disincentive and the presence of an exclusive bikeway is an incentive for cyclists. The negative sign of β3, related to the share of low-priority roads, seems to contradict the previous analyses, where the average route on deviations contains a lower share of low-priority roads than routes where the shortest and chosen routes overlap. One explanation could be the way the alternative routes are generated: the second shortest route between two points probably uses minor roads which cyclists would typically avoid, or cyclists are simply not aware of such alternatives. Road priorities are determined by a heuristic algorithm based on OSM attributes such as speed limits, number of lanes and access restrictions. It is possible that some types of low-priority roads are in reality not attractive to cyclists. One way to avoid such problems would be to consider only alternative routes used by at least one cyclist instead of the adopted route generation method. Another solution would be to specify a model which quantifies the probability of accepting a route by considering only the link attributes of the chosen routes, enhanced by a dummy variable indicating whether the chosen route is also the shortest route. Such a model would entirely avoid the problem of alternative route generation. However, modelling errors could be introduced by ignoring all attributes of alternative routes.
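To make the role of the three coefficients concrete, the following is a minimal sketch of a binomial logit choice probability. It assumes a utility that is linear in the attribute differences between the two routes, with shares expressed as fractions between 0 and 1. The function name and the coefficient values are hypothetical placeholders that only reproduce the signs discussed above; they are not the calibrated estimates.

```python
import math


def choice_probability(d_distance_km, d_bikeway_share, d_low_priority_share,
                       beta1, beta2, beta3):
    """Probability of choosing route 1 over route 2 in a binomial logit.

    All d_* arguments are attribute differences (route 1 minus route 2).
    A linear-in-differences utility is assumed for illustration.
    """
    v = (beta1 * d_distance_km
         + beta2 * d_bikeway_share
         + beta3 * d_low_priority_share)
    return 1.0 / (1.0 + math.exp(-v))


# Placeholder coefficients with the signs reported in the text
# (beta1 < 0, beta2 > 0, beta3 < 0); the magnitudes are invented.
BETA1, BETA2, BETA3 = -1.0, 2.0, -0.5

p_equal = choice_probability(0.0, 0.0, 0.0, BETA1, BETA2, BETA3)
p_longer = choice_probability(1.0, 0.0, 0.0, BETA1, BETA2, BETA3)
p_bikeway = choice_probability(0.0, 1.0, 0.0, BETA1, BETA2, BETA3)
```

Two routes with identical attributes are chosen with equal probability; making route 1 longer lowers its probability, while giving it a higher share of exclusive bikeways raises it.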

The result from Equation (3) relates the differences in road attributes between two alternative routes to the distance that would compensate for those differences. The meaning of this result can be illustrated by a simple numerical example: assume two route alternatives, one with a 100% exclusive bikeway and the other without any bikeway, both without priority roads. In such a case, the first alternative could be 3.2 km longer than the second while still attracting 50% of the cyclists.
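The logic behind this example can be sketched as follows, assuming that Equation (3) derives from a utility that is linear in the attribute differences. The symbols ΔL (difference in distance), ΔB (difference in the share of exclusive bikeways) and ΔP (difference in the share of low-priority roads) are notation introduced here for illustration, with shares expressed as fractions. Both routes attract 50% of the cyclists when the utility difference vanishes:

$$
P_1 = \frac{1}{1 + e^{-(\beta_1 \Delta L + \beta_2 \Delta B + \beta_3 \Delta P)}} = 0.5
\;\Longleftrightarrow\;
\beta_1 \Delta L + \beta_2 \Delta B + \beta_3 \Delta P = 0
\;\Longleftrightarrow\;
\Delta L = -\frac{\beta_2 \Delta B + \beta_3 \Delta P}{\beta_1}
$$

For the example above, ΔB = 1 and ΔP = 0, so the compensating distance reduces to ΔL = −β2/β1, which under these assumptions corresponds to the reported 3.2 km.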
