3.1. User Activity and Data Development
The number of OSM participants in Germany increases from year to year. To date (June 2011), more than 40,000 different members have actively contributed to the project. The three OSM object types, Nodes, Ways, and Relations, have each been generated by a slightly different number of contributors (Figure 1). Figure 1 also shows, in orange, light blue, and purple, the number of users who generated 98% of the data volume of each respective object type. Here 98% of about 74 million Nodes can be attributed to approximately 8,500 members, 98% of about 11 million Ways to approximately 7,500, and 98% of about 171,000 Relations to approximately 2,600. These numbers are based on the user recorded in the database as the last owner of each Node, Way, or Relation.
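The 98% figures above can be reproduced in principle with a simple cumulative count. The following is a minimal sketch, assuming the last owner of every object is available as a plain list of user names; the function name `contributors_for_share` and the toy data are hypothetical and not taken from the study.

```python
from collections import Counter


def contributors_for_share(last_editors, share=0.98):
    """Smallest number of users whose objects cover `share` of all objects.

    `last_editors` holds one entry per object: the user stored as its last owner.
    """
    # Objects per user, most productive users first.
    counts = sorted(Counter(last_editors).values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0
    covered = 0
    for n_users, count in enumerate(counts, start=1):
        covered += count
        if covered >= share * total:
            return n_users
    return len(counts)


# Toy example: 100 objects owned by three hypothetical users.
editors = ["anna"] * 90 + ["ben"] * 9 + ["cara"] * 1
print(contributors_for_share(editors))  # 2
```

Because only the last owner of an object is counted, earlier contributors to the same object are not reflected in such a tally.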
Figure 1.
Number of OSM Contributors in Germany from 2009 to 2011.
To give more information about the results of our analysis, we first need to introduce the three object types used in the OSM project database in greater detail. A Node is the basic object in the database and constitutes a coordinate. Ways represent lines or surface objects and consist of references to Nodes. Objects can be linked to each other via Relations. Our analysis showed a general increase in the number of OSM Node and Way objects over the past few years (2007–2011) (Figure 2a,b), which was expected given the general trend that OSM has shown in Germany in recent years.
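The three object types introduced above can be pictured with a small data model. This is an illustrative sketch only: the class and field names (`node_refs`, `Member`, `tags`, `is_closed`) are assumptions chosen for readability and do not reproduce the actual OSM database schema.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Basic object: a single coordinate."""
    id: int
    lat: float
    lon: float
    tags: dict = field(default_factory=dict)


@dataclass
class Way:
    """Line or surface object: an ordered list of references to Nodes."""
    id: int
    node_refs: list
    tags: dict = field(default_factory=dict)

    def is_closed(self):
        # A way whose first and last node coincide can describe a surface.
        return len(self.node_refs) > 2 and self.node_refs[0] == self.node_refs[-1]


@dataclass
class Member:
    """One entry of a Relation: which object is linked and in what role."""
    type: str  # "node", "way" or "relation"
    ref: int
    role: str = ""


@dataclass
class Relation:
    """Links other objects to each other."""
    id: int
    members: list
    tags: dict = field(default_factory=dict)


street = Way(id=10, node_refs=[1, 2, 3], tags={"highway": "residential"})
outline = Way(id=11, node_refs=[4, 5, 6, 4])
print(street.is_closed(), outline.is_closed())  # False True
```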
Figure 2.
(a) Development of OSM Nodes in Germany; (b) Development of OSM Ways in Germany.
However, the three-month interval used in this analysis allows further factors to be distinguished and interpreted. The data clearly show that members are more active during the summer months than during the winter months. An above-average increase in data can also be noticed at the turn of the year 2010/2011 and in spring 2011. The high proportion of new objects at these points in time can be attributed to the release of the Bing aerial images for digitization purposes. The negative trend for Ways in OSM at the beginning of 2007 is due to the API changes during that year. With these changes the data schema and representation were adjusted, which affected the total number of Ways; however, no data was lost as a result.
In prior studies the development of the German street network has only been compared to a commercial dataset (TomTom) for a period of eight months [32]. Neither definite statements about when different street types can be considered relatively complete nor a projection for the future were given. In our analysis the strongest increase in transportation-related routes in OSM for Germany to date occurred in the third quarter of 2008 (+180,000 km) (cf. Figure 3a,b). The year 2008 was also the year in which the most transport routes overall were added to the OSM database, with a total length of almost 530,000 km. Since 2008 the annual expansion has decreased, although a slight change is discernible for the first half of 2011, where the trend increases slightly again. If this tendency continues for the rest of 2011, a small increase in the total street network can be expected for this year.
Figure 3.
(a) Increase in German OSM Street Network (three-month interval); (b) Annual Increase in German OSM Street Network (2007–2011).
After gaining this first general impression of the development of the German OSM dataset, we conducted further analyses to give more detailed information about the street network. Because every country’s street network consists of several different street categories, it was necessary to consider these in our analysis. The various street categories, which can also be found on the OSM Map Features web page (http://wiki.openstreetmap.org/wiki/Map_Features), were therefore divided into four groups for the sake of clarity and to simplify research and comparison: motorways/dual carriageways, district/municipal roads, roads to/in residential areas, and other roads such as service roads and dirt/forest trails (Figure 4).
Tracing the growth of the different categories, it can be noted that from a specific point in time, most categories do not expand any further. This indicates which categories should be close to “completion” and in which categories new streets are still being added. It should be noted, though, that the TomTom dataset is a suitable reference only for street network data for car-specific navigation, that is, for three of the four categories. The “other routes” category can be compared to TomTom only to a certain degree; in this fourth category, OSM has a far more extensive street network than the commercial provider. Based on the presumptions stated above and the comparison with the corresponding TomTom category street lengths, we reached the following conclusions. First, motorways and expressways were completely recorded for Germany by the middle of 2008. Second, all municipal roads in Germany were recorded by the middle of 2009. Third, streets that are close to or within residential areas are not yet fully recorded. Fourth, at the end of 2009 there were more segments in the “other routes” class in OSM than in the total TomTom commercial dataset. Fifth, in the middle of 2010 OSM surpassed TomTom in the total number of streets recorded; however, this advantage is due to the high number of field and forest trails in OSM. Finally, most data contributions in 2011 are isolated street networks close to or within residential areas, but “other routes” data, such as forest and field trails, are also increasing.
Figure 4.
Development of OSM Street Network in Germany by Street Category (2007–2011).
The development of the individual categories in comparison to prior research results [32] and the commercial TomTom Multinet dataset from 2011 is depicted in Figure 5. The assumptions mentioned above with regard to the development and completeness of the total street network are confirmed here as well.
Figure 5.
Development of OSM Street Network in Comparison to TomTom.
In June 2011, our studies for Germany showed that OSM provided a street network for car navigation that is approximately 9% smaller than that of TomTom (Table 1). However, OSM’s total street network is approximately 27% larger than TomTom’s. In terms of pedestrian-related data and information, the OSM Germany dataset is even approximately 31% larger.
Table 1.
Total Street Length of TomTom Multinet 2011 and OSM in June 2011.
Street Network | TomTom Multinet 2011 | OSM June 2011 | Difference (%) |
---|---|---|---|
Total street network | Approximately 1,283,000 km | Approximately 1,630,000 km | OSM 27% longer street network |
Street network for car navigation | Approximately 777,000 km | Approximately 705,000 km | TomTom 9% longer street network |
Street network for pedestrian navigation | Approximately 1,185,000 km | Approximately 1,552,000 km | OSM 31% longer street network |
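The percentages in Table 1 follow from the rounded lengths if the difference is expressed relative to the TomTom length. The short sketch below illustrates this; the function name `relative_difference` is hypothetical, and the choice of TomTom as the reference is an assumption that matches the "9% smaller" wording used in the text.

```python
def relative_difference(osm_km, tomtom_km):
    """Signed difference of the OSM length relative to TomTom, in percent."""
    return (osm_km - tomtom_km) / tomtom_km * 100.0


# Rounded lengths from Table 1 as (TomTom km, OSM km).
table_1 = {
    "total": (1_283_000, 1_630_000),
    "car": (777_000, 705_000),
    "pedestrian": (1_185_000, 1_552_000),
}

for network, (tomtom_km, osm_km) in table_1.items():
    print(f"{network}: {relative_difference(osm_km, tomtom_km):+.0f}%")
# total: +27%
# car: -9%
# pedestrian: +31%
```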
In addition to the relative geometric completeness in comparison with another dataset, the internal completeness of the street network with regard to street names is also important. This factor, sometimes also referred to as attribute accuracy [30], plays a significant role in applications, such as routing, that are built on the dataset. Our results showed that approximately 16% of streets in OSM have neither a name nor a route number (e.g., A 61) that could be used for car navigation. However, these results vary significantly by street type (Figure 6).
The results clearly show that the majority of the unnamed streets are either within or close to residential areas. The “unclassified” street category could lead to confusion in this case, since it includes streets that have a linking function between villages. Another reason for this high value could be that many of these routes (e.g., country lanes) have been digitized from satellite images, so the local knowledge needed to add the name of each route is missing.
Figure 6.
Distribution of Streets without Name or Route Number Attribute Information by Street Category (June 2011).
3.2. Data Completeness and Population Density
For further, more detailed studies of the route length of the total street network, the dataset was divided into the smallest possible German administrative units: municipalities and town boundaries. By calculating the length of the route network for the different modes of transportation within these boundaries, detailed statements can be made about the data development and the relative completeness with regard to population and area.
The administrative areas used in our analysis (12,387 in total) include the number of inhabitants for the years 2008 and 2009 and were obtained from the TomTom Multinet dataset. The entire administrative area dataset is subdivided into six groups by population size. The first group (≥1,000,000) represents metropolitan areas; the second group (≥500,000 and <1,000,000) large towns; the third (≥100,000 and <500,000) towns; the fourth (≥50,000 and <100,000) medium-sized towns; the fifth (≥10,000 and <50,000) small towns; and the last (<10,000) rural towns. With regard to the entire administrative area of Germany, this means that approximately 73% of the population lives in population groups one to five, covering one third of the area of Germany. Conversely, around 27% of the population lives in population group six (rural towns) and is distributed over two-thirds of the total area of Germany.
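The six population groups defined above translate directly into a threshold lookup. The following sketch uses the boundaries and labels given in the text; the function name `population_group` and the toy inputs are hypothetical.

```python
def population_group(inhabitants):
    """Return (group number, label) for an administrative area."""
    # Lower bounds of groups one to five, checked from largest to smallest.
    thresholds = [
        (1_000_000, 1, "metropolitan area"),
        (500_000, 2, "large town"),
        (100_000, 3, "town"),
        (50_000, 4, "medium-sized town"),
        (10_000, 5, "small town"),
    ]
    for lower_bound, group, label in thresholds:
        if inhabitants >= lower_bound:
            return group, label
    return 6, "rural town"


print(population_group(1_200_000))  # (1, 'metropolitan area')
print(population_group(7_500))      # (6, 'rural town')
```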
For our analysis we considered the development of three different street networks: the total street network, the car network, and the pedestrian network (Figure 7a). The rows in Figure 7a visualize the expansion of each individual network and the percentage of new data per year. It is evident that over the past four years, the route network of the individual groups has developed in relation to their population density. While growth of the route network has slowed in the more densely populated areas, an increase in new data can still be seen in the more sparsely populated areas. It is also clearly discernible that the largest overall increase in new streets occurred in 2008.
Figure 7.
(a) Development of OSM Street Network by Town Type; (b) Relative Difference by Town Type and Street Network (June 2011).
Another aspect included in our analysis was the difference in total length of the route networks by town or municipality (Figure 7b). The results showed that the aforementioned approximately 9% of missing data is mainly distributed over the sparsely populated areas. It is also clearly discernible that OSM provides more overall data than the proprietary dataset with regard to the total and pedestrian route network length.
When expressed in route network lengths, this means that in mid-2011, OSM was still lacking approximately 3% (21,000 km) in the small-town population group and approximately 6% (48,000 km) in the rural-town population group. Using these detailed studies of the increase in street data for the different town types and the analyses of the differences in route network lengths in comparison with a commercial dataset, projections can be made of the time frame within which the dataset could be completed (at least in a relative comparison to another dataset, since neither of the datasets represents ground truth). As Figure 7b indicates, there is currently still a lack of data for less densely populated areas in OSM. Figure 7a shows the development of data by population group. According to the expansion rates in this graph, 6% (14,000 km) of new streets were added to group 5 for car navigation in 2010 and nearly 10% (33,000 km) to group 6. By mid-2011, 2% (5,000 km) of new streets were added to group 5 and slightly less than 4% (16,000 km) to group 6. This means that if the increase in new street data remains at least at the same level and does not decline as shown in Figure 3, the German street network for population groups 5 and 6 will be almost completely covered by the middle to end of 2012.
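The reasoning behind such a projection can be illustrated with a simple linear extrapolation. The sketch below is not the projection method used in this study; it assumes a constant yearly growth equal to the 2010 additions quoted above, takes the mid-2011 gaps for groups 5 and 6 as the remaining distance, and uses the hypothetical function name `projected_completion_year`.

```python
def projected_completion_year(gap_km, km_per_year, start_year=2011.5):
    """Fractional calendar year in which the gap closes at a constant growth rate."""
    if km_per_year <= 0:
        raise ValueError("growth rate must be positive")
    return start_year + gap_km / km_per_year


# (remaining gap in mid-2011, kilometers added in 2010) per population group.
groups = {
    5: (21_000, 14_000),
    6: (48_000, 33_000),
}

for group, (gap_km, rate_km) in groups.items():
    year = projected_completion_year(gap_km, rate_km)
    print(f"group {group}: gap closed around {year:.1f}")
```

A lower growth rate, such as the one observed in the first half of 2011, shifts the estimate accordingly, which is why the statement above is conditional on the rate not declining.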
With regard to the correlation between TomTom’s commercial dataset and OSM, and the relative route network comparisons by town or municipality area, the following statement can be made (cf. Figure 8). Overall there is an 85% correlation in total length between the OSM and the TomTom dataset for 87% of the area of Germany. For data related to car navigation, this value decreases to approximately 69%. In terms of population, this means that nearly 95% of the inhabitants of Germany live in areas with 85% data coverage. In the case of car navigation data, this value decreases to nearly 84% of the population.
Figure 8.
Correlation between OSM Data Coverage and Area, and OSM Data Coverage and Population (June 2011).
Although OSM’s total route length already exceeds that of TomTom, there are still areas in Germany in which TomTom has more data than OSM. According to the previous results with regard to population density, these are typically areas with a low population. The following two maps show where the differences in the total route network (Figure 9, left) and the route network for car navigation (Figure 9, right) can be found, based on the administrative areas for municipalities and towns.
Figure 9.
Relative Difference between TomTom and OSM for Total Route Network (left) and for Car Navigation Network (right) (June 2011).
The results gathered from several analyses over time showed that municipalities in the southeast of Germany have a good total route network; however, the same areas still lack data specific to car navigation. Closer examination showed that although routes within these areas were geometrically present in the dataset, the attributes associated with these routes did not give a definitive street category. This error often occurs when streets are digitized by a contributor from aerial images and, due to the lack of local knowledge about the area, no statement can be made on the category of the street. The second piece of information that could be derived from the maps was that TomTom has less data available in the total route network for the eastern part of Germany, while OSM generally shows a higher total route network length in this area. Overall, with the exception of a few areas, this statement holds for large parts of Germany. However, with regard to the route network for car navigation, the situation is, as mentioned before, somewhat worse.
A cloud diagram allows us to visualize the towns and municipalities according to their population and relative differences between the TomTom and OSM datasets, in particular, the network for car navigation (
Figure 10). The graph clearly shows a decrease in discrepancies between the datasets with growing population density. These discrepancies can be positive and negative for each dataset. Additionally we can see that data differences in the class of rural towns (10,000–50,000 inhabitants) can vary between 10% and 20%.
Figure 10.
Correlation between Dataset Differences and Population Density (June 2011).
Different numbers of members have been gathering data for OSM in each administrative area that we analyzed. A simple measure of participants per square kilometer can be calculated by dividing the total number of participants per administrative area by the size of the area. Our results showed that with an increasing number of participants, the relative difference between the datasets decreased (Figure 11). More importantly, a statement can be made on how many participants are required to collect a dataset that is close to complete.
Figure 11.
Correlation between Data Completeness and Number of Contributors (June 2011).
Given the current data collection trend in Germany, a completeness of more than 90% for car navigation data, relative to the commercial dataset, can already be achieved with an average of two project participants per square kilometer. According to the trend line, more than six participants are required to achieve a dataset that is close to “complete”.
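The participant density and the two thresholds read from the trend line can be combined in a few lines. This is a minimal sketch: the function names `contributors_per_km2` and `reported_band` are hypothetical, and the bands only restate the two values quoted above rather than the fitted trend line itself.

```python
def contributors_per_km2(n_contributors, area_km2):
    """Simplified participant density of an administrative area."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return n_contributors / area_km2


def reported_band(density):
    """Map a density to the completeness statement quoted in the text."""
    if density > 6:
        return "close to complete"
    if density >= 2:
        return "more than 90%"
    # The text makes no statement for fewer than two participants per km2.
    return "no statement in the text"


density = contributors_per_km2(n_contributors=50, area_km2=25.0)
print(density, reported_band(density))  # 2.0 more than 90%
```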
3.3. Topology Errors and Turn Restrictions
A routing application generally requires a graph that represents the street network and consists of nodes and edges. It is therefore essential that the graph is topologically correct and free of errors. OSM data is not routable in its standard form [41,42]: although attempts are being made within the OSM project to record the street data in a topologically correct way, this topology cannot be used directly for routing without additional data preparation. During this preparation procedure, junctions must be localized by searching for nodes that are used by several streets, and streets must be attributed to these nodes accordingly. However, errors do occur in the OSM dataset. We examined the entire route network for Germany to find possible topology errors and identified errors similar to those visualized in Figure 12. First, a junction cannot be determined as such because the ways do not share a common node (1). Second, duplicate nodes or ways can cause an error (2). Third, streets simply overlap without a shared node, either because they do not actually cross or because information is missing (3).
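The junction search described above, finding nodes that are used by several streets, can be sketched as follows. The input layout (a dictionary from way identifier to an ordered list of node identifiers) and the function name `find_junctions` are assumptions made for illustration.

```python
from collections import defaultdict


def find_junctions(ways):
    """Return {node_id: set of way ids} for nodes used by more than one way.

    `ways` maps a way id to its ordered list of node ids.
    """
    usage = defaultdict(set)
    for way_id, node_refs in ways.items():
        for node_id in node_refs:
            usage[node_id].add(way_id)
    # Only nodes shared by at least two different ways are junction candidates.
    return {node_id: way_ids for node_id, way_ids in usage.items() if len(way_ids) > 1}


ways = {
    "main_street": [1, 2, 3],
    "side_street": [3, 4],
    "isolated_track": [5, 6],
}
print(find_junctions(ways))
```

In this toy input only node 3 is reported, since it is the only node referenced by two ways. A junction of error type (1) would be missed by this search precisely because the two ways do not share a node.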
Figure 12.
OSM Topology Error Types.
We converted the annual datasets (2007–2011) of OSM into routable street networks and searched for possible topology errors. The topology errors for non-linked streets were determined by measuring the distance between the two applicable streets, which should not be greater than 1 m. It can be clearly seen that the number of such errors has decreased over the years and remains high only for routes for cyclists or pedestrians (Figure 13a). The results of the second analysis, for possible duplicate streets, also showed that the quality has continually improved, at least in the street network for car navigation (Figure 13b). The number of errors in the third analysis, which covers intersecting streets without any shared nodes (Figure 13c), remains relatively constant, with the exception of the “other routes” data group. Random sampling showed that some of the identified errors were caused by attribute errors in the dataset; for example, the information that a street is in fact a bridge was missing.
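The 1 m check for non-linked streets can be approximated by comparing way endpoints with the nodes of other ways. The sketch below is a simplification under stated assumptions: it measures endpoint-to-node distances with the haversine formula rather than the distance between street geometries, it uses a brute-force comparison, and the names `haversine_m` and `unconnected_endpoints` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two coordinates in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def unconnected_endpoints(nodes, ways, tolerance_m=1.0):
    """List (way, endpoint, other way, nearby node) for likely missing links.

    `nodes` maps node id to (lat, lon); `ways` maps way id to a non-empty
    list of node ids.
    """
    suspects = []
    for way_id, refs in ways.items():
        for end in {refs[0], refs[-1]}:
            for other_id, other_refs in ways.items():
                # Skip the way itself and ways that already share this endpoint.
                if other_id == way_id or end in other_refs:
                    continue
                for cand in other_refs:
                    if haversine_m(*nodes[end], *nodes[cand]) <= tolerance_m:
                        suspects.append((way_id, end, other_id, cand))
    return suspects


nodes = {
    1: (50.000, 8.0),
    2: (50.001, 8.0),
    3: (50.001, 8.000005),  # well under 1 m east of node 2
    4: (50.002, 8.000005),
}
ways = {"a": [1, 2], "b": [3, 4]}
print(unconnected_endpoints(nodes, ways))
```

For a country-wide dataset, a spatial index would be needed instead of the nested loops used here.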
Figure 13.
(a) OSM Topology Errors; (b) OSM Duplicate Nodes or Ways Errors; (c) Lack of Information Errors.
Turn restrictions constitute an essential component of routing applications. In a worst-case scenario, serious road accidents can occur should they be absent or incorrect. There are several different types of turn restrictions. In general, two types can be differentiated: requirements and prohibitions. Requirements prescribe the only possible way(s) to turn or travel at a junction. Prohibitions, on the other hand, indicate where it is not permitted to travel. In the following preliminary comparison (Table 2), the total numbers of turn restrictions of TomTom and OSM for Germany are compared.
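The distinction between requirements and prohibitions can be illustrated with a small classifier. The sketch assumes the OSM tagging convention in which a restriction relation carries a `restriction` value beginning with `only_` (requirement) or `no_` (prohibition); this convention is not described in the text above, and the function name `restriction_kind` and the sample values are illustrative.

```python
from collections import Counter


def restriction_kind(value):
    """Classify a restriction value as requirement, prohibition, or unknown."""
    if value.startswith("only_"):
        return "requirement"
    if value.startswith("no_"):
        return "prohibition"
    return "unknown"


# Hypothetical sample of restriction values taken from turn-restriction relations.
sample = ["no_left_turn", "only_straight_on", "no_u_turn", "no_right_turn"]
print(Counter(restriction_kind(value) for value in sample))
```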
Table 2.
Total Number of TomTom and OSM Turn Restrictions in Germany (June 2011).
Data Provider | Date | Total | Standardized |
---|---|---|---|
TomTom | 2011 | Approximately 176,000 | Approximately 174,000 |
OpenStreetMap | June 2011 | Approximately 21,000 | Approximately 28,000 |
The difference between TomTom and OSM totals almost 146,000. As such, TomTom currently has five times more turn restrictions available for Germany than does OSM. Although the number of turn restrictions available in the OSM dataset is continually increasing, it will probably take several more years before OSM achieves the same level as TomTom, based on the current status and development.
The biggest issue during this analysis was to adjust TomTom’s dataset, read the turn restrictions, and convert them so that they could be compared with the OSM data. In addition to the distribution of turn restriction information over several attribute tables and datasets in TomTom, the existing restrictions also had to be filtered. For example, “automatically calculated” turn restrictions and those prohibiting turning into a “residents only” street were among the restrictions filtered out of the TomTom dataset. In addition to the total number of differences described above, a comparison by street category was also conducted (Figure 14).
Figure 14.
Number of Turn Restrictions by Street Category in Germany for TomTom and OSM (June 2011).
For a further analysis, we organized the standardized turn restrictions according to their appearance in the different population groups (
Figure 15). The results showed that a large number of missing objects fall into the rural groups. However, the graph also shows that objects are missing in urban areas as well.
A further important quality parameter, and the final aspect of our analysis, is the temporal accuracy of the geodata. The OSM dataset allowed us to analyze this factor by reading the time stamp of each route network object. According to these time stamps, approximately one third of the data originated during 2011 and 2010, and another third during 2009 and 2008 (cf. Figure 16).
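The time stamp analysis amounts to grouping objects by the year of their time stamp. The following sketch assumes ISO 8601 time stamps of the form `2011-05-17T09:30:00Z`; the function name `share_by_year` and the sample values are hypothetical and do not come from the dataset used in this study.

```python
from collections import Counter
from datetime import datetime


def share_by_year(timestamps):
    """Return {year: share of objects} for a list of ISO 8601 time stamps."""
    years = Counter(
        datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").year for ts in timestamps
    )
    total = sum(years.values())
    return {year: count / total for year, count in sorted(years.items())}


# Hypothetical time stamps of four route network objects.
sample = [
    "2011-05-17T09:30:00Z",
    "2010-11-02T18:05:41Z",
    "2008-07-23T12:00:00Z",
    "2008-03-01T07:45:10Z",
]
print(share_by_year(sample))  # {2008: 0.5, 2010: 0.25, 2011: 0.25}
```

Summing the shares of adjacent years gives two-year bands such as those quoted above.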
Figure 15.
Number of Turn Restrictions by Town Type in Germany for TomTom and OSM (June 2011).
Figure 16.
Actuality of the OSM Route Network.