Article

Flow Orientation Analysis for Major Activity Regions Based on Smart Card Transit Data

Department of Industrial and Management Systems Engineering, Kyung Hee University, Yongin, Gyeonggi 17104, Korea
*
Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2017, 6(10), 318; https://doi.org/10.3390/ijgi6100318
Submission received: 30 July 2017 / Revised: 12 September 2017 / Accepted: 16 October 2017 / Published: 23 October 2017

Abstract

Analyzing public movement in a city's transportation networks is important for understanding the lives of citizens and for making better city plans for the future. This study investigates the flow orientation of major activity regions based on smart card transit data. Flow orientation derived from real movements, such as transit data, offers a simple way of understanding public movement in complicated transportation networks. First, high inflow regions (HIRs) are identified from transit data for morning and evening peak hours. The morning and evening HIRs are used to represent major activity regions for major daytime activities and residential areas, respectively. Second, the directional orientation of flow is derived through the directional inflow vectors of the HIRs to show the bias in directional orientation and to compare flow orientation among major activity regions. Finally, clustering analysis is applied to the HIRs to capture the main patterns of flow orientation in the city and to visualize the patterns on the map. The proposed methodology was illustrated with smart card transit data of the bus and subway transportation networks in Seoul, Korea. Some remarkable patterns in the distribution of movements and orientations were found inside the city. The proposed methodology is useful because it unfolds this complexity and makes the main movement patterns easy to understand in terms of flow orientation.

1. Introduction

Most major cities in the world have developed gradually over time without pre-defined plans. This often causes unplanned and unexpected changes in many regions of the city. Diagnosis of urban sprawl is crucial to understanding public movements, which have many facets attached: primarily traffic congestion, the location of business centers, and the type of residential setup; and secondarily information flow, the spread of biological viruses, and urban and transit planning [1,2,3,4,5,6,7]. The objective of this paper was to develop a methodology for understanding the flow orientation of public movements from real data sources such as smart card transit data. Even though many methods for public movement analysis have been developed, one of the easiest ways of understanding movement flow is to show the directional flow of each region. Moreover, flow orientation has not yet been analyzed on the basis of smart card transit data, which contain the exact movement information of the public in transportation networks according to time and location. Visualizing such directional movements on the map is very useful for understanding real public movements, which are distributed across the complicated transportation networks of developed cities.
Generally, major activities in a city are concentrated in a small number of regions. By interpreting the flow towards the major activity regions, one can understand most of the public movements in the city. In this study, the term High Inflow Regions (HIRs) is used to refer to the regions that attract the majority of the public, either to perform their activities during the daytime or as residential areas at night. The term is similar to the poly-centers used in other studies [2,8], but HIRs also contain the residential regions, as well as major areas for daytime activities. Meanwhile, high traffic is plausible in network routes that connect origins and destinations; the origins and destinations of trips, along with the time of travel, depend on living and working locations. The HIRs for major daytime activities can be determined by investigating trips for morning commutes, while the HIRs for residences can be derived from the trips for evening commutes. The two kinds of HIRs are called morning HIRs and evening HIRs in this study, respectively. These are significant in urban development projects, transportation improvement, and social network analysis.
Another aspect covered in this paper was the distribution of traffic from the origin regions towards discovered HIRs. This flow orientation in traffic was measured from each direction towards HIRs. This directional flow orientation provided the comparative incoming flow from regions in different directions. Transport service providers can benefit by concentrating on dominant directions to provide the optimum level of transport for those regions. The availability of public transport can also be inspected for the least inflow directions.
In this study, the flow orientation was intended to be analyzed using real transit data of public transport. Smart card transit data was very useful to investigate the detailed movements in traffic according to specific times and locations. More specifically, the transit data in the morning and evening peak hours were focused on in order to identify the morning and evening HIRs in a city, respectively. For the purpose of abstraction and simplicity of regions, the Geohash system was adopted, which is a geo-coding system of mapping a specific location in the world to a unique code according to the required resolution [9].
A directional inflow vector was obtained for each HIR based on the direction of the trip to the destination, and the vector was used to compare the similarity between two HIRs. Furthermore, the flow orientation patterns of HIRs were visualized by clustering the HIRs based on their directional inflow vectors. To illustrate the proposed methodology, smart card transit data for the bus and subway networks in Seoul, Korea, were utilized, and the implication of the method was described with the experimental results. In the experiments, the flow orientation patterns of morning HIRs were more variable compared to those of evening HIRs, which meant the working regions were more concentrated than the living regions in terms of orientation in this city. It was also possible to find the relationship between flow orientation and regional environment, such as a river and expressway in the city.
The contributions of this study can be summarized as follows:
  • To understand major activity regions in a city, morning and evening HIRs were derived and investigated from real transit data.
  • To provide a comprehensible way of showing public movement, a method for flow orientation analysis was developed with a directional inflow vector and a dominance factor.
  • To lessen the complexity of complicated transportation networks, the Geohash system was adopted for scalable abstraction of bus stops and subway stations.
  • To show similar flow orientation patterns of HIRs, hierarchical clustering of HIRs was applied and then visualized on a city map.
  • Using smart card transit data from Seoul, the proposed method for flow orientation analysis was illustrated and its effectiveness was presented.
The remainder of this paper is structured as follows. In Section 2, the related work is discussed. In Section 3, the methodology for processing smart card transit data, discovering HIRs, and analyzing the flow orientation of these regions is presented. In Section 4, the proposed methodology for flow orientation analysis is demonstrated with smart card transit data collected in Seoul, Korea. Section 5 makes concluding remarks and describes future work.

2. Related Work

Our research uses public transport data, such as smart card transit data, to study human mobility in terms of flow orientation. Hence, we first review the literature on smart card-based transit data analysis and group the studies by research purpose, in line with our research interest, to help readers understand and compare them. We then review flow analysis of human mobility and introduce studies similar to our method, although their data sources were not smart card transit data, because little work exists on flow orientation based on smart card transit data.
A review paper on smart card data-based studies [10] focused on smart card technology applied in public transit as a whole, emphasizing research classification. This paper helped us develop a perspective on research directions. We found a small number of related past studies on smart card data and list them in Table 1. Because researchers have studied travel behavior and human mobility for various purposes, we categorized the studies by purpose. The four categories are presented in Table 1 and described in the following paragraphs.
The first category is the clustering of geographical areas, which was also used in our research to group major activity regions for flow orientation. The spatial and temporal regularity of travelers was measured in past research by grouping them by chosen boarding/alighting stops and routes on different weekdays, and by time of travel [11,12,13]. Morency et al. were further interested in the class-wise regularity patterns of travelers [11,12]. Kieu et al. went on to categorize passengers based on how regularly they selected their time of travel over the observed time frame [14]. The categories are regular, habitual, irregular, and occasional passengers. Some researchers have considered the travel behavior for a geographical location. Kim et al. clustered subway stops and created zones having similar directions of travel [15]. Du et al. clustered regions and studied travel patterns between regions regarding the direction and destination of travel [16]. These existing studies share adjacency as a common constraint, which affects the accuracy of the derived results. Another similar study was conducted by Roth et al. [5]. They studied the variation in flow between subway stops and the orientation of flow. Our work is based on their study. We took it a step further by developing a more effective technique for studying the concentration of flow in order to perform flow orientation analysis in a detailed manner.
Researchers interested in studying travel behavior generally use visualization for studying movement patterns. Prior to this study, maps were used to visualize commuting patterns at a time of day for different categories of passengers [17,18], and job and housing locations were marked in the form of stations or bus stops [19]. Map visualization has also evolved over time, with more granularity in visualization combined with visual analytics. Zeng et al. used a map visualization with multidimensional attributes such as the volume of arrivals and departures for each day of the week and activity categories in the area [20]. Compared to the previous studies on movement visualization, we showed HIRs and the similarity between them in terms of orientation pattern on the map.
Another research trend is to derive relationships between human mobility and the reason for travel from smart card transit data. Existing studies that fall into this category either relate passenger journeys to their travel purpose using additional survey data [21], or relate the attractiveness of bus stops, stations, or travel modes to passenger inflow [22,23]. Zeng et al. linked the arrivals and departures at stations with activities in the area based on the time of arrivals and departures [20]. Following this trend, our efforts were first to extract the relationship between regular traveler movements and the concentration of day and night activities, and second to extract the relationship between two regions in terms of similarity in the orientation of inflow.
The analysis of flow orientation, which was the objective of our research, was not very common in our literature review of smart card data analysis. Du et al. touched on this topic and analyzed routes connecting high density regions [16]. Roth et al. grouped bus stops based on proximity and high inflow in their research, and analyzed the orientation of flow for them [4]. Cats et al. identified public transport activity centers in line with passenger flow and spatial proximity [24]. They studied total flow (the sum of arrivals and departures), differential flow, and flow ratio variation in these clusters for time-of-day segments. In another study, Song et al. identified the types of industrial agglomerations and analyzed the orientation of each with respect to different transportation modes to assess its relation to transport accessibility [26]. This study gauged the proximity of agglomerations to different transport modes to assess the source of flow and the importance of transport accessibility for industry types. These studies examined the orientation of flow; the difference in our work is that we grouped the discovered regions by flow orientation. Furthermore, the implementation of their methodology in other geographical areas was less feasible. We performed analyses by defining new measures of flow orientation, which were useful for comparing two flow orientations. Moreover, we also demonstrated the use of clustering and visualization tools in analyzing flow orientation geographically.
Furthermore, while most previous studies analyzed the smart card transit data at the bus stop/station or route level [2], our research concentrated on the data at the geographical level. One of the highlights of this research is to show that the geographical complex flow information was analyzed in a simple way. The motivation for this was the assumption that analyzing the flow of high inflow areas is ample enough to understand rough geographical movement, and also to plan for future urban and network development. Hence, this study focused on the orientation of incoming flow for just a small number of HIRs. Moreover, the Geohash system was adopted to abstract the information of the geographical locations of stations and bus stops, and to choose a proper resolution. Using the Geohash system, we reduced the complexity of flow orientation analysis in this study.
The scope of this paper was restricted to analyzing flow orientation based on smart card transit data. Because little past work exists on smart card data-based flow analysis, some major studies on human mobility from other types of data sources, such as mobile phones and car GPS data, are briefly introduced for comparison with our study. To provide a glimpse of urban human mobility, which is an emerging area of interest for researchers, a few examples are provided. Zhu et al. clustered trips based on similarities of origins and destinations to bundle flow. Their study provided a flow mapping view at different levels of resolution and demonstrated the method with taxi data [25]. Wu et al. studied urban human mobility by grouping trips with the same destinations [27]. They called this co-occurrence, and the approach helped to understand the traffic flow at a given time in a given area. In their study, mobile phone trajectory data were used for demonstration. Andrienko et al. segmented events from trajectories based on car GPS data using specific queries instead of focusing on whole travel trajectories [28]. These clusters of movement events were used to analyze traffic conditions in the areas of interest. Those studies tried to analyze traffic flow and visualize it geographically. In a similar way, this study also analyzed flow patterns in terms of flow orientation by using smart card transit data to discover the movement patterns of the public.
Meanwhile, three studies were most closely related to this study [5,15,16]. Each of these studies attempted to develop methods to apply smart card transit data for analyzing high density concentration of flow and public mobility patterns. It seems that not many researchers have focused on performing this task. The HIR clusters in this study are a similar concept to the spark regions in [16], the poly-centers in [5], and the MZPs in [15]. Each of the three studies discussed peak hour flow. However, they did not use it exclusively for discovering the high density of flow. Considering whole-day data at once for discovering flow concentration could be misleading, since it includes both the outbound journey and the return. To remove this bias, we analyzed peak hour data, which served the purpose of analyzing regular travel behavior and could better help in capacity evaluation.
Morning and evening peak commute data were analyzed using the same method in this study. The former was for work destinations and the latter for residential. Poly-centers described in [5] did not separate residential and working destinations based merely on data. Spark regions in [16] could be bifurcated as high inflow or outflow, and considered both inflow and outflow for the entire day, but they could not clearly separate out residential and work places. In their experiments in study [15], the MZPs were discovered for the morning and evening commute, while the MZPs could not be distinguished for residential or work concentration areas.
The normalized directional flow vector along with a dominance factor that was introduced in our study can help represent the variation in contributed flow for HIRs. It can also be used to compare any two HIRs for dominant direction and balance in the flow orientation. Roth et al. [4] performed directional flow analysis in which the normalization was through a null model, having actual inflow and outflow degrees of stations and randomized rides. Their method was not proper for comparing the characteristics of two regions since it targeted flow orientation for a required region.

3. Methodology

Smart card transit systems provide the ability to examine user behavior in a better way than revealed preference surveys [29]. The usage of transit systems in changing urban movement and local communities can be monitored in more precise and flexible manners [5,30]. Taking this into consideration, smart card data was used for spatial analysis of a metropolitan city.
The smart card transit data were processed to transform boarding and alighting stations/stops to the required resolution area using Geohash codes. Then, actual trips were separated from transfers to obtain the final origin and destination (OD) database. This OD database was analyzed in this study to discover HIRs, which were the spatial regions on the map where the majority of the population preferred to travel. Each of these HIRs was further analyzed for flow orientation, considering the inflow from eight compass directions. The analysis was carried out by comparing the directional flows for balance in the flow orientation.

3.1. Data Pre-Processing

The dataset considered for analysis in this study consisted of smart card transit data. It contained OD data in the form of station codes with time stamps. Other various attributes, such as amount paid, mode of transport, passenger category, route, and direction, were present in the smart card dataset.
During pre-processing, the origin and destination in the form of station codes were transformed to Geohash codes using GPS location. The main attributes of the OD dataset used in this study are presented in Table 2. The origin and destination bus stops or subway stations were mapped to Geohash codes using their longitude and latitude. Although the Geohash system was not strictly required in this research, it dramatically reduced the complexity of dealing with a large number of subway stations and bus stops. Geohash makes it possible, anywhere in the world, to increase or decrease the size of the considered regions by adjusting the number of digits according to the required analysis precision. Geohash codes range from 1 to 12 digits according to the precision. As an example, Figure 1 depicts how a 4-digit grid, wydm, representing an area of approximately 39.1 km × 19.5 km in Seoul, is divided into 32 smaller 5-digit grids, and how a 5-digit grid, wydmb, is again divided into 32 6-digit grids. In this study, the stops and stations in the transportation data were mapped to 6-digit Geohash codes as the spatial resolution. In addition, we ignored transfers to reflect only the final destinations. To achieve this, consecutive trips separated by a stop of less than 30 min were merged, and the first origin and the final destination were then used.
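The mapping from coordinates to a Geohash code can be sketched as follows. This is a minimal illustration of the standard Geohash bit-interleaving scheme, not the pipeline used in the study; the function name `geohash_encode` and the sample coordinates are assumptions made for the example.

```python
# Minimal sketch of standard Geohash encoding (illustrative, not the study's code).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a Geohash string of `precision` digits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0          # accumulates 5 bits per output character
    bit_count = 0
    use_lon = True     # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = value * 2 + 1
                lon_lo = mid
            else:
                value = value * 2
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = value * 2 + 1
                lat_lo = mid
            else:
                value = value * 2
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[value])
            value = 0
            bit_count = 0
    return "".join(chars)


# Hypothetical stop location in central Seoul, mapped to a 6-digit region code.
code = geohash_encode(37.5665, 126.9780, precision=6)
print(code[:4])  # 4-digit parent grid of the 6-digit region
```

Truncating a code to fewer digits yields its parent grid, which is what allows the region size to be adjusted without re-encoding the stops.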

3.2. Discovery of High Inflow Regions

The pre-processed data contained origin and destination regions with time stamps for each individual trip. The destination regions were analyzed to find the HIRs, which were typically preferred destinations for the majority of travelers. The high inflow destination regions were chosen using the Pareto principle [31,32]. The Pareto principle, which is also known as the 80/20 rule or the law of the vital few, acted as the basis for separating out HIRs. Typically, the roughly 20 to 30% of regions that contributed 80% of the total inflow were selected as HIRs and further analyzed for flow orientation. The rule was considered justifiable for assigning HIRs, since it is widely used in various business and scientific fields for making decisions. The assumption was that analyzing this roughly 20% of alighting regions could explain the flow orientation of most of the area while overlooking noise.
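The selection rule can be sketched as a cumulative-share cut-off over regions sorted by inflow. The function name `select_hirs`, the 0.8 threshold parameter, and the toy region codes below are assumptions for illustration only.

```python
from collections import Counter


def select_hirs(destinations, share=0.8):
    """Return the top-inflow regions that together cover `share` of all trips.

    `destinations` is an iterable of destination region codes, one per trip.
    Regions are added in descending order of inflow until the cumulative
    inflow reaches the requested share of the total.
    """
    inflow = Counter(destinations)
    total = sum(inflow.values())
    hirs = []
    covered = 0
    for region, count in inflow.most_common():
        if covered >= share * total:
            break
        hirs.append(region)
        covered += count
    return hirs, covered / total


# Toy data with hypothetical region codes: 100 trips to four regions.
trips = ["r1"] * 50 + ["r2"] * 35 + ["r3"] * 10 + ["r4"] * 5
hirs, covered_share = select_hirs(trips)
print(hirs, covered_share)
```

In this toy case two of the four regions already cover 85% of the inflow, so the remaining regions are treated as noise.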

3.3. Flow Orientation Analysis

Flow orientation in this study signified the directional ratio of the movement amounts of passengers from origins to the specific destination. To investigate the flow orientation of a HIR (destination), all origin regions of the destination were divided with respect to relative direction to the target destination. In this study, an arrow from the center of the destination to the center of the origin was drawn virtually and then the angle of the arrow was used to map the direction of the origin to one of eight compass directions.
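A minimal sketch of this direction mapping is given below. It assumes that each compass direction covers a 45° sector centered on the compass point and that a locally flat approximation is adequate at city scale; the text does not state the sector boundaries, so both choices, as well as the function name `compass_direction`, are assumptions.

```python
import math

# Sectors listed counter-clockwise starting from east (angle 0).
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]


def compass_direction(dest, origin):
    """Direction of `origin` as seen from `dest`; both are (lat, lon) centers."""
    dlat = origin[0] - dest[0]
    # Scale the longitude difference so that east-west and north-south
    # offsets are comparable at this latitude (flat-earth approximation).
    dlon = (origin[1] - dest[1]) * math.cos(math.radians(dest[0]))
    angle = math.degrees(math.atan2(dlat, dlon)) % 360.0
    sector = int((angle + 22.5) // 45) % 8   # assumed 45-degree sectors
    return DIRECTIONS[sector]


hir_center = (37.5, 127.0)                           # hypothetical HIR center
print(compass_direction(hir_center, (37.6, 127.0)))  # origin due north
print(compass_direction(hir_center, (37.4, 126.9)))  # origin to the south-west
```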
The directions of origin regions of a HIR and the flow amounts toward the HIR were used to measure the directional contribution of flow for the HIR. The directional contribution of flow for a particular HIR was proportional to the total incoming flow from all of the origin regions that belonged to the corresponding direction. Specifically, the directional inflow was used to measure each directional contribution, and the directional inflow of $HIR_i$ from direction $d$, denoted by $f_i^d$, was calculated as in Equation (1):
$$f_i^d = \sum_{R_j \in O_d(HIR_i)} f_{ji} \qquad (1)$$
$$f_i^{max} = \max_d f_i^d \qquad (2)$$
where $O_d(HIR_i)$ is the set of regions $R_j$ that are located in the $d$-th direction of $HIR_i$, and $f_{ji}$ is the movement amount from $R_j$ to $HIR_i$. In Equation (2), $f_i^{max}$ is the maximum inflow of $HIR_i$ among all of the directional inflows of $HIR_i$.
To compare the flow orientations among HIRs, the normalized directional inflow vector of $HIR_i$, denoted by $F_i$, was calculated. The vector was derived from all of the directional inflows of $HIR_i$ normalized by the maximum directional inflow, as shown in Equation (3).
$$F_i = \left( \frac{f_i^1}{f_i^{max}}, \ldots, \frac{f_i^D}{f_i^{max}} \right) \qquad (3)$$
The normalized vector could be applied to measure the similarity among HIRs in terms of flow orientation. In this equation, $D$ is the number of considered directions, and eight compass directions (i.e., $D = 8$) were used in this study.
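Equations (1) to (3) can be sketched in a few lines. The function name `normalized_inflow_vector`, the ordering of the directions, and the toy flow amounts are assumptions made for this illustration.

```python
DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]  # D = 8, assumed order


def normalized_inflow_vector(flows):
    """Build the normalized directional inflow vector of one HIR.

    `flows` is an iterable of (direction, amount) pairs, one per origin
    region, where `direction` is the compass direction of that origin.
    """
    inflow = {d: 0.0 for d in DIRECTIONS}
    for direction, amount in flows:
        inflow[direction] += amount            # Equation (1): sum per direction
    f_max = max(inflow.values())               # Equation (2): maximum inflow
    if f_max == 0:
        raise ValueError("the region has no inflow")
    return [inflow[d] / f_max for d in DIRECTIONS]   # Equation (3)


# Toy HIR with two eastern origins, one western origin, and one northern origin.
vector = normalized_inflow_vector([("E", 30), ("E", 10), ("W", 20), ("N", 10)])
print(vector)
```

Because every component is divided by the maximum, the dominant direction always has the value 1, which makes vectors of HIRs with very different total inflow comparable.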
After the normalized directional inflow was derived for each HIR, we also measured the imbalance in inflow contributed by each direction. The dominance factor $df$ was introduced to evaluate the dominance of the maximum directional inflow over the other inflows as follows:
$$df_i = \frac{\sum_{d=1}^{D} \left( f_i^{max} - f_i^d \right)}{(D-1)\, f_i^{max}} \qquad (4)$$
In Equation (4), the dominance factor of $HIR_i$, denoted by $df_i$, is the summation of the differences between the maximum inflow and all of the directional inflows, divided by $(D-1)$ times the maximum directional inflow. In flow orientation analysis, the dominance factor in terms of direction measured the extent to which the orientation of inflow was dominated by a single direction.
$df$ had the maximum value of 1 when all of the directional flows were zero except for the maximum direction, while it had the minimum value of 0 when all of the directions equally contributed to the inflows. In other words, a high value of $df$ indicated that a few directions contributed towards the inflow more than the other directions, while a low $df$ indicated that all of the directions contributed to the inflow similarly.
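The two boundary cases can be checked with a short sketch of Equation (4). The function name `dominance_factor` and the toy inflow values are assumptions for illustration.

```python
def dominance_factor(inflows):
    """Dominance factor of one HIR from its D directional inflows (Equation (4))."""
    f_max = max(inflows)
    if f_max <= 0:
        raise ValueError("the region has no inflow")
    d = len(inflows)
    return sum(f_max - f for f in inflows) / ((d - 1) * f_max)


print(dominance_factor([10, 0, 0, 0, 0, 0, 0, 0]))   # single direction only: 1.0
print(dominance_factor([5, 5, 5, 5, 5, 5, 5, 5]))    # perfectly balanced: 0.0
print(dominance_factor([10, 5, 5, 5, 5, 5, 5, 5]))   # intermediate case: 0.5
```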

4. Experiments

The flow orientation analysis was illustrated by applying the proposed methodology to smart card transit data in Seoul, Korea. In Seoul, public transportation such as buses and subways accounts for 64.3% of travel across all transportation means, and the adoption rate of the smart card for public transportation reaches around 99%. A single weekday was chosen to analyze the regular travel patterns, since our preliminary study showed that weekdays had similar movement patterns. The overall procedure of data transformation and experiments is presented in Figure 2.
The bus and subway transit data in Seoul on a chosen weekday were used as input for the analysis. The data was first processed by converting stop/station numbers to Geohash codes and distinguishing transfers from trips. Later, we extracted the transit data in the morning and evening peak hours from the transformed data having the origin and destination in the form of Geohash codes, which was proper for analyzing major movement patterns in a city. This transformed data of morning and evening peak commute hours was then used to discover the HIRs where most of the activities were concentrated during the day and night. Finally, the HIRs were clustered using agglomerative hierarchical clustering (AHC) to investigate similar flow orientation patterns of HIRs. Each experiment and the results are presented in the following sub-sections.

4.1. Data Pre-Processing

In this study, we employed the bus and subway transit data of Seoul, Korea. Weekday data for 2 February 2012 were used for the analysis. The attributes of the data used included passenger number, origin and destination station information, bus/rail type, and time stamp. These data were then transformed according to the proposed methodology so that the resulting origin and destination information was in the form of 6-digit Geohash codes. The size of a 6-digit Geohash region is approximately 1.2 km by 0.6 km in Korea; the exact size depends upon the latitude of the region. Considering the distance between adjacent Geohash regions, the area covered by a 6-digit Geohash code was found most suitable, because it is within walking distance and not large enough to require public transport for travelling within the region. In our OD data, 738 6-digit Geohash regions were present. More specifically, they included a total of 736 destination regions and 737 origin regions as seen in the database.
In this study, we focused on the peak hours that could reflect major travel behavior on a day. A total of 4.5 million trips during the day were divided into 24 hourly datasets based on the boarding time. We chose the morning peak hours from 7:00 to 10:00 and the evening peak hours from 16:00 to 20:00 to extract the HIRs of the morning and evening commutes. Morning HIRs had a high probability of being places where people travelled for daily activities, while evening HIRs had a high probability of being residential areas. The morning and evening peak commute data were analyzed separately after removing transfers to obtain the final destinations. Finally, the resulting OD dataset used in this experiment had 1.09 million trips for the morning commute and 1.30 million trips for the evening commute.
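The transfer removal and peak-hour split can be sketched as below. The record fields (`card_id`, `board_time`, `alight_time`, `origin`, `destination`) are hypothetical, the 30 min gap is measured here between alighting and the next boarding, and the peak windows are treated as half-open intervals; all three are assumptions, since the text does not specify them.

```python
from datetime import datetime, timedelta

MAX_TRANSFER_GAP = timedelta(minutes=30)


def merge_transfers(legs):
    """Merge consecutive legs of one card into trips when the gap is under 30 min."""
    trips = []
    open_trip = {}  # card_id -> most recent trip of that card
    for leg in sorted(legs, key=lambda r: (r["card_id"], r["board_time"])):
        trip = open_trip.get(leg["card_id"])
        if trip is not None and leg["board_time"] - trip["alight_time"] < MAX_TRANSFER_GAP:
            # Treat the leg as a transfer: keep the first origin, update the end.
            trip["destination"] = leg["destination"]
            trip["alight_time"] = leg["alight_time"]
        else:
            trip = dict(leg)
            trips.append(trip)
            open_trip[leg["card_id"]] = trip
    return trips


def peak_period(board_time):
    """Classify a boarding time as 'morning', 'evening', or None (off-peak)."""
    if 7 <= board_time.hour < 10:
        return "morning"
    if 16 <= board_time.hour < 20:
        return "evening"
    return None


def at(hour, minute):
    return datetime(2012, 2, 2, hour, minute)


legs = [
    {"card_id": "A", "board_time": at(7, 30), "alight_time": at(7, 50),
     "origin": "g1", "destination": "g2"},
    {"card_id": "A", "board_time": at(8, 0), "alight_time": at(8, 20),
     "origin": "g2", "destination": "g3"},   # 10 min after alighting: a transfer
    {"card_id": "A", "board_time": at(18, 0), "alight_time": at(18, 30),
     "origin": "g3", "destination": "g1"},   # evening return trip
]
for trip in merge_transfers(legs):
    print(trip["origin"], trip["destination"], peak_period(trip["board_time"]))
```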

4.2. Discovery of HIRs

In this subsection, the morning and evening peak datasets were investigated separately to identify the HIRs where the majority of day and night activities were concentrated. Following the Pareto principle, it was assumed that 80% of the output was produced by 20% of the input, so that the analysis could concentrate on the major flow orientation patterns. The distribution of regional inflows in the morning and evening peak hours is shown in Figure 3. In the figure, regions were sorted in descending order of the number of inflows. The top 23% of regions (175 of 730 regions) accounted for 80.7% of trips (0.88 million among 1.09 million trips) for morning commutes, and the top 30% of regions (219 of 731 regions) accounted for 80.0% of trips (1.04 million among 1.30 million trips) for evening commutes.
Figure 4 shows the locations of the 175 HIRs discovered from morning commute data and the 219 HIRs discovered from evening commute data on the city map. The identified HIRs for the morning commute, highlighted in blue, represent the day activity concentration areas, while the HIRs for the evening commute, highlighted in red, represent the night activity concentration areas, which are supposedly the residential areas. The HIRs in purple are the overlapped regions of the morning and evening HIRs.

4.3. Flow Orientation Analysis

Once HIRs were selected, the analysis of flow orientation was performed for the HIRs. For each HIR, its origin regions were assigned to one of eight compass directions. Two directional inflow vectors $F_i$ were then prepared for the morning and evening commutes by using Equation (3), and the dominance factor $df_i$ was also calculated by using Equation (4). The results of the dominance factors and maximum directional inflows of the HIRs in Seoul are summarized with their Geohash codes in Table A1 of Appendix A.
The $df$ values give a first impression of whether a particular HIR is symmetrically connected to all of the directions in terms of flow volume. For the 175 morning commute HIRs in Seoul, the mean value of $df$ was 0.542 and the standard deviation was 0.101. For the 219 evening commute HIRs, the mean value of $df$ was 0.614 and the standard deviation was 0.113. It could be said that residential and working places were concentrated in specific directions and, moreover, that working places were more concentrated in certain directions than residences, since the evening commute HIRs had a higher $df$ than the morning commute HIRs.
To examine flow orientation patterns, the clustering of HIRs was performed and the result was then visualized on the map. To achieve this, the AHC technique was applied in this study. AHC is an unsupervised bottom-up clustering method for making hierarchical groups of instances based on the similarity among instances [33]. AHC can generate a dendrogram as an output, which shows the progressive grouping of instances. The result gives insight into the dissimilarity among instances and sufficient options for selecting a suitable number of clusters. Hence, AHC was adopted to understand the flow orientation patterns of HIRs in this research.
In this study, the similarity between the directional inflow vectors $F_i$ of two HIRs was used to cluster the HIRs with AHC. Since $F_i$ represents the normalized flow orientation from all eight directions for the HIR, clustering HIRs on $F_i$ can find similar flow orientation patterns among HIRs. More specifically, average link clustering based on the Euclidean distance was adopted. This clustering was justifiable in line with identifying HIRs having similar flow orientations. Figure A1 in Appendix B shows two dendrograms representing the progress of hierarchically clustering the HIRs for morning and evening commutes. Based on evaluation measures such as the cubic clustering criterion, R-square, pseudo F, and pseudo T-square, 27 HIR clusters for the morning commute and 15 HIR clusters for the evening commute were identified. The results of HIR clustering for the morning and evening commutes were visualized with $df$ and $f^{max}$ as shown in Figure 5.
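Average link clustering on the Euclidean distance can be sketched with a naive pure-Python loop. This is an illustration of the technique on toy vectors, not the software used in the study; the function name `average_link_clusters` is an assumption, the number of clusters is given directly rather than chosen from evaluation measures, and the cubic-time loop is only suitable for a few hundred HIRs.

```python
import math
from itertools import combinations


def average_link_clusters(vectors, n_clusters):
    """Agglomerative clustering with average linkage and Euclidean distance.

    Returns a list of clusters, each a list of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]

    def avg_link(c1, c2):
        # Mean pairwise Euclidean distance between members of two clusters.
        total = sum(math.dist(vectors[a], vectors[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find and merge the pair of clusters with the smallest average distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_link(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]   # safe because i < j
    return clusters


# Toy normalized inflow vectors (N, NE, E, SE, S, SW, W, NW) for four HIRs.
toy_vectors = [
    [1.0, 0.2, 0.1, 0.0, 0.0, 0.0, 0.1, 0.2],   # north-dominated
    [1.0, 0.3, 0.1, 0.0, 0.0, 0.0, 0.1, 0.1],   # north-dominated
    [0.1, 0.1, 1.0, 0.2, 0.0, 0.0, 0.9, 0.0],   # east-west
    [0.2, 0.1, 1.0, 0.1, 0.0, 0.0, 0.8, 0.0],   # east-west
]
print(average_link_clusters(toy_vectors, n_clusters=2))
```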
The morning HIRs shown in Figure 5a signify clusters of major business or study destinations and, conversely, the evening HIRs in Figure 5b signify major residential destinations. In Figure 5a,b, the HIRs with the same number or the same color belong to the same cluster, and their flow orientation patterns can be found in Figure 6a,b under the cluster number. For example, many HIRs in the morning commute shown in Figure 5a were grouped into Cluster 1 in yellow, whose directional orientation was mainly in the E and W directions, as depicted in Figure 6a.
On the two maps showing the locations of HIR clusters, many nearby HIRs have similar flow orientation patterns for the morning and evening commutes. It is reasonable to believe that nearby regions are more likely to share a similar structure of residential and working places. Nevertheless, a comparison of Figure 5a,b shows that the flow orientation similarity among nearby HIRs is lower in the morning than in the evening. In other words, there were many similar flow orientation patterns among nearby locations in Figure 5b, and the major directions shown in Figure 6b indicate that these orientation patterns were directed toward the center of the city. In contrast, the morning orientation patterns shown in Figure 5a were more diverse than the evening orientation patterns in Figure 5b.
Moreover, nearby regions are often segregated by transportation facilities, such as expressways and subway lines, and by surrounding environments, such as rivers and parks. For example, most of the morning HIRs at the bottom of Figure 5a belong to Cluster 1 in yellow, which had mainly horizontal movement in the east and west directions (see the first fan diagram in Figure 6a), because the south area of Seoul is separated from the center and the north areas by the Han River. Likewise, the HIRs in Cluster 7 above the river, shown in light pink, were dominated by inflow from the E and NE directions. However, the HIRs of Cluster 3 (shown in orange), which are near the highway, had vertical movement, dominantly from the north. Moreover, the HIRs near the Han River in the south area, such as Clusters 4, 10, and 12, had a lot of inflow from the N direction due to the four bridges and the riverside expressway. This was interpreted to mean that flow across the river was prominent for HIRs near the highway. Such characteristics could also be found for the evening commute HIRs (see Clusters 9 and 12 depicted in Figure 6a,b).
In Figure 5b, we could find more highly clustered regions for the evening commute compared to those for the morning commute shown in Figure 5a. This result was interesting, since evening commute patterns were expected to be symmetrical to the morning commute patterns in that residential and working places were exchanged between the two commutes. One explanation is that many evening commutes still took place from 20:00 to 23:00, outside the 16:00–20:00 window that we treated as the evening commute. The evening commute HIRs were also expected to contain many new trips made in the evening. It is expected that a variety of information can be derived from the results of flow orientation analysis and that this information can be utilized for citizen movement analysis and urban development.

5. Conclusions and Future Work

Using smart card transit data, HIRs were identified to reduce the complexity of the data by showing that daily activities were concentrated in limited areas. In this study, we identified morning commute HIRs, where most of the daytime activities were concentrated, and evening commute HIRs, which were assumed to be residential regions. The geographical abstraction based on Geohash codes made it easy to represent and identify HIRs on a city map. Along with detecting HIRs, we also provided flow orientation analysis for each HIR, which described the variation in the inflow orientation of the HIR in terms of the eight compass directions. The dominance factor df presented the balance in flow orientation for any region at a glance, and flow orientation vectors were used for in-depth analysis of flow by applying classical data mining and statistical techniques. In this study, this vector was utilized to create HIR clusters that could visualize flow orientation patterns in a city. The proposed methodology was illustrated with smart card transit data collected from the subway and bus networks in Seoul. The results from the real data provided HIR clusters for flow orientation.
This methodology can also be applied to smart card data from other places to study existing scenarios and to identify exceptions in movement. Analyzing existing scenarios is a prerequisite for making operational changes or planning an upgrade. To understand the causes of high or low inflows, the analysis results should be examined together with land use information and network data. This process can help transport service providers before operational changes or transit network improvements are made, and it can also support urban planning. For other urban development projects, it is beneficial to combine the results with the socio-economic features of passengers travelling to the destinations.
The proposed approach offers a straightforward way to analyze movement patterns in a city. It can therefore provide basic information on how people travel in public transportation networks over time and visualize the main movements in the city. It helps in understanding the major movements of people in relation to context such as daytime activities and residence. In practice, more detailed analysis of flow orientation and human movement could be conducted to derive patterns for various types of activities and for different time periods, such as weekdays and weekends.
In future studies, this research could be extended to longer periods and different spatial scales, since the analysis in this study used one day of data for a single city. It would also be interesting to compare inflow and outflow patterns, since this study focused only on incoming flow orientation patterns. In addition, context information about a city, such as points of interest [20] and accessibility [24], could be integrated to make the results of flow orientation analysis more comprehensible to civil planners.

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2013R1A2A2A03014718).

Author Contributions

P.S. conceived the experiments; J.-Y.J. designed the experiments; K.O. pre-processed the data; P.S. performed the experiments; K.O. visualized the results; P.S. and J.-Y.J. analyzed the data; P.S. and J.-Y.J. wrote the paper.

Conflicts of Interest

The authors declare no conflict of interest. The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

Appendix A

Table A1. Description of HIRs in Seoul.
(a) Morning commute HIRs.
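HIR | Geohash | df | fmax | HIR | Geohash | df | fmax | HIR | Geohash | df | fmax | HIR | Geohash | df | fmax
HIR1 | wydm6d | 0.63 | 8005 | HIR51 | wydm6e | 0.69 | 2187 | HIR101 | wydmfx | 0.63 | 897 | HIR151 | wydjxg | 0.71 | 777
HIR2 | wydm9w | 0.49 | 5460 | HIR52 | wydm61 | 0.69 | 2128 | HIR102 | wydjr7 | 0.53 | 717 | HIR152 | wydms8 | 0.38 | 367
HIR3 | wydm9m | 0.53 | 5782 | HIR53 | wydm1w | 0.71 | 2244 | HIR103 | wydm2j | 0.73 | 1205 | HIR153 | wydq5b | 0.52 | 471
HIR4 | wydm75 | 0.56 | 5983 | HIR54 | wydm3p | 0.52 | 1370 | HIR104 | wydjqz | 0.51 | 683 | HIR154 | wydm93 | 0.33 | 339
HIR5 | wydm9x | 0.50 | 5136 | HIR55 | wydjx9 | 0.46 | 1149 | HIR105 | wydq59 | 0.66 | 964 | HIR155 | wydmgu | 0.59 | 545
HIR6 | wydm6f | 0.55 | 5667 | HIR56 | wydmfb | 0.32 | 862 | HIR106 | wydm69 | 0.68 | 1014 | HIR156 | wydmbd | 0.34 | 337
HIR7 | wydm7k | 0.58 | 5546 | HIR57 | wydjrt | 0.44 | 1049 | HIR107 | wydmfv | 0.45 | 598 | HIR157 | wydjw6 | 0.59 | 532
HIR8 | wydm9k | 0.39 | 3310 | HIR58 | wydmee | 0.62 | 1545 | HIR108 | wydjry | 0.65 | 934 | HIR158 | wydmes | 0.76 | 911
HIR9 | wydm2n | 0.56 | 4344 | HIR59 | wydmdr | 0.47 | 1102 | HIR109 | wydm7u | 0.51 | 655 | HIR159 | wydjxw | 0.53 | 462
HIR10 | wydm9y | 0.66 | 5326 | HIR60 | wydjrx | 0.36 | 903 | HIR110 | wydm1n | 0.46 | 779 | HIR160 | wydmgr | 0.49 | 417
HIR11 | wydm8s | 0.55 | 3928 | HIR61 | wydmf8 | 0.54 | 1246 | HIR111 | wydm91 | 0.51 | 657 | HIR161 | wydjwt | 0.63 | 583
HIR12 | wydm9r | 0.59 | 4202 | HIR62 | wydmed | 0.49 | 1109 | HIR112 | wydm1m | 0.48 | 821 | HIR162 | wydmmp | 0.47 | 515
HIR13 | wydm9z | 0.52 | 3350 | HIR63 | wydq4e | 0.61 | 1445 | HIR113 | wydm24 | 0.63 | 837 | HIR163 | wydm22 | 0.61 | 522
HIR14 | wydm9v | 0.46 | 2875 | HIR64 | wydmsc | 0.43 | 958 | HIR114 | wydmdv | 0.45 | 576 | HIR164 | wydmgx | 0.61 | 531
HIR15 | wydm6x | 0.42 | 2415 | HIR65 | wydjrv | 0.58 | 1340 | HIR115 | wydq5m | 0.72 | 1107 | HIR165 | wydq5x | 0.68 | 632
HIR16 | wydjpx | 0.67 | 4167 | HIR66 | wydjrp | 0.62 | 1464 | HIR116 | wydmbq | 0.34 | 566 | HIR166 | wydmbm | 0.46 | 374
HIR17 | wydm60 | 0.62 | 3550 | HIR67 | wydm70 | 0.58 | 1318 | HIR117 | wydm6h | 0.43 | 502 | HIR167 | wydm8u | 0.60 | 516
HIR18 | wydm6t | 0.49 | 2630 | HIR68 | wydms1 | 0.50 | 1082 | HIR118 | wydq5n | 0.68 | 874 | HIR168 | wydq56 | 0.53 | 438
HIR19 | wydm8f | 0.55 | 2794 | HIR69 | wydmem | 0.58 | 1254 | HIR119 | wydmkq | 0.59 | 693 | HIR169 | wydmku | 0.51 | 415
HIR20 | wydm4x | 0.29 | 2722 | HIR70 | wydm63 | 0.56 | 1191 | HIR120 | wydq5e | 0.62 | 742 | HIR170 | wydmey | 0.66 | 597
HIR21 | wydm9t | 0.56 | 2777 | HIR71 | wydme5 | 0.50 | 1035 | HIR121 | wydmeu | 0.37 | 537 | HIR171 | wydmgv | 0.58 | 484
HIR22 | wydmc8 | 0.36 | 1919 | HIR72 | wydmf7 | 0.43 | 850 | HIR122 | wydmen | 0.59 | 635 | HIR172 | wydmtk | 0.38 | 558
HIR23 | wydm7n | 0.64 | 3223 | HIR73 | wydjr9 | 0.64 | 1300 | HIR123 | wydm9b | 0.35 | 428 | HIR173 | wydjx8 | 0.61 | 520
HIR24 | wydmdj | 0.38 | 1842 | HIR74 | wydm1z | 0.54 | 1408 | HIR124 | wydmgk | 0.49 | 537 | HIR174 | wydms9 | 0.44 | 353
HIR25 | wydmdp | 0.49 | 2189 | HIR75 | wydm89 | 0.64 | 1252 | HIR125 | wydjwm | 0.53 | 568 | HIR175 | wydjrr | 0.66 | 579
HIR26 | wydme4 | 0.56 | 2529 | HIR76 | wydm6z | 0.47 | 846 | HIR126 | wydmev | 0.49 | 513
HIR27 | wydm6m | 0.49 | 2045 | HIR77 | wydmft | 0.53 | 929 | HIR127 | wydm87 | 0.53 | 580
HIR28 | wydm4r | 0.40 | 2170 | HIR78 | wydm71 | 0.58 | 1039 | HIR128 | wydmdw | 0.36 | 414
HIR29 | wydmcc | 0.61 | 2625 | HIR79 | wydm0r | 0.63 | 1162 | HIR129 | wydjrb | 0.62 | 694
HIR30 | wydm6v | 0.49 | 1965 | HIR80 | wydm2m | 0.61 | 1093 | HIR130 | wydm0x | 0.57 | 595
HIR31 | wydm65 | 0.47 | 1848 | HIR81 | wydjpr | 0.40 | 886 | HIR131 | wydm3q | 0.33 | 384
HIR32 | wydmc2 | 0.43 | 1679 | HIR82 | wydmec | 0.49 | 814 | HIR132 | wydmdy | 0.63 | 679
HIR33 | wydm9q | 0.47 | 1767 | HIR83 | wydjz8 | 0.58 | 975 | HIR133 | wydm3h | 0.57 | 592
HIR34 | wydm9n | 0.56 | 2095 | HIR84 | wydmdh | 0.46 | 754 | HIR134 | wydmk2 | 0.60 | 634
HIR35 | wydm2p | 0.53 | 1885 | HIR85 | wydm1r | 0.63 | 1077 | HIR135 | wydms6 | 0.62 | 636
HIR36 | wydm96 | 0.39 | 1442 | HIR86 | wydme3 | 0.60 | 1005 | HIR136 | wydmkk | 0.60 | 634
HIR37 | wydm2t | 0.58 | 2050 | HIR87 | wydm3b | 0.64 | 1100 | HIR137 | wydmk8 | 0.34 | 607
HIR38 | wydmdu | 0.48 | 1680 | HIR88 | wydm8e | 0.60 | 972 | HIR138 | wydm7s | 0.71 | 857
HIR39 | wydmkm | 0.63 | 2290 | HIR89 | wydm2r | 0.67 | 1203 | HIR139 | wydjw5 | 0.65 | 1101
HIR40 | wydm8k | 0.59 | 2068 | HIR90 | wydmdt | 0.63 | 980 | HIR140 | wydmkg | 0.63 | 655
HIR41 | wydm6k | 0.60 | 2084 | HIR91 | wydmf4 | 0.65 | 1084 | HIR141 | wydm5z | 0.46 | 575
HIR42 | wydmdn | 0.66 | 2333 | HIR92 | wydmg6 | 0.50 | 763 | HIR142 | wydjpk | 0.50 | 650
HIR43 | wydmdq | 0.38 | 1192 | HIR93 | wydq49 | 0.52 | 785 | HIR143 | wydm26 | 0.61 | 606
HIR44 | wydmfd | 0.33 | 1101 | HIR94 | wydm3f | 0.58 | 893 | HIR144 | wydmk9 | 0.70 | 777
HIR45 | wydm0z | 0.57 | 1703 | HIR95 | wydm2c | 0.61 | 947 | HIR145 | wydmgf | 0.54 | 502
HIR46 | wydq5q | 0.62 | 1849 | HIR96 | wydq00 | 0.52 | 974 | HIR146 | wydm2q | 0.58 | 556
HIR47 | wydjrk | 0.43 | 1239 | HIR97 | wydjr2 | 0.69 | 1117 | HIR147 | wydmf6 | 0.66 | 681
HIR48 | wydm90 | 0.70 | 2275 | HIR98 | wydmg1 | 0.56 | 780 | HIR148 | wydmk7 | 0.58 | 549
HIR49 | wydm9j | 0.58 | 1625 | HIR99 | wydmfg | 0.38 | 551 | HIR149 | wydmcy | 0.45 | 421
HIR50 | wydm85 | 0.53 | 1410 | HIR100 | wydm38 | 0.56 | 774 | HIR150 | wydm0u | 0.49 | 588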
(b) Evening commute HIRs.
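HIR | Geohash | df | fmax | HIR | Geohash | df | fmax | HIR | Geohash | df | fmax | HIR | Geohash | df | fmax | HIR | Geohash | df | fmax
HIR1 | wydm6d | 0.62 | 6254 | HIR51 | wydm85 | 0.74 | 2992 | HIR101 | wydmgv | 0.62 | 1266 | HIR151 | wydms8 | 0.52 | 756 | HIR201 | wydm8h | 0.65 | 726
HIR2 | wydm8s | 0.70 | 7309 | HIR52 | wydjrb | 0.71 | 2582 | HIR102 | wydm6h | 0.40 | 813 | HIR152 | wydm3q | 0.56 | 809 | HIR202 | wydm3f | 0.64 | 705
HIR3 | wydm0r | 0.73 | 7758 | HIR53 | wydmgc | 0.56 | 1673 | HIR103 | wydjzc | 0.54 | 1066 | HIR153 | wydmd7 | 0.50 | 718 | HIR203 | wydmdu | 0.51 | 516
HIR4 | wydm9x | 0.48 | 3529 | HIR54 | wydmbh | 0.53 | 1546 | HIR104 | wydmmp | 0.45 | 1150 | HIR154 | wydq6c | 0.67 | 1710 | HIR204 | wydmbb | 0.66 | 741
HIR5 | wydm65 | 0.53 | 3425 | HIR55 | wydq4v | 0.70 | 2447 | HIR105 | wydm1m | 0.61 | 1862 | HIR155 | wydjxw | 0.70 | 1194 | HIR205 | wydjv8 | 0.46 | 866
HIR6 | wydmed | 0.57 | 3725 | HIR56 | wydm9r | 0.56 | 1622 | HIR106 | wydmfv | 0.70 | 1600 | HIR156 | wydmgm | 0.75 | 1397 | HIR206 | wydmbc | 0.61 | 650
HIR7 | wydm8k | 0.73 | 5886 | HIR57 | wydm96 | 0.40 | 1177 | HIR107 | wydm6k | 0.44 | 868 | HIR157 | wydm3p | 0.55 | 768 | HIR207 | wydjxg | 0.74 | 953
HIR8 | wydq5q | 0.69 | 5031 | HIR58 | wydq49 | 0.68 | 2185 | HIR108 | wydjr7 | 0.55 | 1067 | HIR158 | wydmf8 | 0.65 | 969 | HIR208 | wydmd6 | 0.40 | 411
HIR9 | wydm0z | 0.71 | 5183 | HIR59 | wydjqz | 0.79 | 3416 | HIR109 | wydme4 | 0.43 | 825 | HIR159 | wydm6t | 0.33 | 515 | HIR209 | wydm63 | 0.63 | 666
HIR10 | wydq4e | 0.73 | 5196 | HIR60 | wydq5e | 0.72 | 2496 | HIR110 | wydm2n | 0.58 | 1111 | HIR160 | wydjwm | 0.70 | 1123 | HIR210 | wydm87 | 0.66 | 725
HIR11 | wydjrk | 0.66 | 4078 | HIR61 | wydq5x | 0.79 | 3236 | HIR111 | wydm7n | 0.45 | 841 | HIR161 | wydjwg | 0.74 | 1326 | HIR211 | wydq4s | 0.75 | 947
HIR12 | wydm9z | 0.41 | 2052 | HIR62 | wydmdj | 0.48 | 1333 | HIR112 | wydm2m | 0.57 | 1075 | HIR162 | wydm7c | 0.73 | 1256 | HIR212 | wydmge | 0.64 | 662
HIR13 | wydm9k | 0.55 | 2744 | HIR63 | wydm4x | 0.29 | 1481 | HIR113 | wydjw5 | 0.78 | 2105 | HIR163 | wydm70 | 0.64 | 939 | HIR213 | wydme2 | 0.66 | 717
HIR14 | wydmkm | 0.70 | 4175 | HIR64 | wydm9m | 0.59 | 1654 | HIR114 | wydm6v | 0.52 | 949 | HIR164 | wydmfg | 0.70 | 1129 | HIR214 | wydqh0 | 0.76 | 988
HIR15 | wydq00 | 0.60 | 4414 | HIR65 | wydmg6 | 0.58 | 1596 | HIR115 | wydm89 | 0.66 | 1349 | HIR165 | wydm9j | 0.68 | 1034 | HIR215 | wydm67 | 0.59 | 580
HIR16 | wydmfx | 0.63 | 3269 | HIR66 | wydmbd | 0.62 | 1734 | HIR116 | wydmbn | 0.63 | 1230 | HIR166 | wydm3h | 0.58 | 788 | HIR216 | wydmk2 | 0.70 | 793
HIR17 | wydm75 | 0.56 | 2595 | HIR67 | wydmdq | 0.63 | 1756 | HIR117 | wydm9y | 0.58 | 1040 | HIR167 | wydjrf | 0.74 | 1274 | HIR217 | wydjwv | 0.57 | 779
HIR18 | wydmsc | 0.63 | 3064 | HIR68 | wydm4r | 0.53 | 1915 | HIR118 | wydmev | 0.74 | 1703 | HIR168 | wydq5b | 0.76 | 1365 | HIR218 | wydmk6 | 0.60 | 590
HIR19 | wydjrv | 0.58 | 2682 | HIR69 | wydms9 | 0.62 | 1676 | HIR119 | wydjrt | 0.63 | 1197 | HIR169 | wydm91 | 0.48 | 624 | HIR219 | wydm69 | 0.61 | 574
HIR20 | wydm7u | 0.68 | 3561 | HIR70 | wydmdh | 0.35 | 951 | HIR120 | wydmem | 0.65 | 1256 | HIR170 | wydmm1 | 0.41 | 706
HIR21 | wydm1w | 0.69 | 3595 | HIR71 | wydmbm | 0.74 | 2371 | HIR121 | wydjxu | 0.77 | 1921 | HIR171 | wydjz8 | 0.63 | 873
HIR22 | wydm6x | 0.56 | 2355 | HIR72 | wydjx9 | 0.74 | 2346 | HIR122 | wydmen | 0.68 | 1374 | HIR172 | wydmf6 | 0.52 | 664
HIR23 | wydm7k | 0.53 | 2226 | HIR73 | wydmsg | 0.65 | 1778 | HIR123 | wydm2e | 0.67 | 1283 | HIR173 | wydm9n | 0.65 | 915
HIR24 | wydjpx | 0.70 | 3378 | HIR74 | wydmkq | 0.72 | 2142 | HIR124 | wydmt4 | 0.55 | 974 | HIR174 | wydq7c | 0.44 | 1701
HIR25 | wydjr9 | 0.70 | 3342 | HIR75 | wydjwt | 0.74 | 2353 | HIR125 | wydm9b | 0.42 | 744 | HIR175 | wydjx8 | 0.75 | 1240
HIR26 | wydm9w | 0.44 | 1766 | HIR76 | wydm26 | 0.70 | 2028 | HIR126 | wydm2c | 0.66 | 1256 | HIR176 | wydmm3 | 0.22 | 754
HIR27 | wydms1 | 0.62 | 2583 | HIR77 | wydjq9 | 0.68 | 1882 | HIR127 | wydmey | 0.64 | 1143 | HIR177 | wydq05 | 0.62 | 1186
HIR28 | wydm2t | 0.57 | 2278 | HIR78 | wydmgf | 0.51 | 1220 | HIR128 | wydjx5 | 0.75 | 1657 | HIR178 | wydm0t | 0.61 | 758
HIR29 | wydjw2 | 0.81 | 5148 | HIR79 | wydm6e | 0.74 | 2314 | HIR129 | wydmes | 0.72 | 1431 | HIR179 | wydm6z | 0.40 | 469
HIR30 | wydm9t | 0.43 | 1661 | HIR80 | wydmeu | 0.53 | 1678 | HIR130 | wydmts | 0.61 | 1027 | HIR180 | wydmgb | 0.60 | 743
HIR31 | wydmf7 | 0.68 | 2941 | HIR81 | wydmg1 | 0.74 | 2110 | HIR131 | wydmgw | 0.66 | 1200 | HIR181 | wydm8e | 0.49 | 570
HIR32 | wydm1n | 0.72 | 3282 | HIR82 | wydm73 | 0.54 | 1177 | HIR132 | wydjzg | 0.34 | 738 | HIR182 | wydm61 | 0.67 | 889
HIR33 | wydmdt | 0.60 | 2228 | HIR83 | wydmdv | 0.66 | 1590 | HIR133 | wydq75 | 0.71 | 2373 | HIR183 | wydmku | 0.49 | 570
HIR34 | wydmcc | 0.50 | 1795 | HIR84 | wydm1z | 0.44 | 1243 | HIR134 | wydmfd | 0.46 | 734 | HIR184 | wydmkc | 0.71 | 1000
HIR35 | wydq5n | 0.76 | 3768 | HIR85 | wydq5m | 0.77 | 2306 | HIR135 | wydmk8 | 0.54 | 1170 | HIR185 | wydjtt | 0.64 | 1220
HIR36 | wydm60 | 0.52 | 1846 | HIR86 | wydme3 | 0.69 | 1733 | HIR136 | wydjqx | 0.82 | 2223 | HIR186 | wydmtk | 0.67 | 871
HIR37 | wydm38 | 0.63 | 2385 | HIR87 | wydq59 | 0.70 | 1787 | HIR137 | wydmgq | 0.70 | 1276 | HIR187 | wydm3b | 0.64 | 788
HIR38 | wydmft | 0.69 | 2808 | HIR88 | wydmf4 | 0.64 | 1471 | HIR138 | wydmuh | 0.62 | 1028 | HIR188 | wydjry | 0.56 | 631
HIR39 | wydmec | 0.58 | 2015 | HIR89 | wydjrx | 0.66 | 1535 | HIR139 | wydmdw | 0.70 | 1297 | HIR189 | wydmdy | 0.58 | 672
HIR40 | wydm6m | 0.47 | 1611 | HIR90 | wydm9v | 0.37 | 811 | HIR140 | wydmt1 | 0.65 | 1112 | HIR190 | wydms6 | 0.53 | 802
HIR41 | wydmbq | 0.71 | 2937 | HIR91 | wydmdn | 0.62 | 1364 | HIR141 | wydm22 | 0.58 | 893 | HIR191 | wydm5z | 0.41 | 802
HIR42 | wydm8f | 0.49 | 1664 | HIR92 | wydmgu | 0.53 | 1105 | HIR142 | wydm24 | 0.66 | 1117 | HIR192 | wydq53 | 0.75 | 1095
HIR43 | wydjw6 | 0.79 | 3989 | HIR93 | wydmdr | 0.67 | 1512 | HIR143 | wydjq3 | 0.76 | 1624 | HIR193 | wydmm4 | 0.58 | 658
HIR44 | wydmfb | 0.69 | 2624 | HIR94 | wydm90 | 0.49 | 983 | HIR144 | wydmt5 | 0.52 | 1076 | HIR194 | wydmfm | 0.50 | 537
HIR45 | wydmdp | 0.53 | 1695 | HIR95 | wydq72 | 0.81 | 2711 | HIR145 | wydmk7 | 0.60 | 943 | HIR195 | wydjpd | 0.58 | 892
HIR46 | wydmgr | 0.72 | 2907 | HIR96 | wydme5 | 0.54 | 1102 | HIR146 | wydmc8 | 0.41 | 624 | HIR196 | wydm8v | 0.71 | 903
HIR47 | wydm0x | 0.73 | 3046 | HIR97 | wydjr2 | 0.67 | 1482 | HIR147 | wydmc2 | 0.61 | 939 | HIR197 | wydm9q | 0.41 | 435
HIR48 | wydm6f | 0.50 | 1591 | HIR98 | wydmgk | 0.72 | 1750 | HIR148 | wydmee | 0.55 | 822 | HIR198 | wydm2g | 0.73 | 989
HIR49 | wydm1r | 0.63 | 2133 | HIR99 | wydjqw | 0.79 | 2354 | HIR149 | wydjqu | 0.73 | 1358 | HIR199 | wydq6b | 0.77 | 1125
HIR50 | wydjrp | 0.75 | 3207 | HIR100 | wydm71 | 0.56 | 1116 | HIR150 | wydm2j | 0.58 | 870 | HIR200 | wydmgn | 0.75 | 1016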
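
The Geohash codes listed in Table A1 can be mapped back to grid cells on the city map. The following is a minimal sketch, assuming the standard public Geohash encoding (a base-32 alphabet whose bits alternately halve the longitude and latitude ranges); the function names `geohash_bounds` and `geohash_center` are illustrative and do not come from the paper.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet


def geohash_bounds(code):
    """Return (lat_min, lat_max, lon_min, lon_max) of a Geohash cell."""
    lat = [-90.0, 90.0]
    lon = [-180.0, 180.0]
    is_lon = True  # bits alternate, starting with longitude
    for ch in code:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):  # five bits per character
            bit = (value >> shift) & 1
            rng = lon if is_lon else lat
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # keep the upper half of the range
            else:
                rng[1] = mid  # keep the lower half of the range
            is_lon = not is_lon
    return lat[0], lat[1], lon[0], lon[1]


def geohash_center(code):
    """Return the (latitude, longitude) of the centre of a Geohash cell."""
    lat_min, lat_max, lon_min, lon_max = geohash_bounds(code)
    return (lat_min + lat_max) / 2, (lon_min + lon_max) / 2


# HIR1 in both the morning and evening lists of Table A1.
lat, lon = geohash_center("wydm6d")
```

For the code `wydm6d`, the centre falls at roughly 37.50° N, 127.03° E, which lies in the southern part of Seoul. Each additional character narrows the cell, so a six-character code covers a cell of about 0.0055° in latitude by 0.011° in longitude.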

Appendix B

Figure A1. Dendrograms of HIR clustering. (a) Morning commute HIRs; (b) Evening commute HIRs.

References

  1. McMillen, D.P.; McDonald, J.F. A nonparametric analysis of employment density in a polycentric city. J. Reg. Sci. 1997, 37, 591–612.
  2. Jun, M.J.; Ha, S.K. Evolution of employment centers in Seoul. Rev. Urban Reg. Dev. Stud. 2002, 14, 117–132.
  3. Baumont, C.; Ertur, C.; Gallo, J. Spatial analysis of employment and population density: the case of the agglomeration of Dijon 1999. Geogr. Anal. 2004, 36, 146–176.
  4. Roth, C.; Kang, S.M.; Batty, M.; Barthélemy, M. Structure of urban movements: polycentric activity and entangled hierarchical flows. PLoS ONE 2011, 6, e15923.
  5. Zhong, C.; Arisona, S.M.; Huang, X.; Batty, M.; Schmitt, G. Detecting the dynamics of urban structure through spatial network analysis. Int. J. Geogr. Inf. Sci. 2014, 28, 2178–2199.
  6. Craig, S.G.; Kohlhase, J.E.; Perdue, A.W. Empirical polycentricity: The complex relationship between employment centers. J. Reg. Sci. 2016, 56, 25–52.
  7. Yang, X.; Fang, Z.; Xu, Y.; Shaw, S.L.; Zhao, Z.; Yin, L.; Lin, Y. Understanding spatiotemporal patterns of human convergence and divergence using mobile phone location data. ISPRS Int. J. Geo-Inf. 2016.
  8. Helsley, R.W.; Sullivan, A.M. Urban subcenter formation. Reg. Sci. Urban Econ. 1991, 21, 255–275.
  9. Geohash. Geohash Tips & Tricks; 2017. Available online: http://Geohash.org/site/tips.html (accessed on 1 May 2017).
  10. Pelletier, M.P.; Trépanier, M.; Morency, C. Smart card data use in public transit: A literature review. Transp. Res. C Emer. Technol. 2011, 19, 557–568.
  11. Morency, C.; Trépanier, M.; Agard, B. Analysing the Variability of Transit Users’ Behaviour with Smart Card Data. In Proceedings of the 19th International IEEE Intelligent Transportation Systems Conference (ITSC), Toronto, ON, Canada, 17–20 September 2006; pp. 44–49.
  12. Morency, C.; Trépanier, M.; Agard, B. Measuring transit use variability with smart-card data. Transp. Policy 2007, 14, 193–203.
  13. Ma, X.; Wu, Y.J.; Wang, Y.; Chen, F.; Liu, J. Mining smart card data for transit riders’ travel patterns. Transp. Res. C Emer. Technol. 2013, 36, 1–12.
  14. Kieu, L.M.; Bhaskar, A.; Chung, E. Passenger segmentation using smart card data. IEEE Trans. Intell. Transp. 2015, 16, 1537–1548.
  15. Kim, K.; Oh, K.; Lee, Y.K.; Kim, S.; Jung, J.-Y. An analysis on movement patterns between zones using smart card data in subway networks. Int. J. Geogr. Inf. Sci. 2014, 28, 1781–1801.
  16. Du, B.; Yang, Y.; Lv, W. Understand Group Travel Behaviors in an Urban Area Using Mobility Pattern Mining. In Proceedings of the 10th IEEE International Conference on Ubiquitous Intelligence and Computing and 10th International Conference on Autonomic and Trusted Computing (UIC/ATC), Washington, DC, USA, 3–6 December 2013; pp. 127–133.
  17. Tao, S.; Corcoran, J.; Mateo-Babiano, I.; Rohde, D. Exploring Bus Rapid Transit passenger travel behaviour using big data. App. Geogr. 2014, 53, 90–104.
  18. Tao, S.; Rohde, D.; Corcoran, J. Examining the spatial–temporal dynamics of bus passenger travel behaviour using smart card data and the flow-comap. J. Transp. Geogr. 2014, 41, 21–36.
  19. Long, Y.; Thill, J.C. Combining smart card data and household travel survey to analyze jobs–housing relationships in Beijing. Comp. Environ. Urban Syst. 2015, 53, 19–35.
  20. Zeng, W.; Fu, C.W.; Arisona, S.M.; Schubiger, S.; Burkhard, R.; Ma, K.L. Visualizing the Relationship Between Human Mobility and Points of Interest. IEEE Trans. Intell. Transp. Syst. 2017, 18, 2271–2284.
  21. Zhong, C.; Huang, X.; Arisona, S.M.; Schmitt, G.; Batty, M. Inferring building functions from a probabilistic model using public transportation data. Comput. Environ. Urban Syst. 2014, 48, 124–137.
  22. Bagchi, M.; White, P.R. The potential of public transport smart card data. Transp. Policy 2005, 12, 464–474.
  23. Kusakabe, T.; Asakura, Y. Behavioural data mining of transit smart card data: A data fusion approach. Transp. Res. C Emer. Technol. 2014, 46, 179–191.
  24. Cats, O.; Wang, Q.; Zhao, Y. Identification and classification of public transport activity centres in Stockholm using passenger flows data. J. Transp. Geogr. 2015, 48, 10–22.
  25. Zhu, X.; Guo, D. Mapping large spatial flow data with hierarchical clustering. Trans. GIS 2014, 18, 421–435.
  26. Song, Y.; Lee, K.; Anderson, W.P.; Lakshmanan, T.R. Industrial agglomeration and transport accessibility in metropolitan Seoul. J. Geogr. Syst. 2012, 14, 299–318.
  27. Wu, W.; Xu, J.; Zeng, H.; Zheng, Y.; Qu, H.; Ni, B.; Ni, L.M. Telcovis: Visual exploration of co-occurrence in urban human mobility based on telco data. IEEE Trans. Vis. Comput. Graph. 2016, 22, 935–944.
  28. Andrienko, G.; Andrienko, N.; Hurter, C.; Rinzivillo, S.; Wrobel, S. Scalable analysis of movement data for extracting and exploring significant places. IEEE Trans. Vis. Comput. Graph. 2013, 19, 1078–1094.
  29. Bahamonde, J.; Hevia, A.; Font, G.; Bustos-Jimenez, J.; Montero, C. Mining private information from public data: The Transantiago Case. IEEE Pervas. Comp. 2014, 13, 37–43.
  30. Ma, Y.; Xu, W.; Zhao, X.; Li, Y. Modeling the hourly distribution of population at a high spatiotemporal resolution using subway smart card data: A case study in the central area of Beijing. ISPRS Int. J. Geo-Inf. 2017, 6, 128.
  31. Sanders, R. The Pareto principle: Its use and abuse. J. Serv. Mark. 1987, 1, 37–40.
  32. Juran, J.M.; Gryna, F.M. Juran’s Quality Control Handbook, 5th ed.; McGraw-Hill: New York, NY, USA, 1998; ISBN 0-07-034003-X.
  33. Olson, C.F. Parallel algorithms for hierarchical clustering. Parallel Comput. 1995, 21, 1313–1325.
Figure 1. Hierarchical structure of the Geohash system for global geo-coding.
Figure 2. Flow orientation analysis of Seoul transportation data.
Figure 3. Distribution of inflows and selection of HIRs. (a) 175 HIRs selected from morning peak hour data; (b) 219 HIRs selected from evening peak commute data.
Figure 4. HIRs extracted in Seoul. 159 HIRs in purple both for morning and evening commutes, 16 HIRs in blue only for morning commute, and 59 HIRs in red only for evening commutes.
Figure 5. Clusters of HIRs in Seoul. (a) Morning commute HIRs; (b) Evening commute HIRs. Their hierarchical clustering progress can be seen in Appendix B.
Figure 6. Flow orientation patterns of HIR clusters. (a) Morning HIR clusters; (b) Evening HIR clusters.
Table 1. Related studies on human mobility using smart card transit data.
Category | Subject | Method | Transit Data
Geographical clustering | Travelers for age occupation wise travel behavior [11] | k-means | 277 consecutive days
Geographical clustering | Travelers for regularity in boarding [12] | k-means | 277 consecutive days
Geographical clustering | Mining travel patterns [13] | DBSCAN | 5 consecutive weekdays
Geographical clustering | Origin-destination pairs for discovering zones based on movement patterns [15] | Clustering | 5 consecutive weekdays
Geographical clustering | Travelers for direction and destination of travel [16] | DBSCAN | 92 days
Geographical clustering | Subway stations for high inflow poly-center [4] | k-means | 1 week
Geographical clustering | Travelers for temporal boarding pattern [14] | DBSCAN | 4 months
Movement visualization | Drawing travel trajectory and visual representation of movement pattern [17,18] | co-map | 5 days
Movement visualization | Job housing location and commuting pattern [19] | GIS platform | 7 days
Movement visualization | Interactive visualization of human mobility with activity context [20] | - | 1 week
Relationship extraction | Relationship between mobility pattern of individual and daily activities [21] | Bayesian classifier | 1 week
Relationship extraction | Travel behavior analysis by measuring passenger turnover [22] | rule-based method | Approx. 3 years
Relationship extraction | Behavioral trip purpose estimation [23] | Bayesian classifier | 20 months
Relationship extraction | Relation of arrival and departure at certain station [20] | - | 1 week
Flow orientation | Discovering flow orientation for poly-centers [5] | compass direction | 1 week
Flow orientation | Discovering spark regions based on high density routes [16] | DBSCAN | 92 days
Flow orientation | Identifying activity centers and clustering them for spatial proximity and temporal flows [24] | clustering | 1 day
Flow orientation | Identifying industrial agglomerations and their orientation with respect to different modes of transport to check importance of transport accessibility [25] | - | 1 day
Table 2. Basic attributes of smart card transit data.
Attribute | Description | Data Type
Passenger code | Smart card serial number | Numeric
Origin station | Origin station number (card punch in) | Numeric
Boarding time | Boarding date and time at origin station | Date time
Destination station | Destination station number (card punch out) | Numeric
Alighting time | Alighting date and time at destination station | Date time
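
The attributes in Table 2 can be represented as a simple record type, from which inflows per destination can be counted for a time window. This is a hedged sketch only: the class name `TransitRecord`, the field names, the sample records, and the choice of filtering on alighting time are assumptions made for illustration, not the authors' data pipeline.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class TransitRecord:
    """One smart card trip, following the attributes of Table 2."""
    passenger_code: int
    origin_station: int
    boarding_time: datetime
    destination_station: int
    alighting_time: datetime


def inflow_by_destination(records, start, end):
    """Count trips per destination whose alighting time is in [start, end)."""
    return Counter(
        r.destination_station
        for r in records
        if start <= r.alighting_time.time() < end
    )


# Hypothetical records: station numbers and timestamps are made up.
records = [
    TransitRecord(1, 10, datetime(2017, 1, 2, 17, 0), 20, datetime(2017, 1, 2, 17, 40)),
    TransitRecord(2, 11, datetime(2017, 1, 2, 17, 30), 20, datetime(2017, 1, 2, 18, 5)),
    TransitRecord(3, 12, datetime(2017, 1, 2, 20, 30), 21, datetime(2017, 1, 2, 21, 10)),
]

# Inflow during the 16:00-20:00 window used in the text for the evening commute.
evening_inflow = inflow_by_destination(records, time(16, 0), time(20, 0))
```

In this example, station 20 receives two trips within the window, while the trip to station 21 alights after 20:00 and is excluded.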

Share and Cite

MDPI and ACS Style

Singh, P.; Oh, K.; Jung, J.-Y. Flow Orientation Analysis for Major Activity Regions Based on Smart Card Transit Data. ISPRS Int. J. Geo-Inf. 2017, 6, 318. https://doi.org/10.3390/ijgi6100318
