#### *4.3. Results*

Table 1 reports the comparison results in terms of accuracy and accuracy in top 3. For accuracy in top 3, a prediction scores 1 if the correct label is among the three highest-ranked predicted locations and 0 otherwise, and the scores are averaged over the test trajectories. Our model (LSTM) outperformed the Markov approaches, yielding a 5% improvement over the best baseline, the global Markov model (GMM), a 10% improvement over the variable-order global Markov model (VGMM), and a 33% improvement over the personal Markov model (PMM). The accuracy in top 3 confirmed this trend, with our model improving by 7% over GMM, 8% over VGMM, and 47% over PMM.
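
The two metrics described above can be sketched as follows. This is a minimal illustration, not the evaluation code used in the study; the function name `top_k_accuracy` and the toy location labels are assumptions introduced here.

```python
from typing import Sequence


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]],
                   true_labels: Sequence[str],
                   k: int = 3) -> float:
    """Share of samples whose true label is among the first k ranked predictions."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        return 0.0
    # Each sample scores 1 if the true label is in the top k, otherwise 0.
    hits = sum(1 for ranked, truth in zip(ranked_predictions, true_labels)
               if truth in ranked[:k])
    return hits / len(true_labels)


# Hypothetical ranked outputs (most likely location first) for four test samples.
ranked = [["A", "B", "C"], ["B", "A", "D"], ["C", "D", "A"], ["D", "C", "B"]]
truth = ["A", "A", "B", "B"]

accuracy = top_k_accuracy(ranked, truth, k=1)       # plain accuracy
accuracy_top3 = top_k_accuracy(ranked, truth, k=3)  # accuracy in top 3
print(accuracy, accuracy_top3)
```

With `k=1` the function reduces to plain accuracy, so a single routine covers both columns of a table such as Table 1.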

**Table 1.** Overall performance comparison between our methodology (LSTM) and the Markov baseline approaches, namely personal Markov model (PMM), global Markov model (GMM), and variable-order global Markov model (VGMM).


As expected, PMM, which was based solely on individual mobility and ignored collective motion behavior, had the lowest scores in this regime of short and non-repetitive traces. GMM and VGMM, which considered the collective mobility of all users, performed considerably better, with the first-order model surpassing the variable-order model. LSTM yielded a further gain, exceeding the best baseline by 2.5 percentage points in accuracy and 5 percentage points in accuracy in top 3.

Moreover, we analyzed how different trajectory characteristics affect prediction. The aim was to evaluate the influence of motion features, such as the traveled distance and the radius of gyration, on prediction performance.

Table 2 shows the accuracy and accuracy in top 3 (in brackets) for different values of traveled distance within the six hours prior to prediction. Five bins were selected: ≤10 km, 10–25 km, 25–50 km, 50–100 km, and ≥100 km. In terms of accuracy, performance tended to decrease as the traveled distance increased. PMM performed very poorly in every bin, while GMM and VGMM achieved remarkable results for mid and short distances, respectively. In particular, GMM substantially outperformed VGMM for mid-range values (10–100 km), but was outperformed by it for very short distances (≤10 km). LSTM exceeded every baseline in every bin, although it only slightly outperformed GMM for mid-short distances (10–50 km), and it outperformed the other methods by a wide margin for very long distances (≥100 km). Moreover, its accuracy in top 3 was consistently much higher than that of every baseline in each distance bin.
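
A per-bin breakdown of this kind can be sketched as below. The bin edges are those listed above; how values falling exactly on an interior edge (25, 50 km) are assigned is an assumption, as are the function names and the toy samples.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

BIN_LABELS = ["<=10 km", "10-25 km", "25-50 km", "50-100 km", ">=100 km"]


def distance_bin(distance_km: float) -> str:
    """Map a traveled distance to one of the five bins used in the text."""
    if distance_km <= 10:
        return BIN_LABELS[0]
    if distance_km < 25:   # interior edges assigned to the upper bin (assumption)
        return BIN_LABELS[1]
    if distance_km < 50:
        return BIN_LABELS[2]
    if distance_km < 100:
        return BIN_LABELS[3]
    return BIN_LABELS[4]


def accuracy_per_bin(samples: Iterable[Tuple[float, bool]]) -> Dict[str, float]:
    """samples: (traveled distance in km, whether the prediction was correct)."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for distance_km, is_correct in samples:
        label = distance_bin(distance_km)
        totals[label] += 1
        hits[label] += int(is_correct)
    # Only bins that actually contain samples are reported.
    return {label: hits[label] / totals[label]
            for label in BIN_LABELS if totals[label]}


# Hypothetical samples, not taken from the study.
samples = [(4.0, True), (8.5, True), (10.0, False),
           (30.0, True), (120.0, False), (250.0, True)]
per_bin = accuracy_per_bin(samples)
print(per_bin)
```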


**Table 2.** Accuracy (and accuracy in top 3 in brackets) comparison for different values of traveled distance.

Table 3 reports the accuracies for different values of radius of gyration (ROG), in bins of ≤3 km, 3–10 km, 10–32 km, and ≥32 km. These results reinforce the observations made for traveled distance: the general tendency of decreasing performance as the ROG increases, the overall poor results of PMM, the good results of VGMM for very small values (≤3 km), and the remarkable performance of GMM for mid-range values (3–32 km). Again, LSTM always outperformed the baselines, only slightly beating the GMM accuracy in the 3–10 km bin but greatly outperforming the other methods for very large ROG values (≥32 km). As in the traveled distance case, its accuracy in top 3 was consistently much higher than that of the baselines in each ROG bin.
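
For readers unfamiliar with the measure, the sketch below computes a radius of gyration using the common definition (root mean square distance of the visited points from their center of mass). Whether this matches the exact definition adopted in the study is an assumption; the center of mass is approximated by the mean latitude and longitude, which is only reasonable for small regions away from the poles and the antimeridian.

```python
import math
from typing import Sequence, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def radius_of_gyration_km(points: Sequence[Tuple[float, float]]) -> float:
    """Root mean square distance of (lat, lon) points from their mean position."""
    if not points:
        raise ValueError("at least one point is required")
    lat_c = sum(lat for lat, _ in points) / len(points)
    lon_c = sum(lon for _, lon in points) / len(points)
    mean_sq = sum(haversine_km(lat, lon, lat_c, lon_c) ** 2
                  for lat, lon in points) / len(points)
    return math.sqrt(mean_sq)


def rog_bin(rog_km: float) -> str:
    """Bins from the text; assignment of interior edges is an assumption."""
    if rog_km <= 3:
        return "<=3 km"
    if rog_km < 10:
        return "3-10 km"
    if rog_km < 32:
        return "10-32 km"
    return ">=32 km"


# Hypothetical traces: a stationary user and one moving along the equator.
stationary = radius_of_gyration_km([(45.0, 9.0), (45.0, 9.0)])
spread = radius_of_gyration_km([(0.0, -0.5), (0.0, 0.5)])
print(stationary, rog_bin(stationary), spread, rog_bin(spread))
```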

**Table 3.** Accuracy (and accuracy in top 3 in brackets) comparison for different values of radius of gyration.


In addition, we observed the prediction variability at different hours of the day. Figure 5 displays the accuracy and accuracy in top 3 of the four methods over time, starting from midnight. Rush hours in the afternoon appeared to be more predictable than the ones in the morning, while accuracies significantly increased in the evening and night due to the higher regularity of mobility patterns during these hours. LSTM was shown to outperform the baselines for every hour of the day.

**Figure 5.** Prediction accuracy (on the left) and accuracy in top 3 (on the right) with respect to the hour of the day.

Performance was further explored with respect to the imbalance of the dataset, by evaluating results for popular and rare locations. Table 4 reports the accuracies for different ranges of location occurrences in the data, distinguishing frequently visited locations from less visited ones. The columns from left to right identify groups of locations in which each location accounts for, respectively, over 0.5% of the whole dataset, between 0.1% and 0.5%, between 0.05% and 0.1%, and less than 0.05%. As expected, there is a general drop in performance when passing from popular locations to rare ones. However, LSTM once again clearly outperforms the baselines.
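
The grouping of locations by their share of the dataset can be sketched as follows. The thresholds are those stated above; the handling of shares that fall exactly on a threshold, the function names, and the toy visit counts are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable


def occurrence_group(share: float) -> str:
    """Map a location's share of all occurrences (0..1) to one of four groups."""
    if share > 0.005:
        return ">0.5%"
    if share >= 0.001:   # boundary handling is an assumption
        return "0.1-0.5%"
    if share >= 0.0005:
        return "0.05-0.1%"
    return "<0.05%"


def group_locations(visits: Iterable[str]) -> Dict[str, str]:
    """Assign every location to a popularity group based on its visit share."""
    counts = Counter(visits)
    total = sum(counts.values())
    return {location: occurrence_group(n / total) for location, n in counts.items()}


# Hypothetical dataset of 10,000 visits dominated by one popular location.
visits = ["hub"] * 9949 + ["museum"] * 40 + ["lake"] * 8 + ["hut"] * 3
groups = group_locations(visits)
print(groups)
```

Accuracy can then be averaged separately over the test samples whose true location falls in each group.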


**Table 4.** Accuracy (and accuracy in top 3 in brackets) comparison for visited locations in different ranges of occurrence in the data. The percentage value in the first row refers to the amount of occurrences of each location in that column with respect to the whole dataset.

Finally, we focused on the prediction errors to study the behavior of our model in the cases where it was not able to correctly identify the future visited location. We compared LSTM with GMM, the best baseline in terms of accuracy, to assess how their predicted locations differed when both models mispredicted. Figure 6 reports the bar graphs representing the error distance distribution of the segments that were wrongly predicted by both models. The error distance was calculated as the absolute distance between the wrongly predicted location and the real future location (for wrong predictions in top 3, we considered the location, among the first three predicted, with the shortest distance to the real location). The bar graphs highlight the overall tendency of LSTM to make mistakes with a shorter error distance than GMM.
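
The two error distances described above can be sketched as below. For brevity, locations are represented as planar (x, y) coordinates in kilometers, which is an assumption made for this illustration; with geographic coordinates a great-circle distance would replace `math.dist`.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # hypothetical projected coordinates, in km


def error_distance_km(predicted: Point, actual: Point) -> float:
    """Distance between the wrongly predicted location and the real one."""
    return math.dist(predicted, actual)


def top_k_error_distance_km(ranked_predictions: Sequence[Point],
                            actual: Point,
                            k: int = 3) -> float:
    """Shortest distance to the real location among the first k predictions."""
    return min(math.dist(p, actual) for p in ranked_predictions[:k])


# Hypothetical misprediction: none of the first three candidates is correct.
actual = (0.0, 0.0)
ranked = [(30.0, 40.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]

top1_error = error_distance_km(ranked[0], actual)
top3_error = top_k_error_distance_km(ranked, actual, k=3)
print(top1_error, top3_error)
```

The fourth candidate is closer to the real location but is ignored, since only the first three predictions count for the top 3 error.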

**Figure 6.** Bar graphs representing the error distance distribution of LSTM and global Markov model (GMM) when both models predicted wrongly (wrong predictions in the left graph, wrong predictions in top 3 in the right graph).

We also studied the difference in error distance between the two prediction models, analyzing the corresponding mispredictions on the same segment. The bar graphs in Figure 7 display the difference *error*\_*distance*(*GMM*) − *error*\_*distance*(*LSTM*) for wrong predictions and wrong predictions in top 3; a negative value indicates that the baseline provided a shorter error distance on a wrongly predicted segment, whereas a positive value is in favor of our model. As depicted by the high bars on the right part of both graphs, there was a remarkable number of samples on which the error distance of GMM was a few tens of km larger than that of LSTM. Overall, besides its higher prediction accuracy, our model also achieved shorter error distances.
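
The sign convention of this comparison can be illustrated with a short sketch. The error values below are invented for the example and the function names are assumptions.

```python
from typing import Dict, List, Sequence


def error_distance_differences(gmm_errors_km: Sequence[float],
                               lstm_errors_km: Sequence[float]) -> List[float]:
    """Per-segment error_distance(GMM) - error_distance(LSTM)."""
    if len(gmm_errors_km) != len(lstm_errors_km):
        raise ValueError("both models must be evaluated on the same segments")
    return [g - l for g, l in zip(gmm_errors_km, lstm_errors_km)]


def summarize(differences: Sequence[float]) -> Dict[str, int]:
    """Count segments on which each model had the shorter error distance."""
    favor_lstm = sum(1 for d in differences if d > 0)  # positive: LSTM closer
    favor_gmm = sum(1 for d in differences if d < 0)   # negative: GMM closer
    ties = len(differences) - favor_lstm - favor_gmm
    return {"favor_lstm": favor_lstm, "favor_gmm": favor_gmm, "ties": ties}


# Hypothetical error distances (km) on four segments mispredicted by both models.
gmm_errors = [40.0, 12.0, 5.0, 20.0]
lstm_errors = [10.0, 12.0, 15.0, 8.0]

differences = error_distance_differences(gmm_errors, lstm_errors)
summary = summarize(differences)
print(differences, summary)
```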

**Figure 7.** Bar graphs representing the difference of error distance between GMM and LSTM when both models predicted wrongly (wrong predictions in the left graph, wrong predictions in top 3 in the right graph).

#### *4.4. Discussion*

We proposed a method to predict individual mobility traces of short-term foreign tourists, leveraging the collective large-scale motion behavior of people and a deep learning-based methodology adapted to process motion trajectories. The model relies on a recurrent neural network architecture composed of embedding and LSTM layers. We assessed the feasibility of this methodology on short, non-repetitive traces, revealing its potential for human mobility studies and applications.

In particular, our method was shown to outperform the widely used Markov model approaches based on location transition probabilities. The results showed that a probabilistic approach built on the motion behavior of a single individual performs very poorly in this mobility regime, indicating the need for collective motion information. This collective mobility, however, consists of non-repetitive traces that clearly influence prediction performance; the simpler first-order Markov model generally outperformed the variable-order model based on the longest common suffix. LSTM, specifically designed to find patterns along series, outperformed every baseline, demonstrating a higher capability of correctly predicting individual mobility traces, represented as ordered sequences of locations.
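
To make the notion of location transition probabilities concrete, the following is a minimal sketch of a first-order Markov predictor fitted on the pooled trajectories of all users. It is an illustration rather than the baseline implementation used in the study; the function names, the empty result for unseen locations, and the toy trajectories are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


def fit_first_order_markov(trajectories: Sequence[Sequence[str]]) -> Dict[str, Counter]:
    """Count transitions current -> next over the trajectories of all users."""
    transitions: Dict[str, Counter] = defaultdict(Counter)
    for trajectory in trajectories:
        for current, following in zip(trajectory, trajectory[1:]):
            transitions[current][following] += 1
    return transitions


def predict_next(transitions: Dict[str, Counter], current: str, k: int = 3) -> List[str]:
    """Return up to k most frequent successors of the current location."""
    if current not in transitions:
        return []  # location never seen as a starting point (assumed fallback)
    return [location for location, _ in transitions[current].most_common(k)]


# Hypothetical location sequences from different users.
trajectories = [["A", "B", "C"], ["A", "B", "D"], ["E", "B", "C"], ["A", "C"]]
model = fit_first_order_markov(trajectories)

after_b = predict_next(model, "B")
after_a = predict_next(model, "A", k=1)
unseen = predict_next(model, "Z")
print(after_b, after_a, unseen)
```

Ranking successors by count is equivalent to ranking them by estimated transition probability, since the counts for a given current location share the same normalizing total.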

We also observed how predictability varied with trajectory characteristics. Despite the general tendency of decreasing performance for longer traveled distances and larger explored areas (local movements were more predictable than long-distance movements), our model always achieved better accuracy than the baseline approaches. This tendency is reasonable: local movements draw on a restricted set of likely future locations, whereas long-distance movements are less predictable because a broad explored area implies a large number of possible future locations. However, our model achieved its largest accuracy gap over the baselines precisely for very high values of traveled distance and ROG, showing particular potential for long distances and large covered areas. Moreover, its accuracy in top 3 was always significantly higher than that of the other models, regardless of trajectory characteristics. The same holds for predictability over time, where results were split by hour of the day. Besides the fact that our methodology consistently performed better than the comparison methods, we observed that rush hours in the morning were generally less predictable than rush hours in the afternoon. This is because the traces preceding the early morning hours contain less meaningful information about future activities. Owing to the higher stationarity and regularity (individual and collective) during the night hours, trajectories sharing the same locations during the night can easily lead to different destinations in the morning; therefore, the recent past motion activity becomes less important in predicting the next location. In contrast, the recently visited locations gain importance when predicting the afternoon hours because they carry information about motion behavior in the morning, which is more often indicative of future movements. Finally, predictability increases at night due to the intrinsically higher regularity of mobility patterns during these hours, which is also reflected in the better performance of the variable-order Markov model over the first-order model in the late night and morning hours, and for small values of traveled distance and ROG.

Furthermore, another meaningful performance indicator was obtained by assessing the results in relation to the class imbalance, to observe how the model behaves on frequent and rare locations. While better results were expected for locations that are often visited, it was worth verifying that the performance of the model did not collapse for very rare locations. In general, besides a tendency to obtain very accurate predictions for popular locations, LSTM still outperformed the baselines, achieving acceptable results even for very rare locations.

Another relevant aspect is the prediction error. While the main goal is to correctly identify the next location, it is also important, when the prediction is wrong, to assess how wrong it is. Comparing our model with the best baseline, we verified that the error distance of our methodology is generally smaller, in particular a few tens of kilometers smaller for a large number of observations, whereas the error is strongly in favor of the Markov model far more rarely. This shows that, in terms of error distance, LSTM tends to make less serious mistakes than the Markov model, further supporting its advantage.

In conclusion, the presented deep learning methodology shows advantages in location prediction of non-repetitive traces generated by short-term foreign tourists. This fits in the field of deep learning-based artificial intelligence for smart city research and smart tourism, e.g., for enhancing user experiences or providing advanced decision making. In particular, this work contributes to the computer science side of the variety of disciplines involved in smart city research [79], specifically falling into the field of analytics technologies, which comprises decision-making-oriented approaches to discover hidden patterns in big data. These approaches have recently gained critical interest and development, especially for their social impact implications [80,81]. Nonetheless, their contribution is only one facet of the multi-disciplinary reality of smart city and smart tourism, and synergies with the other disciplines need to be carefully evaluated to guarantee valuable outcomes [82]. In any case, the proposed research opens a wide variety of potentially suitable applications, ranging from personalized location-based services, to crowd control, to destination planning and management. The most straightforward application is the optimization of the quality of individual touristic experiences. Personalized information and recommendations can be provided to a specific tourist along the path, highlighting optional spots and attractions within the next visited area predicted by the model. In addition, collecting the predictions of individual spatial choices can reveal potentially crowded areas, giving rise to congestion warnings for those tourists who were forecast to visit those areas. Combining individual predictions can indeed be used to study the future collective spatial distribution of tourists, which is important for several tasks, including the adjustment of the supply of facilities and services, and sustainable countermeasures for real-time crowd control.

More broadly, this study fits in the context of trajectory prediction with machine learning methodologies. In particular, it highlights the potential of deep learning for human mobility studies and shows recurrent network models to be a promising tool for pattern recognition in trajectory analysis.

#### **5. Conclusions**

This paper presented a deep learning model to mine human motion patterns, aimed at predicting short-term foreign tourists' next location from place-based trajectories. The model was trained on the collective behavior of users to capture the dependency of track points and infer the latent patterns of motion traces to predict individual trajectories. The process follows a purely data-driven perspective, whereby the model is able to grasp mobility patterns directly from location sequences, without requiring any manual feature extraction or external information. We initially transformed raw traces into sequences of locations unfolding in fixed time steps, and then applied a deep neural network model composed of embedding and LSTM layers to correctly predict the next location in the sequence. Adopted in the context of short non-repetitive traces, our methodology was shown to outperform traditional approaches, expressing a potential that is worth examining in depth.
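
The preprocessing step mentioned above (turning raw traces into location sequences at fixed time steps) can be sketched as follows. The step length, the carry-forward of the last observed location, the window size, and all names are assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Optional, Sequence, Tuple


def to_fixed_steps(records: Sequence[Tuple[int, str]],
                   start: int, end: int, step: int) -> List[Optional[str]]:
    """Sample a trace at fixed time steps.

    records: (timestamp in minutes, location), sorted by timestamp.
    The most recent observed location is carried forward (assumption);
    steps before the first observation are None.
    """
    sequence: List[Optional[str]] = []
    last_location: Optional[str] = None
    index = 0
    t = start
    while t <= end:
        # Consume every record observed up to the current time step.
        while index < len(records) and records[index][0] <= t:
            last_location = records[index][1]
            index += 1
        sequence.append(last_location)
        t += step
    return sequence


def sliding_windows(sequence: Sequence[str], window: int) -> List[Tuple[List[str], str]]:
    """Build (input window, next location) pairs for next-location prediction."""
    return [(list(sequence[i:i + window]), sequence[i + window])
            for i in range(len(sequence) - window)]


# Hypothetical trace with three observations, sampled every 60 minutes.
records = [(5, "A"), (70, "B"), (200, "C")]
steps = to_fixed_steps(records, start=0, end=240, step=60)
pairs = sliding_windows([s for s in steps if s is not None], window=2)
print(steps, pairs)
```

Each pair holds the input sequence a sequence model would receive and the location it is asked to predict.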

Possible extensions of this paper can explore augmentation of trajectory data with further information. A research direction may consist of explicitly integrating time information in the sequence, assessing probable performance improvements. In addition, other factors can be taken into consideration, including tourist characteristics such as nationality or age. Furthermore, it could be appropriate to study tourists' mobility at a smaller scale, investigating the predictability of finer traces in time and space (e.g., in an urban environment); in this case, GPS data would allow more detailed resolutions than telecom data. Lastly, the same methodology could be tested for different use cases dealing with short and non-repetitive traces, not limited to tourism analysis.

In conclusion, the use of recurrent network architectures should be further explored in the field of human mobility, since the current promising results can potentially become successful applications in a variety of tasks related to trajectory analysis and motion behavioral studies.

**Author Contributions:** A.C. conceived and designed the experiments, analyzed the data and wrote the paper. E.B. supervised the work, helped with designing the conceptual framework, and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Austrian Science Fund (FWF) through the Doctoral College GIScience at the University of Salzburg (DK W 1237-N23).

**Acknowledgments:** The authors would like to thank Vodafone Italia for providing the dataset for the case study, and the Austrian Science Fund (FWF) for the Open Access Funding.

**Conflicts of Interest:** The authors declare no conflict of interest.
