### *3.2. Self-Sustainability*

In general, the self-sustainability of a model refers to its ability to survive, and hence to remain useful, in a dynamic environment. ITS models and developments are usually intended to operate over long periods of time. However, it is widely accepted that traffic and transportation phenomena are strongly dynamic in nature: they exhibit long-term trends and evolve in space and time but, when an unexpected event occurs, they are also susceptible to abrupt changes and exhibit long-term memory effects. For instance, a trip information system based on traffic forecasts for a certain part of the network, trained with historical data from recurrent traffic conditions, may not be easily transferable to other road networks, nor efficient in the case of a severe disruption of traffic operation (e.g., an accident). What is more, if the system is not continually retrained with new data, it will eventually fail to operate correctly even at the network location for which it was originally designed, due to contextually induced non-stationarities. Thus, an intelligent transportation system built on data-based approaches should at least follow a set of minimum self-sustainability requirements during the design workflow.

To better understand the importance of self-sustainability as a significant aspect of a model's actionability, one should bring to mind the case of cooperative ITS systems (e.g., advanced vehicle control systems) and automated driving. In such settings, a self-sustainable data-based model should bridge the gap between the development of a model prototype and its deployment in a real, potentially non-stationary environment.

When an ITS system or model is deployed to operate in changing conditions, self-sustainability involves dealing with the effects of such changes on the learned knowledge. Different strategies and design approaches may be required depending on the nature of the change and its effects on the model. We next delve into several attributes that are desirable when deploying data-driven systems or models in changing environments, rendering them actionable:

1. *Adaptable*: Data-driven models for ITS applications created in controlled conditions, with static, self-contained datasets, can provide great performance metrics, but can also fail if data evolve over time [96]. Adaptation is the reaction of a system, model or process to new circumstances, intended to reduce the deterioration of its performance with respect to the level expected before the environment changed. If data change over time and the model neither detects this evolution nor adapts to it, its output will eventually become obsolete. When these contextual variations occur over data streams and models are learned on-line from them (e.g., on-line clustering or classification), such variations can alter the statistical distribution of the input and/or output data, making it necessary to update the models so that their learned knowledge reflects the change. This phenomenon is known as *concept drift* [97], and has been identified as an active research challenge for most fields connected to machine learning in non-stationary environments [98]. Many of those fields are already studying this topic, from spam detection [99,100] to medicine [101].

There are two main lines of work related to concept drift: how to detect drift, and how to adapt to it. Both should be part of the research agenda of data-driven ITS, as they have obvious implications when analyzing traffic [102]. Situations such as road works can completely modify traffic profiles over a certain area for a period of time, after which the situation returns to normal. A similar case arises with road design changes (e.g., new lanes, transformation of lane types, new accesses, roundabouts, etc.), although in those cases a new stable traffic profile settles in after the change. Even without man-made changes, traffic profiles may change for socioeconomic reasons [103]. Besides, drift analysis can be used to detect anomalies in the normal operation of roads [104], or to analyze patterns in maritime traffic flow data [105]. However, the adaptability of ITS models to evolving data is scarcely addressed in the literature and, in many cases, concept drift management is itself the scope of the work rather than a circumstance considered in pursuit of a greater goal [104,106]. There are, though, some online approaches to typical ITS problems that consider the effects of drift in data [36,107,108], and we consider that this kind of initiative should lead the way towards actionable ITS research.

2. *Robust*: When an ITS system is deployed in a real-life environment, diverse kinds of setbacks can affect its normal operation, from power failures that preclude its functioning to the interruption of the input data flow. Robustness is a self-sustainability trait that prevents a system from ceasing to work when external disruptions occur. Although this is not a relevant feature in most research-level designs, it is essential for actionable, self-sustainable designs. Robustness, defined as the ability to recover from failures, imposes, however, different requirements depending on the criticality of the ITS system. Thus, in a traffic flow forecasting system robustness could merely imply that the system does not crash when input data fail [109], and that it continues to operate; on the other hand, for critical systems such as air traffic management, robustness would require additional measures to contain damage [110,111]. All in all, robust data-based workflows should be able to accommodate unseen operational circumstances, such as data distribution shifts or unprecedented levels of information uncertainty, which are particularly prevalent in crowdsourced and Social Media data [112,113].


Leaving aside calibration and training phases, classic transportation theories tend in general to be computationally more affordable than data-driven models. However, the computing power available nowadays removes most practical limitations arising from the computational complexity of learning algorithms in data-based modeling. An exception occurs with models falling within the Deep Learning category which, depending on their architecture and the size of the training data, may require specialized computing hardware such as GPUs or multi-core equipment. Nevertheless, the rising trend in terms of scalability is to make data-based models incremental and adaptable [3], which finds its rationale not only in the environmental sustainability of data centers (lower energy consumption and, thereby, carbon footprint), but also in the deployment of scalable model architectures on edge devices, which usually have significantly fewer computing resources than data centers.

Although some ITS problems are easy to scale, so that this feature is not troublesome, some fields can be very sensitive to scalability. For instance, route planners frequently consist of implementations of the shortest-path problem and the traveling salesman problem, whose complexity increases as the number of nodes grows [115]. This is a good example where artificial intelligence and optimization tools provide solutions that are actionable in terms of scalability, and where cases are easily found [116,117]. Caring about aspects such as the ease of introducing new variables when needed, the complexity of tuning (where applicable), or the execution time makes a model more actionable by increasing its self-sustainability. This need for scalability is not just a matter of the computational complexity of the modeling elements along the pipeline, but is also linked to the feasibility of migrating the designed models from a lab setup to, e.g., a Big Data computing architecture. Unfortunately, few publications currently reflect on whether their proposed data-based workflows can be deployed and run on legacy ITS systems, thereby avoiding costly upgrade investments in computing equipment.
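The adaptability, robustness and scalability concerns discussed above can be illustrated together with a minimal sketch. It is not taken from any of the cited works: the class names, the exponentially weighted forecaster, the choice of the Page-Hinkley test as drift detector, and all parameter values (`delta`, `threshold`, `alpha`) are assumptions made only for illustration. The forecaster updates incrementally in constant time per reading, keeps serving its last estimate when a reading is missing, and discards its outdated estimate once the detector signals drift in the forecast error.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PageHinkley:
    """Page-Hinkley test for an increase in the mean of a stream."""
    delta: float = 0.5       # tolerated drift magnitude (assumed value)
    threshold: float = 50.0  # alarm level (assumed value)
    n: int = 0
    mean: float = 0.0
    cum: float = 0.0         # cumulative deviation from the running mean
    cum_min: float = 0.0     # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold

    def reset(self) -> None:
        self.n, self.mean, self.cum, self.cum_min = 0, 0.0, 0.0, 0.0


class AdaptiveFlowForecaster:
    """One-step-ahead forecaster: exponentially weighted mean of the flow."""

    def __init__(self, alpha: float = 0.1) -> None:
        self.alpha = alpha
        self.estimate: Optional[float] = None
        self.detector = PageHinkley()

    def step(self, reading: Optional[float]) -> Tuple[Optional[float], bool]:
        """Consume one reading; return (forecast for next step, drift flag)."""
        if reading is None:            # robustness: sensor gap, keep serving
            return self.estimate, False
        if self.estimate is None:      # first valid reading initialises the model
            self.estimate = reading
            return self.estimate, False
        error = abs(reading - self.estimate)
        drift = self.detector.update(error)
        if drift:                      # adaptation: discard outdated knowledge
            self.estimate = reading
            self.detector.reset()
        else:                          # incremental O(1) update, no retraining
            self.estimate += self.alpha * (reading - self.estimate)
        return self.estimate, drift


def run_stream(stream: List[Optional[float]]) -> Tuple[List[Optional[float]], List[int]]:
    """Feed a stream; return all forecasts and the 1-based steps where drift fired."""
    model = AdaptiveFlowForecaster()
    forecasts, drift_steps = [], []
    for t, reading in enumerate(stream, start=1):
        forecast, drift = model.step(reading)
        forecasts.append(forecast)
        if drift:
            drift_steps.append(t)
    return forecasts, drift_steps
```

For a synthetic stream that stays at 100 vehicles per interval, contains one missing reading, and then drops to 60 (as road works might cause), `run_stream` keeps returning the last estimate during the gap and flags drift shortly after the drop. In a real deployment the detector and its thresholds would need to be calibrated against the noise level of the sensor data.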

### *3.3. Traffic Theory Awareness*

Theoretical representations of traffic attempt to construct (mostly simple) models with causal aspects. These models usually have a closed form and are frequently dictated by simplifying assumptions, which leads to limited performance when modeling complex spatio-temporal dynamics in the context of microscopic analysis. In these models, data are instrumental in estimating how well they fit real-world conditions. On the other hand, since their upsurge in the 1980s, data-driven models have relied exclusively on data to extract the dynamics that govern the phenomena. This, at least theoretically, makes them more adaptable and more efficient in complex conditions than theory-based models. However, they can hardly claim applicability in large-scale scenarios (city-level traffic management) due to their significant computational resource requirements. Such data-driven traffic models have been systematically implemented as proofs of concept and are now dominant in the Traffic Engineering literature [16], incorporating most well-known advanced techniques and, in many cases, ignoring elementary knowledge of traffic and focusing blindly on performance.

Owing to the above, researchers in traffic modeling have diversified the way in which their models are developed and evaluated, fitting them to the technology being introduced rather than to the well-established knowledge described in traffic flow theories. This results in models that are hardly actionable for traffic engineers, in terms of both their integration into legacy traffic control and management systems and their relevance to the decision making process of road network operators. Besides, there is a lack of standards regarding the data and scenarios used to assess performance, usually because each researcher works with whatever real data are available to them. This was already identified in [78], where test-beds were proposed as standards, either generated for the purpose or selected from existing ones. This would help to compare models, to understand them better (as they can be evaluated in a known environment), and to obtain their insights concerning traffic theory. Besides, as we anticipated in Section 2.1, there is an industrial trend towards the consideration of different data sources when modeling traffic dynamics. In many cases, these data sources do not have any straightforward relationship to traffic itself. The integration of these sources of data, the models learned from them and theoretical representations of transportation scenarios remains an open challenge that has started to be addressed in the literature [118–120].

In this line of reasoning, linking data-driven and theory-based models in transportation may lead to efficient and physically consistent representations of transportation phenomena. In fields like traffic modeling and forecasting, this hybrid approach makes it possible to consider theoretical aspects of traffic, such as the relationship among speed, flow and density, the three phases of traffic [121], or the Breakdown Minimization Principle [122] when modeling bottlenecks. These theoretical concepts come into play mainly in the preprocessing, modeling and prescription phases of the modeling workflow. In preprocessing, domain knowledge can be crucial for feature engineering, by describing how available features are related to each other, estimating collinearities in advance, deleting irrelevant predictors, or obtaining feature combinations with improved modeling power [78]. Applying traffic theories and principles can also be useful for data augmentation and missing data imputation, by simulating or generating data that are more akin to what the context can provide [36]. In the modeling phase, previously defined mathematical frameworks can help to define constraints and operation ranges, and to correct the output of data-based models, which do not by themselves take into account whether their output complies with well-established theories. Lastly, in the prescription phase, model outputs can be linked to traffic theory knowledge to improve the way in which they are applied: a predicted flow value can be more useful if the travel time or the bottleneck probability can be computed from it. Furthermore, predictive models trained only with observed past data can reach a point at which their predictions ultimately affect their own future behavior. For instance, a model that assists traffic management decisions, such as closing a lane, might lead to a situation that the model has not observed before, thus making the knowledge captured by the traffic model obsolete and useless until the data captured from the environment are exploited for retraining. Physical models can be highly useful to anticipate such scenarios and to complement data-based models, providing additional information about how theories or simulations determine that the scenario should behave.
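As a hedged illustration of how theory can be applied in the modeling and prescription phases, the sketch below post-processes a predicted flow value using the fundamental relation $q = k \cdot v$ together with Greenshields' linear speed-density model, $v = v_f (1 - k / k_j)$. The choice of Greenshields' model, the function names and the parameter values are assumptions made for this example and are not prescribed by the works cited above. The predicted flow is first clipped to the feasible range $[0, q_{max}]$, with $q_{max} = v_f k_j / 4$, and then converted into a speed and a travel time.

```python
import math


def greenshields_speed(flow, free_flow_speed, jam_density, congested=False):
    """Speed (km/h) consistent with a flow (veh/h) under Greenshields' model.

    A given flow corresponds to two traffic states; `congested` selects the
    high-density branch instead of the free-flow branch.
    """
    capacity = free_flow_speed * jam_density / 4.0   # maximum flow q_max
    flow = min(max(flow, 0.0), capacity)             # correct infeasible predictions
    root = math.sqrt(1.0 - flow / capacity)
    branch = -root if congested else root
    return 0.5 * free_flow_speed * (1.0 + branch)


def travel_time_minutes(flow, length_km, free_flow_speed, jam_density, congested=False):
    """Travel time over a link of `length_km`, derived from a predicted flow."""
    speed = greenshields_speed(flow, free_flow_speed, jam_density, congested)
    return math.inf if speed == 0.0 else 60.0 * length_km / speed
```

With an assumed free-flow speed of 100 km/h and jam density of 120 veh/km, a predicted flow of 2250 veh/h maps to 75 km/h in free flow or 25 km/h in congestion, that is, 4 or 12 minutes over a 5 km link. Which branch applies cannot be decided from the flow alone, which is one reason why a density or speed measurement is a valuable complement to a flow forecast.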

This emergent modeling paradigm is known as Theory Guided Data Science, and aims to enhance data-driven models by integrating scientific knowledge [123]. The main objective of this approach is to enable insightful learning by using the theoretical foundations of a specific discipline to tackle the problems of data representativeness, spurious patterns found in datasets, and physically inconsistent solutions. From the algorithmic point of view, this induction of domain knowledge can be achieved by various means, such as the use of specially devised regularization terms in predictive models (e.g., in the loss function of Deep Learning models), data cleansing strategies that account for known data correlations, or memetic solvers that incorporate local search methods embedding problem-specific heuristics. In transportation, there have been several examples of theory-enhanced models, ranging from the identification and characterization of traffic conditions [124,125], to data-driven and agent-based traffic simulation models for control and management [3,42,126,127], and cooperative intelligent driving services [128].
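A theory-guided regularization term of the kind mentioned above can be sketched as follows. This is only an illustrative assumption, not the formulation of any cited work: the model is supposed to predict flow, speed and density jointly, and the loss adds to the usual data-fitting error a penalty on violations of the fundamental relation $q = k \cdot v$. The function name and the `weight` value are hypothetical, and in practice the variables would typically be normalized before the two terms are combined.

```python
def theory_guided_loss(pred_flow, pred_speed, pred_density, obs_flow, weight=0.1):
    """Mean squared flow error plus a penalty on physically inconsistent outputs.

    pred_flow    : predicted flows q (veh/h)
    pred_speed   : predicted speeds v (km/h)
    pred_density : predicted densities k (veh/km)
    obs_flow     : observed flows (veh/h)
    weight       : relative importance of the physics term (assumed value)
    """
    n = len(obs_flow)
    # Data term: how far the predicted flows are from the observations.
    data_term = sum((q - y) ** 2 for q, y in zip(pred_flow, obs_flow)) / n
    # Physics term: squared residual of the fundamental relation q = k * v.
    physics_term = sum(
        (q - k * v) ** 2
        for q, v, k in zip(pred_flow, pred_speed, pred_density)
    ) / n
    return data_term + weight * physics_term
```

When the predicted triplets satisfy $q = k \cdot v$ exactly, the physics term vanishes and the loss reduces to the ordinary mean squared error; otherwise the optimizer is pushed towards outputs that a traffic engineer would recognize as consistent.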

Awareness of domain-specific knowledge can also be enforced at the end of the workflow. When decision making is formulated as an optimization problem, the family of optimization strategies known as Memetic Computing [129,130] has been used for years to combine global search techniques with low-level local search heuristics. These heuristics can be driven by intuition when tackling the optimization problem at hand or, more suitably for actionability purposes, by a priori knowledge about the decision making process gained as a result of human experience or prevailing theories. For instance, traffic management under incidents in the road network can largely benefit from the human knowledge acquired over the years by the manager in charge, since this knowledge may embed features of the traffic dynamics that are not easily observable in historical data. This knowledge can be inserted into an optimization algorithm devised to decide, e.g., which lanes should be rerouted after an accident.
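The following sketch shows, under stated assumptions, how a memetic solver can embed such a priori knowledge. The toy problem is hypothetical: after an incident, a manager must decide which detour routes to open so that the diverted demand is served at minimum detour penalty. The cost function, the parameter values and all names are invented for illustration. Operator knowledge enters in two places: the operator's default plan seeds the population, and `order` tells the local search which routes to reconsider first.

```python
import random


def plan_cost(plan, penalty, capacity, demand, shortage_weight=100.0):
    """Detour penalty of the open routes plus a charge for unserved demand."""
    detour = sum(p for p, is_open in zip(penalty, plan) if is_open)
    served = sum(c for c, is_open in zip(capacity, plan) if is_open)
    return detour + shortage_weight * max(0.0, demand - served)


def local_search(plan, cost, order):
    """Bit-flip hill climbing; `order` encodes which routes to revisit first."""
    plan, best = list(plan), cost(plan)
    improved = True
    while improved:
        improved = False
        for i in order:
            plan[i] = 1 - plan[i]
            candidate = cost(plan)
            if candidate < best:
                best, improved = candidate, True
            else:
                plan[i] = 1 - plan[i]      # undo a non-improving flip
    return plan


def memetic_search(cost, seed_plan, order, generations=30, pop_size=8, seed=0):
    """Evolutionary global search refined by knowledge-driven local search."""
    rng = random.Random(seed)
    n = len(seed_plan)
    # The operator's default plan seeds the population (a priori knowledge).
    population = [local_search(seed_plan, cost, order)]
    while len(population) < pop_size:
        random_plan = [rng.randint(0, 1) for _ in range(n)]
        population.append(local_search(random_plan, cost, order))
    for _ in range(generations):
        parent_a, parent_b = rng.sample(population, 2)
        child = [rng.choice(genes) for genes in zip(parent_a, parent_b)]  # crossover
        flip = rng.randrange(n)
        child[flip] = 1 - child[flip]                                     # mutation
        child = local_search(child, cost, order)                          # refinement
        worst = max(range(pop_size), key=lambda i: cost(population[i]))
        if cost(child) < cost(population[worst]):
            population[worst] = child      # replace only the worst individual
    return min(population, key=cost)
```

Because only the worst individual is ever replaced, the best plan found so far is never lost, so the returned plan is at least as good as the locally refined operator plan. A realistic application would replace `plan_cost` with a traffic simulation or an analytical delay model.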

### *3.4. Application Context Awareness*

Transportation is exceptionally diverse around the world, with notable differences in modes, preferences and availability due to social, economic and cultural disparity. Moreover, Intelligent Transportation Systems with different purposes also have characteristic requirements that can diverge widely in space and time. To address this landscape of complex and sometimes conflicting goals, policies and decision making must span from a few seconds (traffic management and control) to years (planning and design of new systems). It is strongly argued that data-driven frameworks are able to cope with context-aware datasets, due to their inherent capability of learning patterns hidden in rich data and of reconstructing, in a sense, the context of the application. Typical examples of such context-aware systems are the extraction of Origin-Destination matrices from cellphone-based data [131], mobility applications that aim to improve the mobility footprint of users [132], and smartphone-based driving insurance systems [133]. Although these approaches seem appropriate to complement the user's or the system's experience of a problem, significant uncertainty lies in their transferability and accuracy, owing to the lack of context-aware knowledge.

A certain degree of awareness of the context should be a matter of concern when developing ITS models that are intended to be actionable. Context-aware information is usually introduced in the modeling, for example by accounting for the demographic characteristics of the application area, the type of road or network, the mode, the travel purpose, etc. However, what is usually disregarded is a much broader consideration of the operational and system characteristics, such as how models can be introduced into the operations at hand, what the privacy concerns are with respect to data and information flows, and what regulatory framework and policy-level restrictions and goals apply.

First, within the operation, the deployment context in which a developed model is intended to be implemented can enforce a series of operative constraints. Creating and proposing an ITS model without observing these requirements is an exercise in futility, given its lack of actionability. From this operational perspective, the context ranges from deployment and operation costs (is the system cost-efficient considering its potential service?) to functioning modes (does the model meet the expected response times? can it operate in environments with reduced computational power?). As an illustration, a system designed to detect and identify pedestrians can be very effective in terms of performance, but if it does not operate at an appropriate speed, or it needs more demanding computations than can be run on board a vehicle, it is useless in an autonomous driving context [134]. A similar reasoning holds if by *operation cost* one thinks about the energy consumption of the model at hand. Questions such as whether the energy consumed by the model is compliant with the system should be kept in mind at design time, but also from the academic perspective, where efforts should be directed to the development of models that are consistent with the actual operating constraints.

Second, regulations constitute a hard and highly contextual constraint on the implementation of ITS. Besides the wide regulatory differences that can be found across regions, there are transport frameworks where regulations are especially rigid. A typical example is the case of airports [135], where there is a broad field for specialized ITS. Another example is the constantly rising use of drone systems to monitor traffic [136]. Models that fail to relate to the regulatory environment of the application are not actionable.

Third, data privacy and sovereignty constitute a growing concern in a connected world where, after a decade of handing over data with complacency, an awareness about the sharing of personal information is emerging. A recent example is the introduction of the EU General Data Protection Regulation (GDPR) framework, which severely challenged the manner in which data were fed to models, as well as data availability. ITS models that are based on personal data are common nowadays, for instance in floating car data based developments [137]. However, there are fields where this aspect is becoming crucial (autonomous driving connectivity [138], security in public transport environments [139]), and research is steering towards privacy-preserving approaches [140], an area in which technologies such as Blockchain can play a major role [141,142].

Fourth, social aspects of the application play a major role in modeling. Social transportation is the subfield of ITS where the "social" information coming from mobile devices, wearable devices and social media is used for a number of ITS management related applications [143]. The outcomes of social transportation may be, to name a few, traffic analysis and forecasting [144,145], transportation-based social media [146], transportation knowledge automation in the form of recommender systems and decision support systems [147], and services for the collection of further signals to be used later for the aforementioned purposes or others. However, cultural differences can have a relevant impact on how these systems operate, as social data are most commonly strongly linked to geographical information. This is a key aspect for their actionability.

Fifth, transportation is currently a large source of greenhouse-gas emissions [148]. These concerns are gaining momentum in a wide range of ITS applications, such as the discovery of parking spots [149], multimodality applications that grant travelers the chance of using collective transportation systems efficiently and conveniently [150], the improvement of logistics operations [151], shared mobility applications, which help reduce the number of one-passenger vehicles in the road network [152], or driving analytics to improve safety and ecological footprint [153–155].

Of course, research goes beyond the application context and does not always need to be connected to a certain application scenario. A prototype can be far from the practical requirements of its eventual deployment; still, knowing the essential common ground of the application is key to converging on actionable models. Unfortunately, this is a matter frequently disregarded in ITS research.
