**3. Methodology**

Figure 1 outlines the overall procedure for predicting link speeds on a traffic network. Given raw data such as adjacency, distance, and speed stored in a data tensor, we compute reachability information between each pair of source and destination links connected through a chain of neighboring links: the hop count, the distance, and the cost of the traffic flowing between them. We expect that encoding the temporal dynamics of traffic flow on neighboring links, paired with external features such as weather conditions, time of day, and day of the week, will significantly improve the prediction of link speeds. Section 3.1 explains in greater detail how we generate the embedding of these complex, context-aware spatio-temporal features. We model the correlation between the input features and each link's speed with a Long Short-Term Memory (LSTM) network. We use 512 units in the hidden layer, ReLU as the activation function [36,37], and Adam as the optimizer [38,39].

**Figure 1.** The overall procedure for traffic link embedding and training for traffic speed prediction.
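To make the reachability step concrete, the following is a minimal sketch of how hop count, distance, and a traffic cost could be derived from adjacency, distance, and speed data. It is an illustration under stated assumptions, not the implementation used in this work: the toy network `LINK_ADJACENCY`, the helper names `build_adjacency` and `reachability`, and the choice of travel time (distance divided by speed) as the cost are all hypothetical. Each measure is obtained with the same shortest-path search (Dijkstra's algorithm), changing only the weight assigned to a transition between neighboring links.

```python
import heapq

# Hypothetical toy network (not from the paper): each key is a pair of
# neighboring links (from_link, to_link); the value is
# (distance in km, observed speed in km/h) for that transition.
LINK_ADJACENCY = {
    ("A", "B"): (1.0, 50.0),
    ("B", "C"): (2.0, 40.0),
    ("A", "C"): (4.0, 80.0),
    ("C", "D"): (1.0, 20.0),
}


def build_adjacency(pairs):
    """Turn the pair dictionary into an adjacency list keyed by link."""
    adj = {}
    for (u, v), attrs in pairs.items():
        adj.setdefault(u, []).append((v, attrs))
        adj.setdefault(v, [])  # make sure links without successors appear
    return adj


def reachability(adj, source, weight):
    """Dijkstra's algorithm: smallest accumulated weight from source to
    every reachable link. `weight` maps (distance, speed) to a
    non-negative cost for one transition."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if cost > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, attrs in adj[u]:
            new_cost = cost + weight(attrs)
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                heapq.heappush(heap, (new_cost, v))
    return best


adj = build_adjacency(LINK_ADJACENCY)

# Hop count: every transition costs 1.
hops = reachability(adj, "A", lambda attrs: 1.0)
# Distance: every transition costs its length.
dist = reachability(adj, "A", lambda attrs: attrs[0])
# Assumed cost measure: travel time in hours (distance / speed).
time_cost = reachability(adj, "A", lambda attrs: attrs[0] / attrs[1])

for link in sorted(adj):
    print(link, hops[link], dist[link], round(time_cost[link], 3))
```

In this toy network the three measures disagree about the best route from A to C: the direct transition minimizes hop count and travel time, while the path through B minimizes distance. This is one reason to keep all three as separate reachability features rather than collapsing them into one.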
