**2. Methodology**

As formulated in the previous section, we have three different problems to solve: the coordinate systems of the sensor data are not completely synchronized; there is no known matching between individual vehicles; and the dynamic behavior of the vehicles between the two measured cross-sections, which we call gap sections, is unknown. We solved the first two problems simultaneously by making an assumption about the vehicle dynamics, namely that the sensors are close enough and the traffic fluent enough that, on average, drivers perform only limited deceleration or acceleration maneuvers. This assumption is reasonable for the distances considered in this work, 100 m–200 m, because in fluent traffic, such sections are covered within a few seconds, which restricts the number of possible maneuvers. We must also note that it is of great interest to reconstruct microscopic data of fluent traffic, as this state leads to high-risk interactions between vehicles due to large speed differences. For non-fluent traffic, analytically determining the more complex dynamic behavior between cross-sections is impossible without resorting to driving behavior modeling. Additionally, low traffic speeds lead to less safety-critical interactions, so from the point of view of traffic data analysis on a microscopic scale, this state is of less interest than fluent traffic.

We formulate the assumption by describing the vehicles' movement along the street axis (in one lane) as:

$$s_i(t) = s_i(t_0) + v_i(t_0)(t - t_0) + \frac{1}{2}a_i(t - t_0)^2,\tag{1}$$

where $s_i(t)$ is the position of vehicle $i$ at time $t$, $s_i(t_0)$ and $v_i(t_0)$ are its location and speed at the beginning of the gap section, and $a_i$ is its time-constant acceleration.

From Equation (1), it can also easily be derived that:

$$s_i(t) = s_i(t_0) + \frac{v_i(t) + v_i(t_0)}{2}(t - t_0) \tag{2}$$

so that the position is the same as if the vehicle had moved with the mean of its speeds at $t_0$ and $t$.
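As a quick numerical check of this relation, the constant-acceleration position of Equation (1) coincides with the mean-speed form of Equation (2); the sample values below are illustrative choices of our own, not from the paper:

```python
# Sketch of the constant-acceleration model in Equations (1) and (2);
# variable names (s0, v0, a) are illustrative.

def position(s0, v0, a, t, t0=0.0):
    """Position under constant acceleration, Equation (1)."""
    dt = t - t0
    return s0 + v0 * dt + 0.5 * a * dt ** 2

def speed(v0, a, t, t0=0.0):
    """Speed under constant acceleration."""
    return v0 + a * (t - t0)

# Equation (2): the same position follows from the mean of the
# entry and exit speeds over the gap section.
s0, v0, a, t = 0.0, 20.0, 0.8, 6.0          # 20 m/s entry speed, mild acceleration
s_kinematic = position(s0, v0, a, t)
s_mean_speed = s0 + 0.5 * (v0 + speed(v0, a, t)) * t
assert abs(s_kinematic - s_mean_speed) < 1e-9
```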

#### *2.1. Vehicle Matching and Offset Estimation*

Based on the assumption formulated above, we treated the vehicle matching as a probability density estimation problem. To solve it, we applied the expectation maximization (EM) algorithm described in [25]. The algorithm aims at finding maximum likelihood estimates from incomplete data by looping over two steps:

- the expectation (E) step, in which the expected complete-data log likelihood is computed under the current parameter estimates, and
- the maximization (M) step, in which the parameters are updated by maximizing this expectation.


This method was originally intended to determine the parameters of a few statistical distributions from a larger set of incomplete data. We applied it to our problem as follows. Suppose we have two sensors detecting vehicles with their timestamps and speeds at two consecutive cross-sections. The set of $M$ vehicles with their respective speeds from the second sensor forms linear Gaussian kernels, while the $N$ vehicles and speeds from the first sensor form the data upon which we are trying to find the distribution parameters. The mean of each individual Gaussian model is defined by the linear trajectory corresponding to the mean speed of a vehicle pair, as defined in (2), and the deviation of a data point is measured as its perpendicular distance to this trajectory line. The geometrical interpretation of this distance can be seen in Figure 1, where $\mathbf{y}_m = (t_{m0}, s(t_{m0}))$ is the recorded time and location of a vehicle entry from one sensor and $\mathbf{x}_n$ is an entry from the dataset of the other sensor. We define the vector $\mathbf{w}_{mn} = \mathbf{x}_n - \mathbf{y}_m$ and let $\mathbf{u}_{mn}$ be the unit vector of the linear vehicle trajectory calculated from the mean speed between the two entries. We calculate the perpendicular vector $\mathbf{w}_{mn}^{\perp}$ from the desired linear trajectory to $\mathbf{x}_n$ as follows:

$$\mathbf{w}_{mn}^{\perp} = \mathbf{w}_{mn} - \mathrm{proj}_{\mathbf{u}_{mn}}(\mathbf{w}_{mn}) = \mathbf{w}_{mn} - \mathbf{u}_{mn}\frac{\mathbf{u}_{mn}^{\mathsf{T}}\mathbf{w}_{mn}}{\mathbf{u}_{mn}^{\mathsf{T}}\mathbf{u}_{mn}} = \mathbf{w}_{mn} - \mathbf{u}_{mn}\mathbf{u}_{mn}^{\mathsf{T}}\mathbf{w}_{mn} = (\mathbf{I} - \mathbf{u}_{mn}\mathbf{u}_{mn}^{\mathsf{T}})\mathbf{w}_{mn},$$

where the third equality uses $\mathbf{u}_{mn}^{\mathsf{T}}\mathbf{u}_{mn} = 1$.

Thus, the distance of the point to the trajectory line is given by:

$$\left\|\mathbf{w}_{mn}^{\perp}\right\| = \left\|(\mathbf{I} - \mathbf{u}_{mn}\mathbf{u}_{mn}^{\mathsf{T}})\mathbf{w}_{mn}\right\| = \left\|(\mathbf{I} - \mathbf{u}_{mn}\mathbf{u}_{mn}^{\mathsf{T}})(\mathbf{x}_n - \mathbf{y}_m)\right\|.$$

**Figure 1.** Geometric interpretation of the distance of a data point from the linear trajectory of another data point. The data point $\mathbf{y}_m$ is the time and location of a vehicle detection in the first sensor, while $\mathbf{x}_n$ is a corresponding data entry from the second sensor. $\mathbf{u}_{mn}$ is the unit vector with the gradient calculated as the mean of the two speeds measured at the sensors. $\mathrm{proj}_{\mathbf{u}_{mn}}\mathbf{w}_{mn}$ is the projection of $\mathbf{w}_{mn}$ onto this unit vector. The distance of the second data entry from the optimal linear trajectory corresponding to the mean speed can be derived by subtracting the projected vector from $\mathbf{w}_{mn}$.
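The projection illustrated in Figure 1 can be sketched in a few lines of NumPy; the sample values (a 21 m/s mean speed and a 5 s / 110 m separation) are illustrative and not taken from the paper:

```python
import numpy as np

# Numeric illustration of the projection in Figure 1.
y_m = np.array([0.0, 0.0])          # (time, position) of the first entry
x_n = np.array([5.0, 110.0])        # (time, position) of the second entry
v_mean = 21.0                       # mean of the two measured speeds [m/s]

u_mn = np.array([1.0, v_mean])
u_mn /= np.linalg.norm(u_mn)        # unit vector along the linear trajectory
U_mn = np.eye(2) - np.outer(u_mn, u_mn)   # projection matrix I - u u^T

w_mn = x_n - y_m
w_perp = U_mn @ w_mn                # perpendicular component of w_mn
dist = np.linalg.norm(w_perp)       # distance of x_n from the trajectory line

assert abs(u_mn @ w_perp) < 1e-9    # w_perp is orthogonal to u_mn
assert np.allclose(U_mn @ U_mn, U_mn)   # projecting twice changes nothing
```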

The Gaussian model of the vehicle pair $(m, n)$ has the form:

$$p(\mathbf{x}_n|\Theta_m) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\|(\mathbf{I} - \mathbf{u}_{mn}\mathbf{u}_{mn}^{\mathsf{T}})(\mathbf{x}_n - \mathbf{y}_m)\|^2}{2\sigma^2}\right).\tag{3}$$

In the rest of the paper, we will use the notation $\mathbf{U}_{mn} = \mathbf{I} - \mathbf{u}_{mn}\mathbf{u}_{mn}^{\mathsf{T}}$ for the projection matrix that yields the component perpendicular to the unit vector $\mathbf{u}_{mn}$. The Gaussian distributions of all vehicles use the same standard deviation, and all have equal membership probabilities $p(m) = \frac{1}{M}$. The mixture model is defined as:

$$p(\mathbf{x}|\Theta) = \sum_{m=1}^{M} p(m)\,p(\mathbf{x}|\Theta_m),\tag{4}$$

while the log likelihood function given the set of model parameters is:

$$\ln(p(\mathbf{x}|\Theta)) = \sum_{n=1}^{N} \ln \sum_{m=1}^{M} p(m)\,p(\mathbf{x}_n|\Theta_m)\,. \tag{5}$$
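For concreteness, a direct (unoptimized) evaluation of the mixture density (4) and log likelihood (5) might look as follows; the helper name and the toy values are ours, not from the paper:

```python
import numpy as np

def log_likelihood(X, Y, v_mean, sigma2):
    """Equation (5): X (N,2) data entries, Y (M,2) kernel entries,
    v_mean (M,N) pairwise mean speeds, sigma2 the common variance."""
    N, M = len(X), len(Y)
    ll = 0.0
    for n in range(N):
        mix = 0.0
        for m in range(M):
            u = np.array([1.0, v_mean[m, n]])
            u /= np.linalg.norm(u)
            # perpendicular distance to the linear trajectory, as in Figure 1
            w_perp = (np.eye(2) - np.outer(u, u)) @ (X[n] - Y[m])
            dens = np.exp(-(w_perp @ w_perp) / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
            mix += dens / M                 # equal memberships p(m) = 1/M
        ll += np.log(mix)
    return ll

# A point lying on the 20 m/s trajectory is more likely than one 40 m off it.
Y = np.array([[0.0, 0.0]])
v_mean = np.array([[20.0]])
on_line = log_likelihood(np.array([[5.0, 100.0]]), Y, v_mean, 4.0)
off_line = log_likelihood(np.array([[5.0, 140.0]]), Y, v_mean, 4.0)
assert on_line > off_line
```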

Figure 2 shows a probability distribution of a Gaussian mixture model defined by Equation (4) with two vehicle trajectories, with the first one having a higher speed than the second and the vehicles passing the detection sensor with a difference of one second.

**Figure 2.** Distribution of a Gaussian mixture model consisting of two linear vehicle trajectories. If there is one data entry $\mathbf{x}$ with a specific speed and two possible data entries $\mathbf{y}$, this results in two projected vectors $\mathrm{proj}_{\mathbf{u}_{mn}}\mathbf{w}_{mn}$ and two Gaussian distributions from Equation (3). The Gaussian distributions depend on the orthogonal distance from the projected vector and add up to the Gaussian mixture model seen in this figure.

As described above, the maximization of this likelihood function is solved by looping through two steps. The first one is calculating the expected complete-data log likelihood function given by:

$$Q(\Theta, \Theta^{old}) = \sum_{n=1}^{N} \sum_{m=1}^{M} p(m|\mathbf{x}_n, \Theta^{old}) \ln[p(m)p(\mathbf{x}_n|\Theta_m)].\tag{6}$$

The missing data consist of the soft matching of the detected vehicles so that (6) can be derived by determining the a posteriori probabilities of Gaussian model *m* matching vehicle *n*:

$$p(m|\mathbf{x}_n,\Theta^{old}) = \frac{p(\mathbf{x}_n|\Theta_m^{old})}{\sum_{m'=1}^{M} p(\mathbf{x}_n|\Theta_{m'}^{old})} = \frac{\exp\left(-\frac{\|\mathbf{U}_{mn}(\mathbf{x}_n - \mathbf{y}_m)\|^2}{2\sigma^2}\right)}{\sum_{m'=1}^{M}\exp\left(-\frac{\|\mathbf{U}_{m'n}(\mathbf{x}_n - \mathbf{y}_{m'})\|^2}{2\sigma^2}\right)}\tag{7}$$
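Numerically, Equation (7) is a column-wise softmax over the squared point-to-line distances; a small sketch with illustrative distance values of our own:

```python
import numpy as np

# E-step of Equation (7): soft matching probabilities p(m | x_n).
# d2[m, n] holds ||U_mn (x_n - y_m)||^2; M = 2 kernels, N = 3 data points.
d2 = np.array([[0.1, 9.0, 16.0],
               [8.0, 0.2, 10.0]])
sigma2 = 1.0

logits = -d2 / (2 * sigma2)
logits -= logits.max(axis=0, keepdims=True)   # stabilize the exponentials
p = np.exp(logits)
p /= p.sum(axis=0, keepdims=True)             # normalize over the M kernels

assert np.allclose(p.sum(axis=0), 1.0)        # each column is a distribution over m
assert p[0, 0] > p[1, 0] and p[1, 1] > p[0, 1]  # nearest kernels dominate
```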

The second step consists of maximizing the $Q$ function given by (6) with respect to the Gaussian parameters. At this point, we have to define the set of parameters $\Theta$ considered here for optimization. As already stated above, one of the common parameters is the variance of the Gaussians, $\sigma^2$. As our initial problem was to determine the offset of the sensors, the other parameter is a translation, which moves the Gaussians along the space and/or time axes. As the set of Gaussians is based on vehicle data recorded by one sensor, it is feasible to use a common translation vector $\mathbf{t}$ for all the Gaussians, because the offsets have to be consistent for all vehicles recorded by a sensor. After applying the logarithm to the exponential function, we get:

$$\begin{split} Q(\Theta, \Theta^{old}) &= -N\ln(M\sqrt{2\pi}) - \frac{N}{2}\ln(\sigma^2) - \\ &\quad - \frac{1}{2\sigma^2} \sum_{n=1}^{N} \sum_{m=1}^{M} p(m|\mathbf{x}_n, \Theta^{old})\, \left\| \mathbf{U}_{mn}(\mathbf{x}_n - (\mathbf{y}_m + \mathbf{t})) \right\|^2 \end{split} \tag{8}$$

Note that in Equation (8), the first term is constant and can be neglected. Taking the partial derivative with respect to the translation parameter $\mathbf{t}$ and equating it to zero gives:

$$\mathbf{t} = \mathbf{U}^{-1} \left( \sum_{n=1}^{N} \sum_{m=1}^{M} p(m|\mathbf{x}_n, \Theta^{old})\, \mathbf{U}_{mn} (\mathbf{x}_n - \mathbf{y}_m) \right), \tag{9}$$

where $\mathbf{U} = \sum_{n=1}^{N}\sum_{m=1}^{M} p_{mn}\,\mathbf{U}_{mn}$ and $\mathbf{U}^{-1}$ is its inverse matrix. By substituting $\mathbf{t}$ back into the function $Q$, we can maximize with respect to $\sigma$, again by taking the partial derivative and setting it to zero:

$$\sigma^2 = \frac{1}{N}\sum_{n=1}^{N} \sum_{m=1}^{M} p(m|\mathbf{x}_n, \Theta^{old}) \left\| \mathbf{U}_{mn}(\mathbf{x}_n - \tilde{\mathbf{y}}_m) \right\|^2,\tag{10}$$

where $\tilde{\mathbf{y}}_m = \mathbf{y}_m + \mathbf{t}$ denotes the shifted second-sensor entries. In accordance with the EM-algorithm, we cycle through the E- and M-steps until the change in either the log-likelihood function or in the parameters of the Gaussian models ($\sigma$ and $\mathbf{t}$) is small enough to consider that the algorithm has converged.

We summarize our approach as an algorithmic description in Algorithm 1. Here, the E-step consists of recalculating the soft matching probabilities, in other words, the probability that vehicle $n$ from the first sensor matches vehicle $m$ from the second one. The M-step is a spatio-temporal shift of all $M$ vehicles from the second sensor by a common vector $\mathbf{t}$ and an update of the common standard deviation of the Gaussians.

**Algorithm 1:** Vehicle registration using expectation maximization.

**Data:**

- $\mathbf{x}_n = (t_n, s_n)$: vehicle data from the first sensor, $n \in (1, N)$
- $\mathbf{y}_m = (t_m, s_m)$: vehicle data from the second sensor, $m \in (1, M)$
- $v_{mn} = \frac{1}{2}(v_n + v_m)$: mean speed between the sensor measurements

**Initialization:**

- $\mathbf{t} = 0$
- $\mathbf{u}_{mn} = \mathbf{v}_{mn}/\|\mathbf{v}_{mn}\|$, $\quad \mathbf{U}_{mn} = \mathbf{I} - \mathbf{u}_{mn}\mathbf{u}_{mn}^{\mathsf{T}}$
- $\sigma^2 = \frac{1}{NM} \sum_{n=1}^{N} \sum_{m=1}^{M} \left\|\mathbf{U}_{mn}(\mathbf{x}_n - \mathbf{y}_m)\right\|^2$

**while** $\Delta Q > T_Q$ and ($\Delta\sigma > T_\sigma$ or $\Delta\mathbf{t} > T_t$) **do**

- E-step: $p_{mn} = \dfrac{\exp\left(-\frac{\|\mathbf{U}_{mn}(\mathbf{x}_n - \mathbf{y}_m)\|^2}{2\sigma^2}\right)}{\sum_{m'=1}^{M} \exp\left(-\frac{\|\mathbf{U}_{m'n}(\mathbf{x}_n - \mathbf{y}_{m'})\|^2}{2\sigma^2}\right)}$
- M-step:
  - $\mathbf{U} = \sum_{m=1}^{M} \sum_{n=1}^{N} p_{mn}\,\mathbf{U}_{mn}$
  - $\mathbf{t} = \mathbf{U}^{-1} \sum_{m=1}^{M} \sum_{n=1}^{N} p_{mn}\,\mathbf{U}_{mn}(\mathbf{x}_n - \mathbf{y}_m)$
  - $\mathbf{y}_m \leftarrow \mathbf{y}_m + \mathbf{t}$
  - $\sigma^2 = \frac{1}{N}\sum_{m=1}^{M} \sum_{n=1}^{N} p_{mn}\,\left\|\mathbf{U}_{mn}(\mathbf{x}_n - \mathbf{y}_m)\right\|^2$

**end**

**Result:**

- $p_{mn}$: probabilities of vehicle matches
- $\mathbf{y}_m = (t_m, s_m)$: corrected positions and/or timestamps of the measurements for the second sensor
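To illustrate Algorithm 1 end to end, the following sketch runs the EM loop on synthetic single-lane data. The toy scenario (four vehicles, a 150 m gap, a 2 s clock offset) and the convergence details (iteration cap, variance floor) are our own choices, not the paper's:

```python
import numpy as np

# Synthetic data: four vehicles at constant speed; the second sensor's
# clock lags by true_dt seconds, which the EM loop has to recover.
t1 = np.array([0.0, 8.0, 16.0, 24.0])     # first-sensor timestamps [s]
v = np.array([18.0, 20.0, 22.0, 24.0])    # vehicle speeds [m/s]
gap, true_dt = 150.0, 2.0                 # gap length and unknown clock offset
N = M = len(t1)

X = np.stack([t1, np.zeros(N)], axis=1)                 # x_n = (t, s), first sensor
Y = np.stack([t1 + gap / v + true_dt,
              np.full(M, gap)], axis=1)                 # y_m, second sensor

# Projection matrices U_mn from the pairwise mean-speed unit vectors.
vmean = 0.5 * (v[:, None] + v[None, :])
u = np.stack([np.ones_like(vmean), vmean], axis=-1)
u /= np.linalg.norm(u, axis=-1, keepdims=True)
U_mn = np.eye(2) - np.einsum('mni,mnj->mnij', u, u)

def residuals(Y):
    w = X[None, :, :] - Y[:, None, :]                   # w_mn = x_n - y_m
    return np.einsum('mnij,mnj->mni', U_mn, w)

t_shift = np.zeros(2)
sigma2 = np.mean(np.sum(residuals(Y) ** 2, axis=-1))    # initialization
for _ in range(100):
    d2 = np.sum(residuals(Y) ** 2, axis=-1)
    p = np.exp(-(d2 - d2.min(axis=0)) / (2 * sigma2))   # E-step, Equation (7)
    p /= p.sum(axis=0, keepdims=True)
    U = np.einsum('mn,mnij->ij', p, U_mn)               # M-step, Equation (9)
    shift = np.linalg.solve(U, np.einsum('mn,mni->i', p, residuals(Y)))
    Y += shift                                          # common shift of the kernels
    t_shift += shift
    sigma2 = max(np.sum(p * np.sum(residuals(Y) ** 2, axis=-1)) / N, 1e-9)
    if np.linalg.norm(shift) < 1e-8:                    # converged
        break

assert np.array_equal(p.argmax(axis=0), np.arange(N))   # correct pairs dominate
assert abs(t_shift[0] + true_dt) < 0.05                 # temporal offset recovered
```

On this data, the accumulated shift converges to approximately $(-2, 0)$, i.e., the negative of the simulated clock offset, while the soft matching concentrates on the correct vehicle pairs.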

#### *2.2. Microscopic Data Reconstruction*

In this section, we will take the next step in reproducing microscopic traffic data, based on the results of the previous section. Let us suppose that by means of the EM-algorithm, we found the spatio-temporal offset of the sensors, and that we also have a correspondence matrix $\mathbf{P}$ whose entries $p_{mn}$ are the posterior matching probabilities last computed in Algorithm 1. We can additionally compute the matrix $\mathbf{R}$ with entries $r_{mn} = \|\mathbf{U}_{mn}(\mathbf{x}_n - \mathbf{y}_m)\|$, the respective point-to-line distances.

Now, after finding the sensor offsets and the soft matching probabilities, we applied the Hungarian algorithm [26] to find the first correspondences between vehicles. The Hungarian algorithm is a polynomial-time solution to the assignment problem between two datasets; here, the vehicles of the first and second sensor form the two datasets. The algorithm assigns pairs based on a cost value, which in our case is the distance from the linear trajectory of the (already shifted) entries of the second sensor to the entries of the first one. We therefore reused the distances $r_{mn}$ from the EM-algorithm as cost values and formed the matrix $\mathbf{R}$, whose rows and columns correspond to the data entries of the first and second sensor. The overall procedure of the Hungarian algorithm is as follows:

1. Subtract the minimum of each row of the cost matrix from that row.
2. Subtract the minimum of each column from that column.
3. Cover all zeros in the matrix with the minimum number of horizontal and vertical lines.
4. If the number of covering lines equals the matrix dimension, an optimal assignment can be read off the zero entries; otherwise, subtract the smallest uncovered value from all uncovered entries, add it to the entries covered twice, and return to Step 3.
The result of the assignment problem is a set of pairs, where each vehicle of the first sensor is matched to a vehicle of the second sensor. The assignment is only an interim step towards reconstructing the trajectories, as it still pairs the vehicles based on linear trajectories derived from the mean speed of each vehicle pair. However, the assignment is very robust, because the real trajectories of the vehicles are similar enough to the linear ones for the matching to be successful.

We also have to note at this point that the resulting assignments contain outliers as well, since these were also translated by the EM-algorithm. Thus, both false positives and false negatives in the data of both sensors need to be considered. We applied a threshold of $3\sigma$ to the assigned values of $r_{mn}$: all assignments with costs above this threshold were considered outliers and removed from the dataset, as they were misdetections (false positives or negatives) of either sensor. A trajectory should not be reconstructed for such singular data entries. One important aspect here is that we have no possibility of identifying a false positive that fits well to the trajectory of a true positive detection. If a sensor delivers a false positive that fits a smoother trajectory than the true matching, it will influence the assignment of the Hungarian algorithm and also the prediction of the resulting trajectories. On the other hand, most of the sensors used in our experiments showed a tendency towards false negatives rather than false positives, so the chance of a false positive detection influencing the results was quite low. We will analyze the effect of false detections more thoroughly in the following section.
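The assignment and the $3\sigma$ gate can be sketched as follows. For this toy cost matrix we simply search all permutations; in practice, a polynomial-time Hungarian implementation (e.g., SciPy's `linear_sum_assignment`) would be used. The values of `R` and `sigma` are illustrative:

```python
from itertools import permutations

# Toy cost matrix of point-to-line distances r_mn and a common sigma.
R = [[0.2, 7.1, 9.4],
     [6.8, 0.4, 8.0],
     [9.9, 7.5, 6.3]]
sigma = 1.0

# Minimum-cost assignment by exhaustive search (fine for tiny matrices).
best = min(permutations(range(len(R))),
           key=lambda cols: sum(R[m][c] for m, c in enumerate(cols)))
pairs = list(enumerate(best))

# Reject assignments whose cost exceeds the 3-sigma threshold as misdetections.
matches = [(m, c) for m, c in pairs if R[m][c] <= 3 * sigma]
print(matches)   # [(0, 0), (1, 1)]: the pair (2, 2) costs 6.3 > 3 sigma
```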

After finding appropriate assignments, we reconstructed the trajectory of the vehicles between the two sensors using curves that can be parametrized to connect the two points while also following the detected speed profile. In other words, the connecting curve $s(t)$ needs to satisfy:

$$s(t_n) = s_n, \quad s(t_m) = s_m, \quad \frac{\mathrm{d}s}{\mathrm{d}t}(t_n) = v_n, \quad \frac{\mathrm{d}s}{\mathrm{d}t}(t_m) = v_m \tag{11}$$

where $(t_n, s_n)$ and $(t_m, s_m)$ are the times and positions of the assigned sensor detections, while $v_n$ and $v_m$ are the respective speeds. The polynomial curves used in this work were Bézier curves [27]. They have very useful properties, being easy to evaluate and differentiate, so acceleration and deceleration rates can easily be analyzed. Thus, the use of Bézier curves enables the recovery of a relatively large amount of dynamic behavior of road users, given that only simple cross-section data were used and no complex driver behavior models were applied.

The general definition of Bézier curves is:

$$\mathbf{B}(u) = \sum_{i=0}^{n} \mathbf{b}_i B_{i,n}(u), \tag{12}$$

where:

$$B_{i,n}(u) = \begin{cases} \frac{n!}{(n-i)!\,i!} (1-u)^{n-i} u^i & \text{if } 0 \le i \le n\\ 0 & \text{otherwise} \end{cases} \tag{13}$$

are called the Bernstein polynomials of degree $n$. The $\mathbf{b}_i$ are the control points that define the curve, and as $u$ increases within the interval $[0, 1]$, the Bézier curve is drawn from the first towards the last control point. From Equation (12), the general formulation of the $r$th derivative of a Bézier curve of degree $n$ can be deduced as:

$$\mathbf{B}^{(r)}(u) = \sum_{i=0}^{n-r} \mathbf{b}_i^{(r)} B_{i,n-r}(u),\tag{14}$$

where:

$$\mathbf{b}_i^{(r)} = n(n-1)\dots(n-r+1)\sum_{j=0}^{r}(-1)^{r-j}\binom{r}{j}\mathbf{b}_{i+j}.\tag{15}$$
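Equations (12)–(15) translate directly into code; this minimal sketch (helper names are ours) evaluates a Bézier curve and its first derivative, the $r = 1$ case of Equations (14) and (15):

```python
from math import comb

def bernstein(i, n, u):
    """Bernstein polynomial B_{i,n}(u), Equation (13)."""
    if not 0 <= i <= n:
        return 0.0
    return comb(n, i) * (1 - u) ** (n - i) * u ** i

def bezier(ctrl, u):
    """B(u) from Equation (12) for 2D control points (t, s)."""
    n = len(ctrl) - 1
    return tuple(sum(bernstein(i, n, p[d]) * 0 + bernstein(i, n, u) * p[d]
                     for i, p in enumerate(ctrl)) for d in range(2))

def bezier_d1(ctrl, u):
    """First derivative via Equation (15): b_i^(1) = n (b_{i+1} - b_i),
    evaluated as a Bézier curve of degree n - 1 (Equation (14))."""
    n = len(ctrl) - 1
    d_ctrl = [tuple(n * (q[d] - p[d]) for d in range(2))
              for p, q in zip(ctrl, ctrl[1:])]
    return bezier(d_ctrl, u)

# A quadratic example: the curve starts at b_0, ends at b_2, and its
# initial tangent is n (b_1 - b_0).
ctrl = [(0.0, 0.0), (1.0, 3.0), (2.0, 4.0)]
assert bezier(ctrl, 0.0) == (0.0, 0.0)
assert bezier(ctrl, 1.0) == (2.0, 4.0)
assert bezier_d1(ctrl, 0.0) == (2.0, 6.0)
```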

We now proceed to define the control points based on the input data. The requirements defined by Equation (11) ensure that the positions and the speed values are respected by the trajectory. The disadvantage is that there is no continuity in the second derivative; in other words, a vehicle trajectory fulfilling only those requirements would have an unrealistic speed change directly at the first and last control points. Thus, we add continuity requirements on the second derivative in order to reconstruct the acceleration and deceleration in a more realistic manner. From Equations (12)–(15), it can be seen that the control points can be computed by evaluating the curve and its derivatives at the parameter values $u = 0$ and $u = 1$ and equating them with the values defined by the input data. In our case, we had six conditions in total (the four in Equation (11) plus the continuity of the second derivative at both ends). This leads to a quintic Bézier curve, whose control points can be derived as:

$$\mathbf{b}_0 = (t_n, s_n), \quad \mathbf{b}_5 = (t_m, s_m) \tag{16}$$

$$\mathbf{b}_1 = (t_n, s_n) + \frac{1}{5}(1, v_n), \quad \mathbf{b}_4 = (t_m, s_m) - \frac{1}{5}(1, v_m) \tag{17}$$

$$\mathbf{b}_2 = (t_n, s_n) + \frac{2}{5}(1, v_n), \quad \mathbf{b}_3 = (t_m, s_m) - \frac{2}{5}(1, v_m) \tag{18}$$
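A short check of the control points in Equations (16)–(18), with illustrative detection values of our own: the resulting quintic curve interpolates both detections, matches the measured speeds at its ends, and has a vanishing second derivative there:

```python
from math import comb, isclose

# Illustrative assigned detections: (time, position, speed) at each sensor.
tn, sn, vn = 0.0, 0.0, 22.0
tm, sm, vm = 7.0, 150.0, 20.0

# Control points per Equations (16)-(18).
b = [(tn, sn),
     (tn + 1 / 5, sn + vn / 5),
     (tn + 2 / 5, sn + 2 * vn / 5),
     (tm - 2 / 5, sm - 2 * vm / 5),
     (tm - 1 / 5, sm - vm / 5),
     (tm, sm)]

def bezier(ctrl, u):
    """Evaluate Equation (12) for 2D control points."""
    n = len(ctrl) - 1
    w = [comb(n, i) * (1 - u) ** (n - i) * u ** i for i in range(n + 1)]
    return (sum(wi * p[0] for wi, p in zip(w, ctrl)),
            sum(wi * p[1] for wi, p in zip(w, ctrl)))

# Endpoints coincide with the detections (Equation (11)) ...
assert bezier(b, 0.0) == (tn, sn) and bezier(b, 1.0) == (tm, sm)

# ... the endpoint slope ds/dt equals the measured speed, since
# B'(0) = 5 (b_1 - b_0) points in the direction (1, v_n) ...
dt0, ds0 = b[1][0] - b[0][0], b[1][1] - b[0][1]
assert isclose(ds0 / dt0, vn)

# ... and B''(0) = 20 (b_2 - 2 b_1 + b_0) vanishes at the start.
d2_0 = tuple(20 * (b[2][d] - 2 * b[1][d] + b[0][d]) for d in range(2))
assert all(isclose(c, 0.0, abs_tol=1e-9) for c in d2_0)
```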

Figure 3 shows the iterative process of the proposed method. In the first image (a), the initial datasets are shown, where the crosses and circles are the timestamp and location entries of the first and second sensor, respectively. Note that all entries are at Location 0, as there is no prior knowledge of the location or time offsets between the sensors. In the second image (b), the Bézier curves of the matches are visualized. At the beginning of the algorithm, all pairs have similar matching probabilities, which is visualized by the similar opacity of the curves. The S-shape of the curves originates from the condition of speed continuity at the first and second detections while the spatial offset estimation is still incomplete. In the following plots, the spatio-temporal synchronization is shown by the shift of the circles (detections in the second sensor), while all the unrealistic curves get more transparent (less probable) until saturation. Please note that in this example, the reconstructed trajectory is the longitudinal one, so the resulting $s(t_n)$ is the vehicle position along the road. As we will see in later sections, this work was conducted with sensors that only deliver the timestamps of passing vehicles, without measuring the lateral position on the road. Nevertheless, the method can easily be applied to the lateral position in the same way in order to recover lane-changing maneuvers.

**Figure 3.** Iterative steps of the EM-algorithm with the visualization of the corresponding trajectories. The top left plot (**a**) shows the initial datasets, while the plots (**b**–**f**) are the visualizations of Iteration Steps 2, 4, 10, 24, and 100, respectively. In this figure, the opacity of the blue lines corresponds to the results of the E-step, and the shift of the red circles corresponds to the M-step of the EM approach, as described in Algorithm 1.

## **3. Experimental Results**

In this section, we present the results of the behavior analysis of the described methods. In order to allow a comprehensive performance investigation, we used three different datasets consisting of synthetic, GNSS, and radar data. The validation steps applied were based on appropriate performance metrics for the problems presented in Section 2: matching, offset estimation, and trajectory reconstruction. The metrics will be discussed in more detail in this section.
