*2.1. State-Space Model*

The state-space model consists of a state equation and a measurement equation. The state equation describes how the system evolves and provides a prior estimate of the state. The measurement equation is then used to correct and improve that prior estimate. In this study, the goal is to estimate the number of vehicles on signalized links using only CV data, as depicted in Figure 1, where CVs are the vehicles marked with the connection icon (e.g., the first vehicle on the left). In practice, only two pieces of information are needed: (1) the traffic flow of CVs observed at the entrance and exit of the tested link, and (2) the travel time of each CV. Vehicle-to-Infrastructure (V2I) communication can provide this information to the traffic signal controller.

**Figure 1.** Tested link section includes connected vehicles (CVs) and non-CVs.

The model is formulated using the state-space equations derived in [7]. The state equation, Equation (1), is based on the continuity equation of traffic flow, whereas the measurement equation, Equation (2), is based on the hydrodynamic relationship of traffic flow and uses measurements of the average travel time of CVs. In Equation (1), the number of vehicles is computed by adding the difference between the numbers of vehicles entering and exiting the tested section during the current time interval to the previously computed number of vehicles on the section.

$$N(t) = N(t - \Delta t) + u(t) \tag{1}$$

$$TT(t) = H(t) \times N(t) \tag{2}$$

where *N*(*t*) is the number of vehicles traversing the link at time *t*, *N*(*t* − Δ*t*) is the number of vehicles traversing the link in the preceding time interval, and *u*(*t*) is the system input, as described in Equation (3).

$$u(t) = \frac{\Delta t \left[ q^{in}(t) - q^{out}(t) \right]}{\max(\rho_{actual}, \rho_{min})} \tag{3}$$

where *q*<sup>*in*</sup> and *q*<sup>*out*</sup> represent the flow of CVs entering and exiting the link, respectively, during Δ*t*. *ρ* is the CVs' LMP, defined as the ratio of the CV count to the total vehicle count. In the state equation, *ρ* is set to the maximum of the actual value (*ρ*<sub>*actual*</sub>) and a predefined minimum value (*ρ*<sub>*min*</sub>). *ρ*<sub>*actual*</sub> can be obtained from historical data. *ρ*<sub>*min*</sub> is introduced to avoid producing large errors in the state equation, since a single *ρ* value is used to approximate the two *ρ* values (upstream and downstream of the tested link) [7]. In this study, *ρ*<sub>*min*</sub> is set to 0.5; more details about the system state representation can be found in [7]. Note that *ρ* is the main source of noise in the system, so the measurement equation is needed to correct the resulting errors.

In Equation (2), *TT*(*t*) is the average vehicle travel time, and *H*(*t*) is a vector that transforms vehicle counts into travel times. *H*(*t*) is derived from the hydrodynamic relationship between the macroscopic traffic parameters (flow, density, and space-mean speed), as presented in Equation (4).

$$H(t) = \frac{1}{\overline{q}(t)} = \frac{2 \times \rho_{actual}}{q^{in}(t) + q^{out}(t)}\tag{4}$$
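To make Equations (1) to (4) concrete, the following minimal Python sketch evaluates one time step of the state and measurement equations. It is an illustration only, not the implementation used in this study or in [7]: the function names, the units (CV flows in veh/s, Δ*t* in seconds), and all numerical inputs are hypothetical, and only *ρ*<sub>*min*</sub> = 0.5 is taken from the text.

```python
def system_input(q_in, q_out, dt, rho_actual, rho_min=0.5):
    """Equation (3): net CV inflow over dt, scaled to all vehicles.

    q_in, q_out: CV flows at the link entrance and exit (assumed veh/s).
    dt: length of the time interval (assumed seconds).
    """
    rho = max(rho_actual, rho_min)  # floor on the market penetration
    return dt * (q_in - q_out) / rho


def state_update(n_prev, u):
    """Equation (1): N(t) = N(t - dt) + u(t)."""
    return n_prev + u


def measurement_gain(q_in, q_out, rho_actual):
    """Equation (4): H(t) = 2 * rho_actual / (q_in + q_out)."""
    total_cv_flow = q_in + q_out
    if total_cv_flow <= 0:
        # H(t) is undefined when no CVs are observed in the interval.
        raise ValueError("q_in + q_out must be positive")
    return 2.0 * rho_actual / total_cv_flow


def predicted_travel_time(n, h):
    """Equation (2): TT(t) = H(t) * N(t)."""
    return h * n


# Hypothetical inputs for a single 20 s interval.
q_in, q_out = 0.10, 0.06   # CV flows (veh/s)
dt = 20.0                  # interval length (s)
rho_actual = 0.4           # assumed historical LMP
n_prev = 10.0              # previous estimate N(t - dt)

u = system_input(q_in, q_out, dt, rho_actual)
n = state_update(n_prev, u)
h = measurement_gain(q_in, q_out, rho_actual)
tt = predicted_travel_time(n, h)
print(u, n, h, tt)
```

With these assumed inputs, *ρ*<sub>*actual*</sub> = 0.4 falls below *ρ*<sub>*min*</sub>, so the state equation divides by 0.5 and yields *u*(*t*) ≈ 1.6 vehicles and *N*(*t*) ≈ 11.6 vehicles. Equation (4) uses *ρ*<sub>*actual*</sub> directly, giving *H*(*t*) ≈ 5 s/veh and a predicted travel time of about 58 s.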
