*2.4. Measurement Model*

A direct VO formulation was used as the measurement model: pixel intensities served directly as measurements, without any feature extraction or matching. The position of each point *j* was transformed from the SICS into the CCS using the motion state as follows:

$${}^{C}\boldsymbol{p}_{p_j} = {}^{C}_{I}\boldsymbol{R}\; {}^{I}_{SI}\boldsymbol{R}\left({}^{I}_{SI}\boldsymbol{\rho}\right) \left({}^{SI}\boldsymbol{p}_{p_j} - {}^{SI}\boldsymbol{p}\right) + {}^{C}\boldsymbol{p}_{I} \tag{11}$$
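The transformation in Equation (11) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: this section does not say how the rotation state ρ parameterizes the rotation, so the sketch assumes a rotation vector (axis times angle) converted with Rodrigues' formula. The names `rodrigues`, `point_to_camera`, `R_c_i` (the fixed extrinsic rotation), and `p_c_i` (the constant offset term) are hypothetical and do not come from the source.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def rodrigues(rho):
    """Rotation matrix for a rotation vector rho (ASSUMED parameterization)."""
    theta = math.sqrt(sum(c * c for c in rho))
    if theta < 1e-12:
        # Negligible rotation: return the identity.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (c / theta for c in rho)
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]  # skew matrix of the axis
    K2 = matmul(K, K)
    sin_t, one_minus_cos = math.sin(theta), 1.0 - math.cos(theta)
    # R = I + sin(theta) K + (1 - cos(theta)) K^2
    return [[(1.0 if i == j else 0.0) + sin_t * K[i][j] + one_minus_cos * K2[i][j]
             for j in range(3)] for i in range(3)]

def point_to_camera(p_point_si, p_si, rho, R_c_i, p_c_i):
    """Eq. (11): subtract the position state, rotate by R(rho),
    apply the fixed extrinsic rotation, then add the constant offset."""
    diff = [a - b for a, b in zip(p_point_si, p_si)]
    rotated = matvec(rodrigues(rho), diff)
    in_camera = matvec(R_c_i, rotated)
    return [a + b for a, b in zip(in_camera, p_c_i)]
```

If the actual implementation uses Euler angles or a quaternion for ρ, only `rodrigues` would need to be swapped out; the structure of `point_to_camera` follows Equation (11) either way.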

The pixel coordinates of point *j* in the current view were computed with the pinhole camera model, with focal lengths $f_x$, $f_y$ and principal point $(c_x, c_y)$:

$$
\begin{bmatrix} u_{p_j} \\ v_{p_j} \end{bmatrix} = \pi\left({}^{C}\boldsymbol{p}_{p_j}\right) = \begin{bmatrix} {}^{C}x_{p_j} \cdot \dfrac{f_x}{{}^{C}z_{p_j}} + c_x \\[4pt] {}^{C}y_{p_j} \cdot \dfrac{f_y}{{}^{C}z_{p_j}} + c_y \end{bmatrix} \tag{12}
$$
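As a minimal sketch, the projection in Equation (12) amounts to a perspective division followed by scaling and shifting with the intrinsics. The function name `project_pinhole`, the positive-depth guard, and the intrinsic values in the usage lines are illustrative assumptions, not values from the source.

```python
def project_pinhole(p_cam, fx, fy, cx, cy):
    """Eq. (12): project a CCS point (x, y, z) to pixel coordinates (u, v)."""
    x, y, z = p_cam
    if z <= 0.0:
        # Guard added for the sketch: the model is only meaningful
        # for points in front of the camera.
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = x * fx / z + cx
    v = y * fy / z + cy
    return u, v

# Illustrative intrinsics only; real values come from camera calibration.
u, v = project_pinhole((0.2, -0.1, 2.0), fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```

The returned coordinates are generally not integers, which matters when the image intensity is subsequently evaluated at that location.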

The intensity $I_k\left(u_{p_j}, v_{p_j}\right)$ of point *j* in the RGB image at time *k* was used as the measurement:

$$z_{j,k} = I_k\left(u_{p_j}, v_{p_j}\right) + n_{j,k} \tag{13}$$

where $n_{j,k}$ is the measurement noise, assumed to follow a Gaussian distribution with covariance $R_{j,k}$. Combining Equations (11)–(13) yields the following measurement equation:

$$z_k = h\left({}^{I}_{SI}\boldsymbol{\rho}_k,\; {}^{SI}\boldsymbol{p}_k\right) \tag{14}$$
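The intensity lookup in Equation (13) and the assembly of the measurement vector in Equation (14) might be sketched as follows. Several points here are assumptions rather than statements from the source: the image is taken to be a single-channel intensity array (a list of rows, at least 2 × 2 pixels), sub-pixel locations are evaluated by bilinear interpolation, and the full vector is formed by stacking the per-point intensities. The names `bilinear_intensity` and `predicted_measurements` are hypothetical.

```python
import math

def bilinear_intensity(image, u, v):
    """Noise-free I_k(u, v) at a sub-pixel location.

    image[row][col] is a single-channel intensity image (ASSUMED);
    u is the column coordinate and v is the row coordinate.
    """
    height, width = len(image), len(image[0])
    if not (0.0 <= u <= width - 1 and 0.0 <= v <= height - 1):
        raise ValueError("pixel lies outside the image")
    # Top-left corner of the interpolation cell, clamped so that
    # the cell stays inside the image on the right and bottom borders.
    col0 = min(int(math.floor(u)), width - 2)
    row0 = min(int(math.floor(v)), height - 2)
    du, dv = u - col0, v - row0
    top = (1.0 - du) * image[row0][col0] + du * image[row0][col0 + 1]
    bottom = (1.0 - du) * image[row0 + 1][col0] + du * image[row0 + 1][col0 + 1]
    return (1.0 - dv) * top + dv * bottom

def predicted_measurements(image, pixels):
    """Stack the predicted intensities of all points (ASSUMED form of z_k)."""
    return [bilinear_intensity(image, u, v) for (u, v) in pixels]

# Tiny synthetic image, for illustration only.
image = [[0.0, 10.0],
         [20.0, 30.0]]
z_pred = predicted_measurements(image, [(0.5, 0.5), (1.0, 0.0)])
```

In a filter update, such predicted intensities would be compared with the intensities actually observed for the same points, with the noise term of Equation (13) accounting for the mismatch.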
