*Article* **Bit-Level Automotive Controller Area Network Message Reverse Framework Based on Linear Regression**

**Zixiang Bi, Guoai Xu \*, Guosheng Xu, Chenyu Wang and Sutao Zhang**

> School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China; bzx@bupt.edu.cn (Z.B.); guoshengxu@bupt.edu.cn (G.X.); wangchenyu@bupt.edu.cn (C.W.); lunter-zst@bupt.edu.cn (S.Z.)

**\*** Correspondence: xga@bupt.edu.cn

**Abstract:** Modern intelligent and networked vehicles are equipped with a growing number of electronic control units (ECUs) of increasing computing power. These ECUs form an in-vehicle network over the Controller Area Network (CAN) bus, the de facto standard for modern vehicles. Although ECUs provide convenience to drivers and passengers, they also increase the potential for cybersecurity threats against motor vehicles. Numerous attacks on vehicles have been reported, and what they have in common is that they inject malicious messages into the CAN network. In an attempt to close the security holes of CAN, original equipment manufacturers (OEMs) keep confidential the Database CAN (DBC) file, which describes the content of CAN messages. This policy is ineffective against cyberattacks, yet it limits in-depth investigation of CAN messages and hinders the development of in-vehicle intrusion detection systems (IDSs) and CAN fuzz testing. Current research reverse engineers CAN messages through tokenization, machine learning, and diagnostic information matching. However, these algorithms recover only a fraction of the information that the DBC file specifies about CAN messages, such as field boundaries and the message IDs associated with specific functions. In this study, we propose a multiple linear regression-based framework for bit-level reverse engineering of CAN messages that can approximately reconstruct the DBC file. The framework builds a multiple linear regression model relating vehicle behavior to CAN traffic, filters the candidate messages based on the coefficient of determination, and finally locates the bits that describe the vehicle behavior, obtaining the data length and alignment from the model parameters. Moreover, this work shows that, on actual vehicles, the framework achieves high reverse-engineering accuracy and outperforms existing systems in boundary delineation and in filtering relevant messages.

**Keywords:** Controller Area Network; electronic control units; database CAN; reverse; multiple linear regression; bit-level; vehicle behavior
