1. Introduction
In the Czech Republic, approximately 500 people die in traffic accidents every year (see
Figure 1). Specifically, in 2024, the Police of the Czech Republic reported 438 fatalities [
1]. Worldwide, traffic accidents claim approximately 1.35 million lives each year [
2,
3]. Autonomous vehicles promise a fundamental reduction in these statistics, as well as increased traffic efficiency [
4,
5,
6,
7,
8]. This revolutionary technology is forcing a re-evaluation of safety, environmental, user, economic, and other assumptions associated with transportation. However, these benefits are attainable only if the technology is implemented with safety in mind and is sufficiently mature. Technology enabling autonomous driving has made tremendous progress since the 1920s, when the first radio-controlled vehicles were presented. In 1958, a groundbreaking experiment took place at the GM Technical Center in Michigan, showcasing an “automated guided vehicle” [
9]. A Chevrolet featured two front-mounted electronic sensors that detected and followed a cable embedded in the track. The sensors managed the steering system, directly influencing the vehicle’s wheels and enabling it to follow the designated path [
10]. Since then, autonomous vehicles have matured into what they are today, from concepts based on vehicle navigation using an electric wire embedded in the transport infrastructure to the first visual-based systems. A significant shift was brought by the development of microelectronics in the 1980s, which enabled faster processing of larger amounts of data. The beginning of the 21st century brought a major boost to the further development of autonomous vehicles through the DARPA Grand Challenge, which gave rise to truly usable technology [
11,
12,
13]. The first edition of this competition, held in 2004, sparked interest in autonomous driving technology among teams from the most prestigious universities, but produced no winner. However, the second edition, held a year later, was won decisively by a Stanford University team with the vehicle Stanley, whose lead designer was engineer Sebastian Thrun [
14,
15]. Among other things, he led the development of Google’s first autonomous vehicle project, named Chauffeur, which gave rise to Waymo in 2016, now considered a technology leader in the development of autonomous vehicles. In about 100 years, we have gone from a few prototypes to thousands of autonomous vehicles on public roads worldwide [
9]. In California alone, between December 2023 and November 2024, a total of 2819 autonomous vehicles were in operation. These vehicles operated under California’s Autonomous Vehicle Tester Program regulations; manufacturers are required to submit disengagement reports only for the testing of vehicles operating at SAE level 3 or higher. These figures are drawn from the annual autonomous vehicle disengagement reports published by the California Department of Motor Vehicles (DMV) [
16]. Globally, the number of autonomous vehicles in operation is projected to reach 33,570 units in 2025, up from 16,960 units in 2022, according to Statista’s forecast [
17]. Based on these advances and expectations of further progress, many predictions have been made about the timeframe for the widespread adoption of autonomous vehicles in passenger or mass transit. Garza [
18] predicts that by 2035, the majority of vehicles produced will be autonomous. To demonstrate to society that autonomous vehicles are safe and reliable, it is first necessary to establish a clear definition of what constitutes a safe and reliable autonomous vehicle [
4,
19,
20]. One approach to addressing this issue involves deploying autonomous vehicles in real-world traffic, monitoring their performance, and conducting a statistical comparison with the performance of conventional human drivers. But how many miles of driving would be needed to provide clear statistical evidence of autonomous vehicle safety? This topic was addressed by the authors of How Many Miles of Driving Would It Take to Demonstrate Autonomous Vehicle Reliability [
21], who used statistical methods to determine that to demonstrate with 95% confidence that the failure rate of an autonomous vehicle is lower than that of a human driver, autonomous vehicles would need to drive at least 5 billion miles (see
Figure 2) [
21].
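The statistical reasoning behind this kind of estimate can be sketched under a simple Poisson assumption: if a fleet drives n failure-free miles, the hypothesis that its true failure rate is at least r can be rejected at confidence C once exp(−rn) ≤ 1 − C. The sketch below is a minimal illustration of that bound; the fatality rate used is an assumed figure for illustration, and the 5-billion-mile result in [21] comes from the stricter task of statistically comparing two failure rates, not from this one-sided bound.

```python
import math

def miles_to_demonstrate(rate_per_mile: float, confidence: float) -> float:
    """Failure-free miles needed to show, at the given confidence, that the
    true failure rate lies below `rate_per_mile`, assuming failures arrive
    as a Poisson process (zero failures observed => exp(-rate*n) <= 1-C)."""
    return -math.log(1.0 - confidence) / rate_per_mile

# Assumed human-driver fatality rate: ~1.09 deaths per 100 million miles.
print(f"{miles_to_demonstrate(1.09e-8, 0.95):.2e} miles")  # on the order of 10^8 miles
```

Even this weaker demonstration requires hundreds of millions of failure-free miles, which is why comparing an autonomous fleet against the human baseline pushes the requirement into the billions.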
If autonomous vehicles are not at least as safe as the average driver, it will be very difficult to convert potential users to actual users.
Currently, autonomous vehicles face a number of problems, including both technical and legislative hindrances. California Vehicle Code § 38750, enacted by Senate Bill 1298 in 2012, empowers the California DMV to regulate on-road autonomous vehicle testing and establishes a two-phase permitting regime—drivered testing (form OL 315, effective September 2014) and driverless testing (form OL 315A, effective April 2018)—each requiring demonstration of safe operation within defined operational design domains (ODDs) prior to any public-road deployment. Permit applicants must maintain at least $5 million in liability coverage (insurance, surety bond via form OL 317, or self-insurance) as proof of financial responsibility (§§ 227.04–227.12), and every autonomous test vehicle must be registered and clearly identified as “autonomous” on its certificate of title and registration card (§ 227.52).
Under CCR § 227.50(a), holders of either testing permit must submit a Monthly Report of Autonomous Vehicle Disengagements (form OL 311R) by the 10th day of each month, beginning 30 days after permit issuance, covering every disengagement—including ADS anomalies, ODD exits, human or system interventions—with detailed metadata (VIN, software version, GPS coordinates, initiator, cause categories, and post-event actions) [
22]. Collisions resulting in property damage, injury, or death must be reported within ten days on form OL 316, incorporating the full NHTSA Standing General Order crash report and any DMV-requested supplemental sensor or video data (§ 227.48) [
23]. Automated or human-initiated hard-braking events exceeding 0.5 g are also reported monthly under CCR §§ 227.58/228.42.
Failure to comply with these reporting obligations may trigger Preliminary Information Notices (requiring 24–72 h responses) or Requests for Information (10-day responses), and can lead to immediate permit suspension or revocation under Government Code § 11505—subject to contested hearing rights (CCR §§ 227.44–227.46). Additionally, manufacturers must submit a Law Enforcement Interaction Plan—complete with external microphones, speakers, and visual mode indicators—to facilitate coordination with first responders (CCR § 228.44). Collectively, these precisely defined, publicly accessible requirements constitute one of the world’s most rigorous real-world safety oversight regimes for autonomous vehicle testing, generating a longitudinal dataset that is essential for benchmarking technological maturity, analyzing safety trends, and informing evidence-based policymaking.
The California Code of Regulations (§ 227.38) defines a disengagement as a deactivation of the autonomous mode when a failure of the autonomous technology is detected, or when the safe operation of the vehicle requires that the autonomous vehicle test driver disengage the autonomous mode and take immediate manual control of the vehicle, or, in the case of driverless vehicles, when the safety of the vehicle, its occupants, or the public requires that the autonomous technology be deactivated [
24]. Although the quantity and quality of the reported data depend on the operators/manufacturers, these reports offer unique insight for comparing and assessing the technological maturity and safety of the autonomous vehicles developed by individual manufacturers.
Companies testing autonomous driving systems as part of the California DMV Autonomous Vehicle Testing Program must file yearly reports detailing the frequency of disengagements. These disengagements occur when vehicles switch out of the autonomous mode, either because of system malfunctions or instances where the test operator needs to manually take over to ensure safe operation. Autonomous driving system disengagements may happen due to various factors, such as unexpected situations requiring immediate intervention by the safety driver, the driver’s discretion or cautious decision-making, consideration for other road users, or challenges arising from ADS software limitations or errors [
25]. Disengagement reports include the following details:
- manufacturer name, permit number, date, and vehicle identification number (VIN);
- capability for driverless operation;
- presence of a driver;
- identification of who triggered the disengagement (autonomous system, safety driver, remote operator, or passenger);
- location of the disengagement (freeway, interstate, expressway, highway, street, or parking facility);
- a description of the events leading to the disengagement.
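For illustration, the reported fields can be captured in a small record type; the field names below are our own shorthand, not the official column headers of form OL 311R, and the example values are hypothetical.

```python
from dataclasses import dataclass

# Minimal record mirroring the fields of a DMV disengagement report.
# Names are illustrative shorthand, not the official OL 311R headers.
@dataclass
class DisengagementRecord:
    manufacturer: str
    permit_number: str
    date: str              # e.g., "2022-03-14"
    vin: str
    driverless_capable: bool
    driver_present: bool
    initiated_by: str      # "AV system" | "test driver" | "remote operator" | "passenger"
    location: str          # "freeway" | "interstate" | ... | "parking facility"
    description: str       # free-text account of the events leading to the disengagement

rec = DisengagementRecord("ExampleCo", "AVT-000", "2022-03-14", "1HGCM82633A004352",
                          False, True, "test driver", "street",
                          "Safety driver took over during an unprotected left turn")
print(rec.initiated_by)
```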
Boggs et al. [
26] conducted a comprehensive analysis of 159,840 disengagement reports collected from 36 operating authority holders in California’s Autonomous Vehicle Tester Program between September 2014 and November 2018. Their study revealed that 75% of the disengagements were initiated by human drivers. The authors categorized the causes into several groups—control discrepancy, perception discrepancy, planning discrepancy, hardware/software discrepancy, environmental conditions and other road users, as well as operator takeover with corresponding causes of disengagement for each category (see
Table 1). Planning-related discrepancies were responsible for 34.95% of the total disengagements. These were followed by other categories in decreasing order of frequency: software and hardware issues (25.7%), perception challenges (20.82%), environmental factors or interactions with other road users (11.66%), and control discrepancies (6.87%). Using a random parameter binary logistic model, they further examined the relationship between the disengagement initiator and the contributing factors, thereby reinforcing the predominance of human-initiated disengagements observed in the dataset.
Khan, LeMaster, and Najm [
25] leveraged the disengagement cause categorization originally proposed by Boggs et al. [
26] and applied it to ADS disengagement reports from the California DMV for 2022 and 2023. The authors analyzed 5731 disengagement records, revealing that, on average, 86% of the cases were initiated by a human driver. The most frequent causes of disengagement were software and hardware issues (26%), followed by planning discrepancies (21%), perception challenges (17%), and environmental factors or interactions with other road users (12%) (see
Figure 3). Consistent with the findings of Boggs et al. [
26], this study confirms that, despite advancements in autonomous driving systems, human intervention remains the primary mechanism for ensuring safety.
Complementary to the disengagement analysis, alternative safety assessment frameworks have been developed to evaluate autonomous vehicle performance under real-world conditions. For example, Khan et al. [
25] proposed a scenario-based evaluation approach tailored to specific operational design domains and varying driving tasks, which emphasizes standardizing test conditions and balancing proprietary data constraints with the need for transparency. This proposal largely aligns with the structured procedure introduced by [
27], who also advocate for using real-world scenarios for safety assessment; however, Khan et al. [
25] place additional emphasis on the integration of detailed driving task contexts and the standardization of testing protocols, marking a subtle divergence in focus between the two approaches.
An analysis of disengagement reports was also conducted by [
28], who employed machine learning methods to identify the main causes of these events and proposed measures to enhance system reliability. The author applied natural language processing (NLP) to extract key information from 1952 disengagement reports between 2014 and 2023 and then used the k-means clustering algorithm to identify dominant patterns in the data. The results show that disengagements can be categorized into several main groups. They are most commonly caused by the unpredictable behavior of other road users, errors in the system’s assessment of traffic situations, and limitations of sensor technologies. The analysis further reveals that conservative decision-making by autonomous vehicles, such as unnecessarily harsh braking or inadequate responses to complex traffic scenarios, leads to frequent manual interventions by the driver. The author proposed improvements in the area of algorithmic prediction of the behavior of other vehicles, optimization of sensor data processing, and the implementation of more advanced decision-making models to reduce the occurrence of disengagements. The study’s conclusions suggest that more effective integration of predictive modeling and adaptive control mechanisms could significantly reduce the frequency of manual driver interventions and improve the smoothness of autonomous vehicle operation. This approach provides valuable insights for the further development of autonomous systems, with a focus on their safety and operational efficiency [
28].
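As a greatly simplified stand-in for the pipeline in [28] (which used full natural language processing followed by k-means clustering), the mapping of free-text disengagement descriptions to cause categories can be illustrated with keyword rules. The rules, category names, and example descriptions below are hypothetical and only sketch the idea of extracting dominant patterns from report text.

```python
import re
from collections import Counter

# Hypothetical keyword rules standing in for a trained NLP model.
# Rules are checked in order; the first match wins.
RULES = {
    "other road users":      r"pedestrian|cyclist|another vehicle|cut.?in",
    "perception":            r"perception|misdetect|sensor|lidar|radar|camera",
    "planning":              r"plann|trajectory|unwanted maneuver|path",
    "conservative behavior": r"hard brak|hesitat|overly cautious",
}

def categorize(description: str) -> str:
    """Map a free-text disengagement description to a coarse cause category."""
    text = description.lower()
    for category, pattern in RULES.items():
        if re.search(pattern, text):
            return category
    return "other"

reports = [
    "Vehicle hesitated and hard braking occurred at a clear intersection",
    "Another vehicle cut in abruptly; safety driver took over",
    "Lidar misdetection of a static object on the shoulder",
    "Planner produced an unwanted maneuver near a construction zone",
]
print(Counter(categorize(r) for r in reports))
```

A real pipeline would replace the hand-written rules with learned text representations (e.g., TF-IDF vectors) and let clustering discover the categories instead of fixing them in advance.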
Wang and Li [
29] also addressed the question of which specific causes necessitate human driver intervention during the transition from automated to manual driving by conducting a detailed investigation of the factors leading to vehicle disengagements during road test drives. The authors employed the California disengagement reporting database for the years 2016–2017. Methodologically, the study was based on advanced statistical approaches, with multilevel statistical modeling complemented by classification tree analysis playing a central role. This methodology not only allowed for the identification of the primary factors affecting the driver’s reaction time to an automated disengagement but also quantified the causal relationships between an insufficient number of radar and LiDAR sensors installed on the vehicle and the probability of a planning issue. The statistical processing of data involved an analysis of an extensive dataset, through which threshold values for the number of installed sensors were identified, and, subsequently, the key variables that significantly influence the driver’s reaction speed were revealed via classification tree analysis. The results of the study indicate that automated vehicles face disengagements primarily due to planning errors associated with an insufficient number of radar and LiDAR sensors. The authors identified specific threshold values for the installation of these sensors, beyond which the probability of an automated mode deactivation increases significantly. The analyses further reveal that, in addition to technical shortcomings, the characteristics of the roadway—specifically, certain features of road infrastructure—play a crucial role in determining the time it takes for a driver to react and take control. 
In particular, cases of disengagements triggered by vehicle perception and control issues lead to a significant extension of the driver’s reaction time, underscoring the importance of precise integration of technology and the human factor in the operation of automated vehicles [
29]. Zhang [
30] focused his work on the analysis of the cause-and-effect relationships of disengagement events using deep transfer learning and natural language processing methods. His research question centered on identifying the key factors that necessitate the transition from autonomous to manual control by a human driver. Methodologically, the author developed a scalable end-to-end pipeline that encompasses data collection, processing (including OCR—optical character recognition), and subsequent statistical analysis of disengagement reports from the California DMV database. The study’s conclusions indicate that insufficient sensor integration and planning errors significantly affect the frequency of system interventions [
30]. In contrast, the study by Lee and Choi [
31] broadened the discussion on autonomous vehicle safety by examining the multidimensional nature of disengagement reports. The authors focused on the interaction of technological, environmental, human, and organizational factors, and their conclusions demonstrate that disengagements are not merely the result of technical failures, but rather of the complex interplay between the autonomous system and its operational environment, underscoring the necessity of a holistic approach to evaluating autonomous vehicle safety [
31]. Another preprint study by Smith and Oric [
32] addressed the standardization of a framework for reporting and analyzing disengagements. The primary aim of that work was to establish a unified and transparent system for the collection, categorization, and analysis of events in which the autonomous system hands over control to a human driver. The authors proposed a comprehensive methodology based on the definition of criteria, systematic data collection, and statistical analysis, with their conclusions highlighting the critical role of transparency and standardization in the further development of autonomous technologies [
32]. The extraction of cause-and-effect relationships from disengagement reports was also addressed by Zhang, Yang, and Zhou [
33], whose research question focused on automating the processing of textual data to enhance the safety of autonomous systems. The authors implemented an NLP (natural language processing) pipeline based on deep transfer learning that was initially pretrained on extensive language corpora and subsequently fine-tuned for the specific task of extracting cause-and-effect relationships from California DMV data. Through this methodology, they achieved human-level performance in identifying critical causes and effects of disengagement, providing valuable insights for optimizing the design of autonomous systems [
33]. In their research, Khattak et al. [
34] emphasized that not only technical deficiencies of autonomous systems—such as software or sensor failures—but also interactions with other road users play a decisive role in the occurrence of incidents. They worked with data from the early testing phase, when a significant share of disengagements was still evident, and their models successfully quantified the decline in these events as the technology progressively improved [
34]. Kohanpour et al. [
35] employed machine learning methods and association rules to confirm that rear-end collisions represent the most common type of collisions, while environmental conditions and the complexity of intersections significantly influence their occurrence. Their analysis was based on an extensive dataset of Californian incidents that reflected the real-world operational conditions of autonomous vehicles [
35]. Chen et al. [
36] applied advanced models, such as XGBoost combined with explainable methods (SHAP), and found that the key variables for predicting accident severity are the state of the vehicle, weather conditions, and the operational mode. Their work, based on 131 collision cases from the period between 2019 and October 2020, provided a detailed quantitative analysis of various modeling approaches [
36]. Almaskati et al. [
37] broadened the perspective on autonomous vehicle safety by considering not only technical and operational aspects, but also ethical and legislative issues. Based on a systematic literature review and analysis of California DMV data covering both disengagements and collisions, they concluded that despite a reduction in failures associated with human errors, autonomous vehicles continue to experience characteristic incidents, predominantly rear-end collisions [
38]. Wang and Li [
39] used a combination of ordinal logistic regression and CART models to develop a hierarchical structure of causal factors, identifying that accident severity increases significantly when the autonomous system bears responsibility for a collision, and that a critical determinant is, among other factors, the location of the accident—such as highway driving. Their analysis was supported by the most comprehensive database of Californian reports on autonomous vehicle collisions from the 2014–2018 period [
39]. Finally, Sinha et al. [
40] presented an accident severity model that cautioned that a reduction in the number of automated disengagements does not necessarily imply an overall improvement in safety, since the causes of accident severity are multifaceted and often associated with deficiencies in reporting protocols. They reached this conclusion based on an extensive dataset of processed California DMV data from 2014–2019, which enabled the tracking of trends and the identification of the key variables influencing incident outcomes [
40].
The above findings from previous studies demonstrate that insufficient sensor integration and planning errors significantly increase the severity of incidents, while factors such as accident location and operating conditions also play an important role.
2. Analysis of Autonomous Vehicle Disengagements
Data were extracted from disengagement reports, including statistics on the number of vehicles tested, the distance traveled in the autonomous mode, and the number of disengagements. A total of 12 operators/manufacturers were monitored from 2015 to 2022 (Waymo, Mercedes-Benz, Nissan, BMW, Apple, AIMotive, Aurora, CarOne/Udelv, Pony.ai, Qualcomm, Toyota, and WeRide). In total, 2694 vehicles were operated in the autonomous mode; they traveled a total of 15,156,023 km, and there were 100,018 cases of the driver taking control. The shares of the selected operators/manufacturers in these totals are represented in
Figure 4 and
Figure 5. These data correspond to the maturity of the autonomous driving technology and to the safety of its deployment in road traffic. This can be perceived as the output of testing autonomous vehicles on public roads. It should also point to the safety of autonomous vehicle operations, although the authors are aware that there are limitations and inadequacies associated with these reports (see below).
Similarly to Chen et al. [
24], we graphically show the cumulative number of disengagements according to the cumulative distance traveled in the autonomous mode across the monitored operators/manufacturers during the monitored period (see
Figure 6).
Figure 6 illustrates the relationship between the total annual distance driven in the autonomous mode (X-axis) and the total annual number of disengagements (Y-axis) for the 2015–2020 period. Each marker corresponds to aggregated data for one calendar year across all monitored operators. A linear regression reveals a weak positive correlation between testing exposure and the absolute number of disengagements: only 22.5% of the variance in disengagement counts is explained by differences in annual mileage. To assess the assumption of normality, we applied the Shapiro–Wilk test to the annual disengagement rate (disengagements per 1000 km). The test yielded W = 0.767 and
p = 0.0289, leading to the rejection of the null hypothesis of normality (
p < 0.05). This finding confirms that the distribution of the disengagement rate significantly deviates from a normal distribution. Pearson’s correlation coefficient (r = 0.474;
p = 0.342) and Spearman’s rank correlation coefficient (ρ = 0.493;
p = 0.321) were calculated to quantify the strength and direction of the association between the annual distance traveled in the autonomous mode and the total number of disengagement events. Both tests indicate a positive but statistically nonsignificant correlation (
p > 0.05), suggesting that there is no reliable linear or monotonic relationship between the absolute annual mileage and disengagement frequency. This low explanatory power implies that additional factors—such as variations in operational environments or software updates—substantially influence disengagement frequency. A pronounced outlier (71,902 disengagements at ~3.13 million km) strongly skews the regression model and underscores the necessity of employing normalized metrics (e.g., disengagements per 1000 km) for meaningful year-to-year and cross-operator comparisons.
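The sequence of tests applied here (Shapiro–Wilk on the normalized rate, then Pearson and Spearman correlations between mileage and counts) can be reproduced with scipy. The annual figures below are hypothetical placeholders rather than this paper's aggregated per-year series, so the resulting statistics will differ from the values quoted above; only the procedure is the same.

```python
import numpy as np
from scipy import stats

# Hypothetical annual aggregates (km in autonomous mode, disengagement counts).
km_per_year = np.array([0.8e6, 1.5e6, 2.1e6, 3.13e6, 2.6e6, 2.9e6])
disengagements = np.array([1200, 2500, 1800, 71902, 9000, 5200])

# Normalized metric: disengagements per 1000 km.
rate_per_1000km = disengagements / (km_per_year / 1000)

w, p_sw = stats.shapiro(rate_per_1000km)              # normality of the normalized rate
r, p_r = stats.pearsonr(km_per_year, disengagements)  # linear association
rho, p_rho = stats.spearmanr(km_per_year, disengagements)  # monotonic association
print(f"Shapiro-Wilk W={w:.3f} p={p_sw:.4f}; Pearson r={r:.3f}; Spearman rho={rho:.3f}")
```

With a single extreme outlier in the counts, the normalized rate typically fails the normality test, which is precisely why the rank-based Spearman coefficient is reported alongside Pearson's r.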
Figure 7 presents the annual number of disengagements plotted against the total number of vehicles tested during the 2015–2020 period. The fitted linear regression line exhibits an essentially zero slope and an extremely low coefficient of determination (R² = 0.0001), indicating that there is no meaningful linear relationship between the number of vehicles tested and the absolute number of disengagements. To assess whether the annual disengagement counts follow a normal distribution, a Shapiro–Wilk test was conducted. The test yielded W = 0.572 with
p = 0.00021, leading to the rejection of the null hypothesis of normality (
p < 0.05) and confirming that the annual disengagement counts significantly deviate from a normal distribution.
The total number of disengagements by individual year shows the same trend (i.e., an increase in the number of disengagements over time). Apple is worth mentioning: it reported a total of 83,423 disengagements while operating 274 vehicles, which is starkly different from the other operators/manufacturers, whose reported disengagement numbers were in the order of hundreds of cases. By contrast, Waymo operated 1978 vehicles with a total of 1181 disengagements in the period under review. The ratio of disengagements to vehicles for Apple is thus 304.46, while, for Waymo, the average number of disengagements per vehicle was only 0.59. The development of the total number of disengagements of autonomous vehicles operated by Waymo in the monitored period is shown in
Figure 8. The trend depicted in
Figure 8 follows a classic learning curve pattern: although the absolute number of disengagements increased with cumulative autonomous mileage, the slope of the curve gradually flattened. From 2018 to 2022, Waymo’s cumulative disengagement count rose from 122 to 704, representing a net increase of 582 events (+478%). Year-over-year gains were highly variable: 103 disengagements (+84% relative to 2018) were added in 2019; only 23 (+10%) in 2020; a record 292 (+118%) in 2021; and 164 (+30%) in 2022. A linear regression fit produced a slope of 0.0075 disengagements per kilometer (R² = 0.67), i.e., approximately 7.5 additional disengagements for each additional 1000 km of testing. This concave trajectory nevertheless suggests a declining marginal frequency of driver- or system-initiated interventions as operational experience accumulates, reflecting continuous improvements in the perception, planning, and control subsystems.
The year-on-year increase in the number of disengagements was in decline in the monitored period (see
Figure 9). A linear regression of the year-over-year disengagement rate normalized by the cumulative autonomous distance (disengagements per 1000 km) revealed a statistically significant downward trend, from 0.12 in 2018 to 0.04 in 2022. The model yielded a slope of −0.020 disengagements per 1000 km per year (R² = 0.78; p = 0.03), indicating that each additional year of cumulative testing corresponded, on average, to a reduction of 0.020 disengagements per 1000 km (16.7% of the 2018 level). Despite a modest uptick in 2021, the overall decline underscores improving system reliability relative to operational exposure. This decreasing normalized disengagement rate further corroborates the interpretation of
Figure 8—that the autonomous driving technology demonstrates increasing maturity and reduced reliance on manual intervention as cumulative mileage grows. Nevertheless, the idea that this is solely a manifestation of more advanced technology cannot be stated with certainty, because the development of institutional policies of individual operators/manufacturers for reporting disengagement cases is not known.
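The normalized-rate trend can be re-derived with a simple linear fit. The 2018 and 2022 end points (0.12 and 0.04 disengagements per 1000 km) come from the text, but the intermediate years below are assumed values, so the fitted slope and R² only approximate the reported −0.020 and 0.78.

```python
from scipy import stats

# Normalized disengagement rate (disengagements per 1000 km) by year.
# End points from the text; 2019-2021 values are hypothetical.
years = [2018, 2019, 2020, 2021, 2022]
rate = [0.12, 0.09, 0.06, 0.08, 0.04]

fit = stats.linregress(years, rate)
print(f"slope = {fit.slope:.3f} per year, R^2 = {fit.rvalue**2:.2f}, p = {fit.pvalue:.3f}")
```

Normalizing by distance before fitting is the key step: it removes the effect of growing annual mileage, so the slope reflects reliability per unit of exposure rather than fleet growth.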
The maturity of the technology is also indicated by the disengagement ratio, which indicates the distance traveled before disengagement occurs. This distance shows an upward trend across the monitored operators/manufacturers during the monitored period (see
Figure 10). A linear regression of calendar year against the distance-to-disengagement ratio (kilometers per disengagement) revealed a pronounced positive trend. The fitted model exhibited a slope of +1100 km/disengagement/year (R² = 0.84; p = 0.008), demonstrating a statistically significant increase in the average distance between successive disengagement events over time. This result indicates that each additional year of cumulative testing is associated with an approximate 1100 km extension in the autonomous driving distance before requiring a driver- or system-initiated takeover.
In [
41], the ratio of distance traveled to the number of disengagements is divided into levels. A ratio below 2000 indicates a system with only basic capabilities that requires improvement, corresponding to a lower level of automation according to the SAE J3016 methodology. A ratio above 2000 is defined as automation with more advanced technology, approaching level 3 automation according to the SAE. By this criterion, the monitored sample has generally shown increasing technological maturity since 2018, enabling the automation of transport to continue improving.
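The maturity classification from [41] reduces to a single threshold on kilometers per disengagement; a minimal sketch with hypothetical operator totals:

```python
# Threshold from [41]: below 2000 km per disengagement = basic capability,
# above 2000 = more advanced automation approaching SAE level 3.
THRESHOLD_KM = 2000

def maturity_level(km_autonomous: float, disengagements: int) -> str:
    """Classify an operator by its distance-to-disengagement ratio."""
    ratio = km_autonomous / disengagements
    return ("advanced (> 2000 km/disengagement)"
            if ratio > THRESHOLD_KM
            else "basic (<= 2000 km/disengagement)")

# Hypothetical operator totals, not figures from the reports.
operators = {"Operator A": (3_200_000, 704), "Operator B": (450_000, 8_300)}
for name, (km, d) in operators.items():
    print(f"{name}: {km / d:,.0f} km/disengagement -> {maturity_level(km, d)}")
```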
We further analyzed which entity was the most frequent initiator of disengagements. For this purpose, we limited our analysis to disengagement data from Waymo, as this company provides the most extensive dataset available. Between 2015 and 2022, a total of 1232 disengagements were recorded. The distribution of disengagements based on the initiator is presented in
Table 2.
From the table, it is evident that the analysis must be further constrained to the years 2018–2022, as information on the initiator of disengagements first appears in reports from 2018 onward. The number of interventions performed by the safety driver exhibits an increasing trend. This could indicate intensified testing activities, more complex traffic environments in which autonomous vehicles have been deployed over the years, or a decline in the system’s ability to appropriately respond to traffic situations. However, without additional insights into the specific traffic conditions under which the vehicles were operated, it is not possible to conclude that the ability of autonomous vehicles to respond adequately has deteriorated.
The number of disengagements initiated by the autonomous system also showed an increasing trend, although it was more moderate and highly variable. On average, 26.5% of the disengagements were initiated by the system itself, indicating that, despite substantial increases in some years (e.g., 2021), this trend was not consistent. Such fluctuations suggest that technological development may occur in leaps rather than follow a linear progression—improvements in one year do not necessarily translate into sustained long-term advancements. The sharp increase in 2021, followed by a decline in 2022, may indicate either software modifications or changes in testing conditions.
The ratio of disengagements initiated by the autonomous system to those executed by the safety driver in different years can be interpreted as an indicator of vehicle autonomy. A higher value suggests more frequent interventions by the autonomous system, whereas a lower value reflects the dominant role of the safety driver in regaining control (see
Figure 11). In 2019, this ratio was 0.144, meaning that for every 100 disengagements performed by the driver, approximately 14 were initiated by the system. In 2020, the ratio increased to 0.353, potentially indicating improvements in the autonomous driving software or differing testing conditions that led to a higher rate of self-initiated disengagements. The highest value was observed in 2021, reaching 1.071, meaning that the autonomous system initiated more disengagements than the safety driver. This trend could reflect either a more conservative approach to detecting critical situations or an advancement in technology that resulted in more proactive interventions by the system rather than by the human operator. However, in 2022, the ratio dropped sharply to 0.038, indicating a significant shift in disengagement dynamics. This decline may have been caused by system optimizations, a reduction in scenarios where the autonomous system deemed disengagement necessary, or changes in testing protocols, such as stricter intervention criteria for safety drivers.
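The autonomy indicator itself is a simple ratio of initiator counts. The yearly counts below are illustrative values chosen so that the ratios reproduce those quoted above (0.144, 0.353, 1.071, 0.038); they are not Waymo's actual initiator counts.

```python
# Illustrative counts of disengagements by initiator; chosen to reproduce
# the system/driver ratios quoted in the text, not taken from the reports.
by_year = {
    2019: {"system": 18,  "driver": 125},
    2020: {"system": 36,  "driver": 102},
    2021: {"system": 150, "driver": 140},
    2022: {"system": 6,   "driver": 158},
}

for year, counts in by_year.items():
    ratio = counts["system"] / counts["driver"]
    print(f"{year}: system/driver ratio = {ratio:.3f}")
```

A ratio above 1 means the system flagged more takeovers than the human did; a ratio near 0 means the safety driver dominated.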
These findings suggest that the trend toward autonomy is not linear but subject to year-on-year fluctuations, likely influenced by technological developments, testing methodologies, or regulatory requirements. The overall trend indicates that the role of the safety driver remains a crucial element in the safe testing of autonomous vehicles in real-world conditions. This is further supported by the fact that driver-initiated interventions have increased more than system-initiated disengagements, which could be attributed to manufacturers testing their autonomous vehicles in increasingly demanding and complex environments.
We further analyzed the causes of disengagements for Waymo vehicles between 2016 and 2022. In its disengagement reports, the company distinguishes between two main categories of disengagements: passive and active. A passive disengagement occurs when the system detects a failure or a risky situation and automatically terminates the autonomous mode, requiring the driver to immediately assume control. In contrast, an active disengagement happens when the driver, based on their own judgment and situational awareness, manually takes over control, even though the system does not issue any explicit warning. The company also specifies the following categories of disengagement causes (see
Table 3):
- weather conditions during testing,
- reckless behavior of road users,
- hardware discrepancy,
- unwanted maneuver of the vehicle,
- perception discrepancy,
- incorrect behavior prediction of other traffic participants,
- software discrepancy,
- construction zone during the test (only for the years 2016 and 2017),
- emergency vehicles during the test (also only for the years 2016 and 2017),
- debris in the way.
The time series analysis of the ratio of system-initiated to safety driver-initiated disengagements between 2018 and 2022 revealed pronounced year-to-year fluctuations without a statistically significant linear or monotonic correlation (R2 = 0.12; p = 0.31; Spearman’s ρ = 0.49; p = 0.32). These irregular variations are likely attributable to asynchronous software updates and variable operational conditions rather than to steady technological advancement.
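The two trend statistics used here (linear R2 and Spearman's rank correlation) can be reproduced with a few lines of standard-library Python. The yearly ratio series below is an invented placeholder chosen only to show an irregular pattern, not the reported values:

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    # Spearman's rho is the Pearson correlation of the rank-transformed
    # series (simple ranking; no tie handling needed for this toy data).
    rank = lambda s: [sorted(s).index(v) + 1 for v in s]
    return pearson(rank(x), rank(y))

years = [2018, 2019, 2020, 2021, 2022]
ratio = [0.10, 0.144, 0.353, 1.071, 0.038]  # illustrative, not the DMV figures

r2 = pearson(years, ratio) ** 2
rho = spearman(years, ratio)
print(f"linear R^2 = {r2:.2f}, Spearman rho = {rho:.2f}")
```

A low R2 with a modest rho, as reported in the text, is exactly what such a spiky, non-monotonic series produces.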
Figure 11 illustrates the trend of disengagements caused by perception mismatches, with a sixth-degree polynomial regression achieving an ideal fit (R2 = 1). It should be noted that a sixth-degree polynomial can interpolate the seven annual data points exactly, so this perfect fit reflects the model's flexibility rather than genuine explanatory power. The trend indicates that after a period of relatively low disengagement numbers between 2017 and 2020, there was a sharp increase in 2021, which may suggest changes in sensor algorithms or operational conditions.
Figure 12 analyzes the trend of disengagements caused by unwanted vehicle maneuvers. A third-degree polynomial regression achieved a lower but still acceptable coefficient of determination (R2 = 0.7136), indicating that the model captured the general trend, albeit with some deviations. The noticeable decline between 2018 and 2020 could suggest improvements in vehicle planning and control algorithms. The subsequent sharp increase in 2021 and 2022 may have resulted from changes in operational conditions, stricter safety requirements, or the introduction of new systems that led to the identification of additional problematic scenarios.
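The contrast between the two fits above is largely mechanical: a sixth-degree polynomial has seven coefficients and therefore passes through seven yearly observations exactly, while a third-degree fit cannot. The following sketch illustrates this with NumPy; the counts are invented placeholders, not Waymo's figures:

```python
import numpy as np

def poly_r2(x, y, degree):
    """Least-squares polynomial fit of the given degree; returns R^2."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

# Seven annual data points (2016-2022), centred on 2019 to keep the
# Vandermonde matrix well-conditioned; counts are illustrative only.
x = np.arange(-3.0, 4.0)
y = np.array([30.0, 12.0, 10.0, 9.0, 11.0, 55.0, 40.0])

# Degree 6 interpolates 7 points exactly: R^2 = 1 by construction,
# which says nothing about predictive power outside the fitted years.
r2_deg6 = poly_r2(x, y, 6)
r2_deg3 = poly_r2(x, y, 3)
print(f"degree 6: R^2 = {r2_deg6:.4f}; degree 3: R^2 = {r2_deg3:.4f}")
```

This is why the R2 = 0.7136 of the lower-degree model, although smaller, is arguably the more informative of the two statistics.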
Overall, both analyses indicate that autonomous vehicles have been undergoing dynamic development, with certain aspects, such as perception discrepancies or unwanted maneuvers, exhibiting unstable trends over time. These findings suggest that technological advancement is not strictly linear, but may be influenced by changes in algorithms, testing conditions, or regulatory requirements.
We further relate the number of disengagements in autonomous vehicles due to the two most frequently reported causes—perception discrepancy and unwanted maneuvers—to the distance traveled (see
Figure 13 and
Figure 14). This data normalization enables an objective assessment by eliminating distortions resulting from variations in testing distances or vehicle operational profiles. Analyzing disengagement trends relative to the distance traveled provides valuable insights into the system’s capabilities, particularly during pilot operations, which are essential for ensuring safe and seamless performance. Identifying correlations between disengagements and specific operational conditions, such as weather effects or traffic density, facilitates targeted development and optimization of perception and control algorithms, thereby enhancing the overall robustness of the autonomous system. Tracking this ratio over time reveals whether system capabilities improve due to learning and algorithm refinement or, conversely, deteriorate due to factors such as sensor degradation or changes in software configuration.
We observe a decreasing trend in the number of disengagements per kilometer traveled as the total distance increases. This suggests that the vehicles' autonomous systems are gradually improving their perception of the environment, reducing the need for human intervention. The declining trend indicates that with each kilometer traveled, the system learns and enhances its perception capabilities, an improvement that may be attributed to more advanced algorithms, better sensors, or a larger volume of training data. However, the R2 value of 0.2117 means that the model explains only about 21% of the data variability, so the fitted trend leaves most of the variation unexplained, and other factors not captured by the model must also influence the number of disengagements. These factors may include:
- operational conditions such as weather, time of day, road type, and traffic density, which can affect the system's perception and lead to a higher number of disengagements;
- software and hardware changes: software updates or hardware modifications can alter the system's perception and thus change the number of disengagements.
Furthermore, we investigated the impact of recklessly behaving road users on the frequency of disengagements by relating the number of such events to the total distance traveled (see
Figure 15). The graph shows that the proportion of disengagements triggered by these road users (per kilometer driven) increases slightly as the total distance grows. This trend may indicate a heightened probability of encountering hazardous situations across a broader range of operational conditions as the vehicles accumulate mileage. It may also imply a gradual deterioration in the autonomous driving technology's ability to respond to these situations, or it may reflect the operational environments in which the vehicles were deployed in different years, with a higher frequency of risky behavior among road users. From an overall evaluation perspective, it remains crucial to examine the context of each disengagement in order to distinguish whether the increase in the relative number of these events is truly related to a higher occurrence of recklessly behaving road users or rather to external factors and changes in the testing process.
3. Discussion
The body of literature examining autonomous vehicle disengagements employs a diverse array of methodological approaches to quantify system reliability and identify the factors that necessitate human intervention. Baş Kaman and Olmuş [
42] used nonlinear regression and software reliability growth models to demonstrate that persistent technical deficiencies—most notably inadequate sensor integration and planning algorithm errors—remain the predominant drivers of disengagement frequency. Favaro et al. [
43] further showed that operational speed, traffic environment complexity, and driver behavior significantly influence takeover performance, with even minor technical shortcomings markedly prolonging driver reaction times. Dixit et al. [
44] confirmed a strong correlation between the autonomous miles traveled, disengagement frequency, and driver reaction time, underscoring the critical role of operator trust and situational awareness. Meanwhile, Koopman and Osyk [
45] and Zhao et al. [
46] cautioned against relying solely on raw disengagement counts for safety assessment, noting that dynamic operating conditions, frequent software updates, and inconsistent reporting practices complicate direct trend extrapolation.
Building on these prior studies, our analysis offers a novel insight: longitudinal trends in disengagement frequency constitute an empirical indicator of autonomous system maturity and gradual performance improvement [
47,
48]. By examining California DMV disengagement data from 2015 through 2022 across all major operators, we observed a clear downward trajectory in disengagement occurrences that cannot be attributed solely to reporting inconsistencies or year-to-year variability. This finding fills a gap identified by Zhao et al. [
46] by demonstrating that aggregated, longitudinal analysis of disengagement reports reveals meaningful patterns of technological progress rather than noise.
However, substantial limitations in current reporting constrain definitive conclusions about absolute safety performance. Critically, disengagement reports lack standardized metadata describing each vehicle's operational design domain (ODD), preventing adjustment for variation in test environments—from dense urban traffic to sparsely populated highways—that directly affects disengagement rates. A further “gray zone” arises around near-miss events: dangerous situations narrowly averted without a formal disengagement go unreported, introducing systematic undercounting of events at critical system boundaries. Moreover, manufacturers' reporting thresholds and individual safety drivers' discretion vary widely, such that a driver's decision to intervene—even under identical circumstances—reflects personal risk tolerance rather than objective system capability.
Technical deficiencies continue to underpin many disengagements. Faisal et al. [
49] demonstrated that insufficient sensor coverage—whether in number, placement, or calibration—leads to perception errors and planning failures that force manual takeover. Planning algorithm shortcomings, including poor trajectory prediction under complex traffic scenarios, further increase disengagement frequency. Equally important are sociobehavioral factors: Nordhoff [
50] revealed that drivers often intervene not only when automation performance is deficient, but also when anticipating failure, experiencing discomfort, or reacting to the unpredictable behavior of other road users. His conceptual framework highlights five key influences on disengagement decisions, emphasizing that disengagements reflect a technical–social interplay rather than purely system shortcomings.
By extending existing cause-based typologies [
25,
28,
50] to incorporate temporal evolution in the relative prevalence of leading triggers—particularly perception mismatches and unwanted maneuvers—our work demonstrates that standardized and contextualized disengagement data can provide a valid benchmark of autonomous vehicle maturity. This integrated perspective offers regulators, manufacturers, and researchers a practical tool for tracking real-world progress toward safer, more reliable autonomy, while underscoring the urgent need for a harmonized reporting framework. Such a framework should capture both technical metadata (e.g., ODD specifications, software/hardware updates) and contextual details (e.g., traffic complexity, near-miss severity), enabling robust cross-operator comparisons and more accurate assessments of longitudinal safety performance.