3.1.1. High-Level Statistical Evaluation for Braking Speed and Total Deceleration
The first level of statistical assessment addresses two critical questions: how accurately did the different labs control the braking speed, and how accurately did they control the deceleration, relative to the WLTP–brake cycle? Kinetic energy is a function of the square of the braking and release speeds; therefore, the study of braking speed provides insight into the primary input to the amount of kinetic energy dissipated by the brake. Additionally, assessing brake deceleration provides useful metrics related to the rate of kinetic energy dissipation (braking power) and, thus, to the brake temperature rise during braking.
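The relationship between braking speed, release speed, deceleration, and braking power can be sketched as follows. This is a minimal illustration, not the calculation used in the study: the function names and the example vehicle mass are hypothetical, and the sketch attributes the entire change in kinetic energy to the brake, ignoring parasitic vehicle losses and rotating inertias.

```python
def dissipated_energy_j(mass_kg, v_brake_kmh, v_release_kmh):
    """Kinetic energy removed between braking speed and release speed, in joules."""
    v_brake = v_brake_kmh / 3.6      # km/h -> m/s
    v_release = v_release_kmh / 3.6  # km/h -> m/s
    return 0.5 * mass_kg * (v_brake**2 - v_release**2)


def mean_braking_power_w(mass_kg, v_brake_kmh, v_release_kmh, decel_ms2):
    """Average power over the event, assuming a constant deceleration."""
    duration_s = (v_brake_kmh - v_release_kmh) / 3.6 / decel_ms2
    return dissipated_energy_j(mass_kg, v_brake_kmh, v_release_kmh) / duration_s


# Hypothetical example: 1600 kg vehicle braking from 72 km/h to 36 km/h at 1.0 m/s^2
energy = dissipated_energy_j(1600.0, 72.0, 36.0)
power = mean_braking_power_w(1600.0, 72.0, 36.0, 1.0)
```

Because the energy depends on the squares of the speeds, a given speed bias matters more at motorway speeds than at urban speeds, while a deceleration bias changes the power at a given energy.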
Table 2 presents the essential speed and deceleration statistics for all labs, averaged over the six repetitions of T1.
Table 2 also provides a high-level summary of the corresponding vehicle statistics (WLTP). The summary metrics depict the behaviour for the entire cycle and its urban, rural, and motorway sections.
This part of the statistical evaluation involves all events and is separated by type of driving (urban, rural, motorway) as defined in Valverde et al. [24]. All the estimates (in absolute values or as a per cent) for accuracy/bias and precision/scatter relate to the nominal value established on the WLTP–brake cycle. Accuracy of braking speed and deceleration is the bias or deviation from its nominal value. Precision of braking speed and deceleration is the scatter or variability corresponding to a 95% confidence interval from the six repeats of the cycle during T1. The data review for most labs shows that the application of different control programmes influenced the correct replication of the cycle. The root causes for such discrepancies may relate to programming errors, the dynamometer control programme skipping or missing certain brake events, and data entry errors during the compilation of results from all labs. The WLTP–brake cycle has many brake events combined with drastic differences in set-points between consecutive events. This makes the metrics for test quality highly sensitive to potential errors in the control programme, data collection, or data reporting. Since there is no intention to migrate towards a unique and universal control programme, it is necessary to ensure all labs conducting this cycle rely on standard speed metrics and an agreed-upon method to correct for parasitic vehicle losses.
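The bias and scatter definitions above can be illustrated for a single brake event. The sketch below is an assumption about how such estimates could be computed, not the study's procedure: it takes the scatter as the half-width of a two-sided 95% Student-t confidence interval for the mean of six repeats, and the function name and return layout are hypothetical.

```python
import statistics

# Two-sided 95% Student-t quantile for 5 degrees of freedom (six repeats).
T_95_DF5 = 2.571


def bias_and_scatter(repeats, nominal):
    """Return (bias, bias %, scatter, scatter %) for one event over six repeats.

    Assumption: scatter is the half-width of the 95% confidence interval
    of the mean, expressed in the same unit as the input values.
    """
    if len(repeats) != 6:
        raise ValueError("T_95_DF5 is only valid for exactly six repeats")
    mean = statistics.fmean(repeats)
    bias = mean - nominal
    half_width = T_95_DF5 * statistics.stdev(repeats) / len(repeats) ** 0.5
    return bias, 100.0 * bias / nominal, half_width, 100.0 * half_width / nominal


# Hypothetical braking speeds (km/h) for one event with a 48 km/h set-point
bias, bias_pct, scatter, scatter_pct = bias_and_scatter(
    [49.0, 51.0, 49.0, 51.0, 49.0, 51.0], 48.0
)
```

Expressing both quantities relative to the nominal set-point keeps urban, rural, and motorway events comparable despite their different speed levels.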
Overall, speed control seems to be more accurate and precise than deceleration control. Most dynamometers convert the deceleration set-point to an equivalent torque set-point. The relatively low torque levels during the WLTP–brake cycle on the primary test vehicle (from 160 N·m to 200 N·m on average on the front axle’s brake) can combine with other mechanical, electrical, and servo response characteristics (hysteresis, linearity, accuracy, zero offsets, ramp rates, and hydraulic servo-controller settings, among others), making deceleration control more difficult. For the vehicle under test, the average braking torque equates to less than 5% of a typical inertia dynamometer’s full scale.
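The conversion from a deceleration set-point to a torque set-point can be sketched with the rigid-body relation torque = inertia × angular deceleration. The inertia, rolling radius, and full-scale values below are hypothetical and chosen only to land in the torque range quoted above; a real control programme would also account for parasitic losses.

```python
def torque_setpoint_nm(inertia_kgm2, decel_ms2, rolling_radius_m):
    """Brake torque needed to decelerate a rotating inertia.

    Angular deceleration is the linear deceleration divided by the
    tyre rolling radius; torque is inertia times angular deceleration.
    """
    return inertia_kgm2 * decel_ms2 / rolling_radius_m


def fraction_of_full_scale(torque_nm, full_scale_nm):
    """Torque as a per cent of the torque transducer's full scale."""
    return 100.0 * torque_nm / full_scale_nm


# Hypothetical values: 50 kg*m^2 test inertia, 0.32 m rolling radius, 1.0 m/s^2
torque = torque_setpoint_nm(50.0, 1.0, 0.32)
share = fraction_of_full_scale(torque, 5000.0)  # hypothetical 5000 N*m full scale
```

At only a few per cent of full scale, transducer offsets and hysteresis that are negligible at high torque become a noticeable share of the set-point.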
Figure 2 (left-hand side) displays, as a scatter plot, the accuracy/bias for each lab and speed range relative to the nominal set-points from the WLTP–brake cycle. The bias in speed is smaller than 1 km/h compared to the individual set-points; the weighted bias compared to the nominal WLTP–brake cycle is 2.1 km/h (4.6%) on average for all labs combined. Labs 3, 4, 7, and 8 exhibited a weighted average bias below 0.2 km/h (1%). Lab 5 exhibited a bias comparable to the overall averages except for an average bias of 20 km/h during the motorway section. Labs 1 and 2 exhibited a weighted average bias between 4 and 5 km/h, with rural and motorway events being more penalised. Lab 6 exhibited a systematic bias of about 4.5 km/h (>10%) for all types of brake events. Regarding deceleration, labs exhibited a weighted average bias of 0.13 m/s², which corresponds to 13% of the nominal value. It is noteworthy that Labs 1, 4, and 7 controlled the brake with a bias higher than 15%. These labs indicated issues with the design of the control programme and the use of an incorrect set of parameters for the parasitic vehicle losses. Labs 3 and 8 exhibited the smallest bias in all three different speed ranges.
Figure 2 (right-hand side) displays, as a scatter plot, the precision/scatter for each lab and speed range relative to the nominal set-points from the WLTP–brake cycle. On average, all labs exhibited a speed variation lower than 0.5% during 95% of the brake events for all six repetitions of T1. Labs 1, 4, and 8 had the smallest scatter (less than 0.2%), while Labs 3, 5, and 6 exhibited a weighted average scatter close to 1%. As expected, urban brake events exhibit the most significant level of variability compared to rural and motorway events. Regarding deceleration, data from all labs show that the weighted average scatter is 3.3% of the nominal deceleration levels. This value is below the legacy limit of 5% accepted for torque control. The tolerance for torque control combines actual deceleration rate and brake inertia. Labs 1, 4, 7, and 8 were able to limit variability to 1.5% on average, while Labs 2 and 5 exhibited variability of about 8.5% (mainly due to urban braking events). The scatter for all rural and motorway events for all labs remained low, as it did not exceed 1.8% of the nominal. Another source of variability is the variance in the reference deceleration measured on the vehicle at the proving ground. Using data from the proving ground on different vehicles, the variance on deceleration is estimated at 8.85 × 10⁻⁵ (m/s²)² [25].
Regarding deceleration, the application of different control programmes and test parameters can influence the correct replication of the brake power dissipation during the cycle. For example, Figure 3 illustrates the distribution of all braking events (T1–R1) for Labs 1 and 3 based on deceleration rate bins of 0.1 m/s². As discussed previously, Lab 1 executed significantly more braking events with a deceleration higher than 1.0 m/s² compared to Lab 3 and the nominal WLTP–brake cycle. As a result, the average deceleration rate for Lab 1 over the entire cycle is higher.
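A distribution of braking events by deceleration bin, of the kind described above, can be built with a few lines of code. This is an illustrative sketch: the function names are hypothetical, the input values are made up, and the bin edges are assumed to start at zero with each bin labelled by its lower edge.

```python
import math
from collections import Counter


def decel_histogram(decels_ms2, bin_width=0.1):
    """Count events per deceleration bin; keys are the lower bin edges in m/s^2."""
    counts = Counter()
    for d in decels_ms2:
        # The small offset guards against floating-point values such as
        # 0.3 / 0.1 evaluating to 2.9999999999999996.
        counts[math.floor(d / bin_width + 1e-9)] += 1
    return {round(i * bin_width, 10): n for i, n in sorted(counts.items())}


def share_above(decels_ms2, threshold_ms2=1.0):
    """Fraction of events with a deceleration above the threshold."""
    return sum(1 for d in decels_ms2 if d > threshold_ms2) / len(decels_ms2)


# Made-up event decelerations (m/s^2) for illustration only
events = [0.05, 0.12, 0.18, 0.3, 1.05]
histogram = decel_histogram(events)
high_share = share_above(events)
```

Comparing the share of events above 1.0 m/s² between a lab and the nominal cycle gives a quick indication of whether the lab's average deceleration is shifted upwards.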
These findings indicate that some labs might need to revise their control strategy to improve precision, mainly but not only during urban events. These events comprise almost 85% of the brake events during the WLTP–brake cycle; therefore, they are the most significant part of the overall cycle.
Figure 4 shows the linear relationship of braking deceleration between each lab and the vehicle. Even though most labs exhibited high linearity (high R-squared value), the slope of the linear relationship to the vehicle level varied from 0.8143 for Lab 7 to 1.4274 for Lab 1. Labs 1 and 4 exhibited decelerations significantly above the nominal. On the other hand, Labs 6 and 7 exhibited decelerations significantly below the nominal, with Lab 6 showing a larger bias at decelerations below 1 m/s². The regression for Lab 6 uses a total least squares regression to include the variance on vehicle deceleration, per the method and the spreadsheet illustrated by Cantrell et al. [26]. Comparing the common linear least squares regression (minimising the vertical distance to the regression line) to the total least squares regression (minimising the perpendicular distance to the regression line) for other labs yields a difference of less than 1%. The values in Figure 4 reflect the deceleration levels for the foundation brake after correcting for the resistive vehicle forces. The differences in brake deceleration levels also help explain some of the variations observed in brake temperatures discussed later in the document.
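The difference between the two regression types can be sketched as below. This is not the spreadsheet method of Cantrell et al.: it is the simplest orthogonal (total least squares) fit, which assumes equal error variances on both axes, whereas the study weights the fit with the variance on vehicle deceleration. Function names and the sample data are hypothetical.

```python
import math


def _moments(x, y):
    """Means and centred sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return mx, my, sxx, syy, sxy


def ols_fit(x, y):
    """Ordinary least squares: minimises vertical distances. Returns (slope, intercept)."""
    mx, my, sxx, _, sxy = _moments(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx


def tls_fit(x, y):
    """Orthogonal regression: minimises perpendicular distances.

    Assumes equal error variance on x and y and a non-zero covariance.
    """
    mx, my, sxx, syy, sxy = _moments(x, y)
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy**2)) / (2.0 * sxy)
    return slope, my - slope * mx


# Made-up vehicle (x) and dynamometer (y) decelerations in m/s^2
vehicle = [0.4, 0.6, 0.8, 1.0, 1.2, 1.6]
dyno = [0.50, 0.71, 0.98, 1.18, 1.47, 1.90]
slope_ols, _ = ols_fit(vehicle, dyno)
slope_tls, _ = tls_fit(vehicle, dyno)
```

When the points lie close to a straight line, as in the high R-squared cases reported above, the two slopes nearly coincide, which is consistent with the small difference the study reports for the other labs.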
3.1.2. Statistical Assessment of Time-Resolved Speed Violations and Error
The next level of numerical assessment of the dynamometer data pertains to the time-resolved speed response at 1 Hz. The two standard metrics to qualify the cycle’s execution relative to the speed trace are: (i) speed violations, as the per cent of the time the brake speed was outside the limits established in Annex 6 of UN GTR No. 15, 5th amendment (6th amendment to be published soon) [27]; and (ii) speed error, as the Root Mean Square of the Speed Error (RMSSE) per SAE J2951 [28], also expressed as a per cent of the maximum error allowed by GTR 15. The cited documents provide more details regarding the implementation of the metrics.
The first metric determines, for every nominal value on the speed trace at 1 Hz, whether there is a speed violation. According to Annex 6 of the UN GTR No. 15, the speed trace for a given timestamp t is allowed to deviate from the nominal value within a predefined threshold of ±2 km/h. When the speed trace exceeds this threshold, a violation is recorded for the given timestamp t. The calculation for total speed violations (above and below) for the entire trip follows Equation (1). More details regarding the equations and the metrics’ implementation can be found in the respective Global Technical Regulation and its recent amendment [27].
Equation (1) expresses the total speed violations as a per cent of the entire drive-on time of the WLTP–brake cycle (15,826 s), using the count of instances (at 1 Hz) when the dynamometer speed is below or above the speed tolerance.
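The speed violation count can be sketched as follows for two traces sampled at 1 Hz. This is a simplified illustration with hypothetical names: it applies only the ±2 km/h band described in the text to each timestamp and uses the trace length as the drive-on time, so any further provisions in the regulation are not represented.

```python
def speed_violation_percent(measured_kmh, nominal_kmh, tolerance_kmh=2.0):
    """Per cent of 1 Hz samples where the measured speed leaves the tolerance band."""
    if len(measured_kmh) != len(nominal_kmh):
        raise ValueError("traces must have the same number of samples")
    violations = sum(
        1
        for measured, nominal in zip(measured_kmh, nominal_kmh)
        if abs(measured - nominal) > tolerance_kmh
    )
    return 100.0 * violations / len(nominal_kmh)


# Made-up four-second excerpt: two samples fall outside the +/-2 km/h band
nominal = [10.0, 20.0, 30.0, 40.0]
measured = [10.5, 23.0, 30.0, 37.9]
violation_pct = speed_violation_percent(measured, nominal)  # 50.0
```

For the full cycle, the denominator would be the 15,826 samples of drive-on time.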
Table 3 summarises the speed violations as a per cent of total time for all six repetitions of the WLTP–brake cycle on T1. Lab 6 did not exhibit any speed violation, followed closely by Labs 4 and 5, which performed the cycles with less than 3% of the time outside the tolerance around the nominal speed profile. On the other hand, Labs 3 and 8 exhibited 42% and 32% of speed violations, respectively, failing to correctly follow the nominal speed profile. Labs 1 and 2 exhibited more speed violations during R1 and improved significantly in the following repetitions. This behaviour hints at possible control programme optimisations before continuing to subsequent repetitions. In contrast, Lab 8 performed the first five repeats with ~37% speed violations and made changes before R6 to reduce the violations to 0.5%.
The examination of the RMSSE parameter yields similar conclusions. The RMSSE for a given cycle, as a per cent, is determined from every timestamp of the speed trace using Equation (2). More details regarding the equations and the metrics’ implementation are available in the Society of Automotive Engineers (SAE) Standard J2951 [28].
RMSSE% is the ratio between the RMSSE of the entire test and the RMSSE_limit calculated for the upper or lower speed limits defined per GTR 15 or GRPE 81-12, expressed as a per cent.
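The RMSSE and its per cent form can be sketched as below. The function names are hypothetical; the default limit of 3.191 km/h is the value this paper reports for the maximum RMSSE that still complies with the speed tolerance, and the traces are made up.

```python
import math


def rmsse_kmh(measured_kmh, nominal_kmh):
    """Root mean square of the speed error over all 1 Hz samples, in km/h."""
    if len(measured_kmh) != len(nominal_kmh):
        raise ValueError("traces must have the same number of samples")
    squared = [(m - n) ** 2 for m, n in zip(measured_kmh, nominal_kmh)]
    return math.sqrt(sum(squared) / len(squared))


def rmsse_percent(measured_kmh, nominal_kmh, rmsse_limit_kmh=3.191):
    """RMSSE of the test as a per cent of the limit value."""
    return 100.0 * rmsse_kmh(measured_kmh, nominal_kmh) / rmsse_limit_kmh


# Made-up trace with a constant +2 km/h error, so RMSSE is 2 km/h
nominal = [0.0, 10.0, 20.0, 30.0]
measured = [2.0, 12.0, 22.0, 32.0]
error_pct = rmsse_percent(measured, nominal)
```

Unlike the violation count, the RMSSE weighs how far the trace departs from the nominal value, so a few large excursions and many small ones can produce the same violation percentage but different RMSSE values.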
Table 4 indicates the summary RMSSE (km/h) and per cent RMSSE for all repeats on T1. The WLTP row refers to the maximum RMSSE metric while still complying with the speed tolerance. Labs 4 and 6 managed to conduct the six repeats below 50% of the acceptable limit, while Lab 5 did not exceed the acceptable limit in any of the repeats. Labs 1 and 2 completed five repeats between 85% and 95% of the acceptable limit. Labs 7 and 8 had only one or two repeats below the maximum RMSSE limit of 3.2 km/h (rounded value for 3.191 km/h).
It is essential to contrast the speed metrics (using the entire 1 Hz dataset) with the accuracy and precision during braking to fully understand the speed control during the cycle. Even though Labs 3, 7, and 8 exhibited a performance comparable to or better than other labs in the latter metrics (braking speed and average deceleration), their two speed metrics (speed violations and RMSSE) for the entire cycle (at 1 Hz) tended to show lower-than-average performance. It is interesting to note that Lab 3 reproduces the braking decelerations of the WLTP–brake cycle with high fidelity. This contrast hints that Lab 3 performed better during braking than in between braking events (Table 2).
3.1.3. Statistical Analysis and Derivation of Standard Deviations Per ISO 5725-5
The analysis provided in this part relies entirely on the ISO 5725 series of standards. The current study followed the general guidance designed for heterogeneous materials [29], while the general formulae accommodate the fact that not all labs were able to report all the results for each level (braking event) [30]. The general statistical approach involves three factors arranged in a specific hierarchy: the factor “laboratory” at the highest level, a factor “samples within laboratories” as the next level, and a factor “test results within samples” as the lowest level of the hierarchy. The methods applied in this study provide statistical estimates removing the possible variation between samples.
The scrutiny of the data for consistency, stragglers, and outliers relied on the implementation of the h* and k* statistics as outlined by Mandel and applied in ISO 5725-2 [31]. The h* statistic detects differences between means, and the k* statistic detects differences between variances. These statistics allow for the detection of stragglers (95th percentile) and outliers (99th percentile) for lab averages, lab ranges, and test results. The (*) denotes metrics and statistics derived using robust algorithms. The use of robust methods allows the calculation of the standard deviations for repeatability (between tests) and reproducibility (between dynamometers) while minimising the influence of outlying data [30]. The robust methods determine the standard deviations used as the denominators in the h* and k* statistics, as well as the overall averages, avoiding distortion of the results. In addition, they generate estimates of the repeatability, intermediate precision, and reproducibility standard deviations.
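The structure of the two consistency statistics can be sketched for one level (one braking event). The sketch uses the classical, non-robust forms of Mandel's statistics: the study replaces the averages and the denominators with robust estimates, which is not reproduced here. Function names and the sample values are hypothetical.

```python
import math
import statistics


def mandel_h(lab_means):
    """Between-laboratory consistency: deviation of each lab mean from the
    grand mean, scaled by the standard deviation of the lab means."""
    grand_mean = statistics.fmean(lab_means)
    spread = statistics.stdev(lab_means)
    return [(m - grand_mean) / spread for m in lab_means]


def mandel_k(lab_sds):
    """Within-laboratory consistency: each lab's standard deviation divided
    by the root mean square of all lab standard deviations."""
    pooled = math.sqrt(statistics.fmean([s**2 for s in lab_sds]))
    return [s / pooled for s in lab_sds]


# Made-up lab means and within-lab standard deviations for one brake event
h_values = mandel_h([1.02, 0.98, 1.00, 1.25, 0.97])
k_values = mandel_k([0.010, 0.012, 0.009, 0.030, 0.011])
```

Values of h far from zero flag a lab whose mean departs from the others, and values of k well above one flag a lab with unusually high variability; the straggler and outlier thresholds come from tabulated critical values.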
The current paper follows the usual practice for heterogeneous materials of preparing two samples for each lab. The tests (T1 and T2) on each sample generated six results (R1 to R6) for each of the 303 levels (braking events). This dataset allows three applications of Algorithms A and S to estimate the repeatability and reproducibility. The iteration process repeats until the change in the robust estimates (s* is the robust standard deviation, and w* is the robust medial per the ISO 5725 series of standards) is small. The analysis team for this project defined a limit of less than 5% change or four iterations, whichever was achieved first. The output of the robust algorithms comprises the estimates of the repeatability and reproducibility standard deviations and the standard deviation between samples.
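The iterative character of the robust estimation can be sketched with a Huber-type winsorising loop of the kind commonly described for Algorithm A. This is an illustration under assumptions, not the project's implementation: the constants 1.483, 1.5, and 1.134 are the values usually quoted for this algorithm and should be checked against the standard, the function name is hypothetical, and the stopping rule mirrors the 5% change or four iterations limit stated above.

```python
import statistics


def algorithm_a(values, rel_tol=0.05, max_iter=4):
    """Robust mean x* and robust standard deviation s* by iterative winsorising."""
    values = list(values)
    x = statistics.median(values)
    # Initial s*: scaled median absolute deviation from the median.
    s = 1.483 * statistics.median([abs(v - x) for v in values])
    for _ in range(max_iter):
        if s == 0.0:
            break  # no spread left to winsorise against
        delta = 1.5 * s
        # Pull values outside x* +/- delta back onto the band edges.
        clipped = [min(max(v, x - delta), x + delta) for v in values]
        new_x = statistics.fmean(clipped)
        new_s = 1.134 * statistics.stdev(clipped)
        converged = abs(new_s - s) <= rel_tol * s
        x, s = new_x, new_s
        if converged:
            break
    return x, s


# Made-up results with one outlying value
robust_mean, robust_sd = algorithm_a([9.8, 10.0, 10.2, 10.1, 9.9, 50.0])
```

In this made-up example the arithmetic mean is pulled to about 16.7 by the single outlying value, while the robust mean stays close to 10, which is the behaviour that protects the repeatability and reproducibility estimates from outlying data.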
Table 5 summarises the standard deviations for the main measurands, as a per cent of the average, calculated with robust methods. These include the standard deviations for repeatability, sample effect, lab effect, and reproducibility. The statistics cover all levels combined (303 brake events), the 50th and the 95th percentiles, and each trip; in every case, Table 5 expresses the values as a per cent of the general average. More details regarding the application of the equations, as well as examples with calculations, are provided in the PMP Brake Emissions Protocol – Part 1 [22].
Table 5 shows that the repeatability standard deviation is less than 1% of the mean values for braking speed and deceleration, as well as for the maximum brake temperature (regardless of the measurement method) and the cooling air temperature. In general, the repeatability within the labs for most parameters is within reasonable ranges, both for the whole cycle and for individual trips. The variation between samples follows the trends described for the repeatability standard deviation. Table 5 shows that only the maximum temperature measured with TC_Rub exhibited relatively high variation (between 15% and 25% of the mean value) between labs, while all other parameters showed no significant difference between T1 and T2 (below 0.5% of the mean value for the control parameters: braking speed, braking deceleration, and braking torque). For the between-laboratory effect, most parameters show relatively high standard deviations. More specifically, deceleration and maximum brake temperature standard deviations are frequently higher than 15%. In contrast, the standard deviation of cooling air parameters depends on whether the examination includes all labs or only the ones with conditioned air for temperature and humidity. Finally, the standard deviation for reproducibility seems to be at an acceptable level. It is again the maximum temperature that comes with relatively high values, regardless of the measurement method, while deceleration seems to be less reproducible compared to the rest of the examined parameters.