Utilizing Principal Component Analysis and Hierarchical Clustering to Develop Driving Cycles: A Case Study in Zhenjiang

Wang, Tianxiao; Jing, Zhecheng; Zhang, Shupei; Qiu, Chengqun

doi:10.3390/su15064845


by Tianxiao Wang ¹, Zhecheng Jing ¹,*, Shupei Zhang ¹ and Chengqun Qiu ²

¹ School of Automotive and Traffic Engineering, Jiangsu University, Zhenjiang 212013, China

² Jiangsu Province Intelligent Optoelectronic Devices and Measurement-Control Engineering Research Center, Yancheng Teachers University, Yancheng 224007, China

* Author to whom correspondence should be addressed.

Sustainability 2023, 15(6), 4845; https://doi.org/10.3390/su15064845

Submission received: 2 February 2023 / Revised: 2 March 2023 / Accepted: 8 March 2023 / Published: 9 March 2023


Abstract:

Accurate driving cycles are key for effectively evaluating electric vehicle performance. The K-means algorithm is widely used to construct driving cycles; however, this algorithm is sensitive to outliers, and determining the K value is difficult. In this paper, a novel driving cycle construction method based on principal component analysis and hierarchical clustering is proposed. Real road vehicle data were collected, denoised, and divided into vehicle microtrip data. The eigenvalues of the microtrips were extracted, and their dimensions were reduced through principal component analysis. Hierarchical clustering was then performed to classify the microtrips, and a representative set of microtrips was randomly selected to construct the driving cycle. The constructed driving cycle was verified and compared with a driving cycle constructed using K-means clustering and the New European Driving Cycle. The average relative eigenvalue error, maximum speed–acceleration probability distribution difference rate, average cycle error, and simulated relative power consumption error per 100 km between the hierarchical driving cycle and the real road data were superior to those of the K-means driving cycle, which indicated the effectiveness of the proposed method. Although the proposed methodology has not been verified in other regions, it provides a useful reference for other research on driving cycle development.

Keywords:

principal component analysis; hierarchical cluster; electric vehicle; driving cycle; driving cycle construction

1. Introduction

Electric vehicle assessment is a key part of automobile development. Various automobile performance indicators, such as electricity consumption per 100 km and energy recovery, are measured by using a driving cycle, which is a series of data points comprising a representative speed–time course for a certain type of vehicle in the intended driving environment [1]. Most relevant national standards in China have adopted the New European Driving Cycle (NEDC); however, the NEDC was formulated in Europe in the 1960s and has been widely criticized as being overly idealized and unrepresentative of modern high-speed traffic patterns. It does not accurately reflect the real-world operating conditions of vehicles, which results in large errors in performance indicators measured using this standard. The Worldwide Harmonized Light Vehicles Test Cycle (WLTC) was proposed by the European Union, Japan, and the United States in 2015 and has been adopted by some nations; however, the WLTC is not suited to the characteristics of the automotive industry in all countries. Notably, the United States, Europe, and Japan do not use the WLTC domestically; they use the FTP75, NEDC, and JAPAN10-15, respectively [2]. The number of driving cycles is increasing, and the large differences among them indicate that each region has its own needs [3]. Achieving accurate vehicle performance measurements in a country therefore requires developing a driving cycle tailored to the characteristics of automobile use in that country.

Scholars have performed numerous studies on developing appropriate driving cycles. Typically, they collect a large quantity of real driving data, represent these data parametrically, and perform multivariate statistical analysis. Peng et al. [4] adopted K-means cluster analysis and the silhouette function screening clustering method to construct driving cycles. Zhang et al. [5] proposed improving K-means clustering by combining the distance optimization method and the density method to measure data set density for constructing driving cycles. Shen et al. [6] used K-means clustering to develop hybrid vehicle driving cycles for Shanghai. Yuan et al. [7] proposed an improved K-means text clustering algorithm based on density peaks to avoid randomly selecting initial clusters. Su et al. [8] introduced the maximum distance, minimum distance, and weighted Euclidean distance to improve the K-means algorithm; these metrics can prevent the initial center from falling into a local optimum, thereby decreasing the convergence time. Yu et al. [9] proposed a semisupervised classification model by combining K-means clustering and support vector machines to develop a driving cycle for Xi’an, China. Anida et al. [10] used K-means clustering to develop the BAS KITe driving cycle. Korta et al. [11] combined the energy center of gravity (ECG) method with K-means clustering to construct driving cycles, which were then used to optimize FSCW SPMSMs.

In summary, many researchers have used K-means clustering to construct driving cycles [12,13,14]. However, K-means is sensitive to outliers, and determining the K value is difficult for large data sets, which affects the accuracy of the clustering results. Researchers have proposed methods to address this problem [15]. Hierarchical clustering can effectively improve the accuracy and computational efficiency of clustering by distinguishing outlier points and determining the optimal number of clusters. Kumar et al. [16] used hierarchical agglomerative clustering with dynamic time warping to cluster 11 years of 3-D displacement data. Lerato and Niesler [17] proposed a multi-stage agglomerative hierarchical clustering approach aimed at large data sets of speech segments. Dinh et al. [18] used hierarchical clustering to verify the number of clusters recommended by the average silhouette. Therefore, this paper proposes a method based on principal component analysis (PCA) and hierarchical clustering for constructing driving cycles. In this study, a driving cycle was developed for electric vehicles in Zhenjiang City, Jiangsu Province, China. First, driving data for real vehicles were collected and preprocessed. Vehicle trips were then divided into microtrips, eigenvalues were selected and extracted from the data, and the dimensionality of the eigenvalues was reduced through PCA. The microtrips were then classified using hierarchical clustering, and microtrips from each class were selected to construct driving cycles. Finally, the constructed driving cycles were verified statistically and economically. The flowchart of the study is shown in Figure 1.

2. Data Collection and Preprocessing

Zhenjiang is a typical small-to-medium-sized city in China, and the data were collected during the day in summer on flat urban routes [4]. The electric vehicles used for data collection were provided by volunteers who traveled freely and randomly throughout the city’s urban area.

Speed and time data for the vehicle were acquired through its controller area network (CAN) bus with a sampling frequency of 1 Hz. The complex electromagnetic environment in the car, signal jitter, global positioning system fluctuations, and other factors introduce noise into the recorded signal. The average filtering method was selected for denoising in this study because of its effectiveness at noise suppression and the smoothness of its output. The parameter K of the average filter was set to 9. Figure 2 presents a comparison of the data fragments before and after noise removal.
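
The denoising step can be illustrated with a minimal sketch. It assumes that K = 9 denotes the length of a centered averaging window and that the window shrinks near the ends of the record; neither detail is stated in the text, and the function name `moving_average` is hypothetical.

```python
def moving_average(speeds, window=9):
    """Centered moving average over a 1 Hz speed trace.

    Assumption: near both ends the window is truncated to the
    samples that exist, so the output has the same length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(speeds)):
        lo = max(0, i - half)
        hi = min(len(speeds), i + half + 1)
        segment = speeds[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# A single 9 km/h spike in an otherwise stationary record.
noisy = [0.0, 0.0, 9.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
smooth = moving_average(noisy, window=9)
```

With a full nine-sample window the spike is spread evenly, which is why a moving average suppresses isolated jitter at the cost of flattening genuine sharp speed changes.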

A microtrip was defined as the speed–time segment from the beginning of a stop until the beginning of the next stop. The division of the collected driving data produced 1090 microtrips.
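
The microtrip definition above can be sketched as follows. The sketch assumes a 1 Hz speed list and treats a sample as stopped when its speed is at or below a threshold of 0; the function name `split_microtrips` and the threshold parameter are illustrative and do not come from the paper.

```python
def split_microtrips(speeds, stop_threshold=0.0):
    """Split a speed trace into microtrips.

    A new microtrip begins at every sample where the vehicle becomes
    stationary after having been moving, so each segment runs from the
    beginning of one stop to the beginning of the next.
    """
    trips = []
    start = 0
    for i in range(1, len(speeds)):
        was_moving = speeds[i - 1] > stop_threshold
        is_stopped = speeds[i] <= stop_threshold
        if was_moving and is_stopped:
            trips.append(speeds[start:i])
            start = i
    if start < len(speeds):
        # Trailing samples; may be an incomplete microtrip.
        trips.append(speeds[start:])
    return trips


trace = [0, 0, 5, 12, 8, 0, 0, 0, 6, 15, 9, 3, 0, 0]
microtrips = split_microtrips(trace)
```

In this toy trace the last segment contains only idle samples; a practical pipeline would probably discard or merge such fragments, but how the authors handled them is not stated.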

For further analysis, 15 eigenvalues were extracted from the microtrips, such as average vehicle speed, maximum vehicle speed, and maximum acceleration (Table 1).

The 15 eigenvalues were defined and calculated as follows [19]. Microtrips were divided into 1-s intervals, and the average value of an indicator during an interval was considered its instantaneous value. Small changes in vehicle speed occur because of driver actions and roadway conditions. Therefore, uniform speed, idle speed, acceleration, and deceleration were defined as follows [20,21,22]. Idle speed was defined as v_i = 0 with the engine continuously operating, where v_i is the speed at the ith second. Uniform speed, acceleration, and deceleration were defined as v_i ≠ 0 with |a_i| ≤ 0.15 m/s², a_i > 0.15 m/s², and a_i < −0.15 m/s², respectively.
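
The four operating modes can be assigned per second with a short sketch. It assumes speeds in km/h, acceleration estimated by a backward difference converted to m/s², and zero acceleration for the first sample; these choices and the label "cruise" for uniform speed are assumptions made for illustration.

```python
ACC_THRESHOLD = 0.15  # m/s^2, threshold taken from the definitions above


def classify_modes(speeds_kmh):
    """Label each 1 s sample as 'idle', 'accel', 'decel', or 'cruise'."""
    modes = []
    for i, v in enumerate(speeds_kmh):
        # Backward difference; the first sample has no predecessor,
        # so its acceleration is taken as zero (assumption).
        prev = speeds_kmh[i - 1] if i > 0 else v
        accel = (v - prev) / 3.6  # km/h per second -> m/s^2
        if v == 0:
            modes.append("idle")
        elif accel > ACC_THRESHOLD:
            modes.append("accel")
        elif accel < -ACC_THRESHOLD:
            modes.append("decel")
        else:
            modes.append("cruise")
    return modes


modes = classify_modes([0, 0, 4, 9, 9.2, 9.1, 5, 1, 0])
```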

Maximum speed, maximum acceleration, and maximum deceleration were the maximum instantaneous vehicle speed, the maximum acceleration value during acceleration, and the maximum deceleration value during deceleration, respectively, for a microtrip. Acceleration time, deceleration time, cruise (uniform-speed) time, and idle time were the sums of the durations of all accelerating, decelerating, uniform-speed, and idle intervals, respectively, during a microtrip.

Average speed was the average value of all speeds in a microtrip.

$$V_{avg} = \frac{1}{T} \sum_{i=1}^{T} v_i \qquad (1)$$

where T is the total time of the microtrip and v_i is the speed at the ith second.

Similarly, average acceleration and deceleration were the average of the instantaneous acceleration or deceleration value during corresponding microtrip periods (Equations (2) and (3)).

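$$a_{avg_a} = \frac{1}{T_a} \sum_{i=1}^{T_a} a_{a,i} \qquad (2)$$

$$a_{avg_d} = \frac{1}{T_d} \sum_{i=1}^{T_d} a_{d,i} \qquad (3)$$

where a_{a,i} and a_{d,i} are the instantaneous accelerations at the ith acceleration point and the ith deceleration point, respectively, and T_a and T_d are the acceleration and deceleration times of the microtrip.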

Acceleration, deceleration, idle, and cruise time ratios were defined as the ratios of the acceleration, deceleration, idle, and cruise times, respectively, to the total microtrip time. These ratios were calculated using Equations (4)–(7), respectively.

$$T_{r_a} = \frac{T_a}{T} \qquad (4)$$

$$T_{r_d} = \frac{T_d}{T} \qquad (5)$$

$$T_{r_i} = \frac{T_i}{T} \qquad (6)$$

$$T_{r_u} = \frac{T_u}{T} \qquad (7)$$
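
As a rough illustration of Equations (1) and (4)–(7), the sketch below computes the average speed and the four time ratios of one microtrip from per-second speeds and mode labels. The label strings and the function name `microtrip_features` are hypothetical; the ratios are returned in percent, as in Table 1.

```python
def microtrip_features(speeds_kmh, modes):
    """Return V_avg and the four time ratios (percent) for one microtrip."""
    total = len(speeds_kmh)
    if total == 0 or total != len(modes):
        raise ValueError("speeds and modes must be non-empty and equally long")

    def ratio(label):
        # With 1 Hz sampling, the number of samples equals the time in seconds.
        return 100.0 * modes.count(label) / total

    return {
        "V_avg": sum(speeds_kmh) / total,  # Equation (1)
        "T_r_a": ratio("accel"),           # Equation (4)
        "T_r_d": ratio("decel"),           # Equation (5)
        "T_r_i": ratio("idle"),            # Equation (6)
        "T_r_u": ratio("cruise"),          # Equation (7)
    }


speeds = [0, 0, 4, 9, 9, 9, 5, 1]
labels = ["idle", "idle", "accel", "accel", "cruise", "cruise", "decel", "decel"]
features = microtrip_features(speeds, labels)
```

Because every sample carries exactly one of the four labels, the four ratios sum to 100%, which is a convenient sanity check on extracted data.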

3. Principal Component Analysis

Because of the high correlation between the 15 eigenvalues and the high dimensionality of the data, direct analysis was challenging; thus, PCA was conducted to reduce the dimensionality while retaining most of the information. In PCA, several principal components of the data are identified; each component explains a different percentage of the data variance. The dimensionality (and thus the computational complexity of the analysis) can be reduced by selecting the principal components that explain the most variance. Typically, a cumulative contribution rate exceeding 80% [23] is considered to reduce the dimensionality effectively while ensuring that sufficient information is retained. The variance explained by each of the 15 principal components and the corresponding cumulative contribution rates are presented in Table 2.

Table 2 reveals that the cumulative contribution rate of the first three principal components exceeded 80%; thus, these three components were selected for the subsequent analysis, and the remaining components were discarded. The correlation coefficients between these three principal components and the 15 eigenvalues are presented in Table 3.
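
The selection rule can be checked against the values listed in Table 2 with a small sketch. The list below copies the "Total" column of that table; the function name `components_needed` is illustrative.

```python
from itertools import accumulate

# "Total" column of Table 2 (variance of each principal component).
component_variances = [7.1840, 3.5585, 1.7081, 0.7500, 0.6447, 0.4364,
                       0.3638, 0.1495, 0.0817, 0.0593, 0.0426, 0.0183,
                       0.0029, 0.0000, 0.0000]


def components_needed(variances, threshold=0.80):
    """Smallest number of leading components whose cumulative share exceeds threshold."""
    total = sum(variances)
    for count, running in enumerate(accumulate(variances), start=1):
        if running / total > threshold:
            return count
    return len(variances)


n_components = components_needed(component_variances)
cumulative_share = sum(component_variances[:n_components]) / sum(component_variances)
```

The variances sum to approximately 15, the number of standardized input variables, so each contribution rate is simply the component variance divided by that sum.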

Table 3 reveals that the first principal component was strongly correlated with the acceleration and deceleration time ratios and with the average and maximum values of speed, acceleration, and deceleration. The second principal component was mainly affected by the acceleration time, deceleration time, cruise time, and total time. The third principal component mainly reflected the idle time, idle time ratio, and cruise time ratio (Table 4).

4. Hierarchical Cluster Algorithm

The hierarchical cluster algorithm combines qualitative and quantitative methods to decompose complex systems and facilitate their implementation. Hierarchical clustering algorithms are of two types: agglomerative and divisive. Agglomerative hierarchical clustering is widely applied and simple; therefore, it was adopted in this paper. This algorithm is less sensitive to outliers than K-means, and the number of clusters can be selected in accordance with the actual data set, which facilitates the development of driving cycles that are consistent with local driving conditions.

The general steps of the hierarchical cluster algorithm are as follows:

(1): Standardize the original data, eliminate the effects of dimensions, and construct a data matrix.
(2): Treat each object as a category.
(3): Calculate the interclass distance matrix of any two samples and construct a dissimilarity matrix.

The distance between classes was calculated as the class average. The class average is defined as the average distance between each pair of samples in the two classes.

$$D_G(p, q) = \frac{1}{lk} \sum_{i=1}^{l} \sum_{j=1}^{k} d_{ij} \qquad (8)$$

where D_G(p, q) is the distance between classes G_p and G_q; l and k are the number of samples in class G_p and class G_q, respectively; and d_ij is the distance between the ith sample in class G_p and the jth sample in class G_q.

(4): Merge the two closest classes into a new class and simultaneously update the number of classes and the dissimilarity matrix.
(5): Repeat Steps (3) and (4).
(6): If the termination condition is met, clustering ends.
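
Steps (2)–(6) with the class-average distance of Equation (8) can be sketched in plain Python as below. This is a naive illustration rather than the authors' implementation: it omits the standardization of Step (1), assumes Euclidean distance between samples, stops at a fixed number of clusters, and recomputes all distances on every pass, so it is only suitable for small examples. For real data, a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"` would be the more usual choice.

```python
import math


def average_linkage(points, n_clusters):
    """Agglomerative clustering with the class-average distance."""
    # Step (2): every sample starts as its own class (stored as index lists).
    clusters = [[i] for i in range(len(points))]

    def class_distance(a, b):
        # Equation (8): mean pairwise distance between two classes.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    # Steps (3)-(6): merge the two closest classes until n_clusters remain.
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = class_distance(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (10.0, 0.0)]
clusters = average_linkage(points, n_clusters=3)
```

In this toy data set the isolated point at (10, 0) ends up in a class of its own instead of distorting the other two groups.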

Vehicle motion can be categorized as low, medium, or high speed on the basis of the average speed during a period [24]. Accordingly, reaching three clusters was used as the termination condition. The real driving data were divided into the aforementioned three speed categories through hierarchical clustering, and the clustering results were used as a microtrip candidate database for driving cycles. The numbers of samples in the low-, medium-, and high-speed categories were 353, 657, and 80, respectively. Figure 3 shows the clustering result; each color represents a cluster.

The silhouette coefficient is often used to evaluate clustering quality. The calculated values show that hierarchical clustering performed better than K-means clustering (Table 5).
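
For reference, the silhouette coefficient of a sample compares its mean distance a to the other members of its own cluster with the smallest mean distance b to the members of any other cluster, giving s = (b − a) / max(a, b); the overall score is the mean of s over all samples. The sketch below is a minimal illustration assuming Euclidean distance and assigning a score of 0 to samples in singleton clusters; it is not the code used to produce Table 5.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all samples."""
    scores = []
    for i, p in enumerate(points):
        # Distances from sample i to every other sample, grouped by cluster.
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], [])
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster or a single cluster overall
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # compact, well-separated clusters
poor = silhouette_score(points, [0, 1, 0, 1])  # clusters that mix the two groups
```

Values close to 1 indicate compact, well-separated clusters, and negative values indicate samples that sit closer to another cluster than to their own.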

5. Construction and Verification of a Driving Cycle

5.1. Construction of a Driving Cycle

The ratio of the numbers of samples in the three clusters (353:657:80) was approximately 3:6:1. Ten microtrips with a total duration of approximately 1200 s were randomly selected from the three categories in accordance with this ratio [25], and the eigenvalues of these microtrips are presented in Table 6.

These 10 microtrips were connected to construct a driving cycle (Figure 4).
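
The proportional selection can be sketched as follows. The cluster sizes are those reported in Section 4; everything else is an assumption made for illustration: the function names, the rounding-based allocation (which in general need not sum exactly to the requested count), the toy candidate microtrips, and the low-to-high concatenation order. The sketch also does not enforce the target duration of approximately 1200 s.

```python
import random

cluster_sizes = {"low": 353, "medium": 657, "high": 80}  # from Section 4


def allocate(cluster_sizes, n_select=10):
    """Number of microtrips to draw from each cluster, proportional to size."""
    total = sum(cluster_sizes.values())
    return {name: round(n_select * size / total)
            for name, size in cluster_sizes.items()}


def build_cycle(candidates, quota, seed=0):
    """Randomly draw microtrips per cluster and join them into one speed trace."""
    rng = random.Random(seed)
    cycle = []
    for name, count in quota.items():
        for trip in rng.sample(candidates[name], count):
            cycle.extend(trip)
    return cycle


quota = allocate(cluster_sizes)

# Hypothetical candidate database: a few short speed traces per cluster.
candidates = {
    "low": [[0, 5, 0]] * 4,
    "medium": [[0, 20, 30, 0]] * 7,
    "high": [[0, 40, 60, 40, 0]] * 2,
}
cycle = build_cycle(candidates, quota, seed=0)
```

Because each microtrip starts at a stop, the segments can be joined end to end without introducing speed discontinuities.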

5.2. Statistical Verification of the Driving Cycle

The average relative error between the calculated driving cycle and the real road data was calculated using the following equation:

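$$S = \frac{1}{n} \sum_{j=1}^{n} S_j \qquad (9)$$

where $S_j$ is the relative error of the jth eigenvalue and $n$ is the number of eigenvalues compared. The relative error $S_j$ was calculated as follows:

$$S_j = \frac{|a_{avg_c,j} - a_{r,j}|}{|a_{r,j}|} \qquad (10)$$

where $a_{avg_c,j}$ is the average value of the jth eigenvalue for all microtrips in the constructed driving cycle and $a_{r,j}$ is the jth eigenvalue of the real road data. The absolute value in the denominator keeps the error positive for the deceleration eigenvalues, which are negative.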

Because of the different duration of each driving condition, T_a, T_d, T_i, T_u, and T are different. Therefore, these five eigenvalues were not used to calculate the relative error.
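
As a worked check of Equations (9) and (10), the sketch below recomputes the average relative error of the hierarchical driving cycle from the values listed in Table 7. The dictionary keys are shorthand for the table symbols, and the function name is illustrative.

```python
# Real road data and hierarchical driving cycle columns of Table 7.
real = {"V_avg": 16.4763, "V_max": 35.5884, "a_max_a": 0.6753,
        "a_avg_a": 0.3431, "a_max_d": -0.7098, "a_avg_d": -0.3562,
        "T_r_a": 20.6916, "T_r_d": 19.8506, "T_r_i": 22.3366, "T_r_u": 38.1250}
hierarchical = {"V_avg": 17.522, "V_max": 40.842, "a_max_a": 0.702,
                "a_avg_a": 0.358, "a_max_d": -0.716, "a_avg_d": -0.365,
                "T_r_a": 21.324, "T_r_d": 20.106, "T_r_i": 20.061, "T_r_u": 38.51}


def average_relative_error(cycle, reference):
    """Mean of |cycle - reference| / |reference| over all eigenvalues."""
    errors = [abs(cycle[key] - reference[key]) / abs(reference[key])
              for key in reference]
    return sum(errors) / len(errors)


s_hierarchical = average_relative_error(hierarchical, real)
```

The result is approximately 0.0483, which agrees with the 4.83% reported for the hierarchical driving cycle.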

PCA and K-means clustering were also used to construct a driving cycle for comparison with that constructed using the hierarchical clustering method (Figure 5). Table 7 presents a comparison of the eigenvalues of the driving cycles obtained from the real road data, K-means clustering, and hierarchical clustering.

The average relative errors of hierarchical and K-means clustering were 4.83% and 7.25%, respectively; both errors were <10%, which indicated that these clustering methods have acceptable accuracy [26]. However, the error of the hierarchical driving cycle is lower than that of the K-means driving cycle, which indicated that hierarchical clustering is superior to K-means clustering for constructing driving cycles.

In addition, the agreement between the speed–acceleration probability distributions of the constructed and real driving cycles was analyzed (Figure 6, Figure 7 and Figure 8). Figure 6 and Figure 7 reveal that in the real and hierarchical distributions, the acceleration and speed were concentrated around −0.6 to 0.6 m/s² and 20 km/h, respectively; the probability peak occurred at an acceleration of −0.2 m/s² to 0.2 m/s² and a speed of approximately 20 km/h. Thus, the speed–acceleration probability distributions were consistent. Compared with the aforementioned distributions, the K-means speed–acceleration probability distribution (Figure 8) had a greater spread. Moreover, the K-means speed–acceleration probability distribution exhibited a peak at approximately 35 km/h, which did not exist in the real road data (Figure 6). Therefore, the hierarchical speed–acceleration probability distribution was more consistent with the real road data than was the K-means distribution.

The accuracy of the constructed driving cycles was further evaluated by calculating the difference between their speed–acceleration probability distributions and the real data (Figure 9 and Figure 10). The hierarchical and K-means driving cycles had maximum speed acceleration probability distribution (SADP) difference rates of 1.06% and 1.63%, respectively, and average error rates of 0.13% and 0.17%, respectively. Thus, the driving cycle produced through hierarchical clustering was more accurate than that produced through K-means clustering.
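
The paper does not spell out how the SADP difference rate is computed, so the following is only one plausible reading, stated as an assumption: each trace is binned into a joint speed–acceleration histogram normalized to probabilities, and the difference rate of a bin is the absolute difference between the two probabilities. The bin widths, the use of occupied bins only for the average, and the function names are all hypothetical.

```python
from collections import Counter


def sadp(speeds_kmh, speed_bin=10.0, accel_bin=0.2):
    """Joint speed-acceleration probability per (speed bin, acceleration bin)."""
    counts = Counter()
    for prev, cur in zip(speeds_kmh, speeds_kmh[1:]):
        accel = (cur - prev) / 3.6  # km/h per second -> m/s^2
        key = (int(cur // speed_bin), int(accel // accel_bin))
        counts[key] += 1
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def sadp_difference(p, q):
    """Maximum and mean absolute probability difference over occupied bins."""
    bins = set(p) | set(q)
    diffs = [abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in bins]
    return max(diffs), sum(diffs) / len(diffs)


real_trace = [0, 10, 20, 20, 10, 0]
cycle_trace = [0, 10, 20, 30, 20, 10, 0]
p_real, p_cycle = sadp(real_trace), sadp(cycle_trace)
max_diff, mean_diff = sadp_difference(p_real, p_cycle)
same_max, same_mean = sadp_difference(p_real, p_real)
```

Identical traces give a difference of zero in every bin, and the maximum grows as probability mass moves to bins that the reference data never visits.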

5.3. Verification of the Driving Cycle on the Basis of the Economical Detection of Pure Electric Vehicles

Energy economy is a key indicator for all-electric vehicles and is typically measured as electricity consumption per 100 km. An energy economy simulation was performed in which the hierarchical driving cycle was used as the test driving cycle; the results were compared with those of simulations of real road data, the NEDC, and the K-means driving cycle.

Table 8 reveals that the electricity consumption per 100 km for the real road data, the hierarchical driving cycle, the K-means driving cycle, and the NEDC was 17.4, 17.88, 18.52, and 16.19 kW·h, respectively. The NEDC and K-means results had large errors of −1.21 and 1.12 kW·h (6.95% and 6.44%), respectively. The hierarchical driving cycle had a substantially smaller error of only 0.48 kW·h (2.76%). Therefore, the simulation results confirmed the effectiveness of the proposed driving cycle development method.
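
The error figures can be reproduced from the Table 8 values with a few lines of arithmetic. The sketch below uses the consumption values exactly as listed in Table 8 (kW·h per 100 km); the function name is illustrative.

```python
# Electricity consumption per 100 km from Table 8 (kW·h).
consumption = {"real": 17.4, "hierarchical": 17.88, "NEDC": 16.19, "K-means": 18.52}


def errors_vs_real(consumption, reference="real"):
    """Absolute error (kW·h) and relative error (%) of each cycle against the real data."""
    ref = consumption[reference]
    return {name: (value - ref, 100.0 * abs(value - ref) / ref)
            for name, value in consumption.items() if name != reference}


errors = errors_vs_real(consumption)
relative_percent = {name: round(rel, 2) for name, (_, rel) in errors.items()}
```

The relative errors come out at about 2.76% for the hierarchical cycle, 6.95% for the NEDC, and 6.44% for the K-means cycle, matching the percentages quoted in the text.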

In summary, the feasibility and effectiveness of the proposed method of applying PCA and hierarchical clustering to develop a driving cycle were confirmed by the results of statistical analysis and an economic simulation of pure electric vehicles.

6. Summary

In this paper, a method based on PCA and hierarchical clustering is proposed for constructing driving cycles. CAN bus driving data were collected for Zhenjiang, China and preprocessed using a moving-average filter. The preprocessed data were then divided into microtrips, the eigenvalues were extracted, and PCA was used to reduce the dimensionality of the data. Hierarchical clustering was then used to classify the microtrips, and the classified microtrips were used to construct driving cycles. The constructed driving cycles were verified statistically and through simulations. The average relative eigenvalue error, maximum SADP difference rate, average cycle error, and simulated relative power consumption error per 100 km between the hierarchical driving cycle and the real road data were 4.83%, 1.06%, 0.13%, and 2.76%, respectively. These results were superior to those obtained with the K-means driving cycle (7.25%, 1.63%, 0.17%, and 6.44%, respectively). Thus, the results of statistical analysis and simulated economic verifications indicated that the hierarchical clustering algorithm produces driving cycles that more closely match real road data than does the K-means clustering algorithm. In this study, the hierarchical clustering algorithm was applied to the construction of driving cycles for the first time, and the resulting relative errors were lower than those of the K-means approach. In summary, the results of this study confirm the feasibility and effectiveness of the proposed driving cycle development method. This method can be used to develop accurate localized driving cycles. However, this research still has limitations. Further study should consider the effects of factors such as season, daytime versus nighttime driving, and region. In addition, the algorithm will be optimized to improve accuracy and reduce errors.

Author Contributions

Conceptualization, Z.J., S.Z. and C.Q.; Methodology, T.W.; Software, T.W.; Validation, T.W.; Formal analysis, C.Q.; Investigation, S.Z. and C.Q.; Data curation, S.Z.; Writing—original draft, T.W. and Z.J.; Writing—review & editing, Z.J.; Visualization, Z.J. and C.Q.; Supervision, S.Z.; Project administration, Z.J. and S.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National High Technology Research and Development Program of China, grant number 2011AA11A286; the Natural Science Foundation of Jiangsu Province, grant number BK20211364; and College Students’ innovation and entrepreneurship training program, grant number 202210299991X.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Tong, H.Y.; Hung, W.T. A framework for developing driving cycles with on-road driving data. Transp. Rev. 2010, 30, 589–615.
2. Kim, W.G.; Kim, C.K.; Lee, J.T.; Yun, C.W.; Yook, S.J. Characteristics of nanoparticle emission from a light-duty diesel vehicle during test cycles simulating urban rush-hour driving patterns. J. Nanopart. Res. 2018, 20, 94.
3. Gebisa, A.; Gebresenbet, G.; Gopal, R.; Nallamothu, R.B. Driving Cycles for Estimating Vehicle Emission Levels and Energy Consumption. Future Transp. 2021, 1, 615–638.
4. Peng, Y.H.; Yang, H.B.; Li, M.L.; Qiao, X. Research on the Construction Method of Driving Cycle for the City Car Based on K-Means Cluster Analysis. Automob. Technol. 2017, 11, 13–18.
5. Zhang, Y.; Su, X.; Gao, G. Driving Conditions of a Car Based on Improved Principal Component and K-means Clustering Algorithm. Sci. Technol. Eng. 2021, 21, 3199–3250.
6. Shen, P.; Zhao, Z.; Li, J.; Zhan, X. Development of a typical driving cycle for an intra-city hybrid electric bus with a fixed route. Transp. Res. Part D Transp. Environ. 2018, 59, 346–360.
7. Yuan, Y.M.; Liu, H.Z.; Li, H.S. An Improved K-Means Text Clustering Algorithm Based on Density Peaks and Its Parallelization. J. Wuhan Univ. (Nat. Sci. Ed.) 2019, 5, 457–464.
8. Su, X.H.; Zhang, Y.X.; Xu, S.P.; Shang, Y. Driving conditions and fuel consumption of an improved K-means clustering algorithm. Comput. Eng. Sci. 2021, 43, 2020–2026.
9. Yu, M.; Zhao, W.; Wu, L.; Li, Y. Working Condition of Electric Vehicle Based on K-Mean Clustering and Support Vector Machine. J. Chongqing Jiaotong Univ. (Nat. Sci.) 2021, 40, 129–139.
10. Anida, I.N.; Norbakyah, J.S.; Zulfadli, M.; Norainiza, M.H.; Salisa, A.R. Consumption Evaluation of Energy Consumption and Emissions of BAS KITe in Kuala Terengganu from the Development of Its Driving Cycle. J. Phys. Conf. Ser. 2020, 1532, 012018.
11. Korta, P.; Iyer, L.V.; Lai, C.; Mukherjee, K.; Tjong, J.; Kar, N.C. A novel hybrid approach towards drive-cycle based design and optimization of a fractional slot concentrated winding SPMSM for BEVs. In Proceedings of the 2017 IEEE Energy Conversion Congress and Exposition (ECCE), Cincinnati, OH, USA, 1–5 October 2017; pp. 2086–2092.
12. Berzi, L.; Delogu, M.; Pierini, M. Development of driving cycles for electric vehicles in the context of the city of Florence. Transp. Res. Part D Transp. Environ. 2016, 47, 299–322.
13. Brady, J.; O’Mahony, M. Development of a driving cycle to evaluate the energy economy of electric vehicles in urban areas. Appl. Energy 2016, 177, 165–178.
14. Nouri, P.; Morency, C. Evaluating Microtrip Definitions for Developing Driving Cycles. Transp. Res. Rec. J. Transp. Board 2017, 2627, 86–92.
15. Chen, Z.; Yang, C.; Fang, S. A Convolutional Neural Network-Based Driving Cycle Prediction Method for Plug-in Hybrid Electric Vehicles With Bus Route. IEEE Access 2020, 8, 3255–3264.
16. Kumar, U.; Legendre, C.P.; Lee, J.C.; Zhao, L.; Chao, B.F. On analyzing GNSS displacement field variability of Taiwan: Hierarchical Agglomerative Clustering based on Dynamic Time Warping technique. Comput. Geosci. 2022, 12, 105243.
17. Lerato, L.; Niesler, T. Clustering Acoustic Segments Using Multi-Stage Agglomerative Hierarchical Clustering. PLoS ONE 2015, 10, e0141756.
18. Dinh, D.T.; Fujinami, T.; Huynh, V.N. Estimating the Optimal Number of Clusters in Categorical Data Clustering by Silhouette Coefficient. In Proceedings of the Knowledge and Systems Sciences: 20th International Symposium, KSS 2019, Da Nang, Vietnam, 29 November–1 December 2019.
19. Shi, M. Research of Construction of Light-Duty Vehicles Driving Cycle. Master’s Thesis, Tianjin University of Technology, Tianjin, China, 2013.
20. Ciuffo, B.; Marotta, A.; Tutuianu, M.; Anagnostopoulos, K.; Fontaras, G.; Pavlovic, J.; Serra, S.; Tsiakmakis, S.; Zacharof, N. Development of the world-wide harmonized light duty test cycle (WLTC) and a possible pathway for its introduction in the European legislation. Transp. Res. Part D Transp. Environ. 2015, 40, 61–75.
21. Shi, S.; Lin, N.; Zhang, Y.; Cheng, J.; Huang, C.; Liu, L.; Lu, B. Research on Markov property analysis of driving cycles and its application. Transp. Res. Part D Transp. Environ. 2016, 47, 171–181.
22. Arun, N.; Mahesh, S.; Ramadurai, G.; Nagendra, S. Development of driving cycles for passenger cars and motorcycles in Chennai, India. Sustain. Cities Soc. 2017, 32, 508–512.
23. Fan, J.; Mei, C. Data Analysis; Science Press: Beijing, China, 2010.
24. Hu, Z.; Qin, Y.; Tan, P.; Lou, D. Large-sample-based Car-driving Cycle in Shanghai City. J. Tongji Univ. (Nat. Sci.) 2015, 43, 1523–1527.
25. Galgamuwa, U.; Perera, L.; Bandara, S. A representative driving cycle for the southern expressway compared to existing driving cycles. Transp. Dev. Econ. 2016, 2, 1–8.
26. Du, A.M.; Bu, X.; Chen, L.F. Investigation on Bus Driving Cycles in Shanghai. J. Tongji Univ. (Nat. Sci.) 2006, 34, 943–946.

Figure 1. Flowchart of the study.

Figure 2. Original and denoised data.

Figure 3. Cluster scatter plot.

Figure 4. Driving cycle based on the hierarchical cluster algorithm.

Figure 5. Driving cycle obtained through K-means clustering.

Figure 6. Speed–acceleration probability distribution for real road data.

Figure 7. Speed–acceleration probability distribution of driving cycles constructed through hierarchical clustering.

Figure 8. Speed–acceleration probability distribution of driving cycles constructed through K-means clustering.

Figure 9. SADP difference rate between the hierarchical driving cycle and the real road data.

Figure 10. SADP difference rate between the K-means driving cycle and the real road data.

Table 1. Eigenvalues of microtrips.

Serial Number	Eigenvalue	Symbol
1	Average speed (km/h)	$V_{avg}$
2	Maximum speed (km/h)	$V_{max}$
3	Maximum acceleration (m/s²)	$a_{max_a}$
4	Average acceleration (m/s²)	$a_{avg_a}$
5	Maximum deceleration (m/s²)	$a_{max_d}$
6	Average deceleration (m/s²)	$a_{avg_d}$
7	Acceleration time (s)	$T_{a}$
8	Deceleration time (s)	$T_{d}$
9	Idle time (s)	$T_{i}$
10	Cruise time (s)	$T_{u}$
11	Total time (s)	$T$
12	Acceleration time ratio (%)	$T_{r_a}$
13	Deceleration time ratio (%)	$T_{r_d}$
14	Idle time ratio (%)	$T_{r_i}$
15	Cruise time ratio (%)	$T_{r_u}$

Table 2. Principal components.

Component	Total (Component Variance)	Proportion of Variance	Cumulative Contribution Rate
1	7.1840	0.4789	0.4789
2	3.5585	0.2372	0.7162
3	1.7081	0.1139	0.8300
4	0.7500	0.0500	0.8800
5	0.6447	0.0430	0.9230
6	0.4364	0.0291	0.9521
7	0.3638	0.0243	0.9764
8	0.1495	0.0100	0.9863
9	0.0817	0.0054	0.9918
10	0.0593	0.0040	0.9957
11	0.0426	0.0028	0.9986
12	0.0183	0.0012	0.9998
13	0.0029	0.0002	1.0000
14	0.0000	0.0000	1.0000
15	0.0000	0.0000	1.0000

Table 3. Correlation coefficient between the selected principal components and the 15 eigenvalues.

Eigenvalue	Principal Component 1	Principal Component 2	Principal Component 3
Average speed	−0.32	0.09	0.10
Maximum speed	−0.35	−0.01	0.02
Maximum acceleration	−0.33	−0.09	−0.05
Average acceleration	−0.28	−0.23	−0.10
Maximum deceleration	0.33	0.12	0.05
Average deceleration	0.29	0.23	0.09
Acceleration time	−0.24	0.38	−0.04
Deceleration time	−0.24	0.39	−0.04
Idle time	0.01	0.03	−0.56
Cruise time	−0.16	0.45	−0.07
Total time	−0.18	0.44	−0.16
Acceleration time ratio	−0.29	−0.22	0.12
Deceleration time ratio	−0.29	−0.21	0.12
Idle time ratio	0.17	−0.03	−0.62
Cruise time ratio	0.16	0.27	0.46

Table 4. Eigenvalues represented by the first three principal components.

Principal Component	Eigenvalues
1	Average speed, maximum speed, maximum acceleration, average acceleration, maximum deceleration, average deceleration, acceleration time ratio, and deceleration time ratio
2	Acceleration time, deceleration time, cruise time, and total time
3	Idle time, idle time ratio, and cruise time ratio

Table 5. Comparison of the silhouette coefficient.

	Hierarchical Cluster	K-Means
Silhouette Coefficient	0.5449	0.4009

Table 6. Eigenvalues of the 10 selected microtrips.

No.	$V_{avg}$	$V_{max}$	$a_{max_a}$	$a_{avg_a}$	$a_{max_d}$	$a_{avg_d}$	$T_{a}$	$T_{d}$	$T_{i}$	$T_{u}$	$T$	$T_{r_a}$	$T_{r_d}$	$T_{r_i}$	$T_{r_u}$
1	18.43	37.68	0.69	0.32	−0.71	−0.36	43	37	6	74	160	26.88	23.13	3.75	46.25
2	21.86	63.06	0.90	0.51	−0.91	−0.45	34	37	61	10	142	23.94	26.06	42.96	7.04
3	0.10	0.18	0.00	0.00	0.00	0.00	0	0	6	10	16	0.00	0.00	37.50	62.50
4	19.17	35.74	1	0	−1	0	36	32	8	81	157	23	20	5	52
5	0.47	0.73	0.00	0.00	0.00	0.00	0	0	0	14	14	0.00	0.00	0.00	100
6	5.12	28.59	0.88	0.45	−0.74	−0.38	22	25	83	37	167	13.17	14.97	49.70	22.16
7	26.37	40.29	0.88	0.37	−0.60	−0.36	43	49	4	61	157	27.39	31.21	2.55	38.85
8	24.63	66.99	0.82	0.51	−1.06	−0.51	36	35	45	16	132	27.27	26.52	34.09	12.12
9	25.81	68.82	1	1	−1	−1	41	37	14	36	128	32	29	10.94	28.13
10	33.26	66.34	0.90	0.52	−1.31	−0.69	65	49	23	27	164	39.63	29.88	14.02	16.46

Table 7. Comparison of the eigenvalues for real vehicle road data and the constructed driving cycle.

Eigenvalue	Real Road Data	Hierarchical Cluster: Constructed Driving Cycle	Hierarchical Cluster: Relative Error	K-Means: Constructed Driving Cycle	K-Means: Relative Error
$V_{avg}$	16.4763	17.522	6.35%	15.495	14.27%
$V_{max}$	35.5884	40.842	14.76%	36.70556	2.01%
$a_{max_a}$	0.6753	0.702	3.95%	0.846	10.86%
$a_{avg_a}$	0.3431	0.358	4.36%	0.399	9.05%
$a_{max_d}$	−0.7098	−0.716	0.87%	−0.848	2.56%
$a_{avg_d}$	−0.3562	−0.365	2.46%	−0.405	5.83%
$T_{r_a}$	20.6916	21.324	3.06%	21.242	2.91%
$T_{r_d}$	19.8506	20.106	1.29%	21.696	14.78%
$T_{r_i}$	22.3366	20.061	10.19%	23.294	8.58%
$T_{r_u}$	38.1250	38.51	1.01%	33.8	1.62%

Table 8. Comparison of power consumption per 100 km.

Real Road Data	Hierarchical Driving Cycle	NEDC	K-Means Driving Cycle
17.4 kW·h	17.88 kW·h	16.19 kW·h	18.52 kW·h


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
