1. Introduction
With the rapid evolution of the vehicle industry, vehicles have become an indispensable and fundamental aspect of modern life. Consequently, the vehicle maintenance industry has emerged as a crucial service sector, directly impacting individuals’ well-being and overall quality of life. To ensure a secure and reliable driving experience while minimizing maintenance expenses, accurate vehicle maintenance prediction has garnered significant attention from both researchers and industry professionals in this domain. Unfortunately, current maintenance practices primarily rely on fixed time intervals or mileage, which represent a conventional yet restricted approach to planned maintenance. This approach overlooks the varying usage patterns and driving environments of different vehicles, consequently hampering maintenance flexibility and resulting in excessive costs and unnecessary maintenance procedures.
Throughout the vehicle maintenance procedure, multiple maintenance tasks are interconnected and strongly correlated. For instance, to maintain the engine’s lubrication and cooling functions, the oil and the oil filter are replaced together. To sustain combustion efficiency and emissions performance, the air filter and the fuel filter are replaced together to ensure a continuous supply of clean air and fuel to the engine. To keep the ignition system functioning, the spark plugs and ignition coils are replaced after reaching a predetermined mileage.
Mileage is a pivotal factor in determining the need for vehicle maintenance. As depicted in Figure 1, the number of common maintenance projects (here, those that need to be performed more than five times a year) varies with annual mileage in our experimental data. Higher mileage accelerates the wear and aging of vehicle components. Frequently used parts such as the engine, suspension, and braking systems endure greater strain, fatigue, or even damage as mileage increases. Moreover, long journeys result in increased fuel consumption, degraded lubricant quality, and associated complications, raising the maintenance risk. Therefore, vehicles require more frequent and more varied maintenance as mileage accumulates.
The vehicle base information, including location, manufacturer, type, engine type, engine displacement, model year, production year, and assembly plant, is closely related to the development of maintenance projects. Different regions have different climates, road conditions, and environments, which in turn affect the demand for specific maintenance projects. Manufacturers vary in terms of vehicle design, performance benchmarks, and the development methods used for their respective maintenance products. Additionally, different vehicle types involve different high-stress components during operation, requiring specialized maintenance approaches. Different engine types and displacements require distinct maintenance procedures due to their inherent design and functional differences. Notably, advancing vehicle technology leads to engineering upgrades and modifications across models and production years, which influence the corresponding maintenance protocols. Finally, different assembly processes and component selections at different assembly plants result in discrepancies in maintenance requirements.
Drawing upon the aforementioned context, to address the prevalent challenges encountered in predicting vehicle maintenance projects, we propose a methodology founded upon vehicle maintenance record data and vehicle base information data. The former encapsulates maintenance projects and maintenance mileage, while the latter encompasses vehicle’s location, manufacturer, type, engine type, engine displacement, model year, production year, and assembly plant.
To the best of our knowledge, this paper represents the inaugural endeavor to predict vehicle maintenance projects and makes the following contributions:
To fully exploit the interdependencies among various maintenance projects, we introduce a novel correlation representation scheme for maintenance projects based on the co-occurrence matrix.
We develop a deep fusion network, endowed with an attention mechanism, to seamlessly integrate vehicle mileage and foundational vehicle information into the vehicle maintenance project prediction framework.
Extensive experiments conducted with real-world data demonstrate the superior performance of our model relative to contemporary baselines.
2. Related Work
Predicting vehicle maintenance projects is a complex endeavor, involving the forecasting of future maintenance needs using historical maintenance records and fundamental vehicle information, which constitutes a quintessential time series prediction task. Accordingly, we will offer a comprehensive description of the principal time series prediction techniques. Time series prediction techniques can be classified into traditional approaches, machine learning-based methods, and deep learning-based techniques.
2.1. Traditional Time Series Prediction Methods
Traditional time series analysis centers on establishing parametric models, estimating their parameters, and using the fitted models for future predictions. Exponential smoothing, introduced by Robert G. Brown [1], forecasts future values by assigning weights to historical values. The moving average method, devised by George W. Brown [2], computes the mean within a specific time window to forecast future values. Building on these methods, the extensively used ARIMA model [3] combines autoregressive and moving average models to capture data trends, seasonal variations, and noise characteristics. Prior to prediction, the observed series is tested for stationarity and white noise, and its autocorrelation and partial autocorrelation coefficients are examined. Although traditional methods handle straightforward time series prediction problems well, they struggle with high-dimensional time series data that exhibit intricate dependencies and nonlinearities.
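As an illustration of the two simplest methods mentioned above, the following is a minimal sketch of one-step-ahead forecasting with simple exponential smoothing and with a moving average. The function names, the smoothing factor `alpha`, and the toy series are illustrative assumptions and are not taken from the cited works.

```python
def exponential_smoothing_forecast(series, alpha):
    """One-step-ahead forecast with simple exponential smoothing.

    Recent observations receive larger weights; `alpha` (assumed here,
    in (0, 1]) controls how quickly older values are discounted.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1.0 - alpha) * level
    return level


def moving_average_forecast(series, window):
    """One-step-ahead forecast as the mean of the last `window` values."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    recent = series[-window:]
    return sum(recent) / window


# Toy series (hypothetical values, for illustration only).
history = [10.0, 12.0, 11.0, 13.0]
smoothed = exponential_smoothing_forecast(history, alpha=0.5)
averaged = moving_average_forecast(history, window=2)
```

Both functions use only the observed series, which is why such methods struggle once additional inputs such as mileage or vehicle attributes must be taken into account.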
2.2. Machine Learning-Based Time Series Prediction Methods
Machine learning methods, particularly linear regression techniques [4], are closely related to time series prediction tasks. Linear regression employs a linear model to predict the future trajectory of a series, presuming a linear relationship between the input and output variables. SVM [5], grounded in statistical learning theory, employs kernel functions to map data to high-dimensional spaces, thus overcoming issues related to dimensionality. SVR [6], a variant of SVM for regression analysis, transforms data nonlinearly within this high-dimensional feature space. SVR determines a function that approximates the relationship between input and output data, thus facilitating time series fitting and prediction. MLP [7], characterized by its hidden layers and output layer, generates predictions through the iterative adjustment of weights and biases via backpropagation. The use of nonlinear activation functions and multiple layers enables MLP to learn and represent complex nonlinear relationships. HMMs [8] provide a probabilistic framework for modeling multivariate time series. HMMs rely on hidden Markov chains, which represent underlying stochastic processes and can be estimated from a sequence of observations.
2.3. Deep Learning-Based Time Series Prediction Methods
Deep learning has developed rapidly and achieved significant progress in predicting time series data. CNN [9], RNN [10], LSTM [11], Transformer [12], and GNN [13] are extensively employed across a range of time series prediction tasks. Numerous enhanced methods have been proposed based on these models. TCN [14] treats the time series as a one-dimensional object, capturing long-term relationships through iterative multilayer convolution. SCINet [15] employs a hierarchical convolutional network structure to extract and aggregate features at different temporal resolutions. To predict future diseases, RETAIN [16], Dipole [17], and Timeline [18] integrate RNN with attention mechanisms [12] to model and analyze patients’ historical disease diagnostic data. Deep State Space [19] models the relationship between consecutive hidden states via an RNN, enabling predictions from the current hidden state to the desired outcome. Log Sparse Transformer [20] employs causal convolution to generate Queries and Keys in the self-attention layer, thus introducing log sparsity into the model. Chet [21] integrates GNN with the attention mechanism to predict diseases for patients. To capture latent spatial dependencies in the data, Graph WaveNet [22] introduces an adaptive graph modeling technique. CoDMO [
23] utilizes correlation-enhanced hierarchical propagation models and prior interactions in historical records to learn dual medical ontology representations for predicting a patient’s future conditions and procedures during the next admission. HGV4Risk [
24] proposes the Global Graph Embedding module and
-attention mechanism, thereby enabling risk prediction based on temporal sequential data.
Although the aforementioned methods exhibit robust performance in time series prediction, they cannot be applied directly to predicting vehicle maintenance projects, because they do not accommodate the unique characteristics of this task. Unlike these methods, our study integrates the correlations between maintenance projects with the effects of vehicle mileage and base information to predict vehicle maintenance projects accurately. With the continuous development of the vehicle industry, methods for forecasting vehicle maintenance demand are evolving and can be categorized based on the data source into single data-based and combined data-based vehicle maintenance demand predictions.
3. Method
3.1. Notations
To describe the maintenance project prediction task, we provide a set of notations in Table 1. The base information of a vehicle is represented by $B = \{b_1, b_2, \ldots, b_n\}$, where $n$ is the number of types of base information. During the vehicle maintenance process, numerous maintenance projects are necessary. All maintenance projects are systematically encoded, forming the set of project codes denoted as $P = \{p_1, p_2, \ldots, p_{|P|}\}$, comprising $|P|$ different project types. Each vehicle possesses multiple maintenance records, which encapsulate two essential pieces of information: the maintenance projects and the corresponding mileage. The maintenance projects in the $t$-th maintenance are defined as a multi-hot column vector $\mathbf{v}_t \in \{0, 1\}^{|P|}$; $\mathbf{v}_t^i = 1$ means that the maintenance project $p_i$ was carried out at the $t$-th maintenance; $i \in \{1, \ldots, |P|\}$; and $t \in \{1, \ldots, T\}$. A multi-hot column vector is a binary vector in which each element corresponds to a specific maintenance project. If a certain maintenance project is performed during a maintenance event, the corresponding element in the vector is set to 1; otherwise, it is set to 0. For example, if a maintenance record includes projects $p_1$ and $p_3$, the multi-hot vector could be [1, 0, 1, …, 0]. The mileage is denoted by $M = \{m_1, m_2, \ldots, m_T\}$, with $m_t$ representing the mileage reached at the $t$-th maintenance, where $T$ is the number of maintenance events. In this paper, maintenance project prediction is based on the historical maintenance projects $V = \{\mathbf{v}_1, \ldots, \mathbf{v}_T\}$; mileage data $M$; and base information $B$. The objective is to predict the maintenance projects $\mathbf{v}_{T+1}$ for the $(T+1)$-th maintenance.
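The multi-hot encoding described above can be sketched as follows. This is a minimal illustration; the function name `multi_hot` and the project codes `"p1"` to `"p4"` are hypothetical placeholders, not identifiers from the dataset.

```python
def multi_hot(performed_codes, all_codes):
    """Encode the projects of one maintenance event as a binary vector.

    Element i is 1 if the i-th project code in `all_codes` was performed
    during the event, and 0 otherwise.
    """
    index = {code: i for i, code in enumerate(all_codes)}
    vector = [0] * len(all_codes)
    for code in performed_codes:
        vector[index[code]] = 1
    return vector


# Hypothetical project code set and one maintenance record.
project_codes = ["p1", "p2", "p3", "p4"]
record = ["p1", "p3"]
encoded = multi_hot(record, project_codes)
```

A vehicle's history is then simply the list of such vectors, one per maintenance event, paired with the mileage recorded at each event.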
3.2. Framework
The framework of the proposed Multi-source Data Deep Fusion Network (MsDFN) for vehicle maintenance project prediction is presented in Figure 2. The framework comprises two key modules:
(1) Maintenance project correlation representation: Recognizing the existence of correlations among different maintenance projects, we propose a correlation representation function to derive correlation representation results for each project based on the co-occurrence matrix.
(2) Multi-source data fusion network: Acknowledging the significant impact of vehicle mileage and vehicle base information on the development of maintenance projects, a multi-source data fusion network based on attention mechanism is proposed to effectively incorporate the project correlation representation results with the vehicle mileage and vehicle base information.
3.3. Maintenance Project Correlation Representation
During vehicle maintenance, each maintenance session frequently entails several distinct maintenance projects at once. For example, within a vehicle’s maintenance record, an oil change and an air filter replacement occur together, indicating an interdependence among maintenance projects. To fully utilize these correlations, we propose a correlation representation module based on the co-occurrence matrix.
We create a global co-occurrence graph $G$ with weighted edges for all maintenance projects, where each node represents a maintenance project $p_i$ from the set $P$. If a code pair $(p_i, p_j)$ co-occurs in a vehicle’s maintenance record, two equal weights $w_{ij}$ and $w_{ji}$ are integrated into $G$. Then, we count the total co-occurrence frequency $c_{ij}$ of $(p_i, p_j)$ in all vehicles’ maintenance records for further calculation of edge weights. In addition, we want to detect important and common project pairs. Therefore, we define a threshold $\delta$ to filter out combinations with low frequency and obtain a qualified set $\Delta_i$ for $p_i$. Let $q_i$ be the total frequency of qualified projects co-occurring with $p_i$. We use an adjacency matrix $A \in \mathbb{R}^{|P| \times |P|}$ to represent $G$ with the definition in Equation (1).
Note that A is designed to be symmetric to represent the mutual influence of two maintenance projects. As a static matrix, A quantifies the frequency at which maintenance projects co-occur globally. However, different maintenance projects appear and disappear at different stages. Even if a specific maintenance project is absent from the current maintenance records, a related maintenance project may arise in the future due to the correlations between projects. Consequently, in order to account for the correlations among maintenance projects, the correlation representation results
of each maintenance project are derived for each vehicle’s historical maintenance projects
, utilizing
individually. The definition of
is presented in Equation (
2).
where
and function
f serves to filter out the row vectors of the matrix that are all zeros.
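The counting and filtering steps described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions: the threshold is applied to the relative co-occurrence frequency of each pair (an assumption suggested by the small threshold values used in the experiments), and all function names and project codes are hypothetical. How Equation (1) turns these quantities into edge weights is not assumed here.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(records):
    """Count how often each ordered project pair appears in the same record."""
    counts = defaultdict(int)
    for record in records:
        for a, b in combinations(sorted(set(record)), 2):
            counts[(a, b)] += 1  # both directions receive the same count
            counts[(b, a)] += 1
    return dict(counts)


def qualified_sets(counts, threshold):
    """Keep, for each project, the partners whose relative frequency
    reaches the threshold (assumed to be a relative threshold)."""
    totals = defaultdict(int)
    for (a, _), c in counts.items():
        totals[a] += c
    qualified = defaultdict(dict)
    for (a, b), c in counts.items():
        if c / totals[a] >= threshold:
            qualified[a][b] = c
    return dict(qualified)


def qualified_totals(qualified):
    """Total co-occurrence frequency of the qualified partners of each project."""
    return {a: sum(partners.values()) for a, partners in qualified.items()}


# Hypothetical maintenance records, each a set of project codes.
records = [
    {"oil", "oil_filter"},
    {"oil", "oil_filter", "air_filter"},
    {"oil", "spark_plug"},
]
counts = cooccurrence_counts(records)
qualified = qualified_sets(counts, threshold=0.3)
totals = qualified_totals(qualified)
```

In this toy example, only the oil filter remains a qualified partner of the oil change, because the other two partners each account for a quarter of its co-occurrences, which is below the threshold.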
3.4. Multi-Source Data Deep Fusion Network
The mileage and base information of a vehicle play a pivotal role in the formulation of maintenance projects. Based on maintenance project correlation representation, in order to thoroughly take advantage of their impact on this endeavor, we propose a multi-source data deep fusion network that combines maintenance project correlation representation result, mileage, and base information.
3.4.1. Mileage Fusion Representation
Mileage is an important facet of vehicle condition. In general, higher mileage corresponds to an increase in the type and frequency of maintenance projects required.
In the evaluation of vehicle maintenance tasks, it is imperative for maintenance personnel to first grasp the present state of the vehicle, encompassing its historical maintenance projects and mileage. Drawing on this information, maintenance staff can make initial inferences about the ongoing maintenance projects necessary for the vehicle. Yet, due to distinct driving patterns exhibited by different vehicles, the impact of mileage on maintenance projects varies across vehicles. To tackle this challenge, we propose a mileage-aware key query attention mechanism that discerns pivotal mileage thresholds in the development of vehicle maintenance projects. Herein, the correlation representation result of each project serves as the query vector, while the mileage of each maintenance episode forms the key and value vectors. Notably, the raw mileage
and the correlation representation results of the maintenance projects are not in the same latent space, so the mileage must first be mapped into the latent space of the correlation representation results. This mapping follows [25], as illustrated in Equation (3).
where
,
,
, and
are parameters. Once the mileage has been mapped into the latent space of the correlation representation results, we input the correlation representation result of each project as the query vector and the mapped mileage as the key and value vectors into the attention mechanism one by one. The specific implementation is illustrated in Equation (4).
The attention mechanism [12] in the above equation is defined as shown in Equation (5):
$$\mathrm{Attention}(Q, K, V) = \alpha V \qquad (5)$$
Attention weights are defined as shown in Equation (6):
$$\alpha = \mathrm{softmax}\left(\frac{Q K^{\top}}{\sqrt{d}}\right) \qquad (6)$$
where $d$ is the dimension of attention and $\alpha$ are the attention weights. For each maintenance project and each mileage value of the vehicle, the attention outputs are computed one by one and merged to obtain the historical mileage-fused attention representation E of the vehicle.
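The attention step can be sketched as scaled dot-product attention in the sense of [12], with a project representation as the query and mapped mileage vectors as keys and values. This is a minimal pure-Python sketch; the vectors below are hypothetical toy values, and learned projection matrices are omitted for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the weighted sum of `values` and the attention weights.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    output = [
        sum(w * value[j] for w, value in zip(weights, values))
        for j in range(dim)
    ]
    return output, weights


# Hypothetical query (one project representation) and two mapped mileage vectors.
project_query = [1.0, 0.0]
mileage_keys = [[1.0, 0.0], [0.0, 1.0]]
fused, weights = attention(project_query, mileage_keys, mileage_keys)
```

The returned weights indicate which maintenance events, and hence which mileage values, matter most for a given project; these are the quantities inspected in the case study.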
3.4.2. Representation of Data Fusion from Multiple Sources
The base information of the vehicle includes location, manufacturer, type, engine type, engine displacement, model year, production year, and assembly plant. These factors significantly influence the development of the vehicle’s maintenance project. Consequently, it is imperative to incorporate the aforementioned vehicle base information into the vehicle maintenance project prediction model. This integration effectively enhances the accuracy of the prediction process. The specific implementation is outlined as follows.
To begin with, the base information $B$ is transformed using the representation functions $\mathrm{OneHot}(\cdot)$ and $\mathrm{Embedding}(\cdot)$ to acquire the representation $U$. The representation functions and corresponding outputs for each base information are illustrated in Table 2.
Let us consider a variable $X$ comprising $n$ categories, denoted as $x_i$ for the $i$-th category. Through $\mathrm{OneHot}(\cdot)$, each category is converted into a binary vector of length $n$, where only one element is assigned the value of 1 while the remainder are set to 0. Precisely, the representation result for the $i$-th category is outlined in Equation (7):
$$\mathrm{OneHot}(x_i) = [0, \ldots, 0, 1, 0, \ldots, 0]^{\top} \qquad (7)$$
where the length of the vector is $n$, the $i$-th element is 1, and the rest of the elements are 0. $\mathrm{Embedding}(\cdot)$ is defined as in Equation (8):
$$\mathrm{Embedding}(x_i) = W_e \, \mathrm{OneHot}(x_i) \qquad (8)$$
$\mathrm{Embedding}(\cdot)$ maps discrete features onto a more meaningful lower-dimensional space based on $\mathrm{OneHot}(\cdot)$, where $W_e \in \mathbb{R}^{d \times n}$ is a parameter matrix and $d$ denotes the dimension of the low-dimensional space. Using $\mathrm{OneHot}(\cdot)$ and $\mathrm{Embedding}(\cdot)$, the results of each base information representation are spliced in order to obtain $U$.
For the categorical data in the model, we employ one-hot encoding and multi-hot vector encoding. For numerical data, we apply an embedding function to map it to the same vector space as the categorical data. Using different encoding methods enables the model to consider a wide range of factors, thus enhancing its ability to predict upcoming maintenance projects from various vehicle characteristics and history.
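The one-hot and embedding representations can be sketched as follows. This is a minimal illustration that assumes the embedding is a product of a parameter matrix with the one-hot vector, which amounts to selecting one column of the matrix; the matrix values and category count are hypothetical.

```python
def one_hot(index, n):
    """Binary vector of length n with a single 1 at position `index`."""
    if not 0 <= index < n:
        raise ValueError("index out of range")
    return [1 if j == index else 0 for j in range(n)]


def embed(one_hot_vector, weight):
    """Map a one-hot vector to a dense vector via a d x n parameter matrix.

    `weight` is a list of d rows with n entries each; the result equals
    the column of `weight` selected by the one-hot vector.
    """
    return [
        sum(w * x for w, x in zip(row, one_hot_vector))
        for row in weight
    ]


# Hypothetical variable with n = 3 categories embedded into d = 2 dimensions.
embedding_matrix = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
]
category_vector = one_hot(1, 3)
dense_vector = embed(category_vector, embedding_matrix)
```

In practice the matrix entries are learned during training, and the dense vectors of all base information fields are concatenated to form the vehicle's base information representation.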
To incorporate the vehicle base information into the maintenance project prediction task, we set $U$ as the query vector and $E$ as both the key vector and value vector. These vectors are then fed into the attention mechanism, yielding the representation result $L$, which fuses the vehicle’s base information $U$ with the historical mileage-fused attention representation $E$. The specifics of this implementation are outlined in Equation (5). Subsequently, $L$ and $U$ are concatenated to generate the representation result $H$ for the vehicle. The final maintenance project prediction result $\hat{y}$ is obtained through the vehicle representation result $H$. The specific implementation process is shown in Equation (9):
where $W$ and $b$ are parameters, and $h$ is the dimension of $H$. To reduce the risk of overfitting during the prediction process, a dropout operation is applied to $H$ before the prediction is made.
3.5. Model Optimization
We train the MsDFN model to predict the last maintenance projects for each vehicle, using a binary cross-entropy loss as the global objective function, as shown in Equation (10):
where $\hat{y}_i$ represents the predicted result for the maintenance project $p_i$ and $y_i$ represents the true label of the maintenance project $p_i$.
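The prediction and loss computation can be sketched as follows. This is a minimal illustration under two assumptions that are not confirmed by the text: the prediction layer is a linear map followed by a sigmoid, and the binary cross-entropy is averaged over projects. The function names and toy values are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(h, weight, bias):
    """Per-project probabilities from a vehicle representation `h`.

    Assumes a linear layer followed by a sigmoid (an assumption of this
    sketch); `weight` has one row per project.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, h)) + b)
        for row, b in zip(weight, bias)
    ]


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy averaged over projects (averaging is assumed)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


# Hypothetical vehicle representation, parameters, and true labels.
vehicle_repr = [0.0, 0.0]
layer_weight = [[1.0, 1.0], [0.5, -0.5]]
layer_bias = [0.0, 0.0]
probabilities = predict(vehicle_repr, layer_weight, layer_bias)
loss = binary_cross_entropy([1, 0], probabilities)
```

Because each project receives its own independent probability, several projects can be predicted for the same maintenance event, which matches the multi-label nature of the task.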
4. Experiment Result and Analysis
4.1. Dataset Description
To assess the performance of our proposed method, we used real vehicle maintenance and base information data sourced from 73 vehicle maintenance companies. During the preprocessing phase, we standardized the data formats by aligning field names, maintenance project names, and data types across different companies to ensure consistency. We then cleaned the data, including deduplication and outlier detection, to enhance data accuracy. Subsequently, we screened the data to retain records of vehicles that had undergone two or more repairs, ensuring that complete information for each repair and the corresponding base information was available. For the data merging process, we employed a unified vehicle identifier to integrate datasets from different companies, thus maintaining comprehensive records for each vehicle. As a result of these preprocessing and merging steps, the dataset comprises records of 26,831 vehicles that underwent maintenance between April 2011 and April 2023. The detailed dataset statistics are presented in Table 3, and the distributions of the vehicle maintenance records are depicted in Figure 3.
To enhance the experimentation process, we proceed to randomly partition the dataset into training, validation, and testing sets. Specifically, these sets consist of 18,000, 3831, and 5000 vehicles, respectively. In our approach, we designate the last maintenance project as the label, while utilizing the remaining maintenance projects, mileage, and vehicle base information as input features. The global project co-occurrence graph G is constructed based on the maintenance project within the training set.
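The random partition described above can be sketched as follows. This is a minimal illustration; the function name `split_vehicles`, the integer vehicle identifiers, and the seed are hypothetical, and only the split sizes are taken from the text.

```python
import random


def split_vehicles(vehicle_ids, n_train, n_val, seed=0):
    """Randomly split vehicle identifiers into train, validation, and test.

    The test set receives every vehicle not assigned to the other two sets.
    """
    ids = list(vehicle_ids)
    if n_train + n_val > len(ids):
        raise ValueError("train and validation sizes exceed the dataset size")
    rng = random.Random(seed)  # local generator keeps the split reproducible
    rng.shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


# 26,831 vehicles split into 18,000 / 3831 / remaining vehicles.
train_ids, val_ids, test_ids = split_vehicles(range(26831), 18000, 3831)
```

Splitting by vehicle rather than by record keeps all maintenance events of one vehicle in the same set, so the co-occurrence graph built from the training vehicles never sees test records.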
4.2. Baseline Models and Evaluation Metrics
The main task of the experiment is to predict the $(T+1)$-th maintenance projects based on the vehicle’s first $T$ maintenance projects, mileage, and vehicle base information, which is a multi-label classification problem. For this task, the evaluation metrics are the weighted F1 score (w-F1) [21] and R@k [21]: w-F1 calculates the F1 score for each project code and reports their weighted mean; R@k is the average ratio of desired project codes in the top k predictions to the total number of desired project codes in each maintenance, which measures prediction accuracy. To compare our proposed method with state-of-the-art models, we choose the following methods as baselines.
Traditional machine learning method: MLP [7].
Traditional deep learning methods: CNN [9], RNN [10], and LSTM [11].
Models based on RNN and attention: RETAIN [16] and Dipole [17].
Model based on dynamic graph and context-aware: Chet [21].
Typical methods for vehicle maintenance predicting: SLFN [26], DBN [27], and EFMSAE-LSTM [28].
Typical deep learning methods for data fusion: MIFDELN [29], MFDL [30], and IKN-ConvLSTM [31].
In the experiments, MLP indicates that only historical maintenance projects are input into the model for prediction, whereas MLP+ indicates that historical maintenance projects, mileage, and vehicle base information are fused for prediction; the same convention applies to the other baselines.
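The two evaluation metrics used in this subsection can be sketched as follows. This is a minimal illustration; it assumes that w-F1 weights each project's F1 score by its number of true occurrences and that binary predictions are already available (how scores are binarized is not specified here). The function names and toy data are hypothetical.

```python
def recall_at_k(y_true, y_score, k):
    """Average fraction of true projects found among the top-k scores."""
    ratios = []
    for truth, scores in zip(y_true, y_score):
        desired = {i for i, t in enumerate(truth) if t == 1}
        if not desired:
            continue  # skip events without any true project
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        hits = len(desired.intersection(ranked[:k]))
        ratios.append(hits / len(desired))
    return sum(ratios) / len(ratios) if ratios else 0.0


def weighted_f1(y_true, y_pred):
    """Per-project F1 averaged with weights equal to each project's support."""
    n_labels = len(y_true[0])
    weighted_sum = 0.0
    total_support = 0
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        support = tp + fn
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Hypothetical labels, scores, and thresholded predictions for two events.
labels = [[1, 0, 1, 0], [0, 1, 0, 0]]
scores = [[0.9, 0.1, 0.2, 0.8], [0.3, 0.6, 0.2, 0.1]]
predictions = [[1, 0, 0, 1], [0, 1, 0, 0]]
r_at_2 = recall_at_k(labels, scores, k=2)
w_f1 = weighted_f1(labels, predictions)
```

In the toy data, the first event recovers one of its two true projects within the top two scores and the second recovers its only project, so R@2 is the mean of 0.5 and 1.0.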
4.3. Realization Details
To ensure unbiased results, we randomly initialize the model parameters in our experimental setup. The hyperparameters are tuned on the validation set. Specifically, we set the co-occurrence threshold to 0.07, the dropout rate to 0.45, and the batch size to 32 across all experiments. Our model training process consists of 100 epochs, employing the Adam optimizer with an initial learning rate of . We incorporate learning rate decay using a multi-step scheduler; learning rates are adjusted to and at epochs 5 and 15. We implemented the entire experiment using Python 3.7.0 and PyTorch 1.10.0, with CUDA 11.4 utilized on a device equipped with 64 GB of memory and an NVIDIA-SMI 472.39 GPU. To ensure the reliability of our findings, we repeated the experiment five times using distinct random seeds.
4.4. Prediction Performance
Table 4 depicts the experimental results. Since the average number of project codes per maintenance is 5.5, we set k = 3, 5, and 7 for R@k. Our proposed model surpasses all baseline models. Compared with the top-performing baseline RETAIN+, MsDFN improves R@k at k = 3, 5, and 7 by 1.11%, 1.14%, and 1.21%, respectively. This supports the effectiveness of the correlation representation of maintenance projects and multi-source data deep fusion. Despite the interpretability associated with the original traditional machine learning and deep learning models, their efficacy is limited because they model only maintenance history data and do not learn from mileage and base information. Moreover, the two experimental results of each baseline model (with and without mileage and base information) confirm that considering only the development process of maintenance projects is inadequate, underscoring the significance of integrating vehicle mileage and base information.
Compared to typical deep fusion models, MsDFN exhibits superior performance. Specifically, for w-F1, MsDFN achieves a score of 34.62%, significantly surpassing IKN-ConvLSTM’s 33.30%, MFDL’s 33.35%, and MIFDELN’s 33.51%. Similarly, for R@3, MsDFN attains a higher score of 40.39%, outperforming scores of 39.15%, 39.20%, and 39.24% from IKN-ConvLSTM, MFDL, and MIFDELN, respectively. This trend continues in R@5 and R@7, with MsDFN achieving 47.29% and 52.18%, respectively, consistently outperforming the results of the other three models. This demonstrates MsDFN’s capability in processing comprehensive vehicle maintenance and basic information, thus affirming its superior performance in predicting maintenance projects.
4.5. Performance Assessment of Data Sufficiency
To evaluate the impact of data sufficiency on prediction accuracy, we fixed the size of the validation set at 3831 vehicles; varied the size of the training set over 12,000, 14,000, 16,000, and 18,000 vehicles; and used the remaining data as the test set. As shown in Figure 4, MsDFN outperforms the other baseline models even with limited training data.
4.6. Ablation Study
To analyze the effectiveness of each module in our proposed approach, we compared eleven ablation variants of the model, each with distinct settings. These variants are as follows:
MsDFN-B1: This model aims to underscore the importance of location in predicting vehicle maintenance projects by removing the input of location.
MsDFN-B2: The elimination of the manufacturer input in this model enables us to assess the significance of the manufacturer in predicting vehicle maintenance projects.
MsDFN-B3: By excluding the input of vehicle type, this model allows us to evaluate the contribution of vehicle type to the prediction of maintenance projects.
MsDFN-B4: The elimination of engine type as an input in this model enables us to ascertain the impact of engine type on predicting maintenance projects.
MsDFN-B5: This model investigates the significance of engine displacement in predicting vehicle maintenance projects by removing the input of engine displacement.
MsDFN-B6: The exclusion of the model year input in this model allows us to evaluate the importance of the vehicle model year in predicting maintenance projects.
MsDFN-B7: By eliminating the input of the year of vehicle production, this model enables us to analyze the impact of production year on predicting maintenance projects.
MsDFN-B8: This model investigates the contribution of vehicle assembly plants to the prediction of maintenance projects by removing the input of vehicle assembly plants.
MsDFN-B: The exclusion of all base information in this model allows for the assessment of its significance in predicting vehicle maintenance projects.
MsDFN-M: The removal of the mileage fusion process in this model enables the evaluation of the importance of mileage in predicting vehicle maintenance projects.
MsDFN-Co: This model assesses the significance of project correlation representation in predicting vehicle maintenance projects by eliminating the process of project correlation representation.
The results of the experiments for each variant model are presented in Table 5, indicating that the performance of each variant model of MsDFN is inferior to that of the original MsDFN, thereby validating the efficacy of each component within the MsDFN model.
The variant models MsDFN-B1, MsDFN-B2,…, MsDFN-B8 evince that the inclusion of vehicle base information enhances the accuracy of vehicle maintenance project predictions, underscoring the necessity of integrating multiple data sources into the model.
MsDFN-B demonstrates the impact of the lack of base information on the experimental results at a holistic level, again showing that the development of vehicle maintenance projects is influenced by vehicle base information.
Remarkably, the MsDFN-M variant reveals the substantive impact of vehicle maintenance mileage on the formulation of maintenance projects, affirming a robust association between the advancement of vehicle maintenance projects and vehicle mileage.
Additionally, the MsDFN-Co variant demonstrates that the adequate representation of maintenance project correlations significantly enhances the precision of vehicle maintenance project predictions, highlighting the interplay between different projects in the development of maintenance projects and the imperative of incorporating project correlation representation within the model.
4.7. Prediction Analysis of New Maintenance Projects
In this study, a new maintenance project is defined as one that has not previously been executed on the vehicle. Within the realm of vehicle maintenance, there exists a heightened interest in predicting such previously unencountered projects. By leveraging the underlying assumption that distinct maintenance projects correlate, our proposed model is expected to exhibit enhanced accuracy in predicting new maintenance projects.
Table 6 displays the experimental results detailing each model’s predictive capabilities regarding new maintenance projects. Notably, our proposed model surpasses all baseline models, attesting to its exceptional proficiency in predicting these previously unencountered maintenance projects. These results validate our model’s effectiveness in predicting new maintenance projects.
4.8. Parametric Sensitivity Analysis
To explore the influence of variable dimensionality on the performance of MsDFN, we examined the impact of the E and U dimensions on the experimental results. Figure 5 shows the performance of the proposed MsDFN model across various hyperparameter combinations. We found that a good balance between efficiency and performance is achieved by setting the E and U dimensions at approximately 75. Furthermore, we evaluated the effect of the co-occurrence matrix threshold in Equation (1) on the experimental results, examining thresholds of 0.04, 0.05, 0.06, 0.07, 0.08, and 0.09. As illustrated in Figure 6, the overall performance remains relatively stable, with the most favorable results attained at a threshold of 0.07.
5. Case Study
To visualize the interrelationship between maintenance projects and the influence of mileage on these projects, we sampled four groups of three maintenance projects, each group taken from a single maintenance record. The maintenance projects for each group are listed in Table 7. We computed the frequency of each group’s maintenance projects within distinct mileage ranges in the training dataset and calculated the average attention weights in Equation (6) for each group’s maintenance projects across various mileage ranges in the validation dataset.
As depicted in Figure 7, a robust correlation emerges between the frequency and weight of the same project, signifying the efficacy of integrating mileage into the predictive process of maintenance projects. Moreover, mileage serves as an influencing factor on project weight, as deduced from the weight–frequency relationship. Furthermore, the weights of different projects exhibit a strong correlation, underscoring the effectiveness of the maintenance project correlation representation in capturing inter-project correlations. In summary, our case study shows that the model accounts for both maintenance project correlations and the impact of mileage.
6. Conclusions
In this paper, we introduce a new deep fusion network that offers a comprehensive approach to predicting global maintenance projects. To enhance our understanding of the relationships between various maintenance projects, we introduce a correlation representation framework utilizing the maintenance project co-occurrence matrix. Building upon this correlation learning, we propose a deep fusion network that integrates the attention mechanism to synthesize vehicle mileage and vehicle base information. By conducting extensive experiments with actual vehicle data, we demonstrate the effectiveness and robustness of our model.
In summary, our model accounts for the interplay between vehicle maintenance projects, mileage, and vehicle base information. Furthermore, the strong interrelation between vehicle maintenance projects, vehicle breakdowns, and vehicle parts enables the seamless application of our model to forecast breakdowns and parts requirements. However, the correlation calculation is based solely on the co-occurrence frequency between maintenance projects, overlooking the correlation embedded in the textual information of maintenance projects. In addition, the co-occurrence can suggest correlation but does not necessarily imply causality or a meaningful relationship for prediction. In future studies, we intend to delve deeper into the correlations of maintenance projects and consider integrating explicit vehicle maintenance technology information into the prediction task. This approach is expected to enhance the interpretability of our model and promote further advancements in the field.