1. Introduction
As one of the key technologies for the digital transformation of traditional industries, the digital twin creates virtual mappings of physical entities, which enables real-time reflection of the entire lifecycle of physical systems and facilitates their optimization and monitoring [1]. Maintaining consistency between the twin model and the physical system is crucial for the successful implementation of a digital twin's functionalities [2]. However, several factors can degrade the training and prediction performance of twin models. First, network communication often suffers from irregularities such as delays and packet loss, leading to irregular sampling intervals [3]. Second, sensors in physical systems have varying sampling periods, causing misalignment and sparsity in multivariate sequences. These issues make the sampled sequences unsuitable for model training and degrade prediction performance [4].
Sequence reconstruction is an effective remedy for these problems in digital twin systems. It involves rebuilding irregular observation sequences using function fitting or generative neural networks to obtain regular sequence data. Traditional methods include polynomial fitting [5] and maximum likelihood estimation [6]. Polynomial fitting approximates complex functions by learning suitable polynomial coefficients but requires predefining the polynomial degree [7,8]. A degree that is too high may lead to overfitting, while one that is too low degrades the fitting performance. For complex and ever-changing observation sequences, it is often difficult to choose a suitable degree in advance. Moreover, polynomial fitting is very sensitive to noise, which makes it unsuitable for noise-corrupted observation sequences [9]. Based on the curve fitting method in statistics, maximum likelihood estimation determines model parameters by calculating likelihood function values but shares similar limitations with polynomial fitting, such as narrow applicability and the need to assume a model form. Narrow applicability means that this method produces satisfactory results only under certain distributions [10], and an assumed model form prevents the method from dealing well with complex sequences influenced by control inputs. Fixed-time discretization is another conventional approach that divides continuous-time observations into fixed-length, non-overlapping data windows [11]. Although this method is easy to follow and use, it may induce empty windows or, when window sizes are increased, data aggregation issues [12]. Therefore, window size is a key parameter for this method. However, determining an appropriate window size for complex sequences is often difficult.
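The window-size trade-off of fixed-time discretization can be seen in a small example. The following Python sketch is only an illustration under stated assumptions: the function name `discretize`, the choice of averaging inside each window, and the sample data are hypothetical and are not taken from the cited works.

```python
import math

def discretize(times, values, window, t_end):
    """Average irregular samples inside fixed-length, non-overlapping windows."""
    n_windows = math.ceil(t_end / window)
    buckets = [[] for _ in range(n_windows)]
    for t, v in zip(times, values):
        # Clamp a sample at t == t_end into the last window.
        idx = min(int(t // window), n_windows - 1)
        buckets[idx].append(v)
    # None marks an empty window (no sample fell inside it).
    return [sum(b) / len(b) if b else None for b in buckets]

# Hypothetical irregular samples: a burst near t = 0 and another near t = 3.5.
times = [0.1, 0.2, 0.9, 3.4, 3.5]
values = [1.0, 2.0, 3.0, 4.0, 6.0]

fine = discretize(times, values, window=1.0, t_end=4.0)    # small windows
coarse = discretize(times, values, window=2.0, t_end=4.0)  # large windows
```

With the small window, two of the four windows are empty; with the large window, no window is empty, but several distinct samples are merged into a single averaged value.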
Alternative solutions involve developing neural network models that can directly use irregular sequences as inputs, including interpolation-based and neural ordinary differential equation (NODE)-based methods. Interpolation-based methods use past and future observations for interpolation. Yoon et al. [13] studied a multi-directional recurrent neural network (RNN) that can use past and future observation data flows at a given time step to realize interpolation, thus improving estimation performance in missing measurement scenarios. Shukla et al. [14] developed an interpolation prediction network, which consists of several semi-parametric radial basis function (RBF) interpolation layers. This interpolation network can interpolate multivariate and irregularly sampled time series against a set of reference time points. In [15,16,17], an attention mechanism is used for time encoding and several multi-time attention-based RNNs are proposed for irregular sequence modeling. However, these interpolation-based methods face variable uncertainty when observation intervals change significantly [18]. Exponentially decaying RNN autoencoder frameworks for sequence reconstruction were studied in [19,20], but Mozer et al. [21] found that these networks offer no obvious improvement in prediction accuracy over standard RNN autoencoders. In 2018, Chen et al. introduced the NODE network [22], offering new solutions for irregular sequence reconstruction. Subsequently, Rubanova et al. [23] developed a latent NODE model, which combines RNN and NODE in a variational autoencoder (VAE) framework and outperforms the exponential decay RNNs. The GRU-ODE-Bayes model proposed by De Brouwer et al. [24] achieved good results on sparse data. Huang et al. [25] presented LG-ODE, which is a latent ordinary differential equation VAE for multi-agent systems with known graph structure. LG-ODE can learn the embedding of high-dimensional trajectories and deduce the latent system dynamics simultaneously. In practical applications, most systems have state trajectories influenced not only by initial values but also by control inputs. NODEs cannot account for the impact of control inputs on output sequences, which limits their application to sequence reconstruction in controlled systems. Moreover, when the measurement data are sparse or noisy due to network issues and electromagnetic interference, existing methods struggle to handle digital twin sequence reconstruction tasks.
To address these challenges, this article proposes a VAE model based on a parallel reference network (PRN) and neural controlled differential equation (NCDE) [26] for noisy multivariate irregular sequence reconstruction in controlled systems. First, we establish a multi-channel self-attention (MCSA) module to analyze the position information and correlations of sampled data in the target sequence, improving reconstruction accuracy under sparse measurements while effectively handling misalignment and irregularity through multi-channel and masking mechanisms. Second, to enhance reconstruction accuracy under high noise levels, we construct a PRN to obtain reference features from the prediction results of the digital twin model and fuse them with actual data features. We also calculate feature weights based on the noise level of the observation sequence to determine the proportion of actual and reference information in the fused features. Third, we use NCDE to build a decoder that can predict observations at any time by incorporating control inputs, which solves the sequence reconstruction problem in controlled systems. Finally, we develop a weighted loss function based on feature weights to better train the model’s network parameters. Simulation experiments demonstrate the effectiveness and fitting accuracy of the proposed model for controlled systems under sparse measurements and high noise levels compared to the existing methods.
Overall, the main contributions of this article are threefold:
- (1) We propose a PRN-NCDE-based VAE model that improves sequence reconstruction accuracy for controlled systems under sparse measurements and high noise levels.
- (2) We develop an MCSA module that can not only analyze data position and correlations to enhance reconstruction performance under sparse measurements, but also effectively handle misaligned and irregular observation sequences.
- (3) To improve reconstruction accuracy under high noise levels, we establish a PRN to obtain reference features and calculate feature weights based on the noise level of observation data for weighted fusion of latent features.
The rest of this article is organized as follows. Section 2 formally describes the irregular sequence reconstruction problem. Section 3 presents the overall framework of the PRN-NCDE model and analyzes the MCSA module, PRN, NCDE network, and weighted loss function. Section 4 validates the effectiveness of the proposed model against existing methods on a boiler system under different sampling numbers and high noise levels. Conclusions and future work are discussed in Section 5.
2. Problem Formulation
In this section, we first give formal descriptions of multivariate regular and irregular observation sequences.
Definition 1. Consider an observation dataset $Y = \{y_1, y_2, \ldots, y_N\}$, where $y_k$ is the observation vector at sampling time $t_k$, and $N$ is the total number of samples. Let NaN denote the missing data in the observation vector. When an observation vector has missing data, its dimension satisfies $\dim(y_k) < n_y$, where $n_y$ is the number of observed variables. Then, if dataset $Y$ meets the following two conditions:
- (1) For any two consecutive observation vectors $y_k$ and $y_{k+1}$ in $Y$, the sampling interval is a constant $\Delta t$, i.e., $t_{k+1} - t_k = \Delta t$ for $k = 1, \ldots, N-1$.
- (2) No observation vector in $Y$ contains missing data NaN, i.e., $\dim(y_k) = n_y$ for $k = 1, \ldots, N$.
Then, $Y$ forms a regular sampled time series. Conversely, if $Y$ does not satisfy both conditions, it forms an irregular sampled time series.
Condition (1) in Definition 1 describes the regularity of the sampling sequence along the time dimension, that is, whether a fixed sampling interval is followed. Condition (2) describes whether the sampling times of all variables in the multivariate observation sequence are aligned. Only when the dataset satisfies both conditions, meaning that the sampling times of all the variables are aligned and there is a unified and fixed sampling interval, can it be regarded as a regular multivariate observation sequence. If either condition is not met, it is considered an irregular multivariate observation sequence.
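The two conditions of Definition 1 can be checked mechanically. The following minimal Python sketch assumes that sampling times are given as a list of floats and that each observation vector is a list in which missing entries are stored as NaN; the function name `is_regular` and the tolerance `tol` are illustrative choices, not part of the original formulation.

```python
import math

def is_regular(times, observations, tol=1e-9):
    """Return True if the sampling interval is constant and no entry is NaN."""
    intervals = [b - a for a, b in zip(times, times[1:])]
    # Condition (1): every interval equals the first one (within tol).
    fixed_interval = all(abs(d - intervals[0]) <= tol for d in intervals)
    # Condition (2): no observation vector contains a missing (NaN) entry.
    no_missing = all(not math.isnan(x) for row in observations for x in row)
    return fixed_interval and no_missing

nan = float("nan")
regular = is_regular([0.0, 0.5, 1.0], [[1.0, 2.0], [1.1, 2.1], [1.2, 2.2]])
uneven = is_regular([0.0, 0.5, 1.5], [[1.0, 2.0], [1.1, 2.1], [1.2, 2.2]])
misaligned = is_regular([0.0, 0.5, 1.0], [[1.0, 2.0], [nan, 2.1], [1.2, 2.2]])
```

Here only the first sequence is regular: the second violates the fixed-interval condition and the third violates the alignment condition.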
Due to various uncertainties in communication networks and non-synchronous sampling of sensors, twin systems usually receive non-aligned, irregular sampling sequences. Consider the nonlinear discrete-time system corrupted by measurement noise as follows:
$$x_{k+1} = f(x_k, u_k), \qquad y_k = h(x_k) + v_k, \tag{1}$$
where $x_k$, $u_k$, and $y_k$ are the state vector, control vector, and observation vector of the system at sampling time step $k$, respectively. $f(\cdot)$ and $h(\cdot)$ are the nonlinear state transition function and measurement function, respectively. For system (1), we make the following assumptions:
Assumption 1. The process noise of the system is ignored, and it is considered that system (1) only contains measurement noise $v_k$, which follows a Gaussian distribution with mean $\mu_v$ and covariance $R$.
Assumption 2. Due to the uncertainties of network communication and non-synchronous sampling of sensors, the time series formed by the observations of system (1) is no longer regular. Instead, it is a non-aligned irregular time series with non-aligned sampling times and non-fixed sampling intervals.
Based on the above descriptions, the main objective of this article is as follows: Given a set of noisy, non-aligned and irregular observation sequences $Y = \{y_1, \ldots, y_N\}$ from a controlled system, we want to develop a VAE network model that can maximize the evidence lower bound (ELBO) given by
$$\mathrm{ELBO}(\phi, \theta) = \mathbb{E}_{q_\phi(z \mid Y)}\left[\log p_\theta(Y \mid z)\right] - \mathrm{KL}\left(q_\phi(z \mid Y) \,\|\, p(z)\right) \tag{2}$$
to train the encoder network weights $\phi$ and decoder network weights $\theta$, as shown in Figure 1. $\hat{y}_k$ is the estimated observation vector from the neural network model, corresponding to the original observation vector $y_k$; $q_\phi(z \mid Y)$ is the conditional distribution that the encoder network needs to approximate; $p_\theta(Y \mid z)$ is the conditional distribution that the decoder needs to approximate; $p(z)$ is the prior distribution. Once the optimal sequence reconstruction model is trained, it can predict observations at any desired time and produce the regular sequence $\hat{Y} = \{\hat{y}_{\tau_1}, \ldots, \hat{y}_{\tau_M}\}$ needed by the twin system. Here, $\hat{y}_{\tau_j}$ is the estimated observation vector at desired time $\tau_j$; $M$ is the number of samples in the regular sequence.
The next section will focus on the reconstruction method for noisy irregular sequences in controlled systems and propose a VAE network model based on PRN-NCDE to obtain regular sequences with higher fitting accuracy.
3. Irregularly Sampled Observation Sequence Reconstruction Method Based on PRN-NCDE
This section constructs a deep generative network model based on PRN-NCDE within the VAE framework, which mainly consists of an encoder and a decoder. First, the overall network framework is established, and the workflow of the model and the functions of each module are described. Then, we analyze in detail the MCSA module in the encoder, the PRN and the calculation of latent feature weights, and the construction of the NCDE network in the decoder, and establish the corresponding weighted loss function. Finally, the approximation ability of the reconstruction model for the observation sequence is discussed. We refer the readers to [27,28] and [26,29] for detailed theories and implementations of VAE, MCSA and NCDE.
3.1. Overall Network Framework of PRN-NCDE
The overall network framework of the sequence reconstruction model, as shown in Figure 2, consists primarily of an encoder and a decoder. The encoder comprises two identical network structures that map the actual observational sequence and the reference observational sequence into latent features, subsequently merging them through weighted fusion based on the noise level of the actual observational data. The decoder, consisting of an NCDE network and an output layer, utilizes these fused latent features to reconstruct the desired regular sequence.
The encoder input consists of the actual irregular observation sequence and the reference observation sequence, each of which passes through an identical neural network architecture to produce the actual and reference latent features, respectively. To enhance reconstruction accuracy under sparse measurement conditions, a multi-head self-attention mechanism is adopted to process both sequences, focusing on the position information of the measurement data and capturing the correlations among them. The multi-channel arrangement of these self-attention modules specifically addresses the irregularity and non-alignment issues in actual observation sequences.
Taking the coal-fired power plant boiler system as an illustrative example, multivariate irregular observational sequences are first expanded to equal-length sequences based on desired sampling intervals to deal with irregular sampling intervals. Furthermore, discrepancies in sensor sampling times and frequencies lead to non-alignment among variables such as gas density, gas temperature, and gas oxygen content, causing inconsistent positions of missing values across different variables. Therefore, the expanded sequence is segmented along variable dimensions and input into corresponding masked multi-head self-attention modules, where masks indicate missing values so that they do not harm network training and prediction. Subsequently, the output sequence from the self-attention module undergoes group normalization and reversal before being fed into an LSTM network for reverse-time inference from the final time step back to the initial one, obtaining the hidden state at the initial time step. A fully connected layer maps this hidden state to the mean and variance of the latent feature distribution, and latent features are then generated by combining them with noise sampled from a standard Gaussian distribution.
Similarly, reference observation sequences generated by a digital twin model undergo processing through an identical PRN structure to obtain reference latent features. Incorporating PRN to acquire the reference latent features provides prior information, thereby enhancing the reconstruction model’s fitting performance under higher noise conditions. Higher noise levels would increase the noise information in the latent features from the actual observation sequences, significantly impairing the decoder’s effectiveness. Hence, weights assigned to latent features are set as a nonlinear function of observation sequence noise variance. With noise level increasing, we should reduce the weight of latent features of the actual observation sequence and rely more on the reference latent features. Noise variance is approximated from the difference between actual and reference sequences. The resulting weighted fusion feature is subsequently used as the decoder input.
The decoder can be viewed as the inverse process of the encoder, reconstructing observation data from latent features through neural networks. Compared with NODE models that rely solely on the initial latent value, the NCDE-based decoder takes the influence of the control inputs into consideration during sequence reconstruction and is therefore better suited to controlled systems. The NCDE network includes multi-channel fully connected layers and activation function layers, each linked to a corresponding control derivative channel to perform element-wise multiplication with the control derivatives. Summation across channels produces the NCDE output. The number of channels in the NCDE network is determined by the dimensionality of the control variables. Ultimately, the NCDE-generated predictions are mapped via the output layer to yield the desired regular data sequence.
3.2. Multi-Channel Self-Attention Module
The multi-channel self-attention (MCSA) module comprises multiple masked multi-head self-attention modules designed to handle the misalignment and irregularities inherent in multivariate observation sequences. Initially, the observation sequence of each dimension within the coal-fired power plant is expanded to a fixed-length sequence at a desired sampling interval. Consequently, for the irregular observation vector $y_k$, certain dimensional observations will be missing. These missing observations can be filled using a constant value $c$, represented as follows:
$$\tilde{y}_k = m_k \odot y_k + c\,\bar{m}_k,$$
where $\tilde{y}_k$ is the observation vector after filling. $m_k$ is an indicator function taking values of either 0 or 1, with 1 denoting the presence of an observation and 0 denoting its absence. $\bar{m}_k$ is the vector formed by inverting elements of $m_k$, and $\odot$ represents element-wise multiplication.
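The filling step can be sketched in a few lines of Python. This is a minimal illustration, assuming that missing entries are stored as NaN; the function name `fill_missing` and the default fill value are hypothetical. Because NaN multiplied by zero is still NaN in floating-point arithmetic, the sketch selects values by the mask rather than multiplying, which gives the same result as the masked sum when missing entries are treated as zero.

```python
import math

def fill_missing(y, fill_value=0.0):
    """Replace NaN entries by a constant and return the filled vector and mask."""
    # mask[i] == 1 where an observation is present, 0 where it is missing.
    mask = [0 if math.isnan(v) else 1 for v in y]
    # Keep observed values, use the constant where the mask is 0.
    filled = [v if m == 1 else fill_value for v, m in zip(y, mask)]
    return filled, mask

nan = float("nan")
filled, mask = fill_missing([1.5, nan, 3.0], fill_value=-1.0)
```

The returned mask is what a masked attention layer would later use to ignore the filled positions.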
Therefore, the irregular observation sequence can be expanded into a regular sequence with the desired length $M$, represented as follows:
$$\tilde{Y} = [\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_M],$$
where $\tilde{Y}$ represents the expanded multivariate sequence containing the filled values $c$.
Additionally, due to the non-alignment of the multivariate observation sequence in the boiler system, the filled positions of missing values are not entirely consistent. Thus, the number of MCSA self-attention channels should correspond to the dimensionality of the observational variable $n_y$, and the sequence $\tilde{Y}$ must be divided according to variable dimensions (i.e., the rows of $\tilde{Y}$) and input into the corresponding self-attention module channels. Taking observational variables in the boiler system as an example, division by variable dimension can be expressed as
$$[s_1; s_2; s_3] = \mathrm{Split}_{\mathrm{row}}(\tilde{Y}),$$
where $s_1$, $s_2$, and $s_3$ represent the time series with length $M$ of gas density, gas temperature, and gas oxygen content. $\mathrm{Split}_{\mathrm{row}}(\cdot)$ represents row-wise partitioning of the matrix. Since the filled positions for each variable are inconsistent, the divided sequences are separately input into corresponding multi-head self-attention modules, represented as follows:
$$o_i = \mathrm{MultiHead}_i(s_i) = \mathrm{Concat}(\mathrm{head}_{i,1}, \ldots, \mathrm{head}_{i,H})\,W_i^{O},$$
with
$$\mathrm{head}_{i,h} = \mathrm{Attention}\left(s_i W_{i,h}^{Q},\; s_i W_{i,h}^{K},\; s_i W_{i,h}^{V}\right),$$
where $s_i$ represents the time series of the i-th variable, $o_i$ is the output row vector of MCSA, $\mathrm{MultiHead}_i(\cdot)$ represents the multi-head attention operation of the i-th channel, $W_i^{O}$ is the output projection matrix of the i-th channel, and $\mathrm{head}_{i,h}$ is the self-attention operation of the h-th head of the i-th channel. $W_{i,h}^{Q}$, $W_{i,h}^{K}$, and $W_{i,h}^{V}$ are the corresponding projection matrices.
The output results of each self-attention channel are concatenated along the dimension direction and written as $O = \mathrm{Concat}(o_1, \ldots, o_{n_y})$, where $O$ represents the concatenated matrix and $\mathrm{Concat}(\cdot)$ represents the concatenation operation.
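To make the per-variable masking concrete, the following Python sketch runs one self-attention channel per variable and ignores filled positions as keys. It is a deliberately simplified illustration, not the MCSA module itself: it assumes a single head, scalar sequence elements, and identity query, key, and value projections, and the names `masked_self_attention` and `mcsa` are hypothetical.

```python
import math

def masked_self_attention(seq, mask):
    """Single-head scaled dot-product self-attention over a scalar sequence.

    Positions with mask == 0 (filled values) get zero attention weight as keys.
    """
    if not any(mask):
        raise ValueError("at least one observed position is required")
    out = []
    for q in seq:
        # Feature dimension is 1, so the usual 1/sqrt(d_k) scaling equals 1.
        scores = [q * k if m == 1 else -math.inf for k, m in zip(seq, mask)]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # exp(-inf) == 0.0
        total = sum(weights)
        out.append(sum(w / total * v for w, v in zip(weights, seq)))
    return out

def mcsa(rows, masks):
    """Apply one attention channel per variable (row) with its own mask."""
    return [masked_self_attention(r, m) for r, m in zip(rows, masks)]

# Two variables whose missing positions differ (non-aligned sampling).
rows_a = [[1.0, 99.0, 2.0], [0.5, 0.7, 99.0]]
rows_b = [[1.0, -5.0, 2.0], [0.5, 0.7, -5.0]]   # same data, different fill value
masks = [[1, 0, 1], [1, 1, 0]]
out_a = mcsa(rows_a, masks)
out_b = mcsa(rows_b, masks)
```

For queries at observed positions, the output does not depend on the constant used to fill the missing entries, which is the purpose of the mask.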
3.3. Weighted Fusion of Latent Features and Weight Calculation
Sensor data in twin systems can have significant measurement noise covariance due to electromagnetic interference. Existing methods struggle with reconstructing irregular observation sequences with high noise levels. Large measurement noise will heavily contaminate the actual observation data, which means that the real distribution of the measurement information cannot be well learned. An effective means for dealing with this issue is to incorporate reference latent features into the model training and prediction process, and fuse them with latent features from noisy actual observations, thus reducing the impact of noise on reconstruction results.
Therefore, we utilize the prediction results of a high-fidelity digital twin model as a reference observation sequence, and construct a PRN with the same network structure to obtain reference latent features. We can assume that the actual and reference observation sequences are mapped to latent states $z_a$ and $z_r$, respectively, through identical neural network structures, with their posterior distributions following normal distributions given by
$$q(z_a \mid Y_a) = \mathcal{N}(\mu_a, \Sigma_a), \qquad q(z_r \mid Y_r) = \mathcal{N}(\mu_r, \Sigma_r),$$
with
$$\mu_a = g_\mu(h_a), \quad \Sigma_a = \exp\left(g_\Sigma(h_a)\right), \qquad \mu_r = g_\mu(h_r), \quad \Sigma_r = \exp\left(g_\Sigma(h_r)\right),$$
where $\mu_a$ and $\mu_r$ denote the mean values of $z_a$ and $z_r$, and $\Sigma_a$ and $\Sigma_r$ are the corresponding covariances. $h_a$ and $h_r$ are the initial hidden states from the two LSTM networks. $Y_a$ and $Y_r$ represent the actual irregular observation sequence and the reference observation sequence generated by the model, respectively. $g(\cdot)$ is the feedforward neural network transforming the initial hidden state into $\mu$ and $\Sigma$, and $\exp(\cdot)$ denotes the exponential operation, which ensures that the covariance matrix is positive definite.
By using the reparameterization trick, $z_a$ and $z_r$ can be written as
$$z_a = \mu_a + \Sigma_a^{1/2}\,\epsilon, \qquad z_r = \mu_r + \Sigma_r^{1/2}\,\epsilon,$$
where $\epsilon$ is sampled from the standard normal distribution.
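The reparameterization trick can be sketched as follows. The snippet is an illustration under stated assumptions: it assumes a diagonal covariance parameterized by its log-variance (so that the exponential keeps every variance positive), and the function name `reparameterize` and the seeded generator are hypothetical.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), element by element."""
    # exp(0.5 * log_var) is the standard deviation; exp keeps it positive.
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

rng = random.Random(0)                      # fixed seed for reproducibility
mu = [1.0, -2.0]
log_var = [math.log(0.25), math.log(0.25)]  # standard deviation 0.5
z = reparameterize(mu, log_var, rng)
```

Because the randomness enters only through the standard normal sample, the output is a differentiable function of the mean and log-variance, which is what allows gradient-based training of the encoder.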
Then, we need to determine reasonable weight allocation criteria for the weighted fusion of latent features. By introducing reference latent features, the reconstruction performance under large noise conditions can be improved. Therefore, the noise level of the actual observation sequence can be used for weight allocation. When noise is small, the weight of the actual observation sequence should be increased to capture more of its feature information. Conversely, when noise is large, the weight of the reference latent features should be increased to avoid introducing excessive measurement noise. Assuming the measurement noise follows a Gaussian distribution with mean
and covariance
, we estimate the noise covariance using the reference observation sequence:
where
and
are the normalized actual and reference observations.
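As a rough illustration of estimating the noise level from the gap between actual and reference observations, the following Python sketch computes a per-variable mean squared residual and takes the largest value. This is not the estimator of the original formulation: it assumes zero-mean noise, a reference sequence that is accurate and aligned with the actual samples, and only the diagonal of the covariance; the function name `noise_variances` and the data are hypothetical.

```python
def noise_variances(actual, reference):
    """Per-variable mean squared residual between actual and reference rows."""
    variances = []
    for a_row, r_row in zip(actual, reference):
        residuals = [a - r for a, r in zip(a_row, r_row)]
        variances.append(sum(e * e for e in residuals) / len(residuals))
    return variances

# Two normalized variables: the first is noisy, the second matches the reference.
actual = [[1.0, -1.0, 1.0, -1.0], [0.5, 0.5, 0.5, 0.5]]
reference = [[0.0, 0.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]]

variances = noise_variances(actual, reference)
noise_level = max(variances)   # largest diagonal element of the estimate
```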
Let
be the largest diagonal element of
. The reference feature weight is then calculated by
where
,
,
.
,
,
and
are the threshold parameters needed to be set properly. If the noise level is smaller than
, there is no need to increase the weight of the reference observation feature and the model will primarily utilize the actual observation features to realize sequence reconstruction. While if the noise level is larger than
, the weight of the reference feature should be increased so that the negative impact of noise on reconstruction accuracy could be reduced. In addition, in order to guarantee that the fused feature always contains some feature information of the actual data, the maximum reference feature weight is set to
.
Finally, the fused feature can be expressed as follows:
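As a hedged illustration of noise-dependent weighting and fusion, the sketch below uses a saturating ramp for the reference weight and a convex combination for the fused feature. Both choices are assumptions made for illustration only: the exact weight function and threshold values of the model are not reproduced here, and the names `reference_weight`, `fuse`, `low`, `high`, and `w_max` are hypothetical.

```python
def reference_weight(noise_level, low, high, w_max):
    """Hypothetical weight: 0 below `low`, `w_max` above `high`, ramp between."""
    if noise_level <= low:
        return 0.0
    if noise_level >= high:
        return w_max
    return w_max * (noise_level - low) / (high - low)

def fuse(z_actual, z_reference, w):
    """Convex combination of actual and reference latent features."""
    return [(1.0 - w) * a + w * r for a, r in zip(z_actual, z_reference)]

w_small = reference_weight(0.01, low=0.05, high=0.5, w_max=0.8)
w_large = reference_weight(2.0, low=0.05, high=0.5, w_max=0.8)
fused = fuse([1.0, 2.0], [3.0, 4.0], 0.5)
```

Capping the weight below 1 keeps some information from the actual data in the fused feature even at very high noise levels.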
3.4. NCDE-Based Decoder
The decoder predicts the desired regular data sequence from the fused initial latent feature $z_{t_0}$. Generally, most real systems are controlled, e.g., coal-fired power plants. However, the prediction results of the NODE method depend only on the initial hidden state without considering the impact of control inputs on state changes. The NCDE proposed by Kidger et al. [28] takes the impact of control inputs into consideration during the prediction process. The solution of the NCDE is defined as follows:
$$z_t = z_{t_0} + \int_{t_0}^{t} f_\theta(z_s)\,\mathrm{d}X_s, \tag{12}$$
where $z_t$ is the hidden state at time $t$, $X$ is a cubic spline curve based on the control sequence $u$ over the interval $[t_0, t]$, and $f_\theta$ is a neural network with learnable parameters $\theta$.
Consider a generative model defined by the NCDE network with the initial latent feature
and desired sampling times
. The initial latent feature
follows a normal distribution given by
where
Using Formula (12), we can compute the latent features at all desired time points. The neural network
consists of multiple channels of multilayer feedforward networks, with the number of channels determined by the system’s control dimension
. Taking the boiler system as an example, since it has two control inputs (coal feed and secondary air flow),
has two channels as follows:
where
denotes a multilayer feedforward network cascaded with fully connected layers and LeakyReLU activation functions, and
are the outputs of the
for each channel. Multiplying
and
with their respective control derivatives and summing the results yields the latent feature of the NCDE network at the desired time
given by
where
and
are the cubic spline curves formed by the sequences of control input 1 (coal feed) and control input 2 (secondary air flow), respectively.
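The two-channel construction can be illustrated with an explicit Euler discretization. The sketch below is a simplification under stated assumptions: the latent state is a scalar, the channel networks are replaced by plain Python functions, and the control paths are given as callables instead of cubic splines, so that path increments stand in for the control derivative multiplied by the step size. The name `ncde_euler` is hypothetical.

```python
def ncde_euler(z0, f1, f2, x1, x2, times):
    """Explicit Euler scheme for dz = f1(z) dX1 + f2(z) dX2 on a time grid."""
    z = z0
    trajectory = [z]
    for t_prev, t_next in zip(times, times[1:]):
        dx1 = x1(t_next) - x1(t_prev)   # increment of control path 1
        dx2 = x2(t_next) - x2(t_prev)   # increment of control path 2
        # Each channel output is multiplied by its control increment, then summed.
        z = z + f1(z) * dx1 + f2(z) * dx2
        trajectory.append(z)
    return trajectory

times = [0.1 * k for k in range(11)]

# Constant controls: the latent state does not move at all.
frozen = ncde_euler(1.0, lambda z: -z, lambda z: 0.5,
                    lambda t: 2.0, lambda t: 3.0, times)

# Control 1 ramps linearly, control 2 is constant: dz = -z dt.
driven = ncde_euler(1.0, lambda z: -z, lambda z: 0.5,
                    lambda t: t, lambda t: 0.0, times)
```

The first run shows that the latent trajectory changes only when the controls change, which is the key difference from a NODE driven by time alone; the second run approximates exponential decay.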
After obtaining the latent features at all desired time points, the output layer generates the desired observation sequence as follows:
where
.
is the desired regular observation sequence generated from the latent feature sequence
at sampling time
.
3.5. Weighted Loss Function of the PRN-NCDE Model
Through reparameterization, parameters can be learned by using the gradient descent method. According to [30], the objective loss function in Equation (2) can be simplified as
where
and
are the encoder input and decoder output at the corresponding time,
is a hyperparameter controlling the variance and
is the dimension of the latent feature
.
and
are the mean and covariance of
, respectively.
denotes the trace operator.
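The trace, mean, and dimension terms listed above are the ingredients of the closed-form Kullback-Leibler divergence between a Gaussian posterior and a standard normal prior. The following Python sketch evaluates that divergence for the special case of a diagonal covariance, which is an assumption made here for simplicity; the function name is hypothetical.

```python
import math

def kl_diag_gaussian_to_standard(mu, var):
    """KL( N(mu, diag(var)) || N(0, I) ) in closed form."""
    d = len(mu)
    trace_term = sum(var)                       # tr(Sigma)
    mean_term = sum(m * m for m in mu)          # mu^T mu
    log_det = sum(math.log(v) for v in var)     # log det(Sigma)
    return 0.5 * (trace_term + mean_term - d - log_det)

kl_zero = kl_diag_gaussian_to_standard([0.0, 0.0], [1.0, 1.0])
kl_shifted = kl_diag_gaussian_to_standard([1.0, 0.0], [1.0, 1.0])
```

The divergence is zero when the posterior equals the prior and grows as the mean moves away from zero or the variances depart from one.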
This objective function measures the reconstruction accuracy of the decoder output relative to the encoder input. When the actual observation sequence is not corrupted by noise or the noise level is low, this function enables the model to learn the data distribution effectively. However, if the actual observational data contain high levels of noise, achieving the desired training performance using this objective function is difficult.
To address this challenge, we propose a weighted loss function for the PRN-NCDE model, which is defined as follows:
where
is the reference latent feature weight calculated in
Section 3.3.
Equation (18) integrates data from the reference observation sequence. It uses the reference feature weight, which is calculated based on the noise level in the actual observation data, to determine the importance of the actual and reference data distributions for parameter training. When noise is low, the loss function will rely more on the actual observation data for training. Conversely, the reference observation data will be used to mitigate the negative impact of noise on model learning.
4. Simulation Experiments and Analysis
This section selects the furnace of the boiler system as the test object to evaluate the reconstruction performance of the developed model. We use the nominal data of a 600 MW coal-fired power plant under 100% rated conditions to train the sequence reconstruction model and validate its effectiveness. Under the rated conditions, it is required that the coal flow rate is about 74.5 kg/s and the excess air coefficient is 1.2; thus, the gas oxygen content in the boiler system can be maintained around 3.2% and the boiler system can operate stably. The reference observation sequence is generated by the digital twin model of the furnace, which is defined by the following equations [31,32,33]:
where
,
, and
represent the system’s state, input, and observation vectors, respectively, and they are defined as follows:
The matrices
,
, and
are given by:
where
The physical meanings of the parameters and the value ranges of the input and output quantities are shown in Table 1. To validate the model’s performance, we compare the proposed model with the RNN-NODE method from [22] and the improved RNN-NCDE method. The network architecture parameters for each model are listed in Table 2.
The model training process uses the following settings: number of iterations—200; initial learning rate—; minimum learning rate—; decay rate—0.999; hyperparameter—. The thresholds for the reference latent feature weights are set as , , , and . The desired regular sequence length is set to , with a sampling interval of . The control input is normalized to the range , and the observation is normalized to the range . The experimental analysis focuses on two aspects: (1) performance under different sampling numbers; (2) performance under high noise levels.
The parameters of the experimental environment, including the CPU, GPU, memory, and the Matlab and Deep Learning Toolbox versions, are detailed in Table 3.
4.1. Analysis of Model Reconstruction Results Under Different Sampling Numbers
The dataset for model training and validation consists of 150 irregular observation sequences, including 5 different random sampling sequences with a sampling number of $N$ under 30 different working conditions. This section compares the reconstruction results of models under sampling numbers of $N = 60$, $N = 40$, and $N = 20$. The noise covariance of the actual observation sequence is set to
Figure 3 and Figure 4 show the reconstruction results of each model for the normalized values of each observation variable under different sampling numbers. Since RNN-NODE does not consider the impact of control inputs on outputs, its predictions rely solely on the initial latent state, resulting in significant estimation errors. After improvement with NCDE, the model can account for control sequences, enhancing reconstruction accuracy. As shown in Table 4, RNN-NCDE reduces the estimation error by 66.3%, 65.6%, and 53.5% compared to RNN-NODE when the sampling numbers are 60, 40, and 20, respectively. Although RNN-NCDE improves the reconstruction results of RNN-NODE, its performance degrades noticeably under sparse measurements, as evidenced by a near doubling of prediction errors when the sampling number decreases from 60 to 20.
The proposed PRN-NCDE model, by focusing on the correlation between sampling points and their position information, not only further improves fitting accuracy under each sampling number but also mitigates the impact of sparse measurements on reconstruction results. The RMSE results indicate that PRN-NCDE enhances estimation accuracy by 54%, 58.5%, and 63.7% compared to RNN-NCDE when the sampling numbers are 60, 40, and 20, respectively. Moreover, when the sampling number decreases from 60 to 20, PRN-NCDE’s prediction error increases by only 50%, remaining superior to the other two methods. Therefore, the PRN-NCDE model is more effective in handling irregular sequence reconstruction for controlled systems and can significantly reduce the impact of sparse measurements on reconstruction results.
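The reported reductions can be chained to estimate the improvement of PRN-NCDE over RNN-NODE. The short Python calculation below assumes that both sets of percentages refer to the same error metric, so that the remaining error fractions multiply; the variable names are illustrative.

```python
# Error reduction of RNN-NCDE relative to RNN-NODE, by sampling number.
ncde_vs_node = {60: 0.663, 40: 0.656, 20: 0.535}
# Error reduction of PRN-NCDE relative to RNN-NCDE, by sampling number.
prn_vs_ncde = {60: 0.54, 40: 0.585, 20: 0.637}

def combined_reduction(r1, r2):
    """Overall reduction when two relative reductions are applied in sequence."""
    return 1.0 - (1.0 - r1) * (1.0 - r2)

prn_vs_node = {n: combined_reduction(ncde_vs_node[n], prn_vs_ncde[n])
               for n in ncde_vs_node}
```

Under this assumption, the combined reduction is roughly 84.5%, 85.7%, and 83.1% for sampling numbers of 60, 40, and 20, respectively, that is, more than 80% in every case.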
4.2. Reconstruction Performance Analysis Under High Noise Levels
Similarly, a dataset of 150 irregular sequences with three noise levels is used to compare reconstruction performance. The sampling number for each noise level is set to , with noise covariances as follows:
Figure 5 and Figure 6 show the reconstruction results of each model under high noise levels. Both RNN-NCDE and RNN-NODE fail to effectively recover the true values from noisy sequences, with significant drops in accuracy.
The results in Table 5 show that RNN-NCDE has larger fitting errors than RNN-NODE under all noise levels, indicating that NCDE alone does not improve fitting accuracy in noisy conditions. Neither method can therefore handle the sequence reconstruction problem effectively under large noise levels.
The proposed PRN-NCDE model uses a parallel reference network to obtain reference latent features and calculates weights based on the noise level of the actual observation sequence. This weighted fusion of latent features improves reconstruction accuracy under high noise levels. The results show that PRN-NCDE can accurately recover the true value trends, with prediction errors reduced by over 70% and 60% compared to RNN-NCDE and RNN-NODE, respectively.
4.3. Discussion
For a well-trained sequence reconstruction model, the parameters related to the latent feature weight, including , , and , will decide the reconstruction results. For the simulation experiments in this paper, the value of is set based on the largest measurement noise variance. While guaranteeing the noise in a reasonable range, the features of the real observation data are more utilized for sequence reconstruction. If the noise level is smaller than , there is no need to increase the weight of the reference observation feature and the model will primarily utilize the actual observation features to realize sequence reconstruction. While if the noise level is larger than due to electromagnetic interference or sensor failure, the weight of the reference feature should be increased so that the negative impact of noise on reconstruction accuracy could be reduced. In addition, in order to guarantee that the fused feature always contains some feature information of the actual data, the maximum reference feature weight is set to .
In conclusion, for the reconstruction of noisy, irregular multivariate sequences in controlled systems, the PRN-NCDE model outperforms existing methods. Simulation results confirm its effectiveness in improving reconstruction accuracy under sparse measurements and high noise levels, making it suitable for digital twin systems in complex coal-fired power plant environments.
5. Conclusions
The uncertainty and randomness in network communication and the non-synchronous sampling of sensors can cause irregularities, sparsity, and misalignment in measurement data. Existing methods struggle to achieve ideal reconstruction under sparse measurements and high noise, making them unsuitable for sequence reconstruction tasks in digital twin systems. To address these issues, this article establishes a variational autoencoder model based on parallel reference networks and neural controlled differential equations, which can effectively handle the reconstruction of noisy, irregular multivariate sequences in digital twin systems.
First, a multi-channel self-attention module is incorporated into the encoder. This module not only analyzes the position information of sampled data in the sequence and focuses on the correlation between data to enhance reconstruction accuracy under sparse measurements but also handles the misalignment and irregularity of observation sequences. Second, a parallel reference network is established. The reference sequence provided by the digital twin model is input and mapped to reference latent features, which are then fused with the latent features of the actual observation sequence in a weighted manner to address the reconstruction of observation sequences under high noise conditions. Third, a decoder is constructed using an NCDE network, which takes into account the effect of control inputs on outputs to improve the sequence reconstruction performance of controlled systems. Finally, a weighted loss function for training the PRN-NCDE model is formed based on the calculated feature weights, which helps to better train the network parameters of the model.
Simulation results show that the proposed PRN-NCDE model improves the estimation accuracy by more than 50% and 70%, respectively, compared with RNN-NCDE under different sampling numbers and noise levels, and by more than 80% and 60%, respectively, compared with RNN-NODE. It can recover the changing trend of observation sequence more accurately. Therefore, the proposed method can effectively improve the reconstruction accuracy of observation sequences under sparse measurements and large noise conditions, and is suitable for irregular sequence reconstruction tasks in digital twin systems with uncertain network transmission and non-synchronous sensor sampling.
In future work, three aspects could be further investigated. First, the structure of the deep learning network can be improved to achieve better feature extraction and prediction accuracy, especially under small sampling numbers and high noise levels. Second, the presence of outliers in the measurements should be considered to enhance the robustness of the reconstruction method. Third, this work assumes that the measurement noise follows a Gaussian distribution. Modeling non-Gaussian measurement noise and evaluating the reconstruction performance in that setting is another interesting research direction.