Article

A Deep Learning-Based Acoustic Signal Analysis Method for Monitoring the Distillation Columns’ Potential Faults

1. National-Local Joint Engineering Laboratory for Energy Conservation in Chemical Process Integration and Resources Utilization, School of Chemical Engineering and Technology, Hebei University of Technology, Tianjin 300130, China
2. School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(16), 7026; https://doi.org/10.3390/app14167026
Submission received: 5 July 2024 / Revised: 3 August 2024 / Accepted: 8 August 2024 / Published: 10 August 2024

Abstract:
Distillation columns are vital for substance separation and purification in various industries, where malfunctions can lead to equipment damage, compromised product quality, production interruptions, and environmental harm. Early fault detection using AI-driven methods like deep learning can mitigate downtime and safety risks. This study employed a lab-scale distillation column to collect passive acoustic signals under normal conditions and three potential faults: flooding, dry tray, and leakage. Signal processing techniques were used to extract acoustic features from signals with low signal-to-noise ratios and weak time-domain characteristics. A deep learning-based passive acoustic feature recognition method was then applied, achieving an average accuracy of 99.03% on Mel-frequency cepstral coefficient (MFCC) spectrogram datasets. This method demonstrated robust performance across different fault types and limited data scenarios, effectively predicting and detecting potential faults in distillation columns.

1. Introduction

Distillation columns play a crucial role in fields such as chemical engineering, food processing, pharmaceuticals, and environmental protection. However, ensuring the stable and safe operation of distillation columns is highly challenging [1].
Traditionally, the techniques for monitoring the status and diagnosing faults in distillation columns rely on acquiring and analyzing key internal variables. Zheng et al. [2] collected ten sets of temperature and pressure variables from the trays of industrial distillation columns. These variables were used as dynamic characteristic parameters in a framework based on dynamic-control principal component analysis (DCPA) for industrial process monitoring and automated response. The framework detects faults in the distillation process by analyzing the deviations between the current state vector and historical variables. Shahid et al. [3] utilized Aspen Plus simulations to obtain fourteen continuous variables related to the distillation process, including reflux ratio, heat exchanger power, and product composition. Subsequently, techniques such as the change in variable values, average variable threshold limits, and wavelet transform–principal component analysis were applied to evaluate the contribution of each variable under different fault conditions.
However, due to the highly coupled nonlinear relationships among these variables during the operation of the distillation column, obtaining accurate measurements is very challenging. Additionally, changes in these process variables are often detected only after a fault has occurred, leading to delays in diagnosing the issue. In contrast, monitoring technologies based on acoustic signal analysis have shown promising potential for distillation columns [4]. The acoustic signals emitted by a distillation column reflect the relative motion state of the fluids inside. When the operational state of the column changes, the acoustic signals also change accordingly. This continuous nature of state changes makes it possible to monitor potential faults in the distillation column through acoustic signal analysis. Acoustic signals can capture the initial signs of faults, which are often difficult to detect in advance using traditional methods. By continuously monitoring and analyzing these signals, preventive maintenance measures can be taken before the faults develop into severe problems, significantly reducing the impact of faults on the production process [5].
The passive acoustic signals from a distillation column typically contain rich information about the equipment’s operating conditions [6]. In passive acoustic signal analysis, Mel-frequency cepstral coefficients (MFCCs), inspired by the human auditory system, can effectively extract key features. When combined with deep learning models, they enhance the accuracy and efficiency of fault detection and preventive maintenance. Hui Li et al. [7] proposed a fault diagnosis method for converter transformers based on multi-strategy-improved MFCCs and time convolutional networks (TCNs). This method considers the impact of load factors and employs an improved hunter–prey optimization algorithm (IHPO) and variational modal decomposition (VMD) for denoising acoustic signals. The S-transform is used for time–frequency conversion, and multi-strategy-improved MFCCs are utilized for feature extraction. Combined with an improved time convolutional network (ITCN), this method achieves effective fault diagnosis for converter transformers. In the area of multi-fault diagnosis, Cabrera et al. [8] introduced a fault diagnosis method based on improved MFCCs and their derivatives. This method uses deep learning models, including LSTM, BiLSTM, and CNN, to efficiently classify a total of 30 faults in centrifugal pumps and reciprocating compressors. By leveraging the pseudo-periodicity of fault signals for block-wise classification and applying an efficient data augmentation approach for vibration signals, the method demonstrates good adaptability and reliability in industrial applications.
Currently, state monitoring and fault diagnosis methods for chemical equipment based on deep learning have been applied to devices such as pumps [9,10,11,12], heat exchangers [13], fans [14,15,16], fluidized beds [17,18,19,20], and pipelines [21,22,23,24], showing considerable potential. Against this backdrop, methods based on deep learning have become a primary choice and trend.
Returning to the distillation process, Su Ming et al. [25] employed the support vector machine (SVM) method to establish a soft sensor model for viscosity in vacuum distillation. Utilizing a differential evolution algorithm’s search strategy, they constructed a model with high prediction accuracy and excellent trend-tracking performance. However, the SVM method has a slower learning speed and is prone to getting stuck in local minima. Pan Hongfang et al. [26] proposed an extreme learning machine method based on a wavelet kernel function, applying it to the soft sensor modeling problem in acetic acid distillation. Simulation results demonstrated that this algorithm’s learning speed was 92 times faster than that of the SVM while also improving the method’s accuracy and generalization capability.
In the field of fault diagnosis for distillation columns, Oeing et al. [27] leveraged a dataset comprising photographs capturing instances of liquid flooding processes occurring 35 times over a 17 h period within an extraction column. Employing a convolutional neural network (CNN) coupled with class activation mapping, they identified key regions in the images influencing the network’s performance significantly. Their study underscores the practicality and effectiveness of employing deep learning methodologies for predicting and monitoring operational statuses and faults in chemical processing equipment.
In the realm of complex reactive distillation processes, Ge et al. [28] devised a method to transform 14 operational variables and temporal information representing 13 types of faults into two-dimensional features. These features were then utilized to train and validate a deep CNN for monitoring and diagnosing the operation status of reactive distillation equipment. The highest accuracy achieved across various CNN architectures during training reached 91.31%. This underscores the efficacy of utilizing both temporal and spatial information as inputs to CNNs for diagnosing and detecting operational states in reactive distillation equipment.
The residual network (ResNet) introduces a structural innovation of skip connections on top of the traditional CNN architecture. This design aims to enhance the training process of deep networks, improve overall network performance, and accelerate convergence rates [29]. By incorporating skip connections, ResNet effectively addresses challenges commonly encountered during the training of deep networks within the CNN framework.
In this study, ResNet was chosen for acoustic signal classification due to its proven ability to handle the vanishing gradient problem that often plagues deep neural networks. ResNet’s architecture allows for the construction of very deep networks without the degradation in performance that typically accompanies increased depth. This capability is particularly advantageous for complex tasks such as fault detection in distillation columns, where subtle differences in acoustic signals must be accurately identified.
In this study, acoustic signals from the distillation column were collected under different operating conditions, including normal operation, three fault states (flooding, dry tray, and leakage), and the three corresponding potential fault states. These acoustic signals were then converted into MFCC feature maps through a series of signal processing and feature extraction techniques, and a corresponding dataset was generated. Finally, a classification method based on a residual network was proposed, trained, and tested on the dataset. The results indicated that the distillation column potential fault detection technology based on deep learning and acoustic signal analysis demonstrated good performance, effectively ensuring the safe and stable operation of the distillation process.

2. Experiments and Signal Acquisition

A laboratory-scale sieve tray column was employed in this study, and the parameters of the distillation column are provided in Table 1. The distillation apparatus used in this experiment is shown in Figure 1. The sound acquisition equipment used was an ICD-UX565 recorder (SONY, Guangzhou, China), with a sampling frequency range of 1–44,100 Hz, capable of covering the audible sound range. The equipment was placed on the outside of the distillation column, at a distance of 2 mm from the column body. The upper and lower sections of the distillation column were equipped with viewports, allowing for the observation of the relative motion state of the fluid inside the column during operation.
The experiment was conducted at ambient temperature and pressure. The system within the column was chosen to be a 15% ethanol–water solution [30]. This selection aimed to simulate the purification process of the original fermentation broth of bioethanol, thereby better mimicking the actual production process and obtaining results with higher reliability and comparability.
Initially, approximately 30.0 L of feed liquid was added to the feed tank and thoroughly stirred with a feed pump to ensure uniformity. The specific gravity of the feed was recorded to calculate the concentration, and the sample was set aside for later use. Then, a certain amount of feed liquid was added to the reboiler using the feed pump. After fully opening the cooling water valves of the top condenser, the heating power switch of the reboiler was turned on, and the voltage was set to 150 V. The two-phase contact state within the column was observed through the viewports. Once the reflux liquid began to appear at the top of the column, the temperature of the reflux liquid was maintained, and operational parameters such as the top temperature, reboiler temperature, reflux flow rate, and reflux temperature were recorded every minute while continuing to observe the two-phase contact situation within the column. After the operation of the entire column stabilized, the specific gravity of the top product was measured and recorded at room temperature. When the ethanol mass fraction of the top product reached over 95% and remained stable, the passive acoustic signals of the equipment were collected. Subsequently, the heating voltage of the reboiler was continuously increased to raise the gas velocity within the column and change the operating state of the distillation column. The operating parameters of the columns are shown in Table 2.
The instrument for collecting passive acoustic signals is shown in Figure 2. The recording equipment was fixed on the outside of the distillation column, at a distance of 1–2 mm from the column. As the temperature of the column was high during operation, this distance not only ensured that the sensor would not exceed the normal operating temperature range and thus be damaged during the acquisition process but also effectively avoided the non-acoustic interference caused by the sensor being close to the equipment. As illustrated, the distillation column was equipped with upper and lower viewports that were made of glass, which could withstand the operating pressure and temperature of the distillation process and facilitated the direct observation of the gas–liquid two-phase interaction during the acquisition process. The acquisition method shown in the figure could accurately collect the passive acoustic signals of the target distillation column under different operating conditions and reduce the influence of environmental noise.
The behavior of bubbles can be categorized into three types: bubble formation, bubble coalescence, and bubble breakup. Bubble formation refers to the creation of bubbles when the ascending gas phase enters the liquid phase on the upper tray through the sieve holes. Bubble coalescence occurs when two or more bubbles merge within the liquid phase to form a new bubble. Bubble breakup is the process in which a single bubble breaks into multiple bubbles due to external forces within the liquid or on the liquid surface. Different bubble behaviors result in varying amounts of energy being transferred to the external environment by the gas–liquid interactions, causing differences in the information carried by the passive acoustic signals.
When the reboiler heating voltage reached 150 V, it was observed that the liquid layer height on the sieve trays gradually increased to 30 mm and remained stable. From the top window, it was observed that the bubble diameter in the liquid phase ranged between 8 and 15 mm, indicating that the distillation column was in a normal mass transfer state. As shown in Figure 3a, the generation, coalescence, and breakup of bubbles occurred almost simultaneously. The passive acoustic signal collected at this time was classified as N1. When the reboiler heating voltage stabilized at 170 V, the liquid layer height on the sieve trays further increased. Due to the increased gas flow rate, the gas velocity through the perforations also increased, leading to a faster necking and breakup of gas bubbles. Consequently, the bubble diameter decreased, and the number increased, with bubble generation and breakup dominating and bubble coalescence in the liquid phase being almost unobservable. More liquid was carried to the upper trays, as shown in Figure 3c, indicating entrainment flooding. The passive acoustic signal collected at this time was classified as F1.
However, as the heating voltage gradually increased from 150 V to 170 V, the distillation column began to deviate from the normal mass transfer state. Observations from the top window revealed that entrainment started to occur during this phase. However, based on the top column temperature and the reflux flow rate, the mass transfer process inside the column was not severely affected. Further observations from the top window indicated that although the bubble size did not significantly differ from normal mass transfer, the liquid layer height on the sieve trays increased to 40–50 mm, as shown in Figure 3b. Bubble generation increased, and bubble breakage on the liquid surface intensified, potentially leading to different acoustic characteristics. The passive acoustic signal collected in this state was labeled as T1, indicating a potential flooding fault.
When the reboiler heating voltage was increased to 190 V, observations from the top window showed that the liquid layer height on the sieve trays rapidly decreased until it disappeared, with the light component vapor accumulating at the top and upper trays drying out. As shown in Figure 3e, as the liquid phase decreased, bubble generation almost ceased, with bubble coalescence becoming the dominant behavior. The passive acoustic signal collected at this time was classified as F2. Similarly, as the voltage increased, the distillation column did not undergo a sudden state change, as shown in Figure 3d; the state before tray drying was classified as the potential dry tray fault state T2.
To simulate a heating failure due to reasons such as circuit aging or short-circuiting, the reboiler heating voltage was significantly reduced. In the early stages of voltage reduction, the top window revealed the situation shown in Figure 3f, with liquid droplets beginning to accumulate below the upper trays but without a significant operational fault. The acoustic signal collected at this time was classified as T3, indicating a potential leakage fault. Over time, the bottom window showed that the condensed droplets began to flow down the inner wall of the distillation column, as shown in Figure 3g. Due to the reduced gas velocity, the bubble diameter significantly increased, and the generation and breakup process slowed down, with bubble coalescence becoming unobservable. The passive acoustic signal collected at this time was classified as F3. The operating variables corresponding to each category are shown in Table 3.
The duration of each data sample was 3.0 s, and the number of valid samples collected for each category was as follows: class N1 included 674 samples; class T1 included 702 samples; class F1 included 711 samples; class T2 included 727 samples; class F2 included 695 samples; class T3 included 236 samples; and class F3 included 379 samples.

3. Signal Analysis Method

The collected samples were in audio file format and could not be directly analyzed. Therefore, conversion from audio to data was needed, using the steps shown in Figure 4.

3.1. Analog-to-Digital Conversion

Modern electronic devices and computer systems operate on and process digital signals, whereas the signals obtained by the acquisition equipment are analog; it is therefore first necessary to perform analog-to-digital conversion (ADC) [31]. This conversion allows the signal to be processed, stored, transmitted, and analyzed within digital systems. The raw audio signals obtained in the experiment were subjected to 16-bit ADC operations, resulting in the outcome shown in Figure 5. It can be observed that the signal after ADC is converted into a series of sample points, preserving the original dynamic range while presenting discretization. The collected acoustic signals had a sound pressure level range from 45 to 75 decibels.
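As a minimal sketch of the 16-bit quantization step described above (the paper does not provide code; the test tone, amplitude, and sampling rate below are illustrative, not from the experiment):

```python
import numpy as np

def quantize_16bit(x):
    """Quantize a normalized analog-like waveform (values in [-1, 1])
    into 16-bit signed integer sample points, as a 16-bit ADC would."""
    x = np.clip(x, -1.0, 1.0)
    return np.round(x * 32767).astype(np.int16)

# Illustrative example: a 1 kHz tone sampled at 44.1 kHz for 10 ms
t = np.arange(0, 0.01, 1 / 44100)
analog = 0.5 * np.sin(2 * np.pi * 1000 * t)
digital = quantize_16bit(analog)
```

After quantization, the continuous waveform becomes a discrete series of integer sample points while the relative dynamic range is preserved, matching the discretization visible in Figure 5.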

3.2. Monaural Separation

At this point, it is evident that the signal has both left and right channels, which complicates subsequent signal processing, leading to increased costs in data transmission and hardware, as well as issues such as channel imbalance due to inconsistent data [32,33]. However, the left and right channel signals exhibit a correlation and possess sufficient passive acoustic characteristics. Figure 6 presents the signals after monaural separation, demonstrating that both monaural signals exhibit similar trends and characteristics. Therefore, analyzing one of the monaural signals suffices for research purposes. Additionally, it can be observed that the left channel contains richer temporal information, as evidenced by more pronounced changes in relative intensity. Thus, the left channel was selected for further analysis.

3.3. Signal Enhancement and Denoising

Next, the signals were further processed using pre-emphasis and Hampel filtering techniques. Firstly, pre-emphasis was applied to adjust the spectral distribution of the signal, enhancing the high-frequency components that were more susceptible to attenuation during signal acquisition and transmission, making them more prominent throughout the signal. The pre-emphasis function is shown in Equation (1).
y(n) = x(n) − α·x(n − 1),
where x(n) represents the signal at the n-th sampling point, and y(n) denotes the signal after pre-emphasis. The pre-emphasis coefficient α is typically set to 0.97 [34].
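Equation (1) can be sketched directly in code (a minimal illustration; the constant DC test signal below is only for demonstrating the high-pass effect):

```python
import numpy as np

def pre_emphasis(x, alpha=0.97):
    """Apply the pre-emphasis filter of Equation (1):
    y(n) = x(n) - alpha * x(n - 1), boosting high-frequency content."""
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    y[0] = x[0]                      # first sample has no predecessor
    y[1:] = x[1:] - alpha * x[:-1]
    return y

signal = np.array([1.0, 1.0, 1.0, 1.0])   # constant (low-frequency) signal
emphasized = pre_emphasis(signal)          # DC content is strongly attenuated
```

On a constant signal, every output after the first sample collapses to (1 − α) = 0.03, showing how low-frequency content is suppressed relative to high-frequency changes.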
Subsequently, Hampel filtering, based on the median absolute deviation estimation method, was applied to attenuate impulsive noise in the signal, ensuring that the final passive acoustic signal achieves higher quality and accuracy. The expression for the Hampel filter is given in Equation (2).
y(i) = median(x(i − k0), …, x(i + k0)),
Here, k0 represents the window size. If the input value deviates from the median of the window by more than a selected threshold nσ, the median is used as the output; otherwise, the original value is output. Figure 7 shows the MFCC spectrogram of a passive acoustic signal after Hampel filtering. It can be observed that with the application of the Hampel filter, the intensity of anomalies in the audio is significantly reduced. The peak intensity of the noise was reduced by 59.83%.
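The Hampel rule of Equation (2) can be sketched as follows (an illustrative implementation; the window half-width, threshold, and toy signal are assumptions, not the paper's settings):

```python
import numpy as np

def hampel_filter(x, k0=3, n_sigma=3.0):
    """Hampel filter of Equation (2): within a sliding window of half-width
    k0, replace a sample with the window median when it deviates from that
    median by more than n_sigma times the scaled median absolute deviation."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    L = 1.4826  # makes the MAD a consistent estimate of a Gaussian sigma
    for i in range(len(x)):
        lo, hi = max(0, i - k0), min(len(x), i + k0 + 1)
        window = x[lo:hi]
        med = np.median(window)
        mad = L * np.median(np.abs(window - med))
        if np.abs(x[i] - med) > n_sigma * mad:
            y[i] = med                 # replace the outlier with the median
    return y

noisy = np.array([0.1, 0.2, 0.1, 9.0, 0.2, 0.1, 0.2])  # one impulsive spike
clean = hampel_filter(noisy)
```

The impulsive spike is replaced by the local median while the remaining samples pass through unchanged, which is the behavior exploited here to suppress impulsive noise in the acoustic signal.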

3.4. Fast Fourier Transform and MFCC Extraction

The signals after monaural separation remained in the time domain. Figure 8 shows the time-domain plots of signals from seven categories. It can be observed that in the time domain, the signals of the different states show relatively subtle differences due to the similar interactions between gas and liquid phases inside the column. This similarity makes it difficult to achieve accurate recognition and classification of these different states based solely on time-domain features. It is evident that the features contained in the time domain for different categories cannot fully meet the requirements of classification tasks [35]. Therefore, it is necessary to transform the time-domain signals into frequency-domain signals through the steps of FFT (fast Fourier transform) and MFCC extraction.
Firstly, the sound signals after frame segmentation and windowing underwent FFT. The resulting spectrogram represents the time–frequency characteristics of the signal after overlapping the frames. The FFT is defined in Equation (3).
G(k1) = Σ_{n=0}^{N_F−1} g(n) e^{−j2πn·k1/N_F},   0 ≤ k1 ≤ N_F − 1,
where G(k1) represents the amplitude and phase information of the signal at frequency k1, g(n) denotes the time-domain information at time n, and NF is the number of samples. Next, the energy spectrum E(k1) is calculated using Equation (4).
E(k1) = (1/N_F) |G(k1)|²,
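Equations (3) and (4) can be sketched for a single windowed frame (a minimal illustration; the frame length, window type, and test tone are assumptions for demonstration):

```python
import numpy as np

def energy_spectrum(frame):
    """Equations (3)-(4): FFT of one windowed frame, then the
    energy spectrum E(k) = |G(k)|^2 / N_F."""
    NF = len(frame)
    G = np.fft.fft(frame)            # Equation (3)
    return np.abs(G) ** 2 / NF       # Equation (4)

# Illustrative frame: a Hann-windowed sinusoid at FFT bin 32
frame = np.hanning(256) * np.sin(2 * np.pi * 32 * np.arange(256) / 256)
E = energy_spectrum(frame)
```

The energy spectrum peaks at the bin of the test tone, and by Parseval's theorem its sum equals the total energy of the frame, so no signal energy is lost in the transform.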
The computed energy spectrum describes the characteristics of the audio signal in both the time and frequency domains. Using Mel filters, the frequencies in the frequency domain were converted into Mel frequencies to obtain the Mel spectrum. The calculation formula for the Mel filter is presented in Equation (5).
H_m(k2) = (k2 − f(m−1)) / (f(m) − f(m−1)),   f(m−1) ≤ k2 ≤ f(m)
H_m(k2) = (f(m+1) − k2) / (f(m+1) − f(m)),   f(m) ≤ k2 ≤ f(m+1)
H_m(k2) = 0,   otherwise,
where N_Mel denotes the number of Mel filters in the Mel filter bank, k2 represents the k2-th point within an FFT frame, and f(m) denotes the boundary frequency bin of the m-th Mel filter. The variable m ranges from 1 to N_Mel. The computation of the Mel spectrum from the energy spectrum of the passive acoustic signal using Mel filters is given by Equation (6).
MelSpec(m) = Σ_{k2=f(m−1)}^{f(m+1)} H_m(k2) E(k2),
where MelSpec(m) represents the Mel spectrum at Mel-frequency bin m, E(k2) denotes the energy spectrum at frequency bin k2, and Hm(k2) represents the response of the Mel filter m at frequency bin k2.
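A minimal construction of the triangular Mel filter bank of Equation (5) is sketched below (the filter count, FFT size, and sampling rate are illustrative assumptions, not values stated in the paper):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_mel, n_fft, sample_rate):
    """Build the triangular Mel filters of Equation (5). Returns an
    (n_mel, n_fft // 2 + 1) matrix; row m holds filter H_m over FFT bins."""
    # Boundary frequencies f(0)..f(n_mel+1), equally spaced on the Mel scale
    mel_points = np.linspace(hz_to_mel(0), hz_to_mel(sample_rate / 2), n_mel + 2)
    f = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    H = np.zeros((n_mel, n_fft // 2 + 1))
    for m in range(1, n_mel + 1):
        for k in range(f[m - 1], f[m]):          # rising slope
            H[m - 1, k] = (k - f[m - 1]) / (f[m] - f[m - 1])
        for k in range(f[m], f[m + 1]):          # falling slope
            H[m - 1, k] = (f[m + 1] - k) / (f[m + 1] - f[m])
    return H

H = mel_filter_bank(n_mel=26, n_fft=512, sample_rate=44100)
```

Each row is a triangle rising from f(m−1) to a peak at f(m) and falling to f(m+1); multiplying the energy spectrum by this matrix and summing per row implements Equation (6).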
After the aforementioned calculation, the logarithm of each filter output was taken for every frame, yielding the log-Mel spectrum Sm. Then, a discrete cosine transform (DCT) was applied to the log-Mel spectrum to obtain the values of the MFCCs. The computation process is given by Equation (7) as follows:
Value(n) = Σ_{m=0}^{M−1} S_m cos(πn(m + 1/2)/M),   0 ≤ n ≤ M − 1,
Finally, the obtained MFCCs were sorted in a time series, and after applying color mapping, the MFCC spectrogram of the passive acoustic signals was generated. In the spectrogram, the rows represent time, and the columns represent the values of the MFCCs.
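The DCT step of Equation (7) can be sketched as follows (a minimal illustration implementing the cosine sum directly; the flat four-band log-Mel spectrum is a toy input):

```python
import numpy as np

def mfcc_from_log_mel(S):
    """Equation (7): discrete cosine transform of the log-Mel spectrum S
    (length M), producing one MFCC vector per frame."""
    M = len(S)
    n = np.arange(M)[:, None]          # MFCC index
    m = np.arange(M)[None, :]          # Mel band index
    basis = np.cos(np.pi * n * (m + 0.5) / M)
    return basis @ S

log_mel = np.log(np.array([2.0, 2.0, 2.0, 2.0]))  # flat toy spectrum
coeffs = mfcc_from_log_mel(log_mel)
```

For a flat spectrum, only the zeroth coefficient is nonzero, illustrating how the DCT compacts the spectral envelope into a few leading coefficients; stacking these vectors over time and color-mapping them yields the MFCC spectrogram.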

4. Classification Method

4.1. Data Splitting

In this study, MFCC spectrogram data were integrated into an image repository, each entry associated with its corresponding labels and paths. The MFCC spectrogram data were then divided into three subsets: the training set, validation set, and test set. The dataset allocation adhered to predefined proportions, with 60% designated for the training set, 20% for the validation set, and the remaining 20% for the test set. This distribution strategy ensures an effective framework for training, fine-tuning, and comprehensive evaluation of the model, thereby facilitating an in-depth assessment of the model’s capabilities and overall performance.
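The 60/20/20 split described above can be sketched as follows (an illustrative implementation; the file names, labels, and random seed are assumptions for demonstration):

```python
import numpy as np

def split_dataset(paths, labels, seed=0):
    """Shuffle a labelled image list and split it 60/20/20 into
    training, validation, and test index sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(paths))
    n_train = int(0.6 * len(paths))
    n_val = int(0.2 * len(paths))
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

# Hypothetical repository entries (names are illustrative)
paths = [f"mfcc_{i}.png" for i in range(100)]
labels = ["N1"] * 100
train, val, test = split_dataset(paths, labels)
```

Shuffling before splitting keeps the three subsets disjoint while drawing them from the same distribution, which supports the training/fine-tuning/evaluation framework described above.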

4.2. Network Architecture

The network architecture plays a crucial role in the recognition and feature extraction of MFCC spectrograms. We used a network model based on the classical ResNet-18 architecture, which is known for its compact size and exceptional performance, making it a preferred choice in various research and practical applications [36]. The basic configuration of ResNet is depicted in Figure 9.
The network comprises three primary sections. The initial section, referred to as low-level feature extraction, includes an input layer, a convolutional layer, and a pooling layer. Here, the MFCC spectrogram data from the dataset are input into the network. Through convolutional operations and pooling, low-level image features are extracted and forwarded to the subsequent high-level feature extraction section.
The core of the ResNet model constitutes the high-level feature extraction section, which consists of three residual blocks. Each residual block contains multiple convolutional layers interconnected with skip connections spanning these layers, as depicted in Figure 10.
Each residual block is composed of several residual layers, each incorporating convolution operations, i.e., local correlations computed over a sliding window. Training proceeds accordingly, with the training configurations detailed in Table 4.
The mini-batch size, maximum number of iterations, and the initial learning rate were set, and the Adam optimizer was chosen to update model parameters. The validation frequency was determined during training. If the current validation accuracy surpassed all previous records, the model was saved as the optimal model, and we proceeded to the next iteration. If not, we continued to the next iteration until reaching the maximum number of iterations. Upon completing the training process, the optimal network model was established along with its validation accuracy. Subsequently, five different test sets were randomly selected, each comprising 20% of the original dataset, and the trained residual network model was evaluated on these test sets.
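The best-checkpoint policy described above can be sketched in framework-agnostic form (a minimal illustration; the `step` and `evaluate` stand-ins below are toy placeholders, not the actual ResNet training code):

```python
def train_with_best_checkpoint(n_iterations, validation_every, evaluate, step):
    """Sketch of the checkpointing policy described above: run training
    steps, periodically compute validation accuracy, and retain the model
    state with the best validation accuracy seen so far."""
    best_acc, best_state = -1.0, None
    state = {"iteration": 0}
    for it in range(1, n_iterations + 1):
        state = step(state)                  # one optimizer update
        if it % validation_every == 0:
            acc = evaluate(state)
            if acc > best_acc:               # save only on improvement
                best_acc, best_state = acc, dict(state)
    return best_state, best_acc

# Toy stand-ins: validation accuracy improves and then plateaus
step = lambda s: {"iteration": s["iteration"] + 1}
evaluate = lambda s: min(0.99, 0.5 + s["iteration"] / 400)
best_state, best_acc = train_with_best_checkpoint(500, 50, evaluate, step)
```

Saving only when validation accuracy improves means the returned model corresponds to the best-generalizing parameters rather than simply the last iteration.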

4.3. Evaluation Method

In this study, the deep learning model was evaluated using classification performance metrics and a confusion matrix. Classification performance metrics were calculated based on four key indicators: true-positive (TP), false-positive (FP), true-negative (TN), and false-negative (FN) values. True positives refer to the number of samples correctly classified as positive; false positives refer to the number of samples incorrectly classified as positive; true negatives refer to the number of samples correctly predicted as negative; and false negatives refer to the number of positive samples incorrectly predicted as negative.
The classification performance metrics included classification accuracy (Acc), precision, recall, and F1 score. Classification accuracy is the ratio of correctly predicted samples to the total number of samples, with higher Acc indicating stronger overall model performance; precision is the proportion of true positives among all positive predictions, with higher precision indicating lower error rates when dealing with positive samples; recall is the proportion of true positives among all actual positives, with higher recall indicating the better ability of the model to correctly identify the category; F1 score is the harmonic mean of recall and precision, with a higher F1 score indicating a better balance between coverage and recognition ability, leading to higher performance on datasets with different class sample sizes. The calculation of classification performance metrics is given by Equations (8)–(11).
Acc = (TP + TN) / (TP + TN + FP + FN),
Precision = TP / (TP + FP),
Recall = TP / (TP + FN),
F1 Score = 2·(Precision · Recall) / (Precision + Recall).
Considering accuracy, the higher it is, the stronger the model’s overall classification capability, i.e., the stronger it is able to correctly identify most samples. Regarding precision, the higher it is, the lower the error rate among the samples predicted as positive by the model. In terms of recall, the higher it is, the stronger the model’s ability to cover and identify true-positive (TP) values. With the F1 score, the higher it is, the better the balance the model achieves between precision and recall, indicating superior performance on datasets with varying sample sizes across different classes.
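Equations (8)–(11) translate directly into code (the confusion counts below are illustrative numbers, not results from the paper):

```python
def classification_metrics(tp, fp, tn, fn):
    """Equations (8)-(11): accuracy, precision, recall, and F1 score
    computed from the four confusion counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return acc, precision, recall, f1

# Illustrative counts for one class treated as "positive"
acc, precision, recall, f1 = classification_metrics(tp=90, fp=10, tn=95, fn=5)
```

Because F1 is the harmonic mean of precision and recall, it is pulled toward the smaller of the two, which is why it is the preferred summary on datasets with imbalanced class sizes such as the T3 and F3 categories here.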
In deep learning, a confusion matrix is a supervised learning tool used to summarize the prediction results of a classification model. It presents data in a matrix form, classifying records in the dataset based on the true and predicted categories and providing a visual analysis of the model’s recognition accuracy.
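A confusion matrix of the kind described above can be built in a few lines (a minimal sketch; the class labels and toy predictions below are illustrative):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, classes):
    """Build a confusion matrix: rows are true classes, columns are
    predicted classes; entry (i, j) counts samples of true class i
    that were predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[index[t], index[p]] += 1
    return cm

classes = ["N1", "T1", "F1"]
y_true = ["N1", "N1", "T1", "F1", "F1"]
y_pred = ["N1", "T1", "T1", "F1", "F1"]
cm = confusion_matrix(y_true, y_pred, classes)
```

Correct predictions accumulate on the diagonal, so off-diagonal entries immediately show which operating states the model confuses with one another.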

5. Results and Discussion

5.1. MFCC Feature Extraction

The images obtained after FFT and MFCC extraction are shown in Figure 11.
It can be seen that after the two-step transformation, the differences between the passive acoustic signals under different categories become more apparent, especially reflected in certain specific MFCC values, showing stable and clearly defined spectral lines. By identifying these features, it is possible to monitor the operating status of the distillation column.

5.2. Network Training

Figure 12 shows the changes in validation accuracy, batch accuracy, loss, and batch loss with the number of iterations during the training of the residual network. The figure clearly shows that the training process of the residual network is relatively smooth, with all metrics stabilizing between 400 and 500 iterations, indicating that the network parameters gradually converge.
As shown in Figure 12a, both the validation accuracy and batch accuracy gradually increased and leveled off, indicating that the model’s generalization ability improved. The loss and batch loss gradually decreased and stabilized, indicating that the model underwent optimization and reached its optimal state during training. These curves reflect the effectiveness and stability of the residual network in handling complex acoustic signal classification tasks.
The training process of the model showed notable variability in the early stages, reflecting the initial adjustments of the model parameters. In the first 20 iterations, batch accuracy increased from 21.88% to 24.88%, and the batch loss decreased from 1.99 to 1.97. This phase demonstrated the model’s initial learning and adaptation. As training progressed, a more consistent improvement in performance was observed. By the 100th iteration, accuracy had increased to 51.23%, and the loss had decreased significantly to 1.21, indicating the model’s enhanced ability to generalize from the training data.
Continuing through the mid-phase of training, from the 100th to the 200th iteration, the accuracy further increased to 71.23%, and the loss value dropped to 0.8139. This period marked substantial refinement in the model’s learning, as it effectively captured more complex patterns within the data. The latter stages of training, particularly between the 300th and 500th iterations, highlighted the model’s approach toward convergence. By the 300th iteration, the accuracy had reached 88.50%, with a loss value of 0.14.
In the final phase, from the 400th to the 500th iteration, the model’s performance approached near-perfect levels. Accuracy increased to 99.50%, with the loss stabilizing at around 0.01. This indicates successful convergence and a high degree of model performance. However, the near-zero loss value also suggests a need for caution to ensure that overfitting is avoided. The overall trend underscores the model’s robust learning capabilities and its potential for high accuracy in predictive tasks.
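The stability of this training process owes much to the residual (shortcut) connections. A toy NumPy forward pass illustrates the mechanism; the dense layers and sizes here are simplifications for illustration, not the paper’s convolutional architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = ReLU(F(x) + x): two weight layers plus an identity shortcut."""
    out = relu(x @ w1)    # first transformation
    out = out @ w2        # second transformation (no activation yet)
    return relu(out + x)  # shortcut addition, then activation

d = 8
x = rng.standard_normal((1, d))
w1 = rng.standard_normal((d, d)) * 0.1
w2 = rng.standard_normal((d, d)) * 0.1
y = residual_block(x, w1, w2)

# With zero weights the block reduces to ReLU(x): the identity path survives,
# which is why gradients propagate easily through many stacked blocks.
y_id = residual_block(x, np.zeros((d, d)), np.zeros((d, d)))
print(np.allclose(y_id, relu(x)))
```

Because the block only has to learn the residual F(x) on top of the identity, stacking many such blocks does not degrade the training signal, which is consistent with the smooth convergence seen in Figure 12.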

5.3. Network Performance Evaluation Metrics

The indicators obtained from the five tests are shown in Table 5.
The evaluation of network performance metrics across different categories (N1, T1, F1, T2, F2, T3, and F3) underscores the model’s classification capabilities and its consistency in handling various data types. Overall, the model demonstrated high accuracy across all categories, indicating robustness and reliability in classification tasks.
For the N1 category, the model achieved the highest accuracy at 99.73%, with precision and recall rates of 98.94% and 98.17%, respectively. These metrics suggest that the model performs exceptionally well in this category, achieving a near-perfect balance between correctly identifying positive instances and minimizing false positives and negatives.
In the T1 category, the model maintained a high accuracy of 99.69%, with precision and recall rates of 96.61% and 97.85%, respectively. The F1 score of 97.23% reflects strong performance, indicating the effective handling of this category with minimal errors.
In both the F1 and T2 categories, the model exhibited an accuracy rate of 99.01%. In the F1 category, the model had a precision of 96.56% and a recall of 97.54%, while in the T2 category, a precision of 97.53% and a recall of 96.87% were achieved. These metrics suggest that, although the model performs well, there is slight variation in its ability to consistently identify true positives across these categories.
The F2 category, with an accuracy of 98.84%, a precision of 95.35%, and a recall of 97.83%, indicates strong performance but highlights room for improvement in balancing precision and recall.
In the T3 and F3 categories, the model exhibited relatively lower accuracy and a weaker balance between precision and recall. In the T3 category, the model had an accuracy of 98.23%, with precision and recall rates of 95.73% and 93.99%, respectively, indicating higher rates of false positives and false negatives. Similarly, the F3 category, with an accuracy of 98.72%, a precision of 96.59%, and a recall of 95.59%, suggests that, although the model performs well, it could benefit from further optimization to balance precision and recall.
In summary, the model demonstrates robust classification performance across various categories, with N1 and T1 exhibiting the highest reliability. The consistency in high accuracy across all categories underscores the model’s overall effectiveness. However, specific categories, such as T3 and F3, indicate areas for potential improvement to achieve a more balanced performance in precision and recall.
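As a sanity check, each row of Table 5 can be recomputed from its counts. For N1, taking TP = 375 and TN = 3742, with FP = 4 and FN = 7 (the counts implied by the stated precision and recall):

```python
# N1 counts from Table 5 (FP/FN assigned to match the reported precision and recall)
tp, tn, fp, fn = 375, 3742, 4, 7

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"{accuracy:.2%} {precision:.2%} {recall:.2%} {f1:.2%}")
# → 99.73% 98.94% 98.17% 98.55%
```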
The visual analysis of the two confusion matrices, as shown in Figure 13, reveals that the model performs remarkably in classifying most categories, exhibiting high classification accuracy. Specifically, Figure 13a shows the distribution of the classifier’s results between the actual and predicted categories, with diagonal values representing correctly classified samples and off-diagonal values indicating misclassified samples. Figure 13b displays the percentage distribution of classification results, providing a more intuitive reflection of the model’s classification accuracy.
In the majority of categories, the model demonstrated high classification accuracy, where the performance was nearly perfect with almost no misclassification. However, there were still certain degrees of misclassification among some categories. This indicates that while the model possesses strong overall classification capabilities, there remains room for improvement in distinguishing specific categories.

6. Conclusions

This paper proposes a passive acoustic method based on deep learning to monitor potential fault states in distillation columns. Acoustic signals from the target distillation column in different operating states were collected, processed, and analyzed using the proposed method, with MFCC feature maps extracted. These feature maps were used as the dataset for training a residual network.
The trained residual network model achieved an average accuracy, precision, recall, and F1 score of 99.03%, 96.83%, 96.76%, and 96.79% on the test dataset, demonstrating excellent performance. Compared to the 97.25% model accuracy achieved by the MSVM method used in previous similar studies [4], this method shows a significant performance improvement of 1.78 percentage points. The use of MFCC spectrogram datasets eliminates the need for manual design and the selection of multiple features, highlighting the advantage of the proposed deep learning method in reducing the workload of feature engineering. Additionally, by diagnosing three different potential faults, the proposed method effectively predicted faults in upper flooding, lower flooding, and leakage conditions. This research not only improves the accuracy of fault detection but also reduces the complexity of manual feature design, making the technology more efficient for industrial applications.
Considering that actual distillation processes in production may be more complex, several recommendations for future work include the following:
Industrial application and promotion should be considered. Collaboration with industry can facilitate experimentation and validation in more real production scenarios, collecting more field data to optimize model parameters and advancing the application of this technology in real production environments.
Future work could extend the scope to monitoring and diagnosing more types of potential faults, enhancing the comprehensiveness of the system.
Data augmentation and enhancement should be considered in practical applications. The characteristics of infrequent faults are difficult to capture in actual production, but simulation experiments or data augmentation techniques can enhance the model’s robustness against these rare faults.
The operational states in actual distillation processes might be more finely divided, requiring feature extraction methods with higher performance. Exploring advanced noise reduction techniques could play a crucial role in this context.

Author Contributions

Conceptualization, H.W. and H.Z.; methodology, H.W.; software, H.Z.; validation, Z.Z., G.W. and H.W.; formal analysis, H.Z.; investigation, H.W.; resources, G.W.; data curation, H.Z. and Z.Z.; writing—original draft preparation, H.Z.; writing—review and editing, H.W.; visualization, H.Z.; supervision, H.W. and G.W.; project administration, H.W.; funding acquisition, H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 22308079; the Natural Science Foundation of Hebei Province, China, grant number B2022202008 and B2023202025; the Science and Technology Project of Hebei Education Department, China, grant number BJK2022037.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Sun, L.; Li, J.; Li, Q. Progress in technology of dividing wall column. Mod. Chem. Ind. 2008, 28, 38–41, 43. [Google Scholar]
  2. Zheng, N.; Luan, X.; Shardt, Y.A.W.; Liu, F. Dynamic-controlled principal component analysis for fault detection and automatic recovery. Reliab. Eng. Syst. Saf. 2024, 241, 109608. [Google Scholar] [CrossRef]
  3. Shahid, M.; Zabiri, H.; Taqvi, S.A.A.; Hai, M. Fault root cause analysis using degree of change and mean variable threshold limit in non-linear dynamic distillation column. Process Saf. Environ. Prot. 2024, 189, 856–866. [Google Scholar] [CrossRef]
  4. Zhang, Z.; Wang, G.; Yang, Z.; Yu, X.; Wang, H.; Gao, B.; Zheng, H.; Zhang, S.; Li, C. Acoustic Signal-Based Method for Recognizing Fluid Flow States in Distillation Columns. Ind. Eng. Chem. Res. 2022, 61, 17582–17592. [Google Scholar] [CrossRef]
  5. Madakyaru, M.; Harrou, F.; Sun, Y. Improved data-based fault detection strategy and application to distillation columns. Process Saf. Environ. Prot. 2017, 107, 22–34. [Google Scholar] [CrossRef]
  6. Ren, C.; Chen, M.; Cao, Y.; Huang, Z.; Wang, J.; Yang, Y. Application of passive acoustic emission measurement in chemical processes. Chem. Ind. Eng. Prog. 2011, 30, 918–929. [Google Scholar]
  7. Li, H.; Yao, Q.; Li, X. Voiceprint Fault Diagnosis of Converter Transformer under Load Influence Based on Multi-Strategy Improved Mel-Frequency Spectrum Coefficient and Temporal Convolutional Network. Sensors 2024, 24, 757. [Google Scholar] [CrossRef] [PubMed]
  8. Cabrera, D.; Medina, R.; Cerrada, M.; Sanchez, R.-V.; Estupinan, E.; Li, C. Improved Mel Frequency Cepstral Coefficients for Compressors and Pumps Fault Diagnosis with Deep Learning Models. Appl. Sci. 2024, 14, 1710. [Google Scholar] [CrossRef]
  9. Piri, J.; Pirzadeh, B.; Keshtegar, B.; Givehchi, M. Reliability analysis of pumping station for sewage network using hybrid neural networks—Genetic algorithm and method of moment. Process Saf. Environ. Prot. 2021, 145, 39–51. [Google Scholar] [CrossRef]
  10. Zhang, Y.; Li, J.; Xu, Y.; Zhou, P. Research on Sound Intensity Test and Sound Source Localization of Steam Turbine Driven Pumps. Noise Vib. Control 2020, 40, 194–197. [Google Scholar]
  11. Rui, X.; Liu, J.; Li, Y.; Qi, L.; Li, G. Research on fault diagnosis and state assessment of vacuum pump based on acoustic emission sensors. Rev. Sci. Instrum. 2020, 91, 025107. [Google Scholar] [CrossRef] [PubMed]
  12. Guo, C.; Gao, M.; Wang, J. Acoustic distribution study inside centrifugal pump impeller under different blade outlet angles using the Powell vortex sound theory. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2020, 234, 2595–2609. [Google Scholar] [CrossRef]
  13. Shi, L.; Xu, W.; Ma, X.; Wang, X.; Tian, H.; Shu, G. Thermal and acoustic performance of silencing heat exchanger for engine waste heat recovery. Appl. Therm. Eng. 2022, 201, 117711. [Google Scholar] [CrossRef]
  14. Zenger, F.; Herold, G.; Becker, S. Acoustic Characterization of Forward- and Backward-Skewed Axial Fans Under Increased Inflow Turbulence. AIAA J. 2017, 55, 1241–1250. [Google Scholar] [CrossRef]
  15. Kroemer, F.; Mueller, J.; Becker, S. Investigation of Aeroacoustic Properties of Low-Pressure Axial Fans with Different Blade Stacking. AIAA J. 2018, 56, 1507–1518. [Google Scholar] [CrossRef]
  16. Kroemer, F.; Czwielong, F.; Becker, S. Experimental Investigation of the Sound Emission of Skewed Axial Fans with Leading-Edge Serrations. AIAA J. 2019, 57, 5182–5196. [Google Scholar] [CrossRef]
  17. Sheng, T.; Zhang, P.; Huang, Z.; Yang, Y.; Sun, J.; Jiang, B.; Ding, X.; Wang, J.; Yang, Y.; Liao, Z. The screened waveguide for intrusive acoustic emission detection and its application in circulating fluidized bed. AIChE J. 2021, 67, e17118. [Google Scholar] [CrossRef]
  18. Lu, Y.; Kang, P.; Yang, L.; Hu, X.; Chen, H.; Zhang, R.; Zhou, Y.; Luo, X.; Wang, J.; Yang, Y. Multi-scale characteristics and gas-solid interaction among multiple beds in a dual circulating fluidized bed reactor system. Chem. Eng. J. 2020, 385, 123715. [Google Scholar] [CrossRef]
  19. Wang, J.; Jiang, B.; Yang, Y.; Shu, W. Multi-scale analysis of acoustic emissions and malfunction diagnosis in gas-solid fluidized bed. J. Chem. Ind. Eng. 2006, 57, 1560–1564. [Google Scholar]
  20. Hou, L.; Wang, J.; Yang, Y.; Hu, X. Frequency analysis of acoustic emission and application in gas-solid fluidized bed. J. Chem. Ind. Eng. 2005, 56, 1474–1478. [Google Scholar]
  21. Fan, H.; Tariq, S.; Zayed, T. Acoustic leak detection approaches for water pipelines. Autom. Constr. 2022, 138, 104226. [Google Scholar] [CrossRef]
  22. Wang, K.; Liu, G.; Li, Y.; Wang, J.; Wang, G.; Qin, M.; Yi, L. An investigation of the detection of acoustic sand signals from the flow of solid particles in pipelines. Chem. Eng. Res. Des. 2019, 144, 254–257. [Google Scholar] [CrossRef]
  23. Epiphantsev, B. An Acoustic Method for Diagnostics of the State of Underground Pipelines: New Possibilities. Russ. J. Nondestruct. Test. 2014, 50, 254–257. [Google Scholar] [CrossRef]
  24. Bernasconi, G.; Giunta, G. Acoustic detection and tracking of a pipeline inspection gauge. J. Pet. Sci. Eng. 2020, 194, 107549. [Google Scholar] [CrossRef]
  25. Ming, S.U.; Xiong, W.; Zhi-Hua, X. Product Viscosity Estimation of Vacuum Distillation Column Using Support Vector Machine. Microcomput. Inf. 2009, 25, 1. [Google Scholar]
  26. Pan, H.; Liu, A. Wavelet Kernel Extreme Learning Machine and Its Application in Soft Sensor Modeling of an Industrial Acetic Acid Distillation System. J. East China Univ. Sci. Technol. Nat. Sci. Ed. 2014, 40, 474–480. [Google Scholar]
  27. Oeing, J.; Neuendorf, L.M.; Bittorf, L.; Krieger, W.; Kockmann, N. Flooding Prevention in Distillation and Extraction Columns with Aid of Machine Learning Approaches. Chem. Ing. Tech. 2021, 93, 1917–1929. [Google Scholar] [CrossRef]
  28. Ge, X.; Wang, B.; Yang, X.; Pan, Y.; Liu, B.; Liu, B. Fault detection and diagnosis for reactive distillation based on convolutional neural network. Comput. Chem. Eng. 2021, 145, 107172. [Google Scholar] [CrossRef]
  29. Li, B.; He, Y. An Improved ResNet Based on the Adjustable Shortcut Connections. IEEE Access 2018, 6, 18967–18974. [Google Scholar] [CrossRef]
  30. Aditiya, H.B.; Mahlia, T.M.I.; Chong, W.T.; Nur, H.; Sebayang, A.H. Second generation bioethanol production: A critical review. Renew. Sustain. Energy Rev. 2016, 66, 631–653. [Google Scholar] [CrossRef]
  31. Wang, Y.; Zeng, Y.; Jin, X.; Yan, M. Analog to digital technics and their development trends. Semicond. Technol. 2003, 28, 7–10. [Google Scholar]
  32. Kim, M.; Choi, S. Monaural music source separation: Nonnegativity, sparseness, and shift-invariance. In Independent Component Analysis and Blind Signal Separation, Proceedings of the 6th International Conference, ICA 2006, Charleston, SC, USA, 5–8 March 2006; Rosca, J., Erdogmus, D., Principe, J.C., Haykin, S., Eds.; Springer: Berlin/Heidelberg, Germany, 2006; Volume 3889, pp. 617–624. [Google Scholar]
  33. Radfar, M.; Dansereau, R.; Chan, W. Monaural Speech Separation Based on Gain Adapted Minimum Mean Square Error Estimation. J. Signal Process. Syst. Signal Image Video Technol. 2010, 61, 21–37. [Google Scholar] [CrossRef]
  34. Qaisar, S. Isolated Speech Recognition and Its Transformation in Visual Signs. J. Electr. Eng. Technol. 2019, 14, 955–964. [Google Scholar] [CrossRef]
  35. Ma, Q.; Zou, Z. Traffic State Evaluation Using Traffic Noise. IEEE Access 2020, 8, 120627–120646. [Google Scholar] [CrossRef]
  36. Shafiq, M.; Gu, Z.Q. Deep Residual Learning for Image Recognition: A Survey. Appl. Sci. 2022, 12, 8972. [Google Scholar] [CrossRef]
Figure 1. Distillation equipment used in this study.
Figure 2. Schematic diagram of the signal acquisition device.
Figure 3. Fluid flow states in the column under different operating conditions: (a) normal; (b) potential fault 1; (c) flooding; (d) potential fault 2; (e) dry tray; (f) potential fault 3; (g) weeping.
Figure 4. Progress of signal processing.
Figure 5. (a) Original signal and (b) signal after ADC.
Figure 6. (a) Left channel after ADC and (b) right channel after ADC.
Figure 7. The MFCC spectrograms before (a) and after (b) Hampel filtering.
Figure 8. Time-domain spectrograms under different operating conditions: (a) normal; (b) potential fault 1; (c) flooding; (d) potential fault 2; (e) dry tray; (f) potential fault 3; (g) weeping.
Figure 9. The structure of the residual network.
Figure 10. Schematic of the residual block.
Figure 11. MFCC spectrograms under different operating conditions: (a) normal; (b) potential fault 1; (c) flooding; (d) potential fault 2; (e) dry tray; (f) potential fault 3; (g) weeping.
Figure 12. (a) Accuracy and (b) loss curves during the training process of the residual network model.
Figure 13. (a) The confusion matrix and (b) normalized confusion matrix obtained through the CNN model.
Table 1. Parameters of distillation columns.
Column height: 2230 mm
Tray type: Sieve tray
Tray spacing: 100 mm
Number of trays: 9
Column diameter: 50 mm
Downcomer diameter: 5 mm
Sieve pore diameter: 2 mm
Free area fraction of trays: 8.7%
Table 2. Operating parameters of the columns.
Maximum reboiler duty: 3.19 kW
Heater voltage: 150–190 V
Operation state: Total reflux
F factor: 0.25–0.4 (m/s)·(kg/m³)^0.5
Gas flow velocity: 3.0–4.6 m/s
Table 3. Operating variables corresponding to each category.
Data Class | Liquid Level | Top Temperature | Bubble Diameter | Heater Voltage
N1 | 30–50 mm | 78.0 °C | 8–15 mm | 150 V
T1 | 45–55 mm | 78.1 °C | 5–15 mm | 150–170 V
F1 | 50–90 mm | 78.2 °C | 5–7 mm | 170 V
T2 | 25–35 mm | 78.2 °C | 10–15 mm | 180 V
F2 | 0–5 mm | 78.5 °C | — | 190 V
T3 | 30–45 mm | 78.4 °C | 5–15 mm | 90 V
F3 | 20–30 mm | 79.2 °C | 10–15 mm | 90 V
Table 4. The initialization parameters of the residual network.
Optimizer: Adam
Kernel size: 7 × 7
Kernel size (residual layer): 3 × 3
Pooling layer: 2 × 2 max pooling
Mini-batch size: 56
Learning rate: 0.0001
Max epochs: 740
Activation function: ReLU
Table 5. Metrics obtained through the model.
Category | TP | TN | FP | FN | Accuracy | Precision | Recall | F1 Score
N1 | 375 | 3742 | 4 | 7 | 99.73% | 98.94% | 98.17% | 98.55%
T1 | 228 | 3887 | 8 | 5 | 99.69% | 96.61% | 97.85% | 97.23%
F1 | 674 | 3413 | 24 | 17 | 99.01% | 96.56% | 97.54% | 97.05%
T2 | 711 | 3376 | 18 | 23 | 99.01% | 97.53% | 96.87% | 97.20%
F2 | 677 | 3403 | 33 | 15 | 98.84% | 95.35% | 97.83% | 96.58%
T3 | 672 | 3383 | 30 | 43 | 98.23% | 95.73% | 93.99% | 94.85%
F3 | 651 | 3424 | 23 | 30 | 98.72% | 96.59% | 95.59% | 96.09%