1. Introduction
Rail containers and tank-cars are widely used for liquid freight transport and storage. Upon their arrival by railway, the containers are usually detached from the locomotives and await the next operation, while the locomotives are immediately driven away to perform other work. In large factories and rail-yards, this gap between container arrival and its processing makes it difficult for the workers to keep track of which containers need emptying and which are ready for re-use. Currently, the problem is usually addressed by a manual check, which can be extremely inefficient when a large number of emptied and filled containers are mixed together in the yard. It also takes a long time to empty a container due to its huge capacity. This makes it challenging, especially for high-viscosity content, to tell whether the container has been fully emptied. In such cases manual checking can be rather subjective, leading to errors in fill-state estimation. All this results in decreased operational efficiency. Therefore, a remote and automatic fill-state detection system is of great interest to the relevant industries. In addition to a high detection accuracy, a low complexity in installation and maintenance is of great importance in this scenario. Primarily, the focus is on empty/full detection. Estimating the level is of added value; however, here, a generous tolerance in the level resolution is acceptable. Instead of an exact prediction of the level, it would suffice, for the purposes of the freight industry, to roughly quantify the level of the content according to discrete (quantised) fill-states.
There have been many studies on liquid level detection. Some popular methods using optical-fiber liquid level sensors [
1], optoelectronic level detection [
2], impedance change [
3], and a few sound reflection-based methods [
4,
5,
6] require direct access to the liquid. Techniques based on Helmholtz resonance [
7] put strong restrictions on the shape of the container (presence of an open, constricted neck). Requirements such as direct access to the contents or openings in the container are difficult to meet in industrial practice, and can even be dangerous for flammable or toxic contents. This limits the application scenarios of these systems.
In contrast, acoustic methods can provide non-intrusive solutions for this problem. The most common acoustics-based fill-state detection set-ups for liquid containers are summarised in
Figure 1. The penetrative method in
Figure 1a requires installing a pair of acoustic transducers on opposite sides of the outer surface of the container. When an acoustic signal is applied to the container by the transmitter, the wave propagates through either air (when the liquid level is below the sensor) or liquid (when the liquid level is above the sensor). Since different media have different acoustic attenuation characteristics, a change of the transmission medium between the sensors can be detected by analysing the energy loss in the process, which indicates a change of the liquid level. This method is more appropriate for small containers such as those for food products [
8]. The larger the container capacity, the more the energy required to produce a suitable excitation, since the signal needs to be strong enough to be detected by the receiver after attenuation along the propagation path. This may limit the portability and applicability of such methods in the freight industry.
Alternatively, a single transceiver may be used, as is mostly exploited in pulse-echo analysis. There are two ways to employ the pulse-echo analysis for liquid level detection. The first one, as shown in
Figure 1b, is based on the echo from the metal-air or metal-liquid interface, i.e., the inside of the container wall [
9,
10]. The decision criterion is based on the transmitted-reflected signal energy ratio. The ratio depends on the material characteristics at both sides of the interface. Therefore, a different energy ratio indicates a different material behind the container wall. The method in [
11] had a similar setup but investigated the relationship between the content at the sensor height and the duration of the echoes.
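As a brief illustration of why the energy ratio depends on the material behind the wall, the standard plane-wave result for normal incidence can be written as follows. This is a textbook idealisation and not necessarily the exact criterion used in the cited works; the symbols $Z_1$ (characteristic acoustic impedance of the wall material) and $Z_2$ (that of the medium behind the wall) are introduced here for illustration only.

```latex
% Pressure reflection coefficient at a planar interface, normal incidence
R = \frac{Z_2 - Z_1}{Z_2 + Z_1},
\qquad
% Fraction of the incident energy that is reflected
\frac{E_\mathrm{reflected}}{E_\mathrm{incident}} = R^2 .
```

With rough, assumed impedance values of about 46 MRayl for steel, 1.5 MRayl for water and 0.0004 MRayl for air, a steel-air interface reflects practically all of the incident energy ($R^2 > 0.999$), whereas a steel-water interface reflects roughly 88% ($R^2 \approx 0.88$). It is this difference that an energy-ratio criterion can exploit.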
The other way of pulse-echo analysis is to capture the reflection from the air-liquid interface as shown in
Figure 1c. The transducer can be installed at the bottom [
11,
12] or the top of the container [
13]. Given the speed of sound in air or the liquid, the level heights can be calculated from the time-of-flight [
12,
13]. Commercial products such as [
14,
15] are already available in the market. To improve the robustness of such methods in noisy conditions, the signals can be preprocessed, e.g., by beamforming with a sensor array as in [
5] or by denoising using, e.g., Kalman filtering as in [
13]. This method also requires the sensor axis to be perpendicular to the liquid surface, i.e., the sensor must be mounted at the top or the bottom of the container. This could be an issue for rail cars, where access to the underside of the container/wagon may not be easy.
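The time-of-flight relation used by these methods can be sketched as follows. This is a minimal illustration, not the implementation of any cited work; the function name and its parameters are hypothetical, and the speed of sound must be supplied by the user (for example, roughly 343 m/s in air and roughly 1480 m/s in water at room temperature).

```python
def level_from_time_of_flight(tof_s, speed_m_s, tank_height_m=None):
    """Estimate the liquid level from a pulse-echo round-trip time.

    Hypothetical helper for illustration only.
    - Bottom-mounted transducer (tank_height_m is None): the pulse travels
      through the liquid, so the level equals the one-way distance.
    - Top-mounted transducer (tank_height_m given): the pulse travels through
      the air gap, so the level is the tank height minus that distance.
    """
    # The echo covers the transducer-to-surface distance twice.
    one_way_distance_m = speed_m_s * tof_s / 2.0
    if tank_height_m is None:
        return one_way_distance_m
    return tank_height_m - one_way_distance_m


if __name__ == "__main__":
    # Bottom-mounted sensor, assumed speed of sound in water of 1480 m/s.
    print(level_from_time_of_flight(0.002, 1480.0))
    # Top-mounted sensor in a 2 m tall tank, assumed 343 m/s in air.
    print(level_from_time_of_flight(0.004, 343.0, tank_height_m=2.0))
```

The sketch also makes the stated limitation visible: the estimate is only as good as the assumed speed of sound, which changes with the content and its temperature.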
The aforementioned methods work best when the content of the container is homogeneous. Thus, if a container were filled with powders, emulsions, etc., the effectiveness of these approaches may be affected. This limits the applicability of such solutions.
Figure 1d shows the other common set-up with two transducers. To reduce the energy attenuation in propagation, the excitation generated by the transmitter only travels a small distance instead of penetrating the whole container. Here, the literature indicates that the Lamb wave mode [
16], the resonance frequency [
17,
18,
19] or the spectral peak amplitude [
20] is affected by the presence of liquid and may be used to deduce the liquid level. Detection criteria can be deduced from a physical model of the impulse response of the container. The common practice is to first simulate this model and derive a relation between liquid levels and features extracted from the impulse response of the container. To detect the liquid level in practice, a physical impact is applied to the container and the response is recorded. The extracted features are then input to the relation derived from the simulation to obtain a level prediction. For example, ref. [
18] estimates the peak frequency shift during liquid loading by finite element method (FEM) for a standing cylindrical container with constraints applied to its base. The approach of [
19] adopts a simplified model, the Euler-Bernoulli beam theory, and yields a similar relationship between the peak frequency and the weight of its content when assuming the container is constrained at both ends. However, these physical models are heavily dependent on the geometry and material characteristics of containers, and the calculation could be too complex to transfer established models to unseen ones. It is thus inadequate as a general solution.
An interesting idea is provided in [
17] with a similar setup. Instead of computational analysis, the relation between the spectral feature and the liquid level is given by fitting a regression curve to experimental data. This makes the method, in principle, easy to transfer to other containers. However, the approach was only tested on a relatively small container of 8.5 L. Furthermore, the reliance of the above methods on resonance peak features alone (i.e., amplitude or frequency) would require, in general, an excitation of the whole container and (as we demonstrate here) does not generalise to the much larger containers used in rail-freight. Nevertheless, this basic idea is at the root of our study. It has its parallel in the manual check procedure, where a yard inspector tells whether a container is empty by knocking on it and listening to the sound. We aim to similarly identify the effect of the liquid level by an automated capture and analysis of the impulse response. If the prototype is well designed and the signal processing procedures are carefully chosen, such impulse response analysis can provide a portable solution (i.e., a compact and mobile device that can be easily installed onto travelling containers) for fill-state detection.
In this paper, we employ the set-up in
Figure 1d, consisting of a transducer applying an impulse excitation using a mechanical impact (‘knock’) at a location on the outer wall of the container (note that we use the terms ‘impulse’ and ‘knock’ interchangeably in the text to describe such an excitation), and a vibration sensor located a little further away, measuring the response. However, unlike [
17], we propose to analyse the spectrum of the impulse response, instead of focusing solely on the spectral peak amplitude or its shift. The benefit of this approach is that we do not need to excite the whole container; it is still possible to detect a change in the fill-state by analysing the change in the impulse response locally. We thereby address the following questions: (i) which features of the impulse response are best suited for the task of liquid level detection? (ii) where should the data be captured in order to obtain the best discrimination? (iii) to what extent is a (quantised) level detection feasible, and how does this depend on the sensor location? For this, we carry out experiments on two different types of containers in realistic settings.
The paper is organised as follows: the system overview is presented in
Section 2.
Section 3 introduces the excitation generation and response capture paradigm, the feature extraction, and the machine learning models used for the binary decision and quantised level prediction.
Section 4 describes the experimental setup and the data capture, and presents a preliminary, illustrative visual analysis on the feasibility of the system. Finally, in
Section 5, the system is validated, first on a smaller tank-container and then on a larger tank-car. The learnings are summarised in
Section 6 and directions for future work are presented in
Section 7.
2. System Concept and Overview
As previously stated, we propose to detect the container fill-state by applying mechanical impacts on the outer wall of the container and recording and analysing its response using an appropriate sensor. The system schematic is shown in
Figure 2.
The prototype system hardware consists of: an actuator to generate impulses, a receiver to record the vibrations, and a local controller for data acquisition, sensor synchronisation/activation, and necessary data analysis. The components are powered by a battery/powerpack.
The prototype is installed at a fixed location on the container. Generally speaking, there are two factors to consider when choosing the location: (i) the measured impulse responses at this location should be sensitive to liquid level change, and (ii) the location is easily accessible for convenient installation and system maintenance. While the lower part of the tank is evidently more favourable in consideration of (ii), we consider also a location midway up the container to contrast the difference in performance at different locations.
The test protocol is as follows: when queried, a series of N mechanical impacts, uniformly spaced in time, is applied to the exterior wall of the inspected container by the actuator. The receiver starts recording simultaneously, capturing the vibrations caused by the knocks. The recordings are then analysed to predict the fill-state.
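Because the impacts are uniformly spaced and the recording starts together with the actuation, the individual impulse responses can be cut out of a recording from the known timing alone. The following is a minimal sketch of that idea, not the authors' implementation; the function name, the analysis window length and the optional offset of the first impact are assumptions made for illustration.

```python
def segment_impulse_responses(recording, sample_rate_hz, interval_s,
                              n_impacts, window_s, first_impact_s=0.0):
    """Slice one recording into per-impact segments using the known timing.

    Hypothetical helper: `recording` is a sequence of samples, `interval_s`
    is the spacing between impacts, and `window_s` is the (assumed) length of
    the response kept after each impact.
    """
    window_len = int(round(window_s * sample_rate_hz))
    segments = []
    for k in range(n_impacts):
        # Sample index at which the k-th impact is expected.
        start = int(round((first_impact_s + k * interval_s) * sample_rate_hz))
        end = start + window_len
        if end > len(recording):
            break  # Drop an incomplete response at the end of the recording.
        segments.append(recording[start:end])
    return segments


if __name__ == "__main__":
    # Toy example: 10 s of "samples" at 10 Hz, one impact per second.
    toy_recording = list(range(100))
    parts = segment_impulse_responses(toy_recording, 10, 1.0, 20, 0.5)
    print(len(parts), parts[1])
```

Anything in the recording that falls outside these expected windows can be ignored, which is one way in which the synchronisation between actuator and receiver helps to reject extraneous impacts.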
Note that the impulse response characteristics would vary from one container model to the other. The impulse response also depends on the material properties such as density, bulk modulus, shear modulus, and viscosity. Thus, before deployment, a calibration experiment is required for each concerned container model, with its typical content. This calibrated model can then be stored for later look-up during the analysis.
We note that in a practical, large-scale deployment, one such system would be permanently attached to each container, and would be remotely queried—as needed—by a central server/base station in the yard. The necessary communication between each such system and the base station can be achieved by, e.g., mobile data (3G–5G), wireless LAN, etc. With such necessary infrastructure on-site, a user can remotely query the status of multiple containers in the yard in a short time. The choice of what processing needs to be local and what processing needs to be on the central server is a design decision that depends on the available bandwidth for data transfer, the energy trade-off acceptable and the relative costs of the local and central units.
Prototype Hardware Details
We choose to use a solenoid [
21] as the actuator and an accelerometer [
22] as the receiver. The solenoid was chosen because it can generate a mechanical impact that can be regarded as a wide-band excitation. The solenoid and the accelerometer should be installed close to each other so that a small amount of energy can still generate a clear signal at the accelerometer. The distance between the actuator and the receiver is set to 20 cm. Both are fixed on the curved surface by custom-designed, 3D-printed fixtures.
To emphasise the low computational effort required, a Raspberry Pi 4 [
23] is used as the controller and for recording the impulse responses. For the purposes of this proof-of-concept study, the data were downloaded onto a personal computer, where the analyses were carried out. In principle, however, given a trained model, a controller like the Raspberry Pi is computationally capable of the analyses. The whole system is powered by a power bank, resulting in a portable device that can be permanently installed on the container.
5. Evaluation
Following the illustrative analysis, we now present a thorough evaluation of the system. The prime goal is to detect the binary (empty/non-empty) state of the container. Predicting the quantised liquid level is useful (supplementary) information.
The proposed system is first validated on data gathered from the small container. Next, the data gathered on the larger tank-car is evaluated. The effect of the different sensor locations, previously depicted graphically, is now quantified. Furthermore, the minimal number of impulses required for a robust classification is investigated. This parameter should be chosen to strike a good balance between a robust decision and low energy consumption.
Fused decision was described in
Section 3.4 as a means to improve the robustness of the system, by aggregating likelihoods over consecutive responses. To investigate the minimum required number of impulses in a practical system, we pool over the requisite number of consecutive impulses to simulate impulse sequences of different lengths in each recording. The fused decision is investigated for all models. For the data gathered from the container, the models are trained on 60% of the recordings with a one-second impulse interval, and tested on the rest of the container recordings. For the data gathered from the tank wagon, the data is divided into training and test sets by impulse interval: the recordings with a 0.4 s interval are split 50/50 into training and test sets, and all recordings with a one-second interval are assigned to the test set, to avoid data leakage.
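The pooling over consecutive impulses can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure of Section 3.4 itself: it assumes that the per-impulse class likelihoods are averaged, that the windows of consecutive impulses overlap (stride of one impulse), and that the class with the highest averaged likelihood is chosen. The function name and the toy values are hypothetical.

```python
def fuse_decisions(likelihoods, window):
    """Fuse per-impulse class likelihoods over windows of consecutive impulses.

    `likelihoods` is a list with one entry per impulse; each entry is a list
    of class likelihoods (e.g., the output probabilities of a classifier).
    Assumption: averaging over overlapping windows with a stride of one.
    Returns the index of the winning class for each window.
    """
    fused = []
    for start in range(len(likelihoods) - window + 1):
        block = likelihoods[start:start + window]
        n_classes = len(block[0])
        # Average the likelihood of each class across the window.
        mean = [sum(row[c] for row in block) / window for c in range(n_classes)]
        # Decide for the class with the highest averaged likelihood.
        fused.append(max(range(n_classes), key=lambda c: mean[c]))
    return fused


if __name__ == "__main__":
    # Toy likelihoods for four impulses and two classes (0: empty, 1: non-empty).
    per_impulse = [[0.6, 0.4], [0.3, 0.7], [0.8, 0.2], [0.9, 0.1]]
    print(fuse_decisions(per_impulse, window=1))  # single-impulse decisions
    print(fuse_decisions(per_impulse, window=3))  # fused over three impulses
```

In the toy example the second impulse is misclassified on its own, but the decision fused over three impulses is not affected by this single outlier.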
Separate Logistic Regression (LR) models are trained for each feature set (LPCC, MFCC) and for each task (the binary state detector and the liquid level predictor). The evaluation details are listed in
Table 1 for the data from the container, and in
Table 2 for the data from the tank-wagon.
5.1. Evaluation on Container Data
The proposed method is first verified by the preliminary test experiment on the small container. The liquid level detection results on the container are depicted in
Figure 10 and
Figure 11. The classification results are succinctly summarised in the
confusion matrix, where the x-axis breaks down the data distribution according to their true labels, and the y-axis by their classification results. Thereby, the correct classifications lie on the diagonal of the matrix. The off-diagonal elements indicate false classifications (e.g., in the case of a binary classifier, the off-diagonal elements would correspond to the missed detection and false alarms). Further, the presented values are normalised by the total number of data-points for each prediction label and expressed as a percentage. Consequently the diagonal elements directly indicate the correct detection rates for each label. This representation of the classifier performance by a confusion matrix thus provides a quick overview not only of the accuracy of the system but also
how the errors are distributed. This is particularly useful for analysing the performance for multi-label classification tasks, such as the quantised level prediction.
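A small sketch of such a normalised confusion matrix is given below. It is an illustration only: the function name and the toy labels are hypothetical, and the sketch assumes that each true-label column is normalised to 100%, so that the diagonal gives the correct detection rate per true label.

```python
def confusion_matrix_percent(true_labels, predicted_labels, labels):
    """Confusion matrix in percent, laid out as matrix[predicted][true].

    Columns (x-axis) correspond to the true labels and rows (y-axis) to the
    classification results. Assumption: each column is normalised by the
    number of data points carrying that true label.
    """
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    counts = [[0] * n for _ in range(n)]
    for t, p in zip(true_labels, predicted_labels):
        counts[index[p]][index[t]] += 1
    # Number of data points per true label (column sums).
    totals = [sum(counts[r][c] for r in range(n)) for c in range(n)]
    return [[100.0 * counts[r][c] / totals[c] if totals[c] else 0.0
             for c in range(n)] for r in range(n)]


if __name__ == "__main__":
    truth = ["empty", "empty", "full", "full", "full", "full"]
    preds = ["empty", "full", "full", "full", "full", "empty"]
    for row in confusion_matrix_percent(truth, preds, ["empty", "full"]):
        print(row)
```

In this toy example the diagonal reads 50% for 'empty' and 75% for 'full', and the off-diagonal entries show where the remaining data points of each true label ended up.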
Binary Detection
Using LPCC as the feature, the proposed system shows higher accuracy on the binary classification (first row of
Figure 10) than on the quantised level prediction (second row of
Figure 10). It is clear that fused decisions improve the prediction performance for both goals even when only 3 consecutive impulses are taken into consideration. When we take a closer look at the level prediction results, it can be observed that the major prediction error comes from confusion among the ‘non-empty’ conditions. If we consider the level prediction results with the ‘empty/non-empty’ boundary (the sensor installation height) as the threshold, the binarised results show a similar accuracy to the results from the binary classifier.
For feature comparison, LR models based on MFCCs are also trained and tested in a similar manner.
Figure 10 and
Figure 11 indicate that LPCCs outperform MFCCs in both tasks (binary classification and quantised level prediction). The difference is more evident for level prediction. Comparing
Figure 10e to
Figure 11e, the overall accuracy decreases from 80% (LPCCs) to 65% (MFCCs). This result is consistent with the implications of the PCA projections of the two features (
Figure 8 and
Figure 9).
The preliminary test indicates that the proposed method is feasible for fill-state detection. Fused decisions improve the prediction accuracy for both goals. LPCCs outperform MFCCs as the input feature for the LR classifier. Therefore, LR with LPCC input will be employed as the detection model in the following sections.
Figure 10.
The confusion matrices of LPCC-based LR models on the data gathered from the container. Results are presented for the binary classification (‘empty/full’) of model 1 in the first row (a–c) as well as for the quantised level prediction task of model 3 in the second row (d–f). The first column indicates the performance on the training data, when only a single impulse is considered. The second column indicates the accuracy on the test set, again for a single impulse. The third column shows the benefit of fusion across three impulses, i.e., pooling the likelihoods across multiple impulses before taking the final decision. The result, for the binary classification, is now 100% on the test set and significantly improved on the more challenging level prediction task.
Figure 11.
The confusion matrices of MFCC-based LR models on the data gathered from the container. The first row (
a–
c) shows the performance of Model 2 in the binary classification task. The second row (
d–
f) shows the performance of Model 4 in the level prediction task. The first, second and third columns indicate the performance on the training data, on the test set, and the benefit of fusion across three impulses, respectively. The trends follow those in
Figure 10. Although MFCC features also show relatively good performance, this system performs worse than the LPCC-based systems. Further, the benefit of fused decisions is evident here as well, especially for the binary classification.
5.2. Evaluation on Tank-Car Data
It has already been shown by
Figure 9 that the sensor location has an influence on the feature distribution. Therefore, separate models are trained for the two prototypes.
Figure 12 presents the confusion matrices of empty/non-empty binary classification of the models on single impulses. The results are also summarised by sensitivity and specificity in
Table 3. Generally speaking, it can be observed that binary classification is a simple task for the proposed system. Nevertheless, prototype 2, which was installed at a higher location, performs worse in the binary classification task even though the labels are binarised accordingly. This is in line with the analysis from the PCA visualisation in
Section 4.3.
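For reference, sensitivity and specificity follow directly from the four counts of a binary confusion matrix. The sketch below is illustrative; the function name is hypothetical, and which of the two states is treated as the 'positive' class is an assumption that the caller has to make explicit.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive):
    """Compute sensitivity and specificity for a binary classifier.

    `positive` names the label treated as the positive class (an assumption
    of this sketch; the other label is treated as negative).
    """
    tp = fn = tn = fp = 0
    for t, p in zip(true_labels, predicted_labels):
        if t == positive:
            if p == positive:
                tp += 1  # positive correctly detected
            else:
                fn += 1  # positive missed
        else:
            if p == positive:
                fp += 1  # false alarm
            else:
                tn += 1  # negative correctly rejected
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    truth = ["non-empty", "non-empty", "non-empty", "empty", "empty"]
    preds = ["non-empty", "non-empty", "empty", "empty", "non-empty"]
    print(sensitivity_specificity(truth, preds, positive="non-empty"))
```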
5.2.1. Effect of Fused Decision
Fused decisions were introduced to improve accuracy. Fusing the likelihoods across different numbers of consecutive single-impulse results simulates shorter recordings and demonstrates the influence of the number of impulses
N. The classifiers already show high accuracy on single-impulse recordings, so only a modest increase to three impulses per recording is simulated as described in
Section 3.3 and evaluated. Note that 16 three-impulse recordings can be obtained from one original 20-impulse recording.
Figure 12c,f present the fused decision results (only shown for the test set). The fused decision results are also shown in terms of sensitivity and specificity in
Table 4, for the test set. As shown in the preliminary test, the binary detection is still an easy task even for the full-size tank-car. A positive effect of fused decisions is also observed with three impulses in each recording.
5.2.2. Sensitivity to Model Mismatch
The sensitivity of the trained
models to the sensor location is next investigated.
Figure 13 shows the binary liquid level detection results when applying the trained prototype 1 model to the data collected by prototype 2. The classifier, although well trained for its own location, fails on the data from the other sensor location. The low prediction accuracy indicates that matching the sensor location to the model is crucial for correct detection. Therefore, it is essential to ensure that the sensor installation location is fixed for each calibrated system.
5.2.3. Liquid Level Detection
It is also of great interest to see how much extra information we can deduce from the recordings. In this section, we investigate the accuracy of the proposed quantised liquid level estimation system.
Figure 14 shows the liquid level prediction results of the trained LR models. Since we deal with multi-label classification, we present the results only in the form of confusion matrices.
This conveys more insights regarding the performance, specifically how the errors are distributed across the various labels. It has already been shown that the lower part of the tank is better for the binary prediction. A similar result can be observed in the liquid level prediction results: prototype 1 installed at the tank bottom performs better than prototype 2 installed at a higher location. Further, the binary ‘empty/non-empty’ classification can be deduced from the level prediction results. Again, if we pool the level prediction results for model 7 using the sensor height as the classification boundary, the binary classification accuracy is in line with the results shown in
Figure 12b. This reaffirms the correctness of our choice on the threshold to classify empty/non-empty conditions.
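Pooling the level predictions into a binary decision amounts to thresholding the predicted level at the sensor height. A minimal sketch is given below; the function name and the numeric levels are hypothetical, and treating a level exactly at the sensor height as 'non-empty' is an assumption of this sketch rather than a statement about the labelling used in the experiments.

```python
def binarise_levels(predicted_levels, sensor_height):
    """Map quantised level predictions to 'empty' / 'non-empty'.

    Assumption: levels strictly below the sensor installation height count
    as 'empty'; levels at or above it count as 'non-empty'.
    """
    return ["empty" if level < sensor_height else "non-empty"
            for level in predicted_levels]


if __name__ == "__main__":
    # Hypothetical quantised levels, expressed as a fraction of tank height.
    print(binarise_levels([0.0, 0.25, 0.5, 1.0], sensor_height=0.5))
```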
Average-pooling also improves accuracy for liquid level prediction. As shown in
Figure 14c, five impulses per recording are sufficient for a reliable liquid level detection for prototype 1. However, the situation is more complex for prototype 2. Comparing
Figure 14f with
Figure 14g,h, the decision fusion hardly improves prediction accuracy. Especially when the liquid level is at the sensor installation height, the fusion does not bring much additional benefit.
Taking both results into consideration, the optimal sensor location for level prediction is along the base of the container.
6. Conclusions
We presented a portable non-intrusive fill-state detection method for liquid containers, to query the container fill-state in a remote and automatic manner. When the container state is queried, a sequence of mechanical impacts (‘knocks’) is applied to the container at evenly-spaced intervals and the container impulse responses are recorded simultaneously. The fill-state information of the container can be deduced from the recorded responses.
Two aims have been set for the system: a main but ‘coarse’ one to detect if the tank is empty, and a supplementary, ‘detailed’ one to detect how much liquid is left in the tank (quantised to a discrete set of fill-levels or fill-states).
Prototypes were designed and built to validate the proposed method, consisting of a solenoid actuator as the excitation source, a piezoelectric sensor as the vibration receiver, a Raspberry Pi as an archetypal local controller with limited computational resources, and a powerbank for energy supply. Experiments were performed on a container and a freight tank to verify the method and to optimise the following aspects of the system configuration: the sensor location, the minimal number of impulses in one recording for a robust decision, and the data representation for the different tasks.
We solved both the binary empty/non-empty detection and the quantised liquid level detection as classification problems. The logistic regression model is adopted as the classifier. The visual data analysis by PCA and the more thorough evaluation using trained LR models both show that Linear Prediction Cepstral Coefficients (LPCC) offer a more robust data representation than Mel Frequency Cepstral Coefficients (MFCC) as features. The experiments show that the models are most sensitive at the decision boundary, which is at the same height as the sensor. Therefore, the best sensor location is the bottom of the container given our prime goal of binary fill-state classification. Furthermore, decision fusion could improve detection accuracy for both the binary classification and the liquid level detection. Using the best sensor location (the bottom of the container) and feature (LPCCs), the proposed system yields a 100% accuracy on the binary prediction task if fusing three impulses, while fusing 10 impulses can achieve an accuracy of 80% on the quantised liquid level prediction task.
We note that the proposed method offers a degree of built-in robustness to the external interference endemic in the industrial environments where the system will be deployed. Firstly, extraneous impulses captured in a recording can be detected and filtered out using the synchronisation between the impact generator and the data capture. Secondly, as shown in the evaluation, fusing likelihoods across multiple impulses further improves the robustness of the detection. The high accuracy of the test results, where the experiments were performed in an active rail-yard and during normal working hours, offers some proof of this robustness.
7. Future Work
The proposed system can be an important contributor to improving rail-intelligence and our paper is a first step in this direction. Through proof-of-concept experiments with prototype hardware we have chiefly demonstrated the feasibility of the concept. The results indicate that a reliable fill-state detection can be obtained in real-life working conditions, given a proper calibration for a container-content pair. In the context of facilitating a large-scale deployment we expect the calibrations done on one container-content pair to hold for similar container-content pairs, as long as the same model of container is used. This limits the calibration to only a few container-content pairs, which can be acquired over a period of time. Since the purpose of a container (the type of freight it carries) very rarely varies, once a container is calibrated we can expect the parameters to hold for a relatively long period of time. Another way to do the calibration is to use an automated procedure (at least for the binary detection) that makes recordings/measurements whenever the container is emptied and filled; this can ensure an up-to-date calibration for that container.
As more data is gathered, we can establish more general relationships between the acoustic features and the liquid level for different combinations of container model and content, which can allow for a generalisation of the calibrations. A mobile set-up is also feasible, in which case the system can be designed as a handheld device, allowing desired containers to be checked by a field operative. As mentioned above, with sufficient data, the operative could, e.g., select models from the model database to use the correct parameters, or these could be derived from the more generalised relations if essential parameters of the inspected container (capacity, insulation layer information, general geometry, etc.) are provided. Thus, we hope for interesting developments in this field, in collaboration with the industry.