1. Introduction
The Western Treatment Plant (WTP) in Werribee, Victoria, Australia, is operated by Melbourne Water and plays a crucial role in providing sewage treatment services to over half of Melbourne’s population [1]. The anaerobic digestion of raw sewage takes place in treatment lagoons that are covered with high-density polyethylene (HDPE) sheets approximately 2 mm thick and 450 × 170 m in size, and produces methane-rich biogas that can be harvested for renewable energy generation [1]. However, a progressive accumulation of scum can occur underneath these floating covers, building up into large mounds known as scumbergs. The presence of scumbergs can compromise the structural integrity of the floating covers as well as obstruct the collection of biogas, thereby affecting renewable energy generation.
Recently, unmanned aerial vehicle (UAV)-based photogrammetry has been employed as an inherently safe and rapid inspection method to regularly capture the elevation profile of the floating covers. Specifically, orthomosaics and digital elevation models (DEMs) of these covers are generated through this approach, enabling tracking of the elevation of the cover and, hence, of the underlying scum [2,3,4], as depicted in Figure 1. In particular, this remote sensing imagery can provide early detection of scum accumulation and offers the potential for developing more detailed diagnostic and prognostic models [5,6] for assessing the structural health of the floating covers using artificial intelligence, in line with the promise of the Industry 4.0 revolution.
To date, our research project has primarily focused on the development of non-contact techniques for safely gathering information on deformation and solid scum accumulation under the floating cover [2,3,7]. The DEMs, in particular, have been instrumental in providing valuable spatial context for the floating cover, enhancing our understanding of scum behaviour and facilitating WTP asset management. In our previous work, Wong et al. [2] demonstrated the capability of an unsupervised machine learning technique to delineate boundaries between regions of different scum hardness levels by leveraging elevation data from the DEM. The study also reported that a linear model explains 77% of the variance in scum depth based on the cover elevation above water level. However, extensive manual suppression of unwanted artefacts (e.g., flotation, ballast, and water features, see Figure 1) in the DEM was required, since these features were not associated with the actual floating cover elevation and could cause erroneous results. There is therefore a need for a robust method to remove artefacts in DEMs, allowing for a more rigorous assessment of the effects of scum on the floating covers.
Notwithstanding these acknowledged advantages of DEMs, it is important to recognise their susceptibility to errors, which manifest as noise in the elevation data. Furthermore, multiple preprocessing steps are often necessary to isolate the feature of interest before conducting analysis [8,9]. In particular, filtering algorithms that autonomously remove unwanted features are advantageous, given the laborious task of manually manipulating data, particularly for high-resolution images with detailed features. Several filter-based methods [8] have been implemented to remove artefacts and reduce errors associated with DEMs. Classical smoothing filters, such as the mean filter, present a trade-off between the degree of smoothing and the preservation of key features. The median filter, a simple and widely used non-linear filter applied across various industries, is robust in reducing impulse-like noise by replacing each data point with the median value of the surrounding points within a specified window. Advanced spatial methods, using adaptive smoothing, multiple anisotropic filters [10] and filtering in the frequency domain [11], have been shown to smooth 2D elevation profiles while preserving subtle details. Specifically, multiscale analysis methods analyse variations and complexities across multiple scales, providing a more comprehensive understanding of the phenomena than methods that operate at a single scale [12,13]. These approaches have been increasingly applied to DEMs to capture features at different scales as well as to remove noise. Hui et al. [12] introduced a simple technique using a linearly expanding window size and simple slope thresholds. Hani et al. [13] utilised the lifting scheme, a variant of the wavelet transform, to evaluate terrain surface roughness. Gallant [9] demonstrated a multiscale adaptive smoothing approach that progressively increases the level of smoothing where noise is relatively larger. Booth et al. [14] achieved a 97% classification accuracy in landslide mapping by using spectrum-based methods and filtering unwanted non-native features under the assumption that they exhibit higher spatial frequencies. Considerable progress has been achieved in the active field of vegetation suppression within geospatial models, including colour-based and slope-based filtering techniques [15,16,17,18], with commercial software integrating proprietary algorithms specifically designed for vegetation filtering in ground terrain elevation [19]. However, specialised applications often necessitate highly specific processing strategies, including multiple stages of filtering [15,17,18,20,21]. These demands require a substantial investment in developing filtering approaches tailored to unique application needs, which may not transfer to other applications. Furthermore, advanced spatial techniques become less intuitive as their complexity and the number of adjustable parameters increase. Additionally, these techniques are predominantly limited to a single (i.e., spatial) attribute of the data and do not consider multiple data attributes, thus restricting their scalability when presented with new information. Nevertheless, removing natural features (e.g., vegetation and trapped rainwater), man-made objects, and unwanted artefacts that do not correspond to the terrain of interest remains problematic and challenging [8,22,23].
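For illustration, a minimal Python sketch of the classical smoothing filters described above, applied to a DEM held as a 2D elevation array, is given below. The synthetic DEM, window sizes, and parameter values are assumptions for demonstration only and are not taken from this study.

```python
import numpy as np
from scipy.ndimage import median_filter, uniform_filter, gaussian_filter

# Synthetic DEM (elevations in mm) with a few impulse-like artefacts.
rng = np.random.default_rng(0)
dem = 300.0 + 20.0 * rng.standard_normal((200, 200))
dem[50, 50] = 2000.0    # spike, e.g. a reflection artefact (illustrative)
dem[120, 80] = -500.0   # drop-out (illustrative)

# Each filter trades smoothing strength against preservation of fine detail.
dem_median = median_filter(dem, size=5)      # robust to impulse-like noise
dem_mean = uniform_filter(dem, size=5)       # blurs edges more strongly
dem_gauss = gaussian_filter(dem, sigma=1.5)  # weighted smoothing

print(f"raw std:    {dem.std():.1f} mm")
print(f"median std: {dem_median.std():.1f} mm")
```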
In the past decade, there has been significant interest in incorporating machine learning techniques into DEM and imagery data for classification and feature segmentation applications [7,24,25,26,27,28]. A key advantage of machine learning techniques is their ability to learn the characteristics of features automatically from the data, eliminating the need for the explicit feature definitions used in classical classification methods. Henriques et al. [25] segmented intertidal habitats for ecological research using DEM and satellite imagery in a supervised ensemble learning random decision forest algorithm. In Su’s work [26], semi-arid vegetation mapping was conducted using remote sensing data, including mean elevation via DEM and nadir and off-nadir reflectance measurements, through support vector machine learning, achieving a classification accuracy of approximately 80%. Gebrehiwot et al. [27] applied transfer learning to a pre-trained convolutional neural network for flood extent mapping, classifying water, buildings, and non-relevant features. Their neural network was fine-tuned using a smaller training population of manually annotated UAV imagery data, resulting in an overall accuracy of 97%. While supervised learning models, especially deep learning models, can produce highly accurate results, the main challenges in their development are the need for very large datasets and labelled data. These limitations require substantial time and labour investments, which can render practical implementation less feasible. Consequently, studies have employed unsupervised learning methods, i.e., clustering methods, to address the common challenge of insufficient labelled imagery data [29,30,31]. A notable advantage of clustering algorithms is their capability to rapidly identify, and enable the visual examination of, groups associated with features of interest within the clustering results, without prior knowledge of the data or labelled data to train the model. Cinat et al. [30] demonstrated the application of an unsupervised clustering algorithm on imagery data for isolating canopy vegetation for crop management. More recently, an unsupervised deep learning method using a convolutional autoencoder network trained on multispectral imagery and DEMs was used to extract abstract, high-level features from the embedding layer for landslide detection [31]. Evidently, the emerging research trend towards machine learning algorithms is attributable to their performance in effectively filtering noise and artefacts, while offering additional capabilities such as flexibility and scalability compared to conventional methods.
In today’s data-centric landscape, there is a strong emphasis on developing efficient and user-friendly data processing techniques for various industry applications. Melbourne Water is actively pursuing innovative methods that consider both DEMs and orthomosaics, aiming to seamlessly integrate future imagery for further refinement. Currently, conventional filtering methods, such as median filters, are employed for DEM processing on the anaerobic lagoons owing to their straightforward, rapid, and cost-effective nature, which is sufficient for their specific applications [4]. However, advanced filtering methods, known for their superior accuracy, often introduce increased complexity. As a result, they can be less intuitive for non-specialists to interpret and implement, and may not scale effectively with additional data characteristics, making them unsuitable for this specific industrial application. As such, machine learning approaches, particularly unsupervised learning methods, emerge as highly promising alternatives that align with Melbourne Water’s resources and needs, with the potential to enhance existing practices.
This paper proposes a novel approach that utilises an unsupervised k-means clustering machine learning algorithm, incorporating DEMs and their associated orthomosaics, for filtering features on the floating cover of the anaerobic lagoon. This approach enables the visual identification of features within each cluster, followed by filtering to retain the clusters containing features related to the membrane cover. The investigation first examines the learned clusters and then demonstrates the filtering approach in a localised region of interest, comparing it with the filtering methods currently used in practice. Furthermore, the correlations between scum hardness and the unfiltered and filtered DEM elevations are investigated. These results highlight the significance of this approach in achieving a more accurate and precise analysis of the scum hardness and elevation of the floating cover. Finally, this preliminary work represents a significant step forward in the pursuit of real-time structural health monitoring for the anaerobic lagoons at the WTP.
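For clarity, a minimal Python sketch of this type of k-means filtering, assuming a co-registered DEM and RGB orthomosaic, is given below; the feature construction, number of clusters, and retained labels are illustrative assumptions rather than the exact configuration used in this work.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_filter(dem, ortho_rgb, n_clusters=8, keep_labels=None):
    """Cluster each pixel on (elevation, R, G, B) and mask out unwanted groups.

    dem         : (H, W) float array of elevations
    ortho_rgb   : (H, W, 3) array of orthomosaic colour values
    keep_labels : cluster labels to retain, chosen by visual inspection
    """
    h, w = dem.shape
    features = np.column_stack([dem.ravel(), ortho_rgb.reshape(-1, 3)]).astype(float)

    # Standardise each feature so elevation does not dominate the colour channels.
    stds = features.std(axis=0)
    stds[stds == 0] = 1.0
    features = (features - features.mean(axis=0)) / stds

    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(features)
    labels = labels.reshape(h, w)

    filtered = dem.astype(float).copy()
    if keep_labels is not None:
        filtered[~np.isin(labels, list(keep_labels))] = np.nan  # drop unwanted clusters
    return labels, filtered
```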
3. Results
3.1. Elevation Distributions of a Localised Region of Interest Using Various Filtering Methods
The following results demonstrate the effectiveness of the k-means filtering method and compare it with existing methods in a 2 m by 2 m localised region of interest. In this section, the clustered groups are identified via the k-means clustering algorithm, followed by a statistical comparison of the different filtering methods and their accuracy relative to laser survey measurements.
Within the localised region, the k-means clustering algorithm identified three distinct clusters: Groups 2, 7 and 3, as illustrated in Figure 7. Upon visual inspection, Group 2 predominantly represents the trapped rainwater body on the floating cover, while Group 7 corresponds to the floating membrane cover. Group 3 encompasses the porthole structure and its shadow. As indicated in Table 2, Group 2 exhibits a large standard deviation of 137.8 mm with a mean value of 46.4 mm, indicating significant variation in elevations and noise within this group. Although Group 3 exhibited mean and median values similar to those of Group 7, its relatively larger standard deviation was mainly due to the presence of the porthole structure’s shadow. It was evident that the features in Groups 2 and 3 contained relatively larger noise and did not represent the membrane cover. As a result, through the k-means filtering method, these features were subsequently filtered out (refer to Section 2.3.3), and, hence, only Group 7 was retained.
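A minimal sketch of the per-cluster elevation statistics reported in Table 2 is given below, assuming `dem_roi` and `labels_roi` are the DEM and cluster map cropped to the localised region; the variable names and output structure are illustrative only.

```python
import numpy as np

def cluster_statistics(dem_roi, labels_roi):
    """Mean, median, and standard deviation of elevation for each cluster."""
    stats = {}
    for label in np.unique(labels_roi):
        elev = dem_roi[labels_roi == label]
        elev = elev[~np.isnan(elev)]
        stats[int(label)] = {
            "mean_mm": float(np.mean(elev)),
            "median_mm": float(np.median(elev)),
            "std_mm": float(np.std(elev)),
            "pixels": int(elev.size),
        }
    return stats
```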
The comparisons of the different filtering methods on the elevation distributions of the localised region are shown in Figure 8. Referring to Table 3, the median filters resulted in insignificant changes in their median values (with a maximum difference of 1.3%) compared to the median value of the raw data. In contrast, both the mean and Gaussian filters exhibited reductions in the median values (ranging from 5.6% to 15.2%). This decrease is attributed to their smoothing nature, which tends to blur, or fail to preserve, edges and finer details, particularly at larger window sizes. While it is evident that all methods can remove impulse-like noise, classical methods do not specifically filter out unwanted natural and man-made features, as their function is to smooth the entire surface elevation of the image.
Significant differences in statistical measures were observed when the k-means filtering method was applied, compared with the raw data and the classical filtering methods. Specifically, there was a 54.2% increase in the mean and a 24.5% increase in the median relative to those of the raw data, whereas the classical filtering methods showed a relative decrease in the mean of up to 5.0% and in the median of up to 15.2%. Furthermore, the standard deviation of the k-means filtered elevations exhibited a significant reduction of 57.2% relative to that of the raw data, while the classical filtering methods showed relative reductions in standard deviation ranging from 2.2% to 14.1%. Notably, the majority of the elevations obtained through the k-means filtering method corresponded closely with the laser survey measurement of 370 mm, particularly in the mean and median values, with relative percentage errors of 1.0% and 3.8%, respectively.
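The relative percentage errors quoted above can be computed as sketched below, assuming a filtered elevation array with removed pixels set to NaN; the function name and structure are illustrative.

```python
import numpy as np

LASER_REFERENCE_MM = 370.0  # laser survey measurement quoted above

def relative_error(filtered_dem, reference_mm=LASER_REFERENCE_MM):
    """Relative percentage error of the mean and median against the reference."""
    elev = filtered_dem[~np.isnan(filtered_dem)]
    return {
        "mean_error_pct": 100.0 * abs(np.mean(elev) - reference_mm) / reference_mm,
        "median_error_pct": 100.0 * abs(np.median(elev) - reference_mm) / reference_mm,
        "std_mm": float(np.std(elev)),
    }
```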
3.2. Correlation between Scum Hardness and Floating Cover Elevation
In this section, the correlations between scum hardness and floating cover elevation, considering both raw and filtered elevation data, are presented. The elevation distributions for each scum hardness level before and after applying the median and k-means filtering methods are shown in Figure 9. As indicated in Table 4, the k-means filtering method resulted in an average reduction of 32.9% in the filtered data for each scum hardness level, while the median filter showed a negligible reduction of less than 1.7% owing to the null (NaN) values introduced by the filtering process. There were no significant differences, with a maximum difference of 2%, in the relative proportion of data across scum hardness levels between the filtered data and the raw data. This suggests that the removal of artefacts through the filtering methods did not significantly affect the proportion of each scum hardness level relative to the raw data. Therefore, it is considered appropriate to compare the scum hardness distributions between the filtered data and the raw data, since the relative proportions remain consistent.
There were 3% to 8% differences in the mean values of the soft to hard scum hardness levels between the k-means filtered data and the raw data, as indicated in Table 4. However, the fluffy and water scum hardness levels exhibited larger mean differences, at 24.1% and 39.4% relative to the raw data, respectively. For the median filter, the majority of the mean differences were less than 0.6%, except for the fluffy and water scum hardness levels, which showed 4.4% and 17.7% mean differences relative to the raw data, respectively.
The k-means filtered data showed a significant average reduction of 59.4% in the proportion of outliers compared to the raw data. In particular, the hard and medium scum hardness levels exhibited substantial reductions of 80.2% and 85.4% in the proportion of outliers relative to the raw data, respectively. This reduction was attributed to the removal of large water bodies, which were predominantly present in these regions. In contrast, for the median filtered data, the most substantial reductions in the proportion of outliers were observed in the hard, medium, and fluffy scum hardness levels, with reductions of 37.4%, 14.9%, and 7.0%, respectively, relative to the raw data; the remaining scum hardness levels exhibited relative reductions of less than 3.5%. The k-means filtered data also showed an average reduction of 36.3% in standard deviation relative to the raw data, with significant relative reductions of 62.5% and 46.7% observed in the water–fluffy and fluffy scum hardness levels, respectively. These reductions were attributed to the removal of ballast, the water surrounding the ballast, and debris/dirt, which were prevalent in these regions. For the median filter, the majority of scum hardness levels showed outlier reductions of less than 7.4% relative to the raw data, except for the water scum hardness level, which exhibited a significant relative reduction of 19.7%. Similar to the previous findings, it was observed that the median filter is efficient in removing impulse noise from the raw data. However, the k-means filtering method is notably effective at filtering unwanted features, including those containing significant noise, resulting in reduced variance and fewer outliers in the elevation distributions.
Furthermore, Spearman’s correlation analysis revealed strong and highly significant correlation coefficients between scum hardness and elevation for the raw data, median filtered data, and k-means filtered data, with values of 0.719, 0.746, and 0.813, respectively. These coefficients indicate a monotonic relationship between scum hardness and elevation. Notably, the correlation strength was lower for the raw data due to the presence of noise and artefacts, whereas the k-means filtered data showed a significant improvement of 13.1% in correlation strength, compared with a 3.6% improvement for the median filtered data.
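A minimal sketch of this Spearman’s rank correlation calculation is given below, assuming the surveyed scum hardness levels have been encoded ordinally and aligned pixel-wise with the elevation samples; the encoding and variable names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import spearmanr

def hardness_elevation_correlation(hardness, elevation):
    """Spearman's rank correlation between ordinal scum hardness and elevation."""
    mask = ~np.isnan(elevation)          # ignore pixels removed by filtering
    rho, p_value = spearmanr(hardness[mask], elevation[mask])
    return rho, p_value
```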
4. Discussion
The preceding findings underscore the capability of this method to attain a more precise assessment of scum hardness and the elevation of the floating cover. The research objective was to employ inspection parameters, such as DEMs and orthomosaics, to make informed decisions that ensure the cover maintains its structural integrity. The findings reveal clear associations between scum hardness and elevation, and support the advantages of effectively filtering unwanted features. Both filters demonstrated their capacity to remove irrelevant data points, thereby improving data quality for uncovering the association with scum hardness. This is demonstrated by the significant increase in correlation strength achieved through the use of the k-means filtering method to remove artefacts, which constitute approximately 16% of the total asset. Particularly in studies where unwanted natural features can have a significant influence on the results, the k-means filtering method offers a convenient way to identify these features for removal. The analysis suggests that the consistency of the scum influences the height of the floating cover, which may have implications for the asset’s structural integrity. This highlights the necessity for the further understanding and management of scum to maintain the asset effectively.
The preceding findings highlight the advantages of the k-means filtering method, which utilises machine learning techniques to efficiently group features based on their parameter similarity without relying on pre-existing knowledge of the data. This capability facilitates the rapid identification of clusters associated with the features of interest through visual inspection. It eliminates the need for conventional methods that require defining functions or dependencies, especially those related to spatial characteristics, as is common in most smoothing filters. In this particular study, a substantial proportion of the image corresponds to the elevation of the floating cover, with only small portions occupied by unwanted features. This allows a simple thresholding approach that removes clusters occupying lower proportions of the image, as sketched below. Furthermore, the effectiveness of the k-means filtering method in reducing noise and artefacts was exemplified by the significant reduction in outliers and standard deviation compared to the classical filtering methods. In particular, water features were observed to be the primary artefacts in which impulse-like noise is concentrated. It should be noted that the classical methods take into account the entire elevation field, including that of the water surface, which does not accurately represent the cover; therefore, the removal of water features is deemed crucial in this study.
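A minimal sketch of this proportion-based thresholding is given below; the 5% threshold is an assumption for illustration rather than the value used in this study.

```python
import numpy as np

def clusters_to_keep(labels, min_fraction=0.05):
    """Return cluster labels occupying at least `min_fraction` of the image."""
    values, counts = np.unique(labels, return_counts=True)
    return [int(v) for v, c in zip(values, counts) if c / labels.size >= min_fraction]
```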
The present work only included the image processing step of transforming the DEM and orthomosaic to a global reference point. Further application of morphological and digital image processing techniques, such as contrast adjustment, could enhance the quality of the image before performing k-means filtering. Nonetheless, the k-means filtering method has demonstrated its robustness in effectively identifying distinct groups associated with the membrane covers with different luminance characteristics (Groups 6 and 7), as well as unrelated image backgrounds (Group 5), without the need for extensive preprocessing procedures. Group 5 was immediately excluded from the analysis due to the primary objective of demonstrating the filtering of features specific to the asset. If the filtering process were extended to the entire image, this approach would consequently identify Group 5 as an undesirable feature as well.
It is important to acknowledge that errors may arise from variations in the scum hardness survey, considering the time lapse of approximately a year between the survey and the DEM data. However, given the gradual increase in scum hardness over a long period of time, it is reasonable to assume that the data are sufficiently reliable for the purpose of this study. Additionally, there were substantial differences in sample size among the scum hardness levels, with the water scum hardness level comprising more than 37% of the image data, while the medium-soft scum hardness level accounted for less than 1.5%, as seen in Table 4 and Figure 2. The considerably smaller sample size of the medium-soft scum hardness level may result in underrepresentation and potentially unreliable estimates of the statistical measures. Nevertheless, surveys conducted simultaneously with the DEM acquisition would help mitigate errors arising from temporal variation and provide additional data samples to strengthen the validity of the results obtained from the analysis.
The filtering process relies on the clustering capability of k-means to effectively group the features of interest. In this study, we employed the CH criterion for cluster validation to quantitatively assess the quality of the clustering results. The advantage of the CH criterion lies in its capability to account for both the separation between clusters and the cohesion within clusters, thereby providing an overall assessment of clustering quality, where a higher CH index indicates a more well-defined and compact clustering outcome. As the CH criterion considers variance ratios, it is more robust in scenarios where clusters exhibit complex geometry compared with alternatives such as the elbow method and the silhouette plot. Upon applying this criterion, the resulting clusters exhibited visually identifiable features, making them suitable for filtering purposes. In contrast, cluster numbers below this selection (with a lower CH index) did not capture less prevalent and complex features (e.g., dirt), while cluster numbers exceeding it yielded less meaningful groups that were challenging to interpret, as illustrated in Figure 3. Therefore, a non-optimal number of clusters (lower CH index) would render the k-means filtering method less robust and practical, as the key advantages of this method lie in its ability to cluster interpretable features for filtering. Further evaluation to analyse the quality of the clusters individually and globally could also be carried out through the integration of multiple cluster validation methods; however, this is beyond the scope of this work.
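A minimal sketch of selecting the number of clusters with the CH criterion is given below, assuming `features` is the standardised per-pixel feature matrix from the earlier sketch; the candidate range of k is illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score

def select_k_by_ch(features, k_range=range(2, 13)):
    """Choose the number of clusters maximising the Calinski-Harabasz index."""
    scores = {}
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
        scores[k] = calinski_harabasz_score(features, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores
```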
Furthermore, certain features (e.g., flotation) were not distinctly separated into groups. This is attributed to overlapping patterns among data points related to the feature, which makes achieving distinct separation difficult. To address this limitation and enhance the filtering process, the integration of supplementary information, such as thermal imagery, can provide valuable insights, enabling the better distinction and grouping of desired features that may not be apparent from elevation and colour variables alone. While clustering was effective for the prevalent feature of interest in this study, further consideration is needed for less prevalent and more intricate features.
This comparative analysis, as well as the proposed k-means filtering method, is designed for the specific application of anaerobic lagoon covers and may not be universally applicable. While the capabilities of the proposed method have been demonstrated, determining its suitability for other applications may require further investigation, modification, and refinement. Furthermore, it should be noted that the k-means and conventional methods are not mutually exclusive, as they operate on different principles and serve different applications; hence, a direct comparison of their capabilities to filter noise and unwanted artefacts is rather difficult. Nevertheless, for this specific study, the intention is to showcase a machine learning approach and its additional capabilities relative to the methods currently used in practice.
Another aspect of this research involves employing data-driven learning algorithms to predict biogas performance using historical operational data from the WTP [5,6]. Incorporating filtered DEMs and orthomosaics would be highly beneficial in enhancing these machine learning models. The outcomes of this work also lay the groundwork for future machine learning endeavours, enabling models to incorporate the spatial characteristics of the asset and effectively account for elevation variations.
In future work, efforts will be directed towards enhancing the data quality and incorporating additional information to further improve the filtering process. Moreover, the focus will be on developing a deep learning model for asset performance and monitoring, leveraging the advancements in machine learning techniques.