MDPI - Publisher of Open Access Journals

24 pages, 1698 KB

Open Access Article

Deep Learning-Based Classification of Transformer Inrush and Fault Currents Using a Hybrid Self-Organizing Map and CNN Model

by Heungseok Lee, Sang-Hee Kang and Soon-Ryul Nam

Energies 2025, 18(20), 5351; https://doi.org/10.3390/en18205351 - 11 Oct 2025

Viewed by 150

Abstract

Accurate discrimination between magnetizing inrush currents and internal faults is essential for reliable transformer protection and stable power system operation. Because their transient waveforms are so similar, conventional differential protection and harmonic restraint techniques often fail under dynamic conditions. This study presents a two-stage classification model that combines a self-organizing map (SOM) and a convolutional neural network (CNN) to enhance robustness and accuracy in distinguishing between inrush currents and internal faults in power transformers. In the first stage, an unsupervised SOM identifies topologically structured event clusters without the need for labeled data or predefined thresholds. Seven features are extracted from differential current signals to form fixed-length input vectors. These vectors are projected onto a two-dimensional SOM grid to capture inrush and fault distributions. In the second stage, the SOM’s activation maps are converted to grayscale images and classified by a CNN, thereby merging the interpretability of clustering with the performance of deep learning. Simulation data from a 154 kV MATLAB/Simulink transformer model include inrush, internal fault, and overlapping events. Results show that, one cycle after fault inception, the proposed method improves accuracy (AC), precision (PR), recall (RC), and F1-score (F1s) by up to 3% compared with a conventional CNN model, demonstrating its suitability for real-time transformer protection.
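
The second stage described above feeds SOM activation maps to a CNN as grayscale images. The following is a minimal sketch of that conversion step only, not the authors' implementation. The Gaussian activation, the `sigma` parameter, the min-max scaling to 0..255, and the function names are all assumptions made for illustration; the abstract does not specify how activations are computed.

```python
import math


def som_activation_image(weights, x, sigma=1.0):
    """Map one feature vector onto a SOM grid and return an 8-bit grayscale image.

    weights: rows x cols grid of codebook vectors (lists of floats)
    x:       feature vector with the same length as each codebook vector
    sigma:   hypothetical width of the Gaussian activation kernel (assumption)
    """
    # Euclidean distance from the input to every node's codebook vector.
    dist = [[math.dist(w, x) for w in row] for row in weights]
    # Gaussian activation: nodes close to the input respond strongly.
    act = [[math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in row] for row in dist]
    lo = min(min(row) for row in act)
    hi = max(max(row) for row in act)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat map
    # Rescale to 0..255 so the map can be treated as a grayscale image.
    return [[round(255 * (a - lo) / span) for a in row] for row in act]


def best_matching_unit(weights, x):
    """Return (row, col) of the node whose codebook vector is nearest to x."""
    cells = ((r, c) for r in range(len(weights)) for c in range(len(weights[0])))
    return min(cells, key=lambda rc: math.dist(weights[rc[0]][rc[1]], x))


# Toy 2 x 2 grid with seven-dimensional codebook vectors (placeholder values).
toy_weights = [[[0.0] * 7, [1.0] * 7], [[2.0] * 7, [3.0] * 7]]
toy_input = [0.9] * 7
image = som_activation_image(toy_weights, toy_input)
bmu = best_matching_unit(toy_weights, toy_input)
```

In this sketch the brightest pixel always coincides with the best-matching unit, so the image keeps the topological information that the SOM stage produced.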

(This article belongs to the Special Issue Application of Artificial Intelligence in Electrical Power Systems)

19 pages, 9329 KB

Open Access Article

How to Achieve Integrated High Supply and a Balanced State of Ecosystem Service Bundles: A Case Study of Fujian Province, China

by Ziyi Zhang, Zhaomin Tong, Feifei Fan and Ke Liang

Land 2025, 14(10), 2002; https://doi.org/10.3390/land14102002 - 6 Oct 2025

Viewed by 366

Abstract

Ecosystems are nonlinear systems that can shift between multiple stable states. Ecosystem service bundles (ESBs) integrate the supply and trade-offs of multiple services, yet the conditions for achieving high-supply and balanced states remain unclear from a nonlinear, threshold-based perspective. In this study, six representative ecosystem services in Fujian Province were quantified, and ESBs were identified using a Self-Organizing Map (SOM). By integrating the Multiclass Explainable Boosting Machine (MC-EBM) with the API interpretable algorithm, we propose a framework for exploring ESB driving mechanisms from a nonlinear, threshold-based perspective, addressing two key questions: (1) Which factors dominate ESB formation? (2) What thresholds of these factors promote high-supply, balanced ESBs? Results show that (i) the proportion of water bodies, distance to construction land, annual solar radiation, annual precipitation, population density, and GDP density are the primary driving factors; (ii) higher proportions of water bodies enhance and balance multiple services, whereas intensified human activities significantly reduce supply levels, and ESBs are highly sensitive to climatic variables; (iii) at the 1 km × 1 km grid scale, optimal threshold ranges of the dominant factors substantially increase the likelihood of forming high-supply, balanced ESBs. The MC-EBM effectively reveals ESB formation mechanisms, significantly outperforming multinomial logistic regression in predictive accuracy and demonstrating strong generalizability. The proposed approach provides methodological guidance for multi-service coordination across regions and scales. Corresponding land management strategies are also proposed, which deepen understanding of ESB formation and offer practical references for enhancing ecosystem service supply and reducing trade-offs.

37 pages, 1134 KB

Open Access Article

SOMTreeNet: A Hybrid Topological Neural Model Combining Self-Organizing Maps and BIRCH for Structured Learning

by Yunus Doğan

Mathematics 2025, 13(18), 2958; https://doi.org/10.3390/math13182958 - 12 Sep 2025

Viewed by 511

Abstract

This study introduces SOMTreeNet, a novel hybrid neural model that integrates Self-Organizing Maps (SOMs) with BIRCH-inspired clustering features to address structured learning in a scalable and interpretable manner. Unlike conventional deep learning models, SOMTreeNet is designed with a recursive and modular topology that supports both supervised and unsupervised learning, enabling tasks such as classification, regression, clustering, anomaly detection, and time-series analysis. Extensive experiments were conducted using various publicly available datasets across five analytical domains: classification, regression, clustering, time-series forecasting, and image classification. These datasets cover heterogeneous structures including tabular, temporal, and visual data, allowing for a robust evaluation of the model’s generalizability. Experimental results demonstrate that SOMTreeNet consistently achieves competitive or superior performance compared to traditional machine learning and deep learning methods while maintaining a high degree of interpretability and adaptability. Its biologically inspired hierarchical structure facilitates transparent decision-making and dynamic model growth, making it particularly suitable for real-world applications that demand both accuracy and explainability. Overall, SOMTreeNet offers a versatile framework for learning from complex data while preserving the transparency and modularity often lacking in black-box models.
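
The abstract above refers to BIRCH-inspired clustering features. As background, the sketch below shows the standard BIRCH clustering feature, a triple (N, LS, SS) that summarizes a subcluster and can be merged by simple addition. It illustrates the general BIRCH idea only; how SOMTreeNet actually stores or uses such summaries is not stated in the abstract, and the function names here are hypothetical.

```python
import math


def cf_from_point(p):
    """Clustering feature of a single point: (count, linear sum, sum of squared norms)."""
    return (1, list(p), sum(v * v for v in p))


def cf_merge(a, b):
    """CF additivity: merging two subclusters adds their summaries component-wise."""
    return (a[0] + b[0], [x + y for x, y in zip(a[1], b[1])], a[2] + b[2])


def cf_centroid(cf):
    """Centroid recovered from the summary alone: LS / N."""
    n, ls, _ = cf
    return [v / n for v in ls]


def cf_radius(cf):
    """Root-mean-square distance of the summarized points to their centroid."""
    n, _, ss = cf
    c2 = sum(v * v for v in cf_centroid(cf))
    # Clamp at zero to absorb tiny negative values from floating-point rounding.
    return math.sqrt(max(ss / n - c2, 0.0))


# Two toy points merged into one subcluster summary.
merged = cf_merge(cf_from_point([0.0, 0.0]), cf_from_point([2.0, 0.0]))
```

Because the centroid and radius come from the summary alone, a tree node can absorb new samples without revisiting the raw data, which is what makes this representation attractive for incremental, hierarchical models.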

(This article belongs to the Special Issue New Advances in Data Analytics and Mining)

29 pages, 21087 KB

Open Access Article

Multi-Scale Ecosystem Service Supply–Demand Dynamics and Driving Mechanisms in Mainland China During the Last Two Decades: Implications for Sustainable Development

by Menghao Qi, Mingcan Sun, Qinping Liu, Hongzhen Tian, Yanchao Sun, Mengmeng Yang and Hui Zhang

Sustainability 2025, 17(15), 6782; https://doi.org/10.3390/su17156782 - 25 Jul 2025

Viewed by 697

Abstract

The growing mismatch between ecosystem service (ES) supply and demand underscores the importance of thoroughly understanding their spatiotemporal patterns and key drivers to promote ecological civilization and sustainable development at the regional level in China. This study investigates six key ES indicators across mainland China—habitat quality (HQ), carbon sequestration (CS), water yield (WY), sediment delivery ratio (SDR), food production (FP), and nutrient delivery ratio (NDR)—by integrating a suite of analytical approaches. These include a spatiotemporal analysis of trade-offs and synergies in supply, demand, and their ratios; self-organizing maps (SOM) for bundle identification; and interpretable machine learning models. While prior research studies have typically examined ES at a single spatial scale, focusing on supply-side bundles or associated drivers, they have often overlooked demand dynamics and cross-scale interactions. In contrast, this study integrates SOM and SHAP-based machine learning into a dual-scale framework (grid and city levels), enabling more precise identification of scale-dependent drivers and a deeper understanding of the complex interrelationships between ES supply, demand, and their spatial mismatches. The results reveal pronounced spatiotemporal heterogeneity in ES supply and demand at both grid and city scales. Overall, the supply services display a spatial pattern of higher values in the east and south, and lower values in the west and north. High-value areas for multiple demand services are concentrated in the densely populated eastern regions. The grid scale better captures spatial clustering, enhancing the detection of trade-offs and synergies. 
For instance, the correlation between HQ and NDR supply increased from 0.62 (grid scale) to 0.92 (city scale), while the correlation between HQ and SDR demand decreased from −0.03 to −0.58, indicating that upscaling may highlight broader synergistic or conflicting trends missed at finer resolutions. In the spatiotemporal interaction network of supply–demand ratios, CS, WY, FP, and NDR persistently show low values (below −0.5) in western and northern regions, indicating ongoing mismatches and uneven development. Driver analysis demonstrates scale-dependent effects: at the grid scale, HQ and FP are predominantly influenced by socioeconomic factors, SDR and WY by ecological variables, and CS and NDR by climatic conditions. At the city level, socioeconomic drivers dominate most services. Based on these findings, nine distinct supply–demand bundles were identified at both scales. The largest bundle at the grid scale (B3) occupies 29.1% of the study area, while the largest city-scale bundle (B8) covers 26.5%. This study deepens the understanding of trade-offs, synergies, and driving mechanisms of ecosystem services across multiple spatial scales; reveals scale-sensitive patterns of spatial mismatch; and provides scientific support for tiered ecological compensation, integrated regional planning, and sustainable development strategies.

24 pages, 18590 KB

Open Access Article

Soil Organic Matter (SOM) Mapping in Subtropical Coastal Mountainous Areas Using Multi-Temporal Remote Sensing and the FOI-XGB Model

by Hao Zhang, Xiaomei Li, Jinming Sha, Jiangning Ouyang and Zhipeng Fan

Remote Sens. 2025, 17(15), 2547; https://doi.org/10.3390/rs17152547 - 22 Jul 2025

Cited by 1 | Viewed by 453

Abstract

Accurate regional-scale mapping of soil organic matter (SOM) is crucial for land productivity management and global carbon pool monitoring. Current remote sensing inversion of SOM faces challenges, including the underutilization of temporal information and low feature selection efficiency. To address these limitations, this study developed an integrated framework combining multi-temporal Landsat imagery, field-measured SOM data, intelligent feature optimization, and machine learning. The framework employs two novel image-processing strategies: the Maximum Annual Bare-Soil Composite (MABSC) method to extract background spectral information and the Multi-temporal Feature Optimization Composite (MFOC) method to capture seasonal and environmental dynamics. These features, along with topographic covariates, were processed using an improved Feature-Optimized and Interpretable XGBoost (FOI-XGB) model for key variable selection and spatial mapping. Validation across two subtropical coastal mountainous regions at different scales in southeastern China demonstrated the framework’s effectiveness and robustness. Key findings include the following: (1) Both the MABSC-derived spectral bands and the MFOC-optimized indices significantly outperformed traditional single-season approaches. Their combined use achieved a moderate SOM inversion accuracy (R² = 0.42–0.44). (2) The FOI-XGB model substantially outperformed traditional feature selection methods (Pearson, SHAP, and CorrSHAP), achieving significant regional R² improvements ranging from 9.72% to 88.89%. (3) The optimal model integrating the MABSC-derived features, MFOC-optimized indices, and topographic covariates attained the highest accuracy (R² up to 0.51). This represents major improvements compared with using topographic covariates alone (R² increase of up to 160.11%) or the combined spectral features (MABSC + MFOC) alone (R² increase of up to 15.91%). 
This study provides a robust, scalable, and practical technical solution for accurate SOM mapping in complex environments, with significant implications for sustainable land management and carbon monitoring.
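
The percentage gains quoted in this abstract read as relative changes in R². As a hedged consistency check, assume that the 15.91% figure compares the best reported R² of the optimal model (0.51) with the upper end of the combined-spectral range (0.44). The abstract does not state which region or pair of values the figure refers to, so that pairing is an assumption:

```latex
\[
\Delta_{\mathrm{rel}}
  = \frac{R^2_{\mathrm{new}} - R^2_{\mathrm{base}}}{R^2_{\mathrm{base}}} \times 100\%
  = \frac{0.51 - 0.44}{0.44} \times 100\%
  \approx 15.91\%
\]
```

Under that assumption the arithmetic matches the reported value. The same formula explains why gains over a weak baseline, such as topographic covariates alone, can exceed 100% even though the absolute R² remains moderate.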

29 pages, 8640 KB

Open Access Article

A Multi-Objective Optimization and Decision Support Framework for Natural Daylight and Building Areas in Community Elderly Care Facilities in Land-Scarce Cities

by Fang Wen, Lu Zhang, Ling Jiang, Wenqi Sun, Tong Jin and Bo Zhang

ISPRS Int. J. Geo-Inf. 2025, 14(7), 272; https://doi.org/10.3390/ijgi14070272 - 10 Jul 2025

Viewed by 769

Abstract

With the rapid advancement of urbanization in China, the demand for community-based elderly care facilities (CECFs) has been increasing. One pressing challenge is the question of how to provide CECFs that not only meet the health needs of the elderly but also make efficient use of limited urban land resources. This study addresses this issue by adopting an integrated multi-method research framework that combines multi-objective optimization (MOO) algorithms, Spearman rank correlation analysis, ensemble learning methods (Random Forest combined with SHapley Additive exPlanations (SHAP), where SHAP enhances the interpretability of ensemble models), and Self-Organizing Map (SOM) neural networks. This framework is employed to identify optimal building configurations and to examine how different architectural parameters influence key daylight performance indicators—Useful Daylight Illuminance (UDI) and Daylight Factor (DF). Results indicate that when UDI and DF meet the comfort thresholds for elderly users, the minimum building area can be controlled to as little as 351 m² and can achieve a balance between natural lighting and spatial efficiency. This ensures sufficient indoor daylight while mitigating excessive glare that could impair elderly vision. Significant correlations are observed between spatial form and daylight performance, with factors such as window-to-wall ratio (WWR) and wall thickness (WT) playing crucial roles. Specifically, wall thickness affects indoor daylight distribution by altering window depth and shading. Moreover, the ensemble learning models combined with SHAP analysis uncover nonlinear relationships between various architectural parameters and daylight performance. In addition, a decision support method based on SOM is proposed to replace the subjective decision-making process commonly found in traditional optimization frameworks. 
This method enables the visualization of a large Pareto solution set in a two-dimensional space, facilitating more informed and rational design decisions. Finally, the findings are translated into a set of practical design strategies for application in real-world projects.

19 pages, 1706 KB

Open Access Article

An Unsupervised Anomaly Detection Method for Nuclear Reactor Coolant Pumps Based on Kernel Self-Organizing Map and Bayesian Posterior Inference

by Lin Wang, Shuqiao Zhou, Tianhao Zhang, Chao Guo and Xiaojin Huang

Energies 2025, 18(11), 2887; https://doi.org/10.3390/en18112887 - 30 May 2025

Cited by 1 | Viewed by 641

Abstract

Effectively monitoring the operational status of reactor coolant pumps (RCPs) is crucial for enhancing the safety and stability of nuclear power operations. To address the limited interpretability and suboptimal detection performance of existing methods for detecting abnormal operating states of RCPs, this paper proposes an interpretable, unsupervised anomaly detection approach that combines Kernel Self-Organizing Map (Kernel SOM) clustering with Bayesian Posterior Inference. Specifically, the proposed method uses Kernel SOM to extract typical patterns from normal operation data. Subsequently, a distance probability distribution model reflecting the data distribution structure within each cluster is constructed, providing a basis for anomaly detection. Finally, based on prior knowledge, such as the distance probability distribution, Bayesian Posterior Inference is employed to infer the probability of the equipment being in a normal state. By constructing distribution models that reflect data distribution structures and combining them with posterior inference, this approach makes the anomaly detection process traceable and interpretable, enabling operators to understand the decision logic and to analyze the causes of anomalies. Verification on real-world operational data demonstrates the method’s effectiveness. This work offers a highly interpretable solution for RCP anomaly detection, with significant implications for safety-critical applications in the nuclear energy sector.
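
The final inference step described above can be illustrated with Bayes' rule applied to a sample's distance from its cluster prototype. This is a minimal sketch under loudly stated assumptions, not the paper's model. The Gaussian distance distribution for normal operation, the flat likelihood for the abnormal state (`abnormal_density`), the prior value, and all function names are hypothetical choices made for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_normal(distance, mean, std, prior_normal=0.99, abnormal_density=0.05):
    """Return P(normal | distance) via Bayes' rule.

    mean, std:        distance distribution inside the cluster under normal
                      operation (assumed Gaussian for this sketch)
    prior_normal:     assumed prior probability of normal operation
    abnormal_density: assumed flat likelihood of any distance when abnormal
    """
    evidence_normal = gaussian_pdf(distance, mean, std) * prior_normal
    evidence_abnormal = abnormal_density * (1.0 - prior_normal)
    return evidence_normal / (evidence_normal + evidence_abnormal)


# A sample near the typical distance versus one far outside it (toy numbers).
p_typical = posterior_normal(1.0, mean=1.0, std=0.2)
p_far = posterior_normal(3.0, mean=1.0, std=0.2)
```

The posterior is a probability rather than a hard threshold, so an operator can see how strongly the evidence supports the normal state, which is the interpretability argument the abstract makes.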

(This article belongs to the Section B4: Nuclear Energy)

20 pages, 14821 KB

Open Access Article

Seismic Facies Classification of Salt Structures and Sediments in the Northern Gulf of Mexico Using Self-Organizing Maps

by Silas Adeoluwa Samuel, Camelia C. Knapp and James H. Knapp

Geosciences 2025, 15(5), 183; https://doi.org/10.3390/geosciences15050183 - 19 May 2025

Viewed by 1106

Abstract

Proper geologic reservoir characterization is crucial for energy generation and climate change mitigation efforts. While conventional techniques like core analysis and well logs provide limited spatial reservoir information, seismic data can offer valuable 3D insights into fluid and rock properties away from the well. This research focuses on identifying important structural and stratigraphic variations at the Mississippi Canyon Block 118 (MC-118) field, located on the northern slope of the Gulf of Mexico, which is significantly influenced by complex salt tectonics and slope failure. Due to a lack of direct subsurface data like well logs and cores, this area poses challenges in delineating potential reservoirs for carbon storage. The study leveraged seismic multi-attribute analysis and machine learning on 3D seismic data and well logs to improve reservoir characterization, which could inform field development strategies for hydrogen or carbon storage. Different combinations of geometric, instantaneous, amplitude-based, spectral frequency, and textural attributes were tested using Self-Organizing Maps (SOM) to identify distinct seismic facies. SOM Models 1 and 2, which combined geometric, spectral, and amplitude-based attributes, delineated potential storage reservoirs, gas hydrates, salt structures, associated radial faults, and areas with poor data quality due to the presence of the salt structures more clearly than SOM Models 3 and 4. The SOM results presented evidence of potential carbon storage reservoirs and were validated by matching reservoir sands in well log information with the seismic facies identified by the SOM. By automating data integration and property prediction, the proposed workflow leads to a cost-effective and faster understanding of the subsurface than traditional interpretation methods. Additionally, this approach may apply to other locations with sparse direct subsurface information to identify potential reservoirs of interest.

(This article belongs to the Special Issue Editorial Board Members’ Collection Series: “New Horizons in Geophysics: From Theory to Applications”)

27 pages, 6135 KB

Open Access Article

Integrated SOM Multi-Attribute Optimization and Seismic Waveform Inversion for Thin Sand Body Characterization: A Case Study of the Paleogene Lower E₃d₂ Sub-Member in the HHK Depression, Bohai Bay Basin

by Jing Wang, Dayong Guan, Xiaobo Huang, Youbin He, Hua Li, Wei Xu, Rui Liu and Bin Feng

Appl. Sci. 2025, 15(9), 5134; https://doi.org/10.3390/app15095134 - 5 May 2025

Cited by 2 | Viewed by 1159

Abstract

Thin-bedded beach-bar reservoirs in the continental faulted basins of eastern China hold significant potential, yet pose challenges for unconventional hydrocarbon development due to their thin-layer characteristics and heterogeneity. This study focuses on the Paleogene Lower E₃d₂ Sub-member in the HHK Depression, Bohai Bay Basin as a case study. We propose an innovative technical framework integrating Self-Organizing Map (SOM) multi-attribute optimization with seismic waveform inversion. Petrophysical analysis demonstrates that waveform-indicated inversion can detect 1.8–3.0 m thin sandstones, achieving a 90.2% mean match rate (95% CI: 87.5–92.7%, n = 12; bootstrap resampling) for training wells and 81.5% (95% CI: 76.8–85.3%, n = 11) for validation wells. By integrating SOM seismic attribute clustering with seismic waveform inversion, we were able to delineate microfacies boundaries with precision, enhancing the visibility of beach-bar sand body distributions. This methodology establishes a new paradigm for thin-bed sandstone prediction in low-well-control areas, providing critical support for geological interpretation and resource evaluation in complex depositional systems.
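
The match rates above are reported with bootstrap confidence intervals. The sketch below shows one common way to obtain such an interval, the percentile bootstrap of the mean. The abstract does not say which bootstrap variant or how many resamples were used, so the method details, the function name, and the per-well values (pure placeholders, not data from the paper) are assumptions.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`.

    Returns (sample mean, lower bound, upper bound).
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    # Read the interval off the empirical quantiles of the resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(values), lower, upper


# Placeholder match rates for twelve hypothetical wells (NOT the paper's data).
placeholder_rates = [0.85, 0.90, 0.95, 0.88, 0.92, 0.91,
                     0.87, 0.93, 0.90, 0.89, 0.94, 0.86]
mean_rate, ci_low, ci_high = bootstrap_ci(placeholder_rates)
```

With only a dozen wells, a resampling interval of this kind avoids leaning on a normality assumption, which is presumably why the authors report it.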

(This article belongs to the Special Issue The Exploration and Development of Unconventional Hydrocarbon Resources, 2nd Edition)

35 pages, 30272 KB

Open Access Article

Machine-Learning-Based Integrated Mining Big Data and Multi-Dimensional Ore-Forming Prediction: A Case Study of Yanshan Iron Mine, Hebei, China

by Yuhao Chen, Gongwen Wang, Nini Mou, Leilei Huang, Rong Mei and Mingyuan Zhang

Appl. Sci. 2025, 15(8), 4082; https://doi.org/10.3390/app15084082 - 8 Apr 2025

Cited by 2 | Viewed by 1988

Abstract

With the rapid development of big data and artificial intelligence technologies, the era of Industry 4.0 has driven large open-pit mines towards digital and intelligent transformation. This is particularly true in mature mining areas such as the Yanshan Iron Mine, where the depletion of shallow proven reserves and the increasing issues of mixed surrounding rocks with shallow ore bodies make it increasingly important to build intelligent mines and implement green and sustainable development strategies. However, previous mineralization predictions for the Yanshan Iron Mine largely relied on traditional geological data (such as blasting rock powder, borehole profiles, etc.) exploration reports or three-dimensional explicit ore body models, which lacked precision and were insufficient to meet the requirements for intelligent mine construction. Therefore, this study, based on artificial intelligence technology, focuses on geoscience big data mining and quantitative prediction, with the goal of achieving multi-scale, multi-dimensional, and multi-modal precise positioning of the Yanshan Iron Mine and establishing its intelligent mine technology system. The specific research contents and results are as follows: (1) This study collected and organized multi-source geoscience data for the Yanshan Iron Mine, including geological, geophysical, and remote sensing data, such as mine drilling data, centimeter-level drone image data, and high-spectral data of rocks and minerals, establishing a rich mine big data set. (2) SOM clustering analysis was performed on the elemental data of rock and mineral samples, identifying key elements positively correlated with iron as Mg, Al, Si, S, K, Ca, and Mn. TSG was used to interpret shortwave and thermal infrared hyperspectral data of the samples, identifying the main alteration mineral types in the mining area. 
Combined with spectral and elemental analysis, the universality of alteration features such as chloritization and carbonation, which are closely related to the mineralization process, was further verified. (3) Based on the spectral and elemental grade data of rock and mineral samples, a training model for ore grade–spectrum correlation was constructed using Random Forests, Support Vector Machines, and other algorithms, with the SMOTE algorithm applied to balance positive and negative samples. This model was then applied to centimeter-level drone images, achieving high-precision intelligent identification of magnetite in the mining area. Combined with LiDAR image elevation data, a real-time three-dimensional surface mineral monitoring model for the mining area was built. (4) The Bagged Positive Label Unlabeled Learning (BPUL) method was adopted to integrate five evidence maps (carbonate alteration, chloritization, mixed rockization, fault zones, and magnetic anomalies) to conduct three-dimensional mineralization prediction analysis for the mining area. The locations of key target areas were delineated. The SHAP index and three-dimensional explicit geological models were used to conduct an in-depth analysis of the contributions of different feature variables in the mineralization process of the Yanshan Iron Mine. In conclusion, this study successfully constructed the technical framework for intelligent mine construction at the Yanshan Iron Mine, providing important theoretical and practical support for mineralization prediction and intelligent exploration in the mining area.
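
Step (3) of this abstract mentions SMOTE for balancing positive and negative samples. The core idea of SMOTE is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below shows only that interpolation step in plain Python. It is not the authors' pipeline; the function name, the parameters `k` and `seed`, and the toy points are illustrative assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation.

    minority: list of feature vectors (at least two) from the minority class
    k:        number of nearest minority neighbours to choose from
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class on the corners of the unit square (placeholder values).
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(toy_minority, n_new=10)
```

Every synthetic point lies on a line segment between two real minority samples, so the class is enlarged without simply duplicating existing records.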

(This article belongs to the Special Issue Green Mining: Theory, Methods, Computation and Application)

21 pages, 7365 KB

Open Access Article

The Spatial Distribution and Driving Mechanism of Soil Organic Matter in Hilly Basin Areas Based on Genetic Algorithm Variable Combination Optimization and Shapley Additive Explanations Interpretation

by He Huang, Yaolin Liu, Yanfang Liu, Zhaomin Tong, Zhouqiao Ren and Yifan Xie

Remote Sens. 2025, 17(7), 1186; https://doi.org/10.3390/rs17071186 - 27 Mar 2025

Viewed by 913

Abstract

Studying the spatial variation patterns and influencing factors of soil organic matter (SOM) in hilly and basin areas is of great significance for guiding agricultural production practices. This study takes Lanxi City as an example and comprehensively considers soil formation factors such as climate, vegetation, and terrain. Based on the genetic algorithm, 47 environmental variables are combined and optimized to construct a random forest (RF) model and an improved version—a random forest model based on genetic algorithm variable combination optimization (RF-GA). At the same time, the SHAP interpretation method is used to quantitatively analyze the spatial distribution characteristics of the SOM content and further identify the main driving factors. Compared with the ordinary Kriging (OK) and random forest (RF) methods, the random forest model based on genetic algorithm variable combination optimization (RF-GA) demonstrates a significantly improved prediction accuracy (R² = 0.49; RMSE = 3.49 g·kg⁻¹), with an MAE = 3.019 and LCCC = 0.67. Among the three models, the R² of the RF-GA model increases by 87.84% and 56.29%. The model prediction results indicate that the SOM content in the study area ranges from 12.11 to 31.38 g·kg⁻¹, showing spatial distribution characteristics of a higher content in mountainous areas and a lower content in plains. A further SHAP analysis shows that terrain, climate, and biological factors are key environmental factors affecting the spatial differentiation of the SOM, with the channel network base level (CNBL), which contributes 20.68% to the model, and DEM, which has a contribution rate of 5.57%, playing particularly significant roles. By regulating moisture, erosion deposition, vegetation distribution, and microclimate conditions, they significantly affect the spatial distribution of the SOM. 
In summary, the RF-GA and its interpretable prediction model constructed in this study not only effectively reveal the spatial distribution and driving mechanisms of SOM in hilly and basin areas but also provide a solid theoretical basis and practical guidance for accurate mapping, the formulation of sustainable utilization strategies for soil resources, and ensuring national food security.

12 pages, 4767 KB

Open Access Article

Disentangling Multiannual Air Quality Profiles Aided by Self-Organizing Map and Positive Matrix Factorization

by Stefano Fornasaro, Aleksander Astel, Pierluigi Barbieri and Sabina Licen

Toxics 2025, 13(2), 137; https://doi.org/10.3390/toxics13020137 - 14 Feb 2025

Viewed by 1578

Abstract

The evaluation of air pollution is a critical concern due to its potentially severe impacts on human health. Currently, vast quantities of data are collected at high frequencies, and researchers must navigate multiannual, multisite datasets to identify possible pollutant sources while dealing with noise and sparse missing data. To address this challenge, multivariate data analysis is widely used, with increasing interest in neural networks and deep learning networks alongside well-established chemometrics methods and receptor models. Here, we report a combined approach involving the Self-Organizing Map (SOM) algorithm, Hierarchical Clustering Analysis (HCA), and Positive Matrix Factorization (PMF) to disentangle multiannual, multisite data in a single analysis without first separating the sites and years. The approach proved to be valid, allowing us to detect site peculiarities in terms of pollutant sources, the variation in pollutant profiles over the years, and the outliers, affording a reliable interpretation.

(This article belongs to the Special Issue Atmospheric Emissions Characteristics and Its Impact on Human Health)

14 pages, 6956 KB

Open Access Article

Enhanced Inversion of Sound Speed Profile Based on a Physics-Inspired Self-Organizing Map

by Guojun Xu, Ke Qu, Zhanglong Li, Zixuan Zhang, Pan Xu, Dongbao Gao and Xudong Dai

Remote Sens. 2025, 17(1), 132; https://doi.org/10.3390/rs17010132 - 2 Jan 2025

Cited by 4 | Viewed by 960

Abstract

The remote sensing-based inversion of sound speed profile (SSP) enables the acquisition of high-spatial-resolution SSP without in situ measurements. The spatial division of the inversion grid is crucial for the accuracy of results, determining both the number of samples and the consistency of inversion relationships. We introduce a physics-inspired self-organizing map (PISOM) that facilitates SSP inversion by clustering samples according to the physical perturbation law. The linear physical relationship between sea surface parameters and the SSP drives dimensionality reduction for the SOM, resulting in the clustering of samples exhibiting similar disturbance laws. Subsequently, samples within each cluster are generalized to construct the topology of the solution space for SSP reconstruction. The PISOM method significantly improves accuracy compared with the SOM method without clustering. The PISOM achieves an SSP reconstruction error of less than 2 m/s in 25% of cases, while the SOM method does so in none. The transmission loss calculation also shows promising results, with an error of only 0.5 dB at 30 km, 5.5 dB smaller than that of the SOM method. A physical interpretation of the neural network processing confirms that physics-inspired clustering can bring better precision gains than the previous spatial grid.

(This article belongs to the Special Issue Artificial Intelligence for Ocean Remote Sensing)

5 pages, 1557 KB

Open Access Proceeding Paper

A Data-Driven Analysis for Understanding and Risk Estimation of Discolouration in Drinking Water Distribution Systems

by Grigorios Kyritsakas, Stewart Husband, Killian Gleeson, Katrina Flavell and Joby Boxall

Eng. Proc. 2024, 69(1), 206; https://doi.org/10.3390/engproc2024069206 - 11 Nov 2024

Viewed by 714

Abstract

This paper presents a machine learning analysis to understand the factors impacting iron concentrations and discolouration customer contacts in drinking water distribution systems. Fourteen years of network sampling and additional data from a large UK utility were collated, analysed, and interpreted using self-organising maps (SOMs), which for the first time include complex network theory (CNT) centrality metrics, to investigate how possible explanatory variables interact. The outputs are used to inform ensemble decision trees for risk estimation of iron exceedance and customer contacts for each of the utility’s district metered areas (DMAs), helping inform proactive maintenance.

(This article belongs to the Proceedings of The 3rd International Joint Conference on Water Distribution Systems Analysis & Computing and Control for the Water Industry (WDSA/CCWI 2024))

27 pages, 15476 KB

Open Access Editor’s Choice Article

Explainable AI-Based Ensemble Clustering for Load Profiling and Demand Response

by Elissaios Sarmas, Afroditi Fragkiadaki and Vangelis Marinakis

Energies 2024, 17(22), 5559; https://doi.org/10.3390/en17225559 - 7 Nov 2024

Cited by 13 | Viewed by 1811

Abstract

Smart meter data provide an in-depth perspective on household energy usage. This research leverages such data to enhance demand response (DR) programs through a novel application of ensemble clustering. Despite its promising capabilities, our literature review identified a notable under-utilization of ensemble clustering in this domain. To address this shortcoming, we applied an advanced ensemble clustering method and compared its performance with traditional algorithms, namely, K-Means++, fuzzy K-Means, Hierarchical Agglomerative Clustering, Spectral Clustering, Gaussian Mixture Models (GMMs), BIRCH, and Self-Organizing Maps (SOMs), across a dataset of 5567 households for a range of cluster counts from three to nine. The performance of these algorithms was assessed using an extensive set of evaluation metrics, including the Silhouette Score, the Davies–Bouldin Score, the Calinski–Harabasz Score, and the Dunn Index. Notably, while ensemble clustering often ranked among the top performers, it did not consistently surpass all individual algorithms, indicating its potential for further optimization. Unlike approaches that seek the algorithmically optimal number of clusters, our method proposes a practical six-cluster solution designed to meet the operational needs of utility providers. For this case, the best performing algorithm according to the evaluation metrics was ensemble clustering. This study is further enhanced by integrating Explainable AI (xAI) techniques, which improve the interpretability and transparency of our clustering results.

(This article belongs to the Special Issue Advances in Energy Market and Distributed Generation)

Search Results (37)
