Article

A Biological-Inspired Deep Learning Framework for Big Data Mining and Automatic Classification in Geosciences

by
Paolo Dell’Aversana
Independent Researcher, 20133 Milan, Italy
Minerals 2025, 15(4), 356; https://doi.org/10.3390/min15040356
Submission received: 1 February 2025 / Revised: 19 March 2025 / Accepted: 27 March 2025 / Published: 28 March 2025

Abstract

MycelialNet is a novel deep neural network (DNN) architecture inspired by natural mycelial networks. Mycelia, the vegetative part of fungi, form extensive underground networks that, in a very efficient way, connect biological entities, transport nutrients and signals, and dynamically adapt to environmental conditions. Drawing inspiration from these properties, MycelialNet integrates dynamic connectivity, self-optimization, and resilience into its artificial structure. This paper explores how mycelial-inspired neural networks can enhance big data analysis, particularly in mineralogy, petrology, and other Earth disciplines, where exploration and exploitation must be efficiently balanced during the process of data mining. We validate our approach by applying MycelialNet to synthetic data first, and then to a large petrological database of volcanic rock samples, demonstrating its superior feature extraction, clustering, and classification capabilities with respect to other conventional machine learning methods.


1. Introduction

The field of geoscience has undergone a radical transformation with the advent of big data and a large variety of machine learning (ML) methods [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17]. These technologies have enabled the development of advanced algorithms and models capable of processing and analyzing vast amounts of geoscientific data, often characterized by complexity, multi-scale dimensions, and diverse data types. As geological and geophysical exploration continue to generate enormous volumes of data from various sources—ranging from seismic surveys to remote sensing and rock sample analysis—the need for more sophisticated tools to extract meaningful insights from these datasets has become paramount. Deep learning methods, particularly in their ability to recognize patterns and make predictions, have proven to be powerful tools in geoscientific research, providing innovative solutions to challenges such as model inversion, data classification, and the identification of subsurface features. However, while machine learning algorithms excel at processing big data, they often lack the adaptive and decentralized intelligence observed in many natural biological systems.
This is where interdisciplinary research, particularly insights from biology, offers a compelling opportunity for innovation. For instance, mycology, the branch of biology dedicated to the study of fungi, provides fascinating models of decentralized intelligence, particularly through mycelial networks [18]. These networks exemplify efficient exploration and resource distribution strategies in natural ecosystems, making them an ideal inspiration for advanced AI models, particularly deep neural networks. Indeed, mycelial networks play a fundamental role in natural ecosystems by interconnecting plant roots and other biological entities through vast underground structures (see also the Data Availability Statement at the end of this paper). These networks function as information highways, transporting nutrients, signaling molecules, and even electrical impulses. Their ability to dynamically explore their environment, adapt to stress, and optimize resource distribution has profound implications for artificial intelligence. As anticipated earlier, a key aspect of the fungal life is the mycelium. This is an intricate network of fine fungal threads beneath the forest floor. It forms a vast, interwoven structure that permeates the organic geological layer. The majority of fungal biomass exists underground in this form, extending through the soil and intertwining with plant roots, soil organisms, and microbial communities. While mushrooms are commonly associated with fungi, they are merely the visible fruiting bodies of an expansive mycelial network that remains hidden beneath the surface. Despite their vast size, these underground networks remain largely unnoticed until they produce mushrooms or other fruiting structures. However, their role in ecological balance is crucial, particularly through the Common Mycelial Network (CMN), also known as the Common Mycorrhizal Network. In a CMN, plant and tree roots interconnect via a mycelial web, forming symbiotic partnerships with fungi that enable the exchange of isotopic carbon, nitrogen, phosphorus, water, and biochemical signals across species, space, and time.
Drawing inspiration from this natural network system, in this paper, we propose “MycelialNet”, a biologically inspired deep learning framework designed to enhance big data analysis in geosciences (as well as in other scientific fields). MycelialNet mimics the decentralized, adaptive intelligence of mycelial networks to create robust and dynamic machine learning systems. Unlike traditional deep learning models, which rely on rigid, predefined structures, MycelialNet embraces a self-organizing, exploratory approach that balances efficient data mining with the targeted exploitation of valuable patterns. By integrating biological principles [19,20,21,22,23,24,25]—such as emergent intelligence, adaptive learning, and distributed information processing—into AI architectures, MycelialNet offers a novel way to tackle the complexity and variability inherent in large geoscientific datasets.
Motivated by these considerations, we introduce a comprehensive approach that combines deep learning, reinforcement learning, and biological principles to address key challenges in the classification and interpretation of extensive rock sample datasets. By leveraging the fundamental properties of mycelial networks, we demonstrate how biologically (mycologically) inspired AI can improve the efficiency and accuracy of geoscientific data analysis. Specifically, our approach integrates exploratory search mechanisms, resilience to uncertainty, self-aware capabilities, and multi-scale pattern recognition, ensuring a robust performance in big data mining even in the case of complex and heterogeneous datasets.
Ultimately, this interdisciplinary framework highlights the potential of blending biological sciences, artificial intelligence, and geoscience to develop next-generation computational tools for Earth sciences. By integrating these domains, new avenues can be opened for more effective and computationally efficient big data analysis, offering innovative solutions to some of the most pressing challenges in geoscientific research.

2. Methodology

2.1. Key Components of MycelialNets

A core aspect of our biologically inspired approach to big data analysis in geosciences is the implementation of simulated mycelial deep neural networks, here named “MycelialNets”. As anticipated in the Introduction, these use a biologically (mycologically) inspired neural architecture that emulates the extraordinary adaptability of fungal networks. The core components include the following basic aspects:
(1)
“MycelialLayer”: This is a dynamic layer that adjusts its connectivity during training, pruning weak connections while regenerating new ones to optimize learning pathways.
(2)
Dynamic Connectivity: This functionality is inspired by mycelial exploration strategies. The network restructures itself iteratively, mirroring the adaptability of fungal networks.
(3)
Self-Monitoring Mechanism: The MycelialNet model incorporates self-reflection mechanisms. This aspect is inspired by the self-awareness of the biological brain, as discussed in previous works [26]. By adjusting its connectivity ratio based on performance metrics such as accuracy, the MycelialNet model can continuously monitor itself, adapting its own architecture to dynamic conditions on time-varying datasets.
(4)
Exploration Factor: This is an additional component that encourages the model to explore diverse configurations and hyperparameters. It dynamically balances exploration and exploitation while the network model searches the hyperparameter space, with the final goal of settling on an optimal MycelialNet architecture.
In the next sub-section, we define the concepts just mentioned more quantitatively, through a detailed mathematical formulation of the MycelialNet model.

2.2. Mathematical Formulation

Let $X \in \mathbb{R}^{m \times n}$ be the input data matrix, where m is the number of samples (for instance, rock samples) and n is the number of features (for instance, major oxides).
Let $Y \in \mathbb{R}^{m \times k}$ be the corresponding labels for classification with k output classes (for instance, Basalt, Andesite, Diorite, etc.).
Each input passes through multiple layers of dynamically changing artificial neurons.
Unlike standard artificial neural networks with fixed architectures, MycelialNet introduces a time-dependent weight matrix, Wt, that dynamically adapts:
$$W_t = M_t \odot W_{t-1}$$
where:
$W_t$ is the weight matrix at time t;
$W_{t-1}$ is the weight matrix at the previous time step t − 1;
$M_t \in \{0,1\}^{n \times d}$ is a binary mask matrix controlling the active connections at time t;
n is the number of features;
m is the number of samples (as anticipated earlier);
d is the number of neurons in the layer;
$\odot$ represents the Hadamard (elementwise) product.
The mask $M_t$ is updated dynamically based on a connectivity ratio $c_t$:
$$M_t = \mathbf{1}(U_t < c_t)$$
where $U_t$ is a uniform random matrix. The formula means that the mask $M_t$ is updated by comparing each element of the random matrix $U_t$ with the connectivity ratio $c_t$. The entries for which $U_t$ is smaller than $c_t$ are “activated” (set to 1), and the others are “deactivated” (set to 0). This dynamic update of the mask can be used to model how elements of a system are connected or disconnected in response to a changing connectivity threshold.
This approach allows for not only adjusting weights but also adjusting which connections exist at any given moment. Therefore, rather than having a fixed architecture, the model can ‘reconfigure’ its network structure by selectively enabling or disabling connections, much like the mycelial network of fungi that grows and adapts in response to environmental stimuli.
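To make the masking step concrete, the following minimal NumPy sketch (an illustration of the equations above, not the author's implementation; matrix sizes and the connectivity value are arbitrary) draws a uniform random matrix, compares it with the connectivity ratio, and gates the previous weights elementwise:

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 10, 8      # number of features and of neurons in the layer (illustrative sizes)
c_t = 0.7         # current connectivity ratio

W_prev = rng.normal(size=(n, d))     # W_{t-1}: weight matrix at the previous step
U_t = rng.uniform(size=(n, d))       # U_t: uniform random matrix
M_t = (U_t < c_t).astype(float)      # M_t = 1(U_t < c_t): binary mask of active connections

W_t = M_t * W_prev                   # Hadamard product: masked-out connections become zero

print(f"active connections: {int(M_t.sum())} out of {n * d}")
```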
The connectivity ratio $c_t$ evolves over training:
$$c_{t+1} = c_t + \eta \, \nabla L_t$$
where:
$\eta$ is the learning rate;
$\nabla L_t$ is the gradient of the loss function L at time t.
The total loss function is
$$L_{\mathrm{total}} = -\sum_{i=1}^{m} \sum_{j=1}^{k} y_{ij} \log(\hat{y}_{ij})$$
where:
m is the number of samples (e.g., rock samples), as stated earlier;
k is the number of output classes (e.g., types of rock), as stated earlier;
$y_{ij}$ is the true label (one-hot encoded, where $y_{ij} = 1$ for the true class and $y_{ij} = 0$ otherwise);
$\hat{y}_{ij}$ is the predicted probability of the j-th class for the i-th sample, computed using the Softmax function.
We briefly recall that the Softmax function is a mathematical function commonly used in machine learning, particularly in multi-class classification problems. It takes a vector of real numbers as the input and converts it into a probability distribution, where each element lies in the range (0, 1) and the sum of all elements equals 1. In this case, we compute the Softmax for each sample and class. The final loss is the sum over all samples and classes.
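For reference, a small NumPy sketch of the Softmax transformation and of the summed cross-entropy loss defined above (a generic illustration, with made-up logits and labels; a small epsilon is added inside the logarithm for numerical safety):

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise maximum for numerical stability, then normalize to probabilities.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(y_true, y_prob, eps=1e-12):
    # y_true: one-hot labels (m x k); y_prob: predicted probabilities (m x k).
    return -np.sum(y_true * np.log(y_prob + eps))

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 1.2, 0.3]])
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

y_prob = softmax(logits)
print(y_prob.sum(axis=1))             # each row sums to 1
print(cross_entropy(y_true, y_prob))  # summed loss over samples and classes
```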
Coming back to the computation of the gradient of the loss function, in our case, higher gradients increase MycelialNet connectivity, mimicking the mycelial network’s expansion in response to environmental stimuli.
For a given neuronal layer l, the activation $H_t^l$ at time t is computed as:
$$H_t^l = \sigma\left(H_t^{l-1} W_t^l + b^l\right)$$
where:
$H_t^{l-1}$ is the activation from the previous neuronal layer;
$W_t^l$ is the dynamically adjusted weight matrix;
$b^l$ is the bias vector;
σ is an activation function (e.g., Rectified Linear Unit, briefly ReLU, or sigmoid, as well as other activation functions settable by the user).
The output of the final layer is computed as:
$$\hat{Y} = \mathrm{Softmax}\left(H_t^P W_t^P + b^P\right)$$
where P is the total number of layers.
This equation intuitively means that the final output of the neural network is computed in a multi-class classification problem. The model’s final layer uses weights and biases to compute raw scores (logits), and the Softmax function is then applied to transform these raw scores into a probability distribution, making it suitable for classification tasks. The network minimizes a standard cross-entropy loss for classification. Additionally, we introduce a regularization term to encourage network sparsity. The gradient update rule for weights is given by:
$$W_{t+1} = W_t - \alpha \frac{\partial L_{\mathrm{total}}}{\partial W_t}$$
This equation allows optimizing the weights in the MycelialNet model by moving them in the direction that reduces the total loss function. By iteratively applying this rule, the model learns to make better predictions. The learning rate α controls how quickly or slowly the weights are updated in each iteration.
Finally, to balance the exploration and exploitation of the parameter and hyper-parameter spaces, we introduce an entropy-based connectivity adjustment:
$$c_{t+1} = c_t + \gamma \, H(X)$$
where H(X) is the entropy of the activations:
$$H(X) = -\sum_{i=1}^{N} p_i \log p_i$$
This is the standard formula for the Shannon entropy, which measures the uncertainty in the system’s state. It is used here to measure the “spread” or uncertainty of the activations, guiding the network’s adaptability. Higher entropy leads to increased connectivity, in an attempt to reduce uncertainty, mimicking mycelial expansion in high-information regions.
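Putting the pieces together, the sketch below (a simplified reading of this section, with illustrative sizes and coefficients, not the published code) performs the masked forward pass of a single layer and then adjusts the connectivity ratio using the Shannon entropy of the normalized activations; clipping the ratio to [0, 1] is an added safeguard.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def shannon_entropy(activations, eps=1e-12):
    # Normalize activation magnitudes into a probability distribution, then apply H = -sum p log p.
    p = np.abs(activations).ravel()
    p = p / (p.sum() + eps)
    return float(-np.sum(p * np.log(p + eps)))

rng = np.random.default_rng(1)
m, n, d = 32, 10, 16                  # samples, input features, neurons (illustrative)
X = rng.normal(size=(m, n))
W = rng.normal(size=(n, d)) * 0.1
b = np.zeros(d)

c_t, gamma = 0.6, 0.01                # connectivity ratio and entropy gain (illustrative values)

M = (rng.uniform(size=W.shape) < c_t).astype(float)   # dynamic binary mask
H = relu(X @ (M * W) + b)                             # masked forward pass of one layer

c_next = np.clip(c_t + gamma * shannon_entropy(H), 0.0, 1.0)
print(round(float(c_next), 3))
```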

3. Simulations

Previous works have demonstrated the efficacy of both chemical composition and thin section images for automatic rock (and facies) classification using various machine learning techniques [27,28,29,30]. These are particularly valuable in the context of big data in geosciences, where large datasets with complex features require sophisticated methods to extract meaningful insights. To verify the effectiveness of our MycelialNet method, we apply it to both synthetic and real-world mineralogical datasets, comparing its performance against traditional classifiers. The key evaluation metric is classification accuracy.

3.1. First Synthetic Test

The first classification test discussed here simulates a three-category classification task, where the goal is to classify a simulated large dataset of rock samples into three distinct hypothetical rock types. The features used for classification are typical oxides commonly found in geological studies, such as TiO2, SiO2, and others. These features are crucial in identifying different types of rocks based on their mineral composition, as we will see in the real-data test discussed ahead. Of course, traditional rock classification relies on multiple factors, including textural and contextual information. However, this test (as well as the subsequent tests) serves an illustrative and methodological purpose, and we limited the number of features to keep the complexity low. In future works, once the methodology is properly consolidated, we plan to incorporate additional chemical, physical, and structural features that will be useful for our machine learning applications.
The synthetic dataset consists of 1000 simulated rock samples, each described by features representing the concentrations or ratios of oxides in the sample. The neural network model was trained on these data using the MycelialNet approach, which proved to be highly effective in handling the complexity of the data. In fact, the accuracy achieved by the model is relatively high (89%), demonstrating its ability to make reliable classifications despite the potential variability in the input data (Figure 1, left panel).
The convergence of both the Training and Test Loss curves (Figure 1, right panel) was fast and effective over the epochs. Recall that an epoch represents a full pass through the training dataset during the learning process. Thanks to entropy-driven connectivity adjustments, the model learned efficiently while avoiding overfitting. This was evident in the Loss function trends, where both the Training and Test Losses showed a rapid and consistent reduction over the epochs (right panel of Figure 1). The entropy-based updates help the model adapt to the underlying data distribution, improving both speed and stability during the optimization process. We ran many other tests like this one, comparing the performance of the MycelialNet model with that of other classifiers (such as “standard” fully connected neural networks). In most cases, MycelialNet outperforms the standard machine learning models, with an estimated improvement ranging between 8% and 12% in accuracy, precision, recall, and F1 score.
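The setting of this first test can be reproduced in spirit with scikit-learn; the generator below is only a stand-in for the synthetic oxide dataset (the MycelialNet code itself is not reproduced here), and a fixed-architecture network is used as the reference classifier:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Stand-in for the synthetic rock dataset: 1000 samples, oxide-like features, 3 classes.
X, y = make_classification(n_samples=1000, n_features=8, n_informative=5,
                           n_classes=3, random_state=42)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    stratify=y, random_state=42)

# "Standard" fully connected neural network used as a reference point.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=42)
clf.fit(X_train, y_train)
print("test accuracy:", round(accuracy_score(y_test, clf.predict(X_test)), 3))
```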

3.2. Addressing Non-Linear Classification Challenges with MycelialNet

A fundamental challenge in machine learning classification tasks arises when class boundaries in the feature space are non-linear. Many traditional classification models struggle with such problems because they rely on linear decision boundaries, making them ineffective for complex datasets where classes are intertwined in intricate ways.
Of course, current classification methods, such as decision trees, support vector machines (SVMs) with non-linear kernels, and deep learning models with non-linear activations, are capable of handling non-linear decision boundaries. We remark that MycelialNet does not aim to replace these techniques; rather, it seeks to introduce an additional layer of flexibility by evolving its internal architecture in response to the data, not just adjusting its weights. This approach allows MycelialNet to potentially uncover more intricate patterns in the data by learning both the structure of the model and the decision boundaries simultaneously, in a way that most current models, which often rely on predefined kernel functions or architectures, may not. Indeed, the MycelialNet model includes new aspects, processes, and mechanisms addressing the self-monitoring and self-adjustment of the entire network architecture (not only the connection weights). Inspired by the resilient and self-optimizing nature of fungal mycelial networks, the model continuously adjusts its internal structure, selectively pruning and regenerating connections to improve performance on challenging classification tasks. This adaptability allows it to handle large, complex datasets with greater efficiency compared to conventional neural networks.
In the second test discussed in this paper, we use a synthetic dataset based on the “make_moons” function, which generates a dataset where the two classes are intertwined in a “crescent-moon” shape. MycelialNet successfully captures the underlying patterns and effectively distinguishes between the classes.
As in the previous test, the network consists of multiple MycelialLayers, each capable of adjusting its structure dynamically, while the final dense layer produces the output classification. Figure 2 shows the scatter plot of the synthetic data, displaying the distribution of data points with their respective classes. The “Decision Boundary Plot” illustrates how MycelialNet effectively separates the two intertwined classes, showcasing its ability to handle non-linear classification tasks.
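The non-linear benchmark can be reproduced with scikit-learn's make_moons generator; the sketch below visualizes the data and a decision-boundary grid using a generic non-linear classifier as a stand-in for MycelialNet (whose code is not reproduced here):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# Two intertwined "crescent-moon" classes.
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

# Stand-in non-linear classifier used only to draw a decision boundary.
clf = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0).fit(X, y)

# Evaluate the classifier on a grid covering the feature space.
xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 0.5, X[:, 0].max() + 0.5, 300),
                     np.linspace(X[:, 1].min() - 0.5, X[:, 1].max() + 0.5, 300))
zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, zz, alpha=0.3, cmap="coolwarm")
plt.scatter(X[:, 0], X[:, 1], c=y, cmap="coolwarm", edgecolor="k", s=15)
plt.title("Decision boundary on the make_moons dataset")
plt.show()
```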

4. Test on a Real Dataset of Rock Samples

4.1. Introducing the Test

After discussing the synthetic tests, we apply MycelialNet to a real dataset. We remark that this type of deep neural network model is designed to perform both supervised and unsupervised analysis of big datasets, offering powerful capabilities for data mining and statistical analysis. It is particularly well suited for tasks where large amounts of data must be processed, analyzed, and categorized, such as rock sample datasets in mineralogical and petrological disciplines or in geophysics. In the case of rock sample datasets, MycelialNet can be trained to classify different types of rocks (e.g., Andesite, Basalt, Granite) based on their chemical composition, specifically oxide percentages (SiO2, Al2O3, Fe2O3, etc.), by learning patterns in the data that map these features to rock types. However, it is well known that every automatic classification approach is improved if it is preceded by an adequate analysis of the entire dataset, allowing the clear identification of key features, correlations, and hidden relationships in the data. For that reason, MycelialNet is also designed to perform effective statistical analysis, unsupervised learning, and clustering tasks, where there are no predefined labels. MycelialNet mining techniques include identifying hidden correlations between different oxides and rock types, as well as identifying anomalies in the dataset (such as outliers in the chemical composition). This type of deep learning method allows for assessing how different oxide percentages correlate with each other, identifying highly correlated features (which can be important for feature selection and dimensionality reduction). Moreover, it is very effective in determining which features (oxides, in this case) have the most influence on the classification of rock types, as well as in identifying groups of unlabeled rock samples that share similar chemical compositions. MycelialNet also integrates self-reflection mechanisms [26], which allow the model to evaluate and modify its architecture dynamically based on feedback from the data. This helps improve the model’s performance by optimizing its internal parameters during data mining and adjusting its structure (e.g., layer sizes and neuron connections) for a better representation of the data. This feature is particularly useful in unsupervised learning scenarios, where the model may adapt its structure to uncover patterns not originally anticipated.
In the test discussed in this section, we applied MycelialNet to a real rock sample dataset. We started with the unsupervised clustering of rock types based on their oxide content, helping geologists identify natural groupings or categories of rocks based on their chemical signatures. Next, we performed a correlation analysis between oxides to determine which oxides tend to vary together, which could provide insights into how geological processes influence rock composition. After performing the unsupervised analysis, we trained MycelialNet in a supervised learning process to classify rock samples into predefined categories (e.g., Andesite, Basalt, etc.). For that classification task, the model uses labeled training data (with known rock types, based on analysis by human experts) and learns from these samples to predict the rock type of new, unseen samples. We evaluated the performance of the MycelialNet model using standard classification metrics (accuracy, precision, recall, and AUC). By incorporating cross-validation, the model provides robust performance estimates.
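The evaluation metrics mentioned above can be computed with standard scikit-learn utilities. The snippet below sketches a cross-validated scoring loop; the placeholder data stand in for the labeled oxide table, and the Random Forest is used only as an example of an estimator with the usual fit/predict interface:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

# Placeholder data standing in for (oxide features, rock-type labels).
X_rocks, y_rocks = make_classification(n_samples=600, n_features=10,
                                       n_informative=6, n_classes=4, random_state=0)

scoring = ["accuracy", "precision_macro", "recall_macro", "roc_auc_ovr"]
model = RandomForestClassifier(random_state=0)   # any estimator exposing fit / predict_proba

scores = cross_validate(model, X_rocks, y_rocks, cv=5, scoring=scoring)
for name in scoring:
    vals = scores["test_" + name]
    print(f"{name}: {vals.mean():.3f} +/- {vals.std():.3f}")
```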

4.2. The Dataset

The dataset used in this test is sourced from the publicly available GEOROC (Geochemistry of Rocks of the Oceans and Continents) database (GEOROC website, accessed on 15 January 2025). It consists of a compilation of more than 1500 major- and trace-element data points, and 570 Pb-isotopic analyses of Mesozoic–Cenozoic (190–0 Ma) magmatic rocks in southern Peru, northern Chile, and Bolivia (Central Andean orocline) [31]. The chemical oxides used for this classification test include SiO2, TiO2, Al2O3, Fe2O3, MnO, MgO, CaO, Na2O, K2O, and P2O5 (in weight %). The rock samples encompass both the dominant rock types, such as Andesite, Basaltic Andesite, Rhyolite, and Dacite, as well as less common classes with far fewer samples. The chemical and mineralogical compositions of these rock types exhibit considerable overlap, adding complexity to the classification process. As highlighted in the previous synthetic test, one of the key challenges in this classification task is the presence of non-linear decision boundaries due to the overlapping chemical compositions of different rock classes. Traditional machine learning models, such as Support Vector Machines (SVMs), Random Forest, and conventional Neural Networks, often struggle with this high-dimensional, non-linearly separable dataset. For that reason, we attempted to perform this classification task using the MycelialNet model, comparing the results with those obtained by other “traditional” machine learning methods.
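Assuming the compiled GEOROC samples have been exported to a flat table (the file name and column names below are hypothetical and must be adapted to the actual export), the oxide columns listed above can be loaded and inspected with pandas before any modeling:

```python
import pandas as pd

# Oxide columns used in this test (assumed column names).
OXIDES = ["SiO2", "TiO2", "Al2O3", "Fe2O3", "MnO",
          "MgO", "CaO", "Na2O", "K2O", "P2O5"]

# Hypothetical CSV export of the compiled Central Andean samples.
df = pd.read_csv("central_andes_samples.csv")

data = df[OXIDES].dropna()                  # keep rows with a complete oxide analysis
labels = df.loc[data.index, "RockType"]     # assumed label column with the rock names

print(data.describe().round(2))             # quick summary of the oxide distributions
print(labels.value_counts())                # class balance across rock types
```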

4.3. Workflow

We started the data mining by creating a complete correlation matrix (Figure 3). Creating a correlation matrix is essential for understanding the quantitative relationships between oxides in a large rock sample dataset. This is crucial for multiple reasons. First, correlations can reveal geochemical associations: certain oxides co-vary due to mineralogical and petrogenetic processes (e.g., Al2O3 and K2O in feldspars). Second, strongly correlated oxides indicate redundancy, allowing for dimensionality reduction and feature selection to improve computational efficiency. Third, the correlation matrix helps cluster rock types based on oxide interdependencies. Furthermore, this type of data representation allows us to identify which oxides best separate certain rock types, improving the decision boundary accuracy, and it can reveal latent geochemical trends useful for both classification and exploratory analysis. Finally, the correlation matrix helps highlight economic mineralization trends; for instance, correlations between oxides (e.g., Fe2O3 with TiO2) can indicate ore deposit formation.
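A correlation matrix like the one in Figure 3 can be produced directly from the oxide table; the sketch below continues the hypothetical `data` DataFrame from the previous snippet and draws the matrix as a heatmap with matplotlib:

```python
import matplotlib.pyplot as plt

corr = data.corr()                          # pairwise Pearson correlations between oxides

fig, ax = plt.subplots(figsize=(8, 7))
im = ax.imshow(corr.values, cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=90)
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(im, ax=ax, label="Pearson correlation")
ax.set_title("Oxides' correlation matrix")
plt.tight_layout()
plt.show()
```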
Next, we created histograms of all oxides (Figure 4). The x-axis of each histogram corresponds to the concentration of a specific oxide in the samples, measured in percentages or weight ratios (e.g., SiO2, Fe2O3, and Al2O3). Each bin in the histogram represents a range of values for that oxide, showing how often certain concentration levels appear in the dataset. The y-axis represents the frequency (or count) of samples that fall into each concentration range: higher bars indicate that more samples fall into that range, while shorter bars represent fewer samples. These histograms help identify asymmetrical distributions, guiding MycelialNet to apply adaptive normalization techniques instead of conventional scaling, which might misrepresent feature importance. MycelialNet uses histogram-based variance analysis to automatically drop features with little variance or reduce their weight, making the model more computationally efficient. Furthermore, histograms help visualize how distributions overlap, guiding feature engineering for MycelialNet’s dynamic connectivity. In addition, histograms allow MycelialNet to detect natural groupings of rock types before labels are applied, leading to a more efficient classification process. Finally, if histograms show that an oxide has a multimodal distribution, MycelialNet can apply different learning rates to different sub-populations in the data, improving the convergence speed. Another important aspect, strictly linked with the mining industry, is that histograms can reveal geochemical signatures of ore deposits. In fact, many ore deposits are characterized by specific oxide enrichments (e.g., Fe2O3 for iron ore, or TiO2 for titanium deposits). Histograms help identify whether certain oxide levels correlate with economic mineralization, guiding prospecting models. In conclusion, histograms of oxides are not just visual tools; they are a fundamental step in intelligent data mining, feature selection, and classification, especially when using MycelialNet. They enhance the model’s ability to dynamically adapt, extract meaningful patterns, and make accurate predictions in large-scale rock sample datasets.
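The per-oxide histograms of Figure 4 follow the same pattern (again using the hypothetical `data` table; bin count and figure layout are arbitrary choices):

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 5, figsize=(15, 6))
for ax, oxide in zip(axes.ravel(), data.columns):
    ax.hist(data[oxide], bins=30, color="steelblue", edgecolor="black")
    ax.set_title(oxide)
    ax.set_xlabel("wt%")
    ax.set_ylabel("count")
plt.tight_layout()
plt.show()
```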
An additional step of the unsupervised data mining workflow is to visualize the oxide content for each rock type (Figure 5). When using MycelialNet, this visualization is essential because it helps uncover geochemical patterns before classification. This step identifies the specific geochemical signatures of different rock types, enhancing the ability to cluster similar samples and refine feature selection. It also improves data interpretation by validating results against geological knowledge while detecting potential outliers or rare rock types. Additionally, it supports dimensionality reduction by guiding techniques like Principal Component Analysis (PCA) and helps in designing more effective supervised learning strategies. By first analyzing oxide distributions, MycelialNet structures the data more effectively, leading to more accurate and meaningful classifications.
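One way to visualize the oxide content for each rock type, in the spirit of Figure 5, is with grouped boxplots (a sketch using the hypothetical `data` and `labels` objects defined earlier; the published figure may use a different plot type):

```python
import matplotlib.pyplot as plt

oxide = "SiO2"                                   # repeat for the other oxides as needed
rock_types = sorted(labels.unique())

fig, ax = plt.subplots(figsize=(9, 4))
ax.boxplot([data.loc[labels == rt, oxide] for rt in rock_types], labels=rock_types)
ax.set_xlabel("Rock type")
ax.set_ylabel(f"{oxide} (wt%)")
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
plt.show()
```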

4.4. Supervised Learning and Classification Results

After this thorough analysis of the dataset, we applied the MycelialNet model to perform the supervised classification of the rock samples. The classification accuracy was almost 90% on the test dataset, a result that is particularly significant given the complexity and variability of the dataset. The network was trained on a percentage of the dataset ranging between 80% and 90%; the remaining data, treated as unlabeled, were used as the test dataset. The classification results were visualized using cross-plots that display the predicted rock-class labels in a two-feature space. Figure 6 presents an example of a classification cross-plot where the test data are color-coded according to their assigned rock-class labels in the SiO2–TiO2 feature space. This visualization highlights the model’s ability to distinguish rock types, with some uncertainties, based on their oxide composition. The trend is generally correct: SiO2 tends to increase from basalt to andesite, dacite, trachyandesite, and rhyolite. TiO2, instead, decreases with increasing SiO2, being highest in basalts and basaltic andesites and lowest in rhyolites. There are some minor misclassification cases. For instance, some diorite samples (yellow dots) appear at a lower SiO2 level (around 50–52%), where basalts and basaltic andesites should be. A possible explanation is that some diorite compositions can be transitional or can contain mafic inclusions. More generally, rock classifications based on oxides are not always rigid: some transitional types exist. Furthermore, geochemical variations in different samples or minor alteration effects can influence SiO2 and TiO2 values. In addition, we remark that this plot (as well as the plot in Figure 7) considers the projection onto a two-feature space only (SiO2 and TiO2), whereas our classification is based on multiple oxides (e.g., K2O, Na2O, and Al2O3).
Similarly, Figure 7 provides an additional classification display, showing the results in the Fe2O3–MgO feature space. The general trends are as expected: Fe2O3 and MgO decrease as rocks evolve from basalt, andesite, and basaltic andesite to dacite, trachyandesite, and rhyolite. Mafic rocks (basalt and andesite) show high Fe2O3 and MgO, while felsic rocks (dacite and rhyolite) show low levels of these oxides. The few minor misclassification cases can be explained as in Figure 6.
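Cross-plots like those in Figures 6 and 7 simply project the test samples onto two oxide axes and color them by predicted class. A sketch is given below, assuming a test DataFrame X_test with oxide columns and an array y_pred of predicted rock-type labels (both hypothetical names):

```python
import matplotlib.pyplot as plt

def classification_crossplot(X_test, y_pred, ox_x, ox_y):
    # Scatter the test samples in a two-oxide feature space, colored by predicted class.
    fig, ax = plt.subplots(figsize=(7, 5))
    for rock_type in sorted(set(y_pred)):
        mask = (y_pred == rock_type)
        ax.scatter(X_test.loc[mask, ox_x], X_test.loc[mask, ox_y], s=12, label=rock_type)
    ax.set_xlabel(f"{ox_x} (wt%)")
    ax.set_ylabel(f"{ox_y} (wt%)")
    ax.legend(fontsize=8)
    plt.tight_layout()
    plt.show()

classification_crossplot(X_test, y_pred, "SiO2", "TiO2")    # as in Figure 6
classification_crossplot(X_test, y_pred, "Fe2O3", "MgO")    # as in Figure 7
```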
In summary, both figures illustrate the effectiveness of MycelialNet in correctly assigning rock classes while preserving the geochemical relationships within the dataset. For comparison purposes, we applied several different classification models to the same dataset to evaluate their performance relative to MycelialNet. The models tested include Random Forest, Logistic Regression, and a “standard” neural network (with fixed architecture). Each model was trained using the same input features and evaluated under identical conditions to ensure a fair comparison. The classification accuracy achieved by each model is summarized in Table 1. As shown in the results, the MycelialNet model outperformed all other classifiers, achieving the highest accuracy of 87.5%. In contrast, the Random Forest model achieved an accuracy of 62.5%, while Logistic Regression and the standard neural network reached 65% and 69%, respectively. These results highlight the superior performance of MycelialNet in effectively learning complex patterns within the dataset, demonstrating its potential as a powerful classification tool in this context.
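The baseline comparison of Table 1 can be reproduced in structure (not in the exact numbers, which depend on the split and on the MycelialNet implementation, not shown here) with a loop over standard scikit-learn classifiers trained on the same split used for MycelialNet; X_train, y_train, X_test, and y_test are placeholders:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

baselines = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "Standard Neural Network": make_pipeline(StandardScaler(),
                                             MLPClassifier(hidden_layer_sizes=(64, 32),
                                                           max_iter=2000, random_state=0)),
}

# X_train, y_train, X_test, y_test: the same split used for MycelialNet (placeholders).
for name, model in baselines.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {acc:.3f}")
```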

5. Discussion

All the tests discussed in the previous sections (on both simulated and real data) show that the MycelialNet model offers several advantages over conventional deep learning models when it comes to data mining, unsupervised learning, and supervised learning. These advantages stem from its unique architecture and learning mechanisms, which make it more adaptable, self-organizing, and efficient in handling complex datasets. Unlike conventional machine learning models with static architectures, MycelialNet incorporates self-organizing connectivity, inspired by the behavior of mycelial networks in nature. This means that the model dynamically adjusts its internal neuron connections based on data patterns, instead of relying on pre-defined layers and connections, allowing it to reconfigure itself for both unsupervised data mining and supervised classification. As shown by both the synthetic and real data tests, this approach enables a more organic analysis of information, avoiding bottlenecks and reducing unnecessary computational costs. It helps automatically uncover hidden relationships in large datasets without requiring predefined structures, an advantage over conventional models that often struggle with fixed architectures.
As clearly shown in the test on real data, MycelialNet first analyzes the dataset without labels, using clustering, density estimation, and correlation studies to understand the structure of the data. It can self-discover relationships between features and classes before formal classification begins. Once patterns are identified, the model can transition to supervised learning, using labeled data to train a classifier more efficiently. The prior unsupervised step enhances generalization because the network already understands the data distribution. In other words, instead of blindly training on labeled rock types, MycelialNet first systematically explores the oxide compositions, identifies hidden clusters, and then fine-tunes a classification model. This makes the final model more accurate and generalizable compared to conventional neural networks or other machine learning methods.
Furthermore, MycelialNet integrates self-aware deep learning techniques, allowing it to evaluate its own learning process and adjust its parameters dynamically, without any external human intervention. Finally, the MycelialNet model evolves by allowing multiple competing subnetworks to solve the same task and by dynamically selecting the best configuration. Instead of converging on a single fixed solution (as conventional models do), MycelialNet continuously evaluates and adapts, ensuring higher accuracy and more diverse representations of the data. This capability has a significant impact on rock classification: different rock types can have similar oxide compositions, making classification challenging, and MycelialNet tests multiple evolving decision pathways, leading to more robust and confident classifications. In summary, MycelialNet is not just a deep learning model; it is a dynamic, evolving system that integrates self-awareness, biological inspiration, and hybrid learning to offer a fundamentally new approach to big data analysis.

6. Conclusions

Inspired by fungal mycelial networks, MycelialNet introduces a biologically inspired approach to deep learning that enhances big data mining and automatic classification. The integration of dynamic connectivity creates an efficient, self-adaptive neural architecture. Applied to the geosciences, this approach facilitates better feature ranking, correlation discovery, and high performance in classification tasks, offering a powerful tool for data-intensive research fields. For rock sample datasets, it allows for both exploratory geochemical studies and accurate supervised classification, making it an invaluable tool for geology, geophysics, and beyond. Future work will explore further refinements and applications in other domains, such as geophysical data inversion and composite well-log analysis, where adaptive data analysis is critical.

Funding

This research received no external funding.

Data Availability Statement

The dataset used in this paper is sourced from the publicly available GEOROC (Geochemistry of Rocks of the Oceans and Continents) database (GEOROC website, accessed on 15 January 2025). Link to the web site: https://georoc.eu/georoc/new-start.asp. Information can be downloaded at: https://www.spun.earth/ (accessed on 15 January 2025), where there is an accurate description and high-resolution images of the mycelium. Furthermore, a very informative video about mycelium and fungal life is “How Fungi Make our Worlds”, by Merlin Sheldrake at https://www.youtube.com/watch?v=ZRFmCXBv5R4. Accessed on 10 January 2025.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Russell, S.; Norvig, P. Artificial Intelligence: A Modern Approach; Global Edition; Pearson Education, Inc.: London, UK; Prentice Hall: Upper Saddle River, NJ, USA, 2016. [Google Scholar]
  2. Raschka, S.; Mirjalili, V. Python Machine Learning: Machine Learning and Deep Learning with Python, Scikit-Learn, and TensorFlow, 2nd ed.; PACKT Books: Birmingham, UK, 2017. [Google Scholar]
  3. Ravichandiran, S. Deep Reinforcement Learning with Python; Packt Publishing: Birmingham, UK, 2020. [Google Scholar]
  4. Ribeiro, C.; Szepesvári, C. Q-learning combined with spreading: Convergence and results. In Proceedings of the ISRF-IEE International Conference: Intelligent and Cognitive Systems (Neural Networks Symposium), Tehran, Iran, 23–26 September 1996; pp. 32–36. [Google Scholar]
  5. Zhong, S.H.; Liu, Y.; Li, S.Z.; Bindeman, I.N.; Cawood, P.A.; Seltmann, R.; Liu, J.Q. A machine learning method for distinguishing detrital zircon provenance. Contrib. Mineral. Petrol. 2023, 178, 35. [Google Scholar]
  6. Zhong, S.; Li, S.; Liu, Y.; Cawood, P.A.; Seltmann, R. I-type and S-type granites in the Earth’s earliest continental crust. Commun. Earth Environ. 2023, 4, 61. [Google Scholar]
  7. Binetti, M.S.; Massarelli, C.; Uricchio, V.F. Machine Learning in Geosciences: A Review of Complex Environmental Monitoring Applications. Mach. Learn. Knowl. Extr. 2024, 6, 1263–1280. [Google Scholar] [CrossRef]
  8. Li, Y.E.; O’Malley, D.; Beroza, G.; Curtis, A.; Johnson, P. Machine Learning Developments and Applications in Solid-Earth Geosciences: Fad or Future? J. Geophys. Res. Solid Earth 2023, 128, e2022JB026310. [Google Scholar] [CrossRef]
  9. Sören, J.; Fontoura do Rosário, Y.; Fafoutis, X. Machine Learning in Geoscience Applications of Deep Neural Networks in 4D Seismic Data Analysis. Ph.D. Thesis, Technical University of Denmark, Kongens Lyngby, Denmark, 2020. [Google Scholar]
  10. Bhattacharya, S. Summarized Applications of Machine Learning in Subsurface Geosciences. In A Primer on Machine Learning in Subsurface Geosciences; SpringerBriefs in Petroleum Geoscience & Engineering; Springer: Berlin/Heidelberg, Germany, 2021; pp. 123–165. [Google Scholar]
  11. Zhang, W.; Gu, X.; Tang, L.; Yin, Y.; Liu, D.; Zhang, Y. Application of machine learning, deep learning and optimization algorithms in geoengineering and geoscience: Comprehensive review and future challenge. Gondwana Res. 2022, 109, 1–17. [Google Scholar] [CrossRef]
  12. Fradkov, A.L. Early History of Machine Learning. IFAC-PapersOnLine 2020, 53, 1385–1390. [Google Scholar] [CrossRef]
  13. Nilsson, N.J. The Quest for Artificial Intelligence: A History of Ideas and Achievements; Cambridge University Press: Cambridge, UK, 2011; pp. 1–562. [Google Scholar]
  14. Zhong, S.; Zhang, K.; Bagheri, M.; Burken, J.G.; Gu, A.; Li, B.; Ma, X.; Marrone, B.L.; Ren, Z.J.; Schrier, J.; et al. Machine Learning: New Ideas and Tools in Environmental Science and Engineering. Environ. Sci. Technol. 2021, 55, 12741–12754. [Google Scholar] [PubMed]
  15. Barnes, A.E.; Laughlin, K.J. Investigation of methods for unsupervised classification of seismic data. In Expanded Abstracts; SEG Technical Program: Salt Lake City, UT, USA, 2002; pp. 2221–2224. [Google Scholar] [CrossRef]
  16. Bestagini, P.; Lipari, V.; Tubaro, S. A machine learning approach to facies classification using well logs. In Expanded Abstracts; SEG Technical Program: Houston, TX, USA, 2017; pp. 2137–2142. [Google Scholar] [CrossRef]
  17. Dell’Aversana, P. Comparison of different Machine Learning algorithms for lithofacies classification from well logs. Bull. Geophys. Oceanogr. 2017, 60, 69–80. [Google Scholar] [CrossRef]
  18. Sheldrake, M. Entangled Life: How Fungi Make Our Worlds, Change Our Minds & Shape Our Futures; First US edition; Random House: New York, NY, USA, 2020. [Google Scholar]
  19. Damasio, A. Self Comes to Mind: Constructing the Conscious Brain; Pantheon: New York, NY, USA, 2010. [Google Scholar]
  20. Edelman, G.M. Neural Darwinism: The Theory of Neuronal Group Selection; Basic Books: New York, NY, USA, 1987; ISBN 0-19-286089-5. [Google Scholar]
  21. Edelman, G.M. Bright Air, Brilliant Fire: On the Matter of the Mind; Reprint Edition 1993; Basic Books: New York, NY, USA, 1992; ISBN 0-465-00764-3. [Google Scholar]
  22. Tononi, G.; Boly, M.; Massimini, M.; Koch, C. Integrated information theory: From consciousness to its physical substrate. Nat. Rev. Neurosci. 2016, 17, 450–461. [Google Scholar] [PubMed]
  23. Tononi, G.; Edelman, G.M. Consciousness and complexity. Science 1998, 282, 1846–1851. [Google Scholar] [PubMed]
  24. Panksepp, J.; Biven, L. The Archaeology of Mind: Neuroevolutionary Origins of Human Emotions (Norton Series on Interpersonal Neurobiology); W W Norton & Co. Inc.: New York, NY, USA, 2012. [Google Scholar]
  25. Panksepp, J.; Moskal, J. Dopamine and SEEKING: Subcortical “reward” systems and appetitive urges. In Handbook of Approach and Avoidance Motivation; Elliot, A.J., Ed.; Psychology Press: England, UK, 2008; pp. 67–87. [Google Scholar]
  26. Dell’Aversana, P. Enhancing Deep Learning and Computer Image Analysis in Petrography through Artificial Self-Awareness Mechanisms. Minerals 2024, 14, 247. [Google Scholar] [CrossRef]
  27. Dell’Aversana, P. Deep Learning for automatic classification of mineralogical thin sections. Bull. Geophys. Oceanogr. 2021, 62, 455–466. [Google Scholar] [CrossRef]
  28. Hall, B. Facies classification using machine learning. Lead. Edge 2016, 35, 906–909. [Google Scholar]
  29. She, Y.; Wang, H.; Zhang, X.; Qian, W. Mineral identification based on machine learning for mineral resources exploration. J. Appl. Geophys. 2019, 168, 68–77. [Google Scholar]
  30. Liu, K.; Liu, J.; Wang, K.; Wang, Y.; Ma, Y. Deep learning-based mineral classification in thin sections using convolutional neural network. Minerals 2020, 10, 1096. [Google Scholar]
  31. Mamani, M.; Wörner, G.; Sempere, T. Geochemical variations in igneous rocks of the Central Andean orocline (13° S to 18° S): Tracing crustal thickening and magma generation through time and space. GSA Bull. 2010, 122, 162–182. [Google Scholar] [CrossRef]
Figure 1. Scatter plot of classified rock samples (left panel). Here the different classes are represented with different symbols and colors. Training and Test Loss functions trend vs. epochs (right panel).
Figure 2. Scatter plot of the two classes (blue: class 1; red: class 2) and final decision boundary obtained through the application of the MycelialNet classification model.
Figure 3. Oxides’ correlation matrix.
Figure 4. Oxides’ histograms in the dataset (see text for explanations).
Figure 5. Oxide content for each rock type.
Figure 6. Example of test-data classification cross-plot, showing different colors in the 2-feature display of SiO2 and TiO2.
Figure 7. Example of test-data classification cross-plot in the 2-feature display of Fe2O3 and MgO.
Table 1. Accuracy comparison.

Method                     Accuracy
Random Forest              0.625
Logistic Regression        0.65
Standard Neural Network    0.69
MycelialNet model          0.875