Article

Deep Learning-Based Reconstruction of 3D Morphology of Geomaterial Particles from Single-View 2D Images

by Jiangpeng Zhao, Heping Xie, Cunbao Li and Yifei Liu *
State Key Laboratory of Intelligent Construction and Healthy Operation and Maintenance of Deep Underground Engineering, College of Civil and Transportation Engineering, Shenzhen University, Shenzhen 518060, China
* Author to whom correspondence should be addressed.
Materials 2024, 17(20), 5100; https://doi.org/10.3390/ma17205100
Submission received: 9 September 2024 / Revised: 11 October 2024 / Accepted: 16 October 2024 / Published: 18 October 2024

Abstract:
The morphology of particles formed in different environments contains critical information. Thus, the rapid and effective reconstruction of their three-dimensional (3D) morphology is crucial. This study reconstructs the 3D morphology from two-dimensional (2D) images of particles using artificial intelligence (AI). More than 100,000 particles were sampled from three sources: naturally formed particles (desert sand), manufactured particles (lunar soil simulant), and numerically generated digital particles. A deep learning approach based on a voxel representation of the morphology and multi-dimensional convolutional neural networks was proposed to rapidly upscale and reconstruct particle morphology. The trained model was tested using the three particle types and evaluated using different multi-scale morphological descriptors. The results demonstrated that the statistical properties of the morphological descriptors were consistent for the real 3D particles and those derived from the 2D images and the model. This finding confirms the model’s validity and generalizability in upscaling and reconstructing diverse particle samples. This study provides a method for generating 3D numerical representations of geological particles, facilitating in-depth analysis of properties, such as mechanical behavior and transport characteristics, from 2D images.

Graphical Abstract

1. Introduction

Geomaterials, such as soil and rock, consist of particles whose morphology significantly influences the material's physical, mechanical, and thermal properties [1,2]. Three-dimensional (3D) particle representations are required for physical and mechanical analyses of geomaterials [3,4]. However, obtaining 3D representations of microscopic geomaterial particles is challenging. Effective methods, such as computed tomography (CT) [5,6], white light interferometry [7], and 3D laser scanning [8], are employed in laboratories to conduct high-resolution 3D analyses of particle morphology. However, these methods are costly, labor-intensive, and time-consuming, making them inconvenient for routine use, especially when in situ 3D information is required. In contrast, high-resolution two-dimensional (2D) images can be obtained easily with standard microscopy techniques or even a smartphone at a fraction of the cost (less than 1% of the 3D imaging cost). It is well known that 2D information is a projection of the 3D structure [9], encoding the 3D data within a lower-dimensional form. A skilled observer can intuitively infer the 3D morphology of an object from a 2D image; therefore, reconstructing 3D morphology from 2D images is a solvable problem. Figure 1 illustrates the concept of reconstructing the 3D morphology of natural sand particles from a single-view 2D image.
The reconstruction of particles' 3D morphologies yields digitized numerical particles that serve as fundamental elements in discrete element modeling (DEM) [10]. Several attempts have been made to reconstruct 3D morphology from 2D information. Wang [11] reconstructed the 3D morphology of rock particles using multi-view 2D images; however, acquiring multiple views in real-world scenarios is challenging, and the generalization performance may be insufficient. Some researchers have conducted detailed analyses on reconstructing the 3D pore structure of rock bodies from 2D slices [12,13], but reconstructing 3D pores is fundamentally different from reconstructing particles' 3D morphologies. Genetic algorithms (GAs) [14] have been used in DEM simulations to reconstruct particle morphologies [15,16], but this method is only applicable to clumps of spheres and requires some 3D information [17]. Other researchers have employed generative models, such as variational autoencoders (VAEs) [18], for 3D morphology reconstruction [17,19]. However, these approaches generate additional 3D particles without addressing the dimensional upscaling from two to three dimensions.
Due to the rapid development of artificial intelligence (AI) in recent years, deep learning methods [20] have improved significantly. They can learn implicit high-dimensional mapping relationships from large amounts of data [21,22], making them well-suited for 2D-to-3D mapping tasks. Multi-view approaches facilitate the reconstruction of 3D morphology [23,24]; the fewer the views, the more challenging the reconstruction, with single-view reconstruction being the most difficult [25,26]. Predicting the complete 3D shape of an object from a single image is a long-standing and extremely challenging task. Recently, several representations for 3D models, including point clouds [27], meshes [28], and signed distance fields [29], have been adopted for 3D objects. A deep learning architecture, the pixels-to-voxels model (Pix2Vox) proposed by Xie [30,31], can reconstruct 3D object morphology from single views; however, it cannot reconstruct the complex 3D morphology of natural particles. For single-view reconstruction of particles, the current challenges are a lack of representative real datasets, limited research on the upscaling and 3D reconstruction of natural particles, and the absence of efficient models that can reconstruct the morphology of natural particles from a single view with high speed and quality.
Therefore, deep learning has significant potential for reconstructing the 3D morphology of particles from 2D single-view images. The 3D representation of particles primarily includes voxel, point cloud, and mesh formats, each requiring different data preprocessing methods and models for training. Given the regularity of the voxel representation, which integrates well with convolutional neural networks (CNNs) and is widely used in deep learning [32,33], our study employs voxels to represent 3D particles. We propose the pixels-to-voxels of particles (PVP) model, based on Pix2Vox, for reconstructing the 3D morphology of irregular sand particles from 2D images. We construct a dataset comprising more than 100,000 particles. The results indicate that the 3D morphology reconstructed from 2D images closely matches the real 3D morphology, demonstrating the model's suitability for dimensional upscaling.

2. Materials and Methods

2.1. Dataset

Data are critical in deep learning approaches. The quantity and quality of the data significantly affect model performance [34]. A sufficiently large and low-noise dataset enables the model to learn the intrinsic data patterns. Generally, data augmentation and data balancing techniques are used when there is a lack of sufficient and balanced data [35,36].
This study utilized three types of samples: naturally formed particles (Tengger sand), manufactured particles (HIT-LS1 lunar soil simulant) [37], and numerically generated digital particles. We determined the 3D morphology of the samples by scanning them sequentially using micro-CT (µCT), achieving ultra-high resolution with voxel dimensions of 2048 × 2048 × 2048 and a data volume exceeding 40 GB. Figure 2 shows the µCT scans of the samples. The final dataset included 25,000 particles of the HIT-LS1 lunar soil simulant (Dataset 1), 58,000 particles of the Tengger sand (Dataset 2), and 25,000 numerically generated digital particles (Dataset 3). The total dataset comprised more than 100,000 particles, with 90% of the particles used as the training set and 10% as the test set.

2.2. Network Architecture

The PVP model was designed to reconstruct the 3D morphology of particles from a single 2D grayscale image. The model's architecture is based on the classic deep learning model Pix2Vox++ [31], but was modified to handle irregular, randomly oriented particles. The 3D morphology was randomly projected to generate the input 2D grayscale images, and no fixed viewpoints were used, enhancing the model's ability to learn mapping features. The architecture of the PVP model (Figure 3) consists of an encoder and a decoder. The encoder extracts features from the input 2D grayscale image; the decoder generates the corresponding 3D voxels from the extracted features. Due to memory constraints and the need for effective visualization of the particles, the voxel resolution was 64³. The 2D image size was 64² pixels to ensure consistency between input and output.
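As a concrete illustration of the random-projection step described above, the following sketch rotates a 64³ voxel particle by a random orientation and projects it onto a 64² grayscale image. This is a minimal sketch under our own assumptions (the function name and the use of a summed-intensity rather than silhouette projection are illustrative), not the authors' released code:

```python
# Illustrative sketch: random-view 2D projection of a 64^3 voxel particle.
import numpy as np
from scipy.ndimage import rotate

def random_projection(voxels: np.ndarray) -> np.ndarray:
    """Rotate a binary (64, 64, 64) voxel grid randomly, then project along z."""
    rotated = voxels.astype(float)
    for axes in [(0, 1), (0, 2), (1, 2)]:        # rotate about three planes
        angle = np.random.uniform(0, 360)
        rotated = rotate(rotated, angle, axes=axes, reshape=False, order=0)
    image = rotated.sum(axis=2)                  # accumulate material along the view axis
    return image / max(image.max(), 1e-8)        # normalize to [0, 1] grayscale

# Example with a spherical dummy particle:
x, y, z = np.mgrid[:64, :64, :64]
particle = (x - 32) ** 2 + (y - 32) ** 2 + (z - 32) ** 2 < 20 ** 2
img = random_projection(particle)                # shape (64, 64)
```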
The calculation formulas for the encoder and decoder of the autoencoder are briefly outlined in [38]; here, we provide detailed formulas for the encoder, which includes 2D convolutional layers, max pooling layers, and residual connections.
The equation for the 2D convolution is as follows:
$$ y[i,j] = \sum_{m=0}^{k-1} \sum_{n=0}^{k-1} x[i+m,\, j+n] \cdot w[m,n] + b $$
The equation for the max pooling is as follows:
$$ y[i,j] = \max_{m,n} \{\, x[i+m,\, j+n] \,\} $$
The equation for the residual connection is as follows:
$$ y = F(x, \{W_i\}) + x $$
where $x[i+m, j+n]$ represents the input feature map pixel values, $w[m,n]$ the convolutional kernel weights, $b$ the bias, and $y[i,j]$ the output feature map pixel values. $W_i$ represents the weights, and $F(x, \{W_i\})$ represents the output after applying convolution and other operations to the input $x$.
The equation for the 3D transposed convolution in the decoder is as follows:
$$ y[e,f,g] = \sum_{m=0}^{k-1} \sum_{n=0}^{k-1} \sum_{l=0}^{k-1} x[e+m,\, f+n,\, g+l] \cdot w[m,n,l] + b $$
where $x[e+m, f+n, g+l]$ represents the input feature map voxel values, $w[m,n,l]$ the convolutional kernel weights, $b$ the bias, and $y[e,f,g]$ the output feature map voxel values.
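For readers who prefer code, the 2D convolution and max-pooling equations above translate directly into the following NumPy sketch (single channel, stride 1, valid padding). It is illustrative only; the PVP model itself uses standard deep learning layers:

```python
# Direct NumPy transcriptions of the 2D convolution and max-pooling equations.
import numpy as np

def conv2d(x: np.ndarray, w: np.ndarray, b: float) -> np.ndarray:
    """y[i, j] = sum_{m,n} x[i+m, j+n] * w[m, n] + b  (valid padding, stride 1)."""
    k = w.shape[0]
    H, W = x.shape[0] - k + 1, x.shape[1] - k + 1
    y = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            y[i, j] = np.sum(x[i:i + k, j:j + k] * w) + b
    return y

def max_pool2d(x: np.ndarray, k: int = 2) -> np.ndarray:
    """y[i, j] = max over each k x k window of x."""
    H, W = x.shape[0] // k, x.shape[1] // k
    return x[:H * k, :W * k].reshape(H, k, W, k).max(axis=(1, 3))
```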

2.2.1. Encoder

The encoder extracts features from the input image, which are used to reconstruct the 3D morphology of the particles. The input image passes through multiple layers: a convolutional layer with a 3² convolution kernel (followed by batch normalization and ReLU activation), a 2² pooling layer, four residual blocks (each containing two convolutional layers and a residual connection), two convolutional layers with a 3² convolution kernel, a 2² pooling layer, a convolutional layer, and a 2² pooling layer. The feature dimensions are upscaled from 64 to 1024 to capture specific features, followed by downscaling to 256, producing a 4² feature map. This architecture is inspired by the classic Visual Geometry Group (VGG) network [39] and is tailored to the sizes of the particle output features. The inclusion of residual blocks [40] helps prevent vanishing features and allows for deeper networks, enhancing the model's expressive power.
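A minimal PyTorch sketch of an encoder following this description is given below. The exact channel progression and pooling arrangement are our assumptions (the paper's code is not reproduced here); only the overall pattern of convolution, batch normalization, ReLU, pooling, and residual blocks, ending in a 4² × 256 feature map, follows the text:

```python
# Hedged sketch of a PVP-style encoder; channel counts are assumptions.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Two 3x3 convolutions with a residual (skip) connection."""
    def __init__(self, ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))
    def forward(self, x):
        return torch.relu(self.body(x) + x)          # residual connection

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),                         # 64x64 -> 32x32
            *[ResBlock(64) for _ in range(4)],       # four residual blocks
            nn.Conv2d(64, 1024, 3, padding=1), nn.ReLU(),    # upscale to 1024
            nn.Conv2d(1024, 1024, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                         # 32x32 -> 16x16
            nn.Conv2d(1024, 256, 3, padding=1),      # downscale to 256
            nn.MaxPool2d(2), nn.MaxPool2d(2))        # 16x16 -> 8x8 -> 4x4
    def forward(self, img):                          # img: (B, 1, 64, 64)
        return self.net(img)                         # (B, 256, 4, 4)
```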

2.2.2. Decoder

The decoder converts the feature map extracted by the encoder into voxels. The decoder comprises six 3D transposed convolution layers with a 4³ convolution kernel (followed by batch normalization and ReLU activation) to map the features into voxels. Dropout layers are added between the middle two 3D transposed convolution layers to prevent overfitting and improve generalization. The final layer does not have a batch normalization layer and uses a Sigmoid activation function to produce voxel values, downscaling the feature dimensions from 512 to 1 and resulting in the reconstructed 64³ voxel representation.
In the encoder, a single input image of size 64² × 1 is processed by 2D convolutions that extract features and increase the dimensionality, ultimately producing 256 feature maps of size 4². These feature maps are then input into the decoder, where they are first reshaped to a voxel size of 2³ with a dimension of 512. Subsequently, 3D transposed convolutions reconstruct the final output as a 64³ × 1 3D model.
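The following hedged PyTorch sketch mirrors this flow: the 4² × 256 feature maps are reshaped to 2³ × 512 and upsampled to a 64³ voxel grid by 3D transposed convolutions. For simplicity, it uses five upsampling layers (each doubling the resolution) plus a final sigmoid, rather than the paper's six transposed-convolution layers; the channel counts and dropout rate are assumptions:

```python
# Hedged sketch of a PVP-style decoder; layer/channel details are assumptions.
import torch
import torch.nn as nn

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        def up(cin, cout):
            # Transposed conv with kernel 4, stride 2 doubles the resolution.
            return nn.Sequential(
                nn.ConvTranspose3d(cin, cout, 4, stride=2, padding=1),
                nn.BatchNorm3d(cout), nn.ReLU())
        self.net = nn.Sequential(
            up(512, 256),                            # 2^3  -> 4^3
            up(256, 128),                            # 4^3  -> 8^3
            nn.Dropout3d(0.2),                       # dropout between middle layers
            up(128, 64),                             # 8^3  -> 16^3
            up(64, 32),                              # 16^3 -> 32^3
            nn.ConvTranspose3d(32, 1, 4, stride=2, padding=1),  # 32^3 -> 64^3
            nn.Sigmoid())                            # final layer: no BN, sigmoid
    def forward(self, feat):                         # feat: (B, 256, 4, 4)
        x = feat.reshape(feat.size(0), 512, 2, 2, 2) # 4^2 x 256 -> 2^3 x 512
        return self.net(x)                           # (B, 1, 64, 64, 64)
```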

2.3. Loss Function

Different architectures and data types require appropriately tailored loss functions to help the model learn the data features. We used the Adam optimization algorithm to minimize the loss function and maximize the similarity between the reconstructed and real particles. The loss function is defined as follows:
$$ \mathrm{Loss} = 10 \cdot \mathrm{Loss_{BCE}} + \lambda \cdot \mathrm{Loss_{MSE}} $$
The final loss consists of two parts. $\mathrm{Loss_{BCE}}$ represents the binary cross-entropy loss, a common and effective loss function for binary classification; it is used to classify the predicted voxel values as 0 or 1. $\lambda$ is a weighting parameter that adjusts the contribution of $\mathrm{Loss_{MSE}}$. The $\mathrm{Loss_{BCE}}$ is defined as follows:
$$ \mathrm{Loss_{BCE}} = -\frac{1}{N} \sum_{i=1}^{N} \left[ r_i \log(p_i) + (1 - r_i) \log(1 - p_i) \right] $$
$\mathrm{Loss_{MSE}}$ represents the mean squared error loss, which is commonly used to evaluate the difference between predicted and actual values and is a standard choice in regression problems. The $\mathrm{Loss_{MSE}}$ is defined as follows:
$$ \mathrm{Loss_{MSE}} = \frac{1}{N} \sum_{i=1}^{N} (r_i - p_i)^2 $$
where $N$ represents the number of voxels in the real particle, $r_i$ the voxel value of the real particle, and $p_i$ the voxel value of the predicted particle at the same position. $\mathrm{Loss_{BCE}}$ and $\mathrm{Loss_{MSE}}$ decrease as the predicted particle approaches the real particle; a smaller loss indicates a better reconstruction.
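A direct PyTorch transcription of this combined loss might look as follows; the factor of 10 on the BCE term comes from the equation above, and λ = 1 is the value selected in the ablation study in Section 3.1:

```python
# Sketch of the combined BCE + MSE voxel loss described above.
import torch
import torch.nn.functional as F

def pvp_loss(pred: torch.Tensor, real: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    """pred, real: voxel grids in [0, 1] with shape (B, 1, 64, 64, 64)."""
    loss_bce = F.binary_cross_entropy(pred, real)   # voxel occupancy classification
    loss_mse = F.mse_loss(pred, real)               # voxel-value regression
    return 10.0 * loss_bce + lam * loss_mse
```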

2.4. Evaluation Indicators

A single 2D image contains too little information to exactly reconstruct the complex, disordered 3D morphology of a microscopic particle. Instead, we evaluated the reconstruction effectiveness by selecting parameters characterizing particles of the same type. Voxel representation facilitates model learning and training. We used spherical harmonic (SH) reconstruction of the voxel particles to eliminate the impact of voxel roughness on parameter calculation [41], as illustrated in Figure 4a.
Extensive research has been conducted on particle characterization [9,42]. To evaluate the reconstruction performance, we calculated the triaxial dimensions (length, width, and thickness) and basic morphological parameters, including the elongation index, volume, and surface area [6,43,44], based on the SH reconstruction. Additionally, we incorporated effective and complex characterization parameters, namely sphericity (S), roundness (R), and the structural index (SI), to quantify particle morphology. These were calculated as follows:
$$ S = \frac{S_s}{S_A} = \frac{\sqrt[3]{36 \pi V^2}}{S_A} $$
$$ R = \frac{1}{N_e} \sum_{i=1}^{N_e} \frac{k_{ins}}{k_i} $$
$$ E_{pb} = \frac{1}{2} \sum_{i=1}^{N_c} \sum_{j=1,\, j \neq i}^{N_c} \frac{1}{G_c(e_i, e_j)} $$
$$ SI = \frac{E_{pb}}{E_p} $$
where $V$ represents the particle volume, $S_s$ denotes the surface area of a sphere with the same volume, and $S_A$ is the particle's surface area. $k$ is the mean curvature of the original particle surface, $k_{ins}$ is the curvature of the inscribed sphere, $N_e$ is the number of points where $k$ exceeds $k_{ins}$, and $R$ is obtained by weighted averaging. The roundness ranges from 0 to 1; the closer it is to 1, the more spherical and smoother the particle, with fewer sharp corners (local curvature maxima). $e_i$ and $e_j$ represent positions on the sphere, and $G_c(e_i, e_j)$ is the great-circle distance between two corners, with $N_c$ the number of corners. $E_p$ is the Riesz energy, an indicator of uniformity, and $E_{pb}$ is the Riesz energy for the spherical Fibonacci distribution of $N_c$ points, representing the minimum energy [19]. A detailed description can be found in Figure 4b.
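As an example of how such descriptors can be computed, the sketch below estimates sphericity from a binary voxel grid using a marching-cubes surface mesh (via scikit-image). This is an illustrative substitute; the paper computes these measurements on the SH-reconstructed surfaces:

```python
# Illustrative sphericity S = (36*pi*V^2)^(1/3) / SA from a binary voxel grid.
import numpy as np
from skimage import measure

def sphericity(voxels: np.ndarray) -> float:
    V = voxels.sum()                                          # volume in voxel units
    verts, faces, _, _ = measure.marching_cubes(voxels.astype(float), 0.5)
    SA = measure.mesh_surface_area(verts, faces)              # triangulated surface area
    return float((36.0 * np.pi * V**2) ** (1.0 / 3.0) / SA)   # 1.0 for a perfect sphere
```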

3. Results and Comparison with Other Models

3.1. Ablation Study

This section analyzes the contributions of various components within the model through an ablation study, assessing the impact of hyperparameters and model architecture on performance. Hyperparameters are crucial in deep learning: they determine how effectively the model learns and generalizes. Hyperparameters are parameters that must be defined before training, such as the loss function coefficient (if applicable), learning rate, batch size, network architecture, and number of training steps. We focused on the most critical hyperparameters, i.e., the loss function coefficient and the learning rate, because they have the greatest impact on the model's performance. All other hyperparameters were set to standard values, as shown in Table 1 (Num_Workers is the number of worker threads used for data loading and preprocessing). The hyperparameter analysis was conducted using the 25,000 particles of the HIT-LS1 lunar soil simulant (Dataset 1).

3.1.1. Loss Function

We conducted experiments using the $\mathrm{Loss_{p2v}}$ loss function of the classic Pix2Vox++ model. The loss curve is shown in Figure 5.
$$ \mathrm{Loss_{p2v}} = 10 \cdot \mathrm{Loss_{BCE}} $$
The training and validation losses do not decrease significantly during the first 100 steps. The training loss decreases after 100 steps, but the validation loss increases rapidly, exceeding the initial loss. This result indicates that the model only fits the training data. The validation loss diverges significantly, suggesting overfitting. Therefore, the classic Pix2Vox++ model is not suitable for reconstructing particles.
We improved the loss function by adding the weighted MSE term (Section 2.3) and experimented with different values of λ (50, 20, 10, 1) to determine the weighting of $\mathrm{Loss_{MSE}}$. The trends in the validation loss for different numbers of iterations and λ values are shown in Figure 6. The model converges for all λ values tested. As λ decreases, the converged loss decreases; however, the result at λ = 10 is already very close to that at λ = 1. Therefore, the optimum convergence is achieved when λ is 1.

3.1.2. Learning Rate

The learning rate has the most significant impact on the model's loss. A low learning rate increases the convergence time, and a high rate may lead to incomplete convergence, causing the model to fall into a local minimum or suffer gradient explosion. We tested learning rates of 0.05, 0.005, and 0.0005 and implemented a variable learning rate strategy: starting at 0.005, the learning rate was multiplied by 0.95 every ten epochs until it reached 0.0004.
As shown in Figure 7, the loss at convergence is the largest with a learning rate of 0.05. As the learning rate decreases, the loss at convergence decreases, but the difference between 0.005 and 0.0005 is minimal; the optimal convergence is therefore around 0.0005. Although the converged loss at 0.0005 is the smallest, convergence is slower. The blue line representing the variable learning rate shows that this strategy avoids local minima early in training while achieving the best final convergence. Thus, we selected the variable learning rate, starting at 0.005 and multiplied by 0.95 every ten epochs, as sketched below.
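A minimal sketch of this variable learning-rate schedule in PyTorch, assuming multiplicative decay by 0.95 every ten epochs with a floor at 0.0004 (our reading of the description above; the model here is a placeholder):

```python
# Sketch of the variable learning-rate strategy: 0.005 * 0.95^(epoch // 10),
# floored at 0.0004, over the 500 epochs listed in Table 1.
import torch

model = torch.nn.Linear(8, 1)                        # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)

for epoch in range(500):
    # ... one training epoch over the particle dataset ...
    if (epoch + 1) % 10 == 0:
        for group in optimizer.param_groups:
            group["lr"] = max(group["lr"] * 0.95, 0.0004)
```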

3.2. Reconstruction Performance of Different Models

3.2.1. Lightweight Model

The proposed PVP model did not exhibit underfitting; however, it was unclear whether overfitting occurred and whether the model complexity matched the data volume. Therefore, we halved the size of the PVP model to reduce its complexity, and used the resulting small PVP (PVP-S) model to analyze the match between model and data. The PVP model's encoder was reduced from 15 to 7 layers, and the decoder from 6 to 4 layers. The residual blocks and the 2D convolutional layers for feature extraction were retained to obtain the feature map, and the main structure of the 3D transposed convolutions and dropout regularization was unchanged. This halving reduced the model size from 385 M to 134 M.

3.2.2. Comparison of Results from Different Models

The reconstruction performance of the PVP and PVP-S models was evaluated using 2495 test particles. Figure 8 presents the histograms of the real and reconstructed particles obtained from the different models (Real: real particles; Pred PVP: particles predicted by the PVP model; Pred PVP-S: particles predicted by the PVP-S model).
As shown in Figure 8a,b, similar distributions of the surface area and volume are obtained from both models, indicating that they accurately reconstruct the morphological features of the particles. The differences between the average values of the evaluation indicators for the real and reconstructed particles for the PVP and PVP-S models are listed in Table 2. The difference in the surface area is significantly smaller for the PVP model than for the PVP-S model, and the difference is even more pronounced for the volume. The distributions of the sphericity (Figure 8c) and roundness (Figure 8d) are similar for the reconstructed and real particles, and the PVP model performs better than the PVP-S model. The elongation index (Figure 8e) and structural index (Figure 8f) exhibit less satisfactory, but acceptable, results. Neither model shows an advantage based on these metrics; however, the small differences in their average values suggest that the distributions of the two statistical parameters are similar.
The results demonstrate that the PVP model outperforms the PVP-S model, showing a good match of the model to the dataset. This finding indicates that the PVP model learns the morphological characteristics of the particles and can reconstruct the 3D morphology from a single 2D image. Although it cannot capture fine texture details, it successfully learns the features, resulting in a similar distribution of the metrics for the reconstructed and real particles.

3.3. Reconstruction Results for Natural Sand Particles and Numerically Generated Digital Sand Particles

We used the PVP model to compare the evaluation indicators for the real Tengger sand (Dataset 2) and the numerically generated digital particles (Dataset 3) to assess the model's generalization ability and suitability for reconstructing 3D morphology from a single-view 2D image. The results demonstrate the model's performance on particle samples with significant differences. A total of 2462 particles were used for testing.
Figure 9 shows the violin plots of the evaluation indicators for the real and model-reconstructed particles. The violin plots represent the probability distribution of the parameters, with the vertical axis showing the parameter values and the horizontal width representing the number of particles at each value. The left violin plot depicts the distribution of the Tengger sand particles, and the right one shows the distribution of the numerically generated particles. Blue represents the real particles, and red represents the model-reconstructed ones.
Figure 9a,b shows the data for the surface area and volume, respectively. The distribution is similar for the reconstructed and real groups. As shown in Figure 9a, the surface area is smaller for the reconstructed Tengger sand, indicating a worse performance than for the numerically generated particles. The reconstruction results for the volume are similar for both samples (Figure 9b), demonstrating excellent results. The sphericity (Figure 9c) and roundness (Figure 9d) show significant differences. The values are smaller for Tengger sand, and the distribution is broader than for the numerically generated sand. The distributions of both reconstructed samples closely resemble that of the real particles, although the sphericity has slightly larger values. The roundness values are slightly less accurate for Tengger sand than for the numerically generated particles. The elongation index for the real Tengger sand and the numerically generated sand exhibits a bimodal distribution, which is not perfectly aligned with the reconstructed group (Figure 9e). Figure 9e,f shows that the distribution of the real group is broader, and that of the reconstructed group is more concentrated.
Figure 10 presents the 3D morphology of the real particles, their 2D random projections, and the PVP-reconstructed 3D morphology for the Tengger sand and numerically generated particles. The Tengger sand exhibits a more complex morphology, whereas the numerically generated digital particles are smoother. The reconstruction results reflect these differences, indicating that the model can adapt to particle samples with significant differences in the distribution. Although the model demonstrates good generalization ability, the reconstruction of the real samples is less realistic than that of the numerically generated digital samples.

4. Discussion

We compared multiple evaluation indicators obtained from different models and particle sample types. The morphological parameters of the real particles exhibited a more dispersed distribution. Since deep learning models learn from data, all features are derived from the dataset. The numerically generated digital particles are highly characterized and parameterized, resulting in distinctive morphology and the best learning results. The morphological features of natural sand are formed by random complex processes like hydrodynamics, weathering, and fragmentation. Thus, these features are more challenging to learn due to high complexity and noise. The lunar soil simulant was processed (crushing, grinding, sintering, etc.) and has more uniform features, resulting in better learning outcomes than natural sand but worse results than for the numerically generated digital particles.
We used the PVP model for 2D single-view reconstructions of 3D particle morphologies. Training with a sample size of less than 50,000 particles required only 10 h on an NVIDIA 4090 GPU. The trained model reconstructed 2500 particles in approximately 4 s. Unlike traditional parameter-based methods that are limited to statistical representation and analysis of parameters without producing numerical entities, our model can upscale and reconstruct the morphology of irregular particles. The reconstructed particle distributions were generally consistent with the real distributions, and the model quickly generated large quantities of 3D numerical particles. It can be used to create particle libraries for input into DEM simulations.
This approach is particularly suitable for reconstructing particles that cannot be analyzed using 3D CT experiments, such as real lunar soil particles, which are rare but whose 2D images are abundant. With the continuous advancement of deep space exploration, humanity will access more celestial bodies, allowing richer 2D images of geomaterials to be acquired. This enables the reconstruction of 3D morphologies from single 2D images of asteroids or rocks, significantly aiding space exploration and showcasing substantial potential.
However, our model has some inherent limitations. While the voxel representation is structurally simple and facilitates the learning of overall features, it contains a large amount of redundant interior data, so the model wastes computational resources learning voxels inside the particle in addition to the surface features of interest. Furthermore, memory constraints necessitate the use of lower resolutions. Other data representations, such as point clouds and meshes, offer advantages here; however, there are currently no related models for single-view reconstruction of particles, and their greater training difficulty makes the validity of the results uncertain. Therefore, this paper does not include experimental comparisons with other deep learning methods.

5. Conclusions

The objective of this study was to develop a rapid AI model for reconstructing 3D morphology from a single-view 2D image of particles. We used a dataset of more than 100,000 particles of HIT-LS1 lunar soil simulant, Tengger sand, and numerically generated digital particles. Transfer learning techniques were used, leveraging the backbone network of the Pix2Vox model, and different models were trained. The following conclusions were obtained:
  • The distributions were similar for the reconstructed and real particles for the three sample types, indicating that upscaling from a single-view 2D image to 3D morphology was statistically feasible.
  • The PVP model provided distributions of the reconstructed particles consistent with the real distributions. The surface area and volume were highly similar. The similarity between the distributions of the reconstructed and real particles for natural and numerically generated particles demonstrated the strong generalization ability of the model and its suitability for different particle types.
  • Due to differences in formation, the reconstruction results were better for the HIT-LS1 lunar soil simulant than for the natural sand, but worse than for the numerically generated sand particles, reflecting varying levels of difficulty for the AI model.
Reconstructing 3D particle morphology from a single-view 2D image overcomes the limitations of parameter-based particle characterization. It provides numerical elements for the assessment of 3D morphology and a robust and effective sample database for DEM. However, the voxel data format is memory-limited; alternative formats, such as point clouds and frequency-domain data, could be utilized to overcome this limitation and achieve closer alignment with the distribution of real particles.

Author Contributions

Conceptualization, Y.L. and H.X.; methodology, J.Z. and C.L.; analysis, C.L.; data curation, Y.L.; writing—original draft preparation, J.Z.; writing—review and editing, H.X. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by National Natural Science Funding of China (No. 52104141, No. 12172230), Guangdong Basic and Applied Basic Research Foundation (No. 2024A1515011952), the Stable Support Program Project of Shenzhen Municipal Science and Technology Innovation Committee (No. 20231120174839003) and the Shenzhen Science and Technology Program (No. JCYJ20220531102012028).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Altuhafi, F.N.; Coop, M.R.; Georgiannou, V.N. Effect of particle shape on the mechanical behavior of natural sands. J. Geotech. Geoenviron. Eng. 2016, 142, 04016071. [Google Scholar] [CrossRef]
  2. Deal, E.; Venditti, J.G.; Benavides, S.J.; Bradley, R.; Zhang, Q.; Kamrin, K.; Perron, J.T. Grain shape effects in bed load sediment transport. Nature 2023, 613, 298–302. [Google Scholar] [CrossRef] [PubMed]
  3. Lawson, R.; Woods, S.; Jensen, E.; Erfani, E.; Gurganus, C.; Gallagher, M.; Connolly, P.; Whiteway, J.; Baran, A.; May, P.; et al. A review of ice particle shapes in cirrus formed in situ and in anvils. J. Geophys. Res. Atmos. 2019, 124, 10049–10090. [Google Scholar] [CrossRef]
  4. Zhao, J.; Zhao, S.; Luding, S. The role of particle shape in computational modelling of granular matter. Nat. Rev. Phys. 2023, 5, 505–525. [Google Scholar] [CrossRef]
  5. Bostanabad, R.; Zhang, Y.; Li, X.; Kearney, T.; Brinson, L.C.; Apley, D.W.; Liu, W.K.; Chen, W. Computational microstructure characterization and reconstruction: Review of the state-of-the-art techniques. Prog. Mater. Sci. 2018, 95, 1–41. [Google Scholar] [CrossRef]
  6. Zhou, B.; Wang, J.; Zhao, B. Micromorphology characterization and reconstruction of sand particles using micro X-ray tomography and spherical harmonics. Eng. Geol. 2015, 184, 126–137. [Google Scholar] [CrossRef]
  7. Wyant, J.C. White light interferometry. In Holography: A Tribute to Yuri Denisyuk and Emmett Leith; SPIE: Warsaw, Poland, 2002; pp. 98–107. [Google Scholar]
  8. Ebrahim, M.A.B. 3D laser scanners’ techniques overview. Int. J. Sci. Res. 2015, 4, 323–331. [Google Scholar]
  9. Su, D.; Yan, W. Prediction of 3D size and shape descriptors of irregular granular particles from projected 2D images. Acta Geotech. 2020, 15, 1533–1555. [Google Scholar] [CrossRef]
  10. Kloss, C.; Goniva, C.; Hager, A.; Amberger, S.; Pirker, S. Models, algorithms and validation for opensource dem and cfd–dem. Prog. Comput. Fluid Dyn. Int. J. 2012, 12, 140–152. [Google Scholar] [CrossRef]
  11. Wang, X.; Zhang, H.; Yin, Z.Y.; Su, D.; Liu, Z. Deep-learning-enhanced model reconstruction of realistic 3D rock particles by intelligent video tracking of 2D random particle projections. Acta Geotech. 2023, 18, 1407–1430. [Google Scholar] [CrossRef]
  12. Feng, J.; Teng, Q.; Li, B.; He, X.; Chen, H.; Li, Y. An end-to-end three-dimensional reconstruction framework of porous media from a single two-dimensional image based on deep learning. Comput. Methods Appl. Mech. Eng. 2020, 368, 113043. [Google Scholar] [CrossRef]
  13. Fu, J.; Xiao, D.; Li, D.; Thomas, H.R.; Li, C. Stochastic reconstruction of 3D microstructures from 2D cross-sectional images using machine learning-based characterization. Comput. Methods Appl. Mech. Eng. 2022, 390, 114532. [Google Scholar] [CrossRef]
  14. Holland, J.H. Genetic algorithms. Sci. Am. 1992, 267, 66–73. [Google Scholar] [CrossRef]
  15. Jaeger, H.M.; de Pablo, J.J. Perspective: Evolutionary design of granular media and block copolymer patterns. APL Mater. 2016, 4, 53209. [Google Scholar] [CrossRef]
  16. Miskin, M.Z.; Jaeger, H.M. Evolving design rules for the inverse granular packing problem. Soft Matter 2014, 10, 3708–3715. [Google Scholar] [CrossRef]
  17. Macedo, R.B.d.; Monfared, S.; Karapiperis, K.; Andrade, J. What is shape? characterizing particle morphology with genetic algorithms and deep generative models. Granul. Matter 2023, 25, 2. [Google Scholar] [CrossRef]
  18. Doersch, C. Tutorial on variational autoencoders. arXiv 2016, arXiv:1606.05908. [Google Scholar]
  19. Shi, J.J.; Zhang, W.; Wang, W.; Sun, Y.H.; Xu, C.Y.; Zhu, H.H.; Sun, Z.X. Randomly generating three-dimensional realistic schistous sand particles using deep learning: Variational autoencoder implementation. Eng. Geol. 2021, 291, 106235. [Google Scholar] [CrossRef]
  20. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  21. Jun, H.; Nichol, A. Shap-e: Generating conditional 3D implicit functions. arXiv 2023, arXiv:2305.02463. [Google Scholar]
  22. Liu, R.; Wu, R.; Van Hoorick, B.; Tokmakov, P.; Zakharov, S.; Vondrick, C. Zero-1-to-3: Zero-shot one image to 3D object. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 9298–9309. [Google Scholar]
  23. Melas-Kyriazi, L.; Laina, I.; Rupprecht, C.; Vedaldi, A. Realfusion: 360deg reconstruction of any object from a single image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Paris, France, 2–6 October 2023; pp. 8446–8455. [Google Scholar]
  24. Nichol, A.; Jun, H.; Dhariwal, P.; Mishkin, P.; Chen, M. Point-e: A system for generating 3D point clouds from complex prompts. arXiv 2022, arXiv:2212.08751. [Google Scholar]
  25. Long, X.; Guo, Y.C.; Lin, C.; Liu, Y.; Dou, Z.; Liu, L.; Ma, Y.; Zhang, S.H.; Habermann, M.; Theobalt, C.; et al. Wonder3D: Single image to 3D using cross-domain diffusion. arXiv 2023, arXiv:2310.15008. [Google Scholar]
  26. Fan, H.; Su, H.; Guibas, L.J. A point set generation network for 3D object reconstruction from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 605–613. [Google Scholar]
  27. Wang, N.; Zhang, Y.; Li, Z.; Fu, Y.; Liu, W.; Jiang, Y. Pixel2mesh: Generating 3D mesh models from single rgb images. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 52–67. [Google Scholar]
  28. Xu, Q.; Wang, W.; Ceylan, D.; Mech, R.; Neumann, U. Disn: Deep implicit surface network for high-quality single-view 3D reconstruction. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar] [CrossRef]
  29. Qian, G.; Mai, J.; Hamdi, A.; Ren, J.; Siarohin, A.; Li, B.; Lee, H.Y.; Skorokhodov, I.; Wonka, P.; Tulyakov, S.; et al. Magic123: One image to high-quality 3D object generation using both 2D and 3D diffusion priors. arXiv 2023, arXiv:2306.17843. [Google Scholar]
  30. Xie, H.; Yao, H.; Sun, X.; Zhou, S.; Zhang, S. Pix2vox: Context-aware 3D reconstruction from single and multi-view images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 2690–2698. [Google Scholar]
  31. Xie, H.; Yao, H.; Zhang, S.; Zhou, S.; Sun, W. Pix2vox++: Multi-scale context-aware 3D object reconstruction from single and multiple images. Int. J. Comput. Vis. 2020, 128, 2919–2935. [Google Scholar] [CrossRef]
  32. Deng, J.; Shi, S.; Li, P.; Zhou, W.; Zhang, Y.; Li, H. Voxel r-cnn: Towards high performance voxel-based 3D object detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual Conference, 2–9 February 2021; Volume 35, pp. 1201–1209. [Google Scholar]
  33. Shi, S.; Guo, C.; Jiang, L.; Wang, Z.; Shi, J.; Wang, X.; Li, H. Pv-rcnn: Point-voxel feature set abstraction for 3D object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10529–10538. [Google Scholar]
  34. Liu, Y.; Chen, Y.; Ding, B. Deep learning in frequency domain for inverse identification of nonhomogeneous material properties. J. Mech. Phys. Solids 2022, 168, 105043. [Google Scholar] [CrossRef]
  35. Wang, Y.; Chung, S.H.; Khan, W.A.; Wang, T. ALADA: A lite automatic data augmentation framework for industrial defect detection. Adv. Eng. Inform. 2023, 58, 102205. [Google Scholar] [CrossRef]
  36. Khan, W.A. Balanced weighted extreme learning machine for imbalance learning of credit default risk and manufacturing productivity. Ann. Oper. Res. 2023, 1–29. [Google Scholar] [CrossRef]
  37. Xie, H.; Wu, Q.; Liu, Y.; Xie, Y.; Gao, M.; Li, C. Direct measurement and theoretical prediction model of interparticle adhesion force between irregular planetary regolith particles. Int. J. Min. Sci. Technol. 2023, 33, 1425–1436. [Google Scholar] [CrossRef]
  38. Khan, W.A.; Masoud, M.; Eltoukhy, A.E.E.; Ullah, M. Stacked encoded cascade error feedback deep extreme learning machine network for manufacturing order completion time. J. Intell. Manuf. 2024, 1–27. [Google Scholar] [CrossRef]
  39. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  40. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  41. Liu, Y.; Jeng, D.S.; Xie, H.; Li, C. On the particle morphology characterization of granular geomaterials. Acta Geotech. 2023, 18, 2321–2347. [Google Scholar] [CrossRef]
  42. Bagheri, G.; Bonadonna, C.; Manzella, I.; Vonlanthen, P. On the characterization of size and shape of irregular particles. Powder Technol. 2015, 270, 141–153. [Google Scholar] [CrossRef]
  43. Feng, Z.K.; Xu, W.J.; Lubbe, R. Three-dimensional morphological characteristics of particles in nature and its application for dem simulation. Powder Technol. 2020, 364, 635–646. [Google Scholar] [CrossRef]
  44. Xie, W.Q.; Zhang, X.P.; Yang, X.M.; Liu, Q.S.; Tang, S.H.; Tu, X.B. 3D size and shape characterization of natural sand particles using 2D image analysis. Eng. Geol. 2020, 279, 105915. [Google Scholar] [CrossRef]
Figure 1. Schematic of the 3D reconstruction of natural sand particles using a single-view 2D image.
Figure 2. The µCT scans of the samples.
Figure 3. The network architecture of the PVP model.
Figure 4. (a) The voxel is represented using spherical harmonic reconstruction into Fibonacci triangular grid particles. (b) The mean curvature distribution and projected curvature distribution of particles.
Figure 5. Loss function curve for the classic Pix2Vox++ model applied to the HIT-LS1 particles.
Figure 6. Loss curves of models with different λ values.
Figure 7. Model loss curves with different learning rates (LR).
Figure 8. Comparison of evaluation indicators of the reconstruction performance of two models for the HIT-LS1 dataset.
Figure 9. Six violin plots of evaluation indicators for natural and numerically generated sand particles.
Figure 10. Three-dimensional morphology of the real Tengger sand and the numerically generated sand particles, 2D random projections, and PVP-reconstructed 3D morphology.
Table 1. Some model hyperparameters.

Epoch   Learning Rate   Batch Size   Optimizer   Num_Workers
500     0.005           256          Adam        12
Table 2. The differences between the average evaluation indicator values for the real and reconstructed particles for the PVP and PVP-S models.

Model   Surface Area   Volume   Sphericity   Roundness   Elongation Index   Structural Index
PVP     5.3%           2.8%     3.7%         3.6%        4.0%               1.3%
PVP-S   9.5%           7.2%     5.1%         7.6%        4.5%               1.1%

