Article

CrySPAI: A New Crystal Structure Prediction Software Based on Artificial Intelligence

by Zongguo Wang 1,2,*, Ziyi Chen 1,2, Yang Yuan 1,2 and Yangang Wang 1,2
1 Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
2 School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
* Author to whom correspondence should be addressed.
Inventions 2025, 10(2), 26; https://doi.org/10.3390/inventions10020026
Submission received: 30 November 2024 / Revised: 23 December 2024 / Accepted: 25 February 2025 / Published: 6 March 2025

Abstract

Crystal structure prediction based on the combination of first-principles calculations and machine learning has achieved significant success in materials science. However, most of these approaches are limited to predicting specific systems, which hinders their application to unknown or unexplored domains. In this paper, we present a crystal structure prediction software based on artificial intelligence, named CrySPAI, which predicts energetically stable crystal structures of inorganic materials given their chemical compositions. The software consists of three key modules: an evolutionary optimization algorithm (EOA) that searches for all possible crystal structure configurations, density functional theory (DFT) that provides accurate energy values for these structures, and a deep neural network (DNN) that learns the relationship between crystal structures and their corresponding energies. To optimize the process across these modules, a distributed framework is implemented to parallelize tasks, and an automated workflow has been integrated into CrySPAI for seamless execution. This paper reports the development and implementation of the AI-based CrySPAI crystal structure prediction software and its unique features.

1. Introduction

Crystal structure plays a fundamental role in understanding the physical and chemical properties of solid materials. One of the key challenges in materials research is how to efficiently and accurately determine crystal structures. Currently, two primary approaches are used to obtain target structures for inorganic materials. The first approach involves searching for similar structures in established crystal structure databases, such as the Inorganic Crystal Structure Database (ICSD) [1], the Pauling File [2], and others. The second approach involves predicting crystal structures by substituting elements in structure prototypes using high-throughput techniques [3,4,5]. While these methods can sometimes yield accurate structures quickly, they are less effective when dealing with new or unknown structure types. Moreover, the extremely high computational cost associated with the substitution method limits its practical applicability. Consequently, the ability to rapidly predict crystal structures based solely on chemical composition remains a pressing issue in theoretical materials research.
In recent years, several computational methods have been proposed and widely used for crystal structure prediction. These include the simulated annealing algorithm (SA) [6], genetic algorithm (GA) [7,8,9,10,11], particle swarm optimization (PSO) [12,13], and ab initio random structure searching (AIRSS) [14]. These methods have achieved notable success in predicting the structures of elemental, binary, and ternary systems. Building on these approaches, several software packages for structure prediction have also been developed. Notable examples include USPEX (Universal Structure Predictor: Evolutionary Xtallography), which uses the GA method [7,8], CALYPSO (Crystal structure AnaLYsis by Particle Swarm Optimization), which is based on the PSO method [13], an adaptive-GA method that combines classical potentials and DFT calculations [11], and XTALOPT, which implements an evolutionary algorithm with hybrid operators [15]. In all of these packages and workflows, structural energies are typically calculated using density functional theory (DFT). While DFT provides accurate results, its high computational cost limits the size and complexity of the structure cells that can be modeled. In addition, the accuracy of DFT is influenced by the choice of exchange-correlation functional and basis set, which can introduce uncertainties in energy rankings and affect the prediction of the most stable structures. In contrast, artificial intelligence technologies offer the advantage of lower computational costs and shorter development cycles, presenting a promising alternative.
With the increasing performance of artificial intelligence (AI) across various fields, machine learning, coupled with powerful DFT data, has become increasingly common in materials design and discovery [16,17,18,19,20,21,22,23]. In particular, interatomic potentials trained by machine learning have been employed to predict thermodynamic and other properties of bulk materials, achieving DFT-level accuracy for energy calculations. The typical models include the structure–property relationship model and the force field model. The structure–property relationship models mainly include the Crystal Graph Convolutional Neural Network (CGCNN) [24], MatErials Graph Network (MEGNet) [25], and Tripartite interaction representation algorithm-enhanced Crystal Graph Neural Networks (TiraCGCNN) [26]. The widely used force field models include M3GNet [27], CHGNet [28], and GPTFF [29]. To facilitate their use, several software packages have been developed to construct accurate atomic interaction potentials. For example, the open-source Atomic Energy Network (aenet) [30] provides a deep learning-based representation of potential energy. DeepKit also provides the representation of potential energy and force fields, enabling molecular dynamics simulations [31,32]. Additionally, workflows have been developed to generate new crystal structures. Notable examples include MAGUS, which combines DFT calculations with machine learning methods [33], and a framework that integrates a graph network model with an optimization algorithm [34]. A more ambitious effort by Google DeepMind utilized AI to predict 380,000 new materials with the GNoME model [35]. While these existing tools and programs have demonstrated success in predicting crystal structures within predefined chemical or structural families, they face significant challenges when applied to unknown or unexplored materials.
Therefore, the development of a new program is urgently needed to overcome these limitations by introducing a generalized, robust, and efficient approach.
In this paper, building on our previous work with the adaptive genetic algorithm (AGA) [36,37], we propose a crystal structure prediction software based on artificial intelligence (CrySPAI), which supports automatic structure optimization by combining AI technology with DFT data. The schematic workflow of CrySPAI is shown in Figure 1. The software consists of three modules: an evolutionary optimization algorithm (EOA) for searching crystal structure configurations, DFT calculations for determining energy values, and a deep neural network (DNN) for fitting the relationship between structures and their energies. Additionally, a structure-energy function is employed to filter out unreasonable structures in the EOA module. CrySPAI offers four distinct advantages, making it a powerful tool for researchers in the field of materials science: (1) broad applicability—it can be applied across a wide range of inorganic materials; (2) seamless integration and automation—the software integrates and automates all stages of the prediction process; (3) enhanced predictive accuracy and efficiency—by combining AI and DFT, CrySPAI delivers improved accuracy and efficiency; and (4) the exploration of unknown domains—it provides robust capabilities to explore materials in previously uncharted domains.

2. Materials and Methods

2.1. Implementations of CrySPAI

2.1.1. EOA Module

The GA, inspired by Darwinian evolution, has been widely used for crystal structure optimization. In CrySPAI, the EOA module employs GA to generate structures through inheritance, mutation, selection, and crossover operations. To enhance the comprehensiveness and accuracy of the structure search, the EOA module runs seven parallel procedures, each corresponding to a different crystal system. Each procedure consists of several iterations using the same GA operations. The workflow of the EOA module for the cubic system is shown in Figure 2.
The input information typically includes the element names, number of atoms, and optional volume data. When the EOA module receives the input for the target materials, it activates seven parallel processes, each corresponding to a different crystal system. For example, Figure 2 illustrates the structure search process for the cubic crystal system. Initially, some candidate crystal structures are generated based on the input data with the help of the ICSD database. These structures are then subjected to energy calculations in the “local optimization process”. During this process, the energies of the crystal structures are predicted using a trained model. If the model is unavailable, energy calculations based on empirical formulas or single-point DFT calculations are used instead.
After the first “local optimization process”, both the structures and their corresponding energies form the trial structure set. The GA then begins its operation. The default number of trial structures in each generation Ntrial is 64. One-third of these, the structures with the lowest energy values, are selected as parent structures for crossover and mutation to produce the next generation. In each generation, 16 optimal structures with the lowest energy values are selected and interpolated into the trial structure set, replacing the 16 structures with the highest energy values from the previous generation. Once the GA generation loop ends, Np structures are recommended and output to the user as the final results, with the default value of Np being 16 for each crystal system.
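The generation loop described above can be illustrated with a minimal sketch. It is not the CrySPAI implementation: structures are replaced by plain coordinate vectors, `toy_energy` is a hypothetical stand-in for the model or DFT energy, and the crossover and mutation operators are generic placeholders. Only the population size (64), the parent fraction (one-third), and the replacement count (16) are taken from the text; reading "replacing the 16 structures with the highest energy" as an elitist replacement of the worst trial structures by the best offspring is our interpretation.

```python
import random

N_TRIAL = 64               # trial structures per generation (default in the text)
N_REPLACE = 16             # structures swapped in per generation (from the text)
N_PARENTS = N_TRIAL // 3   # lowest-energy third acts as parents
N_RECOMMEND = 16           # Np, structures returned at the end


def toy_energy(x):
    """Hypothetical stand-in for the DNN/DFT energy: a quadratic bowl."""
    return sum(v * v for v in x)


def crossover(a, b, rng):
    """Pick each coordinate from one of the two parents."""
    return [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]


def mutate(x, rng, scale=0.1):
    """Perturb every coordinate with small Gaussian noise."""
    return [v + rng.gauss(0.0, scale) for v in x]


def run_ga(n_generations=30, dim=6, seed=0):
    rng = random.Random(seed)
    trial = [[rng.uniform(-2.0, 2.0) for _ in range(dim)]
             for _ in range(N_TRIAL)]
    history = []  # best energy at the start of each generation
    for _ in range(n_generations):
        trial.sort(key=toy_energy)
        history.append(toy_energy(trial[0]))
        parents = trial[:N_PARENTS]
        children = []
        for _ in range(N_TRIAL):
            p1, p2 = rng.sample(parents, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        children.sort(key=toy_energy)
        # Drop the 16 highest-energy trial structures and insert
        # the 16 lowest-energy offspring in their place.
        trial = trial[:N_TRIAL - N_REPLACE] + children[:N_REPLACE]
    trial.sort(key=toy_energy)
    return trial[:N_RECOMMEND], history


best, history = run_ga()
```

Because the lowest-energy trial structures are always retained, the best energy in this sketch can never increase from one generation to the next.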
If the EOA module is part of the iteration loop of the DNN module, to ensure the generalization ability of the model, 112 recommended structures (corresponding to seven crystal systems) from each GA generation loop are always transferred to the DFT module, regardless of whether the crystal system of the target structure is specified. Once the model is trained, the structure search for the specified crystal system is conducted separately in the EOA module, and the default number of recommended structures is 16.

2.1.2. DFT Module

The DFT module is used to obtain accurate energy values for crystal structures. These results, along with their corresponding structures, form the training set for the model. As the choice of exchange-correlation functional and basis set can influence the energy calculations and ranking results, a brief note has been included to draw the user’s attention to this critical aspect. In addition to its role within the iteration loop of the DNN module (as shown in Figure 1), the DFT module can also be used to calculate the energies of structures recommended by the EOA module upon user request. The DFT module in CrySPAI interfaces with external DFT calculation software, with VASP being the default computational tool. Input files, such as POTCAR (containing pseudopotential information) and INCAR (with calculation parameters), must be prepared by the user, while the POSCAR file is automatically generated and transferred from the EOA module. The KPOINTS file is optional. Given the time-consuming nature of DFT calculations, parallel computing and high-throughput computational techniques are employed to accelerate the process. All the relevant information, including composition, lattice parameters, atomic positions, and their energies, is extracted and stored in a MongoDB database in a standardized format for each DFT calculation.
The goal of the DFT module is to construct a comprehensive training dataset, which is continuously updated and expanded until the DNN iterations are complete. All the results from the DFT calculations are stored in the Materials database; however, not all of these calculations are added to the training set. CrySPAI considers two cases in which structure–energy pairs are excluded from the training set. The first case involves structures that are similar to those already present in the training dataset. The second case involves “useless” structures, where the energies from DFT calculations closely match the predicted values from the existing potential model, indicating that little new information is provided.
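The two exclusion cases can be sketched as a simple filter. The fingerprint representation, the Euclidean distance measure, and both thresholds (`sim_tol`, `energy_tol`) are hypothetical choices made for illustration; the text does not specify how CrySPAI measures structural similarity or how close a prediction must be to count as uninformative.

```python
import math


def fingerprint_distance(f1, f2):
    """Euclidean distance between two structure fingerprints (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


def should_add_to_training_set(candidate, training_set, predict_energy,
                               sim_tol=0.05, energy_tol=0.008):
    """Decide whether a DFT result is worth adding to the training set.

    candidate / training_set entries: dicts with 'fingerprint' (list of
    floats) and 'e_dft' (eV/atom). predict_energy: callable mapping a
    fingerprint to a predicted energy, or None if no model exists yet.
    """
    # Case 1: a similar structure is already in the training set.
    for known in training_set:
        d = fingerprint_distance(candidate["fingerprint"], known["fingerprint"])
        if d < sim_tol:
            return False
    # Case 2: the current model already predicts this energy well,
    # so the sample carries little new information.
    if predict_energy is not None:
        error = abs(predict_energy(candidate["fingerprint"]) - candidate["e_dft"])
        if error < energy_tol:
            return False
    return True
```

In either case the DFT result would still be stored in the Materials database; the filter only controls what the model is trained on.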

2.1.3. DNN Module

Once the number of DFT calculations exceeds a specific threshold, the DNN module is activated. An atomic energy DNN model is then generated, and the model undergoes continuous training and updates in the model iteration loop, as shown in Figure 1, until convergence is achieved. The default neural network used in CrySPAI consists of four layers: one input layer, two hidden layers, and one output layer. The architecture of the neural network is illustrated in Figure 3.
The input to the network is a structure feature vector, which describes the local atomic environment of the crystal structure, and its dimensionality is determined by the structure features. By default, both hidden layers have the same number of nodes, and the output of the network corresponds to atomic energy. The choice of activation function, batch size, and optimization method for the neural network has been systematically studied and optimized, as detailed in our previous work [36]. CrySPAI provides recommended network parameters, though users can modify these parameters as needed.
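A minimal forward pass for such a four-layer network is sketched below in plain Python. The tanh activation, the random initialization, and the layer widths are assumptions made for illustration, and the summation of atomic energies into a structure energy follows the usual convention for atomic-energy networks rather than a statement in the text. All function names are hypothetical.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights (n_out x n_in) and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply one dense layer, optionally followed by an activation."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def build_network(n_features, n_hidden, seed=0):
    """Input layer -> two hidden layers of equal width -> one output node."""
    rng = random.Random(seed)
    return [init_layer(n_features, n_hidden, rng),
            init_layer(n_hidden, n_hidden, rng),
            init_layer(n_hidden, 1, rng)]


def atomic_energy(features, net):
    """Energy contribution of one atom from its local-environment features."""
    h = dense(features, net[0], math.tanh)
    h = dense(h, net[1], math.tanh)
    return dense(h, net[2])[0]


def structure_energy(per_atom_features, net):
    """Total energy as the sum of atomic energies (assumed convention)."""
    return sum(atomic_energy(f, net) for f in per_atom_features)
```

The list returned by `build_network` holds three weight sets, one for each connection between the four layers.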
The default parameters for the DNN module are pre-set. The minimum number of training samples for the initial model training iteration is set to 2000, and an early stopping method is employed to prevent overfitting. The root mean square error (RMSE) (Equation (1)) between the model’s predictions and the DFT-calculated results is used to monitor the training process, with a default convergence threshold of 8 meV/atom. Once the model has converged, it is stored and no longer updated for that specific calculation, and the model iteration loop is complete. To enhance the accuracy of the predicted model, a swarm algorithm, as described in our previous work [38], is integrated into CrySPAI. This algorithm helps to refine the energy calculations of structures, allowing CrySPAI to more easily identify stable structures.
$$\mathrm{RMSE} = \frac{1}{n}\left(\frac{1}{N}\sum_{N}\left(E_{\mathrm{pre}} - E_{\mathrm{DFT}}\right)^{2}\right)^{1/2} \tag{1}$$
where N is the total number of calculated structures and n is the number of atoms in the calculated cell.
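As a worked illustration of Equation (1), the sketch below evaluates the per-atom RMSE for a small set of total energies. The function name and the energy values are invented for the example, and it assumes that all N structures share the same number of atoms n, as the definition above implies.

```python
import math


def rmse_per_atom(e_pred, e_dft, n_atoms):
    """Equation (1): RMSE of total energies divided by the atom count n."""
    if len(e_pred) != len(e_dft) or not e_pred:
        raise ValueError("need two non-empty energy lists of equal length")
    mse = sum((p - d) ** 2 for p, d in zip(e_pred, e_dft)) / len(e_pred)
    return math.sqrt(mse) / n_atoms


# Made-up total energies (eV) for three 8-atom cells.
e_pred = [-43.30, -43.10, -42.95]
e_dft = [-43.36, -43.10, -43.03]
rmse = rmse_per_atom(e_pred, e_dft, n_atoms=8)
converged = rmse <= 0.008  # default threshold of 8 meV/atom
```

For these illustrative numbers the result is about 7.2 meV/atom, which would satisfy the default convergence threshold.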
The converged model is used in the local optimization process of the EOA module, which operates outside of the model iteration loop shown in Figure 1. It is important to note that, while the EOA module is running inside the model iteration loop, the local optimization process uses the model obtained at the end of the most recent loop rather than a globally converged model.

2.2. Adaptive Volume Adjustment Algorithm

As volume is a key physical parameter in constructing crystal structures, CrySPAI allows users to provide the volume value of the target structure. Alternatively, the volume can be calculated using the adaptive volume adjustment algorithm.
In this algorithm, the structural volume is calculated as the sum of the volumes of all atoms in the structure, as given by Equation (2). A is an adjustment factor, with a default value of 1.2. The ionic radii, R, can be obtained either from the periodic table or from the “RCORE” parameter in the VASP pseudopotential file. CrySPAI offers several methods for users to specify the volume, including direct input of atomic or structural volume, extraction of ionic or atomic core radii from the periodic table or the VASP pseudopotential file, and other options.
$$V = \sum_{i} N_{i}\left(\frac{4}{3}\pi R^{3}\right) A \tag{2}$$
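Equation (2) translates directly into a few lines of code. In the sketch below, the function name and the radius values are placeholders for illustration only, not tabulated ionic radii; the sum is read as running over the species i, with N_i atoms of radius R each.

```python
import math


def estimate_cell_volume(composition, radii, adjustment=1.2):
    """Equation (2): sum of atomic sphere volumes times the factor A.

    composition: element -> number of atoms in the cell (N_i)
    radii:       element -> radius R in angstroms
    adjustment:  factor A (default 1.2, as stated in the text)
    """
    spheres = sum(count * (4.0 / 3.0) * math.pi * radii[element] ** 3
                  for element, count in composition.items())
    return adjustment * spheres


# Placeholder radii, chosen only to show the call signature.
volume = estimate_cell_volume({"X": 2, "Y": 4}, {"X": 1.0, "Y": 1.5})
```

Because the estimate scales with the cube of the radius, the choice of radius source (periodic table or RCORE) has a much stronger influence on the result than the adjustment factor.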

3. Results

3.1. Framework of CrySPAI

CrySPAI has three main modules, EOA, DFT, and DNN, as shown in Figure 1. The input consists of the chemical composition provided by the user, and the output is the recommended crystal structure that CrySPAI determines to be energetically stable. The iterations in the middle of Figure 1 are used to train a converged model, which is the core of CrySPAI. The accuracy of the model, as well as its application, is a key factor in determining the overall performance of the software. The model is iteratively trained to accurately predict the energy of different structures. The EOA module and DFT module in the iterative loop provide samples for the model training dataset, while the EOA module outside the iterations searches for reasonable structures based on the trained model. Two databases are employed: one stores the results from DFT calculations, and the other contains the converged models.
Model training is a time-consuming and continuously evolving process, particularly when exploring new materials from scratch. However, once the model has been successfully trained, it facilitates efficient structure searching by comparing real-time structural energy information. This process, shown as the right process in Figure 1, enables CrySPAI to generate numerous satisfactory output structures, optimizing the search for the most stable configurations.

3.2. Applications on Crystal Structure Prediction of CrySPAI

The simplest way to predict crystal structures using CrySPAI is to provide the composition of the structure, that is, the element names and the number of atoms of each element. Users who wish to modify the default settings of CrySPAI can do so by editing the parameter file. The default parameter file, args.py, includes structure information, model training parameters, and computational resource settings.

3.2.1. Parameters and Files Preparation

For the structure information, five main parameters need to be specified. The parameter names, default values, and descriptions are listed in Table 1.
Among these parameters, atomType and atomn are required, while the others are optional. The aenergy parameter is used to determine whether a new structure is stable. If the energy of the new structure exceeds the sum of the aenergy values for all atoms, CrySPAI will consider the structure unstable and discard it; otherwise, it will be retained. The crystalsys parameter specifies the crystal system of the target structure. However, throughout the training process, seven crystal systems are used to develop a universal model, as mentioned earlier.
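The stability criterion tied to aenergy can be written as a one-line comparison. The sketch assumes that aenergy is supplied as a per-element reference energy in eV/atom and that the structure energy is a total energy per cell; the actual data layout in args.py is not given in the text, so the function and argument names are hypothetical.

```python
def is_retained(total_energy, atom_counts, aenergy):
    """Keep a structure only if its energy does not exceed the reference.

    total_energy: energy of the candidate cell (eV)
    atom_counts:  element -> number of atoms in the cell
    aenergy:      element -> reference energy per atom (eV), assumed layout
    """
    reference = sum(n * aenergy[element] for element, n in atom_counts.items())
    return total_energy <= reference


# Illustrative values only.
keep = is_retained(-10.0, {"A": 2}, {"A": -4.0})     # -10.0 <= -8.0
discard = is_retained(-7.0, {"A": 2}, {"A": -4.0})   # -7.0 > -8.0
```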
In addition to the structure parameters, some input calculation files must be prepared in advance. The specific files required depend on the calculation software accessed by CrySPAI’s DFT module, and users are free to customize the settings of these files. For the Vienna Ab initio Simulation Package (VASP), the required files include INCAR and POTCAR. If these input files are not provided, CrySPAI will automatically generate an INCAR file with default parameter settings. In the INCAR file, the “Accurate” precision mode is selected to avoid aliasing or wrapping errors. A conjugate-gradient algorithm is used to relax the ions to their instantaneous ground state, with a maximum of 100 ionic steps and an energy convergence criterion of 10⁻⁶ eV/atom. A plane-wave basis set is employed, with the kinetic-energy cutoff determined by the precision mode or manually specified by the user. The smallest allowed spacing between k-points is 0.1 Å⁻¹. In this study, the generalized gradient approximation (GGA) [39], specifically the Perdew–Burke–Ernzerhof (PBE) form, is used to calculate the exchange-correlation energy. While PBE is widely used for inorganic materials, we acknowledge its limitations and recommend that users validate results for specific systems using alternative functionals when necessary. CrySPAI allows users to specify any functional supported by the chosen calculation software, such as LDA, meta-GGA, hybrid functionals, or even dispersion corrections like DFT-D2 or DFT-D3. Users performing DFT calculations with VASP must hold a valid VASP license. CrySPAI also supports other first-principles calculation software, such as Abinit, providing users with flexibility in their computational workflows.
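For orientation, the default settings described above could correspond to an INCAR file along the following lines. This is our reading of the description, not the file that CrySPAI generates: the mapping of the 10⁻⁶ convergence criterion to the EDIFF tag and of the k-point spacing to KSPACING are assumptions, and the helper `render_incar` is hypothetical. The tags themselves (PREC, IBRION, NSW, EDIFF, KSPACING, GGA) are standard VASP INCAR tags.

```python
# Assumed mapping from the prose description to VASP INCAR tags.
DEFAULT_TAGS = {
    "PREC": "Accurate",   # precision mode named in the text
    "IBRION": 2,          # conjugate-gradient ionic relaxation
    "NSW": 100,           # maximum number of ionic steps
    "EDIFF": "1E-6",      # energy convergence criterion (assumed tag)
    "KSPACING": 0.1,      # smallest k-point spacing in 1/angstrom (assumed tag)
    "GGA": "PE",          # PBE exchange-correlation functional
}


def render_incar(tags):
    """Format a tag dictionary as INCAR text, one 'TAG = value' per line."""
    return "\n".join(f"{key} = {value}" for key, value in tags.items()) + "\n"


incar_text = render_incar(DEFAULT_TAGS)
```

Leaving ENCUT out of the dictionary mirrors the statement that the cutoff is determined by the precision mode unless the user sets it manually.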
Parameters related to model convergence and the training method are specified in the arg.py file, and users can review and modify them as needed. Once all preparations are complete, users can configure computational resources and submit tasks to supercomputers. After all iterations are finished, the results, including DFT calculations and trained models, will be stored in the database, and the recommended structure information will be output in a file format.

3.2.2. Crystal Structure Prediction

To demonstrate the effectiveness of the software in structure prediction, we selected several widely studied representative materials, including the electronic device material Si, the photocatalytic material TiO2, the perovskite material CaTiO3, and the metallic material Mg. These materials serve as foundational structures for a wide range of research areas, and their modifications have led to numerous research hotspots. The accuracy of structure prediction for these materials highlights the application potential of our software. Moreover, these representative materials also showcase CrySPAI’s capability to predict structures for elemental crystals, binary compounds, and ternary compounds.
We specified the numbers of atoms and the volumes for Si, TiO2, Mg, and CaTiO3 to generate their crystal structures. The composition and volume information for these materials are provided in Table 2. All models are trained from scratch based on DFT calculation results, with structures generated at a fixed volume. These structures are derived from the seven crystal systems, and the initial model is trained using DFT results from randomly generated structures. The model is then iteratively updated until it converges globally. The EOA module uses the converged model to search for target structures.
To describe the local atomic environment, the feature vector is constructed using a Chebyshev expansion [40], ensuring that the computational complexity of the machine learning model remains manageable even as the number of chemical species increases. Default DFT calculation parameters are employed. The model convergence tolerance is set to 8 meV/atom, and the ’patience’ for epochs with no loss improvement is set to 40, based on repeated testing. Additionally, the swarm algorithm is integrated to enhance the prediction accuracy of the neural network during the training process.
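The interplay between the convergence tolerance and the patience setting can be sketched as follows. The function consumes a sequence of per-epoch RMSE values instead of training a real network, and the rule for combining the two criteria (stop as soon as either is met) is an assumption; only the values 8 meV/atom and 40 epochs come from the text.

```python
def monitor_training(rmse_per_epoch, tolerance=0.008, patience=40):
    """Return (epoch, best_rmse, status) for a stream of per-epoch RMSEs.

    status is 'converged' when the best RMSE reaches the tolerance,
    'stopped' when no improvement is seen for `patience` epochs, and
    'exhausted' when the stream ends first.
    """
    best = float("inf")
    epochs_without_improvement = 0
    epoch = 0
    for epoch, rmse in enumerate(rmse_per_epoch, start=1):
        if rmse < best:
            best = rmse
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if best <= tolerance:
            return epoch, best, "converged"
        if epochs_without_improvement >= patience:
            return epoch, best, "stopped"
    return epoch, best, "exhausted"


# Synthetic loss curves (eV/atom): one that decays, one that plateaus.
decaying = [0.1 * 0.9 ** k for k in range(200)]
plateau = [0.05] * 100
result_decaying = monitor_training(decaying)
result_plateau = monitor_training(plateau)
```

In the plateau case the loop stops after the first epoch plus 40 epochs without improvement, which is the behavior the patience setting is meant to enforce.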
The predicted structure information, including the space group, lattice parameters, and the biases in atomic positions between the predicted and experimental structures, is summarized in Table 2. CrySPAI generates multiple candidate structures for each material, including known polymorphic forms. In Table 2, we present one of the 16 lowest-energy structures for each material to demonstrate that CrySPAI successfully identifies experimentally observed stable phases. Users can access the complete ranked list of predicted structures to explore other stable or metastable configurations. The c/a bias represents the difference in shape between the predicted and experimentally stable crystal structures. The site bias is calculated as the root mean square error (RMSE) of atomic site positions. From Table 2, it can be observed that crystal structures with orthogonal angles exhibit higher accuracy, particularly in terms of atomic coordinate positions. In contrast, the prediction errors for structures with inclined geometries are larger due to the combined deviation in both lattice constants and site positions. Additionally, the space group can be accurately predicted when the volume is fixed, and smaller site bias values are observed in these cases. These findings demonstrate that CrySPAI exhibits strong predictive power for commonly used materials, particularly in identifying stable crystal phases with high precision.

4. Discussion

For an unknown structure, CrySPAI first calculates its volume and then searches for the structural information, including shape and atomic positions. In all these processes, the accuracy of the model is crucial. The network model quickly predicts the energies of structures and recommends the optimal structural candidates for further generation. To enhance the robustness and efficiency of CrySPAI, an adaptive volume adjustment algorithm and a hybrid swarm intelligence algorithm are introduced. The adaptive volume adjustment algorithm is employed to determine an appropriate unit cell, while the hybrid swarm intelligence algorithm—combining the strengths of genetic algorithms, particle swarm optimization, and Bayesian optimization—is used to optimize the neural network, improving model stability and training efficiency. Further details of this method can be found in our previous work [38].

4.1. Volume Prediction Performance

To verify whether CrySPAI can accurately predict a suitable volume for the structure, we first calculated the volumes of five typical metal crystal structures to test the adaptive volume adjustment algorithm we developed. Table 3 presents the predicted and experimental results for these five metal elemental crystals, along with their respective volumes. As shown, the volumes of the structures recommended by CrySPAI are generally consistent with the experimental standard values, with a volume error of less than 3 Å³/cell. This demonstrates that the proposed volume adjustment strategy is effective for searching unknown structures. Additionally, the space groups were predicted with good accuracy, particularly for Ni4, which shows a higher level of precision.

4.2. Swarm Intelligence Algorithm Performance

To evaluate the performance of the swarm intelligence algorithm in CrySPAI, we tested model convergence using a small dataset and conducted structure searches based on trained models for Li, Ca, and Mn. The results were compared with those obtained using the traditional back-propagation algorithm, as shown in Table 4. In this test, the total dataset consisted of approximately 1000 structures; the loss was kept below 0.08 eV/atom; and the iteration count was set to 1500 for each loop of the EOA module used for structure searching.
As shown in Table 4, the proposed swarm intelligence algorithm demonstrates excellent performance in structure searching. It effectively utilizes small datasets to achieve rapid convergence and exhibits strong generalization ability. In comparison, while the traditional back-propagation algorithm also performs well with small datasets, it is more prone to getting stuck in local optima or overfitting, which can hinder the EOA module from finding the optimal structure during the search process. Therefore, the swarm intelligence algorithm in CrySPAI offers faster convergence and higher accuracy, as the recommended individuals provide a stronger initial advantage. Additionally, energy prediction models for TiO2 and CaTiO3 were trained in earlier tests, showing that the neural network optimized by the proposed hybrid swarm intelligence algorithm improves both efficiency and accuracy, particularly for systems with a larger variety of atoms [33].

4.3. Capability Performance of CrySPAI in Crystal Structure Search

To evaluate the performance of CrySPAI, we compare it with two well-known structure search packages, CALYPSO and GSGO, using available data from previous studies [12]. CALYPSO employs a particle swarm optimization algorithm, while GSGO utilizes a genetic algorithm for structure search. Both packages use first-principles calculations for optimization. It is important to note that each package searches different structural systems independently, so we treat the search in each system as one generation. For the same optimized population size Npop, CrySPAI reproduces experimentally reported structures in fewer generations than the other methods, as shown in Table 5. This improvement is primarily due to the efficiency of the DNN model, which reduces computational cost, and the more accurate parent structures provided by the model, allowing CrySPAI to more effectively search for the global optimum.

5. Conclusions

In this study, we developed CrySPAI, a crystal structure prediction software based on artificial intelligence. CrySPAI integrates EOA for the structure search, DFT for energy calculations, and DNN for modeling the relationship between structure and energy. These modules operate both independently and collaboratively, enabling efficient and accurate predictions of crystal structures.
To ensure a robust model, we used a diverse training dataset that includes structures from various crystal systems, stoichiometries, and cell sizes. This approach ensures CrySPAI’s versatility in predicting a wide range of materials. Additionally, CrySPAI employs human–computer collaboration to refine and validate structure predictions, further improving overall accuracy. Currently, CrySPAI is tailored specifically for the prediction of crystal structures of inorganic materials, given their chemical compositions. While the present implementation is optimized for inorganic materials due to their distinct structural and energetic characteristics, the employed methods could theoretically be extended to organic or hybrid systems.
Although CrySPAI has demonstrated significant success in inorganic crystal structure predictions, further developments are required.
  • Dependence on training data. CrySPAI’s prediction accuracy is heavily dependent on the quality and diversity of the training data used for DNN. To address this, we plan to expand the training dataset to cover a broader range of materials, ensuring better generalization and improved performance.
  • Scalability to complex systems. While CrySPAI is currently effective for inorganic materials, its application to highly complex systems, such as amorphous materials, organic materials, or systems with strong electron correlation effects, may require additional computational resources or tailored methods. Future iterations of CrySPAI will involve specific adaptations to handle these complex systems effectively.
  • Generalizability to experimental conditions. CrySPAI predictions are currently conducted under idealized computational conditions (e.g., 0 K and zero pressure). These may differ from experimental conditions, and some phase transitions and defect states are therefore not captured. In future versions, we will extend CrySPAI’s capabilities to simulate dynamic processes, offering deeper insights into material behaviors under realistic conditions.
With these planned advancements, CrySPAI holds the potential to accelerate the discovery of new materials with tailored properties for a wide range of applications.

Author Contributions

All the authors contributed to the study conception and design. Z.W. and Y.W.: methodology, software, validation, writing—original draft preparation, writing—review and editing and funding acquisition; Z.C. and Y.Y.: software, writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB0500202), the National Natural Science Foundation of China (Grant No. 51802312), the Key Research Program of Frontier Sciences, CAS (Grant No. ZDBS-LY-7025), and the Youth Innovation Promotion Association CAS (Grant No. 2021167).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used to support the findings of this study are included within the article.

Acknowledgments

We would like to thank our partners who provided their help during the research process and the team for their great support.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Bergerhoff, G.; Hundt, R.; Sievers, R.; Brown, I.D. The Inorganic Crystal Structure Data Base. J. Chem. Inf. Comput. Sci. 1983, 23, 66–69. [Google Scholar] [CrossRef]
  2. Villars, P.; Cenzual, K.; Gladyshevskii, R.; Iwata, S. PAULING FILE—Towards a holistic view. Chem. Met. Alloys 2018, 11, 43–76. [Google Scholar] [CrossRef]
  3. Curtarolo, S.; Hart, G.L.W.; Nardelli, M.B.; Mingo, N.; Sanvito, S.; Levy, O. The high-throughput highway to computational materials design. Nat. Mater. 2013, 12, 191–201. [Google Scholar] [CrossRef]
  4. Curtarolo, S.; Morgan, D.; Ceder, G. Accuracy of ab initio methods in predicting the crystal structures of metals: A review of 80 binary alloys. Calphad Comput. Coupling Phase Diagr. Thermochem. 2005, 29, 163–211. [Google Scholar] [CrossRef]
  5. Hart, G.L.W.; Curtarolo, S.; Massalski, T.B.; Levy, O. Comprehensive search for new phases and compounds in binary alloy systems based on platinum-group metals, using a computational first-principles approach. Phys. Rev. X 2014, 3, 1–33. [Google Scholar] [CrossRef]
  6. Doll, K.; Schön, J.C.; Jansen, M. Global exploration of the energy landscape of solids on the ab initio level. Phys. Chem. Chem. Phys. 2007, 9, 6128–6133. [Google Scholar] [CrossRef]
  7. Oganov, A.R.; Glass, C.W. Evolutionary crystal structure prediction as a tool in materials design. J. Phys. Condens. Matter 2008, 20, 064210. [Google Scholar] [CrossRef] [PubMed]
  8. Lyakhov, A.O.; Oganov, A.R.; Stokes, H.T.; Zhu, Q. New developments in evolutionary structure prediction algorithm USPEX. Comput. Phys. Commun. 2013, 184, 1172–1182. [Google Scholar] [CrossRef]
  9. Ji, M.; Umemoto, K.; Wang, C.-Z.; Ho, K.-M.; Wentzcovitch, R.M. Ultrahigh-pressure phases of H2O ice predicted using an adaptive genetic algorithm. Phys. Rev. B 2011, 84, 220105. [Google Scholar] [CrossRef]
  10. Wu, S.; Umemoto, K.; Ji, M.; Wang, C.Z.; Ho, K.M.; Wentzcovitch, R.M. Identification of post-pyrite phase transitions in SiO2 by a genetic algorithm. Phys. Rev. B-Condens. Matter Mater. Phys. 2011, 83, 6–9. [Google Scholar] [CrossRef]
  11. Wu, S.Q.; Ji, M.; Wang, C.Z.; Nguyen, M.C.; Zhao, X.; Umemoto, K.; Wentzcovitch, R.M.; Ho, K.M. An adaptive genetic algorithm for crystal structure prediction. J. Phys. Condens. Matter 2013, 26, 035402. [Google Scholar] [CrossRef] [PubMed]
  12. Wang, Y.; Lv, J.; Zhu, L.; Ma, Y. Crystal structure prediction via particle-swarm optimization. Phys. Rev. B-Condens. Matter Mater. Phys. 2010, 82, 094116. [Google Scholar] [CrossRef]
  13. Wang, Y.; Lv, J.; Zhu, L.; Ma, Y. CALYPSO: A method for crystal structure prediction. Comput. Phys. Commun. 2012, 183, 2063–2070. [Google Scholar] [CrossRef]
  14. Pickard, C.J.; Needs, R.J. Ab initio random structure searching. J. Phys. Condens. Matter 2011, 23, 053201. [Google Scholar] [CrossRef]
  15. Lonie, D.C.; Zurek, E. XtalOpt: An open-source evolutionary algorithm for crystal structure prediction. Comput. Phys. Commun. 2011, 182, 372–387. [Google Scholar] [CrossRef]
  16. Ouyang, R. Exploiting Ionic Radii for Rational Design of Halide Perovskites. Chem. Mater. 2020, 32, 595–604. [Google Scholar] [CrossRef]
  17. Bartel, C.J.; Sutton, C.; Goldsmith, B.R.; Ouyang, R.; Musgrave, C.B.; Ghiringhelli, L.M.; Scheffler, M. New tolerance factor to predict the stability of perovskite oxides and halides. Sci. Adv. 2019, 5, eaav0693. [Google Scholar] [CrossRef] [PubMed]
  18. Kusne, A.G.; Yu, H.; Wu, C.; Zhang, H.; Hattrick-Simpers, J.; DeCost, B.; Sarker, S.; Oses, C.; Toher, C.; Curtarolo, S.; et al. On-the-fly closed-loop materials discovery via Bayesian active learning. Nat. Commun. 2020, 11, 5966. [Google Scholar] [CrossRef]
  19. Schleder, G.R.; Padilha, A.C.; Acosta, C.M.; Costa, M.; Fazzio, A. From DFT to machine learning: Recent approaches to materials science-A review. J. Phys. Mater. 2019, 2, 032001. [Google Scholar] [CrossRef]
  20. Schmidt, J.; Marques, M.R.G.; Botti, S.; Marques, M.A.L. Recent advances and applications of machine learning in solid-state materials science. NPJ Comput. Mater. 2019, 5, 83. [Google Scholar] [CrossRef]
  21. Wei, J.; Chu, X.; Sun, X.; Xu, K.; Deng, H.; Chen, J.; Wei, Z.; Lei, M. Machine learning in materials science. InfoMat 2019, 1, 338–358. [Google Scholar] [CrossRef]
  22. Mortazavi, B.; Podryabinkin, E.V.; Novikov, I.S.; Roche, S.; Rabczuk, T.; Zhuang, X.; Shapeev, A.V. Efficient machine-learning based interatomic potentialsfor exploring thermal conductivity in two-dimensional materials. J. Phys. Mater. 2020, 3, 02LT02. [Google Scholar] [CrossRef]
  23. Vasudevan, R.; Pilania, G.; Balachandran, P.V. Machine learning for materials design and discovery. J. Appl. Phys. 2021, 129, 070401. [Google Scholar] [CrossRef]
  24. Xie, T.; Grossman, J.C. Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties. Phys. Rev. Lett. 2018, 120, 145301. [Google Scholar] [CrossRef]
  25. Chen, C.; Ye, W.; Zuo, Y.; Zheng, C.; Ong, S.P. Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals. Chem. Mater. 2019, 31, 3564–3572. [Google Scholar] [CrossRef]
  26. Yuan, Y.; Chen, Z.; Feng, T.; Xiong, F.; Wang, J.; Wang, Y.; Wang, Z. Tripartite interaction representation algorithm for crystal graph neural networks. Scientific Reports 2024, 14, 24881. [Google Scholar] [CrossRef] [PubMed]
  27. Chen, C.; Ong, S.P. A universal graph deep learning interatomic potential for the periodic table. Nat. Comput. Sci. 2022, 2, 718–728. [Google Scholar] [CrossRef] [PubMed]
  28. Deng, B.; Zhong, P.; Jun, K.; Riebesell, J.; Han, K.; Bartel, C.J.; Ceder, G. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nat. Mach. Intell. 2023, 5, 1031–1041. [Google Scholar] [CrossRef]
  29. Xie, F.; Lu, T.; Meng, S.; Liu, M. GPTFF: A high-accuracy out-of-the-box universal AI force field for arbitrary inorganic materials. Sci. Bull. 2024, 69, 3525–3532. [Google Scholar] [CrossRef]
  30. Artrith, N.; Urban, A. An implementation of artificial neural-network potentials for atomistic materials simulations: Performance for TiO2. Comput. Mater. Sci. 2016, 114, 135–150. [Google Scholar] [CrossRef]
  31. Zhang, L.; Lin, D.-Y.; Wang, H.; Car, R.; E, W. Active learning of uniformly accurate interatomic potentials for materials simulation. Phys. Rev. Mater. 2019, 3, 023804. [Google Scholar] [CrossRef]
  32. Zhang, L.; Han, J.; Wang, H.; Car, R.; E, W. Deep Potential Molecular Dynamics: A Scalable Model with the Accuracy of Quantum Mechanics. Phys. Rev. Lett. 2018, 120, 143001. [Google Scholar] [CrossRef] [PubMed]
  33. Wang, J.; Gao, H.; Han, Y.; Ding, C.; Pan, S.; Wang, Y.; Jia, Q.; Wang, H.-T.; Xing, D.; Sun, J. MAGUS: Machine learning and graph theory assisted universal structure searcher. Natl. Sci. Rev. 2023, 10, nwad128. [Google Scholar] [CrossRef]
  34. Cheng, G.; Gong, X.G.; Yin, W.J. Crystal structure prediction by combining graph network and optimization algorithm. Nat. Commun. 2022, 13, 1492. [Google Scholar] [CrossRef] [PubMed]
  35. Merchant, A.; Batzner, S.; Schoenholz, S.S.; Aykol, M.; Cheon, G.; Cubuk, E.D. Scaling deep learning for materials discovery. Nature 2023, 624, 80–85. [Google Scholar] [CrossRef]
  36. Liu, Z.-W.; Wang, Z.-G.; Guo, J.-L.; Wang, Y.-G. Deep Learning Method for Crystal Structure Prediction. Comput. Syst. Appl. 2021, 30, 40–49. [Google Scholar]
  37. Zhao, X.; Shu, Q.; Nguyen, M.C.; Wang, Y.; Ji, M.; Xiang, H.; Ho, K.-M.; Gong, X.; Wang, C.-Z. Interface Structure Prediction from First-Principles. J. Phys. Chem. C 2014, 118, 9524–9530. [Google Scholar] [CrossRef]
  38. Liu, Z.; Guo, J.; Chen, Z.; Wang, Z.; Sun, Z.; Li, X.; Wang, Y. Swarm intelligence for new materials. Comput. Mater. Sci. 2022, 214, 111699. [Google Scholar] [CrossRef]
  39. Perdew, J.P.; Burke, K.; Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 1996, 77, 3865–3868. [Google Scholar] [CrossRef]
  40. Artrith, N.; Urban, A.; Ceder, G. Efficient and accurate machine-learning interpolation of atomic energies in compositions with many species. Phys. Rev. B 2017, 96, 014112. [Google Scholar] [CrossRef]
Figure 1. Schematic workflow of CrySPAI, showing the EOA, DFT, and DNN modules, along with databases for storing DFT results and model parameters.
Figure 2. Structure search flowchart of the EOA module for the cubic crystal system, with “Local Optimization” for selecting stable structures from GA outputs.
Figure 3. Graph representation of the training network. The input is the feature vectors; the output is the atomic energy (E). The colors in the figure are for visual effect only and carry no meaning.
Table 1. Parameters to construct crystal structure for CrySPAI.

| Name | Description | Default Value | Type |
|---|---|---|---|
| atom | element | no default value | string list |
| atomn | number of each atom | no default value | int list |
| avolume | atomic volume of structure | calculated by Equation (2) | float list |
| aenergy | atomic energy | atomn*0.0 [a] | float list |
| crystalsys | crystal system of target structure | all | string |

[a] atomn*0.0 means default atomic energy of every atom in the structure is 0.0.
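As an illustration of how the parameters in Table 1 fit together, the following Python sketch collects them in a small container. It is not CrySPAI's actual input format: the class name `StructureInput`, the use of `None` as a placeholder for `avolume` (which Table 1 says is calculated by Equation (2), not reproduced here), and the choice of one `aenergy` entry per element are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StructureInput:
    """Hypothetical container mirroring the parameters listed in Table 1."""

    atom: List[str]                        # element symbols (no default value)
    atomn: List[int]                       # number of each atom (no default value)
    avolume: Optional[List[float]] = None  # placeholder; Table 1 derives it from Equation (2)
    aenergy: Optional[List[float]] = None  # defaults to 0.0 per entry (assumed: one per element)
    crystalsys: str = "all"                # crystal system of the target structure

    def __post_init__(self) -> None:
        # Each element needs exactly one atom count.
        if len(self.atom) != len(self.atomn):
            raise ValueError("atom and atomn must have the same length")
        # Apply the Table 1 default: every atomic energy starts at 0.0.
        if self.aenergy is None:
            self.aenergy = [0.0] * len(self.atomn)


# Example: the TiO2 composition from Table 2 (8 O atoms, 4 Ti atoms).
tio2 = StructureInput(atom=["O", "Ti"], atomn=[8, 4])
print(tio2)
```

Only `atom` and `atomn` have to be supplied in this sketch; the remaining fields fall back to the defaults listed in Table 1.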
Table 2. Structure information of Si, TiO2, Mg, and CaTiO3. Comparison of experimental and predicted space groups, c/aBias, and siteBias.

| Structure | atomType | atomNumber | Volume of Cell (Å³) | TargSpg/PredSpg | c/aBias | siteBias |
|---|---|---|---|---|---|---|
| Si | Si | 8 | 160.1 | 227/227 | 0 | 0.175 |
| TiO2 | O, Ti | 8, 4 | 136.28 | 227/227 | −0.259 | 0.258 |
| Mg | Mg | 2 | 45.41 | 194/194 | 0.048 | 0.440 |
| CaTiO3 | Ca, O, Ti | 1, 3, 1 | 59.17 | 194/194 | 0 | 0.419 |
Table 3. The predicted volume for different structures.

| Structure | TargVol/PredVol (Å³) | TargSpg/PredSpg | Tolerance (Å) |
|---|---|---|---|
| Cu4 | 50.007/47.238 | 225/225 | 0.2 |
| Ni4 | 42.842/43.774 | 225/225 | 0.01 |
| Mg2 | 45.405/46.454 | 194/194 | 0.1 |
| Zn2 | 30.319/29.792 | 194/194 | 0.2 |
| Zr2 | 46.299/46.570 | 194/194 | 0.1 |
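To make the volume comparison in Table 3 easier to read, the short Python sketch below converts each target/predicted pair into a signed relative deviation. The volumes are copied from Table 3; the helper name `relative_volume_error` and the dictionary layout are illustrative assumptions, not part of CrySPAI.

```python
# Target and predicted cell volumes in cubic angstroms, as read from Table 3.
volumes = {
    "Cu4": (50.007, 47.238),
    "Ni4": (42.842, 43.774),
    "Mg2": (45.405, 46.454),
    "Zn2": (30.319, 29.792),
    "Zr2": (46.299, 46.570),
}


def relative_volume_error(target: float, predicted: float) -> float:
    """Signed deviation of the predicted volume from the target, in percent."""
    return 100.0 * (predicted - target) / target


errors = {name: relative_volume_error(t, p) for name, (t, p) in volumes.items()}

for name, err in errors.items():
    print(f"{name}: {err:+.2f} %")
```

With these values, the predicted volumes deviate from the targets by about 0.6% (Zr2) to about 5.5% (Cu4) in magnitude.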
Table 4. Comparison between swarm intelligence algorithm and traditional back-propagation algorithm in the process of model training and structure searching.

| Structure | Optimized Method | Loss | Dataset Size | Loop Number |
|---|---|---|---|---|
| Li | back-propagation algorithm | 0.074 | 668 | Null |
| Li | swarm intelligence algorithm | 0.009 | 673 | 1 |
| Ca | back-propagation algorithm | 0.075 | 1279 | Null |
| Ca | swarm intelligence algorithm | 0.053 | 1216 | 1 |
| Mn | back-propagation algorithm | 0.042 | 1135 | Null |
| Mn | swarm intelligence algorithm | 0.051 | 335 | 2 |
Table 5. Comparison between CrySPAI and other methods for several structure systems with equal population sizes (Npop).

| Structure | Algorithm | Prototype Structures | Generations | Npop |
|---|---|---|---|---|
| Si | CALYPSO | Diamond | 8//5 | 16 |
| Si | GSGO | Diamond | 15 | 16 |
| Si | CrySPAI | Diamond | 2 | 16 |
| SiC | CALYPSO | Zinc blende | 8//5 | 12 |
| SiC | GSGO | Zinc blende | 5 | 12 |
| SiC | CrySPAI | Zinc blende | 1 | 12 |
| GaAs | CALYPSO | Zinc blende | 16//5 | 12 |
| GaAs | GSGO | Zinc blende | 19 | 12 |
| GaAs | CrySPAI | Zinc blende | 2 | 12 |