*Article* **Tool for Designing Breakthrough Discovery in Materials Science**

**Michiko Yoshitake**

Research Center for Functional Materials, National Institute for Materials Science, Tsukuba 305-0044, Japan; yoshitake.michiko@nims.go.jp; Tel.: +81-298-863-5696

**Abstract:** A database of material property relationships, which serves as a scientific principles database, and a database search system are proposed and developed. The use of this database can support a broader research perspective, which is increasingly important in the era of automated computer-aided experimentation and machine learning of experimental and calculated data. Examples of the wider use of scientific principles in materials research are presented. The database and its advantages are described. An implementation of the proposed database and search system as prototype software is reported. The usefulness of the database and search system is demonstrated by an example of a surprising but reasonable discovery.

**Keywords:** knowledge database; scientific principles; material property relationship; network-type database; interdisciplinary; multidisciplinary; graph search; wide perspective

#### **1. Introduction**

In conventional materials research and development (R&D), researchers explore materials or synthesis processes based on known materials or processes by modifying one or two conditions in the composition or process (conventional search area). The entire search area is very large. For example, the number of five-element systems composed of any combination of 76 practical elements (excluding inert gases and radioactive elements from the periodic table) can be roughly estimated as follows: the number of permutations of five elements from 76 elements (ordered from the largest content) is 76!/(76−5)!, where ! means factorial. If compounds containing the same five elements but with compositions differing by 1 at% are regarded as different compounds, the total number of possible compounds of the five-element system is approximated by 76!/(76−5)! × (100−4)<sup>5</sup> × (1/2)<sup>4</sup>, which is larger than 10<sup>17</sup>. Here, (100−4)<sup>5</sup> (96 at% is the maximum possible concentration) is the possible variation of compositions without considering the order in composition, and (1/2)<sup>4</sup> accounts for the order of the five elements. To increase the search speed, high-throughput experimental techniques [1–3] and automated experimental systems using robotics techniques [4–6] have been developed recently. Machine learning techniques using accumulated data or output data from high-throughput experiments have been introduced in materials R&D [7–11]. Machine learning is a powerful tool for optimizing compositions or process parameters within systems (for example, to find a local minimum) where data are given (that is, the search area consists of various numerical input data). However, because machine learning requires numerical input data, its applications are limited to systems where numerical data for learning exist.
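
The estimate above can be checked numerically. The following is a minimal sketch that only evaluates the expression given in the text; the function and parameter names are illustrative assumptions and are not part of the original work.

```python
from math import perm


def estimate_five_element_compounds(n_elements: int = 76,
                                    n_components: int = 5) -> float:
    """Evaluate 76!/(76-5)! x (100-4)^5 x (1/2)^4 as written in the text."""
    # Ordered selections of five elements out of 76: 76!/(76-5)!
    ordered_selections = perm(n_elements, n_components)
    # Each of the other four elements takes at least 1 at%,
    # so the largest possible concentration is 100 - 4 = 96 at%.
    max_concentration = 100 - (n_components - 1)
    composition_variants = max_concentration ** n_components
    # Factor (1/2)^4 used in the text to account for the order of the elements.
    ordering_factor = 0.5 ** (n_components - 1)
    return ordered_selections * composition_variants * ordering_factor


estimate = estimate_five_element_compounds()
print(f"{estimate:.2e}")
```

The printed value is consistent with the statement that the number of candidate compounds exceeds 10<sup>17</sup>.
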

By contrast, innovative materials or processes have often been discovered in systems far from existing or explored systems. For example, carbon alloy catalysts for fuel cells [12,13] have no metallic components but contain only carbon and nitrogen, whereas most researchers have tried to decrease the Pt or precious metal content of catalysts. Carbon alloy catalysts could not have been discovered by machine learning using existing data on catalysts containing Pt and/or other metals. To develop these catalysts, it appears that the inventor considered basic scientific principles without being limited by commonly used approaches. The scientific principles and functional mechanism are essentially the same as those of known systems. Here, the knowledge of the inventor appears to have contributed to the discovery. Figure 1 schematically illustrates the automated experiment and machine learning loop (blue lines) and human contribution (red lines) in computer-aided materials R&D. The blue loop in Figure 1 is still under development; however, it is gradually becoming apparent that the red path will become increasingly important in the future. Here, the problem is that individuals acquire knowledge mainly by reading books and papers, which limits the breadth of the fields they cover and often results in a narrow outlook on possible approaches. For breakthrough discovery, it is important to support a broader perspective.

**Citation:** Yoshitake, M. Tool for Designing Breakthrough Discovery in Materials Science. *Materials* **2021**, *14*, 6946. https://doi.org/10.3390/ma14226946

Academic Editor: Teofil Jesionowski

Received: 31 October 2021; Accepted: 15 November 2021; Published: 17 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

*Materials* **2021**, *14*, x FOR PEER REVIEW 2 of 16

**Figure 1.** Schematic representation of research process consisting of automated loop with computer aid (**blue**) and human involvement in the process (**red**). Green arrows indicate information inputs.

The author has tried to obtain a broader perspective and has made discoveries, which will be described later. On the basis of these experiences, the author proposed "materials curation", a method of interdisciplinary utilization of scientific principles to solve problems or search for materials from this wider perspective [14–18]. In this method, searches for materials or solutions are conducted beyond the search space in which numerical data are available, as shown schematically in Figure 2, where red indicates more desirable values of target material properties and green indicates less desirable values. To make this method available to many researchers, the author developed the concept of a database of scientific principles in materials science [16–18]. The database of scientific principles is used in the third and fourth stages of "materials curation", where the stages are divided into (1) detach from common approaches, (2) consider what the user wants (not needs), (3) describe conditions that satisfy the wants from the viewpoint of scientific principles, (4) list methods that can satisfy the conditions in principle, (5) test the methods one by one using numerical data, and (6) obtain new solutions for the wants [16]. On the red path in Figure 1, where the knowledge of an individual human is required, knowledge of scientific principles is acquired mainly from books. The interdisciplinary utilization of scientific principles requires knowledge from multiple fields. However, it is somewhat difficult for individuals to read many books from a broad range of fields. Developing and sharing a database of material property relationships to serve as a database of scientific principles (Figure 1, bottom left) would at least partially solve this problem.

Interdisciplinary support is realized by associating material properties not with material types or material usage but with academic fields, as shown in Figure 3. For example, the electrical conductivity is determined by the same principle described in solid-state physics regardless of the value. Metals, semiconductors, and ceramics (which are typically insulators) have different conductivity values, but those values are determined mainly by carrier density, which depends primarily on band gap energy. Here, the electrical conductivity, carrier density, and band gap energy (each of which is a material property) are connected through solid-state physics (blue lines in Figure 3). Because associations among material properties are made based on published electronic textbooks, the names of the academic fields are mostly based on titles or categories of textbooks from publishers. This article describes the database of material property relationships and the system for searching these relationships.

**Figure 2.** Schematic representation of search space with numerical input data (conventional or machine learning) and without numerical input data (materials curation). Labels in the figure: "Conventional": search area composed of various input numerical data; input numerical data are limited (no data, or data not included as input); "Materials curation": indicate regions to be explored (hypothesis formation, no numerical data).

**Figure 3.** Schematic relationships among material properties (usually categorized by material type or usage; black lines) and scientific principles (usually categorized by academic fields; blue lines). Labels in the figure include usage and material categories (polymers, structural materials, magnets, thermoelectric materials, batteries), material properties (conductivity, thermal conductivity, dielectric constant, modulus), and academic fields (solid-state physics, material mechanics, device physics, interface science).
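
The idea of linking material properties through academic fields can be pictured as a small graph. The sketch below is only an illustration under stated assumptions: the three properties and the field name come from the conductivity example in the text, whereas the data layout, the function names (`build_graph`, `find_path`), and the choice to treat relationships as undirected are hypothetical and do not describe the author's prototype software.

```python
from collections import deque

# Each entry links two material properties through the academic field whose
# principles relate them. Properties and field are taken from the example in
# the text; the tuple layout is an assumption made for illustration.
RELATIONSHIPS = [
    ("band gap energy", "carrier density", "solid-state physics"),
    ("carrier density", "electrical conductivity", "solid-state physics"),
]


def build_graph(relationships):
    """Build an adjacency list; relationships are treated as undirected (assumption)."""
    graph = {}
    for prop_a, prop_b, field in relationships:
        graph.setdefault(prop_a, []).append((prop_b, field))
        graph.setdefault(prop_b, []).append((prop_a, field))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search for a chain of related properties from start to goal."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor, _field in graph.get(path[-1], []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no chain of relationships connects the two properties


graph = build_graph(RELATIONSHIPS)
print(find_path(graph, "band gap energy", "electrical conductivity"))
# prints ['band gap energy', 'carrier density', 'electrical conductivity']
```

Because the edges carry the academic field rather than a material type, the same search applies to metals, semiconductors, and ceramics alike, which is the point of the field-based association.
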

#### **2. Examples of Knowledge Utilization**

Here, examples of knowledge utilization by the author are presented to explain the process of perspective broadening.

#### *2.1. Substrate for the Growth of Ultra-Thin Atomically Flat Epitaxial Alumina Film*

Thin epitaxial alumina films have been grown for the study of electron tunneling, model catalysts, and so forth. The most popular substrate used for model catalysts is NiAl(110), where the growth of atomically flat, 0.5 nm thick epitaxial alumina is well known [19]. However, it has been found that a thickness of 0.5 nm is not sufficient to avoid the effects of the metallic underlayer (in this case, NiAl). Therefore, many attempts have been made to use other (metallic) substrates. Figure 4 briefly summarizes the results of these attempts. Two types of substrates have been investigated: the (110) plane of pure body-centered cubic (bcc) metals with high melting temperature such as Ta(110) [20] and Mo(110) [21], and the (110) plane of Al-containing intermetallic compounds such as NiAl(110) and FeAl(110) [22]. On the former type of substrate, aluminum is deposited and then oxidized at high temperatures so that it crystallizes. Alumina is known to grow epitaxially but does not form flat films. The reason is that aluminum–oxygen bonds are so strong that in the first step of oxidation, aluminum atoms agglutinate and become islands. This kind of growth is well known to occur in molecular beam epitaxy (MBE) [23]. For Al-containing intermetallic compounds, preferential oxidation produces flat epitaxial alumina films, but the thickness is less than 1 nm, which is insufficient to avoid the effects of the substrate. In the preferential oxidation of Al-containing intermetallic compounds, O atoms react individually with Al atoms on the upper surface because there is no Al–Al bonding at the surface, and agglutination of Al atoms does not occur. If the Al atomic content is less than stoichiometric, Al atoms below the surface diffuse to the surface and bind with O atoms. Because O atoms do not agglutinate, the diffusion of Al atoms is the rate-determining process. Therefore, the agglutination of Al atoms does not occur, and atomically flat epitaxial films are produced.
This mechanism is used in MBE, although the supply of metallic atoms is not controlled by diffusion from a substrate but by beam flux, for example, in the growth of GaAs [23]. Thicker epitaxial alumina layers (slightly thicker than 0.5 nm) can be grown by alternately supplying Al and O under controlled conditions [24]. The thickness is limited to less than 1 nm because of the symmetry mismatch of the crystal planes. In ultra-thin (nanometer-order) epitaxial alumina films, oxygen atoms typically align in sixfold symmetry on the plane parallel to the surface. The crystal structure of NiAl and FeAl is bcc-like, where atoms are aligned quasi-hexagonally but do not have sixfold symmetry on the (110) plane. The symmetry mismatch between the substrate and alumina film causes strain, which is thought to prevent further growth of epitaxial alumina. This hypothesis is supported by the fact that when a thicker layer of alumina was grown on NiAl(110) by further deposition of Al and O, the structure changed at a thickness of 0.84 nm, and the alumina became amorphous when the thickness reached 1.62 nm [24,25].

The above findings suggest the possibility of using Al-containing alloys that have a crystal plane with sixfold symmetry. The author succeeded in finding alloys that fulfill these conditions and demonstrated the growth of 1–4 nm thick atomically flat alumina films using Cu-9Al(111) as a substrate [26–28]. The key was to expand the search space beyond intermetallic compounds, which rarely have a plane with sixfold symmetry, and to consider alloys as candidate materials.
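
The reasoning in this example, stating the conditions a substrate must satisfy in principle and then listing the candidates that meet them, can be sketched as a simple filter. This is a hedged illustration only: the `Substrate` class, its field names, and the function `satisfies_conditions` are hypothetical, and the flags merely restate how the text describes each substrate (bcc (110) planes lack sixfold symmetry; Cu-9Al(111) is the Al-containing alloy surface with sixfold symmetry).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Substrate:
    name: str
    contains_al: bool      # Al available in the substrate for preferential oxidation
    sixfold_surface: bool  # surface plane has sixfold symmetry, matching the alumina film


# Flags restate the description in the text; they are not measured data.
CANDIDATES = [
    Substrate("Ta(110)", contains_al=False, sixfold_surface=False),
    Substrate("Mo(110)", contains_al=False, sixfold_surface=False),
    Substrate("NiAl(110)", contains_al=True, sixfold_surface=False),
    Substrate("FeAl(110)", contains_al=True, sixfold_surface=False),
    Substrate("Cu-9Al(111)", contains_al=True, sixfold_surface=True),
]


def satisfies_conditions(substrate: Substrate) -> bool:
    """Both conditions derived from the growth mechanism must hold."""
    return substrate.contains_al and substrate.sixfold_surface


selected = [s.name for s in CANDIDATES if satisfies_conditions(s)]
print(selected)  # prints ['Cu-9Al(111)']
```

Restricting the candidate list to intermetallic compounds would leave the filter with no result, which mirrors why widening the search space to alloys was the decisive step.
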
