
consequently, the Fermi level changes, decreasing the generated voltage and thus *S*. However, this explanation of the relationship between *S* and *σ* applies only to doping. A comparison of materials with differently shaped DOSs reveals that there is no trade-off relationship [31]. The reason is that the shape of the DOS depends on the carrier mobility, which is determined by the effective mass of electrons. Therefore, a search for materials that considers not the DOS itself but the shape of the DOS would identify materials that have both large Seebeck coefficients and high electrical conductivity.

**Figure 5.** Relationships among various properties affecting the performance factor of thermoelectric materials.

#### *2.3. Prediction of Work Function from Vickers Hardness*

The work function is a material property that determines the energy barrier to electron transfer in many devices such as transistors, batteries, and solar cells. Although it is a material property, the value is determined not only by the bulk term (the bulk composition and bulk structure) but also by the surface term (the surface composition, which is not necessarily the same as the bulk composition, and surface atomic arrangement and structures, including the arrangement of steps). Figure 6 shows various material properties that contribute to the work function. In the devices mentioned above, the main functional material is sandwiched between two metallic electrodes, one with low work function and the other with high work function. Most materials with low work function, such as alkali metals, are very reactive. Among low-work-function materials, transition metal carbides (TMCs) and nitrides (TMNs) are less reactive and relatively easy to handle in device processing. Carbides such as TiC and TaC are in practical use.

TMCs are non-stoichiometric compounds, and the carbon content often deviates from a 1:1 ratio, resulting in the formula TMCx (x < 1). The work function is affected by the stoichiometry, but only two experimental results on this effect for well-defined surfaces have been reported [33]. First-principles calculations of these two systems have also been reported [34]; they show that carbon deficiency does not affect the surface term of the work function. In addition, first-principles calculations have shown that the surface term of the work function of other TMCs remains constant under carbon deficiency. Therefore, carbon deficiency affects only the bulk term of the work function, and the question is how to estimate this bulk term. From the origin of the work function [35], the author found that the Vickers hardness can be used as one measure of the bulk term of the work function of TMCs and TMNs in general [36]. Figure 6 was compiled on the basis of the above consideration. Once such a diagram is created and published, other researchers who are not familiar with the work function but need to control it for their devices can use it as a reference without following the author's entire thought process as described in [36].

**Figure 6.** Relationships among factors contributing to the work function, compiled from descriptions in books and review articles.

#### **3. Relationship between Material Properties**

If a diagram of the relationships between various material properties, such as Figure 5, is stored as a database and shared among many materials scientists, material development is expected to be greatly accelerated. Consequently, the author proposed a system composed of a database of relationships between various material properties and a search tool for the database [16,17,37,38], as shown schematically in Figure 7. Many relationships among material properties, which are given literally, are extracted from texts as pairs of two material properties either (a) manually, where a person reads textbooks and learns the relationships as in Figures 5 and 6, or (b) automatically, using natural language processing techniques and a computer. The extracted pairs are input into a database (<Input of relations> in Figure 7). The database of sets of material property pairs is represented as a graph, and users search for relations in this graph (<Search of relations (users)> in Figure 7).

The characteristic feature of the relationship database is its graph-type (network-type) structure, which consists of nodes (material properties) and edges (relations between material properties). This database is completely different from conventional material databases, which contain material names or compositions and the values of material properties such as melting point, density, and dielectric constant. There are no numerical values in the database; its contents are not numerical data but words such as "density". Like a train map, the database describes connections. The sources of scientific principles are mainly literal (including mathematical formulas), not numerical. Literal information describes essentially universal relationships that are independent of specific material compositions, whereas numerical data are useful for specific material systems.


**Figure 7.** Schematic structure of the proposed system, which enables searches for relationships among material properties.
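The node-and-edge structure described above can be illustrated with a minimal sketch. It is not the author's implementation: the function name `build_property_graph`, the choice of an undirected graph, and the three example pairs are all assumptions made for illustration.

```python
from collections import defaultdict

def build_property_graph(pairs):
    """Build an undirected graph (adjacency sets) from pairs of property names."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

# Hypothetical pairs for illustration only; they are not taken from the database.
pairs = [
    ("work function", "binding energy"),
    ("binding energy", "Vickers hardness"),
    ("work function", "surface composition"),
]
graph = build_property_graph(pairs)
print(sorted(graph["work function"]))  # ['binding energy', 'surface composition']
```

Note that the graph stores only property names and their connections, with no numerical values, which matches the train-map analogy in the text.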

The advantage of a graph-type database is that it is easy to add or remove data on connections, as shown in Figure 8a. Consequently, it is easy to expand the area of scientific principles covered by the relationship database by connecting a material property mentioned in two textbooks in different academic fields (Figure 8b). Basic techniques for searching for relationships (connections) have been established in the framework of graph theory in mathematics [39] and are widely used in society, for example, in route searches on a train map.

Graph-type databases are searched mainly by network searches and path searches, as shown in Figure 9. Here, each node (A, B, C, etc.) represents a material property such as density, thermal conductivity, or Vickers hardness, and each edge shows the relationship between two connected properties. Using a network search, one can, for example, find the material properties that affect the target property M. A path search is useful, for example, when a material modification that increases material property A causes an unexpected and undesirable decrease in material property B. By searching the paths from A to B, one can find relationships on these paths that might cause the decrease in B with increasing A. It is also possible to search for ways of avoiding trade-off relationships (Figure 9c) by combining a path search and a network search, for example, by finding nodes that do not have a path to A without passing through H (J in Figure 9c) or finding nodes that connect directly to H but have a long path from A (H in Figure 9c). A node with a long path is usually expected to have less effect on a target node (=property), because many other nodes also affect the target node; such nodes can be used to avoid a trade-off relationship between A and H.

*Materials* **2021**, *14*, x FOR PEER REVIEW 9 of 16

(a) Adding a graph; (b) Connecting properties through different academic fields

**Figure 8.** Graph-type database that enables the easy addition of a graph (**a**) and easy expansion of academic fields (**b**), where different colors indicate different academic fields such as materials mechanics, solid-state physics, and chemical thermodynamics.
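The expansion across academic fields in Figure 8b can be sketched as follows. This is an illustrative sketch only: the helper `add_relations`, the per-node field bookkeeping, and the example pairs and field labels are assumptions, not part of the published system.

```python
def add_relations(graph, fields, pairs, field):
    """Add property pairs from one source and record the academic field of each node."""
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
        fields.setdefault(a, set()).add(field)
        fields.setdefault(b, set()).add(field)

graph, fields = {}, {}
# Hypothetical relations from two textbooks in different fields.
add_relations(graph, fields,
              [("Vickers hardness", "binding energy")], "materials mechanics")
add_relations(graph, fields,
              [("binding energy", "work function")], "solid-state physics")

# A property mentioned in both sources links the two sub-graphs.
bridges = sorted(p for p, f in fields.items() if len(f) > 1)
print(bridges)  # ['binding energy']
```

Because adding a source only inserts nodes and edges, no existing entries need to be restructured, which is the ease of addition that the text attributes to graph-type databases.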

(a) Network search around M; (b) Path search between A and B; (c) Avoiding trade-off relationships

**Figure 9.** Two basic searches, (**a**) network search and (**b**) path search, and (**c**) a search for avoiding trade-offs, which can be realized by combining path search and network search.
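The two basic searches in Figure 9 correspond to standard graph traversals. The following is a minimal sketch under stated assumptions: the function names, the `depth` and `max_len` parameters, and the small toy graph (which does not reproduce the topology of Figure 9) are hypothetical.

```python
from collections import deque

def network_search(graph, target, depth=1):
    """Return the properties within `depth` edges of `target` (breadth-first)."""
    seen = {target}
    frontier = deque([(target, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue  # do not expand beyond the requested depth
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    seen.discard(target)
    return seen

def path_search(graph, start, goal, max_len=4):
    """Return all simple paths from `start` to `goal` with at most `max_len` edges."""
    paths, stack = [], [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            paths.append(path)
            continue
        if len(path) - 1 >= max_len:
            continue  # path is already at the length limit
        for nxt in graph.get(node, ()):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                stack.append(path + [nxt])
    return paths

# Toy graph for illustration; node labels are arbitrary.
graph = {
    "A": {"C", "E"},
    "C": {"A", "B"},
    "E": {"A", "M"},
    "M": {"E", "B"},
    "B": {"C", "M"},
}
print(sorted(network_search(graph, "M")))    # ['B', 'E']
print(sorted(path_search(graph, "A", "B")))  # [['A', 'C', 'B'], ['A', 'E', 'M', 'B']]
```

Inspecting each edge on the returned paths is how one would look for the relationship responsible for an unexpected change in B when A is modified.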

#### **4. Computer Systems**

Although the small system shown in Figure 7 has been developed and demonstrated [16,17,40], the number of material properties and relationships stored in the system is quite limited because the relationships between material properties were extracted manually. Computer technology for automated relationship extraction is essential for practical use. The author has collaborated with a company to realize automated relationship extraction from several textbooks on materials science, and a prototype system has been developed as a result of this collaborative project [41]. The relationships between material properties automatically extracted from the 12 textbooks listed in Table 1 are currently included in the web-based system.

Figure 10 shows an example of the system output for a path search (Figure 9b) between work function and Vickers hardness, whose relationship was explained in Section 2.3. The descriptions in the textbooks are not the same as those the author read, but the system also suggests the possibility of estimating work function values from Vickers hardness (there is a connection), and the properties shown in Figures 6 and 10 (path with red dotted lines) show considerable overlap. In the computer system, the path whose nodes (material properties) appear in the largest number of academic fields (represented by the colored circles around the material properties) is shown with thick edges, indicating the most multidisciplinary path. An example of the system output for a network search is shown in Figure 11. Because it is not commonly known that the work function is related to the Vickers hardness, a network search would be useful for finding properties that can be used to estimate the work function. In this case, a network search beginning with a target property (here, the work function) is used.
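One way to pick the most multidisciplinary path is sketched below. The text does not state how the prototype scores paths, so the scoring rule here (the sum, over the nodes of a path, of the number of academic fields in which each node appears) is an assumption, as are the function name and the example field labels.

```python
def most_multidisciplinary(paths, fields):
    """Pick the path whose nodes appear in the most academic fields in total."""
    def score(path):
        # Assumed scoring rule: sum of per-node field counts.
        return sum(len(fields.get(node, ())) for node in path)
    return max(paths, key=score)

# Hypothetical field labels for illustration only.
fields = {
    "work function": {"surface physics", "electrochemistry"},
    "binding energy": {"solid-state physics", "chemistry", "materials mechanics"},
    "density": {"materials mechanics"},
    "Vickers hardness": {"materials mechanics"},
}
paths = [
    ["work function", "binding energy", "Vickers hardness"],
    ["work function", "density", "Vickers hardness"],
]
best = most_multidisciplinary(paths, fields)
print(best)  # ['work function', 'binding energy', 'Vickers hardness']
```

A summed score favors longer paths, so an average per node or a count of distinct fields may be preferable in practice; the choice is left open here.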


**Table 1.** List of textbooks used for the prototype system.

| Book Title | Author(s) | Publisher | Year |
|---|---|---|---|
| Ceramic Materials: Science and Engineering | C. Barry Carter, M. Grant Norton | Springer | 2013 |
| Solid State Electrochemistry I: Fundamentals, Materials and their Applications | Vladislav V. Kharton | WILEY | 2009 |
| Fundamentals of Materials Science | Eric J. Mittemeijer | Springer | 2011 |
| Understanding Materials Science | Rolf E. Hummel | Springer | 2004 |
| Materials Handbook | François Cardarelli | Springer | 2018 |
| The Chemical Bond I–III | D. Michael P. Mingos, ed. | Springer | 2016 |
| Electrochemistry for Materials Science | Walfried Plieth | Elsevier | 2008 |
| Electronic Properties of Materials | Rolf E. Hummel | Springer | 2011 |
| Physics of Semiconductor Devices | Simon M. Sze, Kwok K. Ng | WILEY | 2006 |
| Principles of Surface Physics | Friedhelm Bechstedt | Springer | 2003 |
| Physics of Surfaces and Interfaces | Harald Ibach | Springer | 2006 |
| Solid Surface Physics | Heribert Wagner | Springer | 1979 |

**Figure 10.** Computer system results screen showing a path search between the material properties of work function and Vickers hardness. Red dotted lines are shown for comparison with the manually compiled relationship in Figure 6.


**Figure 11.** Results screen for sequential network search starting from work function.

The search result in Figure 11 uses the trace function (a sequential network search that repeats the search of Figure 9a while retaining the previous network search results); the search begins at work function and reaches binding energy. This result suggests that properties such as density and absorption edge might be used in addition to hardness to estimate the work function. For TMCs, experimental results on the effect of carbon deficiency on density are expected to exist, whereas results on absorption edges are not. It is reasonable to regard density as a measure of the binding potential depth in Figure 6: when carbon deficiency decreases both the molar mass and the lattice constant, the density would increase if the bonds in the carbide become stronger (that is, if the binding potential becomes deeper).

The author checked references on the density of TMCs with carbon deficiency. The effect of carbon deficiency on the density of TiCx [42] and ZrCx [43] (group IV TMCs) and VCx [44] and TaCx [45] (group V TMCs) is shown in Figure 12a, where the density is calculated from lattice constants obtained by X-ray diffraction measurements and the molar mass at the stoichiometry given in the references. In Figure 12b, the effects of carbon deficiency on hardness, which was previously used as a measure of the bulk term of the work function, are also shown for comparison. The absolute values of the density clearly depend on the atomic radius of the transition metal. Therefore, the density is plotted as a relative value, and only the qualitative dependence of density on the stoichiometry is considered.

For TiCx and ZrCx, whose phase diagrams show a wide region of one carbon-deficient phase, the density decreases monotonically with increasing carbon deficiency (decreasing x), as demonstrated in Figure 12a, in agreement with the trend of hardness in Figure 12b. For VCx and TaCx, judging from the change in hardness with carbon deficiency, the density is expected to increase with increasing carbon deficiency near stoichiometry (0.9 < x < 1.0). Although TaCx shows the expected dependence on carbon deficiency, density values for 0.9 < x < 1.0 are missing for VCx. The density of VCx decreases with carbon deficiency for x < 0.87, which is consistent with the hardness trend. In the phase diagram of the binary system of V and C [46], VCx exists in the range 0.66 < x < 0.89 at 1650 °C, where the concentration of C dissolved in metallic V is the maximum. This range is in agreement with the data range for the density in Figure 12a. Therefore, the density, like the Vickers hardness, is considered to be useful as a measure of the bulk term of the work function for VCx. TaCx exists in the range 0.68 < x < 0.99 at 2843 °C, where the concentration of C dissolved in metallic Ta is the maximum. Because the composition at which the hardness is maximum is somewhat unclear, it is difficult to discuss the behavior of TaCx near the lower limit of x. In summary, it appears that the density can be used as an indicator of the effect of carbon deficiency on the bulk term of the work function in TMCs, at least in the composition range in which the carbon deficiency is small and the TMCx phase exists in the phase diagram.
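The density calculation from a lattice constant and a molar mass described above can be sketched as follows. The sketch assumes an ideal rock-salt (NaCl-type) cell with four formula units per cell, fully occupied metal sites, and vacancies only on carbon sites; the function name is hypothetical, and the lattice constant used in the example is an illustrative value for TiC, not a value taken from [42].

```python
AVOGADRO = 6.02214076e23  # 1/mol (exact SI value)
M_CARBON = 12.011         # g/mol, standard atomic weight of carbon

def rocksalt_density(a_angstrom, m_metal, x):
    """X-ray density (g/cm^3) of rock-salt TMC_x with 4 formula units per cell."""
    a_cm = a_angstrom * 1e-8                                 # angstrom -> cm
    mass_per_cell = 4 * (m_metal + x * M_CARBON) / AVOGADRO  # grams per unit cell
    return mass_per_cell / a_cm**3

# Illustrative input: a = 4.328 angstrom (assumed), M(Ti) = 47.867 g/mol.
rho = rocksalt_density(4.328, 47.867, 1.0)
print(f"{rho:.2f} g/cm^3")  # roughly 4.9 g/cm^3
```

At a fixed lattice constant the density falls as x decreases, so an observed increase in density with carbon deficiency requires the lattice to contract enough to outweigh the lost carbon mass.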

In the above example, the density of carbon-deficient TMCs was checked manually because there is no retrievable database. However, automated data collection and data presentation, as shown in Figure 12b, should be possible in principle, which would assist an individual researcher in the design process illustrated in Figure 1.

The system presented here is still a prototype. Development into a product and its commercialization are necessary in the future. In addition, many additional functions are desired, such as quantitative relationships, links with numerical databases, and machine learning. Finding new relations on the basis of the structure of the graph database could also be explored, because a considerable number of scientific principles are represented in similar forms, such as those in particle mechanics and geostatics, and those for the electric field and the magnetic field in electromagnetics.

**Figure 12.** Composition dependence of (**a**) the density of TMCs and (**b**) the Vickers hardness [36], shown for comparison. VEC is the abbreviation of "valence electron concentration" [36].


#### **5. Conclusions**

A materials informatics method that uses knowledge of scientific principles as well as numerical data was proposed. The use of systematic knowledge of scientific principles enables a broader perspective that is less limited by commonly used approaches. Some examples of material search and prediction using very little experimental data were shown to demonstrate the advantage of using scientific principles. Then, a system consisting of a database of knowledge on the relationships between material properties and a relationship search function, which is being developed by the author and collaborators, was presented. Finally, the author's finding that the effect of carbon deficiency on the work function of TMCs can be estimated from the density was presented to demonstrate the usefulness of the system.

#### **6. Patents**

The following five patents are related to this article: (1) a property relationship database and search system, (2) the system with options on priority, (3) the system with modified search, (4) the system with user information including search history, and (5) the system with combined search, used, for example, for avoiding trade-offs.


**Funding:** This research was partly funded by Grants-in-Aid from the Ministry of Education and Science of Japan JSPS, KAKENHI Grant Number JP16K06283.

**Data Availability Statement:** The data presented in this study are available on request from the author.

**Acknowledgments:** This study is partly supported by Grants-in-Aid from the Ministry of Education and Science of Japan JSPS, KAKENHI Grant Number JP16K06283.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**

