1. Introduction and Background
While there is a large variety of extraction algorithms for geometric features, such as point and translation symmetries from gray level patterns that are more or less periodic in two (2D) and one (1D) dimensions [
1,
2], related comments by Kenichi Kanatani [
3] on symmetry as a continuous and hierarchic feature have been largely ignored for the last two decades by the computational symmetry and applied crystallography communities alike. The notable exceptions in this respect are the work by Yanxi Liu and coworkers [
4,
5,
6] on 1D periodic time series in the form of sequentially recorded 2D images, which was done more than a decade ago, and the much more recent work by the author of this review on objective 2D Bravais lattice type assignments to noisy images [
7].
While the applied crystallography community typically speaks of ‘crystal patterns’ when it refers to atomically resolved images [
8], more or less 2D and 1D periodic patterns where the individual pixels possess digitized intensity values (i.e. gray-levels rather than colors) are commonly referred to as ‘near regular textures’ within the computational symmetry community [
1]. With the title of this review, it is, thus, implied that its main targets are fellow members of the applied crystallography community. This paper should, however, also be of interest to the computational symmetry community because the underlying mathematical and statistical frameworks are identical when images are considered as data planes from which geometric-structural information is to be extracted and classified, regardless of the instruments with which they were recorded.
For the computational symmetry community [
1,
2,
4,
5,
6] and with regards to Kanatani’s associated developments in the robotics/computer vision fields [
3,
9,
10,
11,
12], it is entirely natural to consider images as data planes. While this is largely because there are no microscopes and specifics of the underlying physics of the imaging process involved that may need modeling, modern microscopes are so good now that the data plane approach also works well in materials science and structural biology.
It should, therefore, not come as a surprise that this review follows the existing leads from the computational symmetry community but also goes beyond the current state of affairs in crystallographic symmetry classification schemes when multi-model inferences are discussed. The conclusion section of a recent review of the computational symmetry field states fittingly that “strategies … for handling real world complexity have to be developed to deal with … the issue of subgroup relations among symmetry groups, raised by Kanatani” [
1]. The time seems indeed to be right for these kinds of developments and this paper reviews both the statistical foundation and the wider crystallographic implications of them. The latter is mainly done in appendices, which may be of limited interest to members of the computational symmetry community.
More or less 2D periodic Islamic building ornaments were assigned to plane symmetry groups in [
2] on the basis of the careful elucidation of the approximate site symmetries of conspicuous parts of periodic motifs in direct space. These elucidations rely, however, critically on arbitrary thresholds and must, therefore, always be subjective. In that kind of approach, the final plane symmetry group assignment can never be objective.
Utilizing Kanatani’s approach [
3,
9,
10,
11,
12], the authors of [
2] could, in principle, transform their classifications into objective ones in spite of the multitude of irregularities/defects that their analyzed Islamic ornaments contain. Were that done, model selection uncertainties [
13,
14,
15,
16,
17] would need to be addressed properly. A solution to the latter problem will be presented in this review in a qualitative way as well. Note that model selection uncertainties are not addressed in the work of Liu and coworkers [
4,
5,
6] either.
The problems associated with the above-mentioned subjective [
1,
2] crystallographic symmetry classifications, Kanatani’s new statistical theory [
3,
9,
10,
11,
12], and systematic ways of dealing with model selection uncertainties [
13,
14,
15,
16,
17] became more relevant to the applied crystallography community with the recent emergence of both the crystalline ‘materials per design paradigm’ [
18] and model-based approaches to the imaging of crystals and long-range ordered materials. Straton and co-workers [
19,
20] utilized, for example, the above-mentioned objective translation symmetry type classification scheme [
7] for the detection and subsequent correction of double and multiple mini-tip artifacts in scanning tunneling microscope (STM) images of more or less 2D periodic arrays of molecules on a crystal surface by means of crystallographic image processing [
21,
22].
Independent of the type of microscope with which the data have been recorded, the purpose of crystallographic image processing is the extraction of geometric-structural information from noisy 2D periodic images. The translation and site/point symmetries in the hypothetical noise-free version of the image are taken advantage of as one averages over the asymmetric unit so that a better signal-to-noise ratio is obtained for the structure of interest. Note that the averaging over the asymmetric unit (rather than the translation periodic unit cell) ensures that better results are obtained than those achievable with traditional Fourier filtering [
7]. This is because the multiplicity of the general position [
23,
24] boosts the number of entities over which one averages by a factor of up to 12.
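The benefit of averaging over the asymmetric unit can be illustrated with a minimal sketch. It assumes, as a simplification that is not taken from the cited work, that the noise in all symmetry-equivalent copies is independent, zero-mean, and of equal standard deviation, so that the noise of the average falls with the square root of the number of copies. The function name and the numbers are hypothetical.

```python
import math

def noise_after_averaging(sigma, n_unit_cells, multiplicity=1):
    """Standard deviation of the mean of independent, equally noisy copies.

    Assumption (not from the text): the noise is independent and zero-mean
    with the same standard deviation `sigma` in every averaged copy.
    """
    n_copies = n_unit_cells * multiplicity
    return sigma / math.sqrt(n_copies)

# Hypothetical numbers: 100 unit cells and plane symmetry group p6mm, whose
# general position has the maximal multiplicity of 12.
sigma = 0.10
translation_only = noise_after_averaging(sigma, 100)      # unit cell averaging
asymmetric_unit = noise_after_averaging(sigma, 100, 12)   # asymmetric unit averaging
gain = translation_only / asymmetric_unit                 # equals sqrt(12)
```

Under these assumptions, the factor of up to 12 in the number of averaged entities translates into a reduction of the residual noise by a factor of up to about 3.5.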
The noisy 2D periodic image is considered to constitute a data plane and the models for the data at the foundation of crystallographic image processing are the 17 plane symmetry groups of 2D crystallography [
23,
24], which represent all possible combinations of translation and point/site symmetries in the Euclidean plane. Crystallographic image processing originated about 50 years ago within the structural biology community [
25] and contributed under the name ‘crystallographic electron microscopy’ (also known as ‘Fourier’ or ‘pseudo-kinematic’ electron microscopy) to the award of the 1982 Nobel Prize in Chemistry to Sir Aaron Klug.
Another type of model-based imaging in atomic resolution microscopy [
26,
27,
28,
29,
30] with a complementary foundation originated as a very promising approach to quantitative transmission electron microscopy (TEM) at the University of Antwerp (Belgium) at the beginning of the 21st century and led to the award of the 2017 Ernst Ruska Prize to Sandra Van Aert. The underlying procedures of that approach are analogous to single-crystal X-ray crystallography in so far as one distinguishes between the ‘solving’ of the structure and the ‘refinement’ of the resolved structure [
26]. First, the structure is resolved by the imaging of individual projected atomic columns in a more or less 2D periodic array with a state-of-the-art TEM. This is followed by a maximum likelihood refinement of the position and chemical composition of the atomic columns in that array.
Since the number of atoms in projected columns can be determined with single-atom accuracy when an aberration corrected TEM is utilized for the model-based imaging [
29,
30], a tomographic enhancement, i.e. the combining of structural information that was obtained from several atomic resolution images in different projections, was not necessary for the determination of the 3D structure of nanocrystals for which the thickness did not vary widely from atomic column to atomic column. It is this author’s opinion that the aforementioned model-based atomic-resolution approach to quantitative TEM could benefit from both the complementary geometric Akaike Information Criterion (G-AIC) approach that is outlined below in general terms and crystallographic image processing.
A recent paper by Vasudevan and co-workers seems most suitable to illustrate the need for this review at the present time. It describes a geometric-structural feature extraction approach in which a window slides over a noisy, atomically resolved transmission image of a crystal and the discrete Fourier transform (dFT) is calculated at consecutive window positions so that the locations of different crystal phases can be mapped in two dimensions [
31]. The authors of that paper state that it would, in principle, be possible to derive the local crystallography, i.e. the Bravais lattice type and plane symmetry group, of different types of more or less 2D periodic entities on crystal surfaces or within crystalline matrices from the data that they recorded with their sliding dFT windows in a scanning transmission electron microscope (STEM), but also caution that this “would require substantial efforts at developing the appropriate image classification schemes” [
31].
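The sliding-window idea can be sketched in a few lines. This is an illustrative sketch only and not the implementation of the cited authors: the function name, the Hann taper, the window size, and the step width are assumptions, and the test image is a synthetic cosine grating.

```python
import numpy as np

def sliding_dft_magnitudes(image, window, step):
    """Return |dFT| of square windows taken at regular steps over `image`.

    Output shape: (rows, cols, window, window), where rows and cols count
    the window positions along the two image axes.
    """
    height, width = image.shape
    # Assumption: a Hann taper is applied to reduce edge leakage.
    taper = np.outer(np.hanning(window), np.hanning(window))
    tops = range(0, height - window + 1, step)
    lefts = range(0, width - window + 1, step)
    out = np.empty((len(tops), len(lefts), window, window))
    for i, top in enumerate(tops):
        for j, left in enumerate(lefts):
            patch = image[top:top + window, left:left + window] * taper
            # fftshift moves the zero-frequency term to the window center.
            out[i, j] = np.abs(np.fft.fftshift(np.fft.fft2(patch)))
    return out

# Hypothetical test image: cosine grating with a period of 8 pixels
# along the horizontal image axis.
yy, xx = np.mgrid[0:64, 0:64]
image = np.cos(2 * np.pi * xx / 8.0)
spectra = sliding_dft_magnitudes(image, window=32, step=16)
```

Each entry `spectra[i, j]` is the local amplitude spectrum at one window position. A classification scheme of the kind discussed in this review would then have to operate on these local spectra.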
Crystallographic classification schemes for 2D periodic patterns have been in existence for over nine decades [
32,
33]; see [
23] for an authoritative, brief and mathematically comprehensive modern description as well as [
34] for a college-level textbook. The real problem that needs to be addressed in the above-mentioned context of the sliding dFT windows is, however, how to make crystallographic classifications objectively on the basis of results from some non-ideal algorithm and when only noisy data are available, as is the case in all real world applications.
The situation is analogous to what is encountered in the field of crystallographic 1D periodic classification schemes for gray-level patterns. The mathematical background of frieze symmetries and their projections from layer symmetries has been around for decades and is neatly summed up in an authoritative text [
35], which follows the same outline as the comprehensive description of all plane symmetries of gray-level patterns [
23,
24] as projections from 3D space symmetries. The problem is again how to make classifications objectively on the basis of noisy experimental image data and without adding a subjective value judgment to arrive at one crystallographic symmetry class only.
More or less 1D periodic 2D images of crystalline materials, such as aberration-corrected STEM images of plane coincidence site lattice (CSL) grain boundaries in edge-on projections which are atomically resolved [
36,
37,
38,
39,
40,
41] are known to be underlain by both predictable [
41] types of frieze symmetries and 3D atomic level bi-crystal structures [
42,
43,
44,
45,
46]. There is, at present, however, no objective way to extract the parameters of grain boundary structures at the atomic level from such images. Subjectivity in the experimental determination of the very basic Σ value (CSL index) has, for example, been recently discussed in [
47].
The core ideas of the crystallographic processing of noisy 2D images could be transferred to images that are periodic in 1D only as a first step towards the development of objective crystallographic symmetry classification schemes on the basis of Kanatani’s statistical theory [
3,
9,
10,
11,
12] and systematic ways of dealing with model selection uncertainties [
13,
14,
15,
16,
17]. This would be equivalent to the adaptation of the proposal of this review to 1D periodic cases. The atomistic model-based approach that was pioneered at the University of Antwerp [
26,
27,
28,
29,
30] could also be brought to bear on the extraction of geometric-structural information from atomic resolution images of grain boundaries.
Appendix A and [
48,
49] provide some more background on CSL (and approximate low-CSL index) grain boundaries in order to illustrate opportunities for 1D periodic symmetry classifications in that particular field.
As soon as suitable classification schemes have been demonstrated that work without any arbitrarily set thresholds, a robot could be programmed to classify input images automatically and sort them into crystallographic databases for more or less 2D or 1D periodic patterns objectively. It would then be up to the user of such databases to (subjectively) interpret the objectively-reported classification results. The author of this review presents here key aspects of his novel crystallographic symmetry classification scheme that is designed to work well in the presence of geometric-structural feature extraction uncertainties of the types that exist in more or less 2D and 1D periodic images.
Noise in the imaging process, as well as geometric-structural feature extraction uncertainties in the processing of an image with some real world (non-ideal) algorithm, will necessarily break all pre-existing symmetries of a crystalline sample (or those that a synthetic image may possess due to its design) so that there will only be non-genuine pseudo-symmetries left to be classified. The image is then, of necessity, only translation periodic to a larger or smaller extent so that none of the strict mathematically abstract restrictions of 2D [
23,
24] and 1D [
35] crystallography are applicable anymore.
Further complications arise when there are genuine pseudo-symmetries [50] in the hypothetical noise-free version of a 2D or 1D periodic image. In the presence of noise, geometric-structural feature extraction procedures cannot readily distinguish between non-genuine pseudo-symmetries that combine to form the underlying symmetry group structure of the hypothetical noise-free version of the image, on the one hand, and genuine pseudo-symmetries that exist in addition to this structure [
50], on the other hand. Within this review, we will occasionally refer to non-genuine pseudo-symmetries as pseudo-symmetries of a different (or second) kind.
Appendix B provides more information on different types of pseudo-symmetries.
Since instances of the latter kind of pseudo-symmetries may be mistaken for instances of the former kind, the wrong underlying symmetry group structure may be inferred so that subsequent crystallographic classifications would be in error. Vice versa, due to noise in the experiments, non-genuine pseudo-symmetries may be mistaken for genuine pseudo-symmetries so that crystallographic symmetry classifications result which underreport the factually existing symmetry when an extrapolation to a zero-noise level is made. Genuine pseudo-symmetries also play important roles in twinning and the formation of multiple domains in crystalline solids [
51,
52].
Mix-ups of genuine and non-genuine pseudo-symmetries that lead to symmetry classification problems in both inorganic crystal structures and molecular crystals (both small and large) in the presence of experimental noise are discussed for the mainstream 3D crystallography case in Appendix C and [
53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82].
The three largest crystallographic databases for mainstream 3D crystallography results [
83,
84,
85,
86,
87] are also briefly mentioned in Appendix C.1. Two of these databases are open access [
83,
84,
86,
87]. Specifics of single-crystal X-ray protein crystallography are mentioned in [
88] and [
89]. A critical review of crystal structure determinations by means of single-crystal X-ray crystallography in general is provided in [
90]. Standard statistical descriptions and the utility of contemporary null hypothesis tests in mainstream 3D single-crystal X-ray crystallography are discussed in Appendix D and [91, 92, 93, 94, 95, 96, 97, 98, 99, 100].
On the basis of [101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113], Appendix C.3 presents this author’s assessment of the single-crystal X-ray crystallography structure study of a highly topical metal-organic framework (MOF) compound [
101,
102]. That compound is probably incorrectly classified in the major databases for that class of material [
85,
86,
87] due to an unrecognized pseudo-symmetry arising from the co-existence of triple domains. The published crystal structure of that MOF [
101] is most likely an over-idealization and at the very least incomplete due to the deliberate removal of experimentally observed electron density during data processing [102]. The crystallographic analysis of a few low electron dose STEM images of that structure (see [109] or [110] for one of these images), as mentioned briefly in Appendix C.3, proved to be crucial to this author’s arrival at this conclusion on that crystal structure’s validity [
109].
In spite of all of the kinds of difficulties that are mentioned above, and because there is now an objective way for recognizing genuine pseudo-symmetries in the presence of noise as outlined below, it makes a great deal of sense to assign a set of approximate crystallographic symmetry classifications to a 2D image so that the models one is using for atomic or molecular resolution imaging are of comparatively small dimensionalities and allow for optimal geometric-structural information extraction processes in the presence of noise.
Any real world geometric feature extraction algorithm will of necessity introduce some small systematic error into geometric-structural feature extraction results so that none of the computer programs that implement such algorithms will ever deliver definitive results [
9,
10,
114] (Kanatani’s dictum). Since there are no definitive feature extraction results, one should not attempt to classify these results into qualitatively exclusive (definitive) classes such as a single Bravais lattice type [
7,
23], Laue class [
24], and plane symmetry group [
23,
24] in the 2D case but utilize Kanatani’s new statistics [
9,
10,
11,
12] instead.
This is because the traditional kinds of classifications imply that the extracted pseudo-symmetries adhere 100% (i.e. definitively) to the restrictions that are imposed by a mathematically-abstract crystallographic type, class, or group, which are all of a qualitatively strict nature per definition. Such an adherence can obviously not be genuine as there is noise in all image recording and processing steps in all real-world applications.
In spite of this, allegedly definitive symmetry classifications are so far the common practice in the computational symmetry and applied crystallography communities alike. They are, however, fundamentally unsound because all qualitative classifications will be in error insofar as they claim to be definitive; see Kanatani’s comments from 1997 in this context [
3].
Fortunately, crystallographic symmetries are hierarchic and the majority of them are non-disjoint [
7,
23,
24,
35,
114]. These features allow for a boot-strapping approach that does not require an initial estimate of the generalized noise level in a more or less 2D or 1D periodic image.
By means of pair-wise comparison of non-disjoint models with Kanatani’s G-AIC [
9,
10,
11,
12], one first obtains the model that minimizes the expected Kullback-Leibler information loss [
13,
14,
15,
16,
17] within a set of models that represents a stretch of a symmetry hierarchy branch and later determines for this particular model the generalized noise level. When this has been achieved, one can calculate the relative likelihood that a model in a set of non-disjoint (or disjoint) models minimizes the expected Kullback-Leibler information loss and formulate so-called Akaike weights as conditional model probabilities that add up to 100% for the whole set [
13,
14,
15,
16,
17].
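The step from information criterion values to Akaike weights can be illustrated with a minimal sketch. It assumes the textbook form of the weights, i.e. relative likelihoods exp(−Δ/2) normalized over the model set, where Δ is the difference to the smallest criterion value in the set. The G-AIC values below are made up for illustration and are not results from any experiment; the function name is hypothetical.

```python
import math

def akaike_weights(criterion_values):
    """Convert information criterion values into Akaike weights.

    Assumption: the textbook form is used, i.e. the relative likelihood of
    model i is exp(-delta_i / 2), where delta_i is the difference to the
    smallest criterion value, and the weights are normalized to sum to 1.
    """
    best = min(criterion_values.values())
    relative = {name: math.exp(-(value - best) / 2.0)
                for name, value in criterion_values.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}

# Made-up G-AIC values for three non-disjoint models along one symmetry
# hierarchy branch (p1 is a subgroup of pm, which is a subgroup of p2mm).
g_aic = {"p1": 12.0, "pm": 10.0, "p2mm": 14.0}
weights = akaike_weights(g_aic)
evidence_ratio = weights["pm"] / weights["p1"]   # exp(1) for these numbers
```

The weights add up to 100% for the whole set, so that each of them can be read as a conditional model probability given the data and the chosen model set.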
Instead of a definitive classification that makes a 100% assignment to only one class, which cannot be guaranteed to be correct due to the unavoidable presence of experimental noise and feature extraction uncertainties that are due to the utilized algorithm as discussed above, one obtains by this route a fuzzy classification that is spread over several classes of non-disjoint models within one symmetry hierarchy branch. One may also end up with a fuzzy classification that is spread over several classes of both non-disjoint and disjoint models if there is a genuine pseudo-symmetry [
50] in the data plane.
The derived percentages of the adherences to the individual classes of models within (and outside of) a symmetry hierarchy branch will be specific to the noise level of the image to be classified and also very slightly specific to the algorithm with which the classification has been made. The effects of experimental noise and the utilized real world algorithm are summarized in a generalized noise level term. Reduced generalized noise levels of future image data from the same crystalline sample that are recorded with more sophisticated instruments and processed with more ‘truthful’ feature extraction algorithms will tend to change the individual percentages somewhat, but will also never allow for definitive classifications. Additionally, a reduction in the experimental noise level per unit cell can be obtained by the processing of a significantly larger image area that contains many more repeats of the 2D or 1D periodic motif.
Major goals of this review are to bring Kanatani’s comments [
3] and dictum [
9,
10,
114], as well as his G-AIC approach [
9,
10,
11,
12] to the attention of both the applied crystallography and the computational symmetry communities. The novel ideas of this paper are the utilization of the information theory concepts of (i) Akaike weights [13, 14, 15, 16, 17] and (ii) their products [13] to combine complementary pieces of geometric-structural information (that were extracted from the results of the same imaging experiment or from the same synthetic data) into generalized noise-level dependent crystallographic classifications of more or less periodic crystal patterns. This review will concentrate on crystal patterns in the form of 2D gray-level images that are more or less periodic in one and two dimensions.
Secondary goals of this review are popularizations of a Fourier space version of Liu’s G-AIC for the assignments of plane symmetry groups to more or less 2D periodic images [
4,
5] and the author’s versions of such criteria for Bravais lattice type [
7] and Laue class assignments to such images. The combination of G-AICs for Bravais lattice types, Laue classes, and plane symmetry groups should be useful to deal with the consequences of genuine pseudo-symmetries [
50] that the hypothetical noise-free version of an image may possess, either per design or by the nature of the crystalline sample from which it was recorded.
The rest of the paper is organized as follows: We begin with explaining the nature of Kanatani’s comments on symmetry as a continuous and hierarchic feature in Section 2. This is followed by a discussion of Kanatani’s dictum in Section 3. Within that section, we will concern ourselves with genuine pseudo-symmetries [50] (see Figure 1), which exist per the design of both of the constituting images, and quote the related lattice parameter extraction results (for these two images) by three different algorithms/computer programs from [
114]. The purpose of that part of this review is to illustrate the non-definitiveness of geometric-structural feature extraction results that are obtained by any real-world algorithm from noise-free and noisy images alike.
Readers interested in the details of the three computer programs that implement these algorithms are referred to [
114] for comprehensive information. Two of these programs [
115,
116] are used in the applied crystallography community and the third [
117] one supports all aspects of crystallographic image processing and electron crystallography on the basis of high-resolution (phase-contrast) transmission electron microscope (HRTEM) images [
118,
119,
120,
121] that were recorded within the validity range of the weak phase object (WPO) approximation. While one of these programs [
115], and most algorithms of the computational symmetry community [
1] work in direct space, the other two programs [
116,
117] that were utilized in [
114] work in Fourier/reciprocal space.
For easy references below, we will use the capital letters A, B, and C instead of either the actual names of these three computer programs or their entries in the final list of references at the end of this review.
Table 1 provides the conversion key.
The fourth section on G-AICs first reviews the general form of these criteria and then proceeds by giving specifics of Fourier space versions of such criteria for fuzzy, i.e. quantitative generalized noise-level dependent classifications of geometric-structural feature extraction results into plane symmetry groups, Laue classes, and Bravais lattice types. Liu and her co-workers’ frieze pattern assignments to time series recordings of both a walking humanoid avatar and a walking human being [
4,
5,
6] will be mentioned in this section briefly (and discussed further in
Appendix E) as illustrations of the fact that one should not only report the most likely crystallographic symmetry classification for a real-world experiment, but also its relative likelihood, as well as the likelihoods of reasonable alternatives in order to make a fair assessment of the crystallographic model selection uncertainty [
13,
14,
15,
16,
17].
In the fifth section, we will provide equations for the relative likelihoods of disjoint and non-disjoint crystallographic symmetry models within a set, their respective mutual evidence ratios, and their Akaike weights. There are also equations for the usage of Akaike weights for multi-model predictions that are based on the relative probabilities of crystallographic symmetry models within a set.
Section 5.1 contains the equations for combined posterior model probabilities [
13] that are based on complementing pieces of geometric-structural information in more or less 2D periodic (noisy) images. The corresponding combined Akaike weights should be helpful for distinguishing between genuine and non-genuine pseudo-symmetries [
50] that the hypothetical noise-free version of an image possesses.
The fourth and fifth sections constitute the core of this review and contain the equations/inequalities that refine its novel ideas. Finally, there is a brief summary and conclusions section.
As already mentioned above, there are five appendices that present: (i) the potential of the main proposal of this review with respect to the extraction of grain boundary structures from atomic resolution images that are more or less periodic in 1D; (ii) different types of pseudo-symmetries; (iii) pseudo-symmetry mediated misclassifications in both the scientific literature and the major databases of mainstream 3D crystallography as well as a brief discussion of the crystallographic R value; (iv) statistical descriptors and null hypothesis tests in mainstream 3D crystallography; and (v) crystallographic comments on the only experimental 1D periodic study so far that utilized a geometric Akaike Information Criterion.
3. Kanatani’s Dictum
A direct quote from [
9,
10] is in order here to start this section: “The reason why there exist so many feature extraction algorithms, none of them being definitive, is that they are aiming at an intrinsically impossible task.” While this statement might be somewhat shocking to researchers who have never before thought deeply about this topic, it is certainly true. No real world feature extraction algorithm working on real world data will ever be able to deliver definitive results. This is because all algorithms (and the computer programs that implement them) are based on heuristics and use approximations, as well as internal thresholds, to achieve their goals. Additionally, all real world image data are of finite resolution and noisy.
As mentioned above, a thorough illustration of Kanatani’s dictum within a crystallographic context is provided in [
114]. We take from that paper the lattice parameter extraction results of the two images that are shown in
Figure 1 but present them here in a form that is adjusted to the crystallographic setting that we use in this review.
The two images in
Figure 1 are synthetic and freely downloadable (together with many more images of the same size and type) at the website that is listed as [
122]. On the left hand side of this figure, there is the noise-free (original) image of the pair. The image on the right hand side of this figure has been obtained by adding independent Gaussian noise of mean zero and a standard deviation of 10% of the maximal image intensity to the individual pixels of the noise-free image to the left. The images in
Figure 1 possess a rectangular (primitive) Bravais lattice and plane symmetry group
pm in the crystallographic
p1m1 setting [
23,
24] per design. One choice of a unit cell is outlined in
Figure 1a by a rectangle in yellow ink. Other choices are possible because the origin is in this particular plane symmetry group not fixed at a specific point.
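The noise model that is described above for the right-hand image can be sketched as follows. This is an illustrative sketch only and not the code with which the downloadable images were produced: the function name, the fixed random seed, and the cosine pattern that stands in for the noise-free image are assumptions. No clipping of the noisy intensities is applied here because the text does not state whether the original images were clipped.

```python
import numpy as np

def add_gaussian_noise(image, fraction=0.10, seed=0):
    """Add independent zero-mean Gaussian noise to every pixel.

    The standard deviation is `fraction` times the maximal image intensity.
    Assumption: a fixed seed is used so that the result is reproducible.
    """
    rng = np.random.default_rng(seed)
    sigma = fraction * image.max()
    return image + rng.normal(0.0, sigma, size=image.shape)

# Hypothetical stand-in for a noise-free periodic image with intensities
# between 0 and 1 (not one of the images of Figure 1).
yy, xx = np.mgrid[0:128, 0:128]
clean = 0.5 * (1.0 + np.cos(2 * np.pi * xx / 16.0))
noisy = add_gaussian_noise(clean)
residual = noisy - clean
```

The standard deviation of `residual` is close to 10% of the maximal intensity of `clean`, in line with the description of the noisy image.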
Per crystallographic convention [23, 24], the x-axis ([10] vector) runs from the top-left corner of a unit cell to its bottom-left corner (in the p1m1 setting). The y-axis ([01] vector) is perpendicular to the x-axis and runs from left to right. The unit cell edges x,0 = $\vec{a}$ and 0,y = $\vec{b}$ are in Figure 1 parallel to the image edges X,Y. The edge relationships between unit cells and images are usually arbitrary, i.e. not subject to any crystallographic restriction, so that Figure 1 shows a very special case of such a relationship.
Per crystallographic convention [
23,
24], the origin of each unit cell in plane symmetry group
p1m1 (
pm for short when the setting is not communicated), i.e. the position 0,0 from where all other positions are measured in fractions of the unit translation vectors, is located anywhere on a mirror line of position 0,y. Two possible choices of unit cells for
Figure 1a that take the prevailing pseudo-symmetries of the Fedorov type [
123] into account are given in
Figure 2, where just one magnified single unit cell cutout is displayed both on the left- and right-hand sides of this figure. Pseudo-symmetries of the Fedorov type are compatible with a crystallographic lattice that is of the Bravais type. That lattice is not necessarily the prevailing (genuine) crystallographic lattice; see
Appendix B for more information.
As
Figure 2a shows, there is always a second mirror line at position ½,y in the unit cell in plane symmetry group
p1m1. This mirror line and the 0,y (and 1,y) mirror line(s) are displayed by full yellow lines in both parts of
Figure 2. The coordinate y varies from 0 to 1 in the 0,y and ½,y labels of sets of special points, which carry Wyckoff letters a and b, respectively [
23,
24]. The points y = 0 and y = 1 are symmetry equivalent by way of one unit translation along the
y-axis. The
genuine translation symmetry in all subfigures of
Figure 1 and
Figure 2 is that of the rectangular (primitive) Bravais lattice type per design.
As far as genuine point/site symmetries are concerned, the full (four letter) plane symmetry group symbol (p1m1) details that there are only one-fold rotation points in the plane of the image, mirror lines perpendicular to the [10] direction of the unit cell, and one-fold rotation points along its [01] direction; see Figure 1a and Figure 2a. Sets of mirror lines are in crystallography represented by their geometric normal. In Figure 1a and Figure 2a, these mirror lines are oriented perpendicular to the vector [10] = $\vec{a}$ due to the p1m1 setting.
The ½,y mirror line (which carries Wyckoff letter b in
p1m1 [
23,
24]) splits the three white blobs of the translation periodic motif (and their immediate black surroundings) in
Figure 1a and
Figure 2a into upper and lower halves. While this line is not drawn out in
Figure 1a, it is given as a full yellow line in
Figure 2a. The multiplicity of points that are located on that mirror line is one. All of these points are, therefore, at a special position in this plane symmetry group. The general position, on the other hand, possesses a multiplicity of two so that there is a symmetry equivalent $\bar{x},y$ position for each x,y position. Both of these positions carry Wyckoff letter c in plane symmetry group
p1m1. The asymmetric unit is just one half of the unit cell as sectioned by the ½,y mirror line; see
Figure 2a. The alternative asymmetric unit in
Figure 2b is composed of one half each of the two less intense white blobs plus two quarters of the most intense white blob and their immediate black surroundings.
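The multiplicities quoted above can be checked with a few lines of Python. This is only a minimal sketch: it assumes that the two symmetry operations of p1m1 in the setting of this review are the identity (x,y) and the reflection across the 0,y mirror line (−x,y), with all coordinates reduced to the unit cell. The function name `orbit` and the sample coordinates are hypothetical.

```python
from fractions import Fraction as F


def orbit(x, y):
    """Symmetry-equivalent positions of (x, y) in p1m1, reduced to the unit cell.

    Assumed operations: identity (x, y) and the mirror at 0,y, i.e. (-x, y).
    """
    return {(x % 1, y % 1), ((-x) % 1, y % 1)}


# Hypothetical sample points in fractional coordinates.
general = orbit(F(1, 5), F(3, 10))      # general position, Wyckoff letter c
on_mirror_a = orbit(F(0), F(3, 10))     # on the 0,y mirror line, Wyckoff letter a
on_mirror_b = orbit(F(1, 2), F(3, 10))  # on the 1/2,y mirror line, Wyckoff letter b

# Number of distinct equivalent positions = multiplicity of the position.
print(len(general), len(on_mirror_a), len(on_mirror_b))
```

Points on either mirror line map onto themselves (modulo a lattice translation), which is why their multiplicity is one, whereas a general point always has one distinct mirror image.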
Genuine motif-based (four-fold rotation points plus vertical mirror lines) pseudo-symmetry is also present in both images of
Figure 1 and
Figure 2. This pseudo-symmetry is of the Fedorov type [
123]—see
Appendix B—and complicates the crystallographic analysis and symmetry classification. The complications are particularly severe in
Figure 1b due to the added noise. Genuine translational pseudo-symmetry is caused by the similarity in intensity and size of the three white blobs that form (together with their immediate black surroundings) the content of the rectangular unit cell in
Figure 2a.
While the 0,y and ½,y mirror lines in this figure are genuine and intersect the white blobs horizontally, there are also x,0, x,1/3, and x,2/3 pseudo-mirror lines that intersect the white blobs vertically. In order to avoid overcrowding, these pseudo-mirrors are only given as dotted yellow lines in Figure 2b.
These three pseudo-mirror lines generate three more parallel pseudo-mirror lines at positions x,1/6, x,3/6, and x,5/6, which intersect the black background areas between the white blobs in the middle vertically. The above-mentioned genuine mirror lines 0,y and ½,y (of Figure 1a and Figure 2a) combine with the perpendicular pseudo-mirror lines as drawn into Figure 2b. This generates pseudo-four-fold and pseudo-two-fold rotation points at the crossings of genuine mirror lines and pseudo-mirror lines, so that the Fedorov pseudosymmetry group p_{b/3}4mm ⊃ p1m1 results on the basis of the rectangular Bravais lattice. The pseudo-four-fold rotation points contain in themselves pseudo-two-fold rotation points. There are alternative ways to construct the Fedorov pseudosymmetry group that Figure 1a and Figure 2a,b possess, but they all lead to the same end result.
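Why a set of parallel mirror lines brings additional ones with it can be illustrated with a short calculation. The following is a sketch under stated assumptions rather than a general symmetry engine: it works in one fractional coordinate only, normalizes the unit cell edge along that coordinate to one, and uses the fact that a reflection across the line at c maps y to 2c − y. The function name `mirror_lines_in_cell` is hypothetical.

```python
from fractions import Fraction as F


def mirror_lines_in_cell(positions):
    """All mirror positions in [0, 1) implied by the given parallel mirror
    lines together with the unit lattice translation along the same axis.

    A reflection across the line at c maps y to 2c - y. Modulo the unit
    translation, y stays fixed when 2y = 2c (mod 1), i.e. at c and c + 1/2.
    """
    lines = set()
    for c in positions:
        lines.add(c % 1)
        lines.add((c + F(1, 2)) % 1)
    return sorted(lines)


# Pseudo-mirror lines at 0, 1/3 and 2/3 along the long cell edge.
pseudo = mirror_lines_in_cell([F(0), F(1, 3), F(2, 3)])
print([str(p) for p in pseudo])

# A single genuine mirror line at 0 along the other cell edge.
genuine = mirror_lines_in_cell([F(0)])
print([str(p) for p in genuine])
```

The first call returns the three input positions plus 1/6, 1/2 (i.e., 3/6) and 5/6. The second call returns 0 and 1/2, which is the same mechanism that puts a second genuine mirror line at ½,y in p1m1.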
With respect to
Figure 1a, the origins of the two unit cells in
Figure 2 are shifted to the position of a four-fold pseudo-rotation point. There are two such points in pseudo-symmetry group
p_{b/3}4mm, which carry pseudo-Wyckoff letters a and b, and their locations with respect to the pseudo-square lattice are 0,0 and ½,½, respectively. As a result of the combination of the genuine symmetries and pseudo-symmetries in
Figure 1a, the genuine rectangular lattice of
Figure 1a and
Figure 2a is “truncated” into a pseudo-square lattice as outlined in
Figure 2b.
Due to its design history, the image in
Figure 1b also possesses plane symmetry pm (in the p1m1 setting), but all genuine (point/site and translation) symmetries have been turned into non-genuine (second kind) pseudo-symmetries (see Appendix B) by the added noise. These pseudo-symmetries exist in addition to the above-mentioned genuine pseudo-symmetries. A complementary description of the two images in
Figure 1 is provided in [
114]. Note that a complementary setting has been used in that paper, but this is inconsequential to a crystallographic analysis. The results of any such analyses will be complementary, e.g., a unit cell angle in one setting would be the difference between 180° and that angle in another setting.
Note that Kanatani includes, per definition, all kinds of image feature extraction uncertainties into the generalized noise term in his G-AICs so that one cannot extract
definitive results even from the image in
Figure 1a, which is free of added Gaussian noise. In this review, noise is treated in the generalized sense that is in accord with Kanatani’s dictum [
9,
10].
As already mentioned above, the main thrust of [
114] was to illustrate Kanatani’s dictum on multiple examples. Since three algorithms/computer programs were applied to a total of 12 images in [
114], a measure of the reliability of subsequent geometric inferences on the basis of the outputs of the computer programs that implemented these algorithms was also obtained. Additionally, since the three algorithms were tested on both noise-free images (such as the one shown in
Figure 1a) and noisy images that were derived from the noise-free images (such as the one in
Figure 1b), the robustness of the algorithms/computer programs in the presence of Gaussian noise was also tested in [
114].
The ratio of the lattice parameters (a/b) of the unit cells in the two images in
Figure 1 and
Figure 2 is one third and the unit cell angle γ is 90° per design. Values close to this ratio and angle should, therefore, be obtained as a result of lattice parameter extractions with suitable computer programs even in the presence of noise.
Table 2,
Table 3,
Table 4 and
Table 5 list the results of the application of the three different computer programs [
114] of
Table 1 as adapted to the particular unit cell setting of this review (
Figure 2). Results that were obtained in the
default settings of the three computer programs that implement three different types of algorithms (A to C in
Table 1) are listed for the image in
Figure 1a in
Table 2 and for the image in
Figure 1b in
Table 4.
Table 3 lists re-interpreted/re-calculated results from Algorithm B (on the basis of the displayed dFT amplitude map) and results that were obtained in a non-default setting of Algorithm C for the image in Figure 1a. Table 5 does the same for the image in Figure 1b.
Somewhat surprisingly,
Table 2 shows that only one of the three tested algorithms extracted qualitatively correct lattice parameters from the noise-free, but visibly pseudo-symmetric, image in
Figure 1a. These lattice parameters are in good compliance with the rectangular Bravais lattice type that this image possesses per design. For easy reference, qualitatively-correct results are marked in bold font in all of the four image data tables in this review.
Only Algorithm A was, thus, capable of dealing with the translational pseudo-symmetry in
Figure 1a effectively as its lattice parameter extraction results are given in bold font in
Table 2. In their default settings, the other two algorithms extracted a unit cell from this figure that is too small by a factor of three. This is also reflected by the ratio of the magnitudes of the two basis vectors, which was incorrectly determined as nearly unity by Algorithms B and C in their respective default settings [
114].
Extracted basis vectors that possess nearly the same magnitude, and are also perpendicular to each other within error bars, are, of course, what one would expect for a square Bravais lattice. In other words, to Algorithms B and C in their default settings, the existing (genuine) translational pseudo-symmetry [50] in Figure 1a was apparently a crystallographic symmetry, since ‘quantitatively wrong’ lattice parameter sets were extracted.
It was straightforward to re-interpret/re-calculate the lattice parameter extraction output for
Figure 1a as obtained with Algorithm B on the basis of the dFT amplitude map that the program displayed [
114]. This resulted in a bold font entry for qualitative correctness in
Table 3 for Algorithm B. For Algorithm C, using a non-default setting in the processing of
Figure 1a also resulted in a bold font entry in this table.
The added noise in
Figure 1b ‘fooled’ all three computer programs (in their default settings) into extracting results that are obviously incorrect, see
Table 4. This is a direct consequence of the noise-exacerbated pseudo-symmetries in the image shown in
Figure 1b. The oblique unit cell that Algorithm A extracted from the image in
Figure 1b—see
Table 4—can be straightforwardly transformed into a pseudo-square unit cell with essentially the same parameters as those that were obtained with the other two algorithms.
Re-interpreting/re-calculating the lattice parameter extraction outputs for the image in
Figure 1b as obtained with Algorithm B (on the basis of the dFT amplitude map of that image) and using a non-default setting of Algorithm C in the processing of this image led to qualitatively correct results and bold font entries for both algorithms in
Table 5.
The stated error bars on the unit cell angles of 0.05° for the two algorithms/computer programs that extract lattice parameters in Fourier space, i.e. B and C, are based on the implied number of significant figures output by one of these programs [
114], but seem to be too small to allow for agreement of the extraction results of the different algorithms in the case of lattice parameter extractions by the default program settings from the noisy image in
Figure 1b.
The traditional way of assigning Bravais lattice types to the lattice parameters of the two images in
Figure 1 that have been extracted by three different algorithms within the stated error bars as listed in
Table 2,
Table 3,
Table 4 and
Table 5 may, obviously, lead to misclassifications given the numerical variations in these tables. If one does not know the design parameters and history of the two images in
Figure 1 in advance, one is hard pressed to figure out which of the results in these four tables are actually trustworthy, let alone to make
definitive classifications into Bravais lattice types. One would certainly be ill advised to average the results from the three different algorithms in
Table 2 and
Table 4.
Guided by the ‘somewhat squarish’ visual appearance of what appears to be unit cells in the image of
Figure 1b, most researchers would probably classify that image as belonging to the square Bravais lattice type. Two of the results listed in
Table 4 would support this classification in the traditional way based on the numerical values of the extracted lattice parameters and their somewhat extended error bars. This would, however, be incorrect!
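The pitfall of the traditional approach can be made concrete with a small sketch. The decision rule below is a deliberately simplified assumption of this example (it expects a reduced primitive cell and uses arbitrary tolerances in place of the "somewhat extended error bars"). The function name `classify_traditional` and all numerical values are hypothetical and are not taken from Tables 2 to 5.

```python
def classify_traditional(a, b, gamma_deg, rel_tol=0.02, angle_tol_deg=1.0):
    """Crisp Bravais lattice type assignment from extracted lattice parameters.

    Simplified rule for a reduced primitive cell; the tolerances are
    arbitrary assumptions standing in for extended error bars.
    """
    equal_edges = abs(a - b) <= rel_tol * max(a, b)

    def near(angle):
        return abs(gamma_deg - angle) <= angle_tol_deg

    if equal_edges and near(90.0):
        return "square"
    if equal_edges and (near(60.0) or near(120.0)):
        return "hexagonal"
    if equal_edges:
        return "rectangular (centered)"
    if near(90.0):
        return "rectangular (primitive)"
    return "oblique"


# Hypothetical parameters: the design cell (a/b = 1/3, gamma = 90 degrees) ...
design_cell = classify_traditional(1.0, 3.0, 90.0)
# ... and a pseudo-square sub-cell as it might be extracted from a noisy image.
pseudo_cell = classify_traditional(1.0, 1.01, 89.7)

print(design_cell)
print(pseudo_cell)
```

With these inputs, the design cell is assigned to the rectangular (primitive) type, whereas the three-times-smaller sub-cell is assigned to the square type. A crisp rule of this kind has no way of signaling that the second assignment rests on a pseudo-symmetry.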
A fuzzy classification into Bravais lattice types on the basis of translation symmetry model probabilities (Akaike weights) would, on the other hand, be noise-level dependent and correct in a fundamental sense. Likewise fuzzy classifications into (i) Laue classes on the basis of point/site symmetry model probabilities and (ii) plane symmetry groups on the basis of plane symmetry model probabilities, both utilizing complementing types of Akaike weights, would also be correct in a fundamental sense and generalized-noise level dependent.
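As an illustration of how model probabilities of this kind are obtained, the sketch below applies the standard definition of Akaike weights, w_i = exp(−Δ_i/2) / Σ_j exp(−Δ_j/2) with Δ_i = IC_i − min(IC), to a set of information criterion values. It is assumed here that this standard definition carries over to the geometric AIC values of this review. The function name `akaike_weights` and the input values are hypothetical and serve only to show the arithmetic.

```python
import math


def akaike_weights(ic_values):
    """Akaike weights from a mapping of model name to information criterion value.

    w_i = exp(-d_i / 2) / sum_j exp(-d_j / 2), with d_i = IC_i - min(IC).
    """
    best = min(ic_values.values())
    rel = {name: math.exp(-0.5 * (value - best)) for name, value in ic_values.items()}
    total = sum(rel.values())
    return {name: r / total for name, r in rel.items()}


# Hypothetical information criterion values for the five 2D Bravais lattice types.
hypothetical_ic = {
    "oblique": 104.0,
    "rectangular (primitive)": 100.0,
    "rectangular (centered)": 103.0,
    "square": 101.0,
    "hexagonal": 110.0,
}

weights = akaike_weights(hypothetical_ic)
for name, w in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{name}: {w:.3f}")
```

The weights sum to one and can be read as the probability of each model being the best one within the considered set. A model with a value close to the minimum (the square lattice in this made-up example) retains a non-negligible weight, which is what makes the classification fuzzy rather than crisp.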
The crystallographic symmetry classifications of the image in
Figure 1a would obviously be much less fuzzy than those of the image in
Figure 1b, although still not completely definitive as a matter of principle when a real world algorithm is involved. It is expected that the crystallographic classifications of both images would peak for plane symmetry group
pm, Laue class
2mm, and the rectangular (primitive) Bravais lattice type. This is because these crystallographic categories went into the design of both images.
In case of the image in
Figure 1a, the peaking at these crystallographic categories will be much sharper than for the image in
Figure 1b because only geometric-structural feature extraction uncertainties that are due to the particulars of the applied algorithms/computer programs will make the classifications of the former image fuzzy (as there is no added Gaussian noise present that disturbs the recognition of the design categories).
6. Summary and Conclusions
Geometric Akaike Information Criteria and associated Akaike weights for generalized noise-level-dependent crystallographic symmetry classification of 2D images that are more or less periodic in 2D (or 1D) and considered to constitute 2D data planes have been reviewed. These kinds of classifications are always fuzzy and, in a sense, preliminary, since images with reduced generalized noise levels may become available in the future. In other words, these kinds of classifications are never definitive and static in all real-world applications, in compliance with Kanatani’s dictum.
While this review concentrates on more or less periodic crystal patterns in two dimensions (and mentions such patterns in one dimension only briefly on a few occasions), it goes without saying that the outlined approach is, in principle, also applicable to crystal patterns of dimensions three to six.
It was demonstrated by an example that pseudo-symmetries present challenges to extraction algorithms for geometric-structural features from more or less 2D periodic images, as well as to their subsequent crystallographic symmetry classifications. Pseudo-symmetries in 3D and the problems they cause in mainstream single-crystal X-ray crystallography are discussed in
Appendix C. It is noted in that appendix that there are, so far, no statistical descriptors in mainstream 3D crystallography beyond the Hamilton test, which is a form of null hypothesis testing, that are related to Kanatani’s comments. Similarly, there is, so far, no systematic procedure to deal with genuine pseudo-symmetries in 3D on the basis of noisy diffraction data.
The point is also made repeatedly in
Appendix C that crystallographically misclassified 3D crystal structures could essentially no longer be found within crystallographic databases as soon as the objective information theory-based approach of this review was implemented and symmetry classifications were allowed to spread over several classes as a function of the generalized noise level of the experimental data. Such a spreading would allow for an objective reporting of the results of crystal structure determinations, but is not necessary for very highly-symmetric and very well-characterized atomic arrangements, where there is no lingering doubt about the validity of the reported ideal structure. For lower-symmetric and poorly-characterized atomic arrangements (as in many biopolymers), on the other hand, the spreading over several crystallographic symmetry classes would be helpful to the users of the databases as uncertainties about the structures’ validity are faithfully/objectively reported. When better crystal structure determinations become available in the future (at lower generalized noise levels), the spreading would allow for a simple updating of the database entry rather than a re-classification.
Crystallographic model selection uncertainties were illustrated in a qualitative manner on the basis of results from the single relevant experimental study (in 1D) in the literature that the author of this review is aware of after quite substantial background searches. Multi-model inferences and averaging were also discussed.
The combining of Akaike weights for Bravais lattice types, Laue classes, and plane symmetry groups should enable successful crystallographic symmetry classifications even in the presence of manifest pseudo-symmetries that exist per design of an image or that pre-exist within a crystalline sample that has been imaged.
Despite the lack of a guarantee that Kanatani’s geometric AIC approach will work well for fuzzy, but quantitative, crystallographic symmetry classifications, the members of the applied crystallography and computational symmetry communities are hereby invited to test it out on the basis of the above-listed equations and inequalities. Demonstrated success in that endeavor could lead, over time, to a widespread adaptation of the information-theoretic approach (as briefly outlined in this review for the 2D case) to mainstream 3D periodic crystallography and its higher-dimensional extensions.
Appendix A is directed at the applied crystallography/materials science community and briefly assesses the potential of the main proposal of this review in connection with the extraction of grain boundary structures from atomic resolution images that are more or less periodic in 1D.
Appendix B distinguishes between pseudo-symmetries of different types. Experimental noise turns genuine symmetries in data that were collected from a crystal structure into pseudo-symmetries that can be very hard to distinguish from genuine pseudo-symmetries.
Appendix D is concerned with standard statistical descriptions and the utility of contemporary null hypothesis tests in mainstream X-ray crystallography. By quoting from a very recent paper that the great Indian-American statistician Calyampudi Radhakrishna Rao authored (with Miodrag M. Lovric as co-author), it is shown that there is no point in disputing the validity of Kanatani’s dictum by means of null hypothesis testing in all real-world applications.
Appendix E provides comments of a crystallographic nature on the only relevant experimental study in 1D from the literature that has employed a geometric AIC. These comments are peripheral to the main topic of this review, but still worthwhile making for the benefit of the computational symmetry community.