Symmetry in Sphere-Based Assembly Configuration Spaces

Sitharam, Meera; Vince, Andrew; Wang, Menghan; Bóna, Miklós

doi:10.3390/sym8010005

by Meera Sitharam ¹, Andrew Vince ², Menghan Wang ¹,* and Miklós Bóna ²

¹ Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32601, USA

² Department of Mathematics, University of Florida, Gainesville, FL 32601, USA

* Author to whom correspondence should be addressed.

Symmetry 2016, 8(1), 5; https://doi.org/10.3390/sym8010005

Submission received: 4 May 2015 / Revised: 21 December 2015 / Accepted: 7 January 2016 / Published: 21 January 2016

(This article belongs to the Special Issue Rigidity and Symmetry)

Abstract: Many remarkably robust, rapid and spontaneous self-assembly phenomena occurring in nature can be modeled geometrically, starting from a collection of rigid bunches of spheres. This paper highlights the role of symmetry in sphere-based assembly processes. Since spheres within bunches could be identical and bunches could be identical, as well, the underlying symmetry groups could be of large order that grows with the number of participating spheres and bunches. Thus, understanding symmetries and associated isomorphism classes of microstates that correspond to various types of macrostates can significantly increase efficiency and accuracy, i.e., reduce the notorious complexity of computing entropy and free energy, as well as paths and kinetics, in high dimensional configuration spaces. In addition, a precise understanding of symmetries is crucial for giving provable guarantees of algorithmic accuracy and efficiency, as well as accuracy vs. efficiency trade-offs in such computations. In particular, this may aid in predicting crucial assembly-driving interactions. This is a primarily expository paper that develops a novel, original framework for dealing with symmetries in configuration spaces of assembling spheres, with the following goals. (1) We give new, formal definitions of various concepts relevant to the sphere-based assembly setting that occur in previous work and, in turn, formal definitions of their relevant symmetry groups leading to the main theorem concerning their symmetries. These previously-developed concepts include, for example: (i) assembly configuration spaces; (ii) stratification of assembly configuration space into configurational regions defined by active constraint graphs; (iii) paths through the configurational regions; and (iv) coarse assembly pathways. (2) We then use the new symmetry concepts to compute the sizes and numbers of orbits in two example settings appearing in previous work. (3) Finally, we give formal statements of a variety of open problems and challenges using the new conceptual definitions.

Keywords: sphere assembly; configuration space; stratification; distance constraints; Cayley geometry; entropy; kinetics; pathways

1. Motivation

Supramolecular assembly is prevalent in nature, healthcare and engineering, but poorly understood. The assembly starts with identical copies of structures drawn from a small number of types. Modeling these starting structures as rigid bunches of spheres is well suited to assembly processes driven by so-called short-range or hard sphere interaction potentials.

More formally, an input to a computational model of an assembly process is an assembly system consisting of the following:

A collection of k rigid molecular components belonging to a few types; a rigid component is specified as the set of positions of the centers of their constituent atoms, in a local coordinate system. In many cases, an atom could be the representation of the average position of a collection of atoms in an amino acid residue. Note that an assembly configuration is given by the positions and orientations of the entire set of k rigid molecular components in an assembly system, relative to one fixed component. Since each rigid molecular component has six degrees of freedom, a configuration is a point in $6 (k - 1)$ dimensional Euclidean space.
The pairwise component of the potential energy function of the assembly system is specified as a sum of potential energy terms between pairs of constituent atoms i and j in two different rigid components of the assembly system. The weak interaction between the rigid molecular components is captured by this potential energy function. The pairwise potential energy terms are, in turn, specified using pairwise potential energy functions similar to so-called Lennard–Jones potentials and Morse potentials [1]. The potential energy is a function of the distance $d_{i, j}$ between i and j.
A non-pairwise component of the potential energy function is in the form of global potential energy terms that capture the tethers between the rigid components within a monomer, as well as other global potential energy terms that implicitly represent the solvent (water or lipid bilayer membrane) effect [2,3,4]. These are independent of particular pairs of atoms.

It is important to note that all of the above potential energy terms are functions of the assembly configuration.

The formal conceptual framework we develop here is inspired by the following types of prediction questions.

Input: the 3D descriptions of the rigid molecular components and their interactions (Section 2 describes how they are formally specified). Output: prediction of the final assembly structures and their likelihood.
Input: as in the previous item, plus a 3D configuration of the final assembled structure. Output: prediction of those interactions that are crucial for the assembly process to terminate in the given input assembly configuration.
Input: as in the previous item. Output: prediction of minimal alterations of the building blocks or interactions that would significantly increase the likelihood of the assembly process terminating in the given input assembly configuration.
Input: as in the previous item; additionally, more than one choice of final assembly configuration. Output: prediction of key events, such as specific intermediate sub-assembly configuration choices during assembly that determine which one of the final assembly configurations is more likely to result.

Experimentally, in vitro or in vivo, these types of predictions about supramolecular assembly processes are difficult because of the remarkable rapidity, spontaneity and robustness of assembly processes. The prediction tasks highlight combinatorial explosion and, thus, the insufficiency of experimentation (trying various possibilities) and guesswork, even with the help of known data on similar assemblies and biological knowledge about evolutionarily-conserved structures. In addition, many of the current experimental methods are labor and resource intensive, making blind alleys expensive in time and effort.

On the other hand, computer simulations guided by theoretical first principles and standard paradigms, such as Monte Carlo (MC) or molecular dynamics (MD), are limited due to the reasons detailed in the next subsections.

1.1. Assembly Configurational Volume

The stability and binding affinity of subassemblies depend on free energy, whose landscape in the case of assembly is heavily influenced by configurational entropy (volume measure of microstates corresponding to a macrostate; see [5]); this depends on accurate computation of configurational volumes by sampling, attempted by a long and distinguished series of methods [5,6,7,8,9,10,11,12,13]. Assembly configuration spaces are high dimensional, and the number of required samples is typically exponential in the dimension. Sampling on a high-dimensional ambient space grid typically means computing a large proportion of samples that lie outside any region of interest, which is effectively of lower dimension, and these samples must be discarded. Not only are the relevant regions in the case of short-ranged potentials of effectively lower dimension, they are also geometrically/topologically complex; hence, grid-based sampling in Cartesian space, as well as non-ergodic methods, like MC or MD, have to generate impractically dense sampling to accurately reflect the volume/measure ratios of these important, relatively low volume regions having complex geometry and topology. These methods do not exploit the abundance of symmetries of the landscape. They are used both for assembly processes, whose feasible regions are defined by one-sided pairwise distance equalities and inequalities between atom-centers, and folding processes, where the feasible regions are defined by pairwise distance equalities. The difference of complexity between the two is a litmus test for the limitations that are addressed by the Cayley configuration space approach taken by efficient atlasing and search of assembly landscapes (EASAL) described in Section 1.5.

Conventional methods to compute the energy landscape of small clusters are based on searching for local minima [1,14,15]. Point group symmetrization schemes [16,17,18] and local rigidification schemes [19,20] have been exploited in global optimization algorithms to gain computational efficiency.

Because of the complexity of the problem of dealing with the short range of interaction of hard spheres leading to narrow regions of lower potential energy, separated by vast flat parts, conventional local minima-based methods for energy landscape computation [14] are limited. These methods have the additional disadvantage of small perturbations to energy values requiring complete recomputation, and also, they do not deal well with the very flat landscape that is the signature of short-range potentials.

An alternative approach for short-range potentials is to consider the “sticky sphere limit” based on taking the limit as the range of interaction goes to zero [21,22,23]. In this limit, the energy landscape reduces to a collection of manifolds of different dimensions, glued together at their boundaries (formally, a Thom–Whitney stratification of real semi-algebraic sets), as described in theoretical models proposed independently and separately by Holmes-Cerfon et al. [24] in 2013 and by the first author’s research group [25,26] in 2011.

The background provided in the remainder of this section recalls previously-developed concepts for describing assembly configuration spaces. This motivates the conceptual framework for symmetry in assembly under short-range potentials given in Section 2.

1.2. Kinetics, Topology and Geometric Complexity

Kinetics and transition rates between subassemblies also require an explicit understanding of the geometry, topology and multiple paths in the assembly configuration space. For cluster assemblies from spheres, there are a number of methods [27,28,29,30,31,32,33] to compute the entire configuration space of small molecules, such as cyclo-octane [34,35,36]. Some methods from robotics and computational geometry [12], such as the probabilistic roadmap [37], effectively give bounds to approximate free energy without relying on MC or MD sampling. Starting from MC and MD samples, recent heuristic methods infer topology [38,39,40,41] and use topology to guide dimensionality reduction [42]. Yet, most prevailing methods are unable to extract the topology in a sufficiently efficient and accurate manner as to be able to feasibly compute volume or path integrals (required for entropy or kinetics computations), even for small assemblies. Moreover, even those prevailing methods that exploit symmetry in the configuration space to compute free energy and kinetics do not employ a formal and precise group-theoretic framework.

1.3. Recursive Decomposition, Assembly Trees and Combinatorial Entropy

For larger, microscale assemblies, a direct study of the free energy and configurational entropy is computationally emphatically intractable. At these coarser scales, the primitives are stable subassemblies and transition rates (obtained from the computational tasks of the previous two subsections). Still, the combinatorial entropy of multiple pathways makes it difficult to isolate crucial combinations of assembly-driving interface interactions.

This issue has been addressed by the first author’s previous work on recursive decompositions [43,44,45] of larger assemblies into smaller subassemblies. This work introduces structures called assembly trees and the notion of combinatorial entropy, applied to model viral capsid assembly in [46].

While trees of various types have been used to model various processes related to assembly [47,48], to the best of our knowledge, the assembly trees from [46] have a formal structure that is distinct from other tree representations of assembly pathways. In particular, non-root nodes of the assembly tree contain subassemblies, rather than configurations of the entire assembly system; any pair of nodes that are incomparable (neither is an ancestor of the other in the tree) represents disjoint sub-assemblies, i.e., they do not contain any common rigid components; moreover, only rigid sub-assembly configurations are represented. In addition, the authors have taken the first steps towards precisely formalizing the effect of symmetries on a highly simplified version of assembly trees; specifically, their orbits under the action of a fixed group of symmetries, called assembly pathways [49]. These concepts will be discussed in detail in Section 2 and Section 3.

1.4. Symmetry in Chemistry

Since spheres within rigid bunches of an assembly system could be identical and bunches could be identical, as well, the underlying symmetry groups could be of large order, which grow with the number of participating spheres and bunches. Therefore, all of the tasks in the previous three subsections can be significantly simplified by taking advantage of natural symmetries of the configuration space that arise due to identical assembling units, their symmetries and symmetries of the final assembled structure. However, none of the prevailing methods discussed above computationally incorporates these symmetries. Group theory has been used to study the symmetry of molecules and molecular orbits [50,51,52,53] for a long time. The well-known Pólya enumeration theorem [54], which provides a method to find the number of orbits of a group action, is motivated by the problem of enumerating permutational isomers of a given molecular skeleton. Group theory is widely used in crystallography to describe crystallographic symmetry and to classify crystal structures [55,56]. Other applications include using the molecule symmetry group in studying molecular spectroscopy [57] and using generating functions in understanding nuclear spin statistics of nonrigid molecules [58]. However, most of these works only involve the symmetry of individual structures. The literature is sparse in the context of symmetry in assembly systems or in configuration spaces.

1.5. EASAL: Efficient Atlasing and Search of Assembly Landscapes

A recent method of the first author, EASAL (efficient atlasing and search of assembly landscapes) [25,26], formally addresses the issues highlighted in the first two subsections above: computation of configurational entropy and kinetics, via geometrization, stratification and convexification using Cayley parameterization of assembly configuration spaces. Geometrization and stratification were also used later in [24] independently (as mentioned at the end of Section 1.1): the geometrization is achieved in [24] via a somewhat different process consistent with smooth potential energy functions, while the stratification is the standard Thom–Whitney stratification of semi-algebraic sets as laid out in [25,26].

On the other hand, Cayley convexification based on [59] is a unique feature of EASAL not present in [24], which makes it tractable to sample and compute entropy integrals over higher dimensional constant potential energy regions of the assembly configuration space. In addition, Cayley convexification helps formalize and precisely explain the intuitively clear observation that assembly configuration spaces are significantly simpler geometrically and topologically than folding configuration spaces. The difference in complexity is especially stark when there are cycles of pairwise constraints between atom centers.

We describe the geometrization and stratification aspects of EASAL’s approach below. Stratification is explained in further detail in Section 2, and Cayley parameters for configuration spaces and convexification based on [59] are explained in Section 4.

1.5.1. Geometrization

The assembly configuration space is represented as a semi-algebraic set satisfying geometric constraints specified as distance inequalities between atom centers. The short-range or hard sphere potential interaction is typically discretized to take different constant values on three intervals for the distance value $d_{i, j}$: $(0, r_{i, j})$, $(r_{i, j}, r_{i, j} + δ_{i, j})$ and $(r_{i, j} + δ_{i, j}, \infty)$. Typically, $r_{i, j}$, the so-called van der Waals or steric radius, specifies “forbidden” regions around atoms i and j. Additionally, $r_{i, j} + δ_{i, j}$ is a distance where the attractive (electrostatic or other weak) forces between the two atoms are no longer strong (typically, these forces decay as the reciprocal of some power of the distance $d_{i, j}$ between atom centers). Intuitively, the interval $(0, r_{i, j})$ is where the repulsive force highly dominates, and $(r_{i, j}, r_{i, j} + δ_{i, j})$ is where the attractive and repulsive forces are balanced; also, $(r_{i, j} + δ_{i, j}, \infty)$ is where neither force is strong. Over these three intervals, respectively, the potential assumes a very high value, a very low value and a medium value $m_{i, j}$. All of these bounds for the intervals for $d_{i, j}$, as well as the values for the potential on these intervals are specified as part of the input to the assembly model. These constants are specified for each pair of atoms i and j, i.e., the subscripts are necessary. The interval with the low value is called the well. The hard sphere potentials are defined solely by the van der Waals’ forbidden distance constraint, $δ_{i, j} = 0$.

The information in the potential energy landscape can thus be geometrized, i.e., represented using assembly constraints, in the form of distance intervals. These constraints define feasible configurations. The set of feasible configurations is called the assembly configuration space. The active constraint regions of the configuration space are regions where at least one of the short-range inter-atom distances lies in the potential energy well, i.e., the interval $(r_{i, j}, r_{i, j} + δ_{i, j})$.
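As a minimal sketch of this geometrization for a single pair of atoms, the discretized potential and the active-constraint test could be written as follows. The function names, the parameter names `high`, `low` and `medium`, and the numeric values in the usage note are illustrative assumptions and are not taken from EASAL. The sketch also assumes that the endpoints of the well are assigned to the well, so that the hard sphere case $δ_{i, j} = 0$ reduces to contact at distance $r_{i, j}$.

```python
def discretized_potential(d, r, delta, high, low, medium):
    """Three-interval pair potential for a distance d between atom centers.

    r is the steric distance r_{i,j} and delta the well width delta_{i,j};
    high, low and medium are the constant potential values on the three
    intervals, all supplied as input to the assembly model.
    """
    if d <= 0:
        raise ValueError("distance between distinct atom centers must be positive")
    if d < r:
        return high      # forbidden region: repulsion dominates
    if d <= r + delta:   # assumption: endpoints belong to the well
        return low       # the well: attraction and repulsion are balanced
    return medium        # neither force is strong


def is_active(d, r, delta):
    """True if the pair distance lies in the potential energy well."""
    return r <= d <= r + delta
```

For instance, with the hypothetical values `r = 1.0` and `delta = 0.2`, a pair at distance `1.1` is an active constraint, while a pair at distance `1.5` is feasible but not active.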

1.5.2. Stratification

The above geometrization of an assembly configuration space makes it natural to stratify an assembly configuration space into an atlas of active constraint regions. More details are provided in Section 2.4. The active constraint regions of the configuration space are regions where at least one of the inter-atom distances lies in the potential energy well. The active constraint regions are stratified by dimension into a topological Thom–Whitney complex, with the boundary region being one dimension smaller. The active constraint regions can be modeled as so-called convexifiable Cayley configuration spaces [59], a combinatorially-definable concept by first labeling each region by its unique active constraint graph (see Section 2). A demo movie of EASAL is available at [60]. Standard algorithms can be employed for a fast computation of paths from one configuration to another in the atlas. However, the computation of entropy integrals over these paths poses several challenges.

1.6. Organization and Contribution

This is a primarily expository paper that develops a novel, original framework for dealing with symmetries in configuration spaces of assembling spheres under short-range potentials. It is motivated by a longer term goal to exploit natural symmetries using assembly trees and other concepts described in the previous sections that have appeared in various avatars in the community, including our work on EASAL. Such an understanding of symmetries is essential for significantly reducing the complexity of the computation of configurational and combinatorial entropy, as well as kinetics, since spheres within rigid bunches of an assembly system could be identical and bunches could be identical, as well, giving underlying symmetry groups of large order, which grow with the number of participating spheres and bunches.

To this end, we develop a formal conceptual framework for assembly under short-range potentials, as an assembly of rigid bunches of spheres. As different definitions of assembly macrostates are appropriate in different contexts, for example depending on whether different copies of identical atoms or molecules are considered interchangeable or not, we carefully define and differentiate between the congruence and isomorphism of configurations. We then show how symmetries of assembly configuration spaces arise due to: multiple copies of identical building blocks (in particular, when these building blocks are rigid bunches of spheres), internal symmetries of building blocks and the symmetries of the final assembled structure.

The organization of this paper is as follows. In Section 2, we define the new conceptual framework for symmetry in assembly under short-range potentials (or an assembly of rigid bunches of spheres) leading to the main Theorem 4. An application of some of these results on symmetry can be found in [26]. In Section 3, we illustrate one aspect of our approach [49] for computing combinatorial entropy using generating functions for counting the number and size of simplified assembly pathways (orbits of a symmetry group action on assembly trees). Note that while this simple example has a fixed group size, the method demonstrated applies also when the underlying symmetry group grows with the size of the system. In Section 4, open questions and directions are given.

2. Framework for Symmetry in an Assembly

In this section, we define natural groups of symmetries acting on various previously-defined objects related to symmetry that are described in Section 1 and later in this section. The four new groups we define are the weak automorphism group, the strict congruence group, the strict order-preserving isomorphism group and the strict permuted congruence group of an assembly configuration. We consider the action of these groups on various objects defined in previous literature on assembly and sketched in Section 1 [25,26,46], such as assembly configuration space, active constraint regions, active constraint graphs, assembly paths and trees. The resulting symmetry classes will be used to formalize the main new Theorem 4 and two applications in Example 1 and Section 3, as well as open problems in the last section of this paper.

Let X be a set under the action of a group G, and x be any element of X. The orbit of x under G is the set $G (x) = \{ϕ (x) : ϕ \in G\}$. An element g of G fixes x if $g (x) = x$. The stabilizer subgroup $\mathrm{stab}_{G} (x)$ of x in G is the group of all elements in G that fix x, i.e., $\mathrm{stab}_{G} (x) = \{ϕ \in G \mid ϕ (x) = x\}$.

The following theorem from standard group theory can be used to determine the number of orbits and the size of orbits for various objects defined in this section. An explicit application of this theorem is shown in the next section.

Theorem 1. Let X be a set under the action of a group G. For all $x \in X$, the equalities:

$$| G (x) | = | G | / | \mathrm{stab}_{G} (x) | \qquad \text{(Orbit-Stabilizer theorem)}$$

and

$$| X / G | = \frac{1}{| G |} \sum_{ϕ \in G} | X^{ϕ} | \qquad \text{(Burnside’s lemma)}$$

hold, where $| X / G |$ is the number of orbits of X and $X^{ϕ}$ is the set $\{x \in X : ϕ (x) = x\}$.
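Both equalities can be checked by brute force on a small example. The sketch below is an illustrative assumption, not an example from the paper: G is the group of four rotations of a square, and X is the set of two-colorings of its four vertices.

```python
from itertools import product

# Group G: the four rotations of a square, each given as a permutation of
# the vertex indices 0..3 (rotation by s moves vertex i to vertex i + s).
G = [tuple((i + s) % 4 for i in range(4)) for s in range(4)]

# Set X: all colorings of the four vertices with two colors.
X = list(product((0, 1), repeat=4))


def act(perm, x):
    """Apply a vertex permutation to a coloring."""
    y = [None] * len(x)
    for i, color in enumerate(x):
        y[perm[i]] = color
    return tuple(y)


def orbit(x):
    return {act(g, x) for g in G}


def stabilizer(x):
    return [g for g in G if act(g, x) == x]


# Orbit-Stabilizer: |G(x)| = |G| / |stab_G(x)| for every x in X.
assert all(len(orbit(x)) * len(stabilizer(x)) == len(G) for x in X)

# Burnside: the number of orbits equals the average number of fixed points.
num_orbits_direct = len({frozenset(orbit(x)) for x in X})
num_orbits_burnside = sum(
    sum(1 for x in X if act(g, x) == x) for g in G
) // len(G)
print(num_orbits_direct, num_orbits_burnside)  # prints "6 6"
```

The fixed-point counts for the four rotations are 16, 2, 4 and 2, so Burnside’s lemma gives 24 / 4 = 6 orbits, matching the direct enumeration.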

Different definitions of macrostates are appropriate in different contexts, for example depending on whether different copies of identical atoms or molecules are considered interchangeable or not. For this reason, we carefully define and differentiate between the congruence and isomorphism of configurations.

In order to give a physically meaningful formalization of an assembly system under short-range potentials, we define the notion of a bunch, i.e., a rigid configuration of spheres of varying colors and radii.

2.1. A Bunch and Its Symmetries

Let $SE(3)$ denote the group of orientation-preserving isometries of $\mathbb{R}^{3}$.

A bunch is a tuple $(P; C, r, δ)$ where $P = (p_{1}, p_{2}, \dots, p_{n})$ is an ordered set of points in $\mathbb{R}^{3}$, and $C, r, δ$ are functions defining colored spheres centered at the points in P. Specifically, $C : P \to \mathcal{C}$ where $\mathcal{C}$ is a finite set of “colors”, and $r, δ : P \to \mathbb{R}^{+}$, such that the spheres are non-intersecting, i.e., $\| p_{i} - p_{j} \|_{2} \geq r (p_{i}) + r (p_{j})$ for any $i \neq j$. The map δ is interpreted as the width of the annulus specified by the potential energy well and is used in the definition of an active constraint graph of an assembly configuration later in this section. For a bunch B, $P (B)$ is used to denote the point set of B; similarly, we have $C (B), r (B)$ and $δ (B)$.
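One way to hold this definition in code is sketched below. The class name `Bunch` and its field names are hypothetical, and the sketch assumes that δ may be zero so that the hard sphere case is representable.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist


@dataclass(frozen=True)
class Bunch:
    points: tuple   # ordered points p_1..p_n, each a 3-tuple of floats
    colors: tuple   # C(p_i), one color per point
    radii: tuple    # r(p_i), one positive radius per point
    deltas: tuple   # delta(p_i), one well width per point

    def __post_init__(self):
        n = len(self.points)
        if not (len(self.colors) == len(self.radii) == len(self.deltas) == n):
            raise ValueError("C, r and delta must be defined on every point of P")
        if any(r <= 0 for r in self.radii):
            raise ValueError("radii must be positive")
        if any(d < 0 for d in self.deltas):
            # Assumption: delta = 0 is allowed (hard sphere case).
            raise ValueError("well widths must be non-negative")
        # Non-intersection: ||p_i - p_j|| >= r(p_i) + r(p_j) for all i != j.
        for i, j in combinations(range(n), 2):
            if dist(self.points[i], self.points[j]) < self.radii[i] + self.radii[j]:
                raise ValueError(f"spheres {i} and {j} intersect")
```

Two unit spheres centered at `(0, 0, 0)` and `(2, 0, 0)` touch and are accepted, while centers at distance 1 are rejected.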

Two bunches $B = (P; C, r, δ)$ and $B^{'} = (P^{'}; C^{'}, r^{'}, δ^{'})$ are isomorphic if there is an element ϕ of $SE(3)$ and a permutation $π \in S_{n}$, such that $ϕ (p_{i}) = p_{π (i)}^{'}$ for all i, where $n = | P |$, and ϕ preserves the color, radius and annulus of points. In this case, with a slight abuse of notation, we write $B^{'} \in ϕ (B)$, where $ϕ (B)$ denotes the set of bunches that are isomorphic to B under ϕ and some permutation in $S_{n}$. See Figure 1 for an example.

Two bunches $B = (P; C, r, δ)$ and $B^{'} = (P^{'}; C^{'}, r^{'}, δ^{'})$ are strictly isomorphic, if there is a permutation $π \in S_{n}$ such that B and $B^{'}$ are isomorphic under π and the identity element in $SE(3)$. The weak automorphism group of B, denoted $\mathrm{Waut} (B)$, is the group of all permutations $π \in S_{n}$ that take B to a strictly isomorphic $B^{'}$.

Figure 1. Two isomorphic bunches of five spheres.

Two bunches $B = (P; C, r, δ)$ and $B^{'} = (P^{'}; C^{'}, r^{'}, δ^{'})$ are order-preserving isomorphic or congruent, if there is a $ϕ \in SE(3)$, such that B and $B^{'}$ are isomorphic under ϕ and the identity permutation. In this case, with a slight abuse of notation, we write $B^{'} = ϕ (B)$.

We have the following observation that describes strict isomorphism using the notion of congruence.

Observation 2. Two congruent bunches B and $B^{'}$ are strictly isomorphic, if and only if $\tilde{P} = {\tilde{P}}^{'}$, where $\tilde{P}$ and ${\tilde{P}}^{'}$ denote the unordered point sets of B and $B^{'}$, respectively, and for all $p \in P^{'}$, $C^{'} (p) = C (p)$, $r^{'} (p) = r (p)$, $δ^{'} (p) = δ (p)$.
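The condition in Observation 2 amounts to comparing the two bunches as unordered sets of labelled spheres. A minimal sketch, assuming each bunch is given as a hypothetical tuple `(points, colors, radii, deltas)` of equal-length sequences with hashable points and exactly comparable coordinates:

```python
def labelled_spheres(points, colors, radii, deltas):
    """Unordered view of a bunch: the set of (point, color, radius, delta)."""
    return frozenset(zip(points, colors, radii, deltas))


def strictly_isomorphic(bunch_a, bunch_b):
    """True if some reordering of the points, with no rigid motion applied,
    maps one bunch onto the other while preserving color, radius and delta."""
    return labelled_spheres(*bunch_a) == labelled_spheres(*bunch_b)
```

Because the spheres of a bunch do not intersect, its points are distinct, so equality of these sets also forces the two bunches to have the same number of points. In floating-point practice the exact comparison would be replaced by a tolerance, which this sketch does not attempt.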

2.2. An Assembly Configuration Space and Its Symmetries

An assembly configuration is an ordered set $B = (B_{1}, B_{2} \dots B_{k})$, where $B_{i} = (P_{i}; C_{i}, r_{i}, δ_{i})$ is a bunch for all i, such that for all $i, j$ and all $x \in P_{i}, y \in P_{j}, x \neq y$, we have:

$$\| x - y \|_{2} \geq r_{i} (x) + r_{j} (y) \qquad (1)$$

Two assembly configurations $B = (B_{1}, \dots, B_{k})$ and $B^{'} = (B_{1}^{'}, \dots, B_{k}^{'})$ are configurations of the same assembly system (see Section 1) if $B_{i}$ is congruent to $B_{σ (i)}^{'}$ for some permutation $σ \in S_{k}$, for all i. Notice that the congruence between bunches could be different for each i. The set of all assembly configurations of an assembly system is called an assembly configuration space. The assembly configuration space containing the assembly configuration $B$ is denoted $A (B)$ or simply $A$ when the context is clear.

In the following discussion, we always restrict our universe to assembly configurations in the same assembly configuration space.

Two assembly configurations $B = (B_{1}, \dots, B_{k})$ and $B^{'} = (B_{1}^{'}, \dots, B_{k}^{'})$ are isomorphic if there is an element ϕ of $SE(3)$ (isomorphism between bunches) and a permutation $σ \in S_{k}$, such that for all i, $B_{σ (i)}^{'}$ is isomorphic to $B_{i}$ under ϕ and a permutation $π_{i} \in S_{n_{i}}$, where $n_{i} = | P_{i} |$.

Two assembly configurations $B$ and $B^{'}$ are strictly isomorphic, if there is a permutation $σ \in S_{k}$, such that for all i, $B_{σ (i)}^{'}$ is isomorphic to $B_{i}$ under the identity element in $SE(3)$ and a permutation $π_{i} \in S_{n_{i}}$, where $n_{i} = | P_{i} |$. Thus, a strict isomorphism is a tuple of permutations $(σ, π_{1}, \dots, π_{k})$, where $σ \in S_{k}$ and $π_{i} \in S_{n_{i}}$. The weak automorphism group of $B$, denoted $\mathrm{Waut} (B)$, is the group of all such tuples $(σ, π_{1}, \dots, π_{k})$ that take $B$ to a strictly isomorphic $B^{'}$, with the group operation $(σ, π_{1}, \dots, π_{k}) (σ^{'}, π_{1}^{'}, \dots, π_{k}^{'}) = (σ σ^{'}, π_{1} π_{1}^{'}, \dots, π_{k} π_{k}^{'})$.
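The componentwise group operation stated above can be sketched as follows. Permutations are represented as tuples of 0-based indices, and the product of two permutations is read as "apply the right factor first"; both choices are assumptions of this sketch, since the text does not fix a representation or a composition order.

```python
def compose(p, q):
    """Product p q of two permutations given as index tuples.

    Assumption: q is applied first, then p.
    """
    return tuple(p[q[i]] for i in range(len(q)))


def multiply(g, h):
    """Componentwise product of two tuples (sigma, pi_1, ..., pi_k)."""
    return tuple(compose(a, b) for a, b in zip(g, h))


# Hypothetical example with k = 2 bunches of three points each.
swap = (1, 0)                       # sigma exchanging the two bunches
g = (swap, (1, 0, 2), (0, 2, 1))
h = (swap, (1, 0, 2), (0, 1, 2))
print(multiply(g, h))               # prints ((0, 1), (0, 1, 2), (0, 2, 1))
```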

Note that all assembly configurations in the same assembly configuration space $A$ have the same weak automorphism group. Thus, we define the weak automorphism group of an assembly configuration space $A$, denoted $\mathrm{Waut}_{A}$, to be the weak automorphism group of any assembly configuration $B$ in $A$.

Two assembly configurations $B$ and $B^{'}$ are congruent if there is an isomorphism $ϕ \in SE(3)$ that preserves both the order of the bunches and the order of points within each bunch, i.e., for all i, $B_{i}^{'}$ is congruent to $B_{i}$ under ϕ. Two assembly configurations $B$ and $B^{'}$ are strictly congruent if they are both congruent and strictly isomorphic. In general, we think of two strictly congruent assembly configurations as the same. The strict congruence group of an assembly configuration $B$ is the stabilizer of the set of strictly congruent assembly configurations of $B$ under $\mathrm{Waut}_{A}$. It is the stabilizer subgroup $\mathrm{stab}_{\mathrm{Waut}_{A}} B$ of the assembly configuration $B$ under $\mathrm{Waut}_{A}$.

Two assembly configurations $B$ and $B^{'}$ are order-preserving isomorphic if there is an isomorphism $ϕ \in SE(3)$ that preserves the order of the bunches, i.e., for all i, $B_{i}^{'}$ is congruent to $ϕ (B_{i})$. Two assembly configurations $B$ and $B^{'}$ are strictly order-preserving isomorphic if they are both order-preserving isomorphic and strictly isomorphic. The strict order-preserving isomorphism group of an assembly configuration $B$ is the stabilizer of the set of strictly order-preserving isomorphic configurations of $B$ under $\mathrm{Waut}_{A}$.

Two assembly configurations $B$ and $B^{'}$ are permuted congruent if there is an isomorphism that preserves the order of points within each bunch, i.e., there is an element ϕ of $SE(3)$ and a permutation $σ \in S_{k}$, such that for all i, $B_{σ (i)}^{'}$ is congruent to $B_{i}$ under ϕ. Two assembly configurations $B$ and $B^{'}$ are strictly permuted congruent if they are both permuted congruent and strictly isomorphic. The strict permuted congruence group of an assembly configuration $B$ is the stabilizer of the set of permuted congruent configurations of $B$ under $\mathrm{Waut}_{A}$.

For an example, refer to Figure 2. The assembly configuration $B_{1}$ consists of three congruent bunches. The assembly configuration $B_{2}$ is obtained from $B_{1}$ with a strict congruence $(σ, π_{1}, π_{2}, π_{3})$ induced by a rotation in $SE(3)$, where $σ = (1\ 3)$, and $π_{i} = \mathrm{id}$ for all i. The assembly configuration $B_{3}$ is obtained from $B_{1}$ with a strict permuted congruence $(σ, π_{1}, π_{2}, π_{3})$, where σ is a cyclic permutation of the three bunches, and $π_{i} = \mathrm{id}$ for all i. On the other hand, $B_{4}$ is obtained from $B_{1}$ with a strict isomorphism $(σ, π_{1}, π_{2}, π_{3})$, where σ is a cyclic permutation of the three bunches, $π_{1} = (1\ 2)$ and $π_{2} = π_{3} = \mathrm{id}$.

Figure 2. The assembly configuration

B_{1}

consists of three isomorphic bunches.

B_{2}

is obtained from

B_{1}

with a strict congruence;

B_{3}

is obtained from

B_{1}

with a strict permuted congruence; and

B_{4}

is obtained from

B_{1}

with a strict isomorphism that is neither a strict congruence, nor a strict permuted congruence, nor a strict order-preserving isomorphism.

Figure 3 shows another example of four assembly configurations each containing two bunches. The strict congruence group

{stab}_{Waut_{A}} B

of the assembly configuration

B_{1}

is of size two and contains those tuples

(σ, π_{1}, π_{2})

, where

π_{1} \in {i d, (2 4)}

,

σ = i d

,

π_{2} = i d

. The weak automorphism group

Waut_{A}

of the assembly system is of size four and contains those tuples

(σ, π_{1}, π_{2})

, where

π_{1} \in {i d, (2 4), (3 1), (2 4) (3 1)}

,

σ = i d

,

π_{2} = i d

. All four strictly isomorphic assembly configurations are obtained by applying

Waut_{A}

to the assembly configuration

B_{1}

. Notice that

B_{2}

and

B_{1}

(

B_{4}

and

B_{3}

) are strictly congruent, while

B_{3}

and

B_{1}

are strictly order-preserving isomorphic. The orbit of

B_{1}

under

Waut_{A}

is of size two and consists of

B_{1}

and

B_{3}

.

Figure 3. Four assembly configurations obtained by applying

Waut_{A}

on the assembly configuration

B_{1}

.

B_{2}

is obtained from

B_{1}

with a congruence, while

B_{3}

is obtained from

B_{1}

with a strict order-preserving isomorphism.

We have the following observations for alternative characterizations of strict congruence, strict order-preserving isomorphism and strict permuted congruence of assembly configurations.

Observation 3.

Given two assembly configurations

B = (B_{1}, \dots, B_{k})

and

B^{'} = (B_{1}^{'}, \dots, B_{k}^{'})

in the same assembly configuration space,

(1) $B$ and $B^{'}$ are strictly congruent if and only if they are congruent and satisfy the following condition (*): $B$ and $B^{'}$ have the same unordered partition of the unordered point set into bunches, i.e., ${\tilde{P_{1}}, \dots, \tilde{P_{k}}} = {\tilde{P_{1}^{'}}, \dots, \tilde{P_{k}^{'}}}$, where $\tilde{P_{i}}$ is the unordered point set of the bunch $B_{i}$, and each point has the same color, radius and annulus in $B$ and $B^{'}$.
(2) $B$ and $B^{'}$ are strictly order-preserving isomorphic if and only if they are order-preserving isomorphic and satisfy the condition (*).
(3) $B$ and $B^{'}$ are strictly permuted congruent if and only if they are permuted congruent and satisfy the condition (*).

2.3. Symmetries in an Active Constraint Graph and an Active Constraint Region

An active constraint graph

G (B)

of an assembly configuration

B = (B_{1}, \dots, B_{k})

is a graph

(V, E)

, where the vertex set V has one vertex for each point

p \in P_{1} \cup \dots \cup P_{k}

, labeled by a tuple

(i, l)

, representing that the point p appears as the i-th point

p_{i}

in the l-th bunch

B_{l}

of

B

, and a vertex pair

{x, y} \in E

if x and y lie in distinct bunches of

B

and:

r (x) + r (y) \leq {∥ x - y ∥}_{2} \leq (r (x) + δ (x)) + (r (y) + δ (y)) .
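
The edge condition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (a list of bunches, each a list of `(center, r, delta)` triples), the function name `active_constraint_graph` and the sample coordinates are hypothetical and are not taken from the paper or from EASAL.

```python
from itertools import combinations
from math import dist

def active_constraint_graph(bunches):
    """Return the edge set of G(B) as a set of frozensets of vertex labels.

    `bunches` is a list of bunches; each bunch is a list of
    (center, r, delta) triples (an assumed, illustrative layout).
    A vertex is labeled (i, l): the i-th point of the l-th bunch,
    both counted from 1 as in the text.
    """
    points = [((i, l), center, r, delta)
              for l, bunch in enumerate(bunches, start=1)
              for i, (center, r, delta) in enumerate(bunch, start=1)]
    edges = set()
    for (u, cu, ru, du), (v, cv, rv, dv) in combinations(points, 2):
        if u[1] == v[1]:
            continue  # points of the same bunch never form an edge
        gap = dist(cu, cv)  # Euclidean distance between the centers
        if ru + rv <= gap <= (ru + du) + (rv + dv):
            edges.add(frozenset((u, v)))
    return edges

# Three singleton unit spheres on a line with annulus width 0.1 (made-up data).
bunches = [[((0.0, 0.0, 0.0), 1.0, 0.1)],
           [((2.0, 0.0, 0.0), 1.0, 0.1)],
           [((4.5, 0.0, 0.0), 1.0, 0.1)]]
edges = active_constraint_graph(bunches)
print(edges)
```

In this made-up configuration only the first two spheres are within each other's annulus, so the graph has a single edge between the vertices `(1, 1)` and `(1, 2)`.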

An element

(σ, π_{1}, \dots, π_{k})

of the weak automorphism group

Waut_{A}

of

B

’s assembly configuration space

A

acts on

G (B)

by taking the tuple

(i, l)

to

(π_{l} (i), σ (l))

.

Two active constraint graphs

G_{1}, G_{2}

are isomorphic if there is a

ψ = (σ, π_{1}, \dots, π_{k}) \in Waut_{A}

, such that

{x, y} \in E (G_{1}) ⟺ {ψ (x), ψ (y)} \in E (G_{2})

. In this case, we say

G_{1} ≅_{ψ} G_{2}

or

ψ (G_{1}) = G_{2}

.

The automorphism group of an active constraint graph G is the group of elements

ψ \in Waut_{A}

, such that

ψ (G) = G

, i.e., it is the stabilizer subgroup

{stab}_{Waut_{A}} G

.
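
The action on vertex labels and the resulting stabilizer can be illustrated with a brute-force sketch. It assumes the simplest setting mentioned in this paper, in which all bunches are identical singleton spheres, so that every π_l is the identity and σ ranges over the whole symmetric group on the bunch indices. The function names and the dictionary encoding of permutations are illustrative choices, not part of the framework.

```python
from itertools import permutations

def act_on_vertex(sigma, pis, vertex):
    """Map the label (i, l) to (pi_l(i), sigma(l))."""
    i, l = vertex
    return (pis[l][i], sigma[l])

def act_on_graph(sigma, pis, edges):
    """Image of an edge set under psi = (sigma, pi_1, ..., pi_k)."""
    return {frozenset(act_on_vertex(sigma, pis, v) for v in e) for e in edges}

def graph_stabilizer_singletons(edges, k):
    """Brute-force stabilizer of G for k identical singleton bunches.

    Assumption: each pi_l is the identity on the single point of bunch l,
    and sigma ranges over all permutations of {1, ..., k}.
    """
    pis = {l: {1: 1} for l in range(1, k + 1)}
    stab = []
    for image in permutations(range(1, k + 1)):
        sigma = dict(zip(range(1, k + 1), image))
        if act_on_graph(sigma, pis, edges) == edges:
            stab.append(sigma)
    return stab

# A path 1 - 2 - 3 on three singleton bunches (vertex labels are (1, l)).
path = {frozenset({(1, 1), (1, 2)}), frozenset({(1, 2), (1, 3)})}
stab = graph_stabilizer_singletons(path, 3)
print(len(stab), stab)
```

For the path graph, the stabilizer consists of the identity and the permutation exchanging bunches 1 and 3. Enumerating all of S_k is only feasible for small k; it is used here purely to make the definition concrete.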

For example, Figure 4 shows all of the non-isomorphic active constraint graphs with 12 edges of an assembly system consisting of six bunches, where all bunches are identical singleton spheres.

Figure 4. All non-isomorphic active constraint graphs with 12 edges of an assembly system of six bunches that are identical singleton spheres. The label on top is automatically generated by EASAL and specifies the orbit number of the shown active constraint graph.

Note:

It is clear that

{stab}_{Waut_{A}} B \subseteq {stab}_{Waut_{A}} G (B)

. Moreover, there are assembly configurations

B

, such that

{stab}_{Waut_{A}} B ⊊ {stab}_{Waut_{A}} G (B)

, i.e., the strict congruence group of

B

does not have all of the automorphisms of the corresponding active constraint graph. Refer to the assembly configuration

B

and its active constraint graph G in Figure 5, where each bunch is a singleton sphere. The permutation

σ = (1 2 3) \in Waut_{A}

is contained in

{stab}_{Waut_{A}} (G)

. However, it is not contained in the strict congruence group

{stab}_{Waut_{A}} B

of the assembly configuration.

Figure 5. An assembly configuration whose automorphism group is strictly contained in that of the corresponding active constraint graph. Here, the bunches are singleton spheres, and bunches of the same color have the same

C

, r and δ.

The full graph

G^{*}

of an active constraint graph G is obtained by adding edges to G to make the set of vertices in each bunch into a clique.

An active constraint region

R_{G}

of the assembly configuration space

A

contains all assembly configurations

B

with the active constraint graph

G (B) = G

. The action of elements of

Waut_{A}

on an active constraint region and the stabilizer of an active constraint region in

Waut_{A}

are well-defined by the action of

Waut_{A}

on assembly configurations.

The following theorem gives containment and equality relations between stabilizer subgroups of an active constraint graph, an active constraint region and individual configurations in the active constraint region.

Theorem 4.

For an active constraint graph

G = G (B)

of an assembly configuration space

A

, it holds that:

{stab}_{Waut_{A}} B \subseteq {stab}_{Waut_{A}} G = {stab}_{Waut_{A}} R_{G}

In addition, there exist active constraint graphs G of assembly configuration spaces

A

where the above containment is strict, i.e.,

for all B such that G = G (B), {stab}_{Waut_{A}} B ⊊ {stab}_{Waut_{A}} G = {stab}_{Waut_{A}} R_{G}

Proof.

(1) It is straightforward to see that

{stab}_{Waut_{A}} B \subseteq {stab}_{Waut_{A}} G (B)

. We give an example to show the existence of G where

{stab}_{Waut_{A}} B ⊊ {stab}_{Waut_{A}} G

for any assembly configuration

B

of G. Refer to the assembly configuration in Figure 6, where each bunch is a singleton sphere. The permutation

σ = (1 2 3)

is contained in the automorphism group

{stab}_{Waut_{A}} G

of the active constraint graph G. However, it is not contained in the strict congruence group of any corresponding assembly configuration, as the position of the sphere six is asymmetric with respect to

1, 2, 3

in any assembly configuration of G. Thus,

{stab}_{Waut_{A}} B ⊊ {stab}_{Waut_{A}} G

for any assembly configuration

B

of G.

Figure 6. Any assembly configuration corresponding to the active constraint graph G has its strict congruence group strictly contained in

{stab}_{Waut_{A}} G

. Here, the bunches are singleton spheres, and bunches of the same color have the same

C

, r and δ.

(2)

{stab}_{Waut_{A}} G = {stab}_{Waut_{A}} R_{G}

: from the definition of permutations in the weak automorphism group of the assembly configuration space, it follows that

{stab}_{Waut_{A}} G \subseteq {stab}_{Waut_{A}} R_{G}

. To show

{stab}_{Waut_{A}} R_{G} \subseteq {stab}_{Waut_{A}} G

, consider any element

ψ \in {stab}_{Waut_{A}} R_{G}

. For any assembly configuration

B \in R_{G}

, if a pair of spheres

(x, y)

are “touching” (i.e., they yield an edge in the corresponding active constraint graph), it must be the case that

(ψ (x), ψ (y))

are also “touching” in

ψ (B)

, since

G (B) = G (ψ (B)) = G

. Similarly, ψ must map “non-touching” pairs to “non-touching” pairs. Therefore,

ψ \in {stab}_{Waut_{A}} G

. ☐

Remark 1.

We expect the strict order-preserving isomorphism group and the strict permuted congruence group of an assembly configuration

B

to lie between the strict congruence group

{stab}_{Waut_{A}} B

and the automorphism group

{stab}_{Waut_{A}} G

of its active constraint graph. However, the containment relationship between these two groups is not clear.

2.4. Symmetries in Stratification, Assembly Path and Pathway

A stratification

S (A)

of the assembly configuration space

A

is a partition of the space into strata

X_{i}

of

A

that form a filtration

\emptyset \subset X_{0} \subset X_{1} \subset \dots \subset X_{m} = A

,

m = 6 (n - 1)

. Each

X_{i}

is a union of active constraint regions

R_{G}

, where the corresponding active constraint graph G has

m - i

independent edges, i.e.,

m - i

inequality constraints are active. Each active constraint graph G is itself part of at least one, and possibly many, nested chains (hence indexed by l) of the form

\emptyset \subset G_{0}^{l} \subset G_{1}^{l} \subset \dots \subset G_{m - i}^{l} = G \subset \dots \subset G_{m}^{l}

.

These induce corresponding reverse nested chains of active constraint regions

R_{G_{j}^{l}}

:

\emptyset \subset R_{G_{m}^{l}} \subset R_{G_{m - 1}^{l}} \subset \dots \subset R_{G_{m - i}^{l}} = R_{G} \subset \dots \subset R_{G_{0}^{l}}

. Note that here, for all

l, j

,

R_{G_{m - j}^{l}} \subseteq X_{j}

is closed and j dimensional. See Figure 7 for an example of assembly configuration space stratification.

Given two active constraint graphs

G_{i}

and

G_{j}

,

R_{G_{i}}

(resp.

G_{i}

) is a parent of

R_{G_{j}}

(resp.

G_{j}

) (resp.

R_{G_{j}}

is a child of

R_{G_{i}}

) if

G_{i} ⊊ G_{j}

, and there does not exist an active constraint graph

G_{m}

, such that

G_{i} ⊊ G_{m} ⊊ G_{j}

. The parent-child relation provides a Hasse diagram of active constraint regions in the stratification of

A

.

Figure 7. A fundamental region of the stratification for the assembly configuration space of the assembly configurations in Figure 4 of six bunches, with each bunch being a singleton sphere and all bunches identical. Therefore,

Waut_{A}

is the complete symmetric group of the permutations of six elements,

S_{6}

. Each node shown is an orbit representative of an active constraint region corresponding to an active constraint graph. The grey part consists of those active constraint graphs (orbit representatives) whose corresponding constraint regions are empty. The example active constraint graph representatives on the right have arrows pointing to their regions in the stratification. The labels in the circles are unimportant: they are automatically generated and specify an orbit of an active constraint graph (example shown on the right).

An assembly path from

G_{1}

to

G_{m}

in the stratification is a sequence

G_{1} ⊊ G_{2} ⊊ G_{3} ⊊ \dots ⊊ G_{m}

where

G_{i + 1}

is a child of

G_{i}

for all

1 \leq i < m

. A coarse assembly path from

G_{1}

to

G_{m}

in the stratification is a sequence

G_{1} ⊊ G_{2} ⊊ G_{3} ⊊ \dots ⊊ G_{m}

where

G_{i + 1}^{*}

has exactly one new rigid component S not in

G_{i}^{*}

, with S containing a set of two or more rigid components

S_{1} \dots S_{m}

of

G_{i}

. In addition, for all proper subsets

Q ⊊ {S_{1} \dots S_{m}}

with

| Q | \geq 2

, the subgraphs of

G_{i + 1}^{*}

induced by Q are not rigid (The rigid components of a graph are the maximal rigid subgraphs. Two rigid components cannot intersect on more than two vertices. We refer the reader to combinatorial rigidity concepts in [61]).

For example, in Figure 7, the sequence of active constraint graphs on the right forms an assembly path.

An assembly forest corresponding to a coarse assembly path from

G_{1}

to

G_{m}

is the unique forest where the leaves are the maximal rigid components of

G_{1}^{*}

. The internal nodes are the new rigid components S occurring in some

G_{i + 1}^{*}

in the path. The children of S are the set of rigid components

S_{1} \dots S_{m}

contained in S that occur in

G_{i}^{*}

. The roots of the forest are the rigid components of

G_{m}^{*}

. An assembly tree is an assembly forest with only one root. See Section 3 for examples of assembly trees [46,49,62].

A full (coarse) assembly path is a (coarse) assembly path from

G_{1}

to

G_{m}

, where

G_{1}

is the empty active constraint graph, and

G_{m}^{*}

is a rigid active constraint graph. A (coarse) assembly path from primitives has the first property of the full assembly path, i.e.,

G_{1}

is the empty active constraint graph, but not the last property, i.e.,

G_{m}

can be any active constraint graph. The full assembly tree and assembly tree from primitives are also defined in this way.

A path between full active constraint graphs G and H where

G ⊈ H

and

H ⊈ G

is a sequence

G = G_{i}, G_{i + 1}, G_{i + 2}, \dots, G_{i + m} = H

, where any pair

G_{i + k}

and

G_{i + k + 1}

is on some assembly path, and

G_{i + k} ⊊ G_{i + k + 1}

if k is even,

G_{i + k} ⊋ G_{i + k + 1}

if k is odd.

The fundamental domain of the stratification

S (A)

is the minimal sub-stratification

\tilde{S} (A)

, such that

⋃_{π \in Waut_{A}} π (\tilde{S} (A)) = S (A)

, where π acts on

\tilde{S} (A)

via its action on the active constraint regions (resp. active constraint graphs) of

\tilde{S} (A)

. In other words, the active constraint regions (resp. active constraint graphs) in

\tilde{S} (A)

are orbit representatives of active constraint regions (resp. active constraint graphs) under

Waut_{A}

.

An assembly pathway is an orbit of an assembly tree under

Waut_{A}

. The definition extends to full and coarse assembly trees.

2.5. Example Illustrating the above Symmetries

Some of the symmetry concepts defined here were used in [26] to compute path and higher dimensional region intervals in sphere-based assembly configuration spaces more efficiently, reproducing and extending the results in [24]. We give a brief description here in the form of an example:

Example 1.

Figure 7 shows the Hasse diagram of the fundamental region of a stratification of an assembly system of six bunches that are identical singleton spheres, first considered in [24]. Figure 8 shows an (orbit representative of an) active constraint graph of the system together with its parents and children in the Hasse diagram.

Figure 8. The neighbors of one active constraint graph in the Hasse diagram of the stratification for the assembly system in Figure 4.

In addition, orbit representatives of paths help with improving the efficiency of path integrals. In Figure 7, any path that goes down from the top of the diagram to the bottom is the orbit representative of an assembly path. In Figure 8, the sequence

e 10 q 6 ⊊ e 11 g 3 ⊊ e 12 g 2

is the orbit representative of an assembly path, but not a coarse assembly path, as none of

e 11 g 3

’s rigid components contains two or more rigid components of

e 10 g 6

. On the other hand, the sequence

e 10 q 6 ⊊ e 12 g 2

is the orbit representative of a coarse assembly path.

3. Enumerating Simple Assembly Pathways

In this section, we consider the action of the strict congruence group of a single final configuration on its assembly trees and use generating functions to count the number and sizes of simplified assembly pathways [49]. Note that our approach could potentially be applied for all other groups defined in Section 2, the largest of which is the weak automorphism group of the final configuration, which would be the same as the weak automorphism group of the assembly configuration space.

A simple assembly is modeled by a rooted tree: the leaves are abstract representations of individual bunches, and the root represents the final assembled configuration. The internal vertices represent intermediate stages of assembly, simplified to be subsets instead of subgraphs of the root. This simplification results in a loss of information about the assembly configuration space and active constraint graphs of the intermediate stages of assembly. To compensate, the group is taken to be the automorphism group G of the graph of the assembled structure at the root instead of the weak automorphism group

Waut_{A}

of the assembly configuration space.

The definitions of assembly trees and pathways are simplified as follows. Given a finite group G acting on a finite set X, we will define a simplified assembly pathway for the pair

(G, X)

. First, a simplified assembly tree is a rooted tree for which each internal vertex has at least two children and whose leaves are bijectively labeled with elements of a set X. There is an induced labeling on all of the vertices of a simplified assembly tree by labeling a vertex v by the set of labels on the leaves that are descendants of v. We identify each vertex of a simplified assembly tree with its label. Two simplified assembly trees are considered identical if there is a root-preserving, adjacency-preserving and label-preserving bijection between their vertex sets. The 26 simplified assembly trees with four leaves, labeled in the set

X = {1, 2, 3, 4}

, are shown in Figure 9.

For a simplified assembly tree τ, the action of G on X induces a natural action of G on the power set of X and, thereby, on the set of vertices of τ. Let

T_{X}

denote the set of all simplified assembly trees for X. If

g \in G

, then define the tree

g (τ)

as the unique simplified assembly tree whose set of vertex labels (including the labels of internal vertices) is

{g (v) : v \in τ}

. Thus, we have an induced action of G on

T_{X}

. Each orbit of this action of G on

T_{X}

consists of a set of simplified assembly trees called a simplified assembly pathway for

(G, X)

.

Example 2.

(Klein 4-group acting on

T_{4}

) Consider the Klein 4-group

G = Z_{2} \oplus Z_{2}

acting on the set

X = {1, 2, 3, 4}

. Writing G as a group of permutations in cycle notation, this action is:

G = {(1) (2) (3) (4), (1 2) (3 4), (1 3) (2 4), (1 4) (2 3)} .

For this example, there are exactly 11 simplified assembly pathways, which are indicated in Figure 9 by boxes around the orbits. There are four simplified assembly pathways of size one, i.e., with one simplified assembly tree in the orbit, three simplified assembly pathways of size two and four simplified assembly pathways of size four.
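
The counts in this example can be checked by brute force. The sketch below encodes each simplified assembly tree as the set of its vertex labels (each label being a set of leaves), enumerates all such trees on {1, 2, 3, 4} and groups them into orbits under the four permutations listed above. The encoding and all function names are illustrative assumptions; this is not the enumeration method of [49].

```python
from collections import Counter
from itertools import product

def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):          # add `first` to an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part              # or open a new block

def trees(leaves):
    """Yield each simplified assembly tree as a frozenset of vertex labels."""
    leaves = sorted(leaves)
    root = frozenset(leaves)
    if len(leaves) == 1:
        yield frozenset([root])
        return
    for part in set_partitions(leaves):
        if len(part) < 2:
            continue  # every internal vertex needs at least two children
        for subtrees in product(*(list(trees(block)) for block in part)):
            yield frozenset([root]).union(*subtrees)

def act(g, tree):
    """Apply the permutation g (a dict on leaves) to every vertex label."""
    return frozenset(frozenset(g[x] for x in v) for v in tree)

klein = [{1: 1, 2: 2, 3: 3, 4: 4},
         {1: 2, 2: 1, 3: 4, 4: 3},
         {1: 3, 2: 4, 3: 1, 4: 2},
         {1: 4, 2: 3, 3: 2, 4: 1}]

all_trees = set(trees([1, 2, 3, 4]))
orbits = {frozenset(act(g, t) for g in klein) for t in all_trees}
sizes = Counter(len(orbit) for orbit in orbits)
print(len(all_trees), len(orbits), dict(sizes))
```

The run reproduces the 26 trees and the 11 pathways, split into four of size one, three of size two and four of size four. Exhaustive enumeration grows very quickly with the number of leaves, which is why the generating-function approach is needed for larger sets.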

For any subgroup H of G, let

t_{X} (H)

denote the number of trees in

T_{X}

that are fixed by every element of H. Furthermore, let

\bar{t} (H) : = \bar{t_{X}} (H)

denote the number of trees in

T_{X}

that are fixed by every element of H, but by no other elements of G. In other words,

{\bar{t}}_{X} (H) = | {τ \in T_{X} | s t a b_{G} (τ) = H} | .

(2)

Figure 9. Klein 4-group acting on

T_{4}

. See Example 3.

The first theorem below reduces the enumeration of simplified assembly pathways to the calculation of

\bar{t} (H)

for subgroups H of G. The index of a subgroup H in G, i.e., the number of left (equivalently, right) cosets of H in G, is denoted by

(G : H)

. By Lagrange’s theorem, this index equals

| G | / | H |

. The second theorem below reduces the calculation of

\bar{t} (H)

to the calculation of

t (H)

. The desired quantities

{\bar{t}}_{X} (H)

are computed from the numbers

t_{X} (H)

using Möbius inversion on the lattice of subgroups of G.

Theorem 5.

The number of trees in any simplified assembly pathway for

(G, X)

divides

| G |

. If m divides

| G |

, then the number

N (m)

of simplified assembly pathways of cardinality m is:

N (m) = \frac{1}{m} \sum_{H \leq G : (G : H) = m} \bar{t} (H) .

Theorem 6.

Let G be a group acting on a set X. If H is a subgroup of G, then:

\bar{t_{X}} (H) = \sum_{H \leq K \leq G} μ (H, K) t_{X} (K),

where μ is the Möbius function for the lattice of subgroups of G.

Example 3.

(Klein 4-group acting on

T_{4}

, continued) Theorem 5, applied to our previous example of

Z_{2} \oplus Z_{2}

acting simply on

{1, 2, 3, 4}

, states that the size of a simplified assembly pathway must be 1, 2 or 4, since it must be a divisor of

4 = | Z_{2} \oplus Z_{2} |

. To find the number of pathways of each size, note that G has three subgroups of order two, namely

\begin{matrix} K_{1} & = {(1) (2) (3) (4), (1 2) (3 4)}, \\ K_{2} & = {(1) (2) (3) (4), (1 3) (2 4)}, \\ K_{3} & = {(1) (2) (3) (4), (1 4) (2 3)}, \end{matrix}

and that:

\begin{matrix} \bar{t} (G) = 4, \\ \bar{t} (K_{1}) = \bar{t} (K_{2}) = \bar{t} (K_{3}) = 2, \\ \bar{t} (K_{0}) = 16, \end{matrix}

where

K_{0}

denotes the trivial subgroup of order one. The simplified assembly trees in

T_{X}

that are fixed by all elements of G are shown in Figure 9,

A, B, C, D

. For

i = 1, 2, 3

, those simplified assembly trees in

T_{X}

that are fixed by all elements of

K_{i}

and by no other elements of G are shown in Figure 9,

E, F, G

, respectively. The remaining 16 simplified assembly trees in Figure 9 are fixed by no elements of G, except the identity. Therefore, according to Theorem 5, the numbers of pathways of sizes 1, 2 and 4 are, respectively,

\begin{matrix} \bar{t} (G) & = 4, \\ \frac{1}{2} (\bar{t} (K_{1}) + \bar{t} (K_{2}) + \bar{t} (K_{3})) = \frac{1}{2} (2 + 2 + 2) & = 3, \\ \frac{1}{4} \bar{t} (K_{0}) & = 4 . \end{matrix}
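
As a hedged illustration of how Theorems 5 and 6 combine, the sketch below runs Möbius inversion on the subgroup lattice of the Klein 4-group and then applies the counting formula. The lattice is hard-coded, and the input values t(K) are inferred from the numbers in this example: 26 trees in total, 4 trees fixed by all of G, and hence 2 + 4 = 6 trees fixed by each K_i. The dictionary layout and names are illustrative assumptions.

```python
from fractions import Fraction

# Subgroup orders in the Klein 4-group.
order = {"K0": 1, "K1": 2, "K2": 2, "K3": 2, "G": 4}
# above[H] lists every subgroup K with H <= K, including H itself.
above = {"K0": ["K0", "K1", "K2", "K3", "G"],
         "K1": ["K1", "G"], "K2": ["K2", "G"], "K3": ["K3", "G"],
         "G": ["G"]}
# t[K]: number of trees fixed by every element of K (inferred, see lead-in).
t = {"K0": 26, "K1": 6, "K2": 6, "K3": 6, "G": 4}

def mobius(h, k):
    """mu(H, K): mu(H, H) = 1 and the sum of mu(H, L) over H <= L <= K is 0."""
    if h == k:
        return 1
    return -sum(mobius(h, l) for l in above[h] if l != k and k in above[l])

# Theorem 6: trees whose stabilizer is exactly H.
t_bar = {h: sum(mobius(h, k) * t[k] for k in above[h]) for h in order}

# Theorem 5: number of pathways of each size m = (G : H).
pathways = {}
for h, count in t_bar.items():
    m = order["G"] // order[h]
    pathways[m] = pathways.get(m, 0) + Fraction(count, m)

print(t_bar)
print({m: int(n) for m, n in pathways.items()})
```

The inversion recovers the values 16, 2, 2, 2 and 4 for the trivial subgroup, the three subgroups of order two and G, and the final counts agree with the 4, 3 and 4 pathways of sizes 1, 2 and 4 obtained above.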

The problem of enumerating simplified assembly pathways is reduced, using Theorems 5 and 6, to calculating the number

t (G)

of simplified assembly trees fixed by a given group G. This is done using permutation group theory and generating functions. It will be assumed, as is the case in many of the biological applications, that G acts freely on X, i.e., if

g (x) = x

for some

x \in X

, then g must be the identity. In this case:

| X | : = | X_{n} | = n \cdot | G |,

where n is the number of G-orbits in its action on X. Denote by

t_{n} (G)

the number of trees in

T_{n} : = T_{X_{n}}

that are fixed by G. We define the exponential generating function:

f_{G} (x) : = \sum_{n \geq 1} t_{n} (G) \frac{x^{n}}{n!}

for the sequence

{t_{n} (G)}

.

If G is the trivial group of order one, then let us denote this generating function simply by

f (x)

. This is the generating function for the total number of rooted, labeled trees with n leaves in which every non-leaf vertex has at least two children. For

H \leq G

, let:

{\hat{f}}_{H} (x) = \frac{1}{(G : H)} f_{H} ((G : H) x) .

Theorem 7.

The generating function

f_{G} (x)

satisfies the following functional equations:

1 - x + 2 f (x) = exp (f (x)),

and for

| G | > 1

,

1 + 2 f_{G} (x) = exp (\sum_{H \leq G} {\hat{f}}_{H} (x)) .

Although proofs are omitted in this survey, the rather involved proof of Theorem 7 relies on, in addition to generating function techniques, a characterization of block systems arising from a group acting on a set and a recursive procedure for constructing all trees in

T_{X}

that are fixed by G (see [49], Theorems 9 and 14).

Remark 2.

Finding the generating function

f_{G} (x)

depends on first finding the generating functions

f_{H} (x)

for proper subgroups H of G. In that sense, the procedure for finding

f_{G} (x)

is recursive, proceeding up the lattice of subgroups of G, starting from the trivial subgroup.

It is also worth mentioning that subgroups that are conjugate in G have the same generating function.

Example 4.

(Klein 4-group acting on

T_{4}

, continued)

Consider

G = Z_{2} \oplus Z_{2}

acting on

X_{n}

. Recall that

| X_{n} | = 4 n

, the integer n being the number of G-orbits. Recall that the subgroups of G are

K_{0}, K_{1}, K_{2}, K_{3}, G

, where

K_{0}

is the trivial group and:

\begin{matrix} K_{1} & = {(1) (2) (3) (4), (1 2) (3 4)}, \\ K_{2} & = {(1) (2) (3) (4), (1 3) (2 4)}, \\ K_{3} & = {(1) (2) (3) (4), (1 4) (2 3)} . \end{matrix}

The functional equations in the statement of Theorem 7 are:

\begin{matrix} 1 - x + 2 f (x) & = exp (f (x)) \\ 1 + 2 f_{K_{i}} (x) & = exp (\frac{1}{2} f (2 x) + f_{K_{i}} (x)) for i = 1, 2, 3, and \\ 1 + 2 f_{G} (x) & = exp (\frac{1}{4} f (4 x) + \frac{1}{2} f_{K_{1}} (2 x) + \frac{1}{2} f_{K_{2}} (2 x) + \frac{1}{2} f_{K_{3}} (2 x) + f_{G} (x)) . \end{matrix}

Using these equations and MAPLE software, the coefficients of the respective generating functions provide the following first few values for the number of fixed simplified assembly trees. For the first entry

t_{1} (G) = 4

for the group G, the four fixed trees are shown in Figure 9A–D. For trees with eight leaves, there are

t_{2} (G) = 104

simplified assembly trees fixed by

G = Z_{2} \oplus Z_{2}

, and so on.

\begin{matrix} t_{n} (K_{0}) & : 1, 1, 4, 26, 236, 2752 \\ t_{n} (K_{i}) & : 1, 6, 72, 1312, 32128, 989696 \\ t_{n} (G) & : 4, 104, 4896, 341120, 31945728, 3790876672 . \end{matrix}
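
The first few table entries can be recomputed without a computer algebra system. The sketch below solves the three functional equations coefficient by coefficient with exact rational arithmetic, using the standard recurrence n e_n = Σ k u_k e_{n-k} for the coefficients of e = exp(u). It is an independent illustration, not the MAPLE computation used by the authors, and the helper names `solve`, `hat` and `counts` are assumptions.

```python
from fractions import Fraction
from math import factorial

def solve(known, n_max, trivial=False):
    """Coefficients of g with 1 + 2 g = exp(known + g).

    With trivial=True the equation is 1 - x + 2 g = exp(g) instead
    (pass an all-zero `known` series in that case).
    """
    g = [Fraction(0)] * (n_max + 1)
    u = [Fraction(0)] * (n_max + 1)   # u = known + g
    e = [Fraction(0)] * (n_max + 1)   # e = exp(u)
    e[0] = Fraction(1)
    for n in range(1, n_max + 1):
        # Part of e_n that only involves lower-order coefficients.
        rest = sum((k * u[k] * e[n - k] for k in range(1, n)), Fraction(0)) / n
        # Comparing coefficients: 2 g_n (- 1 if trivial and n == 1) = u_n + rest.
        g[n] = known[n] + rest + (1 if trivial and n == 1 else 0)
        u[n] = known[n] + g[n]
        e[n] = u[n] + rest
    return g

def hat(f, index):
    """Coefficients of (1 / index) * f(index * x)."""
    return [Fraction(0)] + [f[n] * index ** (n - 1) for n in range(1, len(f))]

def counts(series):
    """Turn exponential generating function coefficients into t_1, t_2, ..."""
    return [int(c * factorial(n)) for n, c in enumerate(series)][1:]

N = 6
zero = [Fraction(0)] * (N + 1)
f = solve(zero, N, trivial=True)                 # trivial group K_0
f_K = solve(hat(f, 2), N)                        # any subgroup K_i of order two
f_G = solve([a + 3 * b for a, b in zip(hat(f, 4), hat(f_K, 2))], N)  # G

print(counts(f))
print(counts(f_K))
print(counts(f_G))
```

The factor 3 in the last call uses the fact that the three subgroups of order two have the same generating function. The leading coefficients agree with the table above, for example 1, 1, 4, 26 for the trivial group and 4, 104 for G.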

Example 5.

(The icosahedral group acting on a viral capsid)

A symmetry of a polyhedron is a transformation in SE

(3)

that keeps the polyhedron, as a whole, fixed, and a direct symmetry is similarly defined. The icosahedral group is the group of direct symmetries of the icosahedron. It is a group of order 60 denoted

G_{60}

.

A viral capsid assembly configuration is modeled by a polyhedron P with icosahedral symmetry. Its set X of facets represents the protein monomers. The icosahedral group acts on P and, hence, on the set X. It follows from the so-called quasi-equivalence theory of the capsid structure that

G_{60}

acts freely on X. We have

| X | : = | X_{n} | = 60 n

, where n is the number of orbits in the action of the icosahedral group on X. Not every n is possible for a viral capsid; n must be a T-number, that is a number of the form

h^{2} + h k + k^{2}

, where h and k are nonnegative integers.
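
For concreteness, the admissible values of n can be listed with a short sketch. The function name `t_numbers` is illustrative, and the pair h = k = 0 is excluded on the assumption that an empty capsid is not of interest.

```python
def t_numbers(limit):
    """Sorted values h^2 + h*k + k^2 <= limit for nonnegative integers h, k.

    The pair (0, 0) is skipped (an assumption: n = 0 would mean no facets).
    """
    values = set()
    h = 0
    while h * h <= limit:
        k = 0
        while h * h + h * k + k * k <= limit:
            if (h, k) != (0, 0):
                values.add(h * h + h * k + k * k)
            k += 1
        h += 1
    return sorted(values)

for n in t_numbers(25):
    print(n, 60 * n)   # T-number and the corresponding |X| = 60 n
```

The first T-numbers are 1, 3, 4, 7, 9, 12 and 13, so, for instance, n = 2 and n = 5 do not occur.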

Note: An icosahedral viral capsid assembly configuration has a corresponding icosahedral active constraint graph. Additionally, the group

G_{60}

, viewed as a subgroup of the symmetric group

S_{60}

, is the automorphism group of this active constraint graph. As mentioned in the beginning of this section, we are interested in the orbits of simplified assembly trees under the action of this automorphism group. However, we continue to use the more intuitive view of

G_{60}

as a geometric group.

Before the number of simplified assembly trees can be enumerated, basic information about the icosahedral group is needed. The group

G_{60}

consists of:

the identity,
15 rotations of order 2 about axes that pass through the midpoints of pairs of diametrically opposite edges of P,
20 rotations of order 3 about axes that pass through the centers of diametrically opposite triangular faces and
24 rotations of order 5 about axes that pass through diametrically opposite vertices.

There are 59 subgroups of

G_{60}

that play a crucial role in the theory. Besides the two trivial subgroups, they are the following:

15 subgroups of order 2, each generated by one of the rotations of order 2,
10 subgroups of order 3, each generated by one of the rotations of order 3,
5 subgroups of order 4, each generated by rotations of order 2 about perpendicular axes,
6 subgroups of order 5, each generated by one of the rotations of order 5,
10 subgroups of order 6, each generated by a rotation of order 3 about an axis L and a rotation of order 2 that reverses L,
6 subgroups of order 10, each generated by a rotation of order 5 about an axis L and a rotation of order 2 that reverses L,
5 subgroups of order 12, each the symmetry group of a regular tetrahedron inscribed in P.
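
The two lists above can be cross-checked with simple arithmetic. The sketch below only uses the counts stated in the text; the dictionary names are illustrative. It also lists the indices 60/|H|, which by Theorem 5 are the only possible sizes of a simplified assembly pathway.

```python
# Rotation order -> number of rotations of that order, as listed in the text.
elements = {1: 1, 2: 15, 3: 20, 5: 24}
# Subgroup order -> number of subgroups of that order (including both trivial ones).
subgroups = {1: 1, 2: 15, 3: 10, 4: 5, 5: 6, 6: 10, 10: 6, 12: 5, 60: 1}

total_elements = sum(elements.values())
total_subgroups = sum(subgroups.values())

# Each subgroup of prime order p is cyclic and contains p - 1 elements of
# order p; distinct such subgroups share only the identity.
prime_order_consistent = all(subgroups[p] * (p - 1) == elements[p]
                             for p in (2, 3, 5))

# Possible pathway sizes: the index of each subgroup in the group of order 60.
pathway_sizes = sorted(60 // order for order in subgroups)

print(total_elements, total_subgroups, prime_order_consistent)
print(pathway_sizes)
```

The totals come out to 60 elements and 59 subgroups, and the possible pathway sizes are 1, 5, 6, 10, 12, 15, 20, 30 and 60.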

From the above geometric description of the subgroups, it follows that all subgroups of a given order are conjugate in the group

G_{60}

. Representatives of the conjugacy classes of the subgroups of the icosahedral group are denoted by

G_{0}, G_{2}, G_{3}, G_{5}, G_{6}, G_{10}, G_{12}, G_{60}

, where the subscript is the order of the group. The set of subgroups of

G_{60}

forms a lattice, ordered by inclusion. A partial Hasse diagram for this lattice

L

is shown in Figure 10. The number on the edge joining

G_{i}

(below) and

G_{j}

(above) indicates the number of distinct subgroups of order i contained in each subgroup of order j. The number in parentheses on the edge joining

G_{i}

(below) and

G_{j}

(above) indicates the number of distinct subgroups of order j containing each subgroup of order i. The Möbius function of

L

is shown in Figure 11. The entry in the table corresponding to the row labeled

G_{i}

and column

G_{j}

is

μ (G_{i}, G_{j})

.

Figure 10. Partial Hasse diagram for the lattice of subgroups of the icosahedral group.

Figure 11. The values of the Möbius function of the subgroup lattice of

G_{60}

.

Consider the case

| X | = 60

, i.e., for the

T = 1

capsid. Using Theorem 7 and MAPLE software, the generating functions

f_{G_{i}} (x)

were computed, and hence, their coefficients

t_{60 / i} (G_{i})

, which count simplified assembly trees that are fixed by any copy of

G_{i}

, were also computed. Note that, since

| X | = 60

, the number of orbits of

G_{i}

in its action on X is

60 / i

. Substituting these values into Theorem 6 and using the Möbius function values in Figure 11 yields the numerical values for

{\bar{t}}_{60 / i} (G_{i})

, the number of simplified assembly trees over X with

| X | = 60

that are fixed by

G_{i}

, but by no other elements of

G_{60}

. In other words, these are the numbers of trees whose stabilizer in

G_{60}

is exactly

G_{i}

. Substituting these numbers

\bar{t}

into Theorem 5, we arrive at the number of simplified assembly pathways of each possible size:

\begin{matrix} 204 & simplified assembly pathways of size 1 \\ \sim 168 e 8 & simplified assembly pathways of size 5 \\ \sim 223 e 9 & simplified assembly pathways of size 6 \\ \sim 613 e 17 & simplified assembly pathways of size 10 \\ \sim 102 e 17 & simplified assembly pathways of size 12 \\ \sim 334 e 28 & simplified assembly pathways of size 15 \\ \sim 504 e 31 & simplified assembly pathways of size 20 \\ \sim 835 e 51 & simplified assembly pathways of size 30 \\ \sim 320 e 99 & simplified assembly pathways of size 60 \end{matrix}

4. Open Questions

4.1. Enumeration Problems in (Non-Simplified) Assembly Framework

We are interested in the following enumeration problems related to the action of $Waut_{A}$ for the framework in Section 2:

(1): How does one compute the size of orbits/stabilizers and the number of orbits under $Waut_{A}$ for assembly configurations, active constraint graphs, active constraint regions, (coarse) assembly paths and assembly trees/forests?
(2): How does one compute the number of coarse assembly paths that correspond to a particular assembly tree/forest?
(3): Given two active constraint graphs G and H, where G and H are incomparable, i.e., $G ⊈ H$ and $H ⊈ G$ , how does one compute the number of paths between them?
(4): Given two active constraint graphs $G_{1}$ and $G_{m}$ , where $G_{1} ⊊ G_{m}$ , how does one compute the number of (coarse) assembly paths from $G_{1}$ to $G_{m}$ ?
(5): What are the orbits of the (coarse) assembly paths in (4) under the action of ${stab}_{Waut_{A}} (G_{m})$ ?
(6): What are the orbits of the (coarse) assembly paths in (4) under the action of the group H, where $H = Waut_{A}$ if ${stab}_{Waut_{A}} (G_{1}) = Waut_{A}$ (i.e., $G_{1}$ is the empty active constraint graph), or $H = {stab}_{Waut_{A}} (G_{1}) \cap {stab}_{Waut_{A}} (G_{m})$ otherwise?
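
For very small instances, the orbit and stabilizer sizes asked for in question (1) can be found by brute force, which at least makes the objects concrete. The sketch below is a toy under stated assumptions: it uses the full symmetric group on four vertex labels as a hypothetical stand-in for $Waut_{A}$ (whose actual structure depends on the bunches, see Section 2), and it represents an active constraint graph simply as a set of undirected edges.

```python
from itertools import permutations


def act(perm, graph):
    """Apply a vertex permutation (tuple: i -> perm[i]) to an edge set."""
    return frozenset(frozenset(perm[v] for v in edge) for edge in graph)


def orbit_and_stabilizer(group, graph):
    """Return (orbit, stabilizer) of graph under the given permutations."""
    orbit = {act(p, graph) for p in group}
    stabilizer = [p for p in group if act(p, graph) == graph]
    return orbit, stabilizer


# Hypothetical stand-in for Waut_A: all 24 permutations of 4 sphere labels.
group = list(permutations(range(4)))

# Toy active constraint graph: the path 0 - 1 - 2 (vertex 3 is unconstrained).
path = frozenset({frozenset({0, 1}), frozenset({1, 2})})

orbit, stabilizer = orbit_and_stabilizer(group, path)
print(len(orbit), len(stabilizer))
```

Here the orbit has 12 elements and the stabilizer has 2, consistent with the orbit-stabilizer theorem (12 · 2 = 24). Brute force of this kind does not scale, which is the point of the questions above.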

4.2. Symmetries within an Active Constraint Region via Cayley Configurations

So far, we have discussed the orbit of an active constraint region and active constraint graph and pointed out that it is sufficient to deal with a single orbit representative provided we are able to compute the multiplying factors associated with the size of the orbit, stabilizer, number of orbits, etc.

In fact, a single active constraint region could be decomposed into the union of nontrivial subregions that form the orbit of a fundamental region. This leads to enormous efficiencies in sampling and in the computation of volumes, which are currently hopelessly intractable in high dimensional configuration spaces, as discussed in Section 1.

Moreover, since the fundamental region itself could have subregions with varying orders of stabilizers, we could decompose into more than one orbit representative, with different stabilizers. In any case, sampling or computing the volume of an active constraint region is simplified by sampling these fundamental subregions and computing the size of their orbits.

One way to obtain such a decomposition of an active constraint region $R_{G}$ is via the locally complete Cayley (assembly) configurations $δ_{F}$ corresponding to the active constraint graph G. Convex Cayley configuration spaces highlight the key difference between assembly and other constraint systems, e.g., folding. This difference is captured in the combinatorial structure of active constraint graphs. A Cayley parameter for an active constraint region $R_{G}$ is a non-edge of its active constraint graph G. For specific sets of non-edges F, the set of vectors $λ_{F}$ of attainable lengths of F (in 3D realizations of a linkage $(G, δ)$ with underlying graph G and edge lengths δ) is always convex for any given lengths δ (that is, for all of the 3D realizations of the bar-joint constraint system or linkage $(G, δ)$). This set is called the (three-dimensional) Cayley configuration space of the linkage $(G, δ)$ on the Cayley parameters F, denoted $Φ_{F} (G, δ)$, and can be viewed as a “projection” of the space of pairwise distance vectors of realizations of $(G, δ)$ on the Cayley parameters F. Such graphs G are said to have convexifiable Cayley configuration spaces with parameters F. Convexity permits the use of convex programming techniques for improving the efficiency of sampling, search, volume computations, etc., for the configuration space.

The concept is best explained using key theorems of the first author in [59,63] discussed in Section 4.

We assume knowledge of common graph operations, such as k-sums and resulting partial k-trees, a minor-closed class (partial 2-trees are series-parallel graphs with a forbidden minor $K_{4}$).

Theorem 8.

[59] A graph H has a convexifiable Cayley configuration space with parameters F if and only if for each $f \in F$, all of the minimal two-sum components of $H \cup F$ that contain both endpoints of f are partial 2-trees. The Cayley configuration space $Φ_{F} (H, δ)$ of a bar-joint system or linkage $(H, δ)$ is a convex polytope. When $H \cup F$ is a 2-tree, the bounding hyperplanes of this polytope are triangle inequalities relating the lengths of edges of the triangles in $H \cup F$.
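
To make the last statement of Theorem 8 concrete, consider a single Cayley parameter f that is shared by several triangles of a 2-tree $H \cup F$. Each triangle contributes triangle inequalities on the length of f, and the feasible lengths form an interval, i.e., a one-dimensional convex polytope. The following is a minimal sketch, assuming that F contains only f and that the other two edges of each triangle lie in H with fixed lengths; the function name and the example lengths are illustrative and not taken from [59].

```python
def cayley_interval(triangles):
    """Feasible lengths of one Cayley parameter f.

    triangles: list of (a, b) pairs, the fixed lengths of the two edges that
    complete a triangle with f. Returns (lo, hi), or None if no length of f
    satisfies every triangle inequality.
    """
    lo = max(abs(a - b) for a, b in triangles)  # f >= |a - b| in each triangle
    hi = min(a + b for a, b in triangles)       # f <= a + b in each triangle
    return (lo, hi) if lo <= hi else None


# Toy linkage: the 4-cycle p-q-r-s with edge lengths pq=3, qr=4, rs=2, sp=6.
# Adding the non-edge f = pr gives two triangles, pqr and prs, glued along f.
print(cayley_interval([(3, 4), (2, 6)]))  # (4, 7)
```

With more than one Cayley parameter the same inequalities couple the parameters, and the interval becomes a higher-dimensional polytope.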

Note: A major advantage of the convex Cayley method is that sampling the configuration space can be effected by standard methods of convex programming. Another advantage is that the method is completely unaffected when δ are intervals rather than exact values [59]. A different characterization of inherent Cayley convexity for a graph G on a set F of non-edges, as above, has also been proven for higher dimensions d [59,64], showing equivalence to a minor-closed property of d-flattenability introduced in [65], and also for other, non-Euclidean distances (norms) in [63]. A graph H is d-flattenable if any realization of H in a normed space can be flattened into d-dimensional normed space (in the same norm) maintaining the same edge distances.

Theorem 9.

[63] A graph H is d-flattenable if and only if for every partition of H into $G \cup F$, G has a convex Cayley configuration space on F in d-dimensions.

4.2.1. Fundamental Regions of Active Constraint Regions

After G has been completed with the convexifying Cayley parameters F, the locally rigid graph $G \cup F$ typically loses symmetries present in G, i.e., the automorphism group is smaller. However, F can be replaced by any set of edges $π (F)$ for $π \in {stab}_{Waut_{A}} (G)$. Each locally complete Cayley configuration in the active constraint region G is of the form $δ_{F}$ (lengths of edges in F, where $G \cup F$ is rigid). Each Cartesian (assembly) configuration within an active constraint region with graph G corresponds bijectively to a globally-complete Cayley configuration $(δ_{F}, δ_{H})$ where $G \cup F$ is rigid and $G \cup F \cup H$ is globally rigid (or even $G \cup F \cup H$ is a complete graph).

Thus, when sampling the Cayley configuration space on F, one can find the boundaries of the fundamental regions of the corresponding Cartesian assembly configurations as follows. For a Cayley configuration $δ_{F}$, all of its generically finitely many real/Cartesian configurations can be obtained as various corresponding values of $δ_{H}$, which include the values of $δ_{π (F)}$. The boundary of a fundamental region occurs during sampling when we encounter a Cartesian (assembly) configuration c where the lengths of $π (F)$ correspond to already sampled lengths of F.
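
This boundary criterion can be illustrated with a deliberately simple planar toy, which is an assumption of this sketch and not one of the paper's examples. Take G to be a 4-cycle with unit edge lengths (a rhombus linkage), F = {(0, 2)} one diagonal, and π the rotation of the cycle, so that π(F) = {(1, 3)} is the other diagonal. In a unit rhombus the two diagonal lengths satisfy d02² + d13² = 4. Sampling d02 on an increasing grid, the sketch stops as soon as the length of π(F) falls within the range of lengths of F already sampled.

```python
import math


def other_diagonal(d02):
    """Length of diagonal (1, 3) of a unit rhombus whose diagonal (0, 2) is d02."""
    return math.sqrt(4.0 - d02 * d02)


def sample_fundamental_region(start=2, stop=20, scale=10):
    """Sample d02 = k / scale for k = start, ..., stop - 1.

    Stops at the first sample whose pi(F) length lies in the range of F
    lengths sampled so far; returns the samples kept before that point.
    """
    kept = []
    for k in range(start, stop):
        d02 = k / scale
        d13 = other_diagonal(d02)  # length of pi(F) in this configuration
        if kept and kept[0] <= d13 <= kept[-1]:
            break  # pi(F) revisits sampled lengths of F: region boundary
        kept.append(d02)
    return kept


region = sample_fundamental_region()
print(region[0], region[-1])
```

On this grid the samples kept run from 0.2 to 1.4, just below the square configuration d02 = d13 = √2, which is where the two diagonals exchange roles. In the assembly setting the check would have to range over every element of ${stab}_{Waut_{A}} (G)$ and every Cartesian configuration of a Cayley point.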

Note that there could be a different decomposition into fundamental regions, one for each Cartesian configuration (type) corresponding to the Cayley configuration. For example, for a different configuration $c^{'}$ from the configuration c above, the lengths of $π (F)$ may not correspond to already sampled lengths of F; or there could be another element $σ \in {stab}_{Waut_{A}} (G)$, with $σ \neq π$, where the lengths of $σ (F)$ in $c^{'}$ could correspond to already sampled lengths of F. In this manner, one can, in principle, algorithmically bound fundamental regions $R_{G}^{i}$ of the active constraint region $R_{G}$, by inspecting the assembly configurations corresponding to the Cayley configuration space on F, such that the active constraint region $R_{G}$ is the union of the orbits of the regions $R_{G}^{i}$ (under the action of ${stab}_{Waut_{A}} (G)$).

Efficiently finding these fundamental regions, as well as the number and sizes of their orbits, is an open question, whose answer would enormously reduce the complexity of configurational entropy computations for assembly.

4.3. g-Unfixable Unlabeled Trees

Call a tree g-unfixable if there is no leaf labeling for which the resulting labeled tree is fixed by the permutation g, and call a tree G-unfixable if it is g-unfixable for every nontrivial element g of the group G. A study of unlabeled trees that are g-unfixable may lead to relevant related results. These properties are interesting for at least two reasons. First, they clarify the minimum quantifiable information in a labeled tree that is necessary to decide if it is fixed by a group element g: if the underlying unlabeled tree is g-unfixable, then the information in the labeling is unnecessary to make this decision. Second, this may lead to efficient algorithms that use properties of the automorphism group of the tree to help in deciding whether a given labeled tree is fixed by the given group.
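
For small leaf sets, the definition can be checked by brute force over all leaf labelings. The sketch below assumes an encoding that is not taken from the paper: an unlabeled rooted tree shape is a nested tuple with `None` at each leaf, a labeled tree is the corresponding nested `frozenset` (so the order of children is irrelevant), and a permutation g is a dict on the leaf labels.

```python
from itertools import permutations


def label(shape, labels):
    """Fill the leaves (None) of a nested-tuple shape, left to right, from the
    iterator `labels`; internal nodes become frozensets of their children."""
    if shape is None:
        return next(labels)
    return frozenset(label(child, labels) for child in shape)


def apply(g, tree):
    """Relabel every leaf of a labeled tree by the permutation g (a dict)."""
    if isinstance(tree, frozenset):
        return frozenset(apply(g, child) for child in tree)
    return g[tree]


def count_leaves(shape):
    """Number of leaves of a nested-tuple shape."""
    return 1 if shape is None else sum(count_leaves(c) for c in shape)


def is_g_unfixable(shape, g):
    """True if no leaf labeling of `shape` is fixed by g."""
    n = count_leaves(shape)
    for perm in permutations(range(n)):
        tree = label(shape, iter(perm))
        if apply(g, tree) == tree:
            return False
    return True


caterpillar = ((None, None), None)  # shape ((., .), .)
star = (None, None, None)           # shape (., ., .)
three_cycle = {0: 1, 1: 2, 2: 0}
transposition = {0: 1, 1: 0, 2: 2}

print(is_g_unfixable(caterpillar, three_cycle))    # True
print(is_g_unfixable(caterpillar, transposition))  # False
print(is_g_unfixable(star, three_cycle))           # False
```

The caterpillar shape is unfixable for the 3-cycle because any fixed labeling would require the lone leaf to be a fixed point of g. Testing G-unfixability amounts to repeating the check for every nontrivial element of G; the factorial number of labelings makes this practical only for tiny trees.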

4.4. Depth of an Assembly Pathway

A result of [62] tells us that the orbit size of an assembly pathway is at least the depth of the pathway. The number of assembly pathways and orbit sizes of assembly trees that constitute a pathway must be taken into consideration in defining any probability space over pathways. It is conceivable that the dynamics of transitioning between states along a pathway, the resulting density of states that influences the configurational integral computation [66], and other such factors nullify the vast differences in symmetry-induced numeracy factors between pathways; however, that argument is yet to be made. The local rule theories using simple geometric rules, ODEs and other first principles physics-based simulations of the assembly of viral capsids [67,68,69,70,71,72,73,74,75,76,77] have been used to obtain the assembly kinetics, including rates and concentrations of intermediates, and implicitly provide a probability distribution over pathways. A cautionary note in [78] uses an ODE-based model of reaction kinetics to question simplistic models of assembly pathways. However, the model does not contradict the simple and transparent thesis that when symmetric structures form from identical units, the simple numeracy of orbit sizes of assembly trees must be taken into consideration in any theory predicting likely assembly pathways. This paper shows the rich intricacy of possible symmetries at play. We in fact conjecture that this symmetry factor increases with the depth of the pathway. Proving this conjecture would strengthen the motivation for studying the symmetry factor.

4.5. Other Questions

Theorems 5 and 6, as well as our successful computation effort in the special case of $| X | = 60$ and $T = 1$, can serve as a motivation to revisit the following questions, first raised in [62].

Given two symmetry-invariant properties, how does one compute the ratio of the number of pathways that satisfy both of these properties to the number of symmetry classes that satisfy only one of these properties?
What can we say about larger (icosahedrally) symmetric polyhedral graphs (larger T numbers of viral capsids, for example), fullerenes and fulleroids and polyhedra with different symmetry groups? In such cases, the computations of Section 3 can also be phrased as algorithmic questions, where the asymptotic complexity of the algorithm is expressed in terms of the number of facets of the polyhedron (or the T number).
To fully extend the techniques in Section 3 to the framework of Section 2, each sub-assembly must be a rigid subgraph of the graph at the root. Some assembly trees fail to satisfy the rigidity condition and can never occur (probability zero). Such assembly trees are geometrically invalid. In addition, a valid assembly tree can be assigned a non-zero probability according to how difficult it is to find a solution to the constraints on each sub-assembly. Computing this probability, called the geometric stability factor, is necessary to make the required predictions.
Dropping the rigidity requirement, but maintaining the subgraph (connectivity) requirement, in [79], two of the authors study the number of assembly trees of graphs on labeled vertices. In that model, each graph has a trivial automorphism group, but the enumeration of assembly trees still leads to the use of a recent and very powerful technique from the theory of D-finite power series in several variables.

Incorporating a nontrivial automorphism group of the graph could help understand the role of capsid symmetry in the RNA assembly model of [80], which posits that RNA viruses assemble by attaching to the internal (symmetry breaking) genome strand, since that would avoid having to deal with the prohibitive number of possible assembly pathways. It should be noted that in our precise and formal theory of assembly trees and their orbits (our pathways), assembly has an underlying partial order of stable intermediates that are influenced by the connectivity and rigidity; they are subgraphs of the underlying polyhedral graph given by active constraints. The informal definition of the pathway in [80] is a linear order (in our language, an assembly tree that is a path) given by a Hamiltonian circuit in the viral polyhedral (dual) graph. We are not aware of a clarification of why the interactions of a given monomer in the sequence to multiple other monomers besides the previous one in the sequence would be insignificant. If not, the assembly tree would indeed be a partial order as in our case, and the tree would have a minimum fan-in required for rigidity, reducing the number of assembly trees significantly and reducing the number of their symmetry classes or orbits further, whereby this number alone is not a significant reason to adopt an alternate model of assembly (such as RNA strand attachment) that cuts down the possible pathways.

As future work, we also aim to apply the symmetry framework developed in this paper to explain more experimental and theoretical results from previous literature.

5. Conclusions

In this paper, we developed a novel framework for symmetry in assembly under short range potentials and considered the symmetry groups of various objects studied in previous literature on assembly, including assembly configuration spaces, active constraint graphs, active constraint regions, assembly trees and pathways. The new Theorem 4 formalizes the containment relations between stabilizer subgroups of the active constraint graph and corresponding assembly configurations. We then demonstrated the new symmetry concepts to compute the sizes and numbers of orbits in two example settings appearing in previous work. The methods can improve efficiency for large systems with multiple identical bunches and spheres that have large order symmetry groups. The new symmetry framework helps formalize a number of questions for future work.

Acknowledgments

We thank Rahul Prabhu for his feedback and assistance in paper preparation. This work was partially supported by a grant from the Simons Foundation (#312515 to Andrew Vince).

Author Contributions

Miklós Bóna contributed to surveying earlier results in Section 3 and Section 4 from the earlier paper [49] by Bóna, Sitharam and Vince. Meera Sitharam is responsible for most of Section 1, Section 2 and Section 4 and contributed to the overall presentation and organization. Andrew Vince contributed to Section 2 and is responsible for Section 3, which is adapted from his contribution to the earlier paper [49] by Bóna, Sitharam and Vince. Menghan Wang contributed to Section 2 and is responsible for overall writing and revising.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Doye, J.P.; Wales, D.J. Structural consequences of the range of the interatomic potential: A menagerie of clusters. J. Chem. Soc. Faraday Trans. 1997, 93, 4233–4243.
2. Lazaridis, T.; Karplus, M. Effective energy function for proteins in solution. Proteins 1999, 35, 133–152.
3. Lazaridis, T. Effective energy function for proteins in lipid membranes. Proteins 2003, 52, 176–192.
4. Im, W.; Feig, M.; Brooks, C.L. An implicit membrane generalized Born theory for the study of structure, stability, and interactions of membrane proteins. Biophys. J. 2003, 85, 2900–2918.
5. Karplus, M.; Kushick, J. Method for estimating the configurational entropy of macromolecules. Macromolecules 1981, 14, 325–332.
6. Andricioaei, I.; Karplus, M. On the calculation of entropy from covariance matrices of the atomic fluctuations. J. Chem. Phys. 2001, 115, 6289.
7. Hnizdo, V.; Darian, E.; Fedorowicz, A.; Demchuk, E.; Li, S.; Singh, H. Nearest-neighbor nonparametric method for estimating the configurational entropy of complex molecules. J. Comput. Chem. 2007, 28, 655–668.
8. Hnizdo, V.; Tan, J.; Killian, B.J.; Gilson, M.K. Efficient calculation of configurational entropy from molecular simulations by combining the mutual-information expansion and nearest-neighbor methods. J. Comput. Chem. 2008, 29, 1605–1614.
9. Hensen, U.; Lange, O.F.; Grubmüller, H. Estimating Absolute Configurational Entropies of Macromolecules: The Minimally Coupled Subspace Approach. PLoS ONE 2010, 5.
10. Killian, B.J.; Yundenfreund Kravitz, J.; Gilson, M.K. Extraction of configurational entropy from molecular simulations via an expansion approximation. J. Chem. Phys. 2007, 127.
11. Head, M.S.; Given, J.A.; Gilson, M.K. Mining Minima: Direct Computation of Conformational Free Energy. J. Phys. Chem. A 1997, 101, 1609–1618.
12. Chirikjian, S.G. Chapter Four—Modeling Loop Entropy. In Computer Methods, Part C; Johnson, M.L., Brand, L., Eds.; Academic Press: Cambridge, MA, USA, 2011; Volume 487, pp. 99–132.
13. King, B.M.; Silver, N.W.; Tidor, B. Efficient calculation of molecular configurational entropies using an information theoretic approximation. J. Phys. Chem. B 2012, 116, 2891–2904.
14. Wales, D.J. Energy Landscapes: Applications to Clusters, Biomolecules and Glasses; Cambridge University Press: Cambridge, UK, 2003.
15. Doye, J.P.; Miller, M.A.; Wales, D.J. The double-funnel energy landscape of the 38-atom Lennard-Jones cluster. J. Chem. Phys. 1999, 110, 6896–6906.
16. Oakley, M.T.; Johnston, R.L.; Wales, D.J. Symmetrisation schemes for global optimization of atomic clusters. Phys. Chem. Chem. Phys. 2013, 15, 3965–3976.
17. Wales, D.J. Surveying a complex potential energy landscape: Overcoming broken ergodicity using basin-sampling. Chem. Phys. Lett. 2013, 584, 1–9.
18. Morgan, J.W.; Wales, D.J. Energy landscapes of planar colloidal clusters. Nanoscale 2014, 6, 10717–10726.
19. Kusumaatmaja, H.; Whittleston, C.S.; Wales, D.J. A Local Rigid Body Framework for Global Optimization of Biomolecules. J. Chem. Theory Comput. 2012, 8, 5159–5165.
20. Rühle, V.; Kusumaatmaja, H.; Chakrabarti, D.; Wales, D.J. Exploring Energy Landscapes: Metrics, Pathways, and Normal-Mode Analysis for Rigid-Body Molecules. J. Chem. Theory Comput. 2013, 9, 4026–4034.
21. Baxter, R. Percus-Yevick equation for hard spheres with surface adhesion. J. Chem. Phys. 1968, 49.
22. Stell, G. Sticky spheres and related systems. J. Stat. Phys. 1991, 63, 1203–1221.
23. Miller, M.; Frenkel, D. Competition of percolation and phase separation in a fluid of adhesive hard spheres. Phys. Rev. Lett. 2003, 90.
24. Holmes-Cerfon, M.; Gortler, S.J.; Brenner, M.P. A geometrical approach to computing free-energy landscapes from short-ranged potentials. Proc. Natl. Acad. Sci. USA 2013, 110, E5–E14.
25. Ozkan, A.; Sitharam, M. EASAL: Efficient Atlasing and Search of Assembly Landscapes. In Proceedings of the BiCoB Symposium, New Orleans, LA, USA, 23–25 March 2011.
26. Ozkan, A.; Pence, J.; Peters, J.; Sitharam, M. EASAL: Theory and Algorithms for Efficient Atlasing and Search of Assembly Landscapes. 2012; in preparation.
27. Arkus, N.; Manoharan, V.N.; Brenner, M.P. Minimal energy clusters of hard spheres with short range attractions. Phys. Rev. Lett. 2009, 103.
28. Wales, D.J. Energy Landscapes of Clusters Bound by Short-Ranged Potentials. ChemPhysChem 2010, 11, 2491–2494.
29. Beltran-Villegas, D.J.; Bevan, M.A. Free energy landscapes for colloidal crystal assembly. Soft Matter 2011, 7, 3280–3285.
30. Calvo, F.; Doye, J.P.; Wales, D.J. Energy landscapes of colloidal clusters: Thermodynamics and rearrangement mechanisms. Nanoscale 2012, 4, 1085–1100.
31. Khan, S.J.; Weaver, O.L.; Sorensen, C.M.; Chakrabarti, A. Nucleation in short-range attractive colloids: Ordering and symmetry of clusters. Langmuir ACS J. Surf. Coll. 2012, 28, 16015–16021.
32. Hoy, R.S.; Harwayne-Gidansky, J.; O’Hern, C.S. Structure of finite sphere packings via exact enumeration: Implications for colloidal crystal nucleation. Phys. Rev. E 2012, 85.
33. Hoy, R.S. Structure and dynamics of model colloidal clusters with short-range attractions. Phys. Rev. E 2015, 91.
34. Martin, S.; Thompson, A.; Coutsias, E.A.; Watson, J.P. Topology of cyclo-octane energy landscape. J. Chem. Phys. 2010, 132.
35. Jaillet, L.; Corcho, F.J.; Pérez, J.J.; Cortés, J. Randomized tree construction algorithm to explore energy landscapes. J. Comput. Chem. 2011, 32, 3464–3474.
36. Porta, J.M.; Ros, L.; Thomas, F.; Corcho, F.; Cantó, J.; Pérez, J.J. Complete maps of molecular-loop conformational spaces. J. Comput. Chem. 2007, 28, 2170–2189.
37. Amato, N.M.; Song, G. Using motion planning to study protein folding pathways. J. Comput. Biol. 2002, 9, 149–168.
38. Gfeller, D.; De Lachapelle, D.M.; De Los Rios, P.; Caldarelli, G.; Rao, F. Uncovering the topology of configuration space networks. Phys. Rev. E Stat. Nonlinear Soft Matter Phys. 2007, 76.
39. Varadhan, G.; Kim, Y.J.; Krishnan, S.; Manocha, D. Topology preserving approximation of free configuration space. Robotics 2006, 3041–3048.
40. Lai, Z.; Su, J.; Chen, W.; Wang, C. Uncovering the Properties of Energy-Weighted Conformation Space Networks with a Hydrophobic-Hydrophilic Model. Int. J. Mol. Sci. 2009, 10, 1808–1823.
41. Prada-Gracia, D.; Gómez-Gardenes, J.; Echenique, P.; Falo, F. Exploring the Free Energy Landscape: From Dynamics to Networks and Back. PLoS Comput. Biol. 2009, 5.
42. Yao, Y.; Sun, J.; Huang, X.; Bowman, G.R.; Singh, G.; Lesnick, M.; Guibas, L.J.; Pande, V.S.; Carlsson, G. Topological methods for exploring low-density states in biomolecular folding pathways. J. Chem. Phys. 2009, 130.
43. Hoffmann, C.M.; Lomonosov, A.; Sitharam, M. Planning Geometric Constraint Decompositions Via Graph Transformations. In AGTIVE ’99 (Graph Transformations with Industrial Relevance); Nagl, M., Schurr, A., Munch, M., Eds.; Springer: Kerkrade, The Netherlands, 1999; Volume 1779, pp. 309–324.
44. Hoffmann, C.M.; Lomonosov, A.; Sitharam, M. Decomposition of geometric constraints systems, Part I: Performance measures. J. Symb. Comput. 2001, 31, 367–408.
45. Hoffmann, C.M.; Lomonosov, A.; Sitharam, M. Decomposition of geometric constraints systems, Part II: New algorithms. J. Symb. Comput. 2001, 31, 409–427.
46. Sitharam, M.; Agbandje-McKenna, M. Modeling virus assembly using geometric constraints and tensegrity: Avoiding dynamics. J. Comput. Biol. 2006, 13, 1232–1265.
47. Carvalho-Santos, Z.; Machado, P.; Branco, P.; Tavares-Cadete, F.; Rodrigues-Martins, A.; Pereira-Leal, J.B.; Bettencourt-Dias, M. Stepwise evolution of the centriole-assembly pathway. J. Cell Sci. 2010, 123, 1414–1426.
48. Wales, D.J. Energy landscapes: Calculating pathways and rates. Int. Rev. Phys. Chem. 2006, 25, 237–282.
49. Bóna, M.; Sitharam, M.; Vince, A. Enumeration of viral capsid assembly pathways: Tree orbits under permutation group action. Bull. Math. Biol. 2011, 73, 726–753.
50. Bunker, P.R.; Jensen, P. Fundamentals of Molecular Symmetry; CRC Press: Boca Raton, FL, USA, 2004.
51. Cotton, F.A. Chemical Applications of Group Theory; John Wiley & Sons: Hoboken, NJ, USA, 2008.
52. Bonchev, D.; Rouvray, D. Chemical Group Theory: Techniques and Applications; Taylor & Francis: London, UK, 1995; Volume 4.
53. Kerber, A.; Laue, R.; Meringer, M.; Rücker, C.; Schymanski, E. Mathematical Chemistry and Chemoinformatics: Structure Generation, Elucidation and Quantitative Structure-Property Relationships; Walter de Gruyter: Berlin, Germany, 2013.
54. Pólya, G.; Read, R.C. Combinatorial Enumeration of Groups, Graphs, and Chemical Compounds; Springer-Verlag New York, Inc.: New York, NY, USA, 1987.
55. Altmann, S.L. Induced Representations in Crystals and Molecules; Academic Press: Cambridge, MA, USA, 1977.
56. Hahn, T.; Shmueli, U.; Wilson, A.J.C.; Prince, E. International Tables for Crystallography; D. Reidel Publishing Company: Dordrecht, The Netherlands, 2005.
57. Bunker, P.R.; Jensen, P. Molecular Symmetry and Spectroscopy; NRC Research Press: Ottawa, Canada, 1998.
58. Balasubramanian, K. Generating functions for the nuclear spin statistics of nonrigid molecules. J. Chem. Phys. 1981, 75, 4572–4585.
59. Sitharam, M.; Gao, H. Characterizing Graphs with Convex Cayley Configuration Spaces. Discret. Comput. Geom. 2010, 43, 594–625.
60. EASAL video. Available online: http://www.cise.ufl.edu/~sitharam/EASALvideo.mpg (accessed on 20 January 2016).
61. Graver, J.E.; Servatius, B.; Servatius, H. Combinatorial Rigidity; Graduate Studies in Math, AMS: Providence, RI, USA, 1993.
62. Bóna, M.; Sitharam, M. The influence of symmetry on the probability of assembly pathways for icosahedral viral shells. Comput. Math. Methods Med. 2008, 9, 295–302.
63. Sitharam, M.; Willoughby, J. On Flattenability of Graphs. In Automated Deduction in Geometry; Botana, F., Quaresma, P., Eds.; Springer International Publishing: New York, NY, USA, 2015; Volume 9201, pp. 129–148.
64. Cheng, J. Towards Combinatorial Characterizations and Algorithms for Bar-And-Joint Independence and Rigidity in 3D and Higher Dimensions. Ph.D. Thesis, University of Florida, Gainesville, FL, USA, 2013.
65. Belk (Sloughter), M.; Connelly, R. Realizability of Graphs. Discret. Comput. Geom. 2007, 37, 125–137.
66. Wales, D. Perspective: Insight into reaction coordinates and dynamics from the potential energy landscape. J. Chem. Phys. 2015, 142.
67. Schwartz, R.; Shor, P.; Prevelige, P.; Berger, B. Local rules simulation of the kinetics of virus capsid self-assembly. Biophys. J. 1998, 75, 2626–2636.
68. Berger, B.; Shor, P.; King, J.; Muir, D.; Schwartz, R.; Tucker-Kellogg, L. Local rule-based theory of virus shell assembly. Proc. Natl. Acad. Sci. USA 1994, 91, 7732–7736.
69. Berger, B.; Shor, P. On the Mathematics of Virus Shell Assembly; Technical Report; Massachusetts Institute of Technology: Cambridge, MA, USA, 1994.
70. Berger, B.; Shor, P.W. Local rules switching mechanism for viral shell geometry. Discret. Appl. Math. 2000, 104, 97–111.
71. Schwartz, R.; Prevelige, P.; Berger, B. Local Rules Modeling of Nucleation-Limited Virus Capsid Assembly; Technical Report, MIT-LCS-TM-584; Massachusetts Institute of Technology: Cambridge, MA, USA, 1998.
72. Reddy, V.S.; Giesing, H.A.; Morton, R.T.; Kumar, A.; Post, C.B.; Brooks, C.L.; Johnson, J.E. Energetics of quasiequivalence: Computational analysis of protein-protein interactions in icosahedral viruses. Biophys. J. 1998, 74, 546–558.
73. Zlotnick, A. To build a virus capsid: An equilibrium model of the self assembly of polyhedral protein complexes. J. Mol. Biol. 1994, 241, 59–67.
74. Marzec, C.J.; Day, L.A. Pattern formation in icosahedral virus capsids: The papova viruses and nudaurelia capensis β virus. Biophys. J. 1993, 65, 2559–2577.
75. Rapaport, D.; Johnson, J.; Skolnick, J. Supramolecular self-assembly: Molecular dynamics modeling of polyhedral shell formation. Comput. Phys. Commun. 1998, 121–122, 231–235.
76. Johnson, J.E.; Speir, J.A. Quasi-equivalent viruses: A paradigm for protein assemblies. J. Mol. Biol. 1997, 269, 665–675.
77. Keef, T.; Micheletti, C.; Twarock, R. Master equation approach to the assembly of viral capsids. J. Theor. Biol. 2006, 242, 713–721.
78. Misra, N.; Lees, D.; Zhang, T.; Schwartz, R. Pathway complexity of model virus capsid assembly systems. Comput. Math. Methods Med. 2008, 9, 277–293.
79. Vince, A.; Bóna, M. The Number of Ways to Assemble a Graph. Electron. J. Comb. 2012, 19.
80. Stockley, P.G.; Twarock, R.; Bakker, S.E.; Barker, A.M.; Borodavka, A.; Dykeman, E.; Ford, R.J.; Pearson, A.R.; Phillips, S.E.; Ranson, N.A. Packaging signals in single-stranded RNA viruses: Nature’s alternative to a purely electrostatic assembly mechanism. J. Biol. Phys. 2013, 39, 277–287.

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).
