#### **3. Discussion**

Processivity is a basic device of enzymes working on (generating, modifying or moving along) polymeric substrates [1]. By its very molecular logic, it increases cellular economy by limiting the production of metabolic by-products and the dissipation of energy, and it enables large-scale molecular changes to occur; it is thus at the heart of many key cellular processes. Because processive enzymes operate in an all-or-none manner, however, turning them on requires very precise and highly controlled cellular mechanisms.

As outlined, diverse molecular mechanisms underlie processivity, and they fall into two general categories: structural confinement by well-folded binding elements, and spatial confinement by independent binding elements connected through a linker region. The latter mechanism is apparent in dimeric mechanochemical motors and also in monomeric enzymes. The importance of the general kinetic consequence of processivity can be deduced from its convergent appearance in many independent systems. Whereas its mechanistic underpinning is rather well understood for enzymes that rely on structural confinement, and has also been analyzed rather extensively for mechanochemical motors, it has so far been largely overlooked for monomeric enzymes.

The typical design of such enzymes is embodied by certain bacterial cellulases, which have a modular structure that combines a large CD linked to a smaller CBM by an intrinsically disordered linker [39] that enables a continuum of conformations. A similar feature has been suggested for the matrix metalloproteinase MMP-9 [33,37], which progressively degrades polymeric components of the extracellular matrix, such as collagen. This enzyme also has a modular structure, with an N-terminal unit of a catalytic domain and three fibronectin type II exosite modules, connected by a 54-residue-long linker to a C-terminal hemopexin C domain. SAXS and AFM demonstrated that it can assume multiple conformations and that it can crawl in an inchworm-like manner along its substrate [57]. A similar architecture has been suggested and/or theoretically modelled in the case of glycohydrolases, such as Cel7A [58], cellobiohydrolase I [59] and chitinases [60]. The importance of this arrangement is underscored by cellobiohydrolase I, in which deletion of the linker dramatically reduces the rate of crystalline cellulose degradation [32], and by other glycoside hydrolases, in which removal of the carbohydrate-binding module results in a significant decrease in activity [6] without directly affecting the catalytic domain. Apparently, the unifying feature of all these examples is the structural disorder of their linkers, which ensures a high local concentration and a relatively restricted conformational search of binding domains around their binding sites.

Here, we used statistical-kinetic modelling of such systems to show that this structural arrangement can endow such an enzyme with the capacity for processive movement along a polymeric substrate of spatially repeating binding sites. We characterized these enzymes by the time of (re)binding as a function of linker length, and found that within a certain length range, they have a preference for binding over dissociation, i.e., they show processive kinetic behavior. Geometric features of the domains, direct binding of the linker to the domains themselves and PTMs of the linkers all influence binding kinetics and may thus serve as points of regulatory input. This is of no small importance, as the processive chain of events past the point of activation appears uncontrolled, which may have dire consequences. A proper regulatory input halting the reaction may be a remedy under some circumstances, as suggested by frequent PTMs of processive linkers (Table 2) and their regulated binding to the flanking domains, as shown for MMP-9, for example [33].
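The dependence of rebinding on linker length can be illustrated with a textbook polymer estimate. The sketch below is not the statistical-kinetic model used in this work; it assumes an ideal Gaussian chain with mean-square end-to-end distance $\langle r^2 \rangle = 2 L_p L_c$ and computes the effective concentration of the free domain at a binding site a fixed distance from the anchored domain. The function names, the residue length (0.38 nm), the persistence length (0.4 nm) and the 5 nm site distance are illustrative assumptions, not values taken from the paper.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def mean_square_end_to_end_nm2(n_residues, residue_length_nm=0.38,
                               persistence_length_nm=0.4):
    """<r^2> = 2 * Lp * Lc, the long-chain limit of a worm-like chain.

    Both default parameters are assumed, typical values for disordered
    peptides; they are not taken from the paper.
    """
    contour_length_nm = n_residues * residue_length_nm
    return 2.0 * persistence_length_nm * contour_length_nm


def effective_concentration_molar(n_residues, site_distance_nm, **chain):
    """Gaussian-chain density of the free domain at the next site, in mol/L."""
    r2 = mean_square_end_to_end_nm2(n_residues, **chain)
    density_per_nm3 = ((3.0 / (2.0 * math.pi * r2)) ** 1.5
                       * math.exp(-3.0 * site_distance_nm ** 2 / (2.0 * r2)))
    # 1 nm^3 = 1e-24 L, so divide by (N_A * 1e-24) to convert to mol/L.
    return density_per_nm3 / (AVOGADRO * 1e-24)


def optimal_linker_length(site_distance_nm, lengths=range(1, 301), **chain):
    """Linker length (in residues) that maximizes the effective concentration."""
    return max(lengths,
               key=lambda n: effective_concentration_molar(
                   n, site_distance_nm, **chain))


if __name__ == "__main__":
    distance_nm = 5.0  # hypothetical spacing between adjacent binding sites
    for n in (10, 40, 80, 160, 300):
        c = effective_concentration_molar(n, distance_nm)
        print(f"{n:4d} residues: {c:.2e} M")
    print("optimum:", optimal_linker_length(distance_nm), "residues")
```

Under these assumptions the effective concentration is non-monotonic in linker length: very short linkers cannot reach the next site, very long ones dilute the free domain over a large volume, and the maximum lies where $\langle r^2 \rangle$ equals the squared site distance. This is qualitatively in line with a limited range of linker lengths favoring rebinding over dissociation, although the real systems also depend on domain geometry and linker-domain contacts that this sketch ignores.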

These theoretical observations have general relevance and are supported by a collection of 12 such enzymes that all have highly disordered linkers. Notably, despite the rapid evolution and sequence variability of IDPs/IDRs in general, and of disordered linker regions in particular, the length and flexibility of linkers in the processive enzymes are conserved. Quantitative modelling of the cellulase enzymes is in general agreement with the observed level of processivity and suggests that this functional-kinetic property is manifest in a relatively limited range of linker lengths, which appear to have co-evolved with the particular step size along their typical substrate. This has also been suggested by the behavior of the related mechanochemical motors kinesin-1 and kinesin-2, the degree of processivity of which changes sharply when the length of their linker regions is changed [15]. This feature is also underlined by the observation that short and long linkers are entirely missing in DLD-type processive enzymes.

In a broader functional context, we suggest that this observed behavior is a special case of the entropic chain functions of IDPs/IDRs and appears as a conceptual extension of mechanisms such as the fly-casting [27] and monkey-bar [28] mechanisms. Processivity appears to draw on all these mechanisms and may represent one of the primary benefits of the flexibility emanating from structural disorder [25,61]. This type of function cannot be supported by a structured protein; it is thus an appealing addition to the functional arsenal of structural disorder, and understanding it may even enable the design and generation of enzymes of improved capacity for the needs of biotechnology.

#### **4. Data and Methods**

#### *4.1. Collection of Processive Enzymes and Intrinsically Disordered Proteins*

Processive enzymes were collected from the literature by searching for the keywords "processive" or "processivity." We aimed for full coverage of all types of processive enzymes, which resulted in 47 illustrative examples (Table S1), many of which were covered previously [1]. From this collection we selected 12 monomeric enzymes for further analysis (Table 1). Due to their dominant modular arrangement, we term these monomeric processive enzymes domain-linker-domain (DLD) type. For comparative purposes, we also downloaded 1274 IDP/IDR sequences from the DisProt database (version 7.0) and selected 133 of the IDRs annotated as "linkers" [44].
