#### *2.3. Methodology to Generate N(d') Plots*

A second analysis involved plotting the hit fraction (*N*) for the subset with α = 160°–180° as a function of the van der Waals corrected H3C···ElR distance *d'* (i.e., *d* − vdW(C) − vdW(ElR)) [70], giving *N*(*d'*) plots. Such distributions show how much of the data is involved in van der Waals overlap with the methyl C-atom along the vector of the X–CH3 bond and how these data are distributed. For attractively interacting pairs, this distribution is expected to exhibit a peak-like feature, or an S-like curvature when the cumulative hit fraction is used.
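The construction of an *N*(*d'*) distribution can be illustrated with a minimal Python sketch. It is not the procedure used to produce the plots in this work: the function names, the bin edges and the use of Bondi van der Waals radii (C = 1.70 Å, N = 1.55 Å, O = 1.52 Å) are assumptions made purely for illustration, and the input distances are expected to be pre-filtered to the α = 160°–180° subset.

```python
import numpy as np

# Bondi van der Waals radii in Å (an assumption; the radii used in the
# original analysis are not stated in this section).
VDW = {"C": 1.70, "N": 1.55, "O": 1.52}


def corrected_distance(d, el="O"):
    """Return d' = d - vdW(C) - vdW(El); negative values mean vdW overlap."""
    return np.asarray(d, dtype=float) - VDW["C"] - VDW[el]


def hit_fraction(d_prime, edges):
    """Return the per-bin and the cumulative hit fraction N(d')."""
    counts, _ = np.histogram(d_prime, bins=edges)
    total = counts.sum()
    # Guard against an empty selection to avoid dividing by zero.
    frac = counts / total if total else np.zeros(len(counts))
    return frac, np.cumsum(frac)


# Hypothetical C...O distances (Å) for hits with alpha = 160-180 degrees.
d = [3.0, 3.2, 3.3, 3.6, 4.0]
d_prime = corrected_distance(d, el="O")
edges = np.linspace(-0.4, 1.0, 8)  # 0.2 Å wide bins (hypothetical choice)
frac, cumulative = hit_fraction(d_prime, edges)
overlap = float(np.mean(d_prime <= 0.0))  # share of hits with vdW overlap
```

A peak in `frac`, or an S-shaped `cumulative` curve, would correspond to the features described above for attractively interacting pairs.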

#### *2.4. Computational Methods*

DFT geometry optimization calculations were performed with Spartan 2016 at the B3LYP [71,72]-D3 [73]/def2-TZVP [74,75] level of theory, which is known to give accurate results at reasonable computational cost and with a very low basis set superposition error (BSSE) [74,75]. The typical starting geometry for possible carbon bonding adducts was set to *d'* = −0.1 Å and α = 180°, and in the case of dimethylacetamide the C···O=C angle was also set to 180°. The geometry optimizations were performed without any constraints. For other geometries (e.g., an H-bonded geometry), the molecular fragments were manually placed in a suitable orientation before starting an unconstrained geometry optimization. The Amsterdam Density Functional (ADF) [76] modelling suite at the B3LYP [71,72]-D3 [73]/TZ2P [74,75] level of theory (no frozen cores) was used for energy decomposition and 'atoms in molecules' [77] analyses. Details of the Morokuma-Ziegler inspired energy decomposition scheme used in the ADF suite have been reported elsewhere [76,78], and the scheme has proven useful for evaluating hydrogen bonding interactions [79].
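The starting geometry described above (*d'* = −0.1 Å, α = 180°) amounts to placing the ElR atom on the extension of the X–C bond vector. The following Python sketch shows one way such a position could be computed. It is an illustration only: the function `place_acceptor`, the Bondi radii and the example coordinates are assumptions, α is taken to be the X–C···ElR angle, and no Spartan or ADF input is generated.

```python
import numpy as np

# Bondi van der Waals radii in Å (assumed values, for illustration only).
VDW_C, VDW_O = 1.70, 1.52


def place_acceptor(x_xyz, c_xyz, d_prime=-0.1, vdw_el=VDW_O):
    """Place the El atom on the extension of the X->C vector.

    The X-C...El angle is then 180 degrees and the C...El distance equals
    d' + vdW(C) + vdW(El).
    """
    x = np.asarray(x_xyz, dtype=float)
    c = np.asarray(c_xyz, dtype=float)
    axis = (c - x) / np.linalg.norm(c - x)  # unit vector pointing from X to C
    d = d_prime + VDW_C + vdw_el            # uncorrected C...El distance
    return c + d * axis


# Hypothetical X-CH3 fragment with the X-C bond along z (bond length 1.5 Å).
el_xyz = place_acceptor([0.0, 0.0, 0.0], [0.0, 0.0, 1.5])
```

With the assumed radii, the oxygen atom sits 3.12 Å from the methyl carbon; the subsequent unconstrained optimization is then free to move away from this arrangement.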

#### **3. Results and Discussion**

#### *3.1. P(*α*) Plots*

A numerical overview of the number of crystallographic information files (CIFs) and Protein Data Bank files (PDBs) for each search query is given in Table S1, together with the number of hits found in each dataset (a .cif or .pdb file can contain multiple hits). Figure 2 shows the *P*(α) plots for the CSD (left) and PDB (right) data, plotted at 5° intervals, involving X = C, N, O and ElR = water-O (top) or amide-O (bottom). These datasets were chosen because they all contained a large number of hits (>7,500) and thus allow for the most reliable comparison. A complete set of *P*(α) plots is provided in Figure S1.

**Figure 2.** *P*(α) directionality plots for the data retrieved from the CSD (left) and the PDB (right) using the general query shown in the top-right inset figure for X–CH3···ElR pairs. X can be C, N or O and 'ElR' can be a water or an amide O-atom. The inset figure in the top left is intended as a guide to the eye to interpret the spatial location of data with a certain value of α. Due to the amount of data per dataset (see Table S1 for a numerical overview), the plots are given at a 5° resolution for α. A full set of *P*(α) plots (i.e., for all the X *vs* ElR pairs in Figure 1) is given in Figure S1. The *P* value of 1 is highlighted in green and indicates a random distribution of data. *N* (CSD/PDB) = 46,000/29,508 (C, water); 18,170/17,101 (N, water); 7,190/11,392 (O, water); 53,473/22,538 (C, amide); 7,663/10,855 (N, amide); 9,158/11,064 (O, amide).

The data plotted in Figure 2 largely trace the line at *P* = 1 (highlighted in green), which is indicative of a random distribution of data. For X = O and N, the *P*-values are somewhat above unity around α = 160°–180° for water O-atoms in both databases and for amide O-atoms in the CSD. The maximum *P*-values are small, at about 1.5, which indicates only weak directionality. For comparison, maximum *P*-values for several weak intermolecular interactions are: ~2.5 for CH-π [11,65]; ~3 for π interactions with nitro compounds [42,68]; ~2.5–5 for anion-π and lone-pair-π [29,64]; ~2.5–10 for halogen-π [67]; and also about 2.5–10 for halogen bonding with aryl-halogens [66]. Interestingly, the *P*-values did not peak near α = 90°–120°, an angular range congruent with hydrogen bonding. These data thus suggest that the carbon bonding geometry is more directional than a hydrogen bonding geometry, although this directionality is very weak. In all cases for X = C, the *P*-values around α = 160°–180° are below unity, suggesting that the carbon bonding geometry is least favored in these instances. The data for ElR = RCO2 and RyCS are very similarly distributed, and the data for aryl rings are skewed towards α = 160°–180° only for the CSD data (for X = C, N, O, P and S, see Figure S1). For all other cases where X = P, very few hits were obtained (the most numerous was RCO2 in the CSD with *N* = 1,454). While some datasets with X = S were of a reasonable size, too many were well below *N* = 7,500, and these data are thus not discussed in the main text (there is a brief discussion in the caption of Figure S1).
