3.3.1. Correlation Coefficient

The Pearson correlation coefficient (CC) *rxy* can be computed for any model as follows

$$r\_{xy} = \frac{\sum\_{i=1}^{n} (x\_i - \bar{x})(y\_i - \bar{y})}{\sqrt{\sum\_{i=1}^{n} (x\_i - \bar{x})^2} \sqrt{\sum\_{i=1}^{n} (y\_i - \bar{y})^2}} \tag{7}$$

where *n* is the sample size, *xi*, *yi* are the individual sample points (i.e., paired instances) indexed with *i* for a pair of random variables (*X*,*Y*), and

$$\bar{x} = \frac{1}{n} \sum\_{i=1}^{n} x\_i \tag{8}$$

is the sample mean of *X*. The sample mean of *Y* is obtained similarly as

$$\bar{y} = \frac{1}{n} \sum\_{i=1}^{n} y\_i \tag{9}$$

The value of *rxy* ranges from −1 to 1. A value of 1 or −1 means that the relationship between *X* and *Y* is described exactly by a linear equation, so that all data points fall on a line. The sign of the correlation matches the sign of the regression slope: a + sign means that *Y* tends to increase as *X* increases, and a − sign means that *Y* tends to decrease as *X* increases. The case of *rxy* = 0 means that no linear correlation exists between *X* and *Y*, although a nonlinear relationship may still be present. Intermediate values (i.e., 0 < *rxy* < 1 and −1 < *rxy* < 0) describe imperfect linear correlations, with values closer to 1 or −1 generally representing a better model, depending on the context and purpose of the experiment.
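
As an illustration, Equations (7)–(9) can be sketched in a few lines of plain Python. This is a minimal sketch rather than the implementation used in this work: the function name `pearson_r` and the sample data are hypothetical, and the choice to raise an error when either variable is constant (where the denominator of Equation (7) is zero) is an assumption made here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of paired samples, per Eq. (7).

    `pearson_r` is a hypothetical helper name used for illustration only.
    """
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of samples")
    n = len(x)
    if n < 2:
        raise ValueError("at least two paired samples are required")

    # Sample means, Eqs. (8) and (9).
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Numerator of Eq. (7): sum of products of the deviations.
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Sums of squared deviations under the square roots in Eq. (7).
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)

    # Assumption: treat a constant variable as an error, since r is undefined.
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")

    return cross / (math.sqrt(sxx) * math.sqrt(syy))


# Hypothetical sample data illustrating the cases discussed in the text.
x = [1.0, 2.0, 3.0, 4.0]
r_increasing = pearson_r(x, [2.0, 4.0, 6.0, 8.0])    # points on a rising line
r_decreasing = pearson_r(x, [8.0, 6.0, 4.0, 2.0])    # points on a falling line
r_parabola = pearson_r(x, [(xi - 2.5) ** 2 for xi in x])  # symmetric parabola
```

With these inputs, `r_increasing` and `r_decreasing` come out close to 1 and −1, respectively, while `r_parabola` is 0 even though *Y* is fully determined by *X*. The last case shows why a zero coefficient should be read as the absence of a linear relationship only.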
