3.3.2. Longitudinal Analysis

For RQ3, correlation was used to determine whether any statistically significant trend existed over the 12 years of the study. Pearson's correlation coefficient (r) is a measure of association between two interval or ratio variables, in this case the number of accidents and the year. The statistical hypotheses to be tested are given as

> H0: r = 0; HA: r ≠ 0
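As an illustration of the test above, the following minimal sketch computes Pearson's r between year and accident count, and converts it to the usual t statistic, t = r·sqrt((n − 2)/(1 − r²)), which is compared against a t distribution with n − 2 degrees of freedom. The yearly counts are hypothetical placeholders and not the study data, and the function names `pearson_r` and `t_from_r` are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for H0: r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical placeholder data: 12 study years and made-up accident counts.
years = list(range(1, 13))
counts = [120, 118, 115, 110, 112, 105, 100, 102, 98, 95, 96, 90]

r = pearson_r(years, counts)
t_stat = t_from_r(r, len(years))
print(f"r = {r:.3f}, t = {t_stat:.3f}, df = {len(years) - 2}")
```

A negative r with a large |t| would indicate a statistically significant downward trend in the counts over the study period.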

The correlation test assesses whether there is an association between the number of accidents in each year and the year. A secondary question can also be posed: if the accident counts have reduced, have maintenance accidents reduced at the same rate as all accidents, or at a different rate? Answering this requires two simple linear models,

$$
n_{\text{all}} = \alpha_1 + \beta_1 t \tag{1}
$$

and

$$n_{\text{m}} = \alpha_2 + \beta_2 t \tag{2}$$

where $\alpha_1$ and $\beta_1$ are the model coefficients that predict the count of all official ICAO accidents ($n_{\text{all}}$), $\alpha_2$ and $\beta_2$ are the model coefficients that predict the count of accidents with maintenance contributions ($n_{\text{m}}$), and $t$ is time in years. Assessing the difference in the rate of change requires a combined model, given as

$$n_{\Delta} = n_{\text{m}} - p\, n_{\text{all}} = \alpha_3 + \beta_3 t \tag{3}$$

where the new dependent variable is the relative difference between the two accident groups, and $p$ is the proportion of all accidents that are maintenance related (approximately 35/1277). That is, multiplying $n_{\text{all}}$ by $p$ gives a count relative to 35 instead of 1277, which puts the two counts on the same scale. If this new count or, more correctly, the difference in the counts ($n_{\Delta}$) diverges over time ($\beta_3 \neq 0$), then the rates of change, once placed on the same scale, are not equal ($\beta_1 \neq \beta_2$). These hypotheses can be expressed as

$$\begin{array}{llll} \text{H}_{0}\text{:} & \beta_{3} = 0 & \text{or} & \beta_{1} = \beta_{2} \\ \text{H}_{\text{A}}\text{:} & \beta_{3} \neq 0 & \text{or} & \beta_{1} \neq \beta_{2} \end{array}$$

A subtle point that needs to be mentioned is that, because two dependent variables are being regressed against the independent variable, the degrees of freedom are the number of observations (12) minus 3, rather than minus the 2 associated with simple linear regression.
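The combined model can be sketched as follows, under stated assumptions. The yearly counts are hypothetical placeholders (not the study data), p is computed from the placeholder totals in the same way as the 35/1277 proportion, and the helper names `ols_fit` and `slope_t` are illustrative. Because ordinary least squares is linear in the dependent variable, the fitted slope of the combined model equals the maintenance slope minus p times the all-accident slope, so testing whether that slope is zero compares the two rates on the common scale. The t statistic uses the 12 − 3 = 9 degrees of freedom described above.

```python
import math


def ols_fit(t, y):
    """Least-squares fit of y = a + b*t. Returns (a, b, sse, sxx)."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    b = sxy / sxx
    a = y_bar - b * t_bar
    sse = sum((yi - (a + b * ti)) ** 2 for ti, yi in zip(t, y))
    return a, b, sse, sxx


def slope_t(b, sse, sxx, df):
    """t statistic for H0: slope = 0, given the residual degrees of freedom."""
    standard_error = math.sqrt(sse / df / sxx)
    return b / standard_error


# Hypothetical placeholder counts for 12 years (not the study data).
years = list(range(1, 13))
n_all = [120, 118, 115, 110, 112, 105, 100, 102, 98, 95, 96, 90]
n_m = [5, 4, 4, 3, 4, 3, 3, 2, 3, 2, 1, 2]

# Proportion of all accidents that are maintenance related (placeholder totals).
p = sum(n_m) / sum(n_all)

# Dependent variable of the combined model: n_delta = n_m - p * n_all.
n_delta = [m - p * a for m, a in zip(n_m, n_all)]

_, b1, _, _ = ols_fit(years, n_all)           # slope for all accidents
_, b2, _, _ = ols_fit(years, n_m)             # slope for maintenance accidents
_, b3, sse3, sxx = ols_fit(years, n_delta)    # slope of the combined model

df = len(years) - 3  # 12 observations minus 3, as stated in the text
t_beta3 = slope_t(b3, sse3, sxx, df)
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}, b3 = {b3:.4f}, t = {t_beta3:.3f}, df = {df}")
```

The resulting t statistic would then be compared with the critical value of a t distribution on 9 degrees of freedom at the chosen significance level.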

Logistic regression is needed to answer RQ4. This is because fatality and the fate of the airframe (accident category) are both dichotomous variables (fatal or not, hull loss or not), while aircraft age is a continuous variable. Logistic regression is well suited to measuring the association between a dichotomous dependent variable and a continuous independent variable. The statistical hypotheses to be tested are given as

$$\text{H}_0\colon \qquad \beta = 0$$

$$\text{H}_\text{A}\colon \qquad \beta \neq 0$$

where β is the slope coefficient in the fitted logistic model, which relates the continuous variable to the dichotomous outcome. The model has the form

$$
\pi(x) = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}. \tag{4}
$$

Here, π is the estimated probability of the dichotomous outcome at the given predictor level (x), in this case, age of the aircraft.
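To make Equation (4) concrete, the sketch below fits α and β by maximum likelihood using Newton-Raphson iterations and then forms a Wald statistic, z = β / SE(β), for H0: β = 0. Several points are assumptions for illustration: the text does not state which test of β was used (a likelihood ratio test is a common alternative), the ages and outcomes are hypothetical placeholders rather than the study data, and the names `logistic` and `fit_logistic` are illustrative.

```python
import math


def logistic(alpha, beta, x):
    """Equation (4): e^(a + b*x) / (1 + e^(a + b*x)), in an equivalent form."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


def fit_logistic(xs, ys, iterations=25):
    """Maximum-likelihood fit by Newton-Raphson. Returns (alpha, beta, se_beta)."""
    alpha, beta = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0             # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0     # information matrix (negative Hessian)
        for x, y in zip(xs, ys):
            pi = logistic(alpha, beta, x)
            w = pi * (1.0 - pi)
            g0 += y - pi
            g1 += (y - pi) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * delta = gradient.
        alpha += (h11 * g0 - h01 * g1) / det
        beta += (h00 * g1 - h01 * g0) / det
    # Standard error of beta from the inverse information matrix, evaluated
    # at the final iteration (the estimates have converged by then).
    se_beta = math.sqrt(h00 / det)
    return alpha, beta, se_beta


# Hypothetical placeholder data: aircraft age in years and a dichotomous
# outcome (1 = fatal, 0 = not fatal). Not the study data.
ages = [2, 5, 8, 10, 12, 15, 18, 20, 22, 25, 28, 30, 33, 35, 38, 40]
outcomes = [0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1]

alpha_hat, beta_hat, se_beta = fit_logistic(ages, outcomes)
wald_z = beta_hat / se_beta
print(f"alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}, z = {wald_z:.3f}")
print(f"estimated probability at age 20: {logistic(alpha_hat, beta_hat, 20):.3f}")
```

Under H0 the Wald statistic is approximately standard normal, so a large |z| would indicate an association between aircraft age and the outcome.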
