**2. Experiments**

In a given work environment, including an ideal environment, workers will vary in their relative output, so a probabilistic model is required to capture the mean relative productivity and the variation around the mean. We define ideal conditions as those that maximise the probability of achieving a 100% relative output. Less ideal conditions will reduce this probability, perhaps to zero if no workers can maintain full output, and will increase the probability of lower levels of productivity, potentially to no output at all, if work becomes impossible for at least some workers.

A suitable family of distributions to express variation in a proportion or percentage is the two-parameter beta family, usually parameterised as

$$f_X(x; \alpha, \beta) = \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{\mathrm{B}(\alpha, \beta)}; \quad x \in [0, 1],\ \alpha > 0,\ \beta > 0 \tag{1}$$

This distribution has mean μ = α/(α + β) and variance

$$
\sigma^2 = \alpha \beta / [(\alpha + \beta)^2 (\alpha + \beta + 1)] = \mu (1 - \mu) / (\alpha + \beta + 1) \tag{2}
$$

A convenient reparameterisation for modelling purposes is directly in terms of μ and a scale parameter θ = √(αβ), from which the usual parameters can be recovered as α = θφ and β = θ/φ, where φ<sup>2</sup> = μ/(1 − μ).
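The conversion between the two parameterisations can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the values μ = 0.8 and θ = 2 are chosen purely because they give round numbers.

```python
import math

def beta_shape_from_mean_scale(mu, theta):
    """Convert (mu, theta) to the conventional (alpha, beta) shape parameters."""
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie strictly between 0 and 1")
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    phi = math.sqrt(mu / (1.0 - mu))   # phi^2 = mu / (1 - mu)
    return theta * phi, theta / phi    # alpha = theta * phi, beta = theta / phi

def beta_mean_variance(alpha, beta):
    """Mean and variance of Beta(alpha, beta), as in equation (2)."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1.0))
    return mean, variance

# Illustrative values only (not taken from any dataset).
alpha, beta = beta_shape_from_mean_scale(mu=0.8, theta=2.0)
mean, variance = beta_mean_variance(alpha, beta)
print("alpha, beta:", alpha, beta)
print("mean, variance:", mean, variance)
print("sqrt(alpha * beta):", math.sqrt(alpha * beta))  # recovers theta
```

With these inputs φ = 2, so α ≈ 4 and β ≈ 1 up to floating-point rounding, and the mean returned by the second function reproduces the μ that was supplied.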

Usually, we will model μ to express how it varies with the work conditions. Here we model μ as a logistic function of heat stress T, and estimate θ as a constant. If a particular parametric model implies a mean output of μ, maximising the log-likelihood of the data with respect to the parameters that determine μ in each environment, and also with respect to θ, provides the maximum likelihood estimates of those parameters.
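As a rough sketch of the likelihood being maximised, the following Python code evaluates the beta log-likelihood when μ is a logistic function of T and θ is constant. It makes a simplifying assumption that the proportions are observed directly and lie strictly inside (0, 1). The names `intercept` and `slope` and the numerical values are hypothetical placeholders, not estimates or data from any study.

```python
import math

def beta_log_density(p, mu, theta):
    """Log of the beta density (1) at p, written in terms of (mu, theta)."""
    phi = math.sqrt(mu / (1.0 - mu))
    a, b = theta * phi, theta / phi
    # ln B(a, b) via log-gamma functions
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a - 1.0) * math.log(p) + (b - 1.0) * math.log(1.0 - p) - log_norm

def logistic(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))

def log_likelihood(proportions, heat, intercept, slope, theta):
    """Sum of beta log densities with logit(mu) = intercept + slope * T."""
    total = 0.0
    for p, t in zip(proportions, heat):
        mu = logistic(intercept + slope * t)
        total += beta_log_density(p, mu, theta)
    return total

# Made-up illustrative values, not measurements.
heat = [26.0, 28.0, 30.0, 32.0]
proportions = [0.95, 0.90, 0.75, 0.55]
print(log_likelihood(proportions, heat, intercept=12.0, slope=-0.36, theta=3.0))
```

Passing the negative of this function to any general-purpose numerical optimiser would give maximum likelihood estimates under the stated assumptions. Observations at exactly 0 or 1 would need separate handling, since the logarithms are undefined there.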

We let the potential maximum output for individual i under ideal conditions follow a normal distribution, Z<sub>i</sub> ~ N(ψ, τ<sup>2</sup>), and express the proportional reduction in work environment j as P<sub>ij</sub> ~ B(μ(T<sub>j</sub>), θ), a beta distribution in the (μ, θ) parameterisation. Here, T<sub>j</sub> represents the heat stress of that environment on some scale, such as ordinary ambient temperature or a more specialised heat stress measure such as effective temperature (ET) or wet bulb globe temperature (WBGT).

The observed output is Y = Z × P. Since the mean of Z is ψ and that of P is μ, the mean of Y is approximately E[Y] ≈ ψμ. A given level of output might represent a small proportion P of a large potential Z, or vice versa, so the model, as so far described, will be poorly conditioned in terms of ψ and μ, and a further model constraint is required. Published reports often describe an optimal environment under which heat stress is essentially absent. For example, Wyndham (1969) [14] assumed no loss of productivity until the ambient heat (expressed as natural wet bulb temperature, T<sub>nw</sub>) exceeded 27.7 °C. It seems reasonable, then, to assume that the beta distribution representing reduced productivity in such an environment has its mode at P = 1, approaching a degenerate distribution in which all workers deliver 100% output. Under the conventional Beta(α, β) parameterisation, such distributions are the subfamily in which β = 1. This subfamily can be specified through a logistic model in the form:

$$\operatorname{logit} \mu(T) = 2\ln(\theta) + b\,(T - T_0) \tag{3}$$

Then μ(T<sub>0</sub>) = θ<sup>2</sup>/(1 + θ<sup>2</sup>), α(T<sub>0</sub>) = θ<sup>2</sup>, and β(T<sub>0</sub>) = 1, as required. The overall model now has four parameters: the mean ψ and variance τ<sup>2</sup> of the maximum potential output; the (negative) regression parameter b that relates the mean relative output μ to the environmental stress measure T; and the beta-distribution scale parameter θ, which also determines the intercept of the regression. Sometimes, T<sub>0</sub> can also be estimated from the data, but this may be unstable in small datasets, and then better results are achieved by setting T<sub>0</sub> from external considerations such as local knowledge or expert opinion. Both approaches are applied in the examples below.
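The four-parameter model can be illustrated with a short simulation. The sketch below implements equation (3), checks that β = 1 at the reference temperature, and draws outputs Y = Z × P. It assumes that Z and P are drawn independently, and every parameter value other than the 27.7 °C threshold quoted from Wyndham is a hypothetical placeholder rather than an estimate.

```python
import math
import random

def mean_output(T, T0, b, theta):
    """Equation (3): logit mu(T) = 2 ln(theta) + b (T - T0)."""
    eta = 2.0 * math.log(theta) + b * (T - T0)
    return 1.0 / (1.0 + math.exp(-eta))

def shape_parameters(T, T0, b, theta):
    """Conventional (alpha, beta) implied by mu(T) and theta."""
    mu = mean_output(T, T0, b, theta)
    phi = math.sqrt(mu / (1.0 - mu))
    return theta * phi, theta / phi

def simulate_output(T, T0, b, theta, psi, tau, n, rng):
    """Draw n outputs Y = Z * P, with Z and P assumed independent."""
    alpha, beta = shape_parameters(T, T0, b, theta)
    outputs = []
    for _ in range(n):
        z = rng.gauss(psi, tau)            # Z ~ N(psi, tau^2)
        p = rng.betavariate(alpha, beta)   # P ~ Beta(alpha, beta)
        outputs.append(z * p)
    return outputs

# Hypothetical parameter values, for illustration only.
T0, b, theta, psi, tau = 27.7, -0.4, 3.0, 100.0, 10.0

# At T = T0 the shape parameters should be (theta^2, 1) up to rounding.
print(shape_parameters(T0, T0, b, theta))

rng = random.Random(1)
y = simulate_output(32.0, T0, b, theta, psi, tau, n=20000, rng=rng)
print("simulated mean of Y:", sum(y) / len(y))
print("psi * mu(T):        ", psi * mean_output(32.0, T0, b, theta))
```

The two printed means should agree closely, differing only by simulation error under the independence assumption.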

As well as modelling the mean output through a logistic regression equation, which may include many covariates, the model also flexibly captures between-worker variance in output through two parameters, τ<sup>2</sup> and θ. The first measures the inter-worker variation of output in ideal, comfortable conditions, while θ determines how the variance changes with temperature. A large θ implies that the variation reduces in proportion with the mean output, as assumed by the original author in the first example below [14]. A small θ, on the other hand, implies that the variance is maintained, or even increased, as adverse conditions cause the output to fall. Which of these is the case in any given setting will in itself be a research question of interest, to be answered through the analysis of relevant field data.
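The role of θ can be explored numerically. The sketch below uses the identity α + β = θ/√(μ(1 − μ)), which follows from α = θφ and β = θ/φ, together with the standard variance formula for a product of independent variables. Independence of Z and P is an assumption made here for illustration, and the values of ψ, τ and θ are hypothetical.

```python
import math

def proportion_variance(mu, theta):
    """Variance of P from equation (2), with alpha + beta = theta / sqrt(mu (1 - mu))."""
    spread = math.sqrt(mu * (1.0 - mu))
    return mu * (1.0 - mu) / (theta / spread + 1.0)

def output_sd(mu, theta, psi, tau):
    """Standard deviation of Y = Z * P, assuming Z and P are independent."""
    var_p = proportion_variance(mu, theta)
    # Var(ZP) = psi^2 Var(P) + tau^2 mu^2 + tau^2 Var(P) under independence
    var_y = psi ** 2 * var_p + tau ** 2 * mu ** 2 + tau ** 2 * var_p
    return math.sqrt(var_y)

psi, tau = 100.0, 10.0   # hypothetical values
for theta in (0.5, 500.0):
    for mu in (0.9, 0.7, 0.5):
        sd = output_sd(mu, theta, psi, tau)
        print(f"theta={theta:6.1f}  mu={mu:.1f}  sd(Y)={sd:6.2f}  sd(Y)/mu={sd / mu:6.2f}")
```

Under these assumptions, the large-θ rows show the standard deviation of Y shrinking roughly in step with μ, while the small-θ rows show it growing as μ falls.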
