*2.5. Viewing Sessions*

Each observer was required to attend four sessions, on campus or by correspondence: one term generation session followed by three quantification sessions, one for each treatment comparison. The first group of observers (Study A; *n* = 26) attended sessions in November 2011 (treatment comparisons: Study A1, Study A2 and Study A3), and a second group (Study B; *n* = 20) attended sessions in June 2013 (treatment comparisons: Study B1, Study B2 and Study B3). Observers were given detailed instructions on completing the QBA sessions but were not told about the experimental treatments or that the sheep were on a livestock vehicle.

For the term generation session, observers in both studies were shown 11 video clips of groups of sheep during road transport. The clips demonstrated a wide range of behavioural expression so that observers could describe as many aspects of the sheep's expressive repertoire as possible. After watching each clip, observers were given 2 min to write down any words that they thought described the animals' behavioural expression. No limit was imposed on the number of descriptive terms an observer could generate, but terms needed to describe not *what* the animals were doing (i.e., physical descriptions such as vocalising, chewing, tail flicking), but *how* they were doing it. The terms were subsequently edited to remove those that described actions, and terms in the negative form were transformed to the positive for ease of scoring (e.g., *unhappy* became *happy*). Each descriptive term was attached to a 100 mm visual analogue scale (min = 0 to max = 100). Each observer's list of terms was arranged alphabetically, which was effectively random with respect to meaning, so that terms with a similar meaning were not generally listed together.
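The term-editing steps described above can be illustrated with a minimal sketch. It is not the procedure used in the study, which was carried out by hand: the function name, the lookup tables and the example terms *calm* and *alert* are hypothetical, and only the action examples and the *unhappy* to *happy* mapping come from the text.

```python
# Illustrative sketch only; all names here are hypothetical, not from the study.

# Examples of 'what' (action) terms named in the text; a real list would be longer.
ACTION_TERMS = {"vocalising", "chewing", "tail flicking"}

# The only negative-to-positive mapping given in the text.
NEGATIVE_TO_POSITIVE = {"unhappy": "happy"}


def build_term_list(raw_terms):
    """Return one observer's edited term list in alphabetical order."""
    edited = set()  # a set removes duplicates created by the mapping
    for term in raw_terms:
        term = term.strip().lower()
        if not term or term in ACTION_TERMS:
            continue  # drop blanks and terms that describe actions
        edited.add(NEGATIVE_TO_POSITIVE.get(term, term))
    return sorted(edited)


# Hypothetical raw terms written down by one observer.
terms = build_term_list(["Unhappy", "chewing", "calm", "alert", "happy"])
print(terms)  # ['alert', 'calm', 'happy']
```

Each term in the resulting list would then be paired with its own 100 mm visual analogue scale.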

For the quantification sessions, observers viewed and scored video clips of animals under transport using their own unique list of descriptive terms; clips were randomly arranged within each viewing session. Before the sessions commenced, observers were given detailed instructions on how to score each animal's expression using the visual analogue scale: they were told to think of the distance between the zero-point and their mark on the scale as reflecting the intensity of the animal's expression. Observers viewed and scored 20 clips in each quantification session, except for Study A2 (10 clips) and Study B3 (36 clips). For each clip, the score was to reflect the overall expression of all sheep visible.
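As a rough illustration of the scoring logic, the sketch below converts a mark on the scale into a score and draws a random clip order for one session. The text does not state how marks were measured or how clips were randomised, so both functions and their names are assumptions made for illustration.

```python
import random

SCALE_LENGTH_MM = 100.0  # scale length stated in the text


def vas_score(mark_mm, scale_length_mm=SCALE_LENGTH_MM):
    """Convert the distance from the zero-point to the mark into a 0-100 score."""
    if not 0.0 <= mark_mm <= scale_length_mm:
        raise ValueError("mark lies outside the scale")
    return 100.0 * mark_mm / scale_length_mm


def randomised_clip_order(n_clips, seed=None):
    """Return clip numbers 1..n_clips in random order (assumed simple shuffle)."""
    rng = random.Random(seed)  # a seed makes the order reproducible
    order = list(range(1, n_clips + 1))
    rng.shuffle(order)
    return order


order = randomised_clip_order(20, seed=1)  # 20 clips, as in most sessions
score = vas_score(37.5)  # hypothetical mark 37.5 mm from the zero-point
```

On a 100 mm scale the score equals the measured distance in millimetres, so a longer distance from the zero-point corresponds directly to a more intense expression of that term.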
