### *1.2. Search Strategy*

The search strategy aimed to find only articles published in English or translated into English. There was no restriction on the date of publication. Articles were searched for between June and July 2020. The keywords used to search all databases and reference sources, including the NC3Rs website, were animal grimace score, animal grimace scale, animal pain assessment, animal pain indicators, animal pain face and animal pain scales. All papers were retrieved and downloaded into EndNote X8.0.1, with any duplicates removed.

### *What is pain and why does it matter?*

The International Association for the Study of Pain defines pain as: 'An unpleasant sensory and emotional experience associated with, or resembling that associated with, actual or potential tissue damage' [2]. Pain can be further categorised as acute, visceral or chronic. Acute pain serves an evolutionary and adaptive function to signal and avoid potential or actual damage to tissues. This type of pain may result from an injury or surgical wound [3–5]. Visceral pain is due to the activation of stretch or pressure receptors in visceral organs. Unalleviated or poorly treated acute pain can progress to chronic pain. The latter is the result of neuroplastic changes occurring within the nervous system, which render the body more sensitive to pain and can even create sensations of pain without any external stimuli [3–7]. It is important to be able to manage and assess all types of pain in research mammals and avoid the inadvertent development of chronic pain. While the types of pain may manifest differently, research staff must be able to assess and alleviate pain to maintain optimal animal wellbeing (mental and physical).

There are moral, legal and ethical obligations that require those working with animals to manage pain [5,7–13]. The recognition, assessment and treatment of pain are essential to public support for, and the acceptability of, the use of animals in research [7,12,13]. Using the precautionary principle, animal ethics committees and research staff must acknowledge the potential for pain [5,10,12,14]. They must also consider the experimental and animal welfare consequences of pain and take steps to ensure pain is adequately managed during procedures [3–5,15]. Regulatory frameworks often apply, as a precaution, the anthropomorphic principle, by which any procedure causing, or expected to cause, pain in humans may produce pain in animals. Regulations in the European Union, the United Kingdom and Australia operate on this principle and require the alleviation of pain for research animals. Exceptions to the alleviation of pain may be granted for studies that measure pain and/or distress. However, even in these exceptional cases, there is always a maximum threshold level of pain before intervention is required [5,10,12,14–17]. Although management of pain and associated humane interventions will vary due to the nature of the experimental outcomes, researchers are required to intervene according to predetermined criteria to alleviate pain or, if necessary, humanely euthanise animals. For practical purposes, animal users can only fulfil this obligation for humane intervention if they are able to identify pain rapidly, consistently and accurately in their target species.

Unalleviated pain results in alterations in animal behaviour, physiology, and physical states [18–20]. These changes can be identified in various ways through behavioural observation, biochemistry, haematology, endocrinology and physical alterations in locomotion or posture [4,21]. In addition to the suffering of the animal, these changes may impact experimental outcomes and become a confounder by increasing experimental variability and producing negative affective states. Conversely, positive emotional states of animals are linked to less experimental variability and more robust experimental results [22,23]. The full spectrum of potential confounders arising from unmitigated pain is not entirely understood; however, the literature supports that there are experimental and animal welfare benefits in identifying and subsequently treating pain and alleviating negative effects on animals used for research [3,17,19,21–24].

### *1.3. Pain Faces*

Humans are known to display a series of facial expressions linked to the experience of pain [25,26]. These so-called 'pain faces' are used in human medicine to detect, as well as assess, pain in non-verbal humans (e.g., infants) [7,25,26]. These pain faces can be used to develop grimace scales and capitalise on the human propensity to focus on the facial area [27,28]. Pain faces are also conserved in many non-human mammalian species [26,29–34] and are a naturally useful means of identifying and assessing pain. However, as with any technique, grimace scales have benefits and limitations. These are important to acknowledge and take into consideration before their use.

### *1.4. Pain Assessment Requirements*

There is a series of important considerations when determining whether a method is an appropriate test for identifying and/or assessing pain. The testing method must reliably produce the same result independent of the observer and the number of times an animal is observed. These are, respectively, known as interobserver and intraobserver agreement. It should also be consistent between testing timepoints and observers [7,35,36]. An ideal method should be easy to train and not require specialist knowledge or equipment [7,33,36–38].
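As an illustration of how observer agreement might be quantified, the following minimal sketch computes Cohen's kappa for two sets of categorical scores. Cohen's kappa is chosen here only as one common agreement statistic; the studies reviewed may have used other measures (for example, intraclass correlation coefficients). The function name `cohen_kappa` and the example scores are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical scores."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of items given the same score.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's score frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical scores (0, 1 or 2) given to the same ten images.
# Two different observers -> interobserver agreement; the same observer
# scoring the images on two occasions -> intraobserver agreement.
observer_1 = [0, 0, 1, 2, 2, 1, 0, 2, 1, 0]
observer_2 = [0, 0, 1, 2, 1, 1, 0, 2, 2, 0]
kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f}")
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance; what counts as acceptable agreement is a judgement that depends on the study.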

A suitable test must demonstrate validity by accurately determining or reflecting the presence or absence of pain [7,35,36,39]. To determine the validity of a pain assessment technique, animals should be tested before the painful stimulus, after the introduction of the painful stimulus and once pain relief has been provided. The test should demonstrate an absence of pain before the painful stimulus, an increase in pain at the introduction of the painful stimulus, and a subsequent reduction in observed pain on the delivery of an appropriate analgesic [7,36]. Ideally, the test should be able to demonstrate a dose–response curve to pain based on the administration of appropriate analgesia [40–44].

The specificity and sensitivity of a test are also crucial to ensure animals are correctly identified when pain or welfare concerns arise. If the specificity is too low, there is a risk of pain being incorrectly identified, potentially leading to unnecessary interventions such as pain relief or humane euthanasia [3,7,15,36]. Alternatively, if the sensitivity is too low, experimental animals may reach their threshold for intervention while being inaccurately identified as not in pain, therefore remaining in pain, possibly even beyond their humane endpoint. An appropriate method would demonstrate both high sensitivity and specificity to ensure correct assessment and correct management of arising pain or welfare issues [3,7,15,36].
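The two quantities can be made concrete with a short calculation. The sketch below assumes that each animal has a reference pain status (for example, from a known painful procedure) and a yes/no call from the test under evaluation; the function name and the data are hypothetical and serve only to show the arithmetic.

```python
def sensitivity_specificity(truly_in_pain, scored_in_pain):
    """Return (sensitivity, specificity) from paired boolean sequences."""
    if len(truly_in_pain) != len(scored_in_pain):
        raise ValueError("sequences must be of equal length")
    pairs = list(zip(truly_in_pain, scored_in_pain))
    tp = sum(t and s for t, s in pairs)          # in pain, identified
    fn = sum(t and not s for t, s in pairs)      # in pain, missed
    tn = sum(not t and not s for t, s in pairs)  # not in pain, cleared
    fp = sum(not t and s for t, s in pairs)      # not in pain, flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need animals both in pain and not in pain")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data for eight animals.
reference = [True, True, True, True, False, False, False, False]
test_call = [True, True, True, False, False, False, True, False]
sens, spec = sensitivity_specificity(reference, test_call)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this example one animal in pain is missed (lowering sensitivity) and one pain-free animal is flagged (lowering specificity), which correspond to the two failure modes described above.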

Cage or pen-side pain identification techniques should rely on spontaneous rather than retrospective indicators of pain. This ensures that humane intervention can be applied promptly, with animals not left in distress for any extended length of time [9,45,46]. The assessment of pain should preferably be non-invasive, to avoid the risk of eliciting a pseudoanalgesic stress response that inhibits the ability of the observer to detect pain accurately [47–50]. Techniques such as assessing the quality of nest-building in mice [3,15,18,51–53] or the degree of burrowing in rodents are non-invasive, observational, proxy measures of wellbeing and potentially pain [3,15,18,54,55].

### *1.5. Confounders to Pain Identification*

Some caveats must be kept in mind when selecting a pain assessment technique. Many pain assessment indicators may be ambiguous. The choice of a pain identification tool or methodology must be specific to the species and validated for the procedures or experimental work being performed [56–60]. It is well accepted that not all animals demonstrate the same signs of pain, even for a similar nociceptive stimulus [36,61]. Many research animals are prey animals and, as such, are prone to hide signs of pain or demonstrate a freeze response, rendering pain assessments challenging [21,46,50,62–66]. The types of procedures or experiments performed should not obscure the ability of the technique to detect pain [39,56–58,67]. Pain identification should be consistent across the species regardless of sex, strain or breed; however, differences in pain thresholds between sexes or strains may exist [67,68]. Additionally, some natural behaviours (e.g., flehmen response, aggression) [3,4,32,69–72] or physiological indicators (e.g., cortisol, heart rate) [3–5,7,15,17,21,66] may be equivocal and require differentiation. Whenever possible, the chosen technique should accurately identify an animal in pain, independent of the procedure or behaviour performed, species, affective or physiological state, sex, strain or breed.

### *1.6. Non-Grimace Scale Pain Assessment*

The individual expression, magnitude and experience of pain can vary between animals [67,68]. There are known difficulties in measuring the magnitude of a particular animal's pain or distress which can make the absolute measurement or degree of pain challenging to assess [4,21,62,68,73–75].

Before the development and use of grimace scales, a variety of indicators were used in an attempt to identify and assess pain. These can be grouped into behavioural, physiological and physical indicators (Table 1). Typically, behavioural indicators have the benefit of being non-invasive, observational, requiring limited equipment and offering an opportunity to capture signs of pain in species or individuals that may hide signs of pain (e.g., prey animals) [3,21,50–55,66]. However, many behavioural assessment techniques take time (>5 min), require extensive training, are more retrospective than spontaneous, and may be non-specific proxy indicators of pain [4,18,76–78]. Physiological indicators (neuroendocrine or sympathetic nervous system) are often non-specific markers related to stress or distress [3,5,21,79,80]. They do have the benefit of being relatively quantifiable but often require specialised equipment, are retrospective, usually require animal handling or restraint with the potential for a confounding pseudoanalgesic effect [47–49], or reflect a non-specific stress response [3,4,8,81]. Physical indicators, such as changes in posture, locomotion and production yields, have been correlated with the presence of pain in animals [3,8,15,17,21,64,66,82–85]. However, physical indicators are just as often non-specific indicators of non-painful animal wellbeing or environmental factors [3,5,8,15,17,21,64,66,82,84].

Ideally, a pain assessment technique should ensure accurate pain identification and minimal opportunity for the confounding of experimental outcomes due to experimental procedures, sex, breed/strain, or species. At present, there is not a single non-invasive, low-cost behavioural, physical or physiological pain assessment technique that is spontaneous, pain-specific, easy to train and quick to use (Table 1) [5,15,16,21,38,46,53,78,86–89]. With the exception of some behavioural ethograms [63,78,90,91], other methods are unable to give a reliable dose-dependent response to pain. While many pain identification methods have their use and benefits, their use in the cage or pen-side management of animal pain and/or in the timely and appropriate application of humane intervention is limited.

Thus, a myriad of techniques has been developed in an attempt to assess and capture the various expressions of pain in animals. These tools usually revolve around three dimensions: behavioural, physiological, and physical [5,73]. Table 1 categorises and reviews some commonly used assessments [3,4,7,8,15,21,80] in terms of their dimension, timeliness (spontaneity), non-invasiveness, ease of training, and low cost with minimal or no equipment requirements.


**Table 1.** Behavioural, physiological, and physical pain assessment techniques.

\* Ultrasonic vocalization monitoring requires special equipment; \*\* e.g., Egg or Milk; \*\*\* Eggs can easily be counted; \*\*\*\* Obvious signs of lameness are easy to train but more subtle lameness may be more difficult.

### *Animals* **2020**, *10*, 1726

### *1.7. Grimace Scales in Animals*

Grimace scales are proving to be a useful methodology that meets most of the prerequisites for identifying and assessing pain in research animals. A range of species-specific grimace scales has been developed (Table 2) and used in a wide range of experimental studies and research settings (Table 4). The initial methodology for mapping pain, the facial action coding system (FACS), was developed in humans [97,98]. A FACS is an anatomical classification system used to map facial movements and the facial muscle areas involved in facial contraction and relaxation. Photographs and videos scored by blinded observers serve as the basis of facial mapping for a FACS. A FACS offers the ability to code and identify expressions of pain via the individual components of facial expressions, known as facial action units (FAUs) [99]. FAUs consistent with the expression of pain can then be used to develop a pain face or 'grimace' [99]. Regions of the face that have been found to change during the expression of pain include the eye, nose, cheek, mouth, ear and whiskers [8,81,100]. The position or carriage of the head is also found to change in some species [33,64,68,85,101,102]. The FAUs related to the expression of a grimace face in mammalian animals used in research are included in Table 3. From this known 'grimace face', the severity of the pain experienced can be objectively scored from images and/or film of animals in a known naturally occurring (e.g., lameness, mastitis) [37,82,85] or experimentally induced (e.g., plantar incision) [41,44,103] state of pain.

Table 2 summarises many of the available studies that demonstrate a successful use of grimace scales in research animals. The table outlines which species-specific grimace scales have been validated, shown to be pain-specific, demonstrated a dose-dependent relationship, used in real-time and were easy to use. The different pain states to which they are applicable are also listed. In all but one species (guinea pigs) [63,78,90], observers were found to correctly, reliably and objectively identify pain in animals when using facial expressions or facial action units.


**Table 2.** Summary of species-specific available grimace scores.


**Table 3.** Grimace Scale Facial Action Units by Species.


**Table 4.** Grimace Scales by Experimental Study or Pain Type.

Control animals (negative or positive) were also included throughout this process and a simple species-specific grimace scale was developed [25,33,41,107]. The scoring system most commonly used in grimace scales is a three-point scale to determine if a specific FAU is not (score = 0), moderately (score = 1), or obviously present (score = 2) [41,45,100]. The scale must then demonstrate a dose-dependent change in pain scores on the delivery of analgesia [7]. Further research is typically performed to ensure the applicability of the grimace scale across multiple pain scenarios or environments, sex, strain/breed, age, as well as type and length of painful stimuli [7]. The scoring system can be used in three ways. Firstly, it can determine either the absence or presence of pain. Secondly, it can offer some distinction between the intensity of pain via the summation of total scores. A change by two or more points is considered to be a legitimate alteration in pain intensity [133]. Thirdly, a threshold score can be set to offer guidance to research staff as to when to intervene to provide pain alleviation or humane euthanasia of research animals. The process of developing a grimace scale is time-intensive, but once a scale is developed and validated, it is relatively easy to train research staff to use it [7,38,70,85,101].
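The three uses of the scoring system can be sketched in a few lines. The example below is illustrative only: the FAU names, the intervention threshold and the rule that any non-zero total indicates pain are hypothetical assumptions, not values from a validated scale; only the 0/1/2 scoring and the two-point change criterion come from the text above.

```python
def total_grimace_score(fau_scores):
    """Sum per-FAU scores: 0 = not, 1 = moderately, 2 = obviously present."""
    for name, score in fau_scores.items():
        if score not in (0, 1, 2):
            raise ValueError(f"{name}: score must be 0, 1 or 2, got {score!r}")
    return sum(fau_scores.values())


def assess(baseline, current, intervention_threshold):
    """Apply the three uses of the scale to one animal."""
    base_total = total_grimace_score(baseline)
    cur_total = total_grimace_score(current)
    return {
        "total": cur_total,
        # Assumption for illustration: any non-zero total counts as pain.
        "pain_present": cur_total > 0,
        # A change of two or more points is treated as a real change.
        "meaningful_change": abs(cur_total - base_total) >= 2,
        # The threshold is set per study; the value used below is hypothetical.
        "intervene": cur_total >= intervention_threshold,
    }


# Hypothetical FAU names and scores for one animal.
baseline = {"orbital_tightening": 0, "ear_position": 0, "cheek": 0,
            "nose": 1, "whiskers": 0}
current = {"orbital_tightening": 2, "ear_position": 1, "cheek": 1,
           "nose": 1, "whiskers": 0}
result = assess(baseline, current, intervention_threshold=4)
print(result)
```

In practice, the FAUs scored, the meaning of a given total and the intervention threshold would all come from the validated species-specific scale and the approved study protocol.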

### **2. Advantages and Uses**

Grimace scales have been applied across numerous research models, species and environmental contexts [41,128] (Table 4). They are a technique that can also be used to detect pain in existing pain research models as well as analgesic drug studies [40–42,45,60,77,109,110,128–130]. Grimace scales offer the ability to detect and assess the severity of pain, determine the potential benefit of any analgesic intervention and assist in identifying humane interventions. The technique is of practical value as it can be used at the cage or pen-side level as a spontaneous indicator of pain [39,41,55,75,92]. As a methodology, it has the added benefit of being easy to teach to a range of observers including research staff, clinical veterinarians, animal scientists and undergraduate and graduate students [38,41,55,75,129]. Overall, the grimace scale methodology appears to be acceptably conserved and validated across a number of mammalian species and range of experiments. It is likely this technique has the capacity to be applied across an even greater range of mammalian species and experimental settings (Tables 3 and 4). However, a careful systematic assessment will always be required to ensure applicability, accuracy and validity.

Grimace scale facial expressions are proving to be a useful [81] complement to existing tools in the assessment of animal wellbeing. The scores generated from the grimace scale should be used in conjunction with the context in which the animal is scored, its history, the procedure performed and the general parameters for wellbeing and signalment (sex, strain, species). When used appropriately, it is an excellent method to identify pain and an adjunct to maintaining animal wellbeing in research studies [3,64,70,82,85,87]. Using this technique has the potential to improve pain detection in research animals and give observers (e.g., research staff) a better opportunity to provide analgesia, humane euthanasia or identify animals requiring reassessment. The use of these grimace scales can be a vital tool to enable mitigation of the experience of pain in animals and refine animal welfare outcomes [41,60,66,75,76,82,100,114,128]. Unlike other types of pain assessment, grimace scales are spontaneous and usable in real-time [7,45,55,76,87,91,92,101]. They can also be matched and corroborated against other known indicators of pain or painful diseases including, but not limited to, lameness [37,64,82,85], cortisol [70], behavioural ethograms [81,85,91,92], acute laminitis [37], mastitis and foot rot [82]. A future area for development and benefit is the use of software automation in the development and scoring of facial expressions. The use of scoring software, along with the installation of video cameras into enclosures, may be able to enhance and hasten the development of grimace scales, offer highly accurate grimace scores for animals in pain and allow the remote monitoring and scoring of affected animals [41,59,134,135].

Another benefit of the system is simplicity, as it enables staff to distinguish a painful face from a non-painful one. Using a three-point scale is thought to be very useful in the reduction in subjectivity and offers observers greater clarity, confidence and support as to when to administer pain relief or humane intervention [7,75,81]. Reduction in grimace scores has been shown to occur on the application of pain relief [33,35,41,45,82,85,100,102] in a dose-dependent manner [40,41,128]. Therefore, grimace scales have the potential to assess both the presence and severity of pain. The use of grimace scales can alert research staff to animal discomfort, which may require additional monitoring, assessment or analgesia.

Grimace scales are a non-invasive method for the detection of pain [7,81,100]. Many of the animals utilised in research are known 'prey' species with a high degree of stoicism and evolutionary adaptation to minimise expressions of pain or poor welfare states [50,63–65,74]. Consequently, an ideal pain identification and assessment method should be non-invasive and should reduce the possibility for these prey animals to minimise their expression of pain or for the potential of stress-induced analgesia [47–50].

Both experienced and inexperienced observers can identify pain with a significant intraobserver and interobserver agreement [41,57,60,66,76,82,100,114]. A potential benefit of using grimace scales to identify and assess pain in animals is that extensive animal experience is not required. Observers varied in their background experience of research and animal work and in their training in pain assessment techniques. The observers ranged from students (undergraduate and postgraduate), veterinarians and animal care professionals to early- to late-career researchers [33,38,70,75,101,106,114,116]. Another favourable outcome when using grimace scales is that a natural empathy or innate understanding of animal behaviour is not necessary, nor is a belief in the ability of an animal to experience pain. Through the use of a grimace scale, pain identification and assessment can be more objective (for or against the presence of pain). It also requires research staff to formally record a score and monitor animals for signs of pain, which can offer a more precise framework to determine when humane intervention or pain relief is needed [36].

The apparent usefulness of grimace scales could be related to several factors. One is that the method capitalises on the innate human tendency to focus on the facial area when observing an animal [28]. Interestingly, many FAUs (orbital tightening, ear position and cheek area) appear to be conserved across mammalian species [33,41,45,82,100] (Table 3) and may be tapping into an evolutionarily conserved repertoire of known FAUs. This may help explain how the identification of even a few potentially evolutionarily conserved FAUs can still be useful in detecting pain [35]. This is supported by statistical modelling, which has identified the FAUs most strongly correlated with a pain face, thereby offering the potential to isolate which FAUs are critical for use in a grimace scale (i.e., statistically significant) and which ones may detract from the scale (e.g., equines have four and mice have two critical FAUs) [35]. It may also explain why grimace scales are one of the few techniques proven to be robust across several different mammalian species when compared to other pain assessment techniques [34]. However, by using only the minimum number of FAUs required to score pain, the ability to determine appropriate intervention thresholds and assess pain intensity may be reduced.

The use of FACS and subsequent combinations of FAUs appears to be an excellent method to identify changes in facial features, which are consistent with the experience of pain in animals [33,40,41,76,82,85,92,101,104,106,114,130]. The grimace scale method seems to meet many of the requirements for an ideal pain identification technique. It is known to be a reliable and validated method of assessing pain in many of the commonly used research animals [33,35,40,41,45,70,82,85,100,106,114].
