**6. Discussion**

As common techniques in biomedical research involving mice, blood sampling methods may have a considerable impact on the animals' cumulative experience. Furthermore, retrobulbar sampling is considered controversial, and its use is prohibited in some laboratories. Whilst summary documents and guidelines exist on this topic, a number of them are now dated, and it is not clear whether they were based on a systematic review of the available evidence (see, e.g., [3,61]). This review represents the first systematic review of recovery blood sampling techniques in mice based on their impact on animal welfare and sample quality.

### *6.1. Impact of Blood Sample Route on Mouse Welfare*

Whilst there is a substantial body of evidence on the impact of blood sampling on animal welfare with 27 studies sourced, the heterogeneity in terms of sampling routes compared, and outcomes measured, renders it problematic to make any recommendation. Despite the large number of studies, few performed the same pairwise comparisons, with the same outcome measures. We were also unable to perform any assimilation of the behavioural data given the heterogeneity. This is unfortunate, as arguably these measures may provide a better measure of well-being and animal impact than a measure of the short-term stress response. Given the above caveat, some general points taken from the assimilation are presented.

For serial blood sampling, as a general rule, small-volume sampling routes may be beneficial over large-volume sampling routes based on the findings for glucose, plasma corticosterone and bodyweight (Table 1). This would seem plausible based on physiological principles given the reduced blood volume lost, and therefore reduced chance of hypotension, haemorrhagic shock and impaired tissue and organ metabolism that can result [62].

Whilst it might be expected that head-focused routes would have a greater impact on bodyweight than methods focused at the tail and leg, this was only borne out for the facial and retrobulbar routes. The finding that bodyweight loss did not occur with sublingual sampling (data from two studies) is intriguing given the mouth focus, and requires further study to confirm.

It is often assumed that anaesthesia improves animal well-being following procedure performance, yet the findings are inconsistent. For example, beneficial effects of isoflurane use were found, with reduced anxiety being observed in the OFT [12] and decreased histological lesion severity after retrobulbar and facial sampling [1,8]. However, anaesthesia caused a doubling in the incidence of the serious adverse effect of ear bleeding after facial vein bleeding, and a tripling in rate of bleeding from the nares after sublingual puncture as observed in [30]. Perhaps the best current advice in this regard is to determine anaesthetic use on a case-by-case basis, dependent on the blood sampling route, and the researcher's level of technical expertise and comfort with the procedure. These apparent discrepancies also need future targeted research focus.

The difference in serial mortality rate observed for the facial vein route between the studies of [26,29] (2% vs. 33%) is striking. This is especially so since the technique is widely used in laboratory animal practice, and this mortality rate would likely raise ethical concerns for continued performance amongst many ethics committees. These results may be an artefact of the small sample sizes employed in the Frolich et al. 2018 study, with a potential effect of learning/re-familiarisation. Mortality rates may have declined as the skill was reacquired with larger numbers of animals sampled. Only one of the included studies specifically investigated the effect of experience on outcomes measured [28]. Based on pathological findings, this study found that experience with retrobulbar sampling had little impact on outcome. However, it is suggested that this effect may have been overlooked in the included studies, and that it deserves further research attention, along with an agreed criterion to standardise expertise level.
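To illustrate why a mortality rate estimated from a small group is hard to interpret, the following minimal sketch computes a Wilson score confidence interval for a proportion. The animal counts are hypothetical and chosen only to reproduce observed rates of 33% and 2%; they are not the sample sizes of the cited studies, and the function name `wilson_interval` is our own.

```python
import math


def wilson_interval(deaths, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = deaths / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts, NOT taken from the cited studies.
small_group = wilson_interval(deaths=2, n=6)    # 33% observed
large_group = wilson_interval(deaths=2, n=100)  # 2% observed

for label, (low, high) in [("n=6", small_group), ("n=100", large_group)]:
    print(f"{label}: {low:.1%} to {high:.1%}")
```

Under these assumed counts, the interval for the small group is far wider than that for the large group, which is consistent with the suggestion that a high rate seen in a few animals may partly reflect sampling variability.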

As a final point, based on the evidence available, the reason for the demise of the retrobulbar route for ethical reasons remains unclear. The synthesis implies that it is associated with zero mortality [29], none [30,33] to mild clinical ocular abnormalities [29], and similar severity of histological lesions to other large-volume sampling routes [36,38]. This change in policy direction is more surprising given that this route has generally been replaced by facial vein sampling, which arguably can lead to a similar number, if not more, potentially negative outcomes [27,29,33,38]. It is speculated that pictures, or dialogue showing the horrendous (likely rare) outcome of globe perforation after sampling, may have influenced decision making in this regard. Perhaps a more appropriate focus for policy makers should be how to best train and ensure operator competency in this technique in order to avoid the occurrence of serious adverse effects. It is certainly by no means clear whether the proposed alternative facial route is more beneficial for animal well-being and this should be a priority for future research.

### *6.2. Evidence Completeness and Quality and Recommendations for Future Research*

This review identifies a number of factors preventing recommendations on choice of mouse blood sampling route being made. These include (1) that in spite of a reasonable number of studies on the topic, there were often few studies examining the same pairwise comparison; (2) there was a lack of standardised outcome measures relevant to well-being and timepoints for comparison; (3) many studies were at high risk of bias, either by virtue of study design or deficiencies in reporting. When considering that there have been significant and repeated recent efforts to improve the reporting standards of animal research, and that guidelines to assist animal research have been widely available since 2010 [63–66], the overall poor quality of reporting of the included studies is problematic. Simple details prescribed by the ARRIVE (Animal Research: Reporting of In Vivo Experiments) guidelines, such as description of the randomisation procedure, were only reported by three studies published after 2010 (of 16 total). This highlights how far animal-based research needs to come to be considered rigorous, transparent and transferable.

Generally, RCTs are desirable due to their placement in the hierarchy of evidence. However, given the practical and widespread nature of these interventions, and therefore the importance of external validity, it may be desirable to investigate these techniques using large-scale, well-designed observational studies. This may avoid issues alluded to earlier, where small sample sizes may lead to artefacts. Alternatively, RCTs that are multicentre in nature to increase sample size could be performed. These studies may be able to reduce confounding by issues such as technique experience as part of a dilution effect due to the use of many operators.

One of the major limitations of this systematic review was the inability to perform a meta-analysis, which is the ideal method to synthesise and present data from multiple, comparable studies. However, due to the heterogeneity that existed between studies, performing a meta-analysis was never considered to be appropriate. While this factor has severely limited the impact of our results in terms of making a determination on the effects of blood sample route on animal welfare, it does provide us some insight as to how a meta-analysis may be facilitated for future systematic reviews on this topic. Ideally, the relevance and reproducibility of outcome measures used to assess welfare in adult mice should be validated and discussed in the field, so that consensus may be reached, and experiments standardised accordingly. However, by far the most common barrier to performing a meta-analysis encountered was the underreporting of results by authors of the primary studies. Often, authors simply reported their results as a figure or a graph, and a statement of (non-)significance. While this may suit the purposes of the primary author in confirming or rejecting their null hypothesis, this underreporting greatly reduces the transferability and comparability of these data. A simple solution to this problem would be for scientific journals to start mandating that submitting authors provide complete data sets. This tactic is already being employed by international journals such as PLoS One, Springer Nature and Science [67–69] and may reduce the need to resort to alternate, less robust data synthesis strategies [70]. However, until this becomes a standard, as it has in clinical research, this issue will continue to plague animal-based research.
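The reporting barrier described above can be made concrete with a minimal sketch of what a pooled estimate requires from each primary study. This is not an analysis performed in this review: the study values are invented for illustration, the names `Arm`, `mean_difference` and `pool_fixed_effect` are hypothetical, and a fixed-effect inverse-variance model is assumed purely for simplicity (heterogeneous studies would more plausibly call for a random-effects model).

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Arm:
    """Summary statistics for one sampling route in one study."""
    mean: Optional[float]
    sd: Optional[float]
    n: Optional[int]


def mean_difference(a, b):
    """Return (difference, variance), or None if any statistic is missing."""
    values = (a.mean, a.sd, a.n, b.mean, b.sd, b.n)
    if any(v is None for v in values):
        return None  # e.g. results shown only as a graph plus a p-value
    return a.mean - b.mean, a.sd**2 / a.n + b.sd**2 / b.n


def pool_fixed_effect(effects):
    """Inverse-variance weighted mean difference and its standard error."""
    weights = [1 / var for _, var in effects]
    pooled = sum(w * d for w, (d, _) in zip(weights, effects)) / sum(weights)
    return pooled, math.sqrt(1 / sum(weights))


# Invented values for illustration only (not data from the included studies).
studies = [
    (Arm(110.0, 10.0, 10), Arm(100.0, 10.0, 10)),
    (Arm(120.0, 10.0, 10), Arm(100.0, 10.0, 10)),
    (Arm(None, None, 8), Arm(None, None, 8)),  # figure-only reporting
]

usable = [e for e in (mean_difference(a, b) for a, b in studies) if e is not None]
pooled, se = pool_fixed_effect(usable)
print(f"{len(usable)} of {len(studies)} studies usable; pooled difference {pooled:.1f}")
```

In this sketch the third study contributes nothing to the pooled estimate, because its means and standard deviations cannot be recovered from a figure and a statement of significance alone.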
