#### *4.1. Human Versus Nonhuman Determination*

Nonhuman remains comprise a significant portion (25–30%) of total cases assessed by forensic anthropologists [1–3] and can represent more than 90% of skeletal cases submitted to medical examiner offices [1]. Although forensic anthropologists mentally assess bone size and shape when determining skeletal species, only one other published study was found that assessed the utility of basic long bone osteometrics in differentiating human from nonhuman remains. Saulsman et al. [14] created discriminant functions from a sample of 50 human and 50 nonhuman specimens from five Australian species. Their study illustrated the potential utility of such quantitative methods, with accuracy rates over 95%, but it was limited by sample sizes and species inclusion.

Our results, based on the quantitative assessment of more than 16,000 long bones, support their findings. From this extensive dataset, we provide discriminant functions and decision trees that can be used to assist or support human versus nonhuman determinations from long bones. Even when all elements are pooled, the DFA and decision trees return over 90% accuracy, with correct classifications of human remains over 95% (99.6% for the decision tree). Thus, high accuracy rates can be achieved even without first distinguishing the specific bony element present. If the bone is first identified and bone-specific methods are applied, accuracy increases further for all models except the tibia-specific and ulna-specific discriminant functions, for which accuracy was slightly lower. The ulna performed the worst across most analyses, which may partly be due to the lack of distal measurements collected for this element. Generally, the decision trees yielded slightly higher overall accuracy rates than the DFAs.

When assessing the human versus nonhuman origin of skeletal remains, we recommend the decision trees presented in this paper and Supplementary Materials over the discriminant functions, given (1) their higher accuracy rates, (2) their use of more available data and split-validation, and (3) their lack of statistical assumptions [42]. The better performance of decision trees may also reflect the incorporation of multiple sectioning points into the model (one at each node) as compared to a single sectioning point with discriminant functions. In addition, decision trees report classification rates at each node, giving a more realistic view of accuracy and confidence in the classification for any specific set of measurements. For example, if a bone falls into node 7 in Figure 1, the results indicate about a 75% probability that the bone is human, despite an overall model accuracy rate of 91%. Decision trees are intuitive, transparent, and easy to apply [40,41,46]. While the concept of decision trees is not new to forensic anthropology [39,40,47–50], the method remains underutilized in practice.
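To make the idea of node-level classification rates concrete, the following is a minimal Python sketch of how a tree with one sectioning point per node could be followed to a terminal node that reports its own proportion of human specimens. The tree, variable names, thresholds, and proportions are invented for illustration only; they are not the trees published in Figure 1 or the Supplementary Materials.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    node_id: int
    p_human: float                      # share of training bones at this node that were human
    variable: Optional[str] = None      # measurement tested here (None marks a terminal node)
    threshold: Optional[float] = None   # sectioning point in mm
    below: Optional["Node"] = None      # branch taken when value < threshold
    at_or_above: Optional["Node"] = None


def classify(node, measurements):
    """Follow the branches to a terminal node; return (node_id, label, p_human)."""
    while node.variable is not None:
        value = measurements[node.variable]
        node = node.below if value < node.threshold else node.at_or_above
    label = "human" if node.p_human >= 0.5 else "nonhuman"
    return node.node_id, label, node.p_human


# HYPOTHETICAL toy tree: every number below is a placeholder, not a published value.
toy_tree = Node(
    1, 0.50, "max_length", 250.0,
    below=Node(2, 0.02),
    at_or_above=Node(
        3, 0.70, "distal_width", 70.0,
        below=Node(4, 0.97),
        at_or_above=Node(5, 0.10),
    ),
)

print(classify(toy_tree, {"max_length": 320.0, "distal_width": 62.0}))
# -> (4, 'human', 0.97)
```

The value returned alongside the label is the terminal node's own proportion, which is why two bones given the same label can carry quite different levels of confidence.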

Another advantage of decision tree models is that they allow users to assign higher costs to certain sets of misclassifications [36], in this case to the misclassification of human remains as nonhuman. In forensic anthropology, misclassifying human remains as nonhuman could prevent decedent identification, leaving family members without closure and impeding possible criminal investigations. In contrast, the biggest cost of misclassifying a nonhuman element as human is the unnecessary expenditure of time and resources spent in securing a scene and contacting an expert for final determination. The decision trees presented here help reduce the likelihood of both scenarios. A death investigator called to a scene with a bone could have the decision tree printed on a single sheet of paper (or access it via the OsteoID website on their smartphone) and, using a tape measure, could easily follow the branches of the tree for a preliminary assessment of human versus nonhuman. Because of the integrated misclassification costs, the trees are more likely to incorrectly assign a nonhuman bone as human than vice versa; thus, the result is conservative, and anything close to matching human form will be treated as if it is human and of forensic significance until determined otherwise (ideally by a trained forensic anthropologist). At the same time, resources are not wasted on scenes containing remains that are clearly not human. Thus, the models presented here can act as a triaging tool.
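The effect of unequal misclassification costs can be illustrated with a short Python sketch that picks, for a given probability of human origin, the label with the lowest expected cost. The 10:1 cost ratio and the function name are assumptions chosen for illustration, not the costs used in this study. The sketch also applies the costs only to the final label, whereas cost-sensitive tree software typically uses them while the tree is being grown.

```python
def min_cost_label(p_human, cost_human_as_nonhuman=10.0, cost_nonhuman_as_human=1.0):
    """Return the label with the lower expected misclassification cost.

    The default costs are HYPOTHETICAL placeholders, not values from the study.
    """
    # Calling the bone nonhuman is wrong with probability p_human.
    expected_cost_if_called_nonhuman = p_human * cost_human_as_nonhuman
    # Calling the bone human is wrong with probability (1 - p_human).
    expected_cost_if_called_human = (1.0 - p_human) * cost_nonhuman_as_human
    if expected_cost_if_called_human <= expected_cost_if_called_nonhuman:
        return "human"
    return "nonhuman"


print(min_cost_label(0.20))             # -> human (conservative call under 10:1 costs)
print(min_cost_label(0.20, 1.0, 1.0))   # -> nonhuman (equal costs)
print(min_cost_label(0.05))             # -> nonhuman
```

Under these assumed costs, a bone is called nonhuman only when the probability of human origin drops below 1/11 (about 9%), which mirrors the conservative behavior described above.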

While some may argue that all bones discovered should be assessed by a forensic anthropologist, this is not realistic and does not represent current practice. Forensic anthropologists typically receive elements that are believed to possibly be human. Those remains that the finder, law enforcement agent, or those consulted by the law enforcement agent (including physicians and veterinarians) deem not human are frequently not referred to medicolegal agencies or forensic anthropologists. If remains are referred to medicolegal agencies, non-anthropological personnel there may also determine that the remains are not human and do not warrant consultation with a forensic anthropologist. Resources such as the models and web tool presented here can assist the individuals who are already undertaking these triaging roles in making more informed decisions. If the decision trees, discriminant functions, visual comparison with the web tool images, and/or context of the remains suggest that they may be of human origin, the medicolegal agency and forensic anthropologist should be consulted for final determinations. The forensic anthropologist, in turn, may find these resources useful in supporting their designations or confirming the particular faunal species (discussed below).

Not surprisingly, the most accurate human versus nonhuman functions and decision trees include measurements from multiple regions of the bone, which may not be possible in cases involving fragmented remains. Consequently, the use of only specific bone regions was tested as part of this study for application to larger bone fragments. Univariate analyses were performed on maximum lengths to reflect cases in which erosion to the epiphyses could affect proximal and distal elements. Models were created from only the distal measurements (width and depth) and from only the midshaft measurements (maximum and minimum diameters) for use in cases limited to these fragmented regions. The length and distal epiphyseal region-specific analyses produced higher accuracy rates than the midshaft measurements (except for the ulna). This is expected given that maximum length and distal width were commonly the most important variables in the more inclusive models. For the femoral decision tree, despite inputting all six variables, the tree output only used maximum length and was able to correctly classify over 96% of the total sample and over 99% of the human sample. The region-specific discriminant functions developed per bone (Supplementary Table S12) produced accuracy rates above 85% for all functions except the humeral midshaft (67.1%). These results are slightly higher than the region-specific DFA results presented by Saulsman et al. [14]. While the results suggest that these models may be useful tools when assessing fragmented remains as human or nonhuman, caution is still warranted given that classification rates are only moderately high, and additional evidence (e.g., presence of morphological features, application of a second method) should be provided to support the conclusion. Saulsman and colleagues [14] also warn against estimating the midshaft location on humeral fragments because deviations 2 cm above or below the actual midshaft location significantly altered their classification rates; results from femoral and tibial deviations were more robust. Application of the models to burned fragments must also consider the possibility of bone shrinkage with thermal modification [51].

The most conservative approach for assessing the human origin of skeletal remains using osteometrics would be to compare specimen measurements with the minimum, maximum, and 95% confidence intervals for human remains presented in Table 2 and at least preliminarily consider anything that falls within that range, or very close to that range, as potentially human pending further analysis. OsteoID [43] will return images of human bones if the input measurements fall anywhere within the min/max or standard deviation ranges compiled from the sample of >2700 individuals. Practitioners must always consider the small possibility that their unknown specimen may be an outlier, perhaps lying at the extremes of the human distribution, which may not have been captured in this study. Pathological conditions that affect body size (e.g., dwarfism, gigantism), although rare, could also affect results [52,53].
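The conservative range comparison described above can be sketched in a few lines of Python. The ranges below are placeholders rather than the values in Table 2, and the 5% tolerance used to represent "very close to that range" is an assumption made for illustration.

```python
# HYPOTHETICAL human ranges in mm (min, max); substitute the published Table 2 values.
PLACEHOLDER_HUMAN_RANGES = {
    "max_length": (380.0, 520.0),
    "distal_width": (65.0, 95.0),
}


def potentially_human(measurements, ranges, tolerance=0.05):
    """Return True if every supplied measurement lies within, or close to, the human range.

    `tolerance` widens each range by a fraction of its span (an assumed margin).
    Measurements with no reference range are ignored.
    """
    for name, value in measurements.items():
        if name not in ranges:
            continue
        low, high = ranges[name]
        margin = tolerance * (high - low)
        if not (low - margin <= value <= high + margin):
            return False
    return True


print(potentially_human({"max_length": 450.0, "distal_width": 80.0}, PLACEHOLDER_HUMAN_RANGES))
# -> True
print(potentially_human({"max_length": 375.0}, PLACEHOLDER_HUMAN_RANGES))
# -> True (just outside the range but inside the margin)
print(potentially_human({"max_length": 200.0}, PLACEHOLDER_HUMAN_RANGES))
# -> False
```

A True result here only means that the specimen cannot be excluded on size, so it should be treated as potentially human pending further analysis.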

In highly fragmented or taphonomically modified remains, morphometric and visual assessments may not be applicable. Other evidence, such as cortical bone thickness and trabecular bone density, may be factored into the decision [4,54,55], although research by Rerolle et al. [56] suggests that the corticomedullary index may not be as distinctive in humans as previously suggested. Several papers state that nutrient foramen location and morphology can assist in human versus nonhuman distinctions [57,58]. Microscopic (histomorphological) or molecular methods can also be utilized [59–63] to determine human origin, but they require greater expertise and specialized equipment, are more time intensive, and are destructive to the specimen [3]. Even histomorphological techniques cannot provide 100% accuracy in distinguishing human from nonhuman species, with certain faunal species (e.g., large mammals) and bone types (e.g., presence of only Haversian bone) shown to be particularly problematic [60]. Publications also disagree on the usefulness of osteon circularity in determining the human origin of bone [62,63].

#### *4.2. Species Identification*

The quantitative methods of species identification were less successful than those assessing human origin. While these results are likely impacted by uneven sample sizes across the 28 species, they also reflect morphological and size similarity between some species. For example, brown bear and black bear long bones are morphologically similar [41,64–67], especially as represented by these few basic measurements; thus, small brown bears and large black bears may be misidentified. Sheep and goat long bones are also difficult to differentiate [29,68]. Domestic dogs pose many issues, not just because of their similarity to other canids included in this study (e.g., coyotes and wolves) [69,70] but also because of their high degree of variability in both morphology and size [71,72]. The DFA species classification rates were significantly higher than chance, but the probability of species misidentification remains relatively high. The application of a discriminant function to classify an unknown specimen into one of 28 groups would also be impractical to do by hand, thereby requiring computer usage. Ultimately, practitioners must rely on visual comparisons of more subtle morphological differences in making the final faunal species designations.

In facing these challenges of species identification, the OsteoID website [43] is particularly useful. Users can input basic measurements to narrow down the potential species and are presented with photographic images of the possible identifications. Thus, the measurements are used as a filtering tool, but the final identification is still based on visual comparison. With the use of visual comparisons, OsteoID can be used for identifying fragmented elements. Supplemental resources provided on the website can also be utilized in skeletal identifications, such as access to the metric database, a link to this publication and associated Supplementary Materials, 3D scans of numerous elements, and lists of other useful texts and websites. Photographs of additional elements (e.g., carpals) not included in the web tool are provided and will be continually updated. The web tool can easily be modified if future minimum/maximum values need revision. There is also the possibility of expanding the database and web tool to include additional species/specimens in the future.
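The filtering step can be pictured with the following Python sketch, in which a set of measurements narrows a reference table to the species whose ranges contain them. This is not the OsteoID implementation: the function, the species labels, and all ranges are hypothetical and serve only to show measurements acting as a filter ahead of visual comparison.

```python
# HYPOTHETICAL reference table: species label -> {measurement: (min_mm, max_mm)}.
PLACEHOLDER_DATABASE = {
    "species_a": {"max_length": (300.0, 400.0), "midshaft_max": (20.0, 30.0)},
    "species_b": {"max_length": (350.0, 500.0), "midshaft_max": (25.0, 40.0)},
    "species_c": {"max_length": (100.0, 200.0), "midshaft_max": (8.0, 15.0)},
}


def candidate_species(measurements, database):
    """Return species whose ranges contain every measurement that was supplied.

    Only measurements present in both the input and the reference table are
    compared, so a fragment with a single measurable dimension can still be filtered.
    """
    candidates = []
    for species, ranges in database.items():
        shared = [name for name in measurements if name in ranges]
        if shared and all(
            ranges[name][0] <= measurements[name] <= ranges[name][1] for name in shared
        ):
            candidates.append(species)
    return sorted(candidates)


print(candidate_species({"max_length": 380.0, "midshaft_max": 27.0}, PLACEHOLDER_DATABASE))
# -> ['species_a', 'species_b']
print(candidate_species({"midshaft_max": 10.0}, PLACEHOLDER_DATABASE))
# -> ['species_c']
```

The output is a shortlist rather than an identification; the final designation would still rest on visual comparison of the candidates.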

As an online, searchable, comparative osteology collection that includes photographs, data, and 3D scans, OsteoID [43] provides forensic anthropologists with a centralized location for free resources to facilitate skeletal species identification. Practitioners with less zooarchaeological training or lacking access to physical comparative collections will benefit most from these resources when determining faunal species. The web tool and online resources can be accessed from smartphones and other devices while at the scene. With the download of free third-party applications, even the 3D bone models can be viewed on smartphones. The 3D models can also be downloaded and 3D printed to create comparative collections. Beyond forensic anthropologists, forensic pathologists, medical examiners, coroners, crime scene and death investigators, and law enforcement personnel may find OsteoID useful when making preliminary assessments. In situations where scene personnel have reason to believe that remains are nonhuman and typically would have dismissed the remains as not forensically significant, they can use the OsteoID resources to visually confirm that the morphology is not consistent with a human and perhaps find a faunal species match. In cases in which there is any possibility that remains are human, expert opinions should still be obtained. Modified remains or those that are more diagnostically difficult will require a forensic anthropologist's expertise, but OsteoID can reduce time and cost expenditures for diagnostically nonhuman remains. Bioarchaeologists, zooarchaeologists, veterinarians, and biologists may also find the OsteoID web tool and resources useful, and the general public may be interested in learning more about remains encountered. Presently, there are multiple social media groups where individuals post their skeletal finds and group participants provide species identifications. Given that OsteoID is publicly available, it contains multiple disclaimers urging anyone with remains that could potentially be human to leave them in situ and to contact local authorities. Finally, the photographs and 3D scans made available via the website can be used to train students in comparative osteology, and the data may be used by researchers in other studies.

#### *4.3. Limitations and Future Directions*

Given that all forensic anthropologists rely partly on bone form (i.e., size and general shape) when assessing human origin, using bone metrics to create a quantitative classification method seems simple and logical. However, our study illustrates several challenges to this work. Firstly, it is difficult to find measurements that can be collected consistently across diverse species and bones. Limiting our measurements to maximum lengths, breadths, and depths allowed us to increase the range of animals and skeletal elements in our dataset for pooled analyses, but it excludes aspects of discrete morphological features used in visual assessments of species identification. While the general morphometric variables were able to successfully differentiate human from nonhuman remains (similar to the results of Saulsman et al. [14]), visual assessments that consider specific bone features are necessary for accurate faunal species identification.

Because the methods developed here are dependent on size and epiphyseal breadths, only skeletally mature specimens could be included in quantitative analyses (and resultant functions and models are only applicable to skeletally mature specimens). At least partial fusion of both the proximal and distal epiphyses should be observed prior to utilizing the discriminant functions or decision trees. Skeletally mature specimens of certain species can be hard to locate, especially domesticated species which may be butchered as juveniles [73]. The species curated at museums vary and again tend not to focus on domesticated species or may not curate full skeletons, especially for larger mammals where space becomes a challenge.

Unequal sample sizes from different species could have biased our classification results, particularly with human versus nonhuman analyses. Although a high degree of faunal variation is captured in the pooled nonhuman sample, there is a smaller representation of some of the largest mammalian species. Given that humans also have relatively large body sizes, this may be driving some of the classification bias, as the models may be more likely to classify all large bones (human or nonhuman) as human given the large human sample sizes. Indeed, larger animals such as moose, brown bear, horse, cow and elk were more commonly misclassified as human, which could explain the relatively higher human and lower nonhuman classification rates in the discriminant functions. Misclassifying some of these species elements as human instead of nonhuman in preliminary forensic contexts is less costly than erroneously classifying human elements as nonhuman; following the preliminary human classification, a forensic anthropologist would then be consulted for a more formal assessment that would identify the error.

The smaller sample sizes in some nonhuman species are also less likely to capture the true population size variation and thus impact DFA species classifications. The human sample size, however, which is of greatest forensic significance, is sufficiently large, and the nonhuman sample sizes exceed those of previous publications [14]. Furthermore, not all measurements were available for all specimens. Data obtained from the literature frequently had some but not all the study measurements, meaning that in the DFAs, many of those cases were excluded.

The species included in the metric database are not exhaustive, and it is unclear how a specimen from an excluded species would classify. This study was limited to species commonly encountered in North America that were accessible at collections but does not include, for example, marine mammals. Further validation of the developed methods is needed, and if more data can be collected from additional species and specimens, revised models may be more appropriate. Future data collection for human versus nonhuman determinations should focus on adding greater samples of larger-bodied mammals. While increased samples of larger-bodied fauna may decrease model accuracy rates, it is possible that the models may still be able to confidently differentiate human from nonhuman specimens given the distinct functional anatomy of humans [3,74,75].

Preliminary analyses using a subsample of the humeral and femoral data suggest that machine learning and random forest models may be able to further increase morphometric classification rates for human versus nonhuman designations and species assignments [76]. Random forest models are a machine learning approach in which numerous decision trees are created from random subsamples, and their predictions are combined through averaging to produce a final classification [46–48]. This machine learning technique increases classification stability and alleviates potential issues of overfitting [58]. The downside of random forest models is their complexity. Because random forest model results are based on the combined results of hundreds or thousands of trees, there is no final model/tree that can be presented or applied to cases [46]. This ensemble approach is considered a "black box" method [41] meaning that it is mathematically complex and difficult to understand and explain in terms of application [77], which can be a disadvantage in court testimony. Furthermore, for broad application, a software program would need to be created to run the random forest models with new unknown specimens.
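The ensemble idea can be shown with a deliberately small Python sketch: many one-split trees (stumps) are each fitted to a bootstrap resample of the data, and their votes are combined. This is plain bootstrap aggregation with a single variable and majority voting, the simplest relative of a random forest; a full random forest grows deeper trees and also samples the candidate variables at each split. The data are synthetic, and the sketch is not the model used in the preliminary analyses [76].

```python
import random
from collections import Counter


def fit_stump(sample):
    """Find the single length threshold that classifies the sample best."""
    best = None
    for threshold in sorted({x for x, _ in sample}):
        for above, below in (("human", "nonhuman"), ("nonhuman", "human")):
            correct = sum((above if x >= threshold else below) == y for x, y in sample)
            if best is None or correct > best[0]:
                best = (correct, threshold, above, below)
    return best[1:]


def predict_stump(stump, x):
    threshold, above, below = stump
    return above if x >= threshold else below


def fit_forest(data, n_trees=101, seed=0):
    """Fit each stump to a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(data) for _ in data]) for _ in range(n_trees)]


def predict_forest(forest, x):
    """Combine the stumps by majority vote; also return the vote counts."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0], votes


# SYNTHETIC maximum lengths in mm, invented purely for illustration.
data = [(float(x), "nonhuman") for x in range(100, 200, 10)]
data += [(float(x), "human") for x in range(400, 500, 10)]

forest = fit_forest(data)
print(predict_forest(forest, 495.0)[0])   # -> human
print(predict_forest(forest, 120.0)[0])   # -> nonhuman
```

Even in this toy version, the fitted thresholds differ from stump to stump, so there is no single tree that could be printed and followed by hand, which is the practical drawback noted above.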

#### **5. Conclusions**

The tools presented in this study do not diminish the need for forensic anthropologists. Caution must still be used given the high cost of misclassifying a human bone as nonhuman, and forensic anthropologists or other experts should be consulted in situations where there is any possibility that remains may be human. Still, the resources developed and provided here may be used to preliminarily assess whether remains are potentially human and determine the amount of resources to expend on a found bone (e.g., whether or not a scene needs to be preserved). Forensic anthropologists or other medicolegal personnel can use the resources to support classifications and faunal species identifications. These resources may also be beneficial to other disciplines where skeletal remains are encountered or training in comparative osteology is beneficial, including wildlife forensics, bioarchaeology, zooarchaeology, veterinary medicine, and biology.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/biology11010025/s1. Figure S1: Human versus nonhuman decision tree derived from all available measurements and a pooled-bone sample. Figure S2: Human versus nonhuman decision tree derived from only distal bone measurements and a pooled-bone sample. Figure S3: Human versus nonhuman decision tree derived from only midshaft measurements and a pooled-bone sample. Figure S4: Human versus nonhuman decision tree derived from only maximum length measurements using a pooled-bone sample. Figure S5: Human versus nonhuman decision tree for the humerus, derived from all available measurements. Figure S6: Human versus nonhuman decision tree for the femur, derived from all available measurements. Figure S7: Human versus nonhuman decision tree for the radius, derived from all available measurements. Figure S8: Human versus nonhuman decision tree for the tibia, derived from all available measurements. Figure S9: Human versus nonhuman decision tree for the ulna, derived from all available measurements. Table S1: Descriptive statistics for humeral measurements collected by species. Table S2: Descriptive statistics for femoral measurements collected by species. Table S3: Descriptive statistics for radial measurements collected by species. Table S4: Descriptive statistics for radio-ulnar measurements collected by species. Table S5: Descriptive statistics for ulnar measurements collected by species. Table S6: Descriptive statistics for tibial measurements collected by species. Table S7: Descriptive statistics for fibular measurements collected by species. Table S8: Descriptive statistics for scapular measurements collected by species. Table S9: Descriptive statistics for sacral measurements collected by species. Table S10: Descriptive statistics for pelvic measurements collected by species. Table S11: Descriptive statistics for fused metapodial measurements collected by species. Table S12: Select discriminant functions for human versus nonhuman classification.

**Author Contributions:** Conceptualization, H.M.G., R.D. and S.B.S.; methodology, H.M.G., R.D. and S.B.S.; data collection and processing, H.M.G., R.D., S.B.S., M.S.L., M.M., N.S. and N.K.; formal analysis, H.M.G.; software, H.M.G.; data curation, H.M.G. and R.D.; writing—original draft preparation, H.M.G., R.D., S.B.S.; writing—review and editing, H.M.G., R.D., S.B.S., M.S.L., M.M., N.S. and N.K.; supervision, H.M.G. and S.B.S.; funding acquisition, H.M.G., R.D. and S.B.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Institute of Justice, grant number NIJ 2018-DU-BX-0229. Opinions expressed herein do not necessarily represent the official position or policies of the U.S. Department of Justice or the National Institute of Justice.

**Institutional Review Board Statement:** All data used in this study were collected in compliance with the Declaration of Helsinki and the federal regulations for the protection of human subjects in research at 45 CFR 46 from postmortem skeletal collections curated at established museums or institutions.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are openly available via the OsteoID website (www.boneidentification.com (accessed on 24 December 2021) → Additional Resources), as well as on Dryad (doi:10.5061/dryad.73n5tb2z0).

**Acknowledgments:** We would like to thank all the museum curators that facilitated access to collections. A special thanks to the following individuals who contributed their time or data to the project: Andrea Clendaniel, Elizabeth Dougher, Alexandra Klales, Michael Kenyhercz, Dennis Dirkmaat, Julie Meachen, Chelsea Cataldo-Ramirez, and Christopher Milensky. Finally, thank you to Erin Menardi, the Web Developer who collaborated with the PIs in creating this online tool.

**Conflicts of Interest:** The authors declare no conflict of interest.
