**5. Conclusions**

Given the tremendous loss of life and property directly attributed to motor vehicle incidents on the one hand, and significant advances in relevant data availability on the other, it is natural that data analytics is viewed as having great potential for contributing to solving these problems. A successful effort in this direction necessarily relies on a combination of data collection, descriptive analytics, predictive/explanatory modeling, and optimization. At the same time, each piece can be a significantly nontrivial problem on its own. Hence, developing a mature data-driven decision support tool that incorporates all of these stages "from scratch" is probably beyond the scope or ability of any single researcher. This is especially true since there has not been a conscious effort to pull all of these areas together with the goal of informing practical decision-making. The most significant gap that we have identified is in the translation of outcomes/insights from predictive/explanatory models (which aim to help us better understand and quantify crash risk) into prescriptive optimization models (which aim to inform route/path selection, driver assignment, etc.). A partial underlying reason may be the absence of readily available, convenient data sources and/or data processing tools.

In this review, we highlighted a promising opportunity to develop advanced analytical methods for safety-enabled transportation. The following areas represent the main avenues for progress (ordered according to the sections in this review):

	- (i) real-time or near-real-time data are now more widely available;
	- (ii) computational advancements in recent years allow for parallelizing/computing risk across the entire road network, or at the very least for all major highways and interstates;
	- (iii) insights obtained from relatively small road segments may not be generalizable to the entire road network; and
	- (iv) it is unclear how drivers (regular commuters or commercial) can utilize these insights to make more informed decisions about their time-of-travel, path and/or route selection.
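As a purely illustrative sketch of how predictive output could feed a prescriptive model, the following Python snippet uses a hypothetical per-segment crash probability as part of the edge cost in a shortest-path search. The road network, the travel times, the probabilities, the function name `safest_path`, and the `risk_weight` trade-off parameter are all assumptions made for illustration; they are not taken from the reviewed literature or from our repository. The sketch also assumes that crash events on different segments are independent, so that minimizing the sum of -log(1 - p) maximizes the probability of a crash-free trip.

```python
import heapq
import math


def safest_path(graph, source, target, risk_weight=0.0):
    """Dijkstra search with edge cost = minutes + risk_weight * -log(1 - p_crash).

    graph maps node -> {neighbor: (travel_minutes, p_crash)}, with p_crash < 1.
    risk_weight = 0 gives the fastest path; larger values favor safer segments.
    Returns the list of nodes on the best path, or None if target is unreachable.
    """
    best = {source: 0.0}   # lowest known cost to reach each node
    prev = {}              # predecessor of each node on its best path
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, (minutes, p_crash) in graph.get(node, {}).items():
            new_cost = cost + minutes + risk_weight * -math.log(1.0 - p_crash)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical network: the route via B is faster but riskier than the route via C.
ROADS = {
    "A": {"B": (10.0, 0.020), "C": (14.0, 0.001)},
    "B": {"D": (10.0, 0.020)},
    "C": {"D": (14.0, 0.001)},
    "D": {},
}

fastest = safest_path(ROADS, "A", "D", risk_weight=0.0)
safer = safest_path(ROADS, "A", "D", risk_weight=1000.0)
print(fastest)  # ['A', 'B', 'D']
print(safer)    # ['A', 'C', 'D']
```

In practice, the choice of `risk_weight` encodes how many minutes of travel time a decision-maker is willing to trade for a given reduction in crash risk, which is itself an open modeling question.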

In our estimation, a more comprehensive/interdisciplinary approach to crash risk modeling is needed. The research questions should not be limited to better understanding the factors contributing to crash risk, but should also consider how the output from the research can be utilized by commuters and commercial drivers. This is especially important since, despite advancements in sensing technologies and the development of public policies that tackle distracted driving/cell phone usage, the rates and counts of motor vehicle injuries and fatalities have remained alarmingly high.

**Supplementary Materials:** In an effort to bridge the gap between the crash prediction literature and the hazmat/optimization literature, we have made available all of the source code used for (a) scraping crash-related data; (b) preprocessing such data; (c) descriptive analytics (i.e., visualizing traffic/weather/crash data and/or clustering); and (d) explanatory modeling in a GitHub repository: https://github.com/caimiao0714/TrafficSafetyReviewRmarkdown. To facilitate the use of this code, we host a website that showcases how the code can be used and depicts some of its results. The website was constructed using an R Markdown file [131] and is hosted on the following GitHub Page: https://caimiao0714.github.io/TrafficSafetyReviewRmarkdown/. We hope that the Supplementary Materials provided with this manuscript help promote "open data science" practices in our research community.

**Funding:** This work was supported in part by: the National Science Foundation (CMMI-1635927 and CMMI-1634992); the Ohio Supercomputer Center (PMIU0138 and PMIU0162); the American Society of Safety Professionals (ASSP) Foundation; the University of Cincinnati Education and Research Center Pilot Research Project Training Program; the Transportation Informatics Tier I University Transportation Center (TransInfo); a Google Cloud Platform research grant for data management; and a Dark Sky grant for extended API access (i.e., they increased the number of possible queries per day). Dr. Megahed's research was also partially supported by the Neil R. Anderson Endowed Assistant Professorship at Miami University.

**Conflicts of Interest:** The authors declare no conflict of interest.
