*4.1. Annual Central Valley Runoff Reconstructions*

Of the 69 initial chronologies, 60 passed screening for temporal stability of the runoff signal and for significance of the SSR regression model (Supplementary Material C). Most of the lagged models resulting from stepwise regression have a simple structure. All 60 models include a lag-0 (current-year) predictor, and 28 models have just one predictor. The median number of predictors is 2 and the maximum is 5. The minimum, median, and maximum percentages of calibration-period variance explained by the models are 8%, 29%, and 73%, respectively. All models are significant as judged by *p* < 0.05 for the overall F of regression. Blue oak chronologies from the Central Valley or the coastal region tend to have the strongest signal (Figure 1).

Time coverage by SSRs varies according to the coverage of the chronologies themselves. Thirteen of the SSRs have uniform coverage for 903–2008 and comprise the subset for the long reconstruction; all 60 SSRs, with a common period 1640–2001, are available for the short reconstruction. As in previous studies (e.g., Meko et al. [53]), long tree-ring chronologies of western juniper from south-central Oregon are important contributors to the long network (Figure 1).

Regression of *y<sub>t</sub>* on the 13-site-mean and 60-site-mean SSRs yields long and short reconstruction models that account for 66% and 77%, respectively, of the calibration-period variance of *y<sub>t</sub>* after adjustment for loss of degrees of freedom (adjusted *R*<sup>2</sup>; Table 3). Both models validate strongly, as indicated by high positive RE values from cross-validation and by highly significant correlation of cross-validation predictions with observed flow. Both reconstructions also closely track, and are significantly correlated with, earlier flows (spanning WYs 1872–1900) from gages on the Sacramento, San Joaquin, and other Central Valley rivers that comprise the Bulletin 5 8RI flows (Figure 3) [65]; these earlier data had also been used, with less success, for validation of the first Sacramento River runoff reconstruction [51]. Both models greatly underestimate the flow in WY 1890. Tree-ring reconstructions calibrated by regression with gaged flows tend to be conservative (biased toward the mean) because the variance explained by regression is always less than 100%. This compression of variance would, in theory, lead to underestimation of both wet and dry extremes, and it complicates direct comparison of observed and reconstructed magnitudes of extreme flow events [83]. Moreover, as seen in Figure 3, the magnitude of extreme high flows may be especially difficult to capture because the growth of drought-sensitive trees is expected to benefit less and less from additional moisture once soil moisture exceeds some high level.
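The statistics discussed above (adjusted *R*<sup>2</sup>, cross-validation RE, and variance compression) can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the data are random numbers rather than SSRs or 8RI flows, the function names are hypothetical, a single predictor stands in for the site-mean SSR, and leave-one-out cross-validation is used as one common choice that may differ from the scheme used in this study. The sketch shows that, for least-squares regression with an intercept, the variance of the fitted values equals *R*<sup>2</sup> times the variance of the observations, which is why reconstructed extremes are pulled toward the mean.

```python
import numpy as np


def fit_ols(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]


def r_squared(y, yhat):
    """Fraction of the variance of y explained by the fitted values."""
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, n_predictors):
    """R^2 adjusted for the loss of degrees of freedom."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def loo_reduction_of_error(x, y):
    """Leave-one-out RE: skill relative to predicting the mean of the
    retained (calibration) observations for each withheld year."""
    n = len(y)
    sse_model = 0.0
    sse_mean = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        a, b = fit_ols(x[keep], y[keep])
        sse_model += (y[i] - (a + b * x[i])) ** 2
        sse_mean += (y[i] - y[keep].mean()) ** 2
    return 1.0 - sse_model / sse_mean


# Synthetic stand-ins (assumptions): x plays the role of a site-mean SSR,
# y the role of the transformed flow.
rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 2.0 + 0.8 * x + rng.normal(scale=0.5, size=n)

a, b = fit_ols(x, y)
yhat = a + b * x

r2 = r_squared(y, yhat)
adj_r2 = adjusted_r_squared(r2, n, n_predictors=1)
re_cv = loo_reduction_of_error(x, y)
var_ratio = yhat.var() / y.var()  # equals R^2 for OLS with an intercept

print(f"R2={r2:.3f}  adjusted R2={adj_r2:.3f}  RE={re_cv:.3f}")
print(f"variance of fitted values / variance of observations = {var_ratio:.3f}")
```

Because the variance ratio equals *R*<sup>2</sup>, a model explaining 66% of the variance would produce fitted values with roughly two-thirds of the observed variance over the calibration period, regardless of whether the missed extremes are wet or dry.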

**Table 3.** Statistics for long record and short record tree-ring reconstructions (Model 1). Statistics are for regression models whose predictand is transformed 8RI flow (square root billion cubic meters).


<sup>a</sup> Number of contributing tree-ring chronologies. <sup>b</sup> Overall F (not listed) for both models is highly significant (*p* < 1E-25). <sup>c</sup> RE = reduction-of-error statistic; *r* = correlation of cross-validation predictions with observed flows. The two correlations listed are both larger than any of the correlations for the 1000 simulated reconstructed flow series (*p* < 0.001).

**Figure 3.** Time series plots (WYs 1872–1900) comparing reconstruction of Central Valley runoff (Model 1) with instrumented flows for (**a**) long record reconstruction and (**b**) short record reconstruction. Spearman correlations and significance annotated. Significance not adjusted for autocorrelation because none of the series are positively autocorrelated. This period, as documented in Bulletin 5 [65], precedes years used for calibrating and validating reconstruction models.

Both the short and long record reconstructions closely track the sum of the individual reconstructions generated previously for the Sacramento and San Joaquin Rivers by Meko et al. [20]. For the 1640–2001 period common to these reconstructions, the Pearson correlation is *r* = 0.83 for the long record reconstruction and *r* = 0.95 for the short record reconstruction. For the earlier 903–1639 period in common with the long reconstruction only, the correlation remains high (*r* = 0.82). While agreement between reconstructions is limited by differences in tree-ring networks and statistical reconstruction methods, the reconstructions are reasonably consistent in their characterization of droughts and wet periods. The long record reconstruction, for example, includes a period of low runoff values in the mid-1100s that aligns with a period of notable persistent drought in both the Colorado and Sacramento Basins [95,100].

The long and short record 8RI reconstructions are shown in Figure 4, indicating annual and 5-, 10-, 20-, and 100-year center-averaged values. When the late-19th to early-20th century reconstructed flows are compared with reconstructed flows in the preceding centuries, it is apparent that single-year wet and dry extremes are more variable; however, time-averaged flows are more consistent across the different periods. This is also summarized in Table 4, which shows that for all of the averaging periods presented, from 5 to 100 years, the range of flows is essentially similar for the full reconstruction and the more recent 1872–2001 period. Importantly, the entirety of the instrumented period, 1872–2018, generally shows a wider range in flows than the reconstructed values. Additionally, low flows over different averaging periods are lower in the instrumental record than in the longer reconstruction period. This comparison suggests that the instrumental flow record is a reasonable representation of the conditions over the past millennium and captures extremes in the low flow periods.
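The comparison of flow ranges across averaging periods can be sketched as follows. This is an illustrative example under stated assumptions, not the procedure used to build Table 4: the flow series is synthetic, the function names `centered_mean` and `range_by_window` are hypothetical, the placement of the center for even-length windows is one possible convention, and the averages for the recent period are computed on that subset alone.

```python
import numpy as np


def centered_mean(series, window):
    """Centered moving average; NaN where the window is incomplete.

    For even windows the result is placed at index (window - 1) // 2 of
    each window, which is an assumed convention.
    """
    series = np.asarray(series, dtype=float)
    out = np.full(series.shape, np.nan)
    if window > len(series):
        return out
    means = np.convolve(series, np.ones(window) / window, mode="valid")
    start = (window - 1) // 2
    out[start:start + len(means)] = means
    return out


def range_by_window(series, windows=(1, 5, 10, 20, 100)):
    """Minimum and maximum of the centered average for each window length."""
    result = {}
    for w in windows:
        smoothed = centered_mean(series, w)
        result[w] = (float(np.nanmin(smoothed)), float(np.nanmax(smoothed)))
    return result


# Synthetic annual flows (assumption) labeled with the years of the long
# reconstruction, 903-2008; values are not real 8RI data.
rng = np.random.default_rng(1)
years = np.arange(903, 2009)
flows = rng.lognormal(mean=3.0, sigma=0.4, size=years.size)

full_ranges = range_by_window(flows)
recent = flows[(years >= 1872) & (years <= 2001)]
recent_ranges = range_by_window(recent)

for w in full_ranges:
    lo_f, hi_f = full_ranges[w]
    lo_r, hi_r = recent_ranges[w]
    print(f"{w:>3}-yr  full: {lo_f:6.1f} to {hi_f:6.1f}   "
          f"1872-2001: {lo_r:6.1f} to {hi_r:6.1f}")
```

Longer windows necessarily give ranges that lie within the annual range, so the informative comparison is between periods at the same window length, as in Table 4.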

**Figure 4.** Time series plots of long and short tree-ring reconstructions of Central Valley runoff (Model 1) for (**a**) long record reconstruction, spanning 903–2008 and (**b**) short record reconstruction, spanning 1640–2001. Smooth lines represent 20-year average flows. Other averaging periods, as summarized in Table 4, are excluded from the figure for clarity.

**Table 4.** Range of reconstructed and instrumented Central Valley runoff (8RI) for different averaging periods. Units are billion cubic meters (BCM).


Another way to look at the flow reconstructions is to examine the sequence of wet and dry periods in the record, in comparison to the contemporary period. The long record reconstruction was reviewed to highlight patterns of low and high flows that are of interest for water resources management. Figure 5 shows 20-year centered averages for the four 121-year periods with the greatest variations (970–1090; 1100–1220; 1570–1690; 1850–1970). This figure illustrates that shifts between wet and dry periods have occurred several times in the past millennium, but in most of these instances, the range of flow variation is not greater than that of the reconstructed flows for the late 19th and early 20th centuries. Actual observed flows during high flow periods are higher than the reconstructed flows, a bias expected from the variance compression inherent in regression and possibly also from the lower sensitivity of tree growth to soil moisture beyond a threshold level. However, if only reconstructed flows are considered for comparison over different time periods, as done throughout this study, it may be inferred that flow patterns in the instrumental period after the 1870s, especially over decadal time scales, are a reasonable representation of the overall variability seen in the past millennium.

**Figure 5.** Selected flow periods over the long reconstruction compared with instrumented flow from (20-year centered average for both flow terms).
