*Discussion of Model Evaluations*

Qualitative and quantitative comparisons of the WARMF and Regression models for forecasting EC at the Vernalis compliance monitoring station were made to assess the utility of both models. A simple evaluation of the differences between observed EC and model-based EC forecasts at Vernalis between 22 February 2018 and 22 May 2020 suggested that the Regression model EC forecasts were generally closer to the overall mean of the observations than the WARMF model EC forecasts (previously shown in Figure 5). Although a total of 820 EC observations were made at the Vernalis monitoring station, fewer forecasts were made because of limited personnel availability and occasional data validity issues. The WARMF model EC forecasts were made on the Monday of each week, owing to personnel constraints and the greater amount of time required to assemble model time series input data and complete each forecast; hence, forecasting frequency was roughly three times higher for the Regression model (previously shown in Table 2).

As shown in Figure 7, the Regression model provided EC forecasts with mean differences of less than or equal to 5 μS/cm for the first 7 days (Δ Day + 0 to Δ Day + 6). In contrast, the WARMF model provided EC forecasts with mean differences of less than or equal to 5 μS/cm for only the first 5 days (Δ Day + 0 to Δ Day + 4). Based on these measures of performance, the Regression model provided EC forecasts with reduced error relative to the WARMF model for the period from Δ Day + 4 to Δ Day + 6.
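The mean-difference comparison by lead time can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the tuple layout, and the sign convention (forecast minus observed) are assumptions, and the sample values are made up rather than taken from the study.

```python
from collections import defaultdict
from statistics import mean


def mean_difference_by_lead(pairs):
    """Return {lead_day: mean(forecast - observed)} for (lead_day, observed, forecast) tuples."""
    differences = defaultdict(list)
    for lead_day, observed_ec, forecast_ec in pairs:
        # Assumed sign convention: positive means the forecast was too high.
        differences[lead_day].append(forecast_ec - observed_ec)
    return {lead: mean(values) for lead, values in sorted(differences.items())}


def leads_within_tolerance(mean_differences, tolerance=5.0):
    """List lead days whose absolute mean difference is at or below the tolerance (uS/cm)."""
    return [lead for lead, diff in mean_differences.items() if abs(diff) <= tolerance]


# Made-up illustrative values, not data from the study.
example_pairs = [
    (0, 500.0, 502.0),
    (0, 510.0, 508.0),
    (1, 500.0, 510.0),
    (1, 520.0, 524.0),
]
example_means = mean_difference_by_lead(example_pairs)
print(example_means)
print(leads_within_tolerance(example_means))
```

Grouping by lead day before averaging keeps each lead time's error separate, which is what allows a statement such as "within 5 μS/cm through Δ Day + 6" to be read directly from the result.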

Forecast EC standard deviation, a measure of the dispersion of the EC forecasts or EC forecast differences around the mean EC value, showed that the standard deviations of the Regression model EC forecasts closely approximated those of the EC observations at all lead times. The standard deviations of the WARMF model EC forecasts were consistently less than the standard deviations of the EC observations until lead time Δ Day + 8. The standard deviation of forecast EC differences increased steadily with forecast lead time for both models, with the WARMF model EC forecasts exhibiting greater standard deviations than the Regression model throughout the forecast period (previously shown in Figure 2).

To examine how individual model bias affected the mean of the differences between the observed EC and the model forecast EC, forecasts that were higher than the measured EC were examined separately from forecasts that were lower than the corresponding EC observations. Figures 8 and 9 showed comparisons of the positive and negative bias EC results for the Regression and WARMF models, respectively. For the positive bias differences in EC, the Regression model had smaller differences than the WARMF model at all lead times. For the negative bias differences in EC, the Regression model had smaller negative mean differences than the WARMF model. For both the positive and negative bias forecast mean differences in EC, the Regression model performed better than the WARMF model for lead times from Δ Day + 0 to Δ Day + 10. From Δ Day + 12 to Δ Day + 14, the performance of both model EC forecasts was approximately the same.
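The reason for separating over-forecasts from under-forecasts is that errors of opposite sign cancel in a simple mean. The sketch below shows one way to make that split; the function name and the convention that a difference is forecast minus observed are assumptions for illustration, and the input values are made up.

```python
from statistics import mean


def split_bias(differences):
    """Return (mean of positive differences, mean of negative differences).

    Each difference is assumed to be forecast EC minus observed EC.
    Exact zeros belong to neither group; an empty group yields None.
    """
    positive = [d for d in differences if d > 0]
    negative = [d for d in differences if d < 0]
    positive_mean = mean(positive) if positive else None
    negative_mean = mean(negative) if negative else None
    return positive_mean, negative_mean


# Made-up differences whose overall mean is zero even though every
# non-zero forecast is off by several uS/cm.
example_differences = [4.0, -2.0, 6.0, -8.0, 0.0]
print(mean(example_differences))
print(split_bias(example_differences))
```

In practice the split would be applied to the differences at each lead time separately, so that positive and negative bias can each be tracked as the forecast horizon lengthens.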

Visual inspection of the forecast EC time series results did not reveal any particular seasonal influence on the results. The RMSE between the observed EC data and model EC forecasts was also calculated as a function of forecast EC lead time. These results revealed that RMSE increased with EC forecast lead time, indicating a decrease in the reliability of model forecasts. The Regression model showed consistently lower RMSE values than the WARMF model. The California Nevada River Forecast Center has typically published forecasts extending only 10 days. As previously discussed, technical analysts associated with the real-time salinity management program have considered fourteen days to be the minimum period that would reasonably allow agricultural and wetland managers time to adjust salt load export to the SJR.
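The RMSE calculation by lead time described above can be illustrated as follows. This is a sketch under stated assumptions: the function name and tuple layout are hypothetical, and the sample values are made up rather than drawn from the Vernalis record.

```python
import math
from collections import defaultdict


def rmse_by_lead(pairs):
    """Return {lead_day: RMSE} for (lead_day, observed, forecast) tuples."""
    squared_errors = defaultdict(list)
    for lead_day, observed_ec, forecast_ec in pairs:
        squared_errors[lead_day].append((forecast_ec - observed_ec) ** 2)
    return {
        lead: math.sqrt(sum(values) / len(values))
        for lead, values in sorted(squared_errors.items())
    }


# Made-up illustrative values, not data from the study.
example_pairs = [
    (0, 500.0, 503.0),
    (0, 500.0, 497.0),
    (1, 500.0, 505.0),
    (1, 500.0, 495.0),
]
example_rmse = rmse_by_lead(example_pairs)
print(example_rmse)
```

Because the errors are squared before averaging, RMSE does not let over- and under-forecasts cancel, which makes it a useful complement to the mean difference.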

Visual analysis and statistical tests suggested that neither the observed EC data nor the model EC forecasts were normally distributed, whereas the variances were sufficiently similar to justify the use of the matched pair permutation test, which tests whether the means of the observed EC and model EC forecasts are statistically similar. The Regression model showed better goodness of fit than the WARMF model (Figure 16), as assessed by the R-squared coefficient. The matched pair permutation tests indicated that both the Regression and WARMF models provided reasonable forecasts extending out to approximately 7 days, which is not quite long enough to satisfy the goal of 14 days suggested for agricultural and wetlands stakeholder operations.
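For readers unfamiliar with matched pair permutation tests, the sketch below shows one common form, a random sign-flip test on the paired differences. The text does not specify which variant or software was used, so this should be read as an assumed illustration: the function name, the number of permutations, the two-sided statistic, and the add-one p-value correction are all choices made here, not details from the study.

```python
import random
from statistics import mean


def paired_permutation_test(observed, forecast, n_permutations=10000, seed=0):
    """Two-sided sign-flip permutation test on the mean paired difference.

    Under the null hypothesis that observed and forecast values have the
    same mean, each paired difference is equally likely to be positive or
    negative, so its sign can be flipped at random.
    """
    if len(observed) != len(forecast):
        raise ValueError("observed and forecast must have the same length")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    differences = [o - f for o, f in zip(observed, forecast)]
    observed_stat = abs(mean(differences))
    extreme = 0
    for _ in range(n_permutations):
        flipped = [d if rng.random() < 0.5 else -d for d in differences]
        if abs(mean(flipped)) >= observed_stat:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)
```

A large p-value means the data give no evidence that the forecast mean differs from the observed mean at that lead time; a small p-value indicates a systematic offset. The test makes no normality assumption, which is why it suits data like those described above.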

The prior analyses used the full data set of all available paired values of daily observed EC and model forecast EC for both the Regression and WARMF models. However, the Regression model EC forecasts were made approximately three times more frequently than the WARMF model EC forecasts over the past 2 years (Table 3). Comparisons of the concurrent day EC forecast results, that is, results for days on which both models produced forecasts, with those for the full data set suggested that the results of the analysis were similar. In both cases, the WARMF model EC forecasts were consistently lower than those of the Regression model and also lower than the observed data (comparing Figure 1 with Figure 17). The standard deviations of differences between forecasts and observations for the WARMF model EC forecasts were greater than those for the Regression model at all forecast lead times for both the full and concurrent data sets.
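Restricting the comparison to concurrent forecasts amounts to keeping only the forecasts that both models issued for the same day and lead time. A minimal sketch follows, assuming each model's forecasts are stored in a dictionary keyed by (issue date, lead day); that data layout, the function name, and the sample values are hypothetical.

```python
def concurrent_forecasts(regression, warmf):
    """Return [(key, regression_ec, warmf_ec)] for keys present in both inputs.

    Each input maps (issue_date, lead_day) to a forecast EC value.
    """
    shared_keys = sorted(regression.keys() & warmf.keys())
    return [(key, regression[key], warmf[key]) for key in shared_keys]


# Made-up illustrative values: the regression model runs more often,
# while the WARMF model runs only on Mondays (2018-02-26 was a Monday).
regression_example = {
    ("2018-02-26", 0): 505.0,
    ("2018-02-28", 0): 498.0,
    ("2018-03-02", 0): 512.0,
}
warmf_example = {
    ("2018-02-26", 0): 490.0,
}
print(concurrent_forecasts(regression_example, warmf_example))
```

Running the same statistics on this subset removes the effect of unequal forecasting frequency, so that any remaining difference between the models is not an artifact of sample size or timing.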

In general, the Regression model performed better than the WARMF model for forecasting EC for up to one week into the future.
