1. Introduction
Forest Inventory and Monitoring sample designs are often conceived and presented with a notion of permanence. A set of areas (plots) on the ground is established through some mechanism to be observed in perpetuity at defined intervals of time. Additionally, the estimation system is often designed to correspond to that permanently established set of observations. At the same time, there is often no permanence to funding allocations to make those ground observations or to continue or complete specific scientific investigations. When a budget reduction occurs to the extent that there will necessarily be fewer plots observed per year, the problem might be viewed as a tradeoff between reducing the number of plots (n) or lengthening the number of years (or cycle length) taken to observe the plots. Often, when the problem is viewed that way, there is scant acknowledgement of the fact that the tradeoff does not constitute a one-for-one relationship. An exception to that common deficiency can be found in Van Deusen and Roesch [1], which discusses the difference in the informational costs of reducing the number of plot locations relative to lengthening the cycle when estimating the components of growth. Here, we explore some of those differences at a very basic level, as they relate to variants of three general classes of estimators of annual cubic meter per hectare volume.
We compare the simulated contrasting effects of decreasing the total number of plots and increasing the cycle length on three general estimation systems and variations of two of those estimation systems. The first general estimation system is used by the USDA Forest Service’s Forest Inventory and Analysis (FIA) Program and can be found in Bechtold and Patterson [2]. That system is a moving average estimator, applied to the end of the period, rather than to the middle of the period (or window) and was expressed in Roesch [3] as:
$$\mathrm{EOP}_t = \frac{1}{P}\sum_{i=t-P+1}^{t} y_i, \qquad (1)$$
where P is the number of panels (or cycle length), t is the year of interest, within a time period of 1 to T, and y_i is the observation of the variable of interest in year i. The general statistical properties of this estimator are well-known: notably, because it uses all of the available data, it has very low variance, but will be biased in the presence of any non-zero trend, and that bias will increase relative to the cycle length. In its initial form, the first annual estimate is available at the end of the first cycle. In this paper, we formulate estimates for every year in which there is data available. To do that, we have to augment the estimator in Equation (1) with estimates for the first P − 1 years formed from all of the data available up to and including each year t:
$$\mathrm{EOPSpline}_t = \frac{1}{\min(t,P)}\sum_{i=\max(1,\,t-P+1)}^{t} y_i. \qquad (2)$$
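The end-of-period moving average and its start-up rule for the first P − 1 years can be sketched in a few lines of Python. This is a minimal illustration, not the FIA production code: the function name `eop_spline` and the noise-free linear series are hypothetical, chosen only to make the lag bias visible. Under a linear trend with slope b, averaging the P most recent years places the estimate (P − 1)/2 years behind the year of interest, so the bias is −b(P − 1)/2.

```python
def eop_spline(y, P):
    """End-of-period moving average with a cumulative-mean start-up.

    y : list of annual observations (e.g., panel means); y[0] is year 1
    P : cycle length (number of panels)
    Returns one estimate per year in y.
    """
    estimates = []
    for t in range(1, len(y) + 1):   # t is the 1-based year of interest
        width = min(t, P)            # fewer than P years exist while t < P
        window = y[t - width:t]      # the `width` most recent years, ending at t
        estimates.append(sum(window) / width)
    return estimates


# Hypothetical noise-free linear trend: y_i = 100 + 2 * i for years 1..12
slope, P = 2.0, 5
y = [100.0 + slope * i for i in range(1, 13)]
est = eop_spline(y, P)

# Estimate minus the true value in the year of interest
lag_bias = [e - v for e, v in zip(est, y)]
```

In this noise-free example the estimate equals the single available observation in year 1, and from year P onward the bias is constant at −b(P − 1)/2, which grows in magnitude with the cycle length.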
The second general estimation system utilizes the moving window estimator:
$$\mathrm{MWK}_t = \frac{1}{k}\sum_{i=t-(k-1)/2}^{t+(k-1)/2} y_i, \qquad (3)$$
where k is an odd integer giving the window width. In the applications below, we use k = 1, 3, 5, 7 and 9, noting that MW1 is simply the panel mean. The vector of annual estimates from each application of MWK will be shorter than the number of input years by (k − 1), split evenly between the first 0.5(k − 1) years (the initial extraneous years) and the last 0.5(k − 1) years (the final extraneous years). We initially set k = P and then provide estimates for the extraneous years using two different strategies. In the first strategy, we use a spline of successively smaller moving windows as we approach the most extraneous year in the estimation interval. For each year of interest, we use the largest window available for which the year of interest is centered in the window:
$$\mathrm{MWSpline}_t = \frac{1}{k_t}\sum_{i=t-(k_t-1)/2}^{t+(k_t-1)/2} y_i, \qquad k_t = \min\{P,\; 2t-1,\; 2(T-t)+1\}. \qquad (4)$$
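The centered moving window and the first (spline) strategy can be sketched as follows. This is an illustrative reading of the description above, assuming an odd cycle length P; the function names `moving_window` and `mw_spline` and the example series are hypothetical.

```python
def moving_window(y, k):
    """Centered moving window of odd width k.

    Returns len(y) - (k - 1) estimates: the first and last (k - 1) / 2
    years (the extraneous years) have no full window and are omitted.
    """
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    h = (k - 1) // 2
    return [sum(y[c - h:c + h + 1]) / k for c in range(h, len(y) - h)]


def mw_spline(y, P):
    """One estimate per year, using the widest centered window (at most P)
    that fits inside the series at each year."""
    T = len(y)
    estimates = []
    for c in range(T):                        # c is the 0-based year index
        h = min((P - 1) // 2, c, T - 1 - c)   # half-width limited by the series ends
        estimates.append(sum(y[c - h:c + h + 1]) / (2 * h + 1))
    return estimates


# Hypothetical 7-year series with cycle length P = 5
y = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
core = moving_window(y, 5)    # 3 estimates, for years 3 to 5
full = mw_spline(y, 5)        # 7 estimates; windows of width 1, 3, 5, 5, 5, 3, 1
```

With k = 1 the window reduces to the annual observation itself, and in the non-extraneous years `mw_spline` returns the same values as `moving_window` with k = P.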
In the second strategy, we supplement the estimates in the extraneous years using the method in Roesch [3]. For each cycle length P, the estimator MWP was supplemented through recursion to provide estimates for the otherwise missing years, under the assumption that the trend estimated in the initial and final estimated years remained constant through the estimator’s initial and ending extraneous years, respectively. To supplement estimator MWP to provide estimates for the first 0.5(k − 1) and the last 0.5(k − 1) years of T years, apply the algorithm,
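The recursion of Roesch [3] is not reproduced here. As a rough illustration of the stated assumption only (that the trend at each end of the estimated series stays constant through the extraneous years), the sketch below substitutes a plain linear extrapolation: the change between the first two estimates is carried backward, and the change between the last two is carried forward. The function name `extend_constant_trend` and the example values are hypothetical, and the actual algorithm may estimate the end trends differently.

```python
def extend_constant_trend(core, h):
    """Pad a series of window estimates with h extrapolated values at each end.

    core : estimates for the non-extraneous years (at least two values)
    h    : number of extraneous years at each end, 0.5 * (k - 1)
    Assumption (not from the source): the end trend is taken as the simple
    difference between the two outermost estimates.
    """
    if len(core) < 2:
        raise ValueError("need at least two estimates to define a trend")
    head_slope = core[1] - core[0]      # trend at the start of the series
    tail_slope = core[-1] - core[-2]    # trend at the end of the series
    head = [core[0] - head_slope * j for j in range(h, 0, -1)]
    tail = [core[-1] + tail_slope * j for j in range(1, h + 1)]
    return head + list(core) + tail


# Hypothetical example: three core estimates from a width-5 window (h = 2)
supplemented = extend_constant_trend([10.0, 12.0, 15.0], 2)
```

Extrapolating a trend in this way carries no new data into the extraneous years, which is consistent with supplemented estimators trading some additional bias for lower variance there.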
The third general estimation system utilizes the dual-filter estimators of Roesch [3], in which two filters are applied in succession, the first being a moving window estimator and the second being a variant of Theil’s mixed estimator (Theil [4]). The specific variant that we use assumes that the results of the first filter will conform to the quadratic model described in Van Deusen [5,6], who was the first to propose mixed estimators for continuous forest inventory designs. The dual-filter approach had previously been suggested for this rotating panel design in Roesch [7], as a way of reducing the variance of inputs into the mixed estimator. In the simulations described below, for the dual-filter estimators, we assume that the off-diagonal co-variances are zero in the mixed estimator, even though that is clearly not the case (because the panels have already been combined in the first filter). We use zeros in the off-diagonal co-variances for two reasons. First, estimation of the off-diagonal co-variances is extremely computationally intensive and therefore time-prohibitive in a simulation. More importantly, the mixed estimator will only be improved if the data available are sufficient to ensure that the co-variances are well-estimated, and that will not always be true. In a production environment, one could include an algorithm to decide when sufficient data are available to include the off-diagonal co-variances. Nevertheless, the variants of the dual-filter estimators are shown to work well below.
We formulate a dual-filter mixed estimator, MixMWP, using MWP for the first filter and then let:
n = T − k + 1;
Σ = an n row by n column co-variance matrix for MWP;
Ω = an (n − 3) row by (n − 3) column sub-matrix of Σ, using rows and columns from 4 to n;
R = an (n − 3) row by n column constraint matrix, with zeros everywhere except that each row t has the sequence [1,−3,3,−1] beginning in column t.
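The constraint matrix R can be built directly from its description. Each row holds the coefficients [1, −3, 3, −1], which form a (negated) third difference of four consecutive annual values, so R maps any exactly quadratic series to a vector of zeros; that is how it encodes the quadratic model. The sketch below is illustrative only, and the names `constraint_matrix` and `apply_rows` and the example series are hypothetical.

```python
def constraint_matrix(n):
    """(n - 3) x n matrix; row t holds [1, -3, 3, -1] starting in column t."""
    if n < 4:
        raise ValueError("need at least four years to form a third difference")
    R = [[0] * n for _ in range(n - 3)]
    for t in range(n - 3):
        R[t][t:t + 4] = [1, -3, 3, -1]
    return R


def apply_rows(R, v):
    """Matrix-vector product R v using plain lists."""
    return [sum(r * x for r, x in zip(row, v)) for row in R]


n = 8
R = constraint_matrix(n)

# A quadratic series satisfies the constraints exactly; a cubic series does not.
quadratic = [3 + 2 * i + 0.5 * i * i for i in range(1, n + 1)]
cubic = [i ** 3 for i in range(1, n + 1)]

quad_residual = apply_rows(R, quadratic)   # all zeros
cubic_residual = apply_rows(R, cubic)      # constant -6 in every row
```

The mixed estimator does not force these third differences to be exactly zero; how strongly it pulls the first-filter estimates toward a quadratic shape is governed by the co-variance attached to the constraints.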
The dual filter mixed estimator is then:
The value of the unknown parameter p is estimated with maximum likelihood, as shown in Van Deusen [6].
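For orientation, a generic form of Theil's mixed estimator built from these inputs is written out below. It is a sketch under stated assumptions rather than a transcription of the estimator used in this paper: y denotes the n-vector of first-filter (MWP) estimates, β the vector of underlying annual values, and the assumed model is y = β + ε with Cov(ε) = Σ, together with the stochastic constraint Rβ = e with Cov(e) = pΩ. The symbols y, β, ε and e, and the way p scales Ω, are assumptions introduced here.

```latex
% Generalized least squares solution for the stacked system
%   y = \beta + \varepsilon,  Cov(\varepsilon) = \Sigma
%   0 = R\beta - e,           Cov(e) = p\,\Omega
\hat{\beta}
  = \left( \Sigma^{-1} + \tfrac{1}{p}\, R^{\top} \Omega^{-1} R \right)^{-1}
    \Sigma^{-1} y
```

In this form, a large value of p leaves the first-filter estimates nearly unchanged, while a small value of p pulls them toward a series whose third differences vanish, that is, toward a quadratic trend.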
As with the MWP estimator, the MixMWP estimator does not provide estimates for the first 0.5(k − 1) and the last 0.5(k − 1) years in the series, and again there are a number of ways we could formulate estimates for these extraneous years. We could use the same two strategies in Equations (4) and (5) to formulate MixSpline and MixSuppl that we used to formulate MWSpline and MWSuppl, respectively. In this paper, we do not consider MixSpline further but do consider MixSuppl, which appeared to be somewhat useful in a similar context in Roesch [3]. A third strategy would be to formulate a dual-filter estimator using MWSpline for the first filter, and then let:
Σ = the T row by T column co-variance matrix for MWSpline;
Ω = a (T − 3) row by (T − 3) column sub-matrix of Σ, using rows and columns from 4 to T;
R = a (T − 3) row by T column constraint matrix, with zeros everywhere except that each row t has the sequence [1,−3,3,−1] beginning in column t.
We can use these inputs to formulate:
In this paper, we test the performance of these estimators under both a simulated decrease in plot location intensity and increasing cycle length.
3. Results
The wBias statistics calculated over the 152 populations of the estimators for cycle lengths of 5, 7 and 9 years from the simulation under sampling intensities 1 through 3 are given in Figure 3, while the corresponding wMSE statistics are given in Figure 4. In the nine graphs of each figure, increasing cycle length is shown from left to right, while decreasing sample intensity is shown from top to bottom.
We can make several observations that are common to the nine graphs in Figure 3. First, we note that the panel mean is the only design-unbiased estimator. All of the other estimators show bias in the annual estimates, the magnitude of which depends on how well their underlying models fit the individual population trajectories. Often, that bias is not much greater than the empirical bias of the panel mean. EOPSpline is almost always the most biased except for estimates made for the years 1995 and 1996. Note that in 1995, EOPSpline is equal to the panel mean, given that data for only 1 year are available. The supplemented estimators (MWSuppl and MixSuppl) show more bias in Figure 3, but lower wMSE (in Figure 4) in the extraneous years than their splined estimator (MWSpline and MixMWsp) counterparts. In Figure 3, as cycle length increases (from the left to the right graphs), the wBias of all estimators except the panel mean increases, with that of the EOPSpline being the most exaggerated.
In Figure 4, in the extraneous years, as cycle length increases, the wMSE of all estimators correspondingly increases. We note no significant change in the wMSE ranking of the estimators for specific years or between years in the non-extraneous years for each cycle length. We do observe differences in the wMSE rankings between the three temporal groupings of (1) the lower extraneous years, (2) the non-extraneous years and (3) the upper extraneous years, for each cycle length. Recall that the number of extraneous (or extreme) years increases as cycle length increases, the effects of which can be seen from left to right in Figure 4.
As sample density decreases from top to bottom in Figure 3, we note minor changes in wBias magnitudes, but no change in wBias ranking of the estimators for each year. Also, for the corresponding graphs in Figure 4, there are proportional changes in wMSE magnitudes but no change in wMSE ranking of the estimators for each year.
4. Discussion
It is not often acknowledged that for the class of sample designs being discussed here, model-based estimators are required to ensure that annual estimates can be made with low variance. Model-based estimators can be model-unbiased, but they are not design-unbiased. As mentioned above, the panel mean is the only design-unbiased estimator that we consider here. The results show that the panel mean has such a high variance that most of the time it has the highest wMSE for the annual estimates. The magnitude of bias in the model-based estimators does depend on how well their underlying models fit the population trend, with EOPSpline being the least responsive to changes in trend. The supplemented estimators (MWSuppl and MixSuppl) show more bias but lower mean squared error in the extraneous years than their splined estimator (MWSpline and MixMWsp) counterparts. As cycle length increases, the bias of all of the model-based estimators increases, with the greatest effect showing in EOPSpline. All of the model-based estimators, except for EOPSpline, work well in the non-extraneous years, although there are differences in the rankings based on mean squared error between the three groupings of estimation years: the lower extraneous years, the non-extraneous years and the upper extraneous years. The decrease in spatial intensity is shown to contribute to minor changes in the magnitudes of bias and mean squared error, but those changes are less drastic than those observed to result from an equivalent reduction in plot observations effected through the lengthening of the cycle.
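The way a biased estimator can still achieve a lower mean squared error than the design-unbiased panel mean follows from the standard decomposition of mean squared error into variance and squared bias. The worked case below is a simplified sketch and not a result of the simulations: it assumes independent annual panel means with a common variance σ², a linear trend with slope b per year, and an end-of-period moving average over a full cycle of P years. The symbols σ² and b are introduced here for illustration.

```latex
\mathrm{MSE}(\hat{\theta}) = \mathrm{Var}(\hat{\theta}) + \mathrm{Bias}(\hat{\theta})^{2}

% Panel mean (one panel):
\mathrm{MSE}_{\text{panel}} = \sigma^{2}

% End-of-period moving average over P panels, linear trend with slope b:
\mathrm{Var} = \frac{\sigma^{2}}{P}, \qquad
\mathrm{Bias} = -\,\frac{b\,(P-1)}{2}, \qquad
\mathrm{MSE}_{\text{EOP}} = \frac{\sigma^{2}}{P} + \frac{b^{2}(P-1)^{2}}{4}
```

Under these assumptions, the moving average has the lower mean squared error whenever b²(P − 1)²/4 < σ²(1 − 1/P). Lengthening the cycle shrinks the variance term only as 1/P while the squared-bias term grows with (P − 1)², which is consistent with the sensitivity to cycle length described above.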
While it is true that all national forest inventories attempt to provide estimates of many different variables, and we have used only one in our simulation, we believe that we can cautiously think of the results of this study in somewhat more general terms. That is, although the variable of interest here was cubic meters per hectare of wood through a 21-year period, the estimators themselves are indifferent to the exact variable of interest. The conclusions here should apply to all variables that have about the same relationship of spatial to temporal diversity or rate of change. This would be true for many, but certainly not all, variables of interest. Relative to what we observed in these simulations, variables that have about the same spatial diversity but a faster rate of change through time would be more affected by a lengthening of the cycle, and those with a slower rate of change through time would be less affected. All of these estimators are affected by how any particular variable changes over the landscape relative to how it changes through time.
In this study, we took the values of the variable of interest (cubic meters per hectare) at face value, rather than treating those values as derivatives of observed values, and then explicitly incorporating the additional model error in the sampling simulations. We do not know if the error associated with the application of volume equations would suggest that larger errors in the sampling simulations should be considered. One reason is that there are many tree species for which no volume equations have been developed. Typically, in those cases, a volume equation developed for a different, but presumably similar, species is used. Additionally, populations of specific species could be changing enough through time that older volume equations have greater error than they once had. These unknowns can only be addressed through future research. Some of this research is currently being conducted, as the improvement of volume equations for many, but not all, species is an area of active research in the USA and worldwide.