*3.3. Area Size and Sample Size Constraint*

The impact of area size, the sample size constraint, the scaling phenomenon, and other associated issues of the comparison sampling analysis are examined here under an expanded scope. The band pair of Aqua MODIS B5 (1240 nm) and SNPP VIIRS M8 (1238 nm) is used as the representative case study because their comparison result has been shown to be the most stable [6,7]; this is primarily due to their well-matched spectral coverage and partly to the radiometric stability of the SWIR bands. For each SNO event, the impact of area and sample size is examined at each spatial scale from the 20-km to the 160-km scale. That is, a 20 × 20 km-square area centered on the nadir crossing is analyzed first, then a 32 × 32 km-square area, and so on up to 160 × 160 km-square. At each scale or area size, statistics are computed for two separate cases. In the sample-unconstrained case, all pixels within the 4.5% homogeneity threshold are used to compute the population statistics. In the sample-constrained case, only a fixed number of pixels of the best homogeneity quality, also necessarily below 4.5%, are used. Sample sizes of 500 and 1000 are used for the sample-constrained cases.
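The two cases can be summarized in a minimal Python sketch. It is illustrative only and is not the authors' processing code: the function name `ratio_stats`, its argument names, and the convention that a smaller homogeneity value means a more uniform pixel are assumptions made here. Only the 4.5% threshold, the best-N selection, and the mean ratio with relative STD come from the text.

```python
import numpy as np

def ratio_stats(ratios, homogeneity, threshold=0.045, n_best=None):
    """Mean ratio, relative STD, and pixel count for one area size.

    ratios      : per-pixel sensor-to-sensor ratios inside the area
    homogeneity : per-pixel homogeneity metric (assumed: smaller is better)
    threshold   : homogeneity cut, 4.5% as in the text
    n_best      : None for the sample-unconstrained case; otherwise keep
                  only the n_best most homogeneous qualifying pixels
    """
    ratios = np.asarray(ratios, dtype=float)
    homogeneity = np.asarray(homogeneity, dtype=float)

    # Step 1: homogeneity filter, applied in both cases.
    keep = homogeneity < threshold
    r, h = ratios[keep], homogeneity[keep]

    # Step 2 (sample-constrained case only): rank the qualifying pixels
    # by homogeneity and keep the best n_best of them.
    if n_best is not None:
        best = np.argsort(h, kind="stable")[:n_best]
        r = r[best]

    mean = r.mean()
    rel_std = r.std(ddof=1) / mean  # relative STD, i.e. the error bar
    return mean, rel_std, r.size
```

Calling `ratio_stats(ratios, homogeneity)` gives the unconstrained statistics, and `ratio_stats(ratios, homogeneity, n_best=1000)` the 1000-sample case. If fewer than `n_best` pixels pass the filter, this sketch simply uses all of them and reports the actual count.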

The usable SNO events range from those with the clearest scene conditions to those with variable conditions. A best-case SNO event, that of 15 December 2016, is shown in Figure 5 for the sample-unconstrained case (red stars) and the two sample-constrained cases, at 500 (green diamonds) and 1000 samples (blue squares). The top panel shows the average of the qualified pixel-based ratios at each scale or area size, and the bottom panel shows the corresponding error bar, or relative STD. The ratio result shows near-perfect broad-scale agreement among the three cases and is remarkably stable at 0.991. The error bars are also very tight for all three cases and are practically identical for the two sample-constrained cases at 0.2%. The overall result indicates a very clean scene condition that generates very robust results at all scales shown, up to 160 km. SNO events with this level of pristine clarity occur only several times per year, but they reveal that the capability of intercomparison is fundamentally at the 0.2% level. The broad-scale agreement also indicates that the sampling procedure is meaningfully constructed and correct. All similarly clear-scene cases at other times have been checked and likewise generate stable ratios and tight error bars across all scales. When clear SNO conditions exist, such as with low cloud or aerosol content, using any small area size within the SNO scene will generate a robust and correct result. Chu et al. [7,25] have examined one such high-precision event to confirm its clear-scene condition.

**Figure 5.** The ratio (top) and the relative precision (bottom) versus area size for the three cases of unconstrained sample size (red stars), constrained size at 1000 samples (blue squares), and constrained size at 500 samples (green diamonds) for the clear-scene SNO event on 15 December 2016 for Aqua MODIS B5 versus SNPP VIIRS M8.

The SNO cases of primary importance are those of marginal statistical quality, with a broad-scale error bar of a few percent, approximately 2% to 4%, that can be improved to below 2% and thus added to the comparison time series. The number of these marginal cases can therefore determine the success or failure of a time series. Figure 6 illustrates two representative cases with a ~2% broad-scale error bar. The labels are the same as those of Figure 5. The most outstanding feature is that, consistently over the entire range of scale or area size shown, the sample-constrained ratios are stable with tighter error bars, while the sample-unconstrained ratios are unstable at the level of 1.5% or worse. In particular, the ratio-versus-scale result of each sample-unconstrained case shows worsening scatter toward larger scales. This decisively demonstrates that the use of larger areas on its own does not improve the comparison result and can in fact make it worse. Thus a strategy such as using larger areas or all available pixels, even when many noisy pixels have been removed via a homogeneity filter, is not a reliable procedure. On the other hand, the two sample-constrained cases, at 500 and 1000 samples, show stable ratios with broad-scale agreement. This finding shows that robust results are achieved not by having more samples but, on the contrary, by limiting them, specifically by using only the best-quality pixels. The error bars of the constrained cases are also tighter and continue to tighten smoothly with increasing scale. The overall strong conclusion is that the application of a sample size constraint, in conjunction with a homogeneity-ranked selection, stabilizes the ratio and tightens the error bar at each scale. Because of this stabilization, area size actually becomes statistically conforming; that is, by increasing the area size, more samples become available for selection and the error bar tightens as expected. The caveat is that fixing the number of best-quality pixels is a necessary intermediate step to facilitate this conforming behavior.

**Figure 6.** The scale-dependent result of the radiometric comparison of (**a**) Aqua MODIS B5 versus SNPP VIIRS M8 on 29 May 2016 and (**b**) Aqua MODIS B4 versus SNPP VIIRS M4 on 11 June 2016, for the ratio (top panel) and the error bar (bottom panel) versus area size for the three cases of unconstrained sample size (red stars), constrained size at 1000 samples (blue squares), and constrained size at 500 samples (green diamonds).

The precision-versus-scale result (bottom panel) also illustrates distinctly different behavior between the sample-constrained and the sample-unconstrained cases. While the sample-unconstrained case exhibits large, unstable scatter, the two sample-constrained cases instead show a smooth, exponentially decreasing pattern of error-bar tightening; the two begin to agree at the 60-km scale and finally settle at ~1%. The unconstrained case uses all pixels, necessarily including those of worse statistical quality, so the inclusion of all pixels does not help to tighten the error bar but in fact worsens the result. An examination of the pixel quality (shown later) illuminates this point. The precision of the unconstrained cases also clusters consistently around the 2% level throughout all scales, thus demonstrating instances of the "scaling phenomenon" within individual SNO events. However, the phenomenon is explicitly revealed here to be only loosely scale-invariant, in that the error bar can vary with scale or area size to some degree. This is a common feature of a majority of SNO events.

The exponential shape of the error bar results also indicates well-behaved properties. For the 1000-sample case, the 32-km scale is the smallest area size that contains more than 1000 pixels, specifically 1024, so it is the first scale at which the analysis can be applied. At this scale, the constrained and the unconstrained results contain almost the same set of pixels, so the two precisions are necessarily closely matched, as shown for both dates at ~2.3%. As the area size increases to include more pixels, the constrained case has more available pixels from which to select those of the best homogeneity quality, which further tightens the error bar. The error bar stabilizes at larger scales, when most pixels of the best homogeneity quality have been found and finding more pixels of better homogeneity from a larger area becomes both less probable and less impactful. This finding suggests that the area size and sample size should not be too tightly matched; instead, given any sample size constraint, the area size should be made larger to provide a larger pool of candidate pixels. For example, for a comparison in the 1-km regime using 1000 samples, an area of 50 × 50 km-square with 2500 available pixels will be better than a 32 × 32 km-square area with only 1024 pixels. The precision result in Figure 6, showing tightening precision at larger areas, supports this point.
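The pool-size argument can be illustrated with a toy simulation. This is a sketch on purely synthetic data, not a reproduction of Figure 6. The noise model is an assumption made here for illustration: per-pixel ratio scatter grows with the homogeneity metric on top of an arbitrary 1% floor, homogeneity values are uniform below the 4.5% threshold, and each 1-km pixel is independent. The function name `toy_error_bar` and all of its parameters are hypothetical.

```python
import numpy as np

def toy_error_bar(scale_km, n_best=1000, threshold=0.045, floor=0.01, seed=0):
    """Relative STD of the best n_best pixels in a scale_km x scale_km area.

    Synthetic model (assumed, not from the paper): each 1-km pixel has a
    homogeneity value h drawn uniformly in [0, threshold), and its ratio
    scatters around 1.0 with standard deviation sqrt(floor**2 + h**2).
    """
    rng = np.random.default_rng(seed)
    n_pixels = scale_km * scale_km  # one pixel per square km

    h = rng.uniform(0.0, threshold, size=n_pixels)
    sigma = np.sqrt(floor**2 + h**2)
    ratio = 1.0 + sigma * rng.standard_normal(n_pixels)

    # Homogeneity-ranked selection of the best n_best pixels.
    best = np.argsort(h)[:n_best]
    selected = ratio[best]
    return selected.std(ddof=1) / selected.mean()

if __name__ == "__main__":
    for scale in (32, 50, 80, 160):
        print(f"{scale:4d} km: relative STD = {toy_error_bar(scale):.4f}")
```

Under these assumptions the error bar is widest at 32 km, where nearly every pixel must be used, and narrows toward the floor as the pool grows, because the 1000th-best homogeneity value falls when there are more candidates to choose from. The floor stands in for whatever residual scatter remains among the most homogeneous pixels.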

The relative left-right shift between the error-bar-versus-scale curves demonstrates another aspect of the sample-size condition. As explained above, the 1000-sample case has its first point at the 32-km scale; for the 500-sample case, the starting point is the 23-km scale, with 529 available pixels. In any given spatial resolution regime, the size of the sample constraint determines the minimum usable scale. Future sensors with finer imaging capability will therefore push the minimum area even lower, allowing for more refined studies and improved capability.
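The starting scales quoted above follow from simple counting, sketched below. The helper name `min_scale` and the assumption of a square area tiled by square pixels of side `pixel_km` are illustrative. The count is the number of pixels available before any homogeneity filtering.

```python
import math

def min_scale(n_samples, pixel_km=1.0):
    """Smallest square area whose pixel count reaches n_samples.

    Returns (side length in km, number of pixels in that area),
    assuming square pixels of side pixel_km tiling a square area.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # Integer ceil(sqrt(n_samples)) without floating-point rounding issues.
    side_pixels = math.isqrt(n_samples - 1) + 1
    return side_pixels * pixel_km, side_pixels**2

if __name__ == "__main__":
    print(min_scale(1000))  # (32.0, 1024)
    print(min_scale(500))   # (23.0, 529)
```

For the same 1000-sample constraint, `min_scale(1000, pixel_km=0.5)` gives a 16-km side; the 0.5-km pixel size is an arbitrary value chosen only to show how finer resolution lowers the minimum scale.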
