#### *3.5. Application and Result*

The long-term stability of the Aqua MODIS B5 versus SNPP VIIRS M8 time series makes it an ideal case to illustrate the impact of various selection criteria. Figure 9 shows three cases: 36-km with 1000 samples (red crosses), 36-km with 500 samples (green diamonds), and 50-km with 500 samples (blue squares). The solid line is the series mean at 0.988, and the two dashed lines mark the 2% boundaries above and below the mean. A precision threshold of 3% is applied to all three time series. For consistent comparison, the mean and the precision results are computed using the best 100 events in each time series.

**Figure 9.** The time series of Aqua MODIS B5 versus SNPP VIIRS M8 under three combinations of area scale and sample size. A 3% precision threshold is imposed on each SNO event.

Relaxing the sample size constraint has a clear impact. Lowering the sample size constraint from 1000 (red crosses) to 500 (green diamonds) increases the number of successful comparison outcomes from 101 to 162 and tightens the 100-event error bar average from 1.214% to 0.622%. Relative to the 36-km, 500-sample case, the 50-km, 500-sample time series (blue squares) increases the number of successful SNO events to 195 and tightens the error bar to 0.424%.
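As a minimal sketch of the selection described above, the following Python snippet applies a precision threshold and summarizes the best events. It assumes, for illustration only, that each SNO event is reduced to a (ratio, error bar in percent) pair and that "best" means smallest error bar; the function name `summarize_best_events` and the sample values are hypothetical and are not taken from the analysis code of this study.

```python
from statistics import mean


def summarize_best_events(events, threshold_pct=3.0, n_best=100):
    """Summarize a time series of SNO events.

    events: list of (ratio, error_bar_pct) tuples, one per SNO event
            (hypothetical layout chosen for this sketch).
    """
    # Keep only the events whose error bar passes the precision threshold.
    passed = [e for e in events if e[1] <= threshold_pct]
    if not passed:
        raise ValueError("no event passes the precision threshold")
    # Assumption: "best" means smallest error bar.
    best = sorted(passed, key=lambda e: e[1])[:n_best]
    return {
        "n_passed": len(passed),   # number of successful outcomes
        "n_used": len(best),       # events entering the statistics
        "mean_ratio": mean([r for r, _ in best]),
        "mean_error_bar_pct": mean([eb for _, eb in best]),
    }


# Synthetic values for illustration only.
events = [(0.988, 0.4), (0.990, 2.5), (0.985, 3.5), (0.987, 1.0)]
summary = summarize_best_events(events, threshold_pct=3.0, n_best=2)
print(summary)
```

Fixing `n_best` across cases, as done here with the best 100 events, keeps the averaged error bars comparable even when the number of events passing the threshold differs.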

Nevertheless, all three time series generate statistically indistinguishable series means at 0.988, so it may appear at first that the different conditions do not matter. However, for other purposes such as generating a fuller time series with fewer data gaps, i.e., better regularity, a larger area size and a less stringent sample size constraint may be better. For example, the 50-km, 500-sample time series (blue squares) contains more outcomes in the years 2012 and 2013 than the other two cases. This demonstrates that the area size and the sample size constraint can be tuned to improve characteristics of the time series, such as regularity, that help in evaluating the radiometric performance over a given period. Area sizes beyond the 80-km scale and sample sizes down to 250 have been examined and yield no improvement, supporting the 50-km, 500-sample condition as sufficiently optimal for Aqua MODIS B5 versus SNPP VIIRS M8.

Yet the same result reveals a limitation: the existence of data gaps, such as the 5-month austral winter periods of 2014, 2015, and 2016. While many SNO events do exist in these periods and the refined analysis here has improved the situation somewhat, the challenging conditions of low radiance and noisy scenes are difficult to overcome. This remains an area for continued improvement.

Figure 10 further illustrates the Aqua MODIS B5 versus SNPP VIIRS M8 result for three scenarios at the 50-km scale: the constrained case with the 500 samples of best homogeneity (blue squares), the unconstrained case with all samples and no homogeneity condition (red stars), and the clear-scene subset of the 500-sample constrained case (green diamonds). The 500-sample constrained case is the same time series as in Figure 9, also in blue squares, repeated here for comparison. The same 3% precision threshold is applied to the constrained and the unconstrained cases, and the same best-100-event choice is made for consistent comparison. For the clear-scene time series, a 0.35% precision threshold is further imposed on the constrained case to extract this subset. The first notable result is that the constrained case is significantly better, at 195 outcomes and a 100-event precision of 0.424%, compared with the unconstrained case at 166 outcomes and 1.059% precision. This is consistent with the result in Figure 9. In addition, the clear-scene time series, consisting of the 38 best outcomes with error bars below 0.35%, traces out a long-term baseline at 0.989 that is very consistent with the other cases. As already revealed by the event of 15 December 2016 shown in Figure 5, the clear-scene result should be closest to the "truth" of the comparison. The clear-scene time series, with ~0.3% average precision, suggests that 0.989 reflects the true comparison baseline and that the other time series are highly consistent with this result. This finding can be very helpful in pinpointing the radiometric baseline and in ascertaining other features. It is worth noting again that the "nadir-only" condition imposed by using a small area, here at the 50-km scale, is already a sufficiently constraining condition in itself, and therefore even the unconstrained case can yield comparably acceptable results.

**Figure 10.** The three time series of Aqua MODIS B5 versus SNPP VIIRS M8 correspond to the constrained analysis using homogeneity and sample size constraint, the unconstrained case, and the clear-scene result.

Intercomparison can expose a variety of outcomes and features. Figure 11 shows the corresponding comparison result of Aqua MODIS B4 (555 nm) versus SNPP VIIRS M4 (551 nm) for the constrained (blue squares), unconstrained (red stars), and clear-scene (green triangles) scenarios. The same 3% precision threshold is applied to both the constrained and the unconstrained cases, and the best 100 events are used to compute the time series statistics. In comparison with Figure 10, different band pairs show clear qualitative differences; for Aqua MODIS B4 and SNPP VIIRS M4, which are centered near 550 nm, the stronger scene radiance leads to more successful events.

**Figure 11.** The three time series of Aqua MODIS B4 versus SNPP VIIRS M4 correspond to the constrained analysis using homogeneity and sample size constraint, the unconstrained case and the clear-scene result.

Although examining the physical cause of any deviation is not a purpose of this study, the Aqua MODIS B4 versus SNPP VIIRS M4 result reveals an important feature: over the four-year period from 2013 to 2017, an upward drift of ~2% can be seen, exposing a worsening on-orbit calibration error in either the IDPS-generated SNPP VIIRS M4 or the Collection 6 release of Aqua MODIS B4. The clear-scene time series (green diamonds) is particularly lucid in tracing out both the multiyear drift and the yearly oscillation. The worsening calibration error comes from within the IDPS-generated radiance, owing to a nontrivial angular dependence in the reflectance property of the SD degradation [26–29] that has not been correctly captured by the standard on-orbit calibration methodology. This calibration error is neither trivial nor negligible, and it can severely compromise product retrievals and climate studies. Thus, establishing a meaningful and reliable time series, along with robust ratios and tight error bars, is a fundamentally important aspect of intercomparison methodology that enables correct assessment of the sensor data. Additionally, the seasonal modulation exhibited in the time series is typical of inter-RSB comparisons of Aqua MODIS versus SNPP VIIRS [5–7]. Figure 4 has shown that the SZA correction is not a contributor to this modulation; the RSR mismatch is necessarily one of the contributing causes.
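The size of such a multiyear drift can be quantified in several ways. As one simple illustration, which is not the method used in this study, the sketch below fits an ordinary least-squares line to a ratio time series and reports the fitted change over the period as a percentage of the fitted starting value. The function names and the synthetic input are hypothetical, and a straight-line fit deliberately ignores the seasonal oscillation.

```python
def fit_line(times_yr, ratios):
    """Ordinary least-squares fit: ratio = intercept + slope * time."""
    n = len(times_yr)
    if n < 2 or n != len(ratios):
        raise ValueError("need at least two matched (time, ratio) points")
    t_mean = sum(times_yr) / n
    r_mean = sum(ratios) / n
    sxx = sum((t - t_mean) ** 2 for t in times_yr)
    if sxx == 0:
        raise ValueError("all time values are identical")
    sxy = sum((t - t_mean) * (r - r_mean) for t, r in zip(times_yr, ratios))
    slope = sxy / sxx
    return slope, r_mean - slope * t_mean


def drift_percent(times_yr, ratios):
    """Fitted change from the first to the last time, in percent."""
    slope, intercept = fit_line(times_yr, ratios)
    fitted_start = intercept + slope * min(times_yr)
    fitted_end = intercept + slope * max(times_yr)
    return 100.0 * (fitted_end - fitted_start) / fitted_start


# Synthetic series for illustration only: a steady rise over four years.
times = [2013.0, 2014.0, 2015.0, 2016.0, 2017.0]
ratios = [1.000, 1.005, 1.010, 1.015, 1.020]
print(round(drift_percent(times, ratios), 2))
```

In practice the events are irregularly spaced in time and carry different error bars, so a weighted fit or a fit that includes a seasonal term may be more appropriate.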

Also, in Figure 11, the unconstrained case shows significantly worse statistics than the sample-constrained case, again demonstrating the utility of these constraints despite using fewer samples.

#### *3.6. Impact of Precision Threshold on the Time Series*

The findings so far suggest a 0.2% stability of the overall ratio mean of the time series under different scenarios; additional examination of the dependence on the precision threshold over a larger range yields some confirmation. Figure 12 shows the time series mean versus precision threshold for Aqua MODIS B5 versus SNPP VIIRS M8 under the four different constraint conditions over a 0.6% range. For each precision threshold, all SNO events under the threshold are included in the computation of the time series mean. As the precision threshold is relaxed, more SNO events with larger error bars are included, and the series mean changes accordingly.

**Figure 12.** The mean of the radiometric comparison time series at each level of precision threshold, for the constrained and the unconstrained cases in Aqua MODIS B5 versus SNPP VIIRS M8.

The most important result is that the time series mean varies over a 0.4% range with respect to the precision threshold, primarily upward for this particular comparison case. The overall pattern is consistent with the events of tightest precision being more likely to represent the true radiometric comparison result, and the events of worse precision containing more radiometric bias. Therefore, keeping a tight precision threshold is recommended to reduce any nontrivial variability or bias in the time series mean. The 2% precision threshold appears to be a reasonable choice, with variability of the mean at the level of 0.2% in the context of the constrained procedure, whereas a more generous choice made to achieve a fuller time series risks making the time series mean less reliable.
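The threshold scan behind this kind of plot can be sketched as follows. This is a minimal illustration that assumes each SNO event is stored as a (ratio, error bar in percent) pair; the function name `mean_vs_threshold` and the synthetic events are hypothetical.

```python
from statistics import mean


def mean_vs_threshold(events, thresholds_pct):
    """Series mean of all events whose error bar is within each threshold.

    events: list of (ratio, error_bar_pct) tuples (hypothetical layout).
    Returns a list of (threshold, n_events, series_mean) tuples, where
    series_mean is None if no event passes that threshold.
    """
    curve = []
    for thr in thresholds_pct:
        ratios = [r for r, eb in events if eb <= thr]
        series_mean = mean(ratios) if ratios else None
        curve.append((thr, len(ratios), series_mean))
    return curve


# Synthetic events in which worse precision coincides with a higher ratio.
events = [(0.988, 0.3), (0.990, 1.5), (0.994, 2.5)]
curve = mean_vs_threshold(events, [0.5, 2.0, 3.0])
for thr, n_events, series_mean in curve:
    print(thr, n_events, series_mean)
```

In this synthetic example the mean rises as the threshold is relaxed, mimicking the upward trend described above; real events would show scatter around any such trend.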

The long-term stability of the Aqua MODIS B5 versus SNPP VIIRS M8 time series is what makes any deviation or variability easier to discern. In contrast, cases with significant drift, such as Aqua MODIS B4 versus SNPP VIIRS M4 shown in Figure 11, make the dependence on the precision threshold more difficult to interpret, since the 2% drift complicates the result. For these cases, mitigating the on-orbit calibration error should take priority over any intercomparison issue. As emphasized already, intercomparison analysis is most valuable when it reveals a deviation that requires correction.

#### *3.7. Scaling Phenomenon in MODIS versus SNPP VIIRS*

The "scaling phenomenon" [7] is a broad-scale and persistent variability pervading the SNO results, as illustrated for selected events in Figure 6. Figure 13 illustrates the phenomenon for the Aqua MODIS B8 versus SNPP VIIRS M1 time series as a whole and includes the new result under the constrained analysis. Each point represents the error bar versus sample size outcome of an SNO event in the time series. Time series results for three different area sizes are shown: 36-km (red triangles), 50-km (blue squares), and 80-km (green diamonds). The result demonstrates how the scene-based scaling phenomenon blocks the use of a larger area size to improve statistics and how the constrained procedure overcomes this limitation.

**Figure 13.** Scaling phenomenon in Aqua MODIS B8 versus SNPP VIIRS M1 for both sample-unconstrained and sample-constrained cases.

The top left panel of Figure 13 displays the time series result of error bar versus sample size for all events, without the sample size constraint. The maximum sample size for the 36-km scale is 1296, or close to 0.2 in units of 6400 on the plot; similarly, the maximum for the 50-km scale is 2500, near 0.39, and for the 80-km scale it is 6400, at 1.0. Apart from the different sample size ranges, the scatter pattern of the error bar values appears similar for all three scales. The top right panel of Figure 13, the scaled version, explicitly demonstrates the scaling phenomenon by linearly scaling the sample sizes of the 36-km and the 50-km results to match 6400, i.e., stretching the results rightward in the horizontal direction until they reach 6400. The scaled result shows that the scatter patterns of the three cases are effectively indistinguishable. The clear implication is that enlarging the area size to increase the number of pixels ends up generating the same statistics and does not improve the quality of the time series. In contrast, the time series results in Figure 9, under the homogeneity-ranked sample constraint, demonstrate clear improvement with larger areas. More detailed examination of each SNO event reveals that the scaling phenomenon is only an approximate manifestation of some common scene-based effect. As shown in Figure 6, the error bar result in the sample-unconstrained case (red stars) in each single SNO event can change slightly with increasing scale.
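The arithmetic of the horizontal scaling can be made explicit with a short sketch. It assumes a square area sampled at 1-km pixels, which is consistent with the quoted maxima of 1296, 2500, and 6400 samples; the function names are hypothetical.

```python
def max_samples(scale_km, pixel_km=1.0):
    """Maximum pixel count of a square area (assumes 1-km pixels by default)."""
    side = round(scale_km / pixel_km)
    return side * side


def scale_sample_size(n_samples, scale_km, ref_scale_km=80.0):
    """Linearly stretch a sample size so that the maximum of the given
    scale maps onto the maximum of the reference scale (6400 at 80 km)."""
    return n_samples * max_samples(ref_scale_km) / max_samples(scale_km)


for scale in (36, 50, 80):
    n_max = max_samples(scale)
    fraction = n_max / max_samples(80)  # position on the unscaled axis
    print(scale, n_max, fraction, scale_sample_size(n_max, scale))
```

The 36-km and 50-km maxima fall at fractions 0.2025 and 0.390625 of 6400 on the unscaled axis, and all three maxima map onto 6400 after scaling.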

The sample size constraint, originally applied to stabilize the error bar [7], necessarily impacts any scene-based effects, including the "scaling phenomenon". The bottom two panels of Figure 13 demonstrate the impact of the constraint, for a sample size of 500, on the error bar versus sample size result. The label "Sample Size" on the horizontal axis refers to the number of pixels originally available for each event before the constraint is applied; it thus corresponds to the sample size of the corresponding unconstrained case. However, all actual outcomes have the same final sample size of 500. In the bottom-right panel, the error bar scatter pattern of the 80-km result (green diamonds) is seen to become tighter than those of the 36-km and the 50-km scales, showing that the scaling phenomenon no longer holds in the constrained analysis. In the same plot, the range of the error bar tightens more obviously and more quickly with increasing size for all three cases, reaching below 2% at higher sample sizes, showing that the constrained procedure is effective.
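The distinction between the available and the final sample size can be illustrated with a minimal sketch of a homogeneity-ranked constraint. Several details here are assumptions made only for illustration and are not specified in this section: each matched pixel is a hypothetical (MODIS value, VIIRS value, inhomogeneity score) tuple with a lower score meaning a more homogeneous pixel, the event ratio is the mean of the pixel ratios, and the error bar is their relative standard deviation in percent.

```python
from statistics import mean, stdev


def constrained_event(pixels, n_keep=500):
    """Apply a homogeneity-ranked sample size constraint to one SNO event.

    pixels: list of (modis_value, viirs_value, inhomogeneity) tuples
            (hypothetical layout; lower inhomogeneity is better).
    Returns (ratio, error_bar_pct, n_available), or None if the event has
    fewer than n_keep pixels and so fails the constraint.
    """
    if n_keep < 2:
        raise ValueError("n_keep must be at least 2 to compute a spread")
    n_available = len(pixels)
    if n_available < n_keep:
        return None
    # Keep only the n_keep most homogeneous pixels.
    ranked = sorted(pixels, key=lambda p: p[2])[:n_keep]
    ratios = [modis / viirs for modis, viirs, _ in ranked]
    ratio = mean(ratios)
    # Assumed error bar definition: relative standard deviation in percent.
    error_bar_pct = 100.0 * stdev(ratios) / ratio
    # n_available is what the "Sample Size" axis would show;
    # the statistics themselves always use exactly n_keep pixels.
    return ratio, error_bar_pct, n_available


# Synthetic event for illustration only: two homogeneous and two noisy pixels.
pixels = [(0.99, 1.0, 0.1), (0.98, 1.0, 0.2), (1.50, 1.0, 5.0), (0.50, 1.0, 9.0)]
result = constrained_event(pixels, n_keep=2)
print(result)
```

Because a larger area offers more candidate pixels to rank, the best `n_keep` pixels can only become more homogeneous as the area grows, which is one way to understand why the constrained error bars tighten with increasing available sample size.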

For completeness and illustration, the corresponding ratio versus sample size result of the Aqua MODIS B8 versus SNPP VIIRS M1 time series is shown in Figure 14. The 4 to 6% spread makes any 0.1 to 0.5% effect difficult to discern, but many points can be seen to have shifted from the unconstrained case (top panels) toward the center of the range in the constrained case (bottom panels). The 4 to 6% spread of the Aqua MODIS B8 versus SNPP VIIRS M1 ratio is among the worst comparison results, whereas cases such as Aqua MODIS B5 versus SNPP VIIRS M8, as in Figure 8, spread over a smaller 2% range. In general, the ratio is not an effective discriminator of statistical quality among SNO events given its large spread, and the final selection of the time series events should not rely on the ratio. On the other hand, the error bar, as shown by the bottom panels of Figure 13 as well as in earlier figures, has proven to be a stronger discriminator of the statistical quality of SNO events and can be utilized as a selection filter.

**Figure 14.** Ratio versus sample size for Aqua MODIS B8 versus SNPP VIIRS M1: the unscaled (top left) and scaled (top right) ratio results under the sample-unconstrained condition, and the corresponding unscaled (bottom left) and scaled (bottom right) ratio results under the sample-constrained condition.

The scaling phenomenon exists in effectively identical fashion for all inter-RSB comparisons of Aqua MODIS versus SNPP VIIRS, as Figure 15 shows for six such comparisons. Whether the cause is thin clouds, aerosol, or some combination of scene conditions, it appears that some atmospheric conditions in the polar scenes impact all RSBs in a nearly identical way. A general implication is that any inter-RSB comparison between two polar-orbiting multispectral sensors that generate SNO scenes over the polar regions needs to take this scene-based effect into account.

Figure 16 demonstrates that the scaling phenomenon also exists for Terra MODIS versus SNPP VIIRS, exemplified by Terra MODIS B8 versus SNPP VIIRS M1. As Terra versus SNPP SNO events trace out completely different locations (see Figure 2), this result generalizes this scene-based variability over both northern and southern polar scenes.

**Figure 15.** Precision versus sample size for six inter-RSB comparisons of Aqua MODIS versus SNPP VIIRS with no sample constraint, demonstrating the scaling phenomenon.

**Figure 16.** Precision versus sample size for Terra MODIS B8 versus SNPP VIIRS M1 with no sample constraint, demonstrating the scaling phenomenon.
