3.1. Boundary Layer Meteorological Evaluation
Table 1, Table 2 and Table 3 summarize the performance statistics of meteorological and chemical predictions from WRF/Chem-ROMS during SPS1, SPS2, and MUMBA, respectively. For T2, the model captures the seasonal variation, with higher average temperatures in summer during SPS1 and MUMBA than in autumn during SPS2, and higher T2 during MUMBA than during SPS1 due to the heatwaves. The domain-mean MBs of T2 during all field campaigns are in the range of −0.22 to 0.1 °C at 27-, 9-, and 3-km resolutions but −0.9 to −0.5 °C at 81-km resolution, indicating a satisfactory performance of T2 predictions except at 81-km. The model has the best statistical performance for T2 at 3-km for SPS2 and at 9-km for SPS1 and MUMBA. Over the ocean, SST is slightly overpredicted for SPS1, underpredicted for SPS2, and either overpredicted or underpredicted for MUMBA at all grid resolutions. The domain-mean MBs of SST are −1.7 to 1.4 °C at 81-km resolution and −0.5 to 1.0 °C at finer resolutions. For RH2, MBs are within 5%, NMBs are within 7%, and NMEs are within 15%, also indicating a good performance, although the model does not reproduce the relatively higher observed RH2 during SPS2 than during SPS1 and MUMBA at any grid resolution except 81-km. For WS10, the model effectively simulates the seasonal variation, with the strongest wind during MUMBA and the weakest wind during SPS2. The WS10 MBs at 3-km are −0.09, 0.16, and −0.32 m s−1 for SPS1, SPS2, and MUMBA, respectively; the corresponding MBs for WD10 are within 9.5°, 1.9°, and 10.9°, which are within the performance threshold of 10° except for MUMBA. An overall good performance for WS10 and WD10 is also found at 9-km. In addition to accurate simulation of atmospheric stability and the use of a fine grid resolution, an accurate representation of surface roughness is required for accurate simulation of WS10. WRF has a tendency to overpredict WS10 because it cannot accurately resolve topography [96,97]. In this work, the good performance of WS10 at 3- and 9-km results from the use of fine grid resolutions and of the surface roughness correction algorithm of Mass and Ovens [96]. However, the model shows relatively large biases for WS10 and WD10 at coarser grid resolutions (27- and 81-km) for SPS1 and SPS2, with MBs ranging from −0.8 to 0.6 m s−1 and from −1.6° to 11.5°, respectively, some of which exceed the performance thresholds of 0.5 m s−1 for WS10 and 10° for WD10. This indicates a need to improve the model's representation of subgrid-scale variation of topography, in particular surface roughness over Australia, at coarse grid resolutions.
Figure 3 compares observed and simulated domain-mean diurnal profiles of T2 and WS10 averaged over all BoM monitoring stations over the Greater Sydney area (d04) simulated by WRF/Chem-ROMS during SPS1, SPS2, and MUMBA. As expected, the predictions at 81-km deviate most significantly from the observations during most hours for both T2 and WS10 during all three field campaigns, whereas those at finer resolutions generally agree better with observations. The large deviation at 81-km is caused by the use of a coarse grid resolution that cannot accurately resolve the heterogeneity of topography and small-scale meteorological processes and variables. The T2 predictions at 27-, 9-, and 3-km are similar, but the WS10 predictions differ appreciably, with better agreement at 9- and 3-km than at 27-km. For T2, the model tends to underpredict between 6 a.m. and noon but overpredict after 2 p.m. For WS10, the model tends to overpredict before 8 a.m. and after late afternoon but underpredict during the daytime. These deviations indicate the model's difficulties in reproducing both daytime and nighttime profiles of T2 and WS10 in Australia, due to the limitations of the YSU PBL scheme and the NOAH land surface module in representing the daytime and nocturnal boundary layer and surface sensible and latent heat fluxes over land areas. For example, the limitations in the nocturnal boundary layer representation may include inaccurate eddy diffusivities and nighttime mixing, and the strength and depth of low-level jets [50,98,99,100].
Figure S1 in the supplementary material compares observed and simulated temporal profiles of T2 at five selected sites. During SPS1, SPS2, and MUMBA, the T2 predictions at all grid resolutions are very similar at two inland sites (Badgery's Creek and Richmond) but show a higher sensitivity to horizontal grid resolution at Bankstown Airport (an inland site) and at Bellambi and Sydney Airport (coastal or near-coastal sites), with better performance at 3-, 9-, and 27-km except at Bellambi during SPS1. During SPS1, the closest agreement with observations occurs for the simulation at 3-km at Bellambi and Bankstown Airport but for the simulation at 9-km at Sydney Airport. During SPS2, the closest agreement with observations occurs for the simulation at 9-km at Bankstown Airport but for the simulation at 27-km at Sydney Airport. During MUMBA, the closest agreement with observations occurs for the simulation at 9-km at both Bankstown Airport and Sydney Airport. Comparison of the temperature profiles during the three field campaigns shows strong observed seasonal variations, with higher temperatures in the summers of 2011 and 2013 (SPS1 and MUMBA) than in autumn (SPS2) at the same site (except at Bellambi, where the observations are only available in summer 2012); this variation is reproduced well at all sites. The model also reproduces well the observed heat waves on Jan. 8 and Jan. 18, 2013, with higher daily maximum T2 values, as well as the much stronger observed daily variation of T2 during summer 2013 than during summer 2011.
Figure S2 compares observed and simulated temporal profiles of WS10 at the same five sites. Compared to T2, the WS10 predictions are much more sensitive to horizontal grid resolution at all sites, generally with better performance at 3- and 9-km at most sites. During SPS1, the WS10 predictions show the largest differences among the simulations at different grid resolutions at Bellambi, with the best performance at 3-km. Because Bellambi is very close to the ocean, a grid cell of 3 × 3 km2 most accurately represents the land surface at this site, whereas larger grid cells include some oceanic areas, leading to underpredictions of WS10. The best agreement with observations occurs at Sydney Airport for the simulations at 3- and 9-km but at Badgery's Creek, Bankstown Airport, and Richmond RAAF for the simulation at 27-km. The model tends to underpredict WS10 at Sydney Airport, in addition to Bellambi. During SPS2, the WS10 predictions at 3- and 9-km are very close at all sites except Bellambi, where the four sets of predictions are quite different (although no observations are available for performance evaluation). The best agreement with observations occurs at Sydney Airport for the simulation at 3-km but at Badgery's Creek, Bankstown Airport, and Richmond RAAF for the simulation at 9-km. The model tends to underpredict WS10 at Sydney Airport but overpredict it at the other sites. During MUMBA, the WS10 predictions from the four sets of simulations are very close at Bankstown Airport and Richmond RAAF, with an overall very good performance against observations. Simulated WS10 at 3-, 9-, and 27-km at Badgery's Creek agrees well with observations but is significantly overpredicted at 81-km. WS10 is largely underpredicted at Sydney Airport, with the best performance at 3-km. Similar to SPS1 and SPS2, the model tends to underpredict WS10 at Sydney Airport. Observed WS10 is generally lower during autumn than during summer at all sites, which is not well reproduced because of overpredictions at all sites except Sydney Airport. As mentioned previously, the larger wind bias during SPS2 may be driven by less organized flow patterns in autumn compared to summer, in addition to the inaccurate representation of surface roughness.
Precipitation predictions are evaluated using three sets of data: observations from the field campaigns (OBS), satellite retrievals, and combined satellite and reanalysis data from MSWEP and GPCP. Compared to OBS, the domain-mean NMBs at all grid resolutions are generally within ±30% for SPS1 and SPS2 but in the range of −40.7% to −32.4% for MUMBA, indicating a satisfactory or marginal performance against OBS during SPS1 and SPS2 but an unsatisfactory performance during MUMBA. MUMBA took place during an atypical summer in terms of both temperature and precipitation. The observed daily average precipitation during MUMBA is much higher (by a factor of 6) than that during SPS1, which is dry. SPS2 is also wet, with much higher total precipitation (by a factor of 5.3) than SPS1. The model performs reasonably well for precipitation against OBS over d04 in the dry period (SPS1) but moderately underpredicts precipitation in the wet periods (SPS2 and MUMBA). Compared to MSWEP, the domain-mean NMBs at 81-, 27-, 9-, and 3-km are mostly beyond ±30%, indicating an unsatisfactory performance at all grid resolutions during all field campaigns except for 81- and 3-km during SPS1 and 81- and 27-km during SPS2. Compared to GPCP, the domain-mean NMBs at all grid resolutions are well within ±30% for SPS2 but beyond ±30% for SPS1 and MUMBA, indicating a satisfactory performance during SPS2 but an unsatisfactory performance during SPS1 and MUMBA. The model's skills in predicting precipitation are not self-consistent because of large differences among the three sets of data used for evaluation. While the observed daily mean precipitation from OBS, MSWEP, and GPCP is similar for MUMBA, large differences exist among OBS, MSWEP, and GPCP for SPS1 (0.84, 1.18, and 2.2 mm day−1, respectively) and between GPCP and OBS or MSWEP for SPS2 (2.3 vs. 4.41 or 4.54 mm day−1). Both OBS and MSWEP indicate a dry summer during SPS1; GPCP may therefore have overestimated the total precipitation during SPS1, which led to large underprediction biases for SPS1. In general, however, it is clear that the model tends to underpredict precipitation during all field campaigns. Based on the model predictions, non-convective precipitation dominates in d04 for SPS1, SPS2, and MUMBA. Convective precipitation dominates in both oceanic and land areas in d02 for SPS1 and in the oceanic area for SPS2, but both non-convective and convective precipitation are important in both oceanic and land areas in d02 for MUMBA. The large underpredictions in precipitation may indicate underpredictions in either non-convective precipitation (if it indeed dominates over convective precipitation) or convective precipitation (it is unfortunately not possible to determine whether the observed precipitation is convective or non-convective). Therefore, the underpredictions of the total precipitation in d04 during these field campaigns may be mainly associated with the limitations of either the M09 double-moment microphysics scheme in the former case or the MSKF cumulus parameterization in the latter case. The underpredictions in total precipitation over Australia in this work differ from the reported overpredictions by WRF/Chem [67,68] or WRF/Chem-ROMS [40] over the U.S., where the simulated convective precipitation dominates the total precipitation. Such overpredictions are attributed to the limitations of the cumulus parameterizations of Grell and Devenyi [101] (GD) or Grell and Freitas [102]. Although underpredictions remain, Wang et al. [103] found that simulations with MSKF can reduce the large wet biases in total precipitation predictions from WRF/Chem over the U.S. domain. Interestingly, as reported in Monk et al. [28], three sets of WRF simulations using the Grell 3D cumulus parameterization (an improved version of the GD scheme) but with the same or different cloud microphysics modules (M09, LIN, or WSM6) and the same or a different PBL scheme (YSU or MYJ) tend to overpredict total precipitation in d04 against OBS and MSWEP during SPS1, SPS2, and MUMBA, except for one simulation for SPS2 and one for MUMBA. It is therefore worthwhile to investigate further why the combination of the M09 and MSKF schemes performs differently in Australia compared to the U.S. (underpredictions vs. overpredictions) and whether the MSKF scheme failed to trigger convective precipitation over d04. Nevertheless, the underpredictions of precipitation in this work can lead to overpredictions of concentrations for soluble gases such as SO2 and aerosol components such as sulfate and hydrophilic SOA.
As shown in Figure S3, the large underpredictions against both OBS and MSWEP during MUMBA are attributed to underpredictions of the intensity of the heavy precipitation on January 28, 2013 at all sites except Sydney Airport. In particular, very large underpredictions of the peak precipitation against both OBS and MSWEP occur at all grid resolutions on Jan. 28, 2013 at Badgery's Creek, Bellambi, Richmond RAAF, and Williamtown RAAF (figure not shown). The model performs better for lighter precipitation events during MUMBA. The model also misses several smaller precipitation events (e.g., on Jan. 13 and Feb. 10, 2013) at most sites. At most sites, the simulation at 9-km gives the best agreement with OBS and MSWEP for the heavy precipitation on Jan. 28, 2013, while the simulation at 3-km gives the best agreement for lighter precipitation, except at Sydney Airport, where the simulation at 3-km gives the best agreement for both heavy and light precipitation. During SPS2, underpredictions (although less severe than during MUMBA) also occur for the intensity of the peak precipitation on April 18, 2012 at Bankstown Airport (against both OBS and MSWEP), Richmond RAAF (against both OBS and MSWEP), Sydney Airport (against MSWEP), and Camden Airport (against both OBS and MSWEP, figure not shown), which contributes to the overall dry biases during SPS2. At Badgery's Creek, the 3-km simulation captures well the peak precipitation from OBS and MSWEP on April 18, 2012, despite some overpredictions, whereas the other simulations significantly underpredict the intensity of the precipitation. The model performs well overall at Bellambi, Williamtown RAAF (figure not shown), and Wollongong Airport (figure not shown). During SPS1, all simulations miss the heavy precipitation from OBS on Feb. 12, 2011 at all sites except Bellambi. They also miss the heavy precipitation from MSWEP on Feb. 12, 2011 at Badgery's Creek, Bankstown Airport, Camden Airport, and Richmond RAAF. Unlike SPS2 and MUMBA, for which all simulations predict precipitation on the same days, during SPS1 large differences exist among the four sets of precipitation predictions at different grid resolutions, in both the magnitude of the predictions on a given day and the days with simulated heavy precipitation.
3.2. Surface Chemical Evaluation
For surface O3, the model predicts the highest domain-mean O3 concentrations in d04 at all grid resolutions during MUMBA (due mainly to its highest T2), followed by SPS1 and SPS2, consistent with the observed seasonal variations. The domain-mean NMBs of O3 during the three field campaigns are 16.2–20.5% at 81-km but well within ±10% at finer grid resolutions. The NMEs are 36.5–42.4%, 49.4–56.5%, and 37.5–46.4% for SPS1, SPS2, and MUMBA, respectively. Both the NMBs and NMEs for the simulation at 81-km exceed the performance thresholds for O3. For the simulations at finer grid resolutions, while the NMBs are within the 15% threshold, the NMEs are all greater than the 25% threshold. The large NMEs may be caused by inaccurate emissions of O3 precursors such as NOx and VOCs and by inaccurate meteorological predictions such as T2, WS10, and mixing heights. The smallest NMBs occur at 27-km for all three field campaigns; the second smallest NMBs occur at 3-km for SPS1 and SPS2 but at 9-km for MUMBA.
Figure 4 compares observed and simulated diurnal profiles of surface concentrations of O3 and PM2.5 from the WRF/Chem-ROMS simulations at various grid resolutions averaged over all monitoring stations over the Greater Sydney area (d04). The model performs similarly at 3-, 9-, and 27-km, all of which agree better with observations than the simulation at 81-km. The underpredictions in O3 between 7 a.m. and 1 p.m. are related to the underpredictions of T2 during the same hours, as shown in Figure 3. The simulation at 81-km overpredicts O3 during all hours except 7–11 a.m. local time for SPS1 and MUMBA and 8 a.m.–12 p.m. local time for SPS2. The overpredictions at night are mainly caused by insufficient titration by NO (indicated by significant underpredictions of NO, see Table 1, Table 2 and Table 3). The large underpredictions of nighttime NO may be caused by underestimated NO emissions as well as the overpredicted nocturnal and morning PBL height (PBLH) reported by Monk et al. [26]. Compared to SPS1 and SPS2, the nocturnal PBLH is better captured for MUMBA; therefore, the nighttime O3 predictions at various grid resolutions are closer to the observations for MUMBA than for SPS1 and SPS2. As the underpredictions of NO decrease with increasing grid resolution, the nighttime O3 predictions generally agree better with observations, with the best performance at 3-km. According to Monk et al. [28], the model better simulates PBLH in the afternoons. As the grid resolution increases, the model better captures the atmospheric mixing (indicated by reduced underpredictions of CO, see Table 1, Table 2 and Table 3) and the concentrations of O3 precursors such as NO and NO2, leading to smaller overpredictions of O3 during afternoon hours, with the best performance at 3-km.
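The nighttime titration invoked above is the familiar reaction of O3 with freshly emitted NO; with no NO2 photolysis after dark, this reaction acts as a net O3 sink, so underpredicted NO implies too little titration and hence overpredicted nighttime O3:

```latex
\mathrm{NO} + \mathrm{O_3} \;\rightarrow\; \mathrm{NO_2} + \mathrm{O_2}
```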
To assess the model's capability at both regional and urban scales, Figure 5 shows simulated spatial distributions of surface concentrations of O3 from the WRF/Chem-ROMS simulations at 3- and 27-km overlaid with observations over the Greater Sydney area (d04); 27-km represents a regional scale, whereas 3-km represents an urban scale. For SPS1, although the NMB at 27-km is smaller than that at 3-km (0.5% vs. −5.7%), the 3-km grid resolution better captures the concentration gradients, with the lowest concentrations at Liverpool and Chullora and the highest at Oakdale. The 3-km simulation also more accurately reproduces the O3 concentrations at three sites in the southwest of the Greater Sydney area: University of Wollongong (UOW), Wollongong, and Warrawong. For SPS2, the simulation at 3-km reproduces the lowest concentrations at Liverpool and Westmead, the highest at Oakdale, and a lower concentration at UOW than at Wollongong and Warrawong, none of which are well captured by the simulation at 27-km, even though the NMB at 27-km is smaller than that at 3-km (−0.2% vs. −3.4%). For MUMBA, although the NMB at 27-km is smaller than that at 3-km (2.8% vs. −10.1%), the highest observed O3 concentrations occurred at Oakdale and Bargo, which are more accurately simulated at 3-km than at 27-km, particularly at Oakdale. Similar to SPS1 and SPS2, the 3-km simulation better captures the observed O3 concentrations at UOW, Wollongong, and Warrawong.
Figures S4 and S5 and Figure 6 compare observed and simulated temporal profiles of O3 concentrations from WRF/Chem-ROMS at selected sites during SPS1, SPS2, and MUMBA, respectively. The site-specific statistics are summarized in Table S1. For SPS1, the simulation at 81-km overpredicts O3 concentrations during most days at all sites, whereas the simulation at 27-km gives the best performance at all sites, although all simulations miss the high O3 concentrations during Feb. 19–22 at most sites, especially at Oakdale. The highest observed O3 concentration (up to ~40 ppb) occurred on Feb. 25 at Oakdale and was underpredicted by all simulations. During SPS2, the O3 predictions show a much higher sensitivity to grid resolution, especially at Randwick, Earlwood, Chullora, and Oakdale. The simulation at 81-km overpredicts O3 concentrations during most days, except at Oakdale, where all simulations underpredict O3, and at Wollongong, where all simulations generally capture the magnitude of O3 well. The simulation at 27-km gives the best performance at all sites except Oakdale and Randwick, where the simulation at 3-km is the best, and Richmond, where the simulation at 9-km is the best. Compared to SPS1 and SPS2, the observed and simulated O3 concentrations during MUMBA show much stronger daily variations, driven by the strong daily variations of T2 shown in Figure S1. The simulation at 81-km tends to overpredict O3 during most days at Chullora, Earlwood, Liverpool, Randwick, and Wollongong. The simulation at 27-km gives the best performance at all sites except Oakdale, where the 3-km simulation is the best. All model simulations miss the high O3 concentrations during Jan. 24–28 at most sites. Similar to SPS1, Oakdale has the highest observed O3 concentration (up to ~44 ppb) on Jan. 10–12, 2013 and O3 concentrations higher than 30 ppb on several days, which are underpredicted by all simulations. As shown in Table S1, the site-specific performance statistics are overall consistent with the domain-mean statistics. The simulated site-specific standard deviations generally agree with those based on observations during all three field campaigns.
As shown in Table 1, Table 2 and Table 3, for surface CO, the domain-mean NMBs are in the range of −66.3% to −54.9% at 81-km and −55.3% to −24.5% at finer grid resolutions. The moderate-to-large underpredictions in CO may be caused by underestimated CO emissions (because of the use of the 2008 inventory, as discussed in Section 2.2) and overpredicted PBLHs; in addition, the use of a coarse grid resolution causes greater CO underpredictions. Surface NO is significantly underpredicted, with domain-mean NMBs of −89.3% to −77.3% at 81-km resolution. The model simulates NO better at finer grid resolutions, with NMBs of −44.6% to −18.1% at 27-km, −5.8% to 18.3% at 9-km, and 5.0–53.2% at 3-km resolution. For surface NO2, the model reproduces the observed seasonal variation well, with the highest observed NO2 concentrations during SPS2 compared to those during SPS1 and MUMBA. Although photochemical production of NO2 is lower during SPS2 than during SPS1 and MUMBA due to weaker solar radiation in autumn than in summer, the ventilation rates are lower in autumn than in summer, which favors the accumulation of NO2 in the Sydney basin. The domain-mean NMBs of NO2 are in the range of −32.3% to −29.5% at 81-km resolution, 0.7–6.2% at 27-km, 5.4–20.1% at 9-km, and 9.6–30.7% at 3-km resolution. Both NO and NO2 are affected by atmospheric mixing, emissions, and chemistry, and are also highly sensitive to the grid resolution. The NO emissions are likely underestimated. While the use of finer grid resolutions effectively reduces the NMBs of NO and NO2 concentrations for SPS1 and SPS2, it changes moderate-to-large underpredictions in NO and NO2 at 81-km (NMBs of −77.3% and −30.2%, respectively) to moderate-to-large overpredictions at 3-km (NMBs of 53.2% and 30.7%, respectively) for MUMBA. This indicates that high, localized NOx emissions are inadequately dispersed at 3-km (and, to a lesser extent, at 9-km), resulting in overpredictions over the d04 domain. For surface SO2, the domain-mean NMBs at 81-, 27-, 9-, and 3-km are much higher: 50.7–114.3% for SPS1, 63.3–161.5% for SPS2, and 33.8–149.6% for MUMBA. Unlike for CO and NOx, increasing the grid resolution leads to higher SO2 overpredictions, with the highest at 3-km. While overestimated SO2 emissions (because of the use of the 2008 inventory, as discussed in Section 2.2) and insufficient SO2 conversion to sulfate (indicated by the underpredictions in precipitation) may be responsible in part for the overpredictions at all grid resolutions during all three field campaigns, inadequate dispersion amplifies the model biases, resulting in much greater overpredictions at finer grid resolutions.
For surface PM2.5 and PM10, the model predicts the highest domain-mean concentrations in d04 at all grid resolutions during MUMBA (due mainly to its highest T2), followed by SPS1 and SPS2, capturing the observed seasonal variations. For PM2.5, most NMBs exceed the threshold value of 30%, except for the simulations at 3–27 km for SPS1 and at 81-km for SPS2, and all NMEs exceed the threshold value of 50%, indicating a poor performance for PM2.5. Increasing the grid resolution helps reduce the NMBs for SPS1 and MUMBA but causes greater overpredictions for SPS2. The PM2.5 underpredictions during SPS1 and MUMBA may be caused by inaccurate meteorology (e.g., high biases in WS10), underestimated emissions (e.g., primary PM and PM2.5 precursors such as NOx and anthropogenic and biogenic VOCs), insufficient SO2 conversion to sulfate, and underpredictions of secondary organic aerosol (SOA). The PM2.5 overpredictions during SPS2 may be caused by overpredictions of sulfate and organic carbon (OC) (which dominate the PM2.5 concentrations), which may in turn be attributed to overestimated emissions of SO2 (indicated by the large overprediction of SO2 concentrations) and primary OC. However, no OC observations are available to verify this speculation.
Figure 7 shows spatial distributions of surface concentrations of PM2.5 simulated by WRF/Chem-ROMS at 3- and 27-km resolutions overlaid with observations over the Greater Sydney area (d04). For SPS1 and MUMBA, the simulation at 3-km gives better statistical performance and reproduces the observed concentration gradients at the five sites (Chullora, Earlwood, Liverpool, Richmond, and Wollongong) more accurately than the simulation at 27-km. For SPS2, the simulation at 3-km shows larger overpredictions than that at 27-km at all sites except Richmond.
Figure 8 compares observed and simulated temporal profiles of PM2.5 concentrations from WRF/Chem-ROMS at the five sites during SPS1, SPS2, and MUMBA. Table S2 summarizes the site-specific performance statistics. During SPS1 and MUMBA, PM2.5 underpredictions occur during most hours at all sites, with the largest underpredictions at 81-km and the smallest at 3-km. The differences among the simulations at different grid resolutions are relatively small during most hours at all sites. During SPS2, much larger differences among the four sets of simulations occur at Chullora, Earlwood, and Liverpool compared to SPS1 and MUMBA, indicating a much higher sensitivity to grid resolution during autumn than during summer. However, the simulation at 81-km gives the best agreement with the observations, and the largest overpredictions occur at 3-km. As shown in Table S2, the site-specific performance statistics are overall consistent with the domain-mean statistics. However, the simulated site-specific standard deviations deviate from those based on observations during all three field campaigns, with the best agreement at 3-km for SPS1 and MUMBA but at 81-km for SPS2.
For surface PM10, observations are available at many more sites than for PM2.5. Most NMBs exceed the threshold value of 30%, except for the simulations at 3–27 km for SPS2, and all NMEs exceed the threshold value of 50%, indicating a poor performance for PM10. The large PM10 underpredictions during SPS1 and MUMBA may be caused by the underpredictions of PM2.5 and by underestimated PM10 emissions (because of the use of the 2008 inventory, as discussed in Section 2.2) and sea-salt emissions. The seemingly better performance of PM10 at 3-, 9-, and 27-km grid resolutions during SPS2 is mainly due to the large overpredictions of PM2.5. While the use of 3-km only slightly improves the PM10 performance for SPS1 and MUMBA, it reduces the NMBs for PM10 for SPS2 to a larger extent, but this is because the overpredictions in PM2.5 increase as the grid resolution increases. In addition to the aforementioned possible reasons for the PM2.5 underpredictions, which may also explain the moderate-to-large PM10 underpredictions, a further reason may be the underpredictions in sea-salt emissions and concentrations.
Figure 9 shows simulated spatial distributions of surface concentrations of PM10 from WRF/Chem-ROMS at 3- and 27-km overlaid with observations over the Greater Sydney area (d04). The simulation at 3-km reproduces the observed PM10 concentration gradients better than that at 27-km for all three field campaigns, particularly for SPS2. The moderate-to-large PM10 underpredictions occur at both inland and coastal sites for SPS1 and MUMBA, with larger underpredictions at inland sites.
3.3. Evaluation of Radiative, Cloud, and Heat Flux Variables
Table 4,
Table 5 and
Table 6 summarizes the performance statistics of radiative, cloud, heat flux, and column gas variables simulated using WRF/Chem-ROMS for SPS1, SPS2, and MUMBA, respectively.
Figure 10 and
Figures S7 and S8 compare observed and WRF/Chem-ROMS simulated radiation and optical variables, CCN, and cloud variables at 27-km over d02. Comparing to PBL meteorological predictions, the radiation and cloud predictions are relatively less sensitive to grid resolution. For SPS1, the model simulates GLW well with NMBs within 5% but moderately overpredicts GSW with NMBs of 23–27.8%. As shown in
Figure 10, the model does better in simulating observed gradients for GSW over the northern portion of d02 except along the coastal areas, it overpredicts GSW in the southern portion of d02. The model generally reproduces the spatial distributions and gradients of GLW well, despite underpredictions in the southern portion of d02. AOD is slightly overpredicted with NMBs of 1.6–8.3% over d04 at all grid resolutions, which is considered to be an excellent performance. The model gives similar spatial pattern to that of the MODIS AOD over oceanic areas but moderately overpredicts AOD over land areas in d02 (with an NMB of 45% over d02). CCN observations are only available over the ocean. CCN over the ocean is moderately underpredicted with NMBs ranging from −22.7% to −12.7%, especially in the northeastern and the southwestern portions of the d02. The underpredictions of CCN over the ocean are likely related to PM
10 underpredictions, although no surface and column PM
10 concentrations over the ocean are available for model evaluation. PWV is slightly underpredicted with NMBs of −12.1% to −11.6%, CF is moderately underpredicted with NMBs of −46.3% to −39.4%, and LWP is largely underpredicted with NMBs ranging from −96.8% to −89.1%. The simulated CF shows a spatial pattern similar to the observed CF, with higher values in the southern portion than in the northern portion, but with much lower values over both land and ocean than observed. The underpredictions mostly occur in the southern portion of d02, in particular over the ocean. CDNC predictions are sensitive to grid resolution, with moderate underpredictions of 46.2% at 81-km and 31.2% at 3-km, but much better agreement with observations, with NMBs of 3.5% at 27-km and −6.2% at 9-km. The CDNC underpredictions can be attributed in part to the underpredictions of PM10 and to uncertainties in the CDNC derived from MODIS retrievals of cloud properties such as cloud effective radius, LWP, and COT. COT is largely underpredicted with NMBs of −80.4% to −62.0%. Most underpredictions occur over land, particularly over the southern portion of the domain. Since COT depends on CDNC and LWP, the underpredictions in CDNC and LWP propagate into the COT predictions. In addition, the COT calculation in WRF/Chem-ROMS accounts only for contributions from cloud water and ice; contributions from other hydrometeors such as graupel are neglected, which explains in part the underpredictions of COT. The model largely underpredicts SWCF with NMBs of −60.4% to −47.8% and LWCF with NMBs of −55.1% to −40.0%. The spatial distributions of the observed SWCF correlate well with those of the observed CF; a similar but weaker correlation is found between the simulated SWCF and CF, owing to underpredictions in both. These SWCF underpredictions can be attributed to the underpredictions in CCN over the ocean (and possibly over land), CDNC, LWP, and COT, caused in turn by possible underpredictions in column PM10 concentrations. The model overpredicts LHF and SHF over the ocean against OAFlux data, with NMBs of 26.4–49.4% and 16.0–74.2%, respectively. Using finer grid resolutions of 3- or 9-km reduces the model biases in the simulated latent and sensible heat fluxes.
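The propagation of LWP biases into COT noted above follows from the standard approximate relation between cloud optical thickness, liquid water path, and cloud droplet effective radius (a textbook approximation for warm clouds, not necessarily the exact parameterization implemented in WRF/Chem-ROMS):

\[
\tau \approx \frac{3\,\mathrm{LWP}}{2\,\rho_w\,r_e},
\]

where \(\tau\) is COT, \(\rho_w\) is the density of liquid water, and \(r_e\) is the droplet effective radius. Because \(\tau\) scales linearly with LWP, an LWP underprediction of roughly 90% alone implies a comparably large COT underprediction for a given \(r_e\).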
For SPS2, the model simulates GLW and GSW well with NMBs within 3% and 13%, respectively. As shown in
Figure S6, the model reproduces the spatial distributions of GLW and GSW well throughout the domain. AOD is largely overpredicted with NMBs of 71.0–76.6% over both land and ocean areas, which may be caused by higher-than-observed PM2.5 concentrations at the surface (see Table 2) and possibly above the surface. CCN over the ocean is moderately underpredicted with NMBs ranging from −28.2% to −24.1%, especially in the northeastern and southwestern portions of d02. PWV is moderately overpredicted with NMBs of 21.0–21.3%, CF is moderately underpredicted with NMBs of −32.2% to −25.9%, and LWP is largely underpredicted with NMBs ranging from −95.0% to −93.6%. As shown in
Figure S7, the simulated CF shows a pattern similar to the observed CF, as during SPS1. Observed CDNC is available only in a few areas in d02 and is not available over d04; thus, no statistical evaluation is performed over d04. COT is moderately to largely underpredicted with NMBs of −61.2% to −18.7%. Underpredictions occur over all land areas for the reasons discussed above. The model captures COT well over oceanic areas. The model underpredicts SWCF with NMBs of −31.1% to −4.1% and LWCF with NMBs of −34.5% to −9.8%. As shown in
Figure S7, the simulated SWCF shows spatial distributions similar to those of the observed SWCF. Unlike SPS1, the model tends to underpredict LHF and SHF over the ocean against OAFlux data, with NMBs of −19.9% to 1.8% and −45.3% to −22.9%, respectively. The 3-km simulation gives the best performance for both LHF and SHF.
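The bias and error statistics quoted throughout this section (MB, NMB, NME) can be computed from paired model–observation series as sketched below, using the standard definitions; this is an illustrative sketch, and the exact formulas and data pairing used in the study may differ in detail.

```python
def mb(model, obs):
    """Mean bias: average of (model - obs) over all paired values."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)

def nmb(model, obs):
    """Normalized mean bias (%): 100 * sum(model - obs) / sum(obs)."""
    return 100.0 * sum(m - o for m, o in zip(model, obs)) / sum(obs)

def nme(model, obs):
    """Normalized mean error (%): 100 * sum(|model - obs|) / sum(obs)."""
    return 100.0 * sum(abs(m - o) for m, o in zip(model, obs)) / sum(obs)

# Hypothetical example: a model that consistently underpredicts
# the observations yields a negative MB and NMB.
obs = [10.0, 20.0, 30.0]
model = [8.0, 18.0, 27.0]
print(mb(model, obs), nmb(model, obs), nme(model, obs))
```

Note that NMB can mask compensating errors (positive and negative biases cancel in the sum), which is why NME is reported alongside it.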
For MUMBA, the model simulates GLW well with NMBs within 5% but moderately overpredicts GSW with NMBs of 21.7–25.8%. As shown in
Figure S6, the overprediction of GSW occurs over most of d02 except over the ocean areas in the southwestern portion. The model generally reproduces the spatial distribution and gradients of GLW well, despite underpredictions in the northern portion and overpredictions in the southwestern portion. AOD is moderately overpredicted with NMBs of 27.2–34.5%, which occurs over both land and ocean areas in the northern portion of d02. This AOD overprediction may be caused by higher-than-observed PM2.5 concentrations above the surface, as the surface PM2.5 concentrations are largely underpredicted (see Table 3). CCN over the ocean is largely underpredicted over all ocean areas with NMBs ranging from −57.9% to −54.4%. PWV is moderately overpredicted with NMBs of 17.7–18.6%, CF is largely underpredicted with NMBs of −65.8% to −48.7%, and LWP is largely underpredicted with NMBs ranging from −97.1% to −90.9%. As shown in
Figure S7, the simulated CF is much lower than the observed CF, especially over ocean areas. Similar to SPS2, observed CDNC is available only in a few areas in d02 and is not available over d04. COT is moderately to largely underpredicted with NMBs of −71.0% to −37.4%, mainly over land areas for the aforementioned reasons. The model captures COT well over oceanic areas. The model largely underpredicts SWCF with NMBs of −55.0% to −44.7% and LWCF with NMBs of −68.8% to −61.2%. As shown in
Figure S7, the model is able to simulate the correlation between the spatial distributions of the observed SWCF and the observed CF, but the simulated correlation is much weaker because of large underpredictions in both CF and SWCF. Depending on grid resolution, the model either overpredicts or underpredicts LHF and SHF over the ocean against OAFlux data, with NMBs of −9.9% to 8.1% and −35.5% to 7.3%, respectively. The best performance is achieved at 27-km for LHF and at 81-km for SHF.