#### **3. Results**

#### *3.1. Sound Levels, Audibility, and Response Thresholds*

The waveform and spectrum of an example overflight recorded in air are shown in Figure 2a. Broadband levels (20 Hz–20 kHz) exceeded 117 dB re 20 μPa for about 1 s in this example.

Averaged over all 23 overflights, the received level was 110 ± 4 dB re 20 μPa rms and 107 ± 5 dBA; maximum received levels were 119 dB re 20 μPa and 118 dBA. In-air noise covered a frequency band from 20 Hz to greater than 10 kHz, peaking between 50 Hz and 1 kHz (Figure 3a). Comparing 1/3 octave band levels with audiograms indicated that in-air noise from Growlers would be audible to all species within the limits of the audiogram measurements available, which ranged from a minimum of 250 Hz for cormorants to a maximum of 8 kHz for ducks (Figure 3b). Audiogram-weighted levels suggested that murres might experience less disturbance (18–28 dBth) from Growlers than puffins (60–65 dBth), cormorants (65–71 dBth), and ducks (81–88 dBth; Table 1). A-weighted noise levels experienced by people ranged from 104 to 109 dBA.

The waveform and spectrum of an example overflight recorded under water are shown in Figure 2b. Broadband levels exceeded 131 dB re 1 μPa for about 1 s in this example. Averaged over the 10 overflights, the received level in the strongest 1-s window was 134 ± 3 dB re 1 μPa rms. The underwater noise recorded during the 10 overflights covered a frequency band from 20 Hz to 30 kHz, peaking between 200 Hz and 1 kHz (Figure 4a). Based on intersection with audiograms, Growler noise penetrating the water was expected to be audible to killer whales between 200 Hz and 40 kHz, and to cormorants between 1 kHz and 4 kHz (Figure 4b). Audiogram-weighted levels indicated Growler flights would result in 48–56 dBth of noise for killer whales, and 40–44 dBth for cormorants (Table 1).
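As an illustration of how an audiogram-weighted level of the kind reported in Table 1 can be obtained, the following minimal Python sketch subtracts the hearing threshold from each 1/3 octave band level and sums the results in the power domain. This is one common formulation and is assumed here for illustration only; it is not necessarily the exact procedure used in this study. The function name and all numerical values in the example are hypothetical and are not measured data.

```python
import math


def audiogram_weighted_level(band_levels_db, audiogram_db):
    """Return a level in dB above hearing threshold (dBth).

    band_levels_db: {center frequency in Hz: 1/3 octave band level in dB}
    audiogram_db:   {center frequency in Hz: hearing threshold in dB}
    Assumption: bands without an audiogram value are skipped, and the
    remaining per-band sensation levels are summed as powers.
    """
    total_power = 0.0
    for freq_hz, level_db in band_levels_db.items():
        if freq_hz not in audiogram_db:
            continue  # outside the measured audiogram range
        sensation_db = level_db - audiogram_db[freq_hz]
        total_power += 10 ** (sensation_db / 10)
    if total_power == 0.0:
        return float("-inf")  # no overlap between noise bands and audiogram
    return 10 * math.log10(total_power)


# Hypothetical band levels and thresholds, for illustration only.
example_bands = {500: 105.0, 1000: 100.0, 2000: 95.0, 4000: 90.0}
example_audiogram = {1000: 40.0, 2000: 35.0, 4000: 40.0}
example_dbth = audiogram_weighted_level(example_bands, example_audiogram)
print(f"{example_dbth:.1f} dBth")
```

In this sketch the 500 Hz band is ignored because the hypothetical audiogram has no value there, which mirrors the restriction of the comparison to the limits of the audiogram measurements available.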

**Figure 2.** Waveform (top) and spectrogram (bottom) of (**a**) a Growler overflight recorded in air (fs = 48 kHz, NFFT = 12,000, 50% overlap) and (**b**) a Growler overflight recorded under water (fs = 96 kHz, NFFT = 24,000, 50% overlap).
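The spectrogram settings given in the figure captions imply a particular frequency and time resolution. The short sketch below works these out, assuming that the analysis window length equals NFFT (i.e., no zero-padding), which the captions do not state explicitly. The helper name is illustrative.

```python
def stft_resolution(fs_hz, nfft, overlap_fraction):
    """Return (frequency resolution in Hz, window length in s, hop in s).

    Assumption: the window length equals NFFT samples (no zero-padding).
    """
    freq_resolution_hz = fs_hz / nfft
    window_s = nfft / fs_hz
    hop_s = window_s * (1 - overlap_fraction)
    return freq_resolution_hz, window_s, hop_s


# Settings taken from the captions of Figure 2a, Figure 2b, and Figure 6.
fig2a = stft_resolution(48_000, 12_000, 0.5)
fig2b = stft_resolution(96_000, 24_000, 0.5)
fig6 = stft_resolution(96_000, 96_000, 0.0)
print(fig2a, fig2b, fig6)
```

Under this assumption, both panels of Figure 2 have 4 Hz frequency bins and 0.25 s windows advanced in 0.125 s steps, while the one-hour spectrograms in Figure 6 have 1 Hz bins and non-overlapping 1 s windows.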

**Figure 3.** In-air (**a**) received power spectral density (PSD) from 23 overflights (grey), and median (blue) and quartile (red and green) levels. (**b**) One-third octave band levels (median and quartiles; blue, red, and green, respectively) are compared to the in-air audiograms of cormorants, ducks (i.e., lesser scaup), murres, and puffins. Noise above the audiogram lines is expected to be audible.


**Table 1.** Median and quartile audiogram-weighted levels (dBth) for killer whales and seabirds, and A-weighted levels for humans.

When compared with thresholds of behavioral and physiological stress responses in humans and a suite of terrestrial wildlife (i.e., terrestrial birds and mammals), we found that in-air received levels exceeded all identified thresholds (Figure 5a). Underwater received levels exceeded thresholds of startle response for common murre and avoidance by killer whales. The strongest received levels exceeded the threshold of startle response for herring and harbor porpoise, but were below those associated with avoidance in California sea lions (Figure 5b).
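The comparison described above amounts to checking a received level against each published threshold. The sketch below does this using the threshold values listed in the Figure 5 caption and the mean received levels reported in this section. The function and variable names are illustrative, and using a single mean value is a simplification of the full distributions shown in Figure 5.

```python
# Thresholds from the Figure 5 caption. In-air values are in dBA, except
# caribou, which was reported as A-weighted sound exposure level (ASEL).
IN_AIR_THRESHOLDS = {
    "owls": 60,
    "humans": 67,
    "harlequin duck": 80,
    "marbled murrelet": 92,
    "caribou (ASEL)": 98,
}
# Underwater thresholds in dB re 1 uPa.
UNDERWATER_THRESHOLDS = {
    "common murre": 110,
    "killer whale": 116,
    "harbor porpoise": 133,
    "herring": 137,
    "California sea lion": 150,
}


def exceeded_thresholds(received_level_db, thresholds):
    """List the entries whose threshold lies below the received level."""
    return [name for name, limit in thresholds.items() if received_level_db > limit]


# Mean received levels reported in the text: 107 dBA in air and
# 134 dB re 1 uPa under water (strongest 1-s window).
in_air_exceeded = exceeded_thresholds(107, IN_AIR_THRESHOLDS)
underwater_exceeded = exceeded_thresholds(134, UNDERWATER_THRESHOLDS)
print(in_air_exceeded)
print(underwater_exceeded)
```

With the mean in-air level, every listed in-air threshold is exceeded. With the mean underwater level, the herring and California sea lion thresholds are not exceeded, which is why the statement about herring refers only to the strongest received levels.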

**Figure 4.** Underwater (**a**) received power spectral density (PSD) from 10 Growler overflights (grey), with median (blue) and quartile (red and green) levels. Ambient noise percentiles at the time of recording are shown in dotted curves. (**b**) One-third octave band levels (median and quartiles; blue, red, and green, respectively) shown together with the killer whale (black) and cormorant (pink) underwater audiograms. Noise above the audiogram lines is expected to be audible.

LTSAs of in-air recordings show the pattern of FCLPs as 30–60 min periods of rapid consecutive flights interspersed with shorter intervals of reduced or no flights (Figure S1). Underwater noise was detected on multiple dates and time periods when FCLPs were scheduled, with the same pattern of clustered activity (Figure 6a,b). Visual contrasts of underwater noise from FCLPs and routine takeoffs show the unique characteristics of sound from Growlers compared to vessels, and that received levels from Growlers are likely to exceed those associated with a range of typical vessel noise (Figure 6a–c).

#### *3.2. Comparison of Sound Levels and Flight Activity with Prior Studies*

On September 13, 185 landings and overhead passes of aircraft at the north end of Ault Field occurred between 1100 and 1500. Of these, all but two (one Boeing 737 and one DC-9) were EA-18G Growlers engaged in FCLPs. The majority of overhead passes were a single aircraft, but passes with up to three aircraft simultaneously were observed. Seventeen events of Growlers taking off to the south were audible but not visible. On September 16, 83 passes or landings were observed during the same time period; of these, three were Boeing 737s and 10 were P-3s. The remaining 70 events were Growlers, with a maximum of two aircraft observed at any one time; 13 events of Growlers taking off were also audible but not visible.

**Figure 5.** The distribution of received levels (RL) for (**a**) 23 in-air overflight events and (**b**) 10 flight events recorded under water relative to thresholds known to cause behavioral and physiological responses in humans and representative suites of terrestrial and marine wildlife. In-air: owls, 60 dBA = physiological stress responses [31], 50% chance of nest flushing [32], and 50% reduction in the probability of prey detection and hunting strikes [33]; humans, 67 dBA = 50% probability of awakening at night [29] and increases in nighttime blood pressure [28]; harlequin duck, 80 dBA = reduced courtship and increased vigilance and agonism [34]; marbled murrelet, 92 dBA = risk of disturbance in nesting marbled murrelets [30]; caribou, 98 ASEL (A-weighted sound exposure level) = interrupted resting bouts and increased activity [35]. (Note: The threshold for caribou was reported in ASEL which likely overestimates RL (dBA).) Underwater: common murre, 110 dB re 1 μPa = startle response and interrupted feeding [37]; killer whales, 116 dB re 1 μPa = evading noise from small boats [36]; harbor porpoise, 133 dB re 1 μPa = 50% probability of startle response to low- and mid-frequency up/downsweeps [38]; herring, 137 dB re 1 μPa = startle response to recorded boat noise [39]; California sea lions, 150 dB re 1 μPa = 50% probability of avoidance of area with a simulated mid-frequency tactical sonar signal [40].

When the three metrics of maximum received level, daily number of events, and daily duration > 100 dBA were contrasted with those in previous studies that assessed impacts of MLAF, the combined sound levels and flight activity associated with FCLPs exceeded those in most other studies (Figure 7). In studies related to people, some documented louder maximum received levels, but with fewer events and cumulative daily durations (Figure 7a). Similarly, cumulative daily duration was substantially exceeded in only one previous study; however, the received levels were lower (110 vs. 118 dBA). Overall, the sound levels and flight activity we describe in this study bear the strongest similarity to the most extreme areas around airfields on Okinawa, which were measured opportunistically between 1968 and 1972, and then systematically in 1998 (Table S1a). Contrasts with studies for wildlife show that when all three metrics are considered, sound levels and flight activity at NASWI are largely incomparable to most prior studies (Figure 7b). The taxonomic groups that have been evaluated for impacts of MLAF include ungulates (caribou, sheep, deer, and horse), one species of raptor, four species of ducks, two rodents, and one reptile (Table S1b).

**Figure 6.** One-hour spectrograms (fs = 96 kHz, NFFT = 96,000, 0% overlap) contrasting underwater sound from Growlers with examples of vessel noise: (**a**) FCLP sessions on August 20 (0940–1040), (**b**) FCLP sessions on August 27 (1930–2030), and (**c**) typical clusters of consecutive takeoffs (2–3 Growlers per cluster) on September 4 (0915–1015).

**Figure 7.** Maximum received level (RL), cumulative daily duration in seconds > 100 dBA, and the daily number of events in the current study (Whidbey) and previous studies of impacts of military low-altitude flights (Table S1) related to (**a**) people and communities and (**b**) wildlife. Colored symbols reflect (**a**) geographic region and (**b**) broad taxonomic group, with the daily number of events represented by the size of the symbol. Maximum received levels in studies were typically reported as A-weighted sound pressure levels (dBA) or dBA could be calculated, (**\***) with the exception of 3 data points from wildlife studies (black dot inside symbol) that exclusively reported either C-weighted sound pressure levels or A-weighted sound exposure level (Table S1); the exceptions were included as these metrics are expected to overestimate RL (dBA).

#### **4. Discussion**

In this study, we measured noise from an infrequently studied source of MLAF, operating in close proximity to residential sites, recreational areas, and habitat for multiple sensitive marine species. Our goal was to evaluate potential impacts on people and wildlife, using thresholds of response that have been established in previous studies. We measured sound both in air and under water and compared received levels with species-specific audiograms to demonstrate the extent to which noise from Growlers is perceived by sensitive wildlife. We also place the measured noise (i.e., received levels, total daily duration, and number of events) in the context of studies that have assessed impacts of MLAF on people and wildlife. By adopting this integrated approach, our study is uniquely positioned to illustrate knowledge gaps that can undermine assessment of noise impacts.

When we considered noise as a totality of received level, frequency of events, and total daily duration, the sound levels and flight events exceeded those in most previous studies. This finding is critical because it indicates that assessments of impact (e.g., the EIS) are, by definition, based largely on studies that have evaluated responses of people and wildlife to fewer and quieter MLAF events. To find where similar sound levels are experienced by people, we would have to turn to industrial and occupational noise studies, including those for military personnel (e.g., [42]). However, extrapolating from these studies to a community noise context is largely inappropriate given differences in the type and duration of exposure as well as occupational regulations such as time exposure limits, use of hearing protection, and testing [43].

In our review of MLAF studies for people, we found that comparable community or environmental noise has been studied in only one other region of the world, on Okinawa Island, Japan. From World War II until 1998, Okinawa Island had 39 U.S. military facilities (today there are 28), including two major bases, Kadena Air Base and Futenma Air Station. Noise from aircraft was measured opportunistically around these bases in 1968 and 1972 (during the Vietnam War), but was not measured systematically until 1998, when a multi-year study evaluated consequences for health and well-being. The study was launched because at the time it was estimated that 38% of Okinawa's population was living in conditions that exceeded the national standards for exposure to aircraft noise [6]. It is notable that the sound levels and flight activity we document around Whidbey Island are similar to Okinawan measurements during the Vietnam War, prior to passage and adoption of national noise regulations by Japan's Environment Agency in 1973 and the Defense Facilities Administration Agency in 1980.

The same trend was apparent when we compared the maximum received level, number of events, and total daily duration with studies related to wildlife. Only one wildlife study, conducted in a laboratory, exceeded the total daily duration that we measured. Although some studies evaluated exposure to stronger maximum received levels, cumulative daily duration was less. For example, one of the most comprehensive assessments conducted by Goudie and Jones (2004) examined behavioral responses of harlequin ducks to MLAF, finding reduced courtship and increased agonism at a threshold of 80 dBA, with recovery requiring about two hours [34,44]. Although Goudie and Jones (2004) recorded a higher maximum sound pressure level, the typical number of daily events was just 3% of the number we document in the current study. This example illustrates the difficulty in assessing impacts of increased Growler training on area wildlife. We face not only general research limitations in how noise impacts communication, behavior, foraging, and ultimately fitness of wildlife [8,16] but an added burden of extrapolating to a number of events, received levels, and cumulative daily duration that is largely unstudied.

We considered carefully whether sampling decisions or assumptions made during analysis could have inflated the number of events, received levels, or cumulative daily duration. Recordings of in-air noise were made over two days, which could be considered non-representative (e.g., if an extreme number of events were recorded). However, when we compiled the FCLP schedule for the past 4 years, we found that our sampling days, with two published active time frames, represented typical and moderate training activity for a single day. The number of flight events and received levels we recorded were also consistent with previous monitoring of Growlers on Whidbey Island. Between 2013 and 2019, noise and events from FCLPs were measured periodically at 12 locations around Coupeville OLF. The daily number of flight events associated with FCLPs ranged from 69 to 239, easily encompassing our calculated average of 127 events per training day [45–48]. Similarly, maximum sound levels in previous sampling ranged from 97 to 121 dBA (depending on distance from the flight track in use), a range that also encompasses our maximum of 118 dBA.

Knowledge of the consequences for health and well-being of people experiencing these numbers of flight events and cumulative daily duration of noise exposure is necessarily limited, given that similar conditions have been rarely studied. The investigations by the Okinawa Prefecture in the mid-1990s offer the best available information on implications for public health. One of these associated studies found that noise from aircraft around the Kadena airbase was hazardous and sufficient to cause hearing loss among the population as a whole [49], while an epidemiological study identified individuals with noise-induced hearing loss that was likely due to living in proximity to the base [6,50]. In a different region of the world, Finnish Air Force investigations found that two claims of hearing loss from MLAF events were plausible based on measured exposures [51]. A unique laboratory study examining temporary threshold shift and stapedius reflex period concluded that 114 dBA (below our measured maximum) was a critical threshold, where repeated exposure to military aircraft noise above this was likely to result in noise-induced hearing loss [52].

Studies have also documented consequences for cardiovascular health within the noise exposures that we measured. Clear dose-response relationships existed between blood pressure and aircraft noise surrounding Kadena and Futenma airbases, with noise-exposed groups exhibiting a 30% increase over control groups [6]; risk of hypertension due to noise exposure was highest for older age groups [53]. Laboratory studies have demonstrated short-term increases in blood pressure following exposures to military aircraft noise, with suggested response thresholds ranging from 90 [54] to 106 dBA [55]. Other aspects of human health and well-being including annoyance, sleep disturbance, resident dissatisfaction, and even low birth weights have been studied and associated with high-intensity exposure specifically from military aircraft [6,41,56–58] as well as at lower levels of community noise from civilian aircraft and other sources [4,59]. As a result of reviewing these and other studies, in 2017 the Washington State Department of Health recommended that the U.S. Navy conduct a health impact assessment as part of the EIS process [60]. Our results confirm the need for such an assessment to occur.

The strength of the noise from flight events resulted in another critical finding from this study, where we document sound from Growlers 30 m below the sea surface, and at levels known to trigger behavioral changes for aquatic wildlife. Sound levels between the hydrophone and the surface may have been stronger than those we measured, though complex noise fields arise, particularly in shallow water (see, e.g., Figure 2i–l in Erbe et al. 2017 [61]). For Endangered SRKW, received levels were above those associated with changes in call amplitude [62,63] and avoidance or changes in behavior [36]. Other Salish Sea marine mammals that have been shown to react with avoidance or startle responses to low-mid frequency sound in this range include harbor porpoise [38,64,65], harbor seals (*Phoca vitulina*) [66], and gray whales (*Eschrichtius robustus*) [67]. At lower levels, communication masking has been demonstrated for bottlenose dolphins (*Tursiops aduncus*) [68], and is likely for other species such as humpback whales (*Megaptera novaeangliae*) [69]. Although the number of studies that have looked at behavioral responses of fish to noise is small, our received levels overlapped or exceeded thresholds for herring and some other marine fish including sea bass (*Dicentrarchus labrax*) [39,70]. Studies of how noise may be perceived by and impact seabirds underwater are almost nonexistent. In 2011, an expert panel was convened to establish underwater thresholds for injury to Threatened marbled murrelet from pile driving noise, but injury was defined to include only permanent loss of cochlear hair cells or barotrauma [71]. However, two very recent studies have demonstrated startle, avoidance, and changes in foraging of common murre and Gentoo penguins (*Pygoscelis papua*) at 105–115 dB re 1 μPa, suggesting that marbled murrelet and other seabirds around Whidbey Island may be impacted by underwater (and in-air) noise from jet aircraft [37,72].

Our results indicate underwater impacts that have been unstudied, underestimated, or otherwise dismissed in the two relevant EIS [17,73] and corresponding Biological Opinion(s) for ESA-listed species [19,30]. Of chief concern in this region is SRKW. In the EIS, underwater noise from aircraft was deemed unlikely to adversely affect SRKW (and humpback whales), and to have no effect on critical habitat. The rationale for this conclusion included assertions that whales would have to be at the surface of the water and directly underneath low-altitude aircraft (<300 m), and that whales were already exposed to boat and ship noise that could "drown out or lessen" any noise from aircraft. Our results indicate instead that noise from Growlers is measurable at least 30 m under water, with sound levels known to impact whales. Furthermore, these sound levels are comparable to those documented by studies of noise that is experienced by SRKW from small and large vessels [74,75]. The reason that no effects on SRKW critical habitat are assumed is not due to evaluation of noise impacts, but rather to an exemption of waters within the boundaries of military installations from critical habitat designation (see Appendix C, Sections 4.1 and 3.2.5 in [17]). Lastly, the rationale for not considering impacts of aircraft flying higher than 300 m is based on an agreement with the National Marine Fisheries Service in 2015 to assume that underwater noise from any event where aircraft exceed this altitude will cause no reaction in marine mammals (see Appendix C and Section 4.1 in [17]). This is despite the fact that modeled underwater noise for aircraft at altitudes of 300–3000 m is 128–152 dB re 1 μPa (see Table 3.0–4 in [73]), exceeding known thresholds for behavioral reactions and adverse impacts on marine mammals, including SRKW (Figure 5). A recent synthesis of underwater noise and vessel disturbance on SRKW by the Washington State Academy of Sciences recommended "defining every interaction with an SRKW as an opportunity to disturb a whale", due to the "fragile condition" of the population [76]. Collectively, we believe that our results create a case for revisiting these impact assessments as well as future inclusion of military aircraft noise in cumulative effects models for SRKW [77].

Evaluating the EIS process against our results further illuminates a problem wherein risk from noise effects is calculated based on the likelihood that individuals (e.g., an individual SRKW or marbled murrelet group) will be exposed to sound levels that result in physical damage (i.e., hearing damage or barotrauma) or direct changes in behavior such as foraging, breeding, or nesting. This does not account for the problem that the habitat itself is being impacted by noise, and becomes less hospitable [78], nor that noise may be added to other stressors [79,80]. In this scenario, the very rarity of the species becomes a factor in assuming impacts are discountable, negligible, or insignificant (see Appendix C and Section 4.1 in [17]). Our purpose in outlining these inconsistencies in environmental impact assessment is not to point fingers at federal oversight agencies or the U.S. military, but to exemplify how knowledge gaps [81], exemptions [5], and use of high noise thresholds for harm intersect to discount or underestimate noise impacts on wildlife, which are increasingly understood to include indirect effects on habitat, abundance and fitness of populations [16,82].

The above challenges are part of an evolving understanding of how to evaluate and mitigate growing noise pollution worldwide. However, our study reveals an added challenge specific to noise from MLAF: a scarcity of studies, resulting in large knowledge gaps with respect to impact. There are substantial logistical and bureaucratic hurdles in monitoring military operations; these have been pointed out in reviews [3,9] as well as experienced by the authors of this study. In particular, the fact that schedules are usually not available in advance and access to operational areas may be restricted increases the time and cost of doing these studies. Despite the challenges, there is a strong need to close knowledge gaps, as increased noise from MLAF is predicted to become more common in the future due to base consolidation [2]. Other countries (i.e., Finland and Australia) have recently adopted the Growler platform and may find similar issues in locating training facilities. The problem is not limited to Growler aircraft; the new F-35 is also causing similar concerns and discomfort in areas around airfields [83,84]. And while the trend within the U.S. is toward consolidation, the building of new bases and the expansion of military aircraft activity continues worldwide [41,58,85].

In summary, our study suggests the need for underwater noise from Growlers to be included in cumulative effects models [77] and Biological Opinions for ESA-listed species [19,30], as well as more broadly evaluated outside of the immediate vicinity of Whidbey Island. Furthermore, our results show that sound levels and flight operations around NASWI are largely beyond those that have been previously evaluated, supporting calls for a comprehensive health assessment to evaluate consequences for human health and well-being [60]. Finally, we hope that this study stimulates consideration of how to evaluate impacts of intense noise exposure not only for the benefit of this region, but other areas that may face similar challenges now and in the future.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2077-1312/8/11/923/s1, Figure S1: LTSAs of in-air recordings, Table S1: Noise metrics and attributes extracted from MLAF studies, Table S2: Summary of FCLP days and time periods, 2015–2019, Data S1: Results of MLAF literature review.

**Author Contributions:** R.W., E.A. and L.M.K. conceived and designed the study and data collection; R.W., L.T.B., M.S.C. and L.M.K. collected the data; L.M.K. conducted the literature review; C.E., R.W., L.M.K. and M.S.C. analyzed the data; R.W., E.A., L.M.K. and C.E. contributed equipment and analytical tools and software; L.M.K., C.E. and R.W. wrote the paper; all authors contributed to editing and final reviews of the paper. All authors have read and agreed to the published version of the manuscript.

**Funding:** Multiple small grants and sources of funding made this work possible. The National Parks Conservation Association contributed funds and led a campaign to allow individual donors to contribute to the project. A project grant was also awarded from The Suquamish Foundation (Appendix X Award 2018Q226). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

*J. Mar. Sci. Eng.* **2020**, *8*, 923

**Acknowledgments:** We are grateful for the assistance of several volunteers on this project and manuscript. Heather McCauliffe loaned two Extech sound pressure data loggers. Kimberly Nielsen assisted with the creation of Figure 1, and Toby Hall assisted with deployment and retrieval of the SoundTrap. The manuscript was substantially improved by comments from two anonymous reviewers. Lastly, Rob Williams thanks and acknowledges the Pew Fellows Program in Marine Conservation for support.

**Conflicts of Interest:** The authors declare no conflict of interest.
