Cetaceans belong to the clade Whippomorpha and comprise Mysticeti (baleen whales) and Odontoceti (toothed whales). Acoustic signals, especially the active signals that cetaceans generate themselves, are their most important means of perceiving the environment and communicating [1,2].
Odontoceti can use their sonar systems for echolocation through wide-frequency band signals [3,4]. Since the 20th century, excessive whaling has shrunk these animals' habitats, and some species are now endangered [5,6,7,8]. To protect cetaceans, the International Convention for the Regulation of Whaling was adopted, and the International Whaling Commission (IWC) was established under it [9]. In recent years, studies have shown that, in addition to hunting, the noise generated by human activities also has a strongly negative impact on cetaceans [1,10,11,12,13,14,15].
Odontoceti, especially Delphinidae and Phocoenidae, depend heavily on their sonar systems for environmental perception and predation [1,2,3,16,17]. Human activities are frequent in rivers, estuaries, and coastal areas, and the noise they produce seriously interferes with, and can even damage, the auditory systems of Odontoceti [10,11,12,13,15]. Taking Sousa chinensis as an example, its click is a broadband signal of short duration [18] that is easily masked by impact or knocking noises [13]. Young individuals can distinguish artificial knocking sounds from their own click signals. However, hearing capacity declines with age [19,20], so older individuals may confuse knocking sounds with click signals and enter rivers by mistake. Some aged Sousa chinensis have died from stranding or fungal infection as a result of these mistakes. The study of cetacean acoustic signals is therefore important for species protection and for the development of biological resources.
Hydrophone data are the most widely used type of information in cetacean research because sound travels much farther than light in water and is more easily recorded. Acoustic research on cetaceans using hydrophones began in the 1940s, following the development of hydrophone technology during World War II. In 1949, William E. Schevill et al. studied the acoustic signals of the white porpoise (Delphinapterus leucas) using a hydrophone operating at 0.5–10 kHz [21]. Since then, many researchers have used hydrophones to characterize the acoustic signals of cetaceans. In 1993, Whitlow W.L. Au summarized the results of acoustic research on a variety of cetaceans in the book The Sonar of Dolphins [3]. Once the characteristics of these signals had been determined, researchers began to study how cetacean acoustic signals relate to environment and behavior. The detection and analysis of acoustic signals therefore became increasingly important.
Tursiops truncatus is known to avoid obstacles in its path while swimming and to locate fish for food by sound reflection (echolocation) [4,22]. Johnson et al. found that the upper hearing limit of Tursiops truncatus can reach 120–140 kHz. Liang Fang et al. studied the high-frequency echolocation signals of Sousa chinensis in Sanniang Bay, Guangxi Province, China, and found a mean peak frequency of 109 kHz [18]. Liang Fang et al. studied the echolocation signals of captive and free-ranging Neophocaena asiaeorientalis and found that the main center frequency of clicks was 133 kHz for individuals in the Baiji aquarium, 128 kHz for individuals at the Shishou Tian-e-zhou Reserve, and 129 kHz for individuals at Tianxingzhou [23]. In most studies, researchers lowered the hydrophone into the water to collect acoustic signals; some instead pressed the hydrophone against the dolphin's skin, as in the work of T.H. Bullock et al. [24].
Cetacean acoustic signals can be divided into whistles, burst pulses, and clicks according to their time–frequency characteristics. The click is a broadband signal whose upper frequency limit reaches 150 kHz and in some cases exceeds 200 kHz. Identifying the clicks of Odontoceti in large quantities for research purposes remains difficult. On the one hand, ambient marine noise interferes with the sound signals. On the other hand, knocks and the sounds of non-target marine animals may be misidentified as clicks [25]. Abbas et al. designed an FChOA-MLPNN for the automatic detection of marine mammal sounds [26]. However, manually labeling the dataset was labor-intensive and could lead to knock signals being mislabeled as click signals. Yang et al. transformed hydrophone data from the time domain into the time–frequency domain using the short-time Fourier transform (STFT) and marked dolphin acoustic signals according to their duration, short-time energy, and spectral centroid. Owing to the uncertainty principle, a time–frequency spectrum computed with the STFT cannot achieve high temporal resolution and high frequency resolution at the same time. Yang's method marks whistles and burst pulses accurately but mismarks some clicks [27].
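The STFT-based features mentioned above, and the resolution trade-off that limits them, can be illustrated with a minimal sketch. This is not Yang et al.'s implementation: the function names (`stft_frame_features`, `stft_resolution`), the Hann window, the frame sizes, and the synthetic 100 kHz Gaussian-enveloped pulse standing in for a click are all assumptions chosen for illustration. A naive DFT is used only to keep the example dependency-free; an FFT routine would normally replace it.

```python
import cmath
import math


def stft_frame_features(signal, fs, frame_len, hop):
    """Return (frame_start_s, short_time_energy, spectral_centroid_hz) per frame."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        energy = sum(x * x for x in frame)
        # Naive DFT over the non-negative frequency bins (fine for short frames).
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        total = sum(mags)
        # Magnitude-weighted mean frequency; 0.0 for an all-zero frame.
        centroid = (sum(k * fs / frame_len * m for k, m in enumerate(mags)) / total
                    if total > 0 else 0.0)
        features.append((start / fs, energy, centroid))
    return features


def stft_resolution(fs, frame_len):
    """Duration of one frame (s) and spacing of its frequency bins (Hz)."""
    return frame_len / fs, fs / frame_len


# Hypothetical test signal: a Gaussian-enveloped 100 kHz pulse sampled at 500 kHz.
FS = 500_000      # sampling rate in Hz (assumed)
F0 = 100_000      # pulse center frequency in Hz (assumed)
N0 = 224          # sample index of the pulse center (assumed)
SIGMA = 5.0       # envelope standard deviation in samples (assumed)
click = [math.exp(-0.5 * ((n - N0) / SIGMA) ** 2)
         * math.sin(2 * math.pi * F0 * n / FS) for n in range(512)]

features = stft_frame_features(click, FS, frame_len=64, hop=32)
loudest = max(features, key=lambda f: f[1])   # frame with the highest energy
short_dt, short_df = stft_resolution(FS, 64)
long_dt, long_df = stft_resolution(FS, 256)
```

In this sketch the frame duration and the bin spacing always multiply to one, so lengthening the frame from 64 to 256 samples narrows the frequency bins only by making each frame four times longer in time. That fixed trade-off is one plausible reason why a very short click is harder to mark in an STFT spectrum than a whistle.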
With the development of digital signal processing (DSP), a newer signal analysis method emerged: the wavelet transform (WT), whose resolution changes dynamically with frequency. Although, by the uncertainty principle, the WT still cannot provide high temporal resolution and high frequency resolution in the same window, its dynamic resolution captures more detail in wide-frequency band signals [28]. Thus far, the WT has been applied to seismic signal recognition, flaw detection in parts, image processing, and other fields [29,30,31].
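The frequency-dependent resolution described above can be sketched with a complex Morlet wavelet. This is an assumed, illustrative choice and not necessarily the wavelet used in the cited work. The helper names (`morlet_widths`, `morlet_cwt_row`), the fixed five cycles per wavelet, and the synthetic 100 kHz pulse are hypothetical, and the direct correlation is written for clarity rather than speed.

```python
import cmath
import math


def morlet_widths(freq_hz, n_cycles=5.0):
    """Std. dev. of the wavelet's Gaussian envelope in time (s) and frequency (Hz)."""
    sigma_t = n_cycles / (2 * math.pi * freq_hz)
    sigma_f = 1.0 / (2 * math.pi * sigma_t)   # equals freq_hz / n_cycles
    return sigma_t, sigma_f


def morlet_cwt_row(signal, fs, freq_hz, n_cycles=5.0):
    """Magnitude of the wavelet coefficients at one analysis frequency."""
    sigma_t, _ = morlet_widths(freq_hz, n_cycles)
    half = int(math.ceil(4 * sigma_t * fs))   # truncate the wavelet at +/- 4 sigma
    # Complex-conjugated Morlet wavelet sampled at the signal's rate.
    wavelet = [math.exp(-0.5 * (m / fs / sigma_t) ** 2)
               * cmath.exp(-2j * math.pi * freq_hz * m / fs)
               for m in range(-half, half + 1)]
    row = []
    for b in range(len(signal)):              # b = time shift of the wavelet
        acc = 0j
        for i, w in enumerate(wavelet):
            n = b + i - half
            if 0 <= n < len(signal):
                acc += signal[n] * w
        row.append(abs(acc))
    return row


# Hypothetical test signal: a Gaussian-enveloped 100 kHz pulse sampled at 500 kHz.
FS = 500_000      # sampling rate in Hz (assumed)
N0 = 300          # sample index of the pulse center (assumed)
pulse = [math.exp(-0.5 * ((n - N0) / 5.0) ** 2)
         * math.sin(2 * math.pi * 100_000 * n / FS) for n in range(512)]

row_high = morlet_cwt_row(pulse, FS, 100_000)   # short wavelet, fine time resolution
row_low = morlet_cwt_row(pulse, FS, 10_000)     # long wavelet, fine frequency resolution
peak_index = row_high.index(max(row_high))
```

Under these assumptions, the wavelet at 100 kHz is ten times shorter in time and ten times wider in frequency than the one at 10 kHz, while the product of the two widths stays constant. The analysis therefore localizes a short high-frequency pulse tightly in time without giving up frequency detail at the low end of the band.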