Optimal Representation of Anuran Call Spectrum in Environmental Monitoring Systems Using Wireless Sensor Networks
Abstract
1. Introduction
1.1. Environmental Monitoring of Anuran Calls as Indicators of Climate Change
1.2. Previous Work
1.3. Research Objectives
2. Materials and Methods
2.1. WSN Architecture
2.2. Sounds Database
- Epidalea calamita; mating call (369 records),
- Epidalea calamita; release call (63 records),
- Alytes obstetricans; mating call (419 records),
- Alytes obstetricans; distress call (17 records).
2.3. Sound Framing
2.4. Spectrum Representation
2.4.1. MPEG-7 Feature Extraction
- Spectrogram analysis. By applying the Fast Fourier Transform (FFT) to the frame values, a spectral representation is obtained for each frame. The five parameters derived from this spectrum are:
  - Total power.
  - Relevant power, that is, the power in a certain frequency band.
  - Power centroid.
  - Spectral dispersion.
  - Spectrum flatness.
- Linear prediction coding (LPC) analysis. From the sound values, a model of the sound source is estimated. This model uses a harmonic sound generator, a random sound generator, and a digital filter defined by its characteristic polynomial. The roots of this polynomial are complex numbers that determine the formants, and therefore play a key role in this technique. Through LPC analysis, the spectrum envelope can be obtained and the following 11 parameters can be derived:
  - Frequency of the formants (only the first three formants are considered).
  - Bandwidth of the formants (only the first three formants are considered).
  - Pitch.
  - Harmonic centroid.
  - Harmonic spectral deviation.
  - Harmonic spectral spread.
  - Harmonic spectral variation.
- Harmonicity analysis. The autocorrelation function of the sound values is computed, since this function is an indirect way of describing a spectrum. The two parameters derived from this analysis are:
  - Harmonicity ratio.
  - Upper limit of harmonicity.
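As a rough illustration of the five spectrogram-derived parameters listed above, the following Python sketch computes them for a single frame. It uses simplified textbook definitions (power-weighted mean and standard deviation of frequency, and the ratio of geometric to arithmetic mean for flatness), not the normative MPEG-7 descriptor definitions. The function names, the naive DFT, and the default 1000–5000 Hz band for the relevant power are assumptions made only for this example.

```python
import cmath
import math


def power_spectrum(frame, sample_rate):
    """Return (frequencies, powers) for the non-negative half of a naive DFT.

    A real system would use an FFT; the O(n^2) DFT keeps the sketch
    dependency-free.
    """
    n = len(frame)
    freqs, powers = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        freqs.append(k * sample_rate / n)
        powers.append(abs(coeff) ** 2 / n)
    return freqs, powers


def spectral_parameters(frame, sample_rate, band=(1000.0, 5000.0)):
    """Simplified versions of the five spectrum parameters (assumed forms)."""
    freqs, powers = power_spectrum(frame, sample_rate)
    total = sum(powers)
    # Power inside the band of interest (band limits are an assumption).
    relevant = sum(p for f, p in zip(freqs, powers) if band[0] <= f <= band[1])
    # Power-weighted mean frequency and its standard deviation.
    centroid = sum(f * p for f, p in zip(freqs, powers)) / total
    dispersion = math.sqrt(
        sum((f - centroid) ** 2 * p for f, p in zip(freqs, powers)) / total)
    # Flatness: geometric mean over arithmetic mean of the power values.
    eps = 1e-12  # avoids log(0) for empty bins
    geometric = math.exp(sum(math.log(p + eps) for p in powers) / len(powers))
    arithmetic = total / len(powers)
    return {
        "total_power": total,
        "relevant_power": relevant,
        "centroid_hz": centroid,
        "dispersion_hz": dispersion,
        "flatness": geometric / (arithmetic + eps),
    }
```

For a pure tone the centroid falls on the tone frequency, the dispersion is close to zero, and the flatness is close to zero; for white noise the flatness approaches one.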
2.4.2. Filter Bank Energy
2.4.3. Cepstral Representation
2.4.4. Sound Pre-Emphasis
2.4.5. Cepstral Liftering
2.4.6. Mel Frequency Cepstral Coefficients (MFCCs)
2.5. Sound Classifiers
2.6. Classification Metrics
3. Results
3.1. Sound Classification Using MPEG-7 Features
3.2. Sound Classification Using Filter Bank Energies
3.3. Sound Classification Using MFCC (Default Options)
3.4. Classification Performances versus MFCC Feature Extraction Options
3.5. Sound Classification Using Optimal MFCC
4. Discussion
4.1. Comparing Classification Performances
4.2. Breaking Down the Improvement in Classification Performances
- MPEG-7 features (extracted with the options described in Table 5).
- FBE (extracted with the options described in Table 5).
- FBE in log-scale, that is, extracted with the same options used in the previous stage but using a logarithmic scale to represent the energies.
- FBE in mel-log-scale, that is, extracted with the same options used in the previous stage but using a mel scale to represent the frequencies. In fact, a mel filter bank, as described in Section 2.4.2, was used.
- FBE in mel-log-scale with optimum options, that is, extracted with the same options used in the previous stage but using the optimum values for the remaining extracting options.
- DCT (Discrete Cosine Transform) of the FBE in mel-log-scale, that is, the DCT of stage 4 (FBE in mel-log-scale). This result is in fact a set of Mel Frequency Cepstral Coefficients (MFCC), but obtained with options that are neither the default options defined in HTK nor the optimum values obtained in Section 3.
- MFCC with optimum frame duration, that is, extracted with the same options used in the previous stage but using the optimum frame duration.
- MFCC with optimum options, that is, extracted with the same options used in the previous stage but now using the optimum values for the low-frequency and high-frequency limits of the spectrum.
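The chain of stages above (filter bank energies, logarithmic scale, mel scale, DCT) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the mel mapping is the common 2595·log10(1 + f/700) form, the filters are triangular, and the DCT-II is left unnormalized. The defaults of 20 filters, 20 coefficients, and a 1000–5000 Hz band are the values read from the MFCC-opt column of the extraction options table.

```python
import math


def hz_to_mel(f_hz):
    """Common mel mapping (assumed form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_bank_energies(freqs, powers, n_filters, f_low, f_high):
    """Energies of triangular filters equally spaced on the mel scale."""
    mel_low, mel_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (mel_high - mel_low) / (n_filters + 1)
    edges = [mel_to_hz(mel_low + i * step) for i in range(n_filters + 2)]
    energies = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        energy = 0.0
        for f, p in zip(freqs, powers):
            if lo < f <= mid:        # rising slope of the triangle
                energy += p * (f - lo) / (mid - lo)
            elif mid < f < hi:       # falling slope of the triangle
                energy += p * (hi - f) / (hi - mid)
        energies.append(energy)
    return energies


def log_energies(energies, floor=1e-10):
    """Logarithmic scale; the floor avoids log(0) for empty filters."""
    return [math.log(max(e, floor)) for e in energies]


def dct_ii(values, n_coeffs):
    """Unnormalized DCT-II of the log energies."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc_from_power_spectrum(freqs, powers, n_filters=20, f_low=1000.0,
                             f_high=5000.0, n_coeffs=20):
    """Mel filter bank -> log -> DCT, applied to one frame's power spectrum."""
    fbe = mel_filter_bank_energies(freqs, powers, n_filters, f_low, f_high)
    return dct_ii(log_energies(fbe), n_coeffs)
```

Stopping after `log_energies` yields the mel-log FBE representation of the intermediate stages, and applying `dct_ii` on top of it yields the cepstral coefficients, so each stage of the breakdown corresponds to one function in the sketch.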
4.3. Reducing the Spectrum Representation Vector
5. Conclusions
Author Contributions
Acknowledgments
Conflicts of Interest
References
- Menzel, A.; Sparks, T.H.; Estrella, N.; Koch, E.; Aasa, A.; Ahas, R.; Alm-Kübler, K.; Bissolli, P.; Braslavská, O.; Briede, A.; et al. European phenological response to climate change matches the warming pattern. Glob. Chang. Biol. 2006, 12, 1969–1976.
- Khamukhin, A.A.; Demin, A.Y.; Sonkin, D.M.; Bertoldo, S.; Perona, G.; Kretova, V. An algorithm of the wildfire classification by its acoustic emission spectrum using Wireless Sensor Networks. J. Phys. Conf. Ser. 2017, 803, 1–6.
- Pörtner, H.O.; Knust, R. Climate change affects marine fishes through the oxygen limitation of thermal tolerance. Science 2007, 315, 95–97.
- Deutsch, C.A.; Tewksbury, J.J.; Huey, R.B.; Sheldon, K.S.; Ghalambor, C.K.; Haak, D.C.; Martin, P.R. Impacts of climate warming on terrestrial ectotherms across latitude. Proc. Natl. Acad. Sci. USA 2008, 105, 6668–6672.
- Huey, R.B.; Deutsch, C.A.; Tewksbury, J.J.; Vitt, L.J.; Hertz, P.E.; Pérez, H.J.Á.; Garland, T. Why tropical forest lizards are vulnerable to climate warming. Proc. R. Soc. Lond. B Biol. Sci. 2009.
- Kearney, M.; Shine, R.; Porter, W.P. The potential for behavioral thermoregulation to buffer “cold-blooded” animals against climate warming. Proc. Natl. Acad. Sci. USA 2009, 106, 3835–3840.
- Duarte, H.; Tejedo, M.; Katzenberger, M.; Marangoni, F.; Baldo, D.; Beltrán, J.F.; Martí, D.A.; Richter-Boix, A.; Gonzalez-Voyer, A. Can amphibians take the heat? Vulnerability to climate warming in subtropical and temperate larval amphibian communities. Glob. Chang. Biol. 2012, 18, 412–421.
- Bradbury, J.W.; Vehrencamp, S.L. Principles of Animal Communication, 2nd ed.; Sinauer Associates: Sunderland, MA, USA, 2011; ISBN 978-0878930456.
- Fay, R.R.; Popper, A.N. (Eds.) Comparative Hearing: Fish and Amphibians; Springer Science & Business Media: New York, NY, USA, 2012; Volume 11, ISBN 978-0387984704.
- Gerhardt, H.C.; Huber, F. Acoustic Communication in Insects and Anurans: Common Problems and Diverse Solutions; University of Chicago Press: Chicago, IL, USA, 2002; ISBN 978-0226288338.
- Bellis, E.D. The effects of temperature on salientian breeding calls. Copeia 1957, 1957, 85–89.
- Walker, T.J. Specificity in the response of female tree crickets (Orthoptera, Gryllidae, Oecanthinae) to calling songs of the males. Ann. Entomol. Soc. Am. 1957, 50, 626–636.
- Walker, T.J. Factors responsible for intraspecific variation in the calling songs of crickets. Evolution 1962, 16, 407–428.
- Schneider, H. Structure of the mating calls and relationships of the European tree frogs (Hylidae, Anura). Oecologia 1974, 14, 99–110.
- Gerhardt, H.C.; Mudry, K.M. Temperature effects on frequency preferences and mating call frequencies in the green treefrog, Hyla cinerea (Anura: Hylidae). J. Comp. Physiol. 1980, 137, 1–6.
- Gayou, D.C. Effects of temperature on the mating call of Hyla versicolor. Copeia 1984, 1984, 733–738.
- Pires, A.; Hoy, R.R. Temperature coupling in cricket acoustic communication. J. Comp. Physiol. A 1992, 171, 79–92.
- Márquez, R.; Bosch, J. Advertisement calls of the midwife toads Alytes (Amphibia, Anura, Discoglossidae) in continental Spain. J. Zool. Syst. Evol. Res. 1995, 33, 185–192.
- Llusia, D.; Márquez, R.; Beltrán, J.F.; Benitez, M.; Do Amaral, J.P. Calling behaviour under climate change: Geographical and seasonal variation of calling temperatures in ectotherms. Glob. Chang. Biol. 2013, 19, 2655–2674.
- Akyildiz, I.; Melodia, T.; Chowdhury, K. Wireless multimedia sensor networks: A survey. IEEE Wirel. Commun. 2007, 14, 32–39.
- Wimmer, J.; Towsey, M.; Roe, P.; Williamson, I. Sampling environmental acoustic recordings to determine bird species richness. Ecol. Appl. 2013, 23, 1419–1428.
- Alonso, J.B.; Cabrera, J.; Shyamnani, R.; Travieso, C.M.; Bolaños, F.; García, A.; Villegas, A.; Wainwright, M. Automatic anuran identification using noise removal and audio activity detection. Expert Syst. Appl. 2017, 72, 83–92.
- Luque, J.; Larios, D.F.; Personal, E.; Barbancho, J.; León, C. Evaluation of MPEG-7-Based Audio Descriptors for Animal Voice Recognition over Wireless Acoustic Sensor Networks. Sensors 2016, 16, 717.
- Luque, A.; Romero-Lemos, J.; Carrasco, A.; Barbancho, J. Non-sequential automatic classification of anuran sounds for the estimation of climate-change indicators. Expert Syst. Appl. 2018, 95, 248–260.
- Romero, J.; Luque, A.; Carrasco, A. Animal Sound Classification using Sequential Classifiers. In BIOSTEC 2017: 10th International Joint Conference on Biomedical Engineering Systems and Technologies; ScitePress Digital Library: Setubal, Portugal, 2017; pp. 242–274.
- Luque, A.; Gómez-Bellido, J.; Carrasco, A.; Personal, E.; Leon, C. Evaluation of the Processing Times in Anuran Sound Classification. Wirel. Commun. Mob. Comput. 2017, 2017, 8079846.
- Larios, D.F.; Barbancho, J.; Sevillano, J.L.; Rodríguez, G.; Molina, F.J.; Gasull, V.G.; León, C. Five years of designing wireless sensor networks in the Doñana Biological Reserve (Spain): An applications approach. Sensors 2013, 13, 12044–12069.
- Fonozoo. Available online: www.fonozoo.com (accessed on 23 January 2018).
- Blum, A.L.; Langley, P. Selection of relevant features and examples in machine learning. Artif. Intell. 1997, 97, 245–271.
- Raman, B.; Ioerger, T.R. Enhancing Learning Using Feature and Example Selection; Texas A&M University: College Station, TX, USA, 2003.
- Olvera-López, J.A.; Carrasco-Ochoa, J.A.; Martínez-Trinidad, J.F.; Kittler, J. A review of instance selection methods. Artif. Intell. Rev. 2010, 34, 133–143.
- Borovicka, T.; Jirina, M.; Kordik, P.; Jirina, M. Selecting representative data sets. In Advances in Data Mining Knowledge Discovery and Applications; InTech: London, UK, 2012.
- Patel, R.R.; Dubrovskiy, D.; Döllinger, M. Measurement of glottal cycle characteristics between children and adults: Physiological variations. J. Voice 2014, 28, 476–486.
- ISO. ISO/IEC 15938-4:2001 (MPEG-7: Multimedia Content Description Interface), Part 4: Audio; ISO/IEC JTC, 1; ISO: Geneva, Switzerland, 2001.
- ISO. ISO 226:2003. In Acoustics—Normal Equal-Loudness-Level Contours; ISO: Geneva, Switzerland, 2003.
- Stevens, S.S.; Volkmann, J.; Newman, E.B. A scale for the measurement of the psychological magnitude pitch. J. Acoust. Soc. Am. 1937, 8, 185–190.
- O’Shaughnessy, D. Speech Communication: Human and Machine, 2nd ed.; Wiley-IEEE Press: Hoboken, NJ, USA, 1999; ISBN 978-0-7803-3449-6.
- ETSI. ETSI Std 202 050-1.5 Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Advanced Front-end Feature Extraction Algorithm; Compression Algorithms; ETSI: Nice, France, 2007.
- Young, S.; Evermann, G.; Gales, M.; Hain, T.; Kershaw, D.; Liu, X.; Moore, G.; Odell, J.; Ollason, D.; Povey, D.; et al. The HTK Book (for HTK Version 3.5); Department of Engineering, University of Cambridge: Cambridge, UK, 2015.
- Wacker, A.G.; Landgrebe, D.A. The Minimum Distance Approach to Classification; Information Note 100771; Purdue University: West Lafayette, IN, USA, 1971.
- Le Cam, L. Maximum likelihood: An introduction. Int. Stat. Rev./Rev. Int. Stat. 1990, 153–171.
- Rokach, L.; Maimon, O. Data Mining with Decision Trees: Theory and Applications; World Scientific Pub Co. Inc.: Singapore, 2008; ISBN 978-981-277-171-1.
- Cover, T.M.; Hart, P.E. Nearest neighbour pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
- Cristianini, N.; Shawe-Taylor, J. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods; Cambridge University Press: Cambridge, UK, 2000; ISBN 9780521780193.
- Dobson, A.J.; Barnett, A. An Introduction to Generalized Linear Models; CRC Press: Boca Raton, FL, USA, 2008; ISBN 9781584889502.
- Du, K.L.; Swamy, M.N.S. Neural Networks and Statistical Learning; Springer Science and Business Media: Berlin, Germany, 2013; ISBN 978-1-4471-5571-3.
- Härdle, W.K.; Simar, L. Applied Multivariate Statistical Analysis; Springer Science and Business Media: Berlin, Germany, 2012; ISBN 978-3-540-72244-1.
- Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference and Prediction; Springer: Berlin, Germany, 2005; ISBN 978-0-387-84858-7.
- Rabiner, L.R. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 1989, 77, 257–286.
- Luque, A.; Romero-Lemos, J.; Carrasco, A.; Gonzalez-Abril, L. Temporally-aware algorithms for the classification of anuran sounds. PeerJ 2018, 6, e4732.
- Sturm, B.L. A simple method to determine if a music information retrieval system is a “horse”. IEEE Trans. Multimedia 2014, 16, 1636–1644.
- Sokolova, M.; Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009, 45, 427–437.
- Powers, D.M. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation. J. Mach. Learn. Technol. 2011, 2, 37–63.
- Chawla, N.V. Data mining for imbalanced datasets: An overview. In Data Mining and Knowledge Discovery Handbook; Springer: New York, NY, USA, 2005; pp. 853–867.
- Gonzalez-Abril, L.; Nuñez, H.; Angulo, C.; Velasco, F. GSVM: An SVM for handling imbalanced accuracy between classes in bi-classification problems. Appl. Soft Comput. 2014, 17, 23–31.
Sound Class | Sound Recordings (Number) | Sound Recordings (Seconds) | Pattern Recordings (Number) | Pattern Recordings: Pattern Section (Seconds) | Pattern Recordings: Total Recording (Seconds) |
---|---|---|---|---|---|
Ep. cal. mating call | 369 (43%) | 1853 | 4 | 13.89 | 20.39 |
Ep. cal. release call | 63 (7%) | 311 | 3 | 0.99 | 14.56 |
Al. ob. mating call | 419 (48%) | 2096 | 4 | 1.09 | 19.72 |
Al. ob. distress call | 17 (2%) | 83 | 2 | 3.30 | 9.80 |
Silence/Noise | - | - | - | 45.20 | - |
Total | 868 | 4343 | 13 | 64.47 | 64.47 |
Classifier | Training Functions | Test Functions | Additional Functions |
---|---|---|---|
MinDis | - | - | |
MaxLik | fitgmdist | mvnpdf | |
DecTr | fitctree | predict | |
kNN | fitcknn | predict | |
SVM | fitcsvm | predict | |
LogReg | mnrfit | mnrval | |
Neur | feedforwardnet, train | net | |
Discr | fitcdiscr | predict | |
Bayes | fitNaiveBayes | posterior | |
HMM | hmmtrain | hmmdecode | kmeanlbg, disteusq |
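The classifier table lists no library function for MinDis. As an illustration only, a minimum-distance classifier can be sketched as a nearest-centroid rule: each class is represented by the mean of its training feature vectors, and a new vector is assigned to the class with the closest mean. The function names and the choice of Euclidean distance are assumptions made for this example, not a description of the code used in the study.

```python
import math


def train_min_distance(features, labels):
    """Return a dict mapping each class label to its mean feature vector."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def classify_min_distance(centroids, vec):
    """Assign vec to the class whose centroid is nearest (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))
```

In the setting of this paper, each feature vector would be the spectrum representation of one frame (for example, its MFCC values) and each label one of the sound classes.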
Data Class | Classified as Positive | Classified as Negative |
---|---|---|
Positive | TP (true positive) | FN (false negative) |
Negative | FP (false positive) | TN (true negative) |
Metric | Formula | Evaluation Focus |
---|---|---|
Accuracy | $ACC = \frac{TP + TN}{TP + FN + FP + TN}$ | Overall effectiveness of a classifier |
Error rate | $ERR = \frac{FP + FN}{TP + FN + FP + TN}$ | Classification error |
Precision | $PRC = \frac{TP}{TP + FP}$ | Class agreement of the data labels with the positive labels given by the classifier |
Sensitivity | $SNS = \frac{TP}{TP + FN}$ | Effectiveness of a classifier to identify positive labels |
Specificity | $SPC = \frac{TN}{TN + FP}$ | How effectively a classifier identifies negative labels |
ROC | $ROC = \sqrt{\frac{SNS^2 + SPC^2}{2}}$ | Combined metric based on the Receiver Operating Characteristic (ROC) space [53] |
F1 score | $F_1 = \frac{2 \cdot PRC \cdot SNS}{PRC + SNS}$ | Combination of precision (PRC) and sensitivity (SNS) in a single metric |
Geometric Mean | $GM = \sqrt{SNS \cdot SPC}$ | Combination of sensitivity (SNS) and specificity (SPC) in a single metric |
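The metrics in the table above can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions of accuracy, error rate, precision, sensitivity, specificity, F1, and geometric mean. The ROC form, the normalized distance from the origin in (SNS, SPC) space, is an assumption, chosen because it reproduces the ROC values reported in the results tables (for example, SNS = 84.00% and SPC = 96.34% give 90.38%). Function names are hypothetical, and zero denominators are not handled.

```python
import math


def roc_from_rates(sns, spc):
    """Assumed ROC-space metric: sqrt((SNS^2 + SPC^2) / 2)."""
    return math.sqrt((sns ** 2 + spc ** 2) / 2.0)


def classification_metrics(tp, fn, fp, tn):
    """Return the eight metrics as fractions in [0, 1]."""
    total = tp + fn + fp + tn
    prc = tp / (tp + fp)   # precision
    sns = tp / (tp + fn)   # sensitivity (recall)
    spc = tn / (tn + fp)   # specificity
    return {
        "ACC": (tp + tn) / total,
        "ERR": (fp + fn) / total,
        "PRC": prc,
        "SNS": sns,
        "SPC": spc,
        "ROC": roc_from_rates(sns, spc),
        "F1": 2.0 * prc * sns / (prc + sns),
        "GM": math.sqrt(sns * spc),
    }
```

For a problem with more than two classes, the same function can be applied to each class in a one-versus-rest fashion and the per-class values averaged; whether the reported figures were averaged in exactly this way is not stated here.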
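Domain | Function | Option | MPEG-7 | FBE | MFCC-HTK | MFCC-opt |
---|---|---|---|---|---|---|
Time | Pre-emphasis | Coefficient | - | - | 0.97 | - |
Time | Framing | Window | Hamming | Hamming | Hamming | Hamming |
Time | Framing | Frame duration | 30 ms | 30 ms | 25 ms | 20 ms |
Time | Framing | Frame shift | 10 ms | 10 ms | 10 ms | 10 ms |
Frequency | Filter Bank Energy | Low frequency limit | 64 Hz | 64 Hz | 300 Hz | 1000 Hz |
Frequency | Filter Bank Energy | High frequency limit | 16 kHz | 16 kHz | 3700 Hz | 5000 Hz |
Frequency | Filter Bank Energy | Number of filters | - | 18 | 20 | 20 |
Frequency | Filter Bank Energy | Scaling | - | Linear | Mel | Mel |
Quefrency | Cepstrum | Transform | - | - | DCT | DCT |
Quefrency | Cepstrum | Number of coefficients | - | - | 13 | 20 |
Quefrency | Liftering | Coefficient | - | - | 22 | - |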
Window Function | ACC | ERR | PRC | SNS | SPC | ROC | F1 | GM |
---|---|---|---|---|---|---|---|---|
Rectangular | 91.58% | 8.42% | 73.12% | 69.96% | 92.77% | 82.16% | 71.51% | 80.56% |
Hamming | 94.85% | 5.15% | 76.76% | 81.22% | 95.87% | 88.49% | 78.93% | 88.24% |
Filter Bank | ACC | ERR | PRC | SNS | SPC | ROC | F1 | GM |
---|---|---|---|---|---|---|---|---|
Rectangular | 94.56% | 5.44% | 62.57% | 73.03% | 96.08% | 85.34% | 67.40% | 83.77% |
Mel | 94.85% | 5.15% | 76.76% | 81.22% | 95.87% | 88.89% | 78.93% | 88.24% |
Cepstral Transform | ACC | ERR | PRC | SNS | SPC | ROC | F1 | GM |
---|---|---|---|---|---|---|---|---|
DFT | 94.27% | 5.73% | 74.46% | 81.17% | 96.09% | 88.94% | 77.67% | 88.31% |
DCT | 94.85% | 5.15% | 76.76% | 81.22% | 95.87% | 88.89% | 78.93% | 88.24% |
Data Class | Classified as Ep. cal. Mating Call | Classified as Ep. cal. Release Call | Classified as Al. ob. Mating Call | Classified as Al. ob. Distress Call |
---|---|---|---|---|
Ep. cal. mating call | 96.16% | 0.82% | 0.82% | 2.19% |
Ep. cal. release call | 48.33% | 48.33% | 1.67% | 1.67% |
Al. ob. mating call | 2.41% | 0.96% | 95.90% | 0.72% |
Al. ob. distress call | 0% | 0% | 0% | 100% |
Representation (Number of Features) | ACC | ERR | PRC | SNS | SPC | ROC | F1 | GM |
---|---|---|---|---|---|---|---|---|
MPEG-7 (18) | 84.56% | 15.44% | 56.80% | 77.69% | 90.28% | 84.22% | 65.63% | 83.75% |
FBE (18) | 93.69% | 7.31% | 56.78% | 71.05% | 94.00% | 83.32% | 63.12% | 81.72% |
MFCC-HTK (13) | 94.85% | 5.15% | 76.76% | 81.22% | 95.87% | 88.89% | 78.93% | 88.24% |
MFCC-opt (13) | 95.44% | 4.56% | 79.38% | 84.00% | 96.34% | 90.38% | 81.63% | 89.96% |
MFCC-opt (20) | 96.37% | 3.63% | 81.28% | 85.10% | 97.21% | 91.35% | 83.15% | 90.95% |
Stage | ACC | ERR | PRC | SNS | SPC | ROC | F1 | GM |
---|---|---|---|---|---|---|---|---|
MPEG-7 | 84.56% | 15.44% | 56.80% | 77.69% | 90.28% | 84.22% | 65.63% | 83.75% |
FBE | 93.69% | 7.31% | 56.78% | 71.05% | 94.00% | 83.32% | 63.12% | 81.72% |
LogFBE | 94.74% | 5.26% | 88.31% | 73.31% | 95.37% | 85.06% | 80.12% | 83.62% |
MelLogFBE | 94.15% | 5.85% | 81.58% | 82.52% | 95.41% | 89.19% | 82.05% | 88.73% |
MelLogFBE-opt | 92.87% | 7.13% | 78.16% | 86.25% | 94.31% | 90.37% | 82.00% | 90.19% |
MelLogDCT | 94.85% | 5.15% | 79.15% | 81.91% | 95.78% | 89.12% | 80.51% | 88.58% |
MFCC-optTw | 93.10% | 6.90% | 82.31% | 84.44% | 94.02% | 89.36% | 83.36% | 89.10% |
MFCC-opt | 96.37% | 3.63% | 81.78% | 85.09% | 91.17% | 91.33% | 83.40% | 90.93% |
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Luque, A.; Gómez-Bellido, J.; Carrasco, A.; Barbancho, J. Optimal Representation of Anuran Call Spectrum in Environmental Monitoring Systems Using Wireless Sensor Networks. Sensors 2018, 18, 1803. https://doi.org/10.3390/s18061803