Time Signature Detection: A Survey
Abstract
1. Introduction
2. Musical Input Signals
2.1. Audio Samples as Data
2.2. MIDI Signals as Data
3. Datasets
4. Classical Methods
5. Deep Learning Techniques
5.1. Convolutional Neural Networks
5.2. Convolutional-Recurrent Neural Networks
6. Conclusions and Future Pathways
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Benetos, E.; Dixon, S. Polyphonic music transcription using note onset and offset detection. In Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 22–27 May 2011; pp. 37–40. [Google Scholar]
- Benetos, E.; Dixon, S.; Duan, Z.; Ewert, S. Automatic music transcription: An overview. IEEE Signal Process. Mag. 2018, 36, 20–30. [Google Scholar] [CrossRef]
- Tuncer, D. In Music Education, in the Context of Measuring Beats, Anacrusic Examples Prepared with Simple Time Signature. Procedia-Soc. Behav. Sci. 2015, 197, 2403–2406. [Google Scholar] [CrossRef] [Green Version]
- Smith, S.M.; Williams, G.N. A visualization of music. In Proceedings of the Visualization’97 (Cat. No. 97CB36155), Phoenix, AZ, USA, 19–24 October 1997; pp. 499–503. [Google Scholar]
- Kaplan, R. Rhythmic Training for Dancers; ERIC: Champaign, IL, USA, 2002. [Google Scholar]
- Kan, Z.J.; Sourin, A. Generation of Irregular Music Patterns With Deep Learning. In Proceedings of the 2020 International Conference on Cyberworlds (CW), Caen, France, 29 September–1 October 2020; pp. 188–195. [Google Scholar]
- Bottiroli, S.; Rosi, A.; Russo, R.; Vecchi, T.; Cavallini, E. The cognitive effects of listening to background music on older adults: Processing speed improves with upbeat music, while memory seems to benefit from both upbeat and downbeat music. Front. Aging Neurosci. 2014, 6, 284. [Google Scholar] [CrossRef]
- Still, J. How down is a downbeat? Feeling meter and gravity in music and dance. Empir. Musicol. Rev. 2015, 10, 121–134. [Google Scholar] [CrossRef] [Green Version]
- Temperley, D. The Cognition of Basic Musical Structures; MIT Press: Cambridge, MA, USA, 2004. [Google Scholar]
- Attas, R.E.S. Meter as Process in Groove-Based Popular Music. Ph.D. Thesis, University of British Columbia, Vancouver, BC, Canada, 2011. [Google Scholar]
- Goto, M.; Muraoka, Y. A beat tracking system for acoustic signals of music. In Proceedings of the Second ACM International Conference on Multimedia, San Francisco, CA, USA, 15–20 October 1994; pp. 365–372. [Google Scholar]
- Foote, J.; Uchihashi, S. The beat spectrum: A new approach to rhythm analysis. In Proceedings of the IEEE International Conference on Multimedia and Expo, IEEE Computer Society, Tokyo, Japan, 22–25 August 2001; p. 224. [Google Scholar]
- Burger, B.; Thompson, M.R.; Luck, G.; Saarikallio, S.; Toiviainen, P. Music moves us: Beat-related musical features influence regularity of music-induced movement. In Proceedings of the 12th International Conference in Music Perception and Cognition and the 8th Triennial Conference of the European Society for the Cognitive Sciences for Music, Thessaloniki, Greece, 23–28 July 2012; pp. 183–187. [Google Scholar]
- Bahuleyan, H. Music genre classification using machine learning techniques. arXiv 2018, arXiv:1804.01149. [Google Scholar]
- Oramas, S.; Barbieri, F.; Nieto Caballero, O.; Serra, X. Multimodal deep learning for music genre classification. Trans. Int. Soc. Music Inf. Retr. 2018, 1, 4–21. [Google Scholar] [CrossRef]
- Feng, T. Deep Learning for Music Genre Classification. Private Document. 2014. Available online: https://courses.engr.illinois.edu/ece544na/fa2014/Tao_Feng.pdf (accessed on 24 September 2021).
- Kostrzewa, D.; Kaminski, P.; Brzeski, R. Music Genre Classification: Looking for the Perfect Network. In International Conference on Computational Science; Springer: Berlin/Heidelberg, Germany, 2021; pp. 55–67. [Google Scholar]
- Kameoka, H.; Nishimoto, T.; Sagayama, S. Harmonic-temporal structured clustering via deterministic annealing EM algorithm for audio feature extraction. In Proceedings of the ISMIR 2005, 6th International Conference on Music Information Retrieval, London, UK, 11–15 September 2005; pp. 115–122. [Google Scholar]
- Chen, Y.; Jiang, H.; Li, C.; Jia, X.; Ghamisi, P. Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6232–6251. [Google Scholar] [CrossRef] [Green Version]
- Liu, Z.; Wang, Y.; Chen, T. Audio feature extraction and analysis for scene segmentation and classification. J. VLSI Signal Process. Syst. Signal Image Video Technol. 1998, 20, 61–79. [Google Scholar] [CrossRef]
- Mathieu, B.; Essid, S.; Fillon, T.; Prado, J.; Richard, G. YAAFE, an Easy to Use and Efficient Audio Feature Extraction Software. In Proceedings of the 11th International Society for Music Information Retrieval Conference, ISMIR 2010, Utrecht, The Netherlands, 9–13 August 2010; pp. 441–446. [Google Scholar]
- Sharma, G.; Umapathy, K.; Krishnan, S. Trends in audio signal feature extraction methods. Appl. Acoust. 2020, 158, 107020. [Google Scholar] [CrossRef]
- Hsu, C.; Wang, D.; Jang, J.R. A trend estimation algorithm for singing pitch detection in musical recordings. In Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 22–27 May 2011; pp. 393–396. [Google Scholar] [CrossRef]
- Nakamura, E.; Benetos, E.; Yoshii, K.; Dixon, S. Towards Complete Polyphonic Music Transcription: Integrating Multi-Pitch Detection and Rhythm Quantization. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 101–105. [Google Scholar] [CrossRef]
- Degara, N.; Pena, A.; Davies, M.E.P.; Plumbley, M.D. Note onset detection using rhythmic structure. In Proceedings of the 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, TX, USA, 14–19 March 2010; pp. 5526–5529. [Google Scholar] [CrossRef] [Green Version]
- Gui, W.; Xi, S. Onset detection using learned dictionary by K-SVD. In Proceedings of the 2014 IEEE Workshop on Advanced Research and Technology in Industry Applications (WARTIA), Ottawa, ON, Canada, 29–30 September 2014; pp. 406–409. [Google Scholar] [CrossRef]
- Mounir, M.; Karsmakers, P.; Waterschoot, T.V. Annotations Time Shift: A Key Parameter in Evaluating Musical Note Onset Detection Algorithms. In Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 20–23 October 2019; pp. 21–25. [Google Scholar] [CrossRef]
- Alonso, M.; Richard, G.; David, B. Extracting note onsets from musical recordings. In Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, Amsterdam, The Netherlands, 6 July 2005; p. 4. [Google Scholar] [CrossRef] [Green Version]
- Wu, F.H.F.; Jang, J.S.R. A supervised learning method for tempo estimation of musical audio. In Proceedings of the 22nd Mediterranean Conference on Control and Automation, Palermo, Italy, 16–19 June 2014; pp. 599–604. [Google Scholar]
- Downie, J.S.; Byrd, D.; Crawford, T. Ten Years of ISMIR: Reflections on Challenges and Opportunities. In Proceedings of the 10th International Society for Music Information Retrieval Conference, ISMIR 2009, Kobe, Japan, 26–30 October 2009; pp. 13–18. [Google Scholar]
- Muller, M.; Ellis, D.P.; Klapuri, A.; Richard, G. Signal processing for music analysis. IEEE J. Sel. Top. Signal Process. 2011, 5, 1088–1110. [Google Scholar] [CrossRef] [Green Version]
- Klapuri, A. Musical Meter Estimation and Music Transcription. Cambridge Music Processing Colloquium. Citeseer. 2003, pp. 40–45. Available online: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.77.8559&rep=rep1&type=pdf (accessed on 24 September 2021).
- Lartillot, O.; Toiviainen, P. A Matlab toolbox for musical feature extraction from audio. In Proceedings of the International Conference on Digital Audio Effects, Bordeaux, France, 10–15 September 2007; Volume 237, p. 244. [Google Scholar]
- Villanueva-Luna, A.E.; Jaramillo-Nuñez, A.; Sanchez-Lucero, D.; Ortiz-Lima, C.M.; Aguilar-Soto, J.G.; Flores-Gil, A.; May-Alarcon, M. De-Noising Audio Signals Using Matlab Wavelets Toolbox; IntechOpen: Rijeka, Croatia, 2011. [Google Scholar]
- Giannakopoulos, T.; Pikrakis, A. Introduction to Audio Analysis: A MATLAB® Approach; Academic Press: Cambridge, MA, USA, 2014. [Google Scholar]
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
- McFee, B.; Raffel, C.; Liang, D.; Ellis, D.P.; McVicar, M.; Battenberg, E.; Nieto, O. librosa: Audio and music signal analysis in python. In Proceedings of the 14th Python in Science Conference, Austin, TX, USA, 6–12 July 2015; Volume 8, pp. 18–25. [Google Scholar]
- Mallat, S. A Wavelet Tour of Signal Processing; Elsevier: Amsterdam, The Netherlands, 1999. [Google Scholar]
- Cataltepe, Z.; Yaslan, Y.; Sonmez, A. Music genre classification using MIDI and audio features. EURASIP J. Adv. Signal Process. 2007, 2007, 1–8. [Google Scholar] [CrossRef] [Green Version]
- Ozcan, G.; Isikhan, C.; Alpkocak, A. Melody extraction on MIDI music files. In Proceedings of the Seventh IEEE International Symposium on Multimedia (ISM’05), Irvine, CA, USA, 14 December 2005; p. 8. [Google Scholar]
- Klapuri, A.P.; Eronen, A.J.; Astola, J.T. Analysis of the meter of acoustic musical signals. IEEE Trans. Audio Speech Lang. Process. 2005, 14, 342–355. [Google Scholar] [CrossRef] [Green Version]
- Uhle, C.; Herre, J. Estimation of tempo, micro time and time signature from percussive music. In Proceedings of the International Conference on Digital Audio Effects (DAFx), London, UK, 8–11 September 2003. [Google Scholar]
- Jiang, J. Audio processing with channel filtering using DSP techniques. In Proceedings of the 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 8–10 January 2018; pp. 545–550. [Google Scholar]
- Foote, J. Visualizing music and audio using self-similarity. In Proceedings of the Seventh ACM International Conference on Multimedia (Part 1), Orlando, FL, USA, 30 October–5 November 1999; pp. 77–80. [Google Scholar]
- Saito, S.; Kameoka, H.; Takahashi, K.; Nishimoto, T.; Sagayama, S. Specmurt analysis of polyphonic music signals. IEEE Trans. Audio Speech Lang. Process. 2008, 16, 639–650. [Google Scholar] [CrossRef] [Green Version]
- Grohganz, H.; Clausen, M.; Müller, M. Estimating Musical Time Information from Performed MIDI Files. In Proceedings of the International Conference on Music Information Retrieval (ISMIR), Taipei, Taiwan, 27–31 October 2014; pp. 35–40. [Google Scholar]
- Roig, C.; Tardón, L.J.; Barbancho, I.; Barbancho, A.M. Automatic melody composition based on a probabilistic model of music style and harmonic rules. Knowl.-Based Syst. 2014, 71, 419–434. [Google Scholar] [CrossRef]
- Akujuobi, U.; Zhang, X. Delve: A dataset-driven scholarly search and analysis system. ACM SIGKDD Explor. Newsl. 2017, 19, 36–46. [Google Scholar] [CrossRef]
- Goto, M.; Hashiguchi, H.; Nishimura, T.; Oka, R. RWC Music Database: Popular, Classical and Jazz Music Databases. ISMIR 2002, 2, 287–288. Available online: https://staff.aist.go.jp/m.goto/RWC-MDB/ (accessed on 24 September 2021).
- Turnbull, D.; Barrington, L.; Torres, D.; Lanckriet, G. Semantic annotation and retrieval of music and sound effects. IEEE Trans. Audio Speech Lang. Process. 2008, 16, 467–476. Available online: http://slam.iis.sinica.edu.tw/demo/CAL500exp (accessed on 24 September 2021). [CrossRef] [Green Version]
- Tzanetakis, G.; Cook, P. Musical genre classification of audio signals. IEEE Trans. Speech Audio Process. 2002, 10, 293–302. Available online: https://www.kaggle.com/andradaolteanu/gtzan-dataset-music-genre-classification (accessed on 24 September 2021). [CrossRef]
- Turnbull, D.; Barrington, L.; Torres, D.; Lanckriet, G. Exploring the Semantic Annotation and Retrieval of Sound. CAL Technical Report CAL-2007-01. San Diego, CA, USA, 2007. Available online: https://www.ee.columbia.edu/~dpwe/research/musicsim/uspop2002.html (accessed on 24 September 2021).
- Tingle, D.; Kim, Y.E.; Turnbull, D. Exploring automatic music annotation with “acoustically-objective” tags. In Proceedings of the International Conference on Multimedia Information Retrieval, Philadelphia, PA, USA, 29–31 March 2010; pp. 55–62. Available online: http://calab1.ucsd.edu/~datasets/ (accessed on 24 September 2021).
- Law, E.; West, K.; Mandel, M.I.; Bay, M.; Downie, J.S. Evaluation of algorithms using games: The case of music tagging. In Proceedings of the 10th International Society for Music Information Retrieval Conference, ISMIR 2009, Kobe, Japan, 26–30 October 2009; pp. 387–392. Available online: https://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset (accessed on 24 September 2021).
- Defferrard, M.; Benzi, K.; Vandergheynst, P.; Bresson, X. Fma: A dataset for music analysis. arXiv 2016, arXiv:1612.01840. [Google Scholar]
- Schedl, M.; Orio, N.; Liem, C.C.; Peeters, G. A professionally annotated and enriched multimodal data set on popular music. In Proceedings of the 4th ACM Multimedia Systems Conference, Oslo, Norway, 28 February–1 March 2013; pp. 78–83. Available online: http://www.cp.jku.at/datasets/musiclef/index.html (accessed on 24 September 2021).
- Bertin-Mahieux, T.; Ellis, D.P.; Whitman, B.; Lamere, P. The Million Song Dataset. 2011. Available online: http://millionsongdataset.com/ (accessed on 24 September 2021).
- Panagakis, I.; Benetos, E.; Kotropoulos, C. Music genre classification: A multilinear approach. In Proceedings of the International Symposium on Music Information Retrieval, Philadelphia, PA, USA, 14–18 September 2008; pp. 583–588. [Google Scholar]
- Benetos, E.; Kotropoulos, C. A tensor-based approach for automatic music genre classification. In Proceedings of the 2008 16th European Signal Processing Conference, Lausanne, Switzerland, 25–29 August 2008; pp. 1–4. [Google Scholar]
- Chang, K.K.; Jang, J.S.R.; Iliopoulos, C.S. Music Genre Classification via Compressive Sampling. In Proceedings of the 11th International Society for Music Information Retrieval Conference, ISMIR 2010, Utrecht, The Netherlands, 9–13 August 2010; pp. 387–392. [Google Scholar]
- Chathuranga, Y.; Jayaratne, K. Automatic music genre classification of audio signals with machine learning approaches. GSTF J. Comput. (JoC) 2014, 3, 14. [Google Scholar] [CrossRef]
- Zhang, W.; Lei, W.; Xu, X.; Xing, X. Improved Music Genre Classification with Convolutional Neural Networks. In Proceedings of the Interspeech, San Francisco, CA, USA, 8–12 September 2016; pp. 3304–3308. [Google Scholar]
- Kotsiantis, S.; Kanellopoulos, D.; Pintelas, P. Handling imbalanced datasets: A review. GESTS Int. Trans. Comput. Sci. Eng. 2006, 30, 25–36. [Google Scholar]
- Seiffert, C.; Khoshgoftaar, T.M.; Van Hulse, J.; Napolitano, A. RUSBoost: Improving classification performance when training data is skewed. In Proceedings of the 2008 19th International Conference on Pattern Recognition, Tampa, FL, USA, 8–11 December 2008; pp. 1–4. [Google Scholar]
- López, V.; Fernández, A.; Herrera, F. On the importance of the validation technique for classification with imbalanced datasets: Addressing covariate shift when data is skewed. Inf. Sci. 2014, 257, 1–13. [Google Scholar] [CrossRef]
- Harte, C. Towards Automatic Extraction of Harmony Information from Music Signals. Ph.D. Thesis, Queen Mary University of London, London, UK, 2010. [Google Scholar]
- Ellis, K.; Coviello, E.; Lanckriet, G.R. Semantic Annotation and Retrieval of Music using a Bag of Systems Representation. In Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011, Miami, FL, USA, 24–28 October 2011; pp. 723–728. [Google Scholar]
- Andersen, J.S. Using the Echo Nest’s automatically extracted music features for a musicological purpose. In Proceedings of the 2014 4th International Workshop on Cognitive Information Processing (CIP), Copenhagen, Denmark, 26–28 May 2014; pp. 1–6. [Google Scholar]
- Pons, J.; Nieto, O.; Prockup, M.; Schmidt, E.; Ehmann, A.; Serra, X. End-to-end learning for music audio tagging at scale. arXiv 2017, arXiv:1711.02520. [Google Scholar]
- Ycart, A.; Benetos, E. A-MAPS: Augmented MAPS dataset with rhythm and key annotations. In Proceedings of the 19th International Society for Music Information Retrieval Conference Late-Breaking Demos Session, Electronic Engineering and Computer Science, Paris, France, 23–27 September 2018. [Google Scholar]
- Lenssen, N. Applications of Fourier Analysis to Audio Signal Processing: An Investigation of Chord Detection Algorithms; CMC Senior Theses, Paper 704; Claremont McKenna College: Claremont, CA, USA, 2013; Available online: https://scholarship.claremont.edu/cmc_theses/704/ (accessed on 24 September 2021).
- Gouyon, F.; Herrera, P. Determination of the Meter of Musical Audio Signals: Seeking Recurrences in Beat Segment Descriptors. Audio Engineering Society Convention 114. Audio Engineering Society. 2003. Available online: https://www.aes.org/e-lib/online/browse.cfm?elib=12583 (accessed on 24 September 2021).
- Pikrakis, A.; Antonopoulos, I.; Theodoridis, S. Music meter and tempo tracking from raw polyphonic audio. In Proceedings of the ISMIR 2004, 5th International Conference on Music Information Retrieval, Barcelona, Spain, 10–14 October 2004. [Google Scholar]
- Coyle, E.; Gainza, M. Time Signature Detection by Using a Multi-Resolution Audio Similarity Matrix. Audio Engineering Society Convention 122. Audio Engineering Society. 2007. Available online: https://www.aes.org/e-lib/online/browse.cfm?elib=14139 (accessed on 24 September 2021).
- Holzapfel, A.; Stylianou, Y. Rhythmic Similarity in Traditional Turkish Music. In Proceedings of the ISMIR—International Conference on Music Information Retrieval, Kobe, Japan, 26–30 October 2009; pp. 99–104. [Google Scholar]
- Gainza, M. Automatic musical meter detection. In Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, Taipei, Taiwan, 19–24 April 2009; pp. 329–332. [Google Scholar]
- Gulati, S.; Rao, V.; Rao, P. Meter detection from audio for Indian music. In Speech, Sound and Music Processing: Embracing Research in India; Springer: Berlin/Heidelberg, Germany, 2011; pp. 34–43. [Google Scholar]
- Varewyck, M.; Martens, J.P.; Leman, M. Musical meter classification with beat synchronous acoustic features, DFT-based metrical features and support vector machines. J. New Music Res. 2013, 42, 267–282. [Google Scholar] [CrossRef]
- Cano, E.; Mora-Ángel, F.; Gil, G.A.L.; Zapata, J.R.; Escamilla, A.; Alzate, J.F.; Betancur, M. Sesquialtera in the Colombian bambuco: Perception and estimation of beat and meter. Proc. Int. Soc. Music Inf. Retr. Conf. 2020, 2020, 409–415. [Google Scholar]
- Lee, K. The Role of the 12/8 Time Signature in JS Bach’s Sacred Vocal Music; University of Pittsburgh: Pittsburgh, PA, USA, 2005. [Google Scholar]
- Panwar, S.; Das, A.; Roopaei, M.; Rad, P. A deep learning approach for mapping music genres. In Proceedings of the 2017 12th System of Systems Engineering Conference (SoSE), Waikoloa, HI, USA, 18–21 June 2017; pp. 1–5. [Google Scholar]
- Schoukens, J.; Pintelon, R.; Van Hamme, H. The interpolated fast Fourier transform: A comparative study. IEEE Trans. Instrum. Meas. 1992, 41, 226–232. [Google Scholar] [CrossRef]
- Chen, K.F.; Mei, S.L. Composite interpolated fast Fourier transform with the Hanning window. IEEE Trans. Instrum. Meas. 2010, 59, 1571–1579. [Google Scholar] [CrossRef]
- McLeod, A.; Steedman, M. Meter Detection and Alignment of MIDI Performance. In Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR 2018, Paris, France, 23–27 September 2018; pp. 113–119. Available online: http://ismir2018.ircam.fr/doc/pdfs/136_Paper.pdf (accessed on 24 September 2021).
- McLeod, A.; Steedman, M. Meter Detection From Music Data. In DMRN+ 11: Digital Music Research Network One-Day Workshop 2016; Utkal University: Bhubaneswar, India, 2016. [Google Scholar]
- De Haas, W.B.; Volk, A. Meter detection in symbolic music using inner metric analysis. In Proceedings of the International Society for Music Information Retrieval Conference, New York, NY, USA, 7–11 August 2016; p. 441. [Google Scholar]
- Liu, J.; Sun, S.; Liu, W. One-step persymmetric GLRT for subspace signals. IEEE Trans. Signal Process. 2019, 67, 3639–3648. [Google Scholar] [CrossRef]
- Brown, J.C. Determination of the meter of musical scores by autocorrelation. J. Acoust. Soc. Am. 1993, 94, 1953–1957. [Google Scholar] [CrossRef]
- Hua, X.; Ono, Y.; Peng, L.; Cheng, Y.; Wang, H. Target detection within nonhomogeneous clutter via total bregman divergence-based matrix information geometry detectors. IEEE Trans. Signal Process. 2021, 69, 4326–4340. [Google Scholar] [CrossRef]
- Wu, Y.; Ianakiev, K.; Govindaraju, V. Improved k-nearest neighbor classification. Pattern Recognit. 2002, 35, 2311–2318. [Google Scholar] [CrossRef]
- Lai, J.Z.; Huang, T.J. An agglomerative clustering algorithm using a dynamic k-nearest-neighbor list. Inf. Sci. 2011, 181, 1722–1734. [Google Scholar] [CrossRef]
- Dong, W.; Moses, C.; Li, K. Efficient k-nearest neighbor graph construction for generic similarity measures. In Proceedings of the 20th International Conference on World Wide Web, Hyderabad, India, 28 March–1 April 2011; pp. 577–586. [Google Scholar]
- Zhu, W.; Sun, W.; Romagnoli, J. Adaptive k-nearest-neighbor method for process monitoring. Ind. Eng. Chem. Res. 2018, 57, 2574–2586. [Google Scholar] [CrossRef]
- West, K.; Cox, S. Finding An Optimal Segmentation for Audio Genre Classification. In Proceedings of the ISMIR 2005, 6th International Conference on Music Information Retrieval, London, UK, 11–15 September 2005; pp. 680–685. [Google Scholar]
- Roopaei, M.; Rad, P.; Jamshidi, M. Deep learning control for complex and large scale cloud systems. Intell. Autom. Soft Comput. 2017, 23, 389–391. [Google Scholar] [CrossRef]
- Li, T.L.; Chan, A.B.; Chun, A. Automatic musical pattern feature extraction using convolutional neural network. In Proceedings of the International MultiConference of Engineers and Computer Scientists 2010 (IMECS 2010), Hong Kong, China, 17–19 September 2010; pp. 546–550. [Google Scholar]
- Polishetty, R.; Roopaei, M.; Rad, P. A next-generation secure cloud-based deep learning license plate recognition for smart cities. In Proceedings of the 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), Anaheim, CA, USA, 18–20 December 2016; pp. 286–293. [Google Scholar]
- Lai, S.; Xu, L.; Liu, K.; Zhao, J. Recurrent convolutional neural networks for text classification. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015. [Google Scholar]
- Dai, J.; Liang, S.; Xue, W.; Ni, C.; Liu, W. Long short-term memory recurrent neural network based segment features for music genre classification. In Proceedings of the 2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP), Tianjin, China, 17–20 October 2016; pp. 1–5. [Google Scholar]
- Feng, L.; Liu, S.; Yao, J. Music genre classification with paralleling recurrent convolutional neural network. arXiv 2017, arXiv:1712.08370. [Google Scholar]
- Jia, B.; Lv, J.; Liu, D. Deep learning-based automatic downbeat tracking: A brief review. Multimed. Syst. 2019, 25, 617–638. [Google Scholar] [CrossRef] [Green Version]
- Pereira, R.M.; Costa, Y.M.; Aguiar, R.L.; Britto, A.S.; Oliveira, L.E.; Silla, C.N. Representation learning vs. handcrafted features for music genre classification. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8. [Google Scholar]
- Dieleman, S.; Brakel, P.; Schrauwen, B. Audio-based music classification with a pretrained convolutional network. In Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR-2011), Miami, FL, USA, 24–28 October 2011; pp. 669–674. [Google Scholar]
- Durand, S.; Essid, S. Downbeat Detection with Conditional Random Fields and Deep Learned Features. In Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR, New York, NY, USA, 7–11 August 2016; pp. 386–392. [Google Scholar]
- Böck, S.; Davies, M.E.; Knees, P. Multi-Task Learning of Tempo and Beat: Learning One to Improve the Other. In Proceedings of the 20th ISMIR Conference, Delft, The Netherlands, 4–8 November 2019; pp. 486–493. [Google Scholar]
- Purwins, H.; Li, B.; Virtanen, T.; Schlüter, J.; Chang, S.Y.; Sainath, T. Deep learning for audio signal processing. IEEE J. Sel. Top. Signal Process. 2019, 13, 206–219. [Google Scholar] [CrossRef] [Green Version]
- Fuentes, M.; Mcfee, B.; Crayencour, H.C.; Essid, S.; Bello, J.P. A music structure informed downbeat tracking system using skip-chain conditional random fields and deep learning. In Proceedings of the ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 481–485. [Google Scholar]
- Rajan, R.; Raju, A.A. Deep neural network based poetic meter classification using musical texture feature fusion. In Proceedings of the 2019 27th European Signal Processing Conference (EUSIPCO), A Coruna, Spain, 2–6 September 2019; pp. 1–5. [Google Scholar]
- Zhang, Y.; Yang, Q. A survey on multi-task learning. arXiv 2017, arXiv:1707.08114. [Google Scholar]
- Sener, O.; Koltun, V. Multi-task learning as multi-objective optimization. arXiv 2018, arXiv:1810.04650. [Google Scholar]
- Burges, C.J.; Platt, J.C.; Jana, S. Extracting noise-robust features from audio data. In Proceedings of the 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL, USA, 13–17 May 2002; Volume 1, pp. 1–1021. [Google Scholar]
- Das, S.; Bäckström, T. Postfiltering Using Log-Magnitude Spectrum for Speech and Audio Coding. In Proceedings of the Interspeech, Hyderabad, India, 2–6 September 2018; pp. 3543–3547. [Google Scholar]
- Srinivasamurthy, A.; Holzapfel, A.; Cemgil, A.T.; Serra, X. Particle filters for efficient meter tracking with dynamic Bayesian networks. In Proceedings of the 16th International Society for Music Information Retrieval Conference, ISMIR 2015, Málaga, Spain, 26–30 October 2015; Müller, M., Wiering, F., Eds.; Available online: https://repositori.upf.edu/handle/10230/34998 (accessed on 24 September 2021).
- Humphrey, E.J.; Bello, J.P.; LeCun, Y. Moving beyond feature design: Deep architectures and automatic feature learning in music informatics. In Proceedings of the 13th International Society for Music Information Retrieval Conference, ISMIR 2012, Porto, Portugal, 8–12 October 2012; pp. 403–408. [Google Scholar]
- Fuentes, M.; McFee, B.; Crayencour, H.; Essid, S.; Bello, J. Analysis of common design choices in deep learning systems for downbeat tracking. In Proceedings of the 19th International Society for Music Information Retrieval Conference, Paris, France, 23–27 September 2018. [Google Scholar]
Criteria | MIDI | Digital Audio |
---|---|---|
Definition | A MIDI file stores symbolic performance instructions (notes, timing, velocity) rather than recorded sound. | Digital audio is a sampled representation of a sound waveform, used for reproduction and transmission. |
Pros | Files are very small and easy to store, and the symbolic data does not degrade when copied. | Reproduces the exact recorded sound at high quality. |
Cons | Playback depends on the rendering synthesizer, so the result can vary from the original sound. | Files grow large with recording length and are easily corrupted by manipulation. |
Format Type | Compact symbolic event encoding. | Compressed or uncompressed, depending on the format (e.g., MP3 vs. WAV). |
Information Data | Contains no audio samples, only event data. | Contains the recorded audio samples. |
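The practical consequence of this distinction is easiest to see in code. The short sketch below (our illustration, not drawn from any surveyed system) reads the explicit time_signature meta event that a MIDI file carries, using the mido package, and loads a digital audio file with librosa [56], which yields only raw samples from which the meter must be inferred; the file names song.mid and song.wav are placeholders.

```python
import mido     # symbolic MIDI parsing
import librosa  # audio loading and analysis

# MIDI: the notated meter is stored explicitly as a meta event.
mid = mido.MidiFile("song.mid")
for track in mid.tracks:
    for msg in track:
        if msg.type == "time_signature":
            print(f"notated meter: {msg.numerator}/{msg.denominator}")

# Digital audio: only a sample stream is available; any meter
# information has to be estimated from the signal itself.
y, sr = librosa.load("song.wav", sr=22050, mono=True)
print(f"{len(y)} samples at {sr} Hz ({len(y) / sr:.1f} s)")
```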
Dataset Name | Year Created | Number of Samples | Data Type |
---|---|---|---|
RWC [49] | 2002 | 365 | Audio |
CAL500 [50] | 2008 | 502 | Audio |
GTZAN [51] | 2002 | 1000 | Audio |
USPOP [52] | 2002 | 8752 | Audio |
Swat10K [53] | 2010 | 10,870 | Audio |
MagnaTagATune [54] | 2009 | 25,863 | Audio |
FMA [55] | 2016 | 106,574 | Audio |
MusicCLEF [56] | 2012 | 200,000 | Audio |
MSD [57] | 2011 | 1,000,000 | Audio features |
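Whichever corpus is used, audio datasets are typically consumed as fixed-length excerpts converted to a time-frequency representation before they reach the classifiers discussed in Sections 4 and 5. A minimal sketch with librosa [56], assuming a GTZAN-style 30 s clip at a placeholder path and common parameter values rather than any surveyed paper's exact setup:

```python
import numpy as np
import librosa

# Load a 30 s excerpt and convert it to a log-mel spectrogram, the
# input representation most CNN-based systems start from.
y, sr = librosa.load("gtzan/blues.00000.wav", sr=22050, mono=True)
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=2048,
                                     hop_length=512, n_mels=128)
log_mel = librosa.power_to_db(mel, ref=np.max)
print(log_mel.shape)  # (n_mels, frames): roughly (128, 1292) for 30 s
```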
Year | Method | Dataset | Data | Accuracy (%) |
---|---|---|---|---|
2003 | SVM [72] | Self-generated | Audio | 83 |
2003 | ACF [42] | Excerpts of percussive music | Audio | 73.5 |
2004 | SSM [73] | Greek music samples | Audio | 95.5 |
2007 | ASM [74] | Commercial CD recordings | Audio | 75 |
2009 | ACF, OSS [75] | Usul | MIDI | 77.8 |
2009 | BSSM, ASM [76] | Generated samples | Audio | 95 |
2011 | Comb Filter [77] | Indian Music DB | Audio | 88.7 |
2013 | SVM [78] | Generated samples | Audio | 90 |
2014 | RSSM [47] | MIDI keyboard scores | MIDI | 93 |
2020 | Annotation Workflow [79] | ACMUS-MIR | Audio | 75.06 |
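Several of the classical systems in this table rest on autocorrelation of an accent or onset-strength signal, following Brown's observation that periodicity peaks at integer multiples of the beat lag reveal the bar length. The sketch below illustrates that idea with librosa [56]; comparing the autocorrelation at three versus four beat lags is our deliberately simplified decision rule, not the procedure of any cited system.

```python
import numpy as np
import librosa

def guess_beats_per_bar(path: str, hop_length: int = 512) -> int:
    """Toy ACF meter cue: is periodicity stronger at 3 or 4 beat lags?"""
    y, sr = librosa.load(path, sr=22050, mono=True)
    env = librosa.onset.onset_strength(y=y, sr=sr, hop_length=hop_length)

    # Beat period in envelope frames, derived from the estimated tempo (BPM).
    tempo, _ = librosa.beat.beat_track(onset_envelope=env, sr=sr,
                                       hop_length=hop_length)
    beat_lag = int(round(60.0 / float(tempo) * sr / hop_length))

    # Normalized autocorrelation of the zero-mean onset envelope.
    env = env - env.mean()
    acf = np.correlate(env, env, mode="full")[env.size - 1:]
    acf = acf / (acf[0] + 1e-12)
    if 4 * beat_lag >= acf.size:
        raise ValueError("clip too short for a four-beat lag")

    # A stronger peak at three beat lags suggests triple meter (e.g., 3/4);
    # at four beat lags, duple or quadruple meter (e.g., 4/4).
    return 3 if acf[3 * beat_lag] > acf[4 * beat_lag] else 4
```

The systems reported in the table refine this bare idea with multiple accent features, tempo-robust periodicity measures, and learned classifiers such as SVMs.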
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).