Mining Characteristic Patterns for Comparative Music Corpus Analysis
Abstract
:1. Introduction
2. Descriptive Patterns
2.1. Class Association Patterns
2.2. Pattern Interestingness
- –
- Typicality: The measure of typicality evaluates relative pattern frequency in the target class, i.e., the proportion of examples in class C that are covered by pattern X. The measure was originally proposed to quantify disjunctive patterns which together completely cover a class [30], but is here applied to evaluate individual characteristic patterns.
- –
- Utility: The utility measure was proposed in order to rank as more interesting those patterns which are more specifically related to the target class than to other classes [22]. The measure returns a positive value when pattern and class cannot be considered statistically independent, i.e., when . Utility is weighted by pattern frequency (scaled by a), favoring more general patterns with higher relative frequency (for ).
- –
- Novelty: Similarly to utility, novelty (applied to characteristic patterns in [31]) is based on comparing the joint and individual probabilities of a pattern and a class: it returns a positive value when the co-occurrence of pattern and class is more frequent, measured by , than expected given the individual probabilities and . Novelty can be rewritten as weighted relative accuracy, , combining generality of the pattern with added value of the class probability given the pattern [32].
- –
- Laplace estimate: The Laplace estimate has been used as an alternative to novelty for assessing characteristic patterns [31]. The measure quantifies the pattern’s frequency in the target class relative to its frequency in the complete dataset. The value of the Laplace estimate is maximal if all examples covered by the pattern are examples of the target class. Due to the constant summands (1 in the numerator and 2 in the denominator), the measure implicitly penalizes less frequent patterns with smaller and [33].
- –
- Relative risk: The measure of relative risk has been employed to select among typical patterns those patterns which are more predictive of the target class [21]. Relative risk assesses the conditional probability of the class C given the pattern X against the class probability in the case of the pattern being absent.
- –
- IC: The IC measure [34] not only considers the pattern’s absence in the target class, but also in the background, with observed in the target class rather than background suggesting C to be less plausible. Different from utility and novelty, the IC measure, increasing with higher , is biased towards frequent classes rather than frequent patterns.
- –
- F1 score: As a further measure, we borrow the F1 score from predictive data mining, which is designed to balance recall and precision by calculating their harmonic mean. While recall (corresponding to typicality) measures the relative frequency of the pattern in the target class irrespective of the pattern’s occurrence in other classes, precision quantifies the proportion of pattern occurrences which are observed in the target class rather than other classes.
2.3. Analysis Criteria
3. Data
3.1. Datasets
3.2. Music Content Features
4. Results
4.1. Distribution of Completeness and Discriminant Power for Different Interestingness Measures
4.2. Example Patterns
4.2.1. Case Study 1: Tribes
4.2.2. Case Study 2: Song Types
5. Discussion and Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Serra, X. Creating research corpora for the computational study of music: The case of the CompMusic project. In Proceedings of the 53rd AES International Conference on Semantic Audio, London, UK, 27–29 January 2014; pp. 1–9. [Google Scholar]
- Kroher, N.; Díaz-Báñez, J.M.; Mora, J.; Gómez, E. Corpus COFLA: A research corpus for the computational study of flamenco music. J. Comput. Cult. Herit. 2016, 9, 10. [Google Scholar] [CrossRef]
- van Kranenburg, P.; de Bruin, M.; Volk, A. Documenting a song culture: The Dutch Song Database as a resource for musicological research. Int. J. Digit. Libr. 2017, 20, 13–23. [Google Scholar] [CrossRef] [Green Version]
- Cook, N. Computational and comparative musicology. In Empirical Musicology: Aims, Methods, Prospects; Clarke, E., Cook, N., Eds.; Oxford University Press: Oxford, UK, 2004; pp. 103–126. [Google Scholar]
- Savage, P.E.; Brown, S. Toward a new comparative musicology. Anal. Approach. World Music 2013, 2, 148–197. [Google Scholar]
- Taminau, J.; Hillewaere, R.; Meganck, S.; Conklin, D.; Nowé, A.; Manderick, B. Descriptive subgroup mining of folk music. In Proceedings of the 2nd International Workshop on Machine Learning and Music at ECML/PKDD 2009 (MML 2009), Bled, Slovenia, 7–11 September 2009. [Google Scholar]
- Taminau, J.; Hillewaere, R.; Meganck, S.; Conklin, D.; Nowé, A.; Manderick, B. Applying subgroup discovery for the analysis of string quartet movements. In Proceedings of the 3rd International Workshop on Music and Machine Learning at ACM Multimedia (MML 2010), Florence, Italy, 29 October 2010; pp. 29–32. [Google Scholar]
- Conklin, D. Discovery of distinctive patterns in music. Intell. Data Anal. 2010, 14, 547–554. [Google Scholar] [CrossRef] [Green Version]
- Conklin, D.; Anagnostopoulou, C. Comparative pattern analysis of Cretan folk songs. J. New Music Res. 2011, 40, 119–125. [Google Scholar] [CrossRef]
- Anagnostopoulou, C.; Giraud, M.; Poulakis, N. Melodic contour representations in the analysis of children’s songs. In Proceedings of the 3rd International Workshop on Folk Music Analysis (FMA 2013), Amsterdam, The Netherlands, 6–7 June 2013; pp. 40–42. [Google Scholar]
- Shanahan, D.; Neubarth, K.; Conklin, D. Mining musical traits of social functions in Native American music. In Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR 2016), New York, NY, USA, 7–11 August 2016; pp. 681–687. [Google Scholar]
- Neubarth, K.; Shanahan, D.; Conklin, D. Supervised descriptive pattern discovery in Native American music. J. New Music Res. 2018, 47, 1–16. [Google Scholar] [CrossRef]
- Backer, E.; van Kranenburg, P. On musical stylometry—A pattern recognition approach. Pattern Recognit. Lett. 2005, 26, 299–309. [Google Scholar] [CrossRef]
- Cuthbert, M.S.; Ariza, C.; Friedland, L. Feature extraction and machine learning on symbolic music using the music21 toolkit. In Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR 2011), Miami, FL, USA, 24–28 October 2011; pp. 387–392. [Google Scholar]
- Herremans, D.; Martens, D.; Sörensen, K. Dance hit song prediction. J. New Music Res. 2014, 43, 291–302. [Google Scholar] [CrossRef]
- Martins, M.L.; Silla, C.N., Jr. Irish traditional ethnomusicology analysis using decision trees and high level symbolic features. In Proceedings of the 12th Sound and Music Computing Conference (SMC 2015), Maynooth, Ireland, 25 July–1 August 2015. [Google Scholar]
- Herremans, D.; Martens, D.; Sörensen, K. Composer classification models for music theory building. In Computational Music Analysis; Meredith, D., Ed.; Springer: Cham, Switzerland, 2016; pp. 369–392. [Google Scholar]
- van Kranenburg, P.; Maessen, G. Comparing offertory melodies of five medieval Christian chant traditions. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017), Suzhou, China, 23–27 October 2017; pp. 204–210. [Google Scholar]
- Dong, G.; Li, J. Efficient mining of emerging patterns: Discovering trends and differences. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-99), San Diego, CA, USA, 15–18 August 1999; pp. 43–52. [Google Scholar]
- Han, J.; Fu, Y.; Wang, W.; Koperski, K.; Zaiane, O. DMQL: A data mining query language for relational databases. In Proceedings of the ACM SIGMOD Workshop on Data Mining and Knowledge Discovery (DMKD-96), Montreal, QC, Canada, 2 June 1996; pp. 27–33. [Google Scholar]
- Ali, K.; Manganaris, S.; Srikant, R. Partial classification using association rules. In Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining (KDD-97), Newport Beach, CA, USA, 14–17 August 1997; pp. 115–118. [Google Scholar]
- Maeda, A.; Maki, H.; Akimori, H. Characteristic rule induction algorithm for data mining. In Proceedings of the 2nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-98), Melbourne, Australia, 15–17 April 1998; pp. 399–400. [Google Scholar]
- Densmore, F. Menominee Music; Bulletin 102; Smithsonian Institution, Bureau of American Ethnology: Washington, DC, USA, 1932.
- Geng, L.; Hamilton, H.J. Interestingness measures for data mining: A survey. ACM Comput. Surv. 2006, 38, 1–32. [Google Scholar] [CrossRef]
- Novak, P.K.; Lavrač, N.; Webb, G.I. Supervised descriptive rule induction. In Encyclopedia of Machine Learning; Sammut, C., Webb, G.I., Eds.; Springer: Berlin, Germany, 2010; pp. 938–941. [Google Scholar]
- Atzmüller, M. Subgroup discovery—Advanced review. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2015, 5, 35–49. [Google Scholar]
- Neubarth, K.; Conklin, D. Contrast pattern mining in folk music analysis. In Computational Music Analysis; Meredith, D., Ed.; Springer: Cham, Switzerland, 2016; pp. 393–424. [Google Scholar]
- Tan, P.N.; Kumar, V.; Srivastava, J. Selecting the right objective measure for association analysis. Inf. Syst. 2004, 29, 293–313. [Google Scholar] [CrossRef]
- Belohlavek, R.; Grissa, D.; Guillaume, S.; Nguifo, E.M.; Outrata, J. Boolean factors as a means of clustering of interestingness measures of association rules. In Proceedings of the 8th International Conference on Concept Lattices and their Applications (CLA 2011), Nancy, France, 17–20 October 2011; pp. 207–222. [Google Scholar]
- Han, J.; Cai, Y.; Cercone, N. Data-driven discovery of quantitative rules in relational databases. IEEE Trans. Knowl. Data Eng. 1993, 5, 29–40. [Google Scholar] [CrossRef] [Green Version]
- Turmeaux, T.; Salleb, A.; Vrain, C.; Cassard, D. Learning characteristic rules relying on quantified paths. In Proceedings of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2003), Cavtat-Dubrovnik, Croatia, 22–26 September 2003; pp. 471–482. [Google Scholar]
- Lavrač, N.; Flach, P.; Zupan, B. Rule evaluation measures: A unifying view. In Proceedings of the 9th International Workshop on Inductive Logic Programming (ILP-99), Bled, Slovenia, 24–27 June 1999; pp. 174–185. [Google Scholar]
- Fürnkranz, J. Separate-and-conquer rule learning. Artif. Intell. Rev. 1999, 13, 3–54. [Google Scholar] [CrossRef]
- Kamber, M.; Shinghal, R. Evaluating the interestingness of characteristic rules. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96), Portland, OR, USA, 2–4 August 1996; pp. 263–266. [Google Scholar]
- Brijs, T.; Vanhoof, K.; Wets, G. Reducing redundancy in characteristic rule discovery by using integer programming techniques. Intell. Data Anal. 2000, 4, 229–240. [Google Scholar] [CrossRef]
- Nettl, B. North American Indian musical styles. J. Am. Folk. 1954, 67, 44–56, 297–307, 351–368. [Google Scholar] [CrossRef] [Green Version]
- Densmore, F. Chippewa Music II; Bulletin 53; Smithsonian Institution, Bureau of American Ethnology: Washington, DC, USA, 1913.
- Densmore, F. Chippewa Music; Bulletin 45; Smithsonian Institution, Bureau of American Ethnology: Washington, DC, USA, 1910.
- Densmore, F. Teton Sioux Music; Bulletin 61; Smithsonian Institution, Bureau of American Ethnology: Washington, DC, USA, 1918.
- Densmore, F. Pawnee Music; Bulletin 93; Smithsonian Institution, Bureau of American Ethnology: Washington, DC, USA, 1929.
- Densmore, F. Papago Music; Bulletin 90; Smithsonian Institution, Bureau of American Ethnology: Washington, DC, USA, 1929.
- Kavšek, B.; Lavrač, N. APRIORI-SD: Adapting association rule learning to subgroup discovery. Appl. Artif. Intell. 2006, 20, 543–583. [Google Scholar] [CrossRef]
- Lavrač, N.; Gamberger, D. Relevancy in constraint-based subgroup discovery. In Constraint-Based Mining and Inductive Databases; Boulicaut, J.F., Raedt, L.D., Mannila, H., Eds.; Springer: Berlin/Heidelberg, Germany, 2005; pp. 243–266. [Google Scholar]
- Sulzmann, J.N.; Fürnkranz, J. A comparison of techniques for selecting and combining class association rules. In Proceedings of the International Workshop From Local Patterns to Global Models at ECML/PKDD 2008 (LeGo-08), Antwerp, Belgium, 14–18 September 2008. [Google Scholar]
- Webb, G.I.; Butler, S.; Newlands, D. On detecting differences between groups. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003), Washington, DC, USA, 24–27 August 2003; pp. 256–265. [Google Scholar]
- Novak, P.K.; Lavrač, N.; Gamberger, D.; Krstačić, A. CSM-SD: Methodology for contrast set mining through subgroup discovery. J. Biomed. Inform. 2009, 42, 113–122. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Lavrač, N.; Kavšek, B.; Flach, P.; Todorovski, L. Subgroup discovery with CN2-SD. J. Mach. Learn. Res. 2004, 5, 153–188. [Google Scholar]
- Densmore, F. Northern Ute Music; Bulletin 75; Smithsonian Institution, Bureau of American Ethnology: Washington, DC, USA, 1922.
- Densmore, F. Choctaw Music; Bulletin 136; Smithsonian Institution, Bureau of American Ethnology: Washington, DC, USA, 1943.
- García-Vico, A.; Carmona, C.; Martín, D.; García-Borroto, M.; del Jesus, M. An overview of emerging pattern mining in supervised descriptive rule discovery: Taxonomy, empirical study, trends, and prospects. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2018, 8, e1231. [Google Scholar] [CrossRef] [Green Version]
- Piatetsky-Shapiro, G. Discovery, analysis and presentation of strong rules. In Knowledge Discovery in Databases; Piatetsky-Shapiro, G., Frawley, W., Eds.; MIT Press: Cambridge, MA, USA, 1991; pp. 229–248. [Google Scholar]
- Lenca, P.; Meyer, P.; Vaillant, B.; Lallich, S. On selecting interestingness measures for association rules: User oriented description and multiple criteria decision aid. Eur. J. Oper. Res. 2008, 184, 610–626. [Google Scholar] [CrossRef] [Green Version]
- Carvalho, D.R.; Freitas, A.A.; Ebecken, N. Evaluating the correlation between objective rule interestingness measures and real human interest. In Proceedings of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2005), Porto, Portugal, 3–7 October 2005; pp. 453–461. [Google Scholar]
Measure | Definition |
---|---|
typicality | |
utility | |
novelty | |
Laplace estimate | |
relative risk | |
IC | |
F1 score |
Case Study | Classes | # of Songs | # of Classes |
---|---|---|---|
1 | Tribes | 819 | 4 |
2 | Song types | 270 | 5 |
Attribute | Definition | # of Values | |
---|---|---|---|
Case Study 1 | Case Study 2 | ||
tonality | third above keynote | 5 | 3 |
firstNoteReKey | first tone relative to keynote | 17 | 14 |
lastNoteReKey | last tone relative to keynote | 7 | 6 |
lastNoteReCompass | last tone relative to compass of song | 4 | 2 |
material | tone material | 16 | 13 |
accidentals | chromatic alterations of tones | 6 | 3 |
structure | relation between contiguous accented tones | 4 | 3 |
firstProgression | direction of first progression | 2 | 2 |
firstMetricPos | part of measure on which song begins | 3 | 3 |
firstMeasure | rhythm (meter) of first measure | 13 | 10 |
meterChange | change of time (measure-lengths) | 3 | 3 |
rhythmicUnits | rhythmic units | 3 | 3 |
tempoVoiceAccomp | tempo relation voice/rhythmic accompaniment | – | 4 |
Measure | Class (Support) | Pattern (Support) | I | ||||
---|---|---|---|---|---|---|---|
A1 | typicality | Teton Sioux (240) | meterChange:yes (691) | 222 | |||
A2 | utility | Papago (167) | material:major_pentatonic, firstNoteReKey:fifth, lastNoteReKey:fifth, lastNoteReCompass:above_lowest (28) | 27 | |||
A3 | utility | Papago (167) | lastNoteReCompass:above_lowest (269) | 149 | |||
A4 | utility | Teton Sioux (240) | lastNoteReCompass:lowest, meterChange:yes (456) | 195 | |||
A5 | novelty | Papago (167) | lastNoteReCompass:above_lowest (269) | 149 | |||
A6 | Laplace | Chippewa (326) | tonality:major, structure:harmonic, firstNoteReKey:twelfth,lastNoteReKey:keynote (59) | 56 | |||
A7 | rel. risk | Papago (167) | lastNoteReCompass:above_lowest (269) | 149 | |||
A8 | IC | Teton Sioux (240) | lastNoteReCompass:lowest (566) | 209 | |||
A9 | F1 score | Papago (167) | lastNoteReCompass:above_lowest, meterChange:yes (226) | 139 |
Measure | Class (Support) | Pattern (Support) | I | ||||
---|---|---|---|---|---|---|---|
B1 | typicality | love songs (26) | tempoVoiceAccomp:noDrum (65) | 26 | |||
B2 | utility | game songs (14) | tonality:minor, accidentals:none, firstProgression:upward, lastNoteReKey:keynote (25) | 8 | |||
B3 | utility | love songs (26) | tempoVoiceAccomp:noDrum (65) | 26 | |||
B4 | utility | love songs (26) | structure:melodic (171) | 25 | |||
B5 | novelty | Mĭde’ songs (92) | structure:melodic, firstProgression:downward (121) | 64 | |||
B6 | Laplace | war songs (87) | structure:mixed, lastNoteReKey:keynote, lastNoteReCompass:lowest, rhythmicUnits:one (12) | 12 | |||
B7 | rel. risk | game songs (14) | lastNoteReKey:keynote, firstMetricPos:accented, rhythmicUnits:no (37) | 10 | |||
B8 | IC | Mĭde’ songs (92) | firstProgression:downward (198) | 84 | |||
B9 | F1 score | Mĭde’ songs (92) | structure:melodic, firstProgression:downward (121) | 64 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Neubarth, K.; Conklin, D. Mining Characteristic Patterns for Comparative Music Corpus Analysis. Appl. Sci. 2020, 10, 1991. https://doi.org/10.3390/app10061991
Neubarth K, Conklin D. Mining Characteristic Patterns for Comparative Music Corpus Analysis. Applied Sciences. 2020; 10(6):1991. https://doi.org/10.3390/app10061991
Chicago/Turabian StyleNeubarth, Kerstin, and Darrell Conklin. 2020. "Mining Characteristic Patterns for Comparative Music Corpus Analysis" Applied Sciences 10, no. 6: 1991. https://doi.org/10.3390/app10061991
APA StyleNeubarth, K., & Conklin, D. (2020). Mining Characteristic Patterns for Comparative Music Corpus Analysis. Applied Sciences, 10(6), 1991. https://doi.org/10.3390/app10061991