Covering Blue Voices: African American English and Authenticity in Blues Covers

De Timmerman, Romeo; Slembrouck, Stef

doi:10.3390/languages9070229

by Romeo De Timmerman * and Stef Slembrouck

Department of Linguistics, Ghent University, 9000 Gent, Belgium

* Author to whom correspondence should be addressed.

Languages 2024, 9(7), 229; https://doi.org/10.3390/languages9070229

Submission received: 6 April 2024 / Revised: 4 June 2024 / Accepted: 18 June 2024 / Published: 25 June 2024

(This article belongs to the Special Issue Interface between Sociolinguistics and Music)

Abstract

Many musicologists and researchers of popular music have recently stressed the omnipresence of covers in today’s music industry. In the sociolinguistics of music, however, studio-recorded covers and their potential differences from ‘original’ compositions have certainly been acknowledged in passing, but very few sociolinguists concerned with the study of song seem to have systematically explored how language use may differ in such re-imagined musical outputs. This article reports on a study which examines the language use of 45 blues artists from three distinct time periods (viz., 1960s, 1980s, and 2010s) and three specific social groups (viz., African American; non-African American, US-based; and non-African American, non-US based) distributed over 270 studio-recorded original and cover performances. Through gradient boosting decision tree classification, it aims to analyze the artists’ use of eight phonological and lexico-grammatical features that are traditionally associated with African American English (viz., /aɪ/ monophthongization, post-consonantal word-final /t/ deletion, post-consonantal word-final /d/ deletion, alveolar nasal /n/ in <ing> ultimas, post-vocalic word-final /r/ deletion, copula deletion, third-person singular <s> deletion, and not-contraction). Our analysis finds song type (i.e., the distinction between covers and originals) to have no meaningful impact on artists’ use of the examined features of African American English. Instead, our analysis reveals how performers seem to rely on these features to a great extent and do so markedly consistently, regardless of factors such as time period, socio-cultural background, or song type. This paper hence builds on our previous work on the language use of blues performers by further teasing out the complex indexical and iconic relationships between features of African American English, authenticity, and the blues genre in its various manifestations of time, place, and performance types.

Keywords:

African American English; blues music; authenticity; indexicality; iconicity

1. Introduction

Over the past few decades, sociolinguists have become increasingly interested in the language use of speakers in various contexts of popular culture. One such area is the study of language use in music, the first inquiries into which date back to the 1980s and 1990s (Simpson 1999; Trudgill 1983). More recently, scholars have built on this variationist sociolinguistic tradition and have particularly zoomed in on the complexities of language use in a variety of particular, often globalized, genres (Beal 2009; Bell 2011; Bell and Gibson 2011; Coupland 2011; Drummond 2018; Gibson 2010, 2011, 2019; Gilbers 2021; Gilbers et al. 2020). By and large, however, these and other studies have restricted their focus to one specific type of musical output: studio performances of ‘original’ compositions. Nonetheless, musical practice is more multifaceted than this, and we would argue that the sociolinguistics of music should account for the full range of practices that one comes across.

Consequently, the present article aims to expand the sociolinguistic interest in musical practice by considering the language use of artists who record cover versions of other performers’ compositions in the specific context of the blues genre. Concretely, we will examine the language use of 45 blues artists from three distinct time periods (viz., 1960s, 1980s, and 2010s) and three specific social groups (viz., African American; non-African American, US-based; and non-African American, non-US-based) to compare their original and cover performances. Through a gradient boosting decision tree algorithm, which was trained on a corpus of 270 blues songs (30 for each permutation of social group and time window), this study aims to show the similarities and differences in the artists’ use of eight phonological and lexico-grammatical features of African American English (AAE) across both performative contexts. The present article hence aims to (i) further tease out the complex relationship between features of AAE and the blues genre and (ii) examine whether cover songs should be treated as separate performative modes in the sociolinguistics of music.

2. Sociolinguistics of Covers in Music

Sociolinguistic interest in the study of language use in music dates back to Trudgill’s (1983) seminal paper on the Americanized pronunciation of British pop singers. Since then, sociolinguistic scholarship on music has steadily expanded, focusing not only on performers but also audiences, communities, and genres (Cutler 1999, 2003; Cutler 2007; Garley 2018, 2019; Newman 2005). When considering studies examining the language use of performers specifically, there is great variation in terms of their linguistic foci as well; while some zoom in on phonology (Gibson 2010) and prosody (Gilbers et al. 2020), others are interested in morphosyntax (Squires 2019), lexical items, and/or metaphors (Beal 2009; Bridle 2018). However, while all of these latter approaches are, in some ways, interested in the language use of speakers (i.e., singers) during staged performances (Bell and Gibson 2011)—high performances in Coupland’s (2007) terminology—it appears that, thus far, the linguistic study of song has largely been limited to one particular performance type: studio-recorded performances of ‘original’ compositions; that is, songs initially written, performed, and/or recorded for the first time by one and the same artist.

One particular lacuna that we wish to draw attention to in this paper is the idea that cover songs are distinct from their original counterparts, and that they should hence be approached as separate empirical entities in the linguistic study of song lyrics and their performance. Scholars in the fields of musicology and popular music studies have recognized and underlined the omnipresence of covers in today’s music industry (Cusic 2005; Mosser 2008; Plasketes 2005, 2016; Weinstein 2010). This includes (i) covers in a traditional sense (i.e., artists who perform a rendition of another performer’s previously released song) but also (ii) artists who re-release new versions of their own existing tracks (e.g., unplugged performances, Taylor Swift’s Taylor’s Versions, etc.). Similarly, the use of covers also extends into (iii) the practice of sampling, for example, which quickly became one of the cornerstones of hip hop and related genres.

Although these and other types of re-interpretative practices have been central to the field of popular music studies, it appears that, as of yet, accounts of how language use may differ in such re-imagined musical output have been largely absent in the sociolinguistic literature on language and music. The production of covers and their potential differences from ‘original’ recordings have certainly been acknowledged in passing (see, for example, Beal 2009; Bell 2011; Coupland 2009, 2011; Simpson 1999), but very few sociolinguists concerned with the study of song seem to have systematically explored this important dimension of contemporary musical practice. One question that we raise within this context is to what extent singers may wish to faithfully emulate the characteristics of the original performance, even when the cover crosses genre boundaries, and, linguistically, how the use of certain salient phonological and lexico-grammatical features of the original artist or recording are related to this.

To address this question empirically, we build in this paper upon our previous sociolinguistic work on the language use of live blues performers (De Timmerman et al. 2023), which has revealed how artists from a variety of socio-cultural backgrounds appear to rely on the use of AAE features when singing. Strikingly, the blues as a genre is characterized by the continuous re-imagining, re-recording, and re-performing of various songs, in some cases to such an extent that they have acquired status as ‘blues standards’ (cf. jazz standards), that is, songs that are arguably considered to belong more to the genre, rather than to an individual performer. It seems plausible, then, that blues performers who rely on features of AAE in their own creative output, would do so even more when covering the songs of others.

Conversely, one could equally argue that performers may feel the need to deviate from the original recording, instead wishing to stress their own take on the original composition. This may translate to, not only on a musical level, but also on a linguistic one, novelties characterizing the new performance. In other words, we would argue that there may be tension in blues covers between faithfulness to the original and the creativity of one’s own version. This contrast is not exclusive to the blues, of course, as is illustrated, for example, by the work of De Munck (2019), who similarly argues that performances of classical music necessarily balance between the dehumanized, faithful rendition of a composer’s original piece versus the personalized performance which highlights the creativity and virtuosity of the individual performers.

The aim of the present study is hence to scrutinize this creative tension in musical re-interpretation and to gauge how nuanced dynamics of language use may be involved in this. We strongly believe that our specific empirical focus on blues covers in this study is warranted not only because of the genre’s strong reliance on cover work but also because of its unique contrast between processes of localization and globalization, which results in a cultural context that, we would argue, is incredibly fruitful for sociolinguistic inquiry. In the following section, we home in on this by considering how the blues simultaneously encompasses highly localized and globalized styles and argue as to how and why this intrinsically results in a socio-cultural context that is undoubtedly valuable not only to (ethno)musicologists but also to (socio)linguists.

3. Indexicality and Iconicity of AAE Features in Blues Music

Our interest in the complex relationship between the blues genre and particular identifiable linguistic patterns can be situated within the third wave of variationist sociolinguistic scholarship (Eckert 2012, 2016). The first wave of variationist sociolinguistics, heralded by Labov’s (1966) study of the social stratification of /r/ in New York City, largely approached linguistic variation as directly reflecting one’s membership in specific social groups. This one-to-one mapping of sociolinguistic variable use with macro-sociological categories largely persisted in the second wave, which embraced qualitative, ethnographic methods in the study of sociolinguistic variation following the work of Milroy and Milroy (1978) on networks of working class communities in urban Belfast. The third wave, however, differentiates itself through a more nuanced approach to the relationship between linguistic variation and social meaning. Building conceptually on notions such as indexicality (Silverstein 2003; Eckert 2008) and stylistic practices (Bucholtz and Hall 2005; Eckert 2008; Irvine 2002), contemporary third-wave variationist scholars generally study how speakers, through processes such as “bricolage” (Hebdige 2008) and “stylization” (Coupland 2007; Rampton 2009), dynamically and interactionally project their social personae through language use.

This third-wave approach, which goes beyond simple linguistic category affiliation, we would argue, is essential to understanding the complexities of language use of contemporary blues performers, especially when considering the genre’s history. On the one hand, the blues emerged in highly specific societal and political contexts among enslaved and discriminated African American communities across the Southern United States of the nineteenth century. However, it quickly diversified on both a local level, with various blues styles geographically spanning the United States (e.g., Delta Blues, Chicago Blues, Memphis Blues, Louisiana Blues, Piedmont Blues, and Texas Blues), as well as on a global level, with popularized forms of blues(-rock) spreading to Western Europe and the rest of the world in the latter half of the twentieth century. This global development led directly to criticism and skepticism regarding the blues of white, middle-class performers in particular, which is further echoed in current discourses surrounding cultural appropriation in the arts (Young 2008). Within this context, some scholars argue that the blues is an inherently “racial project” which prevents non-African American performers from imparting the “racialized moral pain” which is quintessential to the genre (Taylor 1995, p. 315; see also Gleason 1968; Jones 1999; Young 2008), while others assert that the blues is a “stance embodied and articulated in sound and poetry” which can genuinely and successfully be adopted by non-African American artists (Rudinow 1994, p. 135).

Rudinow (1994) specifically considers authenticity to lie at the center of this dilemma. Blues performers, he argues, ought not to be judged by racial and ethnic backgrounds alone but should rather be evaluated “on the degree of mastery of the idiom and the integrity of the performer’s use of the idiom in performance” (p. 135). This blues idiom, we would argue, is multifaceted and may include not only musical, instrumental, and lyrical—that is, lexical, metaphorical, and thematic—but also linguistic performances. As part of this argument, then, and following the third-wave variationist paradigm, we largely interpret the aforementioned prevalence of AAE features in the lyrical language use of blues performers regardless of their backgrounds to be indexical of the artistic authenticity necessary to accurately and adequately perform blues music (cf. De Timmerman et al. 2023). This indexical relationship offers contemporary blues performers linguistic means through which they can position themselves within the genre as faithful, serious, and authentic performers.

The indexical relationship between features of AAE and musical, artistic authenticity is naturally related to sociolinguistic conceptualizations of authenticity as well. In this study, we specifically draw on Coupland’s (2003) framework of sociolinguistic authenticities, which highlights five specific dimensions of authenticity: ontology, historicity, systemic coherence, consensus, and value. These qualities are central to authentic entities and highlight how their cultural value is dependent on their internal characteristics, as well as how they are perceived, discussed, and constructed by social actors. However, as Coupland (2003) argues, globalization and late modernity often result in a significant discrepancy between the ‘innate’ authenticity speakers are able to acquire from their own sociocultural background and socialization, and the authenticating processes which are central to how they position themselves in the social world. Following this and applied to the blues context, we would furthermore stress how blues performers, specifically those with non-African American cultural heritage, may rely on both musical and linguistic means to construct their artistic authenticity, as well as to identify themselves as blues artists.

Nevertheless, we have equally asserted in our previous work how certain features of AAE simply appear to be constitutive of the blues genre as a whole, resulting in our coining the term Standard Blues Singing Style (SBSS), following Wilson’s (2017) Classical Choral Singing Style (CCSS) and Gibson’s (2019) Standard Popular Music Singing Style (SPMSS). This claim, one could argue, does not align with the n+1st order dialectics of indexicality (Silverstein 2003), which encompass the continuous re-interpretation and re-construal of linguistic signs and their social meanings. Instead, it suggests a more straightforward relationship between the observed linguistic variation and the cultural export product that is blues music. We tie this view to Irvine and Gal’s (2000) notion of iconization—later renamed as rhematization (Gal 2005, 2013)—which occurs when “[l]inguistic features that index social groups or activities appear to be iconic representations of them, as if a linguistic feature somehow depicted or displayed a social group’s inherent nature or essence” (p. 37). This iconic transformation of the dynamics between linguistic features and the social meanings that are linked to them is arguably directly relevant to the blues context, in which a certain style of singing appears to be simply part and parcel of the genre’s conventions.

We strongly believe that the tension between processes of indexicality and iconicity is essential to interpreting and understanding the language use of (contemporary) blues performers. In this paper, for which our focus lies specifically on the comparison between original and cover works, we hence examine how both processes appear to be at play in our corpus of recorded blues performances. In the next section, we first outline our methodology in compiling this corpus, analyzing it through descriptive statistics and machine learning prediction, before turning to a discussion of the quantitative results that considers how the observed linguistic variation in blues performances may be explained through the conceptual linguistic frameworks we outlined above.

4. Data and Methods

To examine the use of AAE features in blues music, a corpus of 270 studio-performed blues songs was compiled. These 270 songs were equally distributed among 45 blues artists, each of whom represents one of three social groups (viz., African American; non-African American, US-based; and non-African American, non-US-based) and one of three time periods (viz., 1960s, 1980s, and 2010s), as shown in Table 1. For each of these artists, six unique songs were selected using the online video sharing platform YouTube. In those cases in which artists were active during more than one of our selected time periods (e.g., B.B. King), we selected the earliest time period for which we were able to find enough recordings of sufficient audio quality, and limited our selection of songs to that single time period^1. Three out of six songs per artist had to be originals, meaning that no other performer had previously recorded a version of the composition. The other three songs had to be covers, that is, songs which were previously recorded by another blues artist. Put simply, we are not comparing original and cover versions of the same song, but instead original and cover songs performed by the same artist. Throughout this article, when we use the term ‘cover’, this is the specific type of song we are referring to.

Moreover, while some of the included artists and bands^2 certainly do not fall only under the blues category, we ensured that all selected songs can be categorized as blues, based on their reliance on (i) the 12-bar I–IV–V or i–iv–v chord patterns, (ii) the minor pentatonic scale, especially when played over major chords, (iii) the ‘blue’ ♭3, ♭5 and ♭7 notes, and (iv) a ‘lyrical pattern’, often with a call-response structure, which is repeated throughout the song (see Bridle 2018, p. 26). Our definition of the blues is accordingly rather broad and includes a variety of subgenres, ranging from early Delta Blues to contemporary blues-rock. Our corpus hence includes not only ‘traditional’ blues performers (that is, artists with African American cultural heritage who often grew up in the Southern United States and, in many cases, moved to urban centers in the North or Midwest as part of the Great Migration) but also artists and bands in other parts of the globe, who were inspired by and built on the music of these traditional performers. This includes bands such as Fleetwood Mac and the Rolling Stones, who certainly moved away from more traditional blues and blues-rock later in their careers, but were among the first British artists to actively perform blues music in the early 1960s.
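The sampling design described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant names are hypothetical, and the figure of five artists per cell is not stated directly in the text but derived from the reported totals (45 artists spread over nine combinations of social group and time period).

```python
from itertools import product

# Category labels as given in the text; variable names are hypothetical.
SOCIAL_GROUPS = [
    "African American",
    "non-African American, US-based",
    "non-African American, non-US-based",
]
TIME_PERIODS = ["1960s", "1980s", "2010s"]
SONG_TYPES = ["original", "cover"]

ARTISTS_PER_CELL = 5   # derived: 45 artists / 9 cells (assumption, see lead-in)
SONGS_PER_TYPE = 3     # three originals and three covers per artist

# Each cell is one combination of social group and time period.
cells = list(product(SOCIAL_GROUPS, TIME_PERIODS))

n_artists = len(cells) * ARTISTS_PER_CELL
songs_per_artist = len(SONG_TYPES) * SONGS_PER_TYPE
n_songs = n_artists * songs_per_artist
songs_per_cell = ARTISTS_PER_CELL * songs_per_artist

print(len(cells), n_artists, songs_per_artist, n_songs, songs_per_cell)
```

The totals reproduce the figures reported in the text: nine cells, 45 artists, six songs per artist, 270 songs overall, and 30 songs per cell.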

To examine the use of AAE features across these cover and original performances, five phonological and three lexico-grammatical features of AAE were selected as variables for this study. These variables were selected based on the ample body of literature on AAE, and can be considered prototypical features of the variety (see list below). It should be noted, however, that we do not wish to dismiss current scholarly interest in and arguments for a more nuanced and region-sensitive approach to the study of AAE (Forman 2002; Gilbers 2021; Gilbers et al. 2020; Wolfram 2007; Wolfram and Kohn 2015). AAE is not simply a monolithic variety of English without any regional variation. However, since our focus in this paper is scaled to the use of AAE features in a globalized musical genre, we effectively limit ourselves to salient and generalizable features of AAE that, based on their high prevalence in a particular genre, may acquire shibboleth-like value to a certain musical community (cf. De Timmerman et al. 2023). Similarly, we acknowledge that these features are not necessarily exclusive to AAE and may, in some cases, be prevalent in other varieties of English, or nonformal registers more generally. Nevertheless, based on the extensive (socio)linguistics scholarship on AAE, we strongly argue that the selected features are, in fact, characteristic of the variety, and may therefore be easily picked up by non-AAE speaking blues performers when singing. The five phonological and three lexico-grammatical features we selected are as follows:

/aɪ/ Monophthongization (Anderson 2002; Fridland 2003),
e.g., <mine>, [maɪn] realized as [ma:n];
Alveolar nasal realization of <ing> ultimas (Thomas 2007),
e.g., <worrying>, [wʌriɪŋ] realized as [wʌriɪn];
Post-vocalic word-final /r/ deletion (Labov 1968; Thomas 2007),
e.g., <for>, [for] realized as [fo:];
Post-consonantal word-final /t/ deletion (Labov 1968; Thomas 2007),
e.g., <don’t>, [doʊnt] realized as [doʊn];
Post-consonantal word-final /d/ deletion (Labov 1968; Thomas 2007),
e.g., <and>, [ænd] realized as [æn];
Use of <ain’t> as a negative auxiliary verb to replace <isn’t> (Walker 2005),
e.g., “he isn’t going to the party” realized as “he ain’t going to the party”;
Deletion of third-person singular <s> (Newkirk-Turner and Green 2016),
e.g., “he walks to school” realized as “he walk to school”;
Copula deletion (Green 2002; Kim 2022),
e.g., “he is fun” realized as “he fun”.

The entire corpus of 270 songs was imported into MAXQDA (MAXQDA n.d.) for transcription and coding of the selected AAE features. All songs were transcribed in “naturalized form” (Bucholtz 2007) using standard English spelling. The entire corpus was then coded in a binary manner: each time in the corpus when one of the eight AAE variables could potentially be realized, we indicated whether it was. To facilitate this process, the audio track was often slowed down substantially to make subtle differences in pronunciation easier to perceive. Nevertheless, the binary categorization remained challenging, particularly in older recordings. Therefore, in instances of uncertainty, tokens were left uncoded. To verify the accuracy of the annotator, an inter-rater reliability test was performed on a similar corpus of live-performed blues lyrics, which featured the same five phonological features used in this study but none of the lexico-grammatical ones (see De Timmerman et al. 2023). This resulted in an 88% agreement between both raters across all features, with specific agreement levels ranging from 86% to 91% for each of the five phonological features. We therefore argue that the binary differences in realization can reliably be identified, and that our corpus is suitable for further analysis.
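As an illustration of the reliability figure reported above, the sketch below computes raw percent agreement between two annotators' binary codes. It assumes that the reported value is simple percent agreement (the text does not name a chance-corrected statistic), and the toy codes are invented for the example rather than taken from the corpus.

```python
def percent_agreement(rater_a, rater_b):
    """Share of tokens on which two raters assigned the same binary code."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must code the same tokens")
    if not rater_a:
        raise ValueError("at least one token is required")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Invented toy codes: 1 = AAE variant realized, 0 = not realized.
codes_a = [1, 1, 0, 1, 0, 1, 1, 1]
codes_b = [1, 1, 0, 0, 0, 1, 1, 1]

agreement = percent_agreement(codes_a, codes_b)
print(f"{agreement:.1%}")
```

Raw agreement does not correct for agreement expected by chance, which matters when one outcome is much more frequent than the other, as is the case for the realized variants in this corpus.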

Upon completing the annotation, the MAXQDA project was exported into tabular form, which allowed for the statistical and machine learning analyses to be performed using Python^3. The tabular dataset^4 had the following variables for each potential occurrence of one of the eight AAE features outlined above:

Word (i.e., word containing the AAE feature);
Previous word;
Next word;
Artist name;
Song title;
Song type (i.e., cover or original);
AAE feature (i.e., one of the eight selected features of AAE);
AAE realization (i.e., whether the AAE feature was realized—the binary outcome variable);
Time period (i.e., 1960s, 1980s, or 2010s);
Social group (i.e., African American; non-African American, US-based; or non-African American, non-US-based).
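One row of the dataset listed above could be represented as follows. This is a minimal sketch: the class name, field names, and the example values are hypothetical and do not reflect the column names of the authors' actual export.

```python
from dataclasses import dataclass

SONG_TYPES = {"cover", "original"}
TIME_PERIODS = {"1960s", "1980s", "2010s"}


@dataclass(frozen=True)
class Token:
    """One potential occurrence of an AAE feature (hypothetical schema)."""
    word: str            # word containing the AAE feature
    previous_word: str
    next_word: str
    artist: str
    song_title: str
    song_type: str       # "cover" or "original"
    aae_feature: str     # one of the eight selected features
    aae_realized: bool   # binary outcome variable
    time_period: str     # "1960s", "1980s", or "2010s"
    social_group: str

    def __post_init__(self):
        # Reject values outside the categories named in the text.
        if self.song_type not in SONG_TYPES:
            raise ValueError(f"unknown song type: {self.song_type!r}")
        if self.time_period not in TIME_PERIODS:
            raise ValueError(f"unknown time period: {self.time_period!r}")


# Invented example row; artist and song title are placeholders.
example = Token(
    word="mine",
    previous_word="of",
    next_word="is",
    artist="Example Artist",
    song_title="Example Song",
    song_type="cover",
    aae_feature="/aɪ/ monophthongization",
    aae_realized=True,
    time_period="1960s",
    social_group="African American",
)
print(example.aae_feature, example.aae_realized)
```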

One methodological challenge inherent to the variationist paradigm is the natural skewedness in the distributions of linguistic variables and their realizations. Applied to this particular study, we are faced with two types of imbalances. On the one hand, there will inevitably be asymmetry in the possible occurrences of the eight selected features we outlined above. For example, /aɪ/ diphthongs are ubiquitous in our corpus, but contracted negations are much rarer (cf. Section 5.1). On the other hand, we are also faced with imbalance in the specific realizations of these variables. As our exploratory data analysis in the next section shows, there are many more positive cases of the outcome variable in our corpus, that is, many more cases in which artists realize the AAE feature than cases in which they do not. This raises the question of to what extent statistical methodologies commonly used in linguistics can account for these imbalances, and, more importantly, how meaningful sociolinguistic insights can be distilled from them. Strikingly, logistic regression, the generalized linear mixed effects model which is prototypically used in contemporary sociolinguistic studies for binary classification, has widely been reported to be particularly sensitive to class imbalances, resulting in poor model performance (Li 2020; Muchlinski et al. 2016; Oommen et al. 2011; van den Goorbergh et al. 2022).

To address this problem, we rely in this paper on a machine learning approach using a gradient boosted decision tree classification algorithm. These algorithms are generally considered to be very performant for predictions on tabular data (McElfresh et al. 2023), are widely used in a variety of fields (Hancock and Khoshgoftaar 2020a), and, crucially, are less sensitive to class imbalances when compared to logistic regression (Hancock and Khoshgoftaar 2020b). We specifically selected the popular ‘Catboost’ library (Prokhorenkova et al. 2019), since its gradient boosting classifier provides built-in support for categorical variables. We additionally used the ‘Scikit-learn’ library (Pedregosa et al. 2011) for data splitting and model evaluation. Hyperparameter optimization was performed using the ‘Optuna’ framework (Akiba et al. 2019) by maximizing the F1-score over the course of 50 trials. Finally, we used the ‘Shap’ package for model white-boxing (Lundberg and Lee 2017). The next section outlines our exploratory data analysis, descriptive statistics to compare original and cover songs, and the gradient boosting classification model.
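For readers less familiar with the tuning objective named above, the F1-score is the harmonic mean of precision and recall for the positive class. The sketch below computes it in plain Python on invented toy labels. It does not reproduce the CatBoost model or the Optuna search used in the study, and the function name is hypothetical.

```python
def f1_score_binary(y_true, y_pred):
    """F1-score for the positive class (label 1) of a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy data: 1 = AAE variant realized, 0 = not realized.
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 0]

f1 = f1_score_binary(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(round(f1, 3), round(accuracy, 3))
```

Unlike accuracy, the F1-score ignores true negatives, so it responds to both kinds of error on the positive class rather than to overall hit rate alone.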

5. Results

5.1. Descriptive Summary

The binary coding of AAE variable realizations resulted in a dataset of 15,184 rows. As shown in Table 2, the absolute and relative frequencies of AAE realizations vary considerably by variable. In absolute numbers, the phonological features occur much more frequently than the lexico-grammatical ones. /aɪ/ Monophthongization in particular is strikingly prevalent, though this can largely be attributed to the omnipresence of the first-person pronouns ‘I’ and ‘my’ in the corpus, and blues music more generally (see Bridle 2018).

In relative terms, the AAE realization of most variables is largely preferred, as seen in Table 2, with two notable exceptions, both of which are lexico-grammatical: copula deletion occurs in only 17% of all possible cases, and the deletion of third-person singular <s> occurs in 40% of cases. The third and final lexico-grammatical variable, on the other hand, has by far the highest relative prevalence of AAE realizations: <ain’t> replaces <isn’t> in 99% of cases. For all phonological variables, the AAE realization is preferred, though the relative frequency varies from 60% to 89% depending on the variable.
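The relative frequencies discussed above are, per variable, the number of realized tokens divided by the number of possible occurrences. A minimal sketch of that calculation, using invented toy rows rather than the corpus counts from Table 2 and a hypothetical function name:

```python
from collections import defaultdict


def realization_rates(rows):
    """Map each feature to realized tokens / possible occurrences."""
    counts = defaultdict(lambda: [0, 0])  # feature -> [realized, total]
    for feature, realized in rows:
        counts[feature][0] += int(realized)
        counts[feature][1] += 1
    return {feature: r / n for feature, (r, n) in counts.items()}


# Invented toy rows: (AAE feature, whether the AAE variant was realized).
toy_rows = [
    ("/aɪ/ monophthongization", True),
    ("/aɪ/ monophthongization", True),
    ("/aɪ/ monophthongization", True),
    ("/aɪ/ monophthongization", False),
    ("copula deletion", True),
    ("copula deletion", False),
    ("copula deletion", False),
    ("copula deletion", False),
]

rates = realization_rates(toy_rows)
for feature, rate in rates.items():
    print(f"{feature}: {rate:.0%}")
```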

To illustrate the overall prevalence of AAE realizations in the corpus, Figure 1 visualizes the mean AAE realizations by artist and by group. This group variable (see figure legend) combines time period and social group in one variable with nine permutations for easier visualization. Note that the distinction between covers and originals is not considered at this stage.

As seen in Figure 1, the group means of eight out of nine groups are tightly clustered together, ranging from 72% for the ‘1980s non-AA, non-US’ group, to 77% for the ‘2010s AA’ group. The ninth group, ‘1960s non-AA, US-based’, has a strikingly lower group mean of 66%. We interpret this as reflecting the potential social friction that may arise when borrowing or appropriating cultural, musical or linguistic elements from a social group with high geographical and temporal proximity. All other non-African American groups, however, are more distanced from the reference group (viz., African American performers from the 1960s) and hence have very similar group means. However, while the group means are largely uniform, there is considerable within-group variability, as we observe no clear clustering of the individual artist means based on the group variable. Instead, there is a lot of variation in the mean AAE realizations of all artists, as individual means range from a lower bound of 53% for Eric Clapton to an upper bound of 87% for Eric Gales.

Figure 1 consequently paints a picture that is largely consistent with the tendencies laid out in our previous work on the language use of blues performers (De Timmerman et al. 2023). By and large, artists seem to rely heavily on the use of AAE features while singing, suggesting that these linguistic features are important to the blues genre as such. This naturally ties in with current third-wave variationist scholarship, and shows, in yet another context, that linguistic variable use need not directly reflect one’s socio-cultural background. Instead, we would argue that blues performers depend on features of AAE to position themselves within the blues genre through their indexical link with original blues performers, and traditional blues performance more generally. We hence find further support for the existence of a Standard Blues Singing Style (SBSS), which seems to depend heavily on AAE features as indexical expressions of authenticity (cf. Gibson 2010, 2019).

5.2. Original versus Cover Songs

In order to examine the use of AAE features across the two song types we are interested in, viz., covers and originals, Figure 2 shows the mean AAE realizations of all eight variables for each of the nine groups, while also distinguishing between both song types. As is clearly indicated in the figure, the mean values for both song types do not seem to differ considerably for most of the variables. There are a few exceptions, however. For third-person singular <s> deletion and copula deletion there are, in the case of some groups, differences between the two song types. Specifically, we observe noticeable differences in the African American and non-African American, US-based groups for both the 1960s and 1980s. As highlighted before, however, our sample sizes for these specific variables are relatively low. This is also reflected in the broad 95% confidence intervals of these variables in most groups. In other words, we cannot really make any meaningful inferences about the impact of song type on the use of these two specific variables. More interesting, perhaps, are the slight differences between covers and originals for /t/ deletion and /r/ deletion among non-African American, non-US-based artists of the 1960s and 1980s, respectively. In both cases, the AAE realizations of the phonological variables are more prevalent in the cover songs as opposed to the original ones.

If we generate a similar figure but plot the ‘artist’ variable on the y-axis instead, this general pattern is further confirmed. As shown in Figure 3, the mean AAE realizations for all artists in the corpus are strikingly similar when comparing their cover and original songs. There are four cases in which a slight difference can be observed, viz., Cream, Bonnie Raitt, Eric Clapton, and The Blues Band; for all four of these performers and bands, the mean AAE realization seems to be slightly higher when they are covering another artist’s composition than when performing their own work. In line with the results displayed in Figure 2, three of these four exceptions are artists from the non-African American, non-US-based group. For the other 41 artists in the corpus, however, the mean scores are remarkably similar, as shown by the nearly overlapping mean estimates and their 95% confidence intervals. In other words, at the artist level, the cover versus original distinction is, by and large, not very meaningful.

Figure 2 and Figure 3 consequently suggest that there is no sizeable effect of the song type on the prevalence of the eight features of AAE. These results are positioned somewhat ambiguously vis-à-vis the twofold hypothesis we outlined before. As mentioned in Section 2, we anticipated that contemporary artists covering another song might wish to either stick faithfully to the original composition or deviate from it to highlight their own creativity instead, and that both approaches may be reflected in the artists’ language use. The results presented here, however, show that the language use of artists does not seem to differ meaningfully across the two song types, and it appears that their reliance on the observed AAE features is simply part and parcel of their musical register. While this is a relatively marked observation, it ties in with our general findings on the importance of these AAE features in blues music, and it hence provides further support for the existence of an SBSS.

5.3. Predictive Modeling

To examine this further, and to paint a more complete picture of the language use of blues artists in the corpus, we trained a gradient boosting decision tree model to predict the binary realization of the selected AAE features (i.e., the outcome variable). As listed in Section 4, the predictors are (i) word, (ii) previous word, (iii) next word, (iv) artist name, (v) song title, (vi) song type, (vii) AAE feature, (viii) time period, and (ix) social group. Note that during data preprocessing, initial model testing revealed the categorical rendering of the time period variable with three possible values (viz., 1960s, 1980s, 2010s) to outperform its continuous numerical counterpart (i.e., the specific year of performance associated with each individual song), which is why we opted to use the former over the latter as a predictor. Similarly, while a combination of the time period and social group variables was used for parts of the data visualization (cf. Section 5.2), this combined feature was not used to train the model. Instead, both the social group and time period variables were used as separate features for improved model performance.

The model was trained on 70% of the entire dataset. Half of the remaining data were reserved for model validation during hyperparameter optimization (15%), while the other half were used for model testing (15%). As mentioned in Section 4, we specifically chose to adopt this machine learning approach to address the class imbalance limitation of logistic regression. Moreover, these ensemble learning models are highly performant in prediction tasks based on tabular data, especially when confronted with data irregularities like class imbalances (McElfresh et al. 2023). In other words, if there is structure and regularity in our dataset, and hence in the lyrical language use of blues performers, then such a model is very likely to identify it.
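The 70/15/15 partition described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `split_dataset`, the fixed seed, and the plain random shuffle are assumptions made here, and the project's actual code may split the data differently (for instance, by stratifying on the outcome variable).

```python
import random

def split_dataset(rows, train_share=0.70, seed=42):
    """Shuffle rows and split them into train, validation, and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_share)
    # The remaining rows are halved: one half for validation, one for testing.
    n_val = (len(shuffled) - n_train) // 2
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Hypothetical stand-in for the tabular dataset: 1000 row identifiers.
train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))
```

Keeping the test portion entirely separate from hyperparameter optimization is what allows the later evaluation to be described as using previously unseen data.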

The overall accuracy of our trained model on the test set (15% of the data) is 0.90, but because there is considerable class imbalance in the overall dataset (cf. Section 4), accuracy alone does not give a complete picture of model performance. To obtain a more accurate view, Table 3 shows the classification report for the final model after hyperparameter optimization, using predictions on the previously unseen test set.

As shown in Table 3, the performance metrics for both classes all exceed 0.80, except for the recall of the minority class (i.e., AAE feature not realized), which can naturally be attributed to the 74–26% class imbalance of the outcome variable. The high values for all model metrics suggest that the model is able to account for the class imbalance. We can furthermore verify this through the confusion matrix displayed in Figure 4.

As shown in Figure 4, the model predicts very few false negatives (n = 99) in the test set. This is, of course, not unexpected because of the large class imbalance in favor of the positive cases. However, there are also markedly few false positive cases (n = 139), especially compared to the true negatives (n = 462) and the true positive cases (n = 1578). In other words, Figure 4, again, suggests that the model is able to accurately account for the class imbalance in the dataset.
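The per-class metrics can be recomputed directly from the four counts reported for the confusion matrix. The sketch below does this arithmetic in plain Python; the helper name `prf` is hypothetical, and values recomputed this way may differ slightly from those in the classification report through rounding.

```python
# Counts reported for the test set; "positive" means the AAE feature is realized.
tp, fn, fp, tn = 1578, 99, 139, 462

def prf(hit, missed, false_alarm):
    """Return precision, recall, and F1 for one class."""
    precision = hit / (hit + false_alarm)
    recall = hit / (hit + missed)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
majority_share = (tp + fn) / total  # share of tokens with the feature realized

realized = prf(tp, fn, fp)      # class: AAE feature realized
not_realized = prf(tn, fp, fn)  # class: AAE feature not realized

print(f"accuracy={accuracy:.2f}, majority share={majority_share:.2f}")
print("realized     (P, R, F1):", [round(m, 2) for m in realized])
print("not realized (P, R, F1):", [round(m, 2) for m in not_realized])
```

Always predicting the majority class would already yield an accuracy equal to the majority share, which is why the per-class precision and recall are more informative here than accuracy alone.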

Now that the model has been shown to perform well, we can look more qualitatively at the predictions and their Shapley values. Shapley values provide a way to quantify the contribution of each variable (see Note 6) of the dataset to the model’s prediction for a particular datapoint. In other words, Shapley values help identify those variables that have a meaningful impact on the outcome variable and those that do not. They consequently help to interpret the otherwise relatively opaque model predictions, hence helping to white-box the model.
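To make the idea of a per-variable contribution concrete, the following sketch computes exact Shapley values for a single datapoint by enumerating all coalitions of predictors. Everything in it is hypothetical: `toy_predict` is a hand-written scorer and not the trained model, the predictor values are invented, and absent predictors are simply set to a baseline value, which simplifies the averaging performed by SHAP-style libraries.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one datapoint by enumerating all coalitions."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Predictors outside the coalition take their baseline value.
        mixed = {k: instance[k] if k in coalition else baseline[k] for k in names}
        return predict(mixed)

    result = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {name}) - value(members))
        result[name] = total
    return result

def toy_predict(x):
    """Hypothetical scorer, NOT the model trained in the study."""
    score = 0.5
    if x["word"] == "nothing":
        score += 0.3
    if x["feature"] == "ing":
        score += 0.1
    # 'song_type' is deliberately ignored, so its Shapley value is zero.
    return score

instance = {"word": "nothing", "feature": "ing", "song_type": "cover"}
baseline = {"word": "other", "feature": "other", "song_type": "original"}
phi = shapley_values(toy_predict, instance, baseline)
print(phi)
```

The values sum to the difference between the prediction for the datapoint and the prediction for the baseline, and a predictor that the scorer never consults receives a value of zero.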

Figure 5 visualizes the mean absolute Shapley value for all of the variables that are used by the model: the larger a variable’s mean absolute value, the more important it is for the model’s prediction.
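The aggregation behind such a ranking is a simple one, as the sketch below shows. The Shapley values in `shap_rows` are hypothetical numbers chosen for illustration and are not taken from the study.

```python
from statistics import mean

# Hypothetical per-datapoint Shapley values for three predictors.
shap_rows = [
    {"word": 0.40, "AAE feature": 0.20, "song type": 0.01},
    {"word": -0.50, "AAE feature": 0.10, "song type": -0.02},
    {"word": 0.30, "AAE feature": -0.30, "song type": 0.00},
]

def mean_absolute_shap(rows):
    """Average the absolute Shapley value of each predictor over all rows."""
    return {name: mean(abs(row[name]) for row in rows) for name in rows[0]}

importance = mean_absolute_shap(shap_rows)
ranking = sorted(importance, key=importance.get, reverse=True)
print(ranking)
```

Taking absolute values before averaging matters: a predictor that pushes some predictions up and others down would otherwise appear unimportant because its positive and negative contributions cancel out.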

As shown in Figure 5, the ‘word’ variable has by far the highest mean absolute Shapley value. In other words, it is the variable with the highest impact on the model’s prediction. This makes sense intuitively: depending on which word contains the possible AAE feature, the realization might be different. The specific type of AAE feature has the second highest mean impact on the prediction. This can again be interpreted intuitively, especially when considering the variation in feature means outlined previously (cf. Section 5.1). We also find the ‘song’ and ‘next word’ variables to have relatively high mean scores. The latter makes sense linguistically, as many of the selected AAE features may depend on the phonological context of the next speech unit. The high mean absolute Shapley value for the individual ‘song’, on the other hand, can likely be attributed to the strong pattern of lexical and lyrical repetition that is present in most songs in our dataset, and blues music more generally, as is illustrated by the first six lines of Muddy Waters’ “I Got My Brand on You”:

I got my brand on you
I got my brand on you
I got my brand on you
I got my brand on you
There ain’t nothing you can do honey
I got my brand on you

Conversely, some of the variables have strikingly low mean absolute Shapley values. The ‘social group’ predictor, for instance, has a comparatively low mean absolute value, suggesting that this variable has very little effect on model prediction. Our earlier descriptive finding, the tendency for artists to frequently use features of AAE while singing regardless of their socio-cultural backgrounds, is further confirmed here. This naturally ties in with third-wave variationist interpretations, which, as outlined in Section 3, move away from one-to-one mapping of linguistic features and macro-sociological categories, instead stressing that speakers can stylistically draw on linguistic resources associated with cultural groups that they do not necessarily belong to in order to position themselves in the social world in a meaningful way (see, for example, Bucholtz and Hall 2005; Bucholtz and Lopez 2011; Coupland 2007; Eckert 2012, 2016; Irvine 2002; Rampton 2009, 2022). We argue that subscribing to a genre that is traditionally associated with a social group that one does not belong to falls under this heading as well.

When considering the ‘song type’ variable, i.e., the cover versus original distinction which is the focus of this paper, we observe that it has the lowest mean absolute value of all of the predictors. This means that, on average, it has very little impact on the model’s predictions. This is in line with the trend we observed above in Figure 2 and Figure 3. By and large, song type does not seem to affect the use of AAE features by the observed blues performers. If we consider this together with the low impact of the ‘social group’ variable, it appears that there is a striking consistency with which artists seem to rely on the observed features of AAE, regardless of factors which one might otherwise expect to affect their language use. We accordingly interpret this regularity to mean that these features are simply integral to and constitutive of the blues genre.

In addition to Figure 5, which shows the mean absolute Shapley values for all datapoints, it is also possible to visualize the specific Shapley values of all variables for any given datapoint, as is shown in Figure 6. We believe that this is useful, not only because it can further elucidate the model’s predictions on a more micro-level, but also because it can help reveal the interaction between separate predictors.

Figure 6 shows the Shapley values of all variables used by the model for a randomly selected datapoint in the corpus (id = 141). We can observe that the general magnitude and order of the Shapley values largely conform to the mean absolute Shapley values shown in Figure 5, but here we see their signed values, not just their absolute ones. This means that we can also observe their direction: the value of the ‘time’ variable, viz., 1960s, seems to slightly push the prediction away from AAE realization, for example, while the other variables steer this particular prediction towards AAE realization. Again, the ‘word’ variable has the highest impact on the model’s output, closely followed by ‘AAE feature’. Conversely, ‘social group’ and ‘song type’ have comparatively little impact on the model’s prediction, as was the case with the mean absolute Shapley values in Figure 5. Note, however, that these Shapley values depend not only on the individual variable’s value, but also on the values of all other predictors. In other words, the model accounts for the interactions between the variables. We illustrate this by plotting the same figure for another, very similar datapoint, as is shown in Figure 7.

When comparing the Shapley values visualized in Figure 6 and Figure 7, we immediately observe that, although many variables have identical values (viz., ‘word’, ‘AAE feature’, ‘artist’, ‘social group’ and ‘time’), the Shapley values associated with those values differ between the two datapoints. In other words, the model’s prediction is affected not only by the values of the different predictors, but also by their mutual interaction. Of course, the magnitude with which these Shapley values vary based on variable interaction appears to be minimal, and the general tendencies we observed in the visualization of the mean absolute Shapley values in Figure 5 hold true here as well. By and large, ‘word’ and ‘AAE feature’ are the strongest predictors, while variables such as ‘song type’ and ‘social group’ have virtually no impact on the model’s prediction, even on the micro-level of individual datapoints.
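Why an identical predictor value can receive different Shapley values is easiest to see in a model with only two predictors, where the Shapley value reduces to the average of two marginal contributions. The sketch below is purely illustrative: the scorer, the words, and the size of the interaction are all hypothetical and are not derived from the trained model.

```python
def toy_predict(word, next_word):
    """Hypothetical scorer with an interaction between the two predictors."""
    score = 0.5
    if word == "just":
        score += 0.2
        if next_word == "can't":
            # Extra boost that only applies when both conditions hold.
            score += 0.2
    return score

def shapley_for_word(word, next_word, base_word="<other>", base_next="<other>"):
    """Exact Shapley value of 'word' in a two-predictor model."""
    alone = toy_predict(word, base_next) - toy_predict(base_word, base_next)
    joined = toy_predict(word, next_word) - toy_predict(base_word, next_word)
    # With two predictors, both orderings carry equal weight (1/2 each).
    return 0.5 * alone + 0.5 * joined

same_word_context_a = shapley_for_word("just", "can't")
same_word_context_b = shapley_for_word("just", "a")
print(same_word_context_a, same_word_context_b)
```

The value of the ‘word’ predictor is the same in both calls, yet its Shapley value is larger in the first context because part of the interaction effect is credited to it.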

Since the comparison between Figure 6 and Figure 7 has shown that the Shapley values of variables depend not only on the individual values, but also on the interaction between the different variables, it is worthwhile to examine how these Shapley values are distributed for a specific variable. We are specifically interested here in the distribution of Shapley values for the ‘AAE feature’ predictor to gauge the extent to which the observed features of AAE impact the model’s predictions. As mentioned in Section 3, the use of sociolinguistic variables often varies greatly depending on the salience and nature of the variable, so examining the distribution of Shapley values for the linguistic features that we examined can possibly help us gain insight into the complex dynamics at play here. Figure 8 accordingly shows boxplots of the Shapley values for each feature of AAE we examined in this study.

Figure 8 shows that the use of <ain’t> as an auxiliary verb replacing <isn’t> has, by far, the highest median Shapley value with the tightest distribution, though, as a likely result of interactions with other model variables, there are some outliers with lower values. The alveolar nasal realization of <ing> and /aɪ/ monophthongization also have relatively high medians and tight distributions, though these have considerably more outliers with lower, often even negative Shapley values. As could be expected based on our exploratory data analysis of the AAE features outlined before, zero copula has the lowest median Shapley value. Here too there are quite a few outliers, this time on the positive end. The deletion of third-person singular <s> has a very broad distribution with substantial variation, which is again not surprising considering that our descriptive statistics revealed this variable to be realized in only 40% of all cases.

The latter series of results provides us with preliminary insight into the adoption of AAE features by blues performers and their relative salience. The use of <ain’t> appears to be, at least in relative terms, ubiquitous in our corpus. We would argue that this can likely be attributed to the almost lexical value <ain’t> has as a grammatical marker of AAE, and nonstandard varieties of English more generally. In fact, this is precisely why we adopted the term lexico-grammatical earlier on in this paper. Conversely, the other two lexico-grammatical features we selected, which coincidentally have little to no lexical value, appear to have markedly lower median Shapley values. Their distributions, however, differ: zero copula has a very tight distribution, suggesting that it pushes the model’s predictions away from AAE realization rather consistently, while third-person singular <s> deletion has the broadest distribution of all selected features. In other words, there is a lot of variation in the Shapley values associated with this particular feature, and its realization likely depends highly on the other variables in the dataset.

The results for the phonological features are more consistent than is the case for the lexico-grammatical ones. By and large, all of these features have positive median Shapley values and relatively tight distributions. As mentioned before, alveolar nasal realizations of <ing> and /aɪ/ monophthongization appear to have many outliers, specifically on the negative end of the Shapley spectrum. In other words, while on average the model predicts these variables to be realized in an AAE manner, based on the values of other variables in the dataset, their predictions can often go towards non-AAE realization. Generally speaking, however, it appears that, based on both the descriptive statistics we discussed before as well as the decision tree model, the phonological features we observed are used more often, and more consistently, by the examined blues performers than the lexico-grammatical ones.

Overall, the results presented in this section are generally in line with our previous work on language use in blues music, and help reveal a discernible pattern of AAE features being used by performers of a variety of temporal and socio-cultural backgrounds. Additionally, this study has revealed that, in contrast with our twofold initial hypothesis, artists do not appear to draw on more features of AAE when covering other performers, but instead they seem to rely on the same set of features, though not always to the same degree (cf. Figure 8), in all of their studio performed blues music, regardless of song type. These findings further support the existence of an SBSS, and additionally help underline the marked and complex relationship between AAE and the blues genre.

6. Conclusions

The present article reported on an empirical study examining the language use of blues performers when singing, across time periods, socio-cultural groups, and, crucial to the focus of this paper, song types. These two song types, viz., originals and covers, we argue, deserve equal attention from sociolinguists, and should be considered as two separate empirical entities in the (socio)linguistic study of (popular) music. As mentioned before, while sociolinguistic scholars have certainly commented on the possible relevance of covers in passing (Beal 2009; Bell 2011; Coupland 2009, 2011; Simpson 1999), the goal of the present study was to systematically compare the similarities and differences in the use of AAE features by blues performers when performing either covers or their original work.

Our analysis showed that the examined blues performers seem to rely on the selected features of AAE to the same extent in both song types. In other words, at least in the case of the blues genre, performers do not seem to alter their singing style when covering another artist’s composition. The possible tension we discussed before, whereby artists might either be expected to emulate the original singing style more faithfully or instead choose to deviate from it to emphasize their own take on the covered item, does not seem to apply to the blues. However, while these results showed that the cover versus original distinction does not seem to be a meaningful one in the context of the blues, we would certainly not wish to claim that this is the case for many other musical genres. On the contrary, we would argue that additional comparisons of original versus cover performances in other genres, through methodologies similar to or different from the one adopted in the present study, may be incredibly valuable to the sociolinguistics of music.

In addition, while this study did not reveal a significant difference between cover and original songs, we believe that the consistency with which we observed blues singers of various time periods and socio-cultural backgrounds to rely on features of AAE across both performative modes is just as valuable from a sociolinguistic perspective. As we highlighted in Section 3, the stylistic language use of blues performers seems to operate somewhere along the axis of indexicality, on the one hand, and iconicity, on the other. In our previous work (De Timmerman et al. 2023), we predominantly relied on Silverstein’s (2003) notion of indexicality to conceptualize the connection between features of AAE and the blues genre. With reference to the highly localized context in which the blues first emerged, it seems plausible that particular features of AAE would be related not to the African American community, or even to specific traditional blues performers of African American descent, but instead to their status as important and authentic blues performers. This indexical link, we argue, may help clarify the prevalence of AAE features in the song lyrics of contemporary, non-African American blues performers.

At the same time, however, we find it hard to deny that the omnipresence of the observed AAE features in our corpus signifies that these linguistic patterns are simply constitutive of the blues as a genre. In other words, while the indexical interpretation provided above certainly makes sense intuitively when considering the initial burst of internationalization of the blues in the 1960s, in today’s more intensively globalized context, perhaps a more straightforward link between the examined AAE features and the genre is equally valuable. By building on Irvine and Gal’s (2000) concept of iconization, then, we might argue that the indexical relationship between AAE features and authenticity in the blues has become so emblematic of the genre that these features have simply become iconic, essentializing characteristics of (authentic) blues music. Put differently, perhaps one can assert that blues musicians simply sing in a certain way, and that the linguistic patterns we examined in this study are part of this singing style. Much like Wilson’s (2017) CCSS for choral singing and Gibson’s (2019) SPMSS for popular music, it hence seems that blues is characterized by an SBSS, and that certain features of AAE are an integral, iconic part of it.

The tension between indexicality and iconicity is an interesting avenue for further research, certainly for sociolinguistics in general, but also for the sociolinguistic study of music in particular. While a considerable body of scholarship, particularly in linguistic anthropology and variationist sociolinguistics, has been dedicated to examining the relationship between identifiable linguistic patterns and certain macro-sociological categories, it seems that the language use of musical performers is another valuable lens through which to explore this relationship, especially considering the inherent globalization and hybridity of many contemporary musical genres. In the specific context of the blues, it seems that tensions between processes of localization and globalization, on the one hand, relate in some way to indexical and iconic types of stylistic language use, on the other. These tensions might be present in other musical genres as well, and we would accordingly invite and encourage other sociolinguistic scholars interested in the study of music to contribute to this promising line of inquiry.

Author Contributions

Conceptualization, R.D.T. and S.S.; methodology, R.D.T.; software, R.D.T.; validation, R.D.T.; formal analysis, R.D.T.; investigation, R.D.T.; resources, R.D.T. and S.S.; data curation, R.D.T.; writing—original draft preparation, R.D.T.; writing—review and editing, R.D.T. and S.S.; visualization, R.D.T.; supervision, S.S.; project administration, R.D.T.; funding acquisition, R.D.T. and S.S. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by Stef Slembrouck’s research resources.

Institutional Review Board Statement

The study was approved by the Ethics Committee of Ghent University’s Faculty of Arts and Philosophy (reference number: 2024-18, approval date: 29 March 2024).

Informed Consent Statement

Not applicable.

Data Availability Statement

The tabular dataset and the list of songs with their metadata are publicly available at https://osf.io/tbm3d/?view_only=50ae007e230747efaa5043cabfc193ac (accessed on 14 March 2024) (De Timmerman 2024a, 2024b).

Conflicts of Interest

The authors declare no conflict of interest.

Notes

1	For some of the 1960s artists, we included songs which were recorded slightly before 1960 or slightly after 1969, because of the scarcity and/or low audio quality of available recordings from 1960 to 1969.
2	Bands were also included in the selection, but in these cases, we made sure that all included performances were sung by the same lead singer.
3	All Python code used for this project is publicly available at https://github.com/romeodetimmerman/aae-in-blues-slx_and_music (accessed on 14 March 2024).
4	The tabular dataset and the list of songs with their metadata are publicly available at https://osf.io/tbm3d/?view_only=50ae007e230747efaa5043cabfc193ac (accessed on 14 March 2024).
5	Nota bene: in the case of /r/, /t/, and /d/ deletion, ‘Realized’ means that the sound is, in fact, omitted, while ‘Not realized’ means that the sound is produced.
6	Although in machine learning, the term “feature” is more commonly used than “variable” or “predictor”, we do not adopt this terminological switch to avoid confusion with our ‘AAE feature’ variable.

References

Akiba, Takuya, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, and Masanori Koyama. 2019. Optuna: A Next-Generation Hyperparameter Optimization Framework. Paper presented at the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, August 4–8; pp. 2623–31.
Anderson, Bridget L. 2002. Dialect Leveling and /Ai/ Monophthongization among African American Detroiters. Journal of Sociolinguistics 6: 86–98.
Beal, Joan C. 2009. ‘You’re Not from New York City, You’re from Rotherham’: Dialect and Identity in British Indie Music. Journal of English Linguistics 37: 223–40.
Bell, Allan. 2011. Falling in Love Again and Again: Marlene Dietrich and the Iconization of Non-Native English. Journal of Sociolinguistics 15: 627–56.
Bell, Allan, and Andy Gibson. 2011. Staging Language: An Introduction to the Sociolinguistics of Performance. Journal of Sociolinguistics 15: 555–72.
Bridle, Marcus. 2018. Male Blues Lyrics 1920 to 1965: A Corpus Based Analysis. Language and Literature: International Journal of Stylistics 27: 21–37.
Bucholtz, Mary. 2007. Variation in Transcription. Discourse Studies 9: 784–808.
Bucholtz, Mary, and Kira Hall. 2005. Identity and Interaction: A Sociocultural Linguistic Approach. Discourse Studies 7: 585–614.
Bucholtz, Mary, and Qiuana Lopez. 2011. Performing Blackness, Forming Whiteness: Linguistic Minstrelsy in Hollywood Film. Journal of Sociolinguistics 15: 680–706.
Coupland, Nikolas. 2003. Sociolinguistic Authenticities. Journal of Sociolinguistics 7: 417–31.
Coupland, Nikolas. 2007. Style: Language Variation and Identity, 1st ed. Cambridge: Cambridge University Press.
Coupland, Nikolas. 2009. The Mediated Performance of Vernaculars. Journal of English Linguistics 37: 284–300.
Coupland, Nikolas. 2011. Voice, Place and Genre in Popular Song Performance. Journal of Sociolinguistics 15: 573–602.
Cusic, Don. 2005. In Defense of Cover Songs. Popular Music and Society 28: 171–77.
Cutler, Cecelia. 1999. Yorkville Crossing: White Teens, Hip Hop and African American English. Journal of Sociolinguistics 3: 428–42.
Cutler, Cecelia. 2003. The Authentic Speaker Revisited: A Look at Ethnic Perception Data from White Hip Hoppers. University of Pennsylvania Working Papers in Linguistics 9: 6.
Cutler, Cecelia. 2007. Hip-Hop Language in Sociolinguistics and Beyond. Language and Linguistics Compass 1: 519–38.
De Munck, Marlies. 2019. De vlucht van de nachtegaal: Een filosofisch pleidooi voor de muzikant. Borgerhout: Letterwerk. Available online: https://www.letterwerk.be/books/devluchtvandenachtegaal.html (accessed on 14 March 2024).
De Timmerman, Romeo. 2024a. aae-in-blues-slx_and_music. Public Dataset. Available online: https://osf.io/tbm3d/?view_only=50ae007e230747efaa5043cabfc193ac (accessed on 14 March 2024).
De Timmerman, Romeo. 2024b. aae-in-blues-slx_and_music. Github Repository. Available online: https://github.com/romeodetimmerman/aae-in-blues-slx_and_music/ (accessed on 14 March 2024).
De Timmerman, Romeo, Ludovic De Cuypere, and Stef Slembrouck. 2023. The Globalization of Local Indexicalities through Music: African-American English and the Blues. Journal of Sociolinguistics 28: 3–25.
Drummond, Rob. 2018. Maybe It’s a Grime [t]ing: th-Stopping among Urban British Youth. Language in Society 47: 171–96.
Eckert, Penelope. 2008. Variation and the Indexical Field. Journal of Sociolinguistics 12: 453–76.
Eckert, Penelope. 2012. Three Waves of Variation Study: The Emergence of Meaning in the Study of Sociolinguistic Variation. Annual Review of Anthropology 41: 87–100.
Eckert, Penelope. 2016. Third Wave Variationism. In Oxford Handbook Topics in Linguistics, 1st ed. Edited by Oxford Handbooks Editorial Board. Oxford: Oxford University Press.
Forman, Murray. 2002. The ‘Hood Comes First: Race, Space, and Place in Rap and Hip-Hop, 1st ed. Middletown: Wesleyan University Press.
Fridland, Valerie. 2003. ‘Tie, Tied and Tight’: The Expansion of /Ai/ Monophthongization in African-American and European-American Speech in Memphis, Tennessee. Journal of Sociolinguistics 7: 279–98.
Gal, Susan. 2005. Language Ideologies Compared: Metaphors of Public/Private. Journal of Linguistic Anthropology 15: 23–37.
Gal, Susan. 2013. Tastes of Talk: Qualia and the Moral Flavor of Signs. Anthropological Theory 13: 31–48.
Garley, Matt. 2018. Peaze Up! Adaptation, Innovation, and Variation in German Hip Hop Discourse. In Multilingual Youth Practices in Computer Mediated Communication. Edited by Cecelia Cutler and Unn Røyneland. Cambridge: Cambridge University Press, pp. 87–106.
Garley, Matt. 2019. Choutouts: Language Contact and US-Latin Hip Hop on YouTube. Publications and Research 11: 77–106.
Gibson, Andy. 2010. Production and Perception of Vowels in New Zealand Popular Music. Ph.D. dissertation, Auckland University of Technology, Auckland, New Zealand.
Gibson, Andy. 2011. Flight of the Conchords: Recontextualizing the Voices of Popular Culture. Journal of Sociolinguistics 15: 603–26.
Gibson, Andy. 2019. Sociophonetics of Popular Music: Insights from Corpus Analysis and Speech Perception Experiments. Ph.D. dissertation, University of Canterbury, Christchurch, New Zealand.
Gilbers, Steven. 2021. Ambitionz Az a Ridah: 2Pac’s Changing Accent and Flow in Light of Regional Variation in African-American English Speech and Hip-Hop Music. Ph.D. dissertation, University of Groningen, Groningen, The Netherlands.
Gilbers, Steven, Nienke Hoeksema, Kees de Bot, and Wander Lowie. 2020. Regional Variation in West and East Coast African-American English Prosody and Rap Flows. Language and Speech 63: 713–45.
Gleason, Ralph J. 1968. Can the White Man Sing the Blues? Jazz and Pop 7: 28–29.
Green, Lisa J. 2002. African American English: A Linguistic Introduction, 1st ed. Cambridge: Cambridge University Press.
Hancock, John T., and Taghi M. Khoshgoftaar. 2020a. CatBoost for Big Data: An Interdisciplinary Review. Journal of Big Data 7: 94.
Hancock, John T., and Taghi M. Khoshgoftaar. 2020b. Performance of CatBoost and XGBoost in Medicare Fraud Detection. Paper presented at 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA), Miami, FL, USA, December 14–17; pp. 572–79.
Hebdige, Dick. 2008. Subculture: The Meaning of Style. New Accents. London and New York: Routledge.
Irvine, Judith T. 2002. ‘Style’ as Distinctiveness: The Culture and Ideology of Linguistic Differentiation. In Style and Sociolinguistic Variation. Edited by John R. Rickford and Penelope Eckert. Cambridge: Cambridge University Press, pp. 21–43.
Irvine, Judith T., and Susan Gal. 2000. Language Ideology and Linguistic Differentiation. In Regimes of Language: Ideologies, Polities, and Identities. Edited by Paul V. Kroskrity. Santa Fe: School of American Research Press, pp. 35–84.
Jones, Leroi. 1999. Blues People: Negro Music in White America, 1st ed. New York: Harper Perennial.
Kim, Kwang-sup. 2022. Repair-by-Deletion/Insertion and the Distribution of the Copula in African American English. Lingua 269: 103203.
Labov, William. 1966. The Social Stratification of English in New York City, 2nd ed. Cambridge: Cambridge University Press.
Labov, William. 1968. A Study of the Non-Standard English of Negro and Puerto Rican Speakers in New York City. Volume II: The Use of Language in the Speech Community. Available online: https://eric.ed.gov/?id=ED028424 (accessed on 29 March 2024).
Li, Yazhe. 2020. Addressing Class Imbalance for Logistic Regression. Ph.D. dissertation, Imperial College London, London, UK. [Google Scholar] [CrossRef]
Lundberg, Scott, and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. arXiv:1705.07874.
MAXQDA. n.d. MAXQDA|All-In-One Qualitative & Mixed Methods Data Analysis Tool. Available online: https://www.maxqda.com/ (accessed on 14 March 2024).
McElfresh, Duncan, Sujay Khandagale, Jonathan Valverde, Vishak Prasad C., Ganesh Ramakrishnan, Micah Goldblum, and Colin White. 2023. When Do Neural Nets Outperform Boosted Trees on Tabular Data? Advances in Neural Information Processing Systems 36: 76336–69.
Milroy, James, and Lesley Milroy. 1978. Belfast: Change and Variation in an Urban Vernacular. In Sociolinguistic Patterns in British English. Edited by Peter Trudgill. London: Edward Arnold.
Mosser, Kurt. 2008. Cover Songs: Ambiguity, Multivalence, Polysemy. Popular Musicology Online 26. Available online: http://www.popular-musicology-online.com/issues/02/mosser.html (accessed on 14 March 2024).
Muchlinski, David, David Siroky, Jingrui He, and Matthew Kocher. 2016. Comparing Random Forest with Logistic Regression for Predicting Class-Imbalanced Civil War Onset Data. Political Analysis 24: 87–103.
Newkirk-Turner, Brandi L., and Lisa Green. 2016. Third Person Singular -s and Event Marking in Child African American English. Linguistic Variation 16: 103–30.
Newman, Michael. 2005. Rap as Literacy: A Genre Analysis of Hip-Hop Ciphers. Text—Interdisciplinary Journal for the Study of Discourse 25: 399–436.
Oommen, Thomas, Laurie G. Baise, and Richard M. Vogel. 2011. Sampling Bias and Class Imbalance in Maximum-Likelihood Logistic Regression. Mathematical Geosciences 43: 99–120.
Pedregosa, Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. 2011. Scikit-Learn: Machine Learning in Python. Journal of Machine Learning Research 12: 2825–30.
Plasketes, George. 2005. Re-flections on the Cover Age: A Collage of Continuous Coverage in Popular Music. Popular Music and Society 28: 137–61.
Plasketes, George, ed. 2016. Play It Again: Cover Songs in Popular Music. London: Routledge.
Prokhorenkova, Liudmila, Gleb Gusev, Aleksandr Vorobev, Anna Veronika Dorogush, and Andrey Gulin. 2019. CatBoost: Unbiased Boosting with Categorical Features. arXiv:1706.09516.
Rampton, Ben. 2009. Interaction Ritual and Not Just Artful Performance in Crossing and Stylization. Language in Society 38: 149–76.
Rampton, Ben. 2022. Language Crossing and the Problematisation of Ethnicity and Socialisation. Pragmatics. Quarterly Publication of the International Pragmatics Association (IPrA) 5: 485–513.
Rudinow, Joel. 1994. Race, Ethnicity, Expressive Authenticity: Can White People Sing the Blues? The Journal of Aesthetics and Art Criticism 52: 127.
Silverstein, Michael. 2003. Indexical Order and the Dialectics of Sociolinguistic Life. Language & Communication 23: 193–229.
Simpson, Paul. 1999. Language, Culture and Identity: With (Another) Look at Accents in Pop and Rock Singing. Multilingua—Journal of Cross-Cultural and Interlanguage Communication 18: 343–68.
Squires, Lauren. 2019. Genre and Linguistic Expectation Shift: Evidence from Pop Song Lyrics. Language in Society 48: 1–30.
Taylor, Paul Christopher. 1995. … So Black and Blue: Response to Rudinow. The Journal of Aesthetics and Art Criticism 53: 313.
Thomas, Erik R. 2007. Phonological and Phonetic Characteristics of African American Vernacular English: Phonological and Phonetic Characteristics of AAVE. Language and Linguistics Compass 1: 450–75.
Trudgill, Peter. 1983. Acts of Conflicting Identity: The Sociolinguistics of British Pop-Song Pronunciation. In Sociolinguistics: A Reader. Edited by Nikolas Coupland and Adam Jaworski. London: Macmillan Education UK, pp. 251–65.
van den Goorbergh, Ruben, Maarten van Smeden, Dirk Timmerman, and Ben Van Calster. 2022. The Harm of Class Imbalance Corrections for Risk Prediction Models: Illustration and Simulation Using Logistic Regression. Journal of the American Medical Informatics Association 29: 1525–34.
Walker, James A. 2005. The Ain’t Constraint: Not-Contraction in Early African American English. Language Variation and Change 17: 1–17.
Weinstein, Deena. 2010. Appreciating Cover Songs: Stereophony. In Play It Again: Cover Songs in Popular Music. Edited by George Plasketes. London: Routledge.
Wilson, Guyanne. 2017. Conflicting Language Ideologies in Choral Singing in Trinidad. Language & Communication 52: 19–30.
Wolfram, Walt. 2007. Sociolinguistic Folklore in the Study of African American English. Language and Linguistics Compass 1: 292–313.
Wolfram, Walt, and Mary E. Kohn. 2015. Regionality in the Development of African American English. In The Oxford Handbook of African American Language. Edited by Jennifer Bloomquist, Lisa J. Green and Sonja L. Lanehart. Oxford: Oxford University Press.
Young, James O. 2008. Cultural Appropriation and the Arts, 1st ed. Hoboken: Wiley.

Figure 1. Mean AAE realizations by artist and by group. Dots represent individual artist means; dashed vertical lines show group means. Abbreviations are ‘AA’ for African American; ‘nonAA_US’ for non-African American, US-based; and ‘nonAA_nonUS’ for non-African American, non-US-based.

Figure 2. Point plots of mean AAE realizations by variable, group, and song type. The color blue is used for cover songs and orange for originals. Horizontal lines show 95% confidence intervals. Abbreviations are ‘AA’ for African American; ‘nonAA_US’ for non-African American, US-based; and ‘nonAA_nonUS’ for non-African American, non-US-based.

Figure 3. Point plots of mean AAE realizations by artist, group, and song type. The color blue is used for cover songs, orange for originals. Horizontal lines show 95% confidence intervals. Abbreviations are ‘AA’ for African American; ‘nonAA_US’ for non-African American, US-based; and ‘nonAA_nonUS’ for non-African American, non-US-based.

Figure 4. Confusion matrix for model evaluation. Displayed counts are based on model predictions on the test set (15% of data, not seen by model during training).

Figure 5. Mean absolute Shapley values for all variables. All values are in log-odds space.

Figure 6. Shapley values for datapoint id = 141. The base value, E[f(X)], represents the average model prediction, while f(X) indicates the prediction for this specific datapoint. E[f(X)] plus the sum of the Shapley values of the individual features equals f(X). All values are in log-odds space.

Figure 7. Shapley values for datapoint id = 262. The base value, E[f(X)], represents the average model prediction, while f(X) indicates the prediction for this specific datapoint. E[f(X)] plus the sum of the Shapley values of the individual features equals f(X). All values are in log-odds space.
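
To make the log-odds reading of Figures 6 and 7 concrete, the following minimal Python sketch shows how a single prediction can be rebuilt from Shapley values and then converted to a probability. In an additive Shapley explanation, the prediction for one datapoint is the base value plus the per-variable contributions. The base value, the variable names, and all numbers below are hypothetical placeholders chosen for illustration; they are not taken from the figures or from the fitted model.

```python
import math


def sigmoid(log_odds: float) -> float:
    """Convert a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical base value E[f(X)] in log-odds space (not from the paper).
base_value = 1.0

# Hypothetical per-variable Shapley values for one datapoint (not from the paper).
shap_values = {
    "aae_feature": 0.8,
    "artist_group": -0.3,
    "time_period": 0.1,
    "song_type": 0.0,
}

# Additivity: base value plus all contributions gives f(X) in log-odds space.
prediction_log_odds = base_value + sum(shap_values.values())

# The log-odds prediction maps to the probability that the feature is realized.
prediction_probability = sigmoid(prediction_log_odds)

print(f"f(X) in log-odds: {prediction_log_odds:.2f}")
print(f"f(X) as probability: {prediction_probability:.3f}")
```

A contribution of 0.0, as for the placeholder `song_type` here, leaves the prediction unchanged, which is how a variable with no meaningful impact would appear in such a plot.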

Figure 8. Boxplots of Shapley values for each AAE feature. Orange lines show median value. Dots indicate outliers. All values are in log-odds space.

Table 1. All artists by time period (1960s, 1980s, and 2010s) and social group (African American; Non-African American, US-based; Non-African American, non-US-based).

Time period	African American	Non-African American, US-Based	Non-African American, Non-US-Based
1960s	Albert King; B.B. King; Freddie King; Jimi Hendrix; Muddy Waters	Allman Brothers Band; Canned Heat; Janis Joplin; Paul Butterfield; Steve Miller	Cream; Fleetwood Mac; Rolling Stones; Savoy Brown; Ten Years After
1980s	Albert Collins; John Lee Hooker; Koko Taylor; Luther Allison; Robert Cray	Bonnie Raitt; Fabulous Thunderbirds; J. J. Cale; Robben Ford; Stevie Ray Vaughan	Eric Clapton; Jeff Healey; John Mayall; Rory Gallagher; The Blues Band
2010s	Eric Gales; Gary Clark Jr.; Kingfish; Kirk Fletcher; Shemekia Copeland	Ally Venable; Joe Bonamassa; Matt Schofield; Philip Sayce; Samantha Fish	Dani Wilde; Dan Patlansky; Davy Knowles; Joanne Shaw Taylor; Tiny Legs Tim

Table 2. Absolute and relative frequencies of AAE realizations by feature. ‘Realized’ indicates that the AAE feature is used; ‘Not realized’ indicates that it is not used (see Note 5).

Feature	Not Realized	Realized
/aɪ/ monophthongization	1162 (20%)	4748 (80%)
post-vocalic /r/ deletion	876 (29%)	2191 (71%)
post-consonantal /d/ deletion	617 (30%)	1414 (70%)
alveolar nasal in <ing> ultimas	147 (11%)	1227 (89%)
post-consonantal /t/ deletion	803 (40%)	1198 (60%)
auxiliary verb ain’t	3 (1%)	251 (99%)
third-person singular <s> deletion	139 (60%)	94 (40%)
zero copula	261 (83%)	53 (17%)
Total	4008 (26%)	11176 (74%)
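
The relative frequencies in Table 2 follow directly from the absolute counts. As an illustrative check, the short Python sketch below recomputes the share of AAE realizations per feature and overall. The counts are copied from the table; the variable and function names are our own and purely illustrative.

```python
# Counts copied from Table 2: feature -> (not realized, realized).
counts = {
    "/aɪ/ monophthongization": (1162, 4748),
    "post-vocalic /r/ deletion": (876, 2191),
    "post-consonantal /d/ deletion": (617, 1414),
    "alveolar nasal in <ing> ultimas": (147, 1227),
    "post-consonantal /t/ deletion": (803, 1198),
    "auxiliary verb ain't": (3, 251),
    "third-person singular <s> deletion": (139, 94),
    "zero copula": (261, 53),
}


def realized_share(not_realized: int, realized: int) -> float:
    """Proportion of tokens in which the AAE variant is realized."""
    return realized / (not_realized + realized)


# Per-feature shares and column totals.
shares = {feature: realized_share(n, r) for feature, (n, r) in counts.items()}
total_not_realized = sum(n for n, _ in counts.values())
total_realized = sum(r for _, r in counts.values())
overall_share = realized_share(total_not_realized, total_realized)

for feature, share in shares.items():
    print(f"{feature}: {share:.0%} realized")
print(f"Total: {overall_share:.0%} realized")
```

Summing the columns reproduces the totals in the last row of the table (4008 and 11176), and the per-feature shares match the reported percentages after rounding.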

Table 3. Classification report for model evaluation. Precision highlights the impact of false positives. Recall highlights the impact of false negatives. F1-score is the harmonic mean of precision and recall.

Class	Precision	Recall	F1-Score
AAE feature not realized	0.82	0.77	0.80
AAE feature realized	0.92	0.94	0.93
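
As a worked illustration of the harmonic mean mentioned in the caption of Table 3, the Python sketch below recomputes the F1-score from the reported precision and recall values. The function name is our own. Because the inputs are the rounded figures from the table, the result for the ‘not realized’ class comes out at about 0.794 rather than the reported 0.80; that gap is within what rounding of precision and recall alone could produce.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall values as reported in Table 3.
reported = {
    "AAE feature not realized": (0.82, 0.77),
    "AAE feature realized": (0.92, 0.94),
}

f1_by_class = {label: f1_score(p, r) for label, (p, r) in reported.items()}

for label, f1 in f1_by_class.items():
    print(f"{label}: F1 = {f1:.3f}")
```

The harmonic mean is pulled towards the lower of the two inputs, so a class can only score a high F1 when both false positives and false negatives are rare.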

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

