Proceeding Paper

The Effect of Auditory Perceptual Training by Online Computer Software on English Pronunciation †

by
Ching-Wen (Felicia) Wang
The Language Center, Chaoyang University of Technology, Taichung 412005, Taiwan
† Presented at the 2024 IEEE 4th International Conference on Electronic Communications, Internet of Things and Big Data, Taipei, Taiwan, 19–21 April 2024.
Eng. Proc. 2024, 74(1), 4; https://doi.org/10.3390/engproc2024074004
Published: 26 August 2024

Abstract

Pronunciation is crucial to L2 learning, yet speech proficiency is difficult to achieve. Class time constraints make demonstration–imitation pronunciation teaching methods less effective, even after repeated practice. Research suggests that pronunciation involves motor control, that auditory preparation enhances accuracy, and that learners produce more accurate pronunciation after perceiving accurate target sounds. This study proposes that the perception of accurate L2 target sounds enhances pronunciation. To test this proposal, the study employed online auditory training tasks for English learners enrolled at a private university in Taiwan. The results showed that auditory training led to positive learning outcomes.

1. Introduction

Pronunciation courses in second language learning are often overlooked, partly due to teachers’ lack of appropriate pronunciation training [1]. If second language learners do not receive proper pronunciation training from the beginning of their studies, they may learn to mispronounce words or to speak with a native-language accent, which can affect a listener’s comprehension of [2] and trust in [3] a speaker. Flege’s [4] Speech Learning Model explains that if the sounds of a learner’s native language and second language correspond completely, there is no learning difficulty. However, when native and second language sounds are similar but not identical, they are more difficult to learn than entirely new sounds.
Pronunciation teaching in classrooms is usually incidental: errors are corrected only when they occur during vocabulary or sentence reading. Systematic pronunciation teaching, and its formal inclusion in regular courses, is often lacking. Furthermore, many pronunciation actions involve the control and coordination of muscles that are not easily observable, such as those of the mouth, nose, throat, and chest. Cognitive neuroscience research suggests that producing speech smoothly requires activating the organs of speech and coordinating multiple perceptual channels (such as auditory, visual, and body perception) and imitation functions [5,6].
To verify the hypothesis that auditory perception and pronunciation actions are complementary, this study developed a series of online listening tasks to train the auditory perception of low-achieving learners. The goal was to develop a teaching method for improving learners’ pronunciation accuracy while demonstrating that understanding how students learn pronunciation will help teachers design effective pronunciation courses.

2. Literature Review

2.1. Motor Theory

Cognitive neuroscience regards verbal expression as a form of motor control. Guenther and Vladusich [7] suggest that understanding the action of speech requires understanding all the brain regions involved in it. They integrated auditory and somatosensory perception with the brain’s motor cortex into what is collectively called the speech–motor control system. This system encompasses all speech actions, from simple vocalizations to complex lexical items [8,9,10]. The internal forward model in motor control theory describes the relationship between the brain’s motor commands and the effector organs [11]. For instance, when the brain issues a motor command, it simultaneously generates an efference copy to predict the result of the speech action. If the actual result differs from the efference copy, the brain corrects the command and reissues it. Hickok [12] proposed the Hierarchical State Feedback Control (HSFC) model, which explains the general language production process and illustrates the importance of the auditory feedback system for pronunciation actions. It suggests that once speakers know what concept to express, they enter a coding stage to search for vocabulary, followed by the phonetic and auditory prediction stages. When the vocal cords emit a syllable, the auditory system feeds the phoneme back to the speech action system. Auditory checking predicts the expected phonetic result of the pronunciation action and verifies the pronunciation accuracy of the target sound. If the predicted phonetic result differs from the result actually heard, the goal is not achieved, and auditory feedback must assist in correcting the pronunciation action. Thus, in the HSFC model, “predicted heard pronunciation” is the goal of speech action.
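The predict, compare, and correct cycle described above can be illustrated with a toy error-correction loop. This is only a sketch under strong simplifying assumptions: a single number stands in for an acoustic feature of the target sound, and the function name, `gain`, and `tolerance` are hypothetical choices for illustration, not part of the forward model or the HSFC model.

```python
def correct_toward_target(target, produced, gain=0.5, tolerance=0.01, max_steps=50):
    """Toy loop: compare the predicted (target) sound with the heard output and correct.

    All quantities are hypothetical scalars standing in for an acoustic feature.
    """
    steps = 0
    while abs(target - produced) > tolerance and steps < max_steps:
        error = target - produced   # mismatch between the prediction and auditory feedback
        produced += gain * error    # reissue a partially corrected command
        steps += 1
    return produced, steps


# Start far from the target and let feedback shrink the mismatch.
final_output, n_steps = correct_toward_target(target=1.0, produced=0.0)
print(f"output {final_output:.4f} after {n_steps} corrections")
```

The loop stops only when the heard output is close enough to the prediction, which mirrors the idea that the predicted heard pronunciation acts as the goal of the speech action.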

2.2. Auditory Perception Guiding Speech Action

Early experiments demonstrated the relationship between hearing and speech, and studies on motor learning and motor control showed how auditory perception influences speech actions. For example, the Lombard effect [13] describes how people naturally raise their speech volume in noisy environments so that they can confirm the content of their speech, because their own voice is harder to hear clearly against a loud background. Houde and Jordan [14] had participants attempt to pronounce words containing the /ε/ sound (such as “peb”) while playing words with the /i/ sound (such as “peeb”) through their headphones. The results showed that participants adjusted their speech actions based on the sounds they heard, and the same phenomenon occurred when different vowels were substituted. These phenomena demonstrate that the brain uses auditory perception to both monitor and check the accuracy of speech actions.

2.3. Auditory Illusions in Everyday Life

Auditory illusions are perceptual illusions that often occur in real life. People interpret the events in their environments based on their previous experiences and existing knowledge, and this cognitive processing is common to both vision and hearing [15]. Neuroscience researchers have investigated the brain’s neural responses to continuous sound segments [16]. For example, when a scale that rises gradually from a low pitch to a pitch one octave higher is played on repeat, the brain perceives the melody as rising continuously without end and does not perceive the return to the lower pitch. This phenomenon is known as the Shepard tone illusion. Deutsch [17] played the words “no way” in a loop and asked the subjects to report what they heard. The subjects’ interpretations varied and included “no brain”, “no time”, and “runaway”. She called these variations phantom words and explained not only that knowledge, beliefs, and expectations influence what we hear but also that what we think we hear may not be what we actually hear.
In addition, auditory illusions of entire sentences may occur. Warren [18] found that the brain’s cognitive system tends to interpret unknown situations using known conditions. For example, when playing the sentence “The state governors met with their respective legislatures convening in the capital city” for subjects, he used a cough to cover the first /s/ sound in “legislatures”, which produced “legi(cough)latures”. However, the subjects did not notice the omission of the phoneme /s/ and could not even determine where in the sentence the cough had occurred. Scholars call this phenomenon phonemic restoration, which shows that our brains interpret what we hear based on our experiences and knowledge.

2.4. Native Language Interference in Second Language Pronunciation Accuracy

Regardless of whether a student is learning their native language or a second language, they need to go through the process of learning language production. However, during second language learning, a learner’s native language may interfere with the operation of their auditory feedback mechanism. For example, since Japanese pronunciation does not include the English /r/ sound, when native speakers of Japanese hear the English syllable /ra/ they associate it with /la/, thereby not only confusing their auditory feedback but also leading to pronunciation errors. Thus, as prior research suggests, language experience affects speech perception [19]. In addition, if the brain’s “predicted heard pronunciation” affects the accuracy of pronunciation, which the HSFC model implies, then how the auditory system stores phonetic data also becomes a key factor affecting how students learn speech production.

2.5. Auditory Identification Training Enhances Pronunciation Accuracy

Teachers often use their existing language experience as the basis for establishing and carrying out classroom pronunciation lessons. However, cognitive science suggests that training learners’ listening ability should be the basis for training their pronunciation [20]. Alves and Luchini [20] trained the listening abilities of three groups of adult English learners from Argentina to determine the accuracy of their pronunciation development. The subjects in the first and second groups participated in all the listening training tasks. In addition, the second group received extra guidance on the target sound. However, the third group received no listening training at all. All three groups of subjects recorded their pronunciations of the target sounds before and after the listening training. The subjects also took a delayed test one month after the study to determine any longer-term effects of the training. The results showed that the pronunciation of the two groups of subjects who received the listening training improved significantly, and the learning effect continued for one month. In addition, the second group, which received extra guidance on the target sound, performed better than the groups that did not receive the extra guidance. The researchers inferred that listening training effectively improves learners’ pronunciation accuracy. Therefore, their study encourages pronunciation teaching to enhance learners’ awareness of target sound learning.

2.6. Summary

Researchers in the field of education have long been committed to improving teaching methods. Scientists, meanwhile, have explored the most effective methods of education from the perspective of learning and inferred teaching methods from the outcomes of such studies. This study took the latter approach: it examined the situation of second language learners from a scientific perspective and established a corresponding online learning platform with interactive online exercises that give learners a direction for self-study while also stimulating their interest in learning. This collection of practice materials for second language learners, coupled with classroom listening and spelling activities that check learning output in real time, lays a foundation for teaching pronunciation to low-achieving freshmen.

3. Research Method

Low-achieving freshmen in the vocational education system often lack learning strategies and exposure to diverse, interest-inducing teaching methods. These deficiencies make it more difficult to improve learning effectiveness. For example, even though pronunciation is fundamental to increasing second language vocabulary, particularly in phonetic languages, teaching curricula often overlook it because of a lack of effective and diverse teaching methods. This study aims to improve this situation by testing a hypothesis based on cognitive science research: that actively improving auditory perception can enhance pronunciation accuracy. The participants were therefore freshmen in the vocational education system who had low English proficiency and who practiced phonetic listening. Based on the preceding discussion, this study explored the impact of auditory perceptual training on learner pronunciation accuracy.

3.1. Participants

The curriculum was implemented in a basic English class of freshmen whose level of achievement in English was in the lowest 5% at a private university of technology in Taichung, Taiwan. The class consisted of about 45 students, including a few with reading or physical disabilities that excluded them from participating in this study. In general, these students’ learning motivation was also low, partly because they had not been able to discover personally effective learning strategies. According to the university’s statistics, each year about 5% of basic-level students fail to pass the minimum English graduation threshold (CEFR A2) required by their departments.

3.2. Research Design and Procedure

This study constructed an online training system to support learners’ pronunciation learning before and after each two-hour class. Figure 1 shows the experimental procedure. The read-and-record English pronunciation pre-test was conducted in the second week of the semester, before the formal course began, and the participants took the same test in their final week of training. The pre- and post-tests did not include feedback. All the learners watched a 10-min video on the rules of pronunciation on the platform and performed listen-and-spell tasks using an alphabetical card set provided to each student. The in-class listen-and-spell activities consolidated their knowledge of the rules of pronunciation. Every day after the second week of class, the students performed either listen-and-identify or listen-and-record tasks, which helped them learn the rules of pronunciation. Each stage of the pronunciation rule training lasted six days, with daily sessions of 10–15 min. After the third week, the first hour of class involved listen-and-spell exercises for guided pronunciation practice. The remaining class hour covered local culture and material from the class textbook.
The students began each online training session by watching a micro-course video devoted to a rule of pronunciation. After watching the video, the students listened to approximately 90 words that exemplified the day’s pronunciation rules. Over the 10-week training period, the online system taught 939 vocabulary words, which the students used to complete the listen-and-identify or listen-and-repeat tasks. The students had already learned half of the words on the pre-test (old words). The post-test also contained new words that the training sessions did not include. These unfamiliar words tested whether the learned sounds generalized to new words. The group completing the listen-and-identify tasks received immediate feedback on their answers, while the group completing the listen-and-repeat (record) tasks received no feedback. The system recorded the learners’ pronunciation performance (either their identifications or their recordings) and allowed the teacher to evaluate the effectiveness of the online pronunciation-learning process.
The study adopted an experimental teaching method to improve the learners’ pronunciation accuracy. All three groups (A, B, and C) received ten online pronunciation micro-course videos, as well as listening- and spelling-teaching activities in class. Experimental Groups A and B received the auditory perceptual training and the auditory perceptual training with recording, respectively. Control Group C received only the pronunciation micro-course videos and the in-class listening- and spelling-teaching activities. The training method of Group B simulated a situation in which the students learned by themselves or spoke as soon as they heard the demonstration sound in class. Only the students who completed the pre-test, the training, and the post-test were included in the analysis, which left 12 students in Group A, 18 in Group B, and 13 in Group C.

3.3. Instruments and Scoring

The English pronunciation test content included common monosyllabic words with single vowels (such as a in ‘bat’), diphthongs (such as a-e in ‘game’), simple consonants (such as d in ‘did’), and compound consonants (such as -nk in ‘pink’). The test platform displayed 71 vocabulary words, and the students recorded their pronunciation of each word on the platform. All the recordings were scored manually; that is, human assessors judged all the English pronunciation tests by listening to them. A recorded syllable that was judged to match its target sound received 1 point; otherwise, it received 0 points. Each student’s pre-test and post-test recordings were presented in random order, so the judging listener did not know whether a syllable came from a pre-test or a post-test recording file. Once the judges had assessed all the recordings, the progress score was statistically analyzed and presented as a percentage. Paired-samples t-tests checked the pre- and post-test results for any significant improvement in pronunciation-learning performance.
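The scoring procedure above can be sketched in a few lines. This is a minimal illustration rather than the platform’s actual implementation: the function names, the sample words, and the judgments are all hypothetical, and the sketch assumes the percentage is simply the share of items judged correct.

```python
import random


def blind_presentation_order(recordings, seed=None):
    """Shuffle pre- and post-test recordings together so order does not reveal the phase."""
    rng = random.Random(seed)
    shuffled = list(recordings)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled


def percent_correct(judgments):
    """Turn binary judgments (1 = matches the target sound, 0 = does not) into a percentage."""
    if not judgments:
        raise ValueError("at least one judgment is required")
    if any(j not in (0, 1) for j in judgments):
        raise ValueError("judgments must be 0 or 1")
    return 100.0 * sum(judgments) / len(judgments)


# Hypothetical recordings for one student: (phase, word) pairs.
words = ["bat", "game", "did", "pink"]
recordings = [(phase, word) for phase in ("pre", "post") for word in words]
queue = blind_presentation_order(recordings, seed=1)

# Hypothetical judgments, regrouped by phase after the blind scoring is finished.
pre_judgments = [1, 0, 0, 1]
post_judgments = [1, 1, 0, 1]
progress = percent_correct(post_judgments) - percent_correct(pre_judgments)
print(f"pre {percent_correct(pre_judgments):.1f}%, "
      f"post {percent_correct(post_judgments):.1f}%, progress {progress:+.1f} points")
```

Keeping the phase label out of what the assessor sees, and attaching it again only after scoring, is what prevents expectations about improvement from influencing the judgments.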

4. Results

This study investigated whether listen-and-identify perceptual training achieves better pronunciation accuracy. Paired-samples t-tests compared the pre- and post-test scores within each group to evaluate the efficacy of the English pronunciation training tasks (independent variables) on the post-test scores (dependent variable). The tests revealed significant improvement in all three groups at the 0.05 significance level. In Group A, the mean pre-test score was 38.3 (SD = 24.8) and the mean post-test score was 71.5 (SD = 28.4), t(11) = −3.75, p = 0.003, indicating a statistically significant enhancement in pronunciation ability. Similarly, in Group B, the mean pre-test score was 54.1 (SD = 25.3) and the mean post-test score was 76.8 (SD = 20.6), t(17) = −3.85, p = 0.001, demonstrating a significant improvement. Finally, in Group C, the mean pre-test score was 45.5 (SD = 24.4) and the mean post-test score was 73.1 (SD = 12.1), t(12) = −4.01, p = 0.002, also signifying significant advancement in English pronunciation. These findings underscore the positive impact of the pronunciation training program across the groups.
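For readers who want to see how a paired-samples t statistic of this kind is computed, the following sketch uses only the Python standard library. The scores in it are hypothetical and are not the study’s data; the pre-minus-post direction of the difference is an assumption chosen because it yields negative t values when scores improve, as in the reported results.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for a paired-samples t-test on pre-minus-post differences."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    if len(pre) < 2:
        raise ValueError("at least two pairs are required")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical percentage scores for four learners (not the study's data).
pre = [40, 50, 60, 30]
post = [70, 65, 80, 55]
t, df = paired_t(pre, post)
print(f"t({df}) = {t:.2f}")

# The degrees of freedom implied by the group sizes given in Section 3.2.
group_sizes = {"A": 12, "B": 18, "C": 13}
degrees_of_freedom = {group: n - 1 for group, n in group_sizes.items()}
print(degrees_of_freedom)
```

The degrees of freedom of 11, 17, and 12 that follow from the group sizes agree with the reported t(11), t(17), and t(12).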

5. Discussion and Conclusions

The results of this study indicated that the training program effectively enhanced the participants’ English pronunciation abilities. The motor theory posits that the motor system plays a crucial role in speech perception. The participants in Group A practiced listen-and-identify skills without speaking. Their post-test scores improved significantly after training tasks that involved identifying words with the target sounds, which tuned the students’ perceptual representations. Group B, on the other hand, practiced listen-and-repeat skills that involved reacting immediately after listening to the target sounds. Because the procedure required the participants to listen before speaking and to pronounce the target sounds accurately, they had to maintain accurate auditory representations. Hence, Group B exemplified another form of perceptual training. Group C was the control group in the study, yet its test results also showed significant improvement. One possible interpretation of this result is that the in-class listen-and-spell activities effectively contributed to their learning. Thus, even though Group C did not receive additional online training, visual aids nevertheless played a crucial role in their learning. The results of all three groups suggest that the students engaged in perceptual training with every task. Auditory representation and speech have a close relationship, so designing a pronunciation course requires training the learners to perceive the target sound accurately. While the common approach to teaching pronunciation is demonstration and repetition, the listen-and-identify approach is an effective alternative strategy that reinforces accurate pronunciation outcomes. The listen-and-spell activity also helped consolidate the learners’ knowledge of the target sounds while making learning more enjoyable.
This study was preliminary: the number of participants in the three groups was small, and a traditional control group was not feasible because the teacher had to interact with all the students. Consequently, further research is necessary to verify and generalize the results of this study.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Camus-Oyarzun, P.A. The Effects of Pronunciation Instruction on the Production of Second Language Spanish: A Classroom Study. Ph.D. Thesis, Georgetown University, Washington, DC, USA, 2016.
  2. Munro, M.J.; Derwing, T.M. Processing time, accent, and comprehensibility in the perception of native and foreign-accented speech. Lang. Speech 1995, 38, 289–306.
  3. Rubin, D.L.; Smith, K.A. Effects of accent, ethnicity, and lecture topic on undergraduates’ perceptions of nonnative English-speaking teaching assistants. Int. J. Intercult. Relat. 1990, 14, 337–353.
  4. Flege, J.E. Second-language speech learning: Theory, findings and problems. In Speech Perception and Linguistic Experience: Theoretical and Methodological Issues; Strange, W., Ed.; York Press: Timonium, MD, USA, 1995.
  5. Guenther, F.H. Speech sound acquisition, coarticulation, and rate effects in a neural network model of speech production. Psychol. Rev. 1995, 102, 594–621.
  6. Tremblay, S.; Shiller, D.M.; Ostry, D.J. Somatosensory basis of speech production. Nature 2003, 423, 866–869.
  7. Guenther, F.H.; Vladusich, T. A neural theory of speech acquisition and production. J. Neurolinguistics 2012, 25, 408–422.
  8. Fiez, J.A.; Petersen, S.E. Neuroimaging studies of word reading. Proc. Natl. Acad. Sci. USA 1998, 95, 914–921.
  9. Turkeltaub, P.E.; Eden, G.F.; Jones, K.M.; Zeffiro, T.A. Meta-analysis of the functional neuroanatomy of single-word reading: Method and validation. NeuroImage 2002, 16, 765–780.
  10. Argyropoulos, G.P.D. The cerebellum, internal models, and prediction in ‘non-motor’ aspects of language: A critical review. Brain Lang. 2016, 161, 4–17.
  11. Hickok, G. The architecture of speech production and the role of the phoneme in speech processing. Lang. Cogn. Process. 2014, 29, 2–20.
  12. Lane, H.; Tranel, B. The Lombard sign and the role of hearing in speech. J. Speech Hear. Res. 1971, 14, 677–709.
  13. Houde, J.F.; Jordan, M.I. Sensorimotor adaptation in speech production. Science 1998, 279, 1213–1216.
  14. Hawkins, S. Phonological features, auditory objects, and illusions. J. Phon. 2010, 38, 60–89.
  15. Shimizu, Y.; Umeda, M.; Mano, H.; Aoki, I.; Higuchi, T.; Tanaka, C. Neuronal response to Shepard’s tones. An auditory fMRI study using multifractal analysis. Brain Res. 2007, 1186, 113–123.
  16. Deutsch, D. Phantom Words. 2013. Available online: http://deutsch.ucsd.edu/psychology/pages.php?i=211 (accessed on 13 February 2017).
  17. Warren, R.M. Perceptual restoration of missing speech sounds. Science 1970, 167, 392–393.
  18. Zhang, Y.; Kuhl, P.K.; Imada, T.; Kotani, M.; Tohkura, Y. Effects of language experience: Neural commitment to language-specific auditory patterns. NeuroImage 2005, 26, 703–720.
  19. Wang, Y.; Jongman, A.; Sereno, J.A. Acoustic and perceptual evaluation of Mandarin tone productions before and after perceptual training. J. Acoust. Soc. Am. 2003, 113, 1033–1043.
  20. Alves, U.K.F.; Luchini, P.L. Effects of perceptual training on the identification and production of word-initial voiceless stops by Argentinean learners of English. Ilha Do Desterro 2017, 70, 15–32.
Figure 1. Research design and procedure.
