Article

Lexical Error Guard: Leveraging Large Language Models for Enhanced ASR Error Correction

by Mei Si 1,*, Omar Cobas 1,2 and Michael Fababeir 1,2
1 Department of Cognitive Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
2 Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
* Author to whom correspondence should be addressed.
Mach. Learn. Knowl. Extr. 2024, 6(4), 2435-2446; https://doi.org/10.3390/make6040120
Submission received: 1 August 2024 / Revised: 25 September 2024 / Accepted: 13 October 2024 / Published: 29 October 2024

Abstract

Error correction is a vital component of modern automatic speech recognition (ASR) systems. Much prior ASR error correction work is tightly integrated with a specific ASR system, which makes these solutions difficult to adapt to other ASR frameworks. This research introduces Lexical Error Guard (LEG), which leverages the extensive pre-trained knowledge of large language models (LLMs) and employs instructional learning to create an adaptable error correction system compatible with various ASR platforms. In addition, a parameter-efficient fine-tuning method, quantized low-rank adaptation (QLoRA), is used to enable fast training of the system. Evaluated on the LibriSpeech corpus, LEG improves ASR results across various Whisper model sizes: with beam search, Whisper Large's word error rate (WER) decreases from 2.27% to 2.21% on the "Test Clean" set and from 4.93% to 4.72% on the "Test Other" set.
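The WER figures quoted above are the standard word-level Levenshtein distance between the hypothesis and the reference, normalized by reference length. A minimal self-contained sketch (not the authors' evaluation code, which the abstract does not specify):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table: d[i][j] = edit distance between
    # the first i reference words and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, `wer("the cat sat", "the bat sat")` yields 1/3, since one of three reference words is substituted.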
Keywords: ASR error correction; large language models; instructional learning; QLoRA
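The QLoRA setup described in the abstract (4-bit quantization of the base LLM plus low-rank adapters) is commonly expressed with the Hugging Face `transformers` and `peft` libraries. The following is a hedged configuration sketch only; the paper's actual base model, rank, and target modules are not given in the abstract, so the values below are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Base LLM name is a placeholder, not the model used in the paper.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
)

# Low-rank adapters (the "LoRA" part); rank and targets are assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Only the small adapter matrices receive gradients, which is what makes the fine-tuning fast and memory-efficient relative to full fine-tuning.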

Share and Cite

MDPI and ACS Style

Si, M.; Cobas, O.; Fababeir, M. Lexical Error Guard: Leveraging Large Language Models for Enhanced ASR Error Correction. Mach. Learn. Knowl. Extr. 2024, 6, 2435-2446. https://doi.org/10.3390/make6040120

AMA Style

Si M, Cobas O, Fababeir M. Lexical Error Guard: Leveraging Large Language Models for Enhanced ASR Error Correction. Machine Learning and Knowledge Extraction. 2024; 6(4):2435-2446. https://doi.org/10.3390/make6040120

Chicago/Turabian Style

Si, Mei, Omar Cobas, and Michael Fababeir. 2024. "Lexical Error Guard: Leveraging Large Language Models for Enhanced ASR Error Correction" Machine Learning and Knowledge Extraction 6, no. 4: 2435-2446. https://doi.org/10.3390/make6040120

APA Style

Si, M., Cobas, O., & Fababeir, M. (2024). Lexical Error Guard: Leveraging Large Language Models for Enhanced ASR Error Correction. Machine Learning and Knowledge Extraction, 6(4), 2435-2446. https://doi.org/10.3390/make6040120
