Search Results (1)

Search Parameters:
Keywords = SPD-TDNN

13 pages, 366 KiB  
Article
Speaker Recognition Based on the Joint Loss Function
by Tengteng Feng, Houbin Fan, Fengpei Ge, Shuxin Cao and Chunyan Liang
Electronics 2023, 12(16), 3447; https://doi.org/10.3390/electronics12163447 - 15 Aug 2023
Cited by 2 | Viewed by 1593
Abstract
The statistical pyramid dense time-delay neural network (SPD-TDNN) model struggles with imbalanced training data, is prone to overfitting, and generalizes poorly. To address these problems, we propose a method based on the joint loss function and an improved statistical pyramid dense time-delay neural network (JLF-ISPD-TDNN), which improves on the SPD-TDNN model and uses a joint loss function to combine the advantages of the cross-entropy loss function and the contrastive learning loss function. By minimizing the distance between speech embeddings from the same speaker and maximizing the distance between speech embeddings from different speakers, the model achieves better generalization and a more robust speaker feature representation. We evaluated the proposed method using the equal error rate (EER) and the minimum detection cost function (minDCF). The experimental results show that the EER and minDCF on the Aishell-1 dataset reached 1.02% and 0.1221%, respectively. Therefore, using the joint loss function in the improved SPD-TDNN model can significantly enhance the model’s speaker recognition performance.
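The abstract describes the joint loss only in words. As a rough illustration, and not the authors' implementation, the idea of adding a contrastive term to a cross-entropy term could be sketched in NumPy as follows. The margin-based pairwise form of the contrastive term, the function names, and the `weight` and `margin` parameters are all assumptions made for this sketch; the abstract does not specify any of them.

```python
import numpy as np


def cross_entropy(logits, labels):
    """Mean cross-entropy over a batch, using a numerically stable log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def contrastive_loss(embeddings, labels, margin=1.0):
    """Pairwise margin-based contrastive loss (an assumed form, not the paper's).

    Same-speaker pairs are penalized by their squared distance;
    different-speaker pairs are penalized only when closer than `margin`.
    """
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1) + 1e-12)  # small epsilon keeps sqrt stable
    same = labels[:, None] == labels[None, :]
    upper = np.triu_indices(len(labels), k=1)  # count each unordered pair once
    d, s = dist[upper], same[upper]
    pull = d ** 2                              # same speaker: pull together
    push = np.maximum(0.0, margin - d) ** 2    # different speakers: push apart
    return np.where(s, pull, push).mean()


def joint_loss(logits, embeddings, labels, weight=0.5, margin=1.0):
    """Cross-entropy plus a weighted contrastive term (weight is hypothetical)."""
    return cross_entropy(logits, labels) + weight * contrastive_loss(
        embeddings, labels, margin
    )


# Toy batch: two speakers, two utterances each, 2-D embeddings.
toy_embeddings = np.array([[0.0, 0.0], [0.0, 0.0], [5.0, 0.0], [5.0, 0.0]])
toy_labels = np.array([0, 0, 1, 1])
toy_logits = np.zeros((4, 2))  # uniform predictions over two speakers
toy_total = joint_loss(toy_logits, toy_embeddings, toy_labels)
```

In this toy batch the embeddings are already well clustered by speaker, so the contrastive term is close to zero and the total is dominated by the cross-entropy term. Assigning labels that disagree with the clusters makes the contrastive term grow, which is the behavior the abstract attributes to the joint loss.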
(This article belongs to the Special Issue Machine Learning in Music/Audio Signal Processing)