Text-independent speaker verification with very short utterances
Date
2023
Publisher
Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023.
Abstract
The accuracy of text-independent speaker verification degrades sharply when the speech duration is very short. This thesis proposes methods to compensate for this performance degradation on very short utterances. First, methods that leverage additional information from large-scale speaker datasets are proposed to enhance the limited speaker information present in very short utterances. Second, the short-utterance problem is addressed more specifically in terms of the phonetic content of the speech: the phonetic mismatch between verification utterances is analyzed, and experiments are conducted with a back-end scoring module that is aware of this mismatch. The thesis also presents contributions to speaker verification in general that may be applicable to very short duration conditions. A novel loss function for training the back-end scoring module is introduced; it outperformed the baseline loss function in all cases, including the very short duration scenario. Lastly, a novel unsupervised domain adaptation method for discriminative back-end scoring in speaker verification is proposed. The proposed adaptation method improved the performance of the out-of-domain back-end scoring model in the target domain in all cases, and its relative improvement over the baseline adaptation methods is highest in short duration conditions.
Keywords: speech processing, deep learning, speaker recognition.