基於多重注意力機制的輔助損失函數用於端到端語者標記

Translated title of the contribution: Auxiliary Loss to Attention Head for End to End Speaker Diarization

Yi Ting Yang, Jiun Ting Li, Berlin Chen

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

This study introduces a novel auxiliary function for use in the Self-Attention End-to-End Speaker Diarization (SA-EEND) model, aiming to achieve accurate speaker label prediction within overlapping speech regions. Previous research has lacked effective methods for leveraging speaker information within the model to enhance auxiliary model training and has not taken into account variations in the distribution of different speech activity patterns. This study proposes a novel auxiliary function to facilitate speaker label prediction within overlapping speech regions. By considering both the overall speech activity patterns and the task-specific speech activity patterns for different speakers, we adjust the weight matrices of the multi-head self-attention mechanism in the Transformer layers. We also select loss functions that can improve the learning performance for labels with fewer occurrences, resulting in better speaker discrimination. Experimental evaluations were conducted on Mini LibriSpeech. Although the results exhibited some limitations, there were still notable advancements made.

Translated title of the contributionAuxiliary Loss to Attention Head for End to End Speaker Diarization
Original languageChinese (Traditional)
Title of host publicationROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
EditorsJheng-Long Wu, Ming-Hsiang Su, Hen-Hsen Huang, Yu Tsao, Hou-Chiang Tseng, Chia-Hui Chang, Lung-Hao Lee, Yuan-Fu Liao, Wei-Yun Ma
PublisherThe Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages38-43
Number of pages6
ISBN (Electronic)9789869576963
Publication statusPublished - 2023
Event35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023 - Taipei City, Taiwan
Duration: 2023 Oct 202023 Oct 21

Publication series

NameROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing

Conference

Conference35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023
Country/TerritoryTaiwan
CityTaipei City
Period2023/10/202023/10/21

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing

Fingerprint

Dive into the research topics of 'Auxiliary Loss to Attention Head for End to End Speaker Diarization'. Together they form a unique fingerprint.

Cite this