基於多重注意力機制的輔助損失函數用於端到端語者標記

Yi Ting Yang, Jiun Ting Li, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This study introduces a novel auxiliary loss function for the Self-Attention End-to-End Speaker Diarization (SA-EEND) model, aimed at accurate speaker label prediction in overlapping speech regions. Previous research has lacked effective methods for leveraging the speaker information inside the model to enhance auxiliary model training, and has not accounted for variations in the distribution of different speech activity patterns. The proposed auxiliary loss adjusts the weight matrices of the multi-head self-attention mechanism in the Transformer layers by considering both the overall speech activity patterns and the task-specific speech activity patterns for different speakers. We also select loss functions that improve learning for labels that occur less frequently, resulting in better speaker discrimination. Experiments were conducted on Mini LibriSpeech; although the results show some limitations, the proposed approach still yields notable improvements.

Translated title of the contribution: Auxiliary Loss to Attention Head for End to End Speaker Diarization
Original language: Traditional Chinese
Title of host publication: ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
Editors: Jheng-Long Wu, Ming-Hsiang Su, Hen-Hsen Huang, Yu Tsao, Hou-Chiang Tseng, Chia-Hui Chang, Lung-Hao Lee, Yuan-Fu Liao, Wei-Yun Ma
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages: 38-43
Number of pages: 6
ISBN (Electronic): 9789869576963
Publication status: Published - 2023
Event: 35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023 - Taipei City, Taiwan
Duration: 20 Oct 2023 to 21 Oct 2023

Publication series

Name: ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing

Conference

Conference: 35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023
Country/Territory: Taiwan
City: Taipei City
Period: 2023/10/20 to 2023/10/21

Keywords

  • auxiliary loss
  • end-to-end neural diarization
  • multi-head attention
  • speaker diarization

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
