TY - GEN
T1 - Enhanced BERT-Based Ranking Models for Spoken Document Retrieval
AU - Lin, Hsiao Yun
AU - Lo, Tien Hong
AU - Chen, Berlin
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/12
Y1 - 2019/12
N2 - The Bidirectional Encoder Representations from Transformers (BERT) model has recently achieved record-breaking success on many natural language processing (NLP) tasks such as question answering and language understanding. However, relatively little work has been done on ad-hoc information retrieval (IR), especially for spoken document retrieval (SDR). This paper adopts and extends BERT for SDR, and its contributions are at least three-fold. First, we augment BERT with extra language features such as unigram and inverse document frequency (IDF) statistics to make it more applicable to SDR. Second, we explore the incorporation of confidence scores into document representations to see if they could help alleviate the negative effects resulting from imperfect automatic speech recognition (ASR). Third, we conduct a comprehensive set of experiments to compare our BERT-based ranking methods with other state-of-the-art ones and investigate their synergy effects as well.
AB - The Bidirectional Encoder Representations from Transformers (BERT) model has recently achieved record-breaking success on many natural language processing (NLP) tasks such as question answering and language understanding. However, relatively little work has been done on ad-hoc information retrieval (IR), especially for spoken document retrieval (SDR). This paper adopts and extends BERT for SDR, and its contributions are at least three-fold. First, we augment BERT with extra language features such as unigram and inverse document frequency (IDF) statistics to make it more applicable to SDR. Second, we explore the incorporation of confidence scores into document representations to see if they could help alleviate the negative effects resulting from imperfect automatic speech recognition (ASR). Third, we conduct a comprehensive set of experiments to compare our BERT-based ranking methods with other state-of-the-art ones and investigate their synergy effects as well.
KW - BERT
KW - spoken document retrieval
KW - information retrieval
KW - model augmentation
KW - speech recognition
UR - http://www.scopus.com/inward/record.url?scp=85081595716&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081595716&partnerID=8YFLogxK
U2 - 10.1109/ASRU46091.2019.9003890
DO - 10.1109/ASRU46091.2019.9003890
M3 - Conference contribution
AN - SCOPUS:85081595716
T3 - 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings
SP - 601
EP - 606
BT - 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019
Y2 - 15 December 2019 through 18 December 2019
ER -