TY - GEN
T1 - Innovative BERT-Based Reranking Language Models for Speech Recognition
AU - Chiu, Shih-Hsuan
AU - Chen, Berlin
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/1/19
Y1 - 2021/1/19
AB - Recently, Bidirectional Encoder Representations from Transformers (BERT) was proposed and has achieved impressive success on many natural language processing (NLP) tasks, such as question answering and language understanding, due mainly to its effective pre-training-then-fine-tuning paradigm as well as its strong contextual modeling ability. In view of the above, this paper presents a novel instantiation of BERT-based contextualized language models (LMs) for reranking the N-best hypotheses produced by automatic speech recognition (ASR). To this end, we frame N-best hypothesis reranking with BERT as a prediction problem, which aims to predict the oracle hypothesis that has the lowest word error rate (WER) given the N-best hypotheses (denoted by PBERT). In particular, we also explore capitalizing on task-specific global topic information in an unsupervised manner to assist PBERT in N-best hypothesis reranking (denoted by TPBERT). Extensive experiments conducted on the AMI benchmark corpus demonstrate the effectiveness and feasibility of our methods in comparison to conventional autoregressive models such as recurrent neural network (RNN) LMs and a recently proposed method that employed BERT to compute pseudo-log-likelihood (PLL) scores for N-best hypothesis reranking.
KW - BERT
KW - N-best hypothesis reranking
KW - automatic speech recognition
KW - language models
UR - http://www.scopus.com/inward/record.url?scp=85103982819&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85103982819&partnerID=8YFLogxK
U2 - 10.1109/SLT48900.2021.9383557
DO - 10.1109/SLT48900.2021.9383557
M3 - Conference contribution
AN - SCOPUS:85103982819
T3 - 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings
SP - 266
EP - 271
BT - 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE Spoken Language Technology Workshop, SLT 2021
Y2 - 19 January 2021 through 22 January 2021
ER -