It has always been difficult for language understanding systems to handle spontaneous speech with satisfactory robustness, primarily due to such problems as the fragments, disfluencies, out-of-vocabulary words, and ill-formed sentence structures. Also, the search schemes used are usually not flexible enough in accepting different input linguistic units, and great efforts are therefore required when they are used with different acoustic front ends in different tasks, specially in multi-modal and multi-lingual systems. In this paper, a new hierarchical tag-graph-based search scheme for spontaneous speech understanding is proposed. This scheme is based on a layered hierarchy of grammar rules, and therefore can integrate all the statistical and rule-based knowledge including acoustic scores, language model scores and grammar rules into the search process. More robust speech understanding is thus achievable. In addition, this scheme can accept graphs of different linguistic units such as phonemes, syllables, characters, words, spotted keywords, or phrases as the input, thus compatible to different acoustic front ends and multi-modal and multi-lingual applications can be easily developed. This search scheme has been successfully applied to a multi-domain, multi-modal dialogue system.
ASJC Scopus subject areas