Huge, continually increasing quantities of multimedia content including speech information are filling up our computers, networks and lives. It is obvious that speech is one of the most important sources of information for multimedia content, as it is the speech of the content that tells us of the subjects, topics and concepts. As a result, the associated spoken documents of the multimedia content will be key for content retrieval and browsing. Substantial efforts along with very encouraging results for spoken document transcription, retrieval, and summarization have been reported. This chapter presents a concise yet comprehensive overview of information retrieval and automatic summarization technologies that have been developed in recent years for efficient spoken document retrieval and browsing applications. An example prototype system for voice retrieval of Chinese broadcast news collected in Taiwan will be introduced as well.
ASJC Scopus subject areas