A locality-preserving essence vector modeling framework for spoken document retrieval

Kuan Yu Chen, Shih Hung Liu, Berlin Chen, Hsin Min Wang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

Because unprecedented volumes of multimedia data associated with spoken documents have been made available to the public, spoken document retrieval (SDR) has become an important research area over the past decades. Recently, representation learning has emerged as an active research topic in many machine learning applications, owing largely to its excellent performance. In the context of natural language processing, the pioneering work dates back to word embedding methods. However, learning paragraph (or sentence and document) representations is more suitable for some tasks, such as information retrieval and document summarization. Nevertheless, as far as we are aware, relatively little work has focused on applying paragraph embedding methods to SDR. Motivated by these observations, this paper proposes a novel paragraph embedding method, named the locality-preserving essence vector (LPEV) model. LPEV is designed with two aspects in mind. First, the model aims not only to distill the most representative information from a paragraph but also to discard general background information. Second, inspired by the local invariance perspective, a celebrated principle in manifold learning, LPEV also preserves semantic locality in the learned low-dimensional embedding space so as to produce more informative and discriminative vector representations of paragraphs. Built on the proposed framework, a series of empirical SDR experiments conducted on the TDT-2 (Topic Detection and Tracking) collection demonstrates the efficacy of our SDR methods compared with existing strong baselines.
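
The abstract does not state the LPEV objective, so the following is not the authors' formulation. It is only a minimal sketch of the general local invariance idea from manifold learning, assuming a symmetric k-nearest-neighbour affinity graph built from term-count vectors. The function names, the cosine similarity, the value of k, and the toy data are all illustrative assumptions. A graph-Laplacian penalty of this kind is small when paragraphs that are neighbours in the original space remain close in the embedding space.

```python
import numpy as np


def knn_affinity(X, k=2):
    """Symmetric 0/1 k-nearest-neighbour graph from cosine similarity.

    X is an (n, d) array with non-zero rows (hypothetical paragraph vectors).
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T
    np.fill_diagonal(S, -np.inf)  # a paragraph is never its own neighbour
    nearest = np.argsort(-S, axis=1)[:, :k]  # k most similar rows per row
    W = np.zeros(S.shape)
    rows = np.repeat(np.arange(X.shape[0]), k)
    W[rows, nearest.ravel()] = 1.0
    return np.maximum(W, W.T)  # symmetrise: keep an edge if either side chose it


def locality_penalty(Z, W):
    """0.5 * sum_ij W_ij * ||z_i - z_j||^2, computed as trace(Z^T L Z), L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(Z.T @ L @ Z))


# Toy usage: 6 "paragraphs" with 10-dimensional counts and 3-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.random((6, 10)) + 0.1   # stand-in for bag-of-words vectors
Z = rng.standard_normal((6, 3))  # stand-in for learned embeddings
W = knn_affinity(X, k=2)
print(locality_penalty(Z, W))
```

In a training setup, a term like this would typically be added to the main embedding loss with a weighting coefficient; how LPEV actually combines the two is described in the paper itself, not in this record.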

Original language: English
Title of host publication: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5665-5669
Number of pages: 5
ISBN (Electronic): 9781509041176
DOIs: https://doi.org/10.1109/ICASSP.2017.7953241
Publication status: Published - 2017 Jun 16
Event: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - New Orleans, United States
Duration: 2017 Mar 5 to 2017 Mar 9

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017
Country: United States
City: New Orleans
Period: 17/3/5 to 17/3/9

Fingerprint

Launching
Invariance
Information retrieval
Learning systems
Semantics
Processing
Experiments

Keywords

  • distill
  • locality
  • Representation
  • spoken document retrieval

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Chen, K. Y., Liu, S. H., Chen, B., & Wang, H. M. (2017). A locality-preserving essence vector modeling framework for spoken document retrieval. In 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings (pp. 5665-5669). [7953241] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2017.7953241

@inproceedings{c76aaef5ec9f4346a54fa0a1352879e0,
title = "A locality-preserving essence vector modeling framework for spoken document retrieval",
keywords = "distill, locality, Representation, spoken document retrieval",
author = "Chen, {Kuan Yu} and Liu, {Shih Hung} and Berlin Chen and Wang, {Hsin Min}",
year = "2017",
month = "6",
day = "16",
doi = "10.1109/ICASSP.2017.7953241",
language = "English",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "5665--5669",
booktitle = "2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings",

}

TY - GEN

T1 - A locality-preserving essence vector modeling framework for spoken document retrieval

AU - Chen, Kuan Yu

AU - Liu, Shih Hung

AU - Chen, Berlin

AU - Wang, Hsin Min

PY - 2017/6/16

Y1 - 2017/6/16

KW - distill

KW - locality

KW - Representation

KW - spoken document retrieval

UR - http://www.scopus.com/inward/record.url?scp=85023781784&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85023781784&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2017.7953241

DO - 10.1109/ICASSP.2017.7953241

M3 - Conference contribution

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 5665

EP - 5669

BT - 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -