Investigating Siamese LSTM networks for text categorization

Chin Hong Shih, Bi Cheng Yan, Shih Hung Liu, Berlin Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

Recently, deep learning and deep neural networks have attracted considerable attention and emerged as a predominant field of research in the artificial intelligence community. The resulting techniques have been applied successfully in various domains, such as automatic speech recognition, information retrieval and text classification. Among these techniques, long short-term memory (LSTM) networks are well suited to such tasks because they capture long-range dependencies among words efficiently while alleviating the vanishing and exploding gradient problems during training. Following this line of research, in this paper we explore a novel use of a Siamese LSTM-based method to learn more accurate document representations for text categorization. This network architecture takes a pair of variable-length documents as input and uses pairwise learning to generate distributed document representations that more precisely reflect the semantic distance between any pair of documents. As a result, documents associated with the same semantic or topic label are mapped to similar representations with relatively high semantic similarity. Experiments conducted on two benchmark text categorization tasks, IMDB and 20Newsgroups, show that a three-layer deep neural network classifier that takes as input a document representation learned by the Siamese LSTM sub-networks achieves performance competitive with several state-of-the-art methods.
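
The pairwise setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `TinyLSTMEncoder`, the helper functions, the cosine similarity measure, the contrastive-style loss, the margin, and all dimensions are assumptions chosen for illustration, and the weights are random rather than trained. The sketch only shows the defining property of a Siamese network, namely that one encoder with shared weights maps both documents of a pair, whatever their lengths, to fixed-size vectors that are then compared.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMEncoder:
    """Single-layer LSTM (hypothetical name); the final hidden state is the document vector."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.hidden_dim = hidden_dim
        self.embed = mat(vocab_size, embed_dim)
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [x_t ; h_{t-1}].
        self.W = {g: mat(hidden_dim, embed_dim + hidden_dim) for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}

    def _gate(self, g, z, act):
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(self.W[g], self.b[g])]

    def encode(self, token_ids):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for t in token_ids:
            z = self.embed[t] + h  # list concatenation of embedding and previous state
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("c", z, math.tanh)
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def pair_loss(sim, same_label, margin=0.5):
    # Assumed contrastive-style loss: pull same-label pairs toward similarity 1,
    # push different-label pairs below the margin.
    if same_label:
        return (1.0 - sim) ** 2
    return max(0.0, sim - margin) ** 2


# The same encoder object (shared weights) processes both documents of the pair.
encoder = TinyLSTMEncoder(vocab_size=50, embed_dim=8, hidden_dim=6)
doc_a = [3, 14, 15, 9, 2]   # token IDs of a 5-word document (made-up data)
doc_b = [6, 5, 35]          # token IDs of a 3-word document (made-up data)
vec_a = encoder.encode(doc_a)
vec_b = encoder.encode(doc_b)
sim = cosine(vec_a, vec_b)
loss = pair_loss(sim, same_label=True)
```

In a trained system the loss would be minimized over many document pairs, and the resulting document vectors (here `vec_a` and `vec_b`) would serve as input to a separate classifier, such as the three-layer network mentioned in the abstract.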

Original language: English
Title of host publication: Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 641-646
Number of pages: 6
ISBN (Electronic): 9781538615423
DOI: https://doi.org/10.1109/APSIPA.2017.8282104
Publication status: Published - 2018 Feb 5
Event: 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017 - Kuala Lumpur, Malaysia
Duration: 2017 Dec 12 to 2017 Dec 15

Publication series

Name: Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
Volume: 2018-February

Conference

Conference: 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
Country: Malaysia
City: Kuala Lumpur
Period: 17/12/12 to 17/12/15

Fingerprint

Semantics
Network architecture
Information retrieval
Speech recognition
Artificial intelligence
Labels
Classifiers
Long short-term memory
Experiments
Deep neural networks
Deep learning

ASJC Scopus subject areas

  • Artificial Intelligence
  • Human-Computer Interaction
  • Information Systems
  • Signal Processing

Cite this

Shih, C. H., Yan, B. C., Liu, S. H., & Chen, B. (2018). Investigating Siamese LSTM networks for text categorization. In Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017 (pp. 641-646). (Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017; Vol. 2018-February). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/APSIPA.2017.8282104

@inproceedings{a2c98f896d5e48048c2e61703045b499,
title = "Investigating Siamese LSTM networks for text categorization",
author = "Shih, {Chin Hong} and Yan, {Bi Cheng} and Liu, {Shih Hung} and Chen, Berlin",
year = "2018",
month = "2",
day = "5",
doi = "10.1109/APSIPA.2017.8282104",
language = "English",
series = "Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "641--646",
booktitle = "Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017",

}

TY - GEN

T1 - Investigating Siamese LSTM networks for text categorization

AU - Shih, Chin Hong

AU - Yan, Bi Cheng

AU - Liu, Shih Hung

AU - Chen, Berlin

PY - 2018/2/5

Y1 - 2018/2/5

UR - http://www.scopus.com/inward/record.url?scp=85050818358&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85050818358&partnerID=8YFLogxK

U2 - 10.1109/APSIPA.2017.8282104

DO - 10.1109/APSIPA.2017.8282104

M3 - Conference contribution

AN - SCOPUS:85050818358

T3 - Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017

SP - 641

EP - 646

BT - Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017

PB - Institute of Electrical and Electronics Engineers Inc.

ER -