TY - GEN
T1 - Investigating Siamese LSTM networks for text categorization
AU - Shih, Chin Hong
AU - Yan, Bi Cheng
AU - Liu, Shih Hung
AU - Chen, Berlin
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
N2 - Recently, deep learning and deep neural networks have attracted considerable attention and emerged as a predominant field of research in the artificial intelligence community. The developed techniques have also gained widespread use, with good success, in domains such as automatic speech recognition, information retrieval, and text classification. Among them, long short-term memory (LSTM) networks are well suited to such tasks: they can efficiently capture long-range dependencies among words while effectively alleviating the vanishing and exploding gradient problems during training. Following this line of research, in this paper we explore a novel use of a Siamese LSTM-based method to learn more accurate document representations for text categorization. Such a network architecture takes a pair of variable-length documents as input and uses pairwise learning to generate distributed document representations that more precisely render the semantic distance between any pair of documents. In doing so, documents associated with the same semantic or topic label can be mapped to similar representations with relatively higher semantic similarity. Experiments conducted on two benchmark text categorization tasks, viz. IMDB and 20Newsgroups, show that a three-layer deep neural network classifier that takes as input a document representation learned by the Siamese LSTM sub-networks achieves competitive performance in relation to several state-of-the-art methods.
AB - Recently, deep learning and deep neural networks have attracted considerable attention and emerged as a predominant field of research in the artificial intelligence community. The developed techniques have also gained widespread use, with good success, in domains such as automatic speech recognition, information retrieval, and text classification. Among them, long short-term memory (LSTM) networks are well suited to such tasks: they can efficiently capture long-range dependencies among words while effectively alleviating the vanishing and exploding gradient problems during training. Following this line of research, in this paper we explore a novel use of a Siamese LSTM-based method to learn more accurate document representations for text categorization. Such a network architecture takes a pair of variable-length documents as input and uses pairwise learning to generate distributed document representations that more precisely render the semantic distance between any pair of documents. In doing so, documents associated with the same semantic or topic label can be mapped to similar representations with relatively higher semantic similarity. Experiments conducted on two benchmark text categorization tasks, viz. IMDB and 20Newsgroups, show that a three-layer deep neural network classifier that takes as input a document representation learned by the Siamese LSTM sub-networks achieves competitive performance in relation to several state-of-the-art methods.
UR - http://www.scopus.com/inward/record.url?scp=85050818358&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050818358&partnerID=8YFLogxK
U2 - 10.1109/APSIPA.2017.8282104
DO - 10.1109/APSIPA.2017.8282104
M3 - Conference contribution
AN - SCOPUS:85050818358
T3 - Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
SP - 641
EP - 646
BT - Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
Y2 - 12 December 2017 through 15 December 2017
ER -
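
The abstract describes a Siamese LSTM architecture that encodes pairs of documents with a shared LSTM, learns representations reflecting semantic distance via pairwise training, and feeds the learned document vectors to a three-layer neural network classifier. The sketch below is a minimal illustration of that general setup, not the paper's implementation: the framework (PyTorch), dimensions, contrastive-style pairwise loss, and all class and parameter names are assumptions introduced here for clarity.

```python
# Hedged sketch of a Siamese LSTM for document representations, loosely
# following the setup described in the abstract. All names, dimensions and
# the contrastive-style loss are illustrative assumptions, not the paper's
# exact configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SiameseLSTMEncoder(nn.Module):
    """Shared LSTM encoder applied to both documents of a pair."""
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        embedded = self.embedding(token_ids)
        _, (hidden, _) = self.lstm(embedded)
        return hidden[-1]  # (batch, hidden_dim) document vector


def pairwise_loss(vec_a, vec_b, same_label, margin=1.0):
    """Contrastive-style loss: pull same-topic pairs together, push others apart."""
    distance = F.pairwise_distance(vec_a, vec_b)
    positive = same_label * distance.pow(2)
    negative = (1 - same_label) * F.relu(margin - distance).pow(2)
    return (positive + negative).mean()


class TopicClassifier(nn.Module):
    """Three-layer feed-forward classifier over the learned document vectors."""
    def __init__(self, hidden_dim=256, num_classes=20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, doc_vec):
        return self.net(doc_vec)


if __name__ == "__main__":
    encoder = SiameseLSTMEncoder()
    classifier = TopicClassifier()
    # Toy batch: two documents per pair, plus a same-topic indicator.
    doc_a = torch.randint(1, 10000, (4, 50))
    doc_b = torch.randint(1, 10000, (4, 50))
    same = torch.tensor([1.0, 0.0, 1.0, 0.0])
    vec_a, vec_b = encoder(doc_a), encoder(doc_b)
    print(pairwise_loss(vec_a, vec_b, same).item())
    print(classifier(vec_a).shape)  # (4, num_classes)
```

In this sketch the two branches share one encoder (true weight sharing, as in a Siamese network), and the classifier operates on the encoder's final hidden state; whether the paper uses exactly this loss or a different similarity objective is not specified in the record above.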