Development of a text retrieval and mining system for Taiwanese historical people

Shun Hong Sie, Hao Ren Ke, Su Bing Chang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Personages are an important kind of entity in the study of history, and a comprehensive understanding of their biographies benefits research into historical events. This article introduces the development of the Taiwan Biographical Database (TBDB), a text retrieval and mining system for Taiwanese historical people. It describes the characteristics of the personages in TBDB, highlights the system architecture and preliminary achievements of TBDB, and proposes a method for recognizing named entities in the personage biographies, specifically poetry societies, which achieves a recall of 96% and a precision of 65%. Finally, the article elaborates on the lessons learned in creating TBDB and on future plans.
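To put the two reported rates on a single scale, the sketch below combines them into an F1 score, the harmonic mean of precision and recall. Only the 96% recall and 65% precision figures come from the abstract. The F1 value is derived here for illustration and is not a result reported by the authors, and the function name `f1_score` is a hypothetical helper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        # Both rates are zero, so the harmonic mean is defined as zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rates stated in the abstract for poetry-society recognition.
precision = 0.65
recall = 0.96

# Derived value, not reported in the paper.
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")
```

Read this way, the method finds nearly all poetry societies mentioned in the biographies, while roughly one in three extracted candidates is a false positive.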

Original language: English
Title of host publication: Proceedings of the 2017 Pacific Neighborhood Consortium Annual Conference and Joint Meetings
Subtitle of host publication: Data Informed Society, PNC 2017
Editors: Sophy Shu-Jiun Chen, Feng-Tyan Lin, Da-Wei Wang, Ling-Jyh Chen
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 56-62
Number of pages: 7
ISBN (Electronic): 9789869531702
DOIs: https://doi.org/10.23919/PNC.2017.8203522
Publication status: Published - 2017 Dec 13
Event: 2017 Pacific Neighborhood Consortium Annual Conference and Joint Meetings, PNC 2017 - Tainan, Taiwan
Duration: 2017 Nov 7 to 2017 Nov 9

Publication series

Name: Proceedings of the 2017 Pacific Neighborhood Consortium Annual Conference and Joint Meetings: Data Informed Society, PNC 2017
Volume: 2017-December

Conference

Conference: 2017 Pacific Neighborhood Consortium Annual Conference and Joint Meetings, PNC 2017
Country: Taiwan
City: Tainan
Period: 17/11/7 to 17/11/9

Fingerprint

History
Text mining
Data base
Taiwan
Text retrieval
Lessons learned
System architecture
Named entity
Poetry

Keywords

  • name entity recognition
  • social network analysis (SNA)
  • Taiwan Biographical Database (TBDB)
  • text mining
  • text retrieval

ASJC Scopus subject areas

  • Information Systems
  • Information Systems and Management
  • Computer Networks and Communications
  • Computer Science Applications

Cite this

Sie, S. H., Ke, H. R., & Chang, S. B. (2017). Development of a text retrieval and mining system for Taiwanese historical people. In S. S-J. Chen, F-T. Lin, D-W. Wang, & L-J. Chen (Eds.), Proceedings of the 2017 Pacific Neighborhood Consortium Annual Conference and Joint Meetings: Data Informed Society, PNC 2017 (pp. 56-62). (Proceedings of the 2017 Pacific Neighborhood Consortium Annual Conference and Joint Meetings: Data Informed Society, PNC 2017; Vol. 2017-December). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.23919/PNC.2017.8203522

@inproceedings{e9bc80b002d342139735caabce5670ec,
title = "Development of a text retrieval and mining system for Taiwanese historical people",
abstract = "Personages are an important kind of entity in the study of history, and a comprehensive understanding of their biographies benefits research into historical events. This article introduces the development of the Taiwan Biographical Database (TBDB), a text retrieval and mining system for Taiwanese historical people. It describes the characteristics of the personages in TBDB, highlights the system architecture and preliminary achievements of TBDB, and proposes a method for recognizing named entities in the personage biographies, specifically poetry societies, which achieves a recall of 96{\%} and a precision of 65{\%}. Finally, the article elaborates on the lessons learned in creating TBDB and on future plans.",
keywords = "name entity recognition, social network analysis (SNA), Taiwan Biographical Database (TBDB), text mining, text retrieval",
author = "Sie, {Shun Hong} and Ke, {Hao Ren} and Chang, {Su Bing}",
year = "2017",
month = "12",
day = "13",
doi = "10.23919/PNC.2017.8203522",
language = "English",
series = "Proceedings of the 2017 Pacific Neighborhood Consortium Annual Conference and Joint Meetings: Data Informed Society, PNC 2017",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "56--62",
editor = "Chen, {Sophy Shu-Jiun} and Feng-Tyan Lin and Da-Wei Wang and Ling-Jyh Chen",
booktitle = "Proceedings of the 2017 Pacific Neighborhood Consortium Annual Conference and Joint Meetings",

}

TY - GEN
T1 - Development of a text retrieval and mining system for Taiwanese historical people
AU - Sie, Shun Hong
AU - Ke, Hao Ren
AU - Chang, Su Bing
PY - 2017/12/13
Y1 - 2017/12/13
N2 - Personages are an important kind of entity in the study of history, and a comprehensive understanding of their biographies benefits research into historical events. This article introduces the development of the Taiwan Biographical Database (TBDB), a text retrieval and mining system for Taiwanese historical people. It describes the characteristics of the personages in TBDB, highlights the system architecture and preliminary achievements of TBDB, and proposes a method for recognizing named entities in the personage biographies, specifically poetry societies, which achieves a recall of 96% and a precision of 65%. Finally, the article elaborates on the lessons learned in creating TBDB and on future plans.
AB - Personages are an important kind of entity in the study of history, and a comprehensive understanding of their biographies benefits research into historical events. This article introduces the development of the Taiwan Biographical Database (TBDB), a text retrieval and mining system for Taiwanese historical people. It describes the characteristics of the personages in TBDB, highlights the system architecture and preliminary achievements of TBDB, and proposes a method for recognizing named entities in the personage biographies, specifically poetry societies, which achieves a recall of 96% and a precision of 65%. Finally, the article elaborates on the lessons learned in creating TBDB and on future plans.
KW - name entity recognition
KW - social network analysis (SNA)
KW - Taiwan Biographical Database (TBDB)
KW - text mining
KW - text retrieval
UR - http://www.scopus.com/inward/record.url?scp=85047180430&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85047180430&partnerID=8YFLogxK
U2 - 10.23919/PNC.2017.8203522
DO - 10.23919/PNC.2017.8203522
M3 - Conference contribution
T3 - Proceedings of the 2017 Pacific Neighborhood Consortium Annual Conference and Joint Meetings: Data Informed Society, PNC 2017
SP - 56
EP - 62
BT - Proceedings of the 2017 Pacific Neighborhood Consortium Annual Conference and Joint Meetings
A2 - Chen, Sophy Shu-Jiun
A2 - Lin, Feng-Tyan
A2 - Wang, Da-Wei
A2 - Chen, Ling-Jyh
PB - Institute of Electrical and Electronics Engineers Inc.
ER -