Biomedical text mining about Alzheimer's diseases for Machine Reading evaluation

Bing Han Tsai, Yu Zheng Liu, Wen Juan Hou

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

This paper presents the experiments carried out for our participation in the pilot task on biomedical texts about Alzheimer's disease for QA4MRE at CLEF 2012. We submitted a total of five unique runs in the pilot task. One run uses the Term Frequency (TF) of the query words to weight the sentences. Two runs use the Term Frequency-Inverse Document Frequency (TF-IDF) of the query words to weight the sentences; these two runs differ in how ties are broken: when multiple answers receive the same score from our system, a different answer is chosen in each run. The last two runs use the TF or TF-IDF weighting scheme together with OMIM terms about Alzheimer's disease for query expansion. Stopwords are removed from the query words and answers. Each sentence in the associated document is assigned a weighting score with respect to the query words. The sentence that receives the higher weighting score for the query words is identified as the more relevant sentence in the document. Each answer option for the given question is scored according to the sentence weighting score, and the highest-ranked answer is selected as the final answer.
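The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the tiny stopword list, treating each sentence as a "document" when computing IDF, the smoothed IDF formula, scoring an answer option by the best-scoring sentence that shares a word with it, and the hypothetical names `tokenize`, `sentence_scores`, and `pick_answer`. The `expansion_terms` parameter stands in for the OMIM-based query expansion, and `tie` mimics the two tie-breaking variants.

```python
import math
import re
from collections import Counter

# Illustrative stopword list (an assumption; the paper's list is not given).
STOPWORDS = {"the", "a", "an", "of", "in", "is", "to", "and", "what", "with", "by"}


def tokenize(text):
    """Lowercase, split on non-alphanumeric characters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def sentence_scores(sentences, query, use_idf=True, expansion_terms=()):
    """Weight each sentence by the TF (or TF-IDF) of the query words it contains."""
    tokenized = [tokenize(s) for s in sentences]
    query_terms = set(tokenize(query)) | set(tokenize(" ".join(expansion_terms)))
    n = len(tokenized)
    # Sentence-level document frequency (assumption: one sentence = one document).
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if tf[term]:
                # Smoothed IDF so terms found in every sentence still count.
                idf = math.log(n / df[term]) + 1.0 if use_idf else 1.0
                score += tf[term] * idf
        scores.append(score)
    return scores


def pick_answer(sentences, query, options, use_idf=True, expansion_terms=(), tie="first"):
    """Score each option by its best matching sentence and return the top option."""
    scores = sentence_scores(sentences, query, use_idf, expansion_terms)
    sentence_words = [set(tokenize(s)) for s in sentences]
    option_scores = []
    for option in options:
        words = set(tokenize(option))
        matching = [sc for sc, toks in zip(scores, sentence_words) if words & toks]
        option_scores.append(max(matching, default=0.0))
    best = max(option_scores)
    tied = [i for i, sc in enumerate(option_scores) if sc == best]
    # The two tie-breaking variants: take the first or the last tied option.
    return options[tied[0] if tie == "first" else tied[-1]]
```

Calling `pick_answer` with `use_idf=False` corresponds to a TF-style run, the default to a TF-IDF-style run, and passing `expansion_terms` approximates the query-expansion runs.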

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 1178
Publication status: Published - 2012

Keywords

  • Biomedical text mining
  • Machine Reading
  • QA4MRE
  • Question-answering

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Biomedical text mining about Alzheimer's diseases for Machine Reading evaluation. / Tsai, Bing Han; Liu, Yu Zheng; Hou, Wen Juan.

In: CEUR Workshop Proceedings, Vol. 1178, 2012.

@article{a372fa0836da466da6a441ab5422de7f,
title = "Biomedical text mining about Alzheimer's diseases for Machine Reading evaluation",
abstract = "This paper presents the experiments carried out for our participation in the pilot task on biomedical texts about Alzheimer's disease for QA4MRE at CLEF 2012. We submitted a total of five unique runs in the pilot task. One run uses the Term Frequency (TF) of the query words to weight the sentences. Two runs use the Term Frequency-Inverse Document Frequency (TF-IDF) of the query words to weight the sentences; these two runs differ in how ties are broken: when multiple answers receive the same score from our system, a different answer is chosen in each run. The last two runs use the TF or TF-IDF weighting scheme together with OMIM terms about Alzheimer's disease for query expansion. Stopwords are removed from the query words and answers. Each sentence in the associated document is assigned a weighting score with respect to the query words. The sentence that receives the higher weighting score for the query words is identified as the more relevant sentence in the document. Each answer option for the given question is scored according to the sentence weighting score, and the highest-ranked answer is selected as the final answer.",
keywords = "Biomedical text mining, Machine Reading, QA4MRE, Question-answering",
author = "Tsai, {Bing Han} and Liu, {Yu Zheng} and Hou, {Wen Juan}",
year = "2012",
language = "English",
volume = "1178",
journal = "CEUR Workshop Proceedings",
issn = "1613-0073",
publisher = "CEUR-WS",
}

TY - JOUR

T1 - Biomedical text mining about Alzheimer's diseases for Machine Reading evaluation

AU - Tsai, Bing Han

AU - Liu, Yu Zheng

AU - Hou, Wen Juan

PY - 2012

Y1 - 2012

N2 - This paper presents the experiments carried out for our participation in the pilot task on biomedical texts about Alzheimer's disease for QA4MRE at CLEF 2012. We submitted a total of five unique runs in the pilot task. One run uses the Term Frequency (TF) of the query words to weight the sentences. Two runs use the Term Frequency-Inverse Document Frequency (TF-IDF) of the query words to weight the sentences; these two runs differ in how ties are broken: when multiple answers receive the same score from our system, a different answer is chosen in each run. The last two runs use the TF or TF-IDF weighting scheme together with OMIM terms about Alzheimer's disease for query expansion. Stopwords are removed from the query words and answers. Each sentence in the associated document is assigned a weighting score with respect to the query words. The sentence that receives the higher weighting score for the query words is identified as the more relevant sentence in the document. Each answer option for the given question is scored according to the sentence weighting score, and the highest-ranked answer is selected as the final answer.

KW - Biomedical text mining

KW - Machine Reading

KW - QA4MRE

KW - Question-answering

UR - http://www.scopus.com/inward/record.url?scp=84922021565&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84922021565&partnerID=8YFLogxK

M3 - Article

VL - 1178

JO - CEUR Workshop Proceedings

JF - CEUR Workshop Proceedings

SN - 1613-0073

ER -