TY - GEN
T1 - Mining patterns of drug-disease association from biomedical texts
AU - Hou, Wen Juan
AU - Lee, Bo Syun
AU - Chen, Hung Chi
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/1/18
Y1 - 2018/1/18
N2 - Drug repurposing aims to identify new indications for approved drugs, and it promises to reduce the time and cost of drug development. The goal of the paper, the automatic extraction of drug-disease relations from biomedical texts, is fundamental to the study of drug repurposing, since many clinical case studies are published in unstructured textual form. To analyze the number of verbs and nouns pertinent to diseases and medications in the training data, two models with different drug-disease orders are established, and some rules are proposed at this phase. The first model handles sentences in which the disease name precedes the drug name; the second model handles the reverse order. These verbs and nouns are then classified into the categories "pure association," "pure no association," and "neutrals." Some of the neutrals are further verified with the Chi-square test. The resulting associations between diseases and medications are called patterns. Finally, the patterns are applied to the test data to extract disease-drug pairs. The best experimental results show a precision of 100%, a recall of 89.0%, and an F-score of 94.2%.
AB - Drug repurposing aims to identify new indications for approved drugs, and it promises to reduce the time and cost of drug development. The goal of the paper, the automatic extraction of drug-disease relations from biomedical texts, is fundamental to the study of drug repurposing, since many clinical case studies are published in unstructured textual form. To analyze the number of verbs and nouns pertinent to diseases and medications in the training data, two models with different drug-disease orders are established, and some rules are proposed at this phase. The first model handles sentences in which the disease name precedes the drug name; the second model handles the reverse order. These verbs and nouns are then classified into the categories "pure association," "pure no association," and "neutrals." Some of the neutrals are further verified with the Chi-square test. The resulting associations between diseases and medications are called patterns. Finally, the patterns are applied to the test data to extract disease-drug pairs. The best experimental results show a precision of 100%, a recall of 89.0%, and an F-score of 94.2%.
KW - Biomedical literature
KW - Chi-square test
KW - Drug-disease association
KW - Pattern extraction
UR - http://www.scopus.com/inward/record.url?scp=85046785803&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046785803&partnerID=8YFLogxK
U2 - 10.1145/3180382.3180401
DO - 10.1145/3180382.3180401
M3 - Conference contribution
AN - SCOPUS:85046785803
T3 - ACM International Conference Proceeding Series
SP - 84
EP - 90
BT - Proceedings of the 2018 8th International Conference on Bioscience, Biochemistry and Bioinformatics, ICBBB 2018
PB - Association for Computing Machinery
T2 - 8th International Conference on Bioscience, Biochemistry and Bioinformatics, ICBBB 2018
Y2 - 18 January 2018 through 20 January 2018
ER -