TY - GEN
T1 - Classifying biological full-text articles for multi-database curation
AU - Hou, Wen Juan
AU - Lee, Chih
AU - Chen, Hsin Hsi
PY - 2006
Y1 - 2006
AB - In this paper, we propose an approach for identifying curatable articles from a large document set. This system considers three parts of an article (title and abstract, MeSH terms, and captions) as its three individual representations and utilizes two domain-specific resources (UMLS and a tumor name list) to reveal the deep knowledge contained in the article. An SVM classifier is trained and cross-validation is employed to find the best combination of representations. The experimental results show overall high performance.
UR - http://www.scopus.com/inward/record.url?scp=79952840641&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79952840641&partnerID=8YFLogxK
U2 - 10.3115/1608974.1608997
DO - 10.3115/1608974.1608997
M3 - Conference contribution
AN - SCOPUS:79952840641
SN - 1932432590
SN - 9781932432596
T3 - EACL 2006 - 11th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
SP - 159
EP - 162
BT - EACL 2006 - 11th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
T2 - 11th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2006
Y2 - 3 April 2006 through 7 April 2006
ER -