Enhancing performance of protein and gene name recognizers with filtering and integration strategies

Wen Juan Hou, Hsin Hsi Chen

Research output: Contribution to journal, Article, peer-reviewed

12 Citations (Scopus)

Abstract

Named entity (NE) recognition is a fundamental task in biological relationship mining. This paper considers protein/gene collocates extracted from biological corpora as restrictions to enhance the precision rate of protein/gene name recognition. In addition, we integrate the results of multiple NE recognizers to improve the recall rates. Yapex and KeX, and ABGene and Idgene are taken as examples of protein and gene name recognizers, respectively. The precision of Yapex increases from 70.90 to 85.84% at the low expense of the recall rate (i.e., it only decreases 2.44%) when collocates are incorporated. When both filtering and integration strategies are employed together, the Yapex-based integration with KeX shows good performance, i.e., the F-score increases by 7.83% compared to the pure Yapex method. The results of gene recognition show the same tendency. The ABGene-based integration with Idgene shows a 10.18% F-score increase compared to the pure ABGene method. These successful methodologies can be easily extended to other name finders in biological documents.
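The two strategies named in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the function names, the token-window definition of a collocate match, the window size, and the use of a plain set union for integration are all assumptions made for illustration. Candidate names are represented as hypothetical `(start, end)` token spans, with `end` exclusive.

```python
def filter_by_collocates(tokens, candidates, collocates, window=3):
    """Keep a candidate span only if a known collocate appears near it.

    Assumption: "near" means within `window` tokens on either side.
    """
    kept = set()
    for start, end in candidates:
        left = tokens[max(0, start - window):start]
        right = tokens[end:end + window]
        context = {t.lower() for t in left + right}
        if context & collocates:
            kept.add((start, end))
    return kept


def integrate(primary, secondary):
    """Combine the outputs of two recognizers (assumed here: set union)."""
    return primary | secondary


def precision_recall_f(predicted, gold):
    """Exact-match precision, recall, and balanced F-score over spans."""
    tp = len(predicted & gold)
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


# Toy data, invented for this sketch only.
tokens = "The p53 protein binds to the cell membrane".split()
candidates = {(1, 2), (6, 7)}          # raw output of a hypothetical recognizer
collocates = {"protein", "binds"}      # hypothetical collocate list
gold = {(1, 2)}

filtered = filter_by_collocates(tokens, candidates, collocates, window=2)
before = precision_recall_f(candidates, gold)
after = precision_recall_f(filtered, gold)
```

In this toy case the filter discards the span with no collocate in its context, which raises precision without lowering recall. Taking a union can only add predictions, so it can raise recall but may lower precision, which is consistent with the abstract pairing integration with filtering.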

Original language: English
Pages (from-to): 448-460
Number of pages: 13
Journal: Journal of Biomedical Informatics
Volume: 37
Issue number: 6
Publication status: Published - Dec 2004
Externally published

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics
