Objectionable content filtering by click-through data

Lung Hao Lee, Yen Cheng Juan, Hsin Hsi Chen, Yuen Hsien Tseng

Research output: Chapter in book/report/conference proceeding (conference paper)

3 Citations (Scopus)

Abstract

This paper explores users' browsing intents to predict the category of a user's next access during web surfing, and applies the results to objectionable content filtering. A user's access trail represented as a sequence of URLs reveals the contextual information of web browsing behaviors. We extract behavioral features of each clicked URL, i.e., hostname, bag-of-words, gTLD, IP, and port, to develop a linear chain CRF model for context-aware category prediction. Large-scale experiments show that our method achieves a promising accuracy of 0.9396 for objectionable access identification without requesting their corresponding page content. Error analysis indicates that our proposed model results in a low false positive rate of 0.0571. In real-life filtering simulations, our proposed model accomplishes macro-averaging blocking rate 0.9271, while maintaining a favorably low macro-averaging over-blocking rate 0.0575 for collaboratively filtering objectionable content with time change on the dynamic web. Copyright is held by the owner/author(s).
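The per-URL features named in the abstract (hostname, bag-of-words, gTLD, IP, and port) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `url_features` and `trail_features`, the short `GTLDS` list, the reading of "IP" as "the host is an IP literal", and the fallback to the scheme's default port are all assumptions made for illustration. The resulting list of feature dictionaries, one per click in the access trail, is the kind of sequence a linear-chain CRF toolkit would take as input.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Illustrative subset of generic TLDs (assumption; not the paper's list).
GTLDS = {"com", "org", "net", "edu", "gov", "info", "biz"}
# Default ports used when the URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def url_features(url: str) -> dict:
    """Extract content-free features from a single clicked URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # "IP" feature, read here as: is the host an IP literal rather than a name?
    try:
        ipaddress.ip_address(host)
        is_ip = True
    except ValueError:
        is_ip = False

    # gTLD feature: last label of the hostname, if it is a known generic TLD.
    last_label = host.rsplit(".", 1)[-1]
    gtld = last_label if (not is_ip and last_label in GTLDS) else "none"

    # Port feature: explicit port, else the scheme default (None if unknown).
    try:
        port = parts.port
    except ValueError:  # malformed or out-of-range port in the URL
        port = None
    if port is None:
        port = DEFAULT_PORTS.get(parts.scheme)

    # Bag-of-words feature: alphanumeric tokens from the path and query string.
    text = (parts.path + " " + parts.query).lower()
    words = [t for t in re.split(r"[^a-z0-9]+", text) if t]

    return {"hostname": host, "gtld": gtld, "is_ip": is_ip,
            "port": port, "words": words}


def trail_features(urls: list[str]) -> list[dict]:
    """Map an access trail (ordered URLs) to a sequence of feature dicts."""
    return [url_features(u) for u in urls]
```

For example, `trail_features(["http://www.example.com:8080/free/video?id=12", "http://192.168.0.1/index.html"])` yields two feature dictionaries in click order. No page content is fetched at any point, which matches the abstract's claim that prediction works without requesting the pages themselves.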

Original language: English
Title of host publication: CIKM 2013 - Proceedings of the 22nd ACM International Conference on Information and Knowledge Management
Pages: 1581-1584
Number of pages: 4
Publication status: Published - 2013
Event: 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013 - San Francisco, CA, United States
Duration: 27 Oct 2013 to 1 Nov 2013

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013
Country/Territory: United States
City: San Francisco, CA
Period: 2013/10/27 to 2013/11/01

ASJC Scopus subject areas

  • General Decision Sciences
  • General Business, Management and Accounting
