Objectionable content filtering by click-through data

Lung Hao Lee, Yen Cheng Juan, Hsin Hsi Chen, Yuen-Hsien Tseng

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

This paper explores users' browsing intents to predict the category of a user's next access during web surfing, and applies the results to objectionable content filtering. A user's access trail, represented as a sequence of URLs, reveals contextual information about web browsing behavior. We extract behavioral features from each clicked URL, i.e., hostname, bag-of-words, gTLD, IP, and port, to develop a linear-chain CRF model for context-aware category prediction. Large-scale experiments show that our method achieves an accuracy of 0.9396 for objectionable access identification without fetching the corresponding page content. Error analysis indicates that the proposed model yields a low false positive rate of 0.0571. In real-life filtering simulations, the proposed model achieves a macro-averaged blocking rate of 0.9271 while maintaining a low macro-averaged over-blocking rate of 0.0575 when objectionable content is filtered collaboratively as the dynamic web changes over time. Copyright is held by the owner/author(s).
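The feature types named in the abstract (hostname, bag-of-words, gTLD, IP, port) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `url_features` and `trail_features` are hypothetical, the bag-of-words is assumed to come from the URL path and query string, the last hostname label is used as a rough stand-in for the gTLD, and the default ports for `http` and `https` are filled in as an assumption.

```python
import ipaddress
import re
from urllib.parse import urlsplit


def is_ip(host):
    """Return True if the host string is a literal IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url):
    """Extract per-URL features without fetching the page (illustrative only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # urlsplit lowercases the hostname
    host_is_ip = is_ip(host)
    # Assumption: the last dot-separated label approximates the gTLD.
    tld = "" if host_is_ip or "." not in host else host.rsplit(".", 1)[1]
    try:
        port = parts.port
    except ValueError:  # malformed port in the URL
        port = None
    if port is None:
        # Assumption: fall back to the scheme's well-known port.
        port = {"http": 80, "https": 443}.get(parts.scheme)
    # Assumption: bag-of-words tokens come from the path and query string.
    text = (parts.path + " " + parts.query).lower()
    words = [t for t in re.split(r"[^a-z0-9]+", text) if t]
    return {
        "hostname": host,
        "tld": tld,
        "host_is_ip": host_is_ip,
        "port": port,
        "words": words,
    }


def trail_features(urls):
    """Map an access trail (sequence of URLs) to a sequence of feature dicts."""
    return [url_features(u) for u in urls]
```

Each access trail becomes one sequence of feature dictionaries, which is the shape a linear-chain sequence labeler expects; the dictionaries would still need to be encoded for whichever CRF toolkit is used.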

Original language: English
Title of host publication: CIKM 2013 - Proceedings of the 22nd ACM International Conference on Information and Knowledge Management
Pages: 1581-1584
Number of pages: 4
DOIs: https://doi.org/10.1145/2505515.2507849
Publication status: Published - 2013 Dec 11
Event: 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013 - San Francisco, CA, United States
Duration: 2013 Oct 27 to 2013 Nov 1

Other

Other: 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013
Country: United States
City: San Francisco, CA
Period: 13/10/27 to 13/11/1

Keywords

  • Click-through mining
  • Collaborative filtering
  • Internet censorship

ASJC Scopus subject areas

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Cite this

Lee, L. H., Juan, Y. C., Chen, H. H., & Tseng, Y-H. (2013). Objectionable content filtering by click-through data. In CIKM 2013 - Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (pp. 1581-1584). https://doi.org/10.1145/2505515.2507849