A multi-level hierarchical index structure for supporting efficient similarity search on tag sets

Jia Ling Koh*, Nonhlanhla Shongwe, Chung Wen Cho

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Social communication websites are an emerging type of Web service that helps users share their resources. To support efficient similarity search on tag sets in a social tagging system, we propose a multi-level hierarchical index structure that groups similar tag sets. We provide algorithms not only for similarity search on tag sets but also for deleting and updating tag sets using the constructed index structure. Furthermore, we define a modified Hamming distance function on tag sets, which considers semantic relatedness when comparing members to evaluate the similarity of two tag sets. This function is better suited to evaluating the similarity of two tag sets. A systematic performance study verifies the effectiveness and efficiency of the proposed strategies. The experimental results show that the proposed MHIB approach improves on the pruning effect of the previous work, which constructs a two-level index structure. In particular, the MHIB approach scales well with respect to the three parameters when using either the Hamming distance or the modified Hamming distance as the similarity measure. Although the insertion operation of the MHIB approach costs more than the naïve method, with the assistance of the constructed inverted list of clusters it is faster than the previous work. In addition, the deletion and update operations cost much less with the MHIB approach than with the other two approaches.
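As a rough illustration of the two distance measures named in the abstract, the sketch below computes the plain Hamming distance between two tag sets (the number of tags that appear in exactly one of them) and a relatedness-aware variant. The abstract does not give the paper's definition of the modified Hamming distance, so `relaxed_hamming_distance`, the `related` predicate, and the `RELATED_PAIRS` table are hypothetical names and an assumed rule chosen only for illustration; they should not be read as the authors' function.

```python
from typing import AbstractSet, Callable


def hamming_distance(a: AbstractSet[str], b: AbstractSet[str]) -> int:
    """Hamming distance between two tag sets viewed as binary vectors
    over the tag vocabulary: the number of tags found in exactly one set."""
    return len(a ^ b)


def relaxed_hamming_distance(
    a: AbstractSet[str],
    b: AbstractSet[str],
    related: Callable[[str, str], bool],
) -> int:
    """ASSUMPTION, not the paper's definition: a tag missing from the other
    set is counted as a mismatch only if the other set holds no related tag."""

    def unmatched(src: AbstractSet[str], dst: AbstractSet[str]) -> int:
        return sum(
            1 for tag in src - dst
            if not any(related(tag, other) for other in dst)
        )

    return unmatched(a, b) + unmatched(b, a)


# Hypothetical relatedness table; a real system might derive this from
# tag co-occurrence statistics or a thesaurus.
RELATED_PAIRS = {frozenset({"tutorial", "guide"})}


def related(t: str, u: str) -> bool:
    return frozenset({t, u}) in RELATED_PAIRS


a = {"python", "tutorial", "web"}
b = {"python", "guide", "web"}

print(hamming_distance(a, b))                   # 2
print(relaxed_hamming_distance(a, b, related))  # 0
```

Under this assumed rule, the two sets differ by two tags in the plain measure but are treated as equivalent once "tutorial" and "guide" are considered related, which is the kind of effect the abstract attributes to comparing members by semantic relatedness.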

Original language: English
Title of host publication: 6th International Conference on Research Challenges in Information Science, RCIS 2012 - Conference Proceedings
Publication status: Published - 2012
Event: 6th International Conference on Research Challenges in Information Science, RCIS 2012 - Valencia, Spain
Duration: 2012 May 16 → 2012 May 18

Publication series

Name: Proceedings - International Conference on Research Challenges in Information Science
ISSN (Print): 2151-1349
ISSN (Electronic): 2151-1357

Other

Other: 6th International Conference on Research Challenges in Information Science, RCIS 2012
Country/Territory: Spain
City: Valencia
Period: 2012/05/16 → 2012/05/18

Keywords

  • Social tagging
  • index structure
  • similarity search

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Software
