Robust region-of-interest determination based on user attention model through visual rhythm analysis

Ming Chieh Chi*, Chia Hung Yeh, Mei Juan Chen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

37 Citations (Scopus)


Region-of-interest (ROI) determination is important for video processing, and a simple method to identify the ROI is desirable. Along this direction, this paper investigates a user attention model based on visual rhythm analysis for automatic ROI determination in a video. The visual rhythm, an abstraction of a video, is a 2-D thumbnail image that condenses a video sequence while preserving its temporal information. Four sampling lines, namely diagonal, anti-diagonal, vertical, and horizontal lines, are employed to obtain four visual rhythm maps, from which the location of the ROI is analyzed. From the variation of the visual rhythms, object and camera motions can be efficiently distinguished. With hardware design in mind, the proposed scheme extracts the ROI accurately with very low computational complexity, making it suitable for real-time applications. Experimental results demonstrate that moving objects are extracted effectively and efficiently. Finally, we present a way to combine flexible macroblock ordering with ROI determination as a preprocessing step for H.264/AVC video coding; experimental results show that the quality of the ROI regions is significantly enhanced.
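The core idea of a visual rhythm map can be illustrated with a short sketch: each frame contributes one sampled line of pixels, and stacking those lines over time yields a 2-D image in which a moving object appears as a sloped streak. The sketch below is a minimal illustration of this sampling idea, not the paper's exact implementation; the function name, the line choices, and the toy moving-square sequence are assumptions for demonstration.

```python
import numpy as np

def visual_rhythm(frames, line="diagonal"):
    """Build a visual rhythm map: one sampled pixel line per frame,
    stacked as columns of a 2-D image. Illustrative sketch only."""
    n_frames, h, w = frames.shape
    n = min(h, w)
    rows = (np.arange(n) * h) // n
    cols = (np.arange(n) * w) // n
    if line == "diagonal":          # top-left -> bottom-right
        r, c = rows, cols
    elif line == "anti-diagonal":   # top-right -> bottom-left
        r, c = rows, cols[::-1]
    elif line == "vertical":        # center column of each frame
        r, c = np.arange(h), np.full(h, w // 2)
    elif line == "horizontal":      # center row of each frame
        r, c = np.full(w, h // 2), np.arange(w)
    else:
        raise ValueError(f"unknown sampling line: {line}")
    # each frame contributes one column of the rhythm image
    return np.stack([f[r, c] for f in frames], axis=1)

# toy example: 10 frames with a bright square moving left to right
frames = np.zeros((10, 64, 64), dtype=np.uint8)
for t in range(10):
    frames[t, 28:38, 5 + 5 * t:15 + 5 * t] = 255

vr = visual_rhythm(frames, "horizontal")
print(vr.shape)  # prints (64, 10): 64 samples per frame, 10 frames
```

In the resulting map `vr`, the moving square shows up as a diagonal band of bright pixels, while a static background would produce horizontal stripes; it is this difference in the variation pattern that lets visual rhythms separate object motion from camera motion.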

Original language: English
Article number: 4914872
Pages (from-to): 1025-1038
Number of pages: 14
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Issue number: 7
Publication status: Published - Jul 2009
Externally published: Yes


Keywords

  • Content analysis
  • Feature extraction
  • Region-of-interest (ROI)
  • User attention
  • Visual rhythm

ASJC Scopus subject areas

  • Media Technology
  • Electrical and Electronic Engineering


