Abstract
A region of interest (ROI) is an area of an image that attracts visual attention or contains critical information. Identifying and focusing on ROIs can improve computational efficiency by avoiding redundant computation in non-informative areas, so detecting the ROI within a video frame is crucial for many multimedia applications. However, most existing approaches rely on pixel-domain information and convolutional neural networks for ROI detection, and the potential of graph convolutional networks (GCNs) to exploit compressed-domain information for this task remains underexplored. Accordingly, this paper proposes a coding unit (CU)-based ROI detection method that applies a GCN to compressed-domain information from H.266/Versatile Video Coding (VVC) encoded video. Each video frame is represented as a graph in which every CU is a node. Each node carries features such as geometric attributes, spatio-temporal position, coding mode, quantization parameter, motion characteristics, and residual statistics, and the nodes are connected by edges to form a graph representation of the frame. The resulting graph is then processed by a GCN to identify ROI CUs. Experimental results demonstrate that the proposed method detects ROI CUs effectively and efficiently while significantly reducing computation time compared with previous works.
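
The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how edges are defined, so the rule used here (two CUs are connected when their rectangles share a border segment) is an assumption, as are the helper names `cus_adjacent`, `build_adjacency`, and `gcn_layer`, the toy CU layout, the three placeholder features per node, and the random weights. The layer uses the widely used symmetric-normalized GCN propagation rule, which may differ from the variant in the paper.

```python
import numpy as np


def cus_adjacent(a, b):
    """Return True if two CU rectangles (x, y, w, h) share a border segment.

    Assumption: spatial adjacency defines the edges; the abstract only says
    that nodes are connected via edges.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Length of the overlap of the two rectangles projected on each axis.
    x_overlap = min(ax + aw, bx + bw) - max(ax, bx)
    y_overlap = min(ay + ah, by + bh) - max(ay, by)
    side_by_side = (ax + aw == bx or bx + bw == ax) and y_overlap > 0
    stacked = (ay + ah == by or by + bh == ay) and x_overlap > 0
    return side_by_side or stacked


def build_adjacency(cus):
    """Build a symmetric 0/1 adjacency matrix with one node per CU."""
    n = len(cus)
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if cus_adjacent(cus[i], cus[j]):
                adj[i, j] = adj[j, i] = 1.0
    return adj


def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ feats @ weight, 0.0)


# Hypothetical 24x24 frame region split into four CUs (x, y, width, height).
cus = [(0, 0, 16, 16), (16, 0, 8, 8), (16, 8, 8, 8), (0, 16, 24, 8)]
adj = build_adjacency(cus)

# Placeholder node features: width, height, quantization parameter.
feats = np.array([[16, 16, 32], [8, 8, 30], [8, 8, 30], [24, 8, 34]], dtype=float)

rng = np.random.default_rng(0)
weight = rng.normal(size=(3, 2))                 # untrained, illustrative only
embeddings = gcn_layer(adj, feats, weight)       # one 2-d embedding per CU
```

In a trained model, stacked layers of this kind would end in a per-node classifier that labels each CU as ROI or non-ROI. The feature list in the abstract (coding mode, motion characteristics, residual statistics, and so on) would replace the three placeholder columns.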
| Original language | English |
|---|---|
| Pages (from-to) | 186552-186563 |
| Number of pages | 12 |
| Journal | IEEE Access |
| Volume | 13 |
| Publication status | Published - 2025 |
Keywords
- compressed domain
- graph convolutional network
- H.266/VVC
- region of interest
- versatile video coding
ASJC Scopus subject areas
- General Computer Science
- General Materials Science
- General Engineering