An efficient VLSI architecture for H.264 variable block size motion estimation

Chien Min Ou, Chian Feng Le, Wen Jyi Hwang

Research output: Contribution to journal (article)

69 Citations (Scopus)

Abstract

This paper proposes a novel, flexible VLSI architecture for variable block size motion estimation (VBSME). The architecture can perform a full motion search on block sizes that are integral multiples of 4×4. To use the architecture, each 16×16 macroblock of the source frames is partitioned into sixteen non-overlapping 4×4 subblocks, called primitive subblocks. The architecture contains sixteen modules and one VBSME processor. Each module, realized by cascading 1D systolic arrays, is responsible for the block-matching operations of a different primitive subblock. This realization offers high throughput, high flexibility, and 100% processing element (PE) utilization. The motion estimation of all the primitive subblocks is performed in parallel. Because these primitive subblocks can be combined to form the 41 subblocks of different sizes specified by H.264, the VBSME processor concurrently computes the sums of absolute differences (SADs) of all 41 subblocks from the SADs of the primitive subblocks. The new architecture has lower latency and higher throughput than other existing VBSME architectures for the hardware implementation of H.264 encoders.
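The SAD-merging idea in the abstract can be illustrated in software. The following is a minimal Python sketch, not the paper's hardware design: it assumes a single candidate position, with the current and reference 16×16 macroblocks given as lists of lists of integer pixel values. The names `primitive_sads`, `merge_sads`, and `SHAPES` are hypothetical and chosen only for this illustration. Block sizes are written as (width, height) in pixels.

```python
# (width, height) in pixels of the seven H.264 partition shapes.
SHAPES = [(4, 4), (8, 4), (4, 8), (8, 8), (16, 8), (8, 16), (16, 16)]


def primitive_sads(cur, ref):
    """Return a 4x4 grid of SADs, one per 4x4 primitive subblock.

    cur and ref are 16x16 macroblocks (lists of 16 rows of 16 ints).
    """
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            # Each pixel contributes to the primitive subblock that contains it.
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid


def merge_sads(grid):
    """Combine the 16 primitive SADs into the SADs of all 41 subblocks.

    Returns a dict mapping (width, height) to a 2D list of SADs, laid out
    in the same row/column order as the subblocks inside the macroblock.
    """
    out = {}
    for w, h in SHAPES:
        bw, bh = w // 4, h // 4  # subblock size in units of primitive subblocks
        out[(w, h)] = [
            [
                sum(
                    grid[r][c]
                    for r in range(top, top + bh)
                    for c in range(left, left + bw)
                )
                for left in range(0, 4, bw)
            ]
            for top in range(0, 4, bh)
        ]
    return out
```

The seven shapes yield 16 + 8 + 8 + 4 + 2 + 2 + 1 = 41 subblocks, so every larger SAD is obtained by adding primitive SADs, with no further pixel-level differences needed. A full search would repeat this for every candidate position in the search window and keep the minimum SAD per subblock. In the architecture described above, the primitive SADs are produced in parallel by the sixteen modules rather than by a sequential loop.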

Original language: English
Pages (from-to): 1291-1299
Number of pages: 9
Journal: IEEE Transactions on Consumer Electronics
Volume: 51
Issue number: 4
DOI: 10.1109/TCE.2005.1561858
Publication status: Published - 2005 Nov 1

Fingerprint

Motion estimation
Throughput
Systolic arrays
Hardware
Processing

Keywords

  • H.264 standard
  • VLSI architecture
  • Variable block size motion estimation
  • Video coding

ASJC Scopus subject areas

  • Media Technology
  • Electrical and Electronic Engineering

Cite this

An efficient VLSI architecture for H.264 variable block size motion estimation. / Ou, Chien Min; Le, Chian Feng; Hwang, Wen Jyi.

In: IEEE Transactions on Consumer Electronics, Vol. 51, No. 4, 01.11.2005, p. 1291-1299.

@article{6c7d64577cb14cad8c4f1b2fc8921a59,
title = "An efficient VLSI architecture for H.264 variable block size motion estimation",
abstract = "This paper proposes a novel, flexible VLSI architecture for variable block size motion estimation (VBSME). The architecture can perform a full motion search on block sizes that are integral multiples of 4×4. To use the architecture, each 16×16 macroblock of the source frames is partitioned into sixteen non-overlapping 4×4 subblocks, called primitive subblocks. The architecture contains sixteen modules and one VBSME processor. Each module, realized by cascading 1D systolic arrays, is responsible for the block-matching operations of a different primitive subblock. This realization offers high throughput, high flexibility, and 100{\%} processing element (PE) utilization. The motion estimation of all the primitive subblocks is performed in parallel. Because these primitive subblocks can be combined to form the 41 subblocks of different sizes specified by H.264, the VBSME processor concurrently computes the sums of absolute differences (SADs) of all 41 subblocks from the SADs of the primitive subblocks. The new architecture has lower latency and higher throughput than other existing VBSME architectures for the hardware implementation of H.264 encoders.",
keywords = "H.264 standard, VLSI architecture, Variable block size motion estimation, Video coding",
author = "Ou, {Chien Min} and Le, {Chian Feng} and Hwang, {Wen Jyi}",
year = "2005",
month = "11",
day = "1",
doi = "10.1109/TCE.2005.1561858",
language = "English",
volume = "51",
pages = "1291--1299",
journal = "IEEE Transactions on Consumer Electronics",
issn = "0098-3063",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "4",

}

TY - JOUR

T1 - An efficient VLSI architecture for H.264 variable block size motion estimation

AU - Ou, Chien Min

AU - Le, Chian Feng

AU - Hwang, Wen Jyi

PY - 2005/11/1

Y1 - 2005/11/1

AB - This paper proposes a novel, flexible VLSI architecture for variable block size motion estimation (VBSME). The architecture can perform a full motion search on block sizes that are integral multiples of 4×4. To use the architecture, each 16×16 macroblock of the source frames is partitioned into sixteen non-overlapping 4×4 subblocks, called primitive subblocks. The architecture contains sixteen modules and one VBSME processor. Each module, realized by cascading 1D systolic arrays, is responsible for the block-matching operations of a different primitive subblock. This realization offers high throughput, high flexibility, and 100% processing element (PE) utilization. The motion estimation of all the primitive subblocks is performed in parallel. Because these primitive subblocks can be combined to form the 41 subblocks of different sizes specified by H.264, the VBSME processor concurrently computes the sums of absolute differences (SADs) of all 41 subblocks from the SADs of the primitive subblocks. The new architecture has lower latency and higher throughput than other existing VBSME architectures for the hardware implementation of H.264 encoders.

KW - H.264 standard

KW - VLSI architecture

KW - Variable block size motion estimation

KW - Video coding

UR - http://www.scopus.com/inward/record.url?scp=33845644392&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33845644392&partnerID=8YFLogxK

U2 - 10.1109/TCE.2005.1561858

DO - 10.1109/TCE.2005.1561858

M3 - Article

AN - SCOPUS:33845644392

VL - 51

SP - 1291

EP - 1299

JO - IEEE Transactions on Consumer Electronics

JF - IEEE Transactions on Consumer Electronics

SN - 0098-3063

IS - 4

ER -