Abstract
In this work, we present an audio content identification system that identifies unknown audio material by comparing its fingerprint with fingerprints extracted off-line and stored in a music database. We describe the fingerprint extraction procedure in detail and demonstrate that the fingerprints are robust to noise and content-preserving manipulations. The main feature in the proposed system is the zero-crossing rate, extracted with an octave-band filter bank. The zero-crossing rate describes the dominant frequency in each subband at very low computational cost. The audio fingerprint is small and can be stored efficiently alongside the compressed files in the database. It is also robust to many modifications, such as tempo change and time-alignment distortion. In addition, the octave-band filter bank enhances robustness to distortion, especially distortion localized in particular frequency regions.
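The link between zero-crossing rate and dominant frequency can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 8000 Hz sample rate, and the 440 Hz test tone are assumptions chosen for illustration, and the octave-band filter bank stage is omitted, so the input stands in for a single subband signal that has already been filtered.

```python
import math


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def dominant_frequency(samples, sample_rate):
    """Estimate the dominant frequency in Hz from the zero-crossing rate.

    A pure tone crosses zero twice per period, so f is roughly
    ZCR * sample_rate / 2. For a mixture this is only a coarse summary.
    """
    return zero_crossing_rate(samples) * sample_rate / 2


# Hypothetical subband signal: one second of a 440 Hz tone (assumed values).
sample_rate = 8000
tone = [
    math.sin(2 * math.pi * 440 * n / sample_rate + 0.1)
    for n in range(sample_rate)
]
estimate = dominant_frequency(tone, sample_rate)
print(f"estimated dominant frequency: {estimate:.1f} Hz")
```

The estimate needs only one sign comparison per sample, which is consistent with the low computational cost the abstract claims. In a full system of the kind described, one such value would presumably be computed per octave band and per frame.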
Original language | English |
---|---|
Pages (from-to) | 55-64 |
Number of pages | 10 |
Journal | Proceedings of SPIE - The International Society for Optical Engineering |
Volume | 5242 |
Publication status | Published - 2003 |
Externally published | Yes |
Event | Internet Multimedia Management Systems IV, Orlando, FL, United States; Duration: 2003 Sept 9 → 2003 Sept 11 |
Keywords
- Audio Database Management
- Audio Fingerprint
- Audio Identification
- Audio Processing
- Zero-crossing Rate
ASJC Scopus subject areas
- Electronic, Optical and Magnetic Materials
- Condensed Matter Physics
- Computer Science Applications
- Applied Mathematics
- Electrical and Electronic Engineering