A Comparative Experimental Study on Simple Features and Lightweight Models for Voice Activity Detection in Noisy Environments

  • Bo Yu Su
  • Berlin Chen
  • Shih Chieh Huang
  • Jeih Weih Hung*
  • *Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

This work presents a comparative study of voice activity detection in noise using simple acoustic features and relatively compact recurrent models within a controlled MATLAB-based framework. For each utterance, 9 baseline spectral-plus-periodicity features, MFCCs, and FBANKs are extracted and passed to several lightweight BiLSTM-based networks, either alone or preceded by a 1D CNN layer. The main experiments are carried out at a fixed SNR to separate the influence of the network structure and the feature type, and an additional series with four SNR levels is used to assess whether the same performance trends hold when the SNR varies. The results show that adding a compact CNN front-end before the BiLSTM consistently improves detection scores, that MFCCs generally outperform the baseline spectral–periodicity features and often give better recall/F1 than FBANKs for the considered lightweight models, and that (Formula presented.) +BiLSTM with 13-dimensional MFCCs offers a favorable trade-off between accuracy, robustness across SNRs, and model size. Because all conditions share a single MATLAB implementation with fixed noise types, SNR values, and evaluation metrics, this work is positioned as a benchmark and practical guideline publication for noise-robust, resource-constrained VAD, rather than as a proposal of a completely new deep-learning architecture.
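
The abstract's evaluation setup (noise mixed at a fixed SNR, detection scored with recall and F1) can be illustrated with a minimal sketch. This is not the authors' MATLAB code: it is a plain-Python illustration, and the function names `scale_noise_to_snr`, `mix_at_snr`, and `frame_scores` are hypothetical. It assumes the speech and noise signals have equal length and that VAD decisions are per-frame binary labels (1 = speech).

```python
import math


def scale_noise_to_snr(speech, noise, snr_db):
    """Scale `noise` so that the speech-to-noise power ratio equals `snr_db` (in dB)."""
    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # SNR_dB = 10 * log10(p_speech / (gain^2 * p_noise)), solved for gain
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [gain * n for n in noise]


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech at the requested SNR (illustrative helper)."""
    scaled = scale_noise_to_snr(speech, noise, snr_db)
    return [s + n for s, n in zip(speech, scaled)]


def frame_scores(reference, predicted):
    """Return (precision, recall, F1) for binary frame labels, 1 = speech."""
    tp = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 1)
    fp = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 1)
    fn = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    speech = [1.0, -1.0, 1.0, -1.0]
    noise = [0.5, -0.5, 0.5, 0.5]
    noisy = mix_at_snr(speech, noise, snr_db=10)
    print(noisy)
    print(frame_scores([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Recall here is the fraction of true speech frames that the detector marks as speech, so a model with higher recall misses fewer speech frames, while F1 balances that against false alarms on noise-only frames.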

Original language: English
Article number: 263
Journal: Electronics (Switzerland)
Volume: 15
Issue number: 2
Publication status: Published - Jan 2026

ASJC Scopus subject areas

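  • Control and Systems Engineering
  • Signal Processing
  • Hardware and Architecture
  • Computer Networks and Communications
  • Electrical and Electronic Engineering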
