AMUSE++: A Mamba-Enhanced Speech Enhancement Framework with Bi-Directional and Advanced Front-End Modeling

  • Tsung Jung Li
  • Berlin Chen
  • Jeih Weih Hung*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This study presents AMUSE++, an advanced speech enhancement framework that extends the MUSE++ model by redesigning its core Mamba module with two major improvements. First, the originally unidirectional one-dimensional (1D) Mamba is transformed into a bi-directional architecture to capture temporal dependencies more effectively. Second, this module is extended to a two-dimensional (2D) structure that jointly models the time and frequency dimensions, capturing richer speech features essential for enhancement tasks. In addition to these structural changes, we propose a Preliminary Denoising Module (PDM) as an advanced front-end, composed of multiple cascaded 2D bi-directional Mamba blocks that preprocess and denoise the input speech features before the main enhancement stage. Extensive experiments on the VoiceBank+DEMAND dataset demonstrate that AMUSE++ significantly outperforms the backbone MUSE++ across a variety of objective speech enhancement metrics, including measures of perceptual quality and intelligibility. These results confirm that the combination of bi-directionality, two-dimensional modeling, and an enhanced denoising front-end provides a powerful approach for tackling challenging noisy speech scenarios. AMUSE++ thus represents a notable advancement in neural speech enhancement architectures, paving the way for more effective and robust speech enhancement systems in real-world applications.
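
The two structural ideas in the abstract, bi-directional scanning and joint time-frequency modeling, can be illustrated with a toy sketch. This is not the AMUSE++ implementation: a fixed linear recurrence stands in for the selective Mamba layer, and the function names, the `decay` parameter, and the choice to fuse passes by summation are all assumptions made for illustration, since the abstract does not specify them.

```python
def scan(seq, decay=0.5):
    """Toy causal recurrence h[t] = decay * h[t-1] + x[t].

    Hypothetical stand-in for a unidirectional 1D Mamba layer.
    """
    out, h = [], 0.0
    for x in seq:
        h = decay * h + x
        out.append(h)
    return out


def bidirectional_scan(seq, decay=0.5):
    """Scan forward and backward, then sum the two passes.

    Fusion by summation is an assumption, not taken from the paper.
    """
    fwd = scan(seq, decay)
    bwd = scan(seq[::-1], decay)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


def scan_2d(spec, decay=0.5):
    """Bidirectional scan along time and along frequency, then sum.

    spec[f][t]: rows are frequency bins, columns are time frames.
    """
    along_time = [bidirectional_scan(row, decay) for row in spec]
    # Transpose so each list is one time frame across frequency bins.
    cols = [list(c) for c in zip(*spec)]
    along_freq_t = [bidirectional_scan(col, decay) for col in cols]
    # Transpose back to the [f][t] layout.
    along_freq = [list(r) for r in zip(*along_freq_t)]
    return [[a + b for a, b in zip(ra, rb)]
            for ra, rb in zip(along_time, along_freq)]


if __name__ == "__main__":
    impulse = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    for row in scan_2d(impulse):
        print(row)
```

With a single impulse in the middle of the grid, the unidirectional `scan` only influences later positions, whereas `scan_2d` spreads it to neighbors on both sides in time and in frequency, which is the kind of context the bi-directional 2D design is meant to provide.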

Original language: English
Article number: 282
Journal: Electronics (Switzerland)
Volume: 15
Issue number: 2
Publication status: Published - Jan 2026

Keywords

  • bi-directional 2D mamba
  • lightweight neural networks
  • mamba state-space models
  • speech enhancement
  • time–frequency modeling

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Signal Processing
  • Hardware and Architecture
  • Computer Networks and Communications
  • Electrical and Electronic Engineering
