TY - GEN
T1 - Leveraging Deep Learning to Enhance Optical Microphone System Performance with Unknown Speakers for Cochlear Implants
AU - Han, Ji Yan
AU - Li, Jia Hui
AU - Yang, Chan Shan
AU - Chen, Fei
AU - Liao, Wen Huei
AU - Liao, Yuan Fu
AU - Lai, Ying Hui
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Cochlear implants (CIs) play a crucial role in restoring hearing for individuals with severe-to-profound hearing loss. However, challenges persist, particularly at low signal-to-noise ratios and in distant-talk scenarios. This study integrates a laser Doppler vibrometer (LDV) with deep learning to reconstruct clean speech from unknown speakers in noisy conditions. Objective evaluations using short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ) show that the proposed-LDV system outperforms traditional microphones and a baseline LDV system under the same recording conditions. STOI scores for Mic-Noisy, Mic-log minimum mean square error (logMMSE), baseline-LDV, and proposed-LDV were 0.44, 0.35, 0.48, and 0.73, respectively, whereas PESQ scores were 1.51, 1.76, 1.4, and 1.96, respectively. Furthermore, vocoder-simulation listening tests showed that the proposed system achieved a higher word accuracy score than the baseline systems. These findings highlight the potential of the proposed system as a robust speech capture method for CI users, addressing challenges related to noise and distance.
KW - Cochlear implant
KW - deep learning
KW - Laser Doppler vibrometer
KW - speech enhancement
UR - http://www.scopus.com/inward/record.url?scp=85214998575&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85214998575&partnerID=8YFLogxK
U2 - 10.1109/EMBC53108.2024.10782084
DO - 10.1109/EMBC53108.2024.10782084
M3 - Conference contribution
C2 - 40039183
AN - SCOPUS:85214998575
T3 - Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
BT - 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024
Y2 - 15 July 2024 through 19 July 2024
ER -