Improving phoneme recognition of throat microphone speech recordings using transfer learning

Departments

Department of Computer Engineering

KUIS AI (Koç University & İş Bank Artificial Intelligence Center)

School / College / Institute

College of Engineering

Research Center

KU-Authors

Erzin, Engin

Turan, Mehmet Ali Tuğtekin

Date

2021

Type

Journal Article

Embargo Status

N/A

Abstract

Throat microphones (TM) are skin-attached non-acoustic sensors that are robust to environmental noise but have a lower signal bandwidth than traditional close-talk microphones (CM). Attaining high-performance phoneme recognition is challenging when the training data from a degrading channel, such as TM, is limited. In this paper, we address this challenge for TM speech recordings using a transfer learning approach based on stacked denoising auto-encoders (SDA). The proposed approach defines an SDA-based domain adaptation framework that maps the source-domain CM representations and the target-domain TM representations into a common latent space, where the mismatch between TM and CM is eliminated in order to better train an acoustic model and improve TM phoneme recognition. For the phoneme recognition task, we use a hybrid CNN/HMM system that combines a convolutional neural network (CNN) with a hidden Markov model (HMM) and delivers better acoustic modeling performance than conventional Gaussian mixture model (GMM) based models. In the experimental evaluations, the proposed transfer learning approach yields more than 12% relative phoneme error rate (PER) improvement on the TM recordings over the baseline.
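The abstract reports its gain as a relative PER improvement, which is easy to confuse with an absolute difference in error rate. The sketch below shows how such a figure is usually computed. The function name and both PER values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_per_improvement(baseline_per: float, adapted_per: float) -> float:
    """Return the PER reduction as a fraction of the baseline PER."""
    if baseline_per <= 0:
        raise ValueError("baseline_per must be positive")
    return (baseline_per - adapted_per) / baseline_per


# Hypothetical error rates (assumptions, not values reported in the paper).
baseline = 0.250  # PER of a baseline TM acoustic model
adapted = 0.220   # PER after domain adaptation

gain = relative_per_improvement(baseline, adapted)
print(f"absolute reduction: {baseline - adapted:.3f}")
print(f"relative improvement: {gain:.1%}")
```

With these illustrative numbers, an absolute reduction of three PER points corresponds to a relative improvement of about 12%.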

Publisher

Elsevier

Subject

Acoustics, Computer science, interdisciplinary applications

Source

Speech Communication

DOI

10.1016/j.specom.2021.02.004

URI

https://doi.org/10.1016/j.specom.2021.02.004
https://hdl.handle.net/20.500.14288/15621
