Publication:
Improving phoneme recognition of throat microphone speech recordings using transfer learning

dc.contributor.department: Department of Computer Engineering
dc.contributor.department: KUIS AI (Koç University & İş Bank Artificial Intelligence Center)
dc.contributor.facultymember: Yes
dc.contributor.kuauthor: Erzin, Engin
dc.contributor.kuauthor: Turan, Mehmet Ali Tuğtekin
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: Research Center
dc.date.accessioned: 2024-11-09T23:59:19Z
dc.date.issued: 2021
dc.description.abstract: Throat microphones (TM) are skin-attached non-acoustic sensors that are robust to environmental noise but have a lower signal bandwidth than traditional close-talk microphones (CM). Attaining high-performance phoneme recognition is challenging when the training data from a degrading channel, such as TM, is limited. In this paper, we address this challenge for TM speech recordings using a transfer learning approach based on stacked denoising auto-encoders (SDA). The proposed transfer learning approach defines an SDA-based domain adaptation framework that maps the source domain CM representations and the target domain TM representations into a common latent space, where the mismatch between TM and CM is eliminated, to better train an acoustic model and to improve TM phoneme recognition. For the phoneme recognition task, we use a hybrid CNN/HMM system based on a convolutional neural network (CNN) and a hidden Markov model (HMM), which delivers better acoustic modeling performance than conventional Gaussian mixture model (GMM) based models. In the experimental evaluations, we observed more than 12% relative phoneme error rate (PER) improvement for the TM recordings with the proposed transfer learning approach compared to the baseline.
dc.description.fulltext: No
dc.description.harvestedfrom: Manual
dc.description.indexedby: WOS
dc.description.indexedby: Scopus
dc.description.openaccess: NO
dc.description.peerreviewstatus: N/A
dc.description.publisherscope: International
dc.description.readpublish: N/A
dc.description.sponsoredbyTubitakEu: TÜBİTAK
dc.description.sponsorship: Scientific and Technological Research Council of Turkey (TUBITAK) [217E107] This work was supported in part by the Scientific and Technological Research Council of Turkey (TUBITAK) under grant number 217E107.
dc.description.studentonlypublication: No
dc.description.studentpublication: Yes
dc.description.version: N/A
dc.identifier.doi: 10.1016/j.specom.2021.02.004
dc.identifier.eissn: 1872-7182
dc.identifier.embargo: N/A
dc.identifier.endpage: 32
dc.identifier.grantno: 217E107
dc.identifier.issn: 0167-6393
dc.identifier.quartile: Q1
dc.identifier.scopus: 2-s2.0-85102974819
dc.identifier.startpage: 25
dc.identifier.uri: https://doi.org/10.1016/j.specom.2021.02.004
dc.identifier.uri: https://hdl.handle.net/20.500.14288/15621
dc.identifier.volume: 129
dc.identifier.wos: 000639454800004
dc.keywords: Phoneme recognition
dc.keywords: Feature augmentation
dc.keywords: Transfer learning
dc.keywords: Throat microphone
dc.keywords: Denoising auto-encoder
dc.language.iso: eng
dc.publisher: Elsevier
dc.relation.affiliation: Koç University
dc.relation.collection: Koç University Institutional Repository
dc.relation.ispartof: Speech Communication
dc.relation.openaccess: N/A
dc.rights: N/A
dc.subject: Acoustics
dc.subject: Computer science, interdisciplinary applications
dc.title: Improving phoneme recognition of throat microphone speech recordings using transfer learning
dc.type: Journal Article
dspace.entity.type: Publication
local.contributor.kuauthor: Turan, Mehmet Ali Tuğtekin
local.contributor.kuauthor: Erzin, Engin
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication: 77d67233-829b-4c3a-a28f-bd97ab5c12c7
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication: d437580f-9309-4ecb-864a-4af58309d287
relation.isParentOrgUnitOfPublication.latestForDiscovery: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
