Publication: A deep learning approach for data driven vocal tract area function estimation
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.department | Department of Electrical and Electronics Engineering | |
dc.contributor.kuauthor | Erzin, Engin | |
dc.contributor.kuauthor | Asadiabadi, Sasan | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.other | Department of Computer Engineering | |
dc.contributor.other | Department of Electrical and Electronics Engineering | |
dc.contributor.schoolcollegeinstitute | College of Sciences | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.yokid | 34503 | |
dc.contributor.yokid | N/A | |
dc.date.accessioned | 2024-11-09T13:45:07Z | |
dc.date.issued | 2018 | |
dc.description.abstract | In this paper, we present a data-driven vocal tract area function (VTAF) estimation using Deep Neural Networks (DNN). We approach the VTAF estimation problem with sequence-to-sequence learning neural networks, where regression over a sliding window is used to learn an arbitrary non-linear one-to-many mapping from the input feature sequence to the target articulatory sequence. We propose two schemes for efficient estimation of the VTAF: (1) direct estimation of the area function values and (2) indirect estimation via prediction of the vocal tract boundaries. We consider acoustic speech and phone sequences as two possible input modalities for the DNN estimators. Experimental evaluations are performed over a large dataset comprising acoustic and phonetic features with parallel articulatory information from the USC-TIMIT database. Our results show that the proposed direct and indirect schemes perform the VTAF estimation with mean absolute error (MAE) rates lower than 1.65 mm, where the direct estimation scheme is observed to perform better than the indirect scheme. | |
dc.description.fulltext | YES | |
dc.description.indexedby | WoS | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | N/A | |
dc.description.sponsorship | N/A | |
dc.description.version | Author's final manuscript | |
dc.format | | |
dc.identifier.doi | 10.1109/SLT.2018.8639582 | |
dc.identifier.embargo | NO | |
dc.identifier.filenameinventoryno | IR01885 | |
dc.identifier.isbn | 9781538643341 | |
dc.identifier.issn | 2639-5479 | |
dc.identifier.link | https://doi.org/10.1109/SLT.2018.8639582 | |
dc.identifier.quartile | N/A | |
dc.identifier.scopus | 2-s2.0-85063083027 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/3581 | |
dc.identifier.wos | 463141800025 | |
dc.keywords | Speech articulation | |
dc.keywords | Vocal tract area function | |
dc.keywords | Deep neural network | |
dc.keywords | Convolutional neural network | |
dc.language | English | |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | |
dc.relation.grantno | N/A | |
dc.relation.uri | http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/8568 | |
dc.source | 2018 IEEE Workshop on Spoken Language Technology (SLT) | |
dc.subject | Computer science | |
dc.subject | Artificial intelligence | |
dc.subject | Engineering, electrical and electronic | |
dc.title | A deep learning approach for data driven vocal tract area function estimation | |
dc.type | Journal Article | |
dspace.entity.type | Publication | |
local.contributor.authorid | 0000-0002-2715-2368 | |
local.contributor.authorid | N/A | |
local.contributor.kuauthor | Erzin, Engin | |
local.contributor.kuauthor | Asadiabadi, Sasan | |
relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isOrgUnitOfPublication | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 | |
relation.isOrgUnitOfPublication.latestForDiscovery | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 |
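The abstract describes regression over a sliding window that maps an input feature sequence (e.g. acoustic frames) to a per-frame articulatory target sequence. Below is a minimal illustrative sketch of that windowed frame-to-frame regression setup, not the authors' DNN: it stacks context frames around each time step and fits a simple linear least-squares regressor as a stand-in for the network. All shapes, names, and the random data are hypothetical.

```python
import numpy as np

def sliding_windows(feats, ctx):
    """Stack ctx frames of left/right context around each frame.

    feats: (T, D) feature sequence; returns (T, (2*ctx+1)*D).
    Edge frames are padded by repeating the boundary frame.
    """
    T, _ = feats.shape
    padded = np.pad(feats, ((ctx, ctx), (0, 0)), mode="edge")
    return np.stack([padded[t:t + 2 * ctx + 1].ravel() for t in range(T)])

rng = np.random.default_rng(0)
T, D, S = 200, 13, 32          # frames, acoustic dims, area-function sections (all hypothetical)
X = rng.standard_normal((T, D))   # stand-in acoustic feature sequence
Y = rng.standard_normal((T, S))   # stand-in per-frame VTAF targets

Xw = sliding_windows(X, ctx=5)            # (200, 143) windowed inputs
W, *_ = np.linalg.lstsq(Xw, Y, rcond=None)  # linear stand-in for the DNN regressor
pred = Xw @ W                              # (200, 32) predicted sequence
mae = np.abs(pred - Y).mean()              # evaluation metric used in the paper (MAE)
print(Xw.shape, pred.shape)
```

In the paper this mapping is learned by a deep network rather than a linear model, and the targets come from USC-TIMIT articulatory data; the sketch only shows how windowed context turns sequence-to-sequence estimation into framewise regression.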