A deep learning approach for data driven vocal tract area function estimation

dc.contributor.department	Department of Computer Engineering
dc.contributor.department	Graduate School of Sciences and Engineering
dc.contributor.kuauthor	Asadiabadi, Sasan
dc.contributor.kuauthor	Erzin, Engin
dc.contributor.schoolcollegeinstitute	College of Engineering
dc.contributor.schoolcollegeinstitute	GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned	2024-11-09T23:43:45Z
dc.date.issued	2018
dc.description.abstract	In this paper we present a data-driven approach to vocal tract area function (VTAF) estimation using deep neural networks (DNNs). We treat VTAF estimation as a sequence-to-sequence learning problem, in which regression over a sliding window is used to learn an arbitrary non-linear one-to-many mapping from the input feature sequence to the target articulatory sequence. We propose two schemes for efficient estimation of the VTAF: (1) direct estimation of the area function values and (2) indirect estimation by predicting the vocal tract boundaries. We consider acoustic speech and phone sequences as two possible input modalities for the DNN estimators. Experimental evaluations are performed on a large dataset from the USC-TIMIT database, comprising acoustic and phonetic features with parallel articulatory information. Our results show that both the direct and indirect schemes estimate the VTAF with a mean absolute error (MAE) lower than 1.65 mm, and that the direct scheme performs better than the indirect scheme.
dc.description.indexedby	WOS
dc.description.openaccess	YES
dc.description.publisherscope	International
dc.description.sponsoredbyTubitakEu	N/A
dc.identifier.isbn	978-1-5386-4334-1
dc.identifier.issn	2639-5479
dc.identifier.scopus	2-s2.0-85063083027
dc.identifier.uri	https://hdl.handle.net/20.500.14288/13549
dc.identifier.wos	463141800025
dc.keywords	Speech articulation
dc.keywords	Vocal tract area function
dc.keywords	Deep neural network
dc.keywords	Convolutional neural network
dc.keywords	Articulatory movements
dc.keywords	Neural-networks
dc.keywords	Speech
dc.keywords	Shape
dc.language.iso	eng
dc.publisher	IEEE
dc.relation.ispartof	2018 IEEE Workshop on Spoken Language Technology (SLT 2018)
dc.subject	Computer science
dc.subject	Artificial intelligence
dc.subject	Engineering
dc.subject	Electrical electronic engineering
dc.title	A deep learning approach for data driven vocal tract area function estimation
dc.type	Conference Proceeding
dspace.entity.type	Publication
local.contributor.kuauthor	Asadiabadi, Sasan
local.contributor.kuauthor	Erzin, Engin
local.publication.orgunit1	GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit1	College of Engineering
local.publication.orgunit2	Department of Computer Engineering
local.publication.orgunit2	Graduate School of Sciences and Engineering
person.familyName	Asadiabadi
person.familyName	Erzin
person.givenName	Sasan
person.givenName	Engin
relation.isOrgUnitOfPublication	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication	3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication	8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication	434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery	8e756b23-2d4a-4ce8-b1b3-62c794a8c164
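
The abstract describes regression over a sliding window and reports results as mean absolute error (MAE). The sketch below illustrates only those two ideas on toy data; it is not the authors' DNN or their feature pipeline. The function names `sliding_windows` and `mean_absolute_error`, the `context` parameter, and the edge-padding at sequence boundaries are all assumptions made for illustration.

```python
import numpy as np


def sliding_windows(features, context):
    """Stack each frame with `context` neighbours on either side.

    `features` has shape (num_frames, feature_dim). The sequence is
    edge-padded (an assumption) so every frame gets a full window.
    Returns an array of shape (num_frames, (2 * context + 1) * feature_dim).
    """
    features = np.asarray(features, dtype=float)
    padded = np.pad(features, ((context, context), (0, 0)), mode="edge")
    width = 2 * context + 1
    # One row per original frame: the flattened window centred on that frame.
    return np.stack(
        [padded[t:t + width].ravel() for t in range(len(features))]
    )


def mean_absolute_error(predicted, target):
    """Average absolute difference, in the same unit as the inputs."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean(np.abs(predicted - target)))


# Toy input: 4 frames with 2 features each (values are arbitrary).
toy_features = np.arange(8).reshape(4, 2)
toy_windows = sliding_windows(toy_features, context=1)

# Toy targets and predictions, standing in for area values in mm.
toy_mae = mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

Each row of `toy_windows` would be the input to a regressor that predicts the articulatory target for the centre frame; any regression model could be placed between the two helper functions.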

Collections

Publications without Fulltext
