Publication:
Audio-visual correlation analysis for lip feature extraction

dc.contributor.departmentDepartment of Computer Engineering
dc.contributor.departmentDepartment of Computer Engineering
dc.contributor.departmentN/A
dc.contributor.departmentDepartment of Electrical and Electronics Engineering
dc.contributor.kuauthorYemez, Yücel
dc.contributor.kuauthorErzin, Engin
dc.contributor.kuauthorSargın, Mehmet Emre
dc.contributor.kuauthorTekalp, Ahmet Murat
dc.contributor.kuprofileFaculty Member
dc.contributor.kuprofileFaculty Member
dc.contributor.kuprofileMaster Student
dc.contributor.kuprofileFaculty Member
dc.contributor.otherDepartment of Computer Engineering
dc.contributor.otherDepartment of Electrical and Electronics Engineering
dc.contributor.schoolcollegeinstituteCollege of Engineering
dc.contributor.schoolcollegeinstituteCollege of Engineering
dc.contributor.schoolcollegeinstituteGraduate School of Sciences and Engineering
dc.contributor.schoolcollegeinstituteCollege of Engineering
dc.contributor.yokid107907
dc.contributor.yokid34503
dc.contributor.yokidN/A
dc.contributor.yokid26207
dc.date.accessioned2024-11-09T23:28:33Z
dc.date.issued2005
dc.description.abstractThis paper investigates the correlations between audio features and different lip features. The audio features are the Mel Frequency Cepstral Coefficients (MFCC) of the audio signal. Three lip features are considered as visual lip information: the 2D DCT coefficients of the intensity-based image, the optical flow vectors within the lip region, and the distances between predefined points on the lip contour, which carry the lip shape information. A new methodology based on class conditional probability is used to estimate and compare the correlation between the audio features and each lip feature, and the lip feature with the highest correlation to the audio features is identified. Isolating lip features that are highly correlated with the audio signal is useful for audio-visual speech recognition, audio-visual lip synchronization, and the estimation of lip shapes from the audio signal for visual synthesis.
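The abstract names 2D DCT coefficients of the lip-region intensity image as one of the three visual features. The sketch below illustrates one common way such a feature vector could be computed. It is an assumption for illustration, not the authors' implementation: the function names `dct_matrix` and `lip_dct_features`, the orthonormal DCT-II normalization, and the choice of keeping the top-left `block` x `block` low-frequency coefficients are all hypothetical and are not stated in the record.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the n x n orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return mat


def lip_dct_features(lip_region: np.ndarray, block: int = 4) -> np.ndarray:
    """Compute a 2D DCT of a grayscale lip region and keep low frequencies.

    lip_region: 2D array of pixel intensities (hypothetical cropped lip image).
    block: side length of the low-frequency block to keep (an assumption).
    """
    rows, cols = lip_region.shape
    # Separable 2D DCT: transform the columns, then the rows.
    coeffs = dct_matrix(rows) @ lip_region.astype(float) @ dct_matrix(cols).T
    # Flatten the top-left block into a feature vector.
    return coeffs[:block, :block].ravel()


if __name__ == "__main__":
    # Synthetic 8 x 16 patch standing in for a cropped lip region.
    patch = np.random.default_rng(0).random((8, 16))
    print(lip_dct_features(patch).shape)
```

Keeping only a small low-frequency block is a typical way to obtain a compact appearance descriptor per video frame, which could then be paired with the MFCC vector of the corresponding audio frame for correlation analysis.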
dc.description.indexedbyScopus
dc.description.openaccessYES
dc.description.publisherscopeInternational
dc.description.sponsoredbyTubitakEuN/A
dc.description.volume2005
dc.identifier.doi10.1109/SIU.2005.1567665
dc.identifier.isbn0-7803-9239-6
dc.identifier.isbn978-0-7803-9239-7
dc.identifier.linkhttps://www.scopus.com/inward/record.uri?eid=2-s2.0-33846595302&doi=10.1109%2fSIU.2005.1567665&partnerID=40&md5=ece280ffc875ebc7e5e7ebe9bd8c0a3e
dc.identifier.quartileN/A
dc.identifier.scopus2-s2.0-33846595302
dc.identifier.urihttps://IEEExplore.IEEE.org/stamp/stamp.jsp?tp=&arnumber=1567665
dc.identifier.urihttps://hdl.handle.net/20.500.14288/11909
dc.keywordsAudio acoustics
dc.keywordsCorrelation methods
dc.keywordsFeature extraction
dc.keywordsImage analysis
dc.keywordsVectors
dc.keywordsAudio signals
dc.keywordsAudio visual correlation analysis
dc.keywordsMel Frequency Cepstral Coefficients (MFCC)
dc.keywordsVisual lip information
dc.keywordsImage processing
dc.languageTurkish
dc.publisherInstitute of Electrical and Electronics Engineers (IEEE)
dc.sourceProceedings of the IEEE 13th Signal Processing and Communications Applications Conference, SIU 2005
dc.subjectEngineering
dc.titleAudio-visual correlation analysis for lip feature extraction
dc.title.alternativeGörsel-işitsel ilintiye dayalı dudak öznitelik çıkarımı
dc.typeConference proceeding
dspace.entity.typePublication
local.contributor.authorid0000-0002-7515-3138
local.contributor.authorid0000-0002-2715-2368
local.contributor.authoridN/A
local.contributor.authorid0000-0003-1465-8121
local.contributor.kuauthorYemez, Yücel
local.contributor.kuauthorErzin, Engin
local.contributor.kuauthorSargın, Mehmet Emre
local.contributor.kuauthorTekalp, Ahmet Murat
relation.isOrgUnitOfPublication89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication21598063-a7c5-420d-91ba-0cc9b2db0ea0
relation.isOrgUnitOfPublication.latestForDiscovery21598063-a7c5-420d-91ba-0cc9b2db0ea0
