Publication: Audio-visual correlation analysis for lip feature extraction
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.department | N/A | |
dc.contributor.department | Department of Electrical and Electronics Engineering | |
dc.contributor.kuauthor | Yemez, Yücel | |
dc.contributor.kuauthor | Erzin, Engin | |
dc.contributor.kuauthor | Sargın, Mehmet Emre | |
dc.contributor.kuauthor | Tekalp, Ahmet Murat | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.kuprofile | Master Student | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.other | Department of Computer Engineering | |
dc.contributor.other | Department of Electrical and Electronics Engineering | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.yokid | 107907 | |
dc.contributor.yokid | 34503 | |
dc.contributor.yokid | N/A | |
dc.contributor.yokid | 26207 | |
dc.date.accessioned | 2024-11-09T23:28:33Z | |
dc.date.issued | 2005 | |
dc.description.abstract | This paper investigates the correlations between audio features and different lip features. The audio features are the Mel Frequency Cepstral Coefficients (MFCC) of the audio signal. Three lip features are considered for the visual lip information: the 2D DCT coefficients of the intensity-based image, the optical flow vectors within the lip region, and the distances between predefined points on the lip contour, which carry the lip shape information. A new methodology based on class-conditional probability is used to estimate and compare the correlation between the audio features and each lip feature, and the lip feature with the highest correlation to the audio features is identified. Isolating lip features that are highly correlated with the audio signal can be useful for audio-visual speech recognition, audio-visual lip synchronization, and estimation of lip shapes from the audio signal for visual synthesis. | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | N/A | |
dc.description.volume | 2005 | |
dc.identifier.doi | 10.1109/SIU.2005.1567665 | |
dc.identifier.isbn | 0-7803-9239-6 | |
dc.identifier.isbn | 978-0-7803-9239-7 | |
dc.identifier.link | https://www.scopus.com/inward/record.uri?eid=2-s2.0-33846595302&doi=10.1109%2fSIU.2005.1567665&partnerID=40&md5=ece280ffc875ebc7e5e7ebe9bd8c0a3e | |
dc.identifier.quartile | N/A | |
dc.identifier.scopus | 2-s2.0-33846595302 | |
dc.identifier.uri | https://IEEExplore.IEEE.org/stamp/stamp.jsp?tp=&arnumber=1567665 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/11909 | |
dc.keywords | Audio acoustics | |
dc.keywords | Correlation methods | |
dc.keywords | Feature extraction | |
dc.keywords | Image analysis | |
dc.keywords | Vectors | |
dc.keywords | Audio signals | |
dc.keywords | Audio visual correlation analysis | |
dc.keywords | Mel Frequency Cepstral Coefficients (MFCC) | |
dc.keywords | Visual lip information | |
dc.keywords | Image processing | |
dc.language | Turkish | |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | |
dc.source | Proceedings of the IEEE 13th Signal Processing and Communications Applications Conference, SIU 2005 | |
dc.subject | Engineering | |
dc.title | Audio-visual correlation analysis for lip feature extraction | |
dc.title.alternative | Görsel-işitsel ilintiye dayalı dudak öznitelik çıkarımı | |
dc.type | Conference proceeding | |
dspace.entity.type | Publication | |
local.contributor.authorid | 0000-0002-7515-3138 | |
local.contributor.authorid | 0000-0002-2715-2368 | |
local.contributor.authorid | N/A | |
local.contributor.authorid | 0000-0003-1465-8121 | |
local.contributor.kuauthor | Yemez, Yücel | |
local.contributor.kuauthor | Erzin, Engin | |
local.contributor.kuauthor | Sargın, Mehmet Emre | |
local.contributor.kuauthor | Tekalp, Ahmet Murat | |
relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isOrgUnitOfPublication | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 | |
relation.isOrgUnitOfPublication.latestForDiscovery | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 |