Research Outputs
Permanent URI for this communityhttps://hdl.handle.net/20.500.14288/2
Browse
2 results
Search Results
Publication Metadata only A deterministic analysis of an online convex mixture of experts algorithm(Institute of Electrical and Electronics Engineers (IEEE), 2015) Özkan, Hüseyin; Dönmez, Mehmet A.; N/A; Tunç, Sait; Master Student; Graduate School of Sciences and Engineering; N/AWe analyze an online learning algorithm that adaptively combines outputs of two constituent algorithms (or the experts) running in parallel to estimate an unknown desired signal. This online learning algorithm is shown to achieve and in some cases outperform the mean-square error (MSE) performance of the best constituent algorithm in the steady state. However, the MSE analysis of this algorithm in the literature uses approximations and relies on statistical models on the underlying signals. Hence, such an analysis may not be useful or valid for signals generated by various real-life systems that show high degrees of nonstationarity, limit cycles and that are even chaotic in many cases. In this brief, we produce results in an individual sequence manner. In particular, we relate the time-accumulated squared estimation error of this online algorithm at any time over any interval to the one of the optimal convex mixture of the constituent algorithms directly tuned to the underlying signal in a deterministic sense without any statistical assumptions. In this sense, our analysis provides the transient, steady-state, and tracking behavior of this algorithm in a strong sense without any approximations in the derivations or statistical assumptions on the underlying signals such that our results are guaranteed to hold. We illustrate the introduced results through examples.Publication Open Access Investigating contributions of speech and facial landmarks for talking head generation(International Speech Communication Association (ISCA), 2021) N/A; Department of Computer Engineering; Kesim, Ege; Erzin, Engin; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; College of Engineering; N/A; 34503Talking head generation is an active research problem. It has been widely studied as a direct speech-to-video or two stage speech-to-landmarks-to-video mapping problem. In this study, our main motivation is to assess individual and joint contributions of the speech and facial landmarks to the talking head generation quality through a state-of-the-art generative adversarial network (GAN) architecture. Incorporating frame and sequence discriminators and a feature matching loss, we investigate performances of speech only, landmark only and joint speech and landmark driven talking head generation on the CREMA-D dataset. Objective evaluations using the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM) and landmark distance (LMD) indicate that while landmarks bring PSNR and SSIM improvements to the speech driven system, speech brings LMD improvement to the landmark driven system. Furthermore, feature matching is observed to improve the speech driven talking head generation models significantly.