Organizational Unit:
Department of Computer Engineering


Publication Search Results

Now showing 1 - 10 of 762
  • Publication
    Subspace-based techniques for retrieval of general 3D models
    (IEEE, 2009) Sankur, Bülent; Dutaǧac, Helin; Department of Computer Engineering; Yemez, Yücel; Faculty Member; Department of Computer Engineering; College of Engineering; 107907
    In this paper, we investigate the potential of subspace techniques such as PCA, ICA, and NMF for indexing and retrieval of generic 3D models. We address the 3D shape alignment problem via continuous PCA combined with exhaustive axis labeling and reflections. We find that the most propitious 3D distance transform leading to discriminative subspace features is the inverse distance transform. Our performance on the Princeton Shape Benchmark is on a par with the state-of-the-art methods. ©2009 IEEE.
  • Publication
    Enhancement of throat microphone recordings by learning phone-dependent mappings of speech spectra
    (Institute of Electrical and Electronics Engineers (IEEE), 2013) N/A; Department of Computer Engineering; Turan, Mehmet Ali Tuğtekin; Erzin, Engin; PhD Student; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; College of Engineering; N/A; 34503
    We investigate the spectral envelope mapping problem with joint analysis of throat- and acoustic-microphone recordings to enhance throat-microphone speech. A new phone-dependent GMM-based spectral envelope mapping scheme, which performs the minimum mean square error (MMSE) estimation of the acoustic-microphone spectral envelope, is proposed. Experimental evaluations are performed to compare the proposed mapping scheme to the state-of-the-art GMM-based estimator using both objective and subjective evaluations. Objective evaluations are performed with the log-spectral distortion (LSD) and the wideband perceptual evaluation of speech quality (PESQ) metrics. Subjective evaluations are performed with the A/B pair comparison listening test. Both objective and subjective evaluations show that the proposed phone-dependent mapping consistently improves performance over the state-of-the-art GMM estimator.
  • Publication
    A novel test coverage metric for concurrently-accessed software components (A work-in-progress paper)
    (Springer-Verlag Berlin, 2006) N/A; Department of Computer Engineering; N/A; Department of Computer Engineering; Department of Computer Engineering; Taşıran, Serdar; Elmas, Tayfun; Bölükbaşı, Güven; Keremoğlu, M. Erkan; Faculty Member; PhD Student; Undergraduate Student; Researcher; Department of Computer Engineering; College of Engineering; Graduate School of Sciences and Engineering; College of Engineering, College of Engineering; N/A; N/A; N/A; N/A
    We propose a novel, practical coverage metric called "location pairs" (LP) for concurrently-accessed software components. The LP metric captures common concurrency errors that lead to atomicity or refinement violations well. We describe a software tool for measuring LP coverage and outline an inexpensive application of predicate abstraction and model checking for ruling out infeasible coverage targets.
  • Publication
    Hotspotizer: end-user authoring of mid-air gestural interactions
    (Association for Computing Machinery, 2014) N/A; Department of Computer Engineering; Department of Media and Visual Arts; Baytaş, Mehmet Aydın; Yemez, Yücel; Özcan, Oğuzhan; PhD Student; Faculty Member; Faculty Member; Department of Computer Engineering; Department of Media and Visual Arts; KU Arçelik Research Center for Creative Industries (KUAR) / KU Arçelik Yaratıcı Endüstriler Uygulama ve Araştırma Merkezi (KUAR); Graduate School of Social Sciences and Humanities; College of Engineering; College of Social Sciences and Humanities; N/A; 107907; 12532
    Drawing from a user-centered design process and guidelines derived from the literature, we developed a paradigm based on space discretization for declaratively authoring mid-air gestures and implemented it in Hotspotizer, an end-to-end toolkit for mapping custom gestures to keyboard commands. Our implementation empowers diverse user populations - including end-users without domain expertise - to develop custom gestural interfaces within minutes, for use with arbitrary applications.
  • Publication
    An audio-driven dancing avatar
    (Springer, 2008) Balci, Koray; Kizoglu, Idil; Akarun, Lale; Canton-Ferrer, Cristian; Tilmanne, Joelle; Bozkurt, Elif; Erdem, A. Tanju; Department of Computer Engineering; N/A; N/A; Department of Computer Engineering; Department of Electrical and Electronics Engineering; Yemez, Yücel; Ofli, Ferda; Demir, Yasemin; Erzin, Engin; Tekalp, Ahmet Murat; Faculty Member; PhD Student; Master Student; Faculty Member; Faculty Member; Department of Computer Engineering; Department of Electrical and Electronics Engineering; College of Engineering; Graduate School of Sciences and Engineering; Graduate School of Sciences and Engineering; College of Engineering; College of Engineering; 107907; N/A; N/A; 34503; 26207
    We present a framework for training and synthesis of an audio-driven dancing avatar. The avatar is trained for a given musical genre using the multicamera video recordings of a dance performance. The video is analyzed to capture the time-varying posture of the dancer's body whereas the musical audio signal is processed to extract the beat information. We consider two different marker-based schemes for the motion capture problem. The first scheme uses 3D joint positions to represent the body motion whereas the second uses joint angles. Body movements of the dancer are characterized by a set of recurring semantic motion patterns, i.e., dance figures. Each dance figure is modeled in a supervised manner with a set of HMM (Hidden Markov Model) structures and the associated beat frequency. In the synthesis phase, an audio signal of unknown musical type is first classified, within a time interval, into one of the genres that have been learnt in the analysis phase, based on mel frequency cepstral coefficients (MFCC). The motion parameters of the corresponding dance figures are then synthesized via the trained HMM structures in synchrony with the audio signal based on the estimated tempo information. Finally, the generated motion parameters, either the joint angles or the 3D joint positions of the body, are animated along with the musical audio using two different animation tools that we have developed. Experimental results demonstrate the effectiveness of the proposed framework.
  • Publication
    Multimodal analysis of speech prosody and upper body gestures using hidden semi-Markov models
    (Institute of Electrical and Electronics Engineers (IEEE), 2013) N/A; N/A; N/A; Department of Computer Engineering; Department of Computer Engineering; Bozkurt, Elif; Asta, Shahriar; Özkul, Serkan; Yemez, Yücel; Erzin, Engin; PhD Student; PhD Student; Master Student; Faculty Member; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; Graduate School of Sciences and Engineering; Graduate School of Sciences and Engineering; College of Engineering; College of Engineering; N/A; N/A; N/A; 107907; 34503
    Gesticulation is an essential component of face-to-face communication, and it contributes significantly to the natural and affective perception of human-to-human communication. In this work we investigate a new multimodal analysis framework to model relationships between intonational and gesture phrases using the hidden semi-Markov models (HSMMs). The HSMM framework effectively associates longer duration gesture phrases to shorter duration prosody clusters, while maintaining realistic gesture phrase duration statistics. We evaluate the multimodal analysis framework by generating speech prosody driven gesture animation, and employing both subjective and objective metrics.
  • Publication
    Multicamera audio-visual analysis of dance figures
    (IEEE, 2007) N/A; N/A; Department of Computer Engineering; Department of Computer Engineering; Department of Electrical and Electronics Engineering; Ofli, Ferda; Erzin, Engin; Yemez, Yücel; Tekalp, Ahmet Murat; PhD Student; Faculty Member; Faculty Member; Faculty Member; Department of Computer Engineering; Department of Electrical and Electronics Engineering; Graduate School of Sciences and Engineering; College of Engineering; College of Engineering; College of Engineering; N/A; 34503; 107907; 26207
    We present an automated system for multicamera motion capture and audio-visual analysis of dance figures. The multiview video of a dancing actor is acquired using 8 synchronized cameras. The motion capture technique is based on 3D tracking of the markers attached to the person's body in the scene, using stereo color information without need for an explicit 3D model. The resulting set of 3D points is then used to extract the body motion features as 3D displacement vectors, whereas MFC coefficients serve as the audio features. In the first stage of multimodal analysis, we perform Hidden Markov Model (HMM) based unsupervised temporal segmentation of the audio and body motion features, separately, to determine the recurrent elementary audio and body motion patterns. Then, in the second stage, we investigate the correlation of body motion patterns with audio patterns, which can be used for estimation and synthesis of realistic audio-driven body animation.
  • Publication
    On the rate of convergence of a classifier based on a transformer encoder
    (IEEE-Inst Electrical Electronics Engineers Inc, 2022) Gurevych, Iryna; Kohler, Michael; Department of Computer Engineering; Şahin, Gözde Gül; Faculty Member; Department of Computer Engineering; College of Engineering; 366984
    Pattern recognition based on a high-dimensional predictor is considered. A classifier is defined which is based on a Transformer encoder. The rate of convergence of the misclassification probability of the classifier towards the optimal misclassification probability is analyzed. It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model. Furthermore, the difference between the Transformer classifiers theoretically analyzed in this paper and the ones used in practice today is illustrated by means of classification problems in natural language processing.
  • Publication
    Audio-facial laughter detection in naturalistic dyadic conversations
    (IEEE-Inst Electrical Electronics Engineers Inc, 2017) N/A; N/A; Department of Computer Engineering; Department of Computer Engineering; Department of Computer Engineering; Türker, Bekir Berker; Yemez, Yücel; Sezgin, Tevfik Metin; Erzin, Engin; PhD Student; Faculty Member; Faculty Member; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; College of Engineering; College of Engineering; N/A; 107907; 18632; 34503
    We address the problem of continuous laughter detection over audio-facial input streams obtained from naturalistic dyadic conversations. We first present meticulous annotation of laughters, cross-talks and environmental noise in an audio-facial database with explicit 3D facial mocap data. Using this annotated database, we rigorously investigate the utility of facial information, head movement and audio features for laughter detection. We identify a set of discriminative features using mutual information-based criteria, and show how they can be used with classifiers based on support vector machines (SVMs) and time delay neural networks (TDNNs). Informed by the analysis of the individual modalities, we propose a multimodal fusion setup for laughter detection using different classifier-feature combinations. We also effectively incorporate bagging into our classification pipeline to address the class imbalance problem caused by the scarcity of positive laughter instances. Our results indicate that a combination of TDNNs and SVMs leads to superior detection performance, and bagging effectively addresses data imbalance. Our experiments show that our multimodal approach supported by bagging compares favorably to the state of the art in the presence of detrimental factors such as cross-talk, environmental noise, and data imbalance.
  • Publication
    An ECC processor for IoT using Edwards curves and DFT modular multiplication
    (Springer, 2022) Al-Khaleel, Osama; Baktir, Selçuk; Department of Computer Engineering; Küpçü, Alptekin; Faculty Member; Department of Computer Engineering; College of Engineering; 168060
    In this work, an elliptic curve cryptography (ECC) processor is proposed to be used in the Internet of Things (IoT) devices. The ECC processor is designed based on Edwards curves defined over the finite prime fields GF((2^{13} - 1)^{13}), GF((2^{17} - 1)^{17}), and GF((2^{19} - 1)^{19}). Modular multiplication in the proposed ECC processor is carried out in the frequency domain using a Discrete Fourier Transform (DFT) modular multiplier. Different base field adders and base field multipliers are designed and utilized in the design of the DFT modular multiplier. The ECC processor is described and functionally tested using the VHDL language and the simulation tool in the Xilinx ISE 14.2. Furthermore, the ECC processor is synthesized using the synthesis tool in the Xilinx ISE 14.2, targeting the Virtex-5 FPGA family. Our synthesis results show that the proposed ECC processor achieves higher speed with minor area penalty compared to the similar work in the literature.