Researcher: Khaki, Hossein
Search Results (8 publications)
Publication (metadata only): Agreement and disagreement classification of dyadic interactions using vocal and gestural cues (Institute of Electrical and Electronics Engineers (IEEE), 2016). Authors: Khaki, Hossein; Bozkurt, Elif; Erzin, Engin. Affiliation: Department of Computer Engineering.
Abstract: In human-to-human communication, gesture and speech co-exist in time with tight synchrony; we tend to use gestures to complement or to emphasize speech. In this study, we investigate the roles of vocal and gestural cues in identifying a dyadic interaction as agreement or disagreement. We use the JESTKOD database, which consists of speech and full-body motion capture recordings of dyadic interactions under agreement and disagreement scenarios. Spectral features of the vocal channel and upper-body joint angles of the gestural channel are employed to obtain unimodal and multimodal classification performance. Both modalities attain classification rates significantly above chance level, and the multimodal classifier reaches a classification rate above 80% over 15-second utterances using statistical features of speech and motion.

Publication (metadata only): The JESTKOD database: an affective multimodal database of dyadic interactions (Springer, 2017). Authors: Bozkurt, Elif; Khaki, Hossein; Keçeci, Sinan; Türker, Bekir Berker; Yemez, Yücel; Erzin, Engin. Affiliation: Department of Computer Engineering.
Abstract: In human-to-human communication, gesture and speech co-exist in time with tight synchrony, and gestures are often utilized to complement or to emphasize speech. In human-computer interaction systems, natural, affective, and believable use of gestures would be a valuable key component in adopting and emphasizing human-centered aspects. However, natural and affective multimodal data for studying computational models of gesture and speech is limited. In this study, we introduce the JESTKOD database, which consists of speech and full-body motion capture recordings in a dyadic interaction setting under agreement and disagreement scenarios. Participants of the dyadic interactions are native Turkish speakers, and the recordings of each participant are rated in dimensional affect space. We present our multimodal data collection and annotation process, as well as our preliminary experimental studies on agreement/disagreement classification of dyadic interactions using body gesture and speech data. The JESTKOD database provides a valuable asset for investigating gesture and speech toward designing more natural and affective human-computer interaction systems.
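The two JESTKOD entries above describe utterance-level agreement/disagreement classification from statistical functionals of speech and body-motion features. The sketch below illustrates that kind of pipeline with scikit-learn; the feature dimensions, window lengths, and synthetic data are illustrative assumptions, not the actual JESTKOD features or protocol.

```python
# Minimal sketch: unimodal and feature-level fused agreement/disagreement
# classification from statistical functionals. All data here is synthetic.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def statistical_functionals(frames):
    """Pool frame-level features (T x D) into a fixed-length utterance vector."""
    return np.concatenate([frames.mean(0), frames.std(0), frames.min(0), frames.max(0)])

n_utt = 100
speech = [rng.normal(size=(1500, 13)) for _ in range(n_utt)]  # stand-in spectral frames
motion = [rng.normal(size=(180, 20)) for _ in range(n_utt)]   # stand-in upper-body joint angles
labels = rng.integers(0, 2, size=n_utt)                       # 0 = disagreement, 1 = agreement

X_speech = np.stack([statistical_functionals(u) for u in speech])
X_motion = np.stack([statistical_functionals(u) for u in motion])
X_fused = np.hstack([X_speech, X_motion])                     # feature-level fusion

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
for name, X in [("speech", X_speech), ("motion", X_motion), ("fused", X_fused)]:
    print(name, "accuracy:", cross_val_score(clf, X, labels, cv=5).mean())
```

Concatenating the two modalities' functionals before the classifier is only one possible fusion choice; decision-level fusion of per-modality classifiers is an equally plausible reading of the multimodal setup.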
Publication (metadata only): Speech features for telemonitoring of Parkinson's disease symptoms (Institute of Electrical and Electronics Engineers (IEEE), 2017). Authors: Ramezani, Hamideh; Khaki, Hossein; Erzin, Engin; Akan, Özgür Barış. Affiliations: Department of Computer Engineering; Department of Electrical and Electronics Engineering.
Abstract: The aim of this paper is to track Parkinson's disease (PD) progression based on its symptoms in the vocal system, using the Unified Parkinson's Disease Rating Scale (UPDRS). We utilize a standard speech-signal feature set, which contains 6373 static features as functionals of low-level descriptor (LLD) contours, and select the most informative ones using the maximal relevance and minimal redundancy based on correlations (mRMRC) criterion. We then evaluate the performance of Gaussian mixture regression (GMR) and support vector regression (SVR) in estimating the third subscale of the UPDRS, i.e., the UPDRS motor subscale (UPDRS-III). Among the most informative features, a shortlist is selected after redundancy reduction. The selected features show that LLDs carrying information about spectral flatness, the spectral distribution of energy, and hoarseness of voice are the most important for estimating UPDRS-III. Moreover, the most informative statistical functionals relate to the range, maximum, minimum, and standard deviation of the LLDs, which is evidence of muscle weakness due to PD. Furthermore, GMR outperforms SVR on compact feature sets, while the performance of SVR improves as the number of features increases.

Publication (metadata only): Use of agreement/disagreement classification in dyadic interactions for continuous emotion recognition (International Speech Communication Association (ISCA), 2016). Authors: Khaki, Hossein; Erzin, Engin. Affiliation: Department of Computer Engineering.
Abstract: Natural and affective handshakes of two participants define the course of dyadic interaction. The affective states of the participants are expected to be correlated with the nature or type of the dyadic interaction. In this study, we investigate the relationship between affective attributes and the nature of dyadic interaction. We use the JESTKOD database, which consists of speech and full-body motion capture recordings of dyadic interactions under agreement and disagreement scenarios; the dataset also has affective annotations in the activation, valence, and dominance (AVD) attributes. We pose the continuous affect recognition problem under the agreement and disagreement scenarios of dyadic interactions. We define a statistical mapping, using support vector regression (SVR), from the speech and motion modalities to the affective attributes, with and without the dyadic interaction type (DIT) information. We observe an improvement in the estimation of the valence attribute when the DIT is available. Furthermore, this improvement is sustained even when we estimate the DIT from the speech and motion modalities of the dyadic interaction.
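The Parkinson's telemonitoring entry above selects informative functionals with a correlation-based maximal-relevance/minimal-redundancy criterion and regresses UPDRS-III with SVR (or GMR). Below is a minimal sketch of a greedy correlation-based mRMR-style selector followed by SVR; the feature matrix, target scores, and the number of selected features are synthetic assumptions, and the exact mRMRC criterion used in the paper may differ in detail.

```python
# Minimal sketch: greedy correlation-based relevance-minus-redundancy selection,
# then SVR regression of a stand-in severity score. Data is synthetic.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))                               # stand-in for 6373 functionals
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=200)    # stand-in UPDRS-III scores

def mrmr_select(X, y, k):
    """Greedily pick features maximising |corr with target| minus mean |corr| with selected."""
    relevance = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            redundancy = np.mean([abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) for s in selected])
            if relevance[j] - redundancy > best_score:
                best, best_score = j, relevance[j] - redundancy
        selected.append(best)
    return selected

idx = mrmr_select(X, y, k=20)
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
print("CV R^2 on selected features:", cross_val_score(model, X[:, idx], y, cv=5).mean())
```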
Publication (metadata only): Use of affect based interaction classification for continuous emotion tracking (Institute of Electrical and Electronics Engineers (IEEE), 2017). Authors: Khaki, Hossein; Erzin, Engin. Affiliation: Department of Computer Engineering.
Abstract: Natural and affective handshakes of two participants define the course of dyadic interaction. The affective states of the participants are expected to be correlated with the nature of the dyadic interaction. In this paper, we extract two classes of dyadic interaction based on temporal clustering of affective states. We use k-means temporal clustering to define the interaction classes and utilize a support vector machine based classifier to estimate the interaction class types from multimodal (speech and motion) features. We then investigate the continuous emotion tracking problem over the dyadic interaction classes. We use the JESTKOD database, which consists of speech and full-body motion capture recordings of dyadic interactions with affective annotations in the activation, valence, and dominance (AVD) attributes. Continuous affect tracking is performed as estimation of the AVD attributes. Experimental evaluation results attain statistically significant (p < 0.05) improvements in affective state estimation using the interaction class information.

Publication (metadata only): JESTKOD database: dyadic interaction analysis (IEEE, 2015). Authors: Erzin, Engin; Bozkurt, Elif; Yemez, Yücel; Türker, Bekir Berker; Keçeci, Sinan; Khaki, Hossein. Affiliation: Department of Computer Engineering.
Abstract: In the nature of human-to-human communication, gesture and speech co-exist in time with tight synchrony; we tend to use gestures to complement or to emphasize speech. In this study, we present the JESTKOD database, which will be a valuable asset for examining gesture and speech in defining more natural human-computer interaction systems. The JESTKOD database consists of speech and motion capture recordings of dyadic interactions under friendly and unfriendly interaction scenarios. In this paper, we present our multimodal data collection process as well as early experimental studies on friendly/unfriendly classification of dyadic interactions using body gesture and speech data. © 2015 IEEE.
Abstract (Turkish, translated): Body gestures, together with speech, form an important part of human-to-human communication, serving to emphasize and complement it. With the multimodal JESTKOD database presented in this work, we aim to make human-computer interaction more natural by studying speech and body gestures, which are an important part of human-to-human communication. The JESTKOD database consists of recordings, under positive and negative interaction scenarios, of speech and motion-capture-based body movements centered on the participants' emotional changes. In this paper, we present the methods and stages used to prepare the multimodal database. We also present our positive/negative interaction classification results, obtained over the speech and motion capture recordings, for evaluating the dyadic interaction scenarios of the database.
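The affect-based interaction classification entry above defines two interaction classes by k-means clustering of affective annotations and then predicts the class from multimodal features with an SVM. The sketch below mirrors those two steps on synthetic data; the window count, feature dimensions, and the coupling between annotations and features are assumptions for illustration only.

```python
# Minimal sketch: k-means over windowed AVD annotations defines interaction
# classes; an SVM then predicts the class from multimodal features. Synthetic data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_windows = 300
avd = rng.normal(size=(n_windows, 3))             # stand-in activation/valence/dominance per window
features = np.hstack([avd + rng.normal(scale=0.8, size=(n_windows, 3)),
                      rng.normal(size=(n_windows, 40))])  # stand-in speech+motion functionals

# Step 1: two interaction classes from clustering of the affective annotations.
interaction_class = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(avd)

# Step 2: estimate the interaction class from the multimodal features.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print("interaction-class accuracy:",
      cross_val_score(clf, features, interaction_class, cv=5).mean())
```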
Publication (metadata only): Continuous emotion tracking using total variability space (International Speech Communication Association (ISCA), 2015). Authors: Khaki, Hossein; Erzin, Engin. Affiliation: Department of Computer Engineering.
Abstract: Automatic continuous emotion tracking (CET) has received increased attention, with expected applications in medical, robotic, and human-machine interaction areas. The speech signal carries useful clues for estimating the affective state of the speaker. In this paper, we present the Total Variability Space (TVS) for CET from speech data. TVS is a widely used framework in speaker and language recognition applications; in this study, we apply TVS as an unsupervised emotional feature extraction framework. Assuming low temporal variation in the affective space, we discretize the continuous affective state and extract i-vectors. Experimental evaluations are performed on the CreativeIT dataset, and fusion results with a pool of statistical functions over mel-frequency cepstral coefficients (MFCCs) show a 2% improvement in emotion tracking from speech.

Publication (open access): Continuous emotion tracking using total variability space (International Speech Communication Association (ISCA), 2015). Authors: Khaki, Hossein; Erzin, Engin. Affiliations: Department of Electrical and Electronics Engineering; Department of Computer Engineering. Open-access record of the paper above, with the same abstract.
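The total-variability-space entries above extract i-vectors from discretized affective segments and fuse them with MFCC statistical functionals. A full i-vector extractor estimates the total variability matrix by factor analysis over Baum-Welch statistics; the sketch below substitutes a GMM-supervector plus PCA projection as a rough stand-in, on synthetic MFCC-like frames and labels, so it illustrates the idea rather than reproducing the papers' TVS training.

```python
# Rough stand-in for TVS/i-vector extraction: a GMM "UBM" over all frames,
# posterior-weighted per-segment mean shifts (supervectors), then PCA to a
# low-dimensional space, followed by SVR on a stand-in affect label.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.decomposition import PCA
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
segments = [rng.normal(size=(200, 13)) for _ in range(120)]   # stand-in MFCC frames per segment
valence = rng.uniform(-1, 1, size=120)                        # stand-in continuous affect labels

ubm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
ubm.fit(np.vstack(segments))                                  # background model over all frames

def supervector(frames):
    """Posterior-weighted mean shift per mixture component, flattened."""
    post = ubm.predict_proba(frames)                          # (T, C) responsibilities
    counts = post.sum(0) + 1e-6
    means = (post.T @ frames) / counts[:, None]               # per-component first-order stats
    return (means - ubm.means_).ravel()

X = np.stack([supervector(s) for s in segments])
X_low = PCA(n_components=20, random_state=0).fit_transform(X)  # simplified "i-vector-like" space
print("valence CV R^2:", cross_val_score(SVR(), X_low, valence, cv=5).mean())
```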