Publications without Fulltext
Permanent URI for this collection: https://hdl.handle.net/20.500.14288/3
148 results
Search Results
Publication Metadata only
Multicamera audio-visual analysis of dance figures (IEEE, 2007)
N/A; N/A; Department of Computer Engineering; Department of Computer Engineering; Department of Electrical and Electronics Engineering; Ofli, Ferda; Erzin, Engin; Yemez, Yücel; Tekalp, Ahmet Murat; PhD Student; Faculty Member; Faculty Member; Faculty Member; Department of Computer Engineering; Department of Electrical and Electronics Engineering; Graduate School of Sciences and Engineering; College of Engineering; College of Engineering; College of Engineering; N/A; 34503; 107907; 26207
We present an automated system for multicamera motion capture and audio-visual analysis of dance figures. The multiview video of a dancing actor is acquired using 8 synchronized cameras. The motion capture technique is based on 3D tracking of the markers attached to the person's body in the scene, using stereo color information without need for an explicit 3D model. The resulting set of 3D points is then used to extract the body motion features as 3D displacement vectors, whereas MFC coefficients serve as the audio features. In the first stage of multimodal analysis, we perform Hidden Markov Model (HMM) based unsupervised temporal segmentation of the audio and body motion features, separately, to determine the recurrent elementary audio and body motion patterns.
Then, in the second stage, we investigate the correlation of body motion patterns with audio patterns, which can be used for estimation and synthesis of realistic audio-driven body animation.

Publication Metadata only
Object placement for high bandwidth memory augmented with high capacity memory (IEEE, 2017)
N/A; N/A; Department of Computer Engineering; Laghari, Mohammad; Erten, Didem Unat; Master Student; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; College of Engineering; N/A; 219274
High bandwidth memory (HBM) is an emerging technology that aims to improve the performance of bandwidth-limited applications. Even though it provides high bandwidth, it must be augmented with DRAM to meet the memory capacity requirements of applications. Due to this limitation, objects in an application should be optimally placed on the heterogeneous memory subsystems. In this study, we propose an object placement algorithm that places program objects in fast or slow memory when the capacity of fast memory is insufficient to hold all the objects, in order to increase overall application performance. Our algorithm uses the reference counts and type of references (read or write) to make an initial placement of data. In addition, we perform various memory bandwidth benchmarks to be used in our placement algorithm on Intel Knights Landing (KNL) architecture. Not surprisingly, high bandwidth memory sustains higher read bandwidth than write bandwidth; however, placing write-intensive data on HBM results in better overall performance because write-intensive data is penalized by the DRAM speed more severely than read-intensive data. Moreover, our benchmarks demonstrate that, in some cases, a basic block that makes references to both types of memory performs worse than one that makes references to only one type of memory. We test our proposed placement algorithm with 6 applications under various system configurations.
By allocating objects according to our placement scheme, we are able to achieve a speedup of up to 2x.

Publication Metadata only
Fundamental frequency estimation for heterophonical Turkish music by using VMD (Institute of Electrical and Electronics Engineers (IEEE), 2016)
Simsek, Berrak Ozturk; Akan, Aydin; Department of Computer Engineering; Bozkurt, Barış; Faculty Member; Department of Computer Engineering; College of Engineering; N/A
In this study, a new method is presented for the fundamental frequency estimation of heterophonical Turkish makam music recordings that include percussive instruments, using Variational Mode Decomposition (VMD). VMD decomposes an input signal into an ensemble of sub-signals (modes); it is entirely non-recursive, determines the relevant bands adaptively, and estimates the corresponding modes concurrently. To decompose a given signal optimally, motivated by the narrow-band properties corresponding to the Intrinsic Mode Function definition used in Empirical Mode Decomposition (EMD), we seek an ensemble of modes. Simulation results on fundamental frequency estimation of real music data show comparable performance to other common decomposition methods for music signals such as YIN and MELODIA based methods.

Publication Metadata only
Agreement and disagreement classification of dyadic interactions using vocal and gestural cues (Institute of Electrical and Electronics Engineers (IEEE), 2016)
N/A; N/A; N/A; Department of Computer Engineering; Khaki, Hossein; Bozkurt, Elif; Erzin, Engin; PhD Student; PhD Student; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; Graduate School of Sciences and Engineering; College of Engineering; N/A; N/A; 34503
In human-to-human communication, gesture and speech co-exist in time with tight synchrony, and we tend to use gestures to complement or to emphasize speech.
In this study, we investigate the roles of vocal and gestural cues in identifying a dyadic interaction as agreement or disagreement. In this investigation we use the JESTKOD database, which consists of speech and full-body motion capture data recordings for dyadic interactions under agreement and disagreement scenarios. Spectral features of the vocal channel and upper-body joint angles of the gestural channel are used to evaluate unimodal and multimodal classification performance. Both modalities attain classification rates significantly above the chance level, and the multimodal classifier achieves a classification rate above 80% over 15-second utterances using statistical features of speech and motion.

Publication Metadata only
Artificial bandwidth extension of speech excitation (IEEE, 2015)
Department of Computer Engineering; N/A; Erzin, Engin; Turan, Mehmet Ali Tuğtekin; Faculty Member; PhD Student; Department of Computer Engineering; College of Engineering; Graduate School of Sciences and Engineering; 34503; N/A
In this paper, a new approach that extends narrowband excitation signals to synthesize wideband speech is proposed. The bandwidth extension problem is analyzed using a source-filter separation framework, where a speech signal is decomposed into two independent components. For spectral envelope extension, our earlier work based on a hidden Markov model is used. For excitation signal extension, the proposed method moves the spectrum based on correlation analysis, so that the distance between the harmonics and the structure of the excitation signal are preserved in the high bands. In experimental studies, we also apply two other well-known extension techniques for excitation signals for comparison and evaluate the overall performance of the proposed system using the PESQ metric. Our findings indicate that the proposed extension method outperforms the other two techniques.
© 2015 IEEE.
Abstract (translated from Turkish): In this study, a new approach is proposed that synthesizes wideband speech by extending the bandwidth of narrowband excitation signals. The bandwidth extension problem is addressed separately on two independent components with the help of source-filter analysis. While we improve the spectral envelope that shapes the filter structure using our earlier work based on a hidden Markov model, we propose a new method based on spectral copying for extending the narrowband excitation signal. In this new method, we extend the harmonic structure in the high-frequency components of the narrowband excitation signal through correlation analysis and synthesize a wideband excitation signal. To measure the performance of the proposed improvement, two other extension methods frequently used in the literature are also evaluated comparatively. In the experimental studies, the objective performance of the proposed extension is demonstrated with the PESQ metric.

Publication Metadata only
E_coach (IEEE, 2004)
Department of Electrical and Electronics Engineering; Department of Computer Engineering; Civanlar, Mehmet Reha; Baykan, Eda; Faculty Member; Undergraduate Student; Department of Electrical and Electronics Engineering; Department of Computer Engineering; College of Engineering; College of Engineering; 16372; N/A
We developed the necessary software to control the playback speed of exercise videos playing on a personal computer, using the heart rate of an individual performing the recorded exercise routine. Moderate exercise, at an appropriate heart rate, is widely regarded today as an excellent way to improve one's health when performed on a regular and frequent basis. One popular form of indoor exercise program is a video "workout" program of aerobic exercise and/or weight training exercises.
The "off-the-shelf" exercise videos, while they may target various fitness levels (such as "beginner", "regular", and "advanced"), cannot offer precise adjustments to address each user's current fitness level. The software developed allows the playback of an exercise video to be adjusted to accommodate the fitness level of the individual user through a closed-loop feedback mechanism. The project is being extended to log and analyze the performance of an individual who uses the system regularly, and to support exercise planning. The closed-loop feedback mechanism, which models the relationship between heart rate and exercise level, is being improved through experiments whose subjects include fit people as well as sedentary ones. © 2004 IEEE.

Publication Metadata only
PPAD: privacy preserving group-based advertising in online social networks (IEEE, 2018)
N/A; Department of Computer Engineering; Department of Computer Engineering; Boshrooyeh, Sanaz Taheri; Küpçü, Alptekin; Özkasap, Öznur; PhD Student; Faculty Member; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; College of Engineering; College of Engineering; N/A; 168060; 113507
Services provided for free by Online Social Networks (OSN) come with privacy concerns. Users' information kept by OSN providers is vulnerable to the risk of being sold to advertising firms. To protect user privacy, existing proposals utilize data encryption, which prevents the providers from monetizing users' information. Therefore, the providers would not be financially motivated to establish secure OSN designs based on users' data encryption. Addressing these problems, we propose the first Privacy Preserving Group-Based Advertising (PPAD) system that gives OSN providers the ability to monetize.
PPAD performs profile and advertisement matching without requiring the users or advertisers to be online, and is shown to be secure in the presence of honest but curious servers that are allowed to create fake users or advertisers. We also present advertisement accuracy metrics under various system parameters providing a range of security-accuracy trade-offs.

Publication Metadata only
A diversity combination model incorporating an inward bias for interaural time-level difference cue integration in sound lateralization (MDPI, 2020)
N/A; N/A; Department of Computer Engineering; N/A; Mojtahedi, Sina; Erzin, Engin; Ungan, Pekcan; PhD Student; Faculty Member; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; College of Engineering; School of Medicine; N/A; 34503; N/A
A sound source with non-zero azimuth leads to interaural time and level differences (ITD and ILD). Studies on the hearing system imply that these cues are encoded in different parts of the brain, but combined to produce a single lateralization percept, as evidenced by experiments indicating trading between them. According to the duplex theory of sound lateralization, ITD and ILD play a more significant role in low-frequency and high-frequency stimulations, respectively. In this study, ITD and ILD, which were extracted from generic head-related transfer functions, were imposed on a complex sound consisting of two low- and seven high-frequency tones. Two-alternative forced-choice behavioral tests were employed to assess the accuracy in identifying a change in lateralization. Based on a diversity combination model and using the error rate data obtained from the tests, the weights of the ITD and ILD cues in their integration were determined by incorporating a bias observed for inward shifts. The weights of the two cues were found to change with the azimuth of the sound source.
While the ILD appears to be the optimal cue for the azimuths near the midline, the ITD and ILD weights turn out to be balanced for the azimuths far from the midline.

Publication Metadata only
Role allocation through haptics in physical human-robot interaction (Institute of Electrical and Electronics Engineers (IEEE), 2013)
N/A; N/A; Department of Computer Engineering; Department of Mechanical Engineering; Küçükyılmaz, Ayşe; Sezgin, Tevfik Metin; Başdoğan, Çağatay; PhD Student; Faculty Member; Faculty Member; Department of Computer Engineering; Department of Mechanical Engineering; Graduate School of Sciences and Engineering; College of Engineering; College of Engineering; N/A; 18632; 125489
This paper presents a summary of our efforts to enable dynamic role allocation between humans and robots in physical collaboration tasks. A major goal in physical human-robot interaction research is to develop tacit and natural communication between partners. In previous work, we suggested that the communication between a human and a robot would benefit from a decision making process in which the robot can dynamically adjust its control level during the task based on the intentions of the human. In order to do this, we define leader and follower roles for the partners, and using a role exchange mechanism, we enable the partners to negotiate solely through force information to exchange roles.
We show that when compared to an “equal control” condition, the role exchange mechanism improves task performance and the joint efficiency of the partners.

Publication Metadata only
Distributed deep reinforcement learning with wideband sensing for dynamic spectrum access (IEEE, 2020)
Ucar, Seyhan; N/A; Department of Computer Engineering; Department of Electrical and Electronics Engineering; Kaytaz, Umuralp; Akgün, Barış; Ergen, Sinem Çöleri; PhD Student; Faculty Member; Faculty Member; Department of Computer Engineering; Department of Electrical and Electronics Engineering; Graduate School of Sciences and Engineering; College of Engineering; College of Engineering; N/A; 258784; 7211
Wideband spectrum sensing (WBS) has been a critical issue for communication system designers and specialists to monitor and regulate the wireless spectrum. Detecting and identifying the existing signals in a continuous manner enable orchestrating signals through all controllable dimensions and enhancing resource usage efficiency. This paper presents an investigation on the application of deep learning (DL)-based algorithms within the WBS problem while also providing comparisons to the conventional recursive thresholding-based solution. For this purpose, two prominent object detectors, You Only Learn One Representation (YOLOR) and Detectron2, are implemented and fine-tuned to complete these tasks for WBS. The power spectral densities (PSDs) belonging to over-the-air (OTA) collected signals within the wide frequency range are recorded as images that constitute the signal signatures (i.e., the objects of interest) and are fed through the input of the above-mentioned learning and evaluation processes. The main signal types of interest are determined as the cellular and broadcast types (i.e., GSM, UMTS, LTE and Analogue TV) and the single-tone.
With a limited amount of captured OTA data, the DL-based approaches YOLOR and Detectron2 are seen to achieve a classification rate of 100% and detection rates of 85% and 69%, respectively, for a nonzero intersection over union threshold. The preliminary results of our investigation clearly show that both object detectors are promising to take on the task of wideband signal detection and identification, especially after an extended data collection campaign.
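
The abstract above reports detection rates "for a nonzero intersection over union threshold". As a rough illustration of that criterion only, the sketch below computes the intersection over union (IoU) of two axis-aligned boxes and counts a prediction as a detection when the IoU exceeds a threshold. The box layout `(x_min, y_min, x_max, y_max)`, the function names `iou` and `is_detected`, and the default threshold of 0.0 are assumptions made for this example; the paper's actual evaluation code and coordinate conventions are not given in the listing.

```python
from typing import Tuple

# Hypothetical box layout for this sketch: (x_min, y_min, x_max, y_max),
# e.g. pixel coordinates of a signal signature in a PSD image.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap region, clamped at zero when disjoint.
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection

    # Guard against degenerate (zero-area) boxes.
    return intersection / union if union > 0 else 0.0


def is_detected(predicted: Box, truth: Box, threshold: float = 0.0) -> bool:
    """Count a prediction as a detection when its IoU exceeds the threshold."""
    return iou(predicted, truth) > threshold
```

With a threshold of 0.0, any predicted box that overlaps the ground-truth box at all counts as a detection; raising the threshold makes the criterion stricter.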