Research Outputs

Permanent URI for this community: https://hdl.handle.net/20.500.14288/2

Search Results

Now showing 1 - 10 of 127
  • Publication
    3D articulated shape segmentation using motion information
    (Institute of Electrical and Electronics Engineers (IEEE), 2010) Department of Computer Engineering; N/A; Yemez, Yücel; Kalafatlar, Emre; Faculty Member; Master Student; Department of Computer Engineering; College of Engineering; Graduate School of Sciences and Engineering; 107907; N/A
    We present a method for segmenting articulated 3D shapes by incorporating the motion information obtained from time-varying models. We assume that the articulated shape is given as a mesh sequence with fixed connectivity, so that the inter-frame vertex correspondences, and hence the vertex movements, are known a priori. We use different postures of an articulated shape in multiple frames to build an affinity matrix that encodes both temporal and spatial similarities between surface points. The shape is then decomposed into segments in the spectral domain, based on the affinity matrix, using a standard K-means clustering algorithm. The performance of the proposed segmentation method is demonstrated on the mesh sequence of a human actor.
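As a rough illustration of the affinity matrix described in this abstract, the sketch below combines a spatial term and a temporal term for every pair of vertices in a fixed-connectivity mesh sequence. The exact affinity used by the authors is not given here, so the Gaussian form, the use of Euclidean rather than geodesic distances, and the names `affinity_matrix`, `sigma_s` and `sigma_t` are all assumptions made for illustration. The spectral embedding and K-means steps are left out.

```python
import math


def affinity_matrix(frames, sigma_s=1.0, sigma_t=1.0):
    """Build a pairwise vertex affinity matrix from a mesh sequence.

    frames: list of frames; each frame is a list of (x, y, z) vertex
    positions with the same vertex ordering (fixed connectivity).
    sigma_s, sigma_t: hypothetical scale parameters for the spatial
    and temporal terms (assumptions, not taken from the paper).
    """
    n = len(frames[0])
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Distance between vertices i and j in every frame.
            dists = [math.dist(frame[i], frame[j]) for frame in frames]
            mean_d = sum(dists) / len(dists)
            # Temporal term: vertices on the same rigid part keep a
            # nearly constant distance, so the variance stays small.
            var_d = sum((d - mean_d) ** 2 for d in dists) / len(dists)
            spatial = math.exp(-mean_d ** 2 / (2 * sigma_s ** 2))
            temporal = math.exp(-var_d / (2 * sigma_t ** 2))
            W[i][j] = spatial * temporal
    return W
```

In a full pipeline, the leading eigenvectors of a normalized version of `W` would supply the spectral coordinates that K-means then clusters.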
  • Publication
    3D face recognition
    (Institute of Electrical and Electronics Engineers (IEEE), 2006) Dutağacı, H.; Sankur, B.; Department of Computer Engineering; Yemez, Yücel; Faculty Member; Department of Computer Engineering; College of Engineering; 107907
    In this paper, we compare the face recognition performance of various features applied to registered 3D scans of faces. The features we compare are DFT- or DCT-based features, ICA-based features and NNMF-based features. We apply the feature extraction techniques to three different representations of registered faces: 3D point clouds, 2D depth images and 3D voxel representations. We also consider block-based DFT- or DCT-based local features on 2D depth images and their fusion schemes. Experiments using different combinations of representation types and feature vectors are conducted on the 3D-RMA dataset.
  • Publication
    3D isometric shape correspondence
    (IEEE, 2010) Department of Computer Engineering; Yemez, Yücel; Sahillioğlu, Yusuf; Faculty Member; PhD Student; Department of Computer Engineering; College of Engineering; Graduate School of Sciences and Engineering; 107907; 215195
    We address the problem of correspondence between 3D isometric shapes. We present an automatic method that finds the optimal correspondence between two given (nearly) isometric shapes by minimizing the amount of deviation from isometry. We optimize the isometry error in two steps. In the first step, the 3D points uniformly sampled from the shape surfaces are transformed into the spectral domain based on geodesic affinity, where the isometry errors are minimized in polynomial time by complete bipartite graph matching. The second step, which is well initialized by the correspondence resulting from the first step, explicitly minimizes the isometry cost via an iterative greedy algorithm in the original 3D Euclidean space. Our method is tested on (nearly) isometric pairs of shapes, and its performance is measured against ground-truth correspondence information when available.
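To make the notion of isometry error concrete, the sketch below scores a candidate correspondence by how much it distorts pairwise distances, then applies a simple greedy refinement. It only loosely mirrors the second step described in this abstract: the swap-based refinement, the mean-absolute-difference cost, and the names `isometry_error` and `greedy_refine` are assumptions made for illustration, not the authors' algorithm. The spectral embedding and bipartite matching of the first step are omitted.

```python
def isometry_error(geo_a, geo_b, match):
    """Mean absolute distortion of pairwise distances under a matching.

    geo_a, geo_b: square matrices of pairwise (geodesic) distances
    between the sampled points of shape A and shape B.
    match[i]: index of the point on B matched to point i on A.
    """
    n = len(match)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += abs(geo_a[i][j] - geo_b[match[i]][match[j]])
            pairs += 1
    return total / pairs if pairs else 0.0


def greedy_refine(geo_a, geo_b, match):
    """Swap pairs of assignments for as long as a swap lowers the error."""
    match = list(match)
    best = isometry_error(geo_a, geo_b, match)
    improved = True
    while improved:
        improved = False
        for i in range(len(match)):
            for j in range(i + 1, len(match)):
                match[i], match[j] = match[j], match[i]
                err = isometry_error(geo_a, geo_b, match)
                if err < best - 1e-12:
                    best, improved = err, True  # keep the swap
                else:
                    match[i], match[j] = match[j], match[i]  # undo
    return match, best
```

A perfectly isometric pair with the correct matching gives an error of zero. The greedy refinement never increases the error, but it can stop in a local minimum, which is why a good initial correspondence matters.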
  • Publication
    3D shape recovery and tracking from multi-camera video sequences via surface deformation
    (Institute of Electrical and Electronics Engineers (IEEE), 2006) Skala, V.; N/A; Department of Computer Engineering; Sahillioğlu, Yusuf; Yemez, Yücel; PhD Student; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; College of Engineering; 215195; 107907
    This paper addresses 3D reconstruction and modeling of time-varying real objects using multicamera video. The work consists of two phases. In the first phase, the initial shape of the object is recovered from its silhouettes using a surface deformation model. The same deformation model is also employed in the second phase to track the recovered initial shape through the time-varying silhouette information by surface evolution. The surface deformation/evolution model allows us to construct a spatially and temporally smooth surface mesh representation having fixed connectivity. This eventually leads to an overall space-time representation that preserves the semantics of the underlying motion and that is much more efficient to process, to visualize, to store and to transmit.
  • Publication
    A challenging design case study for interactive media design education: interactive media for individuals with autism
    (Springer, 2014) Esin Orhun, Simge; Ünlüer Çimen, Ayça; Department of Media and Visual Arts; Yantaç, Asım Evren; Faculty Member; Department of Media and Visual Arts; College of Social Sciences and Humanities; 52621
    Since 1999, research into creativity-triggering educational solutions for undergraduate interactive media design (IMD) education at Yıldız Technical University has led to a variety of rule-breaking exercises. Among many approaches, the method of designing for disabling environments, in which students design for users with one or more of their senses disabled, brought the challenge of developing interactive solutions for individuals with autism spectrum conditions (ASC). With the aim of making their lives easier, the design students were urged to find innovative yet functional interaction solutions for this focused user group, whose communication disabilities arise from deficiencies in their senses and/or cognition. Between 2011 and 2012, this project brief, supported by a participatory design method, strongly motivated 26 students to develop design works that reflect how well interaction design fits this challenging framework involving the impaired social communication associated with autism.
  • Publication
    A distributed approach for computing sum aggregation in P2P networks
    (IEEE, 2011) N/A; Department of Computer Engineering; N/A; Özkasap, Öznur; Çem, Emrah; Faculty Member; PhD Student; Department of Computer Engineering; College of Engineering; Graduate School of Sciences and Engineering; 113507; N/A
    Since hand vein patterns are assumed not to change over time except in size, and are unique to each individual, researchers aim to construct biometric control systems based on them. Each hand vein pattern defines a graph structure. Accordingly, we converted each hand vein pattern to a graph and, to match these graphs, developed an algorithm based on Graph Edit Distance (GED). GED is defined as the cost of the least-cost sequence of graph edit operations that transforms one graph into another. Our initial results confirm the utility of GED-based hand vein verification.
  • Publication
    A new computational framework for 3D shape descriptors
    (Institute of Electrical and Electronics Engineers (IEEE), 2006) Akgül, C.B.; Sankur, B.; Schmitt, F.; Department of Computer Engineering; Yemez, Yücel; Faculty Member; Department of Computer Engineering; College of Engineering; 107907
    In this work, we propose a computational framework for histogram-based 3D shape descriptors. Our method is based on evaluating the density of a shape function defined over the surface of a 3D model using Gaussian modeling. The proposed approach has a better shape description ability than competing histogram-based approaches. We illustrate this assertion in a content-based 3D model retrieval application.
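One way to read "evaluating the density of a shape function using Gaussian modeling" is as a kernel density estimate sampled at fixed target values, which gives a smooth alternative to a histogram. The sketch below follows that reading. The choice of shape function (distance to the centroid), the fixed bandwidth, and the names `density_descriptor` and `radial_distances` are assumptions made for illustration and are not taken from the paper.

```python
import math


def radial_distances(points):
    """Hypothetical shape function: distance of each point to the centroid."""
    n = len(points)
    centroid = tuple(sum(p[k] for p in points) / n for k in range(3))
    return [math.dist(p, centroid) for p in points]


def density_descriptor(values, targets, bandwidth):
    """Gaussian kernel density estimate of `values`, sampled at `targets`.

    values: scalar shape-function values measured on the surface.
    targets: fixed evaluation points; the returned list is the descriptor.
    bandwidth: kernel standard deviation (an assumed, fixed parameter).
    """
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((t - v) / bandwidth) ** 2) for v in values)
        for t in targets
    ]
```

Two models can then be compared by a distance between their descriptor vectors, provided both use the same `targets` and `bandwidth`.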
  • Publication
    Adaptive classifier cascade for multimodal speaker identification
    (International Speech Communication Association, 2004) Department of Electrical and Electronics Engineering; Department of Computer Engineering; Department of Computer Engineering; Tekalp, Ahmet Murat; Erzin, Engin; Yemez, Yücel; Faculty Member; Faculty Member; Faculty Member; Department of Electrical and Electronics Engineering; Department of Computer Engineering; College of Engineering; College of Engineering; College of Engineering; 26207; 34503; 107907
    We present a multimodal open-set speaker identification system that integrates information coming from audio, face and lip motion modalities. For fusion of multiple modalities, we propose a new adaptive cascade rule that favors reliable modality combinations through a cascade of classifiers. The order of the classifiers in the cascade is adaptively determined based on the reliability of each modality combination. A novel reliability measure that genuinely fits the open-set speaker identification problem is also proposed to assess the accept or reject decisions of a classifier. The proposed adaptive rule is more robust in the presence of unreliable modalities, and outperforms the hard-level max rule and the soft-level weighted summation rule, provided that the employed reliability measure is effective in assessing classifier decisions. Experimental results that support this assertion are provided.
  • Publication
    Adaptive streaming of multiview video over P2P networks
    (Wiley, 2013) N/A; Department of Electrical and Electronics Engineering; Gürler, Cihat Göktuğ; Tekalp, Ahmet Murat; PhD Student; Faculty Member; Department of Electrical and Electronics Engineering; Graduate School of Sciences and Engineering; College of Engineering; N/A; 26207
    Three-dimensional (3D) video is the next natural step in the evolution of digital media technologies. Recent 3D auto-stereoscopic displays can display multiview video with up to 200 views. While it is possible to broadcast 3D stereo video (two views) over digital TV platforms today, streaming over IP provides a more flexible approach for distributing stereo and free-view 3D media to home and mobile users with different connection bandwidths and different 3D displays. Here, flexible transport refers to quality-scalable and view-scalable transport over the Internet. These scalability options are the key to dealing with the biggest challenge in the delivery of multiview video, which is the scarcity of bandwidth in IP networks. However, even with the scalability options at hand, the bandwidth requirement on the sender side can easily reach critical levels and render such a service infeasible. Peer-to-peer (P2P) video streaming is a promising approach that has received significant attention recently and can be used to alleviate the problem of bandwidth scarcity in server-client-based applications. Unfortunately, P2P also introduces new challenges, such as handling unstable peer connections and peers' limited upload capacity. In this chapter, we provide an adaptive P2P video streaming solution that addresses these challenges for streaming multiview video over P2P overlays. We start by reviewing fundamental video transmission concepts and the state-of-the-art P2P video streaming solutions. We then look beyond the state of the art and introduce methods for enabling adaptive video streaming over P2P networks to distribute legacy monoscopic video. Finally, we move to the modifications that are needed to deliver multiview video in an adaptive manner over the Internet. We provide benchmark test results against state-of-the-art P2P video streaming solutions to demonstrate the superiority of the proposed approach in adaptive video transmission.
  • Publication
    Affect burst detection using multi-modal cues
    (IEEE, 2015) Department of Computer Engineering; Department of Computer Engineering; N/A; Department of Computer Engineering; N/A; Sezgin, Tevfik Metin; Yemez, Yücel; Türker, Bekir Berker; Erzin, Engin; Marzban, Shabbir; Faculty Member; Faculty Member; PhD Student; Faculty Member; Master Student; Department of Computer Engineering; College of Engineering; College of Engineering; Graduate School of Sciences and Engineering; College of Engineering; Graduate School of Sciences and Engineering; 18632; 107907; N/A; 34503; N/A
    Recently, affect bursts have gained significant importance in the field of emotion recognition since they can serve as a prior in recognising underlying affective states. In this paper we propose a data-driven approach for detecting affect bursts using multimodal streams of input such as audio and facial landmark points. The proposed Gaussian Mixture Model (GMM) based method models each modality independently and then combines the probabilistic outputs to form a decision. This gives us an edge over feature-fusion-based methods, as it allows us to handle events in which one of the modalities is too noisy or not available. We demonstrate the robustness of the proposed approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database, which contains realistic and natural dyadic conversations. This database is annotated by three annotators to segment and label affect bursts to be used for training and testing purposes. We also present a performance comparison between SVM-based methods and GMM-based methods for the same configuration of experiments.
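The decision-level fusion described in this abstract can be sketched as follows: each modality has its own per-class mixture model, and the per-modality log-likelihoods are combined only over the modalities that are actually available. The abstract does not give the combination rule, so the product rule (a sum of log-likelihoods), the one-dimensional features, and the names `gmm_log_likelihood` and `fuse_decision` are assumptions made for illustration.

```python
import math


def gmm_log_likelihood(x, components):
    """Log-likelihood of a scalar x under a 1-D Gaussian mixture.

    components: list of (weight, mean, variance) tuples.
    """
    terms = [
        math.log(w) - 0.5 * math.log(2.0 * math.pi * var)
        - (x - mu) ** 2 / (2.0 * var)
        for w, mu, var in components
    ]
    m = max(terms)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(t - m) for t in terms))


def fuse_decision(observations, models, classes=("burst", "other")):
    """Combine per-modality scores, skipping unavailable modalities.

    observations: modality name -> feature value, or None if missing.
    models: modality name -> class label -> mixture components.
    Returns the winning class and the fused log-scores.
    """
    scores = {c: 0.0 for c in classes}
    for modality, x in observations.items():
        if x is None:
            continue  # noisy or missing modality contributes nothing
        for c in classes:
            scores[c] += gmm_log_likelihood(x, models[modality][c])
    return max(scores, key=scores.get), scores
```

Because each modality is scored separately, dropping one only removes its term from the sum. A feature-fusion model would instead need the full concatenated feature vector.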