Research Outputs

Permanent URI for this community: https://hdl.handle.net/20.500.14288/2

Search Results

Now showing 1 - 5 of 5
  • Publication
    3D progressive compression with octree particles
    (Akademische Verlagsgesellschaft Aka GmbH, 2002) Schmitt, Francis; Yemez, Yücel (Faculty Member, Department of Computer Engineering, College of Engineering)
    This paper improves the storage efficiency of the progressive particle-based modeling scheme presented in [14, 15] by using entropy coding techniques. The scheme encodes the surface geometry and attributes in terms of appropriately ordered octree particles, which can then progressively be decoded and rendered by the viewer by means of a fast direct triangulation technique. With the introduced entropy coding technique, the bit load of the multi-level representation for geometry encoding reduces to 9-14 bits per particle (or 4.5-7 bits per triangle) for 12-bit quantized geometry. (An illustrative entropy-estimate sketch appears after this list.)
  • Publication
    3D shape correspondence by isometry-driven greedy optimization
    (IEEE Computer Society, 2010) Sahillioğlu, Yusuf (PhD Student, Graduate School of Sciences and Engineering); Yemez, Yücel (Faculty Member, College of Engineering); Department of Computer Engineering
    We present an automatic method that establishes 3D correspondence between isometric shapes. Our goal is to find an optimal correspondence between two given (nearly) isometric shapes that minimizes the amount of deviation from isometry. We cast the problem as a complete surface correspondence problem. Our method first divides the given shapes to be matched into surface patches of equal area and then seeks a mapping between the patch centers, which we refer to as base vertices. Hence the correspondence is established in a fast and robust manner at a relatively coarse level, as imposed by the patch radius. We optimize the isometry cost in two steps. In the first step, the base vertices are transformed into the spectral domain based on geodesic affinity, where the isometry errors are minimized in polynomial time by complete bipartite graph matching. The resulting correspondence serves as a good initialization for the second step of optimization, in which we explicitly minimize the isometry cost via an iterative greedy algorithm in the original 3D Euclidean space. We demonstrate the performance of our method on various isometric (or nearly isometric) pairs of shapes, for some of which the ground-truth correspondence is available. (A matching sketch appears after this list.)
  • Publication
    3D Shape recovery and tracking from multi-camera video sequences via surface deformation
    (IEEE, 2006) Skala, V.; Sahillioğlu, Yusuf (PhD Student, Graduate School of Sciences and Engineering); Yemez, Yücel (Faculty Member, College of Engineering); Department of Computer Engineering
    This paper addresses 3D reconstruction and modeling of time-varying real objects using multi-camera video. The work consists of two phases. In the first phase, the initial shape of the object is recovered from its silhouettes using a surface deformation model. The same deformation model is also employed in the second phase to track the recovered initial shape through the time-varying silhouette information by surface evolution. The surface deformation/evolution model allows us to construct a spatially and temporally smooth surface mesh representation with fixed connectivity. This eventually leads to an overall space-time representation that preserves the semantics of the underlying motion and that is much more efficient to process, visualize, store, and transmit. (A deformation-step sketch appears after this list.)
  • Publication
    Multimodal speaker identification with audio-video processing
    (IEEE, 2003) Yemez, Yücel (Faculty Member); Kanak, Alper (Master Student); Erzin, Engin (Faculty Member); Tekalp, Ahmet Murat (Faculty Member); Department of Computer Engineering; Department of Electrical and Electronics Engineering; College of Engineering; Graduate School of Sciences and Engineering
    In this paper we present a multimodal audio-visual speaker identification system. The objective is to improve the recognition performance over conventional unimodal schemes. The proposed system decomposes the information existing in a video stream into three components: speech, face texture, and lip motion. Lip motion between successive frames is first computed in terms of optical flow vectors and then encoded as a feature vector in a magnitude-direction histogram domain. The feature vectors obtained along the whole stream are then interpolated to match the rate of the speech signal and fused with mel-frequency cepstral coefficients (MFCC) of the corresponding speech signal. The resulting joint feature vectors are used to train and test a Hidden Markov Model (HMM) based identification system. Face texture images are treated separately in the eigenface domain and integrated into the system through decision fusion. Experimental results are also included to demonstrate the system performance. (A feature-fusion sketch appears after this list.)
  • Publication
    Speech driven 3D head gesture synthesis
    (IEEE, 2006) Erdem, A. Tanju; Sargın, Mehmet Emre (Master Student); Erzin, Engin (Faculty Member); Yemez, Yücel (Faculty Member); Tekalp, Ahmet Murat (Faculty Member); Department of Computer Engineering; Department of Electrical and Electronics Engineering; Graduate School of Sciences and Engineering; College of Engineering
    In this paper, we present a speech-driven natural head gesture analysis and synthesis system. The proposed system assumes that sharp head movements are correlated with prominence in speech. For analysis, a binocular camera system is employed to capture the head motion of a talking person. The motion parameters associated with the 3D head motion are then used for extraction of the repetitive head gestures. In parallel, prosodic events are detected using an HMM structure with pitch, formant frequencies, and speech intensity as audio features. For synthesis, the head motion parameters are estimated from the prosodic events based on a gesture-speech correlation model, and the associated Euler angles are then used for speech-driven animation of a 3D personalized talking head model. Results on head motion feature extraction, prosodic event detection, and correlation modeling are provided.
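
Below are small, self-contained Python sketches keyed to the entries above; they are illustrative only and do not reproduce the papers' actual implementations. First, for "3D progressive compression with octree particles": a zeroth-order entropy estimate of the bit load of a particle symbol stream. The symbol values below are made-up assumptions; an ideal entropy coder would approach this many bits per symbol.

import math
from collections import Counter

def entropy_bits_per_symbol(symbols):
    # Empirical zeroth-order entropy H = -sum p * log2(p), in bits per symbol.
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical stream of octree subdivision symbols (e.g. child-occupancy codes).
stream = [3, 7, 7, 1, 3, 7, 0, 3, 7, 7, 5, 3]
print(f"approx. {entropy_bits_per_symbol(stream):.2f} bits per particle with an ideal coder")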
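
For "3D shape correspondence by isometry-driven greedy optimization": a sketch of the two-step idea, assuming small precomputed geodesic-distance matrices between the base vertices of the two shapes. The affinity, embedding dimension, sign-ambiguity handling, and greedy refinement are simplified stand-ins rather than the paper's exact formulation.

import numpy as np
from scipy.optimize import linear_sum_assignment

def spectral_embedding(geodesic, dim=5):
    # Embed base vertices with the leading eigenvectors of a geodesic affinity matrix.
    affinity = np.exp(-geodesic / geodesic.mean())
    vals, vecs = np.linalg.eigh(affinity)
    order = np.argsort(vals)[::-1][:dim]
    # abs() is a crude workaround for eigenvector sign ambiguity, for illustration only.
    return np.abs(vecs[:, order]) * np.sqrt(np.abs(vals[order]))

def isometry_cost(geo_a, geo_b, match):
    # Mean |d_A(i, j) - d_B(match(i), match(j))| over all base-vertex pairs.
    return np.mean(np.abs(geo_a - geo_b[np.ix_(match, match)]))

def correspond(geo_a, geo_b, dim=5, iters=10):
    # Step 1: complete bipartite matching in the spectral domain (polynomial time).
    ea, eb = spectral_embedding(geo_a, dim), spectral_embedding(geo_b, dim)
    cost = np.linalg.norm(ea[:, None, :] - eb[None, :, :], axis=2)
    _, match = linear_sum_assignment(cost)
    # Step 2: greedy refinement; swap two assignments whenever that lowers the cost.
    for _ in range(iters):
        improved = False
        for i in range(len(match)):
            for j in range(i + 1, len(match)):
                trial = match.copy()
                trial[i], trial[j] = trial[j], trial[i]
                if isometry_cost(geo_a, geo_b, trial) < isometry_cost(geo_a, geo_b, match):
                    match, improved = trial, True
        if not improved:
            break
    return match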
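
For "3D Shape recovery and tracking from multi-camera video sequences via surface deformation": a hypothetical single evolution step on a mesh with fixed connectivity, combining Laplacian smoothing with an external per-vertex force. The force term is a placeholder; the paper's silhouette-consistency force and step sizes are not reproduced.

import numpy as np

def deform_step(vertices, neighbors, external_force, alpha=0.5, beta=0.3):
    # vertices:       (N, 3) vertex positions
    # neighbors:      list of index lists, neighbors[i] = 1-ring of vertex i
    # external_force: (N, 3) force, e.g. derived from silhouette consistency
    laplacian = np.zeros_like(vertices)
    for i, ring in enumerate(neighbors):
        # Umbrella operator: pull each vertex toward the centroid of its 1-ring.
        laplacian[i] = vertices[ring].mean(axis=0) - vertices[i]
    # Explicit update: smoothing keeps the mesh regular, the force tracks the data.
    return vertices + alpha * laplacian + beta * external_force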
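
For "Multimodal speaker identification with audio-video processing": a sketch of the rate-matching and fusion step, assuming made-up arrays of video-rate lip-motion histogram features and audio-rate MFCC vectors. HMM training and the eigenface/decision-fusion stages are omitted.

import numpy as np

def fuse_features(lip_feats, mfcc_feats):
    # Interpolate video-rate lip features to the audio frame rate, then concatenate
    # them with the MFCCs into joint feature vectors for the HMM.
    n_video, n_audio = len(lip_feats), len(mfcc_feats)
    t_video = np.linspace(0.0, 1.0, n_video)
    t_audio = np.linspace(0.0, 1.0, n_audio)
    lip_up = np.column_stack(
        [np.interp(t_audio, t_video, lip_feats[:, d]) for d in range(lip_feats.shape[1])]
    )
    return np.hstack([mfcc_feats, lip_up])  # shape (n_audio, n_mfcc + n_lip)

# Example with made-up sizes: 25 video frames of 16-bin histograms, 100 audio frames of 13 MFCCs.
joint = fuse_features(np.random.rand(25, 16), np.random.rand(100, 13))
print(joint.shape)  # (100, 29)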
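
For "Speech driven 3D head gesture synthesis": a hypothetical sketch of the synthesis direction only, mapping a per-frame prosodic event sequence to smoothed head Euler angles. The event labels, angle values, and low-pass smoothing are illustrative assumptions, not the paper's learned gesture-speech correlation model.

import numpy as np

# Hypothetical correlation model: one representative (yaw, pitch, roll) target per
# prosodic event class, e.g. obtained as cluster centers during the analysis phase.
EVENT_TO_EULER = {
    "pitch_accent": np.array([0.0, -8.0, 0.0]),  # nod, in degrees
    "boundary":     np.array([6.0,  0.0, 0.0]),  # slight turn
    "none":         np.array([0.0,  0.0, 0.0]),
}

def synthesize_head_motion(event_sequence, smooth=0.8):
    # Turn a per-frame prosodic event sequence into a smooth Euler-angle trajectory.
    angles = np.zeros((len(event_sequence), 3))
    current = np.zeros(3)
    for t, event in enumerate(event_sequence):
        target = EVENT_TO_EULER.get(event, EVENT_TO_EULER["none"])
        current = smooth * current + (1.0 - smooth) * target  # simple low-pass filter
        angles[t] = current
    return angles

print(synthesize_head_motion(["none", "pitch_accent", "pitch_accent", "none"]))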