Research Outputs

Permanent URI for this community: https://hdl.handle.net/20.500.14288/2

Search Results

Now showing 1 - 10 of 17

Metadata only
3D progressive compression with octree particles
(Akademische Verlagsgesellsch Aka Gmbh, 2002) Schmitt, Francis; Department of Computer Engineering; N/A; Yemez, Yücel; Faculty Member; Department of Computer Engineering; College of Engineering; N/A; 107907; N/A
This paper improves the storage efficiency of the progressive particle-based modeling scheme presented in [14, 15] by using entropy coding techniques. This scheme encodes the surface geometry and attributes in terms of appropriately ordered octree particles, which can then progressively be decoded and rendered by the viewer by means of a fast direct triangulation technique. With the introduced entropy coding technique, the bitload of the multi-level representation for geometry encoding reduces to 9-14 bits per particle (or 4.5-7 bits per triangle) for 12-bit quantized geometry.
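The two bit-rate figures quoted in this abstract agree with the usual vertex-to-triangle ratio of a closed triangle mesh, where Euler's formula gives roughly two triangles per vertex. The check below is only an illustration: it assumes one particle corresponds to one mesh vertex, which the abstract does not state, and the function name is hypothetical.

```python
def bits_per_triangle(bits_per_particle, triangles_per_particle=2.0):
    """Convert a per-particle bit cost into a per-triangle bit cost.

    Assumption (not from the abstract): each particle acts as one mesh
    vertex, and a closed triangle mesh has about two triangles per vertex.
    """
    return bits_per_particle / triangles_per_particle


# Range reported in the abstract for 12-bit quantized geometry.
low = bits_per_triangle(9)
high = bits_per_triangle(14)
print(low, high)
```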
Metadata only
3D shape correspondence by isometry-driven greedy optimization
(IEEE Computer Soc, 2010) N/A; Department of Computer Engineering; Sahillioğlu, Yusuf; Yemez, Yücel; PhD Student; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; College of Engineering; 215195; 107907
We present an automatic method that establishes 3D correspondence between isometric shapes. Our goal is to find an optimal correspondence between two given (nearly) isometric shapes that minimizes the amount of deviation from isometry. We cast the problem as a complete surface correspondence problem. Our method first divides the given shapes to be matched into surface patches of equal area and then seeks a mapping between the patch centers, which we refer to as base vertices. Hence the correspondence is established in a fast and robust manner at a relatively coarse level as imposed by the patch radius. We optimize the isometry cost in two steps. In the first step, the base vertices are transformed into the spectral domain based on geodesic affinity, where the isometry errors are minimized in polynomial time by complete bipartite graph matching. The resulting correspondence serves as a good initialization for the second step of optimization, in which we explicitly minimize the isometry cost via an iterative greedy algorithm in the original 3D Euclidean space. We demonstrate the performance of our method on various isometric (or nearly isometric) pairs of shapes, for some of which the ground-truth correspondence is available.
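To make the idea of an isometry cost concrete, the sketch below scores a correspondence by how much it changes pairwise distances, then improves it with a simple pairwise-swap local search. This is an illustrative stand-in, not the paper's algorithm: the cost definition, the swap strategy, and all names are assumptions, and the distance matrices are taken as precomputed inputs.

```python
def isometry_cost(dist_src, dist_tgt, mapping):
    """Mean absolute change in pairwise distance under `mapping`.

    dist_src, dist_tgt: square matrices of pairwise (e.g. geodesic) distances.
    mapping[i] is the target vertex assigned to source vertex i.
    """
    n = len(mapping)
    if n < 2:
        return 0.0
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += abs(dist_src[i][j] - dist_tgt[mapping[i]][mapping[j]])
            pairs += 1
    return total / pairs


def greedy_refine(dist_src, dist_tgt, mapping):
    """Swap pairs of assignments for as long as a swap lowers the cost."""
    mapping = list(mapping)
    best = isometry_cost(dist_src, dist_tgt, mapping)
    improved = True
    while improved:
        improved = False
        for i in range(len(mapping)):
            for j in range(i + 1, len(mapping)):
                mapping[i], mapping[j] = mapping[j], mapping[i]
                cost = isometry_cost(dist_src, dist_tgt, mapping)
                if cost < best - 1e-12:
                    best, improved = cost, True
                else:
                    # Undo the swap because it did not help.
                    mapping[i], mapping[j] = mapping[j], mapping[i]
    return mapping, best


# Toy example: three points on a line at positions 0, 1 and 3.
D = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
print(greedy_refine(D, D, [1, 0, 2]))
```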
Metadata only
3D Shape recovery and tracking from multi-camera video sequences via surface deformation
(IEEE, 2006) Skala, V.; N/A; Department of Computer Engineering; Sahillioğlu, Yusuf; Yemez, Yücel; PhD Student; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; College of Engineering; 215195; 107907
This paper addresses 3D reconstruction and modeling of time-varying real objects using multicamera video. The work consists of two phases. In the first phase, the initial shape of the object is recovered from its silhouettes using a surface deformation model. The same deformation model is also employed in the second phase to track the recovered initial shape through the time-varying silhouette information by surface evolution. The surface deformation/evolution model allows us to construct a spatially and temporally smooth surface mesh representation having fixed connectivity. This eventually leads to an overall space-time representation that preserves the semantics of the underlying motion and that is much more efficient to process, to visualize, to store and to transmit.
Metadata only
A new scalable multi-view video coding configuration for robust selective streaming of free-viewpoint TV
(IEEE, 2007) Özbek, Nükhet; Tunalı, E. Turhan; Department of Electrical and Electronics Engineering; Tekalp, Ahmet Murat; Faculty Member; Department of Electrical and Electronics Engineering; College of Engineering; 26207
Free viewpoint TV (FTV) is a new media format that allows a user to change his/her viewpoint freely. To this effect, multi-view video must be coded to satisfy two conflicting requirements: i) achieve high compression efficiency, and ii) allow view switching with low delay. This paper proposes a new encoding configuration for scalable multi-view video coding, which achieves a compromise between the two requirements. In the new scalable multi-view configuration, the base layer is encoded with inter-view prediction at a minimum acceptable quality, while enhancement layers for each view only depend on their respective base layers (with no inter-view prediction). Thus, the base layer shall be served to all users, while enhancement layers shall be served selectively to users depending on their channel bandwidth and viewing direction. We compare the compression efficiency of the proposed method with those of non-scalable multi-view coding (MVC) and simulcast (H.264/AVC of each view independently) solutions.
Metadata only
H.264 long-term reference selection for videos with frequent camera transitions
(IEEE, 2006) Özbek, Nükhet; Department of Electrical and Electronics Engineering; Tekalp, Ahmet Murat; Faculty Member; Department of Electrical and Electronics Engineering; College of Engineering; 26207
Long-term reference prediction is an important feature of the H.264/MPEG-4 AVC standard, which provides a tradeoff between compression gain and computational complexity. In this study, we propose a long-term reference selection method for videos with frequent camera transitions to optimize compression efficiency at shot boundaries without increasing the computational complexity. Experimental results show up to 50% reduction in the number of bits (at the same PSNR) for frames at the border of camera transitions.
Metadata only
Hierarchical representation and coding of 3D mesh geometry for transmission of surface/volume data
(IEEE, 2006) Celasun, Işıl; Eröksüz, Serkan; Siddiqui, Rizwan A.; Doğan, Ertuğrul; Department of Electrical and Electronics Engineering; Tekalp, Ahmet Murat; Faculty Member; Department of Electrical and Electronics Engineering; College of Engineering; 26207
Hierarchical mesh representation and mesh simplification have been addressed in computer graphics for adaptive level-of-detail rendering of 3D objects. In this paper, by using a new simplification method to design hierarchical 3D meshes such that each mesh level has Delaunay topology, we can obtain not only meshes with desired geometric properties, but also efficient compression of the mesh data. The hierarchical compression technique is based on a nearest-neighbor ordering of mesh node points. The baseline is the use of entropy coding of linear prediction between nearest neighbor node coordinates. Vector quantization is also employed to efficiently process statistical dependences between prediction error vectors of a node. The compression method allows progressive transmission and quality scalability.
Metadata only
Machine learning-based approach to identify formalin-fixed paraffin-embedded glioblastoma and healthy brain tissues
(Spie-Int Soc Optical Engineering, 2022) N/A; Department of Electrical and Electronics Engineering; N/A; N/A; N/A; N/A; N/A; Department of Electrical and Electronics Engineering; Torun, Hülya; Batur, Numan; Bilgin, Buse; Esengür, Ömer Tarık; Baysal, Kemal; Kulaç, İbrahim; Solaroğlu, İhsan; Onbaşlı, Mehmet Cengiz; PhD Student; Undergraduate Student; PhD Student; Undergraduate Student; Faculty Member; Faculty Member; Faculty Member; Faculty Member; Department of Electrical and Electronics Engineering; Graduate School of Sciences and Engineering; College of Engineering; Graduate School of Sciences and Engineering; School of Medicine; School of Medicine; School of Medicine; School of Medicine; College of Engineering; N/A; N/A; N/A; N/A; 119184; 170305; 102059; 258783
Glioblastoma is the most malignant and common high-grade brain tumor with a 14-month overall survival length. According to recent World Health Organization Central Nervous System tumor classification (2021), the diagnosis of glioblastoma requires extensive molecular genetic tests in addition to the traditional histopathological analysis of Formalin-Fixed Paraffin-Embedded (FFPE) tissues. Time-consuming and expensive molecular tests as well as the need for clinical neuropathology expertise are the challenges in the diagnosis of glioblastoma. Hence, an automated and rapid analytical detection technique for identifying brain tumors from healthy tissues is needed to aid pathologists in achieving an error-free diagnosis of glioblastoma in clinics. Here, we report on our clinical test results of Raman spectroscopy and machine learning-based glioblastoma identification methodology for a cohort of 20 glioblastoma and 18 white matter tissue samples. We used Raman spectroscopy to distinguish FFPE glioblastoma and white matter tissues applying our previously reported protocols about optimized FFPE sample preparation and Raman measurement parameters. One may analyze the composition and identify the subtype of brain tumors using Raman spectroscopy since this technique yields detailed molecule-specific information from tissues. We measured and classified the Raman spectra of neoplastic and non-neoplastic tissue sections using machine learning classifiers including support vector machine and random forest with 86.6% and 83.3% accuracies, respectively. These proof-of-concept results demonstrate that this technique might be eventually used in the clinics to assist pathologists once validated with a larger and more diverse glioblastoma cohort and improved detection accuracies.
Metadata only
Multimodal speaker identification with audio-video processing
(IEEE, 2003) Department of Computer Engineering; N/A; Department of Computer Engineering; Department of Electrical and Electronics Engineering; Yemez, Yücel; Kanak, Alper; Erzin, Engin; Tekalp, Ahmet Murat; Faculty Member; Master Student; Faculty Member; Faculty Member; Department of Computer Engineering; Department of Electrical and Electronics Engineering; College of Engineering; Graduate School of Sciences and Engineering; College of Engineering; College of Engineering; 107907; N/A; 34503; 26207
In this paper we present a multimodal audio-visual speaker identification system. The objective is to improve the recognition performance over conventional unimodal schemes. The proposed system decomposes the information existing in a video stream into three components: speech, face texture and lip motion. Lip motion between successive frames is first computed in terms of optical flow vectors and then encoded as a feature vector in a magnitude-direction histogram domain. The feature vectors obtained along the whole stream are then interpolated to match the rate of the speech signal and fused with mel frequency cepstral coefficients (MFCC) of the corresponding speech signal. The resulting joint feature vectors are used to train and test a Hidden Markov Model (HMM) based identification system. Face texture images are treated separately in the eigenface domain and integrated into the system through decision fusion. Experimental results are also included to demonstrate the system performance.
Metadata only
Quality assessment of asymmetric stereo video coding
(IEEE, 2010) N/A; N/A; Department of Electrical and Electronics Engineering; Saygılı, Görkem; Gürler, Cihat Göktuğ; Tekalp, Ahmet Murat; Master Student; PhD Student; Faculty Member; Department of Electrical and Electronics Engineering; Graduate School of Sciences and Engineering; Graduate School of Sciences and Engineering; College of Engineering; N/A; N/A; 26207
It is well known that the human visual system can perceive high frequencies in 3D, even if that information is present in only one of the views. Therefore, the best 3D stereo quality may be achieved by asymmetric coding where the reference (right) and auxiliary (left) views are coded at unequal PSNR. However, the questions of what should be the level of this asymmetry and whether asymmetry should be achieved by spatial resolution reduction or SNR (quality) reduction are open issues. Extensive subjective tests indicate that when the reference view is encoded at sufficiently high quality, the auxiliary view can be encoded above a low-quality threshold without a noticeable degradation in the perceived stereo video quality. This low-quality threshold may depend on the 3D display; e.g., it is about 31 dB for a parallax barrier display and 33 dB for a polarized projection display. Subjective tests show that, above this PSNR threshold value, users prefer SNR reduction over spatial resolution reduction on both parallax barrier and polarized projection displays. It is also observed that, if the auxiliary view is encoded below this threshold value, symmetric coding starts to perform better than asymmetric coding in terms of perceived 3D video quality.
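As a small illustration of the thresholds mentioned in this abstract, the sketch below computes PSNR for 8-bit samples and checks an auxiliary-view PSNR against the approximate per-display values quoted above. The function names, the dictionary keys, and the flat-list frame representation are assumptions made for the example; the abstract gives the thresholds only as approximate figures.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


# Approximate low-quality thresholds (dB) reported in the abstract.
THRESHOLD_DB = {"parallax_barrier": 31.0, "polarized_projection": 33.0}


def auxiliary_view_above_threshold(aux_psnr_db, display):
    """True if the auxiliary view meets the reported threshold for `display`."""
    return aux_psnr_db >= THRESHOLD_DB[display]


# Toy auxiliary view: every sample is off by 5 grey levels (MSE = 25).
aux_db = psnr([100] * 4, [105] * 4)
print(round(aux_db, 2), auxiliary_view_above_threshold(aux_db, "polarized_projection"))
```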
Metadata only
Rate-visual-distortion optimized extraction with quality layers for scalable coding of stereo videos
(Institute of Electrical and Electronics Engineers (IEEE), 2008) Özbek, Nükhet; Department of Electrical and Electronics Engineering; Tekalp, Ahmet Murat; Faculty Member; Department of Electrical and Electronics Engineering; College of Engineering; 26207
The quality layers concept was proposed and adopted in the JSVM in order to ensure an optimal adaptation in a rate-distortion sense. In this paper it is extended to the multiview case for scalable coding of stereo videos. Rate-visual-distortion optimized rate allocation among the views is necessary for efficient transport of 3DTV data over the Internet. This paper addresses a standard-compatible approach to solve the problem. Experimental results are presented that demonstrate the effectiveness of the proposed method for several stereo videos.
