Publications with Fulltext
Permanent URI for this collection: https://hdl.handle.net/20.500.14288/6
17 results
Search Results
Publication Open Access
Leveraging frequency based salient spatial sound localization to improve 360 degrees video saliency prediction
(Institute of Electrical and Electronics Engineers (IEEE), 2021)
Çökelek, Mert; İmamoğlu, Nevrez; Özçınar, Çağrı; Department of Computer Engineering; Erdem, Aykut; Faculty Member; Department of Computer Engineering; College of Engineering; 20331
Virtual and augmented reality (VR/AR) systems dramatically gained in popularity with various application areas such as gaming, social media, and communication. It is therefore a crucial task to have the knowhow to efficiently utilize, store or deliver 360° videos for end-users. Towards this aim, researchers have been developing deep neural network models for 360° multimedia processing and computer vision fields. In this line of work, an important research direction is to build models that can learn and predict the observers' attention on 360° videos to obtain so-called saliency maps computationally. Although there are a few saliency models proposed for this purpose, these models generally consider only visual cues in video frames by neglecting audio cues from sound sources. In this study, an unsupervised frequency-based saliency model is presented for predicting the strength and location of saliency in spatial audio. The prediction of salient audio cues is then used as audio bias on the video saliency predictions of state-of-the-art models.
Our experiments yield promising results and show that integrating the proposed spatial audio bias into the existing video saliency models consistently improves their performance.

Publication Open Access
Self-supervised monocular scene decomposition and depth estimation
(IEEE Computer Society, 2021)
Department of Computer Engineering; N/A; Güney, Fatma; Safadoust, Sadra; Department of Computer Engineering; Koç Üniversitesi İş Bankası Yapay Zeka Uygulama ve Araştırma Merkezi (KUIS AI)/ Koç University İş Bank Artificial Intelligence Center (KUIS AI); College of Engineering; Graduate School of Sciences and Engineering; 187939; N/A
Self-supervised monocular depth estimation approaches either ignore independently moving objects in the scene or need a separate segmentation step to identify them. We propose MonoDepthSeg to jointly estimate depth and segment moving objects from monocular video without using any ground-truth labels. We decompose the scene into a fixed number of components where each component corresponds to a region on the image with its own transformation matrix representing its motion. We estimate both the mask and the motion of each component efficiently with a shared encoder. We evaluate our method on three driving datasets and show that our model clearly improves depth estimation while decomposing the scene into separately moving components.

Publication Open Access
A diversity combination model incorporating an inward bias for interaural time-level difference cue integration in sound lateralization
(Multidisciplinary Digital Publishing Institute (MDPI), 2020)
N/A; Department of Computer Engineering; Mojtahedi, Sina; Erzin, Engin; Ungan, Pekcan; Faculty Member; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; College of Engineering; School of Medicine; N/A; 34503; N/A
A sound source with non-zero azimuth leads to interaural time level differences (ITD and ILD).
Studies on hearing system imply that these cues are encoded in different parts of the brain, but combined to produce a single lateralization percept as evidenced by experiments indicating trading between them. According to the duplex theory of sound lateralization, ITD and ILD play a more significant role in low-frequency and high-frequency stimulations, respectively. In this study, ITD and ILD, which were extracted from a generic head-related transfer functions, were imposed on a complex sound consisting of two low- and seven high-frequency tones. Two-alternative forced-choice behavioral tests were employed to assess the accuracy in identifying a change in lateralization. Based on a diversity combination model and using the error rate data obtained from the tests, the weights of the ITD and ILD cues in their integration were determined by incorporating a bias observed for inward shifts. The weights of the two cues were found to change with the azimuth of the sound source. While the ILD appears to be the optimal cue for the azimuths near the midline, the ITD and ILD weights turn to be balanced for the azimuths far from the midline.

Publication Open Access
Demo: Skip Graph middleware implementation
(Institute of Electrical and Electronics Engineers (IEEE), 2020)
Department of Computer Engineering; Hassanzadeh-Nazarabadi, Yahya; Nayal, Nazir; Hamdan, Shadi Sameh; Şahin, Ali Utkan; Özkasap, Öznur; Küpçü, Alptekin; Faculty Member; Department of Computer Engineering; Graduate School of Sciences and Engineering; College of Engineering; N/A; N/A; N/A; N/A; 113507; 168060
Skip Graphs are Distributed Hash Table (DHT)-based data structures that are immensely utilized as routing overlays in Peer-to-Peer (P2P) applications. In this demo paper, we present the software architecture of our open-source implementation of Skip Graph middleware in Java. We also present a demo scenario on configuration and constructing an overlay of Skip Graph processes in a fully decentralized manner.
Our implementation is capable of hosting data objects at the Skip Graph processes and serving as a P2P data storage platform as well. Our middleware implementation provides an open-source platform to support Skip Graph-based applications on top of it.

Publication Open Access
On the rate of convergence of a classifier based on a transformer encoder
(Institute of Electrical and Electronics Engineers (IEEE), 2022)
Gurevych, Iryna; Kohler, Michael; Department of Computer Engineering; Şahin, Gözde Gül; Faculty Member; Department of Computer Engineering; College of Engineering; 366984
Pattern recognition based on a high-dimensional predictor is considered. A classifier is defined which is based on a Transformer encoder. The rate of convergence of the misclassification probability of the classifier towards the optimal misclassification probability is analyzed. It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model. Furthermore, the difference between the Transformer classifiers theoretically analyzed in this paper and the ones used in practice today is illustrated by means of classification problems in natural language processing.

Publication Open Access
Verifiable dynamic searchable encryption
(TÜBİTAK, 2019)
Department of Computer Engineering; Etemad, Mohammad; Küpçü, Alptekin; PhD Student; Department of Computer Engineering; Graduate School of Sciences and Engineering; N/A; 168060
Using regular encryption schemes to protect the privacy of the outsourced data implies that the client should sacrifice functionality for security. Searchable symmetric encryption (SSE) schemes encrypt the data in a way that the client can later search and selectively retrieve the required data. Many SSE schemes have been proposed, starting with static constructions, and then dynamic and adaptively secure constructions but usually in the honest-but-curious model.
We propose a verifiable dynamic SSE scheme that is adaptively secure against malicious adversaries. Our scheme supports file modification, which is essential for efficiently working with large files, in addition to the ability to add/delete files. While our main construction is proven secure in the random oracle model (ROM), we also present a solution secure in the standard model with full security proof. Our experiments show that our scheme in the ROM performs a search within a few milliseconds, verifies the result in another few milliseconds, and has a proof overhead of 0.01% only. Our standard model solution, while being asymptotically slower, is still practical, requiring only a small client memory (e.g., ≈488 KB) even for a large file collection (e.g., ≈10 GB), and necessitates small tokens (e.g., ≈156 KB for search and ≈362 KB for file operations).

Publication Open Access
Measuring construction for social, economic and environmental assessment
(Emerald, 2019)
İlhan, Bahriye; Department of Computer Engineering; Yobas, Mümine Banu; Teaching Faculty; Department of Computer Engineering; College of Engineering
Purpose: the purpose of this paper is to examine the issues that should be considered for a better gauge of the construction industry and built environment and to propose a set of indicators for measuring the social, economic and environmental value of construction. Design/methodology/approach: the indicators proposed in this study use Pearce's schema, which presents a framework to evaluate the socio-economic value of construction and its contribution to sustainable development. After analysing the problems faced by the industry, solutions are raised and finally indicators for each pillar of Pearce's schema are established through a literature review.
Since the proposed indicators can be used for cross-country analysis, these comparisons are also presented as graphs including only those countries for which valid national data could be sourced from OECD databases. Findings: the issues, suggestions and indicators related to each concern about the main domains of the schema are addressed through the related literature and supported by available statistical data. Originality/value: although previous studies have drawn attention to measures for better evaluation of the construction industry and the built environment, this study, distinctively, presents an integrated approach in order to gauge the true value and impacts of construction in a more comprehensive way. The work's contribution to the body of knowledge is in revealing the hidden input and impact of construction on sustainable development by determining the barriers to this and their solutions, in addition to the proposal of relevant indicators.

Publication Open Access
Tri-op redactable blockchains with block modification, removal, and insertion
(TÜBİTAK, 2022)
Dousti, Mohammad Sadeq; Department of Computer Engineering; Küpçü, Alptekin; Faculty Member; Department of Computer Engineering; College of Engineering; 168060
In distributed computations and cryptography, it is desirable to record events on a public ledger, such that later alterations are computationally infeasible. An implementation of this idea is called blockchain, which is a distributed protocol that allows the creation of an immutable ledger. While such an idea is very appealing, the ledger may be contaminated with incorrect, illegal, or even dangerous data, and everyone running the blockchain protocol has no option but to store and propagate the unwanted data. The ledger is bloated over time, and it is not possible to remove redundant information. Finally, missing data cannot be inserted later. Redactable blockchains were invented to allow the ledger to be mutated in a controlled manner.
To date, redactable blockchains support at most two types of redactions: block modification and removal. The next logical step is to support block insertions. However, we show that this seemingly innocuous enhancement renders all previous constructs insecure. We put forward a model for blockchains supporting all three redaction operations and construct a blockchain that is provably secure under this formal definition.

Publication Open Access
SLAMP: stochastic latent appearance and motion prediction
(Institute of Electrical and Electronics Engineers (IEEE), 2021)
Erdem, Erkut; Department of Computer Engineering; Erdem, Aykut; Güney, Fatma; Akan, Adil Kaan; Faculty Member; Department of Computer Engineering; Koç Üniversitesi İş Bankası Yapay Zeka Uygulama ve Araştırma Merkezi (KUIS AI)/ Koç University İş Bank Artificial Intelligence Center (KUIS AI); College of Engineering; Graduate School of Sciences and Engineering; 20331; 187939; N/A
Motion is an important cue for video prediction and often utilized by separating video content into static and dynamic components. Most of the previous work utilizing motion is deterministic but there are stochastic methods that can model the inherent uncertainty of the future. Existing stochastic models either do not reason about motion explicitly or make limiting assumptions about the static part. In this paper, we reason about appearance and motion in the video stochastically by predicting the future based on the motion history. Explicit reasoning about motion without history already reaches the performance of current stochastic models. The motion history further improves the results by allowing to predict consistent dynamics several frames into the future.
Our model performs comparably to the state-of-the-art models on the generic video prediction datasets, however, significantly outperforms them on two challenging real-world autonomous driving datasets with complex motion and dynamic background.

Publication Open Access
FASTSUBS: an efficient and exact procedure for finding the most likely lexical substitutes based on an N-gram language model
(Institute of Electrical and Electronics Engineers (IEEE), 2012)
Department of Computer Engineering; Yüret, Deniz; Faculty Member; Department of Computer Engineering; College of Engineering; 179996
Lexical substitutes have found use in areas such as paraphrasing, text simplification, machine translation, word sense disambiguation, and part of speech induction. However the computational complexity of accurately identifying the most likely substitutes for a word has made large scale experiments difficult. In this letter we introduce a new search algorithm, FASTSUBS, that is guaranteed to find the K most likely lexical substitutes for a given word in a sentence based on an n-gram language model. The computation is sublinear in both K and the vocabulary size V. An implementation of the algorithm and a dataset with the top 100 substitutes of each token in the WSJ section of the Penn Treebank are available at https://goo.gl/jzKH0.