Research Outputs
Permanent URI for this community: https://hdl.handle.net/20.500.14288/2
10 results
Search Results
A multitask multiple kernel learning formulation for discriminating early- and late-stage cancers (Oxford University Press (OUP), 2020)
N/A; N/A; Department of Industrial Engineering; Rahimi, Arezou; Gönen, Mehmet; PhD Student; Faculty Member; Department of Industrial Engineering; Graduate School of Sciences and Engineering; College of Engineering; N/A; 237468
Motivation: Genomic information is increasingly being used in diagnosis, prognosis and treatment of cancer. The severity of the disease is usually measured by the tumor stage. Therefore, identifying pathways playing an important role in progression of the disease stage is of great interest. Given that there are similarities in the underlying mechanisms of different cancers, in addition to the considerable correlation in the genomic data, there is a need for machine learning methods that can take these aspects of genomic data into account. Furthermore, using machine learning for studying multiple cancer cohorts together with a collection of molecular pathways creates an opportunity for knowledge extraction.
Results: We studied the problem of discriminating early- and late-stage tumors of several cancers using genomic information while enforcing interpretability on the solutions. To this end, we developed a multitask multiple kernel learning (MTMKL) method with a co-clustering step based on a cutting-plane algorithm to identify the relationships between the input tasks and kernels. We tested our algorithm on 15 cancer cohorts and observed that, in most cases, MTMKL outperforms other algorithms (including random forests, support vector machine and single-task multiple kernel learning) in terms of predictive power.
Using the aggregate results from multiple replications, we also derived similarity matrices between cancer cohorts, which are, in many cases, in agreement with available relationships reported in the relevant literature.

A solution method for linear and geometrically nonlinear MDOF systems with random properties subject to random excitation (Elsevier, 1998)
Micaletti, RC; Çakmak, Ahmet Ş.; Nielsen, Søren R.K.; Department of Mathematics; Köylüoğlu, Hasan Uğur; Teaching Faculty; Department of Mathematics; College of Sciences; N/A
A method for computing the lower-order moments of response of randomly excited multi-degree-of-freedom (MDOF) systems with random structural properties is proposed. The method is grounded in the techniques of stochastic calculus, utilizing a Markov diffusion process to model the structural system with random structural properties. The resulting state-space formulation is a system of ordinary stochastic differential equations with random coefficients and deterministic initial conditions, which are subsequently transformed into ordinary stochastic differential equations with deterministic coefficients and random initial conditions. This transformation facilitates the derivation of differential equations which govern the evolution of the unconditional statistical moments of response. Primary consideration is given to linear systems and systems with odd polynomial nonlinearities, for in these cases there is a significant reduction in the number of equations to be solved. The method is illustrated for a five-story shear-frame structure with nonlinear interstory restoring forces and random damping and stiffness properties.
The results of the proposed method are compared to those estimated by extensive Monte Carlo simulation.

Discriminating early- and late-stage cancers using multiple kernel learning on gene sets (Oxford Univ Press, 2018)
N/A; N/A; Department of Industrial Engineering; Rahimi, Arezou; Gönen, Mehmet; PhD Student; Faculty Member; Department of Industrial Engineering; Graduate School of Sciences and Engineering; College of Engineering; N/A; 237468
Motivation: Identifying molecular mechanisms that drive cancers from early to late stages is highly important to develop new preventive and therapeutic strategies. Standard machine learning algorithms could be used to discriminate early- and late-stage cancers from each other using their genomic characterizations. Even though these algorithms would get satisfactory predictive performance, their knowledge extraction capability would be quite restricted due to the highly correlated nature of genomic data. That is why we need algorithms that can also extract relevant information about these biological mechanisms using our prior knowledge about pathways/gene sets.
Results: In this study, we addressed the problem of separating early- and late-stage cancers from each other using their gene expression profiles. We proposed to use a multiple kernel learning (MKL) formulation that makes use of pathways/gene sets (i) to obtain satisfactory/improved predictive performance and (ii) to identify biological mechanisms that might have an effect in cancer progression. We extensively compared our proposed MKL on gene sets algorithm against two standard machine learning algorithms, namely, random forests and support vector machines, on 20 diseases from the Cancer Genome Atlas cohorts for two different sets of experiments. Our method obtained statistically significantly better or comparable predictive performance on most of the datasets using significantly fewer gene expression features.
We also showed that our algorithm was able to extract meaningful and disease-specific information that gives clues about the progression mechanism.

Is there a 'heat-or-eat' trade-off in the UK? (Wiley, 2014)
Beatty, Timothy K. M.; Blow, Laura; Crossley, Thomas F.; Department of Economics; Crossley, Thomas Fraser; Faculty Member; Department of Economics; College of Administrative Sciences and Economics; N/A
Do households cut back on food spending to finance the additional cost of keeping warm during spells of unseasonably cold weather? For households which cannot smooth consumption over time, we describe how cold weather shocks are equivalent to income shocks. We merge detailed household level expenditure data from older households with historical regional weather information. We find evidence that the poorest of older households cannot smooth fuel spending over the worst temperature shocks. Statistically significant reductions in food spending occur in response to winter temperatures 2 or more standard deviations colder than expected, which occur about 1 winter month in 40; reductions in food expenditure are considerably larger in poorer households.

Martingale property of exchange rates and central bank interventions (Taylor & Francis Inc, 2003)
Department of Economics; Yılmaz, Kamil; Faculty Member; Department of Economics; College of Administrative Sciences and Economics; 6111
This article uses the variance ratio-based multiple comparison test and the Richardson-Smith Wald test procedures to test for the martingale property of daily exchange rates of seven major currencies vis-à-vis the U.S. dollar. To allow for the possibility that exchange rates are not governed by a single process throughout the float, the test statistics are calculated and plotted for fixed-length moving subsample windows rather than being applied to the full sample.
The results show that exchange rates do not always follow the martingale process. During the times of coordinated central bank interventions, exchange rates deviate from the martingale property.

Nearest neighbor methods for testing reflexivity (Springer, 2017)
Ceyhan, Elvan; Bahadır, Selim; PhD Student; Graduate School of Sciences and Engineering; 272279
Nearest neighbor (NN) methods are widely employed for drawing inferences about spatial point patterns of two or more classes. We introduce a method for testing reflexivity in the NN structure (i.e., NN reflexivity) based on a contingency table which will be called reflexivity contingency table (RCT) henceforth. The RCT is based on the NN relationships among the data points and was used for testing niche specificity in the literature, but we demonstrate that it is actually more appropriate for testing the NN reflexivity pattern. We derive the asymptotic distribution of the entries of the RCT under random labeling and introduce tests of reflexivity based on these entries. We also consider Pielou's approach on RCT and show that it is not appropriate for completely mapped spatial data. We determine the appropriate null hypotheses and the underlying conditions/assumptions required for all tests considered. We investigate the finite sample performance of the tests in terms of empirical size and power by extensive Monte Carlo simulations and illustrate the methods on two real-life ecological data sets.

On the use of nearest neighbor contingency tables for testing spatial segregation (Springer, 2010)
Department of Mathematics; Ceyhan, Elvan; Faculty Member; Department of Mathematics; College of Sciences; N/A
For two or more classes (or types) of points, nearest neighbor contingency tables (NNCTs) are constructed using nearest neighbor (NN) frequencies and are used in testing spatial segregation of the classes.
Pielou's test of independence, Dixon's cell-specific, class-specific, and overall tests are the tests based on NNCTs (i.e., they are NNCT-tests). These tests are designed and intended for use under the null pattern of random labeling (RL) of completely mapped data. However, it has been shown that Pielou's test is not appropriate for testing segregation against the RL pattern while Dixon's tests are. In this article, we compare Pielou's and Dixon's NNCT-tests; introduce the one-sided versions of Pielou's test; extend the use of NNCT-tests for testing complete spatial randomness (CSR) of points from two or more classes (which is called CSR independence, henceforth). We assess the finite sample performance of the tests by an extensive Monte Carlo simulation study and demonstrate that Dixon's tests are also appropriate for testing CSR independence; but Pielou's test and the corresponding one-sided versions are liberal for testing CSR independence or RL. Furthermore, we show that Pielou's tests are only appropriate when the NNCT is based on a random sample of (base, NN) pairs. We also prove the consistency of the tests under their appropriate null hypotheses. Moreover, we investigate the edge (or boundary) effects on the NNCT-tests and compare the buffer zone and toroidal edge correction methods for these tests. 
We illustrate the tests on a real-life and an artificial data set.

Path2Surv: Pathway/gene set-based survival analysis using multiple kernel learning (Oxford University Press (OUP), 2019)
N/A; Department of Industrial Engineering; Department of Industrial Engineering; Dereli, Onur; Oğuz, Ceyda; Gönen, Mehmet; PhD Student; Faculty Member; Faculty Member; Department of Industrial Engineering; Graduate School of Sciences and Engineering; College of Engineering; College of Engineering; N/A; 6033; 237468
Motivation: Survival analysis methods that integrate pathways/gene sets into their learning model could identify molecular mechanisms that determine survival characteristics of patients. Rather than first picking the predictive pathways/gene sets from a given collection and then training a predictive model on the subset of genomic features mapped to these selected pathways/gene sets, we developed a novel machine learning algorithm (Path2Surv) that conjointly performs these two steps using multiple kernel learning.
Results: We extensively tested our Path2Surv algorithm on 7655 patients from 20 cancer types using cancer-specific pathway/gene set collections and gene expression profiles of these patients. Path2Surv statistically significantly outperformed survival random forest (RF) on 12 out of 20 datasets and obtained comparable predictive performance against survival support vector machine (SVM) using significantly fewer gene expression features (i.e. less than 10% of what survival RF and survival SVM used).

Simulation and characterization of multi-class spatial patterns from stochastic point processes of randomness, clustering and regularity (Springer, 2014)
Department of Mathematics; Ceyhan, Elvan; Faculty Member; Department of Mathematics; College of Sciences; N/A
Spatial pattern analysis of data from multiple classes (i.e., multi-class data) has important implications.
We investigate the resulting patterns when classes are generated from various spatial point processes. Under our null pattern, the nearest neighbor probabilities are proportional to class frequencies in the multi-class setting. In the two-class case, the deviations are mainly in two opposite directions, namely, segregation and association of the classes. But for three or more classes, the classes might exhibit mixed patterns, in which one pair exhibits segregation while another pair exhibits association or complete spatial randomness independence. To detect deviations from the null case, we employ tests based on nearest neighbor contingency tables (NNCTs), as NNCT methods can provide an omnibus test and post-hoc tests after a significant omnibus test in a multi-class setting. In particular, for analyzing these multi-class patterns (mixed or not), we use an omnibus overall test based on NNCTs. After the overall test, the pairwise interactions are analyzed by the post-hoc cell-specific tests based on NNCTs. We propose various parameterizations of the segregation and association alternatives, list some appealing properties of these patterns, and propose three processes for the two-class association pattern. We also consider various clustering and regularity patterns to determine which one(s) cause segregation from or association with a class from a homogeneous Poisson process and from other processes as well. We perform an extensive Monte Carlo simulation study to investigate the newly proposed association patterns and to understand which stochastic processes might result in segregation or association.
The methodology is illustrated on two real-life data sets from plant ecology.

Stock rationing in an M/E-r/1 multi-class make-to-stock queue with backorders (Taylor & Francis, 2009)
Gayon, Jean-Philippe; De Vericourt, Francis; Department of Industrial Engineering; Karaesmen, Fikri; Faculty Member; Department of Industrial Engineering; College of Engineering; 3579
A model of a single-item make-to-stock production system is presented. The item is demanded by several classes of customers arriving according to Poisson processes with different backorder costs. Item processing times have an Erlang distribution. It is shown that certain structural properties of optimal stock and capacity allocation policies exist for the case where production may be interrupted and restarted. Also, a complete characterization of the optimal policy in the case of uninterrupted production when excess production can be diverted to a salvage market is presented. A heuristic policy is developed and assessed based on the results obtained in the analysis. Finally, the value of production status information and the effects of processing time variability are investigated.