Publication: Learning Markov Chain Models from sequential data under local differential privacy
Program
KU-Authors
KU Authors
Co-Authors
Advisor
Publication Date
Language
en
Journal Title
Journal ISSN
Volume Title
Abstract
Markov chain models are frequently used in the analysis and modeling of sequential data such as location traces, time series, natural language, and speech. However, considering that many data sources are privacy-sensitive, it is imperative to design privacy-preserving methods for learning Markov models. In this paper, we propose Prima for learning discrete-time Markov chain models under local differential privacy (LDP), a state-of-the-art privacy standard. In Prima, each user locally encodes and perturbs their sequential record on their own device using LDP protocols. For this purpose, we adapt two bitvector-based LDP protocols (RAPPOR and OUE); and furthermore, we develop a novel extension of the GRR protocol called AdaGRR. We also propose to utilize custom privacy budget allocation strategies for perturbation, which enable uneven splitting of the privacy budget to better preserve utility in cases with uneven sequence lengths. On the server-side, Prima uses a novel algorithm for estimating Markov probabilities from perturbed data. We experimentally evaluate Prima using three real-world datasets, four utility metrics, and under various combinations of privacy budget and budget allocation strategies. Results show that Prima enables learning Markov chains under LDP with high utility and low error compared to Markov chains learned without privacy constraints.
Source:
Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher:
Springer Science and Business Media Deutschland Gmbh
Keywords:
Subject
Computer science, Information systems, Theory, Methods, Telecommunications