Defending against targeted poisoning attacks in federated learning

dc.contributor.authorid: 0000-0002-7676-0167
dc.contributor.authorid: N/A
dc.contributor.department: Department of Computer Engineering
dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Gürsoy, Mehmet Emre
dc.contributor.kuauthor: Erbil, Pınar
dc.contributor.kuprofile: Faculty Member
dc.contributor.kuprofile: Undergraduate Student
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.yokid: 330368
dc.contributor.yokid: N/A
dc.date.accessioned: 2025-01-19T10:31:25Z
dc.date.issued: 2022
dc.description.abstract: Federated learning (FL) enables multiple participants to collaboratively train a deep neural network (DNN) model. To combat malicious participants in FL, Byzantine-resilient aggregation rules (AGRs) have been developed. However, although Byzantine-resilient AGRs are effective against untargeted attacks, they become suboptimal when attacks are stealthy and targeted. In this paper, we study the problem of defending against targeted data poisoning attacks in FL and make three main contributions. First, we propose a method that selectively extracts, from FL participants' update vectors, the DNN parameters indicative of attack and embeds them into a low-dimensional latent space. We show that the effectiveness of Byzantine-resilient AGRs such as Trimmed Mean and Krum can be improved when they are used in combination with our proposed method. Second, we develop a clustering-based defense that uses X-Means to separate items in the latent space into malicious and benign clusters, which in turn allows malicious updates to be identified. Third, using this separation, we show that a "clean" model (i.e., a model that is not negatively impacted by the attack) can be trained using only the benign updates. We experimentally evaluate our defense methods on the Fashion-MNIST and CIFAR-10 datasets. Results show that our methods achieve up to 95% true positive rate and 99% accuracy in identifying malicious updates across various settings. In addition, clean models trained using our approach achieve accuracy similar to that of a baseline scenario without poisoning.
dc.description.indexedby: WoS
dc.description.indexedby: Scopus
dc.description.publisherscope: International
dc.identifier.doi: 10.1109/TPS-ISA56441.2022.00033
dc.identifier.isbn: 978-1-6654-7408-5
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-85150680317
dc.identifier.uri: https://doi.org/10.1109/TPS-ISA56441.2022.00033
dc.identifier.uri: https://hdl.handle.net/20.500.14288/26236
dc.identifier.wos: 978301700023
dc.keywords: Federated learning
dc.keywords: Poisoning attacks
dc.keywords: Adversarial machine learning
dc.language: en
dc.publisher: IEEE Computer Soc
dc.source: 2022 IEEE 4th International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications (TPS-ISA)
dc.subject: Computer science
dc.subject: Information systems
dc.title: Defending against targeted poisoning attacks in federated learning
dc.type: Conference proceeding

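The abstract above outlines a three-step defense: embed selected parameters of each participant's update vector into a low-dimensional latent space, cluster the embedded updates to separate malicious from benign participants, and aggregate (or retrain) using only the benign updates. The sketch below illustrates that general idea under stated assumptions and is not the paper's implementation: PCA stands in for the paper's selective parameter extraction, a two-cluster KMeans stands in for X-Means, and the larger cluster is heuristically assumed to be benign.

# Illustrative sketch only -- NOT the paper's method. Approximates the pipeline in
# the abstract: embed update vectors into a low-dimensional latent space, split them
# into two clusters, treat the larger cluster as benign, aggregate only those updates.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def filter_and_aggregate(updates, latent_dim=2, seed=0):
    """updates: (n_participants, n_params) array of flattened model updates."""
    # Embed high-dimensional updates into a low-dimensional latent space
    # (stand-in for the paper's selective parameter extraction + embedding).
    latent = PCA(n_components=latent_dim, random_state=seed).fit_transform(updates)

    # Partition the embedded updates into two clusters (stand-in for X-Means).
    labels = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(latent)

    # Heuristic assumption: malicious participants are the minority,
    # so the larger cluster is treated as benign.
    benign_label = np.argmax(np.bincount(labels))
    benign_mask = labels == benign_label

    # Aggregate only the presumed-benign updates (plain FedAvg-style mean).
    return updates[benign_mask].mean(axis=0), benign_mask

# Toy usage: 8 benign updates near zero, 2 poisoned updates shifted far away.
rng = np.random.default_rng(0)
benign = rng.normal(0.0, 0.1, size=(8, 100))
poisoned = rng.normal(5.0, 0.1, size=(2, 100))
agg, mask = filter_and_aggregate(np.vstack([benign, poisoned]))
print("flagged as malicious:", np.where(~mask)[0])  # expected: indices 8 and 9

In the toy usage, the two shifted updates fall into the smaller cluster and are flagged, and the aggregate is computed from the remaining eight; the paper's actual pipeline, per the abstract, additionally selects only attack-indicative DNN parameters before embedding and uses the resulting separation to train a clean model.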