Publication: Analysis of checkpointing algorithms for primary-backup replication
Program
KU-Authors
Co-Authors
Advisor
Publication Date
2017
Language
English
Type
Conference proceeding
Journal Title
Journal ISSN
Volume Title
Abstract
Replication is useful for supporting fault-tolerant, reliable, and recovery-oriented distributed systems. Popular application areas include databases, P2P systems, web services, and the Internet of Things. In this study, we propose using checkpointing to improve the efficiency of the well-known primary-backup replication protocol in distributed systems. We developed a software framework based on an in-memory replicated key-value store to evaluate various checkpointing algorithms. Using the framework over geographically distributed nodes of the PlanetLab platform, we performed extensive experiments and analysis with several metrics, including blocking time, checkpointing time, checkpoint size, and recovery time. The experimental scenarios use the well-known benchmarking tool YCSB to issue realistic read/update queries through exemplary workloads. Our findings indicate that incremental checkpointing applied periodically is the most efficient approach, achieving up to 30 times higher system throughput and a 50% decrease in average blocking time compared to traditional primary-backup replication and other checkpointing algorithms.
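The idea of periodic incremental checkpointing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' framework: the class names `Primary` and `Backup`, the `period` parameter, and the choice to count the period in updates rather than in wall-clock time are all assumptions made for illustration. The primary tracks which keys changed since the last checkpoint and ships only those keys to the backup, instead of forwarding every update or copying the whole store.

```python
from typing import Dict, Optional, Set


class Backup:
    """Hypothetical backup replica: holds state received via checkpoints."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    def apply_checkpoint(self, delta: Dict[str, Optional[str]]) -> None:
        # A value of None marks a key that was deleted on the primary.
        for key, value in delta.items():
            if value is None:
                self.store.pop(key, None)
            else:
                self.store[key] = value


class Primary:
    """Hypothetical primary: in-memory key-value store with dirty-key tracking."""

    def __init__(self, backup: Backup, period: int) -> None:
        self.store: Dict[str, str] = {}
        self.dirty: Set[str] = set()       # keys changed since last checkpoint
        self.backup = backup
        self.period = period               # assumption: period counted in updates
        self.updates_since_checkpoint = 0
        self.checkpoints_sent = 0

    def get(self, key: str) -> Optional[str]:
        return self.store.get(key)

    def put(self, key: str, value: str) -> None:
        self.store[key] = value
        self._mark_dirty(key)

    def delete(self, key: str) -> None:
        if key in self.store:
            del self.store[key]
            self._mark_dirty(key)

    def _mark_dirty(self, key: str) -> None:
        self.dirty.add(key)
        self.updates_since_checkpoint += 1
        if self.updates_since_checkpoint >= self.period:
            self.checkpoint()

    def checkpoint(self) -> None:
        """Send only the keys modified since the previous checkpoint."""
        if not self.dirty:
            return
        delta = {key: self.store.get(key) for key in self.dirty}
        self.backup.apply_checkpoint(delta)
        self.dirty.clear()
        self.updates_since_checkpoint = 0
        self.checkpoints_sent += 1
```

In this sketch, a key written several times within one period is transferred only once, which is one plausible reason an incremental scheme can reduce checkpoint size. The trade-off is that updates made after the last checkpoint are lost if the primary fails before the next one.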
Description
Source:
Proceedings - IEEE Symposium on Computers and Communications
Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Keywords:
Subject
Computer science, Information systems, Engineering, Electrical electronic engineering, Telecommunications