Robust Distributed Bayesian Learning with Stragglers via Consensus Monte Carlo

Hari Hara Suthan Chittoor, Osvaldo Simeone

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper studies distributed Bayesian learning in a setting encompassing a central server and multiple workers by focusing on the problem of mitigating the impact of stragglers. The standard one-shot, or embarrassingly parallel, Bayesian learning protocol known as consensus Monte Carlo (CMC) is generalized by proposing two straggler-resilient solutions based on grouping and coding. Two main challenges in designing straggler-resilient algorithms for CMC are the need to estimate the statistics of the workers' outputs across multiple shots, and the joint non-linear post-processing of the outputs of the workers carried out at the server. This is in stark contrast to other distributed settings like gradient coding, which only require the per-shot sum of the workers' outputs. The proposed methods, referred to as Group-based CMC (G-CMC) and Coded CMC (C-CMC), leverage redundant computing at the workers in order to enable the estimation of global posterior samples at the server based on partial outputs from the workers. Simulation results show that C-CMC may outperform G-CMC for a small number of workers, while G-CMC is generally preferable for a larger number of workers.
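For readers unfamiliar with the baseline, the following is a minimal sketch of the standard one-shot CMC aggregation step that the abstract refers to. It is not the G-CMC or C-CMC method of the paper. It assumes the commonly used weighting in which each worker's samples are weighted by the inverse of that worker's sample covariance; the function name `cmc_combine` and the toy Gaussian setup are hypothetical and chosen only for illustration. The sketch shows why the server needs statistics estimated across all samples from every worker, and why a missing worker output (a straggler) prevents the combination from being computed as written.

```python
import numpy as np


def cmc_combine(worker_samples):
    """Combine per-worker samples with a covariance-weighted average.

    worker_samples: array-like of shape (K, S, d) with K workers,
    S samples per worker, and d parameter dimensions.
    Returns an array of shape (S, d) of approximate global samples.
    """
    samples = np.asarray(worker_samples, dtype=float)
    K, S, d = samples.shape

    # One weight matrix per worker: the inverse of its sample covariance,
    # estimated across all S samples (assumed weighting choice).
    weights = np.stack([
        np.linalg.inv(np.atleast_2d(np.cov(samples[k], rowvar=False)))
        for k in range(K)
    ])  # shape (K, d, d)

    # Sum over workers of W_k @ theta_{k,s}, for every sample index s.
    weighted = np.einsum("kij,ksj->si", weights, samples)  # shape (S, d)

    # Normalize by the sum of the weight matrices.
    total = weights.sum(axis=0)  # shape (d, d)
    return np.linalg.solve(total, weighted.T).T


if __name__ == "__main__":
    # Toy setup (assumption): Gaussian mean with unit variance, flat prior,
    # so each worker's subposterior is N(local mean, 1 / n).
    rng = np.random.default_rng(0)
    K, n, S = 4, 100, 2000
    data = rng.normal(2.0, 1.0, size=(K, n))
    subposterior = np.stack([
        rng.normal(data[k].mean(), 1.0 / np.sqrt(n), size=(S, 1))
        for k in range(K)
    ])
    combined = cmc_combine(subposterior)
    print(combined.mean(), combined.var())
```

In this toy Gaussian case the combined samples should concentrate around the mean of all the data with variance close to 1/(K·n). The combination touches every worker's samples jointly, which is the difficulty the abstract contrasts with gradient coding, where only a per-shot sum is needed.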

Original language: English (US)
Title of host publication: 2022 IEEE Global Communications Conference, GLOBECOM 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 609-614
Number of pages: 6
ISBN (Electronic): 9781665435406
DOIs
State: Published - 2022
Event: 2022 IEEE Global Communications Conference, GLOBECOM 2022 - Virtual, Online, Brazil
Duration: Dec 4 2022 → Dec 8 2022

Publication series

Name: 2022 IEEE Global Communications Conference, GLOBECOM 2022 - Proceedings

Conference

Conference: 2022 IEEE Global Communications Conference, GLOBECOM 2022
Country/Territory: Brazil
City: Virtual, Online
Period: 12/4/22 → 12/8/22

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Signal Processing
  • Renewable Energy, Sustainability and the Environment
  • Safety, Risk, Reliability and Quality

Keywords

  • Consensus Monte Carlo
  • Distributed Bayesian learning
  • coded computing
  • grouping
  • stragglers
