APPLES: Efficiently handling spin-lock synchronization on virtualized platforms

Jianchen Shan, Xiaoning Ding, Narain Gehani

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Spin-locks are widely used in software for efficient synchronization. However, they cause serious performance degradation on virtualized platforms, such as the Lock Holder Preemption (LHP) problem and the Lock Waiter Preemption (LWP) problem, due to excessive spinning by virtual CPUs (VCPUs). The excessive spinning occurs when a VCPU waits to acquire a spin-lock. To address the performance degradation, hardware facilities, such as Intel PLE and AMD PF, are provided on processors to preempt VCPUs when they spin excessively. Although these facilities have been predominantly used on mainstream virtualization systems, using them in a manner that achieves the highest performance is still a challenging issue. There are two core problems in using these hardware facilities to reduce excessive spinning. One is to determine the best time to preempt a spinning VCPU (i.e., the selection of spinning thresholds). The other is which VCPU should be scheduled to run after the spinning VCPU is descheduled. Due to the semantic gap between different software layers, the virtual machine monitor (VMM) does not have information about the computation characteristics on VCPUs, which is needed to address the above problems. This makes the problems inherently challenging. We propose a framework named AdaPtive Pause-Loop Exiting and Scheduling (APPLES) to address these problems. APPLES monitors the overhead caused by excessive spinning and preempting spinning VCPUs, and periodically adjusts spinning thresholds to reduce the overhead. APPLES also evaluates and schedules "ready" VCPUs in a VM by their potential to reduce the spinning incurred by the spin-lock synchronization. The evaluation is based on the causality and the time of VCPU preemptions. The implementation of APPLES incurs only minimal changes to existing systems (about 100 lines of code in KVM). Experiments show that APPLES can improve performance by 3 to 49 percent (14 percent on average) for the workloads with frequent spin-lock operations.

Original language: English (US)
Article number: 7736153
Pages (from-to): 1811-1824
Number of pages: 14
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 28
Issue number: 7
DOIs
State: Published - Jul 2017

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics

Keywords

  • Cloud computing
  • Lock holder preemption
  • Multi-core
  • Scheduling
  • Spin-lock synchronization
  • Virtualization
