FarSpot: Optimizing Monetary Cost for HPC Applications in the Cloud Spot Market

Amelie Chi Zhou, Jianming Lao, Zhoubin Ke, Yi Wang, Rui Mao*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

9 Citations (Scopus)

Abstract

Recently, many HPC applications have been developed and hosted in the cloud, where they benefit from elastic and diversified resources but incur high costs because they are long-running. Although public clouds such as Amazon EC2 offer spot instances whose dynamic prices are usually lower than on-demand prices, spot prices can vary significantly and can sometimes even exceed the on-demand price of the same instance type. Previous work on reducing the monetary cost of HPC applications with spot instances focused on designing fault-tolerance techniques or on selecting appropriate instance types and bid prices to make good use of low spot prices. However, with the recent update to the spot pricing model on Amazon EC2, these approaches may become either inefficient or invalid. In this article, we present FarSpot, an optimization framework for HPC applications in the latest cloud spot market, with the goal of minimizing application cost while satisfying performance constraints. FarSpot provides accurate long-term price prediction for a wide range of spot instance types using an ensemble-based learning method. It further incorporates a cost-aware deadline assignment algorithm that distributes the application deadline across tasks according to spot price changes. Given the subdeadline assigned to each task, FarSpot dynamically migrates tasks among spot instances to reduce execution cost. Evaluation results on real HPC benchmarks show that 1) the prediction error of FarSpot is very low (below 3%), 2) FarSpot reduces the monetary cost by 32% on average compared to state-of-the-art algorithms, and 3) FarSpot satisfies the user-specified deadline constraints at all times.

Original language: English
Pages (from-to): 2955-2967
Number of pages: 13
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 33
Issue number: 11
Publication status: Published - Nov 2022

Scopus Subject Areas

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics

User-Defined Keywords

  • Cloud computing
  • ensemble models
  • price prediction
  • spot market
