An Implementation of Job Running Backup Function in User-PC Computing System

Hein Htet, Nobuo Funabiki, Ariel Kamoyedji, Xudong Zhou, Xu Xiang, Shinji Sugawara, Wen Chung Kao

Research output: Chapter in Book/Report › Conference contribution

Abstract

As a low-cost and high-performance distributed computing platform, we have studied the User-PC Computing (UPC) system based on the master-worker model. Docker container technology is adopted to run various application programs, or jobs, on the heterogeneous PC environments of the workers. Some jobs, such as physics simulations and neural networks, require long CPU time, which increases the probability that a worker fails while running them. Automatic backup of the job running state and migration to another worker are therefore essential to reduce the job completion delay. In this paper, we implement the job running backup function in the UPC system. Checkpoint-Restore in Userspace (CRIU) is applied periodically to capture the running state of a job at a worker. When the master detects a worker failure, it automatically migrates the job to another worker. To evaluate the function, we conducted experiments on a testbed UPC system with 14 jobs and six workers of different specifications, and confirmed that the proposal successfully resumes the job from the interrupted point at another worker.
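The backup-and-migrate flow described in the abstract can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the authors' implementation: the `Job` and `Master` classes, the heartbeat-timeout failure detector, and the "first alive worker" selection rule are all hypothetical, and a step counter stands in for the CRIU checkpoint image that the real system would capture and transfer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Job:
    job_id: str
    total_steps: int
    worker: Optional[str] = None
    checkpoint_step: int = 0  # last step saved by the periodic backup


@dataclass
class Master:
    workers: List[str]
    heartbeat_timeout: float = 30.0  # assumed failure-detection threshold (seconds)
    last_heartbeat: Dict[str, float] = field(default_factory=dict)
    jobs: Dict[str, Job] = field(default_factory=dict)

    def assign(self, job: Job, worker: str) -> None:
        job.worker = worker
        self.jobs[job.job_id] = job

    def heartbeat(self, worker: str, now: float) -> None:
        self.last_heartbeat[worker] = now

    def record_checkpoint(self, job_id: str, step: int) -> None:
        # Stand-in for receiving a periodic CRIU checkpoint from the worker.
        self.jobs[job_id].checkpoint_step = step

    def failed_workers(self, now: float) -> List[str]:
        # A worker that was never heard from is also treated as failed.
        return [
            w for w in self.workers
            if now - self.last_heartbeat.get(w, float("-inf")) > self.heartbeat_timeout
        ]

    def migrate_failed_jobs(self, now: float) -> List[str]:
        failed = set(self.failed_workers(now))
        alive = [w for w in self.workers if w not in failed]
        migrated = []
        for job in self.jobs.values():
            if job.worker in failed and alive:
                # The job resumes on the new worker from its last checkpoint.
                job.worker = alive[0]
                migrated.append(job.job_id)
        return migrated


# Usage: pc1 stops sending heartbeats after a checkpoint at step 40.
master = Master(workers=["pc1", "pc2"])
job = Job("sim", total_steps=100)
master.assign(job, "pc1")
master.heartbeat("pc1", now=0.0)
master.heartbeat("pc2", now=0.0)
master.record_checkpoint("sim", step=40)
master.heartbeat("pc2", now=50.0)  # pc1 stays silent
migrated = master.migrate_failed_jobs(now=60.0)
remaining_steps = job.total_steps - job.checkpoint_step
```

In this sketch the job loses only the work done after its last checkpoint, which is why a shorter checkpoint interval reduces the recomputation after a failure at the cost of more frequent backups.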

Original language: English
Title of host publication: 2022 4th International Conference on Computer Communication and the Internet, ICCCI 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 156-161
Number of pages: 6
ISBN (Electronic): 9781665469920
Publication status: Published - 2022
Event: 4th International Conference on Computer Communication and the Internet, ICCCI 2022 - Chiba, Japan
Duration: July 1, 2022 to July 3, 2022

Publication series

Name: 2022 4th International Conference on Computer Communication and the Internet, ICCCI 2022

Conference

Conference: 4th International Conference on Computer Communication and the Internet, ICCCI 2022
Country/Territory: Japan
City: Chiba
Period: 2022/07/01 to 2022/07/03

ASJC Scopus subject areas

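  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Information Systems and Management
  • Atomic and Molecular Physics, and Optics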
