Research Article Open Access

Fault Tolerance Grid Scheduling with Checkpoint Based on Ant Colony System

Saufi Bukhari1, Ku Ruhana Ku-Mahamud1 and Hiroaki Morino2
  • 1 Universiti Utara Malaysia, Malaysia
  • 2 Shibaura Institute of Technology, Japan

Abstract

Task resubmission and checkpointing are among the popular techniques for providing fault tolerance in grid computing. However, because side-by-side comparisons are lacking, it is unclear which technique provides fault tolerance without degrading system performance. This study proposes Dynamic ACS-based Fault Tolerance for grid computing, which combines resubmission to a new resource, checkpointing and the use of resource execution history, with the aim of reducing execution time and task processing time and increasing the success rate in a grid environment. The proposed algorithm is compared with other relevant algorithms in terms of execution time, success rate and average processing time. The results suggest that the proposed algorithm, with improved task resubmission, checkpointing and an extended pheromone update formula, performs better in managing execution failures and in selecting resources during task assignment or resubmission.

Journal of Computer Science
Volume 13 No. 8, 2017, 363-370

DOI: https://doi.org/10.3844/jcssp.2017.363.370

Submitted On: 15 May 2017
Published On: 1 September 2017

How to Cite: Bukhari, S., Ku-Mahamud, K. R. & Morino, H. (2017). Fault Tolerance Grid Scheduling with Checkpoint Based on Ant Colony System. Journal of Computer Science, 13(8), 363-370. https://doi.org/10.3844/jcssp.2017.363.370

Keywords

  • Grid Computing
  • Fault Tolerance
  • Task Resubmission
  • Task Checkpoint
  • Ant Colony System