QUANTITATIVE EVALUATION OF JOB AND RESOURCES FOR BETTER SELECTION TO IMPROVE MAKESPAN IN GRID SCHEDULING

This study presents a priority-based ranking of jobs and resources to improve makespan in the grid scheduling problem. The effectiveness of a grid environment depends largely on the effectiveness and efficiency of its schedulers, which act as local resource brokers. The scheduler is responsible for selecting resources and scheduling jobs so that user and application requirements are met with respect to overall execution time (throughput) and the cost of resource use. The scheduler selects resources that satisfy user-imposed constraints such as CPU usage, available RAM and disk storage. Resources and jobs are selected using the Weighted Priority Based Ranking (WPR) algorithm, which improves performance measures such as makespan. Results are compared with the Round Robin and Weighted Round Robin algorithms, and the proposed method performs better.


INTRODUCTION
The Grid is a new paradigm for solving problems in engineering, science, industry and commerce. Applications use grid infrastructure to meet computational, storage and other needs. A single site no longer meets all the resource needs of today's demanding applications, and using distributed resources brings benefits for application users. Deploying grid systems involves managing heterogeneous, geographically distributed and dynamically available resources.
The objective of a grid is the coordinated use of heterogeneous resources to maximize their combined performance and increase cost-effectiveness. Because of the diverse nature of its resources, a grid is a Heterogeneous Computing (HC) system in which not every machine suits every task (Li and Baker, 2005). Some tasks have specific machine requirements, e.g., the need for a specific instruction set. Hence, to lower overall task execution time and increase system throughput, the correct resources should be assigned to each task.
Grid scheduling is complicated because many machines, each with a different local policy, are involved. A metacomputer scheduler, or grid meta-scheduler, is implemented over the local job schedulers. It is the meta-scheduler's responsibility to dispatch jobs to the local schedulers, which then schedule them according to local scheduling policy. A grid scheduling system is divided into three parts: a scheduling policy, an objective function and a scheduling algorithm (Magoules et al., 2009). The scheduling policy is defined by the owner or administrator of the machine, or by the organisation that owns the machine. It is a collection of rules that define how resources are allocated to jobs submitted to a machine. For example, a scheduling policy in an organization may give jobs from department A higher priority than jobs from department B, so if jobs from both departments are submitted simultaneously, the job from department A is scheduled ahead of that from department B.
Science Publications JCS
The objective function assigns a numerical value to a schedule and is used to select one schedule from several possibilities. Usually, an objective function has more than one parameter, which the scheduling system aims to maximize or minimize.
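As an illustration of an objective function, makespan (the metric named in the title) can be computed as the latest finish time across all resources. The sketch below assumes that jobs on one resource run back to back and that run times are known in advance; the function name `makespan`, the data layout and the run times are hypothetical and not taken from the paper.

```python
from typing import Dict, List


def makespan(schedule: Dict[str, List[float]]) -> float:
    """Return the finish time of the busiest resource.

    `schedule` maps a resource name to the run times of the jobs assigned
    to it. Jobs on one resource are assumed to run sequentially.
    """
    if not schedule:
        return 0.0
    return max(sum(run_times) for run_times in schedule.values())


# Two candidate schedules for the same four jobs (hypothetical run times).
schedule_a = {"R1": [4.0, 3.0], "R2": [2.0, 1.0]}
schedule_b = {"R1": [4.0, 1.0], "R2": [3.0, 2.0]}

# A scheduler that minimizes makespan prefers the schedule with the smaller value.
best = min((schedule_a, schedule_b), key=makespan)
```

Here the objective function has a single parameter; a multi-parameter objective could combine makespan with cost in the same way.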
Scheduling algorithms are the heart of scheduling systems. A good scheduling algorithm produces a near-optimal schedule with respect to the chosen objective function and does not require too many resources or too much time to execute.
Jobs and resources are the two main components in scheduling; in grid scheduling they are defined as follows.

Job
A job is a computational activity made up of various resource requirements (CPU, software libraries, number of nodes and memory) and constraints, expressed in a job description. In the simplest case a job has one task; otherwise it has numerous tasks requiring varied processing capabilities.

Resource
A resource is a computational entity (a computational device or service) on which jobs, tasks or applications are scheduled, allocated and processed. Resources have their own characteristics, such as CPU characteristics, memory and software. Some parameters are linked to the resource, among them processing speed and workload, which change over time. Resources can also belong to different administrative domains, with different policies on access and usage.
In addition to the type of scheduling algorithm used, the nature of the application also affects scheduling results and must be considered during scheduling. Generally, applications are divided into two basic classes, data-intensive and computation-intensive. Data-intensive applications devote most of their operation time to accessing data (Wong et al., 2004), while computation-intensive applications devote most of their operation time to computing or processing data (Xhafa and Abraham, 2010). Almost no application belongs exclusively to either group; rather, each needs data and computational resources in some proportion for execution. Every application is both data-intensive and computation-intensive, but the ratio between the two differs between applications.
A grid is a large-scale, heterogeneous, dynamic collection of independent systems, geographically distributed and interconnected by high-speed networks. Resource allocation in grids assigns user jobs to CPUs. Jobs are divided into tasks, which are allocated to various computers on the grid for execution. Resource allocation is a critical feature of grid technology. It has been found that resource heterogeneity significantly affects resource allocation in terms of performance, reliability, robustness and scalability.
A heterogeneous grid infrastructure is a dynamic environment in which the location, type and performance of its elements change constantly; for example, a component resource can be added to or removed from the grid at any time. Resources may not be fully dedicated to such environments, and hence a system's computational capability varies over time (Abba et al., 2012).
Each site in a grid has its own scheduling policy. Certain jobs have higher priority on specific resources. For example, local jobs are given higher priority so that they are better served on local resources.
Resource management and scheduling systems for grid computing manage resources and application execution based on the requirements of resource consumers and owners, and need to adapt continuously to changes in resource availability. This raises many challenging issues that need addressing, such as a heterogeneous substrate, site autonomy, online control, policy extensibility, resource allocation or co-allocation, resource trading and QoS-based scheduling. A grid resource manager provides functionality to discover and publish resources and to schedule, submit and monitor jobs. However, computing resources are geographically distributed under varied ownership, each with its own access policy, cost and constraints.
Most scheduling algorithms are either resource centric or job centric. To overcome this, a new algorithm called Weighted Priority Based Ranking (WPR) is proposed. In the new algorithm, resources are given a weightage based on CPU and RAM, with a total weight of 10 shared between CPU and RAM. For jobs, both user priority and system priority are considered, and the job location sum is calculated from this information. A high priority job is then allocated to a highly weighted resource.
The rest of the paper is organized as follows. Section 1.1 describes related work on the grid scheduling problem. The method is presented in section 2. In section 3 the results are presented, and section 4 concludes and describes some future work.

Related Work
A priority-based multiple queue scheduling algorithm for the grid was proposed by Singh and Kaur (2008). The priority-based multiple queue approach addresses the problem of choosing the best combination of job sequences. This increases the performance of the scheduler and, in turn, of the grid environment. The algorithm uses first come first serve, shortest job first and round robin scheduling to find the best combination for the job sequence. It has three queues, each with its own algorithm for arranging jobs in that queue: First Come First Serve in the first queue, Shortest Job First in the second and Round Robin in the last (FCFS -> SJF -> RR). Kayande and Shrawankar (2011) proposed priority-based pre-emptive task scheduling, which interrupts low priority tasks when high priority tasks are in the queue. This scheduling is used for a mobile operating system, as CPU utilization is medium and turnaround time and response time are high. SMS categorization is achieved by redirecting messages to a Priority Inbox. Azmi and Bakar (2011) stated that priority rules are also referred to as queue-based. Instead of guaranteeing an optimal solution, such techniques find solutions in a short time. Although suboptimal, they are still frequently used to solve real-world scheduling problems because of their ease of implementation and low time complexity. This study used six priority rule algorithms, among them Earliest Deadline. Soni (2010) proposed a Grouping-Based Job Scheduling Model in Grid Computing, a memory-based grouping job scheduling strategy. Jobs are grouped according to resource capability. This maximizes the use of grid resources and reduces the job processing time and network delay incurred in scheduling and executing grid jobs. Selvarani and Sadhasivam (2010) suggested a scheduling approach in which tasks are grouped and allocated non-uniformly. The processing capability of each resource is calculated as a percentage of the total processing capability of all resources. Using this percentage, the resource's processing capability relative to the total length of the tasks to be scheduled is calculated. This approach, through job grouping, optimizes the computation/communication ratio and increases resource use. For effective resource use and to distribute jobs to the available resources, the processing capability of each resource relative to the total processing capability of all resources is calculated.
A dynamic priority scheduler for advanced reservation in grid computing was proposed by Ahuja et al. (2009). The concept consists of two components, DPSAR and Advance Resource Reservation (ARM). DPSAR schedules jobs by resolving job priorities dynamically, while ARM handles the reservation of jobs scheduled by DPSAR.
A Guest-Aware Priority-Based Virtual Machine Scheduling for Highly Consolidated Server was proposed by Kim et al. (2008). The suggested scheme selects the next task to be scheduled based on task priorities and the I/O usage status of the virtual machines. Al-Khateeb et al. (2012) stated that the primary meta-scheduling problem is selecting the best resources (sites) to execute the underlying jobs while achieving these objectives: reducing mean job turnaround time, ensuring site load balance and considering job priorities. A user's priority is taken to indicate the user's requirements. Job scheduling submits jobs with higher priorities before those with lower priorities. Under this policy, high priority jobs have the potential to access more powerful resources.
A new model that assigns a priority to each user-level job was proposed by Datta and Banerjee (2012). Jobs are submitted to the Grid Broker (GLO), which lists them by priority and sends them to the Local Broker Manager (L) for allotment of job resources. The study by Kirubanand and Palaniammal (2011) mainly focuses on the M/M (a,b)/1 Markovian model with the AdaBoost algorithm and user selection algorithms to measure performance on wired and wireless technologies in terms of service rate, arrival rate, expected waiting time and busy period. Lee et al. (2011) dealt with the scoring of computing resources among clusters. In the suggested system, an adaptive scoring method (ASJS) is used to schedule jobs in the grid environment. ASJS selects the fittest resource for job execution according to resource status. A cluster with high computing power is selected from among the clusters, and appropriate resources in the selected cluster are identified for job submission using average transmission power. Nojabaei et al. (2012) proposed a method that allows job data (time stamp, time action, priority) on different scales to be compared by bringing them to a common scale. The jobs are then arranged according to three criteria: priority, time action and time stamp. This sorting algorithm is programmed with the MATLAB Distributed Computing Server (DCS) software.

MATERIALS AND METHODS
The above works are either job centric or resource centric, which does not improve the overall completion time of the application. A new approach that takes both sides into account is therefore proposed. Figure 1 shows the grid scheduling architecture.

Weighted Priority Based Ranking Algorithm
Many scheduling algorithms are in use to decide the order of execution when there are many processes in a queue. Schedulers are based on either a preemptive or a non-preemptive technique. In preemptive methods, the scheduler can interrupt a job once it has been given to the CPU, whereas in non-preemptive methods the job cannot be interrupted. Well-known CPU scheduling algorithms include First Come First Serve (FCFS), Shortest Job First (SJF) and Priority scheduling (Li and Baker, 2005), all of which are non-preemptive and unsuitable for time sharing systems. In FCFS, jobs are executed in arrival order. In SJF, the job with the least expected completion time is executed first. Shortest Remaining Time First (SRTF) and Round Robin (RR) are preemptive in nature, with RR being highly suitable for time sharing systems.
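The difference between FCFS and SJF ordering can be sketched as follows. This is a minimal illustration that assumes a single CPU, non-preemptive execution and all jobs available at time zero; the job names and run times are hypothetical.

```python
from typing import List, Tuple

Job = Tuple[str, float]  # (job name, expected run time)


def fcfs_order(jobs: List[Job]) -> List[Job]:
    """FCFS: keep the arrival order (the input list is assumed to be in arrival order)."""
    return list(jobs)


def sjf_order(jobs: List[Job]) -> List[Job]:
    """SJF: shortest expected run time first; ties keep arrival order (stable sort)."""
    return sorted(jobs, key=lambda job: job[1])


def average_waiting_time(ordered: List[Job]) -> float:
    """Mean time a job waits before it starts when jobs run back to back."""
    if not ordered:
        return 0.0
    elapsed = 0.0
    total_wait = 0.0
    for _, run_time in ordered:
        total_wait += elapsed   # this job waited for everything before it
        elapsed += run_time
    return total_wait / len(ordered)


jobs = [("J1", 6.0), ("J2", 2.0), ("J3", 4.0)]
fcfs_wait = average_waiting_time(fcfs_order(jobs))
sjf_wait = average_waiting_time(sjf_order(jobs))
```

For this input SJF yields the lower average waiting time, because the short jobs no longer wait behind the long one.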
The Round Robin (RR) algorithm overcomes this limitation by assigning each job a time interval, called a quantum, in which to run. If a job is incomplete at the end of its quantum, it returns to the queue to await the next round (Soni, 2010). The main challenge with this algorithm is finding a suitable quantum length. Its drawbacks are that it gives equal time to all processes (processes are scheduled in a first come first serve manner) and that it is inefficient for processes with smaller CPU bursts, which leads to increased waiting and response times and thereby lowers system throughput.
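The quantum mechanism can be sketched with a simple queue simulation. The sketch assumes one CPU, all jobs arriving at time zero and no context-switch cost; the function name `round_robin` and the sample run times are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple


def round_robin(jobs: List[Tuple[str, float]], quantum: float) -> Dict[str, float]:
    """Simulate Round Robin on one CPU and return each job's completion time."""
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    queue = deque(jobs)            # (name, remaining run time), in arrival order
    clock = 0.0
    finished: Dict[str, float] = {}
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)
        clock += run
        remaining -= run
        if remaining > 0:
            queue.append((name, remaining))   # incomplete: back of the queue
        else:
            finished[name] = clock
    return finished


completion = round_robin([("J1", 5.0), ("J2", 2.0), ("J3", 3.0)], quantum=2.0)
```

Varying `quantum` in such a simulation shows the trade-off mentioned above: a very large quantum degenerates to FCFS, while a very small one keeps every job cycling through the queue.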
The grid scheduling architecture has different components such as the Broker, the Information Service and the scheduler. The scheduler schedules jobs and resources based on the information provided by the broker, as shown in Fig. 1. The proposed algorithm eliminates the drawbacks of the round robin implementation by scheduling processes through weight assignment. The architecture of the proposed system is shown in Fig. 3. The proposed Weighted Round Robin algorithm depends on:
• Number of hops from the task-allocating server to the job-performing cluster
• Average bandwidth between the allocation server and the cluster
The performance of the weighted round robin algorithm is compared to simple RR for a specific number of resource clusters and a varying number of tasks. A new method that evaluates both jobs and resources is proposed. The scheduling architecture and activity are represented in Fig. 2.
First, jobs and their priorities are received from the user, and information about the resources is obtained from the grid information service. With this information, a quantitative analysis of jobs and resources is performed for better pairing, to improve the overall turnaround time. Both the user priority and the system priority are considered; the system priority is calculated by Shortest Job First and First Come First Served, and from these the common location sum is calculated. Jobs are then ranked by their common location sum. Similarly, resources are ranked by their CPU and RAM weightage.
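The pairing step, in which the highest ranked job goes to the highest weighted resource, might be sketched as below. The paper does not state how ties are broken or what happens when jobs outnumber resources, so the tie-break by name and the wrap-around over resources are assumptions made only for this illustration; all names and values are hypothetical.

```python
from typing import Dict, List, Tuple


def pair_by_rank(job_location_sums: Dict[str, int],
                 resource_weights: Dict[str, float]) -> List[Tuple[str, str]]:
    """Pair jobs and resources rank by rank.

    Jobs are ranked by ascending location sum (lowest sum = highest priority),
    resources by descending weight. ASSUMPTIONS: ties are broken by name, and
    when jobs outnumber resources the resource ranking is reused from the top.
    """
    if not resource_weights:
        raise ValueError("at least one resource is required")
    jobs = sorted(job_location_sums, key=lambda j: (job_location_sums[j], j))
    resources = sorted(resource_weights, key=lambda r: (-resource_weights[r], r))
    return [(job, resources[i % len(resources)]) for i, job in enumerate(jobs)]


pairs = pair_by_rank(
    job_location_sums={"J1": 9, "J2": 5, "J3": 12},   # hypothetical sums
    resource_weights={"R1": 6.5, "R2": 9.0},           # hypothetical weights
)
```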

Resource
For a given job, the resource selection process chooses the best resource from the R Selected List. A good algorithm is needed to choose the best resource; random selection may work, but it is not an ideal resource selection policy.
The algorithm should take into account the current state of each resource and choose the best one on the basis of a quantitative evaluation. The algorithm takes only CPU and RAM into account. The total weight is 10, of which the CPU weight is 6 and the RAM weight is 4. The minimum CPU speed is 1 GHz and the minimum RAM size is 256 MB (Table 1).
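One way to turn the 6/4 split into a score out of 10 is sketched below. The mapping from CPU speed and RAM size to weight points is defined in Table 1, which is not reproduced here, so the linear scaling against the best CPU and RAM in the pool is purely an assumption, as is the exclusion of resources below the stated minimums. Resource names and specifications are hypothetical.

```python
from typing import Dict, Tuple

CPU_WEIGHT = 6.0     # from the text: CPU share of the total weight of 10
RAM_WEIGHT = 4.0     # from the text: RAM share of the total weight of 10
MIN_CPU_GHZ = 1.0    # from the text: minimum CPU speed
MIN_RAM_MB = 256.0   # from the text: minimum RAM size


def resource_weights(resources: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Score each eligible resource out of 10 from (cpu_ghz, ram_mb).

    ASSUMPTION: scores scale linearly against the best CPU and RAM in the
    pool; the paper's Table 1 may define a different mapping.
    """
    eligible = {
        name: spec for name, spec in resources.items()
        if spec[0] >= MIN_CPU_GHZ and spec[1] >= MIN_RAM_MB
    }
    if not eligible:
        return {}
    max_cpu = max(cpu for cpu, _ in eligible.values())
    max_ram = max(ram for _, ram in eligible.values())
    return {
        name: CPU_WEIGHT * cpu / max_cpu + RAM_WEIGHT * ram / max_ram
        for name, (cpu, ram) in eligible.items()
    }


weights = resource_weights({
    "R1": (2.0, 1024.0),
    "R2": (1.0, 512.0),
    "R3": (0.8, 2048.0),   # below the minimum CPU speed, so excluded
})
```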

Job
On the job side, both the user priority and the system priority are taken into account. Using those priorities, the common location sum is calculated for each job. High priority jobs can then be assigned to the first-ranked site to achieve the minimum turnaround time for the job with the available resources. Table 3 shows the priority-wise ranking of jobs.
In the system, user priorities and system priorities are considered at the same time to obtain the benefits of each queuing criterion. Jobs are sorted in ascending order under each queuing criterion, and the highest priority job takes the first location. Therefore, according to the user priority criterion, J2 has the highest priority and is assigned location 1. According to SJF, J5 has the highest priority. According to FIFO, J9 has the highest priority. The job location sum is the sum of a job's locations over all criteria. For example, J2 has location 1 by the user priority criterion, location 3 by the SJF criterion and location 5 by the FIFO criterion, giving a job location sum of 9. All jobs are treated in the same way. The last column in Table 3 is the job priority, which is obtained by ranking jobs on the job location sum, so that the job with the lowest job location sum receives the highest job priority. Table 4 shows a sample of 10 jobs and 10 resources with the assigned weights and priority ranking. This helps in better pairing of jobs and resources.
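The job location sum can be sketched as follows. The three criteria (user priority, SJF, FIFO) come from the text; the `Job` fields, the convention that a smaller user priority value is more urgent, the tie-break by name and the sample data are assumptions for illustration and do not reproduce Table 3.

```python
from typing import Callable, Dict, List, NamedTuple


class Job(NamedTuple):
    name: str
    user_priority: int   # ASSUMPTION: smaller value = more urgent
    run_time: float      # expected run time, used by the SJF criterion
    arrival: float       # arrival time, used by the FIFO criterion


def locations(jobs: List[Job], key: Callable[[Job], float]) -> Dict[str, int]:
    """Sort jobs ascending by `key` and return each job's 1-based location."""
    ordered = sorted(jobs, key=key)   # stable sort: ties keep input order
    return {job.name: position for position, job in enumerate(ordered, start=1)}


def job_location_sums(jobs: List[Job]) -> Dict[str, int]:
    """Sum each job's location over the user priority, SJF and FIFO criteria."""
    by_user = locations(jobs, key=lambda j: j.user_priority)
    by_sjf = locations(jobs, key=lambda j: j.run_time)
    by_fifo = locations(jobs, key=lambda j: j.arrival)
    return {j.name: by_user[j.name] + by_sjf[j.name] + by_fifo[j.name] for j in jobs}


def priority_order(jobs: List[Job]) -> List[str]:
    """Job names from highest to lowest priority (lowest location sum first)."""
    sums = job_location_sums(jobs)
    return sorted(sums, key=lambda name: (sums[name], name))


jobs = [
    Job("J1", user_priority=2, run_time=8.0, arrival=1.0),
    Job("J2", user_priority=1, run_time=5.0, arrival=0.0),
    Job("J3", user_priority=3, run_time=3.0, arrival=0.5),
]
sums = job_location_sums(jobs)
order = priority_order(jobs)
```

Summing locations rather than raw values keeps the three criteria on a common scale, so no single criterion dominates because of its units.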

Experimental Setup
Simulations were carried out in the SimGrid framework. The number of node clusters is 5; the numbers of jobs used in the simulations are 100, 200, 300, 400 and 500; jobs are of uniform size; and the job failure probability ρ is 0.2. The scheduling schemes used are Round Robin, Weighted Round Robin and Weighted Priority Based Ranking. To determine the quality of scheduling, the execution time metric is considered. Execution time is the time required to run a job on a resource. The scheduler aims to choose the resource that leads to the least execution time.

RESULTS AND DISCUSSION
The graph in Fig. 4 shows that the proposed Weighted Priority Based Ranking algorithm has a lower execution time than the Round Robin and Weighted Round Robin scheduling algorithms.
It is observed from Fig. 4 that the proposed Weighted Priority Based Ranking algorithm achieves a 16.72% to 26.67% decrease in execution time compared with Round Robin scheduling. Compared with Weighted Round Robin, the proposed method decreases execution time by 7.05% to 24.58% for a varying number of tasks.
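A percentage decrease of this kind is conventionally computed relative to the baseline algorithm's execution time. The helper below is a minimal sketch of that calculation; the function name is hypothetical and the sample times are invented for illustration, not values read from Fig. 4.

```python
def percent_decrease(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - proposed) / baseline * 100.0


# Hypothetical execution times, not results from the paper.
example = percent_decrease(baseline=200.0, proposed=150.0)
```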

CONCLUSION
A grid environment is a potentially complex, globally distributed system involving large sets of diverse, geographically distributed components serving many applications. Scheduling decisions in a grid system are based on mapping the best resources to jobs. Grid performance can be improved by ensuring that all available grid resources are used optimally through a good scheduling algorithm. Simulation results show that the proposed Weighted Priority Based Ranking algorithm achieves a 16.72% to 26.67% decrease in execution time compared with Round Robin scheduling.
Future work can concentrate on the stage after resources and jobs have been paired by WPR. The status of every individual resource should be considered in order to update the ranking of jobs and the weights of resources from local and global updates in the Information Service.