Dynamic Scheduling for Cloud Reliability using Transportation Problem

Problem statement: The cloud is a purely dynamic environment, yet existing task scheduling algorithms are mostly static and consider parameters such as time, cost, makespan, speed, scalability, throughput, resource utilization and scheduling success rate. The available scheduling algorithms are mostly heuristic, complex and time consuming, and they do not consider the reliability and availability of the cloud computing environment. There is therefore a need for a scheduling algorithm that improves availability and reliability in the cloud environment. Approach: We propose a new algorithm for task scheduling and resource allocation in decentralized dynamic cloud computing, based on a modified transportation problem from linear programming. The main objective is to improve the reliability of the cloud computing environment by periodically considering the available resources and the working status of each Cluster, and to maximize the profit of cloud providers by minimizing the total cost of scheduling, allocation and execution and by minimizing the total turn-around time, total waiting time and total execution time. The proposed algorithm also uses historical values for each task, such as its past success rate and failure rate in each Cluster, its previous execution time and its total cost in the various Clusters, taken from the Task Info Container (TFC), for task scheduling and resource allocation in the near future. Results: Our approach, TP Scheduling (Transportation Problem based), responded to tasks assigned by clients in a Poisson arrival pattern and achieved improved reliability in a dynamic decentralized cloud environment. Conclusion: The proposed TP Scheduling algorithm improves the reliability of decentralized dynamic cloud computing.


INTRODUCTION
Cloud computing refers to the Internet-based development and use of computer technology. It can be described as a model of Internet-based computing and a subscription-based service through which users obtain networked storage space, computing resources and so on. In cloud computing, dynamically scalable (and mostly virtualized) resources are provided as a service over the Internet. Promoted by the world's leading companies, cloud computing is attracting more and more attention because it provides a flexible, on-demand computing infrastructure for a number of applications.
The cloud computing definition (Badger et al., 2011; An and Neuman, 2011) given by the National Institute of Standards and Technology is: "Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction". The general goal of cloud computing is to provide services to users with greater flexibility and availability, which is often described as "taking everything as a service" (XaaS) (An and Neuman, 2011).
Demand (Goudarzi and Pedram, 2011) for computing power has been increasing because information technologies now pervade our daily interactions with the world at both personal and public levels, encompassing business, commerce, education, manufacturing and communication services. At the personal level, the wide-scale presence of online banking, e-commerce, SaaS (Software as a Service), social networking and so on produces workloads of great diversity and enormous scale. At the same time, the computing and information processing requirements of scientific research (Hoffa et al., 2008), public organizations and private corporations have also been increasing rapidly. Examples include the digital services and functions required by various industrial sectors, ranging from manufacturing to housing and from transportation to banking. Such a dramatic increase in computing demand requires a scalable and dependable IT infrastructure comprising servers, storage, network bandwidth, physical infrastructure, electrical grid (Deelman et al., 2003), IT personnel and billions of dollars in capital expenditure and operational cost (Yang et al., 2008), to name a few. For consumers, the cloud gives the illusion of infinite computing resources available on demand (Armbrust et al., 2008), and computing resources become immediate rather than persistent (Dillon et al., 2010): there is no up-front commitment or contract, since consumers can acquire resources to scale up whenever they want and release them when they scale down. Moreover, because resource provisioning appears infinite to them, consumption can rise rapidly to meet peak requirements at any time. In practice, the physical resources of clouds are limited and a performance bottleneck will eventually develop.
Scheduling (Senkul et al., 2002; Sakellariou and Zhao, 2004) is fundamental to achieving high performance in parallel and distributed systems. Scheduling problems, which are concerned with searching for optimal (or near-optimal) real-time (Srikanth et al., 2012) and predictive schedules subject to a number of constraints (Zhang et al., 2009; Yu and Shi, 2008), are mostly NP-hard. In general, the problem of determining whether there is an assignment of tasks to servers such that each task's demand can be satisfied by the available resources is NP-complete (Heger, 2010), that is, unlikely to be solvable in time bounded by a polynomial function of the problem size. Even if resources are available to meet a certain demand, correctly mapping the set of demands onto a set of resources may be too complex to solve within an acceptable timeframe. In cloud computing, delivering services and resources on demand over a network requires addressing numerous technological issues, including automated provisioning, dynamic virtual server migration and network security. Further, in a cloud environment, not all the resources (virtualized server systems) may actually be available to all customers, because of network latency, commercial agreements or security policy issues.
A cloud resource scheduler (Bautista et al., 2012; Dillon et al., 2010) should make full use of all kinds of resources on the Internet, such as computing, network bandwidth and storage resources. However, most of the jobs that cloud computing has to deal with are small-granularity jobs, which leads to longer waiting times, higher resource consumption, lower flexibility and other drawbacks. In a cloud computing environment with many users and large numbers of concurrent small-granularity jobs, how to dispatch jobs properly to different slave nodes to avoid underutilization and how to deal with workload imbalance are the bottlenecks that most strongly influence system performance. The existing scheduling algorithms (Bala and Chana, 2011) consider parameters such as time, cost, makespan, speed, scalability, throughput, resource utilization and scheduling success rate. For multiple workflows (Yang et al., 2008), however, metrics such as reliability and availability (Bamiah and Brohi, 2011) should also be considered. Existing scheduling algorithms do not consider reliability and availability. There is therefore a need for a workflow scheduling algorithm that improves availability and reliability in the cloud environment.

Related work:
A great deal of research is under way on resource scheduling to improve various factors in cloud computing. Most of this research is based on heuristic algorithms, which require many iterations to reach the optimal cost and to minimize the waiting time and turn-around time. The existing scheduling algorithms do not consider important parameters such as reliability and availability, or the improvement of scalability. A complex algorithm also makes the cloud scheduler itself more complex. Gu et al. (2012) proposed a genetic algorithm based scheduling method that considers the historical data and current states of the VMs, uses a tree structure for the coding in the genetic algorithm, proposes corresponding strategies for selection, hybridization and variation, and places some control on the method so that it has better convergence. Zhang (2009) and Yang et al. (2008) proposed an ant colony based task scheduling architecture to improve the scheduling behavior, better utilize (balance) the available resources, lower the aggregate task execution time and hence minimize cost. Heger (2010) proposed an ANN based task scheduling architecture with the same aims: improving the scheduling behavior, better utilizing (balancing) the available resources, lowering the aggregate task execution time and hence minimizing cost. Henzinger et al. (2010) proposed a method known as flexible provisioning of resources in a cloud environment (FlexPRICE), in which the cloud (provider) and the users build a symbiotic relationship. Instead of renting a set of specific resources, the user simply presents the job to be executed to the cloud. The cloud has an associated pricing model to quote prices for the user jobs executed. Tayal (2011) proposed a centralized scheduler (master node) that makes its choice by referring to a global view of the whole system, with fuzzy setting of the GA parameters. The idea was to adapt the values of the GA operators (selection, crossover, mutation) during the run of the GA.
The fuzzy control is applied if the condition of fuzzy adaptation is true. This model describes the information related to processors, which includes slot information, data replication information and workload information. Senkul et al. (2002) presented a logical framework for scheduling workflows under resource allocation constraints. The framework is based on Concurrent Constraint Transaction Logic (CCTR) and integrates Concurrent Transaction Logic with Constraint Logic Programming. They presented an algorithm that takes an initial workflow specification and a set of resource allocation constraints and returns a new workflow and a resource assignment such that every execution of that workflow is guaranteed to satisfy the constraints. Clark et al. (2012) introduced an Intelligent Cloud Resource Allocation Service (ICRAS). ICRAS supports the consumer with (1) discovering all available resource configurations, (2) choosing the desired configuration, (3) negotiating a service agreement with the CSP, (4) monitoring the service agreement for violations and (5) assisting in the migration of services between CSPs. Ramamritham et al. (1989) were among the first to propose the use of distributed algorithms to schedule tasks with time and resource restrictions. They give different algorithms for this purpose and a comparison of their performance, and they claim that their solution is effective even in hard real-time environments. However, their approach requires each node to have full knowledge of the rest of the system, which naturally limits its scalability. Zhong and Zhang (2010) proposed an optimized scheduling algorithm to achieve optimization or sub-optimization for cloud scheduling. In this algorithm an Improved Genetic Algorithm (IGA) is used for the automated scheduling policy, in order to increase the utilization rate of resources and the speed.
Selvarani and Sadhasivam (2010) proposed an improved cost-based scheduling algorithm for the efficient mapping of tasks to available resources in the cloud. This scheduling algorithm measures both resource cost and computation performance, and it also improves the computation/communication ratio. An and Neuman (2011) proposed a scheduling algorithm that takes both cost and time into account. Their simulation demonstrated that this algorithm can achieve lower cost than others while meeting the user-designated deadline. Liu et al. (2010) presented a novel compromised-time-cost scheduling algorithm that considers the characteristics of cloud computing to accommodate instance-intensive, cost-constrained workflows by compromising between execution time and cost, with user input enabled on the fly. Pandey et al. (2010) presented a Particle Swarm Optimization (PSO) based heuristic to schedule applications to cloud resources that takes into account both computation cost and data transmission cost. It is applied to a workflow application by varying its computation and communication costs. The experimental results show that PSO can achieve cost savings and a good distribution of workload onto resources. Lin and Lu (2011) proposed the SHEFT workflow scheduling algorithm to schedule a workflow elastically in a cloud computing environment. The experimental results show that SHEFT not only outperforms several representative workflow scheduling algorithms in optimizing workflow execution time, but also enables resources to scale elastically at runtime. Wu et al. (2011) proposed a market-oriented hierarchical scheduling strategy that consists of service-level scheduling and task-level scheduling. The service-level scheduling deals with the Task-to-Service assignment, and the task-level scheduling deals with the optimization of the Task-to-VM assignment in local cloud data centers. Xu et al. (2009) worked on multiple workflows and multiple QoS.
They implemented a strategy for a multiple-workflow management system with multiple QoS. This strategy increases the scheduling access rate and minimizes the makespan and cost of workflows on a cloud computing platform. Varalakshmi et al. (2011) proposed the OWS algorithm for scheduling workflows in a cloud environment. The scheduling algorithm finds a solution that meets all user-preferred QoS constraints, and it achieves a significant improvement in CPU utilization. Parsa and Entezari-Maleki (2009) proposed a new task scheduling algorithm, RASA. It is composed of two traditional scheduling algorithms, Max-min and Min-min. RASA uses the advantages of the Max-min and Min-min algorithms and covers their disadvantages. The experimental results show that RASA outperforms the existing scheduling algorithms in large-scale distributed systems (Xu et al., 2009).

Inconveniences with existing methods:
In practice, cloud computing is highly dynamic and tasks are not always executed in the same way. For this type of problem, genetic algorithms have difficulty dealing with "deceptive" fitness functions (Melanie, 1998), that is, those where the locations of improved points give misleading information about where the global optimum is likely to be found. ANN, ant colony (Yang et al., 2008), PSO (Liu et al., 2010) and honey bee algorithms are heuristic and need considerable time to be trained and to react to the situation. Because the cloud is dynamic and new clients and new tasks are continually introduced, the same type of task may not recur frequently.

Linear Programming (LP or linear optimization):
Linear Programming (LP or linear optimization) is a mathematical method for determining a way to achieve the best outcome (such as maximum profit or lowest cost) in a given mathematical model for some list of requirements represented as linear relationships. Linear programming is a specific case of mathematical programming (mathematical optimization).
More formally, linear programming (Liu et al., 2010) is a technique for the optimization of a linear objective function, subject to linear equality and/or linear inequality constraints. Its feasible region is a convex polyhedron, which is a set defined as the intersection of finitely many half-spaces, each of which is defined by a linear inequality. Its objective function is a real-valued affine function defined on this polyhedron.

The Transportation problem:
There is a type of linear programming problem (Reeb and Leavengood, 2002; Liu et al., 2010) that may be solved using a simplified version of the simplex technique called the transportation method. Because its major application is in solving problems that involve transporting products from several sources to several destinations, this type of problem is frequently called the transportation problem. The two common objectives of such problems are either (1) to minimize the cost of shipping m units to n destinations or (2) to maximize the profit of shipping m units to n destinations. Let us assume there are m sources supplying n destinations. The source capacities, the destination requirements and the cost of shipping material from each source to each destination are given as constants.
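For reference, the cost-minimizing form of the balanced transportation problem described above is commonly written as follows. The symbols are notation introduced here for illustration and are not taken from the original formulation: x_ij is the quantity shipped from source i to destination j, c_ij the unit shipping cost, s_i the capacity of source i and d_j the requirement of destination j.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}\,x_{ij} \\
\text{subject to}\quad & \sum_{j=1}^{n} x_{ij} = s_i, \qquad i = 1,\dots,m, \\
& \sum_{i=1}^{m} x_{ij} = d_j, \qquad j = 1,\dots,n, \\
& x_{ij} \ge 0, \qquad i = 1,\dots,m,\; j = 1,\dots,n, \\
\text{where}\quad & \sum_{i=1}^{m} s_i = \sum_{j=1}^{n} d_j .
\end{aligned}
```

The last line is the balance condition: total supply equals total demand.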
Proposed system:
Overview of proposed system:
In our proposed system, clients can assign their tasks a priority value between 1 and 5, where 1 is the highest priority and is charged the most per unit of time and 5 is the lowest priority with the lowest charge per unit of time. Sometimes the priority value is assigned automatically to the client's task based on the Service Level Agreement (SLA) (Buyya et al., 2011) with the client. Our system receives tasks from clients during a flexible quantum of buffer time. This receiving buffer time can be extended based on the inter-arrival time of the tasks. After a set of tasks has been received, it is transferred to the scheduler. The scheduler obtains all necessary information from the other phases, such as the workload predictor, and historical information from the Task Info Container, such as the Expected Execution Time (EET), Expected Worst-case execution Time (EWT), Success Score within Expected Time (SSET), Success Score within Worst-case Time (SSWT), Resources-Required (RR) and cost of executing the task in each Cluster. With that information, preprocessing is done to build the Transportation Problem Table (Table 3). The Column Minima method (which gives the least execution cost in a particular Cluster of resources from the set of Clusters) is used to produce an efficient scheduling plan, which schedules as many tasks as possible with the minimum total allocation cost. After scheduling, the Allocator generates a queue (execution sequence order) of the scheduled tasks in ascending order of EET and EWT for the resources available at each instant of time and allocates resources in that order. Resource availability is predicted periodically by the Resource predictor. Figure 1 shows the decentralized dynamic cloud scheduler.
With this system we can preserve and enhance reliability by considering only the available fault-free resources for the allocation of tasks, and we also use historical values for scheduling. Task execution failures caused by resources are thereby prevented, so reliability is preserved. Our system also considers the minimum cost and the maximum profit for the cloud providers.
Proposed System Architecture:
Our proposed system architecture is shown in Fig. 2. It consists of 7 phases that together produce the scheduling and allocation of tasks with reliability. The concepts behind each phase of our system are as follows.

Task initiator:
Our cloud environment uses decentralized scheduling and task allocation and is dynamic in nature, so clients can assign tasks to the cloud at any point in time. Tasks are assigned to the cloud in a Poisson arrival pattern, and tasks are independent of one another.
The task initiator has the following functions. This phase maintains the necessary condition for the linear programming transportation problem, Σ Sources = Σ Destinations, by receiving tasks from the various clients such that the sum of the resources required equals the Cluster resources readily available at that instant of time.
The task initiator uses the Flexible Quantum Time Slice (FQTS) as the buffer time for receiving tasks. By default it has a fixed time slice, say 5 s, for receiving clients' tasks. If clients assign more tasks and the inter-arrival time is short, say less than or equal to 200 ms, the time slice is extended for as long as the inter-arrival time remains within 200 ms. When the FQTS has elapsed and no further task arrives within 200 ms, the received tasks are transferred to the TP scheduler.
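The FQTS rule can be illustrated with a minimal Python sketch. It is one reading of the description above, not the authors' implementation: the function name `collect_batch`, the use of millisecond timestamps relative to the start of the window and the rule that an extension is measured from the previous accepted arrival are all assumptions made for this example.

```python
def collect_batch(arrivals_ms, base_slice_ms=5000, max_gap_ms=200):
    """Return the arrivals accepted into one FQTS receiving window.

    arrivals_ms: ascending arrival times in ms, relative to the window start.
    A task is accepted if it arrives inside the base slice, or (assumption)
    if it arrives within max_gap_ms of the previously accepted task.
    """
    batch = []
    for t in arrivals_ms:
        in_base_slice = t <= base_slice_ms
        in_extension = bool(batch) and t - batch[-1] <= max_gap_ms
        if in_base_slice or in_extension:
            batch.append(t)
        else:
            break  # window closes; the batch goes to the TP scheduler
    return batch


# Tasks at 5050 ms and 5200 ms extend the 5 s slice; 5900 ms arrives too late.
print(collect_batch([500, 4900, 5050, 5200, 5900, 6000]))
```

Tasks left out of a batch would simply be picked up by the next receiving window.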

Workload predictor:
This phase provides the necessary information about the tasks that arrive in the decentralized cloud environment for execution. This information includes (i) the Expected Execution Time (EET), which gives the average-case execution time for the task with the given input parameters, and (ii) the Expected Worst-case execution Time (EWT) for the task with the given input parameters. This phase also predicts an important attribute, (iii) the Resources-Required (RR) to complete the task efficiently.
We assume that the resources available in the Clusters of the cloud have the same capacity and the same capability, but that each Cluster has a different quantity of resources available at a given instant of time. To minimize the prediction time, the Workload Predictor calculates EET, EWT and Resources-Required only for tasks assigned to our cloud environment for the first time. If the task has been introduced before, then EET, EWT and RR are retrieved from the Task Info Container, which is kept up to date by the Log and Info Updater.
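The predict-once, reuse-afterwards behaviour described above amounts to a cache lookup. The following Python sketch shows the idea under stated assumptions: `TaskInfo`, `WorkloadPredictor` and the `estimate` callable are hypothetical names, and a plain dictionary stands in for the Task Info Container.

```python
from dataclasses import dataclass


@dataclass
class TaskInfo:
    eet: float  # Expected Execution Time
    ewt: float  # Expected Worst-case execution Time
    rr: int     # Resources-Required


class WorkloadPredictor:
    def __init__(self, estimate):
        self._estimate = estimate  # hypothetical prediction function
        self._container = {}       # stands in for the Task Info Container
        self.predictions = 0       # how often a prediction was computed

    def lookup(self, task_id):
        """Return stored values for a known task; predict only for new ones."""
        info = self._container.get(task_id)
        if info is None:
            info = self._estimate(task_id)
            self._container[task_id] = info
            self.predictions += 1
        return info


predictor = WorkloadPredictor(lambda task_id: TaskInfo(eet=10.0, ewt=15.0, rr=2))
predictor.lookup("T1")
predictor.lookup("T1")        # second call is served from the container
print(predictor.predictions)  # prediction ran once
```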

Task info container:
The Task Info Container is a storage unit with a controller that keeps all the historical information about tasks that have already been assigned. The historical information includes the Task id, EET, EWT and Resources-Required, which are predicted the first time by the workload predictor and can be reused for executions in the near future. Other historical data are the success percentage of a task within EET, called SSET, the success percentage of a task within EWT, called SSWT, and the execution cost of the task in each Cluster, represented by Cost_ij, where i stands for the Cluster and j for the Task.
When a task is assigned to the cloud, the Task Info Container provides all the historical information to the TP scheduler for scheduling. For a new task, this information is newly generated and stored in the Task Info Container.

Log and info updater:
This phase continuously records all activities within the cloud environment in the log and updates task information in the appropriate storage. The functions of this phase are:
• Provide the status of all Clusters and their resources, collected periodically from each Cluster
• Update the success and failure percentage of each task executed within EET and EWT, represented as SSET and SSWT
• Update the cost information of a task after each execution for the appropriate Cluster
• Help the Resource availability predictor by providing periodic status
• Update the details of unallocated tasks and the reasons, such as resource unavailability, resource failure during execution, task failure during execution and the task exceeding the EWT
• Provide all the above information to the TP scheduler on demand

TP task scheduler:
The objectives of the TP task scheduler are:
• To allocate as many tasks as possible within the resources available at an instant of time
• To maximize the reliability of the cloud environment and the profit for the cloud provider
• To minimize the execution cost within Clusters and the load on the resources of each Cluster
• To maximize the success percentage of tasks within either EET or EWT
The TP scheduler controls task scheduling based on the following constraints:
• Selection of the tasks that satisfy the necessary conditions of the TP at that instant of time
• A priority queue maintained for the tasks by priority value, given either by the client for the task or by the Service Level Agreement (SLA) with the client
• Scheduling of the task with the highest success percentage, either within EET or as the cumulative success percentage of EET and EWT (the percentage of successful task completion updated by the Log and Info Updater)
• The lowest cost for each task-Cluster combination (predicted from historical values)
With the above constraints, this phase formulates the LPP-based TP table, generates a schedule plan with the lowest execution cost and builds the allocation queue ordered from the smallest to the highest value of EET.

Task allocator:
This phase receives the schedule plan for the tasks from the TP task scheduler. The allocator builds the queue, or allocation order, of tasks for each Cluster in ascending order of EET. The task allocator allocates all tasks, or as many as possible, to the Cluster resources according to the plan generated by the scheduler. Sometimes the task allocator cannot allocate every task to the Cluster because resources are unavailable. Such tasks are kept in a separate queue and are allocated when resources are released by tasks already allocated. One additional queue is maintained for all new tasks that do not yet have the necessary information, such as SSET, SSWT and the cost for all Clusters; these are allocated to freely available resources in any Cluster in the near future.
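The allocation order and the separate queue for tasks that cannot yet be placed can be sketched as follows. This is an illustrative Python sketch, not the original implementation: the `allocate` function, the tuple layout of the plan and the example figures are assumptions, and the additional queue for new tasks without history is omitted.

```python
def allocate(plan, free_resources):
    """Allocate scheduled tasks in ascending order of EET.

    plan: list of (task_id, cluster, eet, resources_required) tuples.
    free_resources: dict mapping cluster -> currently free resources.
    Returns (allocated, pending); pending tasks wait for released resources.
    """
    remaining = dict(free_resources)  # do not mutate the caller's view
    allocated, pending = [], []
    for task_id, cluster, eet, rr in sorted(plan, key=lambda item: item[2]):
        if remaining.get(cluster, 0) >= rr:
            remaining[cluster] -= rr
            allocated.append((task_id, cluster))
        else:
            pending.append((task_id, cluster))
    return allocated, pending


plan = [("T1", "C1", 8, 2), ("T2", "C1", 3, 2), ("T3", "C1", 5, 2), ("T4", "C2", 4, 1)]
allocated, pending = allocate(plan, {"C1": 4, "C2": 1})
print(allocated)  # T2, T4 and T3 fit; shortest EET first
print(pending)    # T1 waits until C1 releases resources
```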

Resource predictor:
This phase periodically collects the status of each resource of the Cluster and keeps it up to date to help the task allocator, so that the task allocator can allocate the tasks in the order in which they appear in the queue. This phase also monitors and collects the working status of the resources available in the Cluster, which can be used to anticipate a resource failure before it occurs.

Mathematical formulation:
The mathematical formulation of cloud resource scheduling and allocation using the modified Transportation problem consists of present and historical values. The cloud reliable scheduling and allocation using Transportation problem (CRSATP) can be defined as:
[Table 1: Initial problem formulation table; task columns T1, T2, T3, T4, ...]
[Table 3: Actual transportation problem table; task columns T1, T2, T3, T4, ...]
Step 3: Formulate the actual transportation problem with the above data as in Table 3.
Step 5: Find the minimum EET/EWT and mark the column.
Step 6: Find the minimum cost score in the marked column and allocate the resources required by the task.
Step 7: Set the marked EET and the cost score ij to infinity.
Step 8: Determine the order of execution by taking the task with the maximum charge per unit of time first, and so on.
Step 9: Allocate all resources and update the task history after execution is over.
Step 10: If the required resources exceed the resources of all Clusters, eliminate the tasks that require the most resources.
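Steps 5 to 7 can be read as a greedy pass over the task columns. The Python sketch below is a simplified illustration under stated assumptions, not the authors' algorithm: the function `tp_schedule` and its data layout are hypothetical, columns are visited in ascending EET only, a Cluster is eligible only if it has enough free resources for the task, and the priority ordering of Step 8 is left out.

```python
import math


def tp_schedule(eet, rr, cost, capacity):
    """Greedy column-minima style pass (simplified reading of Steps 5-7).

    eet: {task: expected execution time}
    rr: {task: resources required}
    cost: {task: {cluster: execution cost}}
    capacity: {cluster: free resources}
    Returns (plan, unscheduled), where plan holds (task, cluster, cost).
    """
    free = dict(capacity)
    plan, unscheduled = [], []
    # Step 5: take the column (task) with the smallest EET first.
    for task in sorted(eet, key=eet.get):
        # Step 6: cheapest Cluster in that column that can hold the task.
        best, best_cost = None, math.inf
        for cluster, c in cost[task].items():
            if free[cluster] >= rr[task] and c < best_cost:
                best, best_cost = cluster, c
        if best is None:
            unscheduled.append(task)  # no Cluster has enough free resources
            continue
        free[best] -= rr[task]
        plan.append((task, best, best_cost))
        # Step 7 is implicit: a processed column is never visited again.
    return plan, unscheduled


plan, unscheduled = tp_schedule(
    eet={"T1": 5, "T2": 3, "T3": 9},
    rr={"T1": 2, "T2": 2, "T3": 3},
    cost={"T1": {"C1": 4, "C2": 6}, "T2": {"C1": 3, "C2": 7}, "T3": {"C1": 2, "C2": 5}},
    capacity={"C1": 4, "C2": 2},
)
print(plan)         # T2 then T1, both on C1
print(unscheduled)  # T3 needs 3 resources and no Cluster has them free
```

Because each column is processed exactly once, the pass finishes after a single sweep over the tasks, which matches the single-iteration behaviour reported for TP scheduling.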

MATERIALS AND METHODS
Genetic algorithm based task scheduling and resource allocation was implemented for comparison. The fitness function computes the total cost of task allocation in the Clusters, the population is taken as 100 tasks, and mutation changes the allocation vector by assigning one task to a Cluster with a random value. The crossover function generates a new combination in the task-to-Cluster allocation vector. The genetic algorithm was executed with 100 tasks, a mutation rate of 1% and a crossover rate of 96.5%. The genetic algorithm runs for a maximum of 400 iterations to produce a near-optimum value, in comparison with our algorithm, which generates the best optimum cost with the minimum number of resources. Two different experiments were conducted: one with a single set of 100 tasks and another with 10 different sets of 100 tasks each, to show that our system can efficiently allocate dynamic tasks. The genetic algorithm could not produce the optimum result in either experiment, and it also takes more execution time to produce its results. All the results were imported into MATLAB as an xpls file, and graphs 1 and 2 were generated (Fig. 3 and 4).
Next we compare our proposed TP scheduling system with a Global Optimization system implemented in Java using the total permutation and combination method. This system can handle only 10 tasks for 4 Clusters and produced the best optimum cost after executing 1,048,576 different iterations. Graph 3 (Fig. 5) shows the comparison of TP scheduling with Global Optimization for 10 tasks.
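The iteration count of the exhaustive method follows from the problem size: with 4 Clusters and 10 tasks there are 4^10 = 1,048,576 possible task-to-Cluster assignments. The Python sketch below illustrates such an exhaustive search on a tiny instance. It is an assumption-laden illustration rather than the Java system used in the experiments: the function name and the cost matrix are hypothetical, and resource capacity constraints are ignored.

```python
from itertools import product


def exhaustive_best(cost):
    """Try every task-to-cluster assignment; cost[t][c] is the cost of task t on cluster c.

    Returns (best_total, best_assignment, examined).
    """
    n_clusters = len(cost[0])
    best_total, best_assignment, examined = float("inf"), None, 0
    for assignment in product(range(n_clusters), repeat=len(cost)):
        examined += 1
        total = sum(cost[t][c] for t, c in enumerate(assignment))
        if total < best_total:
            best_total, best_assignment = total, assignment
    return best_total, best_assignment, examined


# Tiny instance: 3 tasks on 2 clusters means 2**3 assignments are examined.
print(exhaustive_best([[3, 1], [2, 5], [4, 4]]))
# The search space for 10 tasks on 4 clusters:
print(4 ** 10)
```

The exponential growth of the search space is why the exhaustive method was limited to 10 tasks.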

RESULTS
From graph 1 (Fig. 3) we see that the blue horizontal line (TP Scheduling system) produces the optimal allocation cost in the first iteration for the given 100 tasks, whereas the black line (genetic scheduling and allocation system) takes 400 iterations to converge to a near-optimal result. It also takes a long time to complete the task scheduling. From graph 2 (Fig. 4) we see that the black line (TP Scheduling) produces the optimal allocation cost for all 10 sets of 100 tasks each, whereas the red line (Genetic Scheduling) produces a near-optimal value for all sets with a long execution time.
From graph 3 (Fig. 5) we see that TP scheduling (red horizontal line) produces the optimal cost in the first iteration, whereas the Global Optimization system (red dotted line) produces the optimal cost at the 20,000th iteration (in this data set) and runs up to 1,048,576 iterations in total to complete.
From these results we conclude that the TP Scheduler produces better scheduling and resource allocation, minimizes the cost and maximizes the profit. The reliability objective is also preserved and enhanced.

DISCUSSION
Our proposed system is implemented as a simulation environment in core Java on a system with a Core 2 Duo T6600 2.20 GHz processor and 2 GB of RAM, with 5 clients and 4 Clusters. Each Cluster has a random number of resources, and 100 tasks are generated with random inter-arrival times. Some of the 100 tasks were generated with random priority, EET, EWT, SSET, SSWT and cost for the 4 Clusters. Others were generated as new tasks, so that the system generates their cost and success scores. The TP scheduler is then called for the scheduling and allocation of tasks. The system generates the allocation and maintains and updates the historical information in the Task Info Container. It produces a reliable allocation with the best optimum cost in the first iteration.
With the same simulated environment, our system is compared with two other systems, also developed in core Java:
(1) Genetic Scheduling Algorithm with the same 4 Clusters, 100 tasks and the necessary information.
(2) Global Optimization by the total permutation and combination method for 10 tasks, 4 Clusters and the necessary information.

CONCLUSION
Our proposed TP scheduling algorithm for task scheduling and resource allocation in a decentralized and dynamic cloud computing environment thus schedules and allocates tasks efficiently. The main objective of this algorithm, to enhance reliability and maximize profit by minimizing the allocation and execution cost and by minimizing the complexity of the cloud controller, is achieved.
Reliability is achieved in the following ways. First, the algorithm considers the actual availability of resources that are in good physical and logical condition and schedules tasks on that basis. Second, preference is given to the tasks that have been most successful according to historical values, and up-to-date cost values are used to find the minimal cost. Third, it maximizes the allocation of all assigned tasks as early as possible, so it serves almost all assigned tasks. The system also has a Task initiator, which removes the bottleneck problem by controlling the incoming flow of tasks. At present we have proposed the method for independent tasks, with Cluster resources of equal capability and assuming no advance reservation in task assignment. In future we plan to improve the reliability and availability of task scheduling and resource allocation under more complex constraints that are not considered now, such as resource specialization, critical resources, tasks that depend on a predecessor task, time-bounded prescheduled tasks and advance reservation.