A FUZZY BASED MECHANISM FOR ALLOCATION OF GRID RESOURCES

One of the main challenges in Grid computing is the efficient allocation of resources (CPU-hours, network bandwidth) to the tasks submitted by users. In our previous work, a technique to allocate resources in a grid environment using predicted data was proposed. Using the predicted data, the resources were classified into three types: permanent, semi-permanent and sporadic resources. These types of resources may remain available for a time that is either higher or lower than the dwelling time in a grid environment. Because the nature of the resources is not known to such a classification and allocation mechanism, its performance cannot be increased further. To avoid this problem, a prediction model and an allocation factor are introduced in this study. These parameters are determined for the sporadic and semi-permanent types of resources and are used in the fuzzy-based resource allocation mechanism. Incorporating these parameters in the resource allocation improves the resource utilization rate and the makespan, as can be observed from the simulation and comparative results. The results indicate that the proposed resource allocation mechanism performs well in a dynamic environment.


INTRODUCTION
Grid computing is a form of distributed computing that involves the coordination and sharing of computing, application, data storage or network resources across dynamic and geographically dispersed organizations (Rafee and Rahimzadeh, 2009; Richard et al., 2008; Vijaya et al., 2009; Abba et al., 2012). A Grid can be distinguished from conventional distributed computing by its focus on large-scale resource sharing, high performance and solving compute- and data-intensive applications. The Grid supports researchers and scientists from diverse organizations in sharing information, instruments, data and compute and storage resources dynamically in a flexible and secure manner (Vijaya et al., 2009; Puri and Dev, 2012). It is a reliable technology for making scheduling decisions that involve allocating jobs to resources over multiple administrative domains. Scheduling in a grid environment is usually viewed as a hierarchical problem with two levels. At the first level, called meta-scheduling, the grid environment selects the resources to be used by a job. At the second level, called local scheduling, a local scheduler schedules the jobs assigned to it (Naisipour et al., 2008; Christodoulopoulos et al., 2009). Whereas the pool of resources can be assumed fixed or stable in traditional parallel and distributed computing environments, in a Grid both the networks and the computational resources are dynamic. Scalability and adaptability are two important factors that must be taken into account in setting up a grid system. First, a network shared by many execution domains cannot provide guaranteed bandwidth. This is particularly true for Wide-Area Networks like the Internet. Second, both the availability and capability of computational resources exhibit dynamic behavior (Yien et al., 2011; Wankar, 2008).

JCS
The majority of the works in Grid computing develop middleware that offers basic functionality, for example, the capability to query for information regarding the resources and the capability to schedule jobs onto the resources. Dynamic competitive capabilities offer value over a long period of time and comprise competitive advantages obtained from the customs and policies of different organizations (Darby and Tzeng, 2010; Suwan et al., 2012a). The environment is fundamentally a multi-agent system with two significant agent types: resource agents and user agents. In the majority of cases, users can be regarded as individual agents that generate jobs and try to access resources for their execution, or as external resource brokers that map jobs on behalf of other individual users. A multi-agent architecture that addresses resource management and application execution with support for Quality of Service (QoS) in grid scheduling algorithms is presented by Keerthika and Kasthuri (2012) and Marowka (2000).
According to Boukerram and Azzou (2006), job behavior in the job waiting queue is an important factor for a scheduling algorithm. The data access cost is also aggregated with the job waiting queue in order to reduce the job turnaround time. The grid scheduler is responsible for receiving jobs from grid users, selecting feasible resources for those jobs according to the acquired information and finally generating job-to-resource mappings. The number and speed of the available processors, the system memory and the storage space are generally used to describe the resources. Users hold resource-sharing jobs that must be mapped to the respective resource providers by a resource allocation system (Latip et al., 2011; Odeh et al., 2009; Farooq et al., 2009).
Even though considerable attention has been paid to the resource allocation problem in Grid computing, only a small number of researchers have addressed the problem from the perspective of learning and adaptation (Senthilnathan and Purusothaman, 2012; Suwan et al., 2012b). At the same time, the Multi-Agent Systems (MAS) and distributed AI communities have shown that resource allocation problems can be solved effectively using groups of autonomous learning agents. Certain resource allocation problems involve the following: (1) identification of a proper service and the resources and (2) allocation of the resources based on specific conditions such as pricing or priority, with dynamic allocation and updating of the status of the resources (Kamalam and Bhaskaran, 2012).
In order to analyze the utilization achieved by grid resource allocation mechanisms, we reviewed recent related works and found that resource allocation mechanisms achieve good utilization when using fuzzy logic. However, because those works do not consider the dynamic nature of the grid or forecasted allocation, the utilization they achieve is not reliable. To address this issue, we proposed a new classification strategy along with a simple fuzzy-based resource allocation mechanism in the previous paper (Poonguzhali and Shanmugavel, 2011). In that paper, however, the resource allocation was based on historical data while the classification was derived from forecasted data. This was found to be a bottleneck for improving the performance of the grid resource allocation mechanism. Hence, in this study, we introduce a Prediction Model and an Allocation Factor, which can be calculated from standard distribution functions and which are incorporated in the fuzzy-based resource allocation mechanism (Poonguzhali and Shanmugavel, 2011). The resulting Prediction Model and Allocation Factor based Fuzzy grid resource allocation mechanism improves the resource utilization with minimum makespan.

Related Work
Though plenty of related works are available in the literature, a handful of significant, closely related works are reviewed here. Kamalam and Bhaskaran (2012) proposed a grid architecture as a collection of clusters with multiple worker nodes in each cluster, where the resources may join or leave the environment at any time and the jobs arrive at different intervals of time. In such a dynamic environment, an effective grid scheduling technique is needed to maximize the resource utilization and to minimize the makespan. They proposed a new scheduling algorithm, the Novel Adaptive Decentralized Job Scheduling Algorithm (NADJSA), which applies both Divisible Load Theory (DLT) and the Least Cost Method (LCM) and also considers the user demands. NADJSA was compared with the Decentralized Hybrid Job Scheduling Algorithm. They concluded that NADJSA minimizes the makespan, improves the resource utilization, satisfies the user demands and suits the grid environment well. Their experimental results indicated that optimization scheduling allocates the cheapest resources to ensure that the deadline can be met and computation is minimized.
Jiang et al. (2008) observed that scheduling batch-mode data-intensive jobs is a key issue in data-intensive Grid applications, where the focus is on minimizing the overhead of transferring the required data set to the executing grid site. Existing approaches pay attention to the access cost of a data-intensive job at each executing grid site for replicating the required data set. Their scheduling algorithm additionally considers the potential behaviors of jobs in the waiting queue at each grid site when the access cost is evaluated. The algorithm mainly examines the influence of potential behaviors on the access cost and proposes a data-intensive job scheduling algorithm with potential behaviors. Their paper shows that it has better performance in mean job time of all jobs, total number of replications, total number of local file accesses and effective network usage than the scheduling algorithm based on access cost.
Chapman et al. (2007) predicted the CPU resource utilization using their predictive grid scheduling framework, which follows Kalman filter theory. Their experimental results showed that they achieved 15-20% precision in their prediction. The subsequent observation of utilization also confirmed the enhancement of scheduling quality when compared with the other approaches.
Ramesh and Krishnan (2012) presented an optimal resource sharing algorithm in Grid computing. Resource sharing requires a more optimized algorithmic structure; otherwise the waiting time and response time increase and the resource utilization is reduced. To avoid such a reduction in the performance of the grid system, they introduced a utility function in an optimal resource sharing algorithm and proposed a hybrid algorithm for the optimization of load sharing. The hybrid algorithm contains two components, a Hash Table (HT) and a Distributed Hash Table (DHT). The algorithm mainly examines the relationship of optimal resource sharing between optimization tasks and load sharing of existing systems.
Chen and Lu (2008) introduced a utility function in a grid resource scheduling algorithm. The algorithm mainly examines the relationship between the execution time, the cost and the user utility function to solve the heterogeneity issue of user requirements in grid resource allocation. The algorithm achieved good performance and mitigated the drawbacks of time-based optimization algorithms and cost-based optimization algorithms.
Wankar (2008) reviewed the Open Grid Forum (OGF), an organization that resulted from the merger of the Global Grid Forum (GGF) and the Enterprise Grid Alliance (EGA). GGF was an international organization that started in 1999 with a focus on the development of open standards for grid software interoperability, common practices, agreements and other related issues, and it proposed several specifications with the help of several working groups. The grid project started with the aim of using high-end computational resources, networks, databases and scientific instruments owned and managed by multiple organizations. Globus was one of the most successful projects in grid computing to test these specifications. Although it overcame many technological barriers, many issues still remain open questions. The work discusses the grid and the Globus Toolkit, presents some of the technical challenges the grid community faces and provides future research directions in Grid computing.
Farooq et al. (2009) simulated a new middleware framework for Grids that achieves user satisfaction by providing QoS guarantees for Grid applications. Scalability, flexibility, quality of service provisioning, efficiency and robustness are the desired characteristics of most computing systems. Although the emerging Grid computing paradigm is scalable and flexible, achieving both efficiency and quality of service provisioning in Grids is a challenging task, yet it is necessary for the wide adoption of Grids. Grid middleware should also be robust to uncertainties such as those in user-estimated runtimes of Grid applications. They presented a complete middleware framework for Grids that achieves user satisfaction by providing QoS guarantees for Grid applications, cost effectiveness by efficiently utilizing resources and robustness by intelligently handling uncertain runtimes of applications. Finally, they validated the experimental results and demonstrated the resource utilization performance of the grid with a high success rate of jobs and a reduction in the total execution time of submitted jobs.

MATERIALS AND METHODS
The proposed technique relies mainly on the historical involvement of every grid resource and the dwelling time of the resources in the grid. As the reliability of the permanent and semi-permanent types of resources is high, the main concern is the sporadic type of resources. Sporadic resources are resources whose dwelling time cannot be predicted by any means. Hence, in this study their dwelling times are treated as random variables and an allocation factor is determined accordingly. The allocation factor is used in the enhanced version of the fuzzy-based grid resource allocation technique to allocate the grid resources to the submitted jobs with a remarkable level of allocation performance. Moreover, a prediction model is generated from the dwelling time of the sporadic resources in past time slots. A factor derived from the model, called the model factor, is also considered. Even though semi-permanent resources can be considered reliable, a shorter dwelling time may cause a semi-permanent resource to become sporadic in the future. To handle such circumstances, the prediction model and the allocation factor are also considered for the semi-permanent type of resources. This can be understood from the overall structure of the proposed technique, which is given in Fig. 1.
As before, the heart of the architecture is the Fuzzy Inference System, together with a collection of processing blocks. Two databases are involved in the architecture, namely the jobs database and the historical database. The jobs database holds the details of the jobs that are to be submitted to the grid, whereas the historical database holds the dwelling times of the resources that were available in previous time slots. The resource extractor extracts the individual resources that are required by the jobs and sends them to the subsequent blocks for allocation. The extractor holds the resource, its requirement duration, the job that requires the resource and the priority of the job. The buffer holds the resource and the corresponding details that are to be subjected to the allocation mechanism. Once the allocation is performed for the resource in the buffer, the buffer is cleared and then loaded with a new resource and its details.
The resource locator locates the required resource in the historical database and extracts the dwelling time of that resource in the previous time slots. The prediction model calculator generates a prediction model and calculates the model factor, whereas the distribution model mapper maps the distribution of the resources to the standard distribution functions and determines the allocation factor. The classified resource block contains the resources under three classes, namely permanent, semi-permanent and sporadic. The outputs of all these blocks, namely the dwelling time of the classified resources, the prediction model calculator, the distribution model mapper and the job buffer, are given to the Fuzzy Inference System, which decides whether or not the job can be allocated to the resource. The major contribution lies in predicting the resource information and determining the allocation factor using the random distribution functions. The availability of the grid resources is considered over the past N time slots, indexed by t, where N_R is the maximum number of resources available in the grid. In the proposed methodology, a runtime prediction model is generated for the sporadic and semi-permanent types of resources, even though a classification model is available for all types of resources. In parallel with the prediction model, an allocation factor is determined. Both the prediction model and the allocation factor are determined only when a resource is being processed or analyzed for allocation to the submitted jobs.

Determining the Prediction Model
When a resource is given, a prediction model is developed for the particular resource and the upcoming dwelling time is determined using the prediction model.
The prediction model mainly consists of two steps, namely, determining data variation factor φ and generating the model.

Determining Data Variation Factor φ
For every resource to be allocated, a data variation factor is determined based on the dwelling time of the particular resource in the past N time slots. The factor φ_j, where j refers to the resource ID, is determined mainly based on six criteria. The process of determining φ_j is described as a flowchart in Fig. 2.

Generation of Polynomial
Once φ_j is determined, a polynomial equation is generated, which is then used as the prediction model. The polynomial is constructed so that its degree equals φ_j and is given as Equation (1). Equation (1) is solved for its coefficients by substituting different past dwelling times, which yields the final solved equation. This is used as the prediction model to determine the dwelling times of the upcoming period. The same process is applied to the semi-permanent type of resources.
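The fitting step described above can be illustrated with a minimal Python sketch. It assumes a polynomial of the general form d(t) = a_0 + a_1 t + ... + a_φ t^φ in the time-slot index t, with 1-based slot indices, and it assumes that the most recent φ + 1 dwelling times are the values substituted to solve for the coefficients. The function names, the symbol names and the choice of slots are illustrative assumptions and are not taken from the original implementation, which was written in MATLAB.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix, so that row operations act on both sides at once.
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_prediction_model(past_dwelling_times, degree):
    """Return coefficients [a_0, ..., a_degree] of the assumed polynomial.

    Assumption: the polynomial passes through the most recent
    (degree + 1) dwelling times, using 1-based time-slot indices.
    """
    n_slots = len(past_dwelling_times)
    if not 0 <= degree < n_slots:
        raise ValueError("degree must be between 0 and N - 1")
    slots = list(range(n_slots - degree, n_slots + 1))
    values = past_dwelling_times[-(degree + 1):]
    vandermonde = [[float(s) ** p for p in range(degree + 1)] for s in slots]
    return solve_linear_system(vandermonde, values)


def predict_dwelling_time(coefficients, slot):
    """Evaluate the fitted polynomial at a (future) time-slot index."""
    return sum(c * slot ** p for p, c in enumerate(coefficients))


# Hypothetical history for one resource over N = 5 time slots.
history = [2.0, 4.0, 6.0, 8.0, 10.0]
model = fit_prediction_model(history, degree=1)
next_dwelling = predict_dwelling_time(model, slot=6)
```

Solving a square system through the chosen points mirrors the "substitute past dwelling times and solve" description; a least-squares fit over all N slots would be an alternative when the history is noisy.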

Determining the Allocation Factor
To determine the allocation factor α, the dwelling time of a particular resource is assumed to follow one of several distribution functions. Here, we consider three distribution functions that are very common in dealing with random variables, namely the normal distribution, the Poisson distribution and the uniform distribution. From the distribution functions, the allocation factor is determined as given in Equation (2). In Equation (2), ς_1, ς_2 and ς_3 are the normal, Poisson and uniform distribution function values for the dwelling times of the resource in the past time slots. Once α is determined, it is given to the fuzzy inference system along with φ, the priority of the job, the resource requirement time and the predicted dwelling time of the resource.
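As a rough illustration of how the three distribution function values could be obtained, the following Python sketch evaluates a normal density, a Poisson probability and a uniform density at a candidate dwelling time, with parameters estimated from the past time slots. The parameter estimates (sample mean, population standard deviation, observed minimum and maximum) and the function names are assumptions. The rule that combines the three values into α is not reproduced here, so the sketch takes it as a caller-supplied function.

```python
import math
from statistics import mean, pstdev


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def poisson_pmf(k, lam):
    """Probability of observing k events under a Poisson(lam) distribution."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def uniform_pdf(x, low, high):
    """Density of the uniform distribution on [low, high] at x."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def distribution_values(past_dwelling_times, x):
    """Return (s1, s2, s3): normal, Poisson and uniform values at x.

    Assumption: parameters are estimated from the past dwelling times.
    """
    mu = mean(past_dwelling_times)
    sigma = pstdev(past_dwelling_times)
    low, high = min(past_dwelling_times), max(past_dwelling_times)
    if sigma == 0 or mu <= 0:
        raise ValueError("history must vary and have a positive mean")
    k = round(x)  # the Poisson distribution is defined on integers
    if k < 0:
        raise ValueError("dwelling time must be non-negative")
    return (normal_pdf(x, mu, sigma), poisson_pmf(k, mu), uniform_pdf(x, low, high))


def allocation_factor(past_dwelling_times, x, combine):
    """Combine the three values with a caller-supplied rule (hypothetical)."""
    return combine(distribution_values(past_dwelling_times, x))


# Hypothetical dwelling times of one sporadic resource in N = 5 past slots.
history = [4, 5, 6, 5, 5]
s1, s2, s3 = distribution_values(history, 5)
alpha = allocation_factor(history, 5, combine=max)  # `max` is only a placeholder rule
```

Passing the combination rule in as a parameter keeps the sketch neutral about the actual form of Equation (2).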

Generation of Fuzzy Rules
In the previous work, only the priority status of the job, the requirement time of a particular resource and the predicted dwelling time of the resource available in a particular category were used as input variables in generating the fuzzy rules. In this study, as already mentioned, an allocation factor and model data obtained from the prediction model are also used in generating the fuzzy rules. However, the added parameters are applicable only to the sporadic and semi-permanent types of resources. To limit the complexity of generating fuzzy rules, only the fuzzy states MIN and MAX are utilized instead of MIN, MID and MAX. The fuzzy rules are given in Fig. 3. The generated fuzzy rules are given to the Fuzzy Inference System (FIS) for self-learning.
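To make the two-state (MIN/MAX) fuzzification concrete, the following Python sketch shows one possible way to fuzzify the inputs and score a candidate allocation. Everything in it is an assumption made for illustration: the linear membership functions, the min operator for rule firing, the max operator for aggregation and, in particular, the two example rules, which are hypothetical and are not the rules of Fig. 3.

```python
def memberships(value, low, high):
    """Degrees of membership of `value` in the MIN and MAX fuzzy states.

    Assumption: complementary linear memberships over [low, high].
    """
    if high <= low:
        raise ValueError("high must exceed low")
    v = min(max((value - low) / (high - low), 0.0), 1.0)
    return {"MIN": 1.0 - v, "MAX": v}


# Hypothetical rules that favour allocation; each maps an input name
# to the fuzzy state it must be in for the rule to fire.
ALLOCATE_RULES = [
    {"priority": "MAX", "predicted_dwelling": "MAX", "requirement": "MIN"},
    {"priority": "MAX", "predicted_dwelling": "MAX",
     "requirement": "MAX", "allocation_factor": "MAX"},
]


def fuzzy_score(fuzzified, rules=ALLOCATE_RULES):
    """Fire each rule with min, aggregate the rule strengths with max."""
    strengths = [
        min(fuzzified[name][state] for name, state in rule.items())
        for rule in rules
    ]
    return max(strengths)


def should_allocate(score, threshold):
    """Allocate when the fuzzy score reaches the chosen fuzzy threshold."""
    return score >= threshold


# Hypothetical inputs for one job/resource pair.
inputs = {
    "priority": memberships(8, 0, 10),
    "predicted_dwelling": memberships(6, 0, 10),
    "requirement": memberships(3, 0, 10),
    "allocation_factor": memberships(0.5, 0.0, 1.0),
}
score = fuzzy_score(inputs)
decision = should_allocate(score, threshold=0.5)
```

With only two states per input, each additional input doubles the number of possible rule antecedents instead of tripling it, which is one reading of why MID was dropped.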

Allocation Mechanism
The allocation mechanism performs resource allocation at runtime by checking the job priority, the required resource and its requirement time, the estimated dwelling time of the resource, the availability of the resource in every class, the allocation factor and the model factor φ. The allocation mechanism differs from the previous work only in its handling of resources of the sporadic and semi-permanent types. The flowchart given in Fig. 4 illustrates the proposed allocation mechanism.
By following the procedures given in the flowchart, the resources demanded by a job are allocated to the job based on its priority, their requirement period, the dwelling time of the required resource and the probability and prediction model factors. The rest of the procedures are similar to those followed in the previous paper (Poonguzhali and Shanmugavel, 2011).

RESULTS
The proposed resource allocation technique was implemented in MATLAB (version 7.10) on a system with an Intel(R) Core i5 CPU at 3.20 GHz and 3 GB of RAM. The performance of the technique was analyzed by executing it with different synthetic job datasets and fuzzy thresholds and by comparing it with existing techniques. We first describe the dataset and its generation, then analyze the results and finally compare the technique with existing resource allocation techniques using the performance measures utilization rate, failure rate and makespan (Hao et al., 2008; Foster et al., 2006; Poonguzhali and Shanmugavel, 2011).

Dataset Description
The main requirement for the resource allocation technique is the historical dataset. Here the historical dataset is simulated with N = 5 time slots. The historical dataset is assumed to be classified as permanent, semi-permanent and sporadic. The priority and the other dwelling time ranges are set as in the previous paper (Poonguzhali and Shanmugavel, 2011). The specification for the generated historical dataset is given in Table 1. A job dataset, called the input dataset, is generated and subjected to the mechanism for allocating the available resources.

In the job dataset, a defined number of job IDs are generated. For every job ID, a defined number of resources and the requirement period are generated.
In the proposed technique, we have used a prediction model and an allocation factor in addition to the previous technique. The prediction model for a sample of three resources is given in Fig. 5. Fuzzy thresholding is performed at two locations in the proposed technique. One is at the point of evaluating the fuzzy score for the sporadic resource type and the other is at the point of evaluating the fuzzy score for the semi-permanent resource type. Here, the threshold values (S_th-II and S_th-III) are varied from 0.3 to 0.8 and the corresponding performance metrics are observed. A quantitative analysis is made of the fuzzy threshold used in evaluating the fuzzy score and the corresponding utilization, failure rate and makespan. The analytical results are tabulated in Table 2.

The makespan values show irregular variation, whereas lowering the threshold increases the utilization rate and hence reduces the failure rate. The best threshold values can therefore be determined by considering either the maximum utilization rate or the makespan values. To make the selection convenient, we introduce a consolidated measure between the utilization rate and the makespan, which is given in the final column of the table and defined by Equation 4, where U, MS and MS_max are the utilization, makespan and maximum makespan values of the dataset, respectively. The fuzzy thresholds can also be selected using this formulation.

Comparative Analysis
To substantiate the proposed technique, it is compared with three successful conventional resource allocation techniques, namely GA-based, SA-based and PSO-based resource allocation, and also with our previous resource allocation technique.

DISCUSSION
The dataset is generated based on the specifications given in Table 1. For instance, the semi-permanent type resources are generated within the range specified for them in Table 1, with their dwelling times chosen arbitrarily from the corresponding dwelling time range. Likewise, the dataset has been generated for all the time slots and for the other resource types. Another dataset has been generated in a similar fashion for the current time slot.

We have selected three utilization rates, failure rates and makespan values obtained from the proposed methodology and compared them against the resource allocation methodologies mentioned above. The first utilization rate, called Utilization in terms of Minimum Makespan (UMM), is the utilization value obtained when the minimum makespan is achieved. The second utilization rate is the Maximum utilization rate (UMX) and the third utilization rate (UMC) is the utilization obtained when the maximum consolidated measure value is achieved. The three failure rates FMM, FMX and FMC are the failure rates corresponding to the selected utilization rates UMM, UMX and UMC, respectively. The first makespan value, denoted MMM, is the makespan value corresponding to UMM. The second makespan value, denoted MMX, is the makespan value obtained when the utilization reaches its maximum and MMC is the makespan value corresponding to the maximum consolidated measure. The values UMM, UMX, UMC, FMM, FMX, FMC, MMM, MMX and MMC are marked in bold in Table 2. In all the datasets, it can be observed that UMX and UMC, FMX and FMC and MMX and MMC are the same, respectively. This means that the utilization, rather than the makespan, has the greater impact on the consolidated score. The previously proposed technique, SFAM, fails to achieve the minimum makespan when compared to the GA-based allocation mechanism. This can be acceptable as utilization has a higher impact than makespan; however, the proposed method achieves improvement in both utilization and makespan values. This can be observed from Fig. 6, which illustrates the performance of the proposed method on all the submitted job datasets, and from Fig. 7, which illustrates the mean of the performance observed for the individual datasets.
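The selection of the three operating points described above can be sketched in Python as follows. The rows, the field names and the function names are hypothetical. The consolidated measure is passed in as a function because its formula is not reproduced here; the `placeholder_measure` used in the example is purely an assumption chosen so that the sketch runs, and it should not be read as Equation 4.

```python
from typing import Callable, Dict, List, NamedTuple


class Run(NamedTuple):
    """One row of a threshold sweep (hypothetical field names)."""
    threshold: float
    utilization: float
    failure_rate: float
    makespan: float


def select_operating_points(
    runs: List[Run],
    consolidated: Callable[[float, float, float], float],
) -> Dict[str, Run]:
    """Pick the rows behind (UMM, FMM, MMM), (UMX, FMX, MMX) and (UMC, FMC, MMC)."""
    ms_max = max(r.makespan for r in runs)
    return {
        # Row with the minimum makespan.
        "MM": min(runs, key=lambda r: r.makespan),
        # Row with the maximum utilization rate.
        "MX": max(runs, key=lambda r: r.utilization),
        # Row with the maximum consolidated measure.
        "MC": max(runs, key=lambda r: consolidated(r.utilization, r.makespan, ms_max)),
    }


def placeholder_measure(u: float, ms: float, ms_max: float) -> float:
    """Assumed stand-in: rewards high utilization and low relative makespan."""
    return 0.5 * (u + (1.0 - ms / ms_max))


# Hypothetical sweep over three fuzzy thresholds in the 0.3 to 0.8 range.
runs = [
    Run(threshold=0.3, utilization=0.90, failure_rate=0.10, makespan=120.0),
    Run(threshold=0.5, utilization=0.80, failure_rate=0.20, makespan=100.0),
    Run(threshold=0.8, utilization=0.60, failure_rate=0.40, makespan=110.0),
]
points = select_operating_points(runs, placeholder_measure)
umm, umx, umc = (points[k].utilization for k in ("MM", "MX", "MC"))
```

Each selected row carries its own utilization, failure rate and makespan, so the nine reported values follow directly from the three selected rows.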

CONCLUSION
The study proposed a Prediction Model and Allocation Factor based Fuzzy (PMAFF) grid resource allocation mechanism by incorporating two newly introduced parameters, the prediction model and the allocation factor, in the fuzzy resource allocation mechanism. The mechanism introduced a new classification scheme based on the dwelling time of the grid resources in the past time slots. Based on these values, it developed a prediction model to determine the dwelling time for the upcoming time slot. Moreover, it considered three standard distribution functions to determine an allocation factor. Using these details, a simple fuzzy-based resource allocation mechanism was devised to allocate the resources to the submitted job datasets. As the fuzzy threshold selection was found to be critical in achieving the expected utilization rate and makespan values, an analysis was made to make the threshold selection more convenient. With different selection parameters, the obtained utilization rate, failure rate and makespan values were compared against successful heuristic search resource allocation algorithms, namely the GA-based, SA-based and PSO-based resource allocation techniques. From the results, it was observed that the PMAFF resource allocation mechanism achieved better performance measures than these techniques. It also improved on the previously proposed simple fuzzy-based resource allocation technique.