Model for Load Balancing On Processors in Parallel Mining of Frequent Itemsets
Ravindra Patel, S. S. Rana and K. R. Pardasani
DOI: 10.3844/ajassp.2005.926.931
American Journal of Applied Sciences
Volume 2, Issue 5
Given the many large distributed transaction databases with large data schemas, a centralized approach to mining association rules in such databases is not feasible. Some distributed algorithms have been developed [FDM, CD], but none of them considers the problem of data skew in distributed mining of association rules. Skew in the datasets degrades the workload balance among the processors involved in distributed mining of association rules. It is therefore important to devise an efficient approach to distributed mining of association rules that can generate homogeneous partitions of the whole dataset, so that the supports of most large itemsets are distributed evenly across the processors. We propose an efficient stratified-sampling-based partitioning technique that generates homogeneous partitions, on which the processors work in parallel and generate their local concepts approximately simultaneously.
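As an illustration only, the idea of stratified partitioning can be sketched in Python. The abstract does not specify how strata are defined, so this sketch assumes, hypothetically, that transactions are stratified by their length; the function name `stratified_partitions` and its parameters are likewise illustrative and are not taken from the paper. Each stratum is shuffled and dealt round-robin to the partitions, so that every partition receives a near-equal share of every stratum.

```python
import random
from collections import defaultdict


def stratified_partitions(transactions, num_partitions, stratum_key=len, seed=0):
    """Split transactions into partitions with a similar mix of strata.

    Assumption (not from the paper): a stratum is the set of transactions
    sharing the same value of stratum_key, by default the transaction length.
    """
    rng = random.Random(seed)

    # Group the transactions into strata.
    strata = defaultdict(list)
    for transaction in transactions:
        strata[stratum_key(transaction)].append(transaction)

    partitions = [[] for _ in range(num_partitions)]
    next_slot = 0  # carried across strata so overall sizes stay balanced

    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)  # random order within the stratum
        for transaction in members:
            # Deal round-robin: each partition gets a near-equal share
            # of this stratum.
            partitions[next_slot % num_partitions].append(transaction)
            next_slot += 1

    return partitions


if __name__ == "__main__":
    demo = [("a",), ("b",), ("a", "b"), ("b", "c"), ("a", "b", "c"), ("b", "c", "d")]
    for i, part in enumerate(stratified_partitions(demo, 2)):
        print(i, part)
```

Under this assumption, each processor would mine one partition, and the partition sizes differ by at most one transaction, which is one simple way to keep the per-processor workload comparable.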
© 2005 Ravindra Patel, S. S. Rana and K. R. Pardasani. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.