TWO PHASE CLUSTERING METHOD FOR LEACH PROTOCOL FOR EFFECTIVE CLUSTER HEAD SELECTION

A Wireless Sensor Network (WSN) comprises a large number of clusters of sensor nodes that can communicate with one another and with the base station through wireless links. Each cluster is assigned a cluster head, which communicates directly with the base station and with other cluster heads. Efficient selection of the cluster head is therefore a significant factor in the performance of the wireless sensor network. In this study, a novel two-stage clustering protocol based on a Self Organizing Map (SOM) neural network and the Modified Fuzzy Possibilistic Clustering Algorithm (MFPCM) is proposed with the purpose of balancing energy consumption. The clustering is carried out based on two criteria: the energy level and the coordinates of the sensor nodes. Consequently, this two-stage clustering method can prevent premature death of the nodes and allows them to die in a random, dispersed manner. The experimental results show that the SOM-MFPCM based cluster head selection algorithm performs better than the cluster head election approach used in LEACH and achieves a longer network lifetime with fewer dead nodes.


INTRODUCTION
In recent years there has been growing interest in small, low-power hardware platforms that combine sensing, processing of data received from or to be sent to the environment, and wireless communication capability. Wireless sensor networks play a vital role in remote-area applications where human intervention is not possible. In a Wireless Sensor Network (WSN), every node is strictly constrained in both energy and bandwidth.
A wireless sensor network consists of various items of hardware and software. At the core of the system is the wireless sensor device. This device consists of the physical sensors, a microprocessor to analyze the raw data signal and generate the information message, a radio-frequency transmitter to deliver the data, and a power supply. The package is usually referred to as a node or "mote". A key aspect of a wireless sensor network is that the microprocessor on every mote can be programmed to ensure that all sensors in a given region work as a coherent system. While sensor nodes in a wireless sensor network are capable of exchanging information with other nodes, most applications involve the delivery of data from every sensor node to a central data collection point. That point is generally a computer that archives the data, and software is required to ensure that the data delivered from the wireless receiver is captured, displayed and stored in a usable manner. The third component of any successful wireless sensor network system involves retrieval of the information created by the sensor network. Software applications should be able to query the information generated by the sensor network in a logical manner (Jang et al., 2008).
Science Publications JCS
The sensors in a wireless sensor network are deployed arbitrarily within the region of interest or close to it. A distant, internet-equipped Base Station (BS) is employed to give instructions to all the sensors and to collect information from them. In addition to sensing, the wireless sensors can route the obtained information to the BS, forward instructions and communicate with each other (Lindsey et al., 2001; Enami et al., 2010).
A cluster-based approach is useful for environment monitoring. A WSN combines wireless communication with environmental perception. It is a special form of wireless ad hoc network, which can be constructed without any infrastructure. Energy-efficient routing algorithms are mainly divided into the following categories.
(1) Reducing the communication energy consumption by adopting a multihop transmission strategy.
(2) Balancing the network load by adopting cluster-based routing protocols and optimizing the location of the cluster head.
(3) Adopting a sleep and wake-up mechanism to avoid unnecessary energy consumption.
Because the energy of a node is very limited, maximizing the lifetime of the WSN plays an important role in the design of routing protocols. An efficient routing protocol is important for packet transmission and must also consider the network balance (Thangadurai and Dhanasekaran, 2013). Rajkamal and Ranjan (2012) proposed an aggregation technique that plays a major role in increasing network lifetime by reducing the amount of data and data transmission in the resource-limited (battery-powered) WSN. By exploring the impact of heterogeneity on data aggregation protocols, the energy consumption of the WSN radio is significantly reduced. Their simulation results, obtained by comparison with LEACH, HEED, CLUDDA and TEEN in terms of energy dissipation, show a promising improvement for the proposed scheme.
In this study, a novel centralized Energy Based Clustering protocol using Self Organizing Map neural networks (called EBCS) is presented, which can provide a uniform distribution of energy across all clusters, leading to a longer life of the nodes. The difference between our proposed protocol and previous ones is that it adaptively clusters the nodes based not only on their topological closeness (coordinates) but also on their energy levels in each set-up phase, using a Self Organizing Map (SOM) neural network. We extend the classic idea of topological clustering into a topology-energy based clustering method that uses SOM neural networks to apply two unrelated variables (energy and distance) in clustering, reduce the dimensions of the dataset and visualize it as a map. Furthermore, by applying a second clustering phase with K-means, we aim to regroup and optimize the clusters and reduce the computation time of the algorithm, as shown in (Tong and Tang, 2010). Simulation results show that our new protocol can extend the network lifetime in terms of the time until the first node dies.
Chandramathi et al. (2007) describe a WSN as a collection of autonomous devices with computational, sensing and wireless communication capabilities. The sensor nodes are low-cost, multi-functional devices that are densely deployed either inside the phenomenon or very close to it and are often powered by independent power sources. The lifetime of such sensor nodes depends strongly on the lifetime of the power source used, so conserving power, and hence increasing the lifetime, becomes an important issue in WSNs. Routing, which consumes a considerable amount of energy, offers ample scope for increasing the lifetime of WSNs. To address this problem, an Energy Aware Optimal Routing (EAOR) algorithm is proposed, which determines the path that consumes minimum power between the source node and the destination node.
The EAOR algorithm performs better in terms of power conservation than the shortest path routing algorithm. It is also validated through extensive simulation in various scenarios, such as increasing the number of sensor nodes and gateway nodes. The results show that the EAOR algorithm performs well in all the scenarios.

Related Works
As an effective way to control the network topology, a clustering algorithm can significantly reduce the energy consumption of wireless sensor networks and improve network throughput. Building on the framework of clustering algorithms for wireless sensor networks, this study presents a weighted-average cluster head selection algorithm based on a BP neural network, which makes node weights directly related to the decision-making predictions. The weight distribution of the nodes is objective. The simulation results show the efficiency of the algorithm in eliminating data redundancy, reducing network traffic and extending the network lifetime.
This study presents a new cluster-based routing protocol for wireless sensor networks called Energy Based Clustering Self organizing map (EBCS). This protocol can cluster sensor nodes based on multiple parameters, namely their energy level and coordinates, by using the capability of the self organizing map neural network for multi-dimensional data clustering. Energy based clustering can form energy-balanced clusters in order to better balance energy consumption across the whole network, which prolongs network lifetime while ensuring more network coverage through random and dispersed dying of sensor nodes throughout the network space. Simulation results confirm the advantages of EBCS over two similar protocols, LEACH and LEA2C, in terms of postponing the first node death and preserving more network coverage. A new cost function for electing the best Cluster Heads based on multiple criteria is also presented, in which the weights of the criteria are determined using an analytic hierarchy process.
Prommak and Modhirun (2012) studied optimal network design for efficient energy utilization in continuous data-gathering WSNs. The problem of minimizing the network cost through the minimum number of relay-station installations is examined first, followed by the problem of minimizing the energy consumption of the sensor nodes. The network design problem is modeled as an integer linear program. The key contribution is that the proposed models not only guarantee the network lifetime but also ensure radio communication between the energy-limited sensor nodes, so that the network can guarantee packet delivery from the sensor nodes to the base station. Numerical experiments were conducted to evaluate and demonstrate the effectiveness of the proposed methods in various network scenarios.
Siham et al. (2013) presented the passive clustering mechanism and the clustering protocols proposed for wireless sensor networks; they introduce a new protocol based on APC-T for mobile nodes in wireless sensor networks. This mechanism provides cluster stability after each departure of a cluster head and allows balanced energy consumption among the sensor nodes. Comparison with existing schemes such as APC-T and Geographically Repulsive Insomnious Distributed Sensors (GRIDS) shows that the mechanism for selecting a backup cluster head node, which is the most important factor influencing the clustering performance, can significantly improve the network lifetime.
In a standard WSN, most routing techniques move data from multiple sources to a single fixed base station. Because of the large number of computational tasks, the existing routing protocols do not address the energy efficiency problem properly. To overcome the energy consumption caused by these computational tasks, a new method was developed. This algorithm divides the sensing field into three active clusters and one sleeping cluster. The cluster head selection is based on the distance between the base station and the normal nodes. The Time Division Multiple Access (TDMA) mechanism is used to keep a cluster in the active state or the sleeping state. In an active cluster, 50% of the nodes are made active and the remaining 50% are in the sleep state. A sleeping cluster is made active after a period of time, and the clusters periodically change roles. Due to this periodic change of state, energy consumption is minimized. The performance of the LEACH algorithm is also analyzed using the network simulator NS2, based on the number of Cluster Heads (CH), energy consumption, lifetime and the number of nodes alive.

Algorithm Assumptions
Our proposed algorithm is a centralized cluster-based protocol strongly related to the LEACH-Centralized (LEACH-C) (Guo et al., 2010) and LEA2C (Enami et al., 2010) protocols. The operation of the algorithm is divided into rounds in a similar way to LEACH-C. Each round begins with a cluster setup phase, in which cluster organization takes place, followed by a data transmission phase, in which data from the ordinary nodes are transferred to the cluster heads. Each cluster head aggregates (fuses) the data received from the other nodes within its cluster and relays the packet to the base station. In every cluster setup phase, the Base Station (BS) has to cluster the nodes and assign appropriate roles to them. The BS also creates a Time Division Multiple Access (TDMA) table for each cluster and assigns this table to the CHs. The TDMA table schedules the data transmission of the sensor nodes and allows them to turn off their antennas before and after their time slot to save energy. The energy consumption for sending control packets is therefore assumed to be incurred only by the BS. We assume that the BS has no constraint on its energy resources and has total knowledge of the energy level and position of all nodes in the network (most probably by using a GPS receiver in each node). The sensor nodes are assumed to be homogeneous (they have the same processing and communication capabilities and the same energy level at the start of the algorithm).

Cluster Setup Phase
The protocol uses a two-phase clustering method, SOM followed by the MFPCM algorithm. This two-stage approach was proposed in (Tong and Tang, 2010), which showed its advantage over direct clustering of the data by MFPCM in terms of computation time.
The variables that we consider as the SOM input dataset are the x and y coordinates of every node in the network space and their energy levels, so we have a data matrix $D$ with n×3 dimensions. Since we are applying two different types of variables, we first have to normalize all values. We used the Min-Max normalization method (Heinzelman et al., 2000), in which $\min_a$ and $\max_a$ are the minimum and maximum values of attribute $a$. Min-max normalization maps a value $v$ to a value $v'$ in the range [0, 1] by computing Equation (1):

$$v' = \frac{v - \min_a}{\max_a - \min_a} \qquad (1)$$

By means of the above equation, the dataset matrix is given by Equation (2), where $D$ is the data sample matrix (the input vectors of the SOM), $XD = (xd_1, \ldots, xd_n)$ are the x coordinates, $YD = (yd_1, \ldots, yd_n)$ are the y coordinates, $E = (E_1, \ldots, E_n)$ are the energy levels (remaining energy) of all sensor nodes of the network, $xd_{max}$ is the maximum value of the x coordinate of the network space, $yd_{max}$ is the maximum value of the y coordinate of the network space and $E_{max}$ is the remaining energy of the maximum-energy node of the network (at the beginning it is equal to $E_{initial}$).

In order to determine the weight matrix, the Base Station has to select the m nodes with the highest energy in the network. At the beginning, the nodes have equal energy levels according to our assumptions, so we can partition the network space into m regions and select the node nearest to the center of every region. However, because of the two-stage SOM-MFPCM method, we usually need a rather large value of m, especially in large WSNs. In this case we can choose these m nodes randomly. We need three variables, so our weight matrix is given by Equation (3), where $W$ is the weight matrix of the SOM, $XD = (xd_1, \ldots, xd_m)$ are the x coordinates, $YD = (yd_1, \ldots, yd_m)$ are the y coordinates and $(1 - E_1/E_{max}, \ldots, 1 - E_m/E_{max})$ are the consumed energies of the m selected maximum-energy sensor nodes. As Equation (3) shows, we have a 3×m weight matrix, so we also have m map units (clusters). Finally, the SOM topology structure is as shown in Fig. 1 and 2 (number of dead nodes over a certain time).
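The normalization step described above can be sketched as follows. This is a minimal illustration, assuming Equation (1) is applied column by column to the coordinates and energies; the function names `min_max` and `build_som_input` and the sample node values are hypothetical, and the exact form of the paper's Equation (2) may normalize differently.

```python
def min_max(values):
    """Map each value v to (v - min) / (max - min), i.e. into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant attribute: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def build_som_input(nodes):
    """nodes: list of (x, y, energy) tuples -> n x 3 list of normalized rows."""
    xs, ys, es = zip(*nodes)
    return [list(row) for row in zip(min_max(xs), min_max(ys), min_max(es))]


# Hypothetical sample: three nodes in a 300 x 300 m field, energies in joules.
sample_nodes = [(0.0, 300.0, 0.5), (150.0, 150.0, 0.25), (300.0, 0.0, 0.0)]
D = build_som_input(sample_nodes)
print(D)
```

Each row of `D` is one SOM input vector, so coordinates and energy contribute on the same scale regardless of their physical units.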
In our application, learning is done by minimizing the Euclidean distance between the input samples and the map prototypes, weighted by a neighborhood function $h_{i,j}$. The criterion to be minimized is defined as in (Chandramathi et al., 2007) by Equation (4), where $N$ is the number of data samples, $M$ is the number of map units, $N(X^{(k)})$ is the neuron having the closest referent to data sample $X^{(k)}$ and $h$ is the Gaussian neighborhood function defined by Equation (5), where $\|r_j - r_i\|^2$ is the distance between map unit $j$ and input sample $i$ and $\sigma_t$ is the neighborhood radius at time $t$, which is defined by Equation (6), where $t$ is the iteration number and $T$ is the maximum number of iterations, or training length.

The distances between $X^{(k)}$ and the weight vectors of all map neurons are computed. The neuron $N(X^{(k)})$ with the minimum distance to the input sample $X^{(k)}$ wins the competition phase, Equation (7). The neighborhood radius is large at the beginning and is reduced in every iteration as the algorithm proceeds. After the competition phase, the SOM updates the weight vector of the winner $N(X^{(k)})$ and of all its neighbors located within the neighborhood radius $R(N(X^{(k)}))$. If $W_{.j} \in R(N(X^{(k)}))$, the weight vector is updated by Equation (8); otherwise it is left unchanged, Equation (9):

$$W_{.j}(t+1) = W_{.j}(t) \qquad (9)$$

Here $h_{j,N(X^{(k)})}(t)$ is the neighborhood function at time $t$ and $\alpha(t)$ is the linear learning factor at time $t$ defined by Equation (10), where $\alpha_0$ is the initial learning rate, $t$ is the iteration number and $T$ is the maximum training length. The learning phase repeats until the weight vectors stabilize (no more change). The output of the SOM is then given as input to the second clustering stage.
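The competition and update phases can be illustrated with a single training step. The sketch below is not the paper's exact procedure: the linear decay schedules for the learning factor and the neighborhood radius, the default values `alpha0` and `sigma0`, and the names `best_matching_unit` and `som_step` are assumptions standing in for Equations (5), (6), (8) and (10), which are not reproduced in the text.

```python
import math


def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector is closest to sample x."""
    dists = [sum((w - v) ** 2 for w, v in zip(unit, x)) for unit in weights]
    return dists.index(min(dists))


def som_step(weights, grid, x, t, T, alpha0=0.5, sigma0=1.0):
    """One competition + update step; updates weights in place, returns winner.

    weights: list of m weight vectors; grid: list of m map positions r_j.
    The decay schedules are assumed, not taken from the source equations.
    """
    alpha = alpha0 * (1.0 - t / T)               # linearly decaying learning factor
    sigma = max(sigma0 * (1.0 - t / T), 1e-9)    # shrinking neighborhood radius
    win = best_matching_unit(weights, x)
    for j, unit in enumerate(weights):
        d2 = sum((a - b) ** 2 for a, b in zip(grid[j], grid[win]))
        if d2 > sigma ** 2:
            continue                             # outside the radius: unchanged
        h = math.exp(-d2 / (2.0 * sigma ** 2))   # Gaussian neighborhood function
        weights[j] = [w + alpha * h * (v - w) for w, v in zip(unit, x)]
    return win


# Hypothetical 1 x 3 map with three-dimensional weights (x, y, energy).
som_weights = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.5, 0.5, 0.5]]
som_grid = [(0, 0), (0, 1), (0, 2)]
winner = som_step(som_weights, som_grid, [0.9, 0.9, 0.9], t=0, T=10)
```

Repeating `som_step` over all samples for `t = 0 .. T-1` gives a basic training loop; the winner moves most strongly toward the sample and its grid neighbors follow with a smaller step.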

Fuzzy Possibilistic Clustering Algorithm
Fuzzy C-Means (FCM) is the fuzzified version of the k-means approach. FCM is a clustering approach that lets one node belong to two or more clusters. It was proposed by Dunn in 1973 and enhanced by Bezdek (1981). It is an iterative clustering technique that produces an optimal c-partition by minimizing the weighted within-group sum of squared error objective function $J_{FCM}$, Equation (11). In that equation, $X = \{x_1, x_2, \ldots, x_n\} \subseteq R^p$ is the set of nodes in the p-dimensional vector space, $n$ is the number of nodes and $c$ is the number of clusters, with $2 \le c \le n-1$. $V = \{v_1, v_2, \ldots, v_c\}$ are the c centers or prototypes of the respective clusters, $v_i$ is the p-dimensional center of cluster $i$ and $d^2(x_j, v_i)$ is a Euclidean distance measure between object $x_j$ and cluster center $v_i$. $U = \{\mu_{ij}\}$ is a fuzzy partition matrix, where $\mu_{ij} = \mu_i(x_j)$ is the degree of membership of $x_j$ in the i-th cluster and $x_j$ is the j-th p-dimensional measured data point. The fuzzy partition matrix satisfies Equations (12) and (13). The parameter $m$ is a weighting exponent on each fuzzy membership that determines the amount of fuzziness of the resulting classification; it is a predefined number greater than one. The objective function $J_{FCM}$ can be minimized subject to the constraints on $U$. Specifically, taking the derivatives of $J_{FCM}$ with respect to $\mu_{ij}$ and $v_i$ and setting them to zero gives the necessary but not sufficient conditions for $J_{FCM}$ to be at a local extremum, Equations (14) and (15). In a wireless environment, the memberships of FCM do not always correspond well to the degree of belonging of the data and may be inaccurate. This is primarily because real data inevitably involves some noise.
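For illustration, one FCM iteration can be sketched with the standard membership and center updates from the FCM literature, which are assumed here to be what Equations (14) and (15) express. The function names `fcm_memberships` and `fcm_centers` and the one-dimensional sample data are hypothetical.

```python
def fcm_memberships(points, centers, m=2.0):
    """u[i][j] = 1 / sum_k (d_ij / d_kj)^(2/(m-1)); each column sums to 1."""
    c, n = len(centers), len(points)
    u = [[0.0] * n for _ in range(c)]
    for j, x in enumerate(points):
        d = [sum((a - b) ** 2 for a, b in zip(x, v)) ** 0.5 for v in centers]
        if 0.0 in d:
            # Point coincides with a center: give it full membership there.
            u[d.index(0.0)][j] = 1.0
            continue
        for i in range(c):
            u[i][j] = 1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0))
                                for k in range(c))
    return u


def fcm_centers(points, u, m=2.0):
    """v_i = sum_j u_ij^m x_j / sum_j u_ij^m for every cluster i."""
    dim = len(points[0])
    centers = []
    for row in u:
        w = [uij ** m for uij in row]
        total = sum(w)
        centers.append([sum(wj * x[k] for wj, x in zip(w, points)) / total
                        for k in range(dim)])
    return centers


# Hypothetical data: two well-separated groups on a line.
fcm_points = [[0.0], [1.0], [9.0], [10.0]]
fcm_u = fcm_memberships(fcm_points, [[0.5], [9.5]])
fcm_v = fcm_centers(fcm_points, fcm_u)
```

Alternating the two functions until the centers stop moving gives the usual FCM loop; the fuzziness exponent `m` must be greater than one, as stated in the text.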
To overcome this limitation of FCM, the constraint condition (1) of the fuzzy c-partition is relaxed to obtain a possibilistic type of membership function, and Possibilistic C-Means (PCM) for unsupervised clustering is developed. Each cluster created by PCM corresponds to a dense region in the data set; every cluster is independent of the other clusters in the PCM strategy. The objective function of PCM is given by Equation (16), where $\eta_i$, Equation (17), is the scale parameter of the i-th cluster and $u_{ij}$, Equation (18), is the possibilistic typicality value of sample $x_j$ with respect to cluster $i$. $m \in [1, \infty)$ is a weighting factor called the possibilistic parameter. Like other clustering techniques, PCM depends on initialization. In PCM approaches the clusters do not have much mobility, as each data point is classified with respect to only one cluster at a time rather than all the clusters simultaneously. As a result, an appropriate initialization is essential for the algorithm to converge to a nearly global minimum.
FPCM combines the characteristics of both the fuzzy and the possibilistic c-means algorithms. Memberships and typicalities are both important parameters for correctly characterizing the data substructure in a clustering problem. As a result, an objective function in FPCM based on both memberships and typicalities can be written as Equation (19), with accompanying constraints on the memberships and typicalities. PFCM constructs memberships and possibilities simultaneously, along with the usual point prototypes or cluster centers for each cluster. PFCM is a hybridization of Possibilistic C-Means (PCM) and Fuzzy C-Means (FCM) that often avoids various problems of PCM, FCM and FPCM. PFCM solves the noise sensitivity defect of FCM and overcomes the coincident clusters problem of PCM. However, the estimation of the centroids is still influenced by noisy data.
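The role of the scale parameter and the typicality value in PCM can be sketched with the formulas commonly given in the PCM literature, assumed here to correspond to Equations (17) and (18). The names `pcm_typicality` and `pcm_scale`, the constant `k` and the sample values are hypothetical; the sketch requires $m > 1$.

```python
def pcm_typicality(d2, eta, m=2.0):
    """Typicality of a point at squared distance d2 from a cluster of scale eta.

    Assumed form: 1 / (1 + (d2 / eta)^(1/(m-1))), valid for m > 1.
    """
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))


def pcm_scale(memberships, sq_dists, m=2.0, k=1.0):
    """Scale eta_i = k * sum_j u_ij^m d_ij^2 / sum_j u_ij^m for one cluster."""
    weights = [u ** m for u in memberships]
    return k * sum(w * d for w, d in zip(weights, sq_dists)) / sum(weights)


# Hypothetical cluster with two fully typical points at squared distances 2 and 4.
eta_demo = pcm_scale([1.0, 1.0], [2.0, 4.0])
t_near = pcm_typicality(0.0, eta_demo)       # point at the center
t_scale = pcm_typicality(eta_demo, eta_demo)  # point at the scale distance
```

Unlike FCM memberships, these typicalities are computed per cluster and do not have to sum to one across clusters, which is why noise points can receive low values everywhere.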

Modified Fuzzy Possibilistic C-Means Technique (MFPCM)
The choice of objective function is essential to the quality of the clustering results. Huang et al. (2008) presented an approach called Modified Suppressed Fuzzy C-Means (MS-FCM), which significantly improves the performance of FCM through prototype-driven learning of the parameter α (Saad and Alimi, 2009). The learning process of α is based on an exponential separation strength between clusters, and α is updated at each iteration. The parameter α is computed by Equation (25), in which β is a normalizing term chosen as the sample variance. A drawback that must be pointed out is that a common value of this parameter is used for all the data at each iteration, which may induce errors. A new parameter is therefore introduced that replaces this common value of α with a weight for each vector, so that every point of the data set has a weight in relation to every cluster. This weight permits a better classification, especially in the case of noisy data. The weight is calculated by Equation (26), in which $w_{ji}$ is the weight of point $j$ in relation to class $i$. This weight is used to modify the fuzzy and typical partitions. The objective function is composed of two expressions: the first is the fuzzy function and uses a fuzziness weighting exponent; the second is the possibilistic function and uses a typicality weighting exponent. In the original formulation these two coefficients are used only as exponents of the membership and the typicality. A slightly different relation is used here, which enables a more rapid decrease of the function, increasing the membership and the typicality when they tend toward 1 and decreasing them when they tend toward 0. This relation adds the weighting exponents as exponents of the distance in the two sub-objective functions.
The objective function of the MFPCM is given by Equation (27). $U = \{\mu_{ij}\}$ represents the fuzzy partition matrix, defined by Equation (28); $T = \{t_{ij}\}$ represents the typical partition matrix, defined by Equation (29); and $V = \{v_i\}$ represents the c cluster centers, defined by Equation (30).

Cluster Head Selection Phase
Different criteria can be considered for selecting a CH in a formed cluster. In Enami et al. (2010), three criteria have been considered for CH selection: the sensor having the maximum energy level, the sensor nearest to the BS and the sensor nearest to the gravity center (centroid) of the cluster (the latter two criteria are used to minimize inter- and intra-cluster communication).
To incorporate the above criteria together with the number of times a node has already served as cluster head, we propose a cost function that is computed for every member node of the formed clusters, Equation (31), where $E_0$ is the initial energy of the nodes, $E_i$ is the remaining energy of the i-th node, $DtoBS_i$ is the distance of the i-th node to the BS, $DtoBS_{min}$ is the distance of the nearest node of the cluster to the BS, $DtoBS_{max}$ is the distance of the furthest node of the cluster to the BS, $DtoC_i$ is the distance of node $i$ to the centroid of its cluster, $DtoC_{min}$ is the distance of the nearest node to the centroid of the cluster, $DtoC_{max}$ is the distance of the furthest node to the centroid of the cluster, $CHfreq_i$ is the number of times that node $i$ has been selected as cluster head, $CHfreq_{max}$ is the maximum number of times a node has been selected as CH in the related cluster, $CHfreq_{min}$ is the minimum number of times a node has been selected as CH in the related cluster, and α, β, λ and ω are coefficients that can be determined experimentally according to their importance in our decision; their sum must equal one.
To determine the normalized weights (coefficients) of the above criteria, we applied the Analytic Hierarchy Process (AHP) method developed by Saaty (1977) and Xiangning and Yulin (2007), which transforms pairwise comparisons into weights of the different attributes (here, our intended criteria). The resulting coefficient (weight) values are $w_t$ = [0.5619, 0.0774, 0.0460, 0.3150], that is, α = 0.562, β = 0.077, λ = 0.046 and ω = 0.315. The node with the minimum cost is selected as the cluster head of the current round. After each data transmission phase, the next cluster head will be a different node (cluster head rotation). After determining the cluster head nodes, the BS assigns appropriate roles to all nodes by sending messages containing the ID of the related cluster head, as in LEACH-C (Guo et al., 2010).
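The cost-based selection can be sketched as below. Since the cost equation itself is not reproduced in the text, the form used here is an assumption inferred from the symbol definitions: a weighted sum of consumed energy relative to $E_0$ and min-max normalized BS distance, centroid distance and CH frequency, with α, β, λ and ω applied in that order. The names `ch_costs` and `select_cluster_head`, the dictionary keys and the sample cluster are hypothetical.

```python
def _norm(value, lo, hi):
    """Min-max normalize value within [lo, hi]; 0 if the range is empty."""
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


def ch_costs(cluster, e0, weights=(0.562, 0.077, 0.046, 0.315)):
    """Cost of every node in one cluster (lower is better).

    cluster: list of dicts with keys energy, d_bs, d_c, ch_freq.
    The mapping of weights to terms is an assumption, not the source equation.
    """
    alpha, beta, lam, omega = weights
    bs = [n["d_bs"] for n in cluster]
    dc = [n["d_c"] for n in cluster]
    fr = [n["ch_freq"] for n in cluster]
    return [alpha * (e0 - n["energy"]) / e0
            + beta * _norm(n["d_bs"], min(bs), max(bs))
            + lam * _norm(n["d_c"], min(dc), max(dc))
            + omega * _norm(n["ch_freq"], min(fr), max(fr))
            for n in cluster]


def select_cluster_head(cluster, e0):
    """Index of the minimum-cost node in the cluster."""
    costs = ch_costs(cluster, e0)
    return costs.index(min(costs))


demo_cluster = [
    {"energy": 0.5, "d_bs": 100.0, "d_c": 10.0, "ch_freq": 0},
    {"energy": 0.2, "d_bs": 50.0, "d_c": 5.0, "ch_freq": 2},
    {"energy": 0.4, "d_bs": 75.0, "d_c": 20.0, "ch_freq": 1},
]
demo_costs = ch_costs(demo_cluster, e0=0.5)
demo_head = select_cluster_head(demo_cluster, e0=0.5)
```

With the AHP weights from the text, remaining energy and CH frequency dominate the decision, so in this sample the full-energy node that has never served as CH is chosen even though it is the furthest from the BS.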

RESULTS
The experiment is carried out in a 300×300 m² region with 100 sensors deployed arbitrarily inside that region. It is assumed that the sensors are deployed in clusters and that inter-cluster communication occurs only through the cluster heads of the respective clusters. The base station is located at (150, 150). A node is regarded as dead if its energy level is 0 and can be put into "sleep" mode if less than 10% of its initial energy remains. The performance of the LEACH algorithm is compared with the proposed approach in terms of sensing coverage over time and the number of dead nodes in the system over time, using simulation in MATLAB. The distance $d$ between the sensors within a cluster is computed as the Euclidean distance. The energy used by each sensor node to transmit $q$ bits of data over a distance $d$ is given by Equation (32) and the energy used by each node to collect data is given by Equation (33), where $\epsilon_{fs}$ is the transmitter constant, which depends on the type of transmitter used, and $E_{elec}$ is the electronics energy. In these experiments, each node starts with an initial energy of 0.5 joule and an unrestricted amount of data can be sent to the base station via the cluster head. Depending on its remaining energy level, a node may move to the sleep state before reaching the dead state, and such a node still remains part of the network.
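The energy bookkeeping of such a simulation can be sketched as follows. The formulas assume the first-order radio model widely used in LEACH-style simulations (free-space amplifier term proportional to $d^2$) as the form of Equations (32) and (33); the default values of `e_elec` and `eps_fs` are typical literature values, not the entries of Table 1, and the function names are hypothetical.

```python
def tx_energy(q_bits, d, e_elec=50e-9, eps_fs=10e-12):
    """Energy (J) to transmit q_bits over distance d (m), free-space model."""
    return q_bits * e_elec + q_bits * eps_fs * d ** 2


def rx_energy(q_bits, e_elec=50e-9):
    """Energy (J) to receive q_bits."""
    return q_bits * e_elec


def node_state(energy, e_init=0.5):
    """Classify a node as described in the text: dead, sleep or active."""
    if energy <= 0.0:
        return "dead"
    if energy < 0.1 * e_init:
        return "sleep"
    return "active"


# Hypothetical packet: 4000 bits sent over 100 m.
e_tx = tx_energy(4000, 100.0)
e_rx = rx_energy(4000)
```

In each round, a simulation would subtract `tx_energy` from every transmitting node and `rx_energy` from the receiving cluster head, then use `node_state` to count dead and sleeping nodes.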
Table 1 provides the parameters for the experimental setup. In the first experiment, the sensing coverage of the network over time is compared for the approaches under study. The proposed approach provides more sensing coverage over time than the LEACH algorithm (Dasgupta and Dutta, 2011) and LEACH with fuzzy logic. The second experiment compared the number of dead nodes in the system over time against the LEACH algorithm and LEACH with fuzzy logic. In the LEACH algorithm the cluster head must change in every round, and the remaining energy of a node is not taken into account because there is no notion of a sleep mode. In the proposed approach, by contrast, the cluster head changes and the nodes are re-clustered based on the energy condition of the cluster heads in the network.

CONCLUSION
In the proposed approach, the LEACH protocol is modified using the FPCM algorithm. This approach assumes that all the nodes are fixed at particular locations and are not mobile. It is also assumed that the number of clusters is predetermined and that the nodes possess the same initial energy. Several limitations of the LEACH protocol are overcome by the proposed approach, since the cluster head does not change in every round but only when required to maintain maximum network coverage. Two experiments were carried out to evaluate the proposed approach against LEACH and LEACH with fuzzy logic. The experimental results reveal that the proposed LEACH with FPCM provides more sensing coverage over time and fewer dead nodes over time.