Reducing Broadcast Overhead Using Clustering Based Broadcast Mechanism in Mobile Ad Hoc Network

Problem statement: Network-wide broadcasting is an important function in Mobile Ad Hoc Networks (MANETs); it attempts to deliver packets from a source node to all other nodes in the network. Broadcasting is often very useful for route discovery, naming, addressing and supporting multicast operations in all kinds of networks. In designing broadcast protocols for ad hoc networks, one of the primary goals is to reduce the overhead (redundancy, contention and collision) while reaching all the nodes in the network. Approach: We discuss several approaches to network-wide broadcasting, namely flooding, probability based, area based, neighbor knowledge based and cluster based broadcasting methods. The implementation and analysis were carried out on Linux using the Network Simulator NS2. Results: In this study, a cluster based flooding algorithm is proposed and its metrics, namely normalized routing load and packet delivery ratio, are compared with those of two common flooding algorithms, simple flooding and probability based flooding. Conclusion/Recommendations: Simple flooding requires each node to rebroadcast all packets. Probability based methods use some basic understanding of the network topology to assign each node a probability of rebroadcasting. The cluster based broadcasting algorithm for mobile ad hoc networks guarantees delivery of messages from a source node to all nodes of the network.


INTRODUCTION
A Mobile Ad hoc Network (MANET) consists of a collection of mobile hosts without a fixed infrastructure. Due to limited wireless transmission power, a host may not be able to communicate with its destination directly. It usually requires other hosts to forward its packets to the destination over several hops. So in a MANET every host acts as a router when it is forwarding packets for other hosts. Because of the mobility of hosts and the time variability of the wireless medium, the topology of a MANET changes frequently. Therefore the routing protocol plays an important role in a MANET. There has been extensive research on routing protocols, such as DSR [1] , AODV [2] , ZRP [3] and LAR [4] . A common feature of these routing protocols is that their route discovery relies on network-wide broadcasting to find the destination. Recently, a number of research groups have proposed more efficient broadcasting techniques whose goal is to minimize the number of retransmissions while attempting to ensure that a broadcast packet is delivered to each node in the network. In a broadcast process, each node decides its forwarding status based on given neighborhood information and the corresponding broadcast protocol. Broadcast schemes designed for static networks perform poorly in terms of delivery ratio when nodes are mobile. There are two sources of message delivery failure [5] :
• Collision: The message intended for a destination collides with another message. In Fig. 1, if messages from nodes w and x collide at node y, node y does not receive any message
• Node mobility: A former neighbor moves out of the transmission range of the current node (i.e., it is no longer a neighbor). In Fig. 1, when node w moves out of the transmission range of u, the nodes along the branch of the broadcast tree rooted at w will miss the message
The effect of collision can be relieved by a very short (1 ms) forward jitter delay, with which a very high (>99%) delivery ratio is achieved in static networks. Once collisions are relieved in this way, the majority of the remaining delivery failures are caused by node mobility.
Broadcasting in MANET: The simplest broadcasting scheme is flooding, which is used by most existing routing protocols. It is very costly and often results in serious broadcast storms. The broadcast problem refers to the transmission of a message to all other Mobile Hosts (MHs) in the network. The problem we consider has the following characteristics [6] .
The broadcast is spontaneous: Any Mobile Host (MH) can issue a broadcast operation at any time. For reasons such as MH mobility and the lack of synchronization, preparing any kind of global topology knowledge is prohibitive.
The broadcast is frequently unreliable: An acknowledgment mechanism is rarely used. However, an attempt should be made to distribute a broadcast message to as many MHs as possible without expending too much effort. The motivations for such an assumption are:
• A MH may miss a broadcast message because it is off-line, it is temporarily isolated from the network, or it experiences repetitive collisions
• Acknowledgements may cause serious medium contention (a storm) surrounding the sender
• In many applications (e.g., route discovery in ad hoc routing protocols), 100% reliable broadcast is unnecessary
To avoid the broadcast storm problem, some form of randomized delay can be introduced before a neighboring node relays the received packet. With support from the MAC layer using the RTS/CTS/DATA/ACK approach, reliable transmission can be achieved at each hop. Where more than one neighboring node receives the broadcast transmission, we may use a round-robin approach or a none-or-all approach. In a round-robin approach, the current node unicasts the packet to its neighbors one by one. In a none-or-all approach, after sending out the RTS message, the current node waits for the CTS messages of all neighboring nodes before it finally sends out the data packet; otherwise it aborts this transmission attempt, backs off and retries.
Flooding-generated broadcast storm: A straightforward approach to performing a broadcast is flooding. A MH, on receiving a broadcast message for the first time, has the obligation to rebroadcast the message.
In a CSMA/CA network, drawbacks of flooding include:
• Redundant rebroadcasts: When a MH decides to rebroadcast a broadcast message to its neighbors, all its neighbors may already have the message
• Contention: After a MH broadcasts a message, if many of its neighbors decide to rebroadcast the message, these transmissions (which are all from nearby MHs) may severely contend with each other
• Collision: Because of the deficiency of the back-off mechanism, the lack of an RTS/CTS handshake in broadcasts and the absence of collision detection (CD), collisions are more likely to occur and cause more damage
As mentioned before, the collection of these drawbacks is referred to as the broadcast storm problem. Figure 2 exemplifies the broadcast storm problem, where node S initiates a route request to node D through flooding. As we can see, flooding is highly redundant. Each node receives the route request as many times as its degree (its number of neighbors) and the route request propagates far beyond node D. Because nearby nodes will receive and rebroadcast the route request at nearly the same time, contention (when senders can hear each other) and collision (when senders cannot hear each other) will be common.
Design pattern: In this study, we evaluate broadcast protocols on wireless networks that utilize the IEEE 802.11 MAC [7] . This MAC follows a Carrier Sense Multiple Access/Collision Avoidance (CSMA/CA) scheme. Collision avoidance is inherently difficult in MANETs; one often cited difficulty is overcoming the hidden node problem, where a node is not able to ascertain whether its neighbors are busy receiving transmissions from an uncommon neighbor. The 802.11 MAC utilizes a Request To Send (RTS) / Clear To Send (CTS) / Data / Acknowledgment procedure to account for the hidden node problem when unicasting packets. However, the RTS/CTS/Data/ACK procedure is too cumbersome to implement for broadcast packets, as it would be difficult to coordinate and expensive in bandwidth. Therefore, the only requirement made of broadcasting nodes is that they assess a clear channel before broadcasting. Unfortunately, clear channel assessment does not prevent collisions from hidden nodes. Additionally, no recourse is provided for a collision when two neighbors assess a clear channel and transmit simultaneously.
Random Delay Time (RDT): Many of the broadcasting protocols require a node to keep track of redundant packets received over a short time interval in order to determine whether to rebroadcast. That time interval, which we have arbitrarily termed "Random Delay Time" (RDT), is randomly chosen from a uniform distribution between 0 and Tmax seconds, where Tmax is the highest possible delay interval. This delay in transmission accomplishes two things. First, it allows nodes sufficient time to receive redundant packets and assess whether to rebroadcast. Second, the randomized scheduling helps prevent collisions. An important design consideration is the implementation of the random delay time. One approach is to send broadcast packets to the MAC layer after a short random time, similar to the jitter.
In this case, packets remain in the interface queue (IFQ) until the channel becomes clear for broadcast. While the packet is in the IFQ, redundant packets may be received, allowing the network layer to determine if rebroadcasting is still required. If the network layer protocol decides the packet should not be rebroadcast, it informs the MAC layer to discard the packet. A second approach is to implement the random delay time as a longer time period and keep the packet at the network layer until the RDT expires. Retransmission assessment is done considering all redundant packets during the RDT. After RDT expiration, the packet is either sent to the MAC layer or dropped. No attempts are made by the network layer to remove the packet after sending it to the MAC layer.
Jitter: The purpose of introducing a small amount of jitter when forwarding data packets is to reduce the chance of collisions when nodes within transmission range of each other forward packets that have been received from a common neighbor. For example, suppose a source node originates a broadcast packet. Given that radio waves propagate at the speed of light, all neighbors will receive the transmission almost simultaneously. Assuming similar hardware and system loads, the neighbors will process the packet and rebroadcast at the same time.
To overcome this problem, broadcast protocols jitter the scheduling of broadcast packets from the network layer to the MAC layer by some uniform random amount of time. This (small) offset allows one neighbor to obtain the channel first, while other neighbors detect that the channel is busy (clear channel assessment fails).
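The jitter idea above can be illustrated with a minimal Python sketch. It is not part of the NS2 implementation used in this study; the function name `jittered_send_times` is hypothetical, and the 1 ms upper bound is simply borrowed from the forward jitter delay mentioned in the Introduction.

```python
import random

def jittered_send_times(receive_time, neighbors, max_jitter=0.001, rng=None):
    """Return {neighbor: time at which it hands the packet to the MAC layer}.

    Every neighbor receives the packet at (almost) the same instant, so each
    one adds an independent uniform offset in [0, max_jitter] seconds.
    """
    rng = rng or random.Random()
    return {n: receive_time + rng.uniform(0.0, max_jitter) for n in neighbors}

# Three neighbors hear the same broadcast at t = 0.
times = jittered_send_times(0.0, ["a", "b", "c"], rng=random.Random(1))
# The neighbor with the smallest offset obtains the channel first;
# the others then sense a busy channel and defer.
first = min(times, key=times.get)
```

Without the offset, all three values would be identical and the rebroadcasts would be scheduled at the same instant.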

MATERIALS AND METHODS
Broadcasting methods: Broadcasting methods have been categorized into four families utilizing the IEEE 802.11 MAC specifications. For comparisons of these categories the reader is referred to [8] :
• Simple flooding can be used as a simple protocol for broadcasting and multicasting in ad hoc networks with low node densities and/or high mobility
• Probabilistic schemes are based on the understanding that in a dense network, nodal and network resources can be saved by having some nodes not rebroadcast duplicate packets. A more refined probabilistic scheme is the counter-based approach, in which, upon receiving a broadcast packet, the current node applies a Random Delay Time (RDT) before it determines whether or not to rebroadcast the packet
• In area based methods, intermediate nodes evaluate the additional coverage area based on all received duplicate packets. We can imagine that in a dense network there may be multiple nodes located very close to each other. In such situations, the majority of the coverage areas of these nodes overlap each other. Based on estimated distance or location information, an intermediate node determines whether or not to rebroadcast the received packet
• In neighborhood knowledge based methods, a node determines whether or not to rebroadcast based on its neighbor list. Upon receiving a broadcast packet, a node checks the previous node's neighbor list, which is included in the packet header. If it turns out that it would not reach any additional nodes, it decides not to rebroadcast the packet
Simple flooding method: In this method [9,10] , a source node of a MANET disseminates a message to all its neighbors. Each of these neighbors checks whether it has seen this message before; if so, the message is dropped, and if not, the message is re-disseminated at once to all its neighbors. The process goes on until all nodes have the message.
Although this method is very reliable for a MANET with low node density and high mobility, it is harmful and unproductive in other settings, as it causes severe network congestion and quickly exhausts battery power. Blind flooding ensures coverage; the broadcast packet is guaranteed to be received by every node in the network, provided there is no packet loss caused by collision in the MAC layer and there is no high-speed movement of nodes during the broadcast process. However, due to the broadcast nature of wireless communication media, redundant transmissions in blind flooding may cause the broadcast storm problem, in which redundant packets cause contention and collision.
Probability based approach:
Probabilistic scheme: The probabilistic scheme [11] is similar to ordinary flooding, except that nodes only rebroadcast with a predetermined probability. In dense networks, it is quite likely that multiple nodes share similar transmission coverage. Thus, having some random nodes not rebroadcast saves network resources without harming packet delivery effectiveness. In sparse networks, there is much less shared coverage and, therefore, not all nodes will receive all the broadcast packets with this scheme unless the probability parameter is high. When the probability is 100%, this scheme is identical to ordinary flooding.
Counter-based scheme: An inverse relationship is shown between the number of times a packet is received at a node and the probability of this node's transmission being able to cover additional area on a rebroadcast. This result forms the basis of the counter-based scheme. Upon receipt of a previously unseen packet, the node initiates a counter with a value of one and sets an RDT. During the RDT, the counter is incremented by one for each redundant packet received. If the counter is less than a threshold value when the RDT expires, the packet is rebroadcast. Otherwise, it is simply dropped. The features of the counter-based scheme are its simplicity and its inherent adaptability to local topologies. In other words, in a dense area of the network some nodes will not rebroadcast, whereas in sparse areas of the network all nodes will likely rebroadcast.
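The counter logic described above can be sketched in a few lines of Python. This is an illustrative sketch only; the class and method names are hypothetical, timers are left to the caller, and the threshold value of 3 in the example is an arbitrary assumption.

```python
class CounterBasedNode:
    """Per-node state for the counter-based rebroadcast decision."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counters = {}  # packet id -> number of copies received so far

    def on_receive(self, packet_id):
        """Count one copy; return True if it is the first (caller starts the RDT)."""
        first = packet_id not in self.counters
        self.counters[packet_id] = self.counters.get(packet_id, 0) + 1
        return first

    def on_rdt_expired(self, packet_id):
        """Return True if the packet should be rebroadcast, False to drop it."""
        return self.counters[packet_id] < self.threshold

# Sparse area: only one copy is heard during the RDT, so the node rebroadcasts.
sparse = CounterBasedNode(threshold=3)
sparse.on_receive("p1")
sparse_decision = sparse.on_rdt_expired("p1")

# Dense area: four copies are heard during the RDT, so the node stays silent.
dense = CounterBasedNode(threshold=3)
for _ in range(4):
    dense.on_receive("p1")
dense_decision = dense.on_rdt_expired("p1")
```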
Area based methods: Suppose a node receives a packet from a sender that is located only one meter away. If the receiving node rebroadcasts, the additional area covered by the retransmission is quite low. On the other extreme, if a node is located at the boundary of the sender node's transmission distance, then a rebroadcast would reach significant additional area, 61% to be precise [11] . A node using an Area Based Method can evaluate additional coverage area based on all received redundant transmissions. We note that area based methods only consider the coverage area of a transmission; they don't consider whether nodes exist within that area.
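The 61% figure can be checked with a short calculation, assuming both nodes have the same circular transmission range of radius r. The function name below is hypothetical; the formula is the standard area of overlap of two equal circles whose centers are a distance d apart.

```python
import math

def additional_coverage_fraction(d, r):
    """Fraction of a radius-r disc NOT covered by an equal disc centered d away."""
    if d >= 2 * r:
        return 1.0  # the two discs do not overlap at all
    # Area of the lens-shaped intersection of two circles of radius r.
    overlap = 2 * r * r * math.acos(d / (2 * r)) - (d / 2) * math.sqrt(4 * r * r - d * d)
    return 1.0 - overlap / (math.pi * r * r)

# Receiver on the boundary of the sender's range (d = r): about 61% new area.
boundary = additional_coverage_fraction(1.0, 1.0)
# Receiver at the sender's position (d = 0): no new area.
same_spot = additional_coverage_fraction(0.0, 1.0)
```

For d = r the expression reduces to (π/3 + √3/2)/π, which is approximately 0.61.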
Distance-based scheme: A node using the Distance-Based Scheme compares the distance between itself and each neighbor node that has previously rebroadcast a given packet. Upon reception of a previously unseen packet, an RDT is initiated and redundant packets are cached. When the RDT expires, all source node locations are examined to see if any node is closer than a threshold distance value. If true, the node doesn't rebroadcast.
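A minimal sketch of the decision taken when the RDT expires is given below. The function name and the use of planar (x, y) coordinates are assumptions made for illustration; the positions stand for the senders of all copies of the packet heard during the RDT.

```python
import math

def distance_based_decision(own_pos, sender_positions, threshold):
    """Return True (rebroadcast) only if no heard sender is closer than threshold."""
    return all(math.dist(own_pos, p) >= threshold for p in sender_positions)

# All senders are far away, so a rebroadcast is expected to add coverage.
far_only = distance_based_decision((0, 0), [(100, 0), (0, 120)], threshold=50)
# One sender is only 10 units away, so the node refrains from rebroadcasting.
one_near = distance_based_decision((0, 0), [(100, 0), (10, 0)], threshold=50)
```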

Location-based scheme: The Location-Based scheme [11] uses a more precise estimation of expected additional coverage area in the decision to rebroadcast. In this method, each node must have the means to determine its own location, e.g., a Global Positioning System (GPS). Whenever a node originates or rebroadcasts a packet it adds its own location to the header of the packet. When a node initially receives a packet, it notes the location of the sender and calculates the additional coverage area obtainable were it to rebroadcast.
If the additional area is less than a threshold value, the node will not rebroadcast and all future receptions of the same packet will be ignored. Otherwise, the node assigns an RDT before delivery. If the node receives a redundant packet during the RDT, it recalculates the additional coverage area and compares that value to the threshold. The area calculation and threshold comparison occur with all redundant broadcasts received until the packet either reaches its scheduled send time or is dropped.

Neighbor knowledge method:
Flooding with self pruning: The simplest of the Neighbor Knowledge Methods is what Lim and Kim refer to as Flooding with Self Pruning [12] . This protocol requires that each node have knowledge of its 1-hop neighbors, which is obtained via periodic "Hello" packets. A node includes its list of known neighbors in the header of each broadcast packet. A node receiving a broadcast packet compares its neighbor list to the sender's neighbor list. If the receiving node would not reach any additional nodes, it refrains from rebroadcasting; otherwise the node rebroadcasts the packet.
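The self-pruning comparison amounts to a set difference, as the following sketch shows. The function name is hypothetical and neighbor lists are modeled as plain Python sets of node identifiers.

```python
def self_pruning_decision(own_neighbors, sender, sender_neighbors):
    """Return True if rebroadcasting would reach a node the sender did not reach."""
    additional = set(own_neighbors) - set(sender_neighbors) - {sender}
    return bool(additional)

# Node B hears a packet from A. B's neighbors are A, C and D; A's are B and C.
# D was not reached by A, so B rebroadcasts.
reaches_new = self_pruning_decision({"A", "C", "D"}, "A", {"B", "C"})
# If B only knows A and C, A's transmission already covered them: B stays silent.
reaches_none = self_pruning_decision({"A", "C"}, "A", {"B", "C"})
```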

Scalable Broadcast Algorithm (SBA): The Scalable Broadcast Algorithm (SBA) [13] requires that all nodes have knowledge of their neighbors within a two hop radius. This neighbor knowledge, coupled with the identity of the node from which a packet is received, allows a receiving node to determine if it would reach additional nodes by rebroadcasting. 2-hop neighbor knowledge is achievable via periodic "Hello" packets; each "Hello" packet contains the node's identifier (IP address) and the list of known neighbors.
After a node receives a "Hello" packet from all its neighbors, it has two hop topology information centered at itself. Suppose Node B receives a broadcast data packet from Node A. Since Node A is a neighbor, Node B knows all of its neighbors, common to Node A, that have also received Node A's transmission of the broadcast packet. If Node B has additional neighbors not reached by Node A's broadcast, Node B schedules the packet for delivery with a RDT. If Node B receives a redundant broadcast packet from another neighbor, Node B again determines if it can reach any new nodes by rebroadcasting.
This process continues until either the RDT expires and the packet is sent, or the packet is dropped.

Note on the distance-based scheme: The researchers of [13] note that signal strength can be used to calculate the distance from a source node; in other words, that protocol is implementable without a Global Positioning System (GPS).

Multipoint relaying: Multipoint Relaying [14] is similar to Dominant Pruning in that rebroadcasting nodes are explicitly chosen by upstream senders. For example, say Node A is originating a broadcast packet. It has previously selected some, or in certain cases all, of its one hop neighbors to rebroadcast all packets they receive from Node A. The chosen nodes are called Multipoint Relays (MPRs) and they are the only nodes allowed to rebroadcast a packet received from Node A. Each MPR is required to choose a subset of its one hop neighbors to act as MPRs as well. Since a node knows the network topology within a 2-hop radius, it can select 1-hop neighbors as MPRs that most efficiently reach all nodes within the two hop neighborhood.
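One way to choose relays that cover the two hop neighborhood is a greedy set-cover heuristic, sketched below. This is an assumption made for illustration and is not necessarily the exact selection procedure of [14]; the function name and the adjacency-dictionary representation are hypothetical.

```python
def select_mprs(node, adjacency):
    """Greedily pick 1-hop neighbors of `node` that together cover all 2-hop nodes."""
    one_hop = set(adjacency[node])
    two_hop = set()
    for n in one_hop:
        two_hop |= set(adjacency[n])
    two_hop -= one_hop | {node}          # strictly two hops away

    uncovered = set(two_hop)
    mprs = set()
    while uncovered:
        # Pick the neighbor that covers the most still-uncovered 2-hop nodes
        # (ties are broken by identifier order via sorted()).
        best = max(sorted(one_hop - mprs),
                   key=lambda n: len(set(adjacency[n]) & uncovered))
        mprs.add(best)
        uncovered -= set(adjacency[best])
    return mprs

# A has neighbors B, C, D; the 2-hop nodes E and F are both reachable through C.
adjacency = {
    "A": {"B", "C", "D"},
    "B": {"A", "E"},
    "C": {"A", "E", "F"},
    "D": {"A", "F"},
    "E": {"B", "C"},
    "F": {"C", "D"},
}
mprs_of_a = select_mprs("A", adjacency)
```

In this example a single relay suffices, so B and D never rebroadcast packets received from A.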
Ad hoc broadcast protocol: The Ad Hoc Broadcast Protocol (AHBP) [15] utilizes an approach similar to Multipoint Relaying. In AHBP, only nodes that are designated as a Broadcast Relay Gateway (BRG) within a broadcast packet header are allowed to rebroadcast the packet. BRGs are proactively chosen by each upstream sender, which is a BRG itself. The algorithm for a BRG to choose its BRG set is identical to that used in Multipoint Relaying (see steps 1-4 for choosing MPRs).

AHBP differs from Multipoint Relaying in three ways:
• A node using AHBP informs 1-hop neighbors of the BRG designation within the header of each broadcast packet. This allows a node to calculate the most effective BRG set at the time a broadcast packet is transmitted. In contrast, Multipoint Relaying informs 1-hop neighbors of the MPR designation via "Hello" packets
• In AHBP, when a node receives a broadcast packet and is listed as a BRG, the node uses 2-hop neighbor knowledge to determine which neighbors also received the broadcast packet in the same transmission. These neighbors are considered already "covered" and are removed from the neighbor graph used to choose next hop BRGs. In contrast, MPRs are not chosen considering the source route of the broadcast packet
• AHBP is extended to account for high mobility networks. Suppose Node A receives a broadcast packet from Node B and Node A does not list Node B as a neighbor (i.e., Node A and Node B have not yet exchanged "Hello" packets). In AHBP-EX (extended AHBP), Node A will assume BRG status and rebroadcast the packet. Multipoint Relaying could be similarly extended

Cluster based methods:
The clustering approach has been used to address traffic coordination schemes [16] , routing problems [17] and fault tolerance issues [18] . Note that the cluster approach proposed in [16] was adopted to reduce the complexity of the broadcast storm problem. Each node in a MANET periodically sends "Hello" messages to advertise its presence. Each node has a unique ID. A cluster is a set of nodes formed as follows.
A node with a local minimal ID elects itself as a cluster head. All surrounding nodes of a head are members of the cluster identified by the head's ID. Within a cluster, a member that can communicate with a node in another cluster is a gateway. To take mobility into account, when two heads meet, the one with the larger ID gives up its head role. This cluster formation is shown in Fig. 3. Ni et al. [19] assumed that the clusters formed in a MANET are maintained regularly by the underlying cluster formation algorithm. In a cluster, the head's rebroadcast can cover all other nodes in its cluster. To rebroadcast the message to nodes in other clusters, gateway nodes are used; hence there is no need for non-gateway nodes to rebroadcast the message. As different clusters may still have many gateway nodes, these gateways still use one of the broadcasting approaches to determine whether to rebroadcast or not. Ni et al. [19] showed that the performance of the cluster based method in which the location based approach was incorporated compared favorably to the original location based scheme. The method saved many more rebroadcasts and led to shorter average broadcast latencies. Unfortunately, the reachability was unacceptable in low density MANETs.
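The head, member and gateway roles described above can be sketched as follows. This is a centralized illustration under stated assumptions, not the distributed protocol itself: nodes are processed in increasing ID order, a node becomes a head when none of its neighbors is already a head, and otherwise it joins the lowest-ID neighboring head. The function name and graph representation are hypothetical.

```python
def lowest_id_clusters(adjacency):
    """Return (head_of, gateways) for an undirected graph {node_id: set_of_neighbors}."""
    head_of = {}
    for node in sorted(adjacency):
        # Neighbors that have already elected themselves as heads.
        neighbor_heads = [n for n in adjacency[node] if head_of.get(n) == n]
        head_of[node] = min(neighbor_heads) if neighbor_heads else node

    # A gateway is a non-head member with at least one neighbor in another cluster.
    gateways = {
        node for node, head in head_of.items()
        if node != head and any(head_of[n] != head for n in adjacency[node])
    }
    return head_of, gateways

# A chain of five nodes: 1 - 2 - 3 - 4 - 5
chain = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
head_of, gateways = lowest_id_clusters(chain)
```

In the chain example, nodes 1, 3 and 5 become heads, while nodes 2 and 4 are gateways that connect neighboring clusters.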

The broadcasting algorithms under evaluation:
Simple flooding algorithm-Algorithm 1: The simple flooding algorithm with respect to normalized routing load is implemented in Algorithm 1 using NS2 simulation. The steps are as follows:
• The algorithm for simple flooding starts with a source node broadcasting a packet to all neighbors
• Each of those neighbors in turn rebroadcasts the packet exactly one time
• This continues until all reachable network nodes have received the packet
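The steps of Algorithm 1 can be illustrated outside NS2 with a small graph traversal. The sketch below assumes a static, loss-free topology given as an adjacency dictionary; the function name is hypothetical and the code is not the simulation script used in this study.

```python
from collections import deque

def simple_flood(adjacency, source):
    """Return (nodes that received the packet, number of transmissions)."""
    received = {source}
    transmissions = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        transmissions += 1              # each holder of the packet sends it exactly once
        for neighbor in adjacency[node]:
            if neighbor not in received:   # duplicates are dropped
                received.add(neighbor)
                queue.append(neighbor)
    return received, transmissions

topology = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S", "C"],
    "C": ["A", "B"],
}
reached, sent = simple_flood(topology, "S")
```

Every reachable node transmits once, so the number of transmissions equals the number of nodes reached, even though C hears the packet twice.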

Probability based flooding algorithm-Algorithm 2: The probability based flooding algorithm with respect to normalized routing load is implemented in Algorithm 2 using NS2 simulation. The probabilistic scheme is similar to flooding, except that nodes only rebroadcast with a predetermined probability. The algorithm starts with a source node broadcasting a packet to all neighbors. Each of those neighbors in turn rebroadcasts the packet at most one time, subject to a random condition with the predetermined probability. This continues until all reachable network nodes have received the packet or no further node rebroadcasts. When the probability is 100%, this scheme is identical to flooding.
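A corresponding sketch for Algorithm 2 is shown below, again on a static, loss-free adjacency dictionary rather than in NS2. The function name and the parameter `p` (the rebroadcast probability) are hypothetical names used for illustration.

```python
import random
from collections import deque

def probabilistic_flood(adjacency, source, p, rng=None):
    """Return (nodes that received the packet, number of transmissions)."""
    rng = rng or random.Random()
    received = {source}
    transmissions = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        # The source always transmits; every other node rebroadcasts with probability p.
        if node != source and rng.random() >= p:
            continue
        transmissions += 1
        for neighbor in adjacency[node]:
            if neighbor not in received:
                received.add(neighbor)
                queue.append(neighbor)
    return received, transmissions

topology = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S", "C"],
    "C": ["A", "B"],
}
all_forward = probabilistic_flood(topology, "S", p=1.0)   # behaves like simple flooding
none_forward = probabilistic_flood(topology, "S", p=0.0)  # only the source transmits
```

With p = 0 node C is never reached, which mirrors the loss of reachability expected in sparse networks when the probability is set too low.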

The proposed clustering based techniques:
The proposed broadcasting algorithm for mobile ad hoc networks guarantees to deliver the messages from a source node to all the nodes of the network. The nodes are mobile and can move from one place to another. The algorithm adapts itself dynamically to the topology and always gives the least finish time for any particular broadcast. The algorithm focuses on reliable broadcasting. It guarantees to deliver the messages within a bounded time. The algorithm takes into consideration multiple nodes located at the same point.
The algorithm tries to fix any delay latencies and message losses. It is collision free and energy efficient.
The proposed cluster based broadcasting algorithm-Algorithm 3: The k-means algorithm is very popular for data clustering. In this broadcasting algorithm, k-means is used to cluster the nodes with respect to their locations in the MANET and to select a central node in each cluster as a forwarding node. The steps are as follows:
1. Resolve the locations of all the nodes in the network (in this research, a simulated GPS was assumed)
2. Select k centers in the problem space (they can be random)
3. Partition the data into k clusters by grouping points that are closest to those k centers
4. Use the mean of these k clusters to find new centers
5. Repeat steps 3 and 4 until the centers do not change
6. For each calculated cluster center, find the nearest node and treat it as the central node of that cluster
7. Make the central nodes forwarding nodes
8. Start the broadcast from the source node by broadcasting a packet to all neighbors
9. Each neighboring node in turn rebroadcasts the packet exactly one time, if and only if it is a forwarding node
10. This continues until all reachable network nodes have received the packet
The working principle of the k-means algorithm is as follows:
1. Select k centers in the problem space (they can be random)
2. Partition the data into k clusters by grouping points that are closest to those k centers
3. Use the mean of these k clusters to find new centers
4. Repeat steps 2 and 3 until the centers do not change
The algorithm normally converges within a few iterations.

RESULTS
For the purpose of this study, we have experimented with various kinds of simulations on NS2 [20] to understand and implement the flooding algorithms. The performance of broadcast protocols can be measured by a variety of metrics. A commonly used metric is the number of message retransmissions with respect to the number of nodes. The next important metric is reachability or the ratio of nodes connected to the source that received the broadcast message. Time delay or latency is sometimes used, which is the time needed for the last node to receive the broadcast message initiated at the source. Table 1 shows the important simulation parameters considered for simulation work.
The following metrics were considered for evaluating the flooding algorithms:
• Packet delivery ratio
• Normalized routing load
Packet Delivery Ratio (PDR): Packet delivery ratio is the ratio of the number of packets successfully received by all destinations to the total number of packets injected into the network by all sources. Table 2 shows the packet delivery ratio of the three algorithms at different node speeds. Here the total number of mobile nodes taken for simulation is 24.
Packet delivery ratio chart: The following line chart (Fig. 4) shows the packet delivery ratio of the three algorithms at different node speeds.
Normalized routing load: Normalized routing load is the ratio of the number of routing messages propagated by every node in the network to the number of data packets successfully delivered to all destination nodes. In other words, the routing load is the average number of routing messages generated for each data packet successfully delivered to the destination. Table 3 shows the normalized routing load of the three algorithms at different node speeds. Here the total number of mobile nodes taken for simulation is 24.
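The two metrics defined above reduce to simple ratios of packet counts. The sketch below uses hypothetical function names, and the counts in the example are made-up illustrative values, not results from this study.

```python
def packet_delivery_ratio(packets_received, packets_sent):
    """Packets received by all destinations / packets injected by all sources."""
    return packets_received / packets_sent if packets_sent else 0.0

def normalized_routing_load(routing_packets, data_packets_delivered):
    """Routing messages propagated per data packet successfully delivered."""
    if data_packets_delivered == 0:
        return float("inf")   # nothing was delivered, so the load is unbounded
    return routing_packets / data_packets_delivered

# Illustrative counts only (not measured values).
pdr = packet_delivery_ratio(packets_received=90, packets_sent=100)
nrl = normalized_routing_load(routing_packets=450, data_packets_delivered=90)
```

A higher packet delivery ratio and a lower normalized routing load both indicate a more efficient broadcast scheme.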

DISCUSSION
Broadcasting is an essential building block of any MANET, so it is imperative to utilize the most efficient broadcast methods possible to ensure a reliable network. Due to the dynamic change of MANET topology and its scarce resource availability, however, no single algorithm is optimal for all relevant scenarios. In this study, we have evaluated the performance of single source broadcasting techniques, namely the simple flooding algorithm and the probability flooding algorithm, using simulation. We have also proposed cluster based techniques and algorithms for efficient flooding in Mobile Ad Hoc Networks. In this research, the classic k-means clustering algorithm was used to cluster the mobile nodes. The k-means algorithm has some drawbacks, however, and can produce wrong clusters if there are many outliers in the location data. In this implementation the number of clusters was decided with respect to the model scenario at hand.

CONCLUSION
The cluster based broadcasting algorithm guarantees to deliver the packets from a source node to all the nodes of the network with minimum overhead. In this research, the old flooding method was used to get the location information and the proposed clustering based method was used for further messaging. The main scope of this research is to devise a new clustering based distributed algorithm for efficient flooding in MANETs. Future research may address the possibility of removing the classical flooding phase that is used to discover location information. Future research may also address the issues of a real implementation, which may involve real GPS for resolving location information.