EXTENDED-RANDOMIZED, EFFICIENT, DISTRIBUTED: A DYNAMIC DETECTION OF CLONE ATTACKS IN STATIC WIRELESS SENSOR NETWORKS

A wireless sensor network is a collection of nodes organized into a cooperative network. Each node has processing capability, multiple types of memory, a power source, and sensors and actuators. Such networks are deployed in hostile and harsh environments for civil and military applications, and they are prone to various attacks. One of the major attacks is the clone attack: an adversary captures a node, replicates it together with its cryptographic information, and deploys the replicas in the network. This leads to problems such as data leakage, jamming of the data flow and injection of false data. The RED protocol selects witness nodes pseudo-randomly, but the selection is purely static. This study proposes eXtended-Randomized, Efficient, Distributed (X-RED), a distributed protocol that detects clone nodes in static wireless sensor networks quickly and dynamically. It computes the witness nodes dynamically, with no prior assumption in determining the witness node. We show that the protocol satisfies the major requirements of distributed algorithms: witness nodes are selected based on their ID and location, and the overhead is reduced. Simulation results show that our protocol achieves a considerably higher detection probability than other existing protocols and also reduces the storage overhead. This study can be extended to mobile wireless sensor networks in the future.


INTRODUCTION
A wireless sensor network is a network of tiny, resource-limited sensor nodes that communicate with each other over wireless channels to achieve a common goal. Such networks are mainly used in military applications for security monitoring and in civil applications (Akyildiz et al., 2002). They are deployed in harsh and hostile environments. Because they operate unattended, they are prone to various attacks.
One of the common attacks is the clone attack, or replication attack, in which an adversary captures some nodes, makes duplicates of the original nodes and inserts these duplicates into the network. The duplicates use the same node identifier (ID) as the original node. The adversary can thus take full control over the network (Lupu and Parvan, 2009). The consequences of this attack include injecting false data, modifying data, initiating a wormhole attack and dropping packets. All of these result in leaking of authorized data to the adversary.

MATERIALS AND METHODS
Several algorithms have been developed so far to detect clone attacks in both static and mobile sensor networks. In this study we propose an algorithm that is randomized and distributed and that detects clone nodes dynamically, and we analyze the performance of the existing protocols LSM and RED in terms of detection probability and communication overhead (memory occupation).
Science Publications JCS
The main requirements of a distributed algorithm are discussed in (Conti et al., 2006):
• Witness node selection: The witness node may be selected randomly or pseudo-randomly in the distributed network. To predict the witness node, either the ID or the location is used
• Overhead: Since the sensor network is resource-constrained, the overhead in message transmission should be avoided
An efficient algorithm should be distributed in nature and should select the witness node so as to minimize the communication cost and increase the detection probability.
The remaining part of this study is organized as follows: Section 3 reviews the existing protocols. Section 4 explains the network model, assumptions and the notations used. Section 5 introduces the proposed system. Section 6 shows the simulation results and analyzes the results of other existing protocols. Section 7 concludes this study.

RELATED WORKS
The first solution for clone detection is a centralized one based on the Base Station (BS). Each node sends its ID and location information to the BS (Xing et al., 2008). If different location information is received for the same ID, a clone node is detected (Zhu et al., 2012). This scheme has drawbacks: it requires a large number of message transmissions and the BS is a single point of failure. In addition, the nodes located closer to the BS have to transmit many messages, which reduces their operational life.
In another centralized approach, each node holds a set of symmetric keys selected randomly from a large pool (Eschenauer and Gligor, 2002). Each node counts the number of times each key is used for its communication (Brooks et al., 2007; Chan et al., 2003) and sends its counts to the BS. From these counts, the BS identifies the clone nodes in the network. Nodes whose keys are used too often are considered cloned and the revocation procedure is invoked.
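The key-usage idea can be illustrated with a minimal sketch. The report format (one dictionary of key ID to usage count per node), the function name and the fixed threshold are assumptions made for illustration; the cited schemes derive their threshold statistically, which is not reproduced here.

```python
from collections import Counter

def suspicious_keys(reports, threshold):
    """Return the key IDs whose total reported usage exceeds `threshold`.

    `reports` is a list of dicts (key_id -> usage count), one per node.
    Both the report format and the fixed threshold are assumptions.
    """
    total = Counter()
    for report in reports:
        total.update(report)  # adds the per-key counts of this node
    return {key for key, count in total.items() if count > threshold}

# Two nodes report their counts to the base station.
reports = [{"k1": 3, "k2": 1}, {"k1": 4, "k3": 2}]
flagged = suspicious_keys(reports, threshold=5)
```

With these hypothetical counts only key k1 (7 uses in total) exceeds the threshold, so the base station would start revocation for the nodes holding it.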
The two main protocols that appeared in (Parno et al., 2005) are distributed solutions. In the first scheme, Randomized Multicast (RM), a node sends its location information to its direct neighbors, and each of these neighbors in turn sends this information to randomly selected witnesses. If there is a replicated node, one of the witnesses may receive different location claims with the same ID, in which case it revokes the replicated node. The advantage is a high detection probability using a relatively limited number of witnesses. The number of messages sent by each neighbor is √n.
The second scheme, Line Selected Multicast (LSM), uses the routing information to detect the clones. In addition to the witness nodes, the intermediate nodes along the path can check for clones, as shown in Fig. 1. Each node both forwards and stores the claims. For example, consider a node a and its clone a' in the network. A neighbor of a sends the location claim to r witnesses, and each node on the path stores this claim as well. As the claim travels along the path, any node w verifies the signature on the claim and checks for a conflict with the location information in its buffer. If there is a conflict, it revokes the cloned node; otherwise it stores the claim and forwards it to the next node. The advantages are lower communication cost, a high detection rate and lower storage requirements.
In Zhu et al. (2007), two more schemes are proposed: Single Deterministic Cell and Parallel Multiple Probabilistic Cells. In the first scheme, each node ID is associated with a single cell. The location information is sent to the predefined witness node within that cell. Once the witness node receives the message, it broadcasts it to all other nodes in the cell. In the second scheme, a predefined number of witnesses is determined. The neighbors of a node a send a's claim to these witness nodes with a given probability. This solution shows a high detection probability.
Another protocol for detecting node replication attacks is SET, proposed in (Choi et al., 2007). A randomly generated number is sent to all nodes and is used to form disjoint sets of clusters and cluster heads. Each cluster is considered as a set and the heads of these clusters become the leaders of these sets. Within each cluster one or more trees are defined over the network graph. A protocol is used to collect all the nodes belonging to these subsets. If different subsets contain the same ID, then there is a clone.
The RED protocol is similar to the RM protocol, but the witnesses are chosen by a pseudo-random function based on a random value. A random value, rand, is generated and distributed to all the nodes using a centralized mechanism. Each node broadcasts a message that contains its encrypted ID and location information. Each neighbor of the source node sends (with probability p) this encrypted message to a set of g ≥ 1 nodes selected using a pseudo-random function (Conti et al., 2011). The disadvantages of the RED protocol are that the number of messages transmitted is high, the computation time is high, the witness selection is static (the number of witnesses is fixed in advance, e.g., g = 1 or g ≥ 1) and it is location dependent.
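A small sketch may help to show why a pseudo-random function of the ID and rand leads both a and its clone to the same witnesses. The use of SHA-256, the string encoding and the mapping to node indices are assumptions for illustration; they are not the function specified by RED, which is only described here as "some pseudo-random function".

```python
import hashlib

def select_witnesses(node_id: int, rand: int, g: int, n: int) -> list[int]:
    """Map (node_id, rand) to g witness indices in [0, n-1].

    Hypothetical mapping: any deterministic function of the ID and the
    shared value rand would serve the same illustrative purpose.
    """
    witnesses = []
    for i in range(g):
        digest = hashlib.sha256(f"{node_id}|{rand}|{i}".encode()).digest()
        witnesses.append(int.from_bytes(digest[:8], "big") % n)
    return witnesses

# The same ID and rand always give the same witnesses, so claims from a node
# and from its clone meet at the same nodes within one iteration.
w_original = select_witnesses(node_id=7, rand=42, g=2, n=1000)
w_clone = select_witnesses(node_id=7, rand=42, g=2, n=1000)
```

Because the result depends only on the ID and on rand, it is fixed for the whole iteration, which is the static behaviour that X-RED sets out to avoid.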

NETWORK MODEL AND ASSUMPTIONS
In this study, we assume that nodes are static, not tamper resistant and uniformly deployed in the area of observation. We also assume that communication links between sensor nodes are bidirectional (Yu et al., 2009) and that there is no centralized trusted entity in the sensor network. Nodes are assigned a unique ID (Jian et al., 2012) prior to their deployment. We make the following assumptions about the adversary: it can compromise only a limited number of nodes, it can take full control over a compromised node, it can create as many replicas as it wishes to deploy into the network, and it cannot create a new ID for a sensor node (Ho et al., 2009).

Key Generation
Key generation provides authentication to the nodes in the network for security. The keys are generated with the RSA algorithm. The Rivest-Shamir-Adleman (RSA) algorithm is one of the most popular and secure public-key encryption methods (Rivest et al., 1978). The algorithm capitalizes on the fact that there is no known efficient way to factor very large (100-200 digit) numbers.
Using an encryption key (e,n), the algorithm is as follows: Represent the message as an integer between 0 and (n-1). Large messages can be broken up into a number of blocks. Each block would then be represented by an integer in the same range. Encrypt the message by raising it to the eth power modulo n. The result is a cipher text message C. To decrypt cipher text message C, raise it to another power d modulo n. The encryption key (e,n) is made public. The decryption key (d,n) is kept private by the user.
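The encryption and decryption steps just described can be traced with a toy example. The primes and the exponent below are small illustrative values chosen for readability (an assumption, not parameters from this study); real deployments use much larger primes and a padding scheme.

```python
# Textbook RSA with tiny primes, for illustration only (no padding, insecure).
p, q = 61, 53              # assumed toy primes
n = p * q                  # modulus shared by both keys
phi = (p - 1) * (q - 1)    # Euler's totient of n
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent: inverse of e mod phi (Python 3.8+)

def encrypt(m: int) -> int:
    """Raise the message to the e-th power modulo n."""
    if not 0 <= m < n:
        raise ValueError("message must be an integer between 0 and n-1")
    return pow(m, e, n)

def decrypt(c: int) -> int:
    """Raise the cipher text to the d-th power modulo n."""
    return pow(c, d, n)

message = 65
cipher = encrypt(message)
recovered = decrypt(cipher)   # equals the original message
```

Here (e, n) plays the role of the public encryption key and (d, n) that of the private decryption key.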

Prediction
Two types of information could be used to predict the witness nodes: ID information and location information. The protocol does not reveal any information about the IDs of the witness nodes for the next iteration of the protocol, and the probability that a node is selected as a witness does not depend on its location. Our protocol uses both ID and location information to detect replicas in the network.

Notation
For clarity, we list the symbols and notation used throughout the paper in Table 1.

RESULTS AND DISCUSSION
A source node sends its location information to a neighbor node located in a randomly chosen direction. This neighbor computes a diameter, either randomly or through a hash function. All the nodes within the circle of diameter d receive the location information and compare it with the claims they hold. The node that lies within the circle, at or near its boundary in the same direction, becomes the witness node. From this node the location information is forwarded to a node in a randomly selected direction. The proposed system architecture is given in Fig. 2. In the RED protocol the witness node selection is performed by a pseudo-random function and is purely static. Our proposed approach instead selects the witness node dynamically and randomly in every iteration, with no pre-computation.

X-RED Protocol
The proposed protocol is executed as follows. The nodes a and a' send their location and ID information to a neighbor in a randomly selected direction. This neighbor node computes the diameter, collects the nodes within that diameter and compares the location and ID. If the IDs are the same and the locations are different, a clone node is detected and the neighbor starts the revocation procedure. Otherwise, the information is forwarded to a node on the boundary of the circle or near its edge. The same procedure is then repeated until the clone is found.
The proposed protocol steps are given below.
Input: Encrypted message with ID, location and time
Output: Detection of clone nodes
Step 1: Source node a encrypts the message with ID, location and time using the RSA algorithm.
Step 2: This encrypted message is sent to a neighbor node, which is selected randomly based on the direction.
Step 3: When the neighbor node receives the message, it decrypts it using the RSA algorithm and checks the authorization of the source.
Step 4: If the source is not authorized, the message is discarded.
Step 5: If the source is authorized, the neighbor compares the ID and location of the received message with the existing ones.
Step 6: If the IDs are the same and the locations are different, a clone node is detected and the revocation procedure is initiated.
Step 7: Otherwise, the neighbor node computes a diameter using a hash function and forwards the message to all the nodes within the diameter range.
Step 8: All these nodes perform the comparison and start the revocation procedure if a clone node is detected. Otherwise, the farthest neighbor node, a node diameter/2 distance apart in the same direction, is selected as the witness node.
Step 9: This witness node repeats the protocol from Step 2 to Step 8.
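The geometric part of Steps 7 and 8 can be sketched as follows. The text does not specify how the hash is mapped to a diameter or where the circle is centered, so the SHA-256 mapping, the bounds d_min and d_max, the circle centered on the neighbor and all function names are assumptions made for illustration.

```python
import hashlib
import math

def hashed_diameter(claim: bytes, d_min: float, d_max: float) -> float:
    """Derive a diameter in [d_min, d_max] from a hash of the claim (assumed mapping)."""
    h = int.from_bytes(hashlib.sha256(claim).digest()[:8], "big")
    return d_min + (d_max - d_min) * h / (2**64 - 1)

def nodes_in_circle(center, diameter, positions):
    """IDs of the nodes lying within diameter/2 of the center (Step 7)."""
    radius = diameter / 2
    return [nid for nid, pos in positions.items() if math.dist(center, pos) <= radius]

def pick_witness(center, diameter, direction, positions):
    """Among the nodes in the circle, pick the one farthest along `direction` (Step 8).

    `direction` is an angle in radians; returns None if the circle is empty.
    """
    ux, uy = math.cos(direction), math.sin(direction)
    candidates = nodes_in_circle(center, diameter, positions)
    if not candidates:
        return None

    def projection(nid):
        x, y = positions[nid]
        return (x - center[0]) * ux + (y - center[1]) * uy

    return max(candidates, key=projection)

# Hypothetical layout: node ID -> (x, y) position.
positions = {1: (0.1, 0.0), 2: (0.4, 0.0), 3: (0.0, 0.3), 4: (2.0, 0.0)}
witness = pick_witness((0.0, 0.0), 1.0, 0.0, positions)
```

In this layout nodes 1, 2 and 3 fall inside the circle and node 2, the farthest one in the chosen direction, becomes the witness. The comparison of claims inside the circle (Steps 5, 6 and 8) is left out of this sketch.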
X-RED is executed at frequent intervals of time. Every run of the protocol consists of eight steps. In the first step, the source node digitally signs its message (ID and geographic location) and forwards it to the farthest neighbor in the randomly selected direction. When the neighbor receives the message, it executes Step 2 to Step 7. The neighbor node computes the diameter, and the location claims of all nodes within the circular area are collected and compared. If no clone is found, a witness node is selected as given in Step 8. X-RED does not send messages to a specific ID. A message sent to a node that is not available in the network would be discarded; nodes deployed after the initial network deployment are not selected as witnesses, because doing so would require updating all the nodes.
Step 1 encrypts a message (claim) and forwards it to the randomly selected neighbor. Generally, the message consists of the time, ID and location of the source node. Each neighbor that receives the message performs the following steps:
• Verifies the received message for its authentication
• Checks the message for its freshness
For every valid message that passes these checks, the possible witness node extracts the ID and location. If this is the first message that contains this ID, the node simply stores the message. Otherwise, it computes the diameter and collects the information of all neighbor nodes within that diameter.
If a claim with the same ID as the source is already present within the diameter, the node checks whether the new claim carries different location information from the one stored in memory for this ID. If so, the witness node triggers a revocation procedure for the ID: the two signed claims with the same ID and different location information are the proof of cloning.
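The handling of a claim at a possible witness node can be sketched as below. The class and field names, the freshness window max_age and the string return values are hypothetical; signature verification is assumed to have succeeded before process is called and is not shown.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Claim:
    node_id: int
    location: tuple[float, float]
    timestamp: float

class ClaimStore:
    """Claims remembered by one possible witness node (illustrative only)."""

    def __init__(self, max_age: float):
        self.max_age = max_age            # assumed freshness window
        self.seen: dict[int, Claim] = {}

    def process(self, claim: Claim, now: float) -> str:
        # Freshness check: reject claims older than the allowed window.
        if now - claim.timestamp > self.max_age:
            return "stale"
        stored = self.seen.get(claim.node_id)
        if stored is None:
            # First message with this ID: simply store it.
            self.seen[claim.node_id] = claim
            return "stored"
        if stored.location != claim.location:
            # Same ID, different location: the two claims are the proof of cloning.
            return "clone"
        return "duplicate"

store = ClaimStore(max_age=10.0)
first = store.process(Claim(1, (0.1, 0.2), 100.0), now=101.0)
second = store.process(Claim(1, (0.8, 0.9), 100.5), now=101.0)
```

The first claim for ID 1 is stored, and the second one, with the same ID and a different location, is reported as a clone, at which point the revocation procedure would start.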
Here is an example of a run of the protocol. Assume that the adversary clones identity ID_a and assigns this identity to nodes a and a′. These two nodes are placed in two different network locations, l1 and l2, respectively. During an X-RED iteration, the nodes a and a′ have to broadcast the same ID but different location claims (l1 and l2). Both a and a′ start sending the location information <ID_a, l1> and <ID_a, l2>, respectively, to their neighbors in a randomly selected direction. Each neighbor then dynamically computes the diameter. All the nodes within that diameter receive this information, but a node on or near the boundary is considered the witness node (w). The same procedure is repeated, and at the same time a′ also executes the same protocol. The same w receives the claims from a and a′, finds the clone and triggers the revocation procedure.

EXPERIMENTAL ANALYSIS
In this section, we show that X-RED meets the following requirements: witness selection is unaware of ID and location information, the storage overhead is low and the clone detection probability is high. Each node computes the direction randomly, and each node stores only the ID and location information of its direct neighbors. Only the witness node can forward the encrypted message to the next level of nodes, so the storage overhead is low. The time sent with the encrypted message proves the freshness of the message. In every run the comparison is performed with a set of neighbor nodes, so the detection probability is very high.
We further compare X-RED with RED and LSM and show that X-RED outperforms both RED and LSM in several ways. The X-RED protocol is simulated in NS2. In the following simulations, we fixed n = 1,000 nodes in the network and initially set the communication radius to 0.1 (Bettstetter, 2002; Di Pietro et al., 2004). To test the protocols, we assume that there are two nodes with the same ID in the network.
The message is transmitted from both the original source sensor node a and the clone node a'. The witness node can forward the encrypted message to the next node, which is selected randomly in a direction.
The probability that a particular node becomes a witness node is P_witness = 1/m, where m is the number of nodes for which l ≤ d ≤ l + ε (ε is a small value), l is the randomly calculated diameter and d is the distance between the neighbor and the witness.
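As a worked illustration of this expression, the sketch below counts the nodes whose distance d from the neighbor satisfies l ≤ d ≤ l + ε and returns 1/m. The coordinates, the values of l and ε and the function name are made up for the example.

```python
import math

def witness_probability(neighbor, positions, l, eps):
    """P_witness = 1/m, with m the number of nodes at distance d from the
    neighbor such that l <= d <= l + eps. Returns 0.0 if no node qualifies."""
    m = sum(1 for pos in positions if l <= math.dist(neighbor, pos) <= l + eps)
    return 1 / m if m else 0.0

# Hypothetical positions: two of the four nodes lie in the ring [0.5, 0.55].
positions = [(0.5, 0.0), (0.0, 0.52), (0.3, 0.0), (1.0, 1.0)]
p_witness = witness_probability((0.0, 0.0), positions, l=0.5, eps=0.05)
```

With two candidate nodes in the ring, each of them becomes the witness with probability 1/2.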
Table 2 shows the overheads of message transmission and signature checking. Table 3 shows the communication cost and detection probability of the various protocols. Figure 3 shows the number of messages stored by each node in X-RED, LSM and RED. The X-axis represents the number of messages stored by the sensor nodes and the Y-axis represents the percentage of nodes that store that number of messages. The graph is obtained by plotting the values taken from the results of more than 1000 simulations. Note that for LSM (Cho et al., 2013), some nodes could be required to store as many as 200 messages. Our experiments show that in LSM, 1.9% of the nodes store some 60 messages, 7.6% store 40 to 59 messages, 27.5% store between 20 and 39 messages and 63% store fewer than 20 messages. In RED, only a very small number of nodes store more than 10 messages (Conti et al., 2011). In X-RED, only a few nodes need to store more than 5 messages, which is lower than in RED (0.001 percent). The number of sensor nodes that store location claim messages is very small: in the proposed protocol only the witness nodes store claims, and in every iteration the farthest neighbor in the selected direction is selected as the witness.
Figure 4 shows the detection probability on the Y-axis and the iterations on the X-axis. The graph is plotted for about 200 iterations. The values were taken from the results obtained for more than 50 network topologies. Each single deployment was evaluated for the X-RED, LSM and RED protocols. For all the iterations, the X-RED protocol shows a higher probability of detecting clones than RED and LSM. From the 1st to the 50th iteration, LSM shows a detection probability of about 35%, while this probability is 84% for the RED protocol and about 85% for X-RED.
As the number of iterations increases, it takes longer to find the clone node and so the detection probability gradually decreases. Compared to LSM there is a large increase in detection probability; compared to RED the difference is slight, but X-RED remains the more efficient protocol during all iterations.

Analysis of Network with Malicious Nodes
Here we analyze the replica detection probability over a number of consecutive iterations. We assume that the adversary has cloned a node and already controls a set of nodes. There is no mechanism for preventing packet dropping, so a malicious node that becomes a witness node will stop forwarding claim messages.
In the RED protocol, the probability that at least one malicious node is present in the two paths is given by Equation 1. In X-RED, the claim message is sent from both a and a′ to one neighbor node and then to the witness node. If there are l nodes on each path, the two paths together contain 2l nodes. The probability that at least one malicious node is present in the two paths is given by Equation 2, where n is the number of sensor nodes and l is the number of nodes on the path.
Except for the two source nodes (original and clone), any other node can be malicious.
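To give a feel for how such a probability behaves, the sketch below computes one plausible form under explicitly assumed conditions. It is not the paper's Equation 2. The assumptions are that c nodes are malicious (c is a symbol introduced here, not defined in the text), that they are spread uniformly at random among the n − 2 nodes other than a and a′, and that the 2l path nodes are distinct. Under these assumptions the probability that the paths contain no malicious node is C(n − 2 − c, 2l) / C(n − 2, 2l).

```python
import math

def p_malicious_on_paths(n: int, c: int, l: int) -> float:
    """Probability that at least one of the 2l path nodes is malicious.

    Assumed model (not taken from the paper): c malicious nodes placed
    uniformly at random among the n - 2 nodes other than a and a'.
    """
    candidates = n - 2  # every node except the original and the clone
    if not 0 <= c <= candidates or not 0 <= 2 * l <= candidates:
        raise ValueError("need 0 <= c <= n - 2 and 0 <= 2l <= n - 2")
    # Chance that all 2l path nodes are drawn from the non-malicious nodes.
    all_clean = math.comb(candidates - c, 2 * l) / math.comb(candidates, 2 * l)
    return 1.0 - all_clean

# Hypothetical setting: 1,000 nodes, 20 of them malicious, 3 nodes per path.
p = p_malicious_on_paths(n=1000, c=20, l=3)
```

Under this model the probability grows with both the number of malicious nodes and the path length, which is consistent with the motivation for keeping the paths from the source to the witness short.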

CONCLUSION
In this study, three protocols, namely LSM, RED and X-RED, were discussed for detecting clone attacks. In LSM the detection probability is very low; RED shows an enormous improvement, and in X-RED the detection probability is 88%. The proposed X-RED protocol is the major contribution of this research; this study uses it to detect node replication attacks and analyzes the performance of all three protocols. During simulation, the detection probability and communication overhead were calculated once in every five iterations and plotted in the graphs.
The extensive simulation results show that the X-RED protocol achieves a higher detection probability than the existing protocols discussed in the literature. The storage overhead is evenly distributed among the nodes. The encrypted message is not broadcast to all other nodes deployed in the network. Only very few nodes need to store the messages, so the communication overhead is reduced. The main advantage of the protocol is that, in every iteration, it dynamically computes the direction of the neighbor node, computes with a hash function the diameter of the area in which all nodes receive the claim information, and finds the farthest neighbor. There is no static assumption for the witness node. This study applies to static wireless sensor networks and can be extended to mobile wireless sensor networks in the future.