Accelerating Digital Forensics through Parallel Computing

Digital crimes in the era of big data and cloud computing impose significant challenges on digital forensics. Cloud environments provide low-cost, easily managed and reasonable solutions. Moreover, they support big data structures and solutions (i.e., security, privacy and digital forensics). To achieve secure digital forensic analysis in cloud environments, researchers have proposed solutions with expensive communication costs and computation overheads. Among these solutions, Nasreldin et al. proposed a protocol that solves the problem of authenticity and integrity of evidence using a signcryption technique, which leads to low communication and implementation overheads. Furthermore, identity-based cryptography is used to solve Public Key Infrastructure (PKI) problems. In addition, the protocol can divide the message into small messages, which makes it suitable for pipelining techniques. Nasreldin et al.'s signcryption protocol is based on Elliptic Curve Cryptography (ECC), which is implemented using different mathematical operations. In this protocol, the ECC mathematical operations take a large amount of time during the execution of the algorithm. ECC consists of point doubling and point addition operations, which require the execution of many time-consuming Montgomery modular multiplications. In this study, we introduce a technique to speed up ECC operations in order to enhance the efficiency of Nasreldin et al.'s protocol. In particular, we propose a multi-stage parallel design that consists of three stages. First, we speed up the point doubling and point addition operations. Second, we reduce the execution time of Montgomery multiplications. Finally, pipelining is used to obtain better performance. The results show that the proposed design improves the execution time of Nasreldin et al.'s protocol by 47.1, 64.7, 73.5 and 79.4%, assuming that the number of nodes is 2, 4, 6 and 12, respectively.


Introduction
Big data and cloud computing are hot topics that shape the future of both academia and industry. It is hard to acquire, handle, manage and process big data sets using legacy methods. Hence, big data requires optimal processing power and analytics capabilities. Cloud computing, on the other hand, offers a class of distributed data storage and processing platforms that provide on-demand, scalable and easy-to-use online resources in a cost-effective way. Its extensive adoption raises security and privacy concerns. Cloud environments afford criminals countless chances to misuse these new technologies by initiating attacks, capturing impeaching evidence and cracking encryption keys. The distributed computing power of big data in cloud environments makes it more difficult for digital investigators to acquire evidence for digital forensics purposes. Moreover, the amount of data generated through evidence acquisition is huge and complex and needs efficient analysis approaches that can deal with its velocity and variety. Another problem arises during evidence collection, when the cloud administrator sends the required data to the investigator. Therefore, it is crucial to protect the privacy of both uninvolved users and [...] The first level is based on parallelizing the operations required to perform the ECC point doubling and point addition, while the second one is used to reduce the execution time of Montgomery multiplications. Finally, pipelining is used to get better performance. The results show that the proposed design improves the execution time by 47.05, 69.12, 79.41, 86.03, 86.23 and 91.176% at the sender side, assuming that the number of processors/nodes 'M' = 2, 4, 6, 12, 18 and 36, respectively. Moreover, at the receiver side, the degree of improvement is 47.1, 64.7, 73.5 and 79.4%, assuming that the number of nodes 'M' = 2, 4, 6 and 12, respectively.
The remainder of this paper is organized as follows. In the next section, we review digital forensics in cloud computing, Elliptic Curve Cryptography (ECC) and Nasreldin et al.'s protocol. Then, the proposed parallel design of Nasreldin et al.'s protocol and its performance evaluation are presented. Finally, the conclusions are provided.

Digital Forensics in Cloud Computing
Digital forensics is a particular form of auditing that has emerged in recent years to fight cybercrime (Fernandes et al., 2014). The development of this field has been motivated by the interest of organizations in audit tasks. Its objective is to determine potential digital evidence by means of analysis techniques. When applied to clouds, digital forensics faces a complex scenario because data is pushed further back into the network and servers and is more spread out across them, rather than residing purely on a physical computing device. Forensics also faces data locality issues, which make it hard to isolate particular resources. Zawoad et al. (2015; 2016; Zawoad and Hasan, 2013) proposed solutions that are based on identifying the properties required to support trustworthy forensics in the cloud. They proposed a Forensics Enabled Cloud (FECloud) architecture to maintain and provide the required evidence. Unfortunately, they do not solve the problem of authenticity and integrity of evidence.
Criminal investigations need to have the following characteristics: protecting the privacy of involved users and keeping the administrator away from the investigation process. Hou et al. (2011; 2013a; 2013b) proposed several solutions that are based on administrator cooperation. Although the administrator is responsible for protecting the data collection, he/she is not allowed to disclose this data. The drawback of this solution is that the administrator cannot judge the relevance of data to the crimes under investigation. In addition, there is no guarantee that the data has not been altered or that it comes from the server (authenticity and integrity problems). To solve this problem, Hou et al. (2013b) proposed an "encryption-then-blind signature with designated verifier" algorithm, which allows the administrator to search, retrieve and send the relevant data to the investigator in a secure manner. Nasreldin et al. (2015a) show that the scheme of Hou et al. (2013b) does not preserve its claimed integrity and authenticity. The common approach to achieving both evidence confidentiality and authenticity is to sign the evidence and encrypt it with its signature. The sender would sign the evidence using a digital signature scheme and then encrypt it with an appropriate encryption algorithm. The signature would use a private key encryption algorithm, under a randomly chosen message encryption key. The random evidence encryption key would then be encrypted using the recipient's public key. These are the "sign-then-encrypt" or "encrypt-then-sign" techniques. Encrypt-then-sign is subject to the plaintext-subsection and text stealing attacks. The sign-then-encrypt composition suffers from a forwarding attack (Zheng and Imai, 1998). To mitigate these security breaches, Sign-Encrypt-Sign and Encrypt-Sign-Encrypt techniques are used. However, both techniques suffer from computation, implementation and communication overheads.
The term signcryption was originally introduced and studied by Zheng (1997) with the primary goal of reaching greater efficiency than can be accomplished when performing the signature and encryption operations separately. Although some security arguments were proposed, most of the work on signcryption (Zheng, 1997) lacked formal definitions and analysis. Moreover, signcryption schemes must achieve non-repudiation, which guarantees that the sender of a message cannot later deny having sent it. Namely, the recipient of a message can convince a third party that the sender indeed sent the message. It is worth noting that typical signature schemes provide non-repudiation, since anyone who knows only the sender's public key can verify the signature. This is not the case for signcryption, because the confidentiality property entails that only the recipient can comprehend the contents of a signcrypted message sent to him/her. Nevertheless, it is feasible to accomplish non-repudiation by other means. Signcryption can be applied in place of separate encryption and signing to reduce both communication bandwidth and computational time overheads. Any authentication scheme for big data streams should verify the received packets without assuming the availability of the entire original stream. Zheng (1997) proposed the first signcryption scheme, based on the discrete logarithm problem. It saved about 50% of the computational cost and about 85% of the communication cost compared with the traditional signature-then-encryption scheme, but it does not provide forward secrecy of message confidentiality. Deng and Bao (1998) improved Zheng's scheme so that the judge can verify the signature without the recipient's private key, but a key exchange protocol was required in the verification process.
Zheng and Imai (1998) suggested an ECC-based signcryption scheme that provided all the basic security features and saved about 58% of the computational cost and 40% of the communication cost compared with signature-then-encryption. Because it is based on ECC, the key size used was smaller than in the other schemes. This was one of the advantages of this scheme, but it still lacks forward secrecy (Jung et al., 2001). Hwang et al. (2005b) proposed a signcryption scheme based on elliptic curves that provides forward secrecy and is publicly verifiable. This scheme preserves the confidentiality of previously encrypted messages even if the sender inadvertently divulges his private key, at a cost comparable to the existing schemes. Toorani and Beheshti (2009) suggested a signcryption scheme based on elliptic curves that provides all the security attributes, but this scheme requires a higher computational cost than existing schemes. Singh (2016) proposed a signcryption scheme that provides encrypted message authentication, forward secrecy and public verification. The disadvantage of this scheme is that its computational and communication costs remain comparable to those of existing schemes. Nasreldin et al. (2015b) proposed an identity-based signcryption protocol to reduce the computation, communication and implementation overheads of evidence collection in cloud forensics. Their proposed protocol is more efficient than all the previously presented protocols. It allows the receiver (verifier) to restore the message blocks upon receiving their corresponding signature blocks. In addition, it suits some application requirements and fits packet-switched networks. This protocol has two stages of verification to ensure that the message has been recovered efficiently and correctly. The first verification step ensures the integrity and authenticity of the message (e.g., no modification or substitution in the ciphertext 'r_i').
The second verification step ensures that the i-th message is reconstructed successfully. This stage is useful for public verification if a dispute takes place. It guarantees that Nasreldin et al.'s protocol satisfies the non-repudiation property. Nasreldin et al. (2015b) show that the security of their protocol is based on the intractability of reversing the secure cryptographic hash function and of the Elliptic Curve Discrete Logarithm (ECDL) problem. Moreover, they analyzed the security of their protocol in terms of authenticity, unforgeability, confidentiality, non-repudiation and forward secrecy. As mentioned previously, the signcryption protocols (Zheng, 1997; Zheng and Imai, 1998; Deng and Bao, 1998; Jung et al., 2001; Han et al., 2004; Hwang et al., 2005b; Yuan and Hung, 2008; Toorani and Beheshti, 2009; Mohapatra, 2010; Ashraf et al., 2015; Nasreldin et al., 2015b; Singh, 2016) are based on ECC, which is described in detail in the next subsection.

Elliptic Curve Cryptography (ECC)
Public key cryptography achieves evidence confidentiality, authenticity, non-repudiation and integrity. ECC is a better choice than RSA as it provides the same security level with shorter keys. Over the last decade, ECC has gained increasing acceptance in industry and the academic community and has been the subject of several standards. This interest is mainly due to the high level of security with relatively small keys, low cost and smaller hardware realization provided by ECC (Hwang et al., 2005a; Meurice de Dormale and Quisquater, 2007; Lo et al., 2010; Li et al., 2013). It was first proposed independently by Koblitz (1987) and Miller (1985). The security of a public key system using elliptic curves is based on the difficulty of computing discrete logarithms in a group of points on an elliptic curve defined over a finite field (FOSIT, 2000). ECC is used in many applications such as smart cards, set-top boxes and low-power portable devices (e.g., cell phones). In all these applications, the main operation in ECC is the scalar multiplication used in authentication and certification (Thomas et al., 2014). The Elliptic Curve Discrete Logarithm Problem (ECDLP) is currently believed to be asymptotically harder than the factorization of integers. ECC provides more security per key bit than other public key standards (Rao et al., 2017; Parmar and Verma, 2017). Table 1 shows the key sizes of AES, ECC and RSA for the same security level. As shown in Table 1, private keys are 12 times larger for RSA than for ECC at the 128-bit security level. ECC can work in GF(2^m) or GF(p): GF(2^m) is suitable for hardware implementation, while GF(p) is suitable for software implementation (Miller, 1985; FOSIT, 2000; Sakthivel and Nedunchezhian, 2014). In our work, we concentrate on parallelizing ECC over GF(p). Cryptographic schemes based on ECC rely on scalar multiplication of elliptic curve points.
Given an integer 'k' and a point "P∈E(F(p))", scalar multiplication is the process of adding 'P' to itself 'k' times. The result of this scalar multiplication is denoted by 'kP'. Scalar multiplication can be computed efficiently using the double-and-add algorithm. The algorithm starts with N = P and R = O, where 'O' represents the point at infinity, and scans the bits of 'k' from the least significant to the most significant, where k-bit-length represents the number of bits of 'k'. For each bit that equals one, it sets R = R + N, and after every bit it doubles N (N = 2N). After k-bit-length iterations, R holds 'kP'. Scalar multiplication is used for the computation of the public key, the signature, encryption and key agreement in the ECC system. The mathematical operations of ECC are defined over the elliptic curve:
$$y^2 \equiv x^3 + ax + b \pmod{p}$$
where a, b ∈ GF(p) and $4a^3 + 27b^2 \not\equiv 0 \pmod{p}$. Changing the parameters 'a' and 'b' gives different elliptic curves (Certicom Corp., 2000a; 2000b; Tawalbeh et al., 2010; Srivastava and Mathur, 2013).
One of the crucial decisions when implementing an efficient ECC over GF(p) is deciding which point coordinate system to use. Tawalbeh et al. (2010) give details of three different coordinate systems. The first one is the affine coordinate system, where a point is represented as (X_A, Y_A). The other two are projective coordinate systems. Table 2 shows a comparison of these three coordinate systems. As shown in the table, the affine coordinate system uses an inversion operation in both point addition and point doubling, which is costly in terms of computation time and makes it an inefficient choice. The other coordinate systems do not use modular inversions in point addition and doubling. As mentioned in (Tawalbeh et al., 2010), the projection (X, Y) where X_A = X/Z^2 and Y_A = Y/Z^3 has the minimum number of modular multiplication operations.
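The double-and-add procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy curve y^2 = x^3 + 2x + 2 over GF(17) with base point (5, 1) is a textbook-sized example chosen here, not a parameter set from the paper, and the affine helper `point_add` is a hypothetical name introduced for the sketch (it uses modular inversion, which a real implementation would avoid by using projective coordinates).

```python
# Toy parameters (assumption for illustration only; far too small for real use):
# curve y^2 = x^3 + 2x + 2 over GF(17), base point G = (5, 1).
P_MOD, A = 17, 2
G = (5, 1)
INF = None  # the point at infinity 'O'


def point_add(p1, p2):
    """Affine point addition; also covers doubling and the point at infinity."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF  # P + (-P) = O
    if p1 == p2:
        # Tangent slope for doubling (requires Python 3.8+ for pow(x, -1, m)).
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        # Chord slope for adding two distinct points.
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, point):
    """Right-to-left double-and-add: returns kP."""
    n_pt = point  # N = P
    r_pt = INF    # R = O (point at infinity)
    for i in range(k.bit_length()):
        if (k >> i) & 1:                # bit i of k is one: R = R + N
            r_pt = point_add(r_pt, n_pt)
        n_pt = point_add(n_pt, n_pt)    # N = 2N after every bit
    return r_pt
```

For an n-bit scalar the loop performs n doublings and one addition per set bit, which is the operation count the later timing analysis relies on.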
For the projective coordinate system (x, y) ⇒ (X/Z^2, Y/Z^3), the addition of two points P + Q and the doubling of a point (P + P) are each computed from the coordinates (X, Y, Z) of the operands without modular inversion; their dataflow graphs are shown in Fig. 2 and 3.
In the next subsection, we give a detailed description of the evidence acquisition protocol of Nasreldin et al. (2015b). Nasreldin et al. (2015b) proposed an identity-based signcryption protocol to solve the problem of authenticity and integrity of collected evidence. Their protocol makes use of identity-based cryptography to overcome the PKI problems mentioned previously. Although this protocol needs a larger number of Elliptic Curve Point Multiplication (ECPM) operations than other protocols (Zheng and Imai, 1998; Han et al., 2004; Hwang et al., 2005b; Toorani and Beheshti, 2009; Mohapatra, 2010; Singh, 2016) (as shown in Fig. 1), it allows the message to be divided into small messages, which makes it suitable for pipelining techniques. Moreover, it allows the recipient to restore the message blocks upon receiving their corresponding signature blocks. It consists of two stages of verification: the first stage ensures the integrity and authenticity of the message and the second stage makes sure that the message is reconstructed successfully. This guarantees that the protocol satisfies the non-repudiation property.
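As an illustration of point addition and doubling in the (X/Z^2, Y/Z^3) projection mentioned above, the following Python sketch uses the widely known Jacobian-coordinate formulas. It is an assumption that these match the form used in the paper; they are shown because they need sixteen modular multiplications per addition and ten per doubling (not counting multiplications by small constants), which agrees with the counts quoted in the text. The toy curve y^2 = x^3 + 2x + 2 over GF(17) and all function names are hypothetical choices for this sketch.

```python
# Toy curve (assumption): y^2 = x^3 + 2x + 2 over GF(17).
P_MOD, A = 17, 2
JAC_INF = (1, 1, 0)  # Z = 0 encodes the point at infinity


def jacobian_double(pt):
    """Doubling in Jacobian coordinates; no modular inversion is needed."""
    x1, y1, z1 = pt
    if z1 == 0 or y1 == 0:
        return JAC_INF
    y_sq = y1 * y1 % P_MOD
    s = 4 * x1 * y_sq % P_MOD
    z_sq = z1 * z1 % P_MOD
    m = (3 * x1 * x1 + A * z_sq * z_sq) % P_MOD
    x3 = (m * m - 2 * s) % P_MOD
    y3 = (m * (s - x3) - 8 * y_sq * y_sq) % P_MOD
    z3 = 2 * y1 * z1 % P_MOD
    return (x3, y3, z3)


def jacobian_add(p, q):
    """Addition in Jacobian coordinates; no modular inversion is needed."""
    x1, y1, z1 = p
    x2, y2, z2 = q
    if z1 == 0:
        return q
    if z2 == 0:
        return p
    z1_sq = z1 * z1 % P_MOD
    z2_sq = z2 * z2 % P_MOD
    u1 = x1 * z2_sq % P_MOD
    u2 = x2 * z1_sq % P_MOD
    s1 = y1 * z2_sq * z2 % P_MOD
    s2 = y2 * z1_sq * z1 % P_MOD
    h = (u2 - u1) % P_MOD
    r = (s2 - s1) % P_MOD
    if h == 0:
        # Same x-coordinate: either the same point (double) or inverses.
        return jacobian_double(p) if r == 0 else JAC_INF
    h_sq = h * h % P_MOD
    h_cu = h_sq * h % P_MOD
    u1_h_sq = u1 * h_sq % P_MOD
    x3 = (r * r - h_cu - 2 * u1_h_sq) % P_MOD
    y3 = (r * (u1_h_sq - x3) - s1 * h_cu) % P_MOD
    z3 = h * z1 * z2 % P_MOD
    return (x3, y3, z3)


def to_affine(pt):
    """Convert back to affine (x, y); the single inversion happens only here."""
    x, y, z = pt
    if z == 0:
        return None
    z_inv = pow(z, -1, P_MOD)  # Python 3.8+
    z_inv_sq = z_inv * z_inv % P_MOD
    return (x * z_inv_sq % P_MOD, y * z_inv_sq * z_inv % P_MOD)
```

The design point is that the costly inversion is deferred to one final `to_affine` call instead of being paid in every addition and doubling.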

Nasreldin et al.'s Evidence Acquisition Protocol
In order to perform Nasreldin et al.'s protocol, the following steps must be performed.

Setup
The Private Key Generation center (PKG) chooses a Gap Diffie-Hellman group 'G_1' of prime order 'q', a multiplicative group 'G_2' of the same order and a bilinear map "e: G_1 × G_1 → G_2", together with an arbitrary generator P ∈ G_1. Then it chooses a random value "s ∈ Z_q^*" as the master secret key and computes the corresponding public key "P_pub = sP". 'H_1' and 'H_2' are two secure cryptographic hash functions, such that "H_1: {0, 1}* → G_1" and "H_2: {0, 1}* → Z_q^*". The system parameters are (G_1, G_2, P, P_pub, H_1, H_2, e, q) and the master secret key is 's'.

KeyExtract
Given an identity ID, the PKG computes "S_ID = sH_1(ID)" and sends it to the user with identity ID. Nasreldin et al.'s protocol defines 'Q_ID' as the public key of the user with identity ID. In addition, it assumes that the sender 'A' (with secret key 'S_A' and public key 'Q_A') wants to send a message 'Mess' to the receiver 'B' (with public key 'Q_B' and secret key 'S_B'). The sender divides the stream into blocks 'Mess_i', i = 1, …, n.

Signcrypt Operation (Sender Side)
The sender 'A' chooses a random number "k ∈ Z_q^*" and lets r_0 = 0. The sender-side steps (summarized in Fig. 4) must be completed before the signcrypted message is sent. 'A' then sends (S, α, γ, θ, r_1, …, r_n) to 'B' over a non-secure channel.

Unsigncrypt Operation (Receiver Side)
Upon receiving the message, the receiver verifies the signature by comparing 'α' with "Mess_i · H_2(r_{i-1} ⊕ e(P, Q_B)^k)". If they are not equal, the received packets have been altered and must be ignored. If they are equal, the receiver retrieves the message blocks Mess_i = r_i [H_2(r_{i-1} ⊕ [e(S, θ) · e(S_B, Q_A)])]^{-1}. Lastly, the recipient checks the correctness of the message reconstruction by comparing 'γ' with H_2(Mess_1, …, Mess_n, α, e(S, γ) · e(P_pub, Q_A)) · P. For public verification, the receiver 'B' only needs to make (Mess, S, α, γ, θ) public. Any verifier can then check the message authenticity by comparing 'γ' with H_2(Mess_1, …, Mess_n, α, e(S, γ) · e(P_pub, Q_A)) · P.
In the next section, a parallel implementation of Nasreldin et al.'s (2015b) protocol is presented.

The Proposed Parallel Design of Nasreldin et al.'s Protocol
Nasreldin et al.'s protocol is based on ECC, which is characterized by different mathematical point operations that take a large amount of time during execution. Among these operations, the time complexity of ECPM is higher than that of any other point operation on an elliptic curve (Tawalbeh et al., 2010). Therefore, the implementation of ECPM can be accelerated by parallel computation to improve the performance of ECC. To accelerate Nasreldin et al.'s protocol, a multi-level parallel model is presented. Our proposed design consists of three levels: the first level computes the different point doubling and point addition operations of each ECPM operation in parallel, while the second one reduces the execution time of Montgomery multiplications. Finally, different message blocks are pipelined.

Parallel Elliptic Curve Cryptography
Parallelizing ECC algorithms is a promising approach to reducing their computation time. Several research studies in the literature address parallelizing ECC over the prime field GF(p). These solutions fall into two categories. The first is based on parallelizing the different point operations (Srivastava and Mathur, 2013; Anagreh et al., 2014; Chung et al., 2012; Gutub et al., 2007). The other research direction is based on partitioning the Montgomery modular multiplication (Fan et al., 2008; Guillermin, 2010). In this study, a hybrid parallel solution that makes use of the advantages of both categories is proposed. First, the different operations of each ECPM (which consists of point doubling and point addition operations) are computed in parallel. Then, the Montgomery modular multiplication operations are executed in parallel in order to reduce the execution time.
As mentioned in a previous section, the projection (X, Y) where X_A = X/Z^2 and Y_A = Y/Z^3 has the minimum number of modular multiplication operations. The dataflow graphs for point addition and point doubling are shown in Fig. 2 and 3, respectively. As mentioned in Table 2, each point addition operation needs sixteen modular multiplications and six modular additions. On the other hand, each doubling operation needs ten modular multiplications and four modular additions. Let 'T_M' be the time needed to execute one modular multiplication operation and 'T_A' the time needed to compute one modular addition operation (for simplicity, we assume that a modular subtraction takes the same time as a modular addition). Then, the total execution time needed to execute each point addition operation, 'T_S-add', is given by:
$$T_{S\text{-}add} = 16\,T_M + 6\,T_A$$
On the other hand, the total execution time 'T_S-doub' that is needed to compute each point doubling operation is given by:
$$T_{S\text{-}doub} = 10\,T_M + 4\,T_A$$
As mentioned in (Miller, 1985), field multiplication is the basic elliptic curve operation used in computing the point 'kP' from 'P'. Let 'n' be the number of bits of 'k'; 'n' gives the exact number of point doublings, but not of point additions. Assuming that half of the bits of 'k' are ones and half are zeros (an average estimate for comparison purposes), the elliptic curve arithmetic operations required are 'n' point doublings and approximately 'n/2' point additions. Then, the total sequential time of the elliptic curve point multiplication operation, 'T_S-ECPM', is calculated as follows:
$$T_{S\text{-}ECPM} = n\,T_{S\text{-}doub} + \frac{n}{2}\,T_{S\text{-}add} = n\,(18\,T_M + 7\,T_A)$$
For the first level of parallelization, the different point doubling and point addition operations of each ECPM operation are computed in parallel. As shown in Fig. 2 and 3, there are some dependencies in calculating both point doubling and point addition. Therefore, the maximum number of nodes that can be used to execute each ECPM is four.
Both point addition and point doubling operations require the execution of many Montgomery multiplications, which consume time. This led us to propose the next level of our parallel model, which is concerned with reducing the execution time of Montgomery multiplications. Each modular multiplication operation can be represented by three simple multiplication operations and one simple addition operation (Großschädl, 2000); therefore, each modular multiplication operation can be executed in parallel. The optimal number of nodes to execute one modular multiplication is three. This level of parallelism enhances the ECC performance, since it solves the problem of load imbalance (Elkabbany et al., 2014). Then, to achieve load balancing, each ECPM operation can be computed by at most twelve nodes (four nodes for the point operations, each using three nodes for its modular multiplications).
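To make the "three simple multiplications and one simple addition" decomposition concrete, the sketch below shows the textbook Montgomery reduction (REDC) form of a modular multiplication in Python. It is an assumption that this matches the formulation cited from Großschädl (2000); the function names and the small modulus used in the usage note are hypothetical.

```python
def montgomery_n_prime(modulus, r_bits):
    """Precompute n' = -modulus^(-1) mod R, where R = 2**r_bits and modulus is odd."""
    r = 1 << r_bits
    return (-pow(modulus, -1, r)) % r  # Python 3.8+


def montgomery_multiply(a_bar, b_bar, modulus, n_prime, r_bits):
    """Return a_bar * b_bar * R^(-1) mod modulus for inputs below modulus < R."""
    mask = (1 << r_bits) - 1
    t = a_bar * b_bar                   # simple multiplication 1
    m = ((t & mask) * n_prime) & mask   # simple multiplication 2 (mod R is a mask)
    u = (t + m * modulus) >> r_bits     # simple multiplication 3 and the addition
    return u - modulus if u >= modulus else u


def modular_multiply(a, b, modulus, r_bits):
    """Compute a * b mod modulus by going through the Montgomery domain."""
    n_prime = montgomery_n_prime(modulus, r_bits)
    r = 1 << r_bits
    a_bar = (a * r) % modulus           # convert operands to Montgomery form
    b_bar = (b * r) % modulus
    prod_bar = montgomery_multiply(a_bar, b_bar, modulus, n_prime, r_bits)
    # Multiplying by 1 strips the remaining factor R and leaves a * b mod modulus.
    return montgomery_multiply(prod_bar, 1, modulus, n_prime, r_bits)
```

For example, `modular_multiply(45, 76, 97, 8)` agrees with `(45 * 76) % 97`. In a long chain of multiplications the operands stay in Montgomery form, so the conversion cost in `modular_multiply` is paid only once at each end.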
Let 't_m' be the time needed to compute a simple multiplication operation and 't_a' the time needed to compute a simple addition operation. Then, the time of a modular multiplication operation can be calculated as:
$$T_M = 3\,t_m + t_a$$
In addition, a modular addition can be calculated as the sum of an addition and a modulo operation. Using the Barret algorithm (Barret, 1987), the modulo operation needs one simple multiplication, one simple division and one simple subtraction. Then, the time needed to compute a modular addition operation can be calculated as follows:
$$T_A = t_a + t_m + t_{div} + t_s$$
where 't_div' is the time needed to compute one simple division and 't_s' is the time needed to compute one simple subtraction. Since the addition operation needs considerably less time than the multiplication operation, it can be neglected. Assuming that "t_s = t_a" and "t_div = t_m", the time needed to execute each modular multiplication is '3t_m' and the time needed to execute one modular addition is '2t_m'. Then, from Equation 12 to 14, the sequential time for each ECPM, 'T_S-ECPM', is calculated as follows:
$$T_{S\text{-}ECPM} = n\,(18 \cdot 3\,t_m + 7 \cdot 2\,t_m) = 68\,n\,t_m$$
Due to the nature of Nasreldin et al.'s (2015b) protocol, the proposed parallel design assumes that the data stream is divided into 'N' messages, which can be executed in a pipelined manner. For simplicity, we assume that the number of pipeline stages equals the number of steps to be executed and that the output is shifted from step 'i' to step 'i + 1' for all steps. As mentioned previously, Nasreldin et al.'s (2015b) protocol requires four ECPM operations in the case of signcryption and only one ECPM in the case of unsigncryption. Figure 4 presents the different steps at both the sender and receiver sides. From this figure, we can notice that, at the sender side, parallelization can be done within Step 1, which has one ECPM. In addition, Steps 4, 5 and 6 can be done in parallel (each has one ECPM). At the receiver side, parallelization can be done only at Step 3, which has only one ECPM. Since the maximum number of nodes that can be used for each ECPM is twelve, thirty-six nodes are needed for the signcryption operation (at the sender side), while only twelve nodes are needed at the receiver side.
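The sequential cost model above can be checked with a few lines of Python. The sketch expresses all times in units of 't_m', using the operation counts from Table 2 as quoted in the text and the simplifications T_M = 3t_m and T_A = 2t_m; the function and constant names are hypothetical, and the 256-bit scalar length in the usage note is only an example value.

```python
from fractions import Fraction

# Operation counts quoted in the text (modular multiplications, modular additions).
MULS_PER_ADD, ADDS_PER_ADD = 16, 6   # one point addition
MULS_PER_DBL, ADDS_PER_DBL = 10, 4   # one point doubling

# Simplified unit costs in multiples of t_m: T_M = 3 t_m, T_A = 2 t_m.
T_M, T_A = 3, 2


def sequential_ecpm_cost(n_bits):
    """Sequential ECPM time in units of t_m: n doublings plus n/2 additions."""
    point_add_cost = MULS_PER_ADD * T_M + ADDS_PER_ADD * T_A   # 60 t_m
    point_dbl_cost = MULS_PER_DBL * T_M + ADDS_PER_DBL * T_A   # 38 t_m
    return n_bits * point_dbl_cost + Fraction(n_bits, 2) * point_add_cost


def sequential_protocol_cost(n_bits, ecpm_count):
    """Cost of one message block when only the ECPM operations are counted."""
    return ecpm_count * sequential_ecpm_cost(n_bits)
```

With these assumptions, `sequential_ecpm_cost(n)` evaluates to 68n for any n, so a 256-bit scalar costs 68 × 256 = 17408 units of t_m per ECPM, and a sender that performs four ECPM operations per block pays four times that amount.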
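To simplify the calculations, the time needed for the Add, Sub, Mul, Div and Hash operations is neglected, as it is very small compared to the time required for the ECPM operations. Since signcryption requires four ECPM operations and unsigncryption requires one, it follows from Equation 15, with T_M = 3t_m and T_A = 2t_m, that the total sequential times at the sender and receiver sides, 'T_s-sender' and 'T_s-receiver', for each message block can be calculated as:
$$T_{s\text{-}sender} = 4\,T_{S\text{-}ECPM} = 4n\,(18\,T_M + 7\,T_A) = 272\,n\,t_m$$
$$T_{s\text{-}receiver} = T_{S\text{-}ECPM} = n\,(18\,T_M + 7\,T_A) = 68\,n\,t_m$$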

Results and Discussion
To evaluate the performance of the proposed parallel model, different metrics are used: execution time, speedup, efficiency and the degree of improvement (Borisenko, 2010; Zaghloul et al., 2017). The parallel execution time 'Tpar' is defined as the period between the start of the parallel computation and the moment at which the last processor/node finishes execution. The speedup is defined as the ratio between the sequential and parallel times, "Ts/Tpar". The degree of improvement is determined by "(Ts-Tpar)/Ts". Table 3 illustrates the parallel execution time for both point addition and point doubling operations in each ECPM operation. To simplify the calculations, we neglect the communication time, as it is small compared to the time required to compute the modular operations. Table 4 presents the parallel time of each ECPM operation, 'T_ECPM-par', for different numbers of nodes 'M' = 2, 4, 6 and 12. Finally, Table 5 shows the total parallel execution time of the proposed parallel design of Nasreldin et al.'s protocol at both sender and receiver sides for 'M' = 1, 2, 4, 6, 12, 18 and 36. Fig. 5 and 6 present the system performance (execution time, speedup, efficiency and degree of improvement) at the sender and the receiver, respectively.
Table 4: Parallel time of each ECPM operation, T_ECPM-par = (T_a-par)*n/2 + (T_d-par)*n
M = 2: T_ECPM-par = n*(9T_M + 4.5T_A) = 36(n t_m)
M = 4: T_ECPM-par = n*(5T_M + 4.5T_A) = 24(n t_m)
M = 6: T_ECPM-par = n*(9t_m + 4.5T_A) = 18(n t_m)
M = 12: T_ECPM-par = n*(5t_m + 4.5T_A) = 14(n t_m)
As shown in the above tables and figures, the use of a parallel system significantly decreases the execution time of Nasreldin et al.'s protocol. Figure 5a and 6a show that, as the number of nodes increases, the total parallel execution time decreases. Moreover, as the number of nodes increases, the speedup increases, as shown in Fig. 5b and 6b. Figure 5c and 6c present the efficiency of the proposed parallel design. Parallel efficiency is the ratio between the speedup and the number of nodes. It estimates how well the nodes are used in solving the problem. These figures illustrate an overall decrease in the parallel efficiency achieved by the parallel model as the number of nodes increases. Figure 5d and 6d describe the improvement of the proposed parallel design compared to the performance prior to parallelization. As shown in these figures, as the number of nodes increases, the degree of improvement increases. The degree of improvement at the sender side is 47.05, 69.12, 79.41, 86.03, 86.23 and 91.176%, assuming that 'M' = 2, 4, 6, 12, 18 and 36, respectively. Moreover, at the receiver side, the degree of improvement is 47.1, 64.7, 73.5 and 79.4% for 2, 4, 6 and 12 nodes, respectively. Increasing the number of processors/nodes decreases the system's efficiency. Therefore, the number of nodes must not exceed a certain number, which is called the system's saturation point. As shown in Fig. 5 and 6, saturation occurs when the number of processors equals 36 and 12 at the sender and receiver sides, respectively.
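The receiver-side figures can be reproduced from the parallel ECPM times with a short Python sketch. The sequential time of 68 (in units of n·t_m) is an assumption derived from the operation counts and the simplifications T_M = 3t_m and T_A = 2t_m given earlier, and the parallel times 36, 24, 18 and 14 are the Table 4 values; the names below are hypothetical.

```python
# Times in units of n * t_m for one ECPM, which is the receiver's only ECPM.
T_SEQUENTIAL = 68                          # derived: n * (18 T_M + 7 T_A)
T_PARALLEL = {2: 36, 4: 24, 6: 18, 12: 14}  # Table 4 values, keyed by node count M


def performance_metrics(t_seq, t_par, nodes):
    """Return (speedup, efficiency, improvement in percent) as defined in the text."""
    speedup = t_seq / t_par                        # Ts / Tpar
    efficiency = speedup / nodes                   # speedup per node
    improvement = 100.0 * (t_seq - t_par) / t_seq  # (Ts - Tpar) / Ts
    return speedup, efficiency, improvement


def improvement_table():
    """Degree of improvement for each node count, rounded to one decimal place."""
    return {
        nodes: round(performance_metrics(T_SEQUENTIAL, t_par, nodes)[2], 1)
        for nodes, t_par in T_PARALLEL.items()
    }
```

Calling `improvement_table()` gives 47.1, 64.7, 73.5 and 79.4% for M = 2, 4, 6 and 12, matching the receiver-side values reported above, while the efficiency returned by `performance_metrics` falls as M grows, which is the trend described for Fig. 5c and 6c.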

Conclusion
Nasreldin et al. proposed a protocol for securing digital evidence collection in cloud environments. This protocol solves the problem of authenticity and integrity of evidence with the following characteristics: it has low communication and implementation overheads. Furthermore, it makes use of identity-based cryptography to solve PKI problems such as high storage cost, large bandwidth requirements, non-transparency to users and the need for CRLs. In addition, it allows the message to be divided into small messages, which makes it suitable for pipelining techniques. In this study, a multi-level parallelism model is presented in order to accelerate Nasreldin et al.'s protocol. In their protocol, ECC mathematical operations take a large amount of time during execution. ECC is implemented using a set of point operations, among which the time complexity of ECPM is higher than that of any other point operation on an elliptic curve. Therefore, the implementation of ECPM can be accelerated by parallel computation to improve the performance of ECC. Since ECPM is the most time-consuming operation and requires many Montgomery multiplications, speeding up those multiplications reduces its execution time.
Our design consists of three levels of parallelization: the first level computes the different point doubling and point addition operations in parallel, while the second one reduces the execution time of Montgomery multiplications. Finally, different message blocks are pipelined to get better performance. The analysis shows that the use of a parallel system enhances the protocol's performance. The results show that the maximum number of nodes that can be used for each ECPM is twelve. Thus, thirty-six nodes are needed for the signcryption operation (at the sender side), while only twelve nodes are needed for the unsigncryption operation (at the receiver side). At the sender side, the degree of improvement of the proposed parallel design compared to the performance prior to parallelization is 47.05, 69.12, 79.41, 86.03, 86.23 and 91.176%, assuming that 'M' = 2, 4, 6, 12, 18 and 36, respectively. At the receiver side, the degree of improvement is 47.1, 64.7, 73.5 and 79.4%, assuming that the number of nodes 'M' = 2, 4, 6 and 12, respectively.

Author's Contributions
The author prepared the study, elaborated the methodology, performed the analysis and wrote the manuscript.

Ethics
This article is original and contains unpublished material. The corresponding author confirms that no ethical issues are involved.