ERROR DETECTION SCHEMES FOR FINITE FIELD MULTIPLIERS

Finite field multipliers are widely used in cryptography, in particular for scalar multiplication. The outputs of finite field multipliers may contain errors caused by natural radiation, which can in turn lead to failure of the cryptosystem. Two Concurrent Error Detection (CED) schemes for finite field multipliers are discussed here: a time redundancy based scheme and a modular inversion based scheme. The CED techniques have been implemented for bit serial, digit serial and bit parallel Montgomery multipliers. Simulation results are obtained using Modelsim 10.0b, and area and power analysis has been performed using Xilinx ISE 9.1i. The proposed modular inversion based CED scheme is found to be area and power efficient compared to the existing time redundancy based CED scheme.


INTRODUCTION
Among the basic finite field operations, multiplication has received great attention in the literature (Ghosh et al., 2011). This is mainly because a multiplier is much more complex to implement than an adder, and because repeated multiplication can be used to perform more difficult field operations such as inversion and exponentiation, which are widely used in cryptosystems. A finite field, popularly known as a Galois Field (GF), is denoted GF(p^n), where p is a prime number and n is the dimension of the field over GF(p). When the prime number is 2, the elements of the field are expressed as binary numbers. GF(2) extended to GF(2^m) is termed a binary extension field. Since no carry propagation occurs in GF(2^m), the addition of two single bits requires only a logical XOR operation.
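As a small illustration of carry-free addition, the following sketch stores an element of GF(2^m) as a Python integer whose bit i holds the coefficient of x^i. The function name gf2m_add and this bit-vector encoding are assumptions made for the example only.

```python
def gf2m_add(a: int, b: int) -> int:
    """Add two GF(2^m) elements stored as bit vectors (bit i = coefficient of x^i)."""
    # Coefficients are added modulo 2 and no carry moves between positions,
    # so the whole addition is a bitwise XOR.
    return a ^ b


# (x^3 + x + 1) + (x^3 + x^2) = x^2 + x + 1
total = gf2m_add(0b1011, 0b1100)
print(bin(total))  # 0b111
```

Because every element is its own additive inverse in this representation, subtraction is the same XOR operation.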
Finite fields are used in a variety of applications, including classical coding theory in linear block codes such as Reed Solomon codes, and in cryptographic algorithms (MacWilliams and Sloane, 1998). Cryptography is the practice and study of techniques for secure data communication in the presence of third parties. Elliptic Curve Cryptography (ECC) (Miller, 1998) is an approach to public key cryptography based on the algebraic structure of elliptic curves over finite fields. This cryptographic method is regarded as mature enough to provide robust, secure data transactions. ECC has therefore become an attractive alternative cryptosystem, and many designs have been proposed in recent years (Sakiyama et al., 2007; Chumg et al., 2005; Gura et al., 2002; Blake et al., 2005; Biham and Shamir, 1997; Boneh et al., 1997). The Montgomery multiplication algorithm is used to enhance scalar multiplication in ECC (Montgomery, 1985).
CED is a process used to detect errors in a cryptosystem while the system is performing its data transmission operation (Mitra and McCluskey, 2000; Reyhani-Masoleh and Hasan, 2006; Hariri and Reyhani-Masoleh, 2007; Bayat-Sarmadi and Hasan, 2007). Because fault injection and active attacks are used against cryptosystems, it is very important to increase the reliability of elliptic curve based cryptosystems and, in particular, of their main arithmetic operation, multiplication. The presence of a fault in a cryptosystem can lead to an active attack that results in leakage of secret information. The simplest way to prevent such an attack is to ensure that the computational device, here the multiplier, verifies the values it computes before sending them out. A concurrent error detection scheme is one option for mitigating logic errors in this way. The design of efficient multipliers with CED capability is therefore desirable for highly reliable, dedicated cryptographic hardware (Hariri and Reyhani-Masoleh, 2011).
Science Publications AJAS
Finite field multipliers use the Montgomery multiplication algorithm in bit serial, digit serial and bit parallel forms (Ananyi et al., 2009; Koc and Acar, 1998; Fan and Dai, 2005; Hariri and Reyhani-Masoleh, 2008). Finite field elements are represented using three basis representations, namely polynomial basis, normal basis and dual basis. Polynomial basis has been found suitable for error detection because conversion from polynomial basis to binary is quite simple. The bit parallel systolic finite field multiplier over polynomial basis has been implemented for irreducible polynomial, all-one polynomial and irreducible trinomial (Sargunam et al., 2012a). The speed of the bit parallel systolic finite field multiplier over polynomial basis has been improved using a unique technique (Sargunam et al., 2012b). In Reyhani-Masoleh and Hasan (2003), a parity prediction based technique was implemented for a polynomial basis multiplier. The major drawback of this technique is that it detects only the existence of an error and does not specify the exact error bit position in the output of the multiplier. In this study two error detection schemes are discussed: the time redundancy and the modular inversion based error detection techniques.
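For orientation, bit serial Montgomery multiplication in GF(2^m) can be sketched in software along the lines of the bit-level algorithm of Koc and Acar (1998), which returns A(x)*B(x)*x^-m mod P(x). The function names, the integer bit-vector encoding and the example field GF(2^4) with P(x) = x^4 + x + 1 are assumptions chosen for illustration; this is not the hardware architecture evaluated in this study.

```python
def mont_mul_bit_serial(a: int, b: int, p_poly: int, m: int) -> int:
    """Return a(x) * b(x) * x^(-m) mod p_poly(x) over GF(2), one bit of a per iteration."""
    c = 0
    for i in range(m):
        if (a >> i) & 1:   # add a_i * B(x)
            c ^= b
        if c & 1:          # add P(x) so that C(x) becomes divisible by x
            c ^= p_poly
        c >>= 1            # exact division by x
    return c


def gf2m_mul(a: int, b: int, p_poly: int, m: int) -> int:
    """Plain shift-and-add product a(x) * b(x) mod p_poly(x), used only as a reference."""
    r = 0
    for i in range(m):
        if (b >> i) & 1:
            r ^= a
        a <<= 1
        if (a >> m) & 1:   # reduce as soon as the degree reaches m
            a ^= p_poly
    return r


M, P = 4, 0b10011          # hypothetical example field: GF(2^4), P(x) = x^4 + x + 1
X_M = P ^ (1 << M)         # x^m mod P(x)
c = mont_mul_bit_serial(0b0110, 0b1011, P, M)
# Multiplying the Montgomery result by x^m recovers the ordinary product.
print(gf2m_mul(c, X_M, P, M) == gf2m_mul(0b0110, 0b1011, P, M))  # True
```

The reduction step relies on P(x) having a constant term of 1, which holds for any irreducible polynomial of degree greater than 1.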

TIME REDUNDANCY TECHNIQUE
Fault attacks are common against cryptographic algorithms. CED is one of the countermeasures used to protect crypto-processors against such attacks. In this section, we discuss CED circuits for bit-serial, digit-serial and bit-parallel Montgomery multipliers, which can be used as a countermeasure against natural faults and fault attacks in cryptography.

Time Redundancy Approach
The architecture using time redundancy can avoid the potential security problem caused by side-channel attacks. All single cell faults in the multiplier are detected concurrently. Moreover, this multiplier requires little space overhead and only a few extra clock cycles. The technique is applied to bit serial, digit serial and bit parallel multipliers. The block diagram for the time redundancy approach is shown in Fig. 1. The latches store the data, and the 2-to-1 Mux is a 2 by 1 multiplexer that selects one of the inputs.
CED using the time redundancy technique operates in the following two steps.
In the first step, the inputs A(x) and B(x) are applied to the Montgomery multiplier array, and the result C(x) is converted by the *x^m circuit to C'(x) and stored in latches. The dataflow of this first step is shown in bold lines in Fig. 2.
Fig. 1. CED using time redundancy

In the second step, the inputs A(x) and B(x) are first applied to their respective *x^m circuits to obtain A'(x) and B'(x), which are then applied to the Montgomery multiplier array. The resulting C'(x) is compared with the C'(x) previously stored in the latches.
The function unit *x^m realizes the function Q'(x) = Q(x)*x^m mod P(x), where Q(x) and Q'(x) are the input and output of the *x^m circuit, respectively. There is a one-to-one correspondence between Q(x) and Q'(x) in residue representation. The dataflow of the second step is shown in bold lines in Fig. 3. The C'(x) values obtained from step 1 and step 2 are compared using an equality checker, which produces the error signal. If the outputs of the two steps are equal, no error signal is generated; otherwise the error signal is generated to indicate the error. Errors are detected by examining the error signal at the output of the equality checker. The exact error bit position is also detected by this method.
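The two building blocks described above, the *x^m function unit and the equality checker, can be illustrated with a short software sketch. The function names, the integer bit-vector encoding and the example field GF(2^4) with P(x) = x^4 + x + 1 are assumptions made for this example; the sketch models only these two blocks, not the complete dataflow of Fig. 2 and Fig. 3.

```python
def times_x_pow_m(q: int, p_poly: int, m: int) -> int:
    """Return Q'(x) = Q(x) * x^m mod P(x) over GF(2)."""
    for _ in range(m):
        q <<= 1                # multiply by x
        if (q >> m) & 1:       # degree reached m, so reduce by P(x)
            q ^= p_poly
    return q


def equality_checker(c1: int, c2: int, m: int):
    """Compare two m-bit words; return (error signal, list of differing bit positions)."""
    diff = c1 ^ c2
    positions = [i for i in range(m) if (diff >> i) & 1]
    return diff != 0, positions


M, P = 4, 0b10011              # hypothetical example field GF(2^4)

# One-to-one correspondence: the 16 inputs map onto 16 distinct outputs.
images = {times_x_pow_m(q, P, M) for q in range(1 << M)}
print(len(images))             # 16

# A word compared with a copy whose bit 1 has been flipped.
print(equality_checker(0b1010, 0b1000, M))  # (True, [1])
```

Since the mapping is one-to-one, two different values of Q(x) can never produce the same Q'(x), so a comparison made after the *x^m conversion does not hide a difference that existed before it.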

MODULAR INVERSION TECHNIQUE
The parity prediction technique was found to fail to detect the exact bit positions of errors in the multiplier output, and it was not efficient at detecting online errors occurring in cryptosystems. In (13) a time redundancy scheme was developed for the purpose of CED using modular multiplication. There are two important performance criteria in VLSI implementation, namely power and area, and a trade-off may exist between them. Both parameters can be optimized in the finite field multiplier architecture to reduce power consumption and area. The time redundancy scheme was found to have high power and area utilization. To attain a power and area efficient CED scheme, a modular inversion algorithm has been used.

Modular Inversion
The multiplicative inverse of an element a ∈ F is the element a^-1 ∈ F such that a*a^-1 = 1 mod P(x). Several algorithms to compute the multiplicative inverse in GF(2^m) have been proposed in the literature. Here the inverse is computed using an improved modification of the extended Euclidean algorithm called the modular inversion algorithm. The modular multiplicative inverse a^-1 (mod p) of an integer 'a' exists if and only if 'a' and 'p' are relatively prime, that is, gcd(a, p) = 1. In all cases considered, p is prime, so every 'a' that is not a multiple of 'p' is relatively prime to 'p'. The following is the modular inversion algorithm that has been incorporated in the CED scheme.

Algorithm
Inputs: Operand a, prime p
Output: a^-1 mod p
Step 1: u = a, v = p, x1 = 1, x2 = 0
Step 2: while u ≠ 1 and v ≠ 1 do
  Step 2.1: while u even do
    Step 2.1.1: u = u/2
    Step 2.1.2: if x1 even then x1 = x1/2 else x1 = (x1 + p)/2
  Step 2.2: while v even do
    Step 2.2.1: v = v/2
    Step 2.2.2: if x2 even then x2 = x2/2 else x2 = (x2 + p)/2
  Step 2.3: if u ≥ v then u = u - v, x1 = x1 - x2 else v = v - u, x2 = x2 - x1
Step 3: if u = 1 then return x1 (mod p) else return x2 (mod p)
Step 2 of the algorithm runs iteratively and proceeds towards the goal. In every iteration of this step either 'u' or 'v' is reduced by at least one bit length. The total number of iterations in step 2 is at most 2k, where k is the maximum bit length of 'p' and 'a'.
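The steps of the algorithm translate almost line for line into software. The following is a minimal sketch, assuming that p is an odd prime and that a is not a multiple of p; the function name mod_inverse and the input checks are additions made for the example and are not part of the hardware design.

```python
def mod_inverse(a: int, p: int) -> int:
    """Binary modular inversion: return a^-1 mod p for an odd prime p."""
    if p < 3 or p % 2 == 0:
        raise ValueError("p is assumed to be an odd prime")
    a %= p
    if a == 0:
        raise ValueError("a must not be a multiple of p")

    # Step 1. Invariants: a*x1 = u (mod p) and a*x2 = v (mod p).
    u, v, x1, x2 = a, p, 1, 0

    # Step 2
    while u != 1 and v != 1:
        while u % 2 == 0:                                      # Step 2.1
            u //= 2
            x1 = x1 // 2 if x1 % 2 == 0 else (x1 + p) // 2
        while v % 2 == 0:                                      # Step 2.2
            v //= 2
            x2 = x2 // 2 if x2 % 2 == 0 else (x2 + p) // 2
        if u >= v:                                             # Step 2.3
            u, x1 = u - v, x1 - x2
        else:
            v, x2 = v - u, x2 - x1

    # Step 3: x1 or x2 may be negative here, so reduce modulo p.
    return x1 % p if u == 1 else x2 % p


print(mod_inverse(3, 7))  # 5, since 3 * 5 = 15 = 1 mod 7
```

Only halving, addition, subtraction and comparison are used, which is why this form of the extended Euclidean algorithm suits hardware: no general division or multiplication is needed.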

Error Detection Method
To obtain an efficient CED scheme for detecting errors in the output of finite field multipliers, the modular inversion algorithm has been incorporated into the error detection scheme. This technique has been found to have better power and area efficiency than the time redundancy scheme. The block diagram for the modular inversion technique is shown in Fig. 4. The modular inversion technique is also performed in two steps.

Fig. 4. Modular inversion based error detection scheme
The multiplication array block performs bit serial, digit serial or bit parallel multiplication in the finite field. The 2-to-1 Mux block selects one of the inputs for multiplication based on the select signal 'S'. Error detection is performed with this block diagram while multiplying two inputs A(x) and B(x). Where the time redundancy technique uses modular multiplication to detect errors, this technique uses modular inversion. This technique can also detect the exact error bit position, and it can detect multiple errors.
The data flow for the CED scheme using modular inversion is explained in two steps as follows. During the first step the two inputs A(x) and B(x) are multiplied using one of the Montgomery multiplication algorithms (bit serial, digit serial or bit parallel). The output C(x) of the Montgomery multiplication array is then taken as input to the modular inversion block, where the inversion algorithm is performed and the output C'(x) is generated. The blocks used and the data flow during this first step are shown in Fig. 5. During the second step the two inputs A(x) and B(x) are individually inverted using the modular inversion algorithm to form A'(x) and B'(x). The inverted outputs are taken into the Montgomery multiplication array and multiplied using the same Montgomery multiplication algorithm. The output of the Montgomery multiplication array in this step is C'(x). The blocks used for this step and the data flow are shown in Fig. 6.
The outputs C'(x) of steps 1 and 2 are compared in the equality checker. If the outputs of the two steps differ, the error signal is generated, as shown in Fig. 7. The existence of an error and the error bit positions can be identified by examining the output of the equality checker.
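The two-step comparison can be illustrated with integers modulo a small prime. This is a simplified sketch under stated assumptions: a plain modular product stands in for the Montgomery multiplication array (so the Montgomery scaling factor is ignored), Fermat's little theorem stands in for the inversion block, and fault_mask is a hypothetical way of flipping bits at the multiplier output in step 1. For a plain modular product the check rests on the identity (a*b)^-1 = a^-1 * b^-1 mod p.

```python
def inv(x: int, p: int) -> int:
    """Stand-in for the inversion block: x^(p-2) mod p equals x^-1 for prime p."""
    return pow(x, p - 2, p)


def ced_modular_inversion(a: int, b: int, p: int, fault_mask: int = 0):
    """Return (error signal, differing bit positions) for the two-step check."""
    # Step 1: multiply, optionally corrupt the product, then invert it.
    c = (a * b) % p
    c ^= fault_mask                   # hypothetical fault at the multiplier output
    c1 = inv(c, p)

    # Step 2: invert the operands first, then multiply.
    c2 = (inv(a, p) * inv(b, p)) % p

    # Equality checker: XOR the two words and report where they differ.
    diff = c1 ^ c2
    positions = [i for i in range(p.bit_length()) if (diff >> i) & 1]
    return diff != 0, positions


print(ced_modular_inversion(5, 7, 13))                # (False, [])
print(ced_modular_inversion(5, 7, 13, fault_mask=1))  # (True, [1, 2])
```

In the faulty run the product 9 is corrupted to 8, so step 1 yields 8^-1 = 5 while step 2 yields 3. The reported positions are the bits in which these two compared words differ.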

IMPLEMENTATION RESULTS
The algorithms for the time redundancy and the modular inversion error detection techniques have been coded in VHDL and simulated using the Mentor Graphics front end (Modelsim 10.0b). The implementation is done using Xilinx ISE 9.1i, from which area and power reports are obtained. The bit serial, digit serial and bit parallel Montgomery multipliers are coded, and the time redundancy and modular inversion techniques are applied to all the multiplier types. Figure 10 shows the simulation result for error detection in the bit serial multiplier using the time redundancy technique. Figure 11 shows the simulation result for error detection in the bit serial multiplier using the modular inversion based error detection scheme.

CONCLUSION
CED schemes are used to detect online errors in applications such as cryptography. The time redundancy and modular inversion based CED schemes have been applied to the three types (bit-serial, digit-serial and bit-parallel) of finite field multipliers using the Montgomery multiplication algorithm. The proposed CED scheme using the modular inversion technique is found to be area and power efficient when compared to the time redundancy technique.