Concealed Weapon Detection using Wavelet, Dual Tree Complex Wavelet Transform and Curvelet Transform

Abstract: In the field of military and law enforcement, detection and visualization of concealed weapons is a common practice. The infrared and visible cameras providing the information i


Introduction
Infrared and visible sensors acquire complementary information about the scene under consideration. Infrared sensors capture the thermal radiation emitted by metal objects. The images generated by infrared sensors have low spatial resolution and poor visual quality. On the other hand, the visible sensor acquires a high-resolution image that shows the characteristic features, details and overall shape of the objects in the scene (Ma et al., 2016; Li et al., 1995). Image fusion is the process of representing the entire state of an object or view by integrating complementary, multi-sensor information. Infrared and visible image fusion has been extensively employed in military and domestic surveillance, biometrics and concealed weapon detection. Accurately combining the data acquired from visible and infrared sensors enables efficient target detection and visualization and increases situational awareness. Interest in concealed weapon detection can be attributed to its common deployment in law enforcement and in ensuring the safety of public places (Ghahremani and Ghassemian, 2015). Together, the visible and infrared sensors rapidly acquire complementary information about the target under consideration. However, the images acquired by thermal sensors are low in quality and contain a significant amount of noise, which makes their visual interpretation very difficult. Progress in digital signal processing and mathematical operations has made possible the design of a plethora of image fusion algorithms (Dogra et al., 2017a; Goyal et al., 2018).
The most fundamental requirement of an image fusion system is that the maximum amount of information be transferred to the fused image with a minimal amount of noise and distortion. Therefore, the source images have to be represented efficiently in order to obtain a high-quality fused image. With this in mind, we propose fusing multi-sensor infrared and visible images for concealed weapon detection using the Discrete Curvelet Transform, the Discrete Wavelet Transform (DWT) and the Dual Tree Complex Wavelet Transform (DTCWT).
The rest of the paper is organized as follows: Section 2 describes the fusion methods. Section 3 gives the results and discussion. The manuscript is concluded in Section 4.

Fusion Algorithms
In this manuscript, an algorithm for a concealed weapon detection application is developed. The input source images taken for analysis are specific to this manuscript. This type of source data set is extremely difficult to work with owing to its poor quality, low spatial resolution and extensive noise and distortion. The infrared image captures the electromagnetic radiation emitted by the gun in the scene. However, the characteristic features, details and overall appearance of the scene are indiscernible in the infrared image. On the other hand, the weapon (gun) is concealed in the visible image, but the group of people and the shapes and contours in several parts of the image are clearly visible.
Therefore, in order to combine this complementary information in a single image, we propose to fuse the two images using the discrete wavelet transform, the curvelet transform and the DTCWT.

II (a). Fusion with Wavelet Transform
First, the co-registered source data set, i.e., the infrared and visible images, is taken as input.
In wavelet analysis, the image pixels are mapped into the transform domain, where a linear expansion in wavelet coefficients enables the analysis of image features at different scales. The wavelet transform can be understood as sliding a see-through window, the basis function, over the entire range of the signal (Nunez et al., 1999; Li and Yang, 2008).
Since wavelets are able to capture sharp spikes and discontinuities in a signal, they were found to be better representational tools than the Fourier transform. Mathematically, the wavelet transform of a function $f(t)$ is computed as its inner product with a family of wavelets (Li et al., 1995; Nunez et al., 1999; http://www.numericaltours.com/matlab/wavelet_3_daubechies1d/); in the 2-D case the same operation is applied separably along the rows and columns of the image:

$$W_{s,p} = \langle f, \psi_{s,p} \rangle$$

where $\psi_{s,p}$ is the scaled and shifted version of the mother wavelet $\psi$, written in the standard dyadic form as:

$$\psi_{s,p}(t) = 2^{-s/2}\,\psi\left(2^{-s}t - p\right)$$

and $W_{s,p}$ are the wavelet coefficients obtained at scale $s$ and position $p$:

$$W_{s,p} = \int_{-\infty}^{\infty} f(t)\,\psi_{s,p}(t)\,dt$$

For reconstruction with an orthonormal wavelet basis:

$$f(t) = \sum_{s}\sum_{p} W_{s,p}\,\psi_{s,p}(t)$$

Here, both $s, p \in \mathbb{Z}$. These functions are implemented using high-pass and low-pass filters to obtain the multi-scale decomposition of the subject.
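The analysis and reconstruction steps can be illustrated with a minimal sketch. It assumes the Haar wavelet, chosen here only because its inner products reduce to scaled sums and differences of neighbouring samples; it is not the filter used in the experiments, and the function names `haar_analysis` and `haar_synthesis` are hypothetical.

```python
import math

def haar_analysis(signal):
    """One-level orthonormal Haar analysis.

    Each coefficient is the inner product of the signal with a shifted
    Haar scaling function (approximation) or Haar wavelet (detail).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    r = 1.0 / math.sqrt(2.0)
    half = len(signal) // 2
    approx = [r * (signal[2 * k] + signal[2 * k + 1]) for k in range(half)]
    detail = [r * (signal[2 * k] - signal[2 * k + 1]) for k in range(half)]
    return approx, detail

def haar_synthesis(approx, detail):
    """Rebuild the signal as a weighted sum of the basis functions."""
    r = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append(r * (a + d))  # even-indexed sample
        out.append(r * (a - d))  # odd-indexed sample
    return out

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 0.0, 2.0]
approx, detail = haar_analysis(signal)
restored = haar_synthesis(approx, detail)
```

The detail coefficient is zero wherever two neighbouring samples are equal and large where the signal jumps, which is the sense in which wavelets localize discontinuities.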

The Daubechies Wavelet Family
The Daubechies (db) family of wavelets, written dbN where N is the number of vanishing moments, is one of the most widely used families of wavelets. These are extremal-phase orthogonal wavelets implemented using finite impulse response (FIR) filters. Following the basic definition (dbN), they are written db1, db2 and so on, where N typically ranges from 1 to 45. The db1 wavelet, also known as the Haar wavelet, is the simplest member of the family, with one vanishing moment and two filter coefficients. Higher members of the family have scaling signals and wavelets with larger support, and this is the main source of the differences between them. The db wavelets can therefore be regarded as extensions of the Haar wavelet that use longer filters to produce comparatively smoother scaling functions. The filter size is directly proportional to the number of vanishing moments N:

$$s = 2N$$

where s is the support size. Thus N = 1 and s = 2 corresponds to the Haar wavelet. N = 2 and s = 4 corresponds to the popular four-tap Daubechies wavelet (db2 in the dbN notation, also written D4 or db4 when named after its filter length), used in the compression of linear signals. N = 3 and s = 6 corresponds to the six-tap wavelet (db3 in the dbN notation, also written D6 or db6), used in the compression of quadratic signals (http://www.numericaltours.com/matlab/wavelet_3_daubechies1d/; Kessler et al., 2003; Misiti et al., 1996).
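The relation between filter length and vanishing moments can be checked numerically for the four-tap Daubechies filter (N = 2, s = 4). The sketch below uses the well-known closed-form coefficients of that filter; the helper names `daubechies_four_tap` and `discrete_moment` are hypothetical and serve only this illustration.

```python
import math

def daubechies_four_tap():
    """Return (low_pass, high_pass) taps of the four-tap Daubechies filter."""
    s3 = math.sqrt(3.0)
    norm = 4.0 * math.sqrt(2.0)
    low = [(1 + s3) / norm, (3 + s3) / norm, (3 - s3) / norm, (1 - s3) / norm]
    # Quadrature mirror relation: g[k] = (-1)^k * h[L - 1 - k]
    high = [((-1) ** k) * low[len(low) - 1 - k] for k in range(len(low))]
    return low, high

def discrete_moment(taps, order):
    """Discrete moment sum_k k^order * taps[k]."""
    return sum((k ** order) * c for k, c in enumerate(taps))

low, high = daubechies_four_tap()
vanishing_moments = 2                     # N
support_size = len(low)                   # s, expected to equal 2 * N
moment0 = discrete_moment(high, 0)        # vanishes: constants are annihilated
moment1 = discrete_moment(high, 1)        # vanishes: linear ramps are annihilated
moment2 = discrete_moment(high, 2)        # does not vanish for N = 2
unit_energy = sum(c * c for c in low)     # orthonormality: equals 1
shift_orthogonality = low[0] * low[2] + low[1] * low[3]  # equals 0
```

The two vanishing moments mean that the high-pass branch outputs zero on constant and linear segments, which is why this filter suits piecewise linear signals.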

Why Daubechies?
Because of the orthogonality of these wavelets, the energy of the coefficients obtained is equal to the energy of the signal, and they are also well suited to feature and texture analysis. Pixel intensities are represented efficiently owing to the use of overlapping windows. In an image processing context, Daubechies wavelets average over a larger number of pixels and are therefore smoother than Haar wavelets. Moreover, other families of wavelets such as coiflets are also derived from Daubechies wavelets, with some differences such as higher computational overhead and larger overlapping windows (http://www.numericaltours.com/matlab/wavelet_3_daubechies1d/).
In the experiments performed, the input images are decomposed using the Discrete Wavelet Transform (DWT) with Daubechies wavelets at 4 decomposition levels. The DWT is a multi-resolution analysis based transform, which provides a joint representation of the image in space and scale. Fig. 1 shows the db4 scaling function and the db4 wavelet waveform. The DWT is a mathematical processing tool built from dilation and translation factors, which cover the entire image at different scales in a sliding-window manner. The DWT breaks down the image into an approximation through low-pass filtering and into detail data through high-pass filtering. At each decomposition level, the wavelet transform splits the image into an approximation level and a detail level. The approximation level is further decomposed with the help of filter banks in order to obtain the representation at the next scale (Nunez et al., 1999).
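The recursive structure, in which only the approximation is decomposed again at each level, can be sketched in one dimension as follows. The Haar filter is an assumption made to keep the example short (the experiments use Daubechies wavelets), and the names `haar_step` and `multilevel_decompose` are hypothetical.

```python
import math

def haar_step(values):
    """Split a sequence into low-pass (approximation) and high-pass (detail) halves."""
    r = 1.0 / math.sqrt(2.0)
    half = len(values) // 2
    approx = [r * (values[2 * k] + values[2 * k + 1]) for k in range(half)]
    detail = [r * (values[2 * k] - values[2 * k + 1]) for k in range(half)]
    return approx, detail

def multilevel_decompose(signal, levels):
    """Repeatedly decompose the approximation; keep the detail of every level."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2 != 0:
            raise ValueError("length must stay even at every level")
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

# One row of a 256-pixel-wide image, decomposed over 4 levels.
row = [float(i % 16) for i in range(256)]
approx, details = multilevel_decompose(row, levels=4)
detail_sizes = [len(d) for d in details]
```

With a 256-sample row and 4 levels, the detail bands have 128, 64, 32 and 16 samples and the final approximation has 16 samples, so the total number of coefficients equals the number of input samples.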
The source images are decomposed using the DWT. The decomposed wavelet coefficients of the two source images are combined using the averaging fusion rule. The fused set of coefficients so obtained is inverse transformed to obtain the final fused image.
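The decompose, average and invert pipeline can be sketched on toy data. The sketch assumes a one-level separable Haar transform instead of the four-level Daubechies decomposition used in the experiments, and the 4×4 arrays are made-up stand-ins for the infrared and visible images; all function names are hypothetical.

```python
import math

R = 1.0 / math.sqrt(2.0)

def _split(vec):
    """1-D Haar analysis: (low-pass, high-pass) halves."""
    half = len(vec) // 2
    low = [R * (vec[2 * k] + vec[2 * k + 1]) for k in range(half)]
    high = [R * (vec[2 * k] - vec[2 * k + 1]) for k in range(half)]
    return low, high

def _merge(low, high):
    """1-D Haar synthesis, inverse of _split."""
    out = []
    for a, d in zip(low, high):
        out.extend((R * (a + d), R * (a - d)))
    return out

def _transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def dwt2(image):
    """One-level 2-D transform: filter rows, then columns. Keys: LL, LH, HL, HH."""
    row_parts = {"L": [], "H": []}
    for row in image:
        low, high = _split(row)
        row_parts["L"].append(low)
        row_parts["H"].append(high)
    bands = {}
    for name, part in row_parts.items():
        col_low, col_high = [], []
        for col in _transpose(part):
            low, high = _split(col)
            col_low.append(low)
            col_high.append(high)
        bands[name + "L"] = _transpose(col_low)
        bands[name + "H"] = _transpose(col_high)
    return bands

def idwt2(bands):
    """Inverse of dwt2: undo the column filtering, then the row filtering."""
    halves = []
    for name in ("L", "H"):
        cols = [_merge(low, high) for low, high in
                zip(_transpose(bands[name + "L"]), _transpose(bands[name + "H"]))]
        halves.append(_transpose(cols))
    return [_merge(low, high) for low, high in zip(halves[0], halves[1])]

def fuse_average(bands_a, bands_b):
    """Averaging fusion rule applied to every sub-band."""
    return {key: [[0.5 * (x + y) for x, y in zip(ra, rb)]
                  for ra, rb in zip(bands_a[key], bands_b[key])]
            for key in bands_a}

infrared = [[0.0, 0.0, 0.0, 0.0], [0.0, 9.0, 9.0, 0.0],
            [0.0, 9.0, 9.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
visible = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0],
           [8.0, 7.0, 6.0, 5.0], [4.0, 3.0, 2.0, 1.0]]
fused = idwt2(fuse_average(dwt2(infrared), dwt2(visible)))
```

Because the transform is linear, averaging every sub-band of a single linear transform reproduces the pixel-wise mean of the two inputs; the sketch is meant to show the data flow rather than to reproduce the reported results.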

II (b). Fusion with DTCWT
The DTCWT was first proposed by Nick Kingsbury (Kingsbury, 2006) to address the severe shift variance and the lack of directional selectivity of the conventional wavelet transform. The DTCWT analysis is similar to conventional DWT analysis; the difference is that the DTCWT uses complex wavelets to decompose the signal with two trees, one producing the real part and the other the imaginary part of the coefficients. However, purely real filters are used in each tree. When applied to an image, the rows and the columns are decomposed separately, using two trees each for the rows and the columns. This results in a "quad-tree" structure. Pairs of complex coefficients are then obtained by combining the quad-tree coefficients using simple linear sum and difference operators. These operations yield six directionally selective sub-bands at each level of decomposition. The reason behind this directional selectivity is that complex filters are able to separate the positive and negative frequency components of a 1-D signal, which in turn separates adjacent quadrants of the 2-D spectrum, something real filters are unable to do. Moreover, the DTCWT has only limited redundancy and is computationally efficient. Fig. 2 shows the 1-level decomposition of the visible image, in which six different sub-bands are obtained.
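The shift-variance problem that motivates the DTCWT can be illustrated with a small sketch. It does not implement the DTCWT; it only shows, under the assumption of a one-level critically sampled Haar transform, how the detail energy of a step edge changes when the input is shifted by a single sample. The helper names are hypothetical.

```python
import math

def haar_detail(signal):
    """Detail coefficients of a one-level, critically sampled Haar transform."""
    r = 1.0 / math.sqrt(2.0)
    return [r * (signal[2 * k] - signal[2 * k + 1]) for k in range(len(signal) // 2)]

def energy(coefficients):
    return sum(v * v for v in coefficients)

step = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]   # edge aligned with the sampling grid
shifted = [0.0] + step[:-1]                        # same edge, one sample later

energy_aligned = energy(haar_detail(step))     # the edge falls between two pairs
energy_shifted = energy(haar_detail(shifted))  # the edge falls inside one pair
```

The aligned edge produces no detail energy at all, while the shifted edge produces a detail energy of about 0.5, although the two inputs differ only by a one-sample translation. In fusion, such variation shows up as artefacts around edges.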
For the fusion to be carried out, the sub-bands of the visible image are fused with the corresponding sub-bands of the IR image, along with the original decomposed image, using an appropriate fusion rule. The fused coefficients are then used to reconstruct the fused image via the inverse DTCWT.

II (c). Fusion with Curvelet Transform
An image has various features such as edges, textures and lines. Edges, i.e., sharp changes in pixel intensity, constitute high-frequency information, whereas regions of constant pixel intensity constitute low-frequency information. This information is extracted using banks of high-pass and low-pass filters, a process known as multi-scale decomposition. However, according to the literature, the traditional wavelets that were earlier a popular choice for decomposition fail to represent abrupt changes in intensity along curves and contours efficiently. In other words, they are inefficient in representing the geometric properties of the subject under consideration and hence lack directionality.
Later on, the Complex Wavelet Transform (CWT) was introduced to improve directional selectivity but, as the name indicates, it is built from complex wavelets, which are very hard to design, and consequently its reconstruction is not perfect. This, in turn, also increases the computational complexity. The Dual-Tree Complex Wavelet Transform (DTCWT) came as a solution to these problems, but its directionality is still limited (Kingsbury, 2006). To surmount such limitations, the Ridgelet and Ripplet transforms, based on the parabolic scaling law, were proposed (http://www.numericaltours.com/matlab/wavelet_3_daubechies1d/). The Curvelet transform is based on a multi-scale representation of the Ridgelet transform combined with spatial band-pass filtering to separate different scales. As redundancy is reduced across the scales, sparsity is restored, which was not the case with over-complete dictionaries of transforms such as ridgelets. Curvelets operate at varied scales and orientations and provide an anisotropic directional representation (Candès and Donoho, 1999).
The Curvelet transform is based on the parabolic scaling law, and recent literature shows that thresholding of curvelet coefficients gives a nearly optimal representation of smooth objects with edges along $C^2$ curves (Starck et al., 2002; http://www.curvelet.org/; Candès and Donoho, 2005).

Mathematical Description
Since the curvelet transform is also called the block ridgelet transform, we first discuss the ridgelet transform. Assume a univariate function $\phi: \mathbb{R} \to \mathbb{R}$, with Fourier transform $\hat{\phi}$, which satisfies the admissibility condition:

$$\int \frac{|\hat{\phi}(\xi)|^{2}}{|\xi|^{2}}\,d\xi < \infty$$

so this function should have zero mean, i.e., $\int \phi(t)\,dt = 0$. Assuming that the function is normalized such that $\int |\hat{\phi}(\xi)|^{2}\,\xi^{-2}\,d\xi = 1$, then for each scale $s > 0$, shift $t \in (-\infty, \infty)$ and orientation $\vartheta \in [0, 2\pi)$ we define:

$$\phi_{s,t,\vartheta}(y) = s^{-1/2}\,\phi\!\left(\frac{y_1\cos\vartheta + y_2\sin\vartheta - t}{s}\right)$$

which is a ridgelet function defined on $\mathbb{R}^2$, i.e., $\phi_{s,t,\vartheta}: \mathbb{R}^2 \to \mathbb{R}$.
Hence, the ridgelet coefficients of a function $f(y)$ can be calculated as:

$$R_f(s,t,\vartheta) = \int \phi_{s,t,\vartheta}(y)\,f(y)\,dy$$

and, similarly, the function $f(y)$ can be reconstructed as:

$$f(y) = \int_{0}^{2\pi}\int_{-\infty}^{\infty}\int_{0}^{\infty} R_f(s,t,\vartheta)\,\phi_{s,t,\vartheta}(y)\,\frac{ds}{s^{3}}\,dt\,\frac{d\vartheta}{4\pi}$$

To obtain a curvelet family, polar coordinates in the frequency domain are considered first, and locally supported curvelet elements are then constructed near the wedges, which are the tiled areas of the frequency domain. The number of wedges at the scale $2^{-i}$ can be written as:

$$N_i = 4\cdot 2^{\lceil i/2 \rceil}$$

where $i$ indexes the circular rings and $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$. To construct a basic curvelet, let $\rho = (\rho_1, \rho_2)$ be the variable in the frequency domain and $(r, \theta)$ its polar coordinates, so that the basic curvelet at scale $i$ (denoted here by $\hat{\gamma}_{i,0,0}$) is:

$$\hat{\gamma}_{i,0,0}(r,\theta) = 2^{-3i/4}\,W\!\left(2^{-i}r\right)\tilde{V}_{N_i}(\theta)$$

where $r \geq 0$, $\theta \in [0, 2\pi)$ and $i \in \mathbb{N}_0$; $\tilde{V}_{N_i}(\theta)$ is the periodic angular window function and $W$ is the radial window function. Different window functions can be used for $W$, for example Meyer-type windows.
Once the curvelet family has been designed, it is used to decompose the object according to the steps shown and described in Fig. 3. The object is broken down into sub-bands using band-pass filters $\Delta_\omega$ concentrated near the frequencies $[2^{2\omega}, 2^{2\omega+2}]$ (Starck et al., 2002; Candes et al., 2006; Emmanuel and Donoho, 2000; Donoho and Duncan, 2000). Various image fusion strategies involving the use of curvelets are described in (Dong et al., 2015; Sulochana et al., 2015; Li and Yang, 2008; Bhadauria and Dewal, 2013; Bhutada et al., 2011; Nencini et al., 2007).
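The wedge tiling and the parabolic scaling law can be made concrete with a short sketch. It assumes the commonly quoted wedge count of 4·2^⌈i/2⌉ at scale 2^(-i) and the usual idealization that a curvelet at that scale has width about 2^(-i) and length about 2^(-i/2); both function names are hypothetical.

```python
import math

def wedge_count(scale_index):
    """Number of angular wedges at scale 2**(-scale_index) (assumed tiling rule)."""
    return 4 * 2 ** math.ceil(scale_index / 2)

def curvelet_support(scale_index):
    """Idealized (length, width) of a curvelet at the given scale."""
    width = 2.0 ** (-scale_index)
    length = 2.0 ** (-scale_index / 2.0)
    return length, width

table = []
for i in range(5):
    length, width = curvelet_support(i)
    table.append((i, wedge_count(i), length, width))
```

At every scale the width equals the square of the length, which is the parabolic scaling relation, and the number of orientations doubles at every second scale. This is what lets curvelets follow curved edges with increasingly elongated, finely oriented elements.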
In order to obtain a good-quality fused image with improved visual perception, the source images are fused using the curvelet transform. The curvelet coefficients of the two input images are obtained using the curvelet transform and then fused using the averaging fusion rule. The fused coefficients are inverse transformed in order to obtain the fused image. The fusion methodology with the curvelet transform is illustrated in Fig. 4, and Fig. 5 gives the corresponding methodology with the wavelet transform. The methodology for the DTCWT is very similar to that of the wavelet transform, the only difference being that the DTCWT generates 6 sub-bands whereas the wavelet transform generates 3. These sub-bands are fused separately using the averaging fusion rule and the fused image is then reconstructed.

Results and Discussion
The input source data set, i.e., the visible and infrared images, consists of 256×256, 8-bit greyscale images, shown in Fig. 6. The fused results computed with the wavelet, DTCWT and curvelet transforms are given in Fig. 7.
The edge information transferred from the source images to the fused image is measured objectively using the $Q^{XY/F}$ factor, and the computed results are given in Table 1.
The expression for $Q^{XY/F}$ of a fusion process with source images X and Y and fused image F can be given as in (Kumar, 2015; Dogra et al., 2016; Goyal et al., 2017; Dogra et al., 2017b):

$$Q^{XY/F} = \frac{\sum_{a}\sum_{b}\left[Q^{XF}(a,b)\,w^{X}(a,b) + Q^{YF}(a,b)\,w^{Y}(a,b)\right]}{\sum_{a}\sum_{b}\left[w^{X}(a,b) + w^{Y}(a,b)\right]}$$

where $Q^{XF}$ and $Q^{YF}$ are the edge preservation values, weighted by $w^{X}(a,b)$ and $w^{Y}(a,b)$ respectively, and the sums run over all pixel locations $(a,b)$.
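The final aggregation step of this metric is a weighted average and can be sketched as follows. The sketch assumes that the edge preservation maps and the weight maps have already been computed (for instance from gradient magnitude and orientation); that earlier stage is not shown, the 2×2 values are made up for illustration, and the function name `edge_transfer_score` is hypothetical.

```python
def edge_transfer_score(q_xf, q_yf, w_x, w_y):
    """Weighted, normalized sum of edge preservation values over all pixels.

    q_xf, q_yf: edge preservation maps of sources X and Y in the fused image F.
    w_x, w_y:   per-pixel weights of the two sources.
    All arguments are equally sized lists of rows.
    """
    numerator = 0.0
    denominator = 0.0
    for a in range(len(q_xf)):
        for b in range(len(q_xf[a])):
            numerator += q_xf[a][b] * w_x[a][b] + q_yf[a][b] * w_y[a][b]
            denominator += w_x[a][b] + w_y[a][b]
    if denominator == 0.0:
        raise ValueError("weights must not all be zero")
    return numerator / denominator

# Toy 2x2 maps, used only to exercise the formula.
q_xf = [[1.0, 0.5], [0.0, 1.0]]
q_yf = [[0.5, 0.5], [1.0, 0.0]]
w_x = [[2.0, 1.0], [0.0, 1.0]]
w_y = [[1.0, 1.0], [3.0, 0.0]]
score = edge_transfer_score(q_xf, q_yf, w_x, w_y)
```

When both preservation maps lie in [0, 1] and the weights are non-negative, the score also lies in [0, 1]: it equals 1 when every weighted edge is fully preserved and falls as edge information is lost.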
It can be clearly seen from Fig. 7 and Table 1 that fusion with the DTCWT gives a higher amount of information transfer than the curvelet and wavelet transforms. Also, the loss of information with the wavelet transform is higher than with the curvelet transform and the DTCWT. This can be attributed to the fact that the wavelet transform uses down-sampling, which induces the Gibbs phenomenon and results in an increased amount of noise and artefacts. On the other hand, the DTCWT gives a high amount of information transfer, comparable to existing state-of-the-art fusion techniques. As far as the visual assessment is concerned, the fusion results reveal the concealed weapon along with the visual and facial characteristics in the image. Also, the DTCWT gives better overall visual results than the wavelet and curvelet transforms. The DTCWT performs better than the wavelet transform because it accounts for positive and negative frequencies separately and hence provides an increased level of directional sensitivity together with perfect reconstruction. These factors, in turn, generate fusion results with much higher visual quality, in which the "regions of interest" are clearly perceptible, edges are sharp and gradients are well preserved. The property of shift invariance is of vital importance in the present scenario, where the source image is highly contaminated with noise. The wavelet transform generates visual artefacts due to its loss of shift invariance. Table 2 summarises the subjective and objective assessment of the fused images. It shows that in the subjective evaluation the curvelet transform performs the poorest, the wavelet transform gives fairly good results and the DTCWT gives the best performance. As far as the objective evaluation is concerned, the wavelet transform shows the lowest performance, the curvelet transform gives good objective results and the DTCWT gives the highest rate of information transfer.
It is evident from the zoomed region that the wavelet transform induces a lot of noise and artefacts around the gun region; hence it is not suitable for source images with inherent noise. The curvelet transform and the DTCWT produce visually clearer fused images than the wavelet transform. This is because the DTCWT and the curvelet transform are able to decompose the source images efficiently while preserving the edges and contours present in the image, owing to their directional selectivity. Consequently, the amount of information transferred also increases.
The wavelet transform gives a focussed region of interest in the fused image with high contrast, but it induces a certain amount of noise in the fused image due to its shift variance.

Conclusion
In this manuscript, a DTCWT-based image fusion technique has been analysed for concealed weapon detection. The method has been compared with the wavelet transform and the curvelet transform. It can be deduced that the DTCWT and the curvelet transform give competitive image fusion performance for concealed weapon detection. Despite the low spatial resolution of the input source images, the DTCWT and the curvelet transform are able to clearly reveal the characteristic features and the thermal radiation emitted by the object. This image fusion method can serve as a simple CAD algorithm for detecting concealed weapons in the visible image itself, with the help of complementary information from infrared sensors.

[Figure panels: Curvelet Based Fusion; Wavelet Based Fusion; DTCWT Based Fusion]