Multiresolution Laplacian Sparse Coding Technique for Image Classification

Abstract: Sparse coding is a family of techniques that learns a collection of over-complete bases to represent data efficiently. It has been used in different domains, such as feature quantization and image classification. Despite its modeling capacity, it cannot represent the similarity between image codes, which results in poor locality. This limitation stems from the independent encoding of features. To overcome it, we propose a new approach that computes similarity by taking into account the spatial neighborhood of pixels in the image. The approach integrates the Kullback-Leibler divergence and wavelet decomposition in the image domain. This combination is robust to small deformations (translation, dilation and rotation), and it improves locality by considering each element of an image together with its neighbors when computing similarity. Results show clear improvements in classification performance compared to the baseline techniques.


Introduction
Sparse coding techniques marked a great revolution in the field of computer vision and its applications. These techniques, however, suffer from an inability to model the locality and the similarity among the instances to be encoded, owing to the over-complete codebook and the independent coding process. To overcome these limitations, Gao et al. proposed an approach called Laplacian Sparse Coding (Gao et al., 2010a). This approach exploits the dependence among local features: a histogram-intersection measure over the K Nearest Neighbors (KNN) of each feature is used to build a Laplacian matrix that characterizes the similarity of local features. To keep the sparse representations of similar features consistent, this matrix is incorporated into the sparse coding objective. In a second work, Gao et al. introduced the Kernel Sparse Representation technique (Gao et al., 2010b).
The third approach proposed by Gao et al. is the Hypergraph Laplacian Sparse Coding technique (Gao et al., 2013). It simultaneously captures the similarity between instances within the same hyperedge and makes their sparse codes similar to each other.
In this paper, we propose an enhancement of the Laplacian sparse coding technique that alters the way similarity is computed. In our case, similarity in the image domain is computed from the Kullback-Leibler divergence combined with wavelet decomposition, chosen for its ability to take the similarity of neighbors into account.
This paper is organized as follows: Section 1 introduces the Laplacian sparse coding technique. Section 2 describes kernel sparse representation. Section 3 explains our approach, which is evaluated in the last section.

Laplacian Sparse Coding
In order to alleviate the problem of hard quantization, researchers have proposed a technique called sparse coding, which represents each image feature as a sparse linear combination of basis vectors. Sparse coding seeks a linear reconstruction of a signal $x \in \mathbb{R}^d$ using the bases of a codebook $U = (u_1, u_2, \ldots, u_K)$, $U \in \mathbb{R}^{d \times K}$. The sparse codes are $V = (v_1, v_2, \ldots, v_n)$, where $v_i \in \mathbb{R}^{K \times 1}$ and $v_{ik}$ is the weight of the feature $x_i$ on the basis vector $u_k$. The optimization problem of sparse coding can be summarized as:

$$\min_{U,V} \sum_{i=1}^{n} \left( \left\| x_i - U v_i \right\|^2 + \lambda \left\| v_i \right\|_1 \right) \quad \text{s.t. } \|u_k\| \le 1, \; \forall k$$

where $\lambda$ is the tradeoff parameter used to balance sparsity against the reconstruction error. Because the codebook is over-complete (or sufficient), each feature is encoded independently, which loses the similarity among features. Suppose $X = (x_1, x_2, \ldots, x_n)$ is the matrix of features and $W$ is the similarity matrix, with $W_{ij}$ measuring the similarity of the pair $(x_i, x_j)$. Laplacian Sparse Coding, as detailed in (Gao et al., 2010a; 2010b), preserves this similarity by adding the regularization term:

$$\frac{\beta}{2} \sum_{i,j} \left\| v_i - v_j \right\|^2 W_{ij}$$

which can be written as:

$$\beta \, \mathrm{tr}\!\left(V L V^{\top}\right)$$

where the Laplacian is defined as $L = D - W$, with $D$ the diagonal degree matrix $D_{ii} = \sum_j W_{ij}$ (Luxburg, 2007).
Since the codebook $U$ is not fixed but learned together with the codes, the full Laplacian sparse coding problem can be written as:

$$\min_{U,V} \|X - UV\|_F^2 + \lambda \sum_{i=1}^{n} \|v_i\|_1 + \beta \, \mathrm{tr}\!\left(V L V^{\top}\right) \quad \text{s.t. } \|u_k\| \le 1, \; \forall k$$
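As a concrete illustration, the Laplacian sparse coding objective ||X − UV||² + λ Σ‖v_i‖₁ + β tr(V L V^T) with L = D − W can be evaluated in a few lines of NumPy. The sketch below is ours, not the authors' implementation; dimensions and parameter values are illustrative assumptions.

```python
import numpy as np

def laplacian_sc_objective(X, U, V, W, lam=0.1, beta=0.2):
    """Evaluate the Laplacian sparse coding objective
    ||X - U V||_F^2 + lam * sum_i ||v_i||_1 + beta * tr(V L V^T),
    where L = D - W is the graph Laplacian of the similarity matrix W."""
    D = np.diag(W.sum(axis=1))          # degree matrix
    L = D - W                           # graph Laplacian
    recon = np.linalg.norm(X - U @ V, 'fro') ** 2
    sparsity = lam * np.abs(V).sum()
    laplacian = beta * np.trace(V @ L @ V.T)
    return recon + sparsity + laplacian

# Toy check: d=4 feature dim, n=3 features, k=6 basis vectors (all illustrative).
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
U = rng.standard_normal((4, 6))
V = np.tile(rng.standard_normal((6, 1)), (1, 3))  # identical codes for all features
W = np.ones((3, 3))                               # all features maximally similar
obj = laplacian_sc_objective(X, U, V, W)
```

When the codes of similar features are identical, the Laplacian term tr(V L V^T) vanishes, which is exactly the consistency the regularizer rewards.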

Kernel Sparse Representation
Gao et al. proposed another approach, Kernel Sparse Representation, to improve the representation of features with sparse coding. They adopted the kernel trick after noticing that it can capture the nonlinear similarity of features. This approach is essentially sparse coding carried out in a high-dimensional feature space reached through an implicit mapping function (Gao et al., 2010a; 2010b).
Let $\phi: \mathbb{R}^d \rightarrow \mathcal{F}$ be a feature mapping function. Under the same conditions as sparse coding, the formulation of Kernel Sparse Coding is written as:

$$\min_{v} \left\| \phi(x) - \sum_{k=1}^{K} v_k \, \phi(u_k) \right\|^2 + \lambda \|v\|_1$$

Expanding the quadratic term with the kernel $\kappa(a, b) = \langle \phi(a), \phi(b) \rangle$ avoids computing $\phi$ explicitly. Gao et al. adopted the Gaussian kernel $\kappa(a, b) = \exp\!\left(-\|a - b\|^2 / \sigma^2\right)$ because of its satisfactory performance in many works (Chen et al., 2010; Donoho, 2006).
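The kernel-space reconstruction error can be evaluated without ever forming φ, by expanding the squared norm into kernel evaluations: κ(x, x) − 2 vᵀk_{xU} + vᵀK_{UU}v. The following is a minimal sketch with a Gaussian kernel; dimensions, σ and λ values are our illustrative assumptions, not the paper's settings.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """kappa(a, b) = exp(-||a - b||^2 / (2 sigma^2))"""
    return np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

def kernel_sc_objective(x, U, v, lam=0.1, sigma=1.0):
    """||phi(x) - phi(U) v||^2 + lam * ||v||_1, expanded via the kernel trick as
    kappa(x, x) - 2 v^T k_xU + v^T K_UU v, so phi is never computed explicitly."""
    k = U.shape[1]
    k_xU = np.array([gaussian_kernel(x, U[:, j], sigma) for j in range(k)])
    K_UU = np.array([[gaussian_kernel(U[:, i], U[:, j], sigma)
                      for j in range(k)] for i in range(k)])
    recon = gaussian_kernel(x, x, sigma) - 2 * v @ k_xU + v @ K_UU @ v
    return recon + lam * np.abs(v).sum()

rng = np.random.default_rng(1)
x = rng.standard_normal(4)          # one feature, d=4 (illustrative)
U = rng.standard_normal((4, 6))     # codebook with k=6 atoms (illustrative)
v = np.zeros(6)
obj0 = kernel_sc_objective(x, U, v)  # with v = 0 the error is kappa(x, x) = 1
```

The point of the expansion is that only kernel evaluations between x and the codebook atoms are needed, whatever the (possibly infinite) dimension of the feature space.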

A. General Context of Multiresolution Wavelet Decomposition
As is well known, the multiresolution formalism enables the decomposition of a signal over several scales: the signal is represented by its most suitable approximation at each level, and the wavelet decomposition yields detail and approximation coefficients (Hassairi et al., 2015; 2018). Multiresolution wavelet decomposition studies a signal in both the frequency and time domains. At lower frequencies it gives better frequency resolution and poorer time resolution, while at higher frequencies it offers better time resolution and poorer frequency resolution. Fortunately, this behavior suits real applications, since signals tend to exhibit low frequencies over long durations and high frequencies over very short intervals.
A multiresolution analysis of $L^2(\mathbb{R})$ is a family of nested closed subspaces $(V_j)_{j \in \mathbb{Z}}$ with the following properties:

$$V_{j+1} \subset V_j, \qquad \overline{\bigcup_{j \in \mathbb{Z}} V_j} = L^2(\mathbb{R}), \qquad \bigcap_{j \in \mathbb{Z}} V_j = \{0\}, \qquad f(t) \in V_j \iff f(2t) \in V_{j-1}$$

together with the existence of a scaling function $\Phi$ whose dilated translates $(\Phi_{j,n})_{n \in \mathbb{Z}}$ form a basis of $V_j$. These subspaces constitute the formulation of a multiresolution analysis.
The nesting relation confirms that $V_j$ is the space generated by the family $(\Phi_{j,n})_{n \in \mathbb{Z}}$. Its description depends on the topology selected for the underlying space; we can describe it more precisely as the closure of the set of linear combinations of the functions $\Phi_{j,n}$. The approximation of a signal $f$ on the space $V_j$ is:

$$P_{V_j} f = \sum_{n \in \mathbb{Z}} a_n^j \, \Phi_{j,n}$$

where the coefficients $a_n^j$ are computed by the scalar product of the signal with the family $\Phi_{j,n}$:

$$a_n^j = \langle f, \Phi_{j,n} \rangle$$

Intuitively, the functions of $V_j$ are richer, or denser, than those of $V_{j+1}$, which says more than the inclusion relationship alone. The same nesting implies that wavelets appear as a natural way to write the difference between two consecutive spaces $V_j$ and $V_{j+1}$: we construct $W_{j+1}$ to complete $V_{j+1}$ in $V_j$:

$$V_j = V_{j+1} \oplus W_{j+1}$$

The space $W_{j+1}$ is spanned by the functions $\psi_{j+1,n}$, which take their values in the complement of $V_{j+1}$ in $V_j$; they enjoy the same translation and dilation properties as the $\Phi_{j,n}$. The set of functions $\psi_{j,n}$ spans the detail space. The detail of the signal $f$ in the space $W_j$ is computed as:

$$P_{W_j} f = \sum_{n \in \mathbb{Z}} d_n^j \, \psi_{j,n}$$

and the detail coefficients $d_n^j$ are calculated by:

$$d_n^j = \langle f, \psi_{j,n} \rangle$$

Suppose now that the signal is represented on a basis of $V_j$. Applying the wavelet transform up to scale $k \in \mathbb{N}$ expresses the signal on the direct sum:

$$V_j = V_{j+k} \oplus W_{j+k} \oplus \cdots \oplus W_{j+1}$$

The algorithm thus replaces a representation on $V_j$ by a representation on $V_{j+1} \oplus W_{j+1}$, and we sequentially pass from one space to the next by successive decompositions on these direct sums. The spaces $V_j$ being nested, any function $f \in L^2(\mathbb{R})$ of size $n$ can be decomposed into the basis of wavelets and scaling functions; if the analysis is carried to the last level, $f$ is written as:

$$f = \sum_{n} a_n^{j+k} \, \Phi_{j+k,n} + \sum_{l=1}^{k} \sum_{n} d_n^{j+l} \, \psi_{j+l,n}$$
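The cascade of approximation and detail decompositions described above can be illustrated with the Haar wavelet, whose analysis step reduces to pairwise sums and differences scaled by 1/√2. The Haar choice and the toy signal below are our own, for illustration only.

```python
import numpy as np

def haar_step(a):
    """One level of the Haar analysis filter bank: approximation and detail."""
    a = np.asarray(a, dtype=float)
    approx = (a[0::2] + a[1::2]) / np.sqrt(2)   # projection onto V_{j+1}
    detail = (a[0::2] - a[1::2]) / np.sqrt(2)   # projection onto W_{j+1}
    return approx, detail

def haar_decompose(signal, levels):
    """Iterate V_j = V_{j+1} (+) W_{j+1}: returns [a_J, d_J, ..., d_1]."""
    approx, details = np.asarray(signal, dtype=float), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return [approx] + details[::-1]

f = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
coeffs = haar_decompose(f, levels=3)
# The Haar basis is orthonormal, so the energy of f is preserved:
energy_in = np.sum(f ** 2)
energy_out = sum(np.sum(c ** 2) for c in coeffs)
```

Each level halves the length of the approximation, so an 8-sample signal decomposed over 3 levels yields one coarsest approximation coefficient plus detail bands of sizes 1, 2 and 4, mirroring the direct sum V_j = V_{j+3} ⊕ W_{j+3} ⊕ W_{j+2} ⊕ W_{j+1}.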

B. Multiresolution Laplacian Sparse
In his three works (Gao et al., 2010a; 2010b; 2013), Gao preserved similarity by adding a Laplacian term to the sparse coding technique. Moreover, he added the hypergraph technique to Laplacian sparse coding to improve it further. In that case, the similarity among instances is defined by a hypergraph: as shown in Equation 4, this technique simultaneously captures the similarity among the instances within the same hyperedge and makes their sparse codes similar to each other (Ben Said et al., 2017).
Despite these contributions, the technique is still unable to cover all similarities between features: it analyzes images spatially and does not focus on the details of each object, so the analysis remains superficial. This is why we propose multiresolution Laplacian sparse coding, which deepens the analysis.
In multiresolution Laplacian sparse coding, the variations of the neighbors of each object in an image are taken into account when modeling the image (Jemel et al., 2016). This modeling capacity relies on the strength of the Kullback-Leibler divergence and wavelet decomposition.

C. Wavelet and Kullback-Leibler Divergence
The analysis of an image $I$ by a family of functions $\{\psi_{j,k}\}_{j,k}$ is called the wavelet transform. This transformation is based on dilations and translations of a mother wavelet $\psi$. Thanks to its localization properties in space and frequency, the wavelet coefficient $w(I)_{j,k} = \langle \psi_{j,k}, I \rangle$ gives information about the content of the image $I$ around the point $k$, in a frequency band near the scale $j$. When the image is reasonably smooth, the wavelet transform concentrates most of its spatio-frequency information into a few large-amplitude coefficients (Piro et al., 2008).
To a first approximation, these coefficients are uncorrelated, which justifies the thresholding and denoising of wavelet coefficients that proves very effective in image compression. In fact, the wavelet coefficients are correlated across scales: a discontinuity along a curve is converted into large coefficients, at all scales, around each point $k_0$ of that curve. Dependency models between coefficients have been suggested to capture spatial structures (Goria et al., 2005; Huber, 1981); notably, there is a dependency between a wavelet coefficient $w(I)_{j,k}$ and its coarser-scale neighbor $w(I)_{j-1,k}$. Banerjee et al. (2005) demonstrated that the statistics of wavelet coefficient vectors can be used to distinguish spatial structures of very different kinds. To do this, it suffices to fit a Gaussian mixture model for each occurrence to express the joint probability of these vectors. In our case, it is uncertain which types of structures are present, so a fixed model cannot be set; however, it is desirable that the distribution of these vectors be representative of the spatial structures present in the image. Accordingly, it is essential to define a measure taking into account the joint probability of the wavelet neighborhood vectors $w(I)_{j,k}$.
Given the variability of the spatial structures that can be encountered in the residue, the choice of a parameterization would be difficult to justify. We propose instead to introduce similarity metrics that require no parameterization of the distribution of neighborhoods: metrics derived from information theory, such as the entropy of the neighborhoods, mutual information, or the Kullback-Leibler divergence between the neighborhood distributions of the wavelet coefficients of the two images.
Consider a neighborhood $w(I)_{j,k}$ containing $d$ coefficients, and let $p_{w(I)}$ denote the distribution of all neighborhoods of the image $I$, normalized so that $\int p_{w(I)}(x) \, dx = 1$. The Shannon differential entropy

$$H\!\left(p_{w(I)}\right) = - \int p_{w(I)}(x) \log p_{w(I)}(x) \, dx$$

measures the amount of information contained in this distribution. The Kullback-Leibler divergence is a measure of similarity between the distributions $p_w(I_1)$ and $p_w(I_2)$:

$$D\!\left(p_w(I_1) \,\|\, p_w(I_2)\right) = \int p_w(I_1)(x) \log \frac{p_w(I_1)(x)}{p_w(I_2)(x)} \, dx$$

From these definitions, the Kullback-Leibler distance can be expressed as a difference of entropies:

$$D\!\left(p_w(I_1) \,\|\, p_w(I_2)\right) = H_{\times}\!\left(p_w(I_1), p_w(I_2)\right) - H\!\left(p_w(I_1)\right)$$

where the cross-entropy is defined as follows (Piro et al., 2008):

$$H_{\times}(p, q) = - \int p(x) \log q(x) \, dx$$

The use of these measures on the distributions of pixel intensities gives excellent results in segmentation and image registration (Banerjee et al., 2005; Fukunaga, 1990; Kozachenko and Leonenko, 1987). A Kullback distance in wavelet space was also advanced for the indexing problem in (Collins et al., 2005; Leonenko et al., 2008). In these two articles, the authors parameterize the distribution of the wavelet coefficients at each scale $j$ by a generalized Gaussian, and sum the Kullback distances obtained at each scale to measure the similarity between two images.
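For discrete (histogram) distributions, the identity D(p‖q) = H×(p, q) − H(p) can be checked numerically. A small sketch follows; the smoothing constant eps is an implementation detail we add to avoid log 0, not part of the definitions above.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy H(p) = -sum p log p of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + eps))

def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H_x(p, q) = -sum p log q."""
    return -np.sum(np.asarray(p, float) * np.log(np.asarray(q, float) + eps))

def kl_divergence(p, q, eps=1e-12):
    """D(p || q) = sum p log(p / q) = H_x(p, q) - H(p)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum(p * np.log((p + eps) / (q + eps)))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
d = kl_divergence(p, q)
# The entropy-difference identity should hold up to floating-point error:
identity_gap = abs(d - (cross_entropy(p, q) - entropy(p)))
```

The divergence is zero when the two distributions coincide and strictly positive otherwise, which is what makes it usable as a (non-symmetric) similarity measure.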
We suggest studying similar measures to establish the similarity between two images, with two major differences. First, the wavelet coefficients at different scales are not independent, whereas summing the Kullback distances obtained at each scale implicitly assumes independence; we therefore consider the entropy associated with the coefficients, in particular that of the neighborhoods illustrated previously. Second, we do not parameterize the distributions. We determine the similarity between images $I_1$ and $I_2$ as follows (Piro et al., 2008):

$$S(I_1, I_2) = \sum_{j} a_j \, D\!\left(p_{w_j}(I_1) \,\|\, p_{w_j}(I_2)\right)$$

where $p_{w_j}(I_1)$ is the non-parametric distribution of the wavelet neighborhood coefficients of the image $I_1$ at scale $j$ (Piro et al., 2008).
$a_j > 0$ is a normalization weight accounting for the redundancy of the wavelet system used (Piro et al., 2008).
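Putting the pieces together, the multiscale similarity S(I₁, I₂) = Σⱼ aⱼ D(p_{wⱼ}(I₁) ‖ p_{wⱼ}(I₂)) can be sketched for 1-D signals using Haar detail coefficients and histogram density estimates. This is our simplification for illustration: the paper's neighborhood vectors are replaced here by scalar coefficients per scale, and uniform weights aⱼ = 1 are assumed.

```python
import numpy as np

def haar_levels(x, levels):
    """Per-scale Haar detail coefficients d_1 ... d_J of a 1-D signal."""
    out, a = [], np.asarray(x, dtype=float)
    for _ in range(levels):
        a, d = (a[0::2] + a[1::2]) / np.sqrt(2), (a[0::2] - a[1::2]) / np.sqrt(2)
        out.append(d)
    return out

def hist_kl(u, v, bins=16, eps=1e-9):
    """KL divergence between histogram estimates of two coefficient sets."""
    lo, hi = min(u.min(), v.min()), max(u.max(), v.max())
    p, _ = np.histogram(u, bins=bins, range=(lo, hi))
    q, _ = np.histogram(v, bins=bins, range=(lo, hi))
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    return float(np.sum(p * np.log(p / q)))

def similarity(x1, x2, levels=3, weights=None):
    """S(I1, I2) = sum_j a_j * D(p_{w_j}(I1) || p_{w_j}(I2))."""
    weights = weights or [1.0] * levels
    d1, d2 = haar_levels(x1, levels), haar_levels(x2, levels)
    return sum(a * hist_kl(u, v) for a, u, v in zip(weights, d1, d2))

rng = np.random.default_rng(2)
x = rng.standard_normal(256)
s_self = similarity(x, x)                       # identical signals: divergence 0
s_other = similarity(x, rng.standard_normal(256))
```

Note that the measure inherits the asymmetry of the Kullback-Leibler divergence: S(I₁, I₂) and S(I₂, I₁) need not coincide.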
Based on the definition of sparse coding and expression (2), we establish the rule of multiresolution sparse coding: the similarity matrix $W$ used by Gao et al. (2013) is filled with the Kullback-Leibler similarity coefficients $S$. To estimate these divergences from samples, Boltz et al. (2006) suggested a k-nearest-neighbor estimator of the Kullback-Leibler divergence of the form:

$$\hat{D}\!\left(p \,\|\, q\right) = \frac{d}{n} \sum_{i=1}^{n} \log \frac{\nu_k(x_i)}{\rho_k(x_i)} + \log \frac{m}{n-1}$$

where $\rho_k(x_i)$ and $\nu_k(x_i)$ are the distances from $x_i$ to its $k$-th nearest neighbor among the $n$ samples of $p$ (excluding $x_i$ itself) and among the $m$ samples of $q$, respectively. This estimator can be computed relatively quickly whatever the size of the samples, and it is robust to the choice of the number $k$ of nearest neighbors.
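A k-nearest-neighbor Kullback-Leibler estimator in the spirit of Boltz et al. (2006) can be sketched as follows. The formula implemented below is the standard kNN divergence estimator; its exact correspondence to the estimator used in the paper is our assumption, and the brute-force distance computation is for clarity, not speed.

```python
import numpy as np

def knn_kl_estimate(X, Y, k=1):
    """kNN Kullback-Leibler estimator:
    D(p || q) ~= (d/n) * sum_i log(nu_k(x_i) / rho_k(x_i)) + log(m / (n - 1)),
    where rho_k / nu_k are the k-th nearest-neighbour distances of x_i
    within X \\ {x_i} and within Y, respectively."""
    X, Y = np.atleast_2d(X), np.atleast_2d(Y)
    n, d = X.shape
    m = Y.shape[0]
    total = 0.0
    for i in range(n):
        dx = np.linalg.norm(X - X[i], axis=1)
        dx = np.sort(dx[dx > 0])                      # drop the self-distance
        dy = np.sort(np.linalg.norm(Y - X[i], axis=1))
        total += np.log(dy[k - 1] / dx[k - 1])
    return (d / n) * total + np.log(m / (n - 1))

# Sanity check on 2-D Gaussian samples (sample sizes and k are illustrative):
rng = np.random.default_rng(3)
same = knn_kl_estimate(rng.standard_normal((500, 2)),
                       rng.standard_normal((500, 2)), k=5)
shifted = knn_kl_estimate(rng.standard_normal((500, 2)),
                          rng.standard_normal((500, 2)) + 3.0, k=5)
```

For two samples from the same distribution the estimate stays near zero, while for a clearly shifted distribution it grows large, as the true divergence does.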
Using the definition of Laplacian as in (Gao et al., 2013) we obtain the same Equation 3.

Experiments Results
We evaluated our approach on three well-known datasets, reporting global classification rates: the UIUC sport dataset (Li and Fei-Fei, 2007), the Corel 10 dataset (Lu and Horace, 2009) and the Scene 15 dataset.

A. UIUC Sport
The UIUC sport event dataset contains eight sport event classes, such as rowing, badminton, polo and sailing (Li and Fei-Fei, 2007).

B. Corel 10
The Corel 10 dataset contains 1000 images, divided into 10 classes of 100 images each (Lu and Horace, 2009). The ten classes are beach, skiing, tigers, buildings, owls, flowers, elephants, horses, food and mountains.

C. Scene 15
This dataset is composed of 4485 images categorized into 15 classes, each consisting of 200 to 400 images. It contains indoor scenes, such as bedroom and kitchen, as well as outdoor scenes, such as buildings and landscapes.

Results
To compare our approach with the techniques contrasted in Gao et al. (2013), we selected the same datasets and the same number of training images. The results are summarized in Tables 2 and 3.

Conclusion
In this study, we proposed a new approach to image classification based on Laplacian sparse coding combined with the Kullback-Leibler divergence and wavelet decomposition. The similarity between images is computed by combining concepts from information theory and the wavelet transform: the method sums, over scales, the Kullback distances between the distributions of neighborhood vectors of wavelet coefficients. These neighborhood coefficients capture not only spatial locations but also relative scales, so they encode the spatial and inter-scale dependencies that identify finer spatial structures. The Kullback distance on these vectors is estimated in a non-parametric way despite their high dimension, thanks to nearest-neighbor entropy estimators.