A Deep Convolutional Neural Wavelet Network for Classification of Medical Images

This work presents a new solution for medical image classification using a Neural Network (NN) and a Wavelet Network (WN) based on the Fast Wavelet Transform (FWT) and the Adaboost algorithm. The method is divided into two stages: the learning stage and the classification stage. The first stage extracts features using the FWT based on MultiResolution Analysis (MRA). These features are used to calculate the inputs of the hidden layer. Those inputs are then filtered with the Adaboost algorithm to select the best ones for each image. The second stage creates an AutoEncoder (AE) from the best-selected wavelets of all images. Then, after a series of stacked AEs, pooling is applied to each hidden layer to obtain our Convolutional Deep Neural Wavelet Network (CDNWN) architecture for the classification phase. Our experiments were performed on two different datasets, and the classification rates obtained by our approach show a clear improvement over those cited in this article.


Introduction
Deep learning (DL) is a set of machine learning algorithms that seek to model high-level abstractions in data using architectures composed of multiple non-linear transformations. It builds an improved feature space by using multiple layers. The DL concept originated in artificial neural network research (Bengio, 2009; Deng and Yu, 2014). Since 2006, the arrival of GPUs and faster machines has improved the use of DL techniques (Hinton, 2006; Chen et al., 2014; Krizhevsky et al., 2012; Ali et al., 2015).
Geoffrey Hinton and Yann LeCun developed new neural network models based on several layers and able to perform hierarchical learning (Hinton, 2010). A little like the human brain, the layers of neurons categorize data from the simplest to the most complicated. The auto-encoder, used for learning efficient codings, was developed by Bengio (2009; Liou et al., 2014). Sparse representations for object recognition and image classification were developed by Yann LeCun (Jarrett et al., 2009; LeCun, 2012; Liu et al., 2016). The work of Daugman (2003), in which Gabor wavelets were used for image classification, is the origin of Wavelet Networks (WN) (Amar et al., 2005; ElAdel et al., 2014; Ejbali et al., 2010; Zaied et al., 2012). WNs became popular after the work of Pati and Krishnaprasad (1993), Zhang and Benveniste (1992) and Szu et al. (1992; Iyengar et al., 2002). The first step of the pattern classification process is feature extraction (ElAdel et al., 2016). It should be efficient, and insensitive to irrelevant variations of the signal, in order to distinguish between classes (Ejbali et al., 2012).
To solve this problem, it has become necessary to deepen the vector details by extracting useful features at multiple levels of abstraction. These techniques are known as deep learning. Their architectures are based on Artificial Neural Networks (ANN) (Teyeb et al., 2014; Zahmoul and Zaied, 2017). Because of their architecture, classification ability and learning performance, ANNs allow supervised classification (Zou et al., 2011; Yang et al., 2017).
However, NNs are computationally expensive, particularly during the training phase of classification. The random weights assigned at the beginning of each run, the number of neurons in the hidden layer and the weight-update rate produce variable results (Zhou, 1999; Bouchrika et al., 2014). In addition, deep learning is done by modifying the number and the size of layers to provide many abstraction levels. Moreover, the classification relies only on the features of the last abstraction layer.
In our work, we propose, first, a new method for feature extraction based on convolutional dyadic Multi-Resolution Analysis (MRA) to solve the problems cited above. Second, we present a Deep Convolutional Neural Network (DCNN) that models images. This architecture was based on the MRA at different abstraction levels in order to extract all features. Then, we used the Adaboost algorithm to select the best features characterizing each image.
Section 2 gives the main idea of our proposed method in which we describe the feature extraction, the feature selection and the classification methods. Section 4 shows the experimental findings. In Section 5, we conclude this paper.

Proposed Approach
For feature extraction, our system is composed of a DCNN based on MRA and the Adaboost algorithm. It allows all image features to be modeled with a single hidden layer.
Our approach can be summarized in three steps: feature extraction, feature selection and classification. In the first step, the FWT is used to extract features based on convolutional dyadic MRA at different levels of abstraction, as shown in Fig. 6. The best features, with the weights that best characterize each class, are then selected using the Adaboost algorithm. These features are used in the classification step to determine the class of the input (Fig. 1). The following algorithms detail the learning and classification steps.

Learning Algorithm
The learning algorithm steps are listed as follows:
1. Constructing the library of candidate wavelets
2. Calculating connection weights between all layers
3. Calculating all hidden layer features, based on convolutional dyadic MRA on different levels
4. Executing the activation function for all features
5. Selecting, with the Adaboost algorithm, the best features characterizing the images
6. Determining the feature weights for each unit

Fig. 1: Illustration of the proposed approach

Classification Algorithm
The classification algorithm is composed of 4 steps, presented as follows:
1. In the first step, the best wavelets are chosen according to their score to generate a Global Wavelet Network (GWN) modelling one class
2. In the second step, we create the auto-encoder with the wavelets used in our GWN
3. In the third step, after a set of stacked AEs with a softmax classifier, we get a Deep Neural Wavelet Network (DNWN)
4. Finally, pooling is applied to get a Convolutional Deep Neural Wavelet Network (CDNWN) that classifies one class and rejects the other classes

Features Extraction
As illustrated in Fig. 2, the FWT is used to extract features based on MRA at different levels. This technique accelerates the calculation of the connection weights. The MRA (Jawerth and Sweldens, 1993; Bonneau et al., 2008; Ejbali and Zaied, 2017) computes two types of weights: the approximation weights (A) and the detail weights (D). In our case, we used only the detail weights because they are more representative, as they result from a convolution with dyadic wavelets. The number of abstraction levels can be calculated from the vector length, which makes it easy to control the depth of analysis of the signal. Based on MRA, we can analyse the signal at different levels. This provides all signal features, which is very helpful in the classification step.
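The dyadic decomposition described above can be illustrated with a minimal sketch. It assumes the Haar wavelet purely for simplicity (the paper draws its wavelets from a library of candidates), and the function names haar_step, max_levels and detail_features are hypothetical. It shows how the number of levels follows from the vector length and how only the detail coefficients (D) are kept at each level.

```python
import numpy as np

def haar_step(signal):
    """One level of the Haar fast wavelet transform.
    Returns (approximation, detail), each half the input length."""
    s = np.asarray(signal, dtype=float)
    even, odd = s[0::2], s[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-pass branch (A)
    detail = (even - odd) / np.sqrt(2.0)   # high-pass branch (D)
    return approx, detail

def max_levels(n):
    """Number of dyadic levels for a vector whose length n is a power of two."""
    return n.bit_length() - 1

def detail_features(signal, levels=None):
    """Collect the detail coefficients of every level, finest level first."""
    approx = np.asarray(signal, dtype=float)
    if levels is None:
        levels = max_levels(len(approx))
    details = []
    for _ in range(levels):
        approx, d = haar_step(approx)      # recurse on the approximation only
        details.append(d)
    return details, approx

# Illustrative 8-sample signal (assumed data, not from the paper).
x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
details, approx = detail_features(x)
print([len(d) for d in details])
```

Each level halves the vector length, so an 8-sample signal yields three levels of detail coefficients. Because the Haar filters used here are orthonormal, the energy of the signal is preserved across the coefficients.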

Features Selection
We used the Adaboost algorithm (Alonso et al., 2012) to select the best features. We exploit the Good Detection Rate (GDR) and the Error Recognition Rate (ERR) parameters to filter the set of features of each signal. These two parameters were the thresholds obtained with the Adaboost algorithm, and they differed by their polarity values. The best features of each image are called weak learners, h(X, f, β, ρ) (Equation 1), where f is the feature, β is the threshold and ρ is the polarity. Knowing that a threshold β_i could characterize one or more signals, (β_k) ∈ N was used as the connection weights between the layers.
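The body of Equation (1) is not reproduced here, so the following sketch assumes the decision-stump form commonly used with Adaboost, h = 1 if ρ·f(X) < ρ·β and 0 otherwise. That form, the function names and the toy data are assumptions for illustration, not the authors' exact definition. The sketch shows how a threshold β and a polarity ρ could be chosen for one feature by minimizing the weighted error.

```python
import numpy as np

def weak_learner(feature_values, beta, rho):
    """Assumed stump form: 1 where rho * f(X) < rho * beta, else 0."""
    f = np.asarray(feature_values, dtype=float)
    return (rho * f < rho * beta).astype(int)

def best_stump(feature_values, labels, weights):
    """Return the (beta, rho, error) with the lowest weighted error for one feature."""
    f = np.asarray(feature_values, dtype=float)
    y = np.asarray(labels)
    w = np.asarray(weights, dtype=float)
    best = (None, None, np.inf)
    for beta in np.unique(f):              # candidate thresholds
        for rho in (+1, -1):               # both polarities
            err = float(np.sum(w * (weak_learner(f, beta, rho) != y)))
            if err < best[2]:
                best = (float(beta), rho, err)
    return best

# Toy feature responses and labels (assumed data, not from the paper).
f = np.array([0.2, 0.4, 0.9, 1.3])
y = np.array([1, 1, 0, 0])
w = np.full(4, 0.25)                       # uniform Adaboost sample weights
beta, rho, err = best_stump(f, y, w)
print(beta, rho, err)
```

In a full Adaboost loop, the sample weights would be increased for misclassified samples after each round before the next stump is selected.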

Construction of Global Wavelet Network (GWN)
Using the best contribution algorithm, we construct a wavelet network for each element of a class (Jemai et al., 2010) from wavelets Ψ_i ∈ D, as shown in Fig. 3.
The best contribution is computed as follows:
1. Create the wavelet library (D)
2. Decompose the signal by FWT
3. Calculate the contribution
4. Select the best wavelet
5. Reconstruct the signal
After modeling each element of a class by a wavelet network, we represent each class by a GWN. To do that, we calculated the score of the wavelets of one class, which are then assembled into a GWN. We created two tables incorporating all WNs. We then counted the number of appearances of each wavelet in each position for all classes (Table 1).
Afterward, we computed a new global coefficient for each wavelet, such as Ψ_1, using Equation (3). Finally, we calculated all global wavelet coefficients and saved them in a table sorted in descending order (Table 2).
The GWN was created using the wavelets that have the best coefficients, as shown in Fig. 4. Its number of wavelets is the average number of wavelets in the hidden layers of the individual wavelet networks.

From GWN to DNWN
The AE is created from the wavelets used in the GWN (Fig. 5). The wavelets used are bi-orthogonal, and a linear function is used in the hidden layer. We use a series of AEs to build a DNWN: the outputs of each layer are connected to the inputs of the next AE. An example of a DNWN with two hidden layers and a softmax classifier is illustrated in Fig. 6:
• First, we train the first auto-encoder to learn primary features (Fig. 6a)
• Those output features are wired to the inputs of the next AE to learn secondary features (Fig. 6b)
• Last, to build a DNWN with two hidden layers, we stack the two layers together and, to classify the images as desired, we use a linear classifier (Fig. 6c)

Fig. 5: A wavelet auto-encoder (creation of the auto-encoders: input layer, encoder, decoder, output layer)

Fig. 6: Schematisation of DNWN with two hidden layers
In the back-propagation step, to apply fine-tuning, the linear function of the hidden layers of the DNWN is replaced by a sigmoid function.
Finally, after a series of stacked AEs, pooling is applied to each hidden layer to obtain our Convolutional Deep Neural Wavelet Network (CDNWN) architecture for the classification phase.
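The wiring of the stacked encoders can be sketched as follows. This is only an illustration of the data flow under stated assumptions: the weights are random placeholders (the paper derives them from the GWN wavelets), the layer sizes and all names (Encoder, dnwn_forward, W_out) are hypothetical, training is not shown, and the pooling step is left out because its type is not specified in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(z):
    return z

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class Encoder:
    """Encoder half of one auto-encoder: h = act(x W + b)."""
    def __init__(self, n_in, n_hidden):
        # Placeholder weights (assumption); not the wavelet-derived weights.
        self.W = rng.normal(scale=0.1, size=(n_in, n_hidden))
        self.b = np.zeros(n_hidden)

    def __call__(self, x, act=linear):
        return act(x @ self.W + self.b)

def dnwn_forward(x, encoders, W_out, act=linear):
    """Feed x through the stacked encoders, then a softmax classifier."""
    h = x
    for enc in encoders:                   # outputs of one layer feed the next
        h = enc(h, act)
    return softmax(h @ W_out)

x = rng.normal(size=(5, 64))                   # 5 hypothetical feature vectors
encoders = [Encoder(64, 32), Encoder(32, 16)]  # two hidden layers, as in Fig. 6
W_out = rng.normal(scale=0.1, size=(16, 2))    # two output classes
probs = dnwn_forward(x, encoders, W_out)                   # linear hidden units
probs_ft = dnwn_forward(x, encoders, W_out, act=sigmoid)   # fine-tuning variant
print(probs.shape)
```

Passing act=sigmoid mirrors the switch from linear to sigmoid hidden units described for the fine-tuning step, with the same stacked weights reused.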

Experimental Result
In this section, we report the results of the experiments conducted to evaluate the performance of the proposed technique. The proposed algorithm is tested on two data sets. The first data set (D1) is a dental image data set collected from many dentists (Ali et al., 2016). In choosing the teeth images, we tried to avoid biased data sets (Fig. 7): we used teeth images from both the upper and lower jaws and from both sides of the mouth. To demonstrate the potential of this approach and its suitability, we used 64-by-64 pixel input images. The goal is to classify the input images into normal or decayed teeth images. The second data set (D2) is retrieved from the UCI Machine Learning Repository (Murphy and Aha, 1994) and contains 699 biopsies. We used this data set to classify cancers as either benign or malignant depending on the characteristics of the sample biopsies.
After the selection of the features, we train the first auto-encoder to learn the primary vector of features. Those output features are connected to the inputs of another AE to learn secondary features. Our deep network contains five hidden layers and a softmax layer for output. Table 3 provides the confusion matrix of our classification, used to assess the quality of our system on D1. The accuracy rate reaches 98%, which is higher than that of the approach based on Deep Neural Networks (Ali et al., 2016) (Table 4). This can be explained by the deep learning using MRA at different levels, which helps extract all the useful features that represent the details of each image.
In this table, the first two diagonal cells show the rate of correct classifications by the trained network.
For example, of all teeth X-ray images, 55% are correctly classified as decayed teeth. Likewise, 43% of all cases are correctly classified as normal teeth. 1% of all teeth images are incorrectly classified as decayed teeth and 1% are incorrectly classified as normal teeth. In the same manner, 98.2% of all decayed teeth images are correctly predicted and 98.2% of all normal teeth are correctly classified. On the whole, 98% of the predictions are correct.
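The arithmetic behind these rates can be checked with a short sketch. It uses only the four cell percentages quoted in the text, which are rounded to whole percents, so the per-class rates recomputed here (about 98.2% and 97.7%) may differ slightly from the reported figures, which were presumably derived from the unrounded counts of Table 3. The row and column layout is an assumption.

```python
import numpy as np

# Cell percentages quoted in the text for D1 (rounded to whole percents).
# Assumed layout: rows = output class, columns = target class,
# both in the order (decayed, normal).
cm = np.array([[55.0, 1.0],
               [1.0, 43.0]])

# Overall accuracy: share of cases on the diagonal.
accuracy = 100.0 * np.trace(cm) / cm.sum()

# Per-class rate: correct cases of a class over all cases of that target class.
per_class = 100.0 * np.diag(cm) / cm.sum(axis=0)

print(accuracy)
print(np.round(per_class, 1))
```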
In D2, we have 699 cases. We apply our method to classify the cancers as either benign or malignant depending on the characteristics of the sample biopsies. We used 2/3 of the cases in the training phase and the remaining 1/3 in the test phase, and evaluated the classification performance on D2. The results indicate that the proposed approach performs better than the Shallow Neural Network method (MATLAB and NNTR, 2018) (Tables 5 and 6).

Table 4: Comparison of recognition rate between the proposed approach and another method using D1
Method: Recognition rate
Proposed approach: 98%
Classification of dental caries in X-ray images using deep neural networks (Ali et al., 2016): 97%

Table 5: Qualitative result of our classification approach using D2 (target class)

(MATLAB and NNTR, 2018): 97.6%

Interpretation
The FWT and MultiResolution Analysis are used to extract the features. To describe all the features of the images, we use multi-scale analysis. Then, using the Adaboost algorithm, we select the best features characterizing the images, which yields a better description of each image of the dataset. The proposed construction of the Auto-Encoder is based on the Global Wavelet Network. The GWN is generated by choosing the wavelets that have the best coefficients. The GWN approximates only the elements of the corresponding class and does not give a good approximation of the elements of the other class. This is explained by the way the wavelets modelling each class are selected: the selection depends on the best contribution of each wavelet to the construction of the wavelet network for each element of a class. The GWN consists only of the most representative wavelets of each class. Finally, pooling is applied to each hidden layer to get a Convolutional Deep Neural Wavelet Network that classifies one class and rejects the other classes.

Conclusion
We have presented a new approach for medical image classification which combines the flexibility of neural networks with the benefits of the wavelet transform, using the Fast Wavelet Transform (FWT) and the Adaboost algorithm to select the best features. This approach also uses DL based on the AE technique to learn these features. The method classifies the images of a dataset through the construction of a CDNWN, a hybrid architecture designed to improve the classification rate. The classification rates obtained by our approach show a clear improvement over the other methods, which we attribute to three aspects. First, the Adaboost algorithm proved effective in selecting the best features. Second, wavelet reconstruction can avoid inconsistent background intensity. Third, the AE is created using the best-selected wavelets of all images.
These findings support the efficiency of the proposed architecture and encourage us to test the method on other data sets of X-ray images, such as lung cancer images. The accuracy and reliability of our results could be improved by using a larger medical image data set. Finally, in future work, we aim to improve our approach by transforming it into an unsupervised learning method.

Author's Contributions
Ramzi Ben Ali: Designed the research plan, organized the study, participated in all experiments, coordinated the data-analysis and participated in the manuscript writing.
Ridha Ejbali: Advised the research project, proofread the paper and contributed to the writing of the manuscript.
Mourad Zaied: Advised the research project, designed the research plan and contributed to the writing of the paper.

Ethics
This article is original and contains unpublished material. The corresponding author confirms that all of the other authors have read and approved the manuscript and there are no ethical issues involved.