Using Invariant Translation to Denoise Electroencephalogram Signals

Problem statement: Because of the distance between the skull and the brain and their different resistivities, Electroencephalogram (EEG) recordings are usually mixed with activity generated in the surrounding area, referred to as noise. EEG signals have been used to diagnose major brain disorders such as epilepsy, narcolepsy and dementia. The presence of this noise can, however, result in misdiagnosis, so it must be removed before further analysis and processing. Denoising is often done with Independent Component Analysis (ICA) algorithms, but more recently the Wavelet Transform has been utilized. Approach: In this study we utilized one of the newer Wavelet Transform methods, Translation-Invariant, to denoise EEG signals. Different EEG signals were used to verify the method using the MATLAB software. Results were then compared with those of the well-known ICA algorithms FastICA and Radical and evaluated using the performance measures Mean Square Error (MSE), Percentage Root Mean Square Difference (PRD) and Signal to Noise Ratio (SNR). Results: Experiments revealed that the Translation-Invariant Wavelet Transform had the smallest MSE and PRD and the largest SNR. Conclusion/Recommendations: This indicates that it outperformed the ICA algorithms, producing cleaner EEG signals, which can influence diagnosis as well as clinical studies of the brain.


INTRODUCTION
The language of communication within the nervous system is electric: when the neurons of the human brain process information, they do so by changing the flow of electrical currents across their membranes. These changing currents generate electric and magnetic fields that can be recorded from the surface of the scalp. The electric fields are measured by attaching small electrodes to the scalp. The potentials between different electrodes are then amplified and recorded as the Electroencephalogram (EEG), which means the writing out of the electrical activity of the brain (that which is inside the head). EEG recordings therefore show the overall activity of the millions of neurons in the brain.
The human EEG was first recorded in 1924 (Unser and Aldroubi, 1996) and since then it has acquired an important role as a diagnostic tool in medicine and brain research. Being a physical system, however, EEG is subject to random disturbance. The measurements or observations are generally contaminated with other non-cerebral signals called artifacts or noise, caused by the electronic and mechanical components of the measuring devices. The recorded signal E(t) is therefore the sum of the true EEG signal x(t) and the non-cerebral noise n(t): $E(t) = x(t) + n(t)$ (1) These artifacts sometimes mimic EEG signals and overlay them, resulting in distortion that makes analysis impossible. EEG is among the noisiest biosignals (Celka et al., 2008) and in clinical practice segments containing artifacts are discarded, resulting in considerable information loss and sometimes in misdiagnosis. Artifacts must therefore be eliminated or attenuated to ensure correct analysis and diagnosis. Through the years there have been different methods of denoising, including artifact rejection, regression and Principal Components Analysis (PCA). Croft and Barry (2000) reviewed a number of these methods for denoising EEG signals and focused on their merits. Klados et al. (2009) went on to compare different methods for denoising EEG and found them to be relatively good. The methods used most often in recent work are Independent Component Analysis (ICA) and Wavelet Transform (WT).
Independent Component Analysis (ICA) originated from the field of Blind Source Separation (BSS) (Comon, 1994). In the BSS problem, a set of observations is given while the underlying signal information is hidden. The mixing weights of the individual signals are unknown. The BSS problem is aimed at identifying the source signals and/or the mixing weights so as to separate these information sources in the signal domain, feature domain or model domain (Chien et al., 2008). The basic assumptions of the ICA method are that the source signals are mutually independent and non-Gaussian distributed. ICA therefore calls for the separation of the EEG into its constituent Independent Components (ICs) and then the elimination of the ICs that are believed to contribute to the noise.
Different types of ICA algorithms have been proposed in the last 10-12 years. Most of them assume that the sources are stationary and are based explicitly or implicitly on the computation of high order statistics. Gaussian sources therefore cannot be separated, as their statistics of order higher than two carry no additional information. Other types of algorithms do not make the stationarity hypothesis and use the non-stationary structure of the signals (i.e., their time or frequency structure) to separate them. These methods use Second Order Statistics (SOS) only and are called SOS algorithms. As EEG signals are highly non-stationary, these types of algorithms are the most widely used for denoising.
Like ICA, Wavelet Transform (WT) has been used successfully to study EEG signals (Bhatti et al., 2001; Der and Steinmetz, 1997; Alfaouri et al., 2009; Inuso et al., 2007; Nenadic and Burdick, 2005; Kumar et al., 2008a; Unser and Aldroubi, 1996) because of its good localization properties in the time and frequency domains (Ghael et al., 1997). Here, the EEG signals pass through two complementary filters and emerge as two signals: approximation and details. This is called decomposition or analysis. The components can be assembled back into the original signal without loss of information. This process is called reconstruction or synthesis. The mathematical manipulations that implement analysis and synthesis are called the Discrete Wavelet Transform (DWT) and the inverse Discrete Wavelet Transform, respectively. There have been many approaches to denoising using WT; those based on shrinkage are the most popular (Mastriani and Giraldez, 2006), where the EEG signals are decomposed into wavelets and noise removal is done using thresholding and shrinkage. Akin (2002) investigated the performance of WT and found that it was better at detecting brain diseases than the fast Fourier transform. Another study reported the same finding as Akin (2002). Unser and Aldroubi (1996) went on to show that wavelets are good at denoising EEG signals as well as other biomedical signals. Wavelet transform has therefore emerged as one of the superior techniques for analyzing non-stationary signals like EEG. Its capability of transforming a time domain signal into a representation localized in both time and frequency helps to understand the behavior of a signal better.
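The analysis and synthesis steps described above can be sketched with the simplest wavelet filter pair. The example below is a minimal illustration that assumes the Haar wavelet (not the wavelet used in this study) and uses hypothetical function names; it only shows that one level of decomposition into approximation and details can be inverted without loss of information.

```python
import math

def haar_analysis(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, details)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    r = math.sqrt(2.0)
    # Low-pass branch: scaled pairwise sums give the approximation.
    approx = [(signal[i] + signal[i + 1]) / r for i in range(0, len(signal), 2)]
    # High-pass branch: scaled pairwise differences give the details.
    detail = [(signal[i] - signal[i + 1]) / r for i in range(0, len(signal), 2)]
    return approx, detail

def haar_synthesis(approx, detail):
    """Inverse of haar_analysis: rebuilds the signal from both branches."""
    r = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / r)
        out.append((a - d) / r)
    return out

# Illustrative samples only, not EEG data.
samples = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_analysis(samples)
rebuilt = haar_synthesis(approx, detail)
```

Applying `haar_analysis` again to `approx` gives the next decomposition level, which is how a multi-level DWT is built up.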
Our research has found, however, that the denoising of EEG signals has been based on the Discrete Wavelet Transform (DWT) (Kumar et al., 2008b) and the Stationary Wavelet Transform (SWT) (Kumar et al., 2008a). In this study we look at another form of wavelet transform, the Translation-Invariant method proposed by Coifman and Donoho (1995), for denoising EEG. We have found no research which applies this method. Its performance is compared to that of known ICA methods when denoising the same EEG signals. We found that the expected performance of each is not the final result. From theoretical analysis and experimental results, we found that Translation-Invariant denoising performed much better than any of the ICA algorithms as well as orthogonal wavelets.
The study is organized as follows. We first describe WT and ICA for background, then present the methodology and the experimental results for denoising. Finally, we present the conclusion.
Supporting literature: EEG signals: The nervous system sends commands and communicates by trains of electric impulses. When the neurons of the human brain process information they do so by changing the flow of electrical current across their membranes. These changing currents (potentials) generate electric fields that can be recorded from the scalp. Studies are interested in these electrical potentials, but they can only be obtained by direct measurement. This requires a patient to undergo surgery for electrodes to be placed inside the head, which is not acceptable because of the risk to the patient. Researchers therefore collect recordings from the scalp, obtaining a global description of the brain activity. Because the same potential is recorded by more than one electrode, the signals from the electrodes are expected to be highly correlated. These signals are collected with an electroencephalograph and are called Electroencephalogram (EEG) signals.
Understanding the brain is a huge part of Neuroscience and EEG was developed to help elucidate it. The morphology of the EEG signals has been used by researchers and in clinical practice to:
• Diagnose epilepsy and see what type of seizures are occurring
• Produce the most useful and important test in confirming a diagnosis of epilepsy
• Check for problems with loss of consciousness or dementia
• Help find out a person's chance of recovery after a change in consciousness
• Find out if a person who is in a coma is brain-dead
• Study sleep disorders, such as narcolepsy
• Watch brain activity while a person is receiving general anesthesia during brain surgery
• Help find out if a person has a physical problem (in the brain, spinal cord, or nervous system) or a mental health problem
The signals must therefore present a true and clear picture of brain activity. Because EEG is a physical system that records electrical potentials, it faces a problem: all neurons, including those outside the brain, communicate using electrical impulses. These non-cerebral impulses are produced by:
• Eye movements and blinking-Electrooculogram (EOG)
• Cardiac movements-Electrocardiogram (ECG/EKG)
• Muscle movements-Electromyogram (EMG)
• Chewing and sucking movements-Glossokinetic
• The machinery used to record signals
• The power lines
EEG recordings are therefore a combination of these signals, called artifacts or noise, and the pure EEG signal, as defined mathematically in Eq. 1. The presence of this noise, n(t), introduces spikes which can be confused with neurological rhythms. The noise also mimics EEG signals and overlays them, resulting in signal distortion (Fig. 1). Correct analysis is therefore impossible, resulting in misdiagnosis in the case of some patients. Noise must be eliminated or attenuated.
The method of cancelling the contaminated segments, although practiced, can lead to considerable information loss; other methods such as Principal Components Analysis (PCA), the use of a dipole model and, more recently, ICA and WT have therefore been utilized.
Wavelet transform: An EEG signal is a wave, i.e., an oscillating function of time or space that is periodic. In contrast, a wavelet is a localized wave whose energy is concentrated in time; as a result it provides a versatile mathematical tool to analyze transient, nonstationary or time-varying phenomena that are not statistically predictable. Figure 2 shows the difference between a wave and a wavelet.
A set of wavelets is employed to approximate a wave or signal. The wavelet expansion of s(t) is the representation of the wave or signal in terms of an orthogonal collection of real-valued functions generated by applying suitable transformations to the original given wavelet (Eq. 2). These functions are called "daughter" wavelets, while the original wavelet is dubbed the "mother" wavelet (Eq. 3). The collection of coefficients $a_{j,k}$, based on the subset of scales "j" and positions "k", is called the Discrete Wavelet Transform (DWT) of s(t) and represents the "details". The second term in Eq. 2 is the "approximation", based on the scaling function (Eq. 4). A signal can be analyzed better with an irregular wavelet. Wavelets are employed to approximate a signal, and each element in the wavelet set is constructed from the mother wavelet by shifting (translating or delaying) and scaling (dilating or compressing) it.
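For reference, a conventional textbook form of this expansion is sketched below. The notation is an assumption and may differ from the paper's own Eq. 2-4: the scaling coefficients $c_{J,k}$, the coarsest level $J$ and the dyadic normalization are introduced here for illustration, while $a_{j,k}$, $j$ and $k$ follow the text.

```latex
% Wavelet expansion: detail terms plus a coarse approximation term
s(t) = \sum_{j=1}^{J} \sum_{k} a_{j,k}\, \psi_{j,k}(t) + \sum_{k} c_{J,k}\, \phi_{J,k}(t)

% Daughter wavelets obtained by scaling and shifting the mother wavelet \psi
\psi_{j,k}(t) = 2^{-j/2}\, \psi\!\left(2^{-j} t - k\right)

% Scaling functions obtained the same way from \phi
\phi_{J,k}(t) = 2^{-J/2}\, \phi\!\left(2^{-J} t - k\right)
```

In this form the double sum carries the "details" at each scale and the final sum carries the "approximation" at the coarsest scale.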
Denoising using wavelet: Denoising stands for the process of removing noise i.e., unwanted information, present in an unknown signal. The use of wavelets for noise removal was first introduced by Donoho and Johnstone (1995). The general procedure involves three steps.
Decompose-a wavelet and a decomposition level N are chosen and the signal is decomposed to level N using the DWT, giving coefficients at different scales that have different magnitudes.
Noise Removal-for each level from 1 to N, noise is removed from the detail coefficients using one of two processes:
• Wavelet transform maxima, where noise is eliminated while the information of the original signal is maximized. The calculation is, however, unstable and computationally expensive
• Wavelet thresholding, proposed by Donoho, which was used in this research
When a threshold is applied, the coefficients are categorized. Noise normally produces coefficients with magnitudes smaller than those of the natural signal and, according to Donoho and Johnstone, basic wavelet denoising is performed by taking the WT of the noise-corrupted s[t] and then zeroing out the detail coefficients that fall below a certain threshold, which are attributed to noise. The remaining, larger coefficients are usually caused by the desired signal and are either:
• Kept (hard-thresholding) or
• Shrunk (soft-thresholding) (Han et al., 2009)
Reconstruct-the denoised signal is reconstructed from the wavelet coefficients by an inverse wavelet transform, which is applied to the thresholded coefficients to yield an estimate of the true signal, as Eq. 5: $\hat{x} = W^{-1} \Lambda_t W s$ (5) where $W$ and $W^{-1}$ denote the forward and inverse wavelet transforms and $\Lambda_t$ is the diagonal thresholding operator that zeroes out wavelet coefficients smaller than the threshold, t.
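The two thresholding rules named above can be sketched in a few lines. This is a minimal illustration with hypothetical function names and made-up coefficient values; it operates on a plain list of detail coefficients and is not the MATLAB code used in the study.

```python
import math

def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t, zero out the rest."""
    return [c if abs(c) > t else 0.0 for c in coeffs]

def soft_threshold(coeffs, t):
    """Zero out small coefficients and shrink the others toward zero by t."""
    out = []
    for c in coeffs:
        magnitude = abs(c) - t
        out.append(math.copysign(magnitude, c) if magnitude > 0 else 0.0)
    return out

# Illustrative detail coefficients: small values stand for noise.
details = [0.2, -0.4, 3.0, -2.5, 0.1]
kept = hard_threshold(details, 1.0)    # [0.0, 0.0, 3.0, -2.5, 0.0]
shrunk = soft_threshold(details, 1.0)  # [0.0, 0.0, 2.0, -1.5, 0.0]
```

Hard thresholding preserves the amplitude of the surviving coefficients, whereas soft thresholding reduces each of them by the threshold value.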

Independent component analysis: Independent Component Analysis (ICA) is an approach for the solution of the BSS problem (Comon, 1994). It can be represented mathematically according to Hyvarinen et al. (2001) as Eq. 6: $X = As + n$ (6) where X, the observed signal, represents a multichannel mixture of mutually Independent Components (ICs) or sources s, n is the noise and A is the mixing matrix. (Mathematically it is similar to Eq. 1.) The problem is to determine A and recover s knowing only the measured signal X (equivalent to E(t) in Eq. 1). The ultimate goal of ICA is therefore to find an estimate W of the inverse of the mixing matrix such that Eq. 7: $u = WX$ (7) where u is the estimated ICs, i.e., estimates of s. For this solution to work, the assumption is made that the components are statistically independent, while the mixture is not. This is plausible since biological areas are spatially distinct and generate a specific activation; they are, however, correlated in their flow of information (Hoffman and Falkenstien, 2008). ICA is a viable tool for analyzing the activity of EEG signals, producing outputs which are as independent as possible, because:
• The recorded signals are a combination of temporal ICs arising from spatially fixed sources
• The signals tend to be transient (localized in time), restricted to certain ranges of temporal and spatial frequencies (localized in scale) and prominent over certain scalp regions (localized in space) (Makeig et al., 1996).
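The algebra of the mixing model can be illustrated with a small noise-free example. The sketch below is not an ICA algorithm: it assumes a known 2x2 mixing matrix A (hypothetical values) and computes W as its exact inverse, whereas FastICA or Radical must estimate W from the mixtures X alone. It only shows how the unmixing matrix recovers the sources once it is available.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix (raises if it is singular)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

# Two hypothetical sources (rows) observed over four time samples.
sources = [[1.0, -1.0, 2.0, 0.5],
           [0.0, 3.0, -1.0, 1.0]]
A = [[2.0, 1.0],
     [1.0, 3.0]]           # assumed mixing matrix, known here for illustration

X = matmul(A, sources)     # observed mixtures, noise term omitted
W = inverse_2x2(A)         # ideal unmixing matrix
u = matmul(W, X)           # recovered components, equal to the sources
```

In practice ICA recovers the components only up to scaling and ordering, which this idealized example does not show.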

Reasons for translation invariant:
Although ICA is popular and for the most part does not result in much data loss, its performance depends on the size of the data set, i.e., the number of signals. The larger the set, the higher the probability that the effective number of sources will exceed the number of channels (fixed over time), resulting in an overcomplete ICA problem. In that case the algorithm might not be able to separate noise from the signals. Another problem with ICA algorithms has to do with the signals in the frequency domain. Although noise has distinguishing features, once it overlaps the EEG signals ICA cannot filter it out without discarding the true signals as well. This results in data loss.
Since wavelet analysis uses bases that are localized in time as well as frequency, it can represent nonstationary signals such as EEG more effectively; the representation is more compact and easier to implement. WT also exploits the distinguishing features of the noise: once the wavelet coefficients are computed, the noise can be identified. Decomposition is done at different Levels (L), and the DWT produces different scale effects (Fig. 3). Alfaouri and Daqrouq (2008) showed that as the scale increases the WT of EEG and of noise behave differently. Noise concentrates on scale $2^1$, decreasing significantly as the scale increases, while EEG concentrates on the scales $2^2$ to $2^5$. Elimination of the smaller scales therefore denoises the EEG signals. In this way WT removes the overlapping noise that ICA cannot filter out of the EEG signals.
Denoising is applied only to the detail coefficients of the wavelet transform and it has been shown that this algorithm offers the advantages of smoothness and adaptation. Although thresholding is simple and easy to use, research has shown that each thresholding method exhibits problems:
• Hard thresholding leads to oscillation of the reconstructed signal
• Soft thresholding reduces the amplitude of the signal waveform (Han et al., 2009)
Thresholding may also blur the signal energy over several transform details of smaller amplitude, which may be masked by the noise. Such details are subsequently truncated when they fall below the threshold. These truncations can result in overshooting and undershooting around discontinuities in the reconstructed denoised signal, similar to the Gibbs phenomenon (Coifman and Donoho, 1995).
Coifman and Donoho proposed a solution by designing a cycle spinning denoising algorithm which:
• Shifts the signal by a collection of shifts within the range of cycle spinning
• Denoises each shifted signal using a threshold (hard or soft)
• Inverse-shifts each denoised signal to obtain a signal in the same phase as the noisy signal
• Averages the estimates
The Gibbs artifacts of the different shifts partially cancel each other and the final estimate exhibits significantly weaker artifacts (Coifman and Donoho, 1995). They called this method a Translation-Invariant (TI) denoising scheme (Fig. 3).
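The four steps of cycle spinning can be sketched as follows. This is a simplified illustration under stated assumptions, not the TI_FWT routine used in the study: it uses a Haar wavelet instead of Sym 8, circular shifts, soft thresholding with a fixed threshold, and hypothetical function names and test values.

```python
import math

R = math.sqrt(2.0)

def haar_forward(x, levels):
    """Multi-level orthonormal Haar DWT: returns (approx, details per level)."""
    if len(x) % (2 ** levels):
        raise ValueError("length must be divisible by 2**levels")
    approx, details = list(x), []
    for _ in range(levels):
        pairs = range(0, len(approx), 2)
        details.append([(approx[i] - approx[i + 1]) / R for i in pairs])
        approx = [(approx[i] + approx[i + 1]) / R for i in pairs]
    return approx, details

def haar_inverse(approx, details):
    """Inverse of haar_forward, rebuilding from the coarsest level up."""
    for d in reversed(details):
        rebuilt = []
        for a, c in zip(approx, d):
            rebuilt += [(a + c) / R, (a - c) / R]
        approx = rebuilt
    return approx

def soft(c, t):
    """Soft-threshold a single coefficient."""
    m = abs(c) - t
    return math.copysign(m, c) if m > 0 else 0.0

def denoise_once(x, levels, t):
    """Ordinary (shift-dependent) wavelet denoising of one signal."""
    approx, details = haar_forward(x, levels)
    details = [[soft(c, t) for c in d] for d in details]
    return haar_inverse(approx, details)

def cycle_spin(x, levels, t, n_shifts):
    """Shift, denoise, inverse-shift and average over n_shifts circular shifts."""
    n = len(x)
    total = [0.0] * n
    for s in range(n_shifts):
        shifted = x[s:] + x[:s]                     # circular shift by s
        den = denoise_once(shifted, levels, t)      # denoise the shifted copy
        restored = den[n - s:] + den[:n - s]        # undo the shift
        total = [a + b for a, b in zip(total, restored)]
    return [v / n_shifts for v in total]

# Illustrative step signal with a small alternating perturbation as "noise".
noisy = [(0.0 if i < 8 else 4.0) + (0.3 if i % 2 else -0.3) for i in range(16)]
estimate = cycle_spin(noisy, levels=2, t=0.5, n_shifts=16)
```

When `n_shifts` equals the signal length, every circular shift is used and the averaged estimate no longer depends on where the signal starts, which is the sense in which the scheme is translation invariant.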
Experimental results in Alfaouri and Daqrouq (2008) confirm that single TI wavelet denoising performs better than traditional single wavelet denoising. Research has also shown that TI produces a smaller approximation error when approximating a smooth function and mitigates Gibbs artifacts when approximating a discontinuous function.

MATERIALS AND METHODS
Here we investigate the TI denoising methodology to determine its performance. The data used in the performance tests were real Electroencephalographic (EEG) signals from both humans and animals, collected from the following site: http://sccn.ucsd.edu/~arno/fam2data/publicly_available_EEG_data.html. The data came from different sources:
• A collection of 32-channel data from one male subject who performed a visual task
• Human data based on five disabled and four healthy subjects. The disabled subjects (1-5) were all wheelchair-bound but had varying communication and limb muscle control abilities. The four healthy subjects (6-9) were all male PhD students, age 30, who had no known neurological deficits. Signals were recorded at a 2048 Hz sampling rate from 32 electrodes placed at the standard positions of the 10-20 international system
• A collection of 32-channel data from 14 subjects (7 males, 7 females) who performed a go-nogo categorization task and a go-no recognition task on natural photographs presented very briefly (20 ms). Each subject responded to a total of 2500 trials. The data is CZ referenced and is sampled at
• Five data sets containing quasi-stationary, noise-free EEG signals from both normal and epileptic subjects. Each data set contains 100 single-channel EEG segments of 23.6 sec duration
Experiments were conducted using the above-mentioned signals in Matrix Laboratory (MATLAB) 7.8.0 (R2009) on a laptop with an AMD Athlon 64 X2 Dual-core Processor 1.80 GHz. To denoise, we utilized Coifman and Donoho's method, divided into the following steps:
• Signal collection: the algorithm is designed to denoise both naturally and artificially noised EEG signals, which should therefore be mathematically defined according to Eq. 1
• The signals are decomposed into 5 levels of DWT using the Symmlet family, separating noise and true signals.
Symmlets are orthogonal and their regularity increases with the number of vanishing moments (Donoho and Johnstone, 1995). After experiments, the number of vanishing moments chosen is 8 (Sym 8)
• Choose and apply threshold value: denoise using the soft-thresholding method, discarding all coefficients below the threshold value, using VisuShrink based on the universal threshold defined by Donoho and Johnstone (1995), given as Eq. 9: $T = \sqrt{2\sigma^{2} \log N}$ (9) where $\sigma^{2}$ is the noise variance and N is the number of samples
Once the signals were tested using TI_FWT, we tested the same signals using two successful ICA algorithms, FastICA and Radical. Both algorithms were downloaded from the web sites of their respective authors. In the case of FastICA, the symmetric orthogonalization approach with the tanh nonlinearity was utilized.
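The universal threshold of Eq. 9 is straightforward to compute once the noise level is known. The sketch below assumes the natural logarithm and, since the text does not say how σ was obtained, estimates it from the finest-scale detail coefficients with the median absolute deviation rule (median divided by 0.6745), a common choice that is an assumption here. Function names are hypothetical.

```python
import math

def mad_sigma(finest_details):
    """Assumed noise estimate: median(|d|) / 0.6745 on the finest details."""
    mags = sorted(abs(d) for d in finest_details)
    n = len(mags)
    if n == 0:
        raise ValueError("need at least one coefficient")
    mid = n // 2
    median = mags[mid] if n % 2 else 0.5 * (mags[mid - 1] + mags[mid])
    return median / 0.6745

def universal_threshold(sigma, n_samples):
    """VisuShrink threshold T = sigma * sqrt(2 * ln(N))."""
    return sigma * math.sqrt(2.0 * math.log(n_samples))

# Example: unit noise level and a 1024-sample signal.
t_unit = universal_threshold(1.0, 1024)
```

Because the threshold grows only with the square root of log N, doubling the record length raises it very little, while it scales linearly with the estimated noise level.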

RESULTS
Consider the contaminated EEG signal to be denoised in Fig. 4. This was denoised using the FastICA and Radical ICA algorithms along with TI_FWT. The results, shown in Fig. 5, indicate that TI_FWT gives the better result. To determine the effectiveness of all three methods, the Mean Square Error (MSE), the Signal to Noise Ratio (SNR) and the Percentage Root Mean Square Difference (PRD), defined below, were calculated. SNR compares the level of a desired signal to the level of background noise. The greater the ratio, the less noise is present relative to the signal and the more easily it can be filtered out. Biosignals such as EEG commonly have an SNR below 0 dB, therefore the highest SNR would be 0 dB. Fig. 6 shows the SNR for the three algorithms.
The MSE measures the average of the square of the "error" which is the amount by which the estimator differs from the quantity to be estimated. The difference occurs because of randomness or because the estimator doesn't account for information that could produce a more accurate estimate. For a perfect fit, I(x, y) = I'(x, y) and MSE = 0; so, the MSE index ranges from 0 to infinity, with 0 corresponding to the ideal. Figure 7 shows the results of MSE calculations.
PRD measures the root mean square difference between the original and reconstructed signals relative to the original signal, i.e., it measures the level of distortion between the original signal and the reconstructed signal, expressed as a percentage of deformation in the denoised signal. PRD results are shown in Fig. 8.
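The three performance measures can be sketched with their commonly used definitions. These formulas are assumptions, since the exact expressions used in the study are not reproduced in the text: MSE as the mean squared error, PRD as 100 times the square root of error energy over signal energy, and SNR in dB as ten times the base-10 logarithm of signal energy over error energy. Function names and sample values are hypothetical.

```python
import math

def mse(clean, estimate):
    """Mean of the squared differences between reference and estimate."""
    return sum((c - e) ** 2 for c, e in zip(clean, estimate)) / len(clean)

def prd(clean, estimate):
    """Percentage root mean square difference relative to the reference."""
    err = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    energy = sum(c ** 2 for c in clean)
    return 100.0 * math.sqrt(err / energy)

def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB, treating the residual as noise."""
    err = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    energy = sum(c ** 2 for c in clean)
    return math.inf if err == 0 else 10.0 * math.log10(energy / err)

# Illustrative values: the estimate differs from the reference in one sample.
clean = [1.0, -2.0, 3.0, -4.0]
estimate = [1.0, -2.0, 3.0, -3.0]
scores = (mse(clean, estimate), prd(clean, estimate), snr_db(clean, estimate))
```

With these definitions a better denoiser gives a smaller MSE and PRD and a larger SNR, which is the ordering used to compare the three algorithms.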

DISCUSSION
Examination of Fig. 6 shows that TI_FWT has the SNR nearest to 0 dB, while performance deteriorates with FastICA and Radical, i.e., the three methods range from low to moderate to high noise conditions. Having the highest SNR clearly demonstrates that TI_FWT has filtered out more noise than the other algorithms and therefore produces cleaner signals.
The smaller the MSE, the closer the estimator is to the actual data. A small mean squared error means that the estimate reflects the data more accurately than an estimate with a larger mean squared error. Figure 7 shows that TI_FWT has the smallest MSE, showing that it has produced the signal which is nearest to the pure signal.
Since the variability of the signal around its baseline is what should be preserved and not the baseline itself, the performance measure used to reveal the accuracy of the algorithm was the variance of the error with respect to the variance of the signal. In Fig. 8, FastICA and Radical both have higher PRD values, indicating that in these cases the performance is weaker due to the presence of residual noise.
Research has found that the Wavelet Transform is best suited for denoising as far as performance goes because of properties such as sparsity, multiresolution and its multiscale nature. Non-orthogonal wavelets such as UDWT and Multiwavelets improve the performance at the expense of a large computational overhead (Motwani et al., 2004).

CONCLUSION
In recent years researchers have used both ICA algorithms and WT to denoise EEG signals. In this study we draw attention to Coifman and Donoho's Translation-Invariant Wavelet Transform and its application to denoising EEG signals. We also compared its performance against two known ICA algorithms, FastICA and Radical. TI_FWT outperformed both algorithms, having the smallest MSE and PRD, which indicates a cleaner signal. This was confirmed by its also having the highest SNR. We therefore conclude that the translation-invariant wavelet method is an efficient technique for improving the quality of EEG signals.