Comparison of Denoising Algorithms for Urban Scenes

Several recently presented denoising algorithms display comparable performance, and it is unclear which provides the best results. We used the practical example of urban scenes to compare two of the top-performing algorithms. We found that in the practical case of quantized input and output imagery, the results differed from those obtained with the conventional method of comparison.


Introduction
The denoising problem is most often represented as an image with Additive White Gaussian Noise (AWGN), where the noise is to be removed. Research in this area is extensive (Buades et al., 2005; Chatterjee and Milanfar, 2009; Chen et al., 2013; Dabov et al., 2007; Dong et al., 2015; Gu et al., 2014; Schmidt and Roth, 2014), and the latest denoising methods have demonstrated impressive results. However, their levels of effectiveness seem to differ by only small amounts, so it is not entirely clear which method works best for a particular application or which should be selected. For example, the Peak Signal-to-Noise Ratio (PSNR) is the measure most often used to compare the performance of algorithms, but results between methods are typically within 1 dB. The PSNR is based on the Mean Squared Error (MSE), but normalization of the imagery is not necessarily performed. For a meaningful comparison, methods should also be evaluated with proper normalization and quantization.
An algorithm that represented an important contribution to denoising is Block-Matching 3D (BM3D) (Dabov et al., 2007), and newly developed methods are still compared to it. This approach groups similar patches of an image into 3D blocks that are denoised and then returned to their original positions. Another significant method, named SSC-GSM, connects a Gaussian Scale Mixture (GSM) with Simultaneous Sparse Coding (SSC). It builds on the observation that many important structures in natural images, including edges and textures, can be characterized by an abundance of self-repeating patterns (Dong et al., 2015).
A third impressive new method uses shrinkage fields that are based on existing optimization algorithms for common random field models and are computationally efficient (Schmidt and Roth, 2014). This approach attains its performance through loss-based training of all model parameters and the use of a cascade architecture that can be adapted to different trade-offs between efficiency and image quality.
In this study we compared the performance of these three methods applied to urban scenes. We used the PSNR, as is common in work of this kind, and also the PSNR computed on images with normalized energy. We also considered the practical cases of quantized noisy and denoised imagery.

Methods
We considered two of the current leading methods for denoising and compared them against each other while using the BM3D approach as a reference.

BM3D
This method exploits the redundancy of patterns in an image. In general, an image is divided into small patches, and similar patches are grouped together in a 3D stack. Next, a 3D linear transform of each stack is performed. By exploiting the correlation between the image patches within a group, a sparse representation can be found and the data effectively filtered. Typically, a Wiener filter is applied to the data to reduce noise before an inverse 3D transform is performed. The last step is to put the denoised patches back into their original locations. There are several parameters to be set, and their effect has been discussed previously (Lebrun, 2012).
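The grouping and transform-domain filtering described above can be sketched roughly as follows. This is a simplified illustration and not the BM3D implementation: the function names `match_patches` and `filter_group`, the patch and search-window sizes, and the threshold factor are hypothetical, a 3D FFT stands in for the 3D linear transform, and plain hard thresholding stands in for the filtering stage. The final step of returning the patches to their locations is omitted.

```python
import numpy as np

def match_patches(img, ref_yx, patch=8, search=8, k=8):
    """Return top-left corners of the k patches most similar to the reference.

    Similarity is the sum of squared differences inside a local search window.
    The reference patch has distance 0, so it is part of its own group.
    """
    y0, x0 = ref_yx
    ref = img[y0:y0 + patch, x0:x0 + patch]
    h, w = img.shape
    scored = []
    for y in range(max(0, y0 - search), min(h - patch, y0 + search) + 1):
        for x in range(max(0, x0 - search), min(w - patch, x0 + search) + 1):
            dist = np.sum((img[y:y + patch, x:x + patch] - ref) ** 2)
            scored.append((dist, y, x))
    scored.sort()  # ascending distance
    return [(y, x) for _, y, x in scored[:k]]

def filter_group(img, coords, patch, sigma, thr=2.7):
    """Stack the matched patches, threshold their 3D spectrum, and invert.

    Assumption: an orthonormal FFT replaces the 3D transform, and small
    coefficients (below thr * sigma) are treated as noise and set to zero.
    """
    stack = np.stack([img[y:y + patch, x:x + patch] for y, x in coords])
    spec = np.fft.fftn(stack, norm="ortho")
    spec[np.abs(spec) < thr * sigma] = 0.0
    return np.real(np.fft.ifftn(spec, norm="ortho"))
```

Because similar patches are strongly correlated, most of the signal energy concentrates in a few coefficients of the stacked spectrum, which is why thresholding removes mostly noise.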

SSC-GSM
In general, this work with SSC-GSM is related to solving an inverse problem with piecewise linear estimators, but using a non-local strategy of similar image patches and a local parametric Gaussian model. The basic idea here is to model each sparse transform coefficient as a Gaussian distribution with a positive scaling variable and impose a sparse distribution prior on the scaling variables (Dong et al., 2015).
Sets of sparse coefficients from similar patches share the same prior distributions, so that both local and nonlocal dependencies are exploited for image restoration. Sparse coding represents a signal x as a linear combination of basis vectors D whose coefficients α satisfy some sparsity constraint. This approach models the sparse coefficients with a GSM, using a Gaussian vector β and a scalar multiplier θ, so that the GSM problem reduces to the joint estimation of β and θ. For similar image patches the same θ and biased mean µ are used, and the nonlocal means approach (Buades et al., 2005) is used to estimate the µ's depending on patch similarity. The variables β and θ are optimized jointly with γ, which is related to µ through image variances, using an initial estimate of the denoised image. The variables are estimated recursively in a computationally efficient fashion.
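The decomposition described above can be written compactly as follows. This is only a notational sketch based on the description in this section: the index i and the diagonal matrix Λ are notation assumed here for illustration, and the full objective of Dong et al. (2015) is not reproduced.

```latex
x \approx D\alpha, \qquad
\alpha_i = \theta_i \, \beta_i, \quad \theta_i > 0, \qquad
\text{or equivalently} \quad \alpha = \Lambda \beta, \quad
\Lambda = \mathrm{diag}(\theta_1, \dots, \theta_n)
```

Under this factorization, estimating the sparse coefficients α amounts to jointly estimating the Gaussian part β and the positive multipliers θ.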

Shrinkage Fields
In the shrinkage fields approach, a restored image is predicted by finding the MAP estimate of the image given the degraded image, where the corruption process is modeled with a Gaussian likelihood kernel and a strength term (Schmidt and Roth, 2014). A block coordinate descent strategy is used that alternates between minimizing with respect to the ideal image x and the auxiliary variables z. The auxiliary variables are introduced so that the MAP estimate becomes a quadratic function and optimization reduces to solving many univariate optimization problems. Shrinkage functions have often been thought of as fixed soft-thresholding functions that "pull" coefficients toward zero. In this approach, they are replaced with a linear combination of Gaussian RBF kernels. This allows the optimization procedure to be reduced to a single quadratic minimization in each iteration in terms of a Shrinkage Field (SF). Therefore, an SF is a Gaussian conditional random field whose parameters are determined through the learned model parameters, the observed image and the Gaussian likelihood kernel.
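To illustrate the difference between a fixed soft-thresholding function and a shrinkage function built from Gaussian RBF kernels, the following sketch fits a kernel mixture to a soft-threshold curve. It is an illustration under stated assumptions, not the training procedure of Schmidt and Roth (2014): the kernel centers, the precision `gamma`, and the least-squares fit are choices made here for demonstration, whereas the method described above learns its parameters through loss-based training. All function names are hypothetical.

```python
import numpy as np

def soft_threshold(v, lam):
    """Fixed shrinkage: pull each value toward zero by lam."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def rbf_design(v, centers, gamma):
    """One column per kernel: exp(-gamma / 2 * (v - mu_j)^2)."""
    v = np.atleast_1d(np.asarray(v, dtype=np.float64))
    return np.exp(-0.5 * gamma * (v[:, None] - centers[None, :]) ** 2)

def fit_rbf_shrinkage(target, v, centers, gamma):
    """Least-squares weights so the kernel mixture approximates target(v).

    Assumption: least squares is used only for this demonstration.
    """
    design = rbf_design(v, centers, gamma)
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return weights

def rbf_shrinkage(v, weights, centers, gamma):
    """Flexible shrinkage: a linear combination of Gaussian RBF kernels."""
    return rbf_design(v, centers, gamma) @ weights

# Usage: approximate soft thresholding on [-1, 1] with 21 kernels.
centers = np.linspace(-1.0, 1.0, 21)
gamma = 100.0  # kernel standard deviation equals the center spacing of 0.1
grid = np.linspace(-1.0, 1.0, 401)
weights = fit_rbf_shrinkage(soft_threshold(grid, 0.2), grid, centers, gamma)
approx = rbf_shrinkage(grid, weights, centers, gamma)
```

Because the weights are free parameters, the same kernel mixture can represent shapes other than soft thresholding, which is what makes the shrinkage function trainable.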

Results
In our experiments, we used 256×256 images taken from the CBCL Street Scenes collection (MIT, 2007) and added AWGN with σ = 15 and 25. To compare images we initially used the common measure of PSNR, which is:
$$ \mathrm{PSNR} = 10 \log_{10} \left( \frac{\mathrm{MAX}^2}{\mathrm{MSE}} \right) \qquad (1) $$
$$ \mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( x(i,j) - y(i,j) \right)^2 \qquad (2) $$
with x representing the original image, y the denoised image, MAX the maximum possible pixel value, the summations taken in the horizontal and vertical directions over the entire image, and M and N the number of pixels in the horizontal and vertical directions, respectively.
We used the images in Fig. 1a-5a, which represent random urban scenes, and show the denoising results using the PSNR in Table 1. The highest value for each case is shown in bold text; the lower the value of the MSE, the higher the value of the PSNR. Each number in the table represents the average over 100 noisy images for each case.
For σ = 15, the SSC-GSM had higher values than the SF method, but on average the PSNR differed by less than 1 dB. For σ = 25, however, the results were mixed and both methods performed comparably. Both methods nevertheless outperformed the BM3D method.
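The noise model and the PSNR measure described above can be sketched as follows. This is a minimal example and not the evaluation code used in this study: the function names are hypothetical, and the peak value of 255 assumes 8-bit imagery.

```python
import numpy as np

def add_awgn(image, sigma, rng):
    """Add zero-mean white Gaussian noise with standard deviation sigma."""
    image = np.asarray(image, dtype=np.float64)
    return image + rng.normal(0.0, sigma, image.shape)

def mse(x, y):
    """Mean squared error over all pixels of two equally sized images."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return np.mean((x - y) ** 2)

def psnr(x, y, peak=255.0):
    """PSNR in dB; peak = 255 is an assumption for 8-bit imagery."""
    err = mse(x, y)
    if err == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / err)

# Usage: PSNR of a noisy image against a synthetic 256x256 original.
rng = np.random.default_rng(0)
original = np.full((256, 256), 128.0)
noisy = add_awgn(original, 25.0, rng)
noisy_psnr = psnr(original, noisy)
```

For unclipped noise with σ = 25 the PSNR of the noisy input is close to 20 log10(255/25), or about 20.2 dB, which gives a baseline against which the denoised results can be read.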
When an image is acquired from a sensor, it may not be under controlled conditions. Therefore, the sensor may acquire a noisy image. We represented this practical case by adding noise to an image, then quantizing the image to 8 bits and calculating the PSNR as before. The results of this case are shown in Table 2.
For the case with σ = 15, the SSC-GSM showed a slight advantage in performance, but on average the results differed by only about 0.05 dB. Both methods outperformed the BM3D method, but by a much smaller amount than when the input was not quantized, in most cases less than 1 dB. For the case with σ = 25, the SSC-GSM method clearly outperformed both other methods.
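The quantized-input case can be sketched as follows. The exact quantization scheme is not specified in the text, so rounding to the nearest integer and clipping to the range 0 to 255 is an assumption, and the function names are hypothetical.

```python
import numpy as np

def quantize_8bit(image):
    """Round to the nearest integer and clip to [0, 255] (assumed scheme)."""
    return np.clip(np.rint(image), 0, 255).astype(np.uint8)

def noisy_quantized_input(clean, sigma, rng):
    """Simulate a sensor image: add AWGN, then store the result as 8 bits."""
    clean = np.asarray(clean, dtype=np.float64)
    noisy = clean + rng.normal(0.0, sigma, clean.shape)
    return quantize_8bit(noisy)
```

Clipping matters most for very dark or very bright regions, where part of the noise distribution falls outside the 8-bit range and the noise is therefore no longer exactly Gaussian.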
The expression for MSE in Equation 2 is used directly in most research reports. However, a shift or scale of an image or patch can change the value of the MSE. Therefore, we also normalized the energy of the denoised image to that of the original image, then calculated the MSE. This way, any power remaining after denoising in excess of that of the original image contributes to the MSE. This calculation of the PSNR is referred to as the PSNRn. Visually, the results will be the same as with the PSNR. The results using the PSNRn are shown in Table 3. Except for the BM3D results, the PSNRn was always lower than the PSNR. For both values of noise the results were similar between the SSC-GSM and the SF methods, often differing by less than 1 dB. Although both methods showed improved performance when compared to the BM3D method, the difference was not as great as when the PSNR was used. In addition, when comparing the SSC-GSM and SF methods, some images showed opposite results under the PSNR and PSNRn measures.
Table 3. PSNRn results for the different methods using the images in Fig. 1a-5a with denoised images normalized to the same energy
Table 4. PSNRn results for the different methods using the images in Fig. 1a-5a with both the noisy input and denoised images quantized to 8 bits and denoised images normalized to the same energy
Finally, we compared the case when both the acquired image and the denoised image were quantized to 8 bits. This represents the practical case in which a denoised image from a sensor is to be stored as a file. For this scenario, we used the PSNRn measure, and the results are shown in Table 4. For both values of noise, the SSC-GSM and SF methods performed similarly, but at the higher noise level the SF method seemed to perform slightly better. As before, the BM3D method did not perform as well as the other two.
The noisy images used in our study are shown in Fig. 1b-5b for σ = 25. The denoised results using the SSC-GSM, SF and BM3D algorithms are shown in parts c-e of those figures, respectively. From these results, it can be seen that the SSC-GSM and SF methods work remarkably well, and it is difficult to see the difference visually. It is also apparent that the performance of these algorithms is superior to that of the BM3D algorithm.
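The energy normalization behind the PSNRn measure can be sketched as follows. The text does not state how the normalization was carried out, so a single global scale factor that matches the sum of squared pixel values is an assumption, as are the function names and the peak value of 255.

```python
import numpy as np

def normalize_energy(denoised, original):
    """Scale the denoised image so its energy equals that of the original.

    Assumption: energy is the sum of squared pixel values, matched with one
    global scale factor.
    """
    y = np.asarray(denoised, dtype=np.float64)
    x = np.asarray(original, dtype=np.float64)
    energy_y = np.sum(y ** 2)
    if energy_y == 0:
        return y  # an all-zero image cannot be rescaled
    return y * np.sqrt(np.sum(x ** 2) / energy_y)

def psnr_n(original, denoised, peak=255.0):
    """PSNR in dB computed after energy normalization of the denoised image."""
    x = np.asarray(original, dtype=np.float64)
    y = normalize_energy(denoised, x)
    err = np.mean((x - y) ** 2)
    if err == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / err)
```

With this definition, a denoised image that differs from the original only by a global scale factor yields no error, while structural differences still contribute to the MSE.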

Conclusion
As expected, we found that the SSC-GSM and SF algorithms gave higher PSNR and PSNRn values and superior visual performance when compared to the BM3D algorithm. Although the SSC-GSM and SF algorithms achieved their performance by different approaches, it is difficult to claim that one performed better than the other since the results were so close. In addition, several parameters within each approach could be changed to alter the results, further complicating a definitive comparison.
When the input was quantized, the PSNR values decreased significantly across all methods. The SSC-GSM's advantage decreased at the lower value of σ, but it outperformed the SF method by a small amount at the larger value of σ. When the PSNRn metric was used without any quantization, the values decreased significantly as compared to the PSNR, except for the BM3D method. Using the PSNRn metric with quantized input and denoised images, the SF method showed a slight advantage at the higher value of σ. Although not reported here, we found that the SF approach seemed to be faster than the SSC-GSM approach.