
LAPRAN: A Scalable Laplacian Pyramid Reconstructive Adversarial Network for Flexible Compressive Sensing Reconstruction

Project Overview

Compressive sensing (CS) is a transformative sampling technique that is more efficient than Nyquist sampling. Rather than sampling at the Nyquist rate and then compressing the sampled data, CS directly senses signals in a compressed form while retaining the information needed for accurate reconstruction. The trade-off for the simplicity of encoding is an intricate reconstruction process. Conventional CS reconstruction algorithms are based on either convex optimization or greedy/iterative methods. These methods suffer from three major drawbacks that limit their practical usage. First, their iterative nature makes them computationally intensive and unsuitable for hardware acceleration. Second, the widely adopted sparsity constraint assumes that the given signal is sparse on a known basis. However, natural images do not have an exactly sparse representation on any known basis (DCT, wavelet, or curvelet), so the strong dependency on the sparsity constraint becomes the performance-limiting factor of conventional methods. Constructing over-complete dictionaries with deterministic atoms can only moderately relax the constraint, as the learned linear sparsity models are often shallow and thus have limited impact. Third, conventional methods have a rigid structure that allows reconstruction at a fixed resolution only. The recovery quality cannot be guaranteed when the compression ratio (CR) has to be compromised because of limited communication bandwidth or storage space. A better solution is to reconstruct at a compromised resolution while keeping a satisfactory reconstruction signal-to-noise ratio (RSNR), rather than letting the RSNR drop at a fixed resolution.
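As a rough illustration of the sensing step described above, the following sketch takes random linear measurements of a flattened image. It assumes that CR is defined as n / m (signal length over number of measurements) and that the sensing matrix is Gaussian; both are assumptions made for this example, and the function name `cs_measure` is hypothetical.

```python
import numpy as np

def cs_measure(x, cr, seed=0):
    """Sense a signal with a random Gaussian matrix.

    Assumption for this sketch: cr = n / m, so m = n // cr measurements.
    """
    n = x.size
    m = max(1, n // cr)
    rng = np.random.default_rng(seed)
    # Scale by 1/sqrt(m) so the columns of phi have roughly unit norm.
    phi = rng.standard_normal((m, n)) / np.sqrt(m)
    y = phi @ x.ravel()  # compressed measurements, length m
    return phi, y

# Placeholder 32x32 "image" made of random values.
image = np.random.default_rng(1).random((32, 32))
phi, y = cs_measure(image, cr=10)
print(phi.shape, y.shape)  # (102, 1024) (102,)
```

The encoder is a single matrix-vector product, which is why encoding is simple. Recovering the 1024 unknowns from 102 measurements is the underdetermined inverse problem that the reconstruction methods discussed here try to solve.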

Deep neural networks (DNNs) have recently been explored for learning the inverse mapping of CS. The limitations of existing DNN-based approaches are twofold. First, the reconstruction results tend to be blurry because of the exclusive use of a Euclidean loss. Specifically, the recovery quality of DNN-based methods is usually no better than that of optimization-based methods when the CR is low, e.g., CR <= 10. Second, like the optimization-based methods, the existing DNN-based methods all have rigid structures that allow reconstruction at a fixed, non-adaptive resolution only. The reconstruction will simply fail when the CR is lower than a required threshold.
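One way to see why a purely Euclidean loss tends to produce blurry results is the toy example below. It is not taken from the project; the two candidate signals are hypothetical data. When several sharp signals are equally plausible, the single prediction that minimizes the expected squared error is their average, which smears the edge.

```python
import numpy as np

# Hypothetical toy data: two equally plausible sharp signals,
# a step edge located at index 3 or at index 5.
candidate_a = np.array([0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
candidate_b = np.array([0, 0, 0, 0, 0, 1, 1, 1], dtype=float)
candidates = np.stack([candidate_a, candidate_b])

def expected_mse(prediction):
    """Mean squared error of one prediction against all candidates."""
    return np.mean((candidates - prediction) ** 2)

# The MSE-optimal single prediction is the mean of the candidates,
# which replaces the sharp edge with a ramp of 0.5 values.
blurry = candidates.mean(axis=0)

print(expected_mse(candidate_a))  # 0.125
print(expected_mse(blurry))       # 0.0625
```

The averaged signal scores better under the Euclidean loss than either sharp candidate, even though it matches neither of them.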

In this project, we propose a scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) for flexible CS reconstruction that addresses all the problems mentioned above. LAPRAN does not require sparsity as prior knowledge and hence can potentially be used in a broader range of applications, especially where the exact signal sparsity model is unknown. When applied to image signals, LAPRAN progressively reconstructs high-fidelity images following the concept of the Laplacian pyramid through multiple stages of specialized reconstructive adversarial networks (RANs). At each pyramid level, CS measurements are fused with a low-dimensional contextual latent vector to generate a reconstructed image with both higher resolution and higher reconstruction quality. The non-iterative and highly concurrent nature of LAPRAN makes it suitable for hardware acceleration. Furthermore, the scalable pyramid structure of LAPRAN enables high-fidelity CS reconstruction at a flexible resolution that can adapt to a wide range of CRs. One can dynamically add or remove RAN stages from LAPRAN to reconstruct images at a higher or lower resolution when the CR becomes lower or higher, respectively. Therefore, a consistently superior recovery quality can be guaranteed across a wide range of CRs. Experimental results on multiple public datasets show that LAPRAN offers an average PSNR improvement of 7.47 dB and 5.98 dB, and an average SSIM improvement of 57.93% and 33.20%, compared with model-based and data-driven baselines, respectively.
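The Laplacian pyramid idea behind this progressive scheme can be sketched with plain array arithmetic. In the sketch below the residuals are computed from a known image purely to show the coarse-to-fine bookkeeping; in LAPRAN the residuals are instead generated by the RAN stages from CS measurements. The 2x2 averaging, the pixel-repetition upscaling, and all function names are assumptions chosen for this example, not the project's implementation.

```python
import numpy as np

def downscale(img):
    """Halve the resolution by averaging each 2x2 block."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale(img):
    """Double the resolution by pixel repetition (no new detail added)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def build_pyramid(img, levels):
    """Return the coarsest image and the residuals, ordered coarse to fine."""
    residuals = []
    current = img
    for _ in range(levels):
        low = downscale(current)
        residuals.append(current - upscale(low))  # detail lost by downscaling
        current = low
    return current, residuals[::-1]

def reconstruct(base, residuals):
    """Add each residual to the upscaled output of the previous level."""
    out = base
    for residual in residuals:
        out = upscale(out) + residual
    return out

# Placeholder 64x64 "image" made of random values.
image = np.random.default_rng(0).random((64, 64))
base, residuals = build_pyramid(image, levels=3)
restored = reconstruct(base, residuals)

print(base.shape, [r.shape for r in residuals])  # (8, 8) [(16, 16), (32, 32), (64, 64)]
print(np.allclose(restored, image))              # True
```

Stopping the loop in `reconstruct` early yields a valid lower-resolution image, which mirrors how removing stages trades resolution for a higher CR.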

The contributions of this project are:

  • We propose a novel architecture of the neural network model (LAPRAN) that enables high-fidelity, flexible and fast CS reconstruction.
  • We propose to fuse CS measurements with contextual latent vectors of low-resolution images at each pyramid level to enhance the CS recovery quality.
  • We illustrate that the progressive learning and reconstruction strategy can mitigate the difficulty of the inverse mapping problem in CS. Such a strategy not only accelerates the training by confining the search space but also improves the recovery quality by eliminating the accumulation of errors.
Fig. 1: Overall structure of the proposed LAPRAN. The CS measurement of a high-dimensional image is performed by a multi-rate random encoder. The LAPRAN takes CS measurements as inputs and progressively reconstructs an original image in multiple hierarchies with incremental resolutions and recovery qualities. At each pyramid level, a RAN generates an image residual, which is subsequently combined with an upscaled output from the previous level to form a higher-resolution output of the current level (upsampling and upscaling refer to increasing the image resolution with and without new details added, respectively).
Fig. 2: Illustration of a sensing matrix for multi-rate CS.
Fig. 3: The structure of RecGen2. A low-resolution input image i_2 is transformed into a high-frequency image residual r_2 by an encoder-decoder network. A high-resolution output image is generated by adding the image residual to the upscaled input image. The dimension of each feature map is denoted in the figure. An example output of each convolutional layer is also shown.
Fig. 4: Convergence analysis. We compare the MSE test error using the CIFAR10 dataset at the CR of 10. The results without measurement fusion can be regarded as the performance of an SR approach. The MSE loss of the SR approach cannot be effectively reduced after stage 1 because of the lack of new information.
Fig. 5: Visual comparison of butterfly (Set5) and zebra (Set14) at the CR of 5 and 20, respectively. LAPRAN preserves finer details.
Table 1: Summary of the major differences between the proposed and the reference methods.
Table 2: Quantitative evaluation of state-of-the-art CS reconstruction methods.
Table 3: Runtime (seconds) for reconstructing a 64 × 64 image patch. Unlike the model-based methods, the runtime of LAPRAN is invariant to CR. LAPRAN is slightly slower than ReconNet because of its large model capacity. CSGM and LDAMP are relatively slow due to their iterative nature.

Publications

Conference Proceedings

K. Xu, Z. Zhang, and F. Ren, LAPRAN: A Scalable Laplacian Pyramid Reconstructive Adversarial Network for Flexible Compressive Sensing Reconstruction, The 15th European Conference on Computer Vision (ECCV'18), Munich, Germany, pp. 491-507, 2018.

Acknowledgements

This work is supported by NSF grant IIS/CPS-1652038 and a Google Faculty Research Award.
