Learning in Compressed Domains

Title: Learning in Compressed Domains
Publication Type: Thesis
Year of Publication: 2021
Authors: Xu, K
Academic Department: School of Computing and Augmented Intelligence
Degree: Doctor of Philosophy in Computer Engineering
Date Published: 05/2021
University: Arizona State University
Keywords (or New Research Field): psclab

A massive volume of data is generated at an unprecedented rate in the information age. The growth of data significantly exceeds the computing and storage capacities of the existing digital infrastructure. In the past decade, many methods have been invented for data compression, compressive sensing and reconstruction, and compressed learning (learning directly on compressed data) to overcome the data-explosion challenge. While prior works are predominantly model-based, focus on small models, and are not suitable for task-oriented sensing or hardware acceleration, the number of available models for compression-related tasks has escalated by orders of magnitude in the past decade. Motivated by this significant growth and the success of big data, this thesis proposes to revolutionize both compressive sensing reconstruction (CSR) and compressed learning (CL) methods from a data-driven perspective.

This thesis discusses a series of topics on data-driven CSR. Individual data-driven models are proposed for the CSR of bio-signals, images, and videos, each with an improved trade-off between compression ratio and recovery fidelity. Specifically, a scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) is proposed for single-image CSR. LAPRAN progressively reconstructs images following the concept of the Laplacian pyramid through the concatenation of multiple reconstructive adversarial networks (RANs). For the CSR of videos, CSVideoNet is proposed to improve the spatial-temporal resolution of reconstructed videos.
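To make the Laplacian pyramid concept concrete, the following is a minimal sketch on a 1-D signal. It is not LAPRAN itself: there are no networks or adversarial training here, and the function names (`downsample`, `upsample`, `build_pyramid`, `reconstruct`) and the pair-averaging and sample-repeating operators are assumptions chosen for illustration. It only shows the coarse-to-fine idea of starting from a low-resolution estimate and adding one residual per stage.

```python
def downsample(signal):
    """Halve the resolution by averaging neighbouring pairs (assumed operator)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Double the resolution by repeating each sample (assumed operator)."""
    return [value for value in signal for _ in range(2)]


def build_pyramid(signal, levels):
    """Split a signal into a coarse base plus one residual per level.

    The signal length must be divisible by 2 ** levels.
    Residuals are ordered from finest to coarsest.
    """
    residuals = []
    current = list(signal)
    for _ in range(levels):
        coarse = downsample(current)
        predicted = upsample(coarse)
        # The residual holds the detail lost by going one level coarser.
        residuals.append([a - b for a, b in zip(current, predicted)])
        current = coarse
    return current, residuals


def reconstruct(coarse, residuals):
    """Rebuild the signal progressively, from coarsest to finest."""
    current = list(coarse)
    for residual in reversed(residuals):
        upsampled = upsample(current)
        current = [a + b for a, b in zip(upsampled, residual)]
    return current


if __name__ == "__main__":
    signal = [4, 8, 6, 2, 10, 14, 0, 4]
    base, details = build_pyramid(signal, levels=2)
    print(base, details)
    print(reconstruct(base, details))
```

In a learned, progressive scheme of the kind the abstract describes, each stage would predict its residual from the compressed measurements rather than read it from a stored pyramid; here the residuals are computed from the original signal purely to show the structure.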

Apart from CSR, data-driven CL is discussed in the thesis. A CL framework is proposed to extract features directly from compressed data for image classification, object detection, and semantic/instance segmentation. In addition, the spectral bias of neural networks is analyzed from the frequency perspective, leading to a learning-based frequency selection method for identifying the trivial frequency components that can be removed without accuracy loss. Compared with conventional spatial downsampling approaches, the proposed frequency-domain learning method can achieve higher accuracy with reduced input data size.
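The idea of shrinking the input by dropping frequency components, rather than by spatial downsampling, can be sketched as follows. This is an illustration under stated assumptions, not the thesis method: it uses a 1-D orthonormal DCT-II written in plain Python, the helper names are hypothetical, and the `keep` list is picked by hand, whereas the abstract describes learning which components to keep.

```python
import math


def dct(signal):
    """Orthonormal DCT-II of a 1-D signal."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs):
    """Inverse of the orthonormal DCT-II above."""
    n = len(coeffs)
    signal = []
    for i in range(n):
        total = sum(math.sqrt((1 if k == 0 else 2) / n) * c
                    * math.cos(math.pi * (i + 0.5) * k / n)
                    for k, c in enumerate(coeffs))
        signal.append(total)
    return signal


def select_channels(coeffs, keep):
    """Keep only the chosen frequency indices; this is the reduced input."""
    return [coeffs[k] for k in keep]


def scatter_channels(values, keep, n):
    """Place kept coefficients back at their indices, zeros elsewhere."""
    full = [0.0] * n
    for k, value in zip(keep, values):
        full[k] = value
    return full


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    keep = [0, 1, 2, 3]  # hand-picked here; learned in the approach described
    reduced = select_channels(dct(signal), keep)
    print(len(reduced))  # half the original input size
    print(idct(scatter_channels(reduced, keep, len(signal))))
```

A downstream model would consume `reduced` directly. The inverse transform appears only to show what information the kept components retain.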

The methodologies proposed in this thesis are not restricted to the above-mentioned applications. The thesis also discusses other potential applications and directions for future research.
