Sandwiched compression: Repurposing standard codecs with neural network wrappers
arXiv, 2024.
Improved image and video compression using neural pre- and post-processing.
Abstract:
We propose sandwiching standard image and video codecs between pre- and post-processing neural networks.
The networks are jointly trained through a differentiable codec proxy to minimize a given rate-distortion
loss. This sandwich architecture not only improves the standard codec's performance on its intended
content but also effectively adapts the codec to other types of image/video content and to other distortion
measures. Essentially, the sandwich learns to transmit “neural code images” that optimize
overall rate-distortion performance even when the overall problem is well outside the scope of the codec's
design. Through a variety of examples, we apply the sandwich architecture to sources with different
numbers of channels, higher resolution, higher dynamic range, and perceptual distortion measures. The
results demonstrate substantial improvements (up to 9 dB gains or up to 30% bitrate reductions) compared to
alternative adaptations. We derive VQ equivalents for the sandwich, establish optimality properties, and
design differentiable codec proxies approximating current standard codecs. We further analyze model
complexity and visual quality under perceptual metrics, as well as sandwich configurations that offer
interesting potential in image/video compression and streaming.
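Sketch (not from the paper):
The core idea is a pre-processing network, a codec in the middle, and a post-processing network, trained end to end with a rate-distortion loss by swapping the real codec for a differentiable proxy during training. Below is a minimal PyTorch sketch under assumed names (CodecProxy, Sandwich, pre_net, post_net) and a toy additive-noise proxy; the paper's proxies approximate actual standard codecs and its networks are more elaborate.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CodecProxy(nn.Module):
        # Differentiable stand-in for a standard codec. Assumption: quantization is
        # modeled as additive uniform noise and rate by a crude L1 surrogate; the
        # paper designs proxies that approximate real standard codecs.
        def forward(self, code_img):
            decoded = code_img + (torch.rand_like(code_img) - 0.5)  # noise in [-0.5, 0.5)
            rate = decoded.abs().mean()                             # rate surrogate
            return decoded, rate

    class Sandwich(nn.Module):
        # Pre-processor -> codec proxy -> post-processor ("sandwich").
        def __init__(self, channels=3):
            super().__init__()
            self.pre_net = nn.Sequential(
                nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, channels, 3, padding=1))
            self.post_net = nn.Sequential(
                nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, channels, 3, padding=1))
            self.proxy = CodecProxy()

        def forward(self, x):
            code_img = self.pre_net(x)            # "neural code image" handed to the codec
            decoded, rate = self.proxy(code_img)  # proxy used only during training
            x_hat = self.post_net(decoded)        # reconstruct in the source domain
            return x_hat, rate

    # One joint rate-distortion training step: loss = D(x, x_hat) + lambda * R.
    model = Sandwich()
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    lam = 0.01                                    # assumed trade-off weight
    x = torch.rand(4, 3, 64, 64)                  # toy batch standing in for training images
    x_hat, rate = model(x)
    loss = F.mse_loss(x_hat, x) + lam * rate      # MSE distortion here; perceptual losses also fit this slot
    opt.zero_grad()
    loss.backward()
    opt.step()

At deployment the proxy is swapped out for the actual standard codec, which simply compresses the pre-processor's neural code images; only the pre- and post-processing networks carry learned parameters.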
Hindsights:
No hindsights yet.