Scalable Video Compression Algorithms

Sen-Ching S. Cheung, Daniel Tan, and Avideh Zakhor

A flexible video codec is essential in most video applications that involve hardware platforms with varying capabilities. Scalability in bandwidth and in spatial resolution is among the most desirable features of such a codec. The goal of this project is to develop a codec that delivers both kinds of scalability. Specifically, we aim to provide video at multiple resolutions, each available at a range of bit rates, in a single compressed bit stream.

To achieve higher compression efficiency, our approach incorporates motion prediction into the video codec in a scalable fashion. In common video sequences, areas of pictures in successive frames are highly correlated. Motion prediction exploits this correlation very efficiently to improve quality and reduce bandwidth. It has been used in many video compression schemes, including the MPEG-1 and MPEG-2 standards. However, motion prediction has two drawbacks when applied to scalable video. First, the most efficient form, motion prediction with a single feedback loop, has been shown to be non-scalable. Second, the transmission of motion vectors becomes costly for low bit rate video. In this project, we propose a different motion prediction technique that uses multiple feedback loops with multiple block sizes to tackle these problems [3]. To enhance performance, Wiener filters designed from an ensemble of typical video sequences are used to minimize the effect of aliasing [3].
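As a rough illustration of how motion prediction exploits the correlation between successive frames, the following minimal sketch performs exhaustive block matching with a sum-of-absolute-differences cost. It is a generic textbook single-block search, not the multiple-feedback-loop, multiple-block-size scheme of [3]; the function names (`sad`, `best_motion_vector`), the block size, the search range, and the synthetic frames are all assumptions made for the example.

```python
def sad(cur, ref, by, bx, dy, dx, n):
    """Sum of absolute differences between the n-by-n block of `cur` at
    (by, bx) and the block of `ref` displaced from it by (dy, dx)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, by, bx, n, search):
    """Exhaustive search over displacements within +/- `search` pixels.
    Returns (dy, dx, cost) for the lowest-cost in-bounds candidate."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = by + dy, bx + dx
            if y0 < 0 or x0 < 0 or y0 + n > h or x0 + n > w:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, by, bx, dy, dx, n)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best


# Synthetic frames (assumed data): a 16x16 ramp, and a second frame whose
# content is the ramp displaced by 1 row and 2 columns.
SIZE = 16
ref = [[y * SIZE + x for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[y + 1][x + 2] if y + 1 < SIZE and x + 2 < SIZE else 0
        for x in range(SIZE)] for y in range(SIZE)]

mv = best_motion_vector(cur, ref, by=4, bx=4, n=4, search=3)
print(mv)  # (1, 2, 0): displacement (dy, dx) with zero residual cost
```

In this toy case the block is predicted exactly, so only the motion vector would need to be sent. The example also hints at the cost noted above: every block carries its own vector, which becomes a significant share of the bit stream at low bit rates.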

Another aspect of our approach is the design of quantizers that perform as well as possible under a scalable target bit rate constraint. Since the exact target bit rates are not known in advance, the class of bit allocation algorithms from rate-distortion theory cannot be used. Instead, we use a successive-approximation quantizer with progressive coding to achieve very fine granularity in the transmitted bit rate. However, as shown by W. Equitz and T. Cover [1], this type of quantizer is in general less efficient than a single quantizer. To obtain reasonable performance, we must take advantage of correlations across different resolutions as well as among local features. Here, we modify the zero-tree coding technique developed by J. M. Shapiro [2] to encode multiple resolutions more efficiently. We also use trained conditional entropy coders to exploit the dependence among local features and further reduce the bit rate.
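The idea of successive approximation with progressive coding can be sketched as follows. This is a simplified illustration under stated assumptions, not the codec's actual quantizer: it refines integer coefficient magnitudes one power-of-two threshold at a time, sends all signs up front, and omits zero-tree and entropy coding entirely. The names `sa_encode` and `sa_decode` and the sample coefficients are hypothetical.

```python
def sa_encode(coeffs, passes):
    """Successive-approximation encoding of integer coefficients.
    Returns (t0, signs, layers), where layers[k] holds one bit per
    coefficient for the pass that uses threshold t0 / 2**k."""
    max_mag = max(abs(c) for c in coeffs)
    t0 = 1
    while t0 * 2 <= max_mag:
        t0 *= 2  # largest power of two not exceeding max_mag (at least 1)
    signs = [c < 0 for c in coeffs]  # simplification: signs sent up front
    residual = [abs(c) for c in coeffs]
    layers = []
    t = t0
    for _ in range(passes):
        bits = []
        for i, r in enumerate(residual):
            bit = 1 if r >= t else 0
            if bit:
                residual[i] = r - t  # remove the part already conveyed
            bits.append(bit)
        layers.append(bits)
        t /= 2  # each pass halves the threshold
    return t0, signs, layers


def sa_decode(t0, signs, layers):
    """Reconstruct from any prefix of the layers; more layers mean a
    finer approximation of the original coefficients."""
    mags = [0.0] * len(signs)
    t = t0
    for bits in layers:
        for i, bit in enumerate(bits):
            if bit:
                mags[i] += t
        t /= 2
    return [-m if s else m for m, s in zip(mags, signs)]


coeffs = [37, -12, 5, 0, -63, 20]  # assumed sample data
t0, signs, layers = sa_encode(coeffs, passes=6)
coarse = sa_decode(t0, signs, layers[:2])  # decoder stops after 2 passes
full = sa_decode(t0, signs, layers)        # decoder uses all 6 passes
```

Because the decoder can stop after any pass, the same bit stream serves a range of bit rates. In this sketch the reconstruction error after a pass is always smaller than the threshold used in that pass, so `coarse` is within 16 of every coefficient and `full` is exact.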

[1] W. H. R. Equitz and T. M. Cover, ``Successive Refinement of Information,'' IEEE Trans. on Information Theory, Vol. 37, pp. 269-275, March 1991.

[2] J. M. Shapiro, ``Embedded Image Coding Using Zerotrees of Wavelet Coefficients,'' IEEE Trans. on Signal Processing, Vol. 41, No. 12, pp. 3445-3462, December 1993.

[3] S. Cheung and A. Zakhor, ``Scalable Video Compression With Motion Compensated Prediction,'' to appear at the IEEE International Conference on Image Processing, Washington, D.C., October 1995.

Sen-ching S. Cheung / sccheung@eecs.berkeley.edu / 27 Jan 1995