Overview of the High Efficiency Video Coding (HEVC) Standard, Part 4
I. Transform, Scaling, and Quantization
HEVC uses transform coding of the prediction error residual in a similar manner as in prior standards. The residual block is partitioned into multiple square TBs, as described in Section IV-E. The supported transform block sizes are 4×4, 8×8, 16×16, and 32×32.
1) Core Transform:
Two-dimensional transforms are computed by applying 1-D transforms in the horizontal and vertical directions. The elements of the core transform matrices were derived by approximating scaled DCT basis functions, under considerations such as limiting the dynamic range needed for transform computation and maximizing precision and closeness to orthogonality when the matrix entries are specified as integer values.
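The separable structure can be sketched as follows. This is a minimal illustration, assuming the 4-point core matrix of HEVC and omitting the intermediate scaling shifts that a real codec applies between stages; the function names transform_1d and forward_2d are hypothetical.

```python
# 4-point HEVC core transform matrix (integer approximation of a scaled DCT).
C4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def transform_1d(matrix, vector):
    """Multiply the transform matrix by one column or row of samples."""
    return [sum(c * x for c, x in zip(row, vector)) for row in matrix]


def forward_2d(matrix, block):
    """Separable 2-D transform: 1-D on every column, then 1-D on every row.

    No rounding or shifting is applied, so this computes C * X * C^T exactly.
    """
    n = len(block)
    # Vertical stage: transform each column of the residual block.
    cols = [transform_1d(matrix, [block[r][c] for r in range(n)])
            for c in range(n)]
    # Re-arrange so that tmp[k][c] is vertical coefficient k of column c.
    tmp = [[cols[c][k] for c in range(n)] for k in range(n)]
    # Horizontal stage: transform each row of the intermediate result.
    return [transform_1d(matrix, row) for row in tmp]


flat = [[1] * 4 for _ in range(4)]
coeffs = forward_2d(C4, flat)
print(coeffs[0][0])  # only the DC coefficient is nonzero for a flat block
```

For a flat block, every basis row other than the first sums to zero, so all of the energy lands in the DC coefficient.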
For simplicity, only one integer matrix for the length of 32 points is specified, and subsampled versions are used for other sizes. For example, the matrix for the length-16 transform is given as an explicit equation in the original paper. The matrices for the length-8 and length-4 transforms can be derived from that length-16 matrix by using the first eight entries of rows 0, 2, 4, . . ., and the first four entries of rows 0, 4, 8, . . ., respectively.
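The subsampling rule can be checked with a short sketch. It rebuilds the length-16 matrix from a table of coefficient magnitudes and then derives the shorter matrices from it. The magnitudes in HALF are taken to be those of the HEVC core transform and should be verified against the standard; the cosine-index construction and the names core16 and subsample are illustrative assumptions, not the way the standard presents the matrix.

```python
# Magnitudes for angle j * pi / 32, j = 0..16. HALF[0] is the DC row value.
HALF = [64, 90, 89, 87, 83, 80, 75, 70, 64, 57, 50, 43, 36, 25, 18, 9, 0]


def core16():
    """Build the 16x16 integer matrix from the magnitude table.

    Entry (k, n) follows the sign and magnitude of cos(k * (2n + 1) * pi / 32).
    """
    matrix = []
    for k in range(16):
        row = []
        for n in range(16):
            a = (k * (2 * n + 1)) % 64  # angle in units of pi / 32
            if a <= 16:
                value = HALF[a]
            elif a <= 32:
                value = -HALF[32 - a]
            elif a <= 48:
                value = -HALF[a - 32]
            else:
                value = HALF[64 - a]
            row.append(value)
        matrix.append(row)
    return matrix


def subsample(matrix, size):
    """Keep the first `size` entries of every (len(matrix) // size)-th row."""
    step = len(matrix) // size
    return [matrix[r][:size] for r in range(0, len(matrix), step)]


m16 = core16()
m8 = subsample(m16, 8)   # rows 0, 2, 4, ... of the length-16 matrix
m4 = subsample(m16, 4)   # rows 0, 4, 8, 12 of the length-16 matrix
for row in m4:
    print(row)
```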
Although the standard specifies the transform simply in terms of the values of a matrix, the entries were selected to have key symmetry properties that enable fast, partially factored implementations with far fewer mathematical operations than an ordinary matrix multiplication. The larger transforms can also be constructed by using the smaller transforms as building blocks.
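As a small illustration of how the symmetry helps, the 4-point transform can be split into an even and an odd half (often called a partial butterfly). The sketch below assumes the 4-point core matrix of HEVC and compares the factored form with a plain matrix multiplication; the function names are hypothetical.

```python
C4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def dct4_matrix(x):
    """Reference: full matrix multiplication, 16 multiplications."""
    return [sum(c * v for c, v in zip(row, x)) for row in C4]


def dct4_butterfly(x):
    """Factored form using the even/odd symmetry, 6 multiplications."""
    e0, e1 = x[0] + x[3], x[1] + x[2]   # even part (symmetric rows 0 and 2)
    o0, o1 = x[0] - x[3], x[1] - x[2]   # odd part (antisymmetric rows 1 and 3)
    return [
        64 * (e0 + e1),
        83 * o0 + 36 * o1,
        64 * (e0 - e1),
        36 * o0 - 83 * o1,
    ]


sample = [1, 2, 3, 4]
print(dct4_matrix(sample), dct4_butterfly(sample))
```

The even half of the 4-point butterfly is itself a 2-point transform, which is the same nesting that lets larger transforms reuse smaller ones.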
Due to the increased size of the supported transforms, limiting the dynamic range of the intermediate results from the first stage of the transformation is quite important. HEVC explicitly inserts a 7-bit right shift and a 16-bit clipping operation after the first 1-D inverse transform stage (the vertical inverse transform stage) to ensure that all intermediate values can be stored in 16-bit memory (for 8-bit video decoding).
2) Alternative 4 × 4 Transform:
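The shift-and-clip step can be sketched as below. The 7-bit shift and the 16-bit range come from the text above; the rounding offset added before the shift is an assumption of this sketch, and the function names are hypothetical.

```python
INT16_MIN = -(1 << 15)
INT16_MAX = (1 << 15) - 1


def clip16(value):
    """Clamp a value to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def first_stage_post(value, shift=7):
    """Right shift by `shift` bits, then clip to 16 bits.

    Assumption: a rounding offset of 2**(shift - 1) is added before shifting.
    """
    return clip16((value + (1 << (shift - 1))) >> shift)


print(first_stage_post(128))       # small value passes through, scaled down
print(first_stage_post(1 << 30))   # out-of-range value is clipped
```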
For the transform block size of 4×4, an alternative integer transform derived from a DST is applied to the luma residual blocks for intrapicture prediction modes, with the transform matrix

H = [ 29  55  74  84
      74  74   0 -74
      84 -29 -74  55
      55 -84  74 -29 ]
The basis functions of the DST better fit the statistical property that the residual amplitudes tend to increase as the distance from the boundary samples used for prediction becomes larger. In terms of complexity, the 4×4 DST-style transform is not much more computationally demanding than the 4×4 DCT-style transform, and it provides approximately 1% bit-rate reduction in intrapicture predictive coding.
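A toy example makes the statistical argument concrete. The sketch below applies both 4-point transforms to a residual row whose amplitude grows with distance from the predicted boundary. The ramp input is made up for illustration, and the matrix values are taken to be those of the HEVC 4-point DST-style and DCT-style transforms.

```python
DST4 = [
    [29,  55,  74,  84],
    [74,  74,   0, -74],
    [84, -29, -74,  55],
    [55, -84,  74, -29],
]
DCT4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def transform_1d(matrix, vector):
    """Multiply a transform matrix by one row of residual samples."""
    return [sum(c * x for c, x in zip(row, vector)) for row in matrix]


# Hypothetical residual that grows away from the reference boundary.
ramp = [1, 2, 3, 4]
dst = transform_1d(DST4, ramp)
dct = transform_1d(DCT4, ramp)
print("DST:", dst)
print("DCT:", dct)
```

For this input the DST-style transform leaves much smaller values outside the first coefficient than the DCT-style transform does, which is the energy compaction the paragraph above describes.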
The usage of the DST type of transform is restricted to only 4×4 luma transform blocks, since for other cases the additional coding efficiency improvement from including the additional transform type was found to be marginal.
3) Scaling and Quantization:
Since the rows of the transform matrix are close approximations of values of uniformly scaled basis functions of the orthonormal DCT, the prescaling operation that is incorporated in the dequantization of H.264/MPEG-4 AVC is not needed in HEVC. Avoiding frequency-specific basis function scaling is useful in reducing the intermediate memory size, especially considering that the transform can be as large as 32×32.
For quantization, HEVC uses essentially the same uniform reconstruction quantization (URQ) scheme controlled by a quantization parameter (QP) as in H.264/MPEG-4 AVC. The range of the QP values is defined from 0 to 51, and an increase by 6 doubles the quantization step size, such that the mapping of QP values to step sizes is approximately logarithmic. Quantization scaling matrices are also supported.
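The doubling rule can be written as a one-line formula. In the sketch below, the anchor point (a step size of 1 at QP 4) is an assumption carried over from the usual H.264/MPEG-4 AVC convention and is not stated in the text above; the function name qstep is hypothetical.

```python
def qstep(qp):
    """Approximate quantization step size for a given QP.

    Assumption: the step size equals 1.0 at QP 4; every +6 in QP doubles it.
    """
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return 2.0 ** ((qp - 4) / 6.0)


for qp in (4, 10, 16, 22, 28):
    print(qp, qstep(qp))
```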
To reduce the memory needed to store frequency-specific scaling values, only quantization matrices of sizes 4×4 and 8×8 are used. For the larger transformations of 16×16 and 32×32 sizes, an 8×8 scaling matrix is sent and is applied by sharing values within 2×2 and 4×4 coefficient groups in frequency subspaces. The exception is the DC (zero-frequency) position, for which distinct values are sent and applied.
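The sharing rule amounts to replicating each 8×8 entry over a 2×2 or 4×4 group and then overriding the DC position. A minimal sketch follows, with a hypothetical function name and made-up matrix values.

```python
def expand_scaling_matrix(m8, size, dc):
    """Expand an 8x8 scaling matrix to 16x16 or 32x32.

    Each 8x8 entry is shared by a (size // 8) x (size // 8) coefficient group;
    the DC position then receives its own separately signalled value.
    """
    if size not in (16, 32):
        raise ValueError("size must be 16 or 32")
    rep = size // 8
    out = [[m8[r // rep][c // rep] for c in range(size)] for r in range(size)]
    out[0][0] = dc
    return out


# Made-up 8x8 matrix whose entries encode their own position.
m8 = [[r * 8 + c for c in range(8)] for r in range(8)]
m16 = expand_scaling_matrix(m8, 16, dc=99)
m32 = expand_scaling_matrix(m8, 32, dc=77)
print(m16[0][:4], m32[0][:8])
```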