08-residue.tex 16.8 KB
 Ralph Giles committed Mar 06, 2009 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*- %!TEX root = Vorbis_I_spec.tex \section{Residue setup and decode} \label{vorbis:spec:residue} \subsection{Overview} A residue vector represents the fine detail of the audio spectrum of one channel in an audio frame after the encoder subtracts the floor curve and performs any channel coupling. A residue vector may represent spectral lines, spectral magnitude, spectral phase or hybrids as mixed by channel coupling. The exact semantic content of the vector does not matter to the residue abstraction. Whatever the exact qualities, the Vorbis residue abstraction codes the residue vectors into the bitstream packet, and then reconstructs the vectors during decode. Vorbis makes use of three different encoding variants (numbered 0, 1 and 2) of the same basic vector encoding abstraction. \subsection{Residue format} Residue format partitions each vector in the vector bundle into chunks, classifies each chunk, encodes the chunk classifications and finally encodes the chunks themselves using the the specific VQ arrangement defined for each selected classification. The exact interleaving and partitioning vary by residue encoding number, however the high-level process used to classify and encode the residue vector is the same in all three variants. A set of coded residue vectors are all of the same length. High level coding structure, ignoring for the moment exactly how a partition is encoded and simply trusting that it is, is as follows: \begin{itemize} \item Each vector is partitioned into multiple equal sized chunks according to configuration specified. If we have a vector size of  Monty committed Aug 11, 2011 39 \emph{n}, a partition size \emph{residue\_partition\_size}, and a total  Ralph Giles committed Mar 06, 2009 40 of \emph{ch} residue vectors, the total number of partitioned chunks  Monty committed Aug 11, 2011 41 coded is \emph{n}/\emph{residue\_partition\_size}*\emph{ch}. It is  Ralph Giles committed Mar 06, 2009 42 important to note that the integer division truncates. In the below  Monty committed Aug 11, 2011 43 example, we assume an example \emph{residue\_partition\_size} of 8.  Ralph Giles committed Mar 06, 2009 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157  \item Each partition in each vector has a classification number that specifies which of multiple configured VQ codebook setups are used to decode that partition. The classification numbers of each partition can be thought of as forming a vector in their own right, as in the illustration below. Just as the residue vectors are coded in grouped partitions to increase encoding efficiency, the classification vector is also partitioned into chunks. The integer elements of each scalar in a classification chunk are built into a single scalar that represents the classification numbers in that chunk. In the below example, the classification codeword encodes two classification numbers. \item The values in a residue vector may be encoded monolithically in a single pass through the residue vector, but more often efficient codebook design dictates that each vector is encoded as the additive sum of several passes through the residue vector using more than one VQ codebook. Thus, each residue value potentially accumulates values from multiple decode passes. The classification value associated with a partition is the same in each pass, thus the classification codeword is coded only in the first pass. \end{itemize} \begin{center} \includegraphics[width=\textwidth]{residue-pack} \captionof{figure}{illustration of residue vector format} \end{center} \subsection{residue 0} Residue 0 and 1 differ only in the way the values within a residue partition are interleaved during partition encoding (visually treated as a black box--or cyan box or brown box--in the above figure). Residue encoding 0 interleaves VQ encoding according to the dimension of the codebook used to encode a partition in a specific pass. The dimension of the codebook need not be the same in multiple passes, however the partition size must be an even multiple of the codebook dimension. As an example, assume a partition vector of size eight, to be encoded by residue 0 using codebook sizes of 8, 4, 2 and 1: \begin{programlisting} original residue vector: [ 0 1 2 3 4 5 6 7 ] codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ] codebook dimensions = 4 encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ] codebook dimensions = 2 encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ] codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ] \end{programlisting} It is worth mentioning at this point that no configurable value in the residue coding setup is restricted to a power of two. \subsection{residue 1} Residue 1 does not interleave VQ encoding. It represents partition vector scalars in order. As with residue 0, however, partition length must be an integer multiple of the codebook dimension, although dimension may vary from pass to pass. As an example, assume a partition vector of size eight, to be encoded by residue 0 using codebook sizes of 8, 4, 2 and 1: \begin{programlisting} original residue vector: [ 0 1 2 3 4 5 6 7 ] codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ] codebook dimensions = 4 encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ] codebook dimensions = 2 encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ] codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ] \end{programlisting} \subsection{residue 2} Residue type two can be thought of as a variant of residue type 1. Rather than encoding multiple passed-in vectors as in residue type 1, the \emph{ch} passed in vectors of length \emph{n} are first interleaved and flattened into a single vector of length \emph{ch}*\emph{n}. Encoding then proceeds as in type 1. Decoding is as in type 1 with decode interleave reversed. If operating on a single vector to begin with, residue type 1 and type 2 are equivalent. \begin{center} \includegraphics[width=\textwidth]{residue2} \captionof{figure}{illustration of residue type 2} \end{center} \subsection{Residue decode} \subsubsection{header decode} Header decode for all three residue types is identical. \begin{programlisting}  Monty committed Aug 11, 2011 158 159 160 161 162  1) [residue\_begin] = read 24 bits as unsigned integer 2) [residue\_end] = read 24 bits as unsigned integer 3) [residue\_partition\_size] = read 24 bits as unsigned integer and add one 4) [residue\_classifications] = read 6 bits as unsigned integer and add one 5) [residue\_classbook] = read 8 bits as unsigned integer  Ralph Giles committed Mar 06, 2009 163 164 \end{programlisting}  Monty committed Aug 11, 2011 165 166 \varname{[residue\_begin]} and \varname{[residue\_end]} select the specific sub-portion of  Ralph Giles committed Mar 06, 2009 167 168 each vector that is actually coded; it implements akin to a bandpass where, for coding purposes, the vector effectively begins at element  Monty committed Aug 11, 2011 169 170 \varname{[residue\_begin]} and ends at \varname{[residue\_end]}. Preceding and following values in  Ralph Giles committed Mar 06, 2009 171 the unpacked vectors are zeroed. Note that for residue type 2, these  Monty committed Aug 11, 2011 172 values as well as \varname{[residue\_partition\_size]}apply to  Ralph Giles committed Mar 06, 2009 173 the interleaved vector, not the individual vectors before interleave.  Monty committed Aug 11, 2011 174 175 \varname{[residue\_partition\_size]} is as explained above, \varname{[residue\_classifications]} is the number of possible  Ralph Giles committed Mar 06, 2009 176 classification to which a partition can belong and  Monty committed Aug 11, 2011 177 \varname{[residue\_classbook]} is the codebook number used to  Ralph Giles committed Mar 06, 2009 178 code classification codewords. The number of dimensions in book  Monty committed Aug 11, 2011 179 \varname{[residue\_classbook]} determines how many  Ralph Giles committed Mar 06, 2009 180 181 classification values are grouped into a single classification codeword. Note that the number of entries and dimensions in book  Monty committed Aug 11, 2011 182 183 \varname{[residue\_classbook]}, along with \varname{[residue\_classifications]}, overdetermines to  Ralph Giles committed Mar 06, 2009 184 possible number of classification codewords.  Monty committed Aug 11, 2011 185 186 If \varname{[residue\_classifications]}\^{}\varname{[residue\_classbook]}.dimensions exceeds \varname{[residue\_classbook]}.entries, the  Ralph Giles committed Mar 06, 2009 187 188 189 190 191 192 bitstream should be regarded to be undecodable. Next we read a bitmap pattern that specifies which partition classes code values in which passes. \begin{programlisting}  Monty committed Aug 11, 2011 193  1) iterate [i] over the range 0 ... [residue\_classifications]-1 {  Ralph Giles committed Mar 06, 2009 194   Monty committed Aug 11, 2011 195 196  2) [high\_bits] = 0 3) [low\_bits] = read 3 bits as unsigned integer  Ralph Giles committed Mar 06, 2009 197  4) [bitflag] = read one bit as boolean  Monty committed Aug 11, 2011 198 199  5) if ( [bitflag] is set ) then [high\_bits] = read five bits as unsigned integer 6) vector [residue\_cascade] element [i] = [high\_bits] * 8 + [low\_bits]  Ralph Giles committed Mar 06, 2009 200 201 202 203 204 205 206 207 208 209 210  } 7) done \end{programlisting} Finally, we read in a list of book numbers, each corresponding to specific bit set in the cascade bitmap. We loop over the possible codebook classifications and the maximum possible number of encoding stages (8 in Vorbis I, as constrained by the elements of the cascade bitmap being eight bits): \begin{programlisting}  Monty committed Aug 11, 2011 211  1) iterate [i] over the range 0 ... [residue\_classifications]-1 {  Ralph Giles committed Mar 06, 2009 212 213 214  2) iterate [j] over the range 0 ... 7 {  Monty committed Aug 11, 2011 215  3) if ( vector [residue\_cascade] element [i] bit [j] is set ) {  Ralph Giles committed Mar 06, 2009 216   Monty committed Aug 11, 2011 217  4) array [residue\_books] element [i][j] = read 8 bits as unsigned integer  Ralph Giles committed Mar 06, 2009 218 219 220  } else {  Monty committed Aug 11, 2011 221  5) array [residue\_books] element [i][j] = unused  Ralph Giles committed Mar 06, 2009 222 223 224 225 226 227 228 229 230 231 232  } } } 6) done \end{programlisting} An end-of-packet condition at any point in header decode renders the stream undecodable. In addition, any codebook number greater than the maximum numbered codebook set up in this stream also renders the  Monty committed Aug 11, 2011 233 stream undecodable. All codebooks in array [residue\_books] are  Monty committed Jun 25, 2009 234 required to have a value mapping. The presence of codebook in array  Monty committed Aug 11, 2011 235 [residue\_books] without a value mapping (maptype equals zero) renders  Monty committed Jun 25, 2009 236 the stream undecodable.  Ralph Giles committed Mar 06, 2009 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253  \subsubsection{packet decode} Format 0 and 1 packet decode is identical except for specific partition interleave. Format 2 packet decode can be built out of the format 1 decode process. Thus we describe first the decode infrastructure identical to all three formats. In addition to configuration information, the residue decode process is passed the number of vectors in the submap bundle and a vector of flags indicating if any of the vectors are not to be decoded. If the passed in number of vectors is 3 and vector number 1 is marked 'do not decode', decode skips vector 1 during the decode loop. However, even 'do not decode' vectors are allocated and zeroed.  Monty committed Aug 11, 2011 254 255 Depending on the values of \varname{[residue\_begin]} and \varname{[residue\_end]}, it is obvious that the encoded  Ralph Giles committed Mar 06, 2009 256 257 258 portion of a residue vector may be the entire possible residue vector or some other strict subset of the actual residue vector size with zero padding at either uncoded end. However, it is also possible to  Monty committed Aug 11, 2011 259 260 set \varname{[residue\_begin]} and \varname{[residue\_end]} to specify a range partially or  Ralph Giles committed Mar 06, 2009 261 wholly beyond the maximum vector size. Before beginning residue  Monty committed Aug 11, 2011 262 263 decode, limit \varname{[residue\_begin]} and \varname{[residue\_end]} to the maximum possible vector size  Ralph Giles committed Mar 06, 2009 264 265 266 267 268 as follows. We assume that the number of vectors being encoded, \varname{[ch]} is provided by the higher level decoding process. \begin{programlisting}  Monty committed Aug 11, 2011 269  1) [actual\_size] = current blocksize/2;  Ralph Giles committed Mar 06, 2009 270  2) if residue encoding is format 2  Monty committed Aug 11, 2011 271  3) [actual\_size] = [actual\_size] * [ch];  Loren M. Lang committed Nov 11, 2017 272 273  4) [limit\_residue\_begin] = minimum of ([residue\_begin],[actual\_size]); 5) [limit\_residue\_end] = minimum of ([residue\_end],[actual\_size]);  Ralph Giles committed Mar 06, 2009 274 275 276 277 278 279 \end{programlisting} The following convenience values are conceptually useful to clarifying the decode process: \begin{programlisting}  Monty committed Aug 11, 2011 280 281 282  1) [classwords\_per\_codeword] = [codebook\_dimensions] value of codebook [residue\_classbook] 2) [n\_to\_read] = [limit\_residue\_end] - [limit\_residue\_begin] 3) [partitions\_to\_read] = [n\_to\_read] / [residue\_partition\_size]  Ralph Giles committed Mar 06, 2009 283 284 285 286 287 \end{programlisting} Packet decode proceeds as follows, matching the description offered earlier in the document. \begin{programlisting} 1) allocate and zero all vectors that will be returned.  Monty committed Aug 11, 2011 288  2) if ([n\_to\_read] is zero), stop; there is no residue to decode.  Ralph Giles committed Mar 06, 2009 289 290  3) iterate [pass] over the range 0 ... 7 {  Monty committed Aug 11, 2011 291  4) [partition\_count] = 0  Ralph Giles committed Mar 06, 2009 292   Monty committed Aug 11, 2011 293  5) while [partition\_count] is less than [partitions\_to\_read]  Ralph Giles committed Mar 06, 2009 294 295 296 297 298 299 300  6) if ([pass] is zero) { 7) iterate [j] over the range 0 .. [ch]-1 { 8) if vector [j] is not marked 'do not decode' {  Monty committed Aug 11, 2011 301 302  9) [temp] = read from packet using codebook [residue\_classbook] in scalar context 10) iterate [i] descending over the range [classwords\_per\_codeword]-1 ... 0 {  Ralph Giles committed Mar 06, 2009 303   Monty committed Aug 11, 2011 304 305 306  11) array [classifications] element [j],([i]+[partition\_count]) = [temp] integer modulo [residue\_classifications] 12) [temp] = [temp] / [residue\_classifications] using integer division  Ralph Giles committed Mar 06, 2009 307 308 309 310 311 312 313 314 315  } } } }  Monty committed Aug 11, 2011 316 317  13) iterate [i] over the range 0 .. ([classwords\_per\_codeword] - 1) while [partition\_count] is also less than [partitions\_to\_read] {  Ralph Giles committed Mar 06, 2009 318 319 320 321 322  14) iterate [j] over the range 0 .. [ch]-1 { 15) if vector [j] is not marked 'do not decode' {  Monty committed Aug 11, 2011 323 324  16) [vqclass] = array [classifications] element [j],[partition\_count] 17) [vqbook] = array [residue\_books] element [vqclass],[pass]  Ralph Giles committed Mar 06, 2009 325 326 327  18) if ([vqbook] is not 'unused') { 19) decode partition into output vector number [j], starting at scalar  Monty committed Aug 11, 2011 328  offset [limit\_residue\_begin]+[partition\_count]*[residue\_partition\_size] using  Ralph Giles committed Mar 06, 2009 329 330 331 332  codebook number [vqbook] in VQ context } }  Monty committed Aug 11, 2011 333  20) increment [partition\_count] by one  Ralph Giles committed Mar 06, 2009 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355  } } } 21) done \end{programlisting} An end-of-packet condition during packet decode is to be considered a nominal occurrence. Decode returns the result of vector decode up to that point. \subsubsection{format 0 specifics} Format zero decodes partitions exactly as described earlier in the 'Residue Format: residue 0' section. The following pseudocode presents the same algorithm. Assume: \begin{itemize}  Monty committed Aug 11, 2011 356 \item \varname{[n]} is the value in \varname{[residue\_partition\_size]}  Ralph Giles committed Mar 06, 2009 357 358 359 360 361 362 \item \varname{[v]} is the residue vector \item \varname{[offset]} is the beginning read offset in [v] \end{itemize} \begin{programlisting}  Monty committed Aug 11, 2011 363  1) [step] = [n] / [codebook\_dimensions]  Ralph Giles committed Mar 06, 2009 364 365  2) iterate [i] over the range 0 ... [step]-1 {  Monty committed Aug 11, 2011 366 367  3) vector [entry\_temp] = read vector from packet using current codebook in VQ context 4) iterate [j] over the range 0 ... [codebook\_dimensions]-1 {  Ralph Giles committed Mar 06, 2009 368 369 370  5) vector [v] element ([offset]+[i]+[j]*[step]) = vector [v] element ([offset]+[i]+[j]*[step]) +  Monty committed Aug 11, 2011 371  vector [entry\_temp] element [j]  Ralph Giles committed Mar 06, 2009 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390  } } 6) done \end{programlisting} \subsubsection{format 1 specifics} Format 1 decodes partitions exactly as described earlier in the 'Residue Format: residue 1' section. The following pseudocode presents the same algorithm. Assume: \begin{itemize} \item \varname{[n]} is the value in  Monty committed Aug 11, 2011 391 \varname{[residue\_partition\_size]}  Ralph Giles committed Mar 06, 2009 392 393 394 395 396 397 398 \item \varname{[v]} is the residue vector \item \varname{[offset]} is the beginning read offset in [v] \end{itemize} \begin{programlisting} 1) [i] = 0  Monty committed Aug 11, 2011 399 400  2) vector [entry\_temp] = read vector from packet using current codebook in VQ context 3) iterate [j] over the range 0 ... [codebook\_dimensions]-1 {  Ralph Giles committed Mar 06, 2009 401 402 403  4) vector [v] element ([offset]+[i]) = vector [v] element ([offset]+[i]) +  Monty committed Aug 11, 2011 404  vector [entry\_temp] element [j]  Ralph Giles committed Mar 06, 2009 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451  5) increment [i] } 6) if ( [i] is less than [n] ) continue at step 2 7) done \end{programlisting} \subsubsection{format 2 specifics} Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode. Format 2 handles 'do not decode' vectors differently than residue 0 or 1; if all vectors are marked 'do not decode', no decode occurrs. However, if at least one vector is to be decoded, all the vectors are decoded. We then request normal format 1 to decode a single vector representing all output channels, rather than a vector for each channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is: \begin{enumerate} \item If all vectors 0 through \emph{ch}-1 are marked 'do not decode', allocate and clear a single vector \varname{[v]}of length \emph{ch*n} and skip step 2 below; proceed directly to the post-decode step. \item Rather than performing format 1 decode to produce \emph{ch} vectors of length \emph{n} each, call format 1 decode to produce a single vector \varname{[v]} of length \emph{ch*n}. \item Post decode: Deinterleave the single vector \varname{[v]} returned by format 1 decode as described above into \emph{ch} independent vectors, one for each outputchannel, according to: \begin{programlisting} 1) iterate [i] over the range 0 ... [n]-1 { 2) iterate [j] over the range 0 ... [ch]-1 { 3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j]) } } 4) done \end{programlisting} \end{enumerate}