mirror of
				https://github.com/cookiengineer/audacity
				synced 2025-10-26 07:13:49 +01:00 
			
		
		
		
	
		
			
				
	
	
		
| <?xml version="1.0" standalone="no"?>
 | |
| <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
 | |
|                 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
 | |
| 
 | |
| ]>
 | |
| 
 | |
| <section id="vorbis-spec-residue">
 | |
| <sectioninfo>
 | |
|  <releaseinfo>
 | |
|   $Id: 08-residue.xml,v 1.1.1.1 2004-11-13 16:51:22 mbrubeck Exp $
 | |
|  </releaseinfo>
 | |
| </sectioninfo>
 | |
| <title>Residue setup and decode</title>
 | |
| 
 | |
| 
 | |
| <section>
 | |
| <title>Overview</title>
 | |
| 
 | |
| <para>
 | |
| A residue vector represents the fine detail of the audio spectrum of
 | |
| one channel in an audio frame after the encoder subtracts the floor
 | |
| curve and performs any channel coupling.  A residue vector may
 | |
| represent spectral lines, spectral magnitude, spectral phase or
 | |
| hybrids as mixed by channel coupling.  The exact semantic content of
 | |
| the vector does not matter to the residue abstraction.</para>
 | |
| 
 | |
| <para>
 | |
| Whatever the exact qualities, the Vorbis residue abstraction codes the
 | |
| residue vectors into the bitstream packet, and then reconstructs the
 | |
| vectors during decode.  Vorbis makes use of three different encoding
 | |
| variants (numbered 0, 1 and 2) of the same basic vector encoding
 | |
| abstraction.</para>
 | |
| 
 | |
| </section>
 | |
| 
 | |
| <section>
 | |
| <title>Residue format</title>
 | |
| 
 | |
| <para>
 | |
| Residue format partitions each vector in the vector bundle into chunks,
 | |
| classifies each chunk, encodes the chunk classifications and finally
 | |
| encodes the chunks themselves using the specific VQ arrangement
 | |
| defined for each selected classification.
 | |
| The exact interleaving and partitioning vary by residue encoding number;
 | |
| however, the high-level process used to classify and encode the residue
 | |
| vector is the same in all three variants.</para>
 | |
| 
 | |
| <para>
 | |
| A set of coded residue vectors are all of the same length.  High level
 | |
| coding structure, ignoring for the moment exactly how a partition is
 | |
| encoded and simply trusting that it is, is as follows:</para>
 | |
| 
 | |
| <itemizedlist>
 | |
| <listitem><para>Each vector is partitioned into multiple equal sized chunks
 | |
| according to configuration specified.  If we have a vector size of
 | |
| <emphasis>n</emphasis>, a partition size <emphasis>residue_partition_size</emphasis>, and a total
 | |
| of <emphasis>ch</emphasis> residue vectors, the total number of partitioned chunks
 | |
| coded is <emphasis>n</emphasis>/<emphasis>residue_partition_size</emphasis>*<emphasis>ch</emphasis>.  It is
 | |
| important to note that the integer division truncates.  In the below
 | |
| example, we assume an example <emphasis>residue_partition_size</emphasis> of 8.</para></listitem>
 | |
| 
 | |
| <listitem><para>Each partition in each vector has a classification number that
 | |
| specifies which of multiple configured VQ codebook setups are used to
 | |
| decode that partition.  The classification numbers of each partition
 | |
| can be thought of as forming a vector in their own right, as in the
 | |
| illustration below.  Just as the residue vectors are coded in grouped
 | |
| partitions to increase encoding efficiency, the classification vector
 | |
| is also partitioned into chunks.  The integer classification numbers
 | |
| in a classification chunk are combined into a single scalar that
 | |
| represents the classification numbers in that chunk.  In the below
 | |
| example, the classification codeword encodes two classification
 | |
| numbers.</para></listitem>
 | |
| 
 | |
| <listitem><para>The values in a residue vector may be encoded monolithically in a
 | |
| single pass through the residue vector, but more often efficient
 | |
| codebook design dictates that each vector is encoded as the additive
 | |
| sum of several passes through the residue vector using more than one
 | |
| VQ codebook.  Thus, each residue value potentially accumulates values
 | |
| from multiple decode passes.  The classification value associated with
 | |
| a partition is the same in each pass, thus the classification codeword
 | |
| is coded only in the first pass.</para></listitem>
 | |
| 
 | |
| </itemizedlist>
 | |
| 
 | |
| <mediaobject>
 | |
| <imageobject>
 | |
|  <imagedata fileref="residue-pack.png" format="PNG"/>
 | |
| </imageobject>
 | |
| <textobject>
 | |
|  <phrase>[illustration of residue vector format]</phrase>
 | |
| </textobject>
 | |
| </mediaobject>
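<para>
The partition count arithmetic described above can be sketched in a few
lines of Python.  This is only an illustration; the function name
<literal>partitions_coded</literal> is hypothetical and does not appear in
the specification.</para>
<programlisting><![CDATA[
```python
def partitions_coded(n, residue_partition_size, ch):
    """Total number of partitions coded for ch vectors of length n."""
    # Integer division truncates: trailing values that do not fill a
    # whole partition are not counted.
    return (n // residue_partition_size) * ch


print(partitions_coded(128, 8, 2))  # 32
print(partitions_coded(30, 8, 2))   # 6
```
]]></programlisting>
<para>
In the second call the last six values of each vector do not fill a whole
partition, so only three partitions per vector are counted.</para>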
 | |
| 
 | |
| </section>
 | |
| 
 | |
| <section><title>residue 0</title>
 | |
| 
 | |
| <para>
 | |
| Residue 0 and 1 differ only in the way the values within a residue
 | |
| partition are interleaved during partition encoding (visually treated
 | |
| as a black box--or cyan box or brown box--in the above figure).</para>
 | |
| 
 | |
| <para>
 | |
| Residue encoding 0 interleaves VQ encoding according to the
 | |
| dimension of the codebook used to encode a partition in a specific
 | |
| pass.  The dimension of the codebook need not be the same in multiple
 | |
| passes; however, the partition size must be an integer multiple of the
 | |
| codebook dimension.</para>
 | |
| 
 | |
| <para>
 | |
| As an example, assume a partition vector of size eight, to be encoded
 | |
| by residue 0 using codebook dimensions of 8, 4, 2 and 1:</para>
 | |
| 
 | |
| <programlisting>
 | |
| 
 | |
|             original residue vector: [ 0 1 2 3 4 5 6 7 ]
 | |
| 
 | |
| codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
 | |
| 
 | |
| codebook dimensions = 4  encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
 | |
| 
 | |
| codebook dimensions = 2  encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
 | |
| 
 | |
| codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
 | |
| 
 | |
| </programlisting>
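<para>
The interleave shown in the table above can be reproduced with a short
Python sketch.  The function name <literal>residue0_groups</literal> is
hypothetical; the sketch only regroups values and performs no actual VQ
encoding.</para>
<programlisting><![CDATA[
```python
def residue0_groups(partition, dim):
    """Group a partition the way residue 0 interleaves it for a codebook
    of dimension dim."""
    n = len(partition)
    if n % dim != 0:
        raise ValueError("partition size must be a multiple of the codebook dimension")
    step = n // dim
    # Codeword i holds elements i, i+step, i+2*step, ...
    return [[partition[i + j * step] for j in range(dim)] for i in range(step)]


print(residue0_groups(list(range(8)), 4))  # [[0, 2, 4, 6], [1, 3, 5, 7]]
print(residue0_groups(list(range(8)), 2))  # [[0, 4], [1, 5], [2, 6], [3, 7]]
```
]]></programlisting>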
 | |
| 
 | |
| <para>
 | |
| It is worth mentioning at this point that no configurable value in the
 | |
| residue coding setup is restricted to a power of two.</para>
 | |
| 
 | |
| </section>
 | |
| 
 | |
| <section><title>residue 1</title>
 | |
| 
 | |
| <para>
 | |
| Residue 1 does not interleave VQ encoding.  It represents partition
 | |
| vector scalars in order.  As with residue 0, however, partition length
 | |
| must be an integer multiple of the codebook dimension, although
 | |
| dimension may vary from pass to pass.</para>
 | |
| 
 | |
| <para>
 | |
| As an example, assume a partition vector of size eight, to be encoded
 | |
| by residue 1 using codebook dimensions of 8, 4, 2 and 1:</para>
 | |
| 
 | |
| <programlisting>
 | |
| 
 | |
|             original residue vector: [ 0 1 2 3 4 5 6 7 ]
 | |
| 
 | |
| codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
 | |
| 
 | |
| codebook dimensions = 4  encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
 | |
| 
 | |
| codebook dimensions = 2  encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
 | |
| 
 | |
| codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
 | |
| 
 | |
| </programlisting>
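<para>
For comparison, the in-order grouping of residue 1 can be sketched the same
way in Python.  The function name <literal>residue1_groups</literal> is
hypothetical; as before, the sketch only regroups values.</para>
<programlisting><![CDATA[
```python
def residue1_groups(partition, dim):
    """Group a partition the way residue 1 orders it for a codebook of
    dimension dim: consecutive runs of dim values, with no interleave."""
    n = len(partition)
    if n % dim != 0:
        raise ValueError("partition size must be a multiple of the codebook dimension")
    return [partition[k:k + dim] for k in range(0, n, dim)]


print(residue1_groups(list(range(8)), 4))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(residue1_groups(list(range(8)), 2))  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```
]]></programlisting>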
 | |
| 
 | |
| </section>
 | |
| 
 | |
| <section><title>residue 2</title>
 | |
| 
 | |
| <para>
 | |
| Residue type two can be thought of as a variant of residue type 1.
 | |
| Rather than encoding multiple passed-in vectors as in residue type 1,
 | |
| the <emphasis>ch</emphasis> passed in vectors of length <emphasis>n</emphasis> are first
 | |
| interleaved and flattened into a single vector of length
 | |
| <emphasis>ch</emphasis>*<emphasis>n</emphasis>.  Encoding then proceeds as in type 1. Decoding is
 | |
| as in type 1 with decode interleave reversed. If operating on a single
 | |
| vector to begin with, residue type 1 and type 2 are equivalent.</para>
 | |
| 
 | |
| <mediaobject>
 | |
| <imageobject>
 | |
|  <imagedata fileref="residue2.png" format="PNG"/>
 | |
| </imageobject>
 | |
| <textobject>
 | |
|  <phrase>[illustration of residue type 2]</phrase>
 | |
| </textobject>
 | |
| </mediaobject>
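<para>
The interleave step can be sketched in Python as follows.  The function
name <literal>interleave</literal> is hypothetical, and the element order
is assumed to be the inverse of the deinterleave given in the format 2
decode description (element <emphasis>i</emphasis> of vector
<emphasis>j</emphasis> lands at index
<emphasis>i</emphasis>*<emphasis>ch</emphasis>+<emphasis>j</emphasis>).</para>
<programlisting><![CDATA[
```python
def interleave(vectors):
    """Flatten ch vectors of length n into one vector of length ch*n."""
    ch = len(vectors)
    n = len(vectors[0])
    flat = [0] * (ch * n)
    for i in range(n):
        for j in range(ch):
            # Element i of vector j goes to index i*ch + j.
            flat[i * ch + j] = vectors[j][i]
    return flat


print(interleave([[0, 1, 2], [10, 11, 12]]))  # [0, 10, 1, 11, 2, 12]
```
]]></programlisting>
<para>
With a single input vector the result equals the input, which is why
type 1 and type 2 coincide in that case.</para>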
 | |
| 
 | |
| </section>
 | |
| 
 | |
| <section>
 | |
| <title>Residue decode</title>
 | |
| 
 | |
| <section><title>header decode</title>
 | |
| 
 | |
| <para>
 | |
| Header decode for all three residue types is identical.</para>
 | |
| <programlisting>
 | |
|   1) [residue_begin] = read 24 bits as unsigned integer
 | |
|   2) [residue_end] = read 24 bits as unsigned integer
 | |
|   3) [residue_partition_size] = read 24 bits as unsigned integer and add one
 | |
|   4) [residue_classifications] = read 6 bits as unsigned integer and add one
 | |
|   5) [residue_classbook] = read 8 bits as unsigned integer
 | |
| </programlisting>
 | |
| 
 | |
| <para>
 | |
| <varname>[residue_begin]</varname> and <varname>[residue_end]</varname> select the specific
 | |
| sub-portion of each vector that is actually coded; the effect is akin
 | |
| to a bandpass where, for coding purposes, the vector effectively
 | |
| begins at element <varname>[residue_begin]</varname> and ends at
 | |
| <varname>[residue_end]</varname>.  Preceding and following values in the unpacked
 | |
| vectors are zeroed.  Note that for residue type 2, these values as
 | |
| well as <varname>[residue_partition_size]</varname> apply to the interleaved
 | |
| vector, not the individual vectors before interleave.
 | |
| <varname>[residue_partition_size]</varname> is as explained above,
 | |
| <varname>[residue_classifications]</varname> is the number of possible
 | |
| classifications to which a partition can belong and
 | |
| <varname>[residue_classbook]</varname> is the codebook number used to code
 | |
| classification codewords.  The number of dimensions in book
 | |
| <varname>[residue_classbook]</varname> determines how many classification values
 | |
| are grouped into a single classification codeword.</para>
 | |
| 
 | |
| <para>
 | |
| Next we read a bitmap pattern that specifies which partition classes
 | |
| code values in which passes.</para>
 | |
| 
 | |
| <programlisting>
 | |
|   1) iterate [i] over the range 0 ... [residue_classifications]-1 {
 | |
|   
 | |
|        2) [high_bits] = 0
 | |
|        3) [low_bits] = read 3 bits as unsigned integer
 | |
|        4) [bitflag] = read one bit as boolean
 | |
|        5) if ( [bitflag] is set ) then [high_bits] = read 5 bits as unsigned integer
 | |
|        6) vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits]
 | |
|      }
 | |
|   7) done
 | |
| </programlisting>
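<para>
As a worked illustration of the cascade arithmetic, the following Python
sketch builds one cascade element from its two fields and lists the passes
whose bits are set.  Both function names are hypothetical.</para>
<programlisting><![CDATA[
```python
def cascade_value(low_bits, high_bits=0):
    """Combine the 3-bit and optional 5-bit fields into one 8-bit bitmap."""
    return high_bits * 8 + low_bits


def stages_used(cascade):
    """Return the pass numbers (0..7) whose bit is set in the bitmap."""
    return [j for j in range(8) if cascade & (1 << j)]


value = cascade_value(low_bits=5, high_bits=3)
print(value)               # 29
print(stages_used(value))  # [0, 2, 3, 4]
```
]]></programlisting>
<para>
Here 29 is binary 11101, so this classification codes values in passes 0,
2, 3 and 4.</para>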
 | |
| 
 | |
| <para>
 | |
| Finally, we read in a list of book numbers, each corresponding to
 | |
| a specific bit set in the cascade bitmap.  We loop over the possible
 | |
| codebook classifications and the maximum possible number of encoding
 | |
| stages (8 in Vorbis I, as constrained by the elements of the cascade
 | |
| bitmap being eight bits):</para>
 | |
| 
 | |
| <programlisting>
 | |
|   1) iterate [i] over the range 0 ... [residue_classifications]-1 {
 | |
|   
 | |
|        2) iterate [j] over the range 0 ... 7 {
 | |
|   
 | |
|             3) if ( vector [residue_cascade] element [i] bit [j] is set ) {
 | |
| 
 | |
|                  4) array [residue_books] element [i][j] = read 8 bits as unsigned integer
 | |
| 
 | |
|                } else {
 | |
| 
 | |
|                  5) array [residue_books] element [i][j] = unused
 | |
| 
 | |
|                }
 | |
|           }
 | |
|       }
 | |
| 
 | |
|   6) done
 | |
| </programlisting>
 | |
| 
 | |
| <para>
 | |
| An end-of-packet condition at any point in header decode renders the
 | |
| stream undecodable.  In addition, any codebook number greater than the
 | |
| maximum numbered codebook set up in this stream also renders the
 | |
| stream undecodable.</para>
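<para>
The three header decode listings above can be combined into one Python
sketch.  The <literal>BitReader</literal> class is a hypothetical helper
that assumes bits are consumed least significant bit first; the function
name, the returned dictionary layout and the
<literal>codebook_count</literal> parameter are likewise illustrative
assumptions, not part of the specification.</para>
<programlisting><![CDATA[
```python
class BitReader:
    """Hypothetical LSb-first bit reader over a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # absolute bit position in the packet

    def read(self, bits):
        value = 0
        for k in range(bits):
            byte_index, bit_index = divmod(self.pos, 8)
            if byte_index >= len(self.data):
                # End-of-packet during header decode: stream is undecodable.
                raise EOFError("end of packet during residue header decode")
            value |= ((self.data[byte_index] >> bit_index) & 1) << k
            self.pos += 1
        return value


def decode_residue_header(reader, codebook_count):
    """Read one residue header; codebook_count is the number of codebooks
    set up in the stream (assumed to be known to the caller)."""
    residue_begin = reader.read(24)
    residue_end = reader.read(24)
    residue_partition_size = reader.read(24) + 1
    residue_classifications = reader.read(6) + 1
    residue_classbook = reader.read(8)

    # Cascade bitmap: 3 low bits, a flag, then 5 optional high bits.
    residue_cascade = []
    for _ in range(residue_classifications):
        high_bits = 0
        low_bits = reader.read(3)
        if reader.read(1):
            high_bits = reader.read(5)
        residue_cascade.append(high_bits * 8 + low_bits)

    # One book number per set cascade bit; None stands for 'unused'.
    residue_books = []
    for i in range(residue_classifications):
        row = []
        for j in range(8):
            if residue_cascade[i] & (1 << j):
                row.append(reader.read(8))
            else:
                row.append(None)
        residue_books.append(row)

    used = [residue_classbook]
    used += [book for row in residue_books for book in row if book is not None]
    if any(book >= codebook_count for book in used):
        raise ValueError("codebook number out of range; stream undecodable")

    return {
        "residue_begin": residue_begin,
        "residue_end": residue_end,
        "residue_partition_size": residue_partition_size,
        "residue_classifications": residue_classifications,
        "residue_classbook": residue_classbook,
        "residue_cascade": residue_cascade,
        "residue_books": residue_books,
    }
```
]]></programlisting>
<para>
Both error conditions described above are reported as exceptions in this
sketch; a real decoder would map them to whatever error mechanism it uses
to reject the stream.</para>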
 | |
| 
 | |
| </section>
 | |
| 
 | |
| <section><title>packet decode</title>
 | |
| 
 | |
| <para>
 | |
| Format 0 and 1 packet decode is identical except for specific
 | |
| partition interleave.  Format 2 packet decode can be built out of the
 | |
| format 1 decode process.  Thus we describe first the decode
 | |
| infrastructure identical to all three formats.</para>
 | |
| 
 | |
| <para>
 | |
| In addition to configuration information, the residue decode process
 | |
| is passed the number of vectors in the submap bundle and a vector of
 | |
| flags indicating if any of the vectors are not to be decoded.  If the
 | |
| passed in number of vectors is 3 and vector number 1 is marked 'do not
 | |
| decode', decode skips vector 1 during the decode loop.  However, even
 | |
| 'do not decode' vectors are allocated and zeroed.</para>
 | |
| 
 | |
| <para>
 | |
| The following convenience values are useful in clarifying
 | |
| the decode process:</para>
 | |
| 
 | |
| <programlisting>
 | |
|   1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook]
 | |
|   2) [n_to_read] = [residue_end] - [residue_begin]
 | |
|   3) [partitions_to_read] = [n_to_read] / [residue_partition_size]
 | |
| </programlisting>
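<para>
A small Python sketch of these convenience values follows.  The function
name and the <literal>classbook_dimensions</literal> parameter (standing
for the dimension count of codebook
<varname>[residue_classbook]</varname>) are illustrative
assumptions.</para>
<programlisting><![CDATA[
```python
def convenience_values(residue_begin, residue_end,
                       residue_partition_size, classbook_dimensions):
    """Return (classwords_per_codeword, n_to_read, partitions_to_read)."""
    classwords_per_codeword = classbook_dimensions
    n_to_read = residue_end - residue_begin
    # Integer division: a trailing partial partition is not read.
    partitions_to_read = n_to_read // residue_partition_size
    return classwords_per_codeword, n_to_read, partitions_to_read


print(convenience_values(0, 64, 8, 2))     # (2, 64, 8)
print(convenience_values(16, 100, 32, 4))  # (4, 84, 2)
```
]]></programlisting>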
 | |
| 
 | |
| <para>
 | |
| Packet decode proceeds as follows, matching the description offered earlier in the document.  We assume that the number of vectors being encoded, <varname>[ch]</varname>, is provided by the higher level decoding process.</para>
 | |
| <programlisting>
 | |
|   1) allocate and zero all vectors that will be returned.
 | |
|   2) iterate [pass] over the range 0 ... 7 {
 | |
| 
 | |
|        3) [partition_count] = 0
 | |
| 
 | |
|        4) while [partition_count] is less than [partitions_to_read] {
 | |
| 
 | |
|             5) if ([pass] is zero) {
 | |
|      
 | |
|                  6) iterate [j] over the range 0 .. [ch]-1 {
 | |
| 
 | |
|                       7) if vector [j] is not marked 'do not decode' {
 | |
| 
 | |
|                            8) [temp] = read from packet using codebook [residue_classbook] in scalar context
 | |
|                            9) iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 {
 | |
| 
 | |
|                                10) array [classifications] element [j],([i]+[partition_count]) =
 | |
|                                    [temp] integer modulo [residue_classifications]
 | |
|                                11) [temp] = [temp] / [residue_classifications] using integer division
 | |
| 
 | |
|                               }
 | |
|       
 | |
|                          }
 | |
|             
 | |
|                     }
 | |
|         
 | |
|                }
 | |
| 
 | |
|            12) iterate [i] over the range 0 .. ([classwords_per_codeword] - 1) while [partition_count] 
 | |
|                is also less than [partitions_to_read] {
 | |
| 
 | |
|                  13) iterate [j] over the range 0 .. [ch]-1 {
 | |
|    
 | |
|                       14) if vector [j] is not marked 'do not decode' {
 | |
|    
 | |
|                            15) [vqclass] = array [classifications] element [j],[partition_count]
 | |
|                            16) [vqbook] = array [residue_books] element [vqclass],[pass]
 | |
|                            17) if ([vqbook] is not 'unused') {
 | |
|    
 | |
|                                 18) decode partition into output vector number [j], starting at scalar 
 | |
|                                     offset [residue_begin]+[partition_count]*[residue_partition_size] using 
 | |
|                                     codebook number [vqbook] in VQ context
 | |
|                               }
 | |
|                          }
 | |
|                     }
 | |
|    
 | |
|                  19) increment [partition_count] by one
 | |
| 
 | |
|                }
 | |
|           }
 | |
|      }
 | |
|  
 | |
|  20) done
 | |
| 
 | |
| </programlisting>
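<para>
The unpacking of a classification codeword in the first pass (the
descending loop that takes a modulo and then divides) can be sketched in
Python.  The function name <literal>unpack_classword</literal> is
hypothetical; <literal>temp</literal> stands for the scalar already read
from the packet with codebook
<varname>[residue_classbook]</varname>.</para>
<programlisting><![CDATA[
```python
def unpack_classword(temp, classwords_per_codeword, residue_classifications):
    """Split one classification codeword into its classification numbers."""
    out = [0] * classwords_per_codeword
    # Filled from the last slot to the first: the least significant
    # base-residue_classifications digit belongs to the last partition.
    for i in range(classwords_per_codeword - 1, -1, -1):
        out[i] = temp % residue_classifications
        temp //= residue_classifications
    return out


print(unpack_classword(7, 2, 3))   # [2, 1]
print(unpack_classword(11, 3, 4))  # [0, 2, 3]
```
]]></programlisting>
<para>
In the first call 7 = 2*3 + 1, so the two partitions covered by the
codeword receive classifications 2 and 1 respectively.</para>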
 | |
| 
 | |
| <para>
 | |
| An end-of-packet condition during packet decode is to be considered a
 | |
| nominal occurrence.  Decode returns the result of vector decode up to
 | |
| that point.</para>
 | |
| 
 | |
| </section>
 | |
| 
 | |
| <section><title>format 0 specifics</title>
 | |
| 
 | |
| <para>
 | |
| Format zero decodes partitions exactly as described earlier in the
 | |
| 'Residue Format: residue 0' section.  The following pseudocode
 | |
| presents the same algorithm. Assume:</para>
 | |
| 
 | |
| <itemizedlist>
 | |
| <listitem><simpara> <varname>[n]</varname> is the value in <varname>[residue_partition_size]</varname></simpara></listitem>
 | |
| <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
 | |
| <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
 | |
| </itemizedlist>
 | |
| 
 | |
| <programlisting>
 | |
|  1) [step] = [n] / [codebook_dimensions]
 | |
|  2) iterate [i] over the range 0 ... [step]-1 {
 | |
| 
 | |
|       3) vector [entry_temp] = read vector from packet using current codebook in VQ context
 | |
|       4) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
 | |
| 
 | |
|            5) vector [v] element ([offset]+[i]+[j]*[step]) =
 | |
| 	        vector [v] element ([offset]+[i]+[j]*[step]) +
 | |
|                 vector [entry_temp] element [j]
 | |
| 
 | |
|          }
 | |
| 
 | |
|     }
 | |
| 
 | |
|   6) done
 | |
| 
 | |
| </programlisting>
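<para>
The format 0 pseudocode above translates fairly directly into Python.  In
this sketch <literal>read_vector</literal> is a hypothetical callable that
stands in for reading one vector from the packet using the current
codebook in VQ context; it is an assumption made for illustration.</para>
<programlisting><![CDATA[
```python
def decode_partition_format0(v, offset, n, codebook_dimensions, read_vector):
    """Accumulate one format 0 partition into v, starting at offset."""
    step = n // codebook_dimensions
    for i in range(step):
        entry_temp = read_vector()
        for j in range(codebook_dimensions):
            # Values of one codeword are spread step elements apart.
            v[offset + i + j * step] += entry_temp[j]


v = [0] * 10
vectors = iter([[0, 2, 4, 6], [1, 3, 5, 7]])
decode_partition_format0(v, 1, 8, 4, lambda: next(vectors))
print(v)  # [0, 0, 1, 2, 3, 4, 5, 6, 7, 0]
```
]]></programlisting>
<para>
The two codewords are the dimension 4 grouping from the residue 0 example,
and the decode restores the original order 0 through 7 at offset
1.</para>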
 | |
| 
 | |
| </section>
 | |
| 
 | |
| <section><title>format 1 specifics</title>
 | |
| 
 | |
| <para>
 | |
| Format 1 decodes partitions exactly as described earlier in the
 | |
| 'Residue Format: residue 1' section.  The following pseudocode
 | |
| presents the same algorithm. Assume:</para>
 | |
| 
 | |
| <itemizedlist>
 | |
| <listitem><simpara> <varname>[n]</varname> is the value in
 | |
| <varname>[residue_partition_size]</varname></simpara></listitem>
 | |
| <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
 | |
| <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
 | |
| </itemizedlist>
 | |
| 
 | |
| <programlisting>
 | |
|  1) [i] = 0
 | |
|  2) vector [entry_temp] = read vector from packet using current codebook in VQ context
 | |
|  3) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
 | |
| 
 | |
|       4) vector [v] element ([offset]+[i]) =
 | |
| 	  vector [v] element ([offset]+[i]) +
 | |
|           vector [entry_temp] element [j]
 | |
|       5) increment [i]
 | |
| 
 | |
|     }
 | |
|  
 | |
|   6) if ( [i] is less than [n] ) continue at step 2
 | |
|   7) done
 | |
| </programlisting>
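<para>
A corresponding Python sketch of the format 1 pseudocode follows.  As
before, <literal>read_vector</literal> is a hypothetical callable standing
in for a VQ-context read with the current codebook.</para>
<programlisting><![CDATA[
```python
def decode_partition_format1(v, offset, n, codebook_dimensions, read_vector):
    """Accumulate one format 1 partition into v, starting at offset."""
    i = 0
    while i < n:
        entry_temp = read_vector()
        for j in range(codebook_dimensions):
            # Values of one codeword are stored consecutively.
            v[offset + i] += entry_temp[j]
            i += 1


v = [0] * 10
first_pass = iter([[0, 1, 2, 3], [4, 5, 6, 7]])
decode_partition_format1(v, 1, 8, 4, lambda: next(first_pass))
print(v)  # [0, 0, 1, 2, 3, 4, 5, 6, 7, 0]

second_pass = iter([[1, 1]] * 4)
decode_partition_format1(v, 1, 8, 2, lambda: next(second_pass))
print(v)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 0]
```
]]></programlisting>
<para>
The second call illustrates how a later pass with a different codebook
dimension adds to the values already accumulated.</para>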
 | |
| 
 | |
| </section>
 | |
| 
 | |
| <section><title>format 2 specifics</title>
 | |
|  
 | |
| <para>
 | |
| Format 2 is reducible to format 1.  It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.
 | |
| </para>
 | |
| 
 | |
| <para>
 | |
| Format 2 handles 'do not decode' vectors differently than residue 0 or
 | |
| 1; if all vectors are marked 'do not decode', no decode occurs.
 | |
| However, if at least one vector is to be decoded, all the vectors are
 | |
| decoded.  We then request normal format 1 to decode a single vector
 | |
| representing all output channels, rather than a vector for each
 | |
| channel.  After decode, deinterleave the vector into independent vectors, one for each output channel.  That is:</para>
 | |
| 
 | |
| <orderedlist>
 | |
|  <listitem><simpara>If all vectors 0 through <emphasis>ch</emphasis>-1 are marked 'do not decode', allocate and clear a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis> and skip step 2 below; proceed directly to the post-decode step.</simpara></listitem>
 | |
|  <listitem><simpara>Rather than performing format 1 decode to produce <emphasis>ch</emphasis> vectors of length <emphasis>n</emphasis> each, call format 1 decode to produce a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis>. </simpara></listitem>
 | |
|  <listitem><para>Post decode: Deinterleave the single vector <varname>[v]</varname> returned by format 1 decode as described above into <emphasis>ch</emphasis> independent vectors, one for each output channel, according to:
 | |
|   <programlisting>
 | |
|   1) iterate [i] over the range 0 ... [n]-1 {
 | |
| 
 | |
|        2) iterate [j] over the range 0 ... [ch]-1 {
 | |
| 
 | |
|             3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])
 | |
| 
 | |
|           }
 | |
|      }
 | |
| 
 | |
|   4) done
 | |
|   </programlisting>
 | |
|  </para></listitem>
 | |
| </orderedlist>
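<para>
The post-decode deinterleave can be sketched in Python as follows.  The
function name <literal>deinterleave</literal> is hypothetical, and the
sketch assumes the length of <varname>[v]</varname> is exactly
<emphasis>ch*n</emphasis>.</para>
<programlisting><![CDATA[
```python
def deinterleave(v, ch):
    """Split a flat vector of length ch*n into ch vectors of length n."""
    n = len(v) // ch
    # Output vector j takes elements j, j+ch, j+2*ch, ...
    return [[v[i * ch + j] for i in range(n)] for j in range(ch)]


print(deinterleave([0, 10, 1, 11, 2, 12], 2))  # [[0, 1, 2], [10, 11, 12]]
```
]]></programlisting>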
 | |
| 
 | |
| </section>
 | |
| 
 | |
| </section>
 | |
| 
 | |
| </section>
 | |
| 
 |