mirror of
https://github.com/cookiengineer/audacity
synced 2026-04-27 00:07:17 +02:00
Move library tree where it belongs
This commit is contained in:
458
lib-src/libogg/doc/xml/03-codebook.xml
Normal file
458
lib-src/libogg/doc/xml/03-codebook.xml
Normal file
@@ -0,0 +1,458 @@
|
||||
<?xml version="1.0" standalone="no"?>
|
||||
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
|
||||
|
||||
]>
|
||||
|
||||
<section id="vorbis-spec-codebook">
|
||||
<sectioninfo>
|
||||
<releaseinfo>
|
||||
$Id: 03-codebook.xml,v 1.1.1.1 2004-11-13 16:51:21 mbrubeck Exp $
|
||||
</releaseinfo>
|
||||
</sectioninfo>
|
||||
<title>Probability Model and Codebooks</title>
|
||||
|
||||
<section>
|
||||
<title>Overview</title>
|
||||
|
||||
<para>
|
||||
Unlike practically every other mainstream audio codec, Vorbis has no
|
||||
statically configured probability model, instead packing all entropy
|
||||
decoding configuration, VQ and Huffman, into the bitstream itself in
|
||||
the third header, the codec setup header. This packed configuration
|
||||
consists of multiple 'codebooks', each containing a specific
|
||||
Huffman-equivalent representation for decoding compressed codewords as
|
||||
well as an optional lookup table of output vector values to which a
|
||||
decoded Huffman value is applied as an offset, generating the final
|
||||
decoded output corresponding to a given compressed codeword.</para>
|
||||
|
||||
<section><title>Bitwise operation</title>
|
||||
<para>
|
||||
The codebook mechanism is built on top of the vorbis bitpacker. Both
|
||||
the codebooks themselves and the codewords they decode are unrolled
|
||||
from a packet as a series of arbitrary-width values read from the
|
||||
stream according to <xref linkend="vorbis-spec-bitpacking"/>.</para>
|
||||
</section>
|
||||
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Packed codebook format</title>
|
||||
|
||||
<para>
|
||||
For purposes of the examples below, we assume that the storage
|
||||
system's native byte width is eight bits. This is not universally
|
||||
true; see <xref linkend="vorbis-spec-bitpacking"/> for discussion
|
||||
relating to non-eight-bit bytes.</para>
|
||||
|
||||
<section><title>codebook decode</title>
|
||||
|
||||
<para>
|
||||
A codebook begins with a 24 bit sync pattern, 0x564342:
|
||||
|
||||
<screen>
|
||||
byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42)
|
||||
byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43)
|
||||
byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56)
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
16 bit <varname>[codebook_dimensions]</varname> and 24 bit <varname>[codebook_entries]</varname> fields:
|
||||
|
||||
<screen>
|
||||
|
||||
byte 3: [ X X X X X X X X ]
|
||||
byte 4: [ X X X X X X X X ] [codebook_dimensions] (16 bit unsigned)
|
||||
|
||||
byte 5: [ X X X X X X X X ]
|
||||
byte 6: [ X X X X X X X X ]
|
||||
byte 7: [ X X X X X X X X ] [codebook_entries] (24 bit unsigned)
|
||||
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
Next is the <varname>[ordered]</varname> bit flag:
|
||||
|
||||
<screen>
|
||||
|
||||
byte 8: [ X ] [ordered] (1 bit)
|
||||
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
Each entry, numbering a
|
||||
total of <varname>[codebook_entries]</varname>, is assigned a codeword length.
|
||||
We now read the list of codeword lengths and store these lengths in
|
||||
the array <varname>[codebook_codeword_lengths]</varname>. Decode of lengths is
|
||||
according to whether the <varname>[ordered]</varname> flag is set or unset.
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>If the <varname>[ordered]</varname> flag is unset, the codeword list is not
|
||||
length ordered and the decoder needs to read each codeword length
|
||||
one-by-one.</para>
|
||||
|
||||
<para>The decoder first reads one additional bit flag, the
|
||||
<varname>[sparse]</varname> flag. This flag determines whether or not the
|
||||
codebook contains unused entries that are not to be included in the
|
||||
codeword decode tree:
|
||||
|
||||
<screen>
|
||||
byte 8: [ X 1 ] [sparse] flag (1 bit)
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
The decoder now performs for each of the <varname>[codebook_entries]</varname>
|
||||
codebook entries:
|
||||
|
||||
<screen>
|
||||
|
||||
1) if([sparse] is set){
|
||||
|
||||
2) [flag] = read one bit;
|
||||
3) if([flag] is set){
|
||||
|
||||
4) [length] = read a five bit unsigned integer;
|
||||
5) codeword length for this entry is [length]+1;
|
||||
|
||||
} else {
|
||||
|
||||
6) this entry is unused. mark it as such.
|
||||
|
||||
}
|
||||
|
||||
} else the sparse flag is not set {
|
||||
|
||||
7) [length] = read a five bit unsigned integer;
|
||||
8) the codeword length for this entry is [length]+1;
|
||||
|
||||
}
|
||||
|
||||
</screen></para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>If the <varname>[ordered]</varname> flag is set, the codeword list for this
|
||||
codebook is encoded in ascending length order. Rather than reading
|
||||
a length for every codeword, the encoder reads the number of
|
||||
codewords per length. That is, beginning at entry zero:
|
||||
|
||||
<screen>
|
||||
1) [current_entry] = 0;
|
||||
2) [current_length] = read a five bit unsigned integer and add 1;
|
||||
3) [number] = read <link linkend="vorbis-spec-ilog">ilog</link>([codebook_entries] - [current_entry]) bits as an unsigned integer
|
||||
4) set the entries [current_entry] through [current_entry]+[number]-1, inclusive,
|
||||
of the [codebook_codeword_lengths] array to [current_length]
|
||||
5) set [current_entry] to [number] + [current_entry]
|
||||
6) increment [current_length] by 1
|
||||
7) if [current_entry] is greater than [codebook_entries] ERROR CONDITION;
|
||||
the decoder will not be able to read this stream.
|
||||
8) if [current_entry] is less than [codebook_entries], repeat process starting at 3)
|
||||
9) done.
|
||||
</screen></para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
After all codeword lengths have been decoded, the decoder reads the
|
||||
vector lookup table. Vorbis I supports three lookup types:
|
||||
<orderedlist>
|
||||
<listitem>
|
||||
<simpara>No lookup</simpara>
|
||||
</listitem><listitem>
|
||||
<simpara>Implicitly populated value mapping (lattice VQ)</simpara>
|
||||
</listitem><listitem>
|
||||
<simpara>Explicitly populated value mapping (tessellated or 'foam'
|
||||
VQ)</simpara>
|
||||
</listitem>
|
||||
</orderedlist>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The lookup table type is read as a four bit unsigned integer:
|
||||
<screen>
|
||||
1) [codebook_lookup_type] = read four bits as an unsigned integer
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
Codebook decode precedes according to <varname>[codebook_lookup_type]</varname>:
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Lookup type zero indicates no lookup to be read. Proceed past
|
||||
lookup decode.</para>
|
||||
</listitem><listitem>
|
||||
<para>Lookup types one and two are similar, differing only in the
|
||||
number of lookup values to be read. Lookup type one reads a list of
|
||||
values that are permuted in a set pattern to build a list of vectors,
|
||||
each vector of order <varname>[codebook_dimensions]</varname> scalars. Lookup
|
||||
type two builds the same vector list, but reads each scalar for each
|
||||
vector explicitly, rather than building vectors from a smaller list of
|
||||
possible scalar values. Lookup decode proceeds as follows:
|
||||
|
||||
<screen>
|
||||
1) [codebook_minimum_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer)
|
||||
2) [codebook_delta_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer)
|
||||
3) [codebook_value_bits] = read 4 bits as an unsigned integer and add 1
|
||||
4) [codebook_sequence_p] = read 1 bit as a boolean flag
|
||||
|
||||
if ( [codebook_lookup_type] is 1 ) {
|
||||
|
||||
5) [codebook_lookup_values] = <link linkend="vorbis-spec-lookup1_values">lookup1_values</link>(<varname>[codebook_entries]</varname>, <varname>[codebook_dimensions]</varname> )
|
||||
|
||||
} else {
|
||||
|
||||
6) [codebook_lookup_values] = <varname>[codebook_entries]</varname> * <varname>[codebook_dimensions]</varname>
|
||||
|
||||
}
|
||||
|
||||
7) read a total of [codebook_lookup_values] unsigned integers of [codebook_value_bits] each;
|
||||
store these in order in the array [codebook_multiplicands]
|
||||
</screen></para>
|
||||
</listitem><listitem>
|
||||
<para>A <varname>[codebook_lookup_type]</varname> of greater than two is reserved
|
||||
and indicates a stream that is not decodable by the specification in this
|
||||
document.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
An 'end of packet' during any read operation in the above steps is
|
||||
considered an error condition rendering the stream undecodable.</para>
|
||||
|
||||
<section><title>Huffman decision tree representation</title>
|
||||
|
||||
<para>
|
||||
The <varname>[codebook_codeword_lengths]</varname> array and
|
||||
<varname>[codebook_entries]</varname> value uniquely define the Huffman decision
|
||||
tree used for entropy decoding.</para>
|
||||
|
||||
<para>
|
||||
Briefly, each used codebook entry (recall that length-unordered
|
||||
codebooks support unused codeword entries) is assigned, in order, the
|
||||
lowest valued unused binary Huffman codeword possible. Assume the
|
||||
following codeword length list:
|
||||
|
||||
<screen>
|
||||
entry 0: length 2
|
||||
entry 1: length 4
|
||||
entry 2: length 4
|
||||
entry 3: length 4
|
||||
entry 4: length 4
|
||||
entry 5: length 2
|
||||
entry 6: length 3
|
||||
entry 7: length 3
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
Assigning codewords in order (lowest possible value of the appropriate
|
||||
length to highest) results in the following codeword list:
|
||||
|
||||
<screen>
|
||||
entry 0: length 2 codeword 00
|
||||
entry 1: length 4 codeword 0100
|
||||
entry 2: length 4 codeword 0101
|
||||
entry 3: length 4 codeword 0110
|
||||
entry 4: length 4 codeword 0111
|
||||
entry 5: length 2 codeword 10
|
||||
entry 6: length 3 codeword 110
|
||||
entry 7: length 3 codeword 111
|
||||
</screen></para>
|
||||
|
||||
|
||||
<note>
|
||||
<para>
|
||||
Unlike most binary numerical values in this document, we
|
||||
intend the above codewords to be read and used bit by bit from left to
|
||||
right, thus the codeword '001' is the bit string 'zero, zero, one'.
|
||||
When determining 'lowest possible value' in the assignment definition
|
||||
above, the leftmost bit is the MSb.</para>
|
||||
</note>
|
||||
|
||||
<para>
|
||||
It is clear that the codeword length list represents a Huffman
|
||||
decision tree with the entry numbers equivalent to the leaves numbered
|
||||
left-to-right:
|
||||
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="hufftree.png" format="PNG"/>
|
||||
</imageobject>
|
||||
<textobject>
|
||||
<phrase>[huffman tree illustration]</phrase>
|
||||
</textobject>
|
||||
</mediaobject>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
As we assign codewords in order, we see that each choice constructs a
|
||||
new leaf in the leftmost possible position.</para>
|
||||
|
||||
<para>
|
||||
Note that it's possible to underspecify or overspecify a Huffman tree
|
||||
via the length list. In the above example, if codeword seven were
|
||||
eliminated, it's clear that the tree is unfinished:
|
||||
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="hufftree-under.png" format="PNG"/>
|
||||
</imageobject>
|
||||
<textobject>
|
||||
<phrase>[underspecified huffman tree illustration]</phrase>
|
||||
</textobject>
|
||||
</mediaobject>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Similarly, in the original codebook, it's clear that the tree is fully
|
||||
populated and a ninth codeword is impossible. Both underspecified and
|
||||
overspecified trees are an error condition rendering the stream
|
||||
undecodable.</para>
|
||||
|
||||
<para>
|
||||
Codebook entries marked 'unused' are simply skipped in the assigning
|
||||
process. They have no codeword and do not appear in the decision
|
||||
tree, thus it's impossible for any bit pattern read from the stream to
|
||||
decode to that entry number.</para>
|
||||
|
||||
</section>
|
||||
|
||||
<section><title>VQ lookup table vector representation</title>
|
||||
|
||||
<para>
|
||||
Unpacking the VQ lookup table vectors relies on the following values:
|
||||
<programlisting>
|
||||
the [codebook_multiplicands] array
|
||||
[codebook_minimum_value]
|
||||
[codebook_delta_value]
|
||||
[codebook_sequence_p]
|
||||
[codebook_lookup_type]
|
||||
[codebook_entries]
|
||||
[codebook_dimensions]
|
||||
[codebook_lookup_values]
|
||||
</programlisting>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Decoding (unpacking) a specific vector in the vector lookup table
|
||||
proceeds according to <varname>[codebook_lookup_type]</varname>. The unpacked
|
||||
vector values are what a codebook would return during audio packet
|
||||
decode in a VQ context.</para>
|
||||
|
||||
<section><title>Vector value decode: Lookup type 1</title>
|
||||
|
||||
<para>
|
||||
Lookup type one specifies a lattice VQ lookup table built
|
||||
algorithmically from a list of scalar values. Calculate (unpack) the
|
||||
final values of a codebook entry vector from the entries in
|
||||
<varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname>
|
||||
is the output vector representing the vector of values for entry number
|
||||
<varname>[lookup_offset]</varname> in this codebook):
|
||||
|
||||
<screen>
|
||||
1) [last] = 0;
|
||||
2) [index_divisor] = 1;
|
||||
3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {
|
||||
|
||||
4) [multiplicand_offset] = ( [lookup_offset] divided by [index_divisor] using integer
|
||||
division ) integer modulo [codebook_lookup_values]
|
||||
|
||||
5) vector [value_vector] element [i] =
|
||||
( [codebook_multiplicands] array element number [multiplicand_offset] ) *
|
||||
[codebook_delta_value] + [codebook_minimum_value] + [last];
|
||||
|
||||
6) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]
|
||||
|
||||
7) [index_divisor] = [index_divisor] * [codebook_lookup_values]
|
||||
|
||||
}
|
||||
|
||||
8) vector calculation completed.
|
||||
</screen></para>
|
||||
|
||||
</section>
|
||||
|
||||
<section><title>Vector value decode: Lookup type 2</title>
|
||||
|
||||
<para>
|
||||
Lookup type two specifies a VQ lookup table in which each scalar in
|
||||
each vector is explicitly set by the <varname>[codebook_multiplicands]</varname>
|
||||
array in a one-to-one mapping. Calculate [unpack] the
|
||||
final values of a codebook entry vector from the entries in
|
||||
<varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname>
|
||||
is the output vector representing the vector of values for entry number
|
||||
<varname>[lookup_offset]</varname> in this codebook):
|
||||
|
||||
<screen>
|
||||
1) [last] = 0;
|
||||
2) [multiplicand_offset] = [lookup_offset] * [codebook_dimensions]
|
||||
3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {
|
||||
|
||||
4) vector [value_vector] element [i] =
|
||||
( [codebook_multiplicands] array element number [multiplicand_offset] ) *
|
||||
[codebook_delta_value] + [codebook_minimum_value] + [last];
|
||||
|
||||
5) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]
|
||||
|
||||
6) increment [multiplicand_offset]
|
||||
|
||||
}
|
||||
|
||||
7) vector calculation completed.
|
||||
</screen></para>
|
||||
|
||||
</section>
|
||||
|
||||
</section>
|
||||
|
||||
</section>
|
||||
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Use of the codebook abstraction</title>
|
||||
|
||||
<para>
|
||||
The decoder uses the codebook abstraction much as it does the
|
||||
bit-unpacking convention; a specific codebook reads a
|
||||
codeword from the bitstream, decoding it into an entry number, and then
|
||||
returns that entry number to the decoder (when used in a scalar
|
||||
entropy coding context), or uses that entry number as an offset into
|
||||
the VQ lookup table, returning a vector of values (when used in a context
|
||||
desiring a VQ value). Scalar or VQ context is always explicit; any call
|
||||
to the codebook mechanism requests either a scalar entry number or a
|
||||
lookup vector.</para>
|
||||
|
||||
<para>
|
||||
Note that VQ lookup type zero indicates that there is no lookup table;
|
||||
requesting decode using a codebook of lookup type 0 in any context
|
||||
expecting a vector return value (even in a case where a vector of
|
||||
dimension one) is forbidden. If decoder setup or decode requests such
|
||||
an action, that is an error condition rendering the packet
|
||||
undecodable.</para>
|
||||
|
||||
<para>
|
||||
Using a codebook to read from the packet bitstream consists first of
|
||||
reading and decoding the next codeword in the bitstream. The decoder
|
||||
reads bits until the accumulated bits match a codeword in the
|
||||
codebook. This process can be though of as logically walking the
|
||||
Huffman decode tree by reading one bit at a time from the bitstream,
|
||||
and using the bit as a decision boolean to take the 0 branch (left in
|
||||
the above examples) or the 1 branch (right in the above examples).
|
||||
Walking the tree finishes when the decode process hits a leaf in the
|
||||
decision tree; the result is the entry number corresponding to that
|
||||
leaf. Reading past the end of a packet propagates the 'end-of-stream'
|
||||
condition to the decoder.</para>
|
||||
|
||||
<para>
|
||||
When used in a scalar context, the resulting codeword entry is the
|
||||
desired return value.</para>
|
||||
|
||||
<para>
|
||||
When used in a VQ context, the codeword entry number is used as an
|
||||
offset into the VQ lookup table. The value returned to the decoder is
|
||||
the vector of scalars corresponding to this offset.</para>
|
||||
|
||||
</section>
|
||||
|
||||
</section>
|
||||
|
||||
<!-- end section of probablity model and codebooks -->
|
||||
Reference in New Issue
Block a user