mirror of
https://github.com/cookiengineer/audacity
synced 2025-10-17 08:01:12 +02:00
Update twolame to 0.3.13.
@@ -2,20 +2,27 @@
|
||||
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
|
||||
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
|
||||
<head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
|
||||
<meta name="generator" content="AsciiDoc 7.1.2" />
|
||||
<link rel="stylesheet" href="./twolame.css" type="text/css" />
|
||||
<link rel="stylesheet" href="./twolame-quirks.css" type="text/css" />
|
||||
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
|
||||
<meta name="generator" content="AsciiDoc 8.6.3" />
|
||||
<title>TwoLAME: MPEG Audio Layer II VBR</title>
|
||||
<link rel="stylesheet" href="./twolame.css" type="text/css" />
|
||||
<script type="text/javascript">
|
||||
/*<![CDATA[*/
|
||||
window.onload = function(){asciidoc.footnotes();}
|
||||
/*]]>*/
|
||||
</script>
|
||||
<script type="text/javascript" src="./asciidoc-xhtml11.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<body class="article">
|
||||
<div id="header">
|
||||
<h1>TwoLAME: MPEG Audio Layer II VBR</h1>
|
||||
<span id="revision">version 0.3.11</span>
|
||||
<span id="revnumber">version 0.3.13</span>
|
||||
</div>
|
||||
<h2>Contents</h2>
|
||||
<div id="content">
|
||||
<div class="sect1">
|
||||
<h2 id="_contents">Contents</h2>
|
||||
<div class="sectionbody">
|
||||
<ul>
|
||||
<div class="ulist"><ul>
|
||||
<li>
|
||||
<p>
|
||||
Introduction
|
||||
@@ -33,7 +40,7 @@ Bitrate Ranges for various Sampling frequencies
|
||||
</li>
|
||||
<li>
|
||||
<p>
|
||||
Why can't the bitrate vary from 32kbps to 384kbps for every file?
|
||||
Why can’t the bitrate vary from 32kbps to 384kbps for every file?
|
||||
</p>
|
||||
</li>
|
||||
<li>
|
||||
@@ -51,29 +58,33 @@ Long Answer
|
||||
Tech Stuff
|
||||
</p>
|
||||
</li>
|
||||
</ul>
|
||||
</ul></div>
|
||||
</div>
|
||||
<h2>Introduction</h2>
|
||||
</div>
|
||||
<div class="sect1">
|
||||
<h2 id="_introduction">Introduction</h2>
|
||||
<div class="sectionbody">
|
||||
<p>VBR mode works by selecting a different bitrate for each frame. Frames
|
||||
which are harder to encode will be allocated more bits i.e. a higher bitrate.</p>
|
||||
<p>LayerII VBR is a complete hack - the ISO standard actually says that decoders are not
|
||||
<div class="paragraph"><p>VBR mode works by selecting a different bitrate for each frame. Frames
|
||||
which are harder to encode will be allocated more bits i.e. a higher bitrate.</p></div>
|
||||
<div class="paragraph"><p>LayerII VBR is a complete hack - the ISO standard actually says that decoders are not
|
||||
required to support it. As a hack, its implementation is a pain to try and understand.
|
||||
If you're mega-keen to get full range VBR working, either (a) send me money (b) grab the
|
||||
ISO standard and a C compiler and email me.</p>
|
||||
If you’re mega-keen to get full range VBR working, either (a) send me money (b) grab the
|
||||
ISO standard and a C compiler and email me.</p></div>
|
||||
</div>
|
||||
<h2>Usage</h2>
|
||||
</div>
|
||||
<div class="sect1">
|
||||
<h2 id="_usage">Usage</h2>
|
||||
<div class="sectionbody">
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
<pre><tt>twolame -v [level] inputfile outputfile</tt></pre>
|
||||
</div></div>
|
||||
<p>A level of 5 works very well for me.</p>
|
||||
<p>The level value is a measurement of quality - the higher
|
||||
<div class="paragraph"><p>A level of 5 works very well for me.</p></div>
|
||||
<div class="paragraph"><p>The level value is a measurement of quality - the higher
|
||||
the level the higher the average bitrate of the resultant file.
|
||||
[See TECH STUFF for a better explanation of what the value does]</p>
|
||||
<p>The confusing part of my implementation of LayerII VBR is that it's different from MP3 VBR.</p>
|
||||
<ul>
|
||||
See TECH STUFF for a better explanation of what the value does.</p></div>
|
||||
<div class="paragraph"><p>The confusing part of my implementation of LayerII VBR is that it’s different from MP3 VBR.</p></div>
|
||||
<div class="ulist"><ul>
|
||||
<li>
|
||||
<p>
|
||||
The range of bitrates used is controlled by the input sampling frequency. (See below "Bitrate ranges")
|
||||
@@ -84,17 +95,19 @@ The range of bitrates used is controlled by the input sampling frequency. (See b
|
||||
The tendency to use higher bitrates is governed by the <level>.
|
||||
</p>
|
||||
</li>
|
||||
</ul>
|
||||
<p>E.g. Say you have a 44.1kHz Stereo file. In VBR mode, the bitrate can range from 192 to 384 kbps.</p>
|
||||
<p>Using "-v -5" will force the encoder to favour the lower bitrate.</p>
|
||||
<p>Using "-v 5" will force the encoder to favour the upper bitrate.</p>
|
||||
<p>The value can actually be <strong>any</strong> int. -27, 233, 47. The larger the number, the greater
|
||||
the bitrate bias.</p>
|
||||
</ul></div>
|
||||
<div class="paragraph"><p>E.g. Say you have a 44.1kHz Stereo file. In VBR mode, the bitrate can range from 192 to 384 kbps.</p></div>
|
||||
<div class="paragraph"><p>Using "-v -5" will force the encoder to favour the lower bitrate.</p></div>
|
||||
<div class="paragraph"><p>Using "-v 5" will force the encoder to favour the upper bitrate.</p></div>
|
||||
<div class="paragraph"><p>The value can actually be <strong>any</strong> int. -27, 233, 47. The larger the number, the greater
|
||||
the bitrate bias.</p></div>
|
||||
</div>
|
||||
<h2>Bitrate Ranges</h2>
|
||||
</div>
|
||||
<div class="sect1">
|
||||
<h2 id="_bitrate_ranges">Bitrate Ranges</h2>
|
||||
<div class="sectionbody">
|
||||
<p>When making a VBR stream, the bitrate is only allowed to vary within
|
||||
set limits</p>
|
||||
<div class="paragraph"><p>When making a VBR stream, the bitrate is only allowed to vary within
|
||||
set limits</p></div>
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
<pre><tt>48kHz
|
||||
@@ -111,10 +124,12 @@ Stereo: 192-384kbps Mono: 96-192kbps</tt></pre>
|
||||
Stereo/Mono: 8-160kbps</tt></pre>
|
||||
</div></div>
|
||||
</div>
|
||||
<h2>Why doesn't the VBR mode work the same as MP3VBR? The Short Answer</h2>
|
||||
</div>
|
||||
<div class="sect1">
|
||||
<h2 id="_why_doesn_8217_t_the_vbr_mode_work_the_same_as_mp3vbr_the_short_answer">Why doesn’t the VBR mode work the same as MP3VBR? The Short Answer</h2>
|
||||
<div class="sectionbody">
|
||||
<p><strong>Why can't the bitrate vary from 32kbps to 384kbps for every file?</strong></p>
|
||||
<p>According to the standard (ISO/IEC 11172-3:1993) Section 2.4.2.3</p>
|
||||
<div class="paragraph"><p><strong>Why can’t the bitrate vary from 32kbps to 384kbps for every file?</strong></p></div>
|
||||
<div class="paragraph"><p>According to the standard (ISO/IEC 11172-3:1993) Section 2.4.2.3</p></div>
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
<pre><tt>"In order to provide the smallest possible delay and complexity, the
|
||||
@@ -130,16 +145,19 @@ Stereo/Mono: 8-160kbps</tt></pre>
|
||||
<div class="content">
|
||||
<pre><tt>"For Layer II, not all combinations of total bitrate and mode are allowed."</tt></pre>
|
||||
</div></div>
|
||||
<p>Hence, most LayerII coders would not have been written with VBR in mind, and
|
||||
<div class="paragraph"><p>Hence, most LayerII coders would not have been written with VBR in mind, and
|
||||
LayerII VBR is a hack. It works for limited cases. Getting it to work to
|
||||
the same extent as MP3-style VBR will be a major hack.</p>
|
||||
<p>(If you <strong>really</strong> want better bitrate ranges, read "The Long Answer" and submit your mega-patch.)</p>
|
||||
the same extent as MP3-style VBR will be a major hack.</p></div>
|
||||
<div class="paragraph"><p>(If you <strong>really</strong> want better bitrate ranges, read "The Long Answer" and submit your mega-patch.)</p></div>
|
||||
</div>
|
||||
<h2>Why doesn't the VBR mode work the same as MP3VBR? The Long Answer</h2>
|
||||
</div>
|
||||
<div class="sect1">
|
||||
<h2 id="_why_doesn_8217_t_the_vbr_mode_work_the_same_as_mp3vbr_the_long_answer">Why doesn’t the VBR mode work the same as MP3VBR? The Long Answer</h2>
|
||||
<div class="sectionbody">
|
||||
<p><strong>Why can't the bitrate vary from 32kbps to 384kbps for every file?</strong></p>
|
||||
<h3>Reason 1: The standard limits the range</h3>
|
||||
<p>As quoted above from the standard for 48/44.1/32kHz:</p>
|
||||
<div class="paragraph"><p><strong>Why can’t the bitrate vary from 32kbps to 384kbps for every file?</strong></p></div>
|
||||
<div class="sect2">
|
||||
<h3 id="_reason_1_the_standard_limits_the_range">Reason 1: The standard limits the range</h3>
|
||||
<div class="paragraph"><p>As quoted above from the standard for 48/44.1/32kHz:</p></div>
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
<pre><tt>"For Layer II, not all combinations of total bitrate and mode are allowed. See
|
||||
@@ -164,22 +182,24 @@ the same extent as MP3-style VBR will be a major hack.</p>
|
||||
320 stereo only
|
||||
384 stereo only</tt></pre>
|
||||
</div></div>
|
||||
<p>So based upon this table alone, you <strong>could</strong> have VBR stereo encoding which varies
|
||||
<div class="paragraph"><p>So based upon this table alone, you <strong>could</strong> have VBR stereo encoding which varies
|
||||
smoothly from 96 to 384kbps. Or you could have VBR mono encoding which varies from
|
||||
32 to 192kbps. But since the top and bottom bitrates don't apply to all modes, it would
|
||||
be impossible to have a stereo file encoded from 32 to 384 kbps.</p>
|
||||
<p>But this isn't what is really limiting the allowable bitrate range - the bit allocation
|
||||
tables are the major hurdle.</p>
|
||||
<h3>Reason 2: The bit allocation tables don't allow it</h3>
|
||||
<p>From the standard, Section 2.4.3.3.1 "Bit allocation decoding"</p>
|
||||
32 to 192kbps. But since the top and bottom bitrates don’t apply to all modes, it would
|
||||
be impossible to have a stereo file encoded from 32 to 384 kbps.</p></div>
|
||||
<div class="paragraph"><p>But this isn’t what is really limiting the allowable bitrate range - the bit allocation
|
||||
tables are the major hurdle.</p></div>
|
||||
</div>
|
||||
<div class="sect2">
|
||||
<h3 id="_reason_2_the_bit_allocation_tables_don_8217_t_allow_it">Reason 2: The bit allocation tables don’t allow it</h3>
|
||||
<div class="paragraph"><p>From the standard, Section 2.4.3.3.1 "Bit allocation decoding"</p></div>
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
<pre><tt>"For different combinations of bitrate and sampling frequency, different bit
|
||||
allocation tables exist.</tt></pre>
|
||||
</div></div>
|
||||
<p>These bit allocation tables are pre-determined tables (in Annex B of the standard) which
|
||||
indicate</p>
|
||||
<ul>
|
||||
<div class="paragraph"><p>These bit allocation tables are pre-determined tables (in Annex B of the standard) which
|
||||
indicate</p></div>
|
||||
<div class="ulist"><ul>
|
||||
<li>
|
||||
<p>
|
||||
how many bits to read for the initial data (2,3 or 4)
|
||||
@@ -191,11 +211,11 @@ these bits are then used as an index back into the table to
|
||||
find the number of quantize levels for the samples in this subband
|
||||
</p>
|
||||
</li>
|
||||
</ul>
|
||||
<p>But the table used (and hence the number of bits and the calculated index) are different
|
||||
for different combinations of bitrate and sampling frequency.</p>
|
||||
<p>I will use Table B.2a as an example.</p>
|
||||
<p>Table B.2a applies to the following combinations.</p>
|
||||
</ul></div>
|
||||
<div class="paragraph"><p>But the table used (and hence the number of bits and the calculated index) are different
|
||||
for different combinations of bitrate and sampling frequency.</p></div>
|
||||
<div class="paragraph"><p>I will use Table B.2a as an example.</p></div>
|
||||
<div class="paragraph"><p>Table B.2a applies to the following combinations.</p></div>
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
<pre><tt>Sampling Freq Bitrates in (kbps/channel) [emphasis: this is a PER CHANNEL bitrate]
|
||||
@@ -203,17 +223,17 @@ for different combinations of bitrate and sampling frequency.</p>
|
||||
44.1 56, 64, 80
|
||||
32 56, 64, 80</tt></pre>
|
||||
</div></div>
|
||||
<p>If we have a STEREO 48kHz input file, and we use this table, then the bitrates
|
||||
we could calculate from this would be 112, 128, 160, 192, 224, 256, 320 and 384 kbps.</p>
|
||||
<p>This table contains no information on how to encode stuff at bitrates less than 112kbps
|
||||
<div class="paragraph"><p>If we have a STEREO 48kHz input file, and we use this table, then the bitrates
|
||||
we could calculate from this would be 112, 128, 160, 192, 224, 256, 320 and 384 kbps.</p></div>
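<div class="paragraph"><p>The arithmetic behind that list can be sketched as follows. This is only an illustration, not code from twolame: the per-channel values are an assumption obtained by halving the stereo totals quoted above, and the names <tt>TABLE_B2A_48KHZ_PER_CHANNEL</tt> and <tt>total_bitrates</tt> are hypothetical.</p></div>

```python
# Per-channel bitrates (kbps) covered by Table B.2a at 48 kHz.
# Assumption: obtained by halving the stereo totals quoted in the text,
# not copied from the standard itself.
TABLE_B2A_48KHZ_PER_CHANNEL = [56, 64, 80, 96, 112, 128, 160, 192]


def total_bitrates(per_channel_kbps, channels):
    """Return the total stream bitrates (kbps) for the given channel count."""
    return [rate * channels for rate in per_channel_kbps]


stereo = total_bitrates(TABLE_B2A_48KHZ_PER_CHANNEL, channels=2)
print(stereo)  # [112, 128, 160, 192, 224, 256, 320, 384]
```

<div class="paragraph"><p>The lowest stereo total is 112kbps, which is why this table alone cannot describe a stereo frame below that rate.</p></div>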
|
||||
<div class="paragraph"><p>This table contains no information on how to encode stuff at bitrates less than 112kbps
|
||||
(for a stereo file). You would have to load allocation table B.2c to encode stereo at
|
||||
64kbps and 128kbps.</p>
|
||||
<p>Since it would be a MAJOR piece of hacking to get the different tables shifted in and out
|
||||
during the encoding process, once an allocation table is loaded <strong>IT IS NOT CHANGED</strong>.</p>
|
||||
<p>Hence, the best table is picked at the start of the encoding process, and the encoder
|
||||
is stuck with it for the rest of the encode.</p>
|
||||
<p>For twolame-02j, I have picked the table it loads for different
|
||||
sampling frequencies in order to optimize the range of bitrates possible.</p>
|
||||
64kbps and 128kbps.</p></div>
|
||||
<div class="paragraph"><p>Since it would be a MAJOR piece of hacking to get the different tables shifted in and out
|
||||
during the encoding process, once an allocation table is loaded <strong>IT IS NOT CHANGED</strong>.</p></div>
|
||||
<div class="paragraph"><p>Hence, the best table is picked at the start of the encoding process, and the encoder
|
||||
is stuck with it for the rest of the encode.</p></div>
|
||||
<div class="paragraph"><p>For twolame-02j, I have picked the table it loads for different
|
||||
sampling frequencies in order to optimize the range of bitrates possible.</p></div>
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
<pre><tt>48 kHz - Table B.2a
|
||||
@@ -235,21 +255,26 @@ sampling frequencies in order to optimize the range of bitrates possible.</p>
|
||||
bitrate over the entire range.</tt></pre>
|
||||
</div></div>
|
||||
</div>
|
||||
<h2>Tech Stuff</h2>
|
||||
<div class="sectionbody">
|
||||
<p>The VBR mode is mainly centered around the main_bit_allocation() and
|
||||
a_bit_allocation() routines in encode.c.</p>
|
||||
<p>The limited range of VBR is due to my particular implementation which restricts
|
||||
ranges to within one alloc table (see tables B.2a, B.2b, B.2c and B.2d in ISO 11172).
|
||||
The VBR range for 32/44.1kHz lies within B.2b, and the 48kHz VBR lies within table B.2a.</p>
|
||||
<p>I'm not sure whether it is worth extending these ranges down to lower bitrates.
|
||||
The work required to switch alloc tables <strong>during</strong> the encoding is major.</p>
|
||||
<p>In the case of silence, it might be worth doing a quick check for very low signals
|
||||
and writing a pre-calculated <strong>blank</strong> 32kbps frame. [probably also a lot of work].</p>
|
||||
</div>
|
||||
<h2>How CBR works</h2>
|
||||
</div>
|
||||
<div class="sect1">
|
||||
<h2 id="_tech_stuff">Tech Stuff</h2>
|
||||
<div class="sectionbody">
|
||||
<ul>
|
||||
<div class="paragraph"><p>The VBR mode is mainly centered around the main_bit_allocation() and
|
||||
a_bit_allocation() routines in encode.c.</p></div>
|
||||
<div class="paragraph"><p>The limited range of VBR is due to my particular implementation which restricts
|
||||
ranges to within one alloc table (see tables B.2a, B.2b, B.2c and B.2d in ISO 11172).
|
||||
The VBR range for 32/44.1kHz lies within B.2b, and the 48kHz VBR lies within table B.2a.</p></div>
|
||||
<div class="paragraph"><p>I’m not sure whether it is worth extending these ranges down to lower bitrates.
|
||||
The work required to switch alloc tables <strong>during</strong> the encoding is major.</p></div>
|
||||
<div class="paragraph"><p>In the case of silence, it might be worth doing a quick check for very low signals
|
||||
and writing a pre-calculated <strong>blank</strong> 32kbps frame. [probably also a lot of work].</p></div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect1">
|
||||
<h2 id="_how_cbr_works">How CBR works</h2>
|
||||
<div class="sectionbody">
|
||||
<div class="ulist"><ul>
|
||||
<li>
|
||||
<p>
|
||||
Use the psycho model to determine the MNRs for each subband
|
||||
@@ -281,11 +306,13 @@ This mode does not guarantee that all the subbands are without noise
|
||||
ie there may still be subbands with MNR less than 0.0 (noisy!)
|
||||
</p>
|
||||
</li>
|
||||
</ul>
|
||||
</ul></div>
|
||||
</div>
|
||||
<h2>How VBR works</h2>
|
||||
</div>
|
||||
<div class="sect1">
|
||||
<h2 id="_how_vbr_works">How VBR works</h2>
|
||||
<div class="sectionbody">
|
||||
<ul>
|
||||
<div class="ulist"><ul>
|
||||
<li>
|
||||
<p>
|
||||
pretend we have lots of bits to spare, and work out the bits which would
|
||||
@@ -309,14 +336,16 @@ VBR "guarantees" that all subbands have MNR > VBRLEVEL or that we have
|
||||
reached the maximum bitrate.
|
||||
</p>
|
||||
</li>
|
||||
</ul>
|
||||
</ul></div>
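<div class="paragraph"><p>The last step of the list above, settling on a bitrate for the frame, might look roughly like the sketch below. It is not the code in encode.c. It assumes the bits needed for the frame have already been worked out, ignores padding, and uses the 48kHz stereo totals from the Table B.2a example as the allowed range. The names <tt>frame_capacity_bits</tt> and <tt>pick_frame_bitrate</tt> are hypothetical.</p></div>

```python
SAMPLES_PER_FRAME = 1152  # samples per channel in one MPEG-1 Layer II frame

# Assumption: allowed totals for 48 kHz stereo, taken from the Table B.2a example.
B2A_STEREO_48KHZ_KBPS = [112, 128, 160, 192, 224, 256, 320, 384]


def frame_capacity_bits(bitrate_kbps, sample_rate_hz):
    """Approximate size of one frame in bits (padding ignored)."""
    return bitrate_kbps * 1000 * SAMPLES_PER_FRAME // sample_rate_hz


def pick_frame_bitrate(required_bits, allowed_kbps, sample_rate_hz):
    """Pick the lowest allowed bitrate whose frame can hold required_bits.

    Falls back to the highest allowed bitrate when nothing is large enough,
    which mirrors the "or we have reached the maximum bitrate" case.
    """
    for kbps in sorted(allowed_kbps):
        if frame_capacity_bits(kbps, sample_rate_hz) >= required_bits:
            return kbps
    return max(allowed_kbps)


print(pick_frame_bitrate(3000, B2A_STEREO_48KHZ_KBPS, 48000))  # 128
```

<div class="paragraph"><p>A frame that needs more bits than the largest frame can hold is simply encoded at the top of the range, so the MNR target is not met for that frame.</p></div>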
|
||||
</div>
|
||||
<h2>FUTURE</h2>
|
||||
</div>
|
||||
<div class="sect1">
|
||||
<h2 id="_future">FUTURE</h2>
|
||||
<div class="sectionbody">
|
||||
<ul>
|
||||
<div class="ulist"><ul>
|
||||
<li>
|
||||
<p>
|
||||
with this VBR mode, we know the bits aren't going to run out, so we can
|
||||
with this VBR mode, we know the bits aren’t going to run out, so we can
|
||||
just assign them "greedily".
|
||||
</p>
|
||||
</li>
|
||||
@@ -325,12 +354,15 @@ with this VBR mode, we know the bits aren't going to run out, so we can
|
||||
VBR_a_bit_allocation() is yet to be written :)
|
||||
</p>
|
||||
</li>
|
||||
</ul>
|
||||
</ul></div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div id="footnotes"><hr /></div>
|
||||
<div id="footer">
|
||||
<div id="footer-text">
|
||||
Version 0.3.11<br />
|
||||
Last updated 09-Jan-2008 11:45:18 BST
|
||||
Version 0.3.13<br />
|
||||
Last updated 2011-01-01 22:51:38 GMT
|
||||
</div>
|
||||
</div>
|
||||
</body>
|
||||
|