mirror of
https://github.com/cookiengineer/audacity
synced 2025-05-04 17:49:45 +02:00
202 lines
6.9 KiB
XML
<?xml version="1.0" standalone="no"?>
<!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [

]>

<appendix id="vorbis-over-ogg">
<appendixinfo>
<releaseinfo>
$Id: a1-encapsulation_ogg.xml,v 1.1.1.1 2004-11-13 16:51:21 mbrubeck Exp $
</releaseinfo>
</appendixinfo>
<title>Embedding Vorbis into an Ogg stream</title>

<section>
<title>Overview</title>

<para>
This document describes using Ogg logical and physical transport
streams to encapsulate Vorbis compressed audio packet data into file
form.</para>

<para>
The <xref linkend="vorbis-spec-intro"/> provides an overview of the construction
of Vorbis audio packets.</para>

<para>
The <ulink url="oggstream.html">Ogg
bitstream overview</ulink> and <ulink url="framing.html">Ogg logical
bitstream and framing spec</ulink> provide detailed descriptions of Ogg
transport streams. This specification document assumes a working
knowledge of the concepts covered in these named background
documents. Please read them first.</para>

<section><title>Restrictions</title>

<para>
The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis
streams use Ogg transport streams in degenerate, unmultiplexed
form only. That is:

<itemizedlist>
<listitem><simpara>
A meta-headerless Ogg file encapsulates the Vorbis I packets.
</simpara></listitem>
<listitem><simpara>
The Ogg stream may be chained, i.e. contain multiple, contiguous logical streams (links).
</simpara></listitem>
<listitem><simpara>
The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link).
</simpara></listitem>
</itemizedlist>
</para>
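
<para>
The chained-but-unmultiplexed restriction can be illustrated with a
minimal sketch. It assumes the pages of a physical stream have already
been summarized as (serial number, 'beginning of stream', 'end of
stream') tuples; the <literal>PageInfo</literal> type and the
<literal>is_unmultiplexed</literal> function are hypothetical names
used only for illustration and are not part of any Ogg library.
</para>

<programlisting><![CDATA[
```python
from typing import Iterable, NamedTuple


class PageInfo(NamedTuple):
    """Hypothetical per-page summary; field names are illustrative."""
    serial: int  # logical stream serial number from the page header
    bos: bool    # 'beginning of stream' flag
    eos: bool    # 'end of stream' flag


def is_unmultiplexed(pages: Iterable[PageInfo]) -> bool:
    """Return True if at most one logical stream is open at any time.

    A stream that is cut off before its final 'end of stream' page is
    still accepted; this sketch only checks for interleaving.
    """
    current = None  # serial number of the link currently open, if any
    for page in pages:
        if current is None:
            # Between links: the next page must open a new logical stream.
            if not page.bos:
                return False
            current = page.serial
        elif page.bos or page.serial != current:
            # A second stream started or interleaved before the link ended.
            return False
        if page.eos:
            current = None
    return True


# Two links one after the other (chained): allowed.
chained = [PageInfo(1, True, False), PageInfo(1, False, True),
           PageInfo(2, True, False), PageInfo(2, False, True)]

# Two streams open at once (multiplexed): not a Vorbis I audio file.
muxed = [PageInfo(1, True, False), PageInfo(2, True, False),
         PageInfo(1, False, True), PageInfo(2, False, True)]
```
]]></programlisting>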

<para>
This does not mean that Vorbis cannot currently be multiplexed
with other media types into a multi-stream Ogg file. At the
time this document was written, Ogg was becoming a popular container
for low-bitrate movies consisting of DivX video and Vorbis audio.
However, a 'Vorbis I audio file' is taken to imply Vorbis audio
existing alone within a degenerate Ogg stream. A compliant 'Vorbis
audio player' is not required to implement Ogg support beyond the
specific support of Vorbis within a degenerate Ogg stream (naturally,
application authors are encouraged to support full multiplexed Ogg
handling).
</para>

</section>

<section><title>MIME type</title>

<para>
The correct MIME type of any Ogg file is <literal>application/ogg</literal>.
However, if a file is a Vorbis I audio file (which implies a
degenerate Ogg stream including only unmultiplexed Vorbis audio), the
MIME type <literal>audio/x-vorbis</literal> is also allowed.</para>
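
<para>
As a small illustration of this rule, the sketch below lists the MIME
types permitted for a file. The function name
<literal>ogg_mime_types</literal> is hypothetical, and deciding whether
a file really is a Vorbis I audio file is assumed to happen elsewhere.
</para>

<programlisting><![CDATA[
```python
from typing import List


def ogg_mime_types(is_vorbis_audio_file: bool) -> List[str]:
    """List the MIME types this section permits for an Ogg file."""
    types = ["application/ogg"]  # correct for any Ogg file
    if is_vorbis_audio_file:
        # Also allowed for a degenerate, Vorbis-only Ogg stream.
        types.append("audio/x-vorbis")
    return types
```
]]></programlisting>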

</section>

</section>

<section>
<title>Encapsulation</title>

<para>
Ogg encapsulation of a Vorbis packet stream is straightforward.</para>

<itemizedlist>

<listitem><simpara>
The first Vorbis packet (the identification header), which
uniquely identifies a stream as Vorbis audio, is placed alone in the
first page of the logical Ogg stream. This results in a first Ogg
page of exactly 58 bytes at the very beginning of the logical stream.
</simpara></listitem>

<listitem><simpara>
This first page is marked 'beginning of stream' in the page flags.
</simpara></listitem>

<listitem><simpara>
The second and third Vorbis packets (comment and setup
headers) may span one or more pages beginning on the second page of
the logical stream. However many pages they span, the third header
packet finishes the page on which it ends. The next (first audio) packet
must begin on a fresh page.
</simpara></listitem>

<listitem><simpara>
The granule position of these first pages containing only headers is zero.
</simpara></listitem>

<listitem><simpara>
The first audio packet of the logical stream begins a fresh Ogg page.
</simpara></listitem>

<listitem><simpara>
Packets are placed into Ogg pages in order until the end of stream.
</simpara></listitem>

<listitem><simpara>
The last page is marked 'end of stream' in the page flags.
</simpara></listitem>

<listitem><simpara>
Vorbis packets may span page boundaries.
</simpara></listitem>

<listitem><simpara>
The granule position of pages containing Vorbis audio is in units
of PCM audio samples (per channel; a stereo stream's granule position
does not increment at twice the speed of a mono stream).
</simpara></listitem>

<listitem><simpara>
The granule position of a page represents the end PCM sample
position of the last packet <emphasis>completed</emphasis> on that page.
A page that is entirely spanned by a single packet (that completes on a
subsequent page) has no granule position, and the granule position is
set to '-1'.
</simpara></listitem>

<listitem>
<simpara>
The granule (PCM) position of the first page need not indicate
that the stream started at position zero. Although the granule
position belongs to the last completed packet on the page and a
valid granule position must be positive, by
inference it may indicate that the PCM position of the beginning
of audio is positive or negative.
</simpara>

<itemizedlist>
<listitem><simpara>
A positive starting value simply indicates that this stream begins at
some positive time offset, potentially within a larger
program. This is a common case when connecting to the middle
of a broadcast stream.
</simpara></listitem>
<listitem><simpara>
A negative value indicates that
output samples preceding time zero should be discarded during
decoding; this technique is used to allow sample-granularity
editing of the stream start time of already-encoded Vorbis
streams. The number of samples to be discarded must not exceed
the overlap-add span of the first two audio packets.
</simpara></listitem>
</itemizedlist>

<simpara>
In both of these cases in which the initial audio PCM starting
offset is nonzero, the second finished audio packet must flush the
page on which it appears and the third packet must begin a fresh page.
This allows the decoder to always be able to perform PCM position
adjustments before needing to return any PCM data from synthesis,
resulting in correct positioning information without any additional
seeking logic.
</simpara>

<note><simpara>
Failure to do so should, at worst, cause a
decoder implementation to return incorrect positioning information
for seeking operations at the very beginning of the stream.
</simpara></note>
</listitem>

<listitem><simpara>
A granule position on the final page in a stream that indicates
less audio data than the final packet would normally return is used to
end the stream on other than even frame boundaries. The difference
between the actual available data returned and the declared amount
indicates how many trailing samples to discard from the decoding
process.
</simpara></listitem>
</itemizedlist>
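
<para>
The 58-byte size of the first page and the 'beginning of stream' and
'end of stream' page flags can be checked with a short sketch. The
byte counts and flag bits used here come from the Ogg framing layout
(a 27-byte fixed page header followed by one lacing value per segment)
and the 30-byte Vorbis I identification header, not from this
appendix. All function and constant names are illustrative
assumptions, and the page checksum is not verified.
</para>

<programlisting><![CDATA[
```python
import struct

# Sizes taken from the Ogg framing and Vorbis I header layouts.
OGG_FIXED_HEADER_BYTES = 27  # capture pattern through segment count
VORBIS_IDENT_BYTES = 30      # identification header packet

# Header type flag bits of an Ogg page.
FLAG_CONTINUED = 0x01
FLAG_BOS = 0x02
FLAG_EOS = 0x04


def first_page_size() -> int:
    """27-byte header + 1 lacing value + 30-byte packet."""
    return OGG_FIXED_HEADER_BYTES + 1 + VORBIS_IDENT_BYTES


def parse_page_header(data: bytes) -> dict:
    """Decode the fixed header fields of one Ogg page (CRC not verified)."""
    capture, version, flags, granule, serial, seq, crc, nsegs = struct.unpack_from(
        "<4sBBqIIIB", data, 0)
    if capture != b"OggS":
        raise ValueError("missing OggS capture pattern")
    # The segment table holds one lacing value per segment.
    lacing = data[OGG_FIXED_HEADER_BYTES:OGG_FIXED_HEADER_BYTES + nsegs]
    return {
        "bos": bool(flags & FLAG_BOS),
        "eos": bool(flags & FLAG_EOS),
        "continued": bool(flags & FLAG_CONTINUED),
        "granule": granule,  # signed, so -1 is representable
        "serial": serial,
        "page_size": OGG_FIXED_HEADER_BYTES + nsegs + sum(lacing),
    }


# Hypothetical first page: BOS flag set, granule 0, one 30-byte segment.
# The packet body is zero-filled placeholder data, not a real header.
demo_page = struct.pack("<4sBBqIIIB", b"OggS", 0, FLAG_BOS, 0, 1234, 0, 0, 1)
demo_page += bytes([VORBIS_IDENT_BYTES]) + bytes(VORBIS_IDENT_BYTES)
demo = parse_page_header(demo_page)
```
]]></programlisting>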
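
<para>
The granule position rules above reduce to simple arithmetic, sketched
here under stated assumptions: the decoder is assumed to report how
many PCM samples each completed packet returned, and the helper names
are hypothetical. The sample counts in the example are invented for
illustration only.
</para>

<programlisting><![CDATA[
```python
from typing import List


def initial_pcm_offset(first_audio_page_granule: int,
                       samples_per_completed_packet: List[int]) -> int:
    """Infer the PCM position at which audio starts.

    The page's granule position is the end position of the last packet
    completed on it, so subtracting the samples those packets returned
    gives the position of the first returned sample.
    """
    return first_audio_page_granule - sum(samples_per_completed_packet)


def leading_discard(start_offset: int) -> int:
    """Samples preceding time zero that should be dropped."""
    return max(0, -start_offset)


def trailing_discard(final_page_granule: int, decoded_end_position: int) -> int:
    """Samples past the declared end of stream that should be dropped."""
    return max(0, decoded_end_position - final_page_granule)


# Hypothetical first audio page: two packets returning 0 and 1024 samples.
edited_start = initial_pcm_offset(1000, [0, 1024])     # negative offset
broadcast_start = initial_pcm_offset(45000, [0, 1024])  # positive offset
```
]]></programlisting>

<para>
In this sketch a start offset of -24 means 24 leading samples are
discarded, while a positive offset means the stream was joined partway
through a longer program and nothing is discarded.
</para>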
</section>

</appendix>

<!-- end appendix on Vorbis encapsulation in Ogg -->