Commit af68b857 authored by Ralph Giles

Update ISO Base Media Format draft to version 0.6.2.

parent 65fb456c
......@@ -7,12 +7,12 @@
</head>
<body bgcolor="0x333333" text="#60B0C0">
<b><u>Encapsulation of Opus in ISO Base Media File Format</u></b><br>
<font size="2">last updated: October 1, 2014</font><br>
<font size="2">last updated: December 13, 2014</font><br>
<br>
<div class="normal_link pre frame_box">
Encapsulation of Opus in ISO Base Media File Format
Version 0.5.3 (incomplete)
Version 0.6.2 (incomplete)
Table of Contents
......@@ -21,39 +21,22 @@ Table of Contents
<a href="#3">3</a> Terms and Definitions
<a href="#4">4</a> Design Rules of Encapsulation
<a href="#4.1">4.1</a> File Type Identification
<a href="#4.2">4.2</a> Basic Structure
<a href="#4.2.1">4.2.1</a> Initial Movie
<a href="#4.2.2">4.2.2</a> Movie Fragments
<a href="#4.3">4.3</a> Byte Order
<a href="#4.4">4.4</a> Definition of Opus sample
<a href="#4.4.1">4.4.1</a> Opus sample
<a href="#4.4.2">4.4.2</a> Duration of Opus sample
<a href="#4.4.3">4.4.3</a> Sub-sample
<a href="#4.5">4.5</a> Random Access
<a href="#4.5.1">4.5.1</a> Random Access Point
<a href="#4.5.2">4.5.2</a> Pre-roll
<a href="#4.6">4.6</a> Trimming of Actual Duration
<a href="#4.7">4.7</a> Channel Layout
<a href="#4.8">4.8</a> Additional Requirements, Restrictions, Recommendations and Definitions for Boxes
<a href="#4.8.1">4.8.1</a> File Type Box
<a href="#4.8.2">4.8.2</a> Segment Type Box
<a href="#4.8.3">4.8.3</a> Movie Header Box
<a href="#4.8.4">4.8.4</a> Track Header Box
<a href="#4.8.5">4.8.5</a> Edit Box
<a href="#4.8.6">4.8.6</a> Edit List Box
<a href="#4.8.7">4.8.7</a> Media Header Box
<a href="#4.8.8">4.8.8</a> Handler Reference Box
<a href="#4.8.9">4.8.9</a> Sound Media Header Box
<a href="#4.8.10">4.8.10</a> Sample Table Box
<a href="#4.8.11">4.8.11</a> OpusSampleEntry
<a href="#4.8.12">4.8.12</a> Opus Specific Box
<a href="#4.8.13">4.8.13</a> Sample Group Description Box
<a href="#4.8.14">4.8.14</a> Sample to Group Box
<a href="#4.8.15">4.8.15</a> Track Extends Box
<a href="#4.8.16">4.8.16</a> Track Fragment Box
<a href="#4.8.17">4.8.17</a> Track Fragment Header Box
<a href="#4.8.18">4.8.18</a> Track Fragment Run Box
<a href="#4.9">4.9</a> Example of Encapsulation
<a href="#4.2">4.2</a> Overview of Track Structure
<a href="#4.3">4.3</a> Definitions of Opus sample
<a href="#4.3.1">4.3.1</a> Sample entry format
<a href="#4.3.2">4.3.2</a> Opus Specific Box
<a href="#4.3.3">4.3.3</a> Sample format
<a href="#4.3.4">4.3.4</a> Duration of Opus sample
<a href="#4.3.5">4.3.5</a> Sub-sample
<a href="#4.3.6">4.3.6</a> Random Access
<a href="#4.3.6.1">4.3.6.1</a> Random Access Point
<a href="#4.3.6.2">4.3.6.2</a> Pre-roll
<a href="#4.4">4.4</a> Trimming of Actual Duration
<a href="#4.5">4.5</a> Channel Layout (informative)
<a href="#4.6">4.6</a> Basic Structure (informative)
<a href="#4.6.1">4.6.1</a> Initial Movie
<a href="#4.6.2">4.6.2</a> Movie Fragments
<a href="#4.7">4.7</a> Example of Encapsulation (informative)
<a href="#5">5</a> Author's Address
<a name="1"></a>
......@@ -73,7 +56,7 @@ Table of Contents
[3] RFC 6716
Definition of the Opus Audio Codec
[4] draft-ietf-codec-oggopus-04
[4] draft-ietf-codec-oggopus-06
Ogg Encapsulation for the Opus Audio Codec
<a name="3"></a>
......@@ -109,124 +92,124 @@ Table of Contents
<a name="4"></a>
4 Design Rules of Encapsulation
4.1 File Type Identification<a name="4.1"></a>
This specification does not define any brand to declare files are conformant to this specification.
TODO: Should we define such brands, e.g. 'Opus'? If we define the brand(s), we can utilize files conformant to
This specification does not define any brand to declare that files conform to this specification. However,
files conformant to this specification shall contain, in the compatible brands list of the File Type Box, at
least one brand that supports the requirements described in this clause without contradiction.
As an example, the minimal support of the encapsulation of Opus bitstreams in the ISO Base Media File Format
requires the 'iso2' brand in the compatible brands list, since support of roll groups is required.
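The brand requirement above can be checked mechanically. The following is a minimal Python sketch, not part of the specification: it assumes a complete File Type Box with a 32-bit size field (no largesize), and the function names and the default `required` tuple are illustrative. Only 'iso2' is listed because it is the only brand the text names; a real reader would likely accept other brands that also require roll group support.

```python
import struct
from typing import List, Sequence


def compatible_brands(ftyp_box: bytes) -> List[str]:
    """Return the compatible brands listed in a complete 'ftyp' box."""
    size, box_type = struct.unpack(">I4s", ftyp_box[:8])
    if box_type != b"ftyp" or size != len(ftyp_box):
        raise ValueError("not a complete File Type Box")
    # major_brand (4 bytes) and minor_version (4 bytes) precede the list.
    brands = ftyp_box[16:]
    return [brands[i:i + 4].decode("ascii") for i in range(0, len(brands), 4)]


def declares_roll_group_support(ftyp_box: bytes,
                                required: Sequence[str] = ("iso2",)) -> bool:
    """True if any brand in `required` (an assumed list) is declared."""
    return any(brand in required for brand in compatible_brands(ftyp_box))
```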
TODO: Should we define specific brands, e.g. 'Opus'? If we define the brand(s), we can utilize files conformant to
this specification for the storage of Opus coded bitstreams without other derived file formats.
Encapsulating Opus bitstreams with only the brands of the ISO Base Media File Format is not preferable,
though files conformant to this specification are compatible with certain versions of the ISO
Base Media File Format. See ISO/IEC 14496-12 [1] E.1 Introduction.
If this file format is intended as an alternative to Ogg Opus, defining such a brand is recommended.
<a name="4.2"></a>
4.2 Basic Structure
4.2.1 Initial Movie<a name="4.2.1"></a>
This subclause specifies a basic structure of the Movie Box as follows:
4.2 Overview of Track Structure
This clause summarizes the requirements for encapsulating Opus coded bitstreams as media data in audio tracks
in file formats compliant with the ISO Base Media File Format. The details are described in the subsequent
clauses.
+ The handler_type field in the Handler Reference Box shall be set to 'soun'.
+ The Media Information Box shall contain the Sound Media Header Box.
+ The codingname of the sample entry is 'Opus'.
See 4.3.1 Sample entry format to get the details about the sample entry.
+ The 'dOps' box is added to the sample entry to convey initializing information for the decoder.
See 4.3.2 Opus Specific Box to get the details.
+ An Opus sample is exactly one Opus packet for each of different Opus bitstreams.
See 4.3.3 Sample format to get the details.
+ Every Opus sample is a sync sample but requires pre-roll for every random access to get correct output.
See 4.3.6 Random Access to get the details.
<a name="4.3"></a>
4.3 Definitions of Opus sample
4.3.1 Sample entry format<a name="4.3.1"></a>
For any track containing Opus bitstreams, at least one sample entry describing corresponding Opus bitstream
shall be present inside the Sample Table Box. This version of the specification defines only one sample
entry format named OpusSampleEntry whose codingname is 'Opus'. This sample entry includes exactly one Opus
Specific Box defined in 4.3.2 as a mandatory box and indicates that Opus samples described by this sample
entry are stored by the sample format described in 4.3.3.
+----+----+----+----+----+----+----+----+------------------------------+
|moov| | | | | | | | Movie Box |
+----+----+----+----+----+----+----+----+------------------------------+
| |mvhd| | | | | | | Movie Header Box |
+----+----+----+----+----+----+----+----+------------------------------+
| |trak| | | | | | | Track Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |tkhd| | | | | | Track Header Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |edts| | | | | | Edit Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | |elst| | | | | Edit List Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |mdia| | | | | | Media Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | |mdhd| | | | | Media Header Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | |hdlr| | | | | Handler Reference Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | |minf| | | | | Media Information Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | |smhd| | | | Sound Media Information Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | |dinf| | | | Data Information Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |dref| | | Data Reference Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | | |url | | DataEntryUrlBox |
+----+----+----+----+----+----+ or +----+------------------------------+
| | | | | | |urn | | DataEntryUrnBox |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | |stbl| | | | Sample Table |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |stsd| | | Sample Description Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | | |Opus| | OpusSampleEntry |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | | | |dOps| Opus Specific Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |stts| | | Decoding Time to Sample Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |stsc| | | Sample To Chunk Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |stsz| | | Sample Size Box |
+----+----+----+----+----+ or +----+----+------------------------------+
| | | | | |stz2| | | Compact Sample Size Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |stco| | | Chunk Offset Box |
+----+----+----+----+----+ or +----+----+------------------------------+
| | | | | |co64| | | Chunk Large Offset Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |sgpd| | | Sample Group Description Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |sbgp| | | Sample to Group Box |
+----+----+----+----+----+----+----+----+------------------------------+
| |mvex|* | | | | | | Movie Extends Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |trex|* | | | | | Track Extends Box |
+----+----+----+----+----+----+----+----+------------------------------+
The syntax and semantics of the OpusSampleEntry are shown as follows.
Figure 1 - Basic structure of Movie Box
class OpusSampleEntry() extends AudioSampleEntry ('Opus'){
OpusSpecificBox();
}
It is strongly recommended that the order of boxes should follow the above structure.
Boxes marked with an asterisk (*) may be present.
For some boxes listed above, the additional requirements, restrictions, recommendations and definitions
are specified in 4.8 Additional Requirements, Restrictions, Recommendations and Definitions for Boxes in
this specification. For the others, the definition is as is defined in ISO/IEC 14496-12 [1].
+ channelcount:
The channelcount field shall be set to the sum of the total number of Opus bitstreams and the number
of Opus bitstreams producing two channels. This value is identical to (M+N), where M is the value of
the *Coupled Stream Count* field and N is the value of the *Stream Count* field in the *Channel Mapping
Table* in the identification header defined in Ogg Opus [4].
+ samplesize:
The samplesize field shall be set to 16.
+ samplerate:
The samplerate field shall be set to 48000&lt;&lt;16.
+ OpusSpecificBox
This box contains initializing information for the decoder as defined in 4.3.2.
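The fixed AudioSampleEntry values listed above can be derived from the channel mapping counts. This is a small illustrative Python sketch; the function name and the dictionary layout are assumptions, not defined by the specification.

```python
def audio_sample_entry_fields(stream_count: int, coupled_count: int) -> dict:
    """Derive channelcount, samplesize and samplerate for an OpusSampleEntry.

    stream_count is N (Stream Count) and coupled_count is M (Coupled
    Stream Count); a coupled stream produces two channels.
    """
    if not 0 <= coupled_count <= stream_count:
        raise ValueError("coupled_count must be between 0 and stream_count")
    return {
        "channelcount": stream_count + coupled_count,  # M + N
        "samplesize": 16,
        "samplerate": 48000 << 16,  # 16.16 fixed-point representation of 48000
    }
```

For the 5.1 layout used later in this document (four streams, two of them coupled), this gives a channelcount of 6.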
4.2.2 Movie Fragments<a name="4.2.2"></a>
This subclause specifies a basic structure of the Movie Fragment Box as follows:
4.3.2 Opus Specific Box<a name="4.3.2"></a>
Exactly one Opus Specific Box shall be present in each OpusSampleEntry.
The Opus Specific Box contains the Version field, and this specification defines version 0 of this box.
If incompatible changes are made to the fields after the Version field within the OpusSpecificBox in
future versions of this specification, another version will be defined.
This box refers to Ogg Opus [4] in many places, but all data are stored in big-endian format.
+----+----+----+----+----+----+----+----+------------------------------+
|moof| | | | | | | | Movie Fragment Box |
+----+----+----+----+----+----+----+----+------------------------------+
| |mfhd| | | | | | | Movie Fragment Header Box |
+----+----+----+----+----+----+----+----+------------------------------+
| |traf| | | | | | | Track Fragment Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |tfhd| | | | | | Track Fragment Header Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |trun| | | | | | Track Fragment Run Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |sgpd|* | | | | | Sample Group Description Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |sbgp|* | | | | | Sample to Group Box |
+----+----+----+----+----+----+----+----+------------------------------+
The syntax and semantics of the Opus Specific Box are shown as follows.
Figure 2 - Basic structure of Movie Fragment Box
class ChannelMappingTable (unsigned int(8) OutputChannelCount){
unsigned int(8) StreamCount;
unsigned int(8) CoupledCount;
unsigned int(8 * OutputChannelCount) ChannelMapping;
}
It is strongly recommended that the Movie Fragment Header Box and the Track Fragment Header Box be
placed first in their container.
Boxes marked with an asterisk (*) may be present.
For some boxes listed above, the additional requirements, restrictions, recommendations and definitions
are specified in 4.8 Additional Requirements, Restrictions, Recommendations and Definitions for Boxes in
this specification. For the others, the definition is as is defined in ISO/IEC 14496-12 [1].
<a name="4.3"></a>
4.3 Byte Order
The fields in the boxes are stored as big-endian format.
All Opus samples are processed byte-by-byte. Therefore, the endianness has nothing to do with any Opus sample.
<a name="4.4"></a>
4.4 Definition of Opus sample
4.4.1 Opus sample<a name="4.4.1"></a>
aligned(8) class OpusSpecificBox extends Box('dOps'){
unsigned int(8) Version;
unsigned int(8) OutputChannelCount;
unsigned int(16) PreSkip;
unsigned int(32) InputSampleRate;
signed int(16) OutputGain;
unsigned int(8) ChannelMappingFamily;
if (ChannelMappingFamily != 0) {
ChannelMappingTable(OutputChannelCount);
}
}
+ Version:
The Version field shall be set to 0.
In future versions of this specification, this field may be set to other values. A reader that does not
support such a value shall not read the fields after this field within the OpusSpecificBox.
+ OutputChannelCount:
The OutputChannelCount field shall be set to the same value as the *Output Channel Count* field in the
identification header defined in Ogg Opus [4].
+ PreSkip:
The PreSkip field indicates the number of the priming samples, that is, the number of samples at 48000 Hz
to discard from the decoder output when starting playback. The value of the PreSkip field may be zero
when the Opus samples removed from the beginning contain at least as many PCM samples as the priming
samples. The PreSkip field is informative only: it is not used to discard the priming samples during
playback of the whole presentation; that task falls to the Edit List Box.
+ InputSampleRate:
The InputSampleRate field shall be set to the same value as the *Input Sample Rate* field in the
identification header defined in Ogg Opus [4].
+ OutputGain:
The OutputGain field shall be set to the same value as the *Output Gain* field in the identification
header defined in Ogg Opus [4]. Note that the value is stored as 8.8 fixed-point.
+ ChannelMappingFamily:
The ChannelMappingFamily field shall be set to the same value as the *Channel Mapping Family* field in
the identification header defined in Ogg Opus [4].
+ StreamCount:
The StreamCount field shall be set to the same value as the *Stream Count* field in the identification
header defined in Ogg Opus [4].
+ CoupledCount:
The CoupledCount field shall be set to the same value as the *Coupled Count* field in the identification
header defined in Ogg Opus [4].
+ ChannelMapping:
The ChannelMapping field shall be set to the same octet string as the *Channel Mapping* field in the
identification header defined in Ogg Opus [4].
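The version 0 layout described above can be read with a few big-endian unpack calls. The following Python sketch is illustrative only: the type and function names are assumptions, the input is taken to be the box payload after the 8-byte box header, and the linear gain conversion follows the Q7.8 decibel convention of the Ogg Opus identification header [4].

```python
import struct
from typing import NamedTuple, Optional, Tuple


class OpusSpecific(NamedTuple):
    version: int
    output_channel_count: int
    pre_skip: int
    input_sample_rate: int
    output_gain: int
    channel_mapping_family: int
    stream_count: Optional[int]
    coupled_count: Optional[int]
    channel_mapping: Optional[Tuple[int, ...]]


def parse_dops_payload(payload: bytes) -> OpusSpecific:
    """Parse the body of a 'dOps' box (bytes after the box header)."""
    if not payload:
        raise ValueError("empty dOps payload")
    version = payload[0]
    if version != 0:
        # Readers shall not interpret the following fields of unknown versions.
        raise ValueError("unsupported dOps version %d" % version)
    # All multi-byte fields are big-endian, unlike the Ogg Opus header.
    channels, pre_skip, rate, gain, family = struct.unpack_from(">BHIhB", payload, 1)
    streams = coupled = mapping = None
    if family != 0:
        streams, coupled = struct.unpack_from(">BB", payload, 11)
        mapping = tuple(payload[13:13 + channels])
        if len(mapping) != channels:
            raise ValueError("truncated ChannelMapping")
    return OpusSpecific(version, channels, pre_skip, rate, gain, family,
                        streams, coupled, mapping)


def output_gain_scale(output_gain: int) -> float:
    """Convert the 8.8 fixed-point dB value to a linear amplitude factor."""
    return 10.0 ** (output_gain / (20.0 * 256.0))
```

Checking the version before touching any later byte mirrors the rule that a reader without support for a version shall not read the fields after it.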
4.3.3 Sample format<a name="4.3.3"></a>
An Opus sample is exactly one Opus packet for each of the different Opus bitstreams. To support more than
two channels, an Opus sample can contain frames from multiple Opus bitstreams, but all Opus packets in a
single Opus sample shall share the same total of frame sizes. The way to pack an Opus packet from
each Opus bitstream into a single Opus sample follows Appendix B of RFC 6716 [3].
Endianness does not apply to Opus samples, since every Opus packet is processed byte-by-byte.
In this specification, 'sample' means 'Opus sample' except for 'padded samples', 'priming samples', 'valid
sample' and 'sample-accurate', i.e. 'sample' is used in the sense defined in ISO/IEC 14496-12 [1].
......@@ -237,7 +220,7 @@ Table of Contents
Figure 3 - Example structure of an Opus sample containing two Opus bitstreams
4.4.2 Duration of Opus sample<a name="4.4.2"></a>
4.3.4 Duration of Opus sample<a name="4.3.4"></a>
The duration of an Opus sample is given by multiplying the total of frame sizes for a single Opus bitstream
expressed in seconds by the value of the timescale field in the Media Header Box.
Let's say an Opus sample consists of two Opus bitstreams, where the frame size of one bitstream is 40 milli-
......@@ -250,45 +233,61 @@ Table of Contents
the last Opus sample of an Opus bitstream is given by multiplying the number of the valid samples by the
value produced by dividing the value of the timescale field in the Media Header Box by 48000.
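The two duration rules in this subclause reduce to simple rational arithmetic. The Python sketch below is illustrative; the function names are assumptions, and exact fractions are used so that a non-integer result (an unsuitable timescale) is visible rather than silently rounded.

```python
from fractions import Fraction

OPUS_RATE = 48000  # Opus decoder output rate in Hz


def opus_sample_duration(frame_seconds: Fraction, timescale: int) -> Fraction:
    """Duration of an Opus sample: total frame size in seconds times timescale."""
    return frame_seconds * timescale


def last_sample_duration(valid_pcm_samples: int, timescale: int) -> Fraction:
    """Duration of the last Opus sample: valid samples times (timescale / 48000)."""
    return valid_pcm_samples * Fraction(timescale, OPUS_RATE)
```

For instance, a total frame size of 40 milliseconds with a timescale of 48000 yields a duration of 1920, and 1000 valid samples in the last Opus sample yield a duration of 1000 at the same timescale.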
4.4.3 Sub-sample<a name="4.4.3"></a>
4.3.5 Sub-sample<a name="4.3.5"></a>
The structure of the last Opus packet in an Opus sample differs from that of the others in the same Opus sample,
and the others, taken alone, are not valid as an Opus sample because of self-delimiting framing. To avoid
complexity, sub-samples are not defined for Opus samples in this specification.
<a name="4.5"></a>
4.5 Random Access
4.5.1 Random Access Point<a name="4.5.1"></a>
All Opus samples can be independently decoded i.e. every Opus sample is a sync sample. Therefore, the Sync
Sample Box shall not be present as long as there are no samples other than Opus samples in the same track.
4.5.2 Pre-roll<a name="4.5.2"></a>
Opus requires at least 80 millisecond pre-roll after each random access.
Pre-roll is indicated by the roll_distance field in AudioRollRecoveryEntry. AudioPreRollEntry shall not be
used since every Opus sample is a sync sample in Opus bitstream. Note that roll_distance is expressed in
sample units in a term of ISO Base Media File Format, and always takes negative values.
For the requirement of AudioRollRecoveryEntry, the compatible_brands field in the File Type Box and/or
the Segment Type Box shall contain at least one brand which requires support for roll groups. See also
4.8.1 File Type Box and 4.8.2 Segment Type Box in this specification.
<a name="4.6"></a>
4.6 Trimming of Actual Duration
4.3.6 Random Access<a name="4.3.6"></a>
This subclause describes the random access properties of Opus samples.
4.3.6.1 Random Access Point<a name="4.3.6.1"></a>
All Opus samples can be independently decoded, i.e. every Opus sample is a sync sample. Therefore, the
Sync Sample Box shall not be present as long as there are no samples other than Opus samples in the same
track. In addition, the sample_is_non_sync_sample field for Opus samples shall be set to 0.
4.3.6.2 Pre-roll<a name="4.3.6.2"></a>
An Opus bitstream requires at least 80 milliseconds of pre-roll after each random access to get correct output.
Pre-roll is indicated by the roll_distance field in AudioRollRecoveryEntry. AudioPreRollEntry shall not
be used since every Opus sample is a sync sample in an Opus bitstream. Note that roll_distance is expressed
in units of samples, in the sense of the ISO Base Media File Format, and always takes negative values.
For any track containing Opus bitstreams, at least one Sample Group Description Box and at least one
Sample to Group Box shall be present within the Sample Table Box, each with the grouping_type field
set to 'roll'. If any Opus sample is contained in a track fragment, the Sample to Group Box with the
grouping_type field set to 'roll' shall be present for that track fragment.
For the requirement of AudioRollRecoveryEntry, the compatible_brands field in the File Type Box shall
contain at least one brand which requires support for roll groups.
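As an illustration of the pre-roll rule, roll_distance can be derived from the duration of an Opus sample. This Python sketch assumes that every Opus sample in the roll group has the same duration, expressed in PCM samples at 48000 Hz; the constant and function names are hypothetical.

```python
PREROLL_PCM = 3840  # 80 milliseconds at 48000 Hz


def roll_distance(pcm_per_opus_sample: int) -> int:
    """Return a negative roll_distance covering at least 80 ms of pre-roll.

    pcm_per_opus_sample is the duration of one Opus sample in PCM samples
    at 48000 Hz, assumed constant across the group.
    """
    if pcm_per_opus_sample <= 0:
        raise ValueError("sample duration must be positive")
    needed = -(-PREROLL_PCM // pcm_per_opus_sample)  # ceiling division
    return -needed
```

With 20 millisecond Opus samples (960 PCM samples each), four preceding samples are needed, so roll_distance is -4.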
<a name="4.4"></a>
4.4 Trimming of Actual Duration
Due to the priming samples (or the padding at the beginning) derived from the pre-roll for the startup and the
padded samples at the end, we need to trim the media to get the actual duration. An edit in the Edit List Box can
achieve this demand.
achieve this demand, and the Edit Box and the Edit List Box shall be present.
For sample-accurate trimming, a proper timescale should be set in the timescale field of the Movie Header Box
and the Media Header Box inside Track Box(es) for Opus bitstream.
and the Media Header Box inside Track Box(es) for Opus bitstream. The timescale field in the Media Header Box is
typically set to 48000. If the timescales in all of the Media Header Boxes are set to the same value, it is
recommended that the timescale field in the Movie Header Box be set to that value as well, in order to avoid
rounding problems when specifying the duration of an edit.
For example, to indicate the actual duration of an Opus bitstream in a track with the timescale fields of both
the Movie Header Box and the Media Header Box set to 48000, we would use the following edit:
segment_duration = the number of the valid samples
media_time = the number of the priming samples
media_rate = 1 &lt;&lt; 16
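The edit above can be computed from three counts. This Python sketch assumes, as in the example, that the timescale fields of both the Movie Header Box and the Media Header Box are 48000; the function name and the dictionary keys are illustrative.

```python
def trimming_edit(total_pcm: int, priming: int, padding: int) -> dict:
    """Build the edit that trims priming and padded samples.

    total_pcm is the number of decoded PCM samples in the track, priming
    the number of priming samples and padding the number of padded samples
    at the end, all at 48000 Hz.
    """
    valid = total_pcm - priming - padding
    if priming < 0 or padding < 0 or valid < 0:
        raise ValueError("inconsistent sample counts")
    return {
        "segment_duration": valid,  # number of the valid samples
        "media_time": priming,      # number of the priming samples
        "media_rate": 1 << 16,      # 16.16 fixed-point 1.0
    }
```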
The Edit List Box applies to the whole movie, including all movie fragments. Therefore, it is impossible to tell
the actual duration when producing movie fragments on the fly, such as in live streaming. In such cases,
the duration of the last Opus sample may be helpful.
the duration of the last Opus sample may be helpful if the segment_duration field is set to 0, since the
value 0 represents an implicit duration equal to the sum of the durations of all samples.
TODO: Should we define a new box which indicates the last Opus samples?
Since this specification allows multiple sample descriptions, i.e. allows concatenation of multiple Opus
bitstreams in a track, each Opus bitstream may contain some padded samples.
Without such a box, we cannot know in container level whether an Opus sample is the last Opus sample in
an Opus bitstream or not. Is this preferable?
See also 4.8.6 Edit List Box in this specification.
<a name="4.7"></a>
4.7 Channel Layout
<a name="4.5"></a>
4.5 Channel Layout (informative)
By the application of alternate_group in the Track Header Box, whole audio channels in all active tracks from
non-alternate group and/or different alternate group from each other are composited into the presentation. If
an Opus sample consists of multiple Opus bitstreams, it can be split into individual Opus bitstreams and
......@@ -300,7 +299,9 @@ Table of Contents
OutputChannelCount = 6;
StreamCount = 4;
CoupledCount = 2;
ChannelMapping = {0, 1, 2, 3, 4, 5}; // front left, front center, front right, rear left, rear right, LFE
ChannelMapping = {0, 4, 1, 2, 3, 5}; // front left, front center, front right, rear left, rear right, LFE
Here, to couple the front left and front right channels into the first stream, and the rear left and rear right
channels into the second stream, reordering is needed since coupled streams must precede any non-coupled stream.
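The reordered mapping can be checked by resolving each output channel to a decoded stream. The Python sketch below applies the index rule of the Ogg Opus channel mapping table [4] (an index below twice the coupled count selects one side of a coupled stream, a larger index selects a mono stream, and 255 means silence); the function name and the tuple representation are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def resolve_mapping(stream_count: int, coupled_count: int,
                    channel_mapping: Sequence[int]
                    ) -> List[Optional[Tuple[int, str]]]:
    """Map each output channel to (stream index, 'L' | 'R' | 'M') or None."""
    resolved: List[Optional[Tuple[int, str]]] = []
    for index in channel_mapping:
        if index == 255:
            resolved.append(None)  # pure silence
        elif index < 2 * coupled_count:
            # Coupled (stereo) streams come first: two indices per stream.
            resolved.append((index // 2, "L" if index % 2 == 0 else "R"))
        elif index < stream_count + coupled_count:
            resolved.append((index - coupled_count, "M"))  # mono stream
        else:
            raise ValueError("mapping index %d out of range" % index)
    return resolved
```

Applied to the mapping {0, 4, 1, 2, 3, 5} with four streams, two of them coupled, front left and front right resolve to stream 0, rear left and rear right to stream 1, front center to stream 2 and LFE to stream 3.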
You extract the four Opus bitstreams from this track and you encapsulate two of the four into a track and the
others into another track. The former track is as follows.
OutputChannelCount = 6;
......@@ -327,213 +328,108 @@ Table of Contents
able to exclude certain channels from the already mapped channels to remove pure silent channels. The
channel mapping defined in the Opus Specific Box should be designed as processed before the extensions,
and the extensions should be placed after the Opus Specific Box.
<a name="4.8"></a>
4.8 Additional Requirements, Restrictions, Recommendations and Definitions for Boxes
4.8.1 File Type Box<a name="4.8.1"></a>
For any track containing Opus bitstreams, the following requirements are applied.
+ compatible_brands:
The compatible_brands fields shall contain at least one brand which requires support for the structural
boxes listed at 4.2 Basic Structure, and the additional requirements, restrictions, recommendations and
definitions specified in 4.8 Additional Requirements, Restrictions, Recommendations and Definitions for
Boxes in this specification.
As an example, the minimal support of the encapsulation of Opus bitstreams in ISO Base Media file format
requires the 'iso2' brand since support of roll groups is required.
4.8.2 Segment Type Box<a name="4.8.2"></a>
The same requirements are applied as specified at 4.8.1 File Type Box in this specification.
4.8.3 Movie Header Box<a name="4.8.3"></a>
If any track contains Opus bitstreams, the following recommendation is applied.
+ timescale:
The timescale field should be set to the same value as the timescale field in the Media Header Box
inside Track Box(es) for Opus bitstream if no tracks for bitstreams other than Opus bitstream are present.
4.8.4 Track Header Box<a name="4.8.4"></a>
For any track containing Opus bitstreams, the following requirements are applied.
+ layer:
The layer field shall be set to 0.
+ matrix:
The matrix field shall be set to { 0x00010000,0,0,0,0x00010000,0,0,0,0x40000000 }.
+ width:
The width field shall be set to 0.
+ height:
The height field shall be set to 0.
4.8.5 Edit Box<a name="4.8.5"></a>
For any track containing Opus bitstreams, exactly one Edit Box shall be present.
4.8.6 Edit List Box<a name="4.8.6"></a>
For any track containing Opus bitstreams, exactly one Edit List Box shall be present. In addition, for
non-empty edits, the following recommendations are applied.
+ segment_duration:
The segment_duration field can be used to indicate the actual duration of Opus bitstream.
When the value of the timescale field in the Movie Header Box is equal to 48000, the segment_duration
field shall be set to the number of the valid samples to indicate the actual duration.
When enabling movie fragments, the segment_duration field may be set to 0. The value 0 represents
implicit duration equal to the sum of the duration of all samples. This would be helpful for excluding
the padded samples from the presentation timeline when producing movie fragments on the fly.
+ media_time:
The media_time field can be used to remove the priming samples of Opus bitstreams.
When the value of the timescale field in the Media Header Box is equal to 48000, the media_time field
shall be set to the number of the priming samples to remove the priming samples.
+ media_rate:
If the segment_duration field is used to indicate the actual duration, the media_rate field shall be
set to 1.
4.8.7 Media Header Box<a name="4.8.7"></a>
For any track containing Opus bitstreams, the following recommendation is applied.
+ timescale:
The timescale field should be set to 48000 to access sample-accurately.
4.8.8 Handler Reference Box<a name="4.8.8"></a>
For any track containing Opus bitstreams, the following requirement is applied.
+ handler_type:
The handler_type field shall be set to 'soun'.
4.8.9 Sound Media Header Box<a name="4.8.9"></a>
For any track containing Opus bitstreams, the Sound Media Header Box shall be present.
4.8.10 Sample Table Box<a name="4.8.10"></a>
For any track containing Opus bitstreams, at least one Sample Group Description Box and at least one
Sample to Group Box shall be present, and the Sync Sample Box shall not be present as long as there are
no samples other than Opus samples in the same track.
4.8.11 OpusSampleEntry<a name="4.8.11"></a>
For any track containing Opus bitstreams, at least one OpusSampleEntry shall be present.
The syntax and semantics of the OpusSampleEntry is shown as follows.
class OpusSampleEntry() extends AudioSampleEntry ('Opus'){
OpusSpecificBox();
}
<a name="4.6"></a>
4.6 Basic Structure (informative)
4.6.1 Initial Movie<a name="4.6.1"></a>
This subclause shows a basic structure of the Movie Box as follows:
+ channelcount:
The channelcount field shall be set to the sum of the total number of Opus bitstreams and the number
of Opus bitstreams producing two channels. This value is identical to (M+N), where M is the value of
the *Coupled Stream Count* field and N is the value of the *Stream Count* field in the *Channel Mapping
Table* in the identification header defined in Ogg Opus [4].
+ samplesize:
The samplesize field shall be set to 16.
+ samplerate:
The samplerate field shall be set to 48000&lt;&lt;16.
+----+----+----+----+----+----+----+----+------------------------------+
|moov| | | | | | | | Movie Box |
+----+----+----+----+----+----+----+----+------------------------------+
| |mvhd| | | | | | | Movie Header Box |
+----+----+----+----+----+----+----+----+------------------------------+
| |trak| | | | | | | Track Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |tkhd| | | | | | Track Header Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |edts| | | | | | Edit Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | |elst| | | | | Edit List Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |mdia| | | | | | Media Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | |mdhd| | | | | Media Header Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | |hdlr| | | | | Handler Reference Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | |minf| | | | | Media Information Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | |smhd| | | | Sound Media Header Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | |dinf| | | | Data Information Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |dref| | | Data Reference Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | | |url | | DataEntryUrlBox |
+----+----+----+----+----+----+ or +----+------------------------------+
| | | | | | |urn | | DataEntryUrnBox |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | |stbl| | | | Sample Table |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |stsd| | | Sample Description Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | | |Opus| | OpusSampleEntry |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | | | |dOps| Opus Specific Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |stts| | | Decoding Time to Sample Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |stsc| | | Sample To Chunk Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |stsz| | | Sample Size Box |
+----+----+----+----+----+ or +----+----+------------------------------+
| | | | | |stz2| | | Compact Sample Size Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |stco| | | Chunk Offset Box |
+----+----+----+----+----+ or +----+----+------------------------------+
| | | | | |co64| | | Chunk Large Offset Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |sgpd| | | Sample Group Description Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | | | | |sbgp| | | Sample to Group Box |
+----+----+----+----+----+----+----+----+------------------------------+
| |mvex|* | | | | | | Movie Extends Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |trex|* | | | | | Track Extends Box |
+----+----+----+----+----+----+----+----+------------------------------+
4.8.12 Opus Specific Box<a name="4.8.12"></a>
Exactly one Opus Specific Box shall be present in each OpusSampleEntry.
The Opus Specific Box contains the version field and this specification defines version 0 of this box.
If a future version of this specification makes incompatible changes to the fields after the version field
within the OpusSpecificBox, another version will be defined.
Figure 1 - Basic structure of Movie Box
The syntax and semantics of the Opus Specific Box are shown as follows.
It is strongly recommended that the order of boxes should follow the above structure.
Boxes marked with an asterisk (*) may be present.
For most boxes listed above, the definition is as given in ISO/IEC 14496-12 [1]. The additional boxes,
and the additional requirements, restrictions and recommendations for the other boxes, are described in this
specification.
class ChannelMappingTable (unsigned int(8) OutputChannelCount){
unsigned int(8) StreamCount;
unsigned int(8) CoupledCount;
unsigned int(8 * OutputChannelCount) ChannelMapping;
}
4.6.2 Movie Fragments<a name="4.6.2"></a>
This subclause shows a basic structure of the Movie Fragment Box as follows:
aligned(8) class OpusSpecificBox extends FullBox('dOps', version, dflags){
unsigned int(8) OutputChannelCount;
if (dflags & 0x000001) {
unsigned int(16) PreSkip;
}
if (dflags & 0x000002) {
unsigned int(32) InputSampleRate;
}
if (dflags & 0x000004) {
signed int(16) OutputGain;
}
unsigned int(8) ChannelMappingFamily;
if (ChannelMappingFamily != 0) {
ChannelMappingTable(OutputChannelCount);
}
}
+----+----+----+----+----+----+----+----+------------------------------+
|moof| | | | | | | | Movie Fragment Box |
+----+----+----+----+----+----+----+----+------------------------------+
| |mfhd| | | | | | | Movie Fragment Header Box |
+----+----+----+----+----+----+----+----+------------------------------+
| |traf| | | | | | | Track Fragment Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |tfhd| | | | | | Track Fragment Header Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |trun| | | | | | Track Fragment Run Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |sgpd|* | | | | | Sample Group Description Box |
+----+----+----+----+----+----+----+----+------------------------------+
| | |sbgp|* | | | | | Sample to Group Box |
+----+----+----+----+----+----+----+----+------------------------------+
+ version:
The version field shall be set to 0.
Future versions of this specification may set this field to other values. A reader that does not support
such a value shall not read the fields following it within the OpusSpecificBox.
+ flags:
The following flags are defined in the dflags:
0x000001 pre-skip-present:
This flag indicates the presence of the PreSkip field.
0x000002 input-sample-rate-present:
This flag indicates the presence of the InputSampleRate field.
0x000004 output-gain-present:
This flag indicates the presence of the OutputGain field.
+ OutputChannelCount:
The OutputChannelCount field shall be set to the same value as the *Output Channel Count* field in the
identification header defined in Ogg Opus [4].
+ PreSkip:
The PreSkip field shall be set to the same value as the *Pre-skip* field in the identification header
defined in Ogg Opus [4]. Note that the value is stored in big-endian format.
The PreSkip field may be absent when Opus samples containing more PCM samples than the number of priming
samples have been removed.
The PreSkip field is informative only and is never used to remove the priming samples during playback;
that task falls on the Edit List Box.
+ InputSampleRate:
The InputSampleRate field shall be set to the same value as the *Input Sample Rate* field in the
identification header defined in Ogg Opus [4]. Note that the value is stored in big-endian format.
If the InputSampleRate field is absent, process as if it is set to 0, which indicates "unspecified".
+ OutputGain:
The OutputGain field shall be set to the same value as the *Output Gain* field in the identification
header defined in Ogg Opus [4]. Note that the value is stored in 8.8 fixed-point, big-endian format.
If the OutputGain field is absent, process as if it is set to 0.
+ ChannelMappingFamily:
The ChannelMappingFamily field shall be set to the same value as the *Channel Mapping Family* field in
the identification header defined in Ogg Opus [4].
+ StreamCount:
The StreamCount field shall be set to the same value as the *Stream Count* field in the identification
header defined in Ogg Opus [4].
+ CoupledCount:
The CoupledCount field shall be set to the same value as the *Coupled Count* field in the identification
header defined in Ogg Opus [4].
+ ChannelMapping:
The ChannelMapping field shall be set to the same octet string as the *Channel Mapping* field in the
identification header defined in Ogg Opus [4].
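As an informal illustration of the syntax and field semantics above, the following Python sketch reads the
payload of a 'dOps' box laid out as the FullBox-based class shown in this subclause. It is a minimal sketch,
not a reference implementation: the names parse_dops and OpusSpecificBox (the dataclass) and the flag
constants are hypothetical, and the input is assumed to be the bytes that follow the 8-byte box header.

```python
import struct
from dataclasses import dataclass
from typing import Optional

# Flag bits of dflags as listed in the semantics above (constant names are hypothetical).
PRE_SKIP_PRESENT = 0x000001
INPUT_SAMPLE_RATE_PRESENT = 0x000002
OUTPUT_GAIN_PRESENT = 0x000004


@dataclass
class OpusSpecificBox:
    version: int
    flags: int
    output_channel_count: int
    pre_skip: Optional[int]        # None when the field is absent
    input_sample_rate: int         # 0 means "unspecified"
    output_gain: int               # raw signed 8.8 fixed-point value
    channel_mapping_family: int
    stream_count: Optional[int] = None
    coupled_count: Optional[int] = None
    channel_mapping: Optional[bytes] = None

    @property
    def output_gain_db(self) -> float:
        # 8.8 fixed point: divide by 2**8 to get the value in dB.
        return self.output_gain / 256.0


def parse_dops(payload: bytes) -> OpusSpecificBox:
    """Parse the bytes after the box header (size + 'dOps'); all integers are big-endian."""
    (version_and_flags,) = struct.unpack_from(">I", payload, 0)
    version = version_and_flags >> 24
    flags = version_and_flags & 0xFFFFFF
    if version != 0:
        # A reader without support for this version must not read further fields.
        raise ValueError(f"unsupported dOps version {version}")
    pos = 4
    channels = payload[pos]
    pos += 1
    pre_skip = None
    if flags & PRE_SKIP_PRESENT:
        (pre_skip,) = struct.unpack_from(">H", payload, pos)
        pos += 2
    rate = 0  # absent InputSampleRate is processed as 0
    if flags & INPUT_SAMPLE_RATE_PRESENT:
        (rate,) = struct.unpack_from(">I", payload, pos)
        pos += 4
    gain = 0  # absent OutputGain is processed as 0
    if flags & OUTPUT_GAIN_PRESENT:
        (gain,) = struct.unpack_from(">h", payload, pos)
        pos += 2
    family = payload[pos]
    pos += 1
    box = OpusSpecificBox(version, flags, channels, pre_skip, rate, gain, family)
    if family != 0:
        # ChannelMappingTable: StreamCount, CoupledCount, then one octet per output channel.
        box.stream_count = payload[pos]
        box.coupled_count = payload[pos + 1]
        box.channel_mapping = payload[pos + 2 : pos + 2 + channels]
        if len(box.channel_mapping) != channels:
            raise ValueError("truncated ChannelMapping")
    return box
```

With flags = 0x000006, two channels, an InputSampleRate of 44100 and ChannelMappingFamily 0, the payload is
12 bytes; together with the 8-byte box header this agrees with the size = 20 listed for one of the 'dOps'
dumps in the encapsulation example.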
Figure 2 - Basic structure of Movie Fragment Box
4.8.13 Sample Group Description Box<a name="4.8.13"></a>
For any track containing Opus bitstreams, at least one Sample Group Description Box shall be present and have
the grouping_type field set to 'roll'. In addition, the following requirements and restrictions apply.
+ version:
The version field shall be set to 1 if the grouping_type field is set to 'roll'.
+ default_length:
The default_length field shall be set to 2 if the grouping_type field is set to 'roll'.
+ roll_distance:
The roll_distance field in any AudioRollRecoveryEntry shall not be set to zero or to a positive value for
any Opus sample.
See also 4.5.2 Pre-roll.
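The constraints above can be checked mechanically. The sketch below is illustrative only: the function
names are hypothetical, and the 3840-sample (80 ms at 48 kHz) pre-roll used in the usage note is an
assumption based on the recommendation in Ogg Opus [4], not a value stated in this subclause.

```python
from typing import List, Sequence


def roll_distance_for(pre_roll_samples: int, sample_duration: int) -> int:
    """Negative number of Opus samples needed to cover pre_roll_samples PCM samples."""
    if pre_roll_samples <= 0 or sample_duration <= 0:
        raise ValueError("pre_roll_samples and sample_duration must be positive")
    # Integer ceiling division, negated because roll_distance counts backwards.
    return -((pre_roll_samples + sample_duration - 1) // sample_duration)


def check_roll_sgpd(version: int, default_length: int,
                    roll_distances: Sequence[int]) -> List[str]:
    """Return a list of violations for an 'sgpd' box with grouping_type 'roll'."""
    problems = []
    if version != 1:
        problems.append(f"version is {version}, expected 1")
    if default_length != 2:
        problems.append(f"default_length is {default_length}, expected 2")
    for index, distance in enumerate(roll_distances):
        if distance >= 0:
            problems.append(f"roll_distance[{index}] is {distance}, must be negative")
    return problems
```

Assuming a 3840-sample pre-roll, Opus samples of 1920 PCM samples each give roll_distance_for(3840, 1920)
equal to -2, and samples of 3840 PCM samples each give -1.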
4.8.14 Sample to Group Box<a name="4.8.14"></a>
For any track containing Opus bitstreams, at least one Sample to Group Box shall be present and have the
grouping_type field set to 'roll'. In addition, the following requirement is applied.
+ group_description_index:
The group_description_index fields shall not be set to 0 if the grouping_type field is set to 'roll'.
4.8.15 Track Extends Box<a name="4.8.15"></a>
For any track containing Opus bitstreams, the following requirement is applied.
+ default_sample_flags:
The sample_is_non_sync_sample field shall be set to 0.
4.8.16 Track Fragment Box<a name="4.8.16"></a>
For any track containing Opus bitstreams, if any sample is contained in a track fragment, the Sample to
Group Box with the grouping_type field set to 'roll' shall be present for that track fragment.
4.8.17 Track Fragment Header Box<a name="4.8.17"></a>
For any track containing Opus bitstreams, the following requirement is applied.
+ default_sample_flags:
The sample_is_non_sync_sample field shall be set to 0.
4.8.18 Track Fragment Run Box<a name="4.8.18"></a>
For any track containing Opus bitstreams, the following requirements are applied.
+ first_sample_flags:
The sample_is_non_sync_sample field shall be set to 0.
+ sample_flags:
The sample_is_non_sync_sample field shall be set to 0.
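The sample_is_non_sync_sample requirement in 4.8.15, 4.8.17 and 4.8.18 can be tested with a single bit
mask. The sketch below assumes the 32-bit sample flags layout of ISO/IEC 14496-12 [1], in which
sample_is_non_sync_sample is the bit directly above the 16-bit sample_degradation_priority; the function
and constant names are hypothetical.

```python
from typing import Iterable, List

# Assumed position of sample_is_non_sync_sample within the 32-bit sample flags:
# the low 16 bits hold sample_degradation_priority, this flag is the next bit up.
SAMPLE_IS_NON_SYNC_SAMPLE = 0x00010000


def is_non_sync(sample_flags: int) -> bool:
    """True if the sample_is_non_sync_sample bit is set in sample_flags."""
    return bool(sample_flags & SAMPLE_IS_NON_SYNC_SAMPLE)


def non_sync_violations(flags_values: Iterable[int]) -> List[int]:
    """Indices of flag values that violate the 'shall be set to 0' requirement."""
    return [i for i, flags in enumerate(flags_values) if is_non_sync(flags)]
```

The same check applies to default_sample_flags, first_sample_flags and each per-sample sample_flags value.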
<a name="4.9"></a>
4.9 Example of Encapsulation
It is strongly recommended that the Movie Fragment Header Box and the Track Fragment Header Box be
placed first in their container.
Boxes marked with an asterisk (*) may be present.
For the boxes listed above, the definition is as given in ISO/IEC 14496-12 [1].
<a name="4.7"></a>
4.7 Example of Encapsulation (informative)
[File]
size = 10349
size = 17790
[ftyp: File Type Box]
position = 0
size = 24
......@@ -542,22 +438,16 @@ Table of Contents
compatible_brands
brand[0] = mp42 : MP4 version 2
brand[1] = iso2 : ISO Base Media file format version 2
[free: Free Space Box]
position = 24
size = 8
[mdat: Media Data Box]
position = 32
size = 9551
[moov: Movie Box]
position = 9583
size = 766
position = 24
size = 757
[mvhd: Movie Header Box]
position = 9591
position = 32
size = 108
version = 0
flags = 0x000000
creation_time = UTC 2014/09/23, 15:23:21
modification_time = UTC 2014/09/23, 15:23:21
creation_time = UTC 2014/12/12, 18:41:19
modification_time = UTC 2014/12/12, 18:41:19
timescale = 48000
duration = 33600 (00:00:00.700)
rate = 1.000000
......@@ -577,7 +467,7 @@ Table of Contents
pre_defined = 0x00000000
next_track_ID = 2
[iods: Object Descriptor Box]
position = 9699
position = 140
size = 33
version = 0
flags = 0x000000
......@@ -596,18 +486,18 @@ Table of Contents
expandableClassSize = 4
Track_ID = 1
[trak: Track Box]
position = 9732
size = 617
position = 173
size = 608
[tkhd: Track Header Box]
position = 9740
position = 181
size = 92
version = 0
flags = 0x000007
Track enabled
Track in movie
Track in preview
creation_time = UTC 2014/09/23, 15:23:21
modification_time = UTC 2014/09/23, 15:23:21
creation_time = UTC 2014/12/12, 18:41:19
modification_time = UTC 2014/12/12, 18:41:19
track_ID = 1
reserved = 0x00000000
duration = 33600 (00:00:00.700)
......@@ -624,34 +514,34 @@ Table of Contents
width = 0.000000
height = 0.000000
[edts: Edit Box]
position = 9832
position = 273
size = 36
[elst: Edit List Box]
position = 9840
position = 281
size = 28
version = 0
flags = 0x000000
entry_count = 1
entry[0]
segment_duration = 33600
media_time = 3840
media_time = 312
media_rate = 1.000000
[mdia: Media Box]
position = 9868
size = 481
position = 309
size = 472
[mdhd: Media Header Box]
position = 9876
position = 317
size = 32
version = 0
flags = 0x000000
creation_time = UTC 2014/09/23, 15:23:21
modification_time = UTC 2014/09/23, 15:23:21
creation_time = UTC 2014/12/12, 18:41:19
modification_time = UTC 2014/12/12, 18:41:19
timescale = 48000
duration = 38400 (00:00:00.800)
duration = 34560 (00:00:00.720)
language = und
pre_defined = 0x0000
[hdlr: Handler Reference Box]
position = 9908
position = 349
size = 51
version = 0
flags = 0x000000
......@@ -662,147 +552,151 @@ Table of Contents
reserved = 0x00000000
name = Xiph Audio Handler
[minf: Media Information Box]
position = 9959
size = 390
position = 400
size = 381
[smhd: Sound Media Header Box]
position = 9967
position = 408
size = 16
version = 0
flags = 0x000000
balance = 0.000000
reserved = 0x0000
[dinf: Data Information Box]
position = 9983
position = 424
size = 36
[dref: Data Reference Box]
position = 9991
position = 432
size = 28
version = 0
flags = 0x000000
entry_count = 1
[url : Data Entry Url Box]
position = 10007
position = 448
size = 12
version = 0
flags = 0x000001
location = in the same file
[stbl: Sample Table Box]
position = 10019
size = 330
position = 460
size = 321
[stsd: Sample Description Box]
position = 10027
size = 72
position = 468
size = 79
version = 0
flags = 0x000000
entry_count = 1
[Opus: Audio Description]
position = 10043
size = 56
position = 484
size = 63
reserved = 0x000000000000
data_reference_index = 1
reserved = 0x0000
reserved = 0x0000
reserved = 0x00000000
channelcount = 2
channelcount = 6
samplesize = 16
pre_defined = 0
reserved = 0
samplerate = 48000.000000
[dOps: Opus Specific Box]
position = 10071
size = 20
version = 0
flags = 0x000006
OutputChannelCount = 2
InputSampleRate = 44100
OutputGain = 0.000000
ChannelMappingFamily = 0
position = 520
size = 27
Version = 0
OutputChannelCount = 6
PreSkip = 312
InputSampleRate = 48000
OutputGain = 0
ChannelMappingFamily = 1
StreamCount = 4
CoupledCount = 2
ChannelMapping
0 -> 0: front left
1 -> 4: front center
2 -> 1: front right
3 -> 2: side left
4 -> 3: side right
5 -> 5: rear center
[stts: Decoding Time to Sample Box]
position = 10099
position = 547
size = 24
version = 0
flags = 0x000000
entry_count = 1
entry[0]
sample_count = 10
sample_delta = 3840
sample_count = 18
sample_delta = 1920
[stsc: Sample To Chunk Box]
position = 10123
position = 571
size = 40
version = 0
flags = 0x000000
entry_count = 2
entry[0]
first_chunk = 1
samples_per_chunk = 4
samples_per_chunk = 13
sample_description_index = 1
entry[1]
first_chunk = 3
samples_per_chunk = 2
first_chunk = 2
samples_per_chunk = 5
sample_description_index = 1
[stsz: Sample Size Box]
position = 10163
size = 60
position = 611
size = 92
version = 0
flags = 0x000000
sample_size = 0 (variable)
sample_count = 10
entry_size[0] = 780
entry_size[1] = 920
entry_size[2] = 963
entry_size[3] = 988
entry_size[4] = 1024
entry_size[5] = 951
entry_size[6] = 933
entry_size[7] = 1017
entry_size[8] = 992
entry_size[9] = 975
sample_count = 18
entry_size[0] = 977
entry_size[1] = 938
entry_size[2] = 939
entry_size[3] = 938
entry_size[4] = 934
entry_size[5] = 945
entry_size[6] = 948
entry_size[7] = 956
entry_size[8] = 955
entry_size[9] = 930
entry_size[10] = 933
entry_size[11] = 934
entry_size[12] = 972
entry_size[13] = 977
entry_size[14] = 958
entry_size[15] = 949
entry_size[16] = 962
entry_size[17] = 848
[stco: Chunk Offset Box]
position = 10223
size = 28
position = 703
size = 24
version = 0
flags = 0x000000
entry_count = 3
chunk_offset[0] = 40
chunk_offset[1] = 3691
chunk_offset[2] = 7616
entry_count = 2
chunk_offset[0] = 797
chunk_offset[1] = 13096
[sgpd: Sample Group Description Box]
position = 10251
size = 30
position = 727
size = 26
version = 1
flags = 0x000000
grouping_type = roll
default_length = 2 (constant)
entry_count = 3
roll_distance[0] = -1
roll_distance[1] = -2
roll_distance[2] = -3
entry_count = 1
roll_distance[0] = -2
[sbgp: Sample to Group Box]
position = 10281
size = 68
position = 753
size = 28
version = 0
flags = 0x000000
grouping_type = roll
entry_count = 6
entry_count = 1
entry[0]
sample_count = 2
group_description_index = 1
entry[1]
sample_count = 1
group_description_index = 2
entry[2]
sample_count = 1
group_description_index = 3
entry[3]
sample_count = 1
group_description_index = 2
entry[4]
sample_count = 3
group_description_index = 3
entry[5]
sample_count = 2
sample_count = 18
group_description_index = 1
[free: Free Space Box]
position = 781
size = 8
[mdat: Media Data Box]
position = 789
size = 17001
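The tables in the dump above can be cross-checked against each other. The following sketch derives the
file offset of every sample from the 'stco', 'stsc' and 'stsz' entries. The numeric values are copied from
the lines of the dump that describe the 18-sample stream; because this view interleaves two revisions of
the example, treating those lines as one consistent set is an assumption. The function name is
hypothetical.

```python
from typing import List, Sequence, Tuple

# Values copied from the 18-sample revision of the dump (assumed to belong together).
CHUNK_OFFSETS = [797, 13096]                       # stco
SAMPLE_TO_CHUNK = [(1, 13), (2, 5)]                # stsc: (first_chunk, samples_per_chunk)
SAMPLE_SIZES = [977, 938, 939, 938, 934, 945, 948, 956, 955,
                930, 933, 934, 972, 977, 958, 949, 962, 848]  # stsz
MDAT_POSITION, MDAT_SIZE = 789, 17001


def sample_offsets(chunk_offsets: Sequence[int],
                   sample_to_chunk: Sequence[Tuple[int, int]],
                   sample_sizes: Sequence[int]) -> List[int]:
    """Return the absolute file offset of each sample, in decoding order."""
    offsets = []
    sample_index = 0
    for chunk_number, chunk_offset in enumerate(chunk_offsets, start=1):
        # The applicable stsc entry is the last one whose first_chunk <= chunk_number.
        samples_in_chunk = next(count for first, count in reversed(sample_to_chunk)
                                if first <= chunk_number)
        position = chunk_offset
        for _ in range(samples_in_chunk):
            if sample_index >= len(sample_sizes):
                break
            offsets.append(position)
            position += sample_sizes[sample_index]
            sample_index += 1
    return offsets


offsets = sample_offsets(CHUNK_OFFSETS, SAMPLE_TO_CHUNK, SAMPLE_SIZES)
end_of_last_sample = offsets[-1] + SAMPLE_SIZES[-1]
print(offsets[0], offsets[13], end_of_last_sample)
```

Under that assumption the first sample starts right after the 8-byte 'mdat' header, and the last sample
ends at MDAT_POSITION + MDAT_SIZE. The timing values line up in the same way: 18 samples of 1920 give the
media duration 34560, and the edit (media_time 312 plus segment_duration 33600) stays within it.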
<a name="5"></a>
5 Authors' Address
Yusuke Nakamura
......