Commit 2d25330d authored by Jean-Marc Valin

Update TOC byte

parent 57feffc1
...@@ -141,73 +141,73 @@ There are three possible operating modes for the proposed prototype: ...@@ -141,73 +141,73 @@ There are three possible operating modes for the proposed prototype:
</list> </list>
Each of these modes supports a number of different frame sizes and sampling Each of these modes supports a number of different frame sizes and sampling
rates. In order to distinguish between the various modes and configurations, rates. In order to distinguish between the various modes and configurations,
we need to define a simple header that can be used in the transport layer we define a single-byte table-of-contents (TOC) header that can be used in the transport layer
(e.g., RTP) to signal this information. The following describes the proposed (e.g., RTP) to signal this information. The following describes the proposed
header. TOC byte.
</t> </t>
<t> <t>
The LP mode supports the following configurations (numbered from 00000...01011 in binary): The LP mode supports the following configurations (numbered from 0 to 11):
<list style="symbols"> <list style="symbols">
<t>8 kHz: 10, 20, 40, 60 ms (00000...00011)</t> <t>8 kHz: 10, 20, 40, 60 ms (0..3)</t>
<t>12 kHz: 10, 20, 40, 60 ms (00100...00111)</t> <t>12 kHz: 10, 20, 40, 60 ms (4..7)</t>
<t>16 kHz: 10, 20, 40, 60 ms (01000...01011)</t> <t>16 kHz: 10, 20, 40, 60 ms (8..11)</t>
</list> </list>
for a total of 12 configurations. for a total of 12 configurations.
</t> </t>
<t> <t>
The hybrid mode supports the following configurations (numbered from 01100...01111): The hybrid mode supports the following configurations (numbered from 12 to 15):
<list style="symbols"> <list style="symbols">
<t>32 kHz: 10, 20 ms (01100...01101)</t> <t>32 kHz: 10, 20 ms (12..13)</t>
<t>48 kHz: 10, 20 ms (01110...01111)</t> <t>48 kHz: 10, 20 ms (14..15)</t>
</list> </list>
for a total of 4 configurations. for a total of 4 configurations.
</t> </t>
<t> <t>
The MDCT-only mode supports the following configurations (numbered from 10000...11101): The MDCT-only mode supports the following configurations (numbered from 16 to 31):
<list style="symbols"> <list style="symbols">
<t>8 kHz: 2.5, 5, 10, 20 ms (10000...10011)</t> <t>8 kHz: 2.5, 5, 10, 20 ms (16..19)</t>
<t>16 kHz: 2.5, 5, 10, 20 ms (10100...10111)</t> <t>16 kHz: 2.5, 5, 10, 20 ms (20..23)</t>
<t>32 kHz: 2.5, 5, 10, 20 ms (11000...11011)</t> <t>32 kHz: 2.5, 5, 10, 20 ms (24..27)</t>
<t>48 kHz: 2.5, 5, 10, 20 ms (11100...11111)</t> <t>48 kHz: 2.5, 5, 10, 20 ms (28..31)</t>
</list> </list>
for a total of 16 configurations. for a total of 16 configurations.
</t> </t>
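The three configuration tables on the new side of this diff can be read as a single lookup from a configuration number (0 to 31) to mode, sampling rate and frame size. The following is a minimal sketch under that reading; the function name `decode_config` and the returned tuple layout are illustrative assumptions, not part of the draft.

```python
def decode_config(config):
    """Map a configuration number (0..31) to (mode, rate_khz, frame_ms).

    Hypothetical helper: the name and return shape are assumptions made
    for illustration; the numeric ranges follow the lists in the draft.
    """
    if not 0 <= config <= 31:
        raise ValueError("configuration number must be in 0..31")
    if config < 12:
        # LP mode: 3 sampling rates x 4 frame sizes (0..11)
        return ("LP", (8, 12, 16)[config // 4], (10, 20, 40, 60)[config % 4])
    if config < 16:
        # Hybrid mode: 2 sampling rates x 2 frame sizes (12..15)
        idx = config - 12
        return ("hybrid", (32, 48)[idx // 2], (10, 20)[idx % 2])
    # MDCT-only mode: 4 sampling rates x 4 frame sizes (16..31)
    idx = config - 16
    return ("MDCT-only", (8, 16, 32, 48)[idx // 4], (2.5, 5, 10, 20)[idx % 4])
```

For instance, `decode_config(29)` corresponds to the 48 kHz, 5 ms MDCT-only entry used in one of the examples further down.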
<t> <t>
There is thus a total of 32 configurations, so 5 bits are necessary to There is thus a total of 32 configurations, encoded in 5 bits. One bit is used to signal mono vs stereo, which leaves 2 bits for the number of frames per packet (codes 0 to 3):
indicate the mode, frame size and sampling rate (MFS). This leaves 3 bits for the number of frames per packet (codes 0 to 7):
<list style="symbols"> <list style="symbols">
<t>0-2: 1-3 frames in the packet, each with equal compressed size</t> <t>0: 1 frame in the packet</t>
<t>3: arbitrary number of frames in the packet, each with equal compressed size (one size needs to be encoded)</t> <t>1: 2 frames in the packet, each with equal compressed size</t>
<t>4-5: 2-3 frames in the packet, with different compressed sizes, which need to be encoded (except the last one)</t> <t>2: arbitrary number of frames in the packet, each with equal compressed size</t>
<t>6: arbitrary number of frames in the packet, with different compressed sizes, each of which needs to be encoded</t> <t>3: arbitrary number of frames in the packet, with different compressed sizes</t>
<t>7: The first frame has this MFS, but others have different MFS. Each compressed size needs to be encoded.</t>
</list> </list>
When code 7 is used and the last frames of a packet have the same MFS, it is For codes 2 and 3, the TOC byte is followed by the number of frames in the packet.
allowed to switch to another code for them. For code 3, the byte indicating the number of frames is followed by N-1 frame
lengths encoded as described below. As an additional limit, the audio duration contained
within a packet may not exceed 120 ms.
</t> </t>
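As a rough illustration of the new TOC byte, the sketch below splits the first byte of a packet into its three fields and reads the frame count. The bit order (configuration in the five most significant bits, then the stereo flag, then the 2-bit frame-count code) is an assumption inferred from the example diagrams in this draft, and the function name `parse_toc` is hypothetical. It does not enforce the 120 ms duration limit.

```python
def parse_toc(packet):
    """Return (config, stereo, code, frame_count) for a packet.

    Assumed layout, inferred from the draft's example figures:
    bits 7..3 = configuration, bit 2 = stereo flag, bits 1..0 = code.
    """
    if len(packet) < 1:
        raise ValueError("packet is empty")
    toc = packet[0]
    config = toc >> 3                 # 5-bit configuration number
    stereo = bool((toc >> 2) & 1)     # 0 = mono, 1 = stereo
    code = toc & 0x3                  # 2-bit frames-per-packet code
    if code == 0:
        count = 1
    elif code == 1:
        count = 2
    else:
        # Codes 2 and 3: the TOC byte is followed by the number of frames.
        if len(packet) < 2:
            raise ValueError("missing frame-count byte")
        count = packet[1]
    return config, stereo, code, count
```

With this layout, a packet starting with the bytes `0xFE 0x04` would parse as configuration 31, stereo, code 2, four frames.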
<t> <t>
The compressed size of the frames (if needed) is indicated -- usually -- with one byte, with the following meaning: The compressed size of the frames (if needed) is indicated -- usually -- with one byte, with the following meaning:
<list style="symbols"> <list style="symbols">
<t>0: No frame (DTX or lost packet)</t> <t>0: No frame (DTX or lost packet)</t>
<t>1-251: Size of the frame in bytes</t> <t>1-251: Size of the frame in bytes</t>
<t>252-255: A second byte is needed. The total size is (size[1]*4)+(size[0]%4)+252</t> <t>252-255: A second byte is needed. The total size is (size[1]*4)+size[0]</t>
</list> </list>
</t> </t>
<t> <t>
The maximum size representable is 255*4+3+252=1275 bytes. For 20 ms frames, that The maximum size representable is 255*4+255=1275 bytes. For 20 ms frames, that
represents a bit-rate of 510 kb/s, which is really the highest rate anyone would want represents a bit-rate of 510 kb/s, which is really the highest rate anyone would want
to use in stereo mode (beyond that point, lossless codecs would be more appropriate). to use in stereo mode (beyond that point, lossless codecs would be more appropriate).
</t> </t>
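The revised size formula, (size[1]*4)+size[0], can be checked with a small round-trip sketch. The decoder follows the list above directly; the encoder is a derivation, not text from the draft: since 252 is a multiple of 4, choosing the first byte as 252 + (size mod 4) makes the remainder divisible by 4. The names `encode_size` and `decode_size` are hypothetical.

```python
def encode_size(size):
    """Encode a compressed frame size (0..1275 bytes) in one or two bytes."""
    if not 0 <= size <= 1275:
        raise ValueError("size must be in 0..1275")
    if size < 252:
        return bytes([size])          # 0 = no frame, 1..251 = size in bytes
    first = 252 + (size % 4)          # first byte lands in 252..255
    return bytes([first, (size - first) // 4])


def decode_size(data, pos=0):
    """Return (size, bytes_consumed) for the size field starting at pos."""
    first = data[pos]
    if first < 252:
        return first, 1
    # Two-byte form: total size is (size[1]*4)+size[0]
    return data[pos + 1] * 4 + first, 2
```

The largest value, 1275 bytes, encodes as `0xFF 0xFF`; at 50 frames per second (20 ms frames) that is 1275 * 8 * 50 = 510000 bit/s, matching the 510 kb/s figure quoted above.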
<section anchor="examples" title="Examples"> <section anchor="examples" title="Examples">
<t> <t>
Simplest case: one packet Simplest case: one narrowband mono 20-ms SILK frame
</t> </t>
<t> <t>
...@@ -216,14 +216,14 @@ Simplest case: one packet ...@@ -216,14 +216,14 @@ Simplest case: one packet
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MFS |0|0|0| compressed data... | | 1 |0|0|0| compressed data... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork> ]]></artwork>
</figure> </figure>
</t> </t>
<t> <t>
Four frames of the same compressed size: Two 48 kHz mono 5 ms CELT frames of the same compressed size:
</t> </t>
<t> <t>
...@@ -232,14 +232,14 @@ Four frames of the same compressed size: ...@@ -232,14 +232,14 @@ Four frames of the same compressed size:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MFS |0|1|1| compressed data... | | 29 |0|0|1| compressed data... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork> ]]></artwork>
</figure> </figure>
</t> </t>
<t> <t>
Two frames of different compressed size: Two 48 kHz mono 20-ms hybrid frames of different compressed size:
</t> </t>
<t> <t>
...@@ -248,14 +248,16 @@ Two frames of different compressed size: ...@@ -248,14 +248,16 @@ Two frames of different compressed size:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MFS |1|0|1| frame size | compressed data... | | 15 |0|1|1| 2 | frame size |compressed data|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| compressed data... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork> ]]></artwork>
</figure> </figure>
</t> </t>
<t> <t>
Three frames of different <spanx style="emph">durations</spanx>: Four 48 kHz stereo 20-ms CELT frames of the same compressed size:
</t> </t>
...@@ -265,9 +267,7 @@ Three frames of different <spanx style="emph">durations</spanx>: ...@@ -265,9 +267,7 @@ Three frames of different <spanx style="emph">durations</spanx>:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 1st MFS |1|1|1| frame size | 2nd MFS |1|1|1| frame size | | 31 |1|1|0| 4 | compressed data... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 3rd MFS |1|1|1| frame size | compressed data... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork> ]]></artwork>
</figure> </figure>
......