Commit 7635c6db authored by Jean-Marc Valin

ietf doc: fixed a few minor things that were broken in the last changes

parent bed19456
@@ -343,7 +343,7 @@ CELT uses an entropy coder based upon <xref target="range-coding"></xref>,
which is itself a rediscovery of the FIFO arithmetic code introduced by <xref target="coding-thesis"></xref>.
It is very similar to arithmetic encoding, except that encoding is done with
digits in any base instead of with bits,
-so it is faster when using larger bases (e.g.: an octet). All of the
+so it is faster when using larger bases (i.e.: an octet). All of the
calculations in the range coder must use bit-exact integer arithmetic.
</t>
@@ -519,7 +519,7 @@ The CELT codec has several optional features that can be switched on or off in e
<section anchor="intra" title="Intra-frame energy (I)">
<t>
-CELT uses prediction to encode the energy in each frequency band. In order to make frames independent, however, it is possible to disable the part of the prediction that depends on previous frames. This is called <spanx style="emph">intra-frame energy</spanx> and requires around 12 more bits per frame. It is enabled with the <spanx style="emph">I</spanx> bit (Table. <xref target="flags-encoding">flags-encoding</xref>). The use of intra energy is OPTIONAL and the decision method is left to the implementor. The reference code describes one way of deciding which frames would benefit most from having their energy encoded without prediction. The intra_decision() (<xref target="quant_bands.c">quant_bands.c</xref>) function looks for frames where the log-spectral distance between consecutive frames is more than 9 dB. When such a difference is found between two frames, the next frame (not the one for which the difference is detected) is marked encoded with intra energy. The one-frame delay is to ensure that when a frame containing a transient event is lost, then the next frame will be decoded without accumulating error from the lost frame.
+CELT uses prediction to encode the energy in each frequency band. In order to make frames independent, however, it is possible to disable the part of the prediction that depends on previous frames. This is called <spanx style="emph">intra-frame energy</spanx> and requires around 12 more bits per frame. It is enabled with the <spanx style="emph">I</spanx> bit (Table. <xref target="flags-encoding">flags-encoding</xref>). The use of intra energy is OPTIONAL and the decision method is left to the implementor. The reference code describes one way of deciding which frames would benefit most from having their energy encoded without prediction. The intra_decision() (<xref target="quant_bands.c">quant_bands.c</xref>) function looks for frames where the log-spectral distance between consecutive frames is more than 9 dB. When such a difference is found between two frames, the next frame (not the one for which the difference is detected) is marked encoded with intra energy. The one-frame delay is to ensure that when a frame containing a transient is lost, then the next frame will be decoded without accumulating error from the lost frame.
</t>
</section>
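
The one-frame-delayed decision described in the intra-frame energy paragraph could be sketched in C roughly as follows. This is an illustration only, not the reference intra_decision(): the draft text does not say how the log-spectral distance is computed, so the sketch assumes the RMS difference of per-band log energies in dB. The names intra_state, intra_decide() and NBANDS are hypothetical.

```c
#include <assert.h>

#define NBANDS 4          /* hypothetical band count for this sketch */
#define THRESH_DB 9.0     /* threshold quoted in the draft text */

typedef struct {
    double prev[NBANDS];  /* per-band log energies (dB) of the previous frame */
    int have_prev;        /* 0 until the first frame has been seen */
    int pending;          /* 1 if the previous frame exceeded the threshold */
} intra_state;

/* Returns 1 if the current frame should be coded with intra energy.
   The decision is delayed by one frame: a large distance detected on
   frame n marks frame n+1, not frame n. */
static int intra_decide(intra_state *st, const double *band_db)
{
    int use_intra;
    double sum_sq = 0.0;
    int i;

    assert(st && band_db);
    use_intra = st->pending;      /* decision taken one frame earlier */

    if (st->have_prev) {
        for (i = 0; i < NBANDS; i++) {
            double d = band_db[i] - st->prev[i];
            sum_sq += d * d;
        }
    }
    /* ASSUMPTION: "distance" is the RMS band difference, so
       RMS > 9 dB is the same as mean squared difference > 81. */
    st->pending = st->have_prev &&
                  (sum_sq / NBANDS > THRESH_DB * THRESH_DB);

    for (i = 0; i < NBANDS; i++)
        st->prev[i] = band_db[i];
    st->have_prev = 1;
    return use_intra;
}
```

A caller would keep one zero-initialised intra_state per stream and call intra_decide() once per frame before coding the band energies.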
@@ -708,7 +708,9 @@ all integer codevectors y of N dimensions that satisfy sum(abs(y(j))) = K.
<t>
In bands where neither pitch nor folding is used, the PVQ is used to encode
the unit vector that results from the normalization in
-<xref target="normalization"></xref> directly. " In the case where a pitch
+<xref target="normalization"></xref> directly. Given a PVQ codevector y,
+the unit vector X is obtained as X = y/||y||, where ||.|| denotes the
+L2 norm. In the case where a pitch
prediction or a folding vector p is used, the quantized unit vector X' becomes:
</t>
<t>X' = p' + g_f * y,</t>
@@ -790,11 +792,11 @@ V(N,K) = V(N-1,K) + V(N,K-1) + V(N-1,K-1), with V(N,0) = 1 and V(0,K) = 0, K !=
There are many different ways to compute V(N,K), including pre-computed tables and direct
use of the recursive formulation. The reference implementation applies the recursive
formulation one line (or column) at a time to save on memory use,
along with an alternate,
univariate recurrence to initialise an arbitrary line, and direct
polynomial solutions for small N. All of these methods are
equivalent, and have different trade-offs in speed, memory usage, and
code size. Implementations MAY use any methods they like, as long as
they are equivalent to the mathematical definition.
</t>
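
As an illustration of applying the recursive formulation one line at a time, the following C sketch keeps a single row of the V(N,K) table in memory. It assumes the standard form of the recurrence, V(N,K) = V(N-1,K) + V(N,K-1) + V(N-1,K-1), with V(N,0) = 1 and V(0,K) = 0 for K != 0. The names pvq_count() and PVQ_MAX_K are hypothetical, and the sketch does not use the univariate recurrence or the polynomial solutions mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define PVQ_MAX_K 32   /* hypothetical limit on K for this sketch */

/* Number of integer vectors y of dimension n with sum(abs(y[j])) == k.
   No overflow check is made: the caller must keep V(n,k) below 2^32. */
static uint32_t pvq_count(int n, int k)
{
    uint32_t row[PVQ_MAX_K + 1];
    int i, j;

    assert(n >= 0 && k >= 0 && k <= PVQ_MAX_K);

    /* Row for N = 0: V(0,0) = 1 and V(0,K) = 0 for K != 0. */
    row[0] = 1;
    for (j = 1; j <= k; j++)
        row[j] = 0;

    /* Update the row in place, one dimension at a time. */
    for (i = 1; i <= n; i++) {
        uint32_t diag = row[0];              /* V(i-1, j-1) */
        for (j = 1; j <= k; j++) {
            uint32_t up = row[j];            /* V(i-1, j) */
            row[j] = up + row[j - 1] + diag; /* row[j-1] already holds V(i, j-1) */
            diag = up;
        }
        /* row[0] stays 1 because V(N,0) = 1 for every N. */
    }
    return row[k];
}
```

For example, pvq_count(2, 2) counts the eight vectors (±2,0), (0,±2) and (±1,±1).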
@@ -815,7 +817,7 @@ than 32 bits MUST be implemented with the splitting method, even if 64-bit arith
<section anchor="stereo" title="Stereo support">
<t>
-When encoding a stereo stream, some parameters are shared across the left and right channels, while others are transmitted separately for each channel, or jointly encoded. Only one copy of the flags for the transients and pitch (pitch period and gains) features are transmitted. The coarse and fine energy parameters are transmitted separately for each channel. Both the coarse energy and fine energy (including the remaining fine bits at the end of the stream) have the left and right bands interleaved in the stream, with the left band encoded first.
+When encoding a stereo stream, some parameters are shared across the left and right channels, while others are transmitted separately for each channel, or jointly encoded. Only one copy of the flags for the features, transients and pitch (pitch period and gains) are transmitted. The coarse and fine energy parameters are transmitted separately for each channel. Both the coarse energy and fine energy (including the remaining fine bits at the end of the stream) have the left and right bands interleaved in the stream, with the left band encoded first.
</t>
<t>
@@ -903,74 +905,74 @@ to the application that a problem has occurred.
<section anchor="range-decoder" title="Range Decoder">
<t>
The range decoder extracts the symbols and integers encoded using the range encoder in
<xref target="range-encoder"></xref>. The range decoder maintains an internal
state vector composed of the two-tuple (dif,rng), representing the
difference between the high end of the current range and the actual
coded value, and the size of the current range, respectively. Both
dif and rng are 32-bit unsigned integer values. rng is initialized to
2^7. dif is initialized to rng minus the top 7 bits of the first
input octet. Then the range is immediately normalized, using the
procedure described in the following section.
</t>
<section anchor="decoding-symbols" title="Decoding Symbols">
<t>
Decoding symbols is a two-step process. The first step determines
a value fs that lies within the range of some symbol in the current
context. The second step updates the range decoder state with the
three-tuple (fl,fh,ft) corresponding to that symbol, as defined in
<xref target="encoding-symbols"></xref>.
</t>
<t>
The first step is implemented by ec_decode()
(<xref target="rangedec.c">rangedec.c</xref>),
and computes fs = ft-min((dif-1)/(rng/ft)+1,ft), where ft is
the sum of the frequency counts in the current context, as described
in <xref target="encoding-symbols"></xref>. The divisions here are integer divisions that discard any remainder.
</t>
<t>
In the reference implementation, a special version of ec_decode()
called ec_decode_bin() (<xref target="rangedec.c">rangedec.c</xref>) is defined using
the parameter ftb instead of ft. It is mathematically equivalent to
calling ec_decode() with ft = (1&lt;&lt;ftb), but avoids one of the
divisions.
</t>
<t>
The decoder then identifies the symbol in the current context
corresponding to fs; i.e., the one whose three-tuple (fl,fh,ft)
satisfies fl &lt;= fs &lt; fh. This tuple is used to update the decoder
state according to dif = dif - (rng/ft)*(ft-fh), and if fl is greater
than zero, rng = (rng/ft)*(fh-fl), or otherwise rng = rng - (rng/ft)*(ft-fh). After this update, the range is normalized.
</t>
<t>
To normalize the range, the following process is repeated until
rng > 2^23. First, rng is set to (rng&lt;&lt;8)&amp;0xFFFFFFFF. Then the next
8 bits of input are read into sym, using the remaining bit from the
previous input octet as the high bit of sym, and the top 7 bits of the
next octet for the remaining bits of sym. If no more input octets
remain, zero bits are used instead. Then, dif is set to
((dif&lt;&lt;8)-sym)&amp;0xFFFFFFFF (i.e., using wrap-around if the subtraction
overflows a 32-bit register). Finally, if dif is larger than 2^31,
dif is then set to dif - 2^31. This process is carried out by
ec_dec_normalize() (<xref target="rangedec.c">rangedec.c</xref>).
</t>
</section>
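
The initialization, symbol lookup, state update and normalization steps described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the reference rangedec.c: the rdec structure, its rem field and the rdec_* function names are hypothetical, and the shift applied to rng during normalization is taken to be a left shift by 8 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const unsigned char *buf;  /* input octets */
    size_t len, pos;
    unsigned rem;              /* last octet read; its low bit is not yet used */
    uint32_t dif;              /* high end of the range minus the coded value */
    uint32_t rng;              /* size of the current range */
} rdec;

/* Next input octet, or zero bits once the input is exhausted. */
static unsigned rdec_in(rdec *d)
{
    return d->pos < d->len ? d->buf[d->pos++] : 0;
}

/* Repeat until rng > 2^23. */
static void rdec_normalize(rdec *d)
{
    while (d->rng <= (UINT32_C(1) << 23)) {
        unsigned sym;
        d->rng = (d->rng << 8) & UINT32_C(0xFFFFFFFF);
        sym = (d->rem << 7) & 0xFF;   /* leftover bit becomes the high bit */
        d->rem = rdec_in(d);
        sym |= d->rem >> 1;           /* top 7 bits of the next octet */
        d->dif = ((d->dif << 8) - sym) & UINT32_C(0xFFFFFFFF);
        if (d->dif > (UINT32_C(1) << 31))
            d->dif -= UINT32_C(1) << 31;
    }
}

static void rdec_init(rdec *d, const unsigned char *buf, size_t len)
{
    d->buf = buf;
    d->len = len;
    d->pos = 0;
    d->rem = rdec_in(d);
    d->rng = UINT32_C(1) << 7;
    d->dif = d->rng - (d->rem >> 1);  /* rng minus top 7 bits of first octet */
    rdec_normalize(d);
}

/* Step 1: fs = ft - min((dif-1)/(rng/ft) + 1, ft). */
static unsigned rdec_decode(const rdec *d, unsigned ft)
{
    uint32_t s;
    assert(ft > 0 && ft <= 65535);
    s = (d->dif - 1) / (d->rng / ft) + 1;
    return ft - (s < ft ? (unsigned)s : ft);
}

/* Step 2: update the state with the symbol's (fl,fh,ft), then normalize. */
static void rdec_update(rdec *d, unsigned fl, unsigned fh, unsigned ft)
{
    uint32_t r = d->rng / ft;
    uint32_t s = r * (ft - fh);
    d->dif -= s;
    d->rng = fl > 0 ? r * (fh - fl) : d->rng - s;
    rdec_normalize(d);
}
```

A caller would invoke rdec_decode() with the context's total ft, search its frequency table for the symbol with fl &lt;= fs &lt; fh, and then pass that symbol's tuple to rdec_update().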
<section anchor="decoding-ints" title="Decoding Uniformly Distributed Integers">
<t>
The functions ec_dec_uint() and ec_dec_bits() are based on ec_decode() and
decode one of N equiprobable symbols, each with a frequency of 1,
where N may be as large as 2^32-1. Because ec_decode() is limited to
a total frequency of 2^16-1, this is done by decoding a series of
symbols in smaller contexts.
</t>
<t>
ec_dec_bits() (<xref target="entdec.c">entdec.c</xref>) is defined, like
ec_decode_bin(), to take a single parameter ftb, with ftb &lt; 32,
and produces an ftb-bit decoded integer value, t,
initialized to zero. While ftb is greater than 8, it decodes the next
8 most significant bits of the integer, s = ec_decode_bin(8), updates
the decoder state with the 3-tuple (s,s+1,256), adds those bits to
the current value of t, t = t&lt;&lt;8 | s, and subtracts 8 from ftb. Then
it decodes the remaining bits of the integer, s = ec_decode_bin(ftb),
updates the decoder state with the 3-tuple (s,s+1,1&lt;&lt;ftb), and adds
@@ -995,15 +997,15 @@ procedure described in the following section.
<section anchor="decoder-tell" title="Current Bit Usage">
<t>
The bit allocation routines in CELT need to be able to determine a
conservative upper bound on the number of bits that have been used
to decode from the current frame thus far. This drives allocation
decisions which must match those made in the encoder. This is
computed in the reference implementation to fractional bit precision
by the function ec_dec_tell() (<xref target="rangedec.c">rangedec.c</xref>). Like all
operations in the range decoder, it must be implemented in a
bit-exact manner, and must produce exactly the same value as that returned by
ec_enc_tell() after encoding the same symbols.
</t>
</section>