@@ -343,7 +343,7 @@ CELT uses an entropy coder based upon <xref target="range-coding"></xref>,
which is itself a rediscovery of the FIFO arithmetic code introduced by <xreftarget="coding-thesis"></xref>.
It is very similar to arithmetic encoding, except that encoding is done with
digits in any base instead of with bits,
so it is faster when using larger bases (e.g.: an octet). All of the
so it is faster when using larger bases (i.e.: an octet). All of the
calculations in the range coder must use bit-exact integer arithmetic.
</t>
...
...
@@ -519,7 +519,7 @@ The CELT codec has several optional features that can be switched on or off in e
<sectionanchor="intra"title="Intra-frame energy (I)">
<t>
CELT uses prediction to encode the energy in each frequency band. In order to make frames independent, however, it is possible to disable the part of the prediction that depends on previous frames. This is called <spanxstyle="emph">intra-frame energy</spanx> and requires around 12 more bits per frame. It is enabled with the <spanxstyle="emph">I</spanx> bit (Table. <xreftarget="flags-encoding">flags-encoding</xref>). The use of intra energy is OPTIONAL and the decision method is left to the implementor. The reference code describes one way of deciding which frames would benefit most from having their energy encoded without prediction. The intra_decision() (<xreftarget="quant_bands.c">quant_bands.c</xref>) function looks for frames where the log-spectral distance between consecutive frames is more than 9 dB. When such a difference is found between two frames, the next frame (not the one for which the difference is detected) is marked encoded with intra energy. The one-frame delay is to ensure that when a frame containing a transient event is lost, then the next frame will be decoded without accumulating error from the lost frame.
CELT uses prediction to encode the energy in each frequency band. In order to make frames independent, however, it is possible to disable the part of the prediction that depends on previous frames. This is called <spanxstyle="emph">intra-frame energy</spanx> and requires around 12 more bits per frame. It is enabled with the <spanxstyle="emph">I</spanx> bit (Table. <xreftarget="flags-encoding">flags-encoding</xref>). The use of intra energy is OPTIONAL and the decision method is left to the implementor. The reference code describes one way of deciding which frames would benefit most from having their energy encoded without prediction. The intra_decision() (<xreftarget="quant_bands.c">quant_bands.c</xref>) function looks for frames where the log-spectral distance between consecutive frames is more than 9 dB. When such a difference is found between two frames, the next frame (not the one for which the difference is detected) is marked encoded with intra energy. The one-frame delay is to ensure that when a frame containing a transient is lost, then the next frame will be decoded without accumulating error from the lost frame.
</t>
</section>
...
...
@@ -708,7 +708,9 @@ all integer codevectors y of N dimensions that satisfy sum(abs(y(j))) = K.
<t>
In bands where neither pitch nor folding is used, the PVQ is used to encode
the unit vector that results from the normalization in
<xreftarget="normalization"></xref> directly. " In the case where a pitch
<xreftarget="normalization"></xref> directly. Given a PVQ codevector y,
the unit vector X is obtained as X = y/||y||, where ||.|| denotes the
L2 norm. In the case where a pitch
prediction or a folding vector p is used, the quantized unit vector X' becomes:
</t>
<t>X' = p' + g_f * y,</t>
...
...
@@ -790,11 +792,11 @@ V(N,K) = V(N+1,K) + V(N,K+1) + V(N+1,K+1), with V(N,0) = 1 and V(0,K) = 0, K !=
There are many different ways to compute V(N,K), including pre-computed tables and direct
use of the recursive formulation. The reference implementation applies the recursive
formulation one line (or column) at a time to save on memory use,
along with an alternate,
univariate recurrence to initialise an arbitrary line, and direct
polynomial solutions for small N. All of these methods are
equivalent, and have different trade-offs in speed, memory usage, and
code size. Implementations MAY use any methods they like, as long as
along with an alternate,
univariate recurrence to initialise an arbitrary line, and direct
polynomial solutions for small N. All of these methods are
equivalent, and have different trade-offs in speed, memory usage, and
code size. Implementations MAY use any methods they like, as long as
they are equivalent to the mathematical definition.
</t>
...
...
@@ -815,7 +817,7 @@ than 32 bits MUST be implemented with the splitting method, even if 64-bit arith
<sectionanchor="stereo"title="Stereo support">
<t>
When encoding a stereo stream, some parameters are shared across the left and right channels, while others are transmitted separately for each channel, or jointly encoded. Only one copy of the flags for the transients and pitch (pitch period and gains) features are transmitted. The coarse and fine energy parameters are transmitted separately for each channel. Both the coarse energy and fine energy (including the remaining fine bits at the end of the stream) have the left and right bands interleaved in the stream, with the left band encoded first.
When encoding a stereo stream, some parameters are shared across the left and right channels, while others are transmitted separately for each channel, or jointly encoded. Only one copy of the flags for the features, transients and pitch (pitch period and gains) are transmitted. The coarse and fine energy parameters are transmitted separately for each channel. Both the coarse energy and fine energy (including the remaining fine bits at the end of the stream) have the left and right bands interleaved in the stream, with the left band encoded first.
</t>
<t>
...
...
@@ -903,74 +905,74 @@ to the application that a problem has occurred.