Commit c10565bd, authored 15 years ago by Jean-Marc Valin
ietf doc: PVQ search
parent 59f67687
doc/ietf/draft-valin-celt-codec.xml (47 additions, 34 deletions)
@@ -84,19 +84,7 @@ audio with very low delay. It is suitable for encoding both
 speech and music at rates starting at 32 kbit/s. It is primarily designed for transmission
 over packet networks and protocols such as RTP <xref target="rfc3550"/>, but also includes
 a certain amount of robustness to bit errors, where this could be done at no significant
-cost. The codec features are:
-</t>
-<t>
-<list style="symbols">
-<t>Ultra-low algorithmic delay (typically 3 to 9 ms)</t>
-<t>Full audio bandwidth (44.1 kHz and 48 kHz)</t>
-<t>Support for both voice and music</t>
-<t>Stereo support</t>
-<t>Packet loss concealment</t>
-<t>Constant bit-rates from 32 kbps to 128 kbps and above</t>
-<t>Free software/open-source/royalty-free</t>
-</list>
+cost.
 </t>
 <t>The novel aspect of CELT compared to most other codecs is its very low delay,
...
...
@@ -134,10 +122,19 @@ the codec (version 0.3.2 and 0.5.1, respectively), the principles remain the sam
 </t>
 <t>CELT is a transform codec, based on the Modified Discrete Cosine Transform
-<xref target="mdct"/>, which is based on a DCT-IV, with overlap and time-domain
-aliasing cancellation.</t>
+<xref target="mdct"/>, derived from the DCT-IV, with overlap and time-domain
+aliasing cancellation. The main characteristics of CELT are as follows:
+<list style="symbols">
+<t>Ultra-low algorithmic delay (typically 3 to 9 ms)</t>
+<t>Full audio bandwidth (44.1 kHz and 48 kHz)</t>
+<t>Support for both speech and music</t>
+<t>Stereo support</t>
+<t>Robustness to packet loss</t>
+<t>Constant bit-rate from 32 kbps to 128 kbps and above</t>
+<t>Open source, with no known intellectual property issue</t>
+</list>
+</t>
 </section>
...
...
@@ -265,7 +262,7 @@ The CELT codec has several optional features that be switched on of off, some of
 <ttcol align='center'>P</ttcol>
 <ttcol align='center'>S</ttcol>
 <ttcol align='center'>F</ttcol>
-<ttcol align='center'>Encoding</ttcol>
+<ttcol align='right'>Encoding</ttcol>
 <c>0</c><c>0</c><c>0</c><c>1</c><c>00</c>
 <c>0</c><c>1</c><c>0</c><c>1</c><c>01</c>
 <c>1</c><c>0</c><c>0</c><c>1</c><c>110</c>
...
...
@@ -435,20 +432,45 @@ In bands where no pitch and no folding is used, the PVQ is used directly to enco
 the unit vector that results from the normalisation in
 <xref target="normalization"></xref>. Given a PVQ codevector y, the unit vector X is
 obtained as X = y/||y||, where ||.|| denotes the L2 norm. In the case where a pitch
-prediction or a folding vector P is used, the unit vector X becomes:
+prediction or a folding vector P is used, the quantized unit vector X' becomes:
 </t>
-<t>X = P + g_f * y,</t>
+<t>X' = P + g_f * y,</t>
 <t>where g_f = ( sqrt( (y^T*P)^2 + ||y||^2*(1-||P||^2) ) - y^T*P ) / ||y||^2.</t>
-<t>This is described in mix_pitch_and_residual() (<xref target="vq.c">vq.c</xref>).</t>
+<t>The combination of the pitch with the PVQ codeword is described in
+mix_pitch_and_residual() (<xref target="vq.c">vq.c</xref>) and is used in
+both the encoder and the decoder.</t>
<t>
The search for the best codevector y is performed by alg_quant()
(
<xref
target=
"vq.c"
>
vq.c
</xref>
). There are several possible approaches to the
search with a tradeoff between quality and complexity. The method used in the reference
implementation consists of first projecting the residual signal R = X - P onto the codebook
pyramid.
implementation computes an initial codeword y1 by projecting the residual signal
R = X - P onto the codebook pyramid of K-1 pulses:
</t>
<t>
y0 = round_towards_zero( (K-1) * R / sum(abs(R)))
</t>
<t>
Depending on N, K and the input data, the initial codeword y0 may contain from
0 to K-1 non-zero values. All the remaining pulses, with the exception of the last one,
are found iteratively with a greedy search that minimizes the normalised correlation
between y and R:
</t>
<t>
J = -R^T*y / ||y||
</t>
<t>
The last pulse is the only one considering the pitch and minimizes the cost function
<xref
target=
"celt-tasl"
></xref>
:
</t>
<t>
J = -g_f * R^T*y + (g_f)^2 * ||y||^2
</t>
<section
anchor=
"Index Encoding"
title=
"Index Encoding"
>
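The gain g_f and the two-stage search described in this hunk (projection onto the pyramid of K-1 pulses, then greedy pulse placement) can be sketched in Python as follows. This is an illustrative sketch only, not the code in vq.c: the names pitch_gain, mix and pvq_search are hypothetical, K >= 1 and ||P|| <= 1 are assumed, the all-zero-residual fallback is an assumption, and for simplicity the correlation criterion is applied to every remaining pulse, so the pitch-aware cost for the last pulse is not implemented.

```python
import math


def pitch_gain(y, p):
    """Gain g_f from the draft text, chosen so that P + g_f*y has unit L2 norm.

    Assumes y is non-zero and ||P|| <= 1.
    """
    yp = sum(a * b for a, b in zip(y, p))  # y^T * P
    yy = sum(a * a for a in y)             # ||y||^2
    pp = sum(b * b for b in p)             # ||P||^2
    return (math.sqrt(yp * yp + yy * (1.0 - pp)) - yp) / yy


def mix(y, p):
    """Return X' = P + g_f * y (hypothetical helper, not the vq.c function)."""
    g = pitch_gain(y, p)
    return [b + g * a for a, b in zip(y, p)]


def pvq_search(r, k):
    """Place exactly k (k >= 1) signed unit pulses so that y correlates with r."""
    n = len(r)
    l1 = sum(abs(v) for v in r)
    if l1 == 0.0:
        # Assumption: for an all-zero residual, put every pulse in position 0.
        return [k] + [0] * (n - 1)
    # Initial projection onto the pyramid of K-1 pulses; int() rounds towards zero.
    y = [int((k - 1) * v / l1) for v in r]
    ry = sum(a * b for a, b in zip(r, y))  # R^T * y
    yy = sum(a * a for a in y)             # ||y||^2
    for _ in range(k - sum(abs(a) for a in y)):
        best_j, best_corr = 0, -math.inf
        for j in range(n):
            # A pulse with the sign of r[j] raises R^T*y by |r[j]|
            # and ||y||^2 by 2*|y[j]| + 1.
            corr = (ry + abs(r[j])) / math.sqrt(yy + 2 * abs(y[j]) + 1)
            if corr > best_corr:  # maximising corr minimises J = -R^T*y / ||y||
                best_j, best_corr = j, corr
        ry += abs(r[best_j])
        yy += 2 * abs(y[best_j]) + 1
        y[best_j] += 1 if r[best_j] >= 0 else -1
    return y
```

Under these assumptions, pvq_search always returns a codeword whose absolute values sum to K, and mix(y, p) returns a vector of unit L2 norm, which is the property the g_f formula is built to guarantee.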
...
...
@@ -570,6 +592,8 @@ significant non-uniformity.
 </section>
+<!--
 <section anchor="Evaluation of CELT Implementations" title="Evaluation of CELT Implementations">
 <t>
...
...
@@ -578,18 +602,7 @@ Insert some text here.
 </section>
-<section anchor="Issues that need to be addressed" title="Issues that need to be addressed">
-<t>
-<list>
-<t>Dynamic bit allocation</t>
-<t>Stereo coupling</t>
-</list>
-</t>
-</section>
+-->
 <section anchor="Acknowledgments" title="Acknowledgments">
...
...