- Feb 01, 2011
-
-
9b34bd83 caused serious regressions for 240-sample frame stereo, because the previous qb limit was _always_ hit for two-phase stereo. Two-phase stereo really does operate with a different model (for example, the single bit allocated to the side should probably be thought of as a sign bit for qtheta, but we don't count it as part of qtheta's allocation). The old code was equivalent to a separate two-phase offset of 12; however, Greg Maxwell's testing demonstrates that 16 performs best.
-
Previously, we would only split a band if it was allocated more than 32 bits. However, the N=4 codebook can only produce about 22.5 bits, and two N=2 bands combined can only produce 26 bits, including 8 bits for qtheta, so if we wait until we allocate 32, we're guaranteed to fall short. Several of the larger bands come pretty far from filling 32 bits as well, though their split versions will. Greg Maxwell also suggested adding an offset to the threshold to account for the inefficiency of using qtheta compared to another VQ dimension. This patch uses 1 bit as a placeholder, as it's a clear improvement, but we may adjust this later after collecting data on more possibilities over more files.
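The codebook limits quoted above can be checked by counting codewords. This is a rough sketch, not the libcelt code: the function name and the 128-pulse limit are assumptions made for illustration. The number of codewords V(N,K) for N dimensions and K pulses follows the recurrence V(N,K) = V(N-1,K) + V(N,K-1) + V(N-1,K-1). Under that assumed limit, an N=2 band has 512 codewords (9 bits), so two such bands plus 8 bits of qtheta give 26 bits, and an N=4 band has 5593088 codewords, roughly 22.4 bits.

```c
#include <stdint.h>

/* Assumed pulse limit, for illustration only. */
#define PVQ_MAX_K 128

/* Number of pulse-vector codewords with n dimensions and k pulses:
   V(n,k) = V(n-1,k) + V(n,k-1) + V(n-1,k-1), V(n,0) = 1, V(0,k>0) = 0.
   Returns 0 for arguments outside the supported range. */
uint64_t pvq_codebook_size(int n, int k)
{
    uint64_t row[PVQ_MAX_K + 1];
    int i, j;

    if (n < 0 || k < 0 || k > PVQ_MAX_K)
        return 0;

    /* Row for zero dimensions. */
    row[0] = 1;
    for (j = 1; j <= k; j++)
        row[j] = 0;

    /* Build each following row in place. */
    for (i = 1; i <= n; i++) {
        uint64_t diag = row[0];        /* V(i-1, j-1) */
        for (j = 1; j <= k; j++) {
            uint64_t up = row[j];      /* V(i-1, j) */
            row[j] = up + row[j - 1] + diag;
            diag = up;
        }
    }
    return row[k];
}
```

A threshold of 32 bits is therefore above what either the unsplit N=4 band or the pair of N=2 bands can absorb.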
-
- Jan 31, 2011
-
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Timothy B. Terriberry authored
The first version of the mono decoder with stereo output collapsed the historic energy values stored for anti-collapse down to one channel (by taking the max). This means that a subsequent switch back would continue using the maximum of the two values instead of the original history, which would make anti-collapse produce louder noise (and potentially more pre-echo than otherwise). This patch moves the max into the anti_collapse function itself, and does not store the values back into the source array, so the full stereo history is maintained if subsequent frames switch back. It also fixes an encoder mismatch, which never took the max (assuming, apparently, that the output channel count would never change).
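The idea of taking the max at the point of use, rather than writing it back, can be sketched as follows. This is not the actual anti_collapse code: the helper name, the float type, and the channel-major layout of the history array are all assumptions for illustration.

```c
/* Hypothetical helper. hist holds two channels of per-band energy history,
   laid out as hist[channel * nbands + band] (assumed layout).
   Returns the value to use for one band and never modifies hist. */
float anti_collapse_history(const float *hist, int nbands, int band,
                            int channel, int coded_channels)
{
    float e = hist[channel * nbands + band];

    if (coded_channels == 1) {
        /* Mono stream with stereo history: use the louder of the two
           stored values, but leave both stored values untouched. */
        float other = hist[(1 - channel) * nbands + band];
        if (other > e)
            e = other;
    }
    return e;
}
```

Because the stored array is read-only here, a later switch back to stereo still sees the original per-channel values.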
-
Timothy B. Terriberry authored
Instead of just dumping excess bits into the first band after allocation, use them to initialize the rebalancing loop in quant_all_bands(). This allows these bits to be redistributed over several bands, like normal.
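A toy model of seeding a rebalancing loop with the excess bits is sketched below. It is not the quant_all_bands() code: the function name, the per-band cap array, and the choice to spread the balance over up to three upcoming bands are assumptions for illustration.

```c
/* Hypothetical sketch. alloc[i] is the nominal allocation of band i,
   cap[i] the most that band can use. balance starts as the excess left
   over after allocation. Writes the bits used per band to out[] and
   returns the balance that remains at the end. */
int distribute_bits(const int *alloc, const int *cap, int nbands,
                    int balance, int *out)
{
    int i;

    for (i = 0; i < nbands; i++) {
        int remaining = nbands - i;
        int spread = remaining < 3 ? remaining : 3;
        int share = balance / spread;      /* part of the balance offered */
        int budget = alloc[i] + share;
        int used = budget < cap[i] ? budget : cap[i];

        out[i] = used;
        /* Anything the band did not use stays in the balance. */
        balance += alloc[i] - used;
    }
    return balance;
}
```

With three bands of 10 bits each and 9 excess bits, every band ends up with 13 bits instead of the first band receiving all 9.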
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
The average caps over all values of LM and C are well below the target allocations of the last two modelines. Lower them to the caps, to prevent hitting them quite so early. This helps quality at medium-high rates, in the 180-192 kbps range.
-
Use measured cross-entropy to estimate the real cost of coding qtheta given the allocated qb parameter, instead of the entropy of the PDF. This is generally much lower, and reduces waste at high rates. This patch also removes some intermediate rounding from this computation.
-
This extends the previous rebalancing for fine energy in N=1 bands to also allocate extra fine bits for bands that go over their cap.
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
- Jan 30, 2011
-
-
Jean-Marc Valin authored
-
The previous "dumb cap" of (64<<LM)*(C<<BITRES) was not actually achievable by many (most) bands, and did not take the cost of coding theta for splits into account, and so was too small for some bands. This patch adds code to compute a fairly accurate estimate of the real maximum per-band rate (an estimate only because of rounding effects and the fact that the bit usage for theta is variable), which is then truncated and stored in an 8-bit table in the mode. This gives improved quality at all rates over 160 kbps/channel, prevents bits from being wasted all the way up to 255 kbps/channel (the maximum rate allowed, and approximately the maximum number of bits that can usefully be used regardless of the allocation), and prevents dynalloc and trim from producing enormous waste (eliminating the need for encoder logic to prevent this).
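For reference, the old "dumb cap" expression can be evaluated directly. In this sketch the function name is hypothetical, and passing the fractional-bit resolution as a parameter is a choice made here; a BITRES of 3 (eighth-bit units) is an assumption, not something stated in this log.

```c
/* Old per-band cap, (64<<LM)*(C<<BITRES), in units of 1/(1<<bitres) bits.
   LM is the frame-size shift and C the channel count, as in the
   expression quoted in the commit message. */
int dumb_cap(int LM, int C, int bitres)
{
    return (64 << LM) * (C << bitres);
}
```

With the assumed bitres of 3, dumb_cap(0, 1, 3) is 512 (64 whole bits) and dumb_cap(3, 2, 3) is 8192 (1024 whole bits), regardless of how many bits the band's codebook can actually absorb.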
-
Jean-Marc Valin authored
-
- Jan 29, 2011
-
-
Jean-Marc Valin authored
We use the MDCT as a low-pass filter.
-
Previously, in a stereo split with itheta==16384, but without enough bits left over to actually code a pulse, the target band would completely collapse, because the mid gain would be zero and we don't fold the side. This changes the limit to ensure that we never set qn>1 unless we know we'll have enough bits for at least one pulse. This should eliminate the last possible whole-band collapse.
-
Jean-Marc Valin authored
The old constructor is renamed celt_encoder_create_custom(). Same for the decoder.
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
- Jan 28, 2011
-
-
Prevent VBR from shooting up to the maximum rate if set to very low target rates, and prevent the encoder VBR from producing 1 byte frames (which are no longer allowed).
-
Jean-Marc Valin authored
-
- Jan 27, 2011
-
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
The energy memory can be lowered (not increased) during a transient
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
That way they can be exact in 16 bits once multiplied by the gain
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
We now encode the highest bitrate part of the split first and transfer any unused bits to the other part. We use a dead zone of three bits to prevent redistributing in cases of random fluctuation (or else we will statistically lower the allocation of higher frequencies at low-mid bitrates).
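The transfer with a dead zone can be sketched as below. This is an illustration, not the libcelt code: the function name is hypothetical, the budgets are in whole bits rather than fractional units, and subtracting the dead zone from the transferred amount is an assumption.

```c
/* Dead zone from the commit message, in whole bits. */
#define DEAD_ZONE_BITS 3

/* Hypothetical sketch: the first (larger) part has been coded with
   first_budget bits and actually used first_used. Returns the budget
   for the second part after any transfer. */
int second_part_budget(int first_budget, int first_used, int second_budget)
{
    int unused = first_budget - first_used;

    /* Small leftovers are treated as random fluctuation and ignored. */
    if (unused > DEAD_ZONE_BITS)
        second_budget += unused - DEAD_ZONE_BITS;
    return second_budget;
}
```

For example, 10 unused bits raise a 20-bit budget to 27, while 2 or 3 unused bits leave it at 20.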
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
Makes celt_exp2() use Q10 input to avoid problems on very low energy. Also makes the pitch downsampling more conservative on gain to avoid problems later.
-
Jean-Marc Valin authored
-
-
Jean-Marc Valin authored
Also defining a 1-byte packet as triggering the PLC/CNG
-
This changes folding so that the LCG is never used on transients (either short blocks or long blocks with increased time resolution), except in the case that there's not enough decoded spectrum to fold yet. It also now only subtracts the anti-collapse bit from the total allocation in quant_all_bands() when space has actually been reserved for it. Finally, it cleans up some of the fill and collapse_mask tracking (this tracking was originally made intentionally sloppy to save work, but then converted to replace the existing fill flag at the last minute, which can have a number of logical implications). The changes, in particular:
1) Splits of less than a block now correctly mark the second half as filled only if the whole block was filled (previously it would also mark it filled if the next block was filled).
2) Splits of less than a block now correctly mark a block as un-collapsed if either half was un-collapsed, instead of marking the next block as un-collapsed when the high half was.
3) The N=2 stereo special case now keeps its fill mask even when itheta==16384; previously this would have gotten cleared, despite the fact that we fold into the side in this case.
4) The test against fill for folding now only considers the bits corresponding to the current set of blocks. Previously it would still fold if any later block was filled.
5) The collapse mask used for the LCG fold data is now correctly initialized when B=16 on platforms with a 16-bit int.
6) The high bits on a collapse mask are now cleared after the TF resolution changes and interleaving at level 0, instead of waiting until the very end. This prevents extraneous high flags set on mid from being mixed into the side flags for mid-side stereo.
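Items 5 and 6 both concern masks that hold one bit per block. A minimal sketch, assuming B is the block count and is at most 16 (the function names are hypothetical): where int is 16 bits wide, 1<<16 cannot be represented in an int, so the mask is built in unsigned long, which is at least 32 bits wide on any conforming C implementation.

```c
/* Mask with the low B bits set, one bit per block. Built in unsigned long
   so that B == 16 is safe even where int is only 16 bits wide. */
unsigned long block_mask(int B)
{
    return (1UL << B) - 1;
}

/* Drop any flags above the B blocks currently in use, so stray high bits
   cannot leak into later mask operations. */
unsigned long clear_high_bits(unsigned long collapse_mask, int B)
{
    return collapse_mask & block_mask(B);
}
```

Clearing early, as in item 6, means any later combination of the mid and side masks only ever sees bits for blocks that exist.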
-