- 05 Feb, 2013 1 commit
-
-
Ronald S. Bultje authored
This patch adds column-based tiling. The idea is to make each tile independently decodable (after reading the common frame header) and also independently encodable (minus within-frame cost adjustments in the RD loop), to speed up hardware and software en/decoders that use multi-threading. Column-based tiling has the added advantage (over other tiling methods) that it minimizes latency in realtime use cases, since all threads can start encoding data as soon as the first SB-row worth of data is available to the encoder. There is some test code that does random tile ordering in the decoder, to confirm that each tile is indeed independently decodable from other tiles in the same frame. At tile edges, all contexts assume default values (i.e. 0, 0 motion vector, no coefficients, DC intra4x4 mode), and motion vector search and ordering do not cross tiles in the same frame. Tile independence is not maintained between frames ATM, i.e. tile 0 of frame 1 is free to use motion vectors that point into any tile of frame 0. We support 1 (i.e. no tiling), 2 or 4 column-tiles. The loopfilter crosses tile boundaries. I discussed this briefly with Aki and he says that's OK. An in-loop loopfilter would need to do some sync between tile threads, but that shouldn't be a big issue. Results: with tiling disabled, we go up slightly because of improved edge use in the intra4x4 prediction. With 2 tiles, we lose ~1% on derf, ~0.35% on HD and ~0.55% on STD/HD. With 4 tiles, we lose another ~1.5% on derf, ~0.77% on HD and ~0.85% on STD/HD. Most of this loss is concentrated in the low-bitrate end of clips, and most of it is because of the loss of edges at tile boundaries and the resulting loss of intra predictors. TODO:
- more tiles (perhaps allow row-based tiling also, and max. 8 tiles)?
- maybe optionally (for EC purposes), motion vectors themselves should not cross tile edges, or we should emulate such borders as if they were off-frame, to limit error propagation to within one tile only. This doesn't have to be the default behaviour but could be an optional bitstream flag. Change-Id: I5951c3a0742a767b20bc9fb5af685d9892c2c96f
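As a rough illustration of the column split described in this commit message, the sketch below divides a frame width into 1, 2 or 4 column tiles aligned to superblock columns. The 64-pixel superblock width, the helper name `tile_col_bounds` and the even-split rounding are assumptions made for illustration; they are not taken from the patch itself.

```python
SB_SIZE = 64  # assumed superblock width in pixels (not stated in the commit)


def tile_col_bounds(frame_width, num_tiles):
    """Return (start, end) superblock-column ranges, one per column tile.

    Hypothetical helper: splits the superblock columns as evenly as
    integer division allows. The end index is exclusive.
    """
    if num_tiles not in (1, 2, 4):
        raise ValueError("only 1, 2 or 4 column tiles are supported")
    # Number of superblock columns, rounding a partial column up.
    sb_cols = (frame_width + SB_SIZE - 1) // SB_SIZE
    edges = [(i * sb_cols) // num_tiles for i in range(num_tiles + 1)]
    return [(edges[i], edges[i + 1]) for i in range(num_tiles)]
```

Because every tile covers a contiguous range of superblock columns for the full frame height, each thread could begin work on its own tile once the first superblock row is available, which is the latency argument made in the message.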
-
- 30 Jan, 2013 1 commit
-
-
Ronald S. Bultje authored
Change-Id: Icb6e21dc0c2d9918faa33c8bf70943660df7ad88
-
- 28 Jan, 2013 1 commit
-
-
Paul Wilkins authored
First step in simplifying the segment mode and segment EOB flags into a simpler segment skip flag that implies 0,0 mv and EOB at position 0. Change-Id: Ib750cac31a7a02dc21082580498efd9f7d8d72a5
-
- 14 Jan, 2013 1 commit
-
-
Ronald S. Bultje authored
This experiment yields little gain and adds a relatively large amount of code complexity (and it hinders other experiments), so let's get rid of it. Change-Id: Id25e79a137a1b8a01138aa27a1fa0ba4a2df274a
-
- 13 Jan, 2013 1 commit
-
-
Deb Mukherjee authored
Fixes some scaling issues. Adds an option to only compute the dct on the low-low subband for 32x32 and 64x64 blocks using only a single 16x16 dct after 1 and 2 wavelet decomposition levels respectively. Also adds an option to use an 8x8 dct as building block. Currently, with the 2/6 filter and with a single 16x16 dct on the low-low band, the results compared to the full 32x32 dct are as follows: derf: -0.15%, yt: -0.29%, std-hd: -0.18%, hd: -0.6%. These are my current recommended settings, since the 2/6 filter is very simple. Results with 8x8 dct are about 0.3% worse. Change-Id: I00100cdc96e32deced591985785ef0d06f325e44
-
- 10 Jan, 2013 2 commits
-
-
Ronald S. Bultje authored
Change-Id: I615651e4c7b09e576a341ad425cf80c393637833
-
Ronald S. Bultje authored
Change-Id: If6c88752dffdb566f8d4322f135145270716fb8e
-
- 09 Jan, 2013 1 commit
-
-
Adrian Grange authored
This patch removes the old pred-filter experiment and replaces it with one that is implemented using the switchable filter framework. If the pred-filter experiment is enabled, three interpolation filters are tested during mode selection: the standard 8-tap interpolation filter, a sharp 8-tap filter and a (new) 8-tap smoothing filter. The 6-tap filter code has been preserved for now, and if the enable-6tap experiment is enabled (in addition to the pred-filter experiment), the original 6-tap filter replaces the new 8-tap smooth filter in the switchable mode. The new experiment applies the prediction filter in cases of a fractional-pel motion vector. Future patches will apply the filter where the mv is pel-aligned and also to intra predicted blocks. Change-Id: I08e8cba978f2bbf3019f8413f376b8e2cd85eba4
-
- 08 Jan, 2013 1 commit
-
-
Ronald S. Bultje authored
Change-Id: I0df99742029834a85c4933652b0587cf5b6b2587
-
- 06 Jan, 2013 1 commit
-
-
Ronald S. Bultje authored
3.2% gains on std/hd, 1.0% gains on hd. Change-Id: I481d5df23d8a4fc650a5bcba956554490b2bd200
-
- 02 Jan, 2013 1 commit
-
-
Paul Wilkins authored
Part of NEW_MVREF experiment. Added update-able probabilities. Change-Id: I5a4fcf4aaed1d0d1dac980f69d535639a3d59401
-
- 26 Dec, 2012 1 commit
-
-
John Koleszar authored
Various fixups to resolve issues when building vp9-preview under the more stringent checks placed on the experimental branch. Change-Id: I21749de83552e1e75c799003f849e6a0f1a35b07
-
- 18 Dec, 2012 1 commit
-
-
Ronald S. Bultje authored
For coefficients, use int16_t (instead of short); for pixel values in 16-bit intermediates, use uint16_t (instead of unsigned short); for all others, use uint8_t (instead of unsigned char). Change-Id: I3619cd9abf106c3742eccc2e2f5e89a62774f7da
-
- 11 Dec, 2012 1 commit
-
-
Yaowu Xu authored
Change-Id: I0c1be01aae933243311ad321b6c456adaec1a0f5
-
- 08 Dec, 2012 1 commit
-
-
Yaowu Xu authored
This commit changed the ENTROPY_CONTEXT conversion between MBs that have different transform sizes. In addition, this commit also did a number of cleanups/bug fixes:
1. removed duplicate function vp9_fix_contexts() and changed to use vp8_reset_mb_token_contexts() for both encoder and decoder
2. fixed a bug in stuff_mb_16x16 where the wrong context was used for the UV.
3. changed to reset all contexts to 0 if an MB is skipped, to simplify the logic. Change-Id: I7bc57a5fb6dbf1f85eac1543daaeb3a61633275c
-
- 07 Dec, 2012 1 commit
-
-
Ronald S. Bultje authored
This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds code all over the place to wrap that in the bitstream/encoder/decoder/RD. Some implementation notes (these probably need careful review):
- token range is extended by 1 bit, since the value range out of this transform is [-16384,16383].
- the coefficients coming out of the FDCT are manually scaled back by 1 bit, or else they won't fit in int16_t (they are 17 bits). Because of this, the RD error scoring does not right-shift the MSE score by two (unlike for 4x4/8x8/16x16).
- to compensate for this loss in precision, the quantizer is halved also. This is currently a little hacky.
- FDCT and IDCT are double-only right now. Needs a fixed-point impl.
- There are no default probabilities for the 32x32 transform yet; I'm simply using the 16x16 luma ones. A future commit will add newly generated probabilities for all transforms.
- No ADST version. I don't think we'll add one for this level; if an ADST is desired, transform-size selection can scale back to 16x16 or lower, and use an ADST at that level.
Additional notes specific to Debargha's DWT/DCT hybrid:
- coefficient scale is different for the top/left 16x16 (DCT-over-DWT) block than for the rest (DWT pixel differences) of the block. Therefore, RD error scoring isn't easily scalable between coefficient and pixel domain. Thus, unfortunately, we need to compute the RD distortion in the pixel domain until we figure out how to scale these appropriately. Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b
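The bit-width arithmetic in the implementation notes can be checked with a short sketch. It assumes "17 bits" means a signed 17-bit value and that the scaling is a plain arithmetic right shift; the link between the halved coefficients and the dropped right-shift by two is one reading of the notes, not something the message spells out. All names are illustrative.

```python
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def fits_int16(value):
    """True if value is representable in a signed 16-bit integer."""
    return INT16_MIN <= value <= INT16_MAX


# Extremes of a signed 17-bit value (assumed meaning of "17 bits").
raw_min, raw_max = -(1 << 16), (1 << 16) - 1

# Scaling back by one bit brings both extremes into int16_t range.
scaled_min, scaled_max = raw_min >> 1, raw_max >> 1


def sse(values):
    """Sum of squared errors."""
    return sum(v * v for v in values)


# Halving every error term quarters the squared error, so the halved
# values already carry the factor that an extra ">> 2" would apply.
errors = [8, -12, 20, 4]
halved = [e >> 1 for e in errors]
```

Here `sse(halved)` equals `sse(errors) >> 2`, which is consistent with skipping the shift for the 32x32 case.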
-
- 05 Dec, 2012 1 commit
-
-
Paul Wilkins authored
This patch reduces the cpu cost of the MV ref search by only allowing insert for candidates that would be in the current top 4. This could alter the outcome and slightly favors near candidates, which are tested first, but it also limits the worst-case loop count to 4 and means that in many cases the insert will drop out and not happen. Change-Id: Idd795a825f9fd681f30f4fcd550c34c38939e113
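The bounded insert described here can be sketched as follows. This is a minimal illustration, assuming candidates are kept in a list sorted by descending score; the scoring itself, the name `insert_candidate` and the tuple layout are hypothetical and not taken from the patch.

```python
MAX_CANDIDATES = 4  # only the current top 4 are kept


def insert_candidate(candidates, mv, score):
    """Insert (score, mv) into a list sorted by descending score.

    Returns False without touching the list when the candidate would not
    make the current top 4. Ties keep the earlier entry ahead, so
    candidates tested first are slightly favored.
    """
    if len(candidates) == MAX_CANDIDATES and score <= candidates[-1][0]:
        return False  # early drop-out: no insert happens
    pos = 0
    # At most MAX_CANDIDATES iterations, since the list never grows past 4.
    while pos < len(candidates) and candidates[pos][0] >= score:
        pos += 1
    candidates.insert(pos, (score, mv))
    del candidates[MAX_CANDIDATES:]  # drop whatever fell out of the top 4
    return True
```

The early return is what bounds the cost: once four candidates are held, a weaker or equal candidate is rejected after a single comparison.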
-
- 30 Nov, 2012 1 commit
-
-
Jim Bankoski authored
Change-Id: I2c252f3ddcc99e96c1f5d3dab8bcb25a2a3637ea
-
- 29 Nov, 2012 2 commits
-
-
Jim Bankoski authored
Change-Id: Ieefd76e164ca4aa87597da0412977614ddfbacb7
-
Deb Mukherjee authored
This patch allows use of 8x8 and 4x4 ADST correctly for Intra 16x16 modes and Intra 8x8 modes when the block size selected is smaller than the prediction mode. Also includes some cleanups and refactoring. Rebase. Change-Id: Ie3257bdf07bdb9c6e9476915e3a80183c8fa005a
-
- 28 Nov, 2012 1 commit
-
-
Jim Bankoski authored
Change-Id: Ia1cce221f8511561b9cbd8edb7726fbc286ff243
-
- 27 Nov, 2012 1 commit
-
-
John Koleszar authored
Support for gyp which doesn't support multiple objects in the same static library having the same basename. Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
-
- 25 Nov, 2012 1 commit
-
-
Jim Bankoski authored
More cleanup to do after this, but this is a good chunk of removing rtcd. Change-Id: I551db75e341a0a85c3ad650df1e9a60dc305681a
-
- 16 Nov, 2012 1 commit
-
-
Deb Mukherjee authored
A patch on compound inter-intra prediction. In compound inter-intra prediction, a new predictor for 16x16 inter coded MBs is obtained by combining a single inter predictor with a 16x16 intra predictor, in such a manner that the weight varies with distance from the top/left boundary. The current search strategy is to combine the best inter mode with the best intra mode obtained independently. Results so far: derf +0.31%, yt +0.32%, std-hd +0.35%, hd +0.42%. It is conceivable that the results would improve somewhat with a more thorough search strategy where all intra modes are searched given the best mv, or even a joint search for the best mv and the best intra mode. Change-Id: I7951f1ed0d6eb31ca32ac24d120f1585bcd8d79b
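To make the distance-dependent weighting concrete, here is a minimal sketch of blending an inter predictor with an intra predictor. The linear weight falloff, the 6-bit weight precision and the function name are all assumptions chosen for illustration; the commit message does not give the actual weight function.

```python
def blend_inter_intra(inter, intra):
    """Blend two equally sized square predictor blocks (lists of rows).

    Assumed weighting: the intra weight is highest next to the top/left
    boundary and falls off linearly with distance from it.
    """
    n = len(inter)
    out = []
    for r in range(n):
        row = []
        for c in range(n):
            d = min(r, c)  # distance from the nearer of the top/left edges
            w_intra = 32 * (n - d) // n  # out of 64; hypothetical falloff
            w_inter = 64 - w_intra
            # Weighted average with rounding, in 6-bit fixed point.
            row.append((w_intra * intra[r][c] + w_inter * inter[r][c] + 32) >> 6)
        out.append(row)
    return out
```

With this choice the intra predictor contributes most where its reference pixels are closest, and the inter predictor dominates toward the bottom-right of the block.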
-
- 14 Nov, 2012 1 commit
-
-
Ronald S. Bultje authored
This is in line with other cases where we disable ADST if prediction size and transform size don't match. Before this patch, the RD loop will use ADST for superblocks, but frame encoding/decoding won't. Change-Id: I700368c632eb72b5e089c22ef25649d99d7697d0
-
- 13 Nov, 2012 1 commit
-
-
Deb Mukherjee authored
This fix ensures that the forward prob update is not turned off for motion vectors. Change-Id: I0b63c9401155926763c6294df6cca68b32bac340
-
- 12 Nov, 2012 2 commits
-
-
Paul Wilkins authored
This change is a fix / extension of the newbestrefmv experiment. As such it is presented without IFDEF. The change creates a new context for coding inter modes in vp9_find_mv_refs(). This replaces the context that was previously calculated in vp9_find_near_mvs(). The new context is unoptimized and not necessarily any better at this stage (results pending), but eliminates the need for a legacy call to vp9_find_near_mvs(). Based on numbers from Scott, this could help decode speed by several %. In a later patch I will add support for forward update of context (assuming this helps) and refine the context as necessary. Change-Id: I1cd991b82c8df86cc02237a34185e6d67510698a
-
Paul Wilkins authored
Experiment to test the speed trade-off of reducing the extent of the ref mv search. Reducing the maximum number of tested candidates to 9 had minimal net effect on quality in any of the test sets. Reduction to 7 has a small negative impact (worst was STD-HD at about -0.2%). This change is in response to the apparently high number of decode cycles reported in regard to mv-ref selection. Change-Id: I0e92e92e324337689358495a1ec9ccdeb23dc774
-
- 10 Nov, 2012 1 commit
-
-
Deb Mukherjee authored
Preliminary patch on a new 4x4 intra mode B_CONTEXT_PRED where the dominant direction from the context is used to encode. Various decoder changes are needed to support decoding of B_CONTEXT_PRED in conjunction with hybrid transforms since the scan order and tokenization depends on the actual direction of prediction obtained from the context. Currently the traditional directional modes are used in conjunction with the B_CONTEXT_PRED, which also seems to provide the best results. The gains are small - in the 0.1% range. Change-Id: I5a7ea80b5218f42a9c0dfb42d3f79a68c7f0cdc2
-
- 07 Nov, 2012 1 commit
-
-
Yaowu Xu authored
Change-Id: Ib39ad47a7d188f3b45416937b7eeb28c3e79b74c
-
- 06 Nov, 2012 2 commits
-
-
James Zern authored
s/([vV][pP])8/$19/. Additionally, dct.h was removed; declare the _c functions that are used in the tests. The TODO for conversion to parameterized tests still remains. Change-Id: I73db9425a57075bbb78a92693ba6b320578981cd
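The substitution in this message is written in sed/Perl-style shorthand. A rough Python equivalent is sketched below; it uses the `\g<1>` form because a bare group reference followed by the digit 9 would be read as group 19. The function name and the sample strings are illustrative only.

```python
import re


def vp8_to_vp9(text):
    """Replace "vp8"/"VP8" (any letter case) with the matching "vp9"/"VP9"."""
    # \g<1> re-emits the captured prefix; the literal 9 follows it.
    return re.sub(r"([vV][pP])8", r"\g<1>9", text)
```

For example, `vp8_to_vp9("vp8_reset_mb_token_contexts VP8_COMMON")` yields `"vp9_reset_mb_token_contexts VP9_COMMON"`, preserving the original letter case of the prefix.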
-
Yaowu Xu authored
There are still a couple of types of warning left, which are related to double constants assigned to float type. As those will be addressed by the conversion of the transforms to integer versions, this commit leaves them as they are. Change-Id: I48fd9b489c0c27ad6b543f4177423419f929f2bb
-
- 02 Nov, 2012 1 commit
-
-
Yunqing Wang authored
The block sizes for decoding tokens are up to 16x16, which means eobs is within [0, 256]. Using (signed) char is not enough. Changed eobs data type to unsigned short to fix the problem. Change-Id: I88a7d3098e1f1604c336d6adb88ffec971fb03a6
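The range problem can be shown with a small sketch that uses Python's `ctypes` fixed-width integers, which wrap silently on overflow much like a narrowing store in C. The helper name `store_as` is hypothetical; the 256 figure is the 16x16 coefficient count from the message.

```python
import ctypes

MAX_EOB_16X16 = 16 * 16  # 256: the largest eob value for a 16x16 block


def store_as(ctype, value):
    """Return what `value` becomes after being stored in a fixed-width C integer."""
    return ctype(value).value


as_signed_char = store_as(ctypes.c_int8, MAX_EOB_16X16)      # wraps around
as_unsigned_short = store_as(ctypes.c_uint16, MAX_EOB_16X16)  # preserved
```

An 8-bit type tops out at 127 (signed) or 255 (unsigned), so 256 cannot survive the store in either case, while a 16-bit unsigned type holds it with room to spare.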
-
- 01 Nov, 2012 2 commits
-
-
Ronald S. Bultje authored
Change-Id: Ic084c475844b24092a433ab88138cf58af3abbe4
-
Ronald S. Bultje authored
For non-static functions, change the prefix to vp9_. For static functions, remove the prefix. Also fix some comments, remove unused code or unused function prototypes. Change-Id: I1f8be05362f66060fe421c3d4c9a906fdf835de5
-
- 31 Oct, 2012 3 commits
-
-
Ronald S. Bultje authored
This change encompasses VP8_PTR, VP8_COMP, VP8D_COMP, VP8_COMMON, VP8Decompressor and VP8Common. Change-Id: I514ef4ad4e682370f36d656af1c09ee20da216ad
-
Ronald S. Bultje authored
For local symbols, make them static instead. Change-Id: I13d60947a46f711bc8991e16100cea2a13e3a22e
-
Ronald S. Bultje authored
Change-Id: Ic5a5f60e1ff9d9ccae4174160d36529466eeb509
-
- 30 Oct, 2012 1 commit
-
-
Paul Wilkins authored
Delete code relating to featureupdates experiment. Change-Id: If218762c658bb8cbb3007cf2069123b3e05adcbc
-
- 29 Oct, 2012 1 commit
-
-
Jim Bankoski authored
Change-Id: I321280abcf48f3dc16e194d29bde2bd3baec6006
-