- Oct 15, 2013
-
-
Jingning Han authored
Change-Id: I2d96940fae4c7a16661a43c2bf6907d8b1c1a127
-
Jingning Han authored
Use the zcoeff_blk buffer of PICK_MODE_CONTEXT to store the indexes of all-zero-coeff block of the current best mode. Remove the temporary buffer best_zcoeff_blk defined in the rate-distortion optimization loop. This improves the speed performance by about 0.5% in all speed settings. Change-Id: Ie3e15988ddfa581eafa2e19a8228d3fe4a46095c
-
- Oct 14, 2013
-
-
Jingning Han authored
This commit moves token_cache buffer into macroblock struct, instead of defining as a local variable in cost_coeffs. This avoids repeatedly re-allocating memory space in the rate-distortion optimization loop. The runtime at speed 0 reduces: bus 2000kbps, 161692ms to 159951ms football 600kbps, 229505ms to 225821ms Change-Id: If7da6b0b6d8c5138a16271a33c4548fba33d8840
-
- Oct 11, 2013
-
-
Yaowu Xu authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Yaowu Xu authored
The commit changes to mask available intra prediction modes for test based on prediction block size. With this patch, encoding time of CpuUsed 2 reduces from 10% to 20% for HD clips with a compression drop of 0.2% Change-Id: I65f320f1237c0f5ae3a355bf7caf447f55625455
-
Yunqing Wang authored
-
Jingning Han authored
-
Yunqing Wang authored
Minor code cleanup. Change-Id: I47c1f794842d4570bb39cfd23b80f54f5606bba6
-
Paul Wilkins authored
-
Paul Wilkins authored
-
Yunqing Wang authored
-
Paul Wilkins authored
When the codec in VBR (or cq) mode hits its max q limits and is struggling to hit a target bandwidth, the bit target per frame collapses. In the first instance normal frames cap out at the maximum allowed Q and then the ARF and GFs do the same. This latter behavior is not generally desirable as GFs and ARFs are only effective from a quality and data rate perspective if they have at lease some level of -Q delta compared to the surrounding frames. In this patch I define a separate max Q for GFs and ARFs that is derived from but somewhat lower than that defined for normal frames. In effect there is a minimum Q delta that will always be available for GFs and ARFs regardless of the target rate and MAXQ setting. This may of course mean that the absolute lowest rate obtainable for a given clip is somewhat higher. Change-Id: I268868b28401900d0cd87e51e609cd3b784ab54a
-
Paul Wilkins authored
For VBR coding disable the recode loop for speeds > 0. Results pending. Change-Id: I2cd9a87c3fcbe39c05b954798d0671a4ca62c37f
-
Paul Wilkins authored
Change-Id: I5222e3db2627a3a9f7fc34f2ab4554aa5807ed51
-
Dmitry Kovalev authored
It is used only two times and it is more clear to use real type instead of typedef. Change-Id: Idc25c16504c3da4d040e0cdb33a2987631bb6a5b
-
- Oct 10, 2013
-
-
Dmitry Kovalev authored
We have two SSE2-optimized functions for idct4_1d: vp9_idct4_1d_sse2 <-- removing this one idct4_1d_sse2 vp9_idct4_1d_sse2 was used only by the following functions which already have SSE2 optimized variants: vp9_idct4x4_16_add_c -> vp9_idct4x4_16_add_see2 idct8_1d -> vp9_idct8x8_{16, 10, 1}_see2 vp9_short_iht4x4_add_c -> vp9_short_iht4x4_add_see2 Change-Id: Ib0a7f6d1373dbaf7a4a41208cd9d0671fdf15edb
-
Scott LaVarnway authored
byte version of ronalds d207 ssse3 optimizations (commit: f891f84d3ba9345b0074e682f0fea09b8ddf4f1e) Change-Id: If15f71a589ea16f78ac86a501b0c5c6231dc9af1
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Yunqing Wang authored
-
Yunqing Wang authored
To ensure fast encoding/decoding on devices without ssse3 support, SSE2 optimization of sub-pixel filters was done. Test using 1080p clip showed the decoder speeds were ~70fps with ssse3 filters, ~60fps with sse2 filters, and ~15fps with c filters. Change-Id: Ie2088f87d83a889fba80a613e4d0e287aadd785c
-
Adrian Grange authored
-
Yaowu Xu authored
-
Jingning Han authored
-
Jingning Han authored
Change-Id: Ifef756a3a91423bb9f5411f06fa092027be21ecf
-
Dmitry Kovalev authored
Renames: fdct4_1d -> fdct4 fadst4_1d -> fadst4 fdct8_1d -> fdct8 fadst8_1d -> fadst8 fdct16_1d -> fdct16 fadst16_1d -> fadst16 "_1d" suffix is redundant, so removing it. The same will happen with idct in the next change sets. Change-Id: Ibf421cd2f569146c6079269df7a31819c098265e
-
Dmitry Kovalev authored
Renames: vp9_short_idct32x32_add -> vp9_idct32x32_1024_add vp9_short_idct32x32_1_add -> vp9_idct32x32_1_add vp9_idct_add_32x32 -> vp9_idct32x32_add Change-Id: Id85306f5814bac6c47463a6b5901a93082510666
-
Jingning Han authored
This commit re-designs the per transformed block rate-distortion costs tracking buffers. It removes redundant buffer usage, makes the needed context memory allocation per VP9_COMP instance and reuses the same buffer sets inside the rate-distortion optimization search loop, thereby avoiding repeatedly requiring memory space. It reduces speed 0 runtime: bus at 2000 kbps from 166763ms to 158967ms, football at 600 kbps from 246614ms to 234257ms. Both about 5% speed-up. Local tests suggest about 2% to 5% speed-up for speed 1 and 2 settings. This does not change compression performance. Change-Id: I363514c5276b5cf9a38c7251088ffc6ab7f9a4c3
-
Yaowu Xu authored
Change-Id: Id5e31833a0ef40de9f64c2f5674af7083233bf14
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Deb Mukherjee authored
-
Jingning Han authored
-
Jingning Han authored
-
Paul Wilkins authored
-
Deb Mukherjee authored
Increases these parameters. There is a small efficiency gain. Change-Id: Ie5f0ddb39c907d335e0dafa5eb112365a81f4542 derfraw300: +0.091% stdhdraw250: +0.238%
-