- 22 Jul, 2013 2 commits
-
-
Jingning Han authored
Stack the rate-distortion statistics in the sub8x8 rd loop. This allows the encoder to skip the forward transform, quantization, and coeff cost estimation, in the sub8x8 rd optimization search, if the motion vector(s) are of integer pixel value, and have been tested in the previous prediction filter type rd loops of the same block. This gives about 2% speed-up for bus_cif at 2000 kpbs, for speed 0. Its efficacy depends how frequently the motion search will select an integer motion vector. Change-Id: Iee15d4283ad4adea05522c1d40b198b127e6dd97
-
Paul Wilkins authored
Mode search order in rd loop changed to better reflect observed hit counts. Also some adjustment of the baseline mode rd thresholds to reflect the order change and observed frequencies. Change-Id: I47a131cc83e11551df8add6d6d8d413d78d3a63c
-
- 21 Jul, 2013 1 commit
-
-
Jingning Han authored
This commit allows the encoder to skip a few buffer update steps in rd_pick_best_mbsegmentation, when early breakout has been triggered in the rd_check_segment_txsize. It provides about 1% speed-up for bus_cif at 2000 kbps, in the settings of speed 0. Change-Id: Ica034f10a24dec572b397d8389a2b81020ebc0b9
-
- 19 Jul, 2013 4 commits
-
-
Dmitry Kovalev authored
Change-Id: Ia4e83913251c1cdc7aa2abd64bf01ecb1a962119
-
Dmitry Kovalev authored
Moving TX_MODE enum to vp9_enums.h. Renaming txfm_mode variables to tx_mode. Change-Id: I459d1af6dd928ce7fccdf8ce30b6f1ca057bef92
-
Dmitry Kovalev authored
Functions: vp9_get_pred_context_switchable_interp, vp9_get_pred_context_intra_inter, vp9_get_pred_context_single_ref_p1, vp9_get_pred_context_single_ref_p2. Change-Id: I3d6fb8aee23c9062270768e1e6da416dd9bb8f96
-
Dmitry Kovalev authored
Renaming vp9_sb_mv_ref_tree to vp9_inter_mode_tree, and vp9_sb_mv_ref_encoding_array to vp9_inter_mode_encodings. Change-Id: I0e91fbf81350d3ec5a2599064c74089b5d06133a
-
- 18 Jul, 2013 9 commits
-
-
Dmitry Kovalev authored
Change-Id: Ide58a74d31ff948319445a6337d2c05e98720e34
-
Ronald S. Bultje authored
Change-Id: I96b8058f6dfecf8aa3e152cdcbfd7e10071fbbc9
-
Ronald S. Bultje authored
This prevents a duplicate memcpy of a 128-byte struct every time set_scale_factors() is called (which is a lot), thus leading to a decrease from 3.7 MB to 1.85 MB of struct copying per 64x64 block RD/partition loop. Overall, this decreases encoding time of the first 50 frames of bus @ 1500kbps (speed 0) from 1min5.9 to 1min4.9, i.e. about a 1.5% overall speedup. We can likely get more gains by removing the copy of the other struct (and replacing it with an indexing) as well. Change-Id: I3dceb7e79f71e6fe911b11cc994cf89a869dde7a
-
Ronald S. Bultje authored
This means we only do UV intra mode selection if we find any intra mode to actually be useful at all; in addition, we only do UV intra mode selection for the transform sizes that were selected, rather than all sizes available in this partition. First 50 frames of bus @ 1500kbps (speed 0) gains about 5% with this change. Change-Id: I7b461eb8b803247f57896c5a9505f745b55502b3
-
Ronald S. Bultje authored
The break statement only breaks out of the nested loop, not the top-level loop, so it doesn't always work as intended. Changing it to a return statement does what's intended. Change-Id: I585419823b39a04ec8826b1c8a216099b1728ba7
-
Ronald S. Bultje authored
The same information already exists in union b_mode_info. Change-Id: Iac5086b99a3c3cc270380138062bb693e58f9e6d
-
Ronald S. Bultje authored
This could happen during golden overlay frame coding from a previous alt-ref frame if the special overlay code was triggered. Change-Id: I3056d0c547cd26903b260ef93c94026e96bd9868
-
Ronald S. Bultje authored
Change-Id: Id4f454831f3f11099f39c30246adeaa52857d08d
-
Jingning Han authored
Make the use of mv_check_bounds consistent for mvs of both ref_frame[0] and ref_frame[1]. Change-Id: I1ca24865cc7232ca9cbe5db566c53abad1592211
-
- 17 Jul, 2013 10 commits
-
-
Ronald S. Bultje authored
Encoding of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min6.2 to 1min5.9, i.e. 0.5% faster overall. Change-Id: I59d8a3b2f0a75010fa041d5e2646c8caac5bd683
-
Ronald S. Bultje authored
Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min7.3 to 1min6.2, i.e. 1.7% faster overall. Change-Id: I19d2deacfbffadd61d32551cee9586757ab4a987
-
Yaowu Xu authored
Change-Id: Ic4c4b363ed840935e42f495f13ea5e601a56f1b2
-
Ronald S. Bultje authored
Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min12.8 to 1min7.3, i.e. 8% faster. Change-Id: Ia22d1c7b687316c553cc60eacae988b24e175b62
-
Ronald S. Bultje authored
About 15% faster for bus (speed 0) first 50 frames @ 1500kbps, which goes from 1min36 to 1min24. Results become slightly better (+0.2% on derf/yt, +0.4% on hd), probably because of a bugfix for skipmode in super_block_yrd(). Overall speed change (on derfraw300) is roughly -13%. This can probably be improved further by caching best_yrd between partition searches. Also, we might be able to get more speedups by always doing PARTITION_NONE before PARTITIONS_SPLIT, not just at the sb8x8 level. Change-Id: I83736949ebd5b4a3b400ee688d7661913fefc98b
-
Ronald S. Bultje authored
+0.2% SSIM and glbPSNR on derfraw300. Change-Id: I9cba0bca55e606a22f557c7732b064f738efe84d
-
Yunqing Wang authored
Current partition checking starts from small sizes, and then goes up to large sizes. This experiment uses the small partitions' motion estimation result, which is already available, to speed up the large partition's motion estimation. We can decide to skip some patition checkings if they are unlikely choices. We could use the motion vector(MV) result as current partition's prediction MV, limit the search range and reference frame. Current result at speed 1: psnr loss: 1.19% for stdhd, 0.287% for derf. speed gain: 14% for sunflower(hd), 11% for akiyo. Further improvement will be done later. Change-Id: I5abfd070e9cace2e91e2a0247d1325df313887ab
-
Paul Wilkins authored
Use an estimate based on DC_PRED for intra uv cost within the rd loop then only do a full uv mode analysis if an intra mode is chosen. Significant speed gains in some cases. Currently only enabled for speed 2 pending speed/quality tests. Change-Id: Ie851a12400d5483bce47ec0e3ccb8516041e91c0
-
Paul Wilkins authored
Apply limit if search_method == USE_LARGESTALL to the range of UV tx sizes searched. Change-Id: I6db29f0dd237285ffc50d75a37e8b68151ad821c
-
Jingning Han authored
This commit makes the encoder to perform motion search only once per reference frame type for each 4x4/4x8/8x4 block. For bus_cif at 2000 kbps, the runtime goes from 253812ms -> 217817ms (14% speed-up) for speed 0. Change-Id: I5f17599ccc8cfaf93ccb4f98fcb6008af6d79e92
-
- 16 Jul, 2013 3 commits
-
-
Dmitry Kovalev authored
Removing VP9_COMMON* argument and adding struct tx_probs* instead of MACROBLOCKD*. Change-Id: Idf61074631a90ec51eac22c8dcd977f44ac0757c
-
Paul Wilkins authored
Change-Id: I94b97a966b5efbc9a243048f1f5ddbbdc4b1846e
-
Dmitry Kovalev authored
Removing unused and duplicated constants, moving them from *.h to *.c if possible. Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f
-
- 15 Jul, 2013 1 commit
-
-
Jingning Han authored
This speed feature allows the encoder to largely remove the spatial dependency between blocks inside a 64x64 superblock, thereby removing the need to repeatedly encode superblocks per partition type in the rate-distortion optimization loop. A major challenge lies in the intra modes tested in the rate-distortion optimization loop. The subsequent blocks do not have access to the reconstructed boundary pixels without the intermediate coding steps. This was resolved by using the original pixels for intra prediction in the rd loop, followed by an appropriately designed distortion modeling on the quantization parameters. Experiments also suggested that the performance impact is more discernible at lower bit-rate/psnr settings. Hence a quantizer dependent threshold is applied to deactivate skip of block coding. For bus_cif at 2000 kbps, speed 0: runtime 269854ms -> 237774ms (12% speed-up) at 0.05dB performance loss. speed 1: runtime 65312ms -> 61536ms, (7% speed-up) at 0.04dB performance loss. This operation is currently turned on in settings of speed 1. Change-Id: Ib689741dfff8dd38365d8c1b92860a3e176f56ec
-
- 12 Jul, 2013 3 commits
-
-
Yaowu Xu authored
Change-Id: I23a75c495ed7ea917d7f312bef0990e20a6b53d9
-
Deb Mukherjee authored
Implements some of the helper functions more efficiently with lookups rathers than branches. Modeling function is consolidated to reduce some computations. Also merged the two enums BLOCK_SIZE_TYPES and BlockSize into one because there is no need to keep them separate (even though the semantics are a little different). No bitstream or output change. About 0.5% speedup Change-Id: I7d71a66e8031ddb340744dc493f22976052b8f9f
-
Ronald S. Bultje authored
Change-Id: I78a79fc51c2d7cc3c261f35b569155397f3dc0c4
-
- 11 Jul, 2013 2 commits
-
-
Dmitry Kovalev authored
Change-Id: Iccd4ab95ea51a6d57ed43947f2fd7ad92e8979cf
-
Dmitry Kovalev authored
Adding segmentation struct to vp9_seg_common.h. Struct members are from macroblockd and VP9Common structs. Moving segmentation related constants and enums to vp9_seg_common.h. Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03
-
- 10 Jul, 2013 5 commits
-
-
Jingning Han authored
This commit fixed the mis-use of the tx_type for inverse transform in intra4x4 rate-distortion optimization loop. It improves the overall coding performance. Change-Id: I7fe9953175b74890357dbcee33c138573766e980
-
Jim Bankoski authored
Change-Id: Ie0cb732fdcb98616a422c4463bff80642248d136
-
Deb Mukherjee authored
Adds a speed feature to eliminate full-rd computation if the modeled rd or rd based on a different parameter in the same mode is already a lot larger than the best rd yet. Specifically, only search the sharp and smooth filters if the modeled rd cost based on the regular filter is within a certain factor of the best rd cost so far. Also, skip full-rd computation of non splitmv inter modes if the modeled rd cost based on pred error is within the same factor of the best rd cost so far. Also adds some enhancements in the rd search for splitmv mode to speed things up by early breakouts. Negligible impact on performance. Resuts on derfraw300: psnr: -0.013% with the splitmv enhancements, -0.24% with the rd breakout feature on. speedup: 6% with splitmv enhancements, 20% with also residual breakout (tested on football sequence at 600 Kbps) Change-Id: I37abc308ea9f110c1679ce649b6a7e73ab1ad5fc
-
Ronald S. Bultje authored
Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min4.9 to 2min3.1, i.e. a 1.4% speedup overall. Change-Id: I9b25e87974430cb942caa276410bb2eda815bd83
-
Yaowu Xu authored
Change-Id: I721ebdeef2b53ce3e5c3eba3f7462ae2103c95a8
-