- 01 Jul, 2013 1 commit
-
-
Ronald S. Bultje authored
Total encoding time for the first 50 frames of bus (speed 0) @ 1500kbps goes from 2min34.8 to 2min14.4, i.e. a 15.2% overall speedup. The code is x86-64 only; it needs some minor modifications to be 32bit compatible, because it uses 15 xmm registers, whereas 32bit only has 8. Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904
-
- 29 Jun, 2013 1 commit
-
-
Yaowu Xu authored
Change-Id: I692d800af1f976c84a76f8bd66864c4b39540abc
-
- 28 Jun, 2013 5 commits
-
-
Ronald S. Bultje authored
Makes cost_coeffs() a lot faster: 4x4: 236 -> 181 cycles 8x8: 888 -> 588 cycles 16x16: 3550 -> 2483 cycles 32x32: 17392 -> 12010 cycles Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup. Change-Id: I16b8d595946393c8dc661599550b3f37f5718896
-
Ronald S. Bultje authored
4x4: 234 -> 236 cycles 8x8: 878 -> 888 cycles 16x16: 3664 -> 3550 cycles 32x32: 18134 -> 17392 cycles Change-Id: I37a51bfbb0060a3a54f09c6045c14a989811ed78
-
Ronald S. Bultje authored
Cycle timings for first 3 frames of bus (speed 0) at 1500kbps: 4x4: 298 -> 234 cycles 8x8: 1227 -> 878 cycles 16x16: 4906 -> 3664 cycles 32x32: 23426 -> 18134 cycles Total encode time of first 50 frames of bus @ 1500kbps (speed 0) goes from 3min0.7 to 2min51.6, i.e. 5.3% faster. Change-Id: I68a0e1b530b0563b84a67342cca4b45146077e95
-
Ronald S. Bultje authored
This commit replaces zrun_zbin_boost, a method of biasing non-zero coefficients following runs of zero-coefficients to be rounded towards zero, with an explicit skip-block choice in the RD loop. The logic is basically that if individual coefficients should be rounded towards zero (from a RD point of view), the trellis/optimize loop should take care of it. If whole blocks should be zero (from a RD point of view), a single RD check is much more efficient than a complete serialization of the quantization loop. Quality change: derf +0.5% psnr, +1.6% ssim; yt +0.6% psnr, +1.1% ssim. SIMD for quantize will follow in a separate patch. Results for other test sets pending. Change-Id: Ife5fa641163ac5150ac428011e87188f1937c1f4
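The whole-block skip decision described above can be sketched as a single rate-distortion comparison. This is only an illustration: `rd_cost_sketch`, `choose_skip_block`, and the simplified cost formula (lambda times rate plus distortion) are hypothetical and are not taken from the encoder source.

```c
#include <stdint.h>

/* Hypothetical, simplified RD cost: lambda-weighted rate plus distortion.
 * The encoder's real cost macro may scale these terms differently. */
static int64_t rd_cost_sketch(int64_t lambda, int64_t rate, int64_t dist) {
  return lambda * rate + dist;
}

/* Returns 1 when signalling the whole block as skipped (all coefficients
 * zero) is cheaper in RD terms than coding its coefficients, else 0.
 * One comparison per block replaces per-coefficient zero-run biasing. */
static int choose_skip_block(int64_t lambda,
                             int64_t rate_coded, int64_t dist_coded,
                             int64_t rate_skip, int64_t dist_skip) {
  return rd_cost_sketch(lambda, rate_skip, dist_skip) <
         rd_cost_sketch(lambda, rate_coded, dist_coded);
}
```

With lambda = 10, a block costing 50 bits at distortion 100 when coded (cost 600) is skipped if skipping costs 1 bit at distortion 400 (cost 410), but not if skipping raises distortion to 900 (cost 910).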
-
Yaowu Xu authored
Change-Id: I379617c1c731a686b3f7e032b8805860c1055b12
-
- 27 Jun, 2013 1 commit
-
-
Jingning Han authored
This commit enables a configurable reference buffer pointer for the intra predictor. This allows later removal of the spatial dependency between blocks inside a 64x64 superblock in the rate-distortion optimization loop. Change-Id: I02418c2077efe19adc86e046a6b49364a980f5b1
-
- 26 Jun, 2013 3 commits
-
-
Paul Wilkins authored
Also tweaks to other features and experiments with what is on and off at different speed settings. Change-Id: I3e1d0be0d195216bf17c2ac5df67f34ce0b306b2
-
Paul Wilkins authored
Each frame we reset all adaptive thresholds to MAX rather than base. As modes are picked their thresholds drop down. Change-Id: Ia37f03a73003c2d9bfcda57edea07205e9a0e5e8
-
Paul Wilkins authored
Renamed cpi->sf.first_step to cpi->sf.reduce_first_step_size and changed its meaning such that it is a delta applied to reduce the default first step size (>> x) in the motion search rather than an absolute value. The default first step size is already changed according to the image dimensions (smaller for smaller images). cpi->sf.reduce_first_step_size now applies a further correction from the default. Change-Id: Ia94e08bc24c67b604831f980909af7e982fcd16d
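A minimal sketch of the new semantics, assuming the delta is applied as a plain right shift of the default step (the "(>> x)" in the message). The helper name `first_step_size_sketch` and its parameters are hypothetical; only `reduce_first_step_size` comes from the commit.

```c
/* default_first_step: the step already chosen from the image dimensions.
 * reduce_first_step_size: delta from the speed features; 0 leaves the
 * default untouched, each increment halves the first step. */
static int first_step_size_sketch(int default_first_step,
                                  int reduce_first_step_size) {
  return default_first_step >> reduce_first_step_size;
}
```

Because the value is a correction rather than an absolute step, the same speed setting keeps scaling with image size: a default of 1024 with a delta of 2 gives 256, while a smaller image with a default of 256 gives 64.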
-
- 25 Jun, 2013 3 commits
-
-
Jingning Han authored
Remove vp9_intra4x4_predict(). Use the common intra prediction function for all block sizes. Change-Id: Ibd19d51dfa3da8bbdfb79ddeb81530b2e2089560
-
Dmitry Kovalev authored
Change-Id: I453ed11b965e857a14c18ea5c0f4a0a48e7dc0d9
-
Dmitry Kovalev authored
Removing block index (ib) parameter from get_tx_type_{8x8, 16x16} functions. Change-Id: Ia213335aae7a7cb027f97b9cc9b04519840250f1
-
- 21 Jun, 2013 3 commits
-
-
Dmitry Kovalev authored
Using MV instead of int_mv for function arguments. Change-Id: Ic25e13dccbc98fac1fa1b3255127e00cca2a57f6
-
Ronald S. Bultje authored
Change vp9_block_error() to return a 64bit error variable, change all callers to expect a 64bit return value (this will prevent overflows, which we basically don't check for at all right now). Remove duplicate block_error() function, which fixed that through truncation. Remove old (incompatible) mmx/sse2 block_error SIMD versions and replace with a new one that returns a 64bit value. Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to 3min23, i.e. a 3% overall speedup. Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
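To illustrate why the return type matters, here is a plain C sketch of a block error sum with a 64-bit accumulator. It is not the actual `vp9_block_error()` body or its SIMD replacement; the function name and signature are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of squared differences between original and dequantized
 * coefficients. Each squared difference can approach 2^32, so a 32-bit
 * accumulator can overflow on a single large block; int64_t cannot. */
static int64_t block_error_sketch(const int16_t *coeff,
                                  const int16_t *dqcoeff, size_t n) {
  int64_t error = 0;
  for (size_t i = 0; i < n; ++i) {
    const int diff = coeff[i] - dqcoeff[i];
    error += (int64_t)diff * diff; /* widen before multiplying */
  }
  return error;
}
```

For a 32x32 block (1024 coefficients) with the worst-case difference of 65535 per coefficient, the sum is far beyond the 32-bit range, which is the overflow the commit guards against.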
-
Yaowu Xu authored
Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34
-
- 20 Jun, 2013 2 commits
-
-
Deb Mukherjee authored
Improves the rd modeling function and implements it using interpolation from a table, which is a little faster. Also uses sse rather than var as input to the modeling function: since no dc prediction is used, sse works a little better. derfraw300: +0.05% Speedup: ~1% Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff
-
Jim Bankoski authored
Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a
-
- 19 Jun, 2013 1 commit
-
-
Yaowu Xu authored
Change-Id: Ic924f07c6ab0c929c6cdf11880d3c625806e272c
-
- 14 Jun, 2013 1 commit
-
-
Deb Mukherjee authored
No bitstream or output change - only cosmetics. Change-Id: Ic8c1d7ad010a87dcf27d12a38cd7dd5adba683a7
-
- 11 Jun, 2013 1 commit
-
-
Deb Mukherjee authored
Avoids divide-by-zero when variance is 0. Change-Id: I3c7f526979046ff7d17714ce960fe81d6e1442a0
-
- 10 Jun, 2013 4 commits
-
-
John Koleszar authored
Change the argument of get_uv_tx_size() to be an MBMI pointer, so that the correct column's MBMI can be passed to the function. Change-Id: Ied6b8ec33b77cdd353119e8fd2d157811815fc98
-
Paul Wilkins authored
Do not allow the rd code to check compound modes if a segment level reference frame is selected. Change-Id: I95f0c57789e0eaceed7caf227e94b4ba3130a06c
-
Ronald S. Bultje authored
Change-Id: Ib5a95bb6ab643b276df3faa9bf99595e4a69ff18
-
Tero Rintaluoma authored
Fixed point scaling factors are calculated once for each reference frame by using integer division. Otherwise fixed point scaling routines are used in all scaling calculations. This makes it possible to calculate fixed point scaling factors in device driver software and pass them to hardware, and thus avoid division in hardware. TODO: - Missing check for maximum frame dimensions (currently scaling uses 14 bits) - Missing check for maximum scaling ratio (upscaling 16:1, downscaling 2:1) Problems: - A straightforward fixed point implementation can cause an error of +-1 compared to integer division (e.g. in x_step_q4). Should only be an issue for frames larger than 16k. Change-Id: I3cf4dabd610a4dc18da3bdb31ae244ebaf5d579c
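The split between a one-time division and division-free scaling can be sketched as follows. The 14-bit precision comes from the commit message; the function names and the `SCALE_SHIFT_SKETCH` macro are hypothetical and do not claim to match the real scaling code.

```c
#include <stdint.h>

#define SCALE_SHIFT_SKETCH 14 /* fractional bits, per the commit message */

/* One-time setup per reference frame: the only integer division.
 * Assumes other_size is small enough that the shift does not overflow. */
static int get_fixed_point_scale_factor(int other_size, int this_size) {
  return (other_size << SCALE_SHIFT_SKETCH) / this_size;
}

/* Per-value scaling: a multiply and a shift, no division, so it can run
 * in hardware once the factor has been computed by the driver. */
static int scale_value(int val, int scale_fp) {
  return (int)(((int64_t)val * scale_fp) >> SCALE_SHIFT_SKETCH);
}
```

The sketch also reproduces the off-by-one noted under Problems: scaling 3000 by the ratio 1000/3000 gives a factor of 5461 and a result of 999, where direct integer division gives 1000.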
-
- 07 Jun, 2013 4 commits
-
-
Deb Mukherjee authored
Adds coding of transform size within a frame by use of context of transform sizes selected in left and above blocks. Also incorporates code for generating stats. TODO: generate and incorporate new default stats Change-Id: I6a7af099f6ad61d448521d9a51167aedaf638ed6
-
Paul Wilkins authored
Simplify feature to only support a single reference frame instead of a mask. Change-Id: I5dd3a98c7a224aafb35708850ab82e2f220e68fb
-
Deb Mukherjee authored
Changes to the coding of transform sizes, along with forward and backward probability updates. Results: derf300: +0.241% Context based coding of transform sizes will be in a separate patch. Change-Id: I97241d60a926f014fee2de21fa4446ca56495756
-
Ronald S. Bultje authored
Code intra/inter, then comp/single, then the ref frame selection. Use contextualization for all steps. Don't code two past frames in comp pred mode. Change-Id: I4639a78cd5cccb283023265dbcc07898c3e7cf95
-
- 06 Jun, 2013 7 commits
-
-
Ronald S. Bultje authored
Split partition probabilities between keyframes and non-keyframes, since they are fairly different. Also have per-blocksize interframe y intramode probabilities, since these vary heavily between different blocksizes. Lastly, replace default probabilities for partitioning and intra modes with new ones generated from current codec. Replace counts with actual probabilities also. Change-Id: I77ca996e25e4a28e03bdbc542f27a3e64ca1234f
-
Jingning Han authored
Fix the calculation of step size in height. Change-Id: I0e0c0175f141f5a41214ae51cef233d13942d3c5
-
Jim Bankoski authored
Change-Id: Ieface458c83eb6e7ee95595d9fc662f372117c9a
-
Paul Wilkins authored
Added structures to support independent rd thresholds for different block sizes (and set experimental block size correction factors). Added a structure to allow dynamic adaptation of thresholds on a per-mode and per-block-size basis, depending on how often the mode/block size combination is seen (currently a fixed factor). Removed some unused variables. TODO: - Adaptation of thresholds based on how often each mode is chosen. - The baseline mode values could also be adjusted based on the block size (e.g. for a particular intra mode use a low threshold for 4x4 prediction blocks but a relatively high value for 64x64). Change-Id: Iddee65ff3324ee309815ae7c1c5a8584720e7568
-
Paul Wilkins authored
Turn this feature off for some modes in "good" quality. Change-Id: I3f262d62cca8f01736b977af1465291e8be29f0a
-
Jim Bankoski authored
This avoids encoding tokens for blocks that are entirely in the UMV border. This changes the bitstream. Change-Id: I32b4df46ac8a990d0c37cee92fd34f8ddd4fb6c9
-
Jingning Han authored
This commit makes the coding/reconstruction operations of the intra coding rate-distortion loop for UV components consistent with those of the encoding process. Key frame coding gains: derf: 0.11% stdhd: 0.42% Change-Id: I8d49f83924a320e3689ef2d60096c49d7f0c7a40
-
- 05 Jun, 2013 1 commit
-
-
Deb Mukherjee authored
NO bitstream change Change-Id: I79f6146dac5fdd157051b6f8dc611c0b7b5e5f7f
-
- 04 Jun, 2013 1 commit
-
-
Jingning Han authored
This commit makes the operations of the superblock intra coding rate-distortion optimization consistent with those used in the encoding process. Given the test prediction mode and transform size, the rd optimizer encodes and reconstructs each transformed block of the superblock consecutively, then computes the total rate-distortion cost associated with the current superblock to select the coding decisions. It achieves coding performance gains: derf 0.353% yt 1.111% Change-Id: I0da2eb7a71361dfb8c1384927fc536b0c2790d07
-
- 03 Jun, 2013 1 commit
-
-
Jingning Han authored
Enable iterative motion search for compound inter-inter prediction of block sizes 4x4/4x8/8x4 only when best coding quality is selected. The iterative motion search provides about 0.1% gains for derf and stdhd at this point, at the expense of longer runtime. Change-Id: Idc03e7f827e51f1bb8d269bc3752ee297a6bbfe5
-