- 10 Mar, 2013 1 commit
-
-
John Koleszar authored
The previous implementation visited each node in the tree multiple times because it used each symbol's encoding to revisit the branches taken and increment its count. Instead, we can traverse the tree depth first and calculate the probabilities and branch counts as we walk back up. The complexity goes from somewhere between O(nlogn) and O(n^2) (depending on how balanced the tree is) to O(n). Only tested one clip (256kbps, CIF), saw 13% decoding perf improvement. Note that this optimization should port trivially to VP8 as well. In VP8, the decoder doesn't use this function, but it does routinely show up on the profile for realtime encoding. Change-Id: I4f2848e4f41dc9a7694f73f3e75034bce08d1b12
-
- 05 Mar, 2013 1 commit
-
-
Ronald S. Bultje authored
Split macroblock and superblock tokenization and detokenization functions and coefficient-related data structs so that the bitstream layout and related code of superblock coefficients looks less like it's a hack to fit macroblocks in superblocks. In addition, unify chroma transform size selection from luma transform size (i.e. always use the same size, as long as it fits the predictor); in practice, this means 32x32 and 64x64 superblocks using the 16x16 luma transform will now use the 16x16 (instead of the 8x8) chroma transform, and 64x64 superblocks using the 32x32 luma transform will now use the 32x32 (instead of the 16x16) chroma transform. Lastly, add a trellis optimize function for 32x32 transform blocks. HD gains about 0.3%, STDHD about 0.15% and derf about 0.1%. There's a few negative points here and there that I might want to analyze a little closer. Change-Id: Ibad7c3ddfe1acfc52771dfc27c03e9783e054430
-
- 04 Mar, 2013 2 commits
-
-
Yunqing Wang authored
Wrote a SSE2 vp9_short_idct4x4llm to improve the decoder performance. Change-Id: I90b9d48c4bf37aaf47995bffe7e584e6d4a2c000
-
Jingning Han authored
Fixed a couple of variable/function definitions, as well as header handling to support 16K sequence coding at high bit-rates. The width and height are each specified by two bytes in the header. Use an extra byte to explicitly indicate the scaling factors in both directions, each ranging from 0 to 15. Tested coding up to 16400x16400 dimension. Change-Id: Ibc2225c6036620270f2c0cf5172d1760aaec10ec
-
- 02 Mar, 2013 3 commits
-
-
John Koleszar authored
Update the function prototypes to match between VP9 and VP8. Change-Id: If58965073989e87df3b62b67a030ec6ce23ca04f
-
Dmitry Kovalev authored
Change-Id: Iab0176f058045181821ded95ff1cf423af1625f9
-
Dmitry Kovalev authored
Removing redundant 'extern' keyword, lowercase variable names. Change-Id: I608e8d8579aba8981f5fac3493f77b4481b13808
-
- 01 Mar, 2013 2 commits
-
-
Yaowu Xu authored
to be a fixed value of 15. Test results: cif: .124%, .068%, .081% std-hd: 2.809%, 3.174%, 2.705% Change-Id: I380c8152c973506094da15eab59e3aa22b75a983
-
Yunqing Wang authored
Simplified idct32x32 calculation when there are only 10 or less non-zero coefficients in 32x32 block. This helps the decoder performance. Change-Id: If7f8893d27b64a9892b4b2621a37fdf4ac0c2a6d
-
- 28 Feb, 2013 12 commits
-
-
James Zern authored
gf_group_bits is int64_t remove casts to int. Change-Id: I3b4225905041fac9af9fdfcbcb6f1c357ea4b593
-
Dmitry Kovalev authored
Lower case variable names, converting while loops to for loops. Change-Id: Ic3b973391eef7472a99d18d02fe79cfef5e04e62
-
Yunqing Wang authored
Provided a wrapper and removed duplicate code. Change-Id: Iaef842226ec348422e459202793b001d0983ea30
-
Scott LaVarnway authored
Change-Id: Ie89bd00d58e30bf4094cb748a282f1dfa81a31d8
-
Jim Bankoski authored
Change-Id: Id786be31da3c91d95d2955aa569ecdc6e66650df
-
John Koleszar authored
The ABOVESPREFMV experiment uses four pixels to the left of the current block, which don't exist for the left-most column. Change-Id: I4cf0b42ae8f54c0b3e7b1ed8755704b74fafc39c
-
Dmitry Kovalev authored
Removing redundant variables, using x *= y instead x = x * y, moving variable declarations into inner blocks. Change-Id: I884f95c755f55d51b7c1c6585f10296919063e41
-
Dmitry Kovalev authored
Removing redundant 'extern' keyword, better formatting, code simplification. Change-Id: I132fea14f08c706ee9ea147d19464d03f833f25b
-
John Koleszar authored
The width and height stored in the reference frames are padded out to a multiple of 16. The Width and Height variables in common are the displayed size, which may be smaller. The incorrect comparison was causing scaling related code to be called when it shouldn't have been. A notable case where this happens is 1080p, since 1088 != 1080. Change-Id: I55f743eeeeaefbf2e777e193bc9a77ff726e16b5
-
Jim Bankoski authored
sse4_1 code used uint16_t for returning sad, but that won't work for 32x32 or 64x64. This code fixes the assembly for those and also reenables sse4_1 on linux Change-Id: I5ce7288d581db870a148e5f7c5092826f59edd81
-
Jim Bankoski authored
Change-Id: I919e2dd72292fe44f2e53ada56bd42287d50cdeb Signed-off-by:
Jim Bankoski <jimbankoski@google.com>
-
Christian Duvivier authored
Scalar path is about 1.4x faster (4% overall encoder speedup). SSE2 path is about 7x faster (13% overall encoder speedup). Change-Id: I7e85d8225a914a74c61ea370210414696560094d
-
- 27 Feb, 2013 14 commits
-
-
Dmitry Kovalev authored
Fixing code style, using array lookup instead of switch statements for forward hybrid transforms (in the same way as for their inverses). Consistent usage of ROUND_POWER_OF_TWO macro in appropriate places. Change-Id: I0d3822ae11f928905fdbfbe4158f91d97c71015f
-
Dmitry Kovalev authored
Fixing indentation, removing redundant parenthesis, deciphering single letter variable names, better spacing. Change-Id: I1d447a7d69eddbf1e94e0820423615f40ea2d591
-
Yunqing Wang authored
Removed vp9/decoder/x86/vp9_idct_blk_mmx.c Change-Id: I07ab06382a394cf556fa5a8e3c98b91f6e4f9ce8
-
Yunqing Wang authored
Removed vp9_idctllm_mmx.asm Change-Id: I7152756f23a5a09ed69e8fb40edb2ab3237290fe
-
Ronald S. Bultje authored
Consistent with VP8. Change-Id: I8c316ee49f072e15abbb033a80e9c36617891f07
-
John Koleszar authored
This function was part of an optimization used in VP8 that required caching two macroblocks. This is unused in VP9, and might not survive refactoring to support superblocks, so removing it for now. Change-Id: I744e585206ccc1ef9a402665c33863fc9fb46f0d
-
Jan Kratochvil authored
s/movd/movq/ Change-Id: Id1a56de91551f8dc796f14f1056c565dfc1ba626
-
John Koleszar authored
This patch makes the encoder's use of ref_frame_map and active_ref_idx consistent with the decoder. ref_frame_map[] maps a reference buffer index to its actual location in the yv12_fb array, since many references may share an underlying buffer. active_ref_idx[] mirrors cpi->{lst,gld,alt}_fb_idx, holding the active references in each slot. This also fixes a bug in setup_buffer_inter() where the incorrect reference was used to populate the scaling factors. Change-Id: Id3728f6d77cffcd27c248903bf51f9c3e594287e
-
John Koleszar authored
Fixes a bug in vp9_set_internal_size() that prevented returning to the unscaled state. Updated the ResizeInternalTest to scale both down and up. Added a check that all frames are within 2.5% of the quality of the initial keyframe. Change-Id: I3b7ef17cdac144ed05b9148dce6badfa75cff5c8
-
John Koleszar authored
This avoids duplicating all the filters twice. Includes fixups to the convolve routines and associated tests to make this work. Change-Id: I922f86021594e55072ddb63b42b2313605db6e00
-
John Koleszar authored
This patch extends the previous support for using references of a different resolution in ZEROMV mode to all inter prediction modes. Subpixel based best-mv scoring is disabled when the reference frame differs in resolution from the current frame. Change-Id: Id4dc3e5e6692de98d9857fd56bfad3ac57e944ac
-
John Koleszar authored
This commit updates the 4x4 prediction to consistently use the build_2x1_inter_predictor() method. That function is updated to calculate the scale offset, rather than relying on the caller to calculate it. In the case that the 2x1 prediction can not be used, the scale offset is recalculated for each 1x1 block. The idea here is that the offsets are calculated before each call to vp9_build_scaled_inter_predictor(). Change-Id: I0ac3343dd54e2846efa3c4195fcd328b709ca04d
-
John Koleszar authored
This patch allows coding frames using references of different resolution, in ZEROMV mode. For compound prediction, either reference may be scaled. To test, I use the resize_test and enable WRITE_RECON_BUFFER in vp9_onyxd_if.c. It's also useful to apply this patch to test/i420_video_source.h: --- a/test/i420_video_source.h +++ b/test/i420_video_source.h @@ -93,6 +93,7 @@ class I420VideoSource : public VideoSource { virtual void FillFrame() { // Read a frame from input_file. + if (frame_ != 3) if (fread(img_->img_data, raw_sz_, 1, input_file_) == 0) { limit_ = frame_; } This forces the frame that the resolution changes on to be coded with no motion, only scaling, and improves the quality of the result. Change-Id: I1ee75d19a437ff801192f767fd02a36bcbd1d496
-
Yunqing Wang authored
Wrote SSE2 version of vp9_dc_only_idct_add_c function. In order to improve performance, clipped the absolute diff values to [0, 255]. This allowed us to keep the additions/subtractions in 8 bits. Test showed an over 2% decoder performance increase. Change-Id: Ie1a236d23d207e4ffcd1fc9f3d77462a9c7fe09d
-
- 26 Feb, 2013 5 commits
-
-
Dmitry Kovalev authored
Change-Id: I893fa36297b9bd9cff93d082f1736f6860b15c0d
-
Ronald S. Bultje authored
Change-Id: I35e64998b25694a3bb4a62164bba3c03c1db4bc7
-
Ronald S. Bultje authored
Change-Id: I17e2d2f6a4da86d9e4af7bebdea0bf5d154da084
-
Ronald S. Bultje authored
Change-Id: I62497dcf2074b4bb4787bf660e727e5cf1bf3472
-
John Koleszar authored
Ensure that all inter prediction goes through a common code path that takes scaling into account. Removes a bunch of duplicate 1st/2nd predictor code. Also introduces a 16x8 mode for 8x8 MVs, similar to the 8x4 trick we were doing before. This has an unexpected effect with EIGHTTAP_SMOOTH, so it's disabled in that case for now. Change-Id: Ia053e823a8bc616a988a0af30452e1e75a739cba
-