1. 02 Mar, 2013 1 commit
  2. 01 Mar, 2013 1 commit
    • Add eob<=10 case in idct32x32 · c550bb3b
      Yunqing Wang authored
      Simplified the idct32x32 calculation for 32x32 blocks that have 10
      or fewer non-zero coefficients. This improves decoder
      performance.
      
      Change-Id: If7f8893d27b64a9892b4b2621a37fdf4ac0c2a6d
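A shortcut of this kind can be pictured as a dispatch on the end-of-block position (eob). The sketch below is illustrative only and is not the libvpx code: the enum, the function name `select_idct32x32_path`, and the `eob == 0` and `eob == 1` cases are assumptions; only the `eob <= 10` threshold comes from the commit.

```c
/* Hypothetical path selector; names are illustrative, not libvpx API. */
typedef enum {
  IDCT32_SKIP,    /* assumed: no coefficients, nothing to add */
  IDCT32_DC_ONLY, /* assumed: only the DC term is present */
  IDCT32_EOB10,   /* reduced transform for eob <= 10 (from the commit) */
  IDCT32_FULL     /* general 32x32 inverse transform */
} idct32_path;

/* Pick the cheapest inverse-transform path that is valid for this eob. */
idct32_path select_idct32x32_path(int eob) {
  if (eob == 0) return IDCT32_SKIP;
  if (eob == 1) return IDCT32_DC_ONLY;
  if (eob <= 10) return IDCT32_EOB10;
  return IDCT32_FULL;
}
```

Assuming a zigzag-like scan order, a small eob confines the non-zero coefficients to the low-frequency corner of the block, so most inputs to the transform are known to be zero.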
  3. 28 Feb, 2013 5 commits
  4. 27 Feb, 2013 10 commits
    • Code cleanup. · 347f3a0a
      Dmitry Kovalev authored
      Fixing code style, using array lookup instead of switch statements for
      forward hybrid transforms (in the same way as for their inverses).
      Consistent usage of ROUND_POWER_OF_TWO macro in appropriate places.
      
      Change-Id: I0d3822ae11f928905fdbfbe4158f91d97c71015f
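The switch-to-table change can be sketched as follows. This is a minimal illustration, not the code from the commit: the type names, the table `FHT_4`, the function `lookup_fht4`, and the stand-in 1-D transform bodies are all hypothetical.

```c
#include <stdint.h>

typedef void (*transform_1d)(const int16_t *in, int16_t *out);

/* Stand-in 1-D transforms; real ones would perform the butterfly math. */
static void fdct4_1d(const int16_t *in, int16_t *out) {
  for (int i = 0; i < 4; ++i) out[i] = in[i];
}
static void fadst4_1d(const int16_t *in, int16_t *out) {
  for (int i = 0; i < 4; ++i) out[i] = in[3 - i];
}

/* Hypothetical transform-type enum used as the table index. */
typedef enum { DCT_DCT, ADST_DCT, DCT_ADST, ADST_ADST } tx_type;

typedef struct {
  transform_1d cols; /* applied to each column */
  transform_1d rows; /* applied to each row */
} transform_2d;

/* One entry per transform type, in enum order. */
static const transform_2d FHT_4[] = {
  { fdct4_1d,  fdct4_1d  }, /* DCT_DCT   */
  { fadst4_1d, fdct4_1d  }, /* ADST_DCT  */
  { fdct4_1d,  fadst4_1d }, /* DCT_ADST  */
  { fadst4_1d, fadst4_1d }  /* ADST_ADST */
};

/* A single array lookup replaces a four-way switch on the type. */
const transform_2d *lookup_fht4(tx_type type) {
  return &FHT_4[type];
}
```

Keeping the row and column functions in one table entry means that a new transform type only needs a new row in the table rather than another switch case in every caller.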
    • Remove unused file · 5ef694cf
      Yunqing Wang authored
      Removed vp9_idctllm_mmx.asm
      
      Change-Id: I7152756f23a5a09ed69e8fb40edb2ab3237290fe
    • Move eob from BLOCKD to MACROBLOCKD. · e8c74e2b
      Ronald S. Bultje authored
      Consistent with VP8.
      
      Change-Id: I8c316ee49f072e15abbb033a80e9c36617891f07
    • Remove unused vp9_copy32xn · 7ad8dbe4
      John Koleszar authored
      This function was part of an optimization used in VP8 that required
      caching two macroblocks. This is unused in VP9, and might not
      survive refactoring to support superblocks, so removing it for now.
      
      Change-Id: I744e585206ccc1ef9a402665c33863fc9fb46f0d
    • Fix --as=nasm compatibility for new asm code. · 82ed3f9a
      Jan Kratochvil authored
      s/movd/movq/
      
      Change-Id: Id1a56de91551f8dc796f14f1056c565dfc1ba626
    • Use 256-byte aligned filter tables · 6fd7dd1a
      John Koleszar authored
      This avoids duplicating all the filters twice. Includes fixups to the
      convolve routines and associated tests to make this work.
      
      Change-Id: I922f86021594e55072ddb63b42b2313605db6e00
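The commit message does not spell out how alignment removes the duplication. One plausible reading, sketched here purely as an assumption, is that a table of exactly 256 bytes placed on a 256-byte boundary lets a routine recover both the table base and the kernel index from a kernel pointer alone, by masking the low 8 address bits. The table size (16 kernels of 8 `int16_t` taps) and all names below are hypothetical.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_KERNELS 16 /* assumed number of subpixel positions */
#define DEMO_TAPS 8     /* assumed taps per kernel */

/* 16 * 8 * sizeof(int16_t) = 256 bytes, aligned to a 256-byte boundary. */
static alignas(256) const int16_t demo_filters[DEMO_KERNELS][DEMO_TAPS] = { { 0 } };

/* Clear the low 8 address bits to get back to the start of the table. */
const int16_t *filter_table_base(const int16_t *kernel) {
  return (const int16_t *)((uintptr_t)kernel & ~(uintptr_t)0xff);
}

/* The low 8 address bits are the byte offset of the kernel in the table. */
int filter_index(const int16_t *kernel) {
  return (int)(((uintptr_t)kernel & 0xff) / (DEMO_TAPS * sizeof(int16_t)));
}
```

With this layout a caller can pass a single kernel pointer and the callee can still step to neighbouring kernels in the same table.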
    • Combined motion compensation with scaled predictors · 77f88e97
      John Koleszar authored
      This patch extends the previous support for using references of a
      different resolution in ZEROMV mode to all inter prediction modes.
      Subpixel based best-mv scoring is disabled when the reference frame
      differs in resolution from the current frame.
      
      Change-Id: Id4dc3e5e6692de98d9857fd56bfad3ac57e944ac
    • Set scale factors consistently for SPLITMV · 472eeaf0
      John Koleszar authored
      This commit updates the 4x4 prediction to consistently use the
      build_2x1_inter_predictor() method. That function is updated to
      calculate the scale offset, rather than relying on the caller
      to calculate it. In the case that the 2x1 prediction cannot
      be used, the scale offset is recalculated for each 1x1 block.
      The idea here is that the offsets are calculated before each
      call to vp9_build_scaled_inter_predictor().
      
      Change-Id: I0ac3343dd54e2846efa3c4195fcd328b709ca04d
    • Spatial resampling of ZEROMV predictors · eb939f45
      John Koleszar authored
      This patch allows coding frames using references of different
      resolution, in ZEROMV mode. For compound prediction, either
      reference may be scaled.
      
      To test, I use the resize_test and enable WRITE_RECON_BUFFER
      in vp9_onyxd_if.c. It's also useful to apply this patch to
      test/i420_video_source.h:
      
        --- a/test/i420_video_source.h
        +++ b/test/i420_video_source.h
        @@ -93,6 +93,7 @@ class I420VideoSource : public VideoSource {
      
           virtual void FillFrame() {
             // Read a frame from input_file.
        +    if (frame_ != 3)
             if (fread(img_->img_data, raw_sz_, 1, input_file_) == 0) {
               limit_ = frame_;
             }
      
      This forces the frame that the resolution changes on to be coded
      with no motion, only scaling, and improves the quality of the
      result.
      
      Change-Id: I1ee75d19a437ff801192f767fd02a36bcbd1d496
    • Optimize vp9_dc_only_idct_add_c function · 35bc02c6
      Yunqing Wang authored
      Wrote an SSE2 version of the vp9_dc_only_idct_add_c function. To
      improve performance, clipped the absolute diff values to [0, 255],
      which keeps the additions/subtractions in 8 bits.
      Tests showed a decoder performance increase of over 2%.
      
      Change-Id: Ie1a236d23d207e4ffcd1fc9f3d77462a9c7fe09d
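The clipping trick can be illustrated in scalar C. This is a sketch of the idea, not the SSE2 code from the commit: the signed offset is split into a positive and a negative magnitude, each clipped to [0, 255], and applied with saturating 8-bit add and subtract (the per-byte behaviour of the SSE2 `paddusb` and `psubusb` instructions). All function names are hypothetical.

```c
#include <stdint.h>

/* Saturating unsigned 8-bit add, like paddusb on one byte. */
static uint8_t sat_add_u8(uint8_t a, uint8_t b) {
  unsigned s = (unsigned)a + b;
  return (uint8_t)(s > 255 ? 255 : s);
}

/* Saturating unsigned 8-bit subtract, like psubusb on one byte. */
static uint8_t sat_sub_u8(uint8_t a, uint8_t b) {
  return (uint8_t)(a > b ? a - b : 0);
}

/* Add a signed DC offset to one pixel using only 8-bit operations. */
uint8_t add_dc_8bit(uint8_t pixel, int dc) {
  /* Split dc into two magnitudes in [0, 255]; at most one is non-zero. */
  uint8_t pos = (uint8_t)(dc > 255 ? 255 : (dc > 0 ? dc : 0));
  uint8_t neg = (uint8_t)(dc < -255 ? 255 : (dc < 0 ? -dc : 0));
  return sat_sub_u8(sat_add_u8(pixel, pos), neg);
}

/* Reference: widen, add, then clamp to the pixel range. */
uint8_t add_dc_reference(uint8_t pixel, int dc) {
  int v = pixel + dc;
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}
```

Because a pixel never exceeds 255, clipping the magnitude to 255 loses nothing: any larger offset would saturate the result anyway, so both functions agree for every input.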
  5. 26 Feb, 2013 4 commits
    • Removing redundant 'extern' keyword from function declarations. · 971ff267
      Dmitry Kovalev authored
      Change-Id: I893fa36297b9bd9cff93d082f1736f6860b15c0d
    • Refactor inter recon functions to support scaling · 6a4f708c
      John Koleszar authored
      Ensure that all inter prediction goes through a common code path
      that takes scaling into account. Removes a bunch of duplicate
      1st/2nd predictor code. Also introduces a 16x8 mode for 8x8
      MVs, similar to the 8x4 trick we were doing before. This has an
      unexpected effect with EIGHTTAP_SMOOTH, so it's disabled in that
      case for now.
      
      Change-Id: Ia053e823a8bc616a988a0af30452e1e75a739cba
    • Improve 32x32 forward dct · 66d94ac1
      Yaowu Xu authored
      The commit improves the 32x32 forward dct implementation:
      1. change to use the same constants and rounding as other forward dcts
      2. select rounding to specifically minimize the roundtrip error, which
      improved the average from 19/block to 0.77/block using 100000 random inputs.
      
      Tests showed a small but consistent gain on all test sets, about 0.15%.
      
      Change-Id: If0afd6a71880a522f60c1c234be0462092c2eb53
    • Changing pitch value meaning for fht and iht transforms. · 9bf3f751
      Dmitry Kovalev authored
      Pitch now means the number of elements, not the number of bytes.
      
      Change-Id: Idb9f2f012e39b09d596a3cc1802305a80b7c13af
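The difference between the two pitch conventions can be shown with a small sketch. The function names are hypothetical, and the halving in the byte-based variant assumes 16-bit coefficients.

```c
#include <stdint.h>

/* Byte-based convention: pitch counts bytes, so halve it for int16_t. */
int16_t load_pitch_in_bytes(const int16_t *buf, int pitch_bytes, int row, int col) {
  return buf[row * (pitch_bytes >> 1) + col];
}

/* Element-based convention: pitch counts elements and indexes directly. */
int16_t load_pitch_in_elements(const int16_t *buf, int pitch, int row, int col) {
  return buf[row * pitch + col];
}
```

For a buffer with 4 coefficients per row, the element-based call passes a pitch of 4 where the byte-based call would pass 8; both read the same coefficient.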
  6. 25 Feb, 2013 3 commits
    • Code cleanup. · 9770d564
      Dmitry Kovalev authored
      Removing switch statements for inverse hybrid transforms. Making code style
      consistent for all similar transform implementations. Renaming shortpitch
      and short_pitch variables to half_pitch.
      
      Change-Id: I875f7a82aae4e8063a58777bf1cc3f1e67b48582
    • Code cleanup. · 20b0cb59
      Dmitry Kovalev authored
      Removing redundant parentheses, better code formatting, introducing
      ROUND_POWER_OF_TWO macro to replace repeated expression.
      
      Change-Id: I91aad7a53ed03482428b2419de4bb99fd92c6771
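The commit message does not quote the macro's definition. A common form for a macro with this name and purpose is sketched below; treat it as an assumption rather than the exact text of the change.

```c
/* Divide value by 2^n, rounding to nearest (ties round up); needs n >= 1. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))
```

Under this definition, a repeated expression such as `(x + 8) >> 4` becomes `ROUND_POWER_OF_TWO(x, 4)`.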
    • clean up forward and inverse hybrid transform · 77a3becf
      Jingning Han authored
      Rebased.
      
      Remove the old matrix multiplication transform computation. The 16x16
      ADST/DCT can be switched on/off and evaluated by setting ACTIVE_HT16
      300/0 in vp9/common/vp9_blockd.h.
      
      Change-Id: Icab2dbd18538987e1dc4e88c45abfc4cfc6e133f
  7. 23 Feb, 2013 3 commits
    • Split coefficient token tables intra vs. inter. · 0c9e2e9a
      Ronald S. Bultje authored
      Change-Id: I5416455f8f129ca0f450d00e48358d2012605072
    • Further changes to coefficient contexts. · c17672a3
      Paul Wilkins authored
      This patch alters the balance of context between the
      coefficient bands (reflecting the position of coefficients
      within a transform block) and the energy of the previous
      token (or tokens) within a block.
      
      In this case the number of coefficient bands is reduced
      but more previous token energy bands are supported.
      
      Some initial rebalancing of the default tables has been
      done by running multiple derf clips at multiple data rates using
      the ENTROPY_STATS macro. Further balancing needs to be
      done using larger image formats, especially in regard to
      the bigger transform sizes, which are not as well represented
      in encodings of smaller image formats.
      
      Change-Id: If9736e95c391e711b04aef6393d26f60f36e1f8a
    • give vp9 variance struct a unique name · e5fb6321
      James Zern authored
      variance_vtable clashed with vp8/common/variance.h
      
      Change-Id: I09c1de44d5519f1bd13f58c01144c0de4706de6f
  8. 22 Feb, 2013 2 commits
    • Code cleanup. · 548b4dd5
      Dmitry Kovalev authored
      Removing redundant 'extern' keywords and parentheses, fixing indentation,
      making variable names lower case, using short expressions x *= c
      instead of x = x * c, minor code simplifications.
      
      Change-Id: If6a25fcf306d1db26e90d27e3c24a32735c607de
    • Forward butterfly hybrid transform · babbd5d1
      Jingning Han authored
      This patch includes 4x4, 8x8, and 16x16 forward butterfly ADST/DCT
      hybrid transform. The kernel of 4x4 ADST is sin((2k+1)*(n+1)/(2N+1)).
      The kernel of 8x8/16x16 ADST is of the form sin((2k+1)*(2n+1)/4N).
      
      Change-Id: I8f1ab3843ce32eb287ab766f92e0611e1c5cb4c1
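Written out as formulas, with $N$ the transform length and the indices $k, n \in \{0, \dots, N-1\}$ as in the commit message, the two kernels read as follows. The factor of $\pi$ does not appear in the commit message and is assumed here from the usual definition of discrete sine transforms; normalization constants are omitted.

```latex
\[
  \phi^{(4\times 4)}_{k}(n) = \sin\!\left(\frac{(2k+1)(n+1)\,\pi}{2N+1}\right),
  \qquad
  \phi^{(8\times 8,\;16\times 16)}_{k}(n) = \sin\!\left(\frac{(2k+1)(2n+1)\,\pi}{4N}\right)
\]
```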
  9. 21 Feb, 2013 2 commits
    • Remove "eobs" array in MACROBLOCKD. · 35524e22
      Ronald S. Bultje authored
      The information is a duplicate of "eob" in BLOCKD.
      
      Change-Id: Ia6416273bd004611da801e4bfa6e2d328d6f02a3
    • Refactoring of switchable filter search for speed · 28b1db92
      Deb Mukherjee authored
      Refactors the switchable filter search in the rd loop to
      improve encode speed.
      
      Uses a piecewise approximation to a closed form expression to estimate
      rd cost for a Laplacian source with a given variance and quantization
      step-size.
      
      About 40% encode time reduction is achieved.
      
      Results (on a feb 12 baseline) show a slight drop:
      
      derf: -0.019%
      yt: +0.010%
      std-hd: -0.162%
      hd: -0.050%
      
      Change-Id: Ie861badf5bba1e3b1052e29a0ef1b7e256edbcd0
  10. 20 Feb, 2013 3 commits
    • Code cleanup. · eb6aee50
      Dmitry Kovalev authored
      Change-Id: I7c6e3bebd94856b24dbe2aded7f9e04ef8bb8c08
    • Merge lossless experiment · d262e26c
      Yaowu Xu authored
      Change-Id: I7b7b8d4fda3a23699e0c920d727f8c15d37d43aa
    • Avoid division in intra prediction · 56e6c66b
      Tero Rintaluoma authored
      - Using multiplication and shifting instead of division in
        intra prediction.
      - Maximum absolute difference is 1 for division statements
        in d45, d27, d63 prediction modes. However, errors can
        accumulate for large block sizes when using already predicted
        values.
      - Maximum numbers of non-matching result values in loops using
        division are:
        4x4        0/16
        8x8        0/64
        16x16     10/256
        32x32     13/1024
        64x64    122/4096
      
        Overall PSNR
        derf:     0.005
        yt:      -0.022
        std-hd:   0.021
        hd:      -0.006
      
      Change-Id: I3979a02eb6351636442c1af1e23d6c4e6ec1d01d
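The general technique of replacing a division with a multiply and a shift can be sketched as below. This is a generic illustration, not the code from the commit: the divisor 3, the 16-bit reciprocal constant, and the function names are assumptions. Whether such an approximation is exact depends on the constant, the rounding, and the input range, which is why the commit reports small mismatches for its own cases.

```c
/* Approximate x / 3 with a multiply and a shift: 21846 = ceil(2^16 / 3). */
int div3_mul_shift(int x) {
  return (x * 21846) >> 16;
}

/* Largest |x / 3 - div3_mul_shift(x)| over 0 <= x <= limit. */
int max_abs_diff_div3(int limit) {
  int worst = 0;
  for (int x = 0; x <= limit; ++x) {
    int d = x / 3 - div3_mul_shift(x);
    if (d < 0) d = -d;
    if (d > worst) worst = d;
  }
  return worst;
}
```

A helper like `max_abs_diff_div3` makes it easy to measure the worst-case deviation over the input range of interest, for example sums of three 8-bit pixels (0 to 765).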
  11. 19 Feb, 2013 2 commits
    • 16x16 butterfly inverse ADST/DCT hybrid transform · cd907b16
      Jingning Han authored
      rebased.
      
      This patch includes 16x16 butterfly inverse ADST/DCT hybrid
      transform. It uses the variant ADST of kernel
          sin((2k+1)*(2n+1)/4N),
      which allows a butterfly implementation.
      
      The coding gains as compared to DCT 16x16 are about 0.1% for
      both derf and std-hd. It is noteworthy that for the std-hd set
      many sequences gain about 0.5%, some 0.2%. There are also a few
      points that show -1% to -3% performance. Hence the average
      comes to about 0.1%.
      
      Change-Id: Ie80ac84cf403390f6e5d282caa58723739e5ec17
    • Use lossless for Q0 · 93d6b86c
      Yaowu Xu authored
      The commit changes the coding mode to lossless whenever the lowest
      quantizer is chosen.
      
      As expected, test results showed no difference for the cif and std-hd
      sets, where Q0 is rarely used. For the yt and yt-hd sets, Q0 is used for
      a number of clips, where this commit helped a lot in the high end.
      
      Average over all clips in the sets:
      yt: 2.391% 1.017% 1.066%
      hd: 1.937%  .764%  .787%
      
      Change-Id: I9fa9df8646fd70cb09ffe9e4202b86b67da16765
  12. 15 Feb, 2013 3 commits
  13. 14 Feb, 2013 1 commit