1. 12 May, 2017 1 commit
  2. 11 May, 2017 2 commits
    • Extra rounding to let hw use narrower integers. · 14b8112b
      Sean Purser-Haskell authored
      Change-Id: I175d6ff03f31a2e0d2fe7cd1c3852210d6e0ddf5
      14b8112b
    • More accurate chroma warping · f7a5ee53
      David Barker authored
      Previously, the projected positions of chroma pixels would effectively
      undergo double rounding, since we round both when calculating x4 / y4
      and when calculating the filter index. Further, the two roundings
      were different: x4 / y4 used ROUND_POWER_OF_TWO_SIGNED, whereas
      the filter index uses ROUND_POWER_OF_TWO.
      
      It is slightly more accurate (and faster) to replace the first
      rounding by a shift; this is motivated by the fact that
      ROUND_POWER_OF_TWO(x >> a, b) == ROUND_POWER_OF_TWO(x, a + b)
      
      Change-Id: Ia52b05745168d0aeb05f0af4c75ff33eee791d82
      f7a5ee53
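
The identity quoted in this commit message can be checked numerically. The sketch below is illustrative only: the macro is written out locally using the usual (value + half) >> n definition, the helper name rounding_identity_holds is hypothetical, and only non-negative inputs are tested so that the result does not depend on how a compiler shifts negative values.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding shift written out for illustration: add half, then shift. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))

/* Hypothetical checker: returns 1 if
   ROUND_POWER_OF_TWO(x >> a, b) == ROUND_POWER_OF_TWO(x, a + b)
   for every x in [0, limit), and 0 otherwise. */
static int rounding_identity_holds(int32_t limit, int a, int b) {
  assert(a >= 0 && b >= 1);
  for (int32_t x = 0; x < limit; ++x) {
    const int32_t shift_then_round = ROUND_POWER_OF_TWO(x >> a, b);
    const int32_t round_once = ROUND_POWER_OF_TWO(x, a + b);
    if (shift_then_round != round_once) return 0;
  }
  return 1;
}
```

By contrast, rounding at both steps can differ from a single rounding: with x = 1 and a = b = 1, rounding twice yields 1, whereas one rounding by a + b yields 0.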
  3. 06 May, 2017 1 commit
  4. 05 May, 2017 2 commits
  5. 04 May, 2017 1 commit
    • Add SSSE3 warp filter + const-ify warp filters · d8a423c6
      David Barker authored
      The SSSE3 filter is very similar to the SSE2 filter, but
      the horizontal pass is sped up by using the 8x8->16
      multiplies added in SSSE3.
      
      Also apply const-correctness to all versions of the filter.
      
      The timings of the existing filters are unchanged, and the
      lowbd SSSE3 filter is ~17% faster than the lowbd SSE2 filter.
      
      Timings per 8x8 block:
      lowbd SSE2: 320ns
      lowbd SSSE3: 273ns
      highbd SSSE3: 300ns
      
      Filter output is unchanged.
      
      Change-Id: Ifb428a33b106d900cde1b080794796c0754ae182
      d8a423c6
  6. 03 May, 2017 4 commits
  7. 02 May, 2017 2 commits
  8. 01 May, 2017 3 commits
    • labs() -> llabs() · 321357ee
      Yaowu Xu authored
      llabs() takes a long long input parameter, which is wide enough for
      int64_t, and therefore fixes warnings about explicit type conversion
      from int64_t to long.
      
      Change-Id: I2569a5c7e425e3690f5dc7a607bad2539c2324f6
      321357ee
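
As a minimal sketch of the change described in this commit, the helper below (abs_i64 is a hypothetical name, not taken from the codebase) uses llabs() so that a 64-bit value is never narrowed to long, which may be only 32 bits wide on some platforms.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: absolute value of a 64-bit quantity.
   llabs() takes and returns long long, which is at least 64 bits wide,
   so no narrowing happens even where long is 32 bits. */
static int64_t abs_i64(int64_t v) { return (int64_t)llabs((long long)v); }
```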
    • Avoid left shift of negative values · cc6bdab7
      Yaowu Xu authored
      Convert shifts of int/int64 into multiplications
      
      Change-Id: I3d7ef400249096a6c3712c46f59c35c3ddfde5ca
      cc6bdab7
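
Left-shifting a negative signed value is undefined behavior in C, which is the usual reason for this kind of change. A minimal sketch of the replacement, using a hypothetical helper name (scale_by_pow2) and assuming the product fits in 64 bits:

```c
#include <stdint.h>

/* Hypothetical helper: scale v by 2^bits.
   Instead of v << bits, which is undefined for negative v, multiply by
   a power of two built from a positive constant. */
static int64_t scale_by_pow2(int64_t v, int bits) {
  return v * ((int64_t)1 << bits);
}
```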
    • Turn off SSE2 version of warping temporarily · 1abf447b
      Debargha Mukherjee authored
      Temporarily force C version until the SSE2 version is fixed
      
      Change-Id: I51450068259f998d178b1c681872e59d056b254b
      1abf447b
  9. 28 Apr, 2017 3 commits
    • Revert "Limit to 192 filters for warp, clamp index since in some cases index 192" · 79362e33
      Debargha Mukherjee authored
      This reverts commit 266db85d.
      
      Reason for revert: Reverting to prevent software slowdown. Will be implemented differently in a separate patch.
      
      Change-Id: I386a9661c87d69e22761e5c01507f2f1f968433f
      79362e33
    • Fix test failures and warnings of WARPED_MOTION · f3e1ead3
      Yue Chen authored
      Properly set the number of projection samples for seg skip blocks
      at the encoder side, to clear a unit test failure when both the seg
      feature and warped_motion are on.
      Clear 'implicit conversions' warnings.
      
      Change-Id: I29e40ffae75880dae2584dbc8772c81321f6d69e
      f3e1ead3
    • Fix encode/decode mismatch with global/warped motion · b62eef7b
      David Barker authored
      When predicting a 4x4 warp block (either using ZEROMV with
      global-motion, or the WARPED_CAUSAL motion mode with
      warped-motion), the warp filter would previously write
      4 bytes to the right of the block.
      
      This caused encode/decode mismatches when encoding with
      multiple threads and tile_cols > 1, since in that case
      we could end up overwriting already-generated pixels from
      the next tile across.
      
      This patch changes the filter so that we only overwrite the
      intended pixels.
      
      Change-Id: I3664b44e872e85aa5ccc0a5781f0f9ad994a5b80
      b62eef7b
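
The kind of fix described in this commit can be sketched as clamping the store width to the block width. Everything named here is hypothetical: store_row and FILTER_OUT_W are illustrative, and the assumption that the filter produces 8-pixel rows is inferred only from the "4 bytes to the right" of a 4x4 block mentioned in the message.

```c
#include <stdint.h>
#include <string.h>

#define FILTER_OUT_W 8 /* assumed: the filter produces rows 8 pixels wide */

/* Hypothetical helper: copy one filtered row into the prediction buffer,
   writing only block_w pixels so that nothing to the right of the block
   (for example, pixels belonging to the next tile) is touched. */
static void store_row(uint8_t *dst, const uint8_t *filtered, int block_w) {
  const int n = block_w < FILTER_OUT_W ? block_w : FILTER_OUT_W;
  memcpy(dst, filtered, (size_t)n);
}
```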
  10. 27 Apr, 2017 1 commit
  11. 26 Apr, 2017 2 commits
  12. 24 Apr, 2017 1 commit
  13. 21 Apr, 2017 1 commit
    • Revert "warp_affine_c: Refactor highbd and lowbd versions." · 0d08afdc
      Urvang Joshi authored
      This reverts commit 8cd0e7ef.
      
      Reason for revert:
      This change breaks av1_warp_affine_c when CONFIG_HIGHBITDEPTH is enabled.
      
      In particular, running ./test_libaom --gtest_filter=*Warp* compiled with --enable-warped-motion --enable-highbitdepth shows several test failures, followed by a segmentation fault when it gets up to test SSE2/AV1WarpFilterTest.CheckOutput/4
      
      The tricky part is that the use of the lowbd version of the function depends on a mix of two conditions:
      (1) Compile time check for CONFIG_HIGHBITDEPTH and
      (2) Run time check to see if bit-depth == 8
      So, it is tricky to refactor.
      
      BUG=aomedia:442
      
      Change-Id: I610c537fb65bde4f357185a13081639f906351de
      0d08afdc
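
The two conditions listed in this revert message can be illustrated with a small dispatch sketch. The enum and function names are hypothetical, and CONFIG_HIGHBITDEPTH is given a default value here only so that the fragment compiles on its own.

```c
#ifndef CONFIG_HIGHBITDEPTH
#define CONFIG_HIGHBITDEPTH 1 /* assumed build setting for this sketch */
#endif

/* Hypothetical labels for the two filter variants. */
enum warp_path { WARP_LOWBD, WARP_HIGHBD };

/* Hypothetical selector combining both checks. */
static enum warp_path choose_warp_path(int bit_depth) {
#if CONFIG_HIGHBITDEPTH
  /* (2) Run-time check: only 8-bit content takes the lowbd path. */
  return bit_depth == 8 ? WARP_LOWBD : WARP_HIGHBD;
#else
  /* (1) Compile-time check: without high bit depth support, only the
     lowbd path exists. */
  (void)bit_depth;
  return WARP_LOWBD;
#endif
}
```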
  14. 20 Apr, 2017 1 commit
  15. 17 Apr, 2017 1 commit
  16. 13 Apr, 2017 1 commit
    • Adds option to use 1/32 subpel precision for gm/wm · 16056f5b
      Debargha Mukherjee authored
      Adds filters for 1/32 subpel precision for warping.
      To use 1/32 subpel precision, set WARPEDPIXEL_PREC_BITS to 5.
      By default, WARPEDPIXEL_PREC_BITS is set to 6 in common/mv.h,
      which gives 1/64 subpel precision.
      
      If 1/32 precision is used, the BDRATE gain drops slightly:
      on lowres:
      -1.101% (vs. -1.186% with 1/64) w/warped-motion
      -1.587% (vs. -1.650% with 1/64) w/global-motion
      
      on cam_lowres:
      -2.638% (vs. -2.707% with 1/64) w/warped-motion
      -3.396% (vs. -3.453% with 1/64) w/global-motion
      
      Change-Id: I82fbfddaad9bd9be658fe382401d212833c7ceef
      16056f5b
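
To make the relationship between WARPEDPIXEL_PREC_BITS and the subpel precision concrete, here is a small sketch. Both helpers are hypothetical and assume non-negative fixed-point positions with prec_bits fractional bits.

```c
/* Hypothetical helper: number of subpel positions per pixel for a given
   WARPEDPIXEL_PREC_BITS value (6 -> 64, 5 -> 32). */
static int subpel_positions(int prec_bits) { return 1 << prec_bits; }

/* Hypothetical helper: split a non-negative fixed-point position into its
   whole-pixel part and its fractional (subpel) index. */
static void split_position(int pos, int prec_bits, int *whole, int *frac) {
  *whole = pos >> prec_bits;
  *frac = pos & ((1 << prec_bits) - 1);
}
```

For example, with 6 precision bits a position of 209 splits into whole pixel 3 and fractional index 17 out of 64.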
  17. 12 Apr, 2017 1 commit
  18. 11 Apr, 2017 2 commits
  19. 10 Apr, 2017 2 commits
  20. 08 Apr, 2017 1 commit
  21. 07 Apr, 2017 1 commit
  22. 06 Apr, 2017 1 commit
    • Prepare for vectorizing highbd warp filter · 2bcf280e
      David Barker authored
      This applies the same refactorings to highbd_warp_plane
      which were applied to warp_plane a while ago, and lays the
      groundwork for the relevant tests.
      
      Change-Id: Ic4c00bce1accc5a3624bba0c3b4b325e69a42c1a
      2bcf280e
  23. 05 Apr, 2017 1 commit
  24. 04 Apr, 2017 1 commit
    • Reduce precision in find_affine_int() · f2f3bcd8
      Debargha Mukherjee authored
      Reduces precision in the find_affine_int() function and lowers the
      maximum allowed mv from 1024 to 512.
      Negligible impact on coding efficiency.
      
      Change-Id: I76d4c6824528e3f940d1275fe0bd22d71015a8d0
      f2f3bcd8
  25. 02 Apr, 2017 1 commit
    • Use 1 sample per neighbor for local warping model estimation · 5558e5da
      Yue Chen authored
      Only 1 sample needs to be collected per neighbor. A maximum of 8
      neighbors are used.
      In LS estimation, the projection samples (sx, sy)->(dx, dy) are
      intentionally smoothed by assuming that 3 shifted versions,
      (sx, sy+n)->(dx, dy+n), (sx+n, sy)->(dx+n, dy), and (sx+n,
      sy+n)->(dx+n, dy+n), also contribute to the estimation.
      For example, instead of using A[0] += sx^2, we use the sum of
      squares of the source x of the four points,
      A[0] += 4*sx^2 + 4*n*sx + 2*n^2.
      This adds little computational overhead, and the coding gain is
      mostly the same as in the old version. Without the smoothing, we
      would lose 0.3% on lowres.
      
      Change-Id: I04be32cffa525f7dc8ee583c0bf211d7bdc6e609
      5558e5da
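
The smoothed sum of squares in this commit message can be worked through directly. The four points have source x values sx, sx, sx+n and sx+n, so the sum of their squares expands to 4*sx^2 + 4*n*sx + 2*n^2 (note that the constant term is 2*n^2). The sketch below compares the direct sum with that closed form; both function names are hypothetical.

```c
#include <stdint.h>

/* Direct sum of squared source-x values over the four points
   (sx, sy), (sx, sy+n), (sx+n, sy), (sx+n, sy+n). */
static int64_t sum_sq_direct(int64_t sx, int64_t n) {
  return sx * sx + sx * sx + (sx + n) * (sx + n) + (sx + n) * (sx + n);
}

/* Closed form of the same sum, so only one sample per neighbor has to be
   stored while the shifted points still contribute. */
static int64_t sum_sq_closed(int64_t sx, int64_t n) {
  return 4 * sx * sx + 4 * n * sx + 2 * n * n;
}
```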
  26. 31 Mar, 2017 1 commit
  27. 30 Mar, 2017 1 commit
    • A few fixes for global motion · 11f0e40d
      Debargha Mukherjee authored
      Handles a rare division by 0 case.
      Also adds a check on the global motion parameters, disabling global
      motion if the parameters obtained are outside the range that the
      shear supports. This fixes a rare assert failure.
      Also changes the recode loop threshold somewhat.
      
      Change-Id: I4c6e74b914ac653cd9caa0563d78b0a19a2a8627
      11f0e40d