1. 05 May, 2017 1 commit
  2. 04 May, 2017 1 commit
    • Add SSSE3 warp filter + const-ify warp filters · d8a423c6
      David Barker authored
      The SSSE3 filter is very similar to the SSE2 filter, but
      the horizontal pass is sped up by using the 8x8->16
      multiplies added in SSSE3.
      
      Also apply const-correctness to all versions of the filter.
      
      The timings of the existing filters are unchanged, and the
      lowbd SSSE3 filter is ~17% faster than the lowbd SSE2 filter.
      
      Timings per 8x8 block:
      lowbd SSE2: 320ns
      lowbd SSSE3: 273ns
      highbd SSSE3: 300ns
      
      Filter output is unchanged.
      
      Change-Id: Ifb428a33b106d900cde1b080794796c0754ae182
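The "8x8->16 multiplies" in the commit above most likely refer to the SSSE3 instruction PMADDUBSW (intrinsic `_mm_maddubs_epi16`), which multiplies unsigned 8-bit values by signed 8-bit values and adds adjacent products into saturated 16-bit lanes. The following is only a scalar sketch of that arithmetic, not the committed filter; `maddubs_lane` and `filter8_pairs` are hypothetical names.

```c
#include <stdint.h>

/* Scalar model of one 16-bit lane of PMADDUBSW: two unsigned-by-signed
 * 8-bit products are added and the sum is saturated to int16_t. */
static int16_t maddubs_lane(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  int32_t sum = (int32_t)a0 * b0 + (int32_t)a1 * b1;
  if (sum > INT16_MAX) sum = INT16_MAX;
  if (sum < INT16_MIN) sum = INT16_MIN;
  return (int16_t)sum;
}

/* Hypothetical 8-tap horizontal filter: four pair-sums, one per lane,
 * accumulated in 32 bits. */
static int32_t filter8_pairs(const uint8_t px[8], const int8_t coef[8]) {
  int32_t acc = 0;
  for (int i = 0; i < 8; i += 2)
    acc += maddubs_lane(px[i], coef[i], px[i + 1], coef[i + 1]);
  return acc;
}
```

Because each lane consumes two pixel/coefficient pairs at once, an 8-tap row needs half as many multiply instructions as a 16x16 multiply-add would, which is one plausible source of the reported speedup.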
  3. 03 May, 2017 4 commits
  4. 02 May, 2017 2 commits
  5. 01 May, 2017 3 commits
    • labs() -> llabs() · 321357ee
      Yaowu Xu authored
      llabs() takes int64_t as its input parameter, and therefore fixes
      warnings of explicit type conversion from int64_t to long.
      
      Change-Id: I2569a5c7e425e3690f5dc7a607bad2539c2324f6
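A minimal sketch of the pattern this commit describes, assuming a 64-bit difference is being passed to an absolute-value function. `abs_diff64` is a hypothetical helper, not a function from the codebase.

```c
#include <stdint.h>
#include <stdlib.h>

/* Absolute difference of two 64-bit values. llabs() takes long long, which
 * is at least 64 bits wide, so the argument is not narrowed. labs() takes
 * long, which is only 32 bits on some platforms. */
static int64_t abs_diff64(int64_t a, int64_t b) {
  return (int64_t)llabs((long long)(a - b));
}
```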
    • Avoid left shift of negative values · cc6bdab7
      Yaowu Xu authored
      Convert shifts of int/int64 into multiplications
      
      Change-Id: I3d7ef400249096a6c3712c46f59c35c3ddfde5ca
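Left-shifting a negative signed integer is undefined behavior in C, which is presumably why the shifts were converted. A sketch of the conversion, with `scale_pow2` as a hypothetical name:

```c
#include <stdint.h>

/* Scale v by 2^bits. Written as a multiplication because `v << bits` is
 * undefined when v is negative. The shift of the constant 1 is well defined
 * for 0 <= bits < 63. */
static int64_t scale_pow2(int64_t v, int bits) {
  return v * ((int64_t)1 << bits);
}
```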
    • Turn off SSE2 version of warping temporarily · 1abf447b
      Debargha Mukherjee authored
      Temporarily force C version until the SSE2 version is fixed
      
      Change-Id: I51450068259f998d178b1c681872e59d056b254b
  6. 28 Apr, 2017 3 commits
    • Revert "Limit to 192 filters for warp, clamp index since in some cases index 192" · 79362e33
      Debargha Mukherjee authored
      This reverts commit 266db85d.
      
      Reason for revert: Reverting to prevent software slowdown. Will be implemented differently in a separate patch.
      
      Change-Id: I386a9661c87d69e22761e5c01507f2f1f968433f
    • Fix test failures and warnings of WARPED_MOTION · f3e1ead3
      Yue Chen authored
      Properly set the number of projection samples for seg skip blocks
      at the encoder side to clear a unit test failure when both the seg
      feature and warped_motion are on.
      Also clear 'implicit conversion' warnings.
      
      Change-Id: I29e40ffae75880dae2584dbc8772c81321f6d69e
    • Fix encode/decode mismatch with global/warped motion · b62eef7b
      David Barker authored
      When predicting a 4x4 warp block (either using ZEROMV with
      global-motion, or the WARPED_CAUSAL motion mode with
      warped-motion), the warp filter would previously write
      4 bytes to the right of the block.
      
      This caused encode/decode mismatches when encoding with
      multiple threads and tile_cols > 1, since in that case
      we could end up overwriting already-generated pixels from
      the next tile across.
      
      This patch changes the filter so that we only overwrite the
      intended pixels.
      
      Change-Id: I3664b44e872e85aa5ccc0a5781f0f9ad994a5b80
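One way such a fix can look, sketched under the assumption that the filter computes 8 pixels per row internally and that only the store needs to be limited to the block width. `store_row` is a hypothetical helper; the actual patch may be structured differently.

```c
#include <stdint.h>
#include <string.h>

/* Copy one filtered row into the destination, limited to the block width.
 * row holds 8 computed pixels, but only the first block_w are written, so a
 * 4-wide block never touches the 4 bytes to its right. */
static void store_row(uint8_t *dst, const uint8_t row[8], int block_w) {
  const int n = block_w < 8 ? block_w : 8;
  memcpy(dst, row, (size_t)n);
}
```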
  7. 27 Apr, 2017 1 commit
  8. 26 Apr, 2017 2 commits
  9. 24 Apr, 2017 1 commit
  10. 21 Apr, 2017 1 commit
    • Revert "warp_affine_c: Refactor highbd and lowbd versions." · 0d08afdc
      Urvang Joshi authored
      This reverts commit 8cd0e7ef.
      
      Reason for revert:
      This change breaks av1_warp_affine_c when CONFIG_HIGHBITDEPTH is enabled.
      
      In particular, running ./test_libaom --gtest_filter=*Warp* compiled with --enable-warped-motion --enable-highbitdepth shows several test failures, followed by a segmentation fault when it reaches the test SSE2/AV1WarpFilterTest.CheckOutput/4

      The tricky part is that the use of the lowbd version of the function depends on a mix of two conditions:
      (1) Compile time check for CONFIG_HIGHBITDEPTH and
      (2) Run time check to see if bit-depth == 8
      So, it is tricky to refactor.
      
      BUG=aomedia:442
      
      Change-Id: I610c537fb65bde4f357185a13081639f906351de
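The two conditions named in the revert message can be pictured as follows. This is only a sketch of the described selection logic: `WarpPath` and `select_warp_path` are hypothetical, and the default value of `CONFIG_HIGHBITDEPTH` here is an assumption made so the snippet compiles on its own.

```c
#ifndef CONFIG_HIGHBITDEPTH
#define CONFIG_HIGHBITDEPTH 1 /* assumed build setting for this sketch */
#endif

typedef enum { PATH_LOWBD, PATH_HIGHBD } WarpPath; /* hypothetical */

static WarpPath select_warp_path(int bit_depth) {
#if CONFIG_HIGHBITDEPTH
  /* (1) compile-time check passed; (2) run-time check on the bit depth. */
  return bit_depth == 8 ? PATH_LOWBD : PATH_HIGHBD;
#else
  /* Without high bit depth support only the lowbd path exists. */
  (void)bit_depth;
  return PATH_LOWBD;
#endif
}
```

A refactoring that merges the lowbd and highbd functions has to preserve both branches of this decision, which is the difficulty the message points to.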
  11. 20 Apr, 2017 1 commit
  12. 17 Apr, 2017 1 commit
  13. 13 Apr, 2017 1 commit
    • Adds option to use 1/32 subpel precision for gm/wm · 16056f5b
      Debargha Mukherjee authored
      Adds filters for 1/32 subpel precision for warping.
      To use 1/32 subpel precision make WARPEDPIXEL_PREC_BITS 5.
      By default, WARPEDPIXEL_PREC_BITS is set as 6 in common/mv.h,
      which uses 1/64 subpel precision.
      
      If 1/32 precision is used, the BDRATE gain drops slightly:
      on lowres:
      -1.101% (vs. -1.186% with 1/64) w/warped-motion
      -1.587% (vs. -1.650% with 1/64) w/global-motion

      on cam_lowres:
      -2.638% (vs. -2.707% with 1/64) w/warped-motion
      -3.396% (vs. -3.453% with 1/64) w/global-motion
      
      Change-Id: I82fbfddaad9bd9be658fe382401d212833c7ceef
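To make the relation between the precision bits and the subpel grid concrete, here is a small sketch. The helper names are hypothetical and the floating-point rounding is only illustrative; the codec itself works in fixed point.

```c
/* Number of subpel positions per full pixel: 32 for 5 bits, 64 for 6. */
static int subpel_positions(int prec_bits) {
  return 1 << prec_bits;
}

/* Round a fractional offset in [0, 1) to the nearest subpel index. The
 * result can equal subpel_positions() when frac rounds up to the next
 * full pixel. */
static int subpel_index(double frac, int prec_bits) {
  return (int)(frac * subpel_positions(prec_bits) + 0.5);
}
```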
  14. 12 Apr, 2017 1 commit
  15. 11 Apr, 2017 2 commits
  16. 10 Apr, 2017 2 commits
  17. 08 Apr, 2017 1 commit
  18. 07 Apr, 2017 1 commit
  19. 06 Apr, 2017 1 commit
    • Prepare for vectorizing highbd warp filter · 2bcf280e
      David Barker authored
      This applies the same refactorings to highbd_warp_plane
      which were applied to warp_plane a while ago, and lays the
      groundwork for the relevant tests.
      
      Change-Id: Ic4c00bce1accc5a3624bba0c3b4b325e69a42c1a
  20. 05 Apr, 2017 1 commit
  21. 04 Apr, 2017 1 commit
    • Reduce precision in find_affine_int() · f2f3bcd8
      Debargha Mukherjee authored
      Reduces precision in the find_affine_int() function. Lowers the
      maximum allowed mv from 1024 to 512.
      Negligible impact on coding efficiency.
      
      Change-Id: I76d4c6824528e3f940d1275fe0bd22d71015a8d0
  22. 02 Apr, 2017 1 commit
    • Use 1 sample per neighbor for local warping model estimation · 5558e5da
      Yue Chen authored
      Only 1 sample needs to be collected. Max of 8 neighbors are
      used.
      In LS estimation, the projection samples (sx, sy)->(dx, dy) are
      intentionally smoothed by assuming 3 shifted versions
      (sx, sy+n)->(dx, dy+n), (sx+n, sy)->(dx+n, dy), (sx+n,
      sy+n)->(dx+n, dy+n) also contribute to the estimation.
      For example, instead of using A[0] = sx^2, we use the sum of
      squares of source x of four points, A[0] += 4*sx^2+4*n*sx+2*n^2.
      This adds little computational overhead, and the coding gain is
      mostly the same as the old version. Without the smoothing, we
      lose 0.3% on lowres.
      
      Change-Id: I04be32cffa525f7dc8ee583c0bf211d7bdc6e609
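The smoothed accumulation for the x^2 term can be checked with a short sketch. Of the four points, two have source x equal to sx and two have sx + n, so expanding the four squares gives 4*sx^2 + 4*n*sx + 2*n^2. Both function names are hypothetical.

```c
#include <stdint.h>

/* Sum of squared source x over the four points (sx, sy), (sx, sy+n),
 * (sx+n, sy), (sx+n, sy+n): two points use sx, two use sx + n. */
static int64_t sum_sq_four_points(int64_t sx, int64_t n) {
  return sx * sx + sx * sx + (sx + n) * (sx + n) + (sx + n) * (sx + n);
}

/* Same quantity in closed form, which costs no extra loop over samples. */
static int64_t sum_sq_closed_form(int64_t sx, int64_t n) {
  return 4 * sx * sx + 4 * n * sx + 2 * n * n;
}
```

The closed form is why the smoothing adds little overhead: each collected sample still contributes a single update per matrix entry.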
  23. 31 Mar, 2017 1 commit
  24. 30 Mar, 2017 1 commit
    • A few fixes for global motion · 11f0e40d
      Debargha Mukherjee authored
      Handles a rare division by 0 case.
      Also adds a check on global motion parameters to disable
      if the parameters obtained are outside the range that the
      shear supports. This fixes a rare assert failure.
      Also changes the recode loop threshold somewhat.
      
      Change-Id: I4c6e74b914ac653cd9caa0563d78b0a19a2a8627
  25. 23 Mar, 2017 2 commits
    • Simplify warped motion estimation to use 2d ls · b9370acd
      Debargha Mukherjee authored
      Use a simpler warped motion estimation scheme that uses a 2d
      least squares problem, where the underlying assumption
      applied is that the motion vector computed at the center
      of the current block using the warp model is exactly the same
      as the motion vector transmitted for the block.
      
      The main motivation is to reduce the complexity of the
      estimation process.
      
      Coding efficiency drops by about 0.25% on lowres: the gain is
      now -1.152% (from -1.396%).
      
      Also, removes code for non-approximate division and bakes
      approximate division in.
      
      Change-Id: Ie4ad8e32593b09f7e1920c70b0b92545236ddc54
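As an illustration of what a 2d least-squares problem looks like once the translation is pinned by the block's motion vector, the sketch below fits q ≈ a*px + b*py through the 2x2 normal equations. It assumes sample positions are already expressed relative to the block center and uses floating point for clarity; `solve_ls_2d` is a hypothetical name and the committed code uses integer arithmetic.

```c
/* Fit q[i] ~= a * px[i] + b * py[i] in the least-squares sense.
 * Solves [sxx sxy; sxy syy] * [a; b] = [sxq; syq] by Cramer's rule.
 * Returns 0 if the system is singular, 1 on success. */
static int solve_ls_2d(const double *px, const double *py, const double *q,
                       int n, double *a, double *b) {
  double sxx = 0, sxy = 0, syy = 0, sxq = 0, syq = 0;
  for (int i = 0; i < n; ++i) {
    sxx += px[i] * px[i];
    sxy += px[i] * py[i];
    syy += py[i] * py[i];
    sxq += px[i] * q[i];
    syq += py[i] * q[i];
  }
  const double det = sxx * syy - sxy * sxy;
  if (det == 0.0) return 0;
  *a = (syy * sxq - sxy * syq) / det;
  *b = (sxx * syq - sxy * sxq) / det;
  return 1;
}
```

Running the fit once for the projected x coordinates and once for the projected y coordinates yields the four non-translational affine parameters, with only a 2x2 system to solve each time.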
    • Split current block samples for warp estimation · e8e6cad7
      Debargha Mukherjee authored
      Change-Id: Iebc74024475c7cb88650b65df9f23b1a5e70021c
  26. 17 Mar, 2017 1 commit
    • Replace division in warped motion least squares · 082d4df7
      Debargha Mukherjee authored
      Replaces the int64 and int32 divisions in least-squares and
      gamma or delta computation with a mechanism that decomposes
      the divisor D such that 1/D = y * 2^-k where y is obtained
      from a lookup table indexed by 8 highest bits of the difference
      D - 2^floor(log2(D)). The main complexity is now only from
      computing this decomposition, which is essentially equivalent
      to finding floor(log2(D)) (position of highest
      bit in a 64-bit integer).
      
      Also includes an out of memory bug fix and some cleanups.
      
      Change-Id: I9247fdff5f6b4191175d4b4656357bfff626f02c
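The decomposition 1/D = y * 2^-k can be sketched as below. The 8-bit table index comes from the commit message; the 14-bit precision of y, the table contents, and all function names are assumptions made for illustration, not the committed code.

```c
#include <stdint.h>

#define LUT_BITS 8       /* index width, from the commit message */
#define LUT_PREC_BITS 14 /* assumed fixed-point precision of y */

static uint16_t recip_lut[(1 << LUT_BITS) + 1];

/* Fill the table: y[i] ~= 2^PREC / (1 + i / 2^LUT_BITS), rounded. */
static void init_recip_lut(void) {
  const uint32_t num = 1u << (LUT_PREC_BITS + LUT_BITS);
  for (uint32_t i = 0; i <= (1u << LUT_BITS); ++i) {
    const uint32_t den = (1u << LUT_BITS) + i;
    recip_lut[i] = (uint16_t)((num + den / 2) / den);
  }
}

/* Position of the highest set bit; d must be nonzero. */
static int floor_log2_u64(uint64_t d) {
  int n = 0;
  while (d >>= 1) ++n;
  return n;
}

/* Return y and set *shift = k such that 1/d ~= y * 2^-k. */
static uint16_t resolve_divisor(uint64_t d, int *shift) {
  const int n = floor_log2_u64(d);
  const uint64_t e = d - ((uint64_t)1 << n); /* D - 2^floor(log2(D)) */
  uint64_t f;                                /* top LUT_BITS bits of e */
  if (n > LUT_BITS)
    f = (e + ((uint64_t)1 << (n - LUT_BITS - 1))) >> (n - LUT_BITS);
  else
    f = e << (LUT_BITS - n);
  *shift = n + LUT_PREC_BITS;
  return recip_lut[f];
}

/* Approximate num / d with one multiply and one shift.
 * Assumes num * y fits in 64 bits. */
static uint64_t approx_div(uint64_t num, uint64_t d) {
  int shift;
  const uint64_t y = resolve_divisor(d, &shift);
  if (shift >= 64) return 0; /* quotient is below 1 in this range */
  return (num * y + ((uint64_t)1 << (shift - 1))) >> shift;
}
```

After a single call to `init_recip_lut()`, `approx_div(6400, 64)` is exact because powers of two hit table entry 0, while other divisors carry a small relative error from quantizing to 8 index bits.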
  27. 02 Mar, 2017 1 commit
    • Some optimizations on integer affine estimation · 93105538
      Debargha Mukherjee authored
      1. Adds a limit on number of candidate samples used for the
      estimation.
      2. Adds a limit on max mv magnitude for use in the least-squares.
      3. Makes some of the internal variables 32-bit.

      Impact on coding efficiency is in the noise range.
      
      Change-Id: I8c1c3216368ceb2e3548660a3b8c159df54a8312
  28. 28 Feb, 2017 1 commit