1. 01 May, 2017 1 commit
  2. 28 Apr, 2017 3 commits
    • Revert "Limit to 192 filters for warp, clamp index since in some cases index 192" · 79362e33
      Debargha Mukherjee authored
      This reverts commit 266db85d.
      
      Reason for revert: Reverting to prevent software slowdown. Will be implemented differently in a separate patch.
      
      Change-Id: I386a9661c87d69e22761e5c01507f2f1f968433f
    • Fix test failures and warnings of WARPED_MOTION · f3e1ead3
      Yue Chen authored
      Properly set the number of projection samples for seg skip blocks
      at the encoder side to clear a unit test failure when both the seg
      feature and warped_motion are on.
      Clear 'implicit conversions' warnings.
      
      Change-Id: I29e40ffae75880dae2584dbc8772c81321f6d69e
    • Fix encode/decode mismatch with global/warped motion · b62eef7b
      David Barker authored
      When predicting a 4x4 warp block (either using ZEROMV with
      global-motion, or the WARPED_CAUSAL motion mode with
      warped-motion), the warp filter would previously write
      4 bytes to the right of the block.
      
      This caused encode/decode mismatches when encoding with
      multiple threads and tile_cols > 1, since in that case
      we could end up overwriting already-generated pixels from
      the next tile across.
      
      This patch changes the filter so that we only overwrite the
      intended pixels.
      
      Change-Id: I3664b44e872e85aa5ccc0a5781f0f9ad994a5b80
  3. 27 Apr, 2017 1 commit
  4. 26 Apr, 2017 2 commits
  5. 24 Apr, 2017 1 commit
  6. 21 Apr, 2017 1 commit
    • Revert "warp_affine_c: Refactor highbd and lowbd versions." · 0d08afdc
      Urvang Joshi authored
      This reverts commit 8cd0e7ef.
      
      Reason for revert:
      This change breaks av1_warp_affine_c when CONFIG_HIGHBITDEPTH is enabled.
      
      In particular, running ./test_libaom --gtest_filter=*Warp* compiled with --enable-warped-motion --enable-highbitdepth shows several test failures, followed by a segmentation fault when it reaches the test SSE2/AV1WarpFilterTest.CheckOutput/4.
      
      The tricky part is that use of the lowbd version of the function depends on a mix of two conditions:
      (1) A compile-time check for CONFIG_HIGHBITDEPTH, and
      (2) A run-time check to see if bit-depth == 8.
      So, it is tricky to refactor.
      
      BUG=aomedia:442
      
      Change-Id: I610c537fb65bde4f357185a13081639f906351de
  7. 20 Apr, 2017 1 commit
  8. 17 Apr, 2017 1 commit
  9. 13 Apr, 2017 1 commit
    • Adds option to use 1/32 subpel precision for gm/wm · 16056f5b
      Debargha Mukherjee authored
      Adds filters for 1/32 subpel precision for warping.
      To use 1/32 subpel precision make WARPEDPIXEL_PREC_BITS 5.
      By default, WARPEDPIXEL_PREC_BITS is set as 6 in common/mv.h,
      which uses 1/64 subpel precision.
      
      If 1/32 precision is used, the BDRATE gain drops slightly:
      on lowres:
      -1.101% (vs. -1.186% with 1/64) w/warped-motion
      -1.587% (vs. -1.650% with 1/64) w/global-motion
      
      on cam_lowres:
      -2.638% (vs. -2.707% with 1/64) w/warped-motion
      -3.396% (vs. -3.453% with 1/64) w/global-motion
      
      Change-Id: I82fbfddaad9bd9be658fe382401d212833c7ceef
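
To make the precision setting concrete, here is a minimal sketch of how a precision-bits value maps to sub-pixel phases. Only WARPEDPIXEL_PREC_BITS comes from the commit message; the helper names and the fixed-point layout are assumptions for illustration, not the libaom code.

```python
# Hypothetical helpers: prec_bits plays the role of WARPEDPIXEL_PREC_BITS.

def subpel_phases(prec_bits):
    """Number of fractional positions per full pixel at this precision."""
    return 1 << prec_bits


def split_position(pos, prec_bits):
    """Split a fixed-point coordinate into (integer pixel, fractional phase).

    Assumes the low prec_bits bits of pos hold the fractional part.
    """
    return pos >> prec_bits, pos & ((1 << prec_bits) - 1)
```

With 5 bits there are 32 phases per pixel and with 6 bits there are 64, so the same fixed-point value lands on a different integer pixel and phase depending on the setting.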
  10. 12 Apr, 2017 1 commit
  11. 11 Apr, 2017 2 commits
  12. 10 Apr, 2017 2 commits
  13. 08 Apr, 2017 1 commit
  14. 07 Apr, 2017 1 commit
  15. 06 Apr, 2017 1 commit
    • Prepare for vectorizing highbd warp filter · 2bcf280e
      David Barker authored
      This applies the same refactorings to highbd_warp_plane
      which were applied to warp_plane a while ago, and lays the
      groundwork for the relevant tests.
      
      Change-Id: Ic4c00bce1accc5a3624bba0c3b4b325e69a42c1a
  16. 05 Apr, 2017 1 commit
  17. 04 Apr, 2017 1 commit
    • Reduce precision in find_affine_int() · f2f3bcd8
      Debargha Mukherjee authored
      Reduces precision in the find_affine_int() function. Lowers the maximum
      allowed mv from 1024 to 512.
      Negligible impact on coding efficiency.
      
      Change-Id: I76d4c6824528e3f940d1275fe0bd22d71015a8d0
  18. 02 Apr, 2017 1 commit
    • Use 1 sample per neighbor for local warping model estimation · 5558e5da
      Yue Chen authored
      Only 1 sample needs to be collected per neighbor. A maximum of 8
      neighbors is used.
      In LS estimation, the projection samples (sx, sy)->(dx, dy) are
      intentionally smoothed by assuming that 3 shifted versions,
      (sx, sy+n)->(dx, dy+n), (sx+n, sy)->(dx+n, dy), and (sx+n,
      sy+n)->(dx+n, dy+n), also contribute to the estimation.
      For example, instead of using A[0] = sx^2, we use the sum of
      squares of source x of the four points, A[0] += 4*sx^2+4*n*sx+2*n^2.
      This does not add much computational overhead. Coding
      gain is mostly the same as the old version. If no smoothing is added,
      we lose 0.3% on lowres.
      
      Change-Id: I04be32cffa525f7dc8ee583c0bf211d7bdc6e609
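
The smoothing term can be checked by brute force. The sketch below sums the squared source x over the original sample and its three shifted copies and compares the result with the expanded closed form, whose constant term works out to 2*n^2. The function names are hypothetical and this is not the libaom code.

```python
def smoothed_square(sx, n):
    """Sum of squared source x over the sample and its three shifted copies."""
    # x-coordinates of (sx, sy), (sx, sy+n), (sx+n, sy), (sx+n, sy+n)
    points_x = [sx, sx, sx + n, sx + n]
    return sum(x * x for x in points_x)


def closed_form(sx, n):
    """Expanded form: 2*sx^2 + 2*(sx+n)^2 = 4*sx^2 + 4*n*sx + 2*n^2."""
    return 4 * sx * sx + 4 * n * sx + 2 * n * n
```

Because the closed form depends only on sx and n, the three extra points cost a few multiply-adds per sample rather than three extra samples.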
  19. 31 Mar, 2017 1 commit
  20. 30 Mar, 2017 1 commit
    • A few fixes for global motion · 11f0e40d
      Debargha Mukherjee authored
      Handles a rare division-by-zero case.
      Also adds a check on global motion parameters to disable
      if the parameters obtained are outside the range that the
      shear supports. This fixes a rare assert failure.
      Also changes the recode loop threshold somewhat.
      
      Change-Id: I4c6e74b914ac653cd9caa0563d78b0a19a2a8627
  21. 23 Mar, 2017 2 commits
    • Simplify warped motion estimation to use 2d ls · b9370acd
      Debargha Mukherjee authored
      Use a simpler warped motion estimation scheme that uses a 2d
      least squares problem, where the underlying assumption
      applied is that the motion vector computed at the center
      of the current block using the warp model is exactly the same
      as the motion vector transmitted for the block.
      
      The main motivation is to reduce the complexity of the
      estimation process.
      
      Coding efficiency drop is about +0.25% on lowres:
      -1.152% (from -1.396%).
      
      Also, removes code for non-approximate division and bakes
      approximate division in.
      
      Change-Id: Ie4ad8e32593b09f7e1920c70b0b92545236ddc54
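
A minimal sketch of the 2d least-squares idea, under the assumption that sample positions are expressed as offsets from the block center (source) and from the center displaced by the transmitted motion vector (destination). With the translation pinned that way, only the 2x2 linear part remains, and each of its rows is a separate two-unknown least-squares problem. All names are hypothetical and exact fractions stand in for the integer arithmetic of the real code.

```python
from fractions import Fraction


def fit_linear_part(src, dst):
    """Fit the 2x2 matrix M minimizing sum |M*p - q|^2 over offset pairs.

    src: (x, y) offsets from the block center.
    dst: (x, y) offsets from the motion-compensated block center.
    Returns [(m00, m01), (m10, m11)], or None if the samples are degenerate.
    """
    # Entries of the 2x2 normal matrix, shared by both rows of M.
    sxx = sum(x * x for x, _ in src)
    sxy = sum(x * y for x, y in src)
    syy = sum(y * y for _, y in src)
    det = sxx * syy - sxy * sxy
    if det == 0:
        return None  # e.g. all samples lie on one line through the center

    rows = []
    for k in (0, 1):  # k = 0 fits the x output, k = 1 fits the y output
        bx = sum(s[0] * d[k] for s, d in zip(src, dst))
        by = sum(s[1] * d[k] for s, d in zip(src, dst))
        # Closed-form inverse of the symmetric 2x2 normal matrix.
        a = Fraction(syy * bx - sxy * by, det)
        b = Fraction(sxx * by - sxy * bx, det)
        rows.append((a, b))
    return rows
```

Both rows reuse the same determinant, so a single reciprocal serves the whole block in this sketch.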
    • Split current block samples for warp estimation · e8e6cad7
      Debargha Mukherjee authored
      Change-Id: Iebc74024475c7cb88650b65df9f23b1a5e70021c
  22. 17 Mar, 2017 1 commit
    • Replace division in warped motion least squares · 082d4df7
      Debargha Mukherjee authored
      Replaces the int64 and int32 divisions in least-squares and
      gamma or delta computation with a mechanism that decomposes
      the divisor D such that 1/D = y * 2^-k where y is obtained
      from a lookup table indexed by the 8 highest bits of the difference
      D - 2^floor(log2(D)). The main complexity is now only from
      computing this decomposition, which is essentially equivalent
      to finding floor(log2(D)) (position of highest
      bit in a 64-bit integer).
      
      Also includes an out of memory bug fix and some cleanups.
      
      Change-Id: I9247fdff5f6b4191175d4b4656357bfff626f02c
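
The decomposition described above can be sketched as follows. The 8-bit table index comes from the commit message; the 14-bit table precision, the truncating index, and all names are assumptions made for illustration, so this should not be read as the libaom implementation.

```python
LUT_BITS = 8    # table is indexed by 8 bits, as in the commit message
PREC_BITS = 14  # assumed fixed-point precision of the table entries

# DIV_LUT[f] approximates 2^PREC_BITS / (1 + f / 2^LUT_BITS), rounded to nearest.
DIV_LUT = [
    ((1 << (PREC_BITS + LUT_BITS)) + (((1 << LUT_BITS) + f) >> 1))
    // ((1 << LUT_BITS) + f)
    for f in range(1 << LUT_BITS)
]


def resolve_divisor(d):
    """Return (y, k) such that 1/d is approximately y * 2**-k, for d > 0."""
    n = d.bit_length() - 1   # floor(log2(d)): position of the highest set bit
    e = d - (1 << n)         # the difference D - 2^floor(log2(D))
    if n > LUT_BITS:
        f = e >> (n - LUT_BITS)   # keep the 8 highest bits of the difference
    else:
        f = e << (LUT_BITS - n)   # small divisors: pad up to 8 bits
    return DIV_LUT[f], n + PREC_BITS


def approx_divide(num, d):
    """Divide using one multiply and one shift instead of a division."""
    y, k = resolve_divisor(d)
    return (num * y) >> k
```

Because the index drops the low bits of the difference, the relative error of this sketch stays below about 1/256, and powers of two are resolved exactly.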
  23. 02 Mar, 2017 1 commit
    • Some optimizations on integer affine estimation · 93105538
      Debargha Mukherjee authored
      1. Adds a limit on the number of candidate samples used for the
      estimation.
      2. Adds a limit on the max mv magnitude for use in the least-squares.
      3. Makes some of the internal variables 32-bit.
      
      Impact on coding efficiency is in the noise range.
      
      Change-Id: I8c1c3216368ceb2e3548660a3b8c159df54a8312
  24. 28 Feb, 2017 1 commit
  25. 27 Feb, 2017 1 commit
    • Integerize warped motion computation · e6eb3b53
      Debargha Mukherjee authored
      Integerizes computation of the least squares for warped motion.
      The model is restricted to only Affine. Affine seems easiest
      to compute and integerize since it can be split into two 3-dim
      least squares problems, as opposed to rotation-zoom which needs
      a 4-dim least-squares problem to be solved.
      The current implementation requires only one division per block.
      
      BDRATE impact is minimal. The upgrade to the affine model improves
      coding efficiency but integerization also degrades efficiency a
      little. Overall there is a net gain of about -0.07% BDRATE on
      the lowres set.
      BDRATE lowres: -1.113% with --enable-warped-motion vs. without
      (up from -1.044%).
      
      Change-Id: I6b9216ac0737d76f59054293eabee48e17739ec4
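
The split into two 3-dim problems can be illustrated with a small sketch. Each output coordinate of an affine model is a linear function of (x, y, 1), so the x and y outputs are fitted independently while sharing one 3x3 normal matrix. The names below are hypothetical, and exact fractions with Cramer's rule stand in for the integer arithmetic of the real code.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b, d):
    """Solve a * x = b by Cramer's rule, given d = det3(a) != 0."""
    out = []
    for col in range(3):
        # Replace one column of a with b and take the determinant ratio.
        m = [[b[r] if c == col else a[r][c] for c in range(3)] for r in range(3)]
        out.append(Fraction(det3(m), d))
    return out


def fit_affine(src, dst):
    """Fit x' = p0*x + p1*y + p2 and y' = q0*x + q1*y + q2 by least squares."""
    rows = [(x, y, 1) for x, y in src]
    # Normal matrix: identical for both 3-dim problems.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    d = det3(a)
    if d == 0:
        return None  # samples are collinear or too few
    params = []
    for k in (0, 1):
        b = [sum(r[i] * t[k] for r, t in zip(rows, dst)) for i in range(3)]
        params.append(solve3(a, b, d))
    return params
```

In this sketch the determinant d is the only divisor, which is one way to read the "one division per block" remark; how the actual implementation arranges its division is not shown here.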
  26. 17 Feb, 2017 1 commit
    • Support trapezoidal models for global motion · 5dfa9300
      Debargha Mukherjee authored
      Adds functionality for least-squares, RANSAC as well as encoding and
      decoding with new constrained homographies that warp blocks to horizontal
      and/or vertical trapezoids. This is for future experimentation. None
      of the models are actually enabled in the code.
      
      Change-Id: I1936018c6b11587d6fd83c3a2c63548cb641b33f
  27. 14 Feb, 2017 1 commit
  28. 01 Feb, 2017 1 commit
    • Misc global motion changes. · d978cd5e
      Debargha Mukherjee authored
      A few encoder global-motion estimation parameter changes.
      lowres: -0.844% (up by 0.08%)
      
      Change-Id: Ib080125803cf56a91ce7d482d6d1445160105010
  29. 27 Jan, 2017 1 commit
  30. 23 Jan, 2017 1 commit
    • Warp filter improvements · 13797462
      David Barker authored
      * The restriction on the parameter 'delta' was too strict, so we
        loosen it (delta only ever gets multiplied by -4, ... , 4,
        whereas beta gets multiplied by -7, ..., 7)
      * Correct a comment about the border clamping
      * Fix an issue with the test case
      
      Change-Id: I30e55203455ba6e419b5a8b646151a6d1fd5cc3b
  31. 20 Jan, 2017 1 commit
    • Change the warp filter to use real 8-tap · e6044fec
      Debargha Mukherjee authored
      The warp filter for the (0,1) case is changed to use a real
      8-tap filter.
      
      Improves coding efficiency.
      
      BDRATE on lowres:
      -0.772% (up from -0.633%) with --enable-global-motion
      -1.124% (up from -1.001%) with --enable-warped-motion
      
      Change-Id: I296efe36dbc72a7af74773b71b445f19a2aa7205
  32. 19 Jan, 2017 2 commits
    • Add correctness tests for the SSE2 warp filter · 838367db
      David Barker authored
      Also rename warp_affine() to av1_warp_affine()
      
      Change-Id: I945baff6be8a1ea942ce88dfcfa5344af6b3a966
    • Optimize SSE2 warp filter · 1b888f2e
      David Barker authored
      Improve the speed of the warp filter itself by ~30%. This leads
      to an overall decoder speedup of 5-20%, depending on bitrate,
      for the global-motion experiment, and a small speedup for
      warped-motion.
      
      Applies a very minor change to the rounding during filter
      selection (ROUND_POWER_OF_TWO makes slightly more sense here
      than ROUND_POWER_OF_TWO_SIGNED, and is faster)
      
      Change-Id: I3f364221d1ec35a8aac0d2c8b0e427f527d12e43
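
The difference between the two rounding styles only shows up for negative values that sit exactly on a half. The sketch below gives Python analogues of the two macros named above; the definitions are assumed from their names (add half and shift, versus round the magnitude and restore the sign) and rely on Python's right shift being arithmetic for negative integers.

```python
def round_power_of_two(value, n):
    """Add half, then shift right: halves round toward +infinity."""
    return (value + (1 << (n - 1))) >> n


def round_power_of_two_signed(value, n):
    """Round the magnitude, then restore the sign: halves round away from zero."""
    if value < 0:
        return -round_power_of_two(-value, n)
    return round_power_of_two(value, n)
```

For example, -3 with n = 1 (that is, -1.5) gives -1 with the first function and -2 with the second, while the two agree for non-negative inputs and for negative inputs that are not exact halves. The unsigned-style version also avoids the branch on the sign.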
  33. 12 Jan, 2017 1 commit
    • Add SSE2 vectorized warp filter for lowbd · d5dfa96e
      David Barker authored
      End-to-end speed improvements: (measured on tempete_cif.y4m,
      20 frames for encoder and all 260 frames for decoder)
      
      * GLOBAL_MOTION encoder: ~10% faster
      * GLOBAL_MOTION decoder: 100-200% faster depending on bitrate
      * WARPED_MOTION encoder: ~2.5% faster
      * WARPED_MOTION decoder: ~20-40% faster depending on bitrate
      
      The improvement in the GLOBAL_MOTION decoder is particularly
      large because its runtime is dominated by calls to warp_plane().
      
      This introduces minor changes to the output of the warp filter,
      but these should be rare.
      
      Change-Id: I5813ab9e90311e27587045153c32d400b6b9eb92