1. 04 Apr, 2017 15 commits
  2. 03 Apr, 2017 18 commits
  3. 02 Apr, 2017 7 commits
    • Add v64_abs_s8, v128_abs_s8 and v256_abs_s8 · 6033fb85
      Steinar Midtskogen authored
      Change-Id: I529509e4e997ba123799a3a581d20624d75cf582
    • CLPF: Add architecture postfix to the name of static functions · 569c7b91
      Steinar Midtskogen authored
      This makes it clear when profiling that the correct SIMD-optimised
      function is being run.
      
      Change-Id: I35d69b3611f40650a85f1973c4010453b2bf5a53
    • Move the CLPF damping adjustment for strength up in the call chain · febe223d
      Steinar Midtskogen authored
      Rather than making the adjustment in the leaf functions, make it
      once in the top-level function.  Optimising compilers would work
      this out themselves as far as the functions are inlined, but
      probably no further, so this patch gives slightly smaller object
      code.
      
      Change-Id: I104750962f613fa665391c9b2a9e99bcc6f47f93
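A minimal sketch of the kind of hoisting the commit above describes. The commit message does not give the actual adjustment formula, so `adjust_damping`, `constrain_diff` and `filter_block` are hypothetical names, and subtracting the index of the highest set bit of the strength is only an assumed stand-in for the real adjustment.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical adjustment (an assumption, not taken from the commit):
   lower the damping by the index of the highest set bit of the strength. */
static int adjust_damping(int damping, int strength) {
  int msb = 0;
  assert(strength > 0);
  while (strength >> (msb + 1)) msb++;
  assert(damping >= msb); /* keeps the later shift count non-negative */
  return damping - msb;
}

/* Illustrative leaf function: limits a sample difference. It receives the
   already adjusted damping and no longer derives it from the strength. */
static int constrain_diff(int diff, int strength, int adjusted_damping) {
  const int mag = abs(diff);
  int limit = strength - (mag >> adjusted_damping);
  if (limit < 0) limit = 0;
  if (limit > mag) limit = mag;
  return diff < 0 ? -limit : limit;
}

/* Top-level function: the adjustment is computed once per call instead of
   once per sample inside the leaf. */
void filter_block(const int *diff, int *out, int count, int strength,
                  int damping) {
  const int adjusted_damping = adjust_damping(damping, strength);
  for (int i = 0; i < count; i++)
    out[i] = constrain_diff(diff[i], strength, adjusted_damping);
}
```

When the leaf is not inlined, a compiler cannot move the adjustment out of it on its own, which is why doing it by hand in the caller can shrink the object code.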
    • Temporarily revert some 4:2:2 code · ec70797d
      Jean-Marc Valin authored
      As part of 9cf0c9cd the buffering was made
      to better handle 4:2:2, but that causes regressions in the tests, so we're
      backing out part of it for now.
      
      Change-Id: I9ca4cfeb159aa65514613989e3dcbc30f86ec5b2
    • Use 1 sample per neighbor for local warping model estimation · 5558e5da
      Yue Chen authored
      Only 1 sample needs to be collected per neighbor. A maximum of 8
      neighbors is used.
      In LS estimation, the projection samples (sx, sy)->(dx, dy) are
      intentionally smoothed by assuming that 3 shifted versions,
      (sx, sy+n)->(dx, dy+n), (sx+n, sy)->(dx+n, dy) and
      (sx+n, sy+n)->(dx+n, dy+n), also contribute to the estimation.
      For example, instead of using A[0] = sx^2, we use the sum of
      squares of the source x of the four points,
      A[0] += 4*sx^2 + 4*n*sx + 2*n^2.
      This adds little computational overhead. Coding gain is mostly
      the same as with the old version. Without the smoothing, 0.3% is
      lost on lowres.
      
      Change-Id: I04be32cffa525f7dc8ee583c0bf211d7bdc6e609
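The smoothed sums described in the commit above can be checked with a small sketch. It assumes the four source points are (sx, sy), (sx, sy+n), (sx+n, sy) and (sx+n, sy+n); all function names are hypothetical. Summing x^2 over those points gives 4*sx^2 + 4*n*sx + 2*n^2, and the cross term x*y, which the message does not spell out, works out to 4*sx*sy + 2*n*(sx+sy) + n^2.

```c
#include <assert.h>

/* Sum of x^2 over the four source points, computed point by point. */
static long sum_xx_direct(long sx, long n) {
  const long xs[4] = { sx, sx, sx + n, sx + n };
  long sum = 0;
  for (int i = 0; i < 4; i++) sum += xs[i] * xs[i];
  return sum;
}

/* Closed form of the same sum. */
static long sum_xx_closed(long sx, long n) {
  return 4 * sx * sx + 4 * n * sx + 2 * n * n;
}

/* Sum of x*y over the four source points, computed point by point. */
static long sum_xy_direct(long sx, long sy, long n) {
  const long xs[4] = { sx, sx, sx + n, sx + n };
  const long ys[4] = { sy, sy + n, sy, sy + n };
  long sum = 0;
  for (int i = 0; i < 4; i++) sum += xs[i] * ys[i];
  return sum;
}

/* Closed form of the cross term. */
static long sum_xy_closed(long sx, long sy, long n) {
  return 4 * sx * sy + 2 * n * (sx + sy) + n * n;
}

/* Compares both closed forms with the direct sums over a small range. */
void check_closed_forms(long n, long range) {
  for (long sx = -range; sx <= range; sx++) {
    assert(sum_xx_direct(sx, n) == sum_xx_closed(sx, n));
    for (long sy = -range; sy <= range; sy++)
      assert(sum_xy_direct(sx, sy, n) == sum_xy_closed(sx, sy, n));
  }
}
```

Because each accumulation is a closed-form expression in sx, sy and n, the three shifted samples never have to be materialised, which fits the claim that the smoothing adds little overhead.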
    • Use only the first predictors of compound neighbors in OBMC · 54723f97
      Yue Chen authored
      Coding gain lost, as measured on AWCY:
      HL 0.23%
      LL 0 (since no compound is used in LL)
      lowres 0.277%
      midres 0.248%
      
      Change-Id: I46ad1e2f07411c838f2ca6765de57a60a9c68b12
    • Not use sub8x8 mv of neighbors in obmc · 13e412eb
      Yue Chen authored
      Treat all sub8x8 neighbors as 8x8 blocks and use the mv assigned
      to the last block.
      
      Performance change on AWCY:
      HL improved by 0.01%
      LL improved by 0.06%
      
      Change-Id: I55d3c5401222396d871f9157b62b3de29e5390b0
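A rough sketch of the neighbor handling described in the commit above. The types and field names are hypothetical and not taken from the codebase, and it assumes that "the last block" means the fourth sub-block (index 3) of a sub8x8 neighbor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical motion vector and neighbor records, for illustration only. */
typedef struct {
  int row;
  int col;
} MotionVector;

typedef struct {
  int is_sub8x8;          /* non-zero if the neighbor is coded below 8x8 */
  MotionVector block_mv;  /* mv of a neighbor that is 8x8 or larger */
  MotionVector sub_mv[4]; /* per sub-block mvs of a sub8x8 neighbor */
} NeighborInfo;

/* Returns the single mv that OBMC would use for this neighbor: a sub8x8
   neighbor is treated as one 8x8 block carrying the mv of its last
   sub-block (assumed here to be index 3). */
MotionVector obmc_neighbor_mv(const NeighborInfo *nb) {
  assert(nb != NULL);
  return nb->is_sub8x8 ? nb->sub_mv[3] : nb->block_mv;
}
```

With one mv per neighbor, the overlapped prediction needs a single prediction per neighboring 8x8 area instead of one per sub-block.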