1. 08 Feb, 2017 1 commit
  2. 06 Feb, 2017 1 commit
      Add av1_convolve_2d_facade · 7927a97d
      Angie Chiang authored
      When convolve_round is on, av1_convolve_2d_facade will be used for
      interpolation rather than av1_convolve. The convolve_round experiment
      code will be removed from av1_convolve in another CL.
      
      So far we use 4-bit rounding in the intermediate stage on top of using
      post rounding for compound mode after the last stage.
      
      This gives roughly 0.45% gain on lowres, 0.39% on midres and
      roughly 0.6-0.7% on hdres.
      Altogether, the gain is 1.15% on lowres, 0.74% on midres and roughly
      1.7-1.8% on hdres.
      
      Note that this CL does not restrict the use of the 12-tap filter.
      Adding that restriction, we will lose roughly 0.1% again on lowres.
      
      Change-Id: I6332e1d888e28a3b3ddc29711817d66e52cb5cdf
      7927a97d
  3. 31 Jan, 2017 2 commits
      Fix ext-inter + compound-segment + supertx · 426a997e
      David Barker authored
      Allow the above combination of experiments to work together
      correctly, fixing an encode/decode mismatch bug when they
      were all enabled.
      
      This change causes build_masked_compound(_highbd) to only
      ever be called if CONFIG_SUPERTX is off, so wrap these functions
      in an '#if !CONFIG_SUPERTX' block.
      
      BUG=aomedia:313
      
      Change-Id: Ic3886bc69ba9624b8fcb0a4c2d71fc64d2c0f22c
      426a997e
      Make global_motion work with ext_inter · c2d38715
      Sarah Parker authored
      Change-Id: I2a490e144099d7692296992528192c1f11d2c06f
      c2d38715
  4. 20 Jan, 2017 1 commit
      Add CONVOLVE_POST_ROUNDING flag · 117aa0dc
      Angie Chiang authored
      By turning on CONVOLVE_POST_ROUNDING, in the compound inter
      prediction mode, FILTER_BITS rounding is moved after the summation
      of two predictions.
      
      Note that post rounding is applied only to non-sub8x8 blocks.
      
             PSNR     BDRate
      lowres -0.808%  -0.673%
      
      Change-Id: Ib91304e6122c24d832a582ab9f5757d33eac876c
      117aa0dc
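The effect of moving the rounding after the summation can be illustrated with a minimal sketch. It is not the libaom implementation: the names `SKETCH_FILTER_BITS`, `sketch_round_shift`, `sketch_compound_early_round` and `sketch_compound_post_round` are hypothetical, the filter precision of 7 bits is an assumption, and the compound prediction is assumed to be a plain average of two non-negative filter sums.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed precision of the interpolation filter taps (hypothetical name). */
#define SKETCH_FILTER_BITS 7

/* Round-to-nearest right shift; this sketch only handles non-negative sums. */
static int sketch_round_shift(int32_t v, int bits) {
  assert(v >= 0 && bits > 0);
  return (int)((v + (1 << (bits - 1))) >> bits);
}

/* Early rounding: each prediction is rounded to pixel precision first,
 * and the two rounded pixels are then averaged (a second rounding). */
static int sketch_compound_early_round(int32_t sum0, int32_t sum1) {
  const int p0 = sketch_round_shift(sum0, SKETCH_FILTER_BITS);
  const int p1 = sketch_round_shift(sum1, SKETCH_FILTER_BITS);
  return sketch_round_shift(p0 + p1, 1);
}

/* Post rounding: the unrounded filter sums are added first, and a single
 * rounding covers both the filter precision and the averaging. */
static int sketch_compound_post_round(int32_t sum0, int32_t sum1) {
  return sketch_round_shift(sum0 + sum1, SKETCH_FILTER_BITS + 1);
}
```

With unrounded sums 12864 and 12992 (100.5 and 101.5 in pixel units), the early-rounding path returns 102, while the post-rounding path returns 101, which is the exact average.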
  5. 18 Jan, 2017 1 commit
  6. 17 Jan, 2017 1 commit
      Improvements on segment mask · 1edf9a30
      Debargha Mukherjee authored
      Adds a few options to make the compound mask lightly dependent on
      the two predictors.
      
      Also adds high bit depth support.
      
      Change-Id: If57b6e8ddd140e0c00fd9d4738927d37225091cb
      1edf9a30
  7. 13 Jan, 2017 2 commits
      Add recon functions of non-causal obmc · 86ae7b13
      Yue Chen authored
      Change-Id: Id2537c8826e07ad6605aaa9858ba6d797bcd23a5
      86ae7b13
      Add rounding option into av1_convolve · 674bffdc
      Angie Chiang authored
      Use a round flag in ConvolveParams to indicate if the destination buffer
      has the result rounded by FILTER_BITS or not.
      This CL is part of the goal of reducing interpolation rounding error in
      compound prediction mode.
      
      Change-Id: I49e522a89a67a771f5a6e7fbbc609e97923aecb6
      674bffdc
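As a rough illustration of a round flag carried in a parameter struct, the sketch below lets a 1D filter either round its output by the filter precision or leave the raw sum in place for a later stage. The struct `SketchConvolveParams`, the function `sketch_filter_1d`, the 8-tap length and the 7-bit precision are all assumptions made for this example; the real ConvolveParams layout and av1_convolve code are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FILTER_BITS 7 /* assumed filter precision */
#define SKETCH_TAPS 8        /* assumed filter length */

/* Hypothetical stand-in for a convolve parameter struct. */
typedef struct {
  int round; /* 1: result is rounded by SKETCH_FILTER_BITS; 0: raw sum */
} SketchConvolveParams;

/* Apply one filter position. The caller reads the flag to know whether the
 * returned value is a pixel or an unrounded sum. Only non-negative sums are
 * handled in the rounded path of this sketch. */
static int32_t sketch_filter_1d(const uint8_t *src, const int16_t *kernel,
                                const SketchConvolveParams *params) {
  assert(src != 0 && kernel != 0 && params != 0);
  int32_t sum = 0;
  for (int k = 0; k < SKETCH_TAPS; ++k) sum += src[k] * kernel[k];
  if (params->round) {
    assert(sum >= 0);
    return (sum + (1 << (SKETCH_FILTER_BITS - 1))) >> SKETCH_FILTER_BITS;
  }
  return sum; /* left unrounded so a later stage can round once */
}
```

Keeping the unrounded sum lets a compound predictor add two such sums and round once, which is the goal the commit message describes.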
  8. 10 Jan, 2017 1 commit
  9. 07 Jan, 2017 1 commit
  10. 16 Dec, 2016 1 commit
      Add temporary dummy mask for compound segmentation · 569eddab
      Sarah Parker authored
      This uses a segmentation mask (which is temporarily computed arbitrarily)
      to blend predictors in compound prediction. The mask will be computed
      using a color segmentation in a followup patch.
      Change-Id: I2d24cf27a8589211f8a70779a5be2d61746406b9
      569eddab
  11. 15 Dec, 2016 1 commit
  12. 06 Dec, 2016 1 commit
      Improve rdopt decisions for ext-inter · ac37fa3d
      David Barker authored
      Relative to previous ext-inter:
      lowres: -0.177%
           or -0.029% (with USE_RECT_INTERINTRA = 0)
      
      * When predicting interintra modes, the previous code did not provide
        the intra predictor with the correct context during rdopt. Add an
        explicit 'ctx' parameter to the relevant functions, to provide this
        context.
        This fixes a nondeterminism bug, which was causing test failures in
        *EncoderThreadTest*
      
      * For rectangular blocks, build_intra_predictors_for_interintra needs
        to overwrite part of the context buffer in order to set up the
        correct context for intra prediction. We now restore the original
        contents afterwards.
      
      * Add a flag to enable/disable rectangular interintra prediction;
        disabling improves encoding speed but reduces BDRATE improvement.
      
      Change-Id: I7458c036c7f94df9ab1ba0c7efa79aeaa7e17118
      ac37fa3d
  13. 01 Dec, 2016 1 commit
  14. 01 Nov, 2016 1 commit
  15. 30 Oct, 2016 1 commit
      Let is_interp_needed always return 1 · a69ce1b3
      Angie Chiang authored
      This CL will cause
      0.122% PSNR drop on lowres dataset
      0.059% PSNR drop on midres dataset
      
      However, it will facilitate hardware implementation.
      
      Change-Id: I0a0713acacbfd571509a721337711c021915dd3c
      a69ce1b3
  16. 26 Oct, 2016 2 commits
  17. 24 Oct, 2016 1 commit
  18. 21 Oct, 2016 1 commit
      Sub8x8 block chroma component inter prediction · e29ea12f
      Jingning Han authored
      Handle the sub8x8 chroma component in units of 2x2/4x2/2x4 blocks
      and use the motion vector inherited from the luma component. This
      improves the coding performance:
      
      lowres 0.4%
      midres 0.25%
      hdres  0.15%
      
      Change-Id: I34dff4218cfa3e5d55e7ed0341f36f4719389f7e
      e29ea12f
  19. 19 Oct, 2016 1 commit
      Code cleanup: mainly rd_pick_partition and methods called from there. · 52648448
      Urvang Joshi authored
      - Const correctness
      - Refactoring
      - Make variables local when possible etc
      - Remove -Wcast-qual to allow explicitly casting away const.
      
      Cherry-picked from aomedia/master: c27fcccc,
      followed by a number of further const correctness changes to make sure
      other experiments build OK.
      
      Change-Id: I77c18d99d21218fbdc9b186d7ed3792dc401a0a0
      52648448
  20. 13 Oct, 2016 1 commit
      Renamings for OBMC experiment · cb60b185
      Yue Chen authored
      To get ready for pulling AV1 to nextgenv2
      Replace the experimental flag by MOTION_VAR. Rename major variables.
      
      Change-Id: If6cf4f37b9319c46d8f90df551cc7295d66ca205
      cb60b185
  21. 12 Oct, 2016 1 commit
      Let is_interp_needed always return 1 · 16dc1513
      Angie Chiang authored
      This CL will cause
      0.122% PSNR drop on lowres dataset
      0.059% PSNR drop on midres dataset
      
      However, it will facilitate hardware implementation.
      
      Change-Id: I0a0713acacbfd571509a721337711c021915dd3c
      16dc1513
  22. 09 Sep, 2016 1 commit
      s/INTERP_FILTER/InterpFilter/ · 7b9407a8
      James Zern authored
      this matches style guidelines and stabilizes successive runs of
      clang-format across the tree. remaining types should be addressed in
      successive commits.
      
      Change-Id: I6ad3f69cf0a22cb9a9b895b272195f891f71170f
      7b9407a8
  23. 02 Sep, 2016 1 commit
  24. 01 Sep, 2016 2 commits
  25. 12 Aug, 2016 1 commit
  26. 29 Jul, 2016 1 commit
  27. 18 Jul, 2016 1 commit
  28. 13 Jul, 2016 1 commit
      Optimize and cleanup obmc predictor and rd search. · 4c4f04ac
      Geza Lore authored
      Use vpx_blend_a64_hmask and vpx_blend_a64_vmask to speed up
      computing the obmc predictor. Clean up calc_target_weighted_pred.
      
      Encoder speedup: 1.3%
      Decoder speedup: 6.5%
      
      Change-Id: I0c774fe53d22399e92a10d1daf3af0010d88d2c5
      4c4f04ac
  29. 11 Jul, 2016 2 commits
      Optimize and cleanup supertx predictor. · cd489264
      Geza Lore authored
      Use vpx_blend_a64_hmask and vpx_blend_a64_vmask to speed up
      computing the supertx predictor.
      
      Decoder speedup of up to 4% has been observed.
      
      Change-Id: I255a5ba4cc24f78dc905d25b6e2f7fbafac13253
      cd489264
      Improve vpx_blend_* functions. · bfa59b4a
      Geza Lore authored
      - Made source buffers pointers to const.
      - Renamed vpx_blend_mask6b to vpx_blend_a64_mask. This is more
        indicative that the function does alpha blending. The 6, or 6b
        suffix was misleading, as the max mask value (64) does not fit into
        6 bits.
      - Added VPX_BLEND_* macros to use when needing to blend scalars.
      - Use VPX_BLEND_A256 in combine_interintra to be more explicit about
        the operation being done.
      - Added versions of vpx_blend_a64_* which take 1D horizontal/vertical
        masks directly and apply them to all rows/columns
        (vpx_blend_a64_hmask and vpx_blend_a64_vmask). The SSE4.1 optimized
        horizontal version now falls back on the 2D version. This can be
        improved upon if it shows up high enough in a profile.
      - All vpx_blend_a64_* functions now support block sizes down to 1x1
        (i.e. a single pixel). This is for usage convenience. The SSE4.1
        optimized versions fall back on the C implementation if
        w <= 2 or h <= 2. This can again be improved if it becomes hot code.
      
      Change-Id: I13ab3835146ffafe3e1d74d8e9cf64a5abe4144d
      bfa59b4a
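A minimal sketch of the alpha blending described above, assuming mask weights in the range 0..64 with round-to-nearest on the 6-bit shift. The names `sketch_blend_a64` and `sketch_blend_a64_hmask` and the parameter order are hypothetical and do not reproduce the real vpx_blend_a64_* signatures. A weight of 64 needs 7 bits to store, which is why a "6b" suffix would be misleading.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_A64_MAX 64 /* maximum mask weight (assumed) */
#define SKETCH_A64_BITS 6 /* shift that divides by 64 */

/* Blend two scalars: weight m goes to v0, weight (64 - m) goes to v1. */
static uint8_t sketch_blend_a64(int m, int v0, int v1) {
  assert(m >= 0 && m <= SKETCH_A64_MAX);
  return (uint8_t)((m * v0 + (SKETCH_A64_MAX - m) * v1 +
                    (1 << (SKETCH_A64_BITS - 1))) >>
                   SKETCH_A64_BITS);
}

/* Horizontal 1D mask: one weight per column, reused for every row.
 * Works for any block size, including 1x1. */
static void sketch_blend_a64_hmask(uint8_t *dst, int dst_stride,
                                   const uint8_t *src0, int src0_stride,
                                   const uint8_t *src1, int src1_stride,
                                   const uint8_t *mask, int w, int h) {
  for (int i = 0; i < h; ++i) {
    for (int j = 0; j < w; ++j) {
      dst[i * dst_stride + j] = sketch_blend_a64(
          mask[j], src0[i * src0_stride + j], src1[i * src1_stride + j]);
    }
  }
}
```

A vertical variant would index the mask by row (`mask[i]`) instead of by column.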
  30. 21 Jun, 2016 1 commit
  31. 13 Jun, 2016 1 commit
  32. 10 Jun, 2016 1 commit
  33. 06 Jun, 2016 1 commit
      Optimize wedge partition selection. · efda2831
      Geza Lore authored
      We can optimize wedge partition selection by pre-computing the
      residuals of the 2 underlying predictors, and then blend these
      to compute the sse of the compound predictor, without actually
      having to compute and subtract the compound predictor.
      
      Similarly we can pre-compute a proxy array which we can use to
      cheaply check which mask sign would have lower sse.
      
      Details are in wedge_utils.c.
      
      Mathematically these are equivalence transformations, but due to the
      finite precision the encoder output will be perturbed, though on
      average this should make 0% difference.
      
      ext-inter gains a speedup of about 4.5%.
      
      Change-Id: Ib2657c3209ae161b4090b58b4b6c392641bf2792
      efda2831
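The residual-blending idea can be sketched as follows, assuming mask weights m in 0..64 and an unrounded compound predictor (m*p0 + (64-m)*p1)/64. Under that assumption, 64*(src - pred) equals m*(src - p0) + (64-m)*(src - p1), so the SSE (scaled by 64*64) can be computed from the two pre-computed residuals alone. The function names are hypothetical and this is not the code in wedge_utils.c. A real encoder rounds the predictor, which is where the finite-precision perturbation comes from.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-compute the residual of one underlying predictor. */
static void sketch_residual(const uint8_t *src, const uint8_t *pred,
                            int16_t *res, int n) {
  for (int i = 0; i < n; ++i) res[i] = (int16_t)(src[i] - pred[i]);
}

/* SSE of the compound predictor, scaled by 64*64, computed by blending the
 * two residuals. The compound predictor itself is never formed. */
static uint64_t sketch_sse_from_residuals(const int16_t *r0, const int16_t *r1,
                                          const uint8_t *mask, int n) {
  uint64_t acc = 0;
  for (int i = 0; i < n; ++i) {
    assert(mask[i] <= 64);
    const int64_t t = (int64_t)mask[i] * r0[i] + (int64_t)(64 - mask[i]) * r1[i];
    acc += (uint64_t)(t * t);
  }
  return acc;
}

/* Reference: form the (unrounded, scaled) compound predictor and subtract. */
static uint64_t sketch_sse_direct_scaled(const uint8_t *src, const uint8_t *p0,
                                         const uint8_t *p1, const uint8_t *mask,
                                         int n) {
  uint64_t acc = 0;
  for (int i = 0; i < n; ++i) {
    const int64_t pred64 = (int64_t)mask[i] * p0[i] + (int64_t)(64 - mask[i]) * p1[i];
    const int64_t t = 64 * (int64_t)src[i] - pred64;
    acc += (uint64_t)(t * t);
  }
  return acc;
}
```

Because the residuals depend only on the two predictors and not on the mask, they can be computed once and reused for every candidate wedge.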
  34. 03 Jun, 2016 1 commit
      Pre-compute and use contiguous wedge masks. · ab29978e
      Geza Lore authored
      This is purely a refactoring patch and has no functional effect.
      
      Uses of these masks can be arranged such that all input blocks are
      contiguous in memory (stride == block width). In this case 1D versions
      of operations can be used. 1D vector operations have superior performance
      over 2D block equivalents as they are more processor cache friendly and
      they avoid the overhead of a second loop.
      
      Change-Id: I2b76c9888aea2c857cc497e8a4b2841fd3dad54e
      ab29978e
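To illustrate why contiguous blocks allow 1D operations, the sketch below compares a 2D sum of squares over a strided block with a 1D version over a flat array. When stride == block width, the block is exactly w*h consecutive elements, so the single loop gives the same result. Both function names are hypothetical and the sum of squares is only a stand-in for whichever operation is applied to the masks.

```c
#include <assert.h>
#include <stdint.h>

/* 2D version: two nested loops, rows separated by an arbitrary stride. */
static uint64_t sketch_sum_squares_2d(const int16_t *src, int stride, int w,
                                      int h) {
  assert(stride >= w);
  uint64_t acc = 0;
  for (int i = 0; i < h; ++i) {
    for (int j = 0; j < w; ++j) {
      const int32_t v = src[i * stride + j];
      acc += (uint64_t)(v * v);
    }
  }
  return acc;
}

/* 1D version: a single loop over n consecutive elements. */
static uint64_t sketch_sum_squares_1d(const int16_t *src, int n) {
  uint64_t acc = 0;
  for (int i = 0; i < n; ++i) {
    const int32_t v = src[i];
    acc += (uint64_t)(v * v);
  }
  return acc;
}
```

For a contiguous block, `sketch_sum_squares_1d(block, w * h)` can replace `sketch_sum_squares_2d(block, w, w, h)`.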
  35. 20 May, 2016 1 commit