1. 18 Feb, 2018 2 commits
  2. 17 Feb, 2018 5 commits
    • [NORMATIVE] Unify context design for single ref · 3b353474
      Zoe Liu authored
      The CL makes the context design for single reference frame coding the
      same as that for the compound reference frame coding. There are 3
      contexts designed for each of the binary symbols for the single
      reference frame scenario, and the designed contexts simply rely on the
      counts of the references used in the neighboring two blocks.
      Once this CL is merged, the coding of the reference frames, regardless
      of single prediction or compound prediction, will all follow the same
      context design pattern for all the binary symbols. The design logic is
      much simpler and the lines of code for each binary symbol context
      identification are reduced by 80%.
      Further, this CL has obtained a small coding gain for 30 frames with
      the default coding tools:
      lowres: avg_psnr -0.015%; ovr_psnr -0.021%; ssim -0.002%
      midres: avg_psnr -0.108%; ovr_psnr -0.139%; ssim -0.135%
      Change-Id: Ia72a1d18e85ac3a05308675b60b95f80f2219c46
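      A count-based context of the kind described above can be sketched as
      follows. The helper name, the way references are grouped into two
      counts, and the comparison rule are assumptions made for illustration;
      they are not taken from the CL itself.

```c
/* Hypothetical helper: picks one of three contexts for a binary symbol
 * from two reference counts gathered over the two neighboring blocks.
 * count_a and count_b are the number of times references from group A
 * and group B were used by those neighbors (grouping is an assumption). */
static int ref_count_context(int count_a, int count_b) {
  if (count_a == count_b) return 1;    /* neighbors give no preference */
  return (count_a < count_b) ? 0 : 2;  /* lean toward B or toward A */
}
```

      With this shape, every binary symbol can share the same three-way
      rule and differ only in which references feed each count.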
    • Add a wrapper to 8-bit scaled convolve function · 3ffa0fef
      Debargha Mukherjee authored
      This addresses a crash in a situation where the scaled convolve
      function is called without a valid dst buffer in the conv_params.
      Change-Id: Ia4a0a1213f06447155d6c92aa9efc183d8c4a79c
    • Silence compiler warning when ext-partition is off · 84597cca
      Jingning Han authored
      Change-Id: I24e98062fec2cf73e294d34bf02419f7917a9bf0
    • [NORMATIVE] Fix top right mv check condition for VERT_A partition · 56066252
      Jingning Han authored
      When a coding block uses the VERT_A partition, the coding order does
      not follow raster order. This requires special handling of the
      bottom-left square block to disable its reference to the
      top-right corner. Prior to this change the codebase would disable
      both the bottom-left square and the right rectangular block
      from referencing the top-right mv. This commit fixes the check
      condition to allow the right rectangular block to access the
      top-right mv.
      Change-Id: I87049f0cec8ed7557a87c3fdef83e01498bbcd75
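      The corrected rule can be sketched as below. The enum, the function
      name, and the assumed coding order (top-left square, bottom-left
      square, then the right rectangle) are illustrative assumptions, not
      the code from the commit.

```c
/* Hypothetical labels for the three sub-blocks of a VERT_A partition,
 * listed in the assumed coding order. */
enum VertASubBlock {
  VERT_A_TOP_LEFT = 0,
  VERT_A_BOTTOM_LEFT = 1,
  VERT_A_RIGHT = 2
};

/* Returns 1 if the sub-block may reference the top-right mv.
 * raster_has_top_right is the availability computed by the ordinary
 * raster-order check. Only the bottom-left square is overridden, since
 * its top-right neighbor (the right rectangle) is coded after it. */
static int vert_a_has_top_right(enum VertASubBlock sub,
                                int raster_has_top_right) {
  if (sub == VERT_A_BOTTOM_LEFT) return 0;
  return raster_has_top_right;
}
```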
    • Remove unused context models for drl_index · 233c7627
      Jingning Han authored
      Remove deprecated context models for drl index coding.
      Change-Id: If255fa93d0c746738f0fc005464388e790c89b63
  3. 16 Feb, 2018 9 commits
  4. 15 Feb, 2018 7 commits
    • Reduce txb_coeff_cost map size from 64 to 32 · f32f678d
      Jingning Han authored
      The maximum coding block size is 128 and the minimum tx size is
      4. Using 32 per dimension to keep the txb coeff cost should be
      Change-Id: Ie44fc581037f0d8270caec64543454701159eec5
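      The arithmetic behind the new size is 128 / 4 = 32 entries per
      dimension. A minimal sketch, in which the macro and function names
      are hypothetical and only the two sizes come from the commit text:

```c
#define MAX_BLOCK_DIM 128 /* maximum coding block size (from the commit) */
#define MIN_TX_DIM 4      /* minimum transform size (from the commit) */
#define TXB_COST_MAP_DIM (MAX_BLOCK_DIM / MIN_TX_DIM) /* 32 */

/* Hypothetical index helper: maps a pixel offset inside the coding block
 * to a slot in a TXB_COST_MAP_DIM x TXB_COST_MAP_DIM map. */
static int txb_cost_map_index(int row_px, int col_px) {
  return (row_px / MIN_TX_DIM) * TXB_COST_MAP_DIM + (col_px / MIN_TX_DIM);
}
```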
    • Fix compiler errors in cfl.h · 5a338ee4
      Jingning Han authored
      Change-Id: Iadf001ae293ca50b805e8b8c569900f60a23943c
    • [NORMATIVE-DECODING] Add missing special case to has_top_right() · 3f0f1dfc
      David Barker authored
      Commit ea190906 fixed a major bug in has_top_right(), but missed
      out one special case:
      Consider a 128x128 block using 64x64 intra predictions. These
      intra predictions are applied in a 'Z' order, and so the bottom-left
      64x64 unit has pixels available from the top-right one. But we were
      mistakenly setting has_top_right() = 0 for this case.
      More generally, any transform unit whose top-right corner lies
      at the center of a 128x128 block should have has_top_right() = 1.
      Fix this by introducing an explicit check for the special case.
      Change-Id: I690a292be6c1755c76bd428be94ab953dd71fbd2
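      The special case can be expressed as a position test. This is a
      simplified sketch in pixel units with a hypothetical function name;
      the actual check in has_top_right() may be written differently.

```c
/* Returns 1 if the unit's top-right corner sits exactly at the center of
 * the 128x128 block that contains it. row_px and col_px are the unit's
 * top-left position in the frame; width_px is its width. */
static int top_right_at_128_center(int row_px, int col_px, int width_px) {
  const int sb = 128;
  const int corner_row = row_px % sb;
  const int corner_col = (col_px + width_px) % sb;
  return corner_row == sb / 2 && corner_col == sb / 2;
}
```

      For the bottom-left 64x64 unit of a 128x128 block (row 64, column 0,
      width 64) the corner lands at (64, 64), so the test returns 1.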
    • Remove CONFIG_TX64X64 · d3d4159f
      Yaowu Xu authored
      The experiment is fully adopted.
      Change-Id: I6cc80a2acf0c93c13b0e36e6f4a2378fe5ce33c3
    • Stop using VP9 convolve scheme in AV1 encoder. · 1fc3df55
      Debargha Mukherjee authored
      Discontinue all VP9 style convolve rounding operations in the non-normative
      parts of the encoder.
      The function av1_convolve_2d_sr_c is forced instead of SIMD versions
      of the same function, because of incompatibility when round_1 > 0.
      setting, results on 15 frames of lowres (cpu-used=1) is -0.019% better.
      Change-Id: I72154bd896357c352c944fb2cd3b25bafafba46a
    • align a stack array to prevent segfault · 1e2084a2
      Yaowu Xu authored
      Change-Id: I8c3de068fd158b4706b578f7609c8ef939364525
    • [CFL] SSE2/AVX2 Versions of Sum and Subtract Average · 365f73bb
      Luc Trudeau authored
      Includes unit tests for conformance and speed.
      4x4: C time = 234 us, SIMD time = 152 us (~1.5x)
      8x8: C time = 664 us, SIMD time = 208 us (~3.2x)
      16x16: C time = 1687 us, SIMD time = 581 us (~2.9x)
      32x32: C time = 6118 us, SIMD time = 2119 us (~2.9x)
      4x4: C time = 250 us, SIMD time = 221 us (~1.1x)
      8x8: C time = 683 us, SIMD time = 284 us (~2.4x)
      16x16: C time = 1727 us, SIMD time = 1091 us (~1.6x)
      32x32: C time = 6092 us, SIMD time = 2107 us (~2.9x)
      Change-Id: I44ffedc683829d2c16089854ac43d4ddb4415bcd
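      For reference, a plain C version of a "sum and subtract average"
      step might look like the sketch below. The function name, the buffer
      layout, and the rounding are assumptions; it is not the code that
      was benchmarked above.

```c
#include <stdint.h>

/* Hypothetical scalar reference: sums a width x height block stored
 * contiguously, computes the rounded average, and subtracts it from every
 * sample. Assumes non-negative input samples so the rounded division
 * behaves as expected. */
static void subtract_average(int16_t *buf, int width, int height) {
  const int n = width * height;
  int sum = 0;
  for (int i = 0; i < n; ++i) sum += buf[i];
  const int avg = (sum + n / 2) / n; /* round to nearest */
  for (int i = 0; i < n; ++i) buf[i] = (int16_t)(buf[i] - avg);
}
```

      A SIMD version performs the same two passes on several samples per
      instruction, which is where the speedups come from.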
  5. 14 Feb, 2018 13 commits
    • [NORMATIVE-DECODING] Fix above/left chroma block selection · d3afdb90
      David Barker authored
      As pointed out by rsbultje, my previous patch to is_smooth()
      (a883e6ea) was not quite correct. This is because, when we're
      making a chroma prediction, the uv_mode for the above/left chroma
      predictions is not necessarily in above_mbmi/left_mbmi. Instead,
      it may be in any of several places, depending on subsampling and
      the values of mi_row/mi_col.
      The cleanest solution is to explicitly maintain pointers to the
      above and left chroma blocks. Then we can simply look at those
      pointers when we want to know the above or left uv_mode.
      Also include a bit of refactoring of get_filt_type: It seems
      to be recalculating what's already in xd->{above,left}_mi,
      so just use those directly.
      Change-Id: I0230474a50d43b78cb587a2b553da9ca78cec0c6
    • Fix jnt_comp in warp function · 5c4848d6
      Cheng Chen authored
      Fix a bug in the jnt_comp SIMD function.
      Rewrite the function based on the existing av1_warp_affine_ssse3.
      The function is almost the same as av1_warp_affine_ssse3; the only
      difference is the weighted computation of the final value.
      Change-Id: I65f2fadc9142f6c264a7d0e59250602636c9808b
    • Refactor inv_cos_bit for speedup · 28744b5c
      Peng Bin authored
      Replace the last parameter cos_bit of all 1D inv_txfm functions with
      a macro definition, as it is always equal to 12. Making it a
      compile-time constant lets the compiler optimize further.
      Change-Id: If8a9fd99c7ac7eb6f485dafbce22b4803efda62e
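      The effect of the change can be sketched with a small fixed-point
      butterfly. The macro name INV_COS_BIT and the helper below are
      assumptions for illustration; only the value 12 comes from the
      commit text.

```c
#include <stdint.h>

#define INV_COS_BIT 12 /* assumed macro name; value 12 is from the commit */

/* Hypothetical butterfly helper: computes (w0 * in0 + w1 * in1) with
 * rounding and scales the result down by INV_COS_BIT. Because the shift
 * is a compile-time constant, the compiler can fold the rounding offset
 * and emit an immediate shift. Assumes arithmetic right shift for
 * negative values, as on common compilers. */
static int32_t half_btf_fixed(int32_t w0, int32_t in0, int32_t w1,
                              int32_t in1) {
  const int64_t sum = (int64_t)w0 * in0 + (int64_t)w1 * in1;
  const int64_t rounding = (int64_t)1 << (INV_COS_BIT - 1);
  return (int32_t)((sum + rounding) >> INV_COS_BIT);
}
```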
    • Implement fdct4x8_new_sse2 and fadst4x8_new_sse2 · 043f4964
      Linfeng Zhang authored
      Change-Id: I9ab260c5ca31fe7e06bfc0f806893463c5255c45
    • Implement av1_lowbd_fwd_txfm2d_8x4_sse2 · 7bd00743
      Linfeng Zhang authored
      So far the implemented av1_lowbd_fwd_#x#_sse2 functions provide a
      10% encoder speedup at speed 1.
      Change-Id: I3dab438c4498059262b065300743ba1519db64b4
    • Refactor pair_set_epi16 for speedup · 8b8aaffc
      Peng Bin authored
      Use _mm_set1_epi32 instead of _mm_set_epi16, so that the compiler
      produces fewer instructions. This patch also removes the duplicate define of the same
      Speed test results:
      1. Unit tests for each test case in SSE2/AV1LbdInvTxfm2d show a 60%~80%
      speedup (except those cases whose TX_TYPE includes iidentity)
      2. A brief speed test shows that with this CL, the speed 1 encoder is
      ~3% faster and the decoder is ~1.8% faster.
      (Baseline is 18976fa5)
      Change-Id: I2b0e12973fda05a21d6b6eb0f0efe11df6edfb84
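      The idea is that a pair of 16-bit values packed into one 32-bit word
      can be broadcast to all four 32-bit lanes, giving the same register
      contents as setting eight alternating 16-bit lanes. The packing step
      is sketched here in portable C; the helper names are hypothetical
      and the lane order assumes the first value occupies the low half.

```c
#include <stdint.h>

/* Packs two 16-bit values into one 32-bit word, with a in the low half
 * and b in the high half. This word is what would be handed to a 32-bit
 * broadcast such as _mm_set1_epi32. */
static uint32_t pack_pair16(int16_t a, int16_t b) {
  return (uint32_t)(uint16_t)a | ((uint32_t)(uint16_t)b << 16);
}

/* Lane views used to check the packing. */
static int16_t low_lane(uint32_t packed) {
  return (int16_t)(uint16_t)(packed & 0xFFFFu);
}

static int16_t high_lane(uint32_t packed) {
  return (int16_t)(uint16_t)(packed >> 16);
}
```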
    • Remove two more LPF macros · 8ec5c077
      Yaowu Xu authored
      Change-Id: I60278e399f4f65aa63526e459947e88084f0e889
    • Remove CONFIG_PARALLEL_DEBLOCKING · 6d0ed3ed
      Yaowu Xu authored
      The experiment is fully adopted now.
      Change-Id: I27906d2af4c746ce55aa17f64d1c0ef281e23ab2
    • Increase seg_feature_data_max[SEG_LVL_REF_FRAME] · e4cf4fa4
      Imdad Sardharwalla authored
      Previously, segments using SEG_LVL_REF_FRAME were unable to signal the choices
      of GOLDEN = 4, BWDREF = 5, ALTREF2 = 6 and ALTREF = 7, as
      seg_feature_data_max[SEG_LVL_REF_FRAME] was set to 3. This patch increases the
      value to 7 to account for these options.
      Change-Id: I9732fa2be96ead2d4b6efdbce34a92e43c7dd04e
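      The limitation can be illustrated with a simple range check. The
      enum and function names are hypothetical; the numeric values and the
      old and new maxima (3 and 7) are those quoted in the commit text.

```c
/* Reference values quoted in the commit message. */
enum {
  REF_GOLDEN = 4,
  REF_BWDREF = 5,
  REF_ALTREF2 = 6,
  REF_ALTREF = 7
};

/* Hypothetical check: a segment feature value can be signalled only if
 * it lies within [0, data_max]. */
static int seg_ref_is_signalable(int ref_frame, int data_max) {
  return ref_frame >= 0 && ref_frame <= data_max;
}
```

      With data_max = 3, REF_GOLDEN through REF_ALTREF all fail the check;
      with data_max = 7 they all pass.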
    • Prevent undefined behaviour for AMVR experiment · bf2cc016
      Imdad Sardharwalla authored
      Sequences starting with intra-only frames previously resulted in undefined
      behaviour with CONFIG_AMVR == 1, as seq_force_integer_mv was only read for
      This patch makes changes as follows:
      - The syntax element force_screen_content_tools has been added to the
        SequenceHeader struct, and is read and written correspondingly
      - seq_force_integer_mv has been renamed to force_integer_mv and moved to the
        SequenceHeader struct, and is read and written correspondingly (provided that
        force_screen_content_tools != 0)
      - The conditional reading/writing of allow_screen_content_tools now happens for
        every frame after reading/writing error_resilient_mode (CONFIG_OBU == 1) or
        the sequence header (CONFIG_OBU == 0)
      - The conditional reading/writing of cur_frame_force_integer_mv now happens for
        every frame after reading/writing allow_screen_content_tools
      Change-Id: I689476fc2fa781dc8ec6fc8da91926cc8cfd3dc2
    • [NORMATIVE] Consolidate reference mv clamping · 3e225434
      Yunqing Wang authored
      clamp_mv_ref is called in multiple places in the ref_mv search, which
      makes the logic convoluted, as reported in issue 1124. This change
      consolidates the clamping into one place.
      Borg test result on lowres set:
      avg_psnr:    ovr_psnr:   ssim:
        0.000       0.000      0.001
      Change-Id: I1649d5b5f37683c9c30e493c6eed13a808ab543a
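      Consolidated clamping can be pictured as a single pass over the
      finished candidate list. The types, names, and limits below are
      hypothetical and only show the shape of the idea.

```c
/* Hypothetical motion vector type. */
typedef struct {
  int row;
  int col;
} Mv;

static int clamp_int(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamps every candidate once, after the search has built the list,
 * instead of clamping at each point where a candidate is added. */
static void clamp_mv_list(Mv *mvs, int count, int min_row, int max_row,
                          int min_col, int max_col) {
  for (int i = 0; i < count; ++i) {
    mvs[i].row = clamp_int(mvs[i].row, min_row, max_row);
    mvs[i].col = clamp_int(mvs[i].col, min_col, max_col);
  }
}
```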
    • [NORMATIVE] Scale up mfmv ref step size in 64x64 block · 73190512
      Jingning Han authored
      When the coding block has a side of length 64 or more,
      scale up the mfmv reference search step size from 8 to 16 along
      that direction. The midres coding stats improve by 0.02%. None of
      the finished hdres points showed a negative result.
      Change-Id: I70ab7a9f9d1cf365d8ed1e06dbede307b6bc46ec
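      The rule reduces to a per-direction threshold, sketched here with a
      hypothetical function name. The step values 8 and 16 and the
      threshold 64 come from the commit text.

```c
/* Returns the mfmv reference search step along one direction, given the
 * coding block's side length along that direction. */
static int mfmv_search_step(int side_len) {
  return side_len >= 64 ? 16 : 8;
}
```

      A 64x16 block would therefore use a step of 16 along its width and 8
      along its height.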
    • [NORMATIVE] Reduce spatial search region from 4 to 3 cols · 92446c52
      Jingning Han authored
      Reduce the ref mv search over spatial neighbors from 4 to 3
      Change-Id: I44eb96e2ff4243d720a5f4f68be504995ebd69b6
  6. 13 Feb, 2018 4 commits