  1. 10 Sep, 2017 1 commit
    • Reduce/Eliminate line buffer for loop-restoration. · e168a783
      Debargha Mukherjee authored
      This patch forces the vertical filtering for the top and bottom
      rows of a processing unit for the Wiener filter to use no more
      border rows than the WIENER_BORDER_VERT macro allows.
      This macro is currently set to 0 to eliminate the line buffer completely,
      but it could be increased to 1 or 2 to use limited line buffers
      if coding efficiency suffers too much with no line buffer.
      
      Also, for the sgr filter we added the option of using overlapping
      windows horizontally and vertically to improve coding efficiency.
      The vertical border is set by the SGRPROJ_BORDER_VERT macro,
      while the horizontal border is set by the SGRPROJ_BORDER_HORZ
      macro, currently 2, the maximum needed. We do not recommend
      reducing SGRPROJ_BORDER_HORZ below 2.
      
      The overall line buffer requirement for LR is twice the max of
      WIENER_BORDER_VERT and SGRPROJ_BORDER_VERT.
      Currently both are set as 0, eliminating line buffers completely.
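
The line-buffer rule stated above can be sketched as a small helper. This is only an illustration: the macro values are the ones quoted in the message, while the function name `lr_line_buffer_rows` is hypothetical and not taken from the patch.

```c
#include <assert.h>

/* Values quoted in the commit message: both vertical borders are 0. */
#define WIENER_BORDER_VERT 0
#define SGRPROJ_BORDER_VERT 0

/* Hypothetical helper: rows of line buffer needed by loop restoration,
 * i.e. twice the larger of the two vertical borders. */
static int lr_line_buffer_rows(int wiener_border_vert,
                               int sgrproj_border_vert) {
  assert(wiener_border_vert >= 0 && sgrproj_border_vert >= 0);
  const int max_border = wiener_border_vert > sgrproj_border_vert
                             ? wiener_border_vert
                             : sgrproj_border_vert;
  return 2 * max_border;
}
```

With the current settings the helper returns 0; raising WIENER_BORDER_VERT to 2 while SGRPROJ_BORDER_VERT stays at 0 would give 4 rows.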
      
      Also this patch extends borders consistently before CDEF / LR.
      
      Change-Id: Ie58a98c784a0db547627b9cfcf55f018c30e8e79
  2. 09 Sep, 2017 3 commits
  3. 08 Sep, 2017 2 commits
  4. 07 Sep, 2017 4 commits
    • Fix scaling parameter in non-normative warping · 5a9e82e3
      Debargha Mukherjee authored
      Use macro SCALE_SUBPEL_BITS rather than hard-coded 4 for old
      warping code that is used for non-affine global models.
      
      Change-Id: I10ee7b29101cd79e77a4d29e69d67497fda4e967
    • Lowbd parallel_deblocking sse2 optimization · ea8a0d52
      Yi Luo authored
      Baseline + parallel_deblocking:
      
      - Passed unit tests *SSE2/Loop8Test6*, *AVX2/Loop8Test6*.
      - 1080p, 25 frames, profile=0, encoding/decoding, output match.
      - Decoder frame rate increases from 54.15 to 65.84.
      
      Change-Id: I55938c94961066594f4b9080192c7268c19d9bf9
    • ncobmc-adapt-weight: add bitstream to support warped motion · 07ed3ab2
      Wei-Ting Lin authored
      Change-Id: I0e9df3719e5f9a55e1386afe44851d1707e2e01b
    • Reduce line buffer size for Wiener filter. · 22bbe4cc
      Debargha Mukherjee authored
      This patch forces the vertical filtering for the top and bottom
      rows of a processing unit for the Wiener filter to be 5-tap.
      The 5 taps are derived from the primary 7-tap filter by forcing
      the taps at the end to be zero, and absorbing their weights into
      the other taps to maintain normalization.
      This will effectively reduce the line buffer size for luma Wiener
      filter to 4 (from 6).
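
A minimal sketch of the tap reduction described above. The message says only that the end taps are zeroed and their weights absorbed into the other taps; folding each end tap into its nearest neighbour is an assumption made here for illustration, and the names `wiener_7tap_to_5tap` and `tap_sum` are hypothetical.

```c
#include <assert.h>

#define WIENER_TAPS 7

/* Zero the two end taps of a 7-tap kernel and fold each into its nearest
 * neighbour, so the tap sum (the normalization) is unchanged.
 * The folding rule is an assumption, not taken from the patch. */
static void wiener_7tap_to_5tap(const int in[WIENER_TAPS],
                                int out[WIENER_TAPS]) {
  for (int i = 0; i < WIENER_TAPS; ++i) out[i] = in[i];
  out[1] += out[0];
  out[WIENER_TAPS - 2] += out[WIENER_TAPS - 1];
  out[0] = 0;
  out[WIENER_TAPS - 1] = 0;
}

/* Sum of all taps, used to check that normalization is preserved. */
static int tap_sum(const int taps[WIENER_TAPS]) {
  int sum = 0;
  for (int i = 0; i < WIENER_TAPS; ++i) sum += taps[i];
  return sum;
}
```

A 7-tap vertical filter reads 3 rows above and 3 below the current row (6 in total); with the end taps at zero it reads 2 above and 2 below, which matches the reduction from 6 to 4 stated above.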
      
      Change-Id: I5e21b58369777eabf553a8987387d112f98a5598
  5. 06 Sep, 2017 7 commits
    • Fix a bug forcing encoder to transmit mbmi info first · 49c335f4
      Wei-Ting Lin authored
      Fix the bug that forced the encoder/decoder to transmit/decode all
      mbmi info for a superblock first.
      
      Change-Id: I623217655b043fc90adbcc13e4cf2a4a845084ab
    • Properly merge the tile context models in lv-map · 7bc599f5
      Jingning Han authored
      Merge the tile context models at the end of frame coding.
      
      Change-Id: I0a71c49c448cfbd71fb0bd15ca7b1f6097c56529
    • Remove motion_mode_wrapper · 20885281
      Wei-Ting Lin authored
      Change-Id: I3de1c933ee0fa90e9c0d52e6cbe4bc8bf5482a73
    • ncobmc-adapt-weight: refactoring the mode selection function · 3122b7d5
      Wei-Ting Lin authored
      Change-Id: I7393596d98f11aa53ba4b9e329386b5168b3e086
    • Adjust chroma position in warp filter · a60dc9d6
      David Barker authored
      When using chroma subsampling, the warp filter currently behaves
      strangely when projecting chroma pixels, especially when the
      subsamplings are not equal along the x and y axes.
      
      For example, when subsampling_x = 1 and subsampling_y = 0, we
      calculate the destination coordinates (dx, dy) from the source
      coordinates (sx, sy) as:
      dx = project(2*sx+0.5, 2*sy+0.5)/2 - 0.5
      dy = project(sx, sy)
      where project() applies the affine warp model.
      
      This patch changes to a simpler and more consistent model,
      where we:
      * Project the chroma sample into luma coordinates, taking
        the chroma sample to be co-located with the top-left luma
        sample in its (2x2, or 2x1, or 1x2) subsampling block
        (this is done for simplicity; we don't expect the exact
         position to make much difference to the output quality)
      * Apply the transformation in luma coordinates
      * Project the resulting luma sample back into chroma coordinates
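
The three steps above can be sketched as follows. This is a floating-point illustration only: the `AffineModel` type and `project_chroma` function are hypothetical, and a real implementation would presumably work in fixed-point arithmetic.

```c
#include <assert.h>

/* Hypothetical affine model in luma coordinates:
 *   x' = m[0]*x + m[1]*y + m[2]
 *   y' = m[3]*x + m[4]*y + m[5] */
typedef struct {
  double m[6];
} AffineModel;

/* Map a chroma sample (cx, cy) to its warped chroma position (*dx, *dy). */
static void project_chroma(const AffineModel *model, int ss_x, int ss_y,
                           double cx, double cy, double *dx, double *dy) {
  assert((ss_x == 0 || ss_x == 1) && (ss_y == 0 || ss_y == 1));
  /* 1. Chroma -> luma: treat the chroma sample as co-located with the
   *    top-left luma sample of its subsampling block. */
  const double lx = cx * (1 << ss_x);
  const double ly = cy * (1 << ss_y);
  /* 2. Apply the transformation in luma coordinates. */
  const double px = model->m[0] * lx + model->m[1] * ly + model->m[2];
  const double py = model->m[3] * lx + model->m[4] * ly + model->m[5];
  /* 3. Luma -> chroma. */
  *dx = px / (1 << ss_x);
  *dy = py / (1 << ss_y);
}
```

Because both axes go through the same scale, transform, unscale sequence, unequal subsampling (for example subsampling_x = 1, subsampling_y = 0) no longer needs a separate formula per axis.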
      
      Change to software speed is in the noise, but this approach
      should be simpler in hardware, and should slightly improve
      quality for 4:2:2 and 4:4:0 videos.
      
      Change-Id: Idd455fdd3897594ca7d4edff5b85b78961d1638d
    • Round up subsampled frame size in av1_loop_restoration_corners_in_sb · 7380b25e
      Rupert Swarbrick authored
      The previous code converted a frame_w (say) of 1 to zero for a plane
      where subsampling was enabled, causing a division by zero in
      av1_get_rest_ntiles. This doesn't match the spec, which says
      subsampling rounds up.
      
      The patch adds the rounding, and also adds an assertion to
      av1_get_rest_ntiles to help diagnose any other broken callsites.
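
The rounding issue can be illustrated with a short sketch. The helper name `subsampled_size_round_up` is hypothetical; it shows the general round-up idiom, not the exact expression used in the patch.

```c
#include <assert.h>

/* Size of a plane dimension after subsampling by ss bits, rounded up.
 * A plain `size >> ss` truncates, so a size of 1 with ss = 1 becomes 0. */
static int subsampled_size_round_up(int size, int ss) {
  assert(size > 0 && ss >= 0);
  return (size + (1 << ss) - 1) >> ss;
}
```

For a frame_w of 1 with subsampling enabled, truncation yields 0 while the rounded version yields 1, so later divisions by the plane size stay well defined.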
      
      Change-Id: Ia6c249fa935c3a16d122ba6e7b450fe99f412fde
    • Make loop-restoration use 64x64 processing units · 7a5587a8
      Debargha Mukherjee authored
      Changes loop-restoration to use processing unit size that is
      64x64 for luma; for chroma the processing unit is coupled to
      64x64 support region for luma.
      Thus for chroma the processing unit size is 32x32 for 4:2:0,
      32x64 for 4:2:2 and 64x64 for 4:4:4, etc.
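
The chroma unit sizes listed above follow from shifting the 64x64 luma unit by the subsampling factors. A sketch with a hypothetical `UnitSize` type and function name:

```c
#include <assert.h>

/* Width and height of a processing unit, in samples (hypothetical type). */
typedef struct {
  int width;
  int height;
} UnitSize;

/* Luma uses 64x64; chroma covers the same 64x64 luma region, so its unit
 * shrinks by the subsampling shift in each direction. */
static UnitSize lr_processing_unit_size(int plane, int ss_x, int ss_y) {
  assert(plane >= 0 && plane <= 2);
  UnitSize unit = { 64, 64 };
  if (plane > 0) {
    unit.width >>= ss_x;
    unit.height >>= ss_y;
  }
  return unit;
}
```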
      
      While the Wiener filter output should not change with this patch,
      the sgr filter will change since the boundary pixel handling in
      sgr is internal within the filter.
      
      Change-Id: I65a9e2df88927a19445420ce400acb1fcf7afa93
  6. 05 Sep, 2017 8 commits
  7. 04 Sep, 2017 3 commits
    • Static local functions in mfmv · 5c700910
      Jingning Han authored
      Change-Id: I0fefe099b314295583e8e17e55e4d8fc375a5b0c
    • Constrain motion vector projection range · b74a72bf
      Jingning Han authored
      Constrain the maximum motion vector projection range to be within
      +/-32 pixels in the vertical direction and +/-64 pixels in the
      horizontal direction.
      
      Such constraints allow a fixed amount of reference motion vector
      load to SRAM for each 64x64 block size, independent of the frame
      size. The wider range in the horizontal direction can be stored in
      the SRAM and reused by next 64x64 block. The compression performance
      loss is 0.03% for lowres and 0.04% for midres.
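
A sketch of the range constraint. The limits come from the message; the macro and function names are hypothetical, and both the inclusive bounds and the choice to test (rather than clamp) an out-of-range projection are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Limits quoted in the commit message, in pixels. */
#define MAX_PROJ_OFFSET_VERT 32
#define MAX_PROJ_OFFSET_HORZ 64

/* Returns 1 if a projected offset (in pixels) stays inside the allowed
 * window, 0 otherwise. How the codec treats a rejected projection is not
 * described in the message. */
static int mv_projection_in_range(int row_offset_px, int col_offset_px) {
  return abs(row_offset_px) <= MAX_PROJ_OFFSET_VERT &&
         abs(col_offset_px) <= MAX_PROJ_OFFSET_HORZ;
}
```

Since the window is fixed, the amount of reference motion vector data needed per 64x64 block does not depend on the frame size.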
      
      Change-Id: I7f1c136363b136b1f2fa9f7c962a791c8e91a976
    • apply clang-format · 4eafefe0
      clang-format authored
      Change-Id: If0b48a4ee1f7902d8c6154945ccef68a2b5aabb5
  8. 03 Sep, 2017 1 commit
    • Move loop restoration coefficients to within the frame · 6c545216
      Rupert Swarbrick authored
      Rather than encoding the loop restoration coefficients at the start of
      the frame header, this patch moves them to occur just after certain
      top-level superblocks.
      
      You might hope that we could just encode coefficients on top-level
      superblocks where the top-left corner of the superblock was also the
      top-left corner of the loop restoration tile. Unfortunately, this
      can't work with the superres experiment, where the loop restoration
      tiles don't necessarily line up with the superblocks. Indeed, in
      general there can be multiple different loop restoration coefficients
      that apply in a given top-level superblock. This patch defines a
      function, av1_loop_restoration_corners_in_sb, which yields the
      rectangle [rrow0, rrow1) x [rcol0, rcol1) of loop restoration tiles
      whose top left corners lie in this top-level superblock.
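
The rectangle computation can be sketched as below. This is not the signature of av1_loop_restoration_corners_in_sb: the function `rest_corners_in_sb`, the `RestTileRect` type and all parameters are hypothetical, and the sketch works in plain pixel units with square tiles and no superres scaling.

```c
#include <assert.h>

/* Half-open rectangle of restoration-tile indices (hypothetical type). */
typedef struct {
  int rrow0, rrow1, rcol0, rcol1;
} RestTileRect;

static int ceil_div(int a, int b) { return (a + b - 1) / b; }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Tile k has its top-left corner at pixel k * tile_size. That corner lies in
 * the superblock span [p0, p0 + sb_size) exactly when
 * ceil(p0 / tile_size) <= k < ceil((p0 + sb_size) / tile_size).
 * Returns 1 if at least one tile corner lies in the superblock. */
static int rest_corners_in_sb(int sb_x0, int sb_y0, int sb_size,
                              int tile_size, int tile_rows, int tile_cols,
                              RestTileRect *rect) {
  assert(sb_x0 >= 0 && sb_y0 >= 0 && sb_size > 0 && tile_size > 0);
  rect->rrow0 = min_int(ceil_div(sb_y0, tile_size), tile_rows);
  rect->rrow1 = min_int(ceil_div(sb_y0 + sb_size, tile_size), tile_rows);
  rect->rcol0 = min_int(ceil_div(sb_x0, tile_size), tile_cols);
  rect->rcol1 = min_int(ceil_div(sb_x0 + sb_size, tile_size), tile_cols);
  return rect->rrow0 < rect->rrow1 && rect->rcol0 < rect->rcol1;
}
```

When the tile size is not a multiple of the superblock size, one superblock can contain several tile corners, which is why a rectangle rather than a single tile index is returned.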
      
      The total file size should be unchanged by this patch: the bits have
      just been moved from the frame header and spread out among the rest of
      the frame.
      
      Change-Id: Icf43b0560964a63dea0d2cd801313f04139188d7
  9. 02 Sep, 2017 3 commits
  10. 01 Sep, 2017 2 commits
    • Fix the bug described in bug report 723 · a97c897b
      Ryan authored
      Link: https://bugs.chromium.org/p/aomedia/issues/detail?id=723
      
      BUG=aomedia:723
      
      Change-Id: Iece3abcd88de69ab410674615965687abb5e4579
    • Miscellaneous fixes for var-tx · 16c64e33
      David Barker authored
      Lots of small bug fixes, mainly around the transform size coding:
      
      * The loop filter was accidentally using the non-subsampled
        block size for the V plane, due to comparing a plane index
        (0, 1, or 2) against PLANE_TYPE_UV (== 1)
      
      * We allowed an initial update of the transform partition probabilities
        even on frames where we know they will never be used
        (because tx_mode != TX_MODE_SELECT).
        Further, these probabilities would not be reverted at the end
        of the frame, leading to the probability delta persisting across frames.
      
        Change this to behave more like the non-var-tx transform size coding,
        where probability deltas are only coded for frames with
        tx_mode == TX_MODE_SELECT, and the deltas only apply for one frame.
      
      * Fix decoder for the case where the video as a whole isn't lossless,
        and we have tx_mode == TX_MODE_SELECT, but the current segment
        *is* lossless.
        Note that the encoder already does the right thing in this case.
      
      * Don't allow the transform splitting to recurse "below" 4x4.
        This is really just a refactor, but means we can increase the
        maximum depth when subdividing rectangular transforms if we
        want to, whereas the previous code would have needed special cases
        for 4x8 and 8x4 transforms.
      
      * Finally, when we hit the maximum splitting depth, don't update
        the counts as if we had coded a 'no split' symbol.
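
The first fix in the list above (the loop filter's plane check) comes down to confusing a plane index with a plane type. A sketch of the mistake; the two function names are hypothetical and `plane > 0` is an assumed form of the fix, not necessarily the expression used in the patch.

```c
#include <assert.h>

#define PLANE_TYPE_Y 0
#define PLANE_TYPE_UV 1

/* Buggy check: compares a plane index (0 = Y, 1 = U, 2 = V) against a plane
 * type, so the V plane (index 2) is not recognised as chroma. */
static int is_chroma_plane_buggy(int plane) { return plane == PLANE_TYPE_UV; }

/* Assumed fix: every plane index above 0 is a chroma plane. */
static int is_chroma_plane_fixed(int plane) { return plane > 0; }
```

With the buggy check the V plane falls through to the luma path and gets the non-subsampled block size, which is the behaviour described in that bullet.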
      
      Change-Id: Iaebdacc9de81d2e93d3c49241e719bbc02e32682
  11. 31 Aug, 2017 6 commits
    • signed char -> int8_t for consistency · 0fbe33d6
      Yaowu Xu authored
      Change-Id: I5cf978071fbb55040d2be88f627b600484988520
    • avoid operation on invalid ref_row · fc377967
      Yaowu Xu authored
      BUG=aomedia:718
      
      Change-Id: Ib3fc5e83dd915d6869ee2d7e0bf40427111c6499
    • Use 7 neighbors for nz_map ctx · 2b38deff
      Angie Chiang authored
      This drops coding performance slightly (lowres: 0.093%).
      
      Increases encoder speed by 24%.
      
      Reduces nz_map's context size by 20%.
      
      Change-Id: I871c18a7e0341e066afc334556b9998194b3f8c9
    • Using CDFs for read_partition special case · 8711cf5f
      Stanislav Vitvitskyy authored
      Test results:
      akiyo       -0.05%
      bowing      -0.072%
      bridge      -0.042%
      bus         -0.156%
      coastguard  -0.645%
      container   -0.087%
      deadline     0.007%
      flower       0.02%
      football    -0.009%
      foreman      0.03%
      hall         0.087%
      highway     -0.041%
      husky       -0.031%
      mad900       0.015%
      mobile      -0.007%
      mother       0.012%
      news         0.039%
      pamphlet     0.061%
      paris       -0.003%
      sign        -0.148%
      silent       0.003%
      students    -0.009%
      tempete     -0.061%
      waterfall    0.666
      
      Change-Id: I96c2fd3a6fbc5f8e5cf7f3b881ef89335e58d5ac
    • [CFL] Asserts for chroma_sub8x8 · c84c21c4
      Luc Trudeau authored
      When Chroma from Luma is combined with chroma_sub8x8, the prediction
      used for sub8x8 blocks originates from multiple luma blocks. Extra
      asserts are added to validate that the prediction buffer contains all
      the required information.
      
      Change-Id: I305c46ce9b8292697e1d5b181d123461026da11c
    • Remove probability model for coefficient tokens · b53682f5
      hui su authored
      Remove the token prob tables and counters.
      
      Change-Id: Ic63d52d80bb922fc10b586c27a20f2378618168c