1. 18 Sep, 2017 9 commits
  2. 16 Sep, 2017 5 commits
    • Remove the itrans DSPR2 sources. · 77f792f0
      Tom Finegan authored
      These files define functions that are unused. Update CMake
      and configure builds to remove references and delete the
      source files. These files defined the DSPR2 specializations
      of high bit depth versions of the following functions:
      
      - av1_iht16x16_256_add
      - av1_iht8x8_64_add
      - av1_iht4x4_16_add
      
      Change-Id: Ie3ef2592efe1519589a735b0d0db2806eec83e59
    • intrabc: replace prob with cdf · 6c8584f6
      Hui Su authored
      Improves keyframe coding by 0.1% on the screen_content testset.
      
      Change-Id: I5793a67eaae21010ef200038af99ebb9029fc770
    • [intra-edge] Vectorize edge filtering functions · 89d321f7
      Joe Young authored
      Add sse4_1 functions for Intra-edge experiment:
        av1_filter_intra_edge_sse4_1()
        av1_filter_intra_edge_high_sse4_1()
      
      Approx cycle reduction at qp 20, 1 kf:
        Enc (lbd) 1.4% to 0.3%
        Dec (lbd) 0.4% to 0.1%
        Enc (hbd) 1.1% to 0.2%
        Dec (hbd) 0.6% to 0.1%
      
      No change to bitstream
      
      Change-Id: I176b2d125424d7d226114c807915c33dde5c3720
    • Fix CMake mips32 build with DSPR2 enabled. · db724cf0
      Tom Finegan authored
      - Add aom_scale dspr2 sources to the correct target (aom).
      - Fix an inverted high bit depth condition.
      - Remove claims from av1_rtcd_defs.pl that the dspr2 variants
        av1_iht16x16_256_add_dspr2, av1_iht8x8_64_add_dspr2, and
        av1_iht4x4_16_add_dspr2 exist in low bit depth configs.
      
      Change-Id: Ibdd42e475b81c2491f02ba10ca0d461f7ff15bc5
    • Add a q index based frame superres mode · 7166f22a
      Debargha Mukherjee authored
      Refactors and adds superres-mode 3 and associated
      parameters --superres-qthresh and --superres-kf-qthresh
      that are used to trigger superres mode when the qindex
      for any frame exceeds the thresholds provided for non-key
      and key-frames respectively. The superres scale factor
      numerator is progressively reduced from 16 starting from
      that q threshold following a fixed slope.
      
      Change-Id: If1c782993667a6fbaaa01bbde77c4924008c0d28
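
The q-index trigger described in this commit can be sketched roughly as follows. This is a minimal illustration, not the libaom implementation: the function names, the slope (`SKETCH_Q_PER_STEP`), and the lower bound (`SKETCH_NUM_MIN`) are assumptions. The commit message only states that the numerator starts at 16 and is reduced along a fixed slope once the threshold is exceeded.

```c
/* Sketch only: the slope and lower bound are assumed values, chosen for
 * illustration. The commit message gives just the starting numerator (16). */
#define SKETCH_NUM_MAX 16     /* starting numerator, from the commit message */
#define SKETCH_NUM_MIN 8      /* assumed lower bound */
#define SKETCH_Q_PER_STEP 8   /* assumed slope: qindex steps per unit drop */

/* Pick the threshold for the frame type: --superres-kf-qthresh for key
 * frames, --superres-qthresh for all other frames. */
static int sketch_qthresh(int is_key_frame, int qthresh, int kf_qthresh) {
  return is_key_frame ? kf_qthresh : qthresh;
}

/* Return the scale factor numerator for a frame coded at `qindex`. */
static int sketch_superres_numerator(int qindex, int thresh) {
  if (qindex <= thresh) return SKETCH_NUM_MAX; /* threshold not exceeded */
  /* Integer division: the numerator drops by 1 per SKETCH_Q_PER_STEP. */
  int num = SKETCH_NUM_MAX - (qindex - thresh) / SKETCH_Q_PER_STEP;
  return num < SKETCH_NUM_MIN ? SKETCH_NUM_MIN : num;
}
```

With these assumed constants, a frame whose qindex is 32 above its threshold would get a numerator of 12.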
  3. 15 Sep, 2017 2 commits
    • Force C implementation of 16-point Daala TX's. · 34e1201a
      Nathan E. Egge authored
      This patch fixes a regression introduced in 1d190950 where the encoder
      was using the 16x16 VP9/AV1 transforms for RDO, but then used the Daala
      transforms for encoding.
      
      subset1:
      
      master-daala_dct16@2017-09-13T12:05:18.013Z ->
        master_daala_dct16_use_c@2017-09-13T13:05:02.252Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.3654 | -0.7634 | -0.7407 |  -0.4884 | -0.4699 | -0.4945 |    -0.5104
      
      master-no_rect_tx-no_var_tx@2017-09-12T00:23:18.153Z ->
        master_daala_dct16_use_c@2017-09-13T13:05:02.252Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0133 |  0.1040 | -0.0440 |  -0.0492 | -0.0151 | -0.0120 |     0.0699
      
      Change-Id: Id1830d0975db4bd0320a47fdf45b4bca20881cfb
    • Further refactor setup_ref_mv_list · d1121fa3
      Yunqing Wang authored
      This patch follows the previous refactoring patch, and further reduces
      the number of calls made to scan_row_mbmi and scan_col_mbmi by going
      through partition blocks instead of mi blocks. This patch doesn't change
      the bitstream, as confirmed by Borg test results.
      
      The baseline decoder speed test on a 1080p clip showed that the average
      decoder speedup was 1.1% (fps: 32.626 --> 32.994).
      
      Change-Id: Ic375ae5d682c7454e2f2a2fcf8baa6b4b438d9a6
  4. 14 Sep, 2017 1 commit
  5. 13 Sep, 2017 3 commits
    • [CFL] Fix typedef-redefinition compiler warnings · 5b2021ea
      David Michael Barr authored
      Instead of forward-declaring AV1_COMMON and MACROBLOCKD,
      move the dependent struct and function prototype closer
      to where they are used and after these types are defined.
      
      Change-Id: I75f005b46ef322a6fcbc01377b8dded1637c5f73
    • Simplify get_partition() implementation · 136d5c17
      Rupert Swarbrick authored
      This function is given a bsize and an mi array and has to figure out
      what partition to use to divide the given bsize in the direction of
      the sizes it finds in the mi array. (Since each block size can be
      reached by only one sequence of partitions, this can be done
      unambiguously.)
      
      The previous version was correct, working by looking up entries in the
      partition_lookup array. Unfortunately, that lookup isn't quite enough
      when CONFIG_EXT_PARTITION_TYPES is true, so it then had to do some
      slightly confusing computations to fix things up after the fact.
      
      The new version should be more self-explanatory and doesn't work by
      looking things up in a magic array. It looks up the width and height
      corresponding to bsize and compares them with the width and height
      corresponding to the sb_type at mi_row,mi_col in the mi array. When
      CONFIG_EXT_PARTITION_TYPES is false, this is all you need, and the
      four corresponding cases can be found by a lookup in an array of 4
      elements.
      
      With extended partition types and a sufficiently large block, you have
      to do a bit more searching. For example, if bsize is BLOCK_16X16 and
      the subsize is BLOCK_8X8, the partition might be PARTITION_SPLIT, but
      it might be one of PARTITION_HORZ_A or PARTITION_VERT_A instead. The
      new code adds some comments to explain what's going on.
      
      A nice side-effect of rewriting get_partition in this way is that it
      lets us completely dispense with the partition_lookup array.
      
      The patch also fixes comments for the A/B extended partitions in
      enums.h, which were slightly backwards (a "horizontal split" means two
      blocks vertically above one another).
      
      Change-Id: I4b48189103aa63e1859f25a15d7690d53ca7baf5
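
For the case without CONFIG_EXT_PARTITION_TYPES, the width/height comparison described in this commit can be sketched as below. The enum, the function name, and the plain integer width/height parameters are hypothetical stand-ins; the real code works on BLOCK_SIZE values and the mi array. The sketch does not attempt the extra search needed for the extended A/B partition types.

```c
/* Hypothetical stand-in for the four basic partition types. */
typedef enum {
  SKETCH_PARTITION_NONE,
  SKETCH_PARTITION_HORZ,  /* two blocks stacked vertically */
  SKETCH_PARTITION_VERT,  /* two blocks side by side */
  SKETCH_PARTITION_SPLIT
} SketchPartition;

/* bw, bh: width and height of the block being partitioned.
 * sw, sh: width and height of the sub-block found at mi_row, mi_col. */
static SketchPartition sketch_get_partition(int bw, int bh, int sw, int sh) {
  /* Index bit 0: sub-block is narrower; bit 1: sub-block is shorter. */
  static const SketchPartition lookup[4] = {
    SKETCH_PARTITION_NONE,  /* same width, same height */
    SKETCH_PARTITION_VERT,  /* narrower only */
    SKETCH_PARTITION_HORZ,  /* shorter only */
    SKETCH_PARTITION_SPLIT  /* narrower and shorter */
  };
  const int narrower = sw < bw;
  const int shorter = sh < bh;
  return lookup[(shorter << 1) | narrower];
}
```

The four-element table mirrors the "array of 4 elements" mentioned in the message: each of the two comparisons contributes one index bit.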
    • Change/refactor compound mode handling for sub8x8 · 0f248c46
      Debargha Mukherjee authored
      Turn off compound modes as long as one of the dimensions
      is less than 8.
      
      Impact on AWCY (0.05% increase in BDRATE):
      https://arewecompressedyet.com/?job=debargha-nocdef-sub8c8nc-0907%402017-09-07T20%3A28%3A38.251Z&job=debargha-nocdef-0907%402017-09-07T14%3A42%3A17.170Z
      
      Change-Id: I4a70890c04149246a50e60990dede21cb8052fad
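
The rule stated in this commit (no compound modes when either block dimension is below 8) amounts to a simple predicate. The function name and the pixel-based parameters are assumptions made for illustration; the actual check in the codebase may be expressed through block-size lookup tables.

```c
/* Sketch: compound modes are allowed only when both dimensions are >= 8.
 * bw and bh are the block width and height in pixels (assumed units). */
static int sketch_compound_allowed(int bw, int bh) {
  const int min_dim = bw < bh ? bw : bh;
  return min_dim >= 8;
}
```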
  6. 12 Sep, 2017 3 commits
    • Revert "Add an orthonormal 4-point Type-VII DST." · 5a5e1adb
      Nathan Egge authored
      This reverts commit 72c99e1a.
      
      No metrics on this commit.
      
      Change-Id: I9fc350b25e710c3d5d6d8299ab5348e8b31b39ea
    • Add an orthonormal 4-point Type-VII DST. · 72c99e1a
      Nathan E. Egge authored
      Replaces the orthonormal Type-IV DST with an orthonormal Type-VII DST
      in od_bin_fdst4() and od_bin_idst4().
      
      Change-Id: I4ff0888e740d8cc063a2e5deaeceef7cb0d80485
    • Clarify comment in av1_set_mb_mi() · ee0fd929
      David Barker authored
      The code currently pads the decoded frame width and height to a
      multiple of 8 luma pixels, but there is a TODO suggesting that
      this may be changed to only require a multiple of 4 in future.
      
      But, as per the comments on the linked bug, there are good reasons
      to keep the decoded width and height as multiples of 8. So delete
      the outdated TODO and instead outline the reasons why the current
      behaviour is helpful.
      
      BUG=aomedia:727
      
      Change-Id: I2340bbcd740afe74c2e6fb3cf2e7a420db2b4f40
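
Padding a dimension up to a multiple of 8 luma pixels, as this commit describes, is commonly written with the round-up-and-mask idiom below. The helper name is hypothetical and this is not a copy of av1_set_mb_mi(); it only illustrates the arithmetic.

```c
/* Sketch: round a non-negative luma dimension up to a multiple of 8. */
static int sketch_align_to_8(int luma_pixels) {
  return (luma_pixels + 7) & ~7;
}
```

For example, a height of 1080 is already a multiple of 8 and stays unchanged, while 1081 would be padded to 1088.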
  7. 11 Sep, 2017 6 commits
    • Tokenize and write mrc mask · 99e7daa2
      Sarah Parker authored
      This allows a mask for mrc-tx to be sent in the bitstream for
      inter or intra 32x32 transform blocks. The option to send the mask
      vs build it from the prediction signal is currently controlled with
      a macro. In the future, it is likely the macro will be removed and it
      will be possible for a block to select either method. The mask building
      functions are still placeholders and will be filled in in a followup.
      
      Change-Id: Ie27643ff172cc2b1a9b389fd503fe6bf7c9e21e3
    • Use per tile model update for coeff_br coding · e29c9260
      Jingning Han authored
      Change-Id: Ie52d52bc25e3fdfdea877349215431d8edc064a3
    • add SEG_LVL_ZEROMV · a752d1d5
      Soo-Chul Han authored
      Change-Id: Icd04302886a4d12890d04f9f15563169a91e3a0d
    • Change vertical border to 1 (2-line buffer) · da46da8e
      Debargha Mukherjee authored
      Changes the macros WIENER_BORDER_VERT and
      SGRPROJ_BORDER_VERT to 1 to use a single line border
      vertically, which will be equivalent to a 2-line buffer.
      
      Change-Id: I788d9dca53d3d492058914215acf61e9d3d3880d
    • CDEF: Do not filter chroma if subsampling_x != subsampling_y · 1c1161f1
      Steinar Midtskogen authored
      Since CDEF uses the luma direction for chroma, CDEF would have
      to change significantly to support formats like 4:2:2.  The limited
      use of such formats does not justify the complexity to support this,
      so the simple solution is to mandate that the chroma planes aren't
      filtered if subsampling_x != subsampling_y.  Most of the visual gain
      is in luma, anyway.
      
      This also means that the chroma strengths and chroma skip condition
      shall not be sent if subsampling_x != subsampling_y.
      
      BUG=aomedia:720
      
      Change-Id: I35c184a6fe0908ae0fee1e74494b6904fa9a3c82
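
The condition in this commit can be captured in a single predicate that would gate both the chroma filtering and the signalling of chroma strengths. The function name is hypothetical; only the subsampling_x and subsampling_y names come from the commit message.

```c
/* Sketch: chroma planes are filtered by CDEF (and chroma strengths are
 * signalled) only when both subsampling factors match. */
static int sketch_cdef_filters_chroma(int subsampling_x, int subsampling_y) {
  return subsampling_x == subsampling_y;
}
```

Under this rule 4:2:0 (1, 1) and 4:4:4 (0, 0) keep chroma filtering, while 4:2:2 (1, 0) does not.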
    • cdef-singlepass: Fix integer overflow · 0123bc7d
      David Barker authored
      When cdef-singlepass is enabled, it is possible to signal the
      CDEF parameters in such a way that we end up with unsigned
      integer overflow in constrain().
      
      Fix this by i) using signed instead of unsigned values,
      and ii) clamping the result to avoid going on to shift
      by a negative amount.
      
      Change-Id: Ib677b2d644e44000c54959f7280e646bf02054da
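
The two fixes listed in this commit (signed arithmetic, and clamping before the shift) can be illustrated with the sketch below. It is an assumed shape for a constrain-style function, not the code from the patch: the formula, the helper `sketch_msb()`, and all names are illustrative.

```c
#include <stdlib.h>

/* Position of the most significant set bit; v must be non-zero. */
static int sketch_msb(unsigned int v) {
  int n = 0;
  while (v >>= 1) n++;
  return n;
}

/* Sketch of a constrain-style function using signed values throughout. */
static int sketch_constrain(int diff, int threshold, int damping) {
  if (threshold <= 0) return 0;
  /* Fix (ii): clamp so we never shift by a negative amount. */
  int shift = damping - sketch_msb((unsigned int)threshold);
  if (shift < 0) shift = 0;
  int mag = abs(diff);
  /* Fix (i): with signed ints this can go negative and be clamped to 0,
   * whereas an unsigned subtraction would wrap around to a huge value. */
  int limit = threshold - (mag >> shift);
  if (limit < 0) limit = 0;
  if (mag > limit) mag = limit;
  return diff < 0 ? -mag : mag;
}
```

With unsigned values, the `threshold - (mag >> shift)` step is where wrap-around would occur for a large `diff`; the signed version simply yields 0 in that case.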
  8. 10 Sep, 2017 4 commits
    • Rework base range entropy coding in level map system · 87b01b5a
      Jingning Han authored
      Replace the truncated geometric distribution model with the grouped
      leaves structure for more efficient probability modeling.
      Each group has its own geometric distribution.
      
      This gives us a 0.2% gain on lowres.
      
      Change-Id: If5c73dd429bd5183a8aa81042f8f56937b1d8a6a
    • Re train nz_map prob · 698a6185
      Angie Chiang authored
      We got a 0.07% gain on lowres.
      
      Change-Id: I0aef14d16025d9933ec3d3b71086f3f55c81df66
    • Refactoring/simplification of buffers used for sgr · 1330dfd1
      Debargha Mukherjee authored
      Includes miscellaneous cleanups, test fixes, and code reorganization
      for loop-restoration components.
      
      Change-Id: I5b2e6419234d945e6f4344b22636119b50df4054
    • Reduce/Eliminate line buffer for loop-restoration. · e168a783
      Debargha Mukherjee authored
      This patch forces the vertical filtering of the top and bottom
      rows of a processing unit for the Wiener filter to use no more
      border than is set in the WIENER_BORDER_VERT macro.
      This macro is currently set to 0 to eliminate the line buffer completely,
      but it could be increased to 1 or 2 to use limited line buffers
      if coding efficiency suffers too much with no line buffer.
      
      Also, for the sgr filter we added the option of using overlapping
      windows horizontally and vertically to improve coding efficiency.
      The vertical border used is set by the SGRPROJ_BORDER_VERT
      macro, while the horizontal border can be set by the
      SGRPROJ_BORDER_HORZ macro set at 2, the max needed. Currently we do not
      recommend changing SGRPROJ_BORDER_HORZ below 2.
      
      The overall line buffer requirement for LR is twice the max of
      WIENER_BORDER_VERT and SGRPROJ_BORDER_VERT.
      Currently both are set as 0, eliminating line buffers completely.
      
      Also this patch extends borders consistently before CDEF / LR.
      
      Change-Id: Ie58a98c784a0db547627b9cfcf55f018c30e8e79
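
The line buffer requirement stated in this commit (twice the max of the two vertical border macros) can be written out as a small helper. The function name is hypothetical; the two parameters stand for the values of WIENER_BORDER_VERT and SGRPROJ_BORDER_VERT.

```c
/* Sketch: number of line-buffer rows needed for loop restoration, given
 * the vertical borders of the Wiener and sgr filters. */
static int sketch_lr_line_buffer_rows(int wiener_border_vert,
                                      int sgrproj_border_vert) {
  const int max_border = wiener_border_vert > sgrproj_border_vert
                             ? wiener_border_vert
                             : sgrproj_border_vert;
  return 2 * max_border;
}
```

With both borders at 0 the result is 0 rows, matching the "eliminating line buffers completely" statement; a border of 1 gives a 2-line buffer.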
  9. 09 Sep, 2017 3 commits
  10. 08 Sep, 2017 2 commits
  11. 07 Sep, 2017 2 commits
    • Fix scaling parameter in non-normative warping · 5a9e82e3
      Debargha Mukherjee authored
      Use macro SCALE_SUBPEL_BITS rather than hard-coded 4 for old
      warping code that is used for non-affine global models.
      
      Change-Id: I10ee7b29101cd79e77a4d29e69d67497fda4e967
    • Lowbd parallel_deblocking sse2 optimization · ea8a0d52
      Yi Luo authored
      Baseline + parallel_deblocking:
      
      - Passed unit tests *SSE2/Loop8Test6*, *AVX2/Loop8Test6*.
      - 1080p, 25 frames, profile=0, encoding/decoding, output match.
      - Decoder frame rate increases from 54.15 to 65.84.
      
      Change-Id: I55938c94961066594f4b9080192c7268c19d9bf9