1. 12 Jun, 2017 3 commits
    • supertx: code refactoring + resolve conflicts with baseline · 8e689e4b
      Yue Chen authored
      Refactoring: split prediction+extension for each plane, so we can
      handle luma/chroma supertx pred in different ways.
      Compatibility fix: fix conflicts with cb4x4 and chroma_sub8x8. Now,
      for chroma sub8x8 supertx, only the top-left (basic cb4x4) or the
      bottom-right (cb4x4 + chroma_sub8x8) predictor will be used,
      without any blending within an 8x8 unit.
      
      Change-Id: I6cf7b12768a82d3c7e01811ada02de84af9bd8ac
    • Add encoder/decoder support for var-refs · 7b1ec7a9
      Zoe Liu authored
      Check the availability of the reference frames at the frame level at
      both encoder and decoder, and if a reference frame is not available
      for a specific video frame, remove the signaling of such reference
      frame info at the block level.
      
      This patch adds the consideration of the bit saving inside the RD
      optimization loop.
      
      Change-Id: I4c22f1b843b21c7d2b47e118c99c3ad615a3d4e4
    • Speed up CDEF parameter selection for cpu-used > 0 · b1555c93
      Steinar Midtskogen authored
      High delay cpu-used=4
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
       0.0579 |  0.1380 | -0.1975 |   0.0361 |  0.0226 |  0.0072 |     0.0470
      
      Low delay cpu-used=4
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.1016 | -0.0695 |  0.1013 |  -0.1324 | -0.0903 | -0.1260 |    -0.1710
      
      Change-Id: I5a66a1ffc2d1fb2a203065b7fbb2fd2bd2b281ad
  2. 11 Jun, 2017 1 commit
  3. 09 Jun, 2017 5 commits
    • Vectorize av1_convolve_2d() · 8295c7c7
      David Barker authored
      Includes a test case based on the warp filter tests
      
      Change-Id: I9abea53a088f68bb8a928ebd7cb96b3266a63c13
    • Avoid integer overflow in inv_txfm2d_add_facade() · 284f9d06
      Jonathan Matthews authored
      Bug introduced in change: I34cdeaed2461ed7942364147cef10d7d21e3779c
      
      BUG=aomedia:591
      
      Change-Id: I49b9edd2bf5a482b5afea5d83d56e04a0086f797
    • Unify high-precision convolve filters: convolve-round · 726a953c
      David Barker authored
      * Reduce bit widths of intermediate values where possible
      * Change ROUND_POWER_OF_TWO_SIGNED to ROUND_POWER_OF_TWO
        in av1(_highbd)_convolve_2d
      * Apply offsetting and bounds checking, to match the intended
        hardware implementation
      * Separate the implementations of av1(_highbd)_convolve_2d
        into compound-round and non-compound-round cases. This is because
        there are now a significant number of differences between the
        functions.
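
The rounding change in the list above can be illustrated with a small sketch. The two macros are written the way such rounding macros are commonly defined; their exact form in the codebase is an assumption here, and `round_with_offset` is a hypothetical helper, not the patch's code. It shows one way an added offset lets the unsigned-style rounding handle negative intermediates.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definitions, for illustration only. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + ((1 << (n)) >> 1)) >> (n))
#define ROUND_POWER_OF_TWO_SIGNED(value, n)           \
  (((value) < 0) ? -ROUND_POWER_OF_TWO(-(value), (n)) \
                 : ROUND_POWER_OF_TWO((value), (n)))

/* Hypothetical helper: make a possibly negative intermediate non-negative
 * by adding an offset, round it with the unsigned-style macro, then remove
 * the scaled offset again. */
static int32_t round_with_offset(int32_t value, int bits, int offset_bits) {
  const int32_t offset = 1 << offset_bits;
  assert(offset_bits >= bits);
  assert(value >= -offset && value < offset); /* bounds check */
  const int32_t shifted = value + offset;     /* now in [0, 2 * offset) */
  return ROUND_POWER_OF_TWO(shifted, bits) - (offset >> bits);
}
```

In this sketch the two approaches agree for non-negative inputs but differ on negative ties: `round_with_offset(-3, 1, 8)` gives -1, while `ROUND_POWER_OF_TWO_SIGNED(-3, 1)` gives -2.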
      
      Overall, this is expected to affect the bitstream and encoder output
      when convolve-round alone is enabled, but *not* when compound-round
      is enabled.
      
      Change-Id: I8c21e0645fd11f64c59552885f87f4a5dd40ccf7
    • Add 'do_average' to ConvolveParams structure · e64d51a9
      David Barker authored
      The 'ref' member of ConvolveParams currently serves two purposes:
      * To indicate which component of a compound we're currently predicting,
        eg. for fetching interpolation filters with dual-filter enabled.
      * To determine whether we should average into the destination buffer.
      
      But there are two cases where we want to separate these out:
      * In joint_motion_search, we want to try combining a fixed second
        prediction with various first predictions.
      * When searching masked interinter compounds, we want to predict
        each component separately then try different combinations.
      
      In these cases, we set 'ref' to 0 and use temporary variables to
      make sure we use the correct interpolation filters. But this is
      quite fragile.
      
      This patch separates out the two uses into separate members.
      This allows us to remove some temporary variables, but more
      importantly gives easy fixes to two bugs in
      build_inter_predictors_single_buf (used by rdopt):
      
      * We previously set ref=0 but didn't fix up the interpolation filters
      * For ZERO_ZEROMV modes, the second component would accidentally
        average into the (uninitialized!) second prediction buffer
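
A minimal sketch of the separation described above, assuming a much simplified stand-in for the real structure. `ConvParamsSketch` and `write_prediction` are hypothetical names, and the real ConvolveParams has other members that are not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the parameter structure. */
typedef struct {
  int ref;        /* which component of the compound (selects filters) */
  int do_average; /* nonzero: average into dst instead of overwriting */
} ConvParamsSketch;

/* Write a prediction into dst. Averaging is controlled only by
 * do_average, so ref can name the second component without forcing an
 * average into a buffer that has not been written yet. */
static void write_prediction(const uint8_t *pred, uint8_t *dst, int n,
                             const ConvParamsSketch *p) {
  assert(p->ref == 0 || p->ref == 1);
  for (int i = 0; i < n; ++i) {
    if (p->do_average) {
      dst[i] = (uint8_t)((dst[i] + pred[i] + 1) >> 1); /* rounded mean */
    } else {
      dst[i] = pred[i];
    }
  }
}
```

With `ref = 1` and `do_average = 0`, the second component is predicted into its own buffer using the second component's filters, and whatever the buffer held before is ignored.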
      
      BUG=aomedia:577
      BUG=aomedia:584
      BUG=aomedia:595
      
      Change-Id: Ibc31d1ac701a029ea5efaa1197dd402bc4b7af1e
    • AOM_QM: Use 8-bit matrices and fix 2x2 transform sizes. · 92aa22a8
      Thomas Davies authored
      2x2 transforms are now hidden behind the CHROMA_2X2 macro,
      not the CB4X4 macro.
      
      Change-Id: I5d73c679fba486ccda98fa8dbb804a3902df6c8d
  4. 08 Jun, 2017 3 commits
    • Refactor sub8x8 tx size RD for daala-dist · 30a2c5f2
      Yushin Cho authored
      For a tx size RD search with partition size >= 8x8 and tx size < 8x8,
      the daala-dist function is applied to the whole partition after all tx
      blocks are encoded, instead of to each 8x8 sub-block of the partition.
      
      Change-Id: I27d9e2960aa641f550096e32ebcdf8dfb4de79a6
    • Remove unused av1_inter_mode_tree ind/inv arrays. · 50484232
      Nathan E. Egge authored
      Change-Id: If00a0bdf239b2c9e355cffd2e472708acb189f16
    • Remove deprecated high-bitdepth functions · 31c66502
      Sarah Parker authored
      This unifies the codepath for high-bitdepth transforms and deletes
      all calls to the old deprecated versions. This required reworking
      the way 1d configurations are combined in order to support rectangular
      transforms.
      
      There is one remaining codepath that calls the deprecated 4x4 hbd
      transform from encoder/encodemb.c. I need to take a closer look
      at what is happening there and will leave that for a followup
      since this change has already gotten so large.
      
      lowres 10 bit: -0.035%
      lowres 12 bit: 0.021%
      
      BUG=aomedia:524
      
      Change-Id: I34cdeaed2461ed7942364147cef10d7d21e3779c
  5. 07 Jun, 2017 2 commits
  6. 06 Jun, 2017 6 commits
    • intrabc: Elide subpel bits · 6b2584c6
      Alex Converse authored
      objective-1-fast 1st KF: -0.04 BDRATE-PSNR
      twitch-1 1st KF: -0.04 BDRATE-PSNR
      
      Change-Id: I74e8e43278a3d228f9b0a9af014e69f80aa90a0f
    • Fix some UBSan warnings · 185575a7
      David Barker authored
      * Make intermediate arrays in av1(_highbd)_warp_affine_c signed,
        to avoid integer overflow when multiplying an 'unsigned int'
        by a negative 'int' value.
      
      * Pad out arrays in masked_variance_test.cc so that the array
        stride is a multiple of 16 bytes.
        This fixes some UBSan errors in masked_variance_intrin_ssse3.c
        related to unaligned loads of 32-bit values.
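
The stride padding mentioned above amounts to rounding a size up to a multiple of 16. A minimal sketch, assuming a power-of-two alignment; `align_up` is a hypothetical helper name, not a function from the test file.

```c
#include <assert.h>
#include <stddef.h>

/* Round value up to the next multiple of alignment.
 * alignment must be a nonzero power of two. */
static size_t align_up(size_t value, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}
```

For example, a row of 17 bytes would get a stride of `align_up(17, 16)`, which is 32, so that every row starts on a 16-byte multiple when the buffer itself is aligned.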
      
      BUG=aomedia:572
      
      Change-Id: I0cf786c94870ff128c883bed8e900b0686afc3f7
    • Add a new experiment "rect-intra-pred". · 766a389b
      Urvang Joshi authored
      Earlier, intra prediction for rectangular blocks was performed by
      running two steps of prediction on square sub-blocks.
      
      With this experiment, we do proper intra prediction for rectangular
      blocks. This ensures that we make use of all available neighboring
      pixels especially for directional modes. For this, all the intra
      predictors were updated to work with rectangular transform block sizes.
      
      Performance improvements are small but free of cost:
      
      All Intra frames:
      lowres: -0.126
      midres: -0.154
      
      Video Overall:
      lowres: -0.043
      midres: -0.100
      
      [Could not get AWCY results due to a backlog.]
      
      BUG=aomedia:551
      
      Change-Id: I7936e91b171d5c246cb0a4ea470a981a013892e6
    • [CFL] Get subsampling from AV1 common · dac5e391
      Luc Trudeau authored
      This change does not impact the bitstream
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I6e131e91bad5efa345ed2542ae970eb6122eff51
    • Move FrameContext out of plane loop · e2ac9855
      Luc Trudeau authored
      Change-Id: Ideaeb52dbaf87e5a68da90cb94b0517760cb9d5c
    • Make loop-restoration compatible w/ frame_superres · 2dd982e4
      Debargha Mukherjee authored
      When frame_superres is on, loop-restoration should work
      on the size of the upscaled frame and not on the internal
      width and height in the common structure. This patch
      makes the necessary changes on the encoder and decoder
      side to enable that.
      
      Change-Id: I1d1c024ac6f95944169d90647b4c5a61354a5cc6
  7. 05 Jun, 2017 2 commits
    • is_directional_mode: Check for directional modes directly. · 875a6675
      Urvang Joshi authored
      Earlier, the condition was negating all non-directional modes to check
      if a mode is directional. This was error-prone, e.g. when a new
      non-directional mode is added.
      
      By checking for directional modes directly, we avoid such errors.
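
The difference between the two checks can be sketched as follows. The enum is a hypothetical mode list for illustration; the real enum may differ in names, order, and length, and this sketch assumes the directional modes are contiguous.

```c
#include <assert.h>

/* Hypothetical mode list, for illustration only. */
typedef enum {
  DC_PRED,
  V_PRED, /* first directional mode */
  H_PRED,
  D45_PRED,
  D135_PRED,
  D63_PRED,    /* last directional mode */
  SMOOTH_PRED, /* stands in for a newly added non-directional mode */
  TM_PRED,
  NUM_MODES
} PredMode;

/* Old style: directional means "none of the known non-directional modes".
 * SMOOTH_PRED is misclassified until someone remembers to list it here. */
static int is_directional_by_exclusion(PredMode mode) {
  return mode != DC_PRED && mode != TM_PRED;
}

/* New style: test the directional range directly. */
static int is_directional_mode(PredMode mode) {
  assert(mode < NUM_MODES);
  return mode >= V_PRED && mode <= D63_PRED;
}
```

In this sketch the exclusion check reports `SMOOTH_PRED` as directional, while the direct range check does not.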
      
      Change-Id: Ia4a62e278cd73078c53ed5096db646eff77f054e
    • Early termination for warp error computation · 81f6ecd1
      Sarah Parker authored
      This terminates the warp error computation once
      the frame error exceeds the best frame error found
      so far, to avoid unnecessary computation.
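
A minimal sketch of this kind of early exit, assuming a sum of squared differences as the error metric and a check after every row. Both are assumptions, as are the function and parameter names; the actual metric and check granularity in the patch may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: accumulate the squared error between a reference
 * frame and a warped frame, and stop as soon as the running total exceeds
 * the best error seen so far. */
static int64_t frame_error_with_cutoff(const uint8_t *ref,
                                       const uint8_t *warped, int width,
                                       int height, int stride,
                                       int64_t best_error) {
  int64_t total = 0;
  assert(stride >= width);
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      const int d = ref[y * stride + x] - warped[y * stride + x];
      total += (int64_t)d * d;
    }
    /* This candidate can no longer beat the best one: give up early. */
    if (total > best_error) return total;
  }
  return total;
}
```

The returned value is only a lower bound when the cutoff fires, which is enough for a caller that just compares it against `best_error`.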
      
      Change-Id: I094a0b3e13f8b91610e051cb91d20a815879dd80
  8. 02 Jun, 2017 4 commits
    • Mark SMOOTH2 filter under USE_EXTRA_FILTER flag · aadbb025
      Angie Chiang authored
      Change-Id: Ia9a5d818e8c2ff9b4cc41c6d7950cfe005c20bfc
    • intrabc: adapt use_intrabc prob · 7c412ea4
      Alex Converse authored
      First keyframe BD-RATE objective-1-fast:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.3705 | -0.3232 | -0.3812 |  -0.3782 |     N/A | -0.3412 |        N/A
      
      First keyframe BD-RATE twitch-1:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.2479 | -0.2477 | -0.2467 |  -0.2567 | -0.2486 | -0.2508 |    -0.2487
      
      
      
      Change-Id: Iea6c895c6fe9e9764887a8968f6e5330903969d3
    • integrate parallel_deblocking with CB4x4 · 17905edf
      Ryan Lei authored
      This change makes the parallel deblocking experiment work with
      cb4x4. The inner loop processes every 4x4 block.
      
      Change-Id: I86adb3d7b6d67a91ccc12aab29da9bfb8c522cf1
    • [intra-edge] Use 5-tap filter · 3be70f72
      Joe Young authored
      For intra edge filtering experiment, replace the 2x iteration
      (5-6-5) filter with a 5-tap filter (2-4-4-4-2).
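
A minimal sketch of a 5-tap (2-4-4-4-2) smoothing pass over an edge array. The taps sum to 16, so the result is normalised with a rounding shift by 4. Clamping indices at the array ends and the name `filter_edge_5tap` are assumptions made for this sketch, not details taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Apply a 2-4-4-4-2 kernel to in[0..n-1] and write the result to out.
 * in and out must not overlap. Out-of-range neighbours are replaced by
 * the nearest valid sample (an assumption of this sketch). */
static void filter_edge_5tap(const uint8_t *in, uint8_t *out, int n) {
  static const int kernel[5] = { 2, 4, 4, 4, 2 };
  assert(n > 0);
  for (int i = 0; i < n; ++i) {
    int sum = 0;
    for (int k = -2; k <= 2; ++k) {
      int j = i + k;
      if (j < 0) j = 0;
      if (j > n - 1) j = n - 1;
      sum += kernel[k + 2] * in[j];
    }
    out[i] = (uint8_t)((sum + 8) >> 4); /* divide by 16 with rounding */
  }
}
```

A single impulse of 16 in the middle of five samples comes out as 2, 4, 4, 4, 2, which makes the kernel shape visible directly.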
      
      BDrate (1 key-frame) for this change:
      cif:    +0.02%
      midres: +0.04%
      720p:   -0.01%
      1080p:  -0.03%
      4k:     -0.01%
      
      BDrate (1 key-frame) for intra-edge experiment:
      (05/31, disable rect-tx, ext-tx, delta-q, ext-delta-q)
      
                1 key-frame     60 frames
               PSNR   SSIM     PSNR  SSIM
      cif:    -0.02   -0.01   -0.03  -0.01
      midres: -0.02   -0.02   -0.05  -0.10
      720p:   -0.36   -0.39   -0.05  -0.06
      1080p:  -0.75   -0.88   -0.22  -0.27
      4k:     -0.91   -1.12   -0.45  -0.54
      
      Change-Id: I834037e662b4483d4d6bdceb1c1624d56ba293a4
  9. 01 Jun, 2017 8 commits
    • Fix daala-dist for cb4x4 · 63927c43
      Yushin Cho authored
      The place where av1_daala_dist() is applied for sub8x8 partition is
      moved from sub8x8 mode decision functions to rd_pick_partition().
      
      BD-Rate change by daala-dist with '--disable-var-tx' is:
      (AWCY, objective-1-fast, high delay mode)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      15.1558 | 12.9585 | 14.4662 |  -3.8651 | -1.7102 | -9.2956 |    10.8686
      
      In MSE probe mode:
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0429 |  0.0435 |  0.1651 |  -0.0415 | 0.0850 |  0.0122 |     0.0546
      
      Change-Id: I3b2ea916d41c48e433eb641adf44552e4725c198
    • Add gen_txb_cache() and its related functions · 2affb3b0
      Angie Chiang authored
      This function pre-generates the counts/magnitudes of each level map
      so that we don't have to recalculate them
      while doing the optimization.
      
      Change-Id: Ifdfc89522cf2f2b9f3734d451324081f42b47cb0
    • Add get_coeff_cost() and get_txb_cost() · 488f921c
      Angie Chiang authored
      Change-Id: I085f2bc706fde41afbee5ff48b56acc095f804c2
    • cb4x4: Move sub-4X4 TX sizes behind CONFIG_CHROMA_2X2. · fe67ed6a
      Timothy B. Terriberry authored
      cb4x4 itself should not require these sizes.
      
      This simplifies compatibility with other experiments, since we can
      first make them work with cb4x4 (which is now on by default), and
      then worry about chroma_2x2 (which is not) in separate steps.
      
      Encoder and decoder output should remain unchanged.
      
      Change-Id: I4e9fcdae49f238b5099a3c74a398fe993c2545f8
    • Rework loop filter tx size selection · 6e4955d4
      Jingning Han authored
      Update and capture the effective transform block size per color
      plane.
      
      Change-Id: Ib6e0e7abb3973db6b8d511ee7c9948aaab048788
    • Make ext_inter/wedge/compound_segment/interintra on by default · f03907a2
      Yue Chen authored
      (1) Make unit tests for masked sad/variance encoder-only
      (2) Fix compile error with intrabc
      (3) Fix warnings reported by static analysis
      
      Change-Id: I0cd2176fcda0b81e1fc30283767678376ced4c42
    • Fix integer overflow in warp filter · 17c37ceb
      David Barker authored
      Patch https://aomedia-review.googlesource.com/c/12602/ made the
      variable 'sum' in the warp filter unsigned, to indicate that its
      value should always be >= 0. But 'sum' is used to accumulate
      signed values, and it is expected that some of those values
      will be negative.
      
      The issue is that, when running 'x += y', if x is a uint32_t
      and y is an int (and is 32 bits), the C standard says to
      convert y to a uint32_t before doing the addition. This causes
      overflow, and so undefined behaviour, if y < 0.
      
      This is fixed by making 'sum' signed, and by explicitly bounds
      checking against zero at the end of the filter.
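
A minimal sketch of the fix described above: accumulate in a signed variable and clamp at the end. `filter_and_clamp` and its parameters are hypothetical and much simpler than the real warp filter. For context, when a negative `int` is added to a `uint32_t`, C's usual arithmetic conversions turn the `int` into an unsigned value and the addition wraps modulo 2^32, which is what sanitizer checks for unsigned wraparound report.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the codec's filter: accumulate signed filter
 * products in a signed sum, then round and clamp to [0, max_value]. */
static int32_t filter_and_clamp(const int16_t *taps, const uint8_t *pixels,
                                int num_taps, int round_bits,
                                int32_t max_value) {
  int32_t sum = 0; /* signed: individual products may be negative */
  assert(round_bits > 0);
  for (int i = 0; i < num_taps; ++i) {
    sum += (int32_t)taps[i] * pixels[i];
  }
  /* Explicit bound check against zero. Doing it before the shift also
   * avoids right-shifting a negative value. */
  if (sum < 0) return 0;
  sum = (sum + (1 << (round_bits - 1))) >> round_bits;
  return sum > max_value ? max_value : sum;
}
```

With taps `{ -16, 144 }` and pixels `{ 255, 0 }` the raw sum is negative and the result is clamped to 0; with pixels `{ 0, 255 }` the rounded value exceeds 255 and is clamped to `max_value`.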
      
      BUG=aomedia:572
      
      Change-Id: I1d484b5f5698db0ec9761807610b3b2b35647983
    • get_min_tx_size: assert() doesn't need an 'if'. · affbe5e1
      Urvang Joshi authored
      Change-Id: Id2be191fb48ed8d65b452499e1a1a1f470359321
  10. 31 May, 2017 1 commit
    • Rework txfm_above and txfm_left context offset · 331662e9
      Jingning Han authored
      Process txfm_above and txfm_left in units of the
      minimum transform block size. Scale the transform block step
      size with respect to the mode_info step size.
      
      Change-Id: Iee4421e005db742cd4ff7899215560063e5f68e5
  11. 30 May, 2017 2 commits
    • Tidy up warp filter · facac4f5
      David Barker authored
      * Simplify the C version of the warp filter to make the intent
        of the code clearer
      * Replace saturate_uint() in the C warp filter with an assertion
        that the intermediate values are in-range. This is because they
        should (provably) *never* go out-of-range.
      * Add a comment describing the intended hardware architecture
      * Miscellaneous comment updates
      
      Change-Id: I798736f923ece599f22d573d31c5dfccd18b2d0e
    • Use 7-bit smooth and regular filters with DUAL_FILTER · f3b5e7f4
      Arild Fuldseth (arilfuld) authored
      Change-Id: If8f8e1a0032e914beb3ec3bcde221fe4a5605139
  12. 29 May, 2017 2 commits
  13. 27 May, 2017 1 commit
    • High precision Wiener filter rework · 11cf46f4
      Debargha Mukherjee authored
      Implements the high precision Wiener filter with an offset
      to reduce the error due to saturation without increasing
      the number of bits needed for intermediate precision.
      
      Also turns the high precision filter on.
      
      Change-Id: I34037a5746a6a89c5fce67753c1b027749085edf