1. 17 Oct, 2017 6 commits
    • new_multisymbol: use cdf-based costs of palette flags · dab2ca9d
      Yue Chen authored
      The modification applies only to palette_y_mode and
      palette_uv_mode. Changes to the other palette syntax are welcome.
      
      Change-Id: I7bf0a49c06a3986475076fe291e26f4b783b8ab9
    • [dist-8x8] Add more asserts · 1cd34621
      Yushin Cho authored
      Added more asserts for dist-8x8 running in PSNR mode,
      i.e. with the encoder option "--enable-dist-8x8=1"
      instead of --tune=[cdef-dist | daala-dist].
      
      The asserts check whether an 8x8 distortion measured on reconstructed 8x8 pixels
      is identical to the sum of distortions from sub8x8 partitions (or tx blocks in
      the var-tx case).
      
      Change-Id: I14f2b24e674a9cbbe60e663449fc4e7f46f0e481
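The kind of check described above can be sketched as follows. This is a minimal illustration that assumes plain SSE as the distortion measure and four 4x4 quadrants as the sub8x8 partitions; block_sse and checked_sse_8x8 are hypothetical names, not functions from the codebase.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of squared errors over a w x h block; both buffers share one stride. */
static int64_t block_sse(const uint8_t *src, const uint8_t *rec, int stride,
                         int w, int h) {
  int64_t sse = 0;
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      const int d = src[r * stride + c] - rec[r * stride + c];
      sse += d * d;
    }
  }
  return sse;
}

/* Returns the 8x8 SSE after asserting that it matches the sum of the SSEs
 * of the four 4x4 quadrants (hypothetical stand-in for sub8x8 partitions). */
static int64_t checked_sse_8x8(const uint8_t *src, const uint8_t *rec,
                               int stride) {
  const int64_t whole = block_sse(src, rec, stride, 8, 8);
  int64_t parts = 0;
  for (int r = 0; r < 8; r += 4) {
    for (int c = 0; c < 8; c += 4) {
      parts += block_sse(src + r * stride + c, rec + r * stride + c, stride,
                         4, 4);
    }
  }
  assert(whole == parts);
  return whole;
}
```

With an additive measure such as SSE the two totals always agree, so a failing assert would point at a bookkeeping error rather than at the metric itself.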
    • Improving the model for pruning the TX type search · 0c7eb10d
      Alexander Bokov authored
      Introduces two new TX type pruning modes that provide a better
      speed-quality trade-off than the existing ones. A shallow
      neural network with one hidden layer, trained separately for each
      block size, is used as the prediction model. The new modes differ in
      the thresholds applied to the output of the neural net, so that they
      prune a different number of TX types on average.
      
      Owing to its relatively low quality loss, PRUNE_2D_ACCURATE is used
      by default, regardless of speed settings. Starting with a speed
      setting of 3 we switch to PRUNE_2D_FAST mode to get a better
      speed-up.
      
      Evaluation results:
      ----------------------------------------------------------
      Prune mode | Avg. speed-up | Quality loss | Quality loss
                 |(high bitrates)|   (lowres)   |   (midres)
      ----------------------------------------------------------
      PRUNE_ONE  |     18.7%     |    0.396%    |    0.308%
      ----------------------------------------------------------
      PRUNE_TWO  |     27.2%     |    0.439%    |    0.389%
      ----------------------------------------------------------
      PRUNE_2D_  |     18.8%     |    0.032%    |    0.063%
      ACCURATE   |               |              |
      ----------------------------------------------------------
      PRUNE_2D_  |     33.3%     |    0.504%    |     ---
      FAST       |               |              |
      
      Change-Id: Ibd59f52eef493a499e529d824edad267daa65f9d
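The thresholding idea can be sketched as below. Everything here is an assumption made for illustration: the layer sizes, the ReLU activation, the TinyNet layout, and the rule that the best-scoring type is always kept are not taken from the patch.

```c
#include <stdint.h>

#define NUM_FEATURES 4 /* hypothetical input size */
#define NUM_HIDDEN 3   /* hypothetical hidden-layer width */
#define NUM_TX_TYPES 4 /* hypothetical number of candidate TX types */

/* Hypothetical weights for one block size. */
typedef struct {
  float w1[NUM_HIDDEN][NUM_FEATURES];
  float b1[NUM_HIDDEN];
  float w2[NUM_TX_TYPES][NUM_HIDDEN];
  float b2[NUM_TX_TYPES];
} TinyNet;

/* Runs one hidden layer (ReLU assumed) and returns a bit mask in which
 * bit k is set when TX type k survives pruning. A higher threshold prunes
 * more types; the best-scoring type is always kept. */
static uint32_t tx_types_to_search(const TinyNet *net, const float *features,
                                   float threshold) {
  float hidden[NUM_HIDDEN];
  for (int j = 0; j < NUM_HIDDEN; ++j) {
    float acc = net->b1[j];
    for (int i = 0; i < NUM_FEATURES; ++i) acc += net->w1[j][i] * features[i];
    hidden[j] = acc > 0.0f ? acc : 0.0f;
  }
  uint32_t keep = 0;
  int best = 0;
  float best_score = 0.0f;
  for (int k = 0; k < NUM_TX_TYPES; ++k) {
    float score = net->b2[k];
    for (int j = 0; j < NUM_HIDDEN; ++j) score += net->w2[k][j] * hidden[j];
    if (k == 0 || score > best_score) {
      best = k;
      best_score = score;
    }
    if (score >= threshold) keep |= 1u << k;
  }
  keep |= 1u << best; /* never prune every candidate */
  return keep;
}
```

In this sketch two pruning modes would share the same network and differ only in the threshold argument.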
    • Fix a compile bug with ext-partition-types · 0b34a79f
      Debargha Mukherjee authored
      Removes some stray CONFIG_CB4X4 config macros.
      
      BUG=aomedia:921
      
      Change-Id: Icc65e0b000f659d7fb18178c928a7bff7879f58c
    • Remove abandoned CHROMA_2X2 experiment · d8b93f56
      Sebastien Alaiwan authored
      Change-Id: I5bff0a68602a89ce480fec049c8b2c4bce44f6bb
    • intrabc: support var-tx · 12546aa2
      Hui Su authored
      Support recursive tx block partition.
      
      On the screen content testset, 0.2% gain for keyframe encoding.
      
      Change-Id: I623e6fbb910fef9c91617e02edf420019f67d189
  2. 16 Oct, 2017 5 commits
  3. 14 Oct, 2017 1 commit
  4. 13 Oct, 2017 1 commit
  5. 12 Oct, 2017 3 commits
    • Find warped reference MV · 97d6a37e
      Yunqing Wang authored
      While finding the reference MV for a block, if one neighbouring block's
      motion mode is warped motion mode, instead of directly adding that
      block's MV to the candidate MV list, we use that neighbouring block's
      warped motion parameters to compute an MV for the center point of the
      current block, and then add that MV to the candidate MV list.
      
      Borg test result:
                   avg_psnr ovr_psnr ssim
      cam_lowres:  -0.507   -0.514  -0.685
      lowres:      -0.114   -0.122  -0.180
      
      The change is added under ext_warped_motion config flag.
      
      Change-Id: I3ce6290a1fd512b613eab5d7620c8bcb08f189a6
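As a rough illustration of the idea, the sketch below projects the center of the current block through a neighbour's warp model and takes the displacement as the candidate MV. It assumes a six-parameter affine model in floating point; the AffineModel and MvF types and the parameter ordering are hypothetical, and a real codec would use fixed-point arithmetic.

```c
/* Hypothetical affine model:
 *   x' = m[0] * x + m[1] * y + m[2]
 *   y' = m[3] * x + m[4] * y + m[5] */
typedef struct {
  double m[6];
} AffineModel;

/* Motion vector in pixels (row = vertical, col = horizontal). */
typedef struct {
  double row;
  double col;
} MvF;

/* Projects the center of the current block through the neighbour's warp
 * model and returns the displacement of that point as the candidate MV. */
static MvF warped_mv_at_center(const AffineModel *wm, int blk_x, int blk_y,
                               int blk_w, int blk_h) {
  const double cx = blk_x + blk_w / 2.0;
  const double cy = blk_y + blk_h / 2.0;
  const double px = wm->m[0] * cx + wm->m[1] * cy + wm->m[2];
  const double py = wm->m[3] * cx + wm->m[4] * cy + wm->m[5];
  MvF mv = { py - cy, px - cx };
  return mv;
}
```

For a pure translation the result equals the neighbour's own MV; the two differ only when the model includes scaling, rotation, or shear.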
    • filter_intra: make fi mode index entropy coded · 63ce36fc
      Yue Chen authored
      Make fi mode index entropy coded instead of using raw bits. Mode
      cost estimation in key-frame RDO is updated as well. Modification
      to inter frame RDO is not included in this patch.
      Also, the key-frame y mode cdf table is re-trained since fi modes are
      attached to DC_PRED.
      
      Key frame BDRate:
      -0.399% lowres, -0.339% midres
      
      Change-Id: I9ccf478b0a2e48fb1870fe8451e45e2c858a5f63
    • Make SEG_LVL_{SKIP,ZEROMV} blocks be single-ref-only · d92f3560
      David Barker authored
      This patch modifies the interpretation of SEG_LVL_SKIP and
      SEG_LVL_ZEROMV slightly, to fix a decoder crash and to save bits
      in the intended use cases of these segment flags.
      
      Previously, blocks using either of these segment flags could
      signal reference frames just like any other block. But the mode
      was implicitly taken to be ZEROMV. This worked fine in VP9, but
      crashed for compound blocks in AV1 since those should use
      ZERO_ZEROMV instead.
      
      Now we make it so that SEG_LVL_SKIP and SEG_LVL_ZEROMV imply
      that the block is single-reference. The reference to use is taken
      from the SEG_LVL_REF_FRAME segment feature if that is present,
      or is set to LAST_FRAME if not. See the attached bug report
      for the reasoning behind this.
      
      As a related change, we also teach the encoder how to deal with
      the combination of SEG_LVL_SKIP + SEG_LVL_REF_FRAME.
      
      BUG=aomedia:675
      
      Change-Id: I5e657cbfc1f08395a0301cba701edfb1682502a5
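The reference selection rule described above might be sketched like this. The SegFeatures struct, the forced_single_ref name, and the numeric frame values are hypothetical and only mirror the prose; they are not the codec's data structures.

```c
#include <stdbool.h>

/* Illustrative values only; the real enum lives in the codec headers. */
enum { LAST_FRAME = 1, GOLDEN_FRAME = 4 };

/* Hypothetical view of the per-segment features relevant here. */
typedef struct {
  bool skip;          /* SEG_LVL_SKIP is active */
  bool zeromv;        /* SEG_LVL_ZEROMV is active */
  bool has_ref_frame; /* SEG_LVL_REF_FRAME is active */
  int ref_frame;      /* value carried by SEG_LVL_REF_FRAME */
} SegFeatures;

/* Returns the single reference frame implied by the segment, or -1 when
 * the block signals its reference frames like any other block. */
static int forced_single_ref(const SegFeatures *seg) {
  if (!seg->skip && !seg->zeromv) return -1;
  return seg->has_ref_frame ? seg->ref_frame : LAST_FRAME;
}
```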
  6. 11 Oct, 2017 2 commits
  7. 10 Oct, 2017 5 commits
    • Add function to control palette usage · e87fb237
      Hui Su authored
      Add av1_allow_palette() to control whether palette mode should be enabled.
      
      Change-Id: Iee24636451be42eb36093dc3453bc39c7e686276
    • lgt-from-pred: transforms based on prediction · 432012f6
      Lester Lu authored
      In this experiment, sharp image discontinuity in the predicted
      block is detected. Based on this discontinuity, we choose
      particular LGTs as row and column transforms.
      
      Bitstream syntax, entropy coding, and RD search for LGT are added.
      One binary symbol is used to signal whether LGT is used. This
      experiment can work independently of the lgt experiment.
      
      lowres: -0.414% for key frames, -0.151% overall
      midres: -0.413% for key frames, -0.161% overall
      
      Change-Id: Iaa2f2c2839c34ca4134fa55e77870dc3f1fa879f
    • Use pixel domain skip error if possible in var-tx · 952eae29
      Yushin Cho authored
      When a block is skipped early in var-tx, distortion is set equal to sse.
      In that case, use the pixel domain sse (i.e. skip error), since it is more
      accurate than the sse from the transform domain.
      
      Change-Id: Id3cbc66ea6318108c031413646f3d06250e75e7e
    • Fix that sse is added twice during early skip in var-tx · 16efec40
      Yushin Cho authored
      The rd_stats->sse is already updated by
      "rd_stats->sse += tmp << 4;",
      which is measured by pixel_diff_dist(), i.e. in pixel domain and
      w/o quantization().
      
      Change-Id: I4dc20a7e80af9dd846aa5de4298cb56e7f0d8f7e
    • Don't trash memory in select_tx_type_yrd · de2ea94e
      Rupert Swarbrick authored
      This patch fixes a bug in select_tx_type_yrd. The function works by
      looping over possible transform types to find the best option (calling
      select_tx_size_fix_type for each). Whenever there's a new best
      candidate, the code copies information about the transform from the
      mbmi structure into stack-allocated "best candidate" structures. At
      the end, it copies the "best candidate" data back to mbmi.
      
      Before the patch, if ref_best_rd was small, each call to
      select_tx_size_fix_type might return INT64_MAX (because they don't
      find anything better than ref_best_rd) and so we'd never actually copy
      anything to the "best candidate" structures. Then, at the end of the
      function, we'd merrily overwrite mbmi with whatever happened to be on
      the stack, causing general mayhem when something tried to read the
      data from mbmi later.
      
      This patch exits early if no candidates were found. It also adds an
      assertion saying that if no candidates were found, ref_best_rd must
      have been less than INT64_MAX. This should hopefully catch any bugs
      where the continue keywords in the loop stop us ever actually calling
      select_tx_size_fix_type.
      
      Change-Id: I54b998148281dd80f98d1570f736964593dc753f
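The shape of the fix can be sketched with simplified stand-ins. TxChoice, try_tx_type, and pick_best_tx_type are hypothetical; they only mimic the control flow described above and are not the real mbmi structure or select_tx_size_fix_type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the transform fields stored in mbmi. */
typedef struct {
  int tx_type;
} TxChoice;

/* Hypothetical stand-in for select_tx_size_fix_type: returns the RD cost of
 * one transform type, or INT64_MAX if it cannot beat ref_best_rd. */
static int64_t try_tx_type(const int64_t *costs, int tx_type,
                           int64_t ref_best_rd, TxChoice *out) {
  if (costs[tx_type] >= ref_best_rd) return INT64_MAX;
  out->tx_type = tx_type;
  return costs[tx_type];
}

/* Returns 1 and updates *mbmi when a candidate was found. Returns 0 and
 * leaves *mbmi untouched otherwise, instead of copying back a "best"
 * structure that was never filled in. */
static int pick_best_tx_type(const int64_t *costs, int num_types,
                             int64_t ref_best_rd, TxChoice *mbmi) {
  TxChoice best = { 0 };
  int64_t best_rd = INT64_MAX;
  for (int t = 0; t < num_types; ++t) {
    TxChoice cur = { 0 };
    const int64_t rd = try_tx_type(costs, t, ref_best_rd, &cur);
    if (rd < best_rd) {
      best_rd = rd;
      best = cur;
    }
  }
  if (best_rd == INT64_MAX) {
    /* Finding nothing is only expected when the caller set a real bound. */
    assert(ref_best_rd < INT64_MAX);
    return 0;
  }
  *mbmi = best;
  return 1;
}
```

The early return is the important part: before the change, the final copy back to mbmi ran unconditionally.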
  8. 09 Oct, 2017 4 commits
    • Change rectangular vartx recursion depth to 2 · d25ef8c6
      Sarah Parker authored
      0.15% improvement on lowres set
      
      Change-Id: If16a8e07797c64508f9e2d9b26ae874ac53c57a4
    • Revert wrong uses of TX_SIZE enum. · ab8840eb
      Urvang Joshi authored
      Introduced by: https://aomedia-review.googlesource.com/c/aom/+/25181
      
      Change-Id: I1f25178d6b273fbeade4c33f153b5f2bac4a8b99
    • Match braces in VIM for rdopt.c · 1483a714
      Cheng Chen authored
      Change-Id: I23344af711d9a31b819fca35ae3ad3b7edf4852e
    • Define block_signals_txsize function · fcff0b25
      Rupert Swarbrick authored
      The function returns true if a block signals tx_size in the stream.
      The patch uses it in the bitstream writing code and the decoder.
      
      Note that we can't quite use it in pack_inter_mode_mvs when
      CONFIG_VAR_TX && !CONFIG_RECT_TX but I've switched the code to using
      it the rest of the time since rect-tx is adopted and eventually the
      other code path should be deleted.
      
      Also use the helper function in tx_size_cost in rdopt.c, where the
      test was wrong and caused underestimates of block
      costs. (Specifically, the code that subtracts tx_size_cost from
      this_rate_tokenonly in rd_pick_intra_sby_mode ended up subtracting
      zero for a 4x8 block).
      
      The behaviour of the decoder should be unchanged. The only change in
      the encoder's behaviour should be in tx_size_cost where it should now
      match the rest of the code.
      
      Change-Id: I97236c9ce444993afe01ac5c6f4a0bb9e5049217
  9. 08 Oct, 2017 1 commit
  10. 07 Oct, 2017 2 commits
    • [CFL] Support for 4:2:0 High Bit Depth · 056d1f40
      Luc Trudeau authored
      Adds high bit depth (_hbd) and low bit depth (_lbd) versions
      of the cfl functions: sum_above_row, sum_left_col,
      cfl_build_prediction, cfl_luma_subsampling_420 (4:4:4 will
      be added in a subsequent commit) and cfl_alpha_dist. For
      cfl_alpha_dist, special care is taken to scale the SSE
      according to the bit depth.
      
      BUG=aomedia:835
      
      Change-Id: I5b72845100d88fb8a438efe665bcae7fe1ba50b8
    • Remove the speed optimization for rd_stats_stack · 9245d89d
      Debargha Mukherjee authored
      This optimization for speed was useful only when the max tx-size
      was 32x32. However, with tx64x64 it was breaking certain assumptions,
      causing huge drops in coding efficiency. So I am removing this
      optimization for now. It can be brought back later as a speed feature.
      The removal of this optimization recovers the loss incurred when 32x64
      and 64x32 transforms are used.
      
      Change-Id: I15987ea9ff53fa36a2962fe5f156c30a11e809ed
  11. 06 Oct, 2017 5 commits
    • Rework key frame intra mode context model · a45d842d
      Jingning Han authored
      Reduce the context model size for key frame modes from 30240 bits
      to 4500 bits, i.e., less than 1/6 of the original context model.
      The coding performance loss on key frames is 0.14% for lowres, and
      the difference over whole video sequences is at noise level. The
      loss on key frames for midres is 0.05%, and at noise level over
      whole videos. The change in hdres key frame coding is 0.015%.
      
      Change-Id: I9e36825e5c5ee6ba35038c3ca349ad1ad3429910
    • Avoid large stack allocations · 5d108a36
      Debargha Mukherjee authored
      When ext-partition and ncobmc-adapt-weight are on, avoid overly large
      stack allocations.
      
      Change-Id: I8db74e45cac80c4e5dfd9e20cfc73d9978d1578e
    • Predict skip flag to speed up the TX type search · 8829a24d
      Alexander Bokov authored
      Average speed-up (lowres):
      low bitrates: 6.6%
      mid bitrates: 2.5%
      high bitrates: 0.0%
      
      Average PSNR loss:
      lowres: 0.010%
      midres: 0.005%
      
      Change-Id: Id34fb247e5e31f04ca324c58142e4b5ac4edacda
    • Simplify the ALL_ZERO_FLAG logic in av1_rd_pick_intra_mode_sb · 799ff701
      Rupert Swarbrick authored
      Since the CONFIG_EXT_INTER #if/#endif lines have been removed, it's a
      bit clearer what's going on here and this patch cleans up the code.
      
      Firstly, the patch pulls the cheap checks on best_mbmode.ref_frame out
      to the front of the block, so we needn't call gm_get_motion_vector at
      all for compound predictions.
      
      Next, the second element of the zeromv array is never used, so we needn't
      compute it.
      
      Finally, the patch removes the calls to lower_mv_precision. These
      shouldn't be needed, but it's not exactly obvious why not so the patch
      adds some comments to gm_get_motion_vector to explain what's going on
      and adds an assertion to make sure they are true. It also adds a call
      to integer_mv_precision on the early return path of
      gm_get_motion_vector, correcting an apparent bug when CONFIG_AMVR is
      true.
      
      This patch shouldn't make any difference to encoder or decoder
      behaviour.
      
      Change-Id: I0b4a01063574d080bbf6d30187f4e1748c60939d
    • Extend IntraBC to 4x4 · ca86546f
      RogerZhou authored
      Change-Id: I3f30c35bcd1bc623ad0c34c4b954ff71b2fcfd00
  12. 05 Oct, 2017 2 commits
  13. 04 Oct, 2017 3 commits
    • Fix rate costing for small blocks with skip flag · c6cc1f5e
      Rupert Swarbrick authored
      In av1_rd_pick_intra_mode_sb, the code calculates the rate for Y and
      UV planes separately. If the transform coefficient should be zero,
      rd_pick_intra_sby_mode and rd_pick_intra_sbuv_mode return the cost of
      actually coding up the zero coefficient, but also set a flag (y_skip
      or uv_skip) saying that this could be skipped.
      
      Since the skip flag isn't per-plane, av1_rd_pick_intra_mode_sb checks
      to see whether both y_skip and uv_skip were true. In that case, it
      costs the block for setting the skip flag rather than outputting zero
      transform coefficients.
      
      If a small block (less than 8x8) has no chroma information,
      x->skip_chroma_rd is true. In that case, we don't call
      rd_pick_intra_sbuv_mode and so uv_skip is never set. However, when we
      come to write the block, it will be written using the skip flag. This
      patch gets the costing right in that case.
      
      Change-Id: Ib31b80b4b44a5c8ed9d9b3f86d782c54927345f3
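The costing rule can be sketched as follows, under simplifying assumptions: PlaneCost and intra_rate_with_skip are hypothetical, rates are plain integers, and a missing chroma plane is represented by a NULL pointer rather than by x->skip_chroma_rd.

```c
#include <stddef.h>

/* Hypothetical per-plane result of the intra mode search. */
typedef struct {
  int rate;           /* mode bits plus coefficient bits */
  int rate_tokenonly; /* coefficient bits only */
  int skippable;      /* nonzero if every coefficient is zero */
} PlaneCost;

/* Total rate for an intra block. Pass uv == NULL when the block has no
 * chroma information; the missing plane then counts as skippable, so the
 * decision depends on luma alone. */
static int intra_rate_with_skip(const PlaneCost *y, const PlaneCost *uv,
                                int skip_cost_0, int skip_cost_1) {
  const int uv_skippable = (uv == NULL) || uv->skippable;
  int rate = y->rate + (uv ? uv->rate : 0);
  if (y->skippable && uv_skippable) {
    /* Pay for the skip flag instead of the zero coefficients. */
    rate -= y->rate_tokenonly + (uv ? uv->rate_tokenonly : 0);
    rate += skip_cost_1;
  } else {
    rate += skip_cost_0;
  }
  return rate;
}
```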
    • Fix rd scales for transforms larger than 32x32 · b02d2f39
      Debargha Mukherjee authored
      Change-Id: I1ddec0cf3513e2bd7568393e5ed5d52c25014ab4
    • Pack InterpFilters into a single integer · 27e90295
      Rupert Swarbrick authored
      Before this patch, if CONFIG_DUAL_FILTER was true then an MB_MODE_INFO
      stored its filter choices as an array of four numbers, each of which
      was between 0 and 10. It also seems that elements 2 and 3 of the array
      were always the same as elements 0 and 1 when used.
      
      This patch defines a new type(def) called InterpFilters together with
      constructor and extractor functions. When CONFIG_DUAL_FILTER is zero,
      InterpFilters is a synonym for InterpFilter and the constructor and
      extractor functions should compile away to nothing. When it is
      nonzero, InterpFilters is a uint32_t which stores the x filter in the
      high part and the y filter in the low part (this looks strange, but
      matches the old numbering).
      
      Making this change allows us to get rid of lots of special case code
      that was dependent on CONFIG_DUAL_FILTER. The uniform
      extract/make/broadcast interface also actually shortens code in
      general.
      
      Change-Id: I6b24a61bac3e4b220d8d46d0b27cfe865dcfba81
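A minimal sketch of the packing for the CONFIG_DUAL_FILTER case is shown below. The 16-bit split, the InterpFilter typedef, and the function names are assumptions made for illustration; the message only states that the x filter sits in the high part and the y filter in the low part.

```c
#include <stdint.h>

typedef uint8_t InterpFilter;   /* assumed: a small filter index */
typedef uint32_t InterpFilters; /* x filter in the high half, y in the low */

/* Constructor: packs a (y, x) filter pair into one integer. */
static InterpFilters make_interp_filters(InterpFilter y_filter,
                                         InterpFilter x_filter) {
  return (InterpFilters)y_filter | ((InterpFilters)x_filter << 16);
}

/* Extractor: returns the x filter when want_x is nonzero, else the y one. */
static InterpFilter extract_interp_filter(InterpFilters filters, int want_x) {
  return (InterpFilter)((filters >> (want_x ? 16 : 0)) & 0xffff);
}

/* Broadcast: uses the same filter in both directions. */
static InterpFilters broadcast_interp_filter(InterpFilter filter) {
  return make_interp_filters(filter, filter);
}
```

Because callers only go through these three helpers, the single-filter build can define them as trivial pass-throughs without touching call sites.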