1. 29 Jan, 2018 1 commit
  2. 28 Jan, 2018 3 commits
    • [CFL] Independent search termination for plane and sign · 2fae28b2
      David Michael Barr authored
      Stop if fewer than half of the iterations give an improvement.
      
      Minor metric changes for a 2.5x speed-up of the alpha search.
      
      Results on subset1:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0038 |  0.0466 |  0.1388 |  -0.0103 | -0.0312 | -0.0220 |     0.0330
      
      Change-Id: Ic25a995eee500ffc4b80b73635baf0a710954dc0
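
      The stopping rule described in this commit can be sketched as follows. This is a minimal illustration, not the libaom code: the function name `search_with_early_stop`, the `warmup` parameter, and the cost lists are all assumptions made for the example. "Independent" termination would mean running one such search per plane and sign, each with its own counters.

```python
def search_with_early_stop(costs, warmup=3):
    """Scan candidate costs in order and return (best_index, best_cost, tried).

    Hypothetical sketch: after `warmup` candidates, stop as soon as fewer
    than half of the candidates tried so far produced a new best cost.
    """
    best_cost = float("inf")
    best_index = -1
    improvements = 0
    tried = 0
    for index, cost in enumerate(costs):
        # Terminate when fewer than half of the iterations gave an improvement.
        if tried >= warmup and 2 * improvements < tried:
            break
        tried += 1
        if cost < best_cost:
            best_cost = cost
            best_index = index
            improvements += 1
    return best_index, best_cost, tried


# One independent search per (plane, sign) pair; the costs are made up.
candidate_costs = {
    ("Cb", "+"): [5, 6, 7, 8, 9, 1],
    ("Cb", "-"): [5, 4, 3, 2, 1],
}
results = {key: search_with_early_stop(c) for key, c in candidate_costs.items()}
```

      In the first list the search stops after three candidates and never reaches the late minimum, which is the kind of small quality loss traded for speed.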
    • [CFL] allow for 4:1 rects if full tx available · d27f1e61
      David Michael Barr authored
      Disable CFL sub8x8 validation in this case, as it appears to give
      false negatives for 4:1 blocks. All other tests pass.
      
      The coding gain on subset1 is quite significant.
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.1270 | -1.1386 | -1.1426 |  -0.1167 | -0.1157 | -0.1264 |    -0.4142
      
      Change-Id: Ic20c9b1a5ff28e0fbd4e6491ed2cd2d1f6b487c9
    • Avoid out of bound array access · 92245c87
      Yaowu Xu authored
      Change-Id: I4066561b769cf2bd4af515c9d351f609c08e3076
  3. 27 Jan, 2018 1 commit
  4. 26 Jan, 2018 17 commits
    • Add CDF_STORAGE_REDUCTION experiment flag. · 5f0c41de
      Thomas Daede authored
      Change-Id: I8ce208e842b738bb729d5732f0f35366c3549063
    • Fix memory overflow in av1_cdef_search() · 4d24a989
      Hui Su authored
      The logic that checks for the frame boundary needs to take the
      128x128 superblock size into account when ext-partition is on.
      
      BUG=aomedia:1268
      
      Change-Id: I40d2128d5ab46d57ecab9c9ecbef122005fe4b11
    • Speed up SSE4 implementation of 64-point inverse transform · c2368362
      Frank Bossen authored
      Avoid unnecessary computations knowing that only the lower
      frequency 32x32 quadrant has nonzero values.
      
      Runs about 2x faster.
      
      Change-Id: Ie86f56ccdce917e30b594253f10e121b4dcb0abc
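
      The idea behind this speed-up can be illustrated in one dimension. The sketch below is an assumption-laden illustration, not the SSE4 code: it uses an unnormalized floating-point inverse DCT rather than AV1's integer butterflies, and `idct_full` and `idct_low_half` are hypothetical names. When the upper 32 coefficients are known to be zero, summing over only the lower 32 gives the same output with half the multiply-adds.

```python
import math

N = 64  # transform length


def basis(n, k):
    """Cosine basis value for output sample n and frequency k."""
    return math.cos(math.pi * (2 * n + 1) * k / (2 * N))


def idct_full(coeffs):
    """Reference inverse DCT: visits all N coefficients per output sample."""
    return [sum(coeffs[k] * basis(n, k) for k in range(N)) for n in range(N)]


def idct_low_half(coeffs):
    """Same transform, but visits only the first N // 2 coefficients.

    Valid only when coeffs[N // 2:] are all zero.
    """
    half = N // 2
    return [sum(coeffs[k] * basis(n, k) for k in range(half)) for n in range(N)]


# Lower 32 coefficients are arbitrary test values; upper 32 are zero.
coeffs = [float(k + 1) for k in range(32)] + [0.0] * 32
full = idct_full(coeffs)
fast = idct_low_half(coeffs)
max_diff = max(abs(a - b) for a, b in zip(full, fast))
```

      Applying the same shortcut to both the row and the column pass corresponds to the 32x32 low-frequency quadrant mentioned above.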
    • SSE2 optimizations for _6/_16 lowbd lpf functions · ae6e6bc1
      Maxym Dmytrychenko authored
      Includes vertical and horizontal implementations,
      and fixes 5/13 TAPs/Parallel deblocking support.
      
      Reworks the internals of the filters for better
      reuse across different sizes.
      
      Tests are enabled.
      
      Performance changes, SSE2 over C:
      Horizontal methods: up to    3-4x
      Vertical   methods: up to 1.5x-2x
      
      Change-Id: I2e36035355d8c23c1d4b0d59d0e23f598e9d0e3f
    • Add get_txw/h_idx functions · 29d2f21e
      Angie Chiang authored
      Change-Id: Ibace8208109068aae1e93275d28ab8bd8e58c529
    • av1_rtcd_defs: fix formatting · f4123630
      Sebastien Alaiwan authored
      Change-Id: Ic4464eab6bedb18451f3506d1d58258f9fa64985
    • Remove DAALA_TX experiment · 5859636f
      Sebastien Alaiwan authored
      This experiment has been abandoned for AV1.
      
      Change-Id: Ief8ed6a51a5e7bac17838ebb7a88d88bbf90a96f
    • Properly reset the skip_mode element in mb_mode_info · 3da65bff
      Jingning Han authored
      The skip_mode element might reuse the prior frame's coding decision
      in the rate-distortion search of the current coding block. Reset it
      to zero for the regular rate-distortion mode search.
      
      This improves the coding performance for ext-skip by 0.07% for
      lowres.
      
      Change-Id: Idbda5b441e3eb844e03ca07bd174b4b7f8a7cb59
    • minor reorder of operations · 30bf8713
      Yaowu Xu authored
      This also fixes several UBSan warnings.
      
      Change-Id: I4ea5f744c42983ea44c7cd6925555eab4938097c
    • Fix loopfilter function usage · 31791278
      Yi Luo authored
      Here we should use the aom_lpf_horizontal_16 function instead of
      the aom_lpf_horizontal_16_dual function.
      
      aom_lpf_horizontal_16_dual, which works on two horizontal blocks,
      is also fixed.
      
      Change-Id: Icc991d3f98bb182fa30497f120021aeb17839d21
    • Adjust last odd row weight in fast_sgr · 127b562a
      Debargha Mukherjee authored
      Change-Id: I2348a7c6a3553bbbb0d061820a7c546a1a0367df
    • Fix compile warning with mono-video disabled · 6cd8e177
      David Barker authored
      The variable 'num_planes' is only used when mono-video is enabled,
      so move it inside a #if CONFIG_MONO_VIDEO block.
      
      Change-Id: I415f764b2629478edde579142b7242851991b1c0
    • [seg] No need to decide temporal_update · e8d8879e
      Yushin Cho authored
      If error resilient mode is enabled, temporal update of seg_id is not
      used, so there is no need to decide the seg->temporal_update flag by
      calling av1_choose_segmap_coding_method().
      
      Change-Id: Ifb2271be53f1a6bc64f1196af5e7fbe46741fab0
    • Skip txfm search · 3c22260b
      Cheng Chen authored
      Skip transform type search.
      
      Without txk_sel:
      Skip remaining transform type search when all transform blocks inside
      the coding block have eob = 0.
      
      With txk_sel:
      For each transform block, whenever eob = 0, we skip remaining
      transform type search.
      
      Speed impact:
      On low bitrate, 25% speed up.
      On high bitrate, 15-20% speed up.
      
      Performance impact: Google test lowres, 30 frames
      With txk_sel: 0.15% drop
      Without txk_sel: 0.30% drop
      
      Change-Id: I5e8db730a19feec22e378611046b1ce1ab001c85
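
      The per-block rule used with txk_sel can be sketched as a loop with an early exit. This is a hedged illustration only: `search_tx_type`, the `evaluate` callback, and the cost and eob numbers are assumptions for the example, and whether the real encoder updates the best candidate before or after the eob check is not stated in the commit message.

```python
def search_tx_type(tx_types, evaluate):
    """Return (best_type, best_cost, evaluated_count).

    `evaluate(tx_type)` is a hypothetical callback returning (rd_cost, eob),
    where eob == 0 means the block quantized to all-zero coefficients.
    """
    best_type = None
    best_cost = float("inf")
    evaluated = 0
    for tx_type in tx_types:
        rd_cost, eob = evaluate(tx_type)
        evaluated += 1
        if rd_cost < best_cost:
            best_type, best_cost = tx_type, rd_cost
        # Once a candidate produces no nonzero coefficients,
        # skip the remaining transform types.
        if eob == 0:
            break
    return best_type, best_cost, evaluated


# Made-up (rd_cost, eob) results keyed by transform type name.
fake_results = {
    "DCT_DCT": (100, 4),
    "ADST_DCT": (90, 0),
    "DCT_ADST": (80, 3),
}
outcome = search_tx_type(list(fake_results), fake_results.__getitem__)
```

      Here the third candidate is never evaluated even though it would have been cheaper, which is consistent with the small coding loss reported above in exchange for the speed-up.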
    • localize initialization of zero and max · 14b7967b
      Yaowu Xu authored
      This commit moves the initialization of two constants to a smaller
      scope, reducing the number of aligned parameters passed into
      cfl_predict_hbd. This fixes the compile issues with vs2015.
      
      BUG=aomedia:1275
      
      Change-Id: Idd19e945ac6312654b7b0184fcbf65ca398c46ce
    • Speed up av1_find_mv_refs() · b41ffb95
      Yunqing Wang authored
      av1_update_mv_context() is only used to provide compound_mode_context,
      which is the same as mode_context in find_mv_refs_idx(). This patch
      removes the call to av1_update_mv_context(), which takes 0.5% of the
      decoder time. This doesn't change the bitstream.
      
      Change-Id: I6f0e082b237ff42c3b3e72361c46f98249ba07ab
    • Remove const from parameter passed-by-value · 838ea62c
      Yaowu Xu authored
      This makes the usage of const consistent.
      
      Change-Id: I0ebf59842d8df234d0f4a91636b4bc2d6e9a6c81
  5. 25 Jan, 2018 18 commits
    • Search the same set of neighbouring positions · 28f3fbf7
      Yunqing Wang authored
      This patch prepares for the removal of av1_update_mv_context(). In
      av1_update_mv_context() and av1_find_mv_refs(), the neighbouring
      positions searched are not exactly the same. This patch fixes that.
      This causes bitstream changes, but shouldn't affect the coding
      quality.
      
      Change-Id: I59d2f8c318df388f2d06634cd96802b773c8bb13
    • Add num_plane to av1_copy_tree_context() · 68377282
      Yaowu Xu authored
      Supports monochrome video and fixes a nightly test segfault.
      
      BUG=aomedia:1273
      
      Change-Id: I87dd3d5ca79e8f0ce51ee31738205ae5a53af072
    • Re-enable the tx type pruning speed feature · 4e71fd94
      Hui Su authored
      Change-Id: I93702d24bf7d711b6910e2e502f9f97c661bcf6c
    • [seg] Initialize temporal_update flag · b42e98de
      Yushin Cho authored
      seg->temporal_update was not initialized anywhere.
      
      Change-Id: I3ccc0e10e14a83859b683c026093b921ea6d5dbf
    • Add SSE4 implementation of 64-point transform · 5a06fe32
      Frank Bossen authored
      Can reduce decoder run time by 4 percent.
      
      Change-Id: Ibdd5bb3a18002789852f2e367b32533163a8c022
    • Use meaningful names in txk-sel rd control · 66965a20
      Jingning Han authored
      Change-Id: I83ca47c1469d8e383a815058c02c4826c6282873
    • Use safe soft quantization speed feature setup · 802eeaa8
      Jingning Han authored
      Change-Id: If8836621586ab5090affbb8d6d7b0be3a3e4cde8
    • [intra, bugfix] Prevent overflow in DC_PRED · b844ee1b
      David Barker authored
      Commit https://aomedia-review.googlesource.com/c/aom/+/40541 replaced
      a division in the DC intra predictor by an approximate
      multiply+shift sequence.
      
      Unfortunately, this approximation is able to produce out-of-range
      values. For example, consider 4x8 DC_PRED, with bit depth = 10.
      If all of the context pixels are 0x3FF (the max value), then we get:
      
      sum = 12 * 0x3FF
      expected_dc = (sum * 0xAB) >> 11 = 1024 = 0x400
      
      This means that we need to insert a clip_pixel(_highbd) operation
      at the end of the DC prediction, to bring this value back in range.
      
      BUG=aomedia:1272
      
      Change-Id: I9beb9ac8a4b39803865f7e23932402ecd1d6f672
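
      The arithmetic in this commit message can be checked with a few lines of Python. This is only a sketch of the calculation quoted above, not the libaom code: `approx_dc_4x8` is a hypothetical name, and the Python `clip_pixel_highbd` is a stand-in written for this example.

```python
BIT_DEPTH = 10
MAX_PIXEL = (1 << BIT_DEPTH) - 1  # 0x3FF, the largest 10-bit pixel value


def approx_dc_4x8(context_sum):
    """Approximate context_sum / 12 with the multiply+shift from the message."""
    return (context_sum * 0xAB) >> 11


def clip_pixel_highbd(value, bit_depth):
    """Illustrative stand-in: clamp value to the range [0, 2**bit_depth - 1]."""
    return max(0, min(value, (1 << bit_depth) - 1))


# A 4x8 block has 4 + 8 = 12 context pixels; assume all are at the maximum.
context_sum = 12 * MAX_PIXEL
exact_dc = context_sum // 12                        # true average
raw_dc = approx_dc_4x8(context_sum)                 # approximate average
clipped_dc = clip_pixel_highbd(raw_dc, BIT_DEPTH)   # brought back in range
```

      The exact average is 0x3FF, the approximation yields 0x400, which does not fit in 10 bits, and the clip restores 0x3FF.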
    • Remove mode_context calculation in find_mv_refs_idx() · 8152737f
      Yunqing Wang authored
      mode_context[ref_frame] is calculated in find_mv_refs_idx(), but is
      set to 0 in setup_ref_mv_list. Therefore, the calculation in
      find_mv_refs_idx() is not needed.
      
      Change-Id: I65ca06a2000278ad21c2eaa81eb12c48a7c1fcb8
    • Do MV scaling on the fly for memory and run time reduction · 7b6bb947
      Frank Bossen authored
      This change is not normative and produces the same results as before.
      TPL_MV_REF data structure is about 5x smaller.
      Observed overall decoder run time reduction is about 4%.
      No observed change in encoder run time.
      
      Change-Id: Id68a492bac3bf28f48b7ceeedf85cd29981238ee
    • Add obu_sizing experiment. · 41150ad4
      Tom Finegan authored
      Writes OBU size fields as unsigned LEB128 encoded integers padded to
      PRE_OBU_SIZE_BYTES (currently 4) bytes when enabled:
      
      $ cmake path/to/aom -DCONFIG_OBU=1 -DCONFIG_OBU_SIZING=1 && cmake --build .
      
      Requires CONFIG_OBU.
      
      BUG=aomedia:1125
      
      Change-Id: I4d184ef0c8587d24e9c8c3e63237ea5003386c6a
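
      A fixed-width, padded unsigned LEB128 field can be sketched as below. This is an illustration of the encoding in general, not the libaom writer: `encode_padded_uleb128` and `decode_uleb128` are hypothetical names, and the default width of 4 simply mirrors the PRE_OBU_SIZE_BYTES value quoted above.

```python
def encode_padded_uleb128(value, num_bytes=4):
    """Encode value as unsigned LEB128 occupying exactly num_bytes bytes.

    Each byte carries 7 payload bits, least significant group first. The
    continuation bit (0x80) is set on every byte except the last, so small
    values are padded with 0x80 bytes and a final 0x00.
    """
    if value < 0 or value >= 1 << (7 * num_bytes):
        raise ValueError("value does not fit in the padded field")
    out = bytearray()
    for i in range(num_bytes):
        byte = value & 0x7F
        value >>= 7
        if i != num_bytes - 1:
            byte |= 0x80
        out.append(byte)
    return bytes(out)


def decode_uleb128(data):
    """Decode an unsigned LEB128 integer; return (value, bytes_consumed)."""
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("unterminated LEB128 value")


encoded = encode_padded_uleb128(300)
decoded = decode_uleb128(encoded)
```

      A fixed width lets a writer reserve the size field first and fill it in after the payload is written, and an ordinary LEB128 decoder still reads the padded form correctly.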
    • Give skip_mode priority over segmentation · b3bb318d
      Frederic Barbier authored
      BUG=aomedia:1266
      
      Change-Id: I7612e379aa7c63da56e975e95cd7266cd1f8c68d
    • Clean up and rework rates in motion_mode_rd() · c5024215
      Yue Chen authored
      Remove all *bmc variables, which were used to record basic motion
      search results (no advanced masked compound) when obmc and warped
      motion modes were allowed to work with compound ref.
      Remove the switchable rate that is passed in to it, since in most
      motion modes, we need to recalculate the cost based on motion_mode
      and the refined mv. This change slightly improves the rd perf.
      
      Performance change: -0.024%
      
      Change-Id: I4afe0927e97cc7e7251022957f7665ed3032079c
    • Simplify txfm table · 0c7b8d84
      Angie Chiang authored
      Instead of listing all possible stage_range,
      we use set_fwd_txfm_non_scale_range() to generate 2d stage_range
      from 1d stage_range.
      
      This will reduce the complexity of txfm table significantly.
      
      This is a lossless change.
      The coding performance isn't changed.
      The txfm config is exactly the same as it was before.
      
      Change-Id: Ibd1d9e53772bb928faaeecc98d81cbc8f38b27ed
    • Refactor buf_offset in av1_inv_txfm2d.c · 0822557b
      Angie Chiang authored
      Change-Id: I73d1d15ab678242737432064d203c476057286ed
    • Simplify context identification for coding ref frames · fa8bad19
      Zoe Liu authored
      This patch simply aggregates the checking on the counts of certain
      reference frames in the neighboring above and left blocks. It does
      not incur any coding performance change.
      
      Change-Id: I59a962ba95e7ab16731ce97371ec5709a582a0ba
    • Move av1_search_txk_type() to rdopt.c · 4a5c6cf8
      Hui Su authored
      Change-Id: I4f9d014324b35e30f25cae5fa570620249640cf6
    • Reduce the size of av1_prob_cost[] · c1cd5194
      Hui Su authored
      Only half of it was necessary.
      
      Change-Id: I0b5fc9ae6a17f5d812e10ee903a12f23f1377d8e