1. 24 Jun, 2017 3 commits
    • {decodeframe,rdopt}.c: fix asserts with strings · 88896734
      James Zern authored
      lead with '0 &&' to avoid string to bool conversion warnings
      
      BUG=aomedia:621
      
      Change-Id: I2cd6618377f9ed94f4d9dbc252f6f5cfc81efea4
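The `0 &&` idiom mentioned in this commit can be sketched as follows. This is a minimal illustration, not the code from decodeframe.c or rdopt.c: the `TxKind` enum and `tx_kind_id()` are hypothetical names, and the original form of the changed asserts is not shown in the log.

```c
#include <assert.h>

/* Hypothetical enum, used only to give the assert a place to live. */
typedef enum { TX_KIND_DCT, TX_KIND_ADST } TxKind;

/* Returns a small id for each known kind; any other value trips the assert. */
static int tx_kind_id(TxKind kind) {
  switch (kind) {
    case TX_KIND_DCT: return 0;
    case TX_KIND_ADST: return 1;
    default:
      /* A string literal converts to a non-null pointer, so it is always
       * "true" on its own. Leading with 0 makes the whole condition false,
       * while the message still shows up in the assertion failure text. */
      assert(0 && "Invalid transform kind");
      return -1;
  }
}
```

With `NDEBUG` defined the assert compiles away and the function simply returns -1 for unknown values.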
    • Pass mbmi into get_scan() · bd99b38c
      Angie Chiang authored
      This is to facilitate future experiment related to adapt_scan
      
      Change-Id: I51628f3df81bd82db7f8f553d13da0ee5792d7d9
    • Fix compile warning · a3d70911
      Yushin Cho authored
      Fixed the compile warning when both global-motion
      and warped-motion are disabled.
      
      Change-Id: Ie3ac036fc6c0a15e54a56427452682d7ea7864db
  2. 22 Jun, 2017 2 commits
    • Fix compiler warning in joint_motion_search · cb637674
      Jingning Han authored
      Avoid compiler warning when global-motion is off.
      
      Change-Id: Ie6a0d3e4efc0e06b263e8c8c0c0dc153738c3804
    • Fix daala-dist, rd tx search · 04eb9594
      Yushin Cho authored
      Previously, for block >=8x8, and tx < 8x8,
      we skipped setting the early-exit flag in block_rd_txfm() because
      distortion for sub8x8 tx block is from MSE but reference (best)
      is from daala-dist.
      However, not setting early-exit flag turned out to be the reason
      for a regression in MSE probe mode of daala-dist because
      it loses the chance to set rd_stats properly.
      
      On the other hand, there is still a small regression, say 0.05% psnr bd-rate,
      which seems to occur in the case that a tx block in a partition has chosen
      the skipped rd_cost since it is smaller than non-skip rd_cost and
      set the early-exit flag to 0 (so, not exit), but the daala-dist applied
      to the whole partition cannot access the same info but can choose from
      two kinds of rd_costs:
      1) all tx blocks are skipped (even if a tx block has non-zero coeff) and 0 bits
      2) sum of final distortion of all tx blocks (i.e. non-zero coeff decoded)
      and bits to encode coeffs.
      
      Change-Id: I2ec69972aa1f22d465293cb9e8d5e18ef2c6f7f3
  3. 21 Jun, 2017 1 commit
    • cb4x4: Move sub-4X4 block sizes behind chroma flags. · 81ec2619
      Timothy B. Terriberry authored
      cb4x4 itself should not require these sizes.
      
      This simplifies compatibility with other experiments, since we can
      first make them work with cb4x4 (which is now on by default), and
      then worry about chroma_sub8x8 and chroma_2x2 (which are not) in
      separate steps.
      
      Encoder and decoder output should remain unchanged.
      
      Change-Id: Iff2a5494cab3b7d96f881e8bd9cd4bf18c817cfa
  4. 20 Jun, 2017 2 commits
    • Declare rate_mv_bmc in warped motion · 562a3937
      Yunqing Wang authored
      An MV refining search was added to warped motion, which requires
      rate_mv_bmc to be declared in warped motion.
      
      BUG=aomedia:613
      
      Change-Id: I74dfc396f915a5cc4599bfbdccad758fa630505f
    • [CFL] RDO Loop Rework · 14fc5045
      Luc Trudeau authored
      CfL performs an extra loop iteration during luma mode selection. Recent
      changes have broken the extra iteration. Remove previous approach.
      
      New approach adds the extra iteration right before uv parameter
      selection. Interestingly, if the best luma intra mode already has
      worse RD performance than the best inter mode found so far (if any),
      then the entire chroma intra search is skipped, including the extra 
      iteration.
      
      Results on Subset1 (compared to 3e18e4ae with CfL)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.3090 | -2.7271 | -2.3521 |  -0.3369 | -0.3463 | -0.3525 |    -1.1868
      
      Change-Id: If67b0badd2c8ea25c61685483d39d622c1729b18
  5. 19 Jun, 2017 4 commits
    • [intra-edge] Convert 4x4 VP9 to ext-intra; upsample edge samples · 830d4ce4
      Joe Young authored
      Updates to intra-edge experiment
      
      - Convert VP9-style intra pred to Ext-intra style
      - Upsample edge predictors by 2x based on angle and edge size
      
      BD-rate, 1-kf AWCY
        360p:  -0.11%
        720p:  -0.54%
        1080p: -0.96%
      
      Change-Id: Ib73805d31d5d286e607a7ee7470fcbdf11edbbff
    • encoder: Remove 64x upsampled reference buffers · 5d24b6f0
      Timothy B. Terriberry authored
      They do not handle border extension correctly (interpolation and
      border extension do not commute unless you upsample into the
      border), nor do they handle crop dimensions that are not a multiple
      of 8 (the upsampled version is not sufficiently large), in addition
      to using massive amounts of memory and being a criminal waste of
      cache (1 byte used for every 8 bytes fetched).
      
      This commit reimplements use_upsampled_references by computing the
      subpixel samples on the fly. This implementation not only corrects
      the border handling, but is also faster, while maintaining the
      same quality.
      
      HL AWCY results are basically noise:
          PSNR | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
        0.0188 |   0.0187 | 0.0045 |  0.0063 |     0.0228
      
      Change-Id: I7527db9f83b87a7bb8b35342f7e6457cd0bef9cd
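The trade-off described in this commit can be sketched as below. It assumes "64x" means 8x upsampling in each dimension (consistent with "1 byte used for every 8 bytes fetched") and swaps in a simple bilinear filter for brevity; the encoder's actual interpolation filters are not shown in the log, and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Samples needed by a reference plane pre-upsampled by 8 in each dimension:
 * 64 times the size of the original plane. */
static size_t upsampled_samples(size_t width, size_t height) {
  return (width * 8) * (height * 8);
}

/* Bilinear sample at position (x + fx/8, y + fy/8), computed on demand from
 * the original-resolution plane. fx and fy are eighth-pel fractions in
 * [0, 7]. The caller must ensure (x + 1, y + 1) is inside the buffer. */
static uint8_t subpel_sample(const uint8_t *ref, int stride, int x, int y,
                             int fx, int fy) {
  const uint8_t *p = ref + y * stride + x;
  const int top = p[0] * (8 - fx) + p[1] * fx;                /* scaled by 8 */
  const int bot = p[stride] * (8 - fx) + p[stride + 1] * fx;  /* scaled by 8 */
  /* Combined weight is 64, so round and shift by 6. */
  return (uint8_t)((top * (8 - fy) + bot * fy + 32) >> 6);
}
```

Computing samples this way reads only the original-resolution pixels around the requested position, so no 64x buffer has to be allocated or kept in cache.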
    • Add new coding tool of ext-comp-refs · c082bbcb
      Zoe Liu authored
      The tool of ext-comp-refs adds the uni-directional compound reference
      prediction. In detail, 3 pairs of uni-directional compound references
      are added for the comp ref prediction:
      (LAST_FRAME, LAST2_FRAME),
      (LAST_FRAME, GOLDEN_FRAME), and
      (BWDREF_FRAME, ALTREF_FRAME).
      
      This new tool of ext-comp-refs will eventually supersede
      one-sided-compound, merging the two coding tools into one.
      
      It achieves -0.35% ~ -0.55% coding gains in BDRate, compared against
      AV1 baseline with the default experiments on, but without
      one-sided-compound. It achieves -0.2% ~ -0.3% coding gains when
      one-sided-compound is on. It achieves larger gains on higher
      resolution.
      
      Change-Id: Icbdb16e97b96aaebaf2213f5f72d5331e2e358eb
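The three uni-directional pairs listed in this commit can be written as a small lookup table. This is only a sketch: the frame names come from the commit message, but the enum values, the `RefPair` type, and `is_uni_comp_pair()` are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reference frame labels from the commit message; the numeric values are
 * illustrative assumptions. */
typedef enum {
  LAST_FRAME = 1,
  LAST2_FRAME,
  GOLDEN_FRAME,
  BWDREF_FRAME,
  ALTREF_FRAME
} RefFrame;

typedef struct {
  RefFrame ref0;
  RefFrame ref1;
} RefPair;

/* The three uni-directional compound pairs named in the commit. */
static const RefPair kUniCompPairs[] = {
  { LAST_FRAME, LAST2_FRAME },
  { LAST_FRAME, GOLDEN_FRAME },
  { BWDREF_FRAME, ALTREF_FRAME },
};

/* Returns true if (a, b), in that order, is one of the listed pairs. */
static bool is_uni_comp_pair(RefFrame a, RefFrame b) {
  const size_t n = sizeof(kUniCompPairs) / sizeof(kUniCompPairs[0]);
  for (size_t i = 0; i < n; ++i) {
    if (kUniCompPairs[i].ref0 == a && kUniCompPairs[i].ref1 == b) return true;
  }
  return false;
}
```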
    • Add macro to disable trellis optimization in rdopt · 345366ac
      Sarah Parker authored
      Turning off the trellis optimization gives a performance
      drop of 0.726% on the lowres set.
      
      Change-Id: I4fdd1e20fb6f671162cd32b3abe699cd2aee1919
  6. 17 Jun, 2017 1 commit
    • var_tx: Remove custom distortion calculations. · d62e2a3a
      Timothy B. Terriberry authored
      Although this does not fully convert var-tx to using
      av1_block_dist(), it does make it use the same distortion functions
      av1_block_dist() uses: pixel_sse() and sum_squares_visible().
      
      Change-Id: I1173bc6941a3b895381b9fcb73b533b5afc31aab
  7. 16 Jun, 2017 1 commit
  8. 15 Jun, 2017 2 commits
    • Remove 'rddiv' member from various structs. · 70006e46
      Urvang Joshi authored
      This was initialized from a const and never modified, but was still
      passed around and stored in multiple structs.
      
      Removed these 'rddiv' member variables and now RDOPT() and RDOPT_DBL()
      always use the const RDDIV_BITS directly.
      
      Change-Id: I1a8dfd2c8fa857d466ad1207b4f0dd6ec07eafb8
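A before/after sketch of this kind of cleanup is shown below. The cost formula is deliberately simplified, the value of `RDDIV_BITS` is assumed for illustration, and `OldRdParams`, `rd_cost_old()` and `rd_cost_new()` are hypothetical names rather than the macros touched by the commit.

```c
#include <stdint.h>

#define RDDIV_BITS 7 /* value assumed for illustration */

/* Before: the shift travelled in a struct field that always held the same
 * constant. Hypothetical struct, for illustration only. */
typedef struct {
  int rdmult;
  int rddiv; /* always initialized to RDDIV_BITS, never modified */
} OldRdParams;

static int64_t rd_cost_old(const OldRdParams *p, int rate, int64_t dist) {
  return (int64_t)rate * p->rdmult + (dist << p->rddiv);
}

/* After: the constant is used directly, so the field and the plumbing that
 * carried it can be removed. */
static int64_t rd_cost_new(int rdmult, int rate, int64_t dist) {
  return (int64_t)rate * rdmult + (dist << RDDIV_BITS);
}
```

Because the field never differed from the constant, both versions compute the same value for any input.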
    • Add the new coding tool "speed_refs" · d1ac0321
      Zoe Liu authored
      This patch will not cause any performance change regardless of whether
      speed_refs is on or off.
      
      This coding tool is targeted to speed up the encoder side reference
      frame selection process. The essential idea is to have two scanning
      passes for each superblock of size 64x64 and this CL lays out the
      initial framework but no reference frame selection is done yet:
      
      First scanning pass - To simplify the partition and the mode
      candidates (e.g. considering nearestmv / nearmv / zeromv only) and
      identify the best reference frame prediction candidates;
      
      Second scanning pass - Use the best reference frame candidate(s)
      obtained from the first pass to encode the current superblock.
      
      Change-Id: I11266d468de3077271a5e866eebd341a8014d136
  9. 14 Jun, 2017 2 commits
  10. 13 Jun, 2017 3 commits
    • Fix a bug in daala-dist · 09b01a24
      Yushin Cho authored
      Fix a bug where the height of a partition was mistakenly used as a stride.
      This fixes the regression caused by sub8x8 tx size rd search
      for a partition >= 8x8.
      
      Change-Id: I6114814dcec70fd5198f681c0a861bc9849286fd
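To illustrate why passing a block height where a stride is expected corrupts a distortion measurement, here is a generic sum-of-squared-errors routine. It is a sketch with a hypothetical name (`block_sse`), not the daala-dist code.

```c
#include <stdint.h>

/* Sum of squared differences over a w x h block. Each buffer has its own
 * stride: the distance, in samples, between the starts of consecutive rows. */
static int64_t block_sse(const uint8_t *src, int src_stride,
                         const uint8_t *dst, int dst_stride, int w, int h) {
  int64_t sse = 0;
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      const int d = src[r * src_stride + c] - dst[r * dst_stride + c];
      sse += d * d;
    }
  }
  return sse;
}
```

For a 4x2 block stored with stride 4, calling the function with stride 2 (the height) makes the second row start halfway through the first, so pixels are read from the wrong positions and the result is silently wrong whenever height and stride differ.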
    • Add encoder/decoder pipeline to support single ref comp modes · 85b66463
      Zoe Liu authored
      Now the single ref comp mode should work with WEDGE and
      COMPOUND_SEGMENT. For motion_var, the OBMC_CAUSAL mode uses the 2nd
      predictor if the neighboring block is single ref comp mode predicted.
      
      This patch removes the mode of SR_NEAREST_NEWMV and leaves four
      single ref comp modes in total:
      
      SR_NEAREST_NEARMV
      SR_NEAR_NEWMV
      SR_ZERO_NEWMV
      SR_NEW_NEWMV
      
      Change-Id: If6140455771f0f1a3b947766eccf82f23cc6b67a
    • Another fix of daala-dist for cb4x4 · c0f6bf25
      Yushin Cho authored
      Daala-dist replaces the luma distortion of sub8x8 partitions with
      its own distortion, and thus requires the luma distortion to be split out.
      In doing so, a bug appeared in which an INT_MAX64 value shows up
      when the sub8x8 partition is skipped. This happened because the existing
      code does not initialize rd_stats_y or tmp_rd_stats_y, i.e. the rd_stat
      struct for luma only, in several places.
      
      Change-Id: If229b53bb7a6cff0b8751138a32b1dcf02665624
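The class of bug described here, a stats struct that is read after an early return without ever being written, can be sketched as follows. The `RdStats` layout and the `luma_rd()` helper are hypothetical simplifications; the places the actual fix touched are not shown in the log.

```c
#include <stdint.h>

/* Hypothetical, simplified luma-only stats struct. */
typedef struct {
  int rate;
  int64_t dist;
} RdStats;

static void rd_stats_init(RdStats *s) {
  s->rate = 0;
  s->dist = 0;
}

/* Fills rd_y for the luma plane. Initializing first guarantees the caller
 * sees defined values on the skip path too, instead of whatever the struct
 * held before (for example a leftover "invalid" marker such as INT64_MAX). */
static void luma_rd(RdStats *rd_y, int skip, int rate, int64_t dist) {
  rd_stats_init(rd_y);
  if (skip) return; /* skipped block: zero rate and distortion */
  rd_y->rate = rate;
  rd_y->dist = dist;
}
```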
  11. 12 Jun, 2017 3 commits
    • Fix a compile warning with global-motion off · c9751c59
      Yushin Cho authored
      Change-Id: I8379e4055e9c2737f1ad310095d7a318e6e74b2f
    • supertx: code refactoring + resolve conflicts with baseline · 8e689e4b
      Yue Chen authored
      Refactoring: split prediction+extension for each plane, so we can
      handle luma/chroma supertx pred in different ways.
      Compatibility fix: fix conflicts with cb4x4 and chroma_sub8x8, now
      for chroma sub8x8 supertx, only the top-left (basic cb4x4) or the
      bottom-right (cb4x4 + chroma_sub8x8) predictor will be used
      without any blending within an 8x8 unit.
      
      Change-Id: I6cf7b12768a82d3c7e01811ada02de84af9bd8ac
    • Add encoder/decoder support for var-refs · 7b1ec7a9
      Zoe Liu authored
      Check the availability of the reference frames at the frame level at
      both encoder and decoder, and if a reference frame is not available
      for a specific video frame, remove the signaling of such reference
      frame info at the block level.
      
      This patch adds the consideration of the bit saving inside the RD
      optimization loop.
      
      Change-Id: I4c22f1b843b21c7d2b47e118c99c3ad615a3d4e4
  12. 10 Jun, 2017 1 commit
    • var_tx: Fix distortion calc. in av1_tx_block_rd_b · ab141115
      Timothy B. Terriberry authored
      This was hard-coding the assumption that the block size for the
      smallest TX size was also the smallest block size. This is no
      longer true since fe67ed6a landed.
      
      As a result, for TX blocks that overlapped the frame edge, it was
      only measuring distortion on the upper-left 2x2 part of each 4x4
      sub-block, causing the encoder to prefer larger transforms which
      cause such overlap and avoid transforms which do not, causing a
      regression.
      
      This patch uses the appropriate conversion table, which fixes the
      regression.
      
      BUG=aomedia:593
      
      Change-Id: Id253cf0f3a5252378e3f340b8350120639ff5c88
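Measuring distortion only over the part of a transform block that lies inside the frame can be sketched as below. The function name and its parameters (`cols_left`, `rows_left`) are hypothetical, and this is not the libaom implementation; it only shows the clamping that the wrong block-size table effectively got wrong.

```c
#include <stdint.h>

/* Sum of squares of a bw x bh residual block, restricted to the region that
 * is inside the frame. cols_left / rows_left give how many columns / rows
 * remain between the block's top-left corner and the frame edge. */
static int64_t sum_squares_visible_sketch(const int16_t *diff, int stride,
                                          int bw, int bh, int cols_left,
                                          int rows_left) {
  const int vis_w = bw < cols_left ? bw : cols_left;
  const int vis_h = bh < rows_left ? bh : rows_left;
  int64_t sum = 0;
  for (int r = 0; r < vis_h; ++r) {
    for (int c = 0; c < vis_w; ++c) {
      const int v = diff[r * stride + c];
      sum += v * v;
    }
  }
  return sum;
}
```

If the block dimensions passed in are too small (2x2 instead of 4x4, as in the regression described above), the measured distortion is understated for every block that touches the frame edge, which biases the transform-size decision.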
  13. 09 Jun, 2017 1 commit
    • Add 'do_average' to ConvolveParams structure · e64d51a9
      David Barker authored
      The 'ref' member of ConvolveParams currently serves two purposes:
      * To indicate which component of a compound we're currently predicting,
        eg. for fetching interpolation filters with dual-filter enabled.
      * To determine whether we should average into the destination buffer.
      
      But there are two cases where we want to separate these out:
      * In joint_motion_search, we want to try combining a fixed second
        prediction with various first predictions.
      * When searching masked interinter compounds, we want to predict
        each component separately then try different combinations.
      
      In these cases, we set 'ref' to 0 and use temporary variables to
      make sure we use the correct interpolation filters. But this is
      quite fragile.
      
      This patch separates out the two uses into separate members.
      This allows us to remove some temporary variables, but more
      importantly gives easy fixes to two bugs in
      build_inter_predictors_single_buf (used by rdopt):
      
      * We previously set ref=0 but didn't fix up the interpolation filters
      * For ZERO_ZEROMV modes, the second component would accidentally
        average into the (uninitialized!) second prediction buffer
      
      BUG=aomedia:577
      BUG=aomedia:584
      BUG=aomedia:595
      
      Change-Id: Ibc31d1ac701a029ea5efaa1197dd402bc4b7af1e
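The separation of the two roles can be sketched with a reduced struct. `ConvolveParamsSketch` and `store_prediction()` are hypothetical stand-ins: the real ConvolveParams has more members, and the real convolve functions do far more than this store step.

```c
#include <stdint.h>

/* Reduced stand-in for ConvolveParams with the two roles split apart. */
typedef struct {
  int ref;        /* which component of the compound is being predicted */
  int do_average; /* nonzero: average into dst; zero: overwrite dst */
} ConvolveParamsSketch;

/* Writes n predicted samples to dst, either overwriting or averaging with
 * what is already there. Only do_average controls this; ref stays available
 * for choosing per-component settings such as interpolation filters. */
static void store_prediction(const uint8_t *pred, uint8_t *dst, int n,
                             const ConvolveParamsSketch *cp) {
  for (int i = 0; i < n; ++i) {
    dst[i] = cp->do_average ? (uint8_t)((dst[i] + pred[i] + 1) >> 1) : pred[i];
  }
}
```

With the fields separated, a caller can predict the second component (`ref = 1`) into a fresh buffer with `do_average = 0`, which avoids averaging into uninitialized memory.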
  14. 08 Jun, 2017 1 commit
    • Refactor sub8x8 tx size RD for daala-dist · 30a2c5f2
      Yushin Cho authored
      For a tx size RD search with partition size >= 8x8 and tx size < 8x8,
      daala-dist function is applied to the whole partition after all tx blocks are encoded
      instead of each 8x8 sub block of the partition.
      
      Change-Id: I27d9e2960aa641f550096e32ebcdf8dfb4de79a6
  15. 07 Jun, 2017 1 commit
    • Add HBD data path for av1_block_error_avx2 · d61e608d
      Yi Luo authored
      - Add unit test for av1_block_error.
      - Fix av1_dist_block logic for calling av1_block_error.
      
      Change-Id: Id8a47ee113417360a29fc2334d9ca72b5793e2d7
  16. 06 Jun, 2017 1 commit
    • intrabc: Fix mode and MV cost · d5d9b6ca
      Alex Converse authored
      objective-1-fast 1st KF: -0.07 BDRATE-PSNR
      twitch-1 1st KF: -0.04 BDRATE-PSNR
      
      Change-Id: I089900514c40f3b8b77708dac2c8bfbce2f540ff
  17. 05 Jun, 2017 2 commits
  18. 02 Jun, 2017 6 commits
    • Mark SMOOTH2 filter under USE_EXTRA_FILTER flag · aadbb025
      Angie Chiang authored
      Change-Id: Ia9a5d818e8c2ff9b4cc41c6d7950cfe005c20bfc
    • Pass above/left ctx plane_bsize to av1_optimize_b · 3511c37d
      Angie Chiang authored
      This is to facilitate lv_map experiment
      
      Change-Id: Ife779b172c4b81a9b2b4640464163300996e3969
    • intrabc: adapt use_intrabc prob · 7c412ea4
      Alex Converse authored
      First keyframe BD-RATE objective-1-fast:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.3705 | -0.3232 | -0.3812 |  -0.3782 |     N/A | -0.3412 |        N/A
      
      First keyframe BD-RATE twitch-1:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.2479 | -0.2477 | -0.2467 |  -0.2567 | -0.2486 | -0.2508 |    -0.2487
      
      
      
      Change-Id: Iea6c895c6fe9e9764887a8968f6e5330903969d3
    • Add MV refining search in warped motion experiment · 68f3ccd1
      Yunqing Wang authored
      Implemented a MV refining search after the warped motion parameters were
      found. Only 4 or 8 positions were checked so there was almost no impact
      on encoder speed.
      
      Borg test result:
                  avg_psnr     ovr_psnr    ssim
      cam_lowres: -0.543%      -0.574%     -0.670%
      lowres    : -0.222%      -0.230%     -0.285%
      
      Change-Id: Ic2f6c1fe548b089d50e9c33bb365e6b128aabc93
    • Deprecate special rd loop for sub8x8 block size · b2a01db8
      Jingning Han authored
      Remove the special rate-distortion optimization loop for sub8x8
      block sizes inherited from VP9.
      
      Change-Id: I62c6cf537a54769f26f2d4938ebed5fed2c84741
    • Resolve extremely large stack alloc in rdopt · d064cf03
      Jingning Han authored
      Move the large stack allocation from stack initialization to
      dedicated mem space. This resolves the extremely large stack issue
      when ext-partition, motion-var, and high bit-depth are all turned
      on.
      
      BUG=aomedia:415
      
      Change-Id: I85b77bbc6429093fcb0152176d9e237087d6bbd8
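Moving a large scratch buffer off the stack typically looks like the sketch below. The buffer size, the `SearchScratch` struct, and all function names are assumptions for illustration; the log does not show which buffers were moved or where they now live.

```c
#include <stdint.h>
#include <stdlib.h>

#define PRED_BUF_SAMPLES (2 * 128 * 128) /* size assumed for illustration */

/* Hypothetical scratch context that owns the buffer for its whole lifetime. */
typedef struct {
  uint16_t *pred_buf;
} SearchScratch;

/* Allocates the zero-filled buffer once; returns nonzero on success. */
static int scratch_alloc(SearchScratch *s) {
  s->pred_buf = (uint16_t *)calloc(PRED_BUF_SAMPLES, sizeof(*s->pred_buf));
  return s->pred_buf != NULL;
}

static void scratch_free(SearchScratch *s) {
  free(s->pred_buf);
  s->pred_buf = NULL;
}

/* Before (sketch): `uint16_t pred_buf[PRED_BUF_SAMPLES];` declared here as a
 * local, costing 64 KiB of stack per call. After: the function borrows the
 * preallocated buffer instead. */
static int search_block(SearchScratch *s, int seed) {
  s->pred_buf[0] = (uint16_t)seed;
  s->pred_buf[PRED_BUF_SAMPLES - 1] = (uint16_t)(seed + 1);
  return s->pred_buf[0] + s->pred_buf[PRED_BUF_SAMPLES - 1];
}
```

Allocating once and reusing the buffer keeps per-call stack usage small, which matters most when several such buffers would otherwise be live in nested search functions.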
  19. 01 Jun, 2017 3 commits
    • Fix daala-dist for cb4x4 · 63927c43
      Yushin Cho authored
      The place where av1_daala_dist() is applied for sub8x8 partition is
      moved from sub8x8 mode decision functions to rd_pick_partition().
      
      BD-Rate change by daala-dist with '--disable-var-tx' is:
      (AWCY, objective-1-fast, high delay mode)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      15.1558 | 12.9585 | 14.4662 |  -3.8651 | -1.7102 | -9.2956 |    10.8686
      
      In MSE probe mode:
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0429 |  0.0435 |  0.1651 |  -0.0415 | 0.0850 |  0.0122 |     0.0546
      
      Change-Id: I3b2ea916d41c48e433eb641adf44552e4725c198
    • cb4x4: Move sub-4X4 TX sizes behind CONFIG_CHROMA_2X2. · fe67ed6a
      Timothy B. Terriberry authored
      cb4x4 itself should not require these sizes.
      
      This simplifies compatibility with other experiments, since we can
      first make them work with cb4x4 (which is now on by default), and
      then worry about chroma_2x2 (which is not) in separate steps.
      
      Encoder and decoder output should remain unchanged.
      
      Change-Id: I4e9fcdae49f238b5099a3c74a398fe993c2545f8
    • Initialize chroma mode info before RD search · eaddeee1
      hui su authored
      Make sure initialization is done regardless of whether RD search
      is skipped (skip_chroma_rd).
      
      BUG=aomedia:568
      
      Change-Id: Idb620b34be6930bb35ab6c912dfd4777f7614159