1. 26 Jul, 2017 1 commit
    • Optimize transform block rate-distortion search · 3bce7547
      Jingning Han authored
      The soft coefficient optimization process monotonically
      increases the transform block distortion and decreases the
      coefficient rate cost. This observation provides a lower bound
      on the rate-distortion cost for the given transform block. This
      commit compares that lower bound against the best available
      rate-distortion cost and skips the optimization process when it
      cannot improve on it. It speeds up the baseline encoding process by 15%.
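
      One way to read the skip test, as a minimal sketch: if optimization
      never lowers distortion, the pre-optimization distortion at zero rate
      bounds the achievable cost from below. The names rd_cost and
      can_skip_optimization and the lambda scaling are illustrative
      assumptions, not the encoder's actual code.

```c
#include <stdint.h>

/* Hypothetical RD cost: J = D + (lambda * R) / 256. Illustrative scaling only. */
int64_t rd_cost(int64_t dist, int64_t rate, int64_t lambda) {
  return dist + (lambda * rate) / 256;
}

/* Returns 1 when coefficient optimization cannot beat best_rd.
 * Assumption: optimization never reduces distortion below dist_before and
 * the rate cannot drop below zero, so rd_cost(dist_before, 0, lambda) is a
 * lower bound on any cost the optimization could reach. */
int can_skip_optimization(int64_t dist_before, int64_t lambda,
                          int64_t best_rd) {
  const int64_t lower_bound = rd_cost(dist_before, 0, lambda);
  return lower_bound >= best_rd;
}
```

      With dist_before = 1000 and best_rd = 900 the bound already exceeds
      the best cost, so the optimization would be skipped.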
      
      Change-Id: Ida8098a2820cef60d59ec1e72f0bbb1acbd98165
      3bce7547
  2. 25 Jul, 2017 2 commits
    • Fix so that matching { and } can be found in inter mode decision · 67dda51a
      Yushin Cho authored
      Because #if ... #else ... branches each put a '{' on the same line, dangling { or }
      occur, which causes automatic syntax analyzers, such as 'Ctrl-Shift-P' in Eclipse
      or '%' in vi, to fail to find the matching { and }.
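
      The pattern in question looks roughly like the first form in this
      sketch, where each preprocessor branch opens its own brace. Moving the
      brace past #endif leaves exactly one { per }. CONFIG_EXTRA_ARG,
      pick_mode_before and pick_mode_after are hypothetical names used only
      for illustration.

```c
#define CONFIG_EXTRA_ARG 1 /* hypothetical experiment flag */

/* Before: the source text has two '{' but only one '}', which confuses
 * editor brace matching even though the preprocessed code is valid. */
#if CONFIG_EXTRA_ARG
int pick_mode_before(int rate, int extra) {
#else
int pick_mode_before(int rate) {
  const int extra = 0;
#endif
  return rate + extra;
}

/* After: only the signature varies inside the #if, and a single '{'
 * follows the #endif. */
#if CONFIG_EXTRA_ARG
int pick_mode_after(int rate, int extra)
#else
int pick_mode_after(int rate)
#endif
{
#if !CONFIG_EXTRA_ARG
  const int extra = 0;
#endif
  return rate + extra;
}
```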
      
      For some developers, this can make quick reading and/or understanding of blocks of code
      almost impossible.
      
      Three functions or blocks are repaired.
      1. av1_rd_pick_inter_mode_sb() {...}
      
      2. for (midx = 0; midx < MAX_MODES; ++midx) {...}
         in av1_rd_pick_inter_mode_sb()
      
      3. handle_inter_mode() {...}
      
      Change-Id: Ib5ac63b8c7f9870a491fac337ae3f58c57ce5e46
      67dda51a
    • Account for the 64x64 proc block constraint in obmc masking · 440d4254
      Jingning Han authored
      Make the codec account for the 64x64 processing unit constraint
      when producing the mask for overlapped filter.
      
      Change-Id: I3e596492ae522abe678369b0c9710441549e817e
      440d4254
  3. 24 Jul, 2017 1 commit
    • [CFL] Fix rare overflow in distortion computation · 4c5df105
      Luc Trudeau authored
      Worst case SSE for a 12-bit 64x64 block requires 48 bits
      (2*(12+log2(64)+log2(64))). As such, the dist variable must
      be int64.
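
      The overflow can be checked with direct arithmetic. This sketch
      computes the largest possible SSE for a block; the helper name
      worst_case_sse is an assumption. Counted per sample, the 12-bit 64x64
      case comes out just under 2^36, which sits inside the 48-bit bound
      quoted here and is still far beyond what a 32-bit int can hold.

```c
#include <stdint.h>

/* Largest SSE for a width x height block of bit_depth-bit samples:
 * every sample differs by the maximum amount (2^bit_depth - 1). */
int64_t worst_case_sse(int bit_depth, int width, int height) {
  const int64_t max_diff = ((int64_t)1 << bit_depth) - 1;
  return max_diff * max_diff * width * height;
}
```

      For comparison, the same block at 8 bits stays below INT32_MAX, which
      is consistent with the overflow being rare.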
      
      Results on Subset1 (compared to 19b5c8fa with CfL enabled)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0030 |  0.0001 |  0.0100 |   0.0026 | 0.0024 | -0.0008 |     0.0028
      
      Change-Id: I1364c089c223b96daed942175a915fed0f6f1023
      4c5df105
  4. 20 Jul, 2017 3 commits
    • Add support to the experiment of altref2 · 97ad058e
      Zoe Liu authored
      This CL adds an extra alt-ref reference frame, namely ALTREF2_FRAME,
      and designs the contexts for ALTREF2_FRAME.
      
      Change-Id: I12fe8629b868aebf6c2b54260fca5abc38f90ae6
      97ad058e
    • Add new MRC_DCT tx type · 53f93dbd
      Sarah Parker authored
      This adds the new transform to the list of possible transforms.
      The impact on performance is in the noise range because the transform
      implementation currently performs DCT as a placeholder. This transform
      will initially only have an implementation for TX_32X32 and it is
      skipped in the tx search for smaller transform sizes.
      
      Change-Id: Iab2faddc525b478ca06972a753428a4f4ef53ac6
      53f93dbd
    • New experiment DIST_8x8 · b7b60c57
      Yushin Cho authored
      A framework for computing distortion at the 8x8 luma block level
      during RDO-based mode decision search. A new 8x8 distortion metric can
      be plugged in by way of this tool.
      
      The existing daala_dist now uses this experiment as well.
      Other possible applications are distortion metrics that operate on
      8x8 pixel blocks, such as PSNR-HVS or SSIM.
      
      An rd_cost for the final coding mode decision of a superblock is
      computed for partition sizes of 8x8 or larger. For a block larger than 8x8,
      the distortion of each 8x8 block is computed independently and then summed.
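
      The per-8x8 summation can be sketched as follows. The function
      pointer type dist_8x8_fn, the stand-in metric sse_8x8 and the helper
      block_dist are hypothetical names for illustration. They are not the
      signatures of av1_dist_8x8() or of any other encoder function.

```c
#include <stdint.h>

/* Hypothetical signature for a pluggable 8x8 distortion metric. */
typedef int64_t (*dist_8x8_fn)(const uint8_t *src, int src_stride,
                               const uint8_t *dst, int dst_stride);

/* Plain sum of squared errors over one 8x8 block, used as a stand-in metric. */
int64_t sse_8x8(const uint8_t *src, int src_stride, const uint8_t *dst,
                int dst_stride) {
  int64_t sum = 0;
  for (int r = 0; r < 8; ++r) {
    for (int c = 0; c < 8; ++c) {
      const int d = src[r * src_stride + c] - dst[r * dst_stride + c];
      sum += d * d;
    }
  }
  return sum;
}

/* Distortion of a bw x bh block (both multiples of 8): each 8x8 block is
 * measured independently and the results are summed. */
int64_t block_dist(const uint8_t *src, int src_stride, const uint8_t *dst,
                   int dst_stride, int bw, int bh, dist_8x8_fn metric) {
  int64_t total = 0;
  for (int r = 0; r < bh; r += 8) {
    for (int c = 0; c < bw; c += 8) {
      total += metric(src + r * src_stride + c, src_stride,
                      dst + r * dst_stride + c, dst_stride);
    }
  }
  return total;
}
```

      Swapping sse_8x8 for another function with the same signature changes
      the metric without touching the summation loop.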
      
      The rd_cost for an 8x8 block with the new 8x8 distortion metric is computed
      only when the mode decision of its sub8x8 blocks is completed.
      However, the MSE distortion metric is used for sub8x8 mode decision. Thus,
      early termination is also determined with the MSE-based rd_cost.
      Because the best rd_cost (i.e. the reference rd_cost) during sub8x8 prediction
      or sub8x8 tx is based on the new 8x8 distortion while each sub8x8 block uses MSE,
      the existing early termination cannot be used (this may be one possible reason
      for the BD-Rate change with this revision).
      
      For sub8x8 prediction, the prediction mode for each sub8x8 block of an 8x8 block is
      decided with the existing MSE, and then av1_dist_8x8() is applied to the 8x8 pixels.
      (There is also av1_dist_8x8_diff, which takes the diff signal directly as input.)
      
      For a sub8x8 tx in a block larger than 8x8, instead of computing MSE distortion for
      each sub8x8 tx block, we wait until all sub8x8 tx blocks are encoded before av1_dist_8x8()
      is applied to 8x8 pixels.
      
      Sub8x8 prediction and transforms were the trickiest parts of this change.
      Two kinds of distortion, for a) predicted pixels and b) decoded pixels
      (i.e. predicted + possible reconstructed residue), are always computed during RDO.
      In order to access signals a) and b) for an 8x8 block after
      its sub8x8 mode decision is finished, a) and b) need to be properly stored for later retrieval.
      
      CB4X4 makes accessing the a) and b) signals for a sub8x8 block even more difficult,
      since the intermediate data (i.e. a and/or b) for a sub8x8 block
      are not easily accessible outside of the current partition unless reconstructed
      with the decided coding modes.
      
      Change-Id: If60301a890c0674a3de1d8206965bbd6a6495bb7
      b7b60c57
  5. 19 Jul, 2017 1 commit
    • Rework txk_type indexing system for chroma component · 19b5c8fa
      Jingning Han authored
      Use the row and column indexes to fetch txk_type, which allows the
      chroma components to derive the tx type from the corresponding luma
      components. It improves the coding performance of txk-sel by 0.18%.
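
      A minimal sketch of row/column indexing, assuming a per-block map with
      one entry per 4x4 luma unit. TxkMap, MAP_STRIDE, luma_txk_type and
      chroma_txk_type are hypothetical and only illustrate how a chroma
      position could be mapped back to the co-located luma entry.

```c
#include <stdint.h>

#define MAP_STRIDE 16 /* hypothetical: a 64x64 block in 4x4 units */

/* Hypothetical map of transform kernel types, one entry per 4x4 luma unit. */
typedef struct {
  uint8_t txk_type[MAP_STRIDE * MAP_STRIDE];
} TxkMap;

/* Luma lookup by row and column, both in 4x4 units. */
uint8_t luma_txk_type(const TxkMap *map, int blk_row, int blk_col) {
  return map->txk_type[blk_row * MAP_STRIDE + blk_col];
}

/* Chroma lookup: scale the chroma position back to luma units with the
 * subsampling shifts, then read the co-located luma entry. */
uint8_t chroma_txk_type(const TxkMap *map, int blk_row, int blk_col,
                        int ss_x, int ss_y) {
  return luma_txk_type(map, blk_row << ss_y, blk_col << ss_x);
}
```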
      
      Change-Id: I3f4bca5839e13ae95e51053e76cd86fe58202ac9
      19b5c8fa
  6. 17 Jul, 2017 1 commit
  7. 14 Jul, 2017 3 commits
    • Sample selection in warped motion · 1bc82866
      Yunqing Wang authored
      Added a sample selection process in warped motion.
      1. Gather more samples including multiple rows on the top, multiple
      columns on the left, and the upper-right block.
      2. Sort samples by the MV difference between the neighbour's MV and
      the current block's MV. Trim the samples with considerably large MV
      difference.
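
      The sort-and-trim step can be sketched as follows. The Mv and Sample
      types, the sum-of-absolute-differences measure and the max_diff
      threshold are illustrative assumptions. The commit does not state the
      actual distance measure or trimming threshold.

```c
#include <stdlib.h>

typedef struct { int row, col; } Mv; /* hypothetical MV type */

typedef struct {
  Mv mv;
  int diff; /* filled in by select_samples */
} Sample;

/* qsort comparator: ascending by MV difference. */
static int cmp_diff(const void *a, const void *b) {
  const int da = ((const Sample *)a)->diff;
  const int db = ((const Sample *)b)->diff;
  return (da > db) - (da < db);
}

/* Sorts samples by the difference between each neighbour's MV and cur_mv
 * (sum of absolute component differences) and returns how many samples
 * fall within max_diff. The kept samples are the first entries. */
int select_samples(Sample *samples, int n, Mv cur_mv, int max_diff) {
  for (int i = 0; i < n; ++i) {
    samples[i].diff = abs(samples[i].mv.row - cur_mv.row) +
                      abs(samples[i].mv.col - cur_mv.col);
  }
  qsort(samples, (size_t)n, sizeof(*samples), cmp_diff);
  int kept = 0;
  while (kept < n && samples[kept].diff <= max_diff) ++kept;
  return kept;
}
```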
      
      Borg test result:
                   avg_psnr ovr_psnr ssim
      cam_lowres:  -0.241   -0.243  -0.376
      lowres:      -0.104   -0.110  -0.179
      
      The changes are wrapped in WARPED_MOTION_SORT_SAMPLES macro.
      
      Change-Id: I2730bb31a0a3ad28215ccd16fd6da0ea8b2ed404
      1bc82866
    • refactor get_tx_type() · 45b6475e
      hui su authored
      Change-Id: I2888bd8905253e02e3ac74597275cf56e5142d29
      45b6475e
    • [CFL] Move alpha picking code to rdopt.c · 2510f64e
      David Michael Barr authored
      This simplifies the path from rd_pick_intra_sbuv_mode()
      
      Results on Subset1 (compared to  dff41923 with CfL enabled)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I0bade9d347c626a78ba7077b960afdb318ecca69
      Signed-off-by: David Michael Barr <b@rr-dav.id.au>
      2510f64e
  8. 13 Jul, 2017 2 commits
  9. 12 Jul, 2017 6 commits
    • Automatically turn on/off screen content tools · d9a812bd
      hui su authored
      Turn "allow_screen_content_tools" on when the source video has many blocks
      with only a few different colors. The automatic detection is enabled by
      default (or with command line flag "--tune-content=default"). With
      "--tune-content=screen", the screen content tools are always turned on.
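
      A rough sketch of this kind of detection for 8-bit input: count the
      blocks with few distinct values and compare their share to a
      threshold. The block size, color limit and percentage are illustrative
      assumptions, and count_colors and is_screen_content are hypothetical
      names rather than the encoder's functions.

```c
#include <stdint.h>

/* Counts distinct 8-bit values in a bw x bh block, stopping early once
 * the count exceeds limit. */
int count_colors(const uint8_t *src, int stride, int bw, int bh, int limit) {
  uint8_t seen[256] = { 0 };
  int n = 0;
  for (int r = 0; r < bh; ++r) {
    for (int c = 0; c < bw; ++c) {
      const uint8_t v = src[r * stride + c];
      if (!seen[v]) {
        seen[v] = 1;
        if (++n > limit) return n;
      }
    }
  }
  return n;
}

/* Returns 1 when enough blocks contain only a few distinct colors. */
int is_screen_content(const uint8_t *src, int stride, int width, int height) {
  enum { BLK = 16, MAX_COLORS = 4, MIN_PERCENT = 10 }; /* illustrative */
  int blocks = 0, few = 0;
  for (int r = 0; r + BLK <= height; r += BLK) {
    for (int c = 0; c + BLK <= width; c += BLK) {
      ++blocks;
      if (count_colors(src + r * stride + c, stride, BLK, BLK, MAX_COLORS) <=
          MAX_COLORS) {
        ++few;
      }
    }
  }
  return blocks > 0 && few * 100 >= blocks * MIN_PERCENT;
}
```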
      
      On the screen_content test set, the "default" setting is less than 0.3%
      worse than the "screen" setting on keyframe encoding.
      
      Change-Id: Iac7ab8952c96531d1fae84da1823291f5987519c
      d9a812bd
    • ext-partition-types: Add 4:1 partitions · 93c39e91
      Rupert Swarbrick authored
      This patch adds support for 4:1 rectangular blocks to various common
      data arrays, and adds new partition types to the EXT_PARTITION_TYPES
      experiment which will use them.
      
      This patch has the following restrictions, which can be lifted in
      future patches:
      
        * ext-partition-types is incompatible with fp_mb_stats and supertx
          for the moment
      
        * Currently only 32x32 superblocks can use the new partition types
      
      There's a slightly odd restriction about when we allow
      PARTITION_HORZ_4 or PARTITION_VERT_4. Since these both live in the
      EXT_PARTITION_TYPES CDF, read_partition() can only return them if both
      has_rows and has_cols are true. This means that at least half of the
      width and height of the block must be visible. It might be nice to
      relax that restriction but that would imply a change to how we encode
      partition types, which seems already to be in a state of flux, so
      maybe it's better to wait until that has settled down.
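
      The visibility condition can be sketched as follows, assuming
      has_rows and has_cols mean that the second half of the block starts
      inside the frame. The helper names and the unit convention are
      assumptions for illustration, not the decoder's code.

```c
/* Positions and sizes share one unit (hypothetical). A block has "rows"
 * when its lower half starts inside the frame, and "cols" when its right
 * half does. */
int has_rows(int row, int size, int frame_rows) {
  return row + size / 2 < frame_rows;
}

int has_cols(int col, int size, int frame_cols) {
  return col + size / 2 < frame_cols;
}

/* The 4:1 partition types are only available when both conditions hold. */
int allow_4to1_partition(int row, int col, int size, int frame_rows,
                         int frame_cols) {
  return has_rows(row, size, frame_rows) && has_cols(col, size, frame_cols);
}
```

      A block that hangs more than halfway off the bottom or right edge of
      the frame therefore cannot signal PARTITION_HORZ_4 or PARTITION_VERT_4.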
      
      Change-Id: Id7fc3fd0f762f35f63b3d3e3bf4e07c245c7b4fa
      93c39e91
    • Fix chroma component boundary context update in RD loop · 328d57b8
      Jingning Han authored
      Fix the chroma component boundary context update in the inter
      residual rd search.
      
      Change-Id: Ice8028386a8b3bf921e2bf523ad0d2dcea707c7a
      328d57b8
    • Fix pvq for cb4x4 and maintain its configure · cd4f4a2a
      Yushin Cho authored
      Recently, sub8x8 inter mode decision functions have been
      removed from the av1 codebase, so the codebase no longer allows
      disabling cb4x4.

      This caused pvq to crash, because cb4x4 had been
      disabled whenever pvq was enabled.
      Hence, pvq has been fixed to work with cb4x4.
      
      Also, if pvq is enabled, disable lgt and highbitdepth in the configure.
      
      Change-Id: I2cb675c0dbc12bce60ed6a66c34ea3e907cc35b3
      cd4f4a2a
    • [CFL] Add CfL Alpha cost to RDO · dff41923
      Luc Trudeau authored
      The cost of signaling the alpha symbol and the signs is added to the
      DC_PRED rate in RDO.

      Results on Subset1 (compared to f9e04152b with CfL enabled)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.3699 |  1.5330 |  0.8664 |  -0.2881 | -0.3107 | -0.2587 |     0.1954
      
      Change-Id: Icd9827d11ee4ef29dfb527e636f0f380bcafa062
      dff41923
    • Further work on ext-comp-refs for ref frame coding · fcf5fa27
      Zoe Liu authored
      (1) Work with var-refs to remove redundant bits in ref frame
          coding;
      (2) Add a new uni-directional compound reference pair:
          (LAST_FRAME, LAST3_FRAME);
      (3) Redesign the contexts for encoding uni-directional reference frame
          pairs;
      (4) Use aom_entropy_optimizer to collect stats for all the default
          probability setups related to the coding of reference frames.
      
      Compared against the baseline (default enabled tools excluding ext-tx
      and global-motion for encoder speed concern) with one-sided-compound,
      the coding gain of ext-comp-refs + var-refs - one-sided-compound is:
      
      lowres: avg_psnr -0.385%; ovr_psnr -0.378% ssim -0.344%
      midres: avg_psnr -0.466%; ovr_psnr -0.447% ssim -0.513%
      
      AWCY - High Latency:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.2758 | -0.1526 | -0.0965 |  -0.2581 | -0.2492 | -0.2534 |    -0.2118
      
      AWCY - Low Latency:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -1.0467 | -1.4500 | -0.9732 |  -0.9928 | -1.0407 | -1.0180 |    -1.0049
      
      Compared against the baseline (default enabled tools excluding ext-tx
      and global-motion for encoder speed concern) without
      one-sided-compound, the coding gain of
      ext-comp-refs + var-refs - one-sided-compound is:
      
      lowres: avg_psnr -0.875%; ovr_psnr -0.877% ssim -0.895%
      midres: avg_psnr -0.824%; ovr_psnr -0.802% ssim -0.843%
      
      Change-Id: I8de774c9a74c20632ea93ccb0c17779fa94431cb
      fcf5fa27
  10. 11 Jul, 2017 2 commits
    • Remove SEPARATE_GLOBAL_MOTION macro · 0eea89f3
      Sarah Parker authored
      Global_motion, obmc and warped_motion are now permanently
      mutually exclusive.
      
      Change-Id: Ib1a1207cc7caa6459a2027c6c4a50fcf4c451e76
      0eea89f3
    • Remove the EC_ADAPT experimental flags. · 6bdc40f1
      Nathan E. Egge authored
      Removing these flags makes the EC_ADAPT experiment an integral part of
      the draft AV1 bitstream definition.
      This commit has no effect on metrics.
      
      Change-Id: Ice78520935e8bfa9d25cf4b8384a1b872069d09c
      6bdc40f1
  11. 10 Jul, 2017 1 commit
    • Inter and intra LGTs · 708c1ec5
      Lester Lu authored
      Here we have an LGT to replace ADST for intra residual blocks, and
      another LGT to replace ADST for inter residual blocks. The changes
      are only applied to transform length 4 and 8, and only for the
      lowbitdepth path.
      
      lowres: -0.18%
      
      Change-Id: Iadc1e02b53e3756b44f74ca648cfa8b0e8ca7af4
      708c1ec5
  12. 06 Jul, 2017 6 commits
  13. 05 Jul, 2017 2 commits
  14. 03 Jul, 2017 1 commit
    • [CFL] Adjust Pixel Buffer for Chroma Sub8x8 · 780d249d
      Luc Trudeau authored
      Adjust row and col offset for sub8x8 blocks to allow the CfL prediction
      to use all available reconstructed luma pixels.
      
      Results on Subset 1 (Compared to b03c2f44 with CfL)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.1355 | -0.8517 | -0.4481 |  -0.0579 | -0.0237 | -0.0203 |    -0.2765
      
      Change-Id: Ia91f0a078f0ff4f28bb2d272b096f579e0d04dac
      780d249d
  15. 29 Jun, 2017 1 commit
    • [CFL] Better encapsulation · 3dc55e0f
      Luc Trudeau authored
      The function cfl_compute_parameters is added and contains the logic
      related to building the CfL context parameters. As such, many cfl
      functions can now be encapsulated inside of cfl.c and not exposed to the
      rest of AV1.
      
      This also allows for supplemental asserts that validate that the CfL
      context is properly built.
      
      Results on Subset1 (compared to 9c6f8547 with CfL)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I6d14a426416b3af5491bdc145db7281b5e988cae
      3dc55e0f
  16. 28 Jun, 2017 3 commits
  17. 27 Jun, 2017 2 commits
    • ncobmc_adapt_weight: Add bitstream syntax · 85a8f70c
      Wei-Ting Lin authored
      Define the syntax and entropy coding templates for
      NCOBMC_ADAPT_WEIGHT. The actual values of the default
      probabilities and the index tree structure need to
      be fine tuned.
      
      In this experiment all mv's in a superblock are sent
      first as in the ncobmc case.
      
      Change-Id: I68d50d3d27346c2847ea449a1168c6a99fbb4d3d
      85a8f70c
    • Rework recursive transform block partition search · e3b81bcf
      Jingning Han authored
      Support transform block level kernel selection in the recursive
      transform block partitioning search.
      
      Change-Id: I511c39705ee636b0c9fabbe4720fe5a9764b964a
      e3b81bcf
  18. 26 Jun, 2017 2 commits
    • daala-dist: high bit depth support · 8ab875d6
      Yushin Cho authored
      Change-Id: Idafef140d3425a9a9f66cb8864a804c4d2a89a70
      8ab875d6
    • Fix daala-dist for var-tx · 0474912c
      Yushin Cho authored
      The var-tx has its own suite of tx size/type RD search functions,
      which recursively split the partition into square tx blocks.
      
      Daala-dist requires access to 8x8 pixels (both decoded and predicted)
      since it measures distortion in multiples of 8x8 pixels.
      Thus, if a tx block is smaller than 8x8, it waits until all of the sub8x8 blocks
      are RD searched (with MSE), then replaces the MSE of the 8x8 pixels with
      the distortion calculated by daala-dist for those 8x8 pixels.
      
      It is also applied to luma pixels only.
      
      Change-Id: Ic4891e89b4ef05cf880aa26781d2d06ccf3142de
      0474912c