1. 27 Sep, 2017 1 commit
    • ext-partition: Don't read not-yet-decoded values · 761b1ac8
      David Barker authored
      When deciding whether the top-right or bottom-left blocks are
      available, we currently always act as if we're using 128x128
      superblocks. This means that, when using 64x64 superblocks,
      we sometimes conclude that blocks are available when they haven't
      been decoded yet!
      
      This typically happens at, for example, mi_row=15, mi_col=16
      (for bottom left), where we're at a 64x64 boundary but not
      a 128x128 boundary.
      
      This patch fixes the issue by checking based on the signalled
      superblock size.
      
      Note: Most of this patch is just threading 'cm' through the
      intra prediction process, so that we have access to cm->sb_size
      in has_top_right() and has_bottom_left()
      
      Change-Id: I126964c510aafffc870e7cd8b3e64a46abb14b3a
      761b1ac8
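The boundary problem described in this commit can be illustrated with a minimal sketch. It is not the libaom implementation of has_bottom_left(): the helper name and its parameters are hypothetical, and it assumes 4x4-pixel mode-info (mi) units, so that a 64x64 superblock spans 16 mi and a 128x128 superblock spans 32 mi.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: returns true when the row directly below the current
 * block falls in a later superblock row, in which case the bottom-left
 * neighbour cannot have been decoded yet. All arguments are in mi units
 * (assumed to be 4x4 pixels in this sketch). */
static bool bottom_left_in_later_sb_row(int mi_row, int bh_mi, int sb_mi) {
  assert(mi_row >= 0 && bh_mi > 0 && sb_mi > 0);
  const int cur_sb_row = mi_row / sb_mi;
  const int below_sb_row = (mi_row + bh_mi) / sb_mi;
  return below_sb_row > cur_sb_row;
}
```

For a one-mi-tall block at mi_row=15, calling this with sb_mi=32 (128x128) reports no boundary crossing, while sb_mi=16 (64x64) reports that the row below belongs to the next superblock row. This is the case the commit message calls out.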
  2. 26 Sep, 2017 1 commit
  3. 11 Sep, 2017 1 commit
    • Tokenize and write mrc mask · 99e7daa2
      Sarah Parker authored
      This allows a mask for mrc-tx to be sent in the bitstream for
      inter or intra 32x32 transform blocks. The option to send the mask
      vs build it from the prediction signal is currently controlled with
      a macro. In the future, it is likely the macro will be removed and it
      will be possible for a block to select either method. The mask building
      functions are still placeholders and will be filled in in a followup.
      
      Change-Id: Ie27643ff172cc2b1a9b389fd503fe6bf7c9e21e3
      99e7daa2
  4. 08 Sep, 2017 1 commit
  5. 04 Sep, 2017 1 commit
  6. 31 Aug, 2017 1 commit
    • [CFL] Asserts for chroma_sub8x8 · c84c21c4
      Luc Trudeau authored
      When Chroma from Luma is combined with chroma_sub8x8, the prediction
      used for sub8x8 blocks originates from multiple luma blocks. Extra
      asserts are added to validate that the prediction buffer contains all
      the required information.
      
      Change-Id: I305c46ce9b8292697e1d5b181d123461026da11c
      c84c21c4
  7. 28 Aug, 2017 1 commit
    • [CFL] Move store flag to CFL_CTX · fcca37a4
      Luc Trudeau authored
      With recent changes, it is now possible to store the storage
      flag inside the CFL_CTX. This simplifies the implementation
      and will allow reuse in the decoder.
      
      This change does not alter the bitstream.
      
      Change-Id: Ibb8aebdd3d06f8765d40248ece8a038892e87032
      fcca37a4
  8. 22 Aug, 2017 1 commit
    • Refactor lgt · 918fe698
      Lester Lu authored
      Change get_lgt in order to integrate a later experiment
      lgt_from_pred with lgt. There are two main changes.
      
      The main purpose for this change is to unify get_fwd_lgt and
      get_inv_lgt functions into a get_lgt function so the lgt basis
      functions can always be selected through the same function in
      both forward and inverse transform paths. The structure of those
      functions will also be consistent with the get_lgt_from_pred
      functions that will be added in the lgt-from-pred experiment.
      
      These changes have no impact on the bitstream.
      
      Change-Id: Ifd3dfc1a9e1a250495830ddbf42c201e80aa913e
      918fe698
  9. 18 Aug, 2017 1 commit
    • Remove dpcm-intra experiment · 400bf651
      Hui Su authored
      Coding gain becomes tiny on top of other experiments.
      
      Change-Id: Ia89b1c2a2653f3833dff8ac8bb612eaa3ba18446
      400bf651
  10. 17 Aug, 2017 1 commit
    • Introduce runtime switch for dist_8x8 · 55104335
      Yushin Cho authored
      Even if 'dist-8x8' is enabled with configure, dist-8x8 is not
      actually enabled (so there is no change in encoding behaviour)
      until the command line option '--enable-dist-8x8=1' is used.
      
      cdef-dist and daala-dist cannot be enabled by a command line option yet.
      
      This commit is a part of prep-work to remove DIST_8X8, CDEF_DIST,
      and DAALA_DIST experimental flags.
      
      Change-Id: I5c2df90f837b32f44e756572a19272dfb4c3dff4
      55104335
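The combination of a configure-time flag and a runtime switch described in this commit can be sketched as follows. The macro, struct, and field names are hypothetical stand-ins chosen for illustration and are not taken from libaom.

```c
#include <stdbool.h>

/* Stands in for the configure-time experiment flag (hypothetical name). */
#ifndef CONFIG_DIST_8X8_SKETCH
#define CONFIG_DIST_8X8_SKETCH 1
#endif

/* Hypothetical encoder configuration carrying the command line choice. */
typedef struct {
  bool enable_dist_8x8; /* set from the command line option */
} EncoderConfigSketch;

/* The tool is active only when it is compiled in AND requested at runtime. */
static bool dist_8x8_active(const EncoderConfigSketch *cfg) {
#if CONFIG_DIST_8X8_SKETCH
  return cfg->enable_dist_8x8;
#else
  (void)cfg;
  return false;
#endif
}
```

With this shape, a build that has the experiment compiled in behaves like a build without it until the runtime option is set, which matches the "no change in encoding behaviour" statement above.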
  11. 16 Aug, 2017 1 commit
  12. 15 Aug, 2017 1 commit
    • AOM_QM: enable by default · 181fc08f
      Thomas Davies authored
      No change to metrics, as quantization matrices are not used
      unless --enable-qm=1 is set on the command line.
      
      Fix no highbitdepth compilation, and fix compile errors and
      warnings for PVQ and NEW_QUANT experiments.
      
      Change-Id: I49aceb5acf6ca6790c81e760e5b208788f87086d
      181fc08f
  13. 08 Aug, 2017 2 commits
    • Fix CONFIG_LV_MAP builds with CMake. · e29f1e9a
      Tom Finegan authored
      Add missing source files to the CMake build, and silence a
      warning.
      
      BUG=aomedia:683
      
      Change-Id: I857fce54239d121bc8d47fc805f814890b1b736f
      e29f1e9a
    • AOM_QM: use SIMD for flat matrices and re-enable tests. · 1870382c
      Thomas Davies authored
      When AOM_QM is enabled, by default quantization matrices are
      flat unless enabled with --enable-qm=1. Re-use existing SIMD
      functions when a flat matrix is used, so that there is no
      speed deficit when AOM_QM is enabled.
      
      SIMD for the non-flat case is TBC.
      
      Change-Id: I1bb8da70d3dd5858dac15099610ddf61662e3d0d
      1870382c
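One way to picture the flat-matrix fast path from this commit is a small dispatch helper. This is only a sketch under stated assumptions: the enum, the function names, and the idea of passing a unit weight are hypothetical, and the commit message does not say how libaom detects a flat matrix.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the two quantizer paths. */
typedef enum { QUANT_PATH_SIMD_FLAT, QUANT_PATH_C_WEIGHTED } QuantPathSketch;

/* Assumption: a matrix counts as flat when it is absent (NULL) or when every
 * weight equals the caller-supplied unit weight. */
static bool qm_is_flat(const uint8_t *qm, size_t n, uint8_t unit_weight) {
  if (qm == NULL) return true;
  for (size_t i = 0; i < n; ++i) {
    if (qm[i] != unit_weight) return false;
  }
  return true;
}

/* Pick the existing SIMD quantizer for flat matrices, else the weighted path. */
static QuantPathSketch choose_quant_path(const uint8_t *qm, size_t n,
                                         uint8_t unit_weight) {
  return qm_is_flat(qm, n, unit_weight) ? QUANT_PATH_SIMD_FLAT
                                        : QUANT_PATH_C_WEIGHTED;
}
```

Because a flat matrix scales every coefficient identically, the unweighted SIMD routine can be reused unchanged; only the non-flat case needs a weighted implementation.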
  14. 04 Aug, 2017 2 commits
  15. 03 Aug, 2017 3 commits
    • Replace shift with multiply · bc83b642
      Yaowu Xu authored
      To avoid left shift of negative values.
      
      BUG=aomedia:678
      
      Change-Id: I8dacf99f162771a58bef1f839cafd0be9a8f5e86
      bc83b642
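In C, left-shifting a negative signed value is undefined behaviour, which is the reason for this kind of change. A minimal sketch of the replacement follows; the helper name is hypothetical and the actual call sites in the patch are not shown here.

```c
#include <assert.h>

/* Scale a signed value by 2^shift without applying '<<' to a negative
 * operand. Only the non-negative constant 1 is shifted. */
static int scale_by_pow2(int value, int shift) {
  assert(shift >= 0 && shift < 16);
  return value * (1 << shift);
}
```

The multiply gives the same result as the shift for non-negative inputs and a well-defined result for negative ones, provided the product does not overflow int.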
    • Add macros to turn off inter and intra mrc_dct separately · 2e08d96d
      Sarah Parker authored
      This will aid in testing different masking methods for inter
      and intra blocks.
      
      Change-Id: Ic038da77e55405e3303177e6cd260bd5e19311c1
      2e08d96d
    • Calculate coeff token cost from CDF · c0cf71df
      hui su authored
      AWCY results:
        PSNR | PSNR HVS |  SSIM | CIEDE 2000
       -0.09 |    -0.04 | -0.02 |      -0.03
      
      On Google testsets:
      lowres  -0.18%
      midres  -0.20%
      
      Above results are obtained with
      --disable-ext-refs --disable-dual-filter --disable-loop-restoration
      --disable-global-motion --disable-warped-motion
      
      Change-Id: Iba58d5e5ec9a65d0afba29609aa2e379a80d7236
      c0cf71df
  16. 02 Aug, 2017 1 commit
    • Add txmg experiment · ad653a39
      Angie Chiang authored
      This experiment aims at merging lbd/hbd txfms
      
      So far this exp uses hbd transform on lbd path.
      The performances I observed are
      lowres -0.089%
      midres  0.065%
      (negative means performance drop)
      
      From here, two main things need to be done:
      1) Fix overflow due to quantizer noise
      2) Generate a 16-bit version from the hbd txfm
      
      Change-Id: I35bb1fc0cbb78decad2570ff5826ed665f739752
      ad653a39
  17. 26 Jul, 2017 2 commits
    • rect_tx_ext: work with var_tx · d6bdd46b
      Yue Chen authored
      Change-Id: Ie2c34490dc50cb242bcd701308e6b55243883b15
      d6bdd46b
    • Add txfm functions corresponding to MRC_DCT · 5b8e6d2d
      Sarah Parker authored
      MRC_DCT uses a mask based on the prediction signal to modify the
      residual before applying DCT_DCT. This adds all necessary functions
      to perform this transform and makes the prediction signal available
      to the 32x32 txfm functions so the mask can be created. I am still
      experimenting with different types of mask generation functions and
      so this patch contains a placeholder. This patch has no impact on
      performance.
      
      Change-Id: Ie3772f528e82103187a85c91cf00bb291dba328a
      5b8e6d2d
  18. 20 Jul, 2017 2 commits
    • Make maximum transform coding unit 64x64 for inter blocks · c2b797fa
      Jingning Han authored
      This commit makes the maximum transform coefficient coding unit
      64x64 for inter coded blocks. It allows the hardware design to
      reuse the existing 64x64 coding pipeline for 128x128 level blocks.
      
      Change-Id: Ibadd59cf7e652984456cac621ec2294d48cf4507
      c2b797fa
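The idea of covering a 128x128 block with 64x64 coding units can be sketched as a simple tiling loop. The function, callback type, and constant below are hypothetical and only illustrate the traversal, not how libaom codes the coefficients.

```c
#include <assert.h>
#include <stddef.h>

/* Largest transform coding unit, per the commit message. */
#define MAX_TX_UNIT_SKETCH 64

/* Hypothetical callback invoked once per coding unit. */
typedef void (*UnitVisitorSketch)(int x, int y, int w, int h, void *ctx);

/* Walk a bw x bh block in raster order, in units of at most 64x64 pixels.
 * Returns the number of units visited. */
static int for_each_tx_unit(int bw, int bh, UnitVisitorSketch visit,
                            void *ctx) {
  assert(bw > 0 && bh > 0);
  int count = 0;
  for (int y = 0; y < bh; y += MAX_TX_UNIT_SKETCH) {
    for (int x = 0; x < bw; x += MAX_TX_UNIT_SKETCH) {
      const int w = bw - x < MAX_TX_UNIT_SKETCH ? bw - x : MAX_TX_UNIT_SKETCH;
      const int h = bh - y < MAX_TX_UNIT_SKETCH ? bh - y : MAX_TX_UNIT_SKETCH;
      if (visit != NULL) visit(x, y, w, h, ctx);
      ++count;
    }
  }
  return count;
}
```

A 128x128 block yields four units and a 64x64 block yields one, so the same 64x64 pipeline handles both sizes.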
    • New experiment DIST_8x8 · b7b60c57
      Yushin Cho authored
      A framework for computing a distortion at the 8x8 luma block level
      during RDO-based mode decision search. New 8x8 distortion metrics can
      be plugged in by way of this tool.
      
      Existing daala_dist now uses this experiment as well.
      Other possible applications of this experiment are distortion metrics
      that should be applied to 8x8 pixel blocks, such as PSNR-HVS or SSIM.
      
      An rd_cost for the final coding mode decision of a superblock is
      computed for partition sizes of 8x8 or larger. For a block larger than 8x8,
      the distortion of each 8x8 block is computed independently and then summed.
      
      The rd_cost for an 8x8 block with the new 8x8 distortion metric is computed
      only when the mode decisions of its sub8x8 blocks are completed.
      However, the MSE distortion metric is used for sub8x8 mode decisions. Thus,
      early termination is also determined with the MSE based rd_cost.
      Because the best rd_cost (i.e. the reference rd_cost) during sub8x8 prediction
      or sub8x8 tx is based on the new 8x8 distortion while each sub8x8 block uses MSE,
      the existing early termination cannot be used (this may be one of the reasons
      for the BD-Rate change with this revision).
      
      For sub8x8 prediction, the prediction mode for each sub8x8 block of an 8x8 block is
      decided with the existing MSE, and then av1_dist_8x8() is applied to the 8x8 pixels.
      (There is also av1_dist_8x8_diff, which takes the diff signal as input directly.)
      
      For a sub8x8 tx in a block larger than 8x8, instead of computing MSE distortion for
      each sub8x8 tx block, we wait until all sub8x8 tx blocks are encoded before av1_dist_8x8()
      is applied to 8x8 pixels.
      
      Sub8x8 prediction and transforms were the trickiest parts of this change.
      Two kinds of distortion, for a) predicted pixels and b) decoded pixels
      (i.e. predicted + possible reconstructed residue), are always computed during RDO.
      In order to access signals a) and b) for an 8x8 block after
      its sub8x8 mode decision is finished, a) and b) need to be properly stored for later retrieval.
      
      CB4X4 makes accessing the a) and b) signals for sub8x8 blocks even more difficult,
      since the intermediate data (i.e. a and/or b) for a sub8x8 block
      is not easily accessible outside of the current partition unless reconstructed
      with the decided coding modes.
      
      Change-Id: If60301a890c0674a3de1d8206965bbd6a6495bb7
      b7b60c57
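The "compute per 8x8 block, then sum" rule from this commit message, together with the pluggable metric, can be sketched as below. The function-pointer type and both function names are hypothetical, and plain sum of squared errors stands in for whatever 8x8 metric is plugged in; this is not the av1_dist_8x8() implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical signature for a pluggable 8x8 distortion metric. */
typedef int64_t (*Dist8x8FnSketch)(const uint8_t *src, int src_stride,
                                   const uint8_t *dst, int dst_stride);

/* Example metric: sum of squared errors over one 8x8 block. */
static int64_t sse_8x8(const uint8_t *src, int src_stride,
                       const uint8_t *dst, int dst_stride) {
  int64_t sse = 0;
  for (int r = 0; r < 8; ++r) {
    for (int c = 0; c < 8; ++c) {
      const int d = src[r * src_stride + c] - dst[r * dst_stride + c];
      sse += d * d;
    }
  }
  return sse;
}

/* Distortion of a bw x bh block: apply the metric to every 8x8 tile
 * independently and add up the results. */
static int64_t block_dist_sum_8x8(const uint8_t *src, int src_stride,
                                  const uint8_t *dst, int dst_stride,
                                  int bw, int bh, Dist8x8FnSketch metric) {
  assert(bw > 0 && bh > 0 && bw % 8 == 0 && bh % 8 == 0);
  int64_t total = 0;
  for (int r = 0; r < bh; r += 8) {
    for (int c = 0; c < bw; c += 8) {
      total += metric(src + r * src_stride + c, src_stride,
                      dst + r * dst_stride + c, dst_stride);
    }
  }
  return total;
}
```

Swapping sse_8x8 for another function with the same signature changes the metric without touching the tiling loop, which is the sense in which a new 8x8 metric can be plugged in.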
  19. 19 Jul, 2017 1 commit
    • Rework txk_type indexing system for chroma component · 19b5c8fa
      Jingning Han authored
      Use the row and column indexes to fetch txk_type, which allows the
      chroma components to derive the tx type from the corresponding luma
      components. It improves the coding performance of txk-sel by 0.18%.
      
      Change-Id: I3f4bca5839e13ae95e51053e76cd86fe58202ac9
      19b5c8fa
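Row and column indexing of this kind can be sketched as follows. The grid layout, the function names, and the mapping from chroma to luma positions by the subsampling shifts are all assumptions made for illustration; the commit message does not spell out the exact mapping used in libaom.

```c
#include <assert.h>
#include <stdint.h>

/* Fetch the stored transform type at (row, col) of a row-major grid.
 * The grid holds one entry per luma transform block position. */
static uint8_t txk_type_at(const uint8_t *grid, int stride, int row, int col) {
  assert(row >= 0 && col >= 0 && col < stride);
  return grid[row * stride + col];
}

/* Assumed mapping: a chroma block at (c_row, c_col) reads the entry of the
 * co-located luma position, found by undoing the chroma subsampling. */
static uint8_t chroma_txk_type(const uint8_t *luma_grid, int stride,
                               int c_row, int c_col, int ss_x, int ss_y) {
  return txk_type_at(luma_grid, stride, c_row << ss_y, c_col << ss_x);
}
```

With 4:2:0 subsampling (ss_x = ss_y = 1), the chroma block at row 1, column 2 reads the luma entry at row 2, column 4.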
  20. 17 Jul, 2017 2 commits
    • Unify FWD_TXFM_PARAM and INV_TXFM_PARAM · 27319b6e
      Lester Lu authored
      Change two similar structs, FWD_TXFM_PARAM and INV_TXFM_PARAM,
      into a common struct: TxfmParam. Its definition is moved to
      aom_dsp/txfm_common.h to simplify dependency.
      
      This change is made so that, in later changes of the LGT
      experiment, functions requiring FWD_TXFM_PARAM and
      INV_TXFM_PARAM, such as get_fwd_lgt4 and get_inv_lgt4, can
      also be unified.
      
      Change-Id: I756b0176a02314005060adbf8e62386f10eeb344
      27319b6e
    • refactor get_tx_size() and get_uv_tx_size() · 0c6244b6
      hui su authored
      Change-Id: I802c9e41ebfed090b5ad8300917aad5e16ad026a
      0c6244b6
  21. 14 Jul, 2017 2 commits
  22. 13 Jul, 2017 1 commit
  23. 12 Jul, 2017 1 commit
  24. 11 Jul, 2017 3 commits
    • [CFL] Compute alpha costs using av1_cost_symbol · 19bb3498
      Luc Trudeau authored
      Use the uniform way to compute the cost of symbols in AV1.
      
      Results on Subset1 (compared to 8a516a8f with CfL enabled)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0357 | -0.0854 |  0.0305 |  -0.0422 | -0.0097 | -0.0171 |    -0.1042
      
      Change-Id: Ie908fc7d20c480634002c78027b070223b3ea96d
      19bb3498
    • [CFL] Convert cfl_alpha to q3 · 4e81d929
      Luc Trudeau authored
      Alpha's biggest fraction is 1/8, so Q3 does not change the bitstream.
      
      Results on Subset1 (compared to 503aca74 with CfL enabled)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I1fe5b2ace97179d5f950d7406a4f3d391924f89d
      4e81d929
    • Remove the EC_ADAPT experimental flags. · 6bdc40f1
      Nathan E. Egge authored
      Removing these flags makes the EC_ADAPT experiment an integral part of
      the draft AV1 bitstream definition.
      This commit has no effect on metrics.
      
      Change-Id: Ice78520935e8bfa9d25cf4b8384a1b872069d09c
      6bdc40f1
  25. 10 Jul, 2017 2 commits
    • Inter and intra LGTs · 708c1ec5
      Lester Lu authored
      Here we have an LGT to replace ADST for intra residual blocks, and
      another LGT to replace ADST for inter residual blocks. The changes
      are only applied to transform length 4 and 8, and only for the
      lowbitdepth path.
      
      lowres: -0.18%
      
      Change-Id: Iadc1e02b53e3756b44f74ca648cfa8b0e8ca7af4
      708c1ec5
    • [CFL] Q0 DC_Pred · 7651b739
      Luc Trudeau authored
      The block level DC_PRED computed by CfL goes down from Q6 to Q0. This
      will allow reuse of existing assembly for DC_PRED and also reduce the
      requirements on the multiply required to scale the reconstructed luma
      values.
      
      Results on Subset1 (compared to f9684d222 with CfL enabled)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0347 |  0.0229 | -0.1326 |  -0.0420 | -0.0057 | -0.0072 |    -0.0644
      
      Change-Id: I6ba82cc9e04fa4ab7c8ec40a7856deb273881748
      7651b739
  26. 07 Jul, 2017 1 commit
    • Signature changes for the LGT experiment · d8b1ddce
      Lester Lu authored
      The input arguments of av1_fht* and av1_iht* functions (and their
      HBD versions) are slightly changed. Input arguments tx_type and
      bd are carried by a struct fwd_txfm_param/inv_txfm_param. This
      struct is meant to later on carry other prediction information,
      such as intra top/left boundaries to the transform level, so
      that the choice of transforms can be more adaptive to the
      prediction mode and local video content.
      
      Change-Id: Ia42544248a51845be64b72855b642ef1fe5910a9
      d8b1ddce
  27. 06 Jul, 2017 3 commits
    • Remove the token state array from optimize_b_greedy. · 627e2fd5
      Kyle Siefring authored
      The token state array was carried over from the old optimize_b.
      With hbd and 64x64 transforms enabled, the array uses 128KB. While the array
      could be changed to only store tokens, this commit opts to remove
      it entirely.
      
      Improves performance on difficult clips at q20 by roughly 2% with
      high-bitdepth enabled. Actual speedup should be higher.
      
      This change has no impact on metrics.
      
      Change-Id: Ib9924092dee30b0f0abcc7850e8bb52d3e891e31
      627e2fd5
    • [CFL] Fewer bits for fixed point · 475fc9df
      Luc Trudeau authored
      Since alpha is Q3, we reduce y_average from Q10 to Q3. As such, the
      prediction is reduced from Q13 to Q6. Chroma dc_pred is reduced from Q7
      to Q6 in order to match with the prediction.
      
      Results on Subset1 (compared to 209de2e5b with CfL enabled)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0010 |  0.0176 | -0.0538 |  -0.0043 | 0.0027 | -0.0097 |    -0.0018
      
      Change-Id: Ib7dd3968a764e0380ddc0ad2333ebacf1e9699cd
      475fc9df
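The Q-format bookkeeping in this commit (Q3 times Q3 gives Q6, which then matches a Q6 dc_pred) can be checked with a small fixed-point sketch. It assumes the usual Chroma-from-Luma form, dc_pred + alpha * (luma - average), and 8-bit output; the function name and parameters are hypothetical and this is not the libaom code.

```c
#include <assert.h>

/* Hypothetical CfL-style prediction in fixed point.
 *   alpha_q3          : scaling factor, 3 fractional bits
 *   luma_minus_avg_q3 : luma sample minus block average, 3 fractional bits
 *   dc_pred_q6        : chroma DC prediction, 6 fractional bits
 * Returns a pixel value rounded to Q0 and clamped to 8 bits (assumption). */
static int cfl_predict_sketch(int alpha_q3, int luma_minus_avg_q3,
                              int dc_pred_q6) {
  /* Q3 * Q3 = Q6, so the product can be added to dc_pred_q6 directly. */
  const int scaled_q6 = alpha_q3 * luma_minus_avg_q3;
  const int sum_q6 = scaled_q6 + dc_pred_q6;
  if (sum_q6 <= 0) return 0; /* also avoids right-shifting a negative value */
  const int pixel = (sum_q6 + 32) >> 6; /* add half of 64, then drop 6 bits */
  return pixel > 255 ? 255 : pixel;
}
```

For example, alpha = 1/2 (4 in Q3), a luma difference of 10.0 (80 in Q3), and dc_pred = 100 (6400 in Q6) give 320 + 6400 = 6720 in Q6, which rounds to a pixel value of 105.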
    • [CFL] Convert dc_pred to fixed point · 2e6cb7e7
      Luc Trudeau authored
      The dc_pred values stored in the CfL context are in Q8.7 (Worst case
      division will be of 1/128).
      
      Results on Subset1 (compared to f9684d222 with CfL enabled)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0118 | -0.0181 | -0.0109 |   0.0086 | 0.0086 |  0.0196 |     0.0018
      
      Change-Id: I0701e04fb76f03eff12ed01fd5fda675fbb15e32
      2e6cb7e7