1. 08 Aug, 2017 2 commits
    • Fix CONFIG_LV_MAP builds with CMake. · e29f1e9a
      Tom Finegan authored
      Add missing source files to the CMake build, and silence a
      warning.
      
      BUG=aomedia:683
      
      Change-Id: I857fce54239d121bc8d47fc805f814890b1b736f
      e29f1e9a
    • AOM_QM: use SIMD for flat matrices and re-enable tests. · 1870382c
      Thomas Davies authored
      When AOM_QM is enabled, by default quantization matrices are
      flat unless enabled with --enable-qm=1. Re-use existing SIMD
      functions when a flat matrix is used, so that there is no
      speed deficit when AOM_QM is enabled.
      
      SIMD for the non-flat case is TBC.
      
      Change-Id: I1bb8da70d3dd5858dac15099610ddf61662e3d0d
      1870382c
  2. 04 Aug, 2017 2 commits
  3. 03 Aug, 2017 3 commits
    • Replace shift with multiply · bc83b642
      Yaowu Xu authored
      To avoid left shift of negative values.
      
      BUG=aomedia:678
      
      Change-Id: I8dacf99f162771a58bef1f839cafd0be9a8f5e86
      bc83b642
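      In C, left-shifting a negative signed value is undefined behavior, while
      multiplying by a power of two is well defined as long as the result fits
      in the type. A minimal sketch of the idea follows; the helper name
      scale_pow2 is hypothetical and is not taken from the commit itself.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Hypothetical helper: scale a signed value by 2^bits without
       * left-shifting a possibly negative operand. */
      static int32_t scale_pow2(int32_t value, int bits) {
        assert(bits >= 0 && bits < 31);
        /* The shift is applied to the positive constant 1, never to `value`.
         * The caller must still make sure the product fits in int32_t. */
        return value * (int32_t)(1 << bits);
      }
      ```

      For example, scale_pow2(-3, 2) yields -12, where the expression -3 << 2
      would not be portable.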
    • Add macros to turn off inter and intra mrc_dct separately · 2e08d96d
      Sarah Parker authored
      This will aid in testing different masking methods for inter
      and intra blocks.
      
      Change-Id: Ic038da77e55405e3303177e6cd260bd5e19311c1
      2e08d96d
    • Calculate coeff token cost from CDF · c0cf71df
      hui su authored
      AWCY results:
       PSNR | PSNR HVS |  SSIM | CIEDE 2000
      -0.09 |    -0.04 | -0.02 |      -0.03
      
      On Google testsets:
      lowres  -0.18%
      midres  -0.20%
      
      Above results are obtained with
      --disable-ext-refs --disable-dual-filter --disable-loop-restoration
      --disable-global-motion --disable-warped-motion
      
      Change-Id: Iba58d5e5ec9a65d0afba29609aa2e379a80d7236
      c0cf71df
  4. 02 Aug, 2017 1 commit
    • Add txmg experiment · ad653a39
      Angie Chiang authored
      This experiment aims at merging lbd/hbd txfms
      
      So far this exp uses hbd transform on lbd path.
      The performances I observed are
      lowres -0.089%
      midres  0.065%
      (negative means performance drop)
      
      From here, two main things remain to be done:
      1) Fix overflow due to quantizer noise
      2) Generate a 16-bit version from the hbd txfm
      
      Change-Id: I35bb1fc0cbb78decad2570ff5826ed665f739752
      ad653a39
  5. 26 Jul, 2017 2 commits
    • rect_tx_ext: work with var_tx · d6bdd46b
      Yue Chen authored
      Change-Id: Ie2c34490dc50cb242bcd701308e6b55243883b15
      d6bdd46b
    • Add txfm functions corresponding to MRC_DCT · 5b8e6d2d
      Sarah Parker authored
      MRC_DCT uses a mask based on the prediction signal to modify the
      residual before applying DCT_DCT. This adds all necessary functions
      to perform this transform and makes the prediction signal available
      to the 32x32 txfm functions so the mask can be created. I am still
      experimenting with different types of mask generation functions and
      so this patch contains a placeholder. This patch has no impact on
      performance.
      
      Change-Id: Ie3772f528e82103187a85c91cf00bb291dba328a
      5b8e6d2d
  6. 20 Jul, 2017 2 commits
    • Make maximum transform coding unit 64x64 for inter blocks · c2b797fa
      Jingning Han authored
      This commit makes the maximum transform coefficient coding unit
      64x64 for inter coded blocks. It allows the hardware design to
      reuse the existing 64x64 coding pipeline for 128x128 level blocks.
      
      Change-Id: Ibadd59cf7e652984456cac621ec2294d48cf4507
      c2b797fa
    • New experiment DIST_8x8 · b7b60c57
      Yushin Cho authored
      A framework for computing a distortion at 8x8 luma block level
      during RDO-based mode decision search. New 8x8 distortion metric can
      be plugged in by way of this tool.
      
      Existing daala_dist now uses this experiment as well.
      Other possible applications of this experiment are distortion metrics
      that apply to 8x8 pixel blocks, such as PSNR-HVS or SSIM.
      
      An rd_cost for the final coding mode decision for a super block is
      computed for a partition size 8x8 or larger. For a block larger than 8x8,
      a distortion of each 8x8 block is independently computed then summed up.
      
      The rd_cost for an 8x8 block with the new 8x8 distortion metric is computed
      only when the mode decision of its sub8x8 blocks is completed.
      However, the MSE distortion metric is used for the sub8x8 mode decision. Thus,
      early termination is also determined with the MSE-based rd_cost.
      Because the best rd_cost (i.e. the reference rd_cost) during sub8x8 prediction
      or sub8x8 tx is based on the new 8x8 distortion while each sub8x8 uses MSE,
      the existing early termination cannot be used (this may be one possible reason
      for the BD-Rate change with this revision).
      
      For a sub8x8 prediction, the prediction mode for each sub8x8 block of an 8x8 block is
      decided with the existing MSE, and then av1_dist_8x8() is applied to the 8x8 pixels.
      (There is also av1_dist_8x8_diff, which can take the diff signal directly as input.)
      
      For a sub8x8 tx in a block larger than 8x8, instead of computing MSE distortion for
      each sub8x8 tx block, we wait until all sub8x8 tx blocks are encoded before av1_dist_8x8()
      is applied to 8x8 pixels.
      
      Sub8x8 prediction and transforms were the trickiest parts of this change.
      Two kinds of distortion, for a) predicted pixels and b) decoded pixels
      (i.e. predicted + possible reconstructed residue), are always computed during RDO.
      In order to access those two signals a) and b) for an 8x8 block after
      its sub8x8 mode decision is finished, a) and b) need to be properly stored for later retrieval.

      CB4X4 makes the task of accessing the a) and b) signals for a sub8x8 block even more difficult,
      since the intermediate data (i.e. a and/or b) for a sub8x8 block
      are not easily accessible outside of the current partition unless reconstructed
      with the decided coding modes.
      
      Change-Id: If60301a890c0674a3de1d8206965bbd6a6495bb7
      b7b60c57
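      The rule that a block larger than 8x8 is measured one 8x8 unit at a time
      and then summed can be sketched as below. This is only an illustration:
      the function-pointer type, the helper names, and the plain
      sum-of-squared-errors stand-in metric are assumptions, not the actual
      signature of av1_dist_8x8().

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Hypothetical signature for a pluggable 8x8 distortion metric. */
      typedef uint64_t (*dist_8x8_fn)(const uint8_t *src, int src_stride,
                                      const uint8_t *dst, int dst_stride);

      /* Sum of squared errors over one 8x8 block, used as a stand-in metric. */
      static uint64_t sse_8x8(const uint8_t *src, int src_stride,
                              const uint8_t *dst, int dst_stride) {
        uint64_t sum = 0;
        for (int r = 0; r < 8; ++r) {
          for (int c = 0; c < 8; ++c) {
            const int d = src[r * src_stride + c] - dst[r * dst_stride + c];
            sum += (uint64_t)(d * d);
          }
        }
        return sum;
      }

      /* Distortion of a w x h block (both multiples of 8): every 8x8 unit is
       * measured independently and the results are summed. */
      static uint64_t block_dist(const uint8_t *src, int src_stride,
                                 const uint8_t *dst, int dst_stride,
                                 int w, int h, dist_8x8_fn metric) {
        assert(w % 8 == 0 && h % 8 == 0);
        uint64_t total = 0;
        for (int r = 0; r < h; r += 8) {
          for (int c = 0; c < w; c += 8) {
            total += metric(src + r * src_stride + c, src_stride,
                            dst + r * dst_stride + c, dst_stride);
          }
        }
        return total;
      }
      ```

      Passing the metric as a parameter mirrors the "plug in a new 8x8
      metric" idea: a perceptual metric could replace sse_8x8 without
      touching the summation loop.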
  7. 19 Jul, 2017 1 commit
    • Rework txk_type indexing system for chroma component · 19b5c8fa
      Jingning Han authored
      Use the row and column indexes to fetch txk_type, which allows the
      chroma components to derive the tx type from the corresponding luma
      components. It improves the coding performance of txk-sel by 0.18%.
      
      Change-Id: I3f4bca5839e13ae95e51053e76cd86fe58202ac9
      19b5c8fa
  8. 17 Jul, 2017 2 commits
    • Unify FWD_TXFM_PARAM and INV_TXFM_PARAM · 27319b6e
      Lester Lu authored
      Change two similar structs, FWD_TXFM_PARAM and INV_TXFM_PARAM,
      into a common struct: TxfmParam. Its definition is moved to
      aom_dsp/txfm_common.h to simplify dependency.
      
      This change is made so that, in later changes of the LGT
      experiment, functions requiring FWD_TXFM_PARAM and
      INV_TXFM_PARAM, such as get_fwd_lgt4 and get_inv_lgt4, can
      also be unified.
      
      Change-Id: I756b0176a02314005060adbf8e62386f10eeb344
      27319b6e
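      To illustrate why a shared parameter struct helps, here is a toy sketch
      in which a forward and an inverse transform take the same struct. Only
      the fields tx_type and bd are named in this log; the struct layout, the
      enum, and the 2-point transforms below are hypothetical and are not the
      real TxfmParam from aom_dsp/txfm_common.h.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Toy transform types; the real codec defines its own enum. */
      typedef enum { TX_BUTTERFLY_SKETCH, TX_IDENTITY_SKETCH } TxTypeSketch;

      /* One struct for both directions (layout is an assumption). */
      typedef struct {
        TxTypeSketch tx_type;
        int bd; /* bit depth: 8, 10 or 12 */
      } TxfmParamSketch;

      /* Forward 2-point toy transform: sum and difference. */
      static void fwd_txfm2_sketch(const int32_t *in, int32_t *out,
                                   const TxfmParamSketch *param) {
        assert(param->bd == 8 || param->bd == 10 || param->bd == 12);
        if (param->tx_type == TX_IDENTITY_SKETCH) {
          out[0] = in[0];
          out[1] = in[1];
        } else {
          out[0] = in[0] + in[1];
          out[1] = in[0] - in[1];
        }
      }

      /* Inverse of the toy transform, taking the same parameter struct. */
      static void inv_txfm2_sketch(const int32_t *in, int32_t *out,
                                   const TxfmParamSketch *param) {
        assert(param->bd == 8 || param->bd == 10 || param->bd == 12);
        if (param->tx_type == TX_IDENTITY_SKETCH) {
          out[0] = in[0];
          out[1] = in[1];
        } else {
          /* in[0] + in[1] and in[0] - in[1] are always even here. */
          out[0] = (in[0] + in[1]) / 2;
          out[1] = (in[0] - in[1]) / 2;
        }
      }
      ```

      Because both directions accept the same struct, a caller can build the
      parameters once and hand them to either function.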
    • refactor get_tx_size() and get_uv_tx_size() · 0c6244b6
      hui su authored
      Change-Id: I802c9e41ebfed090b5ad8300917aad5e16ad026a
      0c6244b6
  9. 14 Jul, 2017 2 commits
  10. 13 Jul, 2017 1 commit
  11. 12 Jul, 2017 1 commit
  12. 11 Jul, 2017 3 commits
    • [CFL] Compute alpha costs using av1_cost_symbol · 19bb3498
      Luc Trudeau authored
      Use the uniform way to compute the cost of symbols in AV1.
      
      Results on Subset1 (compared to 8a516a8f with CfL enabled)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0357 | -0.0854 |  0.0305 |  -0.0422 | -0.0097 | -0.0171 |    -0.1042
      
      Change-Id: Ie908fc7d20c480634002c78027b070223b3ea96d
      19bb3498
    • [CFL] Convert cfl_alpha to q3 · 4e81d929
      Luc Trudeau authored
      Alpha's biggest fraction is 1/8, so Q3 does not change the bitstream.
      
      Results on Subset1 (compared to 503aca74 with CfL enabled)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I1fe5b2ace97179d5f950d7406a4f3d391924f89d
      4e81d929
    • Remove the EC_ADAPT experimental flags. · 6bdc40f1
      Nathan E. Egge authored
      Removing these flags makes the EC_ADAPT experiment an integral part of
      the draft AV1 bitstream definition.
      This commit has no effect on metrics.
      
      Change-Id: Ice78520935e8bfa9d25cf4b8384a1b872069d09c
      6bdc40f1
  13. 10 Jul, 2017 2 commits
    • Inter and intra LGTs · 708c1ec5
      Lester Lu authored
      Here we have an LGT to replace ADST for intra residual blocks, and
      another LGT to replace ADST for inter residual blocks. The changes
      are only applied to transform length 4 and 8, and only for the
      lowbitdepth path.
      
      lowres: -0.18%
      
      Change-Id: Iadc1e02b53e3756b44f74ca648cfa8b0e8ca7af4
      708c1ec5
    • [CFL] Q0 DC_Pred · 7651b739
      Luc Trudeau authored
      The block level DC_PRED computed by CfL goes down from Q6 to Q0. This
      allows reusing the existing assembly for DC_PRED and also reduces the
      requirements on the multiply needed to scale the reconstructed luma
      values.
      
      Results on Subset1 (compared to f9684d222 with CfL enabled)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0347 |  0.0229 | -0.1326 |  -0.0420 | -0.0057 | -0.0072 |    -0.0644
      
      Change-Id: I6ba82cc9e04fa4ab7c8ec40a7856deb273881748
      7651b739
  14. 07 Jul, 2017 1 commit
    • Signature changes for the LGT experiment · d8b1ddce
      Lester Lu authored
      The input arguments of av1_fht* and av1_iht* functions (and their
      HBD versions) are slightly changed. Input arguments tx_type and
      bd are carried by a struct fwd_txfm_param/inv_txfm_param. This
      struct is meant to later on carry other prediction information,
      such as intra top/left boundaries to the transform level, so
      that the choice of transforms can be more adaptive to the
      prediction mode and local video content.
      
      Change-Id: Ia42544248a51845be64b72855b642ef1fe5910a9
      d8b1ddce
  15. 06 Jul, 2017 6 commits
    • Remove the token state array from optimize_b_greedy. · 627e2fd5
      Kyle Siefring authored
      The token state array was carried over from the old optimize_b.
      With hbd and 64x64 transforms on, the array uses 128KB. While the array
      could be changed to only store tokens, this commit opts to remove
      it entirely.
      
      Improves performance on difficult clips at q20 by roughly 2% with
      high-bitdepth enabled. Actual speedup should be higher.
      
      This change has no impact on metrics.
      
      Change-Id: Ib9924092dee30b0f0abcc7850e8bb52d3e891e31
      627e2fd5
    • [CFL] Fewer bits for fixed point · 475fc9df
      Luc Trudeau authored
      Since alpha is Q3, we reduce y_average from Q10 to Q3. As such, the
      prediction is reduced from Q13 to Q6. Chroma dc_pred is reduced from Q7
      to Q6 in order to match with the prediction.
      
      Results on Subset1 (compared to 209de2e5b with CfL enabled)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0010 |  0.0176 | -0.0538 |  -0.0043 | 0.0027 | -0.0097 |    -0.0018
      
      Change-Id: Ib7dd3968a764e0380ddc0ad2333ebacf1e9699cd
      475fc9df
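      The Q-format bookkeeping described above can be sketched as follows: a
      Q3 luma difference times a Q3 alpha gives a Q6 product, which lines up
      with a Q6 chroma dc_pred. This is one reading of the commit message,
      assuming 8-bit luma; the function names are hypothetical and this is
      not the code from cfl.c.

      ```c
      #include <assert.h>

      #define AVG_Q_BITS 3  /* y_average and alpha are Q3 (from the message) */
      #define PRED_Q_BITS 6 /* prediction and dc_pred are Q6 */

      /* CfL-style prediction in Q6 for one pixel (sketch, 8-bit luma). */
      static int cfl_pred_q6(int luma, int y_avg_q3, int alpha_q3,
                             int dc_pred_q6) {
        assert(luma >= 0 && luma <= 255);
        /* Bring luma to Q3 and remove the average: result is Q3. */
        const int luma_ac_q3 = luma * (1 << AVG_Q_BITS) - y_avg_q3;
        /* Q3 * Q3 = Q6, same scale as dc_pred_q6. */
        return dc_pred_q6 + alpha_q3 * luma_ac_q3;
      }

      /* Round a Q6 value to the nearest integer without shifting negatives. */
      static int round_q6(int value_q6) {
        const int half = 1 << (PRED_Q_BITS - 1);
        return value_q6 >= 0 ? (value_q6 + half) >> PRED_Q_BITS
                             : -((-value_q6 + half) >> PRED_Q_BITS);
      }
      ```

      With luma 100, an average of 96 (768 in Q3), alpha 0.5 (4 in Q3) and a
      dc_pred of 128 (8192 in Q6), the Q6 prediction is 8320, which rounds to
      130, matching 128 + 0.5 * (100 - 96).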
    • [CFL] Convert dc_pred to fixed point · 2e6cb7e7
      Luc Trudeau authored
      The dc_pred values stored in the CfL context are in Q8.7 (Worst case
      division will be of 1/128).
      
      Results on Subset1 (compared to f9684d222 with CfL enabled)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0118 | -0.0181 | -0.0109 |   0.0086 | 0.0086 |  0.0196 |     0.0018
      
      Change-Id: I0701e04fb76f03eff12ed01fd5fda675fbb15e32
      2e6cb7e7
    • [CFL] Fixed point implementation for tx average · bfe2827b
      Luc Trudeau authored
      This change does not impact the bitstream as no loss is incurred by using
      a fixed point value for the transform size average.
      
      For low bit depth, the transform size average is stored using Q8.10
      fixed point format. Worst case, smallest fraction is 1/1024.
      
      Results on Subset1 (Compared to 366b74 with CfL)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: Ia5b046b92a0e4c40e413b16af3394bdc0a8c8cd9
      bfe2827b
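      Why 10 fractional bits lose nothing can be seen in a small sketch: if
      the pixel count of a transform block is a power of two no larger than
      1024 (32x32), the average is an exact multiple of 1/1024. The function
      name and the 8-bit input are assumptions made for illustration.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Average of a width x height block of 8-bit luma, returned in Q10.
       * Exact whenever width * height is a power of two <= 1024. */
      static uint32_t tx_average_q10(const uint8_t *luma, int stride,
                                     int width, int height) {
        const uint32_t count = (uint32_t)(width * height);
        /* Power of two, at most 32x32 pixels. */
        assert(count > 0 && count <= 1024 && (count & (count - 1)) == 0);
        uint32_t sum = 0;
        for (int r = 0; r < height; ++r) {
          for (int c = 0; c < width; ++c) sum += luma[r * stride + c];
        }
        /* sum <= 255 * 1024, so sum << 10 fits in 32 bits; count divides
         * 1024, so the division has no remainder. */
        return (sum << 10) / count;
      }
      ```

      For a 4x4 block whose pixels sum to 49, the result is 3136, i.e.
      3136 / 1024 = 3.0625 = 49 / 16 with no rounding error.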
    • [CFL] Compute Average Over TX Block Instead of Pred Block · 03678940
      Luc Trudeau authored
      When computing alpha, multiple averages are computed, one for each
      transform block. The CfL prediction now uses the transform block average
      instead of partition block average.
      
      This allows the decoder to build the CfL prediction by using only the
      collocated reconstructed luma values for the current transform size and
      not the entire partition.
      
      Results on Subset 1 (Compared to 0e81b97c with CfL)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0180 |  0.2627 |  0.2274 |   0.0233 | 0.0301 |  0.0312 |     0.1506
      
      A small regression is expected; this change was made to simplify
      hardware implementations.
      
      Change-Id: Ib2ce2a3053b85300c5c62ef0e3270af489568a38
      03678940
    • [CFL] clip CFL prediction to avoid overflow · 5c453db2
      Luc Trudeau authored
      The value predicted using CfL is clipped to keep it within the range
      of uint8. Both overflow and underflow were detected on
      Subset1.
      
      Results on Subset1 (compared to 7e55571e with CfL enabled)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0019 |  0.0001 |  0.0009 |   0.0047 | 0.0020 |  0.0023 |     0.0012
      
      Change-Id: Ie1190e2286aa90542eaa68b814cc5cfa031acb73
      5c453db2
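      A minimal sketch of the clipping step, assuming the prediction is held
      in a wider signed integer before being stored as a pixel. The helper
      names are hypothetical and are not the functions used in the commit.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Clamp value into [lo, hi]. */
      static int clamp_int(int value, int lo, int hi) {
        assert(lo <= hi);
        return value < lo ? lo : (value > hi ? hi : value);
      }

      /* Store a signed prediction as an 8-bit pixel without wrap-around. */
      static uint8_t clip_to_u8(int prediction) {
        return (uint8_t)clamp_int(prediction, 0, 255);
      }
      ```

      Without the clamp, a prediction of 300 would wrap to 44 when converted
      to uint8_t, and -7 would wrap to 249; with it they become 255 and 0.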
  16. 03 Jul, 2017 2 commits
    • [CFL] Adjust Pixel Buffer for Chroma Sub8x8 · 780d249d
      Luc Trudeau authored
      Adjust row and col offset for sub8x8 blocks to allow the CfL prediction
      to use all available reconstructed luma pixels.
      
      Results on Subset 1 (Compared to b03c2f44 with CfL)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.1355 | -0.8517 | -0.4481 |  -0.0579 | -0.0237 | -0.0203 |    -0.2765
      
      Change-Id: Ia91f0a078f0ff4f28bb2d272b096f579e0d04dac
      780d249d
    • Remove Unused UPDATE_RD_COST macro · 858e2388
      Guillaume Martres authored
      It stopped being used after 09302f5a
      
      Change-Id: Ie7d567c787a4120f8b73378b3a82267249a82e3d
      858e2388
  17. 29 Jun, 2017 1 commit
    • [CFL] Better encapsulation · 3dc55e0f
      Luc Trudeau authored
      The function cfl_compute_parameters is added and contains the logic
      related to building the CfL context parameters. As such, many cfl
      functions can now be encapsulated inside of cfl.c and not exposed to the
      rest of AV1.
      
      This also allows for supplemental asserts that validate that the CfL
      context is properly built.
      
      Results on Subset1 (compared to 9c6f8547 with CfL)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I6d14a426416b3af5491bdc145db7281b5e988cae
      3dc55e0f
  18. 28 Jun, 2017 2 commits
  19. 27 Jun, 2017 1 commit
    • [CFL] Sum Alpha Distortion Over Transform Block · 8fb4c9e7
      Luc Trudeau authored
      This change does not impact the bitstream; it changes how the distortion
      is summed when evaluating alpha. The sum is still taken over the entire
      partition. However, instead of iterating over the entire surface all at
      once, CfL now iterates over each transform block. This is in light of
      future work to compute alpha over transform blocks and not prediction
      blocks.
      
      Results on Subset1 (compared to 9c6f8547 with CfL)
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: Ic7b72201d29ad6b2527748e35b212bec515e3bdb
      8fb4c9e7
  20. 24 Jun, 2017 1 commit
    • Pass mbmi into get_scan() · bd99b38c
      Angie Chiang authored
      This is to facilitate future experiments related to adapt_scan.
      
      Change-Id: I51628f3df81bd82db7f8f553d13da0ee5792d7d9
      bd99b38c
  21. 21 Jun, 2017 1 commit
    • cb4x4: Move sub-4X4 block sizes behind chroma flags. · 81ec2619
      Timothy B. Terriberry authored
      cb4x4 itself should not require these sizes.
      
      This simplifies compatibility with other experiments, since we can
      first make them work with cb4x4 (which is now on by default), and
      then worry about chroma_sub8x8 and chroma_2x2 (which is not) in
      separate steps.
      
      Encoder and decoder output should remain unchanged.
      
      Change-Id: Iff2a5494cab3b7d96f881e8bd9cd4bf18c817cfa
      81ec2619
  22. 20 Jun, 2017 1 commit