1. 18 Aug, 2017 1 commit
  2. 15 Aug, 2017 2 commits
    • Monty Montgomery's avatar
      Add CONFIG_DAALA_DCT64 experiment. · a4e245a9
      Monty Montgomery authored
      This experiment replaces the 64-point Type-II DCT and related
      scaling vp9 transforms with the 64-point orthonormal
      Daala transforms.
      
      subset-1:
      
          monty-square-baseline-s1-F2@2017-07-28T03:35:45.962Z ->
            monty-square-dct64-s1-F2@2017-07-29T04:50:58.412Z
      
             PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
          -0.1930 | -0.2037 | -0.0643 |  -0.1917 | -0.2331 | -0.3510 |    -0.1810
      
      objective-1-fast:
      
          monty-square-baseline-o1f-F2@2017-07-28T03:35:35.533Z ->
            monty-square-dct64-o1f-F2@2017-07-29T04:50:28.542Z
      
             PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
          -0.2557 | -0.1743 | -0.4900 |  -0.3028 | -0.4147 | -0.5764 |    -0.2864
      
      Change-Id: I1f944df29e44d2e350c42555af274f2d75a62a92
      a4e245a9
    • Monty Montgomery's avatar
      Add CONFIG_DAALA_DCT32 experiment. · 2cb52baf
      Monty Montgomery authored
      This experiment replaces the 32-point Type-II DCT and 32-point
      Type-IV DST scaling vp9 transforms with the 32-point orthonormal
      Daala transforms.
      
      subset-1:
      
          monty-square-baseline-s1-F3@2017-08-02T11:50:51.375Z ->
            monty-square-dct32-s1-F3@2017-08-02T11:50:18.859Z
      
            PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
          0.0000 |  0.0115 | -0.1044 |  -0.0185 | -0.0069 | -0.0603 |     0.0555
      
      objective-1-fast (4 frames):
      
          monty-square-baseline-o1f-F3-l4-fine@2017-08-12T02:18:05.560Z ->
            monty-square-dct32-o1f-F3-l4-fine@2017-08-12T02:19:44.461Z
      
            PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
         -0.0269 | -0.0715 |     N/A |  -0.0547 | -0.0268 | -0.0590 |        N/A
      
      Change-Id: Ib1bad991d82eb67956e94a6216298a84e908b169
      2cb52baf
  3. 08 Aug, 2017 3 commits
  4. 04 Aug, 2017 1 commit
    • Sarah Parker's avatar
      Avoid using MRC_DCT when the mask produced is invalid · c5ccd4ca
      Sarah Parker authored
      If the mask is invalid, do not allow the encoder to select MRC_DCT.
      Currently the mask is invalid if it is all 1 or all 0, but these
      criteria will likely expand in a future patch.
      
      Change-Id: I77230ea8357bfdb2bf1e6338903d44bbf1db22d1
      c5ccd4ca
  5. 03 Aug, 2017 1 commit
  6. 02 Aug, 2017 1 commit
    • Angie Chiang's avatar
      Add txmg experiment · ad653a39
      Angie Chiang authored
      This experiment aims at merging lbd/hbd txfms
      
      So far this exp uses hbd transform on lbd path.
      The performances I observed are
      lowres -0.089%
      midres  0.065%
      (negative means performance drop)
      
      Started from here, two main things are needed to be done.
      1) Fix overflow due to quantizer noise
      2) Generate a 16-bit version from the hbd txfm
      
      Change-Id: I35bb1fc0cbb78decad2570ff5826ed665f739752
      ad653a39
  7. 29 Jul, 2017 1 commit
    • Monty Montgomery's avatar
      Add CONFIG_DAALA_DCT16 experiment. · cb9c1c52
      Monty Montgomery authored
      This experiment replaces the 16-point Type-II DCT and 16-point Type-IV
      DST scaling vp9 transforms with the 16-point orthonormal Daala
      transforms.  These have reduced complexity and are perfect
      reconstruction.  There is currently no net coding performance impact.
      
      subset-1:
      
        monty-square-baseline-s1-F@2017-07-23T03:43:45.042Z ->
           monty-square-dct16-s1-F@2017-07-23T03:42:29.805Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0152 | -0.0028 | -0.0929 |  -0.0432 | -0.0457 | -0.0425 |    -0.0237
      
        objective-1-fast:
      
        monty-square-baseline-o1f-F@2017-07-23T03:44:19.973Z ->
           monty-square-dct16-o1f-F@2017-07-23T03:43:22.549Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0305 |  0.0926 | -0.1600 |   0.0471 | 0.0219 | -0.0075 |     0.0135
      
      Change-Id: I54fed26d65fd8450693334bb400b1fafd7e0dacb
      cb9c1c52
  8. 28 Jul, 2017 1 commit
    • Urvang Joshi's avatar
      Fix logical errors for TX64x64. · 9136ab7d
      Urvang Joshi authored
      - Wrong function argument fix: this was not caught by compile test
      because DCT_DCT has a value of 0, which was converted to a NULL pointer.
      - Wrong prob array size.
      
      Change-Id: Iaf1747dc7fb40db1d1ab35f965fb60994d8dec95
      9136ab7d
  9. 26 Jul, 2017 3 commits
    • Yue Chen's avatar
      rect_tx_ext: work with var_tx · d6bdd46b
      Yue Chen authored
      Change-Id: Ie2c34490dc50cb242bcd701308e6b55243883b15
      d6bdd46b
    • Sarah Parker's avatar
      Add txfm functions corresponding to MRC_DCT · 5b8e6d2d
      Sarah Parker authored
      MRC_DCT uses a mask based on the prediction signal to modify the
      residual before applying DCT_DCT. This adds all necessary functions
      to perform this transform and makes the prediction signal available
      to the 32x32 txfm functions so the mask can be created. I am still
      experimenting with different types of mask generation functions and
      so this patch contains a placeholder. This patch has no impact on
      performance.
      
      Change-Id: Ie3772f528e82103187a85c91cf00bb291dba328a
      5b8e6d2d
    • Monty Montgomery's avatar
      Add CONFIG_DAALA_DCT8 experiment. · cf18fe4e
      Monty Montgomery authored
      This experiment replaces the 8-point Type-II DCT and 8-point Type-IV DST
       scaling vp9 transforms with the 8-point orthonormal Daala transforms.
      These have reduced complexity and are perfect reconstruction at the cost
       of a slightly worse coding performance.
      This is because the Daala transforms expect the input to be shifted by 4
       bits but the output scale of the vp9 transforms is only 3 bits.
      
      subset-1:
      
      monty-square-baseline-subset1 ->
        monty-square-dct8-subset1@2017-07-17T21:37:44.281Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0019 | -0.0011 | -0.0585 |  -0.0111 | 0.0305 |  0.0317 |     0.0187
      
      objective-1-fast:
      
      monty-square-baseline-o1f ->
        monty-square-dct8-o1f@2017-07-17T21:37:15.735Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0285 |  0.0129 | -0.5080 |   0.0529 | 0.0345 |  0.0441 |     0.0054
      
      Change-Id: I2b775495398fb717204a295397c3c5e3ca938183
      cf18fe4e
  10. 20 Jul, 2017 1 commit
    • Sarah Parker's avatar
      Add new MRC_DCT tx type · 53f93dbd
      Sarah Parker authored
      This adds the new transform to the list of possible transforms.
      The impact on performance is in the noise range because the transform
      implementation currently performs DCT as a placeholder. This transform
      will initially only have an implementation for TX_32X32 and it is
      skipped in the tx search for smaller transform sizes.
      
      Change-Id: Iab2faddc525b478ca06972a753428a4f4ef53ac6
      53f93dbd
  11. 19 Jul, 2017 1 commit
    • Jingning Han's avatar
      Rework txk_type indexing system for chroma component · 19b5c8fa
      Jingning Han authored
      Use the row and column indexes to fetch txk_type, which allows the
      chroma components to derive the tx type from the corresponding luma
      components. It improves the coding performance of txk-sel by 0.18%.
      
      Change-Id: I3f4bca5839e13ae95e51053e76cd86fe58202ac9
      19b5c8fa
  12. 17 Jul, 2017 2 commits
    • Lester Lu's avatar
      Unify FWD_TXFM_PARAM and INV_TXFM_PARAM · 27319b6e
      Lester Lu authored
      Change two similar structs, FWD_TXFM_PARAM and INV_TXFM_PARAM,
      into a common struct: TxfmParam. Its definition is moved to
      aom_dsp/txfm_common.h to simplify dependency.
      
      This change is made so that, in later changes of the LGT
      experiment, functions requiring FWD_TXFM_PARAM and
      INV_TXFM_PARAM, such as get_fwd_lgt4 and get_inv_lgt4, can
      also be unified.
      
      Change-Id: I756b0176a02314005060adbf8e62386f10eeb344
      27319b6e
    • hui su's avatar
      refactor get_tx_size() and get_uv_tx_size() · 0c6244b6
      hui su authored
      Change-Id: I802c9e41ebfed090b5ad8300917aad5e16ad026a
      0c6244b6
  13. 14 Jul, 2017 1 commit
  14. 12 Jul, 2017 2 commits
    • Monty Montgomery's avatar
      Add CONFIG_DAALA_DCT4 experiment. · 02078a38
      Monty Montgomery authored
      This experiment replaces the 4-point Type-II scaled-output vp9 DCT
       transform with the 4-point Type-II orthonormal Daala DCT transform.
      Right now the CONFIG_DAALA_DCT4 experiment depends on CONFIG_DCT_ONLY
       as it does not add an orthonormal 4-point DST.
      
      subset-1:
      
      monty-baseline-dctonly-squaretx-subset1 ->
        monty-dct4-dctonly-squaretx-subset1-rerun
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0055 | -0.0132 | -0.0405 |   0.0261 | 0.0005 |  0.0246 |     0.0226
      
      objective-1-fast:
      
      monty-baseline-dctonly-squaretx-o1f ->
        monty-dct4-dctonly-squaretx-o1f
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0215 | -0.1573 |     N/A |  -0.0131 | -0.0347 | -0.0390 |    -0.1121
      
      Change-Id: Idef8f6e5525037d5bbb2d0927675c21d1922d69a
      02078a38
    • Monty Montgomery's avatar
      Minor refactor to match the 4x4 forward transform. · 554d2c33
      Monty Montgomery authored
      Change-Id: Ib5337dfa78b73059ad169ca98a07119aa991864b
      554d2c33
  15. 11 Jul, 2017 1 commit
    • Monty Montgomery's avatar
      Add CONFIG_DCT_ONLY experiment. · cb55dad1
      Monty Montgomery authored
      Building with --enable-dct_only will force the encoder to use only
       tx_type == DCT_DCT.
      This experiment gives a loss and is only added for testing.
      
      subset-1:
      
      master@2017-02-21T01:23:58.825Z ->
       master-dct_only@2017-02-21T02:57:28.585Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      2.5467 |  1.0524 |  0.9171 |   1.8849 | 2.6626 |  2.4995 |     1.8402
      
      objective-1-fast:
      
      master@2017-02-21T01:47:43.790Z ->
       master-dct_only@2017-02-20T16:54:03.578Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      1.6625 |  0.3948 |  0.3368 |   1.5268 | 1.7142 |  1.7097 |     1.0743
      
      Change-Id: I19b738f3d1a450bc50422149ac42bc184bfae08a
      cb55dad1
  16. 10 Jul, 2017 1 commit
    • Lester Lu's avatar
      Inter and intra LGTs · 708c1ec5
      Lester Lu authored
      Here we have an LGT to replace ADST for intra residual blocks, and
      another LGT to replace ADST for inter residual blocks. The changes
      are only applied to transform length 4 and 8, and only for the
      lowbitdepth path.
      
      lowres: -0.18%
      
      Change-Id: Iadc1e02b53e3756b44f74ca648cfa8b0e8ca7af4
      708c1ec5
  17. 07 Jul, 2017 1 commit
    • Lester Lu's avatar
      Signature changes for the LGT experiment · d8b1ddce
      Lester Lu authored
      The input arguments of av1_fht* and av1_iht* functions (and their
      HBD versions) are slightly changed. Input arguments tx_type and
      bd are carried by a struct fwd_txfm_param/inv_txfm_param. This
      struct is meant to later on carry other prediction information,
      such as intra top/left boundaries to the transform level, so
      that the choice of transforms can be more adaptive to the
      prediction mode and local video content.
      
      Change-Id: Ia42544248a51845be64b72855b642ef1fe5910a9
      d8b1ddce
  18. 29 Jun, 2017 1 commit
  19. 27 Jun, 2017 1 commit
    • Yi Luo's avatar
      Fix inv txfm low/high bitdepth selection logic · 51281095
      Yi Luo authored
      We are going to have several commits to setup new low/high
      bitdepth data path selection logic. This patch is for inverse
      transform. Let me summarize the ideas as following.
      
      - For low/high bitdepth selection, encoder depends on
        input configuration, e.g., video sequence bitdepth,
        profile. Decoder depends on input bitstream. This has
        nothing to do with compiler/build  configuration.
      
      - Typical encoder usage for sampling format 4:2:0.
        1) 8-bit video sequence:
         a) --profile=0
         Fastest encoding/decoding pipeline on speedup.
      
         b) --profile=2 --bit-depth=10
         Image pixels are left shifted by 2 bits. It
         employs 16-bit reference frame buffer and has high
         calculation precision. It usually enjoys higher
         compression performance.
      
        2) 10/12-bit video sequence (HDR):
         --profile=2 --bit-depth=10/12
      
      - Transform coefficient type:
        Lowbitdepth:  int16_t
        Highbitdepth: int32_t
      
      - The type, tran_low_t is still used in codebase,
        Which is int32_t, defining the data path capacity.
        Naturally, it is high bitdepth.
      
      Eventually we shall remove the configuration flags,
      CONFIG_HIGHBITDEPTH/CONFIG_LOWBITDEPTH, and seperate
      low and high bitdepth data path. Two data paths co-exist
      in the same build environment.
      
      Change-Id: I35c06d4d4f19ebf80d909168fdddbae57c3cc884
      51281095
  20. 26 Jun, 2017 1 commit
    • Lester Lu's avatar
      New experiment: LGT · ad8290b8
      Lester Lu authored
      In previous ADSTs, DST-7 and DST-4 are used for length 4 and length
      8/16/32, respectively. In this LGT experiment we explore transforms
      between DST-4 and DST-7. When CONFIG_LGT flag is on, adst4 and adst8
      are replaced by lgt4 and lgt8, the intermediate transforms with
      pre-chosen parameters.
      
      The LGTs applied here are lgt4_160 and lgt8_170, where the numbers
      mean the self-loop weights times 100. The associated values for DST-7
      and DST-4 are 100 and 200.
      
      ovr_psnr:
      lowres: -0.140
      midres: -0.131
      hdres: -0.078
      
      These changes are not applied to the highbd scenario in the
      current version.
      
      Change-Id: I20600456da8766528b2b6b11aa28801e70af498e
      ad8290b8
  21. 12 Jun, 2017 1 commit
    • Sarah Parker's avatar
      Clean up hbd transform code · 30dfa883
      Sarah Parker authored
      Responding to some left over cosmetic comments from
      2b5cdb1cf87c933331a16cc0221455d0a8c255e1
      
      Change-Id: I42e126593526cedd6675adf35b9c1df78e1ddf54
      30dfa883
  22. 08 Jun, 2017 1 commit
    • Sarah Parker's avatar
      Remove deprecated high-bitdepth functions · 31c66502
      Sarah Parker authored
      This unifies the codepath for high-bitdepth transforms and deletes
      all calls to the old deprecated versions. This required reworking
      the way 1d configurations are combined in order to support rectangular
      transforms.
      
      There is one remaining codepath that calls the deprecated 4x4 hbd
      transform from encoder/encodemb.c. I need to take a closer look
      at what is happening there and will leave that for a followup
      since this change has already gotten so large.
      
      lowres 10 bit: -0.035%
      lowres 12 bit: 0.021%
      
      BUG=aomedia:524
      
      Change-Id: I34cdeaed2461ed7942364147cef10d7d21e3779c
      31c66502
  23. 01 Jun, 2017 1 commit
    • Timothy B. Terriberry's avatar
      cb4x4: Move sub-4X4 TX sizes behind CONFIG_CHROMA_2X2. · fe67ed6a
      Timothy B. Terriberry authored
      cb4x4 itself should not require these sizes.
      
      This simplifies compatibility with other experiments, since we can
      first make them work with cb4x4 (which is now on by default), and
      then worry about chroma_2x2 (which is not) in separate steps.
      
      Encoder and decoder output should remain unchanged.
      
      Change-Id: I4e9fcdae49f238b5099a3c74a398fe993c2545f8
      fe67ed6a
  24. 20 May, 2017 1 commit
    • hui su's avatar
      DPCM intra coding experiment · b8a6fd6b
      hui su authored
      Encode a block line by line, horizontally or vertically. In the vertical
      mode, each row is predicted by the reconsturcted row above;
      in the horizontal mode, each column is predicted by the reconstructed
      column to the left.
      
      The DPCM modes are enabled automatically for blocks with horizontal or
      vertical prediction mode, and 1D transform types (ext-tx).
      
      Change-Id: I133ab6b537fa24a6e314ee1ef1d2fe9bd9d56c13
      b8a6fd6b
  25. 19 May, 2017 3 commits
    • Jonathan Matthews's avatar
      Fix highbd DCT and ADST data overwriting issue · 362d0c7b
      Jonathan Matthews authored
      Exposed by Change-Id: I048c6e9cc790520247cc21ae9b92a9c8d84d00a7
      
      BUG=aomedia:525
      
      Change-Id: Ia83f8a8efcf0eac4912f247f38887c0dd533da85
      362d0c7b
    • Sarah Parker's avatar
      Add configurations for hbd identity transform · 3eed4175
      Sarah Parker authored
      This adds the proper cfgs to av1_{inv/fwd}_txfm1d_cfg for
      the identity transform so all hbd transforms can use
      the same codepath. This has no impact on performance
      since the new identity transforms that correspond with
      the cfgs are not yet being called. Once this is checked in,
      we should be able to delete all deprecated transform functions
      and have a single code flow for all hbd transforms.
      
      BUG=aomedia:524
      
      Change-Id: I3d1bfbc8bc29b367e8ddf7dcd27525af0bd31067
      3eed4175
    • Yue Chen's avatar
      Enable 1:4/4:1 transform for 8x16 and 16x8 luma blocks · 56e226e3
      Yue Chen authored
      It gives 0.1% gain on lowres and midres
      
      Change-Id: I555a492a68571c525713840d73aa5614fe80a87d
      56e226e3
  26. 18 May, 2017 1 commit
    • Sarah Parker's avatar
      Refactor hbd txfm configurations to be 1D · eec47e65
      Sarah Parker authored
      The hbd transform configurations were originally written for all possible
      2d transforms. Now that there are many more possible 2d transforms
      due to EXT_TX and RECT_TX, it is simpler to write the cfg for the
      4 1D transform types and compose them to make all new possible transform
      types. This will allow for an easier integration of the identity transform
      for EXT_TX and rectangular transforms for RECT_TX into the current
      hbd transform codepath and facilitate the removal of obsolete transforms.
      This has no impact on performance.
      
      BUG=aomedia:524
      
      Change-Id: I1e217bcd217fd637b1df94fae62d9c59a0523c1a
      eec47e65
  27. 15 May, 2017 2 commits
  28. 11 May, 2017 1 commit
    • Yi Luo's avatar
      Partial IDCT 32x32 avx2 · 40f22ef8
      Yi Luo authored
      - Function level improvement (ms):
      Functions       ssse3  avx2   Percentage
      idct32x32_1024  794    374    52.9%
      idct32x32_135   354    169    52.2%
      idct32x32_34    197    142    27.9%
      idct32x32_1     n/a     26    n/a
      
      - Integrating in default scan order.
      
      Change-Id: I84815112b26b8a8cb800281a1cfb1706342af57d
      40f22ef8
  29. 08 May, 2017 1 commit
    • Yi Luo's avatar
      Partial IDCT 16x16 avx2 · f6176abb
      Yi Luo authored
      - Function level improvement:
      functions      sse2  avx2  percentage
      idct16x16_256  365   226   38%
      idct16x16_38   n/a   136   n/a
      idct16x16_10   171   110   35%
      idct16x16_1     34    26   23%
      
      - Integrated in AV1 for default scan order.
      
      Change-Id: Ieb1a8e730bea9c371ebc0e5f4a748640d8f5e921
      f6176abb
  30. 04 May, 2017 1 commit