1. 06 Sep, 2017 1 commit
    • Remove global motion from compressed header · 3e579a60
      Sarah Parker authored
      This requires making temporary copies of the functions in
      binary_codes_writer/reader that take the aom_write_bit_buffer type.
      
      Change-Id: Idb60b29cff69b45224535c6e6a4079a34a2c6871
  2. 05 Sep, 2017 2 commits
    • Remove the EC_SMALLMUL experimental flag. · f9ef4f6b
      Timothy B. Terriberry authored
      This experiment has been fully adopted and is now an integral part
      of the draft AV1 bitstream definition.
      
      objdump -d libaom.a gives identical output before and after this
      patch.
      
      Change-Id: I6f936f4b10de23a9471e0ccadf9cf178fb62be69
    • Define missing subtract_xxx functions in highbd_subtract_sse2.c · 4b5c2bb4
      Rupert Swarbrick authored
      Also, get rid of the boilerplate code using some macros. STACK_V(h,f) means
      "call f twice, stacking vertically at an offset of h". STACK_H(w,f)
      means "call f twice, stacking horizontally at an offset of w".
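
      The stacking idea can be sketched in scalar C as follows. This is only
      an illustration, not the code from the commit: the base function
      subtract_4x4, the argument names, and the macro bodies are assumptions
      chosen to match the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar base case: diff = src - pred over a 4x4 block. */
static void subtract_4x4(int16_t *diff, ptrdiff_t diff_stride,
                         const uint16_t *src, ptrdiff_t src_stride,
                         const uint16_t *pred, ptrdiff_t pred_stride) {
  for (int r = 0; r < 4; ++r) {
    for (int c = 0; c < 4; ++c) {
      diff[r * diff_stride + c] =
          (int16_t)(src[r * src_stride + c] - pred[r * pred_stride + c]);
    }
  }
}

/* Call f twice; the second call starts h rows further down. */
#define STACK_V(h, f)                                                \
  do {                                                               \
    f(diff, diff_stride, src, src_stride, pred, pred_stride);        \
    f(diff + (h) * diff_stride, diff_stride, src + (h) * src_stride, \
      src_stride, pred + (h) * pred_stride, pred_stride);            \
  } while (0)

/* Call f twice; the second call starts w columns further right. */
#define STACK_H(w, f)                                                \
  do {                                                               \
    f(diff, diff_stride, src, src_stride, pred, pred_stride);        \
    f(diff + (w), diff_stride, src + (w), src_stride, pred + (w),    \
      pred_stride);                                                  \
  } while (0)

/* Larger sizes (width x height) are built by stacking smaller ones. */
static void subtract_4x8(int16_t *diff, ptrdiff_t diff_stride,
                         const uint16_t *src, ptrdiff_t src_stride,
                         const uint16_t *pred, ptrdiff_t pred_stride) {
  STACK_V(4, subtract_4x4);
}

static void subtract_8x8(int16_t *diff, ptrdiff_t diff_stride,
                         const uint16_t *src, ptrdiff_t src_stride,
                         const uint16_t *pred, ptrdiff_t pred_stride) {
  STACK_H(4, subtract_4x8);
}
```

      With this layout, each new block size costs one short function
      definition rather than a hand-written loop.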
      
      Note that functions like subtract_128x64 are now only defined when the
      equivalent block sizes (e.g. BLOCK_128x64) are defined. As such, we
      have to fix up subtract_test.cc so it doesn't try to call
      aom_highbd_subtract_block_sse2 with unsupported sizes.
      
      BUG=aomedia:684
      
      Change-Id: I5b0fefe70e4083786d11d25cdd5dcf02823bae7b
  3. 30 Aug, 2017 1 commit
    • Highbd parallel_deblocking sse2 optimization · 6f5569f3
      Yi Luo authored
      - Decoder speed improves ~13.7% (baseline + parallel_deblocking).
      - Highbd loopfilter AVX2 version works when this experiment is
        disabled.
      
      Change-Id: I5d56b137a1d52236a4735656c370d57ef71ae043
  4. 22 Aug, 2017 2 commits
    • Refactor lgt · 918fe698
      Lester Lu authored
      Change get_lgt in order to integrate a later experiment
      lgt_from_pred with lgt. There are two main changes.
      
      The main purpose for this change is to unify get_fwd_lgt and
      get_inv_lgt functions into a get_lgt function so the lgt basis
      functions can always be selected through the same function in
      both forward and inverse transform paths. The structure of those
      functions will also be consistent with the get_lgt_from_pred
      functions that will be added in the lgt-from-pred experiment.
      
      These changes have no impact on the bitstream.
      
      Change-Id: Ifd3dfc1a9e1a250495830ddbf42c201e80aa913e
    • Initialize lv-map syntax probability model · fdaa55ed
      Jingning Han authored
      Initialize the cdf model for level map syntax elements.
      
      Change-Id: I3865e07c126eb4c856803c12485b05782dea6526
  5. 15 Aug, 2017 6 commits
    • Add 4-point DST to DAALA_DCT4 experiment · 573cf25f
      Monty Montgomery authored
      CONFIG_DAALA_DCT4 currently force-enables CONFIG_DCT_ONLY due to a
      missing 4-point DST.  The DST had not been included because it was a
      significant coding performance loss; this turned out to be a bug that
      has since been corrected.
      
      This patch adds a 4-point type IV DST to the DAALA_DCT4 experiment.
      There is a small coding performance loss in using the type IV over
      AV1's current type VII.
      
      subset-1:
         monty-newdst4test-baseline-s1-F@2017-07-29T04:58:43.976Z ->
            monty-newdst4test-daala-s1-F@2017-07-29T04:59:56.094Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0336 |  0.1393 |  0.0491 |   0.4118 | -0.0439 |  0.2084 |     0.0476
      
      objective-1-fast:
         monty-newdst4test-baseline-o1f-F@2017-07-29T04:58:10.439Z ->
            monty-newdst4test-daala-o1f-F@2017-07-29T04:59:04.678Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      0.0064 |  0.1071 | -0.0108 |   0.1133 | -0.0035 |  0.0765 |     0.0502
      
      Change-Id: Ie29835edbe0e41bc86f4b09457e88d924cc9bf7e
    • Add CONFIG_DAALA_DCT64 experiment. · a4e245a9
      Monty Montgomery authored
      This experiment replaces the 64-point Type-II DCT and related
      scaling vp9 transforms with the 64-point orthonormal
      Daala transforms.
      
      subset-1:
      
          monty-square-baseline-s1-F2@2017-07-28T03:35:45.962Z ->
            monty-square-dct64-s1-F2@2017-07-29T04:50:58.412Z
      
             PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
          -0.1930 | -0.2037 | -0.0643 |  -0.1917 | -0.2331 | -0.3510 |    -0.1810
      
      objective-1-fast:
      
          monty-square-baseline-o1f-F2@2017-07-28T03:35:35.533Z ->
            monty-square-dct64-o1f-F2@2017-07-29T04:50:28.542Z
      
             PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
          -0.2557 | -0.1743 | -0.4900 |  -0.3028 | -0.4147 | -0.5764 |    -0.2864
      
      Change-Id: I1f944df29e44d2e350c42555af274f2d75a62a92
    • aom_dsp: regularize EXT_PARTITION_TYPES handling. · ccfdfce1
      Ralph Giles authored
      aom_dsp_rtcd_defs.pl compares most CONFIG_* keys to "yes"
      to see if they're set. The script was checking just
      
        if (aom_config("CONFIG_EXT_PARTITION_TYPES"))
      
      in some cases. The build system doesn't add disabled
      configuration options to libs.mk, so this is effectively
      the same. However, it means that setting the config
      key explicitly to 0 or "no" in the config headers
      was treated the same as setting it to 1 or "yes",
      and aom_dsp_rtcd.h would have opposite expectations
      from aom_config.h or aom_config.asm.
      
      Treat this key similarly to others for consistency.
      
      Change-Id: I27bd7a5532ba4afc2bb289b43b57a1b1971c0348
    • Remove ALT_INTRA flag. · 93b543ab
      Urvang Joshi authored
      This experiment has been adopted as it has been cleared by Tapas.
      
      Change-Id: I0682face60f62dd43091efa0a92d09d846396850
    • AOM_QM: enable by default · 181fc08f
      Thomas Davies authored
      No change to metrics, as quantization matrices are not used
      unless --enable-qm=1 is set on the command line.
      
      Fix no highbitdepth compilation, and fix compile errors and
      warnings for PVQ and NEW_QUANT experiments.
      
      Change-Id: I49aceb5acf6ca6790c81e760e5b208788f87086d
    • Add CONFIG_DAALA_DCT32 experiment. · 2cb52baf
      Monty Montgomery authored
      This experiment replaces the 32-point Type-II DCT and 32-point
      Type-IV DST scaling vp9 transforms with the 32-point orthonormal
      Daala transforms.
      
      subset-1:
      
          monty-square-baseline-s1-F3@2017-08-02T11:50:51.375Z ->
            monty-square-dct32-s1-F3@2017-08-02T11:50:18.859Z
      
            PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
          0.0000 |  0.0115 | -0.1044 |  -0.0185 | -0.0069 | -0.0603 |     0.0555
      
      objective-1-fast (4 frames):
      
          monty-square-baseline-o1f-F3-l4-fine@2017-08-12T02:18:05.560Z ->
            monty-square-dct32-o1f-F3-l4-fine@2017-08-12T02:19:44.461Z
      
            PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
         -0.0269 | -0.0715 |     N/A |  -0.0547 | -0.0268 | -0.0590 |        N/A
      
      Change-Id: Ib1bad991d82eb67956e94a6216298a84e908b169
  6. 14 Aug, 2017 2 commits
  7. 11 Aug, 2017 1 commit
    • Simplify pixel clamping in highbitdepth loop filter · 099b1221
      Yi Luo authored
      The constants used in pixel clamping are based on bitdepth.
      Their calculation is moved outside the pixel clamping and is
      done only once. This gives a decoder speed improvement of up
      to about 2%.
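
      The hoisting described above can be sketched in scalar C as follows.
      This is an illustration only: the commit touches the real loop filter
      code, while the ClampRange type, the function names, and the assumed
      range of [-128 << (bd - 8), (128 << (bd - 8)) - 1] are hypothetical.

```c
#include <assert.h>

/* Hypothetical holder for the bitdepth-dependent clamp limits. */
typedef struct {
  int lo;
  int hi;
} ClampRange;

/* Derive the limits from the bit depth (assumed to be 8, 10 or 12). */
static ClampRange clamp_range_for_bitdepth(int bd) {
  assert(bd == 8 || bd == 10 || bd == 12);
  const int shift = bd - 8;
  ClampRange r = { -(128 << shift), (128 << shift) - 1 };
  return r;
}

/* The per-pixel clamp only compares against precomputed limits. */
static int clamp_to_range(int v, const ClampRange *r) {
  return v < r->lo ? r->lo : (v > r->hi ? r->hi : v);
}

/* The limits are computed once per call instead of once per pixel. */
static void clamp_row(int *vals, int n, int bd) {
  const ClampRange r = clamp_range_for_bitdepth(bd);
  for (int i = 0; i < n; ++i) vals[i] = clamp_to_range(vals[i], &r);
}
```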
      
      Change-Id: I48dcaebe04a3478962c3b6568d247a23b47a89d4
  8. 10 Aug, 2017 3 commits
  9. 08 Aug, 2017 1 commit
    • Refactor quantization C code. · f3b5ee14
      Thomas Davies authored
      This commit de-duplicates C reference quantization code
      and unifies quantization matrix (QM) and non-QM code
      paths when there is no SIMD.
      
      The reorganisation will also facilitate re-using SIMD quant
      functions for QM when the matrix is flat, as is the
      default when AOM_QM is enabled.
      
      Change-Id: Idbfdac9eb9a31adcffe734aac1877d58b86fab77
  10. 04 Aug, 2017 2 commits
  11. 03 Aug, 2017 1 commit
  12. 29 Jul, 2017 1 commit
    • Add CONFIG_DAALA_DCT16 experiment. · cb9c1c52
      Monty Montgomery authored
      This experiment replaces the 16-point Type-II DCT and 16-point Type-IV
      DST scaling vp9 transforms with the 16-point orthonormal Daala
      transforms.  These have reduced complexity and provide perfect
      reconstruction.  There is currently no net coding performance impact.
      
      subset-1:
      
        monty-square-baseline-s1-F@2017-07-23T03:43:45.042Z ->
           monty-square-dct16-s1-F@2017-07-23T03:42:29.805Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0152 | -0.0028 | -0.0929 |  -0.0432 | -0.0457 | -0.0425 |    -0.0237
      
      objective-1-fast:
      
        monty-square-baseline-o1f-F@2017-07-23T03:44:19.973Z ->
           monty-square-dct16-o1f-F@2017-07-23T03:43:22.549Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0305 |  0.0926 | -0.1600 |   0.0471 | 0.0219 | -0.0075 |     0.0135
      
      Change-Id: I54fed26d65fd8450693334bb400b1fafd7e0dacb
  13. 26 Jul, 2017 2 commits
    • Add txfm functions corresponding to MRC_DCT · 5b8e6d2d
      Sarah Parker authored
      MRC_DCT uses a mask based on the prediction signal to modify the
      residual before applying DCT_DCT. This adds all necessary functions
      to perform this transform and makes the prediction signal available
      to the 32x32 txfm functions so the mask can be created. I am still
      experimenting with different types of mask generation functions and
      so this patch contains a placeholder. This patch has no impact on
      performance.
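
      The masking step can be sketched as follows. The commit states that
      the mask generation is still a placeholder, so the threshold rule
      below is purely an assumption for illustration, as are the function
      names; the masked residual would then be passed to DCT_DCT.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placeholder rule: keep a pixel when its prediction value
 * exceeds a threshold. */
static void build_mask_from_pred(const uint8_t *pred, int n,
                                 uint8_t threshold, uint8_t *mask) {
  assert(n >= 0);
  for (int i = 0; i < n; ++i) mask[i] = (uint8_t)(pred[i] > threshold);
}

/* Zero the residual wherever the mask is 0; other samples pass through
 * unchanged. */
static void apply_mask_to_residual(int16_t *residual, const uint8_t *mask,
                                   int n) {
  for (int i = 0; i < n; ++i) {
    if (!mask[i]) residual[i] = 0;
  }
}
```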
      
      Change-Id: Ie3772f528e82103187a85c91cf00bb291dba328a
    • Add CONFIG_DAALA_DCT8 experiment. · cf18fe4e
      Monty Montgomery authored
      This experiment replaces the 8-point Type-II DCT and 8-point Type-IV DST
      scaling vp9 transforms with the 8-point orthonormal Daala transforms.
      These have reduced complexity and provide perfect reconstruction, at the
      cost of slightly worse coding performance. The performance cost arises
      because the Daala transforms expect the input to be shifted by 4 bits,
      but the output scale of the vp9 transforms is only 3 bits.
      
      subset-1:
      
      monty-square-baseline-subset1 ->
        monty-square-dct8-subset1@2017-07-17T21:37:44.281Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0019 | -0.0011 | -0.0585 |  -0.0111 | 0.0305 |  0.0317 |     0.0187
      
      objective-1-fast:
      
      monty-square-baseline-o1f ->
        monty-square-dct8-o1f@2017-07-17T21:37:15.735Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0285 |  0.0129 | -0.5080 |   0.0529 | 0.0345 |  0.0441 |     0.0054
      
      Change-Id: I2b775495398fb717204a295397c3c5e3ca938183
  14. 21 Jul, 2017 1 commit
    • Integrate convolve_round with compound_segment · 7b517095
      Angie Chiang authored
      This integration only covers low bitdepth mode for now.

      The gain of convolve_round on top of compound_segment
      recovers from 0.475% to 0.612% on lowres.
      
      Change-Id: I21606c79d0a22c0834966730358267c082d8071e
  15. 17 Jul, 2017 1 commit
    • Unify FWD_TXFM_PARAM and INV_TXFM_PARAM · 27319b6e
      Lester Lu authored
      Change two similar structs, FWD_TXFM_PARAM and INV_TXFM_PARAM,
      into a common struct: TxfmParam. Its definition is moved to
      aom_dsp/txfm_common.h to simplify dependency.
      
      This change is made so that, in later changes of the LGT
      experiment, functions requiring FWD_TXFM_PARAM and
      INV_TXFM_PARAM, such as get_fwd_lgt4 and get_inv_lgt4, can
      also be unified.
      
      Change-Id: I756b0176a02314005060adbf8e62386f10eeb344
  16. 14 Jul, 2017 2 commits
  17. 13 Jul, 2017 2 commits
  18. 12 Jul, 2017 2 commits
    • ext-partition-types: Add 4:1 partitions · 93c39e91
      Rupert Swarbrick authored
      This patch adds support for 4:1 rectangular blocks to various common
      data arrays, and adds new partition types to the EXT_PARTITION_TYPES
      experiment which will use them.
      
      This patch has the following restrictions, which can be lifted in
      future patches:
      
        * ext-partition-types is incompatible with fp_mb_stats and supertx
          for the moment
      
        * Currently only 32x32 superblocks can use the new partition types
      
      There's a slightly odd restriction about when we allow
      PARTITION_HORZ_4 or PARTITION_VERT_4. Since these both live in the
      EXT_PARTITION_TYPES CDF, read_partition() can only return them if both
      has_rows and has_cols are true. This means that at least half of the
      width and height of the block must be visible. It might be nice to
      relax that restriction but that would imply a change to how we encode
      partition types, which seems already to be in a state of flux, so
      maybe it's better to wait until that has settled down.
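
      The visibility condition can be sketched as follows. This assumes
      that has_rows and has_cols test whether the midpoint of the block
      lies inside the frame; the function name and parameters are
      hypothetical, and all quantities share one unit (for example
      mode-info units).

```c
#include <assert.h>

/* Returns 1 when both the lower half and the right half of the block
 * start inside the frame, which is the case in which the extended
 * partition types can be read under the assumption stated above. */
static int ext_partition_readable(int mi_row, int mi_col, int block_size,
                                  int frame_rows, int frame_cols) {
  assert(block_size > 0 && block_size % 2 == 0);
  const int hbs = block_size / 2; /* half block size */
  const int has_rows = (mi_row + hbs) < frame_rows;
  const int has_cols = (mi_col + hbs) < frame_cols;
  return has_rows && has_cols;
}
```

      A block that straddles the bottom or right frame edge by more than
      half its size fails this check, so PARTITION_HORZ_4 and
      PARTITION_VERT_4 would not be available for it.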
      
      Change-Id: Id7fc3fd0f762f35f63b3d3e3bf4e07c245c7b4fa
    • Add CONFIG_DAALA_DCT4 experiment. · 02078a38
      Monty Montgomery authored
      This experiment replaces the 4-point Type-II scaled-output vp9 DCT
      transform with the 4-point Type-II orthonormal Daala DCT transform.
      Right now the CONFIG_DAALA_DCT4 experiment depends on CONFIG_DCT_ONLY
      as it does not add an orthonormal 4-point DST.
      
      subset-1:
      
      monty-baseline-dctonly-squaretx-subset1 ->
        monty-dct4-dctonly-squaretx-subset1-rerun
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0055 | -0.0132 | -0.0405 |   0.0261 | 0.0005 |  0.0246 |     0.0226
      
      objective-1-fast:
      
      monty-baseline-dctonly-squaretx-o1f ->
        monty-dct4-dctonly-squaretx-o1f
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0215 | -0.1573 |     N/A |  -0.0131 | -0.0347 | -0.0390 |    -0.1121
      
      Change-Id: Idef8f6e5525037d5bbb2d0927675c21d1922d69a
  19. 11 Jul, 2017 3 commits
  20. 10 Jul, 2017 1 commit
    • Inter and intra LGTs · 708c1ec5
      Lester Lu authored
      Here we have an LGT to replace ADST for intra residual blocks, and
      another LGT to replace ADST for inter residual blocks. The changes
      are only applied to transform lengths 4 and 8, and only for the
      lowbitdepth path.
      
      lowres: -0.18%
      
      Change-Id: Iadc1e02b53e3756b44f74ca648cfa8b0e8ca7af4
  21. 08 Jul, 2017 1 commit
    • Fix frame scaling prediction · 505f0068
      Fergus Simpson authored
      Use higher precision offsets for more accurate predictor
      generation when references are at a different scale from
      the coded frame.
      
      Change-Id: I4c2c0ec67fa4824273cb3bd072211f41ac7802e8
  22. 06 Jul, 2017 1 commit
  23. 29 Jun, 2017 1 commit