1. 02 Nov, 2017 3 commits
  2. 01 Nov, 2017 1 commit
  3. 30 Oct, 2017 1 commit
  4. 25 Oct, 2017 2 commits
    • av1_rtcd_defs.pl: deduplicate HBD/LBD · 27427722
      Sebastien Alaiwan authored
      There's no change to the generated file.
      
      Change-Id: I77e9d78d22d084bc77dbf1dc5b8b99368cd2444e
    • Optimizations for filter_intra · 57b8ff68
      Yue Chen authored
      Reduce number of modes from 10 to 6, and disable fi modes in UV.
      To reduce complexity, apply filter directly without subtracting
      the estimated means.
      
      Change-Id: Iaf78d92d31e4a7cc30ea7863b57a9611c5f503e6
  5. 24 Oct, 2017 1 commit
    • Expose av1_loop_restoration_filter_unit in restoration.h · dd6f09ab
      Rupert Swarbrick authored
      This patch also does a certain amount of rejigging for loop
      restoration coefficients, grouping the information for a given
      restoration unit into a structure called RestorationUnitInfo. The end
      result is to completely dispense with the RestorationInternal
      structure.
      
      The copy_tile functions in restoration.c, together with those
      functions that operate on a single stripe, have been changed so that
      they take pointers to the top-left corner of the area on which they
      should work, together with a width and height.
      
      The same isn't true of av1_loop_restoration_filter_unit, which still
      takes pointers to the top-left of the tile. This is because you
      actually need the absolute position in the tile in order to do striped
      loop restoration properly.
      
      Change-Id: I768c182cd15c9b2d6cfabb5ffca697cd2a3ff9e1
  6. 21 Oct, 2017 1 commit
  7. 19 Oct, 2017 2 commits
    • Rename DAALA_DCTx experiments to DAALA_TXx. · e554f36c
      Nathan E. Egge authored
      Change-Id: I8fa0a67d7a198b8b24837ffc352acf77f390cffe
    • General tidy-ups in loop restoration code · d3d0615e
      Rupert Swarbrick authored
      This refactors the iteration in restoration.c so that all the scary
      stuff lies in a pair of general functions, filter_frame and
      filter_rest_unit.
      
      filter_frame is currently very simple, iterating over the restoration
      units in the frame. Once we've made it so that restoration units don't
      span tile boundaries, this function is the one we'll need to update to
      iterate over tiles and then restoration units within the tile.
      
      filter_rest_unit replaces the outer loop of the loop_*_filter_tile*
      functions. It deals with chopping the restoration unit into stripes of
      height procunit_height. When CONFIG_STRIPED_LOOP_RESTORATION is true,
      it also deals with calling setup_processing_stripe_boundary and
      restore_processing_stripe_boundary to use boundary data from the
      deblocked output.
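
The stripe-chopping loop described above can be sketched roughly as follows. This is a simplified illustration, not the code from restoration.c: the names for_each_stripe, stripe_fn and count_rows are hypothetical, and the sketch ignores any stripe offset or boundary save/restore that the real function performs.

```c
#include <assert.h>

/* Hypothetical per-stripe callback: handles `h` rows starting at row `y0`. */
typedef void (*stripe_fn)(int y0, int h, void *ctx);

/* Chop the rows [v_start, v_end) of a restoration unit into stripes of at
 * most `procunit_height` rows and pass each stripe to `fn`.
 * Returns the number of stripes visited. */
static int for_each_stripe(int v_start, int v_end, int procunit_height,
                           stripe_fn fn, void *ctx) {
  assert(procunit_height > 0 && v_start <= v_end);
  int count = 0;
  for (int y = v_start; y < v_end; y += procunit_height) {
    const int remaining = v_end - y;
    const int h = remaining < procunit_height ? remaining : procunit_height;
    /* Assumption: in a striped build, boundary setup would happen before
     * this call and boundary restore after it. */
    fn(y, h, ctx);
    ++count;
  }
  return count;
}

/* Example callback: accumulates the total number of rows seen. */
static void count_rows(int y0, int h, void *ctx) {
  (void)y0;
  *(int *)ctx += h;
}
```

With a unit of 150 rows and a stripe height of 64, the loop visits stripes of 64, 64 and 22 rows.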
      
      Some of the ugly #if/#endif blocks have been elided in the wiener
      filter code (both low and high bit depth), by defining a convolve
      alias based on USE_WIENER_HIGH_INTERMEDIATE_PRECISION.
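
One way to picture the alias approach is the sketch below. Only the macro name USE_WIENER_HIGH_INTERMEDIATE_PRECISION comes from the commit message; the two rounding helpers, the wiener_round alias and the default value of the macro are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>

/* Assumed default for this sketch; a real build would set this elsewhere. */
#ifndef USE_WIENER_HIGH_INTERMEDIATE_PRECISION
#define USE_WIENER_HIGH_INTERMEDIATE_PRECISION 1
#endif

/* Hypothetical variants: one rounds away 7 bits, the other only 3 bits,
 * so it keeps 4 extra bits of intermediate precision. */
static inline int round_shift_plain(int sum) { return (sum + 64) >> 7; }
static inline int round_shift_hip(int sum) { return (sum + 4) >> 3; }

/* Pick the variant once, so call sites need no #if/#endif of their own. */
#if USE_WIENER_HIGH_INTERMEDIATE_PRECISION
#define wiener_round round_shift_hip
#else
#define wiener_round round_shift_plain
#endif

/* A call site written against the alias only. */
static int wiener_stage1(int sum) {
  assert(sum >= 0);
  return wiener_round(sum);
}
```

The conditional compilation then lives in one place, and every caller reads the same regardless of the precision setting.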
      
      There are also changes to extend const-ness for the source frame. I've
      adopted the convention that the frame input is called "data" (as it
      was before) while it's non-const. This is true as far as
      filter_rest_unit. Then each "process one stripe" function takes a
      const pointer to the source frame, at which point it's called "src".
      
      The intention is that, once filter_rest_unit no longer needs a
      RestorationInternal pointer, this function can be exposed in
      restoration.h and can be used by pickrst.c.
      
      Change-Id: I18043a172ef0ca1154d87cf7f63e3a80944627cd
  8. 11 Oct, 2017 2 commits
  9. 10 Oct, 2017 4 commits
    • Format clean-up av1_rtcd_defs.pl · 3ba27237
      Jingning Han authored
      Change-Id: I7a94cdef41e5e451247de939313feb58cd991e7f
    • lgt-from-pred: transforms based on prediction · 432012f6
      Lester Lu authored
      In this experiment, sharp image discontinuity in the predicted
      block is detected. Based on this discontinuity, we choose
      particular LGTs as row and column transforms.
      
      Bitstream syntax, entropy coding, and RD search for LGT are added.
      One binary symbol is used to signal whether LGT is used. This
      experiment can work independently of the lgt experiment.
      
      lowres: -0.414% for key frames, -0.151% overall
      midres: -0.413% for key frames, -0.161% overall
      
      Change-Id: Iaa2f2c2839c34ca4134fa55e77870dc3f1fa879f
    • Add an SSE4.1 implementation of av1_highbd_convolve_2d_scale · 724d31eb
      Rupert Swarbrick authored
      For large blocks this is about 8x the speed of the C version. The code
      needs SSE 4.1 for the PMULLD instruction that we use to do SIMD 32-bit
      multiplies.
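
For readers unfamiliar with PMULLD, the following is a portable scalar model of what the instruction computes: four independent 32-bit multiplies, each keeping only the low 32 bits of the product. In C the instruction is normally reached through the `_mm_mullo_epi32` intrinsic. The function name pmulld_model is hypothetical and the sketch is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of PMULLD: multiply four 32-bit lanes pairwise and keep the
 * low 32 bits of each product. */
static void pmulld_model(const int32_t a[4], const int32_t b[4],
                         int32_t out[4]) {
  assert(a && b && out);
  for (int lane = 0; lane < 4; ++lane) {
    /* Multiply as unsigned so that wraparound is well defined in C; the low
     * 32 bits are the same for signed and unsigned operands. */
    const uint32_t lo = (uint32_t)a[lane] * (uint32_t)b[lane];
    out[lane] = (int32_t)lo;
  }
}
```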
      
      The patch uses av1_convolve_scale_test (written already to test the
      low bit depth path) to make sure the optimised code matches the C
      version.
      
      Change-Id: I9304d6bb3d2cb31390de93ed08ff1a852e3ace86
    • Add an SSE4.1 implementation of av1_convolve_2d_scale · 98dc22b8
      Rupert Swarbrick authored
      For large blocks this is almost 8x the speed of the C version. The
      code needs SSE 4.1 for the PMULLD instruction that we use to do SIMD
      32-bit multiplies.
      
      This patch also makes av1_convolve_scale_test actually test something,
      making sure the optimised code matches the C version. The slightly
      excessive generality in the test (all the templating) is because of a
      following patch, which is for the high bit depth path and can then use
      most of the same test code.
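
The idea of checking an optimised kernel against the C version can be sketched as below. This is not av1_convolve_scale_test: the kernels halve_c and halve_opt, the scale_fn type and the kernels_match harness are all hypothetical, and a function pointer stands in for the templating mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef void (*scale_fn)(const int16_t *src, int16_t *dst, int n);

/* Hypothetical reference kernel: halve each sample with rounding. */
static void halve_c(const int16_t *src, int16_t *dst, int n) {
  for (int i = 0; i < n; ++i) dst[i] = (int16_t)((src[i] + 1) >> 1);
}

/* Hypothetical "optimised" kernel: same arithmetic, unrolled by two. */
static void halve_opt(const int16_t *src, int16_t *dst, int n) {
  int i = 0;
  for (; i + 1 < n; i += 2) {
    dst[i] = (int16_t)((src[i] + 1) >> 1);
    dst[i + 1] = (int16_t)((src[i + 1] + 1) >> 1);
  }
  if (i < n) dst[i] = (int16_t)((src[i] + 1) >> 1);
}

/* Returns 1 if both kernels produce identical output on `trials` random
 * inputs of length `n`, 0 on the first mismatch. */
static int kernels_match(scale_fn ref, scale_fn opt, int n, int trials,
                         unsigned seed) {
  enum { kMaxLen = 256 };
  int16_t src[kMaxLen], want[kMaxLen], got[kMaxLen];
  assert(n > 0 && n <= kMaxLen);
  srand(seed);
  for (int t = 0; t < trials; ++t) {
    for (int i = 0; i < n; ++i) src[i] = (int16_t)(rand() % 1024);
    ref(src, want, n);
    opt(src, got, n);
    for (int i = 0; i < n; ++i) {
      if (want[i] != got[i]) return 0;
    }
  }
  return 1;
}
```

Using an odd length such as 37 exercises the tail handling of the unrolled kernel as well as its main loop.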
      
      Change-Id: I6732bc6b2378ffaadae5aa6441100cf660f7ee11
  10. 05 Oct, 2017 1 commit
  11. 02 Oct, 2017 3 commits
  12. 01 Oct, 2017 1 commit
  13. 28 Sep, 2017 1 commit
    • Remove dead av1_dct8x8_quant_xxxx functions · 7f7dd08a
      Monty Montgomery authored
      They're unused, disabled in the prototype setup, but still built and
      complicating the already convoluted ifdef mess in TX experiment
      configuration.
      
      Don't leave dead code in the sourcebase.  That's what SCM is for.
      
      Change-Id: Idb2adf597ac064c7b5027df8af1cf65054984aa4
  14. 27 Sep, 2017 1 commit
  15. 20 Sep, 2017 1 commit
    • [intra-edge] Vectorize upsampling · ad0196b8
      Joe Young authored
      Add sse4_1 functions for Intra-edge experiment:
        av1_upsample_intra_edge_sse4_1()
        av1_upsample_intra_edge_high_sse4_1()
      
      Approx cycle reduction at qp 20, 1 kf:
        Enc:  0.5% to 0.3%
        Dec:  0.4% to 0.2%
      
      Change-Id: I97f0eee09b78218b418b484d80c338cec037f1b9
  16. 16 Sep, 2017 2 commits
    • [intra-edge] Vectorize edge filtering functions · 89d321f7
      Joe Young authored
      Add sse4_1 functions for Intra-edge experiment:
        av1_filter_intra_edge_sse4_1()
        av1_filter_intra_edge_high_sse4_1()
      
      Approx cycle reduction at qp 20, 1 kf:
        Enc (lbd) 1.4% to 0.3%
        Dec (lbd) 0.4% to 0.1%
        Enc (hbd) 1.1% to 0.2%
        Dec (hbd) 0.6% to 0.1%
      
      No change to bitstream
      
      Change-Id: I176b2d125424d7d226114c807915c33dde5c3720
    • Fix CMake mips32 build with DSPR2 enabled. · db724cf0
      Tom Finegan authored
      - Add aom_scale dspr2 sources to the correct target (aom).
      - Fix an inverted high bit depth condition.
      - Remove claims in av1_rtcd_defs.pl that the dspr2 functions
        av1_iht16x16_256_add_dspr2, av1_iht8x8_64_add_dspr2 and
        av1_iht4x4_16_add_dspr2 exist in low bit depth configs.
      
      Change-Id: Ibdd42e475b81c2491f02ba10ca0d461f7ff15bc5
  17. 10 Sep, 2017 1 commit
  18. 18 Aug, 2017 1 commit
    • Remove dpcm-intra experiment · 400bf651
      Hui Su authored
      Coding gain becomes tiny on top of other experiments.
      
      Change-Id: Ia89b1c2a2653f3833dff8ac8bb612eaa3ba18446
  19. 15 Aug, 2017 2 commits
  20. 11 Aug, 2017 2 commits
    • tx64x64: Use C version for DCT/IDCT transform. · 900643be
      Urvang Joshi authored
      The SSE4 function does not support 64x64 size, and was triggering an
      assertion failure when lowbitdepth is disabled.
      
      BUG=aomedia:672
      
      Change-Id: Id14e76b5c180a211a84c2e933b07e8acf72dddbc
    • Add experiment CONFIG_CDEF_SINGLEPASS: Make CDEF single pass · 5978212b
      Steinar Midtskogen authored
      Low latency, cpu-used=0:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.3162 | -0.6719 | -0.6535 |   0.0089 | -0.3890 | -0.1515 |    -0.6682
      
      High latency, cpu-used=0:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0293 | -0.3556 | -0.5505 |   0.0684 | -0.0862 |  0.0513 |    -0.2765
      
      Low latency, cpu-used=4:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.2248 | -0.7764 | -0.6630 |  -0.2109 | -0.3240 | -0.2532 |    -0.6980
      
      High latency, cpu-used=4:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.1118 | -0.5841 | -0.7406 |  -0.0463 | -0.2442 | -0.1064 |    -0.4187
      
      Change-Id: I9ca8399c8f45489541a66f535fb3d771eb1d59ab
  21. 08 Aug, 2017 1 commit
    • Refactor quantization C code. · f3b5ee14
      Thomas Davies authored
      This commit de-duplicates C reference quantization code
      and unifies quantization matrix (QM) and non-QM code
      paths when there is no SIMD.
      
      The reorganisation will also facilitate re-using SIMD quant
      functions for QM when the matrix is flat, as is the
      default when AOM_QM is enabled.
      
      Change-Id: Idbfdac9eb9a31adcffe734aac1877d58b86fab77
  22. 04 Aug, 2017 1 commit
    • CDEF cleanup · 94de0aaa
      Steinar Midtskogen authored
      Name changes and code moves to bring code more in line with the
      design doc and an upcoming single-pass patch.  No functional changes.
      
      Change-Id: I2bccd58c644e534b139f420b623390aa971fbdb0
  23. 31 Jul, 2017 1 commit
    • Unified warp_affine and warp_affine_post_round · b6a31753
      Peter de Rivaz authored
      This patch removes the need for a separate warp_affine_post_round
      function by adding the functionality to the warp_affine function.
      
      The encoded output should remain unchanged, but the encoder/decoder
      should operate faster because the sse2 and ssse3 warp implementation
      can now be used when post_rounding is being used.
      
      Change-Id: Ide52cae55de59a9da9c27c5793e17390f6d2c03e
  24. 24 Jul, 2017 1 commit
    • filter-intra: Support rectangular blocks. · 6a99691d
      Urvang Joshi authored
      - Use 'tx_size' in function signatures.
      - filter_intra_taps_3 and filter_intra_taps_4 updated to support
        TX_SIZES_ALL (thanks to yuec@)
      
      With these changes, filter-intra works correctly with rect-intra-pred.
      So, we remove the temporary workaround for this.
      
      Change-Id: Ide0f593419c21a74c08c61859f8dad918ca169fa
  25. 17 Jul, 2017 1 commit
    • Unify FWD_TXFM_PARAM and INV_TXFM_PARAM · 27319b6e
      Lester Lu authored
      Change two similar structs, FWD_TXFM_PARAM and INV_TXFM_PARAM,
      into a common struct: TxfmParam. Its definition is moved to
      aom_dsp/txfm_common.h to simplify dependency.
      
      This change is made so that, in later changes of the LGT
      experiment, functions requiring FWD_TXFM_PARAM and
      INV_TXFM_PARAM, such as get_fwd_lgt4 and get_inv_lgt4, can
      also be unified.
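
The benefit of a shared parameter struct can be sketched as follows. The struct TxfmParamSketch and the toy fwd_txfm_sketch / inv_txfm_sketch functions are hypothetical. The fields tx_type and bd are the two named in the log; the real TxfmParam may carry more, and the "transform" here is only a reversible scale-and-reorder used to show both directions reading the same struct.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shape of the shared parameter struct. */
typedef struct {
  int tx_type; /* toy meaning here: 0 = keep order, 1 = reverse order */
  int bd;      /* bit depth, assumed to be at least 8 */
} TxfmParamSketch;

/* Toy forward step: reorder by tx_type and scale up by 2^(bd - 8). */
static void fwd_txfm_sketch(const int16_t *in, int32_t *out, int n,
                            const TxfmParamSketch *p) {
  assert(p->bd >= 8 && p->bd <= 12);
  const int32_t scale = 1 << (p->bd - 8);
  for (int i = 0; i < n; ++i) {
    const int src = p->tx_type ? n - 1 - i : i;
    out[i] = in[src] * scale;
  }
}

/* Toy inverse step: undo the scaling and the reordering, driven by the
 * same struct that the forward step used. */
static void inv_txfm_sketch(const int32_t *in, int16_t *out, int n,
                            const TxfmParamSketch *p) {
  assert(p->bd >= 8 && p->bd <= 12);
  const int32_t scale = 1 << (p->bd - 8);
  for (int i = 0; i < n; ++i) {
    const int dst = p->tx_type ? n - 1 - i : i;
    out[dst] = (int16_t)(in[i] / scale);
  }
}
```

With one struct type, helper functions that previously needed separate forward and inverse variants can take a single `const TxfmParamSketch *` argument.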
      
      Change-Id: I756b0176a02314005060adbf8e62386f10eeb344
  26. 13 Jul, 2017 1 commit
    • Speed up convolve_round post-rounding by avx2 · 04cef497
      Yi Luo authored
      - Decoder convolve rounding cycle percentage drops from
        2.75% to 0.91% by using avx2 function on i7-6700.
      
      Change-Id: I34ae48f45c0b4073f8962647d2181365ffe3325b
  27. 07 Jul, 2017 1 commit
    • Signature changes for the LGT experiment · d8b1ddce
      Lester Lu authored
      The input arguments of av1_fht* and av1_iht* functions (and their
      HBD versions) are slightly changed. Input arguments tx_type and
      bd are carried by a struct fwd_txfm_param/inv_txfm_param. This
      struct is meant to later on carry other prediction information,
      such as intra top/left boundaries to the transform level, so
      that the choice of transforms can be more adaptive to the
      prediction mode and local video content.
      
      Change-Id: Ia42544248a51845be64b72855b642ef1fe5910a9