1. 26 Jan, 2018 1 commit
  2. 24 Jan, 2018 1 commit
  3. 27 Dec, 2017 1 commit
  4. 14 Nov, 2017 1 commit
    • Simplify Daala inverse TX toplevel for constant shift · 359854fe
      Monty Montgomery authored
      Rather than backing out all the LGT-related shifting matrices
      throughout the existing TX code, separate out and simplify Daala
      inverse TX into a single dedicated entry point.  When DAALA_TX is
      enabled, CONFIG_HIGHBITDEPTH is also forced, and all of Daala TX
      (lowbd and highbd) uses this single TX dispatch.
      
      This patch makes purely non-functional changes.
      
      subset 1:
      monty-TXtesting-fwd-s1@2017-11-12T05:25:09.557Z ->
       monty-TXtesting-inv-s1@2017-11-12T05:25:43.878Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      objective-1-fast:
      monty-TXtesting-fwd-o1f@2017-11-12T05:25:29.386Z ->
       monty-TXtesting-inv-o1f@2017-11-12T05:25:58.897Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I790e8d7ac08eb214eb712f5441d6e5f76ebddf17
  5. 09 Nov, 2017 1 commit
  6. 04 Nov, 2017 1 commit
  7. 02 Nov, 2017 1 commit
    • Remove experimental flag of EXT_TX · 3bac9928
      Sebastien Alaiwan authored
      This experiment has been adopted, so we can simplify the code
      by dropping the associated preprocessor conditionals.
      
      Change-Id: I02ed47186bbc32400ee9bfadda17659d859c0ef7
  8. 19 Oct, 2017 1 commit
  9. 05 Oct, 2017 1 commit
  10. 15 Aug, 2017 1 commit
    • Disable only coding transform SIMD for DAALA_TX · 1d190950
      Monty Montgomery authored
      Rather than disabling MMX (well, all of SIMD) for daala transforms,
      selectively disable the AV1 TX SIMD through
      av1/common/av1_rtcd_defs.pl
      
      This also requires quite a few testing build fixups.
      
      Change-Id: I689eaafbdd3a87e3a8eeef97412a1846ef886055
  11. 17 Jul, 2017 1 commit
    • Unify FWD_TXFM_PARAM and INV_TXFM_PARAM · 27319b6e
      Lester Lu authored
      Change two similar structs, FWD_TXFM_PARAM and INV_TXFM_PARAM,
      into a common struct: TxfmParam. Its definition is moved to
      aom_dsp/txfm_common.h to simplify dependency.
      
      This change is made so that, in later changes of the LGT
      experiment, functions requiring FWD_TXFM_PARAM and
      INV_TXFM_PARAM, such as get_fwd_lgt4 and get_inv_lgt4, can
      also be unified.
      
      Change-Id: I756b0176a02314005060adbf8e62386f10eeb344
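The unification described in this commit can be pictured with a minimal sketch in which one parameter struct is passed to both a forward and an inverse routine. Everything here is an assumption for illustration: `ExampleTxfmParam`, its two fields, and the toy 2-point kernels `example_fwd2` and `example_inv2` are hypothetical and are not the definitions in aom_dsp/txfm_common.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a shared parameter struct; the field set is
 * assumed for illustration only. */
typedef struct {
  int tx_type; /* which transform to apply (unused by this toy kernel) */
  int bd;      /* bit depth of the samples */
} ExampleTxfmParam;

/* Toy 2-point forward transform: sum and difference of the two inputs. */
static void example_fwd2(const int16_t *in, int32_t *out,
                         const ExampleTxfmParam *param) {
  assert(param->bd >= 8 && param->bd <= 12);
  out[0] = in[0] + in[1];
  out[1] = in[0] - in[1];
}

/* Matching inverse; it takes the same struct type as the forward path,
 * so helpers that receive the struct need only one signature. */
static void example_inv2(const int32_t *in, int16_t *out,
                         const ExampleTxfmParam *param) {
  assert(param->bd >= 8 && param->bd <= 12);
  out[0] = (int16_t)((in[0] + in[1]) / 2);
  out[1] = (int16_t)((in[0] - in[1]) / 2);
}
```

With a single struct type, a helper that needs the parameters can be written once and called from either direction, which is the kind of later unification the commit message anticipates.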
  12. 07 Jul, 2017 1 commit
    • Signature changes for the LGT experiment · d8b1ddce
      Lester Lu authored
      The input arguments of av1_fht* and av1_iht* functions (and their
      HBD versions) are slightly changed. Input arguments tx_type and
      bd are carried by a struct fwd_txfm_param/inv_txfm_param. This
      struct is meant to later on carry other prediction information,
      such as intra top/left boundaries to the transform level, so
      that the choice of transforms can be more adaptive to the
      prediction mode and local video content.
      
      Change-Id: Ia42544248a51845be64b72855b642ef1fe5910a9
  13. 20 Apr, 2017 1 commit
    • Drop support for CONFIG_EMULATE_HARDWARE · c6a48a25
      Sebastien Alaiwan authored
      This experiment complicates DSP function dispatch without bringing
      any real value (it's non-normative arbitrary behaviour).
      Moreover, it only has an effect on obsolete transforms; the new ones
      don't implement this mechanism.
      
      Change-Id: Idaccdd0c14ed6b7008cd4f365c7f017ba8ccacf5
  14. 12 Apr, 2017 1 commit
  15. 31 Mar, 2017 1 commit
  16. 13 Feb, 2017 1 commit
  17. 01 Feb, 2017 1 commit
    • Fix tests on macosx. · 29ba6756
      Tom Finegan authored
      - Wrap functions hidden by CONFIG_MOTION_VAR properly in test code.
      - Add some missing ampersands.
      
      Change-Id: Ie7c4e1f14cbacec1c157c7ce110b01350b2ed78e
  18. 10 Jan, 2017 1 commit
    • Fix RunAccuracyCheck failure · e6aece86
      Angie Chiang authored
      Measure the accuracy of each transform on a per-coefficient basis.
      Set up an accuracy limit corresponding to the current transform
      implementation.
      
      Change-Id: Ib7db9680c963427e94e728bf453b66180ce30b89
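A per-coefficient accuracy check of the kind this commit describes can be sketched as follows. This is only an illustration under stated assumptions: `example_max_coeff_error` and `example_within_limit` are hypothetical helpers, and the reference values and limit are made up; the actual RunAccuracyCheck test and its limits are not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Largest absolute difference between a floating-point reference and an
 * integer implementation, taken one coefficient at a time. */
static double example_max_coeff_error(const double *ref, const int32_t *test,
                                      int n) {
  double worst = 0.0;
  assert(n > 0);
  for (int i = 0; i < n; ++i) {
    double diff = ref[i] - (double)test[i];
    if (diff < 0.0) diff = -diff; /* absolute value without math.h */
    if (diff > worst) worst = diff;
  }
  return worst;
}

/* Returns 1 when every coefficient is within `limit` of the reference. */
static int example_within_limit(const double *ref, const int32_t *test, int n,
                                double limit) {
  return example_max_coeff_error(ref, test, n) <= limit;
}
```

Bounding the worst single coefficient, rather than a sum over the block, keeps one badly wrong coefficient from being hidden by many accurate ones.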
  19. 29 Nov, 2016 1 commit
  20. 28 Nov, 2016 1 commit
    • Fix compiling of tests with emulate-hardware · 46f0f299
      Yaowu Xu authored
      CONFIG_EMULATE_HARDWARE disables SIMD versions of transform functions.
      This commit adds !CONFIG_EMULATE_HARDWARE guards so that tests that use
      SIMD versions of transforms compile.
      
      Change-Id: I4b9ef5a46ae8f12c439f4fe18766b95f8a520d34
  21. 22 Nov, 2016 1 commit
  22. 01 Nov, 2016 1 commit
    • Hybrid inverse transforms 16x16 AVX2 optimization · 73172000
      Yi Luo authored
      - Add unit tests to verify the bit-exact result.
      - User level time reduction (EXT_TX):
          encoder: 3.63%
          decoder: 2.36%
      - Also add tx_type=V_DCT...H_FLIPADST SSE2 for 16x16 inv txfm.
      
      Change-Id: Idc6d9e8254aa536e5f18a87fa0d37c6bd551c083
  23. 13 Oct, 2016 1 commit
  24. 06 Oct, 2016 1 commit
    • Hybrid forward transforms 16x16 AVX2 optimization · e8e8cd8f
      Yi Luo authored
      - Unit tests are added for AVX2 SIMD.
      - Encoder speed improvement:
        AV1 baseline and EXT_TX, three 1080p sequences at bitrate:
        800 Kbps, 2 Mbps, 6 Mbps, on i7-6700 CPU, average
        user level time reduction: 3.86%.
      
      Change-Id: Ibbd7837ee3a831c6b1e4e471bf6c8d3fa3a19ff4
  25. 01 Sep, 2016 2 commits
  26. 12 Aug, 2016 1 commit
  27. 18 May, 2016 2 commits
    • Turn on flip in inverse txfm2d · 6f28581b
      Angie Chiang authored
      Fix build failure
      Reduce txfm test time
      
      Change-Id: Ieaf6b27f3a272d06286f817f01230413fa8adcf6
    • Integrate HBD row/column flip fwd txfm SSE4.1 optimization · 1d307368
      Yi Luo authored
      - Integrate 5 flip transform types for each 4x4, 8x8, and 16x16
        block, for experiment, EXT_TX.
      - Encoder speed improves about 12%-15%.
      - Update the unit tests for bit-exact result against C.
      
      Change-Id: Idf27c87f1e516ca5b66c7b70142477a115404ccb
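The two ideas in this commit, flipping a block and checking an optimized path for bit-exact agreement with a C reference, can be sketched together. This is a simplified illustration, not the SSE4.1 code: `example_flip_lr_ref`, `example_flip_lr_alt`, and `example_bit_exact` are hypothetical names, the block size is fixed at 4x4, and the "alternative" path is plain C standing in for an optimized implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EX_N 4 /* block width and height used by this sketch */

/* Reference: mirror each row left-to-right, element by element. */
static void example_flip_lr_ref(const int16_t *in, int16_t *out) {
  for (int r = 0; r < EX_N; ++r)
    for (int c = 0; c < EX_N; ++c)
      out[r * EX_N + c] = in[r * EX_N + (EX_N - 1 - c)];
}

/* Alternative: same result via a copy followed by in-place swaps.
 * It stands in for an optimized path in this sketch. */
static void example_flip_lr_alt(const int16_t *in, int16_t *out) {
  memcpy(out, in, sizeof(int16_t) * EX_N * EX_N);
  for (int r = 0; r < EX_N; ++r) {
    int16_t *row = out + r * EX_N;
    for (int c = 0; c < EX_N / 2; ++c) {
      int16_t tmp = row[c];
      row[c] = row[EX_N - 1 - c];
      row[EX_N - 1 - c] = tmp;
    }
  }
}

/* Bit-exact means the two output buffers are byte-for-byte identical. */
static int example_bit_exact(const int16_t *a, const int16_t *b) {
  return memcmp(a, b, sizeof(int16_t) * EX_N * EX_N) == 0;
}
```

A byte-wise comparison is stricter than a tolerance check: any single-bit difference between the two paths fails the test.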
  28. 09 May, 2016 1 commit
    • HBD hybrid transform 16x16 SSE4.1 optimization · 412ad22f
      Yi Luo authored
      - Tx_type: DCT_DCT, DCT_ADST, ADST_DCT, ADST_ADST.
      - Update vp10_fht16x16_test.cc to do bit-exact test against
        latest C version.
      - HBD encoder speed improves ~1.8%.
      
      Change-Id: Icfc799a212e5289bcf6cedcae3722032133a2bc6
  29. 27 Apr, 2016 1 commit
  30. 25 Mar, 2016 1 commit
    • 8x8/16x16 HT types V_DCT to H_FLIPADST SSE2 optimization · 770bf715
      Yi Luo authored
      - Wrote function: fidtx8_sse2() and fidtx16_sse2().
      - Turned on vp10_fht8x8_sse2()/vp10_fht16x16_sse2() for new types.
      - Updated 8x8/16x16 unit tests for accuracy/speed.
      - Running 20K iterations with random inputs, cycling through
        tx types from V_DCT to H_FLIPADST, SSE2 speed improvement:
        8x8: ~131%
        16x16: ~66%
      
      Change-Id: Ibbb707e932a08fec3b1f423a7dab280a1d696c9a
  31. 21 Mar, 2016 1 commit
    • Adds 1D transforms for ADST/FlipADST to make 16 · 1b175593
      Debargha Mukherjee authored
      Makes a set of 16 transforms total, adding all 1D
      combinations of ADST and FlipADST, and removing all DST
      transforms.
      
      lowres, midres both improve by about 0.1% and hdres by
      -0.378% in BDRATE but with fewer transforms that are also
      simpler.
      
      Further experiments to continue later.
      
      Change-Id: I7348a4c0e12078fdea5ae3a2d36a89a319ffcc6e
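One way to read the count of 16 is as every pairing of four 1D kernels (DCT, ADST, FlipADST, and identity) across the vertical and horizontal directions, where a "1D transform" pairs a real kernel in one direction with identity in the other. That reading is an assumption, and the enum and function names below are hypothetical labels rather than the codec's own definitions.

```c
#include <assert.h>

/* Hypothetical labels for the 1D kernels; not the codec's real enum. */
typedef enum {
  EX_DCT_1D,
  EX_ADST_1D,
  EX_FLIPADST_1D,
  EX_IDTX_1D, /* identity: leaves that direction untransformed */
  EX_1D_KERNELS
} Example1D;

typedef struct {
  Example1D vertical;
  Example1D horizontal;
} Example2DType;

/* Fill `types` with every (vertical, horizontal) pairing and return the
 * number of pairings written. */
static int example_enumerate_2d_types(Example2DType *types, int capacity) {
  int count = 0;
  for (int v = 0; v < EX_1D_KERNELS; ++v) {
    for (int h = 0; h < EX_1D_KERNELS; ++h) {
      assert(count < capacity);
      types[count].vertical = (Example1D)v;
      types[count].horizontal = (Example1D)h;
      ++count;
    }
  }
  return count;
}
```

Under this reading, four kernels per direction give 4 x 4 = 16 pairings, matching the total the commit message states.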
  32. 08 Mar, 2016 1 commit
    • Implemented DST 16x16 SSE2 intrinsics optimization · 50a164a1
      Yi Luo authored
      - Implemented fdst16_sse2(), fdst16_8col() against C version: fdst16().
      - Turned on 7 DST related hybrid txfm types in vp10_fht16x16_sse2().
      - Replaced vp10_fht16x16_c() with vp10_fht16x16_sse2() in
        fwd_txfm_16x16().
      - Added vp10_fht16x16_sse2() unit test against C version:
        vp10_fht16x16_c() (--gtest_filter=*VP10Trans16x16*).
      - Unit test passed.
      - Speed improvement: 2.4%, 3.2%, 3.2%, for city_cif.y4m, garden_sif.y4m,
        and mobile_cif.y4m.
      
      Change-Id: Ib30a67ce5d5964bef143d588d0f8fa438be8901f