1. 01 Oct, 2017 1 commit
  2. 12 Jun, 2017 1 commit
      Clean up hbd transform code · 30dfa883
      Sarah Parker authored
      Responding to some leftover cosmetic comments from
      2b5cdb1cf87c933331a16cc0221455d0a8c255e1
      
      Change-Id: I42e126593526cedd6675adf35b9c1df78e1ddf54
  3. 08 Jun, 2017 1 commit
      Remove deprecated high-bitdepth functions · 31c66502
      Sarah Parker authored
      This unifies the codepath for high-bitdepth transforms and deletes
      all calls to the old deprecated versions. This required reworking
      the way 1d configurations are combined in order to support rectangular
      transforms.
      
      There is one remaining codepath that calls the deprecated 4x4 hbd
      transform from encoder/encodemb.c. I need to take a closer look
      at what is happening there and will leave that for a follow-up
      since this change has already grown so large.
      
      lowres 10 bit: -0.035%
      lowres 12 bit: 0.021%
      
      BUG=aomedia:524
      
      Change-Id: I34cdeaed2461ed7942364147cef10d7d21e3779c
  4. 18 May, 2017 1 commit
      Refactor hbd txfm configurations to be 1D · eec47e65
      Sarah Parker authored
      The hbd transform configurations were originally written for all possible
      2d transforms. Now that there are many more possible 2d transforms
      due to EXT_TX and RECT_TX, it is simpler to write the cfg for the
      4 1D transform types and compose them to make all new possible transform
      types. This will allow for an easier integration of the identity transform
      for EXT_TX and rectangular transforms for RECT_TX into the current
      hbd transform codepath and facilitate the removal of obsolete transforms.
      This has no impact on performance.
      
      BUG=aomedia:524
      
      Change-Id: I1e217bcd217fd637b1df94fae62d9c59a0523c1a
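The composition idea in the commit above (write one configuration per 1D transform, then pair a column and a row configuration to get any 2D transform) can be sketched as follows. This is a minimal illustration, not the libaom code: every type and function name here (`Tx1dType`, `Tx1dCfg`, `Tx2dCfg`, `compose_2d_cfg`) is hypothetical, and the choice of DCT, ADST, flipped ADST, and identity as the four 1D types is an assumption, since the commit message does not list them.

```c
#include <assert.h>

/* Hypothetical 1D transform kinds (assumed; the commit only says "4"). */
typedef enum {
  TX1D_DCT, TX1D_ADST, TX1D_FLIPADST, TX1D_IDENTITY, TX1D_COUNT
} Tx1dType;

/* Hypothetical 1D configuration: a transform kind and its length. */
typedef struct {
  Tx1dType type;
  int size; /* 4, 8, 16 or 32 */
} Tx1dCfg;

/* A 2D configuration is just a pair of 1D configurations. */
typedef struct {
  const Tx1dCfg *col;
  const Tx1dCfg *row;
} Tx2dCfg;

#define CFG_ROW(t) { { t, 4 }, { t, 8 }, { t, 16 }, { t, 32 } }
static const Tx1dCfg cfg_1d[TX1D_COUNT][4] = {
  CFG_ROW(TX1D_DCT), CFG_ROW(TX1D_ADST),
  CFG_ROW(TX1D_FLIPADST), CFG_ROW(TX1D_IDENTITY)
};

/* Maps a transform length to its index in cfg_1d, or -1 if unsupported. */
static int size_index(int size) {
  switch (size) {
    case 4: return 0;
    case 8: return 1;
    case 16: return 2;
    case 32: return 3;
    default: return -1;
  }
}

/* The column transform runs over `height` samples and the row transform
 * over `width` samples, so rectangular sizes need no extra table entries. */
static Tx2dCfg compose_2d_cfg(Tx1dType col_type, Tx1dType row_type,
                              int width, int height) {
  int ci = size_index(height);
  int ri = size_index(width);
  assert(ci >= 0 && ri >= 0);
  Tx2dCfg cfg = { &cfg_1d[col_type][ci], &cfg_1d[row_type][ri] };
  return cfg;
}
```

With this layout, an 8x4 block using ADST on columns and identity on rows is `compose_2d_cfg(TX1D_ADST, TX1D_IDENTITY, 8, 4)`. The table grows with the number of 1D types times the number of sizes, not with the number of 2D combinations.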
  5. 14 Dec, 2016 1 commit
      Align temp buffer to 16 byte boundary · c1c502b8
      Yaowu Xu authored
      The optimized intrinsics require the buffer to be aligned on 16-byte
      boundaries. This commit fixes segfaults caused by unaligned access.
      
      Change-Id: I07fc242e43070bb8829871c50da52f50e60246a9
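The commit message does not show how the buffer was aligned. As a minimal sketch of the general technique, assuming a C11 compiler, a stack buffer can be given 16-byte alignment with `alignas`. The function names below are hypothetical, and the plain loop stands in for the SIMD code that would actually need the alignment.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Returns nonzero when p sits on a 16-byte boundary. */
static int is_aligned16(const void *p) {
  return ((uintptr_t)p & 15u) == 0;
}

/* Copies up to 64 values into an aligned scratch buffer and sums them.
 * Aligned SSE loads and stores on `temp` would be safe here; without
 * alignas(16) the compiler only guarantees int32_t alignment (4 bytes). */
static int32_t sum_via_aligned_temp(const int32_t *src, int n) {
  alignas(16) int32_t temp[64];
  int32_t sum = 0;
  assert(n >= 0 && n <= 64);
  assert(is_aligned16(temp));
  for (int i = 0; i < n; ++i) temp[i] = src[i];
  for (int i = 0; i < n; ++i) sum += temp[i];
  return sum;
}
```

The `is_aligned16` assertion is cheap to keep in debug builds. It turns a hard-to-trace segfault inside an intrinsic into an immediate failure at the point where the buffer is declared.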
  6. 29 Nov, 2016 1 commit
  7. 01 Nov, 2016 1 commit
  8. 01 Sep, 2016 2 commits
  9. 17 Aug, 2016 1 commit
  10. 12 Aug, 2016 1 commit
  11. 04 Aug, 2016 1 commit
  12. 11 Jul, 2016 1 commit
  13. 11 May, 2016 1 commit
  14. 09 May, 2016 1 commit
      HBD hybrid transform 16x16 SSE4.1 optimization · 412ad22f
      Yi Luo authored
      - Tx_type: DCT_DCT, DCT_ADST, ADST_DCT, ADST_ADST.
      - Update vp10_fht16x16_test.cc to do bit-exact test against
        latest C version.
      - HBD encoder speed improves ~1.8%.
      
      Change-Id: Icfc799a212e5289bcf6cedcae3722032133a2bc6
  15. 06 May, 2016 1 commit
  16. 30 Apr, 2016 1 commit
      HBD hybrid transform 8x8 SSE4.1 optimization · 299c5fc2
      Yi Luo authored
      - Tx_type: DCT_DCT, DCT_ADST, ADST_DCT, ADST_ADST.
      - Update bit-exact unit test against current C version.
      - HBD encoder speed improves ~3.8%.
      
      Change-Id: Ie13925ba11214eef2b5326814940638507bf68ec
  17. 25 Apr, 2016 1 commit
      HBD hybrid transform 4x4 SSE4.1 optimization · a4593f17
      Yi Luo authored
      - Optimization on tx_type: DCT_DCT, DCT_ADST, ADST_DCT, ADST_ADST.
      - Overall encoder speed improves ~4.5%-6%.
      - Update bit-exact unit test against current C version.
      
      Change-Id: If751c030612245b1c2470200c9570cf40d655504
  18. 22 Apr, 2016 1 commit
      Change hybrid transform function argument from TXFM_2D_CFG* to int · cf7f0069
      Yi Luo authored
      Unit tests show that the hand-written SSE4.1 code would perform ~30%
      better if the TXFM_2D_CFG configuration is set at a lower level. This
      change only updates the function signature. There is no performance
      impact.
      
      Change-Id: I62692bd50a21ffc8a944bbd6c155c0a2020ad77b
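The signature change described above can be illustrated with a small sketch in which the caller passes only an integer transform type and the callee looks up its own configuration. The names `Cfg2d`, `select_cfg`, and the enum values are hypothetical stand-ins, not the libaom definitions, and the mapping of each type to column and row kernels is an assumption.

```c
#include <assert.h>

/* Hypothetical stand-in for TXFM_2D_CFG: which 1D kernel each pass uses. */
typedef struct {
  int col_is_adst;
  int row_is_adst;
} Cfg2d;

/* Transform types named in the log; the numeric values are assumed. */
enum { DCT_DCT, ADST_DCT, DCT_ADST, ADST_ADST, TX_TYPES };

/* Assumed convention: the first name applies to columns, the second to rows. */
static const Cfg2d cfg_table[TX_TYPES] = {
  [DCT_DCT] = { 0, 0 },
  [ADST_DCT] = { 1, 0 },
  [DCT_ADST] = { 0, 1 },
  [ADST_ADST] = { 1, 1 },
};

/* Resolves the configuration inside the transform layer, so each
 * implementation (C or SIMD) is free to pick its own setup. */
static const Cfg2d *select_cfg(int tx_type) {
  assert(tx_type >= 0 && tx_type < TX_TYPES);
  return &cfg_table[tx_type];
}
```

Because the lookup happens behind the function boundary, a SIMD variant could ignore `cfg_table` entirely and branch on `tx_type` to a specialized kernel, which is the kind of lower-level setup the commit message refers to.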
  19. 30 Mar, 2016 2 commits
      change vp10_fwd_txfm2d_#x#_sse2 to vp10_fwd_txfm2d_#x#_sse4_1 · 25520d8d
      Angie Chiang authored
      Timings for 20k runs of each function are as follows.
      
      Note that the vp10_highbd_fdct#x#_sse2 versions are 16-bit
      with a range check.
      
      The rest are 32-bit versions.
      
      vp10_fwd_txfm2d_4x4_c (2 ms)
      vp10_fwd_txfm2d_8x8_c (9 ms)
      vp10_fwd_txfm2d_16x16_c (45 ms)
      vp10_fwd_txfm2d_32x32_c (233 ms)
      
      vp10_fwd_txfm2d_4x4_sse4_1 (2 ms)
      vp10_fwd_txfm2d_8x8_sse4_1 (3 ms)
      vp10_fwd_txfm2d_16x16_sse4_1 (16 ms)
      vp10_fwd_txfm2d_32x32_sse4_1 (80 ms)
      
      vp10_highbd_fdct4x4_c (1 ms)
      vp10_highbd_fdct8x8_c (3 ms)
      vp10_highbd_fdct16x16_c (17 ms)
      highbd_fdct32x32_c (160 ms)
      
      vp10_highbd_fdct4x4_sse2 (0 ms)
      vp10_highbd_fdct8x8_sse2 (2 ms)
      vp10_highbd_fdct16x16_sse2 (8 ms)
      highbd_fdct32x32_sse2 (105 ms)
      
      Change-Id: I24daf1e0d4d66e91e4ce61ef71cefa7b70ee90ce
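To read the timings above as speedups, divide the C time by the SSE4.1 time for the same block size. The helper below is a small sketch of that arithmetic. The function name and the two arrays are illustrative, with the array values copied from the vp10_fwd_txfm2d rows listed in the commit message.

```c
#include <assert.h>

/* Timings in ms for 4x4, 8x8, 16x16 and 32x32, copied from the log. */
static const int c_ms[4] = { 2, 9, 45, 233 };
static const int sse4_1_ms[4] = { 2, 3, 16, 80 };

/* Ratio of baseline to optimized time. Returns -1.0 when the optimized
 * time was reported as 0 ms, since the ratio is then undefined. */
static double speedup(int baseline_ms, int optimized_ms) {
  assert(baseline_ms >= 0 && optimized_ms >= 0);
  if (optimized_ms == 0) return -1.0;
  return (double)baseline_ms / (double)optimized_ms;
}
```

For the 8x8 case this gives 9 / 3 = 3.0, and for 32x32 it gives 233 / 80, roughly 2.9. At millisecond resolution the 4x4 figures are too coarse to compare meaningfully, which is why the 0 ms case is handled explicitly.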
      Add vp10_fwd_txfm2d_sse2 · 11d2bb54
      Angie Chiang authored
      Change-Id: Idfbe3c7f5a7eb799c03968171006f21bf3d96091