1. 30 Mar, 2017 1 commit
    • Add SSE2 av1_fht32x32 · 9a3d29ea
      Yi Luo authored
      BUG=aomedia:407
      
      Change-Id: I27a7a230bbc701920a996d1e22ae4d22ca8cfead
  2. 13 Feb, 2017 1 commit
  3. 01 Feb, 2017 1 commit
    • Fix tests on macosx. · 29ba6756
      Tom Finegan authored
      - Wrap functions hidden by CONFIG_MOTION_VAR properly in test code.
      - Add some missing ampersands.
      
      Change-Id: Ie7c4e1f14cbacec1c157c7ce110b01350b2ed78e
  4. 13 Jan, 2017 1 commit
  5. 25 Oct, 2016 1 commit
  6. 21 Oct, 2016 1 commit
  7. 20 Oct, 2016 1 commit
    • Fix the overflow of av1_fht32x32() in 2D DCT_DCT · 157e45a4
      Yi Luo authored
      - Use a range check function to avoid DCT_DCT overflow.
        The column txfm side scaling/rounding still needs to be re-developed;
        for now, we prefer to maintain the current BDRate level.
      - Encoder user-level time reduction is <1%, owing to av1_fht32x32_avx2().
      - Add MemCheck unit test and fdct32() unit test.
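
The range check mentioned above can be illustrated with a minimal sketch. The commit message does not show the function itself, so the name `coeffs_fit_in_bits` and its signature are hypothetical, and the 16-bit limit used in the usage note is an assumption about the lane width of the SIMD path. The idea is to test whether intermediate coefficients fit a given signed width before taking a path that would otherwise overflow.

```c
#include <stdint.h>

/* Hypothetical helper (not the libaom function): returns 1 when every
 * coefficient fits in a signed integer of `bits` bits (2 <= bits <= 32),
 * and 0 as soon as one value falls outside that range. */
static int coeffs_fit_in_bits(const int32_t *coeff, int count, int bits) {
  const int64_t max = ((int64_t)1 << (bits - 1)) - 1; /* e.g. 32767 for 16 */
  const int64_t min = -max - 1;                       /* e.g. -32768 for 16 */
  for (int i = 0; i < count; ++i) {
    if (coeff[i] < min || coeff[i] > max) return 0;
  }
  return 1;
}
```

A caller could, for example, invoke `coeffs_fit_in_bits(buf, 32 * 32, 16)` after the first 1D pass and fall back to a wider or rescaled path when it returns 0.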
      
      Change-Id: I1e67030f67bc637859798ebe2f6698afffb8531c
  8. 12 Oct, 2016 1 commit
    • Hybrid forward transform 32x32 AVX2 optimization · fed8e1c0
      Yi Luo authored
      - av1_fht32x32 AVX2 function-level time reduction is ~89% compared to C.
      
      - av1_fht32x32_avx2() on DCT_DCT improves 42.62% over aom_fdct32x32_avx2(),
        but function replacement must go with the corresponding inverse txfm.
      
      - No obvious user level time reduction due to 32x32 TX_TYPE selection.
      
      - Zero the high 128 bits of the YMM registers to avoid AVX-SSE transition
        penalties (fixes the 16x16 case).
      
      - Added 32x32 AVX2 unit tests to verify bit-exactness.
      
      - AVX2 optimization summary:
        On CPU i7-6700, based on 16x16/32x32 fwd txfm optimization results:
        C to AVX2: function level time reduction, ~86-89%.
        SSE2 to AVX2: function level time reduction, ~51%.
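
The time-reduction figures above can be restated as speedup factors. The sketch below assumes "time reduction" means the fraction of the baseline function time that was removed, so a reduction r corresponds to a speedup of 1 / (1 - r). The helper name `speedup_from_time_reduction` is made up for this illustration.

```c
/* Converts a fractional time reduction (0 <= reduction < 1) into a speedup
 * factor, assuming reduction = 1 - (new_time / baseline_time).
 * Returns 0.0 for out-of-range input. */
static double speedup_from_time_reduction(double reduction) {
  if (reduction < 0.0 || reduction >= 1.0) return 0.0;
  return 1.0 / (1.0 - reduction);
}
```

Under this reading, the ~86-89% reduction from C to AVX2 corresponds to roughly 7.1x to 9.1x, and the ~51% reduction from SSE2 to AVX2 to roughly 2x.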
      
      Change-Id: Idd0cd8bf066a61c7117140ef15ab6c1f8eb4b036