1. 27 Aug, 2015 1 commit
      Add sse2 versions of halfpix variance · a28b2c6f
      Johann authored
      These were lost in the great sub pixel variance move of
      6a82f0d7
      
      Not having these functions caused a ~10% performance regression in
      some realtime vp8 encodes.
      
      Change-Id: I50658483d9198391806b27899f2c0d309233c4b5
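For readers unfamiliar with the term, a "halfpix" variance compares a reference block against a source block interpolated halfway between neighbouring pixels. The scalar sketch below covers only the horizontal 16x16 case and is an illustration, not the SSE2 code this commit adds: the function name `halfpix_h_variance16x16` is hypothetical, and the rounding `(a + b + 1) >> 1` is an assumption about how the half-pixel average is formed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference, not the libvpx SSE2 routine.
 * Averages each source pixel with its right-hand neighbour (half-pixel
 * horizontal position), then returns the variance of the difference
 * against ref over a 16x16 block. Reads 17 columns of src per row. */
static unsigned int halfpix_h_variance16x16(const uint8_t *src,
                                            ptrdiff_t src_stride,
                                            const uint8_t *ref,
                                            ptrdiff_t ref_stride,
                                            unsigned int *sse) {
  int64_t sum = 0;  /* sum of differences */
  uint64_t sq = 0;  /* sum of squared differences */
  assert(src != NULL && ref != NULL && sse != NULL);
  for (int y = 0; y < 16; ++y) {
    for (int x = 0; x < 16; ++x) {
      /* Assumed rounding for the half-pixel average. */
      const int pred = (src[x] + src[x + 1] + 1) >> 1;
      const int diff = pred - ref[x];
      sum += diff;
      sq += (uint64_t)(diff * diff);
    }
    src += src_stride;
    ref += ref_stride;
  }
  *sse = (unsigned int)sq;
  /* variance = SSE - sum^2 / N, with N = 16 * 16 = 256 pixels */
  return (unsigned int)(sq - (uint64_t)((sum * sum) >> 8));
}
```

A SIMD version would process 16 pixels per instruction, which is presumably where the speedup mentioned in the commit message comes from.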
  2. 20 Aug, 2015 1 commit
  3. 19 Aug, 2015 1 commit
  4. 18 Aug, 2015 1 commit
  5. 17 Aug, 2015 1 commit
  6. 14 Aug, 2015 1 commit
  7. 13 Aug, 2015 2 commits
  8. 12 Aug, 2015 2 commits
  9. 11 Aug, 2015 1 commit
  10. 10 Aug, 2015 3 commits
  11. 07 Aug, 2015 6 commits
  12. 05 Aug, 2015 3 commits
      Narrow a load in iwht4x4_16_add. · 05720527
      Alex Converse authored
      The top half is unused.
      
      Change-Id: I29b2f6a93e20ea43aff4ad0bd2d52257e1e752b6
      VPX: remove scaled calls from FUN_CONV_1D · 4e6b5079
      Scott LaVarnway authored
      and FUN_CONV_2D macros.  The predict lut now handles
      this case.  The encoder now calls vpx_scaled_2d() instead
      of vpx_convolve8() for scaling.
      
      Change-Id: Ia1c8af8a31e4cb4887a587143108cb45835f7df7
      Revert "VP9_COPY_CONVOLVE_SSE2 optimization" · afd2f68d
      James Zern authored
      This reverts commit a5e97d87.
      
      Additionally:
      Revert "vpx_convolve_copy_sse2: fix win64"
      
      This reverts commit 22a8474f.
      
      This change performs poorly on various x86_64 devices, reducing
      performance by 1-3% at 1080p. Performance on Chromebook-like devices was
      mixed, from neutral to slightly negative, so there should be minimal change
      there.
      
      Change-Id: I95831233b4b84ee96369baa192a2d4cc7639658c
  13. 04 Aug, 2015 3 commits
  14. 03 Aug, 2015 6 commits
  15. 02 Aug, 2015 3 commits
  16. 01 Aug, 2015 2 commits
  17. 31 Jul, 2015 3 commits
      Factor inverse transform functions into vpx_dsp · e8b133c7
      Jingning Han authored
      This commit moves the inverse transform functions from the vp9
      folder to the vpx_dsp folder. The hybrid transform wrapper functions stay in
      the vp9 folder, since they involve codec-specific data structures.
      
      Change-Id: Ib066367c953d3d024c73ba65157bbd70a95c9ef8
      VP9_COPY_CONVOLVE_SSE2 optimization · a5e97d87
      Scott LaVarnway authored
      This function suffers from a couple of problems on small cores (tablets):
      - The load of the next iteration is blocked by the store of the previous iteration
      - 4k aliasing (between future stores and older loads)
      - Current small-core machines are in-order, so the store will spin the rehabQ until the load is finished
      Fixed by:
      - prefetching 2 lines ahead
      - unrolling the copy to 2 rows of the block
      - pre-loading all xmm registers before the loop, with the final stores after the loop
      The function is optimized by:
      copy_convolve_sse2 64x64 - 16%
      copy_convolve_sse2 32x32 - 52%
      copy_convolve_sse2 16x16 - 6%
      copy_convolve_sse2 8x8 - 2.5%
      copy_convolve_sse2 4x4 - 2.7%
      Credit goes to Tom Craver (tom.r.craver@intel.com) and Ilya Albrekht (ilya.albrekht@intel.com).
      
      Change-Id: I63d3428799c50b2bf7b5677c8268bacb9fc29671
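The fixes listed in this commit message (prefetching two lines ahead, copying two rows per iteration, and issuing loads before stores) can be sketched in C with SSE2 intrinsics. This is a simplified illustration under stated assumptions, not the committed code: the helper name `copy_w16_two_rows` is hypothetical, it handles only a 16-pixel-wide block with an even row count, and it does not reproduce the pre-loading of registers before the loop that the commit describes. A plain `memcpy` fallback is included so the sketch also builds on targets without SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <xmmintrin.h> /* _mm_prefetch */
#include <emmintrin.h> /* _mm_loadu_si128, _mm_storeu_si128 */
#endif

/* Hypothetical helper, not the libvpx routine: copies a block that is
 * 16 pixels wide and h rows tall (h must be even), two rows at a time. */
static void copy_w16_two_rows(const uint8_t *src, ptrdiff_t src_stride,
                              uint8_t *dst, ptrdiff_t dst_stride, int h) {
  assert(h > 0 && (h & 1) == 0);
  for (int y = 0; y < h; y += 2) {
#if defined(__SSE2__)
    if (y + 2 < h) {
      /* Hint the next pair of rows into cache (two lines ahead). */
      _mm_prefetch((const char *)(src + 2 * src_stride), _MM_HINT_T0);
      _mm_prefetch((const char *)(src + 3 * src_stride), _MM_HINT_T0);
    }
    /* Issue both loads before either store, so a load is not queued
     * directly behind the store from the same row. */
    const __m128i r0 = _mm_loadu_si128((const __m128i *)src);
    const __m128i r1 = _mm_loadu_si128((const __m128i *)(src + src_stride));
    _mm_storeu_si128((__m128i *)dst, r0);
    _mm_storeu_si128((__m128i *)(dst + dst_stride), r1);
#else
    /* Portable fallback with the same load-then-store ordering. */
    uint8_t r0[16], r1[16];
    memcpy(r0, src, 16);
    memcpy(r1, src + src_stride, 16);
    memcpy(dst, r0, 16);
    memcpy(dst + dst_stride, r1, 16);
#endif
    src += 2 * src_stride;
    dst += 2 * dst_stride;
  }
}
```

Whether this ordering helps depends on the microarchitecture: the revert dated 05 Aug, 2015 in this log reports a 1-3% slowdown on various x86_64 devices, so any such change should be benchmarked on the target hardware.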
      Refactor mips/dspr2 on convolution. · 7cfdc003
      Zoe Liu authored
      Change-Id: If59a39d5a92c261537342726f94bb7f7f26dfff3