1. 05 Mar, 2016 1 commit
  2. 18 Feb, 2016 1 commit
  3. 17 Feb, 2016 8 commits
  4. 16 Feb, 2016 2 commits
  5. 29 Jan, 2016 3 commits
    • Enable sse2 version of inverse wht for hbd build · 0aef1bc8
      Yaowu Xu authored
      Change-Id: If8f5efd701a11c8a7ad3078d10ec3cd0fe27667e
      0aef1bc8
    • SSSE3 idct8x8 functions for highbitdepth build · b2297108
      Yaowu Xu authored
      This commit changes the SSSE3 optimized idct8x8 functions to work with
      the highbitdepth build.
      
      With this commit and the previous one that enabled the SSSE3 idct32x32
      functions, tests showed virtually no difference in decoding speed for
      file fdJc1_IBKJA.248.webm between the build with the
      --enable-vp9-highbitdepth option and the build without it.
      
      Change-Id: Ibe0634149ec70e8b921e6b30171664b8690a9c45
      b2297108
    • Enable hbd_build to use SSSE3 optimized functions · aac1ef7f
      Yaowu Xu authored
      This commit changes the SSSE3 assembly functions for idct32x32 to
      support the highbitdepth build.
      
      On test clip fdJc1_IBKJA.248.webm, this cuts the speed difference
      between the hbd and lbd builds from 3-4% to 1-2%.
      
      Change-Id: Ic3390e0113bc1ca5bba8ec80d1795ad31b484fca
      aac1ef7f
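The commit messages above do not say how the SSSE3 idct functions were adapted. One relevant difference between the two builds is the coefficient type: libvpx selects a 32-bit `tran_low_t` when `CONFIG_VP9_HIGHBITDEPTH` is set and a 16-bit one otherwise, while 8-bit SIMD transforms typically operate on 16-bit lanes. The sketch below shows, in plain C, the kind of narrowing step such a routine could perform on load. The helper name `pack_coeffs_s16` is hypothetical, and the macro default is set here only so the example compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: default to the high-bitdepth configuration. */
#ifndef CONFIG_VP9_HIGHBITDEPTH
#define CONFIG_VP9_HIGHBITDEPTH 1
#endif

#if CONFIG_VP9_HIGHBITDEPTH
typedef int32_t tran_low_t; /* coefficients are 32-bit in the hbd build */
#else
typedef int16_t tran_low_t; /* coefficients are 16-bit in the lbd build */
#endif

/* Hypothetical helper: narrow n coefficients to int16_t with saturation,
 * so that a 16-bit-lane transform can consume them. */
void pack_coeffs_s16(const tran_low_t *in, int16_t *out, int n) {
  assert(n >= 0);
  for (int i = 0; i < n; ++i) {
    int32_t v = in[i];
    if (v > INT16_MAX) v = INT16_MAX;
    if (v < INT16_MIN) v = INT16_MIN;
    out[i] = (int16_t)v;
  }
}
```

In a SIMD version the same narrowing would likely be done with a saturating pack instruction while loading, which is one plausible reason the remaining hbd/lbd gap is small.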
  6. 23 Dec, 2015 3 commits
  7. 19 Dec, 2015 1 commit
  8. 18 Dec, 2015 3 commits
  9. 17 Dec, 2015 1 commit
    • Code cleanup of sad4xN(_avg)_sse · b158d9a6
      Jian Zhou authored
      Replace MMX with SSE2 and reduce psadbw ops, which may help Silvermont.
      
      Change-Id: Ic7aec15245c9e5b2f3903dc7631f38e60be7c93d
      b158d9a6
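For readers unfamiliar with the functions named in this commit, here is a plain-C sketch of what a 4-wide sum of absolute differences (SAD) and its `_avg` variant compute. The function names `sad4xn_ref` and `sad4xn_avg_ref` are hypothetical, and the layout of `second_pred` (contiguous, 4 bytes per row) is an assumption of this sketch rather than something stated in the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of |src - ref| over a block 4 pixels wide and `height` rows tall. */
unsigned int sad4xn_ref(const uint8_t *src, ptrdiff_t src_stride,
                        const uint8_t *ref, ptrdiff_t ref_stride,
                        int height) {
  unsigned int sad = 0;
  assert(height > 0);
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < 4; ++x) sad += (unsigned int)abs(src[x] - ref[x]);
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}

/* _avg variant: the reference is first averaged (rounding half up) with a
 * second prediction, then compared against the source. */
unsigned int sad4xn_avg_ref(const uint8_t *src, ptrdiff_t src_stride,
                            const uint8_t *ref, ptrdiff_t ref_stride,
                            const uint8_t *second_pred, int height) {
  unsigned int sad = 0;
  assert(height > 0);
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < 4; ++x) {
      const int avg = (ref[x] + second_pred[x] + 1) >> 1;
      sad += (unsigned int)abs(src[x] - avg);
    }
    src += src_stride;
    ref += ref_stride;
    second_pred += 4; /* assumed contiguous: 4 bytes per row */
  }
  return sad;
}
```

The `psadbw` instruction sums absolute byte differences over 8 bytes at a time, and the SSE2 form covers two such groups in one XMM register. Packing four 4-pixel rows into one register before a single `psadbw` is presumably how the op count can be reduced.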
  10. 14 Dec, 2015 1 commit
  11. 11 Dec, 2015 1 commit
    • Code cleanup of tm_predictor_32x32 · 88120481
      Jian Zhou authored
      Reallocate the xmm register usage so that ARCH_X86_64 is not required.
      Reduce memory access to the left neighbor by half.
      Speed up by a single-digit percentage on big-core machines.
      
      Change-Id: I392515ed8e8aeb02e6a717b3966b1ba13f5be990
      88120481
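As background for the tm_predictor commits in this log, the following is a scalar sketch of TM (TrueMotion) intra prediction: each output pixel is `left[r] + above[c] - top_left`, clamped to the 8-bit range. The name `tm_predictor_ref` and the explicit `top_left` parameter are choices made for this sketch, not the library's signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an integer to the 8-bit pixel range [0, 255]. */
static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Scalar TM prediction for a bs x bs block.
 * above: bs pixels from the row above the block
 * left:  bs pixels from the column left of the block
 * top_left: the pixel diagonally above-left of the block */
void tm_predictor_ref(uint8_t *dst, ptrdiff_t stride, int bs,
                      const uint8_t *above, const uint8_t *left,
                      uint8_t top_left) {
  assert(bs > 0 && stride >= bs);
  for (int r = 0; r < bs; ++r) {
    for (int c = 0; c < bs; ++c)
      dst[c] = clip_pixel(left[r] + above[c] - top_left);
    dst += stride;
  }
}
```

Because every row reads one `left` value, a SIMD version that loads several left neighbors at once and reuses them across rows makes fewer memory accesses, which matches the "reduce memory access to the left neighbor" note above.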
  12. 10 Dec, 2015 1 commit
    • SSE2 based h_predictor_32x32 · c90a8a1a
      Jian Zhou authored
      Relocate the function from SSSE3 to SSE2, unroll the loop from 16 to 8,
      and reduce memory access to the left neighbor.
      Speed up by a single-digit percentage in ./test_intra_pred_speed on
      big-core machines.
      
      Change-Id: I2b7fc95ffc0c42145be2baca4dc77116dff1c960
      c90a8a1a
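The h_predictor functions touched by this and the related commits implement horizontal intra prediction. A minimal scalar sketch, with the hypothetical name `h_predictor_ref`, looks like this: every pixel in row `r` is a copy of the left neighbor `left[r]`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar horizontal prediction for a bs x bs block: fill each row with the
 * pixel immediately to its left. */
void h_predictor_ref(uint8_t *dst, ptrdiff_t stride, int bs,
                     const uint8_t *left) {
  assert(bs > 0 && stride >= bs);
  for (int r = 0; r < bs; ++r) {
    memset(dst, left[r], (size_t)bs);
    dst += stride;
  }
}
```

A SIMD version broadcasts each left pixel across a register and stores whole rows at once. Handling more rows per loop iteration, as the unrolling described above does, lets several left pixels be fetched with one load.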
  13. 08 Dec, 2015 1 commit
    • Re-enable SSE2 based intra 4x4 prediction · aa5b517a
      Jian Zhou authored
      The 4x4 intra predictor implemented with MMX is replaced with SSE2.
      The segfault in change 315561 when decoding VP8 is fixed.
      
      Change-Id: I083a7cb4eb8982954c20865160f91ebec777ec76
      aa5b517a
  14. 05 Dec, 2015 1 commit
  15. 04 Dec, 2015 4 commits
    • Speed up h_predictor_16x16 · e86c7c86
      Jian Zhou authored
      Relocate the function from SSSE3 to SSE2, unroll the loop from 8 to 4,
      and reduce memory access to the left neighbor.
      Speed up by >20% in ./test_intra_pred_speed.
      
      Change-Id: Ie48229c2e32404706b722442942c84983bda74cc
      e86c7c86
    • Speed up h_predictor_8x8 · da3f08fa
      Jian Zhou authored
      Relocate the function from SSSE3 to SSE2, unroll the loop from 4 to 2,
      and reduce memory access to the left neighbor.
      Speed up by >20% in ./test_intra_pred_speed.
      
      Change-Id: Ib9f1846819783b6e05e2a310c930eb844b2b4d2e
      da3f08fa
    • MMX in intra 8x8 prediction replaced with SSE2 · aa2764ab
      Jian Zhou authored
      The 8x8 intra predictor implemented with MMX is replaced with SSE2.
      
      Change-Id: I0c90e7c1e1e6942489ac2bfe58903b728aac7a52
      aa2764ab
    • MMX in intra 4x4 prediction replaced with SSE2 · 89a1efa4
      Jian Zhou authored
      The 4x4 intra predictor implemented with MMX is replaced with SSE2.
      
      Change-Id: Id57da2a7c38832d0356bc998790fc1989d39eafc
      89a1efa4
  16. 02 Dec, 2015 1 commit
  17. 30 Nov, 2015 1 commit
    • SSE2 speed up of h_predictor_4x4 · 9d29d762
      Jian Zhou authored
      Relocate h_predictor_4x4 from SSSE3 to SSE2 with XMM registers.
      Speed up by ~25% in ./test_intra_pred_speed.
      
      Change-Id: I64e14c13b482a471449be3559bfb0da45cf88d9d
      9d29d762
  18. 25 Nov, 2015 1 commit
  19. 19 Nov, 2015 1 commit
    • Speed up tm_predictor_4x4 · 79b68626
      Jian Zhou authored
      tm_predictor_4x4 is implemented with SSE2 using XMM registers.
      Speed up by ~25% in ./test_intra_pred_speed.
      
      Change-Id: I25074b78d476a2cb17f81cf654bdfd80df2070e0
      79b68626
  20. 18 Nov, 2015 1 commit
  21. 11 Nov, 2015 1 commit
  22. 10 Nov, 2015 1 commit
  23. 20 Oct, 2015 1 commit
    • Optimize vpx_quantize_{b,b_32x32} assembler. · 9cfba09a
      Geza Lore authored
      Added optimization of the 8 bit assembly quantizer routines. This makes
      these functions up to 100% faster, depending on encoding parameters.
      
      This patch makes the encoder faster in both the high bitdepth and 8 bit
      configurations. In the high bitdepth configuration, it affects profile 0
      only.
      
      Based on my profiling using 1080p input the net gain is between 1-3% for
      the 8 bit config, and around 2.5-4.5% for the high bitdepth config,
      depending on target bitrate. The difference between the 8 bit and high
      bitdepth configurations for the same encoder run is reduced by 1% in all
      cases I have profiled.
      
      Change-Id: I86714a6b7364da20cd468cd784247009663a5140
      9cfba09a