1. 04 Oct, 2013 1 commit
      Giving consistent names to IDCT/IWHT functions. · 3a060257
      Dmitry Kovalev authored
      The idea is to have the following names for each transform size:
      
      vp9_idct4x4_add
        vp9_idct4x4_1_add
        vp9_idct4x4_10_add
        vp9_idct4x4_16_add
      
      vp9_idct8x8_add
        vp9_idct8x8_1_add
        vp9_idct8x8_10_add
        vp9_idct8x8_64_add
      
      etc. for 16x16 and 32x32
      
      The actual list of renames in this patch:
      
      vp9_idct_add_lossless     -> vp9_iwht4x4_add
      vp9_short_iwalsh4x4_add   -> vp9_iwht4x4_16_add
      vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add
      
      vp9_idct_add            -> vp9_idct4x4_add
      vp9_short_idct4x4_add   -> vp9_idct4x4_16_add
      vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add
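
      The naming scheme suggests a per-size wrapper that forwards to a variant
      named after the number of coefficients it handles. A minimal sketch in C
      of how such a wrapper might pick an 8x8 variant, assuming the choice is
      driven by an end-of-block count. The eob parameter, the thresholds, and
      every name below are illustrative assumptions, not taken from this patch:

```c
/* Hypothetical tags for the three 8x8 variants named in the scheme. */
typedef enum {
  IDCT8X8_1 = 1,   /* DC coefficient only */
  IDCT8X8_10 = 10, /* at most 10 leading coefficients */
  IDCT8X8_64 = 64  /* full block */
} Idct8x8Variant;

/* Assumed rule: pick the cheapest variant that covers `eob` coefficients. */
Idct8x8Variant pick_idct8x8_variant(int eob) {
  if (eob <= 1) return IDCT8X8_1;
  if (eob <= 10) return IDCT8X8_10;
  return IDCT8X8_64;
}
```

      Under these assumptions the suffix doubles as the upper bound of the
      coefficient count, which keeps the names self-describing.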
      
      Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1
  2. 02 Oct, 2013 3 commits
  3. 30 Sep, 2013 1 commit
      Removing vp9_add_constant_residual_{8x8, 16x16, 32x32} functions. · 548671dd
      Dmitry Kovalev authored
      These functions are no longer needed. The only one actually in use
      was vp9_add_constant_residual_32x32, and the addition of
      vp9_short_idct32x32_1_add eliminates that single usage. An SSE2-optimized
      version of vp9_short_idct32x32_1_add will be added in the next patch set;
      for now there is only a C implementation. All idct functions are now
      implemented in a consistent manner.
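
      When only the DC coefficient is nonzero, an inverse DCT yields the same
      residual for every pixel, so the work reduces to adding one clamped
      constant to the block. A minimal sketch in C of that final step, assuming
      8-bit pixels; the helper names are hypothetical and the derivation of the
      constant from the DC coefficient is left out:

```c
#include <stdint.h>

/* Clamp an integer to the 8-bit pixel range. */
uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical helper: add one residual value to every pixel of a
 * size x size block whose rows are `stride` bytes apart. */
void add_constant_to_block(uint8_t *dest, int stride, int size, int residual) {
  for (int r = 0; r < size; ++r) {
    for (int c = 0; c < size; ++c)
      dest[c] = clip_pixel(dest[c] + residual);
    dest += stride;
  }
}
```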
      
      Change-Id: I63df79a13cf62aa2c9360a7a26933c100f9ebda3
  4. 27 Sep, 2013 1 commit
  5. 26 Sep, 2013 1 commit
  6. 25 Sep, 2013 1 commit
  7. 12 Sep, 2013 1 commit
  8. 11 Sep, 2013 1 commit
  9. 05 Sep, 2013 1 commit
      Use saturated addition in SSSE3 of 32x32 quant · 458c2833
      Jingning Han authored
      The 32x32 forward transform can potentially reach a peak coefficient
      value close to 32700, while the rounding factor can go up to 610.
      This could cause an overflow in the SSSE3 implementation of the 32x32
      quantization process.
      
      This commit resolves this issue by replacing the addition operations
      with saturated addition operations in 32x32 block quantization.
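
      The difference between the two kinds of addition can be sketched in
      scalar C, assuming signed 16-bit lanes (32700 + 610 = 33310, which is
      above the signed 16-bit maximum of 32767). The function names are
      hypothetical; in SSE code the corresponding intrinsics would presumably
      be _mm_add_epi16 (wrapping) and _mm_adds_epi16 (saturating):

```c
#include <stdint.h>

/* Saturating signed 16-bit add: the result is clamped to the
 * representable range instead of wrapping around. */
int16_t sat_add_i16(int16_t a, int16_t b) {
  int32_t sum = (int32_t)a + (int32_t)b;
  if (sum > INT16_MAX) return INT16_MAX;
  if (sum < INT16_MIN) return INT16_MIN;
  return (int16_t)sum;
}

/* Wrapping 16-bit add, shown for contrast. The final conversion is
 * implementation-defined in C for out-of-range values; common compilers
 * wrap modulo 2^16. */
int16_t wrap_add_i16(int16_t a, int16_t b) {
  uint16_t sum = (uint16_t)((uint16_t)a + (uint16_t)b);
  return (int16_t)sum;
}
```

      With a coefficient of 32700 and a rounding factor of 610, the saturating
      version returns 32767, while the wrapping version yields a negative value
      on typical two's-complement targets.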
      
      Change-Id: Id6b98996458e16c5b6241338ca113c332bef6e70
  10. 04 Sep, 2013 2 commits
  11. 01 Sep, 2013 1 commit
      Fix 32x32 forward transform SSE2 version · 3cf46fa5
      Jingning Han authored
      This commit fixes a potential overflow in the SSE2 implementation
      of the 32x32 forward DCT. It resolves the corrupted coded frames
      at the borders of scenes.
      
      Change-Id: If87eef2d46209269f74ef27e7295b6707fbf56f9
  12. 29 Aug, 2013 1 commit
      Fix overflow issue in SSSE3 32x32 quantization · abff6788
      Jingning Han authored
      The 32x32 quantization process can produce intermediate results that
      exceed the 16-bit range, thereby causing enc/dec mismatch. This commit
      fixes this overflow in the SSSE3 implementation, as well as in the
      prototype, of 32x32 quantization.
      
      This fixes issue 607 from webm@googlecode.
      
      Change-Id: I85635e6ca236b90c3dcfc40d449215c7b9caa806
  13. 27 Aug, 2013 1 commit
  14. 26 Aug, 2013 3 commits
  15. 22 Aug, 2013 1 commit
      Add neon optimize vp9_short_idct10_16x16_add. · 4082bf9d
      hkuang authored
      vp9_short_idct10_16x16_add handles blocks that have nonzero data only in
      the top-left 4x4 block; all other coefficients are 0. Many calculations
      are therefore unnecessary and can be skipped to save instructions.
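
      Why the shortcut is safe can be sketched in C: a linear 1-D transform
      maps an all-zero row to an all-zero row, so the row pass only has to
      touch the first 4 rows. This is a sketch under stated assumptions, not
      the NEON code from this commit: the running-sum transform is a
      placeholder for the real 1-D IDCT, and all names are hypothetical:

```c
#include <stdint.h>
#include <string.h>

enum { BLOCK = 16, NONZERO_ROWS = 4 };

/* Placeholder linear 1-D transform (running sum); stands in for the
 * real 1-D IDCT, which is also linear. */
void toy_transform_1d(const int32_t *in, int32_t *out) {
  int32_t acc = 0;
  for (int i = 0; i < BLOCK; ++i) {
    acc += in[i];
    out[i] = acc;
  }
}

/* Reference row pass: transforms all 16 rows. */
void row_pass_full(const int32_t *in, int32_t *out) {
  for (int r = 0; r < BLOCK; ++r)
    toy_transform_1d(in + r * BLOCK, out + r * BLOCK);
}

/* Reduced row pass: valid only when rows 4..15 of `in` are all zero. */
void row_pass_top_left_only(const int32_t *in, int32_t *out) {
  for (int r = 0; r < NONZERO_ROWS; ++r)
    toy_transform_1d(in + r * BLOCK, out + r * BLOCK);
  /* Zero rows transform to zero rows, so the rest is just cleared. */
  memset(out + NONZERO_ROWS * BLOCK, 0,
         (BLOCK - NONZERO_ROWS) * BLOCK * sizeof(*out));
}
```

      For any input confined to the top-left 4x4 block, both row passes
      produce identical output while the reduced one runs a quarter of the
      1-D transforms.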
      
      Change-Id: I6e30a3fee1ece5af7f258532416d0bfddd1143f0
  16. 21 Aug, 2013 1 commit
  17. 20 Aug, 2013 1 commit
      Add neon optimize vp9_short_idct10_8x8_add. · 37cda6dc
      hkuang authored
      vp9_short_idct10_8x8_add handles blocks that have nonzero data only in
      the top-left 4x4 block; all other coefficients are 0. Several calculations
      are therefore unnecessary and can be skipped to save instructions.
      
      Change-Id: I34fda95e29082b789aded97c2df193991c2d9195
  18. 15 Aug, 2013 1 commit
  19. 14 Aug, 2013 4 commits
  20. 12 Aug, 2013 1 commit
      SSE2 high precision 32x32 forward DCT · 78136edc
      Jingning Han authored
      Enable the SSE2 implementation of the high precision 32x32 forward DCT.
      Intermediate results are 32 bits wide. The run time drops from
      32126 cycles to 13442 cycles.
      
      Change-Id: Ib5ccafe3176c65bd6f2dbdef790bd47bbc880e56
  21. 07 Aug, 2013 1 commit
  22. 06 Aug, 2013 6 commits
  23. 05 Aug, 2013 2 commits
  24. 02 Aug, 2013 2 commits
  25. 01 Aug, 2013 1 commit
      Remove unused vp9_short_idct10_32x32_add · 67719abd
      Jingning Han authored
      The inverse 32x32 transform detects all-zero entries in groups of
      8 rows and skips the corresponding computations in the first 1-D
      operation. The function vp9_short_idct10_32x32_add works differently
      and is not used anywhere, so it is removed.
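
      The zero detection described here can be sketched in C as a first (row)
      pass that checks each group of 8 rows and skips the group when it holds
      no nonzero coefficient. The running-sum transform is a placeholder for
      the real 1-D inverse transform, and all names are hypothetical:

```c
#include <stdint.h>
#include <string.h>

enum { SIZE = 32, GROUP = 8 };

/* Placeholder linear 1-D transform (running sum), not the VP9 IDCT. */
void toy_row_transform(const int32_t *in, int32_t *out) {
  int32_t acc = 0;
  for (int i = 0; i < SIZE; ++i) {
    acc += in[i];
    out[i] = acc;
  }
}

/* Returns 1 if all `count` values are zero, 0 otherwise. */
int all_zero(const int32_t *v, int count) {
  for (int i = 0; i < count; ++i)
    if (v[i] != 0) return 0;
  return 1;
}

/* Row pass over a 32x32 block; returns how many 8-row groups were skipped. */
int row_pass_skip_zero_groups(const int32_t *in, int32_t *out) {
  int skipped = 0;
  for (int g = 0; g < SIZE; g += GROUP) {
    const int32_t *src = in + g * SIZE;
    int32_t *dst = out + g * SIZE;
    if (all_zero(src, GROUP * SIZE)) {
      /* The output of a linear transform on zero input is zero. */
      memset(dst, 0, GROUP * SIZE * sizeof(*dst));
      ++skipped;
      continue;
    }
    for (int r = 0; r < GROUP; ++r)
      toy_row_transform(src + r * SIZE, dst + r * SIZE);
  }
  return skipped;
}
```

      Because the check happens at run time, one function covers sparse and
      dense blocks alike, which is consistent with a separate 10-coefficient
      variant being unnecessary for this size.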
      
      Change-Id: Ic4fad422debbde7b6b6ffed47c69fbd4268a906c