  1. 28 Jul, 2015 1 commit
  2. 15 May, 2015 1 commit
    • rename vp9_dct32x32_avx2.c to vp9_dct32x32_avx2_impl.h · 4ec47249
      James Zern authored
      This file shouldn't be built directly; it is included in vp9_dct_avx2.c
      to create a non-high-bitdepth and a high-bitdepth version.
      
      Silences missing-prototype warnings for the unused FDCT32x32* functions.
      
      Change-Id: I4c19935c0e035b393be513bde735e9a78064a494
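The commit above describes one body of code being compiled into two variants. A minimal single-file sketch of that idea follows. The commit itself achieves this by including the `_impl.h` file from vp9_dct_avx2.c; the macro here is only a stand-in for that include, and `DEFINE_SUM4`, `sum4_lowbd`, `sum4_highbd`, and `sum4_selfcheck` are hypothetical names, not libvpx functions.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a shared implementation body: the same code is
 * instantiated once per accumulator type. All names are hypothetical. */
#define DEFINE_SUM4(NAME, ACC_T)                 \
  static ACC_T NAME(const int16_t *in) {         \
    ACC_T acc = 0;                               \
    for (int i = 0; i < 4; ++i) {                \
      acc = (ACC_T)(acc + in[i]);                \
    }                                            \
    return acc;                                  \
  }

/* Two variants generated from the one body above. */
DEFINE_SUM4(sum4_lowbd, int16_t)
DEFINE_SUM4(sum4_highbd, int32_t)

/* Small self-check: both variants agree on inputs that fit in 16 bits. */
static void sum4_selfcheck(void) {
  const int16_t in[4] = {1, 2, 3, 4};
  assert(sum4_lowbd(in) == 10);
  assert(sum4_highbd(in) == 10);
}
```

Because the shared body is never compiled on its own, no unused, prototype-less functions are emitted from it, which is the kind of warning the commit message mentions.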
  3. 28 Jul, 2014 1 commit
    • Fix bug 805 · 4ba92dc5
      levytamar82 authored
      Remove all the redundant dct functions (dct4x4, dct8x8) in avx2
      except dct32x32; those functions were originally copied from dct_sse2.
      
      Change-Id: I742576fbf5175f3ac09f2076976a9247b259323e
  4. 13 Feb, 2014 1 commit
  5. 06 Feb, 2014 1 commit
  6. 28 Jan, 2014 1 commit
  7. 21 Nov, 2013 2 commits
  8. 13 Nov, 2013 1 commit
    • Fix an overflow issue in SSE2 forward ADST · fabc7836
      Jingning Han authored
      The step that sums three input samples could cause the
      intermediate result to exceed the 16-bit limit when operating as the
      second 1-D transform. This commit fixes the issue.
      
      Change-Id: Iaf512449ac2d25ddd8a806d760afab362c62a516
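The overflow described above can be illustrated in scalar C. This is a sketch of the arithmetic only, assuming a wrapping 16-bit lane add; it is not the code from the commit, and `wrap16`, `sum3_narrow`, `sum3_wide`, and `sum3_selfcheck` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a value modulo 2^16 and reinterpret it as signed, mimicking
 * what a wrapping 16-bit SIMD lane holds after an add. */
static int16_t wrap16(int32_t v) {
  uint16_t u = (uint16_t)v; /* modulo 2^16, well defined */
  return (u & 0x8000u) ? (int16_t)((int32_t)u - 65536) : (int16_t)u;
}

/* Sum of three samples with every intermediate kept in 16 bits. */
static int16_t sum3_narrow(int16_t a, int16_t b, int16_t c) {
  return wrap16((int32_t)wrap16((int32_t)a + b) + c);
}

/* Same sum with a 32-bit intermediate, which cannot overflow here. */
static int32_t sum3_wide(int16_t a, int16_t b, int16_t c) {
  return (int32_t)a + b + c;
}

static void sum3_selfcheck(void) {
  /* Small inputs: both versions agree. */
  assert(sum3_narrow(100, 200, 300) == 600);
  assert(sum3_wide(100, 200, 300) == 600);
  /* Larger inputs: the true sum 36000 does not fit in int16_t. */
  assert(sum3_wide(12000, 12000, 12000) == 36000);
  assert(sum3_narrow(12000, 12000, 12000) == -29536);
}
```

Each input is a valid 16-bit sample, yet the three-way sum exceeds 32767, so the narrow version silently wraps to a negative value.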
  9. 24 Oct, 2013 1 commit
  10. 23 Oct, 2013 4 commits
  11. 21 Oct, 2013 1 commit
  12. 18 Oct, 2013 2 commits
  13. 15 Oct, 2013 1 commit
  14. 24 Sep, 2013 1 commit
  15. 12 Aug, 2013 1 commit
    • SSE2 high precision 32x32 forward DCT · 78136edc
      Jingning Han authored
      Enable the SSE2 implementation of the high precision 32x32 forward DCT.
      The intermediate stacks are 32 bits wide. The run-time goes down from
      32126 cycles to 13442 cycles.
      
      Change-Id: Ib5ccafe3176c65bd6f2dbdef790bd47bbc880e56
  16. 06 Aug, 2013 2 commits
  17. 10 Jul, 2013 1 commit
    • SSE2 16x16 ADST/DCT hybrid transform · 11442353
      Jingning Han authored
      This commit enables 16x16 ADST/DCT forward hybrid transform using SSE2
      operations. It reduces the runtime from 5433 cycles to 1621 cycles, at
      no compression performance loss.
      
      Change-Id: I75fd7f1984e9e28846af459f810ff0d6ae125230
  18. 03 Jul, 2013 1 commit
    • Refactor SSE2 8x8 functional units · 2cb75c96
      Jingning Han authored
      These serve as building blocks for SSE2 8x8 and 16x16 ADST/DCT
      hybrid transform coding.
      
      Change-Id: I4089a754c66e0c986f67d9b8ec4dfb9627ad430d
  19. 29 Jun, 2013 2 commits
  20. 28 Jun, 2013 1 commit
  21. 26 Jun, 2013 1 commit
    • fixed a compiling problem with MSVC win32 build · 60dc7375
      Yaowu Xu authored
      The aligned array in the parameter list caused the win32 build to report
      a C2719 error. This commit fixed the issue by making the parameter
      type a pointer instead of an array.
      
      Change-Id: I4ed654ce4eba2db4995d9cdc136c68e9a6acc992
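The shape of this fix can be sketched as follows. MSVC reports C2719 when a formal parameter carries `__declspec(align(...))`; the rejected form is shown only in a comment so the example stays portable. The function names are hypothetical, and the caller-side alignment uses C11 `_Alignas` as an assumption rather than whatever macro the original code used.

```c
#include <assert.h>
#include <stdint.h>

/* Rejected by 32-bit MSVC with C2719 (comment only, not compiled):
 *   int32_t sum8(__declspec(align(16)) int16_t in[8]);
 */

/* Pointer parameter instead: the function no longer requests alignment
 * on the parameter itself; the caller provides an aligned buffer. */
static int32_t sum8(const int16_t *in) {
  int32_t total = 0;
  for (int i = 0; i < 8; ++i) {
    total += in[i];
  }
  return total;
}

static int32_t sum8_demo(void) {
  /* Alignment is declared where the storage lives (C11 _Alignas). */
  _Alignas(16) int16_t block[8] = {1, 2, 3, 4, 5, 6, 7, 8};
  assert(((uintptr_t)block % 16) == 0);
  return sum8(block);
}
```

Moving the alignment request from the parameter to the caller's declaration keeps the data aligned while leaving the signature acceptable to the compiler.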
  22. 25 Jun, 2013 3 commits
  23. 26 Apr, 2013 2 commits
    • Whitespace nit · e3038ca8
      Johann authored
      Change-Id: I7486970c57cda75d26ec2c6d1f36bd668c955f66
    • Normalize more intrinsic filenames · 863601c5
      Johann authored
      vp9_dequantize_x86 has only sse2 functions.
      
      vp9_dct_sse2_intrinsics has no namespace collision and can drop
      _intrinsics.
      
      vp9_idct_mmx.h is unused.
      
      Change-Id: Ic16e31fb372a1d1e841a62ecb4189fe8f95808ec
  24. 16 Apr, 2013 2 commits
  25. 18 Mar, 2013 1 commit
    • Optimize 8x8 idct function · 6344c84c
      Yunqing Wang authored
      Wrote SSE2 versions of vp9_short_idct8x8 and vp9_short_idct10_8x8.
      Compared to the C version, the SSE2 version is 2x faster. The decoder
      test didn't show a noticeable gain since the 8x8 idct doesn't take much
      of the decoding time (less than 1% in my test).
      
      Change-Id: I56313e18cd481700b3b52c4eda5ca204ca6365f3
  26. 15 Mar, 2013 1 commit
    • Faster vp9_short_fdct16x16. · 4418b790
      Christian Duvivier authored
      Scalar path is about 1.5x faster (3.1% overall encoder speedup).
      SSE2 path is about 7.2x faster (7.8% overall encoder speedup).
      
      Change-Id: I06da5ad0cdae2488431eabf002b0d898d66d8289
  27. 28 Feb, 2013 2 commits