  1. 06 Aug, 2013 2 commits
    • block error / x86inc mods · 62c6aa88
      Jim Bankoski authored
      Change-Id: Icb607745634e10b9bac5019d06661ece09fcdb40
    • reworked config for use_x86_inc · a93b115c
      Jim Bankoski authored
      Support enabling or disabling it. Moved the read out to configure.sh
      so that it's done once instead of in both make and config.
      
      Change-Id: I73a9190cf31de9f03e8a577f478fa522f8c01c8b
  2. 11 Jul, 2013 1 commit
  3. 01 Jul, 2013 1 commit
    • Quantize (64-bit only, for now) SSSE3 SIMD. · 7353ceab
      Ronald S. Bultje authored
      Total encoding time for the first 50 frames of bus (speed 0) @ 1500kbps
      goes from 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code is
      x86-64 only; it needs some minor modifications to be 32bit compatible,
      because it uses 15 xmm registers, whereas 32bit only has 8.
      
      Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904
  4. 29 Jun, 2013 1 commit
  5. 21 Jun, 2013 2 commits
    • Implement SSE2 block_error. · 54b2a596
      Ronald S. Bultje authored
      Change vp9_block_error() to return a 64bit error variable, change all
      callers to expect a 64bit return value (this will prevent overflows,
      which we basically don't check for at all right now). Remove duplicate
      block_error() function, which fixed that through truncation. Remove
      old (incompatible) mmx/sse2 block_error SIMD versions and replace with
      a new one that returns a 64bit value.
      
      Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to
      3min23, i.e. a 3% overall speedup.
      
      Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
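The overflow concern in the block_error commit above can be illustrated with a scalar sketch. This is not the libvpx code: the name `block_error_ref` and its signature are hypothetical, and it assumes 16-bit coefficients. A single worst-case difference squared (65535 * 65535) already exceeds INT32_MAX, which is why the sum is accumulated in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference (not the libvpx implementation):
 * sum of squared differences between original and dequantized
 * coefficients, accumulated in a 64-bit variable. */
static int64_t block_error_ref(const int16_t *coeff, const int16_t *dqcoeff,
                               int n) {
  int64_t error = 0;
  int i;
  assert(n >= 0);
  for (i = 0; i < n; i++) {
    /* The difference of two int16_t values always fits in an int. */
    const int diff = coeff[i] - dqcoeff[i];
    /* Widen before multiplying so the square cannot overflow 32 bits. */
    error += (int64_t)diff * diff;
  }
  return error;
}
```

Callers of such a function would store the result in an `int64_t` rather than an `int`, matching the commit's note that all callers were changed to expect a 64bit return value.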
    • Add subtract_block SSE2 version and unit test. · 25c588b1
      Ronald S. Bultje authored
      3% faster overall (3min35.0 to 3min28.5).
      
      Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e
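For readers unfamiliar with the operation, a block subtract computes the residual between source pixels and their prediction. The sketch below is a plain C reference under assumptions: the name `subtract_block_ref`, the parameter order, and the use of 8-bit pixels with 16-bit residuals are illustrative and not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference (not the libvpx implementation):
 * diff = src - pred for a rows x cols block, each buffer with its
 * own stride measured in elements. */
static void subtract_block_ref(int rows, int cols,
                               int16_t *diff, ptrdiff_t diff_stride,
                               const uint8_t *src, ptrdiff_t src_stride,
                               const uint8_t *pred, ptrdiff_t pred_stride) {
  int r, c;
  assert(rows >= 0 && cols >= 0);
  for (r = 0; r < rows; r++) {
    for (c = 0; c < cols; c++) {
      /* Result lies in [-255, 255], so int16_t is wide enough. */
      diff[c] = (int16_t)(src[c] - pred[c]);
    }
    diff += diff_stride;
    src += src_stride;
    pred += pred_stride;
  }
}
```

A SIMD version would typically be checked against a scalar loop like this one, which is the role a unit test plays for such a function.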
  6. 20 Jun, 2013 1 commit
    • Implement sse2 and ssse3 versions for all sub_pixel_variance sizes. · 8fb6c581
      Ronald S. Bultje authored
      Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 ->
      3min58). Specific changes to timings for each function compared to
      original assembly-optimized versions (or just new version timings if
      no previous assembly-optimized version was available):
      
      sse2   4x4:    99 ->   82 cycles
      sse2   4x8:           128 cycles
      sse2   8x4:           121 cycles
      sse2   8x8:   149 ->  129 cycles
      sse2   8x16:  235 ->  245 cycles (?)
      sse2  16x8:   269 ->  203 cycles
      sse2  16x16:  441 ->  349 cycles
      sse2  16x32:          641 cycles
      sse2  32x16:          643 cycles
      sse2  32x32: 1733 -> 1154 cycles
      sse2  32x64:         2247 cycles
      sse2  64x32:         2323 cycles
      sse2  64x64: 6984 -> 4442 cycles
      
      ssse3  4x4:           100 cycles (?)
      ssse3  4x8:           103 cycles
      ssse3  8x4:            71 cycles
      ssse3  8x8:           147 cycles
      ssse3  8x16:          158 cycles
      ssse3 16x8:   188 ->  162 cycles
      ssse3 16x16:  316 ->  273 cycles
      ssse3 16x32:          535 cycles
      ssse3 32x16:          564 cycles
      ssse3 32x32:          973 cycles
      ssse3 32x64:         1930 cycles
      ssse3 64x32:         1922 cycles
      ssse3 64x64:         3760 cycles
      
      Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
  7. 17 Jun, 2013 1 commit
  8. 29 May, 2013 1 commit
    • Compressed/uncompressed frame header changes. · 18c83b37
      Dmitry Kovalev authored
      Adding API to read/write uncompressed frame header bits (it is not final
      yet). Separate functions to read/write uncompressed header. Moving
      clr_type, error_resilient_mode, refresh_frame_context,
      frame_parallel_decoding_mode, frame_context_idx from compressed partition
      to uncompressed frame header.
      
      Change-Id: Id3ed8a387980c652ae147549412f4ec24a0a5bd0
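An API for reading and writing uncompressed header bits could look roughly like the sketch below. Everything here is an assumption for illustration: the type and function names (`bit_writer`, `write_literal`, and so on), the MSB-first bit order, and the field widths in the usage note are hypothetical, since the commit itself says the API and layout were not final.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical raw bit writer/reader; bits are stored MSB first. */
typedef struct { uint8_t *buf; size_t bit_pos; } bit_writer;
typedef struct { const uint8_t *buf; size_t bit_pos; } bit_reader;

static void write_bit(bit_writer *w, int bit) {
  const size_t byte = w->bit_pos / 8;
  const int shift = 7 - (int)(w->bit_pos % 8);
  if (shift == 7) w->buf[byte] = 0; /* clear each byte on first use */
  w->buf[byte] |= (uint8_t)((bit & 1) << shift);
  w->bit_pos++;
}

/* Write the low `bits` bits of `value`, most significant bit first. */
static void write_literal(bit_writer *w, unsigned value, int bits) {
  int i;
  assert(bits >= 0 && bits <= 32);
  for (i = bits - 1; i >= 0; i--) write_bit(w, (int)((value >> i) & 1));
}

static int read_bit(bit_reader *r) {
  const size_t byte = r->bit_pos / 8;
  const int shift = 7 - (int)(r->bit_pos % 8);
  r->bit_pos++;
  return (r->buf[byte] >> shift) & 1;
}

/* Read `bits` bits and return them as an unsigned value. */
static unsigned read_literal(bit_reader *r, int bits) {
  unsigned value = 0;
  int i;
  assert(bits >= 0 && bits <= 32);
  for (i = 0; i < bits; i++) value = (value << 1) | (unsigned)read_bit(r);
  return value;
}
```

With helpers like these, a flag such as error_resilient_mode could be written with `write_bit` and a small index such as frame_context_idx with `write_literal`, then read back in the same order without touching the compressed partition.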
  9. 28 May, 2013 1 commit
  10. 21 May, 2013 1 commit
    • Adding API to read/write uncompressed frame header bits. · df037b61
      Dmitry Kovalev authored
      The API is not final yet and can be changed. Actual layout of
      uncompressed frame part will be finalized later. Right now moving
      clr_type, error_resilient_mode, refresh_frame_context,
      frame_parallel_decoding_mode from first compressed partition to
      uncompressed frame part.
      
      Change-Id: I3afc5d4ea92c5a114f4c3d88f96858cccc15b76e
  11. 03 May, 2013 1 commit
  12. 01 May, 2013 1 commit
  13. 26 Apr, 2013 1 commit
    • Normalize more intrinsic filenames · 863601c5
      Johann authored
      vp9_dequantize_x86 has only sse2 functions.
      
      vp9_dct_sse2_intrinsics has no namespace collision and can drop
      _intrinsics.
      
      vp9_idct_mmx.h is unused.
      
      Change-Id: Ic16e31fb372a1d1e841a62ecb4189fe8f95808ec
  14. 25 Apr, 2013 1 commit
  15. 16 Apr, 2013 2 commits
  16. 28 Feb, 2013 2 commits
  17. 27 Feb, 2013 2 commits
    • Move eob from BLOCKD to MACROBLOCKD. · e8c74e2b
      Ronald S. Bultje authored
      Consistent with VP8.
      
      Change-Id: I8c316ee49f072e15abbb033a80e9c36617891f07
    • Remove unused vp9_copy32xn · 7ad8dbe4
      John Koleszar authored
      This function was part of an optimization used in VP8 that required
      caching two macroblocks. This is unused in VP9, and might not
      survive refactoring to support superblocks, so removing it for now.
      
      Change-Id: I744e585206ccc1ef9a402665c33863fc9fb46f0d
  18. 15 Feb, 2013 1 commit
  19. 09 Feb, 2013 2 commits
  20. 26 Dec, 2012 1 commit
  21. 05 Dec, 2012 1 commit
  22. 04 Dec, 2012 1 commit
    • Fix the build with MSVC · 6a5e6e05
      Yaowu Xu authored
      1. remove the dependency on the nonexistent "vp9_temporal_filter_x86.h"
      2. prefix filenames with vp9_ in obj_int_extract.bat to reflect the
      change of the actual filenames.
      
      Change-Id: Ib1b4d96ac41788f76917764a6722d8461c857302
  23. 03 Dec, 2012 1 commit
  24. 29 Nov, 2012 1 commit
  25. 27 Nov, 2012 1 commit
    • Add vp9_ prefix to all vp9 files · fcccbcbb
      John Koleszar authored
      Support for gyp, which doesn't allow multiple objects in the same
      static library to have the same basename.
      
      Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
  26. 01 Nov, 2012 1 commit