  1. 29 Apr, 2011 1 commit
  2. 21 Apr, 2011 1 commit
    • keep values in registers during quantization · 508ae1b3
      Johann authored
      add an sse4 quantizer so we can use pinsrw/pextrw and keep values in xmm
      registers instead of proxying through the stack. and as long as we're
      bumping up, use some ssse3 instructions in the EOB detection (see ssse3
      fast quantizer)
      pick up about a percent on 32bit and about two on 64bit.
      
      Change-Id: If15abba0e8b037a1d231c0edf33501545c9d9363
  3. 13 Apr, 2011 1 commit
    • Refactor lookahead ring buffer · 88841f10
      John Koleszar authored
      This patch cleans up the source buffer storage and copy mechanism to
      allow access through a standard push/pop/peek interface. This approach
      also avoids an extra copy in the case where the source is not a
      multiple of 16, fixing issue #102.
      
      Change-Id: I05808c39f5743625cb4c7af54cc841b9b10fdbd9
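The push/pop/peek interface described in 88841f10 can be pictured with a small ring buffer. The following is a minimal sketch under stated assumptions, not the libvpx code: the names `ring_t`, `ring_push`, `ring_pop` and `ring_peek`, the capacity, and the use of an `int` as a stand-in for a source frame are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAPACITY 4 /* hypothetical lookahead depth */

typedef struct {
    int    frames[RING_CAPACITY]; /* stand-in for stored source frames */
    size_t read_idx;              /* oldest entry */
    size_t write_idx;             /* next free slot */
    size_t count;                 /* entries currently stored */
} ring_t;

static void ring_init(ring_t *r) {
    assert(r != NULL);
    r->read_idx = r->write_idx = r->count = 0;
}

/* Store a frame at the write position. Returns 0, or -1 if the ring is full. */
static int ring_push(ring_t *r, int frame) {
    assert(r != NULL);
    if (r->count == RING_CAPACITY) return -1;
    r->frames[r->write_idx] = frame;
    r->write_idx = (r->write_idx + 1) % RING_CAPACITY;
    r->count++;
    return 0;
}

/* Remove the oldest frame into *out. Returns 0, or -1 if the ring is empty. */
static int ring_pop(ring_t *r, int *out) {
    assert(r != NULL && out != NULL);
    if (r->count == 0) return -1;
    *out = r->frames[r->read_idx];
    r->read_idx = (r->read_idx + 1) % RING_CAPACITY;
    r->count--;
    return 0;
}

/* Look `ahead` entries past the oldest one without removing anything.
 * Returns NULL if fewer than ahead + 1 entries are stored. */
static const int *ring_peek(const ring_t *r, size_t ahead) {
    assert(r != NULL);
    if (ahead >= r->count) return NULL;
    return &r->frames[(r->read_idx + ahead) % RING_CAPACITY];
}
```

In this sketch a caller pushes each incoming frame, peeks ahead when it needs future frames, and pops once the oldest frame has been consumed.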
  4. 28 Mar, 2011 1 commit
    • add asm_enc_offsets.c for all targets · 4be062bb
      Johann authored
      now that we need asm_enc_offsets.c for x86 and arm and it is
      harmless to build it for other targets, add it unconditionally
      
      Change-Id: I320c5220afd94fee2b98bda9ff4e5e34c67062f3
  5. 11 Mar, 2011 2 commits
  6. 08 Mar, 2011 1 commit
    • Write SSSE3 sub-pixel filter function · 244e2e14
      Yunqing Wang authored
      1. Process 16 pixels at a time instead of 8.
      2. Add a check for the case where both xoffset = 0 and yoffset = 0,
         which happens during motion search.
      This change gave the encoder a 1% to 3% performance gain.
      
      Change-Id: Idaa39506b48f4f8b2fbbeb45aae8226fa32afb3e
  7. 22 Feb, 2011 1 commit
  8. 10 Feb, 2011 1 commit
    • Fix relative include paths · 02321de0
      John Koleszar authored
      Allow compiling without adding vp8/{common,encoder,decoder} to the
      include paths.
      
      Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
  9. 09 Feb, 2011 1 commit
  10. 06 Jan, 2011 1 commit
    • x86 sse2 temporal_filter_apply · 8b0cf5f7
      Johann authored
      count can be reduced to short because the max number of filtered frames
      is set to 15. the max value for any frame is 32 (modifier = 16,
      filter_weight = 2). 15*32 = 480 which requires 9 bits
      
      this function goes from about 7000 us / 1000 iterations for the C code
      to < 275 us / 1000 iterations for sse2 for block_size = 16 and from
      about 1800 us / 1000 iters to < 100 us / 1000 iters for block_size = 8
      
      Change-Id: I64a32607f58a2d33c39286f468b04ccd457d9e6e
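The range argument in 8b0cf5f7 (15 frames, at most 32 per frame, so 480 and 9 bits) can be checked mechanically. This is only a sketch of the arithmetic: the constant and function names are hypothetical, and the values are the ones quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the commit message; the names are hypothetical. */
enum {
    MAX_FILTERED_FRAMES = 15,
    MAX_MODIFIER        = 16,
    MAX_FILTER_WEIGHT   = 2
};

/* Largest value the per-pixel count can reach: 15 * (16 * 2) = 480. */
static unsigned max_count(void) {
    return MAX_FILTERED_FRAMES * (MAX_MODIFIER * MAX_FILTER_WEIGHT);
}

/* Number of bits needed to represent v as an unsigned value. */
static unsigned bits_needed(unsigned v) {
    unsigned n = 0;
    while (v != 0) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Nonzero when the count fits in a signed 16-bit short. */
static int count_fits_in_short(void) {
    return max_count() <= (unsigned)INT16_MAX;
}
```

Since 256 <= 480 < 512, the count needs 9 bits, which leaves ample headroom in a 16-bit lane.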
  11. 01 Nov, 2010 1 commit
    • SSSE3 version of fast quantizer · ff4a71f4
      Scott LaVarnway authored
      (test clip: tulip)
      For good quality mode with speed=1, this gave the encoder
      a small (2 - 3%) performance boost.
      
      Change-Id: I8a1d4269465944ac0819986c2f0be4b0a2ee0b35
  12. 27 Oct, 2010 1 commit
    • Full search SAD function optimization in SSE4.1 · 71ecb5d7
      Yunqing Wang authored
      Use mpsadbw to calculate 8 SADs at once. Function list:
      vp8_sad16x16x8_sse4
      vp8_sad16x8x8_sse4
      vp8_sad8x16x8_sse4
      vp8_sad8x8x8_sse4
      vp8_sad4x4x8_sse4
      
      (test clip: tulip)
      For best quality mode, this gave the encoder a 5% performance boost.
      For good quality mode with speed=1, this gave the encoder a 3%
      performance boost.
      
      Change-Id: I083b5a39d39144f88dcbccbef95da6498e490134
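To make "8 SADs at once" in 71ecb5d7 concrete, here is a plain C reference for what such a function would compute, assuming the trailing x8 in the function names means eight horizontally adjacent candidate positions. It is a scalar sketch, not the SSE4.1 code; `sad_x8_ref` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar reference: sum of absolute differences between a width x height
 * source block and the reference block at 8 consecutive horizontal offsets.
 * The reference rows must hold at least width + 7 valid bytes. */
static void sad_x8_ref(const uint8_t *src, int src_stride,
                       const uint8_t *ref, int ref_stride,
                       int width, int height, unsigned sad[8]) {
    assert(src != NULL && ref != NULL && sad != NULL);
    for (int k = 0; k < 8; k++) {          /* candidate offset */
        unsigned sum = 0;
        for (int r = 0; r < height; r++) {
            for (int c = 0; c < width; c++) {
                int s = src[r * src_stride + c];
                int p = ref[r * ref_stride + c + k];
                sum += (unsigned)abs(s - p);
            }
        }
        sad[k] = sum;
    }
}
```

The mpsadbw instruction produces eight such partial sums per issue, each over a 4-byte group at consecutive reference offsets, which is why a full search that steps one pixel at a time maps onto it well.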
  13. 25 Oct, 2010 1 commit
    • isolate new temporal filtering code · e81e30c2
      Johann authored
      onyx_if is getting pretty big. split out the temporal code to make it
      easier to look at.
      
      Change-Id: I207c3a94c90e91b32e3ea5e1836a53b7a990fabd
  14. 18 Oct, 2010 1 commit
    • Add SSE2 subtract functions · 4db20765
      Yunqing Wang authored
      Instead of doing 8-bit data unpack and 16-bit subtraction, use
      psubb to do 16 8-bit subtractions and pcmpgtb to preserve the
      sign information. This does not bring a noticeable gain since
      these functions are not called frequently.
      
      Change-Id: I90a0dfaa3db9d422e4ada324076596ffb178548e
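The psubb/pcmpgtb idea in 4db20765 can be modelled one byte lane at a time in portable C. This is a scalar sketch of the per-lane logic, not the SSE2 code from the commit: `sub_lane` is a hypothetical name, and the 0x80 bias step is an assumption about how a signed byte compare would be applied to unsigned pixels.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one lane: returns src - pred as a 16-bit signed value,
 * built from an 8-bit wrapping difference plus a separately derived
 * sign byte. */
static int16_t sub_lane(uint8_t src, uint8_t pred) {
    /* psubb: 8-bit subtraction that wraps modulo 256 (low byte of result). */
    uint8_t low = (uint8_t)(src - pred);

    /* pcmpgtb is a signed compare. Subtracting 128 here equals XORing the
     * byte with 0x80 and reading it as signed, which preserves the
     * unsigned ordering of the original pixels. */
    int s = (int)src - 128;
    int p = (int)pred - 128;

    /* Sign byte: all ones when src < pred, else zero (high byte of result). */
    uint8_t high = (p > s) ? 0xFF : 0x00;

    /* Interleave low and high bytes, as a byte unpack would, and read the
     * pair as a two's-complement 16-bit value. */
    int v = ((int)high << 8) | low;
    if (v >= 32768) v -= 65536;
    return (int16_t)v;
}
```

The wrapped low byte together with the compare mask reconstructs the exact signed difference, so no 8-to-16-bit unpack is needed before the subtraction.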
  15. 09 Sep, 2010 1 commit
  16. 02 Sep, 2010 1 commit
    • encoder: remove postproc dependency · 76640f85
      James Zern authored
      Remove the encoder's general dependency on postproc.c; the only
      unchecked need for it is when CONFIG_PSNR is enabled. All other cases
      are already wrapped in CONFIG_POSTPROC. In the CONFIG_PSNR case the file
      will still be included.
      
      Additionally, an error is now returned when VP8_SET_POSTPROC is used
      with the encoder and post processing has been disabled.
      
      This addresses issue #153.
      
      Change-Id: Ia6dfe20167f7077734a6058cbd1d794550346089
  17. 13 Aug, 2010 1 commit
    • move segmentation_common to encoder · 80d3923a
      John Koleszar authored
      vp8_update_gf_useage_maps() is only used by the encoder. This patch
      fixes the ability to build in decode-only or encode-only
      configurations.
      
      Change-Id: I3a5211428e539886ba998e09e8abd747ac55c9aa
  18. 22 Jul, 2010 1 commit
    • msvs: fix install of codec sources · 4d86ef35
      John Koleszar authored
      The libs.mk file must be installed for the vpx.vcproj file to be
      generated. It was being installed, but not in the src/ directory as
      expected.
      
      Also install two files that were missed: yasm.rules and quantize_x86.h.
      
      Change-Id: Ic1a6f836e953bfc954d6e42a18c102a0114821eb
  19. 24 Jun, 2010 2 commits
    • Added first-pass sse2 version of Yaowu's new fdct. · f1a3b1e0
      Scott LaVarnway authored
      Change-Id: Ib479210067510162879c368428b92690591120b2
    • Redo the forward 4x4 dct · d0dd01b8
      Yaowu Xu authored
      The new fdct lowers the round-trip sum squared error for a
      4x4 block to ~0.12, or ~0.008/pixel. For reference, the old
      matrix multiply version has an average round-trip error of 1.46
      for a 4x4 block.
      
      Thanks to "derf" for his suggestions and references.
      
      Change-Id: I5559d1e81d333b319404ab16b336b739f87afc79
  20. 18 Jun, 2010 1 commit
    • cosmetics: trim trailing whitespace · 94c52e4d
      John Koleszar authored
      When the license headers were updated, they accidentally contained
      trailing whitespace, so unfortunately we have to touch all the files
      again.
      
      Change-Id: I236c05fade06589e417179c0444cb39b09e4200d
  21. 14 Jun, 2010 1 commit
    • sse2 version of vp8_regular_quantize_b · 48c84d13
      Scott LaVarnway authored
      Added an sse2 version of vp8_regular_quantize_b, which improved encode
      performance (for the clip used) by ~10% for 32-bit builds and ~3% for
      64-bit builds.
      
      Also updated SHADOW_ARGS_TO_STACK to allow for more than 9 arguments.
      
      Change-Id: I62f78eabc8040b39f3ffdf21be175811e96b39af
  22. 11 Jun, 2010 1 commit
    • require --enable-psnr to build ssim · 9099fc0d
      John Koleszar authored
      ssim.c compiles in a huge (512M) amount of global scratch space. Allocating
      this data on the heap would be a better solution, but this file doesn't
      need to be built at all in most cases, so as a first pass, disable it
      except when doing opsnr.stt output (--enable-psnr).
      
      Change-Id: I320d812f6d652a12516a16b52295ebff20b5bd42
  23. 05 Jun, 2010 1 commit
    • shared library support (.so) · 7aa97a35
      John Koleszar authored
      This patch adds support for building shared libraries when configured
      with the --enable-shared switch.
      
      Building DLLs would require more invasive changes to the sample
      utilities than I want to make in this patch, since on Windows you can't
      use the address of an imported symbol in a static initializer. The best
      way to work around this is probably to build the codec interface mapping
      table with an init() function, but DLL support is of questionable value
      anyway, since most Windows users will probably use a media framework
      lib like webmdshow, which links this library in statically.
      
      Change-Id: Iafb48900549b0c6b67f4a05d3b790b2643d026f4
  24. 04 Jun, 2010 1 commit
  25. 25 May, 2010 1 commit
    • install includes in DIST_DIR/include/vpx, move vpx_codec/ to vpx/ · b7492341
      John Koleszar authored
      This renames the vpx_codec/ directory to vpx/, to allow applications
      to more consistently reference these includes with the vpx/ prefix.
      This allows the includes to be installed in /usr/local/include/vpx
      rather than polluting the system includes directory with an
      excessive number of includes.
      
      Change-Id: I7b0652a20543d93f38f421c60b0bbccde4d61b4f
  26. 18 May, 2010 1 commit