  1. 17 Jul, 2013 1 commit
    • vp9_convolve8_neon placeholder · 59dc4e9c
      Johann authored
      Call the individually optimized horizontal and vertical functions. This
      implementation abuses the temp buffer.
      
      This will be replaced with a custom optimized function.
      
      Over 2x speedup.
      
      Change-Id: I5b908d2a73d264e9810d6022bbff73207a3055dd
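The two-pass structure described in this commit can be sketched in plain C. This is only an illustration, not the NEON code: the names `sketch_convolve8` and `filter_pass` are hypothetical, the filter is assumed to be unscaled with 7-bit precision (taps summing to 128), and blocks are assumed to be at most 64x64. The horizontal pass writes `h + 7` rows into a temporary buffer so that the vertical pass has the extra rows its 8 taps need.

```c
#include <stdint.h>

#define SKETCH_TAPS 8
#define SKETCH_FILTER_BITS 7 /* assumption: taps sum to 128 */
#define SKETCH_MAX_DIM 64    /* assumption: largest block is 64x64 */

static uint8_t clip_pixel(int v) {
  return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* One 8-tap pass. `step` is 1 for a horizontal pass and the source
 * stride for a vertical pass. Reads 3 samples before and 4 after. */
static void filter_pass(const uint8_t *src, int src_stride, uint8_t *dst,
                        int dst_stride, const int16_t *filter, int step,
                        int w, int h) {
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      const uint8_t *s = src + y * src_stride + x - 3 * step;
      int sum = 0;
      for (int k = 0; k < SKETCH_TAPS; ++k) sum += s[k * step] * filter[k];
      dst[y * dst_stride + x] = clip_pixel(
          (sum + (1 << (SKETCH_FILTER_BITS - 1))) >> SKETCH_FILTER_BITS);
    }
  }
}

/* 2-D convolve built from the two 1-D passes via a temp buffer. */
void sketch_convolve8(const uint8_t *src, int src_stride, uint8_t *dst,
                      int dst_stride, const int16_t *filter_x,
                      const int16_t *filter_y, int w, int h) {
  uint8_t temp[SKETCH_MAX_DIM * (SKETCH_MAX_DIM + SKETCH_TAPS - 1)];
  /* Horizontal pass over h + 7 rows, starting 3 rows above the block. */
  filter_pass(src - 3 * src_stride, src_stride, temp, SKETCH_MAX_DIM,
              filter_x, 1, w, h + SKETCH_TAPS - 1);
  /* Vertical pass; row 3 of temp lines up with row 0 of the block. */
  filter_pass(temp + 3 * SKETCH_MAX_DIM, SKETCH_MAX_DIM, dst, dst_stride,
              filter_y, SKETCH_MAX_DIM, w, h);
}
```

The caller must guarantee at least 3 readable pixels before and 4 after the block in both directions, since both passes read outside the block.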
  2. 15 Jul, 2013 1 commit
  3. 13 Jul, 2013 1 commit
  4. 12 Jul, 2013 1 commit
    • vp9_convolve8_[horiz|vert]_avg · a15bebfc
      Johann authored
      Super basic conversion from the other implementations. Any changes to
      one should be trivial to copy over to keep them in sync.
      
      Change-Id: I1720b4128e0aba4b2779e3761f6494f8a09d3ea8
  5. 11 Jul, 2013 3 commits
  6. 10 Jul, 2013 1 commit
  7. 09 Jul, 2013 1 commit
    • Added a lossless test · 9ce6de19
      Yaowu Xu authored
      It runs encodes with min and max q set to 0 and checks that the
      output PSNR is at MAX_PSNR (100).
      
      Change-Id: Ia2418353cccf6e487204ea4ff874a7e71e55cb3e
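For reference, the check above rests on the usual PSNR definition. The following is a sketch that assumes 8-bit samples (peak value 255), with N the number of samples compared and SSE the sum of squared errors between source and reconstruction. A lossless encode gives SSE = 0, which is reported as MAX_PSNR:

```latex
\mathrm{PSNR} =
\begin{cases}
\min\!\left(10 \log_{10} \dfrac{255^{2}\, N}{\mathrm{SSE}},\ \mathrm{MAX\_PSNR}\right) & \text{if } \mathrm{SSE} > 0 \\[1ex]
\mathrm{MAX\_PSNR} & \text{if } \mathrm{SSE} = 0
\end{cases}
```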
  8. 08 Jul, 2013 1 commit
    • Fix loopfilter bug · 527fc5ca
      John Koleszar authored
      In the rare case where 4x4 interior filtering was called for but no
      8x8 or larger filtering takes place, the previous code was skipping
      the filtering. This patch fixes the issue by including the interior
      mask in the overall mask for the filter application loops.
      
      Change-Id: I4a0b65056c64f97478827c2ff41e0914fc7779d0
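The effect of leaving the interior mask out of the loop condition can be illustrated with a simplified model. This is a sketch only: `sketch_count_filtered_edges` and its parameters are hypothetical names, and the real filter calls are replaced by a counter. Each bit position stands for one column of a row, and the loop ends as soon as the overall mask is exhausted.

```c
/* Counts the edges a mask-driven filter loop would process.
 * include_interior = 0 models the old loop condition,
 * include_interior = 1 models the fixed one. */
int sketch_count_filtered_edges(unsigned mask_16x16, unsigned mask_8x8,
                                unsigned mask_4x4, unsigned mask_4x4_int,
                                int include_interior) {
  unsigned mask = mask_16x16 | mask_8x8 | mask_4x4;
  int applied = 0;
  if (include_interior) mask |= mask_4x4_int;

  for (; mask; mask >>= 1) {
    if (mask & 1) {
      /* At most one block-edge filter per column. */
      if ((mask_16x16 | mask_8x8 | mask_4x4) & 1) ++applied;
      /* Interior 4x4 edge, 4 pixels into the block. */
      if (mask_4x4_int & 1) ++applied;
    }
    mask_16x16 >>= 1;
    mask_8x8 >>= 1;
    mask_4x4 >>= 1;
    mask_4x4_int >>= 1;
  }
  return applied;
}
```

With only an interior bit set, the old condition never enters the loop body for that column, so the edge is silently skipped; with the interior mask folded in, it is counted.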
  9. 02 Jul, 2013 1 commit
    • new unit test for cpu-speed · b0520b61
      Jim Bankoski authored
      Tests q0 (lossless), very high bitrate, and low bitrates at cpu speeds
      0, 1 and 2.
      
      Change-Id: I0c5cdca00acd8d01e7b13f124b3b08d4b1ae9f6d
  10. 27 Jun, 2013 1 commit
  11. 26 Jun, 2013 4 commits
  12. 25 Jun, 2013 3 commits
  13. 24 Jun, 2013 2 commits
    • Add SAD unit tests for all rectangular sizes. · 3c4abbe4
      Ronald S. Bultje authored
      Change-Id: I47e81b51f072abdb276bdec85423febba34b5f81
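Unit tests of this kind typically compare an optimized routine against a straightforward reference. A minimal reference SAD (sum of absolute differences) for an arbitrary w x h block might look like the following; `sketch_reference_sad` is a hypothetical name, not the function used in the test suite.

```c
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences over a w x h block of 8-bit pixels. */
unsigned int sketch_reference_sad(const uint8_t *src, int src_stride,
                                  const uint8_t *ref, int ref_stride,
                                  int w, int h) {
  unsigned int sad = 0;
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      sad += (unsigned int)abs(src[y * src_stride + x] -
                               ref[y * ref_stride + x]);
    }
  }
  return sad;
}
```

Because width and height are plain parameters, the same reference covers square and rectangular sizes such as 8x4 or 32x64.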
    • Fix loopfilter of leftmost 4x4 edges in SB · 858475a0
      John Koleszar authored
      For cases where there's no transform set in bit 0 (the left edge of
      the SB) but bit 0 of mask_4x4_int is set (the edge 4 pixels from the
      left edge needs filtering), it was incorrectly being skipped before.
      This situation only happens on the leftmost edge of the image, as
      the edge at column 0 is intentionally skipped since there aren't
      pixels to the left to read.
      
      Change-Id: Ib2fbbcb40166e90af31b1a0e13b85b68c226cbd3
  14. 22 Jun, 2013 3 commits
  15. 21 Jun, 2013 2 commits
  16. 20 Jun, 2013 3 commits
    • SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance(). · 1e6a32f1
      Ronald S. Bultje authored
      Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to
      3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions
      which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't
      perfectly interleaved, and can probably be improved further in the
      future. I've marked this with a few TODOs/FIXMEs in the code.
      
      Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
    • Implement sse2 and ssse3 versions for all sub_pixel_variance sizes. · 8fb6c581
      Ronald S. Bultje authored
      Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 ->
      3min58). Specific changes to timings for each function compared to
      original assembly-optimized versions (or just new version timings if
      no previous assembly-optimized version was available):
      
      sse2   4x4:    99 ->   82 cycles
      sse2   4x8:           128 cycles
      sse2   8x4:           121 cycles
      sse2   8x8:   149 ->  129 cycles
      sse2   8x16:  235 ->  245 cycles (?)
      sse2  16x8:   269 ->  203 cycles
      sse2  16x16:  441 ->  349 cycles
      sse2  16x32:          641 cycles
      sse2  32x16:          643 cycles
      sse2  32x32: 1733 -> 1154 cycles
      sse2  32x64:         2247 cycles
      sse2  64x32:         2323 cycles
      sse2  64x64: 6984 -> 4442 cycles
      
      ssse3  4x4:           100 cycles (?)
      ssse3  4x8:           103 cycles
      ssse3  8x4:            71 cycles
      ssse3  8x8:           147 cycles
      ssse3  8x16:          158 cycles
      ssse3 16x8:   188 ->  162 cycles
      ssse3 16x16:  316 ->  273 cycles
      ssse3 16x32:          535 cycles
      ssse3 32x16:          564 cycles
      ssse3 32x32:          973 cycles
      ssse3 32x64:         1930 cycles
      ssse3 64x32:         1922 cycles
      ssse3 64x64:         3760 cycles
      
      Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
    • Add unit tests for 4x4 ADST · 362809df
      Jingning Han authored
      Enable sign bias check and round-trip error unit tests for 4x4 hybrid
      transform modules.
      
      Change-Id: Icd3d839f098d4b92b00ff76eac146765b039d0d3
  17. 19 Jun, 2013 1 commit
    • Add some unaligned test vectors · 639db571
      John Koleszar authored
      Tests resolutions of 8, 10, 16, 18, 32, 34, 64, 66 to exercise the
      border conditions, as well as non-SB aligned sizes.
      
      Change-Id: Ie7c2b7860ac3727e23202042f2e86792652912f8
  18. 18 Jun, 2013 2 commits
  19. 17 Jun, 2013 1 commit
    • Change the encryption feature to use a callback for decryption. · 368c7237
      Jeff Petkau authored
      This allows code calling the library to choose an arbitrary
      encryption algorithm.
      
      Decoder control parameter VP8_SET_DECRYPT_KEY is renamed to
      VP8D_SET_DECRYPTOR, and now takes a small config struct instead
      of just a byte array.
      
      Change-Id: I0462b3388d8d45057e4f79a6b6777fe713dc546e
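The callback-plus-config-struct pattern described in this commit can be sketched as follows. Every name here (`example_decrypt_cb`, `example_decryptor`, `example_xor_decrypt`, `example_decode_bytes`) is hypothetical and chosen for illustration; none of them are the library's actual declarations. A trivial XOR cipher stands in for whatever algorithm the calling code would supply.

```c
#include <string.h>

/* Hypothetical callback: decrypt `count` bytes from input into output. */
typedef void (*example_decrypt_cb)(void *state, const unsigned char *input,
                                   unsigned char *output, int count);

/* Hypothetical config struct passed in place of a raw key array. */
typedef struct {
  example_decrypt_cb decrypt_cb;
  void *decrypt_state;
} example_decryptor;

/* Caller-owned state for the illustrative XOR cipher. */
typedef struct {
  const unsigned char *key;
  int key_len;
} example_xor_state;

void example_xor_decrypt(void *state, const unsigned char *input,
                         unsigned char *output, int count) {
  const example_xor_state *s = state;
  for (int i = 0; i < count; ++i) output[i] = input[i] ^ s->key[i % s->key_len];
}

/* Stand-in for the decoder side: use the callback if one is set,
 * otherwise treat the data as unencrypted. */
void example_decode_bytes(const example_decryptor *d,
                          const unsigned char *in, unsigned char *out,
                          int count) {
  if (d && d->decrypt_cb) {
    d->decrypt_cb(d->decrypt_state, in, out, count);
  } else {
    memcpy(out, in, (size_t)count);
  }
}
```

Because the decoder only sees a function pointer and an opaque state pointer, the choice of cipher stays entirely with the calling code.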
  20. 14 Jun, 2013 1 commit
    • Enable sse2 version of sad8x4/4x8 · c43af9a8
      Jingning Han authored
      The encoding time for bus at CIF goes from 661s to 625s. This commit
      also enables the unit tests for sad8x4/4x8 in sad_test.cc.
      
      Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
  21. 13 Jun, 2013 2 commits
    • Enable sse2 version of sad8x4/4x8 · 15f50e7b
      Jingning Han authored
      The encoding time for bus at CIF goes from 661s to 625s. This commit
      also enables the unit tests for sad8x4/4x8 in sad_test.cc.
      
      Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
    • Add vp9 test vectors unit test · 119c9812
      John Koleszar authored
      These files can stand in until we get proper syntax vectors. They
      should provide some additional assurance against inadvertent
      bitstream changes.
      
      Change-Id: I12f6c9a5f054e30df40a7ff1f33145abf7e1d59d
  22. 12 Jun, 2013 1 commit
  23. 10 Jun, 2013 1 commit
    • Cosmetic cleanups of filters · 995ce523
      Deb Mukherjee authored
      No bitstream change.
      
      Removes unused filters and the code for the case of 2 switchable filters;
      also changes the 8tap-smooth filter coefficients for integer shifts to be
      interpolating, consistent with the way it is currently implemented.
      
      Change-Id: I96c542fd8c06f4e0df507a645976f58e6de92aae
  24. 07 Jun, 2013 2 commits
    • Handle partition type coding of boundary blocks · 78b8190c
      Jingning Han authored
      The partition types of blocks sitting on the frame boundary are
      constrained by the block size and the position of each sub-block
      relative to the frame. Hence we use truncated probability models
      to handle the coding of such information.
      
      100 frames run:
      yt 0.138%
      
      Change-Id: I85d9b45665c15280069c0234ea6f778af586d87d
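The constraint behind the truncated models can be sketched as a function that returns which partition types remain possible for a block straddling the frame edge. This is an illustration under stated assumptions, not the codec's code: the enum and `sketch_allowed_partitions` are hypothetical names, `has_rows` is assumed to mean that the lower half of the block lies inside the frame, and `has_cols` that the right half does.

```c
enum {
  SKETCH_PARTITION_NONE = 0,
  SKETCH_PARTITION_HORZ = 1,
  SKETCH_PARTITION_VERT = 2,
  SKETCH_PARTITION_SPLIT = 3
};

/* Returns a bitmask (bit i set means partition type i is allowed). */
unsigned sketch_allowed_partitions(int has_rows, int has_cols) {
  if (has_rows && has_cols) {
    /* Fully inside the frame: all four types can be coded. */
    return (1u << SKETCH_PARTITION_NONE) | (1u << SKETCH_PARTITION_HORZ) |
           (1u << SKETCH_PARTITION_VERT) | (1u << SKETCH_PARTITION_SPLIT);
  }
  if (!has_rows && has_cols) {
    /* Lower half is outside: only a horizontal cut or a full split. */
    return (1u << SKETCH_PARTITION_HORZ) | (1u << SKETCH_PARTITION_SPLIT);
  }
  if (has_rows && !has_cols) {
    /* Right half is outside: only a vertical cut or a full split. */
    return (1u << SKETCH_PARTITION_VERT) | (1u << SKETCH_PARTITION_SPLIT);
  }
  /* Both halves outside: a split is the only option, nothing to code. */
  return 1u << SKETCH_PARTITION_SPLIT;
}
```

With two candidates left, a single binary decision is enough, and with one candidate no bits are needed at all, which is where a truncated probability model saves rate.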
    • Add marker bit to bool-coded partition start · a425e2cc
      John Koleszar authored
      Adds a marker bit to allow distinguishing the frame header from its residual
      data.
      
      Change-Id: Id75d47acc9e5a97007e4690c4f8748a4ce63e641