1. 16 Jun, 2016 1 commit
    • Use correct size load in vpx_avg_4x4_sse2. · ffa91733
      Geza Lore authored
      The old version used 64-bit loads and then ignored the top half
      of the result. This can cause ASan failures if the load reads past
      the end of a buffer. Switched to using 32-bit loads instead.
      Change-Id: I57da127a26f869fb4b4f700b55408f6dc2fbbc1a
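To illustrate why the load width matters, the sketch below models a 4x4 block that sits at the very end of a buffer. It is plain Python rather than the SSE2 intrinsics the commit touches. The stride of 4, the helper names `load_row` and `avg_4x4`, and the rounding `(sum + 8) >> 4` are assumptions made for illustration, not details taken from the commit.

```python
import struct

STRIDE = 4  # assumed: rows are packed back to back


def load_row(buf, row, width_bytes):
    """Read width_bytes (4 or 8) from the start of a row, bounds-checked."""
    fmt = "<I" if width_bytes == 4 else "<Q"
    # struct.unpack_from raises struct.error if the read would run past the buffer.
    return struct.unpack_from(fmt, buf, row * STRIDE)[0]


def avg_4x4(buf):
    """Rounded average of 16 pixels, using 4-byte row loads."""
    total = 0
    for row in range(4):
        word = load_row(buf, row, 4)
        total += sum((word >> (8 * i)) & 0xFF for i in range(4))
    return (total + 8) >> 4


block = bytes(range(16))   # a 4x4 block that ends exactly at the buffer end
print(avg_4x4(block))      # 4-byte loads stay in bounds
try:
    load_row(block, 3, 8)  # an 8-byte load of the last row needs bytes 12..19
except struct.error as exc:
    print("over-read:", exc)
```

Python's bounds check plays the role that ASan plays for the native code: the 8-byte read of the last row is rejected even though only its low 4 bytes would be used.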
  2. 14 Dec, 2015 1 commit
  3. 20 Nov, 2015 2 commits
    • fix vp9_satd_sse2 · 60760f71
      James Zern authored
      accumulate satd in 32-bits
      + add unit test
      Change-Id: I6748183df3662ddb9d635f9641f9586f2fd38ad5
    • vp9_satd: return an int · 3e0138ed
      James Zern authored
      the final sum may use up to 26 bits
      + add a unit test
      + disable the SSE2 version, as its result will roll over; this will be
      fixed in a future commit
      Change-Id: I2a49811dfaa06abfd9fa1e1e65ed7cd68e4c97ce
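The accumulator-width issue in the two commits above can be sketched as follows. The commits do not state the input limits. The worst case used here (1024 coefficients, each with magnitude 2^15) is an assumption that happens to be consistent with the 26-bit figure, and `satd_wrapped` is a hypothetical helper that models a fixed-width accumulator.

```python
def satd(coeffs):
    """Sum of absolute values of the coefficients (exact, unbounded int)."""
    return sum(abs(c) for c in coeffs)


def satd_wrapped(coeffs, bits):
    """Same sum, kept in an unsigned accumulator of `bits` bits (wraps on overflow)."""
    mask = (1 << bits) - 1
    total = 0
    for c in coeffs:
        total = (total + abs(c)) & mask
    return total


# Assumed worst case: 1024 coefficients, each at the int16 minimum.
worst = [-32768] * 1024
exact = satd(worst)
print(exact, exact.bit_length())         # 33554432 26
print(satd_wrapped(worst, 16) == exact)  # False: a 16-bit accumulator rolls over
print(satd_wrapped(worst, 32) == exact)  # True: 32 bits is enough
```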
  4. 12 Jun, 2015 1 commit
  5. 15 May, 2015 1 commit
  6. 17 Apr, 2015 1 commit
  7. 14 Apr, 2015 1 commit
  8. 13 Apr, 2015 1 commit
    • Force_split on 16x16 blocks in variance partition. · eb8c6675
      Marco authored
      Force split on 16x16 blocks (to 8x8) based on the minmax over the 8x8 sub-blocks.
      Also increase the variance threshold for 32x32, and add an exit condition in choose_partition
      (with a very safe threshold) based on the sad used to select the reference frame.
      Some visual improvement near moving boundaries.
      Average gain in PSNR/SSIM: ~0.6%; some clips go up ~1 or 2%.
      Encoding time increases (due to more 8x8 blocks) by ~1-4%, depending on clip.
      Change-Id: I4759bb181251ac41517cd45e326ce2997dadb577
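One plausible reading of "minmax over the 8x8 sub-blocks" is sketched below: measure the range of absolute source/reference differences in each 8x8 sub-block, then split when those ranges disagree strongly. The commit text does not spell out the rule, so the function names, the exact measure, and the threshold of 40 are all hypothetical.

```python
def minmax_8x8(src, ref, r0, c0):
    """Min and max absolute source/reference difference inside one 8x8 sub-block."""
    diffs = [abs(src[r][c] - ref[r][c])
             for r in range(r0, r0 + 8) for c in range(c0, c0 + 8)]
    return min(diffs), max(diffs)


def minmax_spread_16x16(src, ref):
    """Largest minus smallest (max - min) range over the four 8x8 sub-blocks."""
    ranges = []
    for r0 in (0, 8):
        for c0 in (0, 8):
            lo, hi = minmax_8x8(src, ref, r0, c0)
            ranges.append(hi - lo)
    return max(ranges) - min(ranges)


def force_split(src, ref, threshold):
    """Hypothetical rule: split 16x16 into 8x8 when the sub-blocks disagree strongly."""
    return minmax_spread_16x16(src, ref) > threshold


flat = [[100] * 16 for _ in range(16)]
edge = [row[:] for row in flat]
for r in range(8):  # disturb only the top-left 8x8 sub-block
    edge[r][r] = 180
print(force_split(flat, flat, 40))  # False
print(force_split(edge, flat, 40))  # True
```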
  9. 31 Mar, 2015 1 commit
  10. 30 Mar, 2015 3 commits
    • Fix 8x8 Hadamard SSE2 implementation · 34a996ac
      Jingning Han authored
      This commit fixes the SSE2 version 8x8 Hadamard transform
      alignment and makes it consistent with the C version.
      Change-Id: I1304e5f97e0e5ef2d798fe38081609c39f5bfe74
    • Enable 16x16 Hadamard transform in SATD based mode decision · 26d3d3af
      Jingning Han authored
      This commit replaces the 16x16 2D-DCT with a Hadamard transform
      for RTC coding mode. It reduces the CPU cycle cost of the 16x16
      transform by 5x. Overall it makes speed -6 encoding 1.5% faster
      without compromising compression performance.
      Change-Id: If6c993831dc4c678d841edc804ff395ed37f2a1b
    • Hadamard transform based coding mode decision process · 8c411f74
      Jingning Han authored
      This commit uses Hadamard transform based rate-distortion cost
      estimate for rtc coding mode decision. It improves the compression
      performance of speed -6 for many hard clips at lower bit-rates.
      For example, 5.5% for jimredvga, 6.7% for mmmoving, 6.1% for
      niklas720p. This will introduce extra encoding cycle costs at
      this point.
      Change-Id: Iaf70634fa2417a705ee29f2456175b981db3d375
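As a rough illustration of the Hadamard-based cost used in the three commits above, the sketch below applies an unnormalized 8x8 Walsh-Hadamard transform to a residual block and sums the absolute coefficients (SATD). It is not the libvpx routine: coefficient ordering and scaling there may differ, and the function names are chosen for this example only. The transform needs only additions and subtractions, which is the usual reason it is cheaper than a DCT.

```python
def hadamard_1d(v):
    """Unnormalized fast Walsh-Hadamard transform of a length-2^k list."""
    out = list(v)
    h = 1
    while h < len(out):
        for i in range(0, len(out), 2 * h):
            for j in range(i, i + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    return out


def hadamard_2d(block):
    """Apply the 1-D transform to every row, then to every column."""
    rows = [hadamard_1d(r) for r in block]
    cols = [hadamard_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def satd(block):
    """Sum of absolute transformed differences for a residual block."""
    return sum(abs(x) for row in hadamard_2d(block) for x in row)


residual = [[3] * 8 for _ in range(8)]
print(hadamard_2d(residual)[0][0])  # 192: a flat block has only a DC term
print(satd(residual))               # 192
```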
  11. 16 Mar, 2015 1 commit
    • Refactor column integral projection computation · 2cfddec3
      Jingning Han authored
      Move the scaling factor outside column projection. This avoids
      repeated calculation of the same scaling factor. Profiling shows
      that the percentage of vp9_int_pro_col_sse2 of overall cycles
      goes from 2.29% down to 1.88%.
      Change-Id: I5ac4e324ab2d7f33ba2de66dd2a12e04e04dfd66
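The refactor described above amounts to hoisting a loop-invariant value out of the projection helper. A minimal sketch, assuming the scaling is a right shift that depends only on the block size: the names `col_projection` and `scaled_col_projections` and the shift value are hypothetical and do not reflect the actual vp9_int_pro_col_sse2 signature.

```python
def col_projection(rows, x):
    """Raw (unscaled) sum of column x over the given rows."""
    return sum(row[x] for row in rows)


def scaled_col_projections(rows, norm_shift):
    """Scale each raw column sum by a shift that the caller computed once."""
    width = len(rows[0])
    return [col_projection(rows, x) >> norm_shift for x in range(width)]


block = [[8, 16, 24, 32] for _ in range(4)]
norm_shift = 1  # hypothetical value; derived once per block, not once per column
print(scaled_col_projections(block, norm_shift))  # [16, 32, 48, 64]
```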
  12. 11 Mar, 2015 1 commit
    • Apply fast motion search to golden reference frame · 54eda13f
      Jingning Han authored
      This commit enables the rtc coding mode to run integral projection
      based motion search for golden reference frame. It improves the
      speed -6 compression performance by 1.1% on average, 3.46% for
      jimred_vga, 6.46% for tacomascmvvga, and 0.5% for vidyo clips. The
      speed -6 is about 6% slower.
      Change-Id: I0fe402ad2edf0149d0349ad304ab9b2abdf0c804
  13. 03 Mar, 2015 1 commit
  14. 01 Mar, 2015 1 commit
    • Use variance metric for integral projection vector match · 1790d452
      Jingning Han authored
      This commit replaces SAD with variance as the metric for the
      integral projection vector match. It improves the search accuracy
      in the presence of slight lighting changes. The average speed -6
      compression performance for the rtc set is improved by 1.7%. No speed
      changes are observed for the test clips.
      Change-Id: I71c1d27e42de2aa429fb3564e6549bba1c7d6d4d
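A small sketch of why a variance metric tolerates lighting changes where SAD does not: a uniform brightness offset adds a constant to every element of the difference vector, which SAD counts in full but variance cancels. The vectors and function names below are made up for illustration, and the variance is scaled by n^2 to stay in integers.

```python
def sad(a, b):
    """Sum of absolute differences between two 1-D projection vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def diff_variance(a, b):
    """Variance of the element-wise difference, scaled by n^2 to stay in integers."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    return n * sum(v * v for v in d) - sum(d) ** 2


src = [10, 40, 90, 40, 10, 5, 5, 5]
same_shape_brighter = [v + 15 for v in src]  # true match under a lighting change
wrong_shape = [10, 40, 40, 90, 10, 5, 5, 5]  # wrong position, same brightness

# SAD prefers the wrong candidate; variance prefers the true match.
print(sad(src, same_shape_brighter), sad(src, wrong_shape))                      # 120 100
print(diff_variance(src, same_shape_brighter), diff_variance(src, wrong_shape))  # 0 40000
```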
  15. 26 Feb, 2015 1 commit
  16. 19 Feb, 2015 1 commit
    • Integral projection based motion estimation · ed2dc59c
      Jingning Han authored
      This commit introduces a new block-matching motion estimation
      method based on integral projection. The 2-D block and its nearby
      region are projected onto horizontal and vertical 1-D vectors,
      respectively. It then runs a vector match, instead of a block match,
      over the two separate 1-D vectors to locate the motion-compensated
      reference block.
      This process is run per 64x64 block to align the reference before
      choosing the partitioning in speed 6. The overall CPU cycle cost of
      this additional 64x64 block match (SSE2 version) is around 2%
      at low bit-rate rtc speed 6. When strong motion activity exists in
      the video sequence, it substantially improves partition selection
      accuracy, thereby achieving better compression performance and
      lower CPU cycles.
      The experiments were run in the RTC speed -6 setting:
      cloud 1080p 500 kbps
      17006 b/f, 37.086 dB, 5386 ms ->
      16669 b/f, 37.970 dB, 5085 ms (>0.9dB gain and 6% faster)
      pedestrian_area 1080p 500 kbps
      53537 b/f, 36.771 dB, 18706 ms ->
      51897 b/f, 36.792 dB, 18585 ms (4% bit-rate savings)
      blue_sky 1080p 500 kbps
      70214 b/f, 33.600 dB, 13979 ms ->
      53885 b/f, 33.645 dB, 10878 ms (30% bit-rate savings, 25% faster)
      jimred 400 kbps
      13380 b/f, 36.014 dB, 5723 ms ->
      13377 b/f, 36.087 dB, 5831 ms  (2% bit-rate savings, 2% slower)
      Change-Id: Iffdb6ea5b16b77016bfa3dd3904d284168ae649c
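The projection-and-match idea described above can be sketched on a toy frame. This is a simplified illustration, not the libvpx code: it uses an 8x8 block instead of 64x64, SAD as the vector metric, and an exhaustive 1-D search. All function names, the frame size, and the test object are assumptions.

```python
def row_projection(frame, top, left, size):
    """Sum each row of the block: one value per row (vertical profile)."""
    return [sum(frame[top + r][left:left + size]) for r in range(size)]


def col_projection(frame, top, left, size):
    """Sum each column of the block: one value per column (horizontal profile)."""
    return [sum(frame[top + r][left + c] for r in range(size)) for c in range(size)]


def sad(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def projection_search(cur, ref, top, left, size, search):
    """Pick dx from the column profiles and dy from the row profiles, independently."""
    cur_cols = col_projection(cur, top, left, size)
    cur_rows = row_projection(cur, top, left, size)
    offsets = range(-search, search + 1)
    dx = min(offsets, key=lambda d: sad(cur_cols, col_projection(ref, top, left + d, size)))
    dy = min(offsets, key=lambda d: sad(cur_rows, row_projection(ref, top + d, left, size)))
    return dx, dy


def make_frame(obj_top, obj_left):
    """24x24 black frame with one textured 4x4 object (toy test data)."""
    frame = [[0] * 24 for _ in range(24)]
    for r in range(4):
        for c in range(4):
            frame[obj_top + r][obj_left + c] = 50 + 10 * r + c
    return frame


cur = make_frame(10, 10)
ref = make_frame(9, 12)  # same object, displaced by dx=+2, dy=-1 in the reference
print(projection_search(cur, ref, top=8, left=8, size=8, search=3))  # (2, -1)
```

Two 1-D searches over 2s+1 offsets each replace a 2-D search over (2s+1)^2 candidate positions, which is where the cycle savings come from in this sketch.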
  17. 03 Dec, 2014 1 commit
    • Enable non-rd mode coding on key frame, for speed 6. · 8fd3f9a2
      Marco authored
      For key frame at speed 6: enable the non-rd mode selection in speed setting
      and use the (non-rd) variance_based partition.
      Adjust some logic/thresholds in variance partition selection for key frame only (no change to delta frames),
      mainly to bias to selecting smaller prediction blocks, and also set max tx size of 16x16.
      Loss in key frame quality (~0.6-0.7dB) compared to rd coding,
      but speeds up key frame encoding by at least 6x.
      Average PSNR/SSIM metrics over RTC clips go down by ~1-2% for speed 6.
      Change-Id: Ie4845e0127e876337b9c105aa37e93b286193405
  18. 10 Oct, 2014 1 commit
  19. 07 Oct, 2014 1 commit
    • experimental : partition using 1/8 x 1/8 image · 0ce51d82
      Jim Bankoski authored
      The concept:
      There's too much noise in source pixels for variance, and at low bitrate
      the reconstructed frame looks nothing like the source, so we have problems
      getting good partitionings with either. This skirts the issue by using
      a box-blurred, scaled-down version for variance calculations. To compare
      against source_var_ moved keyframe to be rd based like source_var.
      Change-Id: Ie3babdbfadae324b7b5a76bea192893af27f0624
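The idea of computing variance on a blurred, 1/8 x 1/8 image can be sketched as below. It assumes the downscale is a plain 8x8 box average with rounding. The commit does not give the exact filter, and the function names and the synthetic noise pattern are illustrative only.

```python
def downscale_8x(frame):
    """Average each 8x8 box into one pixel (box blur plus 1/8 x 1/8 decimation)."""
    h, w = len(frame) // 8, len(frame[0]) // 8
    return [[(sum(frame[8 * y + r][8 * x + c]
                  for r in range(8) for c in range(8)) + 32) >> 6
             for x in range(w)] for y in range(h)]


def variance(block):
    """Population variance of a 2-D block, scaled by n^2 to stay in integers."""
    vals = [v for row in block for v in row]
    n = len(vals)
    return n * sum(v * v for v in vals) - sum(vals) ** 2


# 16x16 alternating noise around 128: high pixel variance, flat after blurring.
noisy = [[128 + (6 if (r + c) % 2 else -6) for c in range(16)] for r in range(16)]
small = downscale_8x(noisy)
print(variance(noisy) > 0)     # True
print(small, variance(small))  # [[128, 128], [128, 128]] 0
```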