1. 03 Jan, 2014 1 commit
  2. 27 Dec, 2013 1 commit
      Adaptive motion control on ref and search range · a4ce53f1
      Jingning Han authored
      This commit makes a preliminary attempt to refine motion search
      control. It computes the SAD of the mv predictor for each reference
      frame and uses it to decide whether the encoder should reduce the
      motion search range (if the predicted mv gives a fairly small SAD)
      or skip the current reference frame (if another ref frame gives a
      much smaller SAD cost).
      
      This feature is turned on in the settings of speed 1 and above.
      
      In speed 1, compression performance changed
      derf  -0.018%
      yt    -0.043%
      hd    -0.045%
      stdhd -0.281%
      
      speed-up
      pedestrian_area_1080p at 4000 kbps 100 frames
      199651ms -> 188846ms (5.5% speed-up)
      blue_sky_1080p at 6000 kbps
      443531ms -> 415239ms (6.3% speed-up)
      
      In speed 2, compression performance changed
      derf  -0.026%
      yt    -0.090%
      hd    -0.055%
      stdhd -0.210%
      
      speed-up
      pedestrian 113949ms -> 108855ms (4.5% speed-up)
      blue_sky  271057ms -> 257322ms (5% speed-up)
      
      Change-Id: I1b74ea28278c94fea329d971d706d573983d810d
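The decision described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `RefDecision` enum, the `decide_ref` helper, and both thresholds are hypothetical names and values chosen for the example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical outcome of the per-reference-frame check. */
typedef enum {
  FULL_SEARCH,    /* keep the default motion search range */
  REDUCED_SEARCH, /* predicted mv is already good: shrink the range */
  SKIP_REF        /* another reference frame looks much better */
} RefDecision;

/* Illustrative thresholds only; not taken from the commit. */
#define SMALL_SAD_PER_PIXEL 1 /* "fairly small" SAD, per pixel */
#define SKIP_RATIO 4          /* "much smaller" means more than 4x smaller */

/* pred_sad[i] is the SAD of the mv predictor for reference frame i. */
static RefDecision decide_ref(const unsigned int *pred_sad, int num_refs,
                              int ref, int num_pixels) {
  unsigned int best = UINT_MAX;
  int i;
  assert(ref >= 0 && ref < num_refs);
  for (i = 0; i < num_refs; ++i)
    if (pred_sad[i] < best) best = pred_sad[i];

  /* Some other reference frame predicts far better: skip this one. */
  if (pred_sad[ref] / SKIP_RATIO > best) return SKIP_REF;
  /* The predictor is already close: a short search is enough. */
  if (pred_sad[ref] <= (unsigned int)num_pixels * SMALL_SAD_PER_PIXEL)
    return REDUCED_SEARCH;
  return FULL_SEARCH;
}
```

With SADs of 100, 5000, and 300 for three reference frames of a 16x16 block, this sketch would reduce the search for the first frame, skip the second, and run the full search on the third.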
  3. 20 Dec, 2013 1 commit
      Store the SSE of prediction residuals · 243327f4
      Jingning Han authored
      Buffer the SSE of prediction residuals in the rate-distortion
      optimization loop of a given block. This information will be used
      for later encoding control.
      
      Change-Id: If4e63f3462490513c48be9407d3327c8dd438367
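For readers unfamiliar with the term, the SSE of a prediction residual is the sum of squared differences between the source block and its prediction. The sketch below shows that computation and a place to keep the result; `block_sse` and `BlockStats` are hypothetical names, not the identifiers used in the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of squared errors between a source block and its prediction. */
static uint64_t block_sse(const uint8_t *src, int src_stride,
                          const uint8_t *pred, int pred_stride,
                          int width, int height) {
  uint64_t sse = 0;
  int r, c;
  assert(width > 0 && height > 0);
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) {
      const int diff = src[r * src_stride + c] - pred[r * pred_stride + c];
      sse += (uint64_t)(diff * diff);
    }
  }
  return sse;
}

/* Hypothetical per-block record: the SSE is kept for later stages. */
typedef struct {
  uint64_t pred_sse;
} BlockStats;
```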
  4. 13 Dec, 2013 1 commit
      Enable adaptive pred filter type for sub8x8 · 3b5a90bd
      Jingning Han authored
      This commit enables adaptive prediction filter type selection for
      sub8x8 block sizes. In speed 1, a sub8x8 block re-uses the filter
      type of the collocated 8x8 block if that block was tested in the
      rate-distortion optimization loop; otherwise it runs the normal
      test over all three filter types. In speed 2, it re-uses the 8x8
      block's prediction filter type if available; otherwise the filter
      type is forced to EIGHTTAP.
      
      Compression and speed performance:
      speed 1
      derf -0.266%
      yt   -0.138%
      
      bus at 2000 kbps: 33766ms -> 30451ms (10% speed-up)
      football at 600 kbps: 48173ms -> 43786ms (9% speed-up)
      
      speed 2
      derf -0.026%
      yt   +0.134%
      
      bus at 2000 kbps: 18973ms -> 17698ms (6% speed-up)
      football at 600 kbps: 26748ms -> 25096ms (6% speed-up)
      
      Change-Id: I77e097533b969fd3472147225fa79fc98095d342
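The per-speed rule in this commit message can be written down as a small selection function. This is only a sketch under assumptions: `FilterChoice` and `sub8x8_filter` are hypothetical, speed 0 is assumed to keep the full search, and speeds above 2 are assumed to behave like speed 2, neither of which the message states.

```c
#include <assert.h>

typedef enum { EIGHTTAP, EIGHTTAP_SMOOTH, EIGHTTAP_SHARP } FilterType;

/* Hypothetical result: either one fixed filter or "test all three". */
typedef struct {
  int search_all;    /* 1: run the normal test over all filter types */
  FilterType filter; /* meaningful only when search_all == 0 */
} FilterChoice;

/*
 * have_8x8 says whether the collocated 8x8 block was tested in the RD loop;
 * filter_8x8 is the filter type chosen for it.
 */
static FilterChoice sub8x8_filter(int speed, int have_8x8,
                                  FilterType filter_8x8) {
  FilterChoice c = { 0, EIGHTTAP };
  assert(speed >= 0);
  if (speed >= 1 && have_8x8) {
    c.filter = filter_8x8; /* speed 1 and 2: reuse the 8x8 decision */
  } else if (speed >= 2) {
    c.filter = EIGHTTAP;   /* speed 2 fallback: fixed filter */
  } else {
    c.search_all = 1;      /* assumed: speed 0, or speed 1 without 8x8 data */
  }
  return c;
}
```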
  5. 06 Dec, 2013 2 commits
  6. 04 Dec, 2013 2 commits
      Moving eob array to the encoder. · f00d157c
      Dmitry Kovalev authored
      The decoder does not need to save eobs; it can pass eob as an argument.
      This commit therefore removes the eob arrays from VP9Decompressor and
      TileWorkerData, and moves the eob pointer from macroblockd_plane to
      macroblock_plane.
      
      Change-Id: I8eb919acc837acfb3abdd8319af63d1bbca8217a
      Cleaning up vp9_entropy.h file. · 8e89e2f2
      Dmitry Kovalev authored
      Renaming constants for consistency:
        DCT_VAL_CATEGORY1 => CATEGORY1_TOKEN
        DCT_VAL_CATEGORY2 => CATEGORY2_TOKEN
        DCT_VAL_CATEGORY3 => CATEGORY3_TOKEN
        DCT_VAL_CATEGORY4 => CATEGORY4_TOKEN
        DCT_VAL_CATEGORY5 => CATEGORY5_TOKEN
        DCT_VAL_CATEGORY6 => CATEGORY6_TOKEN
        DCT_EOB_TOKEN     => EOB_TOKEN
        DCT_EOB_MODEL_TOKEN => EOB_MODEL_TOKEN
        MAX_ENTROPY_TOKENS => ENTROPY_TOKENS
      
      Moving constants:
        INTER_MODE_CONTEXTS from vp9_entropy.h to vp9_blockd.h.
        EOSB_TOKEN from vp9_entropy.h to vp9_tokenize.h
      
      Change-Id: I5fcbf081318e1d365792b6d290a930c6cb0f3fc2
  7. 27 Nov, 2013 1 commit
  8. 14 Nov, 2013 1 commit
      Simplifies band-getting with a static array · cfcd5c4f
      Deb Mukherjee authored
      Simplifies the code by implementing band mapping with static arrays.
      A lot of the code complexity introduced in a previous patch
      disappears.
      
      Change-Id: Ia3fac36e594fb5ad2d55ae141c58bba4c55c2d28
  9. 13 Nov, 2013 2 commits
      Dual buffer encoding for intra modes · b6b91432
      Jingning Han authored
      The overall change (using the dual buffer scheme for superblocks of
      both inter and intra modes) reduces speed 2 runtime:
      bluesky_1080p at 6000 kbps:   263553ms -> 257441ms
      riverbed_1080p at 8000 kbps:  233230ms -> 225308ms
      
      Change-Id: Idf8d70f768a4b0d97b2a8506372c57b7b4022119
      Moving q_index from MACROBLOCKD to MACROBLOCK. · 3f3d14e1
      Dmitry Kovalev authored
      Moved because q_index is used only by the encoder.
      
      Change-Id: I0b96175614ed4fd3d76ee56a0ba36258e1e896f6
  10. 12 Nov, 2013 3 commits
      Removes conditional statements from band getting · 5ade4237
      Deb Mukherjee authored
      Implements scan order to band map with arrays in both the encoder
      and decoder to remove conditional statements.
      
      Encoding seems to be about 1% faster at speed 0, tested on football.
      Decoding seems to be about 0.5-1% faster on a set of 25 videos.
      
      Change-Id: Idb233ca0b9e0efd790e30880642e8717e1c5c8dd
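The idea of replacing conditionals with an array lookup can be illustrated as below. The band boundaries and table values are illustrative for a 16-coefficient block and are not claimed to match the tables in the commit; `band_branchy`, `band_table_4x4`, and `band_lookup` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Branching version: position in scan order -> coefficient band. */
static int band_branchy(int scan_pos) {
  if (scan_pos == 0) return 0;
  if (scan_pos < 3) return 1;
  if (scan_pos < 6) return 2;
  if (scan_pos < 10) return 3;
  if (scan_pos < 13) return 4;
  return 5;
}

/* Table version for a 16-coefficient block; values are illustrative. */
static const uint8_t band_table_4x4[16] = {
  0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5
};

/* Same mapping as band_branchy, but a single indexed load per call. */
static int band_lookup(int scan_pos) {
  assert(scan_pos >= 0 && scan_pos < 16);
  return band_table_4x4[scan_pos];
}
```

Both functions return the same band for every position; the table form trades a few bytes of read-only data for a branch-free lookup in the inner coefficient loop.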
      Enable dual buffer rd search and encoding scheme · 34b6abef
      Jingning Han authored
      This commit enables the dual buffer rate-distortion optimization
      and encoding scheme. It stores the original transform coefficients,
      quantized levels, and reconstructed coefficients during the
      rate-distortion optimization search, which eliminates the need
      to re-run residual generation, forward transform, and quantization
      in the encoding stage.
      
      Change-Id: I011bfad3a59a380a869ee552e91dae0394ec492e
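One common way to realize a dual buffer scheme is a ping-pong pair: one buffer holds the best candidate found so far while the other serves as scratch space for the next candidate. The sketch below assumes that reading of the commit message; `CoeffBuf`, `DualBuf`, and the `dual_*` helpers are hypothetical and the buffer size is arbitrary.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define MAX_COEFFS 16 /* arbitrary size for the example */

typedef struct {
  int16_t coeff[MAX_COEFFS];   /* original transform coefficients */
  int16_t qcoeff[MAX_COEFFS];  /* quantized levels */
  int16_t dqcoeff[MAX_COEFFS]; /* reconstructed coefficients */
  int eob;
} CoeffBuf;

typedef struct {
  CoeffBuf bufs[2];
  int best;       /* index of the buffer holding the best mode so far */
  long best_cost; /* RD cost of that mode */
} DualBuf;

static void dual_init(DualBuf *d) {
  assert(d != 0);
  d->best = 0;
  d->best_cost = LONG_MAX;
}

/* Buffer that the next candidate mode should write into. */
static CoeffBuf *dual_scratch(DualBuf *d) { return &d->bufs[d->best ^ 1]; }

/* Call after a candidate was evaluated in the scratch buffer. */
static int dual_commit(DualBuf *d, long cost) {
  if (cost >= d->best_cost) return 0; /* keep the old best, reuse scratch */
  d->best ^= 1;                       /* scratch becomes the new best */
  d->best_cost = cost;
  return 1;
}

/* After the search, the encoding stage reads the winner directly. */
static const CoeffBuf *dual_best(const DualBuf *d) {
  return &d->bufs[d->best];
}
```

Because the winning candidate's coefficients are still in `dual_best` when the search ends, the encoding stage in this sketch has nothing to recompute.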
      Allocate dual buffer sets for encoding · 3b3aea68
      Jingning Han authored
      Allocate memory for the dual buffer sets that store coeff, qcoeff,
      dqcoeff, and eobs. Connect the pointers in macroblock_plane and
      macroblockd_plane to the buffer currently in use.
      
      Change-Id: I2f0b5f482ca879fae39095013eaf8901db20a5a4
  11. 11 Nov, 2013 1 commit
  12. 06 Nov, 2013 1 commit
  13. 30 Oct, 2013 1 commit
  14. 24 Oct, 2013 1 commit
  15. 22 Oct, 2013 1 commit
      Removing quantize_b_4x4 function pointer. · ec414372
      Dmitry Kovalev authored
      The pointer was assigned only once, to vp9_regular_quantize_b_4x4, so the
      function is now called directly. Also removing unused declarations:
        prototype_quantize_block
        prototype_quantize_block_pair
        prototype_quantize_mb
        vp9_regular_quantize_b_4x4_pair
        vp9_regular_quantize_b_8x8
      
      Change-Id: I14325bc2f082336820671eafbc06126651b79f73
  16. 19 Oct, 2013 1 commit
      Removing NUM_ prefix from constant names. · 6d2a0da7
      Dmitry Kovalev authored
      Renames for consistency with other constants:
        NUM_FRAME_TYPES -> FRAME_TYPES
        NUM_PARTITION_CONTEXTS -> PARTITION_CONTEXTS
      
      Change-Id: I3db30acb2868eb0a424237c831087b2e264ec47f
  17. 18 Oct, 2013 2 commits
      Using INTER_MODES constant instead of MB_MODE_COUNT - NEARESTMV. · 18a4bd25
      Dmitry Kovalev authored
      Change-Id: Ie5ec392904d03fd5485474b33be8408108e9d3c9
      Make memory alloc in pick_mode_context bsize aware · 72033fcf
      Jingning Han authored
      This commit makes the allocation of the zcoeff_blk array in
      pick_mode_context block size aware. It calculates the number of
      4x4 blocks in the partition and sizes the buffer accordingly.
      Allocation (and the matching deallocation) happens once per encoding
      pass. This allows copying a smaller buffer when possible.
      
      For football at 600kbps, the runtimes improve by about 1%:
      speed 1, 45961ms -> 45472ms
      speed 2, 23863ms -> 23598ms
      
      Change-Id: Id2ca24906fa89f46fa5fe742ec4b8efc2a61f877
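Sizing a per-4x4 flag array from the partition dimensions might look like the sketch below. It assumes one byte per 4x4 block and dimensions that are multiples of 4; `PickModeCtx`, `ctx_alloc`, and `ctx_free` are hypothetical stand-ins for the real context code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Number of 4x4 blocks covered by a width x height pixel partition. */
static int num_4x4_blocks(int width, int height) {
  assert(width > 0 && height > 0 && width % 4 == 0 && height % 4 == 0);
  return (width / 4) * (height / 4);
}

/* Hypothetical mode-decision context holding one flag per 4x4 block. */
typedef struct {
  uint8_t *zcoeff_blk;
  int num_4x4;
} PickModeCtx;

/* Allocate exactly as many flags as the partition needs; 1 on success. */
static int ctx_alloc(PickModeCtx *ctx, int width, int height) {
  ctx->num_4x4 = num_4x4_blocks(width, height);
  ctx->zcoeff_blk = calloc((size_t)ctx->num_4x4, sizeof(*ctx->zcoeff_blk));
  return ctx->zcoeff_blk != NULL;
}

static void ctx_free(PickModeCtx *ctx) {
  free(ctx->zcoeff_blk);
  ctx->zcoeff_blk = NULL;
  ctx->num_4x4 = 0;
}
```

An 8x8 partition then carries 4 flags instead of the 256 a 64x64 partition needs, so copying a context for a small block moves far less memory.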
  18. 16 Oct, 2013 2 commits
  19. 15 Oct, 2013 3 commits
  20. 14 Oct, 2013 1 commit
      Move token_cache from cost_coeffs to MACROBLOCK · f60a3910
      Jingning Han authored
      This commit moves the token_cache buffer into the macroblock struct
      instead of defining it as a local variable in cost_coeffs. This avoids
      repeatedly re-allocating memory in the rate-distortion optimization loop.
      
      The runtime at speed 0 reduces:
      bus 2000kbps, 161692ms to 159951ms
      football 600kbps, 229505ms to 225821ms
      
      Change-Id: If7da6b0b6d8c5138a16271a33c4548fba33d8840
  21. 10 Oct, 2013 1 commit
      Re-design rate-distortion cost tracking buffers · fc19243c
      Jingning Han authored
      This commit re-designs the buffers that track rate-distortion costs
      per transform block. It removes redundant buffer usage, allocates
      the needed context memory once per VP9_COMP instance, and reuses the
      same buffer sets inside the rate-distortion optimization search
      loop, thereby avoiding repeated memory requests.
      
      It reduces speed 0 runtime:
      
      bus at 2000 kbps from 166763ms to 158967ms,
      football at 600 kbps from 246614ms to 234257ms.
      
      Both about 5% speed-up. Local tests suggest about 2% to 5% speed-up
      for speed 1 and 2 settings. This does not change compression
      performance.
      
      Change-Id: I363514c5276b5cf9a38c7251088ffc6ab7f9a4c3
  22. 09 Oct, 2013 1 commit
      Deprecate the use of PARTITION_INFO from encoder · 03fe08ca
      Jingning Han authored
      Use b_mode_info to store the inter prediction mode of sub8x8 blocks,
      replacing the use of partition_info. Remove the redundant buffer
      update for partition_info. For bus_cif at 2000 kbps, this seems to
      make speed 0 about 1% faster.
      
      Change-Id: Id1b3be45e75a24fb4b42335ac480c23e440978f6
  23. 01 Oct, 2013 1 commit
  24. 23 Sep, 2013 1 commit
      Enable per transformed block zero coeffs forcing · a517343c
      Jingning Han authored
      This commit forces all coefficients of a transform block to zero
      when doing so gives a lower rate-distortion cost than regular
      coefficient quantization.
      
      The overall performance improvement (including its parent patch on
      calculating rd cost per transformed block) at speed 1:
      derf:  0.298%
      yt:    0.452%
      hd:    0.741%
      stdhd: 0.006%
      
      Change-Id: I66005fe0fd7af192c3eba32e02fd6d77952accb5
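The comparison behind zero forcing can be sketched with a simplified cost model. The cost form `dist + lambda * rate` and the helper names `rd_cost` and `force_zero_coeffs` are assumptions made for the example; the encoder's actual cost macro and inputs may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified RD cost: distortion plus lambda-weighted rate (assumed form). */
static int64_t rd_cost(int64_t rate, int64_t dist, int64_t lambda) {
  assert(lambda >= 0);
  return dist + lambda * rate;
}

/*
 * Returns 1 if all coefficients of the block should be forced to zero.
 * rate_q, dist_q: rate and distortion with regular quantization.
 * rate_zero:      rate of signalling an all-zero block.
 * sse:            residual energy, i.e. the distortion if nothing is coded.
 */
static int force_zero_coeffs(int64_t rate_q, int64_t dist_q,
                             int64_t rate_zero, int64_t sse,
                             int64_t lambda) {
  return rd_cost(rate_zero, sse, lambda) < rd_cost(rate_q, dist_q, lambda);
}
```

With `lambda = 1`, a block costing 200 bits and 50 units of distortion when quantized (total 250) is zeroed if its residual energy is 120 and the zero block costs 10 bits (total 130), but not if the residual energy is 1000.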
  25. 13 Sep, 2013 1 commit
      Adaptive motion search control · c4826c59
      Jingning Han authored
      This commit enables an adaptive constraint on the motion search range
      for smaller partitions, using the motion vector of the collocated
      larger partition as a candidate initial search point.
      
      At speed 0, the runtime of bus at CIF and 2000 kbps goes from 167s
      down to 162s (3% speed-up), with a 0.01dB performance gain. At
      speed 1, the runtime goes from 33687 ms to 32142 ms (4.5% speed-up),
      with a 0.03dB performance gain.
      
      Compression performance gains at speed 1:
      derf  0.118%
      yt    0.237%
      hd    0.203%
      stdhd 0.438%
      
      Change-Id: Ic8b34c67810d9504a9579bef2825d3fa54b69454
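One plausible form of such a constraint is a narrower window centred on the larger partition's motion vector. The sketch below assumes that interpretation; `MotionVector`, `SearchWindow`, `constrain_window`, and the `range` parameter are hypothetical and not taken from the commit.

```c
#include <assert.h>

typedef struct { int row, col; } MotionVector;

/* Inclusive bounds of a motion search window, in pixels. */
typedef struct { int row_min, row_max, col_min, col_max; } SearchWindow;

static int clamp_int(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Window of +/- range pixels around the larger partition's motion vector,
 * clipped so that it never leaves the full search window.
 */
static SearchWindow constrain_window(SearchWindow full, MotionVector center,
                                     int range) {
  SearchWindow w;
  assert(range >= 0);
  w.row_min = clamp_int(center.row - range, full.row_min, full.row_max);
  w.row_max = clamp_int(center.row + range, full.row_min, full.row_max);
  w.col_min = clamp_int(center.col - range, full.col_min, full.col_max);
  w.col_max = clamp_int(center.col + range, full.col_min, full.col_max);
  return w;
}
```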
  26. 27 Aug, 2013 1 commit
  27. 24 Aug, 2013 1 commit
  28. 07 Aug, 2013 1 commit
      Use low precision 32x32fdct for encodemb in speed1 · debb9c68
      Jingning Han authored
      The low precision 32x32 fdct keeps all intermediate steps within
      16-bit depth, which allows a faster SSE2 implementation at the
      expense of larger round-trip error. Until now it was used only in the
      rate-distortion optimization search loop.
      
      Using the low precision version in place of the high precision one
      affects compression performance by about 0.7% (derf, stdhd) at
      speed 0. For speed 1, it lowers derf by only 0.017%.
      
      Change-Id: I4e7d18fac5bea5317b91c8e7dabae143bc6b5c8b
  29. 05 Aug, 2013 1 commit
      Add variance based mode/skipping · 8b3faccb
      Deb Mukherjee authored
      Adds a speed feature to skip all intra modes other than
      DC_PRED if the source variance is small. This feature is
      made part of speed 1 and up.
      
      Results on derf300: psnr -0.07%, speedup about 1-2%
      
      Also uses the source variance to fine-tune the early
      termination criteria when FLAG_EARLY_TERMINATE is on.
      This feature is made part of speed 2 and up.
      
      Results on derf300: psnr -0.52%, speedup about 5-7%
      
      Change-Id: I59e38aa836557cfa5405ae706fc64815cbfe4232
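The first speed feature in this commit message amounts to a simple predicate in the intra mode loop. The sketch below is a hedged illustration: the threshold value, the reduced `IntraMode` enum, and `skip_intra_mode` are assumptions, and only the rule itself (skip everything except DC_PRED when source variance is small, at speed 1 and up) comes from the message.

```c
#include <assert.h>

/* Reduced mode list for the example; the real encoder has more modes. */
typedef enum { DC_PRED, V_PRED, H_PRED, TM_PRED, INTRA_MODE_COUNT } IntraMode;

#define LOW_VAR_THRESH 64 /* illustrative threshold, not from the commit */

/* Returns 1 if this intra mode should be skipped for the current block. */
static int skip_intra_mode(int speed, unsigned int source_variance,
                           IntraMode mode) {
  assert(mode < INTRA_MODE_COUNT);
  if (speed < 1) return 0; /* feature is enabled at speed 1 and up */
  return mode != DC_PRED && source_variance < LOW_VAR_THRESH;
}
```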
  30. 03 Aug, 2013 1 commit
  31. 29 Jul, 2013 1 commit