1. 22 Aug, 2013 3 commits
    • vp9/encoder: fix last_frame_seg_map mem leak · a5726ac4
      James Zern authored
      remove duplicate allocation from vp9_create_compressor, it was added to
      vp9_alloc_frame_buffers in:
      
      d5bec522 Added resizing & initialization of last frame segment map
      
      Change-Id: I996723226a16a62aff8f9a52ac74e0b73cc98fdf
    • Refactor rd_pick_partition for parameter control · 01a37177
      Jingning Han authored
      This commit changes the partition search order of superblocks from
      {SPLIT, NONE, HORZ, VERT} to {NONE, SPLIT, HORZ, VERT} for
      consistency with that of sub8x8 partition search. It enables the use
      of early termination in partition search for all block sizes.
      
      For ped_area_1080p, 50 frames coded at 4000 kbps, the runtime goes
      down from 844305 ms to 818003 ms (about a 3% speed-up) at speed 0.
      
      This will further move towards making the in-search partition types
      configurable, hence unifying various speed-up approaches.
      
      Some speed 1 and 2 features are turned off during the refactoring
      process, including:
      disable_split_var_thresh
      using_small_partition_info
      
      Stricter constraints are applied to use_square_partition_only for
      right/bottom boundary blocks. Will bring back/refine these features
      subsequently. At this point, the derf set at speed 1 shows about
      0.45% better compression performance and about 9% lower run-time.
      
      Change-Id: I3db9f9d1d1a0d6cbe2e50e49bd9eda1cf705f37c
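
The reordering described in this commit can be illustrated with a minimal C sketch. It is not the encoder's code: the `PartitionType` enum, the `rd_cost_fn` callback, the `DemoCosts` table and the `stop_threshold` rule are all hypothetical names chosen for the example. The only point it shows is that evaluating NONE first lets a search stop before the more expensive partition types are tried.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical partition types, listed in the new search order. */
typedef enum { PART_NONE, PART_SPLIT, PART_HORZ, PART_VERT, PART_TYPES } PartitionType;

/* Hypothetical callback returning the rate-distortion cost of one type. */
typedef long (*rd_cost_fn)(PartitionType type, void *ctx);

/* Demo cost source: reads costs from a table and counts evaluations. */
typedef struct { long cost[PART_TYPES]; int evaluated; } DemoCosts;

static long demo_cost(PartitionType type, void *ctx) {
  DemoCosts *d = (DemoCosts *)ctx;
  d->evaluated++;
  return d->cost[type];
}

/* Tries {NONE, SPLIT, HORZ, VERT} in that order and stops as soon as the
 * best cost so far is at or below stop_threshold (assumed termination rule). */
static PartitionType pick_partition(rd_cost_fn cost, void *ctx,
                                    long stop_threshold, long *best_cost) {
  static const PartitionType order[PART_TYPES] = {
    PART_NONE, PART_SPLIT, PART_HORZ, PART_VERT
  };
  PartitionType best = PART_NONE;
  long best_rd = LONG_MAX;
  int i;

  assert(cost != NULL && best_cost != NULL);
  for (i = 0; i < PART_TYPES; ++i) {
    const long rd = cost(order[i], ctx);
    if (rd < best_rd) {
      best_rd = rd;
      best = order[i];
    }
    if (best_rd <= stop_threshold) break;  /* early termination */
  }
  *best_cost = best_rd;
  return best;
}
```

With a cost table of {40, 90, 70, 80} and a threshold of 50, the sketch returns after a single evaluation; with the old order it would have had to evaluate SPLIT first.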
    • Fixes on feature disabling split based on variance · 8b810c7a
      Deb Mukherjee authored
      Adds a couple of minor fixes, which may be absorbed in Jingning's
      patch. Thanks to Guillaume for pointing these out.
      Also adjusts the thresholds for speed 1 and 2 to 16 and 32
      respectively, to keep quality drops small.
      
      Results:
      --------
      derfraw300:  threshold = 16, psnr -0.082%, speedup 2-3%
                   threshold = 32, psnr -0.218%, speedup 5-6%
      stdhdraw250: threshold = 16, psnr -0.031%, speedup 2-3%
                   threshold = 32, psnr -0.273%, speedup 5-6%
      
      Change-Id: I4b11ae8296cca6c2a9f644be7e40de7c423b8330
  2. 20 Aug, 2013 2 commits
    • Cleanup/enhancements of switchable filter search · 2ffe64ad
      Deb Mukherjee authored
      Cleans up the switchable filter search logic. Also adds a
      speed feature - a variance threshold - to disable filter search
      if source variance is lower than this value.
      
      Results: derfraw300
      threshold = 16, psnr -0.238%, 4-5% speedup (tested on football)
      threshold = 32, psnr -0.381%, 8-9% speedup (tested on football)
      threshold = 64, psnr -0.611%, 12-13% speedup (tested on football)
      threshold = 96, psnr -0.804%, 16-17% speedup (tested on football)
      
      Based on these results, the threshold is chosen as 16 for speed 1,
      32 for speed 2, 64 for speed 3 and 96 for speed 4.
      
      Change-Id: Ib630d39192773b1983d3d349b97973768e170c04
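
As a rough C sketch of the speed feature above: the thresholds 16, 32, 64 and 96 come from the commit message, while the function names, the value 0 for speed 0, and the clamping of speeds above 4 are assumptions made for the example.

```c
#include <assert.h>

/* Variance threshold per speed setting. Speeds 1..4 use the values from the
 * commit message; speed 0 = 0 (feature off) and the clamp above speed 4 are
 * assumptions of this sketch. */
static unsigned int filter_search_var_thresh(int speed) {
  static const unsigned int thresh[] = { 0, 16, 32, 64, 96 };
  assert(speed >= 0);
  if (speed > 4) speed = 4;
  return thresh[speed];
}

/* Returns 1 if the switchable filter search should run for a block, 0 if it
 * is skipped because the source variance is lower than the threshold. */
static int should_search_filters(unsigned int source_variance, int speed) {
  return source_variance >= filter_search_var_thresh(speed);
}
```

For a block with source variance 20, this sketch still searches filters at speed 1 but skips the search at speed 2.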
    • Changes to auto partition size selection. · e8923fe4
      Paul Wilkins authored
      Changes to code to auto select a partition size range
      based on data from spatial neighbors.
      
      Now looks at the sb_type in each 8x8 block of above
      and left SB64.
      
      The effect on speed 1 is now weaker giving better
      quality but less speed gain. Now also used in speed 2.
      
      Change-Id: Iace33a97d5c3498dd2a9a8a4067351941abcbabc
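
A minimal C sketch of deriving a partition size range from the above and left SB64, assuming a simplified, hypothetical `BlockSize` enum (square sizes only, ordered small to large) and one stored block type per 8x8 unit. The handling of missing neighbors is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified block sizes ordered from small to large. */
typedef enum { BLK_8X8, BLK_16X16, BLK_32X32, BLK_64X64 } BlockSize;

/* A 64x64 superblock holds 8 x 8 = 64 units of 8x8 pixels. */
#define BLOCKS_8X8_PER_SB64 64

/* Scans the block type stored for each 8x8 unit of the above and left SB64
 * and reports the smallest and largest size seen. A NULL pointer means the
 * neighbor is unavailable; with no neighbors the full range is allowed. */
static void partition_range_from_neighbors(const BlockSize *above_sb64,
                                           const BlockSize *left_sb64,
                                           BlockSize *min_size,
                                           BlockSize *max_size) {
  const BlockSize *neighbors[2];
  int n, i, seen = 0;

  assert(min_size != NULL && max_size != NULL);
  neighbors[0] = above_sb64;
  neighbors[1] = left_sb64;
  *min_size = BLK_64X64;
  *max_size = BLK_8X8;

  for (n = 0; n < 2; ++n) {
    if (neighbors[n] == NULL) continue;
    for (i = 0; i < BLOCKS_8X8_PER_SB64; ++i) {
      if (neighbors[n][i] < *min_size) *min_size = neighbors[n][i];
      if (neighbors[n][i] > *max_size) *max_size = neighbors[n][i];
    }
    seen = 1;
  }
  if (!seen) {
    *min_size = BLK_8X8;
    *max_size = BLK_64X64;
  }
}
```

The partition search for the current superblock would then be limited to sizes between `*min_size` and `*max_size`.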
  3. 16 Aug, 2013 1 commit
  4. 15 Aug, 2013 2 commits
    • Moving segmentation struct from MACROBLOCKD to VP9_COMMON. · b7616e38
      Dmitry Kovalev authored
      VP9_COMMON is the right place for the segmentation struct because it has
      global segmentation parameters, not something specific to macroblock
      processing.
      
      Change-Id: Ib9ada0c06c253996eb3b5f6cccf6a323fbbba708
    • Speed feature to skip split partition based on var · 24856b6a
      Deb Mukherjee authored
      Adds a speed feature to disable split partition search based on a
      given threshold on the source variance. A tighter threshold derived
      from the threshold provided is used to also disable horizontal and
      vertical partitions.
      
      Results on derfraw300:
      threshold = 16, psnr = -0.057%, speedup ~1% (football)
      threshold = 32, psnr = -0.150%, speedup ~4-5% (football)
      threshold = 64, psnr = -0.570%, speedup ~10-12% (football)
      
      Results on stdhdraw250:
      threshold = 32, psnr = -0.18%, speedup is somewhat more than derf
      because of a larger number of smoother blocks at higher resolution.
      
      Based on these results, a threshold of 32 is chosen for speed 1,
      and a threshold of 64 is chosen for speeds 2 and above.
      
      Change-Id: If08912fb6c67fd4242d12a0d094783a99f52f6c6
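
The decision described in this commit might look roughly like the C sketch below. The speed-to-threshold mapping (32 for speed 1, 64 for speed 2 and above) is taken from the message. The commit does not say how the tighter threshold is derived, so halving it is purely an assumption, as are all names and the value 0 for speed 0.

```c
#include <assert.h>

typedef struct {
  int allow_split;  /* search the SPLIT partition */
  int allow_rect;   /* search the HORZ and VERT partitions */
} PartitionSearchFlags;

/* Threshold per speed: 32 at speed 1, 64 at speed 2 and above (from the
 * commit message). Speed 0 = 0, meaning the feature is off, is assumed. */
static unsigned int split_var_thresh_for_speed(int speed) {
  assert(speed >= 0);
  if (speed == 0) return 0;
  return speed == 1 ? 32 : 64;
}

static PartitionSearchFlags partition_flags_from_variance(
    unsigned int source_variance, unsigned int split_thresh) {
  /* ASSUMPTION: the tighter threshold is half of the split threshold. */
  const unsigned int rect_thresh = split_thresh / 2;
  PartitionSearchFlags flags;
  flags.allow_split = source_variance >= split_thresh;
  flags.allow_rect = source_variance >= rect_thresh;
  return flags;
}
```

Under these assumptions a block with variance 20 at threshold 32 skips the split search but still tries the horizontal and vertical partitions, while a block with variance 10 skips all three.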
  5. 13 Aug, 2013 1 commit
    • Trivial clean up. · 5459f68d
      Paul Wilkins authored
      Delete unused / commented out variable references.
      
      Change-Id: Iaf20c0c3744f89adb296d153b516b5ea41b4f3b4
  6. 12 Aug, 2013 2 commits
  7. 10 Aug, 2013 2 commits
  8. 09 Aug, 2013 3 commits
  9. 08 Aug, 2013 1 commit
    • Adds a new subpel motion function · 1ba91a84
      Deb Mukherjee authored
      Adds a new subpel motion estimation function that uses a 2-level
      decision tree to eliminate redundant computations.
      It searches fewer points than iterative search (which can search
      the same point multiple times) but has roughly the same quality.
      
      This is made the default setting at speeds 0 and 1, while at
      speed 2 and above only a 1-level search is used.
      
      Also includes various cleanups for consistency and redundancy removal.
      
      Results:
      derf: +0.012% psnr
      stdhd: +0.09% psnr
      Speedup of about 2-3%
      
      Change-Id: Iedde4866f5475586dea0f0ba4cb7428fba24eee9
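
One way to picture a tree-structured subpel search is the C sketch below. It is an illustration under stated assumptions, not the function added by this commit: positions are in quarter-pel units, each level tests the four axis neighbors and then a single diagonal chosen from the better horizontal and vertical neighbor, and the `Mv`, `mv_cost_fn` and `DemoTarget` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int row, col; } Mv;  /* quarter-pel units in this sketch */
typedef long (*mv_cost_fn)(Mv mv, void *ctx);

/* Demo cost: squared distance to a target vector; counts evaluations. */
typedef struct { Mv target; int calls; } DemoTarget;

static long demo_mv_cost(Mv mv, void *ctx) {
  DemoTarget *d = (DemoTarget *)ctx;
  const long dr = mv.row - d->target.row;
  const long dc = mv.col - d->target.col;
  d->calls++;
  return dr * dr + dc * dc;
}

/* Evaluates one candidate and keeps it if it beats the best so far. */
static long try_point(Mv cand, mv_cost_fn cost, void *ctx,
                      Mv *best, long *best_cost) {
  const long c = cost(cand, ctx);
  if (c < *best_cost) {
    *best_cost = c;
    *best = cand;
  }
  return c;
}

/* levels = 2: half-pel level, then quarter-pel level. levels = 1: half-pel
 * only. Each level costs 5 evaluations and never revisits a point. */
static Mv subpel_tree_search(Mv start, int levels, mv_cost_fn cost, void *ctx) {
  Mv best = start;
  long best_cost;
  int level, step;

  assert(cost != NULL && levels >= 1 && levels <= 2);
  best_cost = cost(start, ctx);
  for (level = 0, step = 2; level < levels; ++level, step >>= 1) {
    const Mv c = best;  /* centre of this level */
    Mv p;
    long left, right, up, down;

    p = c; p.col -= step; left  = try_point(p, cost, ctx, &best, &best_cost);
    p = c; p.col += step; right = try_point(p, cost, ctx, &best, &best_cost);
    p = c; p.row -= step; up    = try_point(p, cost, ctx, &best, &best_cost);
    p = c; p.row += step; down  = try_point(p, cost, ctx, &best, &best_cost);

    /* Only the diagonal between the better two axis neighbors is tested. */
    p.row = c.row + (up < down ? -step : step);
    p.col = c.col + (left < right ? -step : step);
    try_point(p, cost, ctx, &best, &best_cost);
  }
  return best;
}
```

Passing `levels = 1` corresponds to the 1-level variant mentioned for speed 2 and above.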
  10. 07 Aug, 2013 2 commits
    • Use low precision 32x32fdct for encodemb in speed1 · debb9c68
      Jingning Han authored
      The low precision 32x32 fdct has all the intermediate steps within
      16-bit depth, hence allowing faster SSE2 implementation, at the
      expense of larger round-trip error. It was used in the rate-distortion
      optimization search loop only.
      
      Using the low precision version, in place of the high precision one,
      affects the compression performance by about 0.7% (derf, stdhd) at
      speed 0. For speed 1, it makes derf set down by only 0.017%.
      
      Change-Id: I4e7d18fac5bea5317b91c8e7dabae143bc6b5c8b
    • Clean ups of the subpel search functions · 71b43b0f
      Deb Mukherjee authored
      Removes some unused code and speed features, and organizes the
      interfaces for fractional mv step functions for use in new speed
      features to come.
      
      In the process a new speed feature - number of iterations per
      step during the subpel search - is exposed.
      
      No change when this parameter is set as the original value of 3.
      
      Results:
      subpel_iters_per_step = 3: baseline
      subpel_iters_per_step = 2: psnr -0.067%, 1% speedup
      subpel_iters_per_step = 1: psnr -0.331%, 3-4% speedup
      
      Change-Id: I2eba8a21f6461be8caf56af04a5337257a5693a8
  11. 06 Aug, 2013 1 commit
    • Flexible support for various pattern searches · 15b5a6a2
      Deb Mukherjee authored
      Adds a few pattern searches to achieve various tradeoffs
      between motion estimation complexity and performance.
      The search framework is unified across these searches so that a
      common pattern search function is used for all. Besides it will
      be easier to experiment with various patterns or combinations
      thereof at different scales in the future.
      
      The new pattern search is multi-scale and is capable of using
      different patterns at different scales.
      
      The new hex search uses 8 points at the smallest scale
      and 6 points at other scales.
      Two other pattern searches - big-diamond and square are
      also added. Big diamond uses 4 points at the smallest scale and
      8 points in diamond shape at the larger scales.
      Square is very similar conceptually to the default n-step search
      but is somewhat faster since it keeps only one survivor across
      all scales.
      
      Psnr/speed-up results on derf300:
      
      hex: -1.6% psnr, 6-8% speedup
      big-diamond: -0.96% psnr, 4-5% speedup
      square: -0.93% psnr, 4-5% speedup
      
      Change-Id: I02a7ef5193f762601e0994e2c99399a3535a43d2
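
A unified multi-scale pattern search of the kind described above can be sketched in C as follows. Only the point counts for the hex search (8 at the smallest scale, 6 elsewhere) and the single-survivor behaviour come from the commit message. The offset tables, the scale multiplier, the coarse-to-fine loop and all names are assumptions, and the sketch omits search-range clamping.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int row, col; } Pos;
typedef long (*pos_cost_fn)(Pos p, void *ctx);

#define MAX_PATTERN_POINTS 8
typedef struct {
  int num_points;
  Pos offsets[MAX_PATTERN_POINTS];
} Pattern;

/* Assumed offsets: 8 points around the centre at the smallest scale ... */
static const Pattern kHexSmall = {
  8, { {-1,-1}, {-1,0}, {-1,1}, {0,-1}, {0,1}, {1,-1}, {1,0}, {1,1} }
};
/* ... and a 6-point hexagon, scaled up by a multiplier, at larger scales. */
static const Pattern kHexLarge = {
  6, { {-1,-2}, {1,-2}, {2,0}, {1,2}, {-1,2}, {-2,0} }
};

/* Demo cost: squared distance to a target position passed through ctx. */
static long squared_distance_cost(Pos p, void *ctx) {
  const Pos *t = (const Pos *)ctx;
  const long dr = p.row - t->row;
  const long dc = p.col - t->col;
  return dr * dr + dc * dc;
}

/* Searches from the coarsest scale down to the finest, keeping one survivor.
 * At each scale the pattern is re-centred until no point improves the cost. */
static Pos pattern_search(Pos start, int num_scales, pos_cost_fn cost, void *ctx) {
  Pos best = start;
  long best_cost;
  int s;

  assert(cost != NULL && num_scales >= 1);
  best_cost = cost(start, ctx);
  for (s = num_scales - 1; s >= 0; --s) {
    const Pattern *pat = (s == 0) ? &kHexSmall : &kHexLarge;
    const int mult = (s == 0) ? 1 : (1 << (s - 1));
    int improved;
    do {
      const Pos centre = best;
      int i;
      improved = 0;
      for (i = 0; i < pat->num_points; ++i) {
        Pos p;
        long c;
        p.row = centre.row + pat->offsets[i].row * mult;
        p.col = centre.col + pat->offsets[i].col * mult;
        c = cost(p, ctx);
        if (c < best_cost) {
          best_cost = c;
          best = p;
          improved = 1;
        }
      }
    } while (improved);
  }
  return best;
}
```

Because the search function only reads a `Pattern` table, adding a big-diamond or square search in this sketch would mean supplying different tables rather than a new function.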
  12. 05 Aug, 2013 2 commits
    • Add variance based mode/skipping · 8b3faccb
      Deb Mukherjee authored
      Adds a speed feature to skip all intra modes other than
      DC_PRED if the source variance is small. This feature is
      made part of speed 1 and up.
      
      Results on derf300: psnr -0.07%, speedup about 1-2%
      
      Also uses the source variance to fine-tune the early
      termination criteria when FLAG_EARLY_TERMINATE is on.
      This feature is made part of speed 2 and up.
      
      Results on derf300: psnr -0.52%, speedup about 5-7%
      
      Change-Id: I59e38aa836557cfa5405ae706fc64815cbfe4232
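
The first feature in this commit amounts to a simple filter on the intra mode loop, sketched below in C. `DC_PRED` is named in the commit message; the shortened `IntraMode` enum, the function name and the threshold parameter are hypothetical, and the commit does not state the threshold value.

```c
#include <assert.h>

/* Shortened, hypothetical list of intra modes for the example. */
typedef enum { DC_PRED, V_PRED, H_PRED, TM_PRED, INTRA_MODE_COUNT } IntraMode;

/* Returns 1 if the mode should be evaluated, 0 if it is skipped. When the
 * source variance is small, every intra mode except DC_PRED is skipped. */
static int should_test_intra_mode(IntraMode mode, unsigned int source_variance,
                                  unsigned int var_thresh) {
  assert((int)mode < (int)INTRA_MODE_COUNT);
  if (source_variance < var_thresh && mode != DC_PRED) return 0;
  return 1;
}
```

A caller would invoke this once per candidate mode inside the mode search loop and continue to the next mode when it returns 0.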
    • Replacing long block size enum values with shorter ones (2). · d007446b
      Dmitry Kovalev authored
      Change-Id: I428c4d42212b757112e3acfe5b81314cfbb5fd6b
  13. 03 Aug, 2013 1 commit
  14. 01 Aug, 2013 2 commits
  15. 30 Jul, 2013 2 commits
  16. 26 Jul, 2013 1 commit
    • Auto min and max partition size experiment. · fe5e2a91
      Paul Wilkins authored
      Speed feature experiment to set an upper and lower
      partition size limit based on what has been seen
      in spatial neighbors.
      
      This seems to give quite reasonable speed gains in local tests
      (10-15%) and when used with speed 0 the losses are small
      (0.25% derf, 0.35% stdhd). However, for now I am only
      enabling it on speed 1 as there may be clashes with the existing
      temporal partition selection in speed 2.
      
      Using a tighter min / max around the range derived from the
      neighbors increases speed further but at the cost of a
      bigger quality loss. However, I think this spatial method could
      be combined with data from either the last frame or a variance
      method (or both) to refine the range of minimum and maximum
      partition size. I.e. consider the min and max from spatial and
      temporal neighbors and the variance recommendation.
      
      Change-Id: I1b96bf8b84368d6aad0c7aa600fe141b4f07435f
  17. 25 Jul, 2013 2 commits
    • General cleanups. · 7131cb0e
      Dmitry Kovalev authored
      Removing unused constants, macros, and function declarations. Using
      ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving
      #include from *.h to *.c. Merging for loops for motion vectors.
      
      Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
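
For readers unfamiliar with the macro named above, the C sketch below shows the kind of hand-written rounding shift it replaces. The macro body is an assumed definition (add half of 2^n, then shift right by n) and should be checked against the source tree. `round_shift` and `average4` are hypothetical helpers.

```c
#include <assert.h>

/* Assumed definition: round to nearest when dividing by 2^n. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))

/* Checked wrapper for the example: n must be at least 1 so that the
 * rounding offset 1 << (n - 1) is well defined. */
static int round_shift(int value, int n) {
  assert(n >= 1 && n < 31);
  return ROUND_POWER_OF_TWO(value, n);
}

/* Rounded average of four samples, instead of writing (sum + 2) >> 2. */
static int average4(int a, int b, int c, int d) {
  return ROUND_POWER_OF_TWO(a + b + c + d, 2);
}
```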
    • Removing vp9_adapt_mode_context function. · 47d61f00
      Dmitry Kovalev authored
      Moving code from vp9_adapt_mode_context to vp9_adapt_mode_probs.
      
      Change-Id: I60829c30b28968cd813551ef3a206dfb98d323c9
  18. 24 Jul, 2013 2 commits
  19. 23 Jul, 2013 1 commit
    • Renaming of segment constants. · 32042af1
      Paul Wilkins authored
      Renamed:
        MAX_MB_SEGMENTS to MAX_SEGMENTS
        MB_SEG_TREE_PROBS to SEG_TREE_PROBS
      
      The minimum unit for segmentation in the segment map
      is now 8x8 so it is misleading to use MB_ as macro-block
      traditionally refers to a 16x16 region.
      
      Change-Id: I0b55a6f0426bb46dd13435fcfa5bae0a30a7fa22
  20. 22 Jul, 2013 2 commits
    • Re-order mode search in rd. · 1d189d64
      Paul Wilkins authored
      Mode search order in rd loop changed to better reflect
      observed hit counts.
      
      Also some adjustment of the baseline mode rd thresholds
      to reflect the order change and observed frequencies.
      
      Change-Id: I47a131cc83e11551df8add6d6d8d413d78d3a63c
    • Fix build error. · 888375d2
      Paul Wilkins authored
      When CONFIG_POSTPROC is set there was a now
      invalid reference to cm->filter_level.
      
      Changed to cpi->mb.e_mbd.lf.filter_level in line with
      change Iaf5fb71c33719cdfa1b991f671caf071be9ea035
      
      Change-Id: If746e60044903f7ba8d0d346225b3d015226c7d0
  21. 19 Jul, 2013 2 commits
    • Moving all loop filter related variables into new struct. · ee1771eb
      Dmitry Kovalev authored
      Adding loopfilter struct with fields from MACROBLOCKD and VP9Common.
      Eventually it will be moved to vp9_loopfilter.h for better code structure.
      
      Change-Id: Iaf5fb71c33719cdfa1b991f671caf071be9ea035
    • Reworked the auto_mv_step_size speed feature · 302698fb
      Deb Mukherjee authored
      This patch modifies the auto_mv_step_size speed feature to
      use a combination of the maximum magnitude mv from the last
      inter frame, and the maximum magnitude mv for the two reference
      mvs with the same reference. For arf frames, the max mv step
      for the resolution is used.
      The bounds therefore are slightly tighter. The feature is made
      a speed 1 feature.
      
      Rebased.
      
      Results (when this feature is turned on over speed 0):
      derfraw300: -0.046% psnr, about 5+% speedup
      (tested on football: goes from 4m30.760s to 4m17.410s).
      
      Change-Id: If492797a61b0b4b3e58c0b8f86afb880165fc9f6
  22. 18 Jul, 2013 2 commits
  23. 17 Jul, 2013 1 commit