1. 10 Aug, 2013 1 commit
  2. 09 Aug, 2013 3 commits
  3. 08 Aug, 2013 1 commit
    • Deb Mukherjee's avatar
      Adds a new subpel motion function · 1ba91a84
      Deb Mukherjee authored
      Adds a new subpel motion estimation function that uses a 2-level
      tree-structured decision tree to eliminate redundant computations.
      It searches fewer points than iterative search (which can search
      the same point multiple times) but has the same quality roughly.
      
      This is made the default setting at speeds 0 and 1, while at
      speed 2 and above only a 1-level search is used.
      
      Also includes various cleanups for consistency and redundancy removal.
      
      Results:
      derf: +0.012% psnr
      stdhd: +0.09% psnr
      Speedup of about 2-3%
      
      Change-Id: Iedde4866f5475586dea0f0ba4cb7428fba24eee9
      1ba91a84
  4. 07 Aug, 2013 2 commits
    • Jingning Han's avatar
      Use low precision 32x32fdct for encodemb in speed1 · debb9c68
      Jingning Han authored
      The low precision 32x32 fdct has all the intermediate steps within
      16-bit depth, hence allowing faster SSE2 implementation, at the
      expense of larger round-trip error. It was used in the rate-distortion
      optimization search loop only.
      
      Using the low precision version, in replace of the high precision one,
      affects the compression performance by about 0.7% (derf, stdhd) at
      speed 0. For speed 1, it makes derf set down by only 0.017%.
      
      Change-Id: I4e7d18fac5bea5317b91c8e7dabae143bc6b5c8b
      debb9c68
    • Deb Mukherjee's avatar
      Clean ups of the subpel search functions · 71b43b0f
      Deb Mukherjee authored
      Removes some unused code and speed features, and organizes the
      interfaces for fractional mv step functions for use in new speed
      features to come.
      
      In the process a new speed feature - number of iterations per
      step during the subpel search - is exposed.
      
      No change when this parameter is set as the original value of 3.
      
      Results:
      subpel_iters_per_step = 3: baseline
      subpel_iters_per_step = 2: psnr -0.067%, 1% speedup
      subpel_iters_per_step = 1: psnr -0.331%, 3-4% speedup
      
      Change-Id: I2eba8a21f6461be8caf56af04a5337257a5693a8
      71b43b0f
  5. 06 Aug, 2013 1 commit
    • Deb Mukherjee's avatar
      Flexible support for various pattern searches · 15b5a6a2
      Deb Mukherjee authored
      Adds a few pattern searches to achieve various tradeoffs
      between motion estimation complexity and performance.
      The search framework is unified across these searches so that a
      common pattern search function is used for all. Besides it will
      be easier to experiment with various patterns or combinations
      thereof at different scales in the future.
      
      The new pattern search is multi-scale and is capable of using
      different patterns at different scales.
      
      The new hex search uses 8 points at the smallest scale
      and 6 points at other scales.
      Two other pattern searches - big-diamond and square are
      also added. Big diamond uses 4 points at the smallest scale and
      8 points in diamond shape at the larger scales.
      Square is very similar conceptually to the default n-step search
      but is somewhat faster since it keeps only one survivor across
      all scales.
      
      Psnr/speed-up results on derf300:
      
      hex: -1.6% psnr%, 6-8% speed-up
      big-diamond: -0.96% psnr, 4-5% speedup
      square: -0.93% psnr, 4-5% speedup
      
      Change-Id: I02a7ef5193f762601e0994e2c99399a3535a43d2
      15b5a6a2
  6. 05 Aug, 2013 2 commits
    • Deb Mukherjee's avatar
      Add variance based mode/skipping · 8b3faccb
      Deb Mukherjee authored
      Adds a speed feature to skip all intra modes other than
      DC_PRED if the source variance is small. This feature is
      made part of speed 1 and up.
      
      Results on derf300: psnr -0.07%, speedup about 1-2%
      
      Also uses the source variance to fine-tune the early
      termination criteria when FLAG_EARLY_TERMINATE is on.
      This feature is made part of speed 2 and up.
      
      Results on derf300: psnr -0.52%, speedup about 5-7%
      
      Change-Id: I59e38aa836557cfa5405ae706fc64815cbfe4232
      8b3faccb
    • Dmitry Kovalev's avatar
      Replacing long block size enum values with shorter ones (2). · d007446b
      Dmitry Kovalev authored
      Change-Id: I428c4d42212b757112e3acfe5b81314cfbb5fd6b
      d007446b
  7. 03 Aug, 2013 1 commit
  8. 01 Aug, 2013 2 commits
  9. 30 Jul, 2013 2 commits
  10. 26 Jul, 2013 1 commit
    • Paul Wilkins's avatar
      Auto min and max partition size experiment. · fe5e2a91
      Paul Wilkins authored
      Speed feature experiment to set an upper and lower
      partition size limit based on what has been seen
      in spatial neighbors.
      
      This seems to gives quite reasonable speed gains in local
      (10-15%) and when used with speed 0 the losses are small
      (0.25% derf, 0.35% stdhd). However, for now I am only
      enabling it on speed 1 as there may be clashes with the existing
      temporal partition selection in speed 2.
      
      Using a tighter min / max around the range derived from the
      neighbors increases speed further but at the cost of a
      bigger quality loss. However,  I think this spatial method could
      be combined with data from either the last frame or a variance
      method (or both) to refine the range of minimum and maximum
      partition size. I.e. consider the min and max from spatial and
      temporal neighbors and the variance recommendation.
      
      Change-Id: I1b96bf8b84368d6aad0c7aa600fe141b4f07435f
      fe5e2a91
  11. 25 Jul, 2013 2 commits
    • Dmitry Kovalev's avatar
      General cleanups. · 7131cb0e
      Dmitry Kovalev authored
      Removing unused constants, macros, and function declarations. Using
      ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving
      #include from *.h to *.c. Merging for loops for motion vectors.
      
      Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
      7131cb0e
    • Dmitry Kovalev's avatar
      Removing vp9_adapt_mode_context function. · 47d61f00
      Dmitry Kovalev authored
      Moving code from vp9_adapt_mode_context to vp9_adapt_mode_probs.
      
      Change-Id: I60829c30b28968cd813551ef3a206dfb98d323c9
      47d61f00
  12. 24 Jul, 2013 2 commits
  13. 23 Jul, 2013 1 commit
    • Paul Wilkins's avatar
      Renaming of segment constants. · 32042af1
      Paul Wilkins authored
      Renamed:
        MAX_MB_SEGMENTS to MAX_SEGMENTS
        MB_SEG_TREE_PROBS to SEG_TREE_PROBS
      
      The minimum unit for segmentation in the segment map
      is now 8x8 so it is misleading to use MB_ as macro-block
      traditionally refers to a 16x16 region.
      
      Change-Id: I0b55a6f0426bb46dd13435fcfa5bae0a30a7fa22
      32042af1
  14. 22 Jul, 2013 2 commits
    • Paul Wilkins's avatar
      Re-order mode search in rd. · 1d189d64
      Paul Wilkins authored
      Mode search order in rd loop changed to better reflect
      observed hit counts.
      
      Also some adjustment of the baseline mode rd thresholds
      to reflect the order change and observed frequencies.
      
      Change-Id: I47a131cc83e11551df8add6d6d8d413d78d3a63c
      1d189d64
    • Paul Wilkins's avatar
      Fix build error. · 888375d2
      Paul Wilkins authored
      When CONFIG_POSTPROC is set there was a now
      invalid reference to cm->filter_level.
      
      Changed to cpi->mb.e_mbd.lf.filter_level in line with
      change Iaf5fb71c33719cdfa1b991f671caf071be9ea035
      
      Change-Id: If746e60044903f7ba8d0d346225b3d015226c7d0
      888375d2
  15. 19 Jul, 2013 2 commits
    • Dmitry Kovalev's avatar
      Moving all loop filter related variables into new struct. · ee1771eb
      Dmitry Kovalev authored
      Adding loopfilter struct with fields from MACROBLOCKD and VP9Common.
      Eventually it will be moved to vp9_loopfilter.h for better code structure.
      
      Change-Id: Iaf5fb71c33719cdfa1b991f671caf071be9ea035
      ee1771eb
    • Deb Mukherjee's avatar
      Reworked the auto_mv_step_size speed feature · 302698fb
      Deb Mukherjee authored
      This patch modifies the auto_mv_step_size speed feature to
      use a combination of the maximum magnitude mv from the last
      inter frame, and the maximum magnitude mv for the two reference
      mvs with the same reference. For arf frames, the max mav step
      for the resolution is used.
      The bounds therefore are slightly tighter. The feature is made
      a speed 1 feature.
      
      Rebased.
      
      Results (when this feature is turned on over speed 0):
      derfraw300: -0.046% psnr, about 5+% speedup
      (tested on football: goes from 4m30.760s to 4m17.410s).
      
      Change-Id: If492797a61b0b4b3e58c0b8f86afb880165fc9f6
      302698fb
  16. 18 Jul, 2013 2 commits
  17. 17 Jul, 2013 4 commits
    • Yunqing Wang's avatar
      Remove unnecessary calling of vp9_init_quantizer() · 3798db88
      Yunqing Wang authored
      vp9_init_quantizer() is called in vp9_create_compressor(), and
      should not be called in vp9_set_speed_features().
      
      Change-Id: Ic2f1f4b0531b9d46bb841d7e1d8da9812207dad6
      3798db88
    • Yunqing Wang's avatar
      Enable disable_splitmv feature for other speeds · 10e83b07
      Yunqing Wang authored
      Added disable_splitmv feature at other speed levels. For speed 3 or
      above, always turn it on.
      
      Change-Id: Ibb36f0a7ef12a34b4f8d0f9cb6193eab43b34360
      10e83b07
    • Yunqing Wang's avatar
      Speed up motion estimation using small partitions' result(experiment) · df90d58f
      Yunqing Wang authored
      Current partition checking starts from small sizes, and then goes up
      to large sizes. This experiment uses the small partitions' motion
      estimation result, which is already available, to speed up the
      large partition's motion estimation. We can decide to skip some
      patition checkings if they are unlikely choices. We could use the
      motion vector(MV) result as current partition's prediction MV, limit
      the search range and reference frame.
      
      Current result at speed 1:
      psnr loss: 1.19% for stdhd, 0.287% for derf.
      speed gain: 14% for sunflower(hd), 11% for akiyo.
      
      Further improvement will be done later.
      
      Change-Id: I5abfd070e9cace2e91e2a0247d1325df313887ab
      df90d58f
    • Paul Wilkins's avatar
      Move uv intra mode selection in rd loop. · 2ee338ce
      Paul Wilkins authored
      Use an estimate based on DC_PRED for intra uv cost
      within the rd loop then only do a full uv mode analysis
      if an intra mode is chosen.
      
      Significant speed gains in some cases. Currently only
      enabled for speed 2 pending speed/quality tests.
      
      Change-Id: Ie851a12400d5483bce47ec0e3ccb8516041e91c0
      2ee338ce
  18. 16 Jul, 2013 3 commits
    • Dmitry Kovalev's avatar
      Cleaning up tile code. · 9482a0bf
      Dmitry Kovalev authored
      Removing tile_rows and tile_columns from VP9Common, removing redundant
      constants MIN_TILE_WIDTH and MAX_TILE_WIDTH, changing signature of
      vp9_get_tile_n_bits.
      
      Change-Id: I8ff3104a38179b2c6900df965c144c1d6f602267
      9482a0bf
    • James Zern's avatar
      use consistent framerate naming · 9581eb6e
      James Zern authored
      s/frame_rate/framerate/g
      
      Change-Id: I6fc3e088e419c5f46e3a9390dd8a2cad2677a2fc
      9581eb6e
    • Yaowu Xu's avatar
      Change to extend full border only when needed · 5b915ebd
      Yaowu Xu authored
      This is a short term optimization till we work out a decoder
      implementation requiring no frame border extension.
      
      Change-Id: I02d15bfde4d926b50a4e58b393d8c4062d1be70f
      5b915ebd
  19. 15 Jul, 2013 1 commit
    • Jingning Han's avatar
      Skip duplicate block encoding in the rd loop · faff6ed0
      Jingning Han authored
      This speed feature allows the encoder to largely remove the spatial
      dependency between blocks inside a 64x64 superblock, thereby removing
      the need to repeatedly encode superblocks per partition type in the
      rate-distortion optimization loop.
      
      A major challenge lies in the intra modes tested in the rate-distortion
      optimization loop. The subsequent blocks do not have access to the
      reconstructed boundary pixels without the intermediate coding steps.
      This was resolved by using the original pixels for intra prediction
      in the rd loop, followed by an appropriately designed distortion
      modeling on the quantization parameters. Experiments also suggested
      that the performance impact is more discernible at lower bit-rate/psnr
      settings. Hence a quantizer dependent threshold is applied to deactivate
      skip of block coding.
      
      For bus_cif at 2000 kbps,
      speed 0: runtime 269854ms -> 237774ms (12% speed-up) at 0.05dB
               performance loss.
      
      speed 1: runtime 65312ms  -> 61536ms, (7% speed-up) at 0.04dB
               performance loss.
      
      This operation is currently turned on in settings of speed 1.
      
      Change-Id: Ib689741dfff8dd38365d8c1b92860a3e176f56ec
      faff6ed0
  20. 12 Jul, 2013 1 commit
    • James Zern's avatar
      yv12config: remove YUV_TYPE · 4fc6c88e
      James Zern authored
      this was never fleshed out in the context of VP8, for which it was
      added. for VP9 it has no meaning.
      
      Change-Id: Iba2ecc026d9e947067b96690245d337e51e26eff
      4fc6c88e
  21. 11 Jul, 2013 3 commits
    • Dmitry Kovalev's avatar
      Moving segmentation related vars into separate struct. · c4ad3273
      Dmitry Kovalev authored
      Adding segmentation struct to vp9_seg_common.h. Struct members are from
      macroblockd and VP9Common structs. Moving segmentation related constants
      and enums to vp9_seg_common.h.
      
      Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03
      c4ad3273
    • Scott LaVarnway's avatar
      Eliminated prev_mip memsets/memcpys in encoder · f2a6bcfb
      Scott LaVarnway authored
      This patch is in experimental but was not merged into master.
      
      This patch swaps ptrs instead of copying and uses the
      last show_frame flag instead of setting the entire buffer
      to zero.
      
      Change-Id: Ia0950466c8ba301a2a5bf917ff3d07bc1a2c2311
      f2a6bcfb
    • Paul Wilkins's avatar
      Speed 2 feature adjustment. · 5290eeab
      Paul Wilkins authored
      With sf->auto_mv_step_size on it is questionable
      whether sf->reduce_first_step_size is worthwhile.
      At speed 2 it was not having a big impact.
      
      Even at speed 2 sf->optimize_coefficients = 0 is not
      having a big speed imapct so for now I have moved it
      down into a higher speed setting.
      
      Change-Id: I8a54de76d486ad37aabce76474889da2768b14c1
      5290eeab
  22. 10 Jul, 2013 1 commit
    • Deb Mukherjee's avatar
      Prunes out full-rd computation based on modeled rd · 53ff43ad
      Deb Mukherjee authored
      Adds a speed feature to eliminate full-rd computation if the modeled
      rd or rd based on a different parameter in the same mode is already
      a lot larger than the best rd yet.
      
      Specifically, only search the sharp and smooth filters if the modeled
      rd cost based on the  regular filter is within a certain factor of the
      best rd cost so far. Also, skip full-rd computation of non splitmv
      inter modes if the modeled rd cost based on pred error is within the
      same factor of the best rd cost so far.
      
      Also adds some enhancements in the rd search for splitmv mode to
      speed things up by early breakouts. Negligible impact on performance.
      
      Resuts on derfraw300:
      psnr:    -0.013% with the splitmv enhancements, -0.24% with the rd
               breakout feature on.
      speedup: 6% with splitmv enhancements, 20% with also residual breakout
               (tested on football sequence at 600 Kbps)
      
      Change-Id: I37abc308ea9f110c1679ce649b6a7e73ab1ad5fc
      53ff43ad