1. 14 Aug, 2014 1 commit
    • Set max_intra_bsize to 32x32 · 4d6d0613
      Yaowu Xu authored
      Applies at --good and speed 3 or above for resolutions below 720p.
      This disables the tests for 64x64 intra prediction modes. Encoding
      time is reduced by about 1%.
      
      Change-Id: Ib396e3d1417fece416e3f0fee929b128acbb130f
  2. 13 Aug, 2014 2 commits
    • Allow full coeff probability model and cost update · ccef8842
      Jingning Han authored
      This commit moves the simplified coefficient probability model
      and costing update to speed 4, and turns on chessboard pattern
      mode search for sub 720p sequences. The overall coding performance
      of speed 3 is improved:
      derf  0.889%
      stdhd 1.744%
      
      The speed 3 runtime for the test sequences is improved:
      bus cif at 1000 kbps 9823 ms -> 9642 ms
      pedestrian 1080p at 2000 kbps 189559 ms -> 183284 ms
      
      Change-Id: Iecbc7496a68f31fd49fb09f8dfd97c028d675a5d
    • Enable motion field based mode search skip · 0daadeb6
      Jingning Han authored
      This commit allows the encoder to check the above and left neighbor
      blocks' reference frames and motion vectors. If they are all
      consistent, skip checking the NEARMV and ZEROMV modes. This is
      enabled in speed 3. The encoding runtime is reduced:
      
      pedestrian area 1080p at 2000 kbps,
      from  74773 b/f, 41.101 dB, 198064 ms
      to    74795 b/f, 41.099 dB, 193078 ms
      
      park joy 1080p at 15000 kbps,
      from 290727 b/f, 30.640 dB, 609113 ms
      to   290558 b/f, 30.630 dB, 592815 ms
      
      Overall compression performance of speed 3 is changed
      derf  -0.171%
      stdhd -0.168%
      
      Change-Id: I8d47dd543a5f90d7a1c583f74035b926b6704b95
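The neighbor check described in this commit message can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the `BlockMotion` struct, the `InterMode` enum, and both function names are hypothetical and are not libvpx's actual types or API, and the real encoder's consistency test may be stricter or looser than exact equality.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified motion info for one block (not libvpx's MODE_INFO). */
typedef struct {
  int8_t ref_frame; /* > 0: inter reference index; <= 0: intra or unavailable */
  int16_t mv_row;
  int16_t mv_col;
} BlockMotion;

/* Illustrative mode list; the names mirror the modes in the commit message. */
typedef enum { MODE_NEARESTMV, MODE_NEARMV, MODE_ZEROMV, MODE_NEWMV } InterMode;

/* True when both neighbors exist, are inter coded, and share the same
 * reference frame and motion vector (assumed meaning of "consistent"). */
static bool neighbors_consistent(const BlockMotion *above,
                                 const BlockMotion *left) {
  if (above == NULL || left == NULL) return false;
  if (above->ref_frame <= 0 || left->ref_frame <= 0) return false;
  return above->ref_frame == left->ref_frame &&
         above->mv_row == left->mv_row && above->mv_col == left->mv_col;
}

/* Only NEARMV and ZEROMV are candidates for skipping; every other mode is
 * always evaluated. */
static bool should_skip_mode(InterMode mode, const BlockMotion *above,
                             const BlockMotion *left) {
  if (mode != MODE_NEARMV && mode != MODE_ZEROMV) return false;
  return neighbors_consistent(above, left);
}
```

A mode search loop would call `should_skip_mode()` once per candidate mode and continue to the next candidate when it returns true.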
  3. 12 Aug, 2014 1 commit
    • intra blocks disallowed inadvertently · 5c55202c
      Jim Bankoski authored
      At speed 6 the smallest partition size was 16x16 and the biggest
      intra block was 8x8, which effectively disallowed all intra blocks
      and produced ugly artifacts when new content was revealed.
      
      Change-Id: I364042d4c64e09be0666ade64aac94d0a1b586cf
  4. 08 Aug, 2014 1 commit
    • Moving pass from VP9_COMP to VP9EncoderConfig. · 91c2f1e4
      Dmitry Kovalev authored
      We had a very complicated way to initialize cpi->pass from
      cfg->g_pass:
      switch (cfg->g_pass) {
        case VPX_RC_ONE_PASS:
          oxcf->mode = ONE_PASS_GOOD;
          break;
        case VPX_RC_FIRST_PASS:
          oxcf->mode = TWO_PASS_FIRST;
          break;
        case VPX_RC_LAST_PASS:
          oxcf->mode = TWO_PASS_SECOND_BEST;
          break;
      }
      
      cpi->pass = get_pass(oxcf->mode).
      
      Now pass is moved to VP9EncoderConfig and initialization is simple:
      switch (cfg->g_pass) {
        case VPX_RC_ONE_PASS:
          oxcf->pass = 0;
          break;
        case VPX_RC_FIRST_PASS:
          oxcf->pass = 1;
          break;
        case VPX_RC_LAST_PASS:
          oxcf->pass = 2;
          break;
      }
      
      Change-Id: I8f582203a4575f5e39b071598484a8ad2b72e0d9
  5. 05 Aug, 2014 1 commit
  6. 30 Jul, 2014 1 commit
    • Chessboard pattern partition search · ca2dcb7f
      Jingning Han authored
      This commit enables a chessboard pattern constrained partition
      search for 720p and above resolutions. The scheme applies a stricter
      partition search to alternate blocks, based on the partition range
      of each block's above/left neighboring blocks and of the
      collocated blocks in the previous frame. It is currently turned
      on at the 16x16 block size level. The chessboard pattern is flipped
      on every coded frame.
      
      The speed 3 runtime is reduced:
      park_joy_1080p, 652832 ms -> 607738 ms (7% speed-up)
      pedestrian_area_1080p, 215998 ms -> 200589 ms (8% speed-up)
      
      The compression performance is changed:
      hd     -0.223%
      stdhd  -0.295%
      
      Change-Id: I2d4d123ae89f7171562f618febb4d81789575b19
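One way to picture the chessboard selection with per-frame flipping is a parity test on the block position plus the frame index. The sketch below is an assumption for illustration only: the function name, its parameters, and the choice of which parity receives the constrained search are hypothetical and are not taken from the libvpx source.

```c
#include <assert.h>
#include <stdbool.h>

/* block_row / block_col: block position in units of 16x16 blocks.
 * frame_index: running count of coded frames.
 * Returns true for the blocks that would get the stricter (constrained)
 * partition search. Adding frame_index flips the pattern every frame. */
static bool use_constrained_search(int block_row, int block_col,
                                   int frame_index) {
  assert(block_row >= 0 && block_col >= 0 && frame_index >= 0);
  return ((block_row + block_col + frame_index) & 1) != 0;
}
```

With this rule, horizontally and vertically adjacent blocks always fall in opposite groups, and a block that was constrained in one frame is searched fully in the next.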
  7. 22 Jul, 2014 1 commit
    • Enable chessboard inter prediction filter type search · 54ad0958
      Jingning Han authored
      This commit enables a chessboard pattern prediction filter type
      search scheme for rate-distortion optimization speed-up. For the
      inferred motion vector modes, the encoder can re-use its above/left
      neighbor blocks' prediction filter type and skip a full test on
      all possible filter types. This shortcut is switched on and off for
      alternate blocks in a chessboard pattern.
      
      It is turned on in speed 3. For the test clip pedestrian 1080p, the
      runtime is reduced from 231500 ms to 221700 ms. The compression
      performance is changed:
      derf:  -0.147%
      yt:    -0.134%
      hd:    -0.079%
      stdhd: -0.220%
      
      Change-Id: I1912f278e7576c2dc632688e3ad7a257410c605a
  8. 21 Jul, 2014 1 commit
    • Turn on adaptive pred filter scheme for sub8x8 below 720p · ffd948bb
      Jingning Han authored
      For sequences of resolution below 720p, the encoder will check
      intra prediction modes and inter prediction modes from LAST_FRAME.
      This commit turns on adaptive prediction filter scheme for sub8x8
      blocks, where inter prediction modes are enabled. For the test
      sequence bus at CIF, the speed 2 runtime goes down from 17879 ms
      to 16783 ms, i.e., a 6% speed-up. The compression performance on the
      derf set changes by -0.128%.
      
      Change-Id: I01d5321a5ceab4e0666ac5be56c52d896c7a8d45
  9. 16 Jul, 2014 1 commit
  10. 15 Jul, 2014 1 commit
    • Added a rt speed 12 · faa686bb
      Yaowu Xu authored
      We target this speed to achieve similar encoding speed and better
      compression than vp8 rt mode with cpu-used at -12.
      
      Change-Id: Ic1bb4371c81a17ea80e83459c1cbf4c09a3498e8
  11. 11 Jul, 2014 1 commit
  12. 09 Jul, 2014 2 commits
    • Remove repetitive code in mcomp.c · a581da21
      Yunqing Wang authored
      Deleted vp9_find_best_sub_pixel_comp_tree() and merged its
      functionality into vp9_find_best_sub_pixel_tree().
      
      Change-Id: Ifb25763c8b19822df5537cc1daa76ce88dc3b056
    • Adjust full-pixel search method in real-time mode · 9bd3be69
      Yunqing Wang authored
      Use FAST_HEX in speed 5 and 6, which covers more points than
      FAST_DIAMOND and improves motion search quality.
      
      At speed 6, RTC set borg tests showed slight quality gain (psnr
      gain: 0.143%, ssim gain: 0.226%). No noticeable encoding speed
      change.
      
      Change-Id: Ifa62875d9a52ee382ec494f271382bb77d8c67bf
  13. 08 Jul, 2014 1 commit
    • Re-design quantization process for 32x32 transform block · 9ad1b9fc
      Jingning Han authored
      This commit enables a new quantization process for 32x32 2D-DCT
      transform coefficient blocks. It improves the compression
      performance of speed 5 by 1.4%. The overall compression gain of
      speed 5 due to the new quantization scheme is 4.7%. It also includes
      the SSSE3 implementation of the 32x32 quantization process.
      
      Change-Id: I0855b124fd6462418683f783f5bcb44255c9993b
  14. 07 Jul, 2014 1 commit
    • Cleanup motion search speed features. · f60a1178
      Alex Converse authored
      * Replace max_step_search_steps with constant MAX_MVSEARCH_STEPS
      * Fold (reduce_first_step_size + speed > 5) into reduce_first_step_size
        replacing uses of reduce_first_step_size that don't add the speed
        check with zero.
      
      Change-Id: Iae46395dbf3eaca138bf4d18b838a9e364b5a198
  15. 02 Jul, 2014 2 commits
    • Added a speed feature controlling a motion search parameter · 92a6db79
      Yaowu Xu authored
      This commit added a speed feature to control the step_param used in
      full pixel motion search. The intention is to reduce the search
      steps for high speed real time coding.
      
      Change-Id: I21d2f0105c2b647783a6688615da7fcf2b6d670b
    • Re-design quantization process · 9ac2f663
      Jingning Han authored
      This commit re-designs the quantization process for transform
      coefficient blocks of size 4x4 to 16x16. It improves compression
      performance for speed 7 by 3.85%. The SSSE3 version for the
      new quantization process is included.
      
      The average runtime of the 8x8 block quantization is reduced
      from 285 cycles to 255 cycles, i.e., over 10% faster.
      
      Change-Id: I61278aa02efc70599b962d3314671db5b0446a50
  16. 01 Jul, 2014 1 commit
    • Elevate NEWMV mode checking threshold in real time · f31ff029
      Yunqing Wang authored
      The current threshold is rather low, and in many cases the NEWMV
      mode is checked but not picked as the best mode. This patch
      adds a speed feature that raises the NEWMV threshold, so that
      fewer partition mode checks go on to evaluate NEWMV. This feature
      is enabled for speed 6 and 7.
      
      Rtc set borg tests showed:
      1. Speed 6, overall psnr: -0.088%, ssim: -1.339%;
         Average speedup on rtc set is 11.1%.
      2. Speed 7, overall psnr: -0.505%, ssim: -2.320%
         Average speedup on rtc set is 12.9%.
      
      Change-Id: I953b849eeb6e0d5a1f13eacba30c14204472c5be
  17. 30 Jun, 2014 2 commits
    • Enable encode breakout in real time · dee5782f
      Yunqing Wang authored
      For real time speed 7, once encode breakout is on (i.e. the encoding
      setting --static-thresh=1), a proper encode breakout threshold
      is set to speed up the encoder.
      
      With --static-thresh=1, the RTC set borg test showed a slight overall
      psnr loss of 0.162%, but an ssim gain of 0.287%. The average speedup
      on the RTC set is 6%, and for some clips the speedup can be 10+%.
      
      Change-Id: Id522d9ce779ff7c699936d13d0c47083de4afb85
    • Decide the partitioning threshold from the variance histogram · 9d41313e
      Yunqing Wang authored
      Before encoding a frame, calculate and store each 16x16 block's
      variance of source difference between last and current frame.
      Find partitioning threshold T for the frame from its variance
      histogram, and then use T to make partition decisions.
      
      Compared with fixed 16x16 partitioning, the rtc set test showed an
      overall psnr gain of 3.242% and an ssim gain of 3.751%. The best
      psnr gain is 8.653%.
      
      The overall encoding speed didn't change much. It got faster for
      some clips (for example, a 12% speedup for vidyo1) and a little
      slower for others.
      
      Also, a minor modification was made in datarate unit test.
      
      Change-Id: Ie290743aa3814e83607b93831b667a2a49d0932c
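The commit message does not say how T is derived from the histogram, so the following C sketch simply assumes a percentile rule: T is the upper edge of the first histogram bin at which a chosen percentage of the 16x16 blocks has been covered. The bin count, bin width, function name, and the percentile rule itself are all assumptions made for illustration and are not taken from libvpx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BINS 16   /* assumed histogram resolution */
#define BIN_WIDTH 64u /* assumed variance range covered by one bin */

/* var: per-16x16-block variance of the source difference between the last
 *      and current frame; n: number of blocks; percent: share of blocks
 *      that should fall at or below the returned threshold (0..100). */
static uint32_t threshold_from_variances(const uint32_t *var, size_t n,
                                         unsigned percent) {
  size_t hist[NUM_BINS] = { 0 };
  size_t i, target, cum = 0;
  assert(var != NULL && n > 0 && percent <= 100);

  /* Build the histogram; values beyond the last bin are clamped into it. */
  for (i = 0; i < n; ++i) {
    size_t bin = var[i] / BIN_WIDTH;
    if (bin >= NUM_BINS) bin = NUM_BINS - 1;
    ++hist[bin];
  }

  /* Walk the cumulative histogram until the target block count is reached. */
  target = (n * percent + 99) / 100; /* ceiling of n * percent / 100 */
  for (i = 0; i < NUM_BINS; ++i) {
    cum += hist[i];
    if (cum >= target) return (uint32_t)((i + 1) * BIN_WIDTH);
  }
  return NUM_BINS * BIN_WIDTH;
}
```

A partition decision could then compare each block's variance against the returned T, for example keeping a large partition where the variance is below T and splitting otherwise; that decision rule is likewise an assumption here.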
  18. 27 Jun, 2014 2 commits
  19. 26 Jun, 2014 2 commits
    • Adaptive txfm size selection depending on residual sse/variance · 5a3e3c6d
      Jingning Han authored
      This commit enables an adaptive transform size selection method
      for speed -6. It uses the largest transform size when the sse is more
      than 4 times the variance, i.e., most energy is compacted in the
      DC coefficient. Otherwise, it uses the default TX_8X8. It improves
      the compression efficiency for the rtc set at speed -6 by 0.8%, with
      no speed change observed.
      
      Change-Id: Ie6ed1e728ff7bf88ebe940a60811361cdd19969c
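The selection rule in this commit message can be written as a single comparison. The C sketch below is a minimal illustration, not the libvpx implementation: the `TxSize` enum and `pick_tx_size()` are hypothetical names, and clamping the fallback to the largest allowed size (for blocks smaller than 8x8) is an added assumption.

```c
#include <stdint.h>

/* Illustrative transform size list, ordered from smallest to largest. */
typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32 } TxSize;

/* sse, var: residual sum of squared errors and variance for the block.
 * largest: the largest transform size the block allows.
 * When sse > 4 * var, most of the energy sits in the DC coefficient, so the
 * largest transform is used; otherwise fall back to TX_8X8. */
static TxSize pick_tx_size(uint32_t sse, uint32_t var, TxSize largest) {
  /* Assumption: never return a transform larger than the block allows. */
  const TxSize fallback = largest < TX_8X8 ? largest : TX_8X8;
  /* Widen to 64 bits so 4 * var cannot overflow. */
  return ((uint64_t)sse > 4 * (uint64_t)var) ? largest : fallback;
}
```

For example, a residual with sse 500 and variance 100 on a block that allows 32x32 transforms would get `TX_32X32`, while sse 400 with the same variance would stay at `TX_8X8`.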
    • Make non-RD intra mode search txfm size dependent · 2aa50eaf
      Jingning Han authored
      This commit fixes a potential issue in the non-RD mode decision
      flow, which checked only part of the block to estimate the cost.
      The cause was the use of a fixed transform size in place of the
      largest transform block size. This commit enables per transform
      block cost estimation of the intra prediction mode in the non-RD
      mode decision.
      
      Change-Id: I14ff92065e193e3e731c2bbf7ec89db676f1e132
  20. 24 Jun, 2014 2 commits
    • Reuse inter prediction result in real-time speed 6 · 0aae1000
      Yunqing Wang authored
      In real-time speed 6, no partition search is done. The inter
      prediction results obtained during mode picking can be reused in the
      subsequent encoding process. A speed feature reuse_inter_pred_sby
      is added to enable the reuse only in speed 6.
      
      This patch doesn't change encoding result. RTC set tests showed
      that the encoding speed gain is 2% - 5%.
      
      Change-Id: I3884780f64ef95dd8be10562926542528713b92c
    • Fix some bugs in multi-arf · 8160a26f
      Paul Wilkins authored
      Fix some bugs relating to the use of buffers
      in the overlay frames.
      
      Fix a bug where a mid-sequence overlay was
      propagating large partition and transform sizes into
      the subsequent frame because of:
        sf->last_partitioning_redo_frequency > 1 and
        sf->tx_size_search_method == USE_LARGESTALL
      
      Change-Id: Ibf9ef39a5a5150f8cbdd2c9275abb0316c67873a
  21. 19 Jun, 2014 1 commit
    • Allow key frame more flexibility in mode search · c99a8fd7
      Jingning Han authored
      This commit allows the key frame to search through more prediction
      modes and more flexible block sizes. No speed change observed. The
      coding performance for rtc set is improved by 1.7% for speed -5 and
      3.0% for speed -6.
      
      Change-Id: Ifd1bc28558017851b210b4004f2d80838938bcc5
  22. 18 Jun, 2014 1 commit
    • Modify non-rd intra mode checking · 55834d42
      Yunqing Wang authored
      Speed 6 uses small tx size, namely 8x8. max_intra_bsize needs to
      be modified accordingly to ensure valid intra mode checking.
      Borg test on RTC set showed an overall PSNR gain of 0.335% in speed
      -6.
      
      This also changes speed -5 encoding by allowing DC_PRED checking
      for block32x32. Borg test on RTC set showed a slight PSNR gain of
      0.145%, and no noticeable speed change.
      
      Change-Id: I1502978d8fbe265b3bb235db0f9c35ba0703cd45
  23. 12 Jun, 2014 1 commit
    • Adding MV_SPEED_FEATURES struct. · 4ff1a614
      Dmitry Kovalev authored
      Moving all motion vector related speed parameters from SPEED_FEATURES to
      MV_SPEED_FEATURES.
      
      Change-Id: I3e9af0039c7162f8671878c5920bce3cb256a84e
  24. 09 Jun, 2014 1 commit
    • Use small transform size in non-rd real-time mode · b04d7668
      Yunqing Wang authored
      In non-rd real-time mode, choosing a smaller transform size in
      encoding gives better video quality and a good speed gain compared
      with choosing a larger transform size. This patch sets the tx size
      search method to ALLOW_8X8, which is better than using 4x4 or
      sizes larger than 8x8.
      
      Borg tests on the rtc set at speed 6 showed a significant gain in quality.
      PSNR gain: 11.034% and SSIM gain: 15.466%.
      
      The speed gain is 5% - 12% for <720p clips, and 2% - 7% for
      720p clips.
      
      Change-Id: If4dc74ed2df359346b059f47fb73b4a0193ec548
  25. 06 Jun, 2014 1 commit
  26. 03 Jun, 2014 1 commit
  27. 29 May, 2014 3 commits
  28. 22 May, 2014 1 commit
  29. 21 May, 2014 1 commit
    • Enable various thresholds of motion detection · 3bda7ec1
      Yaowu Xu authored
      This commit enables the encoder to adjust the motion detection
      speed threshold based on picture size. In addition, cpu-used 1 now
      does a partition search every other frame instead of every third
      frame for low resolution inputs.
      
      The change has no quality/speed impact for 720p and above. Tests
      showed that the change increases encoding time by 3% to 6% for
      cpu-used 2 encoding of 360p sequences. It also gives a compression
      gain of about 0.3%.
      
      For cpu-used 2, the change resolved some very disturbing visual
      artifacts in certain sequences when large block partitionings and
      transforms are used as a result of copying the partition from a
      previous frame.
      
      Change-Id: Ic7fd22508cdb811d4ca935655adbf20109286cfa
  30. 19 May, 2014 1 commit
    • Add static-threshold skipping in non-rd mode · b91b146d
      Yunqing Wang authored
      Added a skipping test in non-rd inter-mode. After the interpolation
      prediction step, the residuals are tested to see if they will be
      quantized to 0, based on modeling between the spatial domain and
      the frequency domain.
      
      With static-thresh set to 800 for >=720p and 300 for <720p, rtc set
      tests showed:
      1. Speed 5, psnr: -0.514%; ssim: -1.748%;
         speedup on related clips: 5% - 11%
      2. Speed 6, psnr: -0.628%; ssim: -1.637%;
         speedup on related clips: 4% - 9%
      
      Change-Id: I62fbf26bc043ecd2b584f255f1a4ee5ab52bfcf3
  31. 23 Apr, 2014 1 commit
    • Chessboard pattern prediction filter type search in non-RD coding · 8969f7c8
      Jingning Han authored
      This commit introduces a chessboard pattern search for the prediction
      filter type. It runs an extensive search in alternate blocks and
      lets the remaining blocks reuse the coding decisions of their nearby
      neighbors.
      
      For pedestrian 1080p at 4000 kbps, the runtime of speed -5 goes down
      from 43990 ms to 42200 ms. The overall compression performance for
      RTC set is changed by -1.37%.
      
      Change-Id: Icfe220c49451cda796f0ca91d935c9ed01e56c9d