1. 08 Feb, 2016 1 commit
  2. 03 Feb, 2016 1 commit
  3. 27 Jan, 2016 1 commit
  4. 20 Jan, 2016 1 commit
  5. 15 Jan, 2016 1 commit
    • Tie the bit cost scale to a define. · 269428e3
      Alex Converse authored
      This is a pure refactor in preparation for potentially raising the
      bit-cost resolution.
      
      Verified at good speed 0 and rt speed -6.
      
      Change-Id: I5347e6e8c28a9ad9dd0aae1d76a3d0f3c2335bb9
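A minimal sketch of what tying the scale to a define can look like. The names BIT_COST_SHIFT, BIT_COST_SCALE, bits_to_cost and cost_to_whole_bits, and the value 8, are illustrative assumptions and are not taken from the commit. Every conversion goes through one macro, so raising the resolution later would mean changing a single line.

```c
#include <assert.h>

/* Hypothetical names: costs are stored in units of 1/BIT_COST_SCALE bit. */
#define BIT_COST_SHIFT 8
#define BIT_COST_SCALE (1 << BIT_COST_SHIFT)

/* Cost of n whole bits, expressed in scaled units. */
static int bits_to_cost(int n_bits) {
  assert(n_bits >= 0);
  return n_bits * BIT_COST_SCALE;
}

/* Round a scaled cost to the nearest whole number of bits. */
static int cost_to_whole_bits(int cost) {
  assert(cost >= 0);
  return (cost + BIT_COST_SCALE / 2) >> BIT_COST_SHIFT;
}
```

Code that previously hard-coded a shift or a multiplier would call helpers like these instead, so a change to the resolution does not alter any call site.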
  6. 13 Jan, 2016 1 commit
  7. 07 Jan, 2016 2 commits
    • Enable encoder to avoid 8x4 or 4x8 partitions · 9cac17d1
      Yaowu Xu authored
      This commit enables the encoder to avoid 8x4 and 4x8 partitions for
      scaled reference frames when libvpx is configured and built with the
      --enable-better-hw-compatibility option.
      
      Change-Id: I02ad65c386f5855f4325d72570c49164ed52f413
    • Fix a typo · 650a2d76
      Yaowu Xu authored
      Change-Id: I12de2dd5e5f375551804166188d76a9ad8067b41
  8. 11 Dec, 2015 1 commit
    • Fix sub8x8 motion search on scaled reference frame · 27bbfd65
      Jingning Han authored
      This commit makes the sub8x8 block rate-distortion optimization
      scheme use precise motion-compensated prediction to compute the rd
      cost. It fixes a potential buffer overflow issue related to sub8x8
      motion search on a scaled reference frame.
      
      Change-Id: I4274992ef4f54eaacfde60db045e269c13aaa2de
  9. 19 Nov, 2015 1 commit
  10. 13 Nov, 2015 1 commit
    • Changes to exhaustive motion search. · 0149fb3d
      paulwilkins authored
      This change alters the nature and use of exhaustive motion search.
      
      Firstly, any exhaustive search is preceded by a normal step search.
      The exhaustive search is only carried out if the distortion resulting
      from the step search is above a threshold value.
      
      Secondly, the simple +/- 64 exhaustive search is replaced by a
      multi-stage mesh-based search, where each stage has a range
      and a step/interval size. Subsequent stages use the best position from
      the previous stage as the center of the search but use a reduced range
      and interval size.
      
      For example:
        stage 1: Range +/- 64 interval 4
        stage 2: Range +/- 32 interval 2
        stage 3: Range +/- 15 interval 1
      
      This process, especially when it follows on from a normal step
      search, has shown itself to be almost as effective as a full range
      exhaustive search with step 1 but greatly lowers the computational
      complexity such that it can be used in some cases for speeds 0-2.
      
      This patch also removes a double exhaustive search for sub 8x8 blocks,
      which also contained a bug (the two searches used different distortion
      metrics).
      
      At best quality on my test animation sequence, this patch has almost
      no impact on quality but improves encode speed by more than 5x.
      
      Restricted use at good quality speeds 0-2 yields significant quality gains
      of 0.2 - 0.5 dB on the animation test, with only a small impact on encode
      speed. On most clips, though, the quality gain and speed impact are small.
      
      Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa
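The multi-stage mesh search described above can be sketched as follows. This is an illustration under stated assumptions, not the encoder's implementation: MotionVector, MeshStage, CostFn, mesh_search, toy_cost and mesh_search_demo are hypothetical names, the toy cost is a squared distance to a known target instead of a real distortion metric, and motion vector limit clamping and the preceding step search are omitted.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int row, col; } MotionVector;      /* hypothetical type */
typedef struct { int range, interval; } MeshStage;  /* one search stage */
typedef unsigned int (*CostFn)(MotionVector mv, void *ctx);

/* Each stage scans a square grid of +/- range around the best position
 * found so far, sampling every `interval` positions. */
static MotionVector mesh_search(MotionVector start, const MeshStage *stages,
                                size_t num_stages, CostFn cost, void *ctx) {
  MotionVector best = start;
  unsigned int best_cost = cost(start, ctx);
  for (size_t s = 0; s < num_stages; ++s) {
    const MotionVector center = best;  /* best position of previous stage */
    const int range = stages[s].range;
    const int step = stages[s].interval;
    assert(range >= 0 && step > 0);
    for (int r = -range; r <= range; r += step) {
      for (int c = -range; c <= range; c += step) {
        const MotionVector mv = { center.row + r, center.col + c };
        const unsigned int this_cost = cost(mv, ctx);
        if (this_cost < best_cost) {
          best_cost = this_cost;
          best = mv;
        }
      }
    }
  }
  return best;
}

/* Toy cost for demonstration only: squared distance to a target vector. */
static unsigned int toy_cost(MotionVector mv, void *ctx) {
  const MotionVector *target = (const MotionVector *)ctx;
  const int dr = mv.row - target->row;
  const int dc = mv.col - target->col;
  return (unsigned int)(dr * dr + dc * dc);
}

/* Runs the three example stages from the commit message, starting at (0, 0). */
static MotionVector mesh_search_demo(int target_row, int target_col) {
  static const MeshStage stages[] = { { 64, 4 }, { 32, 2 }, { 15, 1 } };
  MotionVector target = { target_row, target_col };
  const MotionVector origin = { 0, 0 };
  return mesh_search(origin, stages, sizeof(stages) / sizeof(stages[0]),
                     toy_cost, &target);
}
```

With these three stages the sketch evaluates 33 x 33 + 33 x 33 + 31 x 31 = 3139 positions, against 129 x 129 = 16641 for a +/- 64 search at interval 1.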
  11. 06 Nov, 2015 1 commit
  12. 21 Oct, 2015 1 commit
    • Optimize vp9_highbd_block_error_8bit assembly. · aa8f8522
      Geza Lore authored
      A new version of vp9_highbd_block_error_8bit is now available which is
      optimized with AVX assembly. AVX itself does not buy us too much, but
      the non-destructive 3-operand encoding of the 128-bit SSEn integer
      instructions helps to eliminate move instructions. The Sandy Bridge
      micro-architecture cannot eliminate move instructions in the processor
      front end, so AVX will help on these machines.
      
      Two further optimizations are applied:
      
      1. The common case of computing block error on 4x4 blocks is optimized
      as a special case.
      2. All arithmetic is speculatively done on 32 bits only. At the end of
      the loop, the code detects if overflow might have happened and if so,
      the whole computation is re-executed using higher precision arithmetic.
      This case however is extremely rare in real use, so we can achieve a
      large net gain here.
      
      The optimizations rely on the fact that the coefficients are in the
      range [-(2^15-1), 2^15-1], and that the quantized coefficients always
      have the same sign as the input coefficients (in the worst case they are
      0). These are the same assumptions that the old SSE2 assembly code for
      the non high bitdepth configuration relied on. The unit tests have been
      updated to take this constraint into consideration when generating test
      input data.
      
      Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7
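The speculative 32-bit strategy from point 2 can be sketched in portable C. The function name block_error_speculative is hypothetical, and the overflow check used here (a conservative bound built from the OR of all magnitudes) is an assumption chosen for clarity; the commit message does not say how the assembly detects possible overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns the sum of squared (coeff - dqcoeff); *ssz gets the sum of
 * squared coeff. Accumulates in 32 bits first and redoes the work in
 * 64 bits only if the 32-bit sums might have wrapped. */
static int64_t block_error_speculative(const int16_t *coeff,
                                       const int16_t *dqcoeff, int n,
                                       int64_t *ssz) {
  uint32_t err32 = 0, ssz32 = 0, mag_bits = 0;
  assert(n >= 0);
  for (int i = 0; i < n; ++i) {
    const int32_t c = coeff[i];
    const int32_t d = c - dqcoeff[i];
    /* Unsigned arithmetic wraps instead of invoking undefined behaviour. */
    err32 += (uint32_t)d * (uint32_t)d;
    ssz32 += (uint32_t)c * (uint32_t)c;
    /* The OR of all magnitudes is an upper bound on each magnitude. */
    mag_bits |= (uint32_t)abs(c) | (uint32_t)abs(d);
  }
  /* If n * bound^2 fits in 32 bits, no accumulator can have wrapped. */
  if ((uint64_t)n * mag_bits * mag_bits <= UINT32_MAX) {
    *ssz = ssz32;
    return err32;
  }
  /* Rare slow path: repeat the whole computation at higher precision. */
  int64_t err64 = 0, ssz64 = 0;
  for (int i = 0; i < n; ++i) {
    const int64_t c = coeff[i];
    const int64_t d = c - dqcoeff[i];
    err64 += d * d;
    ssz64 += c * c;
  }
  *ssz = ssz64;
  return err64;
}
```

The bound is deliberately pessimistic: it sometimes takes the slow path when no wrap occurred, but it never accepts a wrapped 32-bit result.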
  13. 08 Oct, 2015 1 commit
    • Optimization of 8bit block error for high bitdepth · 0134764f
      Geza Lore authored
      If the high bit depth configuration is enabled but the encoder is
      running in profile 0, the code now falls back on the optimized SSE2
      assembly to compute block errors, as it does when high bit depth is
      not enabled.
      
      Change-Id: I471d1494e541de61a4008f852dbc0d548856484f
  14. 30 Sep, 2015 1 commit
  15. 23 Sep, 2015 1 commit
  16. 09 Sep, 2015 2 commits
    • Fix ioc warnings related to sub8x8 reference frame · b6d71a30
      Jingning Han authored
      Access the scaled reference frame in the sub8x8 rate-distortion
      optimization loop only when the current test mode is an inter mode.
      This prevents an ioc warning triggered by passing the intra_frame index
      to fetch a scaled reference frame.
      
      Change-Id: I6177ecc946651dd86c7ce362e3f65c4074444604
    • Enable sub8x8 inter mode with scaled ref frame in RD optimization · 50461166
      Jingning Han authored
      This commit allows the encoder to include sub8x8 inter mode with a
      scaled reference frame in the rate-distortion optimization scheme.
      
      Change-Id: Ibbe9678801592826ef22566566dcdeeb008350d5
  17. 31 Aug, 2015 1 commit
  18. 27 Aug, 2015 1 commit
  19. 25 Aug, 2015 1 commit
  20. 12 Aug, 2015 1 commit
  21. 10 Aug, 2015 1 commit
  22. 07 Aug, 2015 1 commit
  23. 06 Aug, 2015 1 commit
  24. 31 Jul, 2015 6 commits
    • Compute skippable inside the block_rd_txfm loop. · ab20c98e
      Alex Converse authored
      Change-Id: Iaa43aeeb7a2074495e00cdb83bb551c3f13d3ed2
    • Simplify model_rd_for_sb HBD ifdefs · c62228f2
      Alex Converse authored
      Change-Id: Ic1ce346a053800ae3b2d77178f46e6a388357f6d
    • Simplify dist_block HBD ifdefs · da9c73c2
      Alex Converse authored
      Change-Id: Ic0b4e92cbaf813bcca8a8e9052c936c2e025e114
    • Give skip_txfm constants names. · 4ac5058a
      Alex Converse authored
      This uses defines instead of an enum to keep byte packing.
      
      Change-Id: I3abb07c8bfe377e19be4531b624af7b7b4207792
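A short sketch of why defines can preserve byte packing where an enum might not. The constant names, struct names and array length below are assumptions for illustration. An enum-typed field usually takes the size of an int, while a uint8_t field holding a define value stays at one byte.

```c
#include <assert.h>
#include <stdint.h>

/* Named constants as defines: the storage type is chosen by the field. */
#define SKIP_TXFM_NONE 0
#define SKIP_TXFM_AC_DC 1
#define SKIP_TXFM_AC_ONLY 2

/* The same values as an enum, for comparison only. */
typedef enum { E_NONE, E_AC_DC, E_AC_ONLY } SkipTxfmEnum;

typedef struct { uint8_t skip_txfm[4]; } PackedFlags;     /* 1 byte each */
typedef struct { SkipTxfmEnum skip_txfm[4]; } EnumFlags;  /* often 4 bytes each */

/* Stores a named constant into the byte-sized slot. */
static void set_skip_txfm(PackedFlags *flags, int index, uint8_t value) {
  assert(index >= 0 && index < 4);
  assert(value <= SKIP_TXFM_AC_ONLY);
  flags->skip_txfm[index] = value;
}
```

The size of an enum type is implementation-defined, so the exact saving depends on the compiler and its flags.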
    • Short circuit rate_block in block_rd_txfm. · 73422d3b
      Alex Converse authored
      Don't run rate_block (cost_coeffs) if distortion alone is enough to
      surpass best_rd.
      
      This decreases 2nd pass runtime on HD at speed 2 by about 2%. There is
      zero effect on output if tx_cache is removed.
      
      Change-Id: Ia3b1cc77bfbe6ee988c395fde06c0eb92940b784
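The short circuit can be illustrated with a small sketch. All names here (RateFn, rd_cost, eval_block, counting_rate) and the simplified cost formula are assumptions for illustration and do not reproduce the encoder's RD cost computation. Because rate is never negative, a cost computed with zero rate is a lower bound, so the expensive rate estimate can be skipped when that bound already exceeds best_rd.

```c
#include <assert.h>
#include <stdint.h>

typedef int (*RateFn)(void *ctx);  /* hypothetical expensive rate estimator */

/* Simplified Lagrangian cost for illustration: rate * rdmult + distortion. */
static int64_t rd_cost(int rdmult, int rate, int64_t dist) {
  return (int64_t)rate * rdmult + dist;
}

/* Returns 1 and writes *out_rd when the block is fully evaluated.
 * Returns 0 without calling rate_fn when distortion alone rules it out. */
static int eval_block(int64_t dist, int rdmult, int64_t best_rd,
                      RateFn rate_fn, void *ctx, int64_t *out_rd) {
  assert(rdmult >= 0 && dist >= 0);
  if (rd_cost(rdmult, 0, dist) > best_rd) return 0;  /* lower bound too high */
  *out_rd = rd_cost(rdmult, rate_fn(ctx), dist);
  return 1;
}

/* Toy estimator that counts its calls so the pruning can be observed. */
static int counting_rate(void *ctx) {
  int *calls = (int *)ctx;
  ++*calls;
  return 10;
}
```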
    • Remove tx cache and speed up tx size selection · 3b2e73b9
      Yunqing Wang authored
      1. The RD scores obtained during tx size selection were stored in the
      tx cache and used to help make the tx decision for the following frames.
      The VP9 encoder no longer used this. Restoring the related decision-making
      code from 1.5+ years ago showed no quality gain in borg tests, so this
      patch removes the cache to lower complexity.
      
      2. An optimization follows from the above refactoring. If the tx_mode
      is not TX_MODE_SELECT, we only need to test the chosen tx size instead
      of all possible tx sizes. This gave a 1.5% average speed gain at speed 2,
      and a 1% average speed gain at speed 3.
      
      Change-Id: Id8cd650e066a8cef33829d8c15388a8138adc78c
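The optimization in point 2 can be sketched as below. This is a simplified model under assumptions: the two-value TxMode enum (with the hypothetical TX_MODE_FIXED), the TxRdFn callback, choose_tx_size and the toy RD table are illustrative and do not mirror the encoder's data structures. Only TX_MODE_SELECT appears in the commit message.

```c
#include <assert.h>
#include <stdint.h>

enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32, TX_SIZES };
typedef enum { TX_MODE_FIXED, TX_MODE_SELECT } TxMode;  /* simplified */
typedef int64_t (*TxRdFn)(int tx_size, void *ctx);

/* Returns the tx size to use and writes its RD score to *best_rd. */
static int choose_tx_size(TxMode tx_mode, int chosen_tx_size, TxRdFn rd_fn,
                          void *ctx, int64_t *best_rd) {
  assert(chosen_tx_size >= TX_4X4 && chosen_tx_size < TX_SIZES);
  if (tx_mode != TX_MODE_SELECT) {
    /* The size is already decided: evaluate it once and stop. */
    *best_rd = rd_fn(chosen_tx_size, ctx);
    return chosen_tx_size;
  }
  int best_tx = TX_4X4;
  *best_rd = INT64_MAX;
  for (int tx = TX_4X4; tx < TX_SIZES; ++tx) {
    const int64_t rd = rd_fn(tx, ctx);
    if (rd < *best_rd) {
      *best_rd = rd;
      best_tx = tx;
    }
  }
  return best_tx;
}

/* Toy RD source: a lookup table that counts how often it is queried. */
typedef struct { int64_t rd[TX_SIZES]; int evaluations; } ToyRdTable;

static int64_t toy_rd(int tx_size, void *ctx) {
  ToyRdTable *table = (ToyRdTable *)ctx;
  ++table->evaluations;
  return table->rd[tx_size];
}
```

In this model the fixed path performs one RD evaluation instead of four. That difference is where a speed gain of this kind comes from.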
  25. 30 Jul, 2015 2 commits
  26. 28 Jul, 2015 1 commit
  27. 21 Jul, 2015 1 commit
    • vpx_dsp/bitreader.h: vp9_->vpx_ · bf82514b
      Yaowu Xu authored
      Replace vp9_ with vpx_ in names, as they are not codec specific.
      
      Change-Id: I2e583aa63dee769353ada4b42417aa15c4074ebb
  28. 20 Jul, 2015 1 commit
    • Refactor highbd forward transform use case · 389ed6da
      Jingning Han authored
      Separate the hybrid transform case from the 2D-DCT case. This will
      allow us to clear up the cross dependency between the C and SIMD
      implementations later.
      
      Change-Id: Iaa499e8b096850a1c5a0c50a3b6e63e15d0184bf
  29. 13 Jul, 2015 1 commit
    • Refactor intra block prediction function · 81452cf0
      Jingning Han authored
      This commit simplifies the intra block boundary condition logic.
      It removes the block index from the argument set.
      
      Change-Id: If00142512eb88992613d6609356dfd73ba390138
  30. 08 Jul, 2015 2 commits
    • Changes to use of rectangular partitions. · 8dd466ed
      paulwilkins authored
      Changes to allow more use of rectangular partitions at
      speeds 1 and 2 for content classed by the first pass as
      animation and for blocks near the active image edge.
      
      This has quite a big impact on quality for the animated
      test sequence but also hurts encode speed at speed 2.
      
      For other content types the impact on both speed and
      quality is small.
      
      Added some plumbing for detection of internal vertical
      image edges.
      
      Change-Id: I3fc48de2349f8cb87946caaf0b06dbb0ea261a9a
    • Change speed and rd features for formatting bars. · a126b6ce
      paulwilkins authored
      Change speed features / behavior for split mode when there
      is an internal active edge (e.g. formatting bars).
      
      Remove some threshold constraints in rd code near the active
      edge of the image.
      
      Add some plumbing for left and right active edge detection.
      
      Patch set 5: Limit rd pass through for sub 8x8 to internal active edges.
      This takes away any speed penalty for most clips but keeps the enhanced
      edge coding for the more critical case of internal image edges.
      
      Change-Id: If644e4762874de4fe9cbb0a66211953fa74c13a5
  31. 07 Jul, 2015 1 commit