1. 20 Jun, 2016 1 commit
    • Handle two identical states in the trellis chain · 5223a4b4
      Jingning Han authored
      When the next two states are identical, skip repeated cost table
      fetch and multiplication operations. This makes the trellis unit
      about 5% faster.
      
      Change-Id: I0dbf7ad0a5732044e4e45dd59e9431a251c678f2
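The shortcut described in this commit can be illustrated with a minimal sketch. It is not the libvpx code: `transition_rates`, `cost_table`, and `rdmult` are hypothetical names, and the trellis is reduced to a single step with two candidate next states.

```python
def transition_rates(next_states, cost_table, rdmult):
    """Return the weighted rate cost for each of the two candidate next states.

    next_states: pair of keys into cost_table (hypothetical state ids).
    cost_table:  mapping from state key to a rate cost.
    rdmult:      multiplier applied to the rate cost.
    """
    first, second = next_states
    rate_first = cost_table[first] * rdmult
    if second == first:
        # Identical states: reuse the result instead of repeating the
        # table fetch and the multiplication.
        rate_second = rate_first
    else:
        rate_second = cost_table[second] * rdmult
    return rate_first, rate_second
```

When both candidates are the same, only one table lookup and one multiplication are performed. This is the kind of saving the commit message attributes its speedup to.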
  2. 17 Jun, 2016 2 commits
  3. 16 Jun, 2016 2 commits
  4. 15 Jun, 2016 1 commit
    • Refactor trellis optimization process · e9c44a76
      Jingning Han authored
      This commit refactors the trellis coefficient optimization process.
      It saves multiplications used to generate the final dequantized
      coefficients. It also removes two memset operations on quantized
      and dequantized coefficient sets.
      
      On average, the trellis coefficient optimization runs over 10%
      faster.
      
      Change-Id: If3aa26d2a706c3012bf2b7ac059bf1825250e81f
  5. 14 Jun, 2016 1 commit
    • Rework transform quantization pipeline · 1faf2887
      Jingning Han authored
      This commit reworks the transform and quantization unit. It enables
      the use of adaptive quantization for intra modes. This further
      improves the compression performance:
      lowres 0.36%
      midres 0.79%
      hdres  0.73%
      
      The key frame coding performance is improved:
      lowres 1.7%
      midres 1.9%
      hdres  3.3%
      
      The overall coding gains are:
      lowres 1.1%
      midres 1.8%
      hdres  2.3%
      
      Change-Id: Iaec1a3a4c1d5eac883ab526ed076d957060479dd
  6. 13 Jun, 2016 1 commit
  7. 10 Jun, 2016 2 commits
    • Trellis based adaptive quantization · 25ca3229
      Jingning Han authored
      This commit combines the uniform quantizer with trellis-based
      coefficient level optimization. It improves the codebase compression
      performance:
      
      lowres 0.8%
      midres 1.0%
      hdres  1.6%
      
      Note that the current trellis optimization unit uses C code. This
      makes the overall quantization process slower. A number of
      optimizations will follow.
      
      Change-Id: Id441dd238e4844409d0f08f82604be777f3f5282
    • Move new quant experiment from nextgen · a21afd42
      Sarah Parker authored
      This experiment implements non-uniform quantization where
      the width of the bins increases gradually to more closely
      match a Laplacian distribution of the coefficients.
      
      Performance Gain:
      derflr: 0.15%
      hevcmr: 0.675%
      
      Change-Id: I25234244e3bcd94b87c1f77cf682190b61c8ef94
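As a rough illustration of bins whose width increases gradually, here is a minimal sketch. The commit message does not give the experiment's actual bin widths or reconstruction points, so `first_width`, `width_step`, and the linear growth rule are assumptions made only for this example.

```python
from bisect import bisect_right

def make_bin_edges(first_width, width_step, num_bins):
    """Upper edges of bins whose widths grow by width_step per bin."""
    edges = []
    upper = 0
    for k in range(num_bins):
        upper += first_width + k * width_step
        edges.append(upper)
    return edges

def quantize(coeff, edges):
    """Map a coefficient to a signed level.

    Magnitudes below the first edge map to level 0; magnitudes at or past
    the last edge clamp to the top level.
    """
    level = bisect_right(edges, abs(coeff))
    return -level if coeff < 0 else level
```

With `make_bin_edges(4, 2, 4)` the bin widths are 4, 6, 8, and 10, so large magnitudes, which are rare under a Laplacian-like distribution, share wider bins than small ones.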
  8. 08 Jun, 2016 1 commit
    • Take out skip_recode speed feature · 025fa11c
      Jingning Han authored
      The assumption doesn't hold true in the current codebase. Remove
      this speed feature to simplify the codebase.
      
      Change-Id: I9b69f484c9b7cd612b825047cc5b2fce63ee0af7
  9. 04 May, 2016 1 commit
    • Change to use proper type in vp10_token_state · 0d7dc0ca
      Yaowu Xu authored
      "qc" in vp10_token_state is used to save quantized coefficients. This
      commit changes its type from short to tran_low_t to properly reflect
      the value range in a highbitdepth build.
      
      This fixes an out-of-range bug when optimize_b is used in highbitdepth
      build.
      
      Change-Id: I914c6fd3d3f4b9d061f9ed7cc5f08a883ab59dcd
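The out-of-range problem can be reproduced in miniature. This sketch assumes that `short` is 16 bits wide and that `tran_low_t` is 32 bits wide in a highbitdepth build. The helper names and the sample value 40000 are hypothetical and are not taken from the commit.

```python
import ctypes

def store_as_short(value):
    """Store value in a 16-bit signed field and read it back."""
    # ctypes integer types do no overflow checking, so the value wraps.
    return ctypes.c_int16(value).value

def store_as_int32(value):
    """Store value in a 32-bit signed field and read it back."""
    return ctypes.c_int32(value).value
```

A value that fits in 16 bits survives either field, but a larger one wraps around in the 16-bit field, which is the kind of corruption a wider type avoids.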
  10. 21 Apr, 2016 1 commit
  11. 19 Apr, 2016 5 commits
    • Adjust optimize_b RD parameters · ad59b08f
      hui su authored
      Coding gain:
      lowres  0.44%
      midres  0.24%
      hdres   0.32%
      
      Change-Id: Ie558203b2b2bf5c16cd49b114df3d696c4f35049
    • Enable optimize_b for intra blocks · e43c2111
      hui su authored
      Coding gain:
      lowres  0.05%
      midres  0.10%
      hdres   0.18%
      
      Change-Id: I508b150c02588f911a8ddddfe73c770f0819fe10
    • Fix uninitialized blk_skip for VAR TX. · 7aa95be9
      Geza Lore authored
      x->blk_skip used to be uninitialized (leftover from encoding the
      previous block), if cm->tx_mode != TX_MODE_SELECT (which is used with
      higher --cpu-used or --rt options). This resulted in degraded coding
      performance when using cm->tx_mode != TX_MODE_SELECT.
      
      This fixes the VP10/EndToEndTestLarge.EndtoEndPSNRTest/40 unit test.
      
      Also fixed an edge effect where encode_block in encodemb.c used the
      formal width of the block (without cropping at the right edge), to
      look up blk_skip, while select_tx_block in rdopt.c used the cropped
      width to set blk_skip.
      
      Change-Id: I76d0f49ac5ab3ab54203573e0d7fcfcc1c6aa10d
    • Revert "Fix uninitialized blk_skip for VAR TX." · 8d64b53d
      Geza Lore authored
      This reverts commit e7b89d88.
    • Fix uninitialized blk_skip for VAR TX. · e7b89d88
      Geza Lore authored
      x->blk_skip used to be uninitialized (leftover from encoding the
      previous block), if cm->tx_mode != TX_MODE_SELECT (which is used with
      higher --cpu-used or --rt options). This resulted in degraded coding
      performance when using cm->tx_mode != TX_MODE_SELECT.
      
      This fixes the VP10/EndToEndTestLarge.EndtoEndPSNRTest/40 unit test.
      
      Change-Id: If39062927446798c626fc93694b4e6a4f35fa5da
  12. 31 Mar, 2016 1 commit
    • Rename MI_BLOCK_SIZE and MI_MASK macros. · 511da8cb
      Geza Lore authored
      Rename MI_BLOCK_SIZE.* -> MAX_MIB_SIZE.* (MIB is for MI Block).
      Rename MI_MASK.* -> MAX_MIB_MASK.*
      
      There are no functional changes.
      
      This is in preparation for coding the superblock size at the frame
      level, which will require some of these constants to become variables.
      The new names better reflect future semantics, and hence make the code
      clearer.
      
      Change-Id: Iee08d97554cf4cc16a5dc166a3ffd1ab91529992
  13. 30 Mar, 2016 2 commits
  14. 28 Mar, 2016 1 commit
  15. 23 Mar, 2016 1 commit
  16. 18 Mar, 2016 2 commits
  17. 11 Feb, 2016 1 commit
    • Port switch to 9-bit rate cost to vp10. · b3ad8128
      Alex Converse authored
      Brings the following commits to vp10:
      269428e3 Tie the bit cost scale to a define.
      d13385ce Switch to 9-bit rate cost constants built on a 256 probability denominator.
      ad43a738 Fix a signed overflow in vp9 motion cost.
      1c9b0918 Fix some interger overflow errors
      fac947df Restore previous motion search bit-error scale.
      
      Change-Id: I598ba7ee7efcde18439c31dfa96b86cbf297a580
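To make the "9-bit rate cost built on a 256 probability denominator" concrete, here is a minimal sketch of how such a cost could be computed. The names `COST_SHIFT`, `PROB_DENOMINATOR`, and `prob_cost` are hypothetical, and the codebase's actual constants may be generated or rounded differently.

```python
import math

COST_SHIFT = 9          # 9 fractional bits: one bit costs 1 << 9 = 512 units
PROB_DENOMINATOR = 256  # probabilities are expressed as prob / 256

def prob_cost(prob):
    """Cost, in 1/512-bit units, of coding an event with probability prob/256."""
    if not 1 <= prob <= PROB_DENOMINATOR:
        raise ValueError("prob must be in 1..256")
    bits = -math.log2(prob / PROB_DENOMINATOR)
    return round(bits * (1 << COST_SHIFT))
```

Under these assumptions an event with probability 128/256 costs exactly one bit, which is 512 units, and a certain event costs 0.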
  18. 19 Jan, 2016 1 commit
  19. 05 Jan, 2016 1 commit
    • Super transform - ported from nextgen branch · 3787b174
      Debargha Mukherjee authored
      Various additional changes were made to make the experiment
      compatible with misc_fixes.
      
      derflr: +0.979%
      hevcmr: +0.865%
      
      With --enable-supertx, the encoder is only about 10% slower than
      without it. Decoding is about 30% slower.
      
      Note this does not work with ext-tx or var-tx yet. That is
      a TODO.
      
      Change-Id: If25af4241a7a9efbd28f58eda3c4f044c7a7ef4b
  20. 11 Dec, 2015 1 commit
    • Refactor vp10_encode_block_intra · 0919edd4
      Angie Chiang authored
      1) Add VP10_XFORM_QUANT_SKIP_QUANT mode for vp10_xform_quant
      2) Let encode_block call vp10_xform_quant so that its code flow
         is clear
      
      Change-Id: I122d5cf6a089f444ae018f3e4bf844be847e17ee
  21. 03 Dec, 2015 1 commit
    • Refactor vp10_xform_quant · 88cae8b4
      Angie Chiang authored
      1) Add a facade to the quantize b/fp/dc versions so that their
         interfaces are the same.
      2) Merge the vp10_xform_quant b/fp/dc versions into one function so
         that the code flow in encodemb.c is clear
      
      Change-Id: Ib62d6215438fc2d07f4e7e72393f964832d6746f
  22. 25 Nov, 2015 3 commits
    • Add facade to inverse txfm · a245d9f8
      Angie Chiang authored
      Add inv_txfm and highbd_inv_txfm as facades of the inverse transform
      so that the code flow in encodemb.c can be simpler
      
      Change-Id: Iea45fd22dd8b173f8eb3919ca6502636f7bcfcf7
    • Create hybrid_fwd_txfm.c · 96baa73e
      Angie Chiang authored
      Move txfm functions from encodemb to hybrid_fwd_txfm.c
      to make encodemb's code flow clear
      
      Change-Id: If174d8ddb490d149c103e5127d30ef19adfbed13
    • merge txfm_#x#_1 into txfm_#x# · 30e325a9
      Angie Chiang authored
      Change-Id: I9f539491fe676898246976c91d5ac4804a155803
  23. 09 Nov, 2015 1 commit
    • Release v1.5.0 · cbecf57f
      Johann authored
      Javan Whistling Duck release.
      
      Change-Id: If44c9ca16a8188b68759325fbacc771365cb4af8
  24. 03 Nov, 2015 1 commit
    • Eliminate copying for FLIPADST in fwd transforms. · 01bb4a31
      Geza Lore authored
      This patch eliminates the copying of data when using FLIPADST forward
      transforms, by incorporating the necessary data flipping into the
      load_buffer_* functions of the SSE2 optimized forward transforms. The
      load_buffer_* functions are normally inlined, so the overhead of copying
      the data is removed and the overhead of flipping is minimized. Left to
      right flipping is still not free, as the columns need to be shuffled in
      registers.
      
      To preserve identity between the C and SSE2 implementations, the
      appropriate C implementations now also do the data flipping as part of
      the transform, rather than relying on the caller for flipping the input.
      
      Overall speedup is about 1.5-2% in encode on my tests. Note that these
      are only the forward transforms. Inverse transforms to come in a later
      patch.
      
      There are also a few code hygiene changes:
      - Fixed some indents of switch statements.
      - DCT_DCT transforms now always use vp10_fht* functions, which dispatch
        to vpx_fdct* for DCT_DCT (some of them used to call vpx_fdct*
        directly, some of them used to call vp10_fht*).
      
      Change-Id: I93439257dc5cd104ac6129cfed45af142fb64574
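The idea of flipping while loading, rather than building a flipped copy first, can be sketched as follows. This is plain Python for illustration, not the SSE2 code. The function name mirrors the load_buffer_* functions mentioned in the commit message, but the signature and the `flip_ud` / `flip_lr` parameters are assumptions.

```python
def load_buffer(block, flip_ud=False, flip_lr=False):
    """Read a 2-D block row by row, flipping while loading.

    No flipped copy of the input is prepared beforehand: the row order
    (up/down flip) and the column order (left/right flip) are chosen at
    read time. The input block is left unmodified.
    """
    rows = reversed(block) if flip_ud else block
    return [list(reversed(row)) if flip_lr else list(row) for row in rows]
```

In this sketch the up/down flip only changes the order in which rows are read, while the left/right flip has to reorder the elements within each row. That loosely mirrors the commit's remark that left to right flipping is still not free.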
  25. 30 Oct, 2015 1 commit
  26. 29 Oct, 2015 1 commit
  27. 27 Oct, 2015 1 commit
  28. 23 Oct, 2015 2 commits