  1. 12 Jul, 2013 3 commits
    • Some minor cleanups for efficiency · 94c481f9
      Deb Mukherjee authored
      Implements some of the helper functions more efficiently with
      lookups rather than branches. Modeling function is consolidated
      to reduce some computations.
      
      Also merged the two enums BLOCK_SIZE_TYPES and BlockSize into
      one because there is no need to keep them separate (even though
      the semantics are a little different).
      
      No bitstream or output change.
      
      About 0.5% speedup
      
      Change-Id: I7d71a66e8031ddb340744dc493f22976052b8f9f
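The "lookups rather than branches" idea in the commit above can be sketched as follows. This is a minimal illustration only: the enum `BlkSize`, the table `kWidthLog2`, and both helper names are hypothetical and are not taken from the codec source.

```c
#include <assert.h>

/* Hypothetical block-size enum for illustration; not the codec's actual list. */
typedef enum {
  BLK_4X4,
  BLK_8X8,
  BLK_16X16,
  BLK_32X32,
  BLK_64X64,
  BLK_SIZE_COUNT
} BlkSize;

/* Branch-based helper: up to four comparisons per call. */
static int width_log2_branchy(BlkSize b) {
  if (b == BLK_4X4) return 2;
  if (b == BLK_8X8) return 3;
  if (b == BLK_16X16) return 4;
  if (b == BLK_32X32) return 5;
  return 6;
}

/* Lookup-based helper: the same mapping stored as a constant table. */
static const int kWidthLog2[BLK_SIZE_COUNT] = { 2, 3, 4, 5, 6 };

static int width_log2_lookup(BlkSize b) {
  assert((int)b >= 0 && (int)b < BLK_SIZE_COUNT);
  return kWidthLog2[b]; /* single indexed load, no data-dependent branches */
}
```

Both helpers return the same values; the table version replaces a chain of comparisons with one array access, which is the kind of change the commit message describes.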
    • Removing redundant code mostly from vp9_pred_common.{h, c}. · dd150e8e
      Dmitry Kovalev authored
      Removing redundant function arguments and curly braces.
      
      Change-Id: I46e02561f33fe02e84a3b19756f03b9504bd6a1b
    • Remove unused function block_error(). · ee09dd99
      Ronald S. Bultje authored
      Change-Id: I78a79fc51c2d7cc3c261f35b569155397f3dc0c4
  2. 11 Jul, 2013 5 commits
  3. 10 Jul, 2013 11 commits
  4. 09 Jul, 2013 4 commits
    • Adding encode_tiles function to vp9_bitstream.c. · d82f459d
      Dmitry Kovalev authored
      Change-Id: Ie44824ec25fd8fdb25d7c8124a9b28c26d802029
    • Remove all asm offset files from VP9 · f0d9f10d
      John Koleszar authored and James Zern committed
      The files are empty and unused.
      
      Change-Id: Ieb4242d14273efdf24149bda33f9591540bba06a
    • Unbreak lossless. · 059c0ba5
      Ronald S. Bultje authored
      Change-Id: I8130ec9b5371c65e885f245a5ac73840c23cb4a1
    • Make intra prediction pointers RTCD-based. · 8350e7fe
      Ronald S. Bultje authored
      This probably has a mildly negative impact on performance, but will
      (in future commits - or possibly merged with this one) allow SIMD
      implementations of individual intra prediction functions. We may
      perhaps want to consider having separate functions per txfm-size
      also (i.e. 4x4, 8x8, 16x16 and 32x32 intra prediction functions for
      each intra prediction mode), but I haven't played much with that
      yet.
      
      Change-Id: Ie739985eee0a3fcbb7aed29ee6910fdb653ea269
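Run-time CPU detection (RTCD) dispatch, as referred to in the commit above, generally means calling through function pointers that are filled in once at startup. The sketch below shows that pattern for two simple intra predictors. All names (`intra_pred_fn`, `setup_rtcd_sketch`, `predict_sketch`, `CPU_HAS_SIMD_SKETCH`) are hypothetical, and no SIMD version is actually provided here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical predictor signature; the real codec's signature may differ. */
typedef void (*intra_pred_fn)(uint8_t *dst, ptrdiff_t stride, int size,
                              const uint8_t *above, const uint8_t *left);

/* Plain C vertical predictor: copy the row above into every row. */
static void v_pred_c(uint8_t *dst, ptrdiff_t stride, int size,
                     const uint8_t *above, const uint8_t *left) {
  int r;
  (void)left;
  for (r = 0; r < size; ++r) memcpy(dst + r * stride, above, (size_t)size);
}

/* Plain C horizontal predictor: fill each row with its left neighbour. */
static void h_pred_c(uint8_t *dst, ptrdiff_t stride, int size,
                     const uint8_t *above, const uint8_t *left) {
  int r;
  (void)above;
  for (r = 0; r < size; ++r) memset(dst + r * stride, left[r], (size_t)size);
}

enum { V_PRED_IDX, H_PRED_IDX, PRED_COUNT };

#define CPU_HAS_SIMD_SKETCH 0x1u /* hypothetical capability bit */

static intra_pred_fn intra_pred[PRED_COUNT];

/* Fill the dispatch table once, based on detected CPU capabilities. */
static void setup_rtcd_sketch(unsigned cpu_flags) {
  intra_pred[V_PRED_IDX] = v_pred_c;
  intra_pred[H_PRED_IDX] = h_pred_c;
  if (cpu_flags & CPU_HAS_SIMD_SKETCH) {
    /* A SIMD build would overwrite individual entries here. This sketch
     * has no SIMD versions, so the C defaults stay in place. */
  }
}

/* Callers go through the table instead of naming an implementation. */
static void predict_sketch(int mode, uint8_t *dst, ptrdiff_t stride, int size,
                           const uint8_t *above, const uint8_t *left) {
  assert(mode >= 0 && mode < PRED_COUNT && intra_pred[mode] != NULL);
  intra_pred[mode](dst, stride, size, above, left);
}
```

The indirect call is the likely source of the mild slowdown the commit mentions; the benefit is that each table entry can later be replaced independently.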
  5. 08 Jul, 2013 7 commits
    • Don't call encode_sb() for the final of 4-split subpartitions. · a5062cc6
      Ronald S. Bultje authored
      The resulting reconstruction is never used, thus it just wastes CPU
      cycles. Reduces encode time of first 50 frames of bus (speed 0) @
      1500kbps from 2min2.0 to 2min1.2, i.e. a 0.65% overall speedup.
      
      Change-Id: I74755ca3aadc21e2be220f486259060bd4088c45
    • Don't recalculate mv_ref costs for each block/partition. · 8fde07a3
      Ronald S. Bultje authored
      Changes cost_mv_ref() to do a lookup into pre-calculated cost
      arrays instead. Encode time of first 50 frames of bus (speed 0)
      @ 1500kbps goes from 2min11.6 to 2min10.9, i.e. 0.5% faster overall.
      
      Change-Id: If186e92c34c201b29cbbc058785a15c9c09e433a
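The change described above, replacing a per-block cost computation with a lookup into pre-calculated arrays, can be sketched like this. The table dimensions, the stand-in cost model `compute_cost_sketch`, and the function names are assumptions made for illustration; they do not reproduce the encoder's actual cost derivation.

```c
#include <assert.h>

/* Hypothetical table dimensions for illustration. */
#define NUM_CTX_SKETCH 7
#define NUM_MODES_SKETCH 4

static int mode_cost[NUM_CTX_SKETCH][NUM_MODES_SKETCH];

/* Stand-in for the expensive per-call cost computation. */
static int compute_cost_sketch(int ctx, int mode) {
  return 256 + 32 * ctx + 64 * mode;
}

/* Run once (for example per frame) rather than per block or partition. */
static void init_mode_costs(void) {
  int ctx, mode;
  for (ctx = 0; ctx < NUM_CTX_SKETCH; ++ctx)
    for (mode = 0; mode < NUM_MODES_SKETCH; ++mode)
      mode_cost[ctx][mode] = compute_cost_sketch(ctx, mode);
}

/* The per-block query is now a plain table lookup. */
static int cost_mv_ref_lut(int ctx, int mode) {
  assert(ctx >= 0 && ctx < NUM_CTX_SKETCH);
  assert(mode >= 0 && mode < NUM_MODES_SKETCH);
  return mode_cost[ctx][mode];
}
```

The table has to be rebuilt whenever the inputs to the cost model change, so this only pays off when lookups far outnumber rebuilds.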
    • Remove unnecessary memset(best_index, 0) from trellis/optimize. · 5a732549
      Ronald S. Bultje authored
      First 50 frames of bus @ 1500kbps (speed 0) goes from 2min12.6 to
      2min11.6, i.e. 0.75% overall speedup.
      
      Change-Id: I67054f8146e82a02b6457c51a1c8627a937e5e1e
    • Remove memcpy() in handle_inter_mode() filter selection. · fcf7998a
      Ronald S. Bultje authored
      Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from
      2min4.9 to 2min3.1, i.e. a 1.4% speedup overall.
      
      Change-Id: Ibe8b08d159797504c5d0c5122de1b6da3b6595e0
    • Make frame-wide filter-type decision fully RD-based. · ed995afb
      Ronald S. Bultje authored
      Overall, on all test sets, this gains about +0.2% on all metrics.
      City is a clip where this really hurts (-1.0% on all metrics); I'm
      not quite sure why yet. It may be interesting to look into in the future.
      
      Change-Id: I6f0eecb20e72f0194633270d30bf00d76d9eae78
    • Using mi_cols instead of mb_cols. · b7559258
      Dmitry Kovalev authored
      Eliminating usage of mb-units, switching to mi-units. Adding
      ALIGN_POWER_OF_TWO macro.
      
      Change-Id: I2491c969f713207c062011878b57e4e531818607
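The commit above names an ALIGN_POWER_OF_TWO macro without showing it. One common way to write such a macro is sketched below, assuming the arguments are a value and the log2 of the alignment. The helper `pixels_to_mi_sketch` and the 8-pixel unit size are likewise assumptions for illustration.

```c
#include <assert.h>

/* Round value up to the next multiple of 2^n (assumed argument order). */
#define ALIGN_POWER_OF_TWO(value, n) \
  (((value) + ((1 << (n)) - 1)) & ~((1 << (n)) - 1))

/* Hypothetical: log2 of the unit size in pixels, assumed to be 8 pixels. */
#define MI_SIZE_LOG2_SKETCH 3

/* Convert a pixel dimension to a count of units, rounding up. */
static int pixels_to_mi_sketch(int pixels) {
  assert(pixels >= 0);
  return ALIGN_POWER_OF_TWO(pixels, MI_SIZE_LOG2_SKETCH) >> MI_SIZE_LOG2_SKETCH;
}
```

Adding `2^n - 1` and then clearing the low `n` bits rounds up without a division, which is why the alignment must be a power of two.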
    • Implements several heuristics to prune mode search · d9b62160
      Deb Mukherjee authored
      Skips mode searches for intra and compound inter modes depending
      on the best mode so far and the reference frames. The various
      heuristics to be used are selected by bits from a flag. The
      previous direction based intra mode search pruning is also absorbed
      in this framework.
      
      Specifically the flags and their impact are:
      
      1) FLAG_SKIP_INTRA_BESTINTER (skip intra mode search for oblique
      directional modes and TM_PRED if the best so far is
      an inter mode)
      derfraw300: -0.15%, 10% speedup
      
      2) FLAG_SKIP_INTRA_DIRMISMATCH (skip D27, D63, D117 and D153
      mode search if the best so far is not one of the closest
      hor/vert/diagonal directions)
      derfraw300: -0.05%, about 9% speedup
      
      3) FLAG_SKIP_COMP_BESTINTRA (skip compound prediction mode
      search if the best so far is an intra mode)
      derfraw300: -0.06%, about 7-8% speedup
      
      4) FLAG_SKIP_COMP_REFMISMATCH (skip compound prediction search
      if the best single ref inter mode does not have the same ref
      as one of the two references being tested in the compound mode)
      derfraw300: -0.56%, about 10% speedup
      
      Change-Id: I1a736cd29b36325489e7af9f32698d6394b2c495
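The flag-driven pruning described above can be sketched as a bitmask test. The flag names come from the commit message, but the bit values, the function `skip_compound_mode`, and its parameters are assumptions; only the two compound-mode rules are modelled here.

```c
#include <assert.h>

/* Flag names from the commit message; the bit assignments are assumed. */
enum {
  FLAG_SKIP_INTRA_BESTINTER = 1 << 0,
  FLAG_SKIP_INTRA_DIRMISMATCH = 1 << 1,
  FLAG_SKIP_COMP_BESTINTRA = 1 << 2,
  FLAG_SKIP_COMP_REFMISMATCH = 1 << 3
};

/* Hypothetical helper: returns 1 if a compound mode using references
 * ref0 and ref1 should be skipped given the best mode found so far. */
static int skip_compound_mode(unsigned flags, int best_is_intra,
                              int best_single_ref, int ref0, int ref1) {
  assert(ref0 != ref1); /* assumed: a compound mode uses two distinct refs */

  /* Rule 3: the best mode so far is intra. */
  if ((flags & FLAG_SKIP_COMP_BESTINTRA) && best_is_intra) return 1;

  /* Rule 4: the best single-ref inter mode shares neither reference. */
  if ((flags & FLAG_SKIP_COMP_REFMISMATCH) && !best_is_intra &&
      best_single_ref != ref0 && best_single_ref != ref1)
    return 1;

  return 0;
}
```

Because each heuristic is a separate bit, any combination can be enabled by OR-ing flags together, which matches the commit's description of selecting heuristics "by bits from a flag".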
  6. 04 Jul, 2013 1 commit
  7. 03 Jul, 2013 7 commits
    • Adding write_skip_coeff function. · dda1835d
      Dmitry Kovalev authored
      Change-Id: I221126f22ab9067348eb0efb8a73b15a8f49c3fd
    • Enable early termination in rd search · 2bd6fe08
      Jingning Han authored
      This commit allows the encoder to track the cumulative rate-distortion
      cost per transformed block inside a partition. If the cumulative
      rd cost is already above the best rd value, it skips the remaining
      operations and continues to the next prediction mode test.
      
      It reduces the runtime of bus at a target bit-rate of 2000 from 308 seconds
      to 266 seconds, i.e., about a 13% speed-up at no performance penalty.
      
      Change-Id: I5f15a3d8955d97031d5653006027866a00654e7a
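The early-termination rule described above can be sketched as a running sum with a bail-out. The function name, its parameters, and the use of `INT64_MAX` as the "terminated" marker are assumptions for illustration; per-block costs are passed in as an array, whereas the encoder would compute them inside the loop.

```c
#include <assert.h>
#include <stdint.h>

/* Accumulate per-block rd costs for one prediction mode. Returns the total,
 * or INT64_MAX if the running total exceeded best_rd before the end.
 * *blocks_evaluated reports how many blocks were actually processed. */
static int64_t accumulate_rd_sketch(const int64_t *block_rd, int num_blocks,
                                    int64_t best_rd, int *blocks_evaluated) {
  int64_t total = 0;
  int i;
  assert(num_blocks >= 0);
  for (i = 0; i < num_blocks; ++i) {
    total += block_rd[i]; /* in an encoder this is the expensive step */
    if (total > best_rd) {
      /* This mode can no longer beat the best one, so stop here. */
      *blocks_evaluated = i + 1;
      return INT64_MAX;
    }
  }
  *blocks_evaluated = num_blocks;
  return total;
}
```

Since rd costs are non-negative, a partial sum above the best value can never recover, so stopping early leaves the final mode decision unchanged. That is consistent with the "no performance penalty" reported in the commit.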
    • Calling set_partition_seg_context() instead of code duplication. · 2ad62c93
      Dmitry Kovalev authored
      Change-Id: I65be6acc54c99688fd1f0c946cec3511514b8555
    • Replacing 64 / MI_SIZE with MI_BLOCK_SIZE. · 5a21de84
      Dmitry Kovalev authored
      Change-Id: I32276552b3ea6dc1dce8e298be114cfe1019b31c
    • Refactor SSE2 8x8 functional units · 2cb75c96
      Jingning Han authored
      These serve as building blocks for SSE2 8x8 and 16x16 ADST/DCT
      hybrid transform coding.
      
      Change-Id: I4089a754c66e0c986f67d9b8ec4dfb9627ad430d
    • Fix to comp_inter_joint_search_thresh feature. · f58b44ad
      Paul Wilkins authored
      When this is 0 (BLOCK_SIZE_AB4X4) we want to do
      the inter joint search for all sizes.
      
      Change-Id: Id40cd6fe7790e7e1165352b9cef5e12fa8c0bc88
    • Added two new skip experiments. · 72c5778e
      Paul Wilkins authored
      sf->unused_mode_skip_lvl. Tests modes as normal for all
      sizes at or below the given level. At larger sizes it skips
      all modes that were not chosen at any smaller size.
      Hence setting it to BLOCK_SIZE_SB64X64 in effect turns it off.
      Setting it to BLOCK_SIZE_AB4X4 means that larger sizes only
      consider modes that were chosen for one or more 4x4 blocks.
      
      sf->reference_masking.
      Do a test encode of the NONE partition at one size and create
      a reference frame mask based on the best rd choice. In the
      full search only allow this reference frame.
      Currently it is testing 64x64 and repeats this in the full search.
      This does not work well with Jim's Partition code just now and
      is disabled by default.
      
      Change-Id: I8f8c52d2ef4a0c08100150b0ea4155d1aaab93dd
  8. 02 Jul, 2013 2 commits