1. 29 Jul, 2013 3 commits
  2. 27 Jul, 2013 3 commits
    • Cleanup: replacing xd->mode_info_context with temp variable. · cc0ff7ec
      Dmitry Kovalev authored
      Change-Id: I5a3e83102784cabb918a5404405fcab99c5bb9b6
    • Inverse dimension order in token_cost array. · 118ccdcd
      Ronald S. Bultje authored
      This allows us to increment the position at the band-level only as
      we go from one band to the next; more importantly, that allows us to
      use an add instead of multiply instruction, and omit the instruction
      altogether if the band doesn't change from one coef to the next, thus
      being slightly faster (probably more noticeable on systems where a
      multiply is expensive, like arm).
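
      The indexing idea above can be sketched as follows. This is only an
      illustration, not the commit's code: the table sizes, the flat layout and
      both function names are assumptions, and band[] is assumed to be
      non-decreasing along the scan.

```c
#include <stddef.h>

/* Illustrative sizes only; not taken from the commit. */
enum {
  NUM_BANDS = 6,
  NUM_CTX = 6,
  NUM_TOKENS = 12,
  BAND_STRIDE = NUM_CTX * NUM_TOKENS
};

/* Reference version: recompute the band offset (a multiply) per coefficient. */
unsigned int sum_costs_indexed(const unsigned int *costs,
                               const unsigned char *band,
                               const unsigned char *ctx,
                               const unsigned char *token, size_t n) {
  unsigned int total = 0;
  for (size_t i = 0; i < n; ++i)
    total += costs[band[i] * BAND_STRIDE + ctx[i] * NUM_TOKENS + token[i]];
  return total;
}

/* Band-outermost version: keep a pointer to the current band's slice and
 * advance it with an add only when the band changes. Assumes band[] is
 * non-decreasing. */
unsigned int sum_costs_stepped(const unsigned int *costs,
                               const unsigned char *band,
                               const unsigned char *ctx,
                               const unsigned char *token, size_t n) {
  unsigned int total = 0;
  const unsigned int *band_base = costs; /* slice for band 0 */
  unsigned char cur = 0;
  for (size_t i = 0; i < n; ++i) {
    while (cur < band[i]) { /* skipped entirely if the band is unchanged */
      band_base += BAND_STRIDE;
      ++cur;
    }
    total += band_base[ctx[i] * NUM_TOKENS + token[i]];
  }
  return total;
}
```

      Both functions return the same sum; the second one touches the band
      dimension only at band boundaries.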
      
      Change-Id: I4343fe35b9f9a47fa00b217bdcbf5f91ff96c381
    • Shortcut 8x8/16x16 inverse 2D-DCT · 38fa4871
      Jingning Han authored
      This commit brought back the shortcut implementation of 8x8/16x16
      inverse 2D-DCT. When the eob <= 10, it skips the inverse transform
      operations on row 4:7/4:15 in the first round. For bus_cif at 1000
      kbps, this provides about 2% speed-up at speed 0.
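
      A minimal sketch of the row-skipping shortcut for the 8x8 case. The
      basis matrix is an illustrative integer approximation of a scaled DCT,
      the function names are hypothetical, and the sketch assumes (as the
      message implies) that eob <= 10 confines non-zero coefficients to the
      first four rows.

```c
#include <string.h>

enum { N = 8 };

/* Integer approximation of a scaled 8-point DCT basis (row k = frequency k).
 * Illustrative values; the codec's own fixed-point constants differ. */
static const int kBasis[N][N] = {
  {64, 64, 64, 64, 64, 64, 64, 64},
  {89, 75, 50, 18, -18, -50, -75, -89},
  {83, 36, -36, -83, -83, -36, 36, 83},
  {75, -18, -89, -50, 50, 89, 18, -75},
  {64, -64, -64, 64, 64, -64, -64, 64},
  {50, -89, 18, 75, -75, -18, 89, -50},
  {36, -83, 83, -36, -36, 83, -83, 36},
  {18, -50, 75, -89, 89, -75, 50, -18}
};

/* 1-D inverse: out[n] = sum over k of kBasis[k][n] * in[k]. */
static void inverse_1d(const int *in, int *out) {
  for (int n = 0; n < N; ++n) {
    int sum = 0;
    for (int k = 0; k < N; ++k) sum += kBasis[k][n] * in[k];
    out[n] = sum;
  }
}

/* Inverse 2-D transform of a row-major 8x8 block. Only the first rows_used
 * rows go through the row pass; the remaining rows are assumed to be all
 * zero, so their row transforms are zero as well. */
void inverse_2d(const int *coef, int *out, int rows_used) {
  int tmp[N * N];
  memset(tmp, 0, sizeof(tmp));
  for (int r = 0; r < rows_used; ++r) inverse_1d(coef + r * N, tmp + r * N);
  for (int c = 0; c < N; ++c) {
    int col[N], res[N];
    for (int r = 0; r < N; ++r) col[r] = tmp[r * N + c];
    inverse_1d(col, res);
    for (int r = 0; r < N; ++r) out[r * N + c] = res[r];
  }
}

/* Dispatch on the end-of-block position, as the commit message describes. */
void inverse_2d_with_shortcut(const int *coef, int *out, int eob) {
  inverse_2d(coef, out, eob <= 10 ? 4 : N);
}
```

      The shortcut is exact rather than approximate: a linear transform of an
      all-zero row is zero, so skipping rows 4:7 changes no output sample.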
      
      Change-Id: I453e2d72956467d75be4ad8c04b4482ab889d572
  3. 26 Jul, 2013 6 commits
    • vp9_decodemv.c cleanup. · d42e60d2
      Dmitry Kovalev authored
      Renaming:
        read_intra_mode_info  -> read_intra_frame_mode_info
        read_inter_mode_info  -> read_inter_frame_mode_info
        read_intra_block_part -> read_intra_block_mode_info
        read_inter_block_part -> read_inter_block_mode_info
        read_ref_frame        -> read_ref_frames
        read_reference_frame  -> read_is_inter_block
      
      Using num_4x4_blocks_{wide, high}_lookup instead of bit shifts.
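
      For illustration, lookup tables of this kind might look as follows. The
      enum and its ordering are assumptions made for this sketch; only the
      table names come from the message above.

```c
/* Hypothetical block-size enum, named as width x height in pixels. */
typedef enum {
  BLOCK_4X4, BLOCK_4X8, BLOCK_8X4, BLOCK_8X8, BLOCK_8X16, BLOCK_16X8,
  BLOCK_16X16, BLOCK_16X32, BLOCK_32X16, BLOCK_32X32, BLOCK_32X64,
  BLOCK_64X32, BLOCK_64X64, BLOCK_SIZES
} block_size;

/* Width of each block size, in units of 4x4 blocks. */
static const unsigned char num_4x4_blocks_wide_lookup[BLOCK_SIZES] = {
  1, 1, 2, 2, 2, 4, 4, 4, 8, 8, 8, 16, 16
};

/* Height of each block size, in units of 4x4 blocks. */
static const unsigned char num_4x4_blocks_high_lookup[BLOCK_SIZES] = {
  1, 2, 1, 2, 4, 2, 4, 8, 4, 8, 16, 8, 16
};

/* Total number of 4x4 blocks covered by a block of the given size. */
int num_4x4_blocks(block_size bs) {
  return num_4x4_blocks_wide_lookup[bs] * num_4x4_blocks_high_lookup[bs];
}
```

      A table read states the intent directly and handles rectangular sizes
      without separate shift amounts for width and height.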
      
      Change-Id: I83c81573b4ef6f53f2f8d24683895014bebfba61
    • Special handle on DC only inverse 8x8 2D-DCT · 325e0aa6
      Jingning Han authored
      This commit enables special handling of the 8x8 inverse 2D-DCT for the
      case where only the DC coefficient is quantized to a non-zero value. For bus_cif
      at 2000 kbps, it provides about 1% speed-up at speed 0.
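
      Why the DC-only case is cheap can be sketched as below. The scaling
      assumes an orthonormal 8x8 DCT (a lone DC coefficient adds dc / 8 to
      every pixel); the codec's fixed-point rounding differs, and the
      function names are hypothetical.

```c
#include <stdint.h>

/* Clamp a reconstructed value to the 8-bit pixel range. */
static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Add the reconstruction of a DC-only 8x8 block to dest. With only the DC
 * coefficient set, the inverse transform is a constant block, so one offset
 * is computed once and no row or column passes are needed. */
void idct8x8_dc_only_add(int dc, uint8_t *dest, int stride) {
  /* dc / 8 rounded to nearest (assumed orthonormal scaling). */
  const int offset = (dc >= 0) ? ((dc + 4) >> 3) : -((-dc + 4) >> 3);
  for (int r = 0; r < 8; ++r) {
    for (int c = 0; c < 8; ++c)
      dest[r * stride + c] = clip_pixel(dest[r * stride + c] + offset);
  }
}
```

      The per-block cost drops from two transform passes to 64 additions
      with clamping.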
      
      Change-Id: I2523222359eec26b144cf8fd4c63a4ad63b1b011
    • Fix some format error and code error in neon code. · 588b4daf
      hkuang authored
      Change-Id: I748dee8938dfb19f417f24eed005f3d216f83a82
    • d45 intra prediction SSSE3 optimizations. · 94b0c679
      Ronald S. Bultje authored
      Change-Id: Ie48035ff4f93c41f8a9b3023e6444fd10432d8fb
    • Auto min and max partition size experiment. · fe5e2a91
      Paul Wilkins authored
      Speed feature experiment to set an upper and lower
      partition size limit based on what has been seen
      in spatial neighbors.
      
      This seems to give quite reasonable speed gains in local tests
      (10-15%), and when used with speed 0 the losses are small
      (0.25% derf, 0.35% stdhd). However, for now I am only
      enabling it on speed 1 as there may be clashes with the existing
      temporal partition selection in speed 2.
      
      Using a tighter min / max around the range derived from the
      neighbors increases speed further but at the cost of a
      bigger quality loss. However, I think this spatial method could
      be combined with data from either the last frame or a variance
      method (or both) to refine the range of minimum and maximum
      partition size. I.e. consider the min and max from spatial and
      temporal neighbors and the variance recommendation.
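
      The spatial part of this idea can be sketched as follows. The types,
      the function name and the restriction to square sizes are assumptions
      made for illustration; the experiment's actual neighbor selection is
      not described in this message.

```c
/* Hypothetical square partition sizes, smallest to largest. */
typedef enum {
  PART_4X4, PART_8X8, PART_16X16, PART_32X32, PART_64X64
} part_size;

typedef struct {
  part_size min;
  part_size max;
} part_range;

/* Derive partition search limits from the sizes chosen by already coded
 * spatial neighbors. With no neighbors, allow the full range. */
part_range partition_range_from_neighbors(const part_size *neighbors,
                                          int count) {
  part_range range = {PART_4X4, PART_64X64};
  if (count <= 0) return range;
  range.min = range.max = neighbors[0];
  for (int i = 1; i < count; ++i) {
    if (neighbors[i] < range.min) range.min = neighbors[i];
    if (neighbors[i] > range.max) range.max = neighbors[i];
  }
  return range;
}
```

      A combined method, as suggested above, could intersect or widen this
      range with one derived from the previous frame or from a variance
      estimate.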
      
      Change-Id: I1b96bf8b84368d6aad0c7aa600fe141b4f07435f
    • Modify static threshold calculation · 52256cdb
      Yunqing Wang authored
      Used 3 * standard_deviation in the internal threshold calculation
      instead of a fitted curve. This approximates the algorithm more
      closely.
      For comparison, similar tests were run, and
      the overall PSNR loss is lower than before:
      1. derf set:
      when static-thresh = 1, psnr loss is 0.329%;
      when static-thresh = 500, psnr loss is 0.970%;
      2. stdhd set:
      when static-thresh = 1, psnr loss is 0.922%;
      when static-thresh = 500, psnr loss is 1.307%;
      
      Similar speedup is achieved. For example,
      clip            bitrate  static-thresh psnr    time
      akiyo(cif)       500        0          48.952  5.077s(50f)
      akiyo            500        500        48.866  4.169s(50f)
      
      parkjoy(1080p)   4000       0          30.388  78.20s(30f)
      parkjoy          4000       500        30.367  70.85s(30f)
      
      sunflower(1080p) 4000       0          44.402  74.55s(30f)
      sunflower        4000       500        44.414  68.69s(30f)
      
      Change-Id: Ic78833642ce1911dbbd1cb6c899a2d7e2dfcc1f3
  4. 25 Jul, 2013 13 commits
    • Making read_inter_mode_info function more clear. · 048e9c09
      Dmitry Kovalev authored
      Now read_inter_mode_info calls read_intra_block_part (renamed from
      read_intra_block_modes) or read_inter_block_part (just added).
      
      Change-Id: I541badea6b663e0ae692ec158665efb90ed20c03
    • Add encoding option --static-thresh · d36852b7
      Yunqing Wang authored
      This option exists in VP8, and it was rewritten in VP9 to support
      skipping on different partition levels. After prediction is done,
      we can check if the residuals in the partition block will be all
      quantized to 0. If this is true, the skip flag is set, and only
      prediction data are needed in reconstruction. Based on DCT's energy
      conservation property, the skipping check can be estimated in
      spatial domain.
      
      The prediction error is calculated and compared to a threshold.
      The threshold is determined by the dequant values, and also
      adjusted by partition sizes. To be precise, the DC and AC parts
      for Y, U, and V planes are checked to decide skipping or not.
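
      The spatial-domain check can be sketched under simplifying assumptions:
      an orthonormal transform, a single quantizer step q, and
      round-to-nearest quantization. The function names are hypothetical, and
      the encoder's separate DC/AC thresholds and partition-size adjustment
      are not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of squared prediction errors over a block of n samples. */
uint64_t residual_sse(const int16_t *residual, size_t n) {
  uint64_t sse = 0;
  for (size_t i = 0; i < n; ++i)
    sse += (uint64_t)((int32_t)residual[i] * residual[i]);
  return sse;
}

/* Sufficient condition for an all-zero quantized block. An orthonormal
 * transform preserves energy, so the sum of squared coefficients equals the
 * SSE and no coefficient can exceed sqrt(SSE) in magnitude. If sqrt(SSE) is
 * below q / 2, i.e. 4 * SSE < q * q, every coefficient rounds to zero. */
int can_skip_block(const int16_t *residual, size_t n, unsigned int q) {
  return 4 * residual_sse(residual, n) < (uint64_t)q * q;
}
```

      The test is conservative: it may miss blocks that would quantize to
      zero, but under the stated assumptions it never skips a block that has
      a non-zero quantized coefficient.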
      
      Test showed that
      1. derf set:
      when static-thresh = 1, psnr loss is 0.666%;
      when static-thresh = 500, psnr loss is 1.162%;
      2. stdhd set:
      when static-thresh = 1, psnr loss is 1.249%;
      when static-thresh = 500, psnr loss is 1.668%;
      
      Across clips, the encoding speedup ranges from a few percent to
      over 20% when static-thresh <= 500. For example,
      clip            bitrate  static-thresh psnr    time
      akiyo(cif)       500        0          48.923  5.635s(50f)
      akiyo            500        500        48.863  4.402s(50f)
      
      parkjoy(1080p)   4000       0          30.380  77.54s(30f)
      parkjoy          4000       500        30.384  69.59s(30f)
      
      sunflower(1080p) 4000       0          44.461  85.2s(30f)
      sunflower        4000       500        44.418  78.1s(30f)
      
      Higher static-thresh values give larger speedup with larger
      quality loss.
      
      Change-Id: I857031ceb466ff314ab580ac5ec5d18542203c53
    • Add const to vp9_accum_mv_refs parameter · 6c8ef8d9
      Johann authored
      Change-Id: I0625d8ffddf590dfecd1bb8b8d6f57ef64b8bf18
    • General cleanups. · 7131cb0e
      Dmitry Kovalev authored
      Removing unused constants, macros, and function declarations. Using
      ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving
      #include from *.h to *.c. Merging for loops for motion vectors.
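
      As a rough illustration of what such helpers replace, a rounding macro
      of this kind is commonly written as below. The definition is an
      assumption and may not match the project's header exactly; zero_object
      and copy_object are hypothetical stand-ins for the zero/copy helpers.

```c
#include <string.h>

/* value / 2^n rounded to nearest, for n >= 1 and non-negative value. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))

/* Hypothetical stand-ins: clear or copy a whole object by its own size. */
#define zero_object(dest) memset(&(dest), 0, sizeof(dest))
#define copy_object(dest, src) memcpy(&(dest), &(src), sizeof(dest))

/* Example use: rounded average of four samples. */
int average4(int a, int b, int c, int d) {
  return ROUND_POWER_OF_TWO(a + b + c + d, 2);
}
```

      Named helpers remove hand-written "+ half, then shift" expressions and
      explicit sizeof arguments, which are easy to get subtly wrong.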
      
      Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
    • Adding lookup table for size group. · 08fd41cc
      Dmitry Kovalev authored
      Change-Id: Ia6144d77ebed66e0739b62e4d673e26a95aa9550
    • Simplify handling of sub-partition motion vectors · be700e14
      Adrian Grange authored
      Simplified the code that extracts and uses the motion
      vectors for the 4 sub-partitions in rd_pick_partition.
      
      Change-Id: Iaf698ef7ee3aef9edd59015e1ae065dd359b17d9
    • Make coeff_optimize initialized per-plane · 2f58faff
      Jingning Han authored
      This commit makes the initialization of trellis coeff optimization
      a per-plane operation, thereby eliminating the redundant steps in
      encode_sby and encode_sbuv. It makes the encoder at speed 0 slightly
      faster.
      
      Change-Id: Iffe9faca6a109dafc0dd69dc7273cbdec19b17cd
    • Removing duplicated PREDICTION_PROBS constant. · 778989a0
      Dmitry Kovalev authored
      Already defined in vp9_seg_common.h.
      
      Change-Id: I5a0e3fa15966b1ebeb77ccd506b55fc231c22342
    • Removing vp9_adapt_mode_context function. · 47d61f00
      Dmitry Kovalev authored
      Moving code from vp9_adapt_mode_context to vp9_adapt_mode_probs.
      
      Change-Id: I60829c30b28968cd813551ef3a206dfb98d323c9
    • fix a bug where flags are not reset · 3e386aef
      Yaowu Xu authored
      The feature that uses small partition results as a measure to skip
      mode evaluation at larger partition requires the flags to be reset.
      The reset was missing in the code path that calls rd_use_partition().
      
      Change-Id: Ia0a3a0aee1a862b6e2333d596808db7c48033d50
    • SSE2 inverse 4x4 2D-DCT with DC only · 384e37e3
      Jingning Han authored
      Add SSE2 implementation to handle the special case of inverse 2D-DCT
      where only DC coefficient is non-zero.
      
      Change-Id: I2c6a59e21e5e77b8cf39a4af5eecf4d5ade32e2f
    • Removing duplicated code for merging two probabilities. · 40358dc4
      Dmitry Kovalev authored
      Adding common merge_probs and merge_probs2 functions. Changing ints to
      unsigned ints in some places.
      
      Change-Id: Icf088ffdea7cf5b95284a128916409bdd53506b0
    • Inlining vp9_init_mode_contexts function. · 4450fa4c
      Dmitry Kovalev authored
      Change-Id: I21ee76bcae101cc9f6ef1d867622e50b7ae565fc
  5. 24 Jul, 2013 10 commits
  6. 23 Jul, 2013 5 commits