  1. 06 Dec, 2014 1 commit
    • Remove redundant rdcost reset · 17bedc54
      Jingning Han authored
      The initial reset of this_rdc in vp9_pick_inter_mode is not needed,
      since it will be re-assigned when used.
      Change-Id: Ic0e12d741cbab292fc214c1eabb48b129af7839b
  2. 05 Dec, 2014 5 commits
    • Fix a motion search skip condition in vp9_pick_inter_mode · eadffb2d
      Jingning Han authored
      Compare the current best mode rate-distortion cost with the skip
      threshold to decide whether to perform motion search.
      Change-Id: Ia071824f8dd3b7db485f424692a485a2da6a1a9f
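
The skip condition described in this commit can be pictured with a minimal C sketch. The helper name, its parameters, and the direction of the comparison (skip when the best cost so far is already below the threshold) are assumptions made for illustration, not the actual code in vp9_pick_inter_mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a libvpx function): returns 1 when the best
 * rate-distortion cost found so far is already below the skip threshold,
 * so a new motion search is assumed not to be worth running. */
static int can_skip_motion_search(int64_t best_rdcost, int64_t skip_thresh) {
  assert(best_rdcost >= 0 && skip_thresh >= 0);
  return best_rdcost < skip_thresh;
}
```

A caller would check this helper before running the search for a candidate mode and move on to the next mode when it returns 1.
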
    • Remove redundant MB_MODE_INFO reset from vp9_pick_mode_inter · 732d57c2
      Jingning Han authored
      Change-Id: I0222f7abc61202f4a83b117bbfb042ada6304562
    • Remove redundant vp9_zero in choose_partitioning · 9d88b308
      Jingning Han authored
      This makes speed -6 encoding about 2% faster overall, with no change
      in compression performance.
      Change-Id: I680a967b421caa2c5a5cdb821311c4726a2df45a
    • Enable conditional skip path in rd_pick_intra_sby_mode · 74ded486
      Jingning Han authored
      These speed-up features for key frame coding are only turned on
      in the settings of hybrid non-RD and RD mode decision. They provide
      about 20% speed-up to the hybrid key frame coding at the expense
      of some compression performance loss. For vidyo1, the key frame
      coding statistics change as follows:
      9838F, 35.020 dB, 61677 us -> 9920F, 34.834 dB, 47556 us
      Overall rtc set compression performance is down by 0.257%.
      Change-Id: I0025447fda26bb7855e982955642b5f55d71b51f
    • Use hybrid RD and non-RD coding flow for key frame coding · 07711e9b
      Jingning Han authored
      When block size is below 16x16, the encoder switches from non-RD to
      RD mode for key frame coding. This largely brings back the key
      frame compression performance. For vidyo1 at 1000 kbps, the key
      frame coding statistics change as follows:
      9978F, 34.183 dB, 36807 us -> 9838F, 35.020 dB, 61677 us
      compared to the full RD case:
      7187F, 34.930 dB, 214470 us
      The overall rtc set coding performance (single key frame setting)
      is improved by 1.5%.
      Change-Id: I78a4ecf025d7b24ec911e85be94e01da05e77878
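
The hybrid flow in this commit amounts to a per-block choice between the two decision paths. The sketch below is only an illustration: the enum, the function name, and the reading of "below 16x16" as "either dimension smaller than 16 pixels" are assumptions, not the libvpx implementation.

```c
#include <assert.h>

/* Hypothetical labels for the two mode decision paths. */
typedef enum { DECISION_NON_RD, DECISION_RD } ModeDecisionPath;

/* Pick the decision path for a key frame block of bw x bh pixels.
 * Assumption: a block counts as "below 16x16" when either side is
 * shorter than 16 pixels. */
static ModeDecisionPath key_frame_decision_path(int bw, int bh) {
  assert(bw > 0 && bh > 0);
  return (bw < 16 || bh < 16) ? DECISION_RD : DECISION_NON_RD;
}
```

Under this reading, an 8x8 block takes the slower RD path while 16x16 and larger blocks stay on the fast non-RD path, which matches the speed and quality trade-off reported in the statistics above.
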
  3. 04 Dec, 2014 7 commits
  4. 03 Dec, 2014 7 commits
    • Use memset for initialization to 0 · 73caef05
      Adrian Grange authored
      Change-Id: I714ca22b5d51016bf8b035cf457616c707257641
    • Increase delta-qp for aq=3 mode, after key frame. · a047e7cd
      Marco authored
      For a few refresh periods after a key frame, use a large qp-delta
      to speed up the quality ramp-up.
      Change-Id: Ib5a150fb2dfa6bafd0d4e6b5d28dfd0724b61319
    • Fix indent in source_var_based_partition_search_method · 17176cd4
      Jingning Han authored
      Change-Id: I6e5e0571d6967b9b992966336715e35bb97f187e
    • Enable non-rd mode coding on key frame, for speed 6. · 8fd3f9a2
      Marco authored
      For key frame at speed 6: enable the non-rd mode selection in speed setting
      and use the (non-rd) variance_based partition.
      Adjust some logic/thresholds in variance partition selection for key frame only (no change to delta frames),
      mainly to bias to selecting smaller prediction blocks, and also set max tx size of 16x16.
      Loss in key frame quality (~0.6-0.7dB) compared to rd coding,
      but speeds up key frame encoding by at least 6x.
      Average PSNR/SSIM metrics over RTC clips go down by ~1-2% for speed 6.
      Change-Id: Ie4845e0127e876337b9c105aa37e93b286193405
    • Remove unused ONE_LOOP entry from speed feature · a8d8c0f6
      Jingning Han authored
      Change-Id: I56ead0ebc2491144c4e79e5859b05e126176702c
    • Rework coeff probability model update for rtc coding · 8fe50191
      Jingning Han authored
      This commit reworks the ONE_LOOP_REDUCED coefficient probability
      model update process. It allows model update for every coefficient
      across the spectrum at a coarser resolution, instead of performing
      precise update only for a certain subset of probability models.
      The overall runtime remains nearly the same (<1% change) for speed -6.
      The compression performance is improved by 7.5% in PSNR for speed
      -5 and 4.57% for speed -6, respectively.
      Change-Id: Ifb17136382ee7e39a9f34ff4a4f09a753125c8d1
    • vp9: sync threads after a longjmp · 6f7ab014
      James Zern authored
      Synchronize all threads immediately as a subsequent decode call may
      cause a resize invalidating some allocations.
      fixes one aspect of crbug.com/437655
      Change-Id: Ie993b62c2756478543206ddbe43ec6268d90a470
  5. 02 Dec, 2014 4 commits
    • Reinsert macro to fix issue 884. · 2c886953
      Peter de Rivaz authored
      Change 72056 unfolded some macro definitions,
      but lost some alternative behaviour required for
      high bitdepth encodes.
      This causes the encoder to crash, see issue 884.
      Change-Id: I8ce4d73c9fe0a3c10ccb86fba210fabc8b2f0ccc
    • Fix a warning related to VPX_EFLAG_FORCE_KF check · 02941b0d
      Deb Mukherjee authored
      Fixes a warning in chrome build.
      Change-Id: I8fa0fd3e7ba1aecf89e5f79ce94cd64ed6a9567c
    • Added high bitdepth sse2 transform functions · 7e40a55e
      Peter de Rivaz authored
      Also removes some spurious changes in common/vp9_blockd.h which
      were introduced by a rebase issue between nextgen and master branches.
      Change-Id: If359f0e9a71bca9c2ba685a87a355873536bb282
      (cherry picked from commit 005d80cd05269a299cd2f7ddbc3d4d8b791aebba)
      (cherry picked from commit 08d2f548007fd8d6fd41da8ef7fdb488b6485af3)
      (cherry picked from commit 4230c2306c194c058f56433a5275aa02a2e71d56)
    • Cyclic refresh: factor segment delta-q into rate control. · 83fd1897
      Marco Paniconi authored
      Incorporate segment delta-q into estimated bits.
      This generally improves the rate control under cyclic refresh (aq=3) mode.
      Change-Id: I1dc60fb230e7d08357fae18909d8ed27bf58e037
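
One way to picture "incorporating segment delta-q into estimated bits" is a weighted blend of two per-macroblock estimates, one at the base q and one at the refreshed segment's q. Everything in the sketch below is an assumption made for illustration: the function names, the inverse-proportional rate model, and the use of a single refresh fraction are not taken from the libvpx source.

```c
#include <assert.h>

/* Hypothetical rate model (an assumption): bits per macroblock are
 * inversely proportional to the quantizer. */
static double bits_per_mb_at_q(int q) {
  assert(q > 0);
  return 1000.0 / q;
}

/* Blend the estimate for the base q with the estimate for the refreshed
 * segment's q, weighted by the fraction of macroblocks in that segment. */
static double estimate_frame_bits(int num_mbs, double refresh_fraction,
                                  int base_q, int segment_delta_q) {
  int segment_q = base_q + segment_delta_q;
  assert(num_mbs >= 0 && base_q > 0);
  assert(refresh_fraction >= 0.0 && refresh_fraction <= 1.0);
  if (segment_q < 1) segment_q = 1; /* keep the toy model well defined */
  return num_mbs * ((1.0 - refresh_fraction) * bits_per_mb_at_q(base_q) +
                    refresh_fraction * bits_per_mb_at_q(segment_q));
}
```

With a negative delta-q on the refreshed segment, the blended estimate is higher than the base-q estimate alone, which is the effect a rate controller would otherwise miss.
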
  6. 27 Nov, 2014 1 commit
    • Increase strength of AQ1. · 0d3d6e0e
      Paul Wilkins authored
      This patch greatly increases the strength of AQ1.
      Visual tests show strong gains on many clips but there is a big
      hit on PSNR.
      SSIM is more mixed with some winners and losers.
      Change-Id: Idaa5d3b41d8576096bfa000b62bc531c3d8bf6a1
  7. 26 Nov, 2014 2 commits
  8. 25 Nov, 2014 5 commits
    • Separate rate_correction_factor for boosted GFs · e4234b3f
      Yaowu Xu authored
      When the golden frame is boosted, the rate correction factor is not
      correlated well with other inter frames even in CBR mode. This commit
      switches to a GF-specific rate_correction_factor when gf_cbr_boost
      is greater than 20%.
      Change-Id: I6312c1564387bcacc11f4c5e8a9cfdc781b5c3ab
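
The selection rule in this commit can be sketched as a small lookup. The struct, the field names, and the function below are hypothetical stand-ins; only the "greater than 20%" boost condition comes from the commit message.

```c
#include <assert.h>

/* Hypothetical container for the two correction factors. */
typedef struct {
  double inter_factor; /* used for ordinary inter frames */
  double gf_factor;    /* used for boosted golden frames */
} RateFactors;

/* Return the factor to use for the current frame. A golden frame only
 * gets its own factor when the CBR boost exceeds 20 percent. */
static double pick_rate_correction_factor(const RateFactors *rf,
                                          int is_golden_frame,
                                          int gf_cbr_boost_pct) {
  assert(rf != 0);
  if (is_golden_frame && gf_cbr_boost_pct > 20) return rf->gf_factor;
  return rf->inter_factor;
}
```

Keeping the two factors separate means that feedback from a heavily boosted golden frame does not distort the factor used for the inter frames that follow it.
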
    • Cosmetic change in vp9_pick_inter_mode · a04ed984
      Jingning Han authored
      Change-Id: Ic072585ebffdb36982ed7b8b9f875ca6c1c656c4
    • Adaptively adjust mode test kick-off thresholds in RTC coding · 92a7cfc8
      Jingning Han authored
      This commit allows the encoder to increase the mode test kick-off
      thresholds if the previous best mode renders all zero quantized
      coefficients, thereby saving motion search runs when possible.
      The compression performance of speed -5 and -6 is down by 0.446%
      and 0.591%, respectively. The runtime of speed -6 is improved by
      10% for many test clips.
      vidyo1, 1000 kbps
      16578 b/f, 40.316 dB, 7873 ms -> 16575 b/f, 40.262 dB, 7126 ms
      nik720p, 1000 kbps
      33311 b/f, 38.651 dB, 7263 ms -> 33304 b/f, 38.629 dB, 6865 ms
      dark720p, 1000 kbps
      33331 b/f, 39.718 dB, 13596 ms -> 33324 b/f, 39.651 dB, 12000 ms
      mmoving, 1000 kbps
      33263 b/f, 40.983 dB, 7566 ms -> 33259 b/f, 40.978 dB, 7531 ms
      Change-Id: I7591617ff113e91125ec32c9b853e257fbc41d90
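
The adaptive threshold described in this commit can be sketched as follows. The function names and the doubling factor are assumptions chosen for illustration; the commit message only says that the kick-off threshold is increased when the previous best mode produced all-zero quantized coefficients.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: raise the kick-off threshold when the previous
 * best mode quantized to all zeros. The factor of 2 is an assumption. */
static int64_t mode_test_thresh(int64_t base_thresh,
                                int prev_best_all_zero_coeffs) {
  assert(base_thresh >= 0);
  return prev_best_all_zero_coeffs ? base_thresh * 2 : base_thresh;
}

/* A mode is only tested when the best cost so far is not already below
 * the (possibly raised) threshold. */
static int should_test_mode(int64_t best_rdcost, int64_t base_thresh,
                            int prev_best_all_zero_coeffs) {
  return best_rdcost >=
         mode_test_thresh(base_thresh, prev_best_all_zero_coeffs);
}
```

Raising the threshold makes it more likely that the remaining modes, and their motion searches, are skipped, which is where the runtime saving in the statistics above would come from.
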
    • vp9_reader: reorder struct members · e1f55e04
      James Zern authored
      improves locality of reference
      Change-Id: Ia4d55bb8c98e479528d88303fa35e8c74fbf939d
    • vp9_ethread: modify VP9_COMP structure · edbd61e1
      Yunqing Wang authored
      This patch modifies struct VP9_COMP. It creates a struct ThreadData
      to hold the data that needs to be copied for each thread. In the
      multi-thread case, one thread processes one tile. All threads
      share one copy of VP9_COMP
      (refer to VP9_COMP *cpi in the code),
      but each thread has its own copy of ThreadData
      (refer to ThreadData *td in the code).
      Therefore, within the scope of encode_tiles(), both cpi and td
      need to be passed as function parameters.
      In the single-thread case, the FRAME_COUNTS pointer in ThreadData
      points to "counts" in VP9_COMMON.
      Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e
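
The ownership layout described in this commit can be sketched with heavily simplified stand-in types. None of the struct or function names below are the real libvpx definitions; they only mirror the relationship between the shared encoder state, the per-thread data, and the counts pointer in the single-thread case.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-ins for FRAME_COUNTS, VP9_COMMON, ThreadData, VP9_COMP. */
typedef struct { unsigned int mode_count[4]; } FrameCountsSketch;
typedef struct { FrameCountsSketch counts; } CommonSketch;
typedef struct { FrameCountsSketch *counts; } ThreadDataSketch;
typedef struct { CommonSketch common; ThreadDataSketch td; } CompSketch;

/* Single-thread setup: the per-thread counts pointer aliases the counts
 * stored in the common state, so no copy is needed. */
static void init_single_thread(CompSketch *cpi) {
  memset(cpi, 0, sizeof(*cpi));
  cpi->td.counts = &cpi->common.counts;
}

/* Functions inside the tile loop take both the shared state (cpi) and
 * the thread-local state (td), and write only through td. */
static void count_mode(const CompSketch *cpi, ThreadDataSketch *td, int mode) {
  (void)cpi; /* shared, read-only in this sketch */
  assert(mode >= 0 && mode < 4);
  ++td->counts->mode_count[mode];
}
```

In a multi-thread setup each worker would instead point its counts at its own buffer, and the per-thread results would be accumulated afterwards; that step is not shown here.
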
  9. 24 Nov, 2014 4 commits
    • Fix a tautological assert. · 0496d114
      Alex Converse authored
      Change-Id: I90ad08823e1d038384536fa9f458caadc2c87f38
    • Remove redundant intra mode penalty from vp9_pick_inter_mode · 25be81e2
      Jingning Han authored
      The intra mode penalty is already covered by intra_cost_penalty. This
      commit removes the other intra cost threshold, given that the
      constant 50 is negligible in a normal rate-distortion cost.
      Change-Id: I9b8b7483c43b9a41741622e7057def1f7d51bb72
    • Refactored idct routines and headers · 3a8c43a4
      Peter de Rivaz authored
      This change is made in preparation for a
      subsequent patch which adds acceleration
      for the highbitdepth transform functions.
      The highbitdepth transform functions attempt
      to use 16/32bit sse instructions where possible,
      but fall back to using the C implementations if
      potential overflow is detected.  For this reason
      the dct routines are made global so they can be
      called from the acceleration functions in the
      subsequent patch.
      Change-Id: Ia921f191bf6936ccba4f13e8461624b120c1f665
      (cherry picked from commit 454342d4e77dbb67f4a3c10f97a57a6fcb46d9a0)
    • Key frame non-RD mode decision process · 2fbdfd2c
      Jingning Han authored
      This commit adds a non-RD coding mode decision process for key
      frame coding. It can be optionally turned on in speed -6 and above.
      Change-Id: I0847258b392877a0210b4768bef88ebc9ad009b5
  10. 21 Nov, 2014 4 commits
    • Only allow for cyclic refresh (aq=3 mode) for base layer. · 53c3f2ca
      Marco authored
      The condition already existed for the temporal case; add it for the spatial case as well.
      Issue: https://code.google.com/p/webm/issues/detail?id=878.
      Change-Id: I38339207f9a94924f5568a081eabe64f867a686d
    • Fix some minor nits. · ea494c0e
      Paul Wilkins authored
      Change-Id: Ib8810d431fa20a2c78e0caaa28eb2c99903e60fb
    • Rework forward txfm/quantization skip system in RTC coding mode · 7428cebe
      Jingning Han authored
      This commit allows a more aggressive decision to skip the forward
      transform and quantization for the luma component in RTC coding mode.
      The chroma components still go through the normal coding
      routine, since they are not included in the non-RD mode search.
      It reduces the runtime cost by 2% - 10%. In speed -6,
      vidyo1 1000 kbps
      16576 b/f, 40.281 dB, 8402 ms -> 16576 b/f, 40.323 dB, 7764 ms
      nik720p 1000 kbps
      33337 b/f, 38.622 dB, 7473 ms -> 33299 b/f, 38.660 dB, 7314 ms
      dark720p 1000 kbps
      33330 b/f, 39.785 dB, 13505 ms -> 33325 b/f, 39.714 dB, 13105 ms
      The compression performance of speed -6 is improved by 0.44% in
      PSNR and 1.31% in SSIM.
      Change-Id: Iae9e3738de6255babea734e5897f29118bebc6d7
    • Remove rate component adjustment for AQ1 · f5209d7e
      Paul Wilkins authored
      In AQ1 a rate adjustment was applied for blocks coded with a
      deltaq. This tends to skew the partition selection and cause
      rate overshoot.
      For example, consider a 64x64 super block where some but not all
      sub blocks are in a low q segment and some are in a high q segment.
      The choice of Q when considering large partition and transform sizes
      is defined by the lowest sub block segment id (currently this implies the
      lowest Q). If some parts of the larger partition are very hard this will
      cause a high rate component.
      The correct behavior here is for the rd code to discard the large partition
      choice and break down to sub blocks where some have low and some
      have high Q.  However, the rate correction factor above masks the high
      cost of coding at a larger partition size.
      Change-Id: Ie077edd0b1b43c094898f481df772ea280b35960
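
The rule that a large partition inherits the lowest segment id among its sub blocks can be sketched as a simple minimum. The function name and the flat array of per-sub-block ids are assumptions for illustration, not the libvpx data layout.

```c
#include <assert.h>

/* Hypothetical helper: the segment id used for a large partition is the
 * lowest id found among the sub blocks it covers. Per the commit
 * message, the lowest id currently implies the lowest Q. */
static int partition_segment_id(const int *sub_block_segment_ids, int n) {
  int i;
  int min_id;
  assert(sub_block_segment_ids != 0 && n > 0);
  min_id = sub_block_segment_ids[0];
  for (i = 1; i < n; ++i) {
    if (sub_block_segment_ids[i] < min_id) min_id = sub_block_segment_ids[i];
  }
  return min_id;
}
```

For a 64x64 super block whose sub blocks carry ids 2, 0, 3 and 1, the whole block would be evaluated with segment 0, so any hard regions inside it are costed at the lowest Q. That is the high rate component the RD search is expected to react to by splitting the block.
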