1. 06 Dec, 2014 1 commit
  2. 05 Dec, 2014 3 commits
    • Remove redundant vp9_zero in choose_partitioning · 9d88b308
      Jingning Han authored
      It makes overall speed -6 encoding about 2% faster with no change
      in compression performance.
      
      Change-Id: I680a967b421caa2c5a5cdb821311c4726a2df45a
    • Enable conditional skip path in rd_pick_intra_sby_mode · 74ded486
      Jingning Han authored
      These speed-up features for key frame coding are only turned on
      in the settings of hybrid non-RD and RD mode decision. It provides
      about 20% speed-up to the hybrid key frame coding at the expense
      of some compression performance loss. For vidyo1, the key frame
      coding statistics change as follows:
      9838F, 35.020 dB, 61677 us -> 9920F, 34.834 dB, 47556 us
      
      Overall rtc set compression performance is down by 0.257%.
      
      Change-Id: I0025447fda26bb7855e982955642b5f55d71b51f
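As a quick check of the vidyo1 figures quoted in commit 74ded486, the relative changes can be computed directly. This is a minimal sketch: `percent_change` and `print_vidyo1_changes` are hypothetical names introduced here, not libvpx functions, and the first figure of each triple is referred to only by its `F` suffix, as in the commit message.

```c
#include <stdio.h>

/* Hypothetical helper: relative change from `before` to `after`, in percent.
 * Negative values mean the quantity went down. */
static double percent_change(double before, double after) {
  return (after - before) / before * 100.0;
}

/* Prints the relative changes for the vidyo1 key frame figures quoted in
 * the commit message (F figure and encode time in microseconds). */
static void print_vidyo1_changes(void) {
  printf("F figure:    %+.2f%%\n", percent_change(9838.0, 9920.0));
  printf("encode time: %+.2f%%\n", percent_change(61677.0, 47556.0));
}
```

The encode time falls by roughly 23%, which is consistent with the "about 20%" speed-up stated in the commit message, while the F figure grows by under 1%.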
    • Use hybrid RD and non-RD coding flow for key frame coding · 07711e9b
      Jingning Han authored
      When block size is below 16x16, the encoder switches from non-RD to
      RD mode for key frame coding. This largely brings back the key
      frame compression performance. For vidyo1 at 1000 kbps, the key
      frame coding statistics change as follows:
      
      9978F, 34.183 dB, 36807 us -> 9838F, 35.020 dB, 61677 us
      
      As compared to the full RD case
      7187F, 34.930 dB, 214470 us
      
      The overall rtc set coding performance (single key frame setting)
      is improved by 1.5%.
      
      Change-Id: I78a4ecf025d7b24ec911e85be94e01da05e77878
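The hybrid flow described in commit 07711e9b can be pictured as a per-block decision. The sketch below is not the libvpx implementation: the function name is hypothetical, and reading "below 16x16" as "either dimension smaller than 16" is an assumption made for illustration.

```c
#include <stdbool.h>

/* Hypothetical decision helper, not taken from libvpx.
 * Returns true when a key frame block should use the RD mode search,
 * assumed here to mean that either dimension is smaller than 16. */
static bool use_rd_for_key_frame_block(int width, int height) {
  return width < 16 || height < 16;
}
```

Under this reading, 8x8 and 16x8 blocks take the slower RD path, while 16x16 and larger blocks stay on the fast non-RD path.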
  3. 04 Dec, 2014 6 commits
  4. 03 Dec, 2014 7 commits
    • Use memset for initialization to 0 · 73caef05
      Adrian Grange authored
      Change-Id: I714ca22b5d51016bf8b035cf457616c707257641
    • Increase delta-qp for aq=3 mode, after key frame. · a047e7cd
      Marco authored
      For a few refresh periods after a key frame, use a large qp-delta
      to increase quality ramp-up.
      
      Change-Id: Ib5a150fb2dfa6bafd0d4e6b5d28dfd0724b61319
    • Fix indent in source_var_based_partition_search_method · 17176cd4
      Jingning Han authored
      Change-Id: I6e5e0571d6967b9b992966336715e35bb97f187e
    • Enable non-rd mode coding on key frame, for speed 6. · 8fd3f9a2
      Marco authored
      For key frame at speed 6: enable the non-rd mode selection in speed setting
      and use the (non-rd) variance_based partition.
      
      Adjust some logic/thresholds in variance partition selection for key frame only (no change to delta frames),
      mainly to bias to selecting smaller prediction blocks, and also set max tx size of 16x16.
      
      Loss in key frame quality (~0.6-0.7dB) compared to rd coding,
      but speeds up key frame encoding by at least 6x.
      Average PSNR/SSIM metrics over RTC clips go down by ~1-2% for speed 6.
      
      Change-Id: Ie4845e0127e876337b9c105aa37e93b286193405
    • Remove unused ONE_LOOP entry from speed feature · a8d8c0f6
      Jingning Han authored
      Change-Id: I56ead0ebc2491144c4e79e5859b05e126176702c
    • Rework coeff probability model update for rtc coding · 8fe50191
      Jingning Han authored
      This commit reworks the ONE_LOOP_REDUCED coefficient probability
      model update process. It allows model update for every coefficient
      across the spectrum at a coarser resolution, instead of performing
      a precise update only for a certain subset of probability models.
      
      The overall runtime remains nearly the same (<1% change) for speed -6.
      The compression performance is improved by 7.5% in PSNR for speed
      -5 and 4.57% for speed -6, respectively.
      
      Change-Id: Ifb17136382ee7e39a9f34ff4a4f09a753125c8d1
    • vp9: sync threads after a longjmp · 6f7ab014
      James Zern authored
      Synchronize all threads immediately as a subsequent decode call may
      cause a resize invalidating some allocations.
      
      fixes one aspect of crbug.com/437655
      
      Change-Id: Ie993b62c2756478543206ddbe43ec6268d90a470
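The idea behind commit 6f7ab014 can be sketched as follows: when a decode error unwinds via `longjmp`, the error path waits for every worker before returning, so that a later call is free to resize and reallocate. This is a simplified, single-threaded model under stated assumptions: `Worker`, `Decoder`, `worker_sync`, `decode_frame` and `decoder_decode` are hypothetical names, and a `busy` flag stands in for a real thread join.

```c
#include <setjmp.h>
#include <stdbool.h>

#define NUM_WORKERS 4

/* Hypothetical worker: `busy` stands in for a thread still running a job. */
typedef struct { bool busy; } Worker;

typedef struct {
  jmp_buf jmp;                 /* error handler target */
  Worker workers[NUM_WORKERS];
} Decoder;

/* Stand-in for joining a thread: after this the worker is idle. */
static void worker_sync(Worker *w) { w->busy = false; }

/* Simulated frame decode: launches the workers, then may hit a fatal error
 * and unwind with longjmp while the workers are still marked busy. */
static void decode_frame(Decoder *d, bool corrupt) {
  for (int i = 0; i < NUM_WORKERS; ++i) d->workers[i].busy = true;
  if (corrupt) longjmp(d->jmp, 1);
  for (int i = 0; i < NUM_WORKERS; ++i) worker_sync(&d->workers[i]);
}

/* Returns 0 on success, -1 on error. On the error path every worker is
 * synchronized immediately, before control returns to the caller. */
static int decoder_decode(Decoder *d, bool corrupt) {
  if (setjmp(d->jmp)) {
    for (int i = 0; i < NUM_WORKERS; ++i) worker_sync(&d->workers[i]);
    return -1;
  }
  decode_frame(d, corrupt);
  return 0;
}
```

Synchronizing inside the `setjmp` branch, rather than lazily on the next call, means no worker can still be touching buffers that a subsequent resize might invalidate.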
  5. 02 Dec, 2014 4 commits
    • Reinsert macro to fix issue 884. · 2c886953
      Peter de Rivaz authored
      Change 72056 unfolded some macro definitions,
      but lost some alternative behaviour required for
      high bitdepth encodes.
      This causes the encoder to crash, see issue 884.
      
      Change-Id: I8ce4d73c9fe0a3c10ccb86fba210fabc8b2f0ccc
    • Fix a warning related to VPX_EFLAG_FORCE_KF check · 02941b0d
      Deb Mukherjee authored
      Fixes a warning in chrome build.
      
      Change-Id: I8fa0fd3e7ba1aecf89e5f79ce94cd64ed6a9567c
    • Added high bitdepth sse2 transform functions · 7e40a55e
      Peter de Rivaz authored
      Also removes some spurious changes in common/vp9_blockd.h which
      were introduced by a rebase issue between nextgen and master branches.
      
      Change-Id: If359f0e9a71bca9c2ba685a87a355873536bb282
      (cherry picked from commit 005d80cd05269a299cd2f7ddbc3d4d8b791aebba)
      (cherry picked from commit 08d2f548007fd8d6fd41da8ef7fdb488b6485af3)
      (cherry picked from commit 4230c2306c194c058f56433a5275aa02a2e71d56)
    • Cyclic refresh: factor segment delta-q into rate control. · 83fd1897
      Marco Paniconi authored
      Incorporate segment delta-q into estimated bits.
      This generally improves the rate control under cyclic refresh (aq=3) mode.
      
      Change-Id: I1dc60fb230e7d08357fae18909d8ed27bf58e037
  6. 27 Nov, 2014 1 commit
    • Increase strength of AQ1. · 0d3d6e0e
      Paul Wilkins authored
      This patch greatly increases the strength of AQ1.
      
      Visual tests show strong gains on many clips but there is a big
      hit on PSNR.
      
      SSIM is more mixed with some winners and losers.
      
      Change-Id: Idaa5d3b41d8576096bfa000b62bc531c3d8bf6a1
  7. 26 Nov, 2014 2 commits
  8. 25 Nov, 2014 5 commits
    • Separate rate_correction_factor for boosted GFs · e4234b3f
      Yaowu Xu authored
      When the golden frame is boosted, its rate correction factor does not
      correlate well with that of other inter frames, even in CBR mode. This
      commit switches to a GF-specific rate_correction_factor when gf_cbr_boost
      is greater than 20%.
      
      Change-Id: I6312c1564387bcacc11f4c5e8a9cfdc781b5c3ab
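The selection rule from commit e4234b3f can be illustrated with a small sketch. The struct layout, field names and `select_rcf` are assumptions made for this example; only the 20% boost condition comes from the commit message.

```c
/* Hypothetical rate-control state; field names are illustrative only. */
typedef struct {
  double inter_rcf;      /* correction factor for ordinary inter frames */
  double gf_rcf;         /* correction factor kept for boosted golden frames */
  int gf_cbr_boost_pct;  /* golden frame boost in CBR mode, in percent */
} RateControl;

/* Returns a pointer to the factor that should be read and updated for the
 * current frame: the GF-specific one only when the boost exceeds 20%. */
static double *select_rcf(RateControl *rc, int is_golden_frame) {
  if (is_golden_frame && rc->gf_cbr_boost_pct > 20) return &rc->gf_rcf;
  return &rc->inter_rcf;
}
```

Returning a pointer lets the caller both read and later update whichever factor applies, so boosted golden frames never disturb the factor learned from ordinary inter frames.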
    • Cosmetic change in vp9_pick_inter_mode · a04ed984
      Jingning Han authored
      Change-Id: Ic072585ebffdb36982ed7b8b9f875ca6c1c656c4
    • Adaptively adjust mode test kick-off thresholds in RTC coding · 92a7cfc8
      Jingning Han authored
      This commit allows the encoder to increase the mode test kick-off
      thresholds if the previous best mode renders all zero quantized
      coefficients, thereby saving motion search runs when possible.
      The compression performance of speed -5 and -6 is down by 0.446%
      and 0.591%, respectively. The runtime of speed -6 is improved by
      10% for many test clips.
      
      vidyo1, 1000 kbps
      16578 b/f, 40.316 dB, 7873 ms -> 16575 b/f, 40.262 dB, 7126 ms
      
      nik720p, 1000 kbps
      33311 b/f, 38.651 dB, 7263 ms -> 33304 b/f, 38.629 dB, 6865 ms
      
      dark720p, 1000 kbps
      33331 b/f, 39.718 dB, 13596 ms -> 33324 b/f, 39.651 dB, 12000 ms
      
      mmoving, 1000 kbps
      33263 b/f, 40.983 dB, 7566 ms -> 33259 b/f, 40.978 dB, 7531 ms
      
      Change-Id: I7591617ff113e91125ec32c9b853e257fbc41d90
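The adaptive threshold described in commit 92a7cfc8 could look roughly like the sketch below. Everything here is an assumption for illustration: the struct, the shift-based scaling, the cap of two steps and the reset to the baseline are not taken from the libvpx source.

```c
#include <stdbool.h>

/* Hypothetical per-block state; names and the scaling step are assumptions. */
typedef struct {
  int base_thresh;   /* baseline kick-off threshold for testing a mode */
  int thresh_shift;  /* current upward adjustment, applied as a left shift */
} ModeThresh;

enum { MAX_THRESH_SHIFT = 2 };

/* Raise the threshold when the previous best mode quantized to all zeros;
 * otherwise return to the baseline. */
static void update_thresh(ModeThresh *t, bool prev_all_zero_coeffs) {
  if (prev_all_zero_coeffs) {
    if (t->thresh_shift < MAX_THRESH_SHIFT) ++t->thresh_shift;
  } else {
    t->thresh_shift = 0;
  }
}

/* A mode (and its motion search) is tested only while the best cost found
 * so far is still above the adjusted threshold. */
static bool should_test_mode(const ModeThresh *t, int best_cost_so_far) {
  return best_cost_so_far > (t->base_thresh << t->thresh_shift);
}
```

With a baseline of 100 and a best cost of 150, the mode is tested initially, skipped after one all-zero result raises the threshold to 200, and tested again once the threshold resets.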
    • vp9_reader: reorder struct members · e1f55e04
      James Zern authored
      improves locality of reference
      
      Change-Id: Ia4d55bb8c98e479528d88303fa35e8c74fbf939d
    • vp9_ethread: modify VP9_COMP structure · edbd61e1
      Yunqing Wang authored
      This patch modifies struct VP9_COMP. It creates a struct ThreadData
      to hold the data that needs to be copied for each thread. In the
      multi-thread case, one thread processes one tile. All threads
      share one copy of VP9_COMP
      (refer to VP9_COMP *cpi in the code),
      but each thread has its own copy of ThreadData
      (refer to ThreadData *td in the code).
      Therefore, within the scope of encode_tiles(), both cpi and td
      need to be passed as function parameters.
      
      In the single-thread case, the FRAME_COUNTS pointer in ThreadData
      points to "counts" in VP9_COMMON.
      
      Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e
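The split between shared and per-thread state in commit edbd61e1 can be sketched with heavily reduced stand-in structs. `SharedComp`, `FrameCounts`, the field names and the three functions are hypothetical, and giving each worker a private `local` counts buffer is an assumption about the multi-thread case; the commit message only states where the pointer points in the single-thread case.

```c
/* Illustrative stand-ins; the real libvpx structs carry far more state. */
typedef struct { int coef_count; } FrameCounts;

typedef struct {
  int base_qindex;     /* example of data shared by all threads */
  FrameCounts counts;  /* frame-level counts (in VP9_COMMON upstream) */
} SharedComp;          /* plays the role of VP9_COMP *cpi */

typedef struct {
  FrameCounts *counts; /* where this thread accumulates its statistics */
  FrameCounts local;   /* assumed private storage for worker threads */
} ThreadData;          /* plays the role of ThreadData *td */

/* Both the shared and the per-thread state are passed explicitly. */
static void encode_tile(const SharedComp *cpi, ThreadData *td, int blocks) {
  (void)cpi->base_qindex;            /* shared state is only read here */
  td->counts->coef_count += blocks;  /* per-thread state is written */
}

/* Single-thread setup: point td at the shared frame-level counts. */
static void init_single_thread(SharedComp *cpi, ThreadData *td) {
  td->counts = &cpi->counts;
}

/* Multi-thread setup: each thread counts into its own private copy. */
static void init_worker_thread(ThreadData *td) {
  td->local.coef_count = 0;
  td->counts = &td->local;
}
```

Because writes go through `td->counts`, the same `encode_tile` body serves both cases: it updates the frame-level counts directly when single-threaded and a private copy when several tiles are encoded in parallel.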
  9. 24 Nov, 2014 4 commits
    • Fix a tautological assert. · 0496d114
      Alex Converse authored
      Change-Id: I90ad08823e1d038384536fa9f458caadc2c87f38
    • Remove redundant intra mode penalty from vp9_pick_inter_mode · 25be81e2
      Jingning Han authored
      The intra mode penalty is covered by intra_cost_penalty. This
      commit removes the other intra cost threshold, given that the
      constant 50 is negligible in normal rate-distortion cost.
      
      Change-Id: I9b8b7483c43b9a41741622e7057def1f7d51bb72
    • Refactored idct routines and headers · 3a8c43a4
      Peter de Rivaz authored
      This change is made in preparation for a
      subsequent patch which adds acceleration
      for the highbitdepth transform functions.
      
      The highbitdepth transform functions attempt
      to use 16/32-bit SSE instructions where possible,
      but fall back to using the C implementations if
      potential overflow is detected. For this reason
      the dct routines are made global so they can be
      called from the acceleration functions in the
      subsequent patch.
      
      Change-Id: Ia921f191bf6936ccba4f13e8461624b120c1f665
      (cherry picked from commit 454342d4e77dbb67f4a3c10f97a57a6fcb46d9a0)
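The fallback pattern mentioned in commit 3a8c43a4 can be sketched in portable C. This is not the libvpx transform code: a multiply-by-4 stands in for the transform, `scale_fast16` stands in for an SSE path working on 16-bit lanes, and all function names are hypothetical.

```c
#include <stdint.h>

/* Reference path: 32-bit arithmetic, safe for any input in this example. */
static void scale_c(const int32_t *in, int32_t *out, int n) {
  for (int i = 0; i < n; ++i) out[i] = in[i] * 4;
}

/* Stand-in for a SIMD path that works on 16-bit lanes. A real version
 * would use SSE2 intrinsics; plain C is used here for portability. */
static void scale_fast16(const int32_t *in, int32_t *out, int n) {
  for (int i = 0; i < n; ++i) out[i] = (int16_t)((int16_t)in[i] * 4);
}

/* Returns 1 if every input times 4 still fits in a signed 16-bit lane. */
static int fits_16bit(const int32_t *in, int n) {
  for (int i = 0; i < n; ++i)
    if (in[i] > INT16_MAX / 4 || in[i] < -(INT16_MAX / 4)) return 0;
  return 1;
}

/* Dispatcher: take the fast path only when no overflow is possible,
 * otherwise call the C implementation. Returns 1 if the fast path ran. */
static int scale_dispatch(const int32_t *in, int32_t *out, int n) {
  if (fits_16bit(in, n)) {
    scale_fast16(in, out, n);
    return 1;
  }
  scale_c(in, out, n);
  return 0;
}
```

The dispatcher needs to call the C routine directly, which is why the reference function must be visible to the accelerated code rather than private to one translation unit.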
    • Key frame non-RD mode decision process · 2fbdfd2c
      Jingning Han authored
      This commit adds a non-RD coding mode decision process for key
      frame coding. It can be optionally turned on in speed -6 and above.
      
      Change-Id: I0847258b392877a0210b4768bef88ebc9ad009b5
  10. 21 Nov, 2014 7 commits
    • Only allow for cyclic refresh (aq=3 mode) for base layer. · 53c3f2ca
      Marco authored
      The condition already existed for the temporal case; this adds it for the spatial case as well.
      Issue: https://code.google.com/p/webm/issues/detail?id=878.
      
      Change-Id: I38339207f9a94924f5568a081eabe64f867a686d
    • Fix some minor nits. · ea494c0e
      Paul Wilkins authored
      Change-Id: Ib8810d431fa20a2c78e0caaa28eb2c99903e60fb
    • Rework forward txfm/quantization skip system in RTC coding mode · 7428cebe
      Jingning Han authored
      This commit allows a more aggressive decision to skip the forward
      transform and quantization for the luma component in RTC coding mode.
      The chroma components still go through the normal coding
      routine, since they are not included in the non-RD mode search
      process.
      
      It reduces the runtime cost by 2% - 10%. In speed -6,
      vidyo1 1000 kbps
      16576 b/f, 40.281 dB, 8402 ms -> 16576 b/f, 40.323 dB, 7764 ms
      
      nik720p 1000 kbps
      33337 b/f, 38.622 dB, 7473 ms -> 33299 b/f, 38.660 dB, 7314 ms
      
      dark720p 1000 kbps
      33330 b/f, 39.785 dB, 13505 ms -> 33325 b/f, 39.714 dB, 13105 ms
      
      The compression performance of speed -6 is improved by 0.44% in
      PSNR and 1.31% in SSIM.
      
      Change-Id: Iae9e3738de6255babea734e5897f29118bebc6d7
    • Remove rate component adjustment for AQ1 · f5209d7e
      Paul Wilkins authored
      In AQ1 a rate adjustment was applied for blocks coded with a
      deltaq. This tends to skew the partition selection and cause
      rate overshoot.
      
      For example, consider a 64x64 super block where some but not all
      sub blocks are in a low q segment and some are in a high q segment.
      The choice of Q when considering large partition and transform sizes
      is defined by the lowest sub block segment id (currently this implies the
      lowest Q). If some parts of the larger partition are very hard this will
      cause a high rate component.
      
      The correct behavior here is for the rd code to discard the large partition
      choice and break down to sub blocks where some have low and some
      have high Q. However, the rate correction factor above masks the high
      cost of coding at a larger partition size.
      
      Change-Id: Ie077edd0b1b43c094898f481df772ea280b35960
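The rule quoted in commit f5209d7e, where a large partition takes the lowest segment id found among its sub blocks, can be written as a short helper. `partition_segment_id` is a hypothetical name and the flat array of ids is an assumed representation, not the libvpx segmentation map API.

```c
/* Hypothetical helper: segment id that governs Q for a large partition,
 * taken as the lowest id among the sub blocks it covers (n must be >= 1). */
static int partition_segment_id(const int *sub_block_segment_ids, int n) {
  int min_id = sub_block_segment_ids[0];
  for (int i = 1; i < n; ++i) {
    if (sub_block_segment_ids[i] < min_id) min_id = sub_block_segment_ids[i];
  }
  return min_id;
}
```

For a 64x64 super block whose four sub blocks carry ids 2, 0, 3 and 1, the helper returns 0, so the whole partition is evaluated at the Q of segment 0 even though most of it belongs to other segments.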
    • Switch AQ1 segment basis from q ratio to rate ratio. · 1663eff7
      Paul Wilkins authored
      When defining the Q deltas for segments in AQ1, use a rate
      ratio rather than a q ratio.
      
      Change-Id: Id31a74fcf2b7e55437e42a51c21b3cbcb57028d4
    • Add adaptive midpoint for AQ1. · fc47c5d6
      Paul Wilkins authored
      Make the midpoint variance used in AQ mode 1 segmentation
      depend on the overall complexity of the frame in two pass.
      
      Change-Id: I452814ec57f7a32352e41bb250e78066abe952dd
    • Allow DC/H/V/TM on screen content. · bc1b3d84
      Alex Converse authored
      6.3% better compression
      less than 1% compression time increase
      
      Change-Id: Ie83c059436e54c09de9e7c87e06e0a6d40dc38fe