1. 06 Dec, 2014 1 commit
  2. 05 Dec, 2014 1 commit
    • Enable conditional skip path in rd_pick_intra_sby_mode · 74ded486
      Jingning Han authored
      These speed-up features for key frame coding are only turned on
      in the hybrid non-RD and RD mode decision settings. They provide
      about 20% speed-up for hybrid key frame coding at the expense
      of some compression performance loss. For vidyo1, the key frame
      coding statistics change as follows:
      9838F, 35.020 dB, 61677 us -> 9920F, 34.834 dB, 47556 us
      
      Overall rtc set compression performance is down by 0.257%.
      
      Change-Id: I0025447fda26bb7855e982955642b5f55d71b51f
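The before/after figures quoted in these commit messages can be turned into percentages with a couple of small helpers. This is only an illustrative sketch; the function names are hypothetical and are not part of libvpx.

```c
/* Hypothetical helpers for reading the "before -> after" statistics
 * quoted in the commit messages. Not part of libvpx. */

/* Percentage of encode time saved when going from `before` to `after`
 * (both in the same unit). Positive means the new code is faster. */
static double percent_time_saved(double before, double after) {
  return 100.0 * (before - after) / before;
}

/* Relative change of a size or rate figure, in percent.
 * Positive means the figure grew. */
static double percent_change(double before, double after) {
  return 100.0 * (after - before) / before;
}
```

Applied to the vidyo1 numbers above, `percent_time_saved(61677, 47556)` comes out near 22.9, in line with the "about 20%" stated in the message, and `percent_change(9838, 9920)` is roughly 0.8.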
  3. 25 Nov, 2014 1 commit
    • vp9_ethread: modify VP9_COMP structure · edbd61e1
      Yunqing Wang authored
      This patch modifies struct VP9_COMP. It creates a struct ThreadData
      to hold the data that needs to be copied for each thread. In the
      multi-thread case, one thread processes one tile. All threads
      share one copy of VP9_COMP
      (refer to VP9_COMP *cpi in the code),
      but each thread has its own copy of ThreadData
      (refer to ThreadData *td in the code).
      Therefore, within the scope of encode_tiles(), both cpi and td
      need to be passed as function parameters.
      
      In the single-thread case, the FRAME_COUNTS pointer in ThreadData
      points to "counts" in VP9_COMMON.
      
      Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e
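The shared/per-thread split described above can be sketched roughly as follows. All struct and field names here are heavily reduced, hypothetical stand-ins for VP9_COMMON, VP9_COMP, ThreadData and FRAME_COUNTS. In particular, where a worker thread keeps its own counts and how they are merged afterwards are assumptions made for illustration, not details taken from the patch.

```c
/* Hypothetical, reduced stand-ins for the real libvpx structures. */
typedef struct { unsigned int tokens; } FrameCountsSketch;   /* FRAME_COUNTS */
typedef struct { FrameCountsSketch counts; } CommonSketch;   /* VP9_COMMON   */
typedef struct { CommonSketch common; } CompSketch;          /* VP9_COMP     */

typedef struct {
  FrameCountsSketch *counts;       /* where this thread accumulates counts */
  FrameCountsSketch local_counts;  /* assumed per-thread storage           */
} ThreadDataSketch;                /* ThreadData                           */

/* Single-thread case: point at the shared counts in the common struct.
 * Multi-thread case (assumption): point at thread-local storage. */
static void init_thread_data(CompSketch *cpi, ThreadDataSketch *td,
                             int single_thread) {
  td->local_counts.tokens = 0;
  td->counts = single_thread ? &cpi->common.counts : &td->local_counts;
}

/* Both cpi (shared, read-only here) and td (per thread) are passed in,
 * mirroring the parameter convention described in the commit message. */
static void encode_tile_sketch(const CompSketch *cpi, ThreadDataSketch *td,
                               unsigned int tokens_in_tile) {
  (void)cpi;
  td->counts->tokens += tokens_in_tile;
}

/* Assumed merge step once all worker threads have finished. */
static void merge_counts(CompSketch *cpi, const ThreadDataSketch *td) {
  if (td->counts != &cpi->common.counts)
    cpi->common.counts.tokens += td->local_counts.tokens;
}
```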
  4. 20 Nov, 2014 3 commits
  5. 14 Nov, 2014 1 commit
  6. 30 Oct, 2014 1 commit
  7. 29 Oct, 2014 2 commits
    • Enable mode search threshold update in non-RD coding mode · 9349a28e
      Jingning Han authored
      Adaptively adjust the mode thresholds after each mode search round
      to skip checking modes that are less likely to be selected. Local
      tests indicate a 5% - 10% speed-up at speed -5 and -6. The average
      coding performance change is -1.055%.
      
      speed -5
      vidyo1 720p 1000 kbps
      16533 b/f, 40.851 dB, 12607 ms -> 16556 b/f, 40.796 dB, 11831 ms
      
      nik 720p 1000 kbps
      33229 b/f, 39.127 dB, 11468 ms -> 33235 b/f, 39.131 dB, 10919 ms
      
      speed -6
      vidyo1 720p 1000 kbps
      16549 b/f, 40.268 dB, 10138 ms -> 16538 b/f, 40.212 dB, 8456 ms
      
      nik 720p 1000 kbps
      33271 b/f, 38.433 dB,  7886 ms -> 33279 b/f, 38.416 dB, 7843 ms
      
      Change-Id: I2c2963f1ce4ed9c1cf233b5b2c880b682e1c1e8b
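A minimal sketch of what adaptive mode thresholds can look like is given below. The constants, the decay and growth rules, and the function names are all assumptions chosen for illustration; the commit message does not spell out the actual update rule.

```c
/* Hypothetical constants; the real encoder's values may differ. */
#define SK_NUM_MODES 4
#define SK_FACT_INC 1
#define SK_FACT_MAX 64

/* After a search round, lower the factor of the winning mode (so it is
 * checked more readily next time) and raise the factor of the others. */
static void update_mode_factors(int factors[SK_NUM_MODES], int best_mode) {
  for (int m = 0; m < SK_NUM_MODES; ++m) {
    if (m == best_mode) {
      factors[m] -= factors[m] >> 3;
    } else {
      factors[m] += SK_FACT_INC;
      if (factors[m] > SK_FACT_MAX) factors[m] = SK_FACT_MAX;
    }
  }
}

/* A mode is skipped when the best cost found so far is already below
 * the mode's scaled threshold (factor 32 means "unscaled"). */
static int should_skip_mode(int best_cost_so_far, int base_thresh,
                            int factor) {
  const long long scaled = ((long long)base_thresh * factor) >> 5;
  return best_cost_so_far < scaled;
}
```

A mode that keeps losing sees its factor grow, so its threshold rises and it is skipped more often, which is where the speed-up would come from.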
    • Combine vp9_encode_block_intra and encode_block_intra · 0928da3b
      Hui Su authored
      Change-Id: I79091fb677b64892ecca2fb466fde14602d8cdfc
  8. 28 Oct, 2014 1 commit
  9. 24 Oct, 2014 1 commit
    • Tile based adaptive mode search in RD loop · eee201c2
      Jingning Han authored
      Make the spatially adaptive mode search in the rate-distortion
      optimization loop inter-tile independent. Experiments suggest that
      this does not significantly change the coding statistics.
      
      Single tile, speed 3:
      pedestrian_area 1080p 1500 kbps
      59192 b/f, 40.611 dB, 101689 ms
      
      blue_sky 1080p 1500 kbps
      58505 b/f, 36.347 dB, 62458 ms
      
      mobile_cal 720p 1000 kbps
      13335 b/f, 35.646 dB, 45655 ms
      
      as compared to 4 column tiles, speed 3:
      pedestrian_area 1080p 1500 kbps
      59329 b/f, 40.597 dB, 101917 ms
      
      blue_sky 1080p 1500 kbps
      58712 b/f, 36.320 dB, 62693 ms
      
      mobile_cal 720p 1000 kbps
      13191 b/f, 35.485 dB, 45319 ms
      
      Change-Id: I35c6e1e0a859fece8f4145dec28623cbc6a12325
  10. 22 Oct, 2014 1 commit
    • vp9_ethread: allocate frame contexts outside VP9_COMMON struct · 7c7e4d4e
      Yunqing Wang authored
      This patch allocates frame contexts outside VP9_COMMON. This allows
      multiple threads to share the same copy of the frame contexts and
      reduces the overhead. It also guarantees the correct update of
      these contexts during bitstream packing. This patch doesn't change
      the encoding result.
      
      Change-Id: Ic181a2460b891d1d587278a6d02d8057b9dbd353
  11. 17 Oct, 2014 1 commit
    • Reset rate cost value in rd mode search · 94ecfa32
      Jingning Han authored
      When early termination is triggered, properly reset the rate cost
      to an invalid value to avoid a potential IOC issue.
      
      Change-Id: I3444390be2e49a34bb02cf8a74c33d5dbd96d88d
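The idea of resetting a cost to an explicit "invalid" marker on early termination, and checking for that marker before doing arithmetic on it, might look like the sketch below. The struct, the sentinel choice and the simplified cost formula are assumptions for illustration only.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical cost record; INT_MAX / INT64_MAX mark "no valid result". */
typedef struct {
  int rate;
  int64_t dist;
} RdCostSketch;

static void rd_cost_invalidate(RdCostSketch *c) {
  c->rate = INT_MAX;
  c->dist = INT64_MAX;
}

/* Returns 1 and fills `out` when the candidate beats `best_rd`.
 * On early termination the output is reset to the invalid marker, so
 * stale values from an earlier call can never leak to the caller. */
static int evaluate_mode(int rate, int64_t dist, int64_t best_rd,
                         RdCostSketch *out) {
  const int64_t rd = (int64_t)rate + dist; /* simplified cost, not RDCOST */
  if (rd >= best_rd) {
    rd_cost_invalidate(out);
    return 0;
  }
  out->rate = rate;
  out->dist = dist;
  return 1;
}

/* Callers test for the marker before adding, which avoids the signed
 * overflow that `total + INT_MAX` would trigger. */
static int accumulate_rate(int total, const RdCostSketch *c) {
  return c->rate == INT_MAX ? total : total + c->rate;
}
```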
  12. 16 Oct, 2014 1 commit
    • Fix an ioc issue in super_block_uvrd · ed100c0b
      Jingning Han authored
      This commit fixes an IOC issue that occurs when the cumulative
      variables are not in effective use. The fix discards these
      redundant additions.
      
      Change-Id: Idbac5bfb989c0cedc5f8a323effce938519b2457
  13. 14 Oct, 2014 1 commit
  14. 13 Oct, 2014 3 commits
  15. 12 Oct, 2014 1 commit
  16. 09 Oct, 2014 3 commits
    • Rename highbitdepth functions to use highbd prefix · 1929c9b3
      Deb Mukherjee authored
      Uses highbd_ prefix convention consistently.
      
      Change-Id: I58f7f799a7ff8e32701bcd71c955bcf1cdd4581e
    • Subpel search cleanups and enhancements · d78dbff0
      Deb Mukherjee authored
      - Some fixes to surface fit.
      - Returns variance function as cost rather than sad in the
        pattern search and diamond search functions. Only
        vp9_pattern_search_sad function used in bigdia search
        uses sad as integer 1-away costs.
      - Deploys SUBPEL_TREE_PRUNED_MORE for speed 4+.
      
      Results:
      derf [Speed 3]: About +0.036% in coding efficiency without any
      discernible speed loss.
      derf [Speed 4]: About 2-3% faster at -0.199% loss in coding efficiency.
      derf [Speed 5]: About 3-4% faster at -0.149% loss in coding efficiency.
      
      Change-Id: I8462f94f6adb46966ca964f2bd0400977357fd63
    • Allow mode search breakout at very low prediction errors · e18edd5e
      Yunqing Wang authored
      In model_rd_for_sb function, the spatial domain SSE and variance
      are checked to see if transform coefficients are quantized to 0.
      Besides that, this patch adds another set of thresholds that are
      much more strict. These thresholds are used to conduct a partition
      block level check to measure if all its TX blocks are skippable
      for YUV planes. If it is true, x->skip is set for this partition
      block, and thus its mode search is terminated.
      
      This speeds up encoding in the very low prediction error case,
      such as screen sharing applications. This patch covers what
      rd_encode_breakout_test() does, so that function is removed.
      
      Borg test at speed 3 shows:
      For stdhd set, psnr: +0.008%, ssim: +0.014%;
      For derf set, psnr: +0.018%, ssim: +0.025%.
      No noticeable speed change.
      
      Change-Id: I4e5f15cf10016a282a68e35175ff854b28195944
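The partition-block-level check described above can be sketched as a test over all three planes. The struct, the threshold parameters and the use of `var` as an AC proxy and `sse - var` as a DC proxy are assumptions made for this illustration; the actual thresholds used by the patch are not given in the message.

```c
/* Hypothetical per-plane statistics in the spatial domain. */
typedef struct {
  unsigned int sse; /* sum of squared prediction errors */
  unsigned int var; /* variance-style measure, assumed <= sse */
} PlaneStatsSketch;

/* Returns 1 only if every plane (Y, U, V) is below both strict
 * thresholds, i.e. all of its transform blocks are expected to
 * quantize to zero. The caller would then set the skip flag and stop
 * the mode search for this partition block. */
static int all_planes_skippable(const PlaneStatsSketch stats[3],
                                unsigned int dc_thresh,
                                unsigned int ac_thresh) {
  for (int p = 0; p < 3; ++p) {
    if (stats[p].var >= ac_thresh) return 0;
    if (stats[p].sse - stats[p].var >= dc_thresh) return 0;
  }
  return 1;
}
```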
  17. 08 Oct, 2014 2 commits
  18. 07 Oct, 2014 4 commits
  19. 06 Oct, 2014 1 commit
    • Fix eobs buffer pointer mis-use · a7555158
      Jingning Han authored
      This commit fixes a buffer pointer mis-use in store_coding_context.
      The compression performance for stdhd set of speed 3 is improved by
      0.097%. It fixes issue 869.
      
      Change-Id: Idc59e22035eaf39f7133ca04174894374d647ff7
  20. 05 Oct, 2014 1 commit
    • Fix an IOC issue in vp9_rd_pick_inter_mode_sb · 085b97aa
      Jingning Han authored
      It is possible that the GOLDEN reference frame is not available, in
      which case the predicted mv will be associated with a residual
      value of INT_MAX. This commit checks this condition before the
      left shift and comparison with that of the ALTREF frame, to avoid
      an overflow issue.
      
      Change-Id: Ib98c3149dbdd016f2fe5beaafb13f67d469dd07c
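The guard described above can be illustrated with a small comparison helper. The function name, parameters and shift amount are hypothetical; only the INT_MAX sentinel and the "check before shifting" order come from the commit message.

```c
#include <limits.h>

/* Returns 1 when the GOLDEN residual, scaled by 2^shift, is still
 * smaller than the ALTREF residual. Residuals are assumed to be
 * non-negative. */
static int golden_beats_altref(int golden_residual, int altref_residual,
                               int shift) {
  /* INT_MAX is the "reference not available" marker. Shifting it left
   * as a signed int would overflow, so it is rejected first. */
  if (golden_residual == INT_MAX) return 0;
  /* Widening before the shift keeps the comparison overflow-free for
   * ordinary values as well. */
  return ((long long)golden_residual << shift) < (long long)altref_residual;
}
```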
  21. 03 Oct, 2014 2 commits
    • Rework partition search skip scheme · bb260d90
      Jingning Han authored
      This commit enables the encoder to skip split partition search if
      the bigger block size has all non-zero quantized coefficients in the
      low frequency area and the total rate cost is below a certain threshold.
      It logarithmically scales the rate threshold according to the
      current block size. For speed 3, the compression performance loss is:
      derf  -0.093%
      stdhd -0.066%
      
      Local experiments show 4% - 20% encoding speed-up for speed 3.
      blue_sky_1080p, 1500 kbps
      51051 b/f, 35.891 dB, 67236 ms ->
      50554 b/f, 35.857 dB, 59270 ms (12% speed-up)
      
      old_town_cross_720p, 1500 kbps
      14431 b/f, 36.249 dB, 57687 ms ->
      14108 b/f, 36.172 dB, 46586 ms (19% speed-up)
      
      pedestrian_area_1080p, 1500 kbps
      50812 b/f, 40.124 dB, 100439 ms ->
      50755 b/f, 40.118 dB,  96549 ms (4% speed-up)
      
      mobile_calendar_720p, 1000 kbps
      10352 b/f, 35.055 dB, 51837 ms ->
      10172 b/f, 35.003 dB, 44076 ms (15% speed-up)
      
      Change-Id: I412e34db49060775b3b89ba1738522317c3239c8
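One way to scale a rate threshold logarithmically with block size is to multiply a base value by log2 of the pixel count. The sketch below assumes exactly that; the helper names, the base threshold and the precise scaling rule are illustrative guesses, not taken from the patch.

```c
/* log2 of a power-of-two block dimension (e.g. 8 -> 3, 64 -> 6). */
static int ilog2_pow2(int v) {
  int r = 0;
  while (v > 1) {
    v >>= 1;
    ++r;
  }
  return r;
}

/* Assumed rule: threshold = base * log2(width * height). */
static long rate_breakout_thresh(long base, int width, int height) {
  return base * (ilog2_pow2(width) + ilog2_pow2(height));
}

/* Skip the split partition search when the non-zero coefficients are
 * confined to the low-frequency area and the rate is under threshold. */
static int skip_split_search(int nonzero_only_in_low_freq, long rate,
                             long base, int width, int height) {
  return nonzero_only_in_low_freq &&
         rate < rate_breakout_thresh(base, width, height);
}
```

Under this rule a 64x64 block gets twice the threshold of an 8x8 block (12 versus 6 times the base), rather than the 64 times a linear-in-pixels rule would give.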
    • Prevent negative cost for highbitdepth · 431cdc33
      Deb Mukherjee authored
      Adds proper scaling for highbitdepth in a rdopt cost.
      
      Change-Id: I066694799a7f491b830945ef1c66eb202071c355
  22. 01 Oct, 2014 3 commits
    • High-bitdepth bugfixes · a160d725
      Deb Mukherjee authored
      Miscellaneous bug-fixes for high bitdepth functionality.
      With this patch, high bit-depth profiles become mostly functional,
      except for an intermittent assert failure issue that is being
      tracked.
      
      Change-Id: I6a7fcbdcf1e5b09842e88535f8442d2e1230748c
    • Modify block transform skipping check · e4aac6bb
      Yunqing Wang authored
      Block transform skipping was implemented based on the DCT's energy
      conservation property. This patch modifies the thresholds using zero
      bin parameters. AC and DC coefficients are checked separately to
      allow better identification of skippable blocks.
      
      Borg test at speed 3 showed:
      stdhd set: psnr gain: 0.153%, ssim gain: 0.051%;
      derf set: psnr gain: 0.023%, ssim gain: 0.036%
      
      For most test clips, the encoding speedup is 1% - 2%.
      parkrun(720p): 7.5% speedup, park_joy(1080p): 3.5% speedup.
      
      Change-Id: If28eb81113a077414f5ca7b021c14f9069b373bb
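For an orthonormal DCT, the sum of squared coefficients equals the sum of squared residual samples, and the squared DC coefficient equals (sum of samples)^2 / N. That lets the DC and AC energy be estimated in the spatial domain without running the transform. The sketch below shows this split; the names and thresholds are hypothetical, and how the real encoder derives its thresholds from the zero bin is not shown.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint64_t dc_energy; /* (sum of samples)^2 / N, rounded down */
  uint64_t ac_energy; /* total energy minus DC energy */
} EnergySplitSketch;

/* Splits the residual energy of an n-sample block into DC and AC parts
 * using only spatial-domain sums. */
static EnergySplitSketch split_energy(const int16_t *residual, size_t n) {
  int64_t sum = 0;
  uint64_t sse = 0;
  for (size_t i = 0; i < n; ++i) {
    sum += residual[i];
    sse += (uint64_t)((int32_t)residual[i] * residual[i]);
  }
  EnergySplitSketch e;
  e.dc_energy = (uint64_t)(sum * sum) / n;
  e.ac_energy = sse - e.dc_energy; /* never negative: sse >= sum^2 / n */
  return e;
}

/* The block is treated as skippable only if both parts are small. */
static int block_is_skippable(const int16_t *residual, size_t n,
                              uint64_t dc_thresh, uint64_t ac_thresh) {
  const EnergySplitSketch e = split_energy(residual, n);
  return e.dc_energy < dc_thresh && e.ac_energy < ac_thresh;
}
```

Checking the two parts separately matters because a flat offset (all DC) and a high-frequency pattern (all AC) with the same total energy can quantize very differently.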
    • Conditionally skip reference frame check · 891793a5
      Jingning Han authored
      For regular inter frames, if the distance from GOLDEN_FRAME is larger
      than 2 and if the predicted motion vector of LAST_FRAME gives lower
      sse than that of GOLDEN_FRAME, skip the GOLDEN_FRAME mode checking in
      the rate-distortion optimization. It provides about 5% speed-up at
      the expense of -0.137% and -0.230% performance changes for speed 3.
      Local experiment results:
      
      pedestrian 1080p 2000 kbps
      66712 b/f, 40.908 dB, 113688 ms ->
      66768 b/f, 40.911 dB, 108752 ms
      
      blue_sky 1080p 2000 kbps
      51054 b/f, 35.894 dB, 70406 ms ->
      51051 b/f, 35.891 dB, 67236 ms
      
      old_town_cross 720p 1500 kbps
      14412 b/f, 36.252 dB, 60690 ms ->
      14431 b/f, 36.249 dB, 57346 ms
      
      Change-Id: Idfcafe7f63da7a4896602fc60bd7093f0f0d82ca
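The skip condition stated above translates almost directly into a predicate. The function and parameter names below are hypothetical; only the two conditions (distance larger than 2, lower sse for LAST_FRAME) come from the commit message.

```c
/* Returns 1 when the GOLDEN_FRAME modes can be skipped in the RD search
 * for a regular inter frame. */
static int skip_golden_check(int frames_since_golden,
                             unsigned int last_pred_mv_sse,
                             unsigned int golden_pred_mv_sse) {
  return frames_since_golden > 2 && last_pred_mv_sse < golden_pred_mv_sse;
}
```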
  23. 26 Sep, 2014 1 commit
    • Skip certain ALTREF inter modes in ARF coding · ccdb518f
      Jingning Han authored
      This commit enables the encoder to skip checking ALTREF inter modes
      in ARF coding, if the predicted motion vectors suggest that the
      GOLDEN_FRAME provides higher prediction accuracy than ALTREF_FRAME.
      
      It improves the speed 3 encoding speed by about 5%, at the expense
      of compression performance changes of -0.041% and -0.225% for derf
      and stdhd, respectively.
      
      pedestrian_area 1080p 2000 kbps
      66705 b/f, 40.909 dB, 118738 ms ->
      66732 b/f, 40.908 dB, 113688 ms
      
      old_town_cross 720p 1500 kbps
      14427 b/f, 36.256 dB, 62746 ms ->
      14412 b/f, 36.252 dB, 60690 ms
      
      blue_sky 1080p 1500 kbps
      51026 b/f, 35.897 dB, 73310 ms ->
      50921 b/f, 35.893 dB, 70406 ms
      
      bus CIF 1000 kbps
      21301 b/f, 34.841 dB, 7326 ms ->
      21248 b/f, 34.837 dB, 7196 ms
      
      Change-Id: I76cf88b4d655e1ee3c0cb03c8a5745493040e8d2
  24. 25 Sep, 2014 2 commits
  25. 23 Sep, 2014 1 commit
    • Adapt mode based rd_threshold for similar block size · 4a101310
      Yaowu Xu authored
      The rd_thresholds are adaptively changed based on the best mode tested.
      Previously they were only changed for the same block size; this commit
      applies the adaptation to similar block sizes too. The commit also makes
      minor adjustments and code cleanups.
      
      The impact on encoding time for _ped:
      118089 ms -> 111927 ms
      
      The impact on compression:
      derf:  -0.339%
      stdhd: -0.303%
      
      Change-Id: I8817fed1102350497f2ec631849e43f753878e5d
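Extending the threshold adaptation from one block size to its neighbours could be sketched as below. The table dimensions, the window of one size on either side, and the decay and growth constants are all assumptions for illustration; the message does not define what counts as a "similar" block size.

```c
/* Hypothetical table dimensions and constants. */
#define SK_NUM_BSIZES 13
#define SK_NUM_RD_MODES 4
#define SK_RD_INC 1
#define SK_RD_MAX 64

/* Applies the per-mode update to `bsize` and to the block sizes
 * directly adjacent to it in the size ordering, clamped to the table. */
static void update_rd_thresh_fact(int fact[SK_NUM_BSIZES][SK_NUM_RD_MODES],
                                  int bsize, int best_mode) {
  const int lo = bsize - 1 < 0 ? 0 : bsize - 1;
  const int hi = bsize + 1 >= SK_NUM_BSIZES ? SK_NUM_BSIZES - 1 : bsize + 1;
  for (int bs = lo; bs <= hi; ++bs) {
    for (int m = 0; m < SK_NUM_RD_MODES; ++m) {
      int *const f = &fact[bs][m];
      if (m == best_mode) {
        *f -= *f >> 4;             /* winner: lower its threshold factor */
      } else if (*f + SK_RD_INC <= SK_RD_MAX) {
        *f += SK_RD_INC;           /* others: raise it, up to the cap */
      }
    }
  }
}
```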