1. 05 Sep, 2014 1 commit
    • Correct the mode decisions in special cases · 1dd9a639
      Yunqing Wang authored
      The rate costs calculated for inter modes are not precise in some
      cases, which causes NEWMV to be chosen instead of NEARESTMV, NEARMV,
      or ZEROMV. This patch adds checks for these cases and corrects
      the mode decisions.
      
      Borg tests at speed 3 showed:
      1. stdhd set: 0.102% PSNR gain and 0.088% SSIM gain.
      2. derf set:  0.147% PSNR gain and 0.132% SSIM gain.
      No speed change.
      
      Change-Id: I35d17684b89ad4734fb610942d707899146426db
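The commit message does not spell out the added checks. As a minimal sketch of one check of this kind, assuming that a NEWMV search result which coincides with an inferred candidate should be signaled with the inferred mode instead (the types, function names, and priority order below are hypothetical, not the libvpx code):

```c
/* Hypothetical, simplified types; not the libvpx definitions. */
typedef struct { int row, col; } Mv;
typedef enum { NEARESTMV, NEARMV, ZEROMV, NEWMV } InterMode;

static int mv_equal(Mv a, Mv b) { return a.row == b.row && a.col == b.col; }

/* If the motion vector found by the NEWMV search equals a vector that an
 * inferred mode would produce, return that inferred mode: it yields the
 * same prediction without coding a motion vector. */
static InterMode canonical_inter_mode(Mv new_mv, Mv nearest_mv, Mv near_mv) {
  const Mv zero_mv = { 0, 0 };
  if (mv_equal(new_mv, nearest_mv)) return NEARESTMV;
  if (mv_equal(new_mv, near_mv)) return NEARMV;
  if (mv_equal(new_mv, zero_mv)) return ZEROMV;
  return NEWMV;
}
```

An encoder would apply such a check before comparing rate costs, so that an imprecise NEWMV rate estimate cannot win over an equivalent inferred mode.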
  2. 03 Sep, 2014 3 commits
    • Speed up compound inter prediction mode check · d62d804e
      Jingning Han authored
      This commit allows the encoder to store the outcomes of single
      reference frame modes and compare them to decide whether the inter
      prediction filter, forward transform, and quantization can be
      skipped.
      
      The compression performance of speed 3 is down by:
      derf  -0.364%
      stdhd -0.198%
      
      For test sequences, the speed 3 runtime is reduced:
      highway CIF 100 kbps, 51976 ms -> 45033 ms, 13% speed-up
      stockholm 720p 1000 kbps, 71826 ms -> 67838 ms, 5.5% speed-up
      pedestrian 1080p 2000 kbps, 154924 ms -> 150702 ms, 2.6% speed-up
      
      Change-Id: I5aa26f918d2b4b5197a2c0afa2779319f1c88e44
    • Merge two similar functions into one · e759d957
      Yaowu Xu authored
      intra_super_block_yrd() and inter_super_block_yrd() are largely the
      same; this commit merges them into one to reduce code duplication.
      
      Change-Id: I64d7042a5b099345627cf55663010c185b25ec37
    • Skip comp inter mode test in RD loop with same frame bias signs · 801fef26
      Jingning Han authored
      This commit allows the encoder to skip the check on compound inter
      modes in the rate-distortion optimization loop if the reference
      frame bias signs are the same.
      
      Change-Id: Ib753e6bb11cbdd338aee69dbe2b649671f75a6b0
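The skip condition described above can be sketched as a single comparison. The enum, the array layout, and the function name are assumptions made for illustration; the actual condition in the encoder may be gated by a speed feature.

```c
#include <assert.h>

/* Hypothetical reference frame indices for this sketch. */
enum { INTRA_FRAME, LAST_FRAME, GOLDEN_FRAME, ALTREF_FRAME, MAX_REF_FRAMES };

/* Returns 1 when a compound mode using ref0 and ref1 can be skipped
 * because both reference frames carry the same sign bias. */
static int skip_compound_mode(const int *sign_bias, int ref0, int ref1) {
  assert(ref0 > INTRA_FRAME && ref0 < MAX_REF_FRAMES);
  assert(ref1 > INTRA_FRAME && ref1 < MAX_REF_FRAMES);
  return sign_bias[ref0] == sign_bias[ref1];
}
```

With a sign bias of 1 on the ARF only, the LAST+GOLDEN pair is skipped while LAST+ALTREF is still tested.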
  3. 02 Sep, 2014 1 commit
    • Skip comp inter mode tests for arf coding · 33176fef
      Jingning Han authored
      This commit skips the compound inter mode prediction check in the
      rate-distortion optimization loop for ARF coding. It reduces the
      runtime for certain test clips at speed 3, with no change in
      compression performance:
      
      bus CIF 1000 kbps, 8260 ms -> 8090 ms, 1.8% speed-up
      stockholm 720p 1000 kbps, 74453 ms -> 71826 ms, 2.9% speed-up
      
      No visible speed-up for pedestrian area 1080p at 2000 kbps.
      
      Change-Id: Ic68aa56837159b726563b784e2e3729e846465ad
  4. 29 Aug, 2014 3 commits
    • Fix int64_t to unsigned int conversion warnings · 6ddf1e15
      Jingning Han authored
      Use unsigned int type to store the sse in the pixel domain. The
      precision is sufficient to handle sse of block size up to 64x64.
      The transform domain version however needs int64_t, since there is
      a transfer gain applied in the forward transformation that might
      cause unsigned int overflow.
      
      Change-Id: Ifef97c38597e426262290f35341fbb093cf0a079
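The precision claim can be checked with a short calculation. Assuming 8-bit pixels and a 32-bit unsigned int (both assumptions, as the message states neither), the worst-case pixel-domain sse of a 64x64 block is 64 * 64 * 255^2 = 266,342,400, well below 2^32. A minimal sketch with a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/* Sum of squared errors between a source block and its prediction,
 * accumulated in unsigned int. Assumes 8-bit samples. */
static unsigned int block_sse(const uint8_t *src, int src_stride,
                              const uint8_t *pred, int pred_stride,
                              int width, int height) {
  unsigned int sse = 0;
  int r, c;
  assert(width > 0 && width <= 64 && height > 0 && height <= 64);
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) {
      const int diff = src[r * src_stride + c] - pred[r * pred_stride + c];
      sse += (unsigned int)(diff * diff);  /* at most 255 * 255 = 65025 */
    }
  }
  return sse;
}
```

The transform-domain version has no such bound in this sketch, since coefficient magnitudes grow with the gain of the forward transform.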
    • Skip intra mode tests depending on inter residuals · 4282955e
      Jingning Han authored
      This commit allows the encoder to skip the intra coding mode test
      when the known inter residual is less than the source variance. It
      reduces the runtime of speed 3 for test clips:
      bus cif 1000 kbps: 8587 ms -> 8260 ms, 3.8% speed-up
      pedestrian 1080p 2000 kbps: 161381 ms -> 155241 ms, 3.7% speed-up.
      
      The compression performance is down by
      derf   -0.36%
      stdhd  -0.25%
      
      Change-Id: I75ce1e035b4da2153cb1ac14111d1a07c05a735d
    • Extend block level sse to support multiple txfm blocks · 02e6ecdc
      Jingning Han authored
      This commit extends the sse and forward transform computation flag
      to support 64x64 blocks, where there are four 32x32 2D-DCT
      blocks.
      
      Change-Id: I86a3e805dfaa0f3abd812f590520c71aa0e40473
  5. 28 Aug, 2014 2 commits
    • Early termination in encoding partition search · 4d2c3769
      Yunqing Wang authored
      In the partition search, the encoder checks all possible
      partitionings in the superblock's partition search tree.
      This patch proposes a set of criteria for partition search
      early termination, which decide whether or not to terminate
      the search in the current branch based on the "skippable"
      result of the quantized transform coefficients. The
      "skippable" information is gathered during the partition
      mode search, so no overhead calculations are introduced.
      
      This patch gives significant encoding speed gains without
      sacrificing quality.
      
      Borg test results:
      1. At speed 1,
         stdhd set: psnr: +0.074%, ssim: +0.093%;
         derf set:  psnr: -0.024%, ssim: +0.011%;
      2. At speed 2,
         stdhd set: psnr: +0.033%, ssim: +0.100%;
         derf set:  psnr: -0.062%, ssim: +0.003%;
      3. At speed 3,
         stdhd set: psnr: +0.060%, ssim: +0.190%;
         derf set:  psnr: -0.064%, ssim: -0.002%;
      4. At speed 4,
         stdhd set: psnr: +0.070%, ssim: +0.143%;
         derf set:  psnr: -0.104%, ssim: +0.039%;
      
      The speedup ranges from several percent to 60+%.
                       speed1    speed2    speed3    speed4
      (1080p, 100f):
      old_town_cross:  48.2%     23.9%     20.8%     16.5%
      park_joy:        11.4%     17.8%     29.4%     18.2%
      pedestrian_area: 10.7%      4.0%      4.2%      2.4%
      (720p, 200f):
      mobcal:          68.1%     36.3%     34.4%     17.7%
      parkrun:         15.8%     24.2%     37.1%     16.8%
      shields:         45.1%     32.8%     30.1%      9.6%
      (cif, 300f):
      bus:              3.7%     10.4%     14.0%      7.9%
      deadline:        13.6%     14.8%     12.6%     10.9%
      mobile:           5.3%     11.5%     14.7%     10.7%
      
      Change-Id: I246c38fb952ad762ce5e365711235b605f470a66
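The effect of terminating a branch on a "skippable" result can be illustrated with a toy model of the recursive search. This is only a sketch: the real criteria are a set of conditions not listed in the message, and the `flat_size` parameter below is a hypothetical stand-in for "the mode search at this block size produced all-zero quantized coefficients".

```c
#include <assert.h>

/* Counts how many block-level mode searches a quadtree partition search
 * performs. A block is treated as skippable when its size is at or below
 * flat_size (a toy stand-in for the real coefficient check). */
static int count_mode_searches(int block_size, int min_size, int flat_size,
                               int early_termination) {
  int evaluated = 1;  /* mode search for the unsplit block */
  const int skippable = block_size <= flat_size;
  int i;
  assert(block_size > 0 && min_size > 0);
  if (block_size <= min_size) return evaluated;
  /* Nothing is left to code at this size, so do not descend further. */
  if (early_termination && skippable) return evaluated;
  for (i = 0; i < 4; ++i) {
    evaluated += count_mode_searches(block_size / 2, min_size, flat_size,
                                     early_termination);
  }
  return evaluated;
}
```

For a 64x64 superblock with an 8x8 minimum size, the exhaustive search visits 85 blocks. If every 32x32 block is skippable, early termination visits only 5.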
    • Updates vp9_pattern search to return integer sads · 04b100b2
      Deb Mukherjee authored
      Updates the vp9_pattern_search function to return integer one-away
      neighbors' sad values, for subsequent use in speeding up the
      sub-pel search. Also, removes code for the do_refine option
      which is not being used currently.
      Updates the integer and subpel functions to pass in a 5-element
      sad list for output or input.
      
      A new pruned sub-pel search algorithm is implemented that uses
      the sad returned from the integer pel search. But it is not
      deployed yet.
      
      Change-Id: Ifa9f5ad024b5b660570366d2bd900343e1891520
  6. 27 Aug, 2014 4 commits
  7. 26 Aug, 2014 1 commit
    • add a new interp filter search strategy. · 1144fee3
      Yaowu Xu authored
      This commit adds a new strategy to reduce the search for the optimal
      interpolation filter type. The encoder counts and stores how many
      times each filter type is selected and used for each of the reference
      frames. A filter type that is rarely used for all three reference
      frames is masked out to avoid computation.
      
      The impact on compression is negligible:
      -0.02% on derf
      +0.02% on stdhd
      
      Encoding time is reduced by 2~3%.
      
      Change-Id: Ibafa92291b51185de40da513716222db4b230383
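The masking strategy can be sketched as follows. The table layout, the percentage threshold, and the function name are assumptions for illustration; the message does not state how "rarely used" is measured.

```c
#include <assert.h>

enum { NUM_REFS = 3, NUM_FILTERS = 3 };

/* counts[r * NUM_FILTERS + f] holds how often filter f was selected for
 * reference frame r. Returns a bitmask in which bit f is set when filter f
 * was chosen in fewer than min_percent of selections for every reference
 * frame, so its search can be skipped. */
static unsigned int rare_filter_mask(const int *counts, int min_percent) {
  unsigned int mask = 0;
  int totals[NUM_REFS];
  int grand_total = 0;
  int r, f;

  assert(min_percent >= 0 && min_percent <= 100);
  for (r = 0; r < NUM_REFS; ++r) {
    totals[r] = 0;
    for (f = 0; f < NUM_FILTERS; ++f) totals[r] += counts[r * NUM_FILTERS + f];
    grand_total += totals[r];
  }
  if (grand_total == 0) return 0;  /* no statistics yet: search all filters */

  for (f = 0; f < NUM_FILTERS; ++f) {
    int rare_for_all_refs = 1;
    for (r = 0; r < NUM_REFS; ++r) {
      const int used = counts[r * NUM_FILTERS + f];
      if (totals[r] > 0 && used * 100 >= totals[r] * min_percent)
        rare_for_all_refs = 0;  /* common enough for this reference frame */
    }
    if (rare_for_all_refs) mask |= 1u << f;
  }
  return mask;
}
```

Requiring rarity for all three reference frames keeps a filter in the search as long as any one reference frame still benefits from it.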
  8. 25 Aug, 2014 1 commit
  9. 21 Aug, 2014 1 commit
  10. 19 Aug, 2014 1 commit
  11. 18 Aug, 2014 2 commits
    • Add early termination in transform size search · ba70f160
      Yunqing Wang authored
      In the full-rd transform size search, we go through all transform
      sizes to choose the one with the best rd score. In this patch, an
      early termination is added to stop the search once we see that the
      smaller size won't give a better rd score than the larger size. Also,
      the search starts from the largest transform size, then goes down to
      the smallest size.
      
      A speed feature tx_size_search_breakout is added, which is turned off
      at speed 0, and on for other speeds. The transform size search is
      turned on at speed 1.
      
      Borg test results:
      1. At speed 1,
         derf set: psnr gain: 0.618%, ssim gain: 0.377%;
         stdhd set: psnr gain: 0.594%, ssim gain: 0.162%;
         No noticeable speed change.
      2. At speed 2,
         derf set: psnr loss: 0.157%, ssim loss: 0.175%;
         stdhd set: psnr loss: 0.090%, ssim loss: 0.101%;
         speed gain: ~4%.
      
      Change-Id: I22535cd2017b5e54f2a62bb6a38231aea4268b3f
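A minimal sketch of the largest-to-smallest search with breakout, under the assumption that the rd cost of each size is available as an array entry (in an encoder, computing each entry is the expensive step that the breakout avoids). The function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* rd_cost[i] is the rd cost of transform size i, ordered from the largest
 * size (i = 0) to the smallest. Returns the index of the chosen size and
 * stores the number of sizes evaluated in *evaluated. With breakout set,
 * the search stops at the first size that fails to improve on the best. */
static int pick_tx_size(const int64_t *rd_cost, int num_sizes, int breakout,
                        int *evaluated) {
  int best = 0;
  int i;
  assert(num_sizes > 0);
  *evaluated = 1;
  for (i = 1; i < num_sizes; ++i) {
    ++*evaluated;
    if (rd_cost[i] < rd_cost[best]) {
      best = i;
    } else if (breakout) {
      break;  /* the smaller size is no better: stop going down */
    }
  }
  return best;
}
```

With costs {100, 90, 95, 80}, the breakout search settles on index 1 after three evaluations, while the full search finds index 3 after four. That gap is the kind of quality-for-speed trade the Borg results above quantify.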
    • Speed up mode search depending on relative ref frame position · 6a464eca
      Jingning Han authored
      This commit enables the encoder to record the location of the
      center frame used to generate the alternate reference frame. It
      then allows the encoder to skip checking prediction modes of other
      reference frame types when it encodes this frame.
      
      The speed 3 runtime is reduced for the test sequences:
      bus at CIF 1000 kbps, 9791 ms -> 9446 ms, i.e., 3.5% speed-up,
      pedestrian at 1080p 2000 kbps, 184043 ms -> 175730 ms, i.e., 4.5%
      speed-up.
      
      No compression performance change observed.
      
      Change-Id: Iacfde3bcc1445964e7a241f239bd6ea11cb94bd1
  12. 15 Aug, 2014 2 commits
  13. 13 Aug, 2014 2 commits
    • Skip mode search based on reference frame consistency · 1e305479
      Jingning Han authored
      This commit enables the encoder to skip NEARMV and ZEROMV if the
      above and left blocks have the same reference frame and the
      current reference frame is different from it. It reduces the runtime
      of speed 3 for test sequences:
      bus cif at 1000 kbps 10064 ms -> 9823 ms
      pedestrian 1080p at 2000 kbps 193078 ms -> 189559 ms
      
      The compression performance is changed by
      derf  -0.085%
      stdhd -0.103%
      
      Change-Id: If304f26d42e6412152a84c3dd7b02635c38444f4
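The condition in this commit can be written as a small predicate. The function name and the integer encoding of reference frames are assumptions for illustration.

```c
#include <assert.h>

/* Returns 1 when NEARMV and ZEROMV can be skipped for cur_ref: the above
 * and left neighbors agree on a reference frame, and it is not cur_ref. */
static int skip_for_ref_mismatch(int above_ref, int left_ref, int cur_ref) {
  assert(above_ref >= 0 && left_ref >= 0 && cur_ref >= 0);
  return above_ref == left_ref && cur_ref != above_ref;
}
```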
    • Enable motion field based mode seach skip · 0daadeb6
      Jingning Han authored
      This commit allows the encoder to check the above and left neighbor
      blocks' reference frames and motion vectors. If they are all
      consistent, the encoder skips checking the NEARMV and ZEROMV modes.
      This is enabled in speed 3. The encoding runtime is reduced:
      
      pedestrian area 1080p at 2000 kbps,
      from  74773 b/f, 41.101 dB, 198064 ms
      to    74795 b/f, 41.099 dB, 193078 ms
      
      park joy 1080p at 15000 kbps,
      from 290727 b/f, 30.640 dB, 609113 ms
      to   290558 b/f, 30.630 dB, 592815 ms
      
      Overall compression performance of speed 3 is changed
      derf  -0.171%
      stdhd -0.168%
      
      Change-Id: I8d47dd543a5f90d7a1c583f74035b926b6704b95
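A minimal sketch of the consistency test, assuming "consistent" means that both neighbors use the same reference frame and the same motion vector. The struct and function names are hypothetical, and the handling of unavailable neighbors is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of a neighboring block's motion information. */
typedef struct {
  int ref_frame;
  int mv_row;
  int mv_col;
} BlockMotion;

/* Returns 1 when the above and left neighbors share both the reference
 * frame and the motion vector, in which case NEARMV and ZEROMV are not
 * tested. */
static int motion_field_consistent(const BlockMotion *above,
                                   const BlockMotion *left) {
  assert(above != NULL && left != NULL);
  return above->ref_frame == left->ref_frame &&
         above->mv_row == left->mv_row && above->mv_col == left->mv_col;
}
```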
  14. 08 Aug, 2014 1 commit
  15. 04 Aug, 2014 3 commits
  16. 31 Jul, 2014 1 commit
    • Code cleanup in rdopt.c · 678d7472
      Yunqing Wang authored
      Moved encode_breakout code out of handle_inter_mode().
      
      Change-Id: Icd661136b05fdf163768c406f91e0c98a8df89eb
  17. 29 Jul, 2014 1 commit
    • Use frame index directly in get_chessboard_index · c36f78b0
      Jingning Han authored
      get_chessboard_index() used to take the entire VP9_COMMON
      struct pointer to retrieve the chessboard pattern index. This CL
      makes it take the frame index directly.
      
      Change-Id: I3cad9d209ea2e77a358085a04fe1ff0ddec5ba03
  18. 24 Jul, 2014 2 commits
  19. 23 Jul, 2014 3 commits
    • Remove redundant argument entry in handle_inter_mode · e945c56d
      Jingning Han authored
      The value of mode_excluded has been properly set in
      vp9_rd_pick_inter_mode_sb(). It is redundant to pass it into
      handle_inter_mode() and set the value again.
      
      Change-Id: I408d4731f2f42e0bcf3ae62e85757717bb410471
    • Use the chessboard pattern pred search in newmv mode · 4f2f8672
      Jingning Han authored
      This commit extends the chessboard pattern prediction filter search.
      If the above and left blocks have the same prediction filter type,
      the encoder will skip the prediction filter type search and use the
      reference one.
      
      The overall chessboard pattern prediction filter type search reduces
      speed 3 runtime for hard clips. Experiments on park joy at 1080p
      and 15000 kbps show that the runtime goes from 723265 ms to 65832 ms,
      i.e., about 10% speed-up. In terms of compression performance, it
      changes the coding quality by
      derf:    -0.326%
      yt:      -0.257%
      hd:      -0.241%
      stdhd:   -0.417%
      
      Change-Id: I880975497c7ad166532e9eea9bf46684d77ff327
    • Remove redundant num_refs definition · 35381910
      Jingning Han authored
      Use is_comp_pred to replace the use case of num_refs.
      
      Change-Id: I4d0c1e14d5f728428a2ae3d293cd2b4a8b2f31d8
  20. 22 Jul, 2014 2 commits
    • Enable chessboard inter prediction filter type search · 54ad0958
      Jingning Han authored
      This commit enables a chessboard pattern prediction filter type
      search scheme for rate-distortion optimization speed-up. For the
      inferred motion vector modes, the encoder can re-use its above/left
      neighbor blocks' prediction filter type and skip a full test on
      all possible filter types. This operation is turned on and off
      alternately in a chessboard pattern.
      
      It is turned on in speed 3. For test clip pedestrian 1080p, the
      runtime is reduced from 231500 ms to 221700 ms. The compression
      performance is changed:
      derf:  -0.147%
      yt:    -0.134%
      hd:    -0.079%
      stdhd: -0.220%
      
      Change-Id: I1912f278e7576c2dc632688e3ad7a257410c605a
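One way to picture the chessboard scheme is as a parity test on the block position, flipped by frame parity so that the pattern alternates between frames. This is a sketch under assumptions: the block-position units, the use of frame parity, the rule for combining the two neighbors, and all names below are hypothetical.

```c
#include <assert.h>

enum { SEARCH_ALL_FILTERS = -1 };

/* Returns 1 on the "on" squares of the chessboard, where the filter type
 * may be inferred from neighbors instead of fully searched. */
static int chessboard_on(int block_row, int block_col, int frame_index) {
  assert(block_row >= 0 && block_col >= 0 && frame_index >= 0);
  return (block_row + block_col + frame_index) & 1;
}

/* Returns the filter type to re-use, or SEARCH_ALL_FILTERS when the full
 * filter search should run. Re-use is assumed to require that the above
 * and left neighbors agree. */
static int inferred_filter(int block_row, int block_col, int frame_index,
                           int above_filter, int left_filter) {
  if (chessboard_on(block_row, block_col, frame_index) &&
      above_filter == left_filter) {
    return above_filter;
  }
  return SEARCH_ALL_FILTERS;
}
```

Alternating the pattern means every block that skips the search has neighbors that ran it, which limits how far an inferred filter choice can propagate.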
    • USE local best_filter variable in handle_inter_mode · 5de6114e
      Jingning Han authored
      This should be a local variable. Move the definition from
      vp9_rd_pick_inter_mode_sb to handle_inter_mode.
      
      Change-Id: I14f4168bb1c896ed04e8f6d4cd89fbf4c9839944
  21. 21 Jul, 2014 1 commit
  22. 11 Jul, 2014 2 commits