1. 16 Jun, 2016 1 commit
  2. 15 Jun, 2016 1 commit
  3. 14 Jun, 2016 5 commits
    • Rework transform quantization pipeline · 1faf2887
      Jingning Han authored
      This commit reworks the transform and quantization unit. It enables
      the use of adaptive quantization for intra modes. This further
      improves the compression performance:
      lowres 0.36%
      midres 0.79%
      hdres  0.73%
      The key frame coding performance is improved:
      lowres 1.7%
      midres 1.9%
      hdres  3.3%
      The overall coding gains are:
      lowres 1.1%
      midres 1.8%
      hdres  2.3%
      Change-Id: Iaec1a3a4c1d5eac883ab526ed076d957060479dd
    • Handle intra modes when tx type speed feature is enabled · 8c3b3d36
      hui su authored
      Change-Id: I9dc156214f3b3ded33ab30d558124b3151548161
    • Speed up ext-intra inter frame encoding · 8f9c9b28
      hui su authored
      Skip filter intra mode search when regular intra modes have large
      rd cost.
      Encoding speed improvement:  8%.
      Compression performance drop: 0.02%  / 0.09%  / 0.03% on
                                    lowres / midres / hdres
      Change-Id: I94d3e48781bff6ae6895a54f271dd65c959bb976
    • ext-intra: refactor rd loop in interframe · 70566f05
      hui su authored
      Move filter intra modes search to the end, after regular
      mode search.
      No performance change on average.
      Change-Id: I9293c8fdf706ebf831fbd61c6bb81959790f4848
    • Fix rate cost calculation for ext-intra · 7fa61d7d
      hui su authored
      It was broken by commit 8ee640f9.
      Change-Id: I26b9eba810c74849b0805e64da2d269ab0685cb9
  4. 13 Jun, 2016 2 commits
  5. 10 Jun, 2016 4 commits
    • Trellis based adaptive quantization · 25ca3229
      Jingning Han authored
      This commit combines uniform quantizer with trellis based coefficient
      level optimization. It improves the codebase compression performance:
      lowres 0.8%
      midres 1.0%
      hdres  1.6%
      Note that the current trellis optimization unit uses plain C code,
      which slows down the overall quantization process. A number
      of optimizations will follow.
      Change-Id: Id441dd238e4844409d0f08f82604be777f3f5282
    • Some refactoring to support warped motion mode · 03be30ba
      Debargha Mukherjee authored
      Change-Id: I15d54a3ae48b2b33082668116792c6595bdb3ddb
    • Move new quant experiment from nextgen · a21afd42
      Sarah Parker authored
      This experiment implements non-uniform quantization where
      the width of the bins increases gradually to more closely
      match a Laplacian distribution of the coefficients.
      Performance Gain:
      derflr: 0.15%
      hevcmr: 0.675%
      Change-Id: I25234244e3bcd94b87c1f77cf682190b61c8ef94
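      The idea of gradually widening bins can be sketched as follows. This
      is a minimal illustration and not the code from the experiment: the
      function name quantize_nonuniform and the parameters base_width,
      growth, and max_level are hypothetical, and the linear growth rule is
      an assumption chosen for simplicity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical non-uniform scalar quantizer (not the experiment's code).
 * Bin 0 spans [0, base_width); each following bin is `growth` wider than
 * the previous one, so large magnitudes, which are rare under a Laplacian
 * model, fall into coarser bins. */
int quantize_nonuniform(int coeff, int base_width, int growth, int max_level) {
  assert(base_width > 0 && growth >= 0 && max_level >= 0);
  const int sign = coeff < 0 ? -1 : 1;
  const int mag = abs(coeff);
  int level = 0;
  int width = base_width;
  int upper = base_width; /* exclusive upper edge of the current bin */
  while (level < max_level && mag >= upper) {
    ++level;
    width += growth; /* the next bin is wider than the one before */
    upper += width;
  }
  return sign * level;
}
```

      With base_width = 8 and growth = 2, the bins are [0, 8), [8, 18),
      [18, 30), [30, 44), and so on; levels are clamped to max_level.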
    • Revert "Optimize wedge partition selection." · 95340fcc
      Angie Chiang authored
      This reverts commit efda2831.
      That commit causes a segmentation fault in SSE2/SumSquares2DTest.RandomValues/0.
      Change-Id: I171937e4daf6f15323e8206418773deb03bd8c53
  6. 08 Jun, 2016 1 commit
    • Remove swap buffer speed feature · 0d6980d7
      Jingning Han authored
      The inter prediction residual can undergo different transform types
      during the rate-distortion optimization search. The assumption used
      in this speed feature no longer holds true. This commit removes the
      related code to clean up the codebase and fix the unit test
      failures at higher speed settings.
      Change-Id: I7f7cd4df2345ed3e607c9fae75b38cd2dbde0cac
  7. 07 Jun, 2016 3 commits
    • Add tx type speed feature to recursive transform block partitioning · 33dafdb5
      Jingning Han authored
      Change-Id: I45440a72b4287d98cbe21b72defc67138a8eb953
    • Rework the tx type speed feature · 9a858e86
      Jingning Han authored
      This commit re-works the transform type speed feature. It moves
      the transform type selection outside of the coding mode loop. This
      avoids repeated motion search if the best prediction mode is
      chosen as NEWMV. It improves the speed performance for clips that
      contain more motion activities.
      For mobile_cif at 1000 kbps, this makes the baseline encoding 7%
      faster and makes the encoding with dynamic motion vector referencing
      scheme enabled 10% faster.
      Change-Id: I93e2714b3e461303372c4b66a4134ee212faffd1
    • Fix a RD performance bug in bipredictive frames · 5414abb4
      Zoe Liu authored
      This patch ensures that BWDREF_FRAME is used when encoding both
      types of bipredictive frames, namely LAST_BIPRED_UPDATE and
      BIPRED_UPDATE. To achieve this, the updates to
      cpi->ref_frame_flags now happen before a frame is encoded,
      instead of after it.
      RD performance improves slightly, by approximately 0.17%
      compared to the results before this patch:
      lowres: Avg -3.474; BDRate -3.324
      derflr: Avg -2.097; BDRate -1.353
      Change-Id: I0aa19afd752293e345489fbff104c4351ca5498c
  8. 06 Jun, 2016 1 commit
    • Optimize wedge partition selection. · efda2831
      Geza Lore authored
      We can optimize wedge partition selection by pre-computing the
      residuals of the 2 underlying predictors, and then blending these
      to compute the sse of the compound predictor, without actually
      having to compute and subtract the compound predictor.
      Similarly we can pre-compute a proxy array which we can use to
      cheaply check which mask sign would have lower sse.
      Details are in wedge_utils.c.
      Mathematically these are equivalence transformations, but due to the
      finite precision the encoder output will be perturbed, though on
      average this should make 0% difference.
      ext-inter gains a speedup of about 4.5%.
      Change-Id: Ib2657c3209ae161b4090b58b4b6c392641bf2792
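      The equivalence behind this optimization can be sketched as follows.
      This is not the code in wedge_utils.c: the function names, the mask
      weight range of 0..64 (MASK_MAX), and the choice to return the sse
      scaled by MASK_MAX^2 are assumptions made for illustration. Keeping
      the scale factor avoids rounding, so both forms agree exactly here,
      whereas a real encoder rounds and may see small perturbations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_MAX 64 /* assumed mask weight range: 0..64 */

/* Residual of one predictor; computed once per predictor and reused
 * for every candidate mask. */
void compute_residual(const uint8_t *src, const uint8_t *pred, int16_t *res,
                      size_t n) {
  for (size_t i = 0; i < n; ++i) res[i] = (int16_t)(src[i] - pred[i]);
}

/* Direct form: blend the two predictors, then subtract from the source.
 * Returns the sse scaled by MASK_MAX^2. */
uint64_t sse_direct_scaled(const uint8_t *src, const uint8_t *p0,
                           const uint8_t *p1, const uint8_t *mask, size_t n) {
  uint64_t sse = 0;
  for (size_t i = 0; i < n; ++i) {
    assert(mask[i] <= MASK_MAX);
    const int64_t blend = (int64_t)mask[i] * p0[i] +
                          (int64_t)(MASK_MAX - mask[i]) * p1[i];
    const int64_t err = (int64_t)MASK_MAX * src[i] - blend;
    sse += (uint64_t)(err * err);
  }
  return sse;
}

/* Residual form: blend the pre-computed residuals instead. Because
 * MASK_MAX * s - (m * p0 + (MASK_MAX - m) * p1)
 *   == m * (s - p0) + (MASK_MAX - m) * (s - p1),
 * this gives the same scaled sse without forming the compound predictor. */
uint64_t sse_from_residuals_scaled(const int16_t *r0, const int16_t *r1,
                                   const uint8_t *mask, size_t n) {
  uint64_t sse = 0;
  for (size_t i = 0; i < n; ++i) {
    assert(mask[i] <= MASK_MAX);
    const int64_t err = (int64_t)mask[i] * r0[i] +
                        (int64_t)(MASK_MAX - mask[i]) * r1[i];
    sse += (uint64_t)(err * err);
  }
  return sse;
}
```

      The residuals depend only on the source and the two predictors, so
      they are computed once and reused across all candidate wedge masks.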
  9. 03 Jun, 2016 6 commits
    • Make ref-mv experiment support ActiveMap · 27d8a948
      Jingning Han authored
      Reset the ref_mv_idx and predicted motion vector when the coding
      block belongs to skip segment.
      Change-Id: I5746ab315a436b829b64a1a25121989d3c11c995
    • Always include the cost of tx size in rate for Y. · b87078d5
      Geza Lore authored
      The transform can only be skipped if both Y and U/V can be skipped, so
      we always include the cost of tx size in the rate for Y. This cost
      is subtracted later if the transform is actually skipped.
      Change-Id: I136a223e5596f18b69bb9f743e7e08438183a215
    • Check if sub8x8 rd stats are valid before reusing them. · d9870c32
      Geza Lore authored
      Change-Id: I5d49f15a07de58c226d4003b4691e001abf1f3f8
    • Compute cost of UV mode accurately for intra blocks. · 8ee640f9
      Geza Lore authored
      We used to cache the cost of the UV mode from the search with a
      different previously tried Y mode, but the UV mode is contexted
      on the Y mode, so caching the cost is inaccurate.
      Change-Id: Ib003510afb6fc9befb7808b67b0be64f1c0a0804
    • Factor out model_rd_from_sse · 73bc3119
      Geza Lore authored
      Change-Id: Ia60ff0ecc8d083870fadbfe07d494d1e2c080489
    • Pre-compute and use contiguous wedge masks. · ab29978e
      Geza Lore authored
      This is purely a refactoring patch and has no functional effect.
      Uses of these masks can be arranged such that all input blocks are
      contiguous in memory (stride == block width). In this case 1D versions
      of operations can be used. 1D vector operations have superior performance
      over 2D block equivalents as they are more processor cache friendly and
      they avoid the overhead of a second loop.
      Change-Id: I2b76c9888aea2c857cc497e8a4b2841fd3dad54e
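      The difference between the 2D and 1D forms can be sketched with a
      sum-of-squares kernel. The function names below are hypothetical and
      this is not the code from the patch; it only illustrates how a block
      with stride == block width can be handled by a single flat loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 2D form: needs a row loop and a column loop, and skips
 * (stride - width) elements at the end of every row. */
uint64_t sum_squares_2d(const int16_t *src, size_t stride, size_t width,
                        size_t height) {
  uint64_t sum = 0;
  for (size_t r = 0; r < height; ++r) {
    for (size_t c = 0; c < width; ++c) {
      const int64_t v = src[r * stride + c];
      sum += (uint64_t)(v * v);
    }
  }
  return sum;
}

/* 1D form: a single loop over contiguous memory. */
uint64_t sum_squares_1d(const int16_t *src, size_t n) {
  uint64_t sum = 0;
  for (size_t i = 0; i < n; ++i) {
    const int64_t v = src[i];
    sum += (uint64_t)(v * v);
  }
  return sum;
}

/* Dispatch: when the block is contiguous, the 1D version applies. */
uint64_t sum_squares_block(const int16_t *src, size_t stride, size_t width,
                           size_t height) {
  assert(stride >= width);
  if (stride == width) return sum_squares_1d(src, width * height);
  return sum_squares_2d(src, stride, width, height);
}
```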
  10. 31 May, 2016 3 commits
    • ext-intra: speed up keyframe encoding · fa933553
      hui su authored
      130% speed increase for keyframe encoding, with 0.4%
      compression loss.
      When kf-max-dist=150, 1.5% speed increase with 0.03%
      compression loss.
      Change-Id: I4cf7314ab95b9eb6dd17f314aca8955522c82676
    • Add a speed feature for inter tx type search · f523d7b5
      hui su authored
      Separate prediction mode and tx type search for inter
      modes. Enabled for speed >= 1.
      speed increase     40%
      compression drop   0.30%/0.29% on lowres/midres
      speed increase    160%
      compression drop  1.08%/0.95% on lowres/midres
      Change-Id: Ieb34b1ee80df6980d16e26a5783e08cc0deae55b
    • Add a speed feature for intra tx type search · 38e6dd71
      hui su authored
      Add a speed feature to separate prediction mode and tx type search
      for intra modes: search for best intra prediction mode with fixed
      default tx type first, then choose the best tx type for the
      selected mode.
      Coding performance drop:
        lowres 0.10% midres 0.08% hdres 0.14%
      with ext-tx
        lowres 0.14% midres 0.25% hdres 0.20%
      Speed improvement is 20% for baseline and 17% for ext-tx.
      It is turned on for speed >= 1.
      Change-Id: Ia5e8d39e8a4e2e42c521bfde938f8b6a98ab24f9
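      The trade-off of the separated search can be sketched as follows.
      Everything here is hypothetical: the cost table kToyRdCost holds
      made-up numbers, and search_exhaustive and search_two_stage are
      illustrative names, not encoder functions. The sketch only shows why
      the two-stage search evaluates fewer candidates and why it can miss
      the joint optimum.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_MODES 3
#define NUM_TX_TYPES 2

/* Toy rd cost table indexed by [mode][tx_type]; the values are made up. */
static const int64_t kToyRdCost[NUM_MODES][NUM_TX_TYPES] = {
  { 50, 40 }, { 30, 45 }, { 35, 10 }
};

typedef struct {
  int mode;
  int tx_type;
  int64_t rd;
  int evaluations; /* number of rd cost evaluations performed */
} SearchResult;

/* Joint search: every (mode, tx type) pair is evaluated. */
SearchResult search_exhaustive(void) {
  SearchResult best = { 0, 0, INT64_MAX, 0 };
  for (int mode = 0; mode < NUM_MODES; ++mode) {
    for (int tx = 0; tx < NUM_TX_TYPES; ++tx) {
      const int64_t rd = kToyRdCost[mode][tx];
      ++best.evaluations;
      if (rd < best.rd) {
        best.mode = mode;
        best.tx_type = tx;
        best.rd = rd;
      }
    }
  }
  return best;
}

/* Two-stage search: pick the best mode with the default tx type first,
 * then pick the best tx type for that mode only. */
SearchResult search_two_stage(int default_tx_type) {
  assert(default_tx_type >= 0 && default_tx_type < NUM_TX_TYPES);
  SearchResult best = { 0, default_tx_type, INT64_MAX, 0 };
  for (int mode = 0; mode < NUM_MODES; ++mode) {
    const int64_t rd = kToyRdCost[mode][default_tx_type];
    ++best.evaluations;
    if (rd < best.rd) {
      best.mode = mode;
      best.rd = rd;
    }
  }
  for (int tx = 0; tx < NUM_TX_TYPES; ++tx) {
    if (tx == default_tx_type) continue; /* already evaluated in stage 1 */
    const int64_t rd = kToyRdCost[best.mode][tx];
    ++best.evaluations;
    if (rd < best.rd) {
      best.tx_type = tx;
      best.rd = rd;
    }
  }
  return best;
}
```

      On this toy table the two-stage search needs 4 evaluations instead
      of 6 but settles on a higher rd cost, which mirrors the speed versus
      compression trade-off reported above.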
  11. 28 May, 2016 1 commit
    • Make the bi-predictive frame group interval adjustable · e89ca180
      Zoe Liu authored
      This is for the bidir-pred experiment. Previously the length of the
      bi-predictive frame group interval was fixed at 2, i.e. one
      bi-predictive frame may be inserted every other frame. This patch
      makes the length adjustable, i.e. any positive number may be
      specified, but the use of the backward ref will be turned off if the
      bi-predictive frame group interval is larger than the golden frame
      Further, an additional rate factor level has been added:
      , which applies to LAST_BIPRED_UPDATE frames that are not used as
      Change-Id: I5514d34a64dd486bbb5756c2d0612946f598a789
  12. 24 May, 2016 2 commits
    • Added an experiment "bidir_pred" for backward prediction · cf5083d4
      Zoe Liu authored
      Major parts have been implemented as follows:
      (1) Added BRF_UPDATE, LASTNRF_UPDATE, and NRF_UPDATE in firstpass.c;
      (2) Added the handling for the scenario of
      "cpi->common.show_existing_frame == 1" at the encoder;
      (3) Added a new reference frame of BWDREF_FRAME;
      (4) Have bwd-ref work with upsampled references.
      Note that when the "ext_refs" experiment is turned on, this experiment
      is currently turned off automatically.
      RD performance in Overall PSNR has been improved, compared against the
      VP10 baseline:
      lowres: Avg -3.312; BDRate -3.154
      derflr: Avg -1.927; BDRate -1.176
      midres: Avg -2.149; BDRate -2.001
      hdres : Avg -0.567; BDRate -0.588
      Change-Id: I4c06ff51cc20194bffbd4d2346e57ba3dcf6b62c
    • Skip unnecessary calculations in ext-intra · 4a741a5d
      hui su authored
      Around 5% speedup.
      Change-Id: I1c552e4e58fbf5637c0b5a97dd2cc4f83a1ca201
  13. 20 May, 2016 1 commit
  14. 19 May, 2016 2 commits
    • Fix a build issue · a9fc1cc2
      Yaowu Xu authored
      When both obmc and dual_filter are enabled.
      Change-Id: I56b127573a6cca31469bb357cf7a6a9c3df64071
    • Fix obmc + ext-interp interference · 009bd115
      Geza Lore authored
      With ext-interp, a switchable interpolation filter is coded iff the
      motion vector uses fractional pixel movement (ie, true subpixel
      movement). With ext-interp and obmc enabled at the same time, the RD
      search proceeds as:
      1. Do motion search
      2. Do interpolation filter search iff subpixel motion, otherwise use
      3. Evaluate obmc=0
      4. Evaluate obmc=1 - This involves another motion search
      If the motion search in step 4 yields an integer motion vector, while
      the search in step 1 did not, then an interp_filter value other than
      EIGHTTAP_REGULAR is invalid, and will cause an assertion failure
      at output time, or a mismatch if not using --enable-debug.
      The fix sets the interp_filter to EIGHTTAP_REGULAR if obmc=1 is picked
      with an integer motion vector.
      Change-Id: I4685d1ad537f41d833dc9eb64845956b67886cca
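      The rule applied by the fix can be sketched as follows. Only the
      name EIGHTTAP_REGULAR comes from the commit message; the enum values,
      the ToyMv type, the function names, and the assumption that motion
      vectors are stored in 1/8-pel units are illustrative and may not
      match the encoder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the codec's filter enum. */
enum { EIGHTTAP_REGULAR = 0, TOY_OTHER_FILTER = 1 };

/* Toy motion vector, assumed to be in 1/8-pel units. */
typedef struct {
  int16_t row;
  int16_t col;
} ToyMv;

/* An integer motion vector has no fractional bits in either component. */
static int mv_is_integer(ToyMv mv) {
  return (mv.row & 7) == 0 && (mv.col & 7) == 0;
}

/* The filter chosen in step 2 was searched for the motion vector from
 * step 1. If obmc=1 is picked and its own motion search ended on an
 * integer motion vector, no switchable filter may be coded, so fall back
 * to EIGHTTAP_REGULAR. */
int final_interp_filter(int obmc_picked, ToyMv final_mv, int searched_filter) {
  assert(searched_filter >= 0);
  if (obmc_picked && mv_is_integer(final_mv)) return EIGHTTAP_REGULAR;
  return searched_filter;
}
```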
  15. 17 May, 2016 2 commits
    • Reducing computation of interintra modes · 049dbe77
      Debargha Mukherjee authored
      Use model for interintra mode search.
      Speed-up about 5-10% with about 0.04 drop in efficiency.
      lowres: -2.60%
      Change-Id: I825bf0ba8a46eb7f19fc528c25b8df066fb8ea95
    • vp10/rdopt,rd_pick_intra4x4block: port tsan fix from vp9 · 8eba4ac4
      James Zern authored
      minus the non-existent nonrd portion. original change:
      commit d642294b
      Author: Jingning Han <jingning@google.com>
      Date:   Thu Feb 11 12:36:49 2016 -0800
          Fix tsan error in VP9 sub8x8 intra mode search
          This commit fixes issue 1141. The issue was triggered in multi-tile
          encoding. The change properly saves and restores the block context
          information in the real-time mode selection process. It removes
          several redundant memcpy operations in sub8x8 intra block mode
          Change-Id: I35c9ad197f4bd500ec39b5fc833f052f19eee010
      Change-Id: Idfa38c54c9e645479f6870d46f71fb1e91c071da
  16. 16 May, 2016 3 commits
    • Unify the per directional filter type system for compound modes · 4677e1a7
      Jingning Han authored
      For the current stage, we assume a single prediction filter type
      per direction in the settings of compound inter prediction modes.
      Change-Id: I12a1afdd364b93fcee870bd11ad01fc40ab48cff
    • Enable per motion component filter type selection · d567e14e
      Jingning Han authored
      Change-Id: I73823fc94f296d225dece7156de71b30bae3fcb7
    • Various wedge enhancements · fb8ea173
      Debargha Mukherjee authored
      Increases the number of wedges for smaller blocks and removes the
      wedge coding mode for blocks larger than 32x32.
      Also adds various other enhancements for subsequent experimentation:
      provision for multiple smoothing functions (though only one is
      currently used), a speed feature that decides the sign for
      interinter wedges using a fast mechanism, and a refactoring of the
      wedge representations.
      lowres: -2.651% BDRATE
      Most of the gain is due to increase in codebook size for 8x8 - 16x16.
      Change-Id: I50669f558c8d0d45e5a6f70aca4385a185b58b5b
  17. 11 May, 2016 2 commits
    • Cost wedge sign/index properly in rdopt. · c1b73901
      Geza Lore authored
      Lowres improves by about 0.1%
      lowres: -2.164 BDRATE
      Change-Id: I393bbb92700bfbb8763ace424f4edc2d672a74b4
    • Add single motion search for OBMC predictor · 370f203a
      Yue Chen authored
      Weighted single motion search is implemented for obmc predictor.
      When NEWMV mode is used, to determine the MV for the current block,
      we run weighted motion search to compare the weighted prediction
      with (source - weighted prediction using neighbors' MVs), in which
      the distortion is the actual prediction error of obmc prediction.
      Coding gain: 0.404/0.425/0.366 for lowres/midres/hdres
      Speed impact: +14% encoding time
                    (obmc w/o mv search 13%-> obmc w/ mv search 27%)
      Change-Id: Id7ad3fc6ba295b23d9c53c8a16a4ac1677ad835c