1. 03 Feb, 2015 2 commits
  2. 31 Jan, 2015 1 commit
    • Try again to merge branch 'frame-parallel' into master branch. · be6aeada
      hkuang authored
      In frame parallel decode, the libvpx decoder decodes several frames on all
      cpus in parallel. If not flushed, it only returns a frame
      when all the cpus are busy. If flushed, it returns all the
      frames in the decoder. Compared with the current serial decode mode, in which
      the libvpx decoder is idle between decode calls, in frame parallel mode the
      decoder stays busy between decode calls.
      Current frame parallel decode will only speed up the decoding for frame
      parallel encoded videos. For non frame parallel encoded videos, frame
      parallel decode is slower than serial decode due to lack of loopfilter
      worker thread.
      There are still some known issues that need to be addressed. For example:
      decoding frame parallel videos with segmentation enabled is sometimes incorrect.
      * frame-parallel:
        Add error handling for frame parallel decode and unit test for that.
        Fix a bug in frame parallel decode and add a unit test for that.
        Add two test vectors to test frame parallel decode.
        Add key frame seeking to webmdec and webm_video_source.
        Implement frame parallel decode for VP9.
        Increase the thread test range to cover 5, 6, 7, 8 threads.
        Fix a bug in adding frame parallel unit test.
        Add VP9 frame-parallel unit test.
        Manually pick "Make the api behavior conform to api spec." from master branch.
        Move vp9_dec_build_inter_predictors_* to decoder folder.
        Add segmentation map array for current and last frame segmentation.
        Include the right header for VP9 worker thread.
        Move vp9_thread.* to common.
        ctrl_get_reference does not need user_priv.
        Separate the frame buffers from VP9 encoder/decoder structure.
        Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
      This reverts commit a18da976.
      Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02
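The output behaviour described in this commit message (no frame returned until every worker is busy, everything returned on flush) can be pictured with a toy model. This is only a sketch under stated assumptions, not libvpx code: `ToyDecoder`, `toy_init`, `toy_decode` and `toy_flush` are hypothetical names, and the "workers" are simply slots in a ring buffer that hold frame ids.

```c
#define TOY_MAX_WORKERS 8

/* Hypothetical toy model: each worker slot holds one in-flight frame id. */
typedef struct {
  int frames[TOY_MAX_WORKERS]; /* ring buffer of in-flight frame ids */
  int num_workers;             /* number of worker slots in use */
  int head;                    /* index of the oldest in-flight frame */
  int count;                   /* number of in-flight frames */
} ToyDecoder;

static void toy_init(ToyDecoder *d, int num_workers) {
  d->num_workers = num_workers;
  d->head = 0;
  d->count = 0;
}

/* Submit one frame. Returns the oldest frame id if every worker was already
 * busy, or -1 if the new frame was simply handed to an idle worker. */
static int toy_decode(ToyDecoder *d, int frame_id) {
  int out = -1;
  if (d->count == d->num_workers) {
    out = d->frames[d->head];
    d->head = (d->head + 1) % d->num_workers;
    d->count--;
  }
  d->frames[(d->head + d->count) % d->num_workers] = frame_id;
  d->count++;
  return out;
}

/* Flush: drain every in-flight frame, oldest first. Returns how many
 * frame ids were written to out. */
static int toy_flush(ToyDecoder *d, int *out, int max_out) {
  int n = 0;
  while (d->count > 0 && n < max_out) {
    out[n++] = d->frames[d->head];
    d->head = (d->head + 1) % d->num_workers;
    d->count--;
  }
  return n;
}
```

With two workers in this model, the first two `toy_decode` calls return nothing, the third returns the first frame, and `toy_flush` then drains the remaining two in order.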
  3. 30 Jan, 2015 2 commits
    • vp9: rename 'near' parameters · f6c2a6c5
      James Zern authored
      + nearest for consistency
      near is a reserved word in windows builds so using it as a parameter
      name may cause build failures with some configurations
      Change-Id: Iddf1d4ecdb39843f14e95dbfd9dca55f07f81403
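A minimal sketch of the kind of rename this commit describes. Legacy Windows headers are known to define `near` (and `far`) as macros, so a parameter spelled `near` can fail to compile there. The names `ToyMv`, `toy_pick_mv`, `near_mv` and `nearest_mv` below are hypothetical illustrations, not necessarily the identifiers used in the actual patch.

```c
/* Hypothetical stand-in for a motion vector type. */
typedef struct {
  int row;
  int col;
} ToyMv;

/* Before (may break in some Windows configurations, where `near` can be
 * defined as a macro):
 *   void toy_pick_mv(const ToyMv *nearest, const ToyMv *near, ...);
 *
 * After: both parameters get a neutral, consistent suffix. */
static void toy_pick_mv(const ToyMv *nearest_mv, const ToyMv *near_mv,
                        int use_nearest, ToyMv *dst) {
  /* Copy whichever candidate the caller selected. */
  *dst = use_nearest ? *nearest_mv : *near_mv;
}
```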
    • Optimize coef update · 45971abd
      Yaowu Xu authored
      1. move the check of search method of USE_TX_8X8 up one level to
      avoid operations of build_tree_distributions()
      2. count tx used and avoid computation for coef update when one size
      is not used at all.
      Change-Id: Ia3e54a2588aa531c41377a1bfaa64385d04a592c
  4. 29 Jan, 2015 1 commit
  5. 28 Jan, 2015 4 commits
    • Change to update of rate control factors. · f752da8c
      Paul Wilkins authored
      Remove damping parameter and use the damping
      formula introduced by Yaowu Xu in all cases.
      Change-Id: I18db7e0d0f262d5140102f259ab07821d374d285
    • Simplify update_coef_probs() · ff99a3c7
      Yaowu Xu authored
      1. reduce the size of temporary arrays on stack
      2. avoid build_tree_distribution for tx size that is not used at all.
      Change-Id: I0f8d7124e16a3789d3c15ad24cf02c1c12789e2c
    • Fix to vp9 denoiser. · c0923d4d
      Marco authored
      Prevent use of the wrong mv for denoiser motion compensation.
      Change-Id: Ifa0f9daabdbdab0900d3c17304059fe0d15de914
    • Fix issues in 32bit PIC enabled build · 10d5e09c
      Yunqing Wang authored
      This patch was to fix issue 924:
      The SECTION_RODATA macro was modified to support macho32 format.
      The sub-pixel functions were modified to pass in 2 more parameters
      to handle the global offsets for PIC build.
      Change-Id: I3bfcd336bcae945edf300bca4ab40376a2628cd4
  6. 27 Jan, 2015 5 commits
  7. 26 Jan, 2015 2 commits
    • Adjust active maxq for GF groups. · fd070220
      Paul Wilkins authored
      Currently disabled by default: enabled using
      In this patch the active max Q is adjusted for each GF
      group based on the vbr bit allocation and raw first pass
      group error.
      This will tend to give a lower q for easy sections
      and a higher value for very hard sections. As such it is
      expected to improve quality in some of the easier
      sections where quality issues have been reported.
      This change tends to hurt overall psnr but help
      average psnr. SSIM also shows a small gain.
      Average results for derf, yt, std-hd and yt-hd test sets were
      as follows (% change for average psnr, overall psnr and ssim):
      derf    +0.291, -0.252, -0.021
      yt      +6.466, -1.436, +0.552
      std-hd  +0.490, +0.014, +0.380
      yt-hd   +5.565, -1.573, +0.099
      Change-Id: Icc015499cebbf2a45054a05e8e31f3dfb43f944a
    • Fix MSVC warnings on conversion from int64 to int · 6d16f6c1
      Yaowu Xu authored
      Change-Id: I7e96509ffa36899fcd2935749927a1e8aac8d025
  8. 25 Jan, 2015 1 commit
    • Add Neon intrinsic vp9_fdct8x8_quant_neon · 9f6eba41
      Frank Galligan authored
      On Nexus 7 speed -5 got ~2%, -6 got ~15%, -7 and -8 got ~30%
      increase in perf.
      Tested on Nexus 7, built with ndk r10d, gcc 4.9.
      Change-Id: I83246d63b96674d170098a572fa4fe28a05aaf51
  9. 23 Jan, 2015 4 commits
  10. 22 Jan, 2015 1 commit
    • Modify variance partition selection for low resolutions. · 0dccb627
      Marco authored
      For low spatial resolutions: bias partition selection to smaller block sizes,
      and base the variance computation on 4x4 down-sampling.
      Also move the threshold computations into the choose_partitioning,
      so they are computed once for each sb block.
      On low-res clips (RTC_derf) PSNR/SSIM metrics increase by about 4-5%.
      No change for resolutions above CIF.
      Change-Id: I93f8ff742c8044786977bb6e31dcf8efda6dd1b0
  11. 21 Jan, 2015 3 commits
  12. 20 Jan, 2015 1 commit
  13. 19 Jan, 2015 1 commit
    • SSE2 code for the filter in MFQE. · 09673deb
      JackyChen authored
      The SSE2 code comes from VP8 MFQE and is reused in VP9, with no change
      on the VP8 side. In our testing, this change gave a 2X speedup.
      Change-Id: Ib2b14144ae57c892005c1c4b84e3379d02e56716
  14. 17 Jan, 2015 2 commits
    • Fix variance Neon intrinsics > 32x32 · cc2da09d
      Frank Galligan authored
      The 16 bit sum vector was overflowing.
      Change-Id: I0fdf38e832ee99457ec8680a92691a6175ff8c3f
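A back-of-the-envelope sketch of why a 16-bit sum can overflow only for blocks larger than 32x32. It assumes (this is not stated in the commit) that the sum of pixel differences is spread over 8 signed 16-bit lanes and that each 8-bit pixel difference has magnitude at most 255. The helper names are hypothetical.

```c
#include <stdint.h>

/* Worst-case magnitude accumulated in one lane when `lanes` parallel
 * accumulators share the sum of pixel differences over a w x h block.
 * Each 8-bit pixel difference has magnitude at most 255. */
static int32_t worst_case_lane_sum(int w, int h, int lanes) {
  return (int32_t)(w * h / lanes) * 255;
}

/* Returns 1 if a signed 16-bit lane can hold the worst case, else 0. */
static int fits_int16_lane(int w, int h, int lanes) {
  return worst_case_lane_sum(w, h, lanes) <= INT16_MAX;
}
```

Under these assumptions a 32x32 block peaks at 128 * 255 = 32640 per lane, just under INT16_MAX (32767), while a 64x64 block can reach 512 * 255 = 130560, so a wider accumulator is needed.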
    • vp9_ethread: add parallel loopfilter · e76eaf05
      Yunqing Wang authored
      1. Added row-based loopfilter in encoder;
      2. Moved common multi-threaded loopfilter functions from decoder
         to common;
      3. Merged multi-threaded loopfilter code, and made encoder/
         decoder call same function to reduce code duplication.
      Encoder tests showed that 1% - 2% speedup was seen for good-quality
      2-pass mode(at speed 3); 1% - 3% speedup using 2 threads and 4% - 6%
      speedup using 4 threads were seen for real-time mode(at speed 7).
      Change-Id: I8a4ac51c2ad9bab9fa7b864e90743931c53ec1c4
  15. 16 Jan, 2015 3 commits
  16. 15 Jan, 2015 1 commit
    • Add Neon intrinsics for vp9_avg_8x8_neon · 6e7e1cf3
      Frank Galligan authored
      On Nexus 7 speed -5, -6, -7, and -8 saw about a 1% increase
      in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 1.5%
      increase in perf for 720p.
      Tested on Nexus 7, built with ndk r10d, gcc 4.9.
      Change-Id: Ibf17ebfd952a6aec941719bd8306df8ec4574bee
  17. 14 Jan, 2015 4 commits
    • Align thread data in vp9_ethread · 99b99831
      Yunqing Wang authored
      On some platforms, such as 32bit Windows and 32bit Mac, the allocated
      memory isn't aligned automatically. The thread data is aligned to
      ensure the correct access in SIMD code.
      Change-Id: I1108c145fe982ddbd3d9324952758297120e4806
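One common way to get aligned memory on platforms whose allocator does not guarantee it is to over-allocate and round the pointer up. The sketch below illustrates that general technique only. `toy_memalign` and `toy_free` are hypothetical helpers, not the functions libvpx uses, and the sketch assumes `align` is a power of two no smaller than `sizeof(void *)`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns a pointer aligned to `align` bytes.
 * `align` must be a power of two and at least sizeof(void *).
 * The raw malloc pointer is stored just before the aligned block so that
 * toy_free() can recover it. */
static void *toy_memalign(size_t align, size_t size) {
  void *raw = malloc(size + align - 1 + sizeof(void *));
  uintptr_t addr;
  if (raw == NULL) return NULL;
  /* Leave room for the stored pointer, then round up to the alignment. */
  addr = ((uintptr_t)raw + sizeof(void *) + align - 1) &
         ~(uintptr_t)(align - 1);
  ((void **)addr)[-1] = raw;
  return (void *)addr;
}

/* Frees a block returned by toy_memalign(). NULL is ignored. */
static void toy_free(void *p) {
  if (p != NULL) free(((void **)p)[-1]);
}
```

SIMD loads and stores that require 16- or 32-byte alignment can then safely operate on the returned block.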
    • Add encoder control for setting color space · e94b415c
      Yaowu Xu authored
      This commit adds encoder side control for vp9 to set color space info
      in the output compressed bitstream.
      It also amends the "vp9_encoder_params_get_to_decoder" test to verify
      the correct color space information is passed from the encoder end to
      decoder end.
      Change-Id: Ibf5fba2edcb2a8dc37557f6fae5c7816efa52650
    • Add 64x64 sub_pel_variance Neon function · ec1d8387
      Frank Galligan authored
      On Nexus 7 speed -5, -6, -7, and -8 saw about a 15% increase
      in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 10%
      increase in perf for 720p.
      Tested on Nexus 7, built with ndk r10d, gcc 4.9.
      Change-Id: I2fa5315845e3021c9a6e2ea47e52e68b398d8334
    • Switch remaining Neon variance functions to shifts · 588f74f8
      Frank Galligan authored
      Saves 5 instructions on 8x8 and 16x16 and 8 instructions
      on 32x32, when compiled with 4.9.
      Change-Id: Id3da613a36a9d27d8c5169c59ba45d247c920c6c
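A scalar sketch of the idea, assuming the shift replaces the division in the usual variance formula, variance = sse - sum^2 / N, where N is the number of pixels in the block. Because every block size here is a power of two and sum * sum is never negative, a right shift by log2(N) gives the same result as the division. The function names are hypothetical and this is not the Neon code itself.

```c
#include <stdint.h>

/* Reference form: divide the squared sum by the pixel count n. */
static uint32_t variance_div(uint32_t sse, int sum, int n) {
  return sse - (uint32_t)(((int64_t)sum * sum) / n);
}

/* Shift form: n = 2^log2_n, so the division becomes a right shift.
 * sum * sum is non-negative, so both forms round the same way. */
static uint32_t variance_shift(uint32_t sse, int sum, int log2_n) {
  return sse - (uint32_t)(((int64_t)sum * sum) >> log2_n);
}
```

For an 8x8 block the shift would be 6 (64 pixels), for 16x16 it would be 8, and for 32x32 it would be 10.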
  18. 13 Jan, 2015 2 commits
    • [twopass temporal svc] Fix decoding error on seek. · a14415d1
      Minghai Shang authored
      Don't put a small empty frame in front of a key frame. The key frame
      flag is set in the webm container if there's a visible key frame, but
      seeking to that point causes a decoding error if the small empty
      frame, which is an inter frame, is placed in front of it.
      Change-Id: Id50c2c1fd31da0405ff6faa7375cc2f49c55402d
    • Enable decoder to pass through color space info · 6b223fcb
      Yaowu Xu authored
      This commit added a field to vpx_image_t for indicating color space;
      the field is also added to YUV_BUFFER_CONFIG. This allows the color
      space information to pass through the decoder from the input stream to
      the output buffer.
      The commit also updated the compare_img() function with added verification
      of matching color space, to ensure the color space information is
      correctly passed from encoder to decoder in compressed vp9 streams.
      Change-Id: I412776ec83defd8a09d76759aeb057b8fa690371