1. 24 Jun, 2014 1 commit
    • Reuse inter prediction result in real-time speed 6 · 0aae1000
      Yunqing Wang authored
      In real-time speed 6, no partition search is done, so the inter
      prediction results obtained during mode picking can be reused in
      the subsequent encoding process. A speed feature, reuse_inter_pred_sby,
      is added to enable the reuse only in speed 6.
      This patch doesn't change the encoding result. RTC set tests showed
      an encoding speed gain of 2% - 5%.
      Change-Id: I3884780f64ef95dd8be10562926542528713b92c
  2. 19 Jun, 2014 4 commits
    • Add superframe support for frame parallel decoding. · 1eb6e683
      hkuang authored
      A superframe is a group of frames bundled as one frame. It is mostly
      used to combine one or more non-displayable frames and one displayable frame.
      For frame parallel decoding, the libvpx decoder will only support decoding one
      normal frame or a superframe with a superframe index.
      If an application passes a superframe without a superframe index, or a chunk
      of displayable frames without a superframe index, to the libvpx decoder, libvpx
      will not decode it in frame parallel mode. The libvpx decoder can still
      decode it in serial mode.
      Change-Id: I04c9f2c828373d64e880a8c7bcade5307015ce35
    • Fixes in VP9 alloc, free, and COPY_FRAME case · b56f3af7
      Tim Kopp authored
      Change-Id: I1216f17e2206ef521fe219b6d72d8e41d1ba1147
    • Improved vp9 denoiser running avg update. · 0fec8f97
      Tim Kopp authored
      Change-Id: Ie0aa41fb7957755544321897b3bb2dd92f392027
    • Implemented COPY_BLOCK case for vp9 denoiser · ff388071
      Tim Kopp authored
      Change-Id: Ie89ad1e3aebbd474e1a0db69c1961b4d1ddcd33e
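The superframe index that commit 1eb6e683 relies on can be illustrated with a small parser. This is a minimal sketch, assuming the usual VP9 layout in which the last byte of the chunk is a marker of the form 0b110xxxxx, the same marker opens the index, and each frame size is stored little-endian. The function name parse_superframe_index is hypothetical and is not the libvpx API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the libvpx API). Returns the number of frames
 * described by a trailing superframe index, or 0 if no valid index exists.
 * Assumed layout: [marker][frames * mag size bytes][marker]. */
static int parse_superframe_index(const uint8_t *data, size_t data_sz,
                                  uint32_t sizes[8]) {
  if (data == NULL || data_sz == 0) return 0;

  const uint8_t marker = data[data_sz - 1];
  if ((marker & 0xe0) != 0xc0) return 0; /* top three bits must be 110 */

  const int frames = (marker & 0x7) + 1;       /* 1..8 frames */
  const int mag = ((marker >> 3) & 0x3) + 1;   /* 1..4 bytes per size */
  const size_t index_sz = 2 + (size_t)mag * (size_t)frames;

  /* Validate the offset before reading through it. */
  if (data_sz < index_sz) return 0;
  if (data[data_sz - index_sz] != marker) return 0;

  const uint8_t *x = data + data_sz - index_sz + 1;
  for (int i = 0; i < frames; ++i) {
    uint32_t sz = 0;
    for (int j = 0; j < mag; ++j) sz |= (uint32_t)(*x++) << (j * 8);
    sizes[i] = sz;
  }
  return frames;
}
```

A return value of 0 corresponds to the case the commit describes: a chunk without a superframe index, which the decoder would handle in serial mode rather than frame parallel mode.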
  3. 18 Jun, 2014 6 commits
    • Changed buf_2ds in vp9 denoiser to YV12 buffers · 2614e56c
      Tim Kopp authored
      Changed alloc, free, and running average code as necessary.
      Change-Id: Ifc4d9ccca462164214019963b3768a457791b9c1
    • Update running avg for VP9 denoiser · a4b7a713
      Tim Kopp authored
      Change-Id: I9577d648542064052795bf5770428fbd5c276b7b
    • Implemented vp9_denoiser_{alloc,free}() · 2a720673
      Tim Kopp authored
      Change-Id: I79eba79f7c52eec19ef2356278597e06620d5e27
    • Modify non-rd intra mode checking · 55834d42
      Yunqing Wang authored
      Speed 6 uses small tx size, namely 8x8. max_intra_bsize needs to
      be modified accordingly to ensure valid intra mode checking.
      Borg test on RTC set showed an overall PSNR gain of 0.335% in speed
      This also changes speed -5 encoding by allowing DC_PRED checking
      for block32x32. Borg test on RTC set showed a slight PSNR gain of
      0.145%, and no noticeable speed change.
      Change-Id: I1502978d8fbe265b3bb235db0f9c35ba0703cd45
    • Separate rate-distortion modeling for DC and AC coefficients · 7c45dc98
      Jingning Han authored
      This is the first step to rework the rate-distortion modeling used
      in rtc coding mode. The overall goal is to make the modeling
      customized for the statistics encountered in the rtc coding.
      This commit makes the encoder perform rate-distortion modeling for
      DC and AC coefficients separately. No speed changes were observed.
      The coding performance for pedestrian_area_1080p is largely
      speed -5, from 79558 b/f, 37.871 dB -> 79598 b/f, 38.600 dB
      speed -6, from 79515 b/f, 37.822 dB -> 79544 b/f, 38.130 dB
      Overall performance for rtc set at speed -6 is improved by 0.67%.
      Change-Id: I9153444567e5f75ccdcaac043c2365992c005c0c
    • Improve vp9_rb_bytes_read · dbd1184a
      Adrian Grange authored
      Change-Id: I69eba120eb3d8ec43b5552451c8a9bd009390795
  4. 16 Jun, 2014 1 commit
    • skip the un-necessary motion search in the first pass · cdc954fd
      Pengchong Jin authored
      This patch allows the VP9 encoder to skip the unnecessary
      motion search in the first pass. It computes the motion error
      of (0,0) motion using the last source frame as the reference,
      and skips further motion search if this error is small.
      Borg tests show that overall the patch gives a PSNR gain (derf -0.001%,
      yt 0.341%, hd 0.282%). Individual clips may have a PSNR gain or
      loss. The best PSNR change is 7.347% and the worst is -0.662%.
      The first pass encoding speedup for slideshow clips is over 30%.
      Change-Id: I4cac4dbd911f277ee858e161f3ca652c771344fe
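The zero-motion early exit described in commit cdc954fd can be sketched as follows. This is only an illustration under stated assumptions: the commit does not say which error metric or threshold is used, so the sum of squared differences and the thresh parameter here are assumptions, and both function names are hypothetical.

```c
#include <stdint.h>

/* Error of the (0,0) motion vector: sum of squared differences between
 * the current source block and the co-located block of the last source
 * frame. The choice of SSE as the metric is an assumption. */
static unsigned int zero_mv_error(const uint8_t *src, int src_stride,
                                  const uint8_t *last_src, int last_stride,
                                  int w, int h) {
  unsigned int sse = 0;
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      const int d = src[r * src_stride + c] - last_src[r * last_stride + c];
      sse += (unsigned int)(d * d);
    }
  }
  return sse;
}

/* Returns 1 when the zero-motion error is small enough that further
 * motion search can be skipped. `thresh` is a hypothetical tuning value. */
static int skip_motion_search(const uint8_t *src, int src_stride,
                              const uint8_t *last_src, int last_stride,
                              int w, int h, unsigned int thresh) {
  return zero_mv_error(src, src_stride, last_src, last_stride, w, h) < thresh;
}
```

Static content such as slideshow clips yields near-zero error for most blocks, which is consistent with the large first-pass speedup the commit reports for that kind of material.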
  5. 15 Jun, 2014 1 commit
  6. 13 Jun, 2014 7 commits
    • Fix C versions of DC calculation functions · 6b0bc34b
      Jingning Han authored
      This commit fixes the scaling factors used in the C versions of the
      DC calculation functions.
      Change-Id: Iab41108c2bb93c2f2e78667214f3a772a2b707b5
    • Moving RD-opt related code from vp9_encoder.h to vp9_rdopt.h. · 3f8508eb
      Dmitry Kovalev authored
      Change-Id: I8fab776c8801e19d3f5027ed55a6aa69eee951de
    • Replacing RC_MODE with vpx_rc_mode. · bcfbd2f9
      Dmitry Kovalev authored
      Both enums are identical.
      Change-Id: I06653f9c90a2d3a2dd5c741e75b17ee7d066a56f
    • Fix out of boundary memory read in fuzz test on vpxdec · 1ba18717
      Jingning Han authored
      This commit fixes frame header decoding for superframe index, to
      prevent out of boundary memory read triggered by fuzz test
      vector. It resolves a chromium security violation issue
      The issue was introduced in the change:
      Add VPXD_SET_DECRYPTOR support to the VP9 decoder.
      cl-id I88f86c8ff9af34e0b6531028b691921b54c2fc48
      where the buffer was read before the validation check on the index offset.
      A test vector is added accordingly.
      Change-Id: I41c988e776bbdd1033312a668e03a3dbcf44ca99
    • Revert "skip un-neccessary motion search in the first pass" · af8d4054
      Paul Wilkins authored
      This patch appears to have introduced non-determinism and/or
      a mismatch between debug and release builds.
      This reverts commit 5daef90e.
      Change-Id: I80081e55cfeaaa821b510b58a4e6e6328003c7da
    • Delay decreasing reference count in frame-parallel decoding. · e4c5f7e2
      hkuang authored
      The current decoding scheme decreases the reference count
      of the output frame when decoding finishes. The application
      can then copy the frame from the decoder buffer to the application buffer.
      In frame-parallel decoding, a decoded frame is not output
      until several frames later; the delay depends on the number of threads. So
      the decoded frame's reference count should be decreased only
      after the application finishes copying the frame out. But due to the
      limitation of vpx_codec_get_frame, the decoder cannot know when
      the application finishes decoding. So use an index, last_show_frame, to
      release the last output frame's reference count.
      Change-Id: I403ee0d01148ac1182e5a2d87cf7dcc302b51e63
    • Use lrand48 on Android · 79afb5eb
      Johann authored
      When building x86 assembly, use lrand48 instead of the
      undocumented inlined _rand function.
      Android now supports rand(),
      but only in newer versions. Original workaround:
      Change-Id: I130566837d5bfc9e54187ebe9807350d1a7dab2a
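The deferred release described in commit e4c5f7e2 can be sketched like this. Only the name last_show_frame comes from the commit message. The types, the pool size, and the function names are hypothetical, and the sketch assumes each decoded frame already holds one reference when it is output.

```c
#define NUM_FRAME_BUFS 4 /* hypothetical pool size */

typedef struct {
  int ref_count;
} FrameBuf;

typedef struct {
  FrameBuf bufs[NUM_FRAME_BUFS];
  int last_show_frame; /* index of the previously output frame, -1 if none */
} DecoderSketch;

static void decoder_sketch_init(DecoderSketch *d) {
  for (int i = 0; i < NUM_FRAME_BUFS; ++i) d->bufs[i].ref_count = 0;
  d->last_show_frame = -1;
}

/* Called when frame `idx` is handed to the application. The frame being
 * output keeps its reference; the reference of the frame output on the
 * previous call is released instead, because by now the application is
 * assumed to have finished copying it. */
static void output_frame(DecoderSketch *d, int idx) {
  if (d->last_show_frame >= 0) {
    --d->bufs[d->last_show_frame].ref_count;
  }
  d->last_show_frame = idx;
}
```

The design trades one extra buffer held at any time for not needing an explicit "done copying" signal from the application.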
  7. 12 Jun, 2014 11 commits
    • Added skeleton for VP9 denoiser · ab8bfb07
      Tim Kopp authored
      Change-Id: Iccf6ede4c4f85646b0f8daec47050ce93e267c90
    • Adding MV_SPEED_FEATURES struct. · 4ff1a614
      Dmitry Kovalev authored
      Moving all motion vector related speed parameters from SPEED_FEATURES to
      Change-Id: I3e9af0039c7162f8671878c5920bce3cb256a84e
    • Moving full_pixel_search() to vp9_mcomp.c. · 442cbf56
      Dmitry Kovalev authored
      Change-Id: I12389f801ebd3bd2ae3bf31e125433bfb429ee65
    • Adding is_altref_enabled() function. · 86583b2b
      Dmitry Kovalev authored
      Change-Id: I54cdb4ce11590511e6f86bc2fd55771f1c18a20a
    • Replacing txfm_size with tx_size. · 4345d12d
      Dmitry Kovalev authored
      Change-Id: Ifa6374e9db5919322733b656e0865f5f19ee6f2c
    • Removing unused ssim_weighted_pred_err field from FIRSTPASS_STATS. · eaeda536
      Dmitry Kovalev authored
      Change-Id: Ia8c7e3905ac21732cb6b8099eaf8df72c7e36b73
    • Fast computation path for forward transform and quantization · ccba289f
      Jingning Han authored
      This commit enables a fast path computational flow for forward
      transformation. It checks the sse and variance of prediction
      residuals and decides if the quantized coefficients are all
      zero, dc only, or more. It then selects the corresponding coding
      path in the forward transformation and quantization stage.
      It is currently enabled in rtc coding mode. Will do it for rd
      coding mode next.
      In speed -6, the runtime for pedestrian_area 1080p at 1000 kbps
      goes down from 14234 ms to 13704 ms, i.e., about 4% speed-up.
      Overall coding performance for rtc set is changed by -0.18%.
      Change-Id: I0452da1786d59bc8bcbe0a35fdae9f623d1d44e1
    • skip un-neccessary motion search in the first pass · 5daef90e
      Pengchong Jin authored
      This patch allows the encoder to skip the
      unnecessary motion search in the first pass. It
      calculates the error of the zero motion vector using
      the last source frame as reference and skips
      further motion search in the first pass if the error
      is small.
      The encoding speedup of the first pass for slideshow
      videos is over 30%. Borg tests show the overall PSNR
      performance remains approximately the same (derf -0.009,
      hd 0.387, yt 0.021, stdhd 0.065). Individual clips may
      have either a PSNR gain or loss. The worst PSNR performance
      is from the yt set, with a PSNR loss of -1.1.
      Change-Id: I08b2ab110b695e4689573b2567fa531b6457616e
    • Fix SEG_LVL_SKIP in non-RD inter mode selection. · 6c3f311b
      Alex Converse authored
      Add a set_mode_info_seg_skip function that fills the requisite mode info.
      Change-Id: I460b1b6845d720d9b09ed5b64df0ea0aac443f62
    • Fix SEG_LVL_SKIP in RD inter mode selection. · b0a8057f
      Alex Converse authored
      * Only use ZEROMV, disallowing the intra modes that were previously
      * Score rate and distortion as zero.
      Change-Id: Ifcf99e272095725f11da1dcd26bd0f850683e680
    • Initially add frame_parallel_decode flag. · 537cb060
      hkuang authored
      Stub flag temporarily set to 0 until frame parallel
      decoding implementations are finished.
      Change-Id: I8ab768138e8f8f8eb809875703b2502ea0fe7cea
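The fast path from commit ccba289f (choosing between all-zero, DC-only, and full transform paths from the SSE and variance of the residual) might look roughly like the sketch below. The commit message does not give the decision rule, so the energy split and both thresholds are assumptions made for illustration, and every identifier is hypothetical.

```c
#include <stdint.h>

typedef enum { TXFM_SKIP_ALL, TXFM_DC_ONLY, TXFM_FULL } TxfmPath;

/* Splits the residual energy into a DC part (sum^2 / n) and an AC part
 * (sse - sum^2 / n, i.e. n times the variance), then compares each part
 * with a hypothetical threshold standing in for the quantizer step. */
static TxfmPath select_txfm_path(const int16_t *residual, int n,
                                 int64_t dc_thresh, int64_t ac_thresh) {
  int64_t sum = 0;
  int64_t sse = 0;
  for (int i = 0; i < n; ++i) {
    sum += residual[i];
    sse += (int64_t)residual[i] * residual[i];
  }
  const int64_t dc_energy = (sum * sum) / n;
  const int64_t ac_energy = sse - dc_energy;

  if (ac_energy >= ac_thresh) return TXFM_FULL; /* AC may survive quantization */
  return dc_energy < dc_thresh ? TXFM_SKIP_ALL : TXFM_DC_ONLY;
}
```

With a rule of this shape, flat or well-predicted blocks avoid the full forward transform, which is where a speed-up like the one reported in the commit would come from.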
  8. 10 Jun, 2014 7 commits
    • vp9_rtcd: correct avx2 references · 9f3a0dbb
      James Zern authored
      avx2 code is all intrinsics and as a result doesn't rely on x86inc.asm
      Change-Id: I76ad39474d8a00658f3e43131830ef0f4f34772a
    • vp9_sub_pixel_*variance*: disable avx2 variants · 520cb3f3
      James Zern authored
      tests failing under Win32/Win64
      + variance_test: add missing avx2 functions (partially disabled)
      Change-Id: I6abc0657ea076379ab9ca65c12678b9ea199849d
    • vp9_sad*x4d: disable avx2 variants · d3ff009d
      James Zern authored
      tests failing under Win32/Win64
      + sad_test: add missing avx2 functions (disabled)
      Change-Id: I8224fba2b270f6039ab1877d71e1e512f0081856
    • Add mode info arrays and mode info index. · cdffeaaa
      hkuang authored
      In non-frame-parallel decoding, this works the same way as
      the current decoding scheme. Each time the decoder finishes
      decoding a frame, it swaps the current mode info pointer
      and the previous mode info pointer if the decoded frame needs
      to be shown. Both the mode info pointer and the previous mode info
      pointer come from the mode info arrays.
      In frame-parallel decoding, this becomes more complicated,
      as the current frame's mode info pointer is shared with the next
      frame as its previous mode info pointer. But when one decoder
      thread finishes decoding one frame and starts to work on the next
      available frame, it needs to retain the decoded frame's mode
      info pointers until the next frame finishes decoding. The mode info
      index serves this purpose. The decoder will use a different
      buffer in the mode info arrays and use the other buffer to save
      the previously decoded frame's mode info.
      Change-Id: If11d57d8eb0ee38c8876158e5482177fcb229428
    • Removing two unused TX_SIZE_SEARCH_METHOD members. · bc93f425
      Dmitry Kovalev authored
      Change-Id: I33a38bb9f46e7ef509bbbf0cfd7bc3ea5072d022
    • vp9_f(dct|ht): disable avx2 variants · dd9f5029
      James Zern authored
      tests failing under Win32/Win64
      + dct16x16_test: add missing avx2 functions (partially disabled)
      exercises the forward transforms
      no idct/iht implementations, so the c-code is used
      Change-Id: I04f64a457fa0828a00f32b5c9fe4f55294f21f61
    • convolve: disable avx2 variants · 5704578f
      James Zern authored
      tests failing under Win32/Win64
      Change-Id: I5d49d11911bcda3a832b14efe5500d22597bedcf
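The mode info arrays and index from commit cdffeaaa can be pictured as a two-slot buffer with a toggling index. This is a simplified sketch of that idea only: the struct layout, the field names, and the function names are hypothetical and do not reflect the actual libvpx structures.

```c
#include <stddef.h>

typedef struct {
  int mode; /* placeholder for per-block mode info */
} ModeInfoSketch;

typedef struct {
  ModeInfoSketch *mip_array[2]; /* two mode info buffers */
  int mi_idx;                   /* slot used by the current frame */
  int prev_mi_idx;              /* slot holding the previous frame's info */
  ModeInfoSketch *mip;          /* current mode info pointer */
  ModeInfoSketch *prev_mip;     /* previous mode info pointer */
} CommonSketch;

static void mode_info_init(CommonSketch *cm, ModeInfoSketch *buf0,
                           ModeInfoSketch *buf1) {
  cm->mip_array[0] = buf0;
  cm->mip_array[1] = buf1;
  cm->mi_idx = 0;
  cm->prev_mi_idx = 1;
  cm->mip = cm->mip_array[cm->mi_idx];
  cm->prev_mip = cm->mip_array[cm->prev_mi_idx];
}

/* After a shown frame is decoded, the slot it used becomes the "previous"
 * slot and the other slot is reused for the next frame. */
static void swap_mode_info(CommonSketch *cm) {
  cm->prev_mi_idx = cm->mi_idx;
  cm->mi_idx ^= 1;
  cm->mip = cm->mip_array[cm->mi_idx];
  cm->prev_mip = cm->mip_array[cm->prev_mi_idx];
}
```

Tracking indices rather than only raw pointers lets a decoder thread record which slot holds the finished frame's mode info while it moves on to the next frame.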
  9. 09 Jun, 2014 1 commit
    • Use small transform size in non-rd real-time mode · b04d7668
      Yunqing Wang authored
      In non-rd real-time mode, choosing a smaller transform size in
      encoding gives better video quality and a good speed gain compared
      with choosing a larger transform size. This patch sets the tx size
      search method to ALLOW_8X8, which is better than using 4x4 or other
      larger sizes.
      Borg tests on the rtc set at speed 6 showed a significant quality gain.
      PSNR gain: 11.034% and SSIM gain: 15.466%.
      The speed gain is 5% - 12% for <720p clips, and 2% - 7% for
      720p clips.
      Change-Id: If4dc74ed2df359346b059f47fb73b4a0193ec548
  10. 06 Jun, 2014 1 commit
    • Revert "Removing this_frame_stats member from TWO_PASS struct." · a4f74792
      Adrian Grange authored
      The stack variable "fps" was used beyond the lifetime of its function:
      fps is passed as a parameter to output_stats and stored in the
      packet holding this encoded frame, and that packet outlives the
      calling function.
      This reverts commit 3f95a230
      Change-Id: Icd8e14b3d7dd733590ada12e619b9dce95b6b0f5
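The lifetime problem behind revert a4f74792 can be shown with a reduced example. The names this_frame_stats and output_stats come from the commit message. The struct definitions, the packet type, and first_pass_sketch are simplified stand-ins invented for this sketch and are not the libvpx definitions.

```c
#include <stddef.h>

typedef struct {
  double frame;
  double intra_error;
} FirstpassStatsSketch;

typedef struct {
  const FirstpassStatsSketch *stats; /* pointer kept after the call returns */
} PacketSketch;

typedef struct {
  FirstpassStatsSketch this_frame_stats; /* lives as long as the encoder */
} TwoPassSketch;

/* Stores a pointer to the stats in the packet; it does not copy them. */
static void output_stats(const FirstpassStatsSketch *stats, PacketSketch *pkt) {
  pkt->stats = stats;
}

static void first_pass_sketch(TwoPassSketch *twopass, PacketSketch *pkt,
                              double intra_error) {
  /* Filling a local FirstpassStatsSketch here and passing its address to
   * output_stats would leave pkt->stats dangling once this function
   * returns. Using the struct member keeps the storage alive. */
  twopass->this_frame_stats.frame = 0.0;
  twopass->this_frame_stats.intra_error = intra_error;
  output_stats(&twopass->this_frame_stats, pkt);
}
```

Keeping the stats in the long-lived struct is what restoring the this_frame_stats member achieves; copying the stats into the packet would be another way to avoid the dangling pointer.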