1. 11 Sep, 2013 1 commit
    • New mode_info_context storage -- undo revert · ac6093d1
      Scott LaVarnway authored
      mode_info_context was stored as a grid of MODE_INFO structs.
      The grid now consists of pointers to MODE_INFO structs. The
      MODE_INFO structs are now stored as a stream (decoder only),
      which eliminates unnecessary copies and is a little more
      cache friendly.
      
      Change-Id: I031d376284c6eb98a38ad5595b797f048a6cfc0d
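
The layout described in this commit message can be sketched roughly as follows. This is an illustration only, not the libvpx code: `ModeInfoSketch`, `ModeInfoStore`, and the three helper functions are hypothetical names, and the real MODE_INFO struct carries far more than one field. The sketch assumes one struct per decoded block, with every grid cell the block covers pointing at that single struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real MODE_INFO struct. */
typedef struct {
  int mode;
} ModeInfoSketch;

typedef struct {
  ModeInfoSketch **grid;  /* rows * cols pointers, one per grid cell */
  ModeInfoSketch *stream; /* structs stored contiguously, in decode order */
  size_t stream_len;      /* number of structs handed out so far */
  size_t capacity;        /* rows * cols */
  int rows, cols;
} ModeInfoStore;

/* Allocate an empty store. Returns 0 on success, -1 on allocation failure. */
static int store_init(ModeInfoStore *s, int rows, int cols) {
  assert(s != NULL && rows > 0 && cols > 0);
  s->rows = rows;
  s->cols = cols;
  s->stream_len = 0;
  s->capacity = (size_t)rows * (size_t)cols;
  s->grid = (ModeInfoSketch **)calloc(s->capacity, sizeof(*s->grid));
  s->stream = (ModeInfoSketch *)calloc(s->capacity, sizeof(*s->stream));
  if (s->grid == NULL || s->stream == NULL) {
    free(s->grid);
    free(s->stream);
    s->grid = NULL;
    s->stream = NULL;
    return -1;
  }
  return 0;
}

/* Take the next struct from the stream and point every grid cell covered
 * by the block at it. The struct itself is never copied per cell. */
static ModeInfoSketch *store_add_block(ModeInfoStore *s, int row, int col,
                                       int bh, int bw, int mode) {
  ModeInfoSketch *mi;
  int r, c;
  assert(s->stream_len < s->capacity);
  mi = &s->stream[s->stream_len++];
  mi->mode = mode;
  for (r = row; r < row + bh && r < s->rows; ++r)
    for (c = col; c < col + bw && c < s->cols; ++c)
      s->grid[r * s->cols + c] = mi;
  return mi;
}

static void store_free(ModeInfoStore *s) {
  free(s->grid);
  free(s->stream);
  s->grid = NULL;
  s->stream = NULL;
}
```

Under these assumptions, a 2x2 block written at the origin leaves four grid cells sharing one struct, and consecutive blocks sit next to each other in the stream, which is where the cache-friendliness would come from.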
  2. 10 Sep, 2013 5 commits
    • Stop partition checking when distortion is small · 0607abc3
      Yunqing Wang authored
      If the distortion obtained for the current partition is very small,
      which happens in the static-image case, we pick the current
      partition type without checking further splits.
      
      This won't affect regular videos. For static videos, we got a 10%~12%
      encoding speed gain. PSNR was better for some clips and worse for
      others. Overall it was even.
      
      Change-Id: If787a57bedf46fc595ca4f5ded2b0c0a69e9fdef
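
A minimal sketch of the early exit described in this commit message, assuming the split search is expensive and can be modelled as a callback. The names `choose_partition`, `SplitCostFn`, and `counting_split_cost`, the threshold, and the cost values are all hypothetical; the encoder's actual condition is not given in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback that runs the (expensive) split search and
 * returns its rate-distortion cost. */
typedef int64_t (*SplitCostFn)(void *ctx);

/* Returns 0 to keep the current partition type, 1 to split.
 * When the distortion is already below the threshold, the split search
 * is never invoked. */
static int choose_partition(int64_t dist_none, int64_t rd_none,
                            int64_t dist_thresh, SplitCostFn split_cost,
                            void *ctx) {
  assert(dist_none >= 0 && split_cost != NULL);
  if (dist_none < dist_thresh) return 0; /* static content: stop here */
  return split_cost(ctx) < rd_none ? 1 : 0;
}

/* Demo callback: counts how often the split search runs and reports a
 * fixed, made-up cost of 100. */
static int64_t counting_split_cost(void *ctx) {
  int *calls = (int *)ctx;
  ++*calls;
  return 100;
}
```

With a threshold of 10, a block whose distortion is 2 keeps its partition without the callback ever running, while a block with distortion 50 triggers the split search once.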
    • Modify encode breakout for static frames · 939791a1
      Yunqing Wang authored
      Thanks to Paul for the suggestions. When static-thresh was turned on
      for static-image videos, a big jump in bitrate was seen. This
      patch detects static frames in the video using first-pass
      stats. Depending on the case, it disables encode breakout or reduces
      the encode breakout threshold to limit the skipping.

      More modifications are needed to break the incorrect partition
      picking pattern for static frames when skipping happens.
      
      Change-Id: Ia25f47041af0f04e229c70a0185e12b0ffa6047f
    • Modified mode skip functionality. · 4f660cc0
      Paul Wilkins authored
      A previous speed feature skipped modes not used in earlier
      partitions, but this no longer worked as intended following
      changes to the partition coding order and in conjunction
      with some other speed features (especially at speed 2 and above).
      
      This modified mode skip feature sets a mask after the first X
      modes have been tested in each partition depending on the
      reference frame of the current best case.
      
      This patch also makes some changes to the order modes are
      tested to fit better with this skip functionality.
      
      Initial testing suggests speed and rd hit count improvements
      of up to 20% at speed 1. Quality results: derf -1.9%, std hd +0.23%.
      
      Change-Id: Idd8efa656cbc0c28f06d09690984c1f18b1115e1
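
The masking idea in this commit message might look something like the sketch below. The message does not say how the mask is derived from the best mode's reference frame, so the policy here (skip every remaining mode that uses a different reference frame) is an assumption, as are the names `ModeCandidate`, `search_modes`, and `switch_after` (standing in for "X").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical candidate: the reference frame a mode predicts from and
 * the rd cost it would yield if tested. */
typedef struct {
  int ref_frame;
  int64_t rd_cost;
} ModeCandidate;

/* Test modes in order. Once `switch_after` modes have been tested, mask
 * out every remaining mode whose reference frame differs from that of
 * the best mode so far (assumed policy). Returns the index of the best
 * mode and stores the number of modes actually tested in *tested. */
static int search_modes(const ModeCandidate *modes, int num_modes,
                        int switch_after, int *tested) {
  uint32_t skip_mask = 0; /* bit i set => mode i is skipped */
  int best = -1;
  int i, j;
  assert(modes != NULL && tested != NULL);
  assert(num_modes > 0 && num_modes <= 32);
  *tested = 0;
  for (i = 0; i < num_modes; ++i) {
    if (i == switch_after && best >= 0) {
      for (j = i; j < num_modes; ++j)
        if (modes[j].ref_frame != modes[best].ref_frame)
          skip_mask |= (uint32_t)1 << j;
    }
    if (skip_mask & ((uint32_t)1 << i)) continue;
    ++*tested;
    if (best < 0 || modes[i].rd_cost < modes[best].rd_cost) best = i;
  }
  return best;
}
```

A mask like this trades quality for speed: a masked mode is never evaluated, so a cheaper mode on another reference frame can be missed, which is consistent with the small quality changes the message reports.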
    • Added extra check to rd_auto_partition_range() · 901c4954
      Paul Wilkins authored
      Added a check that the returned maximum and minimum are
      valid in the bottom and right border cases.
      
      Change-Id: I2d6cdc9b5f04c7d0ff512ddcf3228331e028bf9b
    • Speed up idct16x16 by rearranging instructions. · fc5ec206
      hkuang authored
      Speed improves from 376% faster to 400% faster, based on assembly-perf.
      
      Change-Id: If0b2eccc39d5793dc101ce9feb7fcadf88396ea2
  3. 09 Sep, 2013 2 commits
    • API extensions and sample app for spatial scalable encoder · 01b35c3c
      Ivan Maltz authored
      Sample app: vp9_spatial_scalable_encoder
      vpx_codec_control extensions:
        VP9E_SET_SVC
        VP9E_SET_WIDTH, VP9E_SET_HEIGHT, VP9E_SET_LAYER
        VP9E_SET_MIN_Q, VP9E_SET_MAX_Q
      expanded buffer size for vp9_convolve
      
      modified setting of initial width in vp9_onyx_if.c so that layer size
      can be set prior to initial encode
      
      Default number of layers set to 3 (VPX_SS_DEFAULT_LAYERS)
      Number of layers set explicitly in vpx_codec_enc_cfg.ss_number_layers
      
      Change-Id: I2c7a6fe6d665113671337032f7ad032430ac4197
    • Revert "New mode_info_context storage" · 54a03e20
      James Zern authored
      This reverts commit dae17734
      
      Causes encode crashes and leaks, and increases integer overflow errors.
      
      Change-Id: I595aa2649bb8d0b6552ff91652837a74c103fda2
  4. 08 Sep, 2013 1 commit
    • Reduce the amount of extension in src frames · 65c2444e
      Yaowu Xu authored
      The commit changes the border pixel extension from 160 pixels on each
      side to what is necessary for the arnr filter or motion estimation,
      i.e. 16 pixels on the top and left sides. For the right and bottom
      sides, the extension either rounds the image size up to a multiple
      of 64 or is at least 16 pixels.
      
      Change-Id: Ic05e19b94368c1ab4df568723aae5734e6c3d2c5
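
The right/bottom rule in this commit message can be read as "pad to the next multiple of 64, but never by less than 16 pixels". The sketch below follows that reading, which is an assumption; `align_up`, `trailing_border`, and the two constants are hypothetical names, not libvpx identifiers.

```c
#include <assert.h>

enum { SKETCH_MIN_BORDER = 16, SKETCH_ALIGN = 64 };

/* Round v up to the next multiple of a (a > 0). */
static int align_up(int v, int a) {
  return (v + a - 1) / a * a;
}

/* Extension on the right or bottom side for an image dimension `size`:
 * the padding needed to reach a multiple of 64, or 16 pixels if that
 * padding would be smaller (assumed interpretation). */
static int trailing_border(int size) {
  int pad;
  assert(size > 0);
  pad = align_up(size, SKETCH_ALIGN) - size;
  return pad > SKETCH_MIN_BORDER ? pad : SKETCH_MIN_BORDER;
}
```

Under this reading, a 1920-pixel width is already a multiple of 64 and gets the 16-pixel minimum, a 1080-pixel height needs only 8 pixels to reach 1088 and so also gets 16, and a 360-pixel dimension is padded by 24 to reach 384.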
  5. 07 Sep, 2013 1 commit
    • Fix overflow issue in 16x16 quantization SSSE3 · 09bc942b
      Jingning Han authored
      The 16x16 transform unit test suggested that the peak coefficient
      value can reach 32639. This could cause an overflow in the SSSE3
      implementation of 16x16 block quantization. This commit fixes the
      issue by replacing addition with saturated addition.
      
      Change-Id: I6d5bb7c5faad4a927be53292324bd2728690717e
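
To see why a peak of 32639 is a problem in 16-bit lanes, here is a scalar sketch contrasting wrapping and saturating addition. It models the arithmetic only; the actual fix is in SSSE3 assembly, and the rounding offset of 200 used in the example is a made-up value. `wrap_add16` and `sat_add16` are hypothetical helper names; the saturating one behaves like the SSE2 intrinsic `_mm_adds_epi16` does per lane.

```c
#include <assert.h>
#include <stdint.h>

/* Two's-complement wrap-around, as a plain 16-bit SIMD add would do. */
static int16_t wrap_add16(int16_t a, int16_t b) {
  int32_t sum = (int32_t)a + (int32_t)b;
  if (sum > INT16_MAX) sum -= 65536;
  if (sum < INT16_MIN) sum += 65536;
  return (int16_t)sum;
}

/* Saturating add: clamps to the int16_t range instead of wrapping. */
static int16_t sat_add16(int16_t a, int16_t b) {
  int32_t sum = (int32_t)a + (int32_t)b;
  assert(sizeof(int16_t) == 2);
  if (sum > INT16_MAX) return INT16_MAX;
  if (sum < INT16_MIN) return INT16_MIN;
  return (int16_t)sum;
}
```

Adding 200 to 32639 wraps to -32697 with the plain add, flipping the sign of a large coefficient, whereas the saturating add clamps to 32767.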
  6. 06 Sep, 2013 3 commits
    • Enable kf restrictions at speed 4 · f15cdc74
      Paul Wilkins authored
      Change-Id: I453409d3be3f5fe118b15affde45cb52184aef20
    • Support a constant quality mode in VP9 · e378a89b
      Deb Mukherjee authored
      Adds a new end-usage option for constant quality encoding in vpx. This
      first version, implemented for VP9, encodes all regular inter frames
      using the quality specified in the --cq-level= option, while encoding
      all key frames and golden/altref frames at a quality better than that.
      
      The current performance on derfraw300 is +0.910% up from bitrate control,
      but achieved without multiple recode loops per frame.
      
      The decision for qp for each altref/golden/key frame will be improved
      in subsequent patches based on better use of stats from the first pass.
      Further, the qp for regular inter frames may also be varied around the
      provided cq-level.
      
      Change-Id: I6c4a2a68563679d60e0616ebcb11698578615fb3
    • New mode_info_context storage · dae17734
      Scott LaVarnway authored
      mode_info_context was stored as a grid of MODE_INFO structs.
      The grid now consists of a pointer to a MODE_INFO struct and
      an "in the image" flag. The MODE_INFO structs are now stored
      as a stream, which eliminates unnecessary copies and is a little
      more cache friendly.
      
      For the test clips used, the decoder performance improved
      by ~4.3% (1080p) and ~9.7% (720p).
      
      Patch Set 2: Re-encoded clips with latest. Now ~1.7% (1080p)
      and 5.9% (720p).
      
      Change-Id: I846f29e88610fce2523ca697a9a9ef2a182e9256
  7. 05 Sep, 2013 3 commits
  8. 04 Sep, 2013 9 commits
  9. 03 Sep, 2013 1 commit
    • Attempt to fix speed 4 · 49317cdd
      Paul Wilkins authored
      Speed 4 uses a fixed partition size. Use the fixed size unless it does
      not fit inside the image, in which case use the largest size that does.
      
      Change-Id: I250f7a80506750dd82ab355721624a1344247223
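
The fallback rule in this commit message can be sketched as below, assuming square block sizes only and a smallest size of 8. VP9 also has rectangular partitions, which this sketch ignores; `kSketchBlockSizes` and `pick_block_size` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Square block sizes in pixels, largest first (simplifying assumption). */
static const int kSketchBlockSizes[] = {64, 32, 16, 8};

/* Return the fixed size if it fits in the remaining rows and columns,
 * otherwise the largest listed size that does fit. Falls back to the
 * smallest size when nothing fits entirely. */
static int pick_block_size(int fixed_size, int rows_left, int cols_left) {
  size_t i;
  const size_t n = sizeof(kSketchBlockSizes) / sizeof(kSketchBlockSizes[0]);
  assert(rows_left > 0 && cols_left > 0);
  for (i = 0; i < n; ++i) {
    const int size = kSketchBlockSizes[i];
    if (size <= fixed_size && size <= rows_left && size <= cols_left)
      return size;
  }
  return kSketchBlockSizes[n - 1];
}
```

For a fixed size of 32, a block with plenty of room stays at 32, while one with only 20 columns left before the frame edge drops to 16.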
  10. 01 Sep, 2013 1 commit
    • Fix 32x32 forward transform SSE2 version · 3cf46fa5
      Jingning Han authored
      This commit fixed the potential overflow issue in the SSE2
      implementation of 32x32 forward DCT. It resolved the corrupted
      coded frames in the border of scenes.
      
      Change-Id: If87eef2d46209269f74ef27e7295b6707fbf56f9
  11. 30 Aug, 2013 3 commits
    • Use correct bit cost while static-thresh is on · 0ca7855f
      Yunqing Wang authored
      While static-thresh is on, we only need to transmit the skip
      flag if skip = 1. The cost of the skip bit is added to the
      total rate cost.
      
      Change-Id: I64e73e482bc297eba22907026298a15fa8cc3920
    • Fix intermediate height in convolve_c · e326cecf
      Tero Rintaluoma authored
      - Intermediate height was not correct, e.g. when the block size is 4
        and y_step_q4 is 6. In this case the intermediate height was
        (4*6) >> 4 = 1, but vertical interpolation needs two source pixels
        plus 7 extra pixels for taps.
      - Also if the current output block is 16x16 and we are using 4x upscaling
        we need only 12 rows after horizontal filtering instead of 16.
      
        Patch Set 2: Intermediate_height updated after CL 66723
                     "Fix bug in convolution functions (filter selection)"
      
      Change-Id: I5a1a1bc2ac9d5edb3a6e0818de618bf318fdd589
    • rework filter_block_plane · bc50961a
      Jim Bankoski authored
      Change-Id: I55c3b60c4c0f4910d3dfb70e3edaae00cfa8dc4d
  12. 29 Aug, 2013 7 commits
    • Added per pixel inter rd hit count stats · 1f4bf79d
      Paul Wilkins authored
      Added some code to output normalized rd hit count stats.
      In effect this approximates to the average number of rd
      operations/tests per pixel for the sequence.
      
      The results are not quite accurate, and I have not bothered
      to account for partial SB64s at frame edges or for key frames.
      However, they do give some idea of the number of modes /
      prediction methods being tested for each pixel across the
      different partition sizes. This indicates how much scope there
      is for further gains, either by reducing the number of partitions
      examined or the modes per partition through heuristics.

      Patch 3 moved the place where the count is incremented so that partial
      rd tests that are aborted with an INT_MAX return are also counted.
      
      Example numbers for first 50 frames of Akiyo.
      Speed 0 ~84.4 rd operations / pixel
      Speed 1 ~28.8
      Speed 2 ~11.9
      
      Change-Id: Ib956e787e12f7fa8b12d3a1a2f6cda19a65a6cb8
    • consistently name VP9_COMMON variables #3 · d765df27
      James Zern authored
      stragglers
      
      Change-Id: Ib1e853f9a331b7b66639dc34d79568d84d1930f1
    • consistently name VP9_COMMON variables #2 · aa053212
      James Zern authored
      oci -> cm
      
      Change-Id: Ifd75c809d9cc99034d3c2fccc4653a78b3aec21f
    • consistently name VP9_COMMON variables #1 · 924d7451
      James Zern authored
      pc -> cm
      
      Change-Id: If3e83404f574316fdd3b9aace2487b64efdb66f3
    • Fix overflow issue in SSSE3 32x32 quantization · abff6788
      Jingning Han authored
      The 32x32 quantization process can potentially have the intermediate
      stacks over 16-bit range, thereby causing enc/dec mismatch. This commit
      fixes this overflow issue in the SSSE3 implementation, as well as the
      prototype, of 32x32 quantization.
      
      This fixes issue 607 from webm@googlecode.
      
      Change-Id: I85635e6ca236b90c3dcfc40d449215c7b9caa806
    • Fixed potential overflows · aaa7b444
      Yaowu Xu authored
      The two arrays are typically initialized to INT64_MAX. If they are not
      filled with valid values before the addition, the values can overflow
      and lead to wrong results.
      
      Change-Id: I515de22cf3e8f55af4b74bdb2c8eb821a02d3059
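
One way to guard an addition against the INT64_MAX sentinel described in this commit message is sketched below. The message does not say how the commit itself avoids the overflow, so this is only an illustration of the hazard; `add_rd_cost` is a hypothetical helper and assumes non-negative costs.

```c
#include <assert.h>
#include <stdint.h>

/* Add two non-negative costs where INT64_MAX means "not filled in".
 * An unfilled operand, or a sum that would exceed INT64_MAX, yields
 * INT64_MAX instead of overflowing (signed overflow is undefined
 * behaviour in C). */
static int64_t add_rd_cost(int64_t a, int64_t b) {
  assert(a >= 0 && b >= 0);
  if (a == INT64_MAX || b == INT64_MAX) return INT64_MAX;
  if (a > INT64_MAX - b) return INT64_MAX;
  return a + b;
}
```

With this guard, adding 5 to an unfilled entry stays at INT64_MAX rather than wrapping to a large negative value that would then look like the best cost in a minimum search.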
    • Improved mb_lpf_horizontal_edge_w_sse2_8 · 22dc946a
      Scott LaVarnway authored
      This patch is a reformatted version of optimizations done by
      engineers at Intel (Erik/Tamar) who have been providing
      performance feedback for VP9.  For the test clips used (720p, 1080p),
      up to 1.2% performance improvement was seen.
      
      Change-Id: Ic1a7149098740079d5453b564da6fbfdd0b2f3d2
  13. 28 Aug, 2013 3 commits
    • General code cleanup. · b62ddd5f
      Dmitry Kovalev authored
      Switching from mi_{width, height}_log2 and b_{width, height}_log2 to
      num_8x8_blocks_{wide, high} and num_4x4_blocks_{wide, high}. Removing
      redundant code, adding const.
      
      Change-Id: Iaab2207590fd24d0b76999071778d1395dc5cd5d
    • Adds a speed feature for fast 1-loop forw updates · e02dc84c
      Deb Mukherjee authored
      Incorporates a speed feature for fast forward updates of
      coefficients. This feature takes 3 values:
      0 - use standard 2-loop version
      1 - use a 1-loop version
      2 - use a 1-loop version with reduced updates
      
      Results: derfraw300 +0.007% (on speed 0) at feature value = 1
                          -0.160% (on speed 0) at feature value = 2
      
      There is substantial speed up at speeds 2 and above for low
      resolution sequences where the entropy updates are a big part
      of the overall computations.
      
      Change-Id: Ie96fc50777088a5bd441288bca6111e43d03bcae
    • Renaming txfm_size to tx_size. · 851a2fd7
      Dmitry Kovalev authored
      Change-Id: I752e374867d459960995b24d197301d65ad535e3