1. 15 Oct, 2013 1 commit
  2. 14 Oct, 2013 1 commit
    • Move token_cache from cost_coeffs to MACROBLOCK · f60a3910
      Jingning Han authored
      This commit moves the token_cache buffer into the macroblock struct instead
      of defining it as a local variable in cost_coeffs. This avoids repeatedly
      re-allocating memory space in the rate-distortion optimization loop.
      
      The runtime at speed 0 reduces:
      bus 2000kbps, 161692ms to 159951ms
      football 600kbps, 229505ms to 225821ms
      
      Change-Id: If7da6b0b6d8c5138a16271a33c4548fba33d8840
      f60a3910
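The idea behind this change can be sketched roughly as follows. This is a minimal illustration, not the libvpx code: the struct name `MacroblockSketch`, the buffer size `MAX_COEFFS`, and the function `cost_coeffs_sketch` are hypothetical stand-ins, and the "cost" computed here is just a count of nonzero coefficients.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_COEFFS 1024 /* assumed: coefficient count of a 32x32 block */

/* Hypothetical stand-in for the encoder's per-macroblock state. */
typedef struct {
  uint8_t token_cache[MAX_COEFFS]; /* scratch buffer owned by the struct */
} MacroblockSketch;

/* Hypothetical cost routine: it writes into the caller-owned scratch buffer
 * instead of declaring a large local array on every call. The returned
 * "cost" is simply the number of nonzero coefficients. */
static int cost_coeffs_sketch(MacroblockSketch *x, const int16_t *coeffs,
                              int n) {
  int i, cost = 0;
  assert(n >= 0 && n <= MAX_COEFFS);
  for (i = 0; i < n; ++i) {
    x->token_cache[i] = (uint8_t)(coeffs[i] != 0);
    cost += x->token_cache[i];
  }
  return cost;
}
```

Because the buffer lives in the struct, every call inside the rate-distortion loop reuses the same storage.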
  3. 12 Oct, 2013 1 commit
  4. 11 Oct, 2013 12 commits
    • Adding TREE_SIZE macro + cleanup. · 860e4676
      Dmitry Kovalev authored
      Using TREE_SIZE for the following trees:
        vp9_intra_mode_tree
        vp9_inter_mode_tree
        vp9_partition_tree
        vp9_switchable_interp_tree
        vp9_mv_joint_tree
        vp9_mv_class_tree
        vp9_mv_class0_tree
        vp9_mv_fp_tree
      
      Change-Id: I0212bb4c1ee6648249f68517e28a67a56591ee1b
      860e4676
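For illustration, a macro of this kind could look like the sketch below. The definition `2 * leaf_count - 2` is an assumption, based on storing a binary tree with N leaves as an array of index pairs (N - 1 internal nodes, two entries each); `example_tree` and `read_symbol` are hypothetical and not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definition: a binary tree with `leaf_count` leaves has
 * leaf_count - 1 internal nodes, each stored as two array entries. */
#define TREE_SIZE(leaf_count) (2 * (leaf_count) - 2)

typedef int8_t tree_index;

/* Hypothetical 4-leaf tree. Entries <= 0 are leaves (negated symbol);
 * positive entries are the array index of the next node pair. */
static const tree_index example_tree[TREE_SIZE(4)] = {
  -0, 2, /* node at 0: symbol 0, or continue at index 2 */
  -1, 4, /* node at 2: symbol 1, or continue at index 4 */
  -2, -3 /* node at 4: symbol 2 or symbol 3 */
};

/* Walks the tree using a caller-supplied bit sequence and returns the
 * decoded symbol. A real decoder would read the bits from the bitstream. */
static int read_symbol(const tree_index *tree, const int *bits) {
  tree_index i = 0;
  int n = 0;
  while ((i = tree[i + bits[n++]]) > 0) {
  }
  assert(i <= 0);
  return -i;
}
```

Sizing the array with the macro keeps the declaration tied to the number of symbols rather than to a hand-computed constant.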
    • Consistent names for inverse hybrid transforms (2 of 2). · ac468dde
      Dmitry Kovalev authored
      Renames:
        vp9_iht_add       -> vp9_iht4x4_add
        vp9_iht_add_8x8   -> vp9_iht8x8_add
        vp9_iht_add_16x16 -> vp9_iht16x16_add
      
      Change-Id: I8f1a2913e02d90d41f174f27e4ee2fad0dbd4a21
      ac468dde
    • Get libvpx to compile on VS2013. · 3806bab2
      Scott Graham authored
      `round` is defined in the runtime library now.
      https://codereview.chromium.org/23922008/
      
      Change-Id: I3852740058d32f63ce283579acbe284865e32dba
      3806bab2
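A compiler guard of the following shape is one way to handle this situation. It is a sketch, not the actual patch: it assumes that `_MSC_VER` 1800 corresponds to VS2013 and that older MSVC runtimes do not provide `round`, so a local fallback is only compiled for those older versions.

```c
#include <math.h>

#if defined(_MSC_VER) && (_MSC_VER < 1800)
/* Assumed: MSVC before VS2013 has no round() in its runtime library, so a
 * local fallback is supplied. On VS2013 and later this block is skipped,
 * which avoids a clash with the runtime's own definition. */
static double round(double x) {
  return (x < 0.0) ? ceil(x - 0.5) : floor(x + 0.5);
}
#endif
```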
    • Consistent names for inverse hybrid transforms (1 of 2). · 7ef57391
      Dmitry Kovalev authored
      Renames:
        vp9_short_iht4x4_add     -> vp9_iht4x4_16_add
        vp9_short_iht8x8_add     -> vp9_iht8x8_64_add
        vp9_short_iht16x16_add_c -> vp9_iht16x16_256_add
      
      Change-Id: Ibca7a188fd062b196787ac5efc1ea545e7f166c0
      7ef57391
    • Adding const to the input argument of all 1D transforms. · 44195fda
      Dmitry Kovalev authored
      Also adding static to iadst16_1d and fadst16 functions.
      
      Change-Id: I13c7df3b776f0f8efc6e80099bdb0a2f6d29edaf
      44195fda
    • Replacing {VP9_COEF, MODE}_UPDATE_PROB with DIFF_UPDATE_PROB. · 4a0f9478
      Dmitry Kovalev authored
      Values of MODE_UPDATE_PROB and VP9_COEF_UPDATE_PROB are equal, so replacing
      them with one constant. Inlining appropriate arguments for functions:
        vp9_cond_prob_diff_update (encoder)
        vp9_diff_update_prob (decoder)
      
      Change-Id: I1255a1cb477743b799b3bfbbcd8de6b32b067338
      4a0f9478
    • Change in rddiv parameter to make it a power of 2 · d9655e42
      Deb Mukherjee authored
      Converts the constant rddiv parameter to 128 (from 100) and
      implements RDCOST with bit-shift rather than multiplication.
      Other parameters are also adjusted to roughly keep the same
      balance between Rate and Distortion.
      
      There is a slight speed-up of about 0.5-1% (at speed 0) as
      tested on football_cif.
      
      There is a slight change in performance due to small change
      in the parameters.
      derfraw300: +0.033%
      stdhdraw250: +0.102%
      
      Change-Id: I70ac69f58fa71c83108f68fe41796cd19d1fc760
      d9655e42
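The arithmetic behind this change can be illustrated as follows. The function names and the exact shape of the cost formula are hypothetical; the sketch only shows that weighting distortion by 128 is the same as a left shift by 7, which is what a power-of-two rddiv makes possible.

```c
#include <assert.h>
#include <stdint.h>

#define RDDIV_BITS_SKETCH 7 /* 1 << 7 == 128, the new rddiv value */

/* Hypothetical RD cost with an arbitrary divider: the distortion term
 * needs a multiplication. */
static int64_t rdcost_mul(int rdmult, int rddiv, int rate, int64_t dist) {
  return ((128 + (int64_t)rate * rdmult) >> 8) + dist * rddiv;
}

/* Same cost when rddiv is fixed at 128: the multiplication becomes a shift. */
static int64_t rdcost_shift(int rdmult, int rate, int64_t dist) {
  assert(dist >= 0);
  return ((128 + (int64_t)rate * rdmult) >> 8) + (dist << RDDIV_BITS_SKETCH);
}
```

With rddiv equal to 128 the two functions return the same value, so only the speed changes, not the result of this term. The shift of the balance mentioned in the message comes from moving rddiv from 100 to 128, not from the shift itself.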
    • Masking intra mode choice adaptively · 8b175679
      Yaowu Xu authored
      This commit masks the intra prediction modes available for testing
      based on the prediction block size.
      
      With this patch, encoding time at CpuUsed 2 is reduced by 10% to 20% for
      HD clips, with a compression drop of 0.2%.
      
      Change-Id: I65f320f1237c0f5ae3a355bf7caf447f55625455
      8b175679
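Masking modes by block size can be sketched with a per-size bitmask, as below. Everything here is hypothetical: the mode list, the three block-size classes, and the policy that larger blocks test fewer modes are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>

/* Hypothetical mode and block-size enums, for illustration only. */
enum { MODE_DC, MODE_V, MODE_H, MODE_D45, MODE_TM, MODE_COUNT };
enum { BLOCK_SMALL, BLOCK_MEDIUM, BLOCK_LARGE, BLOCK_SIZE_COUNT };

#define ALL_MODES ((1u << MODE_COUNT) - 1)

/* Assumed policy: bit i set means "test mode i" for that block size. */
static const unsigned mode_mask[BLOCK_SIZE_COUNT] = {
  ALL_MODES,
  (1u << MODE_DC) | (1u << MODE_V) | (1u << MODE_H) | (1u << MODE_TM),
  (1u << MODE_DC) | (1u << MODE_TM)
};

/* Counts how many modes would be searched for a given block size. */
static int count_modes_to_test(int bsize) {
  int mode, n = 0;
  assert(bsize >= 0 && bsize < BLOCK_SIZE_COUNT);
  for (mode = 0; mode < MODE_COUNT; ++mode) {
    if (!(mode_mask[bsize] & (1u << mode))) continue; /* masked out */
    ++n; /* a real encoder would run the RD search for this mode here */
  }
  return n;
}
```

Fewer candidate modes per block means fewer rate-distortion evaluations, which is where the encoding-time saving would come from.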
    • Code cleanup · 57b97b56
      Yunqing Wang authored
      Minor code cleanup.
      
      Change-Id: I47c1f794842d4570bb39cfd23b80f54f5606bba6
      57b97b56
    • Experimental rate control change. · 704028d4
      Paul Wilkins authored
      When the codec in VBR (or cq) mode hits its max q limits and is
      struggling to hit a target bandwidth, the bit target per frame collapses.
      
      In the first instance normal frames cap out at the maximum allowed
      Q and then the ARF and GFs do the same. This latter behavior is not
      generally desirable as GFs and ARFs are only effective from a quality
      and data rate perspective if they have at least some level of -Q delta
      compared to the surrounding frames.
      
      In this patch I define a separate max Q for GFs and ARFs that is
      derived from but somewhat lower than that defined for normal frames.
      In effect there is a minimum Q delta that will always be available for
      GFs and ARFs regardless of the target rate and MAXQ setting.
      
      This may of course mean that the absolute lowest rate obtainable for
      a given clip is somewhat higher.
      
      Change-Id: I268868b28401900d0cd87e51e609cd3b784ab54a
      704028d4
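The rule described above can be sketched as a small helper. The delta value and the function name are hypothetical; the message only says that the GF/ARF limit is derived from, and somewhat lower than, the limit for normal frames.

```c
#include <assert.h>

#define MIN_GF_ARF_Q_DELTA 8 /* hypothetical value, not taken from the patch */

/* Returns the maximum Q allowed for a frame. Golden and alt-ref frames get
 * a lower cap so that some Q delta against normal frames always remains. */
static int max_q_for_frame(int max_q, int is_gf_or_arf) {
  int q;
  assert(max_q >= 0);
  if (!is_gf_or_arf) return max_q;
  q = max_q - MIN_GF_ARF_Q_DELTA;
  return q < 0 ? 0 : q;
}
```

Since GFs and ARFs can no longer be pushed all the way to the normal-frame limit, the lowest reachable bitrate for a clip may rise slightly, as the message notes.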
    • Disable recode loop. · 8b989f5b
      Paul Wilkins authored
      For VBR coding disable the recode loop for speeds > 0.
      
      Results pending.
      
      Change-Id: I2cd9a87c3fcbe39c05b954798d0671a4ca62c37f
      8b989f5b
    • Removing vp9_tree_p typedef. · 98400c1b
      Dmitry Kovalev authored
      It is used only twice, and it is clearer to use the real type instead
      of the typedef.
      
      Change-Id: Idc25c16504c3da4d040e0cdb33a2987631bb6a5b
      98400c1b
  5. 10 Oct, 2013 10 commits
    • Removing vp9_idct4_1d_sse2 function. · ddf1b762
      Dmitry Kovalev authored
      We have two SSE2-optimized functions for idct4_1d:
        vp9_idct4_1d_sse2 <-- removing this one
        idct4_1d_sse2
      
      vp9_idct4_1d_sse2 was used only by the following functions which already
      have SSE2 optimized variants:
        vp9_idct4x4_16_add_c   -> vp9_idct4x4_16_add_sse2
        idct8_1d               -> vp9_idct8x8_{16, 10, 1}_sse2
        vp9_short_iht4x4_add_c -> vp9_short_iht4x4_add_sse2
      
      Change-Id: Ib0a7f6d1373dbaf7a4a41208cd9d0671fdf15edb
      ddf1b762
    • d207 intra prediction ssse3 using bytes · 83936e8c
      Scott LaVarnway authored
      Byte version of Ronald's d207 SSSE3 optimizations
      (commit: f891f84d3ba9345b0074e682f0fea09b8ddf4f1e)
      
      Change-Id: If15f71a589ea16f78ac86a501b0c5c6231dc9af1
      83936e8c
    • SSE2 8-tap sub-pixel filter optimization · 3fb728c7
      Yunqing Wang authored
      To ensure fast encoding/decoding on devices without ssse3 support,
      SSE2 optimization of sub-pixel filters was done. Test using 1080p
      clip showed the decoder speeds were ~70fps with ssse3 filters, ~60fps
      with sse2 filters, and ~15fps with c filters.
      
      Change-Id: Ie2088f87d83a889fba80a613e4d0e287aadd785c
      3fb728c7
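For reference, the scalar operation that such SIMD code speeds up can be sketched as a plain C horizontal 8-tap filter. This is an illustrative sketch, not the libvpx routine: the function name is hypothetical, and the 7-bit filter precision (taps summing to 128) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define FILTER_BITS_SKETCH 7 /* assumed: taps sum to 1 << 7 == 128 */

static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Filters `w` output pixels of one row. The caller must guarantee 3 valid
 * pixels before src[0] and 4 valid pixels after src[w - 1]. */
static void convolve8_horiz_sketch(const uint8_t *src, uint8_t *dst, int w,
                                   const int16_t taps[8]) {
  int x, k;
  assert(w > 0);
  for (x = 0; x < w; ++x) {
    int sum = 0;
    for (k = 0; k < 8; ++k) sum += src[x + k - 3] * taps[k];
    /* Round to nearest, drop the filter precision, clamp to 8 bits. */
    dst[x] = clip_pixel((sum + (1 << (FILTER_BITS_SKETCH - 1))) >>
                        FILTER_BITS_SKETCH);
  }
}
```

Each output pixel costs eight multiply-adds, which is why vectorizing this inner loop has such a large effect on decoder speed.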
    • Fix typo in comment message · f0772dc5
      Jingning Han authored
      Change-Id: Ifef756a3a91423bb9f5411f06fa092027be21ecf
      f0772dc5
    • Consistent names for FDCT functions. · fc82dbb4
      Dmitry Kovalev authored
      Renames:
        fdct4_1d   -> fdct4
        fadst4_1d  -> fadst4
        fdct8_1d   -> fdct8
        fadst8_1d  -> fadst8
        fdct16_1d  -> fdct16
        fadst16_1d -> fadst16
      
      "_1d" suffix is redundant, so removing it. The same will happen with idct
      in the next change sets.
      
      Change-Id: Ibf421cd2f569146c6079269df7a31819c098265e
      fc82dbb4
    • Giving consistent names to IDCT 32x32 functions. · 1e766b50
      Dmitry Kovalev authored
      Renames:
        vp9_short_idct32x32_add   -> vp9_idct32x32_1024_add
        vp9_short_idct32x32_1_add -> vp9_idct32x32_1_add
        vp9_idct_add_32x32        -> vp9_idct32x32_add
      
      Change-Id: Id85306f5814bac6c47463a6b5901a93082510666
      1e766b50
    • Re-design rate-distortion cost tracking buffers · fc19243c
      Jingning Han authored
      This commit re-designs the per transformed block rate-distortion
      costs tracking buffers. It removes redundant buffer usage, makes
      the needed context memory allocation per VP9_COMP instance and
      reuses the same buffer sets inside the rate-distortion optimization
      search loop, thereby avoiding repeatedly requiring memory space.
      
      It reduces speed 0 runtime:
      
      bus at 2000 kbps from 166763ms to 158967ms,
      football at 600 kbps from 246614ms to 234257ms.
      
      Both about 5% speed-up. Local tests suggest about 2% to 5% speed-up
      for speed 1 and 2 settings. This does not change compression
      performance.
      
      Change-Id: I363514c5276b5cf9a38c7251088ffc6ab7f9a4c3
      fc19243c
    • change to avoid out-of-range computation · b47cef05
      Yaowu Xu authored
      Change-Id: Id5e31833a0ef40de9f64c2f5674af7083233bf14
      b47cef05
    • Adjustment to mv cost parameters · e4b0fce4
      Deb Mukherjee authored
      Increases these parameters.
      There is a small efficiency gain.
      
      Change-Id: Ie5f0ddb39c907d335e0dafa5eb112365a81f4542
      derfraw300: +0.091%
      stdhdraw250: +0.238%
      e4b0fce4
    • Adding const to several pointers. · d9d7040e
      Dmitry Kovalev authored
      Change-Id: I7231589bda71d0d23c730283febd5bb58585a0da
      d9d7040e
  6. 09 Oct, 2013 5 commits
    • Fix intra dist model of skip_encode feature · 013db649
      Jingning Han authored
      The intra mode distortion adjustment for skip_encode feature was
      broken in the refactoring cc91851. This commit fixes it and tunes
      the distortion models used therein.
      
      Change-Id: I0d676e82f8e855536a90cf9b3e3fdefafcd886c6
      013db649
    • Added #define of snprintf for MSVC · 850a9196
      Yaowu Xu authored
      snprintf is not supported by MSVC; this commit replaces it with the MSVC
      variant _snprintf to enable the build.
      
      Change-Id: I686943a78c289bae6b486a5e75effad5f86c24de
      850a9196
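A define of this kind might look like the sketch below. The version check (`_MSC_VER < 1900`, i.e. before VS2015) and the helper `format_frame_name` are assumptions added for illustration and may differ from the patch. Note that `_snprintf` does not guarantee a terminating null on truncation, so the sketch terminates the buffer explicitly.

```c
#include <assert.h>
#include <stdio.h>

#if defined(_MSC_VER) && (_MSC_VER < 1900)
/* Assumed: older MSVC lacks snprintf, so map it to the _snprintf variant. */
#define snprintf _snprintf
#endif

/* Hypothetical helper that formats a frame label into `buf`. */
static int format_frame_name(char *buf, size_t size, int frame) {
  int n;
  assert(buf != NULL && size > 0);
  n = snprintf(buf, size, "frame-%04d", frame);
  buf[size - 1] = '\0'; /* guard for the _snprintf truncation case */
  return n;
}
```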
    • Clean-ups in rdopt.c · eb8b1cd7
      Deb Mukherjee authored
      Some minor cleanups in preparation for experimentation with
      some encode parameters and thresholds
      
      Change-Id: I449d66da97eae0a7acdf4aae374e2f9111342056
      eb8b1cd7
    • Deprecate the use of PARTITION_INFO from encoder · 03fe08ca
      Jingning Han authored
      Use b_mode_info to store the inter prediction mode of sub8x8 block,
      in replacement of the use of partition_info. Remove redundant buffer
      update for partition_info. For bus_cif at 2000 kbps, this seems to make
      speed 0 about 1% faster.
      
      Change-Id: Id1b3be45e75a24fb4b42335ac480c23e440978f6
      03fe08ca
    • mips dsp-ase r2 vp9 decoder bilinear convolve optimizations · eeb5b62d
      Parag Salasakar authored
      Change-Id: Ic31b4ef85e65070b4f8b9f26e068ccfaae00c4f0
      eeb5b62d
  7. 08 Oct, 2013 5 commits
    • Remove extra line in decode_coefs · c5e91080
      Jingning Han authored
      Change-Id: Id1fde9920d60c6991a8ef6de5103ae3e578312ed
      c5e91080
    • All zero coeff skip in IDCT 32x32 · 6594ca88
      Jingning Han authored
      When all coefficients are zeros, skip the corresponding 1-D inverse
      transform. This practice has been used in the SSE2 implementation of
      inverse 32x32 DCT. This commit imports this algorithm into the C code.
      
      Change-Id: I0f58bfcb183a569fab85d524d5d9cf8ae8653f86
      6594ca88
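The zero-skip idea can be sketched on a small block as follows. The 8-point size and the placeholder 1-D transform are hypothetical (the commit concerns the 32x32 inverse DCT); the skip is valid for any linear transform, since an all-zero input row always yields an all-zero output row.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N_SKETCH 8 /* small size for illustration only */

/* Placeholder linear 1-D transform (a running sum), standing in for the
 * real inverse DCT. */
static void transform_1d_sketch(const int16_t *in, int16_t *out) {
  int i;
  int acc = 0;
  for (i = 0; i < N_SKETCH; ++i) {
    acc += in[i];
    out[i] = (int16_t)acc;
  }
}

/* Row pass that skips all-zero rows. Returns how many rows were actually
 * transformed, so the saving can be observed. */
static int row_pass_skip_zeros(const int16_t *input, int16_t *output) {
  int r, c, calls = 0;
  assert(input != output);
  for (r = 0; r < N_SKETCH; ++r) {
    int any = 0;
    for (c = 0; c < N_SKETCH; ++c) any |= input[r * N_SKETCH + c];
    if (any) {
      transform_1d_sketch(input + r * N_SKETCH, output + r * N_SKETCH);
      ++calls;
    } else {
      memset(output + r * N_SKETCH, 0, N_SKETCH * sizeof(*output));
    }
  }
  return calls;
}
```

For sparse blocks, where most high-frequency rows are zero, the check is much cheaper than running the transform.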
    • Removing inv_txm4x4_1_add and inv_txm4x4_add function pointers. · c983c966
      Dmitry Kovalev authored
      We already have itxm_add member in MACROBLOCKD structure. Both
      inv_txm4x4_1_add and inv_txm4x4_add are just its special cases for
      different eob values. But eob logic is already implemented in
      vp9_iwht4x4_add and vp9_idct4x4_add (that's why also removing
      inverse_transform_b_4x4_add).
      
      Change-Id: I80bec9b6f7d40c5e5033c613faca5c819c3e6326
      c983c966
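The reasoning above, that one wrapper can own the eob decision, can be sketched like this. The names with a `_sketch` suffix, the placeholder bodies, and the `eob > 1` threshold are assumptions for illustration; the placeholders only record which path ran.

```c
#include <assert.h>
#include <stdint.h>

enum { PATH_NONE, PATH_DC_ONLY, PATH_FULL };

static int last_path = PATH_NONE;

/* Placeholder for the DC-only inverse transform. */
static void idct4x4_1_add_sketch(const int16_t *in, uint8_t *dst, int stride) {
  (void)in; (void)dst; (void)stride;
  last_path = PATH_DC_ONLY;
}

/* Placeholder for the full 16-coefficient inverse transform. */
static void idct4x4_16_add_sketch(const int16_t *in, uint8_t *dst, int stride) {
  (void)in; (void)dst; (void)stride;
  last_path = PATH_FULL;
}

/* Single entry point: picks the cheap path when only the DC coefficient is
 * present, so callers need no separate function pointers per case. */
static void idct4x4_add_sketch(const int16_t *in, uint8_t *dst, int stride,
                               int eob) {
  assert(eob >= 0);
  if (eob > 1)
    idct4x4_16_add_sketch(in, dst, stride);
  else
    idct4x4_1_add_sketch(in, dst, stride);
}
```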
    • Change to allow less rectangular partition check · e29137df
      Yaowu Xu authored
      For CpuUsed 1 and 2, this commit allows the rectangular partition check
      to be skipped when NONE is better than SPLIT. It also allows this logic
      in alt ref frame coding, rather than using only square partitions there.
      The change gains about 0.3% in compression on yt and ythd, and about 0.6%
      on cif and stdhd, for both CpuUsed 1 and 2.
      
      Change-Id: I814b653baf89f59acd20e042629a12938a1bd4e5
      e29137df
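The decision described above can be sketched as a small predicate. The function name and parameters are hypothetical; the sketch only encodes the rule that rectangular partitions are not searched when the unsplit block (NONE) already beats SPLIT.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the rectangular (horizontal/vertical) partitions should be
 * searched, 0 if the search can be skipped. `skip_enabled` stands for the
 * speed setting that turns this shortcut on (assumed: CpuUsed 1 and 2). */
static int should_check_rect_sketch(int64_t none_rd, int64_t split_rd,
                                    int skip_enabled) {
  assert(none_rd >= 0 && split_rd >= 0);
  if (skip_enabled && none_rd < split_rd) return 0;
  return 1;
}
```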
    • easy to fix cpplint issue in rdopt.c · 08feefbe
      Jim Bankoski authored
      Change-Id: Id093816146de0d100f0c6ae2542aaa427dbab2d8
      08feefbe
  8. 07 Oct, 2013 5 commits