1. 14 Jan, 2014 1 commit
    • Minor fix on an assert · 1699d6bd
      Deb Mukherjee authored
      Fixes an assert that occasionally fails for small values of the
      max key-frame interval. Also makes a small change to how
      frames_to_key is updated for frame drops.
      Change-Id: Icc2b33b25e3e4ced7e49f8db73e0a887ef9c99e0
  2. 13 Jan, 2014 6 commits
    • Converting int_mv to MV. · 2033ac49
      Dmitry Kovalev authored
      Change-Id: Id31c0e100d275bd3650eaf5e4b8fe5ce648dbfaf
    • Adding mv_has_subpel() function. · b02c72b5
      Dmitry Kovalev authored
      Change-Id: I50922bb1a689f8515debaa018f850b231c21189f
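The commit message gives only the function name, so the following is a minimal sketch of what such a helper might look like, not the committed code. It assumes motion vector components are stored in 1/8-pel units, so the low three bits carry the fractional part. The `MV` layout, the mask value, and the `_sketch` names are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the codec's motion vector type. */
typedef struct {
  int16_t row;
  int16_t col;
} MV;

/* Assumption: 1/8-pel units, so the three low bits are the sub-pixel part. */
#define SUBPEL_MASK_SKETCH 7

/* Returns nonzero when either component has a fractional (sub-pixel) part. */
static int mv_has_subpel_sketch(const MV *mv) {
  return (mv->row & SUBPEL_MASK_SKETCH) != 0 ||
         (mv->col & SUBPEL_MASK_SKETCH) != 0;
}
```

A caller could use such a check to skip the interpolation filter when the vector points at full-pixel positions.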
    • fix a div by zero issue · 31d3f43e
      Yaowu Xu authored
      Change-Id: I091dfaa0ed5b9672eedd46d6097469d0802e24ef
    • Enable reference frame masking for rt mode · 5e5d4c0e
      Yaowu Xu authored
      Reference frame masking helped good quality mode gain about 5% in
      encoding speed; this commit enables it for rt mode to gain speed there too.
      In addition, this commit moves the speed feature setup to a separate
      Change-Id: I015e8f78bbb21dd43ae183b9b9355bea2ccda9c5
    • No arf right before real scene cut. · a00dad39
      Paul Wilkins authored
      To reduce pulsing we now allow an arf just before forced key frames
      and at the end of a clip or section (which may be stitched to
      another clip or section). However, this does not make sense for
      key frames arising from real scene cuts.
      Change from original patch reflects other recent changes in regard
      to alignment of gf/arf and kf groups.
      Change-Id: I074a91d1207e9b3e28085af982f6718aa599775f
    • Further rate control tweaks and fixes. · 603075fa
      Paul Wilkins authored
      Further fixes regarding min and max rate.
      Bug fixes re kf group bits and last kf group.
      Change-Id: Iaafd719d30a489e135a3c55851ce8c632091a436
  3. 11 Jan, 2014 1 commit
    • Cleaning up and fixing psnr calculation code. · 4def0a81
      Dmitry Kovalev authored
      Introducing calc_psnr() which calculates psnr between two yv12 buffers.
      Previously we incorrectly used width/height instead of
      crop_width/crop_height to calculate number of samples -- fixed.
      Change-Id: Iecda01980555de55ad347e0276e6641c793fa56c
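To illustrate why the sample count matters, here is a minimal sketch for a single 8-bit plane. It is not the committed calc_psnr(); the function names and parameters are hypothetical. The squared error and the sample count are both taken over the cropped (visible) area only, so padding beyond crop_width/crop_height cannot skew the mean. PSNR would then follow as 10·log10(peak² / MSE), with peak = 255 for 8-bit video.

```c
#include <stdint.h>

/* Sum of squared differences over the visible (cropped) area of one plane. */
static uint64_t plane_sse_sketch(const uint8_t *a, int a_stride,
                                 const uint8_t *b, int b_stride,
                                 int crop_width, int crop_height) {
  uint64_t sse = 0;
  int x, y;
  for (y = 0; y < crop_height; ++y) {
    for (x = 0; x < crop_width; ++x) {
      const int diff = a[y * a_stride + x] - b[y * b_stride + x];
      sse += (uint64_t)(diff * diff);
    }
  }
  return sse;
}

/* Mean squared error: the divisor is the cropped sample count, not the
 * padded width * height of the underlying buffer. */
static double plane_mse_sketch(const uint8_t *a, int a_stride,
                               const uint8_t *b, int b_stride,
                               int crop_width, int crop_height) {
  const double samples = (double)crop_width * (double)crop_height;
  return (double)plane_sse_sketch(a, a_stride, b, b_stride,
                                  crop_width, crop_height) / samples;
}
```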
  4. 10 Jan, 2014 10 commits
  5. 09 Jan, 2014 7 commits
    • Keep buffer clipped to maximum in change_config. · 193fa5c8
      Marco Paniconi authored
      Under a configuration change where the bitrate suddenly decreases,
      the buffer level may exceed the maximum allowed for the first frame
      encoded after change_config.
      This change keeps it clipped to its maximum level.
      Change-Id: I4d0b5b3d1fd8148600dd39e02bd630c9464baba5
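The clipping described above can be sketched as follows. The struct, its field names, and the way the maximum is derived from the bitrate are assumptions for illustration; the commit message states only that the level is clipped to its maximum after a configuration change.

```c
#include <stdint.h>

/* Hypothetical rate control state; field names are illustrative only. */
typedef struct {
  int64_t buffer_level_bits;
  int64_t maximum_buffer_bits;
} RateControlSketch;

/* Assumption: the maximum buffer size scales with the target bitrate.
 * When the bitrate drops, the old level can exceed the new maximum,
 * so it is clipped before the next frame is encoded. */
static void change_config_sketch(RateControlSketch *rc,
                                 int64_t new_bitrate_bps,
                                 int64_t max_buffer_ms) {
  rc->maximum_buffer_bits = new_bitrate_bps * max_buffer_ms / 1000;
  if (rc->buffer_level_bits > rc->maximum_buffer_bits)
    rc->buffer_level_bits = rc->maximum_buffer_bits;
}
```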
    • Simplify set_rt_speed_feature() · 2d381d76
      Yaowu Xu authored
      1. Made speed choices progressive
      2. Adjusted rt speed settings to achieve better speed/quality
      Overall, rt-5 gained 2.5% in compression/quality; encoding time of the
      720p niklas clip drops from 137,052 ms to 121,874 ms.
      Change-Id: Ia6e7e1e15225395a868a2f1059c3db8e266e1600
    • Optimize inv 16x16 DCT with 10 non-zero coeffs - P2 · af31b27a
      Jingning Han authored
      This commit further optimizes SSE2 operations in the second 1-D
      inverse 16x16 DCT with <10 non-zero coefficients. The average
      runtime of this module drops from 779 cycles to 725 cycles.
      Change-Id: Iac31b123640d9b1e8f906e770702936b71f0ba7f
    • SSSE3 convolution optimization · 511d218c
      levytamar82 authored
      Optimizing all SSSE3 assembly for convolution:
      1. vp9_filter_block1d4_h8_sse2
      2. vp9_filter_block1d8_h8_sse2
      3. vp9_filter_block1d16_h8_sse2
      4. vp9_filter_block1d4_v8_sse2
      5. vp9_filter_block1d8_v8_sse2
      6. vp9_filter_block1d16_v8_sse2
      The optimizations include:
      - processing 2x8 elements in one 128-bit register instead of processing
      8 elements in one 128-bit register.
      - removing unnecessary loads.
      This optimization gives a user-level gain of 2.4% for 480p input
      and 1.6% for 720p.
      This optimization is done only for 64-bit.
      Change-Id: Icb586dc0c938b56699864fcee6c52fd43b36b969
    • Use the correct member for initialization · 719dadf3
      Johann authored
      On Windows this fails with:
      error C2440: 'initializing': cannot convert from int_mv to uint32_t
      Change-Id: I51630efd0e83a0ce620c91aa7859dd6fc1572e99
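The compiler error above is the kind that appears when a union value is used where a scalar is expected. As a minimal sketch, assuming int_mv is a union that overlays a 32-bit integer on a row/col pair (the layout and the `_sketch` names here are assumptions, not taken from the commit), the fix is to name the scalar member explicitly:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the codec's motion vector types. */
typedef struct {
  int16_t row;
  int16_t col;
} MV;

typedef union {
  uint32_t as_int; /* both components viewed as one 32-bit word */
  MV as_mv;        /* the same storage viewed as row/col */
} int_mv_sketch;

/* Writing `uint32_t bits = mv;` would ask the compiler to convert the
 * whole union to an integer, which is not a valid conversion. Reading
 * the as_int member is well defined. */
static uint32_t mv_bits_sketch(int_mv_sketch mv) {
  const uint32_t bits = mv.as_int;
  return bits;
}
```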
    • Using VP9_COMMON instead of VP9_COMP. · b16fac42
      Dmitry Kovalev authored
      Change-Id: If7d3958653104f3e170853e931f8489de3ecf3cc
    • Adding {get, set}_rate_correction_factor() functions. · c01fe86c
      Dmitry Kovalev authored
      Change-Id: Ib3212832953a3445fc5f021af0e1de7886f09b4f
  6. 08 Jan, 2014 10 commits
    • Optimize inv 16x16 DCT with 10 non-zero coeffs - P1 · ba6ab46c
      Jingning Han authored
      This commit is the first patch optimizing SSE2 implementation of inverse
      16x16 DCT with <10 non-zero coefficients. It focuses on the first 1-D (row)
      transformation. It exploits the fact that only the top-left 4x4 block contains
      non-zero coefficients in a 2-D inverse 16x16 DCT with <10 coefficients.
      The average runtime of idct16x16_10 unit is reduced from
      883 cycles to 779 cycles (12% faster).
      For pedestrian_area_1080p 300 frames at 4000 kbps, the speed 2 runtime goes
      down from 310651 ms to 305910 ms. The decoding speed goes up from
      80.37 fps to 80.87 fps.
      Change-Id: Ic6f3ac5a637a76c07ba73ddaafe318a699fea645
    • Removing direct references to {lst_fb, gld_fb, alt_fb}_idx fields. · 510a8282
      Dmitry Kovalev authored
      Change-Id: Ib1d9628d2b538b6dc27b0db1fa7f40f70ff2072f
    • Cleanups around cpi->common. · 0ecd583d
      Dmitry Kovalev authored
      Change-Id: I0c42a729038d0f4cb7bc07f587d066fcb1dfe9d9
    • Renaming 'Mode' to 'mode'. · 962c8b24
      Dmitry Kovalev authored
      Change-Id: I6cdd670d66288dbd66228f38bba6b30502d25362
    • Renaming 'Sharpness' to 'sharpness'. · 57be8136
      Dmitry Kovalev authored
      Change-Id: I54513dc3b3321e0c0bb6b15ea5c34085ed80b4a4
    • Add a C fallback for get_msb() and change inline to INLINE. · ce7ff3b6
      Alex Converse authored
      For systems without __builtin_clz() or _BitScanReverse(). Taken from libwebp.
      Change-Id: Iead257efc1772c466c79e1dc0356ed571d38d43e
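A portable fallback of this kind can be sketched as a five-step binary search over the bit positions. This is an illustration rather than the committed code: the name `get_msb_fallback` is hypothetical, and the sketch assumes a 32-bit `unsigned int` and a nonzero input.

```c
/* Returns the index (0..31) of the most significant set bit of n.
 * Assumes n != 0 and that unsigned int is 32 bits wide. */
static int get_msb_fallback(unsigned int n) {
  int log = 0;
  unsigned int value = n;
  int i;
  /* Test shifts of 16, 8, 4, 2, 1: whenever the upper part is nonzero,
   * keep it and add the shift amount to the result. */
  for (i = 4; i >= 0; --i) {
    const int shift = 1 << i;
    const unsigned int upper = value >> shift;
    if (upper != 0) {
      value = upper;
      log += shift;
    }
  }
  return log;
}
```

On compilers that provide __builtin_clz(), the same result for a 32-bit nonzero input is `31 ^ __builtin_clz(n)`, so the fallback is only needed where neither intrinsic is available.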
    • Add initial intra frame neon optimization. 1~2% gain. · 691111aa
      hkuang authored
      More intra optimizations will be added.
      Change-Id: I33ae8d93f6002bf7b64cc2669602d9e6bfa5a6e8
    • AVX2 Variance Optimization · 357b6536
      levytamar82 authored
      Optimizing the variance functions vp9_variance16x16, vp9_variance32x32,
      vp9_variance64x64, vp9_variance32x16, vp9_variance64x32, and
      vp9_mse16x16 by migrating them to AVX2.
      Some of the functions were optimized by processing 32 elements instead of 16.
      Some of the functions were optimized by processing 2 loop strides of 16
      elements in a single 256-bit register.
      This optimization gives a 2.4% - 2.7% user-level performance gain
      and a 42% function-level gain.
      Change-Id: I265ae08a2b0196057a224a86450153ef3aebd85d
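For reference, the quantity these SIMD routines compute can be written as a scalar loop. The sketch below is a plain C illustration with a hypothetical name and signature, not the AVX2 code from the commit. It accumulates the sum of differences and the sum of squared differences over a w x h block, then returns SSE minus sum²/(w·h).

```c
#include <stdint.h>

/* Scalar block variance: writes the sum of squared errors to *sse and
 * returns sse - sum * sum / (w * h). */
static unsigned int variance_scalar_sketch(const uint8_t *src, int src_stride,
                                           const uint8_t *ref, int ref_stride,
                                           int w, int h, unsigned int *sse) {
  int64_t sum = 0;
  uint64_t sq = 0;
  int x, y;
  for (y = 0; y < h; ++y) {
    for (x = 0; x < w; ++x) {
      const int diff = src[y * src_stride + x] - ref[y * ref_stride + x];
      sum += diff;
      sq += (uint64_t)(diff * diff);
    }
  }
  *sse = (unsigned int)sq;
  return (unsigned int)(sq - (uint64_t)((sum * sum) / (w * h)));
}
```

A vectorized version performs the same two accumulations on many pixels per instruction, which is where wider registers help.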
    • Replace RD modeling with a fixed point approximation. · f2ca665f
      Alex Converse authored
      Change-Id: I44eb44eb3f36c05d916ef140ef42cc84f72f99ec
    • Fix rate allocation bug. · d7b49b28
      Paul Wilkins authored
      Fix misalignment of the frames contributing to the
      error score and bit allocation for gf/arf groups.
      Initial results are slightly positive.
      Change-Id: Ie508bdcfdac52e592d48e1f13e01b3551b523deb
  7. 07 Jan, 2014 5 commits
    • Further rate control cleanups · 730ade41
      Deb Mukherjee authored
      Some cleanups on frames_to_key, frames_since_key.
      Also removes the unused fixed_q parameters in vp9.
      Change-Id: If8743a32c71de30a8d17136477b53d607a7acda8
    • Fix an issue in motion vector prediction stage · 06e4f825
      Jingning Han authored
      The previous implementation stops motion vector prediction test when
      the zero motion vector appears for the second time. This commit fixes
      it by simply skipping the second time check on zero mv and continuing
      on to next mv candidate.
      It slightly improves stdhd in speed 2 by 0.06% on average. Most static
      sequences are not affected. A few hard ones, like jet, ped, and riverbed
      were improved by 0.1 - 0.2%.
      Change-Id: Ia8d4e2ffb7136669e8ad1fb24ea6e8fdd6b9a3c1
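The control-flow change described above can be sketched as follows. The candidate list, the type, and the counter standing in for the actual prediction test are all hypothetical. The point is only the difference between stopping at a repeated zero vector and skipping it.

```c
/* Hypothetical candidate type for illustration. */
typedef struct {
  int row;
  int col;
} MvCandidateSketch;

/* Counts how many candidates would be tested. A repeated zero vector is
 * skipped with `continue`, so later candidates are still examined; the
 * earlier behaviour corresponds to a `break` at that point. */
static int count_tested_candidates_sketch(const MvCandidateSketch *cands,
                                          int n) {
  int zero_seen = 0;
  int tested = 0;
  int i;
  for (i = 0; i < n; ++i) {
    const int is_zero = cands[i].row == 0 && cands[i].col == 0;
    if (is_zero) {
      if (zero_seen) continue; /* duplicate zero mv: skip, do not stop */
      zero_seen = 1;
    }
    ++tested; /* stand-in for running the prediction test on cands[i] */
  }
  return tested;
}
```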
    • Remove deprecated variable from rt_speed_feature · fdad4fd2
      Jingning Han authored
      This resolves a merge error.
      Change-Id: Ifb83acc0a08e80c82f7624f9c86f79d3a86cc871
    • Removing unused mvp_full manipulation code. · 6a7a7341
      Dmitry Kovalev authored
      The code can be removed because mvp_full will be overridden after that.
      Change-Id: I89559b1b6914c86bcd02b7359d37241948ac11d3
    • Adding new_mv local variable. · c015ba5f
      Dmitry Kovalev authored
      Change-Id: I9631b35810c232c134f39dc0edadb1b3860a45ae