1. 14 Jan, 2014 1 commit
    • Minor fix on an assert · 1699d6bd
      Deb Mukherjee authored
      Fixes assert that fails occasionally on small values of
      max-key frame intervals. Also, adds a small change on
      updating frames_to_key for frame drops.
      
      Change-Id: Icc2b33b25e3e4ced7e49f8db73e0a887ef9c99e0
  2. 13 Jan, 2014 4 commits
    • Adding mv_has_subpel() function. · b02c72b5
      Dmitry Kovalev authored
      Change-Id: I50922bb1a689f8515debaa018f850b231c21189f
    • Enable reference frame masking for rt mode · 5e5d4c0e
      Yaowu Xu authored
      Reference frame masking gave good quality mode a gain of about 5% in
      encoding speed. This commit enables it for rt mode to bring the speed
      improvement there as well.

      In addition, this commit moves the speed feature setup to a separate
      function.
      
      Change-Id: I015e8f78bbb21dd43ae183b9b9355bea2ccda9c5
    • No arf right before real scene cut. · a00dad39
      Paul Wilkins authored
      To reduce pulsing we now allow an arf just before forced key frames
      and at the end of a clip or section (which may be stitched to
      another clip or section). However, this does not make sense for
      key frames arising from real scene cuts.
      
      Change from original patch reflects other recent changes in regard
      to alignment of gf/arf and kf groups.
      
      Change-Id: I074a91d1207e9b3e28085af982f6718aa599775f
    • Further rate control tweaks and fixes. · 603075fa
      Paul Wilkins authored
      Further fixes regarding min and max rate.
      Bug fixes re kf group bits and last kf group.
      
      Change-Id: Iaafd719d30a489e135a3c55851ce8c632091a436
  3. 11 Jan, 2014 1 commit
    • Cleaning up and fixing psnr calculation code. · 4def0a81
      Dmitry Kovalev authored
      Introducing calc_psnr() which calculates psnr between two yv12 buffers.
      Previously we incorrectly used width/height instead of
      crop_width/crop_height to calculate number of samples -- fixed.
      
      Change-Id: Iecda01980555de55ad347e0276e6641c793fa56c
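The sample-count fix described in this commit can be illustrated with a minimal sketch. It is not the calc_psnr() from the commit: the `plane_t` struct and the `plane_sse()` and `plane_mse()` helpers are hypothetical names chosen for illustration. The point is only that the error sum and the sample count both use the cropped (visible) dimensions, so alignment padding never enters the result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical plane descriptor (not a libvpx type): the buffer is
 * allocated with aligned dimensions (width x height), but only
 * crop_width x crop_height samples are visible picture data. */
typedef struct {
  const uint8_t *data;
  int stride;
  int width, height;
  int crop_width, crop_height;
} plane_t;

/* Sum of squared differences over the cropped region only. */
static uint64_t plane_sse(const plane_t *a, const plane_t *b) {
  uint64_t sse = 0;
  int r, c;
  assert(a->crop_width == b->crop_width);
  assert(a->crop_height == b->crop_height);
  for (r = 0; r < a->crop_height; ++r) {
    for (c = 0; c < a->crop_width; ++c) {
      const int d = a->data[r * a->stride + c] - b->data[r * b->stride + c];
      sse += (uint64_t)(d * d);
    }
  }
  return sse;
}

/* Mean squared error, normalized by the cropped sample count.
 * For 8-bit video, PSNR then follows as 10 * log10(255^2 / mse). */
static double plane_mse(const plane_t *a, const plane_t *b) {
  const double samples = (double)a->crop_width * a->crop_height;
  return (double)plane_sse(a, b) / samples;
}
```

Dividing by width * height instead would count padding samples that are not part of the picture, which changes the reported PSNR whenever the aligned and cropped sizes differ.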
  4. 10 Jan, 2014 9 commits
    • explain speed features · 6439aa5a
      Jim Bankoski authored
      Added comments to explain what the various speed features do, and removed
      one that was clearly unused.
      
      Change-Id: Icd37a536072ddafedbfaefcecbe48979f6d10faf
    • Declare setup_buffer_inter in vp9_rdopt.h · db2b350d
      Jingning Han authored
      This function initializes buffer pointers and first stage motion vector
      prediction. It will be needed by both the regular rate-distortion
      optimization loop and the non-RD mode decision. Hence, move its
      declaration to vp9_rdopt.h.
      
      Change-Id: I64e8b6316c9d05f20756a62721533a2e4d158235
    • Removing mi_height_log2_lookup table. · 96be0a50
      Dmitry Kovalev authored
      Change-Id: I1f0ae2edc3a96b33c0494d165ae756a8feba6184
    • Cleaning up vp9_dx_iface.c. · 21ededd4
      Dmitry Kovalev authored
      Change-Id: I6a0dfb95c55ee6cadc7b1675782c7830e5c7caaf
    • Cleaning up vp9_rc_postencode_update() function. · 447eece3
      Dmitry Kovalev authored
      Change-Id: I02e44c10660fdb9201a802ad19ceb64756feeebe
    • Don't use gf_update by default for 1-pass CBR. · c46538d4
      Marco Paniconi authored
      Change-Id: I5df6abceb0a2a69706feadeb820b593cae88f573
    • Revert "SSSE3 convolution optimization" · b6452571
      Paul Wilkins authored
      This reverts commit 511d218c.
      
      In current form intrinsics break borg build.
      
      Change-Id: Ied37936af841250ecff449802e69a3d3761c91b9
    • Enable skipping reference frame check in rd loop · d66c7486
      Jingning Han authored
      This commit allows the encoder to compare, per frame, the SAD cost
      associated with the best motion vector predictor of each reference
      frame. If this cost for one reference frame is more than 4 times the
      best SAD cost given by the other reference frames, the NEARESTMV,
      NEARMV, and ZEROMV mode checks are skipped for that reference frame.
      
      This setting is turned on in speed 2 and above. Compression quality
      change in speed 2:
      derf  -0.014%
      yt    -0.097%
      hd    -0.023%
      stdhd  0.046%
      
      It reduces the speed 2 runtime of test sequences:
      pedestrian_area_1080p 4000 kbps 310763 ms -> 303595 ms
      bluesky_1080p 6000 kbps         259852 ms -> 251920 ms
      
      Change-Id: I7f59cf79503d51836d61d56d50dc5bdf0e502e22
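The skip rule described in this commit can be sketched as follows. This is an illustration of the 4x threshold only, not the encoder's code: `NUM_REFS` and `mark_refs_to_skip()` are hypothetical names, and the SAD costs are assumed to be already computed for each reference frame.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REFS 3 /* hypothetical: e.g. last, golden, and alt-ref frames */

/* Sets skip[i] to 1 when the best-predictor SAD of reference i is more
 * than 4 times the smallest SAD among the references, and to 0 otherwise.
 * The reference holding the minimum is never skipped. */
static void mark_refs_to_skip(const unsigned int pred_sad[NUM_REFS],
                              int skip[NUM_REFS]) {
  unsigned int best = pred_sad[0];
  int i;
  assert(pred_sad != 0 && skip != 0);
  for (i = 1; i < NUM_REFS; ++i) {
    if (pred_sad[i] < best) best = pred_sad[i];
  }
  for (i = 0; i < NUM_REFS; ++i) {
    /* Widen to 64 bits so that 4 * best cannot overflow. */
    skip[i] = (uint64_t)pred_sad[i] > 4 * (uint64_t)best;
  }
}
```

Comparing against the overall minimum is equivalent to comparing against the best of the other references: a reference that holds the minimum can never exceed 4 times a larger value.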
    • Cleanups on refresh flags · 412e4954
      Deb Mukherjee authored
      Cleanups on frame refresh flags and external overrides.
      
      Change-Id: Ia6a56fe1bde906b1dc3fcbf4ef1c7b207cd2df2d
  5. 09 Jan, 2014 7 commits
    • Keep buffer clipped to maximum in change_config. · 193fa5c8
      Marco Paniconi authored
      Under a configuration change where the bitrate suddenly decreases,
      the buffer level may be larger than the maximum allowed (for the first
      frame to be encoded after change_config).
      This change keeps it clipped to its maximum level.
      
      Change-Id: I4d0b5b3d1fd8148600dd39e02bd630c9464baba5
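The clipping described in this commit amounts to a clamp applied when the configuration changes. A minimal sketch, assuming a hypothetical `rc_state_t` struct and `apply_new_max_buffer()` helper (neither is taken from the encoder):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rate-control state; the field names are illustrative. */
typedef struct {
  int64_t buffer_level;        /* current buffer fullness, in bits */
  int64_t maximum_buffer_size; /* capacity under the current config */
} rc_state_t;

/* Installs a new maximum and clips the level so that the first frame
 * encoded after the change never sees a level above the maximum. */
static void apply_new_max_buffer(rc_state_t *rc, int64_t new_max) {
  assert(new_max >= 0);
  rc->maximum_buffer_size = new_max;
  if (rc->buffer_level > rc->maximum_buffer_size) {
    rc->buffer_level = rc->maximum_buffer_size;
  }
}
```

Raising the maximum leaves the level untouched; only a decrease below the current level triggers the clip.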
    • Simplify set_rt_speed_feature() · 2d381d76
      Yaowu Xu authored
      1. Made speed choices to be progressive
      2. Adjusted rt speed settings to achieve better speed/quality
      
      Overall, rt-5 gained 2.5% in compression/quality, and the encoding time
      of the 720p niklas clip goes from 137,052ms to 121,874ms.
      
      Change-Id: Ia6e7e1e15225395a868a2f1059c3db8e266e1600
    • Optimize inv 16x16 DCT with 10 non-zero coeffs - P2 · af31b27a
      Jingning Han authored
      This commit further optimizes SSE2 operations in the second 1-D
      inverse 16x16 DCT, with (<10) non-zero coefficients. The average
      runtime of this module goes down from 779 cycles -> 725 cycles.
      
      Change-Id: Iac31b123640d9b1e8f906e770702936b71f0ba7f
    • SSSE3 convolution optimization · 511d218c
      levytamar82 authored
      Optimizing all SSSE3 assembly for convolution:
      1. vp9_filter_block1d4_h8_sse2
      2. vp9_filter_block1d8_h8_sse2
      3. vp9_filter_block1d16_h8_sse2
      4. vp9_filter_block1d4_v8_sse2
      5. vp9_filter_block1d8_v8_sse2
      6. vp9_filter_block1d16_v8_sse2
      The optimizations include:
      - processing 2x8 elements in one 128-bit register instead of processing
        8 elements in one 128-bit register.
      - removing unnecessary loads.
      This optimization gives a user-level gain of 2.4% for 480p input
      and 1.6% for 720p.
      This optimization is done only for 64-bit.
      
      Change-Id: Icb586dc0c938b56699864fcee6c52fd43b36b969
    • Use the correct member for initialization · 719dadf3
      Johann authored
      On Windows this fails with:
      error C2440: 'initializing': cannot convert from int_mv to uint32_t
      
      Change-Id: I51630efd0e83a0ce620c91aa7859dd6fc1572e99
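The commit message does not show the offending line, so the following is only a sketch of how this class of error arises. It assumes int_mv is a union with an integer member and a motion vector member; the `mv_bits()` helper is hypothetical. A union value cannot initialize an integer directly, so the integer member has to be named explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: a motion vector that can also be
 * read as a single 32-bit integer. */
typedef struct {
  int16_t row;
  int16_t col;
} MV;

typedef union {
  uint32_t as_int;
  MV as_mv;
} int_mv;

/* Hypothetical helper showing the corrected form of the initialization. */
static uint32_t mv_bits(const int_mv *mv) {
  /* const uint32_t bits = *mv;      rejected: int_mv is not a uint32_t */
  const uint32_t bits = mv->as_int; /* accepted: names the integer member */
  assert(sizeof(mv->as_int) == sizeof(uint32_t));
  return bits;
}
```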
    • Using VP9_COMMON instead of VP9_COMP. · b16fac42
      Dmitry Kovalev authored
      Change-Id: If7d3958653104f3e170853e931f8489de3ecf3cc
    • Adding {get, set}_rate_correction_factor() functions. · c01fe86c
      Dmitry Kovalev authored
      Change-Id: Ib3212832953a3445fc5f021af0e1de7886f09b4f
  6. 08 Jan, 2014 10 commits
    • Optimize inv 16x16 DCT with 10 non-zero coeffs - P1 · ba6ab46c
      Jingning Han authored
      This commit is the first patch optimizing the SSE2 implementation of the
      inverse 16x16 DCT with <10 non-zero coefficients. It focuses on the first
      1-D (row) transformation. It exploits the fact that only the top-left 4x4
      block contains non-zero coefficients in a 2-D inverse 16x16 DCT with <10
      coefficients.
      
      The average runtime of idct16x16_10 unit is reduced from
      883 cycles -> 779 cycles (12% faster).
      
      For pedestrian_area_1080p 300 frames at 4000 kbps, the speed 2 runtime goes
      down from 310651 ms  -> 305910 ms. The decoding speed goes up from
      80.37 fps -> 80.87 fps.
      
      Change-Id: Ic6f3ac5a637a76c07ba73ddaafe318a699fea645
    • Removing direct references to {lst_fb, gld_fb, alt_fb}_idx fields. · 510a8282
      Dmitry Kovalev authored
      Change-Id: Ib1d9628d2b538b6dc27b0db1fa7f40f70ff2072f
    • Cleanups around cpi->common. · 0ecd583d
      Dmitry Kovalev authored
      Change-Id: I0c42a729038d0f4cb7bc07f587d066fcb1dfe9d9
    • Renaming 'Mode' to 'mode'. · 962c8b24
      Dmitry Kovalev authored
      Change-Id: I6cdd670d66288dbd66228f38bba6b30502d25362
    • Renaming 'Sharpness' to 'sharpness'. · 57be8136
      Dmitry Kovalev authored
      Change-Id: I54513dc3b3321e0c0bb6b15ea5c34085ed80b4a4
    • Add a C fallback for get_msb() and change inline to INLINE. · ce7ff3b6
      Alex Converse authored
      For systems without __builtin_clz() or _BitScanReverse(). Taken from libwebp.
      
      Change-Id: Iead257efc1772c466c79e1dc0356ed571d38d43e
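A portable fallback of the kind this commit describes can be written as a short binary search over shift amounts. The sketch below is an illustration and not necessarily the code that was committed: the name `get_msb_fallback()` is hypothetical, and a 32-bit `unsigned int` is assumed.

```c
#include <assert.h>

/* Returns the index of the most significant set bit, that is,
 * floor(log2(n)). The input must be nonzero. Assumes unsigned int
 * is 32 bits wide. */
static int get_msb_fallback(unsigned int n) {
  int log = 0;
  int i;
  assert(n != 0);
  /* Try shifts of 16, 8, 4, 2, and 1. Whenever the upper part is
   * nonzero, keep it and add the shift to the result. */
  for (i = 4; i >= 0; --i) {
    const int shift = 1 << i;
    const unsigned int x = n >> shift;
    if (x != 0) {
      n = x;
      log += shift;
    }
  }
  return log;
}
```

For example, an input of 255 yields 7 and an input of 256 yields 8. On compilers that provide `__builtin_clz()`, the same value is `31 ^ __builtin_clz(n)` for 32-bit inputs.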
    • Add initial intra frame neon optimization. 1~2% gain. · 691111aa
      hkuang authored
      More intra optimizations will be added.
      
      Change-Id: I33ae8d93f6002bf7b64cc2669602d9e6bfa5a6e8
    • AVX2 Variance Optimization · 357b6536
      levytamar82 authored
      Optimizing the variance functions vp9_variance16x16, vp9_variance32x32,
      vp9_variance64x64, vp9_variance32x16, vp9_variance64x32, and
      vp9_mse16x16 by migrating them to AVX2.
      Some of the functions were optimized by processing 32 elements instead of 16.
      Some of the functions were optimized by processing 2 loop strides of 16
      elements in a single 256-bit register.
      This optimization gives a 2.4% - 2.7% user-level performance gain
      and a 42% function-level gain.
      
      Change-Id: I265ae08a2b0196057a224a86450153ef3aebd85d
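For context, the quantity these functions compute can be written as a scalar reference. The sketch below is not the AVX2 code from the commit: `variance_wxh()` is a hypothetical helper showing the usual block variance, the sum of squared differences minus the squared sum of differences divided by the number of samples. A SIMD version accumulates the same two sums over several samples per instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference for a w x h block. Writes the sum of squared
 * differences to *sse and returns sse - sum^2 / (w * h). */
static unsigned int variance_wxh(const uint8_t *src, int src_stride,
                                 const uint8_t *ref, int ref_stride,
                                 int w, int h, unsigned int *sse) {
  int64_t sum = 0;
  uint64_t sq = 0;
  int r, c;
  assert(w > 0 && h > 0);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int d = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += d;
      sq += (uint64_t)(d * d);
    }
  }
  *sse = (unsigned int)sq;
  return (unsigned int)(sq - (uint64_t)((sum * sum) / (w * h)));
}
```

A constant offset between the two blocks gives a variance of 0 while the SSE stays nonzero, which is the difference between the variance functions and an MSE-style function that reports only the squared error.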
    • Replace RD modeling with a fixed point approximation. · f2ca665f
      Alex Converse authored
      Change-Id: I44eb44eb3f36c05d916ef140ef42cc84f72f99ec
    • Fix rate allocation bug. · d7b49b28
      Paul Wilkins authored
      Fix misalignment of the frames contributing to the
      error score and bit allocation for gf/arf groups.

      Initial results are slightly positive.
      
      Change-Id: Ie508bdcfdac52e592d48e1f13e01b3551b523deb
  7. 07 Jan, 2014 8 commits