1. 26 Oct, 2011 1 commit
    • Reduce partial frame copy in encoder's pick_filter_level_fast · de828094
      Attila Nagy authored
      The partial frame copy function used to copy an extra 8 lines above
      and below. The partial frame filtering can only modify 3 pixel rows
      above the partial frame. Reduce the copy to the bare minimum needed,
      which is 4 lines, so that partial filtering on the copied frame is
      possible.
      
      Define the "magic" fraction number for partial filtering in
      loopfilter.h.
      
      Change-Id: I4791ffc541b6884b12759a0d0714a8faf16147ec
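The reduced copy described above can be sketched as follows. This is a minimal illustration, not the libvpx code: the function name, the constant name, and the flat luma-plane layout are assumptions, and only the 4-row margin above the partial frame comes from the commit message.

```c
#include <string.h>

/* Hypothetical constant: the commit message says 4 extra rows suffice. */
#define PARTIAL_COPY_MARGIN 4

/* Copy rows [first_row - margin, first_row + num_rows) of a luma plane,
 * clamping the range to the frame. Returns the number of rows copied.
 * Both planes are assumed to share the same stride and height. */
static int copy_partial_rows(const unsigned char *src, unsigned char *dst,
                             int stride, int height,
                             int first_row, int num_rows)
{
    int start = first_row - PARTIAL_COPY_MARGIN;
    int end = first_row + num_rows;

    if (start < 0) start = 0;
    if (end > height) end = height;
    if (end <= start) return 0;

    memcpy(dst + (size_t)start * stride, src + (size_t)start * stride,
           (size_t)(end - start) * stride);
    return end - start;
}
```

Rows outside the copied range are left untouched in the destination, which is where the saving over a wider margin would come from.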
  2. 24 Oct, 2011 1 commit
  3. 20 Oct, 2011 2 commits
    • Fix: check cx_data buffer prior to write · bc715113
      James Berry authored
      Check that the cx_data buffer has enough room before writing to it.
      The prior behavior did not check, which could result in a crash.
      
      Change-Id: I3fab6f2bc4a96d7c675ea81acd39ece121738b28
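A bounds check of this kind might look like the sketch below. The function and parameter names are hypothetical and do not come from the encoder source. It only shows the pattern of refusing a write that would overflow the output buffer.

```c
#include <stddef.h>
#include <string.h>

/* Append `len` bytes to an output buffer only if they fit.
 * Returns 0 on success, -1 if the write would overflow the buffer. */
static int checked_append(unsigned char *buf, size_t buf_sz, size_t *used,
                          const unsigned char *data, size_t len)
{
    /* Compare against the remaining space; written this way the
     * subtraction cannot wrap around. */
    if (*used > buf_sz || len > buf_sz - *used)
        return -1;

    memcpy(buf + *used, data, len);
    *used += len;
    return 0;
}
```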
    • Don't copy borders for loop_filter_pick · 7cdc986c
      Johann authored
      During the _pick only the Y plane is examined. In addition, data beyond
      the borders of the frame is not read.
      
      Change-Id: Ic549adfca70fc6e0b55f8aab0efe81f0afac89f9
  4. 18 Oct, 2011 1 commit
    • Remove usage of predict buffer for decode · ed9c66f5
      Scott LaVarnway authored
      Instead of using the predict buffer, the decoder now writes
      the predictor into the recon buffer.  For blocks with eob=0,
      unnecessary idcts can be eliminated.  This gave a performance
      boost of ~1.8% for the HD clips used.
      
      Tero: Added needed changes to ARM side and scheduled some
            assembly code to prevent interlocks.
      
      Patch Set 6:  Merged (I1bcdca7a95aacc3a181b9faa6b10e3a71ee24df3)
      into this commit because of similarities in the idct
      functions.
      Patch Set 7: EC bug fix.
      
      Change-Id: Ie31d90b5d3522e1108163f2ac491e455e3f955e6
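The idea of writing the predictor straight into the reconstruction buffer and skipping the inverse transform when eob is 0 can be sketched like this. The names are hypothetical, and the "inverse transform" here is a deliberately simplified stand-in, not the VP8 IDCT.

```c
#include <string.h>

static int inverse_calls = 0;  /* counts how often the stand-in transform runs */

/* Stand-in for "inverse transform and add": NOT the VP8 IDCT. It treats
 * coeffs[0] / 8 as a flat residual for the whole 4x4 block. */
static void placeholder_inverse_add(const short *coeffs,
                                    unsigned char *recon, int stride)
{
    int residual = coeffs[0] / 8;
    inverse_calls++;
    for (int r = 0; r < 4; r++) {
        for (int c = 0; c < 4; c++) {
            int v = recon[r * stride + c] + residual;
            recon[r * stride + c] =
                (unsigned char)(v < 0 ? 0 : v > 255 ? 255 : v);
        }
    }
}

/* Reconstruct one 4x4 block. The predictor goes directly into recon;
 * with eob == 0 there is no residual, so the transform is skipped. */
static void reconstruct_4x4(const unsigned char *pred, const short *coeffs,
                            int eob, unsigned char *recon, int stride)
{
    for (int r = 0; r < 4; r++)
        memcpy(recon + r * stride, pred + r * 4, 4);

    if (eob == 0)
        return;  /* predictor is already the final reconstruction */

    placeholder_inverse_add(coeffs, recon, stride);
}
```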
  5. 14 Oct, 2011 1 commit
    • Fix: vp8cx_pack_tokens_into_partitions_armv5 crash · a5cd42fe
      Attila Nagy authored
      It was crashing when the number of partitions was greater than the
      number of MB rows (e.g. 128x96 with 8 partitions).
      The start point was not checked against mb_rows, and the extra
      "empty" partitions were not written out.
      
      Change-Id: I9c2f013b9ec022354b658fab4ef799ff8b1de93d
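The failing case can be illustrated with a small sketch, assuming macroblock rows are handed out to partitions round-robin (row r goes to partition r modulo the partition count). The helper names are hypothetical. With a 128x96 frame there are 6 macroblock rows, so with 8 partitions the last two start past the final row and must be treated as empty.

```c
/* Number of 16-pixel macroblock rows needed to cover `height` pixels. */
static int mb_rows_for_height(int height)
{
    return (height + 15) / 16;
}

/* Count the macroblock rows that land in partition `part`, assuming
 * round-robin assignment. The loop condition is the start-point check:
 * a partition whose first row is >= mb_rows gets zero rows. */
static int rows_in_partition(int part, int num_parts, int mb_rows)
{
    int count = 0;
    for (int row = part; row < mb_rows; row += num_parts)
        count++;
    return count;
}
```

A packer built on this would still need to emit a zero-length partition for every index whose row count is 0, which is the second half of the fix described above.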
  6. 11 Oct, 2011 1 commit
    • Added rate-targeted temporal scalability · 217591fd
      Adrian Grange authored
      Added the ability to create rate-targeted, temporally
      scalable, VP8 compatible bitstreams.
      
      The application vp8_scalable_patterns.c demonstrates how
      to use this capability. Users can create output bitstreams
      containing up to 5 temporally separable streams encoded
      as a single VP8 bitstream.
      (previously abandoned as:
      I92d1483e887adb274d07ce9e567e4d0314881b0a)
      
      Change-Id: I156250a3fe930be57c069d508c41b6a7a4ea8d6a
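To make "temporally separable" concrete, here is a small sketch of how frames might be assigned to layers. The commit message does not describe the patterns themselves, so the 3-layer pattern with a period of 4 frames and all names below are assumptions chosen for illustration.

```c
/* Illustrative 3-layer pattern with a period of 4 frames: 0, 2, 1, 2.
 * The pattern and names are assumptions for this sketch. */
enum { TS_PERIOD = 4 };
static const int ts_layer_pattern[TS_PERIOD] = { 0, 2, 1, 2 };

/* Temporal layer of a frame, given its index in the stream. */
static int temporal_layer_for_frame(unsigned int frame_index)
{
    return ts_layer_pattern[frame_index % TS_PERIOD];
}

/* Number of frames among the first `num_frames` that a receiver keeps
 * when it decodes only layers 0..max_layer. */
static unsigned int frames_kept(unsigned int num_frames, int max_layer)
{
    unsigned int kept = 0;
    for (unsigned int i = 0; i < num_frames; i++)
        if (temporal_layer_for_frame(i) <= max_layer)
            kept++;
    return kept;
}
```

With this pattern, dropping the top layer halves the frame rate and dropping the top two layers quarters it, while the remaining frames still form a decodable stream as long as each layer only references frames in the same or a lower layer.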
  7. 10 Oct, 2011 2 commits
  8. 04 Oct, 2011 1 commit
  9. 30 Sep, 2011 3 commits
    • Improved tokenize · ab00d209
      Scott LaVarnway authored
      For realtime HD encodes, gains of up to 1.6% were seen.
      
      Change-Id: If45028e23db95124da63f9d38ffe06e05596cc6e
    • Call vp8_find_near_mvs lazily · 7bce513a
      Alpha Lam authored
      vp8_find_near_mvs() is being called on all possible reference frames,
      but the data computed may not be used if the loop exits early, which
      can be due to x->skip being set to 1.
      
      Optimize this by calling vp8_find_near_mvs() lazily, only if the
      result is going to be used and has not been computed yet.
      
      Change-Id: Iccdbd4c962a670c9f2c99b8aca8096042ca5dc98
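The lazy-evaluation pattern described above can be sketched as a per-reference-frame cache. The types, the cache layout, and the stand-in search function are all hypothetical. Only the "compute on first use" idea comes from the commit message.

```c
#include <stdbool.h>

typedef struct { int row, col; } mv_t;  /* hypothetical motion vector type */

typedef struct {
    bool computed;  /* has the search already run for this reference? */
    mv_t nearest;   /* cached result */
} near_mv_cache;

static int find_calls = 0;  /* counts how often the expensive search runs */

/* Stand-in for the expensive search: fills in a dummy vector. */
static void expensive_find_near_mvs(int ref_frame, mv_t *nearest)
{
    find_calls++;
    nearest->row = ref_frame;
    nearest->col = -ref_frame;
}

/* Return the near MV for `ref_frame`, running the search only on the
 * first request for that reference frame. */
static const mv_t *get_near_mv(near_mv_cache *cache, int ref_frame)
{
    near_mv_cache *entry = &cache[ref_frame];
    if (!entry->computed) {
        expensive_find_near_mvs(ref_frame, &entry->nearest);
        entry->computed = true;
    }
    return &entry->nearest;
}
```

If the mode loop exits early, reference frames that were never requested never pay for the search.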
    • CQ and two pass rate control. · b6e27d5f
      Paul Wilkins authored
      Changes to the selection of Q limits for two pass
      and two pass CQ mode.
      
      Allowance made for Mode and motion vector costs.
      Some refactoring of common code.
      
      For Derf and YT sets CQ mode average improvement
      circa 1% (SSIM and Global PSNR).
      
      Some increased tendency to undershoot even when
      user CQ not reached.
      
      Patch2: Removed some test code accidentally merged.
      
      Change-Id: Icf74d13af77437c08602571dc7a97e747cce5066
  10. 29 Sep, 2011 1 commit
    • Multithreaded encoder, late sync loopfilter · 380d64ec
      Attila Nagy authored
      Sync with the loopfilter thread just at the beginning of the next frame's encoding.
      This returns control to the application faster and allows better multicore scaling.
      When PSNR packets are generated, the final filtered frame is needed immediately,
      so the sync cannot be delayed.
      
      Change-Id: I288d97b5e331d41d6f5bb49d97986fa12ac6f066
  11. 22 Sep, 2011 1 commit
  12. 20 Sep, 2011 3 commits
    • Move neon only arm functions under arm/neon. · bd0c3409
      Fritz Koenig authored
      These files don't contain generic ARM code, so they should
      only be compiled for NEON.
      
      Change-Id: Ie712823aa04d4235e7cfe7a3b725e73ee4c3e564
    • NEON FDCT updated to match current C code · 0c2529a8
      Tero Rintaluoma authored
      - Removed fast_fdct4x4_neon and fast_fdct8x4_neon
      - Now uses short_fdct4x4 and short_fdct8x4
      - Gives ~1-2% speed-up on Cortex-A8/A9
      
      Change-Id: Ib62f2cb2080ae719f8fa1d518a3a5e71278a41ec
    • Fixed armv5te multiplications · 3c19bc3f
      Tero Rintaluoma authored
      The Rd and Rm registers must be different in 'mul'. Using the same
      register for both results in unpredictable behaviour. GCC gives
      a warning and RVCT an error in this case.
      
      The restriction applies only to armv5 targets, not to armv6 and above.
      
      Change-Id: I378d17c51e1f16a6820814fbed43e115aaabb03e
  13. 19 Sep, 2011 2 commits
    • Updated ARMv6 forward transforms to match C · 4c3ad66b
      Tero Rintaluoma authored
      - Updated walsh transform to match C
        (based on Change Id24f3392)
      - Changed fast_fdct4x4 and 8x4 to short_fdct4x4 and 8x4
        correspondingly
      
      Change-Id: I704e862f40e315b0a79997633c7bd9c347166a8e
    • NEON walsh transform updated to match C · 2a4b2a00
      Tero Rintaluoma authored
      Modified original patch If2f07220885c4c3a0cae0dace34ea0e36124f001
      according to comments. Scheduled code a little bit to prevent some
      interlocks.
      
      Change-Id: I338f02b881098782f82af63d97f042b85e63e902
  14. 13 Sep, 2011 1 commit
    • Fixed encoder crash · 5bc7b3a6
      Scott LaVarnway authored
      caused by the "Removed bmi copy to/from BLOCKD" commit.
      
      Change-Id: I9fae71bdc34c8ecc07bb81cd3ccf498b91ce3ec7
  15. 31 Aug, 2011 1 commit
  16. 30 Aug, 2011 1 commit
  17. 25 Aug, 2011 1 commit
    • Minor modification on key frame decision · 1f20202e
      Yunqing Wang authored
      This change makes sure that key frames are not recoded in real-time
      mode, even if CONFIG_REALTIME_ONLY is not configured.
      
      Change-Id: Ifc34141f3217a6bb63cc087d78b111fadb35eec2
  18. 24 Aug, 2011 2 commits
    • Quiet warning by removing unused variable. · 4797a972
      Fritz Koenig authored
      fwd_boost_score was not being computed or
      referenced, so remove declaration.
      
      Change-Id: Iece36cde1ec113e3c6afaff1407d24cdf12bd0a8
    • Removed bmi copy to/from BLOCKD · b870947d
      Scott LaVarnway authored
      for SPLITMV and B_PRED modes.  Modified code to use the bmi
      found in mode_info_context instead of BLOCKD.  On the decode
      side, the uvmvs are calculated only when required, instead of
      for every macroblock.  This is WIP. (bmi should eventually be
      removed from BLOCKD)
      Small performance gains noticed for RT encodes and decodes (VGA).
      
      Change-Id: I2ed7f0fd5ca733655df684aa82da575c77a973e7
  19. 23 Aug, 2011 1 commit
    • Use local labels for jumps/loops in x86 assembly. · c5f890af
      Fritz Koenig authored
      Prepend . to local labels in assembly code.  This
      allows non-unique labels within a file.  It also
      makes profiling information more informative
      by keeping the function name with the loop name.
      
      Change-Id: I7a983cb3a5ba2413d5dafd0a37936b268fb9e37f
  20. 22 Aug, 2011 2 commits
  21. 19 Aug, 2011 2 commits
    • Reclassify optimized ssim calculations as SSE2. · 01376858
      Fritz Koenig authored
      Calculations were incorrectly classified as either
      SSE3 or SSSE3, although they only use SSE2 instructions.
      Clean up function names and make non-RTCD code work
      as well.
      
      Change-Id: I29f5c2ead342b2086a468029c15e2c1d948b5d97
    • Copy less when active map is in use · 4e8d35a4
      Alpha Lam authored
      When an active map is specified and the current frame is not a key
      frame, golden frame, or altref frame, copy only the active regions.
      
      This significantly reduces encoding time, by as much as 19% on the test
      system where realtime encoding is used. This is particularly useful
      when the frame size is large (e.g. 2560x1600) and there are only a few
      active macroblocks.
      
      Change-Id: If394a813ec2df5a0201745d1348dbde4278f7ad4
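Copying only the active regions might look like the sketch below. It assumes one map byte per 16x16 macroblock, non-zero meaning active, and a single luma plane. The function name and layout are illustrative rather than taken from the encoder.

```c
#include <string.h>

#define MB_SIZE 16  /* macroblock width and height in pixels */

/* Copy only the macroblocks flagged in `active_map` (one byte per
 * macroblock, non-zero = active) from src to dst. Returns the number
 * of macroblocks copied. Both planes are assumed to share `stride`. */
static int copy_active_mbs(const unsigned char *src, unsigned char *dst,
                           int stride, const unsigned char *active_map,
                           int mb_rows, int mb_cols)
{
    int copied = 0;
    for (int mb_row = 0; mb_row < mb_rows; mb_row++) {
        for (int mb_col = 0; mb_col < mb_cols; mb_col++) {
            if (!active_map[mb_row * mb_cols + mb_col])
                continue;  /* inactive: skip the copy entirely */

            size_t offset = (size_t)mb_row * MB_SIZE * stride
                          + (size_t)mb_col * MB_SIZE;
            for (int r = 0; r < MB_SIZE; r++)
                memcpy(dst + offset + (size_t)r * stride,
                       src + offset + (size_t)r * stride, MB_SIZE);
            copied++;
        }
    }
    return copied;
}
```

The fewer macroblocks are active, the less memory traffic the copy generates, which fits the larger gains reported for big frames with little activity.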
  22. 17 Aug, 2011 1 commit
    • Small boost to every other frame. · 744f4823
      Paul Wilkins authored
      Instead of a single mid GF boost, apply a few extra bits to
      every other frame. This gives a very small average metrics
      improvement on both derf and YT sets.
      
      Also use min GF interval as min KF interval.
      
      Change-Id: Iee238b8cae0ffaed850a5a944ac825cee18da485
  23. 16 Aug, 2011 1 commit
    • Faster vp8_default_coef_probs · 19987dcb
      Scott LaVarnway authored
      Copies from a generated table instead of building the
      default coeff probabilities during runtime.
      
      Change-Id: I4d9551ea3a2d7d4a4f7ce9eda006495221a8de50
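The change from building the probabilities at runtime to copying a pre-generated table can be sketched as below. The table dimensions and values are made up for illustration and are much smaller than the real coefficient probability tables.

```c
#include <string.h>

/* Tiny illustrative dimensions, not the real table shape. */
enum { NUM_BANDS = 2, NUM_PROBS = 3 };

/* Pre-generated constant table. The values here are placeholders. */
static const unsigned char default_probs[NUM_BANDS][NUM_PROBS] = {
    { 128,  64, 32 },
    { 200, 100, 50 },
};

/* Initialise a working copy with a single memcpy instead of
 * recomputing every entry at runtime. */
static void load_default_probs(unsigned char probs[NUM_BANDS][NUM_PROBS])
{
    memcpy(probs, default_probs, sizeof(default_probs));
}
```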
  24. 12 Aug, 2011 1 commit
    • Revert "Improved 1-pass CBR rate control" · e9613170
      John Koleszar authored
      This reverts commit b5ea2fbc. Further
      testing showed noticeable keyframe popping in some cases, so this is
      reverted for now to give time for a proper fix.
      
      Conflicts:
      
      	vp8/encoder/onyx_if.c
      	vp8/encoder/ratectrl.c
      
      Change-Id: I159f53d1bf0e24c035754ab3ded8ccfd58fd04af
  25. 03 Aug, 2011 2 commits
    • Fix source buffer selection · 238dae86
      John Koleszar authored
      This patch fixes a bug in the interaction between the recode loop and
      spatial resampling. If the codec was in a spatial resampling state,
      and a subsequent iteration of the recode loop disables resampling,
      then the source buffer must be reset to the unscaled source.
      
      Change-Id: I4e4cd47b943f6cd26a47449dc7f4255b38e27c77
    • Adjust half-pixel only search · b9f19f89
      Yunqing Wang authored
      Changed motion search in vp8_find_best_half_pixel_step() to be the
      same as in vp8_find_best_sub_pixel_step(), which checks 5 points
      instead of 8 points. This only affects real-time mode with
      cpu-used >= 9. Tests showed a 2% encoding speedup with
      a quality loss (PSNR) of up to 0.5%.
      
      Change-Id: I16049cad1535002346d46cfdfad345bfc3dc5146
  26. 01 Aug, 2011 1 commit
  27. 29 Jul, 2011 1 commit
  28. 27 Jul, 2011 2 commits
    • Preload reference area in sub-pixel motion search (real-time mode) · 2f2302f8
      Yunqing Wang authored
      This change implements the same idea as the change "Preload reference
      area to an intermediate buffer in sub-pixel motion search." The changes
      were made to the vp8_find_best_sub_pixel_step() and
      vp8_find_best_half_pixel_step() functions, which are called when
      speed >= 5. Test results (using the tulip clip):
      
      1. On Core2 Quad machine(Linux)
      rt mode, speed (-5 ~ -8), encoding speed gain: 2% ~ 3%
      rt mode, speed (-9 ~ -11), encoding speed gain: 1% ~ 2%
      rt mode, speed (-12 ~ -14), no noticeable encoding speed gain
      
      2. On Xeon machine(Linux)
      Test on speed (-5 ~ -14) didn't show noticeable speed change.
      
      Change-Id: I21bec2d6e7fbe541fcc0f4c0366bbdf3e2076aa2
    • Fix range checks in motion search · bde2afbe
      Yunqing Wang authored
      In some situations the start motion vectors were out of range.
      This fix adjusts the range checks to make sure the vectors
      are checked and clamped.
      
      Change-Id: Ife83b7fed0882bba6d1fa559b6e63c054fd5065d
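Clamping a start motion vector into the allowed search range, as the fix above describes, can be sketched as follows. The struct and function names are hypothetical and the limits are whatever the caller supplies.

```c
typedef struct { int row, col; } mv_t;  /* hypothetical motion vector type */

typedef struct {
    int row_min, row_max;
    int col_min, col_max;
} mv_limits_t;  /* hypothetical inclusive search range */

/* Clamp one component into [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Force a start vector into the allowed range before searching.
 * Returns 1 if the vector had to be changed, 0 otherwise. */
static int clamp_start_mv(mv_t *mv, const mv_limits_t *lim)
{
    int row = clamp_int(mv->row, lim->row_min, lim->row_max);
    int col = clamp_int(mv->col, lim->col_min, lim->col_max);
    int changed = (row != mv->row) || (col != mv->col);
    mv->row = row;
    mv->col = col;
    return changed;
}
```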