1. 19 Apr, 2014 1 commit
  2. 11 Apr, 2014 1 commit
  3. 24 Mar, 2014 1 commit
  4. 24 Jan, 2014 1 commit
  5. 16 Oct, 2013 1 commit
  6. 02 Oct, 2013 1 commit
  7. 25 Jun, 2013 1 commit
    • Ronald S. Bultje's avatar
      Add averaging-SAD functions for 8-point comp-inter motion search. · c24d9223
      Ronald S. Bultje authored
      Makes first 50 frames of bus @ 1500kbps encode from 3min22.7 to 3min18.2,
      i.e. 2.3% faster. In addition, use the sub_pixel_avg functions to calc
      the variance of the averaging predictor. This is slightly suboptimal
      because the function is subpixel-position-aware, but it will (at least
      for the SSE2 version) not actually use a bilinear filter for a full-pixel
      position, thus leading to approximately the same performance compared to
      if we implemented an actual average-aware full-pixel variance function.
      That gains another 0.3 seconds (i.e. encode time goes to 3min17.4), thus
      leading to a total gain of 2.7%.
      
      Change-Id: I3f059d2b04243921868cfed2568d4fa65d7b5acd
      c24d9223
  8. 17 Jun, 2013 1 commit
  9. 22 May, 2013 1 commit
  10. 10 May, 2013 1 commit
    • Yunqing Wang's avatar
      Add joint motion search in comp_inter_inter mode(experiment) · 9f5811c2
      Yunqing Wang authored
      In current code, motion vectors got from single prediction mode are used
      in compound prediction mode directly. These motion vectors may not give
      accurate prediction since they are searched independently. In this patch,
      we took Pascal's suggestion, and did joint motion search in compound
      prediction mode to find better motion vectors in this situation.
      Test results:
      Overall PSNR: 0.570%(derf), 0.918%(stdhd);
      SSIM: 0.572%(derf), 1.009%(stdhd);
      
      The encoder is a little slower. This can be improved since some c
      code is used in motion search.
      
      Change-Id: Ib30c9240f6c56c9b070867b4ca89412a76d9f3c6
      9f5811c2
  11. 02 Mar, 2013 1 commit
  12. 28 Feb, 2013 1 commit
    • Jim Bankoski's avatar
      this commit converts all sad ptrs to uint32 · 714aa9f3
      Jim Bankoski authored
      sse4_1 code used uint16_t for returning sad, but that
      won't work for 32x32 or 64x64.   This code fixes the
      assembly for those and also reenables sse4_1 on linux
      
      Change-Id: I5ce7288d581db870a148e5f7c5092826f59edd81
      714aa9f3
  13. 27 Feb, 2013 1 commit
    • John Koleszar's avatar
      Remove unused vp9_copy32xn · 7ad8dbe4
      John Koleszar authored
      This function was part of an optimization used in VP8 that required
      caching two macroblocks. This is unused in VP9, and might not
      survive refactoring to support superblocks, so removing it for now.
      
      Change-Id: I744e585206ccc1ef9a402665c33863fc9fb46f0d
      7ad8dbe4
  14. 23 Feb, 2013 1 commit
  15. 18 Dec, 2012 1 commit
  16. 30 Nov, 2012 1 commit
  17. 27 Nov, 2012 1 commit
    • John Koleszar's avatar
      Add vp9_ prefix to all vp9 files · fcccbcbb
      John Koleszar authored
      Support for gyp which doesn't support multiple objects in the same
      static library having the same basename.
      
      Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
      fcccbcbb
  18. 18 Nov, 2012 1 commit
    • Jim Bankoski's avatar
      clean out some of the rtcd code. · f4871b6a
      Jim Bankoski authored
      This removes functions that are no longer needed and cleans up some warnings.
      
      Change-Id: I292a4c3694e9c1d68ce99cea390905b198434719
      f4871b6a
  19. 01 Nov, 2012 2 commits
  20. 31 Oct, 2012 1 commit
  21. 22 Oct, 2012 1 commit
  22. 20 Aug, 2012 1 commit
    • Ronald S. Bultje's avatar
      Superblock coding. · 5d4cffb3
      Ronald S. Bultje authored
      This commit adds a pick_sb_mode() function which selects the best 32x32
      superblock coding mode. Then it selects the best per-MB modes, compares
      the two and encodes that in the bitstream.
      
      The bitstream coding is rather simplistic right now. At the SB level,
      we code a bit to indicate whether this block uses SB-coding (32x32
      prediction) or MB-coding (anything else), and then we follow with the
      actual modes. This could and should be modified in the future, but is
      omitted from this commit because it will likely involve reorganizing
      much more code rather than just adding SB coding, so it's better to let
      that be judged on its own merits.
      
      Gains on derf: about even, YT/HD: +0.75%, STD/HD: +1.5%.
      
      Change-Id: Iae313a7cbd8f75b3c66d04a68b991cb096eaaba6
      5d4cffb3
  23. 17 Aug, 2012 1 commit
  24. 14 Aug, 2012 1 commit
  25. 17 Jul, 2012 1 commit
  26. 15 Mar, 2012 1 commit
    • Yaowu Xu's avatar
      WebM Experimental Codec Branch Snapshot · 6035da54
      Yaowu Xu authored
      This is a code snapshot of experimental work currently ongoing for a
      next-generation codec.
      
      The codebase has been cut down considerably from the libvpx baseline.
      For example, we are currently only supporting VBR 2-pass rate control
      and have removed most of the code relating to coding speed, threading,
      error resilience, partitions and various other features.  This is in
      part to make the codebase easier to work on and experiment with, but
      also because we want to have an open discussion about how the bitstream
      will be structured and partitioned and not have that conversation
      constrained by past work.
      
      Our basic working pattern has been to initially encapsulate experiments
      using configure options linked to #IF CONFIG_XXX statements in the
      code. Once experiments have matured and we are reasonably happy that
      they give benefit and can be merged without breaking other experiments,
      we remove the conditional compile statements and merge them in.
      
      Current changes include:
      * Temporal coding experiment for segments (though still only 4 max, it
        will likely be increased).
      * Segment feature experiment - to allow various bits of information to
        be coded at the segment level. Features tested so far include mode
        and reference frame information, limiting end of block offset and
        transform size, alongside Q and loop filter parameters, but this set
        is very fluid.
      * Support for 8x8 transform - 8x8 dct with 2nd order 2x2 haar is used
        in MBs using 16x16 prediction modes within inter frames.
      * Compound prediction (combination of signals from existing predictors
        to create a new predictor).
      * 8 tap interpolation filters and 1/8th pel motion vectors.
      * Loop filter modifications.
      * Various entropy modifications and changes to how entropy contexts and
        updates are handled.
      * Extended quantizer range matched to transform precision improvements.
      
      There are also ongoing further experiments that we hope to merge in the
      near future: For example, coding of motion and other aspects of the
      prediction signal to better support larger image formats, use of larger
      block sizes (e.g. 32x32 and up) and lossless non-transform based coding
      options (especially for key frames). It is our hope that we will be
      able to make regular updates and we will warmly welcome community
      contributions.
      
      Please be warned that, at this stage, the codebase is currently slower
      than VP8 stable branch as most new code has not been optimized, and
      even the 'C' has been deliberately written to be simple and obvious,
      not fast.
      
      The following graphs have the initial test results, numbers in the
      tables measure the compression improvement in terms of percentage. The
      build has  the following optional experiments configured:
      --enable-experimental --enable-enhanced_interp --enable-uvintra
      --enable-high_precision_mv --enable-sixteenth_subpel_uv
      
      CIF Size clips:
      http://getwebm.org/tmp/cif/
      HD size clips:
      http://getwebm.org/tmp/hd/
      (stable_20120309 represents encoding results of WebM master branch
      build as of commit#7a159071)
      
      They were encoded using the following encode parameters:
      --good --cpu-used=0 -t 0 --lag-in-frames=25 --min-q=0 --max-q=63
      --end-usage=0 --auto-alt-ref=1 -p 2 --pass=2 --kf-max-dist=9999
      --kf-min-dist=0 --drop-frame=0 --static-thresh=0 --bias-pct=50
      --minsection-pct=0 --maxsection-pct=800 --sharpness=0
      --arnr-maxframes=7 --arnr-strength=3(for HD,6 for CIF)
      --arnr-type=3
      
      Change-Id: I5c62ed09cfff5815a2bb34e7820d6a810c23183c
      6035da54
  27. 06 Mar, 2012 1 commit
  28. 16 Feb, 2012 1 commit
    • Paul Wilkins's avatar
      Code simplification · 79d330d7
      Paul Wilkins authored
      Removal of the pickinter.c and .h files and calls to this
      code.
      
      Removal of some code relating to real time and one pass
      settings  though there is more to be done in this regard.
      
      However,  vp8_set_speed_features() now
      only supports modes 0 and 1 and speeds up to 3
      so rd should always be set.
      
      Change-Id: I62c0c1b6154ab499785baef310536080e87bc4d8
      79d330d7
  29. 30 Jan, 2012 2 commits
  30. 24 Oct, 2011 1 commit
    • Paul Wilkins's avatar
      Further segment feature extensions. · 01ce04bc
      Paul Wilkins authored
      This quite large check in includes the following:
      
      Merge in some code from Ronald (mbgraph.c) that scans a Gf/arf group.
      This is used as a basis for a simple segmentation for the normal frames
      in a gf/arf group. This code also uses satd functions from Yaowu.
      
      Adds functionality for coding the latest possible position of an EOB for
      blocks in the segment. (Currently 0-15 only, hence just for 4x4 dct).
      Where the EOB position is 0 this acts like "skip" and the normal coding
      of skip at the per mb level is disabled.
      
      Added functions (seg_common.c) for setting and reading segment feature
      elements. These may want to be optimized away at some point but while the
      mecahnism is in a state of flux they provide a single location for making
      changes and keep things a bit cleaner.
      
      This is still proof of concept code. Currently the tested feature set:-
      
      Quantizer,
      Loop Filter level,
      Reference frame,
      Prediction Mode,
      EOB end stop.
      
      TBD:-
      
      Add functions for setting and reading the feature data with range
      and validity checking.
      
      Handling of signed and unsigned feature data. At the moment all is assumed
      to be signed and a sign bit is coded but many cannot be negative.
      
      Correct handling of EOB feature with intra coded blocks.
      
      Testing/trapping of legal/illegal ref frame and mode combinations.
      
      Transform size switch plus merge and test with 8c8 DCT work
      
      Merge and test with Sumans Segmenation coding optimizations
      
      Change-Id: Iee12e83661c7abbd1e0ce6810915eb4ec35e2d8e
      01ce04bc
  31. 22 Aug, 2011 2 commits
  32. 19 Aug, 2011 1 commit
    • Fritz Koenig's avatar
      Reclasify optimized ssim calculations as SSE2. · 01376858
      Fritz Koenig authored
      Calculations were incorrectly classified as either
      SSE3 or SSSE3.  Only using SSE2 instructions.
      Cleanup function names and make non-RTCD code work
      as well.
      
      Change-Id: I29f5c2ead342b2086a468029c15e2c1d948b5d97
      01376858
  33. 22 Jul, 2011 1 commit
    • Yunqing Wang's avatar
      Preload reference area to an intermediate buffer in sub-pixel motion search · 20bd1446
      Yunqing Wang authored
      In sub-pixel motion search, the search range is small(+/- 3 pixels).
      Preload whole search area from reference buffer into a 32-byte
      aligned buffer. Then in search, load reference data from this buffer
      instead. This keeps data in cache, and reduces the crossing cache-
      line penalty. For tulip clip, tests on Intel Core2 Quad machine(linux)
      showed encoder speed improvement:
        3.4%   at --rt --cpu-used =-4
        2.8%   at --rt --cpu-used =-3
        2.3%   at --rt --cpu-used =-2
        2.2%   at --rt --cpu-used =-1
      
      Test on Atom notebook showed only 1.1% speed improvement(speed=-4).
      Test on Xeon machine also showed less improvement, since unaligned
      data access latency is greatly reduced in newer cores.
      
      Next, I will apply similar idea to other 2 sub-pixel search functions
      for encoding speed > 4.
      
      Make this change exclusively for x86 platforms.
      
      Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f
      20bd1446
  34. 09 Jun, 2011 1 commit
  35. 06 Jun, 2011 1 commit
    • Yaowu Xu's avatar
      remove redundant functions · d4700731
      Yaowu Xu authored
      The encoder defined about 4 set of similar functions to calculate sum,
      variance or sse or a combination of them. This commit removed one set
      of these functions, get8x8var and get16x16var, where calls to the later
      function are replaced with var16x16 by using the fact on a 16x16 MB:
          variance == sse - sum*sum/256
      
      Change-Id: I803eabd1fb3ab177780a40338cbd596dffaed267
      d4700731
  36. 25 May, 2011 1 commit
  37. 29 Apr, 2011 1 commit