1. 23 Jan, 2013 1 commit
    • Intrinsic version of loopfilter now matches C code · 6a997400
      Scott LaVarnway authored
      Updated the intrinsic code to match Yaowu's latest loopfilter change.
      (I584393906c4f5f948a581d6590959522572743bb)
      
      The decoder performance improved by ~30% for the test clip used.
      
      Change-Id: I026cfc75d5bcb7d8d58be6f0440ac9e126ef39d2
  2. 18 Jan, 2013 1 commit
    • a minor change to a portion of loop filtering · b95ed688
      Yaowu Xu authored
      The loop filtering used for an MB edge or an internal edge of an MB using the
      8x8 transform was reading 5 pixels on each side and writing 3 pixels on each
      side. Following a suggestion from Aki and Scott on hardware and software
      performance, this commit changes it to read 4 pixels on each side and write
      3 pixels on each side.
      
      Change-Id: I584393906c4f5f948a581d6590959522572743bb
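The access pattern described in the commit above (read 4 pixels on each side of an edge, write 3) can be sketched as follows. This is only an illustration: the function name `edge_filter_sketch` and the simple 1-2-1 averaging are assumptions made here, not the codec's actual filter taps or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of the read-4/write-3 pattern across one edge.
 * s points at the first pixel past the edge (q0); pitch is the distance
 * between neighbouring pixels perpendicular to the edge.
 * The 1-2-1 averaging is a placeholder, not the real loop filter. */
static void edge_filter_sketch(uint8_t *s, ptrdiff_t pitch) {
  assert(s != NULL);
  /* Read 4 pixels on each side before writing anything. */
  const int p3 = s[-4 * pitch], p2 = s[-3 * pitch];
  const int p1 = s[-2 * pitch], p0 = s[-1 * pitch];
  const int q0 = s[0], q1 = s[1 * pitch];
  const int q2 = s[2 * pitch], q3 = s[3 * pitch];

  /* Write 3 pixels on each side; p3 and q3 are read but never modified. */
  s[-3 * pitch] = (uint8_t)((p3 + 2 * p2 + p1 + 2) >> 2);
  s[-2 * pitch] = (uint8_t)((p2 + 2 * p1 + p0 + 2) >> 2);
  s[-1 * pitch] = (uint8_t)((p1 + 2 * p0 + q0 + 2) >> 2);
  s[0]          = (uint8_t)((p0 + 2 * q0 + q1 + 2) >> 2);
  s[1 * pitch]  = (uint8_t)((q0 + 2 * q1 + q2 + 2) >> 2);
  s[2 * pitch]  = (uint8_t)((q1 + 2 * q2 + q3 + 2) >> 2);
}
```

Reading one pixel fewer on each side narrows the window the filter has to load, which is presumably where the hardware and software benefit mentioned in the message comes from.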
  3. 14 Jan, 2013 2 commits
    • Merge experiment "widerlpf" · f7dab600
      Yaowu Xu authored
      Change-Id: I0c94475075e66e13cfe4c20fab7db6474441ae86
    • changed UV plane loop filtering for TX_8X8 · ad9a16ed
      Yaowu Xu authored
      In commit 9a1d73d0, loop filtering was added for UV 4x4 boundaries
      when TX_8X8 is used by a MB. This commit further refined the decision
      to be based on the actual transform used for the UV planes. When
      UV planes use 4x4 transform, i.e. when prediction mode used is either
      I8X8_PRED or SPLITMV, UV planes are filtered on 4x4 boundaries, and no
      filtering is applied on 4x4 block boundaries when UV planes use 8X8
      transform.
      
      Change-Id: Ibb404face0a1d129b4b4abaf67c55d82e8df8bec
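The decision rule stated in the commit above can be written as a small predicate. The sketch below is an assumption-laden illustration: `I8X8_PRED` and `SPLITMV` come from the commit message, while the enum type, the other enumerators, and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of prediction modes. Only I8X8_PRED and SPLITMV
 * are named in the commit message; the rest are placeholders. */
typedef enum {
  OTHER_INTRA_MODE_SKETCH,
  I8X8_PRED,
  OTHER_INTER_MODE_SKETCH,
  SPLITMV
} pred_mode_sketch;

/* For an MB that uses TX_8X8: the UV planes use the 4x4 transform only
 * when the mode is I8X8_PRED or SPLITMV, and only then are the UV 4x4
 * block boundaries filtered. */
static bool filter_uv_4x4_edges_sketch(pred_mode_sketch mode) {
  assert(mode >= OTHER_INTRA_MODE_SKETCH && mode <= SPLITMV);
  return mode == I8X8_PRED || mode == SPLITMV;
}
```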
  4. 11 Jan, 2013 1 commit
    • Reduce the usage of widerlpf · 6c9fb22e
      Yaowu Xu authored
      This commit stops using the wider lpf within a superblock when the
      32x32 transform is used for the block.

      The commit also switches to the shorter version of loop filtering
      for UV planes.
      
      Change-Id: I344c1fb9a3be9d1200782a788bcb0b001fedcff8
  5. 10 Jan, 2013 2 commits
  6. 08 Jan, 2013 2 commits
    • Merge superblocks (32x32) experiment. · 4455036c
      Ronald S. Bultje authored
      Change-Id: I0df99742029834a85c4933652b0587cf5b6b2587
    • minor loop filter refactoring and cleanup · d278d018
      Yaowu Xu authored
      This commit does some minor cleanup and refactoring to prepare for
      further loop filter experiments. It merges the y_only version of the loop
      filter function into the regular one, which ensures that the same logic is
      used both for picking the level and for the actual loop filtering.
      
      Change-Id: Id10c94dccd45f58e5310bacfdf6ee63cbb60b86f
  7. 06 Jan, 2013 1 commit
  8. 18 Dec, 2012 1 commit
  9. 07 Dec, 2012 1 commit
    • 32x32 transform for superblocks. · c456b35f
      Ronald S. Bultje authored
      This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds
      code all over the place to wrap that in the bitstream/encoder/decoder/RD.
      
      Some implementation notes (these probably need careful review):
      - token range is extended by 1 bit, since the value range out of this
        transform is [-16384,16383].
      - the coefficients coming out of the FDCT are manually scaled back by
        1 bit, or else they won't fit in int16_t (they are 17 bits). Because
        of this, the RD error scoring does not right-shift the MSE score by
        two (unlike for 4x4/8x8/16x16).
      - to compensate for this loss in precision, the quantizer is halved
        also. This is currently a little hacky.
      - FDCT and IDCT are double-only right now. Needs a fixed-point impl.
      - There are no default probabilities for the 32x32 transform yet; I'm
        simply using the 16x16 luma ones. A future commit will add newly
        generated probabilities for all transforms.
      - No ADST version. I don't think we'll add one for this level; if an
        ADST is desired, transform-size selection can scale back to 16x16
        or lower, and use an ADST at that level.
      
      Additional notes specific to Debargha's DWT/DCT hybrid:
      - coefficient scale is different for the top/left 16x16 (DCT-over-DWT)
        block than for the rest (DWT pixel differences) of the block. Therefore,
        RD error scoring isn't easily scalable between coefficient and pixel
        domain. Thus, unfortunately, we need to compute the RD distortion in
        the pixel domain until we figure out how to scale these appropriately.
      
      Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b
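The 1-bit scale-back mentioned in the implementation notes above can be checked with a few lines of arithmetic. This is a sketch only: the function name is hypothetical, and the use of truncating division is an assumption, since the commit message does not say how the scaling rounds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scale a coefficient that needs up to 17 signed
 * bits back by 1 bit so that it fits in int16_t. Truncation toward zero
 * is an assumption; the commit does not specify the rounding. */
static int16_t scale_back_1bit_sketch(int32_t coeff) {
  /* 17-bit signed range: [-65536, 65535]. */
  assert(coeff >= -65536 && coeff <= 65535);
  return (int16_t)(coeff / 2); /* result lies in [-32768, 32767] */
}
```

Halving a value divides its square by four, which is consistent with the note that the RD error scoring skips the right shift by two for this transform size.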
  10. 28 Nov, 2012 1 commit
  11. 27 Nov, 2012 1 commit
    • Add vp9_ prefix to all vp9 files · fcccbcbb
      John Koleszar authored
      Support for gyp which doesn't support multiple objects in the same
      static library having the same basename.
      
      Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
  12. 01 Nov, 2012 3 commits
  13. 31 Oct, 2012 2 commits
  14. 30 Oct, 2012 2 commits
  15. 16 Oct, 2012 1 commit
  16. 11 Oct, 2012 1 commit
  17. 30 Aug, 2012 1 commit
    • hybrid transform of 16x16 dimension · de6dfa6b
      Jingning Han authored
      Enable ADST/DCT of dimension 16x16 for I16X16 modes. This change provides
      benefits mostly for hd sequences.
      
      Set up the framework for selectable transform dimension.
      
      Also allow a quantization parameter threshold to control the use of the
      hybrid transform. (This is currently disabled by setting the threshold
      always above the quantization parameter. Adaptive thresholding can
      be built upon this, which will further improve the coding performance.)
      
      The coding performance gains (with respect to the codec that has all
      other configuration settings turned on) are
      
      derf:   0.013
      yt:     0.086
      hd:     0.198
      std-hd: 0.501
      
      Change-Id: Ibb4263a61fc74e0b3c345f54d73e8c73552bf926
  18. 20 Aug, 2012 1 commit
    • Superblock coding. · 5d4cffb3
      Ronald S. Bultje authored
      This commit adds a pick_sb_mode() function which selects the best 32x32
      superblock coding mode. Then it selects the best per-MB modes, compares
      the two and encodes that in the bitstream.
      
      The bitstream coding is rather simplistic right now. At the SB level,
      we code a bit to indicate whether this block uses SB-coding (32x32
      prediction) or MB-coding (anything else), and then we follow with the
      actual modes. This could and should be modified in the future, but is
      omitted from this commit because it will likely involve reorganizing
      much more code rather than just adding SB coding, so it's better to let
      that be judged on its own merits.
      
      Gains on derf: about even, YT/HD: +0.75%, STD/HD: +1.5%.
      
      Change-Id: Iae313a7cbd8f75b3c66d04a68b991cb096eaaba6
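The SB-level signalling described in the commit above (one flag bit, followed by the actual modes) might look roughly like the sketch below. Everything here is a simplification made for illustration: the raw bit writer, the fixed 4-bit mode codes, and all names are hypothetical, and a real codec would route these symbols through its entropy coder instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first raw bit writer over a zero-initialised buffer. */
typedef struct {
  uint8_t *buf;
  size_t cap_bits;
  size_t pos;
} bit_writer_sketch;

static void put_bit_sketch(bit_writer_sketch *bw, int bit) {
  assert(bw->pos < bw->cap_bits);
  if (bit) bw->buf[bw->pos >> 3] |= (uint8_t)(0x80u >> (bw->pos & 7));
  bw->pos++;
}

/* Write the SB flag, then either one 32x32 mode (SB-coding) or the four
 * per-MB modes (MB-coding). Each mode is a placeholder 4-bit code. */
static void write_sb_sketch(bit_writer_sketch *bw, int use_sb,
                            const uint8_t *modes) {
  const int count = use_sb ? 1 : 4; /* a 32x32 SB covers four 16x16 MBs */
  put_bit_sketch(bw, use_sb);
  for (int i = 0; i < count; i++)
    for (int b = 3; b >= 0; b--) put_bit_sketch(bw, (modes[i] >> b) & 1);
}
```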
  19. 03 Aug, 2012 1 commit
    • 16x16 DCT blocks. · fed8a183
      Daniel Kang authored
      Set on all 16x16 intra/inter modes
      
      Features:
      - Butterfly fDCT/iDCT
      - Loop filter does not filter internal edges with 16x16
      - Optimize coefficient function
      - Update coefficient probability function
      - RD
      - Entropy stats
      - 16x16 is a config option
      
      Have not tested with experiments.
      
      hd:     2.60%
      std-hd: 2.43%
      yt:     1.32%
      derf:   0.60%
      
      Change-Id: I96fb090517c30c5da84bad4fae602c3ec0c58b1c
  20. 02 Aug, 2012 1 commit
    • Added row based loopfilter · 1746b2ad
      Scott LaVarnway authored
      Interleaved loopfiltering with decode.  For 1080p clips, up to 1%
      performance gain.  For 4k clips, up to 10% seen.  This patch is required
      for better "frame-based" multithreading.
      
      Change-Id: Ic834cf32297cc04f27e8205652fb9f70cbe290db
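Interleaving the loop filter with decoding, as the commit above describes, could be structured along these lines. The sketch assumes a one-row lag between decoding and filtering, which the commit message does not state; all names and the callback interface are hypothetical.

```c
#include <assert.h>

/* Hypothetical per-row callback type. */
typedef void (*row_fn_sketch)(int mb_row, void *ctx);

/* Decode MB row r, then filter row r - 1, instead of filtering the whole
 * frame after decoding. The one-row lag is an assumption. */
static void decode_rows_sketch(int mb_rows, row_fn_sketch decode_row,
                               row_fn_sketch filter_row, void *ctx) {
  assert(mb_rows >= 0);
  for (int r = 0; r < mb_rows; r++) {
    decode_row(r, ctx);
    if (r > 0) filter_row(r - 1, ctx);
  }
  if (mb_rows > 0) filter_row(mb_rows - 1, ctx); /* flush the last row */
}

/* Small recorder used to show the resulting call order. */
typedef struct {
  int order[16];
  int n;
} trace_sketch;

static void trace_decode_sketch(int r, void *ctx) {
  trace_sketch *t = (trace_sketch *)ctx;
  assert(t->n < 16);
  t->order[t->n++] = r; /* decode of row r is logged as r */
}

static void trace_filter_sketch(int r, void *ctx) {
  trace_sketch *t = (trace_sketch *)ctx;
  assert(t->n < 16);
  t->order[t->n++] = 100 + r; /* filter of row r is logged as 100 + r */
}
```

With three rows, the recorder would see decode 0, decode 1, filter 0, decode 2, filter 1, filter 2, so the rows being filtered were decoded only a moment earlier.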
  21. 27 Jul, 2012 1 commit
    • Merges several experiments · 9984a155
      Deb Mukherjee authored
      The following five experiments are merged:
      
      newentropy
      newupdate
      adaptive_entropy (also includes a couple of parameter changes
                        that improve results a little
                        in common/entropymode.c and encoder/modecosts.c
                        that were not merged from the internal branch)
      newintramodes
      expanded_coef_context
      
      Change-Id: I8a142a831786ee9dc936f22be1d42a8bced7d270
  22. 17 Jul, 2012 1 commit
  23. 23 May, 2012 2 commits
    • Fix another multithreaded encoder loopfilter race condition · 48908530
      Attila Nagy authored
      After a key frame encoding, the frame type could change while
      filtering is still going on. Pass the frame type as parameter to the
      loopfilter function and don't read it from common storage.
      
      vp8cx_set_alt_lf_level has to be done before packing the stream.
      Currently alt_lf_level is not used so there hasn't been any visible
      problem here.
      
      Change-Id: Ia114162158cd833c2b16e3b89303cc9c91f19165
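The fix described in the commit above amounts to snapshotting the frame type and passing it by value, so the filter never re-reads a field that another thread may overwrite. The sketch below illustrates only that idea; the types and the function are hypothetical and do not reproduce the encoder's real structures or signatures.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { KEY_FRAME_SKETCH = 0, INTER_FRAME_SKETCH = 1 } frame_type_sketch;

/* Hypothetical shared state; the encoding thread may change frame_type
 * for the next frame while filtering of the previous one is running. */
typedef struct {
  frame_type_sketch frame_type;
} common_sketch;

/* Returns 1 when key-frame filter settings apply. The decision uses only
 * the frame_type argument and never reads cm->frame_type. */
static int loopfilter_frame_sketch(const common_sketch *cm,
                                   frame_type_sketch frame_type) {
  assert(cm != NULL);
  return frame_type == KEY_FRAME_SKETCH;
}
```

A caller would copy `cm->frame_type` into a local variable before handing the frame to the filter, so a later write to the shared field cannot change the filter's behaviour.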
    • Fix another multithreaded encoder loopfilter race condition · ea392d47
      Attila Nagy authored
      After a key frame encoding, the frame type could change while
      filtering is still going on. Pass the frame type as parameter to the
      loopfilter function and don't read it from common storage.
      
      vp8cx_set_alt_lf_level has to be done before packing the stream.
      Currently alt_lf_level is not used so there hasn't been any visible
      problem here.
      
      Change-Id: Ia114162158cd833c2b16e3b89303cc9c91f19165
  24. 15 May, 2012 1 commit
  25. 12 Apr, 2012 1 commit
    • loopfilter improvements · e0a80519
      Scott LaVarnway authored
      Local variable offsets are now consistent for the functions,
      removed unused parameters, reworked the assembly to eliminate
      stalls/instructions.
      
      Change-Id: Iaa37668f8a9bb8754df435f6a51c3a08d547f879
  26. 15 Mar, 2012 1 commit
    • WebM Experimental Codec Branch Snapshot · 6035da54
      Yaowu Xu authored
      This is a code snapshot of experimental work currently ongoing for a
      next-generation codec.
      
      The codebase has been cut down considerably from the libvpx baseline.
      For example, we are currently only supporting VBR 2-pass rate control
      and have removed most of the code relating to coding speed, threading,
      error resilience, partitions and various other features.  This is in
      part to make the codebase easier to work on and experiment with, but
      also because we want to have an open discussion about how the bitstream
      will be structured and partitioned and not have that conversation
      constrained by past work.
      
      Our basic working pattern has been to initially encapsulate experiments
      using configure options linked to #IF CONFIG_XXX statements in the
      code. Once experiments have matured and we are reasonably happy that
      they give benefit and can be merged without breaking other experiments,
      we remove the conditional compile statements and merge them in.
      
      Current changes include:
      * Temporal coding experiment for segments (though still only 4 max, it
        will likely be increased).
      * Segment feature experiment - to allow various bits of information to
        be coded at the segment level. Features tested so far include mode
        and reference frame information, limiting end of block offset and
        transform size, alongside Q and loop filter parameters, but this set
        is very fluid.
      * Support for 8x8 transform - 8x8 dct with 2nd order 2x2 haar is used
        in MBs using 16x16 prediction modes within inter frames.
      * Compound prediction (combination of signals from existing predictors
        to create a new predictor).
      * 8 tap interpolation filters and 1/8th pel motion vectors.
      * Loop filter modifications.
      * Various entropy modifications and changes to how entropy contexts and
        updates are handled.
      * Extended quantizer range matched to transform precision improvements.
      
      There are also ongoing further experiments that we hope to merge in the
      near future: For example, coding of motion and other aspects of the
      prediction signal to better support larger image formats, use of larger
      block sizes (e.g. 32x32 and up) and lossless non-transform based coding
      options (especially for key frames). It is our hope that we will be
      able to make regular updates and we will warmly welcome community
      contributions.
      
      Please be warned that, at this stage, the codebase is currently slower
      than VP8 stable branch as most new code has not been optimized, and
      even the 'C' has been deliberately written to be simple and obvious,
      not fast.
      
      The following graphs have the initial test results; numbers in the
      tables measure the compression improvement as a percentage. The
      build has the following optional experiments configured:
      --enable-experimental --enable-enhanced_interp --enable-uvintra
      --enable-high_precision_mv --enable-sixteenth_subpel_uv
      
      CIF Size clips:
      http://getwebm.org/tmp/cif/
      HD size clips:
      http://getwebm.org/tmp/hd/
      (stable_20120309 represents encoding results of WebM master branch
      build as of commit#7a159071)
      
      They were encoded using the following encode parameters:
      --good --cpu-used=0 -t 0 --lag-in-frames=25 --min-q=0 --max-q=63
      --end-usage=0 --auto-alt-ref=1 -p 2 --pass=2 --kf-max-dist=9999
      --kf-min-dist=0 --drop-frame=0 --static-thresh=0 --bias-pct=50
      --minsection-pct=0 --maxsection-pct=800 --sharpness=0
      --arnr-maxframes=7 --arnr-strength=3 (for HD, 6 for CIF)
      --arnr-type=3
      
      Change-Id: I5c62ed09cfff5815a2bb34e7820d6a810c23183c
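The working pattern described in the snapshot commit above, where each experiment sits behind a configure option tied to a conditional compile, can be sketched as follows (the C preprocessor spells the directive `#if`). The flag name `CONFIG_NEWLPF_SKETCH` and the function are hypothetical; in a real build the configure step would define the flag as 0 or 1.

```c
#include <assert.h>

/* Hypothetical experiment flag; normally generated by the configure step. */
#ifndef CONFIG_NEWLPF_SKETCH
#define CONFIG_NEWLPF_SKETCH 0
#endif

static int filter_level_sketch(int base_level) {
  assert(base_level >= 0);
#if CONFIG_NEWLPF_SKETCH
  /* Experimental path (placeholder behaviour). */
  return base_level + 1;
#else
  /* Baseline path, used while the experiment is switched off. */
  return base_level;
#endif
}
```

Merging a mature experiment then means deleting the `#if`/`#else` scaffolding and keeping only the experimental path.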
  27. 01 Mar, 2012 1 commit
  28. 28 Feb, 2012 1 commit
    • Merge new loop filter. · 19b9d28f
      Paul Wilkins authored
      Merge of the NEWLPF configuration experiment so it is always on.
      
      Change-Id: I7054772b6eab28bad1ff807bfa54d98f83de9308
  29. 27 Feb, 2012 1 commit
    • Corrected spelling · b00ed02a
      Paul Wilkins authored
      Apparently the correct spelling of segement is segment !
      
      Change-Id: I88593ee0523f251b3a96794c6166ef8c7898a029
  30. 15 Feb, 2012 1 commit
    • moved segment based LPF level selection under CONFIG_FEATUREUPDATES · d327dcf3
      Yaowu Xu authored
      This commit moves segment-based loop filter level selection into
      the CONFIG_FEATUREUPDATES experiment. As the previous commit noted,
      segment-based loop filter selection helps compression by ~0.1% on
      the cif set; the ongoing CONFIG_FEATUREUPDATES experiment makes
      encoding updates of the segment-based LPF level more efficient,
      hence another 0.04% gain on the cif set. The commit also fixes an
      earlier issue where the encoder and decoder could use different loop
      filter levels for one of the segments.
      
      Change-Id: Ia978b14aae95bb107d561ba53a7a2bb6ff01faf3
  31. 09 Feb, 2012 1 commit
  32. 30 Jan, 2012 1 commit