1. 24 Jan, 2014 1 commit
  2. 20 Dec, 2013 1 commit
  3. 02 Dec, 2013 1 commit
    • Using local variable for token_cache. · 5ab920d2
      Dmitry Kovalev authored
      The difference from the old code is that originally the whole token_cache
      was zero-initialized at the beginning of the decode_coefs() function.
      Now we set the few zero entries explicitly with "token_cache[scan[c]] = 0".
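
      A minimal sketch of the idea, with stub names (read_token,
      energy_class) standing in for the real libvpx reader and tables:

        #include <stdint.h>

        enum { ZERO_TOKEN = 0 };

        /* Stubs for the real bitstream reader and energy-class table. */
        static int read_token(void) { return ZERO_TOKEN; }
        static uint8_t energy_class(int token) { return (uint8_t)token; }

        static void decode_coefs_sketch(const int16_t *scan, int n) {
          uint8_t token_cache[32 * 32];  /* local; never memset as a whole */
          for (int c = 0; c < n; ++c) {
            const int token = read_token();
            /* Only visited entries are written; a zero token stores 0
               explicitly instead of relying on a full upfront memset. */
            token_cache[scan[c]] = (token == ZERO_TOKEN) ? 0
                                                         : energy_class(token);
          }
        }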
      
      Change-Id: I88cc5031f01d13012d1a4491739c36cb44f9401e
  4. 14 Nov, 2013 1 commit
    • Simplifies band-getting with a static array · cfcd5c4f
      Deb Mukherjee authored
      Simplifies the code by implementing band mapping with static arrays.
      A lot of the code complexity introduced in a previous patch
      disappears.
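
      A sketch of the pattern, with illustrative band values (the real
      tables live in the vp9 entropy code):

        #include <stdint.h>

        /* Map a coefficient's position in scan order to its band. */
        static const uint8_t band_by_position_4x4[16] = {
          0, 1, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5
        };

        static int get_coef_band(int scan_index) {
          return band_by_position_4x4[scan_index];
        }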
      
      Change-Id: Ia3fac36e594fb5ad2d55ae141c58bba4c55c2d28
  5. 12 Nov, 2013 2 commits
  6. 05 Nov, 2013 1 commit
    • token_cache changes in decoder · 3a833ea3
      Deb Mukherjee authored
      Removes stack allocation of token_cache in the decode_coefs function.
      
      Seems to achieve about 1% decode speed improvement as tested on
      25 480p videos.
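
      A minimal sketch of the shape of the change, assuming the array moves
      into a long-lived decoder struct (all names here are illustrative):

        #include <stdint.h>

        typedef struct {
          /* token_cache lives in the decoder context rather than being
             a fresh stack array on every decode_coefs() call. */
          uint8_t token_cache[32 * 32];
        } DecoderCtxSketch;

        static void decode_coefs_sketch(DecoderCtxSketch *ctx) {
          uint8_t *token_cache = ctx->token_cache;  /* reused buffer */
          (void)token_cache;
        }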
      
      Change-Id: I8e7eb3361fa09d9654dfad0677a6d606701fdc6e
  7. 31 Oct, 2013 1 commit
    • Reducing the number of foreach_transformed_block() calls. · 47b6030d
      Dmitry Kovalev authored
      The change doesn't affect the bitstream. It changes the order of function
      calls and affects how we reconstruct intra and inter blocks. Speedup is
      about 1-1.5%. A C-style sketch of the fused loop follows the lists below.
      
      For intra-blocks:
        Before:
          for each transform block read tokens
          for each transform block do prediction
          for each transform block do inverse transform
        Now:
          for each transform block
            read tokens
            do prediction
            do inverse transform
      
      For inter-blocks:
        Before:
          for each transform block read tokens
          for each transform block do inverse transform
        Now:
          for each transform block
            read tokens
            do inverse transform
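
      A minimal sketch of the fused intra loop, with stub functions standing
      in for the real token-reading, prediction, and transform code:

        static void read_tokens(int b) { (void)b; }
        static void do_prediction(int b) { (void)b; }
        static void do_inverse_transform(int b) { (void)b; }

        /* One pass over the transform blocks instead of three. */
        static void reconstruct_intra_sketch(int num_blocks) {
          for (int b = 0; b < num_blocks; ++b) {
            read_tokens(b);           /* entropy-decode coefficients */
            do_prediction(b);         /* intra-predict block b */
            do_inverse_transform(b);  /* add residual to the prediction */
          }
        }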
      
      Change-Id: I12a79bf1aa5a18c351b8010369bd3ff1deae1570
  8. 21 Oct, 2013 1 commit
  9. 26 Aug, 2013 1 commit
  10. 25 Jul, 2013 1 commit
    • General cleanups. · 7131cb0e
      Dmitry Kovalev authored
      Removing unused constants, macros, and function declarations. Using the
      ROUND_POWER_OF_TWO macro, vp9_zero, and vp9_copy where possible. Moving
      #includes from *.h to *.c files. Merging for loops for motion vectors.
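
      For reference, the macros mentioned are defined along these lines
      (paraphrased from the libvpx sources; treat this as a sketch):

        #include <string.h>

        /* Rounded right shift: divide by 2^n with rounding. */
        #define ROUND_POWER_OF_TWO(value, n) \
          (((value) + (1 << ((n) - 1))) >> (n))

        /* Zero or copy a whole object / fixed-size array. */
        #define vp9_zero(dest) memset(&(dest), 0, sizeof(dest))
        #define vp9_copy(dest, src)         \
          do {                              \
            memcpy(dest, src, sizeof(src)); \
          } while (0)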
      
      Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
  11. 18 Jun, 2013 1 commit
  12. 07 May, 2013 1 commit
  13. 25 Apr, 2013 1 commit
    • Fix incorrect dequant used in detokenize · e40a7690
      John Koleszar authored
      The quantizer can vary per-plane, and the dequantization vector is
      available in the per-plane part of MACROBLOCKD. The previous code would
      incorrectly use the Y quantizer for the whole macroblock.
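
      A sketch of the fix, assuming a simplified MACROBLOCKD layout (the
      struct definitions here are illustrative):

        #include <stdint.h>

        typedef struct { const int16_t *dequant; } PlaneCtxSketch;
        typedef struct { PlaneCtxSketch plane[3]; } MacroblockdSketch;

        /* Before: plane[0].dequant (the Y quantizer) was used for the
           whole macroblock. After: each plane uses its own vector. */
        static const int16_t *get_dequant(const MacroblockdSketch *xd,
                                          int plane) {
          return xd->plane[plane].dequant;
        }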
      
      Change-Id: I3ab418aef9168ea0ddcfa4b7c0be32ae48b536d7
  14. 22 Apr, 2013 1 commit
  15. 18 Apr, 2013 1 commit
  16. 10 Apr, 2013 1 commit
  17. 04 Apr, 2013 1 commit
  18. 03 Apr, 2013 1 commit
    • Remove special case vp9_decode_coefs_4x4 · 1e5f25ec
      John Koleszar authored
      This code was only called in the BPRED case, but had no real special
      case associated with it. Made BPRED behave like all other modes. No
      bitstream change.
      
      Change-Id: I87ba11fe723928b6314d094979011228d5ba006f
  19. 05 Mar, 2013 1 commit
    • Make superblocks independent of macroblock code and data. · 111ca421
      Ronald S. Bultje authored
      Split macroblock and superblock tokenization and detokenization
      functions and coefficient-related data structs so that the bitstream
      layout and related code of superblock coefficients look less like a
      hack to fit macroblocks into superblocks.
      
      In addition, derive the chroma transform size from the luma transform
      size (i.e. always use the same size, as long as it fits the predictor);
      in practice, this means 32x32 and 64x64 superblocks using the 16x16 luma
      transform will now use the 16x16 (instead of the 8x8) chroma transform,
      and 64x64 superblocks using the 32x32 luma transform will now use the
      32x32 (instead of the 16x16) chroma transform.
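
      A sketch of the unified selection rule (a paraphrase, not the exact
      code):

        typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32 } TxSizeSketch;

        /* Chroma uses the luma transform size, clamped to the largest
           transform that still fits the subsampled chroma predictor. */
        static TxSizeSketch get_uv_tx_size(TxSizeSketch luma_tx,
                                           TxSizeSketch max_uv_tx) {
          return luma_tx < max_uv_tx ? luma_tx : max_uv_tx;
        }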
      
      Lastly, add a trellis optimize function for 32x32 transform blocks.
      
      HD gains about 0.3%, STDHD about 0.15%, and derf about 0.1%. There are
      a few negative points here and there that I might want to analyze
      a little more closely.
      
      Change-Id: Ibad7c3ddfe1acfc52771dfc27c03e9783e054430
  20. 10 Jan, 2013 1 commit
  21. 08 Jan, 2013 1 commit
  22. 18 Dec, 2012 1 commit
  23. 07 Dec, 2012 1 commit
    • 32x32 transform for superblocks. · c456b35f
      Ronald S. Bultje authored
      This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds
      code all over the place to wrap that in the bitstream/encoder/decoder/RD.
      
      Some implementation notes (these probably need careful review):
      - token range is extended by 1 bit, since the value range out of this
        transform is [-16384,16383].
      - the coefficients coming out of the FDCT are manually scaled back by
        1 bit, or else they won't fit in int16_t (they are 17 bits); see the
        sketch after these notes. Because of this, the RD error scoring does
        not right-shift the MSE score by two (unlike for 4x4/8x8/16x16).
      - to compensate for this loss in precision, the quantizer is also
        halved. This is currently a little hacky.
      - FDCT and IDCT are double-only right now; they need a fixed-point
        implementation.
      - There are no default probabilities for the 32x32 transform yet; I'm
        simply using the 16x16 luma ones. A future commit will add newly
        generated probabilities for all transforms.
      - No ADST version. I don't think we'll add one for this level; if an
        ADST is desired, transform-size selection can scale back to 16x16
        or lower, and use an ADST at that level.
      
      Additional notes specific to Debargha's DWT/DCT hybrid:
      - coefficient scale is different for the top/left 16x16 (DCT-over-DWT)
        block than for the rest (DWT pixel differences) of the block. Therefore,
        RD error scoring isn't easily scalable between coefficient and pixel
        domain. Thus, unfortunately, we need to compute the RD distortion in
        the pixel domain until we figure out how to scale these appropriately.
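
      A small sketch of the coefficient-scaling note above (illustrative,
      not the actual code):

        #include <assert.h>
        #include <stdint.h>

        /* The 32x32 FDCT produces 17-bit values (about [-65536, 65535]);
           dropping one bit makes each coefficient fit in int16_t. */
        static int16_t narrow_coeff(int32_t c17) {
          const int32_t c16 = c17 / 2;  /* scale back by 1 bit */
          assert(c16 >= INT16_MIN && c16 <= INT16_MAX);
          return (int16_t)c16;
        }

        /* The quantizer step is halved to compensate, so quantized levels
           stay comparable: (c / 2) / (q / 2) == c / q. */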
      
      Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b
  24. 30 Nov, 2012 1 commit
  25. 28 Nov, 2012 1 commit
  26. 27 Nov, 2012 1 commit
    • Add vp9_ prefix to all vp9 files · fcccbcbb
      John Koleszar authored
      Supports gyp, which doesn't allow multiple objects with the same
      basename in the same static library.
      
      Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
  27. 25 Nov, 2012 1 commit
    • Move switch(tx_size) around txsize to detokenize.c. · 25b609b6
      Ronald S. Bultje authored
      Add a new function vp9_decode_mb_tokens() that handles the switch
      between different per-tx-size detokenize functions. Make actual
      implementations (vp9_decode_mb_tokens_NxN()) static.
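
      A sketch of the dispatch, with simplified signatures (the real
      functions take the decoder and macroblock state as arguments):

        typedef enum { TX_4X4, TX_8X8, TX_16X16 } TxSizeSketch;

        static void decode_mb_tokens_4x4(void) {}
        static void decode_mb_tokens_8x8(void) {}
        static void decode_mb_tokens_16x16(void) {}

        /* Single entry point; the per-tx-size helpers stay static. */
        void vp9_decode_mb_tokens(TxSizeSketch tx_size) {
          switch (tx_size) {
            case TX_16X16: decode_mb_tokens_16x16(); break;
            case TX_8X8:   decode_mb_tokens_8x8();   break;
            default:       decode_mb_tokens_4x4();   break;
          }
        }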
      
      Change-Id: I9e0c4ef410bfa90128a02b472c079a955776816d
  28. 21 Nov, 2012 1 commit
  29. 17 Nov, 2012 1 commit
    • Remove special-case inline detokenization in b_pred reconstruction. · f19a1caf
      Ronald S. Bultje authored
      Just like for all other block modes, b_pred tokens can be read together
      before starting macroblock reconstruction. This removes the special
      cases for b_pred in decode_macroblock() and allows decode_coefs_4x4()
      to be made static in detokenize.c.
      
      While at it, remove the redundant handling and checking of plane_type
      and block_index (i) in decode_coefs_4x4(). Since the function is static,
      and is called only from decode_mb_tokens_4x4(), we don't need to worry
      that the arguments ever go out of sync.
      
      Change-Id: I2d415da0b51b89d0490a6b9e24cc86363c2090f7
  30. 10 Nov, 2012 1 commit
    • New b-intra mode where direction is contextual · d01357bb
      Deb Mukherjee authored
      Preliminary patch for a new 4x4 intra mode, B_CONTEXT_PRED, in which
      the dominant direction from the context is used for encoding. Various
      decoder changes are needed to support decoding B_CONTEXT_PRED in
      conjunction with hybrid transforms, since the scan order and
      tokenization depend on the actual prediction direction obtained from
      the context. Currently the traditional directional modes are used
      alongside B_CONTEXT_PRED, which also seems to provide the best results.
      
      The gains are small - in the 0.1% range.
      
      Change-Id: I5a7ea80b5218f42a9c0dfb42d3f79a68c7f0cdc2
  31. 01 Nov, 2012 1 commit
  32. 31 Oct, 2012 2 commits
  33. 19 Oct, 2012 1 commit
    • Remove bc, bc2 from pbi,cpi,xd · e9fd1eac
      John Koleszar authored
      Pass the bool coder to be used explicitly. This avoids cases where two
      different bool coders can be addressed from the same function. Also be
      more consistent with bool coder variable naming, starting to
      standardize on 'bc'.
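
      A minimal sketch of the signature change (the types here are
      simplified to an opaque struct):

        typedef struct BoolDecoderSketch BoolDecoderSketch;

        /* Before: a function would reach into shared decoder state for a
           coder (e.g. pbi->bc or pbi->bc2), so two coders could be
           addressed from one function. After: the caller passes the coder
           to use explicitly. */
        static void decode_something(BoolDecoderSketch *bc) { (void)bc; }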
      
      Change-Id: I1c95e2fdbe24ebe8c0f84924daa1728e3b054a31
  34. 11 Oct, 2012 1 commit
  35. 30 Aug, 2012 1 commit
    • hybrid transform of 16x16 dimension · de6dfa6b
      Jingning Han authored
      Enable ADST/DCT of dimension 16x16 for I16X16 modes. This change provides
      benefits mostly for HD sequences.
      
      Set up the framework for selectable transform dimension.
      
      Also allow a quantization-parameter threshold to control the use of
      the hybrid transform. (This is currently disabled by always setting
      the threshold above the quantization parameter; adaptive thresholding
      can be built on top of this, which should further improve coding
      performance.)
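
      A sketch of the thresholding idea (names are illustrative; as noted,
      the commit ships with the threshold effectively out of play):

        typedef enum { DCT_DCT, ADST_DCT } TxTypeSketch;

        /* Use the hybrid ADST/DCT only below a quantizer threshold.
           Keeping the threshold above every q leaves the hybrid transform
           always on, matching the commit's current state. */
        static TxTypeSketch select_tx_type(int q_index, int hybrid_q_thresh) {
          return (q_index < hybrid_q_thresh) ? ADST_DCT : DCT_DCT;
        }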
      
      The coding performance gains (with respect to the codec that has all
      other configuration settings turned on) are
      
      derf:   0.013
      yt:     0.086
      hd:     0.198
      std-hd: 0.501
      
      Change-Id: Ibb4263a61fc74e0b3c345f54d73e8c73552bf926
  36. 15 Aug, 2012 1 commit
    • Code clean up. · 77dc5c65
      Paul Wilkins authored
      Fixes further cases of inconsistent naming conventions.
      
      Change-Id: Id3411ecec6f01a4c889268a00f0c9fd5a92ea143
  37. 03 Aug, 2012 1 commit
    • 16x16 DCT blocks. · fed8a183
      Daniel Kang authored
      Enabled on all 16x16 intra/inter modes.
      
      Features:
      - Butterfly fDCT/iDCT
      - Loop filter does not filter internal edges with 16x16
      - Optimize coefficient function
      - Update coefficient probability function
      - RD
      - Entropy stats
      - 16x16 is a config option
      
      Not yet tested in combination with other experiments.
      
      hd:     2.60%
      std-hd: 2.43%
      yt:     1.32%
      derf:   0.60%
      
      Change-Id: I96fb090517c30c5da84bad4fae602c3ec0c58b1c
  38. 15 Mar, 2012 1 commit
    • WebM Experimental Codec Branch Snapshot · 6035da54
      Yaowu Xu authored
      This is a code snapshot of experimental work currently ongoing for a
      next-generation codec.
      
      The codebase has been cut down considerably from the libvpx baseline.
      For example, we are currently only supporting VBR 2-pass rate control
      and have removed most of the code relating to coding speed, threading,
      error resilience, partitions and various other features. This is in
      part to make the codebase easier to work on and experiment with, but
      also because we want to have an open discussion about how the bitstream
      will be structured and partitioned and not have that conversation
      constrained by past work.
      
      Our basic working pattern has been to initially encapsulate experiments
      using configure options linked to #if CONFIG_XXX statements in the
      code. Once experiments have matured and we are reasonably happy that
      they give benefit and can be merged without breaking other experiments,
      we remove the conditional compile statements and merge them in.
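
      A sketch of the pattern (CONFIG_MY_EXPERIMENT is a made-up example
      flag; the real flags are generated by configure):

        /* configure --enable-my_experiment would emit: */
        #define CONFIG_MY_EXPERIMENT 1

        #if CONFIG_MY_EXPERIMENT
        /* experimental code path */
        #else
        /* baseline behaviour */
        #endif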
      
      Current changes include:
      * Temporal coding experiment for segments (still a maximum of 4,
        though this will likely be increased).
      * Segment feature experiment - to allow various bits of information to
        be coded at the segment level. Features tested so far include mode
        and reference frame information, limiting end of block offset and
        transform size, alongside Q and loop filter parameters, but this set
        is very fluid.
      * Support for an 8x8 transform - an 8x8 DCT with a 2nd-order 2x2 Haar
        is used in MBs using 16x16 prediction modes within inter frames.
      * Compound prediction (combination of signals from existing predictors
        to create a new predictor).
      * 8 tap interpolation filters and 1/8th pel motion vectors.
      * Loop filter modifications.
      * Various entropy modifications and changes to how entropy contexts and
        updates are handled.
      * Extended quantizer range matched to transform precision improvements.
      
      There are also further experiments ongoing that we hope to merge in
      the near future: for example, coding of motion and other aspects of the
      prediction signal to better support larger image formats, use of larger
      block sizes (e.g. 32x32 and up) and lossless non-transform based coding
      options (especially for key frames). It is our hope that we will be
      able to make regular updates and we will warmly welcome community
      contributions.
      
      Please be warned that, at this stage, the codebase is currently slower
      than the VP8 stable branch, as most new code has not been optimized and
      even the C code has been deliberately written to be simple and obvious,
      not fast.
      
      The following graphs show the initial test results; the numbers in the
      tables measure the compression improvement as a percentage. The build
      has the following optional experiments configured:
      --enable-experimental --enable-enhanced_interp --enable-uvintra
      --enable-high_precision_mv --enable-sixteenth_subpel_uv
      
      CIF Size clips:
      http://getwebm.org/tmp/cif/
      HD size clips:
      http://getwebm.org/tmp/hd/
      (stable_20120309 represents the encoding results of a WebM master
      branch build as of commit #7a159071)
      
      They were encoded using the following encode parameters:
      --good --cpu-used=0 -t 0 --lag-in-frames=25 --min-q=0 --max-q=63
      --end-usage=0 --auto-alt-ref=1 -p 2 --pass=2 --kf-max-dist=9999
      --kf-min-dist=0 --drop-frame=0 --static-thresh=0 --bias-pct=50
      --minsection-pct=0 --maxsection-pct=800 --sharpness=0
      --arnr-maxframes=7 --arnr-strength=3 (for HD; 6 for CIF)
      --arnr-type=3
      
      Change-Id: I5c62ed09cfff5815a2bb34e7820d6a810c23183c