1. 11 Apr, 2013 1 commit
  2. 28 Mar, 2013 2 commits
    • Framework changes in nzc to allow more flexibility · fe9b5143
      Deb Mukherjee authored
      The patch adds the flexibility to use standard EOB-based coding
      on smaller block sizes and nzc-based coding on larger block sizes.
      The tx-sizes that use nzc-based coding and those that use EOB-based
      coding are controlled by a function get_nzc_used().
      By default, this function uses nzc-based coding for 16x16 and 32x32
      transform blocks, which seems to bridge the performance gap
      substantially.
      
      All sets are now lower by 0.5% to 0.7%, as opposed to ~1.8% before.
      
      Change-Id: I06abed3df57b52d241ea1f51b0d571c71e38fd0b
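
The size-based switch described in this commit can be sketched as follows. This is a minimal illustration, assuming an enum ordered from smallest to largest transform; the enum values and the function body are hypothetical, since the commit message only names get_nzc_used() and its default behaviour.

```c
#include <assert.h>

/* Transform sizes, smallest to largest (hypothetical enum for this sketch). */
typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32, TX_SIZE_MAX } TX_SIZE;

/* Returns 1 if blocks of this transform size code a non-zero count (nzc),
 * 0 if they use standard EOB-based coding. The threshold follows the default
 * described in the commit message: nzc for 16x16 and 32x32 only. */
static int get_nzc_used(TX_SIZE tx_size) {
  assert(tx_size < TX_SIZE_MAX);
  return tx_size >= TX_16X16;
}
```

Keeping the decision in one function means the cut-over point can be moved without touching the tokenizer or detokenizer.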
    • Fix mix-up in pt token indexing. · 9eea9fa2
      Ronald S. Bultje authored
      This fixes uninitialized reads in the trellis, and probably makes the
      trellis do something again.
      
      Change-Id: Ifac8dae9aa77574bde0954a71d4571c5c556df3c
  3. 27 Mar, 2013 1 commit
    • Scatter-based scantables. · 513157e0
      Ronald S. Bultje authored
      This gains about 0.2% on derf, 0.1% on hd and 0.4% on stdhd. I can put
      this under an experimental flag if wanted, just trying to get my patch
      queue in shape.
      
      Change-Id: Ibe1a30fe0e0b07bec4802e0f3ff0ba22e505f576
  4. 26 Mar, 2013 4 commits
    • Add col/row-based coefficient scanning patterns for 1D 8x8/16x16 ADSTs. · d9094d8f
      Ronald S. Bultje authored
      These are mostly just for experimental purposes. I saw small gains (in
      the 0.1% range) when playing with this on derf.
      
      Change-Id: Ib21eed477bbb46bddcd73b21c5c708a5b46abedc
    • Redo banding for all transforms. · 3120dbdd
      Ronald S. Bultje authored
      Now that the first AC coefficients in both directions use the same DC
      as their context, there is no longer a purpose in letting both have
      their own band. Merging these two bands allows us to split bands for
      some of the very high-frequency AC bands.
      
      In addition, I'm redoing the banding for the 1D-ADST col/row scans. I
      don't think the old banding made any sense at all (it merged the last
      coefficient of the first row/col in the same band as the first two of
      the second row/col), which was clearly an oversight from the band being
      applied in scan-order (rather than in their actual position). Now,
      coefficients at the same position will be in the same band, regardless
      of which scan order is used. I think this makes most sense for the purpose
      of banding, which is basically "predict energy for this coefficient
      depending on the energy of context coefficients" (i.e. pt).
      
      After full re-training, together with previous patch, derf gains about
      1.2-1.3%, and hd/stdhd gain about 0.9-1.0%.
      
      Change-Id: I7a0cc12ba724e88b278034113cb4adaaebf87e0c
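
A minimal sketch of position-based banding, as opposed to banding by scan index. The band count and the anti-diagonal mapping are assumptions made for illustration; the commit does not list the actual band tables. The point shown is only that the band is derived from (row, col), so the same coefficient lands in the same band under any scan order.

```c
#include <assert.h>

#define NUM_BANDS 6 /* hypothetical band count for this sketch */

/* Band from the coefficient's position in the block. DC gets band 0, and the
 * first AC coefficient in each direction, (0,1) and (1,0), share band 1. */
static int band_from_position(int row, int col) {
  int d = row + col; /* anti-diagonal index; an assumed mapping */
  assert(row >= 0 && col >= 0);
  return d < NUM_BANDS ? d : NUM_BANDS - 1;
}

/* Band of the i-th coefficient of a scan; scan[i] is a raster index into a
 * block that is `width` coefficients wide. */
static int band_for_scan_index(const int *scan, int i, int width) {
  int pos = scan[i];
  return band_from_position(pos / width, pos % width);
}
```

With a row scan and a column scan over a 4x4 block, raster position 4 sits at scan index 4 and 1 respectively, yet both lookups return the same band.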
    • Use above/left (instead of previous in scan-order) as token context. · 790fb132
      Ronald S. Bultje authored
      Pearson correlation for above or left is significantly higher than for
      previous-in-scan-order (absolute values depend on position in scan, but
      in general, we gain about 0.1-0.2 by using either above or left; using
      both basically just makes this even better). For eob branch skipping,
      we continue to use the previous token in scan order.
      
      This helps about 0.9% on derf after re-training on a limited data set.
      Full re-training and results on larger-resolution clips are pending.
      
      Note that this commit breaks trellis, so we can probably get further
      gains out of it by fixing trellis at some later point.
      
      Change-Id: Iead68e296fc3a105cca746b5e3da9555d6010cfe
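
The above/left context selection can be sketched like this. The function name, the token_cache array (energy class of already-coded tokens, indexed by raster position) and the rounded-average combining rule are all assumptions for illustration; the commit only states that above and left neighbours replace the previous token in scan order. The sketch also assumes a scan in which both neighbours are coded before the current position.

```c
#include <assert.h>
#include <stdint.h>

/* Context for the coefficient at (row, col), derived from the energy classes
 * of the above and left neighbours inside the same block. */
static int get_coef_context(const uint8_t *token_cache, int row, int col,
                            int width) {
  assert(row >= 0 && col >= 0 && col < width);
  if (row == 0 && col == 0) return 0; /* DC: context comes from elsewhere */
  if (row == 0) return token_cache[col - 1];           /* left only */
  if (col == 0) return token_cache[(row - 1) * width]; /* above only */
  {
    int above = token_cache[(row - 1) * width + col];
    int left = token_cache[row * width + col - 1];
    return (above + left + 1) >> 1; /* assumed rule: rounded average */
  }
}
```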
    • Modeling default coef probs with distribution · fd18d5df
      Deb Mukherjee authored
      Replaces the default tables for single coefficient magnitudes with
      those obtained from an appropriate distribution. The EOB node
      is left unchanged. The model is represented as a 256-size codebook
      where the index corresponds to the probability of the Zero or the
      One node. Two variations are implemented corresponding to whether
      the Zero node or the One-node is used as the peg. The main advantage
      is that the default prob tables will become considerably smaller and
      manageable. Besides, there is substantially less risk of over-fitting
      for a training set.
      
      Various distributions are tried and the one that gives the best
      results is the family of Generalized Gaussian distributions with
      shape parameter 0.75. The results are within about 0.2% of fully
      trained tables for the Zero peg variant, and within 0.1% of the
      One peg variant.
      
      The forward updates are optionally (controlled by a macro)
      model-based, i.e. restricted to only convey probabilities from the
      codebook. Backward updates can also be optionally (controlled by
      another macro) model-based, but this is turned off by default. Currently
      model-based forward updates work about the same as unconstrained
      updates, but there is a drop in performance with backward-updates
      being model based.
      
      The model based approach also allows the probabilities for the key
      frames to be adjusted from the defaults based on the base_qindex of
      the frame. Currently the adjustment function is a placeholder that
      adjusts the prob of EOB and Zero node from the nominal one at higher
      quality (lower qindex) or lower quality (higher qindex) ends of the
      range. The rest of the probabilities are then derived based on the
      model from the adjusted prob of zero.
      
      Change-Id: Iae050f3cbcc6d8b3f204e8dc395ae47b3b2192c9
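
For reference, the standard zero-mean form of the Generalized Gaussian density is given below. The commit does not state its exact parameterisation, so the symbols here (scale α, shape β) are the conventional ones and are an assumption about how the model is written; β = 0.75 is the shape value quoted in the message.

```latex
f(x;\alpha,\beta) \;=\; \frac{\beta}{2\,\alpha\,\Gamma(1/\beta)}
  \exp\!\left(-\left(\frac{|x|}{\alpha}\right)^{\beta}\right),
\qquad \beta = 0.75
```

Setting β = 1 gives the Laplacian and β = 2 the Gaussian, so a shape of 0.75 describes a distribution more sharply peaked at zero and heavier-tailed than a Laplacian.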
  5. 11 Mar, 2013 1 commit
  6. 10 Mar, 2013 1 commit
    • Optimize vp9_tree_probs_from_distribution · bd84685f
      John Koleszar authored
      The previous implementation visited each node in the tree multiple times
      because it used each symbol's encoding to revisit the branches taken and
      increment its count. Instead, we can traverse the tree depth first and
      calculate the probabilities and branch counts as we walk back up. The
      complexity goes from somewhere between O(nlogn) and O(n^2) (depending on
      how balanced the tree is) to O(n).
      
      Only tested one clip (256kbps, CIF), saw 13% decoding perf improvement.
      
      Note that this optimization should port trivially to VP8 as well. In VP8,
      the decoder doesn't use this function, but it does routinely show up
      on the profile for realtime encoding.
      
      Change-Id: I4f2848e4f41dc9a7694f73f3e75034bce08d1b12
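
The depth-first idea can be sketched as below. This is not the actual vp9_tree_probs_from_distribution code: the function name, the signature and the zero-count fallback of 128 are assumptions. It also assumes the libvpx-style tree layout, where tree[i] and tree[i + 1] are the two children of node i, a positive entry is the index of a child node, and a non-positive entry is a leaf holding the negated token.

```c
#include <assert.h>
#include <stdint.h>

/* Visits each node once. Returns the total event count below node i, and on
 * the way back up fills branch_ct[i >> 1] and the probability of branch 0. */
static unsigned int tree_counts(const int8_t *tree, int i,
                                const unsigned int *num_events,
                                unsigned int (*branch_ct)[2], uint8_t *probs) {
  const int l = tree[i], r = tree[i + 1];
  unsigned int left, right, total;
  assert(i >= 0 && (i & 1) == 0);
  left = l <= 0 ? num_events[-l]
                : tree_counts(tree, l, num_events, branch_ct, probs);
  right = r <= 0 ? num_events[-r]
                 : tree_counts(tree, r, num_events, branch_ct, probs);
  total = left + right;
  branch_ct[i >> 1][0] = left;
  branch_ct[i >> 1][1] = right;
  if (total == 0) {
    probs[i >> 1] = 128; /* assumption: no data means even odds */
  } else {
    unsigned int p = (256 * left + (total >> 1)) / total;
    probs[i >> 1] = (uint8_t)(p > 255 ? 255 : p < 1 ? 1 : p);
  }
  return total;
}
```

Calling tree_counts(tree, 0, ...) on the root fills every node's counts in a single traversal, which is where the O(n) bound in the message comes from.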
  7. 09 Mar, 2013 1 commit
    • Continued experiment with nonzero count · a28139c8
      Deb Mukherjee authored
      Adds probability updates for extra bits for the nzcs, code for
      getting nzc stats, plus some minor cleanups and fixes.
      
      Change-Id: If2814e7f04fb52f5025ad9f400f3e6c50a00b543
  8. 07 Mar, 2013 1 commit
    • Coding non-zero count rather than EOB for coeffs · eb6ef241
      Deb Mukherjee authored
      This patch revamps the entropy coding of coefficients to code first
      a non-zero count per coded block and correspondingly remove the EOB
      token from the token set.
      
      STATUS:
      Main encode/decode code achieving encode/decode sync - done.
      Forward and backward probability updates to the nzcs - done.
      Rd costing updates for nzcs - done.
      Note: The dynamic programming approach used in trellis quantization
      is not exactly compatible with nzcs. A suboptimal approach has been
      used instead where branch costs are updated to account for changes
      in the nzcs.
      
      TODO:
      Training the default probs/counts for nzcs
      
      Change-Id: I951bc1e22f47885077a7453a09b0493daa77883d
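
To make the difference between the two signalling schemes concrete, here is a small sketch that computes both quantities for one block. The function names and the quantized-coefficient layout (raster order, with a separate scan table) are assumptions for illustration. With EOB coding the decoder reads tokens until it meets the EOB token; with nzc coding it reads the count first and can stop once that many non-zero tokens have been decoded, so no EOB token is needed.

```c
#include <assert.h>
#include <stdint.h>

/* EOB: one past the scan index of the last non-zero coefficient. */
static int compute_eob(const int16_t *qcoeff, const int *scan, int n) {
  int i, eob = 0;
  assert(n >= 0);
  for (i = 0; i < n; ++i)
    if (qcoeff[scan[i]] != 0) eob = i + 1;
  return eob;
}

/* nzc: number of non-zero coefficients; independent of scan order. */
static int compute_nzc(const int16_t *qcoeff, int n) {
  int i, nzc = 0;
  assert(n >= 0);
  for (i = 0; i < n; ++i) nzc += qcoeff[i] != 0;
  return nzc;
}
```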
  9. 05 Mar, 2013 1 commit
    • Make superblocks independent of macroblock code and data. · 111ca421
      Ronald S. Bultje authored
      Split macroblock and superblock tokenization and detokenization
      functions and coefficient-related data structs so that the bitstream
      layout and related code of superblock coefficients looks less like it's
      a hack to fit macroblocks in superblocks.
      
      In addition, unify chroma transform size selection from luma transform
      size (i.e. always use the same size, as long as it fits the predictor);
      in practice, this means 32x32 and 64x64 superblocks using the 16x16 luma
      transform will now use the 16x16 (instead of the 8x8) chroma transform,
      and 64x64 superblocks using the 32x32 luma transform will now use the
      32x32 (instead of the 16x16) chroma transform.
      
      Lastly, add a trellis optimize function for 32x32 transform blocks.
      
      HD gains about 0.3%, STDHD about 0.15% and derf about 0.1%. There are
      a few negative points here and there that I might want to analyze
      a little closer.
      
      Change-Id: Ibad7c3ddfe1acfc52771dfc27c03e9783e054430
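
The unified chroma rule ("same size as luma, as long as it fits the predictor") can be sketched as follows. The function name, the enum and the sb_size parameter are hypothetical, and 4:2:0 subsampling is assumed, so the chroma predictor is half the superblock dimension.

```c
#include <assert.h>

typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32 } TX_SIZE;

/* Chroma transform size for a superblock whose luma dimension is sb_size
 * pixels (32 or 64). Uses the luma size unless it exceeds the largest
 * transform that fits the chroma predictor. */
static TX_SIZE get_uv_tx_size(TX_SIZE luma_tx, int sb_size) {
  const int uv_dim = sb_size / 2; /* assumes 4:2:0 subsampling */
  TX_SIZE max_uv;
  assert(sb_size == 32 || sb_size == 64);
  max_uv = uv_dim >= 32 ? TX_32X32 : TX_16X16;
  return luma_tx < max_uv ? luma_tx : max_uv;
}
```

This reproduces the cases listed in the message: 16x16 luma gives 16x16 chroma in both superblock sizes, and 32x32 luma gives 32x32 chroma only in 64x64 superblocks.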
  10. 23 Feb, 2013 2 commits
    • Split coefficient token tables intra vs. inter. · 0c9e2e9a
      Ronald S. Bultje authored
      Change-Id: I5416455f8f129ca0f450d00e48358d2012605072
    • Further changes to coefficient contexts. · c17672a3
      Paul Wilkins authored
      This patch alters the balance of context between the
      coefficient bands (reflecting the position of coefficients
      within a transform block) and the energy of the previous
      token (or tokens) within a block.
      
      In this case the number of coefficient bands is reduced
      but more previous token energy bands are supported.
      
      Some initial rebalancing of the default tables has been done
      by running multiple derf clips at multiple data rates using
      the ENTROPY_STATS macro. Further balancing needs to be
      done using larger image formats, especially in regard to
      the bigger transform sizes which are not as well represented
      in encodings of smaller image formats.
      
      Change-Id: If9736e95c391e711b04aef6393d26f60f36e1f8a
  11. 15 Feb, 2013 1 commit
  12. 14 Feb, 2013 1 commit
  13. 13 Feb, 2013 4 commits
    • Abstract selection of coef band. · 9255ad10
      Paul Wilkins authored
      This patch abstracts the selection of the coefficient band
      context into a function as a precursor to further experiments
      with the coefficient context.
      
      It also removes the large per TX size coefficient band structures
      and uses a single matrix for all block sizes within the test function.
      
      This may have an impact on quality (results to follow) but is only an
      intermediate step in the process of redefining the context. Also the
      quality impact will be larger initially because the default tables will
      be out of step with the new banding.
      
      In particular the 4x4 will in this case only use 7 bands. If needed we
      can add back block size dependency localized within the function, but
      this can follow on after the other changes to the definition of the
      context.
      
      Change-Id: Id7009c2f4f9bb1d02b861af85fd8223d4285bde5
    • Abstract the selection of coefficient context. · 0d284ffe
      Paul Wilkins authored
      This is an initial step to facilitate experimentation
      with changes to the prior token context used to code
      coefficients to take better account of the energy of
      preceding tokens.
      
      This patch merely abstracts the selection of context into
      two functions and does not alter the output.
      
      Change-Id: I117fff0b49c61da83aed641e36620442f86def86
    • Remove NEWCOEFCONTEXT experiment. · 6a9f0c61
      Paul Wilkins authored
      Removal of the NEWCOEFCONTEXT experiment to
      reduce code clutter and make it easier to experiment with
      some other changes to the coefficient coding context.
      
      Change-Id: Icd17b421384c354df6117cc714747647c5eb7e98
    • Removal of Hybrid DWT/DCT experiment. · 649be94c
      Paul Wilkins authored
      Removal of experiment to simplify code base for other
      changes.
      
      Change-Id: If0a33952504558511926ad212bc311fc2bffb19a
  14. 13 Jan, 2013 1 commit
    • Further enhancements/fixes on dct/dwt hybrid txfm · 516db21c
      Deb Mukherjee authored
      Fixes some scaling issues. Adds an option to only compute the
      dct on the low-low subband for 32x32 and 64x64 blocks using
      only a single 16x16 dct after 1 and 2 wavelet decomposition
      levels respectively. Also adds an option to use a 8x8 dct
      as building block.
      
      Currently, with the 2/6 filter and with a single 16x16 dct on
      the low-low band, the results compared to the full 32x32 dct are
      as follows:
      derf: -0.15%
      yt: -0.29%
      std-hd: -0.18%
      hd: -0.6%
      These are my current recommended settings, since the 2/6 filter
      is very simple.
      
      Results with 8x8 dct are about 0.3% worse.
      
      Change-Id: I00100cdc96e32deced591985785ef0d06f325e44
  15. 10 Jan, 2013 1 commit
  16. 08 Jan, 2013 2 commits
    • Adds 64x64 hybrid dct/dwt transform · 4b7304ee
      Deb Mukherjee authored
      This is to add to the 64x64 transform experiment as an alternative to
      a 64x64 DCT.
      Two levels of wavelet decomposition are used on a 64x64 block, followed
      by 16x16 DCT on the four lowest subbands. The highest three subbands
      are left untransformed after the first level DWT.
      
      Change-Id: I3d48d5800468d655191933894df6b46e15adca56
    • Merge superblocks (32x32) experiment. · 4455036c
      Ronald S. Bultje authored
      Change-Id: I0df99742029834a85c4933652b0587cf5b6b2587
  17. 20 Dec, 2012 1 commit
    • New previous coef context experiment · 08f0c7cc
      Deb Mukherjee authored
      Adds an experiment to derive the previous context of a coefficient
      not just from the previous coefficient in the scan order but from a
      combination of several neighboring coefficients previously encountered
      in scan order.  A precomputed table of neighbors for each location
      for each scan type and block size is used. Currently 5 neighbors are
      used.
      
      Results are about 0.2% positive using a strategy where the max coef
      magnitude from the 5 neighbors is used to derive the context.
      
      Change-Id: Ie708b54d8e1898af742846ce2d1e2b0d89fd4ad5
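
The max-of-neighbors strategy can be sketched as below. The function name, the magnitude array (indexed by raster position, holding already-coded coefficient magnitudes) and the mapping of the maximum onto four context classes are assumptions; the commit specifies only that up to 5 precomputed neighbors are consulted and that their maximum magnitude drives the context.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NEIGHBORS 5

/* Context from the largest magnitude among the listed neighbours.
 * neighbors[] holds raster positions taken from a precomputed table. */
static int context_from_neighbors(const uint8_t *magnitude,
                                  const int *neighbors, int num_neighbors) {
  int i, max_mag = 0;
  assert(num_neighbors >= 0 && num_neighbors <= MAX_NEIGHBORS);
  for (i = 0; i < num_neighbors; ++i) {
    int m = magnitude[neighbors[i]];
    if (m > max_mag) max_mag = m;
  }
  return max_mag > 3 ? 3 : max_mag; /* hypothetical: clamp to 4 classes */
}
```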
  18. 18 Dec, 2012 3 commits
  19. 13 Dec, 2012 1 commit
    • Further improvements on the hybrid dwt/dct expt · 210dc5b2
      Deb Mukherjee authored
      Modifies the scanning pattern and uses a floating point 16x16
      dct implementation for now to handle scaling better.
      Also experiments are in progress with 2/6 and 9/7 wavelets.
      
      Results have improved to within ~0.25% of 32x32 dct for std-hd
      and about 0.03% for derf. This difference can probably be bridged by
      re-optimizing the entropy stats for these transforms. Currently
      the stats used are common between 32x32 dct and dwt/dct.
      
      Experiments are in progress with various scan pattern - wavelet
      combinations.
      
      Ideally the subbands should be tokenized separately, and an
      experiment will be conducted next on that.
      
      Change-Id: Ia9cbfc2d63cb7a47e562b2cd9341caf962bcc110
  20. 12 Dec, 2012 1 commit
    • Consistently use get_prob(), clip_prob() and newly added clip_pixel(). · 4d0ec7aa
      Ronald S. Bultje authored
      Add a function clip_pixel() to clip a pixel value to the [0,255] range
      of allowed values, and use this wherever appropriate (e.g. prediction,
      reconstruction). Likewise, consistently use the recently added function
      clip_prob(), which calculates a binary probability in the [1,255] range.
      If possible, try to use get_prob() or its sister get_binary_prob() to
      calculate binary probabilities, for consistency.
      
      Since in some places, this means that binary probability calculations
      are changed (we use {255,256}*count0/(total) in a range of places,
      and all of these are now changed to use (256*count0+(total>>1))/total),
      this changes the encoding result, so this patch warrants some extensive
      testing.
      
      Change-Id: Ibeeff8d886496839b8e0c0ace9ccc552351f7628
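
A minimal sketch of the helpers named in this commit, written from the ranges and the rounding formula given in the message. The exact signatures, the uint8_t return type and the fallback of 128 for a zero total are assumptions, not the committed code.

```c
#include <assert.h>
#include <stdint.h>

/* Clip a reconstructed or predicted sample to the [0,255] pixel range. */
static uint8_t clip_pixel(int val) {
  return (uint8_t)(val > 255 ? 255 : val < 0 ? 0 : val);
}

/* Clip a probability to [1,255]; 0 is excluded so no symbol is impossible. */
static uint8_t clip_prob(int p) {
  return (uint8_t)(p > 255 ? 255 : p < 1 ? 1 : p);
}

/* Rounded probability num/den on a 256 scale: (256*num + (den>>1)) / den. */
static uint8_t get_prob(int num, int den) {
  assert(num >= 0 && den >= 0);
  if (den == 0) return 128; /* assumption: no data means even odds */
  return clip_prob((256 * num + (den >> 1)) / den);
}

/* Probability of the zero branch given counts for both branches. */
static uint8_t get_binary_prob(int n0, int n1) {
  return get_prob(n0, n0 + n1);
}
```

Adding total>>1 before dividing rounds to nearest instead of truncating, which is why switching to this form changes the encoded output.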
  21. 08 Dec, 2012 1 commit
    • Introduce vp9_coeff_probs/counts/stats/accum types. · 885cf816
      Ronald S. Bultje authored
      Use these, instead of the 4/5-dimensional arrays, to hold statistics,
      counts, accumulations and probabilities for coefficient tokens. This
      commit also re-allows ENTROPY_STATS to compile.
      
      Change-Id: If441ffac936f52a3af91d8f2922ea8a0ceabdaa5
  22. 07 Dec, 2012 1 commit
    • 32x32 transform for superblocks. · c456b35f
      Ronald S. Bultje authored
      This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds
      code all over the place to wrap that in the bitstream/encoder/decoder/RD.
      
      Some implementation notes (these probably need careful review):
      - token range is extended by 1 bit, since the value range out of this
        transform is [-16384,16383].
      - the coefficients coming out of the FDCT are manually scaled back by
        1 bit, or else they won't fit in int16_t (they are 17 bits). Because
        of this, the RD error scoring does not right-shift the MSE score by
        two (unlike for 4x4/8x8/16x16).
      - to compensate for this loss in precision, the quantizer is halved
        also. This is currently a little hacky.
      - FDCT and IDCT are double-only right now. Needs a fixed-point impl.
      - There are no default probabilities for the 32x32 transform yet; I'm
        simply using the 16x16 luma ones. A future commit will add newly
        generated probabilities for all transforms.
      - No ADST version. I don't think we'll add one for this level; if an
        ADST is desired, transform-size selection can scale back to 16x16
        or lower, and use an ADST at that level.
      
      Additional notes specific to Debargha's DWT/DCT hybrid:
      - coefficient scale is different for the top/left 16x16 (DCT-over-DWT)
        block than for the rest (DWT pixel differences) of the block. Therefore,
        RD error scoring isn't easily scalable between coefficient and pixel
        domain. Thus, unfortunately, we need to compute the RD distortion in
        the pixel domain until we figure out how to scale these appropriately.
      
      Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b
  23. 28 Nov, 2012 1 commit
  24. 27 Nov, 2012 1 commit
    • Add vp9_ prefix to all vp9 files · fcccbcbb
      John Koleszar authored
      Support for gyp which doesn't support multiple objects in the same
      static library having the same basename.
      
      Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
  25. 01 Nov, 2012 2 commits
  26. 31 Oct, 2012 3 commits