1. 14 Nov, 2013 1 commit
    • Simplifies band-getting with a static array · cfcd5c4f
      Deb Mukherjee authored
      Simplifies the code by implementing band mapping with static arrays.
      A lot of the code complexity introduced in a previous patch
      disappears.
      
      Change-Id: Ia3fac36e594fb5ad2d55ae141c58bba4c55c2d28
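A minimal sketch of what band mapping with static arrays can look like. The table names, the enum, and the table values below are hypothetical and chosen only for illustration; they are not taken from the patch itself.

```c
#include <stdint.h>

/* Hypothetical band table for a 4x4 block: entry i is the band of the
 * i-th coefficient in scan order. Values are illustrative only. */
static const uint8_t sketch_band_4x4[16] = {
  0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5
};

/* Hypothetical band table for an 8x8 block (illustrative values):
 * everything past the first 21 scan positions lands in the last band. */
static const uint8_t sketch_band_8x8[64] = {
  0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,
  4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
  5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
  5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5
};

/* Assumed transform-size enum, for this sketch only. */
typedef enum { SKETCH_TX_4X4, SKETCH_TX_8X8 } sketch_tx_size;

/* Picking the table is a single pointer selection; looking up a band is
 * then a plain array index, with no per-coefficient arithmetic. */
static const uint8_t *sketch_get_band_table(sketch_tx_size tx) {
  return tx == SKETCH_TX_4X4 ? sketch_band_4x4 : sketch_band_8x8;
}
```

With a table per transform size, the caller fetches the pointer once per block and indexes it by scan position inside the coefficient loop.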
  2. 12 Nov, 2013 2 commits
  3. 31 Oct, 2013 1 commit
  4. 16 Oct, 2013 1 commit
    • Adding get_band_translate() function. · 9deb614a
      Dmitry Kovalev authored
      Moving code that gets band_translate array from get_scan_and_band()
      function to get_band_translate() function. Renaming get_scan_and_band() to
      get_scan().
      
      Change-Id: I43047c205a1ca2a6e24be44db39dc04b7a385008
  5. 15 Oct, 2013 2 commits
  6. 11 Oct, 2013 2 commits
    • Replacing {VP9_COEF, MODE}_UPDATE_PROB with DIFF_UPDATE_PROB. · 4a0f9478
      Dmitry Kovalev authored
      Values of MODE_UPDATE_PROB and VP9_COEF_UPDATE_PROB are equal, so replacing
      them with one constant. Inlining appropriate arguments for functions:
        vp9_cond_prob_diff_update (encoder)
        vp9_diff_update_prob (decoder)
      
      Change-Id: I1255a1cb477743b799b3bfbbcd8de6b32b067338
    • Removing vp9_tree_p typedef. · 98400c1b
      Dmitry Kovalev authored
      It is used only twice, and it is clearer to use the real type instead
      of the typedef.
      
      Change-Id: Idc25c16504c3da4d040e0cdb33a2987631bb6a5b
  7. 07 Oct, 2013 2 commits
  8. 30 Sep, 2013 1 commit
  9. 27 Sep, 2013 1 commit
  10. 19 Sep, 2013 1 commit
  11. 05 Sep, 2013 1 commit
  12. 29 Aug, 2013 1 commit
  13. 27 Aug, 2013 1 commit
  14. 26 Aug, 2013 1 commit
  15. 21 Aug, 2013 1 commit
  16. 12 Aug, 2013 1 commit
  17. 01 Aug, 2013 1 commit
  18. 29 Jul, 2013 1 commit
    • Remove unnecessary 64 byte alignment · a31effca
      John Koleszar authored
      Fixes a warning on MSVS 2012 where the alignment of vp9_default_iscan_8x8
      didn't match between its declaration and definition.
      
      Change-Id: I1466a15635f4b22594d705d570b7e399bfb6cf21
  19. 25 Jul, 2013 1 commit
    • General cleanups. · 7131cb0e
      Dmitry Kovalev authored
      Removing unused constants, macros, and function declarations. Using
      ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving
      #include from *.h to *.c. Merging for loops for motion vectors.
      
      Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
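As a rough illustration of the helper macros this cleanup relies on, the sketch below shows a rounding right shift and zero/copy wrappers built on the standard `memset` and `memcpy`. The `sketch_zero` and `sketch_copy` names are hypothetical stand-ins, and the macro bodies are an assumption about how such helpers are typically written, not a copy of the project's definitions.

```c
#include <string.h>

/* Divide by 2^n with rounding to nearest: add half of the divisor
 * before shifting. Assumes n >= 1 and a non-negative value. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))

/* Hypothetical stand-ins for zero/copy helpers. Both rely on sizeof
 * seeing the whole object, so they must be given real arrays or
 * structs, never pointers. */
#define sketch_zero(dest) memset(&(dest), 0, sizeof(dest))
#define sketch_copy(dest, src) memcpy((dest), (src), sizeof(src))
```

For example, `ROUND_POWER_OF_TWO(7, 2)` evaluates `(7 + 2) >> 2`, which rounds 1.75 up to 2, whereas a plain `7 >> 2` would truncate to 1.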
  20. 22 Jul, 2013 1 commit
    • More optimizations for cost_coeffs(). · e20fcd95
      Ronald S. Bultje authored
      4x4:    163 ->  123 cycles (33% faster)
      8x8:    491 ->  399 cycles (23% faster)
      16x16: 1889 -> 1763 cycles (7% faster)
      32x32: 8311 -> 8180 cycles (1.6% faster)
      
      Overall encoding time of first 50 frames of bus (speed 0) @ 1500kbps
      goes from 1min4.33 to 1min3.00, i.e. 2.11% faster.
      
      Change-Id: Ib52d1dbb5649b14de769d3e7a74af67440b5284f
  21. 16 Jul, 2013 1 commit
  22. 01 Jul, 2013 2 commits
    • Make get_coef_context() branchless. · 26b6318d
      Ronald S. Bultje authored
      This should significantly speed up cost_coeffs(). Basically what the
      patch does is to make the neighbour arrays padded by one item to
      prevent an eob check in get_coef_context(), then it populates each
      col/row scan and left/top edge coefficient with two times the same
      neighbour - this prevents a single/double context branch in
      get_coef_context(). Lastly, it populates neighbour arrays in pixel
      order (rather than scan order), so we don't have to dereference the
      scantable to get the correct neighbours.
      
      Total encoding time of first 50 frames of bus (speed 0) at 1500kbps
      goes from 2min10.1 to 2min5.3, i.e. a 2.6% overall speed increase.
      
      Change-Id: I42bcd2210fd7bec03767ef0e2945a665b851df56
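The three ideas in this message (a padded neighbour table, edge neighbours stored twice, and neighbours given as pixel positions) can be sketched on a toy 2x2 block. The table contents, the `sketch_` names, and the rounded-average formula are assumptions made for illustration; the message does not spell out how the two neighbours are combined.

```c
#include <stdint.h>

#define SKETCH_MAX_NEIGHBORS 2

/* Toy 2x2 block with raster positions 0..3, scanned in raster order.
 * Each scan index c owns two entries holding pixel positions. */
static const int16_t sketch_neighbors[(4 + 1) * SKETCH_MAX_NEIGHBORS] = {
  0, 0, /* c = 0: DC, its context comes from elsewhere; unused here  */
  0, 0, /* c = 1: top row, only a left neighbour, stored twice       */
  0, 0, /* c = 2: left column, only an above neighbour, stored twice */
  1, 2, /* c = 3: above (position 1) and left (position 2)           */
  0, 0  /* padding so that reading the entry at c == eob is safe     */
};

/* token_cache is indexed by pixel position. Because every entry has
 * exactly two valid neighbours, no eob test and no one-vs-two
 * neighbour branch is needed. The rounded average is an assumption. */
static int sketch_get_coef_context(const int16_t *neighbors,
                                   const uint8_t *token_cache, int c) {
  return (1 + token_cache[neighbors[SKETCH_MAX_NEIGHBORS * c + 0]] +
          token_cache[neighbors[SKETCH_MAX_NEIGHBORS * c + 1]]) >> 1;
}
```

Storing the same neighbour twice makes the averaged result equal to that single neighbour's value, which is what lets the edge case share the general code path.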
    • Quantize (64-bit only, for now) SSSE3 SIMD. · 7353ceab
      Ronald S. Bultje authored
      Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps
      goes from 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code
      is x86-64 only; it needs some minor modifications to be 32-bit
      compatible, because it uses 15 xmm registers, whereas 32-bit only has 8.
      
      Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904
  23. 28 Jun, 2013 1 commit
    • Inline vp9_get_coef_context() (and remove vp9_ prefix). · d00b8e5f
      Ronald S. Bultje authored
      Makes cost_coeffs() a lot faster:
      4x4: 236 -> 181 cycles
      8x8: 888 -> 588 cycles
      16x16: 3550 -> 2483 cycles
      32x32: 17392 -> 12010 cycles
      
      Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes
      from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup.
      
      Change-Id: I16b8d595946393c8dc661599550b3f37f5718896
  24. 24 Jun, 2013 1 commit
  25. 21 Jun, 2013 1 commit
  26. 14 Jun, 2013 1 commit
  27. 29 May, 2013 1 commit
    • Balancing coef-tree to reduce bool decodes · b8b3f1a4
      Deb Mukherjee authored
      This patch changes the coefficient tree to move the EOB to below
      the ZERO node in order to save number of bool decodes.
      
      The advantages of moving EOB one step down as opposed to two steps down
      in the other parallel patch are: 1. The coef modeling based on
      the One-node becomes independent of the tree structure above it, and
      2. Fewer context/counter increases are needed.
      
      The drawback is that the potential savings in bool decodes will be
      smaller, but assuming that 0s are much more predominant than 1s, the
      potential savings are still likely to be substantial.
      
      Results on derf300: -0.237%
      
      Change-Id: Ie784be13dc98291306b338e8228703a4c2ea2242
  28. 28 May, 2013 1 commit
  29. 24 May, 2013 1 commit
  30. 23 May, 2013 1 commit
  31. 22 May, 2013 1 commit
    • Using 128 entry look up table for coef models · de4d682c
      Deb Mukherjee authored
      Reverts to using a 128-entry LUT for the coef models rather than 48
      to ease hardware implementation.
      
      Also incorporates some cleanups including removing various
      hooks to support different lookup tables based on block_type and
      ref_type.
      
      Change-Id: I54100c120cca07a2ebd3a7776bc4630fa6a153f6
  32. 21 May, 2013 2 commits
    • Merging the model coef prob experiment · 7a645e4e
      Deb Mukherjee authored
      Merges the experiment.
      
      Change-Id: I4eb19af6de6df6aa3a96a2e82f231d47ed9b3ae9
    • Refinements on modelcoef expt to reduce storage · 07443f15
      Deb Mukherjee authored
      Uses more aggressive interpolation to reduce storage for the
      model tables by more than half. Only 48 lists of probs are
      stored (as opposed to 128 before), corresponding to ONE_NODE
      probabilities of:
      1,
      3, 7, 11, ..., 115, 119,
      127, 135, ..., 247, 255.
      
      Besides, only 1 table is used as opposed to 2 before. So the overall
      memory needed for the tables is just 48 * 8 = 384 bytes.
      
      The table currently used is based on a new Pareto distribution with
      a heavier tail than a generalized Gaussian, which improves results on
      derf by about 0.1% over a single-table generalized Gaussian.
      
      Overall results on derfraw300 are -0.14%.
      
      Change-Id: I19bd03559cbf5894a9f8594b8023dcc3e546f6bd
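The count of 48 stored lists can be checked by enumerating the ONE_NODE probabilities listed in the message: one entry at 1, steps of 4 from 3 to 119, then steps of 8 from 127 to 255. The function and constant names below are hypothetical, and the 8 bytes per list is inferred from the "48 * 8 = 384 bytes" figure rather than stated directly.

```c
#include <stdint.h>

/* Sizes taken from the message; names are for this sketch only. */
enum { SKETCH_NUM_LISTS = 48, SKETCH_BYTES_PER_LIST = 8 };

/* Writes the ONE_NODE probabilities that have a stored list into out,
 * which must have room for SKETCH_NUM_LISTS entries, and returns how
 * many were written. */
static int sketch_fill_one_node_probs(uint8_t *out) {
  int n = 0;
  int p;
  out[n++] = 1;                        /* the single entry at 1   */
  for (p = 3; p <= 119; p += 4)        /* 3, 7, 11, ..., 115, 119 */
    out[n++] = (uint8_t)p;
  for (p = 127; p <= 255; p += 8)      /* 127, 135, ..., 247, 255 */
    out[n++] = (uint8_t)p;
  return n;
}
```

The three ranges contribute 1, 30, and 17 entries respectively, which sums to 48. Probabilities that fall between two stored entries would be obtained by interpolation, the details of which the message does not describe.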
  33. 20 May, 2013 1 commit
    • Updating the model coef experiment · 39a90bc8
      Deb Mukherjee authored
      Cleans up the experiment. Actually uses reduced counts for backward
      updates, and reduced number of probabilities in the context.
      
      No change in bitstream when the experiment is on.
      
      Between expt on and off:
      derfraw300 is down only -0.062% (which is better than when expts
      were run previously).
      
      Change-Id: I55285a049a0c22810bdb42914212ab5a4f8521b5
  34. 13 May, 2013 1 commit
    • Change to band calculation. · e5f71520
      Paul Wilkins authored
      Change band calculation back to a simpler model based
      on the order in which coefficients are coded (scan order),
      not on the absolute coefficient positions.
      
      With the scatter scan experiment enabled, the results
      appear broadly neutral on derf (-0.028) but up a little on std-hd (+0.134).
      
      Without the scatter scan experiment, the results were up on derf as well.
      
      Change-Id: Ie9ef03ce42a6b24b849a4bebe950d4a5dffa6791