1. 11 Oct, 2013 1 commit
      Replacing {VP9_COEF, MODE}_UPDATE_PROB with DIFF_UPDATE_PROB. · 4a0f9478
      Dmitry Kovalev authored
      Values of MODE_UPDATE_PROB and VP9_COEF_UPDATE_PROB are equal, so replacing
      them with one constant. Inlining appropriate arguments for functions:
        vp9_cond_prob_diff_update (encoder)
        vp9_diff_update_prob (decoder)
      
      Change-Id: I1255a1cb477743b799b3bfbbcd8de6b32b067338
  2. 07 Oct, 2013 2 commits
  3. 30 Sep, 2013 1 commit
  4. 27 Sep, 2013 1 commit
  5. 19 Sep, 2013 1 commit
  6. 05 Sep, 2013 1 commit
  7. 29 Aug, 2013 1 commit
  8. 27 Aug, 2013 1 commit
  9. 26 Aug, 2013 1 commit
  10. 21 Aug, 2013 1 commit
  11. 12 Aug, 2013 1 commit
  12. 01 Aug, 2013 1 commit
  13. 29 Jul, 2013 1 commit
      Remove unnecessary 64 byte alignment · a31effca
      John Koleszar authored
      Fixes a warning on MSVS 2012 where the alignment of vp9_default_iscan_8x8
      didn't match between its declaration and definition.
      
      Change-Id: I1466a15635f4b22594d705d570b7e399bfb6cf21
  14. 25 Jul, 2013 1 commit
      General cleanups. · 7131cb0e
      Dmitry Kovalev authored
      Removing unused constants, macros, and function declarations. Using
      ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving
      #include from *.h to *.c. Merging for loops for motion vectors.
      
      Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
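The helpers named in this cleanup can be illustrated with a minimal sketch. The commit message only gives the names; the definitions below are assumptions written for illustration, and `sketch_zero` / `sketch_copy` are hypothetical stand-ins for `vp9_zero` / `vp9_copy`, not the library's own macros.

```c
#include <assert.h>
#include <string.h>

/* Round-to-nearest division by 2^n.
 * Assumed definition for illustration; expects n >= 1 and value >= 0. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))

/* Hypothetical stand-in for vp9_zero: clear a whole object or array. */
#define sketch_zero(dest) memset(&(dest), 0, sizeof(dest))

/* Hypothetical stand-in for vp9_copy: copy between two arrays of the
 * same size, checking the sizes match before copying. */
#define sketch_copy(dest, src)           \
  do {                                   \
    assert(sizeof(dest) == sizeof(src)); \
    memcpy(dest, src, sizeof(src));      \
  } while (0)
```

With these definitions, `ROUND_POWER_OF_TWO(5, 1)` evaluates to 3, and the size check in `sketch_copy` catches mismatched array types that a bare `memcpy` would silently accept.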
  15. 22 Jul, 2013 1 commit
      More optimizations for cost_coeffs(). · e20fcd95
      Ronald S. Bultje authored
      4x4:    163 ->  123 cycles (33% faster)
      8x8:    491 ->  399 cycles (23% faster)
      16x16: 1889 -> 1763 cycles (7% faster)
      32x32: 8311 -> 8180 cycles (1.6% faster)
      
      Overall encoding time of first 50 frames of bus (speed 0) @ 1500kbps
      goes from 1min4.33 to 1min3.00, i.e. 2.11% faster.
      
      Change-Id: Ib52d1dbb5649b14de769d3e7a74af67440b5284f
  16. 16 Jul, 2013 1 commit
  17. 01 Jul, 2013 2 commits
      Make get_coef_context() branchless. · 26b6318d
      Ronald S. Bultje authored
      This should significantly speed up cost_coeffs(). Basically what the
      patch does is to make the neighbour arrays padded by one item to
      prevent an eob check in get_coef_context(), then it populates each
      col/row scan and left/top edge coefficient with two times the same
      neighbour - this prevents a single/double context branch in
      get_coef_context(). Lastly, it populates neighbour arrays in pixel
      order (rather than scan order), so we don't have to dereference the
      scantable to get the correct neighbours.
      
      Total encoding time of first 50 frames of bus (speed 0) at 1500kbps
      goes from 2min10.1 to 2min5.3, i.e. a 2.6% overall speed increase.
      
      Change-Id: I42bcd2210fd7bec03767ef0e2945a665b851df56
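The padding and duplicated-neighbour idea described in this message can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the value of `MAX_NEIGHBORS`, the flat array layout, and the rounding-average formula are all assumed here, chosen so that a neighbour stored twice yields that neighbour's own value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: every position stores exactly two neighbour slots. */
#define MAX_NEIGHBORS 2

/* Context of the coefficient at scan index c.
 *
 * neighbors:   MAX_NEIGHBORS entries per scan index, holding pixel-order
 *              positions. The array is assumed to be padded by one extra
 *              entry pair, so c == eob can be read without a check.
 * token_cache: per-pixel-position token magnitudes already coded.
 *
 * Edge positions store their single neighbour twice, so the rounding
 * average (1 + x + x) >> 1 returns x and no single/double branch is
 * needed. */
static int get_coef_context(const int16_t *neighbors,
                            const uint8_t *token_cache, int c) {
  assert(c >= 0);
  return (1 + token_cache[neighbors[MAX_NEIGHBORS * c + 0]] +
          token_cache[neighbors[MAX_NEIGHBORS * c + 1]]) >> 1;
}
```

Because the neighbour entries already hold pixel-order positions, the lookup indexes `token_cache` directly instead of going through the scan table first.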
      Quantize (64-bit only, for now) SSSE3 SIMD. · 7353ceab
      Ronald S. Bultje authored
      Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps
      goes from 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code
      is x86-64 only; it needs some minor modifications to be 32bit compatible,
      because it uses 15 xmm registers, whereas 32bit only has 8.
      
      Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904
  18. 28 Jun, 2013 1 commit
      Inline vp9_get_coef_context() (and remove vp9_ prefix). · d00b8e5f
      Ronald S. Bultje authored
      Makes cost_coeffs() a lot faster:
      4x4: 236 -> 181 cycles
      8x8: 888 -> 588 cycles
      16x16: 3550 -> 2483 cycles
      32x32: 17392 -> 12010 cycles
      
      Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes
      from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup.
      
      Change-Id: I16b8d595946393c8dc661599550b3f37f5718896
  19. 24 Jun, 2013 1 commit
  20. 21 Jun, 2013 1 commit
  21. 14 Jun, 2013 1 commit
  22. 29 May, 2013 1 commit
      Balancing coef-tree to reduce bool decodes · b8b3f1a4
      Deb Mukherjee authored
      This patch changes the coefficient tree to move the EOB to below
      the ZERO node in order to reduce the number of bool decodes.
      
      The advantages of moving EOB one step down as opposed to two steps down
      in the other parallel patch are: 1. The coef modeling based on
      the One-node becomes independent of the tree structure above it, and
      2. Fewer context/counter increases are needed.
      
      The drawback is that the potential savings in bool decodes will be
      less, but assuming that 0s are much more predominant than 1s, the
      potential savings are still likely to be substantial.
      
      Results on derf300: -0.237%
      
      Change-Id: Ie784be13dc98291306b338e8228703a4c2ea2242
  23. 28 May, 2013 1 commit
  24. 24 May, 2013 1 commit
  25. 23 May, 2013 1 commit
  26. 22 May, 2013 1 commit
      Using 128 entry look up table for coef models · de4d682c
      Deb Mukherjee authored
      Reverts to using a 128-entry LUT for the coef models rather than 48
      entries, to ease hardware implementation.
      
      Also incorporates some cleanups including removing various
      hooks to support different lookup tables based on block_type and
      ref_type.
      
      Change-Id: I54100c120cca07a2ebd3a7776bc4630fa6a153f6
  27. 21 May, 2013 2 commits
      Merging the model coef prob experiment · 7a645e4e
      Deb Mukherjee authored
      Merges the experiment.
      
      Change-Id: I4eb19af6de6df6aa3a96a2e82f231d47ed9b3ae9
      Refinements on modelcoef expt to reduce storage · 07443f15
      Deb Mukherjee authored
      Uses more aggressive interpolation to reduce storage for the
      model tables by more than half. Only 48 lists of probs are
      stored (as opposed to 128 before), corresponding to ONE_NODE
      probabilities of:
      1,
      3, 7, 11, ..., 115, 119,
      127, 135, ..., 247, 255.
      
      Besides, only 1 table is used as opposed to 2 before. So the overall
      memory needed for the tables is just 48 * 8 = 384 bytes.
      
      The table currently used is based on a new Pareto distribution with
      a heavier tail than a generalized Gaussian, which improves results on
      derf by about 0.1% over a single-table generalized Gaussian.
      
      The overall result on derfraw300 is -0.14%.
      
      Change-Id: I19bd03559cbf5894a9f8594b8023dcc3e546f6bd
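The sampling grid listed in this message can be checked with a short sketch: one point at 1, a step of 4 from 3 to 119, and a step of 8 from 127 to 255 give exactly 48 stored points. The message does not say how values between stored points are derived, so the linear interpolation in `interp_prob` is an assumption, and both function names are hypothetical.

```c
#include <assert.h>

#define MODEL_POINTS 48 /* number of stored probability lists */

/* Fill pts with the ONE_NODE probabilities that have a stored list:
 * 1, then 3..119 in steps of 4, then 127..255 in steps of 8.
 * Returns the number of points written. */
static int fill_model_points(int pts[MODEL_POINTS]) {
  int n = 0;
  int p;
  pts[n++] = 1;
  for (p = 3; p <= 119; p += 4) pts[n++] = p;
  for (p = 127; p <= 255; p += 8) pts[n++] = p;
  return n;
}

/* Assumed scheme: linearly interpolate one table column for an arbitrary
 * probability p from the two stored points that bracket it. Integer
 * division truncates toward zero. */
static int interp_prob(const int *pts, const unsigned char *col, int p) {
  int i = 0;
  assert(p >= pts[0] && p <= pts[MODEL_POINTS - 1]);
  /* Advance to the last stored point that is <= p, keeping i + 1 valid. */
  while (i < MODEL_POINTS - 2 && pts[i + 1] <= p) ++i;
  {
    const int span = pts[i + 1] - pts[i];
    const int delta = (int)col[i + 1] - (int)col[i];
    return col[i] + delta * (p - pts[i]) / span;
  }
}
```

At one byte per probability and 8 probabilities per list, the 48 lists occupy 48 * 8 = 384 bytes, matching the figure quoted above.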
  28. 20 May, 2013 1 commit
      Updating the model coef experiment · 39a90bc8
      Deb Mukherjee authored
      Cleans up the experiment. Actually uses reduced counts for backward
      updates, and reduced number of probabilities in the context.
      
      No change in bitstream when the experiment is on.
      
      Between expt on and off:
      derfraw300 is down only -0.062% (which is better than when expts
      were run previously).
      
      Change-Id: I55285a049a0c22810bdb42914212ab5a4f8521b5
  29. 13 May, 2013 1 commit
      Change to band calculation. · e5f71520
      Paul Wilkins authored
      Change band calculation back to a simpler model based
      on the order in which coefficients are coded (scan order),
      not the absolute coefficient positions.
      
      With the scatter scan experiment enabled the results
      appear broadly neutral on derf (-0.028) but up a little on std-hd (+0.134).
      
      Without the scatterscan experiment on, the results were up on derf as well.
      
      Change-Id: Ie9ef03ce42a6b24b849a4bebe950d4a5dffa6791
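The difference between the two band models can be sketched for a 16-coefficient block. The table values and function names below are illustrative assumptions, not the tables from the patch; the point is only that the order-based model indexes the band table with the coding index, while the position-based model first maps that index to a coefficient position through the scan table.

```c
#include <assert.h>

#define NUM_COEFS 16 /* a 4x4 block, for illustration */

/* Illustrative band table (values assumed, not taken from the patch). */
static const unsigned char band_table[NUM_COEFS] = {
  0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5
};

/* Order-based model: the band depends only on how many coefficients
 * have been coded so far, whatever scan is in use. */
static int band_from_scan_index(int coef_index) {
  assert(coef_index >= 0 && coef_index < NUM_COEFS);
  return band_table[coef_index];
}

/* Position-based model: the band depends on where the coefficient sits
 * in the block, so the scan table is consulted first. */
static int band_from_position(const unsigned char *scan, int coef_index) {
  assert(coef_index >= 0 && coef_index < NUM_COEFS);
  return band_table[scan[coef_index]];
}
```

With an identity scan the two functions agree; with any other scan they can assign different bands to the same coding index, which is why the choice interacts with the scatter scan experiment.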
  30. 07 May, 2013 2 commits
  31. 29 Apr, 2013 2 commits
      Change above/left_context to use an 8x8 basis. · 2dbaa4f4
      Ronald S. Bultje authored
      Output changes slightly because of a minor bug in (at least) the sb32x16
      block2above tx16x16 tables that previously existed in vp9_blockd.c.
      
      Change-Id: I624af28ac200a8322d64454cf05c79e9502968cc
      Turning model based reverse update on for coefs · 040eeed9
      Deb Mukherjee authored
      Turns model based reverse updates on for coefficients in an
      effort to reduce the memory requirement for counters.
      
      With this patch the counters needed will be reduced by about
      75% since only 3 counts are needed instead of 12.
      
      The impact in performance is:
      derf300: -0.252%
      stdhd250: -0.046%
      
      However, retraining should alleviate some of the drop in
      performance.
      
      Change-Id: I6f2b3e13f6d5520aa3400b0b228fb5e8b4a43caa
  32. 26 Apr, 2013 1 commit
  33. 22 Apr, 2013 3 commits