1. 07 Dec, 2012 1 commit
    • 32x32 transform for superblocks. · c456b35f
      Ronald S. Bultje authored
      This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds
      code all over the place to wrap that in the bitstream/encoder/decoder/RD.
      
      Some implementation notes (these probably need careful review):
      - token range is extended by 1 bit, since the value range out of this
        transform is [-16384,16383].
      - the coefficients coming out of the FDCT are manually scaled back by
        1 bit, or else they won't fit in int16_t (they are 17 bits). Because
        of this, the RD error scoring does not right-shift the MSE score by
        two (unlike for 4x4/8x8/16x16).
      - to compensate for this loss in precision, the quantizer is also
        halved. This is currently a little hacky.
      - FDCT and IDCT are double-only right now; they need a fixed-point
        implementation.
      - There are no default probabilities for the 32x32 transform yet; I'm
        simply using the 16x16 luma ones. A future commit will add newly
        generated probabilities for all transforms.
      - No ADST version. I don't think we'll add one for this level; if an
        ADST is desired, transform-size selection can scale back to 16x16
        or lower, and use an ADST at that level.
      
      Additional notes specific to Debargha's DWT/DCT hybrid:
      - coefficient scale is different for the top/left 16x16 (DCT-over-DWT)
        block than for the rest (DWT pixel differences) of the block. Therefore,
        RD error scoring isn't easily scalable between coefficient and pixel
        domain. Thus, unfortunately, we need to compute the RD distortion in
        the pixel domain until we figure out how to scale these appropriately.
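
The scaling notes above can be sketched in C. This is a minimal illustration, not the code from this commit: `scale_back_one_bit` and `block_sse` are hypothetical names, and truncating division is an assumed rounding choice. It shows a 17-bit value being halved so that it fits in int16_t, and that halving coefficients already divides a squared-error score by four, which is one plausible reading of why the extra right-shift by two is skipped.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: halve a 17-bit FDCT output so it fits in int16_t.
 * Truncating division (toward zero) is an assumption; it maps
 * [-65536, 65535] onto [-32768, 32767] without overflow. */
static int16_t scale_back_one_bit(int32_t coeff) {
  return (int16_t)(coeff / 2);
}

/* Hypothetical helper: sum of squared differences between two blocks. */
static int64_t block_sse(const int16_t *a, const int16_t *b, size_t n) {
  int64_t sse = 0;
  size_t i;
  for (i = 0; i < n; i++) {
    const int64_t d = (int64_t)a[i] - (int64_t)b[i];
    sse += d * d;
  }
  return sse;
}
```

With even-valued inputs, the error measured on halved coefficients is exactly a quarter of the full-scale error, so no further shift is needed to bring it to the same scale.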
      
      Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b
  2. 06 Dec, 2012 1 commit
  3. 05 Dec, 2012 3 commits
    • Begin to refactor vpx_scale usage in VP9 · 52d350fe
      Johann authored
      Only declare the functions in vpx_scale RTCD and include the relevant
      header.
      
      Remove unused files and functions in vpx_scale to avoid wasting time
      renaming. vpx_scale/win32/scaleopt.c contains functions which have not
      been called in a long time but are potentially optimized.
      
      The 'vp8' functions have not been renamed yet. That is for after the
      cleanup.
      
      Change-Id: I2c325a101d60fa9d27e7dfcd5b52a864b4a1e09c
    • Remove ARM optimizations from VP9 · a9056729
      Johann authored
      Change-Id: I9f0ae635fb9a95c4aa1529c177ccb07e2b76970b
    • Change to MV reference search. · 4cc657ec
      Paul Wilkins authored
      This patch reduces the CPU cost of the MV ref search by only
      inserting candidates that would rank in the current top 4.

      This could alter the outcome and slightly favors near candidates,
      which are tested first. It also limits the worst-case loop count
      to 4, and in many cases the insert loop exits early or does not
      run at all.
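
A bounded top-4 insert of this kind might look like the sketch below. The types, field names and scoring are hypothetical and only illustrate the idea; the actual search code in this commit may differ. A candidate is rejected up front unless it beats the current fourth entry, and ties keep the earlier (nearer) candidate ahead.

```c
#define MAX_REFS 4 /* assumed list size, from "top 4" in the message */

/* Hypothetical candidate record: a motion vector and its weight score. */
typedef struct {
  int mv_row, mv_col;
  int score;
} candidate;

typedef struct {
  candidate items[MAX_REFS]; /* kept sorted by descending score */
  int count;
} candidate_list;

/* Returns 1 if the candidate was inserted, 0 if it was rejected early. */
static int try_insert(candidate_list *list, candidate c) {
  int i;
  /* Early out: a full list only accepts strictly better candidates. */
  if (list->count == MAX_REFS && c.score <= list->items[MAX_REFS - 1].score)
    return 0;
  if (list->count < MAX_REFS) {
    i = list->count++;
  } else {
    i = MAX_REFS - 1; /* overwrite the current worst entry */
  }
  /* Shift lower-scored entries down; at most MAX_REFS - 1 moves. */
  while (i > 0 && list->items[i - 1].score < c.score) {
    list->items[i] = list->items[i - 1];
    i--;
  }
  list->items[i] = c;
  return 1;
}
```

Because the comparison is strict, an equal-scored candidate tested later never displaces an earlier one, which matches the slight bias toward near candidates described above.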
      
      Change-Id: Idd795a825f9fd681f30f4fcd550c34c38939e113
  4. 03 Dec, 2012 3 commits
    • Begin to refactor vpx_scale usage in VP9 · c6bd29e2
      Johann authored
      Only declare the functions in vpx_scale RTCD and include the relevant
      header.
      
      Remove unused files and functions in vpx_scale to avoid wasting time
      renaming. vpx_scale/win32/scaleopt.c contains functions which have not
      been called in a long time but are potentially optimized.
      
      The 'vp8' functions have not been renamed yet. That is for after the
      cleanup.
      
      Change-Id: I2c325a101d60fa9d27e7dfcd5b52a864b4a1e09c
    • Remove ARM optimizations from VP9 · 34591b54
      Johann authored
      Change-Id: I9f0ae635fb9a95c4aa1529c177ccb07e2b76970b
    • fixes --disable-vp9-encoder · d9038b3c
      Jim Bankoski authored
      Change-Id: I467bf0fdf3b35326bcce58d5459e6d2dbfd6c5e5
  5. 01 Dec, 2012 1 commit
  6. 30 Nov, 2012 3 commits
  7. 29 Nov, 2012 5 commits
  8. 28 Nov, 2012 6 commits
    • remove the vp9_default_mode_contexts_a · 1cc57396
      Yaowu Xu authored
      Given the way mode_context is updated, the benefit of an additional
      default is not significant.
      
      Change-Id: I67489453e8781340b18e26a1cc2f04e9221004a2
    • fixed includes to be fully specified · c6787398
      Jim Bankoski authored
      Change-Id: Ia1cce221f8511561b9cbd8edb7726fbc286ff243
    • remove postproc invokes · 85cba19e
      Jim Bankoski authored
      and some miscellaneous invoke leftovers
      
      Change-Id: I63191b1bfd3bea4ce30cceaeb686ec850570fc43
    • Further improve macroblock loop filters · d2021386
      Yunqing Wang authored
      This change included:
      1. Aligned reads in vp9_mbloop_filter_vertical_edge function.
      Since we actually read 16 bytes, we can align the reads to read
      starting at (s - 8) instead of (s - 5).
      2. Combined u, v loop filters.
      3. Added 8x16 transpose.
      
      This gave a 2% decoder performance gain (tulip clip).
      
      Change-Id: Ib14c2f1645c4a3436df17fe2f24789506bf0bb58
    • removed redundant mode_context data structures · 12da793d
      Yaowu Xu authored
      This commit removed a couple of redundant data structures in the
      frame coding contexts, mode_context and mode_context_a, and changed
      to use vp9_mode_contexts only. The context switch for different
      frame types now relies on the switch of the frame coding context
      between lfc and lfc_a. This commit also removed a number of memcpy
      calls among these redundant data structures.
      
      Change-Id: I42e8174bd60f466b0860afc44c1263896471b0f3
    • Clamp decoded feature data · a1f15814
      John Koleszar authored
      Not all segment feature data elements are full-range powers of two, so
      there are values that can be encoded that are invalid. Add a new function
      to clamp values to the maximum allowed.
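
As a rough sketch of the problem: a feature whose maximum is 5 still needs 3 bits, so the values 6 and 7 can be coded even though they are invalid. The feature table and function names below are hypothetical and are not the ones added by this commit.

```c
/* Hypothetical feature IDs and maxima, for illustration only. */
enum { FEATURE_A, FEATURE_B, FEATURE_COUNT };
static const int feature_max[FEATURE_COUNT] = { 63, 5 };

/* Number of bits needed to code values in [0, max_value]. */
static int bits_needed(int max_value) {
  int bits = 0;
  while (max_value > 0) {
    bits++;
    max_value >>= 1;
  }
  return bits;
}

/* Clamp a decoded value to the maximum allowed for its feature. */
static int clamp_feature_data(int feature_id, int decoded) {
  const int max_value = feature_max[feature_id];
  return decoded > max_value ? max_value : decoded;
}
```

Clamping at decode time keeps an out-of-range value from propagating into later table lookups, whatever the encoder wrote.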
      
      Change-Id: Ie47cb80ef2d54292e6b8db9f699c57214a915bc4
  9. 27 Nov, 2012 1 commit
    • Add vp9_ prefix to all vp9 files · fcccbcbb
      John Koleszar authored
      Support for gyp which doesn't support multiple objects in the same
      static library having the same basename.
      
      Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
  10. 26 Nov, 2012 1 commit
    • Improve sad3x16 SSE2 function · e7cd8071
      Yunqing Wang authored
      vp9_sad3x16_sse2() is called heavily in the decoder, where the
      unaligned reads consume many CPU cycles. When CONFIG_SUBPELREFMV
      is off, the unaligned offset is 1. In this situation, we can
      adjust src_ptr to be 4-byte aligned and then do aligned reads.
      This reduced the read time significantly. Tests on a 1080p clip
      showed over 2% decoder performance gain with CONFIG_SUBPELREFMV
      off.
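
The pointer adjustment can be illustrated in scalar C. This is only a sketch under assumptions: `sad3x16_ref` is a plain reference for a 3-wide, 16-tall sum of absolute differences, and `align_down4` is a hypothetical helper showing how a pointer that is 1 byte past a 4-byte boundary can be rounded down so that a wider aligned read covers the same pixels. The SSE2 code in this commit is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar reference: SAD over 3 columns by 16 rows. */
static unsigned int sad3x16_ref(const uint8_t *src, int src_stride,
                                const uint8_t *ref, int ref_stride) {
  unsigned int sad = 0;
  int r, c;
  for (r = 0; r < 16; r++) {
    for (c = 0; c < 3; c++) sad += (unsigned int)abs(src[c] - ref[c]);
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}

/* Hypothetical helper: round p down to a 4-byte boundary. *skip receives
 * the number of leading bytes an aligned read would have to ignore. */
static const uint8_t *align_down4(const uint8_t *p, int *skip) {
  *skip = (int)((uintptr_t)p & 3);
  return p - *skip;
}
```

With an offset of 1, the three pixels of interest occupy bytes 1 to 3 of the aligned 4-byte word, so one aligned read per row is enough and the first byte is masked out or ignored.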
      
      Change-Id: I953afe3ac5406107933ef49d0b695eafba9a6507
  11. 25 Nov, 2012 1 commit
  12. 21 Nov, 2012 1 commit
  13. 20 Nov, 2012 1 commit
  14. 18 Nov, 2012 1 commit
    • clean out some of the rtcd code. · f4871b6a
      Jim Bankoski authored
      This removes functions that are no longer needed and cleans up some warnings.
      
      Change-Id: I292a4c3694e9c1d68ce99cea390905b198434719
  15. 17 Nov, 2012 1 commit
  16. 16 Nov, 2012 5 commits
    • Add const before the dequant(dq) · 47d9d48f
      Yunqing Wang authored
      Modified code to use const before dq.
      
      Change-Id: I6fa59c2ed9743ded33ad08df70e15c2fe1ae7b99
    • Support 32x32 intra modes in non-keyframe superblocks. · 5b11052a
      Ronald S. Bultje authored
      Change-Id: Icf8ad313c543462e523bff89690e5daa8d49bcc0
    • Further experimentation with the mode context · a57dbd95
      Paul Wilkins authored
      Experiments with a larger set of contexts and some
      clean up to replace magic numbers regarding the
      number of contexts.
      
      The starting values and rate of backwards adaption
      are still suspect and based on a small set of tests.
      Added forwards adjustment of probabilities.
      
      The net result of adding the new context and forward
      update is small compared to the old context from the
      legacy find_near function.  (down a little on derf but
      up by a similar amount for HD)
      
      HOWEVER.... with the new context and forward update
      the impact of disabling the reverse update (which may be
      necessary in some use cases to facilitate parallel decoding)
      is hugely reduced.
      
      For the old context without forward update, the impact of
      turning off reverse update (Experiment was with SB off) was
      Derf -0.9, Yt -1.89, ythd -2.75 and sthd -8.35. The impact was
      mainly at low data rates.
      
      With the new context and forward update enabled the impact
      for all the test sets was no more than 0.5-1% (again most at
      the low end).
      
      Change-Id: Ic751b414c8ce7f7f3ebc6f19a741d774d2b4b556
    • changed mv candidate search for superblocks · 415e6bff
      Yaowu Xu authored
      Added additional motion vectors in the close neighborhood of a
      superblock to the list of candidate motion vectors, and removed a
      couple that are further away.

      The change helped the std-hd set by about 0.8% (all metrics), with
      a smaller gain for the derf set.
      
      Change-Id: Iaa69b98614db43420ed3fd4738d0ca5587b90045
    • Compound inter-intra experiment · 0c917fc9
      Deb Mukherjee authored
      A patch on compound inter-intra prediction.
      
      In compound inter-intra prediction, a new predictor for
      16x16 inter-coded MBs is obtained by combining a single
      inter predictor with a 16x16 intra predictor, such that
      the weight varies with distance from the top/left
      boundary. The current search strategy is to combine the best
      inter mode with the best intra mode obtained independently.
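
A distance-weighted blend of this kind could be sketched as follows. The weight function here (linear in the distance to the nearer of the top and left edges, on a scale of 32) is purely an assumption for illustration; the message does not give the actual weights used by the patch, and `combine_inter_intra` is a hypothetical name.

```c
#include <stdint.h>

#define MB_SIZE 16

/* Blend two 16x16 predictors stored contiguously (stride 16).
 * Assumed weighting: the intra weight is 32/32 on the top/left boundary
 * and falls by 2/32 per pixel of distance from the nearer boundary. */
static void combine_inter_intra(const uint8_t *inter, const uint8_t *intra,
                                uint8_t *out) {
  int r, c;
  for (r = 0; r < MB_SIZE; r++) {
    for (c = 0; c < MB_SIZE; c++) {
      const int dist = r < c ? r : c;       /* distance to top/left edge */
      const int w_intra = 32 - 2 * dist;    /* ranges from 32 down to 2 */
      const int idx = r * MB_SIZE + c;
      out[idx] = (uint8_t)((w_intra * intra[idx] +
                            (32 - w_intra) * inter[idx] + 16) >> 5);
    }
  }
}
```

Pixels next to the already-decoded top and left neighbors lean on the intra predictor, while pixels deep inside the block lean on the inter predictor.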
      
      Results so far:
      
      derf +0.31%
      yt +0.32%
      std-hd +0.35%
      hd +0.42%
      
      It is conceivable that the results would improve somewhat
      with a more thorough search strategy where all intra modes
      are searched given the best mv, or even a joint search for
      the best mv and the best intra mode.
      
      Change-Id: I7951f1ed0d6eb31ca32ac24d120f1585bcd8d79b
  17. 15 Nov, 2012 4 commits
    • Pack invisible frames without lengths · 64bcffc1
      John Koleszar authored
      Modify the decoder to return the ending position of the bool decoder and
      use that as the starting position for the next frame.
      
      The constant-space algorithm for parsing the appended frame lengths is
      O(n^2), which is a potential DoS concern if n is unbounded. Revisit
      the appended lengths for use as partition lengths when multipartition
      support is added.
      
      In addition, this allows decoding of raw streams outside of a container
      without additional framing information, though it's insufficient to
      be able to remux said stream into a container.
      
      Change-Id: I71e801a9c3e37abe559a56a597635b0cbae1934b
    • subpelrefmv for superblocks · 61416aed
      Yaowu Xu authored
      duplicate code clean-up and variable name corrections
      
      Change-Id: Ibc4703228e652ec425125de5e7bc038fa46595c5
    • support building vp8 and vp9 into a single lib · a9c7597a
      John Koleszar authored
      Change-Id: Ib8f8a66c9fd31e508cdc9caa662192f38433aa3d
    • detokenize: use SEG_LVL_EOB feature consistently · 6becad42
      John Koleszar authored
      Update decode_coefs() to break when c >= eob, since it's possible that
      c starts the loop from 1 and eob is 0. The loop won't terminate in that
      case.
      
      Add new get_eob() function to consistently clamp the eob based on the
      segment level EOB and the block size. It's possible to code a segment
      level EOB that's greater than the block size, and that leads to an
      out of bounds access.
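
Both fixes can be illustrated with a small sketch. The signatures below are hypothetical and simplified; the real get_eob() and decode_coefs() take codec-specific arguments that are not shown in this message. The point is that the loop exits on `c >= eob` rather than on equality, and that the end-of-block position is never allowed to exceed the number of coefficients in the block.

```c
/* Hypothetical, simplified clamp: use the segment-level EOB when the
 * feature is enabled, but never exceed the block's coefficient count. */
static int get_eob_sketch(int seg_eob_enabled, int seg_eob,
                          int block_coeffs) {
  const int eob = seg_eob_enabled ? seg_eob : block_coeffs;
  return eob > block_coeffs ? block_coeffs : eob;
}

/* Counts how many coefficient positions a decode loop would visit.
 * Breaking on c >= eob terminates even when c starts at 1 and eob is 0;
 * a test for c == eob would never fire in that case. */
static int positions_visited(int first_coeff, int eob) {
  int c = first_coeff;
  int visited = 0;
  for (;;) {
    if (c >= eob) break;
    visited++;
    c++;
  }
  return visited;
}
```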
      
      Change-Id: I859563b30414615cf1b30dcc2aef8a1de358c42d
  18. 14 Nov, 2012 1 commit
    • Don't use hybrid transform (ADST) for superblocks. · 1e3dd49f
      Ronald S. Bultje authored
      This is in line with other cases where we disable ADST if prediction
      size and transform size don't match. Before this patch, the RD loop
      used ADST for superblocks, but frame encoding/decoding did not.
      
      Change-Id: I700368c632eb72b5e089c22ef25649d99d7697d0