  1. 24 Sep, 2010 1 commit
    • move reconintra_mt to decoder (for now) · 48e76ff4
      John Koleszar authored
      reconintra_mt.c is only required for building the decoder right now.
      It could definitely be used for the encoder in the future, but it
      currently depends on decoder only data structures. (onyxd_int.h,
      VP8D_COMP, etc). Move it from common/ to decoder/ until the
      necessary changes to the common multithread code are complete.
      
      This patch is needed to build with --disable-vp8-decoder.
      
      Change-Id: I568c52221a2b309234d269675cba97131ce35c86
      48e76ff4
  2. 23 Sep, 2010 1 commit
    • Add getter functions for the interface data symbols · fa7a55bb
      John Koleszar authored
      Having these symbols be available as functions rather than data is
      occasionally more convenient. Implemented this way rather than a
      get-codec-by-id style to avoid creating a link-time dependency
      between the encoder and the decoder.
      
      Fixes issue #169
      
      Change-Id: I319f281277033a5e7e3ee3b092b9a87cce2f463d
      fa7a55bb
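      A minimal sketch of the idea described above, not the actual patch.
      All names here (demo_codec_iface_t, demo_codec_dx_algo, demo_codec_dx)
      are hypothetical stand-ins. It shows an interface exported both as a
      data symbol and through a per-codec getter function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a codec interface descriptor. */
typedef struct demo_codec_iface {
    const char *name;
} demo_codec_iface_t;

/* The interface exported as a data symbol. */
demo_codec_iface_t demo_codec_dx_algo = { "demo decoder" };

/* The same interface exposed through a getter function. Each codec has
 * its own getter, so a program that only calls the decoder getter never
 * references encoder symbols at link time. */
demo_codec_iface_t *demo_codec_dx(void)
{
    return &demo_codec_dx_algo;
}
```

      A single get-codec-by-id lookup would have to reference every
      interface table from one function, which is the link-time dependency
      the commit message says it avoids.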
  3. 22 Sep, 2010 1 commit
    • Remove dead code · 7fed3832
      Johann authored
      The new loopfilter was originally introduced as an experimental change.
      It's permanent now.
      
      Change-Id: I25dbedb6ceff3e9f9c04e18bb29f84c3ecb7e546
      7fed3832
  4. 21 Sep, 2010 2 commits
    • unset execute bit on c source · cdd20666
      John Koleszar authored
      Change-Id: I6625ee41f8872908cb015ce0729e1c7a105b5217
      cdd20666
    • Don't reset mb clamping state during splitmv decoding · 4d391e8e
      John Koleszar authored
      The MV decoding changes in c5fb0eb8 introduced a bug where the
      macroblock clamping state was reset for each partition, so if an
      earlier partition needed clamping but a subsequent one didn't,
      the MB wouldn't receive clamping. Instead, the state is only
      set during splitmv decoding, never cleared.
      
      Change-Id: I224fe258493405ee0f6a04596acdb622c475e845
      4d391e8e
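      The bug pattern can be sketched as follows. This is an illustration
      only: the function names and the per-partition flag array are
      hypothetical, not the decoder's real data structures.

```c
#include <stddef.h>

/* Buggy pattern: the macroblock-level flag is overwritten for every
 * partition, so only the last partition's result survives. */
int mb_needs_clamp_buggy(const int *partition_needs_clamp, size_t n)
{
    int need = 0;
    for (size_t i = 0; i < n; i++)
        need = partition_needs_clamp[i];
    return need;
}

/* Fixed pattern: the flag is only ever set, never cleared, so one
 * partition that needs clamping marks the whole macroblock. */
int mb_needs_clamp_fixed(const int *partition_needs_clamp, size_t n)
{
    int need = 0;
    for (size_t i = 0; i < n; i++)
        need |= (partition_needs_clamp[i] != 0);
    return need;
}
```

      With partitions {1, 0} the buggy version reports no clamping for
      the macroblock, while the fixed version reports that clamping is
      needed.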
  5. 20 Sep, 2010 3 commits
    • Use movq instead of movdqu. · b7dc9398
      Fritz Koenig authored
      Movdqu is more expensive (throughput, uops) than movq.  Minimal
      impact for newer big cores, but ~2.25% gain on Atom.
      
      Change-Id: I62c80bb1cc01d8a91c350c4c7719462809a4ef7f
      b7dc9398
    • Better choice of instruction for filter mask comparison. · 8eae7fe7
      Fritz Koenig authored
      Use pmaxub instead of a combination of psubusb/por to
      determine if any comparisons go over the limit.
      
      Change-Id: I3f0bd7d2aabe5fee9ba6620508e2b60605abcb82
      8eae7fe7
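      A scalar model of one byte lane, assuming the mask test has the
      usual "is any difference above the limit" form. The helper names
      are hypothetical and the real change is in SIMD assembly; this only
      illustrates why taking the maximum first gives the same
      zero/non-zero answer as subtracting per value and ORing.

```c
#include <stdint.h>

/* Unsigned saturating subtract, as psubusb does per byte. */
static uint8_t sat_sub_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Unsigned maximum, as pmaxub does per byte. */
static uint8_t max_u8(uint8_t a, uint8_t b)
{
    return a > b ? a : b;
}

/* Subtract the limit from each difference, then OR the results. */
uint8_t over_limit_sub_or(uint8_t d0, uint8_t d1, uint8_t limit)
{
    return (uint8_t)(sat_sub_u8(d0, limit) | sat_sub_u8(d1, limit));
}

/* Take the maximum first, then do a single saturating subtract. */
uint8_t over_limit_max(uint8_t d0, uint8_t d1, uint8_t limit)
{
    return sat_sub_u8(max_u8(d0, d1), limit);
}
```

      The two functions can return different byte values, but each is
      non-zero exactly when at least one difference exceeds the limit.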
    • Add high limit check for unsigned parameters · 23690686
      Guillermo Ballester Valor authored
      The patch related to issue #55 (5a72620d) fixed some warnings, but the
      fix was not optimal. It was a trick to confuse the compiler rather
      than a real fix.
      
      This patch fixes it properly by adding a new macro for the cases that
      need only a high limit check on an unsigned value.
      
      Change-Id: I94b322e0f7fb07604b3b1df1f9321185f48cfcb5
      23690686
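      A sketch of the kind of macro described above. The macro names, the
      demo_cfg struct and the limit of 64 are assumptions for
      illustration, not the code from the patch.

```c
/* Hypothetical configuration struct with an unsigned member. */
struct demo_cfg {
    unsigned int threads;
};

/* Two-sided check. With lo == 0 and an unsigned member, the test
 * "(p)->memb >= 0" is always true, which compilers may warn about. */
#define DEMO_RANGE_CHECK(p, memb, lo, hi) \
    ((p)->memb >= (lo) && (p)->memb <= (hi))

/* High-limit-only check for unsigned members: no comparison with 0. */
#define DEMO_RANGE_CHECK_HI(p, memb, hi) \
    ((p)->memb <= (hi))

/* Returns 0 when the configuration is in range, -1 otherwise. */
int demo_validate(const struct demo_cfg *cfg)
{
    if (!DEMO_RANGE_CHECK_HI(cfg, threads, 64u))
        return -1;
    return 0;
}
```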
  6. 17 Sep, 2010 2 commits
    • reorder data to use wider instructions · 022323bf
      Johann authored
      the previous commit laid the groundwork by doing two sets of idcts
      together. this moves that further by grouping the interesting data
      (q[0], q+16[0]) together to allow using wider instructions. also
      managed to drop a few instructions by recognizing that the constant
      for sinpi8sqrt2 could be downshifted all the time, which avoided a
      downshift as well as workarounds for a function which only accepted
      signed data
      
      looks like a modest gain for performance: at qcif, went from ~180
      fps to ~183
      Change-Id: I842673f3080b8239e026cc9b50346dbccbab4adf
      022323bf
    • Restructure multi-threaded decoder · f857a850
      Yunqing Wang authored
      On each MB, loopfiltering is done right after MB decoding. This
      combines two loops in the multi-threaded code into one, which halves
      the number of synchronizations.
      
      The above-row/left-col data are saved in temp buffers for
      next-row/next MB decoding.
      
      Tests on 4-core gLucid machine showed 10% decoder performance
      gain with threads=4 (tulip clip). Testing on other platforms
      isn't done yet.
      
      Change-Id: Id18ea7c1e84965dabea65d4c01ca5bc056ddeac9
      f857a850
  7. 16 Sep, 2010 2 commits
    • cleanup: remove unused xprintf · 9100073e
      John Koleszar authored
      These files aren't currently used, and we can get them back if we
      need them.
      
      Change-Id: I62aa3bff828e491a80c80eeb84a7c44903df29b5
      9100073e
    • Reduce size of tokenizer tables · 147b125b
      John Koleszar authored
      This patch reduces the size of the global tables maintained by the
      tokenizer to 16k from 80k-96k. See issue #177.
      
      Change-Id: If0275d5f28389af11ac83c5d929d1157cde90fbe
      147b125b
  8. 14 Sep, 2010 1 commit
    • Removed unnecessary pxor. · 769f2424
      Fritz Koenig authored
      There is no need to make sure that the lower byte of the
      register is 0 because the downshift by 11 overwrites that byte.
      
      Change-Id: I89cbf004b2ff532a2c68e0dc399c45a49cdad5a1
      769f2424
  9. 10 Sep, 2010 1 commit
    • Make block access to frame buffer sequential · a65cd3de
      Fritz Koenig authored
      Sequentially accessing memory from a low address to a high
      address should make it easier for the processor to predict
      the cache.
      
      Change-Id: I1921ce996bdd547144fe864fea6435f527f5842d
      a65cd3de
  10. 09 Sep, 2010 4 commits
    • Improved subset block search · c5fb0eb8
      Scott LaVarnway authored
      Improved the subset block search and fill.  (about 3% improvement for
      32 bit)  Modified/merged the code in order to create
      vp8_read_mb_modes_mv which can decode the modes/mvs on a macroblock
      level. This will allow the decode loop (in the future) to decode
      modes/mvs on a frame, row, or mb level.
      
      Change-Id: If637d994b508792f846d39b5d44a7bf9aa5cddf3
      c5fb0eb8
    • Update NEON wide idcts · 14ba7642
      Johann authored
      Expand 93c32a55 which used SSE2 instructions to do two
      idct/dequant/recons at a time to NEON. Initial working
      commit. More work needs to be put into rearranging and
      interlacing the data to take advantage of quadword
      operations, which is when we'll hopefully see a much
      better boost
      
      Change-Id: I86d59d96f15e0d0f9710253e2c098ac2ff2865d1
      14ba7642
    • Fix GF interval for non-lagged ARFs · edcbb1c1
      John Koleszar authored
      When ARFs are enabled in non-lagged compress modes, the GF interval
      was being reset to zero. Non-lagged ARF updates were enabled in commit
      63ccfbd5, but this incorrect GF interval caused a quality regression.
      
      Change-Id: I615c3b493f4ce2127044f4e68d0bcb07d6b730c3
      edcbb1c1
    • Use WebM in copyright notice for consistency · c2140b8a
      John Koleszar authored
      Changes 'The VP8 project' to 'The WebM project', for consistency
      with other webmproject.org repositories.
      
      Fixes issue #97.
      
      Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba
      c2140b8a
  11. 08 Sep, 2010 3 commits
    • Skip unnecessary search of identical frames · 69ae8f47
      Jim Bankoski authored
      vp8_get_compressed_data() was defeating logic in
      encode_frame_to_datarate() that determined the reference buffers to
      search and forcing all frames to be eligible to search. In cases
      where buffers have identical contents, this is unnecessary extra
      work.
      
      Change-Id: I9e667ac39128ae32dc455a3db4c62e3efce6f114
      69ae8f47
    • Enable ARFs for non-lagged compress · 63ccfbd5
      Jim Bankoski authored
      ARFs were explicitly disabled except in lagged compress mode. New
      ARF logic allows for the ARF buffer to hold an older golden frame,
      which does not require lagged compress.
      
      Change-Id: I1dff82b6f53e8311f1e0514b1794ae05919d5f79
      63ccfbd5
    • Bilinear subpixel optimizations for ssse3. · 3fb37162
      Fritz Koenig authored
      Used pmaddubsw for multiply and add of two filter taps
      at once for 16x16 and 8x8 blocks.
      
      Change-Id: Idccf2d6e094561624407b109fa7e80ba799355ea
      3fb37162
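      A scalar model of what one 16-bit lane of pmaddubsw computes, and
      how it could apply both taps of a two-tap bilinear filter in one
      step. The function names, the assumption that the taps sum to 128,
      and the rounding shift of 7 are illustrative choices, not taken
      from the patch.

```c
#include <stdint.h>

/* One lane of pmaddubsw: two unsigned bytes times two signed bytes,
 * products added, result saturated to signed 16 bits. */
static int16_t maddubs_lane(uint8_t p0, uint8_t p1, int8_t t0, int8_t t1)
{
    int32_t sum = (int32_t)p0 * t0 + (int32_t)p1 * t1;
    if (sum > INT16_MAX) sum = INT16_MAX;
    if (sum < INT16_MIN) sum = INT16_MIN;
    return (int16_t)sum;
}

/* Two-tap bilinear filter on a pair of neighbouring pixels.
 * Assumes non-negative taps with t0 + t1 == 128; each tap must fit in
 * a signed byte, so a tap of exactly 128 is not representable here. */
uint8_t bilinear_2tap(uint8_t p0, uint8_t p1, int8_t t0, int8_t t1)
{
    int16_t acc = maddubs_lane(p0, p1, t0, t1);
    return (uint8_t)((acc + 64) >> 7);   /* round, then divide by 128 */
}
```

      With taps (64, 64) and pixels (100, 200) this returns 150. In the
      SIMD version a single instruction performs this multiply and add
      for many pixel pairs at once.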
  12. 03 Sep, 2010 1 commit
    • Reduced the size of MB_MODE_INFO · 0de458f6
      Scott LaVarnway authored
      Moved partition_bmi and partition_count out of MB_MODE_INFO and
      placed into MACROBLOCK.  Also reduced the size of other members
      of the MB_MODE_INFO struct.  For 1080p, the memory was reduced
      by 1,209,516 bytes.  The decoder performance appeared to improve
      by 3% for the clip used.
      Note:  The main goal for this change is to improve the decoder
      performance.  The encoder will be revisited at a later date for
      further structure cleanup.
      
      Change-Id: I4733621292ee9cc3fffa4046cb3fd4d99bd14613
      0de458f6
  13. 02 Sep, 2010 5 commits
    • Whitespace: nuke CRLFs · 4496db45
      John Koleszar authored
      Change-Id: I8b9fdf9875a8fcff4cb49a3357ce44f18108c2e7
      4496db45
    • encoder: remove postproc dependency · 76640f85
      James Zern authored
      Remove the encoder's general dependency on postproc.c. The only
      unchecked need for it is when CONFIG_PSNR is enabled. All other cases
      are already wrapped in CONFIG_POSTPROC. In the CONFIG_PSNR case the file
      will still be included.
      
      Additionally, using VP8_SET_POSTPROC with the encoder when post
      processing has been disabled now returns an error.
      
      This addresses issue #153.
      
      Change-Id: Ia6dfe20167f7077734a6058cbd1d794550346089
      76640f85
    • added separate rounding/zbin constants for 2nd order · fca12920
      Yaowu Xu authored
      This allows experimenting with different rounding and
      zerobin constants for 2nd order blocks.
      
      Change-Id: Idd829adba3edd1f713c66151a8d29bb245e33a71
      fca12920
    • Disable frame dropping by default · 23216211
      John Koleszar authored
      This is not the behavior that most users expect.
      
      Change-Id: I226126ea400c22cf1f7918e80ea7fe0771c569cb
      23216211
    • Fix rare deadlock before loop filter · d45e5501
      Frank Galligan authored
      There was an extremely rare deadlock that happened when one thread
      was waiting to start the loop filter on frame n while the other
      threads were starting to work on frame n+1.
      
      Change-Id: Icc94f728b3b6663405435640d9a2996735ba19ef
      d45e5501
  14. 01 Sep, 2010 1 commit
  15. 31 Aug, 2010 3 commits
    • Improved Force Key Frame Behaviour · c239a1b6
      Paul Wilkins authored
      These changes improve the behaviour of the code with
      forced key frames sent in by a calling application.
      
      The sizing of the frames is still suboptimal for two pass in
      particular but the behaviour is much better than it was.
      
      Change-Id: I35fae610c67688ccc69d11f385e87dfc884e65a1
      c239a1b6
    • followup arm patch · 0b94f5d6
      Johann authored
      make the arm asm detokenizer work with the new structures
      
      Change-Id: I7cd92c2a018ec24032bb1cfd1bb9739bc84b444a
      0b94f5d6
    • Changed above and left context data layout · e85e6315
      Scott LaVarnway authored
      The main reason for the change was to reduce cycles in the token
      decoder. (~1.5% gain for 32 bit)  This layout should be more
      cache friendly.
      
      As a result of this change, the encoder had to be updated.
      
      Change-Id: Id5e804169d8889da0378b3a519ac04dabd28c837
      Note: dixie uses a similar layout
      e85e6315
  16. 27 Aug, 2010 1 commit
    • Fix harmless off-by-1 error. · 7a8e0a29
      Timothy B. Terriberry authored
      The memory being zeroed in vp8_update_mode_info_border() was just
      allocated with calloc, and so the entire function is actually
      redundant, but it should be made correct in case someone expects
      it to actually work in the future.
      
      Change-Id: If7a84e489157ab34ab77ec6e2fe034fb71cf8c79
      7a8e0a29
  17. 24 Aug, 2010 1 commit
    • clean up compiler warnings · 5c244398
      Johann authored
      did a test compile with clang and got rid of some warnings that have
      been annoying me for a while:
      vp8/decoder/detokenize.c: In function 'vp8_init_detokenizer':
      vp8/decoder/detokenize.c:121: warning: assignment discards qualifiers from pointer target type
      vp8/decoder/detokenize.c:122: warning: assignment discards qualifiers from pointer target type
      vp8/decoder/detokenize.c:123: warning: assignment from incompatible pointer type
      vp8/decoder/detokenize.c:124: warning: assignment discards qualifiers from pointer target type
      vp8/decoder/detokenize.c:125: warning: assignment discards qualifiers from pointer target type
      vp8/decoder/detokenize.c:128: warning: assignment discards qualifiers from pointer target type
      vp8/decoder/detokenize.c:129: warning: assignment discards qualifiers from pointer target type
      vp8/decoder/detokenize.c:130: warning: assignment discards qualifiers from pointer target type
      vp8/decoder/detokenize.c:131: warning: assignment discards qualifiers from pointer target type
      
      Change-Id: I78ddab176fe47cbeed30379709dc7bab01c0c2e4
      5c244398
  18. 23 Aug, 2010 2 commits
    • update structures · d73217ab
      Johann authored
      mbmi and eob moved in previous commits
      
      Change-Id: I30a2eba36addf89ee50b406ad4afdd059a832711
      d73217ab
    • Rework idct calling structure. · 93c32a55
      Fritz Koenig authored
      Moving the eob structure allows for a non-struct based
      function to handle decoding an entire mb of
      idct/dequant/recon data.  This allows for SIMD functions
      to idct/dequant/recon multiple blocks at once.
      
      SSE2 implementation gives 3% gain on Atom.
      
      Change-Id: I8a8f3efd546ea4e0535f517d94f347cfb737c9c2
      93c32a55
  19. 20 Aug, 2010 1 commit
    • increase rate control buffer level precision · 8e7ebacb
      John Koleszar authored
      The external API exposes the RC initial/optimal/full buffer level in
      milliseconds, but this value was truncated internally to seconds. This
      patch allows the use of the full precision during the conversion from
      time to bits.
      
      Change-Id: If8dd2a87614c05747f81432cbe75dd9e6ed2f04e
      8e7ebacb
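      The precision loss can be illustrated with a small sketch. The
      function names and the use of a plain bits-per-second rate are
      assumptions for the example, not the encoder's actual conversion
      code.

```c
#include <stdint.h>

/* Truncating conversion: the level in milliseconds is reduced to whole
 * seconds first, so anything below one second is lost. */
int64_t buffer_bits_truncated(int64_t level_ms, int64_t bits_per_second)
{
    return (level_ms / 1000) * bits_per_second;
}

/* Full-precision conversion: multiply first, divide by 1000 last. */
int64_t buffer_bits_precise(int64_t level_ms, int64_t bits_per_second)
{
    return level_ms * bits_per_second / 1000;
}
```

      At 256000 bits per second, a 1500 ms level becomes 256000 bits with
      the truncating form and 384000 bits with the precise form. A 500 ms
      level becomes 0 bits and 128000 bits respectively.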
  20. 19 Aug, 2010 3 commits
    • Revert "Removed ssse3 sixtap code" · b0660457
      Jim Bankoski authored
      This reverts commit 6ea5bb85.
      b0660457
    • cleanup simple loop filter · 52852da7
      Johann authored
      move some things around, reorder some instructions
      
      constant 0 is used several times. load it once per call in horiz,
      once per loop in vert.
      
      separate saturating instructions to avoid stalls.
      
      just use one usub8 call to set GE flags, rather than uqsub8 followed by
      usub8 w/ 0
      
      document some stalls for further consideration
      
      Change-Id: Ic3877e0ddbe314bb8a17fd5db73501a7d64570ec
      52852da7
    • fix armv6 simpleloop filter · 467a0b99
      Johann authored
      test cases were causing a crash because the count was being read
      incorrectly. after fixing that, noticed that the output was not
      matching. fixed that.
      
      Change-Id: Idb0edb887736bd566a3cf6d4aa1a03ea8d20eb27
      467a0b99
  21. 18 Aug, 2010 1 commit