1. 24 Sep, 2010 7 commits
  2. 23 Sep, 2010 2 commits
    • Add getter functions for the interface data symbols · fa7a55bb
      John Koleszar authored
      Having these symbols be available as functions rather than data is
      occasionally more convenient. Implemented this way rather than a
      get-codec-by-id style to avoid creating a link-time dependency
      between the encoder and the decoder.
      
      Fixes issue #169
      
      Change-Id: I319f281277033a5e7e3ee3b092b9a87cce2f463d
      fa7a55bb
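The getter approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: the type `codec_iface_t`, the symbol `demo_encoder_algo`, and the function `demo_encoder` are all hypothetical names chosen for the example.

```c
/* Hypothetical stand-in for a codec interface descriptor. */
typedef struct codec_iface {
    const char *name;
    int abi_version;
} codec_iface_t;

/* The interface exposed as a data symbol. */
const codec_iface_t demo_encoder_algo = { "demo encoder", 1 };

/* The same interface exposed through a getter function. Each codec
 * provides its own getter, so a caller links only against the codec it
 * names, unlike a single get-codec-by-id lookup that would have to
 * reference every encoder and decoder. */
const codec_iface_t *demo_encoder(void) {
    return &demo_encoder_algo;
}
```

A caller that only needs the encoder would call `demo_encoder()` and never reference any decoder symbol, which is the link-time independence the commit message is after.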
    • Adjust multi-thread sync ranges according to image sizes · 8db5da29
      Yunqing Wang authored
      In multi-threaded decoder, set different sync ranges for
      different video resolutions.
      
      Change-Id: Iea48fd36f51919e0152c8ed3b1f10e1b723c0ca7
      8db5da29
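As a rough illustration of choosing a sync range from the image size, here is a small sketch. The thresholds and both function names are assumptions made for this example, and the progress check is a simplified model of row-to-row synchronization rather than the decoder's actual logic.

```c
/* Hypothetical thresholds: wider frames get a larger sync range, so
 * threads synchronize less often per row. */
int pick_sync_range(int width) {
    if (width < 640) return 1;
    if (width <= 1280) return 8;
    if (width <= 2560) return 16;
    return 32;
}

/* Simplified model: a row may decode macroblock mb_col once the row
 * above has finished at least sync_range macroblocks beyond it.
 * End-of-row handling is deliberately left out. */
int can_decode_mb(int mb_col, int above_row_done_col, int sync_range) {
    return above_row_done_col - mb_col >= sync_range;
}
```

A small range keeps threads closely coupled on narrow frames, where a row holds few macroblocks. A larger range on wide frames trades some latency for fewer synchronization checks.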
  3. 22 Sep, 2010 1 commit
    • Remove dead code · 7fed3832
      Johann authored
      The new loopfilter was originally introduced as an experimental change.
      It's permanent now.
      
      Change-Id: I25dbedb6ceff3e9f9c04e18bb29f84c3ecb7e546
      7fed3832
  4. 21 Sep, 2010 9 commits
  5. 20 Sep, 2010 6 commits
  6. 17 Sep, 2010 2 commits
    • reorder data to use wider instructions · 022323bf
      Johann authored
      The previous commit laid the groundwork by doing two sets of IDCTs
      together. This commit goes further by grouping the interesting data
      (q[0], q+16[0]) together to allow using wider instructions. It also
      drops a few instructions by recognizing that the constant for
      sinpi8sqrt2 can always be downshifted, which avoids a downshift as
      well as workarounds for a function that only accepts signed data.
      
      Looks like a modest performance gain: at QCIF, went from ~180
      fps to ~183.
      
      Change-Id: I842673f3080b8239e026cc9b50346dbccbab4adf
      022323bf
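The remark about downshifting the sinpi8sqrt2 constant can be checked with a little arithmetic. The sketch below assumes the VP8 IDCT constant value 35468 and assumes the signed-only operation is a doubling multiply that keeps the high 16 bits. The commit message names neither, so treat both as a plausible reading. All function names are hypothetical.

```c
#include <stdint.h>

#define SINPI8SQRT2      35468               /* exceeds INT16_MAX (32767) */
#define SINPI8SQRT2_HALF (SINPI8SQRT2 >> 1)  /* 17734, fits in int16_t */

/* Reference form: full-width multiply, then keep the high 16 bits.
 * Right-shifting a negative value is implementation-defined in C; both
 * forms below shift the same product, so the comparison is unaffected. */
int32_t mul_full(int16_t x) {
    return ((int32_t)x * SINPI8SQRT2) >> 16;
}

/* Doubling form: multiply by the pre-shifted constant, double the
 * product, keep the high 16 bits. Saturation is not modeled. */
int32_t mul_doubling_half(int16_t x) {
    return (2 * ((int32_t)x * SINPI8SQRT2_HALF)) >> 16;
}

/* Returns 1 if both forms agree for every int16_t input. */
int forms_agree(void) {
    for (int32_t x = INT16_MIN; x <= INT16_MAX; ++x) {
        if (mul_full((int16_t)x) != mul_doubling_half((int16_t)x)) {
            return 0;
        }
    }
    return 1;
}
```

Because 35468 is even, halving it loses nothing. The halved constant fits in a signed 16-bit lane, and the doubling built into the multiply restores the original scale, so no separate shift or unsigned workaround is needed under these assumptions.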
    • Restructure multi-threaded decoder · f857a850
      Yunqing Wang authored
      On each MB, loop filtering is done right after MB decoding. This
      combines two loops in the multi-threaded code into one, which halves
      the number of synchronizations.
      
      The above-row/left-col data are saved in temp buffers for
      next-row/next-MB decoding.
      
      Tests on a 4-core gLucid machine showed a 10% decoder performance
      gain with threads=4 (tulip clip). Other platforms have not been
      tested yet.
      
      Change-Id: Id18ea7c1e84965dabea65d4c01ca5bc056ddeac9
      f857a850
  7. 16 Sep, 2010 2 commits
    • cleanup: remove unused xprintf · 9100073e
      John Koleszar authored
      These files aren't currently used, and we can get them back if we
      need them.
      
      Change-Id: I62aa3bff828e491a80c80eeb84a7c44903df29b5
      9100073e
    • Reduce size of tokenizer tables · 147b125b
      John Koleszar authored
      This patch reduces the size of the global tables maintained by the
      tokenizer to 16k from 80k-96k. See issue #177.
      
      Change-Id: If0275d5f28389af11ac83c5d929d1157cde90fbe
      147b125b
  8. 15 Sep, 2010 1 commit
    • Modify GET_GOT macro for performance. · 746439ef
      Fritz Koenig authored
      GET_GOT was producing a zero-length call. This resulted in
      pipeline flushes occurring when returning from the assembly
      functions. The cost is masked on out-of-order cores, but evident
      on Atom cores.
      
      Change-Id: I8c375af313e8a169c77adbaf956693c0cfeb5ccd
      746439ef
  9. 14 Sep, 2010 1 commit
    • Removed unnecessary pxor. · 769f2424
      Fritz Koenig authored
      There is no need to make sure that the lower byte of the
      register is 0 because the downshift by 11 overwrites that byte.
      
      Change-Id: I89cbf004b2ff532a2c68e0dc399c45a49cdad5a1
      769f2424
  10. 13 Sep, 2010 3 commits
  11. 10 Sep, 2010 1 commit
    • Make block access to frame buffer sequential · a65cd3de
      Fritz Koenig authored
      Sequentially accessing memory from a low address to a high
      address should make it easier for the processor to predict
      upcoming accesses and keep the needed data in cache.
      
      Change-Id: I1921ce996bdd547144fe864fea6435f527f5842d
      a65cd3de
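To make the access pattern concrete, here is a generic sketch of walking a block inside a strided frame buffer from the lowest address to the highest. It is not taken from the commit. The function name and parameters are hypothetical and only illustrate ascending-address traversal.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum a width x height block inside a frame buffer with the given
 * stride, visiting bytes in strictly ascending address order: rows
 * top to bottom, and columns left to right within each row. */
uint32_t sum_block_sequential(const uint8_t *frame, size_t stride,
                              size_t width, size_t height) {
    uint32_t sum = 0;
    for (size_t row = 0; row < height; ++row) {
        const uint8_t *p = frame + row * stride;  /* start of this row */
        for (size_t col = 0; col < width; ++col) {
            sum += p[col];
        }
    }
    return sum;
}
```

Keeping the row loop outermost means consecutive reads touch adjacent bytes and each new row starts at a higher address than the last.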
  12. 09 Sep, 2010 5 commits
    • Merge "Improved subset block search" · a32ded1d
      Scott LaVarnway authored
      a32ded1d
    • Improved subset block search · c5fb0eb8
      Scott LaVarnway authored
      Improved the subset block search and fill (about a 3% improvement
      for 32-bit). Modified/merged the code in order to create
      vp8_read_mb_modes_mv, which can decode the modes/mvs on a macroblock
      level. This will allow the decode loop (in the future) to decode
      modes/mvs on a frame, row, or MB level.
      
      Change-Id: If637d994b508792f846d39b5d44a7bf9aa5cddf3
      c5fb0eb8
    • Update NEON wide idcts · 14ba7642
      Johann authored
      Extend 93c32a55, which used SSE2 instructions to do two
      idct/dequant/recons at a time, to NEON. Initial working
      commit. More work is needed on rearranging and
      interlacing the data to take advantage of quadword
      operations, which is when we'll hopefully see a much
      better boost.
      
      Change-Id: I86d59d96f15e0d0f9710253e2c098ac2ff2865d1
      14ba7642
    • Fix GF interval for non-lagged ARFs · edcbb1c1
      John Koleszar authored
      When ARFs are enabled in non-lagged compress modes, the GF interval
      was being reset to zero. Non-lagged ARF updates were enabled in commit
      63ccfbd5, but this incorrect GF interval caused a quality regression.
      
      Change-Id: I615c3b493f4ce2127044f4e68d0bcb07d6b730c3
      edcbb1c1
    • Fritz Koenig's avatar