  1. 30 Sep, 2011 1 commit
    • combine loopfilter data access · 3556deac
      Johann authored
      The data processed by the loopfilter overlaps. At the block level, this
      results in some redundant transforms. Grouping the filtering allows for
      a single 16x16 transpose (and inversion) instead of three 16x8 transposes
      (and three more inversions).
      
      This implementation is x86_64 only. We retain the previous
      implementation for x86.
      
      Improvements are obviously material dependent, but they seem to be ~1% in
      tests here.
      
      Change-Id: I467b7ec3655be98fb5f1a94b5d145e5e5a660007
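To make the transposes mentioned in this commit message concrete: a filter that works across a vertical edge operates on columns, so SIMD code commonly transposes the pixels so that each column becomes a row, filters the rows, and then transposes back (the "inversion"). The plain C sketch below shows only that round trip on a single 16x16 byte block. The names `transpose16x16` and `transpose_round_trip_ok` are hypothetical, and this is not the commit's x86_64 assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK = 16 };

/* Hypothetical helper: transpose one 16x16 block of bytes.
 * src and dst must not overlap; strides are in bytes. */
static void transpose16x16(const uint8_t *src, size_t src_stride,
                           uint8_t *dst, size_t dst_stride)
{
    assert(src != dst);
    assert(src_stride >= BLOCK && dst_stride >= BLOCK);
    for (size_t r = 0; r < BLOCK; ++r)
        for (size_t c = 0; c < BLOCK; ++c)
            dst[c * dst_stride + r] = src[r * src_stride + c];
}

/* Transpose, (filter rows), transpose back: with no filtering in between,
 * the round trip must reproduce the input. Returns 1 on success. */
static int transpose_round_trip_ok(void)
{
    uint8_t in[BLOCK * BLOCK], t[BLOCK * BLOCK], out[BLOCK * BLOCK];
    for (size_t i = 0; i < sizeof in; ++i)
        in[i] = (uint8_t)(i * 7u);
    transpose16x16(in, BLOCK, t, BLOCK);
    /* A real filter would modify the rows of t here. */
    transpose16x16(t, BLOCK, out, BLOCK);
    return t[1 * BLOCK + 0] == in[0 * BLOCK + 1] &&
           memcmp(in, out, sizeof in) == 0;
}
```

Doing this once for a whole 16x16 block, rather than once per overlapping 16x8 region, is the kind of saving the message describes.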
  2. 24 Aug, 2011 1 commit
    • Fix data accesses for simple loopfilters · 85358d04
      Johann authored
      The data that the simple horizontal loopfilter reads is aligned, so treat
      it accordingly.
      
      For the vertical, we only use the bottom 4 bytes, so don't read in 16
      (and incur the penalty for unaligned access).
      
      This shows a small improvement on older processors which have a
      significant penalty for unaligned reads.
      
      postproc_mmx.c is unused
      
      Change-Id: I87b29bbc0c3b19ee1ca1de3c4f47332a53087b3d
  3. 08 Jul, 2011 1 commit
  4. 19 Apr, 2011 1 commit
    • modify SAVE_XMM for potential 64bit use · 4a2b684e
      Johann authored
      The win64 ABI requires saving and restoring xmm6:xmm15. Currently
      SAVE_XMM and RESTORE_XMM only allow for saving xmm6:xmm7. Allow
      specifying the highest register used and whether the stack is unaligned.
      
      Change-Id: Ica5699622ffe3346d3a486f48eef0206c51cf867
  5. 05 Oct, 2010 1 commit
  6. 04 Oct, 2010 1 commit
    • nasm: address labels 'rel label' vice 'wrt rip' · 5cdc3a4c
      Jan Kratochvil authored
      nasm does not support `label wrt rip'; it requires `rel label'. That
      form is still fully compatible with yasm.
      
      Provide nasm compatibility. No binary change by this patch with yasm on
      {x86_64,i686}-fedora13-linux-gnu. A few longer opcodes with nasm on
      {x86_64,i686}-fedora13-linux-gnu have been checked as safe.
      
      Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50
  7. 28 Sep, 2010 1 commit
    • Optimizations on the loopfilters. · 0964ef0e
      Fritz Koenig authored
      - Scheduling for Atom processors
      - Combining of macros to allow for better interleaving
      - Change from multiplies to adds for main filter
      - Use of movhps/movlps to fill xmm registers without
        shifting and ORing
      
      Change-Id: I0b3500a5f58abf7085253ec92d64c8a96723040b
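The "multiplies to adds" item above can be illustrated with a scalar sketch. It assumes, and this is only an assumption about what the commit changed, that the term in question has the form `clamp(u + 3 * d)` on signed bytes, and it shows that three saturating byte additions give the same result as the multiply form. All function names are hypothetical; the real code is SIMD assembly, where each addition would be a packed saturating add per byte lane.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an int to the signed 8-bit range. */
static int8_t clamp_s8(int v)
{
    return (int8_t)(v < -128 ? -128 : v > 127 ? 127 : v);
}

/* Multiply form: one wide multiply, then a single clamp.
 * d is a difference of two bytes, so it lies in [-255, 255]. */
static int8_t filter_mul(int8_t u, int d)
{
    assert(d >= -255 && d <= 255);
    return clamp_s8(u + 3 * d);
}

/* Add form: saturate d to a byte, then add it three times with
 * saturation after every step (assumed stand-in for packed adds). */
static int8_t filter_add(int8_t u, int d)
{
    assert(d >= -255 && d <= 255);
    int8_t ds = clamp_s8(d);
    int8_t f = clamp_s8(u + ds);
    f = clamp_s8(f + ds);
    f = clamp_s8(f + ds);
    return f;
}

/* Exhaustively compare both forms; returns 1 if they always agree. */
static int add_and_multiply_forms_agree(void)
{
    for (int u = -128; u <= 127; ++u)
        for (int d = -255; d <= 255; ++d)
            if (filter_mul((int8_t)u, d) != filter_add((int8_t)u, d))
                return 0;
    return 1;
}
```

The two forms agree because every step moves in the same direction as `d`: once a saturating add hits a bound it stays there, which matches clamping the full sum.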
  8. 20 Sep, 2010 1 commit
  9. 14 Sep, 2010 1 commit
    • Removed unnecessary pxor. · 769f2424
      Fritz Koenig authored
      There is no need to make sure that the lower byte of the
      register is 0 because the downshift by 11 overwrites that byte.
      
      Change-Id: I89cbf004b2ff532a2c68e0dc399c45a49cdad5a1
  10. 10 Sep, 2010 1 commit
    • Make block access to frame buffer sequential · a65cd3de
      Fritz Koenig authored
      Sequentially accessing memory from a low address to a high
      address should make it easier for the processor to predict
      accesses and prefetch into the cache.
      
      Change-Id: I1921ce996bdd547144fe864fea6435f527f5842d
  11. 09 Sep, 2010 1 commit
  12. 29 Jun, 2010 1 commit
    • Improve SSE2 loopfilter functions · bead039d
      Yunqing Wang authored
      Restructured and rewrote SSE2 loopfilter functions. Combined u and
      v into one function to take advantage of SSE2 128-bit registers.
      Tests on test clips showed a 4% decoder performance improvement on
      Linux desktop.
      
      Change-Id: Iccc6669f09e17f2224da715f7547d6f93b0a4987
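As a rough illustration of why combining u and v helps: a chroma row in a macroblock is 8 pixels wide, which fills only half of a 128-bit register, so packing one u row and one v row side by side lets a single 16-byte operation cover both planes. The sketch below models the register as a 16-byte array in plain C. The names `pack_uv_row`, `unpack_uv_row` and `uv_round_trip_ok` are hypothetical and do not come from the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { CHROMA_W = 8, LANE = 16 };

/* Put 8 bytes of a u row in the low half and 8 bytes of the matching
 * v row in the high half of one 16-byte lane. */
static void pack_uv_row(const uint8_t *u, const uint8_t *v,
                        uint8_t lane[LANE])
{
    assert(u && v && lane);
    memcpy(lane, u, CHROMA_W);
    memcpy(lane + CHROMA_W, v, CHROMA_W);
}

/* Split a processed lane back into its u and v halves. */
static void unpack_uv_row(const uint8_t lane[LANE], uint8_t *u, uint8_t *v)
{
    assert(u && v && lane);
    memcpy(u, lane, CHROMA_W);
    memcpy(v, lane + CHROMA_W, CHROMA_W);
}

/* Pack, (process all 16 bytes at once), unpack. Returns 1 if the halves
 * land where expected and survive the round trip unchanged. */
static int uv_round_trip_ok(void)
{
    uint8_t u[CHROMA_W], v[CHROMA_W], u2[CHROMA_W], v2[CHROMA_W];
    uint8_t lane[LANE];
    for (int i = 0; i < CHROMA_W; ++i) {
        u[i] = (uint8_t)(10 + i);
        v[i] = (uint8_t)(200 + i);
    }
    pack_uv_row(u, v, lane);
    /* A real filter would operate on the whole lane here. */
    unpack_uv_row(lane, u2, v2);
    return lane[0] == 10 && lane[CHROMA_W] == 200 &&
           memcmp(u, u2, CHROMA_W) == 0 && memcmp(v, v2, CHROMA_W) == 0;
}
```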
  13. 18 Jun, 2010 1 commit
    • cosmetics: trim trailing whitespace · 94c52e4d
      John Koleszar authored
      When the license headers were updated, they accidentally contained
      trailing whitespace, so unfortunately we have to touch all the files
      again.
      
      Change-Id: I236c05fade06589e417179c0444cb39b09e4200d
  14. 11 Jun, 2010 1 commit
  15. 04 Jun, 2010 1 commit
  16. 18 May, 2010 1 commit