1. 20 Jun, 2014 1 commit
  2. 13 Jun, 2014 1 commit
  3. 12 Jun, 2014 1 commit
    • Fast computation path for forward transform and quantization · ccba289f
      Jingning Han authored
      This commit enables a fast path computational flow for forward
      transformation. It checks the sse and variance of prediction
      residuals and decides if the quantized coefficients are all
      zero, dc only, or more. It then selects the corresponding coding
      path in the forward transformation and quantization stage.
      It is currently enabled in rtc coding mode only. Support for rd
      coding mode will follow.
      In speed -6, the runtime for pedestrian_area 1080p at 1000 kbps
      goes down from 14234 ms to 13704 ms, i.e., about 4% speed-up.
      Overall coding performance for rtc set is changed by -0.18%.
      Change-Id: I0452da1786d59bc8bcbe0a35fdae9f623d1d44e1
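
The decision described in this commit message can be sketched as a three-way classification. This is a minimal illustration, not the encoder's code: the enum, the function name, and both thresholds are hypothetical, and it assumes that `var` is the sse with the squared-mean part removed, so that `sse - var` approximates the energy carried by the DC coefficient.

```c
#include <stdint.h>

/* Hypothetical labels for the three coding paths named in the commit message. */
typedef enum {
  PATH_SKIP_ALL_ZERO, /* all quantized coefficients are expected to be zero */
  PATH_DC_ONLY,       /* only the DC coefficient is expected to survive */
  PATH_FULL           /* run the regular transform and quantization */
} TxPath;

/*
 * sse: sum of squared prediction residuals of the block.
 * var: variance-style term of the residual (assumed: sse minus the
 *      squared-mean part), i.e. the AC energy.
 * ac_thresh, dc_thresh: placeholder thresholds, e.g. derived from the
 *      quantizer step size. They are assumptions, not tuned values.
 */
static TxPath select_tx_path(uint32_t sse, uint32_t var,
                             uint32_t ac_thresh, uint32_t dc_thresh) {
  /* Guard the unsigned subtraction in case var exceeds sse. */
  const uint32_t dc_energy = sse > var ? sse - var : 0;

  if (var >= ac_thresh) return PATH_FULL; /* AC energy too high to skip */
  if (dc_energy < dc_thresh) return PATH_SKIP_ALL_ZERO;
  return PATH_DC_ONLY;
}
```

With such a classifier, the caller would skip the transform entirely for `PATH_SKIP_ALL_ZERO` and compute only the DC term for `PATH_DC_ONLY`.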
  4. 10 Jun, 2014 6 commits
    • vp9_rtcd: correct avx2 references · 9f3a0dbb
      James Zern authored
      avx2 code is all intrinsics and as a result doesn't rely on x86inc.asm
      Change-Id: I76ad39474d8a00658f3e43131830ef0f4f34772a
    • vp9_sub_pixel_*variance*: disable avx2 variants · 520cb3f3
      James Zern authored
      tests failing under Win32/Win64
      + variance_test: add missing avx2 functions (partially disabled)
      Change-Id: I6abc0657ea076379ab9ca65c12678b9ea199849d
    • vp9_sad*x4d: disable avx2 variants · d3ff009d
      James Zern authored
      tests failing under Win32/Win64
      + sad_test: add missing avx2 functions (disabled)
      Change-Id: I8224fba2b270f6039ab1877d71e1e512f0081856
    • Add mode info arrays and mode info index. · cdffeaaa
      hkuang authored
      In non-frame-parallel decoding, this works the same way as the
      current decoding scheme. Each time the decoder finishes
      decoding a frame, it swaps the current mode info pointer
      and the previous mode info pointer if the decoded frame needs
      to be shown. Both pointers come from the mode info arrays.
      In frame-parallel decoding, this becomes more complicated,
      because the current frame's mode info pointer is shared with the
      next frame as its previous mode info pointer. When a decoder
      thread finishes decoding one frame and starts work on the next
      available frame, it must retain the decoded frame's mode
      info pointers until the next frame finishes decoding. The mode info
      index serves this purpose: the decoder uses a different
      buffer in the mode info arrays and keeps the other buffer for the
      previously decoded frame’s mode info.
      Change-Id: If11d57d8eb0ee38c8876158e5482177fcb229428
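
The pointer swap described in this commit message could look roughly like the sketch below. All type and function names are hypothetical, and the use of exactly two buffers is an assumption made for illustration.

```c
#include <stddef.h>

#define MI_BUFFERS 2 /* assumption: two buffers are enough for the swap */

typedef struct { int mode; } ModeInfo; /* hypothetical stand-in type */

typedef struct {
  ModeInfo *mi_array[MI_BUFFERS]; /* the mode info arrays */
  int mi_idx;                     /* index of the current frame's buffer */
  ModeInfo *mi;                   /* current frame's mode info */
  ModeInfo *prev_mi;              /* previous frame's mode info */
} MiState;

static void mi_init(MiState *s, ModeInfo *buf0, ModeInfo *buf1) {
  s->mi_array[0] = buf0;
  s->mi_array[1] = buf1;
  s->mi_idx = 0;
  s->mi = buf0;
  s->prev_mi = buf1;
}

/* Called after a frame is decoded. The pointers are swapped only when the
 * frame is shown; the frame just decoded then becomes the "previous" one. */
static void mi_swap_after_frame(MiState *s, int show_frame) {
  if (!show_frame) return;
  s->prev_mi = s->mi_array[s->mi_idx];
  s->mi_idx = (s->mi_idx + 1) % MI_BUFFERS;
  s->mi = s->mi_array[s->mi_idx];
}
```

Keeping an index next to the two pointers lets a decoder thread tell which buffer still holds the previous frame's data when it moves on to the next frame.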
    • vp9_f(dct|ht): disable avx2 variants · dd9f5029
      James Zern authored
      tests failing under Win32/Win64
      + dct16x16_test: add missing avx2 functions (partially disabled)
      exercises the forward transforms
      no idct/iht implementations, so the c-code is used
      Change-Id: I04f64a457fa0828a00f32b5c9fe4f55294f21f61
    • convolve: disable avx2 variants · 5704578f
      James Zern authored
      tests failing under Win32/Win64
      Change-Id: I5d49d11911bcda3a832b14efe5500d22597bedcf
  5. 02 Jun, 2014 1 commit
  6. 01 Jun, 2014 1 commit
  7. 29 May, 2014 2 commits
  8. 28 May, 2014 1 commit
    • Enable SSSE3 inverse 2D-DCT with 10 non-zero coeffs · 6d21cbd2
      Jingning Han authored
      This commit enables the SSSE3 implementation of the inverse 2D-DCT
      for the case where only the first 10 coefficients are non-zero.
      Compared with the SSE2 version, it reduces the runtime from
      745 cycles to 538 cycles, i.e., a 27% speed-up.
      Change-Id: I18ba4128859b09c704a6ee361d69a86c09fe8dfe
  9. 27 May, 2014 2 commits
  10. 23 May, 2014 5 commits
  11. 22 May, 2014 2 commits
  12. 21 May, 2014 2 commits
    • Renames x86_64 specific asm files · e2722734
      Deb Mukherjee authored
      Renames all x86_64 specific assembly files to consistently
      end in _x86_64.asm. This will be useful for build systems to
      handle these files differently.
      All new 64-bit specific assembly files should use the new
      naming convention.
      Change-Id: I36c89584967c82ffc4088b1b5044ac15d2bb7536
    • Moving itxm_add pointer from MACROBLOCKD to MACROBLOCK. · 35a83677
      Dmitry Kovalev authored
      The final goal is eventually to get rid of both itxm_add and fwd_txm4x4.
      This patch does it in the decoder.
      Change-Id: Ibb3db57efbcbb1ac387c6742538a9fcf2c6f24a5
  13. 20 May, 2014 2 commits
    • Extends temporal filtering to work for 422 data · a185bc33
      Deb Mukherjee authored
      This is needed for profiles 1 and 2.
      Change-Id: I5dd7644c2932d055ab89e050d4be7d4117cd1028
    • Refactor decode_tiles and loopfilter code. · 20c1edf6
      hkuang authored
      The current decode_tiles decodes the frame one tile at a time
      and then loop-filters the whole frame, or uses another worker thread to
      do the loop filtering.
      For example, if a tiled video has one row and four columns, decode_tiles
      decodes Tile1, then Tile2, then Tile3, then Tile4.
      Within each tile, decode_tile decodes row by row.
      For frame-parallel decoding, decode_tiles decodes the video in row order
      across the tiles, so the order is:
      "Decode 1st row of Tile1" -> "Decode 1st row of Tile2"
      -> "Decode 1st row of Tile3" -> "Decode 1st row of Tile4"
      -> "Decode 2nd row of Tile1" -> "Decode 2nd row of Tile2"
      -> "Decode 2nd row of Tile3" -> "Decode 2nd row of Tile4" -> "loopfilter 1st row"
      Change-Id: I2211f9adc6d142fbf411d491031203cb8a6dbf6b
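
The row-interleaved order described in this commit message can be illustrated with a small sketch. The `DecodeStep` type and the function name are hypothetical, and the sketch only enumerates the decode steps; it does not model when loop filtering of a finished row is triggered.

```c
#include <stddef.h>

/* Hypothetical record of one decode step: which row of which tile. */
typedef struct {
  int row;
  int tile;
} DecodeStep;

/* Fills `out` with decode steps in row order across tiles: every tile's
 * first row, then every tile's second row, and so on. Writes at most `cap`
 * entries and returns the number of entries written. */
static size_t row_interleaved_order(int rows, int tile_cols,
                                    DecodeStep *out, size_t cap) {
  size_t n = 0;
  for (int r = 0; r < rows; ++r) {
    for (int t = 0; t < tile_cols; ++t) {
      if (n == cap) return n;
      out[n].row = r;
      out[n].tile = t;
      ++n;
    }
  }
  return n;
}
```

For two rows and four tile columns, this yields row 0 of tiles 0 to 3 followed by row 1 of tiles 0 to 3, matching the order listed in the message.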
  14. 16 May, 2014 1 commit
  15. 15 May, 2014 3 commits
  16. 14 May, 2014 4 commits
    • Hiding vp9_sub_pel_filters_{8, 8s, 8lp} filters in *.c file. · 021eaabd
      Dmitry Kovalev authored
      Change-Id: Id401da740b0a0141caaef9e1bcccd981e5cef4a4
    • AVX2 To VP9 Block Error Optimization · 1fbab853
      levytamar82 authored
      vp9_block_error_sse2 can only handle 16 bytes at a time, but
      the function needs to process a sequence of 32 bytes at a time,
      so each 16-byte half is handled in a separate register.
      With the AVX2 optimization, the 32 bytes can be handled in one
      register instead of the two used by the SSE2 version.
      vp9_block_error itself improved by 85%.
      At the user level, the improvement was 1.2%.
      Change-Id: Ia8fffe60e61eff7432a5fbd538757894f6c319fd
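
For context, a scalar reference of the kind of computation being vectorized is sketched below. The function name and signature are hypothetical; it assumes 16-bit coefficients, so a 16-byte SSE2 register holds 8 of them and a 32-byte AVX2 register holds 16.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar sketch of a block error computation (hypothetical name).
 * Returns the sum of squared differences between the original and the
 * dequantized coefficients, and stores the sum of squared original
 * coefficients in *ssz. */
static int64_t block_error_ref(const int16_t *coeff, const int16_t *dqcoeff,
                               size_t n, int64_t *ssz) {
  int64_t error = 0;
  int64_t sqcoeff = 0;
  for (size_t i = 0; i < n; ++i) {
    const int32_t diff = coeff[i] - dqcoeff[i];
    error += (int64_t)diff * diff;
    sqcoeff += (int64_t)coeff[i] * coeff[i];
  }
  *ssz = sqcoeff;
  return error;
}
```

A SIMD version would process one register's worth of coefficients per loop iteration, which is where the wider AVX2 registers halve the number of registers needed for each 32-byte chunk.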
    • vp9_decodeframe.c: cleanup -wextra warnings · ed095807
      Yaowu Xu authored
      Change-Id: I0315cea6a5e58182bc2556e9825ec2ef0b1480c3
    • Remove Wextra warnings from vp9_sad.c · 7ab9a958
      Deb Mukherjee authored
      As a side-effect, the max_sad check is removed from the
      C-implementation of VP8, for consistency with VP9, and to
      ensure that the SAD tests common to VP8/VP9 pass.
      That will make the VP8 C implementation of sad a little slower,
      but given that it is rarely used in practice, the impact will be
      minimal.
      Change-Id: I7f43089fdea047fbf1862e40c21e4715c30f07ca
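
A plain C SAD without any early-exit check might look like the sketch below. The function name and signature are hypothetical, and it assumes that the removed max_sad check was an early-termination threshold inside the row loop.

```c
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences over a width x height block (hypothetical
 * name). Every row is always visited: there is no max_sad threshold that
 * would stop the loop early. */
static unsigned int sad_ref(const uint8_t *src, int src_stride,
                            const uint8_t *ref, int ref_stride,
                            int width, int height) {
  unsigned int sad = 0;
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      sad += (unsigned int)abs(src[x] - ref[x]);
    }
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}
```

Dropping the threshold means the result no longer depends on a caller-supplied limit, which makes it straightforward to compare against optimized versions in shared tests.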
  17. 13 May, 2014 2 commits
    • Silence -wextra warnings in vp9_reconintra.c · 806fa6aa
      Jingning Han authored
      The warning messages complained that there are unused arguments
      in a few prediction modes. This structure is intentional: it lets
      a wrapper function cover all prediction mode cases
      and makes them readily accessible as a pointer array.
      This commit silences such warnings.
      Change-Id: I7036b6bdb70747e5327d8f6fceb154f100abc4c0
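
The design described in this commit message can be sketched as follows: predictors share one signature so they fit in a function pointer array, and an argument a given mode does not need is explicitly marked unused. All names are hypothetical, and the `(void)` cast is one common way to silence the warning, not necessarily the one used in this commit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Shared signature so all predictors can live in one table. */
typedef void (*IntraPredFn)(uint8_t *dst, int bs,
                            const uint8_t *above, const uint8_t *left);

/* Vertical prediction copies the row above into every row of the block. */
static void pred_vertical(uint8_t *dst, int bs,
                          const uint8_t *above, const uint8_t *left) {
  (void)left; /* not needed by this mode; the cast silences the warning */
  for (int r = 0; r < bs; ++r) memcpy(dst + r * bs, above, (size_t)bs);
}

/* Horizontal prediction fills each row with its left neighbour. */
static void pred_horizontal(uint8_t *dst, int bs,
                            const uint8_t *above, const uint8_t *left) {
  (void)above; /* not needed by this mode */
  for (int r = 0; r < bs; ++r) memset(dst + r * bs, left[r], (size_t)bs);
}

enum { PRED_V, PRED_H, PRED_COUNT };

/* The table lets a wrapper dispatch on the mode without a switch. */
static const IntraPredFn pred_table[PRED_COUNT] = {
  pred_vertical,
  pred_horizontal
};
```

A caller then invokes `pred_table[mode](dst, bs, above, left)` regardless of which arguments the selected mode actually reads.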
    • vp9_convolve.c: cleanup -wextra warnings · fd6bf31b
      Adrian Grange authored
      Change-Id: I04930aca2293ebbaeb96dfedd2f9c5a55762fd2e
  18. 12 May, 2014 2 commits
  19. 08 May, 2014 1 commit