1. 16 May, 2014 1 commit
  2. 15 May, 2014 3 commits
  3. 14 May, 2014 3 commits
    • AVX2 To VP9 Block Error Optimization · 1fbab853
      levytamar82 authored
      vp9_block_error_sse2 can only process 16 bytes at a time, but
      the function needs to process 32 bytes per iteration,
      so each 16-byte half is handled in a separate register.
      With AVX2, the 32 bytes fit in a single register instead
      of the two that SSE2 requires.
      vp9_block_error itself was sped up by 85%.
      User-level performance improved by 1.2%.
      Change-Id: Ia8fffe60e61eff7432a5fbd538757894f6c319fd
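For orientation, here is a minimal scalar sketch of the kind of computation a block-error routine performs. It assumes the block error is the sum of squared differences between original and dequantized 16-bit coefficients, with the sum of squared originals returned through a pointer. The name `block_error_ref` and its signature are illustrative, not taken from the commit. Under that assumption, 32 bytes correspond to 16 coefficients, which is what one AVX2 register holds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference (name and signature are assumptions).
 * Returns the sum of squared differences between coeff and dqcoeff and
 * stores the sum of squared coeff values in *ssz. */
int64_t block_error_ref(const int16_t *coeff, const int16_t *dqcoeff,
                        size_t n, int64_t *ssz) {
  int64_t error = 0;
  int64_t sqcoeff = 0;
  for (size_t i = 0; i < n; ++i) {
    const int32_t c = coeff[i];
    const int32_t diff = c - dqcoeff[i];
    /* Widen before multiplying so the accumulators cannot overflow. */
    error += (int64_t)diff * diff;
    sqcoeff += (int64_t)c * c;
  }
  *ssz = sqcoeff;
  return error;
}
```

A SIMD version would process a fixed number of coefficients per loop iteration; the scalar loop above is only a baseline to check such a version against.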
    • vp9_decodeframe.c: cleanup -wextra warnings · ed095807
      Yaowu Xu authored
      Change-Id: I0315cea6a5e58182bc2556e9825ec2ef0b1480c3
    • Remove Wextra warnings from vp9_sad.c · 7ab9a958
      Deb Mukherjee authored
      As a side-effect, the max_sad check is removed from the
      C-implementation of VP8, for consistency with VP9, and to
      ensure that the SAD tests common to VP8/VP9 pass.
      That will make the VP8 C implementation of SAD a little slower,
      but given that it is rarely used in practice, the impact will be
      Change-Id: I7f43089fdea047fbf1862e40c21e4715c30f07ca
  4. 13 May, 2014 2 commits
    • Silience -wextra warnings in vp9_reconintra.c · 806fa6aa
      Jingning Han authored
      The warning messages complained that there are unused arguments
      in a few prediction modes. This structure was designed on purpose,
      such that a wrapper function can cover all prediction mode cases
      and make them readily accessible as a pointer array.
      This commit silences such warnings.
      Change-Id: I7036b6bdb70747e5327d8f6fceb154f100abc4c0
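The design described above can be sketched as follows. This is an illustration only: the type `intra_pred_fn`, the functions `v_pred`, `h_pred` and `predict`, and the table layout are hypothetical names, not the ones in vp9_reconintra.c. Every predictor shares one signature so it can sit in a function-pointer array, and a `(void)` cast marks the argument a given mode does not need.

```c
#include <stddef.h>
#include <stdint.h>

/* One signature for every mode so all of them fit in a single table. */
typedef void (*intra_pred_fn)(uint8_t *dst, ptrdiff_t stride, int bs,
                              const uint8_t *above, const uint8_t *left);

/* Vertical prediction copies the row above; `left` is unused on purpose. */
static void v_pred(uint8_t *dst, ptrdiff_t stride, int bs,
                   const uint8_t *above, const uint8_t *left) {
  (void)left; /* silences the unused-parameter warning */
  for (int r = 0; r < bs; ++r)
    for (int c = 0; c < bs; ++c) dst[r * stride + c] = above[c];
}

/* Horizontal prediction copies the left column; `above` is unused. */
static void h_pred(uint8_t *dst, ptrdiff_t stride, int bs,
                   const uint8_t *above, const uint8_t *left) {
  (void)above; /* silences the unused-parameter warning */
  for (int r = 0; r < bs; ++r)
    for (int c = 0; c < bs; ++c) dst[r * stride + c] = left[r];
}

enum { V_PRED_IDX, H_PRED_IDX, PRED_COUNT };

static const intra_pred_fn pred_table[PRED_COUNT] = { v_pred, h_pred };

/* Wrapper: one call site covers every mode in the table. */
void predict(int mode, uint8_t *dst, ptrdiff_t stride, int bs,
             const uint8_t *above, const uint8_t *left) {
  pred_table[mode](dst, stride, bs, above, left);
}
```

The cast costs nothing at run time and keeps the uniform signature intact, which is the point of the table.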
    • vp9_convolve.c: cleanup -wextra warnings · fd6bf31b
      Adrian Grange authored
      Change-Id: I04930aca2293ebbaeb96dfedd2f9c5a55762fd2e
  5. 12 May, 2014 2 commits
  6. 08 May, 2014 3 commits
    • Add an x86inc MMX fwht4x4. · b5422fab
      Alex Converse authored
      Change-Id: Ib0a73d4863478f9b8a00976379d25d2f6ebbb197
    • Change eob threshold for partial inverse 8x8 2D-DCT to 12 · 41a350a8
      Jingning Han authored
      The scanning order has the first 12 coefficients of the 8x8 2D-DCT
      sitting in the top-left 4x4 block. Hence the partial inverse 8x8
      2D-DCT can handle cases with eob below 12.
      The overall runtime of the inverse 8x8 2D-DCT unit is reduced from
      166 cycles (using SSE2) to 150 cycles (using SSSE3).
      Change-Id: I4514f9748042809ac84df4c14382c00f313f1cd2
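A rough sketch of how an eob threshold can select between a partial and a full inverse transform. The enum, the function name `select_idct8x8` and the constant are hypothetical. Treating the threshold as inclusive (eob <= 12) is an assumption based on the statement that the first 12 coefficients in scan order fall inside the top-left 4x4 block.

```c
/* Hypothetical labels for the two code paths discussed in the commit. */
typedef enum { IDCT8X8_PARTIAL, IDCT8X8_FULL } idct8x8_kind;

/* Assumed threshold: number of leading scan positions that lie inside
 * the top-left 4x4 block of the 8x8 transform. */
#define PARTIAL_IDCT8X8_EOB_MAX 12

/* eob is the end-of-block position: every coefficient at scan index
 * >= eob is zero. If all non-zero coefficients fit in the top-left 4x4
 * block, the cheaper partial inverse transform is sufficient. */
idct8x8_kind select_idct8x8(int eob) {
  return (eob <= PARTIAL_IDCT8X8_EOB_MAX) ? IDCT8X8_PARTIAL : IDCT8X8_FULL;
}
```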
    • SSSE3 8x8 inverse 2D-DCT with first 10 coeffs non-zero · 9e7b09bc
      Jingning Han authored
      This commit enables the SSSE3 assembly implementation of the 8x8
      inverse 2D-DCT with only the first 10 coefficients non-zero. The
      average runtime for this unit goes down from 198 cycles to 129
      cycles (34.8% faster).
      Change-Id: Ie7fa4386f6d3a2fe0d47a2eb26fc2a6bbc592ac7
  7. 07 May, 2014 1 commit
    • Revert "Add an MMX fwht4x4" · 33b1c457
      Paul Wilkins authored
      The reverted commit includes changes that are not compatible with
      Visual Studio (VS) Windows builds.
      Among other things, stdint.h is not supported in VS.
      This reverts commit 89fbf3de.
      Change-Id: Ifa86d7df250578d1ada9b539c9ff12ed0c523cdd
  8. 06 May, 2014 1 commit
  9. 05 May, 2014 2 commits
    • Add an MMX fwht4x4 · 89fbf3de
      Alex Converse authored
      7% faster when encoding desktop content losslessly at RT speed 4.
      Change-Id: I41627f5b737752616b6512bb91a36ec45995bf64
    • SSSE3 implementation of full inverse 8x8 2D-DCT · 52ae97b6
      Jingning Han authored
      This commit enables the SSSE3 version of the full inverse 8x8 2D-DCT
      and reconstruction. It brings the runtime of vp9_idct8x8_64_add down
      from 256 cycles (SSE2) to 246 cycles.
      Change-Id: I0600feac894d6a443a3c9d18daf34156d4e225c3
  10. 01 May, 2014 1 commit
  11. 29 Apr, 2014 2 commits
    • Enable SSSE3 implementation of 8x8 forward 2D-DCT · 1eaa3a76
      Jingning Han authored
      SSSE3 assembly implementation of the 8x8 forward 2D-DCT. The current
      version is turned on only for x86_64. The average unit runtime
      goes from 157 cycles down to 136 cycles, i.e., about 12.8% faster.
      This translates into about 1.5% speed-up for pedestrian_area 1080p
      at speed 2.
      Change-Id: I0f12435857e9425ed7ce12541344dfa16837f4f4
    • Adding search_site_config struct. · aa464eca
      Dmitry Kovalev authored
      Change-Id: I2ad333553e673dbabcdc0f0366aea311e90849bf
  12. 25 Apr, 2014 2 commits
  13. 24 Apr, 2014 1 commit
  14. 23 Apr, 2014 1 commit
  15. 11 Apr, 2014 1 commit
  16. 10 Apr, 2014 1 commit
  17. 09 Apr, 2014 4 commits
    • Revert "Converting set_prev_mi() to get_prev_mi()." · 60def47f
      Dmitry Kovalev authored
      This reverts commit 22a3e307
      Change-Id: I460d905edf5fb2006da58c18fbe02c04d0c631bb
    • Fix encoder uninitialized read errors reported by drmemory · 3a6670fc
      Yunqing Wang authored
      This patch fixed the uninitialized read errors in Issue 748:
      "dr memory VP9 encode errors". In vp9_convolve_avg_sse2,
      when width is 4, pavgb reads 8 bytes from the dst buffer, which runs
      out of range. An error is reported although the data is not
      actually used later. This issue was resolved by preventing
      uninitialized reads.
      Change-Id: I109a54910aa47139cb13119de86f2062cff207df
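To illustrate the issue, here is a scalar sketch of a width-4 averaging step that touches exactly 4 bytes per row. It is not the SSE2 fix from the commit; the function name `convolve_avg_w4` and its signature are hypothetical. The rounding average `(a + b + 1) >> 1` matches what pavgb computes per byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a width-4 averaging convolve.
 * Each row reads and writes exactly 4 bytes of dst, so nothing beyond
 * the block is touched. */
void convolve_avg_w4(const uint8_t *src, ptrdiff_t src_stride,
                     uint8_t *dst, ptrdiff_t dst_stride, int h) {
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < 4; ++x) {
      /* Rounding average, the same per-byte result as pavgb. */
      dst[x] = (uint8_t)((dst[x] + src[x] + 1) >> 1);
    }
    src += src_stride;
    dst += dst_stride;
  }
}
```

A SIMD version has to achieve the same bound by loading only 4 bytes of dst for this width rather than a full 8-byte operand.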
    • Fix avx builds on macosx with clang 5.0. · f600b50a
      Tom Finegan authored
      The macosx release of clang v5.0 identifies itself as:
      Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)
      This version of clang uses the older _mm_broadcastsi128_si256, like
      v3.3, as given away in the LLVM svn version above.
      Change-Id: I4d6d59d5454efd57d2ae9e75f5eb7486af7cbd0c
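The version check this implies might look roughly like the sketch below. The macro `USES_OLD_BROADCAST_NAME` and the helper `broadcast_intrinsic_name` are hypothetical. Extending the check to clang releases older than 3.3 is an assumption. The snippet only reports which intrinsic name would be chosen, so it compiles without AVX2 support.

```c
/* Assumption: clang 3.3 and older, plus Apple's clang 5.0 (based on
 * LLVM 3.3svn), expose the older intrinsic name. */
#if defined(__clang__) &&                                   \
    ((__clang_major__ < 3) ||                               \
     (__clang_major__ == 3 && __clang_minor__ <= 3) ||      \
     (defined(__APPLE__) && __clang_major__ == 5 &&         \
      __clang_minor__ == 0))
#define USES_OLD_BROADCAST_NAME 1
#else
#define USES_OLD_BROADCAST_NAME 0
#endif

/* Returns the name of the intrinsic a build on this compiler would use. */
const char *broadcast_intrinsic_name(void) {
  return USES_OLD_BROADCAST_NAME ? "_mm_broadcastsi128_si256"
                                 : "_mm256_broadcastsi128_si256";
}
```

In real code the same condition would typically guard a wrapper macro around the intrinsic call itself.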
    • Use source frame difference to make partition decision · 4e66293f
      Yunqing Wang authored
      Calculate the variance of the difference between the last source
      frame and the current source frame. The variance is calculated at
      the 16x16 block level. The variances are compared to several
      thresholds to decide the final partition sizes.
      An adaptive strategy is implemented to decide using
      in the video. The switching test is done once every
      search_type_check_frequency frames.
      The selection of source_var_thresh needs to be investigated
      further later.
      The Borg test on the RTC set showed a 0.424% overall PSNR gain and
      a 0.357% SSIM gain. For clips with large enough static areas, the
      encoding speedup is around 2% to 15%.
      Change-Id: Id7d268f1d8cbca7fb8026aa4a53b3c77459dc156
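A minimal sketch of the idea, under stated assumptions: the function names, the `part_hint` labels and the two-threshold mapping are all hypothetical, and the commit does not say which variance range maps to which partition size. The first function returns the variance of the 16x16 difference block scaled by 256 (sum of squared differences minus the squared sum divided by 256).

```c
#include <stddef.h>
#include <stdint.h>

/* Variance (scaled by 256) of the per-pixel difference between the
 * current and last source frames over one 16x16 block. */
uint32_t source_diff_var_16x16(const uint8_t *cur, ptrdiff_t cur_stride,
                               const uint8_t *last, ptrdiff_t last_stride) {
  int64_t sum = 0;
  int64_t sse = 0;
  for (int y = 0; y < 16; ++y) {
    for (int x = 0; x < 16; ++x) {
      const int d = cur[y * cur_stride + x] - last[y * last_stride + x];
      sum += d;
      sse += d * d;
    }
  }
  return (uint32_t)(sse - (sum * sum) / 256);
}

/* Hypothetical partition hints; the mapping below is an assumption. */
typedef enum { PART_LARGE, PART_MEDIUM, PART_SMALL } part_hint;

/* Assumed rule: static content (low variance) gets a larger partition. */
part_hint classify_block(uint32_t var, uint32_t low_thresh,
                         uint32_t high_thresh) {
  if (var < low_thresh) return PART_LARGE;
  if (var < high_thresh) return PART_MEDIUM;
  return PART_SMALL;
}
```

Note that a uniform brightness shift between frames yields zero variance here, since only deviations from the mean difference count.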
  18. 08 Apr, 2014 1 commit
    • High-level hooks for Profile 2 (10/12 bit) · d35df2d8
      Deb Mukherjee authored
      Adds some high-level hooks for profile 2 before further
      progress on the implementation.
      According to the definition in this patch:
      1. Profile 2 supports only 10- or 12-bit color, not 8-bit.
      2. Profile 2 supports all color sampling modes (444, 422 and 420)
      and an alpha plane.
      3. Profile 3 is currently undefined.
      Please consider the definition carefully and suggest modifications
      to the definition as needed.
      Change-Id: I5b284fc679e54ac5aee171af72fa7994cfd28995
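The proposed Profile 2 rules can be expressed as a small check. This is a sketch only: the enum and the function `profile2_config_is_valid` are hypothetical names, other profiles are out of scope, and the alpha plane is not modelled.

```c
/* Hypothetical labels for the chroma sampling modes listed above. */
typedef enum { SAMPLING_420, SAMPLING_422, SAMPLING_444 } sampling_mode;

/* Returns 1 if the configuration is allowed under the Profile 2
 * definition proposed in this patch, 0 otherwise. */
int profile2_config_is_valid(int bit_depth, sampling_mode sampling) {
  /* Rule 1: only 10- or 12-bit color, never 8-bit. */
  const int depth_ok = (bit_depth == 10 || bit_depth == 12);
  /* Rule 2: every sampling mode is accepted. */
  const int sampling_ok = (sampling == SAMPLING_420 ||
                           sampling == SAMPLING_422 ||
                           sampling == SAMPLING_444);
  return depth_ok && sampling_ok;
}
```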
  19. 07 Apr, 2014 2 commits
  20. 03 Apr, 2014 1 commit
  21. 02 Apr, 2014 1 commit
  22. 01 Apr, 2014 1 commit
  23. 29 Mar, 2014 1 commit
  24. 28 Mar, 2014 2 commits