1. 26 May, 2017 1 commit
      ext-inter: Vectorize new masked SAD/SSE functions · 0aa39ff0
      David Barker authored
      We would expect that these new functions would be slower than
      the old masked SAD/SSE functions, as they do additional work
      (blending two inputs and comparing to a third, rather than
      just comparing two inputs).
      
      This is true for the SAD functions, which are about 50% slower
      (depending on block size and bit depth). However, the sub-pixel
      SSE functions are comparable to the old speed for the accelerated
      special cases (xoffset or yoffset = 0 or 4), and are
      40-90% faster for the generic case.
      
      Change-Id: I1a296ed8fc9e3edc313a6add516ff76b17cd3e9f
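
      The "blending two inputs and comparing to a third" described above can
      be illustrated with a scalar reference. This is only a sketch: the
      function names, the 6-bit mask range [0, 64], and the round-to-nearest
      blend are assumptions made for illustration, not taken from this commit.

```c
#include <stdint.h>
#include <stdlib.h>

/* Blend two predictor samples with mask weight m.
 * Assumption: m is in [0, 64]; the result is rounded to nearest. */
static int blend_a64_sketch(int m, int a, int b) {
  return (m * a + (64 - m) * b + 32) >> 6;
}

/* Hypothetical scalar reference: sum of absolute differences between the
 * source block and the masked blend of two predictor blocks a and b. */
static unsigned int masked_sad_sketch(const uint8_t *src, int src_stride,
                                      const uint8_t *a, int a_stride,
                                      const uint8_t *b, int b_stride,
                                      const uint8_t *mask, int mask_stride,
                                      int width, int height) {
  unsigned int sad = 0;
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      const int pred = blend_a64_sketch(mask[x], a[x], b[x]);
      sad += (unsigned int)abs(pred - src[x]);
    }
    /* Advance every pointer to the next row. */
    src += src_stride;
    a += a_stride;
    b += b_stride;
    mask += mask_stride;
  }
  return sad;
}
```

      Compared with a plain two-input SAD, each sample needs an extra
      multiply-add blend before the difference is taken, which is the
      additional work the commit message refers to.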
  2. 25 May, 2017 1 commit
  3. 24 May, 2017 1 commit
  4. 23 May, 2017 2 commits
      Vectorize high-precision convolve filter · 5d34e6a7
      David Barker authored
      Add SSE2 lowbd and SSSE3 highbd versions of the filters
      introduced in https://aomedia-review.googlesource.com/c/11962/ .
      
      These filters are equivalent in speed to the SSE2 implementations
      of the regular convolve filter. The average time to filter a
      64x64 block is:
      
      lowbd C: 52us
      lowbd SSE2: 5.6us
      highbd C: 53us
      highbd SSSE3: 5.8us
      
      Also add a correctness test based on the warp filter tests.
      
      Change-Id: Ia0d81100e8a414bbfc2b5f664d751cf24765299e
      ext-inter: Delete dead code · 0f3c94e1
      David Barker authored
      Patches https://aomedia-review.googlesource.com/c/11987/
      and https://aomedia-review.googlesource.com/c/11988/
      replaced the old masked motion search pipeline with
      a new one which uses different SAD/SSE functions.
      This resulted in a lot of dead code.
      
      This patch removes the now-dead code. Note that this
      includes vectorized SAD/SSE functions, which will need
      to be rewritten at some point for the new pipeline. It
      also includes the masked_compound_variance_* functions
      since these turned out not to be used by the new pipeline.
      
      To help with the later addition of vectorized functions, the
      masked_sad/variance_test.cc files are kept but are modified
      to work with the new functions. The tests are then disabled
      until we actually have the vectorized functions.
      
      Change-Id: I61b686abd14bba5280bed94e1be62eb74ea23d89
  5. 15 May, 2017 1 commit
      Remove armv6 media-extension assembly. · be111b38
      Ralph Giles authored
      Libvpx dropped armv6 support sometime after the aom fork.
      
      We don't intend to support this platform, which is likely
      too slow in any case. Remove the assembly- and
      intrinsics-optimized routines, their tests, CPU feature detection,
      and rtcd specialization for this instruction set extension.
      
      Change-Id: If44ec28e5ddafc6af179c5d1982ac7e81fe54d5e
  6. 08 May, 2017 1 commit
      Partial IDCT 16x16 avx2 · f6176abb
      Yi Luo authored
      - Function level improvement:
      functions      sse2  avx2  percentage
      idct16x16_256  365   226   38%
      idct16x16_38   n/a   136   n/a
      idct16x16_10   171   110   35%
      idct16x16_1     34    26   23%
      
      - Integrated in AV1 for default scan order.
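
      The percentage column in the table above appears to be the relative
      reduction of the avx2 figure against the sse2 figure, truncated to an
      integer. The commit states neither the unit of the sse2/avx2 columns
      nor the formula, so the helper below (a hypothetical name) is an
      assumption that has only been checked against the three complete rows.

```c
/* Relative reduction from `before` to `after`, in percent, truncated
 * toward zero by integer division. Assumes before > 0. */
static int percent_reduction(int before, int after) {
  return 100 * (before - after) / before;
}
```

      For example, percent_reduction(365, 226) gives 38, matching the
      idct16x16_256 row.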
      
      Change-Id: Ieb1a8e730bea9c371ebc0e5f4a748640d8f5e921
  7. 05 May, 2017 1 commit
  8. 26 Apr, 2017 1 commit
      Update partial inverse DCT according to VP9 · 3fcb356e
      Yi Luo authored
      - Partial inverse DCT unit tests have been enhanced.
      - IDCT x86_64 assembly code has been removed.
      
      Change-Id: Ic3bed2c0e70abdfd642a4f74fa969cc672d4795f
  9. 12 Apr, 2017 1 commit
  10. 04 Apr, 2017 1 commit
  11. 30 Mar, 2017 1 commit
      High bit depth inter prediction filter AVX2 · 9d247355
      Yi Luo authored
      On i7-6700:
      - Function level speed improvement: 23%-29%
      - User level speed improvement:
         decoder: ~2%-4%.
         encoder: <1%.
      
      Change-Id: I02937a72304c3b356ca41e580352790df391f0a2
  12. 27 Mar, 2017 1 commit
      Adds binary code lib for coding various symbols · 47748b56
      Debargha Mukherjee authored
      Adds a variable length binary code library for
      coding various symbols for typical use in headers.
      
      The main codes implemented are:
      1. Coding a symbol from an n-ary alphabet using a
      quasi-uniform code.
      2. A bilevel code for coding symbols from an n-ary
      alphabet based on a reference value for the symbol
      also taken from the same alphabet.
      The code has two steps. If the symbol is close to
      the reference a shorter code is used, while if it is
      farther away a longer code is used.
      3. A finite (terminated) subexponential code that codes
      a symbol from an n-ary alphabet using subexp parameter k.
      4. A finite (terminated) subexponential code that codes
      a symbol from an n-ary alphabet using subexp parameter k,
      based on a given reference also taken from the same
      alphabet. This code essentially reorders the values
      before using the same code as 3.
      
      Also adds corresponding encoder side functions to count
      the number of bits used.
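
      As an illustration of code 1 and of the bit-counting helpers, a
      quasi-uniform code for an n-ary alphabet can be sketched as follows.
      The function names are hypothetical and the construction shown (the
      first m symbols get l - 1 bits, the rest get l bits) is one common
      form of the code, assumed here rather than taken from this commit.

```c
#include <stdint.h>

/* Index of the most significant set bit of n. Assumes n > 0. */
static int msb_index(uint32_t n) {
  int i = 0;
  while (n >>= 1) ++i;
  return i;
}

/* Number of bits a quasi-uniform code spends on symbol v from the
 * alphabet {0, ..., n - 1}. With l = ceil(log2(n)) and m = 2^l - n,
 * the first m symbols use l - 1 bits and the remaining ones use l bits. */
static int quniform_count_bits_sketch(uint32_t n, uint32_t v) {
  if (n <= 1) return 0; /* A single-symbol alphabet needs no bits. */
  const int l = msb_index(n - 1) + 1;
  const uint32_t m = (1u << l) - n;
  return v < m ? l - 1 : l;
}
```

      With n = 5 this spends 2 bits on symbols 0 to 2 and 3 bits on symbols
      3 and 4; when n is a power of two it reduces to a fixed-length code.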
      
      These codes will be subsequently used for more efficient
      encoding of loop-restoration parameters and global motion
      parameters.
      
      Change-Id: I28c82b611925c1ab17f544c48c4b1287930764b7
  13. 03 Mar, 2017 1 commit
  14. 01 Feb, 2017 1 commit
      ans: Remove some dead code. · e8b34bb1
      Alex Converse authored
      This was part of the old ans zero token handling. It has been replaced
      by the new ec_multisymbol zero token handling.
      
      Change-Id: I9c1fcb42ac0d214178cf4fbf8755ad68dcbbc11f
  15. 13 Dec, 2016 1 commit
  16. 09 Dec, 2016 1 commit
      High bit depth motion search SAD optimization on avx2 · e9832584
      Yi Luo authored
      - For all blocks with width >= 16.
      - Add test_count to make the unit tests harder to pass.
      - Speed testing on 1080p, 100 frames, 5 Mbps, on an i7-6700 CPU.
        User-level time reduction:
         baseline:                  3.68%
         baseline + ext-partition: 36.12%
      
      Change-Id: I78c5d9ca216f0fd91f1a360dca2190b11fd54a08
  17. 23 Nov, 2016 1 commit
  18. 16 Nov, 2016 1 commit
  19. 10 Nov, 2016 1 commit
  20. 07 Nov, 2016 1 commit
      New experiment: Perceptual Vector Quantization from Daala · 77bba8d3
      Yushin Cho authored
      PVQ replaces the scalar quantizer and coefficient coding with a new
      design originally developed in Daala. It currently depends on the
      Daala entropy coder although it could be adapted to work with another
      entropy coder if needed:
      ./configure --enable-experimental --enable-daala_ec --enable-pvq
      
      The version of PVQ in this commit is adapted from the following
      revision of Daala:
      https://github.com/xiph/daala/commit/fb51c1ade6a31b668a0157d89de8f0a4493162a8
      
      More information about PVQ:
      - https://people.xiph.org/~jm/daala/pvq_demo/
      - https://jmvalin.ca/papers/spie_pvq.pdf
      
      The following files are copied as-is from Daala with minimal
      adaptations, so we disable clang-format on them to make it
      easier to synchronize the AV1 and Daala codebases in the future:
       av1/common/generic_code.c
       av1/common/generic_code.h
       av1/common/laplace_tables.c
       av1/common/partition.c
       av1/common/partition.h
       av1/common/pvq.c
       av1/common/pvq.h
       av1/common/state.c
       av1/common/state.h
       av1/common/zigzag.h
       av1/common/zigzag16.c
       av1/common/zigzag32.c
       av1/common/zigzag4.c
       av1/common/zigzag64.c
       av1/common/zigzag8.c
       av1/decoder/decint.h
       av1/decoder/generic_decoder.c
       av1/decoder/laplace_decoder.c
       av1/decoder/pvq_decoder.c
       av1/decoder/pvq_decoder.h
       av1/encoder/daala_compat_enc.c
       av1/encoder/encint.h
       av1/encoder/generic_encoder.c
       av1/encoder/laplace_encoder.c
       av1/encoder/pvq_encoder.c
       av1/encoder/pvq_encoder.h
      
      Known issues:
      - Lossless mode is not supported; '--lossless=1' will give the same result as
      '--end-usage=q --cq-level=1'.
      - High bit depth is not supported by PVQ.
      
      Change-Id: I1ae0d6517b87f4c1ccea944b2e12dc906979f25e
  21. 04 Nov, 2016 1 commit
      New experiment: Perceptual Vector Quantization from Daala · 09705fe7
      Yushin Cho authored
      PVQ replaces the scalar quantizer and coefficient coding with a new
      design originally developed in Daala. It currently depends on the
      Daala entropy coder although it could be adapted to work with another
      entropy coder if needed:
      ./configure --enable-experimental --enable-daala_ec --enable-pvq
      
      The version of PVQ in this commit is adapted from the following
      revision of Daala:
      https://github.com/xiph/daala/commit/fb51c1ade6a31b668a0157d89de8f0a4493162a8
      
      More information about PVQ:
      - https://people.xiph.org/~jm/daala/pvq_demo/
      - https://jmvalin.ca/papers/spie_pvq.pdf
      
      The following files are copied as-is from Daala with minimal
      adaptations, so we disable clang-format on them to make it
      easier to synchronize the AV1 and Daala codebases in the future:
       av1/common/generic_code.c
       av1/common/generic_code.h
       av1/common/laplace_tables.c
       av1/common/partition.c
       av1/common/partition.h
       av1/common/pvq.c
       av1/common/pvq.h
       av1/common/state.c
       av1/common/state.h
       av1/common/zigzag.h
       av1/common/zigzag16.c
       av1/common/zigzag32.c
       av1/common/zigzag4.c
       av1/common/zigzag64.c
       av1/common/zigzag8.c
       av1/decoder/decint.h
       av1/decoder/generic_decoder.c
       av1/decoder/laplace_decoder.c
       av1/decoder/pvq_decoder.c
       av1/decoder/pvq_decoder.h
       av1/encoder/daala_compat_enc.c
       av1/encoder/encint.h
       av1/encoder/generic_encoder.c
       av1/encoder/laplace_encoder.c
       av1/encoder/pvq_encoder.c
       av1/encoder/pvq_encoder.h
      
      Known issues:
      - Lossless mode is not supported; '--lossless=1' will give the same result as
      '--end-usage=q --cq-level=1'.
      - High bit depth is not supported by PVQ.
      
      Change-Id: I1ae0d6517b87f4c1ccea944b2e12dc906979f25e
  22. 25 Oct, 2016 1 commit
  23. 21 Oct, 2016 2 commits
  24. 20 Oct, 2016 1 commit
  25. 19 Oct, 2016 1 commit
  26. 18 Oct, 2016 1 commit
  27. 14 Oct, 2016 2 commits
      Use Daala entropy coder to code bits. · 8043cc40
      Nathan E. Egge authored
      When building with --enable-daala_ec, calls to aom_write() and aom_read()
      use the Daala entropy coder to write and read bits.
      When the probability is exactly 0.5 (128), raw bits are used.
      
      ntt-short-1:
      
               MEDIUM (%)  HIGH (%)
          PSNR -0.027556   -0.020114
       PSNRHVS -0.027401   -0.020169
          SSIM -0.027587   -0.020151
      FASTSSIM -0.027592   -0.020102
      
      subset1:
      
               RATE (%)  DSNR (dB)
          PSNR 0.03296   -0.00210
       PSNRHVS 0.03537   -0.00281
          SSIM 0.03299   -0.00161
      FASTSSIM 0.03458   -0.00111
      
      Change-Id: I48ad8eb40fc895d62d6e241ea8abc02820d573f7
      Add Daala entropy coder. · 1078dee5
      Nathan E. Egge authored
      Change-Id: I2849a50163268d58cc5d80aacfec1fd02299ca43
  28. 13 Oct, 2016 2 commits
  29. 12 Oct, 2016 2 commits
      port changes on lpf from libvpx/nextgenv2 · 57ad0a05
      Yaowu Xu authored
      Manually cherry-picked the following commits:
      4b5e462d Upgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2
      3ea537c0 lpf_8_test: remove unneeded function wrapper
      110d3778 remove loopfilter 'count' param TODOs
      9b44d9d0 split vpx_highbd_lpf_horizontal_16 in two
      1b519fb6 split vpx_lpf_horizontal_16 in two
      e7a23d70 vpx_highbd_lpf_horizontal_4: remove unused count param
      51718573 vpx_highbd_lpf_horizontal_8: remove unused count param
      3c1019e4 vpx_highbd_lpf_vertical_4: remove unused count param
      72a9f06a vpx_highbd_lpf_vertical_8: remove unused count param
      b1e97c6a vpx_lpf_horizontal_4: remove unused count param
      ab25e46 Upgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2
      bd5a5bb5 vpx_lpf_horizontal_8: remove unused count param
      109a47b3 vpx_lpf_vertical_4: remove unused count param
      37225744 vpx_lpf_vertical_8: remove unused count param
      47dee375 lpf_8_test: add missing dspr2 tests
      4fec4a8e lpf_8_test: add missing vpx_lpf_horizontal_4 tests
      c3f2c8ad lpf_8_test: add missing vpx_lpf_vertical_4 tests
      45a7b5eb lpf_8_test: simplify function wrapper generation
      
      Change-Id: I0e9212497bbf30de37b19cd2d6ea63b505abe06d
      minor updates · f36d0b46
      Yaowu Xu authored
      1. vp8->aom
      2. removed no-effect statements and spaces
      
      Change-Id: I367d05ff9bf1b9f3c71c517c45d8049d9d4236ec
  30. 11 Oct, 2016 1 commit
  31. 10 Oct, 2016 1 commit
  32. 06 Oct, 2016 1 commit
      Hybrid forward transforms 16x16 AVX2 optimization · e8e8cd8f
      Yi Luo authored
      - Unit tests are added for AVX2 SIMD.
      - Encoder speed improvement:
        AV1 baseline and EXT_TX, three 1080p sequences at bitrates of
        800 Kbps, 2 Mbps, and 6 Mbps on an i7-6700 CPU: average
        user-level time reduction of 3.86%.
      
      Change-Id: Ibbd7837ee3a831c6b1e4e471bf6c8d3fa3a19ff4
  33. 28 Sep, 2016 1 commit
  34. 27 Sep, 2016 1 commit
  35. 19 Sep, 2016 1 commit
      Move ANS to aom_dsp. · 1ac1ae73
      Alex Converse authored
      That's where it lives in aom/master.
      
      Change-Id: I38f405827d9c2d0b06ef5f3bfd7cadc35d5991ef