  1. 02 Dec, 2016 1 commit
    • Enable 2x2 intra prediction · 7833d2bf
      Jingning Han authored
      Bring 2x2 intra prediction online for chroma components.
      
      Change-Id: Ia56af9101b2a977691bca4156a6dcf89e644b4a7
      7833d2bf
  2. 28 Nov, 2016 2 commits
    • SAD avg and 4D avx2 optimization for ext-partition · 9e218747
      Yi Luo authored
      - User level time reduction <1% on i7-6700 cpu
      
      Change-Id: I8f15bde07dddd938df0b065e20ae94109e7b3b5b
      9e218747
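As a rough scalar illustration of what the two kernel families named in the commit above compute, here is a minimal sketch. It is not the libaom code and not the AVX2 version; the function names and the packed layout of `second_pred` are assumptions made for this example. The "avg" variant measures SAD against the rounded average of a reference block and a second predictor, and the "4D" variant evaluates four candidate references against one source block in a single call.

```c
#include <stdint.h>
#include <stdlib.h>

/* SAD of a w x h source block against the rounded average of a reference
 * block and a second predictor. Assumption: second_pred is packed, i.e.
 * its stride equals w. */
static unsigned int sad_avg_sketch(const uint8_t *src, int src_stride,
                                   const uint8_t *ref, int ref_stride,
                                   const uint8_t *second_pred, int w, int h) {
  unsigned int sad = 0;
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      const int avg =
          (ref[r * ref_stride + c] + second_pred[r * w + c] + 1) >> 1;
      sad += (unsigned int)abs(src[r * src_stride + c] - avg);
    }
  }
  return sad;
}

/* "4D" variant: plain SAD of one source block against four candidate
 * reference blocks that share the same stride. */
static void sad_4d_sketch(const uint8_t *src, int src_stride,
                          const uint8_t *const refs[4], int ref_stride,
                          int w, int h, unsigned int out[4]) {
  for (int i = 0; i < 4; ++i) {
    unsigned int sad = 0;
    for (int r = 0; r < h; ++r) {
      for (int c = 0; c < w; ++c) {
        sad += (unsigned int)abs(src[r * src_stride + c] -
                                 refs[i][r * ref_stride + c]);
      }
    }
    out[i] = sad;
  }
}
```

A SIMD version would process many pixels per instruction, but it should return the same sums as these loops.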
    • Add a new intra prediction mode "smooth". · 6be4a54b
      Urvang Joshi authored
      This is added as part of ALT_INTRA experiment.
      
      This uses interpolation between the top row and an estimated bottom
      row, as well as between the left column and an estimated right column,
      to generate the predicted block. The interpolation is done using a
      predefined weight array.
      
      Based on experiments, the currently chosen weight array was created
      to represent a quadratic curve, but can be tuned further if needed.
      
      Improvement from baseline on Derf set:
      ALL Keyframes: 1.279%
      
      Improvement from existing ALT_INTRA:
      ALL Keyframes: 1.146%
      
      Change-Id: I12637fa1b91bd836f1c59b27d6caee2004acbdd4
      6be4a54b
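The interpolation described in the "smooth" commit above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the estimated bottom row is taken to be the bottom-left neighbour repeated, the estimated right column is taken to be the top-right neighbour repeated, and the weights come from a hypothetical quadratic formula (`quad_weight`) rather than the predefined array the commit refers to.

```c
#include <stdint.h>

#define SMOOTH_SCALE 256 /* weights are expressed in 1/256 units (assumed) */

/* Hypothetical quadratic weight: 256 at i = 0, falling towards 0 at i = n. */
static int quad_weight(int i, int n) {
  return SMOOTH_SCALE * (n - i) * (n - i) / (n * n);
}

/* Predict an n x n block from the row above and the column to the left. */
static void smooth_pred_sketch(uint8_t *dst, int stride, int n,
                               const uint8_t *above, const uint8_t *left) {
  const int bottom_est = left[n - 1]; /* assumed estimate of the bottom row */
  const int right_est = above[n - 1]; /* assumed estimate of the right column */
  for (int r = 0; r < n; ++r) {
    const int wr = quad_weight(r, n);
    for (int c = 0; c < n; ++c) {
      const int wc = quad_weight(c, n);
      /* Vertical interpolation: top row vs. estimated bottom row. */
      const int vert = wr * above[c] + (SMOOTH_SCALE - wr) * bottom_est;
      /* Horizontal interpolation: left column vs. estimated right column. */
      const int horz = wc * left[r] + (SMOOTH_SCALE - wc) * right_est;
      /* Average the two interpolations with rounding (divide by 512). */
      dst[r * stride + c] = (uint8_t)((vert + horz + SMOOTH_SCALE) >> 9);
    }
  }
}
```

Because each output is a convex combination of neighbouring pixels, the prediction stays within the range of its inputs and no clipping is needed in this sketch.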
  3. 21 Nov, 2016 1 commit
  4. 10 Nov, 2016 1 commit
  5. 09 Nov, 2016 1 commit
  6. 07 Nov, 2016 1 commit
    • New experiment: Perceptual Vector Quantization from Daala · 77bba8d3
      Yushin Cho authored
      PVQ replaces the scalar quantizer and coefficient coding with a new
      design originally developed in Daala. It currently depends on the
      Daala entropy coder although it could be adapted to work with another
      entropy coder if needed:
      ./configure --enable-experimental --enable-daala_ec --enable-pvq
      
      The version of PVQ in this commit is adapted from the following
      revision of Daala:
      https://github.com/xiph/daala/commit/fb51c1ade6a31b668a0157d89de8f0a4493162a8
      
      More information about PVQ:
      - https://people.xiph.org/~jm/daala/pvq_demo/
      - https://jmvalin.ca/papers/spie_pvq.pdf
      
      The following files are copied as-is from Daala with minimal
      adaptations. We therefore disable clang-format on those files
      to make it easier to synchronize the AV1 and Daala codebases in the future:
       av1/common/generic_code.c
       av1/common/generic_code.h
       av1/common/laplace_tables.c
       av1/common/partition.c
       av1/common/partition.h
       av1/common/pvq.c
       av1/common/pvq.h
       av1/common/state.c
       av1/common/state.h
       av1/common/zigzag.h
       av1/common/zigzag16.c
       av1/common/zigzag32.c
       av1/common/zigzag4.c
       av1/common/zigzag64.c
       av1/common/zigzag8.c
       av1/decoder/decint.h
       av1/decoder/generic_decoder.c
       av1/decoder/laplace_decoder.c
       av1/decoder/pvq_decoder.c
       av1/decoder/pvq_decoder.h
       av1/encoder/daala_compat_enc.c
       av1/encoder/encint.h
       av1/encoder/generic_encoder.c
       av1/encoder/laplace_encoder.c
       av1/encoder/pvq_encoder.c
       av1/encoder/pvq_encoder.h
      
      Known issues:
      - Lossless mode is not supported; '--lossless=1' gives the same result as
        '--end-usage=q --cq-level=1'.
      - High bit depth is not supported by PVQ.
      
      Change-Id: I1ae0d6517b87f4c1ccea944b2e12dc906979f25e
      77bba8d3
  7. 04 Nov, 2016 1 commit
    • New experiment: Perceptual Vector Quantization from Daala · 09705fe7
      Yushin Cho authored
      PVQ replaces the scalar quantizer and coefficient coding with a new
      design originally developed in Daala. It currently depends on the
      Daala entropy coder although it could be adapted to work with another
      entropy coder if needed:
      ./configure --enable-experimental --enable-daala_ec --enable-pvq
      
      The version of PVQ in this commit is adapted from the following
      revision of Daala:
      https://github.com/xiph/daala/commit/fb51c1ade6a31b668a0157d89de8f0a4493162a8
      
      More information about PVQ:
      - https://people.xiph.org/~jm/daala/pvq_demo/
      - https://jmvalin.ca/papers/spie_pvq.pdf
      
      The following files are copied as-is from Daala with minimal
      adaptations. We therefore disable clang-format on those files
      to make it easier to synchronize the AV1 and Daala codebases in the future:
       av1/common/generic_code.c
       av1/common/generic_code.h
       av1/common/laplace_tables.c
       av1/common/partition.c
       av1/common/partition.h
       av1/common/pvq.c
       av1/common/pvq.h
       av1/common/state.c
       av1/common/state.h
       av1/common/zigzag.h
       av1/common/zigzag16.c
       av1/common/zigzag32.c
       av1/common/zigzag4.c
       av1/common/zigzag64.c
       av1/common/zigzag8.c
       av1/decoder/decint.h
       av1/decoder/generic_decoder.c
       av1/decoder/laplace_decoder.c
       av1/decoder/pvq_decoder.c
       av1/decoder/pvq_decoder.h
       av1/encoder/daala_compat_enc.c
       av1/encoder/encint.h
       av1/encoder/generic_encoder.c
       av1/encoder/laplace_encoder.c
       av1/encoder/pvq_encoder.c
       av1/encoder/pvq_encoder.h
      
      Known issues:
      - Lossless mode is not supported; '--lossless=1' gives the same result as
        '--end-usage=q --cq-level=1'.
      - High bit depth is not supported by PVQ.
      
      Change-Id: I1ae0d6517b87f4c1ccea944b2e12dc906979f25e
      09705fe7
  8. 28 Oct, 2016 1 commit
  9. 26 Oct, 2016 1 commit
  10. 19 Oct, 2016 1 commit
  11. 14 Oct, 2016 1 commit
  12. 13 Oct, 2016 2 commits
    • Renamings for OBMC experiment · cb60b185
      Yue Chen authored
      To get ready for pulling AV1 to nextgenv2.
      Replace the experimental flag with MOTION_VAR. Rename major variables.
      
      Change-Id: If6cf4f37b9319c46d8f90df551cc7295d66ca205
      cb60b185
    • Sync 2x2 intra predictors · e3954d83
      Jingning Han authored
      Add 2x2 DC, V, H, TM intra predictors.
      
      Change-Id: I2a614adde553f821c45bc5a9bf09800a9f0aaa26
      e3954d83
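For readers unfamiliar with the four predictor types named in the commit above, here is a minimal scalar sketch for a 2x2 block. The function names and signatures are hypothetical and chosen for this example; the formulas are the conventional DC, vertical, horizontal and TM (TrueMotion) definitions, which may differ in detail from the synced code.

```c
#include <stdint.h>

static uint8_t clip_u8(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* DC: every pixel is the rounded mean of the four neighbouring pixels. */
static void dc_pred_2x2(uint8_t dst[4], const uint8_t above[2],
                        const uint8_t left[2]) {
  const int dc = (above[0] + above[1] + left[0] + left[1] + 2) >> 2;
  for (int i = 0; i < 4; ++i) dst[i] = (uint8_t)dc;
}

/* V: copy the row above into every row. */
static void v_pred_2x2(uint8_t dst[4], const uint8_t above[2]) {
  for (int r = 0; r < 2; ++r)
    for (int c = 0; c < 2; ++c) dst[r * 2 + c] = above[c];
}

/* H: copy the left column into every column. */
static void h_pred_2x2(uint8_t dst[4], const uint8_t left[2]) {
  for (int r = 0; r < 2; ++r)
    for (int c = 0; c < 2; ++c) dst[r * 2 + c] = left[r];
}

/* TM: left + above - top_left, clipped to the 8-bit pixel range. */
static void tm_pred_2x2(uint8_t dst[4], const uint8_t above[2],
                        const uint8_t left[2], uint8_t top_left) {
  for (int r = 0; r < 2; ++r)
    for (int c = 0; c < 2; ++c)
      dst[r * 2 + c] = clip_u8(left[r] + above[c] - top_left);
}
```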
  13. 12 Oct, 2016 3 commits
    • Hybrid forward transform 32x32 AVX2 optimization · fed8e1c0
      Yi Luo authored
      - av1_fht32x32 AVX2 function level time reduction ~89% compared to C.
      
      - av1_fht32x32_avx2() on DCT_DCT improves 42.62% over aom_fdct32x32_avx2()
        But function replacement must go with the corresponding inverse txfm.
      
      - No obvious user level time reduction due to 32x32 TX_TYPE selection.
      
      - Zero high 128b YMM to avoid AVX-SSE transition penalties
        (fix 16x16 case).
      
      - Added 32x32 AVX2 unit tests to verify bitexact.
      
      - AVX2 optimization summary:
        On CPU i7-6700, based on 16x16/32x32 fwd txfm optimization results:
        C to AVX2: function level time reduction, ~86-89%.
        SSE2 to AVX2: function level time reduction, ~51%.
      
      Change-Id: Idd0cd8bf066a61c7117140ef15ab6c1f8eb4b036
      fed8e1c0
    • port changes on lpf from libvpx/nextgenv2 · 57ad0a05
      Yaowu Xu authored
      Manually cherry-picked the following commits:
      4b5e462d Upgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2
      3ea537c0 lpf_8_test: remove unneeded function wrapper
      110d3778 remove loopfilter 'count' param TODOs
      9b44d9d0 split vpx_highbd_lpf_horizontal_16 in two
      1b519fb6 split vpx_lpf_horizontal_16 in two
      e7a23d70 vpx_highbd_lpf_horizontal_4: remove unused count param
      51718573 vpx_highbd_lpf_horizontal_8: remove unused count param
      3c1019e4 vpx_highbd_lpf_vertical_4: remove unused count param
      72a9f06a vpx_highbd_lpf_vertical_8: remove unused count param
      b1e97c6a vpx_lpf_horizontal_4: remove unused count param
       ab25e46pgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2
      bd5a5bb5 vpx_lpf_horizontal_8: remove unused count param
      109a47b3 vpx_lpf_vertical_4: remove unused count param
      37225744 vpx_lpf_vertical_8: remove unused count param
      47dee375 lpf_8_test: add missing dspr2 tests
      4fec4a8e lpf_8_test: add missing vpx_lpf_horizontal_4 tests
      c3f2c8ad lpf_8_test: add missing vpx_lpf_vertical_4 tests
      45a7b5eb lpf_8_test: simplify function wrapper generation
      
      Change-Id: I0e9212497bbf30de37b19cd2d6ea63b505abe06d
      57ad0a05
    • minor updates · f36d0b46
      Yaowu Xu authored
      1. vp8->aom
      2. removed no-effect statements and spaces
      
      Change-Id: I367d05ff9bf1b9f3c71c517c45d8049d9d4236ec
      f36d0b46
  14. 10 Oct, 2016 5 commits
  15. 06 Oct, 2016 1 commit
    • Hybrid forward transforms 16x16 AVX2 optimization · e8e8cd8f
      Yi Luo authored
      - Unit tests are added for AVX2 SIMD.
      - Encoder speed improvement:
        AV1 baseline and EXT_TX, three 1080p sequences at bitrate:
        800 Kbps, 2 Mbps, 6 Mbps, on i7-6700 CPU, average
        user level time reduction: 3.86%.
      
      Change-Id: Ibbd7837ee3a831c6b1e4e471bf6c8d3fa3a19ff4
      e8e8cd8f
  16. 28 Sep, 2016 1 commit
  17. 26 Sep, 2016 2 commits
  18. 22 Sep, 2016 1 commit
  19. 21 Sep, 2016 1 commit
    • Work around to avoid mismatch on adaptive scan experiment · d58f39d5
      Angie Chiang authored
      1) Turn off SIMD quantizer in adapt_scan experiment because the iscan is
      not 16-byte aligned now.
      
      2) Turn off eob-specific dqcoeff initialization in
      inverse_transform_block_inter and inverse_transform_block_intra
      
      3) Turn off transform optimization for special eob because it is not
      compatible with adapt_scan experiment
      
      Performance:
              PSNR    BDRate
      lowres  1.2%    1.068%
      midres  0.897%  0.769%
      hdres   0.945%  0.724%
      
      Change-Id: I197c19ba536761c334790a040ef44534c7cf21b5
      d58f39d5
  20. 16 Sep, 2016 1 commit
    • Extend CLPF to chroma. · a25c6c3b
      Steinar Midtskogen authored
      Objective quality impact (low latency):
      
      PSNR YCbCr:      0.13%     -1.37%     -1.79%
         PSNRHVS:      0.03%
            SSIM:      0.24%
          MSSSIM:      0.10%
       CIEDE2000:     -0.83%
      
      Change-Id: I8ddf0def569286775f0f9d4d4005932766a7fc27
      a25c6c3b
  21. 13 Sep, 2016 1 commit
  22. 08 Sep, 2016 1 commit
    • Reduce memory footprint for CLPF decoding. · eb5794da
      Steinar Midtskogen authored
      Instead of having CLPF write to an entire new frame and
      copy the result back into the original frame, make the
      filter able to work in-place by keeping a buffer of size
      frame_width*filter_block_size and delay the write-back
      by one filter_block_size row.
      
      This reduces the cycles spent in the filter to ~75%.
      
      Change-Id: I78ca74380c45492daa8935d08d766851edb5fbc1
      eb5794da
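The delayed write-back idea in the commit above can be illustrated with a small sketch. Everything here is an assumption made for the example: `filter_px` is a stand-in vertical smoothing kernel (not the CLPF kernel), the function names are hypothetical, and the sketch keeps two block-row buffers for simplicity where the commit describes a single buffer of size frame_width*filter_block_size. The point it shows is that results for one block row are only written into the frame after the next block row has been filtered, so the filter always reads unfiltered neighbours.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in filter (NOT CLPF): vertical [1 2 1]/4, edge rows replicated. */
static uint8_t filter_px(const uint8_t *frame, int w, int h, int x, int y) {
  const int up = frame[(y > 0 ? y - 1 : y) * w + x];
  const int dn = frame[(y < h - 1 ? y + 1 : y) * w + x];
  return (uint8_t)((up + 2 * frame[y * w + x] + dn + 2) >> 2);
}

/* Baseline: filter into a full-size temporary frame, then copy back. */
static void filter_with_copy(uint8_t *frame, int w, int h) {
  uint8_t *tmp = malloc((size_t)w * h);
  if (!tmp) return;
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x) tmp[y * w + x] = filter_px(frame, w, h, x, y);
  memcpy(frame, tmp, (size_t)w * h);
  free(tmp);
}

/* In-place: filter one block row (bs rows) at a time into a small buffer
 * and delay the write-back of each block row until the next one is done. */
static void filter_in_place(uint8_t *frame, int w, int h, int bs) {
  uint8_t *buf = malloc((size_t)2 * w * bs);
  if (!buf) return;
  uint8_t *cur = buf, *prev = buf + (size_t)w * bs;
  int prev_y0 = 0, prev_rows = 0;
  for (int y0 = 0; y0 < h; y0 += bs) {
    const int rows = (h - y0 < bs) ? h - y0 : bs;
    for (int r = 0; r < rows; ++r)
      for (int x = 0; x < w; ++x)
        cur[r * w + x] = filter_px(frame, w, h, x, y0 + r);
    /* The previous block row is no longer needed as input: write it back. */
    if (prev_rows) memcpy(frame + prev_y0 * w, prev, (size_t)prev_rows * w);
    uint8_t *t = prev;
    prev = cur;
    cur = t;
    prev_y0 = y0;
    prev_rows = rows;
  }
  if (prev_rows) memcpy(frame + prev_y0 * w, prev, (size_t)prev_rows * w);
  free(buf);
}
```

Both functions should produce identical frames; the in-place version only needs scratch memory proportional to a block row rather than to the whole frame.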
  23. 01 Sep, 2016 3 commits
    • Add ALT_INTRA experiment. · 340593e5
      Urvang Joshi authored
      When the experiment is ON, we use Paeth predictor instead of TM
      predictor.
      
      For the derf set, this gives about 0.09% improvement overall, and 0.55%
      improvement if all frames are forced to be intra-only.
      
      If the EXT_INTRA experiment is also on, the overall improvement is
      0.056%, and the improvement if all frames are forced to be intra-only
      is 0.465%.
      
      Change-Id: Id74e107ede70a8d2107fa14fcb3f44b23a437274
      340593e5
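For reference, the Paeth predictor mentioned in the ALT_INTRA commit above is the same idea used by the PNG Paeth filter: form the TM-style estimate left + top - top_left, then output whichever of the three neighbours is closest to it. The sketch below is an illustration, not the code from the commit; the function names and the tie-break order (left, then top, then top-left, as in PNG) are assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Pick the neighbour closest to base = left + top - top_left. */
static uint8_t paeth_px(int left, int top, int top_left) {
  const int base = left + top - top_left;
  const int p_left = abs(base - left);
  const int p_top = abs(base - top);
  const int p_top_left = abs(base - top_left);
  if (p_left <= p_top && p_left <= p_top_left) return (uint8_t)left;
  if (p_top <= p_top_left) return (uint8_t)top;
  return (uint8_t)top_left;
}

/* Predict an n x n block from the row above, the left column and the
 * top-left corner pixel. */
static void paeth_pred_sketch(uint8_t *dst, int stride, int n,
                              const uint8_t *above, const uint8_t *left,
                              uint8_t top_left) {
  for (int r = 0; r < n; ++r)
    for (int c = 0; c < n; ++c)
      dst[r * stride + c] = paeth_px(left[r], above[c], top_left);
}
```

Unlike TM, which outputs the clipped estimate itself, this predictor always outputs one of the actual neighbouring pixel values, so no clipping is required.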
    • Added generic SIMD support for CLPF. · b87cc923
      Steinar Midtskogen authored
      Change-Id: Ie03f9a5b0a4c708a586532198d755a1e7509f149
      b87cc923
    • Port renaming changes from AOMedia · f883b42c
      Yaowu Xu authored
      Cherry-Picked the following commits:
      0defd8f2 Changed "WebM" to "AOMedia" & "webm" to "aomedia"
      54e66767 Replace "VPx" by "AVx"
      5082a369 Change "Vpx" to "Avx"
      7df44f17 Replace "Vp9" w/ "Av1"
      967f722f Remove kVp9CodecId
      828f30ce Change "Vp8" to "AOM"
      030b5ffc AUTHORS regenerated
      2524caee Add ref-mv experimental flag
      016762be Change copyright notice to AOMedia form
      81e55269 Replace vp9 w/ av1
      9b94565b Add missing files
      fa8ca9f2 Change "vp9" to "av1"
      ec838b76 Convert "vp8" to "aom"
      80edfa01 Change "VP9" to "AV1"
      d1a11fb9 Change "vp8" to "aom"
      7b582513 Point to WebM test data
      dd1a5c8d Replace "VP8" with "AOM"
      ff00fc0f Change "VPX" to "AOM"
      01dee0bb Change "vp10" to "av1" in source code
      cebe6f0c Convert "vpx" to "aom"
      17b05679 rename vp10*.mk to av1_*.mk
      fe5f8a8a rename files vp10_* to av1_*
      
      Change-Id: I6fc3d18eb11fc171e46140c836ad5339cf6c9419
      f883b42c
  24. 22 Aug, 2016 2 commits
  25. 01 Aug, 2016 1 commit
    • Add weighted motion search for obmc predictor · 72d3ba8a
      Yue Chen authored
      Also port SIMD optimization of weighted sad/variance functions to
      av1.
      Coding gain improvement: 0.339/0.413/0.328 (lowres/midres/hdres)
      Current coding gain: 2.437/2.428/2.294
      Encoding time overhead: 17% (soccer_cif), 30% (ped_1080p25), up from
      12% and 18% without motion search
      
      Change-Id: I101d6ce729f769853756edc8ced6f3a2b8d8f824
      72d3ba8a
  26. 26 Jul, 2016 1 commit
    • Port SIMD optimization for obmc blending functions to av1 · 2478bed5
      Yue Chen authored
      SIMD optimization for 1d blending functions in obmc mode, and some
      code refactoring and cleanup.
      
      (ped_1080p25.y4m, 150 frames, 2000 tb)
      Encoding time overhead: +18.8% -> +18.1%
      Decoding time overhead: +21.3% -> +8.7%
      
      Change-Id: I9d856c32136e7e0e6e24ab5520ef901d7b1ee9c8
      2478bed5
  27. 28 Jun, 2016 1 commit
  28. 21 Jun, 2016 1 commit
    • Do sub-pixel motion search in up-sampled reference frames · e02752b0
      Yunqing Wang authored
      Up-sample the reference frames by a factor of 8 in each dimension using
      the 8-tap interpolation filter. In sub-pixel motion search, use the
      up-sampled reference frames to find the best matching blocks and so
      increase the motion search precision. This is enabled as a speed feature
      for speed 0 and speed 1, and it is an encoder-only improvement.
      
      Overall PSNR: -1.456%(lowres); -0.430(hdres)
      SSIM: -1.687(lowres); -0.551(hdres)
      
      Change-Id: I2085d87e41f6b91d0221dc11dc7ffd003075ba2e
      e02752b0
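To make the idea in the commit above concrete, the following sketch searches 1/8-pel offsets directly in a reference frame that has already been up-sampled by 8 in each dimension. It is a simplified illustration under assumptions: the names `SubpelMv`, `sad_upsampled` and `subpel_search_sketch` are hypothetical, the up-sampling step itself (8-tap filtering) is left out, and the search is exhaustive, whereas a real encoder would more likely refine the position iteratively.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define UP 8 /* up-sampling factor per dimension, as in the commit message */

typedef struct {
  int dx, dy;       /* best offset in 1/8-pel units */
  unsigned int sad; /* SAD at that offset */
} SubpelMv;

/* SAD between a w x h source block and the up-sampled reference, sampled
 * every UP-th pixel starting at (x_up, y_up) in up-sampled coordinates. */
static unsigned int sad_upsampled(const uint8_t *src, int src_stride,
                                  const uint8_t *up, int up_stride, int x_up,
                                  int y_up, int w, int h) {
  unsigned int sad = 0;
  for (int r = 0; r < h; ++r)
    for (int c = 0; c < w; ++c)
      sad += (unsigned int)abs(
          src[r * src_stride + c] -
          up[(y_up + r * UP) * up_stride + (x_up + c * UP)]);
  return sad;
}

/* Try every 1/8-pel offset within +/-(UP - 1) of integer position (x, y).
 * The caller must keep the whole search window inside the up-sampled frame. */
static SubpelMv subpel_search_sketch(const uint8_t *src, int src_stride,
                                     const uint8_t *up, int up_stride, int x,
                                     int y, int w, int h) {
  SubpelMv best = { 0, 0, UINT_MAX };
  for (int dy = -(UP - 1); dy <= UP - 1; ++dy) {
    for (int dx = -(UP - 1); dx <= UP - 1; ++dx) {
      const unsigned int sad = sad_upsampled(src, src_stride, up, up_stride,
                                             x * UP + dx, y * UP + dy, w, h);
      if (sad < best.sad) {
        best.dx = dx;
        best.dy = dy;
        best.sad = sad;
      }
    }
  }
  return best;
}
```

Since the interpolated samples are precomputed once per reference frame, each candidate offset costs only a strided SAD rather than a fresh interpolation of the block.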