1. 10 Nov, 2016 1 commit
  2. 09 Nov, 2016 1 commit
  3. 07 Nov, 2016 1 commit
    • New experiment: Perceptual Vector Quantization from Daala · 77bba8d3
      Yushin Cho authored
      PVQ replaces the scalar quantizer and coefficient coding with a new
      design originally developed in Daala. It currently depends on the
      Daala entropy coder although it could be adapted to work with another
      entropy coder if needed:
      ./configure --enable-experimental --enable-daala_ec --enable-pvq
      
      The version of PVQ in this commit is adapted from the following
      revision of Daala:
      https://github.com/xiph/daala/commit/fb51c1ade6a31b668a0157d89de8f0a4493162a8
      
      More information about PVQ:
      - https://people.xiph.org/~jm/daala/pvq_demo/
      - https://jmvalin.ca/papers/spie_pvq.pdf
      
      The following files are copied as-is from Daala with minimal
      adaptations, therefore we disable clang-format on those files
      to make it easier to synchronize the AV1 and Daala codebases in the future:
       av1/common/generic_code.c
       av1/common/generic_code.h
       av1/common/laplace_tables.c
       av1/common/partition.c
       av1/common/partition.h
       av1/common/pvq.c
       av1/common/pvq.h
       av1/common/state.c
       av1/common/state.h
       av1/common/zigzag.h
       av1/common/zigzag16.c
       av1/common/zigzag32.c
       av1/common/zigzag4.c
       av1/common/zigzag64.c
       av1/common/zigzag8.c
       av1/decoder/decint.h
       av1/decoder/generic_decoder.c
       av1/decoder/laplace_decoder.c
       av1/decoder/pvq_decoder.c
       av1/decoder/pvq_decoder.h
       av1/encoder/daala_compat_enc.c
       av1/encoder/encint.h
       av1/encoder/generic_encoder.c
       av1/encoder/laplace_encoder.c
       av1/encoder/pvq_encoder.c
       av1/encoder/pvq_encoder.h
      
      Known issues:
      - Lossless mode is not supported; '--lossless=1' gives the same result as
      '--end-usage=q --cq-level=1'.
      - High bit depth is not supported by PVQ.
      
      Change-Id: I1ae0d6517b87f4c1ccea944b2e12dc906979f25e
      77bba8d3
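The commit message above says PVQ replaces the scalar quantizer but does not show what the quantizer does. The following is a minimal sketch of the general gain-shape idea behind pyramid vector quantization, assuming a plain greedy pulse search. It is not the Daala or AV1 code: the function names `pvq_quantize` and `pvq_reconstruct` are hypothetical, and prediction, gain quantization, and entropy coding are all left out.

```python
import math


def pvq_quantize(x, k):
    """Split x into a gain and a shape made of k signed unit pulses.

    Illustrative greedy search only; names and structure are assumptions,
    not taken from the Daala or AV1 sources.
    """
    gain = math.sqrt(sum(v * v for v in x))
    if gain == 0.0 or k == 0:
        return gain, [0] * len(x)
    mags = [abs(v) for v in x]
    y = [0] * len(x)
    xy = 0.0  # running correlation <|x|, y>
    yy = 0.0  # running energy <y, y>
    for _ in range(k):
        best_i, best_score = 0, -1.0
        for i, m in enumerate(mags):
            # Effect of placing one more pulse at position i.
            new_xy = xy + m
            new_yy = yy + 2 * y[i] + 1
            score = new_xy * new_xy / new_yy
            if score > best_score:
                best_i, best_score = i, score
        xy += mags[best_i]
        yy += 2 * y[best_i] + 1
        y[best_i] += 1
    # Restore the signs of the input on the chosen pulses.
    signed = [-p if v < 0 else p for p, v in zip(y, x)]
    return gain, signed


def pvq_reconstruct(gain, y):
    """Scale the normalized pulse vector back up by the gain."""
    norm = math.sqrt(sum(p * p for p in y))
    if norm == 0.0:
        return [0.0] * len(y)
    return [gain * p / norm for p in y]
```

The codeword always satisfies sum(|y_i|) = k, which is what places it on the "pyramid"; the gain is carried separately from the shape.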
  4. 04 Nov, 2016 1 commit
    • New experiment: Perceptual Vector Quantization from Daala · 09705fe7
      Yushin Cho authored
      PVQ replaces the scalar quantizer and coefficient coding with a new
      design originally developed in Daala. It currently depends on the
      Daala entropy coder although it could be adapted to work with another
      entropy coder if needed:
      ./configure --enable-experimental --enable-daala_ec --enable-pvq
      
      The version of PVQ in this commit is adapted from the following
      revision of Daala:
      https://github.com/xiph/daala/commit/fb51c1ade6a31b668a0157d89de8f0a4493162a8
      
      More information about PVQ:
      - https://people.xiph.org/~jm/daala/pvq_demo/
      - https://jmvalin.ca/papers/spie_pvq.pdf
      
      The following files are copied as-is from Daala with minimal
      adaptations, therefore we disable clang-format on those files
      to make it easier to synchronize the AV1 and Daala codebases in the future:
       av1/common/generic_code.c
       av1/common/generic_code.h
       av1/common/laplace_tables.c
       av1/common/partition.c
       av1/common/partition.h
       av1/common/pvq.c
       av1/common/pvq.h
       av1/common/state.c
       av1/common/state.h
       av1/common/zigzag.h
       av1/common/zigzag16.c
       av1/common/zigzag32.c
       av1/common/zigzag4.c
       av1/common/zigzag64.c
       av1/common/zigzag8.c
       av1/decoder/decint.h
       av1/decoder/generic_decoder.c
       av1/decoder/laplace_decoder.c
       av1/decoder/pvq_decoder.c
       av1/decoder/pvq_decoder.h
       av1/encoder/daala_compat_enc.c
       av1/encoder/encint.h
       av1/encoder/generic_encoder.c
       av1/encoder/laplace_encoder.c
       av1/encoder/pvq_encoder.c
       av1/encoder/pvq_encoder.h
      
      Known issues:
      - Lossless mode is not supported; '--lossless=1' gives the same result as
      '--end-usage=q --cq-level=1'.
      - High bit depth is not supported by PVQ.
      
      Change-Id: I1ae0d6517b87f4c1ccea944b2e12dc906979f25e
      09705fe7
  5. 28 Oct, 2016 1 commit
  6. 26 Oct, 2016 1 commit
  7. 19 Oct, 2016 1 commit
  8. 14 Oct, 2016 1 commit
  9. 13 Oct, 2016 2 commits
    • Renamings for OBMC experiment · cb60b185
      Yue Chen authored
      To get ready for pulling AV1 into nextgenv2, replace the experimental
      flag with MOTION_VAR and rename major variables.
      
      Change-Id: If6cf4f37b9319c46d8f90df551cc7295d66ca205
      cb60b185
    • Sync 2x2 intra predictors · e3954d83
      Jingning Han authored
      Add 2x2 DC, V, H, TM intra predictors.
      
      Change-Id: I2a614adde553f821c45bc5a9bf09800a9f0aaa26
      e3954d83
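For readers unfamiliar with the four predictor names in the commit above, here is a minimal sketch of DC, V, H and TM prediction for a 2x2 block. It assumes 8-bit samples and the usual textbook definitions of these modes; the function names are hypothetical and this is not the code added by the commit.

```python
def clip_pixel(v):
    """Clamp to the 8-bit sample range (assumed bit depth)."""
    return max(0, min(255, v))


def dc_pred_2x2(above, left):
    # Rounded average of the two above and two left neighbours.
    dc = (above[0] + above[1] + left[0] + left[1] + 2) >> 2
    return [[dc, dc], [dc, dc]]


def v_pred_2x2(above, left):
    # Each row repeats the row of pixels above the block.
    return [list(above[:2]) for _ in range(2)]


def h_pred_2x2(above, left):
    # Each row is filled with the pixel to its left.
    return [[left[r]] * 2 for r in range(2)]


def tm_pred_2x2(above, left, top_left):
    # "True motion": left + above - top_left, clipped to the pixel range.
    return [[clip_pixel(left[r] + above[c] - top_left) for c in range(2)]
            for r in range(2)]
```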
  10. 12 Oct, 2016 3 commits
    • Hybrid forward transform 32x32 AVX2 optimization · fed8e1c0
      Yi Luo authored
      - av1_fht32x32 AVX2: function-level time reduction of ~89% compared to C.
      
      - av1_fht32x32_avx2() on DCT_DCT improves 42.62% over aom_fdct32x32_avx2(),
        but the function replacement must go with the corresponding inverse txfm.
      
      - No obvious user-level time reduction due to 32x32 TX_TYPE selection.
      
      - Zero the high 128 bits of the YMM registers to avoid AVX-SSE transition
        penalties (fix 16x16 case).
      
      - Added 32x32 AVX2 unit tests to verify bit-exactness.
      
      - AVX2 optimization summary:
        On CPU i7-6700, based on 16x16/32x32 fwd txfm optimization results:
        C to AVX2: function level time reduction, ~86-89%.
        SSE2 to AVX2: function level time reduction, ~51%.
      
      Change-Id: Idd0cd8bf066a61c7117140ef15ab6c1f8eb4b036
      fed8e1c0
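The figures in the commit above are time reductions, which are easy to misread as speedup factors. A small sketch of the conversion, assuming "time reduction" means (old - new) / old; the helper name is hypothetical.

```python
def speedup_from_reduction(reduction_percent):
    """Convert a time reduction in percent into a speedup factor.

    Assumes reduction = (old_time - new_time) / old_time.
    """
    return 1.0 / (1.0 - reduction_percent / 100.0)
```

Under that assumption, a ~86-89% reduction corresponds to roughly a 7x to 9x speedup over C, and a ~51% reduction to roughly 2x over SSE2.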
    • port changes on lpf from libvpx/nextgenv2 · 57ad0a05
      Yaowu Xu authored
      Manually cherry-picked the following commits:
      4b5e462d Upgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2
      3ea537c0 lpf_8_test: remove unneeded function wrapper
      110d3778 remove loopfilter 'count' param TODOs
      9b44d9d0 split vpx_highbd_lpf_horizontal_16 in two
      1b519fb6 split vpx_lpf_horizontal_16 in two
      e7a23d70 vpx_highbd_lpf_horizontal_4: remove unused count param
      51718573 vpx_highbd_lpf_horizontal_8: remove unused count param
      3c1019e4 vpx_highbd_lpf_vertical_4: remove unused count param
      72a9f06a vpx_highbd_lpf_vertical_8: remove unused count param
      b1e97c6a vpx_lpf_horizontal_4: remove unused count param
       ab25e46pgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2
      bd5a5bb5 vpx_lpf_horizontal_8: remove unused count param
      109a47b3 vpx_lpf_vertical_4: remove unused count param
      37225744 vpx_lpf_vertical_8: remove unused count param
      47dee375 lpf_8_test: add missing dspr2 tests
      4fec4a8e lpf_8_test: add missing vpx_lpf_horizontal_4 tests
      c3f2c8ad lpf_8_test: add missing vpx_lpf_vertical_4 tests
      45a7b5eb lpf_8_test: simplify function wrapper generation
      
      Change-Id: I0e9212497bbf30de37b19cd2d6ea63b505abe06d
      57ad0a05
    • minor updates · f36d0b46
      Yaowu Xu authored
      1. vp8->aom
      2. removed no-effect statements and spaces
      
      Change-Id: I367d05ff9bf1b9f3c71c517c45d8049d9d4236ec
      f36d0b46
  11. 10 Oct, 2016 5 commits
  12. 06 Oct, 2016 1 commit
    • Hybrid forward transforms 16x16 AVX2 optimization · e8e8cd8f
      Yi Luo authored
      - Unit tests are added for AVX2 SIMD.
      - Encoder speed improvement:
        AV1 baseline and EXT_TX, three 1080p sequences at bitrate:
        800 Kbps, 2 Mbps, 6 Mbps, on i7-6700 CPU, average
        user level time reduction: 3.86%.
      
      Change-Id: Ibbd7837ee3a831c6b1e4e471bf6c8d3fa3a19ff4
      e8e8cd8f
  13. 28 Sep, 2016 1 commit
  14. 26 Sep, 2016 2 commits
  15. 22 Sep, 2016 1 commit
  16. 21 Sep, 2016 1 commit
    • Work around to avoid mismatch on adaptive scan experiment · d58f39d5
      Angie Chiang authored
      1) Turn off SIMD quantizer in adapt_scan experiment because the iscan is
      not 16-byte aligned now.
      
      2) Turn off eob-specific dqcoeff initialization in
      inverse_transform_block_inter and inverse_transform_block_intra
      
      3) Turn off transform optimization for special eob because it is not
      compatible with adapt_scan experiment
      
      Performance:
              PSNR    BDRate
      lowres  1.2%    1.068%
      midres  0.897%  0.769%
      hdres   0.945%  0.724%
      
      Change-Id: I197c19ba536761c334790a040ef44534c7cf21b5
      d58f39d5
  17. 16 Sep, 2016 1 commit
    • Extend CLPF to chroma. · a25c6c3b
      Steinar Midtskogen authored
      Objective quality impact (low latency):
      
      PSNR YCbCr:      0.13%     -1.37%     -1.79%
         PSNRHVS:      0.03%
            SSIM:      0.24%
          MSSSIM:      0.10%
       CIEDE2000:     -0.83%
      
      Change-Id: I8ddf0def569286775f0f9d4d4005932766a7fc27
      a25c6c3b
  18. 13 Sep, 2016 1 commit
  19. 08 Sep, 2016 1 commit
    • Reduce memory footprint for CLPF decoding. · eb5794da
      Steinar Midtskogen authored
      Instead of having CLPF write to an entire new frame and
      copy the result back into the original frame, make the
      filter able to work in-place by keeping a buffer of size
      frame_width*filter_block_size and delay the write-back
      by one filter_block_size row.
      
      This reduces the cycles spent in the filter to ~75%.
      
      Change-Id: I78ca74380c45492daa8935d08d766851edb5fbc1
      eb5794da
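The delayed write-back described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the CLPF implementation: `smooth_pixel` is a stand-in 5-tap cross filter rather than the real CLPF kernel, the delay is applied per row instead of per filter block, and all names are hypothetical.

```python
def smooth_pixel(frame, r, c):
    """Stand-in cross-shaped smoothing filter (NOT the real CLPF kernel)."""
    h, w = len(frame), len(frame[0])
    up = frame[max(r - 1, 0)][c]
    down = frame[min(r + 1, h - 1)][c]
    left = frame[r][max(c - 1, 0)]
    right = frame[r][min(c + 1, w - 1)]
    return (4 * frame[r][c] + up + down + left + right + 4) >> 3


def filter_copy(frame):
    """Reference version: write every result into a whole new frame."""
    h, w = len(frame), len(frame[0])
    return [[smooth_pixel(frame, r, c) for c in range(w)] for r in range(h)]


def filter_in_place(frame, block_size):
    """Filter in place using a buffer of block_size rows.

    A filtered row is held back until block_size further rows have been
    filtered, so every read still sees unfiltered neighbours.
    """
    h, w = len(frame), len(frame[0])
    buf = [None] * block_size
    for r in range(h):
        new_row = [smooth_pixel(frame, r, c) for c in range(w)]
        slot = r % block_size
        if buf[slot] is not None:
            # Row r - block_size is no longer read by anything; write it back.
            frame[r - block_size] = buf[slot]
        buf[slot] = new_row
    # Flush the rows still waiting in the buffer.
    for r in range(max(h - block_size, 0), h):
        frame[r] = buf[r % block_size]
```

The in-place version produces the same pixels as the copying version while holding only `block_size` filtered rows (plus the row being computed) instead of a second full frame.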
  20. 01 Sep, 2016 3 commits
    • Add ALT_INTRA experiment. · 340593e5
      Urvang Joshi authored
      When the experiment is ON, the Paeth predictor is used instead of the TM
      predictor.
      
      For the derf set, this gives about 0.09% improvement overall, and 0.55%
      improvement if all frames are forced to be intra-only.
      
      If the EXT_INTRA experiment is also on, the overall improvement
      is 0.056%, and the improvement if all frames are forced to be intra-only is
      0.465%.
      
      Change-Id: Id74e107ede70a8d2107fa14fcb3f44b23a437274
      340593e5
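To make the difference between the two predictors named above concrete, here is a minimal per-pixel sketch. It assumes the classic Paeth definition (as used in PNG filtering) and 8-bit samples; the function names are hypothetical and the code is not taken from the commit.

```python
def paeth_predict(left, above, top_left):
    """Pick whichever neighbour is closest to left + above - top_left."""
    base = left + above - top_left
    p_left = abs(base - left)
    p_above = abs(base - above)
    p_top_left = abs(base - top_left)
    # Ties are resolved in the order left, above, top_left.
    if p_left <= p_above and p_left <= p_top_left:
        return left
    if p_above <= p_top_left:
        return above
    return top_left


def tm_predict(left, above, top_left):
    """TM prediction: the same base value, clipped to the 8-bit range."""
    return max(0, min(255, left + above - top_left))
```

Both modes start from the same gradient estimate; TM outputs the (clipped) estimate itself, while Paeth always outputs one of the three actual neighbour values.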
    • Added generic SIMD support for CLPF. · b87cc923
      Steinar Midtskogen authored
      Change-Id: Ie03f9a5b0a4c708a586532198d755a1e7509f149
      b87cc923
    • Port renaming changes from AOMedia · f883b42c
      Yaowu Xu authored
      Cherry-Picked the following commits:
      0defd8f2 Changed "WebM" to "AOMedia" & "webm" to "aomedia"
      54e66767 Replace "VPx" by "AVx"
      5082a369 Change "Vpx" to "Avx"
      7df44f17 Replace "Vp9" w/ "Av1"
      967f722f Remove kVp9CodecId
      828f30ce Change "Vp8" to "AOM"
      030b5ffc AUTHORS regenerated
      2524caee Add ref-mv experimental flag
      016762be Change copyright notice to AOMedia form
      81e55269 Replace vp9 w/ av1
      9b94565b Add missing files
      fa8ca9f2 Change "vp9" to "av1"
      ec838b76 Convert "vp8" to "aom"
      80edfa01 Change "VP9" to "AV1"
      d1a11fb9 Change "vp8" to "aom"
      7b582513 Point to WebM test data
      dd1a5c8d Replace "VP8" with "AOM"
      ff00fc0f Change "VPX" to "AOM"
      01dee0bb Change "vp10" to "av1" in source code
      cebe6f0c Convert "vpx" to "aom"
      17b05679 rename vp10*.mk to av1_*.mk
      fe5f8a8a rename files vp10_* to av1_*
      
      Change-Id: I6fc3d18eb11fc171e46140c836ad5339cf6c9419
      f883b42c
  21. 22 Aug, 2016 2 commits
  22. 01 Aug, 2016 1 commit
    • Add weighted motion search for obmc predictor · 72d3ba8a
      Yue Chen authored
      Also port SIMD optimization of weighted sad/variance functions to
      av1.
      Coding gain improvement: 0.339/0.413/0.328 (lowres/midres/hdres)
      Current coding gain: 2.437/2.428/2.294
      Encoding time overhead: 17% (soccer_cif), 30% (ped_1080p25); it was
      12% and 18% without motion search
      
      Change-Id: I101d6ce729f769853756edc8ced6f3a2b8d8f824
      72d3ba8a
  23. 26 Jul, 2016 1 commit
    • Port SIMD optimization for obmc blending functions to av1 · 2478bed5
      Yue Chen authored
      SIMD optimization for 1d blending functions in obmc mode, and some
      code refactoring and cleanup.
      
      (ped_1080p25.y4m, 150 frames, 2000 tb)
      Encoding time overhead: +18.8% -> +18.1%
      Decoding time overhead: +21.3% -> +8.7%
      Change-Id: I9d856c32136e7e0e6e24ab5520ef901d7b1ee9c8
      2478bed5
  24. 28 Jun, 2016 1 commit
  25. 21 Jun, 2016 1 commit
    • Do sub-pixel motion search in up-sampled reference frames · e02752b0
      Yunqing Wang authored
      Up-sample the reference frames by a factor of 8 in each dimension using the
      8-tap interpolation filter. In sub-pixel motion search, use the up-sampled
      reference frames to find the best matching blocks, which increases the motion
      search precision. This is enabled as a speed feature for speed 0 and
      speed 1, and it is an encoder-only improvement.
      
      Overall PSNR: -1.456%(lowres); -0.430(hdres)
      SSIM: -1.687(lowres); -0.551(hdres)
      
      Change-Id: I2085d87e41f6b91d0221dc11dc7ffd003075ba2e
      e02752b0
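The idea in the commit above, searching directly in a reference up-sampled by 8, can be sketched in one dimension. This is an illustration only: linear interpolation stands in for the 8-tap filter, the search is exhaustive rather than a refinement around a full-pel result, and all names are hypothetical.

```python
SCALE = 8  # 1/8-pel precision, matching the 8x up-sampling in the commit


def upsample_linear(ref, scale=SCALE):
    """Up-sample a 1-D signal; linear interpolation is a stand-in here."""
    up = []
    for i in range(len(ref) - 1):
        for k in range(scale):
            up.append(ref[i] + (ref[i + 1] - ref[i]) * k / scale)
    up.append(float(ref[-1]))
    return up


def sad_at(up_ref, block, start, scale=SCALE):
    """SAD between block and the up-sampled reference read at stride scale."""
    return sum(abs(b - up_ref[start + j * scale]) for j, b in enumerate(block))


def subpel_search(ref, block, scale=SCALE):
    """Return (position in pixels, SAD) of the best 1/scale-pel match."""
    up = upsample_linear(ref, scale)
    last_start = len(up) - 1 - (len(block) - 1) * scale
    best = min(range(last_start + 1),
               key=lambda s: sad_at(up, block, s, scale))
    return best / scale, sad_at(up, block, best, scale)
```

For example, `subpel_search([0, 8, 16, 24, 32, 40], [11.0, 19.0, 27.0])` locates the block at position 1.375, that is, one full pixel plus three eighths.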
  26. 08 Jun, 2016 3 commits
    • Slow pshufb removal in 3 intra prediction functions. (from libvpx) · 36b4949e
      Linfeng Zhang authored
      Cherry-pick ad0646cb Slow pshufb removal in 3 intra prediction functions.
      
      Replaced aom_d45_predictor_4x4_ssse3(), aom_d45_predictor_8x8_ssse3()
      and aom_d207_predictor_4x4_ssse3() with the newly
      created aom_d45_predictor_4x4_sse2(), aom_d45_predictor_8x8_sse2()
      and aom_d207_predictor_4x4_sse2() respectively.
      The sse2 versions are mostly neutral or slightly worse than ssse3 in good
      cases and better than ssse3 in the bad cases (but still worse than using
      the mmx regs).
      
      Change-Id: I40ef101cd8b2f20eaa3f0648536bd227c7ae9722
      36b4949e
    • remove mmx variance functions (from libvpx) · 7056e3a0
      Linfeng Zhang authored
      Cherry-pick d0ffae82 remove mmx variance functions
      
      There are sse2 equivalents, and sse2 is a reasonable modern baseline.
      Removed mmx variance functions:
      vpx_get_mb_ss_mmx()
      vpx_get8x8var_mmx()
      vpx_get4x4var_mmx()
      vpx_variance4x4_mmx()
      vpx_variance8x8_mmx()
      vpx_mse16x16_mmx()
      vpx_variance16x16_mmx()
      vpx_variance16x8_mmx()
      vpx_variance8x16_mmx()
      
      Change-Id: Ife4e67fe85e0012ca560a98831f69195c852a645
      7056e3a0
    • remove mmx sad functions (from libvpx) · c5de1def
      Linfeng Zhang authored
      Cherry-pick d0e687bf remove mmx sad functions
      
      There are sse2 equivalents, and sse2 is a reasonable modern baseline.
      
      Change-Id: I9b67ff6dd16e36179e48898257b277fee003c8be
      c5de1def
  27. 25 Mar, 2016 1 commit