  1. 02 Oct, 2017 3 commits
  2. 30 Sep, 2017 1 commit
    • Add aom_entropy_optimizer to CMake build. · e91bb45b
      Tom Finegan authored
      This is the first tool in the CMake build, so some extra
      noise is involved:
      
      - Set up tools list vars and handling.
      - Add tools support to the dist rule.
      - Move usage_exit.c generation to CMakeLists.txt to allow
        use by the aom_entropy_optimizer target.
      
      BUG=aomedia:834
      
      Change-Id: I55239e89353033349ac1038b8d3d1aa8a8f23e27
  3. 29 Sep, 2017 5 commits
    • Lowbd TM_PRED intrapred ssse3 optimization · a0f66fc0
      Yi Luo authored
      Function speedup (i7-6700)
      Predictor  ssse3 v. C
      4x4        ~2.1x
      4x8        ~2.4x
      8x4        ~4.1x
      8x8        ~5.4x
      8x16       ~6.1x
      16x8       ~5.9x
      16x16      ~6.4x
      16x32      ~6.7x
      32x16      ~7.4x
      32x32      ~8.0x
      
      Change-Id: I52b8ebf8193e76f4ea1137cbad5ad7fa109d86d8
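For context on what the ssse3 code accelerates, TM_PRED can be sketched in scalar C as below. This is a minimal sketch, not the libaom source: the names `clip_pixel` and `tm_predictor` and the convention that `above[-1]` holds the top-left pixel are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* TM_PRED sketch: each pixel is left[r] + above[c] - top_left, clamped.
 * Assumption: above[-1] is the top-left neighbour of the block. */
static void tm_predictor(uint8_t *dst, ptrdiff_t stride, int bw, int bh,
                         const uint8_t *above, const uint8_t *left) {
  assert(bw > 0 && bh > 0);
  const int top_left = above[-1];
  for (int r = 0; r < bh; ++r) {
    for (int c = 0; c < bw; ++c) {
      dst[c] = clip_pixel(left[r] + above[c] - top_left);
    }
    dst += stride;
  }
}
```

The per-pixel add, subtract, and clamp map naturally onto saturating SIMD arithmetic, which is consistent with the speedup growing with block width in the table above.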
    • Remove delta_q experimental flag. · 3ab20b45
      Thomas Davies authored
      Change-Id: I52f204000f5fdaf1c6fff63949d72e858ceea462
    • Lowbd intrapred DC/TOP/LEFT/128/V/H avx2 · 23c61903
      Yi Luo authored
      For prediction blocks of width 32, avx2 can further speed up
      the prediction functions (i7-6700):
      
      32x32     avx2 v. sse2
      DC        ~1.4x
      top       ~1.5x
      left      ~1.4x
      128       ~1.5x
      v         ~1.6x
      h         ~1.2x
      
      32x16     avx2 v. sse2
      DC        ~2.2x
      top       ~1.7x
      left      ~1.6x
      128       ~1.8x
      v         ~1.9x
      
      Note: 32x16 H_PRED on avx2 is not yet sufficiently faster than sse2.
      
      Change-Id: I145ed504d1b3ea9df283b94927be66a2c6f81225
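The six predictors named in this commit can be sketched in scalar C as follows. This is an illustrative sketch under assumptions, not the libaom implementation: the function names are hypothetical, and the rounded mean over `bw + bh` neighbours for rectangular DC is an assumption about the reference behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill a bw x bh block with a single value. */
static void fill_block(uint8_t *dst, ptrdiff_t stride, int bw, int bh,
                       int value) {
  for (int r = 0; r < bh; ++r, dst += stride) memset(dst, value, (size_t)bw);
}

/* Rounded mean of n neighbouring pixels. */
static int rounded_mean(const uint8_t *px, int n) {
  int sum = 0;
  for (int i = 0; i < n; ++i) sum += px[i];
  return (sum + n / 2) / n;
}

/* DC: rounded mean of the above row and the left column together. */
static void dc_predictor(uint8_t *dst, ptrdiff_t stride, int bw, int bh,
                         const uint8_t *above, const uint8_t *left) {
  assert(bw > 0 && bh > 0);
  int sum = 0;
  for (int c = 0; c < bw; ++c) sum += above[c];
  for (int r = 0; r < bh; ++r) sum += left[r];
  const int count = bw + bh;
  fill_block(dst, stride, bw, bh, (sum + count / 2) / count);
}

/* TOP and LEFT: mean of one edge only. 128: a constant mid-grey. */
static void dc_top_predictor(uint8_t *dst, ptrdiff_t stride, int bw, int bh,
                             const uint8_t *above) {
  fill_block(dst, stride, bw, bh, rounded_mean(above, bw));
}

static void dc_left_predictor(uint8_t *dst, ptrdiff_t stride, int bw, int bh,
                              const uint8_t *left) {
  fill_block(dst, stride, bw, bh, rounded_mean(left, bh));
}

static void dc_128_predictor(uint8_t *dst, ptrdiff_t stride, int bw, int bh) {
  fill_block(dst, stride, bw, bh, 128);
}

/* V: copy the above row into every row. H: spread left[r] across row r. */
static void v_predictor(uint8_t *dst, ptrdiff_t stride, int bw, int bh,
                        const uint8_t *above) {
  for (int r = 0; r < bh; ++r, dst += stride) memcpy(dst, above, (size_t)bw);
}

static void h_predictor(uint8_t *dst, ptrdiff_t stride, int bw, int bh,
                        const uint8_t *left) {
  for (int r = 0; r < bh; ++r, dst += stride) memset(dst, left[r], (size_t)bw);
}
```

With a width of 32, each output row is exactly 32 bytes, so one 256-bit store covers a whole row where sse2 needs two 128-bit stores.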
    • Add the lightfield encoder and decoder examples · a5fefa76
      Ryan Overbeck authored
      1. Configure with --enable-experimental --enable-ext-tile and run "make"
      to build.
      2. Run "make test" to download lightfield test data: vase10x10.yuv.
      3. Run lightfield encoder to encode whole lightfield:
        examples/lightfield_encoder 1024 1024 vase10x10.yuv vase10x10.webm 10 10 5
      4. Run lightfield decoder to decode a single tile:
        examples/lightfield_decoder vase10x10.webm vase_tile.yuv 10 10 3 4 5 10 5
      
      Note: Enabled use of AOME_USE_REFERENCE (previously deprecated) for this
      example.
      
      Change-Id: I657ab6e99ba1e2d1bf99ec25a3c4686fc80bc9bb
    • Off TestResizeCspWorks when def TRELLISQ_SEARCH · a603488c
      Angie Chiang authored
      Change-Id: Ie29d5fa261a9c7f790f170493ed0c9d59d1482e2
  4. 28 Sep, 2017 3 commits
    • Remove dead function 'clamp_block' · 36967373
      Sebastien Alaiwan authored
      And reduce scope of 'get_max_bit',
      which is only used by the test code.
      
      Change-Id: I9af7be426f7bec6958419ca02957db87e7963f50
    • Lowbd rectangle V/H intra pred sse2 optimization · 0c0fd1e5
      Yi Luo authored
      Function speedup sse2 v. C
      Predictor  V_PRED  H_PRED
      4x8        ~1.7x   ~1.8x
      8x4        ~1.8x   ~2.2x
      8x16       ~1.5x   ~1.4x
      16x8       ~1.9x   ~1.3x
      16x32      ~1.6x   ~1.4x
      32x16      ~2.0x   ~1.9x
      
      This patch disables the speed tests to save Jenkins build
      time. Developers can enable them manually by passing the
      --gtest_also_run_disabled_tests flag on the test command line.
      
      Change-Id: I81eaee5e8afc55275c7507c99774f78cc9e49f9a
    • Misc. resize fixes along with the resize test · ccb27264
      Debargha Mukherjee authored
      Re-enables most of the previously disabled tests.
      The ones that are still disabled expect resize to be triggered
      through rate control, which is no longer supported in AV1.
      
      Change-Id: Ie5e9ba3eb0843cd44ff1ac988500081470ba0fe2
  5. 27 Sep, 2017 1 commit
    • Lowbd rect intrapred DC/LEFT/TOP/128 sse2 optimization · 39bdf36a
      Yi Luo authored
      Add lowbd unit test functionality to intrapred_test.cc
      Function speedup against C (i7-6700):
      Predictor   DC     LEFT   TOP    128
      4x8        ~1.4x  ~1.4x  ~1.7x  ~1.9x
      8x4        ~1.2x  ~1.6x  ~1.6x  ~2.6x
      8x16       ~1.4x  ~1.3x  ~1.4x  ~2.1x
      16x8       ~2.0x  ~1.8x  ~2.3x  ~2.1x
      16x32      ~2.0x  ~1.9x  ~1.8x  ~2.2x
      32x16      ~2.0x  ~2.0x  ~1.9x  ~2.2x
      
      Change-Id: I33db512020ca3c6853a9205a8079f3d00134f584
  6. 26 Sep, 2017 1 commit
  7. 25 Sep, 2017 2 commits
  8. 24 Sep, 2017 1 commit
    • Add av1_down_sample_scan_count · 69208260
      Angie Chiang authored
      This reduces memory usage for adapt_scan.
      The whole change is under the USE_2X2_PROB flag.
      
      Change-Id: If7839d6396dad7618155ef2f36896d17743696ce
  9. 23 Sep, 2017 1 commit
  10. 22 Sep, 2017 1 commit
    • Highbd rectangle intrapred V/DC sse2 optimization · bdddf33a
      Yi Luo authored
      Function speedup (i7-6700), sse2 v. C:
      Predictor      V_PRED    DC_PRED
      4x8            ~1.5x     ~4.9x
      8x4            ~2.5x     ~4.8x
      8x16           ~1.9x     ~9.1x
      16x8           ~1.9x     ~4.4x
      16x32          ~2.1x     ~5.8x
      32x16          ~2.0x     ~3.6x
      
      Change-Id: I6deffd0637e57ee5d0bd533502f5705148c4cdd4
  11. 20 Sep, 2017 2 commits
  12. 19 Sep, 2017 1 commit
    • Highbd intrapred DC_LEFT/TOP/128 sse2 optimization · bbf6186e
      Yi Luo authored
      Also extend the intra pred speed test to rectangular blocks.
      Speedup (i7-6700)
      predictor      sse2 v. C
      left 4x4       ~5.6x
      top  4x4       ~7.2x
      128  4x4       ~6.9x
      left 4x8       ~7.7x
      top  4x8       ~10.1x
      128  4x8       ~10.0x
      
      left 8x4       ~8.1x
      top  8x4       ~9.1x
      128  8x4       ~10.1x
      left 8x8       ~10.3x
      top  8x8       ~13.6x
      128  8x8       ~14.8x
      left 8x16      ~12.6x
      top  8x16      ~14.0x
      128  8x16      ~15.5x
      
      left 16x8      ~6.3x
      top  16x8      ~7.0x
      128  16x8      ~6.5x
      left 16x16     ~6.5x
      top  16x16     ~7.1x
      128  16x16     ~8.2x
      left 16x32     ~5.1x
      top  16x32     ~6.4x
      128  16x32     ~5.6x
      
      left 32x16     ~4.2x
      top  32x16     ~4.3x
      128  32x16     ~4.5x
      left 32x32     ~3.8x
      top  32x32     ~3.7x
      128  32x32     ~3.9x
      
      Change-Id: Ie7fcc85b9ded3030ee904623c40e9edeec1695ae
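The high-bitdepth variants of these predictors differ from the lowbd ones mainly in operating on 16-bit samples and in scaling the "128" constant with bit depth. The sketch below is illustrative only: the function names are hypothetical, and the `128 << (bd - 8)` mid-grey value follows the convention used by the libvpx-style reference predictors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill a bw x bh block of 16-bit samples with one value. */
static void highbd_fill(uint16_t *dst, ptrdiff_t stride, int bw, int bh,
                        int value) {
  for (int r = 0; r < bh; ++r, dst += stride) {
    for (int c = 0; c < bw; ++c) dst[c] = (uint16_t)value;
  }
}

/* DC_128: mid-grey for the given bit depth (512 at 10 bits, 2048 at 12). */
static void highbd_dc_128_predictor(uint16_t *dst, ptrdiff_t stride, int bw,
                                    int bh, int bd) {
  assert(bd == 8 || bd == 10 || bd == 12);
  highbd_fill(dst, stride, bw, bh, 128 << (bd - 8));
}

/* DC_LEFT: rounded mean of the left column only. */
static void highbd_dc_left_predictor(uint16_t *dst, ptrdiff_t stride, int bw,
                                     int bh, const uint16_t *left) {
  int sum = 0;
  for (int r = 0; r < bh; ++r) sum += left[r];
  highbd_fill(dst, stride, bw, bh, (sum + bh / 2) / bh);
}

/* DC_TOP: rounded mean of the above row only. */
static void highbd_dc_top_predictor(uint16_t *dst, ptrdiff_t stride, int bw,
                                    int bh, const uint16_t *above) {
  int sum = 0;
  for (int c = 0; c < bw; ++c) sum += above[c];
  highbd_fill(dst, stride, bw, bh, (sum + bw / 2) / bw);
}
```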
  13. 18 Sep, 2017 1 commit
    • Highbd intra pred H_PRED sse2 optimization · 23b9b317
      Yi Luo authored
      sse2 v. C speedup:
      4x4   ~8.0x
      8x8   ~8.2x
      16x16 ~6.5x
      32x32 ~3.8x
      Blocksize:
      4x4, 4x8, 8x4, 8x8, 8x16, 16x8, 16x16, 16x32, 32x16, 32x32
      Square blocksize code is from libvpx:
      "30d9a1916 vpxdsp: [x86] add highbd_h_predictor functions",
      Credit goes to Scott LaVarnway. Speed tests do not support
      rectangular blocksize yet.
      
      Change-Id: I9a1f24aecab8de94f8ea59ec8748fe3537d721ae
  14. 16 Sep, 2017 1 commit
  15. 15 Sep, 2017 5 commits
    • Enhance intra pred speed test to include highbd pred · f5d71a69
      Yi Luo authored
      This is a manual merge from libvpx's commit:
      "05ee24149 Add high bitdepth intra prediction
      optimization speed test". Credit goes to Linfeng Zhang.
      
      Change-Id: Ie254593aa9b601889ecb95eca900365055d46a03
    • Turn off TestResizeCspWorks under conditions · 1ff7d9ab
      Angie Chiang authored
      DISABLE_TRELLISQ_SEARCH causes TestResizeCspWorks to fail.
      BUG=aomedia:734
      
      Change-Id: I70d51b9a490b251ebd7743faf831da54b94e48c7
    • Fix highbd_iht_test with 4, 8, and 16 daala_tx. · 71b0513b
      Nathan E. Egge authored
      This change fixes a compile error when all three of --enable-daala_dct4,
      --enable-daala_dct8 and --enable-daala_dct16 are enabled at once.
      
      Change-Id: I4942e09fb887afbda2eda6aaacec727b5cbf6f50
    • register_state_check: simplify Check() methods · 74acf004
      Angie Chiang authored
      - Make Check() void, as the EXPECTs are sufficient to document failure.
      
      Cumulatively, this avoids reporting incorrect Check()
      failures due to earlier test failures.
      
      This CL is ported over from
      f8c27d164 register_state_check: simplify Check() methods
      
      Change-Id: I1b65aa769c69c2a52b2e0b363f1c4432965ee89f
    • Enhance intra pred unit test to verify rectangular pred · da9e4afe
      Yi Luo authored
      Add a macro to improve the readability of the test cases. The
      upcoming test cases, which vary by mode and size, would otherwise
      make the list too long.
      
      Change-Id: I74171344098820b21090dd9b857229bdf2e77248
  16. 10 Sep, 2017 2 commits
    • Refactoring/simplification of buffers used for sgr · 1330dfd1
      Debargha Mukherjee authored
      Includes miscellaneous cleanups, test fixes, and code reorganization
      for loop-restoration components.
      
      Change-Id: I5b2e6419234d945e6f4344b22636119b50df4054
    • Reduce/Eliminate line buffer for loop-restoration. · e168a783
      Debargha Mukherjee authored
      This patch forces the vertical filtering for the top and bottom
      rows of a processing unit for the Wiener filter to not use border
      more than what is set in the WIENER_BORDER_VERT macro.
      This macro is currently set at 0 to eliminate line buffer completely,
      but it could be increased to 1 or 2 to use limited line buffers
      if the coding efficiency is affected too much with a 0 line-buffer.
      
      Also, for the sgr filter we added the option of using overlapping
      windows horizontally and vertically to improve coding efficiency.
      The vertical border used is set by the SGRPROJ_BORDER_VERT
      macro, while the horizontal border is set by the
      SGRPROJ_BORDER_HORZ macro, currently 2, the maximum needed. We do not
      recommend setting SGRPROJ_BORDER_HORZ below 2.
      
      The overall line buffer requirement for LR is twice the max of
      WIENER_BORDER_VERT and SGRPROJ_BORDER_VERT.
      Currently both are set as 0, eliminating line buffers completely.
      
      Also this patch extends borders consistently before CDEF / LR.
      
      Change-Id: Ie58a98c784a0db547627b9cfcf55f018c30e8e79
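The line-buffer rule stated in this commit message (twice the larger of the two vertical borders) can be written out as a small helper. The macro names and their values of 0 come from the message; the helper `lr_line_buffer_rows` is a hypothetical name introduced only for illustration.

```c
#include <assert.h>

/* Values as stated in the commit message. */
#define WIENER_BORDER_VERT 0
#define SGRPROJ_BORDER_VERT 0

/* Hypothetical helper: rows of line buffer needed by loop restoration,
 * i.e. twice the larger of the two vertical borders. */
static int lr_line_buffer_rows(int wiener_border_vert,
                               int sgrproj_border_vert) {
  assert(wiener_border_vert >= 0 && sgrproj_border_vert >= 0);
  const int max_border = wiener_border_vert > sgrproj_border_vert
                             ? wiener_border_vert
                             : sgrproj_border_vert;
  return 2 * max_border;
}
```

With both macros at 0 the helper returns 0, matching the statement that line buffers are eliminated completely. Raising WIENER_BORDER_VERT to 2 while leaving SGRPROJ_BORDER_VERT at 0 would give 4 rows.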
  17. 08 Sep, 2017 1 commit
    • Fix warning in selfguided_filter_test.cc. · 9f02130a
      Tom Finegan authored
      When building without SSE4 support, some compilers (typically gcc
      with a major version of 4) complain about an unused variable.
      Guard the declaration of the offending variable within a
      HAVE_SSE4_1 block to avoid the problem.
      
      Change-Id: I4e4deb46014c97f3157f3b6c2376e1b034a51b62
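The pattern behind this fix can be sketched in plain C as below. The actual test file is C++ and its variable is not named in the message, so `count_enabled_variants` and `simd_name` are hypothetical stand-ins; only the HAVE_SSE4_1 guard itself comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the build system defines HAVE_SSE4_1 as 0 or 1.
 * Default to 0 so this sketch compiles on its own. */
#ifndef HAVE_SSE4_1
#define HAVE_SSE4_1 0
#endif

/* Hypothetical helper that lists the code paths a test would exercise. */
static int count_enabled_variants(const char **names, int capacity) {
  assert(names != NULL && capacity >= 2);
  int n = 0;
  names[n++] = "c";
#if HAVE_SSE4_1
  /* Declared only where it is used, so builds without SSE4.1 never see
   * an unused variable. */
  const char *simd_name = "sse4_1";
  names[n++] = simd_name;
#endif
  return n;
}
```

Declaring `simd_name` outside the `#if` block would leave it unused whenever HAVE_SSE4_1 is 0, which is the kind of warning the commit describes.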
  18. 07 Sep, 2017 1 commit
    • Lowbd parallel_deblocking sse2 optimization · ea8a0d52
      Yi Luo authored
      Baseline + parallel_deblocking:
      
      - Passed unit tests *SSE2/Loop8Test6*, *AVX2/Loop8Test6*.
      - 1080p, 25 frames, profile=0, encoding/decoding, output match.
      - Decoder frame rate increases from 54.15 to 65.84.
      
      Change-Id: I55938c94961066594f4b9080192c7268c19d9bf9
  19. 06 Sep, 2017 1 commit
  20. 05 Sep, 2017 2 commits
    • Remove the EC_SMALLMUL experimental flag. · f9ef4f6b
      Timothy B. Terriberry authored
      This experiment has been fully adopted and is now an integral part
      of the draft AV1 bitstream definition.
      
      objdump -d libaom.a gives identical output before and after this
      patch.
      
      Change-Id: I6f936f4b10de23a9471e0ccadf9cf178fb62be69
    • Define missing subtract_xxx functions in highbd_subtract_sse2.c · 4b5c2bb4
      Rupert Swarbrick authored
      Also, get rid of the boilerplate code using some macros. STACK_V(h,f) means
      "call f twice, stacking vertically at an offset of h". STACK_H(w,f)
      means "call f twice, stacking horizontally at an offset of w".
      
      Note that functions like subtract_128x64 are now only defined when the
      equivalent block sizes (e.g. BLOCK_128x64) are defined. As such, we
      have to fix up subtract_test.cc so it doesn't try to call
      aom_highbd_subtract_block_sse2 with unsupported sizes.
      
      BUG=aomedia:684
      
      Change-Id: I5b0fefe70e4083786d11d25cdd5dcf02823bae7b
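The STACK_V and STACK_H idea described in this commit message can be sketched as follows. This is a scalar illustration under assumptions, not the contents of highbd_subtract_sse2.c: the macro bodies, the SUBTRACT_PARAMS helper, and the plain-C `subtract_4x4` base case are all written here for illustration, with sizes named width x height.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shared parameter list, so the macros can forward by name. */
#define SUBTRACT_PARAMS                                          \
  int16_t *diff, ptrdiff_t diff_stride, const uint16_t *src,     \
      ptrdiff_t src_stride, const uint16_t *pred, ptrdiff_t pred_stride

/* Call f twice, stacking vertically at an offset of h rows. */
#define STACK_V(h, f)                                                  \
  do {                                                                 \
    f(diff, diff_stride, src, src_stride, pred, pred_stride);          \
    f(diff + (h) * diff_stride, diff_stride, src + (h) * src_stride,   \
      src_stride, pred + (h) * pred_stride, pred_stride);              \
  } while (0)

/* Call f twice, stacking horizontally at an offset of w columns. */
#define STACK_H(w, f)                                                  \
  do {                                                                 \
    f(diff, diff_stride, src, src_stride, pred, pred_stride);          \
    f(diff + (w), diff_stride, src + (w), src_stride, pred + (w),      \
      pred_stride);                                                    \
  } while (0)

/* Base case: scalar 4x4 residual, diff = src - pred. */
static void subtract_4x4(SUBTRACT_PARAMS) {
  assert(diff != NULL && src != NULL && pred != NULL);
  for (int r = 0; r < 4; ++r) {
    for (int c = 0; c < 4; ++c) {
      diff[r * diff_stride + c] =
          (int16_t)(src[r * src_stride + c] - pred[r * pred_stride + c]);
    }
  }
}

/* Larger sizes are built by stacking smaller ones. */
static void subtract_8x4(SUBTRACT_PARAMS) { STACK_H(4, subtract_4x4); }
static void subtract_4x8(SUBTRACT_PARAMS) { STACK_V(4, subtract_4x4); }
static void subtract_8x8(SUBTRACT_PARAMS) { STACK_V(4, subtract_8x4); }
```

Each new size becomes a one-line definition, so a size tied to an optional block size can be wrapped in the same preprocessor condition as that block size.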
  21. 04 Sep, 2017 1 commit
  22. 01 Sep, 2017 1 commit
  23. 30 Aug, 2017 2 commits