1. 01 Feb, 2018 11 commits
  2. 31 Jan, 2018 25 commits
    • Revert "Enable pic (position independent code) config." · 32bb5110
      Johann authored
      This reverts commit fbeee067.
      
      The assembly which was failing has been fixed.
      
      BUG=aomedia:102
      
      Change-Id: Ide75630b38603a2553f6e231085994251c77b26c
    • Fix the fast tx type search feature with txk-sel on · 8f47f705
      Hui Su authored
      Change-Id: If0b1d2fe31569104f2d8eef3cfd42cab30162c7e
    • Reduce memory usage of inter_tx_size[] in MB_MODE_INFO · 1379beb7
      Hui Su authored
      Reduce the length of inter_tx_size[] from 1024 to 16.
      
      On a cif test sequence,
      encoder memory consumption decreases by 18% (380MB -> 312MB);
      decoder memory consumption decreases by 56% (21.4MB -> 9.4MB).
      
      Change-Id: Ie11dd055255d200954b704b8c2ad8ca3dff7bf5c
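The percentages quoted above follow from the before/after figures. A minimal C sketch of that arithmetic is shown here; the helper name `percent_reduction` is hypothetical and not part of libaom.

```c
#include <stdio.h>

/* Hypothetical helper: relative reduction from `before` to `after`, in percent. */
double percent_reduction(double before, double after) {
  return 100.0 * (before - after) / before;
}

/* Prints the reductions for the figures quoted in the commit message. */
void print_reductions(void) {
  printf("encoder: %.1f%%\n", percent_reduction(380.0, 312.0));
  printf("decoder: %.1f%%\n", percent_reduction(21.4, 9.4));
}
```

Calling `print_reductions()` should report values close to the 18% and 56% stated in the message.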
    • Fix a couple of visual studio nightly build failures. · f19a91cb
      Tom Finegan authored
      - Add explicit cast of bool to int to silence a test warning.
      - Add explicit cast of size_t to int for same in dump_obu.
      
      Change-Id: I90846eb5c88880d921f20cb66b116ab7d2799af5
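The two kinds of explicit cast mentioned above can be sketched as follows. This is not the code from the patch (which touched a test and dump_obu); the function names are hypothetical and the sketch is plain C for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* bool -> int: the value (0 or 1) is unchanged; the cast only makes the
 * conversion explicit so the compiler does not warn about it. */
int bool_to_int(bool flag) { return (int)flag; }

/* size_t -> int: this narrows, so check the range before casting. */
int size_to_int(size_t n) {
  assert(n <= (size_t)INT_MAX);
  return (int)n;
}
```

The range check in `size_to_int` is an addition for safety in this sketch; whether the original call sites need one depends on the values they handle.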
    • Integrate lv_map with aom_qm · b3167a65
      Angie Chiang authored
      BUG=aomedia:717
      
      Change-Id: Ib06a12039cb72665c1ee534cc2246ac3d23f878d
    • add scalability experiment · f8589863
      Soo-Chul Han authored
      cmake: -DCONFIG_SCALABILITY=1
      
      Change-Id: Ifa908f809bcf904bdf0ed87b351e1ef3accc2b3f
    • use GLOBAL() macro when loading constant · 4972ac81
      Johann authored
      Clear linker error when building with gcc 6:
      relocation R_X86_64_32 against `.rodata' can not be used when making a
      shared object; recompile with -fPIC
      
      BUG=aomedia:102
      
      Change-Id: I6c06de1e9dac1c044a4b07125abcaba0943a29b6
    • one level less of tx size search for blocks larger than 64 · 7ed7e1fa
      Hui Su authored
      3~5% encoding speedup for speed 0; no quality loss.
      
      Change-Id: I0e31755f45253e5e99d8d9eed0d7a6fe6050f49f
    • Cleanup some fragile aspects of rd_pick_partition. · 00c6e6f7
      Urvang Joshi authored
      (1) Explicitly reset RD stats for each partition.
      
      Earlier, PARTITION_SPLIT was the only partition type that reset the
      RD_STATS in 'sum_rdc'.
      
      This worked only because:
      - PARTITION_SPLIT was tried before VERT, HORZ, VERT_4 and HORZ_4; and
      - RD cost calculations in the VERT, HORZ, VERT_4 and HORZ_4 partitions
      implicitly discarded the existing value in sum_rdc.
      
      However, that was very fragile; explicitly resetting the stats every
      time is much safer.
      
      (2) Using a separate variable 'temp_best_rd_cost' was fragile, as someone
      could forget to update it. So we use best_rdc.rdcost directly.
      
      BUG=aomedia:1246
      
      Change-Id: Icd75f25c34bb0f1806e691784648bcffce2417e6
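The two points above can be illustrated with a simplified sketch. `RdStatsSketch` and `pick_best_rdcost` are hypothetical stand-ins, not the encoder's RD_STATS or rd_pick_partition; the sketch only shows the pattern of resetting the running sum for every candidate and comparing against the best cost directly.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the encoder's RD statistics. */
typedef struct {
  int64_t rdcost;
} RdStatsSketch;

/*
 * costs holds the sub-block costs of every candidate partition back to back;
 * n_sub[p] is the number of sub-blocks in candidate p.
 * Returns the lowest summed cost over all candidates.
 */
int64_t pick_best_rdcost(const int64_t *costs, const int *n_sub,
                         int n_partitions) {
  RdStatsSketch best_rdc = { INT64_MAX };
  RdStatsSketch sum_rdc;
  int offset = 0;

  for (int p = 0; p < n_partitions; ++p) {
    /* (1) Reset explicitly for every candidate, regardless of search order. */
    sum_rdc.rdcost = 0;
    for (int i = 0; i < n_sub[p]; ++i) sum_rdc.rdcost += costs[offset + i];
    offset += n_sub[p];

    /* (2) Compare against best_rdc.rdcost itself; there is no separate
     * copy of the best cost that could fall out of sync. */
    if (sum_rdc.rdcost < best_rdc.rdcost) best_rdc = sum_rdc;
  }
  return best_rdc.rdcost;
}
```

Because the sum is reset at the top of each iteration, the result does not depend on which candidate happens to be evaluated first.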
    • AVX2 optimization of motion compensation functions · c8e0336a
      Deepa K G authored
      AVX2 implementations of av1_convolve_x_sr, av1_convolve_y_sr and
      av1_convolve_2d_sr have been added.
      
      Improvements have been made to av1_convolve_x_avx2, av1_convolve_y_avx2
      and av1_convolve_2d_avx2.
      
      Change-Id: I62a699dd9dcf42de94dd72cc2d43affc0dc31404
    • Add information about extra CMake build flags to README.md · aa71f071
      Tom Finegan authored
      BUG=aomedia:1296
      
      Change-Id: If9f944b58f23cdb71f919bd391f6b37e27b271f1
    • Update adst4 range · 5d7c1fcc
      Angie Chiang authored
      Serialize the adst4 operations
      Update stage range accordingly
      Change the cos_bit precision accordingly.
      Correct 4x8/8x4 inv_start_range
      
      BUG=aomedia:1271
      
      Change-Id: I10bc91585a61d790decdc24cb91659102e043620
    • [jnt-comp, normative] Avoid double-rounding in prediction · 39cf8061
      David Barker authored
      As per the linked bug report, the distance-weighted compound
      prediction has two separate round operations, first by 3
      bits (inside the various convolve functions), then by 10 bits
      (after the convolution functions).
      
      We can improve on this by right shifting by 3 bits inside the
      convolve functions - this is equivalent to doing a single round
      by 13 bits at the end.
      
      Note: In the encoder, when doing joint_motion_search(), we do
      things a bit differently. So that we can try modifying the two
      "sides" of the prediction independently, we predict each side as
      if it were a single prediction (including rounding), then blend
      these single predictions together.
      
      This is already an approximation to the "real" prediction, even
      in the non-jnt-comp case. So we leave that code path as-is.
      
      BUG=aomedia:1289
      
      Change-Id: I9ad1fbcb3e12db2b5fc3c82b407f0fd9e6b39750
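The rounding argument above can be checked numerically. The sketch below uses hypothetical function names and non-negative integers only; it is not the codec's convolve code, and the real intermediate values and offsets may differ. It compares rounding twice (3 bits, then 10 bits), a plain shift by 3 followed by one round by 10, and a single round by 13.

```c
#include <assert.h>
#include <stdint.h>

/* Round-to-nearest right shift by `bits` (bits >= 1) for non-negative x. */
static int32_t round_shift(int32_t x, int bits) {
  assert(x >= 0 && bits >= 1);
  return (x + (1 << (bits - 1))) >> bits;
}

/* Old behaviour: round by 3 bits, then round again by 10 bits. */
int32_t double_round(int32_t x) { return round_shift(round_shift(x, 3), 10); }

/* New behaviour: truncating shift by 3, then one round by 10 bits. */
int32_t shift_then_round(int32_t x) { return round_shift(x >> 3, 10); }

/* Reference: a single round by 13 bits. */
int32_t single_round(int32_t x) { return round_shift(x, 13); }

/* Counts inputs in [0, limit) where shift_then_round and single_round differ. */
int count_mismatches(int32_t limit) {
  int mismatches = 0;
  for (int32_t x = 0; x < limit; ++x) {
    if (shift_then_round(x) != single_round(x)) ++mismatches;
  }
  return mismatches;
}
```

For example, x = 4092 gives 1 under `double_round` but 0 under both `shift_then_round` and `single_round`, which is the kind of discrepancy the change removes.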
    • BUG FIX: sse2 subpel variance is not PIC compliant · 0cf864fd
      Johann authored
      cherry-picked from libvpx:
        commit cb9f4dc1056b39383595f658cfcd166833bc0097
        Author: Scott LaVarnway <slavarnway@google.com>
        Date:   Sat Jan 13 07:01:04 2018 -0800
      
      BUG=aomedia:102
      
      Change-Id: Ie1736ea0787f4dad80204dcf5251fbb02d79541e
    • Added HighBD support for mismatch debugging · 5b084ee1
      Imdad Sardharwalla authored
      Enabling CONFIG_MISMATCH_DEBUG with highbd streams was producing undefined
      behaviour. This patch adds support for highbd frames.
      
      BUG=aomedia:1246
      
      Change-Id: I36ff4ddbb9b2e884e4a5b76485247a20b1f5db3c
    • Merge in STRIPED_LOOP_RESTORATION flag · 5105f7ac
      Debargha Mukherjee authored
      CONFIG_LOOP_RESTORATION still exists.
      Only CONFIG_STRIPED_LOOP_RESTORATION has been merged into
      CONFIG_LOOP_RESTORATION, as if it were always 1.
      
      Change-Id: I37d7a1fcd4cbb56e2fc037b1568ae63f72ed6a66
    • Update configuration comment about LOWBITDEPTH · 1e3da463
      Sebastien Alaiwan authored
      The comment was misleading as the codec always supports 8-bit,
      regardless of the value of CONFIG_LOWBITDEPTH.
      This flag just enables the optimized-for-8-bits pipeline,
      without changing the actual YUV output.
      
      Change-Id: Ic2f041870acf4e2ee435021aa42e8f013ef52b78
    • Reduce scope of ctx derivation · 46475a30
      Frederic Barbier authored
      Change-Id: Ic8050cada6dc9dd14152da98ee21bb37042069e6
    • Conditionally skip transform block partition search · eb8f5e87
      Jingning Han authored
      Speed up recursive transform block partition search. When a txfm
      block is selected as having all-zero coefficients, skip the search
      over further split partitions.
      
      Tested with txk-sel on, this makes speed 0 and speed 1 both 10-15%
      faster in the medium to high target bit-rate range. The coding
      performance change is neutral: 0.011% better for the lowres set.
      
      Change-Id: I1247f3d5a33d15bf4bc5f0bcbac2bf1f3e1aca2e
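The early exit described above can be sketched as a recursive walk over a split tree. `TxNodeSketch` and `count_visited` are hypothetical; in the real encoder the all-zero decision comes out of the RD search itself, whereas here it is simply given as a flag on each node.

```c
#include <stddef.h>

/* Hypothetical transform-block node with up to four split children. */
typedef struct TxNodeSketch {
  int all_zero;                        /* 1 if chosen with all-zero coefficients */
  const struct TxNodeSketch *child[4]; /* NULL where no further split exists */
} TxNodeSketch;

/* Returns how many blocks the recursive search visits. */
int count_visited(const TxNodeSketch *node) {
  if (node == NULL) return 0;
  int visited = 1;
  /* Early exit: an all-zero block is not split any further. */
  if (node->all_zero) return visited;
  for (int i = 0; i < 4; ++i) visited += count_visited(node->child[i]);
  return visited;
}
```

With a root that has four leaf children, the walk visits five blocks when the root is not all-zero and only one when it is, which is where the saving comes from.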
    • dependent-horztilegroups: Fix decoder crash · 13025199
      David Barker authored
      The tg_horz_boundary flag should always be 0 for the topmost
      tile row, even when dependent-horztilegroups is enabled.
      Otherwise, we end up trying to fetch data off the top of the
      frame, which results in segfaults.
      
      BUG=aomedia:1252
      
      Change-Id: I7caaa2b38a21c05ffb13b6c72f41f8f6e1982b69
    • Add aom_comp_mask_<upsampled>pred_ssse3 · 33ba1fe5
      Peng Bin authored
      1) For encoder speed, overall ~1% faster with no impact on coding performance.
      2) aom_comp_mask_pred_ssse3 is 3.5x - 6x faster than aom_comp_mask_pred_c.
      3) aom_comp_mask_upsampled_pred_ssse3 is 1.5x - 3x faster than
      aom_comp_mask_upsampled_pred_c. For the special case where subpel_x ==
      subpel_y == 0, the optimized version achieves a 4x - 7x speedup.
      
      Unit tests for both functions have been added.
      
      Change-Id: Ib498317975e0dbd9cdcf61be327b640dfac9a7e5
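For readers unfamiliar with masked compound prediction, the following scalar sketch shows the kind of per-pixel blend such a function computes. It assumes the mask holds 6-bit weights in [0, 64] and that all buffers are tightly packed; the names `blend_a64` and `comp_mask_pred_sketch` are hypothetical, and the actual libaom functions take additional parameters that are omitted here.

```c
#include <stdint.h>

/* Blend a and b with a weight m in [0, 64], rounding to nearest. */
static uint8_t blend_a64(int m, int a, int b) {
  return (uint8_t)((m * a + (64 - m) * b + 32) >> 6);
}

/* Scalar reference: out[i] is pred[i] weighted by mask[i] and ref[i]
 * weighted by (64 - mask[i]). Assumes stride == width for every buffer. */
void comp_mask_pred_sketch(uint8_t *out, const uint8_t *pred,
                           const uint8_t *ref, const uint8_t *mask,
                           int width, int height) {
  for (int i = 0; i < width * height; ++i) {
    out[i] = blend_a64(mask[i], pred[i], ref[i]);
  }
}
```

Each output pixel depends only on the inputs at the same position, which is the property that makes this kind of loop a natural candidate for SIMD.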
    • Remove frame counts in decoding coefs area · 1694a4ff
      Yunqing Wang authored
      Continued removing count accumulation in the decoder to speed up decoding.
      
      Change-Id: I9e3b874bfc5f750297070235bdfc4d71526ed665
    • Remove frame counts in decoder · e62feb65
      Yunqing Wang authored
      On the decoder side, frame count accumulation still exists. This
      patch removes part of it; more patches will follow. This should speed up
      the decoder.
      
      This doesn't change the encoder side, since the counts are useful in
      some encoder optimizations.
      
      Change-Id: I91a021859f8d35e46618ea9232083e72a06431c8
    • txk-sel: support the fast tx type search feature · 12049df7
      Hui Su authored
      Change-Id: Ib6b07f76dd702c40841c88457ca9d96083157354
    • Fix a command line help comment · bada8230
      Yaowu Xu authored
      BUG=aomedia:1283
      
      Change-Id: I9b200d8cfb3ffcdd2fb1cece6c54a0f600d37a87
  3. 30 Jan, 2018 4 commits
    • aom_lpf_horizontal_6_sse2(): fix valgrind warnings · 5a667bfd
      Yaowu Xu authored
      BUG=aomedia:1285
      
      Change-Id: I12d522c3704083bba5c4332031dff7a01fd7dfb3
    • [loopfilter] remove filter_length_internal · ef8eb291
      Dake He authored
      filter_length_internal seems redundant in the current implementation of
      the deblocking filter.
      
      Change-Id: I40ada51857556c38cca56da33e17d0c2c82b8fc1
    • fwd txfm: cherrypick improvements from libvpx · c048a2d9
      Johann authored
      commit 9a780fa7db79b709787a9ca56fc324a118158da7
      Author: Jingning Han <jingning@google.com>
        Rework forward 8x8 2D-DCT ssse3 implementation
      
      commit 3e3a5686167a5493a5e2223635d1085cf8c963dd
      Author: Johann <johannkoenig@google.com>
        fwd txfm ssse3: use GLOBAL() for loading constants
      
      Change-Id: If7ca11a5b3c9dcf2ac7dbf8b7643e3424399d201
    • txk-sel: support tx pruning speed feature · 35c82b47
      Hui Su authored
      Compression performance change is < 0.03%;
      Encoder speedup 6~19% on speed 0.
      
      Change-Id: I988cf0b0ab0a9ad9985bfb3b8aca672c7554525c