1. 02 Feb, 2018 4 commits
    • Port first pass stats handling from Vp9 into Av1 · da01c0dc
      Debargha Mukherjee authored
      Some parameter tuning included.
      
      lowres (q, 30 frames, speed 1):
      -1.243% av PSNR, -2.337% ov PSNR, +0.577% SSIM
      
      lowres (vbr, 30 frames, speed 1):
      -0.327% av PSNR, -1.007% ov PSNR, +0.182% SSIM
      
      A few videos become a lot worse in SSIM, which needs to be
      investigated. But PSNR-wise the patch seems pretty good.
      
      Change-Id: I17c8d812c96ee49ddae7d3959a459aa3ffcea208
    • Remove aom_comp_mask_upsampled_pred from rtcd · f8daa92d
      Peng Bin authored
      Since aom_comp_mask_upsampled_pred just calls aom_upsampled_pred
      and aom_comp_mask_pred, there is no need to keep separate C and
      SIMD versions any more.
      
      Change-Id: I1ff8bcae87d501c68a80708fd2dc6b74c6952f88
    • Align allocated buffers · a64c05b5
      Yaowu Xu authored
      BUG=aomedia:1306
      
      Change-Id: I5a8bdbd472213ded2de706c5b044a1bf24823670
    • Fix txk-sel unit test failure in aq-mode · 1e79d90e
      Jingning Han authored
      The current aq mode encoder setting would alter the segment_id
      between the rate-distortion optimization and the block encoding
      stages. Disable the corresponding consistency check in this case.
      
      BUG=aomedia:1251
      
      Change-Id: Ic910a23fd64a9b4554567d3c8c9a9ae5f6062c7b
  2. 01 Feb, 2018 14 commits
  3. 31 Jan, 2018 22 commits
    • Revert "Enable pic (position independent code) config." · 32bb5110
      Johann authored
      This reverts commit fbeee067.
      
      The assembly which was failing has been fixed.
      
      BUG=aomedia:102
      
      Change-Id: Ide75630b38603a2553f6e231085994251c77b26c
    • Fix the fast tx type search feature with txk-sel on · 8f47f705
      Hui Su authored
      Change-Id: If0b1d2fe31569104f2d8eef3cfd42cab30162c7e
    • Reduce memory usage of inter_tx_size[] in MB_MODE_INFO · 1379beb7
      Hui Su authored
      Reduce the length of inter_tx_size[] from 1024 to 16.
      
      On a cif test sequence,
      encoder memory consumption decreases by 18% (380MB -> 312MB);
      decoder memory consumption decreases by 56% (21.4MB -> 9.4MB).
      
      Change-Id: Ie11dd055255d200954b704b8c2ad8ca3dff7bf5c
    • Fix a couple of visual studio nightly build failures. · f19a91cb
      Tom Finegan authored
      - Add explicit cast of bool to int to silence a test warning.
      - Add explicit cast of size_t to int for same in dump_obu.
      
      Change-Id: I90846eb5c88880d921f20cb66b116ab7d2799af5
    • Integrate lv_map with aom_qm · b3167a65
      Angie Chiang authored
      BUG=aomedia:717
      
      Change-Id: Ib06a12039cb72665c1ee534cc2246ac3d23f878d
    • add scalability experiment · f8589863
      Soo-Chul Han authored
      cmake: -DCONFIG_SCALABILITY=1
      
      Change-Id: Ifa908f809bcf904bdf0ed87b351e1ef3accc2b3f
    • use GLOBAL() macro when loading constant · 4972ac81
      Johann authored
      Clear linker error when building with gcc 6:
      relocation R_X86_64_32 against `.rodata' can not be used when making a
      shared object; recompile with -fPIC
      
      BUG=aomedia:102
      
      Change-Id: I6c06de1e9dac1c044a4b07125abcaba0943a29b6
    • one level less of tx size search for blocks larger than 64 · 7ed7e1fa
      Hui Su authored
      3~5% encoding speedup for speed 0; no quality loss.
      
      Change-Id: I0e31755f45253e5e99d8d9eed0d7a6fe6050f49f
    • Cleanup some fragile aspects of rd_pick_partition. · 00c6e6f7
      Urvang Joshi authored
      (1) Explicitly reset RD stats for each partition.
      
      Earlier,
      PARTITION_SPLIT was the only one resetting the RD_STATS in 'sum_rdc'.
      
      But this was working because:
      - PARTITION_SPLIT was tried before VERT, HORZ, VERT_4 and HORZ_4; and
      - RD cost calculations in VERT, HORZ, VERT_4 and HORZ_4 partitions
      implicitly discarded existing value in sum_rdc
      
      However, that was very fragile; explicitly resetting the stats every
      time is much safer.
      
      (2) Using a separate variable 'temp_best_rd_cost' was fragile as someone
      may forget to update the same. So, we use best_rdc.rdcost directly.
      
      BUG=aomedia:1246
      
      Change-Id: Icd75f25c34bb0f1806e691784648bcffce2417e6
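The explicit-reset pattern described in this commit can be sketched roughly as follows. This is only an illustrative model, not the libaom code: `pick_partition`, the partition names used as dictionary keys, and the flat list of sub-block costs are all hypothetical stand-ins for `rd_pick_partition`, `sum_rdc` and `best_rdc.rdcost`.

```python
def pick_partition(candidates):
    """Pick the cheapest partition type.

    `candidates` maps a (hypothetical) partition name to the list of
    RD costs of its sub-blocks.
    """
    best_name, best_cost = None, float("inf")
    for name, sub_costs in candidates.items():
        # Explicit reset for every partition type, so the result does not
        # depend on the order in which partition types are tried.
        sum_cost = 0
        for cost in sub_costs:
            sum_cost += cost
            # Compare against the best cost directly instead of keeping a
            # separate temporary copy that could go stale.
            if sum_cost >= best_cost:
                break  # this partition type cannot win
        else:
            best_name, best_cost = name, sum_cost
    return best_name, best_cost
```

Because the accumulator is cleared at the top of every iteration, reordering the entries of `candidates` does not change which partition type is selected.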
    • AVX2 optimization of motion compensation functions · c8e0336a
      Deepa K G authored
      AVX2 implementations of av1_convolve_x_sr, av1_convolve_y_sr and
      av1_convolve_2d_sr have been added.
      
      Improvements have been made to av1_convolve_x_avx2, av1_convolve_y_avx2
      and av1_convolve_2d_avx2.
      
      Change-Id: I62a699dd9dcf42de94dd72cc2d43affc0dc31404
    • Add information about extra CMake build flags to README.md · aa71f071
      Tom Finegan authored
      BUG=aomedia:1296
      
      Change-Id: If9f944b58f23cdb71f919bd391f6b37e27b271f1
    • Update adst4 range · 5d7c1fcc
      Angie Chiang authored
      Serialize the adst4 operations
      Update stage range accordingly
      Change the cos_bit precision accordingly.
      Correct 4x8/8x4 inv_start_range
      
      BUG=aomedia:1271
      
      Change-Id: I10bc91585a61d790decdc24cb91659102e043620
    • [jnt-comp, normative] Avoid double-rounding in prediction · 39cf8061
      David Barker authored
      As per the linked bug report, the distance-weighted compound
      prediction has two separate round operations, first by 3
      bits (inside the various convolve functions), then by 10 bits
      (after the convolution functions).
      
      We can improve on this by right shifting by 3 bits, without
      rounding, inside the convolve functions - this is equivalent to
      doing a single round by 13 bits at the end.
      
      Note: In the encoder, when doing joint_motion_search(), we do
      things a bit differently: So that we can try modifying the two
      "sides" of the prediction independently, we predict each side as
      if it were a single prediction (including rounding), then blend
      these single predictions together.
      
      This is already an approximation to the "real" prediction, even
      in the non-jnt-comp case. So we leave that code path as-is.
      
      BUG=aomedia:1289
      
      Change-Id: I9ad1fbcb3e12db2b5fc3c82b407f0fd9e6b39750
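The rounding argument in this commit can be checked with a small integer sketch. The helper names below are hypothetical and the code only models the arithmetic (a rounding shift adds half of the divisor before shifting); it is not the libaom convolve code.

```python
def round_shift(x, n):
    """Right shift by n bits with rounding (add half, then shift)."""
    return (x + (1 << (n - 1))) >> n

def double_round(x):
    """Old behaviour: round by 3 bits, then round by 10 bits."""
    return round_shift(round_shift(x, 3), 10)

def truncate_then_round(x):
    """New behaviour: plain shift by 3 bits, then round by 10 bits."""
    return round_shift(x >> 3, 10)

def single_round(x):
    """Reference: one rounding shift by 13 bits."""
    return round_shift(x, 13)
```

For example, with x = 4092 the double round gives 1, while both the truncating variant and the single 13-bit round give 0, which is the discrepancy the commit removes.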
    • BUG FIX: sse2 subpel variance is not PIC compliant · 0cf864fd
      Johann authored
      cherry-picked from libvpx:
        commit cb9f4dc1056b39383595f658cfcd166833bc0097
        Author: Scott LaVarnway <slavarnway@google.com>
        Date:   Sat Jan 13 07:01:04 2018 -0800
      
      BUG=aomedia:102
      
      Change-Id: Ie1736ea0787f4dad80204dcf5251fbb02d79541e
    • Added HighBD support for mismatch debugging · 5b084ee1
      Imdad Sardharwalla authored
      Enabling CONFIG_MISMATCH_DEBUG with highbd streams was producing undefined
      behaviour. This patch adds support for highbd frames.
      
      BUG=aomedia:1246
      
      Change-Id: I36ff4ddbb9b2e884e4a5b76485247a20b1f5db3c
    • Merge in STRIPED_LOOP_RESTORATION flag · 5105f7ac
      Debargha Mukherjee authored
      CONFIG_LOOP_RESTORATION still exists.
      Only CONFIG_STRIPED_LOOP_RESTORATION has been merged into
      CONFIG_LOOP_RESTORATION as always 1.
      
      Change-Id: I37d7a1fcd4cbb56e2fc037b1568ae63f72ed6a66
    • Update configuration comment about LOWBITDEPTH · 1e3da463
      Sebastien Alaiwan authored
      The comment was misleading as the codec always supports 8-bit,
      regardless of the value of CONFIG_LOWBITDEPTH.
      This flag just enables the optimized-for-8-bits pipeline,
      without changing the actual YUV output.
      
      Change-Id: Ic2f041870acf4e2ee435021aa42e8f013ef52b78
    • Reduce scope of ctx derivation · 46475a30
      Frederic Barbier authored
      Change-Id: Ic8050cada6dc9dd14152da98ee21bb37042069e6
    • Conditionally skip transform block partition search · eb8f5e87
      Jingning Han authored
      Speed up recursive transform block partition search. When a txfm
      block is selected with all-zero coefficients, skip the search over
      further split partitions.
      
      Tested with txk-sel on, this makes the speed 0 / 1 both 10 - 15%
      faster at medium - high target bit-rate range. The coding
      performance change is neutral - 0.011% better for lowres set.
      
      Change-Id: I1247f3d5a33d15bf4bc5f0bcbac2bf1f3e1aca2e
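A rough sketch of the early-exit idea in this commit, under stated assumptions: `search_tx_partition` and the `eval_block` callback (returning an RD cost and an all-zero flag for a square block) are hypothetical names for illustration, and the real encoder search is considerably more involved.

```python
def search_tx_partition(x, y, size, eval_block, min_size=4):
    """Return the best cost for the square block at (x, y).

    `eval_block(x, y, size)` is an assumed callback returning
    (cost, all_zero) for coding the block without splitting.
    """
    cost, all_zero = eval_block(x, y, size)
    # Skip the recursive split search when the block codes as all zeros,
    # or when it cannot be split any further.
    if all_zero or size <= min_size:
        return cost
    half = size // 2
    split_cost = sum(
        search_tx_partition(x + dx, y + dy, half, eval_block, min_size)
        for dy in (0, half)
        for dx in (0, half)
    )
    return min(cost, split_cost)
```

In this model an all-zero block costs a single evaluation, whereas a non-zero block also evaluates each of its four quadrants recursively.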
    • dependent-horztilegroups: Fix decoder crash · 13025199
      David Barker authored
      The tg_horz_boundary flag should always be 0 for the topmost
      tile row, even when dependent-horztilegroups is enabled.
      Otherwise, we end up trying to fetch data off the top of the
      frame, which results in segfaults.
      
      BUG=aomedia:1252
      
      Change-Id: I7caaa2b38a21c05ffb13b6c72f41f8f6e1982b69
    • Add aom_comp_mask_<upsampled>pred_ssse3 · 33ba1fe5
      Peng Bin authored
      1) For encoder speed, overall ~1% faster with no impact on coding performance.
      2) aom_comp_mask_pred_ssse3 is 3.5x - 6x faster than aom_comp_mask_pred_c.
      3) aom_comp_mask_upsampled_pred_ssse3 is 1.5x - 3x faster than
      aom_comp_mask_upsampled_pred_c; for the special case where subpel_x ==
      subpel_y == 0, the optimized version achieves a 4x - 7x speedup.
      
      Unit tests for both functions have been added.
      
      Change-Id: Ib498317975e0dbd9cdcf61be327b640dfac9a7e5
    • Remove frame counts in decoding coefs area · 1694a4ff
      Yunqing Wang authored
      Continue removing count accumulation in the decoder to speed up decoding.
      
      Change-Id: I9e3b874bfc5f750297070235bdfc4d71526ed665