1. 02 Feb, 2018 8 commits
    • Cleanup deprecated comments · e5d166ef
      Frederic Barbier authored
      Change-Id: I91f18c498c694829b933bb73812ad94d66962994
    • AVX2 implementation of the Wiener filter · aab6aee3
      Imdad Sardharwalla authored
      Added an AVX2 version of the Wiener filter, along with associated tests. Speed
      tests have been added for all implementations of the Wiener filter.
      Speed Test results
      Low bit-depth filter:
      - SSE2 vs C: SSE2 takes ~92% less time
      - AVX2 vs C: AVX2 takes ~96% less time
      - SSE2 vs AVX2: AVX2 takes ~43% less time (~74% faster)
      High bit-depth filter:
      - SSSE3 vs C: SSSE3 takes ~92% less time
      - AVX2  vs C: AVX2  takes ~96% less time
      - SSSE3 vs AVX2: AVX2 takes ~46% less time (~84% faster)
      Low bit-depth filter:
      - SSE2 vs C: SSE2 takes ~84% less time
      - AVX2 vs C: AVX2 takes ~88% less time
      - SSE2 vs AVX2: AVX2 takes ~27% less time (~36% faster)
      High bit-depth filter:
      - SSSE3 vs C: SSSE3 takes ~85% less time
      - AVX2  vs C: AVX2  takes ~89% less time
      - SSSE3 vs AVX2: AVX2 takes ~24% less time (~31% faster)
      Change-Id: Ide22d7c09c0be61483e9682caf17a39438e4a208
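The speed-test lines above quote both "X% less time" and "Y% faster". As a rough sketch of how the two figures relate (the helper names below are hypothetical and not part of the commit), a run that saves a fraction f of the time is 1 / (1 - f) times as fast:

```python
def speedup_from_time_saved(saved_fraction):
    """Return the speedup factor for a given fraction of time saved.

    Hypothetical helper for illustration: saving half the time (0.5)
    means running 2x as fast.
    """
    if not 0 <= saved_fraction < 1:
        raise ValueError("saved_fraction must be in [0, 1)")
    return 1.0 / (1.0 - saved_fraction)


def percent_faster(saved_percent):
    """Convert 'takes N% less time' into 'is M% faster'."""
    return (speedup_from_time_saved(saved_percent / 100.0) - 1.0) * 100.0


if __name__ == "__main__":
    # Rounded "less time" figures quoted in the commit message.
    for saved in (43, 46, 27, 24):
        print(f"{saved}% less time is about {percent_faster(saved):.0f}% faster")
```

The results land within a point or two of the figures in the commit message; the small differences are expected, since the quoted percentages are themselves rounded.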
    • Don't use extra lines for r=2 guided filter · f7d1ff49
      Debargha Mukherjee authored
      Changes the CONFIG_FAST_SGR=1 strategy so that the r=1 filter
      uses no subsampling, while the r=2 filter subsamples vertically
      and, for odd rows, combines only by filtering horizontally in
      the last stage.
      Coding efficiency loss seems quite minimal.
      Change-Id: I5644ac400b387c37a2d278db7f6ad3ac0a6b5e93
    • Remove CONFIG_FRAME_SIGN_BIAS config flag · 23b54841
      Debargha Mukherjee authored
      Change-Id: I6138519456b2ad3ffc8bced803ddc4418b246e74
    • Port first pass stats handling from Vp9 into Av1 · da01c0dc
      Debargha Mukherjee authored
      Some parameter tuning included.
      lowres (q, 30 frames, speed 1):
      -1.243% av PSNR, -2.337% ov PSNR, +0.577% SSIM
      lowres (vbr, 30 frames, speed 1):
      -0.327% av PSNR, -1.007% ov PSNR, +0.182% SSIM
      A few videos become a lot worse in SSIM, which needs to be
      investigated. But PSNR-wise the patch seems pretty good.
      Change-Id: I17c8d812c96ee49ddae7d3959a459aa3ffcea208
    • Remove aom_comp_mask_upsampled_pred from rtcd · f8daa92d
      Peng Bin authored
      Since aom_comp_mask_upsampled_pred just calls aom_upsampled_pred
      and aom_comp_mask_pred, there is no need to separate the C version
      from the SIMD version any more.
      Change-Id: I1ff8bcae87d501c68a80708fd2dc6b74c6952f88
    • Align allocated buffers · a64c05b5
      Yaowu Xu authored
      Change-Id: I5a8bdbd472213ded2de706c5b044a1bf24823670
    • Fix txk-sel unit test failure in aq-mode · 1e79d90e
      Jingning Han authored
      The current aq mode encoder setting would alter the segment_id
      between the rate-distortion optimization and the block encoding
      stages. Disable the corresponding consistency check in this case.
      Change-Id: Ic910a23fd64a9b4554567d3c8c9a9ae5f6062c7b
  2. 01 Feb, 2018 14 commits
  3. 31 Jan, 2018 18 commits
    • Revert "Enable pic (position independent code) config." · 32bb5110
      Johann authored
      This reverts commit fbeee067.
      The assembly which was failing has been fixed.
      Change-Id: Ide75630b38603a2553f6e231085994251c77b26c
    • Fix the fast tx type search feature with txk-sel on · 8f47f705
      Hui Su authored
      Change-Id: If0b1d2fe31569104f2d8eef3cfd42cab30162c7e
    • Reduce memory usage of inter_tx_size[] in MB_MODE_INFO · 1379beb7
      Hui Su authored
      Reduce the length of inter_tx_size[] from 1024 to 16.
      On a cif test sequence,
      encoder memory consumption decreases by 18% (380MB -> 312MB);
      decoder memory consumption decreases by 56% (21.4MB -> 9.4MB).
      Change-Id: Ie11dd055255d200954b704b8c2ad8ca3dff7bf5c
    • Fix a couple of visual studio nightly build failures. · f19a91cb
      Tom Finegan authored
      - Add explicit cast of bool to int to silence a test warning.
      - Add explicit cast of size_t to int for same in dump_obu.
      Change-Id: I90846eb5c88880d921f20cb66b116ab7d2799af5
    • Integrate lv_map with aom_qm · b3167a65
      Angie Chiang authored
      Change-Id: Ib06a12039cb72665c1ee534cc2246ac3d23f878d
    • add scalability experiment · f8589863
      Soo-Chul Han authored
      Change-Id: Ifa908f809bcf904bdf0ed87b351e1ef3accc2b3f
    • use GLOBAL() macro when loading constant · 4972ac81
      Johann authored
      Clear linker error when building with gcc 6:
      relocation R_X86_64_32 against `.rodata' can not be used when making a
      shared object; recompile with -fPIC
      Change-Id: I6c06de1e9dac1c044a4b07125abcaba0943a29b6
    • one level less of tx size search for blocks larger than 64 · 7ed7e1fa
      Hui Su authored
      3~5% encoding speedup for speed 0; no quality loss.
      Change-Id: I0e31755f45253e5e99d8d9eed0d7a6fe6050f49f
    • Cleanup some fragile aspects of rd_pick_partition. · 00c6e6f7
      Urvang Joshi authored
      (1) Explicitly reset RD stats for each partition.
      PARTITION_SPLIT was the only one resetting the RD_STATS in 'sum_rdc'.
      But this was working because:
      - PARTITION_SPLIT was tried before VERT, HORZ, VERT_4 and HORZ_4; and
      - RD cost calculations in VERT, HORZ, VERT_4 and HORZ_4 partitions
      implicitly discarded existing value in sum_rdc
      However, that was very fragile; explicitly resetting the stats every
      time is much safer.
      (2) Using a separate variable 'temp_best_rd_cost' was fragile as someone
      may forget to update the same. So, we use best_rdc.rdcost directly.
      Change-Id: Icd75f25c34bb0f1806e691784648bcffce2417e6
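The fragility described in point (1) can be illustrated with a toy model. This is only a sketch under stated assumptions: `pick_partition`, its arguments, and the cost numbers are hypothetical and do not reflect the actual rd_pick_partition code. It shows how an accumulator that is not reset before each candidate lets a stale total leak into the next one:

```python
def pick_partition(candidates, reset_each_time):
    """Pick the cheapest candidate from (name, [sub-block costs]) pairs.

    Hypothetical model: 'sum_cost' plays the role of an accumulator such
    as sum_rdc. When reset_each_time is False, the total from the previous
    candidate leaks into the next one.
    """
    best_name, best_cost = None, float("inf")
    sum_cost = 0
    for name, sub_costs in candidates:
        if reset_each_time:
            sum_cost = 0  # explicit reset before evaluating each candidate
        for cost in sub_costs:
            sum_cost += cost
        # Compare against the best cost directly, with no shadow variable.
        if sum_cost < best_cost:
            best_name, best_cost = name, sum_cost
    return best_name, best_cost


if __name__ == "__main__":
    # Made-up costs: VERT (65) is really cheaper than HORZ (85).
    candidates = [("HORZ", [40, 45]), ("VERT", [30, 35])]
    print(pick_partition(candidates, reset_each_time=True))
    print(pick_partition(candidates, reset_each_time=False))
```

With the reset, the cheaper VERT candidate wins; without it, VERT is charged HORZ's leftover total and loses. The sketch also compares against the running best cost directly, in the spirit of point (2).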
    • AVX2 optimization of motion compensation functions · c8e0336a
      Deepa K G authored
      AVX2 implementations of av1_convolve_x_sr, av1_convolve_y_sr and
      av1_convolve_2d_sr have been added.
      Improvements have been made to av1_convolve_x_avx2, av1_convolve_y_avx2
      and av1_convolve_2d_avx2.
      Change-Id: I62a699dd9dcf42de94dd72cc2d43affc0dc31404
    • Add information about extra CMake build flags to README.md · aa71f071
      Tom Finegan authored
      Change-Id: If9f944b58f23cdb71f919bd391f6b37e27b271f1
    • Update adst4 range · 5d7c1fcc
      Angie Chiang authored
      Serialize the adst4 operations
      Update stage range accordingly
      Change the cos_bit precision accordingly.
      Correct 4x8/8x4 inv_start_range
      Change-Id: I10bc91585a61d790decdc24cb91659102e043620
    • [jnt-comp, normative] Avoid double-rounding in prediction · 39cf8061
      David Barker authored
      As per the linked bug report, the distance-weighted compound
      prediction has two separate round operations, first by 3
      bits (inside the various convolve functions), then by 10 bits
      (after the convolution functions).
      We can improve on this by right shifting by 3 bits inside the
      convolve functions - this is equivalent to doing a single round
      by 13 bits at the end.
      Note: In the encoder, when doing joint_motion_search(), we do
      things a bit differently: So that we can try modifying the two
      "sides" of the prediction independently, we predict each side as
      if it were a single prediction (including rounding), then blend
      these single predictions together.
      This is already an approximation to the "real" prediction, even
      in the non-jnt-comp case. So we leave that code path as-is.
      Change-Id: I9ad1fbcb3e12db2b5fc3c82b407f0fd9e6b39750
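The equivalence claimed above can be checked on a simplified single-value model. This is a sketch only: the function names are hypothetical, and the real code paths operate on weighted sums of convolution outputs rather than a single integer. Rounding by 3 bits and then by 10 bits can differ from one 13-bit round, whereas a plain right shift by 3 followed by a 10-bit round matches it:

```python
def round_shift(x, bits):
    """Right shift with round-half-up: add half the divisor, then shift."""
    return (x + (1 << (bits - 1))) >> bits


def double_round(x):
    """Old behaviour in this model: round by 3 bits, then round by 10 bits."""
    return round_shift(round_shift(x, 3), 10)


def truncate_then_round(x):
    """New behaviour in this model: plain shift by 3, then round by 10 bits."""
    return round_shift(x >> 3, 10)


def single_round(x):
    """Reference: a single round by 13 bits."""
    return round_shift(x, 13)


if __name__ == "__main__":
    x = 4092  # just below the 13-bit rounding threshold of 4096
    print(double_round(x), truncate_then_round(x), single_round(x))
    mismatches = sum(
        1 for v in range(-100000, 100000) if double_round(v) != single_round(v)
    )
    print("double-rounding mismatches:", mismatches)
```

For x = 4092 the first 3-bit round pushes the value up to 512, which the second round then carries to 1, while the single 13-bit round gives 0. The truncating variant never disagrees with the single round, because adding an integer offset after a floor shift and then floor shifting again is the same as one combined floor shift.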
    • BUG FIX: sse2 subpel variance is not PIC compliant · 0cf864fd
      Johann authored
      cherry-picked from libvpx:
        commit cb9f4dc1056b39383595f658cfcd166833bc0097
        Author: Scott LaVarnway <slavarnway@google.com>
        Date:   Sat Jan 13 07:01:04 2018 -0800
      Change-Id: Ie1736ea0787f4dad80204dcf5251fbb02d79541e
    • Added HighBD support for mismatch debugging · 5b084ee1
      Imdad Sardharwalla authored
      Enabling CONFIG_MISMATCH_DEBUG with highbd streams was producing undefined
      behaviour. This patch adds support for highbd frames.
      Change-Id: I36ff4ddbb9b2e884e4a5b76485247a20b1f5db3c
    • Merge in STRIPED_LOOP_RESTORATION flag · 5105f7ac
      Debargha Mukherjee authored
      CONFIG_LOOP_RESTORATION still exists.
      Only CONFIG_STRIPED_LOOP_RESTORATION has been merged into
      CONFIG_LOOP_RESTORATION as always 1.
      Change-Id: I37d7a1fcd4cbb56e2fc037b1568ae63f72ed6a66
    • Update configuration comment about LOWBITDEPTH · 1e3da463
      Sebastien Alaiwan authored
      The comment was misleading as the codec always supports 8-bit,
      regardless of the value of CONFIG_LOWBITDEPTH.
      This flag just enables the optimized-for-8-bits pipeline,
      without changing the actual YUV output.
      Change-Id: Ic2f041870acf4e2ee435021aa42e8f013ef52b78
    • Reduce scope of ctx derivation · 46475a30
      Frederic Barbier authored
      Change-Id: Ic8050cada6dc9dd14152da98ee21bb37042069e6