1. 10 Oct, 2017 22 commits
    • Correct the aom{dec,enc} output paths in CMake. · 938172c2
      Tom Finegan authored
      They're expected in the root of the config dir and not
      in the examples sub dir.
      
      Change-Id: I26e28e5a341f5bf8db4554269db198501172345e
    • lgt-from-pred: transforms based on prediction · 432012f6
      Lester Lu authored
      In this experiment, sharp image discontinuity in the predicted
      block is detected. Based on this discontinuity, we choose
      particular LGTs as row and column transforms.
      
      Bitstream syntax, entropy coding, and RD search for LGT are added.
      One binary symbol is used to signal whether LGT is used. This
      experiment can work independently of the lgt experiment.
      
      lowres: -0.414% for key frames, -0.151% overall
      midres: -0.413% for key frames, -0.161% overall
      
      Change-Id: Iaa2f2c2839c34ca4134fa55e77870dc3f1fa879f
    • Turn off limit_nb_scan_distance() temporarily · 63647c02
      Angie Chiang authored
      Change-Id: Idb1a4bf4dd655bde22862d76f6fa70457381a770
    • Add REDUCE_CONTEXT_DEPENDENCY flag · 4408aad9
      Angie Chiang authored
      This flag will allow us to calculate the context indexes of
      any two consecutive non-zero binaries in parallel.

      Moreover, we can set MIN_SCAN_IDX_REDUCE_CONTEXT_DEPENDENCY to X,
      which exempts the first X coefficients from the context
      dependency reduction.
      
      Change-Id: I75b71452996161ba06ec449021c7dea8e3899800
    • Pass scan_idx and scan into get_nz_map_ctx · f9711f88
      Angie Chiang authored
      This aims to facilitate the experiment on reducing context
      dependency.
      
      Change-Id: I3d026bda1118cf613001efa32deed62997d5e3bb
    • Add frame-level flag to turn on/off adapt_scan · 6dbffbf1
      Angie Chiang authored
      Change-Id: I7a73dbe72b618e795191cc31bc32e31ad99d8587
    • Use pixel domain skip error if possible in var-tx · 952eae29
      Yushin Cho authored
      When a block is skipped early in var-tx, distortion is set equal to sse.
      In that case, use the pixel domain sse (i.e. skip error), since it is
      more accurate than the sse from the transform domain.
      
      Change-Id: Id3cbc66ea6318108c031413646f3d06250e75e7e
    • intrabc: fix mismatch · 2b2ad0fa
      Hui Su authored
      The "txb_split_count" counter should be properly updated.
      
      BUG=aomedia:864
      
      Change-Id: I3fb34a818c3f474085c4a2980a2d3b68bd33fb12
    • Refine do_adapt_scan's logic · fe533ec6
      Angie Chiang authored
      Change-Id: I6d68f03e3f9b1e40b05503f6bb4055e2fd870893
    • Process OBMC pred in max unit of 64x64 · 7eb7679d
      Yue Chen authored
      Make the codec account for the 64x64 processing unit constraint
      when generating secondary predictions and applying overlapped
      filter.
      
      This issue was addressed in commits 440d4254 and 501294ce, but
      some features were not fully retained in a later OBMC
      refactoring commit.
      
      Change-Id: I6f16e6fccb966d45034d5b55447c9d9cb70e02cb
    • Migrate some vp9 highbd intrapred x86 speedup to av1 · 71b6e043
      Yi Luo authored
      Function speedup on i7-6700:
      D117   sse2   ssse3
      4x4    ~1.8x
      8x8           ~3.4x
      16x16         ~5.5x
      32x32         ~2.9x
      
      D135   sse2   ssse3
      4x4    ~1.9x
      8x8           ~3.3x
      16x16         ~5.3x
      32x32         ~3.6x
      
      D153   sse2   ssse3
      4x4    ~1.9x
      8x8           ~2.8x
      16x16         ~5.5x
      32x32         ~3.6x
      
      Change-Id: I43ab5fa8dcbcfa51acbde554abf3e5d7d336f391
    • Fix conflicts between ext-partition & other expts · e30159ce
      Debargha Mukherjee authored
      Most of the fixes are related to replacing BLOCK_64X64 with
      cm->sb_size.
      
      Fixes the AV1/AqSegmentTest.TestNoMisMatchExtDeltaQ/* tests that
      were previously breaking with ext-partition.
      
      Change-Id: I19d6045b422a93891b8cf4f8a929def97a595058
    • Avoid Visual Studio compile error in loopfilter · a1befa51
      Rupert Swarbrick authored
      If you have a structure, foo_t, with an alignment request then Visual
      Studio won't allow you to declare a function
      
        void use_foo(foo_t x);
      
      The reasoning is that x might be passed on the stack, and their ABI
      doesn't allow them to guarantee that x is aligned appropriately. More
      strangely, this isn't allowed either:
      
        void use_some_foos(foo_t x[10]);
      
      This is functionally equivalent to:
      
        void use_windows_foos(foo_t *x);
      
      (except that you can't tell how long the array should be from the
      function signature).
      
      Since Visual Studio is supposed to allow the latter form, use that
      instead.
      
      Change-Id: Icd449fc1058606fa7e48a6f791091bbb42a73b2c
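A minimal C sketch of the workaround described in the commit above. Only `foo_t` and the rejected signatures come from the commit message. The 16-byte alignment, the `v` field, and the `sum_foos` function are assumptions for illustration, and C11 `_Alignas` stands in for whatever alignment attribute the real structure uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure with an alignment request; 16 bytes is assumed. */
typedef struct {
  _Alignas(16) int32_t v[4];
} foo_t;

/* Forms rejected by Visual Studio according to the commit message:
 *   void use_foo(foo_t x);
 *   void use_some_foos(foo_t x[10]);
 * The pointer form is the one the commit switches to. The length is passed
 * separately because the signature no longer hints at it. */
int32_t sum_foos(const foo_t *x, size_t count) {
  assert(x != NULL);
  int32_t total = 0;
  for (size_t i = 0; i < count; ++i) {
    for (size_t j = 0; j < 4; ++j) total += x[i].v[j];
  }
  return total;
}
```

Callers keep their aligned arrays as before and pass them by pointer, for example `sum_foos(foos, 2)` for a two-element `foo_t foos[2]`.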
    • Tiny cleanup in cdef_test.cc · e5442928
      Rupert Swarbrick authored
      This was triggered by a Visual Studio compile warning:
      
        cdef_test.cc(128):
        warning C4804: '>>': unsafe use of type 'bool' in operation
      
      However the code is rather hard to parse for humans too: when I first
      looked, I thought this was something to do with C++ templating...
      
      The new version is equivalent but defines max_pos in an outer
      loop (and has a smaller indent).
      
      Change-Id: I0c5cabeee44d0839a7956a4ab1cf4ec5abfcc9ee
    • Fix that sse is added twice during early skip in var-tx · 16efec40
      Yushin Cho authored
      The rd_stats->sse is already updated by
      "rd_stats->sse += tmp << 4;",
      which is measured by pixel_diff_dist(), i.e. in pixel domain and
      w/o quantization().
      
      Change-Id: I4dc20a7e80af9dd846aa5de4298cb56e7f0d8f7e
    • Turn on 32x64 and 64x32 transforms for real · cce6692a
      Debargha Mukherjee authored
      Change-Id: Ie4382b8a1c0f87ce50e9afefd1cef8ca55435c61
    • Remove unused parameter in intra mode reader · aa2965e6
      Hui Su authored
      Change-Id: Ibea4c2c732b16851ad16b475ea40f021d5b5d5b3
    • Compute global refmv candidate at center of current block · 0a5cc5fd
      Sarah Parker authored
      When a neighboring block uses global motion, use the mv
      computed at the center of the current block as the candidate vector
      rather than the mv computed at the center of the neighboring block.
      
      0.15% improvement on cam_lowres
      
      Change-Id: I79eff8bf27a7aa84ae4a6d56e4a10c41a4438fb9
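A rough sketch of why the evaluation point matters for the commit above: with a non-translational global motion model, the implied motion vector depends on where it is evaluated. The `affine_model_t` layout, the floating-point arithmetic, and both function names are assumptions for illustration; the codec's own model representation and fixed-point math are not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical affine model: (x, y) maps to
 * (a * x + b * y + tx, c * x + d * y + ty). */
typedef struct {
  double a, b, c, d, tx, ty;
} affine_model_t;

typedef struct {
  double row, col;
} mv_t;

/* Motion vector implied by the model at position (x, y). */
mv_t global_mv_at(const affine_model_t *m, double x, double y) {
  assert(m != NULL);
  mv_t mv;
  mv.col = (m->a * x + m->b * y + m->tx) - x;
  mv.row = (m->c * x + m->d * y + m->ty) - y;
  return mv;
}

/* Candidate for a block with top-left corner (x0, y0) and size w x h,
 * evaluated at the block's center. */
mv_t global_mv_at_block_center(const affine_model_t *m, int x0, int y0,
                               int w, int h) {
  return global_mv_at(m, x0 + w / 2.0, y0 + h / 2.0);
}
```

Under a zoom model, calling `global_mv_at_block_center` with the current block's geometry gives a different vector than calling it with the neighboring block's geometry, which is the distinction the commit message draws.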
    • Revert "soft enable CDEF-singlepass" · 1542157b
      Yaowu Xu authored
      Temporarily reverting this to allow investigation of a couple of bugs.
      
      BUG=aomedia:881
      BUG=aomedia:887 (merged into #881)
      
      This reverts commit b1d3eda9.
      
      Change-Id: I2605deb7b8fefa4236d78c8695025dc42316edd2
    • Don't trash memory in select_tx_type_yrd · de2ea94e
      Rupert Swarbrick authored
      This patch fixes a bug in select_tx_type_yrd. The function works by
      looping over possible transform types to find the best option (calling
      select_tx_size_fix_type for each). Whenever there's a new best
      candidate, the code copies information about the transform from the
      mbmi structure into stack-allocated "best candidate" structures. At
      the end, it copies the "best candidate" data back to mbmi.
      
      Before the patch, if ref_best_rd was small, each call to
      select_tx_size_fix_type might return INT64_MAX (because they don't
      find anything better than ref_best_rd) and so we'd never actually copy
      anything to the "best candidate" structures. Then, at the end of the
      function, we'd merrily overwrite mbmi with whatever happened to be on
      the stack, causing general mayhem when something tried to read the
      data from mbmi later.
      
      This patch exits early if no candidates were found. It also adds an
      assertion saying that if no candidates were found, ref_best_rd must
      have been less than INT64_MAX. This should hopefully catch any bugs
      where the continue keywords in the loop stop us ever actually calling
      select_tx_size_fix_type.
      
      Change-Id: I54b998148281dd80f98d1570f736964593dc753f
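The fix described in the commit above can be sketched as follows. This is a simplified illustration, not the code from select_tx_type_yrd: `tx_info_t`, `pick_best_tx`, and the `rd_per_type` array (which stands in for the per-type calls to select_tx_size_fix_type) are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the transform fields copied to and from mbmi. */
typedef struct {
  int tx_type;
  int tx_size;
} tx_info_t;

/* Returns the best cost, or INT64_MAX if no candidate beat ref_best_rd.
 * *info is only overwritten when a candidate was actually found. */
int64_t pick_best_tx(const int64_t *rd_per_type, int num_types,
                     int64_t ref_best_rd, tx_info_t *info) {
  /* In the buggy pattern this was copied back even when never filled in. */
  tx_info_t best_info = { 0, 0 };
  int64_t best_rd = INT64_MAX;

  for (int t = 0; t < num_types; ++t) {
    /* Simulated search: nothing better than ref_best_rd yields INT64_MAX. */
    const int64_t rd =
        rd_per_type[t] < ref_best_rd ? rd_per_type[t] : INT64_MAX;
    if (rd < best_rd) {
      best_rd = rd;
      best_info.tx_type = t;
      best_info.tx_size = 0; /* placeholder value */
    }
  }

  if (best_rd == INT64_MAX) {
    /* No candidate: ref_best_rd must have been a real bound. */
    assert(ref_best_rd < INT64_MAX);
    return INT64_MAX; /* early exit; *info is left untouched */
  }

  *info = best_info;
  return best_rd;
}
```

The early return is the point of the sketch: when every search call reports INT64_MAX, the caller's structure keeps its previous contents instead of receiving unset data.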
    • Add an SSE4.1 implementation of av1_highbd_convolve_2d_scale · 724d31eb
      Rupert Swarbrick authored
      For large blocks this is about 8x the speed of the C version. The code
      needs SSE 4.1 for the PMULLD instruction that we use to do SIMD 32-bit
      multiplies.
      
      The patch uses av1_convolve_scale_test (written already to test the
      low bit depth path) to make sure the optimised code matches the C
      version.
      
      Change-Id: I9304d6bb3d2cb31390de93ed08ff1a852e3ace86
    • Add an SSE4.1 implementation of av1_convolve_2d_scale · 98dc22b8
      Rupert Swarbrick authored
      For large blocks this is almost 8x the speed of the C version. The
      code needs SSE 4.1 for the PMULLD instruction that we use to do SIMD
      32-bit multiplies.
      
      This patch also makes av1_convolve_scale_test actually test something,
      making sure the optimised code matches the C version. The slightly
      excessive generality in the test (all the templating) is because of a
      following patch, which is for the high bit depth path and can then use
      most of the same test code.
      
      Change-Id: I6732bc6b2378ffaadae5aa6441100cf660f7ee11
  2. 09 Oct, 2017 11 commits
    • Avoid updating non-used transform · ca8016ef
      Angie Chiang authored
      Since the 32x32 transform uses DCT only, we can avoid updating
      the other transform types.
      
      Change-Id: I51dd8ec71975187d249d7e25130e994a48cac5c1
    • Change rectangular vartx recursion depth to 2 · d25ef8c6
      Sarah Parker authored
      0.15% improvement on lowres set
      
      Change-Id: If16a8e07797c64508f9e2d9b26ae874ac53c57a4
    • Catch invalid block sizes in bitstream · 415c8f1f
      Rupert Swarbrick authored
      There's a bitstream conformance requirement that says that any block
      must subsample to a valid block size with the current subsampling
      mode. For example, this means that BLOCK_4X8 is illegal if there is
      subsampling in only the horizontal direction (since there is no
      BLOCK_2X8).
      
      This patch checks the bitstream is conformant as it reads partition
      information in decodeframe.c
      
      BUG=aomedia:875
      
      Change-Id: I18139aa76d6f965282402edbb0b68959478a46c3
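A toy sketch of the conformance rule stated in the commit above, assuming a small hypothetical table of legal block dimensions. The table, the type, and the function names are illustrative only and do not reproduce the codec's block-size lookup. In particular, the way the real codec treats the smallest chroma blocks is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, abbreviated list of legal block dimensions. */
typedef struct {
  int w, h;
} block_dims_t;

static const block_dims_t kValidBlocks[] = {
  { 4, 4 }, { 4, 8 }, { 8, 4 }, { 8, 8 }, { 8, 16 }, { 16, 8 }, { 16, 16 },
};

static bool is_valid_block(int w, int h) {
  const size_t n = sizeof(kValidBlocks) / sizeof(kValidBlocks[0]);
  for (size_t i = 0; i < n; ++i) {
    if (kValidBlocks[i].w == w && kValidBlocks[i].h == h) return true;
  }
  return false;
}

/* ss_x / ss_y are 1 when the plane is subsampled in that direction. */
bool block_subsamples_to_valid_size(int w, int h, int ss_x, int ss_y) {
  assert(ss_x == 0 || ss_x == 1);
  assert(ss_y == 0 || ss_y == 1);
  return is_valid_block(w >> ss_x, h >> ss_y);
}
```

With horizontal-only subsampling, a 4x8 block maps to 2x8, which is not in the table, so the check fails in the same way as the BLOCK_4X8 example in the commit message.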
    • Revert wrong uses of TX_SIZE enum. · ab8840eb
      Urvang Joshi authored
      Introduced by: https://aomedia-review.googlesource.com/c/aom/+/25181
      
      Change-Id: I1f25178d6b273fbeade4c33f153b5f2bac4a8b99
    • Add av1_convolve_scale_test · 1ea7ab4e
      Rupert Swarbrick authored
      This unit test doesn't actually provide any test coverage and merely
      exists to benchmark the C function, av1_convolve_2d_scale_c. The
      following patch will add an SSE version of that function and extend
      this test to check that the SSE code matches the C code.
      
      Change-Id: Ic942ad8f9fd57d2659fc60f92c5a0b6c9a9f8cac
    • Enable 32x64 and 64x32 transforms · 1a86b013
      Debargha Mukherjee authored
      Change-Id: I73e9d2d327b062828a75bc99fb348441dd32174a
    • Resolve some static analysis warnings · e36a08c4
      Debargha Mukherjee authored
      Change-Id: Iaff923f34100ecdce76d2319fab67cde59d485ae
    • Match braces in VIM for rdopt.c · 1483a714
      Cheng Chen authored
      Change-Id: I23344af711d9a31b819fca35ae3ad3b7edf4852e
    • Define block_signals_txsize function · fcff0b25
      Rupert Swarbrick authored
      The new function returns true if a block signals tx_size in the
      stream. This patch uses it in the bitstream writing code and the
      decoder.
      
      Note that we can't quite use it in pack_inter_mode_mvs when
      CONFIG_VAR_TX && !CONFIG_RECT_TX but I've switched the code to using
      it the rest of the time since rect-tx is adopted and eventually the
      other code path should be deleted.
      
      Also use the helper function in tx_size_cost in rdopt.c, where the
      test was wrong and caused underestimates of block
      costs. (Specifically, the code that subtracts tx_size_cost from
      this_rate_tokenonly in rd_pick_intra_sby_mode ended up subtracting
      zero for a 4x8 block).
      
      The behaviour of the decoder should be unchanged. The only change in
      the encoder's behaviour should be in tx_size_cost where it should now
      match the rest of the code.
      
      Change-Id: I97236c9ce444993afe01ac5c6f4a0bb9e5049217
    • Add experiment ext_skip · a3c5b9da
      Zoe Liu authored
      This coding tool introduces a new prediction mode for
      bi-predictive frames that have a forward reference within 2 frames
      away (distance denoted as 'fwd_delta') and a backward reference
      within (3-fwd_delta) frames away.

      If this prediction mode, namely 'ext_skip', is set, the block is coded
      using compound prediction with the most recent forward and backward
      reference frames as its reference pair, NEARESTMV as its motion mode,
      and the skip flag set for the residue.
      
      Change-Id: I826034ccf1a956f4b350f0bc2e2dca8ea71b5197
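The reference-distance condition in the commit above can be written as a small predicate. This is a sketch under assumptions: the function name and the `bwd_delta` parameter are hypothetical, distances are taken to be positive frame counts, and "within" is read as inclusive.

```c
#include <assert.h>
#include <stdbool.h>

/* fwd_delta: distance to the forward reference (named in the commit).
 * bwd_delta: distance to the backward reference (hypothetical name).
 * Returns true when both distances satisfy the stated limits. */
bool ext_skip_refs_in_range(int fwd_delta, int bwd_delta) {
  assert(fwd_delta > 0 && bwd_delta > 0);
  if (fwd_delta > 2) return false;   /* forward reference too far away */
  return bwd_delta <= 3 - fwd_delta; /* backward limit shrinks with fwd */
}
```

Under this reading, the allowed (fwd_delta, bwd_delta) pairs are (1, 1), (1, 2), and (2, 1).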
    • Add encoder/decoder support to frame_sign_bias · 17af2748
      Zoe Liu authored
      The frame sign bias value will not be signaled in the frame header.
      Instead, the sign bias of each reference frame is derived from its
      corresponding frame offset at both the encoder and the decoder.

      The 'frame_sign_bias' tool depends on 'frame_marker'. Compared
      against the baseline, enabling both tools gives a small coding gain
      of -0.08 ~ -0.11% in BDRate over the Google lowres/midres tests.
      
      Change-Id: I8d85dc427ced0b2152712ccf61be4be6068075b9
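A minimal sketch of deriving sign bias from frame offsets instead of reading it from the header, as the commit above describes. The function name, the array layout, and the convention that a bias of 1 marks a reference whose offset is later than the current frame's are assumptions; offsets are treated as plain integers with no wraparound.

```c
#include <assert.h>
#include <stddef.h>

/* Fills sign_bias[i] with 1 when reference i has a later frame offset than
 * the current frame, and 0 otherwise. Encoder and decoder can both run this
 * on the same offsets, so nothing needs to be signaled. */
void derive_sign_bias(const int *ref_frame_offsets, size_t num_refs,
                      int cur_frame_offset, int *sign_bias) {
  assert(ref_frame_offsets != NULL && sign_bias != NULL);
  for (size_t i = 0; i < num_refs; ++i) {
    sign_bias[i] = ref_frame_offsets[i] > cur_frame_offset ? 1 : 0;
  }
}
```

For reference offsets 3, 5, and 9 with a current offset of 5, only the last reference is marked with a bias of 1.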
  3. 08 Oct, 2017 7 commits