  1. 08 Dec, 2017 4 commits
    • daala_tx: New flattened 16-point Type-IV DST. · 37131cfd
      Nathan E. Egge authored
      Change-Id: Ic741f269d0bd5e5e295b55f95bfef05050bc31e5
    • no-frame-context-signaling + q-adapt-probs: Fix interaction · 11eac7bf
      David Barker authored
      Slightly change the way we save and reload frame contexts during
      frame setup. For "normal" frames everything is the same, but for
      error-resilient and/or intra-only frames, we now:
      * Reset the frame context using setup_past_independence()
        (+ extra code if q-adapt-probs is enabled), as usual
      * Store this frame context into a special slot in cm->frame_contexts
      * Use that slot to fill in cm->pre_fc
      The main difference from before is that (for error-resilient/intra-only
      frames which are not key frames) we used to throw away the frame
      context after setting it up, and would re-use whatever was set up
      at the last keyframe.
      This was fine when q_adapt_probs is disabled, but it caused an
      inconsistency when combined with q_adapt_probs. See the attached
      bug report for more details on that.
      Change-Id: I9532b6b0e8ae29efbb4f059a0c67a73d7c7828ce
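The save/reload order described in this commit can be sketched roughly as follows. This is a minimal illustration, not the libaom code: the `Common` and `FrameContext` types, the slot count, the `FRAME_CONTEXT_DEFAULTS` index, and `reset_frame_context()` (a stand-in for `setup_past_independence()`) are all assumptions, and the "normal frame" branch is a simplified guess at the unchanged path.

```c
#include <assert.h>

#define FRAME_CONTEXTS 8                      /* assumed number of regular slots */
#define FRAME_CONTEXT_DEFAULTS FRAME_CONTEXTS /* hypothetical index of the special slot */

typedef struct {
  int probs[4]; /* stand-in for the real probability tables */
} FrameContext;

typedef struct {
  FrameContext fc;     /* context used while coding the current frame */
  FrameContext pre_fc; /* reference context for adaptation */
  FrameContext frame_contexts[FRAME_CONTEXTS + 1]; /* regular slots + special slot */
  int error_resilient_mode;
  int intra_only;
  int frame_context_idx; /* slot used by normal frames */
} Common;

/* Stand-in for setup_past_independence(): reset fc to default values. */
static void reset_frame_context(Common *cm) {
  for (int i = 0; i < 4; ++i) cm->fc.probs[i] = 128;
}

static void setup_frame_context(Common *cm) {
  if (cm->error_resilient_mode || cm->intra_only) {
    /* 1. Reset the frame context. */
    reset_frame_context(cm);
    /* 2. Store it in the special slot instead of discarding it. */
    cm->frame_contexts[FRAME_CONTEXT_DEFAULTS] = cm->fc;
    /* 3. Fill pre_fc from that slot. */
    cm->pre_fc = cm->frame_contexts[FRAME_CONTEXT_DEFAULTS];
  } else {
    /* Normal frames: assumed to load a previously saved slot, as before. */
    assert(cm->frame_context_idx >= 0 && cm->frame_context_idx < FRAME_CONTEXTS);
    cm->fc = cm->frame_contexts[cm->frame_context_idx];
    cm->pre_fc = cm->fc;
  }
}
```

The point of the sketch is only that `pre_fc` now always matches the context the frame actually starts from, rather than whatever was left over from the last key frame.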
    • daala_tx: New flattened 32-point Type-II DCT. · b9e16f2f
      Nathan E. Egge authored
      daala_tx@2017-12-07T22:33:52.954Z -> new_dct32@2017-12-07T22:34:37.310Z
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0073 | -0.0284 | -0.1499 |  -0.0561 | -0.0128 | -0.0279 |    -0.0386
      Change-Id: Ib01f21aa9fc3c95f9d820331b43e70410de99958
    • Constrain hash table access within tile boundary · 3de5353c
      Jingning Han authored
      Limit the prediction residual hash table access to the same
      tile. This resolves a data race across tiles in multi-threaded
      encoding that caused unstable encoding results.
      Change-Id: Ia4a415a0750bd60ee8ac4e56aa1cd39ec99e19c7
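One way to picture the tile constraint is a lookup that refuses to read records belonging to another tile. The sketch below is hypothetical throughout: the `TileBounds` type, the frame-wide record table indexed by block position, and the function names are illustrative assumptions, not the data structures used by the encoder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tile rectangle in block units; end coordinates are exclusive. */
typedef struct {
  int row_start, row_end;
  int col_start, col_end;
} TileBounds;

/* Returns 1 when (row, col) lies inside the tile, 0 otherwise. */
static int is_inside_tile(const TileBounds *tile, int row, int col) {
  return row >= tile->row_start && row < tile->row_end &&
         col >= tile->col_start && col < tile->col_end;
}

/*
 * Reads the record for block (row, col) from a frame-wide table with
 * `stride` entries per row, but only if the block belongs to `tile`.
 * Returns 1 and writes *record on success; returns 0 and leaves *record
 * untouched when the block is outside the tile.
 */
static int lookup_in_tile(const uint32_t *table, int stride,
                          const TileBounds *tile, int row, int col,
                          uint32_t *record) {
  assert(stride > 0);
  if (!is_inside_tile(tile, row, col)) return 0;
  *record = table[row * stride + col];
  return 1;
}
```

Because each thread only touches records inside its own tile, two tile threads never read or write the same entry, which is the property the commit relies on.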
  2. 07 Dec, 2017 14 commits
  3. 06 Dec, 2017 9 commits
    • Simplify warped motion parameter estimation · 763ccd8c
      Yunqing Wang authored
      The purpose of this change is to reduce the cycles needed for warped
      motion parameter estimation.
      Method 1:
      If we remove the 2-bit bit-depth reduction (as in patch set 2), the
      downshifting of A, Bx, By is also removed. The borg test result (over
      the baseline) is:
                   avg_psnr ovr_psnr  ssim
      lowres:      0.023     0.020    0.071
      cam_lowres: -0.009    -0.017   -0.031
      Method 2:
      In theory, the above change uses 2 more bits for elements of A, Bx,
      By. In patch set 3, we modified LS_STEP to be 8 (1 full pixel), and now
      the lowest 2 bits of the A, Bx, By elements are always 0. Namely, the
      2-bit bit-depth reduction is achieved without extra operations. The borg
      test result (over the baseline) is:
                   avg_psnr ovr_psnr  ssim
      lowres:     -0.004    -0.007   -0.023
      cam_lowres: -0.031    -0.033   -0.045
      This is a little better than the patch set 2 result.
      Method 2 is the final choice.
      Change-Id: I945aaba412e2ea86b7d67e8a90741fdf395b94cd
    • Remove redundant check on single ref for motion mode · 70539b10
      Zoe Liu authored
      Change-Id: Ia8321afd087f99371cdf07f3a03249580e09964d
    • JNT_COMP: Simplify logic on inter-inter comp modes · 5f11e915
      Zoe Liu authored
      This patch simplifies the checking criteria for the two groups of
      compound modes. It also makes the encoder side cdf update inside the
      RD loop consistent with that in the bitstream.
      Experimental results on Google test sets (30 frames of lowres and
      midres) confirm this patch obtains identical coding performance.
      Change-Id: I170eea91f7d2be2170df544cfc2c692b09aa82d6
    • Fix the comments on the precisions in quantization · 46ae3de1
      Yushin Cho authored
      Fix the comments on the precision of quantizers and tx coefficients
      during a quantization process for different input depth and tx size.
      I think the author really meant "de-quantized/de-coded coefficients" by
      "quantized/coded coefficients". So, made it clear to avoid any possible
      confusion.
      Change-Id: Ib92ac7dcfddcbe58cf3adfb9448497512381c1f5
    • Add another if case for convolve_2d_copy_sse2 · 85c29ddc
      Cheng Chen authored
      Load four 8-bit inputs and process them.
      Change-Id: I9b3ba58ea3a03c6a8129379afa37c54a57e04501
    • mvref_common.c: reduce scope of locals · 62cc5859
      Sebastien Alaiwan authored
      Also, make them const when appropriate.
      Change-Id: I96d544e2cc9a0bce4d52fd33e44a4eaa40edda3c
    • AVX2 implementation for highbd_convolve_2d · 70e7613a
      Maxym Dmytrychenko authored
      Can be more than 10% faster, with bit-exact results.
      Change-Id: I5f169673fd2d5af96f425f00d862f3c989228d2e
    • 16x64 and 64x16 transforms: Reuse scan order, eob · 030cea9b
      Urvang Joshi authored
      16x64 reuses scan order of 16x32
      64x16 reuses scan order of 32x16
      max eob is curtailed to 512 (instead of 1024) for both.
      Change-Id: Iac2145aa5e3d090009e2a2f5715caa8d84dfb2ee
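The reuse rule in this commit amounts to a small size mapping. The sketch below uses a hypothetical `TxDims` type and function names, and models only the two transform sizes the commit mentions; it shows why the cap is 512 (16 × 32 = 32 × 16 = 512) rather than 1024 (16 × 64).

```c
#include <assert.h>

/* Hypothetical width x height pair for a transform block. */
typedef struct {
  int width;
  int height;
} TxDims;

/* Dimensions of the scan order that a 16x64 or 64x16 transform reuses. */
static TxDims scan_order_dims(TxDims tx) {
  TxDims scan = tx;
  if (tx.width == 16 && tx.height == 64) {
    scan.height = 32; /* 16x64 reuses the 16x32 scan order */
  } else if (tx.width == 64 && tx.height == 16) {
    scan.width = 32; /* 64x16 reuses the 32x16 scan order */
  }
  return scan;
}

/* Maximum end-of-block position: the number of coefficients in the scan. */
static int max_eob(TxDims tx) {
  /* Only the two sizes from the commit are modeled in this sketch. */
  assert((tx.width == 16 && tx.height == 64) ||
         (tx.width == 64 && tx.height == 16));
  const TxDims scan = scan_order_dims(tx);
  return scan.width * scan.height;
}
```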
    • Simplify the code path when jnt-comp is off · 6fa05dcf
      Zoe Liu authored
      Change-Id: I17a82393f1b7230119f499e2f9ed8d0b8fe5ba25
  4. 05 Dec, 2017 13 commits
    • [CFL] Disable CfL for 4:1 and 1:4 Partitions · 4d6ea54e
      Luc Trudeau authored
      Moving CfL to using partition unit DC_PRED requires 4:1 and 1:4 DC_PRED,
      which are not currently implemented. A simple solution is to disable CfL
      for 4:1 and 1:4 partitions.
      CfL is also disabled for luma intra partitions < 4x4. This is inherent
      to luma intra prediction partition sizes. We add an assert to enforce
      this. This results in the following regression for Subset1:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      -0.0093 |  0.1803 |  0.1519 |  -0.0180 | 0.0256 |  0.0226 |     0.0352
      Change-Id: Ie2c8b4d9cb6b6f33a103b540209e1a2fb6df74a7
    • Allow txk_sel to turn off optimize_b in rd loop · daccae3c
      Angie Chiang authored
      This is for speeding up the testing process
      Change-Id: I90866fa239794f14e4801675d471dbf50b779d18
    • fh < fl --> fh <= fl in od_ec_encode_q15 · f8bf6bba
      Angie Chiang authored
      When lv_map_multi is on,
      od_ec_encode_q15 is able to handle the situation of fh == fl
      Change-Id: I7c837dda561f1d25b0203c018763dadd0cbbc75a
    • Convolve copy function for jnt_comp · 3afe49ed
      Cheng Chen authored
      Added a copy function (C version and SSE2 version) for full-pixel motion
      vectors for the jnt_comp experiment, following the existing av1_convolve_2d_copy.
      Change-Id: I20fd2219799f9c1451f591574fbe97364f40e0f0
    • Partially revert "nasm defaults to -Ox" · f38fccee
      Johann authored
      The -Ox check is still useful to avoid the version of nasm distributed
      with Apple Xcode.
      This reverts commit 29b0c186.
      Change-Id: I9237791802267da708c3be8e5a83ca8d71e74afc
    • Add macro to allow different tx sets for 16x16 · cec7ba10
      Sarah Parker authored
      This allows for the following options:
       Set 0:
              Inter: All 16 txfms
              Intra: Discrete Trig transforms w/o flip (4) + Identity (1) +
                     1D Hor/Ver DCT (2)
       Set 1:
              Inter: Discrete Trig transforms w/ flip (9) + Identity (1) +
                     1D Hor/Ver DCT (2)
              Intra: Discrete Trig transforms w/o flip (4) + Identity (1)
       Set 2:
              Inter: Discrete Trig transforms w/ flip (9) + Identity (1)
              Intra: Discrete Trig transforms w/o flip (4) + Identity (1)
      Results on lowres 40 frames with
      disable-ext-partition disable-ext-partition-types
      Set 0: 0.03%
      Set 1: No change
      Set 2: 0.06%
      Change-Id: Iec57d8c8fcfa0891528de4ca88f54753dfcb5284
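The three options differ only in how many transform types each block class may choose from. As a small worked tally, using the per-category counts given in the commit message, the sizes can be written as a table; the struct, array, and function names here are hypothetical and do not reflect the macro the commit adds.

```c
#include <assert.h>

/* Number of selectable transform types for 16x16 blocks in one option. */
typedef struct {
  int inter_types;
  int intra_types;
} TxSetCounts;

/* Counts copied term by term from the commit message. */
static const TxSetCounts kTxSetCounts16x16[3] = {
  /* Set 0: all 16 | trig w/o flip (4) + identity (1) + 1D DCT (2) */
  { 16, 4 + 1 + 2 },
  /* Set 1: trig w/ flip (9) + identity (1) + 1D DCT (2) | 4 + 1 */
  { 9 + 1 + 2, 4 + 1 },
  /* Set 2: trig w/ flip (9) + identity (1) | 4 + 1 */
  { 9 + 1, 4 + 1 },
};

/* Returns how many transform types are available for the given option. */
static int tx_types_16x16(int set, int is_inter) {
  assert(set >= 0 && set < 3);
  return is_inter ? kTxSetCounts16x16[set].inter_types
                  : kTxSetCounts16x16[set].intra_types;
}
```

So the inter choices shrink from 16 to 12 to 10 across the sets, while intra drops from 7 to 5.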
    • Enable encode/decode of OBU streams without IVF · 6c788834
      Cyril Concolato authored
      Change-Id: Ieed4ecce63a2a3b2a74c40ccddabe91cb9386632
    • Zero out half of 16x64 and 64x16 transforms · 60586676
      Debargha Mukherjee authored
      Constrain 16x64 transform so that the bottom 16x32 is zero;
      constrain 64x16 transform so that the right 32x16 is zero;
      Also implement 32x64 transform better to reduce intermediate
      coefficient range.
      Change-Id: Ia9050ee741ed1d5b02a42616635b496d637d932f
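To make the constrained regions concrete, the sketch below clears them in a row-major coefficient buffer. It assumes the WxH naming convention (16x64 is 16 wide and 64 tall) and a row stride equal to the width; the function name is hypothetical, and the real codec may enforce the constraint by simply not coding those coefficients rather than by writing zeros.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Zeroes the constrained half of a row-major coefficient block:
 *   16x64: the bottom 16x32 region (rows 32..63)
 *   64x16: the right 32x16 region (columns 32..63)
 */
static void zero_out_half(int32_t *coeff, int width, int height) {
  assert((width == 16 && height == 64) || (width == 64 && height == 16));
  if (height == 64) {
    for (int r = 32; r < 64; ++r) {
      for (int c = 0; c < 16; ++c) coeff[r * width + c] = 0;
    }
  } else {
    for (int r = 0; r < 16; ++r) {
      for (int c = 32; c < 64; ++c) coeff[r * width + c] = 0;
    }
  }
}
```

In both cases exactly 512 coefficients remain potentially nonzero.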
    • Change comp_group index context and save sending comp_group · 5a88172c
      Cheng Chen authored
      Extend context model for comp_group_idx.
      Save sending comp_group_idx when masked_compound is not allowed.
      Change-Id: Ia7ae53958c9e1c8fe07be4b14a425d9b8648082d
    • JNT_COMP: change COMPOUND_AVERAGE in cdf · 2ef24ea2
      Cheng Chen authored
      Remove COMPOUND_AVERAGE from compound_type_cdfs since it is now grouped
      to compound_idx. However, COMPOUND_AVERAGE is still used elsewhere.
      Change-Id: Ie0d460aabf9252e80eb4130cfef9aaf0efc3969d
    • JNT_COMP: divide compound modes into two groups · 33a13d9f
      Cheng Chen authored
      Divide compound inter prediction modes into two groups:
      Group A: jnt_comp, compound_average
      Group B: interintra, compound_segment, wedge
      Change-Id: I1142da2e3dfadf382d6b8183a87bde95119cf1b7
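The grouping can be expressed as a simple classification function. The enum and function names below are hypothetical, and assigning index 0 to Group A and 1 to Group B is an assumption made for illustration; only the membership of each group comes from the commit message.

```c
#include <assert.h>

/* Hypothetical labels for the five compound prediction modes named above. */
typedef enum {
  MODE_JNT_COMP,
  MODE_COMPOUND_AVERAGE,
  MODE_INTERINTRA,
  MODE_COMPOUND_SEGMENT,
  MODE_WEDGE
} CompoundMode;

typedef enum {
  COMP_GROUP_A = 0, /* assumed index for jnt_comp, compound_average */
  COMP_GROUP_B = 1  /* assumed index for interintra, compound_segment, wedge */
} CompoundGroup;

/* Maps a compound mode to its group. */
static CompoundGroup compound_group(CompoundMode mode) {
  switch (mode) {
    case MODE_JNT_COMP:
    case MODE_COMPOUND_AVERAGE: return COMP_GROUP_A;
    case MODE_INTERINTRA:
    case MODE_COMPOUND_SEGMENT:
    case MODE_WEDGE: return COMP_GROUP_B;
  }
  assert(0 && "unknown compound mode");
  return COMP_GROUP_A;
}
```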
    • daala_tx: Add SIMD version of the 16-point DCT · b0191d21
      Timothy B. Terriberry authored
      Change-Id: Ie3e599def556a90c474680567c4537508de2e30a
    • daala_tx: New flattened 4-point Type-IV asym DST. · dc857d1b
      Nathan E. Egge authored
      This 4-point Type-IV asymmetric DST uses the same computation graph as
      the 4-point Type-IV DST.
      This change improves the accuracy of the 8-point Type-II DCT:
      Old MSE: 1.8927096972341813413041010372151e-06
      New MSE: 1.7946367518072710517065436117146e-06
      new_dst4@2017-12-04T06:31:41.096Z -> new_dst4a@2017-12-04T06:32:22.698Z
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0143 |  0.0410 | -0.2166 |  -0.0556 | -0.0379 | -0.0461 |    -0.0002
      Change-Id: Ifde11fca987220130c1657306b0df34ec2f3fe25