1. 20 Nov, 2017 13 commits
    • JNT_COMP: refactor if statements · 8263f80c
      Cheng Chen authored
      Refactor if statements that use frame_offset == -1 to indicate that
      jnt_comp is not chosen, as the distance can no longer be negative.
      Instead, add a variable use_jnt_comp_avg for the same functionality.
      
      Change-Id: Ie6b9c6ab36131b48bc9e066babada17046729cd8
    • Only use 1 above row and left column in warped reference MV · d3af66c7
      Yunqing Wang authored
      Multiple above rows and left columns are checked while generating reference
      MV candidate list for the current block. But, for warped ref_mv, we only
      generate a warped reference MV for the current block if a neighbor block uses
      WARPED_CAUSAL mode and is located in the immediate above row or left column.
      
      Change-Id: Ia9e9c2b7f97b61e0e4d2eeffd8d91e9e5f93d1a0
    • Move Daala TX to fixed coeff depth of 12 (Q4) · 358abfb7
      Monty Montgomery authored
      This patch activates all the preceding work, moving Daala TX to a
      greater, fixed coefficient depth (12).  This reclaims the regression
      caused by going to Q3.
      
      subset-1:
      monty-rest-of-stack-rmscale-s1@2017-11-13T14:40:20.646Z ->
       monty-rest-of-stack-Q4-s1@2017-11-13T14:40:44.807Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0745 |     N/A | -0.1040 |  -0.1017 | -0.0660 | -0.0522 |    -0.0806
      
      Change-Id: If2a0853b320d57c2fa3a66f919ceb2dc526d017f
    • Remove use of av1_get_tx_scale in Daala TX · 27d1b373
      Monty Montgomery authored
      Daala TX does not scale coefficients based on TX size.  Although
      previous patches force av1_get_tx_scale() to always return zero when
      CONFIG_DAALA_TX is true, this patch removes the call entirely.  This
      represents no functional change.
      
      subset-1:
      monty-rest-of-stack-Q3-s1@2017-11-13T14:39:52.160Z ->
       monty-rest-of-stack-rmscale-s1@2017-11-13T14:40:20.646Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |     N/A |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I5757282153c291c59510b17b5f71b3e0a56382ca
    • Enable configurable fixed-depth coefficients in Daala TX · 57f6bfd0
      Monty Montgomery authored
      This patch turns on the fixed-depth TX code in the Daala toplevel TX.
      
      A REGRESSION IS EXPECTED as this temporarily drops Daala TX back
      to Q3, which has lower operating precision than current master.
      
      subset-1:
      monty-rest-of-stack-RDO-s1@2017-11-13T14:39:17.093Z ->
       monty-rest-of-stack-Q3-s1@2017-11-13T14:39:52.160Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0927 |     N/A | -0.0001 |   0.1390 | 0.0871 |  0.0835 |     0.0826
      
      objective-1-fast --limit=4:
      monty-rest-of-stack-RDO-o1f4@2017-11-13T14:38:57.951Z ->
       monty-rest-of-stack-Q3-o1f4@2017-11-13T14:39:32.205Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0981 |  0.2341 |  0.4784 |   0.1215 | 0.0761 |  0.1144 |     0.1444
      
      Change-Id: Ibbe17226dd47980da632814422d6201c9fc6fa36
    • Modify RDO for use with Daala TX constant-depth coeffs · 4a05a58c
      Monty Montgomery authored
      Modify the portions of RDO using TX-domain coeff calculations to deal
      with TX_COEFF_DEPTH and constant-depth coefficient scaling.  At
      present, this represents no functional change.
      
      subset-1:
      monty-rest-of-stack-quant-s1@2017-11-13T14:38:43.774Z ->
       monty-rest-of-stack-RDO-s1@2017-11-13T14:39:17.093Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      objective-1-fast --limit=4:
      monty-rest-of-stack-quant-o1f4@2017-11-13T14:38:28.828Z ->
       monty-rest-of-stack-RDO-o1f4@2017-11-13T14:38:57.951Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I0fbc45e018f565f48e1fc8fdeabfcd6cb6fa62fe
    • [lv_map_multi] Base level alphabet adjustment · 3fe369c8
      Dake He authored
      At eob-1, the coefficient must be non-zero. As such, this CL changes the
      alphabet for base levels at eob-1 from size 4 to size 3. Minor
      performance improvement is observed. In addition, changes in 33462 made
      by Ola Hugosson were also incorporated.
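      The alphabet reduction described above can be sketched as follows. This is
      only an illustration under stated assumptions: the function names are
      hypothetical, and the base-level alphabet is assumed to be {0, 1, 2, 3+}
      in general and {1, 2, 3+} at eob-1, where a zero is impossible.

```c
#include <assert.h>

/* Hypothetical helper: number of base-level symbols at a position. */
static int base_level_alphabet_size_sketch(int is_last_coeff) {
  return is_last_coeff ? 3 : 4;
}

/* Hypothetical helper: map a coefficient magnitude to a base-level symbol.
 * Magnitudes of 3 or more share the top symbol. At eob-1 the magnitude is
 * known to be non-zero, so the symbols are shifted down by one. */
static int base_level_symbol_sketch(int level, int is_last_coeff) {
  assert(level >= 0);
  const int capped = level > 3 ? 3 : level;
  if (is_last_coeff) {
    assert(level >= 1); /* a zero cannot occur at eob-1 */
    return capped - 1;
  }
  return capped;
}
```

      Dropping the impossible zero symbol means no probability mass is spent on
      it, which is consistent with the minor gain reported.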
      
      Now with trained initial probability distributions.
      
      Change-Id: Id6b5d0908b5ff186ed88ab0733ce7cc0c4a468d5
    • loop-restoration: Remove duplicated function · abb3e4e9
      David Barker authored
      We currently have two implementations of the same function
      (aom_memset16() and memset16()), one of which is only defined inside
      restoration.c. Remove this duplicate, and use the globally defined
      version instead.
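      For context, a 16-bit memset of the kind being de-duplicated could look
      like the sketch below. The name memset16_sketch and its signature are
      assumptions for illustration, not the actual aom_memset16() definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: fill `length` 16-bit elements of `dest` with `val`. */
static uint16_t *memset16_sketch(uint16_t *dest, uint16_t val, size_t length) {
  assert(dest != NULL || length == 0);
  for (size_t i = 0; i < length; ++i) dest[i] = val;
  return dest;
}
```

      Keeping a single shared helper avoids two copies drifting apart.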
      
      Change-Id: I52740541f2e974f505728240127842397f6ef38d
    • New filter_intra implementation + entropy coding · da2eefc6
      Yue Chen authored
      Use 4x2 processing unit.
      Reduce # of modes from 6 to 5.
      
      Change-Id: I3c12e18084636de0e279c9102a8b212342faf4c7
    • JNT_COMP: clamp distance · 1ee07323
      Cheng Chen authored
      Let maximum distance be INT16_MAX - 1.
      
      To decide distance weights, use multiplication
      instead of a division.
      
      Google test shows +0.05% on lowres.
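      A minimal sketch of the two ideas in this message: clamping a distance to
      INT16_MAX - 1, and comparing a distance ratio against a threshold by
      cross-multiplication rather than division. All names are hypothetical,
      and the threshold form (num/den) is an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DIST_SKETCH (INT16_MAX - 1) /* clamp bound from the message */

/* Hypothetical helper: clamp a non-negative distance to the maximum. */
static int clamp_dist_sketch(int d) {
  assert(d >= 0);
  return d > MAX_DIST_SKETCH ? MAX_DIST_SKETCH : d;
}

/* Hypothetical helper: returns 1 when d0 / d1 >= num / den, evaluated as
 * d0 * den >= d1 * num so that no division is needed. The products are
 * computed in 64 bits to rule out overflow. */
static int ratio_at_least_sketch(int d0, int d1, int num, int den) {
  assert(d0 >= 0 && d1 > 0 && num >= 0 && den > 0);
  return (int64_t)d0 * den >= (int64_t)d1 * num;
}
```

      With both distances clamped, the products stay small and the comparison
      is exact, whereas integer division would truncate the ratio.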
      
      Change-Id: I2be25aec3c921773b0d776cf5ae00e3cd4cc27cd
    • [lv_map] fix template size to 5 · 0c9ea803
      Dake He authored
      Clean up context derivation for base levels by fixing the template size
      to 5. No change to bitstream.
      
      Change-Id: I496c74b386b8d8a4fb43c0b69add52a0a798a981
    • Add Daala TX fixed-coeff-depth capability to quantization · 60f2a229
      Monty Montgomery authored
      This patch completes the work to add fixed-depth TX domain support to
      the quantization and dequantization code.  At present, it is active but
      configured to behave identically to current AV1 master as RDO and TX
      have not yet been updated to also support this functionality.
      
      subset-1:
      monty-rest-of-stack-noshift-s1@2017-11-13T14:37:42.541Z ->
       monty-rest-of-stack-quant-s1@2017-11-13T14:38:43.774Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      objective-1-fast --limit=4:
      monty-rest-of-stack-noshift-o1f4@2017-11-13T14:37:16.992Z ->
       monty-rest-of-stack-quant-o1f4@2017-11-13T14:38:28.828Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I3773a1fc128136c9fea227f4b547576a8aa6efa3
    • Remove unused tx_size_implied count · 605d63f3
      Debargha Mukherjee authored
      Change-Id: Icca39f1d037a3aca4540e35b70fdfafeae2b094e
  2. 19 Nov, 2017 1 commit
  3. 18 Nov, 2017 2 commits
    • Add motion selection to ext_skip · f40a9577
      Zoe Liu authored
      A new block mode, referred to as skip_mode is added. If a block is
      coded as skip_mode, it will be inter-coded, with its references and
      motion vectors derived from its neighboring blocks with zero-residue.
      Otherwise, the block can be coded in the current intra or inter mode.
      
      The computational load of skip_mode evaluation at the encoder should
      be kept to a minimum. No transform size / type evaluations are needed.
      
      Change-Id: I5aef0159c7d5ecd64258510835903375d6c536d6
    • Avoid using multiplication in context fetch · d915e4ee
      Jingning Han authored
      Replace the multiplication with shifts.
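      As a rough illustration of this kind of change, the sketch below computes
      a buffer offset both ways. It assumes the stride is a power of two
      (1 << stride_log2); the function names are hypothetical and are not taken
      from the patch.

```c
#include <assert.h>

/* Hypothetical: offset of (row, col) using a multiplication. */
static int ctx_offset_mul_sketch(int row, int col, int stride_log2) {
  return row * (1 << stride_log2) + col;
}

/* Hypothetical: the same offset using a shift. This is only valid because
 * the stride is assumed to be a power of two and row is non-negative. */
static int ctx_offset_shift_sketch(int row, int col, int stride_log2) {
  assert(row >= 0 && stride_log2 >= 0 && stride_log2 < 16);
  return (row << stride_log2) + col;
}
```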
      
      Change-Id: I245efaddea2019d789179569e82e81bb7cb97715
  4. 17 Nov, 2017 9 commits
    • Reuse neighbor's warped motion parameters · 876a8b0b
      Yunqing Wang authored
      If a block's motion_mode is WARPED_CAUSAL and its mode is NEARESTMV, search
      its immediate above and left neighbors to get the set of neighbor blocks
      using WARPED_CAUSAL motion mode, pick the one with largest block size, and
      use that neighbor's warped motion parameters directly for the current block.
      If none of the neighbors uses WARPED_CAUSAL motion mode, we estimate the
      current block's warped motion parameters.
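      The neighbor selection described above can be sketched as below. The
      struct and function names are hypothetical, block size is compared by
      area as an assumption, and ties go to the first candidate found.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of an immediate above/left neighbor block. */
typedef struct {
  int is_warped_causal; /* 1 if the neighbor uses WARPED_CAUSAL */
  int width;
  int height;
} NeighborSketch;

/* Returns the index of the largest WARPED_CAUSAL neighbor, or -1 if none
 * qualifies, in which case the caller would estimate the parameters. */
static int pick_largest_warped_sketch(const NeighborSketch *nb, int count) {
  assert(nb != NULL || count == 0);
  int best = -1;
  int best_area = 0;
  for (int i = 0; i < count; ++i) {
    const int area = nb[i].width * nb[i].height;
    if (nb[i].is_warped_causal && area > best_area) {
      best = i;
      best_area = area;
    }
  }
  return best;
}
```

      A return value of -1 corresponds to the fallback case in which the
      current block's parameters are estimated as before.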
      
      Before this patch, warped motion parameters were estimated for every block.
      With this patch, we reduce the number of blocks doing parameter estimation.
      Results from testing on clips with camera motion:
                          WARPED_CAUSAL blocks   blocks reusing parameters
      station2_240p(30f):     3857                    1678
      netflix_arieal(30f):     692                     223
      
      No noticeable changes in coding gain. Borg test result showed a PSNR
      change of +0.006% on cam_lowres set, and -0.014% on lowres set.
      
      Change-Id: If12387ad0ca8a1996ea4c3f1bedcb269ebf78c6c
    • intrapred: Remove two local flags. · 49404055
      Urvang Joshi authored
      These used to be a combination of some config flags. Since those config
      flags have been removed, the local flags are now always 1.
      
      This simplifies the code a bit.
      
      Change-Id: Ifd3a94b6b786c95c3efc6d646dcf1489cdda7f92
    • Remove prob table entries and counters for new mv · 21b67229
      Hui Su authored
      Change-Id: Ifa2cdc2d2230dfa11396ee3e547653180f96b795
    • Make hbd transforms compatible with 4:1 transforms · 9eabd691
      Debargha Mukherjee authored
      Change-Id: I7123717b2d11bca826d650c6e6b6ae137476d541
    • Make forward 4:1 transforms txmg compliant · 69f914a8
      Debargha Mukherjee authored
      Change-Id: I9e55a9c9dd546e2e1d5e9c43e3e73fc44c3ba590
    • striped-loop-restoration: Use consistent frame height · 9cf9e28c
      David Barker authored
      The stripes are intended to extend down to the full decoded
      height of the frame, which is always a multiple of 8 luma pixels,
      in order to avoid some nasty edge cases.
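      Rounding a height up to a multiple of 8 luma pixels can be sketched as
      below. The helper name is hypothetical and is not the function used in
      the patch.

```c
#include <assert.h>

/* Hypothetical helper: round a positive height up to a multiple of 8. */
static int align_height_to_8_sketch(int height) {
  assert(height > 0);
  return (height + 7) & ~7;
}
```

      For example, a height of 1080 is already aligned, while 270 rounds up
      to 272.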
      
      This change was partially implemented in previous patches, but
      not every location was updated, leading to slightly inconsistent code.
      This patch finishes making the relevant changes, along with a
      slight bit of refactoring.
      
      Change-Id: Ibc8e2f5ace5415815625edbc224557a7c548c38a
    • lv_map_multi: add 2 more eob coeff contexts · d2352ecb
      Ola Hugosson authored
      The EOB coefficient cannot be 0 and for that reason it has special base_cdf contexts.
      Before this commit there were two contexts (DC and AC). This commit adds two additional
      contexts to separate the AC into 3 bands (i<=N/8, i<=N/4, i<=N/2).
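      One way to picture the resulting four contexts is the sketch below. The
      function name is hypothetical. It assumes i is the scan position of the
      EOB coefficient and N is the number of coefficients in the block, and it
      places every position beyond N/4 in the last band, which the message does
      not spell out.

```c
#include <assert.h>

/* Hypothetical mapping from EOB coefficient position to a base_cdf context:
 * 0 for DC, then three AC bands split at N/8 and N/4. */
static int eob_base_ctx_sketch(int i, int n) {
  assert(n >= 8 && i >= 0 && i < n);
  if (i == 0) return 0;      /* DC */
  if (i <= n / 8) return 1;  /* first AC band */
  if (i <= n / 4) return 2;  /* second AC band */
  return 3;                  /* remaining AC positions (assumption) */
}
```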
      
      Change-Id: If088b20fd891920b7ea7fc988d29bf6d86d93bfc
    • Set up txb coeff processing timer · 53c08960
      Jingning Han authored
      Allow the codec to time the average transform block coefficient
      processing for sw speed check.
      
      Change-Id: Ibdaf15ab5b7f1ea8264604cc00ef45e3ae3114c7
    • Add av1_get_br_level_counts_sse2() · ae7b2f3a
      Linfeng Zhang authored
      Change-Id: I6ce7aea19e3bdeef24d3fe66ac6eba7b8d585f9a
  5. 16 Nov, 2017 6 commits
    • lv_map_multi: revert accidental prob change · 52d6c895
      Ola Hugosson authored
      In e72a2091 one lps default probability was accidentally changed from 1 to 128 for
      non-LV_MAP_MULTI mode. This commit reverts that change and makes the change only for
      LV_MAP_MULTI mode. Also, rather than changing to 128, the probability is changed to 10.
      
      Change-Id: Ia8950379c46c59d40ea388fcd0621bbd78c26ede
    • Improve filter_intra throughput · 11bac017
      Yue Chen authored
      The prediction can be done in 2x2 or 4x4 processing units, within
      which there is no dependency and the computation can be fully
      parallelized.
      Also turn filter_intra on for blocks smaller than 8x8, and disable it in
      txbs larger than 32x32.
      
      Change-Id: I4f8a3104019cbb35e88f342d97516f81b19152b0
    • loop-restoration: Fix overflow in self-guided filter · 9c1f92ba
      David Barker authored
      A while ago, I calculated some bounds on the intermediate values inside
      the self-guided filter. These bounds turned out to be not quite correct
      in one particular instance (when we have a large region of max-value
      pixels).
      
      This caused a variable to overflow a uint32_t when decoding 12-bit
      streams in the reference decoder, and would force 8/10-bit-only
      hardware to use wider buffers than intended in order to match the
      reference code.
      
      Fortunately, this can be fixed quite easily, with minimal changes
      to the filter output. See comments within the patch for the exact
      details.
      
      Also re-instate a Wikipedia link which seems to have gone missing
      but which provided useful context for the derivation of the bounds.
      
      Change-Id: I83d4a277a37eff048af9989cccf19202fafb17b5
    • loop-restoration: Fix + refactor stripe boundary setup · 16ff7ef3
      David Barker authored
      * Set up and restore the correct number of left/right boundary
        pixels at vertical tile edges, and save them in the correct
        buffers.
        Also fix the restore process in high-bitdepth mode.
      
      * When loop filtering across tiles is enabled, we were previously
        acting inconsistently at horizontal tile borders: The stripe
        just above the boundary would use CDEF pixels from the tile below
        for context, while the stripe just below would use deblocked
        pixels from the stripe above.
      
        The intended design appears to have been to use CDEF pixels on
        both sides (so we logically have a 64-pixel-high stripe; it is just
        split into an 8-pixel-high and a 56-pixel-high stripe in order to keep
        the coefficient sets aligned to tiles).
      
        Implement that behaviour by disabling the context setup process
        when at a horizontal tile border.
      
      * Pull some common calculations out of
        {setup,restore}_processing_stripe_boundary and into their
        common caller. This allows us to reduce the number of arguments
        going into each function and their internal complexity.
      
      * Add more design comments around stripe boundary setup,
        as there are quite a lot of constraints to be aware of.
      
      Change-Id: Ic1586c149b7f764b9c1a711df3f11fb0f130b38a
    • Eliminate tx_size dependent shifts for Daala TX · a26262c3
      Monty Montgomery authored
      Short-circuit av1_get_tx_scale() to always return zero when
      CONFIG_DAALA_TX is true, and remove it from the actual Daala TX toplevel.
      
      This has potential overflow consequences for any metrics computation
      based on pixels; as such, also force use of the high-bitdepth path in
      each of these cases.
      
      subset-1:
      monty-rest-of-stack-baseline-s1@2017-11-13T00:39:03.881Z ->
      monty-rest-of-stack-noshift-s1@2017-11-13T14:37:42.541Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0030 | -0.0523 |  0.2656 |  -0.0239 | -0.0033 | -0.0029 |     0.0067
      
      objective-1-fast --limit=4:
      monty-rest-of-stack-baseline-o1f4@2017-11-13T00:37:06.999Z ->
      monty-rest-of-stack-noshift-o1f4@2017-11-13T14:37:16.992Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0264 |  0.2303 |  0.0822 |  -0.0109 | -0.0395 | -0.0709 |     0.0538
      
      Change-Id: I57da71861f105dc7a404fa75a75bde573855ef79
    • Modify lightfield encoding example · b041d8a7
      Yunqing Wang authored
      Modified the lightfield encoding example to accommodate HW implementation
      requirements. Fixed the encoding scheme, generated a bitstream of a list
      of references followed by the surrounding large scale tile coded frames.
      All large scale tile coded frames use the same uncompressed frame header
      and the same set of frame contexts. The example also writes out the frame
      header and frame contexts when encoding a large scale tile frame with
      EXT_TILE_DEBUG set to 1.
      
      Change-Id: I7cc19099195d0a20335d5c6bfb9f493f1bf3a7b2
  6. 15 Nov, 2017 9 commits