1. 15 Nov, 2017 3 commits
    • Replace RECT_TX_EXT experiment · 35a4db38
      Debargha Mukherjee authored
      Remove the previous experiment and reuse its name for a simpler
      experiment that only enables 4:1 transforms for 4:1 partitions when
      ext_partition_types is on, plus what was previously enabled with
      the USE_RECT_TX_EXT macro.
      
      Change-Id: Iccc35744bd292abf3c187da6f23b787692d50296
    • Support Separate qmatrix for U and V planes · d467bba2
      Yaowu Xu authored
      This commit adds support for a separate qmatrix and iqmatrix for each
      of the U and V planes. Currently the separate matrices are initialized
      to the same values, so the commit does not have any coding impact
      yet.
      
      Change-Id: I5c4045fe1879db262c6ec1567d8d7dfd9a98fbed
    • Resolved build conflict by amvr · 088251e4
      RogerZhou authored
      Change-Id: Icd0a8c4a9cf47cb93e8329ed0bdc0f79787baaaf
  2. 14 Nov, 2017 19 commits
    • Temporarily turn off sse4_1 code for sgr · 256e1d23
      Debargha Mukherjee authored
      Disabled until a valgrind error coming from the sse4 code is fixed.
      This should resolve the valgrind error tracked in the bug below.
      
      BUG=aomedia:1021
      
      Change-Id: Ic461edb1da017d703a098bf5f9491fa51d0debcc
    • Move encoder-only code to av1/encoder · 95137bde
      Sebastien Alaiwan authored
      Change-Id: Ic4e16f30827e2e2e2dd140aee94d309b049dd063
    • Change mv projection to signed rounding · 11273449
      Zoe Liu authored
      The numerator in the mv projection can be negative, e.g. cur_to_bwd
      or cur_to_alt2, since either bwdref or altref2 can be a forward
      predictive reference, whereas the denominator always stays positive.
      The rounding inside mv projection hence should use signed operation.
      
      Change-Id: I42a105835754a002dd31fcfa7c845e4c105ec54f
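
      The rounding issue described above can be sketched as follows. This is a
      minimal illustration, not the code from the commit: the helper names
      div_round_signed and project_mv_component are hypothetical, and rounding
      halves away from zero is an assumption about what "signed rounding" means
      here.

      ```c
      #include <assert.h>

      /* Divide num by a positive den, rounding to nearest with halves away
       * from zero, so that -x and +x round to mirror-image results.
       * Adding den/2 unconditionally would bias negative values instead
       * (for example (-7 + 1) / 2 == -3, while (7 + 1) / 2 == 4). */
      static int div_round_signed(int num, int den) {
        assert(den > 0); /* the denominator always stays positive */
        const int half = den / 2;
        return num >= 0 ? (num + half) / den : -((-num + half) / den);
      }

      /* Hypothetical helper: scale one motion vector component by num/den,
       * where num may be negative (e.g. a forward predictive reference). */
      static int project_mv_component(int mv, int num, int den) {
        return div_round_signed(mv * num, den);
      }
      ```

      With this form, projecting by 3/4 and by -3/4 gives results that differ
      only in sign.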
    • Don't send chroma data in monochrome mode · dcb3cff5
      Rupert Swarbrick authored
      This is still a rather inefficient black+white encoder, since it carefully
      computes some chroma data, but just doesn't write it. However, at least the
      bitstream is now monochrome.
      
      Change-Id: Ie8a89bf329e7b41441032fb0d9e9011385bc12ff
    • intrabc: use its own mv cost table · dfcbfbd4
      Hui Su authored
      To facilitate using intrabc on interframes.
      
      Change-Id: Ibfe376190adf24d15198c5fb548e1050e191a3d6
    • Replace force*split with has_rows/has_cols in rd_pick_partition · 1c2dfae3
      Rupert Swarbrick authored
      I think the result is a little easier to reason about (you now talk
      about a property of the block, rather than the behaviour that should
      be enforced). It also matches the code in read_partition in
      decodeframe.c
      
      Change-Id: I13ba06b1504fa153b8b6b60fa14b373483639718
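
      As a rough illustration of describing a property of the block rather than
      a forced behaviour, the two flags could be computed along these lines. The
      function names, the parameters, and the use of mode-info units are
      assumptions for this sketch, not the code in rd_pick_partition or
      read_partition.

      ```c
      #include <assert.h>

      /* Nonzero if the bottom half of the block starts inside the frame.
       * mi_row: top row of the block; half_block_mi: half the block height;
       * frame_mi_rows: frame height. All values share the same units. */
      static int block_has_rows(int mi_row, int half_block_mi, int frame_mi_rows) {
        assert(half_block_mi > 0);
        return (mi_row + half_block_mi) < frame_mi_rows;
      }

      /* Nonzero if the right half of the block starts inside the frame. */
      static int block_has_cols(int mi_col, int half_block_mi, int frame_mi_cols) {
        assert(half_block_mi > 0);
        return (mi_col + half_block_mi) < frame_mi_cols;
      }
      ```

      A caller can then restrict the allowed partition types based on which of
      the two properties holds, instead of carrying separate "force split"
      flags.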
    • Remove nested #if !CONFIG_NEW_MULTISYMBOL lines · 3b48a6d4
      Rupert Swarbrick authored
      No change to the code, but these #if !CONFIG_NEW_MULTISYMBOL lines are
      all in the #else part of an #if CONFIG_NEW_MULTISYMBOL...
      
      Change-Id: Ibf11b1f0711113d9ee52927dcaf243d74e3f9d28
    • Save right # of lines in save_deblock_boundary_lines · 7a7fffef
      Rupert Swarbrick authored
      The "src_height" computed in save_deblock_boundary_lines didn't match
      the one in save_tile_row_boundary_lines, which meant that the wrapper
      function assumed the deblock code was saving some lines and that code
      thought that save_cdef_boundary_lines would do it.
      
      This patch fixes up the logic to match, and also completely gets rid
      of the lines_to_save variable (after all, bad things would happen if
      lines_to_save was 1 because we'll still read both boundary lines
      later)
      
      The tile height gets rounded up to a multiple of 8 luma pixels in
      save_tile_row_boundary_lines to avoid nasty corner cases. This will
      only have any effect for rows at the bottom of the frame (where
      av1_get_tile_rect clips to the frame boundary).
      
      BUG=aomedia:1020
      
      Change-Id: I55adb53fa8ba9c7f97fb2fd5b328a3f2f5065464
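
      The rounding of the tile height to a multiple of 8 luma pixels can be
      sketched with the usual power-of-two round-up. The function name is
      hypothetical and this is not taken from save_tile_row_boundary_lines.

      ```c
      #include <assert.h>

      /* Round a height in luma rows up to the next multiple of 8.
       * Heights that are already a multiple of 8 are returned unchanged. */
      static int round_up_to_multiple_of_8(int luma_rows) {
        assert(luma_rows >= 0);
        return (luma_rows + 7) & ~7;
      }
      ```

      Only heights that are not already a multiple of 8 change, which matches
      the note that the rounding only affects rows clipped at the bottom of the
      frame.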
    • WIP: lv_map_multi: make br multi symbol · e72a2091
      Ola Hugosson authored
      Replace the br_cdf and lps_cdf with a new 4-state symbol br_cdf.
      The br symbol indicates whether the level is k, k+1, k+2 or >k+2
      In the latter case, a new br symbol is read. Up to 4 br symbols are
      read which will reach level 14 at most. Levels greater than 14 are
      golomb coded.
      
      The adapted symbol count is reduced further by this commit.
      E.g. for the I-frame of ducks_take_off at cq=12, the number of adapted symbols
      is reduced from 4.27M to 3.85M. About 10% reduction.
      
      Gains seem about neutral on a limited subset.
      
      Change-Id: I294234dbd63fb0fa26aef297a371cba80bd67383
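
      The level coding described above can be sketched as a decomposition of a
      level into br symbols. This is an illustration only. It assumes the first
      br symbol starts at k = 3 (levels 0 to 2 being covered by a separate base
      symbol), which is consistent with four symbols reaching level 14. All
      names are hypothetical, and the Golomb remainder is assumed to be
      level - 15.

      ```c
      #include <assert.h>

      #define BR_BASE_LEVEL 3        /* assumed first level coded by a br symbol */
      #define BR_MAX_SYMBOLS 4       /* up to 4 br symbols are read */
      #define BR_LEVELS_PER_SYMBOL 3 /* each symbol resolves k, k+1, k+2 */

      /* Split level into br symbols (each in 0..3, where 3 means ">k+2").
       * Returns the number of symbols written. *golomb_rem is set to the
       * remainder to be Golomb coded, or -1 if no escape is needed. */
      static int split_level(int level, int symbols[BR_MAX_SYMBOLS],
                             int *golomb_rem) {
        assert(level >= BR_BASE_LEVEL);
        int k = BR_BASE_LEVEL;
        int n = 0;
        *golomb_rem = -1;
        while (n < BR_MAX_SYMBOLS) {
          const int delta = level - k;
          if (delta < BR_LEVELS_PER_SYMBOL) {
            symbols[n++] = delta; /* level is k, k+1 or k+2: done */
            return n;
          }
          symbols[n++] = BR_LEVELS_PER_SYMBOL; /* ">k+2": read another symbol */
          k += BR_LEVELS_PER_SYMBOL;
        }
        *golomb_rem = level - k; /* k is 15 here: levels above 14 escape */
        return n;
      }

      /* Convenience wrappers for inspecting a single level. */
      static int br_symbol_count(int level) {
        int symbols[BR_MAX_SYMBOLS];
        int rem;
        return split_level(level, symbols, &rem);
      }

      static int br_golomb_remainder(int level) {
        int symbols[BR_MAX_SYMBOLS];
        int rem;
        (void)split_level(level, symbols, &rem);
        return rem;
      }
      ```

      Under these assumptions, level 3 needs one symbol, level 14 needs all
      four, and level 15 is the first to produce a Golomb remainder.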
    • WIP: lv_map_multi: New experiment · 13892108
      Ola Hugosson authored
      This experiment modifies lv_map to make use of multi symbol.
      
      Replace the nz_map and coeff_base binary CDF with a new multi-symbol
      CDF of size 4. The new base_cdf indicates for each coeff if the level
      is 0, 1, 2 or >2. Two new special contexts are added to be used for the
      last coefficient (the EOB coeff). For the EOB coefficient we already know
      that it is non-zero. We use one context for DC EOB and one for AC EOB
      (this can potentially be refined more).
      
      The new symbol is read/written by special bitreader/bitwriter functions.
      Those functions reduce the probability precision from 15bit to 9bit before
      the invocation of the arithmetic coding engine.
      
      The adapted symbol count is significantly reduced by this experiment.
      E.g. for the I-frame of ducks_take_off at cq=12, the number of adapted symbols
      is reduced from 6.7M to 4.3M.
      
      Change-Id: Ifc3927d81ad044fb9b0733f1e54d713cb71a1572
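
      A small sketch of the three ideas in this message: the 4-way base symbol,
      the two special EOB contexts, and the precision reduction. Every name
      below is hypothetical. Treating scan position 0 as DC and reducing
      precision with a plain right shift are assumptions; the commit message
      does not specify rounding or clamping.

      ```c
      #include <assert.h>

      /* Base symbol: 0, 1, 2, or 3 meaning "level > 2". */
      static int base_symbol_for_level(int level) {
        assert(level >= 0);
        return level < 3 ? level : 3;
      }

      /* Hypothetical context ids for the last (EOB) coefficient. */
      enum { EOB_CTX_DC = 0, EOB_CTX_AC = 1 };

      /* One context when the EOB coefficient is DC, another for AC. */
      static int eob_context(int coeff_pos) {
        assert(coeff_pos >= 0);
        return coeff_pos == 0 ? EOB_CTX_DC : EOB_CTX_AC;
      }

      /* Reduce a 15-bit probability to 9 bits by dropping the 6 low bits
       * (assumed; real code may round or clamp differently). */
      static int reduce_prob_15_to_9(int p15) {
        assert(p15 >= 0 && p15 < (1 << 15));
        return p15 >> 6;
      }
      ```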
    • q_segmentation: disable delta_q encoding when enabled · da06779c
      Rostislav Pehlivanov authored
      The decoder side correctly disabled delta_q but the encoder didn't.
      
      Change-Id: I9f720c678d9e99d723c632095c058eaecd1a639d
    • q_segmentation: set seg->q_lvls to 0 when disabled · 30556193
      Rostislav Pehlivanov authored
      Otherwise the previous value ended up being used, creating a desync.
      
      Change-Id: I42d466474ce1a2567045720b8dfd413625f21cfa
    • Simplify Daala inverse TX toplevel for constant shift · 359854fe
      Monty Montgomery authored
      Rather than backing out all the LGT-related shifting matrices
      throughout the existing TX code, separate out and simplify Daala
      inverse TX into a single dedicated entry point.  When DAALA_TX is
      enabled, CONFIG_HIGHBITDEPTH is also forced, and all of Daala TX
      (lowbd and highbd) uses this single TX dispatch.
      
      This patch is purely non-functional changes.
      
      subset 1:
      monty-TXtesting-fwd-s1@2017-11-12T05:25:09.557Z ->
       monty-TXtesting-inv-s1@2017-11-12T05:25:43.878Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      objective-1-fast:
      monty-TXtesting-fwd-o1f@2017-11-12T05:25:29.386Z ->
       monty-TXtesting-inv-o1f@2017-11-12T05:25:58.897Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I790e8d7ac08eb214eb712f5441d6e5f76ebddf17
    • JNT_COMP: reduce context model number · c87b340e
      Cheng Chen authored
      Reduce the number of context models from 9 to 6.
      Contexts now fall into two kinds, depending on whether the two
      reference frames are at equal distance or not.
      Also, give the equal-distance case compound weights {9, 7}/16
      instead of {8, 8}/16.
      
      Reducing context model gives neutral performance.
      New compound weight provides -0.14% gain.
      
      Change-Id: I8a3f3021eac9e446ac826e5992f42931af4c8962
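
      For illustration, a weighted compound average with weights out of 16 can
      be written as below. This is a sketch, not the JNT_COMP implementation:
      the function name, the 8-bit pixel type, the round-to-nearest offset, and
      the choice of which reference receives the larger weight are all
      assumptions.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Blend two 8-bit predictions with integer weights that sum to 16. */
      static uint8_t blend_pixel(uint8_t p0, uint8_t p1, int w0, int w1) {
        assert(w0 >= 0 && w1 >= 0 && w0 + w1 == 16);
        return (uint8_t)((p0 * w0 + p1 * w1 + 8) >> 4); /* +8 rounds to nearest */
      }
      ```

      With {8, 8} both predictions contribute equally; with {9, 7} the first
      prediction is weighted slightly more.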
    • JNT_COMP: highbd simd and unit tests · cce312fb
      Cheng Chen authored
      Change-Id: I2c913198b7ad136cdf15d4af86b9b0b9e6850b72
    • Fix the logic for skipping in-loop filters in loopfilter_frame() · 27a4fb68
      Hui Su authored
      Change-Id: I976b4a684d6d309da6b1076627f7e1e058e72932
    • Support for 4:1 transforms with txmg · 845057f1
      Debargha Mukherjee authored
      Change-Id: I484121349af182f4f9525b1c992a6e77f6da7ea9
    • Add the option of using 1:4/4:1 tx_size+sb_type · 0797a208
      Yue Chen authored
      Change-Id: I96e5ff72caee8935efb7535afa3a534175bc425c
    • Revert "JNT_COMP: turn off for one_sided_compound" · b09e55cf
      Cheng Chen authored
      This reverts commit 060e192b.
      
      Change-Id: I5700d351a3cbb682ec49a0efb9cca4d0e83f9a3a
  3. 13 Nov, 2017 8 commits
  4. 12 Nov, 2017 2 commits
    • Fix some loop-restoration valgrind errors · 35bcd517
      Debargha Mukherjee authored
      Re-introduces a check that was removed in the refactor in 1a96c3f5.
      
      BUG=aomedia:1021
      BUG=aomedia:1022
      
      Change-Id: I548a30dba7586cf220b2f5a3f1fddf2b6b57e68d
    • Simplify Daala forward TX toplevel for constant shift · a2d40a39
      Monty Montgomery authored
      Rather than backing out all the LGT-related shifting matrices
      throughout the existing TX code, separate out and simplify Daala
      forward TX into a single dedicated entry point.  When DAALA_TX is
      enabled, CONFIG_HIGHBITDEPTH is also forced, and all of Daala TX
      (lowbd and highbd) uses this single TX dispatch.
      
      At present, this should result in no effective functional change;
      however, rectangular transforms are now always column-first, which
      has minor rounding effects.
      
      subset 1:
      monty-daalaTX-fulltest-DaalaRDO-s1@2017-11-07T00:02:56.282Z ->
       monty-daalaTX-fulltest-fwd-s1@2017-11-07T03:08:55.478Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0576 |     N/A | -0.2646 |  -0.0125 | -0.0439 | -0.0479 |    -0.1798
      
      objective 1 fast:
      monty-daalaTX-fulltest-DaalaRDO-o1f4@2017-11-07T05:59:50.180Z ->
       monty-daalaTX-fulltest-fwd-o1f4@2017-11-07T06:00:08.500Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      0.0036 |  0.0477 |  0.1132 |   0.0863 | -0.0017 |  0.0209 |     0.0240
      
      Change-Id: I182a5c4388c410cbea8810e2f9e36fd37a4a46e5
  5. 11 Nov, 2017 5 commits
    • Remove experimental flag of CDEF · 1aeee2e9
      Frederic Barbier authored
      This experiment has been adopted, so we can simplify the code
      by dropping the associated preprocessor conditionals.
      
      Change-Id: I17bd46ebad7796d04fb4065fb36da0e1c4eeaf9b
    • Add is_hbd field to TxfmParam · 26b8a99e
      Monty Montgomery authored
      In preparation for Daala unified LBD/HBD TX, add (and use) an is_hbd
      field in the TxfmParam structure.  This field indicates whether
      pixel data uses 8 or 16 bit reference buffers (currently ambiguous
      in the case of 8 bit input).
      
      Change-Id: I28bca792a48ffa00e208617adb072b08ff816e3c
    • Fix bitrot in LBD Daala inverse TX · df08def5
      Monty Montgomery authored
      Cleanup/optimizations of the low-bitdepth inverse TX path for AV1 TX
      broke Daala TX in several places; this patch cleans up the cleanup.
      
      Tested against the new Daala TX code that unified LBD/HBD; this patch
      restores bit-identical TX behavior.
      
      monty-daalaTX-invzerotest-LBD-s1-2@2017-11-10T08:46:01.822Z ->
        monty-daalaTX-invzerotest-test-s1@2017-11-09T05:09:05.483Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I58e4de4c71ec5251138ff7816f77777db6f869a3
    • Move all of LBD Daala TX to up-4, down-1 shift · 5500ce76
      Monty Montgomery authored
      Now that tran_low_t is assumed to be 32 bit when Daala TX is active,
      there's no reason for multi-stage shifting to fit coefficients into 16
      bits for the inter-transform transpose matrix. Go to a consistent up by
      four, down by one shifting scheme for all TX block sizes.
      
      (Note this is for the current AV1 coefficient scaling scheme with
      av1_get_tx_scale and deeper coefficients for higher bitdepth input.
      Daala TX is moving to the long-intended constant-coefficient-depth in
      upcoming patches).
      
      subset 1:
      monty-4-1-baseline-s1@2017-11-11T05:57:15.857Z ->
       monty-4-1-test-s1@2017-11-11T05:57:52.983Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      -0.0117 | -0.0246 |  0.0530 |   0.0238 | 0.0254 |  0.0447 |    -0.0442
      
      Change-Id: I2214e94ac822542c504d472276723277ed350abf
    • [CFL] basic early termination for alpha search · 9134586f
      David Michael Barr authored
      This causes no change in the encoder output.
      Comparing simple SSE-based RDO with the switch to
      txfm_rd_in_plane, the overhead is reduced by 23% ~ 50%.
      The total encode time increase is now 2.3% ~ 3.1%.
      
      Change-Id: I48c76216871f8ed68631815fd781697139305e94
  6. 10 Nov, 2017 3 commits