1. 20 Oct, 2017 22 commits
    • expand gtest cpu extension filter · c4ec0329
      Johann authored
      Look for OPT_ because this style is used to manually shard the intrapred
      test and its speed variant.
      
      Change-Id: Ic294148e76a1d152e65a4df0c024280fe93ab6c6
    • simd_cmp_impl.h: quiet visual studio warning · 04401474
      James Zern authored
      Disable "value of intrinsic immediate argument 'value' is out of range
      'lowerbound - upperbound'" warning. Visual Studio emits this warning though
      the parameters are conditionally checked in e.g., v256_shr_n_byte. Adding a
      mask doesn't always appear to be sufficient.
      
      previously:
      079acac1 Silence warnings in VS
      
      Change-Id: Ie51ca75b3816636336122fb9a9a9cf20fdf2486c
    • Reduce size of TX_SIZE for msvc · 25f9771e
      Yaowu Xu authored
      MSVC always uses int for enum types, which made TX_SIZE a
      4-byte type. This commit is a workaround for MSVC to reduce
      memory usage.
      
      Change-Id: I5383ca632ccef9951d87e678d505a0918eab1a76
    • Reduce the mfmv stack size in use · 09723813
      Jingning Han authored
      Change-Id: I43c3f337e2a648ec4ee17ceab0a8f6892924d3b2
    • Make more enum types packed · 812897db
      Yaowu Xu authored
      This helps compilers (gcc/clang) use smaller integer types.
      
      Change-Id: I5ee6bda0a76468daca916c8b9120d9e7e78ade8e
    • Add Daala TX to 16x32 and 32x16 transforms · ad396850
      Monty Montgomery authored
      Rectangular 16x32 and 32x16 will now use Daala TX when CONFIG_DAALA_TX16 and
      CONFIG_DAALA_TX32 are both enabled.
      
      Change-Id: Iab3737605fa10dc09ceab18856a26165c502e6e5
    • Add Daala TX to 8x16 and 16x8 transforms · 7eb4454b
      Monty Montgomery authored
      Rectangular 8x16 and 16x8 will now use Daala TX when CONFIG_DAALA_TX8 and
      CONFIG_DAALA_TX16 are both enabled.
      
      Change-Id: I777d5433addb8ffd4a99f7e021768d4f8651008f
    • Add Daala TX to 4x8 and 8x4 transforms · abd94510
      Monty Montgomery authored
      Rectangular 4x8 and 8x4 will now use Daala TX when CONFIG_DAALA_TX4 and
      CONFIG_DAALA_TX8 are both enabled.
      
      Change-Id: I56659c3e98e4bbd5bd3591404f9ff72120b33d6f
    • Don't reject invalid warped motion model · 1eac584f
      Sebastien Alaiwan authored
      Fall back to default warp params instead.
      The extra assignment to DEFAULT_WMTYPE prevents an assertion.
      
      Change-Id: If21a46cbb4cc9761e5c94bd2fcbc3a06342d677d
    • Fix a bug in the DAALA_TX 16-point DST functions. · 69a16433
      Nathan E. Egge authored
      The OD_FDST_16() and OD_IDST_16() macros were written for use in the
      OD_FDCT_32_ASYM macro, which took asymmetrically scaled input and,
      after running an asymmetric butterfly step, passed it through to
      the 16-point Type-II DCT and 16-point Type-IV DST.
      Because the DST implementations were never tested as standalone
      transforms, some of the signs from the butterfly step ended up inside
      the DST macros.
      These extra operations will be addressed in a follow-up patch.
      
      Change-Id: I32f54a4bb70cd8fad4ae5646cfa4f5b14a0f969b
    • mfmv projection constraint · 7f537b85
      Jingning Han authored
      Apply constraint of 64x(64 + 2 x 64) referencing region on the
      reference frames.
      
      Change-Id: I4aa2b47082b85fc9e03ca6f5f489cd80a337c218
    • Revert "Change test video size to 1080p" · 69ac7bf7
      Yaowu Xu authored
      This reverts commit bc2e897d.
      
      The reduction in image size is not helpful enough to reduce the large
      memory allocations. 
      
      
      Change-Id: Iac005c50ba33bbc8f2e75f9dacc0b1dfccf7177b
    • Lowbd D207E/D63E/D45E intrapred x86 optimization · ae676953
      Yi Luo authored
      D207E
      Predictor  SSE2 vs C
      4x4        ~2.6X
      4x8        ~2.5X
      8x4        ~8.0X
      8x8        ~9.1X
      8x16       ~11.7X
      16x8       ~16.9X
      16x16      ~17.3X
      16x32      ~17.2X
      32x16      ~30.2X
      32x32      ~35.5X
      
      D63E
      Predictor  SSE2 vs C
      4x4        ~4.7X
      4x8        ~4.9X
      8x4        ~7.8X
      8x8        ~8.9X
      8x16       ~9.3X
      16x8       ~15.7X
      16x16      ~14.7X
      16x32      ~17.3X
      32x16      ~18.0X
      32x32      ~15.7X
      
      D45E
      Predictor  SSSE3 vs C
      4x4        ~1.8X
      4x8        ~2.9X
      8x4        ~6.7X
      8x8        ~6.5X
      8x16       ~7.4X
      16x8       ~24.4X
      16x16      ~21.5X
      16x32      ~24.2X
      32x16      ~25.4X
      32x32      ~25.2X
      
      Change-Id: I8215de190e2b6314272749761600e389d1ca0fdf
    • Remove compiler warnings in weekly test · 08ee5c86
      Yi Luo authored
      Change-Id: I5873d6caa8304fdc1b5fc668b05204f5e5fb73c1
    • Conditionally drop projection from the alt2 frame · 49705df3
      Jingning Han authored
      Change-Id: I3ac81379669678254672f125df281980b687e16e
    • Reduce run time of CodingPathSync test · dd4f0fa5
      Sebastien Alaiwan authored
      Move longer tests to nightly
      
      Change-Id: Ic78c8c582ce33b11f13265b085e3c6cc828107d9
    • Fix some var-tx mismatches when rect-tx is off · 4def76a5
      Debargha Mukherjee authored
      Change-Id: I91efb93cade65469a2c4e922253b599270b45406
    • Squash type conversion warnings in MSVC · 29373eef
      Yaowu Xu authored
      Change-Id: I457edba98dd1ebbd212651247d6c0d1a34f780d6
    • Fix assert failure with tx64x64 and ext-partition · 1fa24467
      Debargha Mukherjee authored
      Fixes an assert failure when 32x64 and 64x32 partitions are used
      with tx64x64 when ext-partition is also on.
      
      BUG=aomedia:935
      
      Change-Id: I3bf50aeab58ec4ba2c9892e6eb18cf60e425fa42
    • Make alignment declaration consistent · 9511a433
      Yaowu Xu authored
      Fix msvc 2013 warnings.
      
      Change-Id: I48616d2568c3c1c6d40dd4fbf01f8720495972ee
      (cherry picked from commit e21b840727b6c56bb35cba43229187fd4509bbef)
    • Remove CONFIG_CB4X4 config flags · 6ea917ec
      Debargha Mukherjee authored
      Since CB4X4 is adopted and without it the codec does not work,
      it is better to remove it and simplify the code.
      
      Change-Id: I51019312846928069727967e3b2bbb60f0fba80d
    • Enable switchable restoration for chroma · a3d4fe50
      Debargha Mukherjee authored
      Change-Id: I78a8a1749cd4449c61a106f413c697e4a2df0475
  2. 19 Oct, 2017 18 commits
    • Soft enable loopfilter_level · 9ac7a0f3
      Cheng Chen authored
      Enable it as it is adopted.
      Fix some compile warnings and compatibility issues.
      
      Change-Id: If324e749e27ffa42f69a19ad5ebb39bc493b33ec
    • Support backward motion vector projection from alt2 · 28031907
      Jingning Han authored
      Support backward projection of the motion vectors from the
      ALTREF2 reference frame to build the motion field.
      
      Change-Id: I81d41c3ea71c14e6d8932f4e106c34976696b74d
    • Refactor motion field projection process · 3bd1bc23
      Jingning Han authored
      Abstract the operation for backward projection.
      
      Change-Id: If458cfe8d2f152227565e8b58c864fd2e7824b43
    • Rename DAALA_DCTx experiments to DAALA_TXx. · e554f36c
      Nathan E. Egge authored
      Change-Id: I8fa0a67d7a198b8b24837ffc352acf77f390cffe
    • loopfilter-level: Fix some inconsistencies · cce013cd
      David Barker authored
      * Fix a case where we would calculate the Y horizontal filter strength
        as the sum of the base Y *vertical* strength and the
        per-segment delta Y horizontal strength.
      
      * When using delta_lf_multi, adapt the corresponding CDFs between frames
      
      * Correct values in seg_feature_data_{signed,max}
      
      Change-Id: I1976d2024e9e16fe73258cf41d56aafe8a830957
    • Fix interaction of loopfilter-level + obu · 3dffa270
      David Barker authored
      When obu is enabled, we should only apply loop filtering after
      the frame is fully decoded. This was not working correctly with
      the combination of loopfilter-level + obu; move an 'if' condition
      around in order to fix this.
      
      Change-Id: I0f06d81663ea1d91f4e4b251b1eaf4bda70a8770
    • remove references to obsolete test files · 472cd036
      Johann authored
      These files have been removed from test/test-data.mk. When they are not
      found, new versions are encoded. However, if they happen to exist, the
      tests fail due to the unsupported bitstreams.
      
      Change-Id: Ib643a5a22608b20df2b8f520691ba213f07837e1
    • Coding path sync test: force quantizer · 403cd01f
      Sebastien Alaiwan authored
      Change-Id: I5ad29fe45dcb83cec15ca7295bc2116fccbd8d06
    • Fix mem corruption due to undersized token buffer · 0a86a7d2
      Sebastien Alaiwan authored
      Take a margin of 8 tokens.
      
      BUG=aomedia:647
      
      Change-Id: I04638a73deee334aa1f083f67c602c8a18cb951c
    • Comment/refactor striped loop restoration save/restore functions · 9af0cf3c
      Rupert Swarbrick authored
      This shouldn't change the behaviour at all, but I think the resulting
      code is slightly easier to read and follow. I've also added copious
      comments to setup_processing_stripe_boundary to explain exactly what
      the code is doing.
      
      Change-Id: I68adf2d0455b7d87aa04d7e6daa43f4d730c6f80
    • Disable residue hash feature on cross-border blocks · 25dc0701
      Yue Chen authored
      Disable this feature unless the entire block is within the frame.
      The reason is that rd decisions in mbmi, e.g. inter_tx_block[][], made
      for blocks partially outside the border can be partly nonsense and
      therefore cannot be reused by blocks at other locations.
      
      It caused an infinite loop when encoding a clip with repetitive
      patterns. A cross-border block has an invalid big tx stored
      in inter_tx_block[0][1], and another block (same residue, within
      frame) reused this mbmi, so the encoder never reaches the
      termination condition when tx blocks are being recursively
      partitioned.
      
      BUG=aomedia:913
      
      Change-Id: Id25a1dbc4a68b5136f6bdf9f6b5811b7ec6920b0
    • deblock-13tap: Don't use 8-tap filter for chroma plane · b1d1f2ca
      David Barker authored
      For the chroma plane, the selected filter length was based on
      transform size as follows:
      TX_4X4 -> filter_length = 4
      TX_8X8 -> filter_length = 8
      TX_16X16 -> filter_length = 6 (this is intentionally *not* 16)
      
      This seems inconsistent; presumably the intention is to use the
      6-tap filter for TX_8X8 as well. This patch makes the appropriate
      change to make that be the case.
      
      Change-Id: I7f53d1dce4f16144bcf0c20131527b5193311603
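
The before/after mapping described in this commit can be written down as a small sketch. The enum and function names are hypothetical and chosen for illustration only; only the transform-size to filter-length pairs come from the message above.

```c
#include <assert.h>

/* Hypothetical transform-size enum for illustration only. */
typedef enum { DEMO_TX_4X4, DEMO_TX_8X8, DEMO_TX_16X16 } DemoChromaTx;

/* Mapping as described before the patch: 4, 8, 6. */
static inline int demo_chroma_filter_length_before(DemoChromaTx tx) {
  switch (tx) {
    case DEMO_TX_4X4: return 4;
    case DEMO_TX_8X8: return 8;
    case DEMO_TX_16X16: return 6;
  }
  assert(0 && "unexpected transform size");
  return 0;
}

/* Mapping as described after the patch: TX_8X8 also uses the 6-tap filter. */
static inline int demo_chroma_filter_length_after(DemoChromaTx tx) {
  switch (tx) {
    case DEMO_TX_4X4: return 4;
    case DEMO_TX_8X8: return 6;
    case DEMO_TX_16X16: return 6;
  }
  assert(0 && "unexpected transform size");
  return 0;
}
```

The only entry that differs between the two functions is TX_8X8, which is the inconsistency the commit removes.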
    • General tidy-ups in loop restoration code · d3d0615e
      Rupert Swarbrick authored
      This refactors the iteration in restoration.c so that all the scary
      stuff lies in a pair of general functions, filter_frame and
      filter_rest_unit.
      
      filter_frame is currently very simple, iterating over the restoration
      units in the frame. Once we've made it so that restoration units don't
      span tile boundaries, this function is the one we'll need to update to
      iterate over tiles and then restoration units within the tile.
      
      filter_rest_unit replaces the outer loop of the loop_*_filter_tile*
      functions. It deals with chopping the restoration unit into stripes of
      height procunit_height. When CONFIG_STRIPED_LOOP_RESTORATION is true,
      it also deals with calling setup_processing_stripe_boundary and
      restore_processing_stripe_boundary to use boundary data from the
      deblocked output.
      
      Some of the ugly #if/#endif blocks have been elided in the wiener
      filter code (both low and high bit depth), by defining a convolve
      alias based on USE_WIENER_HIGH_INTERMEDIATE_PRECISION.
      
      There are also changes to extend const-ness for the source frame. I've
      adopted the convention that the frame input is called "data" (as it
      was before) while it's non-const. This is true as far as
      filter_rest_unit. Then each "process one stripe" function takes a
      const pointer to the source frame, at which point it's called "src".
      
      The intention is that, once filter_rest_unit no longer needs a
      RestorationInternal pointer, this function can be exposed in
      restoration.h and can be used by pickrst.c
      
      Change-Id: I18043a172ef0ca1154d87cf7f63e3a80944627cd
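
As a rough sketch of the stripe chopping that filter_rest_unit is described as doing, the loop below walks a restoration unit in stripes of procunit_height rows and crops the last one. The function and callback names are hypothetical, and the sketch deliberately ignores any stripe offset or boundary save/restore that the real code may apply.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-stripe callback: first row and height of the stripe. */
typedef void (*demo_stripe_fn)(int y0, int height, void *ctx);

/* Example callback that accumulates the processed row count into *ctx. */
static inline void demo_sum_heights(int y0, int height, void *ctx) {
  (void)y0;
  *(int *)ctx += height;
}

/* Visits a unit of unit_height rows starting at unit_y0 in stripes of at
 * most procunit_height rows. Returns the number of stripes visited. */
static inline int demo_for_each_stripe(int unit_y0, int unit_height,
                                       int procunit_height,
                                       demo_stripe_fn fn, void *ctx) {
  assert(unit_height >= 0 && procunit_height > 0);
  int count = 0;
  for (int y = 0; y < unit_height; y += procunit_height) {
    const int remaining = unit_height - y;
    /* The final stripe is cropped to whatever rows are left. */
    const int h = remaining < procunit_height ? remaining : procunit_height;
    if (fn != NULL) fn(unit_y0 + y, h, ctx);
    ++count;
  }
  return count;
}
```

For a 150-row unit and 64-row stripes this visits three stripes of 64, 64 and 22 rows.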
    • [CFL] Fix negative rounding issue in alpha dist · a45b104a
      Luc Trudeau authored
      get_scaled_luma_q0(-alpha_q3, pred_buf_q3[i]) is NOT equivalent to
      -get_scaled_luma_q0(alpha_q3, pred_buf_q3[i]). When the product
      alpha_q3*pred_buf_q3[i] is an odd multiple of 32 (0.5 in Q6), then the right
      shift will round both positive and negative values towards positive infinity,
      creating a bias. So, e.g., get_scaled_luma_q0(-4, 8) will yield 0, but
      -get_scaled_luma_q0(4, 8) will yield -1.
      
      Results on Subset1 (compared to parent with CfL enabled)
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      https://arewecompressedyet.com/?job=cfl-no-round-fix%402017-10-07T11%3A50%3A47.711Z&job=cfl-round-fix%402017-10-07T02%3A15%3A51.359Z
      
      Change-Id: I8a7900c32fbd7213f1ed4e09c3626c063800e186
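
The asymmetry described in this commit can be reproduced with a minimal sketch. It assumes the scaling is a Q6 product rounded with "add 32, shift right by 6", which matches the numbers quoted above; the function names are hypothetical, and the symmetric variant is only one possible way to remove the bias, not necessarily the fix that was committed.

```c
#include <assert.h>

/* Assumed model of the described behaviour: add half (32), then shift.
 * Note: right-shifting a negative int is implementation-defined in C;
 * mainstream compilers use an arithmetic shift. */
static inline int demo_scaled_luma_q0_biased(int alpha_q3, int pred_buf_q3) {
  const int scaled_luma_q6 = alpha_q3 * pred_buf_q3;
  return (scaled_luma_q6 + 32) >> 6;
}

/* Sign-symmetric alternative: round the magnitude, then restore the sign,
 * so that f(-a, p) == -f(a, p) for every input. */
static inline int demo_scaled_luma_q0_symmetric(int alpha_q3,
                                                int pred_buf_q3) {
  const int scaled_luma_q6 = alpha_q3 * pred_buf_q3;
  const int magnitude = scaled_luma_q6 < 0 ? -scaled_luma_q6 : scaled_luma_q6;
  const int rounded = (magnitude + 32) >> 6;
  return scaled_luma_q6 < 0 ? -rounded : rounded;
}

/* Checks the example from the commit message: 4 * 8 = 32, i.e. 0.5 in Q6. */
static inline void demo_check_rounding(void) {
  assert(demo_scaled_luma_q0_biased(-4, 8) == 0);
  assert(-demo_scaled_luma_q0_biased(4, 8) == -1);
  assert(demo_scaled_luma_q0_symmetric(-4, 8) ==
         -demo_scaled_luma_q0_symmetric(4, 8));
}
```

The biased version maps both +0.5 and -0.5 upwards (to 1 and 0), while the symmetric version maps them to 1 and -1.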
    • Make the CDEF RDO handle 4:2:2 properly · ef1b74c7
      Steinar Midtskogen authored
      This fixes an assert:
      
      av1/common/cdef_block.c:561: cdef_filter_fb: Assertion `bsize ==
      BLOCK_8X8 || bsize == BLOCK_4X4' failed.
      
      The RDO simply assigned a strength of 0 in the 4:2:2 case and called
      cdef_filter_fb(), but cdef_filter_fb() will complain about 4:2:2 even
      if the strength is 0.
      
      The fix assigns a chroma mse of 0 when the subsampling is
      different for x and y rather than calling the filter.  This is also
      faster.  The mse isn't really 0, but calculating the actual chroma mse
      doesn't change the result.
      
      BUG=aomedia:881
      
      Change-Id: I6154e21ddcca30e51baf805684dace10459c3350
    • Remove partial_frame support from loop restoration · 146a060a
      Rupert Swarbrick authored
      This flag comes from the loop filter's speed features and (I think)
      tells the encoder to make decisions about the filter by looking at a
      narrow strip in the middle of the frame.
      
      That's reasonable enough, but doesn't make any sense for loop
      restoration, where we were calling av1_loop_restoration_frame from
      pickrst.c in order to calculate what restoration parameters to use for
      a given restoration unit (which might not be in the narrow strip in
      the middle!)
      
      As it turns out, the LPF_PICK_FROM_SUBIMAGE method is never actually
      signalled in the reference encoder, which is presumably why we haven't
      spotted this before.
      
      Change-Id: I745e2eab873c0b33920caca40e338af9d078d25e
    • Remove RestorationInternal from AV1_COMMON · f88bc049
      Rupert Swarbrick authored
      The bits needed by striped loop restoration are now in
      RestorationInfo (which also gets rid of a rather ugly extra
      index).
      
      The scratch buffer that's used for self-guided restoration has been
      moved up to its own variable (rst_tmpbuf).
      
      All the rest of the fields are now safely hidden inside restoration.c
      
      This patch also does a big cleanup of the initialisation code in
      loop_restoration_rows: it doesn't need to be as repetitive now that
      the fields of YV12_BUFFER_CONFIG can be accessed by plane index.
      
      Change-Id: Iba7edc0f94041fa053cdeb3d6cf35d84a05dbfaf
    • Don't compute rtile width/height in av1_get_rest_ntiles · 64b8bbdf
      Rupert Swarbrick authored
      Restoration units are a fixed square size (in cm->rst_info[plane]) for
      almost the entire image. The only special case is for tiles at the
      right hand edge or the bottom row, which might expand or be cropped.
      
      The av1_get_rest_ntiles function was implementing the cropping
      behaviour when the image happened to be less than one restoration unit
      wide or high (but not the expansion behaviour), but the result was
      never useful: if you want to get the size of a restoration tile in
      order to divide by it to work out what tile you're on, the fixed
      square size is what you want. If you need to know how big this
      particular tile is, call av1_get_rest_tile_limits.
      
      As well as removing the output arguments from
      av1_get_rest_ntiles, this patch also removes the tile_width and
      tile_height fields from the RestorationInternal structure. Note that
      the tile size which is what you actually need is accessible as
      rst->rsi->restoration_tilesize. (In practice, these were almost always
      the same anyway).
      
      This patch also has a couple of other small cleanups. Firstly, it
      moves the subsampling_y field out of
      CONFIG_STRIPED_LOOP_RESTORATION. It's not actually needed when you're
      not doing striped loop restoration, but this gets rid of lots of
      horrible #if/#endif lines at callsites for av1_get_rest_tile_limits.
      
      Secondly, it simplifies the code in init_rest_search_ctxt (and fixes
      some tautologous assertions). Now that YV12_BUFFER_CONFIG has a more
      uniform layout, there's a simpler way to set things up, so we use
      that.
      
      Change-Id: I3c32d8ea0abe119dc86b9efa7564b27dde2151dc
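
The point about dividing by the fixed square size can be sketched as follows. The helper name and its parameters are hypothetical; the sketch assumes the caller already knows how many restoration units exist along the dimension, and it clamps the index so that pixels in an expanded last unit still map to that unit.

```c
#include <assert.h>

/* Maps a pixel coordinate to a restoration unit index along one dimension.
 * unit_size is the fixed square size; n_units is the unit count along this
 * dimension (assumed to be supplied by the caller). */
static inline int demo_rest_unit_index(int pos, int unit_size, int n_units) {
  assert(pos >= 0 && unit_size > 0 && n_units > 0);
  const int idx = pos / unit_size;
  /* The last unit may be expanded, so positions past the nominal grid
   * still belong to it. */
  return idx < n_units ? idx : n_units - 1;
}
```

The actual size of a particular unit (cropped or expanded) is a separate question, which the message above delegates to av1_get_rest_tile_limits.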