1. 25 Oct, 2017 1 commit
    • Rupert Swarbrick's avatar
      Define av1_foreach_rest_unit_in_frame · 33ed9e69
      Rupert Swarbrick authored
      This is the last stage in a quest to move all knowledge of the layout
      of restoration units across the frame into restoration.c. Now this is
      done, we can change how they are laid out (to split them properly at
      tile boundaries) without having to change code in any other file.
      
      Change-Id: Id5108d787d342f5070580d0e34d84b5ddcc53a86
      33ed9e69
  2. 24 Oct, 2017 1 commit
    • Rupert Swarbrick's avatar
      Expose av1_loop_restoration_filter_unit in restoration.h · dd6f09ab
      Rupert Swarbrick authored
      This patch also does a certain amount of rejigging for loop
      restoration coefficients, grouping the information for a given
      restoration unit into a structure called RestorationUnitInfo. The end
      result is to completely dispense with the RestorationInternal
      structure.
      
      The copy_tile functions in restoration.c, together with those
      functions that operate on a single stripe, have been changed so that
      they take pointers to the top-left corner of the area on which they
      should work, together with a width and height.
      
      The same isn't true of av1_loop_restoration_filter_unit, which still
      takes pointers to the top-left of the tile. This is because you
      actually need the absolute position in the tile in order to do striped
      loop restoration properly.
      
      Change-Id: I768c182cd15c9b2d6cfabb5ffca697cd2a3ff9e1
      dd6f09ab
  3. 19 Oct, 2017 5 commits
    • Rupert Swarbrick's avatar
      Comment/refactor striped loop restoration save/restore functions · 9af0cf3c
      Rupert Swarbrick authored
      This shouldn't change the behaviour at all, but I think the resulting
      code is slightly easier to read and follow. I've also added copious
      comments to setup_processing_stripe_boundary to explain exactly what
      the code is doing.
      
      Change-Id: I68adf2d0455b7d87aa04d7e6daa43f4d730c6f80
      9af0cf3c
    • Rupert Swarbrick's avatar
      General tidy-ups in loop restoration code · d3d0615e
      Rupert Swarbrick authored
      This refactors the iteration in restoration.c so that all the scary
      stuff lies in a pair of general functions, filter_frame and
      filter_rest_unit.
      
      filter_frame is currently very simple, iterating over the restoration
      units in the frame. Once we've made it so that restoration units don't
      span tile boundaries, this function is the one we'll need to update to
      iterate over tiles and then restoration units within the tile.
      
      filter_rest_unit replaces the outer loop of the loop_*_filter_tile*
      functions. It deals with chopping the restoration unit into stripes of
      height procunit_height. When CONFIG_STRIPED_LOOP_RESTORATION is true,
      it also deals with calling setup_processing_stripe_boundary and
      restore_processing_stripe_boundary to use boundary data from the
      deblocked output.
      
      Some of the ugly #if/#endif blocks have been elided in the wiener
      filter code (both low and high bit depth), by defining a convolve
      alias based on USE_WIENER_HIGH_INTERMEDIATE_PRECISION.
      
      There are also changes to extend const-ness for the source frame. I've
      adopted the convention that the frame input is called "data" (as it
      was before) while it's non-const. This is true as far as
      filter_rest_unit. Then each "process one stripe" function takes a
      const pointer to the source frame, at which point it's called "src".
      
      The intention is that, once filter_rest_unit no longer needs a
      RestorationInternal pointer, this function can be exposed in
      restoration.h and can be used by pickrst.c
      
      Change-Id: I18043a172ef0ca1154d87cf7f63e3a80944627cd
      d3d0615e
    • Rupert Swarbrick's avatar
      Remove partial_frame support from loop restoration · 146a060a
      Rupert Swarbrick authored
      This flag comes from the loop filter's speed features and (I think)
      tells the encoder to make decisions about the filter by looking at a
      narrow strip in the middle of the frame.
      
      That's reasonable enough, but doesn't make any sense for loop
      restoration, where we were calling av1_loop_restoration_frame from
      pickrst.c in order to calculate what restoration parameters to use for
      a given restoration unit (which might not be in the narrow strip in
      the middle!)
      
      As it turns out, the LPF_PICK_FROM_SUBIMAGE method is never actually
      signalled in the reference encoder, which is presumably why we haven't
      spotted this before.
      
      Change-Id: I745e2eab873c0b33920caca40e338af9d078d25e
      146a060a
    • Rupert Swarbrick's avatar
      Remove RestorationInternal from AV1_COMMON · f88bc049
      Rupert Swarbrick authored
      The bits needed by striped loop restoration are now in
      RestorationInfo (which also gets rid of a rather ugly extra
      index).
      
      The scratch buffer that's used for self-guided restoration has been
      moved up to its own variable (rst_tmpbuf).
      
      All the rest of the fields are now safely hidden inside restoration.c
      
      This patch also does a big cleanup of the initialisation code in
      loop_restoration_rows: it doesn't need to be as repetitive now that
      the fields of YV12_BUFFER_CONFIG can be accessed by plane index.
      
      Change-Id: Iba7edc0f94041fa053cdeb3d6cf35d84a05dbfaf
      f88bc049
    • Rupert Swarbrick's avatar
      Don't compute rtile width/height in av1_get_rest_ntiles · 64b8bbdf
      Rupert Swarbrick authored
      Restoration units are a fixed square size (in cm->rst_info[plane]) for
      almost the entire image. The only special case is for tiles at the
      right hand edge or the bottom row, which might expand or be cropped.
      
      The av1_get_rest_ntiles function was implementing the cropping
      behaviour when the image happened to be less than one restoration unit
      wide or high (but not the expansion behaviour), but the result was
      never useful: if you want to get the size of a restoration tile in
      order to divide by it to work out what tile you're on, the fixed
      square size is what you want. If you need to know how big this
      particular tile is, call av1_get_rest_tile_limits.
      
      As well as removing the output arguments from
      av1_get_rest_tile_limits, this patch also removes the tile_width and
      tile_height fields from the RestorationInternal structure. Note that
      the tile size which is what you actually need is accessible as
      rst->rsi->restoration_tilesize. (In practice, these were almost always
      the same anyway).
      
      This patch also has a couple of other small cleanups. Firstly, it
      moves the subsampling_y field out of
      CONFIG_STRIPED_LOOP_RESTORATION. It's not actually needed when you're
      not doing striped loop restoration, but this gets rid of lots of
      horrible #if/#endif lines at callsites for av1_get_rest_tile_limits.
      
      Secondly, it simplifies the code in init_rest_search_ctxt (and fixes
      some tautologous assertions). Now that YV12_BUFFER_CONFIG has a more
      uniform layout, there's a simpler way to set things up, so we use
      that.
      
      Change-Id: I3c32d8ea0abe119dc86b9efa7564b27dde2151dc
      64b8bbdf
  4. 11 Oct, 2017 1 commit
    • Debargha Mukherjee's avatar
      Change min eps value for sgr · e5fabfbb
      Debargha Mukherjee authored
      This makes sure that all eps values are at least 4 since otherwise
      the computation will not fit within 32 bits.
      
      BUG=aomedia:893
      
      Change-Id: I4815a865be8db792d0481172a2dfa0bc0a817f73
      e5fabfbb
  5. 10 Oct, 2017 1 commit
  6. 07 Oct, 2017 1 commit
    • Urvang Joshi's avatar
      FRAME_SUPERRES: Rework to use scale factor of 8/D · de71d142
      Urvang Joshi authored
      Earlier, the superres scale was in the form of:
      N/16, where N ranged from 8 to 16.
      
      We change this to the form:
      8/D, where D ranges from 8 to 16.
      
      This helps on the decoder side, by making it possible to work on 8x8
      blocks at a time.
      
      Change-Id: I6c72d4b3e8d1c830e61d4bb8d7f6337a100c3064
      de71d142
  7. 28 Sep, 2017 1 commit
    • Ola Hugosson's avatar
      Add striped_loop_restoration experiment · 1e7f2d0c
      Ola Hugosson authored
      This experiment offset the filter tile grid 8 pixels upwards.
      Deblocked pixels (rather than CDEFed pixels) are used for the
      2 lines above and below the filter processing unit. The 8 pixel
      offset is the offset produced by deblock/cdef. This way the
      loop_restoration does not need additional line buffers in a
      single pass hardware implementation.
      
      Change-Id: I89e0831dc28413a5d3e02d7a426ce2885ab629d7
      1e7f2d0c
  8. 26 Sep, 2017 1 commit
    • Rupert Swarbrick's avatar
      Simplify av1_get_rest_tile_limits · 5d2e729e
      Rupert Swarbrick authored
      The subtile and clamping features are no longer used. This patch
      removes the dead code that implemented them and the parameters to
      support them.
      
      It also changes the return format. Instead of having return type void
      and passing data out through 4 output pointers, the function now just
      returns a RestorationTileLimits structure. Since the function is
      defined inline in a header, I suspect that most callsites will
      actually compile to identical code.
      
      There should be no functional change from this patch.
      
      Change-Id: I6ebc4da66a00676bd988f939a4b4957f743e8004
      5d2e729e
  9. 10 Sep, 2017 2 commits
    • Debargha Mukherjee's avatar
      Refactoring/simplification of buffers used for sgr · 1330dfd1
      Debargha Mukherjee authored
      Inlcudes miscellaneous cleanups, test fixes, and code reorganization
      for loop-restoration components.
      
      Change-Id: I5b2e6419234d945e6f4344b22636119b50df4054
      1330dfd1
    • Debargha Mukherjee's avatar
      Reduce/Eliminate line buffer for loop-restoration. · e168a783
      Debargha Mukherjee authored
      This patch forces the vertical filtering for the top and bottom
      rows of a processing unit for the Wiener filter to not use border
      more than what is set in the WIENER_BORDER_VERT macro.
      This macro is currently set at 0 to eliminate line buffer completely,
      but it could be increased to 1 or 2 to use limited line buffers
      if the coding efficiency is affected too much with a 0 line-buffer.
      
      Also, for the sgr filter we added the option of using overlapping
      windows horizonttally and vertically to improve coding efficiency.
      The vertical border used is set by the SGRPROJ_BORDER_VERT
      macro, while the horizontal border can be set by the
      SGRPROJ_BORDER_HORZ macro set at 2, the max needed. Currently we do not
      recommend changing SGRPROJ_BORDER_HORZ below 2.
      
      The overall line buffer requirement for LR is twice the max of
      WIENER_BORDER_VERT and SGRPROJ_BORDER_VERT.
      Currently both are set as 0, eliminating line buffers completely.
      
      Also this patch extends borders consistently before CDEF / LR.
      
      Change-Id: Ie58a98c784a0db547627b9cfcf55f018c30e8e79
      e168a783
  10. 07 Sep, 2017 1 commit
    • Debargha Mukherjee's avatar
      Reduce line buffer size for Wiener filter. · 22bbe4cc
      Debargha Mukherjee authored
      This patch forces the vertical filtering for the top and bottom
      rows of a processing unit for the Wiener filter to be 5-tap.
      The 5-taps are derived from the primary 7-tap fitler by forcing
      the taps at the end to be zero, and absorbing their weights into
      the other taps to maintain normalization.
      This will effectively reduce the line buffer size for luma Wiener
      filter to 4 (from 6).
      
      Change-Id: I5e21b58369777eabf553a8987387d112f98a5598
      22bbe4cc
  11. 06 Sep, 2017 2 commits
    • Rupert Swarbrick's avatar
      Round up subsampled frame size in av1_loop_restoration_corners_in_sb · 7380b25e
      Rupert Swarbrick authored
      The previous code converted a frame_w (say) of 1 to zero for a plane
      where subsampling was enabled, causing a division by zero in
      av1_get_rest_ntiles. This doesn't match the spec, which says
      subsampling rounds up.
      
      The patch adds the rounding, and also adds an assertion to
      av1_get_rest_ntiles to help diagnose any other broken callsites.
      
      Change-Id: Ia6c249fa935c3a16d122ba6e7b450fe99f412fde
      7380b25e
    • Debargha Mukherjee's avatar
      Make loop-restoration use 64x64 processing units · 7a5587a8
      Debargha Mukherjee authored
      Changes loop-restoration to use processing unit size that is
      64x64 for luma; for chroma the processing unit is coupled to
      64x64 support region for luma.
      Thus for chroma the processing unit size is 32x32 for 4:2:0,
      32x64 for 4:2:2 and 64x64 for 4:4:4, etc.
      
      While the Wiener filter output should not change with this patch,
      the sgr filter will change since the boundary pixel handling in
      sgr is internal within the filter.
      
      Change-Id: I65a9e2df88927a19445420ce400acb1fcf7afa93
      7a5587a8
  12. 03 Sep, 2017 1 commit
    • Rupert Swarbrick's avatar
      Move loop restoration coefficients to within the frame · 6c545216
      Rupert Swarbrick authored
      Rather than encoding the loop restoration coefficients at the start of
      the frame header, this patch moves them to occur just after certain
      top-level superblocks.
      
      You might hope that we could just encode coefficients on top-level
      superblocks where the top-left corner of the superblock was also the
      top-left corner of the loop restoration tile. Unfortunately, this
      can't work with the superres experiment, where the loop restoration
      tiles don't necessarily line up with the superblocks. Indeed, in
      general there can be multiple different loop restoration coefficients
      that apply in a given top-level superblock. This patch defines a
      function, av1_loop_restoration_corners_in_sb, which yields the
      rectangle [rrow0, rrow1) x [rcol0, rcol1) of loop restoration tiles
      whose top left corners lie in this top-level superblock.
      
      The total file size should be unchanged by this patch: the bits have
      just been moved from the frame header and spread out among the rest of
      the frame.
      
      Change-Id: Icf43b0560964a63dea0d2cd801313f04139188d7
      6c545216
  13. 16 Aug, 2017 1 commit
    • Debargha Mukherjee's avatar
      Use only up to 5x5 windows for sgr in loop-rest. · 76be32df
      Debargha Mukherjee authored
      The tables for guided filter in loop-restoration are changed
      to use a max of 5x5 windows (or radius 2).
      The aim is to reduce the gate count for hardware implementation.
      
      Change-Id: I7178d6ac09e4731a626f9bccf5151467c63e00c3
      76be32df
  14. 13 Jun, 2017 1 commit
    • Fergus Simpson's avatar
      Make loop-restoration compatible w/ frame_superres · 9cd57cf8
      Fergus Simpson authored
      There were several places where loop_restoration used the encoded width
      and height while superres was active. This patch changes it to use the
      upscaled width and height, since loop_restoration is supposed to occur
      after superres has done its upscaling.
      
      Change-Id: I2b9bbb06b5370618758bf81d8eb63f2eef26af80
      9cd57cf8
  15. 06 Jun, 2017 1 commit
    • Debargha Mukherjee's avatar
      Make loop-restoration compatible w/ frame_superres · 2dd982e4
      Debargha Mukherjee authored
      When frame_superres is on, loop-restoration should work
      on the size of the upscaled frame and not on the internal
      width and height in the common structure. This patch
      makes the necessary changes on the encoder and decoder
      side to enable that.
      
      Change-Id: I1d1c024ac6f95944169d90647b4c5a61354a5cc6
      2dd982e4
  16. 15 May, 2017 1 commit
  17. 22 Apr, 2017 1 commit
  18. 20 Apr, 2017 1 commit
  19. 12 Apr, 2017 1 commit
  20. 10 Mar, 2017 2 commits
    • David Barker's avatar
      Vectorize new highpass filter for loop-restoration · eed824ef
      David Barker authored
      Change-Id: Ibe5d4933f599456cb496f636de244694bc786a4c
      eed824ef
    • Debargha Mukherjee's avatar
      Replace one self guided filter with highpass · b7bb0976
      Debargha Mukherjee authored
      Adds an option controlled by a macro to replace one of
      the guided filters in the self-guided tool with a simple
      bandpass filtered version generated with a 3x3 kernel.
      By default the macro USE_HIGHPASS_IN_SGRPROJ is 0 (turned
      off), that defaults us to the dual self-guided filter.
      When the macro is turned on, the larger radius guided
      filter is replaced by a simpler filter that is much faster.
      
      Results (if USE_HIGHPASS_IN_SGRPROJ is on vs. off):
      lowres: performance drop by +0.14% (BDRATE)
      midres: performance drop by +0.27% (BDRATE)
      
      Further experiments on this variation of guided filters is
      pending.
      
      Change-Id: I7bbcfcad7ee266cd49a8dc6d96795a454feb1a94
      b7bb0976
  21. 09 Mar, 2017 3 commits
  22. 08 Mar, 2017 3 commits
    • David Barker's avatar
      Make encoder use vectorized self-guided filter · 506eb723
      David Barker authored
      By rearranging the code in restoration.c, we can allow the
      encoder to use the SSE4.1 version of the self-guided filter
      while picking the loop-restoration filter.
      
      This also helps us prepare for adding a highbitdepth SSE4.1
      version of the self-guided filter.
      
      No effect on encoder output, but gives an end-to-end speedup
      of 1-2%.
      
      Change-Id: Id17ba4a0963ddce9f70a7cae666e212e138d5f2c
      506eb723
    • David Barker's avatar
      Fix a bug in the C selfguided filter · cff43bb2
      David Barker authored
      Patch https://aomedia-review.googlesource.com/c/8321/ introduced
      a bug in the C version of the self-guided filter in the case where
      w = 384 and h > 368 or w > 368 and h = 384. This was due to forgetting
      to adjust the offset between A and B in the C code.
      
      This patch sets the offset correctly, resolving this bug.
      
      Change-Id: I6bdf11aa76c37d0ecae02788b262e7a2e0a11a6e
      cff43bb2
    • Alex Converse's avatar
      loop_restoration: Prevent some wild memory access · 1511ea10
      Alex Converse authored
      On recode frames the encoder will attempt to serialize the bitstream
      before choosing loop filter parameters to get a rough size estimate.
      This can result in wild reads in encode restoration if leftover values
      from the previous frame aren't available.
      
      Even with a realloc instead of free-ing and reallocing all the data,
      wild reads are possible on frame size changes.
      
      Change-Id: I9956d9e11c6ed61999563436051c2fe469718538
      1511ea10
  23. 06 Mar, 2017 1 commit
    • David Barker's avatar
      Vectorize self-guided filter · ce110cc5
      David Barker authored
      Add an SSE4.1 lowbd version of the self-guided filter for
      loop-restoration, and apply some optimizations to the C
      version.
      
      Approximate times per 128x128 / 256x256 tile on the machine
      this was developed on:
      Previous C:  620us / 2800us
      Optimized C: 500us / 2200us ( 24% /  27% faster)
      SSE4.1:      147us / 600us  (320% / 370% faster)
      
      Change-Id: I23ff5a5482a191aeb06f9d1f767a9f036bb357fe
      ce110cc5
  24. 02 Mar, 2017 1 commit
    • David Barker's avatar
      Remove double rounding in selfguided filter · 7dcd7f5e
      David Barker authored
      In av1_selfguided_restoration, the values stored into 'dgd' are
      unnecessarily rounded twice. This patch replaces this by a single
      rounding operation.
      
      Change-Id: I188d283137b74823f5d5447d441250520d6ee294
      7dcd7f5e
  25. 27 Feb, 2017 2 commits
    • Alex Converse's avatar
      Remove aom_realloc() · 7f094f10
      Alex Converse authored
      It only handles the realloc constraint (preserving low elements) by
      serendipity, and we don't actually rely on that behavior anyway.
      Meanwhile the calls may do extra copying that gets immediately clobbered
      by the callers.
      
      Cherry-pick from libvpx:
      3063c3760 Remove vpx_realloc()
      
      Change-Id: I8dfa89e4a81084b084889c27bd272fdf85184e8d
      7f094f10
    • Alex Converse's avatar
      loop_restoration: Cleanup allocations · 232e3847
      Alex Converse authored
      Change-Id: Id3824c09cbaae814df1d8fb029215f28e8c7a6b1
      232e3847
  26. 22 Feb, 2017 1 commit
    • David Barker's avatar
      Rearrange self-guided filter for vectorization · 9198d135
      David Barker authored
      By rearranging the order of operations, we can ensure that all
      intermediate values fit into 32 bits. This will help when we
      vectorize the self-guided filter.
      
      Results in the noise range.
      
      Change-Id: Ic0c73613882bd103c4e8e57a0155b3132672ae04
      9198d135
  27. 17 Feb, 2017 1 commit
    • Debargha Mukherjee's avatar
      Replace division in self-guided filter · 4be12628
      Debargha Mukherjee authored
      Replaces division with multiplication in self-guided
      filter.
      
      The guided filter requires computation of:
      n^2.s^2/(n^2.s^2 + n^2.e).
      This is now implemented by computation of n^2.s^2/n^2.e followed
      by using a lookup table for the function f(x) = x/(x+1).
      To compute n^2.s^2/n^2.e, we use an integer multiplication based
      implementation which becomes feasible since n^2.e can only
      take a few values and their corresponding multipliers can be
      pre-computed.
      There is also another divison by n, that is also integerized.
      
      Change-Id: Id7b81bbafead0b8f04a1853ec69b9dec423bb66a
      4be12628
  28. 14 Feb, 2017 1 commit