- 16 Jan, 2018 1 commit
-
-
David Barker authored
Properly support loopfiltering-across-tiles in combination with superres and/or loop-restoration: Upscale one tile column at a time, rather than doing the whole frame at once. This allows us to correctly support the loop filter across tiles flag, by temporarily extending the left/right boundaries of each tile column to avoid sampling from adjacent tiles. This code is also reused by striped-loop-restoration, when upscaling the deblocked context above/below each stripe. That way, we i) ensure that the upscaling is done consistently, and ii) fix the last remaining case where loop-restoration didn't respect the loop filter across tiles flag. This also makes it easy to perform extension of the left/right edges of the frame "as needed", so we don't need to extend the frame borders immediately after deblocking. This should give marginally better CDEF filtering for frames using superres. Change-Id: I28712a177853a20c9eb2993e740da8ba7c95a8cc
-
- 27 Dec, 2017 2 commits
- 19 Dec, 2017 1 commit
-
-
Lei authored
Based on the latest discussion in the HW working group about how the loop filter should be integrated with tiles, the following decisions have been made: 1. Two separate flags should be added for loop_filter_across_tiles_enabled, one for horizontal tile boundaries and one for vertical tile boundaries. 2. The encoder and decoder should check only these two flags to determine whether loop filtering (including deblocking, CDEF and loop restoration) should cross tile boundaries (vertical and/or horizontal), regardless of the horizontal dependent tile flag. This change list implements support for two separate loop_filter_across_tiles_enabled flags for vertical and horizontal tile boundaries. The new experiment is disabled by default until it is adopted. Change-Id: I814377947517f5419c08b004a3b71b950d01eadd
-
- 18 Dec, 2017 1 commit
-
-
Urvang Joshi authored
When the superres experiment is compiled in, but the frame is not super-resolved, we should use the same code that is used by full-resolution frames. Change-Id: I79b1d8410f66febdb51b78013375d61a8e52c8c5
-
- 15 Dec, 2017 1 commit
-
-
Yaowu Xu authored
Change-Id: I7ce37b2e43b4607c77515d802a6ad330047fc4c2
-
- 13 Dec, 2017 1 commit
-
-
Dominic Symes authored
Also fix one issue with max-tile+LR when the top left tile is not the largest tile. Change-Id: I721254f63f1a2c6e2c199e27ebaaaebe57234d0f
-
- 12 Dec, 2017 3 commits
-
-
Rupert Swarbrick authored
At both callsites, the "rsi" parameter is the rst_info field from cm, which is already passed. Change-Id: I837ac655a03ebf0de6fbdaece4f4910f750e4898
-
Rupert Swarbrick authored
This is done at the two call sites (where it's needed to avoid saving lines unnecessarily for striped loop restoration context), so there's no need to repeat it here. Change-Id: I11e1ed5f50711fe1b4e8cb2101d3bfb4d16cda57
-
Rupert Swarbrick authored
This is always called with all components and no destination buffer. Change-Id: I76d1a16a87e05b8ecec387288139e846e9894384
-
- 11 Dec, 2017 1 commit
-
-
Urvang Joshi authored
Change-Id: I5fc45fa9fe6a354ae34001f48850eb68364a5a79
-
- 05 Dec, 2017 1 commit
-
-
Dominic Symes authored
max-tile remains off by default until more testing is performed, but I would like to check in the fixes that are known so far to prevent this patch getting too big. max_tile was provisionally adopted at the working group meeting 2017-Oct-10. This patch fixes the following issues:
  - max_tile is fixed to support superblock size 64x64 as well as 128x128 (ext_partition support)
  - max_tile is fixed in combination with loop_restoration
  - max_tile is fixed in combination with ext_tile (Bug: 1013)
  - max_tile is fixed in combination with lv_map and 64x64 superblock (lv_map memory allocation fixed for 64x64 superblock)
  - max_tile reports the size of the first tile for inspection.c used by the analyzer
Change-Id: Ib83ff613e5d66563c81452a085c7984d3b4813e4
-
- 29 Nov, 2017 2 commits
-
-
David Barker authored
Change-Id: Ic31a53a37ff9200fad178e5054cffc4b87c6cc42
-
Yaowu Xu authored
This is for aom_memset16() used later in the code. Change-Id: I4426354bae4f0bf2e3012a138a8e779098b35173
-
- 25 Nov, 2017 1 commit
-
-
Yaowu Xu authored
Change-Id: Ia76f0686b0fc340eb5dd28b7245f72d0f158ed42
-
- 23 Nov, 2017 3 commits
-
-
Rupert Swarbrick authored
The first stage of the selfguided filter is to generate box sums of the input image (and its squares). This is done with a pair of integral images, which are the same for both calls in apply_selfguided_restoration. This patch refactors things so that av1_selfguided_restoration calculates both "flt" buffers, allowing it to reuse the integral images that it calculated. Change-Id: Ica2f6f66e41bea38eb1a135c78c1d7ddab434d8e
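As a rough illustration of the box-sum idea described above, here is a minimal Python sketch of an integral image (summed-area table) and a windowed sum read from it. The function names `integral_image` and `box_sum` and the sample data are hypothetical and are not taken from the libaom source; the point is only that one pair of tables (pixels and squared pixels) can serve any number of later window queries.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) summed-area table for a 2-D list of ints."""
    h, w = len(img), len(img[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above and to the left of (y, x), inclusive.
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, y, x, r):
    """Sum of the (2r+1) x (2r+1) window centred on (y, x).

    Assumes the window lies fully inside the image.
    """
    y0, x0 = y - r, x - r
    y1, x1 = y + r + 1, x + r + 1
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


# Hypothetical 6x6 test image; both tables are built once and then reused.
img = [[(3 * y + x) % 7 for x in range(6)] for y in range(6)]
sums = integral_image(img)
sq_sums = integral_image([[v * v for v in row] for row in img])
```

Once `sums` and `sq_sums` exist, each window query costs four table lookups, which is why computing them once and sharing them between both filter passes saves work.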
-
Rupert Swarbrick authored
This doesn't have a big performance impact, and it's rather simpler just having one version of everything. Change-Id: I5fa5e7640a63d0ccb0c371f266c6eee99d9520f9
-
Rupert Swarbrick authored
Change-Id: Ifac3a3bf620061865b82b986d6b16bcabd96a187
-
- 22 Nov, 2017 1 commit
-
-
Rupert Swarbrick authored
The code was assuming that an mi was always 4 samples wide and high. For chroma planes with subsampling, this is wrong and the size and position of the sb in the plane was over-estimated by a factor of two. This meant that we sent all the coefficients in the top-left hand quarter of the tile. Since the encoder and decoder made the same mistake, this worked fine, but it's clearly not what we're supposed to do! Change-Id: I0da8ada1d76639ad476ad84491658bc25ef3a43f
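The unit mismatch described above can be sketched as follows. This is an illustrative Python fragment, not the libaom code: `MI_SIZE_LUMA` and `sb_rect_in_plane` are hypothetical names, and the only assumption carried over from the message is that an mi unit spans 4 luma samples and that subsampled planes shrink by a right shift.

```python
MI_SIZE_LUMA = 4  # assumption: one mi unit covers 4x4 luma samples


def sb_rect_in_plane(mi_row, mi_col, sb_mi_size, ss_x, ss_y):
    """Return (x, y, width, height) of a superblock in plane samples.

    ss_x / ss_y are the plane's horizontal / vertical subsampling shifts
    (0 for luma, 1 for 4:2:0 chroma).
    """
    x = (mi_col * MI_SIZE_LUMA) >> ss_x
    y = (mi_row * MI_SIZE_LUMA) >> ss_y
    width = (sb_mi_size * MI_SIZE_LUMA) >> ss_x
    height = (sb_mi_size * MI_SIZE_LUMA) >> ss_y
    return x, y, width, height
```

For a 64x64 superblock (16 mi units) at mi position (16, 16), the luma rectangle is 64 samples wide, but the 4:2:0 chroma rectangle is only 32 samples wide. Ignoring the shift overestimates both position and size by a factor of two, as the message describes.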
-
- 20 Nov, 2017 1 commit
-
-
David Barker authored
We currently have two implementations of the same function (aom_memset16() and memset16()), one of which is only defined inside restoration.c. Remove this duplicate, and use the globally defined version instead. Change-Id: I52740541f2e974f505728240127842397f6ef38d
-
- 17 Nov, 2017 1 commit
-
-
David Barker authored
The stripes are intended to extend down to the full decoded height of the frame, which is always a multiple of 8 luma pixels, in order to avoid some nasty edge cases. This change was partially implemented in previous patches, but not everywhere was modified, leading to slightly inconsistent code. This patch finishes making the relevant changes, along with a slight bit of refactoring. Change-Id: Ibc8e2f5ace5415815625edbc224557a7c548c38a
-
- 16 Nov, 2017 2 commits
-
-
David Barker authored
A while ago, I calculated some bounds on the intermediate values inside the self-guided filter. These bounds turned out to be not quite correct in one particular instance (when we have a large region of max-value pixels). This caused a variable to overflow a uint32_t when decoding 12-bit streams in the reference decoder, and would force 8/10-bit-only hardware to use wider buffers than intended in order to match the reference code. Fortunately, this can be fixed quite easily, with minimal changes to the filter output. See comments within the patch for the exact details. Also re-instate a Wikipedia link which seems to have gone missing but which provided useful context for the derivation of the bounds. Change-Id: I83d4a277a37eff048af9989cccf19202fafb17b5
-
David Barker authored
* Set up and restore the correct number of left/right boundary pixels at vertical tile edges, and save them in the correct buffers. Also fix the restore process in high-bitdepth mode.
* When loop filtering across tiles is enabled, we were previously acting inconsistently at horizontal tile borders: The stripe just above the boundary would use CDEF pixels from the tile below for context, while the stripe just below would use deblocked pixels from the stripe above. The intended design appears to have been to use CDEF pixels on both sides (so we logically have a 64-pixel high stripe, it's just split into an 8-pixel and a 56-pixel high stripe in order to keep the coefficient sets aligned to tiles). Implement that behaviour by disabling the context setup process when at a horizontal tile border.
* Pull some common calculations out of {setup,restore}_processing_stripe_boundary and into their common caller. This allows us to reduce the number of arguments going into each function and their internal complexity.
* Add more design comments around stripe boundary setup, as there are quite a lot of constraints to be aware of.
Change-Id: Ic1586c149b7f764b9c1a711df3f11fb0f130b38a
-
- 14 Nov, 2017 1 commit
-
-
Rupert Swarbrick authored
The "src_height" computed in save_deblock_boundary_lines didn't match the one in save_tile_row_boundary_lines, which meant that the wrapper function assumed the deblock code was saving some lines, while the deblock code assumed that save_cdef_boundary_lines would do it. This patch fixes up the logic to match, and also completely gets rid of the lines_to_save variable (after all, bad things would happen if lines_to_save was 1 because we'll still read both boundary lines later). The tile height gets rounded up to a multiple of 8 luma pixels in save_tile_row_boundary_lines to avoid nasty corner cases. This will only have any effect for rows at the bottom of the frame (where av1_get_tile_rect clips to the frame boundary). BUG=aomedia:1020 Change-Id: I55adb53fa8ba9c7f97fb2fd5b328a3f2f5065464
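The rounding mentioned above is ordinary round-up-to-a-multiple arithmetic. A minimal Python sketch, using a hypothetical helper name that does not appear in the source:

```python
def round_up_to_multiple_of_8(n):
    """Round a non-negative height (in luma pixels) up to a multiple of 8."""
    return (n + 7) & ~7
```

A tile row whose clipped height is already a multiple of 8 is unchanged, which is consistent with the message's remark that only rows clipped at the bottom of the frame are affected.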
-
- 10 Nov, 2017 2 commits
-
-
Hui Su authored
Change-Id: Ia47101de22091a60d7931890f00a4a3a527f5bd4
-
Rupert Swarbrick authored
The logic (copied from save_deblock_boundary_lines) to work out how many lines to save isn't correct here: we always save exactly one line but duplicate it across multiple rows in the "boundary" buffer. BUG=aomedia:1020 Change-Id: Ib7ce7a777191062f3511d82c8c4eec589c900a2f
-
- 08 Nov, 2017 1 commit
-
-
Yaowu Xu authored
This reverts commit ab8bb8b8. The reverted commit breaks many nightly tests; revert it temporarily so that the nightly tests can detect other failures. Once the issues are fixed, we can re-enable the change from the reverted commit. BUG=aomedia:1012 BUG=aomedia:1013 BUG=aomedia:1014 Change-Id: I2503fe78e47c7a08bb6cfdfff2c295cec0b6497d
-
- 07 Nov, 2017 1 commit
-
-
Rupert Swarbrick authored
As of patch https://aomedia-review.googlesource.com/c/aom/+/28821 , loop-restoration units cannot cross tile borders. But the context around each processing unit was still allowed to cross tile borders. This is fine in the usual case - but, when loop filtering across tiles is switched off, we're supposed to be able to decode each tile completely independently (each tile column, if dependent-horztiles is on).
Roughly, the change we need to make is: When loop filtering across tiles is switched off, we treat each tile as if it were a full frame, and extend the CDEF output for that tile to form a 3-pixel border around the tile. We only use deblocked above/below pixels for processing unit boundaries which lie inside a tile.
In terms of the code, this is implemented in two parts. This only applies when the loop_filter_across_tiles_flag is false; otherwise, we keep the old behaviour.
* For processing units at the top edge of a tile, fill the above context with copies of the topmost line of CDEF output *from the same tile*, rather than using deblocked pixels from the tile above. The below context of processing units at the bottom edge of a tile is treated analogously.
* When setting up the boundary for a processing stripe at the left edge of a tile, fill the stripe's left boundary with copies of the leftmost column of CDEF output from the same tile. Again, processing stripes at the right edge of a tile are treated analogously. Similarly to the above/below boundaries, we store the overwritten pixels into a pair of left/right context buffers, and restore them to their original values once we've dealt with that processing stripe.
Change-Id: I53a0932793c1c56dc037683c6a4353a3f5dc4539
-
- 06 Nov, 2017 1 commit
-
-
Dominic Symes authored
max_tile was provisionally adopted at the working group meeting 2017-Oct-10. This patch also enables support for 64x64 and 128x128 superblock sizes for max_tile (rather than assuming 128). There is also one fix for max_tile in combination with loop restoration, where the width/height was in the wrong units in max-tile specific code. Change-Id: Icb862a2738fea5fc6215819396e1afa4eb86e461
-
- 03 Nov, 2017 2 commits
-
-
Rupert Swarbrick authored
The "data8_bl" variable is a uint8_t* and will be scaled up later (with REAL_PTR) if it's pointing to highbd data. Don't scale up the x offset. Change-Id: I03e2ce8861e25e3a603e8f0ba2c8af585e08b9c5
-
Rupert Swarbrick authored
We do this by upscaling the deblocked output as we save it into the RestorationStripeBoundaries line buffers (see save_boundary_lines in restoration.c for the details). The upscaling is done by calling av1_convolve_horiz_rs, which reads off the edge of the frame and, of course, across tile boundaries. This means we need to extend the frame borders before saving boundary lines (hence the changes to decodeframe.c and encoder.c). Change-Id: Ia096846898b20afe4737433d772f7277d4f71724
-
- 02 Nov, 2017 5 commits
-
-
Rupert Swarbrick authored
Before this patch, striped loop restoration didn't restart correctly on each tile row. Now, the loop restoration stripes start at the top of a tile row in the same way as if it were the top of the entire frame. Change-Id: I0a88a28d7804b2f09d792ecbbf4f22f666f67012
-
David Barker authored
Because we have an (effective) 3-pixel border around each processing unit, and the local sums in the self-guided filter are only taken over at most 5x5 regions, we have 1 pixel's worth of spare border. We can use this border to greatly simplify the filter: Instead of calculating a 64x64 region of the A[] and B[] arrays, we can calculate a 66x66 region. Then we don't have to deal with complicated boundary conditions when generating the final 64x64 output block.
This also makes a few other related changes:
* The 'boxnum' function has been effectively redundant for a while - due to the way we do the 5x5 (or 3x3) windowing, the values we actually use are always (2r+1)^2. So we can skip calling this function if MAX_RADIUS <= 2.
* We can remove the annoying special case for tiny processing units in the self-guided filter, as we no longer have to worry about border behaviour.
* We change the SSE4.1 code to match the new C code, removing a ton of complexity. Further refactoring/speedups are probably now possible, but this includes the minimal changes to pass all the tests.
Change-Id: I99beee164a31349a5228a9bef048e5f35c9639f2
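The border arithmetic in this message can be checked with a small Python sketch. The constants below restate figures from the message (3-pixel border, 64x64 output, radius at most 2); the helper names `window_count` and `border_needed` are hypothetical and only illustrate the reasoning.

```python
RESTORATION_BORDER = 3  # pixels of border available around a processing unit
PROC_UNIT_SIZE = 64     # output block size quoted in the message


def window_count(r):
    """Number of pixels in a full (2r+1) x (2r+1) window."""
    return (2 * r + 1) ** 2


def border_needed(r, extra=1):
    """Pixels read beyond the unit when A[]/B[] extend 'extra' pixels past it.

    Each A[]/B[] entry sums a window of radius r, so an entry 'extra' pixels
    outside the unit reads pixels up to extra + r away from the unit edge.
    """
    return extra + r


# One extra pixel on each side turns the 64x64 region into 66x66.
intermediate_size = PROC_UNIT_SIZE + 2 * 1
```

With r = 2 the extended region needs exactly 3 pixels of border, which matches the border that is available, and the window always holds 25 pixels (9 for r = 1), so no per-position pixel count is required.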
-
Rupert Swarbrick authored
These are just RESTORATION_PROC_UNIT_SIZE shifted right by the vertical or horizontal subsampling for this plane and it's easier not to have to pass them around. Change-Id: I86441d6cd86bb146f3e5dcdf2c89e34dd9fed0e1
-
Rupert Swarbrick authored
Previously we were calling aom_extend_frame_borders to generate extended pixels for use in loop-restoration. This generates quite a large border, when we only need 3 pixels. In addition, we were also calling extend_frame, which does the same thing but with a smaller border, once (in the decoder) or multiple times (in the encoder) per plane. This patch tidies all of this up so that we only call extend_frame once per plane, with the largest border size we need (3px). It also adds two new #defines. RESTORATION_BORDER is the 3 pixel border needed to do filtering for a processing unit. RESTORATION_CTX_VERT is the number of rows saved for each stripe when doing striped loop restoration. Change-Id: I2c3ffcc19808f79db195f76d857e2f23da5d8a84
-
Rupert Swarbrick authored
After this patch, we don't scale sb coordinates vertically when using HORZ_FRAME_SUPERRES. Change-Id: I24c652b4b357b132e8b29979a119e7aeb8420e19
-
- 30 Oct, 2017 2 commits
-
-
Rupert Swarbrick authored
I'd got the scaling backwards. This gets it right and adds a comment explaining the calculation. Change-Id: Ife2913700cc73996c09b702b394832799c449a8c
-
David Barker authored
Remove the special case handling for the topmost/bottommost rows in each processing unit. This causes slightly different effects depending on whether striped-loop-restoration is enabled.
With striped-loop-restoration: Now that we explicitly fill out 3 rows of above/below pixels for each stripe, we don't need to use stepdown_wiener_kernel. Instead, the duplication of the topmost/bottommost pixels accomplishes the same task, while making the code much cleaner. This patch should not cause a change in output, except in a couple of cases which were already questionable. In particular, it fixes bug #953, where the Wiener filter could not handle small processing units (<4 rows high).
Without striped-loop-restoration: The Wiener filter returns to using a full 3 pixels above/below the processing unit. In order to make sure there are enough pixels, we need to expand WIENER_BORDER_VERT to 3 pixels. This will result in a slight change in output, but should be fairly minor.
BUG=aomedia:953 Change-Id: I9530ef55909246f7ba488b7ecfd92d59e776b2f9
-
- 27 Oct, 2017 1 commit
-
-
David Barker authored
Save and restore 3 rows above and below each stripe, instead of 2. The extra rows are filled with duplicates of the outermost context rows. This should not affect the encoder or decoder output in any way, as currently these outer rows are not used. But this will enable later patches to simplify the code and make it a closer match to the way things are described in the striped-loop-restoration design document. Change-Id: I8ae5433e321d6025c6dc1b473330f485f1599340
-
- 26 Oct, 2017 1 commit
-
-
Rupert Swarbrick authored
With this patch, restoration units are allocated within each tile as if it were its own image. Arrays of information that need one entry per restoration unit are laid out in tiles, with rsi->units_per_tile units for each tile. Change-Id: I485c17166f33e24d281079b3138b76f98f0fe081
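One way to picture the per-tile layout described above is the following Python sketch. Only `units_per_tile` comes from the message; the function name, the `horz_units_per_tile` parameter, and the row-major ordering inside a tile are assumptions made for illustration and may not match the actual implementation.

```python
def restoration_unit_index(tile_idx, unit_row, unit_col,
                           horz_units_per_tile, units_per_tile):
    """Flat array index of a restoration unit in a tile-major layout.

    Assumes each tile owns a contiguous block of units_per_tile entries and
    that units inside a tile are stored in row-major order (an assumption).
    """
    index_in_tile = unit_row * horz_units_per_tile + unit_col
    if not 0 <= index_in_tile < units_per_tile:
        raise ValueError("unit position lies outside the tile's block")
    return tile_idx * units_per_tile + index_in_tile
```

Reserving the same number of entries for every tile means a tile's units can be located from its index alone, without consulting the sizes of the tiles before it.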
-