- 07 Nov, 2017 4 commits
-
-
Monty Montgomery authored
The recent 64x32 and 32x64 patches break the build when CONFIG_DAALA_TX and CONFIG_TX64X64 are enabled simultaneously. This is a minor correction that fixes the build problem. Change-Id: I53cd8df9160fc35b67f2ac16bddcfab08425cf8e
-
Debargha Mukherjee authored
This change seems to drop efficiency more than expected, so back it out for now until a better RD-based decision is found. Change-Id: I3791a13ba76cfa38dd0df2f1fd4119b42b12291d
-
Yue Chen authored
Return invalid rate (previously only invalid rdcost) if the mode combination to check is < 8x8 tx_size + filter_intra mode. BUG=aomedia:1006 Change-Id: If90f431c7692473c88ac7a644bfa969a1acb3573
-
Rupert Swarbrick authored
As of patch https://aomedia-review.googlesource.com/c/aom/+/28821 , loop-restoration units cannot cross tile borders. But the context around each processing unit was still allowed to cross tile borders. This is fine in the usual case - but, when loop filtering across tiles is switched off, we're supposed to be able to decode each tile completely independently (each tile column, if dependent-horztiles is on).

Roughly, the change we need to make is: When loop filtering across tiles is switched off, we treat each tile as if it were a full frame, and extend the CDEF output for that tile to form a 3-pixel border around the tile. We only use deblocked above/below pixels for processing unit boundaries which lie inside a tile.

In terms of the code, this is implemented in two parts. This only applies when the loop_filter_across_tiles_flag is false; otherwise, we keep the old behaviour.

* For processing units at the top edge of a tile, fill the above context with copies of the topmost line of CDEF output *from the same tile*, rather than using deblocked pixels from the tile above. The below context of processing units at the bottom edge of a tile is treated analogously.

* When setting up the boundary for a processing stripe at the left edge of a tile, fill the stripe's left boundary with copies of the leftmost column of CDEF output from the same tile. Again, processing stripes at the right edge of a tile are treated analogously. Similarly to the above/below boundaries, we store the overwritten pixels into a pair of left/right context buffers, and restore them to their original values once we've dealt with that processing stripe.

Change-Id: I53a0932793c1c56dc037683c6a4353a3f5dc4539
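The two parts described above can be sketched roughly as follows. This is only an illustration of the replicate, save and restore idea, not the code from the patch: the function names, the buffer layout and the `BORDER` constant are all assumptions made for the example (the message only says the border is 3 pixels wide).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed border size; the commit message mentions a 3-pixel border. */
#define BORDER 3

/* Hypothetical helper: fill the above context (BORDER rows of `width`
 * pixels, stored contiguously) with copies of the tile's topmost row. */
static void fill_above_from_tile(uint8_t *above, const uint8_t *tile_top,
                                 int width) {
  assert(width > 0);
  for (int r = 0; r < BORDER; ++r)
    memcpy(above + r * width, tile_top, (size_t)width);
}

/* Hypothetical helper: overwrite the BORDER columns to the left of column
 * tile_x0 with copies of column tile_x0, saving the old pixels in `saved`
 * (height * BORDER bytes) so that they can be put back later. */
static void save_and_fill_left(uint8_t *buf, int stride, int height,
                               int tile_x0, uint8_t *saved) {
  assert(tile_x0 >= BORDER && stride > tile_x0);
  for (int y = 0; y < height; ++y) {
    uint8_t *row = buf + y * stride;
    for (int i = 0; i < BORDER; ++i) {
      saved[y * BORDER + i] = row[tile_x0 - BORDER + i];
      row[tile_x0 - BORDER + i] = row[tile_x0];
    }
  }
}

/* Hypothetical helper: undo save_and_fill_left once the stripe is done. */
static void restore_left(uint8_t *buf, int stride, int height, int tile_x0,
                         const uint8_t *saved) {
  assert(tile_x0 >= BORDER && stride > tile_x0);
  for (int y = 0; y < height; ++y)
    memcpy(buf + y * stride + tile_x0 - BORDER, saved + y * BORDER, BORDER);
}
```

A caller would run `save_and_fill_left` before filtering a stripe at the left edge of a tile and `restore_left` afterwards, so the neighbouring tile's pixels are unchanged when that tile is processed. The right edge and the below context would be handled the same way, mirrored.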
-
- 06 Nov, 2017 14 commits
-
-
Yushin Cho authored
Remove the option of raw data or delta when coding the segment data, and only use delta coding. Raw data coding of segment data has not been used anywhere, but the "raw or delta coding of seg_data" option has been coded into the bitstream. Change-Id: Iaf8f21692452d0c9a127b958812c6151d3c5db05
-
Yushin Cho authored
Also move its comment on seg_data to the other relevant function. Change-Id: I5d3282040862cd09565b9d4f7baadf0124b64823
-
Luc Trudeau authored
This change does not alter the bitstream. This change simplifies a subsequent commit to remove the custom DC_PRED used by CfL. To use the DC_PRED in AV1, CfL must consider the DC_PRED as a block instead of a single value.

Results on Subset1 (Compared to Previous commit with CfL enabled)

PSNR   | PSNR Cb | PSNR Cr | PSNR HVS | SSIM   | MS SSIM | CIEDE 2000
0.0000 | 0.0000  | 0.0000  | 0.0000   | 0.0000 | 0.0000  | 0.0000

https://arewecompressedyet.com/?job=master%402017-11-03T15%3A57%3A30.643Z&job=cfl-pixel-DC_PRED%402017-11-03T15%3A59%3A03.304Z

Change-Id: I75f981ab93ab1808450f8280bfbabde76ea5b7fe
-
Rupert Swarbrick authored
On subsampled planes, the frame is narrower but the padding by RESTORATION_EXTRA_HORZ on each side is the same width as usual. Change-Id: Id68c0dd674efaa769412825b119ae5ebe56548ad
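As a small worked example of the point above: the plane width shrinks with subsampling, but the horizontal padding does not. The sketch below assumes a padding of 4 pixels per side purely for illustration; `EXTRA_HORZ` stands in for RESTORATION_EXTRA_HORZ, whose actual value is not given here, and `padded_plane_width` is a hypothetical helper.

```c
#include <assert.h>

/* Assumed stand-in for RESTORATION_EXTRA_HORZ; the real value may differ. */
#define EXTRA_HORZ 4

/* Hypothetical helper: width of one plane including the padding on both
 * sides. ss_x is 0 for a full-width plane and 1 for a plane subsampled
 * 2:1 horizontally. Only the plane width is scaled, never the padding. */
static int padded_plane_width(int frame_width, int ss_x) {
  assert(frame_width > 0 && (ss_x == 0 || ss_x == 1));
  int plane_width = (frame_width + ss_x) >> ss_x; /* rounds up when odd */
  return plane_width + 2 * EXTRA_HORZ;
}
```

Under these assumptions a 64-pixel-wide frame gives 72 for the luma plane and 40 for a subsampled chroma plane, rather than the 36 that halving the padded luma width would give.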
-
Yunqing Wang authored
When large_scale_tile=1, do not use temporal MVs. Change-Id: I7107519595b79cbca45dfe72d5ada78cfdc39b00
-
Yunqing Wang authored
Updated the encoder flags for externally setting which reference frames are used and updated, to include the latest changes in AV1.

1. For what reference frames to use, always initialize cpi->ref_frame_flags with AOM_REFFRAME_ALL at the beginning of encoding a frame. The internal ref_frame_flags starts from external flags. Added AOM_EFLAG_NO_REF_LAST2 and AOM_EFLAG_NO_REF_LAST3 for LAST2 and LAST3.

2. For what reference frames to update, added ext_refresh_bwd_ref_frame and ext_refresh_alt2_ref_frame for BWD and ALT2.

Also, removed AOM_EFLAG_FORCE_GF and AOM_EFLAG_FORCE_ARF since these are never actually used. They can be added back if needed later.

Change-Id: I1e4429290f09bfcd1b26f2babc0cf556fc6fbc6c
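Point 1 above can be pictured as "start from all references, then clear whatever the external flags exclude". The sketch below is only an illustration under assumed names and bit values: the `SK_*` constants and `sketch_ref_frame_flags` are hypothetical and are not the values or functions used in libaom.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout for illustration only; NOT the libaom values. */
enum {
  SK_REF_LAST = 1 << 0,
  SK_REF_LAST2 = 1 << 1,
  SK_REF_LAST3 = 1 << 2,
  SK_REF_GOLD = 1 << 3,
  SK_REF_BWD = 1 << 4,
  SK_REF_ALT2 = 1 << 5,
  SK_REF_ALT = 1 << 6,
  SK_REF_ALL = (1 << 7) - 1
};

/* Hypothetical external "do not reference" flags. */
enum {
  SK_NO_REF_LAST = 1 << 0,
  SK_NO_REF_LAST2 = 1 << 1,
  SK_NO_REF_LAST3 = 1 << 2
};

/* Start every frame from "all references allowed", then clear the ones
 * that the caller excluded through the external flags. */
static uint32_t sketch_ref_frame_flags(uint32_t ext_flags) {
  const uint32_t known =
      (uint32_t)(SK_NO_REF_LAST | SK_NO_REF_LAST2 | SK_NO_REF_LAST3);
  assert((ext_flags & ~known) == 0);
  uint32_t flags = SK_REF_ALL;
  if (ext_flags & SK_NO_REF_LAST) flags &= ~(uint32_t)SK_REF_LAST;
  if (ext_flags & SK_NO_REF_LAST2) flags &= ~(uint32_t)SK_REF_LAST2;
  if (ext_flags & SK_NO_REF_LAST3) flags &= ~(uint32_t)SK_REF_LAST3;
  return flags;
}
```

Re-initializing from the "all" mask on every frame means an exclusion requested for one frame does not leak into the next one.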
-
Sebastien Alaiwan authored
When needed, fall back to the regular interp filter at the reconstruction stage. Such bitstreams are valid. However, as we don't expect aomenc to generate them, print a helper warning. Change-Id: I7e818cf607d7d6f71df4ca7878d8976fb88c3282
-
Rupert Swarbrick authored
When upscaling a frame, we extend the frame borders so that the upscale does not convolve boundary lines with uninitialised data off the edges, which was causing encode/decode mismatches. With this patch, we only do the extension when there's going to be an upscale (otherwise there's no need), which should give a small coding gain when not upscaling. More importantly, it forces us to extend in the decode path whether or not we are using loop restoration, which matches what the encoder does and fixes a mismatch. Change-Id: Ie5a0791b0cbedbf254f9080f3cbf668318673f2f
-
Debargha Mukherjee authored
Changes a pruning criterion that seems to give a little better compression efficiency at a little faster speed. Change-Id: I8e3f9aa552b093c4af4ba615bb6ce29587bc8c36
-
Dominic Symes authored
max_tile was provisionally adopted at the working group meeting on 2017-Oct-10. This patch also enables support for 64x64 and 128x128 superblock sizes for max tile (rather than assuming 128). There is also one fix for max_tile in combination with loop restoration, where the width/height was in the wrong units for max-tile specific code. Change-Id: Icb862a2738fea5fc6215819396e1afa4eb86e461
-
Yue Chen authored
Filter coefficients c0, c1, c2 are scaled by 8, and can be represented by 4-bit unsigned integers (c2 is always <=0). Change-Id: I93643bab6734214cef0b0175d6980ebabe9dfe10
-
Cheng Chen authored
Previously the weighted sums in convolve were right shifted without rounding. This patch adds a rounding value before the right shifts. Change-Id: Iea39aca419ac0ca0c32756f345293ce5e28dbd5b
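The difference between the two behaviours can be shown with a minimal sketch. The helper names are hypothetical, and the sketch restricts itself to non-negative sums to keep the arithmetic unambiguous in C; it is not the code touched by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Plain right shift: always rounds a non-negative sum down. */
static int32_t shift_truncate(int32_t sum, int bits) {
  assert(sum >= 0 && bits > 0 && bits < 31);
  return sum >> bits;
}

/* Add half of the divisor first, so the result is rounded to nearest. */
static int32_t shift_round(int32_t sum, int bits) {
  assert(sum >= 0 && bits > 0 && bits < 31);
  return (sum + ((int32_t)1 << (bits - 1))) >> bits;
}
```

For example, with a 7-bit shift a weighted sum of 127 becomes 0 when truncated but 1 when rounded, so truncation introduces a consistent downward bias that rounding avoids.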
-
Cheng Chen authored
Add SIMD implementations of the C functions for low bit-depth, making the encoder 3~4x faster than with the C functions. Change-Id: Icca0b07b25489759be9504aaec09d1239076fc52
-
Cheng Chen authored
The refactoring serves two purposes:

1. Separate the code paths for jnt_comp and the original compound average computation. It provides a function interface for jnt_comp while leaving the original compound average computation unchanged. In the near future, SIMD functions can be added for jnt_comp using the interface.

2. The previous implementation uses a hack on second_pred, which may cause a segmentation fault when the test clip is small, as reported in Issue 944. This refactoring removes the hack and makes it possible to address the seg fault problem in the future.

Change-Id: Idd2cb99f6c77dae03d32ccfa1f9cbed1d7eed067
-
- 05 Nov, 2017 4 commits
-
-
Sebastien Alaiwan authored
This experiment has been abandoned for AV1. Change-Id: I18cf1354df928a0614a1e58b718cd96ee7999925
-
Zoe Liu authored
This patch also updates cm->frame_offset for show_existing_frame at the encoder. Change-Id: I863876675145ba663fc229a854b83b39759309a5
-
Debargha Mukherjee authored
With this patch, and the speed settings turned on for speed 1, the coding efficiency of speed 1 in default configuration should be only a little worse than speed 0, but it should roughly run at double the speed. Specifically, this patch makes various changes to make sure that speed 1 behaves exactly the same as speed 0 except for speed settings turned on or off in speed_features.c. This will change the bitstream generated a little for speeds 1 or higher because of the following reasons:

1. Removes a hacky speed setting correction factor in firstpass.c

2. Fast cdef search is moved from speed 1+ to 2+, and a new speed feature is added to control that.

3. Mesh search settings are pushed down one level so that speeds 0 and 1 use the same settings.

4. A disable_split_mask feature for animated content previously turned on for speeds 1+ is moved down to speeds 2+.

Change-Id: I0ec36556f157bdc42c5daa0cfb9518cf7ff65f6b
-
Debargha Mukherjee authored
Removing the NONE partition check from horz_4 and vert_4 partition search conditions provides another 5-10% speedup at very little loss. Change-Id: Ie5f14191efe6d2b0695b27021de96ad0a1550f26
-
- 04 Nov, 2017 8 commits
-
-
Zoe Liu authored
At the encoder side, for the 7 reference frames, we always set up the priority rank as follows: LAST, ALTREF, LAST2, LAST3, GOLDEN, BWDREF, ALTREF2. That is, if two reference frames point to the same reference frame buffer, the flag for the later frame in the rank will always be turned off. This patch does not change coding performance or coding speed for the default configure setup. It only affects the following setup: one-sided-compound is on && ext-comp-refs is off. As one-sided-compound is enabled by default when ext-comp-refs is enabled, and ext-comp-refs is enabled by default, the above setup should not be considered. Change-Id: I6de18d3be938e1d4a8897e5ba0857b8d21e7f9d0
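The rule above amounts to a first-come-first-kept pass over the references in rank order. The following is a minimal sketch of that idea, assuming each reference is described by the index of the buffer it points to; `prune_duplicate_refs`, the `buf_idx` array and the returned bit mask are hypothetical and do not mirror the encoder's actual data structures.

```c
#include <assert.h>

#define NUM_REFS 7

/* buf_idx[] is given in priority rank order:
 *   LAST, ALTREF, LAST2, LAST3, GOLDEN, BWDREF, ALTREF2
 * Returns a mask in which bit i is set if the reference at rank i stays
 * enabled. A reference is switched off when an earlier-ranked reference
 * already points to the same buffer. */
static unsigned prune_duplicate_refs(const int buf_idx[NUM_REFS]) {
  assert(buf_idx);
  unsigned enabled = 0;
  for (int i = 0; i < NUM_REFS; ++i) {
    int duplicate = 0;
    for (int j = 0; j < i; ++j) {
      if (buf_idx[j] == buf_idx[i]) duplicate = 1;
    }
    if (!duplicate) enabled |= 1u << i;
  }
  return enabled;
}
```

With buffer indices {0, 1, 0, 2, 1, 3, 4}, LAST2 duplicates LAST and GOLDEN duplicates ALTREF, so only those two flags are cleared and the earlier-ranked references survive.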
-
Sebastien Alaiwan authored
Change-Id: I752ad96a8b4349d4a437a97e30edc8e4c22f81b5
-
James Zern authored
Change-Id: Ic7096fe85dc653c9c7d7d1f098df19daff27e1cf
-
Yue Chen authored
Development of this experiment will be deferred to AV2. Change-Id: I3c4615a21b59508500bed8aab0a5c54413b4f284
-
Zoe Liu authored
One-sided compound ref prediction is used only when all reference frames are one-sided. This patch has demonstrated an encoder speedup of ~28%. Using the following configure setups, coding performance dropped on Google test sets (50 frames) in BDRate by ~0.2% for lowres and by ~0.1% for midres (the corresponding performance impact should be smaller on AWCY): --enable-experimental --disable-convolve-round --disable-ext-partition --disable-ext-partition-types --disable-txk-sel --disable-txm Change-Id: I585bbffb2f8d154e8f52a1e79a84eff8bb4a471d
-
Jingning Han authored
The control of using reference frame motion vectors is a separate factor from the existence of previous frame motion vectors. This commit decouples the two, so that the encoder can control the use of reference motion vectors. When they are used, one can further check whether the previous frame exists, and then decide whether to force use_prev_frame_mvs to zero. This solves the issue where the previous frame mvs is set to 0 and accidentally shuts off access to all other existing reference frame mvs in the mfmv system. It brings the coding performance gains back to normal. Change-Id: I2531f73e55582a9bb5b3e0ff47e361a199ec8082
-
Debargha Mukherjee authored
Search the new horz/vert a/b/4 partitions only if the best so far is either oriented along the same direction or split/none, or if the rd costs obtained from the previous partition searches indicate there is potential in searching these partitions. This brings about 25-30% speedup at less than 0.1% drop as seen on lowres 30 frames. Change-Id: I6c6c347e06c34ee0ca17479aeeb4075a66dc7e2c
-
Debargha Mukherjee authored
Adds a new experiment to simplify the tx_mode symbol. The existing frame level tx_mode information is converted to a single bit to select between largest tx_size for a prediction unit or specified at the block level. The less useful modes: ALLOW_8X8, ALLOW_16X16, etc. are removed. Change-Id: Ib9358e17b0158a167eb4edef79f36ff113aa56e1
-
- 03 Nov, 2017 10 commits
-
-
Yunqing Wang authored
Added the ability to disable the probability update when needed. This is needed when encoding with multiple tiles, where enabling/disabling the probability update can be set separately for each individual tile. Change-Id: Ic3c64e6cebac89c483d48b874761bd2e902d81e6
-
Dake He authored
Per the codec WG call today, turn on Plan B for level map by default. Change-Id: Iae885b38917cf79e4f0b290cc2d73ac28321710f
-
Ola Hugosson authored
Change-Id: Idc5ead2db38562924f27796eb78a05b658b5a20e
-
David Barker authored
When fixing one bug in av1_decode_tg_tiles_and_wrapup, I seem to have introduced another bug. This was due to checking the wrong condition on whether to update the frame context at the end of the frame. BUG=aomedia:1001 Change-Id: I929a710e2de31a89cc7899fb1605ca7edf968a87
-
Rupert Swarbrick authored
The "data8_bl" variable is a uint8_t* and will be scaled up later (with REAL_PTR) if it's pointing to highbd data. Don't scale up the x offset. Change-Id: I03e2ce8861e25e3a603e8f0ba2c8af585e08b9c5
-
Yue Chen authored
0.159% gain on lowres 60 frames, compared to 0.236% gain if we don't restrict it in small tx blocks. (--disable-ext-partition --disable-ext-partition-types --disable-convolve-round --disable-ext-comp-refs) Change-Id: I1d1c5474ca27de9dec992ea30a9883afd7a56474
-
Rupert Swarbrick authored
This patch regenerates the orders tables and generates both the normal ones and also those for vertical partitions. I've added a long comment above the definition of orders[] that explains how they work (there's no change, but it took me a while to understand, so it's probably a good thing to document).

I've also slightly changed when we use the orders_vert tables: they are now used for both PARTITION_VERT_A and PARTITION_VERT_B. The patch also removes the #if around the partition argument to has_top_right and adds it to has_bottom_left. (I could have put it inside an #if, but I shouldn't imagine there's any measurable performance cost and the code is cleaner this way).

The tables were regenerated with a Haskell script which I've included at the bottom of the commit message (so the next person doesn't have to write it from scratch yet again). The output looks reasonably clean, but clang-format does change it somewhat so you need to run that afterwards. The tables are also output in a different order, so you'll need to clean that up by hand too.

-- orders.hs: Print tables to stdout by calling printOrders

import Data.Foldable
import Data.List (findIndex)
import Data.Maybe
import System.Environment
import Text.Printf
import Text.Read

data Block = Block { lbw :: Int, lbh :: Int, vert :: Bool }

minLogBlockSize :: Bool -> Int
minLogBlockSize v = if v then 3 else 2

maxLogBlockSize = 7 :: Int

-- This code generates the inverse of what we want: a mapping from visit order
-- to raster order. That is, element i of the list will be the raster index of
-- the block that we visit at step i.
vrSplit :: Block -> Int -> Int -> Int -> [Int]
vrSplit b stride lsz off
  -- PARTITION_NONE
  | lbw b >= lsz && lbh b >= lsz = [off]
  -- Some form of horizontal partition
  | lbw b < lsz && lbh b >= lsz = [off,off + 1..off + 2^(lsz - lbw b) - 1]
  -- Some form of vertical partition
  | lbw b >= lsz && lbh b < lsz =
    [off,off + stride..off + (2^(lsz - lbh b) - 1)*stride]
  -- PARTITION_VERT_*
  | vert b && lbh b + 1 == lsz && lbw b + 1 == lsz =
    [off, off + stride, off + 1, off + stride + 1]
  -- PARTITION_SPLIT
  | otherwise =
    concatMap (vrSplit b stride (lsz - 1))
              [off,
               off + 2^(lsz - lbw b - 1),
               off + 2^(lsz - lbh b - 1) * stride,
               off + 2^(lsz - lbw b - 1) + 2^(lsz - lbh b - 1) * stride]

vrOrders :: Block -> [Int]
vrOrders b = vrSplit b (2 ^ (maxLogBlockSize - lbw b)) maxLogBlockSize 0

-- A simple function to invert the bijection generated by vrOrders (it's very
-- naive, but the list isn't exactly long)
invertList :: [Int] -> [Int]
invertList is = map (\ i -> fromJust $ findIndex ((==) i) is) [0..length is - 1]

genOrders :: Block -> [Int]
genOrders = invertList . vrOrders

-- Code to print everything out in the style used in the AOM codebase
forButLast_ :: Applicative f => [a] -> (a -> f b) -> f ()
forButLast_ [] f = pure ()
forButLast_ (a : as) f = fbl a as f
  where fbl a [] f = pure ()
        fbl a (a' : as) f = f a *> fbl a' as f

numDigits :: Int -> Int
numDigits n = if n == 0 then 1 else ceiling $ logBase 10 $ fromIntegral $ 1 + n

printRow :: Int -> Int -> [Int] -> Bool -> IO ()
printRow indent fw as islast =
  do { if null as
       then return ()
       else do { printf "%*s" indent ""
               ; forButLast_ as (\ a -> printf "%d,%*s" a (postDent a) "")
               ; printf "%d%s" (last as) (if islast then "\n" else ",\n") } }
  where postDent a = 1 + fw - numDigits a

printInts :: Int -> Int -> Int -> [Int] -> IO ()
printInts width indent fw [] = return ()
printInts width indent fw as =
  let (row, rest) = splitAt eltsPerLine as
  in printRow indent fw row (null rest) >> printInts width indent fw rest
  where eltsPerLine = quot (width - indent + 1) (fw + 2)

printBlockOrders :: Block -> IO ()
printBlockOrders b =
  do { printf "static const uint16_t orders_%s%dx%d[%d] = {\n"
         (if vert b then "vert_" else "")
         ((2 :: Int) ^ lbw b) ((2 :: Int) ^ lbh b) numElts
     ; printInts 79 2 intWidth (genOrders b)
     ; printf "};\n" }
  where lsz = maxLogBlockSize
        numElts = (2 :: Int) ^ (lsz - lbw b + lsz - lbh b)
        intWidth = max 1 $ ceiling $ logBase 10 $ fromIntegral (numElts - 1)

blocksForWidth :: Bool -> Int -> [Block]
blocksForWidth v lbw = map (\ lbh -> Block lbw lbh v) [minLbh..maxLbh]
  where maxLogAspectRatio = if v then 0 else 2
        minLbh = max (minLogBlockSize v) (lbw - maxLogAspectRatio)
        maxLbh = min maxLogBlockSize (lbw + maxLogAspectRatio)

blocksForV :: Bool -> [Block]
blocksForV v = concatMap (blocksForWidth v) [minLbw..maxLbw]
  where minLbw = (minLogBlockSize v)
        maxLbw = maxLogBlockSize

blocks :: [Block]
blocks = blocksForV False ++ blocksForV True

printOrders :: IO ()
printOrders = traverse_ printBlockOrders blocks
-- Ends orders.hs

BUG=aomedia:914

Change-Id: I6c53e80caa0d203cdc11f88471b6c117c633baa6
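The script first builds a list mapping visit order to raster index and then inverts it, so that the emitted table can be indexed by raster position. For readers more at home in C than Haskell, the inversion step can be sketched as below. `invert_order` and its parameter names are hypothetical; this is a linear-time equivalent of what `invertList` computes, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* visit_to_raster[i] is the raster index of the block visited at step i.
 * On return, raster_to_visit[r] is the step at which raster index r is
 * visited. The input must be a permutation of 0..n-1. */
static void invert_order(const uint16_t *visit_to_raster,
                         uint16_t *raster_to_visit, int n) {
  assert(n > 0);
  for (int i = 0; i < n; ++i) {
    assert(visit_to_raster[i] < n);
    raster_to_visit[visit_to_raster[i]] = (uint16_t)i;
  }
}
```

For instance, the visit order {2, 0, 3, 1} inverts to {1, 3, 0, 2}: raster index 0 is visited at step 1, raster index 1 at step 3, and so on.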
-
Rupert Swarbrick authored
These play havoc with editors' "jump to start of function" commands. There should be no change to generated code. Change-Id: Ib6961bb952da02081a675d0a4fa01eea2c1ff6d1
-
Sebastien Alaiwan authored
Change-Id: I75890c0f64f93f48299895d1e0bcfbf91846a4ab
-
Debargha Mukherjee authored
The first level is turned on for speed 1. Change-Id: I3dba0f0250b97a25e174cacc2a46ca7f76572c85
-