- 05 Nov, 2017 4 commits
-
Sebastien Alaiwan authored
This experiment has been abandoned for AV1. Change-Id: I18cf1354df928a0614a1e58b718cd96ee7999925
-
Zoe Liu authored
This patch also updates cm->frame_offset for show_existing_frame at the encoder. Change-Id: I863876675145ba663fc229a854b83b39759309a5
-
Debargha Mukherjee authored
With this patch, and the speed settings turned on for speed 1, the coding efficiency of speed 1 in the default configuration should be only a little worse than speed 0, but it should run at roughly double the speed. Specifically, this patch makes various changes to make sure that speed 1 behaves exactly the same as speed 0 except for speed settings turned on or off in speed_features.c. This will change the generated bitstream a little for speeds 1 or higher, for the following reasons:
1. Removes a hacky speed setting correction factor in firstpass.c.
2. Fast cdef search is moved from speed 1+ to 2+, and a new speed feature is added to control that.
3. Mesh search settings are pushed down one level so that speeds 0 and 1 use the same settings.
4. A disable_split_mask feature for animated content, previously turned on for speeds 1+, is moved down to speeds 2+.
Change-Id: I0ec36556f157bdc42c5daa0cfb9518cf7ff65f6b
-
Debargha Mukherjee authored
Removing the NONE partition check from horz_4 and vert_4 partition search conditions provides another 5-10% speedup at very little loss. Change-Id: Ie5f14191efe6d2b0695b27021de96ad0a1550f26
-
- 04 Nov, 2017 8 commits
-
Zoe Liu authored
At the encoder side, for the 7 reference frames, we always set up the priority rank as follows: LAST, ALTREF, LAST2, LAST3, GOLDEN, BWDREF, ALTREF2. That is, if two reference frames point to the same reference frame buffer, the flag for the latter frame in the rank will always be turned off. This patch does not change coding performance or coding speed for the default configuration. It only affects the following setup: one-sided-compound is on && ext-comp-refs is off. As one-sided-compound is enabled by default when ext-comp-refs is enabled, and ext-comp-refs is enabled by default, the above setup does not need to be considered. Change-Id: I6de18d3be938e1d4a8897e5ba0857b8d21e7f9d0
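The priority rule described above can be sketched as follows. This is a minimal illustration, not the encoder's code: the function name and the `ref_to_buffer` mapping (reference name to frame buffer index) are hypothetical; only the rank order comes from the commit message.

```python
# Priority rank quoted in the commit message (earlier entries win).
RANK = ["LAST", "ALTREF", "LAST2", "LAST3", "GOLDEN", "BWDREF", "ALTREF2"]


def enabled_ref_flags(ref_to_buffer):
    """Return {ref_name: bool}.

    A reference is disabled when an earlier reference in RANK already
    points at the same frame buffer. `ref_to_buffer` is a hypothetical
    mapping from reference name to frame buffer index.
    """
    seen_buffers = set()
    flags = {}
    for ref in RANK:
        buf = ref_to_buffer[ref]
        flags[ref] = buf not in seen_buffers
        seen_buffers.add(buf)
    return flags


# LAST2 shares a buffer with LAST, GOLDEN with ALTREF, ALTREF2 with BWDREF.
example = {"LAST": 0, "ALTREF": 3, "LAST2": 0, "LAST3": 1,
           "GOLDEN": 3, "BWDREF": 4, "ALTREF2": 4}
flags = enabled_ref_flags(example)
```

In this example the flags for LAST2, GOLDEN and ALTREF2 come out off, because each of them points to a buffer already claimed by a reference earlier in the rank.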
-
Sebastien Alaiwan authored
Change-Id: I752ad96a8b4349d4a437a97e30edc8e4c22f81b5
-
James Zern authored
Change-Id: Ic7096fe85dc653c9c7d7d1f098df19daff27e1cf
-
Yue Chen authored
Development of this experiment will be deferred to AV2. Change-Id: I3c4615a21b59508500bed8aab0a5c54413b4f284
-
Zoe Liu authored
One-sided compound ref prediction is used only when all reference frames are one-sided. This patch has demonstrated an encoder speedup of ~28%. With the configure setup below, coding performance on the Google test sets (50 frames) dropped in BDRate by ~0.2% for lowres and by ~0.1% for midres (the corresponding performance impact should be smaller on AWCY): --enable-experimental --disable-convolve-round --disable-ext-partition --disable-ext-partition-types --disable-txk-sel --disable-txm Change-Id: I585bbffb2f8d154e8f52a1e79a84eff8bb4a471d
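A minimal sketch of the gating condition, under a stated assumption: "one-sided" is taken here to mean that no reference frame lies after the current frame in display order. The function and parameter names are hypothetical and are not taken from the patch.

```python
def refs_are_one_sided(cur_frame_offset, ref_frame_offsets):
    """True when no reference is later than the current frame.

    Assumption for illustration: frame offsets are display-order
    positions, and a reference at the same offset counts as one-sided.
    """
    return all(off <= cur_frame_offset for off in ref_frame_offsets)


def allow_one_sided_compound(cur_frame_offset, ref_frame_offsets):
    # One-sided compound prediction is considered only when every
    # reference frame is on the same (past) side of the current frame.
    return refs_are_one_sided(cur_frame_offset, ref_frame_offsets)
```

For example, a frame at offset 8 with references at offsets 4, 6 and 7 would qualify, while the same frame with a reference at offset 12 would not.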
-
Jingning Han authored
The control of using reference frame motion vectors is a separate factor from the existence of previous frame motion vectors. This commit decouples the two, so that the encoder can control the use of reference motion vectors. When they are used, one can further check whether the previous frame exists, and then decide whether use_prev_frame_mvs needs to be forced to zero. This solves the issue where the previous frame mvs is set to 0 and it accidentally shuts off access to all other existing reference frame mvs in the mfmv system. It brings the coding performance gains back to normal. Change-Id: I2531f73e55582a9bb5b3e0ff47e361a199ec8082
-
Debargha Mukherjee authored
Search the new horz/vert a/b/4 partitions only if the best so far is either oriented along the same direction or split/none, or if the rd costs obtained from the previous partition searches indicate there is potential in searching these partitions. This brings about 25-30% speedup at less than 0.1% drop as seen on lowres 30 frames. Change-Id: I6c6c347e06c34ee0ca17479aeeb4075a66dc7e2c
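The pruning rule can be sketched as a single predicate. This is an illustration only: the orientation labels and the `rd_suggests_potential` flag are hypothetical stand-ins, since the commit message does not spell out how the rd-cost test is computed.

```python
def should_search_ext_partition(candidate_dir, best_so_far, rd_suggests_potential):
    """Decide whether to search a horz/vert a/b/4 partition.

    candidate_dir: "horz" or "vert", the orientation of the candidate.
    best_so_far: "horz", "vert", "split" or "none", the orientation of
        the best partition found by the earlier searches.
    rd_suggests_potential: placeholder for the rd-cost based test
        (its exact form is not given in the commit message).
    """
    if best_so_far in (candidate_dir, "split", "none"):
        return True
    return bool(rd_suggests_potential)
```

A vertical candidate is skipped when the best partition so far is horizontal, unless the rd costs from the earlier searches still point to a possible gain.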
-
Debargha Mukherjee authored
Adds a new experiment to simplify the tx_mode symbol. The existing frame level tx_mode information is converted to a single bit that selects between using the largest tx_size for a prediction unit and a tx_size specified at the block level. The less useful modes (ALLOW_8X8, ALLOW_16X16, etc.) are removed. Change-Id: Ib9358e17b0158a167eb4edef79f36ff113aa56e1
-
- 03 Nov, 2017 20 commits
-
Yunqing Wang authored
Added the ability to disable the probability update when needed. This is needed when encoding with multiple tiles; enabling or disabling the probability update can be set separately for each individual tile. Change-Id: Ic3c64e6cebac89c483d48b874761bd2e902d81e6
-
Dake He authored
Per the codec WG call today, turn on Plan B for level map by default. Change-Id: Iae885b38917cf79e4f0b290cc2d73ac28321710f
-
Ola Hugosson authored
Change-Id: Idc5ead2db38562924f27796eb78a05b658b5a20e
-
David Barker authored
When fixing one bug in av1_decode_tg_tiles_and_wrapup, I seem to have introduced another bug. This was due to checking the wrong condition on whether to update the frame context at the end of the frame. BUG=aomedia:1001 Change-Id: I929a710e2de31a89cc7899fb1605ca7edf968a87
-
Rupert Swarbrick authored
The "data8_bl" variable is a uint8_t* and will be scaled up later (with REAL_PTR) if it's pointing to highbd data. Don't scale up the x offset. Change-Id: I03e2ce8861e25e3a603e8f0ba2c8af585e08b9c5
-
Yue Chen authored
0.159% gain on lowres 60 frames, compared to 0.236% gain if we don't restrict it in small tx blocks. (--disable-ext-partition --disable-ext-partition-types --disable-convolve-round --disable-ext-comp-refs) Change-Id: I1d1c5474ca27de9dec992ea30a9883afd7a56474
-
Rupert Swarbrick authored
This patch regenerates the orders tables and generates both the normal ones and also those for vertical partitions. I've added a long comment above the definition of orders[] that explains how they work (there's no change, but it took me a while to understand, so it's probably a good thing to document).

I've also slightly changed when we use the orders_vert tables: they are now used for both PARTITION_VERT_A and PARTITION_VERT_B. The patch also removes the #if around the partition argument to has_top_right and adds it to has_bottom_left. (I could have put it inside an #if, but I shouldn't imagine there's any measurable performance cost and the code is cleaner this way).

The tables were regenerated with a Haskell script which I've included at the bottom of the commit message (so the next person doesn't have to write it from scratch yet again). The output looks reasonably clean, but clang-format does change it somewhat so you need to run that afterwards. The tables are also output in a different order, so you'll need to clean that up by hand too.

-- orders.hs: Print tables to stdout by calling printOrders

import Data.Foldable
import Data.List (findIndex)
import Data.Maybe
import System.Environment
import Text.Printf
import Text.Read

data Block = Block { lbw :: Int, lbh :: Int, vert :: Bool }

minLogBlockSize :: Bool -> Int
minLogBlockSize v = if v then 3 else 2

maxLogBlockSize = 7 :: Int

-- This code generates the inverse of what we want: a mapping from visit order
-- to raster order. That is, element i of the list will be the raster index of
-- the block that we visit at step i.
vrSplit :: Block -> Int -> Int -> Int -> [Int]
vrSplit b stride lsz off
  -- PARTITION_NONE
  | lbw b >= lsz && lbh b >= lsz = [off]
  -- Some form of horizontal partition
  | lbw b < lsz && lbh b >= lsz = [off,off + 1..off + 2^(lsz - lbw b) - 1]
  -- Some form of vertical partition
  | lbw b >= lsz && lbh b < lsz =
    [off,off + stride..off + (2^(lsz - lbh b) - 1)*stride]
  -- PARTITION_VERT_*
  | vert b && lbh b + 1 == lsz && lbw b + 1 == lsz =
    [off, off + stride, off + 1, off + stride + 1]
  -- PARTITION_SPLIT
  | otherwise =
    concatMap (vrSplit b stride (lsz - 1))
    [off,
     off + 2^(lsz - lbw b - 1),
     off + 2^(lsz - lbh b - 1) * stride,
     off + 2^(lsz - lbw b - 1) + 2^(lsz - lbh b - 1) * stride]

vrOrders :: Block -> [Int]
vrOrders b = vrSplit b (2 ^ (maxLogBlockSize - lbw b)) maxLogBlockSize 0

-- A simple function to invert the bijection generated by vrOrders (it's very
-- naive, but the list isn't exactly long)
invertList :: [Int] -> [Int]
invertList is = map (\ i -> fromJust $ findIndex ((==) i) is) [0..length is - 1]

genOrders :: Block -> [Int]
genOrders = invertList . vrOrders

-- Code to print everything out in the style used in the AOM codebase

forButLast_ :: Applicative f => [a] -> (a -> f b) -> f ()
forButLast_ [] f = pure ()
forButLast_ (a : as) f = fbl a as f
  where fbl a [] f = pure ()
        fbl a (a' : as) f = f a *> fbl a' as f

numDigits :: Int -> Int
numDigits n = if n == 0 then 1 else ceiling $ logBase 10 $ fromIntegral $ 1 + n

printRow :: Int -> Int -> [Int] -> Bool -> IO ()
printRow indent fw as islast =
  do { if null as
       then return ()
       else do { printf "%*s" indent ""
               ; forButLast_ as (\ a -> printf "%d,%*s" a (postDent a) "")
               ; printf "%d%s" (last as) (if islast then "\n" else ",\n") } }
  where postDent a = 1 + fw - numDigits a

printInts :: Int -> Int -> Int -> [Int] -> IO ()
printInts width indent fw [] = return ()
printInts width indent fw as =
  let (row, rest) = splitAt eltsPerLine as in
    printRow indent fw row (null rest) >>
    printInts width indent fw rest
  where eltsPerLine = quot (width - indent + 1) (fw + 2)

printBlockOrders :: Block -> IO ()
printBlockOrders b =
  do { printf "static const uint16_t orders_%s%dx%d[%d] = {\n"
         (if vert b then "vert_" else "")
         ((2 :: Int) ^ lbw b)
         ((2 :: Int) ^ lbh b)
         numElts
     ; printInts 79 2 intWidth (genOrders b)
     ; printf "};\n" }
  where lsz = maxLogBlockSize
        numElts = (2 :: Int) ^ (lsz - lbw b + lsz - lbh b)
        intWidth = max 1 $ ceiling $ logBase 10 $ fromIntegral (numElts - 1)

blocksForWidth :: Bool -> Int -> [Block]
blocksForWidth v lbw = map (\ lbh -> Block lbw lbh v) [minLbh..maxLbh]
  where maxLogAspectRatio = if v then 0 else 2
        minLbh = max (minLogBlockSize v) (lbw - maxLogAspectRatio)
        maxLbh = min maxLogBlockSize (lbw + maxLogAspectRatio)

blocksForV :: Bool -> [Block]
blocksForV v = concatMap (blocksForWidth v) [minLbw..maxLbw]
  where minLbw = (minLogBlockSize v)
        maxLbw = maxLogBlockSize

blocks :: [Block]
blocks = blocksForV False ++ blocksForV True

printOrders :: IO ()
printOrders = traverse_ printBlockOrders blocks

-- Ends orders.hs

BUG=aomedia:914
Change-Id: I6c53e80caa0d203cdc11f88471b6c117c633baa6
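For readers who want to check a table entry without a Haskell toolchain, the core of the script in this commit message (vrSplit, vrOrders and invertList) can be mirrored in a few lines of Python. This is an unofficial port written for illustration; the function names are hypothetical and the printing code is omitted.

```python
MAX_LOG_BLOCK_SIZE = 7  # same value as maxLogBlockSize in the script


def vr_split(lbw, lbh, vert, stride, lsz, off):
    """Visit order -> raster index, mirroring vrSplit in the script."""
    if lbw >= lsz and lbh >= lsz:      # PARTITION_NONE
        return [off]
    if lbw < lsz and lbh >= lsz:       # some form of horizontal partition
        return [off + i for i in range(2 ** (lsz - lbw))]
    if lbw >= lsz and lbh < lsz:       # some form of vertical partition
        return [off + i * stride for i in range(2 ** (lsz - lbh))]
    if vert and lbh + 1 == lsz and lbw + 1 == lsz:  # PARTITION_VERT_*
        return [off, off + stride, off + 1, off + stride + 1]
    # PARTITION_SPLIT: recurse into the four quadrants.
    dx = 2 ** (lsz - lbw - 1)
    dy = 2 ** (lsz - lbh - 1) * stride
    out = []
    for quadrant in (off, off + dx, off + dy, off + dx + dy):
        out.extend(vr_split(lbw, lbh, vert, stride, lsz - 1, quadrant))
    return out


def gen_orders(lbw, lbh, vert=False):
    """Raster index -> visit order (the inverse of vr_split's output)."""
    stride = 2 ** (MAX_LOG_BLOCK_SIZE - lbw)
    visit_to_raster = vr_split(lbw, lbh, vert, stride, MAX_LOG_BLOCK_SIZE, 0)
    orders = [0] * len(visit_to_raster)
    for visit_index, raster_index in enumerate(visit_to_raster):
        orders[raster_index] = visit_index
    return orders


orders_32x32 = gen_orders(5, 5)
orders_vert_64x64 = gen_orders(6, 6, vert=True)
```

With these definitions, `gen_orders(5, 5)` gives `[0, 1, 4, 5, 2, 3, 6, 7, 8, 9, 12, 13, 10, 11, 14, 15]` and `gen_orders(6, 6, vert=True)` gives `[0, 2, 1, 3]`. The inversion uses direct indexing instead of the script's repeated `findIndex`, which yields the same result for a bijection.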
-
Rupert Swarbrick authored
These play havoc with editors' "jump to start of function" commands. There should be no change to generated code. Change-Id: Ib6961bb952da02081a675d0a4fa01eea2c1ff6d1
-
Sebastien Alaiwan authored
Change-Id: I75890c0f64f93f48299895d1e0bcfbf91846a4ab
-
Debargha Mukherjee authored
The first level is turned on for speed 1. Change-Id: I3dba0f0250b97a25e174cacc2a46ca7f76572c85
-
Debargha Mukherjee authored
Removes features for now so that we only add features with very small loss. Change-Id: Ie50f6af2a6cc19dde5f682754a1f0adf4ec957a8
-
Alexander Bokov authored
Change-Id: I4270d1260854ac27b68c5694ca8102b92bee6faa
-
Alexander Bokov authored
Use a neural-network-based binary classifier to predict the first split decision on the highest level of the TX size RD search tree. Depending on how confident we are in the prediction, we either keep the full unmodified TX size search or use the largest possible TX size and stop any further search.
Average speed-up: 3-4%
Quality loss (lowres): 0.062%
Quality loss (midres): 0.018%
Change-Id: I64c0317db74cbeddfbdf772147c43e99e275891f
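The confidence gating described in this commit can be sketched as below. It only shows the decision step, not the classifier: `no_split_score` and `confidence_threshold` are hypothetical names, and the actual features, network and thresholds are not given in the commit message.

```python
def tx_search_mode(no_split_score, confidence_threshold):
    """Pick the TX size search strategy from a classifier score.

    no_split_score: hypothetical classifier output; larger values mean
        the top-level TX block is more likely to stay unsplit.
    confidence_threshold: hypothetical cut-off above which the
        prediction is trusted.
    """
    if no_split_score >= confidence_threshold:
        # Confident: use the largest possible TX size, stop searching.
        return "LARGEST_TX_ONLY"
    # Not confident: keep the full, unmodified TX size search.
    return "FULL_SEARCH"
```

Raising the threshold would trade speed for quality in this sketch, since fewer blocks would skip the full search.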
-
Hui Su authored
The "use_intrabc" flag is signaled with CDF. No need to keep a counter for it. Change-Id: Ia62ef8f264aa5ce2f6fceddc0b2a7d2032c73044
-
Hui Su authored
Disable it for sub8x8 4:1 partitions (4x16 and 16x4), because of a conflict with cfl. Enable it for all the other sub8x8 partition sizes. BUG=aomedia:998 Change-Id: Ifdd907f0ac1f987981e81c166eb71978e6ea27c3
-
Sebastien Alaiwan authored
This experiment has been abandoned. Change-Id: Ieabc6f365651e2d116a4505a3cc202add94d1386
-
Soo-Chul Han authored
Change-Id: Ifcc92df2e8c69752c1dbff2b447eb22035814389
-
Rupert Swarbrick authored
We do this by upscaling the deblocked output as we save it into the RestorationStripeBoundaries line buffers. (See save_boundary_lines in restoration.c for the details.) The upscaling is done by calling av1_convolve_horiz_rs, which reads off the edge of the frame and, of course, across tile boundaries. This means we need to extend the frame borders before saving boundary lines (hence the changes to decodeframe.c and encoder.c). Change-Id: Ia096846898b20afe4737433d772f7277d4f71724
-
Yue Chen authored
We reverted to using 3-tap filters, so the code related to 4-tap filters will not be used any more. Change-Id: I7f65cf227d2eb3e9785474e3b33d0bdbf489b1f1
-
Luc Trudeau authored
This change does not alter the encoder/decoder behavior. Extra precautions are taken to stop an attacker from exploiting these function pointers. Change-Id: I4e0704f016774f2d8fbaeb2a4caec12fc6e67ec1
-
- 02 Nov, 2017 8 commits
-
Dake He authored
The removed assertions were not properly set up and could cause decoding failures when USE_CASUAL_CTX is enabled. This CL does not change the bitstream. Change-Id: Ib9193cbda32f342335a79aca39e9cc49204a0ec9
-
Rupert Swarbrick authored
Before this patch, striped loop restoration didn't restart correctly on each tile row. Now, the loop restoration stripes start at the top of a tile row in the same way as if it were the top of the entire frame. Change-Id: I0a88a28d7804b2f09d792ecbbf4f22f666f67012
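A small sketch of what "restarting at each tile row" means for the stripe layout. The stripe height of 64 rows and the 8-row shortening of the first stripe are assumptions chosen for illustration, as are the function and parameter names; none of them are stated in the commit message.

```python
def stripe_starts(tile_row_top, tile_row_bottom,
                  stripe_height=64, first_stripe_offset=8):
    """Return the starting row of each stripe inside one tile row.

    The layout is computed relative to the top of the tile row, exactly
    as it would be for a frame whose first row is tile_row_top.
    stripe_height and first_stripe_offset are illustrative assumptions.
    """
    assert 0 <= first_stripe_offset < stripe_height
    starts = [tile_row_top]
    row = tile_row_top + stripe_height - first_stripe_offset
    while row < tile_row_bottom:
        starts.append(row)
        row += stripe_height
    return starts


first_tile_row = stripe_starts(0, 128)
second_tile_row = stripe_starts(128, 256)
```

Under these assumptions both tile rows get the same relative layout: `[0, 56, 120]` for the first and `[128, 184, 248]` for the second, so the second tile row does not continue the stripe grid of the first.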
-
David Barker authored
Because we have an (effective) 3-pixel border around each processing unit, and the local sums in the self-guided filter are only taken over at most 5x5 regions, we have 1 pixel's worth of spare border. We can use this border to greatly simplify the filter: instead of calculating a 64x64 region of the A[] and B[] arrays, we can calculate a 66x66 region. Then we don't have to deal with complicated boundary conditions when generating the final 64x64 output block.
This also makes a few other related changes:
* The 'boxnum' function has been effectively redundant for a while - due to the way we do the 5x5 (or 3x3) windowing, the values we actually use are always (2r+1)^2. So we can skip calling this function if MAX_RADIUS <= 2.
* We can remove the annoying special case for tiny processing units in the self-guided filter, as we no longer have to worry about border behaviour.
* We change the SSE4.1 code to match the new C code, removing a ton of complexity. Further refactoring/speedups are probably now possible, but this includes the minimal changes to pass all the tests.
Change-Id: I99beee164a31349a5228a9bef048e5f35c9639f2
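The border arithmetic can be illustrated with a plain box-sum routine. This is a sketch of the idea only, not the filter itself: the function name and the list-of-lists layout are hypothetical, and a tiny 4x4 unit stands in for the 64x64 one.

```python
def box_sums(padded, height, width, r):
    """Sum every (2r+1)x(2r+1) window over a (height+2) x (width+2) region.

    `padded` holds the height x width unit plus a border of r + 1 pixels
    on every side (3 pixels for r = 2), so each window lies fully inside
    the data and always covers exactly (2r+1)^2 pixels.
    """
    border = r + 1
    out = []
    for i in range(-1, height + 1):        # one extra row above and below
        row = []
        for j in range(-1, width + 1):     # one extra column left and right
            total = 0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    total += padded[border + i + di][border + j + dj]
            row.append(total)
        out.append(row)
    return out


# A 4x4 unit with a 3-pixel border on each side is a 10x10 array.
unit, radius = 4, 2
padded = [[1] * (unit + 2 * (radius + 1)) for _ in range(unit + 2 * (radius + 1))]
sums = box_sums(padded, unit, unit, radius)
```

With all-ones input every entry of the 6x6 result equals 25, that is (2r+1)^2 for r = 2, which is why no per-pixel window count is needed once the region is computed with the spare border. The same reasoning takes a 64x64 unit to a 66x66 region.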
-
Yaowu Xu authored
Change-Id: I299e2f2a1967f867a5452e0c449abe5243ac5d13
-
Luc Trudeau authored
BUG=aomedia:994 Change-Id: I0fa9d48487256655798dbdd64acad523e84557c6
-
Nathan E. Egge authored
Change-Id: Ia5df312c759faa38fe336ab32a7d4908760ecf08
-
Yaowu Xu authored
BUG=aomedia:995 Change-Id: Ic320e5de9b0c7d320b0f6dddce93f1445be61234
-
Rupert Swarbrick authored
These are just RESTORATION_PROC_UNIT_SIZE shifted right by the vertical or horizontal subsampling for this plane and it's easier not to have to pass them around. Change-Id: I86441d6cd86bb146f3e5dcdf2c89e34dd9fed0e1
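The derivation mentioned here is a one-line shift per dimension. In this sketch the value 64 for RESTORATION_PROC_UNIT_SIZE is an assumption (chosen to agree with the 64x64 blocks mentioned elsewhere in this log), and the helper name is hypothetical.

```python
RESTORATION_PROC_UNIT_SIZE = 64  # assumed value, for illustration only


def proc_unit_dims(ss_x, ss_y):
    """Return (width, height) of a processing unit for one plane.

    ss_x and ss_y are the plane's horizontal and vertical subsampling
    shifts (0 for luma; 1 and 1 for 4:2:0 chroma).
    """
    return (RESTORATION_PROC_UNIT_SIZE >> ss_x,
            RESTORATION_PROC_UNIT_SIZE >> ss_y)
```

Because the dimensions follow directly from the plane's subsampling, they can be recomputed wherever they are needed instead of being passed through function arguments.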
-