1. 03 Nov, 2017 18 commits
    • Enable striped_loop_restoration · 201a2b49
      Ola Hugosson authored
      Change-Id: Idc5ead2db38562924f27796eb78a05b658b5a20e
    • Fix bug introduced by commit 01563088 · b44deca9
      David Barker authored
      When fixing one bug in av1_decode_tg_tiles_and_wrapup, I seem to have
      introduced another bug. This was due to checking the wrong condition
      on whether to update the frame context at the end of the frame.
      Change-Id: I929a710e2de31a89cc7899fb1605ca7edf968a87
    • Fix highbd striped loop restoration bug · ff090b86
      Rupert Swarbrick authored
      The "data8_bl" variable is a uint8_t* and will be scaled up
      later (with REAL_PTR) if it's pointing to highbd data. Don't scale up
      the x offset.
      Change-Id: I03e2ce8861e25e3a603e8f0ba2c8af585e08b9c5
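The pointer convention behind this fix can be sketched as follows. This is a minimal illustration, not the libaom code: `byte_handle`, `to_byte_handle`, `to_real_ptr`, `pixel_at` and `pixel_at_prescaled` are hypothetical names, and the sketch assumes that "scaling up" means doubling a halved address, on a platform where pointers round-trip through `uintptr_t`. Because the scaling is applied once when the pointer is converted, an x offset added beforehand must be counted in pixels, not bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a "disguised" high-bitdepth pointer: the address
 * of a uint16_t buffer, halved so that one unit equals one 16-bit pixel. */
typedef uintptr_t byte_handle;

/* Halve the address of a 16-bit buffer (which must be 2-byte aligned). */
static byte_handle to_byte_handle(const uint16_t *p) {
  assert(((uintptr_t)p & 1u) == 0);
  return (uintptr_t)p >> 1;
}

/* Scale the handle back up to a real address (the role the commit message
 * attributes to REAL_PTR for highbd data). */
static const uint16_t *to_real_ptr(byte_handle h) {
  return (const uint16_t *)(h << 1);
}

/* Correct: add the x offset in pixels; scaling happens once, on conversion. */
static uint16_t pixel_at(byte_handle base, int x) {
  return *to_real_ptr(base + (uintptr_t)x);
}

/* Buggy variant: pre-scaling x means the offset is scaled twice, so the
 * read lands on pixel 2*x instead of pixel x. */
static uint16_t pixel_at_prescaled(byte_handle base, int x) {
  return *to_real_ptr(base + 2 * (uintptr_t)x);
}
```

With a row of eight pixels, `pixel_at(h, 3)` reads the fourth pixel, while `pixel_at_prescaled(h, 3)` reads the seventh.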
    • Disable filter_intra mode in <8x8 tx blocks · 95e13e23
      Yue Chen authored
      0.159% gain on lowres 60 frames, compared to 0.236% gain if we don't
      restrict it in small tx blocks.
      (--disable-ext-partition --disable-ext-partition-types
       --disable-convolve-round --disable-ext-comp-refs)
      Change-Id: I1d1c5474ca27de9dec992ea30a9883afd7a56474
    • Correct has_bottom_left calculation for mixed vertical partitions · 8315daf7
      Rupert Swarbrick authored
      This patch regenerates the orders tables and generates both the normal
      ones and also those for vertical partitions. I've added a long comment
      above the definition of orders[] that explains how they work (there's
      no change, but it took me a while to understand, so it's probably a
      good thing to document). I've also slightly changed when we use the
      orders_vert tables: they are now used for both PARTITION_VERT_A and
      The patch also removes the #if around the partition argument to
      has_top_right and adds it to has_bottom_left. (I could have put it
      inside an #if, but I shouldn't imagine there's any measurable
      performance cost and the code is cleaner this way).
      The tables were regenerated with a Haskell script which I've included
      at the bottom of the commit message (so the next person doesn't have
      to write it from scratch yet again). The output looks reasonably
      clean, but clang-format does change it somewhat so you need to run
      that afterwards. The tables are also output in a different order, so
      you'll need to clean that up by hand too.
      -- orders.hs: Print tables to stdout by calling printOrders
      import Data.Foldable
      import Data.List (findIndex)
      import Data.Maybe
      import System.Environment
      import Text.Printf
      import Text.Read
      data Block = Block { lbw :: Int, lbh :: Int, vert :: Bool }
      minLogBlockSize :: Bool -> Int
      minLogBlockSize v = if v then 3 else 2
      maxLogBlockSize = 7 :: Int
      -- This code generates the inverse of what we want: a mapping from visit order
      -- to raster order. That is, element i of the list will be the raster index of
      -- the block that we visit at step i.
      vrSplit :: Block -> Int -> Int -> Int -> [Int]
      vrSplit b stride lsz off
        | lbw b >= lsz && lbh b >= lsz = [off]
        -- Some form of horizontal partition
        | lbw b < lsz && lbh b >= lsz =
            [off,off + 1..off + 2^(lsz - lbw b) - 1]
        -- Some form of vertical partition
        | lbw b >= lsz && lbh b < lsz =
            [off,off + stride..off + (2^(lsz - lbh b) - 1)*stride]
        -- PARTITION_VERT_*
        | vert b && lbh b + 1 == lsz && lbw b + 1 == lsz =
            [off, off + stride, off + 1, off + stride + 1]
        | otherwise =
          concatMap (vrSplit b stride (lsz - 1))
          [off, off + 2^(lsz - lbw b - 1), off + 2^(lsz - lbh b - 1) * stride,
           off + 2^(lsz - lbw b - 1) + 2^(lsz - lbh b - 1) * stride]
      vrOrders :: Block -> [Int]
      vrOrders b = vrSplit b (2 ^ (maxLogBlockSize - lbw b)) maxLogBlockSize 0
      -- A simple function to invert the bijection generated by vrOrders (it's very
      -- naive, but the list isn't exactly long)
      invertList :: [Int] -> [Int]
      invertList is = map (\ i -> fromJust $ findIndex ((==) i) is) [0..length is - 1]
      genOrders :: Block -> [Int]
      genOrders = invertList . vrOrders
      -- Code to print everything out in the style used in the AOM codebase
      forButLast_ :: Applicative f => [a] -> (a -> f b) -> f ()
      forButLast_ [] f = pure ()
      forButLast_ (a : as) f = fbl a as f
        where fbl a [] f = pure ()
              fbl a (a' : as) f = f a *> fbl a' as f
      numDigits :: Int -> Int
      numDigits n =
        if n == 0 then 1
        else ceiling $ logBase 10 $ fromIntegral $ 1 + n
      printRow :: Int -> Int -> [Int] -> Bool -> IO ()
      printRow indent fw as islast = do
        { if null as then return ()
          else do
            { printf "%*s" indent ""
            ; forButLast_ as (\ a -> printf "%d,%*s" a (postDent a) "")
            ; printf "%d%s" (last as) (if islast then "\n" else ",\n") } }
        where postDent a = 1 + fw - numDigits a
      printInts :: Int -> Int -> Int -> [Int] -> IO ()
      printInts width indent fw [] = return ()
      printInts width indent fw as =
        let (row, rest) = splitAt eltsPerLine as in
          printRow indent fw row (null rest) >> printInts width indent fw rest
        where eltsPerLine = quot (width - indent + 1) (fw + 2)
      printBlockOrders :: Block -> IO ()
      printBlockOrders b = do
        { printf "static const uint16_t orders_%s%dx%d[%d] = {\n"
          (if vert b then "vert_" else "")
          ((2 :: Int) ^ lbw b) ((2 :: Int) ^ lbh b) numElts
        ; printInts 79 2 intWidth (genOrders b)
        ; printf "};\n" }
        where lsz = maxLogBlockSize
              numElts = (2 :: Int) ^ (lsz - lbw b + lsz - lbh b)
              intWidth = max 1 $ ceiling $ logBase 10 $ fromIntegral (numElts - 1)
      blocksForWidth :: Bool -> Int -> [Block]
      blocksForWidth v lbw = map (\ lbh -> Block lbw lbh v) [minLbh..maxLbh]
        where maxLogAspectRatio = if v then 0 else 2
              minLbh = max (minLogBlockSize v) (lbw - maxLogAspectRatio)
              maxLbh = min maxLogBlockSize (lbw + maxLogAspectRatio)
      blocksForV :: Bool -> [Block]
      blocksForV v = concatMap (blocksForWidth v) [minLbw..maxLbw]
        where minLbw = (minLogBlockSize v)
              maxLbw = maxLogBlockSize
      blocks :: [Block]
      blocks = blocksForV False ++ blocksForV True
      printOrders :: IO ()
      printOrders = traverse_ printBlockOrders blocks
      -- Ends orders.hs
      Change-Id: I6c53e80caa0d203cdc11f88471b6c117c633baa6
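For readers more at home in C than Haskell, the inversion step performed by invertList can be sketched as below. The function name `invert_order` and its error convention are assumptions made for illustration; the script itself uses a quadratic `findIndex` search, whereas this sketch fills the inverse table in a single pass.

```c
#include <assert.h>
#include <stdint.h>

/* visit_to_raster[i] is the raster index of the block visited at step i.
 * Fills raster_to_visit[r] with the step at which raster block r is visited.
 * Returns 0 on success, -1 if the input is not a permutation of 0..n-1. */
static int invert_order(const uint16_t *visit_to_raster,
                        uint16_t *raster_to_visit, int n) {
  const uint16_t unset = UINT16_MAX; /* sentinel: slot not yet written */
  assert(n >= 0 && n < UINT16_MAX);
  for (int r = 0; r < n; ++r) raster_to_visit[r] = unset;
  for (int i = 0; i < n; ++i) {
    const uint16_t r = visit_to_raster[i];
    /* Reject out-of-range entries and raster indices seen twice. */
    if (r >= n || raster_to_visit[r] != unset) return -1;
    raster_to_visit[r] = (uint16_t)i;
  }
  return 0;
}
```

For example, the visit order {2, 0, 3, 1} inverts to {1, 3, 0, 2}: raster block 0 is visited at step 1, block 1 at step 3, and so on.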
    • Avoid some mismatched braces in bitstream.c · cf772765
      Rupert Swarbrick authored
      These play havoc with editors' "jump to start of function"
      commands. There should be no change to generated code.
      Change-Id: Ib6961bb952da02081a675d0a4fa01eea2c1ff6d1
    • Simplify FRAME_ID constants · d418f68e
      Sebastien Alaiwan authored
      Change-Id: I75890c0f64f93f48299895d1e0bcfbf91846a4ab
    • Add two levels for selective ref frame sp. feature · 06b40cc3
      Debargha Mukherjee authored
      The first level is turned on for speed 1.
      Change-Id: I3dba0f0250b97a25e174cacc2a46ca7f76572c85
    • Cleanup of speed 1 · 203016e8
      Debargha Mukherjee authored
      Removes features for now so that we only add features with very
      small loss.
      Change-Id: Ie50f6af2a6cc19dde5f682754a1f0adf4ec957a8
    • Add highbd support in predict_skip_flag · 80eedf2e
      Alexander Bokov authored
      Change-Id: I4270d1260854ac27b68c5694ca8102b92bee6faa
    • Introducing a model for pruning the TX size search · 79a37242
      Alexander Bokov authored
      Use a neural-network-based binary classifier to predict the first split
      decision on the highest level of the TX size RD search tree. Depending
      on how confident we are in the prediction we either keep full unmodified
      TX size search or use the largest possible TX size and stop any further
      Average speed-up: 3-4%
      Quality loss (lowres): 0.062%
      Quality loss (midres): 0.018%
      Change-Id: I64c0317db74cbeddfbdf772147c43e99e275891f
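The confidence-gated decision described in this commit might look roughly like the sketch below. Everything here is hypothetical: the commit message does not give the classifier's output format or thresholds, so `split_score` (positive meaning "split likely") and `confidence_threshold` are illustrative assumptions, as are the type and function names.

```c
#include <assert.h>

/* The two outcomes named in the commit message. */
typedef enum { TX_SEARCH_FULL, TX_SEARCH_LARGEST_ONLY } TxSearchMode;

/* split_score: assumed classifier output; strongly negative values mean the
 * model is confident that the top-level block should NOT be split.
 * confidence_threshold: assumed non-negative margin required before the
 * search is pruned. */
static TxSearchMode choose_tx_search(float split_score,
                                     float confidence_threshold) {
  assert(confidence_threshold >= 0.0f);
  /* Confident "no split": use the largest TX size and skip the rest. */
  if (split_score < -confidence_threshold) return TX_SEARCH_LARGEST_ONLY;
  /* Otherwise fall back to the full, unmodified search. */
  return TX_SEARCH_FULL;
}
```

Raising the threshold in such a scheme prunes less often, trading speed-up for lower quality loss.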
    • intrabc: remove unused counters · 7ac01f8f
      Hui Su authored
      The "use_intrabc" flag is signaled with CDF. No need to keep a counter
      for it.
      Change-Id: Ia62ef8f264aa5ce2f6fceddc0b2a7d2032c73044
    • intrabc: disable it for sub8x8 4:1 partitions · e8204e33
      Hui Su authored
      Disable it for sub8x8 4:1 partitions (4x16 and 16x4), because of
      conflict with cfl.
      Enable it for all the other sub8x8 partitions sizes.
      Change-Id: Ifdd907f0ac1f987981e81c166eb71978e6ea27c3
    • Remove VAR_REFS experiment · 4be6cb32
      Sebastien Alaiwan authored
      This experiment has been abandoned.
      Change-Id: Ieabc6f365651e2d116a4505a3cc202add94d1386
    • fix bug to add frame_marker in write_uncompressed_header_obu( ) · ebdbcb4c
      Soo-Chul Han authored
      Change-Id: Ifcc92df2e8c69752c1dbff2b447eb22035814389
    • Allow horzonly superres and striped loop restoration · 76c7800e
      Rupert Swarbrick authored
      We do this by upscaling the deblocked output as we save it into the
      RestorationStripeBoundaries line buffers. (See save_boundary_lines in
      restoration.c for the details)
      The upscaling is done by calling av1_convolve_horiz_rs, which reads
      off the edge of the frame and, of course, across tile boundaries. This
      means we need to extend the frame borders before saving boundary
      lines (hence the changes to decodeframe.c and encoder.c)
      Change-Id: Ia096846898b20afe4737433d772f7277d4f71724
    • Remove 4-tap filter intra · e2692c5c
      Yue Chen authored
      We reverted to using 3-tap filters, so the code related to 4-tap
      filters is no longer used.
      Change-Id: I7f65cf227d2eb3e9785474e3b33d0bdbf489b1f1
    • [CFL] Clean up subsampling with function pointer · 43ed5717
      Luc Trudeau authored
      This change does not alter the encoder/decoder behavior. Extra
      precautions are taken to stop an attacker from exploiting these function
      Change-Id: I4e0704f016774f2d8fbaeb2a4caec12fc6e67ec1
  2. 02 Nov, 2017 21 commits
    • [level map] cleanup and remove assertions · bd47bfaa
      Dake He authored
      The removed assertions were not properly set up and may cause decoding failures when USE_CASUAL_CTX is enabled. This CL does not change the bitstream.
      Change-Id: Ib9193cbda32f342335a79aca39e9cc49204a0ec9
    • Correct striped-loop-restoration with multiple tile rows · dee00eb0
      Rupert Swarbrick authored
      Before this patch, striped loop restoration didn't restart correctly
      on each tile row. Now, the loop restoration stripes start at the top
      of a tile row in the same way as if it were the top of the entire
      Change-Id: I0a88a28d7804b2f09d792ecbbf4f22f666f67012
    • loop-restoration: Rework self-guided filter · 369d8f22
      David Barker authored
      Because we have an (effective) 3-pixel border around each
      processing unit, and the local sums in the self-guided filter are
      only taken over at most 5x5 regions, we have 1 pixel's worth of
      spare border.
      We can use this border to greatly simplify the filter: Instead
      of calculating a 64x64 region of the A[] and B[] arrays, we can
      calculate a 66x66 region. Then we don't have to deal with complicated
      boundary conditions when generating the final 64x64 output block.
      This also makes a few other related changes:
      * The 'boxnum' function has been effectively redundant
        for a while - due to the way we do the 5x5 (or 3x3) windowing,
        the values we actually use are always (2r+1)^2. So we can skip
        calling this function if MAX_RADIUS <= 2
      * We can remove the annoying special case for tiny processing units
        in the self-guided filter, as we no longer have to worry about
        border behaviour
      * We change the SSE4.1 code to match the new C code, removing a ton
        of complexity. Further refactoring/speedups are probably
        now possible, but this includes the minimal changes to pass all
        the tests.
      Change-Id: I99beee164a31349a5228a9bef048e5f35c9639f2
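The border arithmetic here is: with a 3-pixel border and windows of radius r = 2 (5x5), box sums can be taken for a region extended by 1 pixel on each side (66x66 for a 64x64 unit), because 1 + 2 = 3 pixels is still inside the border. The sketch below illustrates why no boundary handling is then needed and why the pixel count is always (2r+1)^2. It is an illustration only; `box_sum` and `box_count` are hypothetical names, not the filter's actual functions.

```c
#include <assert.h>

/* Sum of the (2r+1) x (2r+1) window centred on (x, y). The caller guarantees
 * at least r valid pixels on every side of (x, y), so no clamping or
 * clipping of the window is needed. */
static int box_sum(const int *img, int stride, int x, int y, int r) {
  int sum = 0;
  assert(r >= 0);
  for (int dy = -r; dy <= r; ++dy)
    for (int dx = -r; dx <= r; ++dx)
      sum += img[(y + dy) * stride + (x + dx)];
  return sum;
}

/* Number of pixels in every such window. It is a constant for a given r
 * precisely because the windows are never clipped at a boundary. */
static int box_count(int r) { return (2 * r + 1) * (2 * r + 1); }
```

On an image of all ones, `box_sum` with r = 2 returns 25 at every position that has a 2-pixel margin, matching `box_count(2)`.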
    • Fix a compiler warning of "unused variable" · 5f2749b8
      Yaowu Xu authored
      Change-Id: I299e2f2a1967f867a5452e0c449abe5243ac5d13
    • [CfL] Fix subsampling overflow in HBD 4:4:0 · 6acb300f
      Luc Trudeau authored
      Change-Id: I0fa9d48487256655798dbdd64acad523e84557c6
    • Correct constants in daala_tx 32-point DST. · 0dffa176
      Nathan E. Egge authored
      Change-Id: Ia5df312c759faa38fe336ab32a7d4908760ecf08
    • Add initialization of pixel data in intrapred_test · 9137a9ae
      Yaowu Xu authored
      Change-Id: Ic320e5de9b0c7d320b0f6dddce93f1445be61234
    • Get rid of RestorationInfo::procunit_height and width · cb493d82
      Rupert Swarbrick authored
      These are just RESTORATION_PROC_UNIT_SIZE shifted right by the
      vertical or horizontal subsampling for this plane and it's easier not
      to have to pass them around.
      Change-Id: I86441d6cd86bb146f3e5dcdf2c89e34dd9fed0e1
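The derivation that replaces the stored fields is a single shift, sketched below. The value 64 for RESTORATION_PROC_UNIT_SIZE is an assumption made for this example, and `procunit_dim` is a hypothetical helper name.

```c
#include <assert.h>

/* Assumed value, for illustration only. */
#define RESTORATION_PROC_UNIT_SIZE 64

/* Processing-unit width or height for a plane, given that plane's
 * horizontal or vertical subsampling shift (0 = none, 1 = halved). */
static int procunit_dim(int subsampling) {
  assert(subsampling == 0 || subsampling == 1);
  return RESTORATION_PROC_UNIT_SIZE >> subsampling;
}
```

Under that assumption, a 4:2:0 chroma plane (subsampling 1 in both directions) gets 32x32 units, while the luma plane keeps 64x64.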
    • Simplify flow of control (decoder) · 238a6d63
      Sebastien Alaiwan authored
      Avoid switch fallthrough (and associated gcc/clang warnings),
      reduce scopes and add consts.
      Change-Id: I28d910d9d39ee8fe2c5618364af602af5be5c186
    • Remove experimental flag of EXT_TX · 3bac9928
      Sebastien Alaiwan authored
      This experiment has been adopted, so we can simplify the code
      by dropping the associated preprocessor conditionals.
      Change-Id: I02ed47186bbc32400ee9bfadda17659d859c0ef7
    • Fix delta_qindex and delta_lflevel signaling · 1dbe92d6
      Pavel Frolov authored
      Change-Id: Ifcaedaf312f056fcc29e6a8e020aac0ddc52affd
    • Further cleanups of superres resize library · 7f8f1a92
      Debargha Mukherjee authored
      Change-Id: I6cf4e3bdc5eda5f7baabc7c6ad99ba49f69a8fa4
    • Add an expt to remove allow high precision mv flag · b214775d
      Debargha Mukherjee authored
      Change-Id: I3873acafcd9539da96f67328cdb8faf7798be90f
    • Refactor search_selfguided_restoration · bfbd8b39
      David Barker authored
      Pull the per-unit processing out into a couple of new functions,
      to make the overall logic of search_selfguided_restoration()
      a bit more obvious
      Change-Id: Ib4ed9be7d4c76e22dc56f933f3f9d09160242f71
    • OBU: Fix a few bugs · 01563088
      David Barker authored
      * The function av1_decode_tg_tiles_and_wrapup performs some per-frame
        initialization; some of this was mistakenly being performed
        once per tile group instead, leading to strange behaviour
        (eg, forgetting loop-restoration coefficients, forgetting
        the boundary information for all but the last tile group, etc.)
        Fix this by pulling all of the initialization code into its
        own function and calling it only if the initialize_flag is set.
      * While fixing the above, I realized that the 'context_updated'
        flag in av1_decode_tg_tiles_and_wrapup was not behaving as intended:
        The idea is that, when using frame parallel mode, we save the
        frame context early so that the next frame can start decoding.
        Then we don't need to store the frame context at the end of
        the frame, since we already dealt with it at the start of the frame.
        However, this 'context_updated' flag was local to one tile group,
        ie. it got reset to 0 once we started decoding the second tile group.
        So we'd end up storing the frame context again at the end of the frame
        if there was >1 tile group.
        This didn't break anything, but it is a bit weird. So, to match
        the original intent, we ditch the 'context_updated' flag and
        directly check if we're in frame parallel mode when necessary.
      * Fix a bug where we read one byte too much from a tile group
        OBU when the extended OBU header was used.
      Change-Id: Ifbe561de0de35525d809e23915ac5263273e8de7
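The control flow described in the first two bullets can be sketched as follows. This is a simplified model, not the decoder's code: `DecoderSketch`, `decode_tile_group` and `decode_frame` are hypothetical, the counters exist only to make the behaviour observable, and placing the early context save inside the per-frame initialization is an assumption.

```c
#include <assert.h>

/* Hypothetical decoder state; the counters record how often each step runs. */
typedef struct {
  int frame_parallel_mode;
  int frame_inits;    /* times per-frame initialization ran */
  int context_stores; /* times the frame context was saved */
} DecoderSketch;

static void decode_tile_group(DecoderSketch *d, int initialize_flag,
                              int is_last_tile_group) {
  if (initialize_flag) {
    /* Per-frame setup: runs for the first tile group only. */
    d->frame_inits++;
    /* In frame parallel mode the context is saved early (assumed here). */
    if (d->frame_parallel_mode) d->context_stores++;
  }
  /* ... tile decoding would happen here ... */
  /* Check the mode directly instead of a per-tile-group local flag. */
  if (is_last_tile_group && !d->frame_parallel_mode) d->context_stores++;
}

/* Decode one frame made of num_tile_groups tile groups. */
static DecoderSketch decode_frame(int frame_parallel_mode,
                                  int num_tile_groups) {
  DecoderSketch d = { frame_parallel_mode, 0, 0 };
  for (int tg = 0; tg < num_tile_groups; ++tg)
    decode_tile_group(&d, tg == 0, tg == num_tile_groups - 1);
  return d;
}
```

In this model a frame with three tile groups is initialized once and stores its context exactly once, whether or not frame parallel mode is on.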
    • Refactor border treatment in loop restoration · 5b401364
      Rupert Swarbrick authored
      Previously we were calling aom_extend_frame_borders to generate
      extended pixels for use in loop-restoration. This generates quite a
      large border, when we only need 3 pixels.
      In addition, we were also calling extend_frame, which does the same
      thing but with a smaller border, once (in the decoder) or multiple
      times (in the encoder) per plane.
      This patch tidies all of this up so that we only call extend_frame
      once per plane, with the largest border size we need (3px).
      It also adds two new #defines. RESTORATION_BORDER is the 3 pixel
      border needed to do filtering for a processing
      unit. RESTORATION_CTX_VERT is the number of rows saved for each stripe
      when doing striped loop restoration.
      Change-Id: I2c3ffcc19808f79db195f76d857e2f23da5d8a84
    • Fix av1_loop_restoration_corners_in_sb for HORZ_FRAME_SUPERRES · 8b68e100
      Rupert Swarbrick authored
      After this patch, we don't scale sb coordinates vertically when using
      Change-Id: I24c652b4b357b132e8b29979a119e7aeb8420e19
    • Increase border when CONFIG_EXT_PARTITION==1 · 902000d8
      Pavel Frolov authored
      Previously, an increased border of 288 pixels was used when both
      However, an increased border is also required when only
      CONFIG_EXT_PARTITION is enabled.
      For example when:
      1) current frame is 2x smaller than reference frames
      2) block size is 128x128
      Change-Id: I09dfddfdf6bd0b0efde2556acb924bb563b6da2f
    • Move superres 1D convolve functions to convolve.c · 97137443
      Debargha Mukherjee authored
      This is to ease integration with striped loop restoration
      Change-Id: If10e16656a3fe42bcca3e7238e4e729c962f2bb8
    • Fix NaN failure in CfL unit test on x86. · 8fdcf6e3
      Nathan E. Egge authored
      Change-Id: I555f430541413a42e9e14310bfde93304dc15cfa
    • [level map] simplified context derivation · 03a32926
      Dake He authored
      This CL simplifies context derivation for nz and base level flags in
      level map.
      1. Reduce SIG_COEF_CONTEXTS from 58 to 42.
      2. NZ and base level flags share the same context offsets derived from a
      template of size 5 (down from 7).
      In limited runs, compression performance seems neutral if not better.
      Encoding time for a key frame on a local linux machine is reduced by about 25% or more.
      Change-Id: Ibd93b21c839154bc5ae26b993f9e66537cbf5942
  3. 01 Nov, 2017 1 commit