1. 05 Nov, 2017 3 commits
    • Add frame offset of references to dumped info · b4f31036
      Zoe Liu authored
      This patch also updates cm->frame_offset for show_existing_frame at
      the encoder.
      
      Change-Id: I863876675145ba663fc229a854b83b39759309a5
    • Misc. clean ups / refactor of speed 1 · d7338aa8
      Debargha Mukherjee authored
      With this patch, and the speed settings turned on for speed 1,
      the coding efficiency of speed 1 in default configuration should be
      only a little worse than speed 0, but it should roughly run at
      double the speed.
      
      Specifically, this patch makes various changes to make sure that
      speed 1 behaves exactly the same as speed 0 except for speed settings
      turned on or off in speed_features.c.
      
      This will slightly change the bitstream generated for speeds
      1 or higher, for the following reasons:
      1. Removes a hacky speed setting correction factor in firstpass.c
      2. Fast cdef search is moved from speed 1+ to 2+, and a new speed
      feature is added to control that.
      3. Mesh search settings are pushed down one level so that speeds 0
      and 1 use the same settings.
      4. A disable_split_mask feature for animated content, previously
      turned on for speeds 1+, is moved down to speeds 2+.
      
      Change-Id: I0ec36556f157bdc42c5daa0cfb9518cf7ff65f6b
    • Further speed-up of ext-partition-types · 6f77d081
      Debargha Mukherjee authored
      Removing the NONE partition check from horz_4 and
      vert_4 partition search conditions provides another
      5-10% speedup at very little loss.
      
      Change-Id: Ie5f14191efe6d2b0695b27021de96ad0a1550f26
  2. 04 Nov, 2017 8 commits
    • Refactor the code for reference frame flag setup · 368bf16d
      Zoe Liu authored
      At the encoder side, for the 7 reference frames, we always set up the
      priority rank as follows:
      LAST, ALTREF, LAST2, LAST3, GOLDEN, BWDREF, ALTREF2
      That is, if two reference frames point to the same reference frame
      buffer, the flag for the latter frame in the rank will always be
      turned off.
      
      This patch does not change any coding performance / coding speed for
      the default configure setup. It only affects the following setup:
      one-sided-compound is on && ext-comp-refs is off
      As one-sided-compound is enabled by default when ext-comp-refs is
      enabled, and ext-comp-refs is enabled by default, the above setup
      should not need to be considered.
      
      Change-Id: I6de18d3be938e1d4a8897e5ba0857b8d21e7f9d0
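
      The ranking rule described in this message can be sketched as follows.
      This is only an illustration: the enum order, the bitmask layout and
      the names kRank and ref_flags_after_dedup are assumptions made for the
      sketch, not the identifiers used in the patch.

```c
/* Hypothetical names for this sketch; the enum order and flag layout in the
 * real codebase may differ. */
enum { LAST, LAST2, LAST3, GOLDEN, BWDREF, ALTREF2, ALTREF, NUM_REFS };

/* Priority rank as listed in the commit message. */
static const int kRank[NUM_REFS] = { LAST,   ALTREF, LAST2,  LAST3,
                                     GOLDEN, BWDREF, ALTREF2 };

/* buf_idx[ref] is the frame buffer each reference points to. Returns a
 * bitmask in which bit `ref` is set when that reference's flag stays on. */
static unsigned ref_flags_after_dedup(const int buf_idx[NUM_REFS]) {
  unsigned flags = 0;
  for (int i = 0; i < NUM_REFS; ++i) {
    int duplicate = 0;
    /* An earlier reference in the rank that shares this buffer wins. */
    for (int j = 0; j < i; ++j) {
      if (buf_idx[kRank[j]] == buf_idx[kRank[i]]) duplicate = 1;
    }
    if (!duplicate) flags |= 1u << kRank[i];
  }
  return flags;
}
```

      For example, if LAST2 and ALTREF share a buffer, ALTREF keeps its flag
      and LAST2 loses it, because ALTREF comes earlier in the rank.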
    • Use rtcd script to choose between implementations · 78a7bd7d
      Sebastien Alaiwan authored
      Change-Id: I752ad96a8b4349d4a437a97e30edc8e4c22f81b5
    • cdef_test: quiet implicit bool to int conv warning · 9feda796
      James Zern authored
      Change-Id: Ic7096fe85dc653c9c7d7d1f098df19daff27e1cf
    • Remove NCOBMC_ADAPT_WEIGHT from AV1 · 80daf0c4
      Yue Chen authored
      Development of this experiment will be deferred to AV2.
      
      Change-Id: I3c4615a21b59508500bed8aab0a5c54413b4f284
    • Speed up one-sided compound mode (ext-comp-refs) · 77fb5be1
      Zoe Liu authored
      One-sided compound ref prediction is used only when all reference
      frames are one-sided.
      
      This patch has demonstrated an encoder speedup of ~28%.
      
      With the following configure setup, coding performance on the Google
      test sets (50 frames) drops in BDRate by ~0.2% for lowres and by
      ~0.1% for midres (the corresponding performance impact should be
      smaller on AWCY):
      --enable-experimental --disable-convolve-round --disable-ext-partition
      --disable-ext-partition-types --disable-txk-sel --disable-txm
      
      Change-Id: I585bbffb2f8d154e8f52a1e79a84eff8bb4a471d
    • Separate ref frame mvs control from prev_frame_mvs · e17ebe92
      Jingning Han authored
      Whether reference frame motion vectors are used is a separate
      question from whether previous frame motion vectors exist. This
      commit decouples the two, so that the encoder can control the
      use of reference motion vectors. When they are used, one can further
      check whether the previous frame exists, and then decide whether
      use_prev_frame_mvs needs to be forced to zero.
      
      This solves the issue where previous frame mvs is set to 0 and
      this accidentally shuts off access to all other existing
      reference frame mvs in the mfmv system. It brings the coding
      performance gains back to normal.
      
      Change-Id: I2531f73e55582a9bb5b3e0ff47e361a199ec8082
    • Speed up of ext-partition types · c4b67641
      Debargha Mukherjee authored
      Search the new horz/vert a/b/4 partitions only if the best so far
      is either oriented along the same direction or split/none, or if
      the rd costs obtained from the previous partition searches indicate
      there is potential in searching these partitions.
      
      This brings about 25-30% speedup at less than 0.1% drop as seen on
      lowres 30 frames.
      
      Change-Id: I6c6c347e06c34ee0ca17479aeeb4075a66dc7e2c
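
      A minimal sketch of the gating condition described above. The type
      and function names are hypothetical, and the rd-cost test (within 1/8
      of the best cost) is an assumed placeholder: the message does not say
      which rd criterion the patch uses.

```c
#include <stdint.h>

/* Hypothetical type for this sketch, not the codebase's partition enum. */
typedef enum { PART_NONE, PART_HORZ, PART_VERT, PART_SPLIT } BasicPartition;

/* Decide whether to search the extended (a/b/4) partitions along `dir`,
 * which is PART_HORZ or PART_VERT. `best` is the best basic partition so
 * far, `dir_rd` the rd cost of the plain partition along `dir`, and
 * `best_rd` the best rd cost so far. Costs are assumed to be non-negative
 * with dir_rd >= best_rd. */
static int should_search_ext(BasicPartition best, BasicPartition dir,
                             int64_t dir_rd, int64_t best_rd) {
  /* Best so far has the same orientation, or is split/none. */
  if (best == dir || best == PART_SPLIT || best == PART_NONE) return 1;
  /* Assumed "potential" test: the plain partition along `dir` came within
   * 1/8 of the best cost. The threshold is illustrative only. */
  return dir_rd - best_rd <= best_rd / 8;
}
```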
    • Simplify tx_mode frame level bit · 923b73d8
      Debargha Mukherjee authored
      Adds a new experiment to simplify the tx_mode symbol.
      
      The existing frame level tx_mode information is converted to a single bit
      to select between largest tx_size for a prediction unit or specified
      at the block level. The less useful modes: ALLOW_8X8, ALLOW_16X16,
      etc. are removed.
      
      Change-Id: Ib9358e17b0158a167eb4edef79f36ff113aa56e1
  3. 03 Nov, 2017 20 commits
    • Allow to disable the probability update · 0e141b56
      Yunqing Wang authored
      Adds the ability to disable the probability update when needed.
      This is needed when encoding with multiple tiles; the probability
      update can be enabled or disabled separately for each individual
      tile.
      
      Change-Id: Ic3c64e6cebac89c483d48b874761bd2e902d81e6
    • [level map] set USE_CAUSAL_CTX to 1 · 2d9bd32e
      Dake He authored
      Per the codec WG call today, turn on Plan B for level map by default.
      
      Change-Id: Iae885b38917cf79e4f0b290cc2d73ac28321710f
    • Enable striped_loop_restoration · 201a2b49
      Ola Hugosson authored
      Change-Id: Idc5ead2db38562924f27796eb78a05b658b5a20e
    • Fix bug introduced by commit 01563088 · b44deca9
      David Barker authored
      When fixing one bug in av1_decode_tg_tiles_and_wrapup, I seem to have
      introduced another bug. This was due to checking the wrong condition
      on whether to update the frame context at the end of the frame.
      
      BUG=aomedia:1001
      
      Change-Id: I929a710e2de31a89cc7899fb1605ca7edf968a87
    • Fix highbd striped loop restoration bug · ff090b86
      Rupert Swarbrick authored
      The "data8_bl" variable is a uint8_t* and will be scaled up
      later (with REAL_PTR) if it's pointing to highbd data. Don't scale up
      the x offset.
      
      Change-Id: I03e2ce8861e25e3a603e8f0ba2c8af585e08b9c5
    • Disable filter_intra mode in <8x8 tx blocks · 95e13e23
      Yue Chen authored
      0.159% gain on lowres 60 frames, compared to 0.236% gain if we don't
      restrict it in small tx blocks.
      (--disable-ext-partition --disable-ext-partition-types
       --disable-convolve-round --disable-ext-comp-refs)
      
      Change-Id: I1d1c5474ca27de9dec992ea30a9883afd7a56474
    • Correct has_bottom_left calculation for mixed vertical partitions · 8315daf7
      Rupert Swarbrick authored
      This patch regenerates the orders tables and generates both the normal
      ones and also those for vertical partitions. I've added a long comment
      above the definition of orders[] that explains how they work (there's
      no change, but it took me a while to understand, so it's probably a
      good thing to document). I've also slightly changed when we use the
      orders_vert tables: they are now used for both PARTITION_VERT_A and
      PARTITION_VERT_B.
      
      The patch also removes the #if around the partition argument to
      has_top_right and adds it to has_bottom_left. (I could have put it
      inside an #if, but I shouldn't imagine there's any measurable
      performance cost and the code is cleaner this way).
      
      The tables were regenerated with a Haskell script which I've included
      at the bottom of the commit message (so the next person doesn't have
      to write it from scratch yet again). The output looks reasonably
      clean, but clang-format does change it somewhat so you need to run
      that afterwards. The tables are also output in a different order, so
      you'll need to clean that up by hand too.
      
      -- orders.hs: Print tables to stdout by calling printOrders
      
      import Data.Foldable
      import Data.List (findIndex)
      import Data.Maybe
      import System.Environment
      import Text.Printf
      import Text.Read
      
      data Block = Block { lbw :: Int, lbh :: Int, vert :: Bool }
      
      minLogBlockSize :: Bool -> Int
      minLogBlockSize v = if v then 3 else 2
      
      maxLogBlockSize = 7 :: Int
      
      -- This code generates the inverse of what we want: a mapping from visit order
      -- to raster order. That is, element i of the list will be the raster index of
      -- the block that we visit at step i.
      vrSplit :: Block -> Int -> Int -> Int -> [Int]
      vrSplit b stride lsz off
        -- PARTITION_NONE
        | lbw b >= lsz && lbh b >= lsz = [off]
        -- Some form of horizontal partition
        | lbw b < lsz && lbh b >= lsz =
            [off,off + 1..off + 2^(lsz - lbw b) - 1]
        -- Some form of vertical partition
        | lbw b >= lsz && lbh b < lsz =
            [off,off + stride..off + (2^(lsz - lbh b) - 1)*stride]
        -- PARTITION_VERT_*
        | vert b && lbh b + 1 == lsz && lbw b + 1 == lsz =
            [off, off + stride, off + 1, off + stride + 1]
        -- PARTITION_SPLIT
        | otherwise =
          concatMap (vrSplit b stride (lsz - 1))
          [off, off + 2^(lsz - lbw b - 1), off + 2^(lsz - lbh b - 1) * stride,
           off + 2^(lsz - lbw b - 1) + 2^(lsz - lbh b - 1) * stride]
      
      vrOrders :: Block -> [Int]
      vrOrders b = vrSplit b (2 ^ (maxLogBlockSize - lbw b)) maxLogBlockSize 0
      
      -- A simple function to invert the bijection generated by vrOrders (it's very
      -- naive, but the list isn't exactly long)
      invertList :: [Int] -> [Int]
      invertList is = map (\ i -> fromJust $ findIndex ((==) i) is) [0..length is - 1]
      
      genOrders :: Block -> [Int]
      genOrders = invertList . vrOrders
      
      -- Code to print everything out in the style used in the AOM codebase
      forButLast_ :: Applicative f => [a] -> (a -> f b) -> f ()
      forButLast_ [] f = pure ()
      forButLast_ (a : as) f = fbl a as f
        where fbl a [] f = pure ()
              fbl a (a' : as) f = f a *> fbl a' as f
      
      numDigits :: Int -> Int
      numDigits n =
        if n == 0 then 1
        else ceiling $ logBase 10 $ fromIntegral $ 1 + n
      
      printRow :: Int -> Int -> [Int] -> Bool -> IO ()
      printRow indent fw as islast = do
        { if null as then return ()
          else do
            { printf "%*s" indent ""
            ; forButLast_ as (\ a -> printf "%d,%*s" a (postDent a) "")
            ; printf "%d%s" (last as) (if islast then "\n" else ",\n") } }
        where postDent a = 1 + fw - numDigits a
      
      printInts :: Int -> Int -> Int -> [Int] -> IO ()
      printInts width indent fw [] = return ()
      printInts width indent fw as =
        let (row, rest) = splitAt eltsPerLine as in
          printRow indent fw row (null rest) >> printInts width indent fw rest
        where eltsPerLine = quot (width - indent + 1) (fw + 2)
      
      printBlockOrders :: Block -> IO ()
      printBlockOrders b = do
        { printf "static const uint16_t orders_%s%dx%d[%d] = {\n"
          (if vert b then "vert_" else "")
          ((2 :: Int) ^ lbw b) ((2 :: Int) ^ lbh b) numElts
        ; printInts 79 2 intWidth (genOrders b)
        ; printf "};\n" }
        where lsz = maxLogBlockSize
              numElts = (2 :: Int) ^ (lsz - lbw b + lsz - lbh b)
              intWidth = max 1 $ ceiling $ logBase 10 $ fromIntegral (numElts - 1)
      
      blocksForWidth :: Bool -> Int -> [Block]
      blocksForWidth v lbw = map (\ lbh -> Block lbw lbh v) [minLbh..maxLbh]
        where maxLogAspectRatio = if v then 0 else 2
              minLbh = max (minLogBlockSize v) (lbw - maxLogAspectRatio)
              maxLbh = min maxLogBlockSize (lbw + maxLogAspectRatio)
      
      blocksForV :: Bool -> [Block]
      blocksForV v = concatMap (blocksForWidth v) [minLbw..maxLbw]
        where minLbw = (minLogBlockSize v)
              maxLbw = maxLogBlockSize
      
      blocks :: [Block]
      blocks = blocksForV False ++ blocksForV True
      
      printOrders :: IO ()
      printOrders = traverse_ printBlockOrders blocks
      
      -- Ends orders.hs
      
      BUG=aomedia:914
      
      Change-Id: I6c53e80caa0d203cdc11f88471b6c117c633baa6
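
      For readers who do not read Haskell, the invertList step of the
      script above (turning a visit-order-to-raster-order list into the
      raster-order-to-visit-order table) can be sketched in C. The function
      name invert_order is made up for this sketch, and it uses a single
      linear pass rather than the script's repeated findIndex.

```c
#include <stddef.h>
#include <stdint.h>

/* visit_to_raster[i] is the raster index of the block visited at step i.
 * On return, raster_to_visit[r] is the step at which raster block r is
 * visited. The input must be a permutation of 0..n-1. */
static void invert_order(const uint16_t *visit_to_raster,
                         uint16_t *raster_to_visit, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    raster_to_visit[visit_to_raster[i]] = (uint16_t)i;
  }
}
```

      For instance, the visit order {2, 0, 1} inverts to {1, 2, 0}: raster
      block 0 is visited at step 1, block 1 at step 2 and block 2 at step 0.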
    • Avoid some mismatched braces in bitstream.c · cf772765
      Rupert Swarbrick authored
      These play havoc with editors' "jump to start of function"
      commands. There should be no change to generated code.
      
      Change-Id: Ib6961bb952da02081a675d0a4fa01eea2c1ff6d1
    • Simplify FRAME_ID constants · d418f68e
      Sebastien Alaiwan authored
      Change-Id: I75890c0f64f93f48299895d1e0bcfbf91846a4ab
    • Add two levels for selective ref frame sp. feature · 06b40cc3
      Debargha Mukherjee authored
      The first level is turned on for speed 1.
      
      Change-Id: I3dba0f0250b97a25e174cacc2a46ca7f76572c85
    • Cleanup of speed 1 · 203016e8
      Debargha Mukherjee authored
      Removes features for now so that we only add features with very
      small loss.
      
      Change-Id: Ie50f6af2a6cc19dde5f682754a1f0adf4ec957a8
    • Add highbd support in predict_skip_flag · 80eedf2e
      Alexander Bokov authored
      Change-Id: I4270d1260854ac27b68c5694ca8102b92bee6faa
    • Introducing a model for pruning the TX size search · 79a37242
      Alexander Bokov authored
      Use a neural-network-based binary classifier to predict the first split
      decision on the highest level of the TX size RD search tree. Depending
      on how confident we are in the prediction we either keep full unmodified
      TX size search or use the largest possible TX size and stop any further
      search.
      
      Average speed-up: 3-4%
      Quality loss (lowres): 0.062%
      Quality loss (midres): 0.018%
      
      Change-Id: I64c0317db74cbeddfbdf772147c43e99e275891f
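
      The confidence-based decision can be sketched as below. This assumes
      the classifier yields a score where a higher value means more
      confidence that the block should not be split; the names, the score
      convention and the threshold are all hypothetical, and the network
      itself is not shown.

```c
/* Hypothetical names for this sketch. */
typedef enum { TX_SEARCH_FULL, TX_SEARCH_LARGEST_ONLY } TxSearchMode;

/* no_split_score: assumed classifier output, higher means more confident
 * that the top-level TX block should not be split.
 * confidence_thresh: assumed tuning parameter. */
static TxSearchMode choose_tx_search(float no_split_score,
                                     float confidence_thresh) {
  /* Confident enough: use the largest TX size and stop searching.
   * Otherwise fall back to the full, unmodified TX size search. */
  return no_split_score >= confidence_thresh ? TX_SEARCH_LARGEST_ONLY
                                             : TX_SEARCH_FULL;
}
```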
    • intrabc: remove unused counters · 7ac01f8f
      Hui Su authored
      The "use_intrabc" flag is signaled with CDF. No need to keep a counter
      for it.
      
      Change-Id: Ia62ef8f264aa5ce2f6fceddc0b2a7d2032c73044
    • intrabc: disable it for sub8x8 4:1 partitions · e8204e33
      Hui Su authored
      Disable it for sub8x8 4:1 partitions (4x16 and 16x4), because of
      conflict with cfl.
      
      Enable it for all the other sub8x8 partitions sizes.
      
      BUG=aomedia:998
      
      Change-Id: Ifdd907f0ac1f987981e81c166eb71978e6ea27c3
    • Remove VAR_REFS experiment · 4be6cb32
      Sebastien Alaiwan authored
      This experiment has been abandoned.
      
      Change-Id: Ieabc6f365651e2d116a4505a3cc202add94d1386
    • fix bug to add frame_marker in write_uncompressed_header_obu( ) · ebdbcb4c
      Soo-Chul Han authored
      Change-Id: Ifcc92df2e8c69752c1dbff2b447eb22035814389
    • Allow horzonly superres and striped loop restoration · 76c7800e
      Rupert Swarbrick authored
      We do this by upscaling the deblocked output as we save it into the
      RestorationStripeBoundaries line buffers. (See save_boundary_lines in
      restoration.c for the details)
      
      The upscaling is done by calling av1_convolve_horiz_rs, which reads
      off the edge of the frame and, of course, across tile boundaries. This
      means we need to extend the frame borders before saving boundary
      lines (hence the changes to decodeframe.c and encoder.c)
      
      Change-Id: Ia096846898b20afe4737433d772f7277d4f71724
    • Remove 4-tap filter intra · e2692c5c
      Yue Chen authored
      We reverted to using 3-tap filters, so the code related to 4-tap
      filters will no longer be used.
      
      Change-Id: I7f65cf227d2eb3e9785474e3b33d0bdbf489b1f1
    • [CFL] Clean up subsampling with function pointer · 43ed5717
      Luc Trudeau authored
      This change does not alter the encoder/decoder behavior. Extra
      precautions are taken to stop an attacker from exploiting these function
      pointers.
      
      Change-Id: I4e0704f016774f2d8fbaeb2a4caec12fc6e67ec1
  4. 02 Nov, 2017 9 commits
    • [level map] cleanup and remove assertions · bd47bfaa
      Dake He authored
      The removed assertions were not properly set up and may cause decoding
      failures when USE_CAUSAL_CTX is enabled. This CL does not change the
      bitstream.
      
      Change-Id: Ib9193cbda32f342335a79aca39e9cc49204a0ec9
    • Correct striped-loop-restoration with multiple tile rows · dee00eb0
      Rupert Swarbrick authored
      Before this patch, striped loop restoration didn't restart correctly
      on each tile row. Now, the loop restoration stripes start at the top
      of a tile row in the same way as if it were the top of the entire
      frame.
      
      Change-Id: I0a88a28d7804b2f09d792ecbbf4f22f666f67012
    • loop-restoration: Rework self-guided filter · 369d8f22
      David Barker authored
      Because we have an (effective) 3-pixel border around each
      processing unit, and the local sums in the self-guided filter are
      only taken over at most 5x5 regions, we have 1 pixel's worth of
      spare border.
      
      We can use this border to greatly simplify the filter: Instead
      of calculating a 64x64 region of the A[] and B[] arrays, we can
      calculate a 66x66 region. Then we don't have to deal with complicated
      boundary conditions when generating the final 64x64 output block.
      
      This also makes a few other related changes:
      * The 'boxnum' function has been effectively redundant
        for a while - due to the way we do the 5x5 (or 3x3) windowing,
        the values we actually use are always (2r+1)^2. So we can skip
        calling this function if MAX_RADIUS <= 2
      
      * We can remove the annoying special case for tiny processing units
        in the self-guided filter, as we no longer have to worry about
        border behaviour
      
      * We change the SSE4.1 code to match the new C code, removing a ton
        of complexity. Further refactoring/speedups are probably
        now possible, but this includes the minimal changes to pass all
        the tests.
      
      Change-Id: I99beee164a31349a5228a9bef048e5f35c9639f2
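
      The border arithmetic in this message can be checked with a short
      sketch. The names below are made up for illustration; the 64-pixel
      unit, 3-pixel border and maximum radius of 2 are taken from the
      message. The sketch shows that a full (2r+1)x(2r+1) window is
      available at every sample of the 66x66 region, which is why the
      element count is always (2r+1)^2 and no boundary handling is needed.

```c
#include <stdint.h>

/* Values quoted in the commit message; the names are hypothetical. */
enum { UNIT_SIZE = 64, BORDER = 3, MAX_BOX_RADIUS = 2 };

/* Border pixels left over once the widest window has been accounted for. */
static int spare_border(void) { return BORDER - MAX_BOX_RADIUS; }

/* Side length of the region over which A[] and B[] can be computed. */
static int ab_region_size(void) { return UNIT_SIZE + 2 * spare_border(); }

/* Number of pixels in a full window of radius r. */
static int box_count(int r) { return (2 * r + 1) * (2 * r + 1); }

/* Sum over a full (2r+1)x(2r+1) window centred on (x, y). `unit` points at
 * the top-left pixel of the processing unit inside a buffer padded by
 * BORDER pixels on every side, so x and y may range over -1..UNIT_SIZE
 * for r <= MAX_BOX_RADIUS without any bounds checks. */
static int box_sum(const uint8_t *unit, int stride, int x, int y, int r) {
  int sum = 0;
  for (int dy = -r; dy <= r; ++dy) {
    for (int dx = -r; dx <= r; ++dx) {
      sum += unit[(y + dy) * stride + (x + dx)];
    }
  }
  return sum;
}
```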
    • Fix a compiler warning of "unused variable" · 5f2749b8
      Yaowu Xu authored
      Change-Id: I299e2f2a1967f867a5452e0c449abe5243ac5d13
    • [CfL] Fix subsampling overflow in HBD 4:4:0 · 6acb300f
      Luc Trudeau authored
      BUG=aomedia:994
      
      Change-Id: I0fa9d48487256655798dbdd64acad523e84557c6
    • Correct constants in daala_tx 32-point DST. · 0dffa176
      Nathan E. Egge authored
      Change-Id: Ia5df312c759faa38fe336ab32a7d4908760ecf08
    • Add initialization of pixel data in intrapred_test · 9137a9ae
      Yaowu Xu authored
      BUG=aomedia:995
      
      Change-Id: Ic320e5de9b0c7d320b0f6dddce93f1445be61234
    • Get rid of RestorationInfo::procunit_height and width · cb493d82
      Rupert Swarbrick authored
      These are just RESTORATION_PROC_UNIT_SIZE shifted right by the
      vertical or horizontal subsampling for this plane and it's easier not
      to have to pass them around.
      
      Change-Id: I86441d6cd86bb146f3e5dcdf2c89e34dd9fed0e1
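
      The derivation that replaces the stored fields can be sketched as
      two one-line helpers. The value 64 for RESTORATION_PROC_UNIT_SIZE is
      an assumption made for this sketch, and the helper names are
      hypothetical.

```c
/* Assumed value for this sketch; check the codebase for the real one. */
enum { RESTORATION_PROC_UNIT_SIZE = 64 };

/* ss_x and ss_y are the plane's horizontal and vertical subsampling
 * shifts (0 for no subsampling, 1 for 2:1 subsampling). */
static int procunit_width(int ss_x) {
  return RESTORATION_PROC_UNIT_SIZE >> ss_x;
}

static int procunit_height(int ss_y) {
  return RESTORATION_PROC_UNIT_SIZE >> ss_y;
}
```

      Under that assumption, a 4:2:0 chroma plane (ss_x = ss_y = 1) gets
      32x32 processing units and the luma plane gets 64x64.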
    • Simplify flow of control (decoder) · 238a6d63
      Sebastien Alaiwan authored
      Avoid switch fallthrough (and associated gcc/clang warnings),
      reduce scopes and add consts.
      
      Change-Id: I28d910d9d39ee8fe2c5618364af602af5be5c186