  1. 04 Dec, 2017 17 commits
    • Fixes to make 4:1 rectangular intra work correctly · d2cfbefb
      Debargha Mukherjee authored
      This patch fixes and enables rectangular intra transform
      sizes for 4:1 partitions (that were turned off before).
      4:1 partitions can now use rectangular intra predictions with
      2:1 rectangular transform sizes.
      BDRATE lowres (single keyframe): -0.612%
      
      Change-Id: I6f062f7c08aae8eeb0a55d31e792c8f7e3f302a2
    • daala_tx: Add SIMD versions of 4-point identity TX · f03f543d
      Timothy B. Terriberry authored
      These don't share the same kernel functions as the others, for two
      reasons: it lets us avoid doing two transposes for the rows, and we
      don't need to split short rows into multiple registers for the columns.
      
      The resulting IDTX implementations can be re-used for all sizes,
      though we might benefit from the larger AVX registers for the
      larger sizes.
      
      It might also be worth having a fast path for IDTX_IDTX to avoid an
      extra round-trip through memory, but that can be added in a
      separate patch if it proves worthwhile.
      
      Change-Id: I36fa4ea44c7dd2c165bff750d9bc8a213783041f
    • daala_tx: Add SIMD versions of 4-point DST/FlipDST · 47f74646
      Timothy B. Terriberry authored
      Despite the function pointers used to avoid copying and pasting the
      boilerplate code around each transform kernel, the compiler will
      inline everything to straightline code, with all SIMD parameters
      kept in registers.
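
      The function-pointer pattern described above can be sketched in plain
      C as follows. This is only an illustration: the kernel signature, the
      toy butterfly kernel, and every name here are hypothetical and are not
      taken from the patch. The point is that a static inline wrapper called
      with a compile-time-constant kernel gives the compiler what it needs to
      inline the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel signature: transforms four values in place. */
typedef void (*tx4_kernel_fn)(int32_t *q0, int32_t *q1, int32_t *q2,
                              int32_t *q3);

/* Toy kernel standing in for a real transform: two butterflies. */
static inline void toy_butterfly_kernel(int32_t *q0, int32_t *q1, int32_t *q2,
                                        int32_t *q3) {
  const int32_t a = *q0 + *q3;
  const int32_t b = *q1 + *q2;
  const int32_t c = *q1 - *q2;
  const int32_t d = *q0 - *q3;
  *q0 = a;
  *q1 = b;
  *q2 = c;
  *q3 = d;
}

/* Shared boilerplate: load each 4-wide row, run the kernel, store it back.
 * The kernel arrives as a function pointer so this code is written once. */
static inline void tx4_rows(int32_t *buf, int rows, tx4_kernel_fn kernel) {
  assert(buf != NULL && kernel != NULL);
  for (int r = 0; r < rows; r++) {
    int32_t *row = buf + 4 * r;
    int32_t q0 = row[0], q1 = row[1], q2 = row[2], q3 = row[3];
    kernel(&q0, &q1, &q2, &q3);
    row[0] = q0;
    row[1] = q1;
    row[2] = q2;
    row[3] = q3;
  }
}

/* Entry point: the kernel is a constant here, so an optimizing compiler
 * can inline both the wrapper and the kernel. */
void toy_tx4_rows(int32_t *buf, int rows) {
  tx4_rows(buf, rows, toy_butterfly_kernel);
}
```

      Whether everything is actually inlined depends on the compiler and
      optimization level; inspecting the generated assembly is the way to
      confirm it.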
      
      Change-Id: I3a89d6499e1972967dcccf397507676ee57ee33b
    • daala_tx: Remove +1/-1 butterflies from 4-point tx · 3f5bbc5e
      Timothy B. Terriberry authored
      This fixes a potential overflow when using the 4-point Type VII DST
      as the row transform in a 4x16 transform block.
      
      Results on subset1:
      
      https://arewecompressedyet.com/?job=%402017-12-03T01%3A27%3A43.842Z&job=%402017-12-03T01%3A27%3A43.842Z%402017-12-03T01%3A29%3A23.170Z
      
        PSNR | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0113 |   0.0367 | 0.0063 |  0.0013 |     0.0182
      
      Change-Id: Ib8ca6a2e06cd7d1b625cbbadcded2488eececd9c
    • daala_tx: Add SIMD version of the 4-point DCT · 2e90b44e
      Timothy B. Terriberry authored
      Currently this only requires SSSE3 operations, but we build it as
      AVX2 to get support for 3-operand instructions. Separate versions
      for different instruction sets will be added later.
      
      Change-Id: Ib02c1496832923ecf6dccc1a208dc5ac5559dad2
    • daala_tx: Add SIMD verification code · 1b65571b
      Timothy B. Terriberry authored
      This calls the C version of the transforms and verifies that the
      SIMD is a bit-exact match on every invocation. It must be manually
      enabled by editing the code to define DAALA_TX_VERIFY_SIMD. This is
      intended to be replaced by real unit tests in the future.
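
      A minimal sketch of this kind of check, in plain C. The switch
      TOY_TX_VERIFY_SIMD stands in for DAALA_TX_VERIFY_SIMD, and the toy
      transforms and helper names are hypothetical; the "optimized" function
      is ordinary C playing the role of the SIMD path.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical compile-time switch, standing in for DAALA_TX_VERIFY_SIMD. */
#define TOY_TX_VERIFY_SIMD 1

typedef void (*tx4_fn)(const int32_t *in, int32_t *out);

/* Reference C version of a toy 4-point transform. */
static void toy_tx4_c(const int32_t *in, int32_t *out) {
  out[0] = in[0] + in[3];
  out[1] = in[1] + in[2];
  out[2] = in[1] - in[2];
  out[3] = in[0] - in[3];
}

/* Stand-in for the SIMD version: same result, different evaluation order. */
static void toy_tx4_opt(const int32_t *in, int32_t *out) {
  for (int i = 0; i < 2; i++) {
    out[i] = in[i] + in[3 - i];
    out[3 - i] = in[i] - in[3 - i];
  }
}

/* Returns 1 when both versions produce bit-identical output for `in`. */
static int tx4_outputs_match(tx4_fn ref, tx4_fn opt, const int32_t *in) {
  int32_t ref_out[4];
  int32_t opt_out[4];
  ref(in, ref_out);
  opt(in, opt_out);
  return memcmp(ref_out, opt_out, sizeof(ref_out)) == 0;
}

/* Entry point: runs the optimized path and, when verification is enabled,
 * asserts on every invocation that it matches the C reference. */
void toy_tx4(const int32_t *in, int32_t *out) {
  toy_tx4_opt(in, out);
#if TOY_TX_VERIFY_SIMD
  assert(tx4_outputs_match(toy_tx4_c, toy_tx4_opt, in));
#endif
}
```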
      
      Change-Id: I2c09c8a476cce21a9f48f9d7120185bfa7af42aa
    • daala_tx: Add inverse TX SIMD dispatch · 18c803fa
      Timothy B. Terriberry authored
      This just adds a top-level daala_inv_txfm_add_avx2(), but no actual
      SIMD functions yet. It dispatches back to the C version for all TX
      types and sizes for the moment.
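
      One way such a dispatcher could look is a table of function pointers
      with empty entries that fall back to the C version. This is a sketch
      under that assumption, not the layout of daala_inv_txfm_add_avx2();
      the table shape, the function signature, and all names are
      hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, deliberately tiny table dimensions. */
enum { TOY_TX_SIZES = 2, TOY_TX_TYPES = 2 };

typedef void (*inv_txfm_add_fn)(const int32_t *coeffs, uint8_t *dst, int n);

/* Counts fallback calls so the behaviour is observable in a test. */
static int c_fallback_calls = 0;

/* Toy C path: adds the residual to the destination with 8-bit clamping. */
static void toy_inv_txfm_add_c(const int32_t *coeffs, uint8_t *dst, int n) {
  c_fallback_calls++;
  for (int i = 0; i < n; i++) {
    const int32_t v = dst[i] + coeffs[i];
    dst[i] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
  }
}

/* No SIMD kernels registered yet: every entry is NULL. */
static const inv_txfm_add_fn simd_table[TOY_TX_SIZES][TOY_TX_TYPES] = {
  { NULL, NULL },
  { NULL, NULL },
};

/* Top-level dispatch: use the SIMD kernel when one exists, else C. */
void toy_inv_txfm_add_simd(const int32_t *coeffs, uint8_t *dst, int n,
                           int tx_size, int tx_type) {
  assert(tx_size >= 0 && tx_size < TOY_TX_SIZES);
  assert(tx_type >= 0 && tx_type < TOY_TX_TYPES);
  inv_txfm_add_fn fn = simd_table[tx_size][tx_type];
  if (fn == NULL) fn = toy_inv_txfm_add_c;
  fn(coeffs, dst, n);
}
```

      With this shape, adding a SIMD kernel later only means filling in one
      table entry.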
      
      Change-Id: I7a578a4af363f989615d01ea67ce031d8ceff977
    • Add the speed feature structure for codec dev · b49c6aea
      Jingning Han authored
      This commit restructures the speed feature setup for codec
      development. Instead of progressively reducing encoder
      complexity at the expense of incremental coding loss, we allow a
      separate set of speed features, each corresponding to a certain
      category of coding units:
      
      1 << 0: transform coding
      1 << 1: inter prediction
      1 << 2: intra prediction
      1 << 3: block partition
      1 << 4: loop filters
      1 << 5: rd early skip
      
      Bits 6 and 7 are left open for future adjustment.
      
      It is constructed to facilitate codec development.
      When working on a coding function, one could choose to turn on
      the speed features of one or more less related coding units to
      speed up the evaluation process. For example, to test a transform
      related experiment, one could set
      --dev-sf=2, --dev-sf=6, or --dev-sf=22
      which correspond to turning on:
      2 - inter prediction speed features,
      6 - both inter and intra prediction speed features,
      22 - inter prediction, intra prediction, and loop filter speed features.
      
      The goal is to allow faster experimental verification during the
      development process. Once the experiment is in a stable state, we
      can evaluate its performance at speed 0 with a higher confidence level.
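
      The bit layout above can be decoded with a few lines of C. This is a
      sketch for illustration only: the category names come from the list
      above, but the helper names and the table are hypothetical and do not
      reflect how the encoder parses --dev-sf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEV_SF_CATEGORIES 6

/* Category for each bit position, in the order listed in the message. */
static const char *const dev_sf_names[DEV_SF_CATEGORIES] = {
  "transform coding", "inter prediction", "intra prediction",
  "block partition",  "loop filters",     "rd early skip",
};

/* Returns 1 when the category at bit position `bit` is enabled. */
int dev_sf_bit_is_set(int dev_sf, int bit) {
  assert(bit >= 0 && bit < DEV_SF_CATEGORIES);
  return (dev_sf >> bit) & 1;
}

/* Writes the enabled category names into buf, separated by ", ", and
 * returns how many were written. Bits 6 and 7 are ignored. */
int dev_sf_describe(int dev_sf, char *buf, size_t buf_size) {
  assert(buf != NULL && buf_size > 0);
  int count = 0;
  size_t used = 0;
  buf[0] = '\0';
  for (int bit = 0; bit < DEV_SF_CATEGORIES; bit++) {
    if (!dev_sf_bit_is_set(dev_sf, bit)) continue;
    const int n = snprintf(buf + used, buf_size - used, "%s%s",
                           count ? ", " : "", dev_sf_names[bit]);
    /* Stop when out of space; buf may then end in a truncated name. */
    if (n < 0 || (size_t)n >= buf_size - used) break;
    used += (size_t)n;
    count++;
  }
  return count;
}
```

      For example, 22 is 16 + 4 + 2, so dev_sf_describe(22, ...) reports
      inter prediction, intra prediction, and loop filters.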
      
      Change-Id: Ib46c7dea2d2a60204c399dc01f10262c976adf0d
    • Added monochrome option to the decoder. · 730c8054
      Imdad Sardharwalla authored
      When this is set (use --monochrome), all decoded frames
      will be given constant chroma planes.
      
      If the rawvideo option is used in conjunction with the
      monochrome option (i.e. --monochrome --rawvideo), the
      written output will only consist of the Y (luma) plane.
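
      The behaviour described above can be sketched as follows. The frame
      struct and function names are hypothetical, and the constant chroma
      value of 128 (neutral for 8-bit video) is an assumption; the message
      does not say which constant the decoder uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 128 is the neutral chroma value for 8-bit samples. */
#define TOY_NEUTRAL_CHROMA 128

/* Hypothetical frame layout: three contiguous 8-bit planes. */
typedef struct {
  uint8_t *y, *u, *v;
  size_t y_size;  /* bytes in the luma plane */
  size_t uv_size; /* bytes in each chroma plane */
} toy_frame;

/* Monochrome: overwrite both chroma planes with a constant. */
void toy_make_monochrome(toy_frame *f) {
  assert(f != NULL && f->u != NULL && f->v != NULL);
  memset(f->u, TOY_NEUTRAL_CHROMA, f->uv_size);
  memset(f->v, TOY_NEUTRAL_CHROMA, f->uv_size);
}

/* Raw output: with monochrome set, only the Y plane is written.
 * Returns the number of bytes copied into `out`. */
size_t toy_write_rawvideo(const toy_frame *f, int monochrome, uint8_t *out) {
  assert(f != NULL && out != NULL);
  size_t n = 0;
  memcpy(out + n, f->y, f->y_size);
  n += f->y_size;
  if (!monochrome) {
    memcpy(out + n, f->u, f->uv_size);
    n += f->uv_size;
    memcpy(out + n, f->v, f->uv_size);
    n += f->uv_size;
  }
  return n;
}
```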
      
      Change-Id: I967817f1c3ebb1162fa9771b51cf6431120b835c
    • Remove inter mode context dependency on mvs · 835a49ec
      Jingning Han authored
      This commit resolves the inter mode context model dependency on
      the reconstructed motion vectors.
      
      Change-Id: I3fd885dba6c10be8b1dcd072c1a5b3925ef4d1f5
    • [lv_map_multi] simplify update_cdf · b79f1b67
      Dake He authored
      Remove tmp0 in update_cdf due to the use of EC_MIN_PROB introduced by
      Thomas Davies.
      
      Further changes to update_cdf include:
      1. Start the rate at 3+get_msb(nsymbs) and increase the rate by one at
      counts 16 and 32.
      2. Check if tmp is less than cdf[i] to avoid shifting a negative number.
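
      The two changes can be sketched in isolation as below. This is not the
      patched update_cdf: the helper names are hypothetical, and reading
      "at counts 16 and 32" as count >= 16 and count >= 32 is an
      assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the most significant set bit (hypothetical local helper). */
static int toy_get_msb(unsigned int n) {
  assert(n > 0);
  int msb = 0;
  while (n >>= 1) msb++;
  return msb;
}

/* Change 1: the rate starts at 3 + get_msb(nsymbs) and grows by one once
 * the symbol count reaches 16, and by one more once it reaches 32. */
int toy_cdf_adaptation_rate(int nsymbs, int count) {
  assert(nsymbs >= 2 && count >= 0);
  return 3 + toy_get_msb((unsigned int)nsymbs) + (count >= 16) +
         (count >= 32);
}

/* Change 2: move `value` toward `target` by 1/2^rate of the gap, comparing
 * first so the shifted difference is never negative. */
uint16_t toy_move_toward(uint16_t value, uint16_t target, int rate) {
  if (target < value) {
    return (uint16_t)(value - ((value - target) >> rate));
  }
  return (uint16_t)(value + ((target - value) >> rate));
}
```

      Right-shifting a negative signed value is implementation-defined in C,
      which is why the comparison comes before the subtraction.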
      
      Change-Id: I5088ebd450d6e57ec6c3e92bb2f47a078489b947
    • Fix txb_skip context model · 4ca633dc
      Jingning Han authored
      Change-Id: I2ad279d27fb34c9c6bcee6029a40377541f066a7
    • Set up txb size properly for TX64X64 · a9ba58ec
      Angie Chiang authored
      TX64X64 uses a 32x32 coeff buffer.
      
      Change-Id: Ied4279807207176d590af4c1fc4bb648a618d158
    • Check if tx_type is valid in av1_get_tx_type() · 2ac1868b
      Angie Chiang authored
      Change-Id: I717bcec45e061e9685c00282f1c2a4d53a3481ef
    • Use macro to set txk_type · bce07f1c
      Angie Chiang authored
      This lets txk_sel support a maximum bsize of 128x128.
      
      Change-Id: I33941966cb1ae4406ac68a2124c859c833a084d8
    • Parse skip mode stats from aom_entropy_optimizer · 8c7bd928
      Zoe Liu authored
      Change-Id: I72a01937abc3ad5a1ddd5f5ef1ea79e2320343ad
    • [daala_tx] Add new flattened 8-point Type-IV DST. · 9ad3343c
      Nathan E. Egge authored
      This 8-point Type-IV DST uses the same computation graph as the
      asymmetric 8-point Type-IV DST with the following changes:
      
        - The fDST and iDST contain different multiplication constants
        - The fDST does not reuse the passed shifts in the first additions
        - The iDST does not have any OD_RSHIFT1(t_) on the last additions
      
      This reused computation structure could later be pulled into a macro or
      exploited by a hardware implementation.
      
      Change-Id: Iac09c29549ce5dcf7752f71e9e6d24609e7b018a
  2. 03 Dec, 2017 4 commits
  3. 02 Dec, 2017 16 commits
    • Turn on 64x16/16x64 tx with tx64x64 + rect-tx-ext · edbe7d1f
      Debargha Mukherjee authored
      Change-Id: Ifa0b0c56fd1454d6c856486c96092ed1d3f1b4b9
    • Support 16x64 and 64x16 fwd/inv transforms · 0254fee3
      Debargha Mukherjee authored
      Change-Id: I57e6cd7ca71e975082b1431b0cf80d080cabeb9b
    • Support 64x16 / 16x64 transform tables · 3f921084
      Debargha Mukherjee authored
      Adds various tables, scan patterns etc. for 16x64 and 64x16
      transforms.
      Also adds scan tables for previously missing 4:1 transforms
      for intra.
      Also adds missing CDFs for filterintra with tx64x64.
      
      Change-Id: I8b16e749741f503f13319e7b7b9685128b723956
    • Correct lossless HBD/LBD mismatch segfault in daala_inv_tx · 440a78b8
      Monty Montgomery authored
      The lossless mode special case that dispatches out to the inv 4x4 WHT
      transform in Daala TX had the HBD and LBD dispatch cases backward due
      to a rebase error.
      
      Change-Id: If77b298834b1a51348fe08702a5144ea5b66df71
    • decodeframe.c: reduce scope of iterators · 3e4068e7
      Sebastien Alaiwan authored
      Change-Id: I6a384bf0b5adcdbc22d67a08a9d99a0bed1fdd6d
    • intrabc: fix SB index calculation in RDO · 8de99a6e
      Hui Su authored
      The calculation was wrong when ext-partition is on and sb_size=64,
      potentially causing a big compression loss.
      
      Change-Id: I39cba439811bc0ab7c5532842887cf82bb3b5657
    • OBU type/metadata disambiguation. · 3e632744
      Tom Finegan authored
      - OBU_TD => OBU_TEMPORAL_DELIMITER
      - METADATA_TYPE => OBU_METADATA_TYPE
      - Prefix OBU_METADATA_TYPE enum vals with "OBU_".
      
      BUG=aomedia:1046
      
      Change-Id: I0c63d36b77905520e427e6b77fbf4cbedabc7e51
    • Modify the warped motion mode context · 3afbf3fb
      Yunqing Wang authored
      Modified the warped motion mode context based on neighbor's motion modes
      and current block's mode.
      
      Change-Id: I77ca35fab37ec640bb38661ff1799f643d5aafdc
    • Add new 8-point Type-II DCT implementation. · 098fc0b0
      Nathan E. Egge authored
      subset-1:
      
      new_dct4@2017-11-27T20:52:07.119Z -> new_dct8@2017-11-27T23:57:04.520Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0030 |  0.0566 | -0.1127 |  -0.0244 | -0.0078 | -0.0154 |     0.0026
      
      Change-Id: I1fde36a5ed454a50acf81004a618fc0a0c8c9073
    • Add new 4-point Type-II DCT implementation. · ede85d4c
      Nathan E. Egge authored
      subset-1:
      
      master@2017-11-27T19:24:03.517Z -> new_dct4@2017-11-27T20:52:07.119Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      -0.0113 |  0.0459 |  0.1285 |   0.0085 | 0.0005 |  0.0093 |     0.0352
      
      Change-Id: I0a76037ea2a08071ca9c4013979cca3ee3efe55c
    • Rd fix for returning skip correctly · 9c8decb5
      Debargha Mukherjee authored
      Change-Id: I7f108fce272b5bf416836d99430f07af801daada
    • Extend frame marker bits from 4 to 5 · d300f0e4
      Cheng Chen authored
      Although four bits are enough to represent current distances, since
      the Golden Frame Group size is 16, we use 5 bits for flexibility and
      allow frame distances up to 32.
      
      BUG=aomedia:1072
      
      Change-Id: I9f413baffd656eb8bd54333ba31a4e33faefd57a
    • Make OBU types part of the public API. · 95d900a2
      Tom Finegan authored
      And do so unconditionally: It's harmless to allow the
      types to be defined without CONFIG_OBU enabled.
      
      BUG=aomedia:1046
      
      Change-Id: I5b9a3a68e4e70b07137e381f05345d2ea609a09a
    • [lv_map_multi] Simplified multisymbol BR coding · 7d01ab54
      Dake He authored
      Multisymbol BR coding is simplified as follows:
      1. Computation of level counts is removed by using a template of size 8.
      2. Context is derived by using a template of size 3.
      3. lps and eob probabilities are trained.
      4. Contexts are shared between TX_16X16 and above.
      
      The number of probability values used in BR coding is reduced from 1152 to 378.
      
      Change-Id: I0419127e871f9e566c2489aa4b1825c5364aec5a
    • Overwrite frame level skip mode flag if no usage · 8a5d3437
      Zoe Liu authored
      Add a block level usage flag for skip mode. If no block has chosen the
      skip mode, the frame level flag for skip mode will be set off.
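
      The override can be sketched as a pass over the per-block usage flags.
      The function name and the flat array of flags are hypothetical; the
      patch may track usage differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the frame-level skip mode flag to signal: it stays on only when
 * it was on to begin with and at least one block chose skip mode. */
int toy_resolve_skip_mode_flag(int frame_skip_mode_flag,
                               const uint8_t *block_used_skip_mode,
                               size_t num_blocks) {
  assert(block_used_skip_mode != NULL || num_blocks == 0);
  if (!frame_skip_mode_flag) return 0;
  for (size_t i = 0; i < num_blocks; i++) {
    if (block_used_skip_mode[i]) return 1;
  }
  return 0; /* no block used skip mode: turn the frame flag off */
}
```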
      
      This patch also includes a small code cleanup, including a check on
      whether the best RD mode is aligned with skip mode; if so, the best
      RD mode is replaced by skip mode.
      
      This patch slightly improves the coding performance of ext-skip.
      
      Change-Id: If06092d5e32f15e63dcb5f35d32e68bc0f827c2b
    • Correct the skip rate in set_skip_flag for lv_map · 4639e080
      Angie Chiang authored
      Change-Id: I584694374a2468e0dcfe6e4fdb2582e5cae051ef
  4. 01 Dec, 2017 3 commits