  1. 18 Dec, 2017 1 commit
    • Speed up by dropping some ref frames in compound search · c683bf9b
      Cheng Chen authored
      Record the distortion of each single reference frame in the RD search and
      rank the references by distortion. Then, in compound search, drop the
      combination of the two reference frames with the largest and second
      largest distortions.
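
The pruning rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoder's code: the function name `prune_compound_pairs`, the dictionary of distortions, and the idea that every pair of references is a compound candidate are all hypothetical simplifications.

```python
from itertools import combinations


def prune_compound_pairs(single_ref_distortion):
    """Return candidate compound pairs, skipping the pair formed by the
    two references with the largest single-reference distortion.

    single_ref_distortion: dict mapping a reference name to the distortion
    recorded for it in single-reference search (hypothetical input format).
    """
    # Rank references from largest to smallest distortion.
    ranked = sorted(single_ref_distortion,
                    key=single_ref_distortion.get, reverse=True)
    worst_pair = frozenset(ranked[:2])
    # Assumption: every unordered pair is a compound candidate.
    all_pairs = combinations(sorted(single_ref_distortion), 2)
    return [pair for pair in all_pairs if frozenset(pair) != worst_pair]


# Illustrative distortion values only; they are not measured data.
distortion = {"LAST": 120, "GOLDEN": 340, "ALTREF": 95, "BWDREF": 410}
pairs = prune_compound_pairs(distortion)
```

With these made-up values, the pair of GOLDEN and BWDREF is the one skipped, and the remaining pairs are still searched.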
      
      This patch shows neutral performance on the Google lowres test set
      with 20 frames.
      
      Local tests show ~5% speed up over baseline.
      
      Change-Id: I722fe66a0551f5f8a044c57c55caa74e46db7ee8
  2. 17 Dec, 2017 4 commits
    • daala_tx: Flip the names on od_rotate kernels. · a25c0300
      Nathan E. Egge authored
      The od_rotate kernels were named after the trailing operation (where
       od_rotate_sub() meant that the last operation was a subtraction),
       which became less intuitive after we folded the leading addition
       operation into the kernel.
      This patch simply swaps the operations and kernel labels and makes no
       change to the actual transform outputs.
      
      Change-Id: I3d23f1d4aa9e47c78c849dcc6497f099ddcb3574
    • daala_tx: Fold the addition into od_rotate kernel. · a0bec9d1
      Nathan E. Egge authored
      Move the od_add() and od_sub() calls into the od_rotate_sub() and
       od_rotate_add() kernels respectively as suggested by Frank Bossen.
      The number of kernels is reduced by introducing a flag to indicate which
       of the two versions it is (addition or subtraction).
      Additional named flags are added for the slight kernel variations such
       as the averaging or shifting steps which should be correctly inlined by
       the compiler.
      Extra defines are used to add human readable names for the kernel types.
      The named SHIFT parameter is still passed so that the orthonormal and
       asymmetric DST kernels can be combined in a later patch.
      This patch has no change to metrics.
      
      Change-Id: I11383b11a5e898b519fcb89d5c23bcf7934d94a2
    • daala_tx: Unify the asym and ortho DST designs. · b2f82ebd
      Nathan E. Egge authored
      This patch refactors the DST transforms so that the orthonormal and
       asymmetric transforms are now nearly identical (up to multiplication
       constants and an extra set of shifts).
      This means that the DST designs are now embeddable for every level
       and should address hardware concerns about gate area.
      
      In addition, minor changes were made to improve transform accuracy:
      
       - all of the transforms now have perfect reconstruction for those
          computations outside the rotations, i.e., all +/- butterfly steps
          are exactly invertible
       - two multiplication constants were reduced below 1.0 (better for
          SIMD and gives slightly improved accuracy)
       - the averaging bias is removed which saves an extra addition for each
          of the averaging steps
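
To illustrate what "exactly invertible" means for a +/- butterfly with an averaging step, here is a minimal sketch of a generic integer lifting butterfly. It is an assumption-laden illustration, not the Daala kernels themselves: the names `butterfly_fwd` and `butterfly_inv` are hypothetical, and the shift without a rounding offset is only meant to mirror the idea of averaging without a bias term.

```python
def butterfly_fwd(a, b):
    """Forward lifting butterfly on integers (generic sketch)."""
    d = a - b          # difference
    s = b + (d >> 1)   # floor((a + b) / 2), no rounding offset is added
    return s, d


def butterfly_inv(s, d):
    """Inverse: undo the lifting steps in reverse order."""
    b = s - (d >> 1)   # the same shifted value is subtracted back exactly
    a = b + d
    return a, b


s, d = butterfly_fwd(7, -4)
restored = butterfly_inv(s, d)
```

Because the inverse subtracts exactly the quantity the forward step added, the round trip is lossless for any integer inputs, regardless of how the shift rounds.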
      
      Additional averaging steps could be removed from the 8-point Type-IV DST,
       giving a 68% reduction in MSE for the 32-point DCT, but this has not been
       done in case we use it in place of the 8-point Type-VII DST.
      
      subset-1:
      
      master-daala_tx@2017-12-10T22:38:19.651Z ->
       new-daala_tx@2017-12-10T22:37:50.844Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      0.0057 | -0.0210 | -0.1821 |   0.0085 | -0.0002 |  0.0147 |    -0.0674
      
      Change-Id: Ib124eebf6f2e4b3c51c078d4e8f229fc5ec26171
    • Remove experimental flag of PALETTE_DELTA_ENCODING · e6579113
      Frederic Barbier authored
      This experiment has been adopted, so we can simplify the code
      by dropping the associated preprocessor conditionals.
      
      Change-Id: Idec45a597398ff4fddc6a040c3d7cb3a3c0029d6
  3. 16 Dec, 2017 13 commits
    • hash-me: use precise mv cost calculation · f0463bb8
      Hui Su authored
      0.12% gain on screen_content testset; also simplified the code a bit.
      
      Change-Id: If3ecca55d73a69f86320c0a4ea052a831d89d15a
    • Remove masked_tx completely from config and cmake file · 6b51a40d
      Yue Chen authored
      Change-Id: I942dbcf801649986468e51c39f12d3b01f269042
    • Remove the unused code on frame_distortion · c79088fb
      Zoe Liu authored
      Change-Id: I965ba1e9a4e286769ed1d492950ca8d6f7c73678
    • Make TX64X64 work with CONFIG_TXMG=0 · dc250207
      Urvang Joshi authored
      And remove the corresponding workaround.
      
      BUG=aomedia:1058
      
      Change-Id: I2b08f536afdb3434ce451b58ea392eeef634ea48
    • Correct scale factor for TX_16X64 and TX_64X16. · a30b9ec6
      Urvang Joshi authored
      BUG=aomedia:1114
      
      Change-Id: I7fbeb4c2da996801b945304e182403ec325f95bc
    • Rectangular transforms: smaller dim first always. · 15b0113b
      Urvang Joshi authored
      This is true independent of CONFIG_TXMG flag, so no need for the other
      code path.
      
      BUG=aomedia:1114
      
      Change-Id: I572c5151ca866d9d430460fb353610540c9bf025
    • Add filter delay for intraBC · 7b88ade6
      Cheng Chen authored
      Because of the loop filter, the bottom 8 rows and the rightmost 8 columns
      of the IntraBC area are now invalid. This is equivalent to offsetting the
      valid region by the filter delay.
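
A minimal sketch of this validity check follows, assuming a simplified rectangular model of the reconstructed area. The function `ref_block_is_valid` and its parameters are hypothetical and only illustrate shrinking the valid region by the 8-pixel filter delay; the actual encoder logic is not shown in this log.

```python
FILTER_DELAY = 8  # pixels; the value stated in the commit message


def ref_block_is_valid(ref_x, ref_y, width, height,
                       avail_right, avail_bottom, delay=FILTER_DELAY):
    """Check whether an IntraBC reference block lies in the valid region.

    avail_right / avail_bottom are exclusive bounds of the area that has
    already been reconstructed (a hypothetical, simplified model). The
    last `delay` rows and columns are excluded because the loop filter
    may still modify them.
    """
    return (ref_x >= 0 and ref_y >= 0
            and ref_x + width <= avail_right - delay
            and ref_y + height <= avail_bottom - delay)


inside = ref_block_is_valid(0, 0, 16, 16, 64, 64)
too_far_right = ref_block_is_valid(48, 0, 16, 16, 64, 64)
```

In this model a 16x16 block starting at x = 48 would have fit before the change, but is rejected once the rightmost 8 columns are excluded.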
      
      Change-Id: Ia91a5b3e81279166dc97a60a7fb6fbda3f2df138
    • Remove b_mode_info structure · b8b2a0ec
      Jingning Han authored
      This structure was designed for sub8x8 blocks. It is deprecated as
      cb4x4 lands.
      
      Change-Id: Ied1dbc3fba4c503c00c59cb749e8ddc1ed2b580e
    • Fix joint compound mode weight assignment · ec2fbea8
      Jingning Han authored
      Fix the weighting coefficients for cases where the last reference
      frame is closer than the future reference frame.
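
As a rough illustration of distance-based weighting, the sketch below gives the temporally closer reference the larger weight. It is a floating-point simplification under stated assumptions: the function `distance_weights` and the use of frame order numbers are hypothetical, and the codec's actual weight values are not given in this log.

```python
def distance_weights(cur, ref0, ref1):
    """Return (w0, w1), weights for two references, inversely
    proportional to their temporal distance from the current frame.

    cur, ref0, ref1 are display-order frame numbers (assumed inputs).
    """
    d0 = abs(cur - ref0)
    d1 = abs(cur - ref1)
    if d0 + d1 == 0:
        return 0.5, 0.5  # degenerate case: fall back to a plain average
    # The closer reference (smaller distance) receives the larger weight.
    return d1 / (d0 + d1), d0 / (d0 + d1)


# Last reference one frame back, future reference four frames ahead.
w_last, w_future = distance_weights(cur=4, ref0=3, ref1=8)
```

Here the last reference is closer than the future one, so it receives the larger share of the prediction.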
      
      Change-Id: I52f7f9fc43d4887bfa085b0cd27959d9412b8714
    • Deprecate the use of bmi structure from av1 codec · 2fac8a41
      Jingning Han authored
      Change-Id: I7f5010ae3b9a014b3dca0425c9eada3b9e2c0ab3
    • Properly update global motion counts · 909e0f60
      Jingning Han authored
      Unify the global motion count for all coding block sizes.
      
      Change-Id: Ifbbbe6ad74de0a40c9f3f4a96672f54a5b18dfc6
    • Support ext-skip for both low delay and high delay · 104d62e1
      Zoe Liu authored
      For both low delay and high delay scenarios, the reference pair in
      skip mode is specified as the closest fwd ref together with the
      closest bwd ref if there is any bwd ref, and otherwise as the two
      closest fwd refs.
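
The selection rule above can be sketched as follows. This is an illustration under stated assumptions: the function `pick_skip_mode_refs` is hypothetical, references are represented by display-order numbers, and "fwd" is taken to mean references earlier than the current frame while "bwd" means later ones.

```python
def pick_skip_mode_refs(cur, ref_orders):
    """Pick the skip-mode reference pair from display-order numbers.

    Returns (closest fwd, closest bwd) when a bwd ref exists, otherwise
    the two closest fwd refs, or None if no valid pair can be formed.
    """
    refs = set(ref_orders)
    # Assumption: fwd refs precede the current frame, bwd refs follow it.
    fwd = sorted((r for r in refs if r < cur), key=lambda r: cur - r)
    bwd = sorted((r for r in refs if r > cur), key=lambda r: r - cur)
    if not fwd:
        return None
    if bwd:
        return fwd[0], bwd[0]
    if len(fwd) >= 2:
        return fwd[0], fwd[1]
    return None


high_delay_pair = pick_skip_mode_refs(4, [3, 1, 8, 16])
low_delay_pair = pick_skip_mode_refs(4, [3, 1, 2])
```

The first call models a high delay case with future references available, and the second a low delay case where all references lie in the past.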
      
      Skip mode by default uses COMPOUND_AVERAGE. When all the reference
      frames are on the same side, temporal-distance weighted compound is
      considered, and a compound index is signaled to indicate whether
      distance-weighted compound or compound-average is used.
      
      Whether to use distance-weighted compound for skip mode is still
      under experimentation, hence a flag is temporarily added:
      SKIP_MODE_WITH_JNT_COMP.
      
      The following experimental results were obtained over 30 frames, using the
      setup of --disable-ext-partition --disable-ext-partition-types
      --disable-txmg --enable-jnt-comp --enable-mfmv --enable-ext-skip:
      
      (1) High Latency:
      For Google test sets (lowres/midres), the BDRate coding gain is ~0.2%;
      For AWCY, the coding gain is ~0.1%.
      (2) Low Latency:
      No gain has been observed over Google sets and ~0.1% gain is obtained
      only when temporal-distance weighted prediction is used.
      
      Change-Id: I8c433357adebed0126ebfdd5c4d51aa71e64be57
    • Separate inter and intra new-quant profiles · 7640ee42
      Sarah Parker authored
      This also adds some tuning to the intra parameters. The current
      gains are 0.22% on lowres.
      
      Change-Id: I923134096cda608672d2fba7771c1f7a9fbc8efe
  4. 15 Dec, 2017 19 commits
  5. 14 Dec, 2017 3 commits