1. 01 Mar, 2017 1 commit
  2. 28 Feb, 2017 7 commits
    • Turn on SIMD implementation of av1_fht32x32 · e4f98f67
      Angie Chiang authored
      Change-Id: Ie1bfece43c81ee5d149ed25c3f7fd959a8f95030
      e4f98f67
    • Add SIMD code for PVQ search · 3a88de8f
      Michael Bebenita authored
      This reduces pvq_search_rdo_double's share of the runtime profile from
      37% to 15% and improves overall encoding speed when PVQ is enabled by ~40%.
      The SIMD code is not bit accurate with the C version and introduces a
      slight PSNR regression on AWCY:
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
      0.0607 |  0.1044 |     N/A |   0.0126 |  N/A | -0.0309 |        N/A
      
      Change-Id: Ie22cebc62df2e72618305f2268668d79167860c6
      3a88de8f
    • Add av1_cost_coeffs_txb() for lv_map experiment · 47c72189
      Angie Chiang authored
      Change-Id: I44842387207b19f8e0c3894d3f4e8d0646a4cafd
      47c72189
    • Assign offsets correctly to compute warped motion · 246d2737
      Debargha Mukherjee authored
      Offsets for the least-squares affine motion computation
      are now set at the top-left corner of the current block.
      
      Improves stability and performance a little.
      
      Change-Id: I68ca7e74c6102502daa8ca3373af2b2dd59400c3
      246d2737
    • Disable compound mode in sub8x8 coding blocks · c41a549a
      Jingning Han authored
      Disable the support of compound prediction modes for sub8x8 coding
      blocks. Make the rate-distortion optimization process account for
      such constraints.
      
      With the use of 2x2 chroma prediction blocks, this makes the worst-case
      number of inter predictors the same as in VP9. It affects the coding
      gains by 0.35% for lowres, 0.17% for midres, and 0.08% for hdres.
      
      The encoding speed is up by 10%.
      
      Change-Id: Ieb2a83030676911baa403e586f1f800cbf485d81
      c41a549a
    • Use correct segment · 1e2aae1a
      Yaowu Xu authored
      The segment-based lossless flag is used in transform size selection;
      this commit fixes a bug where the wrong segment_id was used in that
      selection.
      
      BUG=aomedia:350
      
      Change-Id: Ibc981c779739849bac00447155180abbd319eb28
      1e2aae1a
    • Move asserts into correct scope · cdf8a14e
      Yaowu Xu authored
      The macro used in the assert is defined under CONFIG_VAR_TX. This fixes a
      build issue when --enable-var-tx and --enable-rd-debug are both on.
      
      Change-Id: I497fe4a8b1fa6c7b05ac2b41c97522f7bdedc0ce
      cdf8a14e
  3. 27 Feb, 2017 11 commits
    • Remove redundant return in set_offsets · 44701f2c
      Angie Chiang authored
      Change-Id: Idf8f03052a7e21b8a273986204038545573d7962
      44701f2c
    • Better block center in gm_get_motion_vector fn · f6dd3c68
      Debargha Mukherjee authored
      Also supports homography models for future experiments.
      
      Change-Id: I4510540f54133e063891ed491c95c087222f7810
      f6dd3c68
    • Remove unnecessary #ifdef · d152fc04
      Adrian Grange authored
      The line of code is already within the scope
      of an #if CONFIG_EC_MULTISYMBOL.
      
      Change-Id: I62e28c8586f5d04a1e1be4ea5a2551d3123fde9f
      d152fc04
    • Adds macro to test cb4x4 w/o sub8x8 txtype search · 094c9439
      Debargha Mukherjee authored
      USE_TXTYPE_SEARCH_FOR_SUB8X8_IN_CB4X4 macro added to turn
      tx_type search on/off for sub8x8 in cb4x4 mode.
      
      The purpose is mainly to analyze the coding gains from cb4x4,
      but this can later be made into a speed feature as well.
      
      Change-Id: Ic22026c373eebba87f324689ac5686a2844315b6
      094c9439
    • Integerize warped motion computation · e6eb3b53
      Debargha Mukherjee authored
      Integerizes computation of the least squares for warped motion.
      The model is restricted to only Affine. Affine seems easiest
      to compute and integerize since it can be split into two 3-dim
      least squares problems, as opposed to rotation-zoom which needs
      a 4-dim least-squares problem to be solved.
      The current implementation requires only one division per block.
      
      BDRATE impact is minimal. The upgrade to the affine model improves
      coding efficiency, but integerization also degrades efficiency a
      little. Overall there is a net gain of about -0.07% BDRATE on
      the lowres set.
      BDRATE lowres: -1.113% with --enable-warped-motion vs. without
      (up from -1.044%).
      
      Change-Id: I6b9216ac0737d76f59054293eabee48e17739ec4
      e6eb3b53
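
The split mentioned in the "Integerize warped motion computation" message above can be written out as a short derivation. This is a sketch in illustrative notation (the symbols a..f, h_x, h_y, p_i and A are assumptions, not names from the libaom source), and it shows only the floating-point formulation, not the integerized one:

```latex
% Notation here is illustrative, not taken from the libaom source.
% Correspondences (x_i, y_i) -> (x'_i, y'_i), with p_i = (x_i, y_i, 1)^T.
\begin{align}
  x'_i &\approx a\,x_i + b\,y_i + c = h_x^{\top} p_i, &
  y'_i &\approx d\,x_i + e\,y_i + f = h_y^{\top} p_i \\
  E &= \sum_i \bigl(h_x^{\top} p_i - x'_i\bigr)^2
     + \sum_i \bigl(h_y^{\top} p_i - y'_i\bigr)^2 \\
  A\,h_x &= \sum_i x'_i\,p_i, &
  A\,h_y &= \sum_i y'_i\,p_i, \qquad
  A = \sum_i p_i p_i^{\top} \in \mathbb{R}^{3\times 3}
\end{align}
```

Because h_x and h_y appear in separate sums of E, each is a 3-dimensional least-squares problem, and both share the same matrix A. A rotation-zoom model ties the two coordinates together through shared parameters (in one common form, x' = a x - b y + c and y' = b x + a y + d), so its four unknowns have to be solved jointly. Sharing A also means a single det(A) can normalize both solutions, which is one plausible reading of the "one division per block" remark; the commit message does not state the exact method.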
    • Remove const from int ext_tx_set · 7640f5f3
      Yaowu Xu authored
      The variable is assigned a value later in the function.
      
      Change-Id: I93f283a134499a050b46d9dcd6f0c0b4e8d54049
      7640f5f3
    • Prefer using get_tx_size() · 7fcfee40
      Angie Chiang authored
      Change-Id: Ifcdd3ce2953c1ecb1d0962da412a4b5ba2cda912
      7fcfee40
    • Correct a macro · 345a22db
      Yaowu Xu authored
      --enable-lowbitdepth defines the flag CONFIG_LOWBITDEPTH, not
      CONFIG_AOM_LOWBITDEPTH.
      
      Change-Id: Ifa1c12847bee4978d08d010f4fc3601d75e59c31
      345a22db
    • Remove aom_realloc() · 7f094f10
      Alex Converse authored
      It only handles the realloc constraint (preserving low elements) by
      serendipity, and we don't actually rely on that behavior anyway.
      Meanwhile the calls may do extra copying that gets immediately clobbered
      by the callers.
      
      Cherry-pick from libvpx:
      3063c3760 Remove vpx_realloc()
      
      Change-Id: I8dfa89e4a81084b084889c27bd272fdf85184e8d
      7f094f10
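
The "Remove aom_realloc()" message above notes that callers did not rely on old contents being preserved. A minimal sketch of what such a caller can do instead, assuming a scratch buffer that is fully overwritten after resizing; `sketch_resize_scratch` is a hypothetical helper, not a libaom function:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: resize a scratch buffer whose old contents are not
 * needed. Unlike realloc(), nothing is copied from the old allocation. */
static void *sketch_resize_scratch(void *old_buf, size_t new_size) {
  assert(new_size > 0);    /* malloc(0) is implementation-defined */
  free(old_buf);           /* free(NULL) is a no-op */
  return malloc(new_size); /* contents are uninitialized; caller overwrites */
}
```

A caller would use it as `buf = sketch_resize_scratch(buf, n);` and check the result for NULL before writing to it.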
    • loop_restoration: Cleanup allocations · 232e3847
      Alex Converse authored
      Change-Id: Id3824c09cbaae814df1d8fb029215f28e8c7a6b1
      232e3847
    • CLPF: Add quality dependent damping in the constrain function · 4305e6be
      Steinar Midtskogen authored
      PSNR YCbCr:  -0.17%     -0.03%     -0.40%
      APSNR YCbCr: -0.17%     -0.02%     -0.39%
      PSNRHVS:     -0.06%
      SSIM:        -0.17%
      MSSSIM:      -0.07%
      CIEDE2000:   -0.12%
      
      Change-Id: I69a4b6a4e18c22c3930069396540a6fee45cb30d
      4305e6be
  4. 25 Feb, 2017 6 commits
  5. 24 Feb, 2017 15 commits
    • Add lv_map transform coefficient coding function · 80b82269
      Angie Chiang authored
      Change-Id: I70c3659940b5090f030c795df5148ac508e19d2d
      80b82269
    • Add txb_common.h · 971a5963
      Angie Chiang authored
      This file includes common context generating functions of lv_map.
      
      Change-Id: I7aea78e48cd5003738445b5635120cbc3825ef05
      971a5963
    • Add probability/count tables for lv_map experiment · bd57fc55
      Angie Chiang authored
      Change-Id: Ie73bb51d4a24c2ff719758c38e303db92e6f4500
      bd57fc55
    • Remove redundant loop in ctx_reset · 98bc74ca
      Luc Trudeau authored
      Merges two consecutive loops that iterated over TX_SIZES. There's no
      impact to the bitstream. The 4 used as the termination threshold in the
      second loop is equivalent to TX_SIZES.
      
      Change-Id: Ic891d209b28f20907d53bcdd58139fe39c37b0fa
      98bc74ca
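
The loop merge described in the "Remove redundant loop in ctx_reset" message above follows a common pattern. A minimal sketch, assuming a hypothetical context struct with two per-transform-size arrays; the type, field, and function names are illustrative and do not come from the libaom source:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value, per the commit's note that 4 is equivalent to TX_SIZES. */
#define SKETCH_TX_SIZES 4

typedef struct {
  int counts_a[SKETCH_TX_SIZES];
  int counts_b[SKETCH_TX_SIZES];
} SketchCtx;

/* Before: two back-to-back loops with the same trip count. */
static void sketch_reset_two_loops(SketchCtx *ctx) {
  assert(ctx != NULL);
  for (int i = 0; i < SKETCH_TX_SIZES; ++i) ctx->counts_a[i] = 0;
  for (int i = 0; i < 4; ++i) ctx->counts_b[i] = 0;
}

/* After: one loop performs both resets; the resulting state is identical. */
static void sketch_reset_merged(SketchCtx *ctx) {
  assert(ctx != NULL);
  for (int i = 0; i < SKETCH_TX_SIZES; ++i) {
    ctx->counts_a[i] = 0;
    ctx->counts_b[i] = 0;
  }
}
```

Since both versions leave the context in the same state, a change of this kind cannot affect the bitstream, which matches the claim in the commit message.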
    • Make entropy experiments compatible with TX64X64 and CB4x4. · 1bdcc775
      Thomas Davies authored
      Use correct probability initialisations for EC_ADAPT and
      NEW_TOKENSET.
      
      Change-Id: I28310d40eab544cd57a11ce88eb8b7ab31e69ec7
      1bdcc775
    • Use default CDF tables when initialising mv probs. · 05fdc391
      Thomas Davies authored
      No change in BDR.
      
      Change-Id: Ib6934b59de340e68dd983d9f53f8878588969acb
      05fdc391
    • Use default CDF tables when initialising mode probs. · 1d7db728
      Thomas Davies authored
      No change in BDR.
      
      Change-Id: I77551120a2e94dcbf818b039154495f0f9b21755
      1d7db728
    • Clear MMX FP state in PVQ code. · e6862004
      Michael Bebenita authored
      Not clearing the FP state was causing acos to return NaN on OSX / LLVM.
      This was not causing problems on Linux or AWCY.
      
      Change-Id: I278d02839e4de858b5f55cfb380fa3968937995e
      e6862004
    • Use default CDF tables when initialising coef probs. · 87aeeb85
      Thomas Davies authored
      When creating the CDF head, do not use 8-bit probabilities
      to make the CDF tables, but load them directly.
      
      CDF tail values are created from the ONE_TOKEN relative
      probability as before.
      
      No change to BDR.
      
      Change-Id: I7386b8952f6f69cc9b77aa1b2bee71cf8e3cc9ff
      87aeeb85
    • improving palette throughput · 33bcd117
      Fangwen Fu authored
      * code the palette color index using a 45-degree wavefront
      * interleave the coeff and palette color index at the
        transform block level
      * the above changes do not change coding efficiency
      
      Details:
      The 45-degree wavefront scan allows the contexts (ctx) of the
      indices of samples on the same diagonal to be computed at the
      same time.
      Interleaving palette indices and palette residual on a
      transform block basis means that the entropy decoding and
      further processing of the palette residual is not delayed by
      the entropy decoding of all the color indices of the
      palette-encoded block.
      Change-Id: Ie9f576002a9a68394b99c23b01e9730df06df070
      33bcd117
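
The wavefront scan from the "improving palette throughput" message above can be illustrated with a small order generator. This is a sketch under stated assumptions: it assumes the context of a sample depends only on neighbours above and to the left, so samples with the same row + col do not depend on each other. `sketch_wavefront_order` is a hypothetical helper, and the direction of travel within a diagonal may differ from what libaom uses:

```c
#include <assert.h>

/* Hypothetical helper: fills order_r/order_c with the (row, col) visiting
 * order of a rows x cols block scanned along 45-degree anti-diagonals.
 * Both arrays must hold at least rows * cols entries.
 * Returns the number of positions written (rows * cols). */
static int sketch_wavefront_order(int rows, int cols,
                                  int *order_r, int *order_c) {
  assert(rows > 0 && cols > 0);
  int n = 0;
  /* d = row + col identifies one anti-diagonal, i.e. one wavefront. */
  for (int d = 0; d < rows + cols - 1; ++d) {
    /* Range of rows that have a valid column on this diagonal. */
    int r_min = d < cols ? 0 : d - cols + 1;
    int r_max = d < rows ? d : rows - 1;
    for (int r = r_min; r <= r_max; ++r) {
      order_r[n] = r;
      order_c[n] = d - r;
      ++n;
    }
  }
  return n;
}
```

All positions emitted for one value of d form a group whose contexts could be derived in parallel, because their left and top neighbours all lie on earlier diagonals.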
    • Allow disabling the 8-bit (low-bitdepth) operating path. · 98378137
      Sebastien Alaiwan authored
      This allows compiling a codec using the same operating path (the generic
      "high-bitdepth" one), regardless of the profile of the input bitstream.
      For now, keep the 16-bit (generic) pixel operating path disabled by default.
      
      Change-Id: Idd31a842b801a82c4918b1cfa7cc0bff5b11d060
      98378137
    • EC_MULTISYMBOL: make all CDFs have an extra element. · f3eb840a
      Thomas Davies authored
      This will make it easier to add native CDFs for all the
      dependent experiments without excessive macros.
      
      Change-Id: Iee4710f0fe1c1b4300f686cdf2c5b879a36de987
      f3eb840a
    • Add get_plane_type() helper function. · 005feb6b
      Luc Trudeau authored
      Adds the static inline function get_plane_type to convert a plane number
      to the corresponding PLANE_TYPE.
      
      There's no change to the bitstream; it only encapsulates the logic to
      get the PLANE_TYPE.
      
      Change-Id: I1199db3a32c89437d9c029ab5b2b2e62582a13a2
      005feb6b
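
A helper of the kind described in the "Add get_plane_type() helper function" message above might look like the following. This is a minimal sketch, assuming plane 0 is luma and planes 1 and 2 are chroma; the `Sketch`-prefixed names are stand-ins, not the libaom definitions:

```c
#include <assert.h>

/* Illustrative stand-in for libaom's PLANE_TYPE; enumerator values assumed. */
typedef enum {
  SKETCH_PLANE_TYPE_Y = 0,
  SKETCH_PLANE_TYPE_UV = 1
} SketchPlaneType;

/* Maps a plane number to its plane type: plane 0 is treated as luma (Y),
 * planes 1 and 2 as chroma (UV). */
static inline SketchPlaneType sketch_get_plane_type(int plane) {
  assert(plane >= 0 && plane < 3);
  return plane == 0 ? SKETCH_PLANE_TYPE_Y : SKETCH_PLANE_TYPE_UV;
}
```

Keeping the mapping in one inline function means call sites no longer repeat the comparison themselves, which is the encapsulation the commit message refers to.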
    • Use 10 tap for sharp interpolation filter · d59fa2ae
      Angie Chiang authored
      Performance drop
      lowres 0.056%
      midres 0.024%
      hdres 0.02%
      
      Change-Id: I52d067eefbfb87198319f9d50e3b4060f80a6abb
      d59fa2ae
    • Let hbd conv func be flexible · 0a2c0cbc
      Angie Chiang authored
      This CL allows us to easily change the filter coefficients for the SIMD
      implementation of the high bitdepth convolution functions.
      
      Change-Id: I454a5c76d3ba9e4454118c6a9d87737b3aa24898
      0a2c0cbc