  1. 13 Mar, 2017 2 commits
    • Add some default CDFs when ALT_INTRA is on. · 63234547
      Urvang Joshi authored
      Generated using av1_tree_to_cdf().
      
      Note: These are currently overwritten by CDFs generated from default
      probability tables. But they will be used eventually when we remove the
      default probability tables.
      
      Change-Id: I41a6047fd13e05156a50b2d54349ffdd7e1e4c4a
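To illustrate what turning a token tree into a CDF involves, here is a minimal Python sketch. It is not the code of av1_tree_to_cdf(). The flat tree layout, the probability scale of 256, the CDF total of 32768, and the name tree_to_cdf are all assumptions made for illustration.

```python
def tree_to_cdf(tree, probs, total=32768):
    """Turn a binary token tree plus per-node probabilities into a CDF.

    Assumed layout (hypothetical, for illustration only):
      tree[i] and tree[i + 1] are the children taken for bit 0 and bit 1.
      A positive value is the index of the next node pair; a non-positive
      value is a leaf holding symbol -value.
      probs[i // 2] is the probability (out of 256) of bit 0 at node i.
    """
    num_symbols = len(tree) // 2 + 1
    pmf = [0.0] * num_symbols

    def walk(node, p):
        p0 = probs[node // 2] / 256.0
        for child, branch_p in ((tree[node], p0), (tree[node + 1], 1.0 - p0)):
            if child > 0:
                walk(child, p * branch_p)      # internal node: keep descending
            else:
                pmf[-child] = p * branch_p     # leaf: record symbol probability

    walk(0, 1.0)

    # Accumulate the per-symbol probabilities into a scaled CDF.
    cdf, acc = [], 0.0
    for p in pmf:
        acc += p
        cdf.append(min(total, round(acc * total)))
    cdf[-1] = total  # the last entry always covers the full range
    return cdf
```

For a three-symbol tree such as `[0, 2, -1, -2]` with node probabilities `[128, 64]`, the symbol probabilities are 1/2, 1/8 and 3/8, which the sketch accumulates into a CDF.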
    • Remove a sse4_1 function · def28b24
      Yaowu Xu authored
      The function apply_selfguided_restoration_highbd_sse4_1() produces
      output that does not match the C version. It is removed for now,
      pending investigation and a fix.
      
      BUG=aomedia:392
      
      Change-Id: Ic55e7a6958112c02930b1d5f3af2e2ea089fe500
  2. 11 Mar, 2017 2 commits
  3. 10 Mar, 2017 6 commits
    • Vectorize new highpass filter for loop-restoration · eed824ef
      David Barker authored
      Change-Id: Ibe5d4933f599456cb496f636de244694bc786a4c
    • Add a symbol decode call count to accounting. · f7f87ff2
      Thomas Davies authored
      This keeps track of how many calls have been made
      to read symbols or bits. A given syntax element
      may make multiple calls to symbol decoding functions,
      and these variables keep track of the entropy
      decoding engine throughput.
      
      Change-Id: Iab3a720cbfe68f8d5ca3e4c415f7baa683b24268
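The counting described above can be sketched as follows. This is only an illustration in Python: the class name, field names, and the `record` method are hypothetical and do not come from the decoder's accounting code.

```python
class DecodeAccounting:
    """Hypothetical stand-in for a decoder accounting record."""

    def __init__(self):
        self.symbol_calls = 0  # calls that decode a multi-valued symbol
        self.bit_calls = 0     # calls that decode a single bit
        self.per_element = {}  # syntax element name -> total decode calls

    def record(self, element, is_symbol):
        """Count one call into the entropy decoder for `element`."""
        if is_symbol:
            self.symbol_calls += 1
        else:
            self.bit_calls += 1
        self.per_element[element] = self.per_element.get(element, 0) + 1
```

Because one syntax element may trigger several decode calls, the per-element totals can exceed the number of syntax elements, which is what makes the counters useful as a throughput measure.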
    • Replace one self guided filter with highpass · b7bb0976
      Debargha Mukherjee authored
      Adds an option controlled by a macro to replace one of
      the guided filters in the self-guided tool with a simple
      bandpass filtered version generated with a 3x3 kernel.
      By default the macro USE_HIGHPASS_IN_SGRPROJ is 0 (turned
      off), which selects the dual self-guided filter.
      When the macro is turned on, the larger radius guided
      filter is replaced by a simpler filter that is much faster.
      
      Results (USE_HIGHPASS_IN_SGRPROJ on vs. off):
      lowres: BD-rate increases by 0.14% (a performance drop)
      midres: BD-rate increases by 0.27% (a performance drop)
      
      Further experiments on this variation of guided filters are
      pending.
      
      Change-Id: I7bbcfcad7ee266cd49a8dc6d96795a454feb1a94
    • Align a buffer for simd operation · bcf25cda
      Yaowu Xu authored
      BUG=aomedia:387
      
      Change-Id: I11fdc9dbc4b0f4484e82ab1662ac329b8b7f2d6c
    • add 15 tap filter in the parallel_deblocking experiment · 41fc0c66
      Ryan Lei authored
      This change adds the original 15-tap filter from VP9 back into
      the parallel_deblocking experiment. The 15-tap filter is used when
      the transform size of both blocks along the edge is greater than 16x16.
      
      Change-Id: Ieae0393b66b1168572292bcebabd2707058b7f1d
    • Use correct format specifier for 64-bit integers. · 5443ff3a
      Sebastien Alaiwan authored
      Change-Id: I366160220b5f7fe4ea6adb4719c4efeef6a7d6f7
  4. 09 Mar, 2017 4 commits
    • Add SSE4.1 highbitdepth self-guided filter · 4d2af5db
      David Barker authored
      Performance is very similar to the lowbd path (only 4-5% slower)
      
      Change-Id: Ifdb272c3f6c0e6f41e7046cc49497c72b5a796d9
    • Clean up unused code in loop-restoration · 4bfd72ee
      Debargha Mukherjee authored
      Removes domain transform recursive filters and non-approximate
      guided filter code.
      
      Change-Id: Ib7ae7a6b6526a0908b3dc1787ab3561442da4e2d
    • Add restoration tilesize to frame header · 1008c1e7
      Debargha Mukherjee authored
      The restoration tile size can now be chosen as 256, 128
      or 64 in the frame header.
      
      Change-Id: I852fc42afedc053484d657bdca522de73aaacd67
    • rdopt: refactor interpolation_filter_search() · de18e2b5
      Fergus Simpson authored
      The interpolation filter search used to be performed in a code block in
      handle_inter_mode(). This change breaks that code out into its own
      function to shorten handle_inter_mode() and encapsulate the search,
      making both functions more readable.
      
      Care has been taken to make as many arguments constant as possible.
      
      Change-Id: I3fd484137fc0d16a47dba0b18ce0e2b349d24446
  5. 08 Mar, 2017 8 commits
    • Code refactoring in adapt-scan · ff0da2b4
      hui su authored
      Change-Id: Ie20bd0b05bbf3128933f10787aade7b63c98b52a
    • Remove palette interleave · b3be926a
      Fangwen Fu authored
      * Run 45 degree wavefront coding for palette index
      with palette_throughput experiment.
      * Remove palette index interleave.
      
      Change-Id: Ibb57004401f817dec8b00bc2a941d70a26783ff9
    • localize the use of CONFIG_DEPENDENT_HORZTILES · 531d6afd
      Yaowu Xu authored
      This commit changes the is_inside() function to reduce the code
      pollution caused by CONFIG_DEPENDENT_HORZTILES.
      
      Change-Id: Ic065cc337e0246379d87966a49ddeb48b975c5be
    • Fix an asan failure · 27d158b2
      Yaowu Xu authored
      SIMD convolve functions, such as filter_horiz_v4p_ssse3(), assume that
      10-tap filters are defined using 12 taps with both end taps being 0.
      
      BUG=aomedia:380
      
      Change-Id: Id8a87ae8a1330bed0452441ab8345276857220af
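The 12-tap storage assumption can be illustrated with a small Python sketch. The helper names, the 7-bit rounding, and the tap values used in the usage note are made up for illustration; they are not the filters or functions from the codebase.

```python
def pad_10_to_12(taps):
    """Store a 10-tap kernel in 12 slots, with a zero tap at each end."""
    if len(taps) != 10:
        raise ValueError("expected exactly 10 taps")
    return [0] + list(taps) + [0]


def filter_row(src, kernel, round_bits=7):
    """Apply an FIR kernel at every position where it fits completely.

    Assumes the taps sum to 2**round_bits, so a flat input stays flat.
    """
    n = len(kernel)
    out = []
    for x in range(len(src) - n + 1):
        acc = sum(k * s for k, s in zip(kernel, src[x:x + n]))
        out.append((acc + (1 << (round_bits - 1))) >> round_bits)
    return out
```

With illustrative taps such as `[1, -3, 7, -17, 76, 76, -17, 7, -3, 1]`, filtering with the padded 12-tap kernel gives the same result as the 10-tap kernel applied one sample later. The point is that code reading 12 taps needs the two zero end taps to actually be present in memory; a table holding only 10 values would be read out of bounds.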
    • Make encoder use vectorized self-guided filter · 506eb723
      David Barker authored
      By rearranging the code in restoration.c, we can allow the
      encoder to use the SSE4.1 version of the self-guided filter
      while picking the loop-restoration filter.
      
      This also helps us prepare for adding a highbitdepth SSE4.1
      version of the self-guided filter.
      
      No effect on encoder output, but gives an end-to-end speedup
      of 1-2%.
      
      Change-Id: Id17ba4a0963ddce9f70a7cae666e212e138d5f2c
    • Fix a bug in the C selfguided filter · cff43bb2
      David Barker authored
      Patch https://aomedia-review.googlesource.com/c/8321/ introduced
      a bug in the C version of the self-guided filter in the case where
      w = 384 and h > 368 or w > 368 and h = 384. This was due to forgetting
      to adjust the offset between A and B in the C code.
      
      This patch sets the offset correctly, resolving this bug.
      
      Change-Id: I6bdf11aa76c37d0ecae02788b262e7a2e0a11a6e
    • Handle non-multiple-of-4 widths in SSE4.1 self-guided filter · 5765fad5
      David Barker authored
      Adjust the vectorized filter so that it can handle tile widths
      which are not a multiple of 4, so we do not have to fall back
      to the C version of the filter.
      
      Negligible speed impact for tiles with widths which are multiples
      of 4, and greatly improves speed on tiles with non-multiple-of-4
      widths.
      
      Change-Id: Iae9d14f812c52c6f66910d27da1d8e98930df7ba
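One common way to let a 4-lane kernel cover widths that are not a multiple of 4 is to pad the final partial group and discard the extra lanes. The Python sketch below shows that idea only; it is an assumption about the general technique, not a description of how this patch handles the tail, and all names are hypothetical.

```python
def double4(lanes):
    """Example 4-lane operation: requires exactly 4 inputs."""
    assert len(lanes) == 4
    return [2 * v for v in lanes]


def process_row(row, op4, lanes=4):
    """Run a fixed-width operation over a row of arbitrary width."""
    out = []
    full = len(row) - len(row) % lanes
    # Full groups go straight through the fixed-width operation.
    for x in range(0, full, lanes):
        out.extend(op4(row[x:x + lanes]))
    # Leftover pixels: pad by repeating the last one, then drop the padding.
    tail = row[full:]
    if tail:
        padded = tail + [tail[-1]] * (lanes - len(tail))
        out.extend(op4(padded)[:len(tail)])
    return out
```

The output always has the same width as the input, so callers do not need a separate scalar fallback for odd widths.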
    • loop_restoration: Prevent some wild memory access · 1511ea10
      Alex Converse authored
      On recode frames the encoder will attempt to serialize the bitstream
      before choosing loop filter parameters to get a rough size estimate.
      This can result in wild reads in encode restoration if leftover values
      from the previous frame aren't available.
      
      Even with a realloc instead of freeing and reallocating all the data,
      wild reads are possible on frame size changes.
      
      Change-Id: I9956d9e11c6ed61999563436051c2fe469718538
  6. 07 Mar, 2017 3 commits
    • Add a CDF for coding delta_q. · d6ee8a8c
      Thomas Davies authored
      Also remove forward updates for delta_q when EC_ADAPT
      is enabled.
      
      Change-Id: Idf71b57bfe7763bc60595bc45768e624dd7b67bd
    • dependent tiles together with tile groups · 73126c08
      Fangwen Fu authored
      Change-Id: I378eb5b2c03a4c30d261128bcf9ef00ea987ed40
    • Fork the entropy experiment · 0d103578
      hui su authored
      Split it into two experiments:
      q_adapt_probs: multiple initial coeff prob tables based on q-index
      subframe_prob_update: multiple backward prob updates within frame
      
      Change-Id: I78041ebd4ba34afc9152f6861225f63c2e8eb686
  7. 06 Mar, 2017 5 commits
  8. 05 Mar, 2017 1 commit
    • Decouples rect-tx from var-tx · 8b77d04e
      Jingning Han authored
      With this patch, --enable-var-tx only enables recursive transform
      partitioning without using rectangular transforms.
      To enable use of rectangular transforms in addition, use:
      --enable-var-tx --enable-rect-tx
      
      The RD selection process has not been fully tested with only the
      var-tx flag enabled, so some performance loss may be expected there.
      
      Change-Id: Ie6aa17f1bbc3e8563b9990bc9ff79cc860d9a361
  9. 04 Mar, 2017 1 commit
  10. 03 Mar, 2017 1 commit
    • Restrict the number of neighbors in obmc mode · 5329a2bf
      Yue Chen authored
      Enable obmc mode only when there are <= 2 left neighbors and <= 2
      above neighbors. Also disable it when there are no overlappable
      neighbors.
      
      Gain in AWCY test: 1.60% (was 1.64% without the restriction)
      
      Change-Id: I2d82ef4fb4daa9b0843ac8844f99b9f412c4f379
  11. 02 Mar, 2017 5 commits
    • Some optimizations on integer affine estimation · 93105538
      Debargha Mukherjee authored
      1. Adds a limit on the number of candidate samples used for the
      estimation.
      2. Adds a limit on the max mv magnitude for use in the least-squares
      estimation.
      3. Makes some of the internal variables 32-bit.
      
      The impact on coding efficiency is in the noise range.
      
      Change-Id: I8c1c3216368ceb2e3548660a3b8c159df54a8312
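The first two limits can be sketched as a sample-selection step that runs before the least-squares fit. This Python sketch is one possible reading of the commit message: the function name, the sample tuple layout, and the choice to drop (rather than clamp) large motion vectors are assumptions, and the actual limit values are not stated in the source.

```python
def select_samples(samples, max_samples, max_mv):
    """Pick the candidate samples that are allowed into the estimation.

    samples     : list of (x, y, mv_x, mv_y) tuples (hypothetical layout)
    max_samples : cap on how many samples are used
    max_mv      : cap on the magnitude of either motion vector component
    """
    kept = []
    for sample in samples:
        _, _, mv_x, mv_y = sample
        if abs(mv_x) > max_mv or abs(mv_y) > max_mv:
            continue  # assumed behaviour: skip samples with oversized mvs
        kept.append(sample)
        if len(kept) == max_samples:
            break     # stop once the sample budget is used up
    return kept
```

Bounding both the sample count and the motion vector range also bounds the size of the accumulated sums, which is consistent with keeping some internal variables at 32 bits.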
    • Remove double rounding in selfguided filter · 7dcd7f5e
      David Barker authored
      In av1_selfguided_restoration, the values stored into 'dgd' are
      unnecessarily rounded twice. This patch replaces this by a single
      rounding operation.
      
      Change-Id: I188d283137b74823f5d5447d441250520d6ee294
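Why rounding twice differs from rounding once can be shown with a short Python sketch. The helper names and the shift amounts are chosen for illustration and are not taken from av1_selfguided_restoration.

```python
def round_shift(value, bits):
    """Divide by 2**bits, rounding to the nearest integer (half rounds up)."""
    return (value + (1 << (bits - 1))) >> bits


def rounded_twice(value, bits_a, bits_b):
    """Round after each of two shifts: the second step rounds a rounded value."""
    return round_shift(round_shift(value, bits_a), bits_b)


def rounded_once(value, bits_a, bits_b):
    """Apply the combined shift with a single rounding step."""
    return round_shift(value, bits_a + bits_b)
```

For example, 23 / 16 is about 1.44. A single rounding gives 1, but rounding 23 / 4 to 6 and then 6 / 4 to 2 gives 2, so the two-step version carries an upward bias.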
    • Avoid the use of undefined macro value · b83e64ba
      Jingning Han authored
      Always define USE_TXTYPE_SEARCH_FOR_SUB8X8_IN_CB4X4 to avoid the
      use of an undefined value.
      
      Change-Id: I0ad90c5b5316db231e9538487bb4591dfd6a9ce7
    • Use 3-tap spatial filter in FILTER_INTRA experiment · 8d8638a1
      Yue Chen authored
      3-tap recursive intra prediction filters are added.
      Macro USE_3TAP_INTRA_FILTER is set to 1 to use 3-tap by default.
      Coding gain of FILTER_INTRA experiment in AWCY, high delay 150f
      3-tap: 0.51%
      4-tap: 0.68%
      
      Change-Id: I44192dd08bfd8155f58a9b0b5cf1de88fceb762e
    • Turn off global motion for sub8x8 blocks · ae7c458a
      Sarah Parker authored
      Lowres: 0.03% improvement, 1% improvement on waterfall_cif.y4m
      Midres: 0.085% overall improvement, 1.253% improvement on station2_480p25.y4m
      Change-Id: I3872934d978bb4ca828c6b9acd2fdb951d9da299
  12. 01 Mar, 2017 2 commits
    • implement combined parallel_deblocking experiment · 392d0ff7
      Ryan Lei authored
      The parallel_deblocking experiment is proposed jointly by Intel
      and Microsoft. The following changes are implemented in this
      experiment:
      
      - deblocking filter order is changed to filter all vertical edges
        of the whole frame followed by filtering all horizontal edges
        of the whole frame
      
      - filter length decision is made based on the transform block size
        on both sides of the edge. The block with the smaller transform
        size determines the final filter length.
      
      - transform blocks on both sides of the edge are checked. Filtering
        of an edge can be skipped only when both blocks are skipped and
        they belong to the same prediction block.
      
      - 15-tap filter and extended flat area detection are removed.
      
      - special rule for handling 4x4 transform block on the super block
        boundary in VP9 is removed.
      
      Change-Id: I1aa82c6b5335d47c2f73eec8fc8bee2c08a1cf74
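The ordering, filter-length and skip rules listed above can be sketched in Python as follows. The edge representation (a dict with `tx_a`, `tx_b`, `skip_a`, `skip_b`, `same_pred`) and all function names are hypothetical; the sketch only models the decisions, not the filtering itself.

```python
def edge_filter_tx_size(tx_size_a, tx_size_b):
    """The smaller transform size across the edge decides the filter length."""
    return min(tx_size_a, tx_size_b)


def can_skip_edge(skip_a, skip_b, same_prediction_block):
    """Skip only if both blocks are skipped and share one prediction block."""
    return skip_a and skip_b and same_prediction_block


def plan_deblocking(vertical_edges, horizontal_edges):
    """Return (direction, edge id, deciding tx size) in filtering order.

    All vertical edges of the frame come first, then all horizontal edges.
    """
    plan = []
    for direction, edges in (("vertical", vertical_edges),
                             ("horizontal", horizontal_edges)):
        for edge in edges:
            if can_skip_edge(edge["skip_a"], edge["skip_b"], edge["same_pred"]):
                continue
            size = edge_filter_tx_size(edge["tx_a"], edge["tx_b"])
            plan.append((direction, edge["id"], size))
    return plan
```

Separating the two passes over the whole frame means every vertical edge decision is independent of the horizontal pass, which is what makes the filtering order friendly to parallel execution.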
    • Fix compiling warnings in var-tx and pvq · ab77e73b
      Jingning Han authored
      Change-Id: Ie836a113978028f3bde2acd31061d9a663547087