1. 11 Mar, 2017 1 commit
  2. 10 Mar, 2017 2 commits
    • David Barker's avatar
      Vectorize new highpass filter for loop-restoration · eed824ef
      David Barker authored
      Change-Id: Ibe5d4933f599456cb496f636de244694bc786a4c
      eed824ef
    • Debargha Mukherjee's avatar
      Replace one self guided filter with highpass · b7bb0976
      Debargha Mukherjee authored
      Adds an option controlled by a macro to replace one of
      the guided filters in the self-guided tool with a simple
      bandpass filtered version generated with a 3x3 kernel.
      By default the macro USE_HIGHPASS_IN_SGRPROJ is 0 (turned
      off), that defaults us to the dual self-guided filter.
      When the macro is turned on, the larger radius guided
      filter is replaced by a simpler filter that is much faster.
      
      Results (if USE_HIGHPASS_IN_SGRPROJ is on vs. off):
      lowres: performance drop by +0.14% (BDRATE)
      midres: performance drop by +0.27% (BDRATE)
      
      Further experiments on this variation of guided filters is
      pending.
      
      Change-Id: I7bbcfcad7ee266cd49a8dc6d96795a454feb1a94
      b7bb0976
  3. 09 Mar, 2017 1 commit
  4. 08 Mar, 2017 1 commit
    • David Barker's avatar
      Handle non-multiple-of-4 widths in SSE4.1 self-guided filter · 5765fad5
      David Barker authored
      Adjust the vectorized filter so that it can handle tile widths
      which are not a multiple of 4, so we do not have to fall back
      to the C version of the filter.
      
      Negligible speed impact for tiles with widths which are multiples
      of 4, and greatly improves speed on tiles with non-multiple-of-4
      widths.
      
      Change-Id: Iae9d14f812c52c6f66910d27da1d8e98930df7ba
      5765fad5
  5. 06 Mar, 2017 1 commit
    • David Barker's avatar
      Vectorize self-guided filter · ce110cc5
      David Barker authored
      Add an SSE4.1 lowbd version of the self-guided filter for
      loop-restoration, and apply some optimizations to the C
      version.
      
      Approximate times per 128x128 / 256x256 tile on the machine
      this was developed on:
      Previous C:  620us / 2800us
      Optimized C: 500us / 2200us ( 24% /  27% faster)
      SSE4.1:      147us / 600us  (320% / 370% faster)
      
      Change-Id: I23ff5a5482a191aeb06f9d1f767a9f036bb357fe
      ce110cc5