1. 18 Jul, 2017 1 commit
    • Clean CLPF local function · a5378e73
      Cheng Chen authored
      Rename local functions and make them static.
      Remove unnecessary header file and corresponding includes.
      
      Change-Id: I4b09e3949e7207754753997ff359992bd348d488
  2. 05 Apr, 2017 1 commit
    • CDEF: Add damping to dering · 8ff52fcc
      Steinar Midtskogen authored
      high-latency, cpu-used=0:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.1650 |  0.2545 |  0.2977 |  -0.0423 | -0.0947 | -0.0725 |    -0.0365
      
      low-latency, cpu-used=0:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.4006 |  0.0501 | -0.0108 |  -0.1790 | -0.1660 | -0.1992 |    -0.2135
      
      low-latency, cpu-used=4:
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.5508 | -0.2445 | -0.2762 |  -0.1981 | -0.2878 | -0.2228 |    -0.3733
      
      Change-Id: Ia20df28c8bbb6182215b02016053af33bd498145
  3. 01 Apr, 2017 2 commits
  4. 29 Mar, 2017 2 commits
  5. 21 Mar, 2017 2 commits
  6. 17 Mar, 2017 1 commit
    • Merge dering/clpf rdo and filtering · a9d41e88
      Steinar Midtskogen authored
      * Dering and clpf were merged into a single pass.
      * 32x32 and 128x128 filter block sizes for clpf were removed.
      * RDO for dering and clpf merged and improved:
        - "0" no longer required to be in the strength selection
        - Dering strength can now be 0, 1 or 2 bits per block
      
                    LL    HL
      PSNR:       -0.04 -0.01
      PSNR HVS:   -0.27 -0.18
      SSIM:       -0.15 +0.01
      CIEDE 2000: -0.11 -0.03
      APSNR:      -0.03 -0.00
      MS SSIM:    -0.18 -0.11
      
      Change-Id: I9f002a16ad218eab6007f90f1f176232443495f0
  7. 27 Feb, 2017 1 commit
  8. 10 Feb, 2017 1 commit
    • Retune the CLPF kernel · 4f0b3ed8
      Steinar Midtskogen authored
      CLPF performance had degraded by about 0.5% over the past six months,
      which isn't totally surprising since the codec is a moving target.
      About half of that degradation comes from the improved 7 bit filter
      coefficients.  Therefore, CLPF needs to be retuned for the current
      codec.
      
      This patch makes two (normative) changes to the CLPF kernel:
      
      * The clipping function was changed from clamp(x, -s, s) to
            sign(x) * max(0, abs(x) - max(0, abs(x) - s +
                   (abs(x) >> (bitdepth - 3 - log2(s)))))
        This adds a ramp-down to 0 at -32 and 32 for 8 bit (-128 and 128
        for 10 bit, etc.), so large differences are ignored.
      
      * 8 taps instead of 6 taps:
                     1
          4          3
        13 31  ->  13 31
          4          3
                     1
      
      AWCY results: low delay  high delay
      PSNR:           -0.40%     -0.47%
      PSNR HVS:        0.00%     -0.11%
      SSIM:           -0.31%     -0.39%
      CIEDE 2000:     -0.22%     -0.31%
      APSNR:          -0.40%     -0.48%
      MS SSIM:         0.01%     -0.12%
      
      About 3/4 of the gains come from the new clipping function.
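
      A minimal sketch of the two clipping functions, for illustration only
      (not the libaom source).  The names old_clip and new_clip are
      hypothetical, and s is assumed to be a power of two small enough that
      the shift stays non-negative:

```python
def old_clip(x, s):
    """Previous behaviour: clamp(x, -s, s)."""
    return max(-s, min(x, s))


def new_clip(x, s, bitdepth=8):
    """Clipping with a ramp-down to 0, following the formula quoted above.

    Assumption: s is a power of two, so log2(s) == s.bit_length() - 1.
    """
    shift = bitdepth - 3 - (s.bit_length() - 1)
    mag = abs(x)
    limited = max(0, mag - max(0, mag - s + (mag >> shift)))
    # Restore the sign of the input difference.
    return limited if x >= 0 else -limited


# Compare both functions for a few differences at strength 4, 8 bit.
for diff in (3, 5, 16, 32, 40):
    print(diff, old_clip(diff, 4), new_clip(diff, 4))
```

      With s = 4 at 8 bit, differences up to 4 pass through unchanged,
      larger ones are attenuated, and the output reaches 0 at 32, whereas
      the old clamp keeps returning 4.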
      
      Change-Id: Idad9dc4004e71a9c7ec81ba62ebd12fb76fb044a
  9. 08 Feb, 2017 1 commit
  10. 14 Oct, 2016 2 commits
  11. 13 Oct, 2016 1 commit
    • Move CLPF block signals from frame to SB level. · 97535038
      Steinar Midtskogen authored
      These signals were in the uncompressed frame header (as a temporary
      hack), which caused two problems:
      
      * We don't want that header to be duplicated in the slice header
      * It was necessary to signal the number of bits to transmit up front
      
      However, the filter size can be 128x128 which is greater than the SB
      size, and a decoder wouldn't be able to know whether to read a bit or
      not until the final SB of that 128x128 block has been decoded
      (depending on whether the 128x128 is all skip or not).  Therefore the
      signalling was changed for 128x128 blocks so that every top left SB of
      a 128x128 filter block contains a signal regardless of whether the
      block is all skip or not.  Also, all the MBs of a 128x128 block are
      filtered even if they are skip MBs.  This gives the signal a purpose
      even when the 128x128 block is all skip, and it also gives a slight
      coding gain as it leaves a way to filter skip blocks, which was
      previously forbidden.
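
      The rule for 128x128 filter blocks can be sketched as follows.  This
      is an illustration, not the bitstream code: the function name is
      hypothetical, and a 64x64 SB size is assumed, so each 128x128 filter
      block covers a 2x2 group of SBs:

```python
def sb_carries_128_signal(sb_row, sb_col, sb_size=64, filter_size=128):
    """Return True if this SB is the top-left SB of its 128x128 filter block.

    Assumption: filter_size is a multiple of sb_size. Under the new rule,
    the top-left SB always carries the signal, whether or not the filter
    block is all skip, so the decoder never has to wait for later SBs.
    """
    sbs_per_side = filter_size // sb_size
    return sb_row % sbs_per_side == 0 and sb_col % sbs_per_side == 0


# Mark the signalling SBs in a 4x4 grid of superblocks.
for row in range(4):
    print("".join("S" if sb_carries_128_signal(row, col) else "." for col in range(4)))
```

      The decision depends only on the SB position, which is what lets the
      decoder know whether to read a bit before the rest of the 128x128
      block has been decoded.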
      
      Low latency:
      PSNR YCbCr:     -0.19%     -0.14%     -0.06%
         PSNRHVS:     -0.15%
            SSIM:     -0.13%
          MSSSIM:     -0.15%
       CIEDE2000:     -0.19%
      
      High latency:
      PSNR YCbCr:     -0.03%     -0.01%     -0.09%
         PSNRHVS:      0.04%
            SSIM:      0.00%
          MSSSIM:      0.02%
       CIEDE2000:     -0.02%
      
      Change-Id: I69ba7144d07d388b4f0968f6a53558f480979171
  12. 11 Oct, 2016 2 commits
    • Clean up and speed up CLPF clipping · e66fc87c
      Steinar Midtskogen authored
      * Move clipping tests from inside to outside loops
      * Pass the clipped block size to clpf_block() as sizex and sizey
        rather than passing bs for both
      * Make the tests for falling back to C more accurate
      
      Change-Id: Icdc57540ce21b41a95403fdcc37988a4ebf546c7
    • Bugfix in the CLPF RDO. · 2e40cc4c
      Steinar Midtskogen authored
      When CLPF was extended to chroma, the chroma RDO accidentally
      discarded the optimal block size found in the luma RDO.
      
      PSNR YCbCr:     -0.25%      0.05%      0.06%
         PSNRHVS:     -0.19%
            SSIM:     -0.36%
          MSSSIM:     -0.23%
      
      Conflicts:
      	av1/common/clpf.c
      
      Change-Id: Ie49cd30f9276a311ada88cb2f13d14757617f030
  13. 10 Oct, 2016 10 commits
  14. 06 Oct, 2016 1 commit
  15. 04 Oct, 2016 1 commit
    • Move CLPF block signals from frame to SB level. · 85437b21
      Steinar Midtskogen authored
      These signals were in the uncompressed frame header (as a temporary
      hack), which caused two problems:
      
      * We don't want that header to be duplicated in the slice header
      * It was necessary to signal the number of bits to transmit up front
      
      However, the filter size can be 128x128 which is greater than the SB
      size, and a decoder wouldn't be able to know whether to read a bit or
      not until the final SB of that 128x128 block has been decoded
      (depending on whether the 128x128 is all skip or not).  Therefore the
      signalling was changed for 128x128 blocks so that every top left SB of
      a 128x128 filter block contains a signal regardless of whether the
      block is all skip or not.  Also, all the MBs of a 128x128 block are
      filtered even if they are skip MBs.  This gives the signal a purpose
      even when the 128x128 block is all skip, and it also gives a slight
      coding gain as it leaves a way to filter skip blocks, which was
      previously forbidden.
      
      Low latency:
      PSNR YCbCr:     -0.19%     -0.14%     -0.06%
         PSNRHVS:     -0.15%
            SSIM:     -0.13%
          MSSSIM:     -0.15%
       CIEDE2000:     -0.19%
      
      High latency:
      PSNR YCbCr:     -0.03%     -0.01%     -0.09%
         PSNRHVS:      0.04%
            SSIM:      0.00%
          MSSSIM:      0.02%
       CIEDE2000:     -0.02%
      
      Change-Id: I69ba7144d07d388b4f0968f6a53558f480979171
  16. 28 Sep, 2016 1 commit
    • Clean up and speed up CLPF clipping · ae95e6db
      Steinar Midtskogen authored
      * Move clipping tests from inside to outside loops
      * Pass the clipped block size to clpf_block() as sizex and sizey
        rather than passing bs for both
      * Make the tests for falling back to C more accurate
      
      Change-Id: Icdc57540ce21b41a95403fdcc37988a4ebf546c7
  17. 21 Sep, 2016 1 commit
    • Bugfix in the CLPF RDO. · 3b780f20
      Steinar Midtskogen authored
      When CLPF was extended to chroma, the chroma RDO accidentally
      discarded the optimal block size found in the luma RDO.
      
      PSNR YCbCr:     -0.25%      0.05%      0.06%
         PSNRHVS:     -0.19%
            SSIM:     -0.36%
          MSSSIM:     -0.23%
      
      Change-Id: Idf9f4a18ad774d7b4ff8e907df0180225ea0ccaf
  18. 20 Sep, 2016 1 commit
  19. 16 Sep, 2016 1 commit
    • Extend CLPF to chroma. · a25c6c3b
      Steinar Midtskogen authored
      Objective quality impact (low latency):
      
      PSNR YCbCr:      0.13%     -1.37%     -1.79%
         PSNRHVS:      0.03%
            SSIM:      0.24%
          MSSSIM:      0.10%
       CIEDE2000:     -0.83%
      
      Change-Id: I8ddf0def569286775f0f9d4d4005932766a7fc27
  20. 13 Sep, 2016 3 commits
  21. 12 Sep, 2016 1 commit
  22. 08 Sep, 2016 1 commit
    • Reduce memory footprint for CLPF decoding. · eb5794da
      Steinar Midtskogen authored
      Instead of having CLPF write to an entire new frame and
      copy the result back into the original frame, make the
      filter able to work in-place by keeping a buffer of size
      frame_width*filter_block_size and delay the write-back
      by one filter_block_size row.
      
      This reduces the cycles spent in the filter to ~75% of what they were.
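
      The delayed write-back idea can be sketched as below.  This is an
      illustration under stated assumptions, not the patch itself:
      smooth_pixel is a hypothetical stand-in filter (not the CLPF kernel)
      that reads one pixel in each direction, and for clarity the sketch
      keeps the pending and the newly filtered block row as two separate
      lists rather than reproducing the single-buffer bookkeeping:

```python
def smooth_pixel(frame, y, x):
    """Stand-in cross-shaped filter (NOT the CLPF kernel); edges are replicated."""
    h, w = len(frame), len(frame[0])
    up = frame[max(y - 1, 0)][x]
    down = frame[min(y + 1, h - 1)][x]
    left = frame[y][max(x - 1, 0)]
    right = frame[y][min(x + 1, w - 1)]
    return (4 * frame[y][x] + up + down + left + right + 4) >> 3


def filter_reference(frame):
    """Baseline: write the result to an entire new frame."""
    return [[smooth_pixel(frame, y, x) for x in range(len(frame[0]))]
            for y in range(len(frame))]


def filter_in_place(frame, block_size):
    """Filter `frame` in place, delaying write-back by one block row."""
    h, w = len(frame), len(frame[0])
    pending = None  # (first row index, filtered rows) of the previous block row
    for y0 in range(0, h, block_size):
        y1 = min(y0 + block_size, h)
        # The previous block row has not been written back yet, so the
        # pixels this block row reads from it are still unfiltered.
        current = [[smooth_pixel(frame, y, x) for x in range(w)]
                   for y in range(y0, y1)]
        if pending is not None:
            start, rows = pending
            frame[start:start + len(rows)] = rows
        pending = (y0, current)
    if pending is not None:
        # Flush the last block row.
        start, rows = pending
        frame[start:start + len(rows)] = rows


# The in-place result should match the out-of-place baseline.
frame = [[(7 * x + 13 * y * y) % 256 for x in range(9)] for y in range(10)]
expected = filter_reference(frame)
filter_in_place(frame, 4)
print(frame == expected)
```

      Only block rows' worth of filtered pixels are held outside the frame
      at any time, instead of a full second frame.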
      
      Change-Id: I78ca74380c45492daa8935d08d766851edb5fbc1
  23. 07 Sep, 2016 1 commit
  24. 05 Sep, 2016 1 commit