1. 10 Feb, 2017 1 commit
  2. 06 Feb, 2017 1 commit
    • Add av1_convolve_2d_facade · 7927a97d
      Angie Chiang authored
      When convolve_round is on, av1_convolve_2d_facade is used for
      interpolation instead of av1_convolve. The convolve_round experiment
      code will be removed from av1_convolve in another CL.
      
      So far we use 4-bit rounding in the intermediate stage on top of using
      post rounding for compound mode after the last stage.
      
      This gives a gain of roughly 0.45% on lowres, 0.39% on midres, and
      0.6-0.7% on hdres.
      Altogether, the gain is 1.15% on lowres, 0.74% on midres, and roughly
      1.7-1.8% on hdres.
      
      Note that this CL does not restrict the use of the 12-tap filter.
      Adding that restriction would lose roughly 0.1% on lowres.
      
      Change-Id: I6332e1d888e28a3b3ddc29711817d66e52cb5cdf
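
The two-stage rounding described in this message can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the libaom code: `convolve_2d`, `round_shift`, and the `round_0`/`round_1` names are hypothetical, and `FILTER_BITS = 7` (taps summing to 128) is assumed. The horizontal pass is rounded by only 4 bits, and the remaining scale is removed in a single rounding after the vertical pass.

```python
FILTER_BITS = 7         # assumption: filter taps sum to 1 << FILTER_BITS = 128
INTERMEDIATE_ROUND = 4  # the 4-bit intermediate rounding named in the message


def round_shift(value, bits):
    """Round-to-nearest right shift; bits == 0 leaves value unchanged."""
    if bits == 0:
        return value
    return (value + (1 << (bits - 1))) >> bits


def convolve_2d(src, taps_x, taps_y, round_0=INTERMEDIATE_ROUND):
    """Separable 2D filter over the valid region of src (a list of rows)."""
    tx, ty = len(taps_x), len(taps_y)
    # Stage 1: horizontal pass, rounded by only round_0 bits so that
    # FILTER_BITS - round_0 fractional bits survive into stage 2.
    mid = [
        [round_shift(sum(row[c + k] * taps_x[k] for k in range(tx)), round_0)
         for c in range(len(row) - tx + 1)]
        for row in src
    ]
    # Stage 2: vertical pass, then one final rounding removes the rest
    # of the accumulated scale (2 * FILTER_BITS in total).
    round_1 = 2 * FILTER_BITS - round_0
    return [
        [round_shift(sum(mid[r + k][c] * taps_y[k] for k in range(ty)), round_1)
         for c in range(len(mid[0]))]
        for r in range(len(mid) - ty + 1)
    ]
```

For a flat 6x6 block of value 100 and the illustrative 4-tap kernel `[0, 64, 64, 0]` in both directions, the sketch returns a 3x3 block of 100. In compound mode the final rounding would be deferred further, until after the two predictions are summed.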
  3. 02 Feb, 2017 1 commit
  4. 20 Jan, 2017 2 commits
    • Refactor av1_convolve · caa9e5ad
      Angie Chiang authored
      Move declaration of filter_params_x/y outside of if/else block
      
      Change-Id: I4f908872b7ff85b440a12a535d939a3c137aaab5
    • Add CONVOLVE_POST_ROUNDING flag · 117aa0dc
      Angie Chiang authored
      By turning on CONVOLVE_POST_ROUNDING, in the compound inter
      prediction mode, FILTER_BITS rounding is moved after the summation
      of two predictions.
      
      Note that post rounding is applied only to non-sub8x8 blocks.
      
             PSNR     BDRate
      lowres -0.808%  -0.673%
      
      Change-Id: Ib91304e6122c24d832a582ab9f5757d33eac876c
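
The effect of moving the rounding after the summation can be seen with plain integer arithmetic. The sketch below is illustrative only: the function names are hypothetical, `FILTER_BITS = 7` is assumed, and the inputs stand for unrounded filter sums (pixel value times 128).

```python
FILTER_BITS = 7  # assumption: filter outputs are scaled by 1 << 7 = 128


def round_shift(value, bits):
    """Round-to-nearest right shift."""
    return (value + (1 << (bits - 1))) >> bits


def compound_round_each(sum0, sum1):
    """Round each prediction by FILTER_BITS first, then average the two."""
    p0 = round_shift(sum0, FILTER_BITS)
    p1 = round_shift(sum1, FILTER_BITS)
    return round_shift(p0 + p1, 1)


def compound_post_round(sum0, sum1):
    """Sum the unrounded filter outputs, then round once at the end."""
    return round_shift(sum0 + sum1, FILTER_BITS + 1)
```

With `sum0 = 12864` (100.5 in pixel units) and `sum1 = 12992` (101.5), the exact average is 101. Rounding each prediction first yields 102, because both halves round up before averaging, while the single post rounding yields 101.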
  5. 18 Jan, 2017 2 commits
  6. 13 Jan, 2017 2 commits
    • Add rounding option into av1_convolve · 674bffdc
      Angie Chiang authored
      Use a round flag in ConvolveParams to indicate if the destination buffer
      has the result rounded by FILTER_BITS or not.
      This CL is part of the goal of reducing interpolation rounding error in
      compound prediction mode.
      
      Change-Id: I49e522a89a67a771f5a6e7fbbc609e97923aecb6
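
A rough Python model of the idea, assuming a parameter object with a single boolean `round` field: the real `ConvolveParams` layout is not shown in this log, so the dataclass and `convolve_1d` below are hypothetical stand-ins, and `FILTER_BITS = 7` is assumed.

```python
from dataclasses import dataclass

FILTER_BITS = 7  # assumption: taps sum to 1 << FILTER_BITS = 128


@dataclass
class ConvolveParams:
    # True: the destination holds pixels rounded by FILTER_BITS.
    # False: the destination holds raw, unrounded filter sums.
    round: bool


def convolve_1d(src, taps, params):
    """Filter the valid region of a 1-D signal, rounding only if requested."""
    out = []
    for i in range(len(src) - len(taps) + 1):
        acc = sum(src[i + k] * taps[k] for k in range(len(taps)))
        if params.round:
            acc = (acc + (1 << (FILTER_BITS - 1))) >> FILTER_BITS
        out.append(acc)
    return out
```

A caller that intends to sum two predictions would pass `ConvolveParams(round=False)` and apply the rounding itself after the sum; a single-prediction caller would pass `round=True` and use the output directly.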
    • Clean up redundant #if statements · 203b1d30
      Jingning Han authored
      Change-Id: Ia4779ffb47de333d670ae110cbdfb6cc567da910
  7. 15 Dec, 2016 2 commits
  8. 12 Dec, 2016 4 commits
  9. 01 Dec, 2016 2 commits
    • Turn on SIMD optimization for dual_filter · 7a483cff
      Angie Chiang authored
      Let aom_convolve8_### SIMD implementation support any block width.
      Turn on SIMD optimization when interpolation filter types on two
      directions are different.
      
      This reduces encoding time by 30% when dual_filter and ext_interp are
      both on.
      
      Change-Id: I539dbb2737f01835034b7269656a15b2058fa3cc
    • Allow only one direction uses 12 sharp filter · b9b017d5
      Angie Chiang authored
      Performance drop
      BDRate
      lowres -0.116%
      midres -0.073%
      hdres  -0.056%
      
      Change-Id: Ic90caf9b8f6fb9d9fd6f9c0e80436a7c468a3c97
  10. 30 Nov, 2016 1 commit
  11. 29 Nov, 2016 1 commit
    • Add av1_convolve_init() · e067de00
      Angie Chiang authored
      Generate the SIMD filter structure in av1_convolve_init().
      This provides the flexibility to change filter coefficients.
      
      Change-Id: If79f84c56483aa08c894d6b12e2b6ce10147f0ce
  12. 01 Nov, 2016 1 commit
  13. 25 Oct, 2016 1 commit
  14. 09 Sep, 2016 1 commit
    • s/INTERP_FILTER/InterpFilter/ · 7b9407a8
      James Zern authored
      This matches the style guidelines and stabilizes successive runs of
      clang-format across the tree. Remaining types should be addressed in
      successive commits.
      
      Change-Id: I6ad3f69cf0a22cb9a9b895b272195f891f71170f
  15. 01 Sep, 2016 2 commits
  16. 12 Aug, 2016 1 commit
  17. 18 Jul, 2016 1 commit
    • fix vp10_convolve() signatures · 87c2db82
      skal authored
      fortunately, the call site was calling the function with
      the correct parameter order.
      
      Change-Id: Ia48099c18288a2416c8b9a7062d2b8d417fd07df
  18. 12 Jul, 2016 1 commit
    • HBD convolution filtering (10/12 taps) SSE4.1 optimization · 8cacca73
      Yi Luo authored
      - For experiment EXT_INTERP under high bit depth.
      - Add unit test to verify bit-exactness.
      - Speed performance improvement:
        On Xeon E5-2680, park_joy_1080p_12.y4m, 50 frames, encoding time
        drops from 6682503 ms to 5390270 ms.
      
      Change-Id: Iea4debf5414f3accf1eb5672abeab56a0539ac77
  19. 11 Jul, 2016 1 commit
  20. 27 Jun, 2016 1 commit
    • Fix bugs in convolution filter optimization · 8404253f
      Yi Luo authored
      - Fix the overwriting bug in horizontal filtering when width = 2.
      - Fix 10-tap vertical filtering, which no longer reads one row of
        pixels above the block.
      - Fix 10-tap filter zero padding.
      - Encoder speed slows down by ~4.0% compared to
        81ad9536 Convolution vertical filter SSSE3 optimization
      
      Change-Id: I9bb294a4529300081c29bf284e6bc6eb081cc536
  21. 23 Jun, 2016 1 commit
    • Convolution vertical filter SSSE3 optimization · 81ad9536
      Yi Luo authored
      - Apply 8-pixel vertical filtering direction parallelism.
      - Add unit tests to verify bit-exactness.
      - Encoder speed improves ~29% (enable EXT_INTERP) on Xeon E5-2680.
      - Combinational cycle count of vp10_convolve() drops from 26.06%
        to 6.73%.
      
      Change-Id: Ic1ae48f8fb1909991577947a8c00d07832737e57
  22. 20 Jun, 2016 1 commit
    • Convolution horizontal filter SSSE3 optimization · 229690a9
      Yi Luo authored
      - Apply signal direction/4-pixel vertical/8-pixel vertical
        parallelism.
      - Add unit test to verify the bit exact result.
      - Overall encoding time improves ~24% on Xeon E5-2680 CPU.
      
      Change-Id: I104dcbfd43451476fee1f94cd16ca5f965878e59
  23. 19 May, 2016 1 commit
    • Properly handle the filter extension in highbd setting · d84a2e7d
      Jingning Han authored
      This commit makes the filter extension in highbd aware of the
      dual filter and ext-interp experiments to prevent enc/dec mismatch
      when both experiments are turned on.
      
      Change-Id: I11ac1f041bd5f73d61e839d6386d9c5d008da3f7
  24. 16 May, 2016 1 commit
    • Properly handle 2D filter boundary extension · 14dd5538
      Jingning Han authored
      The amount of border extension needed in the first stage inter
      filtering is decided by the length of the second stage filter
      kernel.
      
      Change-Id: Icddbc58c02234d5df09ff0eeebcf166ffe689203
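
The dependency described here can be sketched as simple row arithmetic. The helper names are hypothetical, and the split of extra rows assumes an even-length kernel centred so that `taps / 2 - 1` rows lie above the block and `taps / 2` below; that convention is an assumption, not something stated in this log.

```python
def first_stage_rows(block_h, second_stage_taps):
    """Rows the first (horizontal) stage must produce so that the second
    (vertical) stage can filter every row of a block_h-row block."""
    return block_h + second_stage_taps - 1


def rows_above_below(second_stage_taps):
    """Extra rows needed above and below the block, assuming an
    even-length kernel with taps/2 - 1 taps above the current row."""
    above = second_stage_taps // 2 - 1
    below = second_stage_taps // 2
    return above, below
```

For an 8-row block, an 8-tap second-stage kernel needs 15 intermediate rows while a 12-tap kernel needs 19, so sizing the extension from the first-stage kernel alone would under-read or over-read whenever the two directions use different filter lengths.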
  25. 09 May, 2016 1 commit
    • Fix dual filter type for high bit-depth · 9de916eb
      Jingning Han authored
      This commit fixes the compiler error in high bit-depth inter
      predictor when dual filter type experiment is turned on.
      
      Change-Id: I404a76a246477f2fcffc38a3275007d5dfe229cd
  26. 07 May, 2016 1 commit
  27. 30 Mar, 2016 1 commit
    • Extend superblock size to 128x128 pixels. · 552d5cd7
      Geza Lore authored
      If --enable-ext-partition is used at build time, the superblock size
      (sometimes also referred to as coding unit (CU) size) is extended to
      128x128 pixels.
      
      Change-Id: Ie09cec6b7e8d765b7555ff5d80974aab60803f3a
  28. 26 Feb, 2016 1 commit
  29. 20 Feb, 2016 1 commit
    • Fix 12 TAP convolution bug · 1e403064
      Angie Chiang authored
      Previously, we did 12-tap interpolation even when there was no sub-pixel
      offset. This could cause a bug because the decoder doesn't extend the
      border when there is no sub-pixel offset. In this situation, if we still
      do interpolation, we access a border extension which doesn't exist and
      cause a memory error.
      
      Change-Id: I55b879722f0a10c5d13261bd9617a75c826a2418
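
The guard described in this message can be sketched in one dimension. Everything here is illustrative: `predict_row`, the `kernels` mapping, and the 7-bit rounding are assumptions, not the libaom interface. The point is that an integer-position prediction copies pixels and never reads outside the block, so no border extension is required.

```python
def predict_row(ref, start, width, subpel, kernels):
    """Predict `width` pixels from the 1-D reference row `ref`.

    subpel is the fractional offset index (0 means an integer position);
    kernels maps each non-zero offset to its list of filter taps.
    """
    if subpel == 0:
        # Integer position: plain copy that touches only
        # ref[start : start + width], so no extended border is read.
        return ref[start:start + width]
    taps = kernels[subpel]
    half = len(taps) // 2 - 1  # taps to the left of the current pixel
    out = []
    for x in range(start, start + width):
        # Reads ref[x - half] .. ref[x - half + len(taps) - 1], which
        # reaches beyond the block and relies on an extended border.
        acc = sum(ref[x - half + k] * taps[k] for k in range(len(taps)))
        out.append((acc + 64) >> 7)  # assumed 7-bit rounding
    return out
```

With a 12-tap kernel the filtered path reads 5 pixels to the left and 6 to the right of the block, which is exactly the region that does not exist when the decoder skipped border extension.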
  30. 06 Feb, 2016 1 commit