  1. 05 Feb, 2018 5 commits
    • Add mismatch test for quantization matrices. · 9d8004b7
      Thomas Davies authored
      Change-Id: Idb40c0817af5dcb0a61b29d7ac3b96a5c847c69b
    • Adding config file parsing implementation · cc6e0e16
      Maxym Dmytrychenko authored
      Parameters from the config file are inserted at the location of the
      --cfg option and processed there.
      Config file example:
       #ignore comment
       ext-partition   : 1 #ignore as well
       codec           : av1
       psnr            : ON
          - The config file is a simple text file.
          - A comment starts with a hash (#). It can be a full line or
            part of a line; everything after the hash is ignored.
          - Format: field : value
            A colon (:) is the delimiter; a line without one is ignored.
            Spaces and tabs are allowed, but not inside a field name.
          - Long field names are preferred, following the existing
            --long_name option format.
          - "No value" fields should contain ON as the value.
      Example of usage:
          aomenc --cfg=some.cfg src_filename
      Configurations support matrix:
      enable-ext-partition         : done
      enable-loop-restoration      : wip
      enable-deblocking            : wip
      Change-Id: Iad867c5d2da64271cdafa825c89f7d6444582f61
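The config file rules above can be sketched in a few lines of Python. This is only an illustration of the described format, not the aomenc implementation. The function names `parse_cfg_text` and `expand_cfg` are hypothetical. Rendering each field as `--field=value`, or as a bare `--field` when the value is ON, is an assumption based on the "existing --long_name option format" remark.

```python
def parse_cfg_text(text):
    """Turn config-file text into a list of --long_name style arguments.

    Hypothetical helper: the argument rendering is an assumption.
    """
    args = []
    for raw_line in text.splitlines():
        # Everything after '#' is a comment, whether full-line or trailing.
        line = raw_line.split("#", 1)[0]
        # A line without the colon delimiter is ignored entirely.
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field = field.strip()
        value = value.strip()
        # Whitespace may surround a field but must not appear inside it.
        if not field or not value or any(ch.isspace() for ch in field):
            continue
        if value == "ON":
            # "No value" fields carry ON and become bare flags.
            args.append("--" + field)
        else:
            args.append("--" + field + "=" + value)
    return args


def expand_cfg(argv, read_file):
    """Replace each --cfg=PATH argument, in place, with the parsed arguments.

    read_file is a callable returning the file contents for a path, so the
    sketch can be exercised without touching the file system.
    """
    out = []
    for arg in argv:
        if arg.startswith("--cfg="):
            out.extend(parse_cfg_text(read_file(arg[len("--cfg="):])))
        else:
            out.append(arg)
    return out
```

Expanding in place keeps the relative order of command-line options, which matches the statement that config parameters are added at the location of the --cfg option.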
    • Make segmentation compatible with scaling · 5e911428
      Frederic Barbier authored
      Fall back to an intra block when the segmentation does not allow
      referencing a frame in the scaling use case (e.g. after using a
      segmentation map with different dimensions).
      Change-Id: I4aa037f07ec3d18c96752e0d49f5afa4e8674d49
    • Reduce reference MV search · 880ab1ca
      Yunqing Wang authored
      The VP9-style reference MV search (find_mv_refs_idx) still exists in
      AV1. It gathers reference MVs in mv_ref_list, which is used to set
      nearestmv and nearmv.
      This patch changes the ref_mv search order: first call
      setup_ref_mv_list() to find MVs with the same reference frame. If >= 2
      MVs are found, no more search is needed. Otherwise, add MVs with
      different reference frames. The purpose is to speed up the decoder.
      Since we now depend on setup_ref_mv_list() to find same-reference-frame
      MVs, this change does change the bitstream, but it has little effect
      on the Borg test result:
               avg_psnr  ovr_psnr  ssim
      lowres:  -0.013    -0.016    -0.047
      Change-Id: I219b9ca097b8fa90335d5b00f6edd639886f414d
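The search order described above can be sketched as a two-pass candidate scan. This is a simplified illustration, not the libaom code. The function `build_ref_mv_list` and the `(ref_frame, mv)` neighbor tuples are hypothetical, and the sketch omits any scaling or sign adjustment a real codec applies to MVs taken from a different reference frame.

```python
def build_ref_mv_list(neighbors, ref_frame, needed=2):
    """Collect up to `needed` candidate MVs for `ref_frame`.

    neighbors: list of (ref_frame, mv) pairs in scan order (hypothetical
    representation of the already-coded neighboring blocks).
    """
    mv_list = []

    # Pass 1: only MVs that point to the same reference frame.
    for nb_ref, mv in neighbors:
        if nb_ref == ref_frame and mv not in mv_list:
            mv_list.append(mv)

    # If enough same-reference MVs were found, skip the second pass.
    if len(mv_list) >= needed:
        return mv_list[:needed]

    # Pass 2: top up with MVs that point to different reference frames.
    for nb_ref, mv in neighbors:
        if nb_ref != ref_frame and mv not in mv_list:
            mv_list.append(mv)
            if len(mv_list) == needed:
                break
    return mv_list
```

The early return is where the decoder speed-up would come from: when the first pass succeeds, the second scan never runs.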
    • [dist-8x8] Restore removed asserts · 5574c604
      Yushin Cho authored
      Reverts commit 796d4e29, then puts the restored asserts behind the
      DEBUG_DIST_8X8 compile-time flag.
      Change-Id: I8729abeca83a664d86188e21998d8d60b00120db
  2. 03 Feb, 2018 12 commits
    • Fix build for entropy-stats · 45b65afc
      Hui Su authored
      Change-Id: I6873b43bb0fedf50ce90b84d60dc22d2fe8c3e2b
    • Rework hash map for txk-sel · 73bc2aa3
      Jingning Han authored
      txk-sel allows each transform block to select its own transform
      kernel. This locality makes it possible to store the selected RD
      cost, including the tx_type selection, per transform block size,
      which reduces the required hash map size to 1/16 of what is needed
      without txk-sel.
      This commit reworks the hash map RD cost fetch for txk-sel. Tested
      on red_kayak_480p at speed 1, enabling txk-sel makes encoding 12%
      faster than the baseline without txk-sel. Further enabling the
      reduced hash map size makes speed 1 another 10% faster.
      Change-Id: I4a5d99d27e2a76b10e76c00a8178f692c95fdf13
    • Allow aq modes to reset the tx type in the encoding stage · 62129f98
      Jingning Han authored
      The aq modes do not keep the RD loop and the final encoding stage
      consistent, because the segment id changes between them. Allow them
      to reset the transform kernel types when needed.
      Change-Id: Idecf054cc8be0a03eccf2867f19a1a195ab82e8f
    • Add aom_comp_mask_pred_avx2 · 3c74dd45
      Peng Bin authored
      1. Add an AVX2 implementation of aom_comp_mask_pred.
      2. For width 8, still use the ssse3 version.
      3. For the other widths (16, 32), the AVX2 version is 1.2x-2.0x
      faster than the ssse3 version.
      Change-Id: I80acc1be54ab21a52f7847e91b1299853add757c
    • Refactor transform kernel search · b80466f6
      Jingning Han authored
      Feed the rate-distortion structure into the search function.
      Change-Id: Id3997fea87e8aa6d0b42e64b11aa79a8c3e15af7
    • Turn on txk-sel by default · 62497a86
      Jingning Han authored
      Change-Id: Ib95dba539a3677421d4c7ee5e2f3faaf2ebc8773
    • Remove a redundant call of av1_init_plane_quantizers() · 7942c875
      Yushin Cho authored
      With aq_mode=VARIANCE_AQ, the av1_init_plane_quantizers() is
      called in set_segment_rdmult().
      Change-Id: Id2584a0544ee633832b844ba06c137236068c4b9
    • comp_mask_pred:process each width separately · 953b77ee
      Peng Bin authored
      There are 3 valid input widths for aom_comp_mask_pred_ssse3.
      Processing each width (8, 16, 32) separately achieves a
      1.2x~1.5x speedup compared to the original ssse3 version.
      Change-Id: Ida3699e2e6ca98d1f9c7662d48806b299af26f10
    • Replace 64 bit operations with 32 bit ones · f06f641f
      Yaowu Xu authored
      Change-Id: Ic51231510fc8bb897f8ca771dd4e750d0e1cd693
    • av1_txfm,cosmetics: s/(min|max)Value/\1_value/ · b785b95a
      James Zern authored
      Change-Id: I35a6ac83d8a94c803148e7ad9366053599f747a0
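The substitution in the commit title, s/(min|max)Value/\1_value/, is a sed-style rename with one capture group. A minimal Python equivalent, using a hypothetical helper `rename_min_max` and an invented input string, might look like this:

```python
import re


def rename_min_max(source):
    """Rename minValue/maxValue to min_value/max_value in a source string.

    The capture group keeps whichever prefix matched, and \\1 re-inserts it
    in front of the new "_value" suffix.
    """
    return re.sub(r"(min|max)Value", r"\1_value", source)
```

The pattern is unanchored, so it also rewrites longer identifiers that merely contain minValue or maxValue. Whether that matters depends on the code base being renamed.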
    • av1_txfm: inline range_check_value, clamp_value · dc857593
      James Zern authored
      Change-Id: I972d0304c6ff495f5f484fe77270c420a0dfe376
    • quiet warnings with CONFIG_DEBUG · ba575a09
      James Zern authored
      unused var plane_bsize
      Change-Id: I02d75ec5ceab2f9d61a1a4ff5b5f1bc2d1b0a7a4
  3. 02 Feb, 2018 16 commits
    • Implement sse2 inv 1d txfms · 1637d424
      Angie Chiang authored
      Change-Id: I9a42b75de3e623f6af325edbe91e299c0662f19c
    • Re-allow 32x32 inter idtx. · af73d536
      Thomas Daede authored
      Change-Id: Iabdeb4ef7a98b034a4777527f727231f7b8815ee
    • Move encoder-only function to encodemb.c · 750f6445
      Sebastien Alaiwan authored
      Change-Id: Id7ea17a5124215907d076e0e3500b9aeea1146fc
    • Remove code for CONFIG_FAST_SGR=2 and cleanup · 1a709944
      Debargha Mukherjee authored
      Change-Id: I01cecc829e2d57517427a1de6387e91ba3c64312
    • SSE4 and AVX2 implementations of updated FAST_SGR · d051e560
      Imdad Sardharwalla authored
      The SSE4.1 and AVX2 implementations of the self-guided filter have been updated
      to match the updated FAST_SGR C implementation in restoration.c.
      The self-guided filter speed tests have been altered to compare the speeds of
      the SIMD and C implementations of the relevant functions.
      Speed Tests (code compiled with CLANG)
      For LowBD:
      - The SSE4.1 implementation is ~220% faster (~69% less time) than the C code
      - The AVX2 implementation is ~314% faster (~76% less time) than the C code
      For HighBD:
      - The SSE4.1 implementation is ~240% faster (~71% less time) than the C code
      - The AVX2 implementation is ~343% faster (~77% less time) than the C code
      Change-Id: Ic2734bb89ccd3f66667c68647e5f677a5a496233
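The speed figures above quote both "N% faster" and "M% less time". The two are related by a simple formula, sketched below on the assumption that "N% faster" means the throughput ratio is 1 + N/100. The helper names are hypothetical; the quoted figures are consistent with this reading.

```python
def less_time_from_faster(faster_pct):
    """Percent reduction in run time for a speedup quoted as 'N% faster'.

    Assumes 'N% faster' means new_time = old_time / (1 + N/100).
    """
    return 100.0 * (1.0 - 1.0 / (1.0 + faster_pct / 100.0))


def faster_from_less_time(less_time_pct):
    """Inverse conversion: 'M% less time' expressed as 'N% faster'."""
    return 100.0 * (1.0 / (1.0 - less_time_pct / 100.0) - 1.0)
```

For example, 220% faster corresponds to a run time of 1/3.2 of the original, which is about 69% less time.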
    • Implement sse2 fwd 1d txfms · 1a796617
      Angie Chiang authored
      Change-Id: I8dcaa6882d47a097498c8f8af515b1185df4fdf3
    • lv-map: move loading of default CDFs to av1_default_coef_probs() · 3d288156
      Hui Su authored
      In preparation for supporting q_adapt_probs.
      Change-Id: I4a39b81b0d2c4ceb1586ae411a1216c6c20d896d
    • Reduce memory usage of inter_tx_size[] in MB_MODE_INFO · 7167d952
      Hui Su authored
      Reduce the length of inter_tx_size[] from 1024 to 16.
      On a cif test sequence,
      encoder memory consumption decreases by 18% (380MB -> 312MB);
      decoder memory consumption decreases by 56% (21.4MB -> 9.4MB).
      Change-Id: I42928eb9312748f96f4393c8d8040791f38f98b6
    • Cleanup deprecated comments · e5d166ef
      Frederic Barbier authored
      Change-Id: I91f18c498c694829b933bb73812ad94d66962994
    • AVX2 implementation of the Wiener filter · aab6aee3
      Imdad Sardharwalla authored
      Added an AVX2 version of the Wiener filter, along with associated tests. Speed
      tests have been added for all implementations of the Wiener filter.
      Speed Test results
      Low bit-depth filter:
      - SSE2 vs C: SSE2 takes ~92% less time
      - AVX2 vs C: AVX2 takes ~96% less time
      - SSE2 vs AVX2: AVX2 takes ~43% less time (~74% faster)
      High bit-depth filter:
      - SSSE3 vs C: SSSE3 takes ~92% less time
      - AVX2  vs C: AVX2  takes ~96% less time
      - SSSE3 vs AVX2: AVX2 takes ~46% less time (~84% faster)
      Low bit-depth filter:
      - SSE2 vs C: SSE2 takes ~84% less time
      - AVX2 vs C: AVX2 takes ~88% less time
      - SSE2 vs AVX2: AVX2 takes ~27% less time (~36% faster)
      High bit-depth filter:
      - SSSE3 vs C: SSSE3 takes ~85% less time
      - AVX2  vs C: AVX2  takes ~89% less time
      - SSSE3 vs AVX2: AVX2 takes ~24% less time (~31% faster)
      Change-Id: Ide22d7c09c0be61483e9682caf17a39438e4a208
    • Don't use extra lines for r=2 guided filter · f7d1ff49
      Debargha Mukherjee authored
      Changes the CONFIG_FAST_SGR=1 strategy so that the r=1 filter uses
      no subsampling, while the r=2 filter sub-samples vertically but,
      for odd rows, combines only by filtering horizontally in the last
      stage.
      The coding efficiency loss seems quite minimal.
      Change-Id: I5644ac400b387c37a2d278db7f6ad3ac0a6b5e93
    • Remove CONFIG_FRAME_SIGN_BIAS config flag · 23b54841
      Debargha Mukherjee authored
      Change-Id: I6138519456b2ad3ffc8bced803ddc4418b246e74
    • Port first pass stats handling from Vp9 into Av1 · da01c0dc
      Debargha Mukherjee authored
      Some parameter tuning included.
      lowres (q, 30 frames, speed 1):
      -1.243% av PSNR, -2.337% ov PSNR, +0.577% SSIM
      lowres (vbr, 30 frames, speed 1):
      -0.327% av PSNR, -1.007% ov PSNR, +0.182% SSIM
      A few videos become a lot worse in SSIM, which needs to be
      investigated. But PSNR-wise the patch seems pretty good.
      Change-Id: I17c8d812c96ee49ddae7d3959a459aa3ffcea208
    • Remove aom_comp_mask_upsampled_pred from rtcd · f8daa92d
      Peng Bin authored
      Since aom_comp_mask_upsampled_pred just calls aom_upsampled_pred
      and aom_comp_mask_pred, there is no longer any need to separate
      the C version from the SIMD version.
      Change-Id: I1ff8bcae87d501c68a80708fd2dc6b74c6952f88
    • Align allocated buffers · a64c05b5
      Yaowu Xu authored
      Change-Id: I5a8bdbd472213ded2de706c5b044a1bf24823670
    • Fix txk-sel unit test failure in aq-mode · 1e79d90e
      Jingning Han authored
      The current aq mode encoder setting would alter the segment_id
      between the rate-distortion optimization and the block encoding
      stages. Disable the corresponding consistency check in this case.
      Change-Id: Ic910a23fd64a9b4554567d3c8c9a9ae5f6062c7b
  4. 01 Feb, 2018 7 commits