  1. 31 Jul, 2017 5 commits
    • Angie Chiang's avatar
      Turn on convolve_round by default · 71ef7c27
      Angie Chiang authored
      The performance on default experiment is
      lowres: 0.812%
      
      midres/hdres and AWCY tests are still running
      
      Change-Id: Id2209c79df6517732dd06c2712a7bdefde118ead
      71ef7c27
    • Rupert Swarbrick's avatar
      Fix compiler warning in bitstream.c · 223f0489
      Rupert Swarbrick authored
      The write_motion_mode function only uses its "cm" parameter if it
      needs to write out global motion information or distinguish between
      motion_var and warped_motion. When these are disabled, you get a
      compiler warning which this patch silences.
      
      Change-Id: I64d06a150751cd72cf4b50799432f3161ee87938
      223f0489
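
A minimal sketch of how this kind of warning is commonly silenced in C. The macro names, the struct, and the function below are hypothetical stand-ins, not the actual libaom code, and the real patch may use a different mechanism:

```c
/* Hypothetical stand-ins for build-time experiment flags. */
#define CONFIG_GLOBAL_MOTION_SKETCH 0
#define CONFIG_WARPED_MOTION_SKETCH 0

/* Hypothetical stand-in for the codec's common state. */
typedef struct {
  int allow_warp;
} CommonSketch;

/* Returns how many symbols would be written for this block. */
static int write_motion_mode_sketch(const CommonSketch *cm, int motion_mode) {
  int symbols = 1; /* the motion mode itself */
#if CONFIG_GLOBAL_MOTION_SKETCH || CONFIG_WARPED_MOTION_SKETCH
  /* Only these configurations actually read from cm. */
  if (cm->allow_warp && motion_mode > 0) ++symbols;
#else
  /* Mark the parameters as intentionally unused so the compiler
     does not emit an unused-parameter warning. */
  (void)cm;
  (void)motion_mode;
#endif
  return symbols;
}
```

With both flags set to 0, the function never touches `cm`, and the `(void)` casts keep the build warning-free.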
    • Rupert Swarbrick's avatar
      Fix build of encodeframe.c without motion-var or warped-motion · bf828f93
      Rupert Swarbrick authored
      This patch surrounds two uses of motion_mode_cost and motion_mode_cdf
      with preprocessor #if lines.
      
      Both uses were added by commit bdc8dab2.
      
      Change-Id: I7e4a74e97b9179e42bae6ee17e9b2094acb992f2
      bf828f93
    • Rupert Swarbrick's avatar
      Fix unused parameter warning in decodemv.c · 766c9d59
      Rupert Swarbrick authored
      This warning only comes up if none of the experiments ext-intra,
      palette and filter-intra are enabled.
      
      Change-Id: Ic58863d8d845034aa52230bf52a3c5def8d3ac0f
      766c9d59
    • Yue Chen's avatar
      motion_var: compute motion_mode_cost from cdf · bdc8dab2
      Yue Chen authored
      Initialize mode cost using frame-level cdf.
      Also in rd selection stage, cdf is updated per 64x64.
      Performance gain 0.20%
      
      Still suboptimal since in real bitstream packing, cdf is updated
      per symbol. Per symbol update in RDO is work in progress.
      
      Change-Id: I5062af91d8b00e5bf4c08abd0a7bfb0e5b27a619
      bdc8dab2
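
As a rough illustration of deriving symbol costs from a cdf: the cost of a symbol in bits is -log2 of its probability, and the probability is the difference between adjacent cdf entries. The sketch below assumes an increasing cdf on a 15-bit scale and uses its own fixed-point log2; the names, the cdf layout, and the 1/256-bit cost unit are assumptions for illustration, not libaom's actual tables or helpers.

```c
#include <assert.h>

#define CDF_SCALE_SKETCH 32768 /* assumed 15-bit probability scale */

/* Approximates -log2(p / 32768) in 1/256-bit units, for p in [1, 32768]. */
static int cost_q8_sketch(int p) {
  unsigned x = (unsigned)p;
  unsigned frac = 0;
  int int_bits = 0;
  assert(p >= 1 && p <= CDF_SCALE_SKETCH);
  /* Normalize x into [2^15, 2^16), counting whole bits of cost. */
  while (x < 32768u) {
    x <<= 1;
    ++int_bits;
  }
  /* Extract 8 fractional bits of log2(x / 2^15) by repeated squaring. */
  for (int i = 0; i < 8; ++i) {
    x = (x * x) >> 15;
    frac <<= 1;
    if (x >= 65536u) {
      frac |= 1u;
      x >>= 1;
    }
  }
  return int_bits * 256 - (int)frac;
}

/* cdf[i] holds P(symbol <= i) * 32768, so cdf[n - 1] == 32768. */
static void costs_from_cdf_sketch(const int *cdf, int n, int *cost_q8) {
  int prev = 0;
  for (int i = 0; i < n; ++i) {
    int p = cdf[i] - prev;
    if (p < 1) p = 1; /* keep the cost finite for zero-probability symbols */
    cost_q8[i] = cost_q8_sketch(p);
    prev = cdf[i];
  }
}
```

Recomputing such a table whenever the cdf is updated (per frame, or per 64x64 block as described above) keeps the encoder's cost estimate closer to what the entropy coder will actually spend.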
  2. 30 Jul, 2017 1 commit
    • Yaowu Xu's avatar
      Update 8-bit frame buffers for Global Motion estimation · 0006073f
      Yaowu Xu authored
      This commit makes sure that the 8-bit frame buffers used in global
      motion estimation are updated, which helps global motion improve
      compression for hbd internal encoding.
      
      On lowres 12 encoding, the improvements are: 
      Overall PSNR: .896%
      SSIM:  1.159%
      PSNR HVS: .952%
      
      Change-Id: I5d75c231407bc1e4ed564c3a216bdd1ec3919f14
      0006073f
  3. 29 Jul, 2017 2 commits
    • Monty Montgomery's avatar
      Add CONFIG_DAALA_DCT16 experiment. · cb9c1c52
      Monty Montgomery authored
      This experiment replaces the 16-point Type-II DCT and 16-point Type-IV
      DST scaling vp9 transforms with the 16-point orthonormal Daala
      transforms.  These have reduced complexity and are perfect
      reconstruction.  There is currently no net coding performance impact.
      
      subset-1:
      
        monty-square-baseline-s1-F@2017-07-23T03:43:45.042Z ->
           monty-square-dct16-s1-F@2017-07-23T03:42:29.805Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0152 | -0.0028 | -0.0929 |  -0.0432 | -0.0457 | -0.0425 |    -0.0237
      
        objective-1-fast:
      
        monty-square-baseline-o1f-F@2017-07-23T03:44:19.973Z ->
           monty-square-dct16-o1f-F@2017-07-23T03:43:22.549Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0305 |  0.0926 | -0.1600 |   0.0471 | 0.0219 | -0.0075 |     0.0135
      
      Change-Id: I54fed26d65fd8450693334bb400b1fafd7e0dacb
      cb9c1c52
    • David Michael Barr's avatar
      [CFL] Uniform Q3 alpha grid with extent [-2, 2] · f6eaa159
      David Michael Barr authored
      Expand the range of alpha to [-2, 2] in Q3.
      Jointly signal the signs, including zeros.
      Use the signs to give context for each quadrant
      and half-axis. The (0, 0) point is excluded.
      Symmetry in alpha_u == alpha_v yields 6 contexts.
      
      Results on Subset1 (Compared to 9136ab7d with CFL enabled)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0792 | -0.7535 | -0.7574 |  -0.0639 | -0.0843 | -0.0665 |    -0.3324
      
      Change-Id: I250369692e92a91d9c8d174a203d441217d15063
      Signed-off-by: David Michael Barr <b@rr-dav.id.au>
      f6eaa159
  4. 28 Jul, 2017 6 commits
    • Urvang Joshi's avatar
      Fix logical errors for TX64x64. · 9136ab7d
      Urvang Joshi authored
      - Wrong function argument fix: this was not caught by compile test
      because DCT_DCT has a value of 0, which was converted to a NULL pointer.
      - Wrong prob array size.
      
      Change-Id: Iaf1747dc7fb40db1d1ab35f965fb60994d8dec95
      9136ab7d
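
The first item can be reproduced in isolation. In C, an enumeration constant with value 0 is an integer constant expression equal to zero, so it is accepted as a null pointer constant wherever a pointer is expected. The names below are hypothetical and only mirror the shape of the problem; some compilers emit a warning for this conversion, others accept it silently.

```c
#include <stddef.h>

/* Hypothetical transform-type enum whose first member is 0. */
typedef enum { DCT_DCT_SKETCH = 0, ADST_DCT_SKETCH = 1 } TxTypeSketch;

/* Hypothetical function that expects a pointer argument. */
static int is_null_scan_order(const int *scan_order) {
  return scan_order == NULL;
}

static int pass_enum_as_pointer(void) {
  /* Intended to pass a pointer, but an enum constant was passed instead.
     Because DCT_DCT_SKETCH is a constant expression with value 0, C treats
     it as a null pointer constant and the call is not a constraint
     violation, so a compile test does not catch it. */
  return is_null_scan_order(DCT_DCT_SKETCH);
}
```

Passing `ADST_DCT_SKETCH` (value 1) in the same position would be diagnosed, which is why only the zero-valued constant slipped through.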
    • Urvang Joshi's avatar
      Fix tx64x64 debug build · 2dc1c841
      Urvang Joshi authored
      Change-Id: I1b77416eaae000ae40e139d8f7fc31754f817bba
      2dc1c841
    • Yushin Cho's avatar
      Fix dist_8x8 broken with 3bce7547 · a4817a6b
      Yushin Cho authored
      Commit 3bce7547 introduced another early exit based on MSE distortion
      in the transform domain. It enables skipping trellis coding and the call
      to av1_dist_block() in block_rd_txfm(), and skipping trellis coding in
      av1_tx_block_rd_b().

      However, with dist-8x8, the early exit for a sub8x8 tx block in a partition >= 8x8 in plane 0
      is disabled, because the reference distortion metric
      (which would be non-MSE and applied to 8x8 or larger) cannot be compared to
      MSE distortions of sub8x8 tx blocks.
      
      Change-Id: I46ada7c90a869d23fc0f0166a01dfdc5392af311
      a4817a6b
    • Luc Trudeau's avatar
      [CFL] New UV_PREDICTION_MODE for CFL · 6e1cd787
      Luc Trudeau authored
      CfL is now an independent mode.
      
      Results on Subset1 (Compared to 4266a7ed with CFL enabled)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.1645 | -0.4017 |  0.2475 |  -0.1851 | -0.2179 | -0.2338 |    -0.2897
      
      Change-Id: I2e86e7ea7bfc12bb1d763e70a136ca992d57a3c5
      6e1cd787
    • Jingning Han's avatar
      Conditionally skip inverse transform in transform block RD · 1a7f0a8c
      Jingning Han authored
      When the lower bound of a transform block's rate-distortion cost is
      above the current best rd cost, the only way this particular coding
      mode can still be chosen is by falling back to all-skip mode. Hence
      there is no need to estimate the transform block rate cost,
      distortion, etc. Obtaining the sum of squared differences between
      the prediction and the source is sufficient.
      
      This speeds up the encoding process by 5% - 10%.
      
      Change-Id: I728728c3a42aafefd34641f0be69b3e2a9b9bbb2
      1a7f0a8c
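
The early exit described above can be sketched as follows. This is a toy model, not the libaom implementation: the struct, the function names, and the simple `rate * lambda + dist` cost are assumptions, and the "expensive" rate and distortion are passed in as arguments rather than computed.

```c
#include <stdint.h>

/* Hypothetical result record for one transform block. */
typedef struct {
  int64_t rate;
  int64_t dist;
  int full_eval_done; /* 1 if the expensive evaluation path was taken */
} TxBlockRdSketch;

/* Toy cost model: rd = rate * lambda + dist. */
static int64_t rd_cost_sketch(int64_t rate, int64_t dist, int64_t lambda) {
  return rate * lambda + dist;
}

static TxBlockRdSketch eval_tx_block_sketch(int64_t lower_bound_rd,
                                            int64_t best_rd,
                                            int64_t pred_sse,
                                            int64_t full_rate,
                                            int64_t full_dist) {
  TxBlockRdSketch r;
  if (lower_bound_rd > best_rd) {
    /* Coding the residual cannot beat the current best, so only the
       all-skip fallback remains. The prediction-vs-source SSE is then
       the distortion; the residual rate is taken as zero in this toy. */
    r.rate = 0;
    r.dist = pred_sse;
    r.full_eval_done = 0;
    return r;
  }
  /* Otherwise use the full estimate (inverse transform, rate, etc.). */
  r.rate = full_rate;
  r.dist = full_dist;
  r.full_eval_done = 1;
  return r;
}
```

The saving comes from the first branch: it needs only the SSE, which is far cheaper than reconstructing the block.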
    • Jonathan Matthews's avatar
      Adapt palette cdf · 7abe9db7
      Jonathan Matthews authored
      Bug introduced in change: Ic4c9333c9af5993bc41e513b9e766450b3a951eb
      
      BUG=aomedia:667
      
      Change-Id: I29ab87f32d2f940a3d1e079f734b92467d2ebea9
      7abe9db7
  5. 27 Jul, 2017 6 commits
    • Tom Finegan's avatar
      Fix CMake asm flags when CONFIG_PIC is enabled. · 7596ee16
      Tom Finegan authored
      The asm flags weren't getting updated when CONFIG_PIC
      was enabled via the command line.
      
      Change-Id: Ie4654337d2c7cba87d6902eb2b85097d1ab9e7ca
      7596ee16
    • Tom Finegan's avatar
      Sync CMake build with the configure build. · 63bd445d
      Tom Finegan authored
      Added: CONFIG_INSTALL_DOCS, CONFIG_ALTREF2, CONFIG_FLEX_REFS
             CONFIG_LPF_DIRECT.
      Changed, 0 => 1: CONFIG_RECT_INTRA_PRED
      
      Change-Id: I1e958ab7dcd0c791b33a0ac5104fdf557f2cd29c
      63bd445d
    • Cheng Chen's avatar
      Make CDEF work with EXT_PARTITION · f5bdeac2
      Cheng Chen authored
      Make CDEF select filter strength every 64x64 block when block size
      could be larger than 64x64.
      
      With/without this patch, coding performances on AWCY and Google
      test of lowres and midres are neutral.
      
      BUG=aomedia:662
      
      Change-Id: Ief82cc51be91fc08a7c6d7e87f6d13bcc4336476
      f5bdeac2
    • Cheng Chen's avatar
      Select filter level for U, V planes · e94df5cf
      Cheng Chen authored
      Previously, U, V planes share the same filter level with Y.
      Here, we search and pick the best filter level for U, V planes.
      Selected filter levels are transmitted per frame.
      This works with parallel_deblocking.
      
      Coding gain on Google test set:
                Avg_psnr  ovr_psnr  ssim
      lowres:   -0.116    -0.120    -0.339
      midres:   -0.218    -0.228    -0.338
      hdres:    -0.260    -0.264    -0.365
      
      Change-Id: I03d2ac47539f3eea9f3c4b08007bd6d3f4b73572
      e94df5cf
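
A minimal sketch of picking a filter level per plane, assuming a brute-force scan over levels 0..63 and a caller-supplied error function. The names, the level range, and the exhaustive search are assumptions for illustration; the actual encoder search may be smarter than this.

```c
#include <stdint.h>

#define MAX_FILTER_LEVEL_SKETCH 63 /* assumed level range 0..63 */

/* Hypothetical callback: reconstruction error of `plane` when filtered
   at `level`. */
typedef int64_t (*plane_err_fn)(int plane, int level);

/* Tries every level for one plane and keeps the one with least error. */
static int pick_level_sketch(int plane, plane_err_fn err) {
  int best_level = 0;
  int64_t best_err = err(plane, 0);
  for (int level = 1; level <= MAX_FILTER_LEVEL_SKETCH; ++level) {
    const int64_t e = err(plane, level);
    if (e < best_err) {
      best_err = e;
      best_level = level;
    }
  }
  return best_level;
}

/* Toy error surface: quadratic around a different optimum per plane
   (0 = Y, 1 = U, 2 = V). */
static int64_t toy_plane_err(int plane, int level) {
  static const int optimum[3] = { 20, 8, 11 };
  const int64_t d = level - optimum[plane];
  return d * d;
}
```

Calling `pick_level_sketch` once per plane yields three independent levels, which is the per-frame information the commit describes transmitting.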
    • Monty Montgomery's avatar
      Add proper CMAKE dependencies for CONFIG_DAALA_DCT4 experiment · caca1355
      Monty Montgomery authored
      CONFIG_DAALA_DCT8 added the necessary related configuration
      enable/disable logic to the CMAKE build to be consistent with the
      Automake build system, but the earlier CONFIG_DAALA_DCT4 commits did
      not.  This brings CONFIG_DAALA_DCT4 up to date and consistent with
      DCT8.
      
      Change-Id: Iba9fda00c251f5477fdb4c35fc5cd8874050b530
      caca1355
    • Angie Chiang's avatar
      Fix convolve_round's compile error · 7346ca19
      Angie Chiang authored
      Change-Id: I63fc3f1f010e77c6dc033f37e3e91ade17a55099
      7346ca19
  6. 26 Jul, 2017 13 commits
    • Jingning Han's avatar
      Reduce best rdcost value in transform partition search · 16a9df75
      Jingning Han authored
      Adaptively reduce the best rate-distortion cost value in the
      recursive transform block partition search. For bus CIF at 1000 kbps
      this reduces the encoding time from 1864 seconds to 1756 seconds,
      about 6% speed up.
      
      Change-Id: I5433a1825c0f8b13fcc5ab7e19713a98969d53fc
      16a9df75
    • Yue Chen's avatar
      rect_tx_ext: work with var_tx · d6bdd46b
      Yue Chen authored
      Change-Id: Ie2c34490dc50cb242bcd701308e6b55243883b15
      d6bdd46b
    • Angie Chiang's avatar
      Add some todo for convolve_round exp · 748d570e
      Angie Chiang authored
      1) Integrate it with supertx
      2) Integrate it with chroma_sub8x8
      
      Change-Id: If4bb906d442d15bae3741192029ec851c48d3948
      748d570e
    • Luc Trudeau's avatar
      [CFL] UV_PREDICTION_MODE · d6d9eeeb
      Luc Trudeau authored
      A separate prediction mode struct is added to allow
      for uv-only modes (like CfL). Note: CfL will be
      added as a separate mode in an upcoming commit.
      
      Results on Subset1 (Compared to 4266a7ed with CfL enabled)
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: Ie80711c641c97f745daac899eadce6201ed97fcc
      d6d9eeeb
    • Sarah Parker's avatar
      Add txfm functions corresponding to MRC_DCT · 5b8e6d2d
      Sarah Parker authored
      MRC_DCT uses a mask based on the prediction signal to modify the
      residual before applying DCT_DCT. This adds all necessary functions
      to perform this transform and makes the prediction signal available
      to the 32x32 txfm functions so the mask can be created. I am still
      experimenting with different types of mask generation functions and
      so this patch contains a placeholder. This patch has no impact on
      performance.
      
      Change-Id: Ie3772f528e82103187a85c91cf00bb291dba328a
      5b8e6d2d
    • Angie Chiang's avatar
      Integrate hbd convolve_round and compound_segment · 0c604285
      Angie Chiang authored
      When convolve_round is turned on, both lbd and hbd use a 32-bit buffer.
      Therefore, they use the same mask/blending functions.
      
      Change-Id: Icfc6db818c0a53216108e42161acac07303e6c1c
      0c604285
    • Angie Chiang's avatar
      Use ADAPT_SCAN_PROB_PRECISION to init prob · 963b86d3
      Angie Chiang authored
      Change-Id: I94d66c65d78235e1025703caf79ccca43208d604
      963b86d3
    • hui su's avatar
      Palette: use CDF to encode palette size and color indices · 466ae062
      hui su authored
      Around 0.9% improvement on screen_content set (encoding 30 frames).
      
      Change-Id: Ic4c9333c9af5993bc41e513b9e766450b3a951eb
      466ae062
    • Yaowu Xu's avatar
      use frame_type for key frame check · 1b4ffc44
      Yaowu Xu authored
      Change-Id: I416a7f99e292a6304bc24d93ab580650768d5e21
      1b4ffc44
    • Jingning Han's avatar
      Optimize transform block rate-distortion search · 3bce7547
      Jingning Han authored
      The soft coefficient optimization process would monotonically
      increase the transform block distortion and decrease the
      coefficient rate cost. Such observation provides a lower bound
      on the rate-distortion cost for the given transform block. This
      commit compares this lower bound against the best available
      rate-distortion cost value and skips unnecessary optimization
      process. It speeds up the baseline encoding process by 15%.
      
      Change-Id: Ida8098a2820cef60d59ec1e72f0bbb1acbd98165
      3bce7547
    • Di Chen's avatar
      Disable extra altref and bwdref for still gf group · 53a04f66
      Di Chen authored
      Use three metrics to identify the still gf group.
      Performance:
      lowres: pamphlet_cif -1.395; bowing_cif -0.989;
              others remain same. Overall -0.064
      midres: snow_mnt_480p -0.827. others remain same.
              Overall -0.028
      
      Change-Id: I22a6429c7ebdad2c36ec73c7a69cabc07e8208b7
      53a04f66
    • David Barker's avatar
      Avoid reading uninitialized data in decodemv.c · 0d7c4b05
      David Barker authored
      The existing code has a case where we set a variable to equal
      xd->ref_mv_stack[mbmi->ref_frame[0]][1 + mbmi->ref_mv_idx]
      even for compound blocks. However, the range of allowable
      values for mbmi->ref_mv_idx is determined by the ref_mv_count
      for the *combined* ref frame, not for the first single ref frame.
      
      This means that, if we have more ref-mv candidates for the combined
      ref frame than for the first single ref frame, then we can sometimes
      fetch uninitialized data.
      In every case where this happens, we immediately overwrite
      the destination with the correct mv, but it is still preferable
      to avoid reading uninitialized data.
      
      This patch moves the code block to avoid this bug. In addition,
      the variable (nearmv[0]) is only used when the mode equals NEARMV,
      so the condition on its assignment is changed to reflect that.
      
      Change-Id: I3bd268dc80d8065d5189999232b8a0f826d40a95
      0d7c4b05
    • Monty Montgomery's avatar
      Add CONFIG_DAALA_DCT8 experiment. · cf18fe4e
      Monty Montgomery authored
      This experiment replaces the 8-point Type-II DCT and 8-point Type-IV DST
      scaling vp9 transforms with the 8-point orthonormal Daala transforms.
      These have reduced complexity and are perfect reconstruction at the cost
      of a slightly worse coding performance.
      This is because the Daala transforms expect the input to be shifted by 4
      bits but the output scale of the vp9 transforms is only 3 bits.
      
      subset-1:
      
      monty-square-baseline-subset1 ->
        monty-square-dct8-subset1@2017-07-17T21:37:44.281Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0019 | -0.0011 | -0.0585 |  -0.0111 | 0.0305 |  0.0317 |     0.0187
      
      objective-1-fast:
      
      monty-square-baseline-o1f ->
        monty-square-dct8-o1f@2017-07-17T21:37:15.735Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0285 |  0.0129 | -0.5080 |   0.0529 | 0.0345 |  0.0441 |     0.0054
      
      Change-Id: I2b775495398fb717204a295397c3c5e3ca938183
      cf18fe4e
  7. 25 Jul, 2017 5 commits
    • Yushin Cho's avatar
      Make matching { and } searchable in inter mode decision · 67dda51a
      Yushin Cho authored
      Because #if ... #else ... blocks put the '{' on the same line, dangling { or } occur,
      which causes automatic syntax analyzers, such as 'Ctrl-Shift-P' in Eclipse
      or '%' in vi, to fail to find the matching { and }.

      For some developers, this can make quickly reading and/or understanding blocks of code
      almost impossible.

      Three functions or blocks are repaired.
      1. av1_rd_pick_inter_mode_sb() {...}
      
      2. for (midx = 0; midx < MAX_MODES; ++midx) {...}
         in av1_rd_pick_inter_mode_sb()
      
      3. handle_inter_mode() {...}
      
      Change-Id: Ib5ac63b8c7f9870a491fac337ae3f58c57ce5e46
      67dda51a
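
The brace problem can be shown on a small, self-contained case. The macro and functions below are hypothetical and unrelated to the actual inter mode decision code; they only illustrate the pattern and one possible repair.

```c
#define CONFIG_EXT_SKETCH 1 /* hypothetical experiment flag */

/* Before: each preprocessor branch opens its own '{', so the source text
   contains two opening braces for a single closing brace. Editors that
   match braces textually get confused. */
static int sum_before(int n) {
  int s = 0;
#if CONFIG_EXT_SKETCH
  for (int i = 0; i < n; i += 2) {
#else
  for (int i = 0; i < n; ++i) {
#endif
    s += i;
  }
  return s;
}

/* After: only the part that differs is conditional, and the loop has
   exactly one '{' and one '}' in the source text. */
static int sum_after(int n) {
  int s = 0;
#if CONFIG_EXT_SKETCH
  const int step = 2;
#else
  const int step = 1;
#endif
  for (int i = 0; i < n; i += step) {
    s += i;
  }
  return s;
}
```

Both versions compile to the same behavior; only the second has textually balanced braces.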
    • Jingning Han's avatar
      Account for the 64x64 proc block constraint in obmc masking · 440d4254
      Jingning Han authored
      Make the codec account for the 64x64 processing unit constraint
      when producing the mask for overlapped filter.
      
      Change-Id: I3e596492ae522abe678369b0c9710441549e817e
      440d4254
    • Jingning Han's avatar
      Make maximum obmc process unit 64x64 · 501294ce
      Jingning Han authored
      For 128x128 level blocks, process the overlapped prediction in
      the unit of 64x64. This allows hardware design to reuse the 64x64
      processing unit in 128x128 level block coding.
      
      Change-Id: I3967b8e3c1c697f96a50e59a0957fc69b67e6f8e
      501294ce
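
A sketch of the tiling idea, assuming a generic per-unit callback. The function and type names are hypothetical; the point is only that a 128x128 block is visited as four 64x64 units, so the same 64x64 processing routine serves every block size.

```c
#include <stddef.h>

#define MAX_PROC_UNIT_SKETCH 64 /* maximum processing unit, per the commit */

/* Hypothetical per-unit worker: handles a w x h region at (x, y). */
typedef void (*unit_fn)(int x, int y, int w, int h, void *ctx);

/* Splits a bw x bh block into units of at most 64x64 and calls `fn` on
   each one. Returns the number of units visited. */
static int for_each_proc_unit(int bw, int bh, unit_fn fn, void *ctx) {
  int count = 0;
  for (int y = 0; y < bh; y += MAX_PROC_UNIT_SKETCH) {
    for (int x = 0; x < bw; x += MAX_PROC_UNIT_SKETCH) {
      const int w =
          (bw - x < MAX_PROC_UNIT_SKETCH) ? bw - x : MAX_PROC_UNIT_SKETCH;
      const int h =
          (bh - y < MAX_PROC_UNIT_SKETCH) ? bh - y : MAX_PROC_UNIT_SKETCH;
      if (fn != NULL) fn(x, y, w, h, ctx);
      ++count;
    }
  }
  return count;
}
```

Blocks of 64x64 or smaller pass through as a single unit, so the split only changes behavior for the 128-wide or 128-tall sizes.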
    • Luc Trudeau's avatar
      [CFL] Average alpha CDF · 4266a7ed
      Luc Trudeau authored
      Change-Id: Id556e8d77c5871ddae338baa1abfb93b7aa207e9
      4266a7ed
    • Luc Trudeau's avatar
      [CFL] Fix warnings when chroma_sub8x8 is disabled · 96b31516
      Luc Trudeau authored
      This change does not alter the bitstream
      
      Results on Subset1 (compared to 70a80a81 with cfl)
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I7672eb4cde3c649ebba32610f7e56500e378c062
      96b31516
  8. 24 Jul, 2017 2 commits