1. 04 Aug, 2017 1 commit
    • CDEF cleanup · 94de0aaa
      Steinar Midtskogen authored
      Name changes and code moves to bring code more in line with the
      design doc and an upcoming single-pass patch.  No functional changes.
      
      Change-Id: I2bccd58c644e534b139f420b623390aa971fbdb0
  2. 03 Aug, 2017 1 commit
  3. 02 Aug, 2017 5 commits
    • Add txmg experiment · ad653a39
      Angie Chiang authored
      This experiment aims at merging lbd/hbd txfms
      
      So far this exp uses hbd transform on lbd path.
      The performances I observed are
      lowres -0.089%
      midres  0.065%
      (negative means performance drop)
      
      From here, two main things need to be done:
      1) Fix overflow due to quantizer noise
      2) Generate a 16-bit version from the hbd txfm
      
      Change-Id: I35bb1fc0cbb78decad2570ff5826ed665f739752
    • Fix inconsistency in compound-segment masks · fc256542
      David Barker authored
      The value of 'mask_base' passed to diffwtd_mask is currently
      38 for the lowbd path and 42 for the highbd path. Going off of
      the mode name (DIFFWTD_38), presumably these are both supposed
      to be 38, so change the highbd path accordingly.
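      To illustrate why the two paths should share one base value, here is a
      minimal sketch of a difference-weighted mask. It is a simplified
      stand-in, not the codec's diffwtd_mask: the function name, the flat
      array layout, the bd_shift parameter, and the constants MAX_ALPHA and
      DIFF_FACTOR are all assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_ALPHA 64   /* assumed blend-weight ceiling */
#define DIFF_FACTOR 16 /* assumed divisor applied to the pixel difference */

static int clamp_int(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Fill mask[] with per-pixel weights derived from |src0 - src1|.
 * bd_shift is 0 for 8-bit input and (bit_depth - 8) for high bit depth,
 * so both paths see differences on the same 8-bit scale. */
static void diffwtd_mask_sketch(uint8_t *mask, int mask_base,
                                const uint16_t *src0, const uint16_t *src1,
                                int n, int bd_shift) {
  for (int i = 0; i < n; ++i) {
    const int diff = abs((int)src0[i] - (int)src1[i]) >> bd_shift;
    mask[i] = (uint8_t)clamp_int(mask_base + diff / DIFF_FACTOR, 0, MAX_ALPHA);
  }
}
```

      With the same mask_base on both paths, equivalent 8-bit and 10-bit
      inputs produce identical masks in this sketch; a different base on one
      path would offset every weight on that path.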
      
      Change-Id: I5fb0099c4b8b3ca3c4f211562401b12012f5c002
    • Setup frame/tile boundary when frame/tile geometry changes · 10e23004
      Yi Luo authored
      Change-Id: I44bc9d8887526a5ee92bf79730fa3ce6c73b160b
    • Integrate convolve_round with chroma_sub8x8 · b9a822b4
      Angie Chiang authored
      Change-Id: I9a1b5b6016cd1afbc52cdac4469acb79c412e475
    • Use 10 bits to represent adapt_scan probabilities · a506eb61
      Angie Chiang authored
      Performance drops slightly when using 10-bit probabilities.
      lowres: -0.048%
      midres: 0.007%
      hdres: -0.06%
      
      Change-Id: I5ba7b5607802d084a599b779e5745f88b31e2cbe
  4. 01 Aug, 2017 6 commits
    • Rewrite some asserts to avoid visual studio errors. · d2269d8a
      Urvang Joshi authored
      Visual studio generates errors for a closing bracket on a line by
      itself.
      
      BUG=aomedia:671
      
      Change-Id: I69b0c06a4bf115d62b3625102dcd415708a2aafd
    • Frame context signaling: Remove reset symbols from the bitstream. · a6a854b1
      Thomas Daede authored
      Because frame contexts now follow reference frames, explicit resets
      are no longer necessary, but can simply happen at the same time
      as reference frame resets.
      
      Change-Id: Idbed3794e3ed52fa298346943a3014fa1ca23897
    • Add new experiment: frame_context_signaling. · da4d8b9c
      Thomas Daede authored
      This stores frame contexts alongside a reference frame, and always
      uses the frame in reference slot 0 (LAST_FRAME) as the source of
      the frame context.
      
      The encoder could then reorder reference frames so as to control
      which frame context is used; however, it currently does not.
      
      Low Latency AWCY result:
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.1438 |  0.4161 |     N/A |   0.0386 | -0.0281 |  0.0453 |     0.2514
      
      https://arewecompressedyet.com/?job=before-frame-context-signaling%402017-06-07T23%3A20%3A49.473Z&job=after-frame-context-signaling%402017-06-07T23%3A21%3A36.117Z
      
      Change-Id: I4f6f9b12cb403573efbf9e5c3077d62f5dedc467
    • tempv_signaling: Simplify test for whether prev_frame works for mvs · 1f990a64
      Rupert Swarbrick authored
      For some background, see this previous change in Gerrit[0]. What's
      going on here is that we only want to use a previous frame for motion
      vector prediction if the encoded sizes match. When scaling with
      superres, this means the size before upscaling.
      
      To check this correctly, we need to check prev_frame's width/height
      and compare it with the current frame. Without superres, prev_frame's
      width/height is stored in y_crop_width/y_crop_height so we can check
      that way. With superres, those numbers are after the scaling, so can't
      be compared with cm->width and cm->height.
      
      The previous code worked around this by comparing with cm->last_width
      and cm->last_height. That works because these are the width/height for
      the last encoded and shown frame and that frame *is* prev_frame if
      last_show_frame is true. Since this is the only case when we want to
      use prev_frame, they are the numbers we need.
      
      This patch simplifies the logic by storing the width/height in
      RefCntBuffer before any scaling and then checking that they match.
      
      The check for whether we can use motion vectors from a previous frame
      is factored out into a pair of inline functions in the
      header. frame_might_use_prev_frame_mvs() is true if it's possible that
      this frame could use motion vectors from a previous frame. This
      doesn't use knowledge of what prev_frame is: it just checks we're not
      in error resilient mode and aren't a keyframe. When this is true, a
      flag is signaled in the bitstream to say whether we actually want to
      use motion vectors from the previous frame.
      
      The second function, frame_can_use_prev_frame_mvs(), is true if the
      current frame / previous frame pair is suitable for sharing motion
      vectors. This is a stricter test: the previous frame needs to
      have been shown and not to have been intra_only, and it needs to have
      the same width/height as the current frame.
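      The two tests described above can be sketched as follows. The struct
      types and their fields are hypothetical, pared-down stand-ins for the
      codec's real structures, and folding the loose test into the strict
      one is an assumption based on the word "stricter".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference buffer entry. */
typedef struct {
  int width, height; /* encoded size, before any superres upscaling */
  bool intra_only;
} FrameBufSketch;

/* Hypothetical stand-in for the shared codec state. */
typedef struct {
  int width, height; /* encoded size of the current frame */
  bool error_resilient_mode;
  bool is_key_frame;
  bool last_show_frame; /* true if the previous frame was shown */
  const FrameBufSketch *prev_frame;
} CommonSketch;

/* Loose test: needs no knowledge of prev_frame. When true, a flag is
 * signaled to say whether previous-frame MVs are actually used. */
static inline bool frame_might_use_prev_frame_mvs(const CommonSketch *cm) {
  return !cm->error_resilient_mode && !cm->is_key_frame;
}

/* Strict test: the frame pair must really be suitable for sharing MVs. */
static inline bool frame_can_use_prev_frame_mvs(const CommonSketch *cm) {
  const FrameBufSketch *prev = cm->prev_frame;
  return frame_might_use_prev_frame_mvs(cm) && prev != NULL &&
         cm->last_show_frame && !prev->intra_only &&
         prev->width == cm->width && prev->height == cm->height;
}
```

      Because the stored width/height are taken before scaling, the size
      comparison in this sketch needs no special case for superres.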
      
      If the re-assignment of prev_frame (just before the calls to
      frame_can_use_prev_frame_mvs()) were removed in some way, we could
      probably combine the two functions and often save a bit per frame
      header.
      
      The other slight tidy-up in the patch is to move re-allocation of the
      mvs buffer into onyxc_int.h: the code that did the allocation was
      duplicated between the encoder and decoder.
      
      [0] https://aomedia-review.googlesource.com/c/13806
      
      BUG=aomedia:78
      
      Change-Id: If25227fa24222fc05c56529c2ac9ddf1e1c36a84
    • ext_partition_types: Pass the correct CDF length for partitions · b95cf12e
      Rupert Swarbrick authored
      Each CDF for partitioning square blocks is initialised from
      an entry of default_partition_cdf in entropymode.c. These CDFs are of
      different lengths, depending on which partition types are supported by
      the block size.
      
      For example, 8x8 blocks have a CDF with only 4 entries (PARTITION_NONE
      through PARTITION_SPLIT). Blocks of a size that supports 1:4 and 4:1
      partitions have 10 entries. Currently, that's only 32x32 blocks. All
      other blocks have 8 entries.
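      The rule above amounts to a small lookup. The sketch below uses a
      hypothetical block-size enum and function name; only the entry counts
      (4, 10 and 8) come from the description.

```c
/* Hypothetical subset of square block sizes, for illustration only. */
typedef enum {
  BLK_8X8_SKETCH,
  BLK_16X16_SKETCH,
  BLK_32X32_SKETCH,
  BLK_64X64_SKETCH
} BlockSizeSketch;

/* Number of entries in the partition CDF for a square block size. */
static int partition_cdf_length(BlockSizeSketch bsize) {
  switch (bsize) {
    case BLK_8X8_SKETCH:
      return 4; /* PARTITION_NONE through PARTITION_SPLIT */
    case BLK_32X32_SKETCH:
      return 10; /* also supports 1:4 and 4:1 partitions */
    default:
      return 8; /* all other square sizes */
  }
}
```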
      
      Change-Id: Ie2126b6d41afc0efedcc5b5b37fc1d0427b9a9fa
    • Fix mistake in cdf table for mrc_tx · 964cabf5
      Sarah Parker authored
      One set of values was not monotonic and was causing a mismatch.
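      A table error of this kind can be caught with a simple check. The
      sketch below assumes the table stores cumulative values in increasing
      order; the function name is hypothetical and the check is not part of
      the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if the n cumulative values never decrease. */
static bool cdf_is_monotonic(const uint16_t *cdf, int n) {
  for (int i = 1; i < n; ++i) {
    if (cdf[i] < cdf[i - 1]) return false;
  }
  return true;
}
```

      Running such a check over each default table in a debug build or unit
      test would flag an out-of-order entry before it causes an
      encoder/decoder mismatch.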
      
      Change-Id: Ib599bd1bdee8a85d171b71d02b70549d9916f2b5
  5. 31 Jul, 2017 5 commits
    • Fix w/h of av1_make_masked_inter_predictor · 9ee82650
      Angie Chiang authored
      Change-Id: Idaeb180392d6e96fedbd39f2e1ee0e4b9dba887e
    • Refactor parallel_deblocking · 61a12ef2
      Cheng Chen authored
      1. Change mixed-case variable names to underscore case, following the
         Google C++ style guide:
         https://google.github.io/styleguide/cppguide.html#Variable_Names

      2. Reduce the number of parameters passed. Derive these parameters
         inside the functions when needed.
      
      Change-Id: I17ca8aed20be2f83f9e46275e6a1f01c8f0ec510
    • Another fix of dangling braces for search · 127c5838
      Yushin Cho authored
      These are caused when both the #if and #else branches contain
      if (...) '{' but there is only one matching '}'.

      Fixed for some of the decoder-side files. More to come soon.
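      The pattern and one possible fix can be sketched as below. The build
      flag and function are hypothetical, and hoisting the condition into a
      variable is just one way to restore balanced braces; the patch may use
      a different rewrite.

```c
#include <stdbool.h>

#define CONFIG_FEATURE_SKETCH 1 /* hypothetical build flag */

/* Problematic shape: two opening braces in the source text, one closing
 * brace, which confuses brace matching in editors and search tools:
 *
 *   #if CONFIG_FEATURE_SKETCH
 *     if (a && b) {
 *   #else
 *     if (a) {
 *   #endif
 *       ++calls;
 *     }
 */
static int count_work(bool a, bool b) {
  int calls = 0;
#if CONFIG_FEATURE_SKETCH
  const bool enabled = a && b;
#else
  (void)b; /* unused when the feature is compiled out */
  const bool enabled = a;
#endif
  if (enabled) { /* exactly one '{' for the one '}' below */
    ++calls;
  }
  return calls;
}
```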
      
      Change-Id: I9e63b90ba6e739b5c7e37498458c7808e2e16d33
    • Unified warp_affine and warp_affine_post_round · b6a31753
      Peter de Rivaz authored
      This patch removes the need for a separate warp_affine_post_round
      function by adding the functionality to the warp_affine function.
      
      The encoded output should remain unchanged, but the encoder/decoder
      should operate faster because the sse2 and ssse3 warp implementation
      can now be used when post_rounding is being used.
      
      Change-Id: Ide52cae55de59a9da9c27c5793e17390f6d2c03e
    • Turn on convolve_round by default · 71ef7c27
      Angie Chiang authored
      The performance on default experiment is
      lowres: 0.812%
      
      midres/hdres and AWCY tests are still running
      
      Change-Id: Id2209c79df6517732dd06c2712a7bdefde118ead
  6. 29 Jul, 2017 2 commits
    • Add CONFIG_DAALA_DCT16 experiment. · cb9c1c52
      Monty Montgomery authored
      This experiment replaces the 16-point Type-II DCT and 16-point Type-IV
      DST scaling vp9 transforms with the 16-point orthonormal Daala
      transforms.  These have reduced complexity and are perfect
      reconstruction.  There is currently no net coding performance impact.
      
      subset-1:
      
        monty-square-baseline-s1-F@2017-07-23T03:43:45.042Z ->
           monty-square-dct16-s1-F@2017-07-23T03:42:29.805Z
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0152 | -0.0028 | -0.0929 |  -0.0432 | -0.0457 | -0.0425 |    -0.0237
      
        objective-1-fast:
      
        monty-square-baseline-o1f-F@2017-07-23T03:44:19.973Z ->
           monty-square-dct16-o1f-F@2017-07-23T03:43:22.549Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0305 |  0.0926 | -0.1600 |   0.0471 | 0.0219 | -0.0075 |     0.0135
      
      Change-Id: I54fed26d65fd8450693334bb400b1fafd7e0dacb
    • [CFL] Uniform Q3 alpha grid with extent [-2, 2] · f6eaa159
      David Michael Barr authored
      Expand the range of alpha to [-2, 2] in Q3.
      Jointly signal the signs, including zeros.
      Use the signs to give context for each quadrant
      and half-axis. The (0, 0) point is excluded.
      Symmetry in alpha_u == alpha_v yields 6 contexts.
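      As an illustration of joint sign signaling, the sketch below packs the
      two signs into one symbol. Each sign takes one of three values, so
      there are 3 x 3 = 9 pairs, and excluding (0, 0) leaves 8 joint
      symbols. The enum values, names and index formula are assumptions
      made for illustration, not necessarily the mapping used by the patch.

```c
/* Assumed three-valued sign for one alpha component. */
typedef enum { SIGN_ZERO = 0, SIGN_NEG = 1, SIGN_POS = 2 } CflSignSketch;

#define NUM_SIGNS 3
#define NUM_JOINT_SIGNS (NUM_SIGNS * NUM_SIGNS - 1) /* (0, 0) excluded */

/* Pack (sign_u, sign_v) into 0..7. The caller must not pass
 * (SIGN_ZERO, SIGN_ZERO), which has no symbol. */
static int joint_sign(CflSignSketch u, CflSignSketch v) {
  return (int)u * NUM_SIGNS + (int)v - 1;
}

/* Recover the individual signs from a joint symbol. */
static CflSignSketch sign_u(int js) {
  return (CflSignSketch)((js + 1) / NUM_SIGNS);
}

static CflSignSketch sign_v(int js) {
  return (CflSignSketch)((js + 1) % NUM_SIGNS);
}
```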
      
      Results on Subset1 (Compared to 9136ab7d with CFL enabled)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.0792 | -0.7535 | -0.7574 |  -0.0639 | -0.0843 | -0.0665 |    -0.3324
      
      Change-Id: I250369692e92a91d9c8d174a203d441217d15063
      Signed-off-by: David Michael Barr <b@rr-dav.id.au>
  7. 28 Jul, 2017 5 commits
    • Fix logical errors for TX64x64. · 9136ab7d
      Urvang Joshi authored
      - Wrong function argument fix: this was not caught by compile test
      because DCT_DCT has a value of 0, which was converted to a NULL pointer.
      - Wrong prob array size.
      
      Change-Id: Iaf1747dc7fb40db1d1ab35f965fb60994d8dec95
    • Fix tx64x64 debug build · 2dc1c841
      Urvang Joshi authored
      Change-Id: I1b77416eaae000ae40e139d8f7fc31754f817bba
    • [CFL] New UV_PREDICTION_MODE for CFL · 6e1cd787
      Luc Trudeau authored
      CfL is now an independent mode.
      
      Results on Subset1 (Compared to 4266a7ed with CFL enabled)
      
         PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
      -0.1645 | -0.4017 |  0.2475 |  -0.1851 | -0.2179 | -0.2338 |    -0.2897
      
      Change-Id: I2e86e7ea7bfc12bb1d763e70a136ca992d57a3c5
    • Conditionally skip inverse transform in transform block RD · 1a7f0a8c
      Jingning Han authored
      When the lower bound of a transform block's rate-distortion cost is
      above the current best rd cost, the only way this particular coding
      mode can still be chosen is by falling back to all-skip mode. Hence
      there is no need to estimate the transform block rate cost,
      distortion, etc. Obtaining the sum of squared differences between
      the prediction and the source is sufficient.
      
      This speeds up the encoding process by 5% - 10%.
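      The quantity needed for the all-skip fallback is cheap to compute. A
      minimal sketch for 8-bit samples follows; the function name and
      signature are hypothetical, and the encoder has its own optimized
      routines for this.

```c
#include <stdint.h>

/* Sum of squared differences between source and prediction over a
 * w x h block. In all-skip mode the residual is not coded, so this is
 * the block's distortion. */
static int64_t sum_squared_error(const uint8_t *src, int src_stride,
                                 const uint8_t *pred, int pred_stride,
                                 int w, int h) {
  int64_t sse = 0;
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      const int d = (int)src[r * src_stride + c] - (int)pred[r * pred_stride + c];
      sse += (int64_t)d * d;
    }
  }
  return sse;
}
```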
      
      Change-Id: I728728c3a42aafefd34641f0be69b3e2a9b9bbb2
    • Adapt palette cdf · 7abe9db7
      Jonathan Matthews authored
      Bug introduced in change: Ic4c9333c9af5993bc41e513b9e766450b3a951eb
      
      BUG=aomedia:667
      
      Change-Id: I29ab87f32d2f940a3d1e079f734b92467d2ebea9
  8. 27 Jul, 2017 2 commits
    • Make CDEF work with EXT_PARTITION · f5bdeac2
      Cheng Chen authored
      Make CDEF select the filter strength for every 64x64 block when the
      block size could be larger than 64x64.

      Coding performance on the AWCY and Google lowres and midres test
      sets is neutral with this patch.
      
      BUG=aomedia:662
      
      Change-Id: Ief82cc51be91fc08a7c6d7e87f6d13bcc4336476
    • Select filter level for U, V planes · e94df5cf
      Cheng Chen authored
      Previously, the U and V planes shared the same filter level with Y.
      Here, we search and pick the best filter level for U, V planes.
      Selected filter levels are transmitted per frame.
      This works with parallel_deblocking.
      
      Coding gain on Google test set:
              | Avg_psnr | ovr_psnr |   ssim
      lowres  |   -0.116 |   -0.120 | -0.339
      midres  |   -0.218 |   -0.228 | -0.338
      hdres   |   -0.260 |   -0.264 | -0.365
      
      Change-Id: I03d2ac47539f3eea9f3c4b08007bd6d3f4b73572
  9. 26 Jul, 2017 9 commits
    • rect_tx_ext: work with var_tx · d6bdd46b
      Yue Chen authored
      Change-Id: Ie2c34490dc50cb242bcd701308e6b55243883b15
    • Add some todo for convolve_round exp · 748d570e
      Angie Chiang authored
      1) Integrate it with supertx
      2) Integrate it with chroma_sub8x8
      
      Change-Id: If4bb906d442d15bae3741192029ec851c48d3948
    • [CFL] UV_PREDICTION_MODE · d6d9eeeb
      Luc Trudeau authored
      A separate prediction mode struct is added to allow
      for uv-only modes (like CfL). Note: CfL will be
      added as a separate mode in an upcoming commit.
      
      Results on Subset1 (Compared to 4266a7ed with CfL enabled)
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: Ie80711c641c97f745daac899eadce6201ed97fcc
    • Add txfm functions corresponding to MRC_DCT · 5b8e6d2d
      Sarah Parker authored
      MRC_DCT uses a mask based on the prediction signal to modify the
      residual before applying DCT_DCT. This adds all necessary functions
      to perform this transform and makes the prediction signal available
      to the 32x32 txfm functions so the mask can be created. I am still
      experimenting with different types of mask generation functions and
      so this patch contains a placeholder. This patch has no impact on
      performance.
      
      Change-Id: Ie3772f528e82103187a85c91cf00bb291dba328a
    • Integrate hbd convolve_round and compound_segment · 0c604285
      Angie Chiang authored
      When convolve_round is turned on, both lbd and hbd use a 32-bit
      buffer. Therefore, they use the same mask/blending functions.
      
      Change-Id: Icfc6db818c0a53216108e42161acac07303e6c1c
    • Use ADAPT_SCAN_PROB_PRECISION to init prob · 963b86d3
      Angie Chiang authored
      Change-Id: I94d66c65d78235e1025703caf79ccca43208d604
    • Palette: use CDF to encode palette size and color indices · 466ae062
      hui su authored
      Around 0.9% improvement on screen_content set (encoding 30 frames).
      
      Change-Id: Ic4c9333c9af5993bc41e513b9e766450b3a951eb
    • Optimize transform block rate-distortion search · 3bce7547
      Jingning Han authored
      The soft coefficient optimization process would monotonically
      increase the transform block distortion and decrease the
      coefficient rate cost. Such observation provides a lower bound
      on the rate-distortion cost for the given transform block. This
      commit compares this lower bound against the best available
      rate-distortion cost value and skips unnecessary optimization
      process. It speeds up the baseline encoding process by 15%.
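      The pruning idea can be sketched with a simplified cost model. The
      cost formula, function names and the particular bound below are
      assumptions made for illustration: the encoder's real cost macro uses
      its own fixed-point scaling, and the patch may derive a tighter bound.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified Lagrangian cost: distortion plus lambda-weighted rate. */
static int64_t rd_cost_sketch(int64_t lambda, int64_t rate, int64_t dist) {
  return dist + lambda * rate;
}

/* If optimization can only raise distortion, and rate cannot drop below
 * zero, then the cost at (rate 0, distortion before optimization) is a
 * lower bound on every outcome the optimization can reach. */
static int64_t rd_lower_bound_sketch(int64_t lambda, int64_t dist_before_opt) {
  return rd_cost_sketch(lambda, 0, dist_before_opt);
}

/* Skip the optimization when even the lower bound cannot beat best_rd. */
static bool can_skip_coeff_opt(int64_t lambda, int64_t dist_before_opt,
                               int64_t best_rd) {
  return rd_lower_bound_sketch(lambda, dist_before_opt) > best_rd;
}
```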
      
      Change-Id: Ida8098a2820cef60d59ec1e72f0bbb1acbd98165
    • Add CONFIG_DAALA_DCT8 experiment. · cf18fe4e
      Monty Montgomery authored
      This experiment replaces the 8-point Type-II DCT and 8-point Type-IV
      DST scaling vp9 transforms with the 8-point orthonormal Daala
      transforms. These have reduced complexity and are perfect
      reconstruction at the cost of slightly worse coding performance.
      This is because the Daala transforms expect the input to be shifted
      by 4 bits but the output scale of the vp9 transforms is only 3 bits.
      
      subset-1:
      
      monty-square-baseline-subset1 ->
        monty-square-dct8-subset1@2017-07-17T21:37:44.281Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0019 | -0.0011 | -0.0585 |  -0.0111 | 0.0305 |  0.0317 |     0.0187
      
      objective-1-fast:
      
      monty-square-baseline-o1f ->
        monty-square-dct8-o1f@2017-07-17T21:37:15.735Z
      
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0285 |  0.0129 | -0.5080 |   0.0529 | 0.0345 |  0.0441 |     0.0054
      
      Change-Id: I2b775495398fb717204a295397c3c5e3ca938183
  10. 25 Jul, 2017 4 commits
    • Account for the 64x64 proc block constraint in obmc masking · 440d4254
      Jingning Han authored
      Make the codec account for the 64x64 processing unit constraint
      when producing the mask for overlapped filter.
      
      Change-Id: I3e596492ae522abe678369b0c9710441549e817e
    • Make maximum obmc process unit 64x64 · 501294ce
      Jingning Han authored
      For 128x128 level blocks, process the overlapped prediction in
      the unit of 64x64. This allows hardware design to reuse the 64x64
      processing unit in 128x128 level block coding.
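      The tiling described above can be sketched as a loop over 64x64
      units. The callback type, function name and return value are
      hypothetical and only show the traversal, not the overlapped
      prediction itself.

```c
#include <stddef.h>

#define MAX_UNIT_SKETCH 64 /* maximum processing unit, per the description */

/* Hypothetical per-unit callback: receives the unit's position and size. */
typedef void (*unit_fn_sketch)(int x, int y, int w, int h, void *ctx);

/* Visit a bw x bh block in units of at most 64x64 and return the number
 * of units visited. A NULL callback just counts the units. */
static int for_each_unit(int bw, int bh, unit_fn_sketch fn, void *ctx) {
  int count = 0;
  for (int y = 0; y < bh; y += MAX_UNIT_SKETCH) {
    for (int x = 0; x < bw; x += MAX_UNIT_SKETCH) {
      const int w = (bw - x < MAX_UNIT_SKETCH) ? bw - x : MAX_UNIT_SKETCH;
      const int h = (bh - y < MAX_UNIT_SKETCH) ? bh - y : MAX_UNIT_SKETCH;
      if (fn != NULL) fn(x, y, w, h, ctx);
      ++count;
    }
  }
  return count;
}
```

      In this sketch a 128x128 block is handled as four 64x64 units, while
      blocks of 64x64 or smaller are handled in a single pass.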
      
      Change-Id: I3967b8e3c1c697f96a50e59a0957fc69b67e6f8e
    • [CFL] Average alpha CDF · 4266a7ed
      Luc Trudeau authored
      Change-Id: Id556e8d77c5871ddae338baa1abfb93b7aa207e9
    • [CFL] Fix warnings when chroma_sub8x8 is disabled · 96b31516
      Luc Trudeau authored
      This change does not alter the bitstream
      
      Results on Subset1 (compared to 70a80a81 with cfl)
        PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
      0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000
      
      Change-Id: I7672eb4cde3c649ebba32610f7e56500e378c062