1. 27 Nov, 2017 14 commits
    • Unify rectangular transform block size scan order update · 1b156ac5
      Jingning Han authored
      Unify the non-zero counting and scan order update process for
      the rectangular transform block sizes.
      
      Change-Id: I5f2b833d7552ab67d4486b21d8d5e2fbf1bc217c
    • Partially support flip ADST in the reduced adapt-scan set · fb63e3e3
      Jingning Han authored
      Support adaptive scan order update for flip ADST types of block
      sizes 8x8 and below.
      
      Change-Id: Ibcb3c9e9e0b8d397ef260a219b10a23e49758a63
    • Use adaptive scan order for significant region in large txfm · f02a885b
      Jingning Han authored
      Apply the adaptive scan order update to the significant regions
      in large transform block sizes.
      
      Change-Id: Ief6c37b09462a2ac5a26464b9aa336530b940839
    • Use sub-frame statistics for adaptive scan order update · 025c6c41
      Jingning Han authored
      Skip the last SB row counting for per frame adaptive scan order.
      This allows enough time window for HW decoder to process the
      scan order update for next frame decoding.
      
      Change-Id: I8a3b48fe452c68c921d55dc76cc787f0a8e00e29
    • Constrain counter range in adaptive scan · ba2d817a
      Jingning Han authored
      Limit the maximum transform block count to be 256 per adaptive
      scan order model.
      
      Change-Id: If6ae054d4427b784f05dd944747b6249b86f401b
    • Allow adaptive scan to support a reduced txfm kernel set · ad4ac8a8
      Jingning Han authored
      Reduce the supported txfm kernel set from 9 to 4. This
      substantially reduces the SRAM memory requirement for hw design.
      
      Change-Id: Id4f75b7fb1eaad05efe6db89a7bfc60d0324bd35
    • scaling: Fix border clamping for subsampled planes · b3b5304f
      David Barker authored
      When forming a scaled prediction, we need to clamp against
      the extended frame border which was set up when the relevant
      reference frame was decoded. The width of this border actually
      depends on the subsampling mode (for UV planes), but before this
      patch we were always using the Y plane's border width.
      
      This resulted in bad predictions when signalling a motion vector
      which points far outside the reference frame. This patch fixes
      the clamping, and restores the intended behaviour for out-of-frame
      motion vectors.
      
      Change-Id: I2cf575ce339a3e22a3c8444de0d0c3be031007c9
    • Unify loopfilter function names · 1dbe80bc
      James Zern authored
      Rename aom_lpf_horizontal_edge_8() to aom_lpf_horizontal_16().
      Rename aom_lpf_horizontal_edge_16() to aom_lpf_horizontal_16_dual().
      
      based on the same change from libvpx:
      7f1f35183 Unify loopfilter function names
      
      Change-Id: I4fda7a2e3a893fc3dee0779975e2d4145c32f5d2
    • Convolve copy function optimization · 57e41ea6
      Yunqing Wang authored
      Added a copy function (C version and SSE2 version) for full-pixel motion
      vectors. Here, the compound and non-compound cases were not separated, and
      the left shift was always done.
      
      Change-Id: Idb13e7c0576503a434d0d6e926cd54db645a4ff9
    • Add option to disable split partitions for chroma · 891a8774
      Debargha Mukherjee authored
      When the flag DISABLE_VARTX_FOR_CHROMA is on, chroma is
      constrained to always use the largest transform size
      for the prediction unit size.
      This is meant to simplify the logic for transform size
      selection for chroma with hopefully no loss.
      
      Results:
      lowres 30 frames, speed 1: -0.038% (a slight improvement).
      lowres 30 frames, speed 0: 0.000% (noise level difference).
      
      Change-Id: I14dd5b1983d908bd98e59b7d252e11f5755c97e6
    • Remove dead member: wedge_interintra_prob · 0f3942ff
      Sebastien Alaiwan authored
      Change-Id: I42ffbcfed9ef308a2e547d04ccc76670eb405e44
    • Remove dead member: interintra_prob · 9f09c710
      Sebastien Alaiwan authored
      Change-Id: Icbd008d5e973aa5038e857af460e55964fe36b13
    • Remove dead member: interintra_mode_prob · bc958f66
      Sebastien Alaiwan authored
      Change-Id: I424ff643e6f46216934c96fa9d34a27c46b3e7f2
    • Make type conversion explicit · ea691058
      Yaowu Xu authored
      Change-Id: I53d5a29c1dc1c93535e1e6c6bef34f232feb5e1e
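The border clamping fix in b3b5304f can be illustrated with a minimal Python sketch. It assumes a hypothetical luma border width `luma_border` and a per-plane subsampling shift (0 for Y, typically 1 for subsampled U/V). The function names are invented for illustration and are not libaom's, and the real code clamps sub-pixel block positions rather than single sample coordinates.

```python
def plane_border(luma_border, subsampling):
    """Border width, in samples, of a plane subsampled by 2**subsampling."""
    return luma_border >> subsampling


def clamp_ref_pos(pos, plane_size, luma_border, subsampling):
    """Clamp a sample coordinate to the plane plus its extended border.

    Using the luma border for a subsampled plane (the behaviour the commit
    message describes as the bug) would allow reads past the region that
    was actually extended when the reference frame was decoded.
    """
    border = plane_border(luma_border, subsampling)
    lowest = -border
    highest = plane_size - 1 + border
    return max(lowest, min(pos, highest))
```

With a 128-sample luma border (an arbitrary value chosen for the example), a coordinate far to the left of the frame clamps to the luma border in the Y plane but to half of it in a 2x subsampled chroma plane.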
  2. 26 Nov, 2017 1 commit
  3. 25 Nov, 2017 1 commit
  4. 24 Nov, 2017 8 commits
  5. 23 Nov, 2017 13 commits
    • Remove dead members · a2fec524
      Sebastien Alaiwan authored
      Change-Id: I5bd080f1fd5c14ea72ea7eb795eb1b8996a8fa76
    • Refactor to allow optimization in SGR code · 13927866
      Rupert Swarbrick authored
      The first stage of the selfguided filter is to generate box sums of
      the input image (and its squares). This is done with a pair of
      integral images, which are the same for both calls in
      apply_selfguided_restoration.
      
      This patch refactors things so that av1_selfguided_restoration
      calculates both "flt" buffers, allowing it to reuse the integral
      images that it calculated.
      
      Change-Id: Ica2f6f66e41bea38eb1a135c78c1d7ddab434d8e
    • Cleanup dead variables · 0ef61dd1
      Sebastien Alaiwan authored
      Change-Id: I36a4ca8bc0c2390b5731b2a60bdca54e3e37868a
    • Remove dead members: y_mode_prob, uv_mode_prob · 35777b8a
      Sebastien Alaiwan authored
      Change-Id: I5b03c02657134bbd50c647645898c5d2f6286d2a
    • Get rid of the highbd versions of the SGR code · 625e50bd
      Rupert Swarbrick authored
      This doesn't have a big performance impact, and it's rather simpler
      just having one version of everything.
      
      Change-Id: I5fa5e7640a63d0ccb0c371f266c6eee99d9520f9
    • Remove unused highpass filter from SGR code · 7cf60961
      Rupert Swarbrick authored
      Change-Id: Ifac3a3bf620061865b82b986d6b16bcabd96a187
    • A working rewrite of the sgr sse code · 064c1d47
      Rupert Swarbrick authored
      This fixes some Valgrind errors caused by reads from x_by_xplus1 that
      used tainted data as an address (see the comments in selfguided_sse4.c
      for what's going on).
      
      It also rewrites the algorithm to use an integral image approach
      instead of the handwritten filters that the code was using. The end
      result is roughly the same efficiency (I think that there's one more
      memory load per group of pixels, but this seems not to be measurable)
      and I've done some performance optimisation with perf too. Several
      32-bit multiplications have been replaced by madd instructions which
      do 16-bit multiplications and add adjacent lanes. This is equivalent
      to a 32-bit multiplication when the 32-bit lanes contain numbers below
      2^15, but runs significantly faster.
      
      Change-Id: I3d0f3043c7861707a56e2fd1849574dc73897d6c
    • av1_txfm,round_shift: remove implicit conv warning · a60e26d5
      James Zern authored
      under visual studio c4334:
      result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift
      intended?)
      
      Change-Id: If06793116ddfbe3265a17a17a2bcaa6ee8cf9e2d
      (cherry picked from commit 535ecf6b31fe97f704f6725989cffad88ad960d8)
    • Fix integer overflows in av1_iidentity*_c() · 19df02af
      Hui Su authored
      BUG=b/69238080,b/69288165
      
      Change-Id: Ia761d4b77049a55bd8040b5ed76063b2fac750ee
      (cherry picked from commit c9762668a3f25c2dfe31c426871450fbfd44b9e0)
    • Add clamping in half_btf() · 8e739bcd
      Hui Su authored
      BUG=69073461
      
      Change-Id: Ib28b41adfa2738681357903a81a89bcab01c87b3
      (cherry picked from commit 08b26a8a257e54210d8bbdba799980bc291f368e)
    • Unify adaptive scan enable flag · 86b75c8a
      Jingning Han authored
      Change-Id: Ief1bedd68de55c29de15f56d805e242d932ff359
    • Merge adaptive scan control panel · e4a0b3c7
      Jingning Han authored
      Change-Id: Ifb295cbcde5474d33c4eca008d89c9dda68d327e
    • Add explicit cast in half_btf() · 5a680b11
      Hui Su authored
      To silence asan failures in fuzzing tests.
      
      BUG=:68825590,68825594,68825599
      
      Change-Id: Ib2c713dc19af223da5e5fc5cec4652d71856f830
      (cherry picked from commit e43ea91055133baaf3b691170a097a456c032e23)
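The claim in 064c1d47 that a 16-bit multiply-add matches a 32-bit multiplication for small operands can be checked with a small Python model of one 32-bit lane. This is a sketch of the arithmetic only, assuming non-negative operands below 2^15. The helper names `to_int16` and `madd_lane` are hypothetical and do not come from selfguided_sse4.c.

```python
def to_int16(x):
    """Interpret the low 16 bits of x as a signed 16-bit integer."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def madd_lane(a, b):
    """Model one 32-bit lane of a 16-bit multiply-add.

    Each 32-bit operand is split into two signed 16-bit halves; the halves
    are multiplied pairwise and the two products are added, wrapping to a
    signed 32-bit result.
    """
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    low_product = to_int16(a) * to_int16(b)
    high_product = to_int16(a >> 16) * to_int16(b >> 16)
    total = (low_product + high_product) & 0xFFFFFFFF
    return total - 0x100000000 if total & 0x80000000 else total
```

For operands in the range 0 to 2^15 - 1 the high halves are zero and the low halves are non-negative, so `madd_lane(a, b)` equals `a * b`. An operand of 2^15 or more sets the sign bit of its low half, and the result no longer matches the plain product.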
  6. 22 Nov, 2017 3 commits
    • [idct] Fix initialization of tx_set_type · 33b39f01
      Frederic Barbier authored
      Previous assumption on reduced_tx_set_used=0 led to many assertion
      failures and prevented signalling reduced_tx_set_used equal to 1.
      
      BUG=aomedia:1053
      
      Change-Id: If9a9dff8d01ba3ec942e06559c153f06d34555f9
    • Fix placement of chroma LR coefficients in stream · 1522ca6c
      Rupert Swarbrick authored
      The code was assuming that an mi was always 4 samples wide and
      high. For chroma planes with subsampling, this is wrong and the size
      and position of the sb in the plane was over-estimated by a factor of
      two. This meant that we sent all the coefficients in the top-left hand
      quarter of the tile. Since the encoder and decoder made the same
      mistake, this worked fine, but it's clearly not what we're supposed to
      do!
      
      Change-Id: I0da8ada1d76639ad476ad84491658bc25ef3a43f
    • Replace INT32_MIN with TXSIZE_CAT_INVALID · 5dfa1442
      Yaowu Xu authored
      Change-Id: I91dd5d3351d5dcc70ffcdb883d1e7cbd054d1a27
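The factor-of-two error described in 1522ca6c follows from converting mi units to plane samples without accounting for subsampling. A minimal Python sketch, assuming an mi covers 4 luma samples as the commit message states and a subsampling shift of 1 for a subsampled chroma plane. `MI_SIZE_LUMA` and `mi_to_plane_samples` are hypothetical names, not libaom identifiers.

```python
MI_SIZE_LUMA = 4  # luma samples covered by one mi unit (per the commit message)


def mi_to_plane_samples(mi_units, subsampling):
    """Convert a size or offset in mi units to samples in a given plane.

    subsampling is the plane's shift: 0 for luma, 1 for a chroma plane
    subsampled by two in that direction.
    """
    return (mi_units * MI_SIZE_LUMA) >> subsampling
```

Ignoring the shift, as the old code did, gives a superblock of 16 mi units a width of 64 samples in every plane, while in a 2x subsampled chroma plane it actually spans 32 samples.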