Commits · 91a1cf91cf4c06d4101f46effe81b8968ff62d13 · Xiph.Org / aom-rav1e

Aug 21, 2017
- Make DISABLE_TRELLISQ_SEARCH configurable · 91a1cf91
  Angie Chiang authored 7 years ago
```
Change-Id: I48ebf352c6c28e5c0c0e477b24828f0e3fe1dedb
```
  91a1cf91
- Change av1_cost_coeffs_txb's interface · 3627de2c
  Angie Chiang authored 7 years ago
```
Change-Id: Ie7c216218bd233e74970b261186df8f08aca6193
```
  3627de2c
Aug 19, 2017

Prevent bitstream from signaling illegal compound types · 680b9b17

Sarah Parker authored 7 years ago

Currently nothing forbids wedge from being signalled when
the block is > 32X32, even though there is no corresponding wedge
mask for that block size.
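
As a minimal sketch of the kind of guard described above, the check below refuses wedge for any block larger than 32x32. The `BlockDims` type, `MAX_WEDGE_DIM`, and `wedge_allowed()` are hypothetical names for illustration only, and only the upper bound mentioned in the message is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical block descriptor; not the codec's own block-size type. */
typedef struct {
  int width;
  int height;
} BlockDims;

/* Assumption taken from the message: no wedge mask exists above 32x32. */
enum { MAX_WEDGE_DIM = 32 };

/* Returns true only when a wedge mask can exist for this block size,
 * so the bitstream writer never offers wedge for larger blocks. */
static bool wedge_allowed(BlockDims b) {
  assert(b.width > 0 && b.height > 0);
  return b.width <= MAX_WEDGE_DIM && b.height <= MAX_WEDGE_DIM;
}
```

A writer would consult such a predicate before signalling the compound type, and a reader would apply the same predicate so that both sides agree on which types are legal.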

BUG=aomedia:640
BUG=aomedia:636

Change-Id: I538be0229a12b5ef01b2e5a950c9f16ef9a5c51e

680b9b17

Aug 18, 2017

Remove dpcm-intra experiment · 400bf651

Hui Su authored 7 years ago

Coding gain becomes tiny on top of other experiments.

Change-Id: Ia89b1c2a2653f3833dff8ac8bb612eaa3ba18446

400bf651

Aug 17, 2017

cdef-dist and daala-dist are runtime switchable · e30a47ca

Yushin Cho authored 7 years ago

Use --tune=[cdef-dist|daala-dist] to enable them.

Also, this commit sets use_activity_masking of PVQ to 0 by default,
which means that PVQ assumes daala-dist is not used by default.

Since we're currently not signaling which metric the encoder did use
in the bitstream, the compile flag AV1_PVQ_ENABLE_ACTIVITY_MASKING will tell PVQ
whether daala-dist is used or not.

This commit is the last part of prep-work to remove DIST_8X8, CDEF_DIST,
and DAALA_DIST experimental flags.

Change-Id: Ia465b4d6fe64aac7f04852c8f9f4bac3409d2435

e30a47ca

[CFL] Move CFL cost table to struct macroblock · 38e560cc

David Michael Barr authored 7 years ago

Also, move body of update_cfl_costs() to av1_fill_mode_rates().

Results on Subset1 (Compared to 1cfe474b with CFL enabled)

PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000

No change in bitstream, for an average encode speed-up of 2.3%.

Change-Id: I3948abcd70cfecad8086edfe4c45552b576ae06f

38e560cc

Add rate computation to palette · 9c0e4515

Sarah Parker authored 7 years ago

Currently the rate is never computed for the palette color indices.
The code to compute the rate is inside av1_tokenize_palette_sb
when dry_run == DRY_RUN_COSTCOEFFS, but av1_tokenize_palette_sb is
only called when !dry_run.

Change-Id: Ie33eae9e4bcf1997a22dc939f31001334cb2c399

9c0e4515

Introduce runtime switch for dist_8x8 · 55104335

Yushin Cho authored 7 years ago

Even if 'dist-8x8' is enabled with configure,
dist-8x8 is not actually enabled (so, no change in encoding behaviour)
until the command line option '--enable-dist-8x8=1' is used.

The cdef-dist and daala-dist cannot be enabled by a command line option yet.

This commit is a part of prep-work to remove DIST_8X8, CDEF_DIST,
and DAALA_DIST experimental flags.

Change-Id: I5c2df90f837b32f44e756572a19272dfb4c3dff4

55104335

Aug 16, 2017

Add dependency of ext-comp-refs on one-sided-compound · 5a978838

Zoe Liu authored 7 years ago

When ext-comp-refs is enabled, one-sided-compound is enabled by default,
which ensures the use of ext-comp-refs is an extension of
one-sided-compound. Both coding tools allow the use of same-sided
reference frame pairs for compound prediction.

Also, remove the dependency of ext-comp-refs on var-refs, i.e. these two
coding tools can be independently enabled. They can still work together
if both are enabled simultaneously.

Change-Id: I3134e7e2956dc35d557fe814f5d801d473683650

5a978838

Fix CONFIG_PVQ support in the CMake build. · ac87049f

Tom Finegan authored 7 years ago

Wrap usages and declaration of av1_set_txb_context with
CONFIG_PVQ preproc checks.

BUG=aomedia:683

Change-Id: I2080d7437ebe1741232eb5e4e83a430279c913a0

ac87049f

Aug 15, 2017

Remove ALT_INTRA flag. · 93b543ab

Urvang Joshi authored 7 years ago

This experiment has been adopted as it has been cleared by Tapas.

Change-Id: I0682face60f62dd43091efa0a92d09d846396850

93b543ab

Aug 11, 2017

Revert "Refactor and generalise OBMC prediction code" · d565529d

Yunqing Wang authored 7 years ago

This reverts commit 29824a42.

Unit test failure was seen.
AV1/AVxEncoderThreadLSTest.EncoderResultTest/2
AV1/TileIndependenceTestLarge.MD5Match/2

Change-Id: I836b6ef8b8eeac45014a439d1f5d4d45d17110f9

d565529d

Fix inter path for mrc-tx · de6f072e

Sarah Parker authored 7 years ago

A speed feature was causing the rdloop to skip trying
MRC_DCT. I've disabled that speed feature when mrc-tx
is enabled and MRC_DCT is allowed for inter blocks.

Change-Id: I0affa5f26465539414b2957f8ff983f718863ef1

de6f072e

Aug 10, 2017

Remove PALETTE flag · c6300aa1

Urvang Joshi authored 7 years ago

This experiment is now adopted as it was cleared by Tapas.

Note: Palette use can still be controlled by command-line option
"--tune-content=..." in 'aomenc'.

Change-Id: I832f49f20f60c34bdef5b424755849c496687e87

c6300aa1

Make palette work correctly with chroma sub8x8 blocks. · c9e71d4d

Urvang Joshi authored 7 years ago

The problem was that some functions were using scale_chroma_bsize()
function to turn sub-8x8 'bsize' to 8x8 'bsize', and then the modified
'bsize' was being passed to rd_pick_intra_sbuv_mode() for example.

In such cases, we cannot rely on the 'bsize' value passed to the
function; instead, we need to look at the original mbmi->sb_type
directly.

Also:
- Created a common function can_use_palette() to refactor this
logic into one place.
- Added more asserts to easily catch such coding errors in future.

BUG=aomedia:688

Change-Id: I2e9f20c8c5fbc4b3ff41b703a91a02758c3c632f

c9e71d4d

cosmetics,rdopt.c: fix some typos · 89a015b2
James Zern authored 7 years ago
```
Change-Id: I558106e3e415cbcbb6673d24349daed48b616034
```
89a015b2

Aug 09, 2017

Refactor and generalise OBMC prediction code · 29824a42

Rupert Swarbrick authored 7 years ago

When doing OBMC prediction, the code must iterate over the blocks
above or to the left of the current block. In reconinter.c and
rdopt.c, there are several pieces of code that do this. These all work
in roughly the same way, iterating over the xd->mi array (although
some are written with for loops and others with do/while). To visit
each neighbouring block exactly once, each of these loops used an
"mi_step" variable which was set to the width or height of the
neighbouring block in mi-units and the loop counter got incremented by
mi_step to jump to the next block.

This patch unifies the code slightly (just using for loops) and
simplifies it when the CHROMA_SUB8X8 experiment is enabled. In this
case, chroma information is stored in the bottom right block of each
8x8 pixel region. That is, if a block has width 4 and an even mi_col,
the chroma information we need is actually found in the block
immediately to its right.

The existing code implemented this by bumping the current column or
row counter (usually mi_col_offset or mi_row_offset) and duplicating
the first part of the loop body to do it again with the new
counter. It also had to double mi_step to avoid visiting the next-door
block again.

The new code essentially just uses the "continue" keyword to restart
the loop. There's a little more book-keeping required: we might have
to increment "ilimit", the maximum loop index, to ensure we don't exit
the loop too early.

The result is hopefully easier to read, but it's also more general (in
the CHROMA_SUB8X8 case). The existing code assumed the current block
never had width or height below 8 and thus mi_col and mi_row were
always even. As such, whenever the neighbouring block had a width or
height of 4, we knew that we needed to skip to the next neighbouring
block to get the required chroma information. This version of the code
can deal with the current block being smaller. The main difference is
that it decides whether to skip forward by examining the parity of
(mi_col + i) or (mi_row + i).
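
The loop shape described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: `visit_above()` and the flattened `width_mi` array are hypothetical stand-ins for the real xd->mi traversal, and one mi unit is taken to be 4 pixels.

```c
#include <assert.h>

/*
 * Hypothetical layout: width_mi[i] is the width, in mi units, of the block
 * covering column (mi_col + i) of the row above; n is the array length.
 * Records the offset of every neighbour visited and returns the count.
 */
static int visit_above(const int *width_mi, int n, int mi_col, int ilimit,
                       int chroma_sub8x8, int *visited) {
  int count = 0;
  int mi_step;
  for (int i = 0; i < ilimit && i < n; i += mi_step) {
    mi_step = width_mi[i]; /* jump over the whole neighbouring block */
    assert(mi_step > 0);
    if (chroma_sub8x8 && mi_step == 1 && (mi_col + i) % 2 == 0) {
      /* Width-4 block in an even column: its chroma information lives in
       * the block to its right, so restart the loop one column along. */
      if (ilimit < i + 2) ilimit = i + 2; /* do not exit before visiting it */
      continue;
    }
    visited[count++] = i;
  }
  return count;
}
```

The parity test on (mi_col + i) is what lets the same loop handle a current block narrower than 8 pixels, since the decision no longer assumes that mi_col is even.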

This change will be needed for 16x4/4x16 block support.

Change-Id: I39c1bbc00a6e2ad1ac17b8eed3980a8bcc040074

29824a42

ext-partition-types: Don't allow 4:1 blocks to use palettes · 6f9cd946

Rupert Swarbrick authored 7 years ago

Since there are no CDFs set up for palettes for 4:1/1:4 blocks, we
should make sure we don't try to use them. Without this patch,
write_palette_mode_info gets called with a bsize of BLOCK_32X8 and
reads (and writes) off the end of the palette_y_size_cdf array.

This patch avoids calling it in this context and adds an assertion to
make sure we don't read off the end of the array in future.
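
A rough sketch of the guard-plus-assertion pattern described above, using a made-up block-size list and a placeholder table in place of palette_y_size_cdf. All names here (`BlockSize`, `palette_allowed()`, `palette_size_entry()`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical block-size list: sizes with palette statistics come first,
 * 4:1 and 1:4 sizes come after the end of the table. */
typedef enum {
  BS_8X8,
  BS_16X16,
  BS_32X32,
  BS_64X64,
  BS_PALETTE_SIZES, /* number of sizes that have a table entry */
  BS_32X8 = BS_PALETTE_SIZES,
  BS_8X32
} BlockSize;

/* Placeholder values standing in for the per-size CDF table. */
static const int palette_size_table[BS_PALETTE_SIZES] = { 10, 20, 30, 40 };

/* Callers check this first, so 4:1 blocks never reach the lookup. */
static bool palette_allowed(BlockSize bs) { return bs < BS_PALETTE_SIZES; }

/* The assertion catches any future caller that forgets the check. */
static int palette_size_entry(BlockSize bs) {
  assert(palette_allowed(bs));
  return palette_size_table[bs];
}
```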

The patch also adds the corresponding logic to rdopt.c.

Change-Id: I4d9aea982d057e305a6b578f35457eada819d38f

6f9cd946

Aug 08, 2017

hash based motion estimation for screen data · cc5d35d8
RogerZhou authored 7 years ago
```
Change-Id: Iec7969ffd8f53ca2f4eefd1d757cfec7b3bde131
```
cc5d35d8

Esthetic changes to choose_intra_uv_mode · 9d4cbb8b

Luc Trudeau authored 7 years ago

Only one call to rd_pick_intra_sbuv_mode and removed
the unused PICK_MODE_CONTEXT *ctx parameter

Change-Id: Ife0dbdd64cd5a01e5eeed0eab9e08417e768b41d

9d4cbb8b

Aug 04, 2017

Avoid using MRC_DCT when the mask produced is invalid · c5ccd4ca

Sarah Parker authored 7 years ago

If the mask is invalid, do not allow the encoder to select MRC_DCT.
Currently the mask is invalid if it is all 1 or all 0, but these
criteria will likely expand in a future patch.
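
The validity rule stated above (a mask that is all 1 or all 0 is invalid) could be checked along these lines. The function name and the one-int-per-pixel mask layout are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: mask holds one 0/1 entry per pixel.
 * A mask is usable only if it contains both zeros and ones. */
static bool mrc_mask_is_valid(const int *mask, size_t n) {
  assert(mask != NULL && n > 0);
  size_t ones = 0;
  for (size_t i = 0; i < n; ++i) ones += (mask[i] != 0);
  return ones != 0 && ones != n;
}
```

An encoder would simply leave MRC_DCT out of its candidate list whenever this returns false.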

Change-Id: I77230ea8357bfdb2bf1e6338903d44bbf1db22d1

c5ccd4ca

New experiment, CDEF-DIST · c49177e4

Yushin Cho authored 7 years ago

Distortion metric that is currently used for CDEF is also used for
distortion of luma channel during RDO-based mode decision.

This experiment works on the top of 'dist-8x8' experiment.

The BD-Rate change by this experiment for three frames of
objective-1-fast in AWCY is:

  PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
1.1589 | -2.0036 | -1.9620 |  -0.0076 | -1.4145 | -1.4561 |    -0.6410

Change-Id: I1142fe2f186f4ed86e4d33468e00b84e30b20233

c49177e4

Aug 03, 2017

Add macros to turn off inter and intra mrc_dct separately · 2e08d96d

Sarah Parker authored 7 years ago

This will aid in testing different masking methods for inter
and intra blocks.

Change-Id: Ic038da77e55405e3303177e6cd260bd5e19311c1

2e08d96d

Calculate coeff token cost from CDF · c0cf71df

Hui Su authored 7 years ago

AWCY results:
 PSNR | PSNR HVS |  SSIM | CIEDE 2000
-0.09 |    -0.04 | -0.02 |      -0.03

On Google testsets:
lowres  -0.18%
midres  -0.20%

Above results are obtained with
--disable-ext-refs --disable-dual-filter --disable-loop-restoration
--disable-global-motion --disable-warped-motion

Change-Id: Iba58d5e5ec9a65d0afba29609aa2e379a80d7236

c0cf71df

Aug 01, 2017

Add encoder support to ALTREF2 · e9b15e2b

Zoe Liu authored 7 years ago

This CL adds the use of ALTREF2_FRAME to both single / comp reference
prediction at the encoder side. In particular, the encoder keeps the
distant altref as ALTREF, and uses the internal extra altrefs to
refresh ALTREF2.

Compared with the baseline (ext_tx and global_motion disabled simply
for speed concern):
(a) lowres: avg_psnr -0.395% ovr_psnr -0.393% ssim -0.329%
(b) midres: avg_psnr -0.419% ovr_psnr -0.431% ssim -0.444%
(c) AWCY High Latency:
   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.6661 | -0.5988 | -0.6669 |  -0.6993 | -0.6988 | -0.7303 | -0.6051
(d) AWCY Low Latency:
  PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
0.0720 | -0.0505 |  0.1501 |   0.0670 | 0.0842 |  0.0517 |     0.0158

TODO list:
(1) To have altref2 incorporated with ext-comp-refs;
(2) To have altref2 fully work with new-multisymbol;
(3) To re-collect the initial default probs/cdfs;
(4) To tune the encoder gf group structure design for altref2.

Change-Id: I6ad63fd65afa903d3bba20acdb68e3b67acf7fdf

e9b15e2b

Jul 31, 2017

Fix a build error when ext-tx is off · 4508d880
Zoe Liu authored 7 years ago
```
Change-Id: I1cf27c41749c8f66eaa0ec828a1fd5d8ef7dd94e
```
4508d880
Fix build warnings when global_motion is off · bc030eea
Zoe Liu authored 7 years ago
```
Change-Id: I69f042e6da5a4b5e4a18853c5f15532dfef0204a
```
bc030eea

Move mode costs that can be updated inside a frame to MACROBLOCK · b23d00a0

Yue Chen authored 7 years ago

It is a refactoring patch, which aims to make the code ready for
implementation of in-frame mode cost update in RDO.
Also add mode cost update per sb row, but it won't affect coding
results because cdf update in RDO is not there.
Mode cost arrays are moved to MACROBLOCK because in multi-thread
coding, threads share the same AV1_COMP.

This patch does not have impact on coding results.

Change-Id: I2e8f7d7d066b23ebfbfc998269023781f359a6ff

b23d00a0

motion_var: compute motion_mode_cost from cdf · bdc8dab2

Yue Chen authored 7 years ago

Initialize mode cost using frame-level cdf.
Also in rd selection stage, cdf is updated per 64x64.
Performance gain 0.20%

Still suboptimal since in real bitstream packing, cdf is updated
per symbol. Per symbol update in RDO is work in progress.
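
To illustrate what deriving costs from a CDF means, the sketch below turns a cumulative distribution into a whole-bit cost per symbol. It is deliberately crude: the CDF convention, the 15-bit total, and the rounding up to whole bits are all assumptions for the example, and a real encoder would use finer fixed-point costs.

```c
#include <assert.h>

enum { CDF_TOTAL = 32768 }; /* assumption: 15-bit probability precision */

/*
 * Hypothetical convention: cdf[i] = P(symbol <= i) * CDF_TOTAL, so the last
 * entry equals CDF_TOTAL. cost_bits[i] receives the smallest whole number
 * of bits k such that the symbol's probability is at least 2^-k.
 */
static void costs_from_cdf(const int *cdf, int nsymbs, int *cost_bits) {
  int prev = 0;
  for (int i = 0; i < nsymbs; ++i) {
    const int p = cdf[i] - prev; /* probability mass of symbol i */
    assert(p > 0);
    int bits = 0;
    while ((p << bits) < CDF_TOTAL) ++bits;
    cost_bits[i] = bits;
    prev = cdf[i];
  }
  assert(prev == CDF_TOTAL);
}
```

Re-running such a conversion whenever the CDF is updated (per frame, per 64x64, or per symbol) is what keeps the costs used in the RD search in step with the entropy coder.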

Change-Id: I5062af91d8b00e5bf4c08abd0a7bfb0e5b27a619

bdc8dab2

Jul 29, 2017

[CFL] Uniform Q3 alpha grid with extent [-2, 2] · f6eaa159

David Michael Barr authored 7 years ago


Expand the range of alpha to [-2, 2] in Q3.
Jointly signal the signs, including zeros.
Use the signs to give context for each quadrant
and half-axis. The (0, 0) point is excluded.
Symmetry in alpha_u == alpha_v yields 6 contexts.
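
One way to picture the joint sign signalling: each of alpha_u and alpha_v has a sign in {zero, negative, positive}, and with the (0, 0) pair excluded that leaves 8 joint symbols. The mapping below is a sketch under that assumption. The enum names and the index formula are illustrative, and the context derivation is not modelled.

```c
#include <assert.h>

typedef enum { SIGN_ZERO = 0, SIGN_NEG = 1, SIGN_POS = 2, SIGNS = 3 } AlphaSign;

/* 3 x 3 sign pairs minus the excluded (zero, zero) pair. */
enum { JOINT_SIGNS = SIGNS * SIGNS - 1 };

/* Pack the two signs into one symbol in [0, JOINT_SIGNS). */
static int joint_sign(AlphaSign u, AlphaSign v) {
  assert(!(u == SIGN_ZERO && v == SIGN_ZERO));
  return (int)u * SIGNS + (int)v - 1;
}

/* Inverse mapping: recover both signs from the joint symbol. */
static void split_joint_sign(int js, AlphaSign *u, AlphaSign *v) {
  assert(js >= 0 && js < JOINT_SIGNS);
  *u = (AlphaSign)((js + 1) / SIGNS);
  *v = (AlphaSign)((js + 1) % SIGNS);
}
```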

Results on Subset1 (Compared to 9136ab7d with CFL enabled)

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.0792 | -0.7535 | -0.7574 |  -0.0639 | -0.0843 | -0.0665 |    -0.3324

Change-Id: I250369692e92a91d9c8d174a203d441217d15063
Signed-off-by: David Michael Barr <b@rr-dav.id.au>

f6eaa159

Jul 28, 2017

Fix dist_8x8 broken with · a4817a6b

Yushin Cho authored 7 years ago

Commit 3bce7547 introduced another early exit based on MSE distortion
in the transform domain, which enables skipping trellis coding and
the call to av1_dist_block() in block_rd_txfm(), and skipping trellis coding in av1_tx_block_rd_b().

However, with dist-8x8, the early exit for sub8x8 tx blocks in a partition >= 8x8 in plane 0
is disabled, because the reference distortion metric
(which would be non-MSE and applied to 8x8 or larger) cannot be compared to
MSE distortions of sub8x8 tx blocks.

Change-Id: I46ada7c90a869d23fc0f0166a01dfdc5392af311

a4817a6b

[CFL] New UV_PREDICTION_MODE for CFL · 6e1cd787

Luc Trudeau authored 7 years ago

CfL is now an independent mode.

Results on Subset1 (Compared to 4266a7ed with CFL enabled)

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.1645 | -0.4017 |  0.2475 |  -0.1851 | -0.2179 | -0.2338 |    -0.2897

Change-Id: I2e86e7ea7bfc12bb1d763e70a136ca992d57a3c5

6e1cd787

Conditionally skip inverse transform in transform block RD · 1a7f0a8c

Jingning Han authored 7 years ago

When the lower bound of a transform block rate-distortion cost is
above the current best rd cost, the only possibility that this
particular coding mode will be chosen is to fall back to all skip
mode. Hence there is no need to estimate the transform block rate
cost, distortion, etc. Obtaining the sum of squared differences between
the prediction and the source is sufficient.

This speeds up the encoding process by 5% - 10%.
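
The shortcut described above can be sketched as a bound test plus a plain sum of squared differences. Both helper names are hypothetical, and how the lower bound itself is obtained is outside the scope of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when even the most optimistic RD cost for this transform block
 * cannot beat the best cost found so far. */
static bool can_skip_full_estimate(int64_t lower_bound_rd, int64_t best_rd) {
  return lower_bound_rd > best_rd;
}

/* Sum of squared differences between source and prediction; this is the
 * only quantity needed when the block can only survive in skip mode. */
static int64_t block_sse(const uint16_t *src, const uint16_t *pred, size_t n) {
  assert(src != NULL && pred != NULL);
  int64_t sum = 0;
  for (size_t i = 0; i < n; ++i) {
    const int64_t d = (int64_t)src[i] - (int64_t)pred[i];
    sum += d * d;
  }
  return sum;
}
```

When `can_skip_full_estimate()` holds, the caller would record `block_sse()` as the distortion and skip the inverse transform and rate estimation entirely.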

Change-Id: I728728c3a42aafefd34641f0be69b3e2a9b9bbb2

1a7f0a8c

Jul 26, 2017

Reduce best rdcost value in transform partition search · 16a9df75

Jingning Han authored 7 years ago

Adaptively reduce the best rate-distortion cost value in the
recursive transform block partition search. For bus CIF at 1000 kbps
this reduces the encoding time from 1864 seconds to 1756 seconds,
about 6% speed up.

Change-Id: I5433a1825c0f8b13fcc5ab7e19713a98969d53fc

16a9df75

rect_tx_ext: work with var_tx · d6bdd46b
Yue Chen authored 7 years ago
```
Change-Id: Ie2c34490dc50cb242bcd701308e6b55243883b15
```
d6bdd46b

[CFL] UV_PREDICTION_MODE · d6d9eeeb

Luc Trudeau authored 7 years ago

A separate prediction mode struct is added to allow
for uv-only modes (like CfL). Note: CfL will be
added as a separate mode in an upcoming commit.

Results on Subset1 (Compared to 4266a7ed with CfL enabled)
  PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000

Change-Id: Ie80711c641c97f745daac899eadce6201ed97fcc

d6d9eeeb

Optimize transform block rate-distortion search · 3bce7547

Jingning Han authored 7 years ago

The soft coefficient optimization process would monotonically
increase the transform block distortion and decrease the
coefficient rate cost. Such observation provides a lower bound
on the rate-distortion cost for the given transform block. This
commit compares this lower bound against the best available
rate-distortion cost value and skips unnecessary optimization
process. It speeds up the baseline encoding process by 15%.

Change-Id: Ida8098a2820cef60d59ec1e72f0bbb1acbd98165

3bce7547

Jul 25, 2017

Fix so that matching { and } can be found in inter mode decision · 67dda51a

Yushin Cho authored 7 years ago

Because #if ... #else ... puts the '{' on the same line, a dangling { or } occurs,
which causes automatic syntax analyzers, such as 'Ctrl-Shift-P' in Eclipse
or '%' in vi, to fail to find the matching { and }.

For some developers, this can make quickly reading and/or understanding blocks of code
almost impossible.

Three functions or blocks are repaired.
1. av1_rd_pick_inter_mode_sb() {...}

2. for (midx = 0; midx < MAX_MODES; ++midx) {...}
   in av1_rd_pick_inter_mode_sb()

3. handle_inter_mode() {...}
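
For readers unfamiliar with the problem, the sketch below shows the dangling-brace pattern in a comment and one common way to avoid it. The `USE_EXT` macro and `sum_first()` are invented for illustration and say nothing about how the three functions above were actually repaired.

```c
#include <assert.h>

/*
 * Problematic shape: each preprocessor branch opens its own '{', but only
 * one '}' closes the loop, so brace matching in an editor breaks.
 *
 *   #if USE_EXT
 *     for (i = 0; i < n_ext; ++i) {
 *   #else
 *     for (i = 0; i < n; ++i) {
 *   #endif
 *       sum += v[i];
 *     }
 */

#ifndef USE_EXT
#define USE_EXT 0 /* hypothetical experiment flag */
#endif

/* Balanced shape: only the differing expression is conditional, and the
 * braces appear exactly once outside the #if/#else. */
static int sum_first(const int *v, int n, int n_ext) {
#if USE_EXT
  const int limit = n_ext;
  (void)n;
#else
  const int limit = n;
  (void)n_ext;
#endif
  assert(v != 0 && limit >= 0);
  int sum = 0;
  for (int i = 0; i < limit; ++i) {
    sum += v[i];
  }
  return sum;
}
```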

Change-Id: Ib5ac63b8c7f9870a491fac337ae3f58c57ce5e46

67dda51a

Account for the 64x64 proc block constraint in obmc masking · 440d4254

Jingning Han authored 7 years ago

Make the codec account for the 64x64 processing unit constraint
when producing the mask for overlapped filter.

Change-Id: I3e596492ae522abe678369b0c9710441549e817e

440d4254

Jul 24, 2017

[CFL] Fix rare overflow in distortion computation · 4c5df105

Luc Trudeau authored 7 years ago

Worst case SSE for a 12-bit 64x64 block requires 48 bits
(2*(12+log2(64)+log2(64))). As such, the dist variable must
be int64.
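
A quick sanity check of why a 32-bit accumulator is too small. This sketch does not reproduce the 48-bit figure above. It only assumes a plain sum of squared per-pixel differences, and `worst_case_sse()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Largest possible sum of squared differences for a width x height block
 * when every pixel differs by the maximum value the bit depth allows. */
static int64_t worst_case_sse(int bit_depth, int width, int height) {
  assert(bit_depth > 0 && bit_depth <= 16);
  assert(width > 0 && height > 0);
  const int64_t max_diff = ((int64_t)1 << bit_depth) - 1;
  return max_diff * max_diff * width * height;
}
```

For a 12-bit 64x64 block this evaluates to 68685926400, well above INT32_MAX (2147483647), so a 64-bit type is needed even under this simpler assumption.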

Results on Subset1 (compared to 19b5c8fa with CfL enabled)

  PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
0.0030 |  0.0001 |  0.0100 |   0.0026 | 0.0024 | -0.0008 |     0.0028

Change-Id: I1364c089c223b96daed942175a915fed0f6f1023

4c5df105