Commits · 118bf67cb61f492023da085eb7446f6d9cbd598c · Xiph.Org / aom-rav1e

Feb 12, 2017

Implement shorter-tap first in convolve_round · 118bf67c

Angie Chiang authored 8 years ago

The performance change is 0.004% on lowres

Change-Id: If3702ba6377ac42997e7d49b8959ff16fb182daa

Fix segfault with loop-restoration on x86. · befcc425

David Barker authored 8 years ago

The WienerInfo struct requires a 16-byte alignment on x86,
since it contains filter coefficients which are loaded using
SSE aligned load instructions. But on 32-bit x86, the default
alignment of aom_malloc/aom_realloc is only 8 bytes, leading
to occasional segfaults.

To fix this, rather than using aom_realloc to resize WienerInfo
structures, we always free and re-allocate them using aom_memalign.
BUG=aomedia:345

Change-Id: Ib1b2a42d4a2fa215dcc81ea481c51271ab068a37

Feb 11, 2017

Add a new experiment of REF_ADAPT · b05e5d10

Zoe Liu authored 8 years ago

Noticed that some ALTREF_FRAMEs could have used compound modes for their
prediction but have been labeled as SINGLE_REFERENCE mode in the frame
header. This experiment removes the COMPOUND_REFERENCE mode from
the frame-level reference mode choices, leaving SINGLE_REFERENCE
and REFERENCE_MODE_SELECT as the only two choices in the frame header.

When turning on both ext-refs and ref-adapt, compared against ext-refs
itself, a small gain is achieved. In PSNR, the bitrate saving gains are
as follows:

lowres: Avg -0.120%; BDRate -0.128%
midres: Avg -0.155%; BDRate -0.128%

Change-Id: I2cfff8a6b7eaa65ef863dbdbc4dd086d3b586f8c

Feb 10, 2017

Speed up CLPF when there's nothing to clip · f844e6ef

Steinar Midtskogen authored 8 years ago

Gives 7% speed-up in the CLPF processing (measured on SSE4.2).

Change-Id: I934ad85ef2066086a44387030b42e14301b3d428

Retune the CLPF kernel · 4f0b3ed8

Steinar Midtskogen authored 8 years ago

CLPF performance had degraded by about 0.5% over the past six months,
which isn't totally surprising since the codec is a moving target.
About half of that degradation comes from the improved 7 bit filter
coefficients.  Therefore, CLPF needs to be retuned for the current
codec.

This patch makes two (normative) changes to the CLPF kernel:

* The clipping function was changed from clamp(x, -s, s) to
      sign(x) * max(0, abs(x) - max(0, abs(x) - s +
             (abs(x) >> (bitdepth - 3 - log2(s)))))
  This adds a rampdown to 0 at -32 and 32 for 8 bit (-128 and 128
  for 10 bit, etc.), so large differences are ignored.

* 8 taps instead of 6 taps:
               1
    4          3
  13 31  ->  13 31
    4          3
               1

AWCY results: low delay  high delay
PSNR:           -0.40%     -0.47%
PSNR HVS:        0.00%     -0.11%
SSIM:           -0.31%     -0.39%
CIEDE 2000:     -0.22%     -0.31%
APSNR:          -0.40%     -0.48%
MS SSIM:         0.01%     -0.12%

About 3/4 of the gains come from the new clipping function.
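
The two clipping functions from the list above can be sketched in C as follows. This is a direct transcription of the formulas in this message, not the kernel code itself; it assumes the strength s is a power of two (so log2(s) is an integer), and the names `clip_old`, `clip_new`, `ilog2` and `max_int` are made up for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

static int max_int(int a, int b) { return a > b ? a : b; }

/* Integer log2, assuming s is a power of two (1, 2, 4, ...). */
static int ilog2(int s) {
  int n = 0;
  while (s > 1) {
    s >>= 1;
    n++;
  }
  return n;
}

/* Old behaviour: hard clamp, clamp(x, -s, s). */
static int clip_old(int x, int s) {
  return x < -s ? -s : (x > s ? s : x);
}

/* New behaviour:
 *   sign(x) * max(0, abs(x) - max(0, abs(x) - s +
 *                 (abs(x) >> (bitdepth - 3 - log2(s)))))
 * The result follows x up to s, then ramps back down to 0. */
static int clip_new(int x, int s, int bitdepth) {
  assert(s > 0);
  const int shift = bitdepth - 3 - ilog2(s);
  assert(shift >= 0);
  const int ax = abs(x);
  const int mag = max_int(0, ax - max_int(0, ax - s + (ax >> shift)));
  return x < 0 ? -mag : mag;
}
```

With s = 4 at 8 bit, a difference of 8 maps to 3 and a difference of 16 maps to 2, whereas the old clamp returns 4 for both; at 32 and beyond the new function returns 0.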

Change-Id: Idad9dc4004e71a9c7ec81ba62ebd12fb76fb044a

Turn on adapt_scan by default · 76ebf7ce

Angie Chiang authored 8 years ago

Change-Id: Ibf160e83e7cb1c7dce8b40e7cbead48416440974

Exclusively uses 12-tap filter in convolve_round · 822eea32

Angie Chiang authored 8 years ago

Performance drop by 0.084% on lowres

Change-Id: I2bcaae96b68033a0af7a1da988505623bc14ed94

Feb 09, 2017

Convert PVQ coefficient handling functions to tran_low_t. · 1dbda1b9

Thomas Daede authored 8 years ago

Change-Id: Iad2b526d65865cbcb2119aca21686563ca8e97fd

Feb 08, 2017

ans: Increase the base state to 1<<17. · b13ce13c

Aℓex Converse authored 8 years ago

ans_multion@2017-01-25T21:00:51.374Z ->
ans_multion_rabs17@2017-01-27T19:25:33.101Z
objective-1-fast
   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.0494 | -0.0494 | -0.0494 |  -0.0475 | -0.0484 | -0.0488 | -0.0497

Increasing the state any further seems to yield a compression drop.

Change-Id: Iacfd6af7e2b8a47c41033d61e338c5106bd3679c

Add support for disabling CLPF on tile boundaries · 73ad5236

Steinar Midtskogen authored 8 years ago

Change-Id: Icb578f9b54c4020effa4b9245e343c1519bd7acb

Avoid sending bits for the compound type for sub 8x8 blocks · 42d9610a

Sarah Parker authored 8 years ago

The only compound mode used with sub 8x8 blocks is COMPOUND_AVERAGE, so
we don't have to send anything in this case.

Change-Id: I90d0162e5f7f1ad205e65094293cde2a48eb77b1

Feb 07, 2017

Add CONFIG_INTERNAL_STATS support to the cmake build. · 0115691e

Tom Finegan authored 8 years ago

Includes CONFIG_AOM_HIGHBITDEPTH support for same.

BUG=https://bugs.chromium.org/p/aomedia/issues/detail?id=76

Change-Id: I99893c8c3c7e163383f7297d0df777c9c21822fd

Add high bit depth support to the cmake build. · 633b9539

Tom Finegan authored 8 years ago

BUG=https://bugs.chromium.org/p/aomedia/issues/detail?id=76

Change-Id: Ibb5564989bd02cf3fec7b8e1d61d2dee1a96c42d

Fix cmake test_libaom build with CONFIG_AOM_HIGHBITDEPTH enabled. · ce4bcebe

Tom Finegan authored 8 years ago

- Comment out the sources that require CONFIG_MOTION_VAR.
- Add missing preproc wrap at the sites in test sources that
  require CONFIG_MOTION_VAR.

Change-Id: I703c2bfd829a579793ad55ae713973d327354473

Scale PVQ input to OD_COEFF_SHIFT resolution. · e93acb2d

Timothy B. Terriberry authored 8 years ago

This ensures we operate at the same precision that Daala uses, which matters
when activity masking is enabled, because of the gain companding.

Metrics from Patchset 4 (which had slightly incorrect rounding):

With activity masking (5 frames only):
av1_pvq_AM_ref_5f@2017-02-07T03:37:53.702Z -> av1_pvq_AM_derf_fix2_coeff_scaling_5f@2017-02-07T00:12:24.427Z

    PSNR |  PSNR Cb |  PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
  0.6653 | -12.3177 | -12.1858 |   0.3350 | 4.1013 |  2.0964 |    -4.0539

In particular for Netflix_Crosswalk_1920x1080_60fps_8bit_420_60f.y4m
 -5.0589 | -22.3077 | -21.2188 |  -7.0389 | -3.3715 |-5.7794 |   -13.1891

I.e., it fixes the large regression with AM on this sequence and
substantially improves chroma (at a lesser cost to other metrics).

Without activity masking (5 frames only):
av1_pvq_ref_5f@2017-02-07T03:52:51.279Z -> av1_pvq_derf_fix2_coeff_scaling_5f@2017-02-07T00:12:48.873Z

    PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
  0.0989 | -0.0322 | -0.0464 |   0.1883 | 0.0795 |  0.0579 |     0.0923

Change-Id: I46b808b7c8e4733465f8bebc8336dfd5b75783ec

ALT_INTRA: Integerize the weights for SMOOTH_PRED. · 7a40600c

Urvang Joshi authored 8 years ago

Insignificant change in BDRate.

Change-Id: Id1aa798393fd4c4c174dfcb9a8315828b531996f

Remove av1_cost_coeffs from rdopt.h if using PVQ. · 617744b6

Thomas Daede authored 8 years ago

Change-Id: I08a2437e4eb2ef31ec7a675fba6bcec538019241

Feb 06, 2017

Add av1_convolve_2d_facade · 7927a97d

Angie Chiang authored 8 years ago

When convolve_round is on, av1_convolve_2d_facade will be used for
interpolation rather than av1_convolve. The convolve_round experiment
code will be removed from av1_convolve in another CL.

So far we use 4-bit rounding in the intermediate stage on top of using
post rounding for compound mode after the last stage.

This will give us roughly 0.45% gain on lowres, 0.39% on midres and
roughly 0.6-0.7% on hdres.
Altogether, the gain is 1.15% on lowres, 0.74% on midres and roughly
1.7-1.8% on hdres.

Note that there is no restriction on usage of the 12-tap filter in this CL.
Adding that, we will lose roughly 0.1% again on lowres.
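
To illustrate the rounding scheme mentioned above (4-bit rounding after the first filter stage, and a single post rounding for compound mode after the last stage), here is a toy sketch. It is not av1_convolve_2d_facade: the 2-tap filters, the 7-bit filter precision constant, and every function name are assumptions chosen to keep the example small, and inputs are kept non-negative.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed precisions for the sketch: taps sum to 1 << FILTER_BITS,
 * and the first stage is rounded by ROUND0_BITS. */
enum { FILTER_BITS = 7, ROUND0_BITS = 4 };

/* Round-half-up right shift for non-negative values. */
static int32_t round_shift(int32_t v, int bits) {
  assert(bits > 0 && v >= 0);
  return (v + (1 << (bits - 1))) >> bits;
}

/* Toy 2-tap separable filter at one output position. The result stays
 * at high precision, scaled by 1 << (2 * FILTER_BITS - ROUND0_BITS). */
static int32_t filter_2d(const uint8_t px[2][2], const int hf[2],
                         const int vf[2]) {
  int32_t rows[2];
  for (int r = 0; r < 2; r++) {
    const int32_t sum = px[r][0] * hf[0] + px[r][1] * hf[1];
    rows[r] = round_shift(sum, ROUND0_BITS); /* intermediate rounding */
  }
  return rows[0] * vf[0] + rows[1] * vf[1];
}

/* Compound average with one rounding step after the sum. */
static int compound_post_round(int32_t a, int32_t b) {
  return round_shift(a + b, 2 * FILTER_BITS - ROUND0_BITS + 1);
}

/* For comparison: round each prediction to a pixel, then average. */
static int compound_early_round(int32_t a, int32_t b) {
  const int bits = 2 * FILTER_BITS - ROUND0_BITS;
  return round_shift(round_shift(a, bits) + round_shift(b, bits), 1);
}
```

For one prediction worth 100.5 and another worth 100.0, the exact average is 100.25. Post rounding yields 100, while rounding each prediction first yields 101, which is the kind of precision loss that deferring the rounding avoids.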

Change-Id: I6332e1d888e28a3b3ddc29711817d66e52cb5cdf

ec_multisymbol: Split off new new_tokenset experiment · a9598cd6

Aℓex Converse authored 8 years ago

The new_tokenset experiment replaces the unconstrained tokenset with a
multisymbol alphabet in an inventive way.

Tested configurations:
new_tokenset + ec_adapt, new_tokenset, ec_multisymbol

Change-Id: I846ab2e51c2a1dc3f2f9904ed8c47a8e98f853c5

Feb 04, 2017

Reset PVQ chroma QM interpolation to constant identity QM · fb993173

David Michael Barr authored 8 years ago

The PVQ QM interpolation code needs to be adapted to AV1 ranges.

av1_float_pvq_dist_scale_AM_5f_Jan31@2017-02-02T08:57:23.156Z
 -> av1_float_pvq_dist_scale_AM_5f_Jan31_crfix@2017-02-02T15:14:40.477Z

  PSNR |  PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
1.8501 | -29.0766 | -6.6775 |   1.8421 | 1.8252 |  1.8228 |    -9.9734

Change-Id: Ib72c1f8eeccf806f8d719866ce80172b6908643e

ans: Switch from uABS to rABS · c54692b5

Aℓex Converse authored 8 years ago

This is in preparation for expanding the state range.

No discernible compression impact

ans_multioff@2017-01-25T20:58:18.756Z -> ans_multioff_rabs@2017-01-26T01:05:12.801Z

     PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
  -0.0001 | -0.0001 | -0.0001 |  -0.0001 | -0.0001 | -0.0001 | -0.0001

https://arewecompressedyet.com/?job=ans_multioff%402017-01-25T20%3A58%3A18.756Z&job=ans_multioff_rabs%402017-01-26T01%3A05%3A12.801Z

Change-Id: Ie1817991190f1de6d9c31e0c97f77efbd5869d35

entropy: Fix --disable-ec_multisymbol build · e2746e30

Aℓex Converse authored 8 years ago

Broken by I233979909118241a0c78761c1d5c2cd6857915e0

Change-Id: I3af0d3907f63b69c1301a48e7d2a276c52d3fd00

add horizontal tile dependence support · 7b9f2b3b

Fangwen Fu authored 8 years ago

Change-Id: I1050b69045407381d4626b65a0bf6f35957a66f4

Feb 03, 2017

Enable an activity masking of PVQ · e4c46918

Yushin Cho authored 8 years ago

By default, the activity masking is used with PVQ.
In addition to '--enable-pvq', '--enable-daala-dist' is also
required by configure to use the activity masking.

Change-Id: I5100a1db992f0e693e61daf5439de8ae8c64a752

Fix fixed-pt PVQ compand/expand outputs zero gain · 3ebfe2a4

Yushin Cho authored 8 years ago

For the fixed-point version of PVQ, which is the current default,
add MAXI(1, ) to limit the minimum companded or expanded gain to one.
Previously, the gain compand/expand function, which is invoked when
activity masking is enabled, sometimes output zero,
which then triggered the assert(gain != 0).

Metric change from floating-pt to fixed-pt PVQ is:
PSNR  PSNR-HVS  SSIM  CIEDE-2000  PSNR Cb PSNR Cr MS-SSIM VMAF
0.02  0.10      0.08  0.11        0.01    0.02    0.13    -0.30

Change-Id: I64a60d1970d35a26af227841e4a5e50a89ddc44c

EC_MULTISYMBOL: Include EOB in multisymbol encoding. · fc1598ad

Thomas Davies authored 8 years ago

RD search and trellis encoding are still sub-optimal.

Change-Id: I233979909118241a0c78761c1d5c2cd6857915e0

Remove interp filter for non-translation global mv · 19e7aa82

Yue Chen authored 8 years ago

BDRATE results:
lowres: -0.880% (up from -0.844%)

Change-Id: I017c0beddcc687148fed33c1e9963e05f1eaf6ea

set loop_filter_across_tiles_enabled flag to 1 in default case · ad67d795

Ryan Lei authored 8 years ago

Change-Id: I907976619a433a92d671c5cce25f3e8806638e80

Bugfix: ensure for pareto coef that there are no zero range encodings · 13754540

Jonathan Matthews authored 8 years ago

Introduced by change I98b33fab6b9f52690f6ad618ac55e725a97be056

BUG=aomedia:349

Change-Id: Ib6df52ac2442f60c159bae2271793b7570d53a19

Fix compilation orders for ext-intra and ec-adapt · 9aa9749d

Hui Su authored 8 years ago

Change-Id: I378b677cf579441ba0a9014a8a77a1cf3f8b5689

Fix high bit depth test build on macosx. · 998e606e

Tom Finegan authored 8 years ago

Change-Id: I8a5288d82e9dda32bf5e47a17c0ee88e4da0b1c5

Add macosx Sierra (v10.12) support to configure. · 9d3076cf

Tom Finegan authored 8 years ago

Change-Id: Ib90be26f69c658a6be6e133097c41845db58b6e1

Fix gtest build in cmake. · 263b39b8

Tom Finegan authored 8 years ago

Change-Id: Ic6a99b82e92f8512bdd40d002aa6b904b768ae9a

Remove CONFIG_AV1_TEMPORAL_DENOISING. · a488727f

Tom Finegan authored 8 years ago

Clean up. Remove dead experiment/whatever.

Change-Id: I03cae9c9240e917595aa4a38b1d6d29a2ec19115

Enable build of some sse2, ssse3, and sse4.1 tests via cmake. · e0578d1e

Tom Finegan authored 8 years ago

Applies only to the tests that require only the presence of
compiler support. Tests that require an intrinsic flag and an
enabled experiment are not included.

BUG=https://bugs.chromium.org/p/aomedia/issues/detail?id=76

Change-Id: I1ba6ee80cadc3064068db04c15caf8cc2384ab3b

Enable use of the Xcode IDE via cmake. · 9412ec39

Tom Finegan authored 8 years ago

Xcode needs special handling when an executable target contains
no C++ sources, but links C++ dependencies.

BUG=https://bugs.chromium.org/p/aomedia/issues/detail?id=76

Change-Id: Ifd4f6208c8f96386194691d45279df1e70a8fc17

Add cmake app targets to a list variable. · 81279803

Tom Finegan authored 8 years ago

Allows for looping over app targets when all targets need
the same update.

BUG=https://bugs.chromium.org/p/aomedia/issues/detail?id=76

Change-Id: If9e3f6ab50f06c8c26104942455e6bbfac485cb1

Fix RTCD dependency problems in cmake make build. · a0c21f04

Tom Finegan authored 8 years ago

Fixes make clean && make runs (single and multi job) via addition of
new target aom_rtcd that all lib targets depend on. Target includes
the RTCD definition perl files, the output H files, the C files and
rtcd.pl itself.

Also,
- Adds list of lib targets (used to propagate the aom_rtcd dep)
- Use the correct symbol for av1 RTCD gen (aom_av1_rtcd -> av1_rtcd)

BUG=https://bugs.chromium.org/p/aomedia/issues/detail?id=76

Change-Id: Ia0e858220c4c2877c6e5f5ffed853be15c6cd711

Update third_party/googletest to 1.8.0 · 51fafcbd

Johann Koenig authored 8 years ago

Change-Id: I49212125058816687535d3b946fccfa47c16aa11

EC_MULTISYMBOL: always send the EOB_TOKEN after a non-zero value. · 490477ab

Thomas Davies authored 8 years ago

This will allow EOB_TOKEN to be merged with that value.

Change-Id: I82ba5e8d38e235d07894e43b5fec53968f84ab6c