Commits · 28e9ce29c406a32c0dcab9d790a46ddb57d05ef6 · Xiph.Org / aom-rav1e

Jan 24, 2018

Adding timing info to sequence headers · 28e9ce29
Andrey Norkin authored 7 years ago
```
Change-Id: I0fdb09499196e02709e067f690dff71146ee5114
```
28e9ce29

Added SSE4.1 and AVX2 implementations of FAST SGR. · 9d234571

The self-guided filter speed tests show that:
- The SSE4.1 implementation of FAST SGR is ~35% faster than the corresponding
  implementation of SGR;
- The AVX2 implementation of FAST SGR is ~28% faster than the corresponding
  implementation of SGR.

Change-Id: Iecdc1f8cee79500084c71d06dbb02d804272aa99

9d234571

Add a config flag/code for fast sgr computation · ed5e9673

Deb Mukherjee authored 7 years ago

Adds an experiment for fast SGR computation where, for the r=2
filter, the A and B stats are computed only for every other row
and averaged for the rows in between.
The motivation is to improve software performance with, hopefully,
minimal loss.
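
The every-other-row idea can be illustrated with a minimal sketch. It
assumes one scalar statistic per row, whereas the real filter keeps
per-pixel A and B values. The names `stats_every_other_row` and
`row_stat`, and the handling of the bottom edge, are hypothetical and
not taken from the patch.

```python
def stats_every_other_row(rows, row_stat):
    """Compute row_stat on even rows only; fill odd rows by averaging."""
    n = len(rows)
    out = [None] * n
    # Full computation only on rows 0, 2, 4, ...
    for i in range(0, n, 2):
        out[i] = row_stat(rows[i])
    # Odd rows are the average of the computed rows above and below.
    for i in range(1, n, 2):
        above = out[i - 1]
        # Assumption: at the bottom edge, reuse the row above.
        below = out[i + 1] if i + 1 < n else above
        out[i] = (above + below) / 2
    return out


def row_mean(row):
    return sum(row) / len(row)


rows = [[1, 3], [100, 100], [5, 7], [100, 100], [9, 11]]
print(stats_every_other_row(rows, row_mean))
```

Note that the odd rows never reach `row_stat`, which is where the
saving comes from; the price is that their statistics are only
approximations.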

Change-Id: Ie36687826524dc18c1fbb7f6becff244187bf8da

ed5e9673

[loop-restoration, bugfix] Restrict sampling of deblocked pixels · dff901ff

David Barker authored 7 years ago

There is a special case with certain frame heights, where we
end up with a loop restoration stripe which ends 1px above the
crop border.

Previously this case was handled in quite an ugly way, which also
disagrees with the spec (and isn't great for hardware). This patch
changes things to match the spec.

Specifically, the old method was to sometimes upscale one extra
row of deblocked pixels so that we could always have a 2px
"below" border for each processing stripe. The new method is to
only use rows inside the crop border, and to duplicate them if
necessary.
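
The new border rule can be sketched as index clamping. This is a
simplified illustration, not the libaom code: `below_border_rows`,
`stripe_end` (the index of the first row below the stripe) and
`crop_height` are hypothetical names.

```python
def below_border_rows(frame_rows, stripe_end, crop_height, border=2):
    """Return `border` rows below a stripe, using only rows inside the crop.

    Rows past the crop border are never read; the last valid row is
    duplicated instead.
    """
    last_valid = crop_height - 1
    return [frame_rows[min(stripe_end + k, last_valid)] for k in range(border)]


# The buffer has 10 rows, but only the first 8 are inside the crop border.
frame_rows = [f"row{i}" for i in range(10)]

# Ordinary stripe: both "below" rows lie inside the crop.
print(below_border_rows(frame_rows, stripe_end=4, crop_height=8))
# Special case: the stripe ends 1px above the crop border,
# so the single remaining row is duplicated.
print(below_border_rows(frame_rows, stripe_end=7, crop_height=8))
```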

BUG=aomedia:1264

Change-Id: Idf8ab510e1091dc3f5b257de60e16bca214d8dc4

dff901ff

Remove deadline · 47cc2559

Sean DuBois authored 7 years ago

BUG=aomedia:13

Change-Id: I9df343f4a6a809b09446ff1f2083c38771ab068b

47cc2559

Set input_shift properly · 913867b4

Yaowu Xu authored 7 years ago

Profile 0 now supports 10-bit input, so profile 0 no longer implies
an input_shift of 0.
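
As a rough sketch, assuming that input_shift is the difference between
the codec bit depth and the input bit depth (an assumption; the commit
message does not spell out the formula), the shift would be derived
from the bit depths alone rather than from the profile:

```python
def input_shift(codec_bit_depth, input_bit_depth):
    """Hypothetical helper: bits to shift input samples up by.

    Assumption: the shift depends only on the two bit depths,
    not on the profile number.
    """
    if input_bit_depth > codec_bit_depth:
        raise ValueError("input bit depth exceeds codec bit depth")
    return codec_bit_depth - input_bit_depth


# 8-bit input coded at 10 bits needs a shift even under profile 0.
print(input_shift(codec_bit_depth=10, input_bit_depth=8))
```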

Change-Id: Idae429b88ee5c073ee6e939a88d569c5ffde2b0d

913867b4

Simplify cos_bit setting in txfm · d4327bce

Angie Chiang authored 7 years ago

Move cos_bit from txfm 1d cfg to 2d cfg
Each txfm stage only uses one cos_bit

This is a lossless change and it speeds up the encoder by 2%.

Change-Id: I45d398761e4729b8c4c37729571fe3765cb0c83f

d4327bce

Cleanup redundant assertion · dc3d916b
Frederic Barbier authored 7 years ago
```
Change-Id: I6532e20c958d5bf6f6d73a6f076664e1b74ba055
```
dc3d916b

Skip RD search over lst 2/3 frame for non-nearest neighbor mvs · 8db5f17b

Jingning Han authored 7 years ago

Skip the rate distortion search over last 2/3 reference frames for
the reference motion vectors derived from non-nearest neighbors.
The overall coding performance change is within the noise range
(0.05% better). This speeds up the encoding process by 20%.
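
The pruning rule can be sketched as a simple predicate. The names
below (`should_skip_rd_search`, the string labels for the reference
frames, and the `from_nearest` flag) are hypothetical and only
illustrate the condition described above.

```python
LAST2 = "LAST2_FRAME"
LAST3 = "LAST3_FRAME"


def should_skip_rd_search(ref_frame, mv_from_nearest_neighbor):
    """Skip LAST2/LAST3 when the reference MV came from a non-nearest neighbor."""
    return ref_frame in (LAST2, LAST3) and not mv_from_nearest_neighbor


candidates = [
    {"ref": "LAST_FRAME", "from_nearest": False},
    {"ref": LAST2, "from_nearest": True},
    {"ref": LAST2, "from_nearest": False},  # pruned
    {"ref": LAST3, "from_nearest": False},  # pruned
]
searched = [
    c for c in candidates
    if not should_skip_rd_search(c["ref"], c["from_nearest"])
]
print(len(searched), "of", len(candidates), "candidates searched")
```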

Change-Id: I823b8ca2805ae332f4c9bc8ee255069a82db4331

8db5f17b

Use split and horz/vert to predict horzA/B/vertA/B · 6001fb05

Zoe Liu authored 7 years ago

In rd_pick_partition(), the first one or two blocks for the partition
types HORZ_A, HORZ_B, VERT_A, and VERT_B may be already evaluated,
during the evaluation of SPLIT, HORZ, and VERT. This patch saves the
RD pick mode results and tries to reuse them to remove the duplicate
RD mode evaluation operations.

This patch should not incur any coding performance loss.
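
The reuse idea resembles memoization keyed on block position and size.
The sketch below is illustrative only: `RdCache` and `toy_evaluate` are
hypothetical, and the real encoder must also make sure the coding
context matches before a saved result can be reused.

```python
class RdCache:
    """Remember RD mode results per (row, col, width, height)."""

    def __init__(self, evaluate):
        self._evaluate = evaluate
        self._results = {}
        self.evaluations = 0  # number of real (non-cached) evaluations

    def pick_mode(self, row, col, width, height):
        key = (row, col, width, height)
        if key not in self._results:
            self.evaluations += 1
            self._results[key] = self._evaluate(*key)
        return self._results[key]


def toy_evaluate(row, col, width, height):
    # Stand-in for an expensive RD mode search.
    return width * height


cache = RdCache(toy_evaluate)
# SPLIT of a 16x16 block: four 8x8 sub-blocks.
for r, c in [(0, 0), (0, 8), (8, 0), (8, 8)]:
    cache.pick_mode(r, c, 8, 8)
# HORZ: two 16x8 halves.
cache.pick_mode(0, 0, 16, 8)
cache.pick_mode(8, 0, 16, 8)
before = cache.evaluations
# HORZ_A starts with the two top 8x8 blocks, already seen during SPLIT.
cache.pick_mode(0, 0, 8, 8)
cache.pick_mode(0, 8, 8, 8)
print(cache.evaluations - before, "new evaluations for the first two HORZ_A blocks")
```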

Testing on a few lowres frames: when CFL is off, this patch obtains
an encoder speedup of more than 10%.

Change-Id: I932e233bc93873de62a88230254df44494236dde

6001fb05

Add AVX2 implementation for motion compensation function · 54cd8d76
Yushin Cho authored 7 years ago
```
AVX2 Code for av1_convolve_2d_sr_c()

Change-Id: Id8a2192b78bbb2c6ac22da3134a7c256941985c8
```
54cd8d76

remove deprecated cmake flags · ec254b77

Johann Koenig authored 7 years ago

These flags provided compatibility with configure but have
no effect in cmake builds.

Change-Id: I2dbb71d9aeaae759cc3c4a46917e3840d696328d

ec254b77

remove stale .gitignore entries · 4a9eda2c

Johann Koenig authored 7 years ago

In-tree builds are explicitly disallowed by cmake. Any of these files
showing up in the source tree should be cause for concern.

BUG=aomedia:1254

Change-Id: Iae42c17cbadb6554c6a95bda14daf5ac67e352a7

4a9eda2c

adopt some clang 5.0.0 formatting · 123e8a60

Johann Koenig authored 7 years ago

At least the changes that don't conflict with 4.0.1

Change-Id: Iaa2fda027b8ab2b023d608cf5ec7b377a72b851e

123e8a60

Add experiment aom_qm_ext and its dependency · e2994a5c
Yaowu Xu authored 7 years ago
```
Change-Id: I243e2a3cbae5b4eebe7fbabcb9f55552e9f13bd8
```
e2994a5c

Support rd model in txk sel search · dd8600f5

Jingning Han authored 7 years ago

Unify the per-transform-block kernel selection process with
the rate distortion model used in the preliminary mode search. This
makes the txk-sel model search space the same as the baseline.

Change-Id: I82a2d94e88a03c88154582575ced500197f8a409

dd8600f5

Code cleanup in rdopt.h · 206d22f2
Hui Su authored 7 years ago
```
Change-Id: Iea0e8665cdd5b9bc0fe17930add7068443765ea9
```
206d22f2

Jan 23, 2018

Remove av1_cost_bit() · 751a2335

Hui Su authored 7 years ago

It's more efficient to use av1_cost_literal() instead.

Change-Id: I50727d4a4ee06492b373c2e7831c224c5eae8735

751a2335

lv-map: replace read/write_bin with read/write_symbol · 41d61528
Hui Su authored 7 years ago
```
Change-Id: I9e16b5de0a3ae1814982660434812d417955d94f
```
41d61528

Change tilesize to 256x256 for >CIF resolutions · 5f7f3677

Deb Mukherjee authored 7 years ago

This improves coding efficiency for higher resolution
sources. Having it on by default also guards against
256x256 LRU support being inadvertently broken.

Change-Id: I171b3c310eab72e27390e9ad0aa9c362f7fbb508

5f7f3677

Remove FRAME_ID_NUMBERS_PRESENT_FLAG · 6eb9da2c

Yaowu Xu authored 7 years ago

This commit replaces the hard-coded FRAME_ID_NUMBERS_PRESENT_FLAG with
error_resilient_mode, which properly reflects the intention of the
experiment, i.e. "signal the complete state of the reference buffer
explicitly for each frame" to deal with possible frame losses.

Change-Id: I7130c110d26c6a8e1cf1266c05482b768cf352f9

6eb9da2c

Revert "add scalability experiment" · 8695e987

Tom Finegan authored 7 years ago

This reverts commit 2eeadab1.

Reason for revert: Did not address final review comments before landing.

Change-Id: I29089767857bd20b3a3e42322e3887fb7027559d

8695e987

add scalability experiment · 2eeadab1

Soo-Chul Han authored 7 years ago

configure:  --enable-experimental --enable-scalability

New applications:  scalable_encoder, scalable_decoder

scalable_encoder:
  * Encodes inputs as 2-layer (same size) stream
  * Encodes as obu file (OBU_NO_IVF must be enabled)
  * Base layer encoded as IPPPP, where each P frame references
    only the previous (in time) base layer frame
  * Enhancement layer encoded using its base layer as
    sole reference frame
  * Base layer encoded with fixed high QP
  * Enhancement layer encoded with fixed low QP

scalable_decoder:
  * Able to decode scalable stream generated by
    scalable_encoder
  * Able to decode any single-layer stream encoded
    by aomenc
  * Outputs base layer as out_lyr0.yuv, and enhancement
    layers (if they exist) as out_lyrN.yuv (N = 1, 2, 3, ..)
  * Able to decode N layers (more than 2)
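
The 2-layer reference structure and the output naming described above
can be sketched as follows. The helper names `reference_for` and
`layer_output_name` are hypothetical; only the `out_lyrN.yuv` pattern
and the reference rules come from the description.

```python
def layer_output_name(layer):
    """Output file name for a decoded layer (0 = base layer)."""
    return f"out_lyr{layer}.yuv"


def reference_for(frame_index, layer):
    """Return (frame_index, layer) of the sole reference frame, or None.

    Models the 2-layer scalable_encoder setup:
      * the first base layer frame is intra (no reference);
      * later base layer frames reference the previous base layer frame;
      * an enhancement layer frame references its own base layer frame.
    """
    if layer not in (0, 1):
        raise ValueError("this sketch models a 2-layer stream only")
    if layer == 1:
        return (frame_index, 0)
    if frame_index == 0:
        return None
    return (frame_index - 1, 0)


for t in range(3):
    for lyr in (0, 1):
        print(t, lyr, reference_for(t, lyr), layer_output_name(lyr))
```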

Change-Id: I8555735db71e5b9b6f900ffdf978e0ad6f6bfc00

2eeadab1

Fix build when obu is not enabled · a8975df5
Yaowu Xu authored 7 years ago
```
Change-Id: I2d2ce75c184011884de8a015a6666b5209de2082
```
a8975df5

Move encoder-specific function out of decoder · 57ddc51a
Frederic Barbier authored 7 years ago
```
Change-Id: I5ae45abe5145dedf9751adbeb81a111a49df7eb5
```
57ddc51a

Let adst4's precision be adjustable · 8251736b
Angie Chiang authored 7 years ago
```
Change-Id: I6e251328b2934130992dbd355cfdffc3c721d357
```
8251736b

Tune the inv_shift · 06250276

Angie Chiang authored 7 years ago

Let the second stage of 10-bit inv txfms fit within 16 bits

Change-Id: Ia087d65484cd410651190dcd9d3292cce6594d34

06250276

Correct inv_start_range · a8b45c37
Angie Chiang authored 7 years ago
```
Change-Id: I08e4686b0bcf19a3c318a831bc338c9e58f3a127
```
a8b45c37

Tune fwd txfm's config · a0d27597

Angie Chiang authored 7 years ago

Maximize cos_bit's precision

Change-Id: Iad5d3915823f5c1c25a0caa3bd012d60caa2d521

a0d27597

Fix txfm_stage_range_check · 248f0557

Angie Chiang authored 7 years ago

Only check cos_bit range if cos_bit is not NULL

Change-Id: I286fc056812b20242cc962a8b008af7093d05b1d

248f0557

Move InvSqrt2 to the front of inv_txfm2d_add_c · 4b29ea86

Angie Chiang authored 7 years ago

This will simplify the range management of rect txfm

Change-Id: Icf678fe735dd299c6c42a215c592611025e87ba6

4b29ea86

Remove more code about probability based entropy coding · 9fdf2e2e
Hui Su authored 7 years ago
```
Change-Id: Ie0bc1dd68f7a5d81e49da0ae6f855e572e12aa10
```
9fdf2e2e

Fix a bug in jnt_comp · 5b5f3d50

Cheng Chen authored 7 years ago

(1) The index may go outside of range.
(2) When d0 <= d1, the comparison is invalid.

Performance impact on Google lowres testset:
Turn on jnt_comp vs baseline,
Without fix: -0.211% gain
With fix: -0.357% gain

BUG=aomedia:1239

Change-Id: I761522bba8396bba0d4108d710030b472939cf32

5b5f3d50

Added a test for monochrome encoding. · 26ac0478

Imdad Sardharwalla authored 7 years ago

The test encodes 5 frames of a video using the --monochrome flag and
verifies that the decoded frames satisfy:

- each frame's monochrome flag is set to 1
- each frame's U and V planes are set to a constant, and this constant
  is the same for all decoded frames
- the initial frame's Y PSNR value is 'high enough'
- the Y PSNR values remain fairly constant across all of the frames
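
The four checks can be sketched as a single predicate over decoded
frames. This is an illustration, not the actual test: the frame
representation and the thresholds `min_first_psnr` and `max_psnr_drift`
are hypothetical, since the message only says 'high enough' and
'fairly constant'.

```python
def check_monochrome_frames(frames, min_first_psnr, max_psnr_drift):
    """Return True if decoded frames pass the monochrome checks.

    Each frame is a dict with keys: "monochrome" (flag), "u" and "v"
    (flat lists of chroma samples), and "y_psnr" (float).
    """
    if not frames:
        return False
    chroma_values = set()
    for frame in frames:
        # Check 1: the monochrome flag is set on every frame.
        if frame["monochrome"] != 1:
            return False
        chroma_values.update(frame["u"])
        chroma_values.update(frame["v"])
    # Check 2: one constant fills U and V across all frames.
    if len(chroma_values) != 1:
        return False
    # Check 3: the first frame's Y PSNR is high enough.
    first_psnr = frames[0]["y_psnr"]
    if first_psnr < min_first_psnr:
        return False
    # Check 4: Y PSNR stays close to the first frame's value.
    return all(abs(f["y_psnr"] - first_psnr) <= max_psnr_drift for f in frames)


good = [
    {"monochrome": 1, "u": [128] * 4, "v": [128] * 4, "y_psnr": 40.0 - 0.1 * i}
    for i in range(5)
]
print(check_monochrome_frames(good, min_first_psnr=35.0, max_psnr_drift=1.0))
```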

Change-Id: I4239ddfb745ed9746547737b4bc99963c71e51c0

26ac0478

Don't calculate chroma data in monochrome mode · af8e2648

Imdad Sardharwalla authored 7 years ago

Encoder: Prior to this patch, some chroma data was calculated and
later discarded when in monochrome mode. This patch ensures that
the chroma planes are left uninitialised and that chroma
calculations are not performed.

Decoder: Prior to this patch, some chroma calculations were still
being performed in monochrome mode (e.g. loop filtering). This
patch ensures that calculations are only performed on the y
plane, with the chroma planes being set to a constant.

Change-Id: I394c0c9fc50f884e76a65e6131bd6598b8b21b10

af8e2648

Fix Valgrind warning in av1_pick_filter_restoration · b08544de

Imdad Sardharwalla authored 7 years ago

Some array elements were defined and left uninitialised. This wasn't causing a
problem, as the elements were later ignored, but it did cause Valgrind to
produce warnings.

The function now initialises the full array immediately after its definition in
order to quiet these warnings.

BUG=aomedia:1244

Change-Id: I5083f1f4008cb3ab70a4af4d1d2573dee8793303

b08544de

Add SSE2 implementation of 1-D convolve functions · ffa57594

Frank Bossen authored 7 years ago

Can reduce decoder runtime by about 7 percent.

Change-Id: I4ee3eea9de867d065d03a176f242e286a4899004

ffa57594

Remove the dct_only experiment · 7448fc24
Hui Su authored 7 years ago
```
Change-Id: I33bb6e902e3be2847ae8101199d9cbd0e1e5c38d
```
7448fc24

Move if statement outside for loops · 2e8eaddd

Peng Bin authored 7 years ago

By avoiding breaking the CPU's pipeline,
this patch achieves a small encoder
speedup in the range of 0.2% to 0.71%.
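
The transformation itself can be sketched as follows. The functions
are hypothetical and only show the shape of the change: a condition
that does not vary inside the loop is tested once, outside it. The
pipeline benefit reported above applies to the compiled encoder, and
this sketch makes no timing claim.

```python
def transform_checked_inside(values, use_double):
    out = []
    for v in values:
        # Loop-invariant condition, re-tested on every iteration.
        if use_double:
            out.append(v * 2)
        else:
            out.append(v + 1)
    return out


def transform_hoisted(values, use_double):
    # The condition is tested once; each loop body is branch-free.
    if use_double:
        return [v * 2 for v in values]
    return [v + 1 for v in values]


data = [1, 2, 3]
print(transform_hoisted(data, True), transform_hoisted(data, False))
```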

Change-Id: I398cb09f8eb91695e3258091ff2f82f06ab74145

2e8eaddd

[segment_pred_last] fix resolution change issues · 85e8c797

Soo-Chul Han authored 7 years ago

Explicitly disable segmentation when the reference frame has a
different resolution.

BUG=aomedia:1205
BUG=aomedia:1223
BUG=aomedia:1256

Change-Id: I6db51116db308514d572eb465c2453403e64e1f2

85e8c797