Commits · 242460cb66ed2883e15e4f39b957af1eef7ffc03 · Xiph.Org / aom-rav1e

Aug 26, 2013

Cleaning up decode_block_intra function. · 242460cb
Dmitry Kovalev authored 11 years ago
```
Change-Id: Ia41ea5d526d15fcbc9b56d74079593cf8b2fdf66
```

Fix the reading of too many input pixels · 6c5433c8
Yaowu Xu authored 11 years ago
```
in VP9_get4x4var_mmx

Change-Id: I4b4a8f45f25ebdfad281f169cc87aba5e2d6f227
```

Temporarily disable SSSE3 quant_32x32 · 166dc85b

Jingning Han authored 11 years ago

Make the current head work properly while an issue in the SSSE3
implementation of 32x32 quantization is being fixed.

Change-Id: Ic029da3fd7f1f5e58bc641341cbd226ec49a16bc

Aug 24, 2013

cosmetics: strip 'VP9_' from defines in vp9 only code · c8ba8c51

James Zern authored 11 years ago

Change-Id: I481d9bb2fa3ec72b6a83d5f04d545ad8013f295c

Aug 23, 2013

Limit mv range to be based on partition size · 13930cf5

Yaowu Xu authored 11 years ago

Previous change c4048dbd limited the mv search range assuming a max block
size of 64x64; this commit changes the search range to use the actual block
size instead.

Change-Id: Ibe07ab02b62bf64bd9f8675d2b997af20a2c7e11

Removing redundant calls to clamp_mv2. · cd2cc27a

Dmitry Kovalev authored 11 years ago

We can avoid calling clamp_mv2 because it has already been called
inside the vp9_find_best_ref_mvs function.

Change-Id: I08edeaf3e11e98c19e67b9711b2523ca5fb1416e

Fixing display size setting problem. · 11e3ac62

Dmitry Kovalev authored 11 years ago

Fix for https://code.google.com/p/webm/issues/detail?id=608. We could have
used an invalid display size equal to the previous frame size (instead of the
current frame size).

Change-Id: I91b576be5032e47084214052a1990dc51213e2f0

Cleanup in mvref_common.{h, c}. · 21d8e859

Dmitry Kovalev authored 11 years ago

Making code more compact, adding consts, removing redundant arguments,
adding do/while(0) for macros.

Change-Id: Ic9ec0bc58cee0910a5450b7fb8cfbf35fa9d0d16

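The do/while(0) idiom mentioned in the mvref_common cleanup can be illustrated with a minimal sketch. This is not code from the repository: `CandidateList`, `ADD_CANDIDATE`, and `add_if_nonzero` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical candidate list, used only for this illustration. */
typedef struct {
  int values[4];
  int count;
} CandidateList;

/* A multi-statement macro wrapped in do { ... } while (0) expands to exactly
 * one statement, so it stays safe inside an unbraced if/else. */
#define ADD_CANDIDATE(list, v)               \
  do {                                       \
    if ((list)->count < 4) {                 \
      (list)->values[(list)->count] = (v);   \
      (list)->count++;                       \
    }                                        \
  } while (0)

/* Adds v when it is non-zero; otherwise records a -1 sentinel. */
static void add_if_nonzero(CandidateList *list, int v) {
  assert(list != NULL);
  if (v != 0)
    ADD_CANDIDATE(list, v);
  else
    ADD_CANDIDATE(list, -1);
}
```

If the macro body were a bare `{ ... }` block, the semicolon after `ADD_CANDIDATE(list, v)` would end the `if` statement and the following `else` would fail to compile.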

Added border extension · 656632b7

Yaowu Xu authored 11 years ago

Border extension is added to the source buffer that is to be encoded as an
alt ref frame. This fixes the use of uninitialized memory in the encoder.

See https://code.google.com/p/webm/issues/detail?id=605

Change-Id: I97618a2fc207e08abcf5301b734aa9e3ad695e2c

Fix bug in convolution functions (filter selection) · 3f108313

Adrian Grange authored 11 years ago

(In response to Issue 604:
 https://code.google.com/p/webm/issues/detail?id=604)

There were bugs in the convolution code for two cases:

1. Where the filter table was assumed to be aligned to a
   256 byte boundary. The offset of the pixel in the
   source buffer was computed incorrectly.

2. Where no such alignment assumption was made. An
   incorrect address for the filter table base was used.

To fix both problems, I now assume that the filter table is
256-byte aligned and modify the pixel offset calculation to
match.

A later patch should remove the restriction that the filter
table is aligned to a 256-byte boundary.

There was also a bug in the ConvolveTest unit test
(convolve_test.cc).

(Bug & initial fix suggestion submitted by Tero Rintaluoma
and Sami Pietilä).

Change-Id: I71985551e62846e55e40de9e7e3959d4805baa82

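The 256-byte alignment assumption described in the convolution fix can be sketched as follows. This is an illustration under stated assumptions, not the libvpx implementation: the table shape (16 kernels of 8 16-bit taps, 256 bytes in total) and the names `filter_table`, `filter_base`, and `filter_index` are chosen for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define TAPS 8     /* assumed number of taps per kernel */
#define KERNELS 16 /* assumed number of subpixel positions */

/* 16 kernels * 8 taps * 2 bytes = 256 bytes, placed on a 256-byte boundary
 * so that the low 8 address bits encode the position within the table. */
static alignas(256) const int16_t filter_table[KERNELS][TAPS] = { { 0 } };

/* Recover the table base from a pointer to any kernel by clearing the low
 * 8 address bits. This is only valid under the alignment assumption. */
static const int16_t *filter_base(const int16_t *kernel) {
  return (const int16_t *)((uintptr_t)kernel & ~(uintptr_t)0xFF);
}

/* Recover the kernel index from the low 8 address bits. The pointer must
 * refer to the first tap of a kernel. */
static int filter_index(const int16_t *kernel) {
  assert(((uintptr_t)kernel & 0xF) == 0);
  return (int)(((uintptr_t)kernel & 0xFF) / (TAPS * sizeof(int16_t)));
}
```

If the table were not aligned to 256 bytes, masking the low bits would produce a base address that does not belong to the table, which is the kind of mismatch between base and offset that the commit message describes.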

Changes to adaptive inter rd thresholds. · aa5b67ad

Paul Wilkins authored 11 years ago

Values are now carried over from frame to frame.
Changed the algorithm for decreasing the threshold after
a hit, and the max threshold (now based on speed).

Removed some old commented out code relating to
VP8 adaptive thresholds.

The impact of these changes, tested on Akiyo (50 frames)
and measured in terms of unit rd hits, is as follows:

Speed 0 84.36 -> 84.67
Speed 1 29.48 -> 22.22
Speed 2 11.76 -> 8.21
Speed 3 12.32 -> 7.21

Encode speed impact is broadly in line with these.

Change-Id: I5b886efee3077a11553fa950d796fd6d00c8cb19

Limit Key frame Intra modes checks. · f76f52df

Paul Wilkins authored 11 years ago

Most of the focus so far has been on inter frames.

At high speed settings the key frame is now taking a high %
of the cycles.

This patch puts in some masking to reduce the number
of INTRA modes searched during key frame coding (as already
happens for inter frames) at higher speed settings.

TODO: Develop this further with either adaptive rd thresholds
when choosing which intra modes to consider or some other
heuristic.

Impact.
At high speed settings on some clips the key frame was starting
to dominate. When coding the first 50 frames of AKIYO at speed 2,
limiting the key frame intra modes to DC or TM_PRED resulted in a
~30% overall speedup. For Bus the number was lower at ~4-5%.

Change-Id: I7bde68aee04995f9d9beb13a1902143112e341e2

Fix rectangular partition check flag · 84f3b76e

Jingning Han authored 11 years ago

Put the change of the rectangular partition check flag, made according to
the rd costs of the NONE and SPLIT partition types, under the speed feature.

Change-Id: If681e1e078a8d43d86961ea4b748da5cd1b6c331

Aug 22, 2013

Add NEON-optimized vp9_short_idct10_16x16_add. · 4082bf9d

Hangyu Kuang authored 11 years ago

vp9_short_idct10_16x16_add is used to handle blocks that only have valid data
in the top-left 4x4 block; all other coefficients are 0. So we can skip many
unnecessary calculations in order to save instructions.

Change-Id: I6e30a3fee1ece5af7f258532416d0bfddd1143f0

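The precondition behind the reduced transform in the idct10 commit (only the top-left 4x4 corner of a 16x16 block holds non-zero data) can be written down as a small check. This is a hedged sketch: `only_top_left_4x4_nonzero` is a hypothetical helper, the row-major layout is an assumption, and a real decoder may decide this from other information (such as the position of the last non-zero coefficient) rather than scanning the block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_DIM 16 /* 16x16 coefficient block, assumed row-major */

/* Returns 1 when every coefficient outside the top-left 4x4 corner is zero,
 * i.e. when a reduced inverse transform would give the same result as the
 * full one; returns 0 otherwise. */
static int only_top_left_4x4_nonzero(const int16_t *coeffs) {
  assert(coeffs != NULL);
  for (int r = 0; r < BLOCK_DIM; ++r) {
    for (int c = 0; c < BLOCK_DIM; ++c) {
      if ((r >= 4 || c >= 4) && coeffs[r * BLOCK_DIM + c] != 0) return 0;
    }
  }
  return 1;
}
```

When the check holds, 240 of the 256 inputs are known to be zero, which is where the saved multiplications and additions come from.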

vp9_encodeframe.c cleanup. · 604022d4

Dmitry Kovalev authored 11 years ago

Removing unused get_sbuv_perpixel_variance function, using has_second_ref/
is_inter_block functions, organizing includes.

Change-Id: I016de4af12fbbb8b4ece26a70759b2392651b095

check_bsize_coverage cleanup. · 335b1d36
Dmitry Kovalev authored 11 years ago
```
Change-Id: Ib7803857b35c00e317c9deb8630e777e25eb278f
```

Checking scale factors on access. · 3c426572

Dmitry Kovalev authored 11 years ago

It is possible to have invalid scale factors and never access them
during decoding. An error is reported only if we actually try to use
invalid scale factors.

Change-Id: Ie532d3ea7325ee0c7a6ada08269f804350c80fdf

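The "check on access" approach described for scale factors can be sketched as deferred validation. This is an illustration only: `ScaleFactors`, `INVALID_SCALE`, `store_scale`, and `use_scale` are hypothetical names, and the validity rule (a factor must be positive) is a placeholder for whatever rule the decoder applies.

```c
#include <assert.h>
#include <stddef.h>

#define INVALID_SCALE (-1) /* hypothetical marker for an unusable factor */

typedef struct {
  int x_scale_fp;
  int y_scale_fp;
} ScaleFactors;

/* Setup never fails: an unusable input is only recorded as invalid. */
static void store_scale(ScaleFactors *sf, int x_scale_fp, int y_scale_fp) {
  assert(sf != NULL);
  sf->x_scale_fp = x_scale_fp > 0 ? x_scale_fp : INVALID_SCALE;
  sf->y_scale_fp = y_scale_fp > 0 ? y_scale_fp : INVALID_SCALE;
}

/* Access is where validity is enforced. Returns NULL and sets *error when
 * the stored factors are invalid; returns sf unchanged otherwise. */
static const ScaleFactors *use_scale(const ScaleFactors *sf, int *error) {
  if (sf->x_scale_fp == INVALID_SCALE || sf->y_scale_fp == INVALID_SCALE) {
    *error = 1;
    return NULL;
  }
  return sf;
}
```

With this split, a stream that carries invalid factors for a reference it never uses still decodes; the error surfaces only on the first real use.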

rename LOG2_* defines to *_LOG2 · 40ae02c2

James Zern authored 11 years ago

gets rid of a mix of styles

Change-Id: I3591d312157bc6f53a25438bf047765c671fd8a8

Removing useless calls to setup_{pre, dst}_planes. · 09858c23

Dmitry Kovalev authored 11 years ago

Comment is wrong, we don't initialize any xd pointers. We only initialize
xd->planes[i]->dst and xd->planes[i]->pre[], which are actually initialized
for every block during the decoding.

Change-Id: If152ea872ebef1f83ca70712fa6f8df1b6855f56

vp9/encoder: fix last_frame_seg_map mem leak · a5726ac4

James Zern authored 11 years ago

remove duplicate allocation from vp9_create_compressor, it was added to
vp9_alloc_frame_buffers in:

d5bec522 Added resizing & initialization of last frame segment map

Change-Id: I996723226a16a62aff8f9a52ac74e0b73cc98fdf

Adding vp9_is_scaled function. · 640dea4d
Dmitry Kovalev authored 11 years ago
```
Change-Id: Ieb7077ca3586b9491912027eed450a4f6fd38d30
```

Refactor rd_pick_partition for parameter control · 01a37177

Jingning Han authored 11 years ago

This commit changes the partition search order of superblocks from
{SPLIT, NONE, HORZ, VERT} to {NONE, SPLIT, HORZ, VERT} for
consistency with that of sub8x8 partition search. It enables the use
of early termination in partition search for all block sizes.

For ped_area_1080p 50 frames coded at 4000 kbps, the runtime
goes down from 844305 ms to 818003 ms (3% speed-up) at speed 0.

This will further move towards making the in-search partition types
configurable, hence unifying various speed-up approaches.

Some speed 1 and 2 features are turned off during the refactoring
process, including:
disable_split_var_thresh
using_small_partition_info

Stricter constraints are applied to use_square_partition_only for
right/bottom boundary blocks. These features will be brought back/refined
subsequently. At this point, the change makes the derf set at speed 1 about
0.45% higher in compression performance and 9% lower in run-time.

Change-Id: I3db9f9d1d1a0d6cbe2e50e49bd9eda1cf705f37c

Optimise idct4x4: rearrange the instructions a bit · 610642c1
Hangyu Kuang authored 11 years ago
```
to improve instruction scheduling.

Change-Id: I5ea881a6e419f9e8ed4b3b619406403b4de24134
```

Fixes on feature disabling split based on variance · 8b810c7a

Deb Mukherjee authored 11 years ago

Adds a couple of minor fixes, which may be absorbed in Jingning's
patch. Thanks to Guillaume for pointing these out.
Also adjusts the thresholds for speed 1 and 2 to 16 and 32
respectively, to keep quality drops small.

Results:
--------
derfraw300:  threshold = 16, psnr -0.082%, speedup 2-3%
             threshold = 32, psnr -0.218%, speedup 5-6%
stdhdraw250: threshold = 16, psnr -0.031%, speedup 2-3%
             threshold = 32, psnr -0.273%, speedup 5-6%

Change-Id: I4b11ae8296cca6c2a9f644be7e40de7c423b8330

Initialize mb_skip_coeff before picking modes · 94bfbaa8

Scott LaVarnway authored 11 years ago

It appears that the above/left mb_skip_coeff used while
picking modes is left over from the previously
encoded frame. This patch initializes the flag to the default
value of zero.

Change-Id: Ida4684cc99611d6e3e82628db35ed717e28ce550

vp9: remove unnecessary wait w/threaded loopfilter · 85640f1c

James Zern authored 11 years ago

the final macroblock rows are scheduled in the main thread. prior to
this change one additional macroblock row would be scheduled in the
worker forcing the main thread to wait before finishing.

Change-Id: I05f3168e5c629b898fcebb0d77eb6d6a90d6105e

Cleaning up foreach_transformed_block_in_plane. · 4172d7c5
Dmitry Kovalev authored 11 years ago
```
Change-Id: I9f45af3894c57f35cb266c255e2b904295d39c34
```

vp9_peek_si: add bitstream v1 support · 61673553

James Zern authored 11 years ago

currently protected by CONFIG_NON420 as v1 is still not entirely stable

Change-Id: Id1c5081b04a2c47a842822048b8804be67d23a6d

Aug 21, 2013

Cleaning up optimize_init_b function. · be60924f
Dmitry Kovalev authored 11 years ago
```
Change-Id: Ib2c975e1d96deefb7ac4d6b600c8c5388035d111
```

Cleaning up reset_skip_context function. · c43da352
Dmitry Kovalev authored 11 years ago
```
Change-Id: Ib3e72671eb8da6f2e9767a6de292ec7c7cde6bc7
```

Cleaning up sum_intra_stats function. · 048ccb28

Dmitry Kovalev authored 11 years ago

Using size_group_lookup table and better variable names.

Change-Id: I6e67f2ce091845db43ace7d21b7ae31c6f165aec

Removing PLANE_TYPE argument from cost_coeffs function. · 2f1a0a0e

Dmitry Kovalev authored 11 years ago

We can determine plane_type from the other function arguments.

Change-Id: I85331877aedb357632ae916a37b5b15f22c0bb1f

Make "good" quality 2-pass vpxenc encoding default · 0d8723f8

Deb Mukherjee authored 11 years ago

Currently, the best quality mode in VP9 is not very well developed,
and unnecessarily makes the encode too slow. Hence the command line
default is changed to "good" quality. The default number of passes
is also changed to 2, since 1-pass encoding is not very efficient
in VP9.

Besides, a number of VP9 defaults are set to the currently
recommended settings. With these changes, vpxenc
run with --codec=vp9 --kf-max-dist=9999 --cpu-used=0 should
work about the same as our borg results.
Note when the --cpu-used=0 option is dropped there will be a slight
difference in the output, because of a difference in the cpu-used
value for the first pass. Specifically, the default when unspecified
is to use cpu_used=1 for the first pass and cpu_used=0 for the
second pass. But when specified, both passes will use the cpu-used
value specified.

Note that this also changes the default for VP8 to "good",
but other options stay unchanged.

Change-Id: Ib23c1a05ae2f36ee076c0e34403efbda518c5066

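The per-pass cpu-used default described in the vpxenc commit can be summarized in a few lines. This is a sketch of the stated rule only, not vpxenc source: `CpuUsedOption` and `effective_cpu_used` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical representation of the --cpu-used command line option. */
typedef struct {
  int is_set; /* non-zero when --cpu-used was given on the command line */
  int value;  /* the requested value; ignored when is_set is zero */
} CpuUsedOption;

/* Returns the cpu-used value for a pass of a 2-pass encode (pass is 1 or 2).
 * Unspecified: 1 for the first pass and 0 for the second pass.
 * Specified: both passes use the requested value. */
static int effective_cpu_used(CpuUsedOption opt, int pass) {
  assert(pass == 1 || pass == 2);
  if (opt.is_set) return opt.value;
  return pass == 1 ? 1 : 0;
}
```

This is why dropping `--cpu-used=0` changes the output slightly: the second pass still runs at 0, but the first pass runs at 1 instead of 0.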

Removing a lot of duplicated code. · 27a984fb

Dmitry Kovalev authored 11 years ago

Adding set_contexts function and calling it instead of
set_contexts_on_border. Calling txfrm_block_to_raster_xy to get aoff and
loff.

Change-Id: I41897e344afd2cae1f923f4fdbe63daccf6fe80e

Adding scale factor check. · a3ae4c87

Dmitry Kovalev authored 11 years ago

We only support scale factors in [1/16, 2]; this is now enforced.

Change-Id: I0822eb7cea51720df6814e42d3f35ff340963061

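A range check for the [1/16, 2] limit can be done in integer arithmetic without computing the ratio. The sketch below assumes the scale factor is the reference frame dimension divided by the current frame dimension; the commit message does not state which way the ratio is defined, and `is_valid_scale` and `is_valid_ref_frame` are hypothetical names.

```c
#include <assert.h>

/* Assumption: scale = ref_size / cur_size, with allowed range [1/16, 2].
 *   ref / cur <= 2     is equivalent to  ref <= 2 * cur
 *   ref / cur >= 1/16  is equivalent to  16 * ref >= cur */
static int is_valid_scale(int ref_size, int cur_size) {
  assert(ref_size > 0 && cur_size > 0);
  return ref_size <= 2 * cur_size && 16 * ref_size >= cur_size;
}

/* Both dimensions must be within range for the reference to be usable. */
static int is_valid_ref_frame(int ref_w, int ref_h, int cur_w, int cur_h) {
  return is_valid_scale(ref_w, cur_w) && is_valid_scale(ref_h, cur_h);
}
```

Cross-multiplying keeps the comparison exact at the boundaries, where a floating-point or truncated fixed-point ratio could round the wrong way.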

Fix typos and minor stylistic cleanup · ce28d0ca
Adrian Grange authored 11 years ago
```
Change-Id: I32e43474e8651ef2eb181d24860a8f118cfea7bf
```

vp9 rtcd: remove non-existent sad functions · ae455fab

James Zern authored 11 years ago

vp9_sad32x3, vp9_sad3x32

+ remove unnecessary sad include from vp9_findnearmv.c

Change-Id: Idef2a89cadc3fec64eff82ba9be60ffff50b3468

Removing unused foreach_predicted_block function. · 90027be2

Dmitry Kovalev authored 11 years ago

Moving foreach_predicted_block_in_plane function to vp9_reconinter.c
because there is only one usage.

Change-Id: I9852feae43fc3cf809b817fc541d043bc5496209

Aug 20, 2013

Using has_second_ref function to simplify the code. · 27de4fe9

Dmitry Kovalev authored 11 years ago

Updating implementation of vp9_get_pred_context_single_ref_p2 using
has_second_ref function to make code easier to read.

Change-Id: I5ba642712f59861a48aab974e73aa01640d086fe

vp9_filter.{h, c} cleanup + adding SUBPEL_TAPS constant. · d19ac4b6
Dmitry Kovalev authored 11 years ago
```
Change-Id: Ib394ea23f464591dad50b5c65c316701378d06d7
```