Commits · 4db2076594be3a48c6c1b3755c1d9621f5ad1c5b · Xiph.Org / aom-rav1e

Oct 18, 2010

Add SSE2 subtract functions · 4db20765

Yunqing Wang authored 14 years ago

Instead of doing 8-bit data unpack and 16-bit subtraction, use
psubb to do 16 8-bit subtractions and pcmpgtb to preserve the
sign information. This does not bring a noticeable gain since
these functions are not called frequently.
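
The byte-wise trick described above can be sketched in C with SSE2 intrinsics. This is only an illustration, not the assembly from the commit: the helper name `subtract16` and the scalar fallback are assumptions made for the example. `_mm_sub_epi8` corresponds to psubb and `_mm_cmpgt_epi8` to pcmpgtb.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: diff[i] = src[i] - pred[i] for 16 pixels. */
static void subtract16(const uint8_t *src, const uint8_t *pred, int16_t *diff)
{
#if defined(__SSE2__)
    const __m128i bias = _mm_set1_epi8((char)0x80);
    __m128i s = _mm_loadu_si128((const __m128i *)src);
    __m128i p = _mm_loadu_si128((const __m128i *)pred);

    /* psubb: low 8 bits of each difference (wraps modulo 256). */
    __m128i low = _mm_sub_epi8(s, p);

    /* pcmpgtb is a signed compare, so flip the top bit of both inputs to
     * turn it into an unsigned compare: 0xFF where pred > src, else 0x00.
     * That byte is exactly the high byte of the 16-bit signed difference. */
    __m128i sign = _mm_cmpgt_epi8(_mm_xor_si128(p, bias),
                                  _mm_xor_si128(s, bias));

    /* Interleave low bytes with sign bytes to form 16-bit results. */
    _mm_storeu_si128((__m128i *)diff, _mm_unpacklo_epi8(low, sign));
    _mm_storeu_si128((__m128i *)(diff + 8), _mm_unpackhi_epi8(low, sign));
#else
    /* Scalar fallback that builds each result from the same two bytes. */
    for (int i = 0; i < 16; i++) {
        uint8_t low = (uint8_t)(src[i] - pred[i]);
        uint8_t sign = (pred[i] > src[i]) ? 0xFF : 0x00;
        uint16_t bits = (uint16_t)(low | (sign << 8));
        memcpy(&diff[i], &bits, sizeof bits);
    }
#endif
}
```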

Change-Id: I90a0dfaa3db9d422e4ada324076596ffb178548e

4db20765

Oct 14, 2010

Fix one gcc compiler warning · 7804befb

Yunqing Wang authored 14 years ago

../libvpx/vp8/encoder/bitstream.c: In function ‘pack_inter_mode_mvs’:
../libvpx/vp8/encoder/bitstream.c:1026: warning: array subscript has type ‘char’

Change-Id: Ic77491e0a172fa1821e5b3e914d0dc41fe87c00f

7804befb

Improve bounds checking in vp8_diamond_search_sadx4() · d6da7b8e

Yunqing Wang authored 14 years ago

To know whether all 4/8 neighboring points are within the bounds,
4 bounds checks are enough, instead of checking 4 bounds for
each point (16/32 checks). This improvement reduces the cost of
vp8_diamond_search_sadx4() by 30%, and gives the encoder a 1.5%
performance gain (test options: 1 pass, good, speed=4).
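
A minimal sketch of the idea, assuming a symmetric search pattern whose candidates lie at most `step` away from the center in each direction. The struct and function names are hypothetical and the strict comparisons are an assumption; the real function works on the encoder's own motion vector structures.

```c
/* Hypothetical motion vector limits (exclusive on both ends). */
typedef struct {
    int row_min, row_max, col_min, col_max;
} mv_bounds;

/* Per-point test: 4 comparisons for every candidate point. */
static int point_in_bounds(int row, int col, const mv_bounds *b)
{
    return row > b->row_min && row < b->row_max &&
           col > b->col_min && col < b->col_max;
}

/* Combined test: 4 comparisons on the extreme offsets decide whether
 * every candidate around (row, col) is inside the bounds. */
static int all_in_bounds(int row, int col, int step, const mv_bounds *b)
{
    return (row - step) > b->row_min && (row + step) < b->row_max &&
           (col - step) > b->col_min && (col + step) < b->col_max;
}
```

When `all_in_bounds` is true, the search can evaluate all neighbors without further checks; otherwise it falls back to testing each point individually.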

Change-Id: Ie8da29d18a6ecfc9829e74ac02f6fa70e042331a

d6da7b8e

Fix compiler warning about vp8_fast_quantize_b_impl_ssse2. · 1dc0ca13

Fritz Koenig authored 14 years ago

Typo had function defined as _ssse2 and prototyped as _sse2.

Change-Id: If9f19da1a83cff40774a90cf936d601c0bf1b7fe

1dc0ca13

Oct 13, 2010

Correct QWORD usage in assembly files · 92df4a06

Fritz Koenig authored 14 years ago

QWORD was being undefined because it was being used
incorrectly.

Change-Id: I3610cefa3d6f0da4054316760f78b9694cde3876

92df4a06

Oct 12, 2010

Centralize mb skip state calculation · 13685747

John Koleszar authored 14 years ago

This patch moves the scattered updates to the mb skip state
(mode_info_context->mbmi.mb_skip_coeff) to vp8_tokenize_mb. Recent
changes to the quantizer exposed a bug where if a macroblock
could be coded as a skip but isn't, the encoder would run the
loopfilter but the decoder wouldn't, causing a reference buffer
mismatch.

The loopfilter is controlled by a flag called dc_diff. The decoder
looks at the number of decoded coefficients when setting this flag.
The encoder sets this flag based on the skip state, since any
skippable macroblock should be transmitted as a skip. The coefficient
optimization pass (vp8_optimize_b()) could change the coefficients
such that a block that was not a skip becomes one. The encoder was
not updating the skip state in this situation for intra coded blocks.

The underlying issue predates it, but this bug was recently triggered
by enabling trellis quantization on the Y2 block in commit dcd29e36,
and by changing the quantizer range control in commit 305be4e4.
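
The centralized calculation can be pictured as deriving the skip flag in one place from the final per-block coefficient counts, after any optimization pass has run. The sketch below is a simplification under stated assumptions: the function name and the plain "all end-of-block counts are zero" rule are illustrative, and the real code has to treat blocks differently depending on whether a Y2 block is present.

```c
/* A VP8 macroblock with a Y2 block has 16 Y + 4 U + 4 V + 1 Y2 blocks. */
#define BLOCKS_PER_MB 25

/* Hypothetical helper: eobs[i] is the number of coded coefficients left
 * in block i once quantization and coefficient optimization are done.
 * Returns 1 when nothing remains to be coded, 0 otherwise. */
static int mb_has_no_coeffs(const int *eobs, int num_blocks)
{
    for (int i = 0; i < num_blocks; i++) {
        if (eobs[i] > 0) {
            return 0;
        }
    }
    return 1;
}
```

Calling such a helper once, at tokenization time, means a block that the optimizer reduces to all zeros is still signaled as a skip, so encoder and decoder agree on whether the loopfilter runs.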

Change-Id: I5cce5da0dbc2d22f7d79ee48149f01e868a64802

13685747

Add const qualifiers to variance/SAD functions. · f4a85944

Timothy B. Terriberry authored 14 years ago

These functions should never change their input, and there's no
 reason not to declare that.
This allows them to be passed static const data.
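
As a small illustration (the function and table names below are hypothetical, not the library's actual prototypes), a SAD routine that takes `const` pointers can be handed read-only static data without a cast:

```c
/* Hypothetical 4x4 sum of absolute differences over read-only inputs. */
static unsigned int sad_4x4(const unsigned char *src, int src_stride,
                            const unsigned char *ref, int ref_stride)
{
    unsigned int sad = 0;
    for (int r = 0; r < 4; r++) {
        for (int c = 0; c < 4; c++) {
            int d = src[r * src_stride + c] - ref[r * ref_stride + c];
            sad += (unsigned int)(d < 0 ? -d : d);
        }
    }
    return sad;
}

/* Static const data; passing it would need a cast if the parameters
 * were declared as plain `unsigned char *`. */
static const unsigned char flat_block[16] = {
    10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10
};
```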

Change-Id: Ia49fe4b01e80e9afcb24b4844817694d4da5995c

f4a85944

Oct 11, 2010

Move vp8_strict_quantize_b inside EXACT_QUANT #define. · 82c43398

Timothy B. Terriberry authored 14 years ago

There is currently no inexact version of this function, so do not
 even compile it without EXACT_QUANT.
This will prevent someone from inadvertently trying to use it without
 the proper EXACT_QUANT setup.

Change-Id: Ia13491e0128afb281c05c9222ee5987101e4010d

82c43398

Remove INTRARDOPT #define and intra_rd_opt option. · dd08db93

Timothy B. Terriberry authored 14 years ago

This is just eliminating some cruft.
Although a number of variables are declared only when INTRARDOPT
 is defined, they are used elsewhere without that protection, and
 no longer just for intra RDO.
The intra_rd_opt flag was hard-coded to 1 and never checked.

Change-Id: I83a81554ecee8053e7b4ccd8aa04e18fa60f8e4f

dd08db93

Oct 07, 2010

Remove unused file in encoder · 7e6f7b57

Yunqing Wang authored 14 years ago

Remove vp8/encoder/x86/csystemdependent.c

Change-Id: I7c590dcd07b68704d463a1452f62f29ffb1402f4

7e6f7b57

Added vp8_fast_quantize_b_sse2 · d860f685

Scott LaVarnway authored 14 years ago

Moved vp8_fast_quantize_b_sse from quantize_mmx.asm into
quantize_sse2.asm and renamed.  Updated the assembly code to
match the C version.

Change-Id: I1766d9e1ca60e173f65badc0ca0c160c2b51b200

d860f685

Oct 06, 2010

optimize fast_quantizer c version · d338d14c

Yaowu Xu authored 14 years ago

As the zbin and rounding constants are normalized, rounding effectively
does the zbinning, therefore the zbin operation can be removed. In
addition, the memset on the two arrays is no longer necessary.
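
A rough sketch of a quantizer loop in this style follows. It is an assumption-laden illustration rather than the committed code: the function name is made up, the coefficients are visited in plain index order instead of zig-zag order, and `quant` is taken to be a 16-bit fixed-point reciprocal of the dequantizer step.

```c
#include <stdint.h>

/* Quantizes 16 coefficients. Small values fall to zero through the
 * rounding itself, so no separate zero-bin test is needed, and every
 * output entry is written, so the arrays need no prior memset.
 * Returns the index of the last nonzero coefficient plus one. */
static int fast_quantize_sketch(const int16_t *coeff, const int16_t *rnd,
                                const int16_t *quant, const int16_t *dequant,
                                int16_t *qcoeff, int16_t *dqcoeff)
{
    int eob = 0;
    for (int i = 0; i < 16; i++) {
        int z = coeff[i];
        int x = z < 0 ? -z : z;                  /* magnitude */
        int y = ((x + rnd[i]) * quant[i]) >> 16; /* round, then scale */
        int q = z < 0 ? -y : y;                  /* restore the sign */
        qcoeff[i] = (int16_t)q;
        dqcoeff[i] = (int16_t)(q * dequant[i]);
        if (y) {
            eob = i + 1;
        }
    }
    return eob;
}
```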

Change-Id: If39c353c42d7e052296cb65322e5218810b5cc4c

d338d14c

Oct 05, 2010

nasm: movhps compatibility QWORD->MMWORD · 1fc29411

Jan Kratochvil authored 14 years ago

Filed for nasm as:
https://sourceforge.net/tracker/?func=detail&atid=106208&aid=3081103&group_id=6208

nasm just does not accept any size parameter for movhps:
1.asm:2: error: mismatch in operand sizes

Some parts of libvpx already use MMWORD for movhps and MMWORD is
defined-out so it is compatible both with yasm and nasm.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu.

Change-Id: I4008a317ca87ec07c9ada958fcdc10a0cb589bbc

1fc29411

Oct 04, 2010

nasm: address labels 'rel label' vice 'wrt rip' · 5cdc3a4c

Jan Kratochvil authored 14 years ago

nasm does not support `label wrt rip', it requires `rel label'. It is
still fully compatible with yasm.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50

5cdc3a4c

nasm: match instruction length (movd/movq) to parameters · e114f699

Jan Kratochvil authored 14 years ago

nasm requires the instruction length (movd/movq) to match to its
parameters. I find it more clear to really use 64bit instructions when
we use 64bit registers in the assembly.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: Id9b1a5cdfb1bc05697e523c317a296df43d42a91

e114f699

Oct 02, 2010

Tune effect of motion on KF/GF boost in two pass; · 788c0eb5

Paul Wilkins authored 14 years ago

This code adjusts the impact of the amount and speed of motion
on GF and KF boost.

Sections with lots of slow motion will tend to have a
somewhat bigger boost and sections with fast motion may
have less.

There is a knock-on effect to the selection of the active
quantizer range.

This will likely require further tuning but helps with a couple
of particularly bad edge cases.

Change-Id: Ic2449cda7305672b69acf42fc0a845b77ac98d40

788c0eb5

enable trellis quantization for 2nd order blocks · dcd29e36

Yaowu Xu authored 14 years ago

Experimented with different values for Y2_RD_MULT in the range [1, 32],
without adapting the value to MB coding mode/frame type/Q value,
4 works out best among all values, providing overall 0.1% coding
gain on the test set.

Change-Id: I6b2583a8aa5db5e7e5c65c646301909c0c58f876

dcd29e36

Oct 01, 2010

Made temporal filter default to use centered mode · 999bc003

Adrian Grange authored 14 years ago

If temporal filtering is enabled but a filter type is not specified,
centered filter mode is used by default.

Change-Id: I87306f267c1390074c806c506a69b4ba914d92a2

999bc003

Fix valgrind errors in the NEON loop filters. · a465076e

Timothy B. Terriberry authored 14 years ago

Like the ARMv6 code, these functions were accessing values below
 the stack pointer, which can be corrupted by signal delivery at
 any time.

a465076e

Sep 30, 2010

Changed defaults & range checking for AltRef params · 8ee7284d

Adrian Grange authored 14 years ago

Modified the range checking of parameters used in the
AltRef temporal filter (arnr-max-frames, arnr-strength,
arnr-type) and default values for each of them.

Change-Id: Ib261028d501b9523f6e44cb4790cc52167b6e92b

8ee7284d

Sep 29, 2010

Rename mode_ref_lf_test_function · 7e5e3151

John Koleszar authored 14 years ago

This function graduated from being a test func to something that's on
by default. Rename it and remove some spurious comments that confuse
its status.

Change-Id: I689695a3ad29c35e9a72a43ec93766733ac6c20b

7e5e3151

Fix loopfilter delta zero transitions · b9be7a46

John Koleszar authored 14 years ago

Loopfilter deltas are initialized to zero on keyframes in the decoder.
The values then persist from the previous frame unless an update bit
is set in the bitstream. This data is not included in the entropy
data saved by the 'refresh entropy' bit in the bitstream, so it is
effectively an additional contextual element beyond the 3 ref-frames
and the entropy data.

The encoder was treating this delta update bit as update-if-nonzero,
meaning that the value would be refreshed even if it hadn't changed,
and more significantly, if the correct value for the delta changed
to zero, the update wouldn't be sent, and the decoder would preserve
the last (presumably non-zero) value.

This patch updates the encoder to send an update only if the value
has changed from the previously transmitted value. It also forces the
value to be transmitted in error resilient mode, to account for lost
context in the event of lost frames.
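
The difference between the two rules can be sketched as a pair of predicates. The function names are hypothetical and only model the decision described above, not the bitstream writing itself.

```c
/* Old rule: send an update whenever the delta is nonzero. A change
 * from a nonzero value to zero is never transmitted. */
static int send_update_if_nonzero(int delta)
{
    return delta != 0;
}

/* New rule: send an update only when the delta differs from the value
 * last transmitted, or always in error resilient mode, where the
 * decoder may have lost the previous context. */
static int send_update_if_changed(int delta, int last_sent,
                                  int error_resilient)
{
    return error_resilient || delta != last_sent;
}
```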

Change-Id: I56671d5b42965d0166ac226765dbfce3e5301868

b9be7a46

Change to coefficient optimization rules. · 7288cdf7

Paul Wilkins authored 14 years ago

Allow coefficient optimization for good quality speed 0.

Change-Id: Id0cb363df6823c6798671584fbba097916a7df2c

7288cdf7

Moved row-specific computation of MV bounds out of col loop · 0e7c45b3

Adrian Grange authored 14 years ago

Moved the bounds computation on vertical MV component out
of the loop that processes MBs within a MB row.

0e7c45b3

Control of active min quantizer for two pass. · ff3068d6

Paul Wilkins authored 14 years ago

Create lookup tables for controlling the active quantizer range.
Some initial tuning to improve quality circa 0.5% on test set.
Clean up of some stats output code.

Change-Id: Ia698a8525f8b8129a503cadace3ee73fe888f543

ff3068d6

Sep 28, 2010

Optimizations on the loopfilters. · 0964ef0e

Fritz Koenig authored 14 years ago

- Scheduling for Atom processors
- Combining of macros to allow for better interleaving
- Change from multiplies to adds for main filter
- Use of movhps/movlps to fill xmm registers without
  shifting and orring

Change-Id: I0b3500a5f58abf7085253ec92d64c8a96723040b

0964ef0e

Enabled AltRef motion map creation · 47fc8f26

Adrian Grange authored 14 years ago

Enabled the first-pass encode to output the
map of macroblock coding modes required by
the AltRef filter.

47fc8f26

Made AltRef filter adaptive & added motion compensation · 1b2f8308

Adrian Grange authored 14 years ago

Modified AltRef temporal filter to adapt filter length based
on macroblock coding modes selected during first-pass
encode.

Also added sub-pixel motion compensation to the AltRef
filter.

1b2f8308

Add 4-tap version of 2nd-pass ARMv6 MC filter. · 18dc92fd

Timothy B. Terriberry authored 14 years ago

The existing code applied a 6-tap filter with 0's on either end.
We're already paying the branch penalty to avoid computing the two
 extra columns needed as input to this filter.
We might as well save time computing the filter as well.
This reduces the inner loop from 21 instructions to 16, the number
 of loads per iteration from 4 to 1, and the number of multiplies
 from 7 to 4.
The gain in overall decoding performance, however, is small (less
 than 1%).
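
The equivalence this relies on can be shown in plain C. The sketch below is not the ARMv6 assembly: the helper names are made up, and the tap set `{0, -6, 123, 12, -1, 0}` is used as an example of a six-tap kernel whose outer taps are zero. Both functions round with `+ 64` and shift by 7, assuming taps that sum to 128.

```c
/* Clamp a filtered value to the 8-bit pixel range. */
static unsigned char clamp_pixel(int v)
{
    return (unsigned char)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* 6-tap filter: reads p[-2] .. p[3], including the two outer inputs. */
static unsigned char filter_6tap(const unsigned char *p, const short *f)
{
    int sum = p[-2] * f[0] + p[-1] * f[1] + p[0] * f[2] +
              p[1] * f[3] + p[2] * f[4] + p[3] * f[5];
    return clamp_pixel((sum + 64) >> 7);
}

/* 4-tap filter: valid when f[0] and f[5] are zero. It never touches
 * p[-2] or p[3], so those inputs do not have to be computed or loaded. */
static unsigned char filter_4tap(const unsigned char *p, const short *f)
{
    int sum = p[-1] * f[1] + p[0] * f[2] + p[1] * f[3] + p[2] * f[4];
    return clamp_pixel((sum + 64) >> 7);
}
```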

This change also means we are now valgrind clean on ARMv6, which is
 its real purpose.
The errors reported here were valgrind's fault (it does not detect
 that 0 times an uninitialized value is initialized), but Julian
 Seward says it would slow down valgrind considerably to make such
 checks.
Speeding up libvpx instead, even by a small amount, seems a much
 better idea if only to enable proper valgrind checking of the
 rest of the codec.

Change-Id: Ifb376ea195e086b60f61daf1097d8910c4d8ff16

18dc92fd

Sep 27, 2010

Badly placed initialization of rolling rate monitors. · 305be4e4

Paul Wilkins authored 14 years ago

This affects control of the active quantizer range.

Change-Id: I30511fc81ac9f75ff20d9f1372382423d56739da

305be4e4

move reconintra_mt to decoder (fixup) · 2b521ab5

John Koleszar authored 14 years ago

Missed the .h file in the move.

Change-Id: Ib408183fbb4d019fd46394b362f89ca6ea9d10bc

2b521ab5

Sep 24, 2010

Fix valgrind errors in vp8_sixtap_predict8x4_armv6(). · e2795e99

Timothy B. Terriberry authored 14 years ago

This function was accessing values below the stack pointer, which
 can be corrupted by signal delivery at any time.

Change-Id: I92945b30817562eb0340f289e74c108da72aeaca

e2795e99

combine max values and compare once · f30e8dd7

Johann Koenig authored 14 years ago

previous implementation compared each set of values to limit and then
&'d them together, requiring a compare and & for each value.

this does the accumulation first, requiring only one compare
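
In scalar C terms the change looks roughly like the following. The function names are hypothetical, and the real code operates on vectors of pixel differences rather than an `int` array.

```c
/* Previous approach: one compare and one AND per value. */
static int within_limit_each(const int *v, int n, int limit)
{
    int mask = 1;
    for (int i = 0; i < n; i++) {
        mask &= (v[i] <= limit);
    }
    return mask;
}

/* New approach: accumulate the maximum first, then compare once.
 * Requires n >= 1. */
static int within_limit_max(const int *v, int n, int limit)
{
    int max = v[0];
    for (int i = 1; i < n; i++) {
        if (v[i] > max) {
            max = v[i];
        }
    }
    return max <= limit;
}
```

Both forms give the same answer because every value is at most the limit exactly when the largest one is.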

Change-Id: Ia5e3a1a50e47699c88470b8c41964f92a0dc1323

f30e8dd7

disable compilation of debugging code · 8ca779ab

John Koleszar authored 14 years ago

This patch avoids compiling some debugging code in onyx_if.c. The most
significant fix is to avoid generating code for vp8_write_yuv_frame,
which is never called. Some other code was removed by the dead code
elimination performed by the compiler, and this patch does it with the
preprocessor instead. There are advantages both ways.

Change-Id: I044fd43179d2e947553f0d6f2cad5b40907ac458

8ca779ab

move reconintra_mt to decoder (for now) · 48e76ff4

John Koleszar authored 14 years ago

reconintra_mt.c is only required for building the decoder right now.
It could definitely be used for the encoder in the future, but it
currently depends on decoder only data structures. (onyxd_int.h,
VP8D_COMP, etc). Move it from common/ to decoder/ until the
necessary changes to the common multithread code are complete.

This patch is needed to build with --disable-vp8-decoder.

Change-Id: I568c52221a2b309234d269675cba97131ce35c86

48e76ff4

Sep 23, 2010

Add getter functions for the interface data symbols · fa7a55bb

John Koleszar authored 14 years ago

Having these symbols be available as functions rather than data is
occasionally more convenient. Implemented this way rather than a
get-codec-by-id style to avoid creating a link-time dependency
between the encoder and the decoder.
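
The pattern can be sketched as follows. Every name here is a hypothetical stand-in (the real interface type and symbols are not shown in this log); the point is only that each codec exposes its own accessor next to its own data symbol, so linking one does not pull in the other.

```c
/* Hypothetical stand-in for the codec interface type. */
typedef struct {
    const char *name;
} codec_iface;

/* What a public header would declare: the data symbol and its getter. */
extern const codec_iface example_encoder_algo;
const codec_iface *example_encoder(void);

/* Definitions, living in the encoder's own source file. */
const codec_iface example_encoder_algo = { "example encoder" };

const codec_iface *example_encoder(void)
{
    return &example_encoder_algo;
}
```

A get-codec-by-id function, by contrast, would have to reference both the encoder and decoder symbols from a single place, creating the link-time dependency the message mentions.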

Fixes issue #169

Change-Id: I319f281277033a5e7e3ee3b092b9a87cce2f463d

fa7a55bb

Adjust multi-thread sync ranges according to image sizes · 8db5da29

Yunqing Wang authored 14 years ago

In multi-threaded decoder, set different sync ranges for
different video resolutions.
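
A sketch of what such a selection might look like is shown below. The function name, the width thresholds, and the range values are illustrative assumptions, not figures taken from this commit message.

```c
/* Hypothetical: choose how many macroblock columns a decoding thread
 * processes between synchronization points with the row above.
 * Larger frames use a coarser range to reduce synchronization cost.
 * Thresholds and return values are assumed for illustration only. */
static int sync_range_for_width(int width)
{
    if (width < 640) {
        return 1;
    } else if (width <= 1280) {
        return 8;
    } else if (width <= 2560) {
        return 16;
    }
    return 32;
}
```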

Change-Id: Iea48fd36f51919e0152c8ed3b1f10e1b723c0ca7

8db5da29

Sep 22, 2010

Remove dead code · 7fed3832

Johann Koenig authored 14 years ago

The new loopfilter was originally introduced as an experimental change.
It's permanent now.

Change-Id: I25dbedb6ceff3e9f9c04e18bb29f84c3ecb7e546

7fed3832

Sep 21, 2010

unset execute bit on c source · cdd20666

John Koleszar authored 14 years ago

Change-Id: I6625ee41f8872908cb015ce0729e1c7a105b5217

cdd20666

Don't reset mb clamping state during splitmv decoding · 4d391e8e

John Koleszar authored 14 years ago

The MV decoding changes in c5fb0eb8 introduced a bug where the
macroblock clamping state was reset for each partition, so if an
earlier partition needed clamping but a subsequent one didn't,
the MB wouldn't receive clamping. Instead, the state is only
set during splitmv decoding, never cleared.
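
The bug and the fix can be modeled with a flag that is either overwritten or accumulated across partitions. The names and the one-dimensional motion vectors below are simplifications for illustration.

```c
/* Hypothetical test: does this motion vector component need clamping? */
static int needs_clamp(int mv, int min, int max)
{
    return mv < min || mv > max;
}

/* Buggy behavior: the state is reset for each partition, so only the
 * last partition decides the macroblock's clamping state. */
static int mb_clamp_state_reset(const int *mvs, int n, int min, int max)
{
    int state = 0;
    for (int i = 0; i < n; i++) {
        state = needs_clamp(mvs[i], min, max);
    }
    return state;
}

/* Fixed behavior: the state is only ever set, never cleared. */
static int mb_clamp_state_sticky(const int *mvs, int n, int min, int max)
{
    int state = 0;
    for (int i = 0; i < n; i++) {
        state |= needs_clamp(mvs[i], min, max);
    }
    return state;
}
```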

Change-Id: I224fe258493405ee0f6a04596acdb622c475e845

4d391e8e