test/selfguided_filter_test.cc · d051e56005861ea4c9016871922e0af8df8fb624 · Xiph.Org / aom-rav1e

SSE4 and AVX2 implementations of updated FAST_SGR

Imdad Sardharwalla authored Feb 02, 2018

The SSE4.1 and AVX2 implementations of the self-guided filter have been updated
to match the updated FAST_SGR C implementation in restoration.c.

The self-guided filter speed tests have been altered to compare the speeds of
the SIMD and C implementations of the relevant functions.
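
As a rough illustration of this kind of comparison (not the code in
selfguided_filter_test.cc), the sketch below times two interchangeable
implementations of the same routine over a fixed number of iterations. The
names ReferenceFilter, OptimizedFilter and TimeFilterSeconds are hypothetical,
and the 3-tap sum is only a stand-in for the real filter.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical signature shared by both implementations under test.
using FilterFn = void (*)(const std::vector<int32_t> &src,
                          std::vector<int32_t> *dst);

// Stand-in for the plain C path: 3-tap sum with clamped edges.
void ReferenceFilter(const std::vector<int32_t> &src,
                     std::vector<int32_t> *dst) {
  const int n = static_cast<int>(src.size());
  for (int i = 0; i < n; ++i) {
    const int lo = std::max(i - 1, 0);
    const int hi = std::min(i + 1, n - 1);
    (*dst)[i] = src[lo] + src[i] + src[hi];
  }
}

// Stand-in for the optimized path: same output, with the edges handled
// outside the loop so the interior needs no clamping.
void OptimizedFilter(const std::vector<int32_t> &src,
                     std::vector<int32_t> *dst) {
  const int n = static_cast<int>(src.size());
  if (n == 0) return;
  if (n == 1) {
    (*dst)[0] = 3 * src[0];
    return;
  }
  (*dst)[0] = 2 * src[0] + src[1];
  for (int i = 1; i + 1 < n; ++i) {
    (*dst)[i] = src[i - 1] + src[i] + src[i + 1];
  }
  (*dst)[n - 1] = src[n - 2] + 2 * src[n - 1];
}

// Runs `fn` `iterations` times on `src` and returns the elapsed wall time
// in seconds.
double TimeFilterSeconds(FilterFn fn, const std::vector<int32_t> &src,
                         int iterations) {
  std::vector<int32_t> dst(src.size());
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) fn(src, &dst);
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(end - start).count();
}
```

Calling TimeFilterSeconds once per implementation with the same input and
iteration count gives two times whose ratio is the speedup. Checking that both
implementations produce identical output before timing them keeps the
comparison meaningful.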

Speed Tests (code compiled with Clang)
======================================

For LowBD:
- The SSE4.1 implementation is ~220% faster (~69% less time) than the C code
- The AVX2 implementation is ~314% faster (~76% less time) than the C code

For HighBD:
- The SSE4.1 implementation is ~240% faster (~71% less time) than the C code
- The AVX2 implementation is ~343% faster (~77% less time) than the C code

Change-Id: Ic2734bb89ccd3f66667c68647e5f677a5a496233
