David Barker authored
The SSSE3 filter is very similar to the SSE2 filter, but the horizontal
pass is sped up by using the 8x8->16 multiplies added in SSSE3.

Also apply const-correctness to all versions of the filter.

The timings of the existing filters are unchanged, and the lowbd SSSE3
filter is ~17% faster than the lowbd SSE2 filter. Timings per 8x8 block:
  lowbd SSE2: 320ns
  lowbd SSSE3: 273ns
  highbd SSSE3: 300ns

Filter output is unchanged.

Change-Id: Ifb428a33b106d900cde1b080794796c0754ae182
d8a423c6
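
For readers unfamiliar with the "8x8->16 multiplies" mentioned in the message: this most likely refers to PMADDUBSW (exposed as `_mm_maddubs_epi16`), which multiplies unsigned 8-bit values by signed 8-bit values and sums adjacent products into saturated signed 16-bit lanes. The sketch below is a portable scalar model of that operation, not the code from this commit. The function names `maddubs_lane` and `hfilter8_model`, the 8-tap layout, and the assumption that the taps fit in 8 bits are all hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Scalar model of one 16-bit output lane of PMADDUBSW:
 * (unsigned 8-bit * signed 8-bit) + (unsigned 8-bit * signed 8-bit),
 * with the sum saturated to the signed 16-bit range. */
int16_t maddubs_lane(uint8_t a0, uint8_t a1, int8_t b0, int8_t b1) {
  int32_t sum = (int32_t)a0 * b0 + (int32_t)a1 * b1;
  if (sum > INT16_MAX) sum = INT16_MAX;
  if (sum < INT16_MIN) sum = INT16_MIN;
  return (int16_t)sum;
}

/* Hypothetical 8-tap horizontal filter for a single output pixel.
 * Pixels and taps are consumed in adjacent pairs, mirroring how a
 * SIMD version could feed them to the multiply-add instruction.
 * Rounding and the final shift are omitted to keep the sketch short. */
int32_t hfilter8_model(const uint8_t *src, const int8_t *taps) {
  int32_t acc = 0;
  for (int i = 0; i < 8; i += 2) {
    acc += maddubs_lane(src[i], src[i + 1], taps[i], taps[i + 1]);
  }
  return acc;
}
```

The gain over an SSE2-style approach would come from working directly on 8-bit inputs: one instruction performs sixteen multiplies and eight additions, where 16-bit multiply-adds need the pixels widened first. The saturation step is the main thing to watch, since two large same-sign products can overflow a 16-bit lane.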