- 14 Aug, 2014 1 commit
-
-
levytamar82 authored
In sub_pixel_avg_variance, the sec parameter was also read with an aligned load; it is now loaded unaligned. Change-Id: I4d4966e0291059ea4d705baed1503dc58444fcb7
-
- 08 Aug, 2014 1 commit
-
-
In the sub_pixel_*variance* functions, dst is aligned to 16 bytes, not 32 bytes, so the data is now loaded unaligned. Change-Id: I2e0b9745543697efc56fefa32857ea10117af135
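The alignment problem described in this commit can be illustrated with a small portable C sketch. It is not the libvpx code: `is_aligned`, `addr_16_not_32`, and `load32_unaligned` are hypothetical helpers introduced here, and `memcpy` stands in for an unaligned vector load such as `_mm256_loadu_si256` (the aligned variant, `_mm256_load_si256`, requires a 32-byte-aligned address).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if p is a multiple of n (n must be a power of two). */
static int is_aligned(const void *p, size_t n) {
  return ((uintptr_t)p & (n - 1)) == 0;
}

/* Hypothetical helper: returns an address inside buf that is 16-byte aligned
 * but NOT 32-byte aligned. buf must hold at least 96 bytes so that 32 bytes
 * can still be read from the returned address. */
static const uint8_t *addr_16_not_32(const uint8_t *buf) {
  /* Bytes needed to reach the next 32-byte boundary (0 if already there). */
  size_t off = (32 - ((uintptr_t)buf & 31)) & 31;
  const uint8_t *p = buf + off + 16; /* 32-byte boundary plus 16 */
  assert(is_aligned(p, 16) && !is_aligned(p, 32));
  return p;
}

/* Hypothetical stand-in for an unaligned 32-byte vector load:
 * memcpy is valid for any source alignment. */
static void load32_unaligned(uint8_t out[32], const uint8_t *src) {
  memcpy(out, src, 32);
}
```

An address like the one returned by `addr_16_not_32` is safe for a 16-byte aligned load but not for a 32-byte aligned one, which is why a buffer that is only guaranteed 16-byte alignment has to be read with the unaligned form.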
-
- 01 Mar, 2014 1 commit
-
-
levytamar82 authored
Optimize 2 functions to process 32 elements in parallel instead of 16: 1. vp9_sub_pixel_avg_variance64x64 2. vp9_sub_pixel_avg_variance32x32. Both functions previously called vp9_sub_pixel_avg_variance16xh_ssse3; they now call vp9_sub_pixel_avg_variance32xh_avx2, which is written in AVX2 and processes 32 elements in parallel. This optimization gave an 80% function-level gain and a 2% user-level gain. Change-Id: Iea694654e1b7612dc6ed11e2626208c2179502c8
-
- 19 Feb, 2014 1 commit
-
-
James Zern authored
+ fix formatting Change-Id: I7b4ec11b7b46d8926750e0b69f7a606f3ab80895
-
- 14 Feb, 2014 1 commit
-
-
levytamar82 authored
Optimize 2 functions to process 32 elements in parallel instead of 16: 1. vp9_sub_pixel_variance64x64 2. vp9_sub_pixel_variance32x32. Both functions previously called vp9_sub_pixel_variance16xh_ssse3; they now call vp9_sub_pixel_variance32xh_avx2, which is written in AVX2 and processes 32 elements in parallel. This optimization gave a 70% function-level gain and a 2% user-level gain. Change-Id: I4f5cb386b346ff6c878a094e1c3b37e418e50bde
-