    SSSE3 convolution optimization · 511d218c
    Tamar Levy authored
    Optimizing all SSSE3 assembly for convolution:
    1. vp9_filter_block1d4_h8_sse2
    2. vp9_filter_block1d8_h8_sse2
    3. vp9_filter_block1d16_h8_sse2
    4. vp9_filter_block1d4_v8_sse2
    5. vp9_filter_block1d8_v8_sse2
    6. vp9_filter_block1d16_v8_sse2
    My optimizations include:
    - processing 2x8 elements in one 128-bit register instead of
      8 elements in one 128-bit register.
    - removing unnecessary loads.
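The 2x8 packing idea can be sketched in portable C. This is a scalar model only, not the assembly from this change: `reg128`, `maddubs`, `pair_sums_1x8` and `pair_sums_2x8` are hypothetical names, and the sketch assumes that "2x8" means two rows of 8 bytes sharing one register. `maddubs` imitates the SSSE3 PMADDUBSW instruction (unsigned byte times signed byte, adjacent products summed with signed saturation).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar stand-in for a 128-bit register: 16 byte lanes. */
typedef struct { uint8_t lane[16]; } reg128;

static int16_t sat16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Models PMADDUBSW with the tap pair (k0, k1) repeated in every lane pair:
   out[i] = sat16(px[2i] * k0 + px[2i+1] * k1) for the 8 lane pairs. */
static void maddubs(const reg128 *px, int8_t k0, int8_t k1, int16_t out[8]) {
    for (int i = 0; i < 8; i++) {
        int32_t s = px->lane[2 * i] * k0 + px->lane[2 * i + 1] * k1;
        out[i] = sat16(s);
    }
}

/* One row of 8 pixels per register; the upper 8 lanes carry no data.
   Writes 4 sums per row to dst and returns the number of multiply-add ops. */
int pair_sums_1x8(const uint8_t *src, int stride, int nrows,
                  int8_t k0, int8_t k1, int16_t *dst) {
    int ops = 0;
    for (int r = 0; r < nrows; r++) {
        reg128 x = {{0}};
        int16_t sums[8];
        memcpy(x.lane, src + r * stride, 8);
        maddubs(&x, k0, k1, sums);
        ops++;
        memcpy(dst + 4 * r, sums, 4 * sizeof(int16_t));
    }
    return ops;
}

/* Two rows of 8 pixels per register (2x8); assumes nrows is even.
   Same output layout as pair_sums_1x8, with half as many ops. */
int pair_sums_2x8(const uint8_t *src, int stride, int nrows,
                  int8_t k0, int8_t k1, int16_t *dst) {
    int ops = 0;
    for (int r = 0; r + 1 < nrows; r += 2) {
        reg128 x;
        int16_t sums[8];
        memcpy(x.lane, src + r * stride, 8);           /* row r   -> lanes 0..7  */
        memcpy(x.lane + 8, src + (r + 1) * stride, 8); /* row r+1 -> lanes 8..15 */
        maddubs(&x, k0, k1, sums);
        ops++;
        memcpy(dst + 4 * r, sums, 8 * sizeof(int16_t)); /* sums for both rows */
    }
    return ops;
}
```

Both functions produce the same sums. The 2x8 variant issues one modelled multiply-add per pair of rows instead of one per row, which is the kind of saving the first item describes.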
    This optimization gives a 2.4% user-level gain for 480p input
    and a 1.6% user-level gain for 720p.
    This optimization is done only for 64-bit.
    
    Change-Id: Icb586dc0c938b56699864fcee6c52fd43b36b969