  • convolve8 sse2 test · 8878fa4f
    Angie Chiang authored
    This experiment shows that with a 64x64 frame,
    vpx_highbd_convolve8_sse2 and vpx_convolve8_sse2 run at similar speeds.
    With a 1024x1024 frame, however,
    vpx_highbd_convolve8_sse2 is roughly 40-50% slower than vpx_convolve8_sse2.
    We think the bottleneck is memory I/O.
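
One way to make the memory I/O explanation above concrete is to compare per-frame working sets. The sketch below is not part of the commit; it assumes a single plane, 1 byte per sample on the 8-bit path, and 2 bytes per sample on the high-bitdepth path. The helper name `frame_bytes` is hypothetical.

```python
# Assumption: one plane per frame; 8-bit samples take 1 byte,
# high-bitdepth samples take 2 bytes.
def frame_bytes(width, height, bytes_per_sample):
    """Return the size in bytes of one width x height plane."""
    return width * height * bytes_per_sample


for size in (64, 1024):
    low = frame_bytes(size, size, 1)
    high = frame_bytes(size, size, 2)
    print(f"{size}x{size}: 8-bit {low // 1024} KiB, highbd {high // 1024} KiB")
```

Under these assumptions a 64x64 frame is 4 KiB or 8 KiB, small enough that both variants plausibly stay in cache, while a 1024x1024 frame is 1 MiB or 2 MiB, so the high-bitdepth path moves twice as many bytes through the memory hierarchy.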
    
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_64
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_64 (17 ms)
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_64
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_64 (42 ms)
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_64
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_64 (139 ms)
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_64
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_64 (499 ms)
    
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_64
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_64 (16 ms)
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_64
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_64 (40 ms)
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_64
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_64 (130 ms)
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_64
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_64 (485 ms)
    
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_1024
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_1024 (32 ms)
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_1024
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_1024 (61 ms)
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_1024
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_1024 (196 ms)
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_1024
    VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_1024 (694 ms)

    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_1024
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_1024 (21 ms)
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_1024
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_1024 (44 ms)
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_1024
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_1024 (138 ms)
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_1024
    VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_1024 (491 ms)
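
The relative slowdown can be checked directly from the timings above. This is a minimal sketch, not part of the commit: the millisecond values are copied from the log, and reading the test-name suffix `_<a>_<b>` as (block size, frame size) is an assumption based on the commit message. The helper name `slowdown_percent` is hypothetical.

```python
# Timings in milliseconds copied from the test log, keyed by
# (block size, frame size). The key interpretation is an assumption.
highbd_ms = {(8, 64): 17, (16, 64): 42, (32, 64): 139, (64, 64): 499,
             (8, 1024): 32, (16, 1024): 61, (32, 1024): 196, (64, 1024): 694}
lowbd_ms = {(8, 64): 16, (16, 64): 40, (32, 64): 130, (64, 64): 485,
            (8, 1024): 21, (16, 1024): 44, (32, 1024): 138, (64, 1024): 491}


def slowdown_percent(highbd, lowbd):
    """Return how much slower highbd is than lowbd, in percent, per key."""
    return {k: 100.0 * (highbd[k] - lowbd[k]) / lowbd[k] for k in highbd}


slowdown = slowdown_percent(highbd_ms, lowbd_ms)
# Print grouped by frame size, then by block size.
for (block, frame), pct in sorted(slowdown.items(),
                                  key=lambda kv: (kv[0][1], kv[0][0])):
    print(f"block {block:2d}, frame {frame:4d}: highbd is {pct:5.1f}% slower")
```

With these numbers the 64x64 cases all come out below 10%, while the 1024x1024 cases fall between roughly 39% and 52%.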
    
    Change-Id: I3131a031e0380e8eae748cfcccc6cbb961d05943