    Implement av1_build_compound_diffwtd_mask_sse4_1 · 640ea400
    Peng Bin authored
    1. Add an SSE4.1 version of
    av1_build_compound_diffwtd_mask.
    2. The unit test shows it is 4.2x to 12.23x
    faster than the C version.
    3. The encoder is about 0.6% faster when encoding
    10 frames of foreman_cif.y4m.
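
    For context, here is a minimal scalar sketch of the kind of per-pixel
    computation a difference-weighted compound mask involves, which is
    what a SIMD version would vectorize. The function name
    build_diffwtd_mask_sketch, the helper clamp_int, and the constants
    (base weight 38, divisor 16, maximum weight 64) are assumptions for
    illustration only; they are not taken from this change and may not
    match the actual C reference.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed constants for illustration; not taken from the commit. */
#define SKETCH_MAX_ALPHA 64   /* assumed maximum blend weight */
#define SKETCH_DIFF_FACTOR 16 /* assumed divisor applied to the pixel difference */
#define SKETCH_MASK_BASE 38   /* assumed base weight when the sources agree */

/* Clamp v to the inclusive range [lo, hi]. */
static int clamp_int(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical scalar sketch: for every pixel, the weight grows with the
 * absolute difference between the two 8-bit predictions. If `inverse` is
 * nonzero, the complementary weight is stored instead. The mask is written
 * densely with a stride of w. */
void build_diffwtd_mask_sketch(uint8_t *mask, int inverse,
                               const uint8_t *src0, int src0_stride,
                               const uint8_t *src1, int src1_stride,
                               int h, int w) {
  for (int i = 0; i < h; ++i) {
    for (int j = 0; j < w; ++j) {
      const int diff =
          abs((int)src0[i * src0_stride + j] - (int)src1[i * src1_stride + j]);
      const int m = clamp_int(SKETCH_MASK_BASE + diff / SKETCH_DIFF_FACTOR, 0,
                              SKETCH_MAX_ALPHA);
      mask[i * w + j] = (uint8_t)(inverse ? SKETCH_MAX_ALPHA - m : m);
    }
  }
}
```

    Each output byte depends only on the two co-located input pixels, so
    the loop body maps naturally onto packed 8-bit or 16-bit vector
    operations, which is consistent with the kind of speedup reported in
    item 2.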
    
    a) gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
    b) CPU: Intel(R) Core(TM) i7-6900K CPU @ 3.20GHz
    c) Config cmd:
    cmake ../ -DENABLE_CCACHE=1 -DCONFIG_LOWBITDEPTH=1
    d) Test cmd:
    ./aomenc --cpu-used=1 --end-usage=vbr \
    --target-bitrate=800 --limit=10
    
    Change-Id: If27543ba53eb1946d0e79d2977a24383377abc71