    Further improve macroblock loop filters · d2021386
    Yunqing Wang authored
    This change includes:
    1. Aligned reads in vp9_mbloop_filter_vertical_edge function.
    Since we actually read 16 bytes, the reads can be aligned to start
    at (s - 8) instead of (s - 5).
    2. Combined u, v loop filters.
    3. Added 8x16 transpose.
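    A minimal scalar sketch of items 1 and 3, not the SSE2 code in this change. It assumes the filter needs the ten pixels s[-5] .. s[4] on each row, and that "8x16" means 16 rows of 8 bytes turned into 8 rows of 16 bytes. Both are assumptions, and the helper names load_window, window_covers and transpose_16rows_x_8cols are hypothetical. The point is that a 16-byte window starting at (s - 8) still covers every pixel the (s - 5) window supplied to the filter, only shifted by 3 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { LOAD_BYTES = 16 }; /* width of one 128-bit load */

/* Copy the 16-byte window that starts `back` bytes before the edge
 * pointer s. Scalar stand-in for a 128-bit load (hypothetical helper). */
static void load_window(const uint8_t *s, int back, uint8_t out[LOAD_BYTES]) {
  assert(back >= 0 && back <= LOAD_BYTES);
  memcpy(out, s - back, LOAD_BYTES);
}

/* Return 1 if pixel s[k] (k may be negative) lies inside the 16-byte
 * window that starts at s - back, else 0. */
static int window_covers(int back, int k) {
  return k >= -back && k < LOAD_BYTES - back;
}

/* Scalar reference transpose: src holds 16 rows of 8 bytes,
 * dst receives 8 rows of 16 bytes (assumed meaning of "8x16"). */
static void transpose_16rows_x_8cols(const uint8_t *src, ptrdiff_t src_stride,
                                     uint8_t *dst, ptrdiff_t dst_stride) {
  for (int r = 0; r < 16; ++r) {
    for (int c = 0; c < 8; ++c) {
      dst[c * dst_stride + r] = src[r * src_stride + c];
    }
  }
}
```

    With these helpers, byte i of the (s - 5) window is byte i + 3 of the (s - 8) window, so only the lane positions change. Whether (s - 8) is really 16-byte aligned depends on the frame buffer layout, which this sketch does not model.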
    This gave a 2% decoder performance gain (tulip clip).
    Change-Id: Ib14c2f1645c4a3436df17fe2f24789506bf0bb58
vp9_loopfilter_x86.c 29.6 KB