    Do horizontal loopfiltering in parallel · 64f728ca
    Yunqing Wang authored
    This patch follows the "Rewrite filter_selectively_horiz for parallel
    loopfiltering" commit and adds an x86 SSE2 optimization that filters
    16 pixels in parallel. It also corrects the declaration of aligned
    arrays. For the 8-pixels-in-parallel case, it improves the calculation
    of the masks and filters. Threshold loading is updated because the
    thresholds are already duplicated. The neon C functions are updated
    to call the neon loopfilters twice.
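
    As a rough illustration of the "call the loopfilter twice" fallback
    mentioned above, the sketch below builds a 16-pixel horizontal-edge
    filter from two calls to an 8-pixel one, each with its own threshold.
    The names filter_edge_8 and filter_edge_16_dual, the single `limit`
    parameter, and the averaging filter itself are hypothetical stand-ins
    chosen for brevity; they are not the libvpx functions or the real
    filter math.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an 8-pixel-wide horizontal-edge filter.
 * `s` points at the first pixel below the edge; the pixel above it is at
 * s[i - pitch]. Pixels are averaged only when their difference is below
 * `limit`. This is NOT the actual VP9 loopfilter. */
static void filter_edge_8(uint8_t *s, int pitch, uint8_t limit) {
  for (int i = 0; i < 8; ++i) {
    const int p0 = s[i - pitch]; /* pixel above the edge */
    const int q0 = s[i];         /* pixel below the edge */
    if (abs(p0 - q0) < limit) {
      const uint8_t avg = (uint8_t)((p0 + q0 + 1) >> 1);
      s[i - pitch] = avg;
      s[i] = avg;
    }
  }
}

/* Hypothetical 16-pixel wrapper: two adjacent 8-pixel blocks, each with
 * its own threshold, handled by calling the 8-pixel filter twice. */
static void filter_edge_16_dual(uint8_t *s, int pitch,
                                uint8_t limit0, uint8_t limit1) {
  filter_edge_8(s, pitch, limit0);
  filter_edge_8(s + 8, pitch, limit1);
}
```

    A SIMD version would instead load both 8-pixel blocks into one
    register and apply the two thresholds lane by lane; the wrapper above
    only shows the calling pattern, not that optimization.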
    
    Tests using the tulip clip showed a ~1.5% decoder speed gain.
    
    Change-Id: Id02638626ac27a4b0e0b09d71792a24c0499bd35
vp9_loopfilter.c 43.2 KB