    combine loopfilter data access · 3556deac
    Johann Koenig authored
    The data processed by the loopfilter overlaps. At the block level, this
    results in some redundant transforms. Grouping the filtering allows for
    a single 16x16 transpose (and inversion) instead of three 16x8 transposes
    (and three more inversions).
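
The saving can be sketched with a scalar model in C. This is an illustration only, not the x86_64 assembly from this change. It assumes the three 16x8 windows are the 8-column neighborhoods around inner vertical edges at columns 4, 8 and 12 of a 16x16 block, which the message does not spell out. The names `transpose_window`, `loads_per_edge` and `loads_grouped` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum { BLOCK = 16, WIN = 8 };

/* Transpose a rows x cols window of src (row stride `stride`) into dst,
 * whose row stride is `rows`. Returns the number of pixels read. */
static size_t transpose_window(const uint8_t *src, int stride, int rows,
                               int cols, uint8_t *dst) {
  size_t loads = 0;
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      dst[c * rows + r] = src[r * stride + c];
      ++loads;
    }
  }
  return loads;
}

/* Per-edge approach (assumed layout): one 16x8 transpose for each inner
 * edge at columns 4, 8 and 12. Adjacent windows share 4 columns, so those
 * pixels are read and transposed twice. */
static size_t loads_per_edge(const uint8_t *block) {
  uint8_t tmp[WIN * BLOCK];
  size_t loads = 0;
  for (int edge = 4; edge <= 12; edge += 4) {
    loads += transpose_window(block + edge - 4, BLOCK, BLOCK, WIN, tmp);
  }
  return loads;
}

/* Grouped approach: a single 16x16 transpose covers the union of the
 * three windows, so every pixel is read once. */
static size_t loads_grouped(const uint8_t *block, uint8_t *out) {
  return transpose_window(block, BLOCK, BLOCK, BLOCK, out);
}
```

Under these assumptions the per-edge path reads 3 x 128 = 384 pixels and the grouped path reads 256. The transposes back to the original layout (the "inversions") would follow the same pattern.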
    
    This implementation is x86_64 only. We retain the previous
    implementation for x86.
    
    The improvement is obviously material dependent, but it seems to be ~1% in
    tests here.
    
    Change-Id: I467b7ec3655be98fb5f1a94b5d145e5e5a660007