    [CFL] SSSE3/AVX2 versions of luma_subsampling_420_lbd · 9bd42785
    Luc Trudeau authored
    Includes unit tests for conformance and speed.
    
    SSSE3/SubsampleSpeedTest:
    4x4: C time = 868 us, SIMD time = 200 us (~4.3x)
    8x8: C time = 3054 us, SIMD time = 293 us (~10x)
    16x16: C time = 11887 us, SIMD time = 760 us (~16x)
    
    AVX2/SubsampleSpeedTest:
    4x4: C time = 784 us, SIMD time = 205 us (~3.8x)
    8x8: C time = 2774 us, SIMD time = 307 us (~9x)
    16x16: C time = 10978 us, SIMD time = 489 us (~22x)
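
For context, a minimal scalar sketch of what a 4:2:0 luma subsampling step for 8-bit (lbd) input can look like, the kind of C baseline the SIMD timings above are compared against. The function name, the signature, the explicit output stride, and the Q3 output scaling (2x2 sum shifted left by 1, i.e. the 2x2 average times 8) are assumptions for illustration and are not taken from this change.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the function from this commit.
 * Each output sample covers one 2x2 block of luma pixels.
 * Assumed scaling: sum of the 4 pixels << 1, which equals average * 8 (Q3). */
static void subsample_420_lbd_sketch(const uint8_t *input, int input_stride,
                                     uint16_t *output_q3, int output_stride,
                                     int width, int height) {
  /* 4:2:0 halves both dimensions, so both must be even. */
  assert(width % 2 == 0 && height % 2 == 0);
  for (int j = 0; j < height; j += 2) {
    const uint8_t *top = input;                /* even luma row */
    const uint8_t *bot = input + input_stride; /* odd luma row */
    for (int i = 0; i < width; i += 2) {
      const int sum = top[i] + top[i + 1] + bot[i] + bot[i + 1];
      output_q3[i >> 1] = (uint16_t)(sum << 1);
    }
    input += 2 * input_stride;   /* advance two luma rows */
    output_q3 += output_stride;  /* advance one output row */
  }
}
```

A SIMD version would produce the same values while handling several 2x2 blocks per instruction, which would be consistent with the speedup growing with block size in the numbers above.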
    
    Change-Id: I7d5958097542599d57d1a9f9a0a1b809c6a345b0