    First SSE2 version of vp8_mbloop_filter_horizontal_edge().  Intrinsics
    are used for now, until the bitstream is finalized.  This function
    will be revisited later for further performance improvements.
    For the test clip used, decoder performance improved by more than 31%.
    This will vary depending on the material.
