Relocate h_predictor_4x4 from SSSE3 to SSE2

The SSE2 version uses XMM registers and is ~25% faster in
./test_intra_pred_speed.

Change-Id: I64e14c13b482a471449be3559bfb0da45cf88d9d
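
For illustration only, here is a minimal sketch of how a 4x4 horizontal predictor can be written with SSE2 alone. It is not the code from this change: the function name h_pred_4x4_sketch and its signature are hypothetical, and it uses C intrinsics rather than hand-written assembly. It assumes the usual definition of horizontal prediction, where each row of the block is filled with that row's left neighbour. A scalar fallback is included so the sketch also builds where SSE2 is unavailable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical helper, not the function touched by this change.
 * Fills a 4x4 block at dst so that row r holds left[r] in all 4 columns. */
static void h_pred_4x4_sketch(uint8_t *dst, ptrdiff_t stride,
                              const uint8_t *left) {
#if defined(__SSE2__)
  int32_t l, r0, r1, r2, r3;
  __m128i v;

  /* Load the 4 left-neighbour bytes into the low dword of an XMM register. */
  memcpy(&l, left, sizeof(l));
  v = _mm_cvtsi32_si128(l);

  /* Interleaving the register with itself twice replicates each byte 4x:
   * l0 l1 l2 l3 -> l0 l0 l1 l1 ... -> l0 l0 l0 l0 l1 l1 l1 l1 ...
   * After this, dword r of v is the complete row r. */
  v = _mm_unpacklo_epi8(v, v);
  v = _mm_unpacklo_epi8(v, v);

  /* Move each dword down to lane 0 and extract it. */
  r0 = _mm_cvtsi128_si32(v);
  r1 = _mm_cvtsi128_si32(_mm_shuffle_epi32(v, 1));
  r2 = _mm_cvtsi128_si32(_mm_shuffle_epi32(v, 2));
  r3 = _mm_cvtsi128_si32(_mm_shuffle_epi32(v, 3));

  /* Store 4 bytes per row; memcpy avoids alignment assumptions. */
  memcpy(dst + 0 * stride, &r0, 4);
  memcpy(dst + 1 * stride, &r1, 4);
  memcpy(dst + 2 * stride, &r2, 4);
  memcpy(dst + 3 * stride, &r3, 4);
#else
  /* Scalar fallback with the same behaviour. */
  int r;
  for (r = 0; r < 4; ++r) memset(dst + r * stride, left[r], 4);
#endif
}
```

The byte replication here needs only unpack and dword-shuffle operations, all of which exist in SSE2, which is one way such a predictor can avoid an SSSE3 byte shuffle.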