Replace MMX with SSE2, reduce mem access to left neighbor, loop unrolled. Change-Id: I941be915af809025f121ecc6c6443f73c9903e70