Commit 2d83e7e2 authored by Timothy B. Terriberry

Wrap _mm_cvtepi...() intrinsics in macros on clang.

We already needed these macros for gcc with optimizations disabled,
 but it appears clang needs them all the time.

Thanks to Jonathan Lennox for the report.
parent 3b74d8bd
@@ -53,8 +53,14 @@ int opus_select_arch(void);
 We can insert an explicit MOVD or MOVQ using _mm_cvtsi32_si128() or
 _mm_loadl_epi64(), which should have the same semantics as an m32 or m64
 reference in the PMOVSXWD instruction itself, but gcc is not smart enough to
-optimize this out when optimizations ARE enabled.*/
-# if !defined(__OPTIMIZE__)
+optimize this out when optimizations ARE enabled.
+It appears clang requires us to do this always (which is fair, since
+technically the compiler is always allowed to do the dereference before
+invoking the function implementing the intrinsic). I have not investiaged
+whether it is any smarter than gcc when it comes to eliminating the extra
+load instruction.*/
+# if defined(__clang__) || !defined(__OPTIMIZE__)
 # define OP_CVTEPI8_EPI32_M32(x) \
 (_mm_cvtepi8_epi32(_mm_cvtsi32_si128(*(int *)(x))))
......
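
For readers unfamiliar with the pattern, here is a minimal sketch of what the wrapped form does: it loads exactly four bytes into the low lane of an XMM register with an explicit MOVD (_mm_cvtsi32_si128()), then sign-extends them with PMOVSXBD (_mm_cvtepi8_epi32()). The helper name widen4_i8_to_i32, the memcpy() used in place of the patch's *(int *)(x) cast, the target attribute, and the scalar fallback are all assumptions made for illustration and are not part of the commit.

```c
#include <stdint.h>
#include <string.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
# include <smmintrin.h> /* SSE4.1: _mm_cvtepi8_epi32() */

/* Hypothetical helper (not in the patch): sign-extend four int8 values
   to four int32 values. The target attribute lets this one function use
   SSE4.1 without building the whole file with -msse4.1. */
__attribute__((target("sse4.1")))
static void widen4_i8_to_i32(const int8_t *src, int32_t dst[4])
{
    int bits;
    __m128i v;
    /* Read exactly 4 bytes. memcpy() stands in for the patch's
       *(int *)(x) cast and sidesteps alignment and aliasing questions. */
    memcpy(&bits, src, sizeof(bits));
    /* Explicit MOVD into the low lane, then PMOVSXBD on the register. */
    v = _mm_cvtepi8_epi32(_mm_cvtsi32_si128(bits));
    _mm_storeu_si128((__m128i *)dst, v);
}
#else
/* Scalar fallback so the sketch still builds on other targets. */
static void widen4_i8_to_i32(const int8_t *src, int32_t dst[4])
{
    int i;
    for (i = 0; i < 4; i++) dst[i] = src[i];
}
#endif
```

Calling widen4_i8_to_i32() on the bytes {-1, 2, -128, 127} should fill dst with the same values as 32-bit integers. The explicit load never reads past the four source bytes, which is the guarantee the macro provides when the compiler does not fold the memory operand into the instruction. The sketch assumes the CPU supports SSE4.1; real code would gate the call on run-time CPU detection.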