./configure mis-detects AArch64 NEON support
I'm building for a Cortex-M33 chip with the DSP and FP options present, and I configure with the following:
PATH=/snip/zephyr-sdk/arm-zephyr-eabi/bin:$PATH ./configure --build=x86_64-linux-gnu --host=arm-zephyr-eabi \
--disable-rtcd --disable-doc --disable-extra-programs \
CFLAGS="--specs=picolibc.specs -nostdlib -Os -march=armv8-m.main+dsp+fp -mcpu=cortex-m33 -mthumb -mfloat-abi=hard -Wa,-mimplicit-it=thumb"
This incorrectly detects support for AArch64 NEON intrinsics; I've marked the offending lines:
checking if compiler supports ARM Neon intrinsics... no <----- Correct
checking if compiler supports ARM Neon intrinsics with -mfpu=neon -mfloat-abi=softfp... yes
checking for NE10... no
checking if compiler supports Aarch64 Neon intrinsics... yes <----- Incorrect
checking if compiler supports Aarch64 dotprod intrinsics... no
checking if compiler supports Aarch64 dotprod intrinsics with -march=armv8.2-a+dotprod... yes
<snip>
General configuration:
Floating point support: ........ yes
Fast float approximations: ..... yes
Fixed point debugging: ......... no
Inline Assembly Optimizations: . ARM (EDSP) (Media)
External Assembly Optimizations: ARM (EDSP) (Media)
Intrinsics Optimizations: ...... ARM (NEON) (NEON Aarch64) (DOTPROD) <----- Incorrect?
Run-time CPU detection: ........ disabled
I don't know why intrinsics optimizations are enabled for ARM NEON, AArch64 NEON, and dotprod. Maybe this is intended for a library that can run on multiple targets, some of which may support these features? But I can't enable run-time CPU detection, and I don't want the extra code included; I just want something that supports only my target.
From config.log, it looks like the AArch64 NEON intrinsics were detected because GCC merely warns about the unavailable intrinsic (as an implicit function declaration) instead of erroring out, and the link still succeeds:
configure:11187: checking if compiler supports Aarch64 Neon intrinsics
configure:11206: arm-zephyr-eabi-gcc -o conftest --specs=picolibc.specs -nostdlib -Os -march=armv8-m.main+dsp+fp -mcpu=cortex-m33 -mthumb -mfloat-abi=hard -Wa,-mimplicit-it=thumb conftest.c >&5
conftest.c: In function 'main':
conftest.c:41:22: warning: implicit declaration of function 'vqmovns_s32'; did you mean 'vqmovn_s32'? [-Wimplicit-function-declaration]
41 | OUT = vqmovns_s32(IN);
| ^~~~~~~~~~~
| vqmovn_s32
/snip/zephyr-sdk/arm-zephyr-eabi/bin/../lib/gcc/arm-zephyr-eabi/12.2.0/../../../../arm-zephyr-eabi/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000010000000
configure:11206: $? = 0
configure:11211: result: yes
In the ARM NEON case, which is correctly not detected, GCC errors out:
configure:10956: checking if compiler supports ARM Neon intrinsics
configure:10975: arm-zephyr-eabi-gcc -o conftest --specs=picolibc.specs -nostdlib -Os -march=armv8-m.main+dsp+fp -mcpu=cortex-m33 -mthumb -mfloat-abi=hard -Wa,-mimplicit-it=thumb conftest.c >&5
In file included from conftest.c:32:
/snip/zephyr-sdk/arm-zephyr-eabi/lib/gcc/arm-zephyr-eabi/12.2.0/include/arm_neon.h: In function 'main':
/snip/zephyr-sdk/arm-zephyr-eabi/lib/gcc/arm-zephyr-eabi/12.2.0/include/arm_neon.h:6308:1: error: inlining failed in call to 'always_inline' 'vgetq_lane_f32': target specific option mismatch
6308 | vgetq_lane_f32 (float32x4_t __a, const int __b)
| ^~~~~~~~~~~~~~
conftest.c:40:25: note: called from here
40 | return (int)vgetq_lane_f32(SUMM, 0);
| ^~~~~~~~~~~~~~~~~~~~~~~
/snip/zephyr-sdk/arm-zephyr-eabi/lib/gcc/arm-zephyr-eabi/12.2.0/include/arm_neon.h:1481:1: error: inlining failed in call to 'always_inline' 'vmlaq_f32': target specific option mismatch
1481 | vmlaq_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c)
| ^~~~~~~~~
conftest.c:39:20: note: called from here
39 | SUMM = vmlaq_f32(SUMM, A0, A1);
| ^~~~~~~~~~~~~~~~~~~~~~~
configure:10975: $? = 1
The Opus build then gives several warnings about implicitly declared functions: vqadds_s32, vqsubs_s32, and silk_NSQ_del_dec_media.
It's interesting that there seems to be some confusion between silk_NSQ_del_dec_media and silk_NSQ_del_dec_neon. Is there an inconsistency in the way that implementations are generated/selected?
In file included from ./silk/arm/biquad_alt_arm.h:31,
from ./silk/SigProc_FIX.h:51,
from silk/float/SigProc_FLP.h:31,
from silk/float/main_FLP.h:31,
from silk/float/wrappers_FLP.c:32:
silk/float/wrappers_FLP.c: In function 'silk_NSQ_wrapper_FLP':
./silk/arm/NSQ_del_dec_arm.h:54:19: warning: implicit declaration of function 'silk_NSQ_del_dec_media'; did you mean 'silk_NSQ_del_dec_neon'? [-Wimplicit-function-declaration]
54 | PRESUME_NEON(silk_NSQ_del_dec)( \
| ^~~~~~~~~~~~~~~~
./celt/arm/armcpu.h:62:31: note: in definition of macro 'PRESUME_MEDIA'
62 | # define PRESUME_MEDIA(name) name ## _media
| ^~~~
./silk/arm/NSQ_del_dec_arm.h:54:6: note: in expansion of macro 'PRESUME_NEON'
54 | PRESUME_NEON(silk_NSQ_del_dec)( \
| ^~~~~~~~~~~~
silk/float/wrappers_FLP.c:164:9: note: in expansion of macro 'silk_NSQ_del_dec'
164 | silk_NSQ_del_dec( &psEnc->sCmn, psNSQ, psIndices, x16, pulses, PredCoef_Q12[ 0 ], LTPCoef_Q14,
| ^~~~~~~~~~~~~~~~
./silk/arm/NSQ_del_dec_arm.h:54:19: warning: nested extern declaration of 'silk_NSQ_del_dec_media' [-Wnested-externs]
54 | PRESUME_NEON(silk_NSQ_del_dec)( \
| ^~~~~~~~~~~~~~~~
./celt/arm/armcpu.h:62:31: note: in definition of macro 'PRESUME_MEDIA'
62 | # define PRESUME_MEDIA(name) name ## _media
| ^~~~
./silk/arm/NSQ_del_dec_arm.h:54:6: note: in expansion of macro 'PRESUME_NEON'
54 | PRESUME_NEON(silk_NSQ_del_dec)( \
| ^~~~~~~~~~~~
silk/float/wrappers_FLP.c:164:9: note: in expansion of macro 'silk_NSQ_del_dec'
164 | silk_NSQ_del_dec( &psEnc->sCmn, psNSQ, psIndices, x16, pulses, PredCoef_Q12[ 0 ], LTPCoef_Q14,
| ^~~~~~~~~~~~~~~~
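To make sense of the _media/_neon mix-up in the diagnostics above, here is a minimal sketch of how I read the macro chain. It is not the real celt/arm/armcpu.h: only the PRESUME_MEDIA definition is copied from the compiler note, the fallback from PRESUME_NEON to PRESUME_MEDIA is inferred from the macro expansion trace, and the OPUS_ARM_PRESUME_MEDIA / OPUS_ARM_PRESUME_NEON switch names, the _c fallback, and the STR / presumed_symbol helpers are my own assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

/* Assumed config.h switches (names are my guess): MEDIA is presumed,
 * NEON is not, which seems to match what configure decided for my target. */
#define OPUS_ARM_PRESUME_MEDIA 1
/* #define OPUS_ARM_PRESUME_NEON 1 */

#if defined(OPUS_ARM_PRESUME_MEDIA)
# define PRESUME_MEDIA(name) name ## _media /* as quoted in the compiler note */
#else
# define PRESUME_MEDIA(name) name ## _c     /* assumed generic fallback */
#endif

#if defined(OPUS_ARM_PRESUME_NEON)
# define PRESUME_NEON(name) name ## _neon
#else
/* Inferred from the expansion trace: without presumed NEON, fall back to MEDIA. */
# define PRESUME_NEON(name) PRESUME_MEDIA(name)
#endif

/* Hypothetical helpers: turn the fully expanded token into a string. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Returns the symbol name that a call through PRESUME_NEON would reference. */
const char *presumed_symbol(void)
{
    return STR(PRESUME_NEON(silk_NSQ_del_dec));
}
```

If this reading is right, presumed_symbol() yields "silk_NSQ_del_dec_media": NEON is not presumed for my target while MEDIA is, so the header falls through to a _media name for which no declaration is in scope. The "did you mean silk_NSQ_del_dec_neon" part would then just be the compiler's nearest-name suggestion.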
When later using the built Opus library in a target application, linking fails because vqadds_s32 and vqsubs_s32 don't exist.
I tried to work around this by editing config.h to #undef OPUS_ARM_PRESUME_AARCH64_NEON_INTR before building Opus; this stops the issues with vqadds_s32 and vqsubs_s32, but silk_NSQ_del_dec_media still doesn't exist.
I don't know enough about Autoconf or how Opus uses it. Am I misusing ./configure? Or is this a bug in Opus' scripts, or in Autoconf?
In case it matters, arm-zephyr-eabi-gcc --version reports arm-zephyr-eabi-gcc (Zephyr SDK 0.16.1) 12.2.0.