- Nov 05, 2015
-
-
Signed-off-by:
Timothy B. Terriberry <tterribe@xiph.org>
-
Signed-off-by:
Timothy B. Terriberry <tterribe@xiph.org>
-
- Nov 03, 2015
-
-
- Oct 08, 2015
-
-
Signed-off-by:
Jean-Marc Valin <jmvalin@jmvalin.ca>
-
- Oct 07, 2015
-
-
Some of the fields present in NE10's float state struct are not present in the fixed-point version, but we were generating initializers for them anyway. Also, the float modes were not up-to-date with the output of dump_modes.
-
Extends use of the NEON-optimized fixed-point FFT routines in libNE10 to clt_mdct_forward and clt_mdct_backward. Signed-off-by:
Timothy B. Terriberry <tterribe@xiph.org>
-
Uses the NEON-optimized fixed-point FFT routines in the NE10 library. Signed-off-by:
Timothy B. Terriberry <tterribe@xiph.org>
-
Signed-off-by:
Viswanath Puttagunta <viswanath.puttagunta@linaro.org> Signed-off-by:
Timothy B. Terriberry <tterribe@xiph.org>
-
Optimize opus decode (float only) use case using ARM NE10. Mainly affects opus_ifft, clt_mdct_backward, and related functions. Work based on the previous encode optimization using the ARM NE10 library. See the previous commit for details on how to enable this. Signed-off-by:
Timothy B. Terriberry <tterribe@xiph.org>
-
Optimize opus encode (float only) use case using the ARM NE10 library. Mainly affects opus_fft, clt_mdct_forward, and related functions. This optimization can be used on ARM CPUs that have a NEON VFP unit. This patch only enables optimizations for ARMv7. The official ARM NE10 library page is available at http://projectne10.github.io/Ne10/
To enable this optimization, use
--enable-intrinsics --with-NE10=<install_prefix>
or
--enable-intrinsics --with-NE10-libraries=<NE10_lib_dir> --with-NE10-includes=<NE10_includes_dir>
Compile-time checks are made during the configure process to make sure the optimization option is available only when the compiler supports NEON intrinsics. Runtime checks are made to make sure the optimized functions are only called on appropriate hardware. Signed-off-by:
Timothy B. Terriberry <tterribe@xiph.org>
-
This is needed for the SMALL_DIV_TABLE constants added in commit ec5d01cb.
-
Brings MIPS in sync with the ARM/SSE optimizations that added "arch" parameters. Signed-off-by:
Jean-Marc Valin <jmvalin@jmvalin.ca>
-
- Sep 01, 2015
-
-
-
Enable x86 intrinsics when building in floating-point mode. Support SSE as an arch value. Use RTCD to conditionally enable existing floating-point Celt SSE code. Call functions directly (without RTCD) when their architecture can be presumed. Use SSE4.1 intrinsics optimized code for Silk even in floating-point mode.
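The run-time CPU detection (RTCD) approach mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the arch levels, the table, and every function name here are hypothetical, and the "SSE" entry is a plain C stand-in so the sketch compiles on any platform.

```c
#include <assert.h>

/* Hypothetical arch levels, ordered from least to most capable. */
enum { ARCH_C = 0, ARCH_SSE = 1, ARCH_COUNT = 2 };

typedef float (*inner_prod_fn)(const float *x, const float *y, int n);

/* Portable fallback that works on every CPU. */
static float inner_prod_c(const float *x, const float *y, int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; i++) sum += x[i] * y[i];
    return sum;
}

/* Stand-in for an accelerated version. A real one would use SSE
   intrinsics and live in a file compiled with the matching flags. */
static float inner_prod_sse_standin(const float *x, const float *y, int n)
{
    return inner_prod_c(x, y, n);
}

/* One table entry per arch level; the arch value indexes the table. */
static const inner_prod_fn INNER_PROD_IMPL[ARCH_COUNT] = {
    inner_prod_c,
    inner_prod_sse_standin
};

/* Hypothetical probe: real code would query the CPU (e.g. via CPUID)
   once at start-up. Here the capability is passed in by the caller. */
static int select_arch(int cpu_has_sse)
{
    return cpu_has_sse ? ARCH_SSE : ARCH_C;
}

/* Dispatch through the table using the arch chosen at run time. */
static float inner_prod(const float *x, const float *y, int n, int arch)
{
    assert(arch >= 0 && arch < ARCH_COUNT);
    return INNER_PROD_IMPL[arch](x, y, n);
}
```

When the build already guarantees an architecture, the table lookup can be skipped and the accelerated function called directly, which is the "without RTCD" case described in the message.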
-
Move SSE2 and SSE4.1 intrinsics functions to separate files, to be compiled with appropriate compiler flags. Otherwise, compilers are allowed to take advantage of (e.g.) -msse4.1 to generate code that uses SSE4.1 instructions, even when no SSE4.1 intrinsics are explicitly used in the source.
-
-
-
-
In optimized mode, don't force Clang to use explicit load/store for _mm_cvtepi16_epi32, only for _mm_cvtepi8_epi32. Adjust comment accordingly.
-
Actually try to compile intrinsics rather than using the output of --help. Allow caller of configure script to set custom compiler options to enable intrinsics. Detect when intrinsics are always available, without needing special compiler options. Make naming of #defines for detected intrinsics support more systematic.
-
- Feb 27, 2015
-
-
Timothy B. Terriberry authored
We already needed these macros for gcc with optimizations disabled, but it appears clang needs them all the time. Thanks to Jonathan Lennox for the report.
-
- Feb 20, 2015
-
-
Timothy B. Terriberry authored
This way we won't break this by accident.
-
- Jan 03, 2015
-
-
Timothy B. Terriberry authored
During review of c95c9a04, I replaced a call to _mm_cvtepi8_epi32() with the OP_CVTEPI16_EPI32_M64() macro (note the 16 instead of 8). Make a separate OP_CVTEPI8_EPI32_M32() macro and use that instead. Thanks to Wei Zhou for the report.
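To see why mixing up the 8-bit and 16-bit conversions matters, here is a portable scalar model of the two intrinsics. _mm_cvtepi8_epi32 sign-extends four 8-bit values (32 bits of input) and _mm_cvtepi16_epi32 sign-extends four 16-bit values (64 bits of input). The helper names below are hypothetical and are not the macros from the commit; the sketch only models the arithmetic.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of _mm_cvtepi8_epi32: read 4 bytes (32 bits) and
   sign-extend each 8-bit value to 32 bits. */
static void cvt_i8_to_i32(const void *src, int32_t out[4])
{
    int8_t in[4];
    assert(src != NULL);
    memcpy(in, src, sizeof in);
    for (int i = 0; i < 4; i++) out[i] = in[i];
}

/* Scalar model of _mm_cvtepi16_epi32: read 8 bytes (64 bits) and
   sign-extend each 16-bit value to 32 bits. */
static void cvt_i16_to_i32(const void *src, int32_t out[4])
{
    int16_t in[4];
    assert(src != NULL);
    memcpy(in, src, sizeof in);
    for (int i = 0; i < 4; i++) out[i] = in[i];
}
```

Applied to the same buffer of 8-bit data, the 16-bit variant reads twice as many bytes and fuses neighbouring bytes into single values, so the results differ even though both calls compile cleanly.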
-
- Dec 26, 2014
-
-
Timothy B. Terriberry authored
This should suppress our current issues with unused parameters, unused variables, and set-but-not-used variables.
-
- Dec 25, 2014
-
-
Optimize the celt_pitch_xcorr function (for floating point) using ARM NEON intrinsics for SoCs that have a NEON VFP unit. To enable this optimization, use the --enable-intrinsics configure option. Compile-time and runtime checks are also supported to make sure this optimization is only enabled when the compiler supports NEON intrinsics. Signed-off-by:
Timothy B. Terriberry <tterribe@xiph.org>
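For orientation, a plain C reference of the kind of cross-correlation that a NEON version would accelerate might look like the sketch below. It assumes the usual definition xcorr[i] = sum over j of x[j] * y[i + j]; the function name and signature are hypothetical and are not taken from the commit.

```c
#include <assert.h>

/* Reference cross-correlation:
     xcorr[i] = sum_{j=0}^{len-1} x[j] * y[i + j],  0 <= i < max_pitch.
   The caller must supply at least len + max_pitch - 1 samples in y. */
static void pitch_xcorr_ref(const float *x, const float *y,
                            float *xcorr, int len, int max_pitch)
{
    assert(len > 0 && max_pitch > 0);
    for (int i = 0; i < max_pitch; i++) {
        float sum = 0.0f;
        for (int j = 0; j < len; j++) {
            sum += x[j] * y[i + j];
        }
        xcorr[i] = sum;
    }
}
```

A SIMD version would typically compute several lags or several products per iteration, and a scalar routine like this one is useful as the fallback and as a correctness check.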
-
- Dec 01, 2014
-
-
Timothy B. Terriberry authored
This should not take an arch parameter, so it can properly be used as a fallback for accelerated versions which do not. This patch instead provides a separate version which can call accelerated helpers for platforms that have taken that approach.
-
- Nov 19, 2014
-
-
Signed-off-by:
Tristan Matthews <tmatth@videolan.org>
-
- Oct 12, 2014
-
- Oct 04, 2014
-
-
Timothy B. Terriberry authored
There is also no trailing whitespace.
-
Timothy B. Terriberry authored
-
1. Only for fixed point on x86 platforms (32-bit and 64-bit; uses SIMD intrinsics up to SSE4.2).
2. Use "configure --enable-fixed-point --enable-intrinsics" to enable the optimization; it is disabled by default.
3. Official test cases are verified and passed.
Signed-off-by:
Timothy B. Terriberry <tterribe@xiph.org>
-
- Aug 10, 2014
-
-
Tristan Matthews authored
cherry-picked from speexdsp 86779a06f6500d041573d6252d4971d3bfcb4b18
-
- Jun 19, 2014
-
-
Jean-Marc Valin authored
-
Signed-off-by:
Jean-Marc Valin <jmvalin@jmvalin.ca>
-
- Jun 18, 2014
-
-
Jean-Marc Valin authored
-
- Apr 17, 2014
-
-
Gregory Maxwell authored
-
- Mar 26, 2014
-
-
Timothy B. Terriberry authored
The patch in 76e831d9 got us most of the way there, but out-of-tree builds required a second Makefile.am rule, which was missing @ARM2GNU_PARAMS@. Also, arm2gnu.pl was terminating argument processing on any argument beginning with --, rather than on an argument that was just -- by itself (as is the normal convention in GNU programs). That meant it never saw the --apple flag even when it was passed. Thanks to Jonathan Lennox for the report and for testing.
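The argument-handling bug described above can be illustrated with a small sketch. The real script is Perl; this C version only models the logic, and both function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Conventional behaviour: only a bare "--" ends option processing,
   so "--apple" is still recognised as an option. */
static int has_option(int argc, const char *const argv[], const char *flag)
{
    assert(flag != NULL);
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--") == 0) break;
        if (strcmp(argv[i], flag) == 0) return 1;
    }
    return 0;
}

/* Model of the bug: any argument that merely begins with "--" ends
   option processing, so a long option such as "--apple" is never seen. */
static int has_option_buggy(int argc, const char *const argv[], const char *flag)
{
    assert(flag != NULL);
    for (int i = 1; i < argc; i++) {
        if (strncmp(argv[i], "--", 2) == 0) break;
        if (strcmp(argv[i], flag) == 0) return 1;
    }
    return 0;
}
```

With the arguments "--apple -- input.s", the first function reports the flag as present and the second does not, which matches the symptom in the report.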
-
- Mar 19, 2014
-
-
This allows building the arm assembly for iOS. This checks for the __APPLE__ preprocessor built-in define to determine whether this extra handling should be enabled. Signed-off-by:
Timothy B. Terriberry <tterribe@xiph.org>
-
This avoids having to use the public symbol name when jumping here, on platforms where the public symbols have an underscore prefix. Signed-off-by:
Timothy B. Terriberry <tterribe@xiph.org>
-