- Jul 06, 2016
-
-
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
This makes it match the formatting of the output for ARM assembly better, and removes some redundant repetition of the word "intrinsics". It also fixes the output if a compiler supports RTCD for Neon intrinsics but not assembly.
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
Enables existing Neon intrinsic optimizations to work on aarch64 targets.
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
- Jun 29, 2016
-
-
The implementation currently only codes each channel independently with no special allocation rules.
Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
-
- May 31, 2016
-
-
Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
-
- Jan 12, 2016
-
-
Jean-Marc Valin authored
-
- Nov 26, 2015
-
-
Jean-Marc Valin authored
-
- Nov 05, 2015
-
-
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
- Oct 16, 2015
-
-
Timothy B. Terriberry authored
These were causing "syntax error near unexpected token `fi'" in the generated configure on some systems, because they produced an else fi with no commands between the two.
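The failure mode described above can be reproduced outside of configure. The following is a minimal sketch, not the actual generated code: the two snippets are made-up stand-ins, and `sh -n` is used only to parse them without running anything. An `else` branch needs at least one command, and the no-op `:` is one common way to supply it.

```shell
#!/bin/sh
# Made-up stand-in for the generated code: an else branch with no commands.
broken='if true; then echo yes; else
fi'

# Same structure with the no-op ":" filling the else branch.
fixed='if true; then echo yes; else
  :
fi'

# "sh -n" reads the script from stdin and only checks the syntax.
if printf '%s\n' "$broken" | sh -n 2>/dev/null; then broken_ok=yes; else broken_ok=no; fi
if printf '%s\n' "$fixed" | sh -n 2>/dev/null; then fixed_ok=yes; else fixed_ok=no; fi

echo "empty else parses: $broken_ok"
echo "else with ':' parses: $fixed_ok"
```

The exact wording of the error differs between shells, which is why the sketch only records whether parsing succeeded.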
-
- Oct 07, 2015
-
-
Uses the NEON-optimized fixed-point FFT routines in the NE10 library.
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
Optimize the Opus encode (float only) use case using the ARM NE10 library. This mainly affects opus_fft, clt_mdct_forward, and related functions. The optimization can be used on ARM CPUs that have a NEON VFP unit. This patch only enables optimizations for ARMv7.
The official ARM NE10 library page is available at http://projectne10.github.io/Ne10/
To enable this optimization, use
  --enable-intrinsics --with-NE10=<install_prefix>
or
  --enable-intrinsics --with-NE10-libraries=<NE10_lib_dir> --with-NE10-includes=<NE10_includes_dir>
Compile-time checks made during the configure process make sure the optimization option is available only when the compiler supports NEON intrinsics. Runtime checks make sure the optimized functions are only called on appropriate hardware.
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
- Sep 01, 2015
-
-
Enable x86 intrinsics when building in floating-point mode. Support SSE as an arch value. Use RTCD to conditionally enable existing floating-point Celt SSE code. Call functions directly (without RTCD) when their architecture can be presumed. Use SSE4.1 intrinsics optimized code for Silk even in floating-point mode.
-
Move SSE2 and SSE4.1 intrinsics functions to separate files, to be compiled with appropriate compiler flags. Otherwise, compilers are allowed to take advantage of (e.g.) -msse4.1 to generate code that uses SSE4.1 instructions, even when no SSE4.1 intrinsics are explicitly used in the source.
-
-
Actually try to compile intrinsics rather than using the output of --help. Allow caller of configure script to set custom compiler options to enable intrinsics. Detect when intrinsics are always available, without needing special compiler options. Make naming of #defines for detected intrinsics support more systematic.
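The idea of probing by compilation rather than by parsing `--help` output can be sketched in plain shell. This is an illustration under assumptions, not the project's configure code: the helper name `try_intrinsics`, the test program, and the `-msse` flag are all chosen here for the example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the compiler command in $1 can compile a
# tiny SSE intrinsics program when given the extra flags in $2.
try_intrinsics() {
    cc_cmd=$1
    extra_flags=$2
    tmpdir=$(mktemp -d) || return 1
    cat > "$tmpdir/conftest.c" <<'EOF'
#include <xmmintrin.h>
int main(void) {
    __m128 v = _mm_setzero_ps();
    return _mm_cvtss_si32(v);
}
EOF
    # $cc_cmd and $extra_flags are left unquoted so they split into words.
    $cc_cmd $extra_flags -c "$tmpdir/conftest.c" -o "$tmpdir/conftest.o" >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return $status
}

# First see whether intrinsics work with no special options, then with a flag.
if try_intrinsics "${CC:-cc}" ""; then
    echo "SSE intrinsics: always available"
elif try_intrinsics "${CC:-cc}" "-msse"; then
    echo "SSE intrinsics: available with -msse"
else
    echo "SSE intrinsics: not available"
fi
```

Trying the empty flag set first mirrors the "always available" case in the message; what the script prints depends on the compiler and target it runs on.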
-
- Dec 25, 2014
-
-
Optimize the celt_pitch_xcorr function (for floating point) using ARM NEON intrinsics for SoCs that have a NEON VFP unit. To enable this optimization, use the --enable-intrinsics configure option. Compile-time and runtime checks are also supported to make sure this optimization is only enabled when the compiler supports NEON intrinsics.
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
- Dec 20, 2014
-
- Oct 04, 2014
-
-
1. Only for fixed point on the x86 platform (32-bit and 64-bit; uses SIMD intrinsics up to SSE4.2).
2. Use "configure --enable-fixed-point --enable-intrinsics" to enable the optimization; it is disabled by default.
3. Official test cases are verified and passed.
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
- Mar 19, 2014
-
-
This allows building the arm assembly for iOS. This checks for the __APPLE__ preprocessor built-in define to determine whether this extra handling should be enabled.
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
- Dec 17, 2013
-
-
Breaks configure when /bin/sh isn't bash with:
  configure: Trying to force-enable ARMv6 media instructions...
  checking if assembler supports ARMv6 media instructions on ARM... yes
  configure: Trying to force-enable NEON instructions...
  checking if assembler supports NEON instructions on ARM... yes
  ./configure.lineno: 12799: Bad substitution
Fix it by using the %% expansion to remove everything from the first space instead.
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
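A minimal sketch of the `%%` expansion mentioned in the fix. The variable name and its value are made up for illustration, and the message does not show the original expression that triggered the error. `${var%% *}` is part of POSIX parameter expansion, so it also works in shells other than bash.

```shell
#!/bin/sh
# Made-up example value: a flag string of which only the first word is wanted.
flags="-mfpu=neon -O2 -g"

# "%%" removes the longest suffix matching the pattern " *",
# i.e. everything from the first space to the end of the string.
first_word=${flags%% *}

echo "$first_word"   # prints "-mfpu=neon"
```

If the value contains no space, the pattern does not match and the string is returned unchanged.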
-
- Dec 04, 2013
-
- Dec 03, 2013
-
-
Jean-Marc Valin authored
This reverts commit 2446445b.
-
Jean-Marc Valin authored
-
- Nov 18, 2013
-
-
Optimizing celt_pitch_xcorr()/xcorr_kernel(), which also speeds up FIRs, IIRs and auto-correlations.
Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
-
- Nov 04, 2013
-
-
Jean-Marc Valin authored
-
- Jul 13, 2013
-
-
Ron authored
This avoids at least one case where ./autogen.sh && ./configure && make will re-run configure because the makefile rules updated something that it depends upon. Pulling a new version from git will change the version so we should update that at the first step rather than iterating after the last one.
-
- Jul 01, 2013
-
-
Ron authored
It only existed to not include -lm in the .pc for fixed point builds, but that is still needed since the float API is still enabled and will use at least lrint.
-
- Jun 28, 2013
-
-
Ron authored
Drop the test for getopt, it's not used anywhere anymore.

Switch the last uses of AC_TRY_COMPILE to AC_COMPILE_IFELSE now. The former is marked as obsolete, and this will leave no confusion about which to cut and paste if new tests are added.

Double quote all the parameters to AC_LANG_SOURCE and AC_LANG_PROGRAM. This is actually required, even if you can get away with not doing it sometimes, so again set a good example for future changes to follow, to hopefully avoid people getting bitten harder than they need to be.

Don't bother checking for alloca if we're never going to use it (ie. if we have C99 variable-size array support). The test for this is a bit sketchy anyway ... we separately test for HAVE_ALLOCA_H and USE_ALLOCA, but the test for USE_ALLOCA depends upon having alloca.h present, yet the use of these macros in stack_alloc.h only tests for HAVE_ALLOCA_H inside of a test for USE_ALLOCA. I'm not going to change this logic right now, since I don't know what crazy system it was attempting to cater for, though I suspect it was one that was not using the autoconf build system ... since with the current test that combination should not be possible to obtain.

Use LT_LIB_M instead of the song and dance with testing for exp(). This should also work for BeOS which is what the exp test was added for. It also means we don't unconditionally add -lm to everything via LIBS. Use LIBM now instead of hardcoding -lm everywhere.

Use AS_HELP_STRING to format all option descriptions.

Don't bother to test for doxygen if using it is --disable'd.

Drop the SYMBOL_VISIBILITY export, it isn't used anywhere (we add the compiler flag to CFLAGS).
-
- Jun 08, 2013
-
-
Ron authored
These were probably cribbed from libogg, but we don't use them here, opus_types.h instead has a list of hardcoded arch definitions.
-
- Jun 04, 2013
-
-
Run-time CPU detection (RTCD) is enabled by default if the target platform supports it. It can be disabled at compile time with the --disable-rtcd option. Add RTCD support for the ARM architecture. Thanks to Timothy B. Terriberry for help and code review.
Signed-off-by: Timothy B. Terriberry <tterribe@xiph.org>
-
- May 26, 2013
-
-
- May 22, 2013
-
-
Timothy B. Terriberry authored
Define ARMv4_ASM to 1 like the other ARM defines.
-
- May 20, 2013
-
-
Ron authored
Needed by commit 972a34ec.

Use autoreconf in autogen.sh instead of the handwritten version, it's simpler, and also updates things that we weren't handling.

Drop the hand-written INSTALL file. Its information content was ~zero, and autotools wants to overwrite it with its own version, so don't fight that, just .gitignore it.
-
Timothy B. Terriberry authored
Original patch by Aurélien Zanelli <aurelien.zanelli@parrot.com>: http://lists.xiph.org/pipermail/opus/2013-May/002078.html

Revised version:
- Add autoconf detection (ported from libtheora).
- Rename ARM5E to ARMv5E (an ARM5 is not the same thing as ARMv5!).
- Use actual macros so they can still be selectively overridden.
- Split out ARMv4 parts and add a few more ARMv4 macros.
- Label blocks to make them easy to find in generated assembly.
- Fix MULT16_32_Q15() so we can pass make check. The MDCT test passes in values larger than 2**30 for b. The new version should be just as fast (or faster, since it's easier to merge the shift with following instructions), and there's no appreciable impact on accuracy (FFT/MDCT SNR actually goes up in most cases).
- Fix register constraints. We were using early-clobber flags in a bunch of places that didn't need them, and commutative-pair flags in a bunch of places that weren't actually commutative. This was Jean-Marc's fault (the original code came from Speex).
- Simplify silk_CLZ16().
- Port over iFFT C_MULC asm by Andree Buschmann <AndreeBuschmann@t-online.de> from Rockbox.
- Speed up the C_MULC asm by using LDRD, allowing more flexible addressing, re-ordering instructions to avoid some stalls, allowing more flexible register allocation, and getting things out of the inline asm block so the compiler can schedule them better.
- Add C_MUL and C_MUL4 asm for the FFT to the encoder, based on the new C_MULC.

In total, this patch gives a 22.3% speed-up on test_opus_encoder on a 600 MHz Cortex A8 using gcc 4.2.1. When restricted to ARMv4 optimizations, it gives a 9.6% speed-up on the same processor/compiler.

On the conformance test vectors:
Average mono quality is 97.0583 %
Average stereo quality is 97.775 %
-
- May 18, 2013
-
-
Ron authored
We shouldn't ever have any trailing newlines that need trimming here, and the _s version wasn't added to m4sugar.m4 until autoconf 2.63b, so this will let it work with 2.13 again.
-
- May 10, 2013
-
-
Ron authored
This one meets or exceeds the following requirements:
- Version is checked/updated for every build action when in the git repo. Does not require the user to re-run ./configure to get the correct version.
- Version is not updated automatically when using exported tarball source. Avoids accidentally getting a wrong version from some other git repo in a parent directory of the source, and allows setting the correct version for distro package exports.
- Automatic updating can be manually suppressed. For developers doing lots of change/rebuild cycles they don't plan to release, when they don't want a full rebuild triggered for every commit, and again for every change made immediately after a commit. The version will still always be updated if they do a `make dist`.
- Does not require any manual updating of versions in the mainline git repo for each release aside from normal tagging. The version is recorded in one file only, that is automatically generated and will never need to be committed.
- Does not require gnu-make features for the autoconf builds.

It does not currently:
- Keep a checksum of every source file in tarball releases to mangle the version if people modify the tarball source. Responsible people can manually update the version easily though in such cases.

The version.mk file is now only used by the VC project files. Once they are updated to use the package_version file too, then it can be deleted from the repository.
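One way to keep a version file current on every build action without forcing needless rebuilds is to rewrite it only when the version string actually changes. The following is a hedged sketch of that idea, not the project's script: the function name `update_version_file`, the `PACKAGE_VERSION="..."` file layout, and the example version string are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write the version to a file only when it differs from
# what is already recorded, so the file's timestamp (and anything that depends
# on it) is left alone when nothing changed.
update_version_file() {
    version_file=$1
    new_version=$2
    new_line="PACKAGE_VERSION=\"$new_version\""
    if [ -f "$version_file" ] && [ "$(cat "$version_file")" = "$new_line" ]; then
        echo "unchanged"
    else
        printf '%s\n' "$new_line" > "$version_file"
        echo "updated"
    fi
}

# Usage with a throwaway directory and a made-up version string.
demo_dir=$(mktemp -d)
update_version_file "$demo_dir/package_version" "1.0.0-example"
update_version_file "$demo_dir/package_version" "1.0.0-example"
rm -rf "$demo_dir"
```

In a git checkout the version argument would typically come from something like `git describe`, while an exported tarball would keep the file it shipped with. That split is an assumption about how the requirements listed above could be met.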
-
- May 09, 2013
-
-
Jean-Marc Valin authored
-