Commits · b518b56fe11bf53f88fe30d57ea9d668337983a9 · Alexander Traud / Opus

May 21, 2013

Clean up register constraints. · b518b56f

Timothy B. Terriberry authored 11 years ago

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/CIHBJEHG.html
 says that "Rd cannot be the same as Rm."
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/CIHBJEHG.html
 says that "RdLo, RdHi, and Rm must all be different registers."
This means that some of the early clobbers I removed really should
 have been there (to prevent aliasing Rd, RdLo, or RdHi with Rm).
It also means that we should reverse some of the operands in the
 FFT's complex multiplies.
This should only affect the ARMv4 optimizations.

Thanks to Nils Wallménius for the report.

While we're here, audit the commutative pair flags again, since I
 screwed up at least one of them, and eliminate some dead code.
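
As a rough illustration of the constraint issue described above, the sketch below builds a 16x32 fractional multiply on SMULL and marks both outputs as early-clobber (`&`), so GCC cannot allocate RdLo or RdHi to the same register as an input. The function name `mult16_32_q16_sketch` and the portable fallback branch are assumptions made for this example; this is not the code from the commit.

```c
#include <stdint.h>

/* Hypothetical helper (name invented for this sketch): returns (a*b) >> 16
   for a 16-bit a and a 32-bit b. */
static inline int32_t mult16_32_q16_sketch(int16_t a, int32_t b)
{
#if defined(__GNUC__) && defined(__arm__) && !defined(__thumb__)
  uint32_t rd_lo;
  int32_t rd_hi;
  /* Place a in the top half of a register so that the high word of the
     64-bit product equals (a*b) >> 16. */
  int32_t a_hi = (int32_t)((uint32_t)(uint16_t)a << 16);
  __asm__(
      "smull %0, %1, %2, %3\n\t"
      /* "=&r": early-clobber outputs, never shared with an input register. */
      : "=&r"(rd_lo), "=&r"(rd_hi)
      : "r"(b), "r"(a_hi)
  );
  (void)rd_lo; /* The low word is intentionally discarded. */
  return rd_hi;
#else
  /* Portable fallback, assuming an arithmetic right shift of negative
     values (true for GCC and Clang). */
  return (int32_t)(((int64_t)a * b) >> 16);
#endif
}
```

Without the `&` modifiers the compiler would be free to reuse the register holding `b` (Rm) for one of the outputs, which is the aliasing the quoted ARM documentation rules out.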

May 20, 2013

Add ARMv4/ARMv5E macros. · 972a34ec

Timothy B. Terriberry authored 11 years ago

Original patch by Aurélien Zanelli <aurelien.zanelli@parrot.com>:
 http://lists.xiph.org/pipermail/opus/2013-May/002078.html

Revised version:
- Add autoconf detection (ported from libtheora).
- Rename ARM5E to ARMv5E (an ARM5 is not the same thing as ARMv5!).
- Use actual macros so they can still be selectively overridden.
- Split out ARMv4 parts and add a few more ARMv4 macros.
- Label blocks to make them easy to find in generated assembly.
- Fix MULT16_32_Q15() so we can pass make check.
  The MDCT test passes in values larger than 2**30 for b.
  The new version should be just as fast (or faster, since it's
   easier to merge the shift with following instructions), and
   there's no appreciable impact on accuracy (FFT/MDCT SNR actually
   goes up in most cases).
- Fix register constraints.
  We were using early-clobber flags in a bunch of places that
   didn't need them, and commutative-pair flags in a bunch of
   places that weren't actually commutative.
  This was Jean-Marc's fault (the original code came from Speex).
- Simplify silk_CLZ16().
- Port over iFFT C_MULC asm by Andree Buschmann
   <AndreeBuschmann@t-online.de> from Rockbox.
- Speed up the C_MULC asm by using LDRD, allowing more flexible
   addressing, re-ordering instructions to avoid some stalls,
   allowing more flexible register allocation, and getting things
   out of the inline asm block so the compiler can schedule them
   better.
- Add C_MUL and C_MUL4 asm for the FFT to the encoder, based on the
   new C_MULC.
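
The MULT16_32_Q15() item above mentions values of b larger than 2**30 and a shift that can be merged with following instructions. The sketch below shows one formulation with those properties, assumed here purely for illustration: take the high word of the product, (a*b) >> 16, and shift it left by one afterwards. The function names are hypothetical and the code is portable C, not the commit's inline asm.

```c
#include <stdint.h>

/* Exact reference: full product, then shift down by 15. */
static inline int32_t mult16_32_q15_exact(int16_t a, int32_t b)
{
  return (int32_t)(((int64_t)a * b) >> 15);
}

/* "High word, then shift" form: (a*b) >> 16 is what SMULL leaves in RdHi
   when a sits in the top half of a register. Shifting that left by one
   gives a Q15 result whose lowest bit is always zero. */
static inline int32_t mult16_32_q15_hiword(int16_t a, int32_t b)
{
  int32_t hi = (int32_t)(((int64_t)a * b) >> 16);
  /* Shift through unsigned to avoid left-shifting a negative value. */
  return (int32_t)((uint32_t)hi << 1);
}
```

For in-range results the two functions differ by at most one unit in the last place, because the second form only drops the lowest bit of the exact result.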

In total, this patch gives a 22.3% speed-up on test_opus_encoder on
 a 600 MHz Cortex A8 using gcc 4.2.1.
When restricted to ARMv4 optimizations, it gives a 9.6% speed-up
 on the same processor/compiler.
On the conformance test vectors:
 Average mono quality is 97.0583 %
 Average stereo quality is 97.775 %

Aug 21, 2012
- Replace long long in celt/ with opus_int64. · 5685bd31
  Gregory Maxwell authored 12 years ago
  
May 16, 2012
- Revert "Adds 3rd clause to CELT license" · 88f22f2d
  Jean-Marc Valin authored 12 years ago
  
  This reverts commit 9f407afa.
Apr 24, 2012
- Changes all uses of SHR()/SHL() macros to SHR32()/SHL32() · c82cd062
  Jean-Marc Valin authored 12 years ago
  
- Adds 3rd clause to CELT license · 9f407afa
  Jean-Marc Valin authored 12 years ago
  
Oct 27, 2011

Convert tabs to spaces in the opus and celt code. · da025d56

Ralph Giles authored 13 years ago

Also reformat some, but by no means all, of the opus
code for line length and three-character indents.

Sep 14, 2011
- renames the libcelt/ directory to celt/ · c3749909
  Jean-Marc Valin authored 13 years ago
  
Aug 15, 2011
- Respect the ANSI C89 maximum line length. · 5cfa0a0e
  Gregory Maxwell authored 13 years ago
  
- kiss fft cleanup · 2e78b276
  Jean-Marc Valin authored 13 years ago
  
Aug 02, 2011
- Remove many unused defines and convert some double constants to float. · 662587d9
  Gregory Maxwell authored 13 years ago
  
Jul 31, 2011
- Correct many whitespace errors under libcelt/ and remove non-ascii characters from the source. · 71d39ad8
  Gregory Maxwell authored 13 years ago and Jean-Marc Valin committed 13 years ago
Jul 29, 2011
- Renamed celt_[u]int* to opus_[u]int* · d77d6a58
  Jean-Marc Valin authored 13 years ago
  
Feb 10, 2011
- Relicensing under the simplified (2-clause) BSD license · 3806c1d7
  Jean-Marc Valin authored 14 years ago
  
  Got authorization from all copyright holders
Aug 25, 2010
- Updating dump_modes to include the MDCT and FFT. More work needed. · 24eef149
  Jean-Marc Valin authored 14 years ago
  
- FFT cleanup · 3fc0aada
  Jean-Marc Valin authored 14 years ago
  
Aug 03, 2010
- DOUBLE_PRECISION and MIXED_PRECISION no longer need to be defined · f81a60ca
  Jean-Marc Valin authored 14 years ago
  
Jul 09, 2010
- 16-bit bitrev table · 41a5593c
  Jean-Marc Valin authored 14 years ago
  
- Sharing of the twiddles across multiple FFTs · 6c5816ea
  Jean-Marc Valin authored 14 years ago
  
Apr 17, 2010
- Converted a few double-precision constants to single precision · 628c0253
  Jean-Marc Valin authored 14 years ago
  
Oct 17, 2009

Changed all the celt*int*_t types to remove the _t suffix, which is reserved by POSIX. · 30f7f813

Jean-Marc Valin authored 15 years ago

The other _t types that are not part of the API are still there
for now. Also, got rid of all that was left of the 64-bit types.

Sep 16, 2008
- Better use of the arithmetic operators · 1dab60cc
  Jean-Marc Valin authored 16 years ago
  
Sep 13, 2008
- Generate slightly more accurate WMOPS figures · 453ccd82
  Jean-Marc Valin authored 16 years ago
  
Mar 15, 2008
- Making sure not to use the C library calls directly · c7e0b76c
  Jean-Marc Valin authored 17 years ago
  
Mar 12, 2008
- Using reciprocal approximation instead of full 32-bit division in alg_quant() · d857ac48
  Jean-Marc Valin authored 17 years ago
  
Mar 05, 2008

fixed-point: changed find_spectral_pitch() to use single-precision (16-bit) FFT. · f93747c4

Jean-Marc Valin authored 17 years ago

This involved adding kfft_single.[ch] that redefines kiss_fft a second time
with a different prefix. All this is still a bit of a mess now. The mask
had to be converted to 16-bit input, but we're still using floats to apply it.

Feb 29, 2008
- fixed-point: converted intra prediction and folding, unb0rked mixed-precision · 877b1975
  Jean-Marc Valin authored 17 years ago
  
- fixed-point: overflow debugging now works again. · 2aaa0fee
  Jean-Marc Valin authored 17 years ago
  
Feb 27, 2008
- fixed-point: Moved sqrt and cos approximations to mathops.h · 3ca9b1d2
  Jean-Marc Valin authored 17 years ago
  
Feb 25, 2008
- fixed-point: initial support for using the fixed-point MDCT (rest is still all float) · 49ca99ef
  Jean-Marc Valin authored 17 years ago
Feb 24, 2008
- Float FFT now does the same scaling as the fixed-point FFT · 44830b04
  Jean-Marc Valin authored 17 years ago
  
- minor tweak to FFT · e8b6830f
  Jean-Marc Valin authored 17 years ago
  
- Added a mixed-precision version of the FFT with 32-bit data and 16-bit twiddles. · d911bc4d
  Jean-Marc Valin authored 17 years ago
  
- Created a separate kiss_twiddle_cpx type to make it possible to use different precision for twiddles and data. · 9ced5d04
  Jean-Marc Valin authored 17 years ago
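
The two entries above describe 32-bit data combined with 16-bit twiddles through a separate twiddle type. A minimal sketch of what such a split can look like follows; the type and function names (`cpx32_sketch`, `twiddle16_sketch`, `c_mul_sketch`, `c_mulc_sketch`) are hypothetical stand-ins, and Q15 twiddles are an assumption for the example rather than a statement about the actual source.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the data and twiddle complex types. */
typedef struct { int32_t r, i; } cpx32_sketch;     /* 32-bit data */
typedef struct { int16_t r, i; } twiddle16_sketch; /* 16-bit Q15 twiddles */

/* Scalar multiply of 32-bit data by a Q15 twiddle component. */
static inline int32_t s_mul_sketch(int32_t x, int16_t t)
{
  return (int32_t)(((int64_t)t * x) >> 15);
}

/* m = a * b */
static inline cpx32_sketch c_mul_sketch(cpx32_sketch a, twiddle16_sketch b)
{
  cpx32_sketch m;
  m.r = s_mul_sketch(a.r, b.r) - s_mul_sketch(a.i, b.i);
  m.i = s_mul_sketch(a.r, b.i) + s_mul_sketch(a.i, b.r);
  return m;
}

/* m = a * conj(b), as an inverse transform would use with shared twiddles. */
static inline cpx32_sketch c_mulc_sketch(cpx32_sketch a, twiddle16_sketch b)
{
  cpx32_sketch m;
  m.r = s_mul_sketch(a.r, b.r) + s_mul_sketch(a.i, b.i);
  m.i = s_mul_sketch(a.i, b.r) - s_mul_sketch(a.r, b.i);
  return m;
}
```

Keeping the twiddle type distinct from the data type lets the twiddle table stay at 16 bits per component while the data path keeps 32 bits of precision.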
Feb 22, 2008
- Fixed the FFT for higher precision · 25649c15
  Jean-Marc Valin authored 17 years ago
  
- Fixed stuff that got broken during the forward-backward split of the FFT · af8402e0
  Jean-Marc Valin authored 17 years ago
  
Feb 08, 2008
- Split the radix functions into forward and backward versions, removed the "inverse" flag from the state so it can be shared between the forward and inverse transforms. · 6211c90d
  Jean-Marc Valin authored 17 years ago
- Made pre-computed twiddles the same for forward and inverse FFT · d7dfb008
  Jean-Marc Valin authored 17 years ago
  
Feb 07, 2008
- Real FFT cleanup, plus some testcases · e6586d21
  Jean-Marc Valin authored 17 years ago
  
- Now using an MDCT implementation I can actually understand. · 4d0a7d0f
  Jean-Marc Valin authored 17 years ago
  