Commits · 59dc75fa9713d6543bbb85fe83cb56555513a4de · Xiph.Org / Opus

Feb 23, 2024

Rework 32-bit SSE loads yet again. · 59dc75fa

Timothy B. Terriberry authored 1 year ago and Jean-Marc Valin committed 1 year ago

The existing code in vec_avx.h produced
  warning: dereferencing type-punned pointer will break
   strict-aliasing rules
 with gcc 6.4.0.
We already had a macro to work around this within the rules of the
 C standard, but trying to use that here does not get optimized
 into a single MOVD like we were hoping.
Replacing it with memcpy() instead does get optimized correctly,
 but requires switching from a macro to an inline function in order
 to be able to declare a local variable and return a value.
We already have such an inline function in NSQ_del_dec_avx2.c, so
 hoist that out and use it everywhere, and then convert vec_avx.h
 to use it also.

59dc75fa
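
A minimal sketch of the memcpy()-based load that the message above describes. The helper names `load_i32_unaligned` and `load_32_sse` are hypothetical and are not the names used in the Opus tree. The SSE2 part is guarded so the snippet still compiles on other targets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Read 32 bits from an arbitrarily aligned address without type punning.
   memcpy() is the aliasing-safe way to do this, and compilers typically
   reduce a fixed-size copy like this one to a single load. */
static inline int32_t load_i32_unaligned(const void *p)
{
    int32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return v;
}

#if defined(__SSE2__)
/* Put those 32 bits in the low lane of an SSE register (upper lanes zero).
   A local variable and a return value are needed, hence a function
   rather than a macro. */
static inline __m128i load_32_sse(const void *p)
{
    return _mm_cvtsi32_si128(load_i32_unaligned(p));
}
#endif
```

Whether the pair actually collapses to one MOVD depends on the compiler and optimization level, so it is worth checking the generated assembly.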

Add Deep PLC/DRED/OSCE to random tests · 1186fb8e
Jean-Marc Valin authored 1 year ago
```
Also, remove -march=native because of AVX512VNNI and valgrind
```
1186fb8e

Feb 22, 2024

Fix build on ARMv7 · 6673e34b

Jean-Marc Valin authored 1 year ago

Fixes regression in 833688e6.
vcgez_s16() is A64-only, but vcge_s16(..., vdup_n_s16(0)) works
everywhere.

6673e34b
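
The substitution named in the message above can be sketched as follows. The wrapper `ge_zero_mask_s16x4` is a hypothetical helper for illustration only. The scalar branch is a model of the same lane-wise comparison so the snippet also builds without NEON.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Per-lane "x >= 0" mask for four int16 values:
   0xFFFF where the lane is non-negative, 0 otherwise. */
static void ge_zero_mask_s16x4(const int16_t in[4], uint16_t out[4])
{
    assert(in != 0 && out != 0);
#if defined(__ARM_NEON)
    /* vcge_s16() against a zero vector works on ARMv7 and AArch64,
       unlike vcgez_s16(), which is A64-only. */
    int16x4_t v = vld1_s16(in);
    uint16x4_t m = vcge_s16(v, vdup_n_s16(0));
    vst1_u16(out, m);
#else
    /* Scalar model of the same comparison for non-NEON builds. */
    for (int i = 0; i < 4; i++)
        out[i] = (in[i] >= 0) ? 0xFFFFu : 0u;
#endif
}
```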

Fix AVX2 detection · f1fc944b
Jean-Marc Valin authored 1 year ago
```
broken in 9cf12e92
```
f1fc944b
Bump DRED experimental version for 3e2a6b62 · cf4e3a15
Jean-Marc Valin authored 1 year ago

cf4e3a15

Add signaling for a maximum DRED quantizer. · 3e2a6b62

Timothy B. Terriberry authored 1 year ago and Jean-Marc Valin committed 1 year ago

Since any value of dQ > 0 will cause the initial quantizer to
 degrade to the format-implied maximum (15) with a sufficient
 number of DRED frames, allow signaling a maximum smaller than 15.
This allows encoders to improve the minimum quality of long DRED
 sequences (at the expense of bitrate) without requiring a constant
 quantizer for all frames (dQ == 0).

3e2a6b62
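
To illustrate the effect of a signaled maximum, here is a simplified model. It assumes a linear ramp in 1/16 steps per DRED frame; the function name, the `slope_q4` parameter, and the ramp formula are assumptions made for this sketch and are not taken from the Opus source.

```c
#include <assert.h>

/* Hypothetical model: the quantizer starts at q0, grows by slope_q4/16
   per DRED frame, and is clamped to qmax (which may be below 15). */
static int dred_quantizer_sketch(int q0, int slope_q4, int qmax, int frame)
{
    int q;
    assert(q0 >= 0 && slope_q4 >= 0 && frame >= 0);
    assert(qmax >= q0 && qmax <= 15);
    q = q0 + (slope_q4 * frame) / 16;
    return q > qmax ? qmax : q;
}
```

With `qmax` set to 15, any positive slope eventually reaches 15 on a long enough sequence. A smaller `qmax` bounds the coarsest quantizer while still letting early frames start at `q0`.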

Fix Doxygen warnings. · 2fff6437
Timothy B. Terriberry authored 1 year ago and Jean-Marc Valin committed 1 year ago

2fff6437
Remove some dead code. · 950d8bf1
Timothy B. Terriberry authored 1 year ago

950d8bf1

Improve AVX2 compiler support detection. · 9cf12e92

Timothy B. Terriberry authored 1 year ago

Commit 735c4070 added uses of intrinsics that require at least
 gcc 9.0 (cf. <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78782>),
 even though AVX2 support may appear to be available in earlier gcc
 versions.
We were not testing for this.
Update the compiler test in configure.ac to use these intrinsics
 explicitly, so it will error out and disable AVX2 if they are not
 available.

9cf12e92

Feb 21, 2024
- bit-exact overflow fixes in silk/arm/NSQ_del_dec_neon_intr.c · 833688e6
  Jan Buethe authored 1 year ago
  
  833688e6
- opus_dred_parse() sets dred_end to 0 when no DRED · 57901a67
  Jean-Marc Valin authored 1 year ago
  
  Also, fix documentation about return value of zero.
  57901a67
Feb 20, 2024
- Update weight-shrinking script · 6ac0c871
  Jean-Marc Valin authored 1 year ago
  
  6ac0c871
- Fixes an aliasing bug in opus_packet_pad() · d9d0e729
  Jean-Marc Valin authored 1 year ago
  
  Trying to add padding in place breaks when we have extensions, because it causes a memcpy() with overlapping data. Make a copy instead.
  d9d0e729
- Add missing RESTORE_STACK in tests · ecc10d83
  Jean-Marc Valin authored 1 year ago
  
  Silences NONTHREADSAFE_PSEUDOSTACK warnings
  ecc10d83
- Fix NONTHREADSAFE_PSEUDOSTACK · 001820bb
  Jean-Marc Valin authored 1 year ago
  
  001820bb
- Silences gcc warning · 512e6270
  Jean-Marc Valin authored 1 year ago
  
  warning: expression does not compute the number of elements in this array. gcc seems to think we are trying to compute the number of elements in the array, and it suggests adding parentheses to silence the warning.
  512e6270
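
For the opus_packet_pad() fix (d9d0e729) in the list above, the underlying hazard is that memcpy() has undefined behavior when source and destination overlap. A minimal sketch of the copy-first approach follows. The function `shift_payload_via_copy` is hypothetical and is not the code in the Opus tree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Move the first len bytes of buf to offset shift within the same buffer.
   Source and destination overlap when shift < len, so a direct memcpy()
   would be undefined behavior. Copy through a scratch buffer instead.
   The caller must ensure buf holds at least shift + len bytes.
   Returns 0 on success, -1 if the scratch allocation fails. */
static int shift_payload_via_copy(unsigned char *buf, size_t len, size_t shift)
{
    unsigned char *tmp;
    assert(buf != NULL);
    if (len == 0)
        return 0;
    tmp = malloc(len);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, buf, len);         /* tmp and buf never overlap */
    memcpy(buf + shift, tmp, len); /* tmp and buf never overlap */
    free(tmp);
    return 0;
}
```

memmove() is the other standard option for a plain overlapping move. A separate copy is simpler when the data is also being rearranged while it is written.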
Feb 18, 2024
- Remove trailing whitespace · b75bd48d
  Jean-Marc Valin authored 1 year ago
  
  b75bd48d
- Instructions for reusing loss simulator · 5eeb5766
  Jean-Marc Valin authored 1 year ago
  
  5eeb5766
Feb 17, 2024

Add lossgen_demo · 393d463f

Jean-Marc Valin authored 1 year ago

Also skip the first loss values being generated since they're
biased towards "not lost" due to the initialization.

393d463f
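
Discarding the first outputs of a stateful generator is a common warm-up technique. The sketch below uses a two-state Markov model driven by a small LCG as a stand-in; the real loss generator is not shown in this log, and every name and probability here is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t rng; /* LCG state */
    int lost;     /* 1 if the previous packet was lost */
} LossSim;

/* Small LCG; the low bits are weak, so return the upper 24 bits. */
static uint32_t lcg_next(uint32_t *s)
{
    *s = *s * 1664525u + 1013904223u;
    return *s >> 8;
}

/* Illustrative numbers: 5% chance to enter a loss burst, 60% to stay in one. */
static int loss_sim_step(LossSim *st)
{
    uint32_t r = lcg_next(&st->rng) % 100u;
    st->lost = st->lost ? (r < 60u) : (r < 5u);
    return st->lost;
}

/* The state always starts as "not lost", which biases the first outputs,
   so run and discard warmup steps before handing the simulator out. */
static void loss_sim_init(LossSim *st, uint32_t seed, int warmup)
{
    assert(st != 0 && warmup >= 0);
    st->rng = seed;
    st->lost = 0;
    for (int i = 0; i < warmup; i++)
        (void)loss_sim_step(st);
}
```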

Feb 16, 2024
- meson: Increase slow tests timeout · a97151d3
  Xavier Claessens authored 1 year ago and Jean-Marc Valin committed 1 year ago
  
  They time out on GitHub Actions because those runners are slower.
  a97151d3
- dump_modes: add missing file to build · 8894546b
  Giovanni Bajo authored 1 year ago and Jean-Marc Valin committed 1 year ago
  
  Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
  8894546b
- Map 2 extra channels in 5th order HOA · 5e0bb53e
  Chris Hold authored 1 year ago and Jean-Marc Valin committed 1 year ago
  
  Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
  5e0bb53e
- Provide 4th order HOA map 3 mixing and demixing · 965afac2
  Chris Hold authored 1 year ago and Jean-Marc Valin committed 1 year ago
  
  Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
  965afac2
- fixed compiler warning when building without dred · befc25fb
  Jan Buethe authored 1 year ago
  
  befc25fb
- Delaying new DRED data when just out of silence · db78df8c
  Jean-Marc Valin authored 1 year ago
  
  We don't need redundancy for the first active frame since we already have the main Opus payload.
  db78df8c
- Add dred_end return value to opus_dred_parse() · c5117c5c
  Jean-Marc Valin authored 1 year ago
  
  c5117c5c
- Support for extra offset · 1f53f1e0
  Jean-Marc Valin authored 1 year ago
  
  Allows us to exclude the most recent silence from DRED
  1f53f1e0
- Refactoring: store all states · 183a8202
  Jean-Marc Valin authored 1 year ago
  
  183a8202
- Chopping the oldest silence in a DRED payload · 9f36bfc9
  Jean-Marc Valin authored 1 year ago
  
  9f36bfc9
Feb 15, 2024
- Fix missing dotprod optimization · 9b1da1fb
  Jean-Marc Valin authored 1 year ago
  
  Use the neon version of silk_noise_shape_quantizer_short_prediction()
  9b1da1fb
- hangover fix in osce/utils/pitch.py · 367a487e
  Jan Buethe authored 1 year ago
  
  367a487e
- re-dumped osce models with sparse=False · 46f9c9c6
  Jan Buethe authored 1 year ago
  
  46f9c9c6
- disabled sparse option in osce export script · 735117b6
  Jan Buethe authored 1 year ago
  
  735117b6
- Provide 5th order HOA map 3 mixing and demixing · ffd1b0b1
  Chris Hold authored 1 year ago and Jean-Marc Valin committed 1 year ago
  
  Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>
  ffd1b0b1
Feb 14, 2024
- More #ifdef around the dnn code · 6be3673c
  Jean-Marc Valin authored 1 year ago
  
  6be3673c
- Conditional include · 901c8548
  Jean-Marc Valin authored 1 year ago
  
  Thanks to Igor Palaguta for reporting the issue. https://github.com/xiph/opus/issues/313
  901c8548
- updated model · dd0e2dc3
  Jan Buethe authored 1 year ago
  
  dd0e2dc3
Feb 11, 2024
- Fix check-asm for celt_fir_sse4_1() · 4b9d9b00
  Jean-Marc Valin authored 1 year ago
  
  4b9d9b00
Feb 10, 2024

Fix OOB read in fixed-point NEON intrinsics. · 3e69410e

Timothy B. Terriberry authored 1 year ago and Jean-Marc Valin committed 1 year ago


xcorr_kernel_neon_fixed() read one more sample from y[] in the
 main loop than it needed to allow use of vector loads, but unlike
 the native asm in celt_pitch_xcorr_arm.s, the loop condition did
 not exit early enough to prevent this from overrunning the end of
 the array.
Additionally, the tail loop _always_ read one value beyond what it
 needed.

This patch fixes the loop condition on the main loop.
Since this makes the tail section run even for lengths that are a
 multiple of 8 (e.g., on fully half the multiplies for usages like
 celt_fir() or celt_iir() with an order of 16, which is common),
 rather than try to fix the tail loop, we replace it with a
 non-looping adaptation of the native asm, which continues to use
 vector loads as much as possible for the remaining elements (and
 also does not read ahead past the end of the y[] array).

Overall slowdown of test_opus_encode on a Raspberry Pi 5 Model B
 Rev 1.0 is 0.12% vs. 0.13% for fixing the existing tail loop.

Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>

3e69410e
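
The loop-condition issue described above can be illustrated with a scalar model. This is not the NEON kernel: the block size of 4, the function names, and the 64-bit accumulators are assumptions for the sketch. The kernel computes four correlation lags, so y[] must hold len+3 samples. The blocked loop loads 8 y samples per pass (one more than the 7 it needs), so it has to stop one sample earlier than a naive bound would suggest.

```c
#include <assert.h>

/* Reference: sum[k] += x[j]*y[j+k] for k = 0..3.
   y must hold len + 3 samples. */
static void xcorr4_ref(const short *x, const short *y, long long sum[4], int len)
{
    for (int j = 0; j < len; j++)
        for (int k = 0; k < 4; k++)
            sum[k] += (long long)x[j] * y[j + k];
}

/* Blocked variant: each main-loop pass consumes 4 x samples but loads
   y[j..j+7], which stands in for two 4-lane vector loads. */
static void xcorr4_blocked(const short *x, const short *y, long long sum[4], int len)
{
    int j = 0;
    assert(len > 0);
    /* y[j+7] must be valid: j + 7 <= len + 2, i.e. len - j >= 5.
       Testing "len - j >= 4" would read one sample past the end of y
       when exactly 4 samples remain. */
    while (len - j >= 5) {
        short yy[8];
        for (int t = 0; t < 8; t++)
            yy[t] = y[j + t];
        for (int i = 0; i < 4; i++)
            for (int k = 0; k < 4; k++)
                sum[k] += (long long)x[j + i] * yy[i + k];
        j += 4;
    }
    /* Tail: 1 to 4 samples remain; touches only y[j .. len+2]. */
    for (; j < len; j++)
        for (int k = 0; k < 4; k++)
            sum[k] += (long long)x[j] * y[j + k];
}
```

With the corrected bound, a length that is a multiple of the block size still falls through to the tail, which is why the cost of the tail matters.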

Add check-asm for fixed-point xcorr_kernel(). · d5031251

Timothy B. Terriberry authored 1 year ago and Jean-Marc Valin committed 1 year ago


Compare the output of xcorr_kernel() against the results of
 xcorr_kernel_c() when configured with --enable-check-asm.
Currently this is only checked in fixed point, as a float check
 requires more sophisticated error analysis and may need to be
 customized for each vector implementation.

Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>

d5031251
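
The general shape of such a check is to run the optimized routine, run the C reference on the same input, and assert that the results match. The sketch below uses a dot product with hypothetical names (`dot_c`, `dot_opt`, `dot_checked`); it is not the Opus check-asm code.

```c
#include <assert.h>

/* Reference C version: 16-bit inputs, 32-bit accumulator. */
static int dot_c(const short *x, const short *y, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc += x[i] * y[i];
    return acc;
}

/* Stand-in for an optimized version: two accumulators, unrolled by two. */
static int dot_opt(const short *x, const short *y, int n)
{
    int acc0 = 0, acc1 = 0, i;
    for (i = 0; i + 1 < n; i += 2) {
        acc0 += x[i] * y[i];
        acc1 += x[i + 1] * y[i + 1];
    }
    if (i < n)
        acc0 += x[i] * y[i];
    return acc0 + acc1;
}

/* Checked wrapper: integer results must be bit-exact. */
static int dot_checked(const short *x, const short *y, int n)
{
    int r = dot_opt(x, y, n);
    assert(r == dot_c(x, y, n));
    return r;
}
```

Integer addition gives the same result in any order as long as nothing overflows, so exact equality is a valid test here. In floating point, reordering the sums changes the rounding, which is why a float check needs an error tolerance suited to each vector implementation.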