Opus merge requests
https://gitlab.xiph.org/xiph/opus/-/merge_requests
Updated: 2024-03-14T18:09:58Z

https://gitlab.xiph.org/xiph/opus/-/merge_requests/114
Fix _mm_loadu_si32 detection for vendored Clang.
Updated: 2024-03-14T18:09:58Z
Author: Timothy B. Terriberry

Apple uses different __clang_major__ version numbers than upstream,
so our test did not work.
This caused compilation failures with, e.g., XCode 10.1, which
reports __clang_major__ as 10 despite being forked from upstream's
7.0 branch.
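
The failure mode described above can be illustrated with a minimal sketch. It is one possible guard, not necessarily the approach taken in this merge request. It assumes that upstream Clang provides _mm_loadu_si32() from version 8 onward (the description only establishes that a fork of the 7.0 branch lacks it), and it relies on the __apple_build_version__ macro that Apple's Clang defines to avoid trusting __clang_major__ there. The SKETCH_* macros and sketch_load_i32() are hypothetical names, and the code assumes a 32-bit int.

```c
#include <string.h>

/* Detect SSE2 so the sketch still compiles on non-x86 targets. */
#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
# include <emmintrin.h>
# define SKETCH_HAVE_SSE2 1
#else
# define SKETCH_HAVE_SSE2 0
#endif

/* Trust __clang_major__ only for non-Apple Clang: Apple's numbering does
   not track upstream (Xcode 10.1 reports 10 for a 7.0-based compiler).
   The threshold of 8 is an assumption of this sketch. */
#if SKETCH_HAVE_SSE2 && defined(__clang__) && \
    !defined(__apple_build_version__) && __clang_major__ >= 8
# define SKETCH_USE_LOADU_SI32 1
#else
# define SKETCH_USE_LOADU_SI32 0
#endif

/* Load 32 bits from a possibly unaligned address and return them. */
int sketch_load_i32(const void *p) {
#if SKETCH_USE_LOADU_SI32
  return _mm_cvtsi128_si32(_mm_loadu_si32(p));
#else
  int v;
  /* memcpy() is the portable way to express an unaligned load. */
  memcpy(&v, p, sizeof(v));
# if SKETCH_HAVE_SSE2
  /* Round-trip through a vector register, as SIMD code would. */
  return _mm_cvtsi128_si32(_mm_cvtsi32_si128(v));
# else
  return v;
# endif
#endif
}
```

On compilers where the version number cannot be trusted, the sketch simply takes the memcpy() path, which compilers typically reduce to a single load.
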
Fixes #2369
Assignee: Timothy B. Terriberry

https://gitlab.xiph.org/xiph/opus/-/merge_requests/113
Fix _mm_loadu_si32 detection for vendored Clang.
Updated: 2024-03-14T15:17:15Z
Author: Timothy B. Terriberry

Apple uses different __clang_major__ version numbers than upstream,
so our test did not work.
This caused compilation failures with, e.g., XCode 10.1, which
reports __clang_major__ as 10 despite being forked from upstream's
7.0 branch.
Fixes #2369
Assignee: Timothy B. Terriberry

https://gitlab.xiph.org/xiph/opus/-/merge_requests/111
Add Arm RTCD for FreeBSD.
Updated: 2024-03-10T17:41:15Z
Author: Timothy B. Terriberry

Thanks to Robert Clausecker <fuz@FreeBSD.org> for the patch.
Assignee: Timothy B. Terriberry

https://gitlab.xiph.org/xiph/opus/-/merge_requests/110
autotools: include all referenced READMEs
Updated: 2024-03-07T19:46:27Z
Author: Tristan Matthews

These were not being included in the tarball in spite of being referenced.

https://gitlab.xiph.org/xiph/opus/-/merge_requests/109
dnn: vec_neon: avoid redefinition of vcvtnq_s32_f32
Updated: 2024-03-05T21:31:50Z
Author: Tristan Matthews

clang exposes this intrinsic even in 32-bit mode, if targeting >= armv8,
whereas gcc does not, see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95399

https://gitlab.xiph.org/xiph/opus/-/merge_requests/108
autotools: include dnn/meson.build in tarball
Updated: 2024-03-04T17:01:46Z
Author: Tristan Matthews

https://gitlab.xiph.org/xiph/opus/-/merge_requests/107
add arm rtcd for apple
Updated: 2024-03-01T14:55:08Z
Author: Michael Klingbeil

https://gitlab.xiph.org/xiph/opus/-/merge_requests/106
add usage string for opus_demo dec_complexity
Updated: 2024-02-23T18:55:57Z
Author: Michael Klingbeil

https://gitlab.xiph.org/xiph/opus/-/merge_requests/105
Add signaling for a maximum DRED quantizer.
Updated: 2024-02-23T19:14:18Z
Author: Timothy B. Terriberry

Since any value of dQ > 0 will cause the initial quantizer to
degrade to the format-implied maximum (15) with a sufficient
number of DRED frames, allow signaling a maximum smaller than 15.
sequences (at the expense of bitrate) without requiring a constant
quantizer for all frames (dQ == 0).Timothy B. TerriberryTimothy B. Terriberryhttps://gitlab.xiph.org/xiph/opus/-/merge_requests/104Add signaling for a maximum DRED quantizer.2024-02-22T20:21:41ZTimothy B. TerriberryAdd signaling for a maximum DRED quantizer.Since any value of dQ > 0 will cause the initial quantizer to
degrade to the format-implied maximum (15) with a sufficient
number of DRED frames, allow signaling a maximum smaller than 15.
This allows encoders to improve the minimum qu...Since any value of dQ > 0 will cause the initial quantizer to
degrade to the format-implied maximum (15) with a sufficient
number of DRED frames, allow signaling a maximum smaller than 15.
This allows encoders to improve the minimum quality of long DRED
sequences (at the expense of bitrate) without requiring a constant
quantizer for all frames (dQ == 0).
Assignee: Timothy B. Terriberry

https://gitlab.xiph.org/xiph/opus/-/merge_requests/103
Add signaling for a maximum DRED quantizer.
Updated: 2024-02-22T15:05:54Z
Author: Timothy B. Terriberry

Since any value of dQ > 0 will cause the initial quantizer to
degrade to the format-implied maximum (15) with a sufficient
number of DRED frames, allow signaling a maximum smaller than 15.
This allows encoders to improve the minimum quality of long DRED
sequences (at the expense of bitrate) without requiring a constant
quantizer for all frames (dQ == 0).
Assignee: Timothy B. Terriberry

https://gitlab.xiph.org/xiph/opus/-/merge_requests/102
Add signaling for a maximum DRED quantizer.
Updated: 2024-02-22T15:12:27Z
Author: Timothy B. Terriberry

Since any value of dQ > 0 will cause the initial quantizer to
degrade to the format-implied maximum (15) with a sufficient
number of DRED frames, allow signaling a maximum smaller than 15.
This allows encoders to improve the minimum quality of long DRED
sequences (at the expense of bitrate) without requiring a constant
quantizer for all frames (dQ == 0).
Assignee: Timothy B. Terriberry

https://gitlab.xiph.org/xiph/opus/-/merge_requests/100
Fix OOB read in fixed-point NEON intrinsics.
Updated: 2024-02-10T02:00:50Z
Author: Timothy B. Terriberry

xcorr_kernel_neon_fixed() read one more sample from y[] in the
main loop than it needed to allow use of vector loads, but unlike
the native asm in celt_pitch_xcorr_arm.s, the loop condition did
not exit early enough to prevent this from overrunning the end of
the array.
Additionally, the tail loop *always* read one value beyond what it
needed.
This patch fixes the loop condition on the main loop.
Since this makes the tail section run even for lengths that are a
multiple of 8 (e.g., on fully half the multiplies for usages like
celt_fir() or celt_iir() with an order of 16, which is common),
rather than try to fix the tail loop, we replace it with a
non-looping adaptation of the native asm, which continues to use
vector loads as much as possible for the remaining elements (and
also does not read ahead past the end of the y[] array).
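
The loop-condition reasoning above can be modeled in scalar C. This is a sketch under stated assumptions, not the NEON code from the patch: it assumes a kernel that accumulates four lags, consumes 8 samples of x[] per main-loop iteration, and fetches 12 samples of y[] per iteration (one more than the 11 it uses), with y[] holding len+3 samples. The function names are hypothetical, and memcpy() stands in for the vector loads.

```c
#include <string.h>

/* Reference: sum[k] = sum over j < len of x[j]*y[j+k], for k = 0..3.
   y[] must hold len+3 samples. */
void sketch_xcorr4_ref(const short *x, const short *y, long sum[4], int len) {
  int j, k;
  for (k = 0; k < 4; k++) sum[k] = 0;
  for (j = 0; j < len; j++)
    for (k = 0; k < 4; k++) sum[k] += (long)x[j] * y[j + k];
}

/* Blocked model of the fixed kernel. */
void sketch_xcorr4_blocked(const short *x, const short *y, long sum[4],
                           int len) {
  short yv[12];
  int i, j = 0, k, rem;
  for (k = 0; k < 4; k++) sum[k] = 0;
  /* Fetching y[j..j+11] is in bounds only if j+12 <= len+3, i.e. strictly
     more than 8 samples of x[] remain. Using >= 8 here would overrun y[]
     whenever len is a multiple of 8. */
  while (len - j > 8) {
    memcpy(yv, y + j, sizeof(yv));
    for (i = 0; i < 8; i++)
      for (k = 0; k < 4; k++) sum[k] += (long)x[j + i] * yv[i + k];
    j += 8;
  }
  /* Tail: 1 to 8 samples remain; fetch exactly the rem+3 values needed. */
  rem = len - j;
  if (rem > 0) {
    memcpy(yv, y + j, (size_t)(rem + 3) * sizeof(yv[0]));
    for (i = 0; i < rem; i++)
      for (k = 0; k < 4; k++) sum[k] += (long)x[j + i] * yv[i + k];
  }
}
```

With the stricter condition, a length of 16 runs one main-loop iteration and leaves 8 samples for the tail, which matches the description's note that the tail now runs even for multiples of 8.
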
Overall slowdown of test_opus_encode on a Raspberry Pi 5 Model B
Rev 1.0 is 0.12% vs. 0.13% for fixing the existing tail loop.
Assignee: Timothy B. Terriberry

https://gitlab.xiph.org/xiph/opus/-/merge_requests/99
Fix OOB read in fixed-point NEON intrinsics.
Updated: 2024-02-10T01:38:44Z
Author: Timothy B. Terriberry

xcorr_kernel_neon_fixed() read one more sample from y[] in the
main loop than it needed to allow use of vector loads, but unlike
the native asm in celt_pitch_xcorr_arm.s, the loop condition did
not exit early enough to prevent this from overrunning the end of
the array.
Additionally, the tail loop _always_ read one value beyond what it
needed.
This patch fixes the loop condition on the main loop.
Since this makes the tail section run even for lengths that are a
multiple of 8 (e.g., on fully half the multiplies for usages like
celt_fir() or celt_iir() with an order of 16, which is common),
rather than try to fix the tail loop, we replace it with a
non-looping adaptation of the native asm, which continues to use
vector loads as much as possible for the remaining elements (and
also does not read ahead past the end of the y[] array).
Overall slowdown of test_opus_encode on a Raspberry Pi 5 Model B
Rev 1.0 is 0.12% vs. 0.13% for fixing the existing tail loop.
Assignee: Timothy B. Terriberry

https://gitlab.xiph.org/xiph/opus/-/merge_requests/98
Multiframe cleanup and fixes (version 3)
Updated: 2024-02-16T22:41:15Z
Author: Jean-Marc Valin

https://gitlab.xiph.org/xiph/opus/-/merge_requests/96
Initial DRED tuning
Updated: 2024-02-16T22:41:11Z
Author: Jean-Marc Valin

Adjust q0, qD and duration based on bitrate and loss.

https://gitlab.xiph.org/xiph/opus/-/merge_requests/95
Fix CBR issues, including multi-frame refactoring
Updated: 2024-02-16T22:41:07Z
Author: Jean-Marc Valin

https://gitlab.xiph.org/xiph/opus/-/merge_requests/94
Multi-frame refactoring
Updated: 2024-02-16T22:40:58Z
Author: Jean-Marc Valin
Assignee: Jean-Marc Valin

https://gitlab.xiph.org/xiph/opus/-/merge_requests/90
Adding RTCD for DNN code
Updated: 2023-11-21T19:27:08Z
Author: Jean-Marc Valin

Starting with compute_linear().

https://gitlab.xiph.org/xiph/opus/-/merge_requests/88
Add generic linear layer
Updated: 2023-11-21T19:27:27Z
Author: Jean-Marc Valin

Should be able to handle all previous GRU variants and more.