- Mar 02, 2024
-
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
- Feb 23, 2024
-
-
The existing code in vec_avx.h produced the warning "dereferencing type-punned pointer will break strict-aliasing rules" with gcc 6.4.0. We already had a macro to work around this within the rules of the C standard, but using it here does not get optimized into a single MOVD as we had hoped. Replacing it with memcpy() does get optimized correctly, but requires switching from a macro to an inline function so that it can declare a local variable and return a value. We already have such an inline function in NSQ_del_dec_avx2.c, so hoist that out and use it everywhere, then convert vec_avx.h to use it as well.
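The memcpy() approach described in this commit message can be sketched in portable C as follows. The name `load_i32_unaligned` is hypothetical (the message does not name the helper hoisted from NSQ_del_dec_avx2.c), and the idea that the result would then feed an intrinsic such as `_mm_cvtsi32_si128()` is an assumption made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 32 bits from an arbitrarily typed, possibly
 * unaligned address without breaking the strict-aliasing rules. */
static inline int32_t load_i32_unaligned(const void *p)
{
    int32_t v;                 /* the local variable a macro could not declare */
    memcpy(&v, p, sizeof(v));  /* optimizing compilers commonly fold this into one load */
    return v;
}
```

An inline function is used rather than a macro because the copy needs a named temporary and a return value, which is the constraint the commit message points out.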
-
- Nov 16, 2023
-
-
Jean-Marc Valin authored
Starting with compute_linear()
-
- Nov 11, 2023
-
-
Jean-Marc Valin authored
-
- Nov 03, 2023
-
-
Jean-Marc Valin authored
-
- Oct 30, 2023
-
-
Jean-Marc Valin authored
-
- Oct 20, 2023
-
-
Jean-Marc Valin authored
-
- Oct 07, 2023
-
-
Jean-Marc Valin authored
Those intrinsics don't actually require alignment so we're OK
-
- Sep 02, 2023
-
-
- Aug 01, 2023
-
-
Jean-Marc Valin authored
-
- Jul 22, 2023
-
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
Not so much for old machines, as for getting decent performance when not setting -march= (SSE2 is part of the amd64 ABI).
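As a rough illustration of why an SSE2 path helps when no -march= flag is given: compilers targeting amd64 predefine the SSE2 feature macro by default, so a compile-time check along these lines would pick a vectorized path in a default build. The function `vec_path_name` and the returned strings are hypothetical; the real header selects code paths, not names.

```c
/* Hypothetical sketch: report which SIMD path a build would select.
 * __AVX2__, __AVX__ and __SSE2__ are predefined by GCC/Clang when the
 * corresponding instruction set is enabled; _M_X64 is the MSVC x64 macro. */
static const char *vec_path_name(void)
{
#if defined(__AVX2__)
    return "avx2";
#elif defined(__AVX__)
    return "avx";
#elif defined(__SSE2__) || defined(__x86_64__) || defined(_M_X64)
    return "sse2";      /* always available on amd64, even without -march= */
#else
    return "generic";   /* plain C fallback */
#endif
}
```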
-
Jean-Marc Valin authored
-
- Jul 20, 2023
-
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
Should be able to handle all previous GRU variants and more.
-
- May 24, 2023
-
-
Depending on which defines are set, there are collisions with the ones in Opus. To avoid these errors, we rename the exp functions and macros.
Signed-off-by: Jean-Marc Valin <jmvalin@amazon.com>
-
-
- May 23, 2023
-
-
Signed-off-by: Jean-Marc Valin <jmvalin@amazon.com>
-
Signed-off-by: Jean-Marc Valin <jmvalin@amazon.com>
-
- Oct 21, 2022
-
-
Jan Buethe authored
-
- Jan 19, 2022
-
-
Jean-Marc Valin authored
-
- Jul 11, 2021
-
-
Jean-Marc Valin authored
-
- Jul 10, 2021
-
-
Jean-Marc Valin authored
AVX without AVX2 should now work again too.
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
The 4* is now stored in the table to avoid computing it in the loop
-
Jean-Marc Valin authored
Saves on the MDense/softmax computation since we only need to compute 8 values instead of 256.
-
- Jun 29, 2021
-
-
Jean-Marc Valin authored
Using rational function approximation for tanh() and sigmoid.
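To show what a rational function approximation of these activations can look like, here is a minimal scalar sketch. The coefficients are a well-known low-order choice picked for illustration and are not the ones used in this commit; the function names are hypothetical. The sigmoid is derived from tanh through the exact identity sigmoid(x) = 0.5 + 0.5*tanh(x/2).

```c
/* Illustrative rational approximation of tanh(): a ratio of two small
 * polynomials, clamped where it reaches +/-1 (at |x| = 3).
 * Coefficients are an assumption, not taken from the commit. */
static float tanh_rational(float x)
{
    float x2;
    if (x >= 3.f) return 1.f;
    if (x <= -3.f) return -1.f;
    x2 = x * x;
    return x * (27.f + x2) / (27.f + 9.f * x2);
}

/* sigmoid(x) = 0.5 + 0.5*tanh(x/2), so one approximation serves both. */
static float sigmoid_rational(float x)
{
    return 0.5f + 0.5f * tanh_rational(0.5f * x);
}
```

This particular form stays within roughly 0.03 of tanh() and needs only multiplies, adds and one division, which is the kind of operation mix that vectorizes well; a production version would normally use a higher-order fit.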
-
- Jun 24, 2021
-
-
Jean-Marc Valin authored
-
- Jun 21, 2021
-
-
Jean-Marc Valin authored
This isn't necessary since valid exponents can't flip the sign bit
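One plausible reading of this remark, sketched under assumptions: a fast exponential often builds its result by adding an integer exponent, shifted into the IEEE-754 exponent field, to the bits of a mantissa in [1, 2). The helper name `scale_by_pow2` and the exact ranges below are hypothetical and not taken from the commit. For a valid exponent the biased field stays between 1 and 254, so the addition never carries into bit 31 and no extra masking of the sign bit is needed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: multiply frac (assumed in [1, 2)) by 2^e, with e
 * assumed in [-126, 127], by adding e to the exponent field directly. */
static float scale_by_pow2(float frac, int e)
{
    uint32_t bits;
    memcpy(&bits, &frac, sizeof(bits));
    /* Exponent field of frac is 127; 127 + e lies in [1, 254], which fits
     * in bits 23..30, so the sign bit (bit 31) is left untouched. */
    bits += (uint32_t)e << 23;
    memcpy(&frac, &bits, sizeof(frac));
    return frac;
}
```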
-
- Jan 16, 2021
-
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-