- Nov 28, 2023
-
-
Jean-Marc Valin authored
Finished adversarial training on 800k model. Also, move weights to a new location.
-
Jean-Marc Valin authored
1) Enable asm/intrinsics even for floating-point. 2) Make sure ARMv8 asimd enables EDSP/MEDIA/Neon. 3) Add dotp architecture to rtcd table since AArch *can* have dotp.
-
- Nov 27, 2023
-
-
Jean-Marc Valin authored
Adds RTCD tables for compute_activation() and compute_conv2d()
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
Avoids having to write intrinsics for simple loops
-
Jean-Marc Valin authored
Enabling only on platforms that have been tested just in case we run into a non-IEEE754 platform where they would break.
-
Jean-Marc Valin authored
-
- Nov 26, 2023
-
-
Jean-Marc Valin authored
Still missing some intrinsics
-
Jean-Marc Valin authored
-
- Nov 25, 2023
-
-
Jean-Marc Valin authored
Used for DNN matrix multiplies
-
- Nov 24, 2023
-
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
Code moved to compute_frame_features()
-
Jean-Marc Valin authored
-
- Nov 21, 2023
-
-
Jean-Marc Valin authored
For cmake, force PRESUME_SSE4_1 on PRESUME_AVX2
-
Jean-Marc Valin authored
aka banging on it until it builds on my machine. Further improvements welcome
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
Not yet with rtcd
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
DRED will absorb the bitrate variation
-
- Nov 20, 2023
-
-
Jean-Marc Valin authored
Fixes warnings, undefined behaviour, and check-asm failure
-
The optimization is bit-exact with the C function. It speeds up the SILK encoder (floating point) as follows. AMD Zen: complexity 0-5: 0%; complexity 6-7: 3-7%; complexity 8-10: 8-15%. Intel Skylake: complexity 0-5: 0%; complexity 6-7: 14-18%; complexity 8-10: 17-22%. Adapted by Jean-Marc Valin.
-
Jean-Marc Valin authored
Not hooked up
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
Reducing dependency chains
-
Jean-Marc Valin authored
-
- Nov 18, 2023
-
-
Jean-Marc Valin authored
Reducing the dependency chain between tmp1 and tmp2 at the cost of an extra multiply.
-
Jean-Marc Valin authored
-
- Nov 17, 2023
-
-
Jean-Marc Valin authored
Should never occur on amd64, but it could on 32-bit x86
-
Jean-Marc Valin authored
No RTCD yet
-
Jean-Marc Valin authored
-
- Nov 16, 2023
-
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
800k parameters, 600 MFLOPS, with a receptive field of 3 feature vectors
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
-
Jean-Marc Valin authored
Starting with compute_linear()
-
- Nov 15, 2023
-
-
Jean-Marc Valin authored
Saves ~270 kB of weights in the decoder
-
- Nov 11, 2023
-
-
Jean-Marc Valin authored
-
- Nov 08, 2023
-
-
Jean-Marc Valin authored
-