1. 31 Jan, 2014 1 commit
      Add a fast shift for int64 values. · 4618512d
      Erik de Castro Lopo authored
      This patch changes the code from:
      	(FLAC__int32)(xmm.m128i_i64[0] >> lp_quantization)
      into:
      	_mm_cvtsi128_si32(_mm_srli_epi64(xmm, lp_quantization));
      
      Encoding of 24-bit .wav files with 32-bit FLAC became noticeably faster.
      
      Patch-from: lvqcl <lvqcl.mail@gmail.com>
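The substitution described in this commit message can be sketched as below. This is a minimal illustration, not the FLAC source: the function names `shift_scalar`, `shift_sse2`, and `self_check` are hypothetical, and the portable fallback branch is an assumption added so the sketch builds on non-x86 targets. `_mm_srli_epi64` is a logical shift, while `>>` on a signed 64-bit value is an arithmetic shift on mainstream compilers. The two agree here because only the low 32 bits of the result are kept and the shift count is assumed to be at most 32.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Scalar form: shift the 64-bit accumulator, then truncate to 32 bits.
 * Assumes the compiler implements >> on negative values as an arithmetic
 * shift and wraps on narrowing (true for GCC, Clang, and MSVC). */
static int32_t shift_scalar(int64_t acc, int shift)
{
    return (int32_t)(acc >> shift);
}

/* SSE2 form: logical 64-bit shift inside the XMM register, then extract
 * the low 32 bits. For 0 <= shift <= 32 the low 32 bits match the
 * arithmetic shift, because the bits that differ sit above bit 31. */
static int32_t shift_sse2(int64_t acc, int shift)
{
#if HAVE_SSE2
    __m128i xmm = _mm_loadl_epi64((const __m128i *)&acc);
    return _mm_cvtsi128_si32(_mm_srli_epi64(xmm, shift));
#else
    /* Assumed fallback for targets without SSE2: same logical shift. */
    return (int32_t)(uint32_t)((uint64_t)acc >> shift);
#endif
}

/* Hypothetical self-check comparing both forms on a few values. */
static void self_check(void)
{
    const int64_t samples[] = { 0, 1, -1, -1000, (int64_t)1 << 40,
                                -123456789012345LL };
    for (int s = 0; s <= 32; s++)
        for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
            assert(shift_scalar(samples[i], s) == shift_sse2(samples[i], s));
}
```

On a 32-bit x86 build the scalar form needs a multi-instruction 64-bit shift, whereas the SSE2 form stays in one register, which is one plausible reason for the speedup the message reports.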
  2. 30 Jan, 2014 1 commit
  3. 03 Oct, 2013 1 commit
      Improve x86 intrinsic implementation. · ecd0acba
      Erik de Castro Lopo authored
      * Split lpc_x86intrin.c into lpc_intrin_sse.c and lpc_intrin_sse2.c
      * Add FLAC__lpc_compute_residual_from_qlp_coefficients_intrin_sse2()
        function to lpc_intrin_sse2.c
      * Add lpc_intrin_sse41.c with two ..._wide_intrin_sse41() functions
        (useful for 24-bit en-/decoding)
      * Add precompute_partition_info_sums_intrin_sse2() / ...ssse3() and
        disable precompute_partition_info_sums_32bit_asm_ia32_().
        The SSE2 version uses 4 SSE2 instructions in place of the single
        SSSE3 instruction PABSD, so it is slightly slower.
      
      Patch-from: lvqcl <lvqcl.mail@gmail.com>
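The last bullet above contrasts PABSD (SSSE3, exposed as `_mm_abs_epi32`) with an SSE2-only replacement. The commit message does not show the instruction sequence the patch uses, so the following is only one common way to emulate a packed 32-bit absolute value with SSE2: a sign mask, an XOR, and a subtract, plus the register copy the compiler typically emits. The names `abs4_scalar`, `abs4_sse2`, and `self_check_abs` are hypothetical, and the non-SSE2 fallback is an assumption for portability.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Scalar reference: absolute value with wraparound, so INT32_MIN maps to
 * 0x80000000, matching what PABSD produces for that input. */
static void abs4_scalar(const int32_t in[4], uint32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = in[i] < 0 ? 0u - (uint32_t)in[i] : (uint32_t)in[i];
}

/* SSE2 emulation of a packed 32-bit absolute value (assumed sequence). */
static void abs4_sse2(const int32_t in[4], uint32_t out[4])
{
#if HAVE_SSE2
    __m128i x    = _mm_loadu_si128((const __m128i *)in);
    __m128i mask = _mm_srai_epi32(x, 31);         /* 0 or -1 in each lane */
    __m128i r    = _mm_sub_epi32(_mm_xor_si128(x, mask), mask);
    _mm_storeu_si128((__m128i *)out, r);          /* (x ^ mask) - mask    */
#else
    abs4_scalar(in, out);                         /* assumed fallback     */
#endif
}

/* Hypothetical self-check comparing both versions lane by lane. */
static void self_check_abs(void)
{
    const int32_t in[4] = { 5, -7, 0, -2147483647 - 1 };
    uint32_t a[4], b[4];
    abs4_scalar(in, a);
    abs4_sse2(in, b);
    for (int i = 0; i < 4; i++)
        assert(a[i] == b[i]);
}
```

With SSSE3 available, the mask, XOR, and subtract collapse into a single `_mm_abs_epi32(x)` call, which is consistent with the small speed difference the message mentions.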