- 09 Oct, 2014 1 commit
-
-
Deb Mukherjee authored
Uses highbd_ prefix convention consistently. Change-Id: I58f7f799a7ff8e32701bcd71c955bcf1cdd4581e
-
- 06 Oct, 2014 1 commit
-
-
Jingning Han authored
Bit-stream clarification related to Issue 868. Change-Id: I92a7bc5b7782c9ea5c3f6cceec761742183c9514
-
- 04 Oct, 2014 1 commit
-
-
Deb Mukherjee authored
Resolves a visual studio warning, and includes some cleanups. Change-Id: I6a7576ef323c475b7d1c659800cd82c6cb1fd18d
-
- 03 Oct, 2014 1 commit
-
-
Deb Mukherjee authored
Incorporates the WRAPLOW macro into the non-highbitdepth transforms to aid hardware verification between a software C model and an intended hardware implementation through the use of the configure options: --enable-experimental --enable-emulate-hardware. Note that to avoid further discrepancies between the sse/sse2 implementations of the transforms and the C implementation, when the emulate hardware option is invoked, we also disable sse/sse2/etc. Also includes some minor cleanups/renaming etc. Change-Id: Ib864d8493313927d429cce402982f1c8e45b3287
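As a rough illustration of what wrapping an intermediate value to a hardware register width can look like, the sketch below keeps only the low 16 bits of a value and reinterprets them as signed. The helper name `wrap_low16`, the `EMULATE_HARDWARE` switch, and the 16-bit width are assumptions made for this example; this is not the WRAPLOW definition from the commit.

```c
#include <stdint.h>

/* Hypothetical switch standing in for the --enable-emulate-hardware option. */
#ifndef EMULATE_HARDWARE
#define EMULATE_HARDWARE 1
#endif

/* Assumed behaviour: when emulating hardware, keep only the low 16 bits of an
 * intermediate value and reinterpret them as a signed two's-complement number,
 * the way a 16-bit register would. Otherwise pass the value through. */
static int32_t wrap_low16(int32_t x) {
#if EMULATE_HARDWARE
  uint32_t low = (uint32_t)x & 0xFFFFu;
  return (low & 0x8000u) ? (int32_t)low - 0x10000 : (int32_t)low;
#else
  return x;
#endif
}
```

With wrapping enabled, a C model overflows in the same way a fixed-width datapath would, so mismatches between the two show up as bit-exact differences rather than being hidden by wider C integers.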
-
- 30 Sep, 2014 1 commit
-
-
Jingning Han authored
Some header files included in vp9_idct.c are already included by vp9_idct.h. This commit removes these redundant declarations. Change-Id: I0238c27e4efff5c981eb437022c6bc6970c4e445
-
- 12 Sep, 2014 1 commit
-
-
Deb Mukherjee authored
Adds various high bitdepth transform functions and tests. Many of the changes are related to using typedefs tran_low_t and tran_high_t for the final transform coefficients and intermediate stages of the transform computation respectively, rather than fixed types int16_t/int. When the vp9_highbitdepth configure flag is off, these map to int16_t/int32_t, but when the flag is on, they map to int32_t/int64_t to make space for the needed extra precision. Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8
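The typedef mapping described above can be sketched as follows. The `HIGHBITDEPTH` macro and the `scale_round` helper are hypothetical names used only for this illustration; the type widths follow the commit message.

```c
#include <stdint.h>

/* Hypothetical stand-in for the high-bitdepth configure flag. */
#ifndef HIGHBITDEPTH
#define HIGHBITDEPTH 0
#endif

#if HIGHBITDEPTH
typedef int64_t tran_high_t; /* intermediate stages of the transform */
typedef int32_t tran_low_t;  /* final transform coefficients */
#else
typedef int32_t tran_high_t;
typedef int16_t tran_low_t;
#endif

/* Illustrative step: multiply and round in the wide type, then store the
 * result in the narrow type. Assumes shift >= 1. */
static tran_low_t scale_round(tran_low_t coeff, int mult, int shift) {
  tran_high_t wide = (tran_high_t)coeff * mult;
  wide += (tran_high_t)1 << (shift - 1);
  return (tran_low_t)(wide >> shift);
}
```

Writing transform code against the two typedefs rather than fixed types lets the same source compile for both configurations, with only the widths changing.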
-
- 08 May, 2014 1 commit
-
-
Jingning Han authored
The scanning order has the first 12 coefficients of the 8x8 2D-DCT sitting in the top left 4x4 block. Hence the partial inverse 8x8 2D-DCT can handle cases with eob below 12. The overall runtime of the inverse 8x8 2D-DCT unit is reduced from 166 cycles (using SSE2) to 150 cycles (using SSSE3). Change-Id: I4514f9748042809ac84df4c14382c00f313f1cd2
-
- 28 Jan, 2014 1 commit
-
-
Dmitry Kovalev authored
It is enough to specify (e.g.) idct16, since it is obviously different from idct16x16. Change-Id: I6b408a37a945de3162429380b59a775b03b95db0
-
- 20 Nov, 2013 1 commit
-
-
hkuang authored
Change-Id: Ia568f70bddc1a2b62141a0197459119ca74c22b5
-
- 15 Nov, 2013 1 commit
-
-
Jingning Han authored
Change-Id: If97ae16a4478717933345b6b9d5bc1b417b8dd84
-
- 24 Oct, 2013 1 commit
-
-
Yunqing Wang authored
When only the upper-left 8x8 area has non-zero DCT coefficients, we can skip the 1-D IDCT for the 9th to 32nd rows to save operations. This function is called when eob <= 34. Change-Id: I9684b75947bdde346cfe3720f08a953aa7a13fb5
-
- 12 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Also renaming dest_stride to stride in some places. Change-Id: I75f602b623a5a7071d4922b747c45fa0b7d7a940
-
- 11 Oct, 2013 3 commits
-
-
Dmitry Kovalev authored
Renames: vp9_iht_add -> vp9_iht4x4_add; vp9_iht_add_8x8 -> vp9_iht8x8_add; vp9_iht_add_16x16 -> vp9_iht16x16_add. Change-Id: I8f1a2913e02d90d41f174f27e4ee2fad0dbd4a21
-
Dmitry Kovalev authored
Renames: vp9_short_iht4x4_add -> vp9_iht4x4_16_add; vp9_short_iht8x8_add -> vp9_iht8x8_64_add; vp9_short_iht16x16_add_c -> vp9_iht16x16_256_add. Change-Id: Ibca7a188fd062b196787ac5efc1ea545e7f166c0
-
Dmitry Kovalev authored
Also adding static to iadst16_1d and fadst16 functions. Change-Id: I13c7df3b776f0f8efc6e80099bdb0a2f6d29edaf
-
- 10 Oct, 2013 2 commits
-
-
Dmitry Kovalev authored
We have two SSE2-optimized functions for idct4_1d: vp9_idct4_1d_sse2 (removed by this commit) and idct4_1d_sse2. vp9_idct4_1d_sse2 was used only by the following functions, which already have SSE2-optimized variants: vp9_idct4x4_16_add_c -> vp9_idct4x4_16_add_sse2; idct8_1d -> vp9_idct8x8_{16, 10, 1}_sse2; vp9_short_iht4x4_add_c -> vp9_short_iht4x4_add_sse2. Change-Id: Ib0a7f6d1373dbaf7a4a41208cd9d0671fdf15edb
-
Dmitry Kovalev authored
Renames: vp9_short_idct32x32_add -> vp9_idct32x32_1024_add; vp9_short_idct32x32_1_add -> vp9_idct32x32_1_add; vp9_idct_add_32x32 -> vp9_idct32x32_add. Change-Id: Id85306f5814bac6c47463a6b5901a93082510666
-
- 08 Oct, 2013 1 commit
-
-
Jingning Han authored
When all coefficients are zeros, skip the corresponding 1-D inverse transform. This practice has been used in the SSE2 implementation of inverse 32x32 DCT. This commit imports this algorithm into the C code. Change-Id: I0f58bfcb183a569fab85d524d5d9cf8ae8653f86
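The zero-skipping idea can be sketched on a small block. Because the 1-D inverse transform is linear, a row of all-zero coefficients produces an all-zero output, so the transform call can be replaced by clearing the output row. The block size, function names, and the placeholder `transform_1d` below are assumptions for illustration, not the code from this commit.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_N 8 /* small size for illustration; the commit concerns 32x32 */

/* Hypothetical placeholder for a linear 1-D inverse transform. */
static void transform_1d(const int16_t *in, int16_t *out) {
  for (int i = 0; i < BLOCK_N; ++i) out[i] = (int16_t)(in[i] * 2);
}

/* Row pass that skips all-zero rows. Returns the number of rows for which
 * the 1-D transform was actually run. */
static int row_pass_skip_zeros(const int16_t *input, int16_t *output) {
  int transformed = 0;
  for (int r = 0; r < BLOCK_N; ++r) {
    const int16_t *row = input + r * BLOCK_N;
    int any_nonzero = 0;
    for (int c = 0; c < BLOCK_N; ++c) any_nonzero |= row[c];
    if (any_nonzero) {
      transform_1d(row, output + r * BLOCK_N);
      ++transformed;
    } else {
      /* Linear transform of zeros is zeros: just clear the output row. */
      memset(output + r * BLOCK_N, 0, BLOCK_N * sizeof(*output));
    }
  }
  return transformed;
}
```

The check costs one pass over the row, which pays off when most high-frequency rows are empty after quantization.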
-
- 07 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Renames: vp9_short_idct16x16_add -> vp9_idct16x16_256_add; vp9_short_idct16x16_10_add -> vp9_idct16x16_10_add; vp9_short_idct16x16_1_add -> vp9_idct16x16_1_add; vp9_idct_add_16x16 -> vp9_idct16x16_add. Change-Id: Ief8a3904de78deab0f4ede944c4d0339c228cfc3
-
- 06 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Renames: vp9_short_idct8x8_add -> vp9_idct8x8_64_add; vp9_short_idct8x8_1_add -> vp9_idct8x8_1_add; vp9_short_idct8x8_10_add -> vp9_idct8x8_10_add; vp9_idct_add_8x8 -> vp9_idct8x8_add. Change-Id: Ifb8d3a45b4c0397aa805b30463f3d14581bf72c1
-
- 04 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
The idea is to have the following names for each transform size: vp9_idct4x4_add, vp9_idct4x4_1_add, vp9_idct4x4_10_add, vp9_idct4x4_16_add, vp9_idct8x8_add, vp9_idct8x8_1_add, vp9_idct8x8_10_add, vp9_idct8x8_64_add, etc. for 16x16, 32x32. The actual list of renames in this patch: vp9_idct_add_lossless -> vp9_iwht4x4_add; vp9_short_iwalsh4x4_add -> vp9_iwht4x4_16_add; vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add; vp9_idct_add -> vp9_idct4x4_add; vp9_short_idct4x4_add -> vp9_idct4x4_16_add; vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add. Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1
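One plausible reading of this naming scheme is that the numeric suffix names the largest end-of-block position (eob) a variant is meant to handle, and that the unsuffixed function picks a variant from the eob. The sketch below shows that selection for the 8x8 case only; the enum, the function name `pick_idct8x8`, and the exact thresholds are assumptions for illustration and are not taken from this patch.

```c
/* Which variant a hypothetical 8x8 dispatcher would select. */
typedef enum { IDCT8X8_1, IDCT8X8_10, IDCT8X8_64 } idct8x8_variant;

/* Assumed thresholds: the suffix is read as the largest eob the variant
 * is meant to handle. */
static idct8x8_variant pick_idct8x8(int eob) {
  if (eob == 1) return IDCT8X8_1;   /* only the DC coefficient is present */
  if (eob <= 10) return IDCT8X8_10; /* a few low-frequency coefficients */
  return IDCT8X8_64;                /* full 64-coefficient transform */
}
```

Under this reading, callers only need the unsuffixed entry point, and cheaper partial transforms are chosen automatically for sparse blocks.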
-
- 02 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Moving functions from vp9_idct_blk to vp9_idct because these functions are used from both encoder and decoder. Removing duplicated code from vp9_encodemb.c and reusing existing functions. Change-Id: Ia0a6782f8c4c409efb891651b871dd4bf22d5fe8
-
- 30 Sep, 2013 1 commit
-
-
Dmitry Kovalev authored
We don't need these functions anymore. The only one which was actually used is vp9_add_constant_residual_32x32. Addition of vp9_short_idct32x32_1_add eliminates this single usage. An SSE2-optimized version of vp9_short_idct32x32_1_add will be added in the next patch set; right now there is only a C implementation. Now we have all idct functions implemented in a consistent manner. Change-Id: I63df79a13cf62aa2c9360a7a26933c100f9ebda3
-
- 27 Sep, 2013 1 commit
-
-
Dmitry Kovalev authored
Making name consistent with vp9_short_idct8x8 and vp9_short_idct8x8_1. Change-Id: I99e0be040ec893f9571dcf090e18f98dc58339f5
-
- 26 Sep, 2013 1 commit
-
-
Dmitry Kovalev authored
Making function name consistent with vp9_short_idct16x16 and vp9_short_idct16x16_1. Change-Id: I70e54be9e6b9a1dddab0de470686591e96d05517
-
- 24 Sep, 2013 1 commit
-
-
Yaowu Xu authored
The change is to better reflect the nature of the constants. Change-Id: Icabac6e9bceefbdb3f03f8218f88ef75943c30fb
-
- 01 Aug, 2013 1 commit
-
-
Jingning Han authored
The inverse 32x32 transform detects all zero entries and skips the computations accordingly per 8 rows in the first 1-D operation. The function vp9_short_idct10_32x32_add performs differently and is not used anywhere, hence removed. Change-Id: Ic4fad422debbde7b6b6ffed47c69fbd4268a906c
-
- 29 Jul, 2013 1 commit
-
-
Jingning Han authored
This commit adds special handling for the 16x16 inverse 2D-DCT when only the DC coefficient is quantized to a non-zero value. Change-Id: I7bf71be7fa13384fab453dc8742b5b50e77a277c
-
- 26 Jul, 2013 1 commit
-
-
Jingning Han authored
This commit enables special handling for the 8x8 inverse 2D-DCT when only the DC coefficient is quantized to a non-zero value. For bus_cif at 2000 kbps, it provides about 1% speed-up at speed 0. Change-Id: I2523222359eec26b144cf8fd4c63a4ad63b1b011
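The reason a DC-only block is cheap: the DC basis function of the 2D-DCT is constant over the block, so the inverse transform yields the same residual for every pixel. A minimal sketch, assuming the scaled and rounded DC residual has already been computed, is shown below; `add_dc_8x8` and `clip_pixel_u8` are hypothetical names for this example.

```c
#include <stdint.h>

/* Clamp a value to the 8-bit pixel range. */
static uint8_t clip_pixel_u8(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Adds one precomputed DC residual to every pixel of an 8x8 block.
 * dc_residual is assumed to be the already scaled and rounded inverse
 * transform output for a block whose only non-zero coefficient is DC. */
static void add_dc_8x8(int dc_residual, uint8_t *dest, int stride) {
  for (int r = 0; r < 8; ++r) {
    for (int c = 0; c < 8; ++c) dest[c] = clip_pixel_u8(dest[c] + dc_residual);
    dest += stride;
  }
}
```

This replaces two passes of 1-D transforms with a single add-and-clamp loop over the destination block.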
-
- 24 Jul, 2013 1 commit
-
-
Jingning Han authored
They share the same functionality, so merging together. Change-Id: I98a0386fcee052cb854f9ff90c283c1b844bcb79
-
- 17 Jul, 2013 1 commit
-
-
hkuang authored
Change-Id: I386066b9bcfb4bffb582e6827af36ca0181f6a83
-
- 16 Jul, 2013 1 commit
-
-
Jingning Han authored
This commit enables SSE2 implementation of 16x16 inverse ADST/DCT hybrid transform. The runtime goes from 5742 cycles -> 1821 cycles. This provides about 1% encoding speed-up at speed 0. Change-Id: I1678d0988bf30b9efd524877705bbb3645edb17b
-
- 13 Jul, 2013 1 commit
-
-
Dmitry Kovalev authored
Change-Id: Id9b6ceeddca3f9b34bfada5c499b1e7a2f42c30b
-
- 30 May, 2013 1 commit
-
-
Yaowu Xu authored
This commit switches to a new variant of the Walsh-Hadamard Transform by Tim Terriberry. This new variant has the best compression among a number of variants that Tim developed. Change-Id: Icb3a88515463cfc644b17ca046fcd139db2557e9
-
- 27 May, 2013 1 commit
-
-
Timothy B. Terriberry authored
Saves 1 add, 3 shifts (and a shift bias) per 1-D transform. Change-Id: I1104bb1679fe342b2f9677df8a9cdc0cb9699e7d
-
- 21 May, 2013 1 commit
-
-
Scott LaVarnway authored
No longer used. Change-Id: Id28c9247cebba183c6fa786dff96824ae100132c
-
- 20 May, 2013 1 commit
-
-
Scott LaVarnway authored
This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: I296604bf73579c45105de0dd1adbcc91bcc53c22
-
- 16 May, 2013 1 commit
-
-
Scott LaVarnway authored
This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: Iacfd57324fbe2b7beca5d7f3dcae25c976e67f45
-
- 15 May, 2013 1 commit
-
-
Scott LaVarnway authored
This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: Iea7976b22b1927d24b8004d2a3fddae7ecca3ba1
-
- 14 May, 2013 1 commit
-
-
Scott LaVarnway authored
This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: I4ea09df0e162591e420d869b7431c2e7f89a8c1a
-