- 01 Oct, 2014 2 commits
-
-
Jingning Han authored
Change-Id: Id92544762e7b96d3c729dfc8e04ecff91cbcc7f9
-
Deb Mukherjee authored
Moves transform type defines to vp9_common.h from vp9_idct.h so that they can be included in vp9_rtcd_defs.pl safely. Change-Id: Id5106227bee5934f7ce8b06f2eb9fa8a9a2e0ddb
-
- 30 Sep, 2014 2 commits
-
-
James Zern authored
This reverts commit eafc8c9c. tran_low_t/tran_high_t don't belong in a public header, they're private. Similarly the public headers shouldn't rely on config defines, vpx_config.h isn't installed. Change-Id: I194ec273598da418df8dd727b6c0e78a556740ad
-
Jingning Han authored
This commit fixes a compiling error in vp9_idct.h, where the codec checks that the intermediate steps of transformation fit within 16-bit length. The issue was due to broken file dependency. Change-Id: Ib22bba13a1e6df28489cb23d6774c561969f1fdc
-
- 12 Sep, 2014 1 commit
-
-
Deb Mukherjee authored
Adds various high bitdepth transform functions and tests. Many of the changes relate to using the typedefs tran_low_t and tran_high_t for the final transform coefficients and the intermediate stages of the transform computation, respectively, rather than the fixed types int16_t/int. When the vp9_highbitdepth configure flag is off, these map to int16_t/int32_t, but when the flag is on, they map to int32_t/int64_t to make space for the needed extra precision. Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8
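The type mapping described in this entry can be sketched as follows. This is a minimal illustration, not the code from the commit: the macro name CONFIG_VP9_HIGHBITDEPTH is assumed to be what the configure flag defines, and sketch_scale and its 14-bit constant scaling are hypothetical, added only to show the wide type holding an intermediate product.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the configure step defines this macro to 0 or 1.
 * It defaults to 0 here so the sketch compiles on its own. */
#ifndef CONFIG_VP9_HIGHBITDEPTH
#define CONFIG_VP9_HIGHBITDEPTH 0
#endif

#if CONFIG_VP9_HIGHBITDEPTH
typedef int64_t tran_high_t; /* intermediate stages */
typedef int32_t tran_low_t;  /* final coefficients */
#else
typedef int32_t tran_high_t; /* intermediate stages */
typedef int16_t tran_low_t;  /* final coefficients */
#endif

/* Hypothetical helper: multiply a coefficient by a constant scaled by
 * 2^14 in the wide type, then round and narrow back to the low type. */
static tran_low_t sketch_scale(tran_low_t coeff, int constant_q14) {
  tran_high_t wide;
  /* The wide type must be strictly larger to hold the product. */
  assert(sizeof(tran_high_t) > sizeof(tran_low_t));
  wide = (tran_high_t)coeff * constant_q14;
  return (tran_low_t)((wide + (1 << 13)) >> 14);
}
```

With the flag off, sketch_scale(100, 11585) computes the product in 32 bits and narrows the rounded result back to 16 bits.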
-
- 07 Aug, 2014 1 commit
-
-
Yaowu Xu authored
This commit adds a configure-time option that enables strict error checking in the decoder, to make sure intermediate stage coefficients of inverse transforms are within the valid range of a signed 16-bit integer. For valid VP9 input streams, intermediate stage coefficients should always stay within the range of a signed 16-bit integer. Coefficients can go out of this range for invalid/corrupt VP9 streams. However, strictly checking this range for every intermediate coefficient can be a burden for the decoder, so such validation is only enabled with the configure option --enable-coefficient-range-checking. Change-Id: I47d47c8c4e48a922c3d223ca59064f51b3f0f5ed
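A minimal sketch of such an opt-in check, assuming the configure option surfaces as a macro named CONFIG_COEFFICIENT_RANGE_CHECKING. Both function names are hypothetical and are not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: --enable-coefficient-range-checking sets this macro to 1.
 * It defaults to 0 here, which matches the unchecked build. */
#ifndef CONFIG_COEFFICIENT_RANGE_CHECKING
#define CONFIG_COEFFICIENT_RANGE_CHECKING 0
#endif

/* Returns 1 when value is representable as a signed 16-bit integer. */
static int sketch_fits_int16(int32_t value) {
  return value >= INT16_MIN && value <= INT16_MAX;
}

/* Hypothetical wrapper applied to each intermediate coefficient. */
static int16_t sketch_check_range(int32_t value) {
#if CONFIG_COEFFICIENT_RANGE_CHECKING
  /* Strict build: abort on an out-of-range coefficient. */
  assert(sketch_fits_int16(value));
#endif
  /* Unchecked build: no validation cost. For an out-of-range value the
   * narrowing result is implementation-defined (commonly wraparound). */
  return (int16_t)value;
}
```

Keeping the check behind a compile-time switch means the default decoder pays nothing for it, which is the trade-off the entry describes.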
-
- 28 May, 2014 1 commit
-
-
Jingning Han authored
This commit enables SSSE3 implementation of the inverse 2D-DCT with only first 10 coefficients non-zero. It reduces the runtime of SSE2 version from 745 cycles to 538 cycles, i.e., 27% speed-up. Change-Id: I18ba4128859b09c704a6ee361d69a86c09fe8dfe
-
- 01 May, 2014 1 commit
-
-
Dmitry Kovalev authored
Change-Id: I642a7d343677bf934e9a54cf4ad78e908620e39a
-
- 24 Jan, 2014 1 commit
-
-
James Zern authored
Change-Id: Ic334da9aee968e33762c2b25d9fbad24c844b411
-
- 21 Nov, 2013 1 commit
-
-
Jingning Han authored
Separate the rounding and right shift operations of forward transform from those of inverse transform. Take out the assertion check from inverse transforms. If the transform coefficients were constructed to cause intermediate steps of inverse transform overflow, the codec will just let it overflow without breaking the decoding flow. Change-Id: Ia7ce15dfd1a73b4abbaa78cbc74ec718523c5b1b
-
- 15 Nov, 2013 1 commit
-
-
Jingning Han authored
Separate the rounding and right shift operations of forward transform from those of inverse transform. Take out the assertion check from inverse transforms. If the transform coefficients were constructed to cause intermediate steps of inverse transform overflow, the codec will just let it overflow without breaking the decoding flow. Change-Id: I73cfc3706c4e840fc543a77cbc4cdb0b05d07730
-
- 12 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Also renaming dest_stride to stride in some places. Change-Id: I75f602b623a5a7071d4922b747c45fa0b7d7a940
-
- 11 Oct, 2013 2 commits
-
-
Dmitry Kovalev authored
Renames: vp9_iht_add -> vp9_iht4x4_add vp9_iht_add_8x8 -> vp9_iht8x8_add vp9_iht_add_16x16 -> vp9_iht16x16_add Change-Id: I8f1a2913e02d90d41f174f27e4ee2fad0dbd4a21
-
Dmitry Kovalev authored
Also adding static to iadst16_1d and fadst16 functions. Change-Id: I13c7df3b776f0f8efc6e80099bdb0a2f6d29edaf
-
- 10 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Renames: vp9_short_idct32x32_add -> vp9_idct32x32_1024_add vp9_short_idct32x32_1_add -> vp9_idct32x32_1_add vp9_idct_add_32x32 -> vp9_idct32x32_add Change-Id: Id85306f5814bac6c47463a6b5901a93082510666
-
- 07 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Renames: vp9_short_idct16x16_add -> vp9_idct16x16_256_add vp9_short_idct16x16_10_add -> vp9_idct16x16_10_add vp9_short_idct16x16_1_add -> vp9_idct16x16_1_add vp9_idct_add_16x16 -> vp9_idct16x16_add Change-Id: Ief8a3904de78deab0f4ede944c4d0339c228cfc3
-
- 06 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Renames: vp9_short_idct8x8_add -> vp9_idct8x8_64_add vp9_short_idct8x8_1_add -> vp9_idct8x8_1_add vp9_short_idct8x8_10_add -> vp9_idct8x8_10_add vp9_idct_add_8x8 -> vp9_idct8x8_add Change-Id: Ifb8d3a45b4c0397aa805b30463f3d14581bf72c1
-
- 04 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
The idea is to have the following names for each transform size: vp9_idct4x4_add vp9_idct4x4_1_add vp9_idct4x4_10_add vp9_idct4x4_16_add vp9_idct8x8_add vp9_idct8x8_1_add vp9_idct8x8_10_add vp9_idct8x8_64_add etc for 16x16, 32x32 The actual list of renames in this patch: vp9_idct_add_lossless -> vp9_iwht4x4_add vp9_short_iwalsh4x4_add -> vp9_iwht4x4_16_add vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add vp9_idct_add -> vp9_idct4x4_add vp9_short_idct4x4_add -> vp9_idct4x4_16_add vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1
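One way to read this naming scheme is that the unsuffixed function dispatches to a suffixed variant, and that the numeric suffix is the number of leading coefficients that variant handles. That reading is an assumption, suggested by the 28 May 2014 entry's "only first 10 coefficients non-zero". The sketch below shows it for the 8x8 case. The enum, the function name, and the thresholds are hypothetical and only show the shape of such a dispatcher; eob is assumed to be the end-of-block position, meaning the count of coefficients up to the last non-zero one.

```c
#include <assert.h>

/* Hypothetical labels standing in for the three 8x8 variants. */
typedef enum {
  SKETCH_IDCT8X8_1,  /* stands in for vp9_idct8x8_1_add  */
  SKETCH_IDCT8X8_10, /* stands in for vp9_idct8x8_10_add */
  SKETCH_IDCT8X8_64  /* stands in for vp9_idct8x8_64_add */
} sketch_idct8x8_variant;

/* Picks the cheapest variant that still covers every non-zero
 * coefficient. The thresholds are assumptions read off the suffixes. */
static sketch_idct8x8_variant sketch_idct8x8_select(int eob) {
  assert(eob >= 1 && eob <= 64);
  if (eob == 1) return SKETCH_IDCT8X8_1; /* DC only */
  if (eob <= 10) return SKETCH_IDCT8X8_10;
  return SKETCH_IDCT8X8_64;              /* full transform */
}
```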
-
- 02 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Moving functions from vp9_idct_blk to vp9_idct because these functions are used from both encoder and decoder. Removing duplicated code from vp9_encodemb.c and reusing existing functions. Change-Id: Ia0a6782f8c4c409efb891651b871dd4bf22d5fe8
-
- 24 Sep, 2013 1 commit
-
-
Yaowu Xu authored
The change is to better reflect the nature of the constants. Change-Id: Icabac6e9bceefbdb3f03f8218f88ef75943c30fb
-
- 19 Sep, 2013 1 commit
-
-
Yaowu Xu authored
Change-Id: I76f440a917832c02d7a727697b225bac66b99f56
-
- 12 Aug, 2013 1 commit
-
-
Jingning Han authored
Enable SSE2 implementation of high precision 32x32 forward DCT. The intermediate stacks are of 32-bits. The run-time goes down from 32126 cycles to 13442 cycles. Change-Id: Ib5ccafe3176c65bd6f2dbdef790bd47bbc880e56
-
- 16 Jul, 2013 1 commit
-
-
Dmitry Kovalev authored
Removing unused and duplicated constants, moving them from *.h to *.c if possible. Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f
-
- 29 Jun, 2013 1 commit
-
-
Christian Duvivier authored
43,000 -> 5,750 cycles, about 7.5x faster. Change-Id: Ibfd92821b9603f4ed9c256e0ececec14fa4565d0
-
- 18 Jun, 2013 1 commit
-
-
Jingning Han authored
This commit makes use of dual fdct32x32 versions for rate-distortion optimization loop and encoding process, respectively. The one for rd loop requires only 16 bits precision for intermediate steps. The original fdct32x32 that allows higher intermediate precision (18 bits) was retained for the encoding process only. This allows speed-up for fdct32x32 in the rd loop. No performance loss observed. Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
-
- 30 May, 2013 1 commit
-
-
Yaowu Xu authored
This commit switches to a new variant of the Walsh-Hadamard Transform by Tim Terriberry. This variant has the best compression among a number of variants that Tim developed. Change-Id: Icb3a88515463cfc644b17ca046fcd139db2557e9
-
- 18 Mar, 2013 1 commit
-
-
Yunqing Wang authored
Wrote SSE2 functions for vp9_short_idct8x8 and vp9_short_idct10_8x8. Compared to the C version, the SSE2 version is 2X faster. The decoder test didn't show a noticeable gain since the 8x8 idct doesn't take much of the decoding time (less than 1% in my test). Change-Id: I56313e18cd481700b3b52c4eda5ca204ca6365f3
-
- 07 Mar, 2013 1 commit
-
-
Dmitry Kovalev authored
Change-Id: I44660975e9985310d8c654c158ee7a61291b5a08
-
- 04 Mar, 2013 1 commit
-
-
Yunqing Wang authored
Wrote an SSE2 vp9_short_idct4x4llm to improve the decoder performance. Change-Id: I90b9d48c4bf37aaf47995bffe7e584e6d4a2c000
-
- 28 Feb, 2013 1 commit
-
-
Christian Duvivier authored
Scalar path is about 1.4x faster (4% overall encoder speedup). SSE2 path is about 7x faster (13% overall encoder speedup). Change-Id: I7e85d8225a914a74c61ea370210414696560094d
-
- 27 Feb, 2013 2 commits
-
-
Dmitry Kovalev authored
Fixing code style, using array lookup instead of switch statements for forward hybrid transforms (in the same way as for their inverses). Consistent usage of ROUND_POWER_OF_TWO macro in appropriate places. Change-Id: I0d3822ae11f928905fdbfbe4158f91d97c71015f
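For readers unfamiliar with the macro named in this entry, the sketch below shows the usual round-to-nearest right shift that ROUND_POWER_OF_TWO denotes. The macro body is written from general knowledge of this idiom rather than copied from the commit, and sketch_round_shift is a hypothetical wrapper added only for testing.

```c
#include <assert.h>
#include <stdint.h>

/* Divide by 2^n with rounding: add half of the divisor, then shift. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))

/* Hypothetical wrapper that guards the shift count. */
static int32_t sketch_round_shift(int32_t value, int n) {
  assert(n >= 1 && n < 31); /* keeps 1 << (n - 1) well defined */
  return ROUND_POWER_OF_TWO(value, n);
}
```

For example, sketch_round_shift(5, 1) gives 3 and sketch_round_shift(5, 2) gives 1, whereas a plain shift would give 2 and 1.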
-
Yunqing Wang authored
Wrote SSE2 version of vp9_dc_only_idct_add_c function. In order to improve performance, clipped the absolute diff values to [0, 255]. This allowed us to keep the additions/subtractions in 8 bits. Test showed an over 2% decoder performance increase. Change-Id: Ie1a236d23d207e4ffcd1fc9f3d77462a9c7fe09d
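The clipping idea in this entry can be illustrated in scalar C. This is one plausible reading, not the commit's code: the DC difference is split into a positive and a negative part, each clipped to [0, 255], so that an unsigned saturating add followed by an unsigned saturating subtract (the per-lane behaviour of SSE2's `_mm_adds_epu8` and `_mm_subs_epu8`) reproduces the widened add-and-clip result. All function names below are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static uint8_t sketch_clamp_u8(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Reference path: widen to int, add, clip back to a pixel. */
static uint8_t sketch_add_dc_reference(uint8_t pixel, int dc) {
  return sketch_clamp_u8(pixel + dc);
}

/* Scalar stand-ins for unsigned saturating byte add and subtract. */
static uint8_t sketch_sat_add_u8(uint8_t a, uint8_t b) {
  int s = a + b;
  return (uint8_t)(s > 255 ? 255 : s);
}
static uint8_t sketch_sat_sub_u8(uint8_t a, uint8_t b) {
  return (uint8_t)(a > b ? a - b : 0);
}

/* 8-bit path: only one of pos/neg is non-zero for any given dc. */
static uint8_t sketch_add_dc_8bit(uint8_t pixel, int dc) {
  uint8_t pos, neg;
  assert(dc > INT_MIN); /* -dc must not overflow */
  pos = sketch_clamp_u8(dc);
  neg = sketch_clamp_u8(-dc);
  return sketch_sat_sub_u8(sketch_sat_add_u8(pixel, pos), neg);
}

/* Returns 1 if both paths agree on every pixel for dc in [-600, 600]. */
static int sketch_paths_agree(void) {
  int dc, p;
  for (dc = -600; dc <= 600; ++dc)
    for (p = 0; p <= 255; ++p)
      if (sketch_add_dc_8bit((uint8_t)p, dc) !=
          sketch_add_dc_reference((uint8_t)p, dc))
        return 0;
  return 1;
}
```

Because both clipped parts fit in a byte, a vectorized version can process 16 pixels per 128-bit register without widening to 16 bits.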
-
- 26 Feb, 2013 1 commit
-
-
Yaowu Xu authored
The commit improves the 32x32 forward dct implementation: 1. change to use same constants and rounding as other forward dcts 2. select rounding to specifically minimize the roundtrip error, which improved average 19/block to .77/block using 100000 random input. Test showed a small but consistent gain on all test sets, about .15% Change-Id: If0afd6a71880a522f60c1c234be0462092c2eb53
-
- 25 Feb, 2013 1 commit
-
-
Jingning Han authored
Rebased. Remove the old matrix multiplication transform computation. The 16x16 ADST/DCT can be switched on/off and evaluated by setting ACTIVE_HT16 300/0 in vp9/common/vp9_blockd.h. Change-Id: Icab2dbd18538987e1dc4e88c45abfc4cfc6e133f
-
- 22 Feb, 2013 1 commit
-
-
Dmitry Kovalev authored
Removing redundant 'extern' keywords and parentheses, fixing indentation, making variable names lower case, using short expressions x *= c instead of x = x * c, minor code simplifications. Change-Id: If6a25fcf306d1db26e90d27e3c24a32735c607de
-
- 20 Feb, 2013 1 commit
-
-
Dmitry Kovalev authored
Change-Id: I7c6e3bebd94856b24dbe2aded7f9e04ef8bb8c08
-
- 11 Feb, 2013 1 commit
-
-
Jingning Han authored
fixed format issues. Implement the inverse 4x4 ADST using 9 multiplications. For this particular dimension, the original ADST transform can be factorized into simpler operations, hence is retained. Change-Id: Ie5d9749942468df299ab74e90d92cd899569e960
-
- 07 Feb, 2013 1 commit
-
-
Yaowu Xu authored
Also removed some unused functions. Change-Id: Ie363bcc8d94441d054137d2ef7c4fe59f56027e5
-