- 20 Jun, 2013 2 commits
-
-
Ronald S. Bultje authored
Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to 3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't perfectly interleaved, and can probably be improved further in the future. I've marked this with a few TODOs/FIXMEs in the code. Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
-
Ronald S. Bultje authored
Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 -> 3min58). Specific changes to timings for each function compared to original assembly-optimized versions (or just new version timings if no previous assembly-optimized version was available):
sse2 4x4: 99 -> 82 cycles
sse2 4x8: 128 cycles
sse2 8x4: 121 cycles
sse2 8x8: 149 -> 129 cycles
sse2 8x16: 235 -> 245 cycles (?)
sse2 16x8: 269 -> 203 cycles
sse2 16x16: 441 -> 349 cycles
sse2 16x32: 641 cycles
sse2 32x16: 643 cycles
sse2 32x32: 1733 -> 1154 cycles
sse2 32x64: 2247 cycles
sse2 64x32: 2323 cycles
sse2 64x64: 6984 -> 4442 cycles
ssse3 4x4: 100 cycles (?)
ssse3 4x8: 103 cycles
ssse3 8x4: 71 cycles
ssse3 8x8: 147 cycles
ssse3 8x16: 158 cycles
ssse3 16x8: 188 -> 162 cycles
ssse3 16x16: 316 -> 273 cycles
ssse3 16x32: 535 cycles
ssse3 32x16: 564 cycles
ssse3 32x32: 973 cycles
ssse3 32x64: 1930 cycles
ssse3 64x32: 1922 cycles
ssse3 64x64: 3760 cycles
Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
-
- 19 Jun, 2013 3 commits
-
-
Yunqing Wang authored
-
Yunqing Wang authored
Optimized the quantization function by making it a two-pass process. The first pass quickly checks the transform coefficients against the base ZBIN and keeps only the coefficients that pass the check for quantization. A skip check is added: if all coefficients are within the base ZBIN, no quantization is needed. The second pass is the actual quantization pass, which processes only the coefficient subset determined in the first pass. This reduces the computation. Furthermore, an alternative method is used for large transform sizes, which often have sparse nonzero quantized coefficients. Overall, the encoder speedup is about 4%. The quantization function itself gets 20% faster. Change-Id: I3a9dd0da6db030260b6d9c314a9fa48ecae89f22
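The two-pass idea described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the actual libvpx quantizer: the function name `two_pass_quantize`, the `MAX_COEFFS` limit, the single `zbin` threshold, and the round-to-nearest division by `quant_step` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical upper bound on coefficients per block (32x32 = 1024). */
enum { MAX_COEFFS = 1024 };

/* Illustrative two-pass quantizer. Returns the number of nonzero quantized
 * coefficients; 0 means the whole block can be treated as skipped. */
static int two_pass_quantize(const int16_t *coeff, int n, int zbin,
                             int quant_step, int16_t *qcoeff) {
  int idx[MAX_COEFFS]; /* positions that survive the first pass */
  int kept = 0;
  int nonzero = 0;

  assert(n >= 0 && n <= MAX_COEFFS && quant_step > 0);

  /* Pass 1: zero the output and record only the coefficients whose
   * magnitude reaches the base zero bin. */
  for (int i = 0; i < n; ++i) {
    qcoeff[i] = 0;
    if (abs(coeff[i]) >= zbin) idx[kept++] = i;
  }

  /* Skip check: everything fell inside the zero bin, nothing to quantize. */
  if (kept == 0) return 0;

  /* Pass 2: quantize only the recorded subset (assumed rounding rule). */
  for (int k = 0; k < kept; ++k) {
    const int i = idx[k];
    const int mag = (abs(coeff[i]) + quant_step / 2) / quant_step;
    qcoeff[i] = (int16_t)(coeff[i] < 0 ? -mag : mag);
    if (mag != 0) ++nonzero;
  }
  return nonzero;
}
```

The saving comes from the second loop touching only the surviving indices, which matters most for large, sparse transform blocks.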
-
Yaowu Xu authored
Change-Id: Ic924f07c6ab0c929c6cdf11880d3c625806e272c
-
- 18 Jun, 2013 12 commits
-
-
John Koleszar authored
-
Jingning Han authored
-
James Zern authored
Add ClearSystemState() to reset the MMX registers, which avoids corrupting subsequent tests. Change-Id: I668deb09aa7aa467709776e5819f936910698bc0
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Jingning Han authored
This commit uses two fdct32x32 versions, one for the rate-distortion optimization loop and one for the encoding process. The version for the rd loop requires only 16-bit precision for intermediate steps. The original fdct32x32, which allows higher intermediate precision (18 bits), is retained for the encoding process only. This speeds up fdct32x32 in the rd loop. No performance loss observed. Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
John Koleszar authored
-
James Zern authored
fixes issue #583 Change-Id: I4b855a5b5b168c8961410cef6ab5e6d86f14d301
-
James Zern authored
Change-Id: I052647e13dd24354888c890f6b4a987d989552ae
-
Dmitry Kovalev authored
Change-Id: I927c7223996cdeb44f46e0e6c2e2054d458c300b
-
- 17 Jun, 2013 13 commits
-
-
Ronald S. Bultje authored
This seems to be used only in the encoder. Also remove an empty wrapper file that contained forward declarations for this function but didn't define any functions itself. Change-Id: Ifc561eef7ebe374a7d03698055e51e105f6d614b
-
Dmitry Kovalev authored
Moving single function from vp9_invtrans.c to vp9_encodemb.c. Change-Id: I26bf6bb90de342a3036c0dbfba78a7dd75a61fe7
-
Ronald S. Bultje authored
2.5% faster when encoding first 50 frames of bus @ 1500kbps. Change-Id: I5a64703996cf7fd39b07e32c72311c4b125ec6d4
-
Dmitry Kovalev authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
Change-Id: I5d3944051d091b4bf3eb13e2a30132d34203ef74
-
Dmitry Kovalev authored
The error happened because of vp8_decrypt_cb typedef redefinition in both treereader.h and vp8dx.h. Removing typedef from vp8dx.h in favor of raw function pointer declaration. Change-Id: I0266eb341ce433d40caf0abf8748694d505ee786
-
John Koleszar authored
-
Scott LaVarnway authored
Looks like test code. Change-Id: I5deae2bf14ea6fdcbb9b9d993966c9abef95eb2e
-
Jeff Petkau authored
This allows code calling the library to choose an arbitrary encryption algorithm. Decoder control parameter VP8_SET_DECRYPT_KEY is renamed to VP8D_SET_DECRYPTOR, and now takes a small config struct instead of just a byte array. Change-Id: I0462b3388d8d45057e4f79a6b6777fe713dc546e
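To illustrate why a small config struct is more flexible than a byte-array key, here is a self-contained sketch of the callback pattern. All names below (`demo_decrypt_cb`, `demo_decrypt_init`, `xor_state`, `xor_decrypt`, `demo_read_bytes`) are hypothetical stand-ins invented for this example; the struct layout passed to VP8D_SET_DECRYPTOR is not shown in the commit message and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: transform `count` bytes from input to output. */
typedef void (*demo_decrypt_cb)(void *state, const unsigned char *input,
                                unsigned char *output, int count);

/* Hypothetical config struct: a callback plus opaque caller-owned state. */
typedef struct {
  demo_decrypt_cb decrypt_cb;
  void *decrypt_state;
} demo_decrypt_init;

/* One possible caller-chosen algorithm: XOR with a repeating key. */
typedef struct {
  const unsigned char *key;
  int key_len;
} xor_state;

static void xor_decrypt(void *state, const unsigned char *input,
                        unsigned char *output, int count) {
  const xor_state *s = (const xor_state *)state;
  assert(s != NULL && s->key_len > 0);
  for (int i = 0; i < count; ++i) {
    output[i] = (unsigned char)(input[i] ^ s->key[i % s->key_len]);
  }
}

/* What a decoder-side read helper might do: use the callback when one is
 * installed, otherwise copy the bytes through unchanged. */
static void demo_read_bytes(const demo_decrypt_init *init,
                            const unsigned char *src, unsigned char *dst,
                            int count) {
  if (init != NULL && init->decrypt_cb != NULL) {
    init->decrypt_cb(init->decrypt_state, src, dst, count);
  } else {
    for (int i = 0; i < count; ++i) dst[i] = src[i];
  }
}
```

Because the library only sees a function pointer and an opaque state pointer, the caller can swap in any algorithm without the library knowing about keys at all.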
-
John Koleszar authored
-
John Koleszar authored
-
John Koleszar authored
-
- 15 Jun, 2013 3 commits
-
-
James Zern authored
quiets a warning on every file; the preference is to use a 64-bit compiler, which is readily available at and above this version. Change-Id: I56e7eb569022e7148249d93fe386ad5ea0eee3fc
-
John Koleszar authored
-
John Koleszar authored
-
- 14 Jun, 2013 6 commits
-
-
John Koleszar authored
vp9_default_inter_mode_probs was being accessed with a different type than the one it was defined with. Ensure that its declaration is included prior to its definition. Change-Id: I2f963f513ab2f4e339f8a3c17e3d0f03749eba16
-
John Koleszar authored
All elements of this table are equal to 252, so replace it with a single constant VP9_COEF_UPDATE_PROB. Change-Id: I1e2d1d284326ce6df9899a740c2fc344b3ec81c9
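The refactoring described here can be illustrated with a small sketch. The constant name and its value 252 come from the commit message; the table name `demo_update_probs`, its size `DEMO_ENTRIES`, and the helper functions are hypothetical, since the real table's name and dimensions are not given.

```c
#include <assert.h>
#include <stddef.h>

#define VP9_COEF_UPDATE_PROB 252

/* Hypothetical stand-in for the removed table: every entry is 252. */
enum { DEMO_ENTRIES = 8 };
static const unsigned char demo_update_probs[DEMO_ENTRIES] = {
  252, 252, 252, 252, 252, 252, 252, 252
};

/* Returns 1 when every entry of `table` equals `value`, else 0. */
static int demo_all_equal(const unsigned char *table, size_t n,
                          unsigned char value) {
  assert(table != NULL);
  for (size_t i = 0; i < n; ++i) {
    if (table[i] != value) return 0;
  }
  return 1;
}

/* Before: an indexed lookup into the uniform table. */
static unsigned char demo_update_prob_old(size_t i) {
  assert(i < DEMO_ENTRIES);
  return demo_update_probs[i];
}

/* After: the index is irrelevant, so a constant is enough. */
static unsigned char demo_update_prob_new(void) {
  return VP9_COEF_UPDATE_PROB;
}
```

A uniformity check like `demo_all_equal` is the kind of one-off verification that justifies replacing the lookup with a constant.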
-
Jingning Han authored
-
Jingning Han authored
The encoding time for bus at CIF goes from 661s to 625s. This commit also enables the unit tests for sad8x4/4x8 in sad_test.cc. Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
-
John Koleszar authored
-
Deb Mukherjee authored
No bitstream or output change - only cosmetics. Change-Id: Ic8c1d7ad010a87dcf27d12a38cd7dd5adba683a7
-
- 13 Jun, 2013 1 commit
-
-
John Koleszar authored
Avoid calling decode_block and the inverse transform/add if the block is a skip block, for SBs smaller than 8x8 and for intra-coded SBs. Change-Id: I1684182f4a0050c8d6bb46cba6830d9425e7127d
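The shortcut relies on the fact that a skip block carries no residual, so the prediction already is the reconstruction. Below is a minimal sketch of that control flow, assuming a flat 16-pixel block; `demo_block`, `demo_reconstruct`, and `clamp_pixel` are hypothetical names and do not mirror the decoder's real data structures.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical block of 16 pixels (for example, 4x4). */
enum { DEMO_PIXELS = 16 };

typedef struct {
  int skip;                  /* nonzero: no residual was coded */
  int residual[DEMO_PIXELS]; /* stand-in for inverse-transform output */
} demo_block;

/* Clamp an integer to the 8-bit pixel range. */
static unsigned char clamp_pixel(int v) {
  return (unsigned char)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Reconstructs one block. Returns 1 if the residual add ran, 0 if the
 * skip shortcut was taken. */
static int demo_reconstruct(const demo_block *b, const unsigned char *pred,
                            unsigned char *dst) {
  assert(b != NULL && pred != NULL && dst != NULL);
  if (b->skip) {
    /* Skip block: the prediction is the final result, so the residual
     * path is never entered. */
    memcpy(dst, pred, DEMO_PIXELS);
    return 0;
  }
  for (int i = 0; i < DEMO_PIXELS; ++i) {
    dst[i] = clamp_pixel(pred[i] + b->residual[i]);
  }
  return 1;
}
```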
-