- 02 Sep, 2014 8 commits
-
-
Dmitry Kovalev authored
To avoid 'variable length array' warnings from gcc. Change-Id: I426f7e93ce674a10b901e79c0c9d9df5d4e47cb6
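The log does not show the change itself. As a minimal sketch of one common fix for this warning: in C, an array sized by a `const int` variable is still a variable length array, while one sized by an enum constant is not. The names `kNumTaps` and `sum_doubled` below are hypothetical and do not come from the commit.

```c
#include <stddef.h>

/* An enum constant is a true integer constant expression in C, so an array
 * sized by it is a fixed-size array. Writing `const int n = 8; int tmp[n];`
 * instead would declare a VLA and trigger gcc's -Wvla warning. */
enum { kNumTaps = 8 }; /* hypothetical size, for illustration only */

/* Doubles each of the first kNumTaps inputs into a scratch buffer and
 * returns the sum of the scratch buffer. */
int sum_doubled(const int *src) {
  int tmp[kNumTaps]; /* fixed-size array, not a VLA */
  int sum = 0;
  size_t i;
  for (i = 0; i < kNumTaps; ++i) tmp[i] = 2 * src[i];
  for (i = 0; i < kNumTaps; ++i) sum += tmp[i];
  return sum;
}
```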
-
Marco authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Marco authored
Parameter changes and modification to zero_last bias. Change-Id: I50a408d47fde049c562bbe95075194cb0f17c31b
-
Johann authored
-
Dmitry Kovalev authored
-
Jingning Han authored
-
- 01 Sep, 2014 1 commit
-
-
Dmitry Kovalev authored
-
- 30 Aug, 2014 4 commits
-
-
Dmitry Kovalev authored
Change-Id: I571ce84c97087f8a1a36a10058393bfdcefbf72a
-
Dmitry Kovalev authored
New code is 10% faster for 64-bit and 25% faster for 32-bit. Compiled using clang. Change-Id: I8ba1544c30dd6f3ca479db806384317549650dfc
-
Jingning Han authored
-
Jingning Han authored
-
- 29 Aug, 2014 15 commits
-
-
Jingning Han authored
Use unsigned int type to store the sse in the pixel domain. The precision is sufficient to handle sse of block size up to 64x64. The transform domain version however needs int64_t, since there is a transfer gain applied in the forward transformation that might cause unsigned int overflow. Change-Id: Ifef97c38597e426262290f35341fbb093cf0a079
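The precision argument can be checked with a short sketch, assuming 8-bit pixels and a 32-bit `unsigned int`: the largest squared difference is 255^2 = 65025, and a 64x64 block has 4096 pixels, so the worst-case sum is 266342400, well below 2^32. The helpers `pixel_sse` and `coeff_sse` are hypothetical stand-ins, not the libvpx functions, and the 16-bit coefficient type is also an assumption.

```c
#include <stdint.h>

/* Pixel-domain SSE. With 8-bit samples each squared difference is at most
 * 65025, so a 64x64 block sums to at most 266342400, which fits in a
 * 32-bit unsigned int. */
unsigned int pixel_sse(const uint8_t *src, int src_stride,
                       const uint8_t *ref, int ref_stride, int w, int h) {
  unsigned int sse = 0;
  int r, c;
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int d = src[r * src_stride + c] - ref[r * ref_stride + c];
      sse += (unsigned int)(d * d);
    }
  }
  return sse;
}

/* Transform-domain SSE. Coefficients carry the gain of the forward
 * transform, so their differences can be far larger than pixel differences;
 * accumulating in int64_t avoids overflow. */
int64_t coeff_sse(const int16_t *coeff, const int16_t *dqcoeff, int n) {
  int64_t sse = 0;
  int i;
  for (i = 0; i < n; ++i) {
    const int64_t d = (int64_t)coeff[i] - (int64_t)dqcoeff[i];
    sse += d * d;
  }
  return sse;
}
```

Two extreme 16-bit coefficient differences are already enough to exceed the 32-bit range, which is why the second helper uses the wider accumulator.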
-
Dmitry Kovalev authored
-
James Zern authored
-
James Zern authored
-
Yunqing Wang authored
-
Scott LaVarnway authored
This reverts commit 928ff038. Compiles with 4.6 now. Change-Id: Ib455da1098bb0e0623248be07579882a425fcbd1
-
Yunqing Wang authored
Added the missing "int". Change-Id: I7c8af3dee700837b40f010d53e1431a59370ae3a
-
James Zern authored
Store the number of allocated rows in VP9LfSync; the calculated values cannot be relied on when dealing with corrupt material. Change-Id: I13b8bcec9738c299a71df726772ab7ac05511e5b
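The idea can be sketched as follows, assuming a simplified sync structure: the allocator records how many rows it actually allocated, and every later loop or cleanup is bounded by that stored count rather than by a value recomputed from (possibly corrupt) frame dimensions. `LfSyncSketch` and its functions are hypothetical and only loosely modeled on VP9LfSync.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a loop filter sync structure. */
typedef struct {
  int *cur_sb_col; /* one progress slot per superblock row */
  int rows;        /* number of slots actually allocated */
} LfSyncSketch;

/* Allocates `rows` slots and records the count. Returns 0 on success. */
int lf_sync_alloc(LfSyncSketch *s, int rows) {
  s->cur_sb_col = NULL;
  s->rows = 0;
  if (rows <= 0) return -1;
  s->cur_sb_col = malloc((size_t)rows * sizeof(*s->cur_sb_col));
  if (s->cur_sb_col == NULL) return -1;
  s->rows = rows;
  return 0;
}

/* Resets every allocated slot. The bound is the stored count, never a row
 * count derived from the current frame header. */
void lf_sync_reset(LfSyncSketch *s) {
  int i;
  for (i = 0; i < s->rows; ++i) s->cur_sb_col[i] = -1;
}

/* Returns 1 when the rows requested by a frame exceed the allocation. */
int lf_sync_needs_realloc(const LfSyncSketch *s, int rows_from_frame) {
  return rows_from_frame > s->rows;
}

/* Releases the slots and clears the stored count. */
void lf_sync_free(LfSyncSketch *s) {
  free(s->cur_sb_col);
  s->cur_sb_col = NULL;
  s->rows = 0;
}
```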
-
Dmitry Kovalev authored
Removed functions:
* vp9_mse16x16_mmx
* vp9_get_mb_ss_mmx
* vp9_get4x4var_mmx
* vp9_get8x8var_mmx
* vp9_variance4x4_mmx
* vp9_variance8x8_mmx
* vp9_variance16x16_mmx
* vp9_variance16x8_mmx
* vp9_variance8x16_mmx

They all have SSE2 equivalents.

Change-Id: I3796f2477c4f59b35b4828f46a300c16e62a2615
-
Jingning Han authored
This commit allows the encoder to skip the intra coding mode test when the known inter residual is less than the source variance. It reduces the runtime of speed 3 for test clips:
bus cif 1000 kbps: 8587 ms -> 8260 ms, 3.8% speed-up
pedestrian 1080p 2000 kbps: 161381 ms -> 155241 ms, 3.7% speed-up

The compression performance is down by
derf -0.36%
stdhd -0.25%

Change-Id: I75ce1e035b4da2153cb1ac14111d1a07c05a735d
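The skip condition described above can be sketched as a single predicate. This is an illustration under assumptions, not the encoder's code: the function name, the parameter types, and the use of a negative value to mean "no inter residual is known yet" are all hypothetical.

```c
#include <stdint.h>

/* Returns 1 when the intra mode test may be skipped: an inter residual is
 * known (non-negative here, by assumption) and it is smaller than the
 * variance of the source block. Returns 0 otherwise. */
int can_skip_intra_test(int64_t best_inter_residual,
                        unsigned int source_variance) {
  if (best_inter_residual < 0) return 0; /* nothing known yet */
  return best_inter_residual < (int64_t)source_variance;
}
```

The comparison reads as: if inter prediction already leaves less energy than the source block itself contains, intra prediction is unlikely to do better, so testing it is skipped at the cost of the small compression loss quoted above.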
-
Jingning Han authored
This commit extends the sse and forward transform computation flag to support the case of 64x64 blocks, which contain four 32x32 2D-DCT blocks. Change-Id: I86a3e805dfaa0f3abd812f590520c71aa0e40473
-
James Zern authored
-
James Zern authored
-
James Zern authored
Prevents any problems resuming decode after decoding a corrupt frame. Change-Id: Ib7eb1b5c062aebe71074fef1ece32a32822c16be
-
Dmitry Kovalev authored
-
- 28 Aug, 2014 12 commits
-
-
Dmitry Kovalev authored
New SSE2 function is three times faster than MMX one. Change-Id: I4f387ce9f75b88379176ec7bdc62d86eb5f70fbe
-
Dmitry Kovalev authored
In order to understand memory layout, consider the declaration of the following structs.

The first one is a part of our API:
  struct vpx_codec_ctx {
    // ...
    struct vpx_codec_priv *priv;
  };

The second one is defined in vpx_codec_internal.h:
  struct vpx_codec_priv {
    // ...
  };

The following struct is defined 4 times for encoder/decoder VP8/VP9:
  struct vpx_codec_alg_priv {
    struct vpx_codec_priv base;
    // ...
  };

Private data allocation for the given ctx:
  struct vpx_codec_ctx *ctx = <get>
  struct vpx_codec_alg_priv *alg_priv = <allocate>
  ctx->priv = (struct vpx_codec_priv *)alg_priv;

The cast works because vpx_codec_alg_priv has a vpx_codec_priv instance as a first member 'base'.

Change-Id: I10d1afc8c9a7dfda50baade8c7b0296678bdb0d0
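The layout described above relies on the C rule that a pointer to a struct, suitably converted, points to its first member and vice versa. A minimal, self-contained sketch of the same pattern follows. The `codec_*` names and the `err` and `alg_state` fields are hypothetical stand-ins for the vpx types, whose real members are elided in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for vpx_codec_priv: the part common to all codecs. */
struct codec_priv {
  int err; /* hypothetical field */
};

/* Stand-in for vpx_codec_alg_priv: 'base' must stay the first member. */
struct codec_alg_priv {
  struct codec_priv base;
  int alg_state; /* hypothetical codec-specific field */
};

/* Stand-in for vpx_codec_ctx: only sees the common part. */
struct codec_ctx {
  struct codec_priv *priv;
};

/* Allocates the codec-specific private data and stores it through the
 * common pointer. Returns 0 on success. */
int codec_init(struct codec_ctx *ctx) {
  struct codec_alg_priv *alg_priv = calloc(1, sizeof(*alg_priv));
  if (alg_priv == NULL) return -1;
  /* The cast below is only valid because 'base' sits at offset 0. */
  assert(offsetof(struct codec_alg_priv, base) == 0);
  alg_priv->alg_state = 42;
  ctx->priv = (struct codec_priv *)alg_priv;
  return 0;
}

/* Recovers the codec-specific view from the common pointer. */
struct codec_alg_priv *codec_get_alg_priv(struct codec_ctx *ctx) {
  return (struct codec_alg_priv *)ctx->priv;
}

/* Frees the private data through the common pointer. */
void codec_destroy(struct codec_ctx *ctx) {
  free(ctx->priv);
  ctx->priv = NULL;
}
```

If a member were ever inserted before `base`, both casts would silently point at the wrong data, which is why the sketch asserts the offset.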
-
Dmitry Kovalev authored
-
Yunqing Wang authored
-
Dmitry Kovalev authored
-
James Zern authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Yunqing Wang authored
In the partition search, the encoder checks all possible partitionings in the superblock's partition search tree. This patch proposes a set of criteria for early termination of the partition search, which decide whether or not to terminate the search in the current branch based on the "skippable" result of the quantized transform coefficients. The "skippable" information is gathered during the partition mode search, so no overhead calculations are introduced. This patch gives significant encoding speed gains without sacrificing quality.

Borg test results:
1. At speed 1,
   stdhd set: psnr: +0.074%, ssim: +0.093%;
   derf set:  psnr: -0.024%, ssim: +0.011%;
2. At speed 2,
   stdhd set: psnr: +0.033%, ssim: +0.100%;
   derf set:  psnr: -0.062%, ssim: +0.003%;
3. At speed 3,
   stdhd set: psnr: +0.060%, ssim: +0.190%;
   derf set:  psnr: -0.064%, ssim: -0.002%;
4. At speed 4,
   stdhd set: psnr: +0.070%, ssim: +0.143%;
   derf set:  psnr: -0.104%, ssim: +0.039%;

The speedup ranges from several percent to 60+%.
                  speed1  speed2  speed3  speed4
(1080p, 100f):
old_town_cross:   48.2%   23.9%   20.8%   16.5%
park_joy:         11.4%   17.8%   29.4%   18.2%
pedestrian_area:  10.7%    4.0%    4.2%    2.4%
(720p, 200f):
mobcal:           68.1%   36.3%   34.4%   17.7%
parkrun:          15.8%   24.2%   37.1%   16.8%
shields:          45.1%   32.8%   30.1%    9.6%
(cif, 300f)
bus:               3.7%   10.4%   14.0%    7.9%
deadline:         13.6%   14.8%   12.6%   10.9%
mobile:            5.3%   11.5%   14.7%   10.7%

Change-Id: I246c38fb952ad762ce5e365711235b605f470a66
-
Dmitry Kovalev authored
Change-Id: I65b2c1fbed5a306949843315999d10368a100431
-